ELT Pipeline: CDC + dbt Core
A complete data pipeline has two jobs: move the data and reshape it. We use two specialized tools — each does one job extremely well.
The Kitchen Analogy
🚚
Altinity Sink Connector
The delivery truck
Brings raw ingredients into the kitchen — fast, reliable, unchanged
👨‍🍳
dbt Core
The chef
Takes raw ingredients and turns them into finished dishes ready to serve
The truck doesn't cook, and the chef doesn't drive. Each tool has a single responsibility.
Data Engineering Terms
Extract + Load
Altinity Sink Connector (CDC)
Uses change data capture (CDC) to move raw data as-is from the source OLTP database into ClickHouse. No transformation: the tables arrive with the same schema, the same column names, and the same data types. This is the EL in ELT.
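As an illustration, a CDC-landed raw table in ClickHouse might look like the sketch below. The table and column names are hypothetical; the `_version` and `is_deleted` columns follow the ReplacingMergeTree pattern commonly used for CDC sinks, so updates and deletes from the source can be resolved at query or merge time.

```sql
-- Hypothetical raw table mirroring the source `orders` table.
-- Business columns match the OLTP schema unchanged; _version and
-- is_deleted are CDC bookkeeping columns (a common
-- ReplacingMergeTree pattern for change streams).
CREATE TABLE raw.orders
(
    order_id    UInt64,
    customer_id UInt64,
    amount      Decimal(18, 2),
    created_at  DateTime,
    _version    UInt64,
    is_deleted  UInt8
)
ENGINE = ReplacingMergeTree(_version, is_deleted)
ORDER BY order_id;
```

Because the raw layer keeps the source shape, any downstream transformation can be rebuilt from it at any time.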
Transform
dbt Core
Reshapes the raw OLTP tables into a star schema optimized for analytics — dimension tables, fact tables, and analytical views. This is the T in ELT.
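A dbt model is just a SELECT statement in a `.sql` file; dbt materializes it as a table or view. A minimal sketch of a dimension model, with hypothetical table and column names:

```sql
-- models/marts/dim_customers.sql (hypothetical model)
-- Reshapes the raw OLTP customers table into an analytics dimension.
{{ config(materialized='table') }}

select
    customer_id                     as customer_key,
    first_name || ' ' || last_name  as customer_name,
    country,
    created_at                      as first_seen_at
from {{ source('raw', 'customers') }}
where is_deleted = 0  -- exclude rows soft-deleted upstream by CDC
```

Because the model is plain SQL in Git, every change to the star schema is reviewed and versioned like application code.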
The Pipeline
┌─────────┐ CDC ┌────────────┐ dbt ┌──────────────┐
│ OLTP DB │ ──────▸ │ ClickHouse │ ──────▸ │ ClickHouse │
│ (source)│ │ (raw OLTP) │ │ (star schema)│
└─────────┘ └────────────┘ └──────────────┘
Extract Load Transform
Data flows left to right. The OLTP database is the source of truth. CDC streams every change into ClickHouse as raw tables. Then dbt reads those raw tables and builds the star schema on top.
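The link between the two stages is a dbt source declaration: dbt models reference the CDC-landed tables through `{{ source(...) }}` rather than hard-coded names. A minimal sketch, with hypothetical schema and table names:

```yaml
# models/staging/sources.yml (hypothetical names)
# Declares the CDC-landed tables so dbt models can reference them
# with {{ source('raw', '...') }}.
version: 2

sources:
  - name: raw            # dbt source name
    schema: raw          # ClickHouse database holding the CDC tables
    tables:
      - name: customers
      - name: orders
```

With the sources declared, rebuilding the star schema after a model change is a single `dbt run`, while CDC keeps feeding the raw tables underneath.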
Why not transform during load?
- Raw data is preserved — you can always rebuild the star schema from scratch
- Transformations are version-controlled — dbt models are SQL files in Git
- Separation of concerns — CDC team and analytics team can work independently
- Replayability — change a dbt model, re-run it, and the output updates, with no need to re-extract data from the source
Together, CDC + dbt form a complete ELT pipeline — real-time data movement plus version-controlled transformations.