ELT Pipeline: CDC + dbt Core
A complete data pipeline has two jobs: move the data and reshape it. We use two specialized tools — each does one job extremely well.
The Kitchen Analogy
🚚
Altinity Sink Connector
The delivery truck
Brings raw ingredients into the kitchen — fast, reliable, unchanged
👨‍🍳
dbt Core
The chef
Takes raw ingredients and turns them into finished dishes ready to serve
The truck doesn't cook, and the chef doesn't drive. Each tool has a single responsibility.
Data Engineering Terms
Extract + Load
Altinity Sink Connector (CDC)
Uses change data capture (CDC) to move raw data as-is from the source OLTP database into ClickHouse. No transformation: the tables arrive with the same schema, the same column names, and the same data types. This is the EL in ELT.
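As an illustration, a CDC-landed raw table in ClickHouse might look like the sketch below. The table and column names are hypothetical; the `_version` and `is_deleted` columns follow the ReplacingMergeTree pattern commonly used for CDC sinks, so updates and deletes from the source can be resolved at query or merge time.

```sql
-- Hypothetical raw table mirroring the source `orders` table.
-- Business columns match the OLTP schema unchanged; _version and
-- is_deleted are CDC bookkeeping columns (a common
-- ReplacingMergeTree pattern for change streams).
CREATE TABLE raw.orders
(
    order_id    UInt64,
    customer_id UInt64,
    amount      Decimal(18, 2),
    created_at  DateTime,
    _version    UInt64,
    is_deleted  UInt8
)
ENGINE = ReplacingMergeTree(_version, is_deleted)
ORDER BY order_id;
```

Because the raw layer keeps the source shape, any downstream transformation can be rebuilt from it at any time.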
Transform
dbt Core
Reshapes the raw OLTP tables into a star schema optimized for analytics — dimension tables, fact tables, and analytical views. This is the T in ELT.
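A dbt model is just a SELECT statement in a `.sql` file; dbt materializes it as a table or view. A minimal sketch of a dimension model, with hypothetical table and column names:

```sql
-- models/marts/dim_customers.sql (hypothetical model)
-- Reshapes the raw OLTP customers table into an analytics dimension.
{{ config(materialized='table') }}

select
    customer_id                     as customer_key,
    first_name || ' ' || last_name  as customer_name,
    country,
    created_at                      as first_seen_at
from {{ source('raw', 'customers') }}
where is_deleted = 0  -- exclude rows soft-deleted upstream by CDC
```

Because the model is plain SQL in Git, every change to the star schema is reviewed and versioned like application code.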
The Pipeline
┌─────────┐ CDC ┌────────────┐ dbt ┌──────────────┐
│ OLTP DB │ ──────▸ │ ClickHouse │ ──────▸ │ ClickHouse │
│ (source)│ │ (raw OLTP) │ │ (star schema)│
└─────────┘ └────────────┘ └──────────────┘
Extract Load Transform
Data flows left to right. The OLTP database is the source of truth. CDC streams every change into ClickHouse as raw tables. Then dbt reads those raw tables and builds the star schema on top.
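The link between the two stages is a dbt source declaration: dbt models reference the CDC-landed tables through `{{ source(...) }}` rather than hard-coded names. A minimal sketch, with hypothetical schema and table names:

```yaml
# models/staging/sources.yml (hypothetical names)
# Declares the CDC-landed tables so dbt models can reference them
# with {{ source('raw', '...') }}.
version: 2

sources:
  - name: raw            # dbt source name
    schema: raw          # ClickHouse database holding the CDC tables
    tables:
      - name: customers
      - name: orders
```

With the sources declared, rebuilding the star schema after a model change is a single `dbt run`, while CDC keeps feeding the raw tables underneath.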
Why not transform during load?
- Raw data is preserved — you can always rebuild the star schema from scratch
- Transformations are version-controlled — dbt models are SQL files in Git
- Separation of concerns — CDC team and analytics team can work independently
- Replayability — change a dbt model, re-run it, and the output updates, with no need to re-extract data from the source
Together, CDC + dbt form a complete ELT pipeline — real-time data movement plus version-controlled transformations.