Step 2: Configure the Sync Pipeline

This step sets up PostgreSQL for CDC (Change Data Capture) and configures the Altinity Sink Connector to stream changes into ClickHouse.

1 Configure PostgreSQL for CDC

PostgreSQL needs logical replication enabled so Debezium can read change events from the WAL (Write-Ahead Log).

-- Enable logical replication (requires PostgreSQL restart)
ALTER SYSTEM SET wal_level = 'logical';

-- Create replication user
CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'repl_password';
GRANT SELECT ON ALL TABLES IN SCHEMA public TO replicator;

-- Create publication for the tables you want to sync
CREATE PUBLICATION my_publication FOR TABLE orders, customers;

Restart required: after changing wal_level, restart the PostgreSQL container so the setting takes effect:

docker restart rb-northwind-postgres
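
Once the container is back up, the configuration can be checked from a psql session. This is a minimal sketch, assuming you are connected to the Northwind database as a superuser. The catalog views are standard PostgreSQL, and the role and publication names are the ones created above.

```sql
-- Should return 'logical' once the restart has taken effect
SHOW wal_level;

-- Should list public.orders and public.customers
SELECT schemaname, tablename
FROM pg_publication_tables
WHERE pubname = 'my_publication';

-- rolreplication and rolcanlogin should both be true
SELECT rolname, rolreplication, rolcanlogin
FROM pg_roles
WHERE rolname = 'replicator';
```

If wal_level still reports replica, the restart has not happened yet or the ALTER SYSTEM statement was run against a different instance.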

2 Configure the Sink Connector

Add the following service to your docker-compose.yml file in the db/ directory:

clickhouse-sink-connector:
  image: altinityinfra/clickhouse-sink-connector:latest
  environment:
    - DATABASE_HOSTNAME=host.docker.internal
    - DATABASE_PORT=5432
    - DATABASE_USER=replicator
    - DATABASE_PASSWORD=repl_password
    - DATABASE_NAME=Northwind
    - DATABASE_SERVER_NAME=postgres-source
    - CONNECTOR_CLASS=io.debezium.connector.postgresql.PostgresConnector
    - SLOT_NAME=ch_sink_slot
    - PUBLICATION_NAME=my_publication
    - CLICKHOUSE_URL=http://clickhouse:8123
    - CLICKHOUSE_USER=default
    - CLICKHOUSE_PASSWORD=clickhouse
    - CLICKHOUSE_DATABASE=northwind
    - TABLE_INCLUDE_LIST=public.orders,public.customers
    - AUTO_CREATE_TABLES=true
    - SINK_CONNECTOR_LIGHTWEIGHT_UPDATE_DELETE=true
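
Key settings explained:

DATABASE_USER and DATABASE_PASSWORD: the replicator role created in step 1.
SLOT_NAME: the PostgreSQL logical replication slot that Debezium reads changes from.
PUBLICATION_NAME: must match the publication created in step 1 (my_publication).
TABLE_INCLUDE_LIST: the schema-qualified source tables to sync; these should be tables covered by the publication.
CLICKHOUSE_URL, CLICKHOUSE_USER, CLICKHOUSE_PASSWORD, CLICKHOUSE_DATABASE: the target ClickHouse endpoint, credentials, and database.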

3 Start and Verify

Start the connector and watch the logs:

# Start the sink connector
cd db
docker-compose up -d clickhouse-sink-connector

# Watch the logs for progress
docker-compose logs -f clickhouse-sink-connector

In the logs, look for "Snapshot completed". This message means the initial data load has finished and the connector has switched to streaming mode.

The connector first takes a full snapshot of the configured tables, then switches to real-time streaming. The snapshot typically takes seconds to minutes, depending on data volume.
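
Beyond the log output, both ends of the pipeline can be checked with a few queries. This is a sketch under assumptions: the target table is assumed to keep its source name (orders) inside the northwind database, which depends on how the connector maps table names, so adjust the name if yours differs.

```sql
-- On PostgreSQL: the slot named in SLOT_NAME should exist and be active
SELECT slot_name, plugin, slot_type, active
FROM pg_replication_slots
WHERE slot_name = 'ch_sink_slot';

-- On PostgreSQL: source row count, for comparison
SELECT count(*) FROM public.orders;

-- On ClickHouse (for example via clickhouse-client): target row count.
-- Assumes the target table is northwind.orders.
SELECT count() FROM northwind.orders;
```

The two counts should converge once the snapshot has finished. If the target table uses a ReplacingMergeTree-family engine, the ClickHouse count can temporarily exceed the source count until background merges deduplicate rows.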