PostgreSQL Environment Preparation (Windows)- FineDataLink Help Document

Last update: September 23, 2024

Overview

To synchronize real-time data from PostgreSQL databases through Data Pipeline, you need to configure the data source.

This document describes how to configure the data source on Windows systems. For details about the configuration steps on Linux systems, see PostgreSQL Environment Preparation (Linux).

CDC Principle and Support

CDC Principle

The logical decoding function of PostgreSQL first appeared in version 9.4. This mechanism allows extracting changes submitted to the transaction log and processing them in a user-friendly way via output plugins. The output plugin must be installed before the PostgreSQL server runs and must be started together with a replication slot so that clients can utilize the changes.

CDC Support

Logical Decoding: parses logical change events from the WAL log
Replication Protocol: provides a mechanism for consumers to subscribe to database changes in real-time (even synchronously)
Snapshot Export: allows exporting consistent snapshots of the database (with the pg_export_snapshot function)
Replication Slot: saves consumer offsets and tracks subscriber progress

In summary, you need to install a logical decoding plugin. Currently, FineDataLink supports the wal2json decoder, with the output format being JSON.

Procedure

Note:

Synchronizing views is not currently supported.

Ensure that the PostgreSQL database version is supported by Data Pipeline before using PostgreSQL databases for data synchronization.

Confirming the Database Version

Check if the version is supported by Data Pipeline.

Currently, only non-read-only single-instance PostgreSQL databases of versions 9.4 and above are supported. For details, see Overview of Data Pipeline.

Modifying REPLICA IDENTITY

This attribute determines which fields are logged when data is updated or deleted.

DEFAULT: records the old values of the columns of the primary key when UPDATE or DELETE occurs
NOTHING: records no information about the old row when UPDATE or DELETE occurs
FULL: records the old values of all columns in the row when UPDATE or DELETE occurs
INDEX (Index Name): records the old values of the columns defined in the index name when UPDATE or DELETE occurs

Use the following command to modify the table setting.

alter table Schema name.Table name REPLICA IDENTITY FULL

Enter SQL Shell on a Windows system, as shown in the following figure.

Modify the table settings. For example, enter the following statement to modify the settings of the student table in the public schema.

Installing the Plugin

Installing wal2json

You can download the installation package of the wal2json plugin: wal2json_windows_x64.zip

After decompression, find the wal2json.dll file of the corresponding version from the directory, and then copy and paste the file to the /lib directory of PostgreSQL, as shown in the following figure.

Modifying the Configuration File

Go to the data folder under the PostgreSQL installation directory, find the postgresql.conf file, and suffix the following configuration to the file:

shared_preload_libraries = 'wal2json'
wal_level = logical
max_wal_senders = 10
max_replication_slots = 10

Find the pg_hba.conf file in the /data directory and remove the number sign (#) at the beginning of the two lines in the red rectangle in the following figure.

Restarting the Database

Open Windows Services and restart the database, as shown in the following figure.

Confirming the User Permission

The above steps are performed using the default user postgres. If you use another user account to configure the database, the user account must have REPLICATION and LOGIN attributes, and select permission on the tables to be synchronized.

Use the following command to create a user:

GRANT SELECT ON ALL TABLES IN SCHEMA Schema name TO Username;

Use the following command to set the username and password for the user:

CREATE USER Username WITH REPLICATION LOGIN PASSWORD 'Password';

Grant the replication and login privileges to the user.

Go to the data folder in the installation directory and check the configuration file pg_hba.conf to ensure that the user has these privileges.

Confirming that Privileges Are Available

Go to SQL Shell and run the following command. If the slot returns the record, the configuration is successful.

select * from pg_create_logical_replication_slot('slot_test', 'wal2json');

Subsequent Operations

After you configure the database, you can configure the data source in FineDataLink.

Configure the data connection to source and target databases in FineDataLink before configuring pipeline tasks for real-time synchronization. This will allow you to control synchronization tasks by selecting the data source name during the task configuration process. For details, see Data Connection Configuration.

Set up the pipeline task. For details, see Pipeline Task Configuration.

Notes

If there is an issue preventing the startup, you can double-click Application under Computer Management > Windows Logs to view the causes.

For versions 9.6 and earlier, the tool command pg_resetwal is called pg_resetxlog. Its main function is to clean up unnecessary WAL files. Go to the bin folder under the PostgreSQL installation directory, open the command line, and execute the following command:

.\pg_resetxlog.exe -f ..\data

After resetting the logs, try to start the database again. Namely, go to the bin folder under the PostgreSQL installation directory, open the command line, and execute the following command:

.\pg_ctl start -D ..\data

Helpful
Not helpful
Only read

中文（简体）

English

PostgreSQL Environment Preparation (Windows)