To synchronize real-time data from PostgreSQL databases through Data Pipeline, you need to configure the data source.
This document describes how to configure the data source on Windows systems. For details about the configuration steps on Linux systems, see PostgreSQL Environment Preparation (Linux).
The logical decoding function of PostgreSQL first appeared in version 9.4. This mechanism allows extracting changes submitted to the transaction log and processing them in a user-friendly way via output plugins. The output plugin must be installed before the PostgreSQL server runs and must be started together with a replication slot so that clients can utilize the changes.
Logical Decoding: parses logical change events from the WAL log
Replication Protocol: provides a mechanism for consumers to subscribe to database changes in real-time (even synchronously)
Snapshot Export: allows exporting consistent snapshots of the database (with the pg_export_snapshot function)
Replication Slot: saves consumer offsets and tracks subscriber progress
In summary, you need to install a logical decoding plugin. Currently, FineDataLink supports the wal2json decoder, with the output format being JSON.
Synchronizing views is not currently supported.
Ensure that the PostgreSQL database version is supported by Data Pipeline before using PostgreSQL databases for data synchronization.
Check if the version is supported by Data Pipeline.
Currently, only non-read-only single-instance PostgreSQL databases of versions 9.4 and above are supported. For details, see Overview of Data Pipeline.
This attribute determines which fields are logged when data is updated or deleted.
DEFAULT: records the old values of the columns of the primary key when UPDATE or DELETE occurs
NOTHING: records no information about the old row when UPDATE or DELETE occurs
FULL: records the old values of all columns in the row when UPDATE or DELETE occurs
INDEX (Index Name): records the old values of the columns defined in the index name when UPDATE or DELETE occurs
Use the following command to modify the table setting.
alter table Schema name.Table name REPLICA IDENTITY FULL
Enter SQL Shell on a Windows system, as shown in the following figure.
Log in to SQL Shell using the default account postgres.
Modify the table settings. For example, enter the following statement to modify the settings of the student table in the public schema.
Installing wal2json
You can download the installation package of the wal2json plugin: wal2json_windows_x64.zip
After decompression, find the wal2json.dll file of the corresponding version from the directory, and then copy and paste the file to the /lib directory of PostgreSQL, as shown in the following figure.
Modifying the Configuration File
Go to the data folder under the PostgreSQL installation directory, find the postgresql.conf file, and suffix the following configuration to the file:
shared_preload_libraries = 'wal2json'wal_level = logicalmax_wal_senders = 10max_replication_slots = 10
Find the pg_hba.conf file in the /data directory and remove the number sign (#) at the beginning of the two lines in the red rectangle in the following figure.
Open Windows Services and restart the database, as shown in the following figure.
The above steps are performed using the default user postgres. If you use another user account to configure the database, the user account must have REPLICATION and LOGIN attributes, and select permission on the tables to be synchronized.
Use the following command to create a user:
GRANT SELECT ON ALL TABLES IN SCHEMA Schema name TO Username;
Use the following command to set the username and password for the user:
CREATE USER Username WITH REPLICATION LOGIN PASSWORD 'Password';
Grant the replication and login privileges to the user.
Go to the data folder in the installation directory and check the configuration file pg_hba.conf to ensure that the user has these privileges.
Go to SQL Shell and run the following command. If the slot returns the record, the configuration is successful.
select * from pg_create_logical_replication_slot('slot_test', 'wal2json');
After you configure the database, you can configure the data source in FineDataLink.
Configure the data connection to source and target databases in FineDataLink before configuring pipeline tasks for real-time synchronization. This will allow you to control synchronization tasks by selecting the data source name during the task configuration process. For details, see Data Connection Configuration.
Set up the pipeline task. For details, see Pipeline Task Configuration.
If there is an issue preventing the startup, you can double-click Application under Computer Management > Windows Logs to view the causes.
For versions 9.6 and earlier, the tool command pg_resetwal is called pg_resetxlog. Its main function is to clean up unnecessary WAL files. Go to the bin folder under the PostgreSQL installation directory, open the command line, and execute the following command:
.\pg_resetxlog.exe -f ..\data
After resetting the logs, try to start the database again. Namely, go to the bin folder under the PostgreSQL installation directory, open the command line, and execute the following command:
.\pg_ctl start -D ..\data
滑鼠選中內容,快速回饋問題
滑鼠選中存在疑惑的內容,即可快速回饋問題,我們將會跟進處理。
不再提示
10s後關閉
Submitted successfully
Network busy