Using MySQL as an example, this document demonstrates how to synchronize the S_Order and S_Product tables from the fdl_test database to the fdldemotest database in real time.
This document applies to FineDataLink V4.2.11.3 and later versions.
Prepare an independently deployed FineDataLink project with registered function points related to Data Pipeline.
Select the source and target databases as needed. For details about databases supported by Data Pipeline, see Real-Time Task.
Establish data connections to source and target databases in Data Connection Management, so that you can select the source and target databases during pipeline task configuration. For details, see Data Connection Configuration.
Grant the account used by the pipeline task's data connection the permissions it needs to perform the required operations on the database. For details, see Overview of Database Environment Preparation.
Deploy Kafka (an open-source event streaming platform) as the middleware. For details, see Kafka Deployment - ZooKeeper Mode and Transmission Queue Configuration.
Grant permission to use Data Pipeline to users who are not super admins. For details, see Pipeline Task Management Permission.
Log in to the FineDataLink project, choose Data Pipeline > Real-Time Pipeline, and create a pipeline task, as shown in the following figure.
Data Source and Data Destination
Select the data connection for the source tables, and then select the data connection and database for the target tables.
You are advised to click the Data Source Permission Check button in Data Source to check whether the database user has permission to read logs.
Basic Setting
The following figure shows the operation steps.
For detailed instructions on settings, see Pipeline Task Configuration - Advanced Setting.
Synchronization Configuration
1. Add source tables.
Click the Add Table button, and select the S_Order and S_Product tables from the fdl_test database.
Set Synchronization Type to Full + Incremental Synchronization. The pipeline task first synchronizes all existing data and then continuously synchronizes subsequent changes.
Finally, click the Add Table button, as shown in the following figure.
2. Set target tables.
For each of the two source tables, rename the target table and set a physical primary key for it, as shown in the following figure. (The physical primary key you set is displayed in Primary Key Mapping under Write Method.)
In this example, the target table of S_Product is named S_Product_1, and the target table of S_Order is named S_Order_1.
Because the synchronization type is Full + Incremental Synchronization, data is written in two stages, each with its own write method. The strategy for primary key conflicts cannot be edited.
Since Mark Timestamp During Synchronization is enabled and Logical Deletion at Target End is selected, each target table has two new fields, namely, _fdl_update_timestamp and _fdl_marked_deleted, as shown in the following figure.
3. Save the configuration.
Click the Save button in the lower left corner.
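The two write stages and the two marker fields described in this section can be illustrated with a small in-memory model. This is a minimal sketch, not FineDataLink's implementation: the function names `full_load` and `apply_change`, the shape of the change events, the sample columns `id` and `name`, and the representation of `_fdl_marked_deleted` as a boolean and `_fdl_update_timestamp` as epoch milliseconds are all assumptions made for illustration.

```python
import time


def now_ms():
    """Current time in epoch milliseconds (assumed timestamp format)."""
    return int(time.time() * 1000)


def full_load(source_rows, key):
    """Stage 1: copy every existing source row into the target, keyed by primary key."""
    target = {}
    for row in source_rows:
        target[row[key]] = {
            **row,
            "_fdl_update_timestamp": now_ms(),
            "_fdl_marked_deleted": False,
        }
    return target


def apply_change(target, event, key):
    """Stage 2: apply one change event (insert, update, or delete) to the target."""
    op, row = event["op"], event["row"]
    pk = row[key]
    if op == "delete":
        # Logical deletion: the row stays in the target and is only flagged.
        if pk in target:
            target[pk]["_fdl_marked_deleted"] = True
            target[pk]["_fdl_update_timestamp"] = now_ms()
    else:
        # Insert or update: write the row by primary key (upsert).
        target[pk] = {
            **row,
            "_fdl_update_timestamp": now_ms(),
            "_fdl_marked_deleted": False,
        }


# Hypothetical sample data; the real S_Product columns are not shown in this document.
source = [{"id": 1, "name": "pen"}, {"id": 2, "name": "ink"}]
target = full_load(source, key="id")
apply_change(target, {"op": "update", "row": {"id": 1, "name": "pencil"}}, key="id")
apply_change(target, {"op": "delete", "row": {"id": 2}}, key="id")

# Downstream consumers would filter on the deletion flag to see only live rows.
live = [r["name"] for r in target.values() if not r["_fdl_marked_deleted"]]
```

In this model the deleted row remains in the target with its flag set, which is why queries against a logically deleted target table typically need to filter on `_fdl_marked_deleted`.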
Task Control
Click the Task Control button and configure settings.
For details, see Real-Time Pipeline Task Configuration - Task Control.
Set Single-Table Dirty Data Threshold to 1000 Row(s).
The task aborts once the number of dirty data records in a single table in the pipeline task reaches the limit.
1. The maximum value is 100,000 rows. The counter is reset after the dirty data has been processed.
2. For details about dirty data processing, see Real-Time Pipeline Task - Dirty Data Processing.
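The threshold behavior described above (a per-table count, an abort when the limit is reached, and a reset after processing) can be sketched as follows. The class name `DirtyDataCounter` and its methods are hypothetical and exist only to illustrate the rule; how FineDataLink tracks dirty data internally is not described in this document.

```python
class DirtyDataCounter:
    """Tracks dirty rows per table and signals when the threshold is reached."""

    MAX_THRESHOLD = 100_000  # upper bound stated in this document

    def __init__(self, threshold=1000):
        if threshold > self.MAX_THRESHOLD:
            raise ValueError("threshold cannot exceed 100,000 rows")
        self.threshold = threshold
        self.counts = {}  # table name -> number of dirty rows seen

    def record(self, table):
        """Count one dirty row; return True if the task should abort."""
        self.counts[table] = self.counts.get(table, 0) + 1
        return self.counts[table] >= self.threshold

    def mark_processed(self, table):
        """Reset the counter once the table's dirty data has been processed."""
        self.counts[table] = 0


# A small threshold is used here so the effect is easy to see.
counter = DirtyDataCounter(threshold=3)
results = [counter.record("S_Order_1") for _ in range(3)]
counter.mark_processed("S_Order_1")
after_reset = counter.record("S_Order_1")
```

Each table has its own count, so dirty rows in S_Order_1 do not bring S_Product_1 closer to the limit.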
Enable Retry After Failure and set it to Retry 3 Times, 2 Minute(s) Apart.
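A setting of Retry 3 Times, 2 Minute(s) Apart corresponds to a fixed-interval retry policy. The sketch below shows one way such a policy behaves, assuming the initial run is not counted as a retry and that any exception counts as a failure. The function `run_with_retry` and its parameters are hypothetical, and FineDataLink's actual retry conditions may differ.

```python
import time


def run_with_retry(task, retries=3, interval_seconds=120, sleep=time.sleep):
    """Run task; on failure, retry up to `retries` times with a fixed wait in between."""
    attempt = 0
    while True:
        try:
            return task()
        except Exception:
            if attempt >= retries:
                raise  # all retries used up: surface the last error
            attempt += 1
            sleep(interval_seconds)


# Demonstration with a task that fails twice and then succeeds.
# The sleep function is replaced so the example does not actually wait.
calls = []
waits = []


def flaky():
    calls.append(1)
    if len(calls) < 3:
        raise RuntimeError("transient failure")
    return "ok"


result = run_with_retry(flaky, sleep=waits.append)
```

With these settings, a task that keeps failing runs four times in total (one initial run plus three retries) before the error is raised.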
Select the required notification content and configure the notification channel and recipients. In this example, the specified person in charge is notified when the pipeline task aborts, the task retries after an error, table synchronization retries after an error, or the source table structure changes.
Starting Synchronization for All Tables
Click the Start Synchronization for All Tables button to start real-time synchronization for source tables, as shown in the following figure.
1. You can view the number of read and written rows, as shown in the following figure.
In this example, a dirty data record is generated in S_Order_1. You can process it following Real-Time Pipeline Task - Dirty Data Processing.
You can pause the synchronization of a single table or all tables.
2. You can view the two tables in the fdldemotest database.
Data in S_Product_1 (the target table of S_Product) is shown in the following figure.
Since the notification for source table structure changes has been configured in the "Task Control" section, adding a field named test to the S_Product table in the fdl_test database will trigger a notification.
A test field will be added to the S_Product_1 table, as shown in the following figure.
Choose O&M Center > Real-Time Pipeline, where you can view the task execution status and the data synchronization performance, and check and handle exceptions, as shown in the following figure.
For details, see Real-Time Pipeline Task O&M - Task Management.
To modify the configuration of a real-time pipeline task that has been running for a period of time (for example, to add or delete tables, or to change the settings of Task Control, Advanced Setting, or Synchronization Type), see Real-Time Pipeline Task Management.