Pipeline Task Example

  • Last update: June 25, 2026
  • Overview

    This document uses a MySQL database as an example to demonstrate how to synchronize the S_Order and S_Product tables from the fdl_test database to the fdldemotest database in real time.

    This document applies to FineDataLink V4.2.11.3 and later versions.

    Procedure

    Preparation

    Prepare an independently deployed FineDataLink project with registered function points related to Data Pipeline.

    Procedure
    Step One: Data Source Configuration

    Select the source and target databases as needed. For details about databases supported by Data Pipeline, see Real-Time Task.

    Establish data connections to source and target databases in Data Connection Management, so that you can select the source and target databases during pipeline task configuration. For details, see Data Connection Configuration.

    Step Two: Database Environment Preparation

    Grant the database account that is configured in the pipeline task's data connection the permissions it needs to perform the required operations on the database. For details, see Overview of Database Environment Preparation.
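
    As an illustration only, the sketch below builds the GRANT statements that log-based change capture on MySQL commonly needs (SELECT on the source tables plus the global REPLICATION SLAVE and REPLICATION CLIENT privileges). The helper build_grants, the account name fdl_user, and the host pattern % are hypothetical, and the privilege list itself is an assumption rather than something stated in this document; confirm it against Overview of Database Environment Preparation before running anything.

```python
from typing import List


def build_grants(user: str, host: str, database: str) -> List[str]:
    """Return GRANT statements for a hypothetical pipeline account.

    Assumption: the pipeline reads the MySQL binary log, which typically
    requires SELECT on the source tables plus global replication privileges.
    """
    account = f"'{user}'@'{host}'"
    return [
        # Read access to the source tables for the initial full copy.
        f"GRANT SELECT ON `{database}`.* TO {account};",
        # Replication privileges are global, so they are granted on *.*.
        f"GRANT REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO {account};",
    ]


if __name__ == "__main__":
    for statement in build_grants("fdl_user", "%", "fdl_test"):
        print(statement)
```

    Review the generated statements with your database administrator, then use the Data Source Permission Check button described later to verify the result.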

    Step Three: Pipeline Task Environment Preparation

    Deploy Kafka (an open-source event streaming platform) as the middleware. For details, see Kafka Deployment - ZooKeeper Mode and Transmission Queue Configuration.

    Note:
    You are advised to deploy Kafka on a Linux system. (While Kafka can also be deployed on a Windows system, its performance will be limited. This deployment method is only suitable for demonstration purposes and not recommended in production environments.) Additionally, you can deploy Kafka and FineDataLink on different servers.
    Step Four: Pipeline Task Permission Assignment

    Grant permission to use Data Pipeline to users who are not super admins. For details, see Pipeline Task Management Permission.

    Note:
    To synchronize multiple tables in real time from a MySQL, SQL Server, or Oracle database, you are advised to use a single pipeline task to avoid overloading the database.

    Pipeline Task Creation

    Log in to the FineDataLink project, choose Data Pipeline > Real-Time Pipeline, and create a pipeline task, as shown in the following figure.

    Pipeline Task Configuration

    Data Source and Data Destination

    Select the data connection of the tables to be synchronized and the data connection and database of the target tables.

    You are advised to click the Data Source Permission Check button in Data Source to check if the database user has permission to read logs.

    Basic Setting

    The following figure shows the operation steps.

    For detailed instructions on settings, see Pipeline Task Configuration - Advanced Setting.

    Step
    Description

    Set Data Deletion Strategy to Logical Deletion at Target End.

    A boolean field named _fdl_marked_deleted (whose value defaults to false) will be added to the target table to record the deletion status of data. If a data record is deleted from the source table, the system will not physically delete the corresponding record in the target table after synchronization. Instead, it will change the _fdl_marked_deleted value of this record to true.

    Enable Mark Timestamp During Synchronization.

    A long integer field named _fdl_update_timestamp will be added to all target tables to record the time of data inserts or updates, based on the database time, using millisecond-level timestamps.

    Enable Synchronize Source Table Structure Change.

    The pipeline task will automatically synchronize data definition language (DDL) operations that occur on the source database, such as deleting tables, adding/deleting/renaming fields, and modifying field types (including length change and compatible type change), to the target end without the need for manual intervention.
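
    The effect of the first two settings can be pictured with a small in-memory model. The sketch below is not FineDataLink code: apply_change and the dictionary standing in for a target table are hypothetical, and it assumes that a logical delete leaves _fdl_update_timestamp unchanged, which this document does not specify.

```python
import time
from typing import Dict, Optional

# In-memory stand-in for a target table, keyed by primary key.
Target = Dict[int, dict]


def apply_change(target: Target, op: str, key: int,
                 row: Optional[dict] = None,
                 now_ms: Optional[int] = None) -> None:
    """Apply one source change with logical deletion and timestamp marking."""
    ts = now_ms if now_ms is not None else int(time.time() * 1000)
    if op in ("insert", "update"):
        # Inserts and updates store the row together with both marker fields.
        target[key] = {**(row or {}),
                       "_fdl_marked_deleted": False,
                       "_fdl_update_timestamp": ts}
    elif op == "delete":
        # Logical deletion: keep the row and only flip the flag.
        if key in target:
            target[key]["_fdl_marked_deleted"] = True
    else:
        raise ValueError(f"unknown operation: {op}")


if __name__ == "__main__":
    table: Target = {}
    apply_change(table, "insert", 1, {"name": "widget"}, now_ms=1000)
    apply_change(table, "delete", 1)
    print(table)
```

    Queries against the target table then need to filter on _fdl_marked_deleted to exclude rows that were deleted at the source.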

    Synchronization Configuration

    1. Add source tables.

    Click the Add Table button, and select the S_Order and S_Product tables from the fdl_test database.

    Set Synchronization Type to Full + Incremental Synchronization. The pipeline task will first synchronize all existing data and then continuously synchronize subsequent changes.

    Finally, click the Add Table button, as shown in the following figure.
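
    The two stages of Full + Incremental Synchronization can be sketched as a snapshot copy followed by an ordered replay of changes. This is a conceptual model under stated assumptions, not the product's implementation: full_then_incremental and the event tuples are hypothetical, and deletes are left out because their handling depends on Data Deletion Strategy.

```python
from typing import Dict, Iterable, Tuple

Row = Dict[str, object]
Change = Tuple[str, int, Row]  # (operation, primary key, row data)


def full_then_incremental(snapshot: Dict[int, Row],
                          changes: Iterable[Change]) -> Dict[int, Row]:
    """Copy the existing rows first, then replay later changes in order."""
    # Stage 1 (full): copy every row that exists when the task starts.
    target = {key: dict(row) for key, row in snapshot.items()}
    # Stage 2 (incremental): apply each change captured after the snapshot.
    for op, key, row in changes:
        if op not in ("insert", "update"):
            raise ValueError(f"unsupported operation in this sketch: {op}")
        target[key] = dict(row)  # upsert by primary key
    return target


if __name__ == "__main__":
    existing = {1: {"name": "a"}, 2: {"name": "b"}}
    later = [("update", 1, {"name": "a2"}), ("insert", 3, {"name": "c"})]
    print(full_then_incremental(existing, later))
```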

    2. Set target tables.

    For each of the two source tables, rename the target table and set a physical primary key for it, as shown in the following figure. (The physical primary key you set will be displayed in Primary Key Mapping in Write Method.)

    In this example, the target table of S_Product is named S_Product_1, and the target table of S_Order is named S_Order_1.

    Because the synchronization type is Full + Incremental Synchronization, data writing goes through two stages, each with its own write method. The strategy for primary key conflicts cannot be edited.

    Since Mark Timestamp During Synchronization is enabled and Logical Deletion at Target End is selected, each target table has two new fields, namely, _fdl_update_timestamp and _fdl_marked_deleted, as shown in the following figure.

    Note:
    You can also set target tables to existing tables and perform batch operations such as renaming target tables and configuring primary keys. For details, see Real-Time Pipeline Task Configuration - Synchronization Configuration.

    3. Save the configuration.

    Click the Save button in the lower left corner.

    Task Control

    Click the Task Control button and configure settings.

    For details, see Real-Time Pipeline Task Configuration - Task Control.

    Setting Item
    Description

    Set Single-Table Dirty Data Threshold to 1000 Row(s).

    The task aborts once the number of dirty data records in a single table in the pipeline task reaches the limit. 

    Note:

    1. You can set it to at most 100,000 row(s). The counter is reset after the dirty data has been processed. 

    2. For details about dirty data processing, see Real-Time Pipeline Task - Dirty Data Processing. 

    Enable Retry After Failure and set it to Retry 3 Times, 2 Minute(s) Apart.

    If the pipeline task fails, it retries three times with a two-minute interval between each retry.

    Tick required notification content and configure the notification channel and recipient. In this example, if the pipeline task aborts, the task retries due to an error, table synchronization retries due to an error, or the source table structure changes, the specified person in charge will be notified.


    Set the log level to INFO if you want detailed logs.
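
    The dirty data threshold and the retry setting above behave roughly like the following sketch. It is a simplified model, not the product's logic: DirtyDataCounter, DirtyDataLimitReached, and run_with_retry are hypothetical names, and the model does not say whether a task aborted by the dirty data threshold is itself retried.

```python
import time
from typing import Callable, Dict


class DirtyDataLimitReached(Exception):
    """Raised when one table accumulates too many dirty records."""


class DirtyDataCounter:
    def __init__(self, threshold: int = 1000) -> None:
        self.threshold = threshold
        self.counts: Dict[str, int] = {}

    def record(self, table: str) -> None:
        """Count one dirty record; abort once the table reaches the limit."""
        self.counts[table] = self.counts.get(table, 0) + 1
        if self.counts[table] >= self.threshold:
            raise DirtyDataLimitReached(table)

    def reset(self, table: str) -> None:
        """Reset the counter after the dirty data has been processed."""
        self.counts[table] = 0


def run_with_retry(task: Callable[[], None], retries: int = 3,
                   interval_s: float = 120.0,
                   sleep: Callable[[float], None] = time.sleep) -> int:
    """Run task; on failure retry up to `retries` times, waiting in between.

    Returns the number of retries that were needed.
    """
    for attempt in range(retries + 1):
        try:
            task()
            return attempt
        except Exception:
            if attempt == retries:
                raise  # all retries are used up
            sleep(interval_s)
    raise AssertionError("unreachable")
```

    With the values in this example (retries=3, interval_s=120), a task that keeps failing is attempted four times in total before the failure is reported.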

    Starting Synchronization for All Tables

    Click the Start Synchronization for All Tables button to start real-time synchronization for source tables, as shown in the following figure.

    Note:
    Before starting a pipeline task, ensure there is sufficient memory and concurrent capacity to run the real-time pipeline task. For details, see Load Distribution.

    Effect Display

    1. You can view the number of read and written rows, as shown in the following figure.

    In this example, a dirty data record is generated in S_Order_1. You can process it following Real-Time Pipeline Task - Dirty Data Processing.

    You can pause the synchronization of a single table or all tables.

    2. You can view the two tables in the fdldemotest database.

    Data in S_Product_1 (the target table of S_Product) is shown in the following figure.


    Since the notification for source table structure changes has been configured in the "Task Control" section, adding a field named test to the S_Product table in the fdl_test database will trigger a notification.

    A test field will be added to the S_Product_1 table, as shown in the following figure.
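
    Conceptually, the structure change is mirrored through the source-to-target table mapping configured earlier. The sketch below models only the "add field" case with in-memory column lists; apply_add_column is a hypothetical helper, the columns id and name are placeholders, and the position of the new column in the list carries no meaning.

```python
from typing import Dict, List

# Source-to-target table names used in this example.
TABLE_MAP: Dict[str, str] = {"S_Order": "S_Order_1", "S_Product": "S_Product_1"}


def apply_add_column(schemas: Dict[str, List[str]], source_table: str,
                     column: str) -> str:
    """Mirror an 'add column' change from a source table onto its target."""
    target_table = TABLE_MAP[source_table]
    columns = schemas.setdefault(target_table, [])
    if column not in columns:  # applying the same change twice is harmless
        columns.append(column)
    return target_table


if __name__ == "__main__":
    # Placeholder columns; the real table structure is not shown here.
    schemas = {"S_Product_1": ["id", "name"]}
    changed = apply_add_column(schemas, "S_Product", "test")
    print(changed, schemas[changed])
```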

    Pipeline Task O&M

    Choose O&M Center > Real-Time Pipeline, where you can view the task execution status and the data synchronization performance, and check and handle exceptions, as shown in the following figure.

    For details, see Real-Time Pipeline Task O&M - Task Management.

    Pipeline Task Configuration Modification

    To modify the configuration of a real-time pipeline task that has been running for a period of time (for example, to add or delete tables, or to change the settings of Task Control, Advanced Setting, or Synchronization Type), see Real-Time Pipeline Task Management.
