Function Description of Real-Time Capture Task

  • Last update: May 07, 2026
  • Overview

    Version

    FineDataLink Version | Functional Change
    4.2.4.3 | /
    4.2.14.1 | Optimized the execution logic of real-time capture tasks.
    4.2.17.4 | Optimized the retry logic of capture tasks.

    Application Scenario

    When multiple real-time tasks or real-time pipeline tasks synchronize data in real time from different tables in the same database, especially when different tables are processed in different real-time tasks, the database logs are parsed repeatedly, causing heavy database load.

    Function Description

    Real-Time Capture Task supports the management of database log parsing and stores the change data generated by log parsing. Real-time tasks/real-time pipeline tasks can consume data from the real-time capture task.

    Note:
    By default, all real-time tasks and real-time pipeline tasks use real-time capture tasks for their CDC.

    image 1.png

    Execution Description of Real-Time Capture Tasks

    image 2.png

    Creating a Real-Time Capture Task

    Procedure

    1. Choose System Management > Data Connection > Real-Time Capture Task, and configure Transmission Queue Configuration.

    2. After you create a real-time pipeline task or real-time task, a real-time capture task is automatically added under System Management > Data Connection > Real-Time Capture Task, as shown in the following figure. You do not need to add it manually.

    1.jpg

    Relationships Between Real-Time Capture Tasks and Data Connections

    For FineDataLink of versions earlier than V4.2.14.1:

    Capture tasks are distinguished by the data connection URL:

    • If the URLs of different data connections are exactly the same, only one real-time capture task is created.

    • If the data connection URL changes (for example, with a space or a query parameter added), a new capture task is created.

    2.png

    From FineDataLink V4.2.14.1:

    Capture tasks are distinguished by the data connection name.
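The difference between the two identification rules can be sketched as follows. This is a minimal illustration, not FineDataLink code: the `Connection` type, the `capture_task_key` function, and the JDBC URLs are hypothetical names and values chosen for the example.

```python
from typing import NamedTuple, Tuple


class Connection(NamedTuple):
    name: str  # data connection name
    url: str   # data connection URL, compared character by character


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '4.2.14.1' into (4, 2, 14, 1) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def capture_task_key(conn: Connection, fdl_version: str) -> str:
    """Return the value that identifies the capture task for a connection."""
    if parse_version(fdl_version) < parse_version("4.2.14.1"):
        return conn.url   # earlier versions: one capture task per distinct URL
    return conn.name      # V4.2.14.1 and later: one capture task per connection name


# Hypothetical connections: a and b share a URL, c adds a query parameter.
a = Connection("link1", "jdbc:mysql://db:3306/sales")
b = Connection("link2", "jdbc:mysql://db:3306/sales")
c = Connection("link3", "jdbc:mysql://db:3306/sales?useSSL=false")

old_keys = {capture_task_key(x, "4.2.4.3") for x in (a, b, c)}
new_keys = {capture_task_key(x, "4.2.14.1") for x in (a, b, c)}
```

Under the URL rule, `a` and `b` share one capture task while `c` gets its own. Under the name rule, each of the three connections maps to its own key.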

    Example

    Note:
    Resetting and starting a real-time pipeline task or real-time task is equivalent to creating a task. For details, see the scenarios below.

    Scenario: An existing real-time capture task is running, and a task is created for a table not included in the real-time capture task.
    • Real-Time Capture Task1 for the link1 data connection is running, capturing CDC data from table1 and table2.

    • You create and start Task1 for table3 with link1.

    Description: When the task starts, log parsing for table3 is automatically added to Real-Time Capture Task1.

    • Logs are parsed based on the time point of table3 to independently capture data for table3.

    Scenario: An existing real-time capture task is running, and a task is created for a table included in the capture task.

    • Real-Time Capture Task1 for link1 is running, capturing CDC data from table1, table2, and table3.

    • You create and start Task1 for table1 with link1.

    Description: Task1 directly uses the log data captured by Real-Time Capture Task1:

    1. If Synchronization Type of Task1 is set to Full + Incremental Synchronization or set to Incremental Synchronization Only with Incremental Sync Start Time set to Start Time of Table Synchronization, Task1 uses Real-Time Capture Task1.

    2. If Synchronization Type of Task1 is set to Incremental Synchronization Only, with Incremental Sync Start Time set to a custom time:

    • If the custom time is earlier than the earliest time point in Real-Time Capture Task1, logs are parsed from the time point of table1 to automatically backfill the missing data. If the logs for the corresponding time point are no longer available in the database during backfilling, Task1 aborts with an error, while Real-Time Capture Task1 continues to run normally.

    • If the custom time is later than the earliest time point in Real-Time Capture Task1, Task1 directly uses Real-Time Capture Task1.
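The start-up decision described above can be sketched as a small function. This is an illustrative model only: the function name, the parameter names, and the returned labels are hypothetical, and treating a custom time equal to the earliest captured time point as "later" is an assumption the text does not confirm.

```python
from datetime import datetime
from typing import Optional


def plan_task_start(
    custom_start: Optional[datetime],
    earliest_captured: datetime,
    oldest_db_log: datetime,
) -> str:
    """Decide how a new task on an already captured table gets its data.

    custom_start is None for Full + Incremental Synchronization, and for
    Incremental Synchronization Only with Start Time of Table Synchronization.
    """
    if custom_start is None:
        return "use_capture_task"
    if custom_start >= earliest_captured:
        # Assumption: equality is handled like the "later" case.
        return "use_capture_task"
    if custom_start < oldest_db_log:
        # Logs needed for backfilling are gone: only the new task aborts,
        # the capture task keeps running.
        return "abort_task"
    return "backfill"
```

For example, with captured data starting at 12:00 and database logs retained from 10:00, a custom time of 11:00 leads to a backfill, while 09:00 aborts the new task.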

    Impact of Pausing/Aborting a Real-Time Pipeline Task or Real-Time Task on the Capture Task

    For FineDataLink of versions earlier than V4.2.14.1:

    Scenario:

    • Real-Time Capture Task1 for link1 is running, capturing CDC data from table1, table2, and table3.

    • Task1 is synchronizing data for table1, table2, and table3, and Task1 is stopped/aborted for reasons unrelated to CDC data.

    Description: The system checks whether the CDC data for all source tables used in the task is being used by other tasks:

    • If no other tasks are using the data, Real-Time Capture Task1 continues capturing for 24 hours, then automatically stops.

    • If other tasks are using the data, Real-Time Capture Task1 continues running.


    Scenario:

    • Real-Time Capture Task1 for link1 is running, capturing CDC data from table1, table2, and table3.

    • Task1 is synchronizing data for table1, and is stopped/aborted for reasons unrelated to CDC data.

    Description: The system checks whether the CDC data for all source tables used in the task is being used by other tasks:

    • If no other tasks are using the data, Real-Time Capture Task1 automatically pauses capturing CDC data from table1 but continues capturing CDC data from table2 and table3.

    • If other tasks are using the data, Real-Time Capture Task1 continues capturing CDC data from table1, table2, and table3.

    From FineDataLink V4.2.14.1:

    When a real-time pipeline task/real-time task is paused, the system no longer automatically pauses the capture of tables in the capture task. (If there are unused tables in the real-time capture task, you will receive a notification. You can also manually remove tables from the real-time capture task.)

    Impact of Deleting a Real-Time Pipeline Task or Real-Time Task on the Capture Task

    For FineDataLink of versions earlier than V4.2.14.1:

    Scenario:

    • Real-Time Capture Task1 for link1 is running, capturing CDC data from table1, table2, and table3.

    • Task1 is synchronizing data for table1, table2, and table3. After stopping the task, you delete it.

    Description: When you delete the task, the system checks whether the CDC data of all source tables used in the task is being used by other tasks:

    • If no other tasks are using the data, the system automatically deletes Real-Time Capture Task1 and the captured CDC data from table1, table2, and table3.

    • If other tasks are using the data, Real-Time Capture Task1 continues capturing CDC data from table1, table2, and table3.


    Scenario:

    • Real-Time Capture Task1 for link1 is running, capturing CDC data from table1, table2, and table3.

    • Task1 is synchronizing data for table1. After stopping the task, you delete it.

    Description: When you delete the task, the system checks whether the CDC data of all source tables used in the task is being used by other tasks:

    • If no other tasks are using the data, Real-Time Capture Task1 no longer captures CDC data from table1 and automatically deletes its captured data, while continuing to capture CDC data from table2 and table3.

    • If other tasks are using the data, Real-Time Capture Task1 continues capturing CDC data from table1, table2, and table3.
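For versions earlier than V4.2.14.1, the two deletion scenarios above can be read as one rule: tables that no remaining task uses are dropped from the capture task, and the capture task itself is deleted once no tables remain. The sketch below models that reading. The function and parameter names are hypothetical, and evaluating usage table by table is an assumption that generalizes the two documented scenarios.

```python
from typing import Dict, Set, Tuple


def on_task_deleted(
    captured: Set[str],
    deleted_task_tables: Set[str],
    other_tasks: Dict[str, Set[str]],
) -> Tuple[Set[str], bool]:
    """Return (tables still captured, whether the capture task is deleted)."""
    # Tables that some other real-time or pipeline task still consumes.
    still_used = set().union(*other_tasks.values())
    # Assumption: each table of the deleted task is checked on its own.
    dropped = {t for t in deleted_task_tables if t not in still_used}
    remaining = captured - dropped
    return remaining, not remaining


captured = {"table1", "table2", "table3"}
all_deleted = on_task_deleted(captured, {"table1", "table2", "table3"}, {})
one_deleted = on_task_deleted(captured, {"table1"}, {})
one_shared = on_task_deleted(captured, {"table1"}, {"Task2": {"table1"}})
```

Here `all_deleted` leaves no tables and removes the capture task, `one_deleted` keeps table2 and table3, and `one_shared` keeps all three tables because a hypothetical Task2 still reads table1.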

    From FineDataLink V4.2.14.1:

    When a real-time pipeline task/real-time task is deleted, the system no longer automatically deletes tables in the capture task. (If there are unused tables in the real-time capture task, you will receive a notification. You can also manually remove tables from the real-time capture task.)

    Impact of Starting a Real-Time Pipeline Task or Real-Time Task on the Capture Task

    Scenario:

    • Task1 that synchronizes data from table1, table2, and table3 is currently paused. Meanwhile, Real-Time Capture Task1 is capturing CDC data from table1, table2, and table3.

    • You resume Task1.

    Description: When you resume a task, the system checks whether the CDC data of all source tables used in the task is being captured:

    1. If data is being captured, it indicates that other tasks are using the CDC data of these tables. Synchronization can proceed from the current task checkpoint using data from the real-time data sharing center.

    • If the checkpoint time is earlier than the earliest data in the real-time data sharing center, the task aborts with an error.

    2. If data is not being captured, it indicates that no other tasks are using the CDC data of these tables. The real-time capture task is automatically resumed to continue capturing data from table1, table2, and table3, and synchronization can proceed from the current task checkpoint using data from the real-time data sharing center.

    • If the checkpoint time is earlier than the earliest data in the real-time data sharing center, the task aborts with an error.

    3. When Task1 is resumed, there are three cases:

    • Case 1: If data at the checkpoint time for this table is available in the real-time data sharing center, synchronization can proceed normally.

    • Case 2: If the checkpoint time is earlier than the earliest data in the real-time data sharing center, data is automatically backfilled.

    • Case 3: If the checkpoint time is later than the latest data in the real-time data sharing center, the real-time capture task is first resumed to capture data up to the corresponding checkpoint, and then synchronization can proceed.
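The three cases in item 3 amount to comparing the task checkpoint with the time range held in the real-time data sharing center. A minimal sketch follows, assuming the range is described by an earliest and a latest timestamp; the function name and the returned labels are hypothetical.

```python
from datetime import datetime


def resume_action(checkpoint: datetime, earliest: datetime, latest: datetime) -> str:
    """Map a task checkpoint to one of the three resume cases."""
    if checkpoint < earliest:
        # Case 2: data before the earliest stored point is backfilled.
        return "backfill"
    if checkpoint > latest:
        # Case 3: the capture task first catches up to the checkpoint.
        return "resume_capture_then_sync"
    # Case 1: the checkpoint lies inside the stored range.
    return "sync"
```

The function covers only the three cases listed in item 3, not the abort conditions described in items 1 and 2.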

     image 5.png


    Scenario:

    • Task1, which previously synchronized data from table1, is currently paused. Meanwhile, Real-Time Capture Task1 is capturing CDC data from table1, table2, and table3.

    • You resume Task1.

    Description: When you resume a task, the system checks whether the CDC data of all source tables used in the task is being captured:

    1. If data is being captured, it indicates that other tasks are using the CDC data of these tables. Synchronization can proceed from the current task checkpoint using data from the real-time data sharing center.

    • If the checkpoint time is earlier than the earliest data in the real-time data sharing center, the task aborts with an error.

    2. If data is not being captured, it indicates that no other tasks are using the CDC data of these tables. The real-time capture task is automatically resumed to continue capturing data from table1, and synchronization can proceed from the current task checkpoint using data from the real-time data sharing center.

    • If the checkpoint time is earlier than the earliest data in the real-time data sharing center, the task aborts with an error.

    From FineDataLink V4.2.14.1, the logic is optimized as follows:

    1. If a capture task is manually paused and later resumed by a real-time task, it resumes data capture for all tables.

    2. When a real-time pipeline task or real-time task is resumed:

    • If the corresponding capture task or table does not exist, the real-time pipeline task reports a table-level error, and the real-time task reports a task-level error. Within the capture task, the following error is reported: "No data parsing exists for Table name. Resynchronize to automatically add parsing for this table."

    • If a runtime error occurs for the corresponding table, the real-time pipeline task reports a table-level error, and the real-time task reports a task-level error. Within the capture task, the following error is reported: "Parsing error for Table name. Resynchronize to automatically reset parsing for this table."

    • If the earliest message time of the corresponding table is later than the required time, the real-time pipeline task reports a table-level error, and the real-time task reports a task-level error. Within the capture task, an error is reported indicating that the incremental data for Table name has expired and a resynchronization is required.

    Real-Time Capture Task Runtime Errors

    Scenario: The real-time capture task transitions from Running to Runtime Error.
    • Task1 is synchronizing data from table1, table2, and table3 with the data connection link1 in real time.

    • Real-Time Capture Task1 for link1 encounters an error, for example because of a database connection interruption or missing log files.

    • Real-Time Capture Task1 aborts due to a runtime error.

    • Task1 aborts with an error and outputs the specific error reasons for Real-Time Capture Task1.

    Description: All tasks dependent on Real-Time Capture Task1 abort due to runtime errors and output the specific error reasons for Real-Time Capture Task1.

    You can take appropriate actions based on the specific error details.

    Scenario: The real-time capture task is stopped abnormally.

    • Task1 is synchronizing data from table1, table2, and table3 with link1 in real time.

    • Real-Time Capture Task1 for link1 was stopped due to other reasons.

    Description:

    • All tasks dependent on Real-Time Capture Task1 abort due to runtime errors and output the specific error details: The real-time capture task (Task1) has stopped.

    • If you resume one of these tasks, the system automatically resumes its real-time capture task. You then only need to resume the other tasks.

    Table Exceptions in Real-Time Capture Task

    1. Table exceptions within real-time capture tasks are reflected in both real-time tasks and pipeline tasks.

    When a table within a real-time capture task encounters a runtime error or is abnormally deleted, the dependent real-time task aborts with an error. The abnormal table is removed from the real-time pipeline task, while normal tables are synchronized as usual. The pipeline task reports an overall error only when all tables in the real-time pipeline task are abnormal.

    For example, when parsing logs at the corresponding time point, if the logs for that time point are missing, the real-time capture task aborts with an error, and tasks depending on this real-time capture task also abort with an error.

    2. When a new real-time task or real-time pipeline task has a start time earlier than the earliest data time point in the real-time data sharing center, data is automatically backfilled.
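The different reactions of real-time tasks and real-time pipeline tasks to an abnormal table, described in item 1, can be sketched as follows. The function name, the `kind` labels, and the result dictionary are hypothetical and serve only to illustrate the rule.

```python
from typing import Dict, List, Set


def apply_table_exceptions(kind: str, tables: List[str], abnormal: Set[str]) -> Dict[str, object]:
    """Return whether the task reports an error and which tables keep syncing."""
    healthy = [t for t in tables if t not in abnormal]
    if kind == "real_time_task":
        # A real-time task aborts as soon as one of its tables is abnormal.
        failed = len(healthy) < len(tables)
        return {"task_error": failed, "syncing": [] if failed else healthy}
    if kind == "pipeline_task":
        # A pipeline task drops abnormal tables and keeps the rest running;
        # it reports an overall error only when every table is abnormal.
        return {"task_error": bool(tables) and not healthy, "syncing": healthy}
    raise ValueError(f"unknown task kind: {kind}")
```

With table1 abnormal out of table1 and table2, the real-time task reports an error and stops, while the pipeline task keeps synchronizing table2 without an overall error.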

    Impact on Real-Time Tasks and Real-Time Pipeline Tasks

    1. The real-time capture task writes all CDC data from the database into the real-time data sharing center for you to decide which type of data to use. For example, Data Pipeline supports DDL change synchronization, while real-time tasks currently do not support it.

    Note:

    Real-time capture tasks do not currently support dirty data tolerance. When dirty data is generated, the retry logic is triggered first. If the retry fails 3 times, the capture task aborts with an error.

    1. For FineDataLink of versions earlier than V4.2.17.4, after three retries, encountering the same error again does not trigger another retry; however, encountering a different error resets the retry count and triggers the first retry.

    2. From FineDataLink V4.2.17.4, the retry count is reset if the exception occurs more than 30 minutes after the last task startup.
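The two retry-count rules in the note can be compared with a small model. This is a sketch under assumptions, not the product's implementation: the class and method names are hypothetical, errors are identified by a plain string, and the newer policy models only the time-based reset because the text does not say whether the different-error reset still applies.

```python
from datetime import datetime, timedelta
from typing import Optional

MAX_RETRIES = 3
RESET_WINDOW = timedelta(minutes=30)


class LegacyRetryPolicy:
    """Versions earlier than V4.2.17.4: the count resets only when the error changes."""

    def __init__(self) -> None:
        self.count = 0
        self.last_error: Optional[str] = None

    def should_retry(self, error: str) -> bool:
        if error != self.last_error:
            # A different error starts a fresh series of retries.
            self.count = 0
            self.last_error = error
        if self.count >= MAX_RETRIES:
            return False  # same error after three retries: abort
        self.count += 1
        return True


class TimedRetryPolicy:
    """V4.2.17.4 and later: the count resets when the error occurs long after startup."""

    def __init__(self) -> None:
        self.count = 0

    def should_retry(self, error_time: datetime, last_startup: datetime) -> bool:
        if error_time - last_startup > RESET_WINDOW:
            # The task ran for more than 30 minutes before failing.
            self.count = 0
        if self.count >= MAX_RETRIES:
            return False
        self.count += 1
        return True
```

In the legacy model a fourth occurrence of the same error is not retried, whereas in the timed model an error that arrives more than 30 minutes after the last startup is retried again even if three retries were used earlier.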

    2. Changes to the execution logic of real-time tasks and pipeline tasks:

    a. Retry logic: Retries for pipeline tasks and real-time tasks do not apply to the CDC data source. They only control retries for the connection between pipeline tasks and the real-time data sharing center, as well as their output endpoints.

    b. When a table is deleted:

    • Upon a table deletion event, a real-time task aborts with an error, outputs the corresponding error log, and triggers a result notification.

    • Upon a table deletion event, a real-time pipeline task triggers a notification, outputs the corresponding error log, and continues synchronizing other tables.

    c. Custom sync time: When Custom Time is selected as the synchronization type in real-time tasks or real-time pipeline tasks, the earliest synchronization time you can set is the earlier of the earliest database log time and the earliest time of data in the real-time data sharing center. For real-time pipeline tasks, the synchronization uses the latest time among the earliest time points of all tables in the current task.

    d. Resumption of tasks: When a real-time or pipeline task resumes, if the earliest data in the real-time data sharing center is later than the checkpoint time, the task aborts with an error.
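The time bounds in item c can be expressed with `min` and `max`. The sketch below is illustrative only; the function and parameter names are hypothetical, and the timestamps in the usage lines are made-up sample values.

```python
from datetime import datetime
from typing import Dict


def earliest_selectable_time(earliest_db_log: datetime, earliest_shared: datetime) -> datetime:
    """Earliest custom time: the earlier of the database log and sharing center bounds."""
    return min(earliest_db_log, earliest_shared)


def pipeline_start_bound(per_table_earliest: Dict[str, datetime]) -> datetime:
    """For a pipeline task: the latest of the per-table earliest time points."""
    return max(per_table_earliest.values())


# Sample values for illustration.
single = earliest_selectable_time(datetime(2026, 5, 1, 8, 0), datetime(2026, 5, 1, 10, 0))
pipeline = pipeline_start_bound({
    "table1": datetime(2026, 5, 1, 8, 0),
    "table2": datetime(2026, 5, 1, 9, 30),
})
```

Taking the latest of the per-table earliest points gives a start time for which every table in the pipeline task still has data available.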

    Task O&M

    For details, see Real-Time Capture Task O&M.

     

