Task Control - Fault Tolerance Mechanism

  • Last update: May 13, 2025
  • Overview

    Version

    FineDataLink Version | Functional Change
    4.1.1 | Retried failed nodes in Loop Container with parameters of the current loop immediately upon node failure (without retrying the entire Loop Container node) when Retry After Failure was enabled.
    4.1.4 | Added Save and Cancel buttons for you to confirm or discard your edits.

    Function Description

    You can configure Timeout Limit, Retry After Failure, and Dirty Data Tolerance on the Fault Tolerance Mechanism tab page, as shown in the following figure.

    Note:
    The Result Notification configuration of a scheduled task takes effect only when you publish the task to Production Mode.

    1.2.png


    Timeout Limit

    A task that has been running for a long time may encounter exceptions and occupy resources.

    You can configure timeout limits in Timeout Limit for tasks to forcibly terminate running tasks that run longer than the set duration.

    Retry After Failure

    A task interrupted due to network fluctuations or other reasons can be executed successfully if you rerun it after a while.

    You can configure the number of retries and the interval between retries in Retry After Failure to automatically rerun the task upon failure.

    Dirty Data Tolerance

    You can set Dirty Data Threshold to enhance the fault tolerance of tasks.

    For details about the definition of dirty data, see Dirty Data.

    Retry After Failure

    A task interrupted due to network fluctuations or other reasons can often be executed successfully if you rerun it after a while. To recover from such interruptions automatically, you can configure the number of retries and the interval between retries in Retry After Failure so that the task is rerun upon failure.

    The Retry After Failure function is disabled by default and needs to be enabled manually.

    2-1.png

    Setting Item | Description
    Retry Times | The default value is 3 and the maximum value is 100.
    Retry Interval | The default value is 2 and the maximum value is 60 (unit: minute).
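
    The two settings can be pictured with a minimal Python sketch. It is an illustration only, not FineDataLink's implementation: `run_with_retry`, `flaky_node`, and the injectable `sleep` argument are hypothetical names, and only the defaults and maximums come from the table above.

```python
import time


def run_with_retry(node, retry_times=3, retry_interval_min=2, sleep=time.sleep):
    """Run `node`; on failure, wait and rerun it up to `retry_times` times."""
    # Upper bounds come from the settings table; lower bounds are not documented.
    if retry_times > 100 or retry_interval_min > 60:
        raise ValueError("setting exceeds the documented maximum")
    retries_used = 0
    while True:
        try:
            return node()
        except Exception:
            if retries_used == retry_times:
                raise  # every retry has failed, so the failure is final
            retries_used += 1
            sleep(retry_interval_min * 60)  # Retry Interval is given in minutes


# Usage: a hypothetical node that fails twice, then succeeds on the second retry.
attempts = []


def flaky_node():
    attempts.append(len(attempts) + 1)
    if len(attempts) < 3:
        raise ConnectionError("simulated network fluctuation")
    return "Successful"


waits = []  # records the requested waits (in seconds) instead of sleeping
print(run_with_retry(flaky_node, sleep=waits.append), attempts, waits)
```

    Passing `sleep=waits.append` keeps the example fast; with the default `time.sleep`, each retry would wait the full interval.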

    Note:

    1. Retry After Failure takes effect for the failed node only: the failed node is rerun, not the nodes that have already succeeded.

    2-2.png

    2. Situations like network interruptions may cause database disconnection and partial data appending. In such cases, rerunning the task may result in duplicate data appending or errors due to primary key conflicts.

    3. If Retry After Failure is enabled for the task invoked using the Invocation Task node, the entire subtask will be re-executed, rather than the failed node in the subtask only.

    4. If a scheduled task contains a Loop Container node with Retry After Failure enabled:

    In FineDataLink of versions earlier than V4.1.1:

    • If the failed node is within Loop Container, the entire Loop Container node will be rerun.

    • For example, if the Loop Container node is set to run five times but a node within it fails during the second iteration, the entire node will rerun from the first iteration.

    In FineDataLink of V4.1.1 and later versions:

    • When a node in Loop Container fails, the system will immediately retry the failed node using the parameters of the current loop instead of retrying the entire Loop Container node.

    5. If Retry After Failure is enabled for a task with the number of retries set to 3, the task will only be considered failed after all three retries have failed.

    For example, the execution judgment on the connector specifies that the Notification node is executed only if the Data Synchronization node fails, as shown in the following figure. If the number of retries is set to 3 and the Data Synchronization node fails, the Notification node will be executed only after all three retries of the Data Synchronization node have failed.

    2-3.png
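
    The newer Loop Container behavior in note 4 (retry only the failed node, with the parameters of the current loop) can be sketched as follows. This is a hedged illustration: `run_loop_container` and `sync_node` are hypothetical names, and retrying without any wait is an assumption based on the word "immediately".

```python
def run_loop_container(node, loop_params, retry_times=3):
    """Run `node` once per loop parameter, retrying only the failed iteration."""
    results = []
    for params in loop_params:
        for attempt in range(retry_times + 1):  # first run plus retries
            try:
                results.append(node(params))
                break  # this iteration succeeded; move on to the next one
            except Exception:
                if attempt == retry_times:
                    raise  # retries exhausted for this iteration
    return results


# Usage: five iterations; the node fails once during the second one.
executed = []
failed_once = []


def sync_node(day):
    executed.append(day)
    if day == 2 and not failed_once:
        failed_once.append(True)
        raise ConnectionError("simulated failure")
    return f"day-{day} loaded"


run_loop_container(sync_node, [1, 2, 3, 4, 5])
print(executed)
```

    In this sketch the first iteration is not executed again after the failure; only the second iteration is repeated with the same parameter before the loop continues.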

    Timeout Limit

    A task that has been running for a long time may experience exceptions and occupy resources. To address this issue, you can configure timeout limits in Timeout Limit for tasks to forcibly terminate running tasks that run longer than the set duration.

    The Timeout Limit function is disabled by default and needs to be enabled manually. The default value is one hour, and the maximum value is 48 hours.

    3-1.png
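
    As a rough analogy for forcibly terminating a run that exceeds its limit, the sketch below uses Python's standard `subprocess.run` with a `timeout`. It does not show how FineDataLink enforces the limit: `run_with_timeout`, the child command, and the returned status strings are hypothetical, and only the one-hour default and 48-hour maximum come from the text above.

```python
import subprocess
import sys


def run_with_timeout(command, timeout_hours=1.0):
    """Run `command`; kill it if it runs longer than `timeout_hours`."""
    if timeout_hours > 48:
        raise ValueError("Timeout Limit cannot exceed 48 hours")
    try:
        # subprocess.run kills the child process when the timeout expires.
        subprocess.run(command, timeout=timeout_hours * 3600, check=True)
        return "Successful"
    except subprocess.TimeoutExpired:
        return "Terminated"


# Usage: a stand-in task that would run for 30 seconds, limited to half a second.
long_task = [sys.executable, "-c", "import time; time.sleep(30)"]
print(run_with_timeout(long_task, timeout_hours=0.5 / 3600))
```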

    Dirty Data Tolerance

    You can set Dirty Data Threshold to enhance the fault tolerance of tasks. The scheduled task continues running despite dirty data and does not trigger the error until the set limit of dirty data is reached.

    4-1.png

    Dirty Data Definition:

    For details, see Dirty Data.

    Note:
    For running scheduled tasks, the cumulative dirty data counted by Dirty Data Threshold does not include the dirty data generated due to data processing exceptions (for example, JSON parsing failure) during transformation.

    Function Description:

    1. The Dirty Data Tolerance button is disabled by default. The default value of Dirty Data Threshold is 1000, with an input range of 1 to 10,000.

    2. If Dirty Data Threshold is set, erroneous data in all output components will be routed to the dirty data queue and recorded as dirty data generated in the corresponding operator/node.

    An output component must meet all the following conditions:

    • It is a Data Synchronization node or an output operator in the Data Transformation node.

    • DB Table Output is selected as the output end, and the data destination is not a Hive (HDFS), GP (with Parallel Loading enabled), or Transwarp database.

    Note:
    For Redshift (with high-speed loading enabled), StarRocks, and Doris databases, the data writing is subject to the threshold setting, but FineDataLink cannot track the reasons for dirty data statistically.

    4-2.png

    3. The Dirty Data Threshold configuration applies to all output components in the task. For example, if you set Dirty Data Threshold to 1000, each output component in the task will be subject to the same node-level dirty data threshold of 1000.
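
    The node-level counting described in items 2 and 3 can be sketched as below. This is a simplified model, not the product's code: `OutputComponent`, `DirtyDataLimitExceeded`, and `is_valid` are hypothetical names, and raising the error once the count exceeds (rather than equals) the threshold is an assumption, because the page does not pin down the boundary case.

```python
class DirtyDataLimitExceeded(Exception):
    """Raised when one output component collects too many dirty rows."""


class OutputComponent:
    def __init__(self, name, threshold=1000):
        # Documented input range for Dirty Data Threshold: 1 to 10,000.
        if not 1 <= threshold <= 10000:
            raise ValueError("Dirty Data Threshold must be between 1 and 10000")
        self.name = name
        self.threshold = threshold
        self.written = []
        self.dirty_rows = []  # stands in for the dirty data queue

    def write(self, row, is_valid):
        if is_valid(row):
            self.written.append(row)
            return
        self.dirty_rows.append(row)  # route the bad row, keep running
        if len(self.dirty_rows) > self.threshold:  # assumed boundary
            raise DirtyDataLimitExceeded(self.name)


# Usage: the same threshold applies to each component, counted separately.
def is_valid(row):
    return row is not None


outputs = [OutputComponent("output A", threshold=2),
           OutputComponent("output B", threshold=2)]
for out in outputs:
    for row in [1, None, 2, None, 3]:
        out.write(row, is_valid)
print([(o.name, len(o.written), len(o.dirty_rows)) for o in outputs])
```

    Each component tolerates its own two dirty rows here, so neither raises even though the task as a whole has seen four.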

    If dirty data is generated during task execution:

    1. The Statistics tab page in the task log display area shows dirty data information. You can click the number of dirty data rows to view the row count of dirty data caused by different reasons.

    4-3.png

    2. You can choose O&M Center > Running Record and click the View Details button on the right side of a task instance to retry the task and view retry records. For details about the task retry function and retry scenarios, see Task Retry.

    3. If the number of generated dirty data records does not exceed the dirty data threshold, the scheduled task will continue running despite dirty data and be labeled Successful in Running Status if it is executed successfully.
