You may have the following questions after running a pipeline task:
How many pipeline tasks can my project run simultaneously at most?
What should I do after a pipeline task is manually paused or aborted?
How can I add or delete tables after the pipeline task has started running?
What should I do if the original task cannot run and needs reconfiguration, but a setting cannot be modified on the editing page?
How is dirty data handled?
How can I view the pipeline task logs?
This article answers these questions for you.
The maximum number of data pipeline tasks that can run simultaneously in FineDataLink is explained as follows:
In FineDataLink V4.1.4 and later versions, you can modify the number of concurrent pipeline tasks in Concurrency Control.
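Conceptually, concurrency control caps how many pipeline tasks run at once, with extra tasks waiting for a free slot. The sketch below illustrates this idea only; it is not how FineDataLink schedules tasks internally, and all names in it are hypothetical:

```python
import threading

# Conceptual sketch: at most `max_concurrent` pipeline tasks run at a
# time; additional tasks block until a slot is released.
# (Illustrative only; not FineDataLink's internal scheduler.)
max_concurrent = 2
slots = threading.Semaphore(max_concurrent)
results = []

def run_pipeline_task(name):
    with slots:               # blocks until a concurrency slot is free
        results.append(name)  # stands in for the actual synchronization work

threads = [threading.Thread(target=run_pipeline_task, args=(f"task-{i}",))
           for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(results))  # all five tasks eventually ran, two at a time
```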
The system records checkpoint information for pipeline tasks. A manually paused or aborted pipeline task will continue synchronizing data from the checkpoint after being restarted.
For example:
The pipeline task started reading data on March 21, was stopped on March 23, and was restarted on March 27. The data generated between March 23 and March 27 was then synchronized from the checkpoint.
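The resume behavior above can be sketched as follows. This is illustrative Python only, not FineDataLink's actual implementation; the file format and function names are hypothetical:

```python
import json
import os
import tempfile

# Hypothetical checkpoint file; FineDataLink stores checkpoints internally.
CHECKPOINT_PATH = os.path.join(tempfile.gettempdir(), "pipeline_checkpoint.json")

def save_checkpoint(position):
    """Persist the last synchronized position (e.g. a timestamp or log offset)."""
    with open(CHECKPOINT_PATH, "w") as f:
        json.dump({"position": position}, f)

def load_checkpoint():
    """On restart, return the recorded position so reading resumes there."""
    if not os.path.exists(CHECKPOINT_PATH):
        return None  # no checkpoint yet: start from the configured start point
    with open(CHECKPOINT_PATH) as f:
        return json.load(f)["position"]

# The task was paused on March 23; after restarting on March 27 it
# resumes from the March 23 position, so the March 23-27 data is not skipped.
save_checkpoint("2024-03-23T00:00:00")
resume_from = load_checkpoint()
print(resume_from)
```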
Before modifying the pipeline task configuration, you need to pause the pipeline task first, as shown in the following figure.
Then click the Edit button to edit the pipeline task:
Add source tables:
Enter the task editing page of a pipeline task and add source tables. The added tables will be synchronized according to the selected synchronization type:
1. If you set Synchronization Type to Full + Incremental Synchronization, the added table undergoes full synchronization first, while incremental synchronization is suspended in the background; incremental synchronization starts once the full synchronization of the added table finishes.
2. If you set Synchronization Type to Incremental Sync Only:
If the Incremental Sync Start Point is modified, all tables (including the added table) will be synchronized according to the specified start point.
If the Incremental Sync Start Point is not modified, the added table will be synchronized according to the built-in checkpoint of the task.
The page is shown in the following figure.
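The rules above for a newly added table can be summarized as a small decision function. This is a conceptual sketch; the function and parameter names are hypothetical, not FineDataLink APIs:

```python
def plan_sync_for_added_table(sync_type, start_point_modified):
    """Return how a newly added source table will be synchronized,
    following the rules described above (illustrative only)."""
    if sync_type == "full_plus_incremental":
        # Full sync runs first; incremental sync is suspended and
        # resumes once the full sync of the added table finishes.
        return ["full sync of added table", "resume incremental sync"]
    if sync_type == "incremental_only":
        if start_point_modified:
            # All tables (including the added one) restart from the new start point.
            return ["sync all tables from the specified start point"]
        # Otherwise the added table follows the task's built-in checkpoint.
        return ["sync added table from the task's built-in checkpoint"]
    raise ValueError(f"unknown sync type: {sync_type}")

print(plan_sync_for_added_table("incremental_only", start_point_modified=False))
```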
Delete source tables:
Enter the task editing page of a pipeline task and delete source tables. If you remove a source table and save the modification, all information related to the removed table will be deleted, and the table will no longer be synchronized after the task starts.
In the following scenarios:
During task operation, due to various reasons (such as invalidation of historical checkpoints and main database downtime), the original task cannot run normally and needs to be reconfigured.
A configuration error (usually a mapping error that causes automatic table creation to fail) results in initial execution failure, and you need to modify the task to restore execution.
Since the editable scope of a pipeline task that has been executed is limited, you can create a copy of the pipeline task and configure it, as shown in the following figure.
The pipeline task copy supports more modifications.
For the pipeline task copy:
You can set the target table of the copy to that of the original pipeline task. If you select an existing table as the target table and its structure (table name and field names) is the same as that of the source table, the task will empty the target table and write the full data during the first execution, and then perform incremental synchronization. Executing the task copy therefore does not affect the data.
You can also create a target table to store the data.
The original pipeline task can then be deleted.
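The first-run behavior of a task copy described above can be sketched as follows. This is conceptual Python; the function and step names are hypothetical, not FineDataLink internals:

```python
def first_run_of_copy(target_exists, structure_matches_source):
    """Sketch of what happens on the copy's first execution
    (illustrative only, per the behavior described above)."""
    steps = []
    if target_exists and structure_matches_source:
        # Reusing the original task's target: empty it, rewrite the
        # full data, then continue with incremental synchronization.
        steps += ["truncate target table", "write full data"]
    elif not target_exists:
        # Alternatively, create a new target table to store the data.
        steps += ["create target table", "write full data"]
    steps.append("incremental synchronization")
    return steps

print(first_run_of_copy(target_exists=True, structure_matches_source=True))
```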
Definition of dirty data in data pipelines:
Data that fails to be written due to a mismatch between source and target fields (such as a length or type mismatch, a missing target field, or a violation of a NOT NULL constraint on the target table) is regarded as dirty data.
Dirty data threshold:
In the Pipeline Control step, you can set a dirty data threshold, and the task will be terminated when the threshold is reached. The dirty data threshold limits the total number of dirty data records in a task since task creation, as shown in the following figure.
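The cumulative nature of the threshold can be sketched as follows: the count accumulates from task creation, and the task terminates once it reaches the threshold. This is illustrative only; the class and method names are hypothetical, not FineDataLink internals:

```python
class DirtyDataMonitor:
    """Conceptual sketch of a cumulative dirty-data threshold
    (illustrative; not FineDataLink's implementation)."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.dirty_count = 0  # cumulative total since task creation

    def record_dirty(self, n=1):
        """Add n dirty records; return True if the task should terminate."""
        self.dirty_count += n
        return self.dirty_count >= self.threshold

monitor = DirtyDataMonitor(threshold=100)
terminated = False
for batch_dirty in [40, 30, 35]:  # dirty records produced per batch
    if monitor.record_dirty(batch_dirty):
        terminated = True  # cumulative total reached the threshold
        break
print(monitor.dirty_count, terminated)  # 105 True
```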
Dirty data handling:
For details, see Real-Time Pipeline Task - Dirty Data Processing.
For details about how to view pipeline task statistical logs and pipeline task operation logs, see Single Pipeline Task O&M.
To view more detailed logs, you can change the log level of the pipeline task to INFO in the Pipeline Control step, which prints detailed logs to fanruan.log, as shown in the following figure.