You want the insert, update, and delete operations on the source table to be synchronized to the target table, while ensuring that the update field in the target table records the timestamp of each data change.
You can use the Data Comparison operator, which will generate an fdl_comparison_type column to record insert, update, and delete operations by identifier value automatically. However, the update field in the target table cannot be updated simultaneously to record the data deletion timestamp.
1. Synchronize the insert, update, and delete operations from the source table to the target table.
Use the Data Comparison operator.
2. Enable the update field in the target table to record the data change time.
Use the New Calculation Column operator, add an update field, with its value set to the current task execution time. Use the Data Distribution function to route data marked as Added or Changed to the DB Table Output operator for insert/update operations. Route data marked as Removed to the DB Table Output operator for update operations. In this way, the data change time can be recorded in the update field, including data deletion time, as shown in the following figure.
The following figure shows the final effect.
You can download the example data: Source_Table.xls, Target_Table.xls.
Now the data of Sugar cane and Frozen pear are deleted, the data of Grape is inserted, and the data of Pineapple is updated in the source table. You want to synchronize the latest data to the target table and update the data change time in the update field in the target table.
Each data update involves comparing the data from the source table with the non-deleted records in the target table. Therefore, you need to add an identifier column to the target table to mark data changes.
Since the Data Comparison operator will generate an fdl_comparison_type column to record identifier values, you should add a column named fdl_comparison_type (with empty values initially) to the target table, as shown in the following figure.
1. Create a scheduled task, drag a Data Transformation node onto the page, and enter the Data Transformation editing page.
2. Drag in a DB Table Input operator, rename the operator Source Table, and configure it to fetch data from the source table, as shown in the following figure.
3. Drag in another DB Table Input operator, rename the operator Target Table, and configure it to fetch the non-deleted records from the target table, as shown in the following figure.
SELECT ID,Name,Sales FROM `fdldemotest`.`Target_Table` where fdl_comparison_type != 'Removed' OR fdl_comparison_type is Null
1. Add a Data Comparison operator and connect it to the two DB Table Input operators as the downstream operator using connectors.
2. In the Data Comparison operator, configure the logical primary key and comparison field, as shown in the following figure.
The following table describes the configuration items.
You can set field pairs, based on which the corresponding values in two tables will be compared. An identifier column fdl_comparison_type will be automatically generated based on the comparison results. This column can be deleted in Field Mapping in the DB Table Output operator.
If a field is selected as Comparison Field, data insertions, updates, and deletions on this field are identified.
If a field is not selected as Comparison Field, data insertions and deletions on this field are identified, but any updates on it cannot be identified.
The Data Preview tab page is shown in the following figure, where you can see that all data changes have been marked by the fdl_comparison_type field.
Drag in a New Calculation Column operator, and add a new column named update with the value now() to record the current task execution time, as shown in the following figure.
Click Data Preview, as shown in the following figure.
1. Drag in two DB Table Output operators and rename them, as shown in the following figure.
2. Right-click the New Calculation Column operator, click Data Distribution, and configure the data flow: Data with an identifier value not equal to Removed will be routed to the DB Table Output (Added/Updated) operator. Data with an identifier value equal to Removed will be routed to the DB Table Output (Deleted) operator, as shown in the following figure.
Description:
Data with identifier values Added and Changed is shown in the following figure.
This section configures the DB Table Output (Added/Updated) operator to write added and updated data to the target table.
Specific Setting:
Click Write Method, select Add/Modify/Delete Data Based on Identifier Field as Write Method, and tick Add and Modify. Select ID as the logical primary key and select fdl_comparison_type as Identifier Field. Values set in Identifier Value should correspond to the values of Identifier in the Data Comparison operator accordingly, as shown in the following figure.
Data with the identifier value Removed is shown in the following figure.
This section configures the DB Table Output (Deleted) operator to update the deleted data to the target table.
Click Write Method, select Add/Modify/Delete Data Based on Identifier Field as Write Method, and tick Modify. Select ID as the logical primary key and select fdl_comparison_type as Identifier Field. Values set in Identifier Value should correspond to the values of Identifier in the Data Comparison operator accordingly, as shown in the following figure.
1. Click Run. After the successful execution, the Log area at the bottom displays the node execution status, as shown in the following figure.
2. The data in the target table is shown in the following figure. You can see that the added, modified, and deleted data have been synchronized to the target table, and the update field has also been updated accordingly.
滑鼠選中內容,快速回饋問題
滑鼠選中存在疑惑的內容,即可快速回饋問題,我們將會跟進處理。
不再提示
10s後關閉
Submitted successfully
Network busy