Data Pipeline - Synchronizing Source Table Structure Changes

  • Last update: February 24, 2025
  • Overview

    Version

    FineDataLink Version

    Functional Change

    4.0.17

    Allowed configuring to synchronize source table structure changes of the MySQL database.

    4.1.1

    Allowed using SQL Server as the source Data Definition Language (DDL). For details about the logic of DDL, see the table in section "Synchronization Results on the Target End." DDL synchronization did not support automatic synchronization of newly added fields on the source end. To use DDL to process newly added fields, see the chapter "Operation Instruction of SQL Server DDL."

    4.1.5.4

    Synchronized changes (additions, deletions, and modifications) to comments of fields in the source table to the target table if DDL was enabled. Comments for newly added fields would not be synchronized.

    4.1.8.2

    Allowed writing pipeline data into the ShenTong database, which supported DDL.

    4.1.9.3

    • Allowed customizing field type mapping rules and configuring valid data connections.

    • Prioritized the customized mapping rules of data connections over the default rules in scenarios including automatic table creation, DDL change synchronization via pipeline tasks, and DDL change prompting in scheduled tasks.

    For details, see General Configuration - Field Mapping Rule.

    4.1.10.1

    Supported DDL if the StarRocks or Doris database was used on the target end.

    4.1.11.2

    Supported DDL if the YMatrix database was used on the target end.

    4.1.11.3

    Supported DDL if the PolarDB PostgreSQL database was used on the target end.

    Application Scenario

    In the process of real-time data synchronization using the pipeline task, the source end structure may change due to business adjustments, such as the addition and deletion of tables and fields, and the modification of field names and types. In this case, you hope these adjustments on the source end can be automatically synchronized to the target end, instead of being modified manually. 

    Function Description

    Pipeline tasks support the synchronization of DDL (Data Definition Language) changes in the source database. If relevant configurations are enabled, the pipeline tasks can automatically synchronize DDL changes that occurred in the source database, such as deleting tables, adding/deleting/renaming fields, and modifying field types/length/compatibility, to the target end, without the need to manually modify the target table structure.

    Supported Range

    To use DDL functions normally, both the source end and the target end of the pipeline task should be displayed as "Supported" in the following table.

    For example, if the MySQL database is used on the source end and the SQL Server database is used on the target end, the DDL function is supported. If the MySQL database is used on the source end and the Db2 database is used on the target end, the DDL function is not supported.

    Data Source

    Used as the Source End

    Used as the Target End

    Notes

    MySQL

    Supported

    Supported


    Oracle

    Supported

    Supported


    SQL Server

    Supported

    Supported

    Due to the characteristics of the SQL Server database, DDL synchronization does not support automatic synchronization of newly added fields on the source end currently. If you need to use DDL to process newly added fields, see the chapter "Operation Instruction of SQL Server DDL."

    GaussDB 200

    Not supported

    Supported


    PostgreSQL

    Supported

    Supported


    Greenplum (including parallel loading)

    Not supported

    Supported


    Kafka

    Not supported

    Not supported (Kafka is not supported on the target end currently.)


    TiDB

    Not supported (TiDB is not supported on the source end currently.)

    Supported


    Amazon Redshift

    Not supported (Amazon Redshift is not supported on the source end currently.)

    Supported


    Db2

    Supported

    Not supported


    SeaboxMPP

    Not supported (SeaboxMPP is not supported on the source end currently.)

    Supported


    SAP HANA

    Not supported

    Not supported


    ShenTong

    Not supported

    Supported


    StarRocks

    Not supported

    Supported

    If you use the StarRocks or Doris database on the source end, you may encounter restrictions when modifying the column type. 

    Doris

    Not supported

    Supported

    YMatrix

    Not supported

    Supported


    PolarDB for PostgreSQL

    Not supported

    Supported


    Procedure


    iconNote:

    This section is about the DDL operations on a non-SQL Server database. If you need to use the SQL Server database for DDL, you need to configure the database. For details, see the chapter "Operation Instruction of SQL Server DDL."


    Take the MySQL data synchronization to the MySQL database as an example.

    Synchronize the Outbound and Inbound Information, DEMO_PRODUCT, and CUSTOMER tables in the demo 1 database to the demotest database in real-time.

    DEMO_PRODUCT.xlsInbound and Outbound Information.xlsCUSTOMER.xls

    Data Pipeline Task Configuration

    Preparation

    Before configuring a pipeline task on FineDataLink, you need to prepare an independently deployed project, register pipeline-related function points in the project, prepare data sources and the database environment, deploy Kafka, configure transmission queues, and assign pipeline task permissions (optional).

    Source Selection

    1. Create a pipeline task in the folder on which you have permission.

    2. Select MySQL as the data source, select Full + Incremental Synchronization as the synchronization type, and tick the data table to be synchronized as the synchronization object on the Source Selection page, as shown in the following figure.

    3. Click Next.

    Data Destination

    1. Select the fdl_demotest data connection of MySQL as the data source, select Physical Deletion at Target End, and enable Mark Timestamp During Synchronization and Synchronize Source Table Structure Change on the Target Selection page, as shown in the following figure.

    2. Click Next.

    Table Field Mapping

    1. Select Auto Created Table as the target. If a table of an identical name exists in the target database, you need to modify the name of the created table. Configure the field mapping relationships and physical primary key, as shown in the following figure.


    2. Click Next.

    Pipeline Control

    1. In this example, configuration items on the Pipeline Control page are kept default. Click Save and Start, as shown in the following figure.



    2. The following figure shows the page after the pipeline task is started.

    Data Changes on the Source End

    1. In the Outbound and Inbound Information table, the Status field is added, the Category field is modified to Commodity Category, and the Initial Inventory field is deleted.

    2. A row of data is added.


    Synchronization Results on the Target End

    View the Outbound and Inbound Information_1 table, which is on the target end of the Outbound and Inbound Information table.

    • Field addition: Newly added fields will synchronize subsequently added data (The Status field will synchronize newly added data.).

    • Field deletion: Deleted fields will be filled with null values in subsequent synchronization (The Initial Inventory field will synchronize null values.).

    • Field name modification: The original name field will be filled with null values in subsequent synchronizations, and synchronize subsequently the added data of the new name field (After the Category field is modified to Commodity Category, the Category field stops synchronizing, and the Commodity Category field continues to synchronize the newly added data.).

    The modifications will be marked in Table Field Mapping.

    After you enable Synchronize Source Table Structure Change, the real-time synchronization of data structure changes is described as follows.

    Operation

    Enabling Synchronize Source Table Structure Change

    Changes in Field Mapping Configuration

    Changes in Target Table Structure

    Changes in Target Table Data

    Delete a table.

    Continues synchronizing other tables.

    Marks the deleted synchronized table.

    The corresponding source table has been deleted, and this table will no longer be synchronized.

    Not changed.

    No new data is written into the deleted table in subsequent synchronization.

    Rename a table.

    Continues synchronizing other tables.


    iconNote:

    If PostgreSQL is the source end, renaming tables is not supported.


    Marks the table with the original name as deleted.

    The corresponding source table has been deleted, and this table will no longer be synchronized.

    Not changed.

    No new data is written into the table with the original name in subsequent synchronization.

    Delete a field.

    Continues synchronizing other fields.

    Marks the deleted field.

    The corresponding source field has been deleted. The field will be filled with null values in subsequent synchronization.

    Not changed.

    The deleted field is filled with null values in subsequent synchronization.


    iconNote:

    If SQLServer is the source end, when a field deletion is detected (The table structure will be checked by comparison in each query.), FineDataLink will treat the field as deleted.

    At this moment, the CDC table may still contain some data with this field, and the values of the field will not be synchronized to the target table.


    Add a field.

    Synchronizes the new field automatically.

    If synchronization fails, the error will be recorded in the log and a notification will be sent. The task will continue running.


    iconNote:

    If you use data from SQLServer as the source, DDL synchronization does not support automatic synchronization of newly added fields on the source end currently. If you need to add a new field DDL, you need to configure the database manually. For details, see the chapter "Operation Instruction of SQL Server DDL."


    Adds a mapping relationship between the source and target ends automatically.

    Synchronizes the new field.

    iconNote:

    The new field will not be marked as a physical or logical primary key.



    Synchronizes the subsequently added data of the new field.

    Rename a field.

    Deletes the field with the original name, and add a field with the new name.

    iconNote:

    If SQLServer or Db2 is the source end, renaming fields is not supported.


    Marks the field with the original name as deleted.

    Adds a mapping relationship of the field with the new name between the source and target ends automatically.

    Synchronizes the field with the new name.

    The field with the original name is filled with null values in subsequent synchronization.

    Synchronizes the subsequently added data of the field with the new name.

    Modify the field type or length.

    Synchronizes the modified field type automatically. If the modification is unsuccessful, the error will be recorded in the log and the task continues running. The field mapping logic during modification is consistent with that during automatic table creation.

    Configures to modify the field types of both source and target tables automatically.

    Synchronizes the modified field type.

    Historical data and the subsequently added data will be changed to data of corresponding types.

    Changes to comments of fields in the source table (additions, deletions, and modifications)

    Synchronizes the changes to the target table.

    iconNote:

    Comments for newly added fields will not be synchronized.



    ///

    Task Management

    The modifications of DDL will be displayed on the Running Log tab page.

    Operation Instruction of SQL Server DDL

    Adding Fields

    For SQL Server data sources, DDL synchronization does not support automatic synchronization of newly added fields on the source end currently. If you need to use DDL to process newly added fields, see the content in this chapter.

    Online Solution


    iconNote:

    In this solution, if data is constantly written into the source table and a new CDC instance is not created immediately after a field is added to the source table, the data modifications in the  source table will not be written to the new CDC instance during the period between the addition of the field to the source table and the creation of the CDC instance. This may result in the loss of data on the newly added fields. If you want to completely avoid this situation, see the offline solution.


    1. Execute a DDL statement on the source table to add a new field.

    2. Create a capture instance in the source table.

    Use the following command.

    EXEC sys.sp_cdc_enable_table
         @Source_schema = N'Schema name',
         @Source_name = N'Table name',
         @Role_name = NULL, -- You may need to configure this item if you want to restrict the
          roles that can read the data changes.
         @Capture_instance = N'Schema name_Table name_Newname'; -- the new name of the capture
          instance


    iconNote:

    The table name should not contain the schema name.


    3. The new capture instance will be automatically switched after FineDataLink detects the creation.

    The newly added fields are synchronized to the target table, as shown in the following figure.

    4. The original capture instance can be deleted manually.

    Use the following command to check the name of the original capture instance in change_tables.

    use Database name;
    GO
    EXECUTE sys.sp_cdc_help_change_data_capture;
    GO

    The following is an example of the command used to disable the capture instance.

    -- Disable a Capture Instance for a table
        EXEC sys.sp_cdc_enable_table
        @Source_schema = N'Schema name',
        @Source_name = N'Table name',
        @Capture_instance = N'Schema name_Table name'


    iconNote:

    The table name should not contain the schema name. Schema name_Table name is the original capture instance that needs to be deleted.


    Use the following command to check whether this record exists in change_tables. The nonexistence of the record indicates it has been deleted. The existence of the record indicates you need to delete it manually using SQL statements.

    use Database name;
    GO
    EXECUTE sys.sp_cdc_help_change_data_capture;
    GO

    Offline Solution

    1. Stop writing to the source table and wait for FineDataLink to synchronize all data in the source table.

    2. Stop pipeline tasks.

    3. Execute a DDL statement on the source table to add a new field.

    4. Create a capture instance in the source table.

    Use the following command.

    EXEC sys.sp_cdc_enable_table
         @Source_schema = N'Schema name',
         @Source_name = N'Table name',
         @Role_name = NULL, -- You may need to configure this item if you want to restrict 
         the roles that can read the data changes.
         @Capture_instance = N'Schema name_Table name_Newname'; -- the new name of the capture 
         instance

    5. Resume writing to the source table.

    6. Start pipeline tasks.

    7. The new capture instance will be automatically switched after FineDataLink detects the creation.

    The newly added fields are synchronized to the target table, as shown in the following figure.

    8. The original capture instance can be deleted manually.


    iconNote:

    If the operations are not performed in the correct order, some data of new fields may flow into the original CDC instance and become missing.


    SAP HANA DDL Operations

    When SAP HANA is used as the data source, changes in the source table structure cannot be detected. You need to manually and systematically perform operations such as table structure changes to complete synchronization.

    Adding Fields

    • A field is added to the data table of a running pipeline task and needs to be synchronized.

    Stop writing to the source table, and pause the FineDataLink pipeline task.

    Go to the configuration page of the pipeline task, remove the table that needs to add fields, and click Save, as shown in the following figure.

    Add fields in the source and target tables.

    Add the table to the FineDataLink pipeline task again, and select Existing Table as the target on the Table Field Mapping page.

    Click Save and Start, and resume writing to the source table.

    • If you do not need to synchronize the added fields, you can only add fields to the source table. No other operations are required.

    Deleting Fields

    Stop writing to the source table, and pause the FineDataLink pipeline task.

    Remove the table from which the field is deleted, and save the task.

    Delete the field in the source and target tables, add the table again on FineDataLink, and select Existing Table as the target on the Table Field Mapping page.

    Click Save and Start, and resume writing to the source table.

    The principles are the same as those mentioned in section "Adding Fields."

    Modifying the Field Type or Length

    Stop writing to the source table, and pause the FineDataLink pipeline task.

    Adjust the field type and length in the source and target tables.

    The following is an example of the command used to modify the field type of the source table.

    ALTER TABLE "FDL_ROXY"."dingdan" ALTER ("CREATETIME" NVARCHAR (255) NULL);

    Start the task, and resume writing to the source table.

    Renaming Fields

    Stop writing to the source table, and pause the FineDataLink pipeline task.

    Remove the table whose field is renamed, and save the task.

    Rename the field in the source and target tables.

    Add the table again on FineDataLink.

    Click Save and Start, and resume writing to the source table.

    The principles are the same as those mentioned in section "Adding Fields."

    附件列表


    主题: Data Pipeline
    Previous
    Next
    • Helpful
    • Not helpful
    • Only read

    滑鼠選中內容,快速回饋問題

    滑鼠選中存在疑惑的內容,即可快速回饋問題,我們將會跟進處理。

    不再提示

    10s後關閉

    Get
    Help
    Online Support
    Professional technical support is provided to quickly help you solve problems.
    Online support is available from 9:00-12:00 and 13:30-17:30 on weekdays.
    Page Feedback
    You can provide suggestions and feedback for the current web page.
    Pre-Sales Consultation
    Business Consultation
    Business: international@fanruan.com
    Support: support@fanruan.com
    Page Feedback
    *Problem Type
    Cannot be empty
    Problem Description
    0/1000
    Cannot be empty

    Submitted successfully

    Network busy