Scheduled Task supports data reading from and writing into a YMatrix database.
When you set the data destination to a YMatrix database in a scheduled task, the Write Method tab page is shown in the following figure.
The following table introduces the Load Method.
Parallel Loading
1. By default, gpfdist uses port 15500 to provide services.
2. Binary fields cannot be synchronized when you select Parallel Loading.
3. The writing of JSON fields is supported.
4. The following strategies for the primary key conflict are supported: Ignore Source Data If Same Primary Key Value Exists, Record as Dirty Data If Same Primary Key Value Exists, and Overwrite Data in Target Table If Same Primary Key Value Exists.
5. After you enable Dirty Data Tolerance, if the Parallel Loading process fails, the relevant database components need to use the built-in error table logic of GPLOAD to obtain information about the dirty data and record it correctly. If Dirty Data Tolerance is disabled, the node directly reports an error.
COPY Loading
If you select COPY Loading, you need to create an fdl_temp schema in the target database to store temporary tables, and grant users permission to create tables in that schema. (If the database administrator has already created the schema and granted the permissions, database users do not need permission to create schemas.)
This method supports the writing of binary fields and JSON fields.
Write Data into Target Table Directly
1. Use COPY Loading when the target table has no primary key and Primary Key Mapping is not configured.
2. When the target table has a primary key or Primary Key Mapping is configured, three primary key conflict strategies are available: Ignore Source Data If Same Primary Key Value Exists, Record as Dirty Data If Same Primary Key Value Exists, and Overwrite Data in Target Table If Same Primary Key Value Exists. After you select one of them as Strategy for Primary Key Conflict, COPY Loading and Common Loading are used.
Write Data into Target Table after Emptying It
Use COPY Loading and Common Loading.
Add/Modify/Delete Data Based on Identifier Field
When COPY Loading and Common Loading are used:
If the COPY Loading process fails, you can try to write the batch of data using the Common Loading method. Any data that fails to be written will be recorded as dirty data. Once the writing of this batch is completed, the next batch will again prioritize COPY Loading.
Common Loading
It is selected for JDBC-based serial loading.
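The fallback behavior described above for COPY Loading and Common Loading can be sketched as follows. This is only an illustration of the control flow, not the product's implementation: the function name load_batches and the two callables copy_load and common_load_row are hypothetical, and writing the fallback batch row by row is an assumption made so that individual failing rows can be recorded as dirty data.

```python
from typing import Callable, Iterable, List, Sequence, Tuple

Row = Tuple
Batch = Sequence[Row]


def load_batches(
    batches: Iterable[Batch],
    copy_load: Callable[[Batch], None],       # hypothetical: bulk write of a whole batch
    common_load_row: Callable[[Row], None],   # hypothetical: serial write of one row
) -> Tuple[int, List[Row]]:
    """Write batches, preferring the bulk path and falling back per batch."""
    written = 0
    dirty: List[Row] = []
    for batch in batches:
        # Every batch tries the bulk (COPY-style) path first,
        # even if the previous batch had to fall back.
        try:
            copy_load(batch)
            written += len(batch)
            continue
        except Exception:
            pass  # fall through to the serial path for this batch only

        # Fallback: write the same batch serially (assumed row by row).
        for row in batch:
            try:
                common_load_row(row)
                written += 1
            except Exception:
                # Rows that still fail are recorded as dirty data.
                dirty.append(row)
    return written, dirty
```

The function returns the number of rows written and the list of rows recorded as dirty data, so a caller can compare the dirty count against whatever tolerance threshold it applies.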
Supports a partial DDL synchronization function in scheduled tasks.
For details, see Partition Table Creation and Data Reading/Writing.
Scheduled Task allows you to select YMatrix partition tables as data sources or destinations.
Scheduled Task supports the configuration of partitions and distribution logic when Target Table is set to Auto Created Table.
When Target Table is set to Auto Created Table, you can click Partition Key Setting. The page is shown in the following figure.
YMatrix supports three partitioning methods: RANGE, HASH, and LIST.
Details about the configuration method are as follows:
RANGE and LIST allow you to leave Partition Name empty, in which case a default name will be automatically generated based on the partition position. (This means that you do not have to specify a name. The database will assign names automatically, and no FineDataLink processing is required.)
RANGE can be specified in two ways. (Specifying inclusion or exclusion is supported.)
1. Method one: Set the start value and the end value. You can set Partitioning Interval for automatic partition division only when both the start and end values are valid, for example: "start (date '2015-01-01') end (date '2020-12-31') every (interval '1 year')".
When the field data is of the date type, you can set Partitioning Interval to a specified year/month/day.
When the field data is of the numeric type, you can set Partitioning Interval to a specified positive integer.
2. Method two: You can set the conditions to Greater than or equal to XXX or Less than or equal to XXX separately.
You can set a default partition.
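As an illustration of method one, the following sketch assembles a RANGE clause in the same form as the example above. The function range_clause and its parameters are hypothetical names introduced here. The validation rules follow the text (both bounds must be valid before an interval can be set, date fields take a year/month/day interval, numeric fields take a positive integer), while the numeric output form "every (N)" is an assumption modeled on the date example.

```python
from datetime import date
from typing import Optional, Union

Bound = Union[int, date]
DATE_UNITS = ("year", "month", "day")


def _literal(value: Bound) -> str:
    """Render a bound as it appears inside the clause."""
    if isinstance(value, date):
        return f"date '{value.isoformat()}'"
    return str(value)


def range_clause(
    start: Bound,
    end: Bound,
    every: Optional[int] = None,
    unit: Optional[str] = None,
) -> str:
    """Build a 'start (...) end (...) [every (...)]' clause (hypothetical helper)."""
    # Both bounds must be valid: same kind of value, and start before end.
    if type(start) is not type(end):
        raise ValueError("start and end must both be dates or both be integers")
    if not start < end:
        raise ValueError("start must be less than end")

    clause = f"start ({_literal(start)}) end ({_literal(end)})"
    if every is None:
        return clause  # no automatic partition division requested

    # The interval must be a positive integer in both cases.
    if not isinstance(every, int) or every <= 0:
        raise ValueError("interval must be a positive integer")

    if isinstance(start, date):
        # Date fields: the interval is a number of years, months, or days.
        if unit not in DATE_UNITS:
            raise ValueError("unit must be year, month, or day")
        return f"{clause} every (interval '{every} {unit}')"

    # Numeric fields: assumed to take the bare integer as the interval.
    return f"{clause} every ({every})"
```

For example, range_clause(date(2015, 1, 1), date(2020, 12, 31), 1, "year") reproduces the clause quoted in method one.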
Pipeline Task supports data writing into a YMatrix database.
1. After you select the primary key for the target table, the selected primary key will serve as the table's primary key and distribution key. The Primary Key column will also serve as the matching column, and the columns of other fields will serve as update columns.
2. Currently, you can only specify distribution keys. Manual Table Creation can temporarily serve as an alternative for other table creation features. Advanced table creation strategies include specifying the storage type (row store/column store), the distribution strategy (random/specified columns), and the partitioning strategy (by time field or other fields).
1. Pipeline Task allows you to select YMatrix partition tables as data destinations.
2. Pipeline Task supports the configuration of partitions and distribution logic when Target Table is set to Auto Created Table.
When Target Table is set to Auto Created Table, you can click Partition Key Setting. The page is shown in the following figure.
When used for releasing data services, the YMatrix database supports pagination queries based on pagination parameters.
Supports data reading from YMatrix partition tables.
Supports Database Table Management and Lineage Analysis. (These functions are supported when SQL is selected as Configuration Method at the source end. For details, see the related documents.)
For details, see General Configuration - Auto Table Creation Settings.