This document describes the structure of the FineDB tables related to FineDataLink. For the platform-related FineDB table structure, see FineDB Table Structure.
1. This document uses the built-in project database as an example. Note that field data types may differ if you are using an external database.
2. The FineDB configuration database stores project configuration information. Tables are interrelated. Arbitrary modifications may lead to serious consequences, such as project startup failure.
Do not manually add, delete, or modify any data in the FineDB database! Such operations may cause irreparable problems, and you will bear the consequences.
The relationships between tables are as follows:
This is the global parameter definition table, which defines and stores global parameter configurations.
This table records the most recent execution records of tasks.
If a scheduled task has not run today, the table will retain its most recent execution record.
If the scheduled task has run today, a new execution record will be added (whose LASTRECORD value is true, indicating it is the most recent execution record). The historical execution record will not be deleted immediately.
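For illustration, the following is a minimal, read-only Python sketch that fetches only the latest execution record of each task. It assumes a Python DB-API connection to the FineDB configuration database, uses the placeholder table name EXEC_RECORD_TABLE (substitute the actual table name from this section), and assumes the LASTRECORD flag is stored as the string 'true'; only the LASTRECORD field itself comes from this document.

# Minimal read-only sketch: fetch the latest execution record of every task.
# EXEC_RECORD_TABLE is a placeholder name; LASTRECORD stored as 'true' is an assumption.
def fetch_latest_records(conn):
    cursor = conn.cursor()
    cursor.execute("SELECT * FROM EXEC_RECORD_TABLE WHERE LASTRECORD = 'true'")
    return cursor.fetchall()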
Whether it is the latest run record
Example value: true
Storage path for execution instance statistics and task snapshots
Example value: dpworks/record/2024-03-13/04c51446-0053-48f3-b507-bad1bdf46592.log
ID of the retried instance
This field has a value if the instance has been retried. Otherwise, the value is null.
Trigger source
It shows the username for manual triggers, or the scheduling plan ID for scheduled triggers.
This table is a backup table for deleted tasks. It retains the task ID and task name of each task at the time of deletion.
This table is the scheduling calendar table, storing scheduling calendars uploaded by users.
This table covers both timed and event scheduling, as well as single-task and batch scheduling, giving four scenarios, briefly described as follows:
The table structure of FDL_PLAN_SCHEDULE is shown in the following table.
Example:
{ "id": "82606bf6-4ccf-4ba9-907a-3b417334511e", //Plan ID "name": "Scheduling 1-Timed B", //Plan name "type": "WORK_SCHEDULE_PLAN" //Plan type: Single-task plan: WORK_SCHEDULE_PLAN; Batch plan: SCHEDULE_PLAN}
Scheduling configuration - timed scheduling:
"frequency": { // Scheduling frequency configuration "type": 2, "value": { // Specific frequency configuration "cron": null, // Cron expression (used when type is cron) "executeTime": null, // Execution time point (used when executing by time point) "executeDay": null, // Execution date (used when executing by date) "executeMonth": null, // Execution month (used when executing by month) "space": 1, // Interval quantity, used in conjunction with a unit "unit": 3 // Time unit } },
Meaning of the type field:
Scheduling configuration - Event scheduling (deprecated in version 4.1.11.1 and later):
Scheduling Type:
Versions before 4.1.11.1: Includes timed scheduling (TIME) and event scheduling (EVENT)
Versions 4.1.11.1 and later: Timed scheduling (TIME)
This table defines the relationship between plans (including both timed and event-based schedules) and tasks. A single plan can correspond to multiple tasks.
This table records events that can trigger event scheduling.
Last modification time
Example data: 1721704150000 (a millisecond timestamp; see the conversion sketch after this field list)
Task status (must be a completed status)
Example data: SUCCESS
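The example timestamps above (such as 1721704150000 for the last modification time) are millisecond epoch values. The short Python sketch below converts one to a readable datetime.

from datetime import datetime, timezone

# The example value 1721704150000 is milliseconds since the Unix epoch.
ts_ms = 1721704150000
print(datetime.fromtimestamp(ts_ms / 1000, tz=timezone.utc))  # 2024-07-23 03:09:10+00:00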
This table stores the basic information for event scheduling.
Add time
Records can be sorted by this addition time.
Scheduling status
Three possible states: OPEN/CLOSE/ABNORMAL
This table stores the task groups for event scheduling.
Judgement Condition:
{ "type":TIMING/REALTIME "condition": { "type":"judge", "conditionCompareType":"DATE_AFTER", "valueType":"DATE", "source": { "type":"field", "value":"taskFinishTime", }, "target":{ "type": "DATE_INTERVAL", "value": { "num": 1, "unit": "DAY" } } } "timing":{ "id": "0f3c2bb9-498a-4bf0-991b-545aa13a41d3", //Actual task group ID "scheduleOpen": true, //Whether to enable scheduling; default value is true "startTime": { //Scheduling start time "value": "2024-03-13 18:22:19" }, "frequency": { //Scheduling frequency configuration "type": 1, "value": null }, "endTime": { //Scheduling end time "type": 1, "value": null }, "type": "TIME", //Scheduling type, TIME represents timed scheduling "calendar": { "open": false, "calendarId": "" } } once:true,//Whether to use only once status:"SUCCESS" //Task status that can trigger downstream task groups; SUCCESS/FINISHED}
This table is the mapping table for tasks and task groups. For details about the task group description, see Event Scheduling.
This table is the relationship table for task groups.
This table is the pipeline source information table.
Read mode
Example data:
{ "name": "Binlog", "startPoint": "", "format": "", "schemaRegistryUrl": ""}
SYNC_TYPE
Synchronization Type
Deleted in version 4.2.11.3
Incremental starting point only
This table contains information related to the data source.
Source table ID in the pipeline task
Target table ID
Table name
Whether synchronization is completed
Synchronization Type:
FULL_AND_INCREMENTAL
INCREMENTAL
Added in version 4.2.11.3
CUSTOM: Custom starting point
EARLIEST: Earliest valid starting point
DEFAULT: Task start time
Incremental start time, in timestamp format
This configuration takes effect only when the startup type is custom.
This table is the pipeline target information table.
Connection name
Deletion strategy, currently including logical deletion and physical deletion
Whether high-speed loading is enabled
Load Type
Whether to enable synchronization without a primary key
Removed in version 4.2.11.3
This table stores configuration information for pipeline target tables.
Task ID
Schema name
Table Type
Comment for the target table
Whether to allow primary keys to be null:
true: When updating or deleting by primary key or logical primary key, null values are taken into account.
false (default): Primary keys and logical primary keys are assumed to be non-null.
Comparison field
Incremental write configuration
varchar
Full write configuration
Other Configurations
Currently used for batch write configuration
This table stores the configuration for pipeline task groups.
ID
Field mapping
This table stores the mapping information of pipeline tasks.
This table stores pipeline task checkpoint records for resuming from breakpoints.
Record ID
Node type
Incremental start time
This table stores execution records and statistics of pipeline tasks.
Some fields represent table-level records (highlighted in green in the following table), while others represent task-level records (highlighted in gray). Task-level records are aggregated from the table-level data (a brief aggregation sketch follows the field list).
Number of deleted rows (table-level)
Volume of deleted data (in Bytes)
Number of failed rows
Number of inserted rows
Volume of inserted data (in Bytes)
Last task record time
Last write time on the FineDataLink server
Volume of data read (in Bytes)
Full name of the source table
Number of rows to be synchronized
Data volume to be synchronized (in Bytes)
Type of record, whether it is a table or task; deprecated in version 4.2.1.1 and later
Time of the corresponding log in the database when last read
Time of the corresponding log in the database when last written
Table status
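As a rough illustration of how task-level statistics relate to table-level rows, the sketch below sums a few table-level counters into task totals. The row keys used here (inserted_rows, deleted_rows, failed_rows) are hypothetical names chosen for the example, not the actual column names of this table.

def aggregate_task_stats(table_rows):
    """Sum table-level counters into task-level totals (illustrative only)."""
    totals = {"inserted_rows": 0, "deleted_rows": 0, "failed_rows": 0}  # hypothetical keys
    for row in table_rows:
        for key in totals:
            totals[key] += row.get(key, 0)
    return totals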
FDL_PIPELINE_TASK_RECORD
This table stores execution information of pipeline tasks (task-level).
In pipeline tasks, if dirty data is captured, the basic information of the dirty data will be stored in this table.
{"before": "Primary key before update","after": "Primary key after update"}
The relationships between tables are as follows.
This table stores call records of data services.
Primary key field UUID; no practical function
Authentication information. Example data:
{"authType": "EmptyAuth", // Currently effective authentication type"authConfig": [ //Specific configuration of authentication{"type": "EmptyAuth" // No authentication},{"code": "AppCode a25787ba-fd6c-4f23-b46a-8b148c2ab1a9","type": "AppCodeAuth" // APPCode authentication},{"secret": "","type": "DigestSignatureAuth" // Digest authentication}],"type": "Auth"}
Mapping table for applications and APIs.
The value -1 means no limit.
Other values represent corresponding timestamps
Number of access requests allowed per unit time
Other values represent the exact number of requests allowed for that period.
Time unit
MINUTE: Minutes
HOURS: Hours
DAYS: Days
Sample data: Maximum frequency of 100 times per minute
{ "limitCount": 100, // Maximum number of access requests allowed per time unit "timeUnit": "HOURS", // Time unit "type": "RateLimit"}
FDL_SERVICE_API
API configuration table.
This table stores the offsets that have already been consumed by the current api_id.
Added in version 4.2.8.4.
SQL script management table: This table stores user-saved SQL script data.
This table stores general configurations, defining and managing common settings such as case-conversion rules based on data connections.
The data connection name and its corresponding case conversion rules. Currently, the supported transformations include converting all characters to uppercase (UPPER_CASE) or lowercase (LOWER_CASE). Example data:
{"transformation":"UPPER_CASE","connections":["local_fdl_data"]}
This table stores recent edit records, including the latest marking records from the data development and data pipeline modules.
Each row in this table represents a node in the lineage graph.
The field values in the following table reflect internal code logic and are unrelated to user-facing functionality.
Lineage node relationship table:
This table stores records of SQL statements that failed to parse. When a SQL parsing failure occurs, it is recorded in this table. Currently, it is used for logging purposes only.
This table is used to record offset information for the event center.
This table is used for lineage relationships between data connections and tasks.
This table stores the checkpoint information for real-time tasks.
Stores breakpoint data (specific data for shared breakpoints is also stored here)
This table stores task execution records. Each time a task runs, a record of its execution is generated.
This table stores the status of real-time tasks, recording the state information of each task.
This table stores the latest execution record, including the latest run and the initial snapshot generated from that execution.
Data detection task running status
(BUILDING: constructing;
BUILD_FAIL: construction failed;
PASS: passed;
NOT_PASS: not passed;
ERROR: execution failed;
INVALID: invalid;
INTERRUPT: interrupted;
RUNNING: running;
QUEUING: queuing)
This is the data detection task configuration table, including task directory and notification settings.
The global rule definition table defines and stores configurations for global rules.
Rule type. Example values: TABLE/COLUMN (meaning table-level/field-level)
This table represents the relationship between tasks and global rules, storing their many-to-many (N:N) references.
This table maintains the relationship between task IDs and fork IDs. Related features include batch import and export of scheduled tasks and scheduling plans.
For detailed table structure, see Configuration Information Storage Table.
This section only records FineDataLink-related fields:
Starting from version 4.1.9.3, the FINE_CONF_ENTITY table adds the field FDLIntegrationConfig.previewCache, with a default value of false; when set to true, it enables preview caching for operators/nodes, displaying the last cached result directly in the next preview.
This table is the directory table for data connections.
This table stores the test connection results for data sources.
This is a common table shared by different modules. The resource_type field distinguishes data from different modules, and resource_id represents the business ID.
Directory node type:
ENTITY – file
PACKAGE – folder
Resource type (each business module is distinguished based on this field)
PIPELINE: Data pipeline
DATA_SERVICE_API: Data service API
DATA_SERVICE_APP: Data service application
DETECTION: Data detection task
OFFLINE: Scheduled task
STREAM: Real-time task
Only records the configuration update time
For offline tasks, it records the canvas update time; updating task control does not change this timestamp.
This table stores task controls for the development version.
Remark
{ "workAttributeConfig": { "taskPriority": 0, "customLogLevel": false, "logLevel": "ERROR", "dispatchType": null }, "noticeConfig": { "notification": false, "noticeConfig": { "notify": true, "notifyInDetail": false, "notifyDirtyData": true, "notifyDirtyDataInDetail": false, "notifyDDLChangeEvent": false }, "userGroup": { "users": null, "depts": null, "roles": null, "roleStr": null }, "noticeChannels": [] }, "timeoutRetryConfig": { "timeoutConfig": { "hour": 1, "minute": 0, "enabled": true }, "retryConfig": { "max": 3, "delayMinute": 2, "enabled": true }, "errorLimitConfig": { "enable": false, "limit": 1000 } }}
{ "resourceType": "OFFLINE", "timeoutRetryConfigEntity": null, "noticeEntity": null, "workAttributeConfig": null}
{ "id": "01e41650-3ad6-4f7c-9350-30eef420efef", "controlId": "83b98b52-08f3-4b13-b132-72a5ea0c8fe8", "errorQueueConfig": { "limitNum": -1 }, "notifyConfig": { "notification": false, "notifyContent": { "taskError": true, "syncingSourceTableDeleted": false, "notifyInDetail": false, "ddl": false, "retryNotice": false }, "userGroup": { "users": null, "depts": null, "roles": null, "roleStr": null }, "noticeChannels": [] }, "retry": { "max": 0, "delayMinute": 0, "enabled": false }, "logConfig": { "customLogLevel": true, "logLevel": "INFO" }}
Task control information for scheduled pipeline tasks
Task control information for data detection tasks
When a real-time task is created, a default task control record is generated.
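To show one way the timeout and retry blocks in the examples above can be interpreted, the sketch below derives a timeout in minutes and a list of retry delays. It is a minimal Python sketch; the reading that retries are spaced delayMinute minutes apart is an assumption for illustration, not taken from the FineDataLink implementation.

def summarize_control(control: dict) -> dict:
    timeout = control["timeoutRetryConfig"]["timeoutConfig"]
    retry = control["timeoutRetryConfig"]["retryConfig"]
    return {
        "timeout_minutes": (timeout["hour"] * 60 + timeout["minute"]) if timeout["enabled"] else None,
        # Assumed reading: up to `max` retries, each delayed by `delayMinute` minutes.
        "retry_delays_minutes": [retry["delayMinute"]] * retry["max"] if retry["enabled"] else [],
    }

With the first example above (timeout of 1 hour, 3 retries every 2 minutes), this yields a timeout of 60 minutes and retry delays of [2, 2, 2].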
The version information table stores basic information about released versions.
This table is the task control version table.
This table stores task controls for the deployment version.
This table is a common configuration table, including fields such as the current version and whether it has been restored.
The namespace field is the sub-table name, and entity_value (in JSON format) contains the sub-table fields and data. Example data:
{ "version": "4.1.5.5" //Current version number, mandatory field}
The global rule definition table defines and stores configurations for global rules.
(BUILDING: constructing;
BUILD_FAIL: construction failed;
PASS: passed;
NOT_PASS: not passed;
ERROR: execution failed;
INVALID: invalid;
INTERRUPT: interrupted;
RUNNING: running;
QUEUEING: queuing)
Trigger method
(FIX_TIME - Scheduled; MANUAL - Manual)
Log file storage path
The path is a folder containing log files and snapshot files.
This table persistently stores the configuration data of each collection task.
Whether the collection task is disabled
This field is deprecated
This table stores the runtime status data for each collection task, including collection breakpoints, table structures, and other related information.
This table is the global cleaning rule entity table, used to define and store global cleaning rules.
This table stores business reference rule details, recording the references of tasks to global cleaning rules within the business.
Data connection catalog table
This table stores source configuration information for scheduled pipelines.
Added in version 4.2.8.1.
This table stores target configuration information for scheduled pipelines.
This is the basic mapping information table.
This table defines the source tables for scheduled pipelines.
This table defines the target tables for scheduled pipelines.
This table stores information about application data source tables.
This table stores logs for scheduled pipelines.
This table is the operation and maintenance (O&M) record table.