Load Distribution - FineDataLink Help Document

Last update: April 25, 2025

Overview

Version

FineDataLink Version	Functional Change
4.1.4	/
4.2.1.4	Supported configuring the number of concurrencies for Data Inspection.
4.2.4.3	Merged Real-Time Task and Pipeline Task into Real-Time Module.

Application Scenario

Scheduled tasks, real-time tasks, pipeline tasks, and APIs in Data Service in FineDataLink all consume memory and concurrency resources when they run, so you may need to adjust the memory and concurrency allocated to each of them based on actual usage.

Function Description

FineDataLink provides independent resource control for scheduled tasks, real-time tasks, pipeline tasks, and APIs in Data Service.

You can control resources on the Load Distribution tab page under System Management > Intelligent O&M > Load Management.

Note:

1. The execution of pipeline tasks, scheduled tasks, and real-time tasks is limited by both memory and concurrency. If either condition is not met, an error or prompt will be reported.

2. With Use permission on Intelligent O&M in System Management, you can use the Load Management function. For details, see Load Management Use Permission.

Task Type	Setting Item	Setting Logic	Description
Scheduled Task	Memory limit	Specifies the maximum memory proportion of the Scheduled Task module.	Controls the maximum number of concurrent data synchronization tasks.
	Concurrency limit	Specifies the maximum number of concurrencies.	Controls the maximum number of concurrent data synchronization tasks.
	Spark memory limit	Specifies the Spark memory proportion. Note: By default, it accounts for 75% of the memory reserved for the system and components.	Affects computational complexity and speed.
Real-Time Module (controls the memory and concurrency resources allocated to real-time tasks and pipeline tasks)	Memory limit	Specifies the maximum memory proportion. The memory of a real-time task defaults to 256 MB. For FineDataLink V4.1.13.4 and later versions, you can adjust this memory limit by modifying the FineDB configuration. For details about the modification method, contact Fanruan technical support by sending an email to international@fanruan.com or visiting https://help.fanruan.com/finedatalink-en/.
	Concurrency control	Specifies the maximum number of concurrencies. The concurrency of a real-time task defaults to 4.
Data Service	Memory limit (Note: This setting item currently has no effect and awaits optimization.)	Specifies the maximum memory proportion of the Data Service module.	Controls concurrency and the amount of data returned per request.
Data Inspection	Concurrency control	Specifies the maximum number of concurrencies.	Controls the number of concurrent tasks.

1.3.png
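
The rule that a task needs both enough memory and enough concurrency can be illustrated with a minimal sketch. It is not FineDataLink code: the `ModuleResources` class and the `check_start` function are hypothetical names, and only the two defaults (256 MB and 4 concurrencies per real-time task) come from the table above.

```python
from dataclasses import dataclass

# Defaults quoted in this document for a single real-time task.
REALTIME_TASK_MEMORY_MB = 256
REALTIME_TASK_CONCURRENCY = 4


@dataclass
class ModuleResources:
    """Hypothetical snapshot of what a module still has free."""
    free_memory_mb: float
    free_concurrency: int


def check_start(resources,
                need_memory_mb=REALTIME_TASK_MEMORY_MB,
                need_concurrency=REALTIME_TASK_CONCURRENCY):
    """Return the unmet conditions; an empty list means the task may start."""
    problems = []
    if resources.free_memory_mb < need_memory_mb:
        problems.append("memory limit exceeded")
    if resources.free_concurrency < need_concurrency:
        problems.append("concurrency limit exceeded")
    return problems


# Enough memory, but only 2 free concurrencies: the task is refused.
print(check_start(ModuleResources(free_memory_mb=512, free_concurrency=2)))
# prints ['concurrency limit exceeded']
```

Both conditions are checked independently, so the result can name either limit or both.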

Compatibility Instruction

Memory Limit

For FineDataLink of V4.1.4 and later versions, changes in the memory limit are as follows:

If the memory limit is set in independently deployed FineDataLink projects before the upgrade, this configuration will become invalid after the upgrade. If you need to modify the memory limit again, you can contact Fanruan technical support.

Concurrency Control

If you modified the number of concurrent tasks in a FineDataLink version earlier than V4.1.4, those concurrency settings become invalid after the upgrade. You need to reconfigure them in Concurrency Control on the Load Distribution tab page.

Memory Distribution

Function Description

You can adjust the memory proportion of each module in the Memory Distribution area.

3.1-1.png

Number	Description
A	Displays the total memory in a stand-alone environment. Displays the memory available for Data Development, Data Pipeline, and Data Service. Note: The configurable memory proportion ranges from 10% to 60% of the total memory. For example, if the total memory of the environment is 2 GB, the configurable memory ranges from 0.2 GB to 1.2 GB.
B	Displays the memory that can be allocated to scheduled tasks.
C	Displays the memory that can be allocated to real-time tasks and pipeline tasks.
D	Displays the memory that can be allocated to APIs in Data Service.
E	Displays the public space memory, which is represented by the green bar. If the memory used by a single module exceeds the configured limit, the module can compete for available memory from the public space memory.
F	The configurable memory proportion ranges from 10% to 60% of the total memory. The memory reserved for the system and components is 40% of the total memory, which defines the upper limit of the configurable memory. Note: The default Spark memory for scheduled tasks accounts for 75% of the reserved memory for the system and components.
G

Note:

1. In a cluster environment, the available memory for all nodes is displayed.

2. In the current version, the default upper limit for the estimated memory of Data Pipeline is 1 GB, and any estimation exceeding 1 GB is still calculated as 1 GB.
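
The percentages above can be checked with a short calculation. This is an illustrative sketch only: the function names are hypothetical, and the figures (10% to 60% configurable, 40% reserved, 75% of the reserved memory for Spark by default, 1 GB cap on the Data Pipeline estimate) are taken from this section.

```python
def configurable_memory_range_gb(total_gb):
    """Configurable memory is 10% to 60% of the total memory."""
    return round(total_gb * 0.10, 3), round(total_gb * 0.60, 3)


def default_spark_memory_gb(total_gb):
    """Spark defaults to 75% of the 40% reserved for the system and components."""
    reserved_gb = total_gb * 0.40
    return round(reserved_gb * 0.75, 3)


def capped_pipeline_estimate_gb(estimate_gb):
    """Data Pipeline estimates above 1 GB are still counted as 1 GB."""
    return min(estimate_gb, 1.0)


print(configurable_memory_range_gb(2))   # (0.2, 1.2)
print(default_spark_memory_gb(2))        # 0.6
print(capped_pipeline_estimate_gb(1.5))  # 1.0
```

For the 2 GB environment used as the example in the table, this gives a configurable range of 0.2 GB to 1.2 GB and a default Spark memory of 0.6 GB.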

You can click the Edit button in the upper right corner to configure the memory proportion by drag-and-drop operations, as shown in the following figure.

Note:

When you configure the memory proportion for Public Space by drag-and-drop operations, the sizes of all four modules shrink proportionally, but the minimum proportion for a single module is 1%. When dragging the slider for the other three modules, only the two modules adjacent to the slider will be resized.

3.1-2.png

Application Scenario

For example, if you encounter the following error during the execution of a scheduled task, it indicates that the memory limit is exceeded.

3.2-1.png

In this case, you need to increase the memory proportion for the Scheduled Task module, as shown in the following figure.

3.2-2.png

Similarly, you may be prompted that the memory limit is exceeded when you start a pipeline task, as shown in the following figure.

3.2-3.png

In this case, you can configure Memory Distribution to increase the memory proportion for Real-Time Module.

3.2-4.png

Concurrency Control

Function Description

You can adjust the number of concurrent tasks for Scheduled Task, Real-Time Module, and Data Inspection in the Concurrency Control area.

4.1-1.png

Number	Description
A	Displays the number of CPU cores and the total number of configurable concurrencies in FineDataLink.

Note:

The total number of configurable concurrencies is calculated as the number of CPU cores multiplied by N, where N defaults to 10.

In a cluster environment, the maximum number of concurrencies for all nodes is displayed. For example, the maximum number of concurrencies for Node 1 is Y, and the maximum number of concurrencies for Node 2 is Y.

B	You can adjust the concurrency proportion by drag-and-drop operations, with 0.5 as the minimum adjustment unit.

The adjustable range of the maximum number of concurrencies:

0.5 * the number of CPU cores ≤ the concurrent number of pipeline tasks and scheduled tasks ≤ 9.5 * the number of CPU cores

In a cluster environment, the number of CPU cores of each node is displayed, but all nodes must be uniformly configured.

The number of concurrencies in Data Development:

The actual number of concurrencies in the Data Synchronization/Data Transformation/Parameter Assignment nodes is limited by the smaller of the thread pool size and the concurrency proportion set in Concurrency Control.

The thread pool of Data Synchronization/Data Transformation: Defaults to 1 * the total number of concurrent data development tasks.

The thread pool of Shell Script/Bat Script: Defaults to 1 * the total number of concurrent data development tasks.

The thread pool of SQL Script: Defaults to 1 * the total number of concurrent data development tasks.

Other nodes in scheduled tasks (including SQL Script, Shell Script, Bat Script, and Python Script) are not limited by Concurrency Control. They are only limited by their own thread pool.

The number of concurrencies in Real-Time Module includes the number of concurrent pipeline tasks and real-time tasks.

The number of concurrent tasks in Data Pipeline: The thread pool of pipeline tasks occupies 0.25 * the total number of concurrent pipeline tasks, with a minimum of 8. One pipeline task requires 4 concurrencies.

The available number of concurrencies is calculated as the configured number of concurrencies - the number of concurrencies occupied by the thread pool of pipeline tasks.

For example, if the total number of threads for pipeline tasks is 30, of which 8 are used for data writing by default, then 22 threads remain. Since one pipeline task requires 4 concurrencies, you can still configure 5 pipeline tasks (22 / 4, rounded down).

The number of concurrent real-time tasks: A real-time task starts only after the system detects 4 concurrencies available for it, so you need to reserve sufficient execution resources in Load Distribution.

The number of concurrencies in Data Inspection: Defaults to 0 and requires manual adjustment. One data inspection task occupies one concurrency.
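
The totals and limits described for items A and B can be sketched as a few helper functions. This is a hedged illustration, not product code: all function names are hypothetical, and treating the 0.5 adjustment unit as a step on the per-core multiplier is an assumption based on the stated range of 0.5 to 9.5 times the number of CPU cores.

```python
DEFAULT_N = 10    # default multiplier N from this document
MIN_FACTOR = 0.5  # lower bound, as a multiple of CPU cores
MAX_FACTOR = 9.5  # upper bound, as a multiple of CPU cores
STEP = 0.5        # assumed to be the step of the per-core multiplier


def total_configurable_concurrency(cpu_cores, n=DEFAULT_N):
    """Total configurable concurrencies = CPU cores * N."""
    return cpu_cores * n


def adjustable_range(cpu_cores):
    """Lowest and highest concurrency a module can be given."""
    return MIN_FACTOR * cpu_cores, MAX_FACTOR * cpu_cores


def is_valid_factor(factor):
    """True if the multiplier is within range and on a 0.5 step."""
    in_range = MIN_FACTOR <= factor <= MAX_FACTOR
    on_step = (factor / STEP).is_integer()
    return in_range and on_step


def effective_concurrency(thread_pool_size, configured_concurrency):
    """Data Development nodes are bounded by the smaller of the two limits."""
    return min(thread_pool_size, configured_concurrency)


print(total_configurable_concurrency(8))  # 80
print(adjustable_range(8))                # (4.0, 76.0)
print(is_valid_factor(2.5))               # True
print(effective_concurrency(80, 24))      # 24
```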

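
The Data Pipeline arithmetic described above (a thread pool of 0.25 times the total with a minimum of 8, and 4 concurrencies per pipeline task) can be reproduced with a small sketch. The function names are hypothetical, and rounding the 0.25 share up to a whole thread is an assumption, since the document does not say how fractions are handled.

```python
import math

PIPELINE_POOL_SHARE = 0.25    # share of the total taken by the thread pool
PIPELINE_POOL_MIN = 8         # minimum thread pool size
CONCURRENCY_PER_PIPELINE = 4  # concurrencies required by one pipeline task


def pipeline_thread_pool(total_concurrency):
    """Threads occupied by the pipeline thread pool (fraction rounded up: assumption)."""
    return max(PIPELINE_POOL_MIN, math.ceil(PIPELINE_POOL_SHARE * total_concurrency))


def available_pipeline_concurrency(total_concurrency):
    """Configured concurrencies minus those occupied by the thread pool."""
    return total_concurrency - pipeline_thread_pool(total_concurrency)


def max_pipeline_tasks(total_concurrency):
    """Whole pipeline tasks that fit into the available concurrencies."""
    return available_pipeline_concurrency(total_concurrency) // CONCURRENCY_PER_PIPELINE


# The document's example: 30 threads in total.
print(pipeline_thread_pool(30))            # 8
print(available_pipeline_concurrency(30))  # 22
print(max_pipeline_tasks(30))              # 5
```

With 30 configured concurrencies, the pool takes 8, 22 remain, and 5 pipeline tasks fit, which matches the example in the text.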

After clicking the Edit button in the upper right corner, you can configure the maximum concurrency by drag-and-drop operations, as shown in the following figure.

4.1-2.png

Application Scenario

For example, during the execution of a pipeline task, if the concurrency limit is exceeded, a prompt will be displayed at the startup of the task, as shown in the following figure.

4.2-1.png

In this case, you can increase the number of concurrencies in Real-Time Module by configuring Concurrency Control, as shown in the following figure.

4.2-2.png

Note: If the concurrency limit is exceeded during the execution of a scheduled task, a message will be recorded in the log, as shown in the following figure.

4.2-3.png
