Linux
- CentOS 6: CentOS 6.5, 6.6, 6.7, 6.8, and 6.9
- CentOS 7.x
- Red Hat 6: Red Hat 6.5, 6.6, 6.7, 6.8, and 6.9
- Red Hat 7: Red Hat 7.0, 7.1, 7.2, 7.3, and 7.4

Windows
- Windows Server 2008 and later
- Windows 11

Note: The Data Pipeline function is usable only on Linux systems.

Configuration database
- RDS MySQL, MySQL, SQL Server, Oracle, Db2, and PostgreSQL
- Configure an external database for formal projects. For details, see External Database Configuration.

Browser
- Google Chrome (the latest version is recommended)

Note: The operating systems above are recommended ones. If you encounter problems deploying FineDataLink on other Linux systems, you can contact FanRuan technical support.
The project can be deployed in both wide area networks and local area networks. The requirements for the network environment are as follows.

- Wide area network: fewer than ten million rows of data
- Local area network: unlimited (recommended for ten million rows of data or more)
Bandwidth mainly limits the amount of data that can be transmitted over a network connection in a given period. The available traffic can be estimated using the following formula:

Traffic (MB/s) = Bandwidth (Mb/s) * 80% / 8

Multiplying the bandwidth by 80% reserves a margin to avoid network congestion during actual transmission, and dividing by 8 converts megabits to megabytes.
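The estimate above can be sketched in a few lines of Python (a hypothetical helper for illustration, not part of FineDataLink):

```python
def usable_traffic_mb_per_s(bandwidth_mbit_per_s: float) -> float:
    """Estimate usable traffic in MB/s from nominal bandwidth in Mb/s.

    80% of the bandwidth is assumed usable (margin against congestion);
    dividing by 8 converts megabits to megabytes.
    """
    return bandwidth_mbit_per_s * 0.8 / 8

# A 50 Mb/s link yields about 5 MB/s of usable traffic.
print(usable_traffic_mb_per_s(50))   # → 5.0
print(usable_traffic_mb_per_s(100))  # → 10.0
```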
Recommendation: Numerous internal tests using tens of millions of data records show that the transmission traffic is close to the value corresponding to a bandwidth of 50 Mb/s (and does not exceed that of 100 Mb/s). Therefore, a bandwidth between 50 Mb/s and 100 Mb/s is recommended.
Note: For details of resource control settings, see Load Distribution.
Unknown Number of Scheduled Tasks and Pipeline Tasks (Applicable to Newly Deployed Projects)
Allocate the memory according to actual business needs. An excessively large memory may cause long Full GC pauses.

Known Number of Scheduled Tasks and Pipeline Tasks (Applicable to Project Migration and Upgrade)
Minimum memory size = MAX(minimum memory size for running scheduled tasks, minimum memory size for running pipeline tasks)
- Accurate estimate: The minimum memory size is the greater of the minimum memory size for running scheduled tasks and that for running pipeline tasks, calculated node by node. For details, see the following table.
- Rough estimate: The minimum memory size is the greater of the two, taking 1 GB as the memory value for each scheduled task and each pipeline task.
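The rough rule can be expressed as a short sketch (assuming 1 GB per task, as stated above; the function name is hypothetical):

```python
def rough_min_memory_gb(scheduled_tasks: int, pipeline_tasks: int) -> int:
    """Rough minimum memory in GB: 1 GB per task, taking the greater of
    the scheduled-task total and the pipeline-task total."""
    return max(scheduled_tasks * 1, pipeline_tasks * 1)

# 12 scheduled tasks vs. 8 pipeline tasks → the scheduled group dominates.
print(rough_min_memory_gb(12, 8))  # → 12
```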
1. For 4.1.55 and later releases:
Type
Node
Scheduled task
Single input node
Calculation formula (in MB): buffer + outputSize * 2 * channel
The formula is described below.
buffer:
For non-relational DB table input (such as Jodoo and Mongo) and other inputs (such as API input and file input), a Reader takes up 64 MB of memory.
For relational DB table input, the size depends on the table structure. Allocate 1 MB of memory for each column of the input table (2 MB if the precision of the column exceeds 1024). Round the result up to a multiple of 8 MB, not exceeding 64 MB.
For example, a table with the following structure takes 9 MB and is therefore allocated 16 MB of memory (the next multiple of 8 MB).
channel:
The calculation of channel memory is relatively complex. Generally, it takes 8 MB or 16 MB of memory, not exceeding 64 MB.
outputSize:
It is the number of succeeding nodes connected with the input node.
Process node
Calculation formula (in MB): 64 + outputSize * 2 * 64
outputSize (different from the one mentioned above):
It is the sum of the numbers of output nodes and Python nodes that are directly connected with the process node.
Succeeding process nodes of the process node are not included.
Single output node
32 MB
An output node usually takes 32 MB of memory. Specifically, if the data is output to Doris or StarRocks, a single output node takes 90 MB of memory.
Pipeline task
Taking the following task as an example, the required memory is calculated as follows.
Input nodes (three): (8 + 1 * 2 * 24) * 3 = 168 MB
Process node (one): 64 + 2 * 2 * 64 = 320 MB
Output nodes (two): 32 + 32 = 64 MB
Total: 552 MB
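The worked example above can be checked with a few lines of Python (values taken directly from the example: buffer = 8 MB, channel = 24 MB, outputSize = 1 for each input node):

```python
# Per-node formulas from the example (all values in MB).
input_nodes = (8 + 1 * 2 * 24) * 3  # buffer + outputSize*2*channel, three input nodes
process_node = 64 + 2 * 2 * 64      # 64 + outputSize*2*64, with outputSize = 2
output_nodes = 32 + 32              # two ordinary output nodes, 32 MB each

total = input_nodes + process_node + output_nodes
print(total)  # → 552
```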
You can find the description in the corresponding log.
2. For releases before 4.1.55:
a. Estimated memory required for running scheduled/pipeline tasks (applicable to multi-task scenarios where accurate calculation is impossible)
JVM Memory
b. Accurate memory required for running scheduled/pipeline tasks
64 MB + 128 MB * Number of output channels
The following is an example of calculating the memory of a scheduled task.
Input: 64 MB + Output node quantity * (64 MB + 64 MB)
Process: 64 MB + Ultimate output node quantity * (64 MB + 64 MB)
Output: Output node quantity * 32 MB
Calculate the memory used by the task:
Input: 2 * (64 MB + 1 * (64 MB + 64 MB)) = 384 MB
Process: 64 MB + 3 * (64 MB + 64 MB) = 448 MB
Output: 3 * 32 MB = 96 MB
Total: 928 MB
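The pre-4.1.55 example can be verified the same way (this is a check of the arithmetic using the formulas given above, not FineDataLink code):

```python
# All values in MB, following the pre-4.1.55 formulas.
input_mem = 2 * (64 + 1 * (64 + 64))  # two input nodes, one output channel each
process_mem = 64 + 3 * (64 + 64)      # one process node, three ultimate output nodes
output_mem = 3 * 32                   # three output nodes, 32 MB each

total = input_mem + process_mem + output_mem
print(total)  # → 928
```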
The Web container memory should be equal to or larger than the resource control memory. For example, if an initialized project has a resource control memory of 16 GB, the Web container memory should be set to a value higher than 16 GB, but not more than 80% of the system memory (recommended).
Note 1: For details of modifying the container memory, see Tomcat Memory Modification.
Note 2: For details of the resource control memory, see the Resource Control Memory section of this article.
The system memory should be equal to or greater than the Web container memory; it is recommended that the Web container memory not exceed 80% of the system memory.
Note: The larger the system memory, the stronger the system scalability during use. For example, to increase the number of concurrent computations from four to eight, you just need to increase the resource control memory and the Web container memory.
The number of CPU threads should be at least twice the number of concurrent tasks. To ensure high performance during concurrent transmission, the number of CPU threads can be slightly greater than twice the number of concurrent tasks.
Number of Concurrent Tasks
The CPU mainly limits the number of scheduled tasks and pipeline tasks that run concurrently.
The number of CPU threads affects full-volume synchronization of the data pipeline task, whereas incremental synchronization remains unaffected.
The disk space should be more than 50 GB.
Disk space is mainly occupied by files (installation files, task files, log files, and backup files) and by task read/write throughput. Data table reads and writes mainly use memory, so little disk space is required if the memory is sufficient.
- File space: 20 MB per hundred tasks (an estimate based on the task quantity in the internal testing environment)
- Running log files (application logs): less than 10 GB
- Backup files (stored in the server's local directory): increase the space according to actual usage
- Data throughput: 10 GB
In summary, reserve at least 50 GB of disk space for server deployment. You can increase the disk space as needed if you need to store Excel and CSV files in the server's local directory.
Contact the technical support personnel for the installation package.
Web container port: 8080
Note: For projects deployed in FDL 4.0.6 and later releases, the default port number is changed to 8068.
WebSocket port
Default port numbers are 58888 and 59888 for FDL 4.0.6 and later releases, and 38888 and 39888 for releases before 4.0.6.
For details, see WebSocket Port Configuration for Standalone Deployment.
WebSocket forwarding port
The default port number for FDL 4.0.6 and later releases is 58889, and 38889 for releases before 4.0.6.
1. For details about port occupation, see Viewing Port Usage.
2. If the default port number conflicts with that of other projects, modify the port number and then open the corresponding port.
3. To deploy multiple Tomcat projects on a server, modify the Tomcat port number to prevent port conflict. For details, see Modifying Tomcat Port Number.
4. If the firewall is enabled, you need to open the relevant port. For a Windows system, see Setting Inbound and Outbound Rules for Windows Server. For a Linux system, see Using and Configuring Linux Firewall.
5. For environments with strict port restrictions between Docker containers or servers, open ports between the node servers for inter-node communication.
If you use the TCP protocol, open the following ports: 7800, 7810, 7820, 7830, 7840, 7850, 7860, and 7870.
If you use the UDP protocol, the port for inter-node communication is chosen randomly between 45588 and 65535.
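Before opening or reassigning ports, you may want to confirm which of the defaults are already taken on the host. A minimal check in Python (standard library only; this is an illustrative sketch, not a FineDataLink tool — binding succeeds only if nothing is listening on the port locally):

```python
import socket

def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if the TCP port can be bound locally, i.e. it is not in use."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        try:
            s.bind((host, port))
            return True
        except OSError:
            return False

# Check the FDL 4.0.6+ defaults mentioned above.
for p in (8068, 58888, 59888, 58889):
    print(p, "free" if port_is_free(p) else "in use")
```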