Python Operator - FineDataLink Help Document

Last update: July 11, 2025

Overview

Version

FineDataLink Version	Functional Change
4.1.6.2	Changed the default file-loading path from FineDataLink installation directory\webapps\webroot\WEB-INF\assist\python to FineDataLink installation directory\webapps\webroot\WEB-INF\plugin\fdl_python.
4.1.13.1	Added a PythonConfig.metaFromMock parameter to FINE_CONF_ENTITY, which controls the execution logic of the Python operator.
4.2.5.3	1. Allowed viewing the return result of Python code for debugging during data development. For details, see the "Configuring the Python Operator" section of this document. 2. Allowed connecting a Python operator with multiple input operators and one process operator. 3. Fixed the issue that the Python module in the image package of FineOps-deployed FineDataLink would be overwritten due to directory mounting to the host machine. 4. Optimized the memory-limiting logic for the Python operator. For details, see the "Memory Usage Limit of the Python Operator" section of this document.

Application Scenario

Complex data processing that is difficult to implement using visual operators or Spark SQL can be done with Python scripts in the Data Transformation node during data development.
During data development in FineDataLink, you may want to read data from files that are not supported by the File Input operator. In such scenarios, you can load file data by running Python scripts.
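
As a minimal sketch of the second scenario, the script below loads a file with pandas and assigns the result to the output variable. The file name books.jsonl, its JSON Lines format, and the load_books helper are assumptions made for illustration. Only the convention of assigning the result to output comes from this document.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd


def load_books(path: Path) -> pd.DataFrame:
    """Read a JSON Lines file (one JSON object per line) into a DataFrame."""
    with path.open(encoding="utf-8") as handle:
        records = [json.loads(line) for line in handle if line.strip()]
    return pd.DataFrame(records)


# Hypothetical sample file, created here only so the sketch is runnable.
sample = Path(tempfile.mkdtemp()) / "books.jsonl"
sample.write_text(
    '{"title": "Book A", "price": 12.5}\n{"title": "Book B", "price": 8.0}\n',
    encoding="utf-8",
)

# In the Python operator, the value assigned to `output` is passed downstream.
output = load_books(sample)
print(output)
```

In a real task, the temporary file would be replaced by the path of the file you want to load.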

Function Description

The Python operator in the Data Transformation node enables you to call Python scripts for complex data processing. The following figure explains the function.

Note:

For details about differences between the Python operator and the Python Script node, see Differences Between the Python Script Node and the Python Operator.

Usage Instruction

1. In FineDataLink of versions before V4.2.5.3, the Python operator could only be preceded by one input operator.

Starting from FineDataLink V4.2.5.3, the Python operator can be preceded by multiple input operators and one process operator.

2. The Python operator cannot be placed between two process operators. If all inputs of the Python operator are input operators, the Python operator can be followed by process operators of the connection, transformation, and laboratory types, excluding the Field Setting operator. If the input of the Python operator contains process operators, the Python operator can only be followed by output operators.

3. The Python editor provides auto-completion only for basic Python syntax, not for names brought in through import statements. It does not provide syntax highlighting or error validation.

4. The Python operator supports file loading from absolute and relative paths.

Note: In FineDataLink of versions before V4.1.6.2, the default path was FineDataLink installation directory\webapps\webroot\WEB-INF\assist\python. Starting from FineDataLink V4.1.6.2, the default path is FineDataLink installation directory\webapps\webroot\WEB-INF\plugin\fdl_python.

This path can be customized. For details, see the "Adding the python.properties File" section of this document. Relative paths are resolved against this path.

5. You can import custom functions using the Python operator.

You can import third-party modules installed in the Python runtime environment.
You can import custom modules from webroot/WEB-INF/assist/python/resources.

6. The input of the Python operator is available as pandas DataFrames in Python code. Refer to the pandas DataFrame documentation if you need to process the input data.

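7. NumPy deprecated the np.float alias in V1.20 and removed it in V1.24. NumPy V2.0 (released on June 16, 2024) also removed np.float_. Type adjustments using np.float will fail in these versions. Use the built-in float or np.float64 instead.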

8. FineDataLink of V4.1.13.1 and later versions introduces a new parameter PythonConfig.metaFromMock in FINE_CONF_ENTITY, through which you can control the execution logic of the Python operator.

Note:

Generally, you do not need to modify its value. Adjust it based on actual requirements.

false (default value): When a Python operator starts execution, the system will execute code in the Python operator once using preview data from upstream operators to obtain output metadata and then perform a second execution for actual data transformation.
true: When a Python operator starts execution, the system generates empty mock data that mimics the metadata of upstream operators, executes the code in the Python operator once on this data to obtain output metadata, and then performs a second execution for actual data transformation. You need to restart the FineDataLink project after changing the value to true.
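
Items 7 and 8 can be illustrated together. The sketch below casts a column with the built-in float rather than np.float, and runs the same transformation on an empty DataFrame, which is roughly the kind of input the operator code sees when PythonConfig.metaFromMock is true. The column names, the add_discount function, and the way the mock frame is built are assumptions for illustration, not FineDataLink internals.

```python
import pandas as pd


def add_discount(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with price cast to float and a discounted price column."""
    result = df.copy()
    # Use the built-in float (or "float64") instead of np.float.
    result["price"] = result["price"].astype(float)
    # Vectorized arithmetic also works when the frame has zero rows.
    result["discounted"] = result["price"] * 0.9
    return result


real = pd.DataFrame({"title": ["Book A", "Book B"], "price": ["10", "20"]})
# Assumed stand-in for mock data: same columns, no rows.
mock = real.iloc[0:0]

print(add_discount(real))
print(list(add_discount(mock).columns))
```

Code that indexes a specific row, for example df.iloc[0], raises an error on a frame with no rows, so operations that do not depend on particular rows are the safer choice when the parameter is set to true.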

Prerequisite

To use the Python operator, you must prepare a Python environment.

Confirming the Python Version

Use Python 3.x.

Installing Essential Packages (Required)

Note:

Modify the following statements according to the actual environment.

In Linux and Windows environments:

1. Install pandas.

pip3 install pandas

2. Install datetime.

pip3 install datetime
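
To confirm that an interpreter can see the required package, a short check such as the following can be run with that interpreter. This is a sketch only: the check_environment name is hypothetical, and the check covers just the Python major version, the presence of pandas, and whether a python3 command is on the PATH.

```python
import importlib.util
import shutil
import sys


def check_environment() -> dict:
    """Collect basic facts about the current Python environment."""
    return {
        "python_major": sys.version_info.major,
        "pandas_installed": importlib.util.find_spec("pandas") is not None,
        # Path of the python3 command on PATH, or None if it is not found.
        "python3_on_path": shutil.which("python3"),
    }


report = check_environment()
print(report)
```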

Adding the python.properties File (Optional)

The following table describes customizable contents in the python.properties file.

Setting Item	Description
python.workdir	It specifies the working directory. For FineDataLink of versions before V4.1.6.2, the default path is FineDataLink installation directory\webapps\webroot\WEB-INF\assist\python. For FineDataLink of V4.1.6.2 and later versions, the default path is FineDataLink installation directory\webapps\webroot\WEB-INF\plugin\fdl_python.
python.cmd	It specifies the path of the Python interpreter. By default, no manual configuration is required: FineDataLink uses the python command on Windows systems and the python3 command on Linux systems, provided that the command is accessible from the command line. Examples of custom values: 1. Linux system: `python.cmd=/home/python/bin/python3` 2. Windows system: `python.cmd=E:\\Python3x\\python.exe` Note: For Windows systems, use double backslashes (\\) as the path separator.
python.concurrency	It specifies the number of concurrent Python threads. The default value is 5.
python.timeout	It specifies the timeout (in seconds) of the Python program. The default value is 1800 s.

Follow these steps to adjust the configuration of these items.

1. Create a folder.

For FineDataLink of versions before V4.1.6.2:

Create a python\config folder in tomcat\webapps\webroot\WEB-INF\assist.

For FineDataLink of V4.1.6.2 and later versions:

Create a config folder in tomcat\webapps\webroot\WEB-INF\plugin\fdl_python.

2. Place the python.properties file into the created folder. After modifying this file, you must restart the project. You can do so after completing the steps in the "Modifying the fine_conf_entity File" section of this document.

You can download the example file python.properties.zip and modify its values based on actual requirements.

Modifying the fine_conf_entity File (Required)

Find the fine_conf_entity table in the FineDB database and add a setting item PythonConfig.enable with its value set to true. Restart the project after adding the setting item.

Example

This example illustrates how to use a Python script to generate a code (the book_code column) for each book in the book table.

Fetching Data from the book Table

1. Create a scheduled task, drag a Data Transformation node onto the page, and enter the Data Transformation editing page.

2. Drag in a DB Table Input operator and fetch data from the book table, as shown in the following figure.

Configuring the Python Operator

1. Drag in a Python operator and write a script that generates the code for each book.

Note:

1. DB Table Input in the following script is the input source of the Python operator. Insert it by clicking the input source instead of typing it.

2. In Windows systems, FineDataLink of versions before V4.0.30 does not support double quotes (") in the code. FineDataLink of V4.0.30 and later versions supports them.

3. Reference parameters (if any) in the format of ${Parameter name}.

import pandas as pd
# pandas is required.
# If an input source is connected, you can click the data source above to reference it. Its data is available as a pandas DataFrame and can be processed with DataFrame methods.

input = DB Table Input

# Add a book_code column that numbers the rows starting from 1.
output = input.assign(book_code=range(1, len(DB Table Input.title) + 1))
# Assign the data for the downstream operator to the output variable. A DataFrame is output as a two-dimensional table. Data of any other type is output as a string.

2. Click Data Preview. A book_code column is generated, as shown in the following figure.

Starting from V4.2.5.3, FineDataLink supports Python code debugging. You can use print statements in the Python operator and click Code Execution Result to check the return message, enabling iterative code adjustments, as shown in the following figure.

The code execution result also includes the logs during execution, as shown in the following figure.
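
To try the same logic outside FineDataLink, the input source can be replaced with a hand-built DataFrame. In the sketch below, the sample rows and the variable name books are assumptions standing in for the DB Table Input source. The print calls mirror the debugging approach described above.

```python
import pandas as pd

# Hypothetical stand-in for the DB Table Input source.
books = pd.DataFrame({"title": ["Book A", "Book B", "Book C"]})

# Number the rows from 1 to len(books) in a new book_code column.
output = books.assign(book_code=range(1, len(books) + 1))

# Printed text is what Code Execution Result would show inside the operator
# (FineDataLink V4.2.5.3 and later, according to this document).
print(output.shape)
print(output)
```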

Outputting Data

1. Drag a DB Table Output operator onto the page and configure the operator, as shown in the following figure.

2. Click the Run button in the upper right corner.

Effect Display

Data in the generated table after successful execution is shown in the following figure.

Memory Usage Limit of the Python Operator

Scenario			Description
Concurrency-limiting logic	Non-containerized deployment and containerized deployment	Preview	The number of preview threads is determined by the value of python.developConcurrency in the python.properties file, which defaults to 5.
Concurrency-limiting logic	Non-containerized deployment and containerized deployment	Execution	The number of execution threads is determined by the value of python.concurrency in the python.properties file, which defaults to 1 since FineDataLink V4.2.5.3. (The default value is 5 for FineDataLink of versions before V4.2.5.3.)
Memory-limiting logic	Non-containerized deployment	Preview	The system calculates a per-thread memory threshold for Python preview threads as (30% of free system memory) / (thread count), and then validates the data volume in the preceding operator against this threshold. If the data volume exceeds the threshold, the preview fails. For example, with 6 GB of free system memory, the total memory available to Python preview threads is 1.8 GB. With 3 preview threads, each thread can use 0.6 GB. If the data volume in a preceding operator exceeds 0.6 GB, a preview error occurs.
Memory-limiting logic	Non-containerized deployment	Execution	The system calculates a per-thread memory threshold for Python execution threads as (50% of free system memory) / (thread count), and then validates the data volume in the preceding operator against this threshold. If the data volume exceeds the threshold, the execution fails. For example, with 6 GB of free system memory, the total memory available to Python execution threads is 3 GB. With 3 execution threads, each thread can use 1 GB. If the data volume in a preceding operator exceeds 1 GB, an execution error occurs.
Memory-limiting logic	Containerized deployment	Preview	The system does not validate the corresponding memory usage for FineDataLink projects deployed in containers.
Memory-limiting logic	Containerized deployment	Execution	The system does not validate the corresponding memory usage for FineDataLink projects deployed in containers.
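
The arithmetic in the non-containerized rows can be written out as a small function. This is only a restatement of the formula in the table. The function name and parameters are hypothetical and do not correspond to any FineDataLink API.

```python
def per_thread_limit_gb(free_memory_gb: float, share: float, threads: int) -> float:
    """Per-thread threshold = share of free system memory / thread count."""
    if threads <= 0:
        raise ValueError("threads must be a positive integer")
    return free_memory_gb * share / threads


# Values from the table: 6 GB of free system memory and 3 threads.
preview_limit = per_thread_limit_gb(6, 0.30, 3)    # preview uses 30%
execution_limit = per_thread_limit_gb(6, 0.50, 3)  # execution uses 50%

print(f"preview: {preview_limit:.1f} GB, execution: {execution_limit:.1f} GB")
```

Because the threshold depends on free memory at the time of the check, the same data volume may pass on one run and fail on another.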
