Python Operator

  • Last update: July 11, 2025
  • Overview

    Version

    FineDataLink Version: Functional Change

    4.1.6.2: Changed the default file-loading path from FineDataLink installation directory\webapps\webroot\WEB-INF\assist\python to FineDataLink installation directory\webapps\webroot\WEB-INF\plugin\fdl_python.

    4.1.13.1: Added a PythonConfig.metaFromMock parameter to FINE_CONF_ENTITY, through which you can control the execution logic of the Python operator.

    4.2.5.3:
    • Allowed viewing the return result of Python code for debugging during data development. For details, see the "Configuring the Python Operator" section of this document.

    • Allowed connecting a Python operator with multiple input operators and one process operator.

    • Fixed the issue that the Python module in the image package of FineOps-deployed FineDataLink would be overwritten due to directory mounting to the host machine.

    • Optimized the memory-limited logic for the Python operator. For details, see the "Memory Usage Limit of the Python Operator" section of this document.

    Application Scenario

    • Complex data processing that is difficult to implement using visual operators or Spark SQL can be done with Python scripts in the Data Transformation node during data development.

    • During data development in FineDataLink, you may want to read data from files that are not supported by the File Input operator. In such scenarios, you can load file data by running Python scripts.

    Function Description

    The Python operator in the Data Transformation node enables you to call Python scripts for complex data processing. The following figure explains the function.


    Note:
    For details about the differences between the Python operator and the Python Script node, see Differences Between the Python Script Node and the Python Operator.

    Usage Instruction

    1. In FineDataLink of versions before V4.2.5.3, the Python operator could only be preceded by one input operator.

    Starting from FineDataLink V4.2.5.3, the Python operator can be preceded by multiple input operators and one process operator.

    2. The Python operator cannot be placed between two process operators. If all inputs of the Python operator are input operators, the Python operator can be followed by process operators of the connection, transformation, and laboratory types, excluding the Field Setting operator. If the input of the Python operator contains process operators, the Python operator can only be followed by output operators.

    3. The Python editor provides auto-completion only for basic Python syntax, not for methods brought in through import statements. It has no syntax highlighting or error validation.

    4. The Python operator supports file loading from absolute and relative paths.

    Note: In FineDataLink of versions before V4.1.6.2, the default path was FineDataLink installation directory\webapps\webroot\WEB-INF\assist\python. Starting from FineDataLink V4.1.6.2, the default path is FineDataLink installation directory\webapps\webroot\WEB-INF\plugin\fdl_python.

    This path can be customized. For details, see the "Adding the python.properties File" section of this document. Relative paths are resolved against this path.

    5. You can import custom functions using the Python operator.

    • You can import third-party modules installed in the Python runtime environment.

    • You can import custom modules from webroot/WEB-INF/assist/python/resources.

    6. The input of the Python operator is available in Python code as pandas DataFrames. Refer to the pandas DataFrame documentation if you need to process the data source.

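    7. NumPy no longer provides the np.float alias: it was deprecated in V1.20 and removed in V1.24, and the related np.float_ alias was removed in V2.0 (released on June 16, 2024). Type adjustments using np.float will fail on these versions. Use the built-in float or np.float64 instead.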

    8. FineDataLink of V4.1.13.1 and later versions introduces a new parameter PythonConfig.metaFromMock in FINE_CONF_ENTITY, through which you can control the execution logic of the Python operator.

    Note:
    Generally, you do not need to modify its value. Adjust it only if your scenario requires it.
    • false (default value): When a Python operator starts execution, the system executes the code in the Python operator once using preview data from upstream operators to obtain output metadata, and then executes it a second time for the actual data transformation.

    • true: When a Python operator starts execution, the system generates empty mock data that mimics the metadata of upstream operators, executes the code in the Python operator once on this mock data to obtain output metadata, and then executes it a second time for the actual data transformation. You need to restart the FineDataLink project after changing the value to true.
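Under either setting the code runs twice, and with PythonConfig.metaFromMock set to true the first run receives data with no rows. The following is a minimal sketch of a transformation that behaves the same on empty and populated input. The column names title and price, the 1.1 factor, and the helper add_price_with_tax are hypothetical and only serve the illustration; the exact shape of the mock data FineDataLink generates is also an assumption.

```python
import pandas as pd


def add_price_with_tax(frame: pd.DataFrame) -> pd.DataFrame:
    """Add a price_with_tax column; works on empty input as well."""
    # Vectorized column arithmetic never indexes a specific row, so it
    # succeeds even when the frame has zero rows.
    # astype("float64") is used instead of the removed np.float alias.
    price = frame["price"].astype("float64")
    return frame.assign(price_with_tax=price * 1.1)


# Hypothetical mock input: upstream columns present, zero rows.
mock_input = pd.DataFrame({"title": pd.Series(dtype="object"),
                           "price": pd.Series(dtype="float64")})
# Hypothetical real input with the same columns.
real_input = pd.DataFrame({"title": ["A", "B"], "price": [10.0, 20.0]})

mock_output = add_price_with_tax(mock_input)
real_output = add_price_with_tax(real_input)

# Both runs expose the same output columns, so metadata derived from the
# first run matches the data produced by the second run.
print(list(mock_output.columns))
print(list(real_output.columns))
```

Code that reads a specific row (for example, frame.iloc[0]) would raise an error on the empty first run, which is why the sketch sticks to column-wise operations.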

    Prerequisite

    To use the Python operator, you must prepare a Python environment.

    Confirming the Python Version

    Use Python 3.x.

    Installing Essential Packages (Required)

    Note:
    Modify the following statements according to the actual environment.

    In Linux and Windows environments:

    1. Install pandas.

    pip3 install pandas

    2. Install datetime.

    pip3 install datetime
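After installation, you may want to confirm that the packages can be imported by the interpreter that FineDataLink will call. The following is a minimal sketch using only the standard library; the helper name missing_packages is hypothetical. Run it with the same python or python3 command that FineDataLink uses.

```python
import importlib.util


def missing_packages(names):
    """Return the names that cannot be imported in this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# pandas must be installed separately. A datetime module also ships with
# Python's standard library, so it is found either way.
required = ["pandas", "datetime"]
missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```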

    Adding the python.properties File (Optional)

    The following table describes customizable contents in the python.properties file.

    Setting Item: Description
    python.workdir

    It specifies the working directory.

    • For FineDataLink of versions before V4.1.6.2, the default path is FineDataLink installation directory\webapps\webroot\WEB-INF\assist\python.

    • For FineDataLink of V4.1.6.2 and later versions, the default path is FineDataLink installation directory\webapps\webroot\WEB-INF\plugin\fdl_python.

    python.cmd

    It specifies the path of the Python environment. The default command is python for Windows systems and python3 for Linux systems.

    By default, the system uses the Python environment found through the system environment variables, so no manual path configuration is required.

    • For Linux systems, FineDataLink can identify the python3 command if it is accessible from the command line.

    • For Windows systems, FineDataLink can identify the python command if it is accessible from the command line.

    Examples of the custom content:

    1. Linux system:

    python.cmd=/home/python/bin/python3

    2. Windows system:

    python.cmd=E:\\Python3x\\python.exe

    Note:
    For Windows systems, the separator in the path should be double backslashes (\\).

    python.concurrency

    It specifies the number of concurrent Python threads. The default value is 5.

    python.timeout

    It specifies the timeout (in seconds) of the Python program. The default value is 1800 s.
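To check a python.properties file before placing it, the settings in the table can be read back with a few lines of Python. This is only a sketch for inspecting your own file, not a description of how FineDataLink parses it: the helper names, the sample values (including /opt/fdl_python and data/books.csv), and the simplified key=value parsing are assumptions.

```python
import sys
from pathlib import Path

# Defaults as listed in the table above. python.workdir is left unset
# because its default depends on the FineDataLink version.
DEFAULTS = {
    "python.cmd": "python" if sys.platform.startswith("win") else "python3",
    "python.concurrency": "5",
    "python.timeout": "1800",
}


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and # comments."""
    settings = dict(DEFAULTS)
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # A doubled backslash in a .properties file stands for one backslash.
        settings[key.strip()] = value.strip().replace("\\\\", "\\")
    return settings


def resolve_against_workdir(settings, relative_path):
    """Resolve a relative file path against python.workdir, if it is set."""
    workdir = settings.get("python.workdir")
    if workdir is None:
        raise KeyError("python.workdir is not set in this sketch")
    return Path(workdir) / relative_path


sample = """
# hypothetical values for illustration
python.workdir=/opt/fdl_python
python.cmd=/home/python/bin/python3
python.timeout=600
"""
settings = parse_properties(sample)
print(settings["python.cmd"])
print(settings["python.concurrency"])
print(resolve_against_workdir(settings, "data/books.csv"))
```

Items that are absent from the file keep their default values, which is why python.concurrency is still reported although the sample does not set it.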


    To adjust these settings, follow these steps.

    1. Create a folder.

    • For FineDataLink of versions before V4.1.6.2:

    Create a python\config folder in tomcat\webapps\webroot\WEB-INF\assist.

    • For FineDataLink of V4.1.6.2 and later versions:

    Create a config folder in tomcat\webapps\webroot\WEB-INF\plugin\fdl_python.

    2. Place the python.properties file in the created folder. After modifying the python.properties file, you must restart the project. You can do this after you finish the operations in the "Modifying the fine_conf_entity File" section of this document.

    You can download the example file python.properties.zip and modify the values based on actual requirements.

    Modifying the fine_conf_entity File (Required)

    Find the fine_conf_entity table in the FineDB database and add a setting item PythonConfig.enable with its value set to true. Restart the project after adding the setting item.

    Example

    This example illustrates how to use a Python script to generate the code for each book in the book table.

    Fetching Data from the book Table

    1. Create a scheduled task, drag a Data Transformation node onto the page, and enter the Data Transformation editing page.

    2. Drag in a DB Table Input operator and fetch data from the book table, as shown in the following figure.

    Configuring the Python Operator

    1. Drag in a Python operator and write a script that generates the code for each book.

    Note:

    1. DB Table Input in the following script is the input source of the Python operator. Insert it by clicking the input source instead of typing the name.

    2. On Windows systems, FineDataLink of versions before V4.0.30 does not support double quotes ("") in the code. FineDataLink of V4.0.30 and later versions supports them.

    3. Reference parameters (if any) in the format of ${Parameter name}.

    import pandas as pd
    # You must use pandas.
    # If there is a connected data source, you can click the data source above to use it. Data from the input source exists in a pandas DataFrame, and can be processed through the DataFrame method.

    input = DB Table Input

    output = input.assign(book_code=range(1, len(DB Table Input.title) + 1))
    # Assign the data to be output to the downstream operator to an output variable. If the data is of the DataFrame data type, output it in the form of a two-dimensional table. If the data is of other data types, output it in the form of a string.
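
To try the same logic outside FineDataLink, the input source can be replaced with a small hand-made DataFrame. The following is a minimal sketch; the variable name book and the sample titles are hypothetical stand-ins for the rows that DB Table Input would provide.

```python
import pandas as pd

# Hypothetical stand-in for the rows that DB Table Input would provide.
book = pd.DataFrame({"title": ["Book A", "Book B", "Book C"]})

# Same expression as in the operator script: number the rows from 1.
output = book.assign(book_code=range(1, len(book.title) + 1))

print(output)
```

assign returns a new DataFrame and leaves the input unchanged, so the original columns pass through to the downstream operator alongside book_code.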

     

    2. Click Data Preview. A book_code column is generated, as shown in the following figure.

    Starting from V4.2.5.3, FineDataLink supports Python code debugging. You can use print statements in the Python operator and click Code Execution Result to check the return message, enabling iterative code adjustments, as shown in the following figure.

    The code execution result also includes the logs during execution, as shown in the following figure.
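
When debugging with print statements, the row count and column types are usually a good first thing to print. The following is a minimal sketch of such a statement, assuming a hypothetical helper summarize and a hand-made sample DataFrame in place of the operator input.

```python
import pandas as pd


def summarize(frame: pd.DataFrame) -> str:
    """Build a short text summary suitable for a print statement."""
    lines = [f"rows={len(frame)} columns={list(frame.columns)}"]
    for name, dtype in frame.dtypes.items():
        lines.append(f"{name}: {dtype}")
    return "\n".join(lines)


# Hypothetical sample standing in for the operator input.
sample = pd.DataFrame({"title": ["Book A", "Book B"], "book_code": [1, 2]})
print(summarize(sample))
```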

    Outputting Data

    1. Drag a DB Table Output operator onto the page and configure the operator, as shown in the following figure. 

    2. Click the Run button in the upper right corner.

    Effect Display

    Data in the generated table after successful execution is shown in the following figure.

    Further Reading

    Log Description

    You can set the output log level for individual scheduled tasks under Task Control > Task Attribute > Log Level Setting to meet different needs of log viewing, debugging, and troubleshooting.

    To get a detailed log display, select INFO from the drop-down list of Log Level Setting. For details, see Log Level Setting.

    Detailed logs will be displayed in Log after you run the task.

    Memory Usage Limit of the Python Operator

    Concurrency-limiting logic (non-containerized and containerized deployment)

    • Preview: The number of preview threads is determined by the value of python.developConcurrency in the python.properties file, which defaults to 5.

    • Execution: The number of execution threads is determined by the value of python.concurrency in the python.properties file. The default value is 5 for FineDataLink of versions before V4.2.5.3; a different default applies starting from V4.2.5.3.

    Memory-limiting logic (non-containerized deployment)

    • Preview: The system calculates a per-thread memory threshold for Python preview threads as (30% of free system memory) / (thread count), and then checks the data volume of the preceding operator against this threshold. If the data volume exceeds the threshold, the preview fails.

    For example, if there is 6 GB of free system memory, the total available memory for Python preview threads is 1.8 GB. If there are 3 preview threads, the available memory for each thread is 0.6 GB. If the data volume in a preceding operator exceeds 0.6 GB, a preview error occurs.

    • Execution: The system calculates a per-thread memory threshold for Python execution threads as (50% of free system memory) / (thread count), and then checks the data volume of the preceding operator against this threshold. If the data volume exceeds the threshold, the execution fails.

    For example, if there is 6 GB of free system memory, the total available memory for Python execution threads is 3 GB. If there are 3 execution threads, the available memory for each thread is 1 GB. If the data volume in a preceding operator exceeds 1 GB, an execution error occurs.

    Memory-limiting logic (containerized deployment)

    • Preview and execution: The system does not validate memory usage for FineDataLink projects deployed in containers.
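
The threshold arithmetic for non-containerized deployment can be reproduced with a short calculation. The following is a minimal sketch of the formula as described in this section, not FineDataLink's actual implementation; the function and constant names are hypothetical, and how the system counts threads is an assumption.

```python
def per_thread_limit_gb(free_memory_gb, thread_count, share):
    """Per-thread threshold: share of free memory divided by thread count."""
    if thread_count <= 0:
        raise ValueError("thread_count must be positive")
    return free_memory_gb * share / thread_count


def exceeds_limit(data_volume_gb, limit_gb):
    """True when the upstream data volume is over the per-thread limit."""
    return data_volume_gb > limit_gb


PREVIEW_SHARE = 0.30    # preview threads may use 30% of free memory
EXECUTION_SHARE = 0.50  # execution threads may use 50% of free memory

# Figures from the examples in this section: 6 GB free, 3 threads.
preview_limit = per_thread_limit_gb(6, 3, PREVIEW_SHARE)
execution_limit = per_thread_limit_gb(6, 3, EXECUTION_SHARE)

# Roughly 0.6 GB per preview thread and 1 GB per execution thread.
print(round(preview_limit, 2), round(execution_limit, 2))
print(exceeds_limit(0.8, preview_limit), exceeds_limit(0.8, execution_limit))
```

With these figures, a 0.8 GB input is over the preview threshold but under the execution threshold, so a task can fail in preview and still run successfully.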

