Scanning a Batch of Files for Data Synchronization

  • Last update: January 09, 2025
  • Overview

    Application Scenario

    A state-owned enterprise with over 100 subsidiaries requires each subsidiary's finance department to submit various financial budget sheets and final accounts (such as balance sheets, income statements, and cash flow statements) to the headquarters monthly. The finance team of the headquarters needs to manually summarize data in the financial reports of the same type for reporting to the senior management.

    The current problems are:

    • Manual data summary by the headquarters' finance team is time-consuming and inefficient.

    • Errors may occur in the summarized data and you may find it hard to trace anomalies quickly.

    This document provides a solution to automate the reading of files of the same type, helping enterprises reduce costs and improve efficiency.

    Implementation Method

    You can read Excel and CSV files in the same format using the File Input operator.

    Prerequisite

    1. The solution in this document uses the File Input operator to read a batch of files, which is similar to concatenating data from all files row-wise and reading summarized data. If you need to read file data multiple times in a loop and process the read data from a single file per fetch, see Looping Through Files in the Linux Environment for Data Synchronization and Looping Through Files in the Windows Environment for Data Synchronization.

    2. The Excel files in the folder must have identical field names, and the CSV files in the folder must have the same delimiter.

    iconNote:
    The supported Excel file versions range from Excel 97 to the latest version of Microsoft Excel.

    For details about the File Input operator, see Function Description of File Input.

    Procedure

    This example in this document demonstrates how to read data from a batch of CSV files on the FineDataLink server and write it into a database table.

    iconNote:
    The solution in this document allows you to read data from Excel or CSV files on local FineDataLink servers and FTP/SFTP servers.

    Scenario Description

    A company uploads files to the CSV folder on a monthly schedule, and the file data is manually summarized and written into a specific database table regularly, leading to inefficiency and a high likelihood of errors.

    In this example, the CSV folder contains two CSV files. The following figure shows the file data.

    2025-01-09_10-55-28.png

    Preparation

    1. Create a test folder in the /webroot/WEB-INF/assets/local_files path of the FineDataLink project, and upload the files to be scanned into the test folder each month, as shown in the following figure.

    2025-01-09_17-11-02.png

    iconNote:
    To read data from CSV/Excel files on the local FineDataLink server via a data connection to the local server directory, the folder storing files to be scanned must be placed in the /webroot/WEB-INF/assets/local_files path of the FineDataLink project.

    2. Create a data connection to the local server directory, and select the path in which the test folder is stored in Local Server Directory.

    2025-01-09_17-13-59.png

    Reading Data from Files in the Folder

    1. Create a scheduled task, add a Data Transformation node, and enter the Data Transformation editing page.

    2. Add a File Input operator and configure it to read data from the test folder in the FineDataLink project. Retain the default settings of other items, as shown in the following figure.

    iconNote:
    For details about each setting item of the File Input operator, see Function Description of File Input.

    2025-01-09_17-16-32.png

    3. Click Data Preview. The file data in the test folder has been fetched, as shown in the following figure.

    2025-01-09_17-17-42.png

    Outputting the Data

    1. Add a DB Output operator and configure it to output the fetched data into a database table, as shown in the following figure.

    2025-01-09_17-19-19.png

    2. Set Write Method to Write Data into Target Table After Emptying It, which is used for writing all data into the target table.

    3. Run the task.

    4. Click the Publish button to publish the task to Production Mode.

    2025-01-09_17-20-01.png

    Setting a Timed Scheduling Plan

    Configure the timed scheduling plan to summarize data at the end of every month, as shown in the following figure.

    2025-01-09_17-22-19.png

    Result Display

    The following figure shows the data in the summary table.

    2025-01-09_17-27-03.png

    附件列表


    主题: Data Development - Scheduled Task
    • Helpful
    • Not helpful
    • Only read

    滑鼠選中內容,快速回饋問題

    滑鼠選中存在疑惑的內容,即可快速回饋問題,我們將會跟進處理。

    不再提示

    9s后關閉

    Get
    Help
    Online Support
    Professional technical support is provided to quickly help you solve problems.
    Online support is available from 9:00-12:00 and 13:30-17:30 on weekdays.
    Page Feedback
    You can provide suggestions and feedback for the current web page.
    Pre-Sales Consultation
    Business Consultation
    Business: international@fanruan.com
    Support: support@fanruan.com
    Page Feedback
    *Problem Type
    Cannot be empty
    Problem Description
    0/1000
    Cannot be empty

    Submitted successfully

    Network busy