You have multiple Excel files in the same format on the Windows system and need to read all the file data and perform row-to-column conversion.
When you use the File Input operator to read files in batches, all file data is concatenated row-wise. The concatenated data may contain duplicate values, which will cause an error in row-to-column conversion, as the Row to Column operator requires unique names of the fields to be converted.
FineDataLink allows you to loop through all Excel files and read and process data from a single Excel file per fetch. This method prevents duplicate data that may occur in reading files all at once, ensuring smooth row-to-column conversion.
Configuring the Data Connection
1. Since the names of all Excel files in the folder to be scanned are retrieved by executing a batch script, you need to configure an SSH data connection to the remote target server to call the batch script on it.
For details, see SSH Data Connection.
2. After the batch script is successfully executed, an excel_name.CSV file is generated in the folder to be scanned. The file contains the names of all Excel files in the folder.
You need to output a parameter with file names as its values and use it in the Loop Container node to read data from Excel files sequentially. You must be able to read data from the excel_name.CSV file to output this parameter.
The example in this document uses the data connection to the local server directory to read data from excel_name.CSV.
Storing the Files to be Read
You can download the example Excel files. Excel.zip
In this example, the Excel files to be read are stored in FineDataLink installation directory\webapps\webroot\WEB-INF\assets\local_files\test\excel\.
The following figure shows the data in the Excel files to be read.
This section illustrates how to obtain the names of all Excel files by calling the batch script.
Preparing the Batch Script
You can download the example batch script. scan_excel.bat.zip
1. In this example, the Excel files to be read are stored in FineDataLink installation directory\webapps\webroot\WEB-INF\assets\local_files\test\excel\. After you call the batch script, an excel_name.CSV file will be generated in the excel folder. This file contains the names of all Excel files in the folder.
Modify the path in the following code as needed.
SETLOCAL EnableDelayedExpansionREM Enable delayed environment variable expansion.SET "xls_path=FineDataLink installation directory\webapps\webroot\WEB-INF\assets\local_files\test\excel\*.xls"SET "xlsx_path=FineDataLink installation directory\webapps\webroot\WEB-INF\assets\local_files\test\excel\*.xlsx"REM The xls_path and xlsx_path parameters define the address and file type of the folder to be scanned.SET "output_path=FineDataLink installation directory\webapps\webroot\WEB-INF\assets\local_files\test\excel\excel_name.CSV"REM The output_path parameter defines the path and name of the output file.DIR "!xls_path!" "!xlsx_path!" /B /ON > "!output_path!"REM The /B parameter outputs file names only, and the /ON parameter sorts the output by file name.
2. Place the batch script in a specific directory on the Windows system.
Calling the Batch Script
1. Create a scheduled task, drag a Bat Script node onto the page, and configure the node, as shown in the following figure.
2. Right-click the Bat Script node and select Run Node from the drop-down list to retrieve the names of all Excel files to be read, as shown in the following figure. In subsequent steps, you can output a parameter using the Excel file names.
3. After the node is successfully run, an excel_name.CSV file is generated in FineDataLink installation directory\webapps\webroot\WEB-INF\assets\local_files\test\excel\. This file contains the names of all Excel files in the folder to be scanned.
1. Add a Parameter Assignment node and configure the node to read data from the excel_name.CSV file.
Set Encoding to GBK. Do not tick First Row As Field Name since the excel_name.CSV file contains only names of the Excel files to be read without a column header.
2. Output a parameter with the names of Excel files to be read as its values, as shown in the following figure.
To enable the scheduled reading of Excel files, add an SQL Script node to delete the target table before the iteration operation. Create the target table later in the Loop Container node, as shown in the following figure.
Use the parameter created in the "Outputting the File Name as a Parameter" section to pass file names to the Loop Container node (which will iterate through the output value of the Parameter Assignment node).
Configuring the Loop Container Node
Configure the Loop Container node as shown in the following figure.
Processing Data
1. Drag a Data Transformation node into the Loop Container node to process the data fetched per iteration.
2. Enter the Data Transformation node and add a File Input operator to extract file data, as shown in the following figure.
Reference the parameter output in the "Outputting the File Name as a Parameter" section in File Address. In this example, the file address is assets/local_files/test/excel/${Table_name}.
3. Add other data processing operators (such as the Row to Column operator and the Field Setting operator) behind the File Input operator as needed. This step is not included in the current example.
4. Use the DB Table Output operator to output the extracted data, as shown in the following figure.
Set Write Method to Write Data into Target Table Directly.
5. Click the Save button in the upper right corner.
Publish the task, click Scheduling Plan, and set Execution Frequency, as shown in the following figure.
Click Run in the upper right corner. After successful execution, the data in the target table is shown in the following figure.
If you need to loop through CSV files, you can modify the batch script code in the "Preparing the Batch Script" section as follows.
SETLOCAL EnableDelayedExpansionREM Enable delayed environment variable expansion.SET "csv_path=FineDataLink installation directory\webapps\webroot\WEB-INF\assets\local_files\test\excel\*.csv"REM The csv_path parameter defines the folder address and file types of the folder to be scanned.SET "output_path=FineDataLink installation directory\webapps\webroot\WEB-INF\assets\local_files\test\excel\excel_name.CSV"REM The output_path parameter defines the path and name of the output file.DIR "!csv_path!" /B /ON > "!output_path!"REM The /B parameter outputs file names only, and the /ON parameter sorts the output by file name.
Note that in the above code, the output file excel_name is in the CSV format. Therefore, the output file path must be different from the path of the CSV files to be scanned.
You can adjust the remaining steps as needed according to the "Procedure" section of this document.
滑鼠選中內容,快速回饋問題
滑鼠選中存在疑惑的內容,即可快速回饋問題,我們將會跟進處理。
不再提示
10s後關閉
Submitted successfully
Network busy