| FineDataLink Version | Functional Change |
|---|---|
| 4.1.6.4 | Supported the to_json and struct functions. For details, see Common SparkSQL Syntax. |
| 4.1.7.3 | Optimized the interaction. |
A common practice for processing data from non-relational sources, such as files and APIs, is to store the data in a database first and then process it with SQL statements, which leaves redundant data in the target database. With the Spark SQL operator, however, you can perform operations such as left joins, right joins, union all, grouping, summarizing, and sorting on data from various sources directly within FineDataLink.
Additionally, you can use the Spark SQL operator to handle data processing needs that the visual operators in Data Transformation cannot meet.
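For example, assuming two upstream operators named DB Table Input and API Input that output the same columns, a union-and-aggregate query in the Spark SQL operator might look like the following sketch (operator and field names are hypothetical):

select `Ship City`, count(1) as order_count
from (
    select * from `DB Table Input`
    union all
    select * from `API Input`
) t
group by `Ship City`
order by order_count desc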
When authentication is required to connect to an API data source, you can use Spark SQL encryption functions and variables to obtain tokens.
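For instance, if an API expects an MD5 signature computed over a secret and a timestamp, a sketch of such a statement might be the following (the ${client_secret} parameter name is hypothetical, and the actual signing rule depends on the API):

select md5(concat('${client_secret}', '${cyctime}')) as token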
You can use the Spark SQL operator to query and process data output by the upstream node/operator and output the result data to the downstream node/operator.
The Spark SQL operator can be used as an input operator, where you can use parameters or constants for encryption and authentication.
1. Spark SQL can be used as an input operator, where you can use encryption functions and parameters.
Reference a parameter in a format similar to select '${cyctime}' as date.
2. The Spark SQL operator supports standard SQL statements, and the usage is consistent with that of SQL in Configuration Method under Data Synchronization > Data Source. If you want to use statements such as UPDATE, use the SQL Script node.
3. For details about the common syntax and introduction of Spark SQL, see Spark SQL Syntax Overview.
4. The setting items on the Spark SQL setting page are described in the following table.
| Tab Page | Setting Item | Introduction |
|---|---|---|
| Configuration | Input Source | The preceding operator of the Spark SQL operator serves as the input table by default, with its name being the input table name. The Spark SQL operator supports two types of preceding operators: data input operators (such as DB Table Input, API Input, and Dataset Input) and connection and transformation operators (such as Data Association, Row to Column, Column to Row, and JSON Parsing). |
| Configuration | SQL Statement | Enter the SQL statement to be executed. Auto-completion is available as you type. For example, when you type the input table name followed by a period (.), field names are suggested. If there are many fields, you can scroll through the field names (shown in the red box in the following figure) to select one. |
| Data Preview | / | After finishing the configuration, you can click the Data Preview tab to preview the data processed by Spark SQL. |
5. On the Data Preview tab page, at most 5000 rows of data are sampled for the calculation to avoid overloading the server with an excessive data volume. During actual execution, all data participates in the calculation.
You want to associate data from the SOrder file with the SCustomer database table and filter the data.
You can download the example data:
Create a scheduled task and drag a Data Transformation node onto the design page, as shown in the following figure.
Click the Data Transformation node to enter the editing page. Upload the local Excel file to the FineDataLink system, as shown in the following figure. For details, see Function Description of File Input.
Add a DB Table Input operator and use the SQL statement to fetch all the data from the SCustomer table, as shown in the following figure.
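The statement here can be as simple as the following (a minimal sketch, assuming all rows and columns of the SCustomer table are needed):

select * from SCustomer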
Add a Data Association operator and set Join Method to Left Join to combine the SOrder table and the SCustomer table based on the Customer ID field, as shown in the following figure.
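For reference, the left join configured in this visual operator is roughly equivalent to the following Spark SQL sketch (the Customer ID join key follows the example data; actual field names may differ):

select *
from SOrder
left join SCustomer
on SOrder.`Customer ID` = SCustomer.`Customer ID`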
Add a Spark SQL operator and connect it to the Data Association operator as the downstream operator. Enter the following statement in SQL Statement.
select * from `Data Association` where `Data Association`.`Fax` is not null and `Data Association`.`Ship City` = 'Beijing'
Note the requirements for writing the SQL statement, as shown in the following figure.
Click Data Preview, as shown in the following figure.
Add a Field Setting operator, where you can delete and modify the field name and the data type, as shown in the following figure.
Drag a DB Table Output operator to the design page and connect it to the Field Setting operator as the downstream operator.
Click the DB Table Output operator to configure it, as shown in the following figure.
Click the Save button in the upper right corner, and then click Run in the upper right corner, as shown in the following figure. After successful execution, a message indicating successful execution is displayed in Log.
You can see that the out_order table has been added to the database, as shown in the following figure.
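To verify the result, you could also run a quick query against the target database, for example:

select count(1) from out_order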