You have two tables stored in different databases. Each table contains names and subject scores. You want to merge these two tables into a single table, add a column that represents the total of the subject scores, and output the final result to a specified database.
First, merge the two score tables into a single table using the Data Association operator. Then, use the Spark SQL operator to create the new calculated column. Finally, output the table data to the target database using the DB Table Output operator.
The two tables are stored in two different databases, as shown in the following figure.
You want to merge these two tables into a single table and add a new column whose value is the sum of the "English Score" and "History Score". The following figure shows the final result.
Create a scheduled task and drag a Data Transformation node onto the design page, as shown in the following figure.
1. Click the Data Transformation node to enter the setting page.
2. Drag a DB Table Input operator onto the Data Transformation design page and set the data source as shown in the following figure. Then click the Node Information tab and rename the node to English Score Table.
3. Drag another DB Table Input operator onto the Data Transformation design page and set the data source as shown in the following figure. Then click the Node Information tab and rename the node to History Score Table.
Drag a Data Association operator to the Data Transformation design page, and connect it with lines to the two upstream nodes: English Score Table and History Score Table.
Click the Data Association node, set Left Table as English Score Table and Right Table as History Score Table, set Join Method as Left Join, set Join Field to Name = Name, and rename the node to Data Association, as shown in the following figure.
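After configuring the association, click the Data Preview tab to check the merged result. Because the join method is Left Join, every row of English Score Table is kept, and a student with no matching row in History Score Table has an empty History Score.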
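The effect of this Left Join configuration can be sketched outside the product. The following is a minimal illustration only, using Python's built-in sqlite3 module as a stand-in for the Data Association operator. The table names english and history and all sample rows are hypothetical; only the column names and the join condition Name = Name come from this tutorial.

```python
import sqlite3

# Hypothetical sample rows; in the tutorial the two tables live in separate databases.
english_rows = [("Alice", 90), ("Bob", 85), ("Carol", 78)]
history_rows = [("Alice", 88), ("Bob", 92)]  # Carol has no history score

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE english ("Name" TEXT, "English Score" INTEGER)')
conn.execute('CREATE TABLE history ("Name" TEXT, "History_Score" INTEGER)')
conn.executemany("INSERT INTO english VALUES (?, ?)", english_rows)
conn.executemany("INSERT INTO history VALUES (?, ?)", history_rows)

# Left table = english, right table = history, join field Name = Name.
merged = conn.execute(
    '''
    SELECT e."Name", e."English Score", h."History_Score"
    FROM english AS e
    LEFT JOIN history AS h ON e."Name" = h."Name"
    ORDER BY e."Name"
    '''
).fetchall()

for row in merged:
    print(row)
# ('Alice', 90, 88)
# ('Bob', 85, 92)
# ('Carol', 78, None)
```

Carol appears in the result even though she has no row in the right table, which is the behavior to look for in Data Preview.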
You want to calculate the sum of the "English Score" and the "History Score" for each student.
1. Add a Spark SQL operator and connect it to the Data Association operator as the downstream operator, as shown in the following figure.
2. Configure Spark SQL as shown in the following figure. The SQL statement is: select $[Data Association].`Name`, $[Data Association].`English Score`, $[Data Association].`History_Score`, $[Data Association].`English Score`+$[Data Association].`History_Score` as Final_Result from $[Data Association]
Click the Data Preview tab to view the data, as shown in the following figure.
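When checking the preview, keep in mind that in standard SQL, including Spark SQL, adding NULL to a number yields NULL. A student kept by the Left Join without a history score therefore gets an empty Final_Result. The sketch below reproduces this with Python's sqlite3 module as a stand-in for the Spark SQL operator. The table name joined, the sample rows, and the Final_Result_Zero_Filled column are assumptions for illustration and are not part of the tutorial's statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in for the output of the Data Association node (hypothetical sample values).
conn.execute(
    'CREATE TABLE joined ("Name" TEXT, "English Score" INTEGER, "History_Score" INTEGER)'
)
conn.executemany(
    "INSERT INTO joined VALUES (?, ?, ?)",
    [("Alice", 90, 88), ("Bob", 85, 92), ("Carol", 78, None)],
)

totals = conn.execute(
    '''
    SELECT "Name",
           -- Same expression as in the tutorial: NULL if either score is missing.
           "English Score" + "History_Score" AS Final_Result,
           -- Optional variant (an assumption): treat a missing score as 0.
           "English Score" + COALESCE("History_Score", 0) AS Final_Result_Zero_Filled
    FROM joined
    ORDER BY "Name"
    '''
).fetchall()

for row in totals:
    print(row)
# ('Alice', 178, 178)
# ('Bob', 177, 177)
# ('Carol', None, 78)
```

Whether a missing score should produce an empty total or be counted as 0 depends on your requirements. COALESCE is also available in Spark SQL if you prefer the second behavior.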
Drag a DB Table Output operator onto the design page and connect it to the Spark SQL operator as the downstream operator.
Click the DB Table Output operator to configure it, as shown in the following figure.
Select Map Fields with Same Name in Field Mapping, and the page will appear as shown in the following figure.
Note: To modify a field name, edit it directly in Target Table Field.
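As a rough mental model of Map Fields with Same Name, the sketch below pairs each source field with a target field that has an identical name and reports the fields left unmapped. It is an illustration only: the function map_same_name is hypothetical, and exact, case-sensitive matching is an assumption rather than documented behavior of the operator.

```python
def map_same_name(source_fields, target_fields):
    """Pair source fields with identically named target fields.

    Hypothetical helper: assumes exact, case-sensitive name matching.
    Returns (mapping, unmapped_source_fields).
    """
    target_set = set(target_fields)
    mapping = {name: name for name in source_fields if name in target_set}
    unmapped = [name for name in source_fields if name not in target_set]
    return mapping, unmapped


# Source fields produced by the Spark SQL step in this tutorial.
source = ["Name", "English Score", "History_Score", "Final_Result"]
# Hypothetical target table in which one column was given a different name.
target = ["Name", "English Score", "History_Score", "Total"]

mapping, unmapped = map_same_name(source, target)
print(mapping)
print(unmapped)  # ['Final_Result']
```

In this hypothetical case, Final_Result has no same-name counterpart, so it would need to be matched by editing the name in Target Table Field.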
Click Run in the upper-right corner. After the task runs successfully, a success message is displayed in Log, as shown in the following figure.
The table containing the calculated Final_Result is shown in the following figure.