You may need to parse XML data returned by APIs, WebServices, or OData interfaces, or stored in XML files, into rows and columns for subsequent processing and storage. The XML Parsing operator converts the input XML data into this row-and-column format.
This document uses a simple example to demonstrate how to use the XML Parsing operator.
Fetch XML data through API Input in the Data Transformation node, as shown in the following figure. For details, see API Input - Webservice.
Preview the XML data fetched from the API on the Data Preview tab page, as shown in the following figure.
In the Data Transformation node, drag an XML Parsing operator onto the page and connect it to the upstream API Input operator, as shown in the following figure.
Select Source Field and choose whether to enable Keep All Upstream Output Fields After Parsing, as shown in the following figure.
If the XML data uses namespaces, you must specify them so that the correct nodes can be located. For the API in the following figure, the prefixes xsi, xsd, and soap and their corresponding URIs must be entered so that node names carrying those prefixes are identified correctly.
The following table describes the configuration items in Parsing Configuration:
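Why the prefix and URI pairs matter can be illustrated outside the product with Python's standard xml.etree.ElementTree module. This is only a sketch of the general XML behavior, not of how the operator is implemented. The response body, the http://example.com/demo namespace, and the svc prefix are made up for the example; only the SOAP envelope, xsi, and xsd URIs are the standard ones.

```python
import xml.etree.ElementTree as ET

# Hypothetical SOAP response; the payload namespace and element names are invented.
SOAP_XML = """<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"
               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <soap:Body>
    <getCityResponse xmlns="http://example.com/demo">
      <getCityResult>
        <string>Beijing</string>
        <string>Shanghai</string>
      </getCityResult>
    </getCityResponse>
  </soap:Body>
</soap:Envelope>"""

# Prefix -> URI pairs, analogous to Namespace Prefix / Namespace URI.
# "svc" is a prefix chosen here for the payload's default namespace.
NAMESPACES = {
    "soap": "http://schemas.xmlsoap.org/soap/envelope/",
    "svc": "http://example.com/demo",
}

root = ET.fromstring(SOAP_XML)

# With the namespace map, the prefixed path resolves to the two <string> nodes.
with_ns = root.findall(
    "./soap:Body/svc:getCityResponse/svc:getCityResult/svc:string", NAMESPACES
)
cities = [node.text for node in with_ns]

# Without namespaces, the same path matches nothing.
without_ns = root.findall("./Body/getCityResponse/getCityResult/string")
```

In this sketch the prefix used in the path only has to map to the right URI; it does not have to match the prefix used in the XML itself, which is why a default namespace can be addressed through an arbitrary prefix.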
Configuration Item
Description
Select Source Field
The drop-down list of Select Source Field contains all fields from the upstream node.
When the upstream node reads data from a REST API, the source field defaults to default.
Keep All Upstream Output Fields After Parsing
If the checkbox is selected, all upstream output fields and the new fields generated through parsing will be merged and output together.
Specify Namespace
If the XML data contains namespaces, specify them so that the nodes can be read correctly.
The namespace setting area is displayed when you select the checkbox.
You can add and delete namespaces.
Namespace Prefix: Editable. Duplicate prefixes are not allowed. If identical namespace prefixes exist in the XML file, enter the URIs correctly and give the two prefixes different names so that parsing works normally.
Namespace URI: Editable. Duplicate URIs are allowed.
Select XML Node
The first non-empty XML row is used as a template from which all selectable XML paths are parsed.
Click Select XML Node, select the node to be parsed, and click OK, as shown in the following figure.
After you select the XML nodes, they are added to the Output Fields table, which displays Field Name After Parsing and XPath.
Field in the Table
Type
XPath
Editable text box
You can configure the XPath expression for each field generated by parsing.
Parameter references are not allowed.
You can enter the XPath manually.
Field Name After Parsing
You can configure the names of the fields generated by XML parsing.
Parameter references are not allowed.
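The mapping from XPath expressions to output fields can be sketched in Python with xml.etree.ElementTree, which supports a limited XPath subset. The rows, the field names (id, body, customer, total), and the parse_row helper are hypothetical. Returning None for a missing node is a choice made for this sketch and is not a statement about how the operator handles missing nodes.

```python
import xml.etree.ElementTree as ET

# Hypothetical upstream rows: "body" plays the role of the selected source field.
upstream_rows = [
    {"id": 1, "body": "<order><customer>Alice</customer><total>30.5</total></order>"},
    {"id": 2, "body": "<order><customer>Bob</customer></order>"},
]

# Field Name After Parsing -> XPath, like the Output Fields table.
output_fields = {
    "customer": "./customer",
    "total": "./total",
}

def parse_row(row, source_field, fields, keep_upstream=True):
    """Evaluate each XPath against the row's XML and return one output row."""
    root = ET.fromstring(row[source_field])
    # findtext returns None when the path matches no node.
    parsed = {name: root.findtext(xpath) for name, xpath in fields.items()}
    # keep_upstream mimics Keep All Upstream Output Fields After Parsing.
    return {**row, **parsed} if keep_upstream else parsed

parsed_rows = [parse_row(row, "body", output_fields) for row in upstream_rows]
parsed_only = parse_row(upstream_rows[0], "body", output_fields, keep_upstream=False)
```

With keep_upstream enabled, each output row carries the original id and body fields alongside the new customer and total fields; with it disabled, only the parsed fields remain.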
Click Data Preview to view the data after parsing, as shown in the following figure.
Drag a Field-to-Row Splitting node onto the page, and configure the node to split the field into multiple rows of data according to the separator, as shown in the following figure.
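The effect of splitting one field into multiple rows by a separator can be sketched in plain Python. The function name, the sample row, the comma separator, and the city output field are assumptions made for illustration, not the operator's actual configuration.

```python
def split_field_to_rows(rows, field, separator, new_field):
    """Emit one output row per separated value; other fields are copied unchanged."""
    result = []
    for row in rows:
        for part in str(row[field]).split(separator):
            new_row = dict(row)              # copy so the input row is not mutated
            new_row[new_field] = part.strip()
            result.append(new_row)
    return result

# Hypothetical parsed row whose "body" field holds several values in one string.
rows = [{"id": 1, "body": "Beijing,Shanghai,Wuxi"}]
split_rows = split_field_to_rows(rows, "body", ",", "city")
cities = [row["city"] for row in split_rows]
```

Each output row in this sketch still contains the unsplit source field, which is why a later step may be needed to drop it.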
Use Field Setting to delete the unsplit source field body:string, as shown in the following figure.
Use DB Table Output to synchronize the data after processing to the database, as shown in the following figure.
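As a rough analogy for this output step, the sketch below writes processed rows to a table using Python's built-in sqlite3 module. The in-memory database, the city table, and the name column are stand-ins chosen for the example; the actual target database and table are whatever you configure in DB Table Output.

```python
import sqlite3

# Hypothetical rows left after splitting and removing the source field.
rows = [{"city": "Beijing"}, {"city": "Shanghai"}]

# An in-memory SQLite database stands in for the real target database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE IF NOT EXISTS city (name TEXT)")

# Named placeholders bind each dict's "city" value to the name column.
conn.executemany("INSERT INTO city (name) VALUES (:city)", rows)
conn.commit()

stored = [record[0] for record in conn.execute("SELECT name FROM city ORDER BY rowid")]
conn.close()
```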
Click Run to execute the task. After the task runs successfully, the result appears in Log, as shown in the following figure.