XML Parsing Operator - FineDataLink Help Document

Last update: September 09, 2024

Overview

Version

FineDataLink Version	Functional Change
4.0.9	Added the XML Parsing operator, which parses input XML data into row-and-column format.

Application Scenario

You want to parse XML data returned by APIs, WebServices, or OData-based APIs, or XML data from files, into row-and-column format for subsequent processing and storage.

Function Introduction

You can use the XML Parsing operator in the Data Transformation node of FineDataLink to parse XML data into row-and-column format for subsequent processing and storage.

Function Description

The Parsing Configuration page of the XML Parsing operator is shown in the following figure.

Selecting the Source Field

The drop-down list of Select Source Field contains all field names in preceding nodes.

If the upstream node is API Input and the data is not expanded into a two-dimensional table, the source field defaults to Default.

If you tick Keep All Upstream Output Fields After Parsing, the output contains all fields from the upstream node together with the new fields generated by XML parsing.
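The effect of this option can be pictured with a minimal Python sketch. It uses the standard xml.etree.ElementTree module, not FineDataLink itself. The upstream row, the id field, and the sample XML are hypothetical, and the behavior with the option unticked (only the parsed fields are output) is an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical upstream row: "id" is an upstream field, "Default" holds the XML.
upstream_row = {"id": 7, "Default": "<book><title>Learning XML</title></book>"}

root = ET.fromstring(upstream_row["Default"])
parsed_fields = {"title": root.findtext("title")}

# Option ticked: upstream fields are merged with the newly parsed fields.
kept = {**upstream_row, **parsed_fields}

# Option unticked (assumed behavior): only the parsed fields are output.
dropped = dict(parsed_fields)

print(kept)
print(dropped)
```

Note that the upstream id value keeps its integer type, while the parsed title is a string.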

Namespace

If the XML file uses a namespace, specify it so that the nodes can be read correctly.

The namespace list is displayed after you tick Specify Namespace, where you can add and delete namespaces.

Field	Description
Namespace Prefix	It is editable. Duplicate prefixes are not allowed. If the same namespace prefix appears more than once in the XML file, fill in each URI correctly and give the two entries different prefixes so that the data is parsed normally.
Namespace URI	It is editable. Duplicate URIs are allowed.

If the XML file has a default namespace, customize a namespace prefix and fill in the URI of the default namespace so that the data is parsed normally.

For example, if the default namespace http://111111 has no namespace prefix, customize a prefix such as xlms and fill in http://111111 as the namespace URI.
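Why a default namespace needs a custom prefix can be illustrated with a short Python sketch. It uses the standard xml.etree.ElementTree module rather than FineDataLink, and the sample document is made up; only the URI http://111111 and the prefix xlms come from the example above.

```python
import xml.etree.ElementTree as ET

# Illustrative document with a default namespace (no prefix).
xml_text = """<bookstore xmlns="http://111111">
  <book><title>Harry Potter</title></book>
</bookstore>"""

root = ET.fromstring(xml_text)

# Without a prefix mapping the path matches nothing, because every
# element belongs to the default namespace.
unqualified = root.findall("book/title")

# Map a custom prefix to the default namespace URI and use that
# prefix in every step of the path.
namespaces = {"xlms": "http://111111"}
qualified = root.findall("xlms:book/xlms:title", namespaces)

titles = [t.text for t in qualified]
print(len(unqualified), titles)
```

The prefix itself is arbitrary; what matters is that it maps to the same URI that the XML file declares.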

Parsing XML Data

Selecting the XML Node

Click the Select XML Node button and select the XML node in the pop-up node selection box.

Example	Multiple Selection Tree Content
	Leaf node: a node that has no child nodes. The fields in yellow are leaf nodes; all others are non-leaf nodes, which cannot be selected. When two selected nodes have the same name but different parent nodes, the name of one output field is suffixed with 1. For example, if you select the title node in both the /bookstore/store path and the /bookstore/book path in the above figure, the output fields after parsing are named title and title1, and the XPath of each field is the valid path of its corresponding node.
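The naming rule for same-named nodes can be sketched in Python as follows. This is a hypothetical illustration, not FineDataLink's implementation: the function name assign_field_names is invented, and the sketch assumes that the later-selected node receives the suffix and that the number increases for further duplicates, neither of which is stated here.

```python
def assign_field_names(xpaths):
    """Derive an output field name from the last step of each XPath.

    Assumption: when a name is already taken, a numeric suffix (1, 2, ...)
    is appended to the field name of the later-selected node.
    """
    used = set()
    fields = []
    for xpath in xpaths:
        base = xpath.rstrip("/").split("/")[-1]
        name = base
        suffix = 0
        while name in used:
            suffix += 1
            name = f"{base}{suffix}"
        used.add(name)
        fields.append((name, xpath))
    return fields


selected = ["/bookstore/store/title", "/bookstore/book/title"]
fields = assign_field_names(selected)
print(fields)
```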

Outputting the Field

You can add and delete output fields.

All fields generated after XML parsing are of the string type. (The type of fields passed from the upstream node remains unchanged.)

Field	Description
Field Name After Parsing	It is editable. You can configure the name of fields generated after XML parsing. Note: 1. Duplicate field names are not allowed. 2. Referencing parameters is not allowed.
XPath	It is editable. It is the XPath expression of the output field. Referencing parameters is not allowed. Setting XPath manually is allowed.
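The relationship between output field names, XPath expressions, and string-typed results can be sketched in Python with the standard xml.etree.ElementTree module. The field names book_title and book_price are hypothetical, the paths are written relative to each book element for simplicity, and producing one row per book element is an assumption about how rows are formed.

```python
import xml.etree.ElementTree as ET

xml_text = """<bookstore>
  <book><title lang="eng">Harry Potter</title><price>29.99</price></book>
  <book><title lang="eng">Learning XML</title><price>39.95</price></book>
</bookstore>"""

# Hypothetical configuration: output field name -> XPath
# (relative to each book element in this sketch).
output_fields = {"book_title": "title", "book_price": "price"}

root = ET.fromstring(xml_text)
rows = [
    {name: book.findtext(path) for name, path in output_fields.items()}
    for book in root.findall("book")
]

# Every parsed value is a string; convert downstream if a number is needed.
prices = [float(row["book_price"]) for row in rows]
print(rows)
print(prices)
```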

You can enter two kinds of XPath expressions: node set and predicate.

The following is an example of an XML file:

<?xml version="1.0" encoding="ISO-8859-1"?>

<bookstore>

<book>
  <title lang="eng">Harry Potter</title>
  <price>29.99</price>
</book>

<book>
  <title lang="eng">Learning XML</title>
  <price>39.95</price>
</book>

</bookstore>

Node Set

Nodes in the XML file are selected based on path expressions.

Some path expressions and results are shown in the following table.

Path Expression	Result
bookstore	All bookstore elements that are children of the current node are selected.
/bookstore	The root element bookstore is selected. Note: A path beginning with a forward slash (/) always represents an absolute path of the element.
bookstore/book	All book elements that are direct children of the bookstore element are selected.
//book	All book elements are selected, regardless of their locations in the file.
bookstore//book	All book elements that are descendants of the bookstore element are selected, regardless of how deeply they are nested.
//@lang	All attributes named lang are selected.
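To try out path expressions of this kind outside FineDataLink, a minimal Python sketch with the standard xml.etree.ElementTree module is shown below. ElementTree supports only a subset of XPath: paths are evaluated relative to an element, so //book is written as .//book, and attribute selection such as //@lang is not supported, so the sketch reads the attribute from the matching elements instead.

```python
import xml.etree.ElementTree as ET

xml_text = """<bookstore>
<book>
  <title lang="eng">Harry Potter</title>
  <price>29.99</price>
</book>
<book>
  <title lang="eng">Learning XML</title>
  <price>39.95</price>
</book>
</bookstore>"""

root = ET.fromstring(xml_text)  # root is the bookstore element

# bookstore/book: book elements that are direct children of bookstore.
child_books = root.findall("book")

# //book: book elements at any depth (written .//book in ElementTree).
all_books = root.findall(".//book")

# //@lang: collect the lang attribute from every element that has one.
langs = [elem.get("lang") for elem in root.findall(".//*[@lang]")]

print(len(child_books), len(all_books), langs)
```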

Predicate

Predicates are used to look for a specific node or a node that contains a specified value.

Predicates are enclosed in square brackets ([ ]).

Some path expressions with predicates and results are shown in the following table.

Path Expression	Result
/bookstore/book[1]	The first book element under the bookstore element is selected.
/bookstore/book[last()]	The last book element under the bookstore element is selected.
/bookstore/book[last()-1]	The penultimate book element under the bookstore element is selected.
/bookstore/book[position()<3]	The first two book elements under the bookstore element are selected.
//title[@lang]	All title elements with the lang attribute are selected.
//title[@lang='eng']	All title elements that have a lang attribute with a value of eng are selected.
/bookstore/book[price>35.00]	All book elements under the bookstore element whose price element has a value greater than 35.00 are selected.
/bookstore/book[price>35.00]/title	The title elements of all book elements under the bookstore element whose price element has a value greater than 35.00 are selected.
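Some of these predicates can be tried with Python's standard xml.etree.ElementTree module, as in the sketch below. ElementTree supports positional and attribute predicates but not numeric comparisons such as price>35.00, so that filter is written in plain Python here; a full XPath 1.0 engine would evaluate the predicate directly.

```python
import xml.etree.ElementTree as ET

xml_text = """<bookstore>
<book>
  <title lang="eng">Harry Potter</title>
  <price>29.99</price>
</book>
<book>
  <title lang="eng">Learning XML</title>
  <price>39.95</price>
</book>
</bookstore>"""

root = ET.fromstring(xml_text)  # root is the bookstore element

# /bookstore/book[1] and /bookstore/book[last()]
first_title = root.find("book[1]/title").text
last_title = root.find("book[last()]/title").text

# //title[@lang='eng']
eng_titles = [t.text for t in root.findall(".//title[@lang='eng']")]

# /bookstore/book[price>35.00]/title, with the comparison done in Python.
expensive_titles = [
    book.findtext("title")
    for book in root.findall("book")
    if float(book.findtext("price")) > 35.00
]

print(first_title, last_title, eng_titles, expensive_titles)
```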

Special Scenario Handling Strategy

Scenario	Result or Handling Strategy
The source XML data contains multiple root elements, as shown in the following figure.	When you click Select XML Node, an error message appears: XML data root node is missing. When you preview and run the manually set XPath, an error message appears: XML data root node is missing.
The configured XPath is incorrect.	The field content after parsing is empty.
The configured XPath is invalid or the namespace prefix contains characters other than English letters.	A parsing exception occurs.
The namespace prefixes are repeated.	Namespace prefixes cannot be repeated. In this example, rename one of the s prefixes and fill in the corresponding URIs. The data is parsed normally after you select nodes from the node tree. If you fill in the paths manually, set them according to the new namespace prefix.
The source XML data is incomplete, as shown in the following figure.	When you click Select XML Node, an error message appears: XML data format is incomplete. When you preview and run the manually set XPath, an error message appears: XML data format is incomplete.
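Three of these scenarios can be reproduced with a generic XML parser, as in the Python sketch below. It uses the standard xml.etree.ElementTree module, so the error messages differ from the FineDataLink messages quoted in the table. The helper name try_parse and the sample strings are hypothetical.

```python
import xml.etree.ElementTree as ET


def try_parse(xml_text):
    """Return (root, None) on success or (None, error message) on failure."""
    try:
        return ET.fromstring(xml_text), None
    except ET.ParseError as exc:
        return None, str(exc)


# Multiple root elements: the data is not well-formed XML.
_, multi_root_error = try_parse("<a/><b/>")

# Incomplete data: the closing tags are missing.
_, incomplete_error = try_parse("<bookstore><book>")

# Well-formed data, but the path matches nothing: empty result, no error.
root, ok_error = try_parse("<bookstore><book/></bookstore>")
no_match = root.findall("magazine")

print(multi_root_error)
print(incomplete_error)
print(no_match)
```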

Example

For details about using the XML Parsing operator, see Example of XML Parsing.
