XML Parsing Operator

  • Last update: September 09, 2024
  • Overview

    Version

    FineDataLink Version | Functional Change
    4.0.9 | Added the XML Parsing operator, which parses input XML data into row-and-column format.

    Application Scenario

    You want to parse XML data returned by APIs, WebServices, or OData-based APIs, or XML data read from files, into row-and-column format for subsequent processing and storage.

    Function Introduction

    You can use the XML Parsing operator in the Data Transformation node of FineDataLink to parse XML data into row-and-column format for subsequent processing and storage.

    Function Description

    The Parsing Configuration page of the XML Parsing operator is shown in the following figure.

    Selecting the Source Field

    The drop-down list of Select Source Field contains all field names in preceding nodes.

    If the upstream node is API Input and the data is not expanded into a two-dimensional table, the source field defaults to Default.

    If you select Keep All Upstream Output Fields After Parsing, all fields output by the upstream node are output together with the new fields generated by XML parsing.

    Namespace

    If the XML file uses namespaces, specify them so that the nodes can be read correctly.

    After you select Specify Namespace, the namespace list is displayed, where you can add and delete namespaces.

    Field | Description
    Namespace Prefix | Editable. Duplicate prefixes are not allowed. If the same namespace prefix appears more than once in the XML file, enter each URI correctly and give the two prefixes different names so that the data can be parsed.
    Namespace URI | Editable. Duplicate URIs are allowed. If the XML file has a default namespace, define a custom namespace prefix and enter the URI of the default namespace so that the data can be parsed. For example, if the default namespace http://111111 is declared without a prefix, define a custom prefix such as xlms and enter http://111111 as the namespace URI.
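The default-namespace rule above can be illustrated outside FineDataLink. The following is a minimal sketch using Python's standard xml.etree.ElementTree module, not the operator's own implementation; the sample document and variable names are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document that declares a default namespace (no prefix).
XML_WITH_DEFAULT_NS = """
<bookstore xmlns="http://111111">
  <book><title>Learning XML</title></book>
</bookstore>
"""

root = ET.fromstring(XML_WITH_DEFAULT_NS)

# Without a prefix mapping, the unqualified path matches nothing,
# because every element belongs to the namespace http://111111.
titles_without_ns = root.findall("book/title")

# Map a custom prefix (here "xlms") to the default namespace URI
# and use that prefix in every step of the path.
namespaces = {"xlms": "http://111111"}
titles_with_ns = [
    title.text for title in root.findall("xlms:book/xlms:title", namespaces)
]

print(titles_without_ns)  # []
print(titles_with_ns)     # ['Learning XML']
```

The prefix name itself is arbitrary; only the URI has to match the one declared in the XML file.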

    Parsing XML Data

    Selecting the XML Node

    Click the Select XML Node button and select the XML node in the pop-up node selection box.

    Example

    Multiple Selection Tree Content

    [Figure: XML node tree with leaf nodes highlighted in yellow]

    Leaf node: a node that has no child nodes

    The fields in yellow are leaf nodes. Others are non-leaf nodes.

    [Figure: XML node tree of the bookstore example]

    Non-leaf nodes cannot be selected.

    When two nodes with the same name and different parent nodes are selected, the name of one output field is suffixed with 1.

    For example, if you select the title node in both the /bookstore/store path and the /bookstore/book path in the above figure, the output fields after parsing are named title and title1, and the XPath of each field is the path of the corresponding node.
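The renaming rule can be sketched as follows. This is an illustrative Python function, not FineDataLink's actual logic: the function name is hypothetical, and the choices that the later-selected node receives the suffix and that further duplicates would receive 2, 3, and so on are assumptions that the text above does not confirm.

```python
def assign_field_names(xpaths):
    """Derive an output field name from the last step of each XPath,
    appending a numeric suffix when the name is already taken."""
    used = set()
    fields = []
    for xpath in xpaths:
        base = xpath.rstrip("/").split("/")[-1]
        name, counter = base, 0
        while name in used:
            counter += 1
            name = f"{base}{counter}"
        used.add(name)
        fields.append((name, xpath))
    return fields


fields = assign_field_names(["/bookstore/store/title", "/bookstore/book/title"])
print(fields)
# [('title', '/bookstore/store/title'), ('title1', '/bookstore/book/title')]
```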

    Outputting the Field

    You can add and delete output fields.

    All fields generated after XML parsing are of the string type. (The type of fields passed from the upstream node remains unchanged.)
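To make the typing rule concrete, here is a minimal Python sketch that parses a small bookstore document into rows. It uses the standard xml.etree.ElementTree module rather than FineDataLink itself; the upstream row, the field mapping, and the way each book element becomes one output row are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

SAMPLE_XML = """
<bookstore>
  <book>
    <title lang="eng">Harry Potter</title>
    <price>29.99</price>
  </book>
  <book>
    <title lang="eng">Learning XML</title>
    <price>39.95</price>
  </book>
</bookstore>
"""

# Hypothetical upstream row: an integer field plus the XML source field.
upstream_row = {"id": 1, "payload": SAMPLE_XML}

# Output field name -> path relative to each <book> element (illustrative).
output_fields = {"title": "title", "price": "price"}

root = ET.fromstring(upstream_row["payload"])
rows = []
for book in root.findall("book"):
    row = {"id": upstream_row["id"]}  # upstream field keeps its int type
    for name, path in output_fields.items():
        row[name] = book.findtext(path)  # parsed values are plain strings
    rows.append(row)

for row in rows:
    print(row)
# {'id': 1, 'title': 'Harry Potter', 'price': '29.99'}
# {'id': 1, 'title': 'Learning XML', 'price': '39.95'}
```

Note that price comes out as the string '29.99', not a number, so any numeric processing downstream needs an explicit type conversion.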

    Field | Description
    Field Name After Parsing | Editable. Sets the name of a field generated by XML parsing. Note: (1) Duplicate field names are not allowed. (2) Parameters cannot be referenced.
    XPath | Editable. Sets the XPath expression of the output field. Parameters cannot be referenced. The XPath can be entered manually. Two kinds of XPath expressions are supported: node sets and predicates.

    The following is an example of an XML file:

    <?xml version="1.0" encoding="ISO-8859-1"?>

    <bookstore>

    <book>
      <title lang="eng">Harry Potter</title>
      <price>29.99</price>
    </book>

    <book>
      <title lang="eng">Learning XML</title>
      <price>39.95</price>
    </book>

    </bookstore>

    Node Set

    Nodes in the XML file are selected based on path expressions.

    Some path expressions and results are shown in the following table.

    Path Expression | Result
    bookstore | All bookstore elements that are children of the current context node are selected.
    /bookstore | The root element bookstore is selected. Note: A path beginning with a forward slash (/) always represents an absolute path to an element.
    bookstore/book | All book elements that are children of the bookstore element are selected.
    //book | All book elements are selected, regardless of their location in the file.
    bookstore//book | All book elements that are descendants of the bookstore element are selected, regardless of their location under bookstore.
    //@lang | All attributes named lang are selected.
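The node-set expressions above can be tried on the sample file with Python's standard xml.etree.ElementTree module. This is only a sketch: ElementTree supports a limited subset of XPath, evaluates paths relative to the element it is called on, and cannot return attribute nodes, so each expression from the table is approximated by the closest supported form.

```python
import xml.etree.ElementTree as ET

SAMPLE_XML = """
<bookstore>
  <book>
    <title lang="eng">Harry Potter</title>
    <price>29.99</price>
  </book>
  <book>
    <title lang="eng">Learning XML</title>
    <price>39.95</price>
  </book>
</bookstore>
"""

root = ET.fromstring(SAMPLE_XML)  # root is the <bookstore> element

# /bookstore: the root element itself.
root_tag = root.tag

# bookstore/book: book elements that are children of bookstore.
child_books = root.findall("book")

# //book and bookstore//book: book elements at any depth below the root.
all_books = root.findall(".//book")

# //@lang: ElementTree cannot select attributes with a path,
# so collect them while iterating over every element.
langs = [el.get("lang") for el in root.iter() if "lang" in el.attrib]

print(root_tag)          # bookstore
print(len(child_books))  # 2
print(len(all_books))    # 2
print(langs)             # ['eng', 'eng']
```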

    Predicate

    Predicates are used to look for a specific node or a node that contains a specified value.

    Predicates are enclosed in square brackets ([ ]).

    Some path expressions with predicates and results are shown in the following table.

    Path Expression | Result
    /bookstore/book[1] | The first book element under the bookstore element is selected.
    /bookstore/book[last()] | The last book element under the bookstore element is selected.
    /bookstore/book[last()-1] | The second-to-last book element under the bookstore element is selected.
    /bookstore/book[position()<3] | The first two book elements under the bookstore element are selected.
    //title[@lang] | All title elements that have a lang attribute are selected.
    //title[@lang='eng'] | All title elements whose lang attribute has the value eng are selected.
    /bookstore/book[price>35.00] | All book elements under the bookstore element whose price element has a value greater than 35.00 are selected.
    /bookstore/book[price>35.00]/title | The title elements of all book elements under the bookstore element whose price element has a value greater than 35.00 are selected.
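The predicate expressions above can be checked against the sample file in the same way. The sketch below uses Python's standard xml.etree.ElementTree module, which supports positional and attribute predicates but not position() or numeric comparisons; those two cases are therefore emulated in plain Python, which is a substitution and not how a full XPath engine evaluates them. The helper name titles is hypothetical.

```python
import xml.etree.ElementTree as ET

SAMPLE_XML = """
<bookstore>
  <book>
    <title lang="eng">Harry Potter</title>
    <price>29.99</price>
  </book>
  <book>
    <title lang="eng">Learning XML</title>
    <price>39.95</price>
  </book>
</bookstore>
"""

root = ET.fromstring(SAMPLE_XML)  # root is the <bookstore> element


def titles(path):
    """Return the title text of every book matched by a path relative to bookstore."""
    return [book.findtext("title") for book in root.findall(path)]


first = titles("book[1]")               # /bookstore/book[1]
last = titles("book[last()]")           # /bookstore/book[last()]
before_last = titles("book[last()-1]")  # /bookstore/book[last()-1]
eng_titles = [t.text for t in root.findall(".//title[@lang='eng']")]

# Emulated in Python because ElementTree lacks these predicates.
books = root.findall("book")
first_two = [b.findtext("title") for b in books[:2]]  # book[position()<3]
expensive = [
    b.findtext("title") for b in books
    if float(b.findtext("price")) > 35.00             # book[price>35.00]/title
]

print(first)        # ['Harry Potter']
print(last)         # ['Learning XML']
print(before_last)  # ['Harry Potter']
print(eng_titles)   # ['Harry Potter', 'Learning XML']
print(first_two)    # ['Harry Potter', 'Learning XML']
print(expensive)    # ['Learning XML']
```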

    Special Scenario Handling Strategy

    Scenario | Result or Handling Strategy
    The source XML data contains multiple root elements. [Figure: XML data with multiple root elements] | When you click Select XML Node, an error message appears: XML data root node is missing. When you preview or run with a manually set XPath, the same error message appears: XML data root node is missing.
    The configured XPath is incorrect. | The field content after parsing is empty.
    The configured XPath is invalid, or the namespace prefix contains characters other than English letters. | A parsing exception occurs.
    The namespace prefixes are repeated. [Figure: XML data with repeated namespace prefixes] | Namespace prefixes cannot be repeated. In this example, rename one of the s prefixes and fill in the corresponding URIs. The data is parsed normally after you select nodes from the node tree. If the paths are filled in manually, set each path according to the new namespace prefix.
    The source XML data is incomplete. [Figure: incomplete XML data] | When you click Select XML Node, an error message appears: XML data format is incomplete. When you preview or run with a manually set XPath, the same error message appears: XML data format is incomplete.
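Source data can be checked for the multiple-root and incomplete-data cases before it reaches the operator. The following is a minimal sketch using Python's standard xml.etree.ElementTree module; the function name and sample strings are hypothetical, and the parser's error text differs from the FineDataLink messages quoted above.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return 'ok' if the text parses as one XML document, else the parser error."""
    try:
        ET.fromstring(xml_text)
        return "ok"
    except ET.ParseError as exc:
        return f"parse error: {exc}"


# Two sibling root elements: not a single well-formed document.
multiple_roots = "<book><title>A</title></book><book><title>B</title></book>"

# Truncated data: the closing tags are missing.
incomplete = "<bookstore><book><title>A</title>"

# A complete document with exactly one root element.
valid = "<bookstore><book><title>A</title></book></bookstore>"

for sample in (multiple_roots, incomplete, valid):
    print(check_well_formed(sample))
```

The first two samples report a parse error and the third reports ok. Wrapping several root elements in a single parent element is one common way to make such data parseable.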

    Example

    For details about using the XML Parsing operator, see Example of XML Parsing.
