Truncating the Content After Specified Characters Appearing Multiple Times

  • Last update:  2022-05-10
    1. Overview

    1.1 Expected effect

    Users sometimes need to extract the text at a particular position relative to a delimiter character that occurs several times in a field.

    For example, the values have different lengths and the user needs the characters after the last "_", as shown in the following figure:

    20.png

    Or the user needs the value in the "B" position of a field formatted as "A|B|C". For example, the user needs to extract the academic qualification from every row, as shown in the following figure:

    21.png

    1.2 Implementation ideas 

    Combine functions such as SPLIT, LEN, INDEXOF, SUBSTITUTE, FIND, MID, and RIGHT to extract the required part of the field.

    2. Extract the characters after the last specified character

    Sample data: intercept the content after multiple occurrences of characters.xlsx

    Upload sample data to FineBI, as shown in the figure below:

    22.png

    2.1 Create a self-service dataset

    Create a self-service dataset, select the uploaded Excel dataset, and check the sample fields, as shown in the figure below:

    23.png

    2.2 Extract specified characters

    Click "+" to add "New column", as shown in the figure below:

    24.png

    2.2.1 Method One

    Name the column "Intercept the characters after the last _", enter the formula INDEXOF(SPLIT(Field,"_"),LEN(SPLIT(Field,"_"))-1), and click "OK", as shown in the following figure:

    25.png

    Note: Select the functions and fields by clicking them in the selection area to the left of the formula box; they cannot be typed in manually.

    Formula description:

    Formula: SPLIT(Field,"_")
    Description: Splits the field into an array at each "_".
    Result: "a_b_c" is split into a, b, c.

    Formula: LEN(SPLIT(Field,"_"))-1
    Description: Counts the elements of the split array and subtracts 1, which gives the index of the last element (indexes start at 0).
    Result: For "a_b_c", the result is 2.

    Formula: INDEXOF(SPLIT(Field,"_"),LEN(SPLIT(Field,"_"))-1)
    Description: Returns the last element of the array produced by SPLIT, that is, the characters after the last "_".
    Result: For "a_b_c", the result is c.
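The logic of Method One can be checked outside FineBI with a small Python analogy. This is only a sketch: `last_segment` is a hypothetical helper name, and Python's `str.split` treats the separator as literal text, which is enough for "_" but is not a statement about how FineBI evaluates SPLIT internally.

```python
def last_segment(value: str, sep: str = "_") -> str:
    """Return the text after the last occurrence of sep."""
    parts = value.split(sep)      # plays the role of SPLIT(Field,"_")
    last_index = len(parts) - 1   # plays the role of LEN(SPLIT(Field,"_"))-1
    return parts[last_index]      # plays the role of INDEXOF(array, index)


print(last_segment("a_b_c"))  # c
```

A value with no "_" at all yields a one-element array, so the sketch returns the whole value unchanged.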

    2.2.2 Method Two

    Add a new column that counts the segments produced by splitting the field at "_". Enter the formula LEN(SPLIT(Field,"_")). For example, the result for "a_b_c" is 3, as shown in the following figure:

    26.png

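    Add a new column that replaces the last "_" with "-". Enter the formula SUBSTITUTE(Field,"_","-",LEN(SPLIT(Field,"_"))-1). The fourth argument selects which occurrence to replace; a field with n segments contains n-1 underscores, so the segment count minus 1 points at the last one, as shown in the figure below: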

    27.png

    Add a new column that finds the position of "-", searching from the first character. Enter the formula FIND("-",SUBSTITUTE(Field,"_","-",LEN(SPLIT(Field,"_"))-1),1), as shown in the figure below:

    28.png

    Add a new column that calculates the number of characters after the last "_". Enter the formula LEN(Field)-FIND("-",SUBSTITUTE(Field,"_","-",LEN(SPLIT(Field,"_"))-1),1), as shown in the figure below:

    29.png

    Add a new column that extracts the characters after the last "_" in the field. Enter the formula RIGHT(Field,LEN(Field)-FIND("-",SUBSTITUTE(Field,"_","-",LEN(SPLIT(Field,"_"))-1),1)) and click "OK", as shown in the figure below:

    30.png
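The five steps of Method Two can be traced in Python to see how the intermediate values fit together. This is a rough analogy under stated assumptions, not FineBI code: `substitute_nth` and `after_last_underscore` are hypothetical helpers, with `substitute_nth` imitating a four-argument SUBSTITUTE that replaces only the n-th occurrence.

```python
def substitute_nth(text: str, old: str, new: str, n: int) -> str:
    """Replace only the n-th (1-based) occurrence of old with new."""
    if n < 1:
        return text
    pos = -1
    for _ in range(n):
        pos = text.find(old, pos + 1)
        if pos == -1:
            return text  # fewer than n occurrences: leave the text unchanged
    return text[:pos] + new + text[pos + len(old):]


def after_last_underscore(field: str) -> str:
    count = len(field.split("_"))                        # step 1: segment count
    marked = substitute_nth(field, "_", "-", count - 1)  # step 2: mark the last "_"
    position = marked.find("-") + 1                      # step 3: 1-based position of "-"
    tail_length = len(field) - position                  # step 4: characters after it
    return field[len(field) - tail_length:]              # step 5: rightmost tail_length chars


print(after_last_underscore("a_b_c"))  # c
```

The trace also exposes one limitation of the approach: if a value already contains "-" before the last "_", step 3 finds that earlier "-" instead of the substituted one. For "a-b_c" the sketch returns "b_c" rather than "c", so a marker character that never occurs in the data is the safer choice.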

    2.3 Effect display

    See section 1.1 of this article for details.

    3. Extract the value of column B in the A|B|C field

    Note: The values in the B position vary in length, so they cannot be extracted at a fixed character position.

    Sample data: Recruitment information.xlsx

    Upload sample data to FineBI, as shown in the figure below:

    31.png

    3.1 Create a self-service dataset

    Create a self-service dataset, select the uploaded Excel dataset, and check the sample fields, as shown in the figure below:

    32.png

    3.2 Extract specified characters

    Click "+" to add "New column", as shown in the figure below:

    33.png

    Name the column, enter the formula INDEXOF(SPLIT(Region Qualification Number,"\\|"),1), and click "OK", as shown in the figure below:

    34.png

    Note: Select the functions and fields by clicking them in the selection area to the left of the formula box; they cannot be typed in manually.

    Formula description:

    Formula: SPLIT(Region Qualification Number,"\\|")
    Description: Splits the Region Qualification Number field at each "|" character. For example, "Shenzhen|Bachelor|Recruit 5 people" is split into Shenzhen, Bachelor, Recruit 5 people.

    Formula: INDEXOF(SPLIT(Region Qualification Number,"\\|"),1)
    Description: Returns the element at index 1 of the split array, that is, the second element (indexes start at 0). For example, Shenzhen, Bachelor, Recruit 5 people returns "Bachelor".


    Note: SPLIT interprets its separator as a regular expression, and "|" is a special character in regular expressions. If it is not escaped, the field is not split at the literal "|", so the separator must be written as SPLIT(string,"\\|").
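The effect of escaping can be illustrated with Python's `re` module. This is a sketch of general regular-expression behavior, assuming Python 3.7 or later; FineBI's own engine may differ in detail, and the sample row is taken from the example above. Python's raw string `r"\|"` plays the role of "\\|" in the formula.

```python
import re

row = "Shenzhen|Bachelor|Recruit 5 people"

# Escaped: the pattern matches a literal pipe, so the row splits into three parts.
escaped = re.split(r"\|", row)
print(escaped[1])  # Bachelor

# Unescaped: "|" is read as an alternation between two empty patterns,
# so the split no longer happens at the pipe characters.
unescaped = re.split("|", row)
print(unescaped == escaped)  # False
```

When the separator comes from data rather than being typed by hand, `re.escape("|")` produces the escaped pattern in Python without having to remember which characters are special.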

    3.3 Effect display

    See section 1.1 of this article for details.

    For more details on extracting field content, see: Field Classification.

    Topic: Advanced Data Analysis