Spark SQL String Function

  • Last update: February 13, 2025
  • Overview

    This document introduces the syntax of the string functions in Spark SQL.

    String Character Count

    You are advised to use LEN in New Calculation Column of FineDatalink.

    • CHAR_LENGTH(String): Returns the number of characters in the string.

    • CHARACTER_LENGTH(String): Returns the number of characters in the string.

    • LENGTH(String): Returns the number of characters in the string.

    For example, to obtain the number of characters of Contract Type, you can use the statement shown in the following figure.

    select Contract Type ,CHAR_LENGTH(Contract Type) FROM DB Table Input

    1.png


    String Concatenation

    Description

    You are advised to use CONCATENATE in New Calculation Column of FineDataLink.

    You can also use the syntax in Spark SQL.

    You can also use the CONCAT(s1,s2...sn) clause in Spark SQL to concatenate multiple strings into one string.

    For example,

    SELECT CONCAT ('Fine',' Data','Link').

    2.png

    Notes

    Problem Description:

    The 13-digit timestamp you obtain using the New Calculation Column operator will be displayed in scientific notation in the CONCAT function in the Spark SQL operator. For example, the statement md5(concat(1700794824509), 'fixed_key') will changes to md5(concat(1700794824509E12), 'fixed_key') during actual execution, changing the MD5 encryption result.

    Solution:

    Before concatenation, use the Field Setting operator to convert the timestamp field obtained through the New Calculation Column operator to the long type.

    String Reversal

    REVERSE(String): Returns the string with the order of the characters reversed.

    For example, to reverse the Contract Type string, you can use the statement shown in the following figure. 

    select Contract Type ,reverse(Contract Type) FROM DB Table Input1

    3.png

    String Character Extraction

    You are advised to use RIGHT, MID, and LEFT in New Calculation Column of FineDataLink.

    You can also use the syntax in Spark SQL.

    Name

    Usage in the Spark SQL Operator

    LEFT

    LEFT(Expression, Length)

    Example: SELECT LEFT('FineDataLink',4)

    For example, to extract the leftmost two characters from the Contract Type string, you can use the following statement, as shown in the following figure.

    select Contract Type ,LEFT(Contract Type,2) FROM DB Table Input2

    4.png


    RIGHT

    RIGHT(Expression, Length)

    Example: SELECT RIGHT('FineDataLink',8)

    For example, to extract the rightmost two characters of the Contract Type string, you can use the following statement, as shown in the following figure. 

    select Contract Type,RIGHT(Contract Type, 2 ) FROM DB Table Input3

    5.png

    Substring

    SUBSTR(Expression, Position, Length)

    Example: SELECT SUBSTR('FineDataLink',5,4)

    For example, to extract two characters starting from the third character in the Contract Type string, you can use the following statement, as shown in the following figure. 

    select Contract Type,SUBSTR(Contract Type,3,2) FROM DB Table Input4

    6.png

    String Character Case Conversion

    You are advised to use UPPER and LOWER in New Calculation Column of FineDataLink.

    You can also use the syntax in Spark SQL.

    Name

    Usage in the Spark SQL Operator

    Lowercase to uppercase

    UPPER(Expression)

    Example: SELECT UPPER('FineDataLink')

    Uppercase to lowercase

    LOWER(Expression)

    Example: SELECT LOWER('FineDataLink')

    Repeating a String

    You are advised to use REPEAT in New Calculation Column of FineDataLink.

    You can also use the syntax in Spark SQL.

    REPEAT (String, Number): Repeats a string the specified number of times.

    Example: select repeat('Small',2)

    Padding a String

    LPAD(String1, Length, String2): Returns the String1 value left-padded with the String2 value to a length of Length characters.

    Example:

    select Contract Payment Type,Contract Type,lpad(Contract Type,35,Contract Payment Type) FROM DB Table Input5

    Left-pad the Contract Type value with the Contract Payment Type value to a length of 35 characters. The returned result set is shown in the following figure.

    7.png


    RPAD(String1, Length, String2): Returns the String1 value right-padded with the String2 value to a length of Length characters.

    Example:

    select Contract Payment Type,Contract Type,rpad(Contract Type,35,Contract Payment Type) FROM DB Table Input6

    Right-pad the Contract Type value with the Contract Payment Type value to a length of 10 characters. The returned result set is shown in the following figure.

    8.png

    Finding a String Within a String List

    FIND_IN_SET(String, String list): Returns a number representing the position of the first occurrence of the string in the string list, where the string list is a string composed of substrings separated by comma (,). Returns 0 if the string is not in the string list.

    Example: 

    select find_in_set('a small b','cd, ef, a small b, de')

    You are advised to use FIND in New Calculation Column of FineDatalink.

    String Position

    POSITION(String1 IN String2): Returns a number representing the position of the first occurrence of the String1 value in the String value. Returns 0 if the String1 value is not in the String value.

    Example: 

    select POSITION('b' IN 'abcb')

    You are advised to use FIND in New Calculation Column of FineDatalink.

    Substring Replacement

    REGEXP_REPLACE(String1, String2, String3): Replaces occurrences in the String1 value that match the regular expression specified by String2 with the String3 value and returns the resulting string.

    Example:

    select regexp_replace("Purchase", "Pu|ch", "tt")

    The substrings Pu and ch in the string Purchase will be replaced with tt, as shown in the following figure.

    9.png

     


    附件列表


    主题: Data Development - Scheduled Task
    Previous
    Next
    • Helpful
    • Not helpful
    • Only read

    滑鼠選中內容,快速回饋問題

    滑鼠選中存在疑惑的內容,即可快速回饋問題,我們將會跟進處理。

    不再提示

    10s後關閉

    Get
    Help
    Online Support
    Professional technical support is provided to quickly help you solve problems.
    Online support is available from 9:00-12:00 and 13:30-17:30 on weekdays.
    Page Feedback
    You can provide suggestions and feedback for the current web page.
    Pre-Sales Consultation
    Business Consultation
    Business: international@fanruan.com
    Support: support@fanruan.com
    Page Feedback
    *Problem Type
    Cannot be empty
    Problem Description
    0/1000
    Cannot be empty

    Submitted successfully

    Network busy