Extracted Data
Usage 1
Overview
Grammar | REGEXP(str,pattern) | Determines whether a character string (specified by str) matches a regular expression (specified by pattern). |
Parameter 1 | str | Character string 1 |
Parameter 2 | pattern | Regular expression |
Notes
To use the character "\" in a pattern, escape it with another "\" in the formula. For example, the formula REGEXP(string,"\d") is invalid and must be written as REGEXP(string,"\\d"), as shown in the following figure.
The function supports two text-type parameters.
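The doubling rule has a close analogue in ordinary programming languages. As a rough illustration only (Python is a stand-in here, not the formula engine), a normal string literal also needs two backslashes to produce the single backslash that the regular expression engine sees:

```python
import re

# In a normal (non-raw) Python string literal, "\\d" denotes the two
# characters \ and d, much as the formula needs "\\d" to mean \d.
pattern = "\\d"

print(len(pattern))                            # 2
print(pattern)                                 # \d
print(re.fullmatch(pattern, "7") is not None)  # True
```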
Example
Formula | Result | Notes |
---|---|---|
REGEXP("aaaaac","a*c") | 1 | |
REGEXP("abc","a*c") | 0 | |
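The two results above are consistent with the pattern having to match the entire string rather than only part of it. That reading is an inference from the examples, not something the text states. A minimal Python sketch under that assumption, where regexp is a hypothetical helper and not the product's implementation:

```python
import re

def regexp(text: str, pattern: str) -> int:
    """Return 1 if the whole of text matches pattern, else 0 (assumed semantics)."""
    return 1 if re.fullmatch(pattern, text) else 0

print(regexp("aaaaac", "a*c"))  # 1: zero or more "a", then "c"
print(regexp("abc", "a*c"))     # 0: the "b" is not allowed by the pattern
```

With a partial (search-style) match, the second call would succeed on the trailing "c", which is why the whole-string assumption fits the documented result of 0.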
The following example uses \d and \w in this function during data editing. In regular expressions, \d matches a digit, and \w matches an alphanumeric character, an underscore, or a Chinese character, as shown in the following figure.
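As a rough illustration of these two character classes, the sketch below uses Python's re module, whose \d and \w also cover digits and word characters (including Chinese characters) for text patterns. Python stands in for the formula engine here, the helper name matches is hypothetical, and the sample strings are made up:

```python
import re

def matches(text: str, pattern: str) -> int:
    """Hypothetical helper: 1 if the whole text matches the pattern, else 0."""
    return 1 if re.fullmatch(pattern, text) else 0

print(matches("12345", r"\d+"))        # 1: digits only
print(matches("12a45", r"\d+"))        # 0: "a" is not a digit
print(matches("user_01中文", r"\w+"))   # 1: letters, digits, underscore, Chinese
print(matches("user-01", r"\w+"))      # 0: "-" is not a word character
```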
For more examples of regular expressions, see Regular Expression Example.
Usage 2
Overview
Grammar | REGEXP(str,pattern,intNumber) | Determines whether a character string (specified by str) matches the regular expression (specified by pattern) with the given mode (specified by intNumber). |
Parameter 1 | str | Character string 1 |
Parameter 2 | pattern | Regular expression |
Parameter 3 | intNumber | Given mode |
Notes
To enable several modes at once, set intNumber to the sum of the intNumber values of those modes. For example, intNumber = 1 + 2 = 3 enables both the UNIX_LINES mode and the CASE_INSENSITIVE matching mode.
The mode corresponding to each intNumber is as follows:
intNumber | Mode description |
---|---|
intNumber = 1 | Enables the UNIX_LINES mode, in which only the "\n" line terminator is recognized in the behavior of ".", "^", and "$". |
intNumber = 2 | Enables CASE_INSENSITIVE matching. By default, case-insensitive matching assumes that only characters in the US-ASCII charset are being matched. To enable Unicode-aware case-insensitive matching, specify the UNICODE_CASE mode together with this mode. |
intNumber = 4 | Enables the COMMENTS mode, which permits whitespace and comments in the pattern. In this mode, whitespace is ignored, and embedded comments starting with # are ignored until the end of the line. |
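intNumber = 8 | Enables the MULTILINE mode, in which "^" and "$" match just after and just before a line terminator, respectively, in addition to the start and end of the whole input. |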
intNumber = 16 | Enables the LITERAL mode, in which the pattern is parsed literally. In this mode, the pattern string is treated as a sequence of literal characters, and metacharacters or escape sequences in it are given no special meaning. The CASE_INSENSITIVE and UNICODE_CASE modes still affect matching when used with the LITERAL mode; the other modes have no effect. |
intNumber = 32 | Enables the DOTALL mode where the expression "." matches any character, including a line terminator. By default, this expression does not match line terminators. |
intNumber = 64 | Enables Unicode-aware case folding (UNICODE_CASE mode). When this mode and the CASE_INSENSITIVE mode are both specified, case-insensitive matching is performed in a manner consistent with the Unicode Standard. |
intNumber = 128 | Enables canonical equivalence (CANON_EQ mode). In this mode, two characters are considered to match if and only if their full canonical decompositions match. |
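Because each mode value in the table is a distinct power of two, a summed intNumber can be split back into its modes with a bitwise AND. The sketch below uses the values and names from the table; the helper decode_modes is hypothetical and only illustrates the arithmetic:

```python
from typing import List

# Mode values and names as listed in the table above.
MODES = {
    1: "UNIX_LINES",
    2: "CASE_INSENSITIVE",
    4: "COMMENTS",
    8: "MULTILINE",
    16: "LITERAL",
    32: "DOTALL",
    64: "UNICODE_CASE",
    128: "CANON_EQ",
}

def decode_modes(int_number: int) -> List[str]:
    """Return the names of all modes whose bit is set in int_number."""
    return [name for value, name in MODES.items() if int_number & value]

print(decode_modes(3))       # ['UNIX_LINES', 'CASE_INSENSITIVE']
print(decode_modes(2 + 64))  # ['CASE_INSENSITIVE', 'UNICODE_CASE']
```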
Notes
The function supports three parameters: the first and second are text, and the third is a number.
Example
Formula | Result | Explanation |
---|---|---|
REGEXP("Aaaaabbbbc","a*b*c",3) | 1 | Enables both the UNIX_LINES mode and CASE_INSENSITIVE matching mode, in which case 1 is returned. |
REGEXP("Aaaaabbbbc","a*b*c",1) | 0 | Enables only the UNIX_LINES mode. 0 is returned because matching remains case-sensitive, so "A" does not match "a". |
REGEXP("Abc","abc",2) | 1 | Enables the CASE_INSENSITIVE mode, in which case 1 is returned. |
The following figure shows the result returned by REGEXP("Abc","abc",2).
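The three examples above can be approximated in Python as a sanity check. This is a sketch under stated assumptions, not the product's implementation: regexp is a hypothetical helper, whole-string matching is assumed, only the CASE_INSENSITIVE bit (2) is mapped to a Python flag, and the UNIX_LINES bit (1) is ignored because Python's re module has no direct equivalent and the sample strings contain no line terminators.

```python
import re

CASE_INSENSITIVE = 2  # value taken from the mode table

def regexp(text: str, pattern: str, int_number: int = 0) -> int:
    """Return 1 if the whole text matches pattern under the given modes, else 0."""
    # Only the CASE_INSENSITIVE bit is emulated; other bits are ignored here.
    flags = re.IGNORECASE if int_number & CASE_INSENSITIVE else 0
    return 1 if re.fullmatch(pattern, text, flags) else 0

print(regexp("Aaaaabbbbc", "a*b*c", 3))  # 1: case-insensitive bit is set
print(regexp("Aaaaabbbbc", "a*b*c", 1))  # 0: "A" does not match "a"
print(regexp("Abc", "abc", 2))           # 1
```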
Direct Connect Data
Overview
Grammar | REGEXP(str,pattern) | Determines whether a character string (specified by str) matches a regular expression (specified by pattern). |
Parameter 1 | str | Character string 1 |
Parameter 2 | pattern | Regular expression |
Notes
The function supports two text-type parameters.
Example
Formula | Result | Notes |
---|---|---|
REGEXP("aaaaac","a*c") | 1 | |
REGEXP("abc","a+c") | 0 | |