This document introduces common Spark SQL syntax.
=: Equal operator. A = B. Returns TRUE if the A value is equal to the B value. Returns FALSE otherwise. In FineDataLink, you can use Data Filtering instead.
<>: Not equal operator. A <> B. Returns NULL if the A value is NULL or the B value is NULL. Returns TRUE if the A value is unequal to the B value. Returns FALSE otherwise.
<: Less than operator. A < B. Returns NULL if the A value is NULL or the B value is NULL. Returns TRUE if the A value is less than the B value. Returns FALSE otherwise.
<=: Less than or equal operator. A <= B. Returns NULL if the A value is NULL or the B value is NULL. Returns TRUE if the A value is less than or equal to the B value. Returns FALSE otherwise.
>=: Greater than or equal operator. A >= B. Returns NULL if the A value is NULL or the B value is NULL. Returns TRUE if the A value is greater than or equal to the B value. Returns FALSE otherwise.
IS NULL: NULL value test. A IS NULL. Returns TRUE if the A value is NULL. Returns FALSE otherwise.
IS NOT NULL: NOT NULL value test. A IS NOT NULL. Returns FALSE if the A value is NULL. Returns TRUE otherwise.
LIKE: Fuzzy matching. Syntax one: A LIKE B. Syntax two: LIKE(A, B). Expression type: string. Return value type: boolean or NULL.
Returns NULL if the A value or the B value is NULL. Returns TRUE if the A value matches the pattern expression B. Returns FALSE otherwise. With LIKE you can only use the percent sign (%) and the underscore (_) as wildcard characters in the expression B, where % matches any number of characters (including zero) and _ matches exactly one character. (You can use multiple underscores, each representing one character.)
For example: select Football like '%ba' from Test and select like(Football, '_otba%') from Test.
FineDataLink makes fuzzy querying and filtering easier by visualizing the process, improving efficiency. For details, see Fuzzy Field Querying/Filtering.
AND: Logical AND. A AND B. Returns TRUE if both the A value and the B value are TRUE. Returns FALSE otherwise.
OR: Logical OR. A OR B. Returns TRUE if either the A value or the B value is TRUE. Returns FALSE otherwise.
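A minimal sketch combining the relational and logical operators in one filter; the table name DB Table Input and the columns Math and English are assumptions for illustration only:
select * from `DB Table Input` where Math >= 90 and (English >= 90 or English is null)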
Example:
Add a Type column whose value is the text Test. You can use the following SQL statement: select 'Test' AS Type from `Input Source`
Add a Mark column whose value is 1 for data with a math score greater than 90 and 0 otherwise. You can use the following SQL statement: select Math, if(Math > 90, 1, 0) AS Mark from `Input Source`
For a simpler solution, see Adding a Column Using the IF Function.
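The same branching can also be written with CASE WHEN, which scales to more than two branches; the table and column names below are the assumed ones from the example above:
select Math, case when Math > 90 then 1 else 0 end AS Mark from `Input Source`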
Enclose a Chinese column name in backquotes (`). For example: select `Test` as Field.
To deduplicate data, see Data Deduplication.
ORDER BY: global sorting
SORT BY, DISTRIBUTE BY, and CLUSTER BY: partial sorting (within each partition)
For details, see Data Sorting.
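A minimal sketch of both kinds of sorting, assuming a DB Table Input table with Product ID and Sales columns (both names are illustrative):
select * from `DB Table Input` order by Sales desc
select * from `DB Table Input` distribute by `Product ID` sort by Sales desc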
LIMIT returns only the first N records. For example: select * from `DB Table Input` limit 3
BETWEEN returns records within a given range. It can be applied to numeric, string, and date columns.
select * from `DB Table Input` where `Product ID` between 2 and 3
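BETWEEN works the same way on date strings; the Order Date column and the date range below are assumptions for illustration:
select * from `DB Table Input` where `Order Date` between '2024-01-01' and '2024-03-31'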
LEFT JOIN
LEFT OUTER JOIN
LEFT SEMI JOIN
INNER JOIN or JOIN
RIGHT JOIN
RIGHT OUTER JOIN
FULL JOIN
FULL OUTER JOIN
UNION
UNION ALL
For details, see JOIN and UNION.
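A minimal sketch of a join and a union; the table names Orders and Products and the key column Product ID are assumptions for illustration:
select a.`Product ID`, a.Amount, b.`Product Name` from Orders a left join Products b on a.`Product ID` = b.`Product ID`
select `Product ID` from Orders union all select `Product ID` from Products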
Filters records based on the condition in the HAVING clause after the GROUP BY clause groups data.
select `Product ID` from `DB Table Input` group by `Product ID` having `Product ID` > 400
For row-to-column conversion, you are advised to use the Row-to-Column operator.
For column-to-row conversion, you are advised to use the Column-to-Row operator. (A Spark SQL sketch of both directions follows below.)
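If you do need to express the conversions directly in Spark SQL, PIVOT covers row-to-column and the stack function covers column-to-row. All table and column names here (a Scores table containing only Name, Subject, and Score, and a wide table Wide with Math and English columns) are assumptions for illustration:
select * from Scores pivot (max(Score) for Subject in ('Math', 'English'))
select Name, stack(2, 'Math', Math, 'English', English) as (Subject, Score) from Wide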
The to_json and struct functions are supported in 4.1.6.4 and later versions.
Difference Between the to_json Function and the JSON Generation operator
The JSON Generation operator can convert a two-dimensional table into a JSON object or convert it into a two-dimensional table with multiple rows and a single column (where the value on each row is a JSON object). However, it cannot create a two-dimensional table containing a combination of plain text, JSON objects, and JSON arrays. The to_json function can be used more flexibly.
The JSON Generation operator can be used with the API Output operator to generate JSON data and then output it to the API. The to_json function can be used to construct JSON fields for MongoDB Output.
The to_json function can convert data to a JSON string.
SELECT to_json(named_struct('a', 1, 'b', 2)); returns {"a":1,"b":2}
SELECT to_json(map('a', named_struct('b', 1))); returns {"a":{"b":1}}
SELECT to_json(array(map('a', 1))); returns [{"a":1}]
To convert text to a JSON array: SELECT to_json(split('a,b,c', ',')) AS jsonArray returns ["a", "b", "c"]
struct Function
The struct function is used to concatenate multiple columns or fields into a structured data object.
Example one: Concatenate multiple fields of a two-dimensional table into a JSON object.
Example two: Concatenate multiple fields of a two-dimensional table into a JSON object array through struct, to_json, collect_list, concat_ws, concat, and group by. For details, see MongoDB Output.
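A minimal sketch of both examples; the Orders table and its Product ID, Name, and Amount columns are assumptions for illustration:
select to_json(struct(`Product ID`, Name, Amount)) as json_obj from Orders
select `Product ID`, concat('[', concat_ws(',', collect_list(to_json(struct(Name, Amount)))), ']') as json_array from Orders group by `Product ID`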