This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris.git

The following commit(s) were added to refs/heads/master by this push:
     new 3e8cd0c669 [typo](doc) Add the description of json HDFS broker load (#13683)
3e8cd0c669 is described below

commit 3e8cd0c669f7911c0acc02e446f9499856537da1
Author: Tiewei Fang <43782773+bepppo...@users.noreply.github.com>
AuthorDate: Thu Oct 27 09:36:57 2022 +0800

    [typo](doc) Add the description of json HDFS broker load (#13683)

    Add the instruction of HDFS broker load with json format file.
---
 .../Load/BROKER-LOAD.md | 54 +++++++++++++++++++++
 .../Load/BROKER-LOAD.md | 55 ++++++++++++++++++++++
 2 files changed, 109 insertions(+)

diff --git a/docs/en/docs/sql-manual/sql-reference/Data-Manipulation-Statements/Load/BROKER-LOAD.md b/docs/en/docs/sql-manual/sql-reference/Data-Manipulation-Statements/Load/BROKER-LOAD.md
index 57fb1003b1..6cf381b74f 100644
--- a/docs/en/docs/sql-manual/sql-reference/Data-Manipulation-Statements/Load/BROKER-LOAD.md
+++ b/docs/en/docs/sql-manual/sql-reference/Data-Manipulation-Statements/Load/BROKER-LOAD.md
@@ -72,6 +72,7 @@ WITH BROKER broker_name
     [WHERE predicate]
     [DELETE ON expr]
     [ORDER BY source_sequence]
+    [PROPERTIES ("key1"="value1", ...)]
 ````
 
 - `[MERGE|APPEND|DELETE]`
@@ -128,6 +129,10 @@ WITH BROKER broker_name
 
     For tables of the Unique Key model only. Specifies the column in the imported data that represents the Sequence Col. Mainly used to ensure data order when importing.
 
+- `PROPERTIES ("key1"="value1", ...)`
+
+    Specifies parameters of the imported file format. For example, if the imported file is in `json` format, you can specify parameters such as `json_root`, `jsonpaths`, and `fuzzy_parse`.
+
 - `WITH BROKER broker_name`
 
     Specify the Broker service name to be used. In the public cloud Doris, the Broker service name is `bos`.
@@ -405,6 +410,55 @@ WITH BROKER broker_name
 
     `my_table` must be a Unique Key model table with Sequence Col specified. The data will be ordered according to the value of the `source_sequence` column in the source data.
 
+10. Import a batch of data from HDFS, specify the file format as `json`, and specify the `json_root` and `jsonpaths` parameters.
+
+    ```sql
+    LOAD LABEL example_db.label10
+    (
+        DATA INFILE("HDFS://test:port/input/file.json")
+        INTO TABLE `my_table`
+        FORMAT AS "json"
+        PROPERTIES(
+            "json_root" = "$.item",
+            "jsonpaths" = "[$.id, $.city, $.code]"
+        )
+    )
+    with HDFS (
+        "hadoop.username" = "user",
+        "password" = ""
+    )
+    PROPERTIES
+    (
+        "timeout"="1200",
+        "max_filter_ratio"="0.1"
+    );
+    ```
+
+    `jsonpaths` can be used with `column list` and `SET(column_mapping)`:
+
+    ```sql
+    LOAD LABEL example_db.label10
+    (
+        DATA INFILE("HDFS://test:port/input/file.json")
+        INTO TABLE `my_table`
+        FORMAT AS "json"
+        (id, code, city)
+        SET (id = id * 10)
+        PROPERTIES(
+            "json_root" = "$.item",
+            "jsonpaths" = "[$.id, $.code, $.city]"
+        )
+    )
+    with HDFS (
+        "hadoop.username" = "user",
+        "password" = ""
+    )
+    PROPERTIES
+    (
+        "timeout"="1200",
+        "max_filter_ratio"="0.1"
+    );
+    ```
 ### Keywords
 
     BROKER, LOAD
diff --git a/docs/zh-CN/docs/sql-manual/sql-reference/Data-Manipulation-Statements/Load/BROKER-LOAD.md b/docs/zh-CN/docs/sql-manual/sql-reference/Data-Manipulation-Statements/Load/BROKER-LOAD.md
index 44b7d5fcee..51bab9d6a1 100644
--- a/docs/zh-CN/docs/sql-manual/sql-reference/Data-Manipulation-Statements/Load/BROKER-LOAD.md
+++ b/docs/zh-CN/docs/sql-manual/sql-reference/Data-Manipulation-Statements/Load/BROKER-LOAD.md
@@ -72,6 +72,7 @@ WITH BROKER broker_name
     [WHERE predicate]
     [DELETE ON expr]
     [ORDER BY source_sequence]
+    [PROPERTIES ("key1"="value1", ...)]
 ```
 
 - `[MERGE|APPEND|DELETE]`
@@ -128,6 +129,10 @@ WITH BROKER broker_name
 
     Applies only to tables of the Unique Key model. Specifies the column in the imported data that represents the Sequence Col. Mainly used to guarantee data order during import.
 
+- `PROPERTIES ("key1"="value1", ...)`
+
+    Specifies parameters of the imported format. For example, if the imported file is in `json` format, you can specify parameters such as `json_root`, `jsonpaths`, and `fuzzy_parse` here.
+
 - `WITH BROKER broker_name`
 
     Specifies the name of the Broker service to use. In the public cloud Doris, the Broker service name is `bos`.
@@ -404,6 +409,56 @@ WITH BROKER broker_name
 
     `my_table` must be a Unique Key model table with a Sequence Col specified. The data is ordered by the value of the `source_sequence` column in the source data.
 
+10. Import a batch of data from HDFS, specifying the file format as `json` along with `json_root` and `jsonpaths`.
+
+    ```sql
+    LOAD LABEL example_db.label10
+    (
+        DATA INFILE("HDFS://test:port/input/file.json")
+        INTO TABLE `my_table`
+        FORMAT AS "json"
+        PROPERTIES(
+            "json_root" = "$.item",
+            "jsonpaths" = "[$.id, $.city, $.code]"
+        )
+    )
+    with HDFS (
+        "hadoop.username" = "user",
+        "password" = ""
+    )
+    PROPERTIES
+    (
+        "timeout"="1200",
+        "max_filter_ratio"="0.1"
+    );
+    ```
+
+    `jsonpaths` can be combined with `column list` and `SET (column_mapping)`:
+
+    ```sql
+    LOAD LABEL example_db.label10
+    (
+        DATA INFILE("HDFS://test:port/input/file.json")
+        INTO TABLE `my_table`
+        FORMAT AS "json"
+        (id, code, city)
+        SET (id = id * 10)
+        PROPERTIES(
+            "json_root" = "$.item",
+            "jsonpaths" = "[$.id, $.code, $.city]"
+        )
+    )
+    with HDFS (
+        "hadoop.username" = "user",
+        "password" = ""
+    )
+    PROPERTIES
+    (
+        "timeout"="1200",
+        "max_filter_ratio"="0.1"
+    );
+    ```
+
 ### Keywords
 
     BROKER, LOAD
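
For readers of this commit, the second example in the diff (`jsonpaths` together with a column list and `SET`) can be pictured with a small Python sketch. This is not Doris code and makes several assumptions: `resolve`, `extract_row`, and the sample document are hypothetical names and data, the resolver handles only plain `$.a.b` paths, and returning `None` for a missing field is an assumed behavior. The idea it illustrates is that `json_root` selects the sub-document to read from, and each entry in `jsonpaths` is then matched by position to the column list.

```python
import json


def resolve(node, path):
    """Follow a plain `$.a.b` path through nested dicts.

    Returns None when a key is missing (an assumption made for this sketch).
    """
    if not path.startswith("$"):
        raise ValueError(f"unsupported path: {path!r}")
    # "$.item" -> ["", "item"]; empty segments are dropped.
    for key in filter(None, path[1:].split(".")):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def extract_row(document, json_root, jsonpaths, columns):
    """Pick the root node, then pair each extracted value with a column by position."""
    root = resolve(document, json_root)
    values = [resolve(root, p) for p in jsonpaths]
    return dict(zip(columns, values))


# Hypothetical input document, shaped to match "json_root" = "$.item".
sample = json.loads('{"item": {"id": 7, "city": "beijing", "code": 100}}')

row = extract_row(
    sample,
    json_root="$.item",
    jsonpaths=["$.id", "$.code", "$.city"],
    columns=["id", "code", "city"],
)
row["id"] = row["id"] * 10  # mirrors SET (id = id * 10)
print(row)
```

Because the pairing is positional, swapping two entries in `jsonpaths` without swapping the column list would load values into the wrong columns, which is why the second example reorders both together.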