morrySnow commented on code in PR #1952:
URL: https://github.com/apache/doris-website/pull/1952#discussion_r1942162694
##########
docs/sql-manual/sql-statements/data-modification/load-and-export/BROKER-LOAD.md:
##########
@@ -1,9 +1,8 @@
 ---
 {
-    "title": "BROKER LOAD",
+    "title": "BROKER-LOAD",

Review Comment:
   ```suggestion
       "title": "BROKER LOAD",
   ```

##########
docs/sql-manual/sql-statements/data-modification/load-and-export/BROKER-LOAD.md:
##########
@@ -24,345 +23,255 @@ KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
 -->
+
 ## Description

-This command is mainly used to import data on remote storage (such as S3, HDFS) through the Broker service process.
+Broker Load is a data import method in Doris, primarily used to import large - scale data from remote storage systems such as HDFS or S3. It is initiated through the MySQL API and is an asynchronous import method. The import progress and results can be queried using the `SHOW LOAD` statement.

Review Comment:
   ```suggestion
   Broker Load is a data import method in Doris, primarily used to import large scale data from remote storage systems such as HDFS or S3. It is initiated through the MySQL API and is an asynchronous import method. The import progress and results can be queried using the `SHOW LOAD` statement.
   ```

##########
docs/sql-manual/sql-statements/data-modification/load-and-export/BROKER-LOAD.md:
##########
@@ -24,345 +23,255 @@ KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
 -->
+
 ## Description

-This command is mainly used to import data on remote storage (such as S3, HDFS) through the Broker service process.
+Broker Load is a data import method in Doris, primarily used to import large - scale data from remote storage systems such as HDFS or S3. It is initiated through the MySQL API and is an asynchronous import method. The import progress and results can be queried using the `SHOW LOAD` statement.
+
+In earlier versions, S3 and HDFS Load relied on the Broker process. However, with version optimizations, data is now read directly from the data source without relying on an additional Broker process. Nevertheless, due to the similar syntax, S3 Load, HDFS Load, and Broker Load are collectively referred to as Broker Load.
+
+## Syntax

 ```sql
-LOAD LABEL load_label
+LOAD LABEL <db_name>.<load_label>

Review Comment:
   Is the db part required here? It looks like it should be optional.
   ```suggestion
   LOAD LABEL [<db_name>.]<load_label>
   ```

##########
docs/sql-manual/sql-statements/data-modification/load-and-export/CANCEL-LOAD.md:
##########
@@ -25,58 +24,74 @@ specific language governing permissions and limitations under the License.
 -->
-
 ## Description

-This statement is used to undo an import job for the specified label. Or batch undo import jobs via fuzzy matching
+This statement is used to cancel an import job with a specified `label`, or to cancel import jobs in batches through fuzzy matching.
+
+## Syntax

 ```sql
 CANCEL LOAD
-[FROM db_name]
-WHERE [LABEL = "load_label" | LABEL like "label_pattern" | STATE = "PENDING/ETL/LOADING"]
+[FROM <db_name>]
+WHERE [LABEL = "<load_label>" | LABEL like "<label_pattern>" | STATE = { "PENDING" | "ETL" | "LOADING" } ]
 ```

-Notice: Cancel by State is supported since 1.2.0.
+## Required Parameters
+
+**1. `<db_name>`**
+
+> The name of the database where the import job to be cancelled resides.
+
+## Optional Parameters
+
+**1. `<load_label>`**
+
+> If `LABEL = "<load_label>"` is used, it precisely matches the specified label.
+
+**2. `<label_pattern>`**
+
+> If `LABEL LIKE "<label_pattern>"` is used, it matches import tasks whose labels contain the `label_pattern`.
+
+**3. `<PENDING>`**

Review Comment:
   ```suggestion
   **3. `STATE = { "PENDING" | "ETL" | "LOADING" }`**
   ```

##########
docs/sql-manual/sql-statements/data-modification/load-and-export/SHOW-LOAD.md:
##########
@@ -24,90 +24,109 @@ specific language governing permissions and limitations under the License.
 -->
-
-
 ## Description

-This statement is used to display the execution of the specified import task
+This statement is used to display the execution status of the specified import task.

-grammar:
+## Syntax

 ```sql
 SHOW LOAD
-[FROM db_name]
+[FROM <db_name>]
 [
  WHERE
-  [LABEL [ = "your_label" | LIKE "label_matcher"]]
-  [STATE = ["PENDING"|"ETL"|"LOADING"|"FINISHED"|"CANCELLED"|]]
+  [LABEL = [ "<your_label>" | LIKE "<label_matcher>"]]
+  [STATE = [ { " PENDING " | " ETL " | " LOADING " | " FINISHED " | " CANCELLED " } ]]

Review Comment:
   ```suggestion
   [ STATE = { " PENDING " | " ETL " | " LOADING " | " FINISHED " | " CANCELLED " } ]
   ```

##########
docs/sql-manual/sql-statements/data-modification/load-and-export/SHOW-LOAD.md:
##########
@@ -24,90 +24,109 @@ specific language governing permissions and limitations under the License.
 -->
-
-
 ## Description

-This statement is used to display the execution of the specified import task
+This statement is used to display the execution status of the specified import task.

-grammar:
+## Syntax

 ```sql
 SHOW LOAD
-[FROM db_name]
+[FROM <db_name>]
 [
  WHERE
-  [LABEL [ = "your_label" | LIKE "label_matcher"]]
-  [STATE = ["PENDING"|"ETL"|"LOADING"|"FINISHED"|"CANCELLED"|]]
+  [LABEL = [ "<your_label>" | LIKE "<label_matcher>"]]
+  [STATE = [ { " PENDING " | " ETL " | " LOADING " | " FINISHED " | " CANCELLED " } ]]
 ]
-[ORDER BY...]
-[LIMIT limit][OFFSET offset];
+[ORDER BY [{ <col_name> | <expr> | <position> }]]
+[LIMIT <limit>][OFFSET <offset>];
 ```

-illustrate:
+## Optional Parameters
+
+**1. `<db_name>`**
+
+> If `db_name` is not specified, the current default database will be used.
+
+**2. `<label_matcher>`**
+
+> When using `LABEL LIKE = "<label_matcher>"`, it will match import tasks whose labels contain `label_matcher`.
+
+**3. `<your_label>`**
+
+> When using `LABEL = "<your_label>"`, it will precisely match the specified label.
+
+**4. `<PENDING>`**

Review Comment:
   STATE = { " PENDING " | " ETL " | " LOADING " | " FINISHED " | " CANCELLED " }

##########
docs/sql-manual/sql-statements/data-modification/load-and-export/SHOW-LOAD-WARNINGS.md:
##########
@@ -1,6 +1,6 @@
 ---
 {
-    "title": "SHOW LOAD WARNINGS",
+    "title": "SHOW-LOAD-WARNINGS",

Review Comment:
   ```suggestion
       "title": "SHOW LOAD WARNINGS",
   ```

##########
docs/sql-manual/sql-statements/data-modification/load-and-export/BROKER-LOAD.md:
##########
@@ -24,345 +23,255 @@ KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
 -->
+
 ## Description

-This command is mainly used to import data on remote storage (such as S3, HDFS) through the Broker service process.
+Broker Load is a data import method in Doris, primarily used to import large - scale data from remote storage systems such as HDFS or S3. It is initiated through the MySQL API and is an asynchronous import method. The import progress and results can be queried using the `SHOW LOAD` statement.
+
+In earlier versions, S3 and HDFS Load relied on the Broker process. However, with version optimizations, data is now read directly from the data source without relying on an additional Broker process. Nevertheless, due to the similar syntax, S3 Load, HDFS Load, and Broker Load are collectively referred to as Broker Load.
+
+## Syntax

 ```sql
-LOAD LABEL load_label
+LOAD LABEL <db_name>.<load_label>
 (
-data_desc1[, data_desc2, ...]
+[ { MERGE | APPEND | DELETE } ]
+DATA INFILE
+(
+"<file_path>"[, ...]
+)
+[ NEGATIVE ]
+INTO TABLE `<table_name>`
+[ PARTITION ( <partition_name> [ , ... ] ) ]
+[ COLUMNS TERMINATED BY "<column_separator>" ]
+[ LINES TERMINATED BY "<line_delimiter>" ]
+[ FORMAT AS "<file_type>" ]
+[ COMPRESS_TYPE AS "<compress_type>" ]
+[ (<column_list>) ]
+[ COLUMNS FROM PATH AS (<column_name> [ , ... ] ) ]
+[ SET (<column_mapping>) ]
+[ PRECEDING FILTER <predicate> ]
+[ WHERE <predicate> ]
+[ DELETE ON <expr> ]
+[ ORDER BY <source_sequence> ]
+[ PROPERTIES ("<key1>"="<value1>", ...) ]
 )
-WITH BROKER broker_name
-[broker_properties]
-[load_properties]
-[COMMENT "comment"];
+WITH BROKER "<broker_name>"
+( <broker_properties>
+  [ , ... ])
+[ PROPERTIES (
+  <load_properties>
+  [ , ... ]) ]
+[COMMENT "<comment>" ];
 ```

-- `load_label`
-
-  Each import needs to specify a unique Label. You can use this label to view the progress of the job later.
-
-  `[database.]label_name`
-
-- `data_desc1`
-
-  Used to describe a set of files that need to be imported.
-
-  ```sql
-  [MERGE|APPEND|DELETE]
-  DATA INFILE
-  (
-  "file_path1"[, file_path2, ...]
-  )
-  [NEGATIVE]
-  INTO TABLE `table_name`
-  [PARTITION (p1, p2, ...)]
-  [COLUMNS TERMINATED BY "column_separator"]
-  [LINES TERMINATED BY "line_delimiter"]
-  [FORMAT AS "file_type"]
-  [COMPRESS_TYPE AS "compress_type"]
-  [(column_list)]
-  [COLUMNS FROM PATH AS (c1, c2, ...)]
-  [SET (column_mapping)]
-  [PRECEDING FILTER predicate]
-  [WHERE predicate]
-  [DELETE ON expr]
-  [ORDER BY source_sequence]
-  [PROPERTIES ("key1"="value1", ...)]
-  ```
-
-  - `[MERGE|APPEND|DELETE]`
-
-    Data merge type. The default is APPEND, indicating that this import is a normal append write operation. The MERGE and DELETE types are only available for Unique Key model tables. The MERGE type needs to be used with the `[DELETE ON]` statement to mark the Delete Flag column. The DELETE type indicates that all data imported this time are deleted data.
-
-  - `DATA INFILE`
-
-    Specify the file path to be imported. Can be multiple. Wildcards can be used. The path must eventually match to a file, if it only matches a directory the import will fail.
-
-  - `NEGATIVE`
-
-    This keyword is used to indicate that this import is a batch of "negative" imports. This method is only for aggregate data tables with integer SUM aggregate type. This method will reverse the integer value corresponding to the SUM aggregate column in the imported data. Mainly used to offset previously imported wrong data.
-
-  - `PARTITION(p1, p2, ...)`
-
-    You can specify to import only certain partitions of the table. Data that is no longer in the partition range will be ignored.
-
-  - `COLUMNS TERMINATED BY`
-
-    Specifies the column separator. Only valid in CSV format. Only single-byte delimiters can be specified.
-
-  - `LINES TERMINATED BY`
-
-    Specifies the line delimiter. Only valid in CSV format. Only single-byte delimiters can be specified.
-
-  - `FORMAT AS`
-
-    Specifies the file type, CSV, PARQUET and ORC formats are supported. Default is CSV.
-
-  - `COMPRESS_TYPE AS`
-    Specifies the file compress type, GZ/LZO/BZ2/LZ4FRAME/DEFLATE/LZOP
-
-  - `column list`
-
-    Used to specify the column order in the original file. For a detailed introduction to this part, please refer to the [Column Mapping, Conversion and Filtering](../../../../data-operate/import/import-scenes/load-data-convert.md) document.
-
-    `(k1, k2, tmpk1)`
-
-  - `COLUMNS FROM PATH AS`
-
-    Specifies the columns to extract from the import file path.
-
-  - `SET (column_mapping)`
-
-    Specifies the conversion function for the column.
-
-  - `PRECEDING FILTER predicate`
-
-    Pre-filter conditions. The data is first concatenated into raw data rows in order according to `column list` and `COLUMNS FROM PATH AS`. Then filter according to the pre-filter conditions. For a detailed introduction to this part, please refer to the [Column Mapping, Conversion and Filtering](../../../../data-operate/import/import-scenes/load-data-convert.md) document.
-
-  - `WHERE predicate`
-
-    Filter imported data based on conditions. For a detailed introduction to this part, please refer to the [Column Mapping, Conversion and Filtering](../../../../data-operate/import/import-scenes/load-data-convert.md) document.
-
-  - `DELETE ON expr`
-
-    It needs to be used with the MEREGE import mode, only for the table of the Unique Key model. Used to specify the columns and calculated relationships in the imported data that represent the Delete Flag.
-
-  - `ORDER BY`
-
-    Tables only for the Unique Key model. Used to specify the column in the imported data that represents the Sequence Col. Mainly used to ensure data order when importing.
-
-  - `PROPERTIES ("key1"="value1", ...)`
-
-    Specify some parameters of the imported format. For example, if the imported file is in `json` format, you can specify parameters such as `json_root`, `jsonpaths`, `fuzzy parse`, etc.
-
-    - enclose
-
-      When the csv data field contains row delimiters or column delimiters, to prevent accidental truncation, single-byte characters can be specified as brackets for protection. For example, the column separator is ",", the bracket is "'", and the data is "a,'b,c'", then "b,c" will be parsed as a field. Note: when the bracket is `"`, trim\_double\_quotes must be set to true.
-
-    - escape
-
-      Used to escape characters that appear in a csv field identical to the enclosing characters. For example, if the data is "a,'b,'c'", enclose is "'", and you want "b,'c to be parsed as a field, you need to specify a single-byte escape character, such as "\", and then modify the data to "a,' b,\'c'".
-
-- `WITH BROKER broker_name`
-
-  Specify the Broker service name to be used. In the public cloud Doris. Broker service name is `bos`
-
-- `broker_properties`
-
-  Specifies the information required by the broker. This information is usually used by the broker to be able to access remote storage systems. Such as BOS or HDFS. See the [Broker](../../../../advanced/broker.md) documentation for specific information.
-
-  ```text
-  (
-    "key1" = "val1",
-    "key2" = "val2",
-    ...
-  )
-  ```
-
-- `load_properties`
-
-  Specifies import-related parameters. The following parameters are currently supported:
+## Required Parameters
+
+**1. `<db_name>`**
+> Specifies the name of the database for import.
+
+**2. `<load_label>`**
+> Each import task needs to specify a unique Label. The job progress can be queried later using this Label.
+
+**3. `<table_name>`**
+> Specifies the table corresponding to the import task.
+
+**4. `<file_path>`**
+> Specifies the file path to be imported. Multiple paths can be specified, and wildcards can be used. The path must ultimately match a file; if it only matches a directory, the import will fail.
+
+**5. `<broker_name>`**
+> Specifies the name of the Broker service to be used. For example, in public - cloud Doris, the Broker service name is `bos`.
+
+**6. `<broker_properties>`**
+> Specifies the information required by the broker. This information is typically used to enable the Broker to access the remote storage system, such as BOS or HDFS.
+>
+>```text
+> (
+>   "username" = "user",
+>   "password" = "pass",
+>   ...
+> )
+>```
+
+## Optional Parameters
+
+| Parameter Name | Parameter Description |

Review Comment:
   The Optional Parameters section should keep the same format as the Required Parameters section.

##########
docs/sql-manual/sql-statements/data-modification/load-and-export/BROKER-LOAD.md:
##########
@@ -24,345 +23,255 @@ KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
 -->
+
 ## Description

-This command is mainly used to import data on remote storage (such as S3, HDFS) through the Broker service process.
+Broker Load is a data import method in Doris, primarily used to import large - scale data from remote storage systems such as HDFS or S3. It is initiated through the MySQL API and is an asynchronous import method. The import progress and results can be queried using the `SHOW LOAD` statement.
+
+In earlier versions, S3 and HDFS Load relied on the Broker process. However, with version optimizations, data is now read directly from the data source without relying on an additional Broker process. Nevertheless, due to the similar syntax, S3 Load, HDFS Load, and Broker Load are collectively referred to as Broker Load.

Review Comment:
   ```suggestion
   In earlier versions, S3 and HDFS Load relied on the Broker process. Now, data is read directly from the data source without relying on an additional Broker process. Nevertheless, due to the similar syntax, S3 Load, HDFS Load, and Broker Load are collectively referred to as Broker Load.
   ```

##########
docs/sql-manual/sql-statements/data-modification/load-and-export/CANCEL-LOAD.md:
##########
@@ -1,9 +1,8 @@
 ---
 {
-    "title": "CANCEL LOAD",
+    "title": "CANCEL-LOAD",

Review Comment:
   ```suggestion
       "title": "CANCEL LOAD",
   ```

##########
docs/sql-manual/sql-statements/data-modification/load-and-export/SHOW-CREATE-LOAD.md:
##########
@@ -1,6 +1,6 @@
 ---
 {
-    "title": "SHOW CREATE LOAD",
+    "title": "SHOW-CREATE-LOAD",

Review Comment:
   ```suggestion
       "title": "SHOW CREATE LOAD",
   ```

##########
docs/sql-manual/sql-statements/data-modification/load-and-export/SHOW-CREATE-LOAD.md:
##########
@@ -24,34 +24,38 @@ specific language governing permissions and limitations under the License.
 -->
-
-
-
 ## Description

-This statement is used to demonstrate the creation statement of a import job.
+This statement is used to display the creation statement of an import job.

-grammar:
+## Syntax

 ```sql
-SHOW CREATE LOAD for load_name;
+SHOW CREATE LOAD FOR <load_name>;
 ```

-illustrate:
+## Required Parameters

-- `load_name`: import job name
+**`<load_name>`**

-## Example
+> The name of the routine import job.

-1. Show the creation statement of the specified import job under the default db
+## Access Control Requirements

-  ```sql
-  SHOW CREATE LOAD for test_load
-  ```
+Users executing this SQL command must have at least the following permissions:

-## Keywords
+| Privilege | Object | Notes |
+| :---------------- | :------------- | :---------------------------- |
+| ADMIN/OPERATOR | Database | Cluster administrator privileges are required. |

Review Comment:
   This should be ADMIN_PRIV or NODE_PRIV.

##########
docs/sql-manual/sql-statements/data-modification/load-and-export/MYSQL-LOAD.md:
##########
@@ -1,6 +1,6 @@
 ---
 {
-    "title": "MYSQL LOAD",
+    "title": "MYSQL-LOAD",

Review Comment:
   ```suggestion
       "title": "MYSQL LOAD",
   ```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
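For context on the syntax being reviewed above, the statements under discussion would be used roughly as follows. This is a hypothetical sketch, not part of the PR: the database `example_db`, label `label_20250101`, table `my_table`, broker `my_hdfs_broker`, and all paths and credentials are made-up placeholders.

```sql
-- Broker Load with the database qualifier present
-- (per the review, [<db_name>.] should be optional in the grammar):
LOAD LABEL example_db.label_20250101
(
    DATA INFILE("hdfs://host:8020/user/data/*/file.csv")
    INTO TABLE `my_table`
    COLUMNS TERMINATED BY ","
    FORMAT AS "CSV"
)
WITH BROKER "my_hdfs_broker"
(
    "username" = "hdfs_user",
    "password" = "hdfs_passwd"
)
PROPERTIES
(
    "timeout" = "3600"
);

-- Check progress of the asynchronous job:
SHOW LOAD FROM example_db WHERE LABEL = "label_20250101";

-- Cancel by state, matching the suggested
-- STATE = { "PENDING" | "ETL" | "LOADING" } form:
CANCEL LOAD FROM example_db WHERE STATE = "LOADING";
```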