kaijchen commented on code in PR #1511:
URL: https://github.com/apache/doris-website/pull/1511#discussion_r1893360377
##########
docs/data-operate/import/data-source/minio.md:
##########
@@ -0,0 +1,204 @@
+---
+{
+    "title": "MinIO",
+    "language": "en"
+}
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+Doris provides two ways to load files from MinIO:
+- Use S3 Load to load MinIO files into Doris, which is an asynchronous load method.
+- Use TVF to load MinIO files into Doris, which is a synchronous load method.
+
+## Load with S3 Load
+
+Use S3 Load to import files on object storage.
For detailed steps, please refer to the [Broker Load Manual](../import-way/broker-load-manual.md).
+
+### Step 1: Prepare the data
+
+Create a CSV file named s3load_example.csv. The file is stored on MinIO, and its content is as follows:
+
+```
+1,Emily,25
+2,Benjamin,35
+3,Olivia,28
+4,Alexander,60
+5,Ava,17
+6,William,69
+7,Sophia,32
+8,James,64
+9,Emma,37
+10,Liam,64
+```
+
+### Step 2: Create a table in Doris
+
+```sql
+CREATE TABLE test_s3load(
+    user_id BIGINT NOT NULL COMMENT "user id",
+    name VARCHAR(20) COMMENT "name",
+    age INT COMMENT "age"
+)
+DUPLICATE KEY(user_id)
+DISTRIBUTED BY HASH(user_id) BUCKETS 10;
+```
+
+### Step 3: Load data using S3 Load
+
+:::caution Caution
+If MinIO is deployed on a local network without TLS enabled, you need to explicitly add `http://` to the endpoint string.
+
+- `"s3.endpoint" = "http://localhost:9000"`

Review Comment:
```suggestion
- `"s3.endpoint" = "http://localhost:9000"`

The S3 SDK uses the virtual-hosted style by default. However, MinIO does not enable virtual-hosted style access by default. In this case, we can add the `use_path_style` parameter to force the use of the path style.

- `"use_path_style" = "true"`
```

##########
docs/data-operate/import/data-source/minio.md:
##########
@@ -0,0 +1,204 @@
+---
+{
+    "title": "MinIO",
+    "language": "en"
+}
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.
You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+Doris provides two ways to load files from MinIO:
+- Use S3 Load to load MinIO files into Doris, which is an asynchronous load method.
+- Use TVF to load MinIO files into Doris, which is a synchronous load method.
+
+## Load with S3 Load
+
+Use S3 Load to import files on object storage. For detailed steps, please refer to the [Broker Load Manual](../import-way/broker-load-manual.md).
+
+### Step 1: Prepare the data
+
+Create a CSV file named s3load_example.csv. The file is stored on MinIO, and its content is as follows:
+
+```
+1,Emily,25
+2,Benjamin,35
+3,Olivia,28
+4,Alexander,60
+5,Ava,17
+6,William,69
+7,Sophia,32
+8,James,64
+9,Emma,37
+10,Liam,64
+```
+
+### Step 2: Create a table in Doris
+
+```sql
+CREATE TABLE test_s3load(
+    user_id BIGINT NOT NULL COMMENT "user id",
+    name VARCHAR(20) COMMENT "name",
+    age INT COMMENT "age"
+)
+DUPLICATE KEY(user_id)
+DISTRIBUTED BY HASH(user_id) BUCKETS 10;
+```
+
+### Step 3: Load data using S3 Load
+
+:::caution Caution
+If MinIO is deployed on a local network without TLS enabled, you need to explicitly add `http://` to the endpoint string.
+
+- `"s3.endpoint" = "http://localhost:9000"`
+:::
+
+```sql
+LOAD LABEL s3_load_2022_04_05
+(
+    DATA INFILE("s3://your_bucket_name/s3load_example.csv")
+    INTO TABLE test_s3load
+    COLUMNS TERMINATED BY ","
+    FORMAT AS "CSV"
+    (user_id, name, age)
+)
+WITH S3
+(
+    "provider" = "S3",
+    "s3.endpoint" = "play.min.io:9000",
+    "s3.region" = "us-east-1",
+    "s3.access_key" = "myminioadmin",
+    "s3.secret_key" = "minio-secret-key-change-me",
+    "use_path_style" = "true"
+)
+PROPERTIES
+(
+    "timeout" = "3600"
+);
+```
+
+### Step 4: Check the imported data
+
+```sql
+SELECT * FROM test_s3load;
+```
+
+Results:
+
+```
+mysql> select * from test_s3load;
++---------+-----------+------+
+| user_id | name      | age  |
++---------+-----------+------+
+|       5 | Ava       |   17 |
+|      10 | Liam      |   64 |
+|       7 | Sophia    |   32 |
+|       9 | Emma      |   37 |
+|       1 | Emily     |   25 |
+|       4 | Alexander |   60 |
+|       2 | Benjamin  |   35 |
+|       3 | Olivia    |   28 |
+|       6 | William   |   69 |
+|       8 | James     |   64 |
++---------+-----------+------+
+10 rows in set (0.04 sec)
+```
+
+## Load with TVF
+
+### Step 1: Prepare the data
+
+Create a CSV file named s3load_example.csv. The file is stored on MinIO, and its content is as follows:
+
+```
+1,Emily,25
+2,Benjamin,35
+3,Olivia,28
+4,Alexander,60
+5,Ava,17
+6,William,69
+7,Sophia,32
+8,James,64
+9,Emma,37
+10,Liam,64
+```
+
+### Step 2: Create a table in Doris
+
+```sql
+CREATE TABLE test_s3load(
+    user_id BIGINT NOT NULL COMMENT "user id",
+    name VARCHAR(20) COMMENT "name",
+    age INT COMMENT "age"
+)
+DUPLICATE KEY(user_id)
+DISTRIBUTED BY HASH(user_id) BUCKETS 10;
+```
+
+### Step 3: Load data using TVF
+
+:::caution Caution
+If MinIO is deployed on a local network without TLS enabled, you need to explicitly add `http://` to the endpoint string.
+
+- `"s3.endpoint" = "http://localhost:9000"`

Review Comment:
```suggestion
- `"s3.endpoint" = "http://localhost:9000"`

The S3 SDK uses the virtual-hosted style by default. However, MinIO does not enable virtual-hosted style access by default.
In this case, we can add the `use_path_style` parameter to force the use of the path style.

- `"use_path_style" = "true"`
```

##########
i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/data-source/minio.md:
##########
@@ -0,0 +1,204 @@
+---
+{
+    "title": "MinIO",
+    "language": "zh-CN"
+}
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+Doris provides two ways to load files from MinIO:
+- Use S3 Load to load MinIO files into Doris, which is an asynchronous load method.
+- Use TVF to load MinIO files into Doris, which is a synchronous load method.
+
+## Load with S3 Load
+
+Use S3 Load to import files on object storage. For detailed steps, please refer to the [Broker Load Manual](../import-way/broker-load-manual.md).
+
+### Step 1: Prepare the data
+
+Create a CSV file named s3load_example.csv. The file is stored on MinIO, and its content is as follows:
+
+```
+1,Emily,25
+2,Benjamin,35
+3,Olivia,28
+4,Alexander,60
+5,Ava,17
+6,William,69
+7,Sophia,32
+8,James,64
+9,Emma,37
+10,Liam,64
+```
+
+### Step 2: Create a table in Doris
+
+```sql
+CREATE TABLE test_s3load(
+    user_id BIGINT NOT NULL COMMENT "user id",
+    name VARCHAR(20) COMMENT "name",
+    age INT COMMENT "age"
+)
+DUPLICATE KEY(user_id)
+DISTRIBUTED BY HASH(user_id) BUCKETS 10;
+```
+
+### Step 3: Load data using S3 Load
+
+:::caution Caution
+If MinIO is deployed on a local network without TLS enabled, you need to explicitly add `http://` to the endpoint string.
+
+- `"s3.endpoint" = "http://localhost:9000"`
+:::
+
+```sql
+LOAD LABEL s3_load_2022_04_01
+(
+    DATA INFILE("s3://your_bucket_name/s3load_example.csv")
+    INTO TABLE test_s3load
+    COLUMNS TERMINATED BY ","
+    FORMAT AS "CSV"
+    (user_id, name, age)
+)
+WITH S3
+(
+    "provider" = "S3",
+    "s3.endpoint" = "play.min.io:9000",
+    "s3.region" = "us-east-1",
+    "s3.access_key" = "myminioadmin",
+    "s3.secret_key" = "minio-secret-key-change-me",
+    "use_path_style" = "true"
+)
+PROPERTIES
+(
+    "timeout" = "3600"
+);
+```
+
+### Step 4: Check the imported data
+
+```sql
+SELECT * FROM test_s3load;
+```
+
+Results:
+
+```
+mysql> select * from test_s3load;
++---------+-----------+------+
+| user_id | name      | age  |
++---------+-----------+------+
+|       5 | Ava       |   17 |
+|      10 | Liam      |   64 |
+|       7 | Sophia    |   32 |
+|       9 | Emma      |   37 |
+|       1 | Emily     |   25 |
+|       4 | Alexander |   60 |
+|       2 | Benjamin  |   35 |
+|       3 | Olivia    |   28 |
+|       6 | William   |   69 |
+|       8 | James     |   64 |
++---------+-----------+------+
+10 rows in set (0.04 sec)
+```
+
+## Load with TVF
+
+### Step 1: Prepare the data
+
+Create a CSV file named s3load_example.csv. The file is stored on MinIO, and its content is as follows:
+
+```
+1,Emily,25
+2,Benjamin,35
+3,Olivia,28
+4,Alexander,60
+5,Ava,17
+6,William,69
+7,Sophia,32
+8,James,64
+9,Emma,37
+10,Liam,64
+```
+
+### Step 2: Create a table in Doris
+
+```sql
+CREATE TABLE test_s3load(
+    user_id BIGINT NOT NULL COMMENT "user id",
+    name VARCHAR(20) COMMENT "name",
+    age INT COMMENT "age"
+)
+DUPLICATE KEY(user_id)
+DISTRIBUTED BY HASH(user_id) BUCKETS 10;
+```
+
+### Step 3: Load data using TVF
+
+:::caution Caution
+If MinIO is deployed on a local network without TLS enabled, you need to explicitly add `http://` to the endpoint string.
+
+- `"s3.endpoint" = "http://localhost:9000"`

Review Comment:
```suggestion
- `"s3.endpoint" = "http://localhost:9000"`

The S3 SDK uses the virtual-hosted style by default. However, MinIO does not enable virtual-hosted style access by default. In this case, we can add the `use_path_style` parameter to force the use of the path style.

- `"use_path_style" = "true"`
```

##########
i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/data-source/minio.md:
##########
@@ -0,0 +1,204 @@
+---
+{
+    "title": "MinIO",
+    "language": "zh-CN"
+}
+---
+
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+Doris provides two ways to load files from MinIO:
+- Use S3 Load to load MinIO files into Doris, which is an asynchronous load method.
+- Use TVF to load MinIO files into Doris, which is a synchronous load method.
+
+## Load with S3 Load
+
+Use S3 Load to import files on object storage. For detailed steps, please refer to the [Broker Load Manual](../import-way/broker-load-manual.md).
+
+### Step 1: Prepare the data
+
+Create a CSV file named s3load_example.csv. The file is stored on MinIO, and its content is as follows:
+
+```
+1,Emily,25
+2,Benjamin,35
+3,Olivia,28
+4,Alexander,60
+5,Ava,17
+6,William,69
+7,Sophia,32
+8,James,64
+9,Emma,37
+10,Liam,64
+```
+
+### Step 2: Create a table in Doris
+
+```sql
+CREATE TABLE test_s3load(
+    user_id BIGINT NOT NULL COMMENT "user id",
+    name VARCHAR(20) COMMENT "name",
+    age INT COMMENT "age"
+)
+DUPLICATE KEY(user_id)
+DISTRIBUTED BY HASH(user_id) BUCKETS 10;
+```
+
+### Step 3: Load data using S3 Load
+
+:::caution Caution
+If MinIO is deployed on a local network without TLS enabled, you need to explicitly add `http://` to the endpoint string.
+
+- `"s3.endpoint" = "http://localhost:9000"`

Review Comment:
```suggestion
- `"s3.endpoint" = "http://localhost:9000"`

The S3 SDK uses the virtual-hosted style by default. However, MinIO does not enable virtual-hosted style access by default. In this case, we can add the `use_path_style` parameter to force the use of the path style.

- `"use_path_style" = "true"`
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
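
To make the addressing-style point from the review comments concrete, here is a minimal Python sketch of where each style places the bucket name in the request URL. The helper `object_url` is hypothetical (it is not part of Doris, MinIO, or any S3 SDK), and falling back to `https://` when the endpoint has no scheme is an assumption made only for illustration.

```python
from urllib.parse import urlsplit


def object_url(endpoint: str, bucket: str, key: str, use_path_style: bool) -> str:
    """Build the URL a client would request for one object (illustrative only)."""
    # Assumption of this sketch: an endpoint without a scheme is treated as HTTPS.
    if "://" not in endpoint:
        endpoint = "https://" + endpoint
    parts = urlsplit(endpoint)
    if use_path_style:
        # Path style: the bucket is the first path segment, the host is unchanged.
        return f"{parts.scheme}://{parts.netloc}/{bucket}/{key}"
    # Virtual-hosted style: the bucket becomes a subdomain of the endpoint host.
    return f"{parts.scheme}://{bucket}.{parts.netloc}/{key}"


if __name__ == "__main__":
    for path_style in (True, False):
        print(object_url("http://localhost:9000", "your_bucket_name",
                         "s3load_example.csv", path_style))
```

In the virtual-hosted form the bucket name is folded into the host name, which a MinIO deployment does not serve unless that access style has been enabled. The path-style form leaves the host untouched, which is the behavior the suggested `"use_path_style" = "true"` setting asks for.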