This is an automated email from the ASF dual-hosted git repository.

morningman pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git
The following commit(s) were added to refs/heads/master by this push:
     new 05df3e29dc6 [feat](openx-json) add openx json doc (#2281)
05df3e29dc6 is described below

commit 05df3e29dc6a9afe99648c67916de38fa87cca96
Author: daidai <2017501...@qq.com>
AuthorDate: Thu Apr 24 16:38:28 2025 +0800

    [feat](openx-json) add openx json doc (#2281)

    ## Versions

    - [x] dev
    - [ ] 3.0
    - [ ] 2.1
    - [ ] 2.0

    ## Languages

    - [x] Chinese
    - [ ] English

    ## Docs Checklist

    - [ ] Checked by AI
    - [ ] Test Cases Built

    ---------

    Co-authored-by: morningman <yun...@selectdb.com>
---
 docs/lakehouse/file-formats/text.md                | 19 +++++++++++++++----
 .../current/lakehouse/file-formats/text.md         | 19 +++++++++++++++----
 2 files changed, 30 insertions(+), 8 deletions(-)

diff --git a/docs/lakehouse/file-formats/text.md b/docs/lakehouse/file-formats/text.md
index a6a40f9d92f..e94d5004336 100644
--- a/docs/lakehouse/file-formats/text.md
+++ b/docs/lakehouse/file-formats/text.md
@@ -57,13 +57,24 @@ This document introduces the support for reading and writing text file formats i
 
 ## JSON
 
-* Catalog
+### Catalog
 
-    Supports reading Hive tables in the `org.apache.hive.hcatalog.data.JsonSerDe` format. (Supported from version 3.0.4)
+- Hive table in `org.apache.hive.hcatalog.data.JsonSerDe` format (supported since version 3.0.4)
 
-* Import
+    1. Supports both primitive and complex types.
+    2. Does not support the `timestamp.formats` SERDEPROPERTIES.
+
+- Hive table in [`org.openx.data.jsonserde.JsonSerDe`](https://github.com/rcongiu/Hive-JSON-Serde) format (supported since version 3.0.6)
+
+    1. Supports both primitive and complex types.
+    2. SERDEPROPERTIES: Only [`ignore.malformed.json`](https://github.com/rcongiu/Hive-JSON-Serde?tab=readme-ov-file#importing-malformed-data) is supported and behaves the same as in this JsonSerDe. Other SERDEPROPERTIES are not effective.
+    3. Does not support [`Using Arrays`](https://github.com/rcongiu/Hive-JSON-Serde?tab=readme-ov-file#using-arrays) (similar to Text/CSV format, where all column data is placed into a single array).
+    4. Does not support [`Promoting a Scalar to an Array`](https://github.com/rcongiu/Hive-JSON-Serde?tab=readme-ov-file#promoting-a-scalar-to-an-array) (promoting a scalar to a single-element array).
+    5. By default, Doris can correctly recognize the table schema. However, due to the lack of support for certain parameters, automatic schema recognition might fail. In this case, you can set `read_hive_json_in_one_column = true` to place the entire JSON row into the first column to ensure the original data is fully read. Users can then process it manually. This feature requires the first column's data type to be `String`.
+
+### Import
 
-    Import functionality supports JSON formats. See the import documentation for details.
+Import functionality supports JSON formats. See the import documentation for details.
 
 ## Character Set
 
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/file-formats/text.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/file-formats/text.md
index 6066b1870d0..009289ea080 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/file-formats/text.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/lakehouse/file-formats/text.md
@@ -64,13 +64,24 @@ under the License.
 
 ## JSON
 
-* Catalog
+### Catalog
 
-    支持读取 `org.apache.hive.hcatalog.data.JsonSerDe` 格式的 Hive 表。(3.0.4 版本支持)
+- `org.apache.hive.hcatalog.data.JsonSerDe` 格式的 Hive 表(自3.0.4 版本支持)
 
-* 导入
+    1. 支持普通类型和复杂类型。
+    2. 不支持 `timestamp.formats` SERDEPROPERTIES
+
+- [`org.openx.data.jsonserde.JsonSerDe`](https://github.com/rcongiu/Hive-JSON-Serde) 格式的 Hive 表(自3.0.6 版本支持)
+
+    1. 支持普通类型和复杂类型。
+    2. SERDEPROPERTIES: 只支持 [`ignore.malformed.json`](https://github.com/rcongiu/Hive-JSON-Serde?tab=readme-ov-file#importing-malformed-data) 且行为与该 JsonSerDe 一致, 其他 SERDEPROPERTIES 不生效。
+    3. 不支持[`Using Arrays`](https://github.com/rcongiu/Hive-JSON-Serde?tab=readme-ov-file#using-arrays) (类似于 Text/CSV, 将所有列的数据放一个数组中)。
+    4. 不支持[`Promoting a Scalar to an Array`](https://github.com/rcongiu/Hive-JSON-Serde?tab=readme-ov-file#promoting-a-scalar-to-an-array) (提升标量返回一个的单元素数组)。
+    5. 默认情况下,Doris 会正常识别表的 Schema。但因为某些特殊参数不支持,可能导致自动识别 Schema 失败。此时可以通过`set read_hive_json_in_one_column = true`, 将一整行 Json 数据都放到第一列中,这样可以确保原始数据被完整读取,用户可以自行处理。该功能要求第一列的数据类型为 String.
+
+### 导入
 
-    导入功能支持的 JSON 格式,详见导入相关文档。
+导入功能支持的 JSON 格式,详见导入相关文档。
 
 ## 字符集
 
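The documentation added in this commit says that with `read_hive_json_in_one_column = true` each whole JSON row lands in the first `String` column and must then be processed by the user. What follows is only a rough sketch of that manual step, assuming the rows have already been fetched client-side as plain strings. The sample rows, the `tags` field, and the helper name `parse_raw_row` are hypothetical. The scalar-to-array promotion is done by hand here because the doc states Doris does not perform it, and returning `None` for a malformed row is a choice of this sketch, not a description of how `ignore.malformed.json` behaves.

```python
import json
from typing import Optional


def parse_raw_row(raw: str, array_fields: tuple = ("tags",)) -> Optional[dict]:
    """Parse one raw JSON row string; return None if it is not a valid JSON object.

    `array_fields` (hypothetical) names the fields declared as arrays in the
    table schema, so that a bare scalar can be wrapped into a one-element list.
    """
    try:
        row = json.loads(raw)
    except json.JSONDecodeError:
        # Sketch-level policy: signal a malformed row with None.
        return None
    if not isinstance(row, dict):
        return None
    for field in array_fields:
        value = row.get(field)
        # Promote a non-null scalar to a single-element array by hand.
        if value is not None and not isinstance(value, list):
            row[field] = [value]
    return row


# Hypothetical contents of the first (String) column, one entry per row.
raw_rows = [
    '{"id": 1, "tags": ["a", "b"]}',
    '{"id": 2, "tags": "c"}',
    '{"id": 3, "tags": ',
]
parsed = [parse_raw_row(r) for r in raw_rows]
```

Keeping the raw string as the unit of work means no data is lost at read time; any schema decisions, such as which fields to treat as arrays, are deferred to this post-processing step.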