This is an automated email from the ASF dual-hosted git repository.

jiafengzheng pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/doris-website.git


The following commit(s) were added to refs/heads/master by this push:
     new 4ff38d855e1 Add the actual hive bitmap udf documentation
4ff38d855e1 is described below

commit 4ff38d855e112e239d1d90b97c9ed2e3d7316cd6
Author: jiafeng.zhang <zhang...@gmail.com>
AuthorDate: Thu Aug 18 11:24:09 2022 +0800

    Add the actual hive bitmap udf documentation
    
    Add the actual hive bitmap udf documentation
---
 .../import/import-way/spark-load-manual.md         |   8 +-
 docs/ecosystem/external-table/hive-bitmap-udf.md   |  86 +++++++++++++++++
 .../import/import-way/spark-load-manual.md         |   4 +-
 .../ecosystem/external-table/hive-bitmap-udf.md    | 104 +++++++++++++++++++++
 4 files changed, 195 insertions(+), 7 deletions(-)

diff --git a/docs/data-operate/import/import-way/spark-load-manual.md b/docs/data-operate/import/import-way/spark-load-manual.md
index a1ebce9837b..9da17cc58cd 100644
--- a/docs/data-operate/import/import-way/spark-load-manual.md
+++ b/docs/data-operate/import/import-way/spark-load-manual.md
@@ -132,7 +132,7 @@ In the existing Doris import process, the data structure of global dictionary is
 
 ## Hive Bitmap UDF
 
-Spark supports loading hive-generated bitmap data directly into Doris
+Spark supports loading Hive-generated bitmap data directly into Doris; see the [hive-bitmap-udf documentation](../../../ecosystem/external-table/hive-bitmap-udf).
 
 ## Basic operation
 
@@ -284,22 +284,18 @@ You can use the `USAGE_PRIV` permission is given to a user or a role, and the ro
 
 GRANT USAGE_PRIV ON RESOURCE "spark0" TO "user0"@"%";
 
-
 -- Grant permission to the spark0 resource to role ROLE0
 
 GRANT USAGE_PRIV ON RESOURCE "spark0" TO ROLE "role0";
 
-
 -- Grant permission to all resources to user user0
 
 GRANT USAGE_PRIV ON RESOURCE * TO "user0"@"%";
 
-
 -- Grant permission to all resources to role ROLE0
 
 GRANT USAGE_PRIV ON RESOURCE * TO ROLE "role0";
 
-
 -- Revoke the spark0 resource permission of user user0
 
 REVOKE USAGE_PRIV ON RESOURCE "spark0" FROM "user0"@"%";
@@ -554,6 +550,8 @@ PROPERTIES
 );
 ````
 
+
+
 You can view the detailed syntax for creating a load by entering `help spark load`. This document mainly introduces the meaning of the parameters and the precautions in the create-load syntax of Spark load.
 
 **Label**
diff --git a/docs/ecosystem/external-table/hive-bitmap-udf.md b/docs/ecosystem/external-table/hive-bitmap-udf.md
new file mode 100644
index 00000000000..1402cf1ef69
--- /dev/null
+++ b/docs/ecosystem/external-table/hive-bitmap-udf.md
@@ -0,0 +1,86 @@
+---
+{
+    "title": "Hive Bitmap UDF",
+    "language": "en"
+}
+---
+
+<!-- 
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+  http://www.apache.org/licenses/LICENSE-2.0
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Hive UDF
+
+ Hive Bitmap UDF provides UDFs for generating bitmaps and performing bitmap operations in Hive tables. The bitmap in Hive is exactly the same as the Doris bitmap, and it can be imported into Doris through Spark bitmap load.
+
+ Main purposes:
+  1. Reduce the time needed to import data into Doris by removing steps such as dictionary building and bitmap pre-aggregation;
+  2. Save Hive storage: bitmaps compress the data and reduce storage cost;
+  3. Provide flexible bitmap operations in Hive, such as intersection, union, and difference; the computed bitmap can also be imported directly into Doris.
+
+## How To Use
+
+### Create Bitmap type table in Hive
+
+```sql
+-- Example: Create Hive Bitmap Table
+CREATE TABLE IF NOT EXISTS `hive_bitmap_table`(
+  `k1`   int       COMMENT '',
+  `k2`   String    COMMENT '',
+  `k3`   String    COMMENT '',
+  `uuid` binary    COMMENT 'bitmap'
+) comment  'comment';
+```
+
+### Hive Bitmap UDF Usage:
+
+ Hive Bitmap UDFs are used in Hive/Spark.
+
+```sql
+-- Load the Hive Bitmap UDF jar package (upload the compiled hive-udf jar package to HDFS first)
+add jar hdfs://node:9001/hive-udf-jar-with-dependencies.jar;
+-- Create the Hive Bitmap UDAF functions
+create temporary function to_bitmap as 'org.apache.doris.udf.ToBitmapUDAF';
+create temporary function bitmap_union as 'org.apache.doris.udf.BitmapUnionUDAF';
+-- Create the Hive Bitmap UDF functions
+create temporary function bitmap_count as 'org.apache.doris.udf.BitmapCountUDF';
+create temporary function bitmap_and as 'org.apache.doris.udf.BitmapAndUDF';
+create temporary function bitmap_or as 'org.apache.doris.udf.BitmapOrUDF';
+create temporary function bitmap_xor as 'org.apache.doris.udf.BitmapXorUDF';
+-- Example: generate bitmaps with the to_bitmap function and write them to the Hive Bitmap table
+insert into hive_bitmap_table
+select 
+    k1,
+    k2,
+    k3,
+    to_bitmap(uuid) as uuid
+from 
+    hive_table
+group by 
+    k1,
+    k2,
+    k3;
+-- Example: the bitmap_count function calculates the number of elements in the bitmap
+select k1,k2,k3,bitmap_count(uuid) from hive_bitmap_table;
+-- Example: the bitmap_union function calculates the union of the bitmaps in each group
+select k1,bitmap_union(uuid) from hive_bitmap_table group by k1;
+```
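+
+ The functions `bitmap_and`, `bitmap_or`, and `bitmap_xor` are registered above but not demonstrated. The following is a minimal sketch, assuming each of them takes two bitmap (binary) arguments and returns a bitmap; that signature is an assumption and is not stated in this document, and the `k2` values `'a'` and `'b'` are hypothetical.
+
+```sql
+-- Sketch only: assumes bitmap_and / bitmap_or / bitmap_xor accept two bitmap
+-- (binary) arguments and return a bitmap. The k2 values are hypothetical.
+select
+    a.k1,
+    bitmap_count(bitmap_and(a.uuid, b.uuid)) as in_both,
+    bitmap_count(bitmap_or(a.uuid, b.uuid))  as in_either,
+    bitmap_count(bitmap_xor(a.uuid, b.uuid)) as in_exactly_one
+from
+    (select k1, bitmap_union(uuid) as uuid from hive_bitmap_table where k2 = 'a' group by k1) a
+join
+    (select k1, bitmap_union(uuid) as uuid from hive_bitmap_table where k2 = 'b' group by k1) b
+on a.k1 = b.k1;
+```
+
+ Aggregating each side with `bitmap_union` first leaves one bitmap per `k1`, so the join compares exactly one pair of bitmaps per key.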
+
+### Hive Bitmap UDF Description
+
+## Hive Bitmap import into Doris
+
+ For details, see: Load Data -> Spark Load -> Basic operation -> Create load (Example 3: when the upstream data source is a Hive binary type table)
\ No newline at end of file
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/spark-load-manual.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/spark-load-manual.md
index 443bac0f99f..762a6790ab4 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/spark-load-manual.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/data-operate/import/import-way/spark-load-manual.md
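@@ -107,7 +107,7 @@ The execution of a Spark load task is divided into the following 5 stages.
 
 ## Hive Bitmap UDF
 
-Spark supports importing bitmap data generated by Hive directly into Doris.
+Spark supports importing bitmap data generated by Hive directly into Doris. For details, see the [hive-bitmap-udf documentation](../../../ecosystem/external-table/hive-bitmap-udf)
 
 ## Basic operation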
 
@@ -200,7 +200,7 @@ PROPERTIES
   "spark.submit.deployMode" = "client",
   "working_dir" = "hdfs://127.0.0.1:10000/tmp/doris",
   "broker" = "broker1"
-)
+);
 ```
 
 **Spark Load supports Kerberos authentication**
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/ecosystem/external-table/hive-bitmap-udf.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/ecosystem/external-table/hive-bitmap-udf.md
new file mode 100644
index 00000000000..30749ca00bf
--- /dev/null
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/ecosystem/external-table/hive-bitmap-udf.md
@@ -0,0 +1,104 @@
+---
+{
+    "title": "Hive Bitmap UDF",
+    "language": "zh-CN"
+}
+---
+
+<!-- 
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements.  See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership.  The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied.  See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Hive UDF
+
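+ Hive Bitmap UDF provides UDFs for generating bitmaps and performing bitmap operations in Hive tables. The bitmap in Hive is exactly the same as the Doris bitmap, and it can be imported into Doris through Spark bitmap load.
+
+ Main purposes:
+  1. Reduce the time needed to import data into Doris by removing steps such as dictionary building and bitmap pre-aggregation;
+  2. Save Hive storage: bitmaps compress the data and reduce storage cost;
+  3. Provide flexible bitmap operations in Hive, such as intersection, union, and difference; the computed bitmap can also be imported directly into Doris;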
+
+## How To Use
+
+### Create Bitmap type table in Hive
+
+```sql
+
+-- Example: create the Hive Bitmap table
+CREATE TABLE IF NOT EXISTS `hive_bitmap_table`(
+  `k1`   int       COMMENT '',
+  `k2`   String    COMMENT '',
+  `k3`   String    COMMENT '',
+  `uuid` binary    COMMENT 'bitmap'
+) comment  'comment';
+
+-- Example: create an ordinary Hive table
+CREATE TABLE IF NOT EXISTS `hive_table`(
+    `k1`   int       COMMENT '',
+    `k2`   String    COMMENT '',
+    `k3`   String    COMMENT '',
+    `uuid` int       COMMENT ''
+) comment  'comment';
+```
+
+### Hive Bitmap UDF Usage:
+
+ Hive Bitmap UDFs need to be used in Hive/Spark
+
+```sql
+
+-- Load the Hive Bitmap UDF jar package (the compiled hive-udf jar package must be uploaded to HDFS)
+add jar hdfs://node:9001/hive-udf-jar-with-dependencies.jar;
+
+-- Create the UDAF functions
+create temporary function to_bitmap as 'org.apache.doris.udf.ToBitmapUDAF';
+create temporary function bitmap_union as 'org.apache.doris.udf.BitmapUnionUDAF';
+
+-- Create the UDF functions
+create temporary function bitmap_count as 'org.apache.doris.udf.BitmapCountUDF';
+create temporary function bitmap_and as 'org.apache.doris.udf.BitmapAndUDF';
+create temporary function bitmap_or as 'org.apache.doris.udf.BitmapOrUDF';
+create temporary function bitmap_xor as 'org.apache.doris.udf.BitmapXorUDF';
+
+-- Example: generate bitmaps with to_bitmap and write them to the Hive Bitmap table
+insert into hive_bitmap_table
+select 
+    k1,
+    k2,
+    k3,
+    to_bitmap(uuid) as uuid
+from 
+    hive_table
+group by 
+    k1,
+    k2,
+    k3;
+
+-- Example: bitmap_count calculates the number of elements in the bitmap
+select k1,k2,k3,bitmap_count(uuid) from hive_bitmap_table;
+
+-- Example: bitmap_union calculates the union of the bitmaps in each group
+select k1,bitmap_union(uuid) from hive_bitmap_table group by k1;
+
+```
+
+### Hive Bitmap UDF Description
+
+## Hive Bitmap import into Doris
+
+ For details, see: Load Data -> Spark Load -> Basic operation -> Create load (Example 3: when the upstream data source is a Hive binary type table)
\ No newline at end of file

