EmmyMiao87 opened a new issue #3552: URL: https://github.com/apache/incubator-doris/issues/3552
# Support Bitmap Intersect

Support the aggregate function `bitmap_intersect`, which is mainly used to compute the intersection of grouped data.

# bitmap_intersect

Calculates the intersection of bitmap columns and returns a bitmap object.

```
bitmap_intersect(expr)
```

**Parameters**

The type of the `expr` column must be bitmap.

**Return value**

A bitmap object.

**Example**

Table schema:

```
create table bitmap_intersect_test (
    tag varchar(20),
    user_id bitmap bitmap_union
)
AGGREGATE KEY(tag)
DISTRIBUTED BY HASH(tag) BUCKETS 3;
```

Query the users that have all three tags a, b, and c at the same time:

```
select bitmap_to_string(bitmap_intersect(user_id)) from (
    select bitmap_union(user_id) user_id
    from bitmap_intersect_test
    where tag in ('a', 'b', 'c')
    group by tag
) a
```

# Design

## Semantic analysis

The child type of `bitmap_intersect` must be bitmap.

```
class FunctionCallExpr {
    void analyze() {
        if (fnName.equals("bitmap_intersect")) {
            ...
            if (!fn.getChild(0).isBitmapType()) {
                throw new AnalysisException("the child type of " + fnName + " must be bitmap");
            }
            ...
        }
    }
}
```

## Function implementation

The function for each stage of `bitmap_intersect` is declared in the `function set`.
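
To picture how the per-stage functions fit together, here is a minimal stand-alone sketch. It is not the Doris code: `ToyBitmap`, `IntersectState`, `toy_init`, `toy_intersect` and `toy_finalize` are hypothetical names, and a `std::set` stands in for `BitmapValue`. The `seen` flag is an assumption of the sketch. It is there because intersecting into a freshly initialised, empty set would always give an empty result, and the issue does not say how the real init function deals with that.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <utility>

// Hypothetical stand-in for BitmapValue: an ordered set of user ids.
using ToyBitmap = std::set<uint64_t>;

// Hypothetical aggregate state. `seen` is an assumption of this sketch:
// the first input is copied, later inputs are intersected into it.
struct IntersectState {
    bool seen = false;
    ToyBitmap value;
};

// init: start from an empty state that has not seen any input yet.
IntersectState toy_init() { return IntersectState{}; }

// update / merge: fold one more bitmap into the state.
void toy_intersect(IntersectState& dst, const ToyBitmap& src) {
    if (!dst.seen) {
        dst.value = src;
        dst.seen = true;
        return;
    }
    ToyBitmap out;
    for (uint64_t v : dst.value) {
        if (src.count(v) != 0) {
            out.insert(v);
        }
    }
    dst.value = std::move(out);
}

// finalize: hand back the accumulated intersection.
ToyBitmap toy_finalize(const IntersectState& state) { return state.value; }
```

Feeding the sets {1, 2, 3}, {2, 3, 4} and {2, 3, 9} through `toy_intersect` leaves {2, 3} in the state, which is the behaviour the example query relies on.
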

**Function definition**

```
FunctionName: bitmap_intersect, InputType: bitmap, OutputType: bitmap, IntermediateType: varchar
```

**init**

Directly reuse the current bitmap init function.

```
"_ZN5doris15BitmapFunctions11bitmap_initEPN9doris_udf15FunctionContextEPNS1_9StringValE"
```

**update** **merge**

Perform the intersection calculation on the bitmaps grouped on the current node.

```
void BitmapFunctions::bitmap_intersect(FunctionContext* ctx, const StringVal& src, StringVal* dst) {
    if (src.is_null) {
        return;
    }
    auto dst_bitmap = reinterpret_cast<BitmapValue*>(dst->ptr);
    // zero size means the src input is an agg object
    if (src.len == 0) {
        (*dst_bitmap) &= *reinterpret_cast<BitmapValue*>(src.ptr);
    } else {
        (*dst_bitmap) &= BitmapValue((char*) src.ptr);
    }
}
```

**serialize** **finalize**

Directly reuse the current bitmap serialization function.

```
"_ZN5doris15BitmapFunctions16bitmap_serializeEPN9doris_udf15FunctionContextERKNS1_9StringValE",
```

**Query plan**

```
mysql> explain select bitmap_intersect(user_id) from (select bitmap_union(user_id) user_id from bitmap_intersect_test where tag in ('a', 'b', 'c') group by tag ) a;
+----------------------------------------------------------------------------------------+
| Explain String                                                                         |
+----------------------------------------------------------------------------------------+
| PLAN FRAGMENT 0                                                                        |
|  OUTPUT EXPRS:<slot 8>                                                                 |
|   PARTITION: UNPARTITIONED                                                             |
|                                                                                        |
|   RESULT SINK                                                                          |
|                                                                                        |
|   6:AGGREGATE (merge finalize)                                                         |
|   |  output: bitmap_intersect(<slot 7>)                                                |
|   |  group by:                                                                         |
|   |  tuple ids: 5                                                                      |
|   |                                                                                    |
|   5:EXCHANGE                                                                           |
|      tuple ids: 4                                                                      |
|                                                                                        |
| PLAN FRAGMENT 1                                                                        |
|  OUTPUT EXPRS:                                                                         |
|   PARTITION: HASH_PARTITIONED: <slot 2>                                                |
|                                                                                        |
|   STREAM DATA SINK                                                                     |
|     EXCHANGE ID: 05                                                                    |
|     UNPARTITIONED                                                                      |
|                                                                                        |
|   2:AGGREGATE (update serialize)                                                       |
|   |  output: bitmap_intersect(<slot 5>)                                                |
|   |  group by:                                                                         |
|   |  tuple ids: 4                                                                      |
|   |                                                                                    |
|   4:AGGREGATE (merge finalize)                                                         |
|   |  output: bitmap_union(<slot 3>)                                                    |
|   |  group by: <slot 2>                                                                |
|   |  tuple ids: 2                                                                      |
|   |                                                                                    |
|   3:EXCHANGE                                                                           |
|      tuple ids: 1                                                                      |
|                                                                                        |
| PLAN FRAGMENT 2                                                                        |
|  OUTPUT EXPRS:                                                                         |
|   PARTITION: RANDOM                                                                    |
|                                                                                        |
|   STREAM DATA SINK                                                                     |
|     EXCHANGE ID: 03                                                                    |
|     HASH_PARTITIONED: <slot 2>                                                         |
|                                                                                        |
|   1:AGGREGATE (update serialize)                                                       |
|   |  STREAMING                                                                         |
|   |  output: bitmap_union(`user_id`)                                                   |
|   |  group by: `tag`                                                                   |
|   |  tuple ids: 1                                                                      |
|   |                                                                                    |
|   0:OlapScanNode                                                                       |
|      TABLE: bitmap_intersect_test                                                      |
|      PREAGGREGATION: ON                                                                |
|      PREDICATES: `tag` IN ('a', 'b', 'c')                                              |
|      partitions=1/1                                                                    |
|      rollup: bitmap_intersect_test                                                     |
|      tabletRatio=100/100                                                               |
|      numNodes=6                                                                        |
|      tuple ids: 0                                                                      |
+----------------------------------------------------------------------------------------+
```
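
The query plan above evaluates the intersection in two phases: each node intersects the per-tag bitmaps it holds (`update serialize`), and a single node then intersects those partial results (`merge finalize`). The sketch below illustrates why that gives the same answer as one global intersection. It is an illustration only: `UserSet`, `Row`, `union_by_tag`, `intersect_pair`, `intersect_all` and `two_phase_intersect` are hypothetical names, a `std::set` stands in for the bitmap type, and skipping nodes that hold no group is an assumption of the sketch rather than documented Doris behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

using UserSet = std::set<uint64_t>;            // hypothetical stand-in for a bitmap
using Row = std::pair<std::string, uint64_t>;  // (tag, user_id)

// Inner query: bitmap_union(user_id) ... group by tag.
std::map<std::string, UserSet> union_by_tag(const std::vector<Row>& rows) {
    std::map<std::string, UserSet> groups;
    for (const Row& row : rows) {
        groups[row.first].insert(row.second);
    }
    return groups;
}

// Intersection of two toy bitmaps.
UserSet intersect_pair(const UserSet& a, const UserSet& b) {
    UserSet out;
    for (uint64_t v : a) {
        if (b.count(v) != 0) {
            out.insert(v);
        }
    }
    return out;
}

// Intersection of a list of toy bitmaps; an empty list yields an empty set.
UserSet intersect_all(const std::vector<UserSet>& bitmaps) {
    if (bitmaps.empty()) {
        return {};
    }
    UserSet acc = bitmaps.front();
    for (std::size_t i = 1; i < bitmaps.size(); ++i) {
        acc = intersect_pair(acc, bitmaps[i]);
    }
    return acc;
}

// Two-phase evaluation: every node intersects its own groups, then the
// partial results are intersected once more. Nodes without any group are
// skipped (an assumption of this sketch), because an empty partial result
// would otherwise wipe out the final answer.
UserSet two_phase_intersect(const std::vector<std::vector<UserSet>>& groups_per_node) {
    std::vector<UserSet> partials;
    for (const std::vector<UserSet>& node_groups : groups_per_node) {
        if (!node_groups.empty()) {
            partials.push_back(intersect_all(node_groups));
        }
    }
    return intersect_all(partials);
}
```

Because set intersection is associative and commutative, it does not matter which node a given tag lands on after the hash partitioning: the merged result matches the single-pass intersection.
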