w517424787 opened a new issue, #22528:
URL: https://github.com/apache/doris/issues/22528

   ### Search before asking
   
   - [X] I had searched in the 
[issues](https://github.com/apache/doris/issues?q=is%3Aissue) and found no 
similar issues.
   
   
   ### Description
   
   1. Doris版本:2.0-beta
   2. 环境配置:1FE + 3BE,8C 32G
   3. 建表语句:
   `
   CREATE TABLE test.test10 
   (
     stat_date varchar(256) NULL,
     region varchar(256) NULL,
     cmp_city_branch_type varchar(256) NULL,
     cmp_city varchar(256) NULL,
     city_level varchar(256) NULL
   ) ENGINE=OLAP
   DUPLICATE KEY(stat_date, region, cmp_city_branch_type)
   COMMENT 'OLAP'
   DISTRIBUTED BY HASH(stat_date) BUCKETS 6
   PROPERTIES (
   "replication_allocation" = "tag.location.default: 3",
   "storage_format" = "V2",
   "disable_auto_compaction" = "false",
   "enable_single_replica_compaction" = "false"
   );
   `
   
   4. 手动插入千万级数据测试
   
   5. 查询简单SQL性能对比
   SQL1,分布键值传 string:
   SELECT count(stat_date) as cnt FROM test.test10 WHERE stat_date = '20200601';
   耗时:13ms
   
   SQL2,分布键值传 int:
   SELECT count(stat_date) as cnt FROM test.test10 WHERE stat_date = 20200601;
   耗时:134ms
   
   6. 两种SQL explain对比
   SQL1:
   -------------主要位置---------------
     0:VOlapScanNode
        TABLE: default_cluster:test.test10(test10), PREAGGREGATION: ON
        PREDICATES: `stat_date` = '20200601'
        **partitions=1/1, tablets=1/6, tabletList=18152**
        cardinality=12804193, avgRowSize=0.5953839, numNodes=3
   
   SQL2:
   -------------主要位置---------------
     0:VOlapScanNode
        TABLE: default_cluster:test.test10(test10), PREAGGREGATION: ON
        PREDICATES: `stat_date` = 20200601
        **partitions=1/1, tablets=6/6, tabletList=18140,18144,18148 ...**
        cardinality=12804193, avgRowSize=3.3233776, numNodes=3
   
   8. 分析原因:
   当分布键值传string类型时,能快速过滤出tabletList,而传 
int类型值,无法过滤出具体的tabletList,等于全表扫描,随着数据量级不同,查询性能差异越大。
   
   
   ### Solution
   
   希望针对分布键是string类型,但字段存储的是 int值,where条件过滤传 int 和 string 值,查询耗时能一样!
   
   目前当分布键是 int类型,where条件过滤传 int 和 string 值,查询耗时一样。
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@doris.apache.org
For additional commands, e-mail: commits-h...@doris.apache.org

Reply via email to