kangpinghuang opened a new issue #2967: add a test for different encoding
URL: https://github.com/apache/incubator-doris/issues/2967

I added a test for encoding in different situations. I generated 100 million ints in four patterns: sequence, random, small step, and big step. The original data size is as follows:

sequence | random | small_step | big_step
-- | -- | -- | --
848 | 1000M | 859 | 859

- tests

I tested the following encoding methods: alpha, beta_bitshuffle, beta_for (frame of reference), and beta_rle. The results are as follows:

1. space

The following is the space used after encoding the 100 million ints.

Unit: KB | sequence | random | small_step | big_step
-- | -- | -- | -- | --
alpha | 2865.152 | 104420.4 | 2108.416 | 2224.128
beta_bitshuffle | 4094.976 | 143268.9 | 1682.432 | 2679.808
beta_for | 4582.4 | 94251.01 | 797.3325 | 956.233728
beta_rle | 818.0101 | 10342.4 | 778.4581 | 783.970304

[graph: space after encoding]

2. query time cost for count(*)

The time is the 95th percentile time cost.

Unit: ms | sequence | random | small_step | big_step
-- | -- | -- | -- | --
alpha | 7399.1 | 5416.48 | 6231 | 5372.88
beta_bitshuffle | 14342 | 12059.82 | 9186.91 | 8817.78
beta_for | 8752.04 | 11379.43 | 12403.98 | 8415.49
beta_rle | 8544.95 | 8614.29 | 9299.58 | 8295.44

[graph: query time cost for count(*)]

3. query time cost for point query

select count(*) from table where id = xxx;

encoding | sequence | random | small_step | big_step
-- | -- | -- | -- | --
alpha | 8.3 | 8.66 | 477.26 | 10.73
beta_bitshuffle | 9.65 | 9.86 | 413.63 | 10.91
beta_for | 25.3 | 29.98 | 401.32 | 30.13
beta_rle | 8.65 | 9.06 | 398.92 | 10.86

[graph: query time cost for point query]

- conclusion

beta_rle achieves the best space efficiency in every situation, compared with both the other beta encodings and the alpha encoding. The query performance of beta_rle is the best among the Segment V2 encodings, but it is somewhat worse than the alpha encoding.
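The issue does not say how the four data patterns were generated, so the following is only a minimal sketch of one possible setup, not the actual test code. The function names, the step ranges (1 to 4 for small step, 1 to 1000 for big step), and the block size of 128 are all assumptions. It also includes a toy frame-of-reference size estimate, which is not the Doris beta_for implementation, to show how the value pattern can affect the bits needed per value.

```python
import random


def generate(kind, n, seed=0):
    """Return n ints following one of four hypothetical patterns."""
    rng = random.Random(seed)
    if kind == "sequence":
        return list(range(n))
    if kind == "random":
        return [rng.randrange(0, 2**31) for _ in range(n)]
    if kind == "small_step":
        step_max = 4        # assumed step range, not from the issue
    elif kind == "big_step":
        step_max = 1000     # assumed step range, not from the issue
    else:
        raise ValueError(f"unknown pattern: {kind}")
    values, current = [], 0
    for _ in range(n):
        current += rng.randint(1, step_max)  # strictly increasing values
        values.append(current)
    return values


def for_bits_per_value(values, block_size=128):
    """Average bits per value for a toy frame-of-reference encoding.

    Each block stores its minimum as a 32-bit base, then every value as
    (value - base) packed with the smallest fixed bit width that fits.
    """
    total_bits = 0
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        base = min(block)
        width = max(v - base for v in block).bit_length()
        total_bits += 32 + width * len(block)
    return total_bits / len(values)


if __name__ == "__main__":
    for kind in ("sequence", "random", "small_step", "big_step"):
        data = generate(kind, 1024)
        print(kind, for_bits_per_value(data))
```

Under these assumptions, the sequence and step patterns need far fewer bits per value than the random pattern, because the offsets inside each block stay small. The measured sizes in the tables above come from the real encoders and will not match this estimate.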