richardstartin opened a new issue #7795: URL: https://github.com/apache/pinot/issues/7795
LZ4 decompression is consistently faster than Snappy, roughly 2x in the read benchmarks below, at the cost of slightly larger files (about 3 to 5% in these runs). These results are taken from a benchmark over raw string sentences composed of words from Wikipedia:

```
Benchmark                                  (_chunkCompressionType)  (_distribution)      (_maxChunkSize)  (_records)  Mode  Cnt           Score       Error  Units
BenchmarkRawForwardIndexReader.readV3      SNAPPY                   UNIFORM(1000,10000)  1048576          100000      avgt  5          4472.955  ±   45.246  ms/op
BenchmarkRawForwardIndexReader.readV3      SNAPPY                   EXP(0.001)           1048576          100000      avgt  5           861.179  ±   75.607  ms/op
BenchmarkRawForwardIndexReader.readV3      LZ4                      UNIFORM(1000,10000)  1048576          100000      avgt  5          2178.528  ±   46.808  ms/op
BenchmarkRawForwardIndexReader.readV3      LZ4                      EXP(0.001)           1048576          100000      avgt  5           360.927  ±   12.732  ms/op
BenchmarkRawForwardIndexReader.readV3      ZSTANDARD                UNIFORM(1000,10000)  1048576          100000      avgt  5          4116.442  ±   88.894  ms/op
BenchmarkRawForwardIndexReader.readV3      ZSTANDARD                EXP(0.001)           1048576          100000      avgt  5           789.733  ±   35.641  ms/op
BenchmarkRawForwardIndexReader.readV4      SNAPPY                   UNIFORM(1000,10000)  1048576          100000      avgt  5          4471.859  ±   55.049  ms/op
BenchmarkRawForwardIndexReader.readV4      SNAPPY                   EXP(0.001)           1048576          100000      avgt  5           791.099  ±    3.990  ms/op
BenchmarkRawForwardIndexReader.readV4      LZ4                      UNIFORM(1000,10000)  1048576          100000      avgt  5          2096.095  ±   57.949  ms/op
BenchmarkRawForwardIndexReader.readV4      LZ4                      EXP(0.001)           1048576          100000      avgt  5           344.592  ±    3.445  ms/op
BenchmarkRawForwardIndexReader.readV4      ZSTANDARD                UNIFORM(1000,10000)  1048576          100000      avgt  5          4136.956  ±   98.780  ms/op
BenchmarkRawForwardIndexReader.readV4      ZSTANDARD                EXP(0.001)           1048576          100000      avgt  5           742.575  ±   23.906  ms/op
BenchmarkRawForwardIndexWriter.writeV3     SNAPPY                   UNIFORM(1000,10000)  1048576          100000      ss    5         71012.041  ±  425.849  ms/op
BenchmarkRawForwardIndexWriter.writeV3:b   SNAPPY                   UNIFORM(1000,10000)  1048576          100000      ss    5    7560214965.000                  #
BenchmarkRawForwardIndexWriter.writeV3:kb  SNAPPY                   UNIFORM(1000,10000)  1048576          100000      ss    5       7383020.000                  #
BenchmarkRawForwardIndexWriter.writeV3:mb  SNAPPY                   UNIFORM(1000,10000)  1048576          100000      ss    5          7205.000                  #
BenchmarkRawForwardIndexWriter.writeV3     SNAPPY                   EXP(0.001)           1048576          100000      ss    5          9968.552  ±  785.299  ms/op
BenchmarkRawForwardIndexWriter.writeV3:b   SNAPPY                   EXP(0.001)           1048576          100000      ss    5    1387187650.000                  #
BenchmarkRawForwardIndexWriter.writeV3:kb  SNAPPY                   EXP(0.001)           1048576          100000      ss    5       1354675.000                  #
BenchmarkRawForwardIndexWriter.writeV3:mb  SNAPPY                   EXP(0.001)           1048576          100000      ss    5          1320.000                  #
BenchmarkRawForwardIndexWriter.writeV3     LZ4                      UNIFORM(1000,10000)  1048576          100000      ss    5         72593.624  ± 7111.796  ms/op
BenchmarkRawForwardIndexWriter.writeV3:b   LZ4                      UNIFORM(1000,10000)  1048576          100000      ss    5    7801600700.000                  #
BenchmarkRawForwardIndexWriter.writeV3:kb  LZ4                      UNIFORM(1000,10000)  1048576          100000      ss    5       7618750.000                  #
BenchmarkRawForwardIndexWriter.writeV3:mb  LZ4                      UNIFORM(1000,10000)  1048576          100000      ss    5          7440.000                  #
BenchmarkRawForwardIndexWriter.writeV3     LZ4                      EXP(0.001)           1048576          100000      ss    5         10565.451  ±  473.811  ms/op
BenchmarkRawForwardIndexWriter.writeV3:b   LZ4                      EXP(0.001)           1048576          100000      ss    5    1458628405.000                  #
BenchmarkRawForwardIndexWriter.writeV3:kb  LZ4                      EXP(0.001)           1048576          100000      ss    5       1424440.000                  #
BenchmarkRawForwardIndexWriter.writeV3:mb  LZ4                      EXP(0.001)           1048576          100000      ss    5          1390.000                  #
BenchmarkRawForwardIndexWriter.writeV3     ZSTANDARD                UNIFORM(1000,10000)  1048576          100000      ss    5         48690.818  ± 2852.004  ms/op
BenchmarkRawForwardIndexWriter.writeV3:b   ZSTANDARD                UNIFORM(1000,10000)  1048576          100000      ss    5    5010074735.000                  #
BenchmarkRawForwardIndexWriter.writeV3:kb  ZSTANDARD                UNIFORM(1000,10000)  1048576          100000      ss    5       4892650.000                  #
BenchmarkRawForwardIndexWriter.writeV3:mb  ZSTANDARD                UNIFORM(1000,10000)  1048576          100000      ss    5          4775.000                  #
BenchmarkRawForwardIndexWriter.writeV3     ZSTANDARD                EXP(0.001)           1048576          100000      ss    5          8096.877  ±  806.071  ms/op
BenchmarkRawForwardIndexWriter.writeV3:b   ZSTANDARD                EXP(0.001)           1048576          100000      ss    5     967798195.000                  #
BenchmarkRawForwardIndexWriter.writeV3:kb  ZSTANDARD                EXP(0.001)           1048576          100000      ss    5        945115.000                  #
BenchmarkRawForwardIndexWriter.writeV3:mb  ZSTANDARD                EXP(0.001)           1048576          100000      ss    5           920.000                  #
BenchmarkRawForwardIndexWriter.writeV4     SNAPPY                   UNIFORM(1000,10000)  1048576          100000      ss    5         16158.218  ±  363.814  ms/op
BenchmarkRawForwardIndexWriter.writeV4:b   SNAPPY                   UNIFORM(1000,10000)  1048576          100000      ss    5    7551233800.000                  #
BenchmarkRawForwardIndexWriter.writeV4:kb  SNAPPY                   UNIFORM(1000,10000)  1048576          100000      ss    5       7374250.000                  #
BenchmarkRawForwardIndexWriter.writeV4:mb  SNAPPY                   UNIFORM(1000,10000)  1048576          100000      ss    5          7200.000                  #
BenchmarkRawForwardIndexWriter.writeV4     SNAPPY                   EXP(0.001)           1048576          100000      ss    5          2914.195  ±   81.574  ms/op
BenchmarkRawForwardIndexWriter.writeV4:b   SNAPPY                   EXP(0.001)           1048576          100000      ss    5    1367008240.000                  #
BenchmarkRawForwardIndexWriter.writeV4:kb  SNAPPY                   EXP(0.001)           1048576          100000      ss    5       1334965.000                  #
BenchmarkRawForwardIndexWriter.writeV4:mb  SNAPPY                   EXP(0.001)           1048576          100000      ss    5          1300.000                  #
BenchmarkRawForwardIndexWriter.writeV4     LZ4                      UNIFORM(1000,10000)  1048576          100000      ss    5          9818.165  ±  490.165  ms/op
BenchmarkRawForwardIndexWriter.writeV4:b   LZ4                      UNIFORM(1000,10000)  1048576          100000      ss    5    7785795405.000                  #
BenchmarkRawForwardIndexWriter.writeV4:kb  LZ4                      UNIFORM(1000,10000)  1048576          100000      ss    5       7603315.000                  #
BenchmarkRawForwardIndexWriter.writeV4:mb  LZ4                      UNIFORM(1000,10000)  1048576          100000      ss    5          7425.000                  #
BenchmarkRawForwardIndexWriter.writeV4     LZ4                      EXP(0.001)           1048576          100000      ss    5          1765.996  ±   77.316  ms/op
BenchmarkRawForwardIndexWriter.writeV4:b   LZ4                      EXP(0.001)           1048576          100000      ss    5    1410988895.000                  #
BenchmarkRawForwardIndexWriter.writeV4:kb  LZ4                      EXP(0.001)           1048576          100000      ss    5       1377915.000                  #
BenchmarkRawForwardIndexWriter.writeV4:mb  LZ4                      EXP(0.001)           1048576          100000      ss    5          1345.000                  #
BenchmarkRawForwardIndexWriter.writeV4     ZSTANDARD                UNIFORM(1000,10000)  1048576          100000      ss    5         18359.873  ±  380.187  ms/op
BenchmarkRawForwardIndexWriter.writeV4:b   ZSTANDARD                UNIFORM(1000,10000)  1048576          100000      ss    5    4964714505.000                  #
BenchmarkRawForwardIndexWriter.writeV4:kb  ZSTANDARD                UNIFORM(1000,10000)  1048576          100000      ss    5       4848350.000                  #
BenchmarkRawForwardIndexWriter.writeV4:mb  ZSTANDARD                UNIFORM(1000,10000)  1048576          100000      ss    5          4730.000                  #
BenchmarkRawForwardIndexWriter.writeV4     ZSTANDARD                EXP(0.001)           1048576          100000      ss    5          3346.148  ±  169.362  ms/op
BenchmarkRawForwardIndexWriter.writeV4:b   ZSTANDARD                EXP(0.001)           1048576          100000      ss    5     900821780.000                  #
BenchmarkRawForwardIndexWriter.writeV4:kb  ZSTANDARD                EXP(0.001)           1048576          100000      ss    5        879705.000                  #
BenchmarkRawForwardIndexWriter.writeV4:mb  ZSTANDARD                EXP(0.001)           1048576          100000      ss    5           855.000                  #
```

Overriding the chunk compression requires verbose configuration, and this is an entirely backward-compatible change because the raw indexes record their compression type in their headers. I would therefore like to propose changing the default to LZ4.
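
For anyone who wants to check the headline comparison, the Snappy and LZ4 rows above can be reduced to two ratios per configuration: read time (Snappy over LZ4) and file size (LZ4 over Snappy). The following is only a rough sketch; the scores are copied from the benchmark output in this issue, while the dictionary layout and the `ratios` helper are illustrative names and not part of Pinot or JMH.

```python
# Scores copied from the benchmark output in this issue.
# Read scores are ms/op; sizes are the ":mb" auxiliary counters.
READ_MS = {
    ("readV3", "UNIFORM(1000,10000)"): {"SNAPPY": 4472.955, "LZ4": 2178.528},
    ("readV3", "EXP(0.001)"): {"SNAPPY": 861.179, "LZ4": 360.927},
    ("readV4", "UNIFORM(1000,10000)"): {"SNAPPY": 4471.859, "LZ4": 2096.095},
    ("readV4", "EXP(0.001)"): {"SNAPPY": 791.099, "LZ4": 344.592},
}

SIZE_MB = {
    ("writeV3", "UNIFORM(1000,10000)"): {"SNAPPY": 7205.0, "LZ4": 7440.0},
    ("writeV3", "EXP(0.001)"): {"SNAPPY": 1320.0, "LZ4": 1390.0},
    ("writeV4", "UNIFORM(1000,10000)"): {"SNAPPY": 7200.0, "LZ4": 7425.0},
    ("writeV4", "EXP(0.001)"): {"SNAPPY": 1300.0, "LZ4": 1345.0},
}


def ratios(table, numerator, denominator):
    """Hypothetical helper: per-configuration ratio of two codecs' scores."""
    return {key: row[numerator] / row[denominator] for key, row in table.items()}


# How many times longer Snappy takes to read than LZ4.
read_speedup = ratios(READ_MS, "SNAPPY", "LZ4")
# How much larger the LZ4 files are relative to Snappy.
size_growth = ratios(SIZE_MB, "LZ4", "SNAPPY")

for (bench, dist), value in read_speedup.items():
    print(f"{bench} {dist}: LZ4 reads {value:.2f}x faster than Snappy")
for (bench, dist), value in size_growth.items():
    print(f"{bench} {dist}: LZ4 files are {(value - 1) * 100:.1f}% larger than Snappy")
```

On these numbers the read ratio falls between 2.0x and 2.4x in every configuration, and the LZ4 files come out between 3% and 6% larger than the Snappy ones.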