richardstartin commented on pull request #7930:
URL: https://github.com/apache/pinot/pull/7930#issuecomment-997424151
For posterity, we do have a [benchmark](https://github.com/apache/pinot/blob/master/pinot-perf/src/main/java/org/apache/pinot/perf/BenchmarkRawForwardIndexWriter.java) which would catch this regression if it were to happen again:

master:

```
Benchmark                                  (_chunkCompressionType)      (_distribution)  (_maxChunkSize)  (_records)  Mode  Cnt           Score       Error  Units
BenchmarkRawForwardIndexWriter.writeV3                         LZ4  UNIFORM(1000,10000)          1048576      100000    ss    5       81529.035  ± 2089.707  ms/op
BenchmarkRawForwardIndexWriter.writeV3:b                       LZ4  UNIFORM(1000,10000)          1048576      100000    ss    5  7796873815.000                  #
BenchmarkRawForwardIndexWriter.writeV3:kb                      LZ4  UNIFORM(1000,10000)          1048576      100000    ss    5     7614130.000                  #
BenchmarkRawForwardIndexWriter.writeV3:mb                      LZ4  UNIFORM(1000,10000)          1048576      100000    ss    5        7435.000                  #
BenchmarkRawForwardIndexWriter.writeV3                         LZ4           EXP(0.001)          1048576      100000    ss    5       13314.818  ±  302.889  ms/op
BenchmarkRawForwardIndexWriter.writeV3:b                       LZ4           EXP(0.001)          1048576      100000    ss    5  1482381585.000                  #
BenchmarkRawForwardIndexWriter.writeV3:kb                      LZ4           EXP(0.001)          1048576      100000    ss    5     1447635.000                  #
BenchmarkRawForwardIndexWriter.writeV3:mb                      LZ4           EXP(0.001)          1048576      100000    ss    5        1410.000                  #
```

branch:

```
Benchmark                                  (_chunkCompressionType)      (_distribution)  (_maxChunkSize)  (_records)  Mode  Cnt           Score       Error  Units
BenchmarkRawForwardIndexWriter.writeV3                         LZ4  UNIFORM(1000,10000)          1048576      100000    ss    5        7690.783  ±   87.355  ms/op
BenchmarkRawForwardIndexWriter.writeV3:b                       LZ4  UNIFORM(1000,10000)          1048576      100000    ss    5  7796873815.000                  #
BenchmarkRawForwardIndexWriter.writeV3:kb                      LZ4  UNIFORM(1000,10000)          1048576      100000    ss    5     7614130.000                  #
BenchmarkRawForwardIndexWriter.writeV3:mb                      LZ4  UNIFORM(1000,10000)          1048576      100000    ss    5        7435.000                  #
BenchmarkRawForwardIndexWriter.writeV3                         LZ4           EXP(0.001)          1048576      100000    ss    5        1438.566  ±   39.583  ms/op
BenchmarkRawForwardIndexWriter.writeV3:b                       LZ4           EXP(0.001)          1048576      100000    ss    5  1482381585.000                  #
BenchmarkRawForwardIndexWriter.writeV3:kb                      LZ4           EXP(0.001)          1048576      100000    ss    5     1447635.000                  #
BenchmarkRawForwardIndexWriter.writeV3:mb                      LZ4           EXP(0.001)          1048576      100000    ss    5        1410.000                  #
```

For comparison, the mmap approach does work well for the V4 raw index writer, which is slightly slower to build than V3 on this branch but is guaranteed never to use more than 1MB memory (and is slightly faster to read according to benchmarks):

```
Benchmark                                  (_chunkCompressionType)      (_distribution)  (_maxChunkSize)  (_records)  Mode  Cnt           Score       Error  Units
BenchmarkRawForwardIndexWriter.writeV4                         LZ4  UNIFORM(1000,10000)          1048576      100000    ss    5        9857.203  ±  167.084  ms/op
BenchmarkRawForwardIndexWriter.writeV4:b                       LZ4  UNIFORM(1000,10000)          1048576      100000    ss    5  7781048545.000                  #
BenchmarkRawForwardIndexWriter.writeV4:kb                      LZ4  UNIFORM(1000,10000)          1048576      100000    ss    5     7598680.000                  #
BenchmarkRawForwardIndexWriter.writeV4:mb                      LZ4  UNIFORM(1000,10000)          1048576      100000    ss    5        7420.000                  #
BenchmarkRawForwardIndexWriter.writeV4                         LZ4           EXP(0.001)          1048576      100000    ss    5        1802.487  ±  112.871  ms/op
BenchmarkRawForwardIndexWriter.writeV4:b                       LZ4           EXP(0.001)          1048576      100000    ss    5  1416542875.000                  #
BenchmarkRawForwardIndexWriter.writeV4:kb                      LZ4           EXP(0.001)          1048576      100000    ss    5     1383340.000                  #
BenchmarkRawForwardIndexWriter.writeV4:mb                      LZ4           EXP(0.001)          1048576      100000    ss    5        1350.000                  #
```
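
To put the scores above in relative terms, here is a minimal sketch that computes the ratios from the reported ms/op values. The class and method names (`SpeedupSummary`, `ratio`) are made up for illustration and are not part of Pinot or the benchmark; the numbers are copied from the benchmark output in this comment, and the error bars are ignored.

```java
import java.util.Locale;

// Hypothetical helper, not part of Pinot: summarises the ms/op scores quoted in this comment.
final class SpeedupSummary {

    // Ratio of a baseline time to a candidate time; a value above 1 means the candidate is faster.
    static double ratio(double baselineMs, double candidateMs) {
        return baselineMs / candidateMs;
    }

    public static void main(String[] args) {
        // writeV3 scores (ms/op) on master and on the branch.
        double masterV3Uniform = 81529.035;
        double branchV3Uniform = 7690.783;
        double masterV3Exp = 13314.818;
        double branchV3Exp = 1438.566;

        // writeV4 scores (ms/op) on the branch.
        double branchV4Uniform = 9857.203;
        double branchV4Exp = 1802.487;

        System.out.printf(Locale.ROOT, "V3 UNIFORM, master vs branch: %.2fx%n",
                ratio(masterV3Uniform, branchV3Uniform));
        System.out.printf(Locale.ROOT, "V3 EXP,     master vs branch: %.2fx%n",
                ratio(masterV3Exp, branchV3Exp));

        // Here the ratio is V4 time over V3 time, so a value above 1 means V4 is slower to build.
        System.out.printf(Locale.ROOT, "UNIFORM, V4 vs V3 on branch:  %.2fx%n",
                ratio(branchV4Uniform, branchV3Uniform));
        System.out.printf(Locale.ROOT, "EXP,     V4 vs V3 on branch:  %.2fx%n",
                ratio(branchV4Exp, branchV3Exp));
    }
}
```

On these scores the branch is roughly 9x to 11x faster than master for V3 writes, and V4 takes about 25% to 28% longer to build than V3 on this branch.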