mqliang commented on issue #6720: URL: https://github.com/apache/incubator-pinot/issues/6720#issuecomment-808982812
@Jackie-Jiang It turns out there is a bug in the benchmark: for `BenchmarkDataTableRowIdColIdBuildRandomOrder()`, I also time the function of shuffling column ids. After moving the shuffling logic out of the function (see fix: https://github.com/mqliang/incubator-pinot/commit/c19b51947a5df804a3c819725f037fb1d6388062), the result becomes: benchmark result of 100 rows table: ``` Benchmark Mode Cnt Score Error Units BenchmarkDataTableBulkBuild.BenchmarkDataTableRowBulkBuild avgt 5 147.196 ± 1.683 us/op BenchmarkDataTableBulkBuild.BenchmarkDataTableRowIdColIdBuildInOrder avgt 5 146.157 ± 5.207 us/op BenchmarkDataTableBulkBuild.BenchmarkDataTableRowIdColIdBuildRandomOrder avgt 5 145.171 ± 6.464 us/op ``` benchmark result of 1000 rows table: ``` Benchmark Mode Cnt Score Error Units BenchmarkDataTableBulkBuild.BenchmarkDataTableRowBulkBuild avgt 5 1612.386 ± 51.587 us/op BenchmarkDataTableBulkBuild.BenchmarkDataTableRowIdColIdBuildInOrder avgt 5 1622.108 ± 65.567 us/op BenchmarkDataTableBulkBuild.BenchmarkDataTableRowIdColIdBuildRandomOrder avgt 5 1610.517 ± 64.187 us/op ``` So, the overhead of calling `ByteBuffer.position()` is actually negligible. @Jackie-Jiang Can you please take a look at my benchmark code? If the benchmark code is correct, we don't need to address the TODO. And I will send a PR to remove the TODO and write comments stating that such a optimization is unnecessary. cc @siddharthteotia @mcvsubbu -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org