gortiz opened a new pull request, #13303: URL: https://github.com/apache/pinot/pull/13303
This PR includes several changes to the code that builds, serializes and deserializes DataBlocks in order to improve performance. These changes should not alter the binary format (the included tests verify that). Instead, I've changed the code to reduce allocations and copies. I'm sure more can be done to improve performance without breaking the binary format, and even more could be done if we decide to break it.

The PR includes 4 benchmarks: one that creates a DataBlock from a `List<Object[]>`, one that serializes that DataBlock, one that deserializes it, and one that does all three in a row. The results of the last one are:

```
Benchmark                                             (_blockType)  (_dataType)   (_rows)  (_version)       Mode   Cnt          Score     Error  Units
BenchmarkDataBlock.BuildSerde.all                     COLUMNAR      INT             10000  bytes            thrpt    5         49.644 ±   1.022  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  COLUMNAR      INT             10000  bytes            thrpt    5     293024.139 ±   0.002  B/op
BenchmarkDataBlock.BuildSerde.all                     COLUMNAR      INT             10000  zero_heap_small  thrpt    5         92.337 ±   1.269  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  COLUMNAR      INT             10000  zero_heap_small  thrpt    5      94464.075 ±   0.001  B/op
BenchmarkDataBlock.BuildSerde.all                     COLUMNAR      INT           1000000  bytes            thrpt    5          0.560 ±   0.010  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  COLUMNAR      INT           1000000  bytes            thrpt    5   24222029.749 ± 219.167  B/op
BenchmarkDataBlock.BuildSerde.all                     COLUMNAR      INT           1000000  zero_heap_small  thrpt    5          1.190 ±   0.020  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  COLUMNAR      INT           1000000  zero_heap_small  thrpt    5    4054533.753 ±   0.069  B/op
BenchmarkDataBlock.BuildSerde.all                     COLUMNAR      LONG            10000  bytes            thrpt    5         31.059 ±   0.366  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  COLUMNAR      LONG            10000  bytes            thrpt    5     558576.223 ±   0.005  B/op
BenchmarkDataBlock.BuildSerde.all                     COLUMNAR      LONG            10000  zero_heap_small  thrpt    5         77.407 ±   0.664  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  COLUMNAR      LONG            10000  zero_heap_small  thrpt    5     134464.088 ±   0.001  B/op
BenchmarkDataBlock.BuildSerde.all                     COLUMNAR      LONG          1000000  bytes            thrpt    5          0.326 ±   0.007  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  COLUMNAR      LONG          1000000  bytes            thrpt    5   48416344.508 ±  27.136  B/op
BenchmarkDataBlock.BuildSerde.all                     COLUMNAR      LONG          1000000  zero_heap_small  thrpt    5          0.678 ±   0.036  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  COLUMNAR      LONG          1000000  zero_heap_small  thrpt    5    8054748.832 ± 362.451  B/op
BenchmarkDataBlock.BuildSerde.all                     COLUMNAR      STRING          10000  bytes            thrpt    5          9.972 ±   1.555  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  COLUMNAR      STRING          10000  bytes            thrpt    5     315184.066 ±   5.302  B/op
BenchmarkDataBlock.BuildSerde.all                     COLUMNAR      STRING          10000  zero_heap_small  thrpt    5          8.083 ±   1.454  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  COLUMNAR      STRING          10000  zero_heap_small  thrpt    5     271128.852 ±   0.158  B/op
BenchmarkDataBlock.BuildSerde.all                     COLUMNAR      STRING        1000000  bytes            thrpt    5          0.065 ±   0.001  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  COLUMNAR      STRING        1000000  bytes            thrpt    5   40244631.654 ±  95.775  B/op
BenchmarkDataBlock.BuildSerde.all                     COLUMNAR      STRING        1000000  zero_heap_small  thrpt    5          0.070 ±   0.001  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  COLUMNAR      STRING        1000000  zero_heap_small  thrpt    5   20071722.329 ±   4.955  B/op
BenchmarkDataBlock.BuildSerde.all                     COLUMNAR      BYTES           10000  bytes            thrpt    5          2.815 ±   0.038  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  COLUMNAR      BYTES           10000  bytes            thrpt    5    6556946.464 ±   0.046  B/op
BenchmarkDataBlock.BuildSerde.all                     COLUMNAR      BYTES           10000  zero_heap_small  thrpt    5          7.606 ±   0.062  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  COLUMNAR      BYTES           10000  zero_heap_small  thrpt    5    1139528.910 ±   0.007  B/op
BenchmarkDataBlock.BuildSerde.all                     COLUMNAR      BYTES         1000000  bytes            thrpt    5          0.028 ±   0.001  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  COLUMNAR      BYTES         1000000  bytes            thrpt    5  608447632.969 ±  26.685  B/op
BenchmarkDataBlock.BuildSerde.all                     COLUMNAR      BYTES         1000000  zero_heap_small  thrpt    5          0.074 ±   0.002  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  COLUMNAR      BYTES         1000000  zero_heap_small  thrpt    5  108579301.322 ±   4.422  B/op
BenchmarkDataBlock.BuildSerde.all                     COLUMNAR      BIG_DECIMAL     10000  bytes            thrpt    5          2.130 ±   0.028  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  COLUMNAR      BIG_DECIMAL     10000  bytes            thrpt    5    2740939.242 ±   0.068  B/op
BenchmarkDataBlock.BuildSerde.all                     COLUMNAR      BIG_DECIMAL     10000  zero_heap_small  thrpt    5          1.655 ±   0.028  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  COLUMNAR      BIG_DECIMAL     10000  zero_heap_small  thrpt    5    1186124.127 ±   0.105  B/op
BenchmarkDataBlock.BuildSerde.all                     COLUMNAR      BIG_DECIMAL   1000000  bytes            thrpt    5          0.021 ±   0.002  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  COLUMNAR      BIG_DECIMAL   1000000  bytes            thrpt    5  261424854.502 ±  49.776  B/op
BenchmarkDataBlock.BuildSerde.all                     COLUMNAR      BIG_DECIMAL   1000000  zero_heap_small  thrpt    5          0.022 ±   0.001  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  COLUMNAR      BIG_DECIMAL   1000000  zero_heap_small  thrpt    5  114337689.445 ±  49.717  B/op
BenchmarkDataBlock.BuildSerde.all                     COLUMNAR      BOOLEAN         10000  bytes            thrpt    5         50.924 ±   0.536  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  COLUMNAR      BOOLEAN         10000  bytes            thrpt    5     293024.136 ±   0.003  B/op
BenchmarkDataBlock.BuildSerde.all                     COLUMNAR      BOOLEAN         10000  zero_heap_small  thrpt    5         99.126 ±   1.292  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  COLUMNAR      BOOLEAN         10000  zero_heap_small  thrpt    5      94464.070 ±   0.001  B/op
BenchmarkDataBlock.BuildSerde.all                     COLUMNAR      BOOLEAN       1000000  bytes            thrpt    5          0.629 ±   0.014  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  COLUMNAR      BOOLEAN       1000000  bytes            thrpt    5   24221975.893 ± 278.919  B/op
BenchmarkDataBlock.BuildSerde.all                     COLUMNAR      BOOLEAN       1000000  zero_heap_small  thrpt    5          1.620 ±   0.032  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  COLUMNAR      BOOLEAN       1000000  zero_heap_small  thrpt    5    4054676.229 ±   0.096  B/op
BenchmarkDataBlock.BuildSerde.all                     COLUMNAR      LONG_ARRAY      10000  bytes            thrpt    5          1.571 ±   0.030  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  COLUMNAR      LONG_ARRAY      10000  bytes            thrpt    5    4908500.402 ±   0.053  B/op
BenchmarkDataBlock.BuildSerde.all                     COLUMNAR      LONG_ARRAY      10000  zero_heap_small  thrpt    5          3.546 ±   0.052  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  COLUMNAR      LONG_ARRAY      10000  zero_heap_small  thrpt    5     926497.946 ±   0.065  B/op
BenchmarkDataBlock.BuildSerde.all                     COLUMNAR      LONG_ARRAY    1000000  bytes            thrpt    5          0.016 ±   0.001  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  COLUMNAR      LONG_ARRAY    1000000  bytes            thrpt    5  548447881.600 ±  10.041  B/op
BenchmarkDataBlock.BuildSerde.all                     COLUMNAR      LONG_ARRAY    1000000  zero_heap_small  thrpt    5          0.036 ±   0.001  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  COLUMNAR      LONG_ARRAY    1000000  zero_heap_small  thrpt    5   88572811.244 ±  26.064  B/op
BenchmarkDataBlock.BuildSerde.all                     COLUMNAR      STRING_ARRAY    10000  bytes            thrpt    5          0.604 ±   0.015  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  COLUMNAR      STRING_ARRAY    10000  bytes            thrpt    5    4282229.866 ± 397.730  B/op
BenchmarkDataBlock.BuildSerde.all                     COLUMNAR      STRING_ARRAY    10000  zero_heap_small  thrpt    5          0.799 ±   0.013  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  COLUMNAR      STRING_ARRAY    10000  zero_heap_small  thrpt    5    2146608.719 ±   1.379  B/op
BenchmarkDataBlock.BuildSerde.all                     COLUMNAR      STRING_ARRAY  1000000  bytes            thrpt    5          0.006 ±   0.001  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  COLUMNAR      STRING_ARRAY  1000000  bytes            thrpt    5  454253240.533 ± 299.935  B/op
BenchmarkDataBlock.BuildSerde.all                     COLUMNAR      STRING_ARRAY  1000000  zero_heap_small  thrpt    5          0.008 ±   0.001  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  COLUMNAR      STRING_ARRAY  1000000  zero_heap_small  thrpt    5  208283178.222 ± 144.616  B/op
BenchmarkDataBlock.BuildSerde.all                     ROW           INT             10000  bytes            thrpt    5         10.869 ±   0.131  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  ROW           INT             10000  bytes            thrpt    5     324751.142 ±  12.887  B/op
BenchmarkDataBlock.BuildSerde.all                     ROW           INT             10000  zero_heap_small  thrpt    5         24.332 ±   0.265  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  ROW           INT             10000  zero_heap_small  thrpt    5      94472.283 ±   0.005  B/op
BenchmarkDataBlock.BuildSerde.all                     ROW           INT           1000000  bytes            thrpt    5          0.108 ±   0.001  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  ROW           INT           1000000  bytes            thrpt    5   28213967.706 ±   0.799  B/op
BenchmarkDataBlock.BuildSerde.all                     ROW           INT           1000000  zero_heap_small  thrpt    5          0.253 ±   0.002  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  ROW           INT           1000000  zero_heap_small  thrpt    5    4054901.566 ± 117.807  B/op
BenchmarkDataBlock.BuildSerde.all                     ROW           LONG            10000  bytes            thrpt    5          9.655 ±   0.060  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  ROW           LONG            10000  bytes            thrpt    5     550304.715 ±   0.025  B/op
BenchmarkDataBlock.BuildSerde.all                     ROW           LONG            10000  zero_heap_small  thrpt    5         24.767 ±   0.143  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  ROW           LONG            10000  zero_heap_small  thrpt    5     134472.278 ±   0.007  B/op
BenchmarkDataBlock.BuildSerde.all                     ROW           LONG          1000000  bytes            thrpt    5          0.099 ±   0.001  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  ROW           LONG          1000000  bytes            thrpt    5   48408293.165 ±   1.692  B/op
BenchmarkDataBlock.BuildSerde.all                     ROW           LONG          1000000  zero_heap_small  thrpt    5          0.256 ±   0.012  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  ROW           LONG          1000000  zero_heap_small  thrpt    5    8054901.681 ± 116.976  B/op
BenchmarkDataBlock.BuildSerde.all                     ROW           STRING          10000  bytes            thrpt    5          4.409 ±   0.043  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  ROW           STRING          10000  bytes            thrpt    5     506889.570 ±   0.040  B/op
BenchmarkDataBlock.BuildSerde.all                     ROW           STRING          10000  zero_heap_small  thrpt    5          6.838 ±   0.279  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  ROW           STRING          10000  zero_heap_small  thrpt    5     271025.356 ±   3.068  B/op
BenchmarkDataBlock.BuildSerde.all                     ROW           STRING        1000000  bytes            thrpt    5          0.046 ±   0.001  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  ROW           STRING        1000000  bytes            thrpt    5   44236302.908 ±  21.096  B/op
BenchmarkDataBlock.BuildSerde.all                     ROW           STRING        1000000  zero_heap_small  thrpt    5          0.057 ±   0.001  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  ROW           STRING        1000000  zero_heap_small  thrpt    5   20071545.468 ±   5.535  B/op
BenchmarkDataBlock.BuildSerde.all                     ROW           BYTES           10000  bytes            thrpt    5          2.361 ±   0.034  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  ROW           BYTES           10000  bytes            thrpt    5    6548754.934 ±   0.073  B/op
BenchmarkDataBlock.BuildSerde.all                     ROW           BYTES           10000  zero_heap_small  thrpt    5          6.104 ±   0.125  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  ROW           BYTES           10000  zero_heap_small  thrpt    5    1139504.536 ± 123.329  B/op
BenchmarkDataBlock.BuildSerde.all                     ROW           BYTES         1000000  bytes            thrpt    5          0.018 ±   0.007  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  ROW           BYTES         1000000  bytes            thrpt    5  608439491.114 ± 168.527  B/op
BenchmarkDataBlock.BuildSerde.all                     ROW           BYTES         1000000  zero_heap_small  thrpt    5          0.062 ±   0.001  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  ROW           BYTES         1000000  zero_heap_small  thrpt    5  108579240.033 ±   8.143  B/op
BenchmarkDataBlock.BuildSerde.all                     ROW           BIG_DECIMAL     10000  bytes            thrpt    5          1.539 ±   0.019  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  ROW           BIG_DECIMAL     10000  bytes            thrpt    5    2732748.498 ±   0.074  B/op
BenchmarkDataBlock.BuildSerde.all                     ROW           BIG_DECIMAL     10000  zero_heap_small  thrpt    5          1.847 ±   0.088  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  ROW           BIG_DECIMAL     10000  zero_heap_small  thrpt    5    1186043.736 ±   0.158  B/op
BenchmarkDataBlock.BuildSerde.all                     ROW           BIG_DECIMAL   1000000  bytes            thrpt    5          0.017 ±   0.001  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  ROW           BIG_DECIMAL   1000000  bytes            thrpt    5  261416641.778 ±   4.841  B/op
BenchmarkDataBlock.BuildSerde.all                     ROW           BIG_DECIMAL   1000000  zero_heap_small  thrpt    5          0.018 ±   0.001  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  ROW           BIG_DECIMAL   1000000  zero_heap_small  thrpt    5  114337735.495 ±  28.417  B/op
BenchmarkDataBlock.BuildSerde.all                     ROW           BOOLEAN         10000  bytes            thrpt    5         10.783 ±   0.163  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  ROW           BOOLEAN         10000  bytes            thrpt    5     324752.190 ±   3.915  B/op
BenchmarkDataBlock.BuildSerde.all                     ROW           BOOLEAN         10000  zero_heap_small  thrpt    5         25.165 ±   0.108  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  ROW           BOOLEAN         10000  zero_heap_small  thrpt    5      94432.275 ±   0.006  B/op
BenchmarkDataBlock.BuildSerde.all                     ROW           BOOLEAN       1000000  bytes            thrpt    5          0.107 ±   0.001  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  ROW           BOOLEAN       1000000  bytes            thrpt    5   28213968.178 ±   1.020  B/op
BenchmarkDataBlock.BuildSerde.all                     ROW           BOOLEAN       1000000  zero_heap_small  thrpt    5          0.258 ±   0.006  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  ROW           BOOLEAN       1000000  zero_heap_small  thrpt    5    4054898.269 ± 130.176  B/op
BenchmarkDataBlock.BuildSerde.all                     ROW           LONG_ARRAY      10000  bytes            thrpt    5          1.431 ±   0.019  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  ROW           LONG_ARRAY      10000  bytes            thrpt    5    4900228.832 ±   0.121  B/op
BenchmarkDataBlock.BuildSerde.all                     ROW           LONG_ARRAY      10000  zero_heap_small  thrpt    5          3.092 ±   0.031  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  ROW           LONG_ARRAY      10000  zero_heap_small  thrpt    5     926370.236 ±   0.043  B/op
BenchmarkDataBlock.BuildSerde.all                     ROW           LONG_ARRAY    1000000  bytes            thrpt    5          0.013 ±   0.001  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  ROW           LONG_ARRAY    1000000  bytes            thrpt    5  548439670.629 ±   4.821  B/op
BenchmarkDataBlock.BuildSerde.all                     ROW           LONG_ARRAY    1000000  zero_heap_small  thrpt    5          0.027 ±   0.002  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  ROW           LONG_ARRAY    1000000  zero_heap_small  thrpt    5   88572792.150 ±  47.157  B/op
BenchmarkDataBlock.BuildSerde.all                     ROW           STRING_ARRAY    10000  bytes            thrpt    5          0.561 ±   0.029  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  ROW           STRING_ARRAY    10000  bytes            thrpt    5    4273999.895 ± 342.393  B/op
BenchmarkDataBlock.BuildSerde.all                     ROW           STRING_ARRAY    10000  zero_heap_small  thrpt    5          0.770 ±   0.016  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  ROW           STRING_ARRAY    10000  zero_heap_small  thrpt    5    2146468.109 ±  26.118  B/op
BenchmarkDataBlock.BuildSerde.all                     ROW           STRING_ARRAY  1000000  bytes            thrpt    5          0.006 ±   0.001  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  ROW           STRING_ARRAY  1000000  bytes            thrpt    5  454244994.133 ±  11.248  B/op
BenchmarkDataBlock.BuildSerde.all                     ROW           STRING_ARRAY  1000000  zero_heap_small  thrpt    5          0.007 ±   0.001  ops/ms
BenchmarkDataBlock.BuildSerde.all:gc.alloc.rate.norm  ROW           STRING_ARRAY  1000000  zero_heap_small  thrpt    5  208283117.800 ±  25.014  B/op
```

As you can see, throughput with `zero_heap_small` is roughly 1x to 3x that of `bytes` in most cases (COLUMNAR STRING and COLUMNAR BIG_DECIMAL with 10000 rows are somewhat slower), and the reduction in allocation is even larger.
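
For readers unfamiliar with this kind of optimization, here is a minimal sketch of how a serializer can cut allocations and copies without changing the bytes it produces. It is not the code from this PR: the class `IntColumnSerde`, its method names, and the toy layout (a length prefix followed by big-endian ints) are all assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical example class; not part of Pinot.
final class IntColumnSerde {

  // Copy-heavy variant: the stream regrows its internal buffer as data
  // arrives, and toByteArray() makes one more full copy at the end.
  public static byte[] serializeWithStream(int[] values) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      out.writeInt(values.length);
      for (int v : values) {
        out.writeInt(v);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }

  // Presized variant: the final size is known up front, so a single
  // array is allocated and returned without any extra copy.
  public static byte[] serializePresized(int[] values) {
    ByteBuffer buf = ByteBuffer.allocate(Integer.BYTES * (1 + values.length));
    buf.putInt(values.length); // ByteBuffer is big-endian by default
    for (int v : values) {
      buf.putInt(v);
    }
    return buf.array();
  }

  // Reads the toy layout back: length prefix, then that many ints.
  public static int[] deserialize(byte[] data) {
    ByteBuffer buf = ByteBuffer.wrap(data);
    int[] values = new int[buf.getInt()];
    for (int i = 0; i < values.length; i++) {
      values[i] = buf.getInt();
    }
    return values;
  }

  public static void main(String[] args) {
    int[] values = {1, -2, 3, Integer.MAX_VALUE};
    byte[] viaStream = serializeWithStream(values);
    byte[] presized = serializePresized(values);
    System.out.println("same bytes: " + Arrays.equals(viaStream, presized));
    System.out.println("round trip: " + Arrays.equals(values, deserialize(presized)));
  }
}
```

Both variants emit identical bytes, so a reader such as `deserialize` cannot tell them apart. The second one simply avoids the buffer regrowth and the final copy made by `toByteArray()`.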