gortiz commented on code in PR #13303:
URL: https://github.com/apache/pinot/pull/13303#discussion_r1633031444
##########
pinot-core/src/main/java/org/apache/pinot/core/common/datablock/DataBlockBuilder.java:
##########
@@ -187,200 +141,315 @@ public static RowDataBlock buildFromRows(List<Object[]> rows, DataSchema dataSch
               dataSchema.getColumnName(colId)));
         }
       }
-      rowBuilder._fixedSizeDataByteArrayOutputStream.write(byteBuffer.array(), 0, byteBuffer.position());
     }
+
+    CompoundDataBuffer.Builder varBufferBuilder = new CompoundDataBuffer.Builder(ByteOrder.BIG_ENDIAN, true)
+        .addPagedOutputStream(varSize);
+

Review Comment:
   This is probably the biggest performance improvement when creating the block. The older version allocated a large array (which is expensive, as it may be allocated outside the TLAB) and then copied it into the ByteArrayOutputStream, which probably ends up allocating the same number of bytes again. Now we instead reuse the whole byte buffer by adding it to the builder, which is basically a list of byte buffers that can later be used to send the data over the network or be read directly by another local stage.
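
   To illustrate the difference described in the comment, here is a minimal sketch that contrasts the copying pattern with the "list of byte buffers" idea. It is not Pinot's `CompoundDataBuffer`: the class `CompoundBufferSketch` and its methods `add`, `size`, `parts` and `copyingApproach` are hypothetical names invented for this example, and only standard `java.nio` and `java.io` calls are used. The buffers use the default big-endian byte order.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a compound buffer: an ordered list of ByteBuffer views. */
class CompoundBufferSketch {
  private final List<ByteBuffer> _parts = new ArrayList<>();

  /** Registers the written portion of the buffer as a read-only view; no bytes are copied. */
  CompoundBufferSketch add(ByteBuffer written) {
    ByteBuffer view = written.duplicate(); // shares content, independent position/limit
    view.flip();                           // limit = bytes written so far, position = 0
    _parts.add(view.asReadOnlyBuffer());
    return this;
  }

  /** Total number of readable bytes across all registered parts. */
  long size() {
    long total = 0;
    for (ByteBuffer part : _parts) {
      total += part.remaining();
    }
    return total;
  }

  List<ByteBuffer> parts() {
    return _parts;
  }

  /** The older pattern: the written bytes are copied into a stream, then copied out again. */
  static byte[] copyingApproach(ByteBuffer fixed) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    out.write(fixed.array(), 0, fixed.position()); // first copy (heap buffers only)
    return out.toByteArray();                      // second copy
  }

  public static void main(String[] args) {
    ByteBuffer fixed = ByteBuffer.allocate(16);
    fixed.putInt(1).putInt(2);

    CompoundBufferSketch compound = new CompoundBufferSketch().add(fixed);
    System.out.println("bytes referenced without copying: " + compound.size());
    System.out.println("bytes copied by the older pattern: " + copyingApproach(fixed).length);
  }
}
```

   Because each part is only a view, the same list can be handed to a gathering write such as `GatheringByteChannel.write(ByteBuffer[])` or read in place by a local consumer. The trade-off is that the producer must not mutate the underlying buffer after registering it.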