gortiz commented on code in PR #13303:
URL: https://github.com/apache/pinot/pull/13303#discussion_r1633031444


##########
pinot-core/src/main/java/org/apache/pinot/core/common/datablock/DataBlockBuilder.java:
##########
@@ -187,200 +141,315 @@ public static RowDataBlock buildFromRows(List<Object[]> rows, DataSchema dataSch
                     dataSchema.getColumnName(colId)));
         }
       }
-      rowBuilder._fixedSizeDataByteArrayOutputStream.write(byteBuffer.array(), 0, byteBuffer.position());
     }
+
+    CompoundDataBuffer.Builder varBufferBuilder = new CompoundDataBuffer.Builder(ByteOrder.BIG_ENDIAN, true)
+        .addPagedOutputStream(varSize);
+

Review Comment:
   This is probably the biggest performance improvement when creating the 
block. The older version allocated a large array (which is expensive, as it may 
be allocated outside the TLAB) and then copied it into the 
ByteArrayOutputStream, which probably ends up allocating that many bytes again.
   
   Now we instead reuse the whole byte buffer by adding it to the builder, 
which is basically a list of byte buffers that can later be used to send the 
data over the network or be read directly by another local stage.
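
To illustrate the difference, here is a minimal, self-contained sketch using only JDK classes. It is not Pinot code: `BufferReuseSketch` and `PageList` are hypothetical names, and `PageList` is only a simplified stand-in for the "list of byte buffers" idea, not the actual `CompoundDataBuffer.Builder` API.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

public class BufferReuseSketch {

  /** Copying approach: the written bytes are duplicated into the stream's own array. */
  static byte[] copyIntoStream(ByteBuffer filled) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    // First copy: buffer contents into the stream's internal array.
    out.write(filled.array(), 0, filled.position());
    // Second copy: toByteArray() returns a fresh array.
    return out.toByteArray();
  }

  /** Hypothetical stand-in for a compound buffer: keeps references to pages, no copy. */
  static final class PageList {
    private final List<ByteBuffer> pages = new ArrayList<>();

    PageList add(ByteBuffer filled) {
      // duplicate() shares the backing storage with the original buffer.
      ByteBuffer view = filled.duplicate();
      // flip() restricts the view to the bytes written so far.
      view.flip();
      pages.add(view);
      return this;
    }

    long size() {
      long total = 0;
      for (ByteBuffer page : pages) {
        total += page.remaining();
      }
      return total;
    }

    List<ByteBuffer> pages() {
      return pages;
    }
  }

  public static void main(String[] args) {
    ByteBuffer fixed = ByteBuffer.allocate(16).order(ByteOrder.BIG_ENDIAN);
    fixed.putInt(7).putLong(42L); // 4 + 8 = 12 bytes written

    byte[] copied = copyIntoStream(fixed);
    PageList list = new PageList().add(fixed);

    System.out.println(copied.length + " " + list.size());
  }
}
```

In this sketch the copying path touches the payload twice, while the page list only stores a view over the buffer that was already filled, so the cost of adding a page does not depend on its size.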



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

