Re: [PR] Improve performance of DataBlock serde [pinot]

via GitHub Tue, 09 Jul 2024 02:35:08 -0700


gortiz commented on code in PR #13303:
URL: https://github.com/apache/pinot/pull/13303#discussion_r1670135123



##########
pinot-common/src/main/java/org/apache/pinot/common/datablock/BaseDataBlock.java:
##########
@@ -426,168 +401,202 @@ public Map<Integer, String> getExceptions() {
     return _errCodeToExceptionMap;
   }
 
-  /**
-   * Serialize this data block to a byte array.
-   * <p>
-   * In order to deserialize it, {@link 
DataBlockUtils#getDataBlock(ByteBuffer)} should be used.
-   */
   @Override
-  public byte[] toBytes()
+  public List<ByteBuffer> serialize()
       throws IOException {
-    ThreadResourceUsageProvider threadResourceUsageProvider = new 
ThreadResourceUsageProvider();
-
-    UnsynchronizedByteArrayOutputStream byteArrayOutputStream = new 
UnsynchronizedByteArrayOutputStream(8192);
-    DataOutputStream dataOutputStream = new 
DataOutputStream(byteArrayOutputStream);
-    writeLeadingSections(dataOutputStream);
-
-    // Write metadata: length followed by actual metadata bytes.
-    // NOTE: We ignore metadata serialization time in 
"responseSerializationCpuTimeNs" as it's negligible while
-    // considering it will bring a lot code complexity.
-    serializeMetadata(dataOutputStream);
-
-    return byteArrayOutputStream.toByteArray();
-  }
-
-  private void writeLeadingSections(DataOutputStream dataOutputStream)
-      throws IOException {
-    dataOutputStream.writeInt(getDataBlockVersionType());
-    dataOutputStream.writeInt(_numRows);
-    dataOutputStream.writeInt(_numColumns);
-    int dataOffset = HEADER_SIZE;
-
-    // Write exceptions section offset(START|SIZE).
-    dataOutputStream.writeInt(dataOffset);
-    byte[] exceptionsBytes;
-    exceptionsBytes = serializeExceptions();
-    dataOutputStream.writeInt(exceptionsBytes.length);
-    dataOffset += exceptionsBytes.length;
-
-    // Write dictionary map section offset(START|SIZE).
-    dataOutputStream.writeInt(dataOffset);
-    byte[] dictionaryBytes = null;
-    if (_stringDictionary != null) {
-      dictionaryBytes = serializeStringDictionary();
-      dataOutputStream.writeInt(dictionaryBytes.length);
-      dataOffset += dictionaryBytes.length;
-    } else {
-      dataOutputStream.writeInt(0);
-    }
-
-    // Write data schema section offset(START|SIZE).
-    dataOutputStream.writeInt(dataOffset);
-    byte[] dataSchemaBytes = null;
-    if (_dataSchema != null) {
-      dataSchemaBytes = _dataSchema.toBytes();
-      dataOutputStream.writeInt(dataSchemaBytes.length);
-      dataOffset += dataSchemaBytes.length;
-    } else {
-      dataOutputStream.writeInt(0);
+    if (_serialized == null) {
+      _serialized = DataBlockUtils.serialize(this);

Review Comment:
   I think the opposite. Having serdes as separate classes:
   1. Simplifies the class. BaseDataBlock is very complex, by removing the 
serialization/deserialization code we simplify this class.
   2. Separates responsibilities. 
      * DataBlock defines how to use the block in the OpChain. On a cluster 
running different versions of Pinot, the memory layout of a data block may be 
different. 
      * DataBlockSerde defines the layout used to send the blocks through the 
network. Different Pinot instances may support different block versions, but in 
order to be able to talk to each other they have to have at least one version 
in common.
   3. Opens the possibility of having different implementations.
      * Imagine BaseDataBlock includes the serde code. If we want to try a new 
implementation, we need to modify the code, destroying the older code. That 
means it is more difficult to test or benchmark the new version.
      * In the first commits of this PR I included a class that implemented the 
V1_V2 protocol version but using almost the same code we use right now in 
master. I also kept the old `toBytes` code. It was easy to benchmark each 
implementation by having these options. I decided to remove these classes to 
make this PR simpler, but during development it was useful.
      * Imagine we create a new serde for the current V1_V2 version that is 
more efficient. The code does not support it right now but it is very easy to 
be able to configure which serde we want to use for each version with 
PinotConfiguration. This means we can try to use one serde by default but 
change to the other in case we find a bug.
   
   > I think keeping it within the class is easier for version control.
   
   I don't understand this. Obviously this PR destroys git history. But once we 
start using this mechanism there will be no difference in terms of git.
   
   Imagine that we add a V3 that serializes blocks using Apache Arrow or 
Parquet. V1_V2 and this theoretical V3 would be super different.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org

Re: [PR] Improve performance of DataBlock serde [pinot]

Reply via email to