Jackie-Jiang commented on code in PR #13303: URL: https://github.com/apache/pinot/pull/13303#discussion_r1722533948
########## pinot-common/src/main/java/org/apache/pinot/common/datablock/BaseDataBlock.java: ########## @@ -426,168 +401,202 @@ public Map<Integer, String> getExceptions() { return _errCodeToExceptionMap; } - /** - * Serialize this data block to a byte array. - * <p> - * In order to deserialize it, {@link DataBlockUtils#getDataBlock(ByteBuffer)} should be used. - */ @Override - public byte[] toBytes() + public List<ByteBuffer> serialize() throws IOException { - ThreadResourceUsageProvider threadResourceUsageProvider = new ThreadResourceUsageProvider(); - - UnsynchronizedByteArrayOutputStream byteArrayOutputStream = new UnsynchronizedByteArrayOutputStream(8192); - DataOutputStream dataOutputStream = new DataOutputStream(byteArrayOutputStream); - writeLeadingSections(dataOutputStream); - - // Write metadata: length followed by actual metadata bytes. - // NOTE: We ignore metadata serialization time in "responseSerializationCpuTimeNs" as it's negligible while - // considering it will bring a lot code complexity. - serializeMetadata(dataOutputStream); - - return byteArrayOutputStream.toByteArray(); - } - - private void writeLeadingSections(DataOutputStream dataOutputStream) - throws IOException { - dataOutputStream.writeInt(getDataBlockVersionType()); - dataOutputStream.writeInt(_numRows); - dataOutputStream.writeInt(_numColumns); - int dataOffset = HEADER_SIZE; - - // Write exceptions section offset(START|SIZE). - dataOutputStream.writeInt(dataOffset); - byte[] exceptionsBytes; - exceptionsBytes = serializeExceptions(); - dataOutputStream.writeInt(exceptionsBytes.length); - dataOffset += exceptionsBytes.length; - - // Write dictionary map section offset(START|SIZE). - dataOutputStream.writeInt(dataOffset); - byte[] dictionaryBytes = null; - if (_stringDictionary != null) { - dictionaryBytes = serializeStringDictionary(); - dataOutputStream.writeInt(dictionaryBytes.length); - dataOffset += dictionaryBytes.length; - } else { - dataOutputStream.writeInt(0); - } - - // Write data schema section offset(START|SIZE). - dataOutputStream.writeInt(dataOffset); - byte[] dataSchemaBytes = null; - if (_dataSchema != null) { - dataSchemaBytes = _dataSchema.toBytes(); - dataOutputStream.writeInt(dataSchemaBytes.length); - dataOffset += dataSchemaBytes.length; - } else { - dataOutputStream.writeInt(0); + if (_serialized == null) { + _serialized = DataBlockUtils.serialize(this); Review Comment: Thanks for the explanation. Now I understand why the version is moved into the ser-de class: basically we want to shift the wire format control into the `DataBlockSerDe`, and `DataBlock` only defines the in-memory format right? Since you are removing the version from `DataBlock`, and currently we only allow a single version of `DataBlock`, should we consider also cleaning up the backward compatible `DataBlock`s in the test since this no longer apply? In the `DataBlockSerDe`, we can make a clean start from version 0 or 1 to reduce confusion for future developers. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org