mqliang commented on a change in pull request #6710:
URL: https://github.com/apache/incubator-pinot/pull/6710#discussion_r599306832



##########
File path: 
pinot-core/src/main/java/org/apache/pinot/core/common/datatable/DataTableImplV2V3.java
##########
@@ -33,12 +33,15 @@
 import org.apache.pinot.common.utils.DataTable;
 import org.apache.pinot.common.utils.StringUtil;
 import org.apache.pinot.core.common.ObjectSerDeUtils;
+import org.apache.pinot.core.query.request.context.ThreadTimer;
 import org.apache.pinot.spi.utils.ByteArray;
 import org.apache.pinot.spi.utils.BytesUtils;
 
 
-public class DataTableImplV2 implements DataTable {
-  private static final int VERSION = 2;
+public class DataTableImplV2V3 implements DataTable {

Review comment:
       I name it as DataTableImplV2V3 since V2 and V3 share a lot of common 
logic. If V2 and V3 has major changes, as you suggest:
   > Since we are anyway bumping up the version, how about we move the existing 
metadata of key-value pairs to the end of file to keep consistency in the 
format. So, all the metadata stuff (aka key-value pairs) + new positional stuff 
can be a file footer.
   
   If we do that, I vote for put V2 logic into DataTableImplV2 and V3 logic 
into DataTableImplV3, and extract common logic (e.g. serialize/de-serialize 
metada/dictionaryMap into DataTableUtils.java)
   
   > move the existing metadata of key-value pairs to the end of file 
   
   Actually I considered that. I also considered to make metadata as a 
`String[]` instead of `Map<String, String>` and make all meta data keys as enum 
value. Also make "serialization_cpu_times_ns" as part of metadata. In other 
words, "serialization_cpu_times_ns" is part of mate data and footer section 
only contains meta data. In this way:
   * all meta data is positional, we can replace values in metadata even after 
data table is serialized. (`Map<String, String>` is not positional because when 
loop over a hashmap, the order of items is not deterministic,  but loop over of 
an array, the order is deterministic)
   * meta data previously is `Map<String, String>`, where we need to write 
keys(type string) to byte buffer. When replaces as `String[]`, we don't write 
the enum constant itself. Just the value (length+bytes) corresponding to the 
ordinal/position of the constant. So less data is transfered between 
server/broker.
   
   But if we change in this way, as I previously stated, I vote to keep the 
current DataTableImplV2.java as it is, and create a DataTableImplV3.java to put 
all V3 logic (with extracting common  into DataTableUtils.java ). Otherwise, 
puting all V2/V3 logic in same file will make the code hard to read. 
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org

Reply via email to