mqliang commented on pull request #6710:
URL: https://github.com/apache/incubator-pinot/pull/6710#issuecomment-805121867


   @mcvsubbu 
   > Any reason we are restricting the trailer (or footer) to have only 
key-value pairs? We don't need to place that restriction as long as the length 
is also encoded up front. It can be any serialized object, right?
   
   You are right, it can be any serialized object, but restricting the footer 
   to KV pairs only has the following benefits:
   
   * Any object can be added as a KV pair: (key, serialized_object). So it's 
     easy to add a new section to the footer in the future.
   * All footer keys are declared in an enum, so when the footer is 
     serialized, the order of the KV pairs is deterministic. This makes every 
     KV pair positional/locatable, so we are able to replace the value of a 
     given key in the footer even after it has been serialized.
   * If we want to add a new object to the data table and we are OK with 
     putting it into the footer as a KV pair, we don't need to bump up the 
     version.
   
   Here is the pseudocode to serialize/deserialize the footer:
   ```
   enum FooterKeys {
        k0,
        k1,
        k2,
   }
   
   String[] footerKeysToStr = new String[]{
        "k0",
        "k1",
        "k2",
   };
   
   // Write every value in enum order as (length, bytes).
   function byte[] serializeFooter() {
        byte[] bytes = new byte[0];
        for (key in FooterKeys) {
            byte[] data = encode_to_str(value_of_key(key)).toBytes();
            bytes = append(bytes, len(data));
            bytes = append(bytes, data);
        }
        return bytes;
   }
   
   // Read the values back in the same order; position i belongs to key i.
   function String[] deSerializeFooter(byte[] bytes) {
        String[] values = new String[len(FooterKeys)];
        for (int i = 0; i < len(FooterKeys); i++) {
           int data_len = bytes.nextInt();
           values[i] = bytes.nextBytesOfLen(data_len).toString();
        }
        return values;
   }
   
   // If value i is a complex object instead of a string, we can deserialize
   // it even further:
   String[] footerKVPairs = deSerializeFooter(bytes);
   Object object_i = deserialize(footerKVPairs[i].toBytes());
   
   ```
   So, if we want to add a new object to the footer, we add it as a KV pair. 
   As long as we add its key as the last one in the enum, an old broker will 
   just ignore the extra entry, so the change is backward compatible.
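   
   A minimal Java sketch of that backward-compatibility argument, not the 
   actual Pinot implementation. The class name, the `OldKeys`/`NewKeys` enums, 
   and the 4-byte length prefix are assumptions made for illustration. It also 
   assumes the reader already knows where the footer starts and ends, since 
   the footer length is encoded up front:
   
   ```java
   import java.io.ByteArrayOutputStream;
   import java.nio.ByteBuffer;
   import java.nio.charset.StandardCharsets;

   final class FooterCompatSketch {
       // Hypothetical key sets: the new writer appended K3 after the keys
       // that the old reader already knows.
       enum OldKeys { K0, K1, K2 }
       enum NewKeys { K0, K1, K2, K3 }

       // Writes each value, in key order, as a 4-byte length followed by
       // the UTF-8 bytes of the value.
       static byte[] serialize(String[] valuesInKeyOrder) {
           ByteArrayOutputStream out = new ByteArrayOutputStream();
           for (String value : valuesInKeyOrder) {
               byte[] data = value.getBytes(StandardCharsets.UTF_8);
               out.write(ByteBuffer.allocate(4).putInt(data.length).array(), 0, 4);
               out.write(data, 0, data.length);
           }
           return out.toByteArray();
       }

       // Reads exactly keyCount entries and ignores any bytes after them.
       static String[] deserialize(byte[] footer, int keyCount) {
           ByteBuffer in = ByteBuffer.wrap(footer);
           String[] values = new String[keyCount];
           for (int i = 0; i < keyCount; i++) {
               byte[] data = new byte[in.getInt()];
               in.get(data);
               values[i] = new String(data, StandardCharsets.UTF_8);
           }
           return values;
       }

       public static void main(String[] args) {
           // Footer produced by a writer that knows four keys.
           byte[] footer = serialize(new String[] {"a", "bb", "ccc", "new"});
           // An old broker only knows three keys and reads only those.
           String[] seenByOldBroker = deserialize(footer, OldKeys.values().length);
           System.out.println(String.join(",", seenByOldBroker)); // prints a,bb,ccc
       }
   }
   ```
   
   The old reader never needs to understand the trailing entry; it only stops 
   after the number of keys it knows about, which is why a new key has to be 
   appended at the end of the enum rather than inserted in the middle.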
   
   If instead we allow the footer to contain not only KV pairs but also other 
   arbitrary serializable objects:
   ```
   +------------------------------------+
   |                                    |
   |    serializable object 1           |
   |                                    |
   +------------------------------------+
   |                                    |
   |    serializable object 2           |
   |                                    |
   +------------------------------------+
   |                                    |
   |    KV pairs                        |
   |                                    |
   +------------------------------------+
   
   ```
   It's not extensible: if we want to add a serializable_object_3 between 
   serializable_object_2 and the KV pairs, we need to bump up the version (and 
   once we bump the version, we can just as well add it in the middle of the 
   data table, not necessarily in the footer). 
   
   That's the reason I prefer that the footer contain only KV pairs: if we 
   want to add a new simple section to the data table and don't want to bump 
   up the version, we add it as a KV pair in the footer. If we want to add a 
   new, very complex section or rearrange the current sections, we add it in 
   the middle of the data table and bump up the version.
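   
   For the positional/locatable benefit listed in the second bullet, here is a 
   rough Java sketch of replacing one value in an already serialized footer. 
   It is an illustration only: `FooterReplaceSketch`, `replaceValue`, and the 
   4-byte length prefix are hypothetical and not taken from the PR.
   
   ```java
   import java.nio.ByteBuffer;
   import java.nio.charset.StandardCharsets;
   import java.util.Arrays;

   final class FooterReplaceSketch {
       // Same assumed layout as the pseudocode: for each key, in enum order,
       // a 4-byte length followed by the UTF-8 bytes of the value.
       static byte[] serialize(String[] valuesInKeyOrder) {
           int size = 0;
           for (String value : valuesInKeyOrder) {
               size += 4 + value.getBytes(StandardCharsets.UTF_8).length;
           }
           ByteBuffer out = ByteBuffer.allocate(size);
           for (String value : valuesInKeyOrder) {
               byte[] data = value.getBytes(StandardCharsets.UTF_8);
               out.putInt(data.length).put(data);
           }
           return out.array();
       }

       // Replaces the value at position index (the key's ordinal) without
       // decoding any of the other values.
       static byte[] replaceValue(byte[] footer, int index, String newValue) {
           ByteBuffer in = ByteBuffer.wrap(footer);
           // Skip the entries before the target using only their length prefixes.
           for (int i = 0; i < index; i++) {
               int len = in.getInt();
               in.position(in.position() + len);
           }
           int start = in.position();        // offset of the target's length prefix
           int oldLen = in.getInt();
           int end = in.position() + oldLen; // offset just past the old value
           byte[] data = newValue.getBytes(StandardCharsets.UTF_8);
           ByteBuffer out = ByteBuffer.allocate(footer.length - oldLen + data.length);
           out.put(footer, 0, start);                 // entries before the target
           out.putInt(data.length).put(data);         // the replaced entry
           out.put(footer, end, footer.length - end); // entries after the target
           return out.array();
       }

       public static void main(String[] args) {
           byte[] footer = serialize(new String[] {"a", "bb", "ccc"});
           byte[] patched = replaceValue(footer, 1, "zzzz");
           byte[] expected = serialize(new String[] {"a", "zzzz", "ccc"});
           System.out.println(Arrays.equals(patched, expected)); // prints true
       }
   }
   ```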
    

