mcvsubbu commented on issue #7004:
URL: 
https://github.com/apache/incubator-pinot/issues/7004#issuecomment-854927322


   +1 to keeping it a virtual column.
   +1 to making it configurable, since it can add considerable overhead.
   +1 to keeping it a string; Pinot is transparent to the stream underneath.
   We should not build indices on it. Please use a raw index, which is now 
supported for consuming segments.
   
   In the case of Kinesis, we should (or may want to) also keep track of other 
metadata, such as the partition IDs in the group while the segment was being 
consumed. I think these do not change (if they do, we close the segment), but 
@npawar or @KKcorps can comment on that.
   
   Since this is stream-dependent, I would make it a string that contains (at 
a minimum) the serialized StreamMsgOffset and the partition group ID. 
Beyond that, each stream may add its own fields.
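   
   A rough sketch of what such a string could look like, purely for 
illustration. RowStreamMetadata, the key names, and the ';' / '=' delimiters 
are hypothetical and not existing Pinot code; the offset is passed in as an 
already-serialized string so that the sketch stays stream-agnostic.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration only; not an existing Pinot class. */
public final class RowStreamMetadata {
  private RowStreamMetadata() {}

  /**
   * Builds "offset=<offset>;partitionGroupId=<id>[;key=value...]".
   * Assumes keys and values contain neither ';' nor '='.
   * serializedOffset stands in for whatever the stream's offset serializes to.
   */
  public static String encode(String serializedOffset, int partitionGroupId,
      Map<String, String> streamSpecificFields) {
    StringBuilder sb = new StringBuilder();
    sb.append("offset=").append(serializedOffset);
    sb.append(";partitionGroupId=").append(partitionGroupId);
    // Each stream may append its own fields after the two mandatory ones.
    for (Map.Entry<String, String> e : streamSpecificFields.entrySet()) {
      sb.append(';').append(e.getKey()).append('=').append(e.getValue());
    }
    return sb.toString();
  }

  /** Splits the encoded string back into its fields, preserving their order. */
  public static Map<String, String> decode(String encoded) {
    Map<String, String> fields = new LinkedHashMap<>();
    for (String pair : encoded.split(";")) {
      int eq = pair.indexOf('=');
      fields.put(pair.substring(0, eq), pair.substring(eq + 1));
    }
    return fields;
  }
}
```

   Keeping the two mandatory fields first means a reader that only cares 
about the offset and partition group ID can ignore whatever follows.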
   
   Also, consider a less verbose version of this, in which data common to 
the entire segment is kept in the segment metadata (some of it is already in 
the ZK metadata). For Kafka, this could mean the start/end offsets, partition 
group ID, etc.
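   
   To illustrate the less verbose variant, here is a minimal sketch under 
the assumption that the partition group ID is fixed for a segment. 
SegmentStreamSummary and its fields are hypothetical names, not Pinot's 
actual segment or ZK metadata classes.

```java
/** Hypothetical holder for stream fields shared by every row of one segment. */
public final class SegmentStreamSummary {
  private final int partitionGroupId;
  private final String startOffset;
  private final String endOffset;

  public SegmentStreamSummary(int partitionGroupId, String startOffset, String endOffset) {
    this.partitionGroupId = partitionGroupId;
    this.startOffset = startOffset;
    this.endOffset = endOffset;
  }

  public String getStartOffset() {
    return startOffset;
  }

  public String getEndOffset() {
    return endOffset;
  }

  /** The per-row value shrinks to the serialized offset alone. */
  public String rowValue(String serializedOffset) {
    return serializedOffset;
  }

  /** Rebuilds a verbose per-row string from the short row value plus segment-level data. */
  public String expand(String rowValue) {
    return "offset=" + rowValue + ";partitionGroupId=" + partitionGroupId;
  }
}
```

   The trade-off is that a reader needs the segment-level data as well as 
the row value to reconstruct the full picture.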

