mcvsubbu commented on issue #5647:
URL: 
https://github.com/apache/incubator-pinot/issues/5647#issuecomment-660538878


   Assuming there is only one Pinot schema, and we don't have to do any "join" 
across two streams, here is one approach I can think of:
   - Change our software so that we are not dependent on the table name being 
the first part of an LLC segment name. This may not be too hard; we may just 
need to send the table name in the segment completion protocol requests and 
responses.
   - Consume each stream independently, naming the segments differently for 
each stream. So, we may have segment names for stream A (with, say, 4 
partitions) starting with the string "StreamA" and segment names for stream B 
(with, say, 7 partitions) starting with the string "StreamB".
   - Do the capacity calculation, etc., just like we do for two tables, each 
consuming one stream. Complete the segments independently for the two streams, 
but add the completed segments (and the consuming segments) to the same table. 
So, we may have 3 servers consuming StreamA and 5 servers consuming StreamB, 
but that is OK.
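
   To make the naming idea above concrete, here is a minimal sketch in Python 
(not Pinot code). Everything in it is hypothetical: the `StreamSegmentName` and 
`CompletionRequest` classes, the `initial_consuming_segments` helper, the `__` 
separator, and the four-field name layout are assumptions chosen only to 
illustrate a segment name that starts with the stream name while the table 
name travels separately in the completion request.

```python
from dataclasses import dataclass
from typing import Dict

SEPARATOR = "__"  # assumed field separator, for illustration only


@dataclass(frozen=True)
class StreamSegmentName:
    """Hypothetical segment name whose first field is the stream, not the table."""
    stream_name: str
    partition_id: int
    sequence_number: int
    creation_time: str

    def to_string(self) -> str:
        return SEPARATOR.join([
            self.stream_name,
            str(self.partition_id),
            str(self.sequence_number),
            self.creation_time,
        ])

    @classmethod
    def parse(cls, name: str) -> "StreamSegmentName":
        parts = name.split(SEPARATOR)
        if len(parts) != 4:
            raise ValueError(f"unexpected segment name: {name!r}")
        stream, partition, sequence, created = parts
        return cls(stream, int(partition), int(sequence), created)


@dataclass(frozen=True)
class CompletionRequest:
    """Hypothetical completion message: the table name is an explicit field
    because it can no longer be derived from the segment name."""
    table_name: str
    segment_name: str
    offset: int


def initial_consuming_segments(
    table_name: str,
    stream_partitions: Dict[str, int],
    creation_time: str,
) -> Dict[str, str]:
    """Return {segment name: owning table} for the first consuming segment
    of every partition of every stream feeding the same table."""
    owners: Dict[str, str] = {}
    for stream_name, partition_count in stream_partitions.items():
        for partition_id in range(partition_count):
            segment = StreamSegmentName(stream_name, partition_id, 0, creation_time)
            owners[segment.to_string()] = table_name
    return owners
```

   With `{"StreamA": 4, "StreamB": 7}` as input, this yields 11 consuming 
segments that all belong to one table, and each stream's segments can then be 
completed on their own schedule.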
   
   Now, I have no clue what to do if the two schemas are different and we need 
to join the streams.
   
   @npawar, @kishoreg I can write this up in a design doc if this is 
acceptable as a feature ask. If you think join is a feature soon to come, then 
we need to step back and re-think the approach.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


