mcvsubbu commented on issue #5647: URL: https://github.com/apache/incubator-pinot/issues/5647#issuecomment-660538878
Assuming there is only one Pinot schema, and we don't have to do any "join" across two streams, here is one approach I can think of:

- Change our software so that it does not depend on the table name being the first part of an LLC segment name. This may not be too hard. We may just need to send the table name in the segment completion protocol requests and responses.
- Consume each stream independently, naming the segments differently for each stream. So, we may have segment names for stream A (with, say, 4 partitions) starting with the string "StreamA" and segment names for stream B (with, say, 7 partitions) starting with the string "StreamB".
- Do the capacity calculation, etc. just like we do for two tables, each consuming one stream. Complete the segments independently for the two streams, but add the completed segments (and the consuming segments) to the same table. So, we may have 3 servers consuming StreamA and 5 servers consuming StreamB, but that is OK.

Now, I have no clue what to do if the two schemas are different and we need to join the streams.

@npawar, @kishoreg I can write this up in a design doc if this is acceptable as a feature ask. If you think join is a feature soon to come, then we need to step back and rethink the approach.
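
To make the naming idea concrete, here is a rough sketch in Python (not Pinot code). It assumes a segment name of the form `<streamPrefix>__<partitionId>__<sequenceNumber>`, with the `__` separator chosen only to resemble existing LLC names, and a completion request that carries the table name as its own field. `StreamSegmentName`, `completion_request`, and `consuming_segments` are hypothetical names made up for illustration.

```python
from dataclasses import dataclass

SEPARATOR = "__"  # assumption: chosen to resemble existing LLC segment names


@dataclass(frozen=True)
class StreamSegmentName:
    """Hypothetical segment name that starts with a stream prefix, not the table name."""
    stream_prefix: str
    partition_id: int
    sequence_number: int

    def __str__(self) -> str:
        parts = [self.stream_prefix, str(self.partition_id), str(self.sequence_number)]
        return SEPARATOR.join(parts)


def completion_request(table_name: str, segment: StreamSegmentName) -> dict:
    """Hypothetical request: the table name travels as a field, not inside the segment name."""
    return {"table": table_name, "segment": str(segment)}


def consuming_segments(table_name: str, partitions_per_stream: dict) -> list:
    """Build one first consuming segment per partition of each stream, all for one table."""
    requests = []
    for stream_prefix, num_partitions in partitions_per_stream.items():
        for partition_id in range(num_partitions):
            segment = StreamSegmentName(stream_prefix, partition_id, 0)
            requests.append(completion_request(table_name, segment))
    return requests


# Stream A has 4 partitions and stream B has 7; both feed the same table.
segments = consuming_segments("myTable", {"StreamA": 4, "StreamB": 7})
```

With the table name carried separately, nothing in the segment name has to identify the table, so each stream can be sized and completed on its own while all 11 segments still belong to `myTable`.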