mcvsubbu commented on issue #6637:
URL: https://github.com/apache/incubator-pinot/issues/6637#issuecomment-790119420


   This has to be done on a per-partition basis. There are a couple of ways to 
do this:
   - Provide a method to restart the current consuming segment at a chosen 
offset. So, if a segment started consumption at offset 100 but hit a problem at 
120, the user can set the start offset to (say) 125. In this case, the data 
from offset 100 through 124 is lost, and we can only hope there are no bad 
offsets at or beyond 125.
   - Provide a method to skip certain offsets in a stream partition. The user 
specifies an array of offsets that the consumer should skip. This is more 
complicated in terms of user interface, but preserves as much data as 
possible.
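
   To make the trade-off of the first option concrete, here is a minimal 
sketch in Python. It is not Pinot code: `ConsumingSegment` and 
`reset_start_offset` are hypothetical names invented for illustration, and the 
sketch only models which offsets are given up when the start offset of one 
partition's consuming segment is moved forward.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConsumingSegment:
    """Hypothetical stand-in for one partition's consuming segment."""
    partition: int
    start_offset: int


def reset_start_offset(segment: ConsumingSegment, new_start_offset: int):
    """Move the start offset forward and report the offsets that are dropped.

    Returns the restarted segment and the range of offsets
    [old start, new start) that will never be consumed.
    """
    if new_start_offset <= segment.start_offset:
        raise ValueError("new start offset must be past the current start offset")
    dropped = range(segment.start_offset, new_start_offset)
    restarted = ConsumingSegment(segment.partition, new_start_offset)
    return restarted, dropped
```

   With the numbers from the example above (start at 100, restart at 125), the 
dropped range covers 25 offsets, 100 through 124, even though only offset 120 
was bad.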
   
   I expect the second option to be a bit more complex in implementation as 
well. We will need to consume each row and check whether its offset is taboo 
or not.
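
   The per-row check could look roughly like the sketch below. Again, this is 
an illustration under assumptions rather than Pinot's consumer: 
`consume_with_skips` is a hypothetical helper, rows are modelled as 
`(offset, payload)` tuples, and the skip list is assumed to be supplied per 
partition as a mapping from partition id to offsets.

```python
from typing import Dict, Iterable, Iterator, Tuple

Row = Tuple[int, str]  # (offset, payload); an assumed shape for illustration


def consume_with_skips(
    partition: int,
    rows: Iterable[Row],
    skip_offsets: Dict[int, Iterable[int]],
) -> Iterator[Row]:
    """Yield every row of one partition except those at user-listed offsets."""
    # Build a set once so the per-row membership check is O(1).
    taboo = frozenset(skip_offsets.get(partition, ()))
    for offset, payload in rows:
        if offset in taboo:
            continue  # the user marked this offset as bad; drop only this row
        yield offset, payload
```

   Here only the listed offsets are lost, at the cost of a lookup for every 
consumed row and a user interface that accepts a list of offsets per 
partition.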
   
   I prefer the first approach -- big hammer but simple(r).

