mcvsubbu opened a new issue #7741: URL: https://github.com/apache/pinot/issues/7741
This issue affects tables that are configured with any offset criteria other than SMALLEST.

Tables are often provisioned with the offset criteria set to LARGEST (that is, ignore earlier offsets and consume only the latest messages). This avoids consuming older data from a stream only to discard it because it is too old. Other possible criteria are CUSTOM or time-period based.

Pinot has a periodic task (RealtimeSegmentValidationManager) that scans the stream for new partitions and starts consumers for any new partitions it detects. It is possible (and most likely the case) that the new partitions were created between two runs of RealtimeSegmentValidationManager and already have some data in them by the time they are detected. In such cases Pinot applies the offset criteria from the table config to the new partitions as well, so it ignores the earliest messages in them and starts consuming only from the offset that the criteria selects.

This was introduced in PR #4695
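The skipped-message behavior described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not Pinot code: `resolve_start_offset` and `skipped_messages` are hypothetical helpers, a partition is modeled as a plain list whose indices are offsets, and only the SMALLEST and LARGEST criteria are modeled.

```python
def resolve_start_offset(messages, offset_criteria):
    """Return the offset a newly started consumer would begin reading from.

    Hypothetical model: offsets are list indices, and LARGEST means
    "start at the next offset to be written".
    """
    if offset_criteria == "SMALLEST":
        return 0
    if offset_criteria == "LARGEST":
        return len(messages)
    raise ValueError(f"criteria not modeled in this sketch: {offset_criteria}")


def skipped_messages(messages, offset_criteria):
    """Return the messages that a newly started consumer never reads."""
    return messages[:resolve_start_offset(messages, offset_criteria)]


# A partition created between two validation runs already holds data
# by the time the periodic task detects it and starts a consumer.
new_partition = ["m0", "m1", "m2"]

print(skipped_messages(new_partition, "LARGEST"))   # all three messages are skipped
print(skipped_messages(new_partition, "SMALLEST"))  # nothing is skipped
```

In this model, the messages written to the new partition before it was detected are lost under LARGEST, while SMALLEST reads them all, which matches the observation that only tables with a criteria other than SMALLEST are affected.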