swaminathanmanish commented on PR #12157:
URL: https://github.com/apache/pinot/pull/12157#issuecomment-1861764515

   > > We should be able to emit metric within 
`PartitionGroupConsumer.fetchMessages()` when the start offset is not available 
(e.g. the asked offset range is not fully returned).
   > 
   > @Jackie-Jiang Are you suggesting to run this check in 
`RealtimeSegmentDataManager::consumeLoop` effectively instead of when 
`PartitionGroupConsumer::start` is called? Will that cause too many checks?
   > 
   > One doubt I have is whether the consumer may fall behind even after the 
consumer was started. In that case it is better to move the check to 
`consumeLoop` or `fetchMessages`.
   
   Yes, I think it's possible for records in Kafka to expire while 
`consumeLoop` is consuming from the stream (beyond the start phase). Filling a 
realtime segment can take several hours, during which records may expire. 
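
To make the idea concrete, here is a rough sketch (not the actual Pinot code) of what a per-fetch check could look like: compare the offset that was requested with the first offset that actually came back, and count the gap. `OffsetGapCheck`, `checkStartOffset` and the `skippedOffsets` counter are hypothetical names used only for illustration, and the counter is a stand-in for a real server metric.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration only; these names are not part of Pinot.
class OffsetGapCheck {
  // Stand-in for a metric counter (assumption: a real implementation
  // would emit through the server's metrics registry instead).
  static final AtomicLong skippedOffsets = new AtomicLong();

  /**
   * Compares the requested start offset with the first offset returned
   * by a fetch. If the stream returned a later offset, the records in
   * between are no longer available (for example, they expired), so the
   * gap is added to the counter and returned. Returns 0 when the batch
   * is empty or starts at (or before) the requested offset.
   */
  static long checkStartOffset(long requestedStartOffset, List<Long> fetchedOffsets) {
    if (fetchedOffsets.isEmpty()) {
      return 0L; // nothing fetched, nothing to compare against
    }
    long firstFetched = fetchedOffsets.get(0);
    long gap = firstFetched - requestedStartOffset;
    if (gap > 0) {
      skippedOffsets.addAndGet(gap);
      return gap;
    }
    return 0L;
  }

  public static void main(String[] args) {
    // Asked for offset 100, but the oldest record still retained is 150.
    long gap = checkStartOffset(100L, List.of(150L, 151L, 152L));
    System.out.println("gap=" + gap + ", total skipped=" + skippedOffsets.get());
  }
}
```

The comparison is a single subtraction per fetched batch, so running it on every fetch rather than only at consumer start should add negligible overhead.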
   
   @Jackie-Jiang - Is your intention to add the check to 
`PartitionGroupConsumer.fetchMessages()` for completeness of the check as well 
as for reuse across stream fetchers? 
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

