jadami10 opened a new issue, #8929:
URL: https://github.com/apache/pinot/issues/8929

   We have an unusual consumption pattern that causes ingestion problems 
in Pinot.
   
   - our Pinot table is set to consume topic T across P partitions.
   - topic T has many different shapes of events, but at ingestion time we 
filter out any events that do not meet criteria C
   - topic T has a high number of events per second
   - for some of the P partitions, there may be days with 0 events that match 
our criteria C
   
   What ends up happening is that Pinot never seals the segments for those 
partitions but keeps extending their lease. When we restart Pinot servers, 
they resume consuming from offsets that are days old. This leads to a huge 
amount of data being consumed just to be filtered out again, throttling on the 
Kafka side, and hours of waiting for the servers to become healthy again.
   
   I believe Kinesis already has a way to seal "empty segments"; we would 
need the same here to get Pinot to continue advancing offsets correctly.
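
   To make the failure mode concrete, here is a rough toy model of a single 
partition. It is not Pinot code: `PartitionConsumer`, `seal_empty_segments`, 
`on_time_threshold` and `replay_after_restart` are hypothetical names made up 
for illustration, and the model simply assumes the offset is only checkpointed 
when a segment is sealed.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class PartitionConsumer:
    """Toy model of one consuming segment (hypothetical, not Pinot internals)."""

    seal_empty_segments: bool   # assumed switch: seal even with 0 rows kept
    committed_offset: int = 0   # offset a restarted server resumes from
    current_offset: int = 0     # offset reached in memory
    rows_in_segment: int = 0    # rows that survived the ingestion filter

    def consume(self, events: Iterable[str], matches: Callable[[str], bool]) -> None:
        # Every event is read from the stream, but only matching ones are kept.
        for event in events:
            self.current_offset += 1
            if matches(event):
                self.rows_in_segment += 1

    def on_time_threshold(self) -> None:
        # Assumption: the offset is checkpointed only when the segment seals.
        if self.rows_in_segment > 0 or self.seal_empty_segments:
            self.committed_offset = self.current_offset
            self.rows_in_segment = 0

    def restart(self) -> int:
        # Returns how many events must be re-read after a restart.
        replayed = self.current_offset - self.committed_offset
        self.current_offset = self.committed_offset
        return replayed


def replay_after_restart(seal_empty_segments: bool, events: list) -> int:
    consumer = PartitionConsumer(seal_empty_segments)
    consumer.consume(events, matches=lambda e: e == "C")
    consumer.on_time_threshold()
    return consumer.restart()


if __name__ == "__main__":
    quiet_partition = ["other"] * 1000   # nothing matches criteria C
    print(replay_after_restart(False, quiet_partition))
    print(replay_after_restart(True, quiet_partition))
```

   In this model, a partition with no matching events replays everything it 
has read unless empty segments are allowed to seal, which is the behavior we 
are asking for.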


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

