Harnoor7 commented on PR #14771: URL: https://github.com/apache/pinot/pull/14771#issuecomment-2577062850
> I'm curious about why the lock is not released. Slower consumer shouldn't cause lock to be held. That is the core to this problem.

Yes, I understand that the core of the problem is that consumers hold the semaphore for too long, and we should focus on that. There have been two incidents where the blame was put on partial upserts:

1. A table with partial upserts enabled resulted in all Helix threads being blocked.
2. A server took 16 hours to load a consuming segment for a table that had partial upserts enabled.

What I am trying to address here is that we should not run into a situation where slow consuming segments block the entire ingestion. For example, if we have `K` Kafka partitions, `K` Helix threads acquiring the semaphore (while catching up to the offset) should not stop the download of other segments for the same table or the consumption of segments for different tables.
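To illustrate the concern, here is a minimal, self-contained Java sketch of the starvation pattern. It is not Pinot's code: the class name, the pool sizes, the 200 ms timeout, and the idea of moving unrelated work to a separate executor are all assumptions made for illustration. It parks `k` tasks on a semaphore with no permits (standing in for a semaphore held by a slow consumer) inside a fixed pool of `k` threads, then checks whether an unrelated task can still run.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration only; none of these names come from Pinot or Helix.
public class SemaphoreStarvationSketch {

  /**
   * Returns true if an unrelated task finishes within 200 ms while k tasks
   * are blocked on a semaphore that has no permits available.
   */
  static boolean unrelatedTaskRuns(int k, boolean isolateUnrelatedWork) throws Exception {
    // Stand-in for a fixed-size pool of state-transition threads (assumption).
    ExecutorService sharedPool = Executors.newFixedThreadPool(k);
    // Either reuse the shared pool or give unrelated work its own thread.
    ExecutorService otherPool =
        isolateUnrelatedWork ? Executors.newSingleThreadExecutor() : sharedPool;
    // Zero permits: models a semaphore that a slow consumer has not released yet.
    Semaphore partitionSemaphore = new Semaphore(0);
    CountDownLatch allParked = new CountDownLatch(k);
    try {
      for (int i = 0; i < k; i++) {
        sharedPool.submit(() -> {
          allParked.countDown();                       // this worker thread is now occupied
          partitionSemaphore.acquireUninterruptibly(); // blocks until a permit is released
        });
      }
      allParked.await(); // every thread of the shared pool is now busy

      // Unrelated work, e.g. downloading a segment for another table.
      Future<String> unrelated = otherPool.submit(() -> "segment downloaded");
      try {
        unrelated.get(200, TimeUnit.MILLISECONDS);
        return true;
      } catch (TimeoutException e) {
        return false; // the task is still queued behind the blocked workers
      }
    } finally {
      partitionSemaphore.release(k); // unblock the workers so both pools can drain
      sharedPool.shutdown();
      otherPool.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    System.out.println("shared pool:   unrelated task ran = " + unrelatedTaskRuns(3, false));
    System.out.println("isolated pool: unrelated task ran = " + unrelatedTaskRuns(3, true));
  }
}
```

With a shared pool the unrelated task should time out, because all `k` threads are blocked on the semaphore. With a separate executor it should complete. Isolation is only one possible mitigation; a bounded wait such as `Semaphore.tryAcquire(timeout, unit)` would be another, and neither is claimed to be what this PR implements.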