klsince opened a new pull request, #12976: URL: https://github.com/apache/pinot/pull/12976
This PR adds support for a consistent table view when querying upsert tables. Consistency is provided at the table partition level, and a partition can contain many segments. Today, an update that touches two segments' bitmaps is not atomic, so a query can see an inconsistent table view, e.g. return fewer PKs than expected.

The high-level idea is either to synchronize the query threads with the upsert threads (including the consuming thread or HelixTaskExecutor threads) so that queries get a consistent set of segments' validDocIds bitmaps, or to let the upsert threads keep a copy of the bitmaps and refresh it regularly. Both modes are added in this PR, as they can be wired up using one R/W lock.

Configs:
1. By default, the feature is disabled.
2. If _enableUpsertView=true, the upsert threads take the WLock when an upsert involves two segments' bitmaps, and the query threads take the RLock when getting the bitmaps for all of their selected segments. This is the sync mode: it gives the best data freshness, but queries may block data ingestion. It is mainly for low-QPS or low-ingestion cases.
3. If _enableUpsertViewBatchRefresh=true, the query threads don't need to take the lock when getting bitmaps. They access a copy of the bitmaps that the upsert threads keep updated periodically. In this batch mode, if data freshness is a concern, a query can set the query option upsertViewFreshnessMs (e.g. 3s) to state its tolerance for stale data, and the query thread can refresh the bitmap copies immediately if they are not fresh enough. This mode is for high-QPS and high-ingestion cases.
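For readers who want a concrete picture of the two modes, here is a minimal sketch of how one R/W lock could serve both. It is not the code in this PR: the class UpsertViewSketch and all of its method names are hypothetical, java.util.BitSet stands in for the validDocIds bitmaps, and for simplicity every upsert takes the write lock, whereas the description above only requires it when two segments' bitmaps are involved.

```java
import java.util.BitSet;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical illustration only; not Pinot's implementation. */
public class UpsertViewSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    // Live bitmaps, one per segment (BitSet is a stand-in for validDocIds).
    private final Map<String, BitSet> validDocIds = new HashMap<>();
    // Batch mode: a periodically refreshed copy that queries read without locking.
    private volatile Map<String, BitSet> snapshot = Collections.emptyMap();
    private volatile long snapshotTimeMs = 0L;

    /** Upsert that touches a single segment's bitmap. */
    public void addDoc(String segment, int docId) {
        lock.writeLock().lock();
        try {
            validDocIds.computeIfAbsent(segment, s -> new BitSet()).set(docId);
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Upsert that moves a PK between two segments; both bitmaps change atomically. */
    public void replaceDoc(String oldSegment, int oldDocId, String newSegment, int newDocId) {
        lock.writeLock().lock();
        try {
            validDocIds.computeIfAbsent(oldSegment, s -> new BitSet()).clear(oldDocId);
            validDocIds.computeIfAbsent(newSegment, s -> new BitSet()).set(newDocId);
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Sync mode: a query copies all bitmaps under the read lock. */
    public Map<String, BitSet> syncView() {
        lock.readLock().lock();
        try {
            return copyAll();
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Batch mode: called periodically by an upsert thread, or by a query that needs fresher data. */
    public void refreshSnapshot() {
        lock.readLock().lock();
        try {
            snapshot = copyAll();                        // publish the copy first
            snapshotTimeMs = System.currentTimeMillis(); // then its timestamp
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Batch mode: lock-free read unless the copy is older than the query tolerates. */
    public Map<String, BitSet> batchView(long freshnessMs) {
        if (System.currentTimeMillis() - snapshotTimeMs > freshnessMs) {
            refreshSnapshot();
        }
        return snapshot;
    }

    // Caller must hold the read or write lock.
    private Map<String, BitSet> copyAll() {
        Map<String, BitSet> copy = new HashMap<>();
        for (Map.Entry<String, BitSet> e : validDocIds.entrySet()) {
            copy.put(e.getKey(), (BitSet) e.getValue().clone());
        }
        return Collections.unmodifiableMap(copy);
    }
}
```

In this sketch a query in sync mode would call syncView(), and a query in batch mode would call batchView(t), where t plays the role of upsertViewFreshnessMs. A stale batch view may lag behind ingestion, but it never shows a PK as valid in two segments or in none, because the copy is taken while no two-segment update is in flight.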
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org