tuor713 opened a new issue #7958: URL: https://github.com/apache/pinot/issues/7958
After testing upsert loads with the latest master (including https://github.com/apache/pinot/pull/7844), we still see occasional inconsistent query results (i.e. intermittently receiving fewer records than are known to exist), even when using only a single (large enough) segment.

The test table has 1 million position records. After an initial load of all 1 million records, we continuously push upserts into Kafka, running ingestion at essentially full capacity. Simultaneously we issue `select count(*) from position` every second to check whether we get the expected result of 1 million. After applying the 7844 merge, results are mostly 1 million as expected, but with occasional outliers of 999,999, 999,998, etc. Their frequency seems to depend on machine specs.

Potentially, the few remaining inconsistencies are related to the order of snapshots in FilterPlanNode: the snapshot of the total number of documents is taken _before_ the snapshot of the validity bitmap. We are experimenting with switching that order (https://github.com/tuor713/pinot/commit/22c518e4df55572efbc35cb63c76eee731d44fcf), and in preliminary tests that appears to eliminate all inconsistent results (or at the very least drastically reduce their frequency).
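To illustrate the suspected ordering problem, here is a minimal toy model, not Pinot's actual code. The class `SnapshotOrderSketch` and all of its methods are hypothetical, and it assumes that an upsert of an existing key invalidates the old document and appends a new one, and that this happens exactly between the reader's two snapshots.

```java
import java.util.BitSet;

// Toy model of one segment; all names here are hypothetical and not Pinot APIs.
class SnapshotOrderSketch {
    static int numDocs;          // number of documents indexed so far
    static BitSet validDocIds;   // doc ids holding the latest version of each key

    static void reset(int initialDocs) {
        numDocs = initialDocs;
        validDocIds = new BitSet();
        validDocIds.set(0, initialDocs);  // initially every document is valid
    }

    // Assumed upsert behaviour: the old doc is invalidated and a new doc is appended.
    static void upsert(int oldDocId) {
        int newDocId = numDocs;
        validDocIds.clear(oldDocId);
        validDocIds.set(newDocId);
        numDocs = newDocId + 1;
    }

    // count(*) as the reader sees it: valid docs with id below the doc-count snapshot.
    static int count(int numDocsSnapshot, BitSet validSnapshot) {
        return validSnapshot.get(0, numDocsSnapshot).cardinality();
    }

    // Doc count first, bitmap second: an upsert in between hides the old doc
    // (already invalidated) while the new doc lies beyond the stale doc count.
    static int countDocsFirst(int initialDocs, int upsertedDocId) {
        reset(initialDocs);
        int numDocsSnapshot = numDocs;
        upsert(upsertedDocId);
        BitSet validSnapshot = (BitSet) validDocIds.clone();
        return count(numDocsSnapshot, validSnapshot);
    }

    // Bitmap first, doc count second: the bitmap snapshot still marks the old doc
    // as valid, and the larger doc count adds nothing the snapshot does not know.
    static int countBitmapFirst(int initialDocs, int upsertedDocId) {
        reset(initialDocs);
        BitSet validSnapshot = (BitSet) validDocIds.clone();
        upsert(upsertedDocId);
        int numDocsSnapshot = numDocs;
        return count(numDocsSnapshot, validSnapshot);
    }

    public static void main(String[] args) {
        System.out.println(countDocsFirst(10, 3));    // prints 9
        System.out.println(countBitmapFirst(10, 3));  // prints 10
    }
}
```

In this model, one upsert landing between the two snapshots makes the count come up one short with the original order, which would be consistent with the 999,999 outliers described above. Whether the real ingestion path interleaves in exactly this way is an assumption.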