tuor713 opened a new issue #7958:
URL: https://github.com/apache/pinot/issues/7958


   After testing upsert loads with the latest master (including 
https://github.com/apache/pinot/pull/7844), we still see occasional 
inconsistent query results (i.e. intermittently receiving fewer records than 
are known to exist), even when using only a single (large enough) segment.
   
   The test table has 1 million position records. After an initial load of all 
1 million records, we continuously push upserts into Kafka, running ingestion 
at essentially full capacity. Simultaneously, we issue `select count(*) from 
position` every second to check that we get the expected result of 1 million. 
   
   After applying the 7844 merge, queries mostly return 1 million as expected, 
but with occasional outliers of 999,999, 999,998, etc., the frequency of which 
seems to depend on machine specs.
   
   The remaining inconsistencies could be related to the order of snapshots 
in FilterPlanNode: the total number of documents is snapshotted _before_ 
the validity bitmap. We are experimenting with switching that around 
(https://github.com/tuor713/pinot/commit/22c518e4df55572efbc35cb63c76eee731d44fcf), 
and in preliminary tests that appears to eliminate all inconsistent results 
(or at the very least drastically reduce their frequency). 
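
   To illustrate why the snapshot order might matter, here is a minimal, 
deterministic sketch in plain Java. It is a simplified model, not Pinot's 
actual code: `Segment`, `upsert`, `countVisible` and the other names are 
hypothetical, `java.util.BitSet` stands in for the validity bitmap, and the 
write order inside `upsert` (update the bitmap, then publish the new document 
count) is an assumption. The "concurrent" upsert is simulated by calling it 
between the two snapshots.

```java
import java.util.BitSet;

// Simplified, hypothetical model of a mutable upsert segment (not Pinot's classes).
class SnapshotOrderSketch {
    static final class Segment {
        final BitSet validDocIds = new BitSet();
        int numDocs = 0;

        // Assumed write order: mark the new doc valid, invalidate the old doc,
        // then publish the new document count. oldDocId < 0 means "no previous version".
        void upsert(int oldDocId) {
            int newDocId = numDocs;
            validDocIds.set(newDocId);
            if (oldDocId >= 0) {
                validDocIds.clear(oldDocId);
            }
            numDocs = newDocId + 1;
        }
    }

    // Initial load: three distinct keys, stored as doc ids 0, 1 and 2.
    static Segment loadThreeKeys() {
        Segment segment = new Segment();
        for (int i = 0; i < 3; i++) {
            segment.upsert(-1);
        }
        return segment;
    }

    // A query only sees valid doc ids below its document-count snapshot.
    static int countVisible(BitSet validSnapshot, int numDocsSnapshot) {
        return validSnapshot.get(0, numDocsSnapshot).cardinality();
    }

    // Document count first, bitmap second: the order suspected in the issue.
    static int countDocsFirst() {
        Segment segment = loadThreeKeys();
        int numDocs = segment.numDocs;                        // snapshot: 3
        segment.upsert(1);                                    // upsert lands between the snapshots
        BitSet valid = (BitSet) segment.validDocIds.clone();  // snapshot: {0, 2, 3}
        return countVisible(valid, numDocs);                  // doc 1 is invalid, doc 3 is out of range
    }

    // Bitmap first, document count second: the order being experimented with.
    static int countBitmapFirst() {
        Segment segment = loadThreeKeys();
        BitSet valid = (BitSet) segment.validDocIds.clone();  // snapshot: {0, 1, 2}
        segment.upsert(1);                                    // upsert lands between the snapshots
        int numDocs = segment.numDocs;                        // snapshot: 4
        return countVisible(valid, numDocs);                  // old version of the key is still visible
    }

    public static void main(String[] args) {
        System.out.println("docs-first count:   " + countDocsFirst());
        System.out.println("bitmap-first count: " + countBitmapFirst());
    }
}
```

   Under these assumptions, the docs-first order counts 2 of the 3 keys (the 
old version is already invalidated while the new one lies beyond the count 
snapshot), whereas the bitmap-first order counts all 3. That would be 
consistent with the occasional 999,999-style results, but it has not been 
checked against the real FilterPlanNode code path.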


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


