walterddr commented on PR #12244: URL: https://github.com/apache/pinot/pull/12244#issuecomment-1894377982
> Yes @walterddr. So at FixedByteMVMutableForwardIndex.java, should we pass the MaxDocID to the getLongMV(), etc. methods to check that we are only reading within the max value range we are sure of?

so basically

1. at the time this `IndexContainer` is created (in `MutableSegmentImpl`), it has a max length covering all the MV values for that index container (or column if you will)
2. the method that causes the problem is `FixedByteMVMutableForwardIndex.getLongMV(docId)`, which is called during query
3. the argument passed in here is the docId from `MVScanDocIdIterator`, which iterates from 0 to `numDocs`
4. this `numDocs` is passed in at query time from `_indexSegment.getSegmentMetadata().getTotalDocs()` <-- which comes from the segment metadata
5. this is acquired by `MutableSegmentImpl.getTotalDocs`

there is a potential discrepancy between step 5 (at the time of query) and step 1 (at the time of `IndexContainer` creation).

chasing down the codepath, it looks like 5 is always acquired before 1, so i don't really think this is a problem. but there might be a codepath i didn't chase down that reverses the order of getting the data source and getting `numDocs`.

what i would suggest as a fix: instead of getting the total docs from the index segment, we get the total docs from the data source itself (not sure if this is possible)
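To make the ordering concern concrete, here is a minimal, self-contained Java sketch. It does not use any Pinot classes: `ToyColumn`, `ToySnapshot`, and `sumAll` are hypothetical names made up for illustration, and the copy-on-snapshot behavior is an assumption, not how `MutableSegmentImpl` works. It only shows why a doc count read after the data source was obtained can exceed what that data source covers, and how taking the count from the data source itself avoids the mismatch.

```java
import java.util.ArrayList;
import java.util.List;

final class SnapshotSketch {
  /** Hypothetical stand-in for a mutable multi-value column; not a Pinot class. */
  static final class ToyColumn {
    private final List<long[]> _values = new ArrayList<>();

    synchronized void add(long[] mv) {
      _values.add(mv);
    }

    /** Plays the role of a segment-level total doc count. */
    synchronized int numDocs() {
      return _values.size();
    }

    /** Captures a view whose doc count is fixed at the moment of capture. */
    synchronized ToySnapshot snapshot() {
      return new ToySnapshot(new ArrayList<>(_values));
    }
  }

  /** Hypothetical "data source" that owns its own doc count. */
  static final class ToySnapshot {
    private final List<long[]> _values;

    ToySnapshot(List<long[]> values) {
      _values = values;
    }

    int numDocs() {
      return _values.size();
    }

    long[] getLongMV(int docId) {
      return _values.get(docId);
    }
  }

  /** Scans using the doc count owned by the snapshot, never a count read elsewhere. */
  static long sumAll(ToySnapshot snapshot) {
    long sum = 0;
    for (int docId = 0; docId < snapshot.numDocs(); docId++) {
      for (long v : snapshot.getLongMV(docId)) {
        sum += v;
      }
    }
    return sum;
  }

  public static void main(String[] args) {
    ToyColumn column = new ToyColumn();
    column.add(new long[] {1, 2});

    ToySnapshot snapshot = column.snapshot(); // data source acquired first
    column.add(new long[] {3});               // ingestion continues
    int segmentTotalDocs = column.numDocs();  // count read later is now larger

    // Iterating the snapshot up to segmentTotalDocs would read past what it covers.
    System.out.println(segmentTotalDocs + " vs " + snapshot.numDocs()); // prints "2 vs 1"
    System.out.println(sumAll(snapshot));                                // prints "3"
  }
}
```

In this toy version the reverse order (count first, data source second) is harmless because the count can only be smaller than what the snapshot covers, which matches the reasoning in the steps above.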