[I] Can we improve caching of dense results? [lucene]

via GitHub Fri, 15 May 2026 06:21:12 -0700


iverase opened a new issue, #16071:
URL: https://github.com/apache/lucene/issues/16071


   I am working on improving the behaviour of DocValues skippers when they 
are applied to a single-valued, dense field that is the primary sort of an 
index. In this case, for some queries we know the result is dense, so we can 
just find the minimum and maximum document of the result and create a 
DocIdSetIterator using `DocIdSetIterator.range(minDocID, maxDocID)`. This is 
the densest representation you can have for this type of iterator.
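
   To make the idea concrete, here is a minimal standalone sketch. It is not 
Lucene code: `RangeDocIdSketch` and `forSortedValues` are hypothetical names, 
and a plain sorted `long[]` stands in for a single-valued field on a sorted 
index. It assumes the upper bound of the range is exclusive.

```java
/** Hypothetical stand-in for a range iterator; not Lucene's implementation. */
final class RangeDocIdSketch {
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  private final int minDoc; // inclusive
  private final int maxDoc; // exclusive (assumption of this sketch)
  private int doc = -1;

  RangeDocIdSketch(int minDoc, int maxDoc) {
    if (minDoc < 0 || minDoc >= maxDoc) {
      throw new IllegalArgumentException("empty or invalid range");
    }
    this.minDoc = minDoc;
    this.maxDoc = maxDoc;
  }

  int docID() { return doc; }

  int nextDoc() {
    // Stay exhausted once exhausted, so doc + 1 can never overflow.
    return doc == NO_MORE_DOCS ? doc : advance(doc + 1);
  }

  int advance(int target) {
    doc = target >= maxDoc ? NO_MORE_DOCS : Math.max(target, minDoc);
    return doc;
  }

  /** One past the last doc of the current run: the whole range is one run. */
  int docIdRunEnd() { return maxDoc; }

  long cost() { return (long) maxDoc - minDoc; }

  /** Docs are ordered by value, so matches of [lo, hi] form one contiguous block. */
  static RangeDocIdSketch forSortedValues(long[] sortedValues, long lo, long hi) {
    int min = firstIndex(sortedValues, lo, false); // first value >= lo
    int max = firstIndex(sortedValues, hi, true);  // first value > hi
    return min < max ? new RangeDocIdSketch(min, max) : null; // null: no match
  }

  /** Binary search for the first index whose value is >= key (or > key if strict). */
  private static int firstIndex(long[] a, long key, boolean strict) {
    int lo = 0, hi = a.length;
    while (lo < hi) {
      int mid = (lo + hi) >>> 1;
      boolean goRight = strict ? a[mid] <= key : a[mid] < key;
      if (goRight) lo = mid + 1; else hi = mid;
    }
    return lo;
  }
}
```

   The whole result is described by two integers, and a consumer positioned 
on any matching document can learn from `docIdRunEnd()` that every document up 
to the end of the range also matches.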
   
   The issue I am seeing is that if this iterator gets cached, it loses its 
density and can potentially be cached as a FixedBitSet. This feels pretty 
wasteful, and in addition it loses some characteristics: for example, if cached 
as a RoaringDocIdSet, the iterator produced does not implement 
#docIdRunEnd, so you need to iterate one document at a time.
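
   As a rough illustration of that cost, the sketch below compares the two 
representations. It is a standalone model, not Lucene code: 
`DenseCacheCostSketch` is a hypothetical name, `java.util.BitSet` stands in 
for a fixed-size bit set, and the memory figure assumes one 64-bit word per 64 
documents with object overhead ignored.

```java
import java.util.BitSet;

/** Hypothetical cost model for caching a dense result; not Lucene's implementation. */
final class DenseCacheCostSketch {

  /** Approximate bytes for a bit set covering maxDoc docs: one long per 64 docs. */
  static long bitSetBytes(int maxDoc) {
    return 8L * ((maxDoc + 63L) / 64L);
  }

  /** A range needs only its two bounds. */
  static long rangeBytes() {
    return 2L * Integer.BYTES;
  }

  /** Counting without run information: one step per matching document. */
  static int countPerDoc(BitSet bits) {
    int count = 0;
    for (int doc = bits.nextSetBit(0); doc >= 0; doc = bits.nextSetBit(doc + 1)) {
      count++;
    }
    return count;
  }

  /** Counting with run information: a single subtraction for the whole run. */
  static int countWithRun(int minDoc, int maxDocExclusive) {
    return maxDocExclusive - minDoc;
  }
}
```

   For a segment of 1,000,000 documents, this model puts the bit set at 
125,000 bytes against 8 bytes for the range, and counting the matches takes 
one step per document instead of one step in total.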
   
   I cannot see a way to detect that a scorer is dense, so I wonder if someone 
has suggestions on how to improve this case. Maybe we should not cache such 
queries, although caching still helps here.
   


