Robert Muir created LUCENE-9841:
-----------------------------------

             Summary: fix slow uses of SortedDocValues in join/ and misc/
                 Key: LUCENE-9841
                 URL: https://issues.apache.org/jira/browse/LUCENE-9841
             Project: Lucene - Core
          Issue Type: Bug
            Reporter: Robert Muir


Background: LUCENE-9796, LUCENE-9795

While fixing the API, I discovered some code in join/ and misc/ that calls 
lookupOrd() on every document. LUCENE-9796 only fixed the API; it did not fix 
the slow per-document lookups in these two modules.

It may be a good idea to break these two problems down into separate subtasks:

* lucene/join: This has some slow implementations exposed, but fast ones are 
available (e.g. using SortedSetDocValues). Seems really easy to fix: simply use 
the SortedSetDocValues algorithm for SortedDocValues too, and remove the slow 
stuff.
* lucene/misc: DocValuesStats seems to just want the min and max values for 
Sorted and SortedSet. This can be done efficiently with ordinals instead of 
bytes: e.g. just get the min and max ordinal for each segment and lookupOrd 
twice at the end of processing the segment.
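
A rough sketch of the ordinal-based min/max idea from the second bullet, to make 
the cost difference concrete. It does not use the Lucene API: Segment, its 
lookupOrd() counter, and minMax() are hypothetical stand-ins, and String 
comparison substitutes for Lucene's byte-wise term order. The only property it 
relies on is that, within one segment, ordinals are assigned in sorted term 
order, so the smallest ordinal seen maps to the smallest value.

```java
/** Toy illustration only; Segment and minMax are hypothetical, not Lucene classes. */
class OrdinalMinMaxSketch {

    /** One segment: a sorted term dictionary plus one ordinal per document (-1 = no value). */
    static final class Segment {
        final String[] sortedTerms;
        final int[] docOrds;
        int lookups = 0; // counts dictionary lookups, to show how few are needed

        Segment(String[] sortedTerms, int[] docOrds) {
            this.sortedTerms = sortedTerms;
            this.docOrds = docOrds;
        }

        /** Stand-in for the expensive ordinal-to-term lookup. */
        String lookupOrd(int ord) {
            lookups++;
            return sortedTerms[ord];
        }
    }

    /** Returns {min, max} across all segments, or {null, null} if no document has a value. */
    static String[] minMax(Segment... segments) {
        String min = null;
        String max = null;
        for (Segment seg : segments) {
            int minOrd = Integer.MAX_VALUE;
            int maxOrd = -1;
            // Per document: compare integers only, no term lookup.
            for (int ord : seg.docOrds) {
                if (ord < 0) {
                    continue; // document has no value
                }
                minOrd = Math.min(minOrd, ord);
                maxOrd = Math.max(maxOrd, ord);
            }
            if (maxOrd < 0) {
                continue; // no values in this segment
            }
            // Per segment: exactly two lookups, then merge by value,
            // because ordinals are not comparable across segments.
            String segMin = seg.lookupOrd(minOrd);
            String segMax = seg.lookupOrd(maxOrd);
            if (min == null || segMin.compareTo(min) < 0) {
                min = segMin;
            }
            if (max == null || segMax.compareTo(max) > 0) {
                max = segMax;
            }
        }
        return new String[] {min, max};
    }

    public static void main(String[] args) {
        Segment a = new Segment(new String[] {"apple", "kiwi", "pear"}, new int[] {2, 0, -1, 2});
        Segment b = new Segment(new String[] {"banana", "zebra"}, new int[] {1, 1, 0});
        String[] result = minMax(a, b);
        System.out.println(result[0] + " " + result[1]); // prints "apple zebra"
    }
}
```

With this shape the number of lookups depends on the number of segments, not 
the number of documents; values from different segments are compared only after 
the lookup.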



