Robert Muir created LUCENE-9841:
-----------------------------------
Summary: fix slow uses of SortedDocValues in join/ and misc/
Key: LUCENE-9841
URL: https://issues.apache.org/jira/browse/LUCENE-9841
Project: Lucene - Core
Issue Type: Bug
Reporter: Robert Muir
Background: LUCENE-9796, LUCENE-9795
While fixing the API, I discovered some bad guys calling lookupOrd() on every
document. LUCENE-9796 only fixed the API; it did not yet fix the slow things
that join/ and misc/ are doing.
It may be a good idea to break these two problems down into separate subtasks:
* lucene/join: This has some slow implementations exposed, but fast ones are
available (e.g. using SortedSetDocValues). Seems really easy to fix: simply use
the SortedSetDocValues algorithm for SortedDocValues too, and remove the slow
stuff.
* lucene/misc: DocValuesStats seems to just want the min and max values for
Sorted and SortedSet. This can be done efficiently with ordinals instead of
bytes: e.g. just get the min and max ordinal for each segment and lookupOrd
twice at the end of processing the segment.
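
The lucene/misc idea above can be sketched roughly as follows. This is only an
illustration under stated assumptions, not the actual DocValuesStats code: the
Segment class is a hypothetical stand-in for one segment's SortedDocValues
(a sorted term dictionary plus one ordinal per document), and MinMaxOrdSketch,
Stats and minMax are made-up names. Strings stand in for the term bytes. The
point is that the per-document loop only compares ints, and lookupOrd is called
at most twice per segment.

```java
import java.util.List;

public class MinMaxOrdSketch {

    /** Hypothetical stand-in for one segment's sorted doc values (not a Lucene class). */
    public static final class Segment {
        public final String[] dictionary; // sorted unique terms; array index == ordinal
        public final int[] docOrds;       // ordinal per document, -1 if the doc has no value
        public int lookups = 0;           // counts lookupOrd calls, for illustration only

        public Segment(String[] dictionary, int[] docOrds) {
            this.dictionary = dictionary;
            this.docOrds = docOrds;
        }

        /** Resolves an ordinal to its term; this is the expensive call we want to avoid per doc. */
        public String lookupOrd(int ord) {
            lookups++;
            return dictionary[ord];
        }
    }

    /** Running min/max over resolved values, merged across segments. */
    public static final class Stats {
        public String min;
        public String max;

        void accept(String value) {
            if (min == null || value.compareTo(min) < 0) min = value;
            if (max == null || value.compareTo(max) > 0) max = value;
        }
    }

    public static Stats minMax(List<Segment> segments) {
        Stats stats = new Stats();
        for (Segment seg : segments) {
            int minOrd = Integer.MAX_VALUE;
            int maxOrd = -1;
            // Per-document work: integer comparisons only, no term lookups.
            for (int ord : seg.docOrds) {
                if (ord < 0) continue; // document without a value
                if (ord < minOrd) minOrd = ord;
                if (ord > maxOrd) maxOrd = ord;
            }
            // Ordinals are only meaningful within a segment, so resolve them
            // here (two lookups) and merge across segments by value.
            if (maxOrd >= 0) {
                stats.accept(seg.lookupOrd(minOrd));
                stats.accept(seg.lookupOrd(maxOrd));
            }
        }
        return stats;
    }

    public static void main(String[] args) {
        Segment a = new Segment(new String[] {"apple", "kiwi", "pear"}, new int[] {2, 1, -1, 1});
        Segment b = new Segment(new String[] {"banana", "zebra"}, new int[] {1, 0, 1});
        Stats s = minMax(List.of(a, b));
        System.out.println("min=" + s.min + " max=" + s.max);
    }
}
```

This works because the dictionary is sorted, so the smallest ordinal seen in a
segment is also the smallest value seen in that segment, and likewise for the
largest. A SortedSet variant would track min and max over all ordinals of each
document in the same way.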