Robert Muir created LUCENE-9841:
-----------------------------------

             Summary: fix slow uses of SortedDocValues in join/ and misc/
                 Key: LUCENE-9841
                 URL: https://issues.apache.org/jira/browse/LUCENE-9841
             Project: Lucene - Core
          Issue Type: Bug
            Reporter: Robert Muir

Background: LUCENE-9796, LUCENE-9795

While fixing the API, I discovered some bad guys doing lookupOrd() on every document. LUCENE-9796 only fixed the API; it did not yet fix the slow things these two modules are doing.

It may be a good idea to break these two problems down into separate subtasks:
* lucene/join: This exposes some slow implementations, but fast ones are available (e.g. using SortedSetDocValues). Seems really easy to fix: simply use the SortedSetDocValues algorithm for SortedDocValues too, and remove the slow stuff.
* lucene/misc: DocValuesStats seems to just want the min and max values for Sorted and SortedSet. This can be done efficiently with ordinals instead of bytes: e.g. just get the min and max ordinal for each segment and call lookupOrd twice at the end of processing the segment.
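
The min/max idea for DocValuesStats can be illustrated with a small, self-contained sketch. It does not use Lucene's classes: Segment, docOrds, and OrdinalMinMax are hypothetical stand-ins, and Java strings stand in for the term bytes. The one property it relies on is that ordinals within a segment follow the sort order of the terms, so comparing ordinals is equivalent to comparing values and the dictionary only needs to be consulted twice per segment.

```java
import java.util.List;

/** Hypothetical stand-in for one segment's sorted doc values (not Lucene's API). */
final class Segment {
    final String[] sortedTerms; // term dictionary in ascending order; array index == ordinal
    final int[] docOrds;        // ordinal of each document's value, -1 if the document has none
    int lookups = 0;            // counts dictionary lookups, to show how few are needed

    Segment(String[] sortedTerms, int[] docOrds) {
        this.sortedTerms = sortedTerms;
        this.docOrds = docOrds;
    }

    /** Resolves an ordinal to its term; this is the expensive step to avoid per document. */
    String lookupOrd(int ord) {
        lookups++;
        return sortedTerms[ord];
    }
}

final class OrdinalMinMax {
    /** Returns {min, max} across all segments, or null if no document has a value. */
    static String[] minMax(List<Segment> segments) {
        String min = null;
        String max = null;
        for (Segment seg : segments) {
            // Per document: compare cheap integer ordinals only.
            int minOrd = Integer.MAX_VALUE;
            int maxOrd = -1;
            for (int ord : seg.docOrds) {
                if (ord < 0) {
                    continue; // document has no value in this segment
                }
                minOrd = Math.min(minOrd, ord);
                maxOrd = Math.max(maxOrd, ord);
            }
            if (maxOrd < 0) {
                continue; // no values at all in this segment
            }
            // Per segment: resolve exactly two ordinals to terms.
            String segMin = seg.lookupOrd(minOrd);
            String segMax = seg.lookupOrd(maxOrd);
            // Ordinals are segment-local, so segments are merged by comparing terms.
            if (min == null || segMin.compareTo(min) < 0) {
                min = segMin;
            }
            if (max == null || segMax.compareTo(max) > 0) {
                max = segMax;
            }
        }
        return min == null ? null : new String[] {min, max};
    }

    public static void main(String[] args) {
        Segment a = new Segment(new String[] {"apple", "kiwi", "pear"}, new int[] {2, 1, -1, 1});
        Segment b = new Segment(new String[] {"banana", "fig"}, new int[] {0, 1, 0});
        String[] result = minMax(List.of(a, b));
        System.out.println(result[0] + " " + result[1]); // prints "banana pear"
    }
}
```

With this shape the number of dictionary lookups depends on the number of segments rather than the number of documents. A real change would need to work against the actual SortedDocValues and SortedSetDocValues iterators and compare bytes rather than strings.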