uschindler commented on code in PR #12183:
URL: https://github.com/apache/lucene/pull/12183#discussion_r1332936743


##########
lucene/core/src/java/org/apache/lucene/index/TermStates.java:
##########
@@ -211,4 +244,40 @@ public String toString() {
     return sb.toString();
   }
+
+  /** Wrapper over TermState, ordinal value, term doc frequency and total term frequency */

Review Comment:
   In main branch we could make this a `record` (unfortunately not in Java 11).



##########
lucene/core/src/java/org/apache/lucene/index/TermStates.java:
##########
@@ -86,19 +90,48 @@ public TermStates(
    * @param needsStats if {@code true} then all leaf contexts will be visited up-front to collect
    *     term statistics. Otherwise, the {@link TermState} objects will be built only when requested
    */
-  public static TermStates build(IndexReaderContext context, Term term, boolean needsStats)
+  public static TermStates build(IndexSearcher indexSearcher, Term term, boolean needsStats)
       throws IOException {
-    assert context != null && context.isTopLevel;
+    IndexReaderContext context = indexSearcher.getTopReaderContext();
+    assert context != null;
     final TermStates perReaderTermState = new TermStates(needsStats ? null : term, context);
     if (needsStats) {
-      for (final LeafReaderContext ctx : context.leaves()) {
-        // if (DEBUG) System.out.println(" r=" + leaves[i].reader);
-        TermsEnum termsEnum = loadTermsEnum(ctx, term);
-        if (termsEnum != null) {
-          final TermState termState = termsEnum.termState();
-          // if (DEBUG) System.out.println(" found");
-          perReaderTermState.register(
-              termState, ctx.ord, termsEnum.docFreq(), termsEnum.totalTermFreq());
+      TaskExecutor taskExecutor = indexSearcher.getTaskExecutor();
+      if (taskExecutor != null) {
+        // build the term states concurrently
+        List<TaskExecutor.Task<TermStateInfo>> tasks =
+            context.leaves().stream()
+                .map(
+                    ctx ->
+                        taskExecutor.createTask(
+                            () -> {
+                              TermsEnum termsEnum = loadTermsEnum(ctx, term);
+                              if (termsEnum != null) {
+                                return new TermStateInfo(
+                                    termsEnum.termState(),
+                                    ctx.ord,
+                                    termsEnum.docFreq(),
+                                    termsEnum.totalTermFreq());
+                              }
+                              return null;
+                            }))
+                .toList();
+        List<TaskExecutor.Task<TermStateInfo>> taskList = new ArrayList<>(tasks);

Review Comment:
   To me it makes no sense to clone the list, as it is a new list already (created by `stream.toList()`). In addition the original list is not used anymore.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

For additional commands, e-mail: issues-h...@lucene.apache.org
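
For illustration, here is a minimal sketch of the two review points above, assuming Java 16 or later (records and `Stream.toList()` are not available in Java 11). The class `TermStateInfoSketch`, the helper `collect`, and the `Object state` field are hypothetical stand-ins so the example compiles without Lucene; the PR's actual `TermStateInfo` wraps a Lucene `TermState`. The sketch shows the wrapper written as a `record`, and that `Stream.toList()` already returns a fresh list. That list is unmodifiable, so a copy such as `new ArrayList<>(tasks)` only matters if later code mutates it, which the excerpt above does not show.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.stream.IntStream;

public class TermStateInfoSketch {

  /**
   * Hypothetical stand-in for the PR's TermStateInfo wrapper, written as a record.
   * A plain Object replaces Lucene's TermState so the sketch has no dependencies.
   */
  record TermStateInfo(Object state, int ordinal, int docFreq, long totalTermFreq) {}

  /**
   * Mimics the per-leaf collection in the diff: each "leaf" yields a TermStateInfo,
   * or null when the term is absent (simulated here for odd ordinals).
   */
  static List<TermStateInfo> collect(int leafCount) {
    return IntStream.range(0, leafCount)
        .mapToObj(ord -> ord % 2 == 0 ? new TermStateInfo("state-" + ord, ord, 1, 1L) : null)
        .toList(); // already a new list; unmodifiable, but it permits null elements
  }

  public static void main(String[] args) {
    List<TermStateInfo> infos = collect(3);

    // The record generates accessors, equals, hashCode and toString.
    System.out.println(infos.get(0).ordinal());

    // Mutating the result of Stream.toList() throws.
    try {
      infos.add(null);
    } catch (UnsupportedOperationException e) {
      System.out.println("toList() result is unmodifiable");
    }

    // A copy is only needed when the list must be changed afterwards.
    List<TermStateInfo> mutable = new ArrayList<>(infos);
    mutable.removeIf(Objects::isNull);
    System.out.println(mutable.size());
  }
}
```

If the list is only iterated after creation, the result of `toList()` can be used directly and the extra `ArrayList` dropped, which is the reviewer's point.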