Re: [I] Stop duplicating per-segment work across segment partitions [lucene]

via GitHub Tue, 09 Sep 2025 16:49:58 -0700


smuching202 commented on issue #13745:
URL: https://github.com/apache/lucene/issues/13745#issuecomment-3272662016


   Hi @javanna, is there a known list of the queries that have duplicated 
per-segment computation?
   
   I reviewed the benchmarks you ran in PR 
[#13542](https://github.com/apache/lucene/pull/13542#issuecomment-2332114836) 
with intra-segment parallelism enabled. Based on those results, it looks like 
these queries showed regressions:
   - `CommonTermsQuery` (i.e. HighTerm)
   - `TermQuery` (i.e. CountTerm)
   - `BooleanQuery` (i.e. OrHighLow)
   - `PrefixQuery` (i.e. Prefix3)
   - `WildcardQuery` (i.e. Wildcard)
   - `FuzzyQuery` (i.e. Fuzzy1)
   - `MatchAllDocsQuery` (i.e. BrowseDateSSDVFacets)
   
   Granted, the regressions may stem from shared underlying components 
rather than from the top-level query itself. For example:
   - The regression in `OrHighLow` might be due to problems in 
`CommonTermsQuery`
   - `PrefixQuery`, `WildcardQuery`, and `FuzzyQuery` all rely on 
`MultiTermQuery`, so the root cause could lie in 
`MultiTermQuery#CONSTANT_SCORE_BLENDED_REWRITE` (see 
https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/MultiTermQueryConstantScoreBlendedWrapper.java#L60)
   
   Does this generally align with your findings so far?
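
   To make the question concrete, here is a minimal sketch of the kind of duplication I have in mind. It is not Lucene code: `PartitionWorkSketch`, `expensivePerSegmentSetup`, and the cache are hypothetical names, and the "setup" is only a stand-in for per-segment work such as enumerating the terms of a multi-term query. It assumes a segment split into several partitions that are each searched independently.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model, not Lucene code: a "partition" is one slice of a segment.
final class PartitionWorkSketch {

  // Counts how many times the expensive per-segment setup runs.
  static final AtomicInteger setupCalls = new AtomicInteger();

  // Stand-in for per-segment work (assumption: it depends only on the segment).
  static int[] expensivePerSegmentSetup(String segmentName) {
    setupCalls.incrementAndGet();
    return new int[] {segmentName.length()};
  }

  // Naive: every partition repeats the setup for its segment.
  static int naiveSetupCount(String segment, int partitions) {
    setupCalls.set(0);
    for (int p = 0; p < partitions; p++) {
      expensivePerSegmentSetup(segment);
    }
    return setupCalls.get();
  }

  // Shared: the first partition computes the result, later ones reuse it.
  static int sharedSetupCount(String segment, int partitions) {
    setupCalls.set(0);
    Map<String, int[]> cache = new ConcurrentHashMap<>();
    for (int p = 0; p < partitions; p++) {
      cache.computeIfAbsent(segment, PartitionWorkSketch::expensivePerSegmentSetup);
    }
    return setupCalls.get();
  }

  public static void main(String[] args) {
    System.out.println("naive:  " + naiveSetupCount("_0", 4));
    System.out.println("shared: " + sharedSetupCount("_0", 4));
  }
}
```

   With four partitions of one segment, the naive path runs the setup four times and the shared path runs it once. Whether a given query falls into the first category is exactly what I am trying to confirm.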


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

