smuching202 commented on issue #13745: URL: https://github.com/apache/lucene/issues/13745#issuecomment-3272662016
Hi @javanna, is there a known list of the queries that have duplicated per-segment computation?

I reviewed the benchmarks you ran in PR [#13542](https://github.com/apache/lucene/pull/13542#issuecomment-2332114836) with intra-segment parallelism enabled. Based on those results, it looks like these queries showed regressions:

- `CommonTermsQuery` (i.e. HighTerm)
- `TermQuery` (i.e. CountTerm)
- `BooleanQuery` (i.e. OrHighLow)
- `PrefixQuery` (i.e. Prefix3)
- `WildcardQuery` (i.e. Wildcard)
- `FuzzyQuery` (i.e. Fuzzy1)
- `MatchAllDocsQuery` (i.e. BrowseDateSSDVFacets)

Granted, it's possible that the regressions stem from underlying issues rather than the top-level query itself. For example:

- The regression in `OrHighLow` might be due to problems in `CommonTermsQuery`
- `PrefixQuery`, `WildcardQuery`, and `FuzzyQuery` all rely on `MultiTermQuery`, so the root cause could be in `MultiTermQuery#CONSTANT_SCORE_BLENDED_REWRITE` (see https://github.com/apache/lucene/blob/main/lucene/core/src/java/org/apache/lucene/search/MultiTermQueryConstantScoreBlendedWrapper.java#L60)

Does this generally align with your findings so far?
