[ https://issues.apache.org/jira/browse/LUCENE-9207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17030680#comment-17030680 ]
Michael Gibney commented on LUCENE-9207:
----------------------------------------

I think the special logic building SpanQueries for the slop=0 case was left in place by LUCENE-8531 because the resulting behavior is functionally identical to the MultiPhraseQuery approach, and SpanQueries for slop=0 are more efficient (potentially _vastly_ more efficient) than the exponential expansion that can result from MultiPhraseQuery over graph TokenStreams (e.g., for bigrams, synonyms, WDGF, etc.).

[~romseygeek], do you think the code simplification is worth the potential performance hit for the {{slop=0}} case? [~jim.ferenczi], [~sarowe], [~uschindler], I'm curious for your perspectives (having been involved in the discussion around LUCENE-8531).

For heavily branching token streams (e.g., bigrams, certain tYpEs 0f 1nPuT to common WDGF configurations), the performance impact is substantial. I know of (and in fact personally know) many people who have been bitten by this in the form of SOLR-13336. The underlying performance issue is not Solr-specific, though, and is not directly addressed by the fix for SOLR-13336, which simply restores Lucene's maxBooleanClauses threshold for short-circuiting individual queries.

FWIW, I think LUCENE-7398 is a bit of a red herring here. I'm shooting from the hip a bit, but I'm 90% confident that the LUCENE-7398 issues don't affect the slop=0 case for _query_-time graph TokenStreams; and to the extent that they affect _index_-time graph TokenStreams, they affect SpanQueries and MultiPhraseQuery equally (that's a whole separate question!).

> Don't build SpanQuery in QueryBuilder
> -------------------------------------
>
> Key: LUCENE-9207
> URL: https://issues.apache.org/jira/browse/LUCENE-9207
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Alan Woodward
> Assignee: Alan Woodward
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Subtask of LUCENE-9204.
> QueryBuilder currently has special logic for graph phrase queries with no slop, constructing a SpanQuery that attempts to follow all paths using a combination of OR and NEAR queries. Given the known bugs in this type of query (LUCENE-7398) and that we would like to move span queries out of core in any case, we should remove this logic and just build a disjunction of phrase queries, one phrase per path.
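
To make the "exponential expansion" concern from the comment concrete, here is a rough sketch in plain Java. It does not use Lucene at all. It assumes a deliberately simplified token graph in which every position has a fixed list of alternative terms and no token spans more than one position. The class and method names (GraphPhraseExpansion, enumeratePhrases, nestedClauseCount) are made up for illustration. It compares the number of phrases produced by "one phrase per path" with the number of term clauses needed when each position is a single OR nested inside one ordered NEAR.

```java
import java.util.ArrayList;
import java.util.List;

public class GraphPhraseExpansion {

    /**
     * Enumerates every path through the simplified graph, one phrase per path.
     * The result size is the product of the alternative counts per position.
     */
    public static List<List<String>> enumeratePhrases(List<List<String>> positions) {
        List<List<String>> paths = new ArrayList<>();
        paths.add(new ArrayList<>()); // start with a single empty prefix
        for (List<String> alternatives : positions) {
            List<List<String>> extended = new ArrayList<>();
            for (List<String> prefix : paths) {
                for (String term : alternatives) {
                    List<String> next = new ArrayList<>(prefix);
                    next.add(term);
                    extended.add(next);
                }
            }
            paths = extended;
        }
        return paths;
    }

    /**
     * Counts term clauses when each position is one OR over its alternatives,
     * and the ORs are combined in a single ordered NEAR with no slop.
     * The result is the sum of the alternative counts per position.
     */
    public static int nestedClauseCount(List<List<String>> positions) {
        int count = 0;
        for (List<String> alternatives : positions) {
            count += alternatives.size();
        }
        return count;
    }

    public static void main(String[] args) {
        // Hypothetical input: 10 positions, each with a term and one synonym.
        List<List<String>> positions = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            positions.add(List.of("t" + i, "syn" + i));
        }
        System.out.println("phrases (one per path): " + enumeratePhrases(positions).size());
        System.out.println("term clauses (OR inside NEAR): " + nestedClauseCount(positions));
    }
}
```

With two alternatives at each of ten positions, the per-path approach yields 2^10 = 1024 phrases, while the nested form needs only 20 term clauses. Real graph TokenStreams are messier (tokens can span several positions), so treat this as an illustration of the growth rate rather than a model of what QueryBuilder actually builds.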