[ https://issues.apache.org/jira/browse/LUCENE-9207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17030773#comment-17030773 ]
Alan Woodward commented on LUCENE-9207:
---------------------------------------

LUCENE-7398 is triggered by precisely this situation, no? You have a synonym mapping of 'gene -> genome sequence', and a search for `"human genome sequence reader"` won't find documents containing that exact phrase because of span minimization.

In terms of exponential expansion, we are at least guarded here by the fact that we build a boolean query to hold all the possible paths, and so there is the usual maxBooleanClauses protection. If you have a heavily branching token stream then it's going to produce an unwieldy query whatever we do, really...

> Don't build SpanQuery in QueryBuilder
> -------------------------------------
>
>                 Key: LUCENE-9207
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9207
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Alan Woodward
>            Assignee: Alan Woodward
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Subtask of LUCENE-9204. QueryBuilder currently has special logic for graph
> phrase queries with no slop, constructing a spanquery that attempts to follow
> all paths using a combination of OR and NEAR queries. Given the known bugs
> in this type of query (LUCENE-7398) and that we would like to move span
> queries out of core in any case, we should remove this logic and just build a
> disjunction of phrase queries, one phrase per path.
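
To make the "one phrase per path" idea concrete, here is a rough, library-free sketch in Python. It is not Lucene code and does not use the QueryBuilder API: the token tuples, the function name `phrases_for_graph`, the `TooManyClauses` exception and the `max_clauses` parameter are all hypothetical names chosen for illustration. It only shows how a token graph with a multi-position synonym expands into a disjunction of phrases, and how a clause limit bounds a heavily branching stream.

```python
from collections import defaultdict


class TooManyClauses(Exception):
    """Hypothetical stand-in for a clause-limit failure."""


def phrases_for_graph(tokens, max_clauses=1024):
    """Enumerate every path through a token graph as a phrase string.

    tokens: iterable of (term, start_position, position_length) tuples,
            an assumed simplification of an analyzed token stream.
    max_clauses: illustrative cap on the number of phrases (paths).
    """
    # Build adjacency: start position -> list of (term, end position).
    edges = defaultdict(list)
    end = 0
    for term, start, length in tokens:
        edges[start].append((term, start + length))
        end = max(end, start + length)

    phrases = []
    stack = [(0, [])]  # (current position, terms collected so far)
    while stack:
        node, terms = stack.pop()
        if node == end:
            phrases.append(" ".join(terms))
            if len(phrases) > max_clauses:
                raise TooManyClauses(
                    "more than %d phrase clauses" % max_clauses
                )
            continue
        for term, nxt in edges[node]:
            stack.append((nxt, terms + [term]))
    # Sorted only so the result is deterministic for the reader.
    return sorted(phrases)


# 'gene' spans two positions, alongside 'genome sequence'.
tokens = [
    ("human", 0, 1),
    ("gene", 1, 2),
    ("genome", 1, 1),
    ("sequence", 2, 1),
    ("reader", 3, 1),
]
for phrase in phrases_for_graph(tokens):
    print(phrase)
```

Each returned string would correspond to one exact phrase clause in the disjunction, so a document containing either full phrase matches without any span minimization step. The number of paths still grows multiplicatively with the number of branching positions, which is why the clause cap matters.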