[ 
https://issues.apache.org/jira/browse/LUCENE-9207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17030773#comment-17030773
 ] 

Alan Woodward commented on LUCENE-9207:
---------------------------------------

LUCENE-7398 is triggered by precisely this situation, no?  You have a synonym 
mapping of 'gene -> genome sequence', and a search for `"human genome sequence 
reader"` won't find documents containing that exact phrase because of span 
minimization.

In terms of exponential expansion, we are at least guarded here by the fact 
that we build a boolean query to hold all the possible paths, and so there is 
the usual maxBooleanClauses protection.  If you have a heavily branching token 
stream then it's going to produce an unwieldy query whatever we do, really...
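
To make the "one phrase per path" expansion and the clause limit concrete, 
here is a minimal sketch in plain Java with no Lucene dependency.  It is not 
the QueryBuilder code: the class `PhrasePaths`, the method `expand`, and the 
list-of-alternatives graph model are hypothetical names and simplifications 
chosen for illustration, and `IllegalStateException` stands in for Lucene's 
too-many-clauses error.  The limit of 1024 mirrors the default 
maxBooleanClauses value.

```java
import java.util.ArrayList;
import java.util.List;

public class PhrasePaths {

    /** Stand-in for the maxBooleanClauses limit (1024 is Lucene's default). */
    static final int MAX_CLAUSES = 1024;

    /**
     * Expands a simplified token graph into one phrase per path.
     * Each entry of {@code slots} lists the alternatives at that point in the
     * query; an alternative may span several terms (a multi-word synonym).
     * Fails as soon as the number of paths would exceed {@code maxClauses}.
     */
    static List<String> expand(List<List<String>> slots, int maxClauses) {
        List<String> phrases = new ArrayList<>();
        phrases.add("");
        for (List<String> alternatives : slots) {
            List<String> next = new ArrayList<>();
            for (String prefix : phrases) {
                for (String alt : alternatives) {
                    if (next.size() >= maxClauses) {
                        // Hypothetical stand-in for the too-many-clauses failure.
                        throw new IllegalStateException(
                            "more than " + maxClauses + " paths");
                    }
                    next.add(prefix.isEmpty() ? alt : prefix + " " + alt);
                }
            }
            phrases = next;
        }
        return phrases;
    }

    public static void main(String[] args) {
        // "gene" branches into the multi-term synonym "genome sequence".
        List<List<String>> graph = List.of(
            List.of("human"),
            List.of("gene", "genome sequence"),
            List.of("reader"));
        List<String> quoted = new ArrayList<>();
        for (String phrase : expand(graph, MAX_CLAUSES)) {
            quoted.add("\"" + phrase + "\"");
        }
        // Each path becomes one exact phrase clause of the disjunction.
        System.out.println(String.join(" OR ", quoted));
    }
}
```

Every branch point multiplies the number of phrases, which is why a heavily 
branching token stream hits the clause limit quickly; the sketch fails fast 
rather than building the whole expansion first.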

> Don't build SpanQuery in QueryBuilder
> -------------------------------------
>
>                 Key: LUCENE-9207
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9207
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Alan Woodward
>            Assignee: Alan Woodward
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Subtask of LUCENE-9204.  QueryBuilder currently has special logic for graph 
> phrase queries with no slop, constructing a SpanQuery that attempts to follow 
> all paths using a combination of OR and NEAR queries.  Given the known bugs 
> in this type of query (LUCENE-7398) and that we would like to move span 
> queries out of core in any case, we should remove this logic and just build a 
> disjunction of phrase queries, one phrase per path.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
