[ 
https://issues.apache.org/jira/browse/LUCENE-9207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17030899#comment-17030899
 ] 

Michael Gibney commented on LUCENE-9207:
----------------------------------------

Thanks, [~romseygeek]. You're right, I was wrong about LUCENE-7398: in its 
current state (and in any state that doesn't implement backtracking), 
SpanNearQuery/SpanOrQuery can miss matches in cases like the one you 
describe, even when slop=0. That is still a bug, and this change would 
address it, so to be explicit: +1 to this change.

Yes, maxBooleanClauses is a good failsafe, but I think it's worth calling 
out explicitly that for some analyzer configurations and inputs, queries 
will now fail in a different way than before: by hitting the 
maxBooleanClauses limit (consistently and transparently), rather than by 
silently missing matches under certain conditions.

bq. If you have a heavily branching token stream then it's going to produce an 
unwieldy query whatever we do, really...

True in some ways, but the implementations differ fundamentally in their 
characteristics, so it's not really six of one, half a dozen of the other. 
A complete nested SpanQuery implementation (as proposed for LUCENE-7398), or 
an analogous nested Intervals implementation, has the potential to be 
significantly more efficient than MultiPhraseQuery or its analogous 
Intervals impl, both of which expand all possible variants up front.
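
To make the difference in scale concrete, here is a rough sketch in plain 
Python (not Lucene code). It models a token graph as a list of 
(from_position, to_position, term) edges and compares the number of edges, 
which is roughly what a nested representation has to hold, with the number of 
full paths, which is what up-front expansion into one phrase per path 
produces. The helper names enumerate_paths and branching_graph, and the 
two-alternatives-per-position input, are made up for illustration.

```python
def enumerate_paths(edges, start, end):
    """Return every term sequence leading from `start` to `end`.

    `edges` is a list of (src, dst, term) tuples describing an acyclic
    token graph. Each returned path corresponds to one phrase in an
    up-front "one phrase per path" expansion.
    """
    out_edges = {}
    for src, dst, term in edges:
        out_edges.setdefault(src, []).append((dst, term))

    def walk(node):
        if node == end:
            return [[]]  # exactly one (empty) way to finish at the end node
        paths = []
        for dst, term in out_edges.get(node, []):
            for tail in walk(dst):
                paths.append([term] + tail)
        return paths

    return walk(start)


def branching_graph(k):
    """Hypothetical worst-ish case: k positions, two alternatives at each."""
    edges = []
    for i in range(k):
        edges.append((i, i + 1, f"t{i}"))  # original token
        edges.append((i, i + 1, f"s{i}"))  # synonym at the same position
    return edges


for k in (2, 4, 8):
    edges = branching_graph(k)
    paths = enumerate_paths(edges, 0, k)
    # Edges grow linearly with k; expanded paths grow as 2**k.
    print(f"k={k}: {len(edges)} edges, {len(paths)} expanded phrases")
```

With this input, the edge count grows linearly in k while the number of 
expanded phrases doubles with every added position, which is the case where 
maxBooleanClauses would kick in for the expanded form.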

> Don't build SpanQuery in QueryBuilder
> -------------------------------------
>
>                 Key: LUCENE-9207
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9207
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Alan Woodward
>            Assignee: Alan Woodward
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Subtask of LUCENE-9204.  QueryBuilder currently has special logic for graph 
> phrase queries with no slop, constructing a spanquery that attempts to follow 
> all paths using a combination of OR and NEAR queries.  Given the known bugs 
> in this type of query (LUCENE-7398) and that we would like to move span 
> queries out of core in any case, we should remove this logic and just build a 
> disjunction of phrase queries, one phrase per path.


