[ 
https://issues.apache.org/jira/browse/LUCENE-9365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17106265#comment-17106265
 ] 

Michael McCandless commented on LUCENE-9365:
--------------------------------------------

{quote}
bq. so +1 to make FuzzyQuery lenient to these cases and rewrite itself to 
PrefixQuery or RegexpQuery instead.

Would this mean we need to add a max length option to PrefixQuery?
{quote}

OK, let me narrow my +1 a bit ;)

I'm +1 to having {{FuzzyQuery}} be lenient by allowing this strange case where 
{{prefix == term.text().length()}} and implementing it "correctly", to make it 
less trappy for users.

But I'm less clear on how exactly we should implement that.  You're right, if 
we rewrite to {{PrefixQuery}} then we must also add a max length option to it.  
Maybe that is indeed a useful option to expose publicly to {{PrefixQuery}} 
users?  That would let users cap how many characters are allowed after the 
prefix.

Alternatively, we could just rewrite to an anonymous {{AutomatonQuery}} that 
accepts precisely the term as prefix, and then at most {{edit-distance}} 
additional arbitrary characters?
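
As a rough illustration of the set of terms either approach would need to accept, here is a minimal sketch in plain Java. It uses no Lucene classes; the class name {{BoundedPrefixMatch}} and the method {{matches}} are hypothetical and exist only for this example. It assumes "additional characters" are counted in Unicode code points.

```java
/**
 * Hypothetical helper, not part of Lucene: models the terms that should match
 * when the FuzzyQuery prefix length equals the search term length.
 */
final class BoundedPrefixMatch {
    private BoundedPrefixMatch() {}

    /**
     * Returns true if candidate starts with term and is followed by at most
     * maxExtra additional Unicode code points (maxExtra plays the role of
     * the edit distance here).
     */
    static boolean matches(String term, String candidate, int maxExtra) {
        if (maxExtra < 0) {
            throw new IllegalArgumentException("maxExtra must be >= 0");
        }
        // The whole term is the prefix, so it must match exactly.
        if (!candidate.startsWith(term)) {
            return false;
        }
        // Count what follows the prefix in code points, not UTF-16 units.
        int extra = candidate.codePointCount(term.length(), candidate.length());
        return extra <= maxExtra;
    }
}
```

With the values from the issue, {{BoundedPrefixMatch.matches("bba", "bbab", 1)}} is true, while a candidate such as {{bbabc}} is rejected at the same edit distance because it has two characters after the prefix.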

I'm not sure which approach is better ... I think I would favor the first 
option.

> Fuzzy query has a false negative when prefix length == search term length 
> --------------------------------------------------------------------------
>
>                 Key: LUCENE-9365
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9365
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/query/scoring
>            Reporter: Mark Harwood
>            Priority: Major
>
> When using FuzzyQuery the search string `bba` does not match doc value `bbab` 
> with an edit distance of 1 and prefix length of 3.
> In FuzzyQuery an automaton is created for the "suffix" part of the search 
> string which in this case is an empty string.
> In this scenario maybe the FuzzyQuery should rewrite to a WildcardQuery of 
> the following form :
> {code:java}
>     searchString + "?" 
> {code}
> .. where there's an appropriate number of ? characters according to the edit 
> distance.


