[ https://issues.apache.org/jira/browse/LUCENE-9365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17106265#comment-17106265 ]
Michael McCandless commented on LUCENE-9365:
--------------------------------------------

{quote}
bq. so +1 to make FuzzyQuery lenient to these cases and rewrite itself to PrefixQuery or RegexpQuery instead.

Would this mean we need to add a max length option to PrefixQuery?
{quote}

OK, let me narrow my +1 a bit ;)

I'm +1 to having {{FuzzyQuery}} be lenient by allowing this strange case where {{prefix == term.text().length()}} and implementing it "correctly", to make it less trappy for users.

But I'm less clear on how exactly we should implement that. You're right, if we rewrite to {{PrefixQuery}} then we must then add a max length option to it. Maybe that is indeed a useful option to expose publicly to {{PrefixQuery}} users? That would let users cap how many characters are allowed after the prefix.

Alternatively, we could just rewrite to an anonymous {{AutomatonQuery}} that accepts precisely the term as prefix, and then at most {{edit-distance}} additional arbitrary characters?

I'm not sure which approach is better ... I think I would favor the first option.

> Fuzzy query has a false negative when prefix length == search term length
> --------------------------------------------------------------------------
>
>                 Key: LUCENE-9365
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9365
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/query/scoring
>            Reporter: Mark Harwood
>            Priority: Major
>
> When using FuzzyQuery the search string `bba` does not match doc value `bbab`
> with an edit distance of 1 and prefix length of 3.
> In FuzzyQuery an automaton is created for the "suffix" part of the search
> string which in this case is an empty string.
> In this scenario maybe the FuzzyQuery should rewrite to a WildcardQuery of
> the following form :
> {code:java}
> searchString + "?"
> {code}
> .. where there's an appropriate number of ? characters according to the edit
> distance.
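
To make the second option in the comment concrete, here is a minimal plain-Java sketch of the language such an automaton would accept: the search term as a literal prefix, followed by at most maxEdits arbitrary characters. It uses java.util.regex rather than Lucene's automaton classes, and the names PrefixFuzzySketch, prefixPlusAtMost, and matches are hypothetical, introduced only for illustration. It is not how FuzzyQuery is or will be implemented.

```java
import java.util.regex.Pattern;

// Hypothetical illustration only: plain Java, no Lucene classes involved.
final class PrefixFuzzySketch {

    // Builds a pattern accepting `term` literally, then 0..maxEdits extra characters.
    static Pattern prefixPlusAtMost(String term, int maxEdits) {
        if (maxEdits < 0) {
            throw new IllegalArgumentException("maxEdits must be >= 0");
        }
        // Pattern.quote keeps regex metacharacters in the term literal;
        // DOTALL lets '.' stand for any character, including line terminators.
        String regex = Pattern.quote(term) + ".{0," + maxEdits + "}";
        return Pattern.compile(regex, Pattern.DOTALL);
    }

    // True if the whole indexed term is the search term plus at most maxEdits characters.
    static boolean matches(String term, int maxEdits, String indexedTerm) {
        return prefixPlusAtMost(term, maxEdits).matcher(indexedTerm).matches();
    }

    public static void main(String[] args) {
        // Values from the issue report: search string "bba", edit distance 1.
        for (String candidate : new String[] {"bba", "bbab", "bbabc", "bbb"}) {
            System.out.println(candidate + " -> " + matches("bba", 1, candidate));
        }
    }
}
```

With the values from the issue report (search string bba, edit distance 1), this accepts bba and bbab and rejects bbabc and bbb, which is the behavior the reporter expected from FuzzyQuery.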