[ 
https://issues.apache.org/jira/browse/LUCENE-9365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17104233#comment-17104233
 ] 

Mark Harwood commented on LUCENE-9365:
--------------------------------------

bq. Maybe we should disallow prefix == term.text().length() for FuzzyQuery?  It 
is sort of strange to use FuzzyQuery in this way

Following that logic, the threshold is probably lower than that:
{code:java}
    prefixLength + editDistance == term.text().length()
{code}
...would be "mis-use" too. For example, given the term `abcd` with prefix=2 and 
edit=2, the characters `cd` are wholly redundant: they are no more informative 
to the search than providing the search term `ab` with the same parameters.
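
To make that condition concrete, here is a minimal sketch in plain Java. The 
class and method names are hypothetical and not part of Lucene, and it 
generalises the `==` above to "the suffix is no longer than the edit distance", 
which is my assumption rather than something agreed on this issue.

```java
// Hypothetical helper for illustration only; nothing here is Lucene API.
final class FuzzyParamCheck {

    /**
     * Returns true when every character after the fixed prefix could be
     * consumed by the allowed edits, i.e. prefixLength + maxEdits >= term length.
     * Length is counted in code points here (an assumption); the condition
     * quoted above uses term.text().length().
     */
    static boolean suffixIsFullyEditable(String termText, int prefixLength, int maxEdits) {
        int termLength = termText.codePointCount(0, termText.length());
        int suffixLength = termLength - prefixLength;
        return suffixLength <= maxEdits;
    }

    public static void main(String[] args) {
        // The example from this comment: term "abcd", prefix=2, edit=2.
        System.out.println(suffixIsFullyEditable("abcd", 2, 2));
        // A shorter prefix leaves a suffix longer than the edit distance.
        System.out.println(suffixIsFullyEditable("abcd", 1, 2));
    }
}
```

Whether such a check should reject the query or merely inform the rewrite is 
the open question discussed below.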

This probably comes down to how user-friendly we want to be. I would lean 
towards a more lenient approach: the Query objects the user creates capture 
the high-level intent, and the rewrite phase is the opportunity for that Query 
class to figure out whether a term, prefix-with-fixed-length, or FuzzyQuery is 
the preferred mode of execution.
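
As a rough illustration of that lenient approach, the sketch below picks an 
execution mode from the query parameters. It is plain Java with hypothetical 
names (`FuzzyRewriteSketch`, `chooseMode`, `wildcardPatterns`), not the actual 
FuzzyQuery rewrite code. The wildcard patterns follow the `searchString + "?"` 
idea from the issue description; also emitting the exact term with zero `?` 
characters is my assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a rewrite-time decision; not Lucene's implementation.
final class FuzzyRewriteSketch {

    enum Mode { TERM, PREFIX_WITH_FIXED_LENGTH, FUZZY }

    /** Chooses how the query could be executed for the given parameters. */
    static Mode chooseMode(String termText, int prefixLength, int maxEdits) {
        if (maxEdits == 0) {
            return Mode.TERM; // no edits allowed: only the exact term can match
        }
        if (prefixLength >= termText.length()) {
            // The whole term is a fixed prefix, so the "suffix" is empty and
            // the only possible edits are insertions after the term.
            return Mode.PREFIX_WITH_FIXED_LENGTH;
        }
        return Mode.FUZZY;
    }

    /**
     * Builds the patterns term, term + "?", ..., term + maxEdits "?" characters.
     * A real implementation would also need to escape any wildcard
     * metacharacters in termText; that is omitted here.
     */
    static List<String> wildcardPatterns(String termText, int maxEdits) {
        List<String> patterns = new ArrayList<>();
        StringBuilder pattern = new StringBuilder(termText);
        patterns.add(pattern.toString());
        for (int i = 0; i < maxEdits; i++) {
            pattern.append('?');
            patterns.add(pattern.toString());
        }
        return patterns;
    }

    public static void main(String[] args) {
        // The case from the issue: search string "bba", prefix=3, edit=1.
        System.out.println(chooseMode("bba", 3, 1));
        System.out.println(wildcardPatterns("bba", 1));
    }
}
```

For the case in the issue (`bba`, prefix=3, edit=1) this selects the 
fixed-length mode, and the pattern `bba?` would match the doc value `bbab`.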

> Fuzzy query has a false negative when prefix length == search term length 
> --------------------------------------------------------------------------
>
>                 Key: LUCENE-9365
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9365
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/query/scoring
>            Reporter: Mark Harwood
>            Priority: Major
>
> When using FuzzyQuery the search string `bba` does not match doc value `bbab` 
> with an edit distance of 1 and prefix length of 3.
> In FuzzyQuery an automaton is created for the "suffix" part of the search 
> string which in this case is an empty string.
> In this scenario maybe the FuzzyQuery should rewrite to a WildcardQuery of 
> the following form :
> {code:java}
>     searchString + "?" 
> {code}
> ...where there is an appropriate number of ? characters according to the 
> edit distance.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

