I've been away from parsers for a bit, but you should be able to subclass the 
query parser and override its getFuzzyQuery() (or similarly named) method 
fairly easily.

Again, last time I looked, SlowFuzzyQuery used the automaton (fast) for edit 
distances <= 2 and fell back to the truly slow comparison for distances > 2.  
Note that transpositions are only supported by the automaton, not yet by the 
slow path of SlowFuzzyQuery.
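
To make the transposition point concrete, here is a minimal, self-contained 
sketch in plain Java. It is not Lucene's code; the class and method names 
(EditDistanceSketch, distance) are made up for illustration. It computes a 
textbook edit distance and, when the flag is set, also counts a swap of two 
adjacent characters as a single edit (the "optimal string alignment" variant).

```java
// Hypothetical illustration only; not taken from Lucene or Solr.
final class EditDistanceSketch {

    /**
     * Dynamic-programming edit distance between a and b.
     * With allowTranspositions == false this is plain Levenshtein distance.
     * With allowTranspositions == true, swapping two adjacent characters
     * costs one edit instead of two.
     */
    static int distance(String a, String b, boolean allowTranspositions) {
        int[][] d = new int[a.length() + 1][b.length() + 1];

        // Distance from a prefix to the empty string is the prefix length.
        for (int i = 0; i <= a.length(); i++) {
            d[i][0] = i;
        }
        for (int j = 0; j <= b.length(); j++) {
            d[0][j] = j;
        }

        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int best = Math.min(
                        Math.min(d[i - 1][j] + 1,      // deletion
                                 d[i][j - 1] + 1),     // insertion
                        d[i - 1][j - 1] + cost);       // substitution or match

                // Adjacent swap, e.g. "en" <-> "ne", counted as one edit.
                if (allowTranspositions && i > 1 && j > 1
                        && a.charAt(i - 1) == b.charAt(j - 2)
                        && a.charAt(i - 2) == b.charAt(j - 1)) {
                    best = Math.min(best, d[i - 2][j - 2] + 1);
                }
                d[i][j] = best;
            }
        }
        return d[a.length()][b.length()];
    }
}
```

For "lucene" vs. "lucnee" this gives 2 without transpositions and 1 with 
them, so the same misspelling can land on either side of a distance cutoff 
depending on which code path handles it.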

<self-promotion>Might want to take a look at LUCENE-5205 and SOLR-5410.  Those 
offer a parser that uses SlowFuzzyQuery for exactly your use 
case.</self-promotion>

The recommended solution for handling fuzziness > 2 (I think), though, is to 
use character ngrams as in the SpellChecker.
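
As a rough idea of what the ngram approach looks like, here is a small sketch 
in plain Java. It is a simplification and not the SpellChecker's actual 
implementation; NgramSketch, ngrams, sharedGrams and candidates are 
hypothetical names. It splits words into character ngrams and keeps the 
dictionary words that share enough ngrams with the query term.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not taken from Lucene or Solr.
final class NgramSketch {

    /** Returns the distinct character ngrams of length n, in order of first appearance. */
    static Set<String> ngrams(String word, int n) {
        Set<String> grams = new LinkedHashSet<>();
        for (int i = 0; i + n <= word.length(); i++) {
            grams.add(word.substring(i, i + n));
        }
        return grams;
    }

    /** Counts the distinct ngrams of length n that a and b have in common. */
    static int sharedGrams(String a, String b, int n) {
        Set<String> shared = ngrams(a, n);
        shared.retainAll(ngrams(b, n));
        return shared.size();
    }

    /** Keeps the dictionary words sharing at least minShared ngrams with the query. */
    static List<String> candidates(String query, List<String> dictionary,
                                   int n, int minShared) {
        List<String> result = new ArrayList<>();
        for (String word : dictionary) {
            if (sharedGrams(query, word, n) >= minShared) {
                result.add(word);
            }
        }
        return result;
    }
}
```

With bigrams, "levenstien" and "levenshtein" share five ngrams (le, ev, ve, 
en, ns) even though they are more than two edits apart. In a real index the 
ngrams would be indexed terms, so the lookup would not need to scan every 
word, and the surviving candidates could then be re-ranked by edit distance.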

Best,

       Tim

-----Original Message-----
From: Michael Tobias [mailto:mich...@tobias.org.uk] 
Sent: Sunday, June 29, 2014 8:17 PM
To: solr-user@lucene.apache.org
Subject: SlowFuzzySearch

Hi guys

I know that Solr now has a fast fuzzy search capability for Levenshtein 
distances of up to 2, but I would like to use distances of 3 or 4 (up to half 
the word length if possible).

I have been told it is possible to use an older fuzzy search version called 
SlowFuzzyQuery but I am not sure how to use it.  I realise it will be slow(er) 
but my database will be reasonably small and I would like to test out the 
performance to see if it is a feasible option.  Is it still part of the Solr 
code or must I install it separately?

Are there any examples of its usage? And for distances of 2 or less, does it 
actually perform a fast fuzzy search, or must I revert to using the ~ syntax 
for those faster fuzzy searches?

All help appreciated.

Michael
