Hi Zheng,

actually that version of the fuzzy search is deprecated! The current fuzzy search syntax is <query>~1 or <query>~2. The ~ (tilde) parameter is the number of edits allowed when expanding the query into the terms to match. Can I ask which version of Solr you are using?
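Since you want the number of edits to depend on the word length, here is a minimal Python sketch of how the client could pick the tilde value per term. It assumes the 20% error rate from your example, whitespace-separated terms, and no query-parser special characters in the input (nothing is escaped). The function names are only illustrative and are not part of Solr or any client library; the cap of 2 reflects the maximum edit distance that Solr allows.

```python
MAX_EDITS = 2  # Solr caps the fuzzy edit distance at 2


def edits_for(term: str, error_percent: int = 20) -> int:
    """Edits allowed for a term: a percentage of its length, rounded down, capped at 2."""
    # Integer arithmetic avoids floating-point surprises such as 15 * 0.2.
    return min(MAX_EDITS, len(term) * error_percent // 100)


def fuzzy_term(term: str, error_percent: int = 20) -> str:
    """Append ~N to the term, or leave it exact when no edit is allowed."""
    edits = edits_for(term, error_percent)
    return f"{term}~{edits}" if edits > 0 else term


def fuzzy_query(query: str, error_percent: int = 20) -> str:
    """Apply the length-based fuzzy parameter to every whitespace-separated term."""
    return " ".join(fuzzy_term(t, error_percent) for t in query.split())


if __name__ == "__main__":
    print(fuzzy_query("solr fuzzy searching"))
    print(fuzzy_term("word", error_percent=25))
```

With these assumptions a 5-character word gets ~1 and a 10-character word gets ~2, while a 4-character word stays exact at 20% and gets ~1 only at 25%, which matches the behaviour you described.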
This article from 2011 shows the biggest change in fuzzy query, and I guess it's still the current approach!

Regarding performance, what do you mean? Are you worried that the length check will affect the query time? The answer is yes, but the delay will be unnoticeable, as you simply check the length and apply the corresponding fuzzy parameter.

As for fuzzy queries being slower than normal queries, that is true, but the FST approach guarantees really fast fuzzy queries. So if you do need the fuzziness, it's something you can cope with.

Cheers

2015-05-08 3:12 GMT+01:00 Zheng Lin Edwin Yeo <edwinye...@gmail.com>:

> Thank you for the information.
>
> I'm currently using the fuzzy search and have set the edit distance value to
> ~0.79, and this has allowed a 20% error rate (i.e. for words with 5
> characters, it allows 1 mis-spelled character, and for words with 10
> characters, it allows 2 mis-spelled characters).
>
> However, for words with 4 characters, I'll need to set the value to ~0.75
> to allow 1 mis-spelled character, because a 4-character word requires a
> 25% error rate for 1 mis-spelled character. We probably
> will not accommodate 3-character words.
>
> I got the information from here:
> http://lucene.apache.org/core/3_6_0/queryparsersyntax.html#Fuzzy%20Searches
>
> Just to check, will this affect the performance of the system?
>
> Regards,
> Edwin
>
> On 7 May 2015 at 20:00, Alessandro Benedetti <benedetti.ale...@gmail.com>
> wrote:
>
> > Hi!
> > Currently Solr builds an FST to provide proper fuzzy search or spellcheck
> > suggestions based on the string distance.
> > The current default algorithm is the Levenshtein distance (which returns the
> > number of edits as the distance metric).
> > In your case you should calculate, client side, the number of edits you want
> > to apply to your search.
> > In your client code, it should not be difficult to process the query and
> > apply the proper number of edits depending on the length.
> >
> > Anyway, the max number of edits for the default Levenshtein distance is fixed at 2.
> >
> > Cheers
> >
> > 2015-05-05 10:24 GMT+01:00 Zheng Lin Edwin Yeo <edwinye...@gmail.com>:
> >
> > > Hi,
> > >
> > > Would like to check, how do we implement character proximity searching
> > > that's in terms of percentage with regards to the length of the word,
> > > instead of a fixed number of edit distance (characters)?
> > >
> > > For example, if we have a proximity of 20%, a word with 5 characters will
> > > have an edit distance of 1, and a word with 10 characters will
> > > automatically have an edit distance of 2.
> > >
> > > Will Solr be able to do that for us?
> > >
> > > Regards,
> > > Edwin
> >
> > --
> > --------------------------
> >
> > Benedetti Alessandro
> > Visiting card : http://about.me/alessandro_benedetti
> >
> > "Tyger, tyger burning bright
> > In the forests of the night,
> > What immortal hand or eye
> > Could frame thy fearful symmetry?"
> >
> > William Blake - Songs of Experience -1794 England

--
--------------------------

Benedetti Alessandro
Visiting card : http://about.me/alessandro_benedetti

"Tyger, tyger burning bright
In the forests of the night,
What immortal hand or eye
Could frame thy fearful symmetry?"

William Blake - Songs of Experience -1794 England