Mixing fuzzy with phonetic can give bizarre matches. I worked on a search engine that did that.
You really don't want to mix stemming, phonetic, and fuzzy. They are distinct transformations of the surface word that do different things.

Stemming: conflate different inflections of the same word, like car and cars.
Phonetic: conflate words that sound similar, like moody and mudie.
Fuzzy: conflate words with different spellings or misspellings, like smith, smyth, and smit.

If you want all of these, make three fields with separate transformations.

wunder

On Aug 28, 2013, at 5:46 AM, Erick Erickson wrote:

> No, ComplexPhraseQuery has been around for quite a while but
> never incorporated into the code base, it's pretty much what you
> need to do both fuzzy and phrase at once.
>
> But, doesn't phonetic really incorporate at least a flavor of fuzzy?
> Is it close enough for your needs to just do phonetic matches?
>
> Best
> Erick
>
>
> On Wed, Aug 28, 2013 at 8:31 AM, Prasi S <prasi1...@gmail.com> wrote:
>
>> Sorry, I copied it wrong. Below is the correct analysis.
>>
>> Index time
>>
>> ST    trinity       services
>> SF    trinity       services
>> LCF   trinity       services
>> SF    trinity       services
>> SF    trinity       services
>> WDF   trinity       services
>> SF    triniti       servic
>> PF    TRNTtriniti   SRFKservic
>> HWF   TRNTtriniti   SRFKservic
>> PSF   TRNTtriniti   SRFKservic
>>
>> *Query time*
>>
>> ST    trinity       services
>> SF    trinity       services
>> LCF   trinity       services
>> WDF   trinity       services
>> SF    triniti       servic
>> PSF   triniti       servic
>> PF    TRNTtriniti   SRFKservic
>>
>> Apart from this, fuzzy would be for individual words and proximity would be
>> for phrases. Is this correct?
>> Also, can we have fuzzy on phrases?
>>
>>
>> On Wed, Aug 28, 2013 at 5:58 PM, Prasi S <prasi1...@gmail.com> wrote:
>>
>>> Hi Erick,
>>> Yes, it is correct. These results are because of stemming + phonetic
>>> matching. Below is the
>>>
>>> Index time
>>>
>>> ST    trinity        services
>>> SF    trinity        services
>>> LCF   trinity        services
>>> SF    trinity        services
>>> SF    trinity        services
>>> WDF   trinity        services
>>>
>>> Query time
>>>
>>> SF    triniti        servic
>>> PF    TRNT triniti   SRFK servic
>>> HWF   TRNT triniti   SRFK servic
>>> PSF   TRNT triniti   SRFK servic
>>>
>>> Apart from this, fuzzy would be for individual words and proximity would be
>>> for phrases. Is this correct?
>>> Also, can we have fuzzy on phrases?
>>>
>>>
>>> On Wed, Aug 28, 2013 at 5:36 PM, Erick Erickson <erickerick...@gmail.com> wrote:
>>>
>>>> The first thing I'd recommend is to look at the admin/analysis
>>>> page. I suspect you aren't seeing fuzzy query results
>>>> at all, what you're seeing is the result of stemming.
>>>>
>>>> Stemming is algorithmic, so sometimes produces very
>>>> surprising results, i.e. Trinidad and Trinigee may stem
>>>> to something like triniti.
>>>>
>>>> But you didn't provide the field definition so it's just a guess.
>>>>
>>>> Best
>>>> Erick
>>>>
>>>>
>>>> On Wed, Aug 28, 2013 at 7:43 AM, Prasi S <prasi1...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>> With Solr 4.0 the fuzzy query syntax is like <keyword>~1 (or 2).
>>>>> Proximity search is like "value"~20.
>>>>>
>>>>> How does this differentiate between the two searches? My thought was
>>>>> proximity would be on phrases and fuzzy on individual words. Is that
>>>>> correct?
>>>>>
>>>>> I wanted to do a proximity search for a text field and gave the below
>>>>> query,
>>>>>
>>>>> <ip>:<port>/collection1/select?q="trinity%20service"~50&debugQuery=yes,
>>>>>
>>>>> it gives me results as
>>>>>
>>>>> <result name="response" numFound="111" start="0" maxScore="4.1237307">
>>>>> <doc>
>>>>> <str name="business_name">*Trinidad *Services</str>
>>>>> </doc>
>>>>> <doc>
>>>>> <str name="business_name">Trinity Services</str>
>>>>> </doc>
>>>>> <doc>
>>>>> <str name="business_name">Trinity Services</str>
>>>>> </doc>
>>>>> <doc>
>>>>> <str name="business_name">*Trinitee *Service</str>
>>>>>
>>>>> How to differentiate between fuzzy and proximity?
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Prasi

--
Walter Underwood
wun...@wunderwood.org
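
The three-field advice at the top of this message could be sketched in schema.xml roughly as below. This is only an illustration, not the poster's actual schema: the field type names and the `_stem` / `_phonetic` field names are made up for the example (only `business_name` appears in the thread), and the choice of Porter stemming and Double Metaphone is an assumption. The point is that each field applies exactly one transformation, so stemmed and phonetic tokens never end up stacked in the same field.

```xml
<!-- Hypothetical type and field names; the tokenizer/filter factories are standard Solr classes. -->

<!-- Plain field: tokenize and lowercase only. Fuzzy queries (term~1) go here. -->
<fieldType name="text_plain" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Stemmed field: conflates inflections such as car / cars. -->
<fieldType name="text_stem" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.PorterStemFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Phonetic field: conflates words that sound alike. -->
<fieldType name="text_phonetic" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- inject="false" keeps only the phonetic code, not the original token -->
    <filter class="solr.DoubleMetaphoneFilterFactory" inject="false"/>
  </analyzer>
</fieldType>

<field name="business_name" type="text_plain" indexed="true" stored="true"/>
<field name="business_name_stem" type="text_stem" indexed="true" stored="false"/>
<field name="business_name_phonetic" type="text_phonetic" indexed="true" stored="false"/>

<!-- Feed the same source text into each single-purpose field. -->
<copyField source="business_name" dest="business_name_stem"/>
<copyField source="business_name" dest="business_name_phonetic"/>
```

With a layout like this, a fuzzy query such as `business_name:trinity~1` and a proximity query such as `business_name:"trinity services"~50` both run against unstemmed tokens, and the stemmed and phonetic fields can be queried separately or given lower weights, for example through the `qf` parameter of the dismax or edismax parser.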