Mixing fuzzy with phonetic can give bizarre matches. I worked on a search 
engine that did that.

You really don't want to mix stemming, phonetic, and fuzzy. They are distinct 
transformations of the surface word that do different things.

Stemming: conflate different inflections of the same word, like car and cars.
Phonetic: conflate words that sound similar, like moody and mudie.
Fuzzy: conflate words with different spellings or misspellings, like smith, 
smyth, and smit.

If you want all of these, make three fields with separate transformations.
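
A rough sketch of what "three fields" means, in Python rather than Solr config. Everything here is an illustrative assumption: the field names (name_stem, name_phonetic, name_exact) are made up, the stemmer is a toy that only strips a plural "s", and Soundex is only a stand-in for whatever phonetic filter you actually use. The point is that each field gets its own transformation, and fuzzy matching is run against the untransformed word, not against a stem or a phonetic code.

```python
# Toy illustration: three separate transformations, one per field.
# All names below are hypothetical and not part of Solr.

def stem(word):
    """Toy stemmer: strip a trailing plural 's' (stand-in for a real stemmer)."""
    w = word.lower()
    if len(w) > 3 and w.endswith("s") and not w.endswith("ss"):
        return w[:-1]
    return w

# Classic Soundex consonant groups.
_SOUNDEX_CODES = {}
for _letters, _digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                         ("l", "4"), ("mn", "5"), ("r", "6")):
    for _ch in _letters:
        _SOUNDEX_CODES[_ch] = _digit

def soundex(word):
    """Classic 4-character Soundex code (stand-in for any phonetic encoder)."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    prev = _SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = _SOUNDEX_CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        if ch not in "hw":  # h and w do not separate repeated codes
            prev = digit
    return (code + "000")[:4]

def edit_distance(a, b):
    """Levenshtein distance: insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete from a
                           cur[j - 1] + 1,             # insert into a
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]

def index_fields(word):
    """Build one value per field, each with its own transformation."""
    return {
        "name_stem": stem(word),         # stemming field
        "name_phonetic": soundex(word),  # phonetic field
        "name_exact": word.lower(),      # fuzzy queries go against this one
    }

def fuzzy_match(query, indexed_exact, max_edits=1):
    """Fuzzy match on the untransformed word only."""
    return edit_distance(query.lower(), indexed_exact) <= max_edits
```

With this, car and cars meet in the stem field, moody and mudie meet in the phonetic field, and smith, smyth, and smit are within one edit of smith in the exact field. None of the three leaks into the others.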

wunder

On Aug 28, 2013, at 5:46 AM, Erick Erickson wrote:

> No, ComplexPhraseQuery has been around for quite a while but was
> never incorporated into the code base. It's pretty much what you
> need to do both fuzzy and phrase at once.
> 
> But, doesn't phonetic really incorporate at least a flavor of fuzzy?
> Is it close enough for your needs to just do phonetic matches?
> 
> Best
> Erick
> 
> 
> On Wed, Aug 28, 2013 at 8:31 AM, Prasi S <prasi1...@gmail.com> wrote:
> 
>> Sorry, I copied it wrong. Below is the correct analysis.
>> 
>> Index time
>> 
>> ST
>> trinity
>> services
>> SF
>> trinity
>> services
>> LCF
>> trinity
>> services
>> SF
>> trinity
>> services
>> SF
>> trinity
>> services
>> WDF
>> trinity
>> services
>> SF
>> triniti
>> servic
>> PF
>> TRNT triniti
>> SRFK servic
>> HWF
>> TRNT triniti
>> SRFK servic
>> PSF
>> TRNT triniti
>> SRFK servic
>> 
>> 
>> 
>> *Query time*
>> ST
>> trinity
>> services
>> SF
>> trinity
>> services
>> LCF
>> trinity
>> services
>> WDF
>> trinity
>> services
>> SF
>> triniti
>> servic
>> PSF
>> triniti
>> servic
>> PF
>> TRNT triniti
>> SRFK servic
>> 
>> Apart from this, fuzzy would be for individual words and proximity would be
>> for phrases. Is this correct?
>> Also, can we have fuzzy on phrases?
>> 
>> 
>> On Wed, Aug 28, 2013 at 5:58 PM, Prasi S <prasi1...@gmail.com> wrote:
>> 
>>> hi Erick,
>>> Yes, it is correct. These results are because of stemming + phonetic
>>> matching. Below is the analysis.
>>> 
>>> Index time
>>> 
>>> ST
>>>   trinity
>>>  services
>>> SF
>>>   trinity
>>>  services
>>> LCF
>>>   trinity
>>>  services
>>> SF
>>>   trinity
>>>  services
>>> SF
>>>   trinity
>>>  services
>>> WDF
>>>   trinity
>>>  services
>>> Query time
>>> 
>>> SF
>>>   triniti
>>>  servic
>>> PF
>>>   TRNT  triniti
>>>  SRFK  servic
>>> HWF
>>>   TRNT  triniti
>>>  SRFK  servic
>>> PSF
>>>   TRNT  triniti
>>>  SRFK  servic
>>> Apart from this, fuzzy would be for individual words and proximity would be
>>> for phrases. Is this correct?
>>> Also, can we have fuzzy on phrases?
>>> 
>>> 
>>> 
>>> On Wed, Aug 28, 2013 at 5:36 PM, Erick Erickson <erickerick...@gmail.com> wrote:
>>> 
>>>> The first thing I'd recommend is to look at the admin/analysis
>>>> page. I suspect you aren't seeing fuzzy query results
>>>> at all; what you're seeing is the result of stemming.
>>>> 
>>>> Stemming is algorithmic, so it sometimes produces very
>>>> surprising results, e.g. Trinidad and Trinigee may stem
>>>> to something like triniti.
>>>> 
>>>> But you didn't provide the field definition so it's just a guess.
>>>> 
>>>> Best
>>>> Erick
>>>> 
>>>> 
>>>> On Wed, Aug 28, 2013 at 7:43 AM, Prasi S <prasi1...@gmail.com> wrote:
>>>> 
>>>>> Hi,
>>>>> With Solr 4.0 the fuzzy query syntax is like <keyword>~1 (or 2).
>>>>> Proximity search is like "value"~20.
>>>>> 
>>>>> How does Solr differentiate between the two searches? My thought was that
>>>>> proximity would be on phrases and fuzzy on individual words. Is that
>>>>> correct?
>>>>> 
>>>>> I wanted to do a proximity search on a text field and gave the query
>>>>> below:
>>>>> 
>>>>> <ip>:<port>/collection1/select?q="trinity%20service"~50&debugQuery=yes,
>>>>> 
>>>>> it gives me results as
>>>>> 
>>>>> <result name="response" numFound="111" start="0" maxScore="4.1237307">
>>>>> <doc>
>>>>> <str name="business_name">*Trinidad *Services</str>
>>>>> </doc>
>>>>> <doc>
>>>>> <str name="business_name">Trinity Services</str>
>>>>> </doc>
>>>>> <doc>
>>>>> <str name="business_name">Trinity Services</str>
>>>>> </doc>
>>>>> <doc>
>>>>> <str name="business_name">*Trinitee *Service</str>
>>>>> 
>>>>> How do I differentiate between fuzzy and proximity?
>>>>> 
>>>>> 
>>>>> Thanks,
>>>>> Prasi
>>>>> 
>>>> 
>>> 
>>> 
>> 

--
Walter Underwood
wun...@wunderwood.org


