It won't crash Solr if you include it, but it probably won't do what
you expect either, because of how wildcards are expanded.

And it gets worse. DoubleMetaphone tries to reduce what it
analyzes to a phonetic form, using "close" letters (or multiple
choices). Some phonetic filters reduce terms to fixed 4-letter
combinations, as I remember. Some hash to a completely different
string. Some....
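
To make the wildcard problem concrete, here is a rough sketch using a
simplified Soundex, which is one encoding that yields fixed 4-character
codes. It is an assumption that this resembles the filter being recalled
above, and it is not the code Solr runs; the function name and the
ASCII-only, non-empty input are choices made for the illustration.

```python
# Simplified American Soundex, for illustration only (not Solr's filter).
# Assumes a non-empty word made of ASCII letters.
CODES = {}
for letters, digit in [("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")]:
    for ch in letters:
        CODES[ch] = digit


def soundex(word: str) -> str:
    word = word.lower()
    out = word[0].upper()
    last = CODES.get(word[0], "")
    for ch in word[1:]:
        if ch in "hw":
            continue  # h and w do not break a run of equal codes
        code = CODES.get(ch, "")
        if code and code != last:
            out += code
        last = code  # a vowel resets the run, so a repeated code counts again
    return (out + "000")[:4]  # pad or truncate to exactly 4 characters


indexed = soundex("Robert")   # what would sit in the index
prefix = soundex("Rob")       # what "Rob*" becomes if the prefix is encoded too
print(indexed, prefix, indexed.startswith(prefix))
```

If the prefix of a wildcard query is pushed through an encoder like this,
the padded code for "Rob" is no longer a prefix of the code for "Robert",
so the wildcard finds nothing even though the raw strings obviously match.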

About combining fuzzy and wildcard: I haven't thought it through,
but it strikes me as fraught with unexpected results. Consider
har* and treating it as a fuzzy match. How would you calculate
the "fuzziness" of "hardiness" and "harp"? Would you consider
"her" a fuzzy match? How about "farther"? Or even "father"?
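
One way to see the difficulty is to compute plain edit distance between
the prefix "har" and each of those words. This is only a sketch: it uses
a textbook Levenshtein distance as a stand-in for "fuzziness", which is
an assumption and not necessarily how any particular Solr version scores
fuzzy matches.

```python
def levenshtein(a: str, b: str) -> int:
    """Textbook dynamic-programming edit distance
    (insertions, deletions and substitutions each cost 1)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


prefix = "har"
for term in ["hardiness", "harp", "her", "farther", "father"]:
    print(term, levenshtein(prefix, term))
```

By this measure "her" is exactly as close to "har" as "harp" is, while
"hardiness", which the wildcard har* matches outright, is the farthest
of the five. The two notions of "match" pull in different directions.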

You might be able to do something interesting with EdgeNgram
here, but it still seems like it's going to either explode
computationally or produce results that don't really mean much.
I'm mostly speculating here, though....
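
For what it's worth, here is a small sketch of what edge n-gramming does
to a term, to show where the blow-up comes from. The function and its
min_gram/max_gram parameters are hypothetical names for this
illustration, not Solr configuration.

```python
def edge_ngrams(term: str, min_gram: int = 1, max_gram: int = 5) -> list:
    """Return the leading substrings of term, from min_gram up to
    max_gram characters (or the full term, if it is shorter)."""
    upper = min(max_gram, len(term))
    return [term[:n] for n in range(min_gram, upper + 1)]


# Every indexed word turns into several index terms.
for word in ["hardiness", "harp"]:
    print(word, edge_ngrams(word, min_gram=2, max_gram=5))
```

With grams like these in the index, the prefix "har" can be matched as an
ordinary term with no wildcard at all. A fuzzy query, though, would then
have every gram of every word as a candidate, which is where the
computational cost and the hard-to-interpret matches would come from.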

Frankly, though, I'd do what Jan suggests. Try it out and see if
it's "good enough". Especially pin down the use cases. Often
requirements like this are specified by someone who, when
presented with the results of what you can do easily, decides the
effort could best be spent somewhere else.

That's because this whole approach will only increase the number of
documents that are found as the result of a search without
necessarily increasing the relevance of the top N docs on the
first page. Users rarely go to the second page, and often don't
even look past the first few results. Doing wildcard AND fuzzy
queries would likely result in something useful only a very small
percentage of the time. But that's just a guess.

Best
Erick


On Tue, Oct 9, 2012 at 5:54 AM, Haagen Hasle <haagenha...@gmail.com> wrote:
>
> I used the admin/analysis page (great tip, I had never used it before - thank 
> you!) and it seems to me that the DoubleMetaphone filter converts "Hågen" to 
> both "JN" and "KN".  Will that crash the Solr analysis if I try to include 
> this filter in the multiterm-analysis?
>
> Do you know where I can find out more about combining wildcard and fuzzy in 
> the same query?  When you say you don't think it is possible, do you mean it 
> is not implemented in Solr today, or it can't be implemented because it is 
> technically impossible or functionally doesn't make sense? :)
>
> I wrote in an answer to Otis that I'd like to try to combine fuzzy with Ngram 
> as well.  Do you know if that is possible and makes any sense?
>
>
> Thanks to everyone for quick and good answers, I really appreciate it!
>
>
> Regards, Hågen
>
> Den 8. okt. 2012 kl. 21:35 skrev Erick Erickson:
>
>> To answer your first question, yes, you've got it right. If you define
>> a multiterm section in your fieldType, whatever you put in that section
>> gets applied whether the underlying class is MultiTermAware or not.
>> Which means you can shoot yourself in the foot really bad <G>...
>>
>> (…)
>>
>> Fuzzy searches + wildcards. I don't think you can do that reasonably, but
>> I'm not entirely sure.
>>
>> Best
>> Erick
>
