Pradeep,

First, some background on fuzzy term expansions:

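1) A query for foobar~2 is really a query for (foobar OR foobar~1 OR
foobar~2), i.e. every indexed term within edit distance 0, 1, or 2.
2) Fuzzy term expansion keeps only 50 of the matching terms found in the
index (the default limit in Lucene's FuzzyQuery) and drops the rest.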

For implementation notes, see this comment:
https://github.com/apache/lucene-solr/blob/master/lucene/core/src/java/org/apache/lucene/search/FuzzyTermsEnum.java#L229-L232

So in your first search, the title_txt_en field holds few enough terms
that "probl~2" does match "problem".
In your second search, against the copy field, there are likely many more
terms in all_text_txt_ens. Here, the edit distance 1 terms crowd out the
edit distance 2 terms, so the latter never match.
You can imagine that the term expands into… "probl OR prob OR probe OR
prob1 OR…"
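
To make the crowding-out effect concrete, here is a rough Python sketch. It
is not Lucene's code: the function names, the two sample vocabularies, and
the "closest terms first" ordering are assumptions made for illustration,
and it uses plain Levenshtein distance (no transpositions) to stay short.

```python
from string import ascii_lowercase


def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance (illustrative; ignores transpositions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def expand_fuzzy(term, vocabulary, max_edits=2, max_expansions=50):
    """Return at most max_expansions terms within max_edits, closest first."""
    scored = [(edit_distance(term, t), t) for t in set(vocabulary)]
    within = sorted((d, t) for d, t in scored if d <= max_edits)
    return [t for _, t in within[:max_expansions]]


# Hypothetical small field: "problem" (distance 2) easily fits under the cap.
small_field = ["problem", "probe", "summary", "configuration"]

# Hypothetical large field: 52 made-up terms at distance 1 from "probl"
# (one letter appended or prepended), on top of the small field's terms.
noise = [f"probl{c}" for c in ascii_lowercase]
noise += [f"{c}probl" for c in ascii_lowercase]
large_field = small_field + noise

print("problem" in expand_fuzzy("probl", small_field))  # kept
print("problem" in expand_fuzzy("probl", large_field))  # crowded out
```

In the large field there are already more than 50 terms at distance 1, so
the cap is used up before any distance 2 term such as "problem" is
considered. Raising max_expansions in the sketch brings it back.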

I don't see a way to specify the number of expansions from a Solr query;
maybe somebody else on the list would know.

But at the end of the day, like Wunder said, you might want a prefix query
based on what you're describing.
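
For comparison, a prefix query in Solr syntax would look like
all_text_txt_ens:probl*. The Python sketch below only illustrates the
matching rule; the function name and the sample vocabulary are hypothetical.
The point is that a prefix match looks only at how a term starts, so terms
that are merely close in edit distance do not compete with "problem".

```python
def prefix_matches(prefix, vocabulary):
    """Return the distinct vocabulary terms that start with prefix, sorted."""
    return sorted(t for t in set(vocabulary) if t.startswith(prefix))


# Hypothetical field contents, including near-miss terms that would
# compete with "problem" in a fuzzy expansion.
vocabulary = ["problem", "problems", "probe", "prob", "probla", "summary"]

print(prefix_matches("probl", vocabulary))
```

Note that "probla" also matches here because it starts with "probl", while
"probe" and "prob" do not.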

Mike


On Mon, Apr 13, 2020 at 6:01 PM Deepu <kpkumar1...@gmail.com> wrote:

> Corrected Typo mistake.
>
> Hi Team,
>
> We have 8 text fields (*_txt_en) in schema and one multi valued text field
> which is copy field of other text fields, like below.
>
> tittle_txt_en, configuration_summary_txt_en, all_text_txt_ens (multi value
> field)
>
> Observed one issue with Fuzzy match, same term with distance of two(~2) is
> working on individual fields but not returning any results from multi
> valued field.
>
> Term we used is "probl" and document has "problem" term in two text fields,
> so all_text field has two occurrences of 'problem" terms.
>
>
>
> title_txt_en:probl~2. (given results)
>
> all_text_txt_ens:probl~2 (no results)
>
>
>
> Are there any other factors involved in distance calculation other
> than the Damerau-Levenshtein distance algorithm?
>
> what might be the reason same input with same distance worked with one
> field and failed with other field in same collection?
>
> is there a way we can get actual distance solr calculated w.r.t specific
> document and specific field ?
>
>
>
> Thanks in advance !!
>
>
> Thanks,
>
> Pradeep
>
> On Mon, Apr 13, 2020 at 2:35 PM Deepu <kpkumar1...@gmail.com> wrote:
>
> > Hi Team,
> >
> > We have 8 text fields (*_txt_en) in schema and one multi valued text field
> > which is copy field of other text fields, like below.
> >
> > tittle_txt_en, configuration_summary_txt_en, all_text_txt_ens (multi value
> > field)
> >
> > Observed one issue with Fuzzy match, same term with distance of two(~2) is
> > working on individual fields but not returning any results from multi
> > valued field.
> >
> > Term we used is "prob" and document has "problem" term in two text fields,
> > so all_text field has two occurrences of 'problem" terms.
> >
> >
> >
> > title_txt_en:prob~2. (given results)
> >
> > all_text_txt_ens:prob~2 (no results)
> >
> >
> >
> > Are there any other factors involved in distance calculation other
> > than the Damerau-Levenshtein distance algorithm?
> >
> > what might be the reason same input with same distance worked with one
> > field and failed with other field in same collection?
> >
> > is there a way we can get actual distance solr calculated w.r.t specific
> > document and specific field ?
> >
> >
> >
> > Thanks in advance !!
> >
> >
> > Thanks,
> >
> > Pradeep
> >
>
