Re: Similarity plugins which are normalized

Tanya Bompi Thu, 29 Nov 2018 15:29:26 -0800

Thanks a lot, Doug. Maybe giving more weight to certain fields, in
conjunction with the overall match, is the way to go.


Tanu

On Thu, Nov 29, 2018 at 1:52 PM Doug Turnbull <
dturnb...@opensourceconnections.com> wrote:

> The usual advice is that relevance scores don’t exist on a scale where a
> threshold is useful. They are just heuristics used for ranking, not a
> confidence level.
>
> I would instead focus on which attributes of a document make it relevant
> or not (such as a strong match in certain fields).
>
> A couple of things prevent field scores from being comparable:
> - doc freq differs per field
> - field length / avg field length differs per field
> - typical term frequency of a term in a field differs
>
> You might find this article useful:
>
>
> https://opensourceconnections.com/blog/2013/07/02/getting-dissed-by-dismax-why-your-incorrect-assumptions-about-dismax-are-hurting-search-relevancy/
>
> Doug
>
> On Thu, Nov 29, 2018 at 4:44 PM Tanya Bompi <tanya.bo...@gmail.com> wrote:
>
> > Hi,
> >   As I am tuning the relevancy of my query parser, I see that 2 different
> > queries with phrase matches get very different scores, primarily
> > influenced by the Term Frequency component. Since I am using a threshold
> > to filter the results for a matched record based on the Solr score, a
> > somewhat normalized score is needed.
> > Are there any similarity classes that are more suitable to my needs?
> >
> > Thanks,
> > Tanu
> >
> --
> *Doug Turnbull **| *Search Relevance Consultant | OpenSource Connections
> <http://opensourceconnections.com>, LLC | 240.476.9983
> Author: Relevant Search <http://manning.com/turnbull>
> This e-mail and all contents, including attachments, is considered to be
> Company Confidential unless explicitly stated otherwise, regardless
> of whether attachments are marked as such.
>
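
To illustrate the three points in Doug's list, here is a minimal sketch of a
single-term BM25 score, assuming the Lucene-style formula with its usual
defaults (k1=1.2, b=0.75). The function name and all field statistics are
hypothetical, chosen only to show that an identical match (tf=1) lands on a
different scale in each field, so a single score threshold means different
things per field.

```python
import math


def bm25_term_score(tf, doc_freq, doc_count, field_len, avg_field_len,
                    k1=1.2, b=0.75):
    """Single-term BM25 score (Lucene-style form; assumed, not taken from the thread)."""
    # Rarer terms (low doc_freq relative to doc_count) get a higher idf.
    idf = math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))
    # Length normalization: fields longer than the average are penalized.
    length_norm = k1 * (1 - b + b * field_len / avg_field_len)
    return idf * tf * (k1 + 1) / (tf + length_norm)


# Hypothetical statistics for the same term matched once in two fields.
title_score = bm25_term_score(tf=1, doc_freq=50, doc_count=10000,
                              field_len=5, avg_field_len=6)
body_score = bm25_term_score(tf=1, doc_freq=2000, doc_count=10000,
                             field_len=300, avg_field_len=250)

print(f"title: {title_score:.3f}  body: {body_score:.3f}")
```

With these made-up numbers the title match scores several times higher than
the body match even though both are a single occurrence of the same term,
which is why the scores are useful for ranking within a query but not as an
absolute confidence level.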
