Re: Re: Re: Re: Re: Query Autocomplete Evaluation

2020-02-28 Thread Audrey Lorberfeld - audrey.lorberf...@ibm.com
are given 0s) / total suggestions displayed? If the above is true, wouldn't Selection to Display be binary? I.e., it's either 1/# of suggestions displayed (assuming this is a constant) or 0? Best, Audrey

Re: Re: Re: Re: Query Autocomplete Evaluation

2020-02-28 Thread Paras Lehana
suggestions displayed (assuming this is a constant) or 0? Best, Audrey From: Paras Lehana Sent: Thursday, February 27, 2020 2:58:25 AM To: solr-user@lucene.apache.org Subject: [EXTERNAL] Re: Re: Re: Query Autocomplete Eval

Re: Re: Re: Re: Query Autocomplete Evaluation

2020-02-27 Thread Audrey Lorberfeld - audrey.lorberf...@ibm.com
L] Re: Re: Re: Query Autocomplete Evaluation Hi Audrey, For MRR, we assume that if a suggestion is selected, it's relevant. It's also assumed that the user will always click the highest-ranked relevant suggestion. Thus, we calculate position selection for each selection. If I'm still not un

Re: Re: Re: Query Autocomplete Evaluation

2020-02-26 Thread Paras Lehana
Hi Audrey, For MRR, we assume that if a suggestion is selected, it's relevant. It's also assumed that the user will always click the highest-ranked relevant suggestion. Thus, we calculate position selection for each selection. If I'm still not understanding your question correctly, feel free to contact

Re: Re: Re: Query Autocomplete Evaluation

2020-02-25 Thread Audrey Lorberfeld - audrey.lorberf...@ibm.com
This article http://wwwconference.org/proceedings/www2011/proceedings/p107.pdf also indicates that MRR needs binary relevance labels, p. 114: "To this end, we selected a random sample of 198 (query, context) pairs from the set of 7,311 pairs, and manually tagged each of them as related (i.e., th

Re: Re: Query Autocomplete Evaluation

2020-02-25 Thread Audrey Lorberfeld - audrey.lorberf...@ibm.com
Thank you, Walter & Paras! So, from the MRR equation, I was under the impression the suggestions all needed a binary label (0,1) indicating relevance.* But it's great to know that you guys use proxies for relevance, such as clicks. *The reason I think MRR has to have binary relevance labels is

Re: Query Autocomplete Evaluation

2020-02-24 Thread Walter Underwood
Here is a blog article with a worked example for MRR based on customer clicks. https://observer.wunderwood.org/2016/09/12/measuring-search-relevance-with-mrr/ At my place of work, we compare the CTR and MRR of queries using suggestions to those that do not use suggestions. Solr autosuggest based

Re: Re: Query Autocomplete Evaluation

2020-02-24 Thread Paras Lehana
Hey Audrey, I assume MRR is about the ranking of the intended suggestion. For this, no human judgement is required. We track position selection: the position (1-10) of the selected suggestion. For example, these are our recent numbers: Position 1 Selected (B3) 107,699 Position 2 Selected (B4) 58,7
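A minimal sketch of how position-selection counts like the ones described above could be turned into an MRR figure, assuming (as stated elsewhere in this thread) that the selected suggestion is treated as the relevant one. The function name and the example counts are hypothetical and are not the figures reported in the thread.

```python
from typing import Mapping


def mrr_from_position_counts(counts: Mapping[int, int]) -> float:
    """Mean reciprocal rank from selection counts.

    counts[p] is the number of times the suggestion at 1-based
    position p was selected. Each selection contributes 1/p.
    """
    if any(pos < 1 for pos in counts):
        raise ValueError("positions must be 1-based")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no selections recorded")
    return sum(n / pos for pos, n in counts.items()) / total


# Hypothetical counts for illustration only.
example_counts = {1: 60, 2: 30, 3: 10}
example_mrr = mrr_from_position_counts(example_counts)
```

Because only the clicked position is used, this needs no human relevance labels; sessions with no selection are simply not counted here, which is a design choice that would need to be stated alongside the metric.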

Re: Re: Query Autocomplete Evaluation

2020-02-24 Thread Audrey Lorberfeld - audrey.lorberf...@ibm.com
Hi Paras, This is SO helpful, thank you. Quick question about your MRR metric -- do you have binary human judgements for your suggestions? If no, how do you label suggestions successful or not? Best, Audrey On 2/24/20, 2:27 AM, "Paras Lehana" wrote: Hi Audrey, I work for Auto-S

Re: Query Autocomplete Evaluation

2020-02-23 Thread Paras Lehana
Hi Audrey, I work for Auto-Suggest at IndiaMART. Although we don't use the Suggester component, I think you need evaluation metrics for Auto-Suggest as a business product and not specifically for Solr Suggester, which is the backend. We use the edismax parser with EdgeNGram tokenization. Every week,

Query Autocomplete Evaluation

2020-02-14 Thread Audrey Lorberfeld - audrey.lorberf...@ibm.com
Hi all, How do you all evaluate the success of your query autocomplete (i.e. suggester) component if you use it? We cannot use MRR for various reasons (I can go into them if you're interested), so we're thinking of using nDCG since we already use that for relevance eval of our system as a who
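For readers unfamiliar with the nDCG metric mentioned in the original question, here is a minimal sketch of one common formulation (linear gain, log2 position discount). It assumes graded relevance labels are already available for the suggestions in displayed order; how those labels would be obtained for autocomplete is not specified in the thread, and the function names are illustrative.

```python
import math
from typing import Sequence


def dcg(relevances: Sequence[float]) -> float:
    """Discounted cumulative gain with linear gain and log2 discount."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances, start=1)
    )


def ndcg(relevances: Sequence[float]) -> float:
    """DCG of the displayed order divided by DCG of the ideal order."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0
```

Unlike MRR, this rewards every relevant suggestion in the list rather than only the first one, at the cost of needing a relevance grade per suggestion.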