bq: How do you compare the quality of your
search results in order to decide which schema is better?

Well, that's actually a hard problem. There are the
various TREC data sets, but those are generic, and almost
every individual application of this generic thing called
"search" has its own version of "good" results.

Note that scores are NOT comparable across different
queries even in the same data set, so don't go down that
path.

I'd fire the question back at you, "Can you define what
good (or better) results are in such a way that you can
program an evaluation?" Often the answer is "no"...
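
But if you *can* write judgments down (query -> set of relevant doc IDs),
then "better" becomes something you can compute, e.g. precision@K per
schema. Here's a rough sketch of what I mean; the Solr URL, core name and
"id" field below are just placeholders for whatever your setup uses:

import json, urllib.parse, urllib.request

SOLR_URL = "http://localhost:8983/solr/collection1/select"   # placeholder

def top_ids(query, rows=10):
    # ids of the top `rows` hits for one query
    params = urllib.parse.urlencode(
        {"q": query, "rows": rows, "fl": "id", "wt": "json"})
    with urllib.request.urlopen(SOLR_URL + "?" + params) as resp:
        docs = json.load(resp)["response"]["docs"]
    return [d["id"] for d in docs]

def precision_at_k(query, relevant_ids, k=10):
    # fraction of the top k results that are judged relevant
    hits = top_ids(query, rows=k)
    return sum(1 for doc_id in hits if doc_id in relevant_ids) / float(k)

def mean_precision_at_k(judgments, k=10):
    # judgments: {"some query": {"doc7", "doc42"}, ...}  -- you supply these
    scores = [precision_at_k(q, rel, k) for q, rel in judgments.items()]
    return sum(scores) / len(scores)

Point it at one schema at a time and compare the averages. The hard part
is still coming up with the judgments, which is exactly the question above.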

One common technique is to have knowledgeable users
do what's called A/B testing. You fire the query at two
separate Solr instances and display the results side-by-side,
and the user says "A is more relevant" or "B is more
relevant". Kind of like an eye doctor. In sophisticated A/B
testing, the program randomly changes which side each
result list appears on, so you remove "sidedness" bias.
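
Something along these lines, just as a sketch. The two instance URLs, core
names and the "title" field are hypothetical; substitute whatever you
actually have:

import json, random, urllib.parse, urllib.request

SOLR_A = "http://localhost:8983/solr/schema_a/select"   # placeholder: schema A
SOLR_B = "http://localhost:8984/solr/schema_b/select"   # placeholder: schema B

def titles(base_url, query, rows=10):
    # titles of the top hits from one instance
    params = urllib.parse.urlencode(
        {"q": query, "rows": rows, "fl": "title", "wt": "json"})
    with urllib.request.urlopen(base_url + "?" + params) as resp:
        return [str(d.get("title", ""))
                for d in json.load(resp)["response"]["docs"]]

def judge(query):
    # show the two lists in a random left/right order,
    # record which side the user prefers
    sides = [("A", titles(SOLR_A, query)), ("B", titles(SOLR_B, query))]
    random.shuffle(sides)                      # removes "sidedness" bias
    (left_name, left), (right_name, right) = sides
    for i in range(max(len(left), len(right))):
        l = left[i] if i < len(left) else ""
        r = right[i] if i < len(right) else ""
        print("%2d. %-38s | %s" % (i + 1, l[:38], r[:38]))
    choice = input("Which side is more relevant? [l/r] ").strip().lower()
    return left_name if choice == "l" else right_name

Tally the winners over your whole query list and you have a subjective
but usable comparison of the two schemas.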


FWIW,
Erick


On Thu, Oct 17, 2013 at 11:28 AM, Alvaro Cabrerizo <topor...@gmail.com> wrote:

> Hi,
>
> Imagine the following situation. You have a corpus of documents and a list of
> queries extracted from a production environment. The corpus hasn't been
> manually annotated with relevant/non-relevant tags for every query. Then you
> configure various Solr instances, changing the schema (adding synonyms,
> stopwords...). After indexing, you prepare and execute the test over the
> different schema configurations. How do you compare the quality of your
> search results in order to decide which schema is better?
>
> Regards.
>
