> We use two fields, one with and one without stopwords. The exact
> field has a higher boost than the other. That works pretty well.

Thanks for the tip, wunder!  We are doing likewise for the pf param of
DisMax and that part works well -- exact matches are highly relevant
and stopped matches less so, but still present in the result set.  The
main problem is getting past the qf param so that we don't have
invisible titles (stop-words removed by the qf pipeline, leaving an
empty query) or over-specified generated queries (where stop-words
turn out to be required but can't match for various reasons).
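
For illustration, here is a minimal Python sketch of one possible guard
against the "invisible title" case: if every token in the query is a
stop-word, send qf only to the unstopped field so the query cannot end
up empty. The field names title_exact and title_stopped, the boost
values, and the stop-word list are all hypothetical; only defType, q,
qf and pf are actual DisMax parameters. This is not necessarily how
either setup in this thread works.

```python
# Hypothetical stop-word list; a real one would mirror the stop filter
# configured on the stopped field.
STOPWORDS = frozenset(
    {"a", "an", "and", "the", "of", "to", "in", "is", "it", "who", "that"}
)


def is_all_stopwords(query, stopwords=STOPWORDS):
    """Return True if the query has tokens and every one is a stop-word."""
    tokens = query.lower().split()
    return bool(tokens) and all(token in stopwords for token in tokens)


def build_dismax_params(query, stopwords=STOPWORDS):
    """Build DisMax request parameters for a title search.

    title_exact (no stop-word removal) and title_stopped (stop-words
    removed) are assumed field names, and the boosts are placeholders.
    """
    if is_all_stopwords(query, stopwords):
        # The stopped field would analyze this query down to nothing,
        # so only query the exact field.
        qf = "title_exact^2.0"
    else:
        qf = "title_exact^2.0 title_stopped^1.0"
    return {
        "defType": "dismax",
        "q": query,
        "qf": qf,
        # Phrase boosting can keep using both fields.
        "pf": "title_exact^4.0 title_stopped^2.0",
    }
```

With this sketch, build_dismax_params("The Who") queries only
title_exact, while a query such as "dark art" goes to both fields. It
does nothing for the over-specified case, which depends on how mm and
the per-field analyzers interact.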

> It helps to have an automated relevance test when tuning the boost
> (and other things). I extracted queries and clicks from the logs
> for a couple of months. Not perfect, but it is hard to argue with
> 32 million clicks.

I'd say -- a dream data set.  :-)  Good idea on the relevance test --
eyeballing boost changes is definitely prone to unexpected effects
across all of the queries one didn't think to try.  (A dark art, boost
tuning...)
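
As a rough sketch of what such an automated test could look like, the
Python below scores a search function against (query, clicked document)
pairs using mean reciprocal rank. MRR is just one plausible metric
chosen for illustration, not necessarily what wunder used, and the
click-log format and the search callable are assumptions.

```python
def reciprocal_rank(ranked_ids, clicked_id):
    """Return 1/position of the clicked document, or 0.0 if it is absent."""
    for position, doc_id in enumerate(ranked_ids, start=1):
        if doc_id == clicked_id:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(click_log, search):
    """Average the reciprocal rank over all logged (query, click) pairs.

    click_log: iterable of (query, clicked_doc_id) pairs (assumed format).
    search:    callable taking a query string and returning a ranked
               list of document ids, e.g. a thin wrapper around the
               search engine under a given boost configuration.
    """
    total = 0.0
    count = 0
    for query, clicked_id in click_log:
        total += reciprocal_rank(search(query), clicked_id)
        count += 1
    return total / count if count else 0.0
```

Running this once per candidate boost setting over the same click log
gives a single number to compare, which catches regressions on queries
nobody thought to try by hand.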

Ron
