Interestingly, I worked at two different web search companies with two 
completely different search engines, and one arrived at an 8X title 
boost and the other at a 7.5X title boost. So I consider 8X a universal 
physical constant.

I totally agree about using real user queries and real user clicks to evaluate 
your configuration.
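
One way to put a number on "good" from query and click logs is mean reciprocal rank (MRR): for each logged query, take 1/position of the first clicked document in the ranking a given configuration produces, then average. The sketch below is only an illustration. The queries, document IDs, clicks, and rankings are invented toy data, and mean_reciprocal_rank is a hypothetical helper, not part of Solr.

```python
def mean_reciprocal_rank(ranked_results, clicks):
    """Average of 1/rank of the first clicked doc per query.

    ranked_results: {query: [doc_id, ...]} as returned by one configuration.
    clicks: {query: set of doc_ids users actually clicked}.
    Queries whose clicked docs never appear contribute 0.
    """
    total = 0.0
    counted = 0
    for query, clicked in clicks.items():
        reciprocal_rank = 0.0
        for position, doc_id in enumerate(ranked_results.get(query, []), start=1):
            if doc_id in clicked:
                reciprocal_rank = 1.0 / position
                break
        total += reciprocal_rank
        counted += 1
    return total / counted if counted else 0.0


# Invented toy data: clicks from a log, and the rankings that two
# hypothetical title-boost settings would have produced.
clicks = {"solr boost": {"d2"}, "idf title": {"d7"}}
rankings_by_boost = {
    4.0: {"solr boost": ["d1", "d2", "d3"], "idf title": ["d7", "d8"]},
    8.0: {"solr boost": ["d2", "d1", "d3"], "idf title": ["d7", "d8"]},
}

mrr_by_boost = {
    boost: mean_reciprocal_rank(rankings, clicks)
    for boost, rankings in rankings_by_boost.items()
}
best_boost = max(mrr_by_boost, key=mrr_by_boost.get)
```

In practice the rankings would come from replaying the logged queries against each candidate configuration, and a few hundred queries or more are needed before differences in MRR mean much.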

wunder
Former Infoseek, Inktomi, Verity, Autonomy, Netflix, etc.

On Apr 25, 2012, at 6:00 PM, Erick Erickson wrote:

> Your first and biggest problem will be to define "good"
> result ordering. You have some anecdotal statements
> that amount to something like "sometimes I don't like
> the results". But unless you can quantify this, you'll spend a
> LOT of time tweaking the results ordering and then
> going back and re-tweaking based on another result....
> 
> But to your point: the 4x boosting is actually rather high. You
> might be able to get better results by boosting by significantly
> smaller values, say 1.5 or something.
> 
> But under any circumstances, _some_ searches will not be
> satisfactory. I guess it's up to you to figure out what
> counts as "the best you can do"... Wish I had better answers, but
> judgement calls are like that <G>.
> 
> Best
> Erick
> 
> On Tue, Apr 24, 2012 at 5:28 PM, Tavi Nathanson
> <tavi.nathan...@gmail.com> wrote:
>> Hey everyone,
>> 
>> I index documents with "title" and "body" fields. The title field often has
>> far fewer terms than the body field. IDF, as a result, will have a profound
>> effect in the title field compared to the body field.
>> 
>> I currently have the title field boosted by 4x relative to the body field.
>> While I want matches in the title field to result in higher scores than
>> matches in the body field, I don't believe I want the title to completely
>> trump the body. I've seen this happen when a rare term is present in the
>> title field, and IDF combines with the 4x boost to wreak havoc.
>> 
>> I'd like to get your thoughts on the following:
>> 
>> - Is it standard practice to avoid boosting the title field much, because of
>> the (generally) high IDF of title field terms?
>> - Are there other strategies for handling the high IDF of a title field?
>> 
>> Thanks!
>> 
>> --
>> View this message in context: 
>> http://lucene.472066.n3.nabble.com/Title-Boosting-and-IDF-tp3936709p3936709.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
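
The interaction described in the original question (a rare title term whose IDF compounds with the 4x boost) can be seen with a little arithmetic. The sketch below uses the IDF formula from Lucene's classic TF-IDF similarity, 1 + ln(numDocs / (docFreq + 1)), and a deliberately simplified per-field score of boost * sqrt(tf) * idf^2. Length norms, coord, and queryNorm are left out, and the document counts and frequencies are made-up numbers, so treat the output as illustrative only.

```python
import math


def classic_idf(num_docs, doc_freq):
    # IDF as in Lucene's classic TF-IDF similarity; computed per field.
    return 1.0 + math.log(num_docs / (doc_freq + 1))


def toy_field_score(term_freq, idf, boost):
    # Simplified assumption: boost * sqrt(tf) * idf^2.
    # Real Lucene scoring also applies length norms, coord and queryNorm.
    return boost * math.sqrt(term_freq) * idf ** 2


NUM_DOCS = 1_000_000  # made-up index size

# A term that appears in only 10 titles, matched once, with the 4x boost.
title_rare = toy_field_score(1, classic_idf(NUM_DOCS, 10), boost=4.0)

# A term that appears in 50,000 bodies, matched three times, no boost.
body_common = toy_field_score(3, classic_idf(NUM_DOCS, 50_000), boost=1.0)

# How much the title match outweighs the body match in this toy model.
ratio = title_rare / body_common
print(f"title={title_rare:.1f} body={body_common:.1f} ratio={ratio:.1f}")
```

Under these assumptions the title match outweighs the body match by far more than the nominal 4x, because the squared IDF gap multiplies with the boost.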

--
Walter Underwood
wun...@wunderwood.org


