Interestingly, I worked at two different web search companies with two completely different search engines, and one arrived at an 8X title boost and the other at a 7.5X title boost. So I consider 8X a universal physical constant.
I totally agree about using real user queries and real user clicks to evaluate your configuration.

wunder
Former Infoseek, Inktomi, Verity, Autonomy, Netflix, etc.

On Apr 25, 2012, at 6:00 PM, Erick Erickson wrote:

> Your first and biggest problem will be to define "good"
> result ordering. You have some anecdotal statements
> that amount to something like "sometimes I don't like
> the results". But unless you can quantify this, you'll spend a
> LOT of time going tweaking the results ordering and then
> going back and re-tweaking based on another result....
>
> But to your point. the 4x boosting is actually rather high. You
> might be able to get better results by boosting by significantly
> smaller values, say 1.5 or something.
>
> But under any circumstances, _some_ searches will not be
> satisfactory, I guess it's up to you to figure out what's
> about "the best you can do"... Wish I had better answers, but
> judgement calls are like that <G>..
>
> Best
> Erick
>
> On Tue, Apr 24, 2012 at 5:28 PM, Tavi Nathanson
> <tavi.nathan...@gmail.com> wrote:
>> Hey everyone,
>>
>> I field documents by "title" and "body". The title field often has far fewer
>> terms than the body field. IDF, as a result, will have a profound effect in
>> the title field compared to the body field.
>>
>> I currently have the title field boosted by 4x relative to the body field.
>> While I want matches in the title field to result in higher scores than
>> matches in the body field, I don't believe I want the title to completely
>> trump the body. I've seen this happen when a rare term is present in the
>> title field, and IDF combines with the 4x boost to wreak havoc.
>>
>> I'd like to get your thoughts on the following:
>>
>> - Is it standard practice to avoid boosting the title field much, because of
>> the (generally) high IDF of title field terms?
>> - Are there other strategies for handling the high IDF of a title field?
>>
>> Thanks!
>>
>> --
>> View this message in context:
>> http://lucene.472066.n3.nabble.com/Title-Boosting-and-IDF-tp3936709p3936709.html
>> Sent from the Solr - User mailing list archive at Nabble.com.

--
Walter Underwood
wun...@wunderwood.org
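
P.S. For anyone who wants to see how IDF and a field boost compound, as described in the quoted question, here is a rough sketch. It assumes Lucene's classic TF-IDF similarity, where idf = 1 + ln(numDocs / (docFreq + 1)) and idf enters the score squared (once in the query weight, once in the field weight). The corpus size and document frequencies are made-up illustration values, the helper names are hypothetical, and term frequency and length norms are ignored, so treat the numbers as directional only.

```python
import math


def classic_idf(num_docs: int, doc_freq: int) -> float:
    """Classic Lucene TF-IDF style idf: 1 + ln(numDocs / (docFreq + 1))."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


def field_weight(num_docs: int, doc_freq: int, boost: float) -> float:
    """Boost times idf squared; tf and length norm are deliberately left out."""
    return boost * classic_idf(num_docs, doc_freq) ** 2


# Assumed values for illustration only: a rare term that occurs in
# far fewer titles than bodies.
NUM_DOCS = 1_000_000
TITLE_DOC_FREQ = 50
BODY_DOC_FREQ = 5_000


def title_to_body_ratio(title_boost: float) -> float:
    """How many times more a title match counts than a body match (body boost = 1)."""
    title = field_weight(NUM_DOCS, TITLE_DOC_FREQ, title_boost)
    body = field_weight(NUM_DOCS, BODY_DOC_FREQ, 1.0)
    return title / body


if __name__ == "__main__":
    for boost in (1.0, 1.5, 4.0, 8.0):
        ratio = title_to_body_ratio(boost)
        print(f"title boost {boost:>4}: title match counts ~{ratio:.1f}x a body match")
```

Under these assumed frequencies the title already outweighs the body at a boost of 1.0, purely from IDF, and the configured boost multiplies on top of that. Whether the resulting ratio is too large is still something to judge against real queries and clicks rather than from arithmetic like this.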