Re: Why is multiplicative boost preferred over additive?

Jan Høydahl Sat, 19 Mar 2016 11:50:40 -0700

You can also use functions to “compress” the source number, so that the
effect of a certain boost becomes bigger or smaller compared to the other
boost you have.
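
A minimal sketch in Python of what "compressing" can look like. The rental counts are made-up illustrative values, and the helper names are hypothetical. Solr's function queries include log and sqrt, but the Python below only mimics the arithmetic and is not Solr syntax.

```python
import math

# Hypothetical rental counts (illustrative values only).
popularity = [80, 100, 800_000, 1_000_000]


def compress_log(x):
    # Base-10 logarithm; the +1 guards against log(0) for unseen documents.
    return math.log10(x + 1)


def compress_sqrt(x):
    return math.sqrt(x)


# Ratio between the largest and smallest boost value under each transform.
raw_spread = max(popularity) / min(popularity)
sqrt_spread = compress_sqrt(max(popularity)) / compress_sqrt(min(popularity))
log_spread = compress_log(max(popularity)) / compress_log(min(popularity))

print(f"raw: {raw_spread:.1f}  sqrt: {sqrt_spread:.1f}  log: {log_spread:.2f}")
```

The stronger the compression, the smaller the gap between the most and least popular document, and so the less this boost dominates any other boost it is combined with.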


--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

> On 17 Mar 2016, at 23:21, Upayavira <u...@odoko.co.uk> wrote:
> 
> Yes. Boosting adjusts an existing score. That original score can vary,
> e.g. depending upon how many search terms there are.
> 
> If you use additive boosting, adding a boost to a search with one term
> (where scores fall, e.g., between 0 and 1) has a different effect than
> adding the same boost to a search with four terms (e.g. between 0 and
> 4). With multiplicative boosting, however, the impact of the boost
> would be the same in both cases.
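
The point about score ranges can be sketched with a toy example in Python. The document names, scores and boost values are invented for illustration, and the four-term scores are simply assumed to be four times the one-term scores.

```python
def additive(score, boost):
    return score + boost


def multiplicative(score, boost):
    # score * (1 + boost): a boost of 0 leaves the score unchanged.
    return score * (1 + boost)


def winner(scores, boosts, combine):
    """Return the name of the top-ranked document."""
    totals = {doc: combine(scores[doc], boosts[doc]) for doc in scores}
    return max(totals, key=totals.get)


# Hypothetical boosts and text scores for two documents.
boosts = {"A": 0.3, "B": 0.1}
one_term = {"A": 0.5, "B": 0.6}
# Assumption: a four-term query yields scores four times as large.
four_terms = {doc: 4 * s for doc, s in one_term.items()}

for label, scores in (("one term", one_term), ("four terms", four_terms)):
    print(label,
          "additive:", winner(scores, boosts, additive),
          "multiplicative:", winner(scores, boosts, multiplicative))
```

With these numbers the additive winner flips from A to B when the text scores grow, while the multiplicative winner stays the same.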
> 
> If, for example, you want to add a recency boost, say with recip, where
> the boost value is between 0 and 1, then use score*(1+boost). This way,
> a boost of 0 has no effect on the score, whereas a boost of 1 doubles
> the score. If you use plain multiplicative here, a boost of 0 wipes out
> the score entirely, which can have nasty effects (it has, at least, for
> me).
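
A rough Python sketch of the score*(1+boost) idea. The recip helper copies the shape of Solr's recip(x,m,a,b) function, a/(m*x+b). The constants and the recency_boost, shifted and plain names are assumptions chosen for illustration.

```python
def recip(x, m, a, b):
    # Same shape as Solr's recip(x,m,a,b): a / (m*x + b).
    return a / (m * x + b)


MS_PER_YEAR = 3.16e10  # approximate number of milliseconds in a year


def recency_boost(age_ms):
    # Assumed parameters: 1.0 for a brand-new document, 0.5 after about
    # one year, approaching 0 as the document gets older.
    return recip(age_ms, 1 / MS_PER_YEAR, 1, 1)


def shifted(score, boost):
    # score * (1 + boost): boost 0 is neutral, boost 1 doubles the score.
    return score * (1 + boost)


def plain(score, boost):
    # score * boost: boost 0 wipes out the score entirely.
    return score * boost


score = 2.0
for boost in (0.0, recency_boost(MS_PER_YEAR), recency_boost(0)):
    print(f"boost={boost:.2f}  shifted={shifted(score, boost):.2f}  "
          f"plain={plain(score, boost):.2f}")
```

With the shifted form, even a very old document keeps its full text score. With the plain form its score shrinks toward zero regardless of how well it matched.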
> 
> Upayavira
> 
> On Thu, 17 Mar 2016, at 06:58 PM, Walter Underwood wrote:
>> Think about using popularity as a boost. If one movie has a million
>> rentals and one has a hundred rentals, there is no additive formula that
>> balances that with text relevance. Even with log(popularity), it doesn’t
>> work.
>> 
>> With multiplicative boost, we only care about the difference between the
>> one rented one million times and the one rented 800 thousand times (think
>> about the Twilight movies at Netflix). But it also distinguishes between
>> the one rented 100 times and the one rented 80 times.
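
One way to read this argument, sketched in Python with the rental counts from the message: what an additive boost contributes is the absolute gap in popularity, whereas a multiplicative boost contributes the ratio. The dictionary names are hypothetical.

```python
# Rental counts taken from the message above.
pairs = {
    "blockbusters": (1_000_000, 800_000),
    "niche titles": (100, 80),
}

differences = {}
ratios = {}
for label, (more, fewer) in pairs.items():
    # The gap an additive boost on raw popularity would add to the score.
    differences[label] = more - fewer
    # The factor a multiplicative boost on raw popularity would apply.
    ratios[label] = more / fewer

print(differences)
print(ratios)
```

The absolute gaps differ by four orders of magnitude, so no single additive scale suits both pairs. The ratios are identical, so a multiplicative boost separates both pairs equally.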
>> 
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/  (my blog)
>> 
>> 
>>> On Mar 17, 2016, at 11:29 AM, jimi.hulleg...@svensktnaringsliv.se wrote:
>>> 
>>> Hi,
>>> 
>>> After reading a bit on various sites, and especially the blog post 
>>> "Comparing boost methods in Solr", it seems that the preferred boosting 
>>> type is the multiplicative one, over the additive one. But I can't really 
>>> get my head around *why* that is so, since in most boosting problems I can 
>>> think of, it seems that an additive boost would suit better.
>>> 
>>> For example, in our project we want to boost documents depending on various 
>>> factors, but in essence they can be summarized as:
>>> 
>>> - Regular edismax logic, like qf=title^2 mainText^1
>>> - Multiple custom document fields, with weights specified at query time
>>> 
>>> So, first off, the custom fields... It became obvious to me quite quickly 
>>> that multiplicative logic here would totally ruin the purpose of the 
>>> weights, since something like "(f1 *  w1) * (f2 * w2)" is the same as "(f1 
>>> *  w2) * (f2 * w1)". So, I ended up using additive boost here.
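
The observation about swapped weights can be checked with a few lines of Python. The field values and weights are arbitrary illustrative numbers, and the function names are hypothetical.

```python
def product_combo(f1, w1, f2, w2):
    # Multiplicative combination: (f1 * w1) * (f2 * w2).
    return (f1 * w1) * (f2 * w2)


def sum_combo(f1, w1, f2, w2):
    # Additive combination: f1 * w1 + f2 * w2.
    return f1 * w1 + f2 * w2


f1, f2 = 0.25, 0.5  # hypothetical field values
w1, w2 = 1.0, 3.0   # hypothetical query-time weights

print(product_combo(f1, w1, f2, w2), product_combo(f1, w2, f2, w1))
print(sum_combo(f1, w1, f2, w2), sum_combo(f1, w2, f2, w1))
```

Swapping the weights leaves the product unchanged because the weights collapse into the single constant w1 * w2. The sum changes, so only the additive form lets each weight act on its own field.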
>>> 
>>> Then we have the combination of the edismax boost, and my custom boost. As 
>>> far as I understand it, when using the boost field with edismax, this 
>>> combination is always performed using multiplicative logic. But the same 
>>> problem exists here as it did with my custom fields. Because if I boost the 
>>> aggregated result of the custom fields using some weight, it doesn't affect 
>>> the order of the documents because that weight influences the edismax boost 
>>> just as much. What I want is to have the weight only influence my custom 
>>> boost value, so that I can control how much (or little) the final score 
>>> should be affected by the custom boost.
>>> 
>>> So, in both cases I find myself wanting to use the additive boost. But 
>>> surely I must be missing something, right? Am I thinking backwards or 
>>> something?
>>> 
>>> I don't use any out-of-the-box example indexes, so I can't provide you with a 
>>> working URL that shows exactly what I am doing. But in essence my query 
>>> looks like this:
>>> 
>>> - q=test
>>> - defType=edismax
>>> - qf=title^2&qf=mainText1^1
>>> - totalRanking=div(sum(product(random1,1),product(random2,1.5),product(random3,2),product(random4,2.5),product(random5,3)),5)
>>> - weightedTotalRanking=product($totalRanking,1.5)
>>> - bf=$weightedTotalRanking
>>> - fl=*,score,[explain style=text],$weightedTotalRanking
>>> 
>>> random1 to random5 are document fields of type double, with random values 
>>> between 0.0 and 1.0.
>>> 
>>> With this setup, I can change the overall importance of my custom boosting 
>>> using the factor in weightedTotalRanking (1.5 above). But that is only 
>>> because bf is additive. If I switch to the boost parameter, I can no longer 
>>> influence the order of the documents using this factor, no matter how high 
>>> a value I choose.
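
The behaviour described above can be reproduced outside Solr with a small Python sketch. The total_ranking function mirrors the totalRanking formula from the query. The two documents, their text scores and their field values are invented, and additive and multiplicative stand in for how bf and boost are described in this message.

```python
WEIGHTS = (1, 1.5, 2, 2.5, 3)


def total_ranking(fields):
    # Mirrors div(sum(product(random1,1), ..., product(random5,3)), 5).
    return sum(f * w for f, w in zip(fields, WEIGHTS)) / 5


# Hypothetical documents: B matches the text better, A has higher fields.
docs = {
    "A": {"text": 1.0, "fields": (0.9, 0.9, 0.9, 0.9, 0.9)},
    "B": {"text": 2.0, "fields": (0.1, 0.1, 0.1, 0.1, 0.1)},
}


def additive(text, boost):
    return text + boost


def multiplicative(text, boost):
    return text * boost


def ranking(factor, combine):
    """Order the documents, best first, for a given weighting factor."""
    totals = {
        name: combine(d["text"], factor * total_ranking(d["fields"]))
        for name, d in docs.items()
    }
    return sorted(totals, key=totals.get, reverse=True)


for factor in (0.1, 1.5, 100):
    print(factor, ranking(factor, additive), ranking(factor, multiplicative))
```

In the multiplicative case the factor scales every document's score by the same constant, so the order cannot change. In the additive case it shifts the balance between the text score and the custom ranking.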
>>> 
>>> Am I looking at this the wrong way? Is there a much better approach to 
>>> achieve what I want?
>>> 
>>> Regards
>>> /Jimi
>>
