Hi,

After reading a bit on various sites, and especially the blog post "Comparing 
boost methods in Solr", it seems that the preferred boosting type is the 
multiplicative one, over the additive one. But I can't really get my head 
around *why* that is so, since in most boosting problems I can think of, it 
seems that an additive boost would suit better.

For example, in our project we want to boost documents depending on various 
factors, but in essence they can be summarized as:

- Regular edismax logic, like qf=title^2 mainText^1
- Multiple custom document fields, with weights specified at query time

So, first off, the custom fields... It became obvious to me quite quickly that 
multiplicative logic here would totally ruin the purpose of the weights, since 
something like "(f1 * w1) * (f2 * w2)" is the same as "(f1 * w2) * (f2 * 
w1)". So, I ended up using additive boost here.
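
To make the point concrete, here is a minimal Python sketch (not Solr code). 
The document names and field values are made up for illustration, and the 
helper names are hypothetical. It only shows that per-field weights cancel 
out of the ordering under multiplication, but not under addition.

```python
from math import prod

# Hypothetical documents with two custom field values each (f1, f2).
docs = {"A": (0.9, 0.2), "B": (0.3, 0.5), "C": (0.6, 0.4)}

def multiplicative(fields, weights):
    # (f1 * w1) * (f2 * w2): the weights collapse into one constant w1 * w2.
    return prod(f * w for f, w in zip(fields, weights))

def additive(fields, weights):
    # (f1 * w1) + (f2 * w2): each weight scales only its own field.
    return sum(f * w for f, w in zip(fields, weights))

def ranking(score_fn, weights):
    # Document names sorted by descending score.
    return sorted(docs, key=lambda d: score_fn(docs[d], weights), reverse=True)

print(ranking(multiplicative, (5, 1)), ranking(multiplicative, (1, 5)))
print(ranking(additive, (5, 1)), ranking(additive, (1, 5)))
```

With these made-up values, swapping the weights leaves the multiplicative 
ordering untouched, while the additive ordering flips.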

Then we have the combination of the edismax boost, and my custom boost. As far 
as I understand it, when using the boost field with edismax, this combination 
is always performed using multiplicative logic. But the same problem exists 
here as it did with my custom fields. Because if I boost the aggregated result 
of the custom fields using some weight, it doesn't affect the order of the 
documents because that weight influences the edismax boost just as much. What I 
want is to have the weight only influence my custom boost value, so that I can 
control how much (or little) the final score should be affected by the custom 
boost.

So, in both cases I find myself wanting to use the additive boost. But surely I 
must be missing something, right? Am I thinking backwards or something?

I don't use any out-of-the-box example indexes, so I can't provide you with a 
working URL that shows exactly what I am doing. But in essence my query looks 
like this:

- q=test
- defType=edismax
- qf=title^2&qf=mainText1^1
- totalRanking=div(sum(product(random1,1),product(random2,1.5),product(random3,2),product(random4,2.5),product(random5,3)),5)
- weightedTotalRanking=product($totalRanking,1.5)
- bf=$weightedTotalRanking
- fl=*,score,[explain style=text],$weightedTotalRanking

random1 to random5 are document fields of type double, with random values 
between 0.0 and 1.0.
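
For readers who prefer plain code to nested function queries, this is roughly 
what totalRanking and weightedTotalRanking compute, written as a small Python 
sketch. The function names are just illustrative; the weights and the divisor 
are the ones from the query above.

```python
# Weights applied to random1..random5 in the totalRanking function query.
WEIGHTS = (1.0, 1.5, 2.0, 2.5, 3.0)

def total_ranking(values):
    # div(sum(product(random1,1), ..., product(random5,3)), 5)
    return sum(v * w for v, w in zip(values, WEIGHTS)) / 5

def weighted_total_ranking(values, factor=1.5):
    # product($totalRanking, 1.5)
    return factor * total_ranking(values)

print(total_ranking([1.0] * 5))           # all fields at their maximum
print(weighted_total_ranking([1.0] * 5))
```

Note that the weights sum to 10 while the divisor is 5, so totalRanking ranges 
from 0.0 to 2.0 rather than 0.0 to 1.0.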

With this setup, I can change the overall importance of my custom boosting 
using the factor in weightedTotalRanking (1.5 above). But that is only because 
bf is additive. If I switch to the boost parameter, I can no longer influence 
the order of the documents using this factor, no matter how high a value I 
choose.
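
The same effect can be reproduced outside Solr. Below is a simplified Python 
sketch that assumes bf adds the function value to the text score and boost 
multiplies the text score by it. The text scores and ranking values are 
invented, and real Solr scoring has more moving parts than this.

```python
# Hypothetical (text_score, total_ranking) pairs; not real Solr scores.
docs = {"A": (2.0, 0.1), "B": (1.5, 0.6), "C": (1.0, 0.8)}

def order_additive(factor):
    # bf-style: score = text_score + factor * total_ranking
    return sorted(docs, key=lambda d: docs[d][0] + factor * docs[d][1],
                  reverse=True)

def order_multiplicative(factor):
    # boost-style: score = text_score * (factor * total_ranking)
    return sorted(docs, key=lambda d: docs[d][0] * (factor * docs[d][1]),
                  reverse=True)

for factor in (0.5, 5, 100):
    print(factor, order_additive(factor), order_multiplicative(factor))
```

In the additive case the factor decides whether the text score or the custom 
ranking dominates. In the multiplicative case every document's score is scaled 
by the same constant, so the order stays the same for any positive factor.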

Am I looking at this the wrong way? Is there a much better approach to 
achieve what I want?

Regards
/Jimi
