Hi List

I have a Solr index where I want to include numerical fields in my ranking
function as well as keyword relevance. For example, each document has a
view count, and I'd like to increase the relevancy of documents that are
read often and penalize documents with a very low view count. I'm aware
that this could be achieved with a filter as well, but please ignore that
for this question :) since this will be extended to other numerical fields.

The keyword scoring works just fine and I can include the view count as a
factor in the scoring, but I would like to express that the view count
accounts for, e.g., 25% of the total score. This could be achieved by
mapping the view count into some predetermined fixed range and then
performing suitable arithmetic to scale it to the score of the query. The
score of the term query is scaled by queryNorm, so I'd like to express
that the view count score should be scaled relative to queryNorm as well.
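
As a rough illustration of the arithmetic described above, here is a minimal Python sketch. It is not Solr API: the function names, the log scaling and the max_views cap are all hypothetical choices made for the example.

```python
import math


def scaled_views(views, max_views):
    """Map a raw view count into [0, 1] on a log scale.

    max_views is an assumed, predetermined cap (hypothetical).
    """
    views = min(max(views, 0), max_views)
    return math.log1p(views) / math.log1p(max_views)


def blended_score(text_score, views, max_views, share=0.25):
    """Add a view term worth at most `share` of the final score.

    For the view term to be exactly `share` of the total, it has to equal
    text_score * share / (1 - share), so it must be scaled to the text score.
    """
    max_view_term = text_score * share / (1.0 - share)
    return text_score + max_view_term * scaled_views(views, max_views)


if __name__ == "__main__":
    text_score = 17.403849  # term-relevancy part of some query
    total = blended_score(text_score, views=1000, max_views=1000)
    print((total - text_score) / total)  # fraction of the score due to views
```

Note that in this sketch the view term is proportional to the text score, so the combination is effectively multiplicative rather than additive.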

If I look at the explain output below, 17.4 is the part of the score that
comes from term relevancy. Searching for another (set of) terms yields a
different queryNorm, so I can't see how I can pick, a priori, a scaling
function (I've used log for this example) and a boost factor that will
give me control over the final contribution of the view count to the
score.

19.14161 = (MATCH) sum of:
  17.403849 = (MATCH) max plus 0.1 times others of:
    16.747877 = (MATCH) weight(document:water^4.0 in 1076362), product of:
      0.22298127 = queryWeight(document:water^4.0), product of:
        4.0 = boost
        2.939238 = idf(docFreq=527730, maxDocs=3669552)
        0.018965907 = queryNorm
      75.108894 = (MATCH) fieldWeight(document:water in 1076362), product of:
        25.553865 = tf(termFreq(document:water)=653)
        2.939238 = idf(docFreq=527730, maxDocs=3669552)
        1.0 = fieldNorm(field=document, doc=1076362)
[snip]
  1.7377597 = (MATCH) FunctionQuery(log(map(int(views),0.0,0.0,1.0))), product of:
    1.8325089 = log(map(int(views)=68,min=0.0,max=0.0,target=1.0))
    50.0 = boost
    0.018965907 = queryNorm
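
To put numbers on the problem, the following Python sketch recomputes the function-query part from the explain output above and the share of the total it ends up with. The helper names are hypothetical, and boost_for_share holds queryNorm fixed, which is a simplifying assumption.

```python
# Values copied from the explain output above.
TERM_PART = 17.403849     # "max plus 0.1 times others of"
LOG_VIEWS = 1.8325089     # log(map(int(views)=68, ...))
BOOST = 50.0
QUERY_NORM = 0.018965907


def view_part(log_value, boost, query_norm):
    """FunctionQuery contribution: product of the three explain factors."""
    return log_value * boost * query_norm


def view_share(term_part, view_part_value):
    """Fraction of the summed score that comes from the view count."""
    return view_part_value / (term_part + view_part_value)


def boost_for_share(term_part, log_value, query_norm, share):
    """Boost giving the view count `share` of the total score.

    Assumes term_part and query_norm stay fixed, which only holds
    for this one query and document.
    """
    return term_part * share / (1.0 - share) / (log_value * query_norm)


if __name__ == "__main__":
    vp = view_part(LOG_VIEWS, BOOST, QUERY_NORM)
    print(vp)                         # matches the 1.7377597 line
    print(view_share(TERM_PART, vp))  # roughly 9% of the total
    print(boost_for_share(TERM_PART, LOG_VIEWS, QUERY_NORM, 0.25))
```

With boost=50 the view count is only about 9% of this score, and the boost needed for 25% depends on the term part and queryNorm of this particular query.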

Thanks in advance for your help,
/Martin
