Hi,

Just some background on my setup: I'm crawling with Nutch-1.2 and have tried
both Solr-1.4 and Solr-3.5, with the same result. Solr is still using the
default settings.

I found this problem by accident. I queried "mobile broadband": page A,
which has 2 occurrences, scores higher than page B, which has 19 occurrences.
I found that odd, which is why I started investigating.

The debug results are given below. queryWeight, idf and queryNorm are the
same for both pages, and tf is higher in B, as expected. What makes the
difference is clearly fieldNorm, which is about 40 times smaller for B.

A:
0.010779975 = (MATCH) weight(content:"mobil broadband" in 18730), product of:
  1.0 = queryWeight(content:"mobil broadband"), product of:
    6.2444286 = idf(content: mobil=4922 broadband=2290)
    0.16014275 = queryNorm
  0.010779975 = fieldWeight(content:"mobil broadband" in 18730), product of:
    1.4142135 = tf(phraseFreq=2.0)
    6.2444286 = idf(content: mobil=4922 broadband=2290)
    0.0012207031 = fieldNorm(field=content, doc=18730)

B:
8.5223187E-4 = (MATCH) weight(content:"mobil broadband" in 14391), product of:
  1.0 = queryWeight(content:"mobil broadband"), product of:
    6.2444286 = idf(content: mobil=4922 broadband=2290)
    0.16014275 = queryNorm
  8.5223187E-4 = fieldWeight(content:"mobil broadband" in 14391), product of:
    4.472136 = tf(phraseFreq=20.0)
    6.2444286 = idf(content: mobil=4922 broadband=2290)
    3.0517578E-5 = fieldNorm(field=content, doc=14391)
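
As a sanity check on the numbers above, here is a minimal Python sketch that
recomputes both fieldWeight values. It assumes fieldWeight is simply
tf * idf * fieldNorm, as the explain output states, and that
tf = sqrt(phraseFreq), which is consistent with the tf values shown. The
function name field_weight is made up for illustration; it is not a Solr or
Lucene API.

```python
import math

def field_weight(phrase_freq, idf, field_norm):
    """Recompute fieldWeight as tf * idf * fieldNorm (hypothetical helper).

    Assumes tf = sqrt(phraseFreq), which matches the tf values in the
    explain output (sqrt(2) = 1.4142135, sqrt(20) = 4.472136).
    """
    tf = math.sqrt(phrase_freq)
    return tf * idf * field_norm

# Values copied from the explain output for the two documents.
IDF = 6.2444286
NORM_A = 0.0012207031
NORM_B = 3.0517578e-5

weight_a = field_weight(2.0, IDF, NORM_A)
weight_b = field_weight(20.0, IDF, NORM_B)

# How much B gains from tf versus how much it loses from fieldNorm.
tf_ratio = math.sqrt(20.0) / math.sqrt(2.0)
norm_ratio = NORM_A / NORM_B

print("fieldWeight A:", weight_a)
print("fieldWeight B:", weight_b)
print("tf advantage of B:", tf_ratio)
print("fieldNorm advantage of A:", norm_ratio)
```

B's tf advantage is only a factor of roughly 3, while A's fieldNorm is
roughly 40 times larger, so fieldNorm alone decides the ordering here.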

Remi

On Wed, Jan 18, 2012 at 8:52 PM, Jan Høydahl <jan....@cominvent.com> wrote:

> > I've come across a problem where newly indexed pages almost always come
> > first even when the term frequency is relatively low.
>
> There is no inherent index-time boost, so this must be something else.
> Can you give us an example of a query? Which query parser do you use?
>
> > I read the posts below on "fieldNorm" and "omitNorms" but setting
> > "omitNorms=true" doesn't change anything for me on the calculation of
> > fieldNorm.
>
> Are you sure you have spelled omitNorms="true" correctly, then restarted
> Solr (to refresh config)? The effect of Norms on your score will be that
> shorter fields score higher than long fields.
>
> Perhaps you can instead tell us your use case. What kind of ranking
> are you trying to achieve? Then we can help suggest how to get there.
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
> Solr Training - www.solrtraining.com
