Re: Strange relevance scoring

Ahmet Arslan Tue, 08 Apr 2014 05:07:38 -0700

Hi David,

Setting omitNorms="true" also brings some performance gains:
https://wiki.apache.org/solr/SolrPerformanceFactors#indexed_fields


To disable the length norm globally, though, one can create a custom 
similarity and register it as the default similarity.
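
As a rough illustration of why the length norm can produce the ranking John 
describes, here is a minimal standalone sketch in plain Java. It does not use 
the Lucene API: the class and method names (LengthNormSketch, tf, lengthNorm, 
score) and the two document lengths are made up for illustration. It assumes 
the classic TF-IDF factors tf = sqrt(freq) and lengthNorm = 1/sqrt(number of 
terms), and leaves out idf and queryNorm, which are identical for both 
documents in a single-term query. Lucene also stores the norm in a single 
byte, so real values are coarser than these.

```java
public class LengthNormSketch {

    // Term-frequency factor of the classic TF-IDF similarity: sqrt(freq).
    public static double tf(int freq) {
        return Math.sqrt(freq);
    }

    // Length norm: 1/sqrt(number of terms in the field).
    // When norms are omitted, the factor is treated as a constant 1.0.
    public static double lengthNorm(int numTerms, boolean omitNorms) {
        return omitNorms ? 1.0 : 1.0 / Math.sqrt(numTerms);
    }

    // Simplified per-document score for a single-term query.
    // idf and queryNorm are left out because they do not change the ranking.
    public static double score(int freq, int numTerms, boolean omitNorms) {
        return tf(freq) * lengthNorm(numTerms, omitNorms);
    }

    public static void main(String[] args) {
        // Hypothetical documents: a 4-term field with one hit,
        // and a 400-term field with five hits.
        for (boolean omitNorms : new boolean[] {false, true}) {
            System.out.printf("omitNorms=%b short=%.4f long=%.4f%n",
                    omitNorms,
                    score(1, 4, omitNorms),
                    score(5, 400, omitNorms));
        }
    }
}
```

With norms enabled, the short document wins despite having a single hit. With 
the norm fixed at 1.0, which is roughly what omitNorms="true" or a custom 
similarity returning a constant length norm amounts to, the document with five 
hits scores higher.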



On Tuesday, April 8, 2014 2:59 PM, David Santamauro 
<david.santama...@gmail.com> wrote:

Is there any general setting that removes this "punishment", or must 
omitNorms=true be part of every field definition?



On 4/8/2014 7:04 AM, Ahmet Arslan wrote:
> Hi,
>
> The length norm is computed for every document at index time. I think it is 
> 1/sqrt(number of terms). Please see section 6. norm(t,d) at
>
> https://lucene.apache.org/core/4_7_0/core/org/apache/lucene/search/similarities/TFIDFSimilarity.html
>
>
> If you don't care about length normalisation, you can set omitNorms=true in 
> field declarations. http://wiki.apache.org/solr/SchemaXml#Common_field_options
>
>
>
> On Tuesday, April 8, 2014 1:57 PM, John Nielsen <j...@mcb.dk> wrote:
> Hi,
>
> I couldn't find any occurrence of SpanFirstQuery in either the schema.xml
> or solrconfig.xml files.
>
> This is the query I used with debug=results.
> http://pastebin.com/bWzUkjKz
>
> And here is the answer.
> http://pastebin.com/nCXFcuky
>
> I am not sure what I am supposed to be looking for.
>
>
>
> On Tue, Apr 8, 2014 at 11:34 AM, Markus Jelsma
> <markus.jel...@openindex.io> wrote:
>
>> Hi - the thing you describe is possible when your setup uses
>> SpanFirstQuery. But to be sure what's going on you should post the debug
>> output.
>>
>> -----Original message-----
>>> From:John Nielsen <j...@mcb.dk>
>>> Sent: Tuesday 8th April 2014 11:03
>>> To: solr-user@lucene.apache.org
>>> Subject: Strange relevance scoring
>>>
>>> Hi,
>>>
>>> We are seeing a strange phenomenon with our Solr setup which I have been
>>> unable to explain.
>>>
>>> My Google-fu is clearly not up to the task, so I am trying here.
>>>
>>> It appears that if I do a freetext search for a single word, say
>>> "modellering", on a text field, the scoring is massively boosted if the
>>> first word of the text field is a hit.
>>>
>>> For instance, if there is only one occurrence of the word "modellering" in
>>> the text field and that occurrence is the first word of the text, then that
>>> document gets a higher relevancy than if the word "modellering" occurs 5
>>> times in the text and the first word of the text is any other word.
>>>
>>> Is this normal behavior? Is special attention paid to the first word in a
>>> text field? I would think that the latter case would get the highest score.
>>>
>>>
>>> --
>>> Med venlig hilsen / Best regards
>>>
>>> *John Nielsen*
>>> Programmer
>>>
>>>
>>>
>>> *MCB A/S*
>>> Enghaven 15
>>> DK-7500 Holstebro
>>>
>>> Kundeservice: +45 9610 2824
>>> p...@mcb.dk
>>> www.mcb.dk
