Hi Alexander,
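This is because length normalization is enabled for that field: the more terms a value is indexed with, the lower it scores for the same match. Your n-gram filter turns longer values into many more terms, so "RCWP Moisture Resistant" is penalized the most.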
http://ir.dcs.gla.ac.uk/wiki/Length_Normalisation
If you want it disabled, set omitNorms="true" on the field type and reindex:
<fieldType name="series" class="solr.TextField" positionIncrementGap="100"
omitNorms="true">
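To make the effect concrete, here is a rough Python sketch (not Solr code) that counts the n-grams your index analyzer would produce for each value and applies a 1/sqrt(numTerms) length norm. It assumes Lucene's default similarity and that every n-gram counts as one term; the helper names are made up for illustration, and the stop filter is ignored.

```python
import math
import re

MIN_GRAM, MAX_GRAM = 2, 50  # minGramSize / maxGramSize from the field type


def strip_separators(value):
    """Mimic the PatternReplaceCharFilter: drop dashes and whitespace."""
    return re.sub(r"[\-\s]+", "", value)


def ngram_count(text, min_gram=MIN_GRAM, max_gram=MAX_GRAM):
    """Number of n-grams of length min_gram..max_gram in a single token."""
    longest = min(max_gram, len(text))
    return sum(len(text) - size + 1 for size in range(min_gram, longest + 1))


def length_norm(num_terms):
    """Assumed default length norm: 1 / sqrt(number of indexed terms)."""
    return 1.0 / math.sqrt(num_terms)


for series in ["RCWP", "D/CRCW-P e3", "RCWP Moisture Resistant"]:
    terms = ngram_count(strip_separators(series))
    print(f"{series!r}: {terms} terms, norm ~ {length_norm(terms):.3f}")
```

Under these assumptions the three values are indexed with 6, 36 and 210 terms, so the norm shrinks quickly as the value gets longer. Lucene stores the norm in a single byte, so the real numbers are rounded, but the ordering matches the scores in your response.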
Jeroen
On 4-7-2013 11:10, Lochschmied, Alexander wrote:
Hi Solr people!
Querying for "series:RCWP" returns the response below. Why does "RCWP Moisture
Resistant" score lower than "D/CRCW-P e3" with the field definition below? OK, we are ignoring
dashes and spaces, but I would have expected matches towards the beginning of the value to score better. Can I change
this behavior (in Solr 4)?
----------------------------------------------------------------------------------------------------------------------------------
<result>
<doc>
<str name="series">RCWP</str>
<float name="score">3.2698402</float>
</doc>
<doc>
<str name="series">D/CRCW-P e3</str>
<float name="score">1.3624334</float>
</doc>
<doc>
<str name="series">RCWP Moisture Resistant</str>
<float name="score">0.5449734</float>
</doc>
</result>
----------------------------------------------------------------------------------------------------------------------------------
<fieldType name="series" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[\-\s]+" replacement=""/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="50"/>
  </analyzer>
  <analyzer type="query">
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[\-\s]+" replacement=""/>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
Thanks,
Alexander