I thought it was something simple. Here is my configuration:
<fieldType name="searchType" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="false"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ASCIIFoldingFilterFactory"/>
  </analyzer>
</fieldType>
<field name="searchField" type="searchType" indexed="true" stored="true"
       multiValued="true"/>
<copyField source="name" dest="searchField" maxChars="500"/>
<copyField source="storechain" dest="searchField" maxChars="500"/>
<copyField source="related_category" dest="searchField" maxChars="500"/>
I search for "supermarket":
<doc>
<str name="companyid">357</str>
<str name="name">LIDL Headoffice</str>
<arr name="related_category">
<str>Supermarket</str>
</arr>
<str name="storechain">LIDL</str>
<arr name="searchField">
<str>LIDL</str>
<str>LIDL Headoffice</str>
<str>Supermarket</str>
</arr>
</doc>
<doc>
<str name="companyid">719</str>
<str name="name">LIDL</str>
<arr name="related_category">
<str>Supermarket</str>
</arr>
<str name="storechain">LIDL</str>
<arr name="searchField">
<str>LIDL</str>
<str>LIDL</str>
<str>Supermarket</str>
</arr>
</doc>
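As a rough check of the field lengths involved, the sketch below splits the stored searchField values of both documents on whitespace and lowercases them. This only approximates the analyzer chain in the configuration: the HTML-strip, synonym and ASCII-folding steps are skipped, on the assumption that they do not change these particular values. The function name `approx_tokens` is made up for illustration.

```python
def approx_tokens(values):
    """Approximate WhitespaceTokenizer + LowerCaseFilter over a multiValued field."""
    tokens = []
    for value in values:
        # str.split() with no argument splits on runs of whitespace
        tokens.extend(part.lower() for part in value.split())
    return tokens


# Stored searchField values as shown in the two result documents
doc_357 = ["LIDL", "LIDL Headoffice", "Supermarket"]
doc_719 = ["LIDL", "LIDL", "Supermarket"]

tokens_357 = approx_tokens(doc_357)
tokens_719 = approx_tokens(doc_719)

print(len(tokens_357), tokens_357)
print(len(tokens_719), tokens_719)
```

Under these assumptions doc 357 has four tokens and doc 719 has three, each with a single occurrence of "supermarket".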
debugQuery:
Both documents have the same score, even though doc 357 has more characters
in the searchField.
<lst name="explain">
<str name="357">
1.4330883 = (MATCH) fieldWeight(searchField:supermarket in 325), product of:
  1.0 = tf(termFreq(searchField:supermarket)=1)
  2.8661766 = idf(docFreq=3194, maxDocs=20651)
  0.5 = fieldNorm(field=searchField, doc=325)
</str>
<str name="719">
1.4330883 = (MATCH) fieldWeight(searchField:supermarket in 678), product of:
  1.0 = tf(termFreq(searchField:supermarket)=1)
  2.8661766 = idf(docFreq=3194, maxDocs=20651)
  0.5 = fieldNorm(field=searchField, doc=678)
</str>
</lst>
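The numbers in the explain output can be reproduced with a short sketch. It assumes the classic Lucene default similarity (idf = 1 + ln(maxDocs / (docFreq + 1)), tf = sqrt(termFreq), length norm = 1 / sqrt(number of tokens)) and that norms are stored in a single byte with coarse precision. The encoding functions below are a re-implementation intended to mirror Lucene's `SmallFloat.floatToByte315` and should be treated as an approximation, and the token counts 4 and 3 are assumed from splitting the searchField values of the two documents on whitespace.

```python
import math
import struct


def idf(doc_freq, max_docs):
    # Assumed classic default: 1 + ln(maxDocs / (docFreq + 1))
    return 1.0 + math.log(max_docs / (doc_freq + 1))


def float_to_byte315(f):
    # Assumed one-byte norm encoding: keep the low exponent bits and the
    # top mantissa bits of the 32-bit float, drop everything else.
    bits = struct.unpack(">i", struct.pack(">f", f))[0]
    small = bits >> 21
    fzero = (63 - 15) << 3
    if small <= fzero:
        return 0 if bits <= 0 else 1   # underflow
    if small >= fzero + 0x100:
        return 255                     # overflow
    return small - fzero


def byte315_to_float(b):
    if b == 0:
        return 0.0
    bits = ((b & 0xFF) << 21) + ((63 - 15) << 24)
    return struct.unpack(">f", struct.pack(">i", bits))[0]


def stored_norm(num_tokens):
    # Length norm after a round trip through the one-byte encoding
    return byte315_to_float(float_to_byte315(1.0 / math.sqrt(num_tokens)))


term_idf = idf(doc_freq=3194, max_docs=20651)
tf = math.sqrt(1)                      # termFreq = 1 in both documents

norm_357 = stored_norm(4)              # assumed 4 tokens in doc 357
norm_719 = stored_norm(3)              # assumed 3 tokens in doc 719

score_357 = tf * term_idf * norm_357
score_719 = tf * term_idf * norm_719

print(term_idf, norm_357, norm_719, score_357, score_719)
```

Under these assumptions both 1/sqrt(4) = 0.5 and 1/sqrt(3) ≈ 0.577 come back from the byte encoding as 0.5, which is consistent with the identical fieldNorm and score reported for the two documents.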