I thought it was something simple. Here is my configuration:

<fieldType name="searchType" class="solr.TextField" positionIncrementGap="100">
   <analyzer>
        <charFilter class="solr.HTMLStripCharFilterFactory"/>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="false"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.ASCIIFoldingFilterFactory"/>
   </analyzer>
</fieldType>

<field name="searchField" type="searchType" indexed="true" stored="true" multiValued="true"/>

<copyField source="name" dest="searchField" maxChars="500"/>
<copyField source="storechain" dest="searchField" maxChars="500"/>
<copyField source="related_category" dest="searchField" maxChars="500"/>

I search for "supermarket":

<doc>
        <str name="companyid">357</str>
        <str name="name">LIDL Headoffice</str>
        <arr name="related_category">
                <str>Supermarkt</str>
        </arr>
        <str name="storechain">LIDL</str>
        <arr name="searchField">
                <str>LIDL</str>
                <str>LIDL Headoffice</str>
                <str>Supermarket</str>
        </arr>
</doc>

<doc>
        <str name="companyid">719</str>
        <str name="name">LIDL</str>
        <arr name="related_category">
                <str>Supermarket</str>
        </arr>
        <str name="storechain">LIDL</str>
        <arr name="searchField">
                <str>LIDL</str>
                <str>LIDL</str>
                <str>Supermarket</str>
        </arr>
</doc>



debugQuery output:
Both documents have the same score, even though doc 357 has more characters
in searchField.

<lst name="explain">
        <str name="357">
                1.4330883 = (MATCH) fieldWeight(searchField:supermarket in 325), product of:
                  1.0 = tf(termFreq(searchField:supermarket)=1)
                  2.8661766 = idf(docFreq=3194, maxDocs=20651)
                  0.5 = fieldNorm(field=searchField, doc=325)
        </str>
        
        <str name="719">
                1.4330883 = (MATCH) fieldWeight(searchField:supermarket in 678), product of:
                  1.0 = tf(termFreq(searchField:supermarket)=1)
                  2.8661766 = idf(docFreq=3194, maxDocs=20651)
                  0.5 = fieldNorm(field=searchField, doc=678)
        </str>
</lst>
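
To sanity-check the numbers above, here is a rough sketch in Python that reproduces them by hand. It is not taken from my setup and rests on several assumptions: that the classic Lucene DefaultSimilarity is in use (idf = 1 + ln(maxDocs / (docFreq + 1)), lengthNorm = 1 / sqrt(numTerms)), that the norm is stored in a single byte the way Lucene's SmallFloat.floatToByte315 does it, and that searchField holds 4 tokens for doc 357 and 3 tokens for doc 719 after whitespace tokenization (with no tokens added by synonyms). The function names are just my own port for illustration.

```python
import math
import struct

# Values copied from the explain output above.
tf = 1.0
doc_freq, max_docs = 3194, 20651
field_norm = 0.5

# Assumed classic DefaultSimilarity idf: 1 + ln(maxDocs / (docFreq + 1)).
idf = 1.0 + math.log(max_docs / (doc_freq + 1))
score = tf * idf * field_norm


def float_to_byte315(f):
    """Illustrative port of SmallFloat.floatToByte315 (3 mantissa bits, zero exponent 15)."""
    bits = struct.unpack(">i", struct.pack(">f", f))[0]
    small = bits >> 21
    if small <= (63 - 15) << 3:
        return 0 if bits <= 0 else 1   # underflow: smallest positive value, or zero
    if small >= ((63 - 15) << 3) + 0x100:
        return 255                     # overflow: largest representable value
    return small - ((63 - 15) << 3)


def byte315_to_float(b):
    """Illustrative port of SmallFloat.byte315ToFloat."""
    if b == 0:
        return 0.0
    bits = ((b & 0xFF) << 21) + ((63 - 15) << 24)
    return struct.unpack(">f", struct.pack(">i", bits))[0]


def stored_norm(num_terms):
    """Assumed lengthNorm 1/sqrt(numTerms), after the lossy one-byte round trip."""
    return byte315_to_float(float_to_byte315(1.0 / math.sqrt(num_terms)))


norm_357 = stored_norm(4)  # assumed tokens: LIDL, LIDL, Headoffice, Supermarket
norm_719 = stored_norm(3)  # assumed tokens: LIDL, LIDL, Supermarket
print(idf, score, norm_357, norm_719)
```

If those assumptions hold, the idf and the final score match the explain output, and 1/sqrt(3) (about 0.577) and 1/sqrt(4) (exactly 0.5) both come back as 0.5 after the one-byte encoding. That could be why the two documents show the same fieldNorm despite the different field lengths.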
