I'm not 100% sure about this, but here is what I imagine happens (using -> to mean "tokenized to").
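
As a rough way to check that reasoning, the analysis chain can be mimicked in a few lines of Python. This is only a sketch under stated assumptions: toy_stem is a hypothetical stand-in that knows just the words used in the example below (it is not the Porter stemmer), and counting shared terms is a crude proxy for "more matching terms, higher score", not Lucene's actual scoring.

```python
def toy_stem(token):
    """Hypothetical stand-in for a stemmer; covers only this example's words."""
    stems = {"running": "run", "runs": "run"}
    return stems.get(token, token)


def analyze(text):
    """Mimic whitespace tokenizing -> KeywordRepeat -> stemmer -> RemoveDuplicates."""
    tokens = []
    for word in text.split():
        # KeywordRepeat: the keyword-marked copy passes through unstemmed.
        tokens.append(word)
        # The second copy is stemmed; if stemming changed nothing it would be
        # an identical token at the same position, so it is dropped.
        stem = toy_stem(word)
        if stem != word:
            tokens.append(stem)
    return tokens


def shared_terms(query, doc):
    """Terms that the analyzed query and the analyzed document have in common."""
    return sorted(set(analyze(query)) & set(analyze(doc)))


doc = "I am running home"
print(shared_terms("running home", doc))  # ['home', 'run', 'running']
print(shared_terms("runs home", doc))     # ['home', 'run']
```

The query that uses the same surface form as the document overlaps on one more term, which is where the extra score would come from if my guess is right.
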
Suppose that you index:

"I am running home" -> "am run running home"

If you then query "running home" -> "run running home", the query matches the document on "run", "running", and "home", and thus gets a higher score than if you query "runs home" -> "run runs home", which matches only on "run" and "home". No extra weight is applied to the keyword-marked term itself; the original form simply contributes one more matching term.

----- Original Message -----
> The Solr wiki says "A repeated question is "how can I have the
> original term contribute
> more to the score than the stemmed version"? In Solr 4.3, the
> KeywordRepeatFilterFactory has been added to assist this
> functionality. "
>
> https://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#Stemming
>
> (Full section reproduced below.)
> I can see how in the example from the wiki reproduced below that both
> the stemmed and original term get indexed, but I don't see how the
> original term gets more weight than the stemmed term. Wouldn't this
> require a filter that gives terms with the keyword attribute more
> weight?
>
> What am I missing?
>
> Tom
>
>
>
> ---------------------------------------------
> "A repeated question is "how can I have the original term contribute
> more to the score than the stemmed version"? In Solr 4.3, the
> KeywordRepeatFilterFactory has been added to assist this
> functionality. This filter emits two tokens for each input token, one
> of them is marked with the Keyword attribute. Stemmers that respect
> keyword attributes will pass through the token so marked without
> change. So the effect of this filter would be to index both the
> original word and the stemmed version. The 4 stemmers listed above all
> respect the keyword attribute.
>
> For terms that are not changed by stemming, this will result in
> duplicate, identical tokens in the document. This can be alleviated by
> adding the RemoveDuplicatesTokenFilterFactory.
>
> <fieldType name="text_keyword" class="solr.TextField"
>            positionIncrementGap="100">
>   <analyzer>
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.KeywordRepeatFilterFactory"/>
>     <filter class="solr.PorterStemFilterFactory"/>
>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   </analyzer>
> </fieldType>"
>

--
Diego Fernandez - 爱国
Software Engineer
GSS - Diagnostics