I suspected that stemming was the reason behind this, but I consider
stemming a good feature.

The point is that if an exact match exists, Solr should report it
first, and only then report the stemmed results.

Disabling stemming altogether would be a step in the wrong direction.
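
To make concrete what I mean by "exact first, then stemmed": one possible
approach (a sketch only, not something I have confirmed against my index)
is to keep the stemmed field and add an unstemmed copy of it, then weight
the unstemmed copy higher at query time. The names "text_exact" and
"TEXT_EXACT" below are hypothetical, and I am assuming the stemmed field
is called TEXT as in the results quoted below.

```xml
<!-- schema.xml sketch; "text_exact" and "TEXT_EXACT" are hypothetical names -->
<fieldType name="text_exact" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- no stemming filter, so "impress" and "impression" stay distinct terms -->
  </analyzer>
</fieldType>

<!-- unstemmed copy of the existing stemmed TEXT field -->
<field name="TEXT_EXACT" type="text_exact" indexed="true" stored="false"/>
<copyField source="TEXT" dest="TEXT_EXACT"/>
```

After reindexing, a dismax query with parameters along the lines of
defType=dismax, q=impress and qf=TEXT_EXACT^5 TEXT should tend to score
documents containing the literal word "impress" above those that match
only through the stem. The boost of 5 is an arbitrary value and would
need tuning; this does not strictly guarantee the ordering.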



Shalin Shekhar Mangar wrote:
> 
> On Tue, Mar 9, 2010 at 4:38 PM, abhishes <abhis...@gmail.com> wrote:
> 
>>
>> I am indexing a column in a database. I have chosen the field type "text"
>> for this column (this type is defined in the sample schema file that
>> comes with the Solr example).
>>
>> When I search for the word "impress" and look at the top 3 results, I get
>> these 3 documents:
>>
>> <str name="TEXT">bare desire pronounce villainy draught beasts blockish impression acquit</str>
>> <str name="TEXT">bare impression villainy pronounce beasts desire blockish draught acquit</str>
>> <str name="TEXT">beasts desire villainy pronounce bare acquit impression draught blockish</str>
>>
>> But here the TEXT doesn't really contain the word "impress"; it contains
>> the word "impression".
>>
>> Now the database does contain a few rows with the word "impress", but
>> those rows do not appear in the top 3 results.
>>
>> So my question is: why were the rows containing the word "impression"
>> ranked higher than the rows containing the word "impress" when I searched
>> for "impress"?
>>
>>
> The "text" type is configured to do stemming on the input. So I'm guessing
> that "impression" and "impress" both stem to the same form. You can remove
> the EnglishPorterFilterFactory from the text type if you don't need
> stemming.
> 
> -- 
> Regards,
> Shalin Shekhar Mangar.
> 
> 

