>
>
> > I kind of suspected stemming to be the reason behind this.
> > But I consider stemming to be a good feature.
>
> This is the side effect of stemming. Stemming increases recall while
> harming precision.
>

This is a side effect of stemming as it is currently implemented in
Lucene. In principle, stemming could increase recall without hurting
precision or relevance. One way to do this would be to always index the
original token alongside the stemmed token and then, at scoring time, give
a boost to matches that are closer to the original form.
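
To make the idea concrete, here is a rough sketch in plain Python. It is
not Lucene code: toy_stem, build_index, search and the two weights are all
made-up names and values for illustration, and the suffix stripper only
stands in for a real stemmer. Every token is indexed both as written and
stemmed; a document that contains the query term in its original form gets
the larger weight, while a document that matches only on the stem still
matches, just with a lower score.

```python
from collections import defaultdict

EXACT_BOOST = 2.0  # assumed weight when the original form matches
STEM_WEIGHT = 1.0  # assumed weight when only the stem matches


def toy_stem(token):
    """Crude suffix stripper standing in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def build_index(docs):
    """Index every token twice: once as written, once stemmed."""
    exact = defaultdict(set)
    stemmed = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            exact[token].add(doc_id)
            stemmed[toy_stem(token)].add(doc_id)
    return exact, stemmed


def search(query, exact, stemmed):
    """Match on stems (recall), but score original-form matches higher."""
    scores = defaultdict(float)
    for token in query.lower().split():
        exact_hits = exact.get(token, set())
        for doc_id in stemmed.get(toy_stem(token), set()):
            scores[doc_id] += EXACT_BOOST if doc_id in exact_hits else STEM_WEIGHT
    # Highest score first; ties broken by document id for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


docs = {
    1: "indexing large collections",
    2: "the index is stored on disk",
    3: "indexed fields",
}
exact, stemmed = build_index(docs)
print(search("indexing", exact, stemmed))
# [(1, 2.0), (2, 1.0), (3, 1.0)]
```

All three documents are still found, so recall is the same as with plain
stemming, but the document containing "indexing" itself ranks first.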

-- Avi
