Re: Advice on Stemming in Solr

Zheng Lin Edwin Yeo Wed, 01 Nov 2017 19:24:06 -0700

Hi Emir,

We do have quite a lot of words that should not be stemmed. Currently,
KStemFilterFactory is also stemming all the non-English words that end with
"ing". Quite a lot of place names and personal names end in "ing", and all
of them are being stemmed as well, which leads to inaccurate search results.


Regards,
Edwin


On 1 November 2017 at 18:20, Emir Arnautović <[email protected]>
wrote:

> Hi Edwin,
> If the number of words that should not be stemmed is not high, you could
> use KeywordMarkerFilterFactory to flag those words as keywords, which
> should prevent the stemmer from changing them.
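>
> A minimal sketch of such a field type, assuming a protected-words file
> named protwords.txt in the collection's conf directory and a hypothetical
> field type name text_en_protected (neither name comes from this thread).
> The marker filter has to come before the stemmer in the chain:
>
> ```xml
> <fieldType name="text_en_protected" class="solr.TextField" positionIncrementGap="100">
>   <analyzer>
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <!-- Flags every token listed in protwords.txt as a keyword.
>          The file holds one word per line, e.g. "ximenting". -->
>     <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
>     <!-- KStem leaves tokens flagged as keywords unchanged. -->
>     <filter class="solr.KStemFilterFactory"/>
>   </analyzer>
> </fieldType>
> ```
>
> Because tokens are lowercased before the marker filter runs, the entries
> in protwords.txt would need to be lowercase too, or the filter could be
> given ignoreCase="true".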
> Depending on what you want to achieve, you might not be able to avoid
> using a stemmer at indexing time. If you want to find documents that contain
> only “walking” with the search term “walk”, then you have to stem at index
> time. Cases where you use stemming at query time only are rare and specific.
> If you want to prefer exact matches over stemmed matches, you have to
> index the same content with and without stemming and boost matches on the
> field without stemming.
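>
> As a rough sketch of that two-field setup: the field names content and
> content_exact, the field type names text_stemmed and text_unstemmed, and
> the boost of 2 are all assumptions for illustration, and both field types
> would have to be defined elsewhere in the schema.
>
> ```xml
> <!-- Schema: one stemmed field plus an unstemmed copy of the same content. -->
> <field name="content" type="text_stemmed" indexed="true" stored="true"/>
> <field name="content_exact" type="text_unstemmed" indexed="true" stored="false"/>
> <copyField source="content" dest="content_exact"/>
>
> <!-- solrconfig.xml: search both fields, weighting the unstemmed one higher. -->
> <requestHandler name="/select" class="solr.SearchHandler">
>   <lst name="defaults">
>     <str name="defType">edismax</str>
>     <str name="qf">content_exact^2 content</str>
>   </lst>
> </requestHandler>
> ```
>
> With this, a query for "walking" still matches documents containing only
> "walk" through the stemmed field, while documents containing "walking"
> itself also match content_exact and should score higher.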
>
> HTH,
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
>
>
> > On 1 Nov 2017, at 10:11, Zheng Lin Edwin Yeo <[email protected]> wrote:
> >
> > Hi,
> >
> > We are currently using KStemFilterFactory in Solr, but we found that it
> > is actually stemming non-English words like "ximenting", which it
> > stems to "ximent". This is not what we want.
> >
> > Another option is to use HunspellStemFilterFactory, but there are
> > some English words, like "running" and "walking", that are not being
> > stemmed.
> >
> > Is it advisable to stem at index time? Or should we skip stemming at
> > index time and instead expand the query to include the stemmed form as
> > well? For example, if the user searches for "walking", we would also
> > search for "walk", with the actual word "walking" given a higher weight.
> >
> > I'm currently using Solr 6.5.1.
> >
> > Regards,
> > Edwin
>
>
