It seems unlikely that you will find a stemmer that stems everything
exactly the way you want and nothing the way you don't. This is very
domain dependent, as you've discovered. I doubt there is even a single
behavior that everyone doing 'retail product title search' would want;
it is going to vary.
You could use the synonym feature to build your own stemming dictionary:
for example, tell it to map "seating" to "seat".
Of course, creating your own custom dictionary is also very "expensive"
in terms of your time. But you're going to have to live with one of the
compromises; software can't do magic!
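
As a rough sketch of the custom dictionary idea above: one way to keep the
overrides manageable is to hold them in a small table and generate the
synonym rules from it. The snippet below assumes the explicit "=>" mapping
syntax of a Solr synonyms file; the names STEM_OVERRIDES and
to_synonym_rules are made up for illustration, and how the rules interact
with the rest of your analysis chain depends on your schema.

```python
# Hypothetical hand-maintained overrides: surface form -> desired stem.
# Only "seating" -> "seat" comes from this thread; add your own entries.
STEM_OVERRIDES = {
    "seating": "seat",
}


def to_synonym_rules(overrides):
    """Render overrides as explicit one-way mappings, one rule per line."""
    return [f"{word} => {stem}" for word, stem in sorted(overrides.items())]


if __name__ == "__main__":
    # Print the rules so they can be redirected into a synonyms file.
    print("\n".join(to_synonym_rules(STEM_OVERRIDES)))
```

Keeping the overrides in one table means the time cost is at least limited to
the words you actually see failing in searches.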
For particular titles, you could also add "alternate titles" that you
want to match on in your own metadata control, before the record even
gets indexed.
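
A minimal sketch of the alternate-titles idea, assuming documents are built
as plain dictionaries before being sent to the indexer. The field names
("id", "title", "alt_title"), the sample id, and the helper name are all
hypothetical; "alt_title" would need to exist as a searchable multi-valued
field in your own schema.

```python
def with_alternate_titles(doc, alternates):
    """Return a copy of doc with an extra multi-valued alt_title field.

    alternates maps a document id to a list of hand-picked alternate titles.
    Documents with no alternates are returned unchanged (as a copy).
    """
    enriched = dict(doc)
    extra = alternates.get(doc["id"], [])
    if extra:
        enriched["alt_title"] = list(extra)
    return enriched


if __name__ == "__main__":
    # Hypothetical product record and hand-curated alternates.
    doc = {"id": "chair-123", "title": "Stadium Seating Chairs"}
    alternates = {"chair-123": ["stadium seat chairs"]}
    print(with_alternate_titles(doc, alternates))
```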
On 3/29/2011 1:43 PM, Robert Petersen wrote:
For retail product title search, would there be a better stemmer to use? We
wanted a less aggressive stemmer, but I would expect the term "seating" to stem.
I have found several other words which end in "ing" and do not get stemmed.
Amongst our product lines are four million books with all kinds of crazy
titles, like the following oddity! Here "counseling" stems and "unknowing" doesn't:
1. The Cloud of Unknowing and the Book of Privy Counseling
-----Original Message-----
From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley
Sent: Tuesday, March 29, 2011 10:27 AM
To: solr-user@lucene.apache.org
Cc: Robert Petersen
Subject: Re: FW: no results searching for stadium seating chairs
On Tue, Mar 29, 2011 at 1:17 PM, Robert Petersen<rober...@buy.com> wrote:
Very interestingly, LucidKStemFilterFactory is stemming 'ing's differently for
different words. The word 'seating' doesn't lose the 'ing' but the word
'counseling' does! Can anyone explain the difference here? protwords.txt is
empty btw.
KStem is dictionary driven, so "seating" is probably in the
dictionary. I guess the author decided that "seating" and "seat" were
sufficiently different.
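
To illustrate what "dictionary driven" can mean, here is a toy sketch. It is
not KStem's actual algorithm or word list: the lexicon contents and the single
"ing" rule are assumptions chosen only to show why a word that is itself in
the dictionary is left alone while another word is reduced to a known root.

```python
# Toy lexicon (an assumption for illustration, not KStem's real dictionary).
LEXICON = {"seating", "seat", "counsel"}


def toy_dictionary_stem(word, lexicon=LEXICON):
    """Strip a trailing 'ing' only when the dictionary supports doing so."""
    word = word.lower()
    # A word that is already a dictionary entry is kept as-is.
    if word in lexicon:
        return word
    # Otherwise accept the stripped form only if it is a known word.
    if word.endswith("ing") and word[:-3] in lexicon:
        return word[:-3]
    # No dictionary evidence either way: leave the word unchanged.
    return word


if __name__ == "__main__":
    for w in ("seating", "counseling", "unknowing"):
        print(w, "->", toy_dictionary_stem(w))
```

With this toy lexicon, "seating" is kept because it is an entry in its own
right, "counseling" becomes "counsel", and "unknowing" is left alone because
neither it nor "unknow" is listed.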
-Yonik
http://www.lucenerevolution.org -- Lucene/Solr User Conference, May
25-26, San Francisco