Jan,
Looks interesting. I will try this.
Thanks!
Darren
On Mon, 2010-06-28 at 19:54 +0200, Jan Høydahl / Cominvent wrote:
Hi,
You might also want to check out the new Lucene-Hunspell stemmer at
http://code.google.com/p/lucene-hunspell/
It uses OpenOffice dictionaries with known stems in combination with a large
set of language specific rules.
It handles your example, but it is an early release, so test it thoroughl
The general consensus among people who run into the problem you have
is to use a plurals-only stemmer, a synonyms file, or a combination of
both (for irregular nouns etc.).
If you search the archives, you can find info on a plurals stemmer.
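To make the plurals-only idea concrete, here is a minimal sketch in Python. It is not the stemmer from the archives: the function name, the suffix rules, and the small IRREGULAR table are all assumptions made up for illustration, and the table stands in for what a synonyms file would normally cover.

```python
# Hypothetical table of irregular plurals (illustration only); in practice
# these would come from a synonyms file rather than being hard-coded.
IRREGULAR = {
    "children": "child",
    "men": "man",
    "women": "woman",
    "mice": "mouse",
    "feet": "foot",
}


def stem_plural(word):
    """Strip plural endings only; leave every other suffix untouched."""
    w = word.lower()
    if w in IRREGULAR:
        return IRREGULAR[w]
    if len(w) <= 3:
        # Too short to strip safely ("bus", "gas", "is").
        return w
    if w.endswith("ies") and len(w) > 4:
        # "parties" -> "party"
        return w[:-3] + "y"
    if w.endswith(("ches", "shes", "sses", "xes")):
        # "boxes" -> "box", "classes" -> "class"
        return w[:-2]
    if w.endswith("s") and not w.endswith(("ss", "us", "is")):
        # "presidents" -> "president"
        return w[:-1]
    return w
```

With these rules "presidents" becomes "president", while "president" and "presidential" are returned unchanged, because nothing other than a plural ending is ever removed. The rule set is deliberately small and will get some words wrong, so treat it as a starting point rather than a finished stemmer.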
On Mon, Jun 28, 2010 at 6:49 AM, wrote:
Thanks for the tip. Yeah, I think stemming confounds the search results as
it stands (Porter stemmer).
I was also thinking of using my dictionary of 500,000 words, with their
complete morphologies and conjugations, to create a synonyms.txt that
provides accurate English morphology.
Is this a good ide
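A rough sketch of how such a synonyms.txt could be generated, assuming the dictionary can be loaded as a mapping from each base word to its inflected forms. The function name and the sample data are hypothetical, and the output assumes the explicit-mapping syntax of Solr synonym files ("form1, form2 => base").

```python
def synonym_lines(morphology):
    """Build one 'forms => lemma' line per dictionary entry.

    `morphology` is assumed to map a lemma to an iterable of its
    inflected forms; this layout is an assumption, not a known format.
    """
    lines = []
    for lemma in sorted(morphology):
        base = lemma.lower()
        # Lowercase, de-duplicate, and drop the lemma itself from its forms.
        forms = sorted({f.lower() for f in morphology[lemma]} - {base})
        if forms:
            lines.append(", ".join(forms) + " => " + base)
    return lines


if __name__ == "__main__":
    # Made-up sample entries standing in for the 500,000-word dictionary.
    sample = {
        "child": ["children", "child"],
        "go": ["goes", "went", "gone", "going"],
        "mouse": ["mice"],
    }
    for line in synonym_lines(sample):
        print(line)
```

Each line maps the inflected forms onto the base word, so only forms listed in the dictionary are ever conflated. Whether a file of this size performs acceptably is something to measure rather than assume.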
Hi Darren,
You might want to look at the KStemmer
(http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters/Kstem) instead of
the standard PorterStemmer. It essentially has a 'dictionary' of exception
words; when a word is found there, stemming stops, so in your case "president" won't be stemmed
any furthe