Re: preside != president

2010-06-29 Thread Darren Govoni
Jan, Looks interesting. I will try this. Thanks! Darren On Mon, 2010-06-28 at 19:54 +0200, Jan Høydahl / Cominvent wrote: > Hi, > > You might also want to check out the new Lucene-Hunspell stemmer at > http://code.google.com/p/lucene-hunspell/ > It uses OpenOffice dictionaries with known st

Re: preside != president

2010-06-28 Thread Jan Høydahl / Cominvent
Hi, You might also want to check out the new Lucene-Hunspell stemmer at http://code.google.com/p/lucene-hunspell/ It uses OpenOffice dictionaries with known stems in combination with a large set of language specific rules. It handles your example, but it is an early release, so test it thoroughl

Re: preside != president

2010-06-28 Thread Joe Calderon
The general consensus among people who run into the problem you have is to use a plurals-only stemmer, a synonyms file, or a combination of both (for irregular nouns etc.). If you search the archives you can find info on a plurals stemmer. On Mon, Jun 28, 2010 at 6:49 AM, wrote: > Thanks for the ti
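
To make the plurals-only idea concrete, here is a minimal sketch in Python. The suffix rules and the name `plural_stem` are illustrative assumptions, not the stemmer mentioned in the archives; the point is only that stripping plural endings alone keeps "preside" and "president" apart.

```python
def plural_stem(word):
    """Strip common English plural endings only (illustrative rules)."""
    w = word.lower()
    if len(w) <= 3:
        return w  # too short to strip safely
    if w.endswith("ies") and not w.endswith(("eies", "aies")):
        return w[:-3] + "y"   # parties -> party
    if w.endswith("es") and not w.endswith(("aes", "ees", "oes")):
        return w[:-1]         # presides -> preside
    if w.endswith("s") and not w.endswith(("us", "ss")):
        return w[:-1]         # presidents -> president
    return w                  # no plural ending recognized


for term in ("presidents", "president", "presides", "preside"):
    print(term, "->", plural_stem(term))
```

Irregular forms such as "children" pass through unchanged here, which is where a synonyms file would come in.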

Re: preside != president

2010-06-28 Thread darren
Thanks for the tip. Yeah, I think the stemming confounds search results as it stands (Porter stemmer). I was also thinking of using my dictionary of 500,000 words with their complete morphologies and conjugations to create a synonyms.txt that provides accurate English morphology. Is this a good ide
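
A rough sketch of how such a synonyms.txt could be generated, assuming the dictionary can be loaded as a mapping from each base word to its inflected forms. The `morphology` data and the `synonym_lines` helper are hypothetical; the output uses Solr's explicit-mapping syntax (`form1, form2 => lemma`).

```python
def synonym_lines(morphology):
    """Yield one 'form1, form2 => lemma' line per lemma that has inflections."""
    for lemma in sorted(morphology):
        base = lemma.lower()
        # De-duplicate the forms and drop the lemma itself from the left side.
        forms = sorted({f.lower() for f in morphology[lemma]} - {base})
        if forms:
            yield f"{', '.join(forms)} => {base}"


# Hypothetical sample; the real input would be the 500,000-word dictionary.
morphology = {
    "preside": ["presides", "presided", "presiding"],
    "president": ["presidents"],
}

lines = list(synonym_lines(morphology))
print("\n".join(lines))
```

Because each inflected form maps only to its own lemma, "presiding" reduces to "preside" while "presidents" reduces to "president", so the two stay distinct at query time.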

Re: preside != president

2010-06-28 Thread Brendan Grainger
Hi Darren, You might want to look at the KStemmer (http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters/Kstem) instead of the standard PorterStemmer. It essentially has a 'dictionary' of exception words where stemming stops if found, so in your case president won't be stemmed any furthe
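
The "stop when the word is in the dictionary" behaviour can be illustrated with a toy sketch. This is not KStem's actual algorithm or word list; the suffix list and the `stem_with_dictionary` name are assumptions chosen only to show why a dictionary check keeps "president" from being reduced further.

```python
SUFFIXES = ("s", "ent", "e")  # toy suffix list, not KStem's rules


def stem_with_dictionary(word, dictionary):
    """Strip suffixes one at a time, stopping once the word is a known entry."""
    w = word.lower()
    while w not in dictionary:
        for suffix in SUFFIXES:
            # Only strip if at least three characters would remain.
            if w.endswith(suffix) and len(w) - len(suffix) >= 3:
                w = w[: -len(suffix)]
                break
        else:
            return w  # no suffix applies; give up
    return w


known = {"preside", "president"}
print(stem_with_dictionary("presidents", known))   # stops at a known word
print(stem_with_dictionary("presidents", set()))   # keeps stripping
```

With the dictionary, "presidents" stops at "president"; without it, the toy rules keep stripping until both "presidents" and "presides" collapse to the same stem.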