Attached is the modified Snowball source code for a plural-only English stemmer. You need to compile it to Java using the instructions here: http://snowball.tartarus.org/runtime/use.html

Essentially, you need to:
1) Download the Snowball distribution (Snowball, algorithms, and libstemmer library) from http://snowball.tartarus.org/dist/snowball_code.tgz and compile the Snowball compiler itself using this command:

    gcc -O -o snowball compiler/*.c

2) Compile the attached file to Java:

    ./snowball stem_ISO_8859_1.sbl -java -o EnglishStemmer -name EnglishStemmer

You can change EnglishStemmer to whatever you like, for example, PluralEnglishStemmer.

After that, you need to modify the generated Java class so that it references the appropriate classes in the net.sf.snowball.* package instead of the ones from the Snowball website. I think the only 2 classes you need to import are Among and SnowballProgram.

Once you have the new stemmer ready, write something similar to EnglishPorterFilterFactory to use it within Solr.

Hope this helps.

Cheers,
Cuong

On Tue, Jul 1, 2008 at 6:07 PM, Guillaume Smet <[EMAIL PROTECTED]> wrote:

> Hi Cuong,
>
> On Tue, Jul 1, 2008 at 4:45 AM, climbingrose <[EMAIL PROTECTED]> wrote:
> > I modified the original English Stemmer written in Snowball language and
> > regenerate the Java implementation using Snowball compiler. It's been
> > working for me so far. I certainly can share the modified Snowball English
> > Stemmer if anyone wants to use it.
>
> Yeah, it would be nice. A step by step explanation of how to
> regenerate the Java files would be nice too (or a pointer to such a
> documentation if you found one).
>
> Thanks,
>
> --
> Guillaume
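
For the step where the generated class is pointed at the net.sf.snowball.* classes, here is a minimal sketch in Java of doing the edit mechanically instead of by hand. It assumes the generated source refers to the Snowball runtime under the org.tartarus.snowball package prefix; that prefix is an assumption, so check the top of your generated file first. SnowballImportRewriter is a hypothetical helper name, not part of Snowball, Lucene, or Solr.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper: rewrites package references in a generated stemmer source file.
class SnowballImportRewriter {
    // Assumption: the package prefix used by the generated source.
    static final String GENERATED_PREFIX = "org.tartarus.snowball.";
    // The package named in the instructions (Among and SnowballProgram live there).
    static final String TARGET_PREFIX = "net.sf.snowball.";

    // Replaces every occurrence of the generated prefix with the target prefix.
    // This also touches a matching "package" declaration, if one is present.
    static String rewrite(String source) {
        return source.replace(GENERATED_PREFIX, TARGET_PREFIX);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java SnowballImportRewriter <input.java> <output.java>");
            return;
        }
        // Read the generated file, rewrite the references, and write the result.
        String source = new String(Files.readAllBytes(Paths.get(args[0])), StandardCharsets.UTF_8);
        Files.write(Paths.get(args[1]), rewrite(source).getBytes(StandardCharsets.UTF_8));
    }
}
```

Run it with the generated file as the first argument and the output path as the second. It only swaps the package prefix, so if the generated class extends or calls something that does not exist under net.sf.snowball, that still needs fixing by hand.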