Some collisions are listed here: http://www.attivio.com/blog/34-attivio-blog/333-doing-things-with-words-part-three-stemming-and-lemmatization.html
Have you asked Martin Porter? You can find his e-mail here: http://tartarus.org/~martin/

wunder

On Jul 30, 2010, at 1:41 PM, Otis Gospodnetic wrote:

> Hello,
>
> I'm looking for a list of English words that, when stemmed by Porter stemmer,
> end up in the same stem as some similar, but unrelated words. Below are some
> examples:
>
> # this gets stemmed to "iron", so if you search for "ironic", you'll get "iron"
> matches
> ironic
>
> # same stem as animal
> anime
> animated
> animation
> animations
>
> I imagine such a list could be added to the example protwords.txt
>
> Thanks,
> Otis
> ----
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
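
If no ready-made list turns up, one way to produce candidates is to run a word list through the stemmer and group the words by stem. Below is a minimal sketch of that idea in Python. The function name `find_stem_collisions` and the `toy_stem` lookup table are made up for illustration; the table only mirrors the examples from the original message (with "anim" assumed as the shared stem label) and is not a real Porter implementation.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List


def find_stem_collisions(
    words: Iterable[str], stem: Callable[[str], str]
) -> Dict[str, List[str]]:
    """Group words by stem; keep only stems shared by two or more distinct words."""
    groups = defaultdict(set)
    for word in words:
        lowered = word.lower()
        groups[stem(lowered)].add(lowered)
    return {s: sorted(ws) for s, ws in groups.items() if len(ws) > 1}


# Hypothetical stand-in stemmer: a hand-written table based on the examples
# in this thread. Replace it with a real Porter stemmer for actual use.
TOY_STEMS = {
    "iron": "iron",
    "ironic": "iron",
    "animal": "anim",
    "anime": "anim",
    "animated": "anim",
    "animation": "anim",
    "animations": "anim",
}


def toy_stem(word: str) -> str:
    # Words missing from the table are returned unchanged.
    return TOY_STEMS.get(word, word)


sample_words = [
    "iron", "ironic", "animal", "anime",
    "animated", "animation", "animations", "search",
]
collisions = find_stem_collisions(sample_words, toy_stem)

for stem_value, members in collisions.items():
    print(stem_value, "<-", ", ".join(members))
```

Any callable that maps a word to its stem can be passed in place of `toy_stem`, for example the `stem` method of NLTK's `PorterStemmer` if that library is installed. The grouping only shows which words share a stem; deciding which of them are unrelated (and so belong in protwords.txt, one word per line) still needs a human pass.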