Hello Walter.

We believe this kind of thing is better managed by a content team that works from user feedback. Building a new stemmer every time we find a word that brings back irrelevant results would be costly. It would be much better to have a simple interface that lets anyone define the protected words we need, based on user feedback.

Erik just said it wouldn't be hard to bring that functionality to Snowball. Erik, do you know what needs to be done to achieve that? Do you have plans for it? I'm sure I'm not the only one with this problem using Solr with Portuguese (or any other language).
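
To make the request concrete, here is a minimal sketch of the behavior being asked for: a wrapper that checks a protected-words list before calling the stemmer. It is not Solr or Snowball code. The class name ProtectedWordsStemmer, the toy "strip a trailing s" stemmer, and the handling of blank lines and '#' comments are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.function.UnaryOperator;

/** Hypothetical wrapper: applies a stemming function unless the token is protected. */
final class ProtectedWordsStemmer {
    private final Set<String> protectedWords = new HashSet<>();
    private final UnaryOperator<String> stemmer;

    /** protwordsLines: one word per line, as it might be read from a protwords.txt file. */
    ProtectedWordsStemmer(List<String> protwordsLines, UnaryOperator<String> stemmer) {
        this.stemmer = stemmer;
        for (String line : protwordsLines) {
            String word = line.trim();
            // Assumption: blank lines and lines starting with '#' are ignored.
            if (!word.isEmpty() && !word.startsWith("#")) {
                protectedWords.add(word.toLowerCase(Locale.ROOT));
            }
        }
    }

    /** Returns the token unchanged if it is protected, otherwise the stemmed form. */
    String stem(String token) {
        if (protectedWords.contains(token.toLowerCase(Locale.ROOT))) {
            return token;
        }
        return stemmer.apply(token);
    }

    public static void main(String[] args) {
        // Toy stand-in for a real Snowball stemmer: strips one trailing "s".
        UnaryOperator<String> toy =
            w -> w.endsWith("s") ? w.substring(0, w.length() - 1) : w;
        ProtectedWordsStemmer s =
            new ProtectedWordsStemmer(Arrays.asList("# protected words", "atlas"), toy);
        System.out.println(s.stem("casas")); // stemmed by the toy rule
        System.out.println(s.stem("atlas")); // protected, returned as-is
    }
}
```

With something like this, the content team would only edit the word list; the stemmer itself would never need to be regenerated.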

Thank you very much for your help,

Leonardo.

Walter Underwood wrote:
You can define exceptions in the Snowball language and generate
a new stemmer. See the examples here:

http://snowball.tartarus.org/algorithms/english/stemmer.html

wunder

On 2/18/09 9:56 AM, "Erik Hatcher" <e...@ehatchersolutions.com> wrote:

On Feb 18, 2009, at 12:40 PM, Leonardo Dias wrote:
Is there a way to make the snowball algorithm work with a
protwords.txt file?
Currently, and unfortunately, no - the protected words feature is not
available in the SnowballPorterFilterFactory. It wouldn't take much
effort to bring that capability across though.

Erik





