Why do these approaches have to be mutually exclusive?
Do a dictionary lookup first; if no satisfactory match is found,
fall back to an algorithmic stemmer. You would probably also save a
few CPU cycles by running the algorithmic stemmer only when necessary.
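
To make the idea concrete, here is a minimal sketch of the fallback. Everything in it is an assumption for illustration: the LEMMAS table, the suffix list, and the function names are hypothetical, and the suffix stripper is a toy, not Porter or any real stemmer.

```python
# Hypothetical lookup table; a real one would be loaded from a lexicon.
LEMMAS = {"mice": "mouse", "ran": "run", "geese": "goose"}

# Toy suffix rules, tried in order. This is NOT a real stemming algorithm,
# just a stand-in for whatever algorithmic stemmer you would plug in.
SUFFIXES = ("ing", "es", "s")


def algorithmic_stem(word):
    """Strip the first matching suffix, keeping at least 3 characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def hybrid_stem(word):
    """Try the dictionary first; run the algorithmic stemmer only on a miss."""
    word = word.lower()
    hit = LEMMAS.get(word)
    if hit is not None:
        return hit
    return algorithmic_stem(word)


if __name__ == "__main__":
    for w in ("mice", "googling", "googles", "tweets"):
        print(w, "->", hybrid_stem(w))
```

Irregular forms such as "mice" come from the table, while words the table has never seen ("googling", "tweets") still get conflated by the fallback.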


On Wed, Apr 21, 2010 at 1:31 PM, Robert Muir <rcm...@gmail.com> wrote:
> sy to look at the "faults" of some algorithmic stemmer, in truth its
> only purpose is to cause related forms of the word to conflate to the same
> form, and hopefully avoiding unrelated terms from conflating to this form.
>
> A dictionary-based stemmer is out-of-date the day you put it into
> production: languages aren't static. For example, you can't expect a
> dictionary-based stemmer to properly deal with forms like "googling" or
> "tweets" that have recently slipped into English vocabulary, but an
> algorithmic stemmer will likely deal with these just fine.
