Re: Minimum word length for stemming

2013-02-13 Thread Chris Hostetter
: Thanks for confirming my suspicions, the custom
: TokenLengthMarkerFilterFactory sounds like the best approach for doing this.
that sounds like something that could be generally useful to lots of people ... by all means please open a jira issue and attach whatever you come up with for possibl

Re: Minimum word length for stemming

2013-01-31 Thread Jamie Johnson
Thanks for confirming my suspicions, the custom TokenLengthMarkerFilterFactory sounds like the best approach for doing this.
On Thu, Jan 31, 2013 at 5:12 PM, Jan Høydahl wrote:
> Hi,
>
> I believe each stemmer implementation decides that themselves. At least
> the MinimalNorwegianStemmer has a

Re: Minimum word length for stemming

2013-01-31 Thread Jan Høydahl
Hi, I believe each stemmer implementation decides that for itself. At least the MinimalNorwegianStemmer has built-in logic that stems certain suffixes only if the token is longer than N chars. If you want external control, you can look at http://wiki.apache.org/solr/LanguageAnalysis#Customizing_Stemmi
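
The idea discussed in this thread (flag tokens below a minimum length so that a later stemming step leaves them alone) can be sketched outside Solr. The following is a minimal plain-Python model, not Lucene/Solr code: the `Token` class, `mark_short_tokens`, `toy_stem`, `analyze`, and the suffix list are all hypothetical names made up for illustration, and the suffix rules are not those of MinimalNorwegianStemmer or any real stemmer.

```python
from dataclasses import dataclass


@dataclass
class Token:
    text: str
    keyword: bool = False  # True means "a later stemmer should skip this token"


def mark_short_tokens(tokens, min_length):
    """Flag every token shorter than min_length as a keyword (hypothetical marker step)."""
    return [Token(t.text, t.keyword or len(t.text) < min_length) for t in tokens]


def toy_stem(tokens, suffixes=("ing", "es", "s")):
    """Strip the first matching suffix from each token that is not flagged as a keyword.

    The suffix list is an illustrative assumption, not a real stemming algorithm.
    """
    stemmed = []
    for token in tokens:
        text = token.text
        if not token.keyword:
            for suffix in suffixes:
                if text.endswith(suffix):
                    text = text[: -len(suffix)]
                    break
        stemmed.append(text)
    return stemmed


def analyze(words, min_length=5):
    """Run the marker step first, then the toy stemmer, mimicking a filter chain."""
    tokens = [Token(word) for word in words]
    return toy_stem(mark_short_tokens(tokens, min_length))
```

Calling `analyze(["bus", "running"], min_length=5)` leaves the short token `bus` untouched while the longer token still has its suffix stripped. Keeping the length check in a separate marker step, rather than inside the stemmer, is what gives the external control asked about in the thread: the threshold can change without touching the stemmer.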