2010/12/15 Emmanuel Bégué <medu...@gmail.com>:
> Hello,
>
> According to the wiki http://wiki.apache.org/solr/LanguageAnalysis,
> the light stemmers for French (solr.FrenchLightStemFilterFactory and
> solr.FrenchMinimalStemFilterFactory) are only available for SOLR 3.1.
>
> Is there a way to make them work with 1.4.1?

You could take the source code and backport it to Solr 1.4.1, but see
below:

> - - -
>
> Additionally, there is an "official" list of inflected word forms for
> the French language produced by a government agency (this being
> France...) It's called "Morphalou":
> http://www.cnrtl.fr/lexiques/morphalou/ and it contains over 540 k
> inflected forms.
>
> Or is there a better way to use such a list than a synonyms file?

In this case I would recommend also considering StemmerOverrideFilter
(again only in 3.1+, sorry).
See http://wiki.apache.org/solr/LanguageAnalysis#solr.StemmerOverrideFilterFactory

The StemmerOverrideFilter will "stem" based on a tab-separated
dictionary. When it does this, it also marks the word with
KeywordAttribute, which tells any later stemmer in the chain to ignore
it.

So with this approach you can have a StemmerOverrideFilter with your
dictionary, followed by a stemmer which will only work on words that
aren't in your dictionary. The words that hit the dictionary will be
completely ignored by the stemmer.

This should also be much more RAM-efficient than using SynonymFilter.
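
For illustration, the chain described above might look roughly like this in schema.xml on 3.1+. This is an untested sketch: the field type name "text_fr" and the dictionary file name "stemdict.txt" are placeholders, and the dictionary entries in the comment are made-up examples, not taken from Morphalou.

```xml
<!-- Hypothetical field type; "text_fr" and "stemdict.txt" are placeholder names. -->
<fieldType name="text_fr" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!--
      stemdict.txt: one entry per line, inflected form and replacement
      separated by a tab character, e.g. (illustrative entries only):
        chevaux<TAB>cheval
        yeux<TAB>oeil
      Tokens found here are replaced and marked with KeywordAttribute.
    -->
    <filter class="solr.StemmerOverrideFilterFactory"
            dictionary="stemdict.txt" ignoreCase="false"/>
    <!-- Only tokens NOT found in the dictionary are stemmed here. -->
    <filter class="solr.FrenchLightStemFilterFactory"/>
  </analyzer>
</fieldType>
```

The order matters: the override filter has to come before the stemmer, otherwise the stemmer has already changed the token before the dictionary lookup happens. Since the tokens are lowercased first in this sketch, the dictionary entries would need to be lowercase as well.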