I believe the German Porter stemmer should handle this. I haven't used it with Solr, but I have used it in other projects: when a word is stemmed, umlauts and accented vowels are converted to plain vowels. With Solr, I guess you would use solr.SnowballPorterFilterFactory:

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#head-b80fb581f4e078142c694014f1a8f60c0935e080

with the German option (like in their example).

You probably want to apply this both at index and query time.
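
As a rough, untested sketch, a field type in schema.xml could look like the one below. The name "text_de" and the choice of tokenizer and lowercase filter are my assumptions, not something taken from the wiki example. I have also assumed the "German2" variant of the Snowball stemmer here, since as far as I know it is the one that additionally maps "ae", "oe" and "ue" to the umlaut forms before folding them, which the 'moechten' case needs; the plain "German" option should only cover 'möchten' vs. 'mochten'.

```xml
<!-- Hypothetical field type; "text_de" is just an illustrative name. -->
<fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
  <!-- A single <analyzer> without a type attribute is used for both
       indexing and querying, so both sides get the same stemming. -->
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Assumption: German2 rather than German, to also catch ae/oe/ue spellings. -->
    <filter class="solr.SnowballPorterFilterFactory" language="German2"/>
  </analyzer>
</fieldType>
```

After changing the analyzer you would need to reindex, and the analysis page in the Solr admin interface is a quick way to check what each of the three spellings ends up as.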

--
Steve

On Dec 16, 2008, at 6:02 PM, Julian Davchev wrote:

Hi,
I have been going through
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters and the mailing list
archive,
but somehow can't find a solution. Is it possible to treat
'möchten', 'mochten' and 'moechten' the same way?
Ideally without hardcoding these words, so that it works for any umlaut.
Cheers


