Chris Hostetter wrote:
> : is there an analyzer which automatically converts all german special
> : characters to their specific dissected form, such as ü to ue and ä to
> : ae, etc.?!
>
> See also the ISOLatin1TokenFilter which does this regardless of language.

Actually, ISOLatin1TokenFilter does NOT convert /ü/ to /ue/, /ä/ to /ae/, etc. Instead, it strips the diacritic: /ü/ becomes /u/, /ä/ becomes /a/, etc. It *does* convert /ß/ to /ss/, though I've seen some people write that the correct substitution for /ß/ in German is /sz/. I don't speak or read German, so I don't know.

Maybe there should be an option on ISOLatin1TokenFilter to use German substitutions, in addition to the current behavior of simply stripping diacritics?

Does anyone know if there are other (Latin-1-utilizing) languages besides German with standardized diacritic substitutions that involve something other than just stripping the diacritics?

Steve
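
P.S. To make the difference concrete, here is a minimal sketch of the two behaviors on plain strings. It is not the filter's code and uses no Lucene classes. The class and method names (GermanFolding, foldGerman, stripDiacritics) are made up for illustration, the German table is my assumption of what such an option might do, and stripDiacritics only approximates diacritic stripping via Unicode decomposition.

```java
import java.text.Normalizer;

// Hypothetical helper, not part of Lucene.
class GermanFolding {

    // German-style substitution: umlauts become vowel + "e", sharp s becomes "ss".
    // Assumed mapping; characters not listed here pass through unchanged.
    static String foldGerman(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 4);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '\u00e4': sb.append("ae"); break; // a-umlaut
                case '\u00f6': sb.append("oe"); break; // o-umlaut
                case '\u00fc': sb.append("ue"); break; // u-umlaut
                case '\u00c4': sb.append("Ae"); break; // A-umlaut
                case '\u00d6': sb.append("Oe"); break; // O-umlaut
                case '\u00dc': sb.append("Ue"); break; // U-umlaut
                case '\u00df': sb.append("ss"); break; // sharp s
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    // Plain stripping: decompose to base letter + combining mark (NFD),
    // then drop the marks. Sharp s has no decomposition, so it is
    // replaced explicitly.
    static String stripDiacritics(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", "").replace("\u00df", "ss");
    }

    public static void main(String[] args) {
        String input = "M\u00fcller Stra\u00dfe"; // "Mueller Strasse" with u-umlaut and sharp s
        System.out.println(foldGerman(input));      // prints "Mueller Strasse"
        System.out.println(stripDiacritics(input)); // prints "Muller Strasse"
    }
}
```

An option like this would presumably need to be chosen per index, since a query folded one way will not match terms that were indexed the other way.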