On Tue, 12 Aug 2008 11:44:42 -0400 "Steven A Rowe" <[EMAIL PROTECTED]> wrote:
> Solr is Unicode aware. The ISOLatin1AccentFilterFactory handles diacritics
> for the ISO Latin-1 section of the Unicode character set. UTF (do you mean
> UTF-8?) is a (set of) Unicode serialization(s), and once Solr has
> deserialized it, it is just Unicode characters (Java's in-memory UTF-16
> representation).
>
> So as long as you're only concerned about removing diacritics from the set of
> Unicode characters that overlaps ISO Latin-1, and not about other Unicode
> characters, then ISOLatin1AccentFilterFactory should work for you.

Hi,

Do you know if anyone has implemented a similar filter using ICU, mapping (a lot more of) UTF-8 to ASCII?

B

_________________________
{Beto|Norberto|Numard} Meijome

"He has the attention span of a lightning bolt."
   Robert Redford

I speak for myself, not my employer. Contents may be hot. Slippery when wet. Reading disclaimers makes you go blind. Writing them is worse. You have been Warned.
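
For illustration only, here is a minimal sketch of diacritic folding beyond ISO Latin-1. It does not use ICU and it is not a Solr filter: it assumes the JDK's own java.text.Normalizer (Java 6 and later) is acceptable, and the class name AccentFolder and method fold are hypothetical names made up for this example. The idea is to decompose each character (NFD) and then drop the combining marks, which leaves the base letters behind.

```java
import java.text.Normalizer;

// Hypothetical helper, not part of Solr or ICU.
final class AccentFolder {

    private AccentFolder() {}

    // Decompose to NFD, then remove every combining mark (Unicode category M).
    static String fold(String input) {
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // \u00e9 is e with acute accent; \u0161 is s with caron (outside ISO Latin-1).
        System.out.println(fold("caf\u00e9"));
        System.out.println(fold("\u0161koda"));
    }
}
```

Characters without a canonical decomposition (for example U+00F8, o with stroke, or U+00DF, sharp s) pass through unchanged, so the result is not guaranteed to be pure ASCII. A real filter would need an extra mapping table for those, or a transliteration library such as ICU.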