On Tue, 12 Aug 2008 11:44:42 -0400
"Steven A Rowe" <[EMAIL PROTECTED]> wrote:

> Solr is Unicode aware.  The ISOLatin1AccentFilterFactory handles diacritics 
> for the ISO Latin-1 section of the Unicode character set.  UTF (do you mean 
> UTF-8?) is a (set of) Unicode serialization(s), and once Solr has 
> deserialized it, it is just Unicode characters (Java's in-memory UTF-16 
> representation).
> 
> So as long as you're only concerned about removing diacritics from the set of 
> Unicode characters that overlaps ISO Latin-1, and not about other Unicode 
> characters, then ISOLatin1AccentFilterFactory should work for you.

Hi,
do you know if anyone has implemented a similar filter using ICU, mapping (a
lot more of) UTF-8 to ASCII?
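
To make concrete the kind of mapping I mean, here is a minimal sketch that
assumes only the JDK's java.text.Normalizer (Java 6 and later), not ICU and
not an actual Solr filter. It decomposes the text to NFD and drops the
combining marks. The class and method names are hypothetical. Characters with
no canonical decomposition (for example U+00F8, o with stroke) pass through
unchanged, so this covers less than a full Unicode-to-ASCII mapping would.

```java
import java.text.Normalizer;

public class AccentFolding {
    // Hypothetical helper for illustration; not part of Solr or ICU.
    public static String stripDiacritics(String input) {
        // NFD splits a precomposed letter into base letter + combining mark,
        // e.g. U+00E9 becomes "e" followed by U+0301 (combining acute accent).
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        // Remove the combining marks (Unicode general category M).
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source file encoding-independent.
        System.out.println(stripDiacritics("cr\u00e8me br\u00fbl\u00e9e")); // prints "creme brulee"
    }
}
```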

B

_________________________
{Beto|Norberto|Numard} Meijome

"He has the attention span of a lightning bolt."
  Robert Redford

I speak for myself, not my employer. Contents may be hot. Slippery when wet. 
Reading disclaimers makes you go blind. Writing them is worse. You have been 
Warned.
