Re: Where is ISOLatin1AccentFilterFactory (Solr4)?

Uwe Reh Wed, 02 Jan 2013 15:44:55 -0800

Hi,

I like the best of both worlds:

 <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-specials.txt" />

 Mask some specials like "C++" to "cplusplus" or "C#" to "csharp" ...

 <tokenizer class="solr.ICUTokenizerFactory" />

 Tokenizes on Unicode whitespace and identifies the character sets (scripts)

 <filter class="solr.WordDelimiterFilterFactory" />

 Well-known splitter for compound words

 <filter class="solr.ICUFoldingFilterFactory" />

 Perfect superset of <charFilter ... ISOLatin1Accent.txt"/> or the
 ISOLatin1AccentFilterFactory, because it can handle both composed and
 decomposed accents and umlauts

 <filter class="solr.CJKBigramFilterFactory" />

 Nice workaround for the missing whitespace as word separator in these languages.
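
Put together, the chain above might look roughly like this in schema.xml.
This is only a sketch: the fieldType name "text_icu_folded" and the
positionIncrementGap value are made up for illustration, the two mapping
lines in the comment are my guess at what mapping-specials.txt could
contain, and the ICU factories assume the analysis-extras contrib jars
(ICU analyzers plus icu4j) are on the classpath.

```xml
<!-- Hypothetical field type; the name is illustrative only. -->
<fieldType name="text_icu_folded" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Assumed content of mapping-specials.txt (not from the original mail):
           "C++" => "cplusplus"
           "C#"  => "csharp"
    -->
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-specials.txt" />
    <!-- Unicode-aware tokenizer from the analysis-extras contrib -->
    <tokenizer class="solr.ICUTokenizerFactory" />
    <!-- Splits compound tokens; factory defaults, as in the mail -->
    <filter class="solr.WordDelimiterFilterFactory" />
    <!-- Folds accents and umlauts, composed or decomposed -->
    <filter class="solr.ICUFoldingFilterFactory" />
    <!-- Forms bigrams from CJK tokens, which have no whitespace separators -->
    <filter class="solr.CJKBigramFilterFactory" />
  </analyzer>
</fieldType>
```

The charFilter has to come first because it rewrites the raw character
stream before the tokenizer sees it; otherwise "C++" would already have
lost its plus signs.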



On 01.01.2013 17:48, Jack Krupansky wrote:

Hmmm... quite some time ago I switched from ASCIIFoldingFilterFactory
to MappingCharFilterFactory, because I was told (by who I can't recall)
that the latter was "better/preferred". Is there any particular reason
to favor one over the other?

-----Original Message----- From: Erick Erickson
ASCIIFoldingFilterFactory is preferred, does that suit your needs?
