On Sat, Nov 26, 2011 at 8:43 PM, Michael Sokolov <soko...@ifactory.com> wrote:
> That's great news! We can't really track trunk, but it looks like this is
> targeted for 3.6, right? As a short-term alternative, I was considering
> using ICUFoldingFilter; this won't preserve some of the finer distinctions,
> but will at least sort the accented characters in with their unaccented kin,
> which is 90% of what we need. Does that make sense? It should index regular
> characters then, and not ICU collation keys, I think?
>

Yes, it should be pretty easy to make the range queries work for these.

As far as doing things with filters as an alternative: it depends on what you need. Doing stuff with the analyzer is pretty inflexible, because it's just a TokenFilter and the terms are still compared in binary order at the end of the day, so the order might not make sense for some languages. Because of this, if you are picky about sorting, it's also difficult or impossible to do things like sort lowercase values first (for when you care about case), ignore punctuation (so U.S.A. = USA), sort numerics correctly (so FOOBAR-10 sorts after FOOBAR-9), etc. Though the factory in Solr doesn't yet expose these options either :)

Also, looking at your configuration, the lowercase filter is not needed: you are using primary strength, which already ignores case differences.

-- 
lucidimagination.com
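
To illustrate the two points above (binary order versus collation order, and why lowercasing is redundant at primary strength), here is a minimal sketch. It uses the JDK's own java.text.Collator as a stand-in, which is an assumption made for illustration only: the thread is about ICU collation in Lucene/Solr, whose classes and configuration differ. The class and method names are hypothetical.

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

// Hypothetical demo class; uses the JDK collator, not the ICU/Lucene one.
class CollationSketch {

    // Collator at PRIMARY strength: only base-letter differences count.
    public static Collator primaryCollator() {
        Collator collator = Collator.getInstance(Locale.ENGLISH);
        collator.setStrength(Collator.PRIMARY);
        return collator;
    }

    // Sort by UTF-16 code unit order, i.e. plain String.compareTo.
    public static String[] sortedBinary(String... values) {
        String[] copy = values.clone();
        Arrays.sort(copy);
        return copy;
    }

    // Sort with the primary-strength collator instead.
    public static String[] sortedCollated(String... values) {
        String[] copy = values.clone();
        Arrays.sort(copy, primaryCollator());
        return copy;
    }

    public static void main(String[] args) {
        // Case is not a primary difference, so these compare as equal.
        System.out.println(primaryCollator().compare("abc", "ABC") == 0);

        String[] words = {"apple", "Zebra", "banana"};
        System.out.println(Arrays.toString(sortedBinary(words)));
        System.out.println(Arrays.toString(sortedCollated(words)));
    }
}
```

In binary order "Zebra" sorts before "apple" because uppercase letters have lower code points than lowercase ones; the collator places it last. Since "abc" and "ABC" already compare as equal at primary strength, lowercasing the input first changes nothing.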