Re: trouble with CollationKeyFilter

Robert Muir Sun, 27 Nov 2011 19:27:44 -0800

On Sat, Nov 26, 2011 at 8:43 PM, Michael Sokolov <soko...@ifactory.com> wrote:
> That's great news!  We can't really track trunk, but it looks like this is
> targeted for 3.6, right? As a short-term alternative, I was considering
> using ICUFoldingFilter; this won't preserve some of the finer distinctions,
> but will at least sort the accented characters in with their unaccented kin,
> which is 90% of what we need. Does that make sense?  It should index regular
> characters then, and not ICU collation keys, I think?
>


Yes, it should be pretty easy to make the range queries work for these.

As far as doing things with filters as an alternative: it depends on what
you need. Doing this in the analyzer is pretty inflexible, because
it's just a TokenFilter and the terms are still compared in binary order
at the end of the day, so the order might not make sense for some languages.
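
To make the binary-order point concrete, here is a rough sketch. The
fold() helper is hypothetical: it only approximates what a folding filter
does (decompose, drop combining marks, lowercase) using the JDK, and is
not ICUFoldingFilter itself. The folded terms are then sorted with plain
String ordering, which stands in for the index's binary term order.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class FoldingOrderSketch {
    // Hypothetical stand-in for a folding filter: NFD-decompose,
    // strip combining marks, then lowercase.
    static String fold(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", "").toLowerCase(Locale.ROOT);
    }

    // Fold every term, then sort by String.compareTo (code unit order),
    // which is what "binary order" amounts to for these examples.
    static List<String> foldAndSort(List<String> terms) {
        List<String> folded = new ArrayList<>();
        for (String t : terms) {
            folded.add(fold(t));
        }
        Collections.sort(folded);
        return folded;
    }

    public static void main(String[] args) {
        // Accented and capitalized terms land next to their plain kin.
        System.out.println(foldAndSort(Arrays.asList("Zebra", "\u00e9cole", "apple")));
        // "\u00e4lg" folds to "alg" and sorts before "zebra".
        System.out.println(foldAndSort(Arrays.asList("zebra", "\u00e4lg")));
    }
}
```

The first list comes out as apple, ecole, zebra, which is the 90% case.
The second shows the limitation: the folded term sorts under "a", while
the Swedish alphabet places that letter after "z". No filter can fix
that, since the comparison itself is still binary.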

Because of this it's also difficult or impossible, if you are picky about
sorting, to do things like sort lowercase values first (for when you
care about case), ignore punctuation (so U.S.A. = USA), or sort numerics
correctly (so FOOBAR-10 sorts after FOOBAR-9), etc. That said, the
factory in Solr doesn't yet expose these options either :)
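
A small sketch of what plain binary comparison does with those three
cases. It uses only String.compareTo from the JDK; the class and method
names are made up for illustration. As far as I know the collator-side
knobs for these in ICU4J are RuleBasedCollator.setLowerCaseFirst,
setAlternateHandlingShifted and setNumericCollation, but treat that as a
pointer to check rather than a recipe.

```java
public class BinaryOrderSketch {
    // True if a sorts strictly before b under String.compareTo,
    // i.e. code unit by code unit.
    static boolean sortsBefore(String a, String b) {
        return a.compareTo(b) < 0;
    }

    public static void main(String[] args) {
        // '1' (0x31) is less than '9' (0x39), so 10 sorts before 9.
        System.out.println(sortsBefore("FOOBAR-10", "FOOBAR-9"));
        // Punctuation is significant: these are simply different terms.
        System.out.println("U.S.A.".equals("USA"));
        // Uppercase letters have lower code points than lowercase ones,
        // so uppercase always comes first.
        System.out.println(sortsBefore("B", "a"));
    }
}
```

Folding can paper over the case and punctuation examples by throwing
information away, but it cannot give you "lowercase first" or numeric
ordering, since those need a different comparison rather than different
terms.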

Also, looking at your configuration, the LowerCaseFilter is not
needed: you are using primary strength, which already ignores case
differences.
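
To illustrate why the lowercasing is redundant, here is a minimal sketch
using the JDK's java.text.Collator as a stand-in for the ICU collator
(an assumption for the sake of a self-contained example; the class and
method names are hypothetical). At primary strength, terms that differ
only in case produce equal collation keys.

```java
import java.text.Collator;
import java.util.Locale;

public class PrimaryStrengthSketch {
    // An English collator that only distinguishes base letters.
    static Collator primaryCollator() {
        Collator c = Collator.getInstance(Locale.ENGLISH);
        c.setStrength(Collator.PRIMARY);
        return c;
    }

    // True if both strings map to the same collation key at primary strength.
    static boolean sameKey(String a, String b) {
        Collator c = primaryCollator();
        return c.getCollationKey(a).compareTo(c.getCollationKey(b)) == 0;
    }

    public static void main(String[] args) {
        // Differ only in case: same key.
        System.out.println(sameKey("Hello", "hello"));
        // Differ in a base letter: different keys.
        System.out.println(sameKey("hello", "help"));
    }
}
```

Since the key is already case-insensitive, lowercasing the token before
it reaches the collation filter changes nothing in the indexed term.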

-- 
lucidimagination.com
