Re: Case Insensitive Matching in Solr/Lucene

Shawn Heisey Tue, 25 Nov 2014 13:27:18 -0800

On 11/25/2014 6:27 AM, Alexandre Rafalovitch wrote:
> The usual solution is to have faceting using the other field (with
> copyField). Usually it is because people want the original unmodified
> version of the string without tokenization (so, "United States of
> America" instead of "united" "states" "america"). It sounds like your
> case is a little different and you do want tokenized values, just not
> lowercased.


Here's something I've been wondering about with facets.  It's a bit of a
tangent from the original issue, but related enough that I'm asking it
here.

It's my understanding that DocValues have the same info as stored fields
-- that is, the original value, completely unmodified by the analysis chain.

It's also my understanding that DocValues get used for sorting and
facets if they are present.

If both of these assumptions/understandings are correct, then I would
think that simply turning on DocValues for a field with the lowercase
filter (and reindexing) would allow case-insensitive queries *plus*
facets with the original unmodified and untokenized values.
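
To make the behavior I'm hoping for concrete, here is a toy Python model.
It is only a sketch and not Solr or Lucene code: ToyIndex, analyze, and
the sample values are made up for illustration, and whether DocValues
actually behaves like the "raw" map below is exactly the open question.

```python
from collections import Counter


def analyze(text):
    """Toy analysis chain: whitespace tokenizer plus a lowercase filter."""
    return [tok.lower() for tok in text.split()]


class ToyIndex:
    """Hypothetical model: analyzed tokens for search, raw values for facets."""

    def __init__(self):
        self.postings = {}  # lowercased token -> set of doc ids
        self.raw = {}       # doc id -> original, unmodified string

    def add(self, doc_id, value):
        self.raw[doc_id] = value
        for tok in analyze(value):
            self.postings.setdefault(tok, set()).add(doc_id)

    def search(self, query):
        """Return ids of docs containing every query token, ignoring case."""
        ids = None
        for tok in analyze(query):
            hits = self.postings.get(tok, set())
            ids = hits if ids is None else ids & hits
        return ids or set()

    def facet(self, doc_ids):
        """Count the original, untokenized values of the matching docs."""
        return Counter(self.raw[d] for d in doc_ids)


index = ToyIndex()
index.add(1, "United States of America")
index.add(2, "United Kingdom")

matches = index.search("UNITED")   # matches both docs regardless of case
facets = index.facet(matches)      # keyed by the original strings
```

In this model the query side only ever sees lowercased tokens, while the
facet side only ever sees the strings as they were submitted.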

Have I got completely the wrong idea?  I haven't tested any of this.

Thanks,
Shawn
