On 11/25/2014 6:27 AM, Alexandre Rafalovitch wrote:
> The usual solution is to have faceting using the other field (with
> copyField). Usually it is because people want the original unmodified
> version of the string without tokenization (So, "United States of
> America" instead of "united" "states" "america"). It sounds like your
> case is a little different and you do want tokenized values, just not
> lowercased.
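
For concreteness, the copyField approach described above might look
something like the following in schema.xml. This is only a sketch: the
field names (country, country_facet) are made up for illustration, and
the text_general and string types are assumed to be defined as in the
stock example schema.

```xml
<!-- Hypothetical field names; types assumed from the example schema. -->

<!-- Analyzed field used for searching (tokenized, lowercased, etc.). -->
<field name="country" type="text_general" indexed="true" stored="true"/>

<!-- Unanalyzed copy used for faceting: keeps "United States of America"
     as one unmodified value. -->
<field name="country_facet" type="string" indexed="true" stored="false"/>

<!-- Copy the raw input value into the facet field at index time. -->
<copyField source="country" dest="country_facet"/>
```

Queries would then go against country, with facet.field=country_facet
for the facet counts.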

Something I've been wondering about related to facets. This might be a
tangent from the original issue, but it's somewhat related, so I'm
asking it here.

It's my understanding that DocValues have the same info as stored
fields -- that is, the original value, completely unmodified by the
analysis chain. It's also my understanding that DocValues get used for
sorting and facets if they are present.

If both of these assumptions/understandings are correct, then I would
think that simply turning on DocValues for a field with the lowercase
filter (and reindexing) would allow case-insensitive queries *plus*
facets with the original unmodified and untokenized values.

Have I got completely the wrong idea? I haven't tested any of this.

Thanks,
Shawn