If you require that the facets show both the folded and non-folded versions, then you have no choice except to index both somehow.
But I think you're saying that you expect "néd" and "ned" to be counted in one bucket. Then, indeed, you have to somehow pre-apply the relevant filters. You can do that in the client code, or you could write a QueryComponent that intercepts the query (probably as a first-component) and "does the right thing". The advantage there is that, since it runs on the server, it has full access to the analysis chain and could force the token through selected parts of the chain without any change to the client code.

I say "parts of the chain" because some things just wouldn't make sense. Say you had WordDelimiterFilterFactory in your chain. If your prefix has a change in case, you'd get two tokens, definitely not what you want. That's one of the reasons facet prefixes don't do this by default. Another gotcha would be, say, stemming: facet.prefix=runn isn't stemmed the way "runner" would be, for instance. In fact it isn't stemmed at all.

Note that case sensitivity matters here too. If you specified a prefix of Ned, I don't think you'd get anything counted in that bucket.

If I were going to make a QueryComponent out of it, I'd probably just define a new field type that has only selected filters in it (lowercasing, folding, etc.) and force the prefix through that.

Here's some background on the general problem:
https://lucidworks.com/2011/11/29/whats-with-lowercasing-wildcard-multiterm-queries-in-solr/

Skimming that again, it _does_ seem possible that sending a facet prefix through the analysis chain as though it were a wildcarded term would do what you're asking, but nobody has yet volunteered to write the code. It would probably require a new facet parameter like facet.analyze=true or something.

But frankly I think that's overkill. My bet is that you could do this on the client side "well enough" and much more quickly.
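For what it's worth, here is a rough sketch of what the client-side version might look like in Python. It is only an approximation: it lowercases and strips combining accents via Unicode decomposition, which covers "néd" -> "ned" but is not the full ASCIIFoldingFilterFactory mapping (characters such as "ø" or "ß" are not handled), and any MappingCharFilterFactory rules would have to be mirrored by hand. The function names and the field name name_folded are made up for illustration, and the sketch assumes the faceted field was indexed with equivalent lowercasing and folding.

```python
import unicodedata
from urllib.parse import urlencode


def fold_prefix(prefix: str) -> str:
    """Approximate lowercasing + accent folding on the client (hypothetical helper)."""
    # Decompose accented characters into base character + combining marks.
    decomposed = unicodedata.normalize("NFKD", prefix)
    # Drop the combining marks, keeping only the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()


def facet_params(field: str, prefix: str) -> str:
    """Build a facet query string with the prefix pre-folded (hypothetical helper)."""
    return urlencode({
        "q": "*:*",
        "rows": 0,
        "facet": "true",
        "facet.field": field,
        # facet.prefix is matched against indexed terms as-is, so fold it first.
        "facet.prefix": fold_prefix(prefix),
    })


if __name__ == "__main__":
    # "name_folded" is an assumed field name, not one from this thread.
    print(facet_params("name_folded", "Néd"))
```

With this, "Néd", "néd" and "ned" all end up sending the same facet.prefix value, so they land in one bucket as long as the indexed terms were folded the same way.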
Best,
Erick

On Wed, Jun 7, 2017 at 6:03 PM, arik <arik...@gmail.com> wrote:
> Thanks Erick, indeed your hunch is correct, it's the analyzing filters that
> facet.prefix seems to bypass, and getting rid of my
> ASCIIFoldingFilterFactory and MappingCharFilterFactory make it work ok.
>
> The problem is I need those filters... otherwise how should I create facets
> which match against both Anglicized as well as international prefix
> spellings? I could of course maintain separate fields and do multiple
> queries, but seems like that quickly gets out of hand if I also want to
> support mixed case and other filtering dimensions.
>
> Is there a way to route facet.prefix through the field type filters like all
> the other params? I suppose I could manually instantiate and pre-apply the
> filters in the client code... any other ideas?
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/international-characters-in-facet-prefix-tp4339415p4339534.html
> Sent from the Solr - User mailing list archive at Nabble.com.