Hi,

First of all, a disclaimer: I do not speak Czech at all.

We are using Solr's dynamic fields in our project (XWiki), and we have
recently noticed a problem [1] with the Czech language.

Basically, our mapping says something like this:

<dynamicField name="*_cz" type="text_cz" indexed="true" stored="true"
multiValued="true" />

...but at runtime we ask for the language code "cs" (the ISO 639-1
language code for Czech [2]), and the request fails because the mapping
only matches "*_cz".

Now, we can easily fix this on our end by changing the mapping to
name="*_cs", but what we are really wondering is why Lucene/Solr uses "cz"
(the country code) instead of "cs" (the language code) in both its
"text_cz" field type and its "stopwords_cz.txt" file.
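
To be concrete, the change on our side would look roughly like the sketch
below. It assumes we only rename the dynamic field suffix and keep pointing
at the existing "text_cz" type as it is named today:

```xml
<!-- Sketch only: the field-name suffix now matches the ISO 639-1 code "cs". -->
<!-- The type attribute still refers to the existing "text_cz" field type. -->
<dynamicField name="*_cs" type="text_cz" indexed="true" stored="true"
              multiValued="true" />
```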

Is that a mistake on the Solr/Lucene side? Is it some kind of convention?
Is it going to be fixed?

Thanks,
Eduard

----------
[1] http://jira.xwiki.org/browse/XWIKI-11897
[2] http://en.wikipedia.org/wiki/Czech_language
