On Tue, Jul 15, 2008 at 10:29 AM, Stefan Oestreicher <[EMAIL PROTECTED]> wrote:
> As I understand it, the WordDelimiterFilter should split on case changes, word
> delimiters and changes from character to digit, but it should not
> differentiate between ASCII and multibyte chars. It does, however. The word
> "hälse" (German plural of "neck") gets split into "h", "ä" and "lse", which
> unfortunately renders this filter quite unusable for me. Am I missing
> something or is this a bug?
> I'm using Solr 1.3 built from trunk.
Look for charset issues in communicating with Solr. I just tried this with the "text" field via Solr's analysis.jsp and it works fine.

-Yonik
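
To illustrate the kind of charset issue suggested above, here is a minimal sketch. It assumes the text was sent as UTF-8 but decoded as ISO-8859-1 somewhere on the way to Solr. That particular mismatch is an assumption and is not confirmed in this thread. The class name CharsetCheck and the helper misdecode are made up for the example and are not part of Solr. Under that assumption the single "ä" arrives as two characters, an uppercase letter followed by a non-alphanumeric symbol, which a filter that splits on case changes and delimiters could plausibly treat as split points.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration only; not part of Solr.
class CharsetCheck {

    // Simulates the assumed mismatch: bytes written as UTF-8
    // are read back as ISO-8859-1.
    static String misdecode(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String word = "h\u00e4lse"; // "haelse" with a-umlaut, escaped to avoid source-encoding issues
        String garbled = misdecode(word);

        // Print each character with the properties a case/delimiter
        // splitter would care about.
        for (int i = 0; i < garbled.length(); i++) {
            char c = garbled.charAt(i);
            System.out.printf("U+%04X upper=%b letterOrDigit=%b%n",
                    (int) c,
                    Character.isUpperCase(c),
                    Character.isLetterOrDigit(c));
        }
    }
}
```

If the characters reaching the analyzer look like this rather than a single U+00E4, the place to look is the encoding of the request (for example the HTTP client or servlet container settings), not the filter.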