Hi Stefan,

I wrote a test case for the problem you described, but I could not reproduce it; the filter works fine for me. I used the following definition:
<filter class="solr.WordDelimiterFilterFactory"
        generateWordParts="1" generateNumberParts="0"
        catenateWords="0" catenateNumbers="0" catenateAll="0"
        splitOnCaseChange="0" preserveOriginal="0"/>

What configuration are you using? If it is different, please share it so that I can test with it.

On Tue, Jul 15, 2008 at 7:59 PM, Stefan Oestreicher <[EMAIL PROTECTED]> wrote:

> Hi,
>
> as I understand the WordDelimiterFilter should split on case changes, word
> delimiters and changes from character to digit, but it should not
> differentiate between ASCII and multibyte chars. It does however. The word
> "hälse" (german plural of "neck") gets split into "h", "ä" and "lse", which
> unfortunately renders this filter quite unusable for me. Am i missing
> something or is this a bug?
> I'm using solr 1.3 built from trunk.
>
> TIA,
>
> Stefan Oestreicher

--
Regards,
Shalin Shekhar Mangar.
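To make the two behaviours in this thread concrete, here is a minimal Python sketch. It is not Solr's implementation, and the names `char_class` and `split_word_parts` are hypothetical. It splits a token into runs of letters, roughly what generateWordParts="1" with the other options set to "0" is expected to do. The `ascii_only` flag is an assumption added only to mimic the split Stefan reports; it is not a claim about the actual cause in Solr.

```python
from itertools import groupby


def char_class(ch, ascii_only):
    """Classify one character (hypothetical helper, not Solr code)."""
    # Assumption for illustration: in ASCII-only mode, any non-ASCII
    # character falls into its own class instead of counting as a letter.
    if ascii_only and ord(ch) > 127:
        return "non_ascii"
    if ch.isalpha():  # Unicode-aware: True for "ä" as well as "a"
        return "alpha"
    if ch.isdigit():
        return "digit"
    return "delim"


def split_word_parts(token, ascii_only=False):
    """Return the letter runs of a token, dropping digits and delimiters."""
    parts = []
    # groupby yields one run per stretch of consecutive same-class characters.
    for cls, run in groupby(token, key=lambda ch: char_class(ch, ascii_only)):
        if cls in ("alpha", "non_ascii"):
            parts.append("".join(run))
    return parts


if __name__ == "__main__":
    word = "h\u00e4lse"  # "hälse", written with an escape to avoid encoding issues
    print(split_word_parts(word))                   # Unicode-aware classification
    print(split_word_parts(word, ascii_only=True))  # ASCII-only classification
```

With Unicode-aware classification the word stays in one piece, while the ASCII-only variant breaks it at the umlaut, which matches the symptom described in the quoted message.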