: It seems to be an unrecognisable pattern, this is from the log, last
: paragraph says "unknown character block name". The java version is
: "1.6.0_31":

Did you read the rest of my reply? The part about testing whether java 
recognizes your block name independently of Solr? That error is coming 
directly from the java regex engine...

: Caused by: java.util.regex.PatternSyntaxException: Unknown character block
: name {Latin-1_Supplement} near index 23
: \p{InLatin-1_Supplement}
:                        ^
:         at java.util.regex.Pattern.error(Pattern.java:1713)
:         at java.util.regex.Pattern.unicodeBlockPropertyFor(Pattern.java:2424)

Why are you using an "_" at all? Isn't "\p{InLatin-1 Supplement}" (or 
"\p{InLatin-1Supplement}") what you mean? Either of those works for me, and 
both match the javadocs for which block names are supported in the JVM...

http://docs.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html#ubc
>> The block names supported by Pattern are the valid block names accepted 
>> and defined by UnicodeBlock.forName.

http://docs.oracle.com/javase/6/docs/api/java/lang/Character.UnicodeBlock.html#forName%28java.lang.String%29
>> This method accepts block names in the following forms:
>> 
>>   1. Canonical block names as defined by the Unicode Standard. For
>>   example, the standard defines a "Basic Latin" block. Therefore, this
>>   method accepts "Basic Latin" as a valid block name. The documentation
>>   of each UnicodeBlock provides the canonical name.
>>   2. Canonical block names with all spaces removed. For example,
>>   "BasicLatin" is a valid block name for the "Basic Latin" block.
>>   ...
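
If it helps, here is a rough sketch of the kind of standalone check I mean, 
run outside of Solr. The class name BlockNameCheck and its two helper 
methods are made up for illustration; the only real APIs used are 
Character.UnicodeBlock.forName and Pattern.compile. It is written to 
compile on Java 6, and the results may differ between JVM versions, so 
run it on the same JVM that Solr uses.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical helper class for illustration; not part of Solr or the JDK.
final class BlockNameCheck {

    // True if Character.UnicodeBlock.forName accepts the name on this JVM.
    static boolean isBlockName(String name) {
        try {
            Character.UnicodeBlock.forName(name);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    // True if the regex engine can compile \p{In<name>} on this JVM.
    static boolean compiles(String name) {
        try {
            Pattern.compile("\\p{In" + name + "}");
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The two forms from the javadocs, plus the underscore form from the log.
        String[] names = {
            "Latin-1 Supplement", "Latin-1Supplement", "Latin-1_Supplement"
        };
        for (int i = 0; i < names.length; i++) {
            System.out.println(names[i]
                + " -> forName: " + isBlockName(names[i])
                + ", regex: " + compiles(names[i]));
        }
    }
}
```

Compile and run it with plain javac/java. If a name is rejected there, 
the problem is the block name itself and not anything in your Solr config.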



-Hoss
