: This works. Other patterns tried were: \p{InLatin-1_Supplement} or \p{Latin}
: These throw an exception, from the log:
: ***
: Mar 29, 2012 5:56:45 PM org.apache.solr.common.SolrException log
: SEVERE: null:org.apache.solr.common.SolrException: Plugin init failure for
: [schema.xml] fieldType:Plugin init failure for [schema.xml]
: analyzer/charFilter:Configuration Error: 'pattern' can not be parsed in
: org.apache.solr.analysis.PatternReplaceCharFilterFactory

Immediately below that there should have been more details on what error was generated by the Java regex engine when trying to parse your pattern (something like "caused by: ..."), which is fairly crucial to understanding what might be going wrong.

: Can anybody help? Or, might this be a java issue?

I suspect it's a java issue ... you didn't mention which version of java you are using, and i don't know which java versions correspond to which unicode versions in terms of the block names they support, but is it possible some of those patterns are only legal in a newer version of java than you have?

Have you tried running a simple little java main() to verify that those patterns are legal in your JVM?

    import java.util.regex.Pattern;

    public final class PatTest {
      // usage: java PatTest <pattern> <input>
      public static void main(String[] args) throws Exception {
        String pat = args[0];
        String input = args[1];
        // Pattern.compile throws a PatternSyntaxException if the
        // pattern is not legal in this JVM
        Pattern p = Pattern.compile(pat);
        System.out.println(input + " does " +
                           (p.matcher(input).matches() ? "" : "NOT ") +
                           "match " + pat);
      }
    }

-Hoss
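
PS: a rough sketch of a variation on the same idea, in case it is useful. Instead of letting the exception escape, it catches the PatternSyntaxException and prints the regex engine's own message next to the JVM version, which is the detail missing from the log excerpt. The names PatternCheck and compileError are made up for this example and are not part of Solr; the two default patterns are just the ones quoted in the question.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

final class PatternCheck {

  /**
   * Returns null if the pattern compiles in this JVM, otherwise the
   * message the regex engine produced when it rejected the pattern.
   */
  static String compileError(String pat) {
    try {
      Pattern.compile(pat);
      return null;
    } catch (PatternSyntaxException e) {
      return e.getMessage();
    }
  }

  public static void main(String[] args) {
    // Report the JVM version, since support for property names differs by release.
    System.out.println("java.version = " + System.getProperty("java.version"));

    // With no arguments, fall back to the two patterns from the question.
    String[] pats = args.length > 0
        ? args
        : new String[] { "\\p{InLatin-1_Supplement}", "\\p{Latin}" };

    for (String pat : pats) {
      String err = compileError(pat);
      System.out.println(pat + " : " + (err == null ? "compiles" : "FAILS: " + err));
    }
  }
}
```

Run it with the same JVM that Solr uses, passing your own patterns as arguments if you want to try others; whatever message it prints for a failing pattern should be what the regex engine reported underneath the "Plugin init failure" error.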