I'm using Solr 3.5.0, and in my schema.xml I use the following charFilter to
mark the end of sentences and replace the end punctuation with a symbolic token:

<charFilter class="solr.PatternReplaceCharFilterFactory"
pattern="(?<=[^.!?\\s][^.!?]*(?:[.!?](?![']?\s|$)[^.!?]*)*)[.!?]+(?=\\s|$)"
replacement=" monkeysentence"/>

I'm not sure that pattern will even do what I want, but first I need to
solve the problem of escaping the '<' character in the '(?<=' lookbehind.

I get the following error:

org.xml.sax.SAXParseException: The value of attribute "pattern" associated
with an element type "null" must not contain the '<' character.

I've tried escaping it with a backslash ('\'), as in:

pattern="(?\<=[^.!?\\s][^.!?]*(?:[.!?](?![']?\s|$)[^.!?]*)*)[.!?]+(?=\\s|$)"

But I get the same error.
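
For reference, the error comes from the XML parser rather than from the regex
engine: an attribute value may not contain a literal '<', and a backslash has
no special meaning in XML, which would explain why '\<' fails the same way.
Below is a minimal sketch of the same charFilter with the '<' written as the
predefined XML entity &lt;. It is untested against Solr 3.5.0, and it only
addresses the XML parse error, not whether the pattern itself behaves as
intended:

```xml
<!-- Sketch only: same pattern as above, with the lookbehind's '<' escaped as &lt; -->
<charFilter class="solr.PatternReplaceCharFilterFactory"
pattern="(?&lt;=[^.!?\\s][^.!?]*(?:[.!?](?![']?\s|$)[^.!?]*)*)[.!?]+(?=\\s|$)"
replacement=" monkeysentence"/>
```

The XML parser decodes &lt; back to '<' before the attribute value is handed
to the charFilter factory, so the regex engine should still see the original
'(?<=' lookbehind.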
