I'm using Solr 3.5.0, and in my schema.xml I use the following to mark the end
of sentences and replace the end punctuation with a symbolic token:
I'm not sure whether that will even work for what I want, but first I need to
solve the problem of escaping the '<' character in the first '?<='
lookbehind.
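For what it's worth, a literal '<' is not allowed inside an XML attribute value, so the usual fix is to write it as `&lt;`; the XML parser turns it back into '<' before the regex engine ever sees it. The sketch below illustrates that with plain JDK classes. The filter element, the pattern, and the `_EOS_` token are hypothetical stand-ins (the original snippet is not shown here), and the class and method names are made up for the example.

```java
import java.io.StringReader;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class LookbehindEscapeDemo {
    // Hypothetical schema.xml fragment, NOT the poster's original config.
    // The '<' of the lookbehind is written as &lt; inside the attribute.
    static final String ESCAPED_XML =
        "<filter class=\"solr.PatternReplaceFilterFactory\" "
        + "pattern=\"(?&lt;=[.!?])\\s+\" replacement=\" _EOS_ \" replace=\"all\"/>";

    // Parses the fragment and returns the pattern attribute as the
    // regex engine would receive it (entities already decoded).
    static String readPattern(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getAttribute("pattern");
        } catch (Exception e) {
            throw new IllegalStateException("XML is not well-formed", e);
        }
    }

    public static void main(String[] args) {
        String pattern = readPattern(ESCAPED_XML);
        System.out.println(pattern);   // prints (?<=[.!?])\s+
        Pattern.compile(pattern);      // compiles as a normal lookbehind
        System.out.println("One. Two".replaceAll(pattern, " _EOS_ "));
    }
}
```

If the same fragment is written with a raw '<' in the attribute, the parser rejects the document as not well-formed, which matches the kind of error one would expect when the schema is loaded.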

Thanks Jeevanandam. I couldn't get any regex pattern to work except a basic
one to look for sentence-ending punctuation followed by whitespace:
[.!?](?=\s)
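As a rough illustration of what that basic pattern does outside Solr, here is a minimal sketch using `java.util.regex`. The `_EOS_` token and the class and method names are assumptions made for the example, not anything taken from the actual schema.

```java
import java.util.regex.Pattern;

class SentenceEndMarker {
    // The basic pattern: sentence-ending punctuation followed by whitespace.
    static final Pattern SENTENCE_END = Pattern.compile("[.!?](?=\\s)");

    // Hypothetical symbolic token; any marker string would do.
    static final String EOS_TOKEN = "_EOS_";

    // Replaces each matched punctuation mark with the symbolic token.
    // The whitespace after it is only looked at, not consumed.
    static String mark(String text) {
        return SENTENCE_END.matcher(text).replaceAll(" " + EOS_TOKEN);
    }

    public static void main(String[] args) {
        System.out.println(mark("It works. Does it? Yes!"));
        System.out.println(mark("Dr. Smith arrived."));
    }
}
```

Running it shows two things the pattern cannot handle on its own: punctuation at the very end of the input is not followed by whitespace and so is left alone, and an abbreviation such as "Dr." is treated as a sentence end.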
However, this isn't good enough for my needs, so I'm switching tactics at the
moment and working on plugging in OpenNLP's SentenceDetector int