On 29.11.2011 at 15:20, Erick Erickson wrote:
Hmmm, I tried this in straight Java, no Solr/Lucene involved and the
behavior I'm seeing is that no example works if it has more than
one whitespace character after the hyphen, including your failure
example.
I haven't lived inside regexes long enough to know what
the right regex should be, but it doesn't appear to be a Solr problem.
Jay,
I think the problem is this:
You're checking that the character preceding a run of one or more
whitespace characters is not a hyphen.
However, when there is more than one whitespace character, like this:
"foo- \n bar"
then there's another run of whitespace - "\n " - which is preceded by
the first whitespace character - " ".
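To make this concrete, here is a minimal Java sketch. The pattern (?<!-)\s+ is only an assumption about what the original regex looks like, since it isn't quoted in this message, and the class and method names are made up for illustration. It shows that the lookbehind rejects the whitespace run starting right after the hyphen, but the engine then retries one character later and succeeds there.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookbehindDemo {
    // Assumed pattern (not taken from the original mail): a run of
    // whitespace that is not directly preceded by a hyphen.
    static final Pattern NOT_AFTER_HYPHEN = Pattern.compile("(?<!-)\\s+");

    // Returns the first whitespace run the pattern matches, or null if none.
    static String firstMatch(String input) {
        Matcher m = NOT_AFTER_HYPHEN.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // One whitespace character after the hyphen: the lookbehind blocks it.
        System.out.println(firstMatch("foo- bar") == null);              // true

        // Several whitespace characters: the run starting at the first space
        // is rejected, but the shorter run "\n " is preceded by a space,
        // not by a hyphen, so it matches.
        System.out.println("\n ".equals(firstMatch("foo- \n bar")));     // true
    }
}
```

The second case is the one that looks like a failure: the match simply starts one character later than intended.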
Therefore, you'll need to check not only for a preceding hyphen, but also
for preceding whitespace.
I'll leave this as an exercise for you. ;)
-Kuli