Ahh yes, thanks for the suggestions! I've implemented them.
I thought about your second point previously and had encountered that
issue. Once it's tokenized, I don't believe there is a way to get the full
string back from the token stream.
--
Cool! Suggestion: you might want to replace
externalVal.toLowerCase().split(" ");
with
externalVal.toLowerCase().split("\\s+");
Also, I bet folks might have different ideas about what to do with
hyphens, so maybe:
externalVal.toLowerCase().split("[-\\s]+");
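To make the difference between the three patterns concrete, here is a rough sketch. The class name SplitDemo, the helper tokenize, and the sample input are made up for illustration and are not from the patch under discussion; only the three regexes come from the suggestions above.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class SplitDemo {
    // Hypothetical helper (not part of the original code): lowercases the
    // value and splits it on a caller-supplied delimiter regex.
    public static String[] tokenize(String externalVal, String delimiterRegex) {
        return Pattern.compile(delimiterRegex).split(externalVal.toLowerCase());
    }

    public static void main(String[] args) {
        // Sample input with a hyphen, a double space, and a tab.
        String input = "Wi-Fi  Router\tSetup";

        // Single-space split: the double space yields an empty token,
        // and the tab is not treated as a delimiter at all.
        System.out.println(Arrays.toString(tokenize(input, " ")));

        // Any run of whitespace counts as one delimiter.
        System.out.println(Arrays.toString(tokenize(input, "\\s+")));

        // Hyphens are treated as delimiters too.
        System.out.println(Arrays.toString(tokenize(input, "[-\\s]+")));
    }
}
```

With the single-space pattern the sample input produces an empty token and leaves "router" and "setup" glued together by the tab; the "\\s+" pattern gives three clean tokens, and "[-\\s]+" additionally breaks "wi-fi" into "wi" and "fi". Passing the regex in as a parameter is one possible way to leave the hyphen question up to the caller.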
In fact why not make it a config