Does anybody know of a tokenizer that can be configured with (multiple) regular expressions to mark some of the input text as keywords, and that behaves like StandardTokenizer (or UAX29URLEmailTokenizer) otherwise?
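To make the desired behavior concrete, here is a rough sketch in plain Java, without any Lucene classes. It is only an illustration under several assumptions: the class and method names (`KeywordAwareTokenizerSketch`, `tokenize`, `addWords`) and the example pattern are made up, `java.text.BreakIterator` is a stand-in for StandardTokenizer's word breaking rather than the real thing, and lowercasing of the non-keyword tokens is assumed from the token list in this post. Text matched by one of the regular expressions is emitted as a single token; everything else goes through the ordinary word breaker.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, not a Lucene Tokenizer: it only shows the splitting logic.
class KeywordAwareTokenizerSketch {

    // Emits each regex match as one token and word-breaks the text in between.
    static List<String> tokenize(String text, List<Pattern> keywordPatterns) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < text.length()) {
            // Find the earliest non-empty keyword match at or after pos;
            // on a tie, prefer the longer match. Empty matches are ignored.
            int bestStart = -1;
            int bestEnd = -1;
            for (Pattern p : keywordPatterns) {
                Matcher m = p.matcher(text);
                if (m.find(pos) && m.end() > m.start()
                        && (bestStart < 0 || m.start() < bestStart
                            || (m.start() == bestStart && m.end() > bestEnd))) {
                    bestStart = m.start();
                    bestEnd = m.end();
                }
            }
            // Tokenize the ordinary text before the keyword (or up to the end).
            int gapEnd = bestStart < 0 ? text.length() : bestStart;
            addWords(text.substring(pos, gapEnd), tokens);
            if (bestStart < 0) {
                break;
            }
            // The keyword itself is emitted verbatim, punctuation included.
            tokens.add(text.substring(bestStart, bestEnd));
            pos = bestEnd;
        }
        return tokens;
    }

    // Stand-in for StandardTokenizer: JDK word breaking plus lowercasing.
    static void addWords(String chunk, List<String> out) {
        BreakIterator words = BreakIterator.getWordInstance(Locale.ROOT);
        words.setText(chunk);
        int start = words.first();
        for (int end = words.next(); end != BreakIterator.DONE; start = end, end = words.next()) {
            String piece = chunk.substring(start, end);
            // Skip pieces that are only whitespace or punctuation.
            if (piece.codePoints().anyMatch(Character::isLetterOrDigit)) {
                out.add(piece.toLowerCase(Locale.ROOT));
            }
        }
    }

    public static void main(String[] args) {
        // Assumed pattern: two 4-digit groups joined by a dot, then any non-space run.
        List<Pattern> patterns = List.of(Pattern.compile("\\d{4}\\.\\d{4}\\S*"));
        String input = "Does my order 4711.0815!-somecode_and.other(stuff) arrive on friday?";
        System.out.println(String.join("|", tokenize(input, patterns)));
    }
}
```

In Lucene the same idea would presumably have to live inside a custom Tokenizer (or a filter chain that sets the keyword attribute), which is the part I am asking about.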
Input: Does my order 4711.0815!-somecode_and.other(stuff) arrive on friday?
Tokens: does|my|order|4711.0815!-somecode_and.other(stuff)|arrive|on|friday

Any pointers? How would I code this?

Regards,
Kai Gülzau