Hello, Solr community:

I would like to tokenize the following sentence so that hyphenated words stay together as single tokens, with their hyphens intact. For example:

original text: This is a new abc-edg and xyz-abc is coming soon!
desired output tokens: this/is/a/new/abc-edg/and/xyz-abc/is/coming/soon/!
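To make the desired behaviour concrete, here is a small sketch in plain Python. It is not Solr code; the function name and the regular expression are only my own illustration of the splitting rule I am after (lowercase everything, keep hyphenated words whole, emit punctuation as its own token):

```python
import re

# A token is either a run of word characters joined by internal hyphens,
# or a single non-space punctuation character.
# (Hypothetical pattern, chosen only to illustrate the desired output.)
TOKEN_PATTERN = re.compile(r"\w+(?:-\w+)*|[^\w\s]")


def tokenize_keep_hyphens(text):
    """Lowercase the text and split it, keeping hyphenated words whole."""
    return TOKEN_PATTERN.findall(text.lower())


tokens = tokenize_keep_hyphens("This is a new abc-edg and xyz-abc is coming soon!")
print("/".join(tokens))
```

What I am looking for is an analyzer configuration in Solr that behaves the same way.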
Is there any way to keep the hyphens in the tokens? I thought HyphenatedWordsFilter would provide similar functionality, but it removes the hyphens. Any help would be appreciated.

--
Sincerely,
Kaya
github: https://github.com/28kayak