Re: Preceding special characters in ClassicTokenizerFactory

2016-10-03 Thread Ahmet Arslan
Hi Andy,

WordDelimiterFilter has a "types" option. There is an example file named wdftypes.txt in the source tree that preserves #hashtags and @mentions. If you follow this path, please use the whitespace tokenizer, so that these characters are still present when the filter runs.

Ahmet

On Monday, October 3, 2016 9:52 PM, "Whelan, Andy" wrote: Hello, I am guess
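
To illustrate the suggestion above, here is a minimal Python sketch of the idea, not Lucene's or Solr's actual code. It splits text on whitespace first, then splits each token on delimiter characters, except where a type override reclassifies a character such as # or @ as a letter. The names TYPE_OVERRIDES, char_type, split_on_delimiters, and analyze are hypothetical and exist only for this example. The real filter also splits on case changes and letter/digit transitions, which this sketch leaves out.

```python
# Toy model of "whitespace tokenizer + delimiter filter with type overrides".
# Hypothetical names throughout; this is not the Lucene/Solr implementation.

# Characters listed here are treated as letters instead of delimiters,
# mirroring the idea behind a "types" mapping file.
TYPE_OVERRIDES = {"#": "ALPHA", "@": "ALPHA"}


def char_type(ch, overrides):
    """Classify a character, consulting the overrides first."""
    if ch in overrides:
        return overrides[ch]
    if ch.isalpha():
        return "ALPHA"
    if ch.isdigit():
        return "DIGIT"
    return "DELIM"


def split_on_delimiters(token, overrides):
    """Split one token wherever a DELIM character occurs, dropping the delimiter."""
    parts, current = [], []
    for ch in token:
        if char_type(ch, overrides) == "DELIM":
            if current:
                parts.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        parts.append("".join(current))
    return parts


def analyze(text, overrides=TYPE_OVERRIDES):
    """Whitespace-tokenize, then apply the delimiter split to each token."""
    tokens = []
    for raw in text.split():
        tokens.extend(split_on_delimiters(raw, overrides))
    return tokens


if __name__ == "__main__":
    print(analyze("Follow @solr and #lucene-dev!"))
    print(analyze("Follow @solr and #lucene-dev!", overrides={}))
```

With the overrides in place, "@solr" and "#lucene" keep their leading characters, while "-" and "!" still act as delimiters. With an empty override table, the leading # and @ are stripped like any other punctuation.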

Preceding special characters in ClassicTokenizerFactory

2016-10-03 Thread Whelan, Andy
Hello, I am guessing that what I am looking for is probably going to require extending StandardTokenizerFactory or ClassicTokenizerFactory. But I thought I would ask the group here before attempting this. We are indexing documents from an eclectic set of sources. There is, however, a heavy inter