Edwin,

There is a spec for which characters are acceptable in the local part (the name) of an email address (RFC 5322), and another for the characters allowed in a domain name (RFC 1035, as relaxed by RFC 1123). I suspect you will have more success with a tokenizer that is specialized for email, but I have not looked at UAX29URLEmailTokenizerFactory. Does ClassicTokenizerFactory split on hyphens?

Cheers -- Rick
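
To make the point about the two character specs concrete, here is a rough Python sketch. It is not how either Solr tokenizer works; it only checks an address against a simplified reading of the unquoted local-part characters in RFC 5322 and the letter/digit/hyphen rule for domain labels. The function name `is_plausible_address` and the requirement of at least two domain labels are my own assumptions, and quoted local parts, comments and internationalized addresses are ignored.

```python
import re

# Simplified character classes (assumption: unquoted ASCII addresses only).
# ATEXT is the set of characters allowed in an unquoted local part.
ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"
# Local part: runs of ATEXT separated by single dots.
LOCAL_PART = re.compile(rf"{ATEXT}+(?:\.{ATEXT}+)*")
# Domain label: letters, digits, hyphens; no leading or trailing hyphen; max 63 chars.
LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_plausible_address(address: str) -> bool:
    """Return True if the address fits the simplified character rules above."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        return False
    if not LOCAL_PART.fullmatch(local):
        return False
    labels = domain.split(".")
    # Hypothetical extra rule: require at least two labels (e.g. example.com).
    return len(labels) >= 2 and all(LABEL.fullmatch(label) for label in labels)
```

Note that the hyphen is legal in both the local part and the domain labels, so an address such as `mary-anne@mail-server.example.com` passes the check. A tokenizer that splits on hyphens would break an address like that into several tokens.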

On November 24, 2017 3:46:46 AM EST, Zheng Lin Edwin Yeo <edwinye...@gmail.com> wrote:
>Hi,
>
>I am indexing email addresses into Solr via EML files. Currently, I am
>using ClassicTokenizerFactory with LowerCaseFilterFactory. However, I also
>found that we can also use UAX29URLEmailTokenizerFactory with
>LowerCaseFilterFactory.
>
>Does anyone have any recommendation on which Tokenizer is better?
>
>I am currently using Solr 6.5.1, and planning to upgrade to Solr 7.1.0.
>
>Regards,
>Edwin

--
Sorry for being brief. Alternate email is rickleir at yahoo dot com