You can use only one tokenizer per analyzer. A better approach is to use
separate fields, each with its own fieldType, for the different languages.
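
A rough sketch of what that could look like in schema.xml, assuming the
existing field is called "body". The names text_ws, text_cjk and body_cjk
are made up for illustration, and the filters you want will depend on your
data and Solr version:

```xml
<!-- Hypothetical type for the Latin-script text: whitespace tokenizer -->
<fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Hypothetical type for the Japanese text. CJKTokenizerFactory ships with
     older Solr releases; newer ones replace it with StandardTokenizerFactory
     followed by CJKBigramFilterFactory. -->
<fieldType name="text_cjk" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.CJKTokenizerFactory"/>
  </analyzer>
</fieldType>

<!-- One source field, analyzed two ways via copyField -->
<field name="body" type="text_ws" indexed="true" stored="true"/>
<field name="body_cjk" type="text_cjk" indexed="true" stored="false"/>
<copyField source="body" dest="body_cjk"/>
```

With something like this in place you would search both fields at query
time, for example by listing "body body_cjk" in the qf parameter of a
dismax handler.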

> I am looking for a clear example of using more than one tokenizer for a
> single source field. My application has a single "body" field which until
> recently was all Latin characters, but we're now encountering both English
> and Japanese words in a single message. Obviously, we need to be using CJK
> in addition to WhitespaceTokenizerFactory.
> 
> I've found some references to using copyFields or NGrams but I can't quite
> grasp what the whole solution would look like.
