Good example of multiple tokenizers for a single field

Jacob Elder Mon, 29 Nov 2010 14:16:16 -0800

I am looking for a clear example of using more than one tokenizer for a
single source field. My application has a single "body" field which until
recently contained only Latin characters, but we're now encountering both
English and Japanese words in a single message. Obviously, we need to be
using a CJK tokenizer in addition to WhitespaceTokenizerFactory.


I've found some references to using copyFields or NGrams, but I can't quite
grasp what the whole solution would look like.

-- 
Jacob Elder
@jelder
(646) 535-3379
