From: [EMAIL PROTECTED]
Sent: Thursday, February 07, 2008 11:05 AM
To: solr-user@lucene.apache.org
Subject: RE: Indexing Japanese & English
Here are the comments for CJKTokenizer. First, is this what you want?
Remember, there are three Japanese writing systems.
/**
* CJKTokenizer was modified from StopTokenizer which does a decent job for
* most European languages. It performs other token methods for double-byte
* Characters: the