The exception is expected if you use a CharStream-aware Tokenizer without
a CharFilter.
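The failing cast can be sketched in plain Java. Note that `CharStream` below is only a stand-in for org.apache.solr.analysis.CharStream, and `CastDemo`/`castFails` are illustrative names, not Solr code: a tokenizer that expects a CharStream cannot be handed a plain Reader.

```java
import java.io.Reader;
import java.io.StringReader;

// Stand-in for org.apache.solr.analysis.CharStream (sketch only).
abstract class CharStream extends Reader { }

public class CastDemo {
    // Returns true if casting the Reader to CharStream throws,
    // reproducing the reported ClassCastException.
    static boolean castFails(Reader r) {
        try {
            CharStream cs = (CharStream) r; // legal at compile time
            return false;
        } catch (ClassCastException e) {
            return true; // a plain StringReader is not a CharStream
        }
    }

    public static void main(String[] args) {
        System.out.println(castFails(new StringReader("text")));
    }
}
```

A CharFilter in the analyzer chain supplies a CharStream to the tokenizer, which is why the configuration below avoids the exception.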
Please see example/solr/conf/schema.xml for how to configure a CharFilter
with a CharStreamAware*Tokenizer:
<!-- charFilter + "CharStream aware" WhitespaceTokenizer -->
<!--
<fieldType name="textCharNorm" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.MappingCharFilterFactory"
                mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.CharStreamAwareWhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>
-->
Thank you,
Koji
Ashish P wrote:
Koji san,
Using CharStreamAwareCJKTokenizerFactory is giving me the following error:
SEVERE: java.lang.ClassCastException: java.io.StringReader cannot be cast to
org.apache.solr.analysis.CharStream
Maybe you are typecasting the Reader to a subclass.
Thanks,
Ashish
Koji Sekiguchi-2 wrote:
If you use a CharFilter, you should use a "CharStream aware" Tokenizer to
get correct term offsets.
There are two CharStreamAware*Tokenizers in trunk/Solr 1.4.
Probably you want to use CharStreamAwareCJKTokenizer(Factory).
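A fieldType along the lines of the whitespace example in example/solr/conf/schema.xml might look like this (the field name "textCJKNorm" and the mapping file name are illustrative, not a shipped configuration):

<fieldType name="textCJKNorm" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.MappingCharFilterFactory"
                mapping="mapping-japanese.txt"/>
    <tokenizer class="solr.CharStreamAwareCJKTokenizerFactory"/>
  </analyzer>
</fieldType>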
Koji
Ashish P wrote:
After this, should I keep using the same CJKAnalyzer, or use a CharFilter?
Thanks,
Ashish
Koji Sekiguchi-2 wrote:
Ashish P wrote:
I want to convert half-width katakana to full-width katakana. I tried
using the CJK analyzer, but it is not working.
Does CJKAnalyzer do it, or is there any other way?
CharFilter, which comes with trunk/Solr 1.4, covers exactly this type of
problem.
If you are using Solr 1.3, try the patch attached below:
https://issues.apache.org/jira/browse/SOLR-822
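For reference, MappingCharFilterFactory reads a rules file of one quoted
pair per line ("#" starts a comment). Entries normalizing half-width
katakana might look like this (the file name and exact entry set are
illustrative; you would list every character you need):

# half-width katakana => full-width katakana
"ｱ" => "ア"
"ｶ" => "カ"
"ｷ" => "キ"
# voiced forms: base character plus half-width voiced sound mark
"ｶﾞ" => "ガ"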
Koji