The exception is expected if you use a CharStream aware Tokenizer without a CharFilter.
Please see example/solr/conf/schema.xml for an example of configuring a CharFilter with a
CharStreamAware*Tokenizer:

   <!-- charFilter + "CharStream aware" WhitespaceTokenizer -->
   <!--
   <fieldType name="textCharNorm" class="solr.TextField" positionIncrementGap="100">
     <analyzer>
       <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
       <tokenizer class="solr.CharStreamAwareWhitespaceTokenizerFactory"/>
     </analyzer>
   </fieldType>
   -->
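For the CJK case discussed further down in this thread, the same pattern applies:
declare a charFilter in the analyzer so the tokenizer receives a CharStream instead
of a plain Reader. A minimal sketch, where the field name and mapping file name
(text_cjk_norm, mapping-japanese.txt) are only placeholders, not entries shipped in
schema.xml:

   <fieldType name="text_cjk_norm" class="solr.TextField" positionIncrementGap="100">
     <analyzer>
       <!-- the charFilter normalizes characters and provides the CharStream -->
       <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-japanese.txt"/>
       <!-- the CharStream aware tokenizer then reports corrected term offsets -->
       <tokenizer class="solr.CharStreamAwareCJKTokenizerFactory"/>
     </analyzer>
   </fieldType>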

Thank you,

Koji


Ashish P wrote:
Koji san,

Using CharStreamAwareCJKTokenizerFactory gives me the following error:
SEVERE: java.lang.ClassCastException: java.io.StringReader cannot be cast to
org.apache.solr.analysis.CharStream

Maybe you are casting the Reader to a subclass.
Thanks,
Ashish


Koji Sekiguchi-2 wrote:
If you use a CharFilter, you should use a "CharStream aware" Tokenizer so that term offsets are corrected.
There are two CharStreamAware*Tokenizers in trunk/Solr 1.4.
You probably want to use CharStreamAwareCJKTokenizer(Factory).

Koji


Ashish P wrote:
After this, should I keep using the same cjkAnalyzer or use a charFilter?
Thanks,
Ashish


Koji Sekiguchi-2 wrote:
Ashish P wrote:
I want to convert half-width katakana to full-width katakana. I tried
using the cjk analyzer, but it is not working.
Does cjkAnalyzer do it, or is there another way?
The CharFilter that comes with trunk/Solr 1.4 covers exactly this type of
problem.
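For reference, MappingCharFilterFactory reads a mapping file of "source" => "target"
lines, in the same format as mapping-ISOLatin1Accent.txt. The file name and entries
below are only an illustrative sketch of half-width to full-width katakana mappings,
not a complete list:

   # mapping-japanese.txt (illustrative excerpt)
   "ｱ" => "ア"
   "ｲ" => "イ"
   "ｳ" => "ウ"
   "ｶ" => "カ"
   "ｷ" => "キ"
   # a base character plus a half-width voiced sound mark can be folded together
   "ｶﾞ" => "ガ"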
If you are using Solr 1.3, try the patch attached to this issue:

https://issues.apache.org/jira/browse/SOLR-822

Koji






