CharFilter can normalize (convert) Traditional Chinese to Simplified
Chinese, or vice versa, if you define the character mappings in
mapping.txt. Here is a sample of Chinese character normalization:
https://issues.apache.org/jira/secure/attachment/12392639/character-normalization.JPG
See SOLR-822 for details:
https://issues.apache.org/jira/browse/SOLR-822
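
To illustrate the idea only, here is a minimal Python sketch of
mapping-based character normalization. It is not the Solr CharFilter
code; the table TRAD_TO_SIMP and the function normalize are
hypothetical names, and the five entries are a tiny hand-picked subset
of what a real mapping.txt would need:

```python
# Hypothetical Traditional -> Simplified table (illustrative subset only;
# a real mapping file would contain thousands of entries).
TRAD_TO_SIMP = {
    "漢": "汉",
    "語": "语",
    "學": "学",
    "習": "习",
    "體": "体",
}


def normalize(text, mapping):
    """Replace every mapped character and leave all others unchanged."""
    return text.translate(str.maketrans(mapping))


print(normalize("漢語學習", TRAD_TO_SIMP))  # prints 汉语学习
```

If the same mapping is applied both at index time and at query time,
text written in either script ends up as the same characters, which is
the point of doing the normalization before tokenizing.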
Koji
revathy arun wrote:
Hi,
When I index Chinese content using the Chinese tokenizer and analyzer in
Solr 1.3, some of the Chinese text files get indexed but others do not.
Chinese has several variants, such as Standard Chinese and Simplified
Chinese. Which of these does the Chinese tokenizer support, and is there
any method to find the type of Chinese used in a file?
Rgds