On Mon, Feb 16, 2009 at 4:30 PM, revathy arun wrote:
> Hi,
>
> When I index Chinese content using the Chinese tokenizer and analyzer in Solr
> 1.3, some of the Chinese text files are getting indexed but others are not.
>
Are you sure your analyzer handles this text correctly?
If you are not sure, you can use the analyzer link in
> ----- Original Message -----
> From: Fer-Bj
> To: solr-user@lucene.apache.org
> Sent: Thursday, June 4, 2009 2:20:03 AM
> Subject: Re: indexing Chienese langage
>
> We are trying Solr 1.3 with the Paoding Chinese Analyzer, and after reindexing
> the index size went from 1.5 GB to 2.7 GB.
>
> Is that expected behavior?
>
> --
> View this message in context:
> http://www.nabble.com/indexing-Chienese-langage-tp22033302p23864358.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>
>> Since Chinese has many different language subtypes, such as standard
>> Chinese, simplified Chinese, etc., which of these does the Chinese tokenizer
>> support, and is there any method to find the type of Chinese language
>> from the file?
CharFilter can normalize (convert) Traditional Chinese to Simplified
Chinese, or vice versa, if you define mapping.txt. Here is a sample of
Chinese character normalization:
https://issues.apache.org/jira/secure/attachment/12392639/character-normalization.JPG
See SOLR-822 for the details:
http
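To make the mapping idea concrete, here is a rough Python sketch of what a character-level mapping does to text before tokenization. It runs outside Solr and is not the CharFilter implementation from SOLR-822. The `"source" => "target"` rule syntax is assumed to match what mapping.txt uses, the five sample characters stand in for a real mapping file, and `parse_mapping`, `normalize`, and `guess_script` are hypothetical helper names. The last helper is only a crude heuristic for the "which type of Chinese is this file" question, and it cannot say anything about characters that are missing from the table.

```python
import re

# Hypothetical sample rules in an assumed `"source" => "target"` syntax.
# A real mapping.txt would need many more entries than these five.
SAMPLE_MAPPING = '''
# Traditional => Simplified
"漢" => "汉"
"語" => "语"
"體" => "体"
"學" => "学"
"國" => "国"
'''

RULE = re.compile(r'^"(.+?)"\s*=>\s*"(.*?)"$')


def parse_mapping(text):
    """Parse mapping rules into a dict, skipping blank lines and # comments."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = RULE.match(line)
        if match:
            table[match.group(1)] = match.group(2)
    return table


def normalize(text, table):
    """Replace each mapped character; unmapped characters pass through.

    This sketch only handles single-character sources, which is a
    simplification compared with a real mapping filter.
    """
    return "".join(table.get(ch, ch) for ch in text)


def guess_script(text, table):
    """Crude heuristic: count characters seen as mapping sources vs. targets."""
    targets = set(table.values())
    traditional = sum(1 for ch in text if ch in table)
    simplified = sum(1 for ch in text if ch in targets)
    if traditional > simplified:
        return "traditional"
    if simplified > traditional:
        return "simplified"
    return "unknown"


table = parse_mapping(SAMPLE_MAPPING)
print(normalize("中國漢語", table))
print(guess_script("中國漢語", table))
```

With a mapping like this applied at both index and query time, a Traditional query and a Simplified document would reduce to the same characters before the tokenizer sees them, which is the point of doing the conversion in a CharFilter rather than after tokenization.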
To: solr-user@lucene.apache.org
Sent: Monday, February 16, 2009 4:30:47 PM
Subject: indexing Chienese langage
Hi,
When I index Chinese content using the Chinese tokenizer and analyzer in Solr
1.3, some of the Chinese text files are getting indexed but others are not.
Since Chinese has many different language subtypes, such as standard
Chinese, simplified Chinese, etc., which of these does the Chinese tokenizer