If your analyzer is the standard one, you can try using a tokenizer. (You can find the answer in the analyzer source code and schema.xml.)
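To make the two behaviours described in the quoted thread below concrete, here is a minimal Python sketch of bi-gram versus single-character splitting. It only illustrates the idea and is not the Lucene code: the function names `cjk_bigrams` and `cjk_unigrams` are made up for this example, and as far as I know the real CJKTokenizer also deals with non-CJK text, which this sketch ignores.

```python
def cjk_bigrams(text):
    """Overlapping two-character tokens, e.g. C1C2C3C4 -> C1C2, C2C3, C3C4.

    Hypothetical helper for illustration only; not the Lucene implementation.
    """
    if len(text) < 2:
        # A single character (or empty input) cannot form a pair.
        return [text] if text else []
    return [text[i:i + 2] for i in range(len(text) - 1)]


def cjk_unigrams(text):
    """One token per character, e.g. C1C2 -> C1, C2 (hypothetical helper)."""
    return list(text)


if __name__ == "__main__":
    sample = "中文分词"
    print(cjk_bigrams(sample))   # ['中文', '文分', '分词']
    print(cjk_unigrams(sample))  # ['中', '文', '分', '词']
```

A run of n characters produces n-1 overlapping two-character tokens, so nearly every character is indexed twice. That is probably part of why the index grows with the bi-gram approach.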
On Nov 27, 2007 9:39 AM, zx zhang <[EMAIL PROTECTED]> wrote:
> lance,
>
> The following is an example schema fieldtype using Solr 1.2 and the CJK
> package, and it works. As you said, CJK does parse a CJK string in a
> bi-gram way, just like turning 'C1C2C3C4' into 'C1C2 C2C3 C3C4'.
>
> More to the point, it is worth mentioning that the index expands beyond
> tolerance when using the CJK package, and it takes a long time to index
> documents. For most enterprise applications, I think, a more efficient
> string parser is needed.
>
> <fieldtype name="text_cjk" class="solr.TextField">
>   <analyzer class="org.apache.lucene.analysis.cjk.CJKAnalyzer"/>
> </fieldtype>
>
> On 11/27/07, Norskog, Lance <[EMAIL PROTECTED]> wrote:
> >
> > I notice this is in the future tense. Is the CJKTokenizer available yet?
> > From what I can see, the CJK code should be a Filter instead anyway.
> > Also, the ChineseFilter and CJKTokenizer do two different things.
> >
> > CJKTokenizer turns C1C2C3C4 into 'C1C2 C2C3 C3C4'. ChineseFilter (from
> > 2001) turns C1C2 into 'C1 C2'. I hope someone who speaks Mandarin or
> > Cantonese understands what this should do.
> >
> > Lance
> >
> > -----Original Message-----
> > From: Eswar K [mailto:[EMAIL PROTECTED]
> > Sent: Monday, November 26, 2007 10:28 AM
> > To: solr-user@lucene.apache.org
> > Subject: Re: CJK Analyzers for Solr
> >
> > Hoss,
> >
> > Thanks a lot. Will look into it.
> >
> > Regards,
> > Eswar
> >
> > On Nov 26, 2007 11:55 PM, Chris Hostetter <[EMAIL PROTECTED]> wrote:
> >
> > > : Does Solr come with Language analyzers for CJK? If not, can you please
> > > : direct me to some good CJK analyzers?
> > >
> > > Lucene has a CJKTokenizer and CJKAnalyzer in the contrib/analyzers jar.
> > > They can be used in Solr. Both have been included in Solr for a while
> > > now, so you can specify CJKAnalyzer in your schema with Solr 1.2, but
> > > starting with Solr 1.3 a Factory for the Tokenizer will also be
> > > included so it can be used in a more complex analysis chain defined in
> > > the schema.
> > >
> > > -Hoss

--
regards
jl