How do these CJK analyzers (the one in Lucene and hylanda) perform? We would potentially be indexing millions of documents.
James,

We will have a look at hylanda too. What about Japanese and Korean analyzers, any recommendations?

- Eswar

On Nov 27, 2007 7:21 AM, James liu <[EMAIL PROTECTED]> wrote:

> I don't think NGram is a good method for Chinese.
>
> CJKAnalyzer of Lucene is 2-Gram.
>
> Eswar K:
> If it is a Chinese analyzer, I recommend hylanda (www.hylanda.com). It is
> the best Chinese analyzer, and it is not free.
> If you want a free Chinese analyzer, maybe you can try je-analyzer. It has
> some problems in use.
>
> On Nov 27, 2007 5:56 AM, Otis Gospodnetic <[EMAIL PROTECTED]> wrote:
>
> > Eswar,
> >
> > We've used the NGram stuff that exists in Lucene's contrib/analyzers
> > instead of CJK. Doesn't that allow you to do everything that the Chinese
> > and CJK analyzers do? It's been a few months since I've looked at Chinese
> > and CJK Analyzers, so I could be off.
> >
> > Otis
> >
> > --
> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> >
> > ----- Original Message ----
> > From: Eswar K <[EMAIL PROTECTED]>
> > To: solr-user@lucene.apache.org
> > Sent: Monday, November 26, 2007 8:30:52 AM
> > Subject: CJK Analyzers for Solr
> >
> > Hi,
> >
> > Does Solr come with Language analyzers for CJK? If not, can you please
> > direct me to some good CJK analyzers?
> >
> > Regards,
> > Eswar
>
> --
> regards
> jl
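
For anyone unfamiliar with the "2-Gram" behaviour mentioned in the quoted reply, here is a rough sketch of overlapping bigram tokenization over runs of CJK characters. It is only an illustration of the idea, not Lucene's CJKAnalyzer code. The names `is_cjk` and `cjk_bigrams` are made up for this example, and it assumes that only the basic CJK Unified Ideographs block counts as CJK and that all other characters simply act as separators.

```python
from typing import List


def is_cjk(ch: str) -> bool:
    """Rough check covering only the basic CJK Unified Ideographs block
    (an assumption made to keep the sketch short)."""
    return "\u4e00" <= ch <= "\u9fff"


def cjk_bigrams(text: str) -> List[str]:
    """Emit overlapping 2-character tokens for each run of CJK characters.

    A run of length 1 is emitted as a single-character token.
    Non-CJK characters are dropped and only split the runs.
    """
    tokens: List[str] = []
    run: List[str] = []

    def flush() -> None:
        # Turn the current run into tokens, then reset it.
        if len(run) == 1:
            tokens.append(run[0])
        else:
            for i in range(len(run) - 1):
                tokens.append(run[i] + run[i + 1])
        run.clear()

    for ch in text:
        if is_cjk(ch):
            run.append(ch)
        elif run:
            flush()
    if run:
        flush()
    return tokens
```

Because every adjacent pair of characters becomes a token, a run of n characters yields n - 1 tokens, so the index grows roughly in step with the text length, without any dictionary. That may be worth keeping in mind when sizing an index for millions of documents.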