Otis,

Thanks for the information. We will check this out.

Regards,
Eswar

On Nov 28, 2007 12:20 PM, Otis Gospodnetic <[EMAIL PROTECTED]> wrote:

> Eswar,
>
> I wouldn't worry about the performance of those CJK analyzers too much -
> they are fairly trivial. The StandardAnalyzer is slower, for example. I
> recently indexed cca 20MM large docs on an 8-core, 8 GB RAM box in 10 hours -
> 550 docs/second. No CJK, just English.
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
> ----- Original Message ----
> From: Eswar K <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Monday, November 26, 2007 9:27:15 PM
> Subject: Re: CJK Analyzers for Solr
>
> thanks james...
>
> How much time does it take to index 18m docs?
>
> - Eswar
>
> On Nov 27, 2007 7:43 AM, James liu <[EMAIL PROTECTED]> wrote:
>
> > i not use HYLANDA analyzer.
> >
> > i use je-analyzer and indexing at least 18m docs.
> >
> > i m sorry i only use chinese analyzer.
> >
> > On Nov 27, 2007 10:01 AM, Eswar K <[EMAIL PROTECTED]> wrote:
> >
> > > What is the performance of these CJK analyzers (one in lucene and
> > > hylanda)?
> > > We would potentially be indexing millions of documents.
> > >
> > > James,
> > >
> > > We would have a look at hylanda too. What abt japanese and korean
> > > analyzers, any recommendations?
> > >
> > > - Eswar
> > >
> > > On Nov 27, 2007 7:21 AM, James liu <[EMAIL PROTECTED]> wrote:
> > >
> > > > I don't think NGram is good method for Chinese.
> > > >
> > > > CJKAnalyzer of Lucene is 2-Gram.
> > > >
> > > > Eswar K:
> > > > if it is chinese analyzer,,i recommend hylanda(www.hylanda.com),,,it is
> > > > the best chinese analyzer and it not free.
> > > > if u wanna free chinese analyzer, maybe u can try je-analyzer. it have
> > > > some problem when using it.
> > > >
> > > > On Nov 27, 2007 5:56 AM, Otis Gospodnetic <[EMAIL PROTECTED]>
> > > > wrote:
> > > >
> > > > > Eswar,
> > > > >
> > > > > We've used the NGram stuff that exists in Lucene's contrib/analyzers
> > > > > instead of CJK. Doesn't that allow you to do everything that the
> > > > > Chinese and CJK analyzers do? It's been a few months since I've
> > > > > looked at Chinese and CJK Analyzers, so I could be off.
> > > > >
> > > > > Otis
> > > > >
> > > > > --
> > > > > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> > > > >
> > > > > ----- Original Message ----
> > > > > From: Eswar K <[EMAIL PROTECTED]>
> > > > > To: solr-user@lucene.apache.org
> > > > > Sent: Monday, November 26, 2007 8:30:52 AM
> > > > > Subject: CJK Analyzers for Solr
> > > > >
> > > > > Hi,
> > > > >
> > > > > Does Solr come with Language analyzers for CJK? If not, can you
> > > > > please direct me to some good CJK analyzers?
> > > > >
> > > > > Regards,
> > > > > Eswar
> > > >
> > > > --
> > > > regards
> > > > jl
> >
> > --
> > regards
> > jl
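
For readers unfamiliar with the "2-Gram" remark in the quoted thread, here is a minimal Python sketch of overlapping character-bigram tokenization, the general idea behind bigram analysis of CJK text. It is an illustration only, not Lucene's CJKAnalyzer implementation: the function name `cjk_bigrams` is hypothetical, and the sketch assumes the input is a single run of CJK characters with no Latin text, digits, or punctuation to handle.

```python
def cjk_bigrams(text: str) -> list[str]:
    """Split a run of characters into overlapping two-character tokens.

    Hypothetical helper for illustration; it does not reproduce the
    behavior of Lucene's CJKAnalyzer on mixed or non-CJK input.
    """
    if len(text) < 2:
        # A lone character cannot form a bigram, so emit it unchanged.
        return [text] if text else []
    # Slide a window of width 2 across the text, one character at a time.
    return [text[i:i + 2] for i in range(len(text) - 1)]


print(cjk_bigrams("中文分析器"))
# ['中文', '文分', '分析', '析器']
```

Because every adjacent pair becomes a token, no dictionary is needed, which is consistent with the remark in the thread that such analyzers are cheap to run. The trade-off is that many tokens cross real word boundaries, which is the usual argument for dictionary-based segmenters.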