Otis,

Thanks for the information. We will check this out.

Regards,
Eswar

On Nov 28, 2007 12:20 PM, Otis Gospodnetic <[EMAIL PROTECTED]>
wrote:

> Eswar,
>
> I wouldn't worry about the performance of those CJK analyzers too much -
> they are fairly trivial.  The StandardAnalyzer is slower, for example.  I
> recently indexed about 20 million large docs on an 8-core, 8 GB RAM box in
> 10 hours, or roughly 550 docs/second.  No CJK, just English.
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
> ----- Original Message ----
> From: Eswar K <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Monday, November 26, 2007 9:27:15 PM
> Subject: Re: CJK Analyzers for Solr
>
> thanks james...
>
> How much time does it take to index 18m docs?
>
> - Eswar
>
> On Nov 27, 2007 7:43 AM, James liu <[EMAIL PROTECTED]> wrote:
>
> > I don't use the Hylanda analyzer.
> >
> > I use je-analyzer and have indexed at least 18m docs.
> >
> > Sorry, I only use a Chinese analyzer.
> >
> >
> > On Nov 27, 2007 10:01 AM, Eswar K <[EMAIL PROTECTED]> wrote:
> >
> > > > What is the performance of these CJK analyzers (the one in Lucene and
> > > > Hylanda)?
> > > We would potentially be indexing millions of documents.
> > >
> > > James,
> > >
> > > > We will have a look at Hylanda too. What about Japanese and Korean
> > > > analyzers? Any recommendations?
> > >
> > > - Eswar
> > >
> > > On Nov 27, 2007 7:21 AM, James liu <[EMAIL PROTECTED]> wrote:
> > >
> > > > > I don't think NGram is a good method for Chinese.
> > > > >
> > > > > Lucene's CJKAnalyzer is 2-gram.
> > > >
> > > > > Eswar K:
> > > > >  If you need a Chinese analyzer, I recommend Hylanda (www.hylanda.com).
> > > > > It is the best Chinese analyzer, but it is not free.
> > > > >  If you want a free Chinese analyzer, you can try je-analyzer, though it
> > > > > has some problems in use.
> > > >
> > > >
> > > >
> > > > > On Nov 27, 2007 5:56 AM, Otis Gospodnetic <[EMAIL PROTECTED]>
> > > > > wrote:
> > > >
> > > > > Eswar,
> > > > >
> > > > > > We've used the NGram stuff that exists in Lucene's contrib/analyzers
> > > > > > instead of CJK.  Doesn't that allow you to do everything that the
> > > > > > Chinese and CJK analyzers do?  It's been a few months since I've
> > > > > > looked at the Chinese and CJK Analyzers, so I could be off.
> > > > >
> > > > > Otis
> > > > >
> > > > > --
> > > > > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> > > > >
> > > > > ----- Original Message ----
> > > > > From: Eswar K <[EMAIL PROTECTED]>
> > > > > To: solr-user@lucene.apache.org
> > > > > Sent: Monday, November 26, 2007 8:30:52 AM
> > > > > Subject: CJK Analyzers for Solr
> > > > >
> > > > > Hi,
> > > > >
> > > > > > Does Solr come with language analyzers for CJK? If not, can you please
> > > > > > direct me to some good CJK analyzers?
> > > > >
> > > > > Regards,
> > > > > Eswar
> > > > >
> > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > regards
> > > > jl
> > > >
> > >
> >
> >
> >
> > --
> > regards
> > jl
> >
>
>
>
>
