Re: Solr production live implementation

2007-11-02 Thread Otis Gospodnetic
Hi Tim (switching to the more appropriate solr-user list). It's hard to tell; it depends on things like how search is integrated into the rest of the site, the placement of the search field/form, the exposure, etc. The corpus/index does not sound large, but the mention of Windows scares me, as does 2GB o

RE: Phrase Query Performance Question

2007-11-02 Thread Haishan Chen
> Date: Fri, 2 Nov 2007 12:31:29 -0700
> From: [EMAIL PROTECTED]
> To: solr-user@lucene.apache.org
> Subject: Re: Phrase Query Performance Question
>
> : It still feels to me that you are trying doing something unique with your
> : phrase queries. Unfortunately, you still haven't said what

Re: Phrase Query Performance Question

2007-11-02 Thread Chris Hostetter
: It still feels to me that you are trying to do something unique with your
: phrase queries. Unfortunately, you still haven't said what you are trying to
: do in general terms, which makes it very difficult for people to help you.
Agreed. This seems like a very special case, but we don't know what th

Re: Solr and Lucene Indexing Performance

2007-11-02 Thread Mike Klaas
On 2-Nov-07, at 11:41 AM, Jae Joo wrote:
> Hi, I have 6 million articles to be indexed by Solr and need your recommendation. I need to parse them and generate Solr-based XML files to post. How about using Lucene directly? In a short test, it looks like Solr-based indexing is fast

Re: Phrase Query Performance Question

2007-11-02 Thread Mike Klaas
On 2-Nov-07, at 10:03 AM, Haishan Chen wrote:
> > Date: Fri, 2 Nov 2007 07:32:30 -0700
> > Subject: Re: Phrase Query Performance Question
> > From: [EMAIL PROTECTED]
> > To: solr- [EMAIL PROTECTED]
> >
> > He means "extremely frequent" and I agree. --wunder
> Then it means a PHRASE (combination of terms

Solr and Lucene Indexing Performance

2007-11-02 Thread Jae Joo
Hi, I have 6 million articles to be indexed by Solr and need your recommendation. I need to parse them and generate Solr-based XML files to post. How about using Lucene directly? In a short test, it looks like Solr-based indexing is faster than direct indexing through Lucene. Am I d

RE: Phrase Query Performance Question

2007-11-02 Thread Haishan Chen
> Date: Fri, 2 Nov 2007 07:32:30 -0700
> Subject: Re: Phrase Query Performance Question
> From: [EMAIL PROTECTED]
> To: solr-user@lucene.apache.org
>
> He means "extremely frequent" and I agree. --wunder
Then it means a PHRASE (combination of terms except stopwords) appears in 5% to 10% o

Re: Any tips for indexing large amounts of data?

2007-11-02 Thread Brendan Grainger
Thanks so much for your suggestions. I am attempting to index 550K docs at once, but have found that I have to break them up into smaller batches. Indexing seems to stop at around 47K docs (the index reaches 264 MB in size at this point). The index itself eventually grows to about 2 GB. I am usin

Re: Phrase Query Performance Question

2007-11-02 Thread Walter Underwood
He means "extremely frequent" and I agree. --wunder
On 11/2/07 1:51 AM, "Haishan Chen" <[EMAIL PROTECTED]> wrote:
> Thanks for the advice. You certainly have a point. I believe you mean a query
> term that appears in 5-10% of an index in a natural language corpus is
> extremely INFREQUENT?

RE: Phrase Query Performance Question

2007-11-02 Thread Haishan Chen
> From: [EMAIL PROTECTED]
> Subject: Re: Phrase Query Performance Question
> Date: Thu, 1 Nov 2007 11:25:26 -0700
> To: solr-user@lucene.apache.org
>
> On 31-Oct-07, at 11:54 PM, Haishan Chen wrote:
>
> >> Date: Wed, 31 Oct 2007 17:54:53 -0700
> >> Subject: Re: Phrase Query Performance