RE: Performance help for heavy indexing workload

2008-02-12 Thread Lance Norskog
an impressive app.

Lance

-----Original Message-----
From: James Brady [mailto:[EMAIL PROTECTED]
Sent: Tuesday, February 12, 2008 12:41 PM
To: solr-user@lucene.apache.org
Subject: Re: Performance help for heavy indexing workload

Hi - thanks to everyone for their responses. A couple of extra pieces

Re: Performance help for heavy indexing workload

2008-02-12 Thread James Brady
Hi - thanks to everyone for their responses. A couple of extra pieces of data which should help me optimise - documents are very rarely updated once in the index, and I can throw away index data older than 7 days. So, based on advice from Mike and Walter, it seems my best option will be t
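A rough sketch of how the 7-day retention mentioned above could be expressed as a Solr delete-by-query. The field name `timestamp` and the helper `build_purge_request` are assumptions made for illustration; nothing in the thread names the schema. The function only builds the XML body that would be posted to Solr's update handler and does not send anything.

```python
from xml.sax.saxutils import escape


def build_purge_request(date_field: str = "timestamp", max_age_days: int = 7) -> str:
    """Return a Solr XML delete-by-query body for documents older than max_age_days.

    `date_field` is a hypothetical date field; substitute whatever the schema
    actually uses to record when a document was indexed.
    """
    if max_age_days <= 0:
        raise ValueError("max_age_days must be positive")
    # Solr date math: everything up to "now minus N days".
    query = f"{date_field}:[* TO NOW-{max_age_days}DAYS]"
    return f"<delete><query>{escape(query)}</query></delete>"


if __name__ == "__main__":
    print(build_purge_request())
```

Since documents are rarely updated after indexing, a purge like this could plausibly run on a schedule (say, once a day) followed by a commit, rather than alongside every add.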

Re: Performance help for heavy indexing workload

2008-02-12 Thread Mike Klaas
On 11-Feb-08, at 11:38 PM, James Brady wrote:

Hello, I'm looking for some configuration guidance to help improve performance of my application, which tends to do a lot more indexing than searching. At present, it needs to index around two documents / sec - a document being the stripped c

Re: Performance help for heavy indexing workload

2008-02-12 Thread Walter Underwood
On 2/12/08 7:40 AM, "Ken Krugler" <[EMAIL PROTECTED]> wrote:

> In general immediate updating of an index with a continuous stream of
> new content, and fast search results, work in opposition. The
> searcher's various caches are getting continuously flushed to avoid
> stale content, which can easi

Re: Performance help for heavy indexing workload

2008-02-12 Thread Walter Underwood
That does seem really slow. Is the index on NFS-mounted storage?

wunder

On 2/12/08 7:04 AM, "Erick Erickson" <[EMAIL PROTECTED]> wrote:

> Well, the *first* sort to the underlying Lucene engine is expensive since
> it builds up the terms to sort. I wonder if you're closing and opening the
> under

Re: Performance help for heavy indexing workload

2008-02-12 Thread Erick Erickson
Well, the *first* sort to the underlying Lucene engine is expensive since it builds up the terms to sort. I wonder if you're closing and opening the underlying searcher for every request? This is a definite limiter. Disclaimer: I mostly do Lucene, not SOLR (yet), so don't *even* ask me how to chan
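A toy illustration of the point about reopening the searcher, not Lucene or Solr code: `ToySearcher`, `serve_reopening`, and `serve_shared` are all made-up names. The toy searcher builds its sorted term list lazily on the first sorted search, standing in for the expensive first sort; a counter shows how often that work is repeated under each pattern.

```python
class ToySearcher:
    """Stand-in for an index searcher; not a real Lucene/Solr class."""

    sort_builds = 0  # class-wide count of expensive sort-structure builds

    def __init__(self, terms):
        self._terms = list(terms)
        self._sorted = None  # built lazily on the first sorted search

    def search_sorted(self):
        if self._sorted is None:
            # The costly step: paid once per searcher instance.
            ToySearcher.sort_builds += 1
            self._sorted = sorted(self._terms)
        return self._sorted


def serve_reopening(terms, requests):
    """Open a fresh searcher for every request (the suspected anti-pattern)."""
    return [ToySearcher(terms).search_sorted() for _ in range(requests)]


def serve_shared(terms, requests):
    """Open one searcher and reuse it for all requests."""
    searcher = ToySearcher(terms)
    return [searcher.search_sorted() for _ in range(requests)]


if __name__ == "__main__":
    terms = ["solr", "lucene", "index"]

    ToySearcher.sort_builds = 0
    serve_reopening(terms, 5)
    print("reopening:", ToySearcher.sort_builds)

    ToySearcher.sort_builds = 0
    serve_shared(terms, 5)
    print("shared:", ToySearcher.sort_builds)
```

With five requests, the reopening variant pays the build cost five times and the shared variant once, which is the kind of difference the message is hinting at.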