RE: Scaling indexes with high document count

2010-03-11 Thread Peter S
to tune for fast indexing are:
* committing lists of docs rather than each one separately
* not optimizing too often
* bumping up the mergeFactor (I use a value of 25)
Many Thanks! Peter
> Date: Thu, 11 Mar 2010 09:19:12 -0800
> From: hossman_luc...@fucit.org
> To: solr
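The first tip above (commit lists of docs rather than each one separately) could be sketched roughly as follows. This is only an illustration, not the poster's actual code: `add_docs` and `commit` are hypothetical callables standing in for whatever client is used to talk to Solr, and the batch size of 1000 is an arbitrary assumption.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def batched(docs: Iterable[dict], batch_size: int) -> Iterator[List[dict]]:
    """Yield successive lists of at most batch_size documents."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    it = iter(docs)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def index_in_batches(
    docs: Iterable[dict],
    batch_size: int,
    add_docs: Callable[[List[dict]], None],  # hypothetical: sends many docs in one request
    commit: Callable[[], None],              # hypothetical: issues a single commit
) -> int:
    """Send docs in batches, committing once per batch; return the number of commits."""
    commits = 0
    for batch in batched(docs, batch_size):
        add_docs(batch)
        commit()  # one commit per batch instead of one per document
        commits += 1
    return commits


if __name__ == "__main__":
    sent: List[List[dict]] = []
    stream = ({"id": i} for i in range(2500))
    n = index_in_batches(stream, 1000, sent.append, lambda: None)
    print(n, [len(b) for b in sent])
```

With 2500 documents and a batch size of 1000, this issues three commits rather than 2500. The mergeFactor setting itself lives in the Solr configuration and is not shown here.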

Re: Scaling indexes with high document count

2010-03-11 Thread Chris Hostetter
: I wonder if anyone might have some insight/advice on index scaling for high
: document count vs size deployments...

Your general approach sounds reasonable, although specifics of how you'll need to tune the caches and how much hardware you'll need will largely depend on the specifics of the d

Scaling indexes with high document count

2010-03-10 Thread Peter S
Hello, I wonder if anyone might have some insight/advice on index scaling for high document count vs size deployments... The nature of the incoming data is a steady stream of, on average, 4GB per day. Importantly, the number of documents inserted during this time is ~7 million (i.e. lots of sm
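For a sense of scale, the figures quoted above (4GB and ~7 million documents per day) can be turned into an average document size and a sustained insert rate. This is a back-of-the-envelope sketch only; it assumes 1GB = 10^9 bytes and a perfectly even stream over 24 hours, neither of which is stated in the post.

```python
# Figures from the post; the unit convention (1 GB = 10**9 bytes) is an assumption.
BYTES_PER_DAY = 4 * 10**9
DOCS_PER_DAY = 7_000_000
SECONDS_PER_DAY = 24 * 60 * 60


def average_doc_size_bytes(bytes_per_day: int, docs_per_day: int) -> float:
    """Average size of one document, in bytes."""
    return bytes_per_day / docs_per_day


def docs_per_second(docs_per_day: int) -> float:
    """Sustained insert rate, assuming an even stream across the day."""
    return docs_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"average document size: {average_doc_size_bytes(BYTES_PER_DAY, DOCS_PER_DAY):.0f} bytes")
    print(f"sustained insert rate: {docs_per_second(DOCS_PER_DAY):.0f} docs/second")
```

Under those assumptions the documents average well under 1KB each, arriving at roughly 80 per second, which is why document count rather than raw size is the concern here.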

Scaling indexes with high document count

2010-03-10 Thread Peter Sturge
Hello, I wonder if anyone might have some insight/advice on index scaling for high document count vs size deployments... The nature of the incoming data is a steady stream of, on average, 4GB per day. Importantly, the number of documents inserted during this time is ~7 million (i.e. lots of small

Scaling indexes with high document count

2010-03-09 Thread Peter Sturge
Hello, I wonder if anyone might have some insight/advice on index scaling for high document count vs size deployments... The nature of the incoming data is a steady stream of, on average, 4GB per day. Importantly, the number of documents inserted during this time is ~7 million (i.e. lots of small