On Mar 21, 2012, at 9:37 PM, I-Chiang Chen wrote:

> We are currently experimenting with SolrCloud functionality in Solr 4.0.
> The goal is to see if Solr 4.0 trunk in its current state is able to
> handle roughly 200 million documents. The documents are not big: around 40
> fields, no more than a KB, most of which are empty the majority of the time.
> 
> The setup we have is 4 servers w/ 2 shards w/ 2 servers per shard. We are
> running in Tomcat.
> 
> The question is: given the approximate data volume, is it realistic to
> expect the above setup to handle it?

So 100 million docs per machine essentially? It totally depends on the hardware 
and what features you are using - but it's definitely in the realm of possibility.

> Given the number of documents, should we
> commit every x documents or rely on auto commits?

The number of docs shouldn't really matter here. Do you need near real time 
search?

You should be able to commit about as frequently as you'd like with NRT (e.g. 
every second if you'd like) - either using soft auto commit or commitWithin.
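
As a rough sketch of those two options (the 1000 ms values, the field name 
"id" and the document value are only illustrative assumptions - tune them to 
your own latency needs and schema), a soft auto commit in solrconfig.xml 
would look something like:

     <autoSoftCommit> 
       <maxTime>1000</maxTime> 
     </autoSoftCommit>

and commitWithin goes on the update message itself:

     <add commitWithin="1000"> 
       <doc> 
         <field name="id">example-1</field> 
       </doc> 
     </add>

Both times are in milliseconds. You'd normally pick one approach or the 
other rather than both.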

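Then you want to do a hard commit less frequently - every minute (or more or 
less) with openSearcher=false, so the index gets flushed to stable storage 
without paying the cost of opening a new searcher each time.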

e.g. in solrconfig.xml (maxTime is in milliseconds, so this one hard commits 
every 15 seconds):

     <autoCommit> 
       <maxTime>15000</maxTime> 
       <openSearcher>false</openSearcher> 
     </autoCommit>

> 
> -- 
> -IC

- Mark Miller
lucidimagination.com










