We did some tests too with many millions of documents and auto-commit enabled. 
It didn't take long for the indexer to stall and in the meantime the number of 
open files exploded, to over 16k, then 32k.

On Friday 23 March 2012 12:20:15 Mark Miller wrote:
> What issues? It really shouldn't be a problem.
> 
> On Mar 22, 2012, at 11:44 PM, I-Chiang Chen <ichiangc...@gmail.com> wrote:
> > At this time we are not leveraging the NRT functionality. This is the
> > initial data load process, where the idea is to just add all 200 million
> > records first, then do a single commit at the end to make them
> > searchable. We have actually disabled auto commit for now.
> > 
> > We tried leaving auto commit enabled during the initial data load
> > process and ran into multiple issues that led to a botched loading
> > process.
> > 
> > On Thu, Mar 22, 2012 at 2:15 PM, Mark Miller <markrmil...@gmail.com> wrote:
> >> On Mar 21, 2012, at 9:37 PM, I-Chiang Chen wrote:
> >>> We are currently experimenting with SolrCloud functionality in Solr
> >>> 4.0. The goal is to see if Solr 4.0 trunk in its current state is
> >>> able to handle roughly 200 million documents. The documents are not
> >>> big: around 40 fields, no more than a KB, most of which are empty the
> >>> majority of the time.
> >>> 
> >>> The setup we have is 4 servers w/ 2 shards w/ 2 servers per shard. We
> >>> are running in Tomcat.
> >>> 
> >>> The question is: given the approximate data volume, is it realistic to
> >>> expect that the above setup can handle it?
> >> 
> >> So 100 million docs per machine essentially? Totally depends on the
> >> hardware and what features you are using - but def in the realm of
> >> possibility.
> >> 
> >>> Given the number of documents, should we
> >>> commit every x documents or rely on auto commits?
> >> 
> >> The number of docs shouldn't really matter here. Do you need near real
> >> time search?
> >> 
> >> You should be able to commit about as frequently as you'd like with NRT
> >> (eg every 1 second if you'd like) - either using soft auto commit or
> >> commitWithin.
> >> 
> >> Then you want to do a hard commit less frequently - every minute (or
> >> more or less) with openSearcher=false.
> >> 
> >> eg
> >> 
> >>    <autoCommit>
> >>      <maxTime>15000</maxTime>
> >>      <openSearcher>false</openSearcher>
> >>    </autoCommit>
> >>> 
> >>> --
> >>> -IC
> >> 
> >> - Mark Miller
> >> lucidimagination.com

-- 
Markus Jelsma - CTO - Openindex
