What issues did you run into? Leaving auto commit enabled during the load really shouldn't be a problem.
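
For reference, here is a minimal sketch of the commit settings discussed in the quoted message below, assuming they sit inside the <updateHandler> section of solrconfig.xml. The intervals simply mirror the "every minute" and "every 1 second" figures from that message and are illustrative, not tuned recommendations. The autoSoftCommit block only matters if near real time search is wanted, so it can probably be left out for a pure bulk load that ends with one explicit commit.

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit: flushes pending updates to disk without opening a new
       searcher, so it does not change what searches can see.
       60000 ms (one minute) is an assumed, illustrative interval. -->
  <autoCommit>
    <maxTime>60000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>

  <!-- Soft commit: makes recently added documents searchable (NRT).
       1000 ms is an assumed value; omit this block during a bulk load
       if the documents only need to be visible after the final commit. -->
  <autoSoftCommit>
    <maxTime>1000</maxTime>
  </autoSoftCommit>
</updateHandler>
```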

On Mar 22, 2012, at 11:44 PM, I-Chiang Chen <ichiangc...@gmail.com> wrote:

> At this time we are not leveraging the NRT functionality. This is the
> initial data load process, where the idea is to just add all 200 million
> records first, then do a single commit at the end to make them searchable.
> We have actually disabled auto commit at this time.
> 
> We tried leaving auto commit enabled during the initial data load
> process and ran into multiple issues that led to a botched loading process.
> 
> On Thu, Mar 22, 2012 at 2:15 PM, Mark Miller <markrmil...@gmail.com> wrote:
> 
>> 
>> On Mar 21, 2012, at 9:37 PM, I-Chiang Chen wrote:
>> 
>>> We are currently experimenting with SolrCloud functionality in Solr 4.0.
>>> The goal is to see if Solr 4.0 trunk in its current state is able to
>>> handle roughly 200 million documents. The documents are not big: around
>>> 40 fields, no more than a KB, most of which are empty the majority of
>>> the time.
>>> 
>>> The setup we have is 4 servers w/ 2 shards w/ 2 servers per shard. We are
>>> running in Tomcat.
>>> 
>>> The question is: given the approximate data volume, is it realistic to
>>> expect the above setup to handle it?
>> 
>> So 100 million docs per machine, essentially? It totally depends on the
>> hardware and what features you are using, but it is definitely in the
>> realm of possibility.
>> 
>>> Given the number of documents, should we
>>> commit every x documents or rely on auto commits?
>> 
>> The number of docs shouldn't really matter here. Do you need near real
>> time search?
>> 
>> You should be able to commit about as frequently as you'd like with NRT
>> (e.g. every 1 second if you'd like), using either soft auto commit or
>> commitWithin.
>> 
>> Then you want to do a hard commit less frequently - every minute (or more
>> or less) with openSearcher=false.
>> 
>> e.g.:
>> 
>>    <autoCommit>
>>      <maxTime>15000</maxTime>
>>      <openSearcher>false</openSearcher>
>>    </autoCommit>
>> 
>>> 
>>> --
>>> -IC
>> 
>> - Mark Miller
>> lucidimagination.com
> 
> 
> -- 
> -IC
