It's the optimize step. Optimize essentially forces all the segments to
be copied into a single new segment, which means that your entire index
will be replicated to the slaves.
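
To make that concrete, here is a toy model of which segments a slave would have to fetch. It is only a sketch of the idea, not Solr's actual replication code: the class name, the method name and the segment names (_0, _1, ...) are all made up for illustration.

```java
import java.util.Set;
import java.util.TreeSet;

/** Toy model of segment-based replication; NOT Solr's real implementation. */
final class SegmentReplicationSketch {

    /** Returns the segment names present on the master but missing on the slave. */
    static Set<String> segmentsToDownload(Set<String> master, Set<String> slave) {
        // Start from everything the master has, then drop what the slave already holds.
        Set<String> missing = new TreeSet<>(master);
        missing.removeAll(slave);
        return missing;
    }
}
```

If the slave holds _0, _1 and _2 and an incremental commit leaves the master with _0, _1, _2 and _3, only _3 is missing. If an optimize rewrites everything into a single new segment _4, that one segment is missing, but it contains the whole index.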

In recent Solr versions there's usually no need to optimize, so unless and
until you can demonstrate a noticeable improvement, I'd just leave the
optimize step off. In fact, trunk renames it to forceMerge or something,
precisely because it's so common for people to think "of course I want to
optimize my index!" and then get the unintended consequences you're seeing,
even though the optimize doesn't actually do that much good in most cases.

Some people just do the optimize once a day (or week or whatever) during
off-peak hours as a compromise.
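
A rough sketch of that compromise, assuming you wrap your SolrJ server behind a small interface: always commit at the end of ingestion, but only optimize inside an off-peak window. IndexServer, isOffPeak and finishIngest are hypothetical names, not SolrJ API, and a real adapter would also have to deal with the checked exceptions the SolrJ calls throw.

```java
import java.time.LocalTime;

/** Sketch: commit after every ingest, optimize only during off-peak hours. */
final class OptimizePolicySketch {

    /** Hypothetical stand-in for the SolrJ server object used in the thread. */
    interface IndexServer {
        void commit();
        void optimize();
    }

    /**
     * True if now lies in [start, end). The window may wrap past midnight,
     * e.g. 22:00 to 05:00. If start equals end, the window covers the whole day.
     */
    static boolean isOffPeak(LocalTime now, LocalTime start, LocalTime end) {
        if (start.isBefore(end)) {
            return !now.isBefore(start) && now.isBefore(end);
        }
        return !now.isBefore(start) || now.isBefore(end);
    }

    /** Commits unconditionally; optimizes only off-peak. Returns true if it optimized. */
    static boolean finishIngest(IndexServer server, LocalTime now,
                                LocalTime start, LocalTime end) {
        server.commit();
        if (isOffPeak(now, start, end)) {
            server.optimize();
            return true;
        }
        return false;
    }
}
```

With a 01:00 to 04:00 window, an ingest finishing at noon only commits, so the slaves pull just the new segments; one finishing at 02:00 also optimizes and triggers a single full copy while traffic is low.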

Best
Erick


On Mon, Mar 26, 2012 at 5:02 AM, Ben McCarthy
<ben.mccar...@tradermedia.co.uk> wrote:
> Hello,
>
> Had to leave the office so didn't get a chance to reply.  Nothing in the 
> logs.  Just ran one through from the ingest tool.
>
> Same result: a full copy of the index.
>
> Is it something to do with:
>
> server.commit();
> server.optimize();
>
> I call this at the end of the ingestion.
>
> Would optimize then work across the whole index?
>
> Thanks
> Ben
>
> -----Original Message-----
> From: Tomás Fernández Löbbe [mailto:tomasflo...@gmail.com]
> Sent: 23 March 2012 15:10
> To: solr-user@lucene.apache.org
> Subject: Re: Simple Slave Replication Question
>
> Also, what happens if, instead of adding the 40K docs you add just one and 
> commit?
>
> 2012/3/23 Tomás Fernández Löbbe <tomasflo...@gmail.com>
>
>> Have you changed the mergeFactor or are you using 10 as in the example
>> solrconfig?
>>
>> What do you see in the slave's log during replication? Do you see any
>> line like "Skipping download for..."?
>>
>>
>> On Fri, Mar 23, 2012 at 11:57 AM, Ben McCarthy <
>> ben.mccar...@tradermedia.co.uk> wrote:
>>
>>> I just have an index directory.
>>>
>>> I push the documents through with a change to a field.  I'm using
>>> SolrJ to do this.  I'm using the guide from the wiki to set up the
>>> replication.  When the feed of updates to the master finishes I call
>>> a commit, again using SolrJ.  I then have a poll period of 5 minutes
>>> from the slave.  When it kicks in I see a new version of the index
>>> and then it copies the full 5 GB index.
>>>
>>> Thanks
>>> Ben
>>>
>>> -----Original Message-----
>>> From: Tomás Fernández Löbbe [mailto:tomasflo...@gmail.com]
>>> Sent: 23 March 2012 14:29
>>> To: solr-user@lucene.apache.org
>>> Subject: Re: Simple Slave Replication Question
>>>
>>> Hi Ben, only new segments are replicated from master to slave. In a
>>> situation where all the segments are new, this will cause the index
>>> to be fully replicated, but this rarely happens with incremental
>>> updates. It can also happen if the slave Solr assumes it has an
>>> "invalid" index.
>>> Are you committing or optimizing on the slaves? After replication,
>>> is the index directory on the slaves called "index" or
>>> "index.<timestamp>"?
>>>
>>> Tomás
>>>
>>> On Fri, Mar 23, 2012 at 11:18 AM, Ben McCarthy <
>>> ben.mccar...@tradermedia.co.uk> wrote:
>>>
>>> > So do you just simply address this with big NICs and network pipes?
>>> >
>>> > -----Original Message-----
>>> > From: Martin Koch [mailto:m...@issuu.com]
>>> > Sent: 23 March 2012 14:07
>>> > To: solr-user@lucene.apache.org
>>> > Subject: Re: Simple Slave Replication Question
>>> >
>>> > I guess this would depend on network bandwidth, but we move around
>>> > 150G/hour when hooking up a new slave to the master.
>>> >
>>> > /Martin
>>> >
>>> > On Fri, Mar 23, 2012 at 12:33 PM, Ben McCarthy <
>>> > ben.mccar...@tradermedia.co.uk> wrote:
>>> >
>>> > > Hello,
>>> > >
>>> > > I'm looking at the replication from a master to a number of slaves.
>>> > > I have configured it and it appears to be working.  When updating
>>> > > 40K records on the master, is it standard to always copy over the
>>> > > full index, currently 5 GB in size?  If this is standard, what do
>>> > > people do who have massive 200 GB indexes? Does it not take a while
>>> > > to bring the slaves in line with the master?
>>> > >
>>> > > Thanks
>>> > > Ben
>>> > >
>>> > > ________________________________________
>>> > >
>>> > >
>>> > > This e-mail is sent on behalf of Trader Media Group Limited,
>>> > > Registered Office: Auto Trader House, Cutbush Park Industrial Estate,
>>> > > Danehill, Lower Earley, Reading, Berkshire, RG6 4UT (Registered in
>>> > > England No. 4768833).
>>> > > This email and any files transmitted with it are confidential and
>>> > > may be legally privileged, and intended solely for the use of the
>>> > > individual or entity to whom they are addressed. If you have
>>> > > received this email in error please notify the sender. This email
>>> > > message has been swept for the presence of computer viruses.
>>> > >
>>> > >
>>> >
>>>
>>
>
