Hey Shalin,
thanks for your prompt reply.
To clarify:
With the old script-based replication, I would snappull every X minutes
(say, on the order of 5 minutes). Assuming no index optimize occurred (I
optimize 1-2 times a day, so we can disregard it for the sake of argument),
each snappull would take only a few seconds to run.
I then have a crontab on every slave that runs snapinstall at fixed times,
say every 15 minutes from the top of the hour, inclusive (the slave
machines' clocks are synced, e.g. via NTP), so that all slaves begin a
snapinstall at essentially the same moment. Assuming roughly uniform load,
and given that at that point they all hold the same snapshot (since I
snappull frequently), this leads to fairly synchronized replication across
the board.
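
Roughly, the slave-side crontab looked something like this (the paths,
script names and exact schedule are illustrative rather than my literal
setup):

    # pull the latest snapshot from the master every 5 minutes
    */5 * * * *         /opt/solr/bin/snappuller >/dev/null 2>&1
    # install whatever snapshot is present at :00, :15, :30 and :45 -
    # identical on every slave, so the installs line up across the cluster
    0,15,30,45 * * * *  /opt/solr/bin/snapinstaller >/dev/null 2>&1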

With the new replication, however, it seems that by binding the pulling and
installing together, and by specifying the timing only as relative deltas
(as opposed to the absolute, wall-clock times a crontab gives), we've
essentially made it impossible to keep multiple slaves up to date and
synchronized. E.g. if we set the poll interval to 15 minutes, a slight
offset in the startup times of the slaves (which can easily happen due to
arbitrary restarts/maintenance operations) leads to drift in the
snappull(+install) times. This is made worse by the fact that the next poll
is scheduled relative to when the last replication cycle *finished* - and
that number has a higher variance, e.g. due to warmup, which can differ
across machines depending on the queries they've served previously.
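
For reference, this is the slave-side configuration I'm talking about
(masterUrl is just a placeholder, and the interval is the 15-minute example
from above):

    <requestHandler name="/replication" class="solr.ReplicationHandler">
      <lst name="slave">
        <str name="masterUrl">http://master:8983/solr/replication</str>
        <!-- a relative delta between polls, not a wall-clock schedule -->
        <str name="pollInterval">00:15:00</str>
      </lst>
    </requestHandler>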

To summarize, it seems to me it might be beneficial to introduce a second
parameter that acts more like a crontab-style, time-based schedule, i.e.
one that lets the user specify when an actual commit/install should occur.
We could then set pollInterval to a low value (e.g. 60 seconds) but only
perform the commit at the 0, 15, 30 and 45-minute marks of every hour,
which would make the commit times on the slaves fairly deterministic.
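
For example (installSchedule is a name I'm making up purely to illustrate
the idea - nothing like it exists today):

    <lst name="slave">
      <str name="masterUrl">http://master:8983/solr/replication</str>
      <!-- poll often, so a new snapshot is always downloaded and ready -->
      <str name="pollInterval">00:01:00</str>
      <!-- hypothetical: only install/commit the fetched index
           at these wall-clock times -->
      <str name="installSchedule">0,15,30,45 * * * *</str>
    </lst>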

Does this make sense, or am I missing something about the current
in-process replication?

Thanks,
-Chak


Shalin Shekhar Mangar wrote:
> 
> On Fri, Aug 14, 2009 at 8:39 AM, KaktuChakarabati
> <jimmoe...@gmail.com> wrote:
> 
>>
>> In the old replication, I could snappull with multiple slaves
>> asynchronously
>> but perform the snapinstall on each at the same time (+- epsilon
>> seconds),
>> so that way production load balanced query serving will always be
>> consistent.
>>
>> With the new system it seems that i have no control over syncing them,
>> but
>> rather it polls every few minutes and then decides the next cycle based
>> on
>> last time it *finished* updating, so in any case I lose control over the
>> synchronization of snap installation across multiple slaves.
>>
> 
> That is true. How did you synchronize them with the script based solution?
> Assuming network bandwidth is equally distributed and all slaves are equal
> in hardware/configuration, the time difference between new searcher
> registration on any slave should not be more than pollInterval, no?
> 
> 
>>
>> Also, I noticed the default poll interval is 60 seconds. It would seem
>> that
>> for such a rapid interval, what i mentioned above is a non issue, however
>> i
>> am not clear how this works vis-a-vis the new searcher warmup? for a
>> considerable index size (20Million docs+) the warmup itself is an
>> expensive
>> and somewhat lengthy process and if a new searcher opens and warms up
>> every
>> minute, I am not at all sure i'll be able to serve queries with
>> reasonable
>> QTimes.
>>
> 
> If the pollInterval is 60 seconds, it does not mean that a new index is
> fetched every 60 seconds. A new index is downloaded and installed on the
> slave only if a commit happened on the master (i.e. the index was actually
> changed on the master).
> 
> -- 
> Regards,
> Shalin Shekhar Mangar.
> 
> 
