Hey Noble, you are right that this would solve the problem, but it implicitly assumes that commits to the master are infrequent enough that most polls yield no update and only every few polls pick up an actual commit. This is a relatively safe assumption in most cases, but it couples the master's update policy to the performance of the slaves: if the master is updated (and committed to) frequently, the slaves might face a commit on every 1-2 polls, far more often than is feasible given new-searcher warmup times. In effect, this seems to come down to making the master's commit frequency the same as the one I want the slaves to use - markedly different from the previous behaviour, where I could have the master updated (and committed to) at one rate while the slaves installed those updates at a different rate (that older cron-driven schedule is sketched after the quoted thread below, for reference).
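For concreteness, the only scheduling knob the new replication exposes is the relative pollInterval in the slave's solrconfig.xml, roughly as below (the master URL and the 15-minute value are placeholders, not taken from this thread; see the SolrReplication wiki page for the exact syntax):

  <requestHandler name="/replication" class="solr.ReplicationHandler">
    <lst name="slave">
      <!-- master location is an illustrative placeholder -->
      <str name="masterUrl">http://master-host:8983/solr/replication</str>
      <!-- HH:mm:ss, measured relative to the previous cycle - there is no wall-clock schedule -->
      <str name="pollInterval">00:15:00</str>
    </lst>
  </requestHandler>

With replicateAfter set to "commit" on the master, every master commit that a poll detects turns into a download plus a new-searcher warmup on the slave, which is exactly the coupling described above.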
Noble Paul നോബിള് नोब्ळ्-2 wrote:
>
> usually the pollInterval is kept to a small value like 10secs. there is no harm in polling more frequently. This can ensure that the replication happens at almost same time
>
> On Fri, Aug 14, 2009 at 1:58 PM, KaktuChakarabati<jimmoe...@gmail.com> wrote:
>>
>> Hey Shalin,
>> thanks for your prompt reply.
>> To clarity:
>> With the old script-based replication, I would snappull every x minutes (say, on the order of 5 minutes).
>> Assuming no index optimize occured (I optimize 1-2 times a day so we can disregard it for the sake of argument), the snappull would take a few seconds to run on each iteration.
>> I then have a crontab on all slaves that runs snapinstall on a fixed time, lets say every 15 minutes from start of a round hour, inclusive. (slave machine times are synced e.g via ntp) so that essentially all slaves will begin a snapinstall exactly at the same time - assuming uniform load and the fact they all have at this point in time the same snapshot since I snappull frequently - this leads to a fairly synchronized replication across the board.
>>
>> With the new replication however, it seems that by binding the pulling and installing as well specifying the timing in delta's only (as opposed to "absolute-time" based like in crontab) we've essentially made it impossible to effectively keep multiple slaves up to date and synchronized; e.g if we set poll interval to 15 minutes, a slight offset in the startup times of the slaves (that can very much be the case for arbitrary resets/maintenance operations) can lead to deviations in snappull(+install) times. this in turn is further made worse by the fact that the pollInterval is then computed based on the offset of when the last commit *finished* - and this number seems to have a higher variance, e.g due to warmup which might be different across machines based on the queries they've handled previously.
>>
>> To summarize, It seems to me like it might be beneficial to introduce a second parameter that acts more like a crontab time-based tableau, in so far that it can enable a user to specify when an actual commit should occur - so then we can have the pollInterval set to a low value (e.g 60 seconds) but then specify to only perform a commit on the 0,15,30,45-minutes of every hour. this makes the commit times on the slaves fairly deterministic.
>>
>> Does this make sense or am i missing something with current in-process replication?
>>
>> Thanks,
>> -Chak
>>
>> Shalin Shekhar Mangar wrote:
>>>
>>> On Fri, Aug 14, 2009 at 8:39 AM, KaktuChakarabati <jimmoe...@gmail.com> wrote:
>>>
>>>> In the old replication, I could snappull with multiple slaves asynchronously but perform the snapinstall on each at the same time (+- epsilon seconds), so that way production load balanced query serving will always be consistent.
>>>>
>>>> With the new system it seems that i have no control over syncing them, but rather it polls every few minutes and then decides the next cycle based on last time it *finished* updating, so in any case I lose control over the synchronization of snap installation across multiple slaves.
>>>>
>>> That is true. How did you synchronize them with the script based solution?
>>> Assuming network bandwidth is equally distributed and all slaves are equal in hardware/configuration, the time difference between new searcher registration on any slave should not be more then pollInterval, no?
>>>
>>>> Also, I noticed the default poll interval is 60 seconds. It would seem that for such a rapid interval, what i mentioned above is a non issue, however i am not clear how this works vis-a-vis the new searcher warmup? for a considerable index size (20Million docs+) the warmup itself is an expensive and somewhat lengthy process and if a new searcher opens and warms up every minute, I am not at all sure i'll be able to serve queries with reasonable QTimes.
>>>>
>>> If the pollInterval is 60 seconds, it does not mean that a new index is fetched every 60 seconds. A new index is downloaded and installed on the slave only if a commit happened on the master (i.e. the index was actually changed on the master).
>>>
>>> --
>>> Regards,
>>> Shalin Shekhar Mangar.
>>>
>
> --
> -----------------------------------------------------
> Noble Paul | Principal Engineer| AOL | http://aol.com
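For reference, the old script-based scheme described in the quoted thread boils down to a slave-side crontab along these lines (paths and the 5-minute pull interval are illustrative placeholders; the scripts meant are the snappuller/snapinstaller from the old replication setup):

  # clocks on all slaves kept in sync via ntp
  # pull the latest snapshot from the master every 5 minutes
  */5 * * * *         /opt/solr/bin/snappuller
  # install (and commit) the newest snapshot at the same wall-clock minutes on every slave
  0,15,30,45 * * * *  /opt/solr/bin/snapinstaller

It is this separation of "pull whenever" from "install at fixed wall-clock times" that a single relative pollInterval cannot express.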