On Fri, Aug 14, 2009 at 1:48 PM, Jason Rutherglen <jason.rutherg...@gmail.com> wrote:
> This would be good! Especially for NRT where this problem is
> somewhat harder. I think we may need to look at caching readers
> per corresponding http session.

For something like distributed search I was thinking of a simple
reservation mechanism... let the client specify how long to hold open
that version of the index (perhaps still have a max number of open
versions to prevent an errant client from blowing things up).

-Yonik
http://www.lucidimagination.com
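A rough sketch of that reservation idea, purely for illustration: the slave pins one reader per index version, each reservation carries a client-requested lease (capped server-side), and there is a hard limit on how many versions may be pinned at once. None of this exists in Solr; the class and method names below are made up, and even the use of Lucene's IndexReader.incRef()/decRef() reference counting is just an assumption about how the pinning could be done.

import java.io.IOException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

import org.apache.lucene.index.IndexReader;

// Hypothetical reservation table: index version -> pinned reader + lease expiry.
public class ReaderReservations {
  private static final int MAX_OPEN_VERSIONS = 4;    // guard against errant clients
  private static final long MAX_LEASE_MS = 60000L;   // never honor longer leases

  private static class Lease {
    final IndexReader reader;
    long expiresAt;
    Lease(IndexReader reader, long expiresAt) {
      this.reader = reader;
      this.expiresAt = expiresAt;
    }
  }

  private final Map<Long, Lease> open = new HashMap<Long, Lease>();

  // Reserve a currently-open reader for at most leaseMs; false means "refused".
  public synchronized boolean reserve(long version, IndexReader reader, long leaseMs) {
    expireStale();
    if (!open.containsKey(version) && open.size() >= MAX_OPEN_VERSIONS) {
      return false;                                   // too many versions already pinned
    }
    long expiry = System.currentTimeMillis() + Math.min(leaseMs, MAX_LEASE_MS);
    Lease lease = open.get(version);
    if (lease == null) {
      reader.incRef();                                // keep this version open
      open.put(version, new Lease(reader, expiry));
    } else {
      lease.expiresAt = Math.max(lease.expiresAt, expiry);
    }
    return true;
  }

  // Look up the reader a client reserved earlier; null if expired or never reserved.
  public synchronized IndexReader lookup(long version) {
    expireStale();
    Lease lease = open.get(version);
    return lease == null ? null : lease.reader;
  }

  // Drop reservations whose lease ran out, releasing the pinned readers.
  public synchronized void expireStale() {
    long now = System.currentTimeMillis();
    for (Iterator<Lease> it = open.values().iterator(); it.hasNext();) {
      Lease lease = it.next();
      if (lease.expiresAt <= now) {
        try {
          lease.reader.decRef();                      // may close the reader if last ref
        } catch (IOException ignored) {
        }
        it.remove();
      }
    }
  }
}

The lease cap and the MAX_OPEN_VERSIONS limit are the two knobs that would address Jason's concern below about expiring readers before running out of RAM.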
> The pitfall is expiring them
> before running out of RAM.
>
> On Fri, Aug 14, 2009 at 6:34 AM, Yonik Seeley <yo...@lucidimagination.com>
> wrote:
>> Longer term, it might be nice to enable clients to specify what
>> version of the index they were searching against. This could be used
>> to prevent consistency issues across different slaves, even if they
>> commit at different times. It could also be used in distributed
>> search to make sure the index didn't change between phases.
>>
>> -Yonik
>> http://www.lucidimagination.com
>>
>> 2009/8/14 Noble Paul നോബിള് नोब्ळ् <noble.p...@corp.aol.com>:
>>> On Fri, Aug 14, 2009 at 2:28 PM, KaktuChakarabati <jimmoe...@gmail.com>
>>> wrote:
>>>>
>>>> Hey Noble,
>>>> you are right in that this will solve the problem; however, it
>>>> implicitly assumes that commits to the master are infrequent enough
>>>> (so that most polling operations yield no update and only every few
>>>> polls lead to an actual commit).
>>>> This is a relatively safe assumption in most cases, but one that
>>>> couples the master update policy to the performance of the slaves -
>>>> if the master gets updated (and committed to) frequently, slaves
>>>> might face a commit on every 1-2 polls, much more than is feasible
>>>> given new searcher warmup times.
>>>> In effect, what this comes down to, it seems, is that I must make the
>>>> master commit frequency the same as the one I'd want the slaves to
>>>> use - and this is markedly different from the previous behaviour,
>>>> with which I could have the master get updated (+committed to) at one
>>>> rate and the slaves committing those updates at a different rate.
>>> I see the argument. But isn't it better to keep both the master and
>>> slave as consistent as possible? There is no use in committing on the
>>> master if you do not plan to search on those docs. So the best thing
>>> to do is to commit only as frequently as you wish to commit on a
>>> slave.
>>>
>>> On a different track, would it be worth having an option to disable
>>> the commit after replication, so that the user can trigger a commit
>>> explicitly?
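If such a "no automatic commit after replication" option existed (to be clear, it is only being proposed in this thread), the explicit commit could be fired at every slave at the same wall-clock moment from a single cron-driven job. A minimal SolrJ sketch, with made-up slave host names:

import java.io.IOException;
import java.net.MalformedURLException;

import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

// Fires an explicit (empty) commit at each slave so they open new searchers together.
public class CommitAllSlaves {
  private static final String[] SLAVES = {          // hypothetical slave URLs
      "http://slave1:8983/solr", "http://slave2:8983/solr", "http://slave3:8983/solr"
  };

  public static void main(String[] args) throws MalformedURLException {
    for (String url : SLAVES) {
      CommonsHttpSolrServer slave = new CommonsHttpSolrServer(url);
      try {
        slave.commit();                             // opens a new searcher on the slave
      } catch (SolrServerException e) {
        System.err.println("commit failed on " + url + ": " + e);
      } catch (IOException e) {
        System.err.println("commit failed on " + url + ": " + e);
      }
    }
  }
}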
>>>>
>>>> Noble Paul നോബിള് नोब्ळ्-2 wrote:
>>>>>
>>>>> Usually the pollInterval is kept to a small value like 10 secs. There
>>>>> is no harm in polling more frequently. This can ensure that the
>>>>> replication happens at almost the same time.
>>>>>
>>>>> On Fri, Aug 14, 2009 at 1:58 PM, KaktuChakarabati <jimmoe...@gmail.com>
>>>>> wrote:
>>>>>>
>>>>>> Hey Shalin,
>>>>>> thanks for your prompt reply.
>>>>>> To clarify:
>>>>>> With the old script-based replication, I would snappull every x
>>>>>> minutes (say, on the order of 5 minutes). Assuming no index optimize
>>>>>> occurred (I optimize 1-2 times a day, so we can disregard it for the
>>>>>> sake of argument), the snappull would take a few seconds to run on
>>>>>> each iteration. I then have a crontab on all slaves that runs
>>>>>> snapinstall at a fixed time, let's say every 15 minutes from the
>>>>>> start of a round hour, inclusive (slave machine times are synced,
>>>>>> e.g. via NTP), so that essentially all slaves begin a snapinstall at
>>>>>> exactly the same time - and since I snappull frequently, they all
>>>>>> hold the same snapshot at that point - which, assuming uniform load,
>>>>>> leads to fairly synchronized replication across the board.
>>>>>>
>>>>>> With the new replication, however, it seems that by binding the
>>>>>> pulling and installing together, and by specifying the timing only
>>>>>> as deltas (as opposed to "absolute-time" based, like in crontab),
>>>>>> we've essentially made it impossible to keep multiple slaves up to
>>>>>> date and synchronized; e.g. if we set the poll interval to 15
>>>>>> minutes, a slight offset in the startup times of the slaves (which
>>>>>> can very much happen after arbitrary resets/maintenance operations)
>>>>>> can lead to deviations in snappull(+install) times. This in turn is
>>>>>> made worse by the fact that the pollInterval is computed from when
>>>>>> the last commit *finished* - and that number seems to have a higher
>>>>>> variance, e.g. due to warmup, which may differ across machines
>>>>>> depending on the queries they've handled previously.
>>>>>>
>>>>>> To summarize, it seems to me it might be beneficial to introduce a
>>>>>> second parameter that acts more like a crontab time-based tableau,
>>>>>> insofar as it lets a user specify when an actual commit should occur
>>>>>> - so we can have the pollInterval set to a low value (e.g. 60
>>>>>> seconds) but perform a commit only at the 0-, 15-, 30- and 45-minute
>>>>>> marks of every hour. This makes the commit times on the slaves
>>>>>> fairly deterministic.
>>>>>>
>>>>>> Does this make sense, or am I missing something about the current
>>>>>> in-process replication?
>>>>>>
>>>>>> Thanks,
>>>>>> -Chak
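What Chak is describing amounts to "sleep until the next absolute boundary" instead of "sleep pollInterval after the last cycle finished". A sketch of that scheduling math (not an existing Solr option; the class and the install step are placeholders):

import java.util.Calendar;
import java.util.Date;

// Wakes up at the 0/15/30/45-minute marks of every hour instead of a rolling interval.
public class QuarterHourSchedule {

  // Milliseconds from nowMillis until the next periodMinutes boundary of the hour.
  static long millisToNextBoundary(long nowMillis, int periodMinutes) {
    long period = periodMinutes * 60L * 1000L;
    Calendar cal = Calendar.getInstance();
    cal.setTimeInMillis(nowMillis);
    cal.set(Calendar.MINUTE, 0);
    cal.set(Calendar.SECOND, 0);
    cal.set(Calendar.MILLISECOND, 0);
    long topOfHour = cal.getTimeInMillis();
    long next = topOfHour + ((nowMillis - topOfHour) / period + 1) * period;
    return next - nowMillis;
  }

  public static void main(String[] args) throws InterruptedException {
    while (true) {
      Thread.sleep(millisToNextBoundary(System.currentTimeMillis(), 15));
      // pull + install (or commit) would happen here, then wait for the next mark
      System.out.println("install tick at " + new Date());
    }
  }
}

Because every slave computes the same boundary from NTP-synced clocks, installs line up to within seconds regardless of when each process was started.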
>>>>>>
>>>>>> Shalin Shekhar Mangar wrote:
>>>>>>>
>>>>>>> On Fri, Aug 14, 2009 at 8:39 AM, KaktuChakarabati
>>>>>>> <jimmoe...@gmail.com> wrote:
>>>>>>>
>>>>>>>> In the old replication, I could snappull with multiple slaves
>>>>>>>> asynchronously but perform the snapinstall on each at the same
>>>>>>>> time (+/- epsilon seconds), so that production load-balanced query
>>>>>>>> serving will always be consistent.
>>>>>>>>
>>>>>>> That is true. How did you synchronize them with the script-based
>>>>>>> solution?
>>>>>>> Assuming network bandwidth is equally distributed and all slaves
>>>>>>> are equal in hardware/configuration, the time difference between
>>>>>>> new searcher registration on any slave should not be more than the
>>>>>>> pollInterval, no?
>>>>>>>
>>>>>>>> Also, I noticed the default poll interval is 60 seconds. It would
>>>>>>>> seem that for such a rapid interval, what I mentioned above is a
>>>>>>>> non-issue; however, I am not clear how this works vis-a-vis the
>>>>>>>> new searcher warmup. For a considerable index size (20 million
>>>>>>>> docs+), the warmup itself is an expensive and somewhat lengthy
>>>>>>>> process, and if a new searcher opens and warms up every minute, I
>>>>>>>> am not at all sure I'll be able to serve queries with reasonable
>>>>>>>> QTimes.
>>>>>>>>
>>>>>>> If the pollInterval is 60 seconds, it does not mean that a new
>>>>>>> index is fetched every 60 seconds. A new index is downloaded and
>>>>>>> installed on the slave only if a commit happened on the master
>>>>>>> (i.e. the index was actually changed on the master).
>>>>>>>
>>>>>>> --
>>>>>>> Regards,
>>>>>>> Shalin Shekhar Mangar.
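To make Shalin's point concrete: each poll is only a cheap version comparison, and the expensive download plus new-searcher warmup happens only when the master has actually committed. The sketch below uses invented interfaces; the real ReplicationHandler's wire protocol and internals differ.

// Simplified poll logic: cheap version check every pollInterval, costly work only on change.
public class PollLoopSketch {

  interface Master {                        // stand-in for the HTTP calls to the master
    long latestCommittedVersion();
    void downloadChangedFilesTo(Slave slave);
  }

  interface Slave {
    long installedVersion();
    void installAndOpenNewSearcher(long newVersion);   // the expensive warmup step
  }

  static void pollOnce(Master master, Slave slave) {
    long remote = master.latestCommittedVersion();
    if (remote == slave.installedVersion()) {
      return;                               // no commit on the master: the poll is a no-op
    }
    master.downloadChangedFilesTo(slave);   // only the changed index files
    slave.installAndOpenNewSearcher(remote);
  }
}

So with a 60-second pollInterval and a master that commits every 15 minutes, a slave still warms a new searcher only four times an hour.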