Hey Noble, you are right that this would solve the problem, but it implicitly assumes that commits to the master are infrequent enough that most polls yield no update and only every few polls pick up an actual commit. This is a relatively safe assumption in most cases, but it couples the master's update policy to the performance of the slaves: if the master is updated (and committed to) frequently, the slaves might face a commit on every 1-2 polls, far more often than is feasible given new-searcher warmup times. In effect, this seems to come down to making the master's commit frequency the same as the one I want the slaves to use - markedly different from the previous behaviour, where I could have the master updated (and committed to) at one rate while the slaves installed those updates at a different rate (that older cron-driven schedule is sketched after the quoted thread below, for reference).
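For concreteness, the only scheduling knob the new replication exposes is the relative pollInterval in the slave's solrconfig.xml, roughly as below (the master URL and the 15-minute value are placeholders, not taken from this thread; see the SolrReplication wiki page for the exact syntax):

  <requestHandler name="/replication" class="solr.ReplicationHandler">
    <lst name="slave">
      <!-- master location is an illustrative placeholder -->
      <str name="masterUrl">http://master-host:8983/solr/replication</str>
      <!-- HH:mm:ss, measured relative to the previous cycle - there is no wall-clock schedule -->
      <str name="pollInterval">00:15:00</str>
    </lst>
  </requestHandler>

With replicateAfter set to "commit" on the master, every master commit that a poll detects turns into a download plus a new-searcher warmup on the slave, which is exactly the coupling described above.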
Noble Paul നോബിള് नोब्ळ्-2 wrote:
>
> usually the pollInterval is kept to a small value like 10secs. there is no harm in polling more frequently. This can ensure that the replication happens at almost same time
>
> On Fri, Aug 14, 2009 at 1:58 PM, KaktuChakarabati<jimmoe...@gmail.com> wrote:
>>
>> Hey Shalin,
>> thanks for your prompt reply.
>> To clarity:
>> With the old script-based replication, I would snappull every x minutes (say, on the order of 5 minutes).
>> Assuming no index optimize occured (I optimize 1-2 times a day so we can disregard it for the sake of argument), the snappull would take a few seconds to run on each iteration.
>> I then have a crontab on all slaves that runs snapinstall on a fixed time, lets say every 15 minutes from start of a round hour, inclusive. (slave machine times are synced e.g via ntp) so that essentially all slaves will begin a snapinstall exactly at the same time - assuming uniform load and the fact they all have at this point in time the same snapshot since I snappull frequently - this leads to a fairly synchronized replication across the board.
>>
>> With the new replication however, it seems that by binding the pulling and installing as well specifying the timing in delta's only (as opposed to "absolute-time" based like in crontab) we've essentially made it impossible to effectively keep multiple slaves up to date and synchronized; e.g if we set poll interval to 15 minutes, a slight offset in the startup times of the slaves (that can very much be the case for arbitrary resets/maintenance operations) can lead to deviations in snappull(+install) times. this in turn is further made worse by the fact that the pollInterval is then computed based on the offset of when the last commit *finished* - and this number seems to have a higher variance, e.g due to warmup which might be different across machines based on the queries they've handled previously.
>>
>> To summarize, It seems to me like it might be beneficial to introduce a second parameter that acts more like a crontab time-based tableau, in so far that it can enable a user to specify when an actual commit should occur - so then we can have the pollInterval set to a low value (e.g 60 seconds) but then specify to only perform a commit on the 0,15,30,45-minutes of every hour. this makes the commit times on the slaves fairly deterministic.
>>
>> Does this make sense or am i missing something with current in-process replication?
>>
>> Thanks,
>> -Chak
>>
>> Shalin Shekhar Mangar wrote:
>>>
>>> On Fri, Aug 14, 2009 at 8:39 AM, KaktuChakarabati <jimmoe...@gmail.com> wrote:
>>>
>>>> In the old replication, I could snappull with multiple slaves asynchronously but perform the snapinstall on each at the same time (+- epsilon seconds), so that way production load balanced query serving will always be consistent.
>>>>
>>>> With the new system it seems that i have no control over syncing them, but rather it polls every few minutes and then decides the next cycle based on last time it *finished* updating, so in any case I lose control over the synchronization of snap installation across multiple slaves.
>>>>
>>> That is true. How did you synchronize them with the script based solution?
>>> Assuming network bandwidth is equally distributed and all slaves are equal in hardware/configuration, the time difference between new searcher registration on any slave should not be more then pollInterval, no?
>>>
>>>> Also, I noticed the default poll interval is 60 seconds. It would seem that for such a rapid interval, what i mentioned above is a non issue, however i am not clear how this works vis-a-vis the new searcher warmup? for a considerable index size (20Million docs+) the warmup itself is an expensive and somewhat lengthy process and if a new searcher opens and warms up every minute, I am not at all sure i'll be able to serve queries with reasonable QTimes.
>>>>
>>> If the pollInterval is 60 seconds, it does not mean that a new index is fetched every 60 seconds. A new index is downloaded and installed on the slave only if a commit happened on the master (i.e. the index was actually changed on the master).
>>>
>>> --
>>> Regards,
>>> Shalin Shekhar Mangar.
>>>
>
> --
> -----------------------------------------------------
> Noble Paul | Principal Engineer| AOL | http://aol.com
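For reference, the old script-based scheme described in the quoted thread boils down to a slave-side crontab along these lines (paths and the 5-minute pull interval are illustrative placeholders; the scripts meant are the snappuller/snapinstaller from the old replication setup):

  # clocks on all slaves kept in sync via ntp
  # pull the latest snapshot from the master every 5 minutes
  */5 * * * *         /opt/solr/bin/snappuller
  # install (and commit) the newest snapshot at the same wall-clock minutes on every slave
  0,15,30,45 * * * *  /opt/solr/bin/snapinstaller

It is this separation of "pull whenever" from "install at fixed wall-clock times" that a single relative pollInterval cannot express.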