On Aug 11, 2011, at 9:53 AM, eks dev wrote:

> Thinking aloud and grateful for sparing ..
>
> I need to support high commit rate (low update latency) in a master
> slave setup and I have a bad feelings about it, even with disabling
> warmup and stripping everything down that slows down refresh.

I think some people try to do this, and a lot will depend on your index size, but in the past I have rarely heard of anyone hoping for sub-minute latency. It all depends, I suppose. Replication and NRT don't go well together. With the size of the index given below, I don't see you doing well at all here.

>
> I will try it anyway, but I started thinking about "backup plan", like
> NRT on slaves.
>
> An idea is to have Master working on disk, doing commits in throughput
> friendly manner (e.g. every 5-10 minutes), but let slaves do the same
> updates with softCommit
>
> I am basically going to let slaves "possibly run out of sync" with
> master, by issuing the same updates on all slaves with softCommit ...
> every now and than syncing with Master.

Nice idea!

>
> Could this work? the trick is, index is big (can fit in Ca. 16-20G
> Ram), but update rate is small and ugly distributed in time (every
> couple of seconds a few documents), one hard commit on master + slave
> update would probably cost much more than add(document) with
> softCommit on every slave (2-5 of them)
>
> So all in all, master remains real master and is there to ensure:
> a ) seeding if slave restarts
> b) authoritative index master, if slaves run out of sync (small diff
> is ok if they get corrected once a day)

Seems like it would work fine to me. Supporting deletes might make it a bit more dicey, though...

>
> In general, do you find such idea wrong for some reason, should I be
> doing something else/better to achieve low update latency in master
> slave (for low update throughput)?

Sounds like your best bet on the low-hanging-fruit tree. It seems a few of us are going to make a push on the so-called "Solr Cloud" indexing side shortly. Sending docs individually to each machine is the only decent hope I know of for distributed NRT (other than hot/cold time series, index juggling, and the like).

>
> Anything I can do to make standard master slave latency better apart
> from disabling warmup? Would loading os ramdisk (tmpfs forced in ram)
> on slaves bring much.

I haven't tried this with replication... *if* it avoids polluting the filesystem cache with index file copy operations, that would seem nice. I can only guess, but I'd certainly be surprised if it improved replication speed enough to get you to NRT.

>
> I am talking about Ca. 1 second (plus/minus) update latency target
> from update to search on slave... But not more than 0.5 - 2 updates
> every second. And what I so far understood how solr works, this is
> going to be possible only with NRT on slaves (Analysis in my case is
> fast, so not an issue)...

Yeah, it should be a breeze - just set the soft auto commit to a second or so on each slave, set the hard auto commit to X on the master, and follow your plan above.

- Mark Miller
lucidimagination.com
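
For illustration only, here is a rough Python sketch of the "send each doc to every machine" plan discussed above. It is a sketch under stated assumptions, not a tested recipe: the host names are made up, the `/update/json` path and the `softCommit=true` request parameter are assumptions that may differ in your Solr version, and `plan_updates` / `send` are hypothetical helper names. The master gets a plain add (its hard auto commit makes it durable later), while each slave gets the same add with a soft commit requested.

```python
import json
import urllib.request
from urllib.parse import urlencode


def plan_updates(docs, master, slaves, update_path="/update/json"):
    """Return (url, body) pairs: one plain add for the master,
    one add with a soft commit requested for every slave.

    update_path and the softCommit parameter are assumptions; check
    them against the Solr version you actually run.
    """
    body = json.dumps(docs)
    # Master: no commit parameter, the hard auto commit handles durability.
    plan = [(master + update_path, body)]
    for slave in slaves:
        # Slaves: ask for a soft commit so the docs become searchable quickly.
        query = urlencode({"softCommit": "true"})
        plan.append((f"{slave}{update_path}?{query}", body))
    return plan


def send(plan, timeout=5):
    """POST every planned request; a failing slave only drifts out of
    sync until the next full sync with the master."""
    failed = []
    for url, body in plan:
        req = urllib.request.Request(
            url,
            data=body.encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        try:
            urllib.request.urlopen(req, timeout=timeout).close()
        except OSError:
            failed.append(url)  # remember who missed the update
    return failed


if __name__ == "__main__":
    # Hypothetical hosts, for illustration only.
    plan = plan_updates(
        [{"id": "doc-1", "title": "hello"}],
        "http://master:8983/solr",
        ["http://slave1:8983/solr", "http://slave2:8983/solr"],
    )
    for url, _ in plan:
        print(url)
```

If you set the soft auto commit in the slave config instead, as suggested above, the explicit `softCommit` parameter could presumably be dropped and the slaves would receive the same plain add as the master.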