: We have an index around 25-30G w/ 1 master and 5 slaves. We perform
: replication every 30 mins. During replication the disk I/O obviously shoots up
: on the slaves to the point where all requests routed to that slave take a
: really long time... sometimes to the point of timing out.
:
: Are there any logical or physical changes we could make to our architecture to
: overcome this problem?
If the problem really is disk I/O, then perhaps you don't have enough RAM set aside for the filesystem cache to keep the "current" index in memory?

I've seen people have this type of problem before, but usually it's network I/O that is the bottleneck, in which case using multiple NICs on your slaves (one for client requests, one for replication) can help.

I think at one point there was also talk about leveraging an rsync option to force snappuller to throttle itself and only use a max amount of bandwidth -- but then we moved away from script-based replication to Java-based replication, and I don't think the Java Network/IO system supports that type of throttling. However: you might be able to configure it in your switches/routers (i.e. only let the slaves use X% of their total bandwidth to talk to the master).

-Hoss
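
A quick way to sanity-check the filesystem-cache theory on a slave is to compare the on-disk index size against the memory the OS has available for caching. Below is a rough sketch, not a definitive diagnostic: it assumes Linux (it reads /proc/meminfo), and the index path is a hypothetical placeholder you'd point at wherever your slave keeps its data.

    #!/usr/bin/env python
    # Rough check (Linux only): can the OS filesystem cache hold the whole index?
    import os

    INDEX_DIR = "/var/solr/data/index"   # hypothetical path; adjust to your install

    def dir_size_bytes(path):
        # Sum the sizes of all files under the index directory.
        total = 0
        for root, _, files in os.walk(path):
            for name in files:
                total += os.path.getsize(os.path.join(root, name))
        return total

    def mem_available_bytes():
        # MemAvailable exists on newer kernels; fall back to MemFree + Cached.
        info = {}
        with open("/proc/meminfo") as f:
            for line in f:
                key, value = line.split(":", 1)
                info[key] = int(value.strip().split()[0]) * 1024  # values are in kB
        return info.get("MemAvailable", info.get("MemFree", 0) + info.get("Cached", 0))

    index = dir_size_bytes(INDEX_DIR)
    avail = mem_available_bytes()
    print("index size:                %.1f GB" % (index / 1e9))
    print("memory available to cache: %.1f GB" % (avail / 1e9))
    print("index fits in cache:       %s" % (avail > index))

If the index is much larger than what's left over after the JVM heap and everything else, every replication run will evict hot index files from the page cache and the slave has to re-read them from disk while serving queries, which matches the symptoms described.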
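
As for the script-based throttling idea mentioned above: rsync does have a --bwlimit option (a cap in KB per second), so a snappuller-style pull could be rate-limited on the slave side. The sketch below is purely illustrative -- the host, paths, and limit are made up, and it just wraps rsync from Python rather than reproducing what snappuller actually does.

    #!/usr/bin/env python
    # Hypothetical throttled index pull: cap the copy at ~20 MB/s so a
    # replication run can't saturate the slave's network or disk.
    import subprocess

    MASTER = "master.example.com"                       # hypothetical master host
    REMOTE_INDEX = "/var/solr/data/snapshot.current/"   # hypothetical snapshot dir
    LOCAL_INDEX = "/var/solr/data/index.tmp/"           # staging dir on the slave

    subprocess.check_call([
        "rsync", "-a", "--delete",
        "--bwlimit=20000",   # rsync's own throttle, in KB per second (~20 MB/s)
        "%s:%s" % (MASTER, REMOTE_INDEX),
        LOCAL_INDEX,
    ])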