Re: Replication slows down massively during high load

Shawn Heisey Wed, 16 Mar 2011 17:10:16 -0700

On 3/16/2011 7:56 AM, Vadim Kisselmann wrote:

If the load is low, both slaves replicate with around 100MB/s from master.


But when I use Solrmeter (100-400 queries/min) for load tests (over
the load balancer), the replication slows down to an unacceptable
speed, around 100KB/s (at least that's what the replication page on
/solr/admin says).

<snip>

- Same hardware for all servers: Physical machines with quad core
CPUs, 24GB RAM (JVM starts up with -XX:+UseConcMarkSweepGC -Xms10G
-Xmx10G)
- Index size is about 100GB with 40M docs


Primary assumption:  You have a 64-bit OS and a 64-bit JVM.

It sounds to me like you're I/O bound, because your machine cannot keep enough of your index in RAM. Relative to your 100GB index, you only have a maximum of 14GB of RAM available to the OS disk cache, since Java's heap size is 10GB. How much disk space do all of the index files that end in "x" take up? I would venture a guess that it's significantly more than 14GB. On Linux, you could run this command in the index directory to tally it quickly:


du -hc *x
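
If du isn't handy, the same tally can be sketched in Python. This is only an illustration of the check described above, not anything that ships with Solr: the function names (tally_suffix_bytes, fits_in_cache) are made up for this example, and it sums apparent file sizes, whereas du reports disk blocks used, so the two totals can differ slightly.

```python
import os


def tally_suffix_bytes(index_dir, suffix="x"):
    """Sum the sizes (in bytes) of regular files in index_dir whose names end in suffix."""
    total = 0
    for entry in os.scandir(index_dir):
        # Only count plain files; subdirectories are skipped.
        if entry.is_file() and entry.name.endswith(suffix):
            total += entry.stat().st_size
    return total


def fits_in_cache(file_bytes, cache_bytes):
    """True if the tallied files could fit entirely in the given cache budget."""
    return file_bytes <= cache_bytes


if __name__ == "__main__":
    # Assumption: run from inside the index directory, like the du command.
    # 14GB is the upper bound on OS disk cache discussed in this thread.
    cache_budget = 14 * 1024**3
    total = tally_suffix_bytes(".")
    print(f"files ending in 'x': {total / 1024**3:.2f} GB")
    print("fits in 14GB disk cache:", fits_in_cache(total, cache_budget))
```

The 14GB budget is an upper bound, since the OS and any other processes also need some of that RAM.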

If you installed enough RAM so the disk cache can be much larger than the total size of those files ending in "x", you'd probably stop having these performance issues. Realizing that this is a [...] Alternatively, you could take steps to reduce the size of your index, or perhaps add more machines to go distributed.

My own index is distributed and replicated. I've got nearly 53 million documents and a total index size of 95GB. This is split into six shards that each are nearly 16GB. Running that du command I gave you above, the total on one shard is 2.5GB, and there is 7GB of RAM available for the OS cache.
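
To make the comparison concrete, here is the arithmetic from this thread as a small Python sketch. The helper name max_disk_cache_gb is made up for illustration, and the result is an upper bound because it assumes the JVM heap is the only significant consumer of RAM. The size of the "x" files on the 100GB index is not known here, so only the cache-to-index fraction is computed for that setup.

```python
def max_disk_cache_gb(total_ram_gb, jvm_heap_gb):
    """Upper bound on RAM left for the OS disk cache once the JVM heap is taken out."""
    return total_ram_gb - jvm_heap_gb


# Slave from the original question: 24GB RAM, 10GB heap, 100GB index.
slave_cache_gb = max_disk_cache_gb(24, 10)    # 14
slave_index_fraction = slave_cache_gb / 100   # at most 14% of the index can be cached

# Sharded setup: 7GB of OS cache against 2.5GB of files ending in "x" per shard.
shard_cache_ratio = 7 / 2.5                   # cache is 2.8 times the size of those files

print(f"slave: {slave_cache_gb}GB cache, {slave_index_fraction:.0%} of index")
print(f"shard: cache is {shard_cache_ratio:.1f}x the size of the 'x' files")
```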


NB: I could be completely wrong about the source of the problem.

Thanks,
Shawn
