On 3/16/2011 7:56 AM, Vadim Kisselmann wrote:
If the load is low, both slaves replicate with around 100MB/s from master.

But when I use Solrmeter (100-400 queries/min) for load tests (over
the load balancer), the replication slows down to an unacceptable
speed, around 100KB/s (at least that's what the replication page on
/solr/admin says).
<snip>
- Same hardware for all servers: Physical machines with quad core
CPUs, 24GB RAM (JVM starts up with -XX:+UseConcMarkSweepGC -Xms10G
-Xmx10G)
- Index size is about 100GB with 40M docs

Primary assumption:  You have a 64-bit OS and a 64-bit JVM.

It sounds to me like you're I/O bound, because your machine cannot keep enough of your index in RAM. Relative to your 100GB index, you only have a maximum of 14GB of RAM available to the OS disk cache, since Java's heap size is 10GB. How much disk space do all of the index files that end in "x" take up? I would venture a guess that it's significantly more than 14GB. On Linux, you could run this command in the index directory to tally it quickly:

du -hc *x

If you installed enough RAM so the disk cache can be much larger than the total size of those files ending in "x", you'd probably stop having these performance issues. Alternatively, you could take steps to reduce the size of your index, or perhaps add more machines to go distributed.
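
If you want the comparison as raw numbers rather than du's rounded output, here is a minimal sketch of the same check. The function names x_file_bytes and cache_gb are made up for illustration, and the index path in the usage comment is only a placeholder for wherever your index actually lives.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Solr.

# Print the total size in bytes of the regular files in directory $1
# whose names end in "x".
x_file_bytes() {
  dir=$1
  total=0
  for f in "$dir"/*x; do
    [ -f "$f" ] || continue      # skips the literal pattern if nothing matched
    size=$(wc -c < "$f")         # byte count of this one file
    total=$((total + size))
  done
  echo "$total"
}

# Print the upper bound on RAM left for the OS disk cache, in GB:
# physical RAM ($1) minus the JVM heap ($2).
cache_gb() {
  echo $(( $1 - $2 ))
}

# Example usage (the path is a placeholder):
#   x_file_bytes /path/to/solr/data/index
#   cache_gb 24 10
```

With the numbers from your message, cache_gb 24 10 gives the 14GB ceiling mentioned above. The real figure will be lower, because the OS and any other processes on the machine also need memory.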

My own index is distributed and replicated. I've got nearly 53 million documents and a total index size of 95GB. This is split into six shards of nearly 16GB each. Running the du command I gave above, the total on one shard is 2.5GB, and there is 7GB of RAM available for the OS cache.

NB: I could be completely wrong about the source of the problem.

Thanks,
Shawn
