On 3/16/2011 7:56 AM, Vadim Kisselmann wrote:
> If the load is low, both slaves replicate with around 100MB/s from master.
> But when I use Solrmeter (100-400 queries/min) for load tests (over
> the load balancer), the replication slows down to an unacceptable
> speed, around 100KB/s (at least that's what the replication page on
> /solr/admin says).
> <snip>
> - Same hardware for all servers: Physical machines with quad core
> CPUs, 24GB RAM (JVM starts up with -XX:+UseConcMarkSweepGC -Xms10G
> -Xmx10G)
> - Index size is about 100GB with 40M docs
Primary assumption: You have a 64-bit OS and a 64-bit JVM.
It sounds to me like you're I/O bound, because your machine cannot keep
enough of your index in RAM. Relative to your 100GB index, you only
have a maximum of 14GB of RAM available to the OS disk cache, since
Java's heap size is 10GB. How much disk space do all of the index files
that end in "x" take up? I would venture a guess that it's
significantly more than 14GB. On Linux, you could run this command in
the index directory to tally it quickly:
du -hc *x
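
If a shell isn't handy, the same tally can be sketched in Python. This is
only a rough equivalent of the du command above: it sums apparent file
sizes rather than disk blocks, so the totals can differ slightly. The names
tally_x_files and human are made up for illustration.

```python
import os


def tally_x_files(index_dir):
    """Return the total size in bytes of regular files in index_dir
    whose names end in 'x' (hypothetical helper, not part of Solr)."""
    total = 0
    with os.scandir(index_dir) as entries:
        for entry in entries:
            # Only count plain files directly in the index directory.
            if entry.is_file() and entry.name.endswith("x"):
                total += entry.stat().st_size
    return total


def human(nbytes):
    """Format a byte count in GiB so it can be compared against free RAM."""
    return f"{nbytes / 2**30:.2f} GiB"
```

Calling human(tally_x_files(path)) with path set to your own index
directory gives a figure to hold up against the RAM left for the OS disk
cache.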
If you installed enough RAM so the disk cache can be much larger than
the total size of those files ending in "x", you'd probably stop having
these performance issues. Alternatively, you
could take steps to reduce the size of your index, or perhaps add more
machines to go distributed.
My own index is distributed and replicated. I've got nearly 53 million
documents and a total index size of 95GB. This is split into six shards
that are each nearly 16GB. Running that du command I gave you above,
the total on one shard is 2.5GB, and there is 7GB of RAM available for
the OS cache.
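
The arithmetic behind that comparison, as a minimal sketch. It assumes
the only large memory consumer besides the OS cache is the Java heap, so
the cache figure is an upper bound (JVM overhead beyond the heap and other
processes will eat into it). The function names are made up for
illustration.

```python
def max_os_cache_gb(total_ram_gb, jvm_heap_gb):
    """Upper bound on RAM left for the OS disk cache, in GB.
    Assumes the JVM heap is the only other large consumer."""
    return total_ram_gb - jvm_heap_gb


def fits_in_cache(x_files_gb, os_cache_gb):
    """True if the files ending in 'x' are smaller than the OS cache."""
    return x_files_gb < os_cache_gb


# Machine from the quoted message: 24GB RAM, 10GB heap.
print(max_os_cache_gb(24, 10))  # 14

# One of my shards: 2.5GB of "x" files, 7GB available for the OS cache.
print(fits_in_cache(2.5, 7))  # True
```

Plug in your own du total as the first argument to fits_in_cache to see
which side of the line your servers fall on.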
NB: I could be completely wrong about the source of the problem.
Thanks,
Shawn