On 7/4/2018 3:32 AM, Arturas Mazeika wrote:
Details:
I am benchmarking a SolrCloud setup on a single machine (an Intel i7
with 8 "CPU cores", an SSD as well as an HDD) using the German
Wikipedia collection. I created a 4-node, 4-shard, replication factor
2 cluster on the same machine (and managed to push the CPU or SSD to
the hardware limits, i.e., ~200MB/s, ~100% CPU). Now I want to see
what happens if I push the HDD to its limits. Indexing the files from
the SSD (I am able to scan the collection at an actual rate of
400-500MB/s) with 16 threads, I tried to send the documents to the
Solr cluster with all indexes on the HDD.
<snip>
- 4 cores running 2gb ram
If this is saying that the machine running Solr has 2GB of installed
memory, that's going to be a serious problem.
The default heap size that Solr starts with is 512MB. With 4 Solr nodes
running on the machine, each with a 512MB heap, all of your 2GB of
memory is going to be required by the heaps. Java requires memory
beyond the heap to run. Your operating system and its other processes
will also require some memory.
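If you want to set the heap explicitly instead of relying on the 512MB
default, a minimal sketch follows. It assumes the stock bin/solr start
script; the port and solr home path are placeholders for one of your
four nodes, so adjust them per node:

  # Start one SolrCloud node with an explicit 512MB heap (-m sets both
  # -Xms and -Xmx). Repeat per node with its own port and solr home.
  bin/solr start -c -p 8983 -s /path/to/node1/solr -m 512m

  # Or set SOLR_HEAP="512m" in solr.in.sh so every node started from
  # that install gets the same heap.

Raising that value only helps if the machine actually has the memory
to back it, which is the real problem here.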
This means that not only will you have no memory left for the OS disk
cache, you're actually going to be allocating MORE than the 2GB of
installed memory, which means the OS is going to start swapping to
accommodate memory allocations.
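As a rough back-of-the-envelope check (the overhead figures below are
assumptions for illustration, not measurements from your system):

  4 heaps:              4 x 512MB                 = 2048MB
  JVM overhead:         4 x ~100-200MB (assumed)  =  400-800MB
  OS + other processes (assumed)                  =  a few hundred MB
  -------------------------------------------------------------------
  total demand:         well over the 2GB installed, with nothing
                        left for the disk cache

On Linux you can confirm whether the box is already swapping with
standard tools, e.g.:

  # current memory and swap usage
  free -h
  # watch the si/so (swap in/out) columns over time
  vmstat 5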
When you don't have enough memory for good disk caching, Solr
performance is absolutely terrible. When Solr has to wait for data to
be read off of disk, even if that disk is an SSD, its performance will
not be good.
When the OS starts swapping, the performance of ANY software on the
system drops SIGNIFICANTLY.
You need a lot more memory than 2GB on your server.
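A purely hypothetical sizing illustration (the index size here is an
assumption, not something you mentioned):

  heaps:        4 x 512MB                         =  2GB
  OS + JVM overhead                               ~  1GB
  disk cache:   useful fraction of the index
                (e.g. 10-20GB if the total index is ~40GB, assumed)
  -------------------------------------------------------------------
  comfortable total: somewhere in the 8-16GB range, not 2GB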
Thanks,
Shawn