On 10/26/2018 9:55 AM, Sofiya Strochyk wrote:
> We have a SolrCloud setup with the following configuration:
I'm late to this party. You've gotten some good replies already. I
hope I can add something useful.
> * 4 nodes (3x128GB RAM Intel Xeon E5-1650v2, 1x64GB RAM Intel Xeon
>   E5-1650v2, 12 cores, with SSDs)
> * One collection, 4 shards, each has only a single replica (so 4
>   replicas in total), using compositeId router
> * Total index size is about 150M documents/320GB, so about 40M/80GB
>   per node
With 80GB of index data on a node that has only 64GB of memory, that
node's shard cannot be fully cached. Assuming nothing besides Solr runs
on these servers and the total Java heap on each system is 8GB, that
leaves roughly 56GB of memory to cache 80GB of index data. Performance
might be good, or it might be terrible; there's no way to predict it
reliably.
> * Heap size is set to 8GB.
I'm not sure that an 8GB heap is large enough, especially given what
you said later about experiencing OOM and seeing a lot of full GCs.
If properly tuned, the G1 collector is overall more efficient than CMS,
but CMS can be quite good. If GC is not working well with CMS, chances
are that switching to G1 will not help. The root problem is likely to
be something that a different collector can't fix -- like the heap being
too small.
I wrote the page you referenced for GC tuning. I have *never* had a
single problem using G1 with Solr.
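
If you do try G1, the place for the settings is the GC_TUNE variable
in solr.in.sh. This is only a sketch of the kind of flags involved,
not necessarily the exact values from that wiki page, and the region
size and pause target have to be tuned against your own GC logs:

  # solr.in.sh -- illustrative G1 settings, test before relying on them
  GC_TUNE="-XX:+UseG1GC \
    -XX:+ParallelRefProcEnabled \
    -XX:G1HeapRegionSize=8m \
    -XX:MaxGCPauseMillis=250 \
    -XX:+AlwaysPreTouch"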
> Target query rate is up to 500 qps, maybe 300, and we need to keep
> response time at <200ms. But at the moment we only see very good
> search performance with up to 100 requests per second. Whenever it
> grows to about 200, average response time abruptly increases to 0.5-1
> second. (Also it seems that request rate reported by SOLR in admin
> metrics is 2x higher than the real one, because for every query, every
> shard receives 2 requests: one to obtain IDs and second one to get
> data by IDs; so target rate for SOLR metrics would be 1000 qps).
Getting 100 requests per second on a single replica is quite good,
especially with a sharded index. I never could get performance like
that. To handle hundreds of requests per second, you need several replicas.
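
Once the extra hardware is registered in the cluster, adding a replica
is a single Collections API call. A hypothetical example (collection,
shard, and node names are placeholders):

  http://anyhost:8983/solr/admin/collections?action=ADDREPLICA
      &collection=mycollection&shard=shard1&node=newhost:8983_solr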
If you can reduce the number of shards, the amount of work involved for
a single request will decrease, which MIGHT increase the queries per
second your hardware can handle. With four shards, one query is
typically nine requests: the original request from the client, plus two
distributed requests to each of the four shards (one to get the
matching IDs, one to fetch the documents).
Unless your clients are all Java-based, you also need a load balancer
to avoid a single point of failure. (The ZooKeeper-aware Java client,
CloudSolrClient, talks to the entire SolrCloud cluster and doesn't need
a load balancer.)
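
A minimal SolrJ sketch of that client, with placeholder ZooKeeper
addresses and collection name:

  import java.util.Arrays;
  import java.util.Optional;
  import org.apache.solr.client.solrj.SolrQuery;
  import org.apache.solr.client.solrj.impl.CloudSolrClient;
  import org.apache.solr.client.solrj.response.QueryResponse;

  public class CloudQueryExample {
    public static void main(String[] args) throws Exception {
      // ZooKeeper ensemble for the SolrCloud cluster (placeholder hosts).
      try (CloudSolrClient client = new CloudSolrClient.Builder(
          Arrays.asList("zk1:2181", "zk2:2181", "zk3:2181"),
          Optional.empty()).build()) {
        client.setDefaultCollection("mycollection");
        // The client discovers live nodes from ZooKeeper and routes
        // requests itself, so no external load balancer is needed.
        QueryResponse rsp = client.query(new SolrQuery("*:*"));
        System.out.println("numFound: " + rsp.getResults().getNumFound());
      }
    }
  }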
What you are seeing, a sharp drop in performance from a relatively
modest increase in load, is VERY common. This is the way almost all
software systems behave when faced with extreme load.
Search for "knee" on this page:
https://www.oreilly.com/library/view/the-art-of/9780596155858/ch04.html
> During high request load, CPU usage increases dramatically on the SOLR
> nodes. It doesn't reach 100% but averages at 50-70% on 3 servers and
> about 93% on 1 server (random server each time, not the smallest one).
Very likely the node with the higher load is the one aggregating the
shard responses into the full result.
> The documentation mentions replication to spread the load between the
> servers. We tested replicating to smaller servers (32GB RAM, Intel
> Core i7-4770). However, when we tested it, the replicas were going out
> of sync all the time (possibly during commits) and reported errors
> like "PeerSync Recovery was not successful - trying replication." Then
> they proceed with replication which takes hours and the leader handles
> all requests singlehandedly during that time. Also both leaders and
> replicas started encountering OOM errors (heap space) for unknown reason.
With only 32GB of memory, and 8GB of that allocated to the heap, there's
only about 24GB left to cache the 80GB of index data. That's not enough,
and performance would be MUCH worse than on your 64GB or 128GB machines.
I would suspect extreme GC pauses and/or general performance issues from
not enough cache memory to be the root cause of the sync and recovery
problems.
> Heap dump analysis shows that most of the memory is consumed by [J
> (array of long) type, my best guess would be that it is "_version_"
> field, but it's still unclear why it happens.
I'm not familiar enough with how Lucene allocates memory internally to
have any hope of telling you exactly what that memory structure is.
> Also, even though with replication request rate and CPU usage drop 2
> times, it doesn't seem to affect mean_ms, stddev_ms or p95_ms numbers
> (p75_ms is much smaller on nodes with replication, but still not as
> low as under load of <100 requests/s).
> Garbage collection is much more active during high load as well. Full
> GC happens almost exclusively during those times. We have tried tuning
> GC options like suggested here
> <https://wiki.apache.org/solr/ShawnHeisey#CMS_.28ConcurrentMarkSweep.29_Collector>
> and it didn't change things though.
Symptoms like that generally mean that your heap is too small and needs
to be increased.
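
The heap size itself is set with SOLR_HEAP in solr.in.sh. The value
below is only there to show the mechanism; the right number for your
index has to come from testing and from watching the GC logs:

  # solr.in.sh -- example value, not a recommendation
  SOLR_HEAP="16g"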
> * How do we increase throughput? Is replication the only solution?
Ensuring there's enough memory for caching is the first step. But that
can only take you so far. Dealing with the very high query rate you've
got will require multiple replicas.
> * if yes - then why doesn't it affect response times, considering
>   that CPU is not 100% used and index fits into memory?
Hard to say without an in-depth look. See the end of my reply.
> * How to deal with OOM and replicas going into recovery?
There are precisely two ways to deal with OOM. One is to increase the
size of the resource that's depleted. The other is to change things so
that the program doesn't require as much of that resource. The second
option is frequently not possible.
> * Is memory or CPU the main problem? (When searching on the
>   internet, i never see CPU as main bottleneck for SOLR, but our
>   case might be different)
Most Solr performance issues are memory related. With an extreme query
rate, CPU can also be a bottleneck, but memory will almost always be the
bottleneck you run into first.
> * Or do we need smaller shards? Could segments merging be a problem?
Smaller shards really won't make much difference in segment merging,
unless the size reduction is *EXTREME* -- switching to a VERY large
number of shards.
If you increase the numbers in your merge policy, then merging will
happen less frequently. The config that I chose to use was 35 for
maxMergeAtOnce and segmentsPerTier, with 105 for
maxMergeAtOnceExplicit. The disadvantage to this is that your indexes
will have a LOT more files in them, so it's much easier to run into an
open file limit in the OS.
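
For reference, those settings go in the <indexConfig> section of
solrconfig.xml, something like this (the numbers are the ones I
mentioned above; treat them as a starting point, not a recommendation
for your index):

  <indexConfig>
    <mergePolicyFactory class="org.apache.solr.index.TieredMergePolicyFactory">
      <int name="maxMergeAtOnce">35</int>
      <int name="segmentsPerTier">35</int>
      <int name="maxMergeAtOnceExplicit">105</int>
    </mergePolicyFactory>
  </indexConfig>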
> * How to add faceting without search queries slowing down too much?
As Erick said ... this isn't possible. Handling the query load you've
mentioned *with* facets will require even more replicas. Facets require
more heap memory, more CPU resources, and are likely to access more of
the index data -- which means having plenty of cache memory is even more
important.
> * How to diagnose these problems and narrow down to the real reason
>   in hardware or setup?
A lot of useful information can be obtained from the GC logs that Solr's
built-in scripting creates. Can you share these logs?
The screenshots described here can also be very useful for troubleshooting:
https://wiki.apache.org/solr/SolrPerformanceProblems#Asking_for_help_on_a_memory.2Fperformance_issue
Thanks,
Shawn