On 8/15/2019 8:14 AM, Kojo wrote:
> I am starting to think that my setup has more than one problem.
> As I said before, I am not balancing my load to Solr nodes, and I have
> eight nodes. All of my web application requests go to one Solr node, the
> only one that dies. If I distribute the load across the other nodes, is it
> possible that these problems may end?
> Even if I downsize the Solr cloud setup to 2 boxes with 2 nodes each and
> fewer shards than the 16 shards that I have now, I would like to know your
> opinion about the question above.

Based on those GC logs, we have 58 hours of good steady operation,
followed by something bad. Something happened in those few minutes that
*didn't* happen in the previous 58 hours.
You could try increasing the heap beyond 6GB, but depending on what went
wrong, that might not help. And as Erick was hinting at, large heaps
can create their own problems.
The better option is to figure out what's happening when it all goes bad
and keep that from happening. Load balancing might help, or it might
cause whatever's happening on the one node to happen to all your nodes.
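
If you do want to experiment with spreading requests, here is a minimal
client-side round-robin sketch in Python. It is only an illustration under
assumptions: the node URLs and the names SOLR_NODES, RoundRobin, next_node and
select_url are hypothetical, not part of Solr, and it does no health checking
or failover, so a dead node would still receive its share of requests.

```python
from itertools import cycle
from urllib.parse import urlencode

# Hypothetical node URLs (assumption); replace with the real hosts.
SOLR_NODES = [
    "http://solr1:8983/solr",
    "http://solr2:8983/solr",
    "http://solr3:8983/solr",
]


class RoundRobin:
    """Hand out node URLs in rotation so no single node gets every request."""

    def __init__(self, nodes):
        nodes = list(nodes)
        if not nodes:
            raise ValueError("at least one node is required")
        self._nodes = cycle(nodes)

    def next_node(self):
        # Each call returns the next node in the list, wrapping at the end.
        return next(self._nodes)

    def select_url(self, collection, query):
        # Build a /select URL against whichever node is next in rotation.
        params = urlencode({"q": query})
        return f"{self.next_node()}/{collection}/select?{params}"


balancer = RoundRobin(SOLR_NODES)
first_six = [balancer.next_node() for _ in range(6)]
```

With three nodes, six consecutive calls visit each node twice in order. If the
trigger is a particular kind of request, this will simply deliver that request
to every node in turn, which is the risk described above.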
Thanks,
Shawn