On 8/12/2019 5:47 AM, Kojo wrote:
I am using SolrCloud with this configuration:
2 boxes (one Solr in each box)
4 instances per box
Why are you running multiple instances on one server? For most setups,
this has too much overhead. A single instance can handle many indexes.
The only good reason I can think of to run multiple instances is when
the amount of heap memory needed exceeds 31GB. And even then, four
instances seems excessive. If you only have 300000 documents, there
should be no reason for a super large heap.
At this moment I have one active collection with about 300,000 docs. The
other collections are not being queried. The active collection is
configured as follows:
- shards: 16
- replication factor: 2
These two Solrs (Solr1 and Solr2) use ZooKeeper (one box, one instance. No
ZooKeeper cluster)
My application points to Solr1, and everything works fine until suddenly one
instance on Solr1 dies. This instance is on port 8983, the "main"
instance. I thought it could be related to memory usage, but we increased
RAM and JVM memory and it still dies.
Solr1, the one which dies, is the destination where I point my web
application.
You will have to check the logs. If Solr is not running on Windows,
then any OutOfMemoryError exception, which can be caused by things other
than a memory shortage, will result in Solr terminating itself. On
Windows, that functionality does not yet exist, so it would have to be
Java or the OS that kills it.
Here I have two questions that I hope you can help me with:
1. Which log should I look at to debug this issue?
Assuming you're NOT on Windows, check to see if there is a logfile named
solr_oom_killer-8983.log in the logs directory where solr.log lives. If
there is, then that means the oom killer script was executed, and that
happens when there is an OutOfMemoryError thrown. The solr.log file
MIGHT contain the OOME exception which will tell you what system
resource was depleted. If it was not heap memory that was depleted,
then increasing memory probably won't help.
If you share the gc log that Solr writes, we can analyze this to see if
it was heap memory that was depleted.
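
As a rough illustration of the log check described above, here is a minimal
Python sketch. The function name and the returned dictionary are made up for
this example, and the logs directory path is whatever your install uses. The
glob pattern is an assumption so that a killer log with an extra suffix after
the port number is also matched.

```python
from pathlib import Path


def check_for_oom(logs_dir, port=8983):
    """Look for signs of an OutOfMemoryError in a Solr logs directory.

    Hypothetical helper, not part of Solr. Returns a dict with:
      oom_killer_ran: True if a solr_oom_killer-<port>*.log file exists
      oom_lines:      lines from solr.log that mention OutOfMemoryError
    """
    logs = Path(logs_dir)

    # The glob tolerates a suffix after the port (assumption, see lead-in).
    killer_logs = logs.glob(f"solr_oom_killer-{port}*.log")
    findings = {"oom_killer_ran": any(killer_logs), "oom_lines": []}

    solr_log = logs / "solr.log"
    if solr_log.is_file():
        # errors="replace" keeps the scan going if the log has odd bytes.
        with solr_log.open(errors="replace") as fh:
            for line in fh:
                if "OutOfMemoryError" in line:
                    findings["oom_lines"].append(line.rstrip())
    return findings
```

Calling check_for_oom() with the path to your logs directory and printing the
result should show whether the killer script ran. Any lines it finds in
solr.log should name the resource that was depleted.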
2. After this instance dies, the Solr cloud does not answer my web
application. Is this correct? I thought that the replicas should answer if
one shard, one instance, or one box goes down.
If a Solr instance dies, you can't make connections directly to it.
Connections would need to go to another instance. You need a load
balancer to handle that automatically, or a cloud-aware client. The
only cloud-aware client that I am sure about is the one for Java -- it
is named SolrJ, created by the Solr project and distributed with Solr.
I think that a third party MIGHT have written a cloud-aware client for
Python, but I am not sure about this.
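
To show the general idea of sending the request to another instance when one
is down, here is a minimal Python sketch of client-side failover. It is NOT a
cloud-aware client (it knows nothing about ZooKeeper or cluster state) and it
is not a substitute for a real load balancer. The function name, the host
names, and the collection name in the usage note are hypothetical.

```python
import urllib.request


def query_with_failover(base_urls, path, fetch=None, timeout=5):
    """Try each Solr base URL in order; return the first successful body.

    Hypothetical helper. `fetch` can be swapped out for testing; by
    default it performs a plain HTTP GET with urllib.
    """
    if fetch is None:
        def fetch(url):
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                return resp.read()

    errors = []
    for base in base_urls:
        url = base.rstrip("/") + path
        try:
            return fetch(url)
        except OSError as exc:
            # URLError, refused connections, and socket timeouts are all
            # OSError subclasses, so a dead instance ends up here.
            errors.append((url, exc))
    raise ConnectionError(f"all Solr instances failed: {errors}")
```

Usage would look something like
query_with_failover(["http://solr1:8983/solr", "http://solr2:8983/solr"],
"/mycollection/select?q=*:*"). If the instance on solr1 is dead, the request
is retried against solr2.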
If you set up a load balancer, you will need to handle redundancy for that.
Side note: A fully redundant ZooKeeper install needs three servers. Do
not put a load balancer in front of ZooKeeper. The ZK protocol handles
redundancy itself and a load balancer will break that.
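
The reason for three servers is that ZooKeeper needs a majority of the
ensemble to be running. A small Python sketch of that arithmetic (the
function name is made up for this example):

```python
def zk_quorum(ensemble_size):
    """Return (majority_needed, failures_tolerated) for a ZK ensemble."""
    # A strict majority of the servers must be up for ZK to function.
    majority = ensemble_size // 2 + 1
    tolerated = ensemble_size - majority
    return majority, tolerated
```

With one or two servers, zk_quorum() reports zero tolerated failures. Three
servers is the smallest ensemble that survives the loss of one server.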
Thanks.
Shawn