On 8/12/2019 5:47 AM, Kojo wrote:
I am using SolrCloud with this configuration:

2 boxes (one Solr in each box)
4 instances per box

Why are you running multiple instances on one server? For most setups, this has too much overhead. A single instance can handle many indexes. The only good reason I can think of to run multiple instances is when the amount of heap memory needed exceeds 31GB. And even then, four instances seems excessive. If you only have 300000 documents, there should be no reason for a super large heap.

At this moment I have one active collection with about 300,000 docs. The
other collections are not being queried. The active collection is
configured:
- shards: 16
- replication factor: 2

These two Solrs (Solr1 and Solr2) use ZooKeeper (one box, one instance. No
ZooKeeper cluster)

My application points to Solr1, and everything works fine until suddenly one
instance on Solr1 dies. This instance is on port 8983, the "main"
instance. I thought it could be related to memory usage, but we increased
RAM and JVM memory and it still dies.
Solr1, the one which dies, is the destination where I point my web
application.

You will have to check the logs. If Solr is not running on Windows, then any OutOfMemoryError exception, which can be caused by things other than a memory shortage, will result in Solr terminating itself. On Windows, that functionality does not yet exist, so it would have to be Java or the OS that kills it.

Here I have two questions that I hope you can help me:

1. Which log should I look at to debug this issue?

Assuming you're NOT on Windows, check to see if there is a logfile named solr_oom_killer-8983.log in the logs directory where solr.log lives. If there is, then that means the oom killer script was executed, and that happens when there is an OutOfMemoryError thrown. The solr.log file MIGHT contain the OOME exception which will tell you what system resource was depleted. If it was not heap memory that was depleted, then increasing memory probably won't help.
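
As a rough sketch of that check, assuming Python is available on the box: the function below looks for an OOM killer log for the given port and collects any lines in solr.log that mention OutOfMemoryError. The function name check_solr_logs is made up for illustration, and the prefix match on the file name is an assumption in case your Solr version appends a timestamp to it.

```python
from pathlib import Path


def check_solr_logs(log_dir, port=8983):
    """Return (OOM killer log names, OutOfMemoryError lines found in solr.log)."""
    logs = Path(log_dir)
    if not logs.is_dir():
        return [], []

    # File name from the message above: solr_oom_killer-<port>.log.
    # Assumption: match by prefix so a timestamped variant is also found.
    killer_logs = sorted(p.name for p in logs.glob(f"solr_oom_killer-{port}*.log"))

    oome_lines = []
    solr_log = logs / "solr.log"
    if solr_log.is_file():
        with solr_log.open(encoding="utf-8", errors="replace") as fh:
            for number, line in enumerate(fh, start=1):
                if "OutOfMemoryError" in line:
                    # Keep the line number so the surrounding stack trace is easy to find.
                    oome_lines.append((number, line.rstrip()))
    return killer_logs, oome_lines
```

Call it with the directory where solr.log lives. An empty first list means the oom killer script probably did not run; the text after "OutOfMemoryError:" in any matched line names the resource that ran out.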

If you share the gc log that Solr writes, we can analyze this to see if it was heap memory that was depleted.

2. After this instance dies, the Solr cloud does not answer my web
application. Is this correct? I thought that the replicas should answer if
one shard, one instance, or one box goes down.

If a Solr instance dies, you can't make connections directly to it. Connections would need to go to another instance. You need a load balancer to handle that automatically, or a cloud-aware client. The only cloud-aware client that I am sure about is the one for Java -- it is named SolrJ, created by the Solr project and distributed with Solr. I think that a third party MIGHT have written a cloud-aware client for Python, but I am not sure about this.
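
To illustrate the idea only: the sketch below is NOT a cloud-aware client (it never talks to ZooKeeper) and not a real load balancer. It is a simple client-side failover loop that tries a fixed list of Solr base URLs in order. The names http_get and query_with_failover, the host names, and the collection name are all hypothetical.

```python
import urllib.request


def http_get(url, timeout=5):
    """Fetch a URL and return the response body as bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def query_with_failover(base_urls, path, fetch=http_get):
    """Try each Solr base URL in order and return the first successful response."""
    errors = []
    for base in base_urls:
        url = base.rstrip("/") + path
        try:
            return fetch(url)
        except OSError as exc:
            # Refused connections, timeouts and urllib's URLError are all
            # OSError subclasses, so a dead node lands here and we move on.
            errors.append(f"{url}: {exc}")
    raise ConnectionError("all Solr nodes failed: " + "; ".join(errors))


# Hypothetical usage; replace hosts and collection with your own:
# body = query_with_failover(
#     ["http://solr1:8983/solr", "http://solr2:8983/solr"],
#     "/mycollection/select?q=*:*",
# )
```

Since any live SolrCloud node can route a request to the right shards, pointing the application at more than one node like this is enough to survive a single instance dying. The list is static, though, so it will not notice nodes being added or removed the way a cloud-aware client would.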

If you set up a load balancer, you will need to handle redundancy for that.

Side note: A fully redundant ZooKeeper install needs three servers. Do not put a load balancer in front of ZooKeeper. The ZK protocol handles redundancy itself and a load balancer will break that.
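
The reason for three servers is that a ZooKeeper ensemble only keeps working while a strict majority of its servers are up. A quick sketch of that arithmetic (the function name zk_quorum is made up for illustration):

```python
def zk_quorum(ensemble_size):
    """Return (servers needed for a majority, failures the ensemble can survive)."""
    if ensemble_size < 1:
        raise ValueError("ensemble_size must be at least 1")
    quorum = ensemble_size // 2 + 1  # strict majority
    return quorum, ensemble_size - quorum


for size in (1, 2, 3, 5):
    needed, tolerated = zk_quorum(size)
    print(f"{size} server(s): quorum {needed}, can lose {tolerated}")
```

With one or two servers the ensemble cannot lose any server, which is why three is the minimum for redundancy.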

Thanks.
Shawn
