What version of Solr? There are some anecdotal reports of abnormal CPU loads on very recent Solr’s.
Is the server with the high load the “Overseer”? In the admin UI>>SolrCloud>>tree you can see which node is the Overseer. This is really a shot in the dark, as unless you are doing a lot of collection maintenance operations, the Overseer shouldn’t be doing much really. There is _one_ Overseer per cluster and it’s in charge of coordinating changes to ZooKeeper. If there’s a correlation there, it’d be great to know. It’s possible to move the Overseer to a different node, one that’s running Solr but not necessarily hosting any replicas. This isn’t a permanent solution, but would help isolate the issue. First, let’s see if the not node is always the Overseer... Best, Erick > On Mar 4, 2019, at 6:51 AM, Gael Jourdan-Weil > <gael.jourdan-w...@kelkoogroup.com> wrote: > > Hello Furkan, > > Yes the 3 servers have exact same configuration. > > Varnish load balancing is effectively round robin. > We monitor the number of requests per second, and we effectively see the 3 > servers are receiving same amount of requests. > > Kind Regards, > Gaël > > ________________________________ > De : Furkan KAMACI <furkankam...@gmail.com> > Envoyé : lundi 4 mars 2019 15:00 > À : solr-user@lucene.apache.org > Objet : Re: SolrCloud one server with high load > > Hi Gaël, > > Does all three servers have same specifications? On the other hand, is your > load balancing configuration for Varnish is round-robin? > > Kind Regards, > Furkan KAMACI > > On Mon, Mar 4, 2019 at 3:18 PM Gael Jourdan-Weil < > gael.jourdan-w...@kelkoogroup.com> wrote: > >> Hello, >> >> I come again to the community for some ideas regarding a performance issue >> we are having. >> >> We have a SolrCloud cluster of 3 servers. >> Each server hosts 1 replica of 2 collections. >> There is no sharding, every server hosts the whole collection. >> >> Requests are evenly distributed by a Varnish system. >> >> During some peaks of requests, we see one server of the cluster having >> very high load while the two others are totally fine. >> The server experiencing this high load is always the same until we reboot >> it and the behavior moves to another server. >> The server experiencing the issue is not necessarily the leader. >> All servers receive the same number of requests per seconds. >> >> Load data: >> - Server1: 5% CPU when low QPS, 90% CPU when high QPS (this one having >> issues) >> - Server2: 5% CPU when low QPS, 25% CPU when high QPS >> - Server3: 5% CPU when low QPS, 20% CPU when high QPS >> >> What could explain this behavior in SolrCloud mechanisms? >> >> Thank you for reading, >> >> Gaël Jourdan-Weil >>