Tomáš,

One thing that causes a clusterstatus call is alias resolution, if the HttpClusterStateProvider is in use instead of the ZkClusterStateProvider. I've just been fixing spurious error messages generated by this in SOLR-12938.
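For reference, here is roughly what the two construction paths look like in SolrJ (a minimal sketch, going from memory of the 7.x Builder API; the hostnames and the "myalias" collection/alias name are just placeholders):

import java.util.Collections;
import java.util.Optional;

import org.apache.solr.client.solrj.impl.CloudSolrClient;

public class ClusterStateProviderSketch {

    public static void main(String[] args) throws Exception {
        // ZK-based client: cluster state and alias data come straight from ZooKeeper,
        // so alias resolution does not need to hit a Solr node for cluster status.
        try (CloudSolrClient zkBased = new CloudSolrClient.Builder(
                Collections.singletonList("zk1.example.com:2181"), Optional.empty()).build()) {
            zkBased.setDefaultCollection("myalias");
            // queries here resolve "myalias" from the ZK-backed cluster state
        }

        // URL-based client: SolrJ only has the HTTP cluster state provider to work with,
        // so it has to ask one of the listed nodes for cluster status over HTTP; that is
        // the kind of clusterstatus traffic described in this thread.
        try (CloudSolrClient urlBased = new CloudSolrClient.Builder(
                Collections.singletonList("http://solr1.example.com:8983/solr")).build()) {
            urlBased.setDefaultCollection("myalias");
            // alias resolution here may translate into CLUSTERSTATUS requests against solr1
        }
    }
}

If the client is built from ZK hosts it stays on the ZK-backed provider; the URL-only form is the one that falls back to HttpClusterStateProvider.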
-Gus

On Tue, Nov 6, 2018 at 1:08 PM Zimmermann, Thomas <tzimmerm...@techtarget.com> wrote:

> Hi Shawn,
>
> We're equally impressed by how well the server is handling it. We're using Sematext for monitoring, and the load on the box has been steady under 1 and not entering a swap state memory-wise.
>
> We are 100% certain the traffic is coming from the 3 web hosts running this code. We have put some custom logging in place that logs all requests to an access-style log and stores that data in kibana/logstash. In logstash we are able to confirm that all these requests (~40 million in the last 12 hours) are coming from our web front ends directly to a single box in the cluster.
>
> Our client code is on separate servers from our solr servers, and zk has its own boxes as well.
>
> Here's a scrubbed pastebin of our cluster status response from that machine that is getting all the traffic; I pulled this via browser on my local machine.
> https://pastebin.com/42haKVME
>
> We can attempt to update the SolrJ dependency on our lower env and see if that fixes the problem, if you think that's a good course of action, but we are also in the midst of switching over to HTTP Client to resolve the production issues we are seeing ASAP, so I can't promise a timeline. If you think there's a chance that will fix this, we could of course give it a quick go.
>
> -TZ
>
> On 11/6/18, 12:35 PM, "Shawn Heisey" <apa...@elyograg.org> wrote:
>
> >On 11/6/2018 10:12 AM, Zimmermann, Thomas wrote:
> >> Shawn -
> >>
> >> Server performance is fine and request times are great. We are tolerating the level of traffic, but the server that is taking all the hits is obviously performing a bit slower than the others. Response times are under 5 ms avg for queries on all servers, which is within our perf thresholds.
> >
> >I was asking specifically about the clusterstatus requests -- whether the response looks complete if you manually execute the same request and whether it returns quickly. And I'd like to see the solr.log where these are happening.
> >
> >Knowing that requests in general are performing well is good info, although I have no idea how that is possible on the node that is getting over a thousand clusterstatus requests per second. I would expect that node to be essentially dead under that much load. Since it's apparently handling it fine ... that's really impressive.
> >
> >> We are running 7.4 on the client and server side; moving to 7.5 was troublesome for us, so we are holding off for the time being.
> >
> >I was hoping you could just upgrade the SolrJ client, which would involve either replacing the solrj jar or bumping the version number in the config for a dependency manager (things like ivy, maven, gradle, etc). A 7.5 client should be pretty safe against 7.4 servers. The client would be newer than the server and very close to the same version, which is the general recommendation for CloudSolrClient when the two versions cannot be identical for some reason.
> >
> >Are you absolutely sure that those requests are coming from the program with CloudSolrClient? To find out, you'll need to enable the request log in jetty.xml (it just needs to be un-commented) and restart the server. The source address is not logged in solr.log. It's very important to be absolutely sure where the requests are coming from.
> >If you're running the client code on the same machine as one of your Solr servers, it will be difficult to be sure about the source, so I would definitely suggest running the client code on a completely different machine, so the source addresses in the request log are useful.
> >
> >Thanks,
> >Shawn
>

--
http://www.the111shift.com