On 12/6/2014 12:09 PM, JoeSmith wrote:
> We are currently using CloudSolrServer, but it looks like this class is not
> thread-safe (setDefaultCollection). Should this instance be initialized
> once (at startup) and then re-used (in all threads) until shutdown when the
> process terminates? Or should it re-instantiated for each request?
>
> Currently, we are trying to use CloudSolrServer as a singleton, but it
> looks like the connections to the host are not being closed and under load
> we start getting failures. and In the Zookeeper logs we see this error:
>
>> WARN - 2014-12-04 10:09:14.364;
>> org.apache.zookeeper.server.NIOServerCnxnFactory; Too many connections from
>> /11.22.33.44 - max is 60
>
> netstat (on the Zookeeper host) shows that the connections are not being
> closed. What is the 'correct' way to fix this? Apologies if i have missed
> any documentation that explains, pointers would be helpful.

All SolrServer implementations in SolrJ, including CloudSolrServer, are
supposed to be threadsafe. If it turns out they're not actually
threadsafe, then we treat that as a bug. The discussion to determine
that it's a bug takes place on this mailing list, and once we determine
that, the next step is to file an issue in Jira.

The general way to use SolrJ is to initialize the server instance at the
beginning and re-use it for all client communication to Solr. With
CloudSolrServer, you normally only need a single server instance to talk
to the entire cloud, because you can set the "collection" parameter on
each request to indicate which collection to work on. If you only have
a handful of collections, you might want to use multiple instances and
use setDefaultCollection to specify the collection. With
HttpSolrServer, an instance is required for each core, because the core
name is in the initialization URL.

I've not looked at the code, but I can't imagine that the client ever
needs to make more than one connection to each server in the zookeeper
ensemble. Here's a list of the open connections on one of my zookeeper
servers for my SolrCloud 4.2.1 install:

java 21800 root 21u IPv6  2836983 0t0 TCP 10.8.0.151:50178->10.8.0.152:2888 (ESTABLISHED)
java 21800 root 22u IPv6  2661097 0t0 TCP 10.8.0.151:3888->10.8.0.152:34116 (ESTABLISHED)
java 21800 root 26u IPv6 28065088 0t0 TCP 10.8.0.151:2181->10.8.0.141:52583 (ESTABLISHED)
java 21800 root 27u IPv6 23967470 0t0 TCP 10.8.0.151:2181->10.8.0.152:49436 (ESTABLISHED)
java 21800 root 28r IPv6 23969636 0t0 TCP 10.8.0.151:2181->10.8.0.151:57290 (ESTABLISHED)
java 21800 root 29r IPv6 23969951 0t0 TCP 10.8.0.151:3888->10.8.0.153:54721 (ESTABLISHED)

The 151, 152, and 153 addresses are my ZK servers, with Solr also
running on 151 and 152. The 141 address is the SolrJ client. The main
ZK port is 2181, with ports 2888 and 3888 used for internal zookeeper
communication.

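If you want to run the same check on your own ZooKeeper host, here is a
rough sketch that counts established connections to the ZK client port
per remote host, given lsof-style lines like the ones above. The class
and method names (ZkConnectionCounter, countClientConnections) are made
up for illustration and are not part of Solr or ZooKeeper, and the
pattern assumes IPv4-style addresses printed in the format shown in my
listing.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Solr or ZooKeeper.
class ZkConnectionCounter {

    // Assumed line format: "... local:port->remote:port (ESTABLISHED)",
    // with IPv4-style addresses as in the listing above.
    private static final Pattern CONN = Pattern.compile(
            "(\\d+\\.\\d+\\.\\d+\\.\\d+):(\\d+)->(\\d+\\.\\d+\\.\\d+\\.\\d+):(\\d+) \\(ESTABLISHED\\)");

    // Counts established connections whose LOCAL port is clientPort,
    // grouped by remote address. Lines that do not match are skipped.
    static Map<String, Integer> countClientConnections(List<String> lines, int clientPort) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            Matcher m = CONN.matcher(line);
            if (!m.find()) {
                continue; // not an established TCP connection line
            }
            if (Integer.parseInt(m.group(2)) != clientPort) {
                continue; // e.g. 2888/3888 traffic between ZK servers
            }
            counts.merge(m.group(3), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> lsof = Arrays.asList(
            "java 21800 root 21u IPv6 2836983 0t0 TCP 10.8.0.151:50178->10.8.0.152:2888 (ESTABLISHED)",
            "java 21800 root 22u IPv6 2661097 0t0 TCP 10.8.0.151:3888->10.8.0.152:34116 (ESTABLISHED)",
            "java 21800 root 26u IPv6 28065088 0t0 TCP 10.8.0.151:2181->10.8.0.141:52583 (ESTABLISHED)",
            "java 21800 root 27u IPv6 23967470 0t0 TCP 10.8.0.151:2181->10.8.0.152:49436 (ESTABLISHED)",
            "java 21800 root 28r IPv6 23969636 0t0 TCP 10.8.0.151:2181->10.8.0.151:57290 (ESTABLISHED)",
            "java 21800 root 29r IPv6 23969951 0t0 TCP 10.8.0.151:3888->10.8.0.153:54721 (ESTABLISHED)");

        // 2181 is the main ZK client port mentioned above.
        countClientConnections(lsof, 2181)
            .forEach((host, n) -> System.out.println(host + " " + n));
    }
}
```

On my listing this reports one connection each for 10.8.0.141,
10.8.0.151, and 10.8.0.152. If the count for a single client host on
your system keeps climbing toward the limit of 60 from your warning,
that would suggest new client instances are being created without the
old ones being shut down.
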
I actually would have expected to see two client connections from .141
... one for the indexer program and one for the webapp. They haven't
reported a Solr problem to me, so I guess it must be OK.

If your install is re-establishing connections and not closing the old
ones, then there is either something wrong with your setup or a bug.
Because there are not a large number of people with the same complaint,
I would lean more towards problems in your setup. I won't rule out the
possibility that there's a bug, because we've had a lot of them.

One thing to try immediately is upgrading to 4.10.2 ... there have been
two bugfix releases since the version you're running came out, with 16
bug issues closed. None of those issues sounds like what you're running
into, but sometimes when mistakes are noticed in the code, fixing them
can make other seemingly unrelated problems go away. Upgrading to a
bugfix release on the same minor version should be a drop-in replacement
with no configuration changes necessary.

http://lucene.apache.org/solr/4_10_2/changes/Changes.html

Beyond that, we need more information. Are there ERROR or WARN messages
in your Solr log and/or your SolrJ client log that don't come from bad
queries? If there are, it may indicate some kind of problem, especially
if they relate to the zk client timeout. Problems like that can be
caused by general performance issues, including garbage collection pauses.

http://wiki.apache.org/solr/SolrPerformanceProblems

Depending on what is found in your log, other questions about your setup
may need answering.

Thanks,
Shawn