On 12/6/2014 12:09 PM, JoeSmith wrote:
> We are currently using CloudSolrServer, but it looks like this class is not
> thread-safe (setDefaultCollection). Should this instance be initialized
> once (at startup) and then re-used (in all threads) until shutdown when the
> process terminates?  Or should it be re-instantiated for each request?
> 
> Currently, we are trying to use CloudSolrServer as a singleton, but it
> looks like the connections to the host are not being closed and under load
> we start getting failures.  In the Zookeeper logs we see this error:
> 
>> WARN  - 2014-12-04 10:09:14.364;
>> org.apache.zookeeper.server.NIOServerCnxnFactory; Too many connections from
>> /11.22.33.44 - max is 60
> 
> netstat (on the Zookeeper host) shows that the connections are not being
> closed. What is the 'correct' way to fix this?   Apologies if I have missed
> any documentation that explains this; pointers would be helpful.

All SolrServer implementations in SolrJ, including CloudSolrServer, are
supposed to be threadsafe.  If it turns out they're not actually
threadsafe, then we treat that as a bug.  The discussion to determine
that it's a bug takes place on this mailing list, and once we determine
that, the next step is to file an issue in Jira.

The general way to use SolrJ is to initialize the server instance at the
beginning and re-use it for all client communication to Solr.  With
CloudSolrServer, you normally only need a single server instance to talk
to the entire cloud, because you can set the "collection" parameter on
each request to indicate which collection to work on.  If you only have
a handful of collections, you might want to use multiple instances and
use setDefaultCollection to specify the collection.  With
HttpSolrServer, an instance is required for each core, because the core
name is in the initialization URL.
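
As a rough illustration of the "create once, share everywhere, shut down at
exit" pattern described above, here is a minimal sketch.  SharedClient is a
hypothetical helper of my own, not part of SolrJ; it is generic so that it
compiles without the SolrJ jars on the classpath.

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

/**
 * Hypothetical helper (NOT part of SolrJ): lazily creates one client and
 * hands the same instance to every thread until shutdown() is called.
 */
final class SharedClient<T> {
    private final Supplier<T> factory; // how to build the client
    private final Consumer<T> closer;  // how to release its connections
    private T instance;                // guarded by "this"
    private boolean closed;            // guarded by "this"

    SharedClient(Supplier<T> factory, Consumer<T> closer) {
        this.factory = factory;
        this.closer = closer;
    }

    /** Returns the single shared instance, creating it on first use. */
    synchronized T get() {
        if (closed) {
            throw new IllegalStateException("client already shut down");
        }
        if (instance == null) {
            instance = factory.get();
        }
        return instance;
    }

    /** Call once when the process is stopping; extra calls are ignored. */
    synchronized void shutdown() {
        if (!closed) {
            closed = true;
            if (instance != null) {
                closer.accept(instance);
                instance = null;
            }
        }
    }
}
```

With SolrJ 4.x I would expect the wiring to look roughly like
new SharedClient<CloudSolrServer>(() -> new CloudSolrServer(zkHost),
CloudSolrServer::shutdown), assuming the constructor in your version does
not declare a checked exception.  The point is that request threads only
ever call get(), and nothing creates a new server object per request.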

I've not looked at the code, but I can't imagine that the client ever
needs to make more than one connection to each server in the zookeeper
ensemble.  Here's a list of the open connections on one of my zookeeper
servers for my SolrCloud 4.2.1 install:

java    21800 root   21u  IPv6   2836983  0t0  TCP 10.8.0.151:50178->10.8.0.152:2888 (ESTABLISHED)
java    21800 root   22u  IPv6   2661097  0t0  TCP 10.8.0.151:3888->10.8.0.152:34116 (ESTABLISHED)
java    21800 root   26u  IPv6  28065088  0t0  TCP 10.8.0.151:2181->10.8.0.141:52583 (ESTABLISHED)
java    21800 root   27u  IPv6  23967470  0t0  TCP 10.8.0.151:2181->10.8.0.152:49436 (ESTABLISHED)
java    21800 root   28r  IPv6  23969636  0t0  TCP 10.8.0.151:2181->10.8.0.151:57290 (ESTABLISHED)
java    21800 root   29r  IPv6  23969951  0t0  TCP 10.8.0.151:3888->10.8.0.153:54721 (ESTABLISHED)

The 151, 152, and 153 addresses are my ZK servers, with Solr also
running on 151 and 152.  The 141 address is the SolrJ client.  The main
ZK port is 2181, with ports 2888 and 3888 used for internal zookeeper
communication.  I actually would have expected to see two client
connections from .141 ... one for the indexer program and one for the
webapp.  They haven't reported a Solr problem to me, so I guess it must
be OK.
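
If you want to check the same thing on your side, something like the
following sketch could tally established connections to the ZK client port
per remote address, given lsof-style lines like the listing above.  The
class and method names are made up for illustration, and the parsing only
assumes that each connection appears as a local:port->remote:port token on
a line containing (ESTABLISHED).

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper for eyeballing ZooKeeper client connection counts. */
final class ZkConnectionCounter {

    /**
     * Counts ESTABLISHED connections whose local port equals zkPort,
     * keyed by the remote IP address.
     */
    static Map<String, Integer> countByClient(List<String> lsofLines, int zkPort) {
        Map<String, Integer> counts = new TreeMap<>();
        String localSuffix = ":" + zkPort;
        for (String line : lsofLines) {
            if (!line.contains("(ESTABLISHED)")) {
                continue; // skip listening sockets and other states
            }
            for (String token : line.trim().split("\\s+")) {
                int arrow = token.indexOf("->");
                if (arrow < 0) {
                    continue; // not the address column
                }
                String local = token.substring(0, arrow);
                String remote = token.substring(arrow + 2);
                int colon = remote.lastIndexOf(':');
                if (!local.endsWith(localSuffix) || colon < 0) {
                    continue; // connection is not on the ZK client port
                }
                counts.merge(remote.substring(0, colon), 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

Called with port 2181, a healthy client should show up with a count of one
or two.  A count that keeps climbing toward the limit in your ZK warning
would suggest that new client objects are being created without the old
ones being shut down.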

If your install is re-establishing connections and not closing the old
ones, then there is either something wrong with your setup or a bug.
Because there are not a large number of people with the same complaint,
I would lean more towards problems in your setup.  I won't rule out the
possibility that there's a bug, because we've had a lot of them.

One thing to try immediately is upgrading to 4.10.2 ... there have been
two bugfix releases since the version you're running came out, with 16
bug issues closed.  None of those issues sounds like what you're running
into, but sometimes when mistakes are noticed in the code, fixing them
can make other seemingly unrelated problems go away.  Upgrading to a
bugfix release on the same minor version should be a drop-in replacement
with no configuration changes necessary.

http://lucene.apache.org/solr/4_10_2/changes/Changes.html

Beyond that, we need more information.  Are there ERROR or WARN messages
in your Solr log and/or your SolrJ client log that don't come from bad
queries?  If there are, it may indicate some kind of problem, especially
if they relate to the zk client timeout.  Problems like that can be
caused by general performance issues, including garbage collection pauses.

http://wiki.apache.org/solr/SolrPerformanceProblems

Depending on what is found in your log, other questions about your setup
may need answering.

Thanks,
Shawn
