On 9/3/2014 3:31 PM, Ethan wrote:
> We have a SolrCloud instance with 2 Solr nodes and a 3-node ZK ensemble. One of the
> Solr nodes goes down as soon as we send search traffic to it, but updates
> work fine.
>
> When I analyzed the thread dump I saw a lot of blocked threads with the following
> error message. This
Hmmm, I'm puzzled then. I'm guessing that the node
that keeps going down is the follower, which means
it should have _less_ work to do than the node that
stays up. Not a lot less, but less still.
I'd try lengthening out my commit interval. I realize you've
set it to 2 seconds for a reason, this is
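
For reference, a longer soft-commit interval might look roughly like this in
solrconfig.xml. This is only a sketch: it assumes the 2-second interval is
configured through autoSoftCommit (rather than commitWithin on the client),
and the 30-second value is an arbitrary illustration, not a figure from this
thread.

```xml
<!-- Sketch only: goes inside the <updateHandler> section of solrconfig.xml -->
<autoSoftCommit>
  <!-- Interval in milliseconds; 30000 is an assumed example value -->
  <maxTime>30000</maxTime>
</autoSoftCommit>
```

Each soft commit opens a new searcher, so a longer interval means FieldCache
entries are rebuilt less often.
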
Erick,
It is just one shard. Indexing traffic is going to the other node and is then
synced with this one (both are part of the cloud). We kept that setting
running for 5 days, as the defective node would just go down with search
traffic. So both were in sync when search was turned on. Soft commit is
very
Do you have indexing traffic going to it? b/c this _looks_
like the node is just starting up or a searcher is
being opened and you're loading your
index for the first time. This happens when you index data and
when you start up your nodes. Adding some autowarming
(firstSearcher in this case) might load up
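
A firstSearcher warming query could be sketched as below. The field name
timestamp_l is hypothetical; the idea is to use whatever long field your real
queries sort or facet on, so that the FieldCache entry is built at startup
rather than on the first user request.

```xml
<!-- Sketch only: goes inside the <query> section of solrconfig.xml -->
<listener event="firstSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst>
      <str name="q">*:*</str>
      <!-- timestamp_l is an assumed field name; substitute your own sort field -->
      <str name="sort">timestamp_l desc</str>
    </lst>
  </arr>
</listener>
```

A matching newSearcher listener would do the same after each commit, at the
cost of slower searcher opens.
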
Forgot to add the source thread that's blocking every other thread
"http-bio-52158-exec-61" - Thread t@591
   java.lang.Thread.State: RUNNABLE
        at org.apache.lucene.search.FieldCacheImpl$Uninvert.uninvert(FieldCacheImpl.java:312)
        at org.apache.lucene.search.FieldCacheImpl$LongCache.createValue(Fie