Hi all! After upgrading to Solr 4.6.1, we had a cluster outage that was traced to a single misbehaving node. After restarting that node, the cluster immediately returned to normal operation. The bad node had ~420 threads locked on FastLRUCache, and most httpshardexecutor threads were waiting on Apache Commons HTTP futures.
Has anyone encountered a similar situation? What can we do to prevent a single misbehaving node from bringing down the entire cluster? Cheers, Avishai