To elaborate a bit on what Marcin said:
* Once a node starts to believe that a few other nodes are down, it seems
to stay that way for a very long time (hours). I'm not even sure it will
recover without a restart.
* I've tried to stop then start gossip with nodetool on the node that
thinks several
Do you happen to be using a tool like Nagios or Ganglia that is able to
report utilization (CPU, load, disk I/O, network)? There are plugins for
both that will also notify you (depending on whether you enabled the
intermediate GC logging) about what is happening.
On Thu, Apr 2, 2015 at 8:35 A
Marcin,
Are all your nodes within the same region? If they are not in the same region,
which snitch type are you using?
Jan/
On Thursday, April 2, 2015 3:28 AM, Michal Michalski
wrote:
Hey Marcin,
Are they actually going up and down repeatedly (flapping) or just dow
Hey Marcin,
Are they actually going up and down repeatedly (flapping) or just down and
they never come back?
There might be different reasons for flapping nodes, but to list what I
can think of off the top of my head right now:
1. Network issues. I don't think it's your case, but you can read about the
is
Hi!
We have a 56-node cluster with C* 2.0.13 + the CASSANDRA-9036 patch
installed. Assume we have nodes A, B, C, D, E. At irregular intervals,
one of those nodes starts to report that a subset of the other nodes is in
DN state, although the C* daemon is running on all nodes:
A$ nodetool status
UN B
DN C
DN D
UN E
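
To compare what each node believes about the others, it may help to pull the DN entries out of each node's `nodetool status` output automatically. The following is only a minimal sketch in Python, not something from this thread: `down_nodes` and `SAMPLE` are hypothetical names, and the sample is the abbreviated output quoted above. Real `nodetool status` lines carry more columns (address, load, tokens and so on); the sketch assumes only that each node line begins with the two-letter status/state code followed by the node's identifier.

```python
import re
from typing import List

# Assumption: a node line starts with a two-letter code, status (U = up,
# D = down) then state (N, L, J, M), followed by the node's identifier.
STATUS_LINE = re.compile(r"^([UD][NLJM])\s+(\S+)")


def down_nodes(status_output: str) -> List[str]:
    """Return the identifiers of nodes that this view reports as down."""
    down = []
    for line in status_output.splitlines():
        match = STATUS_LINE.match(line.strip())
        if match and match.group(1).startswith("D"):
            down.append(match.group(2))
    return down


# Abbreviated view from node A, as quoted in the message above.
SAMPLE = """\
UN B
DN C
DN D
UN E
"""

if __name__ == "__main__":
    print(down_nodes(SAMPLE))
```

Running the same check against the output collected from every node would show whether the DN view is specific to one node (as with A here) or shared across the cluster.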