Actually, I have not checked how the repair -pr abort logic is implemented in
code. So, irrespective of the repair -pr or full-repair scenario, the problem
can be stated as follows:
20-node cluster, RF=5, Read/Write Quorum, gc grace period=20. If a node goes
down, the 1/20th of the data for which the failed node was
Hi Tyler,
I think the scenario needs some correction. 20-node cluster, RF=5, Read/Write
Quorum, gc grace period=20. If a node goes down, repair -pr would fail on the 4
nodes maintaining replicas, and full repair would fail on an even greater
number of nodes, but not 19. Please confirm.
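The replica counts above can be sanity-checked with a small model. This is a
sketch only: it assumes SimpleStrategy-style placement (the range owned by node
i is replicated on the next RF nodes around the ring), which is an assumption,
not the actual topology of anyone's cluster:

```shell
#!/bin/sh
# Model: N-node ring; the range owned by node i is replicated on
# nodes i..i+RF-1 (mod N). DOWN is the index of the failed node.
N=20 RF=5 DOWN=0

# replicates OWNER NODE -> exit 0 if NODE holds OWNER's range
replicates() {
  i=0
  while [ "$i" -lt "$RF" ]; do
    [ $(( ($1 + i) % N )) -eq "$2" ] && return 0
    i=$((i + 1))
  done
  return 1
}

pr_fails=0 full_fails=0
x=0
while [ "$x" -lt "$N" ]; do
  if [ "$x" -ne "$DOWN" ]; then
    # repair -pr on node x touches only x's primary range; it aborts
    # if the down node replicates that range.
    replicates "$x" "$DOWN" && pr_fails=$((pr_fails + 1))
    # Full repair on node x touches every range x replicates; it
    # aborts if the down node replicates any of those ranges.
    o=0
    while [ "$o" -lt "$N" ]; do
      if replicates "$o" "$x" && replicates "$o" "$DOWN"; then
        full_fails=$((full_fails + 1))
        break
      fi
      o=$((o + 1))
    done
  fi
  x=$((x + 1))
done
echo "$pr_fails"    # 4  (RF-1 nodes besides the failed one)
echo "$full_fails"  # 8  (2*RF-2 nodes besides the failed one)
```

So under this model, repair -pr fails on the 4 other nodes holding the failed
node's replicas, and full repair fails on 8, consistent with "even greater, but
not 19".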
Anyway, the
There is a JIRA issue:
https://issues.apache.org/jira/browse/CASSANDRA-10446 .
But it's open with Minor priority and type Improvement. I think it's a very
valid concern for everyone, and especially for users who have bigger clusters.
It is more of an issue related to a design decision rather than an improvement
On Tue, Jan 19, 2016 at 10:44 AM, Anuj Wadehra
wrote:
>
> Consider a scenario where I have a 20-node cluster, RF=5, Read/Write
> Quorum, gc grace period=20. My cluster is fault tolerant and it can afford
> 2 node failures. Suddenly, one node goes down due to some hardware issue.
> It's been 10 days since
Thanks Tyler!!
I understand that we need to consider a node as lost when it's down for gc
grace, and bootstrap it. My question is more about the JIRA
https://issues.apache.org/jira/plugins/servlet/mobile#issue/CASSANDRA-2290
where an intentional decision was taken to abort the repair if a single replica
On Fri, Jan 15, 2016 at 12:06 PM, Anuj Wadehra
wrote:
> Increase the gc grace period temporarily. Then we should have capacity
> planning to accommodate the extra storage needed for the extra gc grace that
> may be needed in case of node-failure scenarios.
I would do this. Nodes that are down for l
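The suggestion above (temporarily raising gc grace) is applied per table. A
sketch, using hypothetical keyspace/table names `ks.tbl`; note that
gc_grace_seconds is specified in seconds:

```shell
# Hypothetical names. 25 days = 2160000 seconds; raise the window
# while the node is down, then restore the original value after the
# node is repaired or replaced.
cqlsh -e "ALTER TABLE ks.tbl WITH gc_grace_seconds = 2160000;"
```

This must be done for every table whose tombstones could otherwise be purged
before the failed node is repaired.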
Hi
I have intentionally posted this message to the dev mailing list instead of the
users list because it concerns a conscious design decision taken regarding a
bug, and I feel the dev team is the most appropriate team to respond to it.
Please let me know if there are better ways to get it a
Hi
We are on 2.0.14, RF=3, in a 3-node cluster. We use repair -pr. Recently, we
observed that repair -pr fails on all nodes if a node is down. Then I found
the JIRA
https://issues.apache.org/jira/plugins/servlet/mobile#issue/CASSANDRA-2290
where an intentional decision was taken to abort the repair
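For anyone reproducing this, the failure shows up when running primary-range
repair while any replica is down; `<keyspace>` below is a placeholder:

```shell
# Primary-range repair, run on each node in turn.
# If a replica of this node's primary range is unreachable, the
# repair session is aborted rather than repairing only the live
# replicas -- the CASSANDRA-2290 behaviour discussed above.
nodetool repair -pr <keyspace>
```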