Re: Repair when a replica is Down

2016-01-19 Thread Anuj Wadehra
Actually, I have not checked how the repair -pr abort logic is implemented in the code. So, irrespective of the repair -pr or full repair scenario, the problem can be stated as follows: 20-node cluster, RF=5, Read/Write Quorum, gc grace period=20. If a node goes down, 1/20th of the data for which the failed node was

Re: Repair when a replica is Down

2016-01-19 Thread Anuj Wadehra
Hi Tyler, I think the scenario needs some correction. 20-node cluster, RF=5, Read/Write Quorum, gc grace period=20. If a node goes down, repair -pr would fail on the 4 nodes maintaining replicas, and full repair would fail on an even greater number of nodes, but not 19. Please confirm. Anyways the
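
A rough way to check the node counts in the scenario above is sketched below. It assumes a simplified ring: one token per node (no vnodes) and SimpleStrategy-style placement, where each primary range is replicated on its owner and the next RF-1 nodes clockwise. The function names are illustrative and are not part of Cassandra.

```python
def replicas_for_range(owner, n_nodes, rf):
    """Nodes holding the primary range of `owner` (assumed: owner plus next rf-1 nodes)."""
    return [(owner + k) % n_nodes for k in range(rf)]


def nodes_failing_pr_repair(down, n_nodes, rf):
    """Live nodes whose primary range has the down node as a replica."""
    return [
        node
        for node in range(n_nodes)
        if node != down and down in replicas_for_range(node, n_nodes, rf)
    ]


def nodes_failing_full_repair(down, n_nodes, rf):
    """Live nodes storing at least one range that the down node also replicates."""
    failing = []
    for node in range(n_nodes):
        if node == down:
            continue
        # A node stores its own primary range and those of its rf-1 predecessors.
        stored_owners = [(node - k) % n_nodes for k in range(rf)]
        if any(down in replicas_for_range(o, n_nodes, rf) for o in stored_owners):
            failing.append(node)
    return failing


pr = nodes_failing_pr_repair(down=10, n_nodes=20, rf=5)
full = nodes_failing_full_repair(down=10, n_nodes=20, rf=5)
print(len(pr), pr)      # 4 [6, 7, 8, 9]
print(len(full), full)  # 8 [6, 7, 8, 9, 11, 12, 13, 14]
```

Under these assumptions, repair -pr is affected on 4 live nodes and full repair on 8, which is more than 4 but well short of 19. With vnodes the affected set would typically be much larger.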

Re: Repair when a replica is Down

2016-01-19 Thread Anuj Wadehra
There is a JIRA issue, https://issues.apache.org/jira/browse/CASSANDRA-10446 . But it's open with Minor priority and type Improvement. I think it's a very valid concern for all, and especially for users who have bigger clusters. More of an issue related with Design decision rather than an improve

Re: Repair when a replica is Down

2016-01-19 Thread Tyler Hobbs
On Tue, Jan 19, 2016 at 10:44 AM, Anuj Wadehra wrote: > > Consider a scenario where I have a 20-node cluster, RF=5, Read/Write > Quorum, gc grace period=20. My cluster is fault tolerant and it can afford > 2 node failures. Suddenly, one node goes down due to some hardware issue. > Its 10 days sinc
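
The "can afford 2 node failures" figure in the quoted scenario follows from quorum arithmetic. A minimal sketch, with hypothetical helper names, assuming quorum is a strict majority of the replicas for a given range:

```python
def quorum(rf):
    """Smallest strict majority of rf replicas."""
    return rf // 2 + 1


def tolerated_replica_failures(rf):
    """Replicas of one range that can be down while quorum reads/writes still succeed."""
    return rf - quorum(rf)


for rf in (3, 5):
    print(rf, quorum(rf), tolerated_replica_failures(rf))
# 3 2 1
# 5 3 2
```

So with RF=5, quorum reads and writes survive two down replicas per range, while a repair that requires every replica of a range to be up does not survive even one.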

Re: Repair when a replica is Down

2016-01-19 Thread Anuj Wadehra
Thanks, Tyler! I understand that we need to consider a node as lost when it's down for gc grace and bootstrap it. My question is more about the JIRA https://issues.apache.org/jira/plugins/servlet/mobile#issue/CASSANDRA-2290 where an intentional decision was taken to abort the repair if a single r

Re: Repair when a replica is Down

2016-01-19 Thread Tyler Hobbs
On Fri, Jan 15, 2016 at 12:06 PM, Anuj Wadehra wrote: > Increase the gc grace period temporarily. Then we should have capacity > planning to accommodate the extra storage needed for extra gc grace that may > be needed in case of node failure scenarios. I would do this. Nodes that are down for l

Re: Repair when a replica is Down

2016-01-16 Thread Anuj Wadehra
Hi, I have intentionally posted this message to the dev mailing list instead of the users list because it's regarding a conscious design decision taken regarding a bug, and I feel that the dev team is the most appropriate team to respond to it. Please let me know if there are better ways to get it a

Repair when a replica is Down

2016-01-15 Thread Anuj Wadehra
Hi, we are on 2.0.14, RF=3 in a 3-node cluster. We use repair -pr. Recently, we observed that repair -pr for all nodes fails if a node is down. Then I found the JIRA https://issues.apache.org/jira/plugins/servlet/mobile#issue/CASSANDRA-2290 where an intentional decision was taken to abort the re