Re: removing a drive - 4.0.1

2022-06-09 Thread Joe Obernberger
When a drive fails in a large cluster and you don't immediately have a replacement drive, is it OK to just remove the drive from cassandra.yaml and restart the node? Will the missing data (assuming RF=3) be re-replicated? I have disk_failure_policy set to "best_effort", but the node still fails …
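
A rough sketch of the two cassandra.yaml settings involved here (the mount points are illustrative, not taken from the thread):

    # cassandra.yaml
    disk_failure_policy: best_effort      # on a disk failure, keep serving from the remaining data directories
    data_file_directories:
        - /data/1/cassandra               # hypothetical JBOD mounts; dropping a failed disk
        - /data/2/cassandra               # means deleting its entry from this list
        - /data/3/cassandra

Whether the replicas that lived on the removed disk come back on their own, or only after running a repair on that node, is exactly the question being asked here.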

Re: removing a drive - 4.0.1

2022-01-07 Thread Joe Obernberger
Thank you Dmitry. At this point the one node where I removed the first drive from the list and then rebuilt it is now in some odd state. Locally, nodetool status shows it as up (UN), but all the other nodes in the cluster show it as down (DN). Not sure what to do at this juncture. -Joe On …
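
For context, the UN/DN codes come from nodetool's status output, whose legend the tool prints itself; the disagreement above is each node's own view:

    $ nodetool status
    # Status = Up/Down, State = Normal/Leaving/Joining/Moving
    # "UN" = up and normal as seen by the node you run the command on; another
    # node may still report "DN" for the same host if its gossip view disagrees.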

Re: removing a drive - 4.0.1

2022-01-07 Thread Dmitry Saprykin
There is a jira ticket describing your situation: https://issues.apache.org/jira/plugins/servlet/mobile#issue/CASSANDRA-14793 I may be wrong, but it seems that system directories are pinned to the first data directory in cassandra.yaml by default. When you removed the first item from the list, the system data re…
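
A rough illustration of the behaviour Dmitry describes (directory paths are hypothetical):

    # cassandra.yaml, before the edit
    data_file_directories:
        - /data/1/cassandra   # local system keyspaces (including the node's stored host ID)
        - /data/2/cassandra   # default to the first listed directory
        - /data/3/cassandra
    # Removing /data/1/cassandra from the list therefore takes the node's own system
    # data with it, so on restart the node can look like a brand-new host reusing an
    # existing IP address - hence the "node already exists" error below.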

Re: removing a drive - 4.0.1

2022-01-07 Thread Joe Obernberger
Hi - in order to get the node back up and running I did the following: deleted all data on the node, added -Dcassandra.replace_address=172.16.100.39 to the cassandra-env.sh file, and started it up. It is currently bootstrapping. In cassandra.yaml, say you have the following: data_file_directories: …
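
A sketch of the replacement step Joe describes, assuming the JVM flag is added via conf/cassandra-env.sh (the address comes from the message; the exact file location may differ per install):

    # conf/cassandra-env.sh
    # Bootstrap this node as a replacement for the host at 172.16.100.39,
    # streaming its data back from the surviving replicas.
    JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=172.16.100.39"

Once the bootstrap completes, the flag is normally removed again so that later restarts do not attempt another replacement.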

Re: removing a drive - 4.0.1

2022-01-07 Thread Mano ksio
Hi, you may have already tried, but this may help: https://stackoverflow.com/questions/29323709/unable-to-start-cassandra-node-already-exists Can you elaborate a little on 'If I remove a drive other than the first one'? What does it mean? On Fri, Jan 7, 2022 at 2:52 PM Joe Obernberger wrote: > Hi A…

removing a drive - 4.0.1

2022-01-07 Thread Joe Obernberger
Hi All - I have a 13 node cluster running Cassandra 4.0.1. If I stop a node, edit the cassandra.yaml file, comment out the first drive in the list, and restart the node, it fails to start, saying that a node already exists in the cluster with that IP address. If I put the drive back into the list …
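
A minimal sketch of the edit being described (mount points are hypothetical):

    # cassandra.yaml
    data_file_directories:
    #   - /data/1/cassandra   # first drive commented out after stopping the node
        - /data/2/cassandra
        - /data/3/cassandra
    # On restart the node then refuses to join, reporting that a node with its
    # IP address already exists in the cluster.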