When a drive fails in a large cluster and you don't immediately have a
replacement drive, is it OK to just remove the drive from cassandra.yaml
and restart the node? Will the missing data (assuming RF=3) be
re-replicated?
I have disk_failure_policy set to "best_effort", but the node still
fails to start.
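For reference, the relevant cassandra.yaml lines as I have them (a minimal sketch):

    # cassandra.yaml
    # best_effort: stop using the failed disk and keep answering requests
    # from the sstables on the remaining data directories
    disk_failure_policy: best_effort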
Thank you Dmitry.
At this point, the node where I removed the first drive from the list
and then rebuilt it is in an odd state. Locally, nodetool status
shows it as up (UN), but all the other nodes in the cluster show it as
down (DN).
Not sure what to do at this juncture.
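For anyone looking at the same thing, a sketch of how the disagreement shows up (this is the node's address; run from the respective hosts):

    # on the affected node itself -- shows UN
    nodetool status | grep 172.16.100.39
    # on any other node in the cluster -- shows DN
    nodetool status | grep 172.16.100.39
    # gossip view of the node, for comparison
    nodetool gossipinfo | grep -A 5 172.16.100.39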
-Joe
There is a Jira ticket describing your situation:
https://issues.apache.org/jira/plugins/servlet/mobile#issue/CASSANDRA-14793
I may be wrong, but it seems that the system directories are pinned to
the first data directory in cassandra.yaml by default. When you removed
the first item from the list, the system data went with it, so the node
no longer knew its own identity and tried to join as a new node at an
already-known address.
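If that is right, you can verify on disk (a sketch; /data/1 ... /data/3 stand in for whatever your data_file_directories lists):

    # local system keyspace sstables should live only under the first entry
    ls -d /data/1/system*
    # ...and should be absent from the others (see the ticket above)
    ls -d /data/2/system* /data/3/system*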
Hi - in order to get the node back up and running I did the following:
Deleted all data on the node, added
-Dcassandra.replace_address=172.16.100.39
to the cassandra-env.sh file, and started it up. It is currently
bootstrapping.
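For the record, roughly this sequence (default package paths assumed; adjust to your install):

    # 1. stop cassandra, then clear data, commitlog, hints, and saved caches
    rm -rf /var/lib/cassandra/data/* /var/lib/cassandra/commitlog/* \
           /var/lib/cassandra/hints/* /var/lib/cassandra/saved_caches/*
    # 2. in cassandra-env.sh, add the replace flag with the node's own IP
    JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=172.16.100.39"
    # 3. start cassandra and watch streaming progress
    nodetool netstats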
In cassandra.yaml, say you have the following:
data_file_directories:
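    - /data/1    # example paths -- this one is the "first" drive
    - /data/2
    - /data/3

By "remove a drive other than the first one" I mean commenting out
/data/2 or /data/3. Commenting out the first entry, /data/1, is what
triggers the "already exists" failure on restart.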
Hi, you may have already tried, but this may help.
https://stackoverflow.com/questions/29323709/unable-to-start-cassandra-node-already-exists
Can you elaborate a little on "If I remove a drive other than the first
one"? What does that mean?
On Fri, Jan 7, 2022 at 2:52 PM Joe Obernberger wrote:
Hi All - I have a 13 node cluster running Cassandra 4.0.1. If I stop a
node, edit the cassandra.yaml file, comment out the first drive in the
list, and restart the node, it fails to start, saying that a node
already exists in the cluster with that IP address.
If I put the drive back into the list, the node starts fine, and if I
remove a drive other than the first one, it also starts fine.
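Concretely, the failing edit looks like this (placeholder paths; the log wording is from memory):

    # works -- node restarts fine
    data_file_directories:
        - /data/1
        - /data/2
    # fails -- with the first entry commented out, startup aborts with
    # roughly "A node with address ... already exists, cancelling join"
    data_file_directories:
    #    - /data/1
        - /data/2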