Re: Failed disks - correct procedure

2023-01-17 Thread C. Scott Andreas
Bumping this note from Andy downthread to make sure everyone has seen it and is aware: “Before you do that, you will want to make sure a cycle of repairs has run on the replicas of the down node to ensure they are consistent with each other.” When replacing an instance, it’s necessary to run repair (

Re: Failed disks - correct procedure

2023-01-17 Thread Joe Obernberger
I come from the Hadoop world, where we have a cluster with probably over 500 drives. Drives fail all the time, or, well, several a year anyway. We remove that single drive from HDFS, HDFS re-balances, and when we get around to it, we swap in a new drive, format it, and add it back to HDFS. We k

RE: Failed disks - correct procedure

2023-01-17 Thread Durity, Sean R via user
For physical hardware when disks fail, I do a removenode, wait for the drive to be replaced, reinstall Cassandra, and then bootstrap the node back in (and run clean-up across the DC). All of our disks are presented as one file system for data, which is not what the original question was asking.
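
A rough sketch of the sequence described above, written as a dry-run plan that only builds the command strings and executes nothing. The function name `replacement_plan`, the host ID, and the host names are hypothetical and purely illustrative; the nodetool subcommands used (`status`, `removenode`, `cleanup`) are standard, but where and how they are run will depend on the cluster.

```python
from typing import List, Tuple


def replacement_plan(dead_host_id: str, dc_hosts: List[str]) -> List[Tuple[str, str]]:
    """Return (where to run, command) pairs in order. Dry run: nothing is executed.

    dead_host_id: the Host ID of the failed node as shown by `nodetool status`
                  (hypothetical value supplied by the caller).
    dc_hosts:     the nodes in the data centre that should run cleanup afterwards
                  (hypothetical names supplied by the caller).
    """
    plan: List[Tuple[str, str]] = [
        # Confirm the failed node is reported as down and note its Host ID.
        ("any live node", "nodetool status"),
        # Remove the dead node from the ring.
        ("any live node", f"nodetool removenode {dead_host_id}"),
    ]
    # Manual steps happen here and are not nodetool commands:
    # replace the drive, reinstall Cassandra, and start the node so it bootstraps.
    # Once the node has rejoined, run cleanup across the DC.
    plan.extend((host, "nodetool cleanup") for host in dc_hosts)
    return plan


if __name__ == "__main__":
    for where, cmd in replacement_plan("<host-id>", ["node1", "node2", "node3"]):
        print(f"[{where}] {cmd}")
```

This covers only the removenode, bootstrap, and cleanup steps; it does not include the repair cycle raised elsewhere in the thread.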

RE: Failed disks - correct procedure

2023-01-17 Thread Marc Hoppins
Hi all, I was pondering this very situation. We have a node with a crapped-out disk (not the first time). Removenode vs repairnode: in regard to time, there is going to be little difference twixt replacing a dead node and removing then re-installing a node. There is going to be a bunch of reads/