Hi. I'm setting up a new cluster (DSE 4.7) and am interested in the failure modes of JBOD setups. My main concern is the likelihood of needing to rebuild an entire node whenever a single drive goes wobbly. After I have replaced a failed or failing disk, how likely is it that the disk contained critical information, leaving the node unable to restart and rejoin the cluster?
My current thinking is that commitlog_directory and saved_caches_directory should be on my OS RAID0 partition (just a pair of small disks), with data_file_directories pointing to my large drives, which would not use RAID. Is this a good idea or a terrible one? My alternative is to pay the capacity and write bandwidth penalties and go RAID5, with the advantage that drives can be swapped out by data center engineers without shutting the node down, and without redundancy being lowered for several hours while the repair completes.

-- 
Stuart Bishop <stu...@stuartbishop.net>
http://www.stuartbishop.net/
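
P.S. For concreteness, the layout I have in mind would look roughly like the cassandra.yaml excerpt below. This is only a sketch of the idea: the mount points and the number of data drives are made up for illustration, and only the three setting names come from the configuration file itself.

```yaml
# cassandra.yaml (excerpt). All paths below are hypothetical examples.

# Commit log and saved caches on the small OS disk pair.
commitlog_directory: /var/lib/cassandra/commitlog
saved_caches_directory: /var/lib/cassandra/saved_caches

# One entry per large drive, each mounted on its own, no RAID (JBOD).
data_file_directories:
    - /mnt/disk1/cassandra/data
    - /mnt/disk2/cassandra/data
    - /mnt/disk3/cassandra/data
```

With this layout, losing one of the large drives would take out only the SSTables under that one data directory, which is the case I am asking about.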