date:20230308

[DISCUSS] Enhanced Disk Error Handling

2023-03-08 Thread Bowen Song via dev

At the moment, when a read error, such as unrecoverable bit error or data corruption, occurs in the SSTable data files, regardless of the disk_failure_policy configuration, manual (or to be precise, external) intervention is required to recover from the error. Commonly, there are two approaches to

Re: [DISCUSS] Enhanced Disk Error Handling

2023-03-08 Thread C. Scott Andreas

For this to be safe, my understanding is that:
– A repair of the affected range would need to be completed among the replicas without such corruption (including paxos repair).
– And we'd need a mechanism to execute repair on the affected node without it being available to respond to queries, eith

Re: [DISCUSS] Enhanced Disk Error Handling

2023-03-08 Thread C. Scott Andreas

Realized I’m somewhat mistaken here: the repair of surviving replicas would be necessary for correctness before the node with deleted data files can serve client/internode reads. But the repair of the node with deleted data files prior to being brought back into the cluster is more

Re: [DISCUSS] Enhanced Disk Error Handling

2023-03-08 Thread Bowen Song via dev

> – A repair of the affected range would need to be completed among the replicas without such corruption (including paxos repair).
It can be safe without a repair by over-streaming the data from more (or all) available replicas, either within the DC (when LOCAL_* CL is used) or across the

Re: [DISCUSS] Enhanced Disk Error Handling

2023-03-08 Thread Jeff Jirsa

On Wed, Mar 8, 2023 at 5:25 AM Bowen Song via dev wrote:
> At the moment, when a read error, such as unrecoverable bit error or data
> corruption, occurs in the SSTable data files, regardless of the
> disk_failure_policy configuration, manual (or to be precise, external)
> intervention is require

New episode of The Apache Cassandra (R) Corner podcast!

2023-03-08 Thread Aaron Ploetz

Link to the next episode: https://drive.google.com/file/d/1_EOBpG3yiuptDJ-PU-3a7amSVvi7pgM8/view?usp=sharing s2Ep2 - Aaron Morton (You may have to download it to listen) It will remain in staging for 72 hours, going live (assuming no objections) by Saturday, March 11th (22:00 UTC). If anyone sh
