Thanks Richard,
Note that the SSTable corruption probably only happens as a result of some
testing patterns we’re running. That said, we still want to make sure we can
handle it if it does happen (since the corrupted nodes will NOT be known to be
down and will thus still receive traffic).
In our particular us
We're using Cassandra 1.1 with the Hector 1.1 library. We've found that
reducing the CL when an exception occurs is useful, as it's usually easier to
deal with things not being consistent for a few seconds than with the database
read/write not succeeding at all.
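
To make the "reduce the CL on exception" idea concrete, here is a rough sketch of the retry pattern. It is not our actual code and deliberately does not call Hector: the Level enum, the withDowngrade helper and the simulated read are all made-up names for illustration, and it assumes the client reports a failed read/write as a RuntimeException. It also uses lambdas, so it needs a newer JVM than most Cassandra 1.1 setups run on.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class DowngradingRetry {
    // Hypothetical stand-in for the client's consistency levels.
    public enum Level { QUORUM, ONE }

    // Runs op at each level in the given order (strongest first) and returns
    // the first result that succeeds. If every level fails, the exception
    // from the last (weakest) attempt is rethrown.
    public static <T> T withDowngrade(List<Level> levels, Function<Level, T> op) {
        if (levels.isEmpty()) {
            throw new IllegalArgumentException("no consistency levels given");
        }
        RuntimeException last = null;
        for (Level level : levels) {
            try {
                return op.apply(level);
            } catch (RuntimeException e) {
                last = e; // remember the failure and fall through to the next level
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        // Simulated read that fails at QUORUM (e.g. not enough healthy replicas).
        String result = withDowngrade(Arrays.asList(Level.QUORUM, Level.ONE), level -> {
            if (level == Level.QUORUM) {
                throw new RuntimeException("unavailable at " + level);
            }
            return "read at " + level;
        });
        System.out.println(result);
    }
}
```

In a real client the lambda would issue the actual read or write at the given level, and you would probably only catch the exception types that indicate unavailability or timeouts rather than every RuntimeException.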
We have multiple DCs and use NetworkTopolo