Thanks for the input.
My primary draw to Cassandra is dynamic schema. I could make it work
relationally, perhaps even nicely with something like postgres'
hstore, but I haven't investigated that fully yet. Relatively linear
scaling has its appeal and competitive advantages too. I also find
We saw corruption in the pre-0.4 days. Digg hasn't seen corruption since that got
taken care of. We are only doing this "just in case the shit hits the
fan". Cassandra is rapidly changing and it would be completely careless of us
to forgo a path of using a new database as our primary datastore.
Recent messages to the list regarding durability and backup strategies
lead me to a few questions that other new users may also have.
What's the general experience with corruption to date?
Is it common?
Would I regret operating a single node cluster?
Digg referenced sending snapshots to HDFS