Also, as a note related to EC2, decide whether you want to be in multiple
availability zones. The highest performance comes from staying in a single
AZ, as all those machines will have *very* high speed interconnects. But
individual AZs can also suffer outages. You can distribute your instances
If you are updating columns quite rapidly, you will scatter the columns
over many sstables as you update them over time. This means that a read
of a specific column will require looking at more sstables to find the
data. Performing a compaction (using nodetool) will merge the sstables
into on
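
To make the point about scattered columns concrete, here is a rough toy
model in Python. It is only a sketch under stated assumptions, not how
Cassandra is implemented: the SSTable type and the read_column and
compact functions are hypothetical names made up for illustration, and
real reads also involve bloom filters, memtables and tombstones, which
are ignored here.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy model: one sstable maps column name -> (timestamp, value).
SSTable = Dict[str, Tuple[int, str]]


def read_column(sstables: List[SSTable], column: str) -> Tuple[Optional[str], int]:
    """Return the newest value of a column and the number of sstables
    that held some version of it."""
    newest: Optional[Tuple[int, str]] = None
    touched = 0
    for table in sstables:
        if column in table:
            touched += 1
            # Keep the version with the highest timestamp.
            if newest is None or table[column][0] > newest[0]:
                newest = table[column]
    value = newest[1] if newest is not None else None
    return value, touched


def compact(sstables: List[SSTable]) -> List[SSTable]:
    """Merge all sstables into one, keeping only the newest version of
    each column."""
    merged: SSTable = {}
    for table in sstables:
        for column, (timestamp, value) in table.items():
            if column not in merged or timestamp > merged[column][0]:
                merged[column] = (timestamp, value)
    return [merged]


# Three flushes, each holding a different version of column "a".
example: List[SSTable] = [
    {"a": (1, "v1")},
    {"a": (2, "v2"), "b": (1, "x")},
    {"a": (3, "v3")},
]
```

In this model, reading "a" from example has to consider three sstables,
while reading it from compact(example) only has to consider one, and both
reads return the same newest value.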
> However Ben Black suggests here that the cleanup will actually only
> impact data deleted through the API:
>
> http://comments.gmane.org/gmane.comp.db.cassandra.user/4437
>
> In this case, I guess that we need not worry too much about the
> setting since we are actually updating, never deleting.
Hi all
We're rolling out a Cassandra cluster on EC2 and I've got a couple of
questions about settings. I'm interested to hear what other people
have experienced with different values and generally seek advice.
*gcgraceseconds*
Currently we configure one setting for all CFs. We experimented with