Re: Column Family migration/tombstones

2013-01-07 Thread aaron morton
> are there two rows being tracked by bloomfilters
Yes. Bloom filters are just for the SSTables.
> or does Cassandra possibly do something more efficient?
Bloom filters are a space-efficient data structure. You can reduce their size by adjusting the bloom_filter_fp_chance.
> are bloomfilters ac
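To make the size/false-positive trade-off mentioned above concrete, here is a rough sketch using the textbook formula for an optimally sized Bloom filter (bits per key = -ln(p) / (ln 2)^2). This is an assumption for illustration only: Cassandra's actual sizing may differ, and the helper names `bits_per_key` and `filter_size_mib` are hypothetical. The 100 million key count is taken from the row count mentioned at the start of this thread.

```python
import math


def bits_per_key(fp_chance: float) -> float:
    """Bits per key for an optimally sized textbook Bloom filter.

    Assumption: the standard formula -ln(p) / (ln 2)^2, not Cassandra's
    exact implementation.
    """
    if not 0.0 < fp_chance < 1.0:
        raise ValueError("fp_chance must be strictly between 0 and 1")
    return -math.log(fp_chance) / (math.log(2) ** 2)


def filter_size_mib(num_keys: int, fp_chance: float) -> float:
    """Approximate total filter size in MiB for num_keys keys."""
    total_bits = num_keys * bits_per_key(fp_chance)
    return total_bits / 8 / (1024 ** 2)


if __name__ == "__main__":
    # Compare a few false-positive targets for 100 million keys.
    for p in (0.001, 0.01, 0.1):
        size = filter_size_mib(100_000_000, p)
        print(f"fp_chance={p}: about {size:.1f} MiB")
```

Under this model, raising the false-positive chance shrinks the filter, at the cost of more unnecessary SSTable reads.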

Re: Column Family migration/tombstones

2013-01-07 Thread Mike
Thanks. Another related question. In the situation described below, where we have a row and a tombstone across more than one SSTable, and it would take a very long time for these SSTables to be compacted, are there two rows being tracked by bloomfilters (since there is a bloom filter per SST

Re: Column Family migration/tombstones

2013-01-06 Thread aaron morton
> Are there other performance considerations that I need to keep in mind?
That's about it. Sylvain has written a script or some such to reverse compaction. It was mentioned sometime in the last month, I think. Sylvain?
> after we are complete the migration should be fairly small (about 500,000

Re: Column Family migration/tombstones

2013-01-06 Thread Mike
Thanks Aaron, I appreciate it. It is my understanding that major compactions are not recommended because they essentially create one massive SSTable that will not compact with any new SSTables for some time. I can see how this might be a performance concern in the general case, because any rea

Re: Column Family migration/tombstones

2013-01-06 Thread aaron morton
> When these rows are deleted, tombstones will be created and stored in more recent sstables. Upon compaction of sstables, and after gc_grace_period, I presume cassandra will have removed all traces of that row from disk.
Yes. When using Size Tiered compaction (the default) tombstones are pu

Re: Column Family migration/tombstones

2013-01-05 Thread Mike
A couple more questions. When these rows are deleted, tombstones will be created and stored in more recent sstables. Upon compaction of sstables, and after gc_grace_period, I presume cassandra will have removed all traces of that row from disk. However, after deleting such a large amount of

Re: Column Family migration/tombstones

2013-01-02 Thread aaron morton
> 1) As one can imagine, the index and bloom filter for this column family is large. Am I correct to assume that bloom filter and index space will not be reduced until after gc_grace_period?
Yes.
> 2) If I would manually run repair across a cluster, is there a process I can use to safel

Column Family migration/tombstones

2012-12-29 Thread Mike
Hello, We are undergoing a change to our internal data model that will result in the eventual deletion of over a hundred million rows from a Cassandra column family. From what I understand, this will result in the generation of tombstones, which will be cleaned up during compaction, after gc_