Re: Combining all CFs into one big one

2011-05-02 Thread Tyler Hobbs
On Mon, May 2, 2011 at 12:06 PM, David Boxenhorn wrote:
> I guess I'm still feeling fuzzy on this because my actual use-case isn't so black-and-white. I don't have any CFs that are accessed purely, or even mostly, in once-through batch mode. What I have is CFs with more and less data, and C

Re: Combining all CFs into one big one

2011-05-02 Thread David Boxenhorn
I guess I'm still feeling fuzzy on this because my actual use-case isn't so black-and-white. I don't have any CFs that are accessed purely, or even mostly, in once-through batch mode. What I have is CFs with more and less data, and CFs that are accessed more and less frequently. On Mon, May 2, 20

Re: Combining all CFs into one big one

2011-05-02 Thread Tyler Hobbs
On Mon, May 2, 2011 at 5:05 AM, David Boxenhorn wrote:
> Wouldn't it be the case that the once-used rows in your batch process would quickly be traded out of the cache, and replaced by frequently-used rows?
Yes, and you'll pay a cache miss penalty for each of the replacements.
> This would
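The trade-off in this exchange can be illustrated with a toy simulation. This is only a sketch under stated assumptions: it models a single cache with a plain LRU eviction policy, and the class name, row names, and sizes are all hypothetical. It is not Cassandra's row cache. It counts how many frequently used rows miss after a once-through batch scan has passed through the same cache.

```python
from collections import OrderedDict


class LRUCache:
    """Toy LRU cache that tracks only key presence and counts misses."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.rows = OrderedDict()
        self.misses = 0

    def get(self, key):
        if key in self.rows:
            # Hit: mark the row as most recently used.
            self.rows.move_to_end(key)
            return True
        # Miss: load the row, evicting the least recently used one if full.
        self.misses += 1
        self.rows[key] = True
        if len(self.rows) > self.capacity:
            self.rows.popitem(last=False)
        return False


def misses_after_batch(capacity, hot_rows, batch_rows):
    """Count misses on the hot rows after a once-through batch scan."""
    cache = LRUCache(capacity)
    hot = [f"hot-{i}" for i in range(hot_rows)]
    for key in hot:                  # warm the cache with the hot rows
        cache.get(key)
    for i in range(batch_rows):      # batch job touches each row exactly once
        cache.get(f"batch-{i}")
    cache.misses = 0                 # only count misses from here on
    for key in hot:                  # frequently used rows are read again
        cache.get(key)
    return cache.misses


if __name__ == "__main__":
    print(misses_after_batch(capacity=100, hot_rows=50, batch_rows=0))
    print(misses_after_batch(capacity=100, hot_rows=50, batch_rows=100))
```

Under these assumptions both points hold at once: the hot rows do return to the cache after the scan, but each one costs a miss on the way back in.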

Re: Combining all CFs into one big one

2011-05-02 Thread David Boxenhorn
Wouldn't it be the case that the once-used rows in your batch process would quickly be traded out of the cache, and replaced by frequently-used rows? This would be the case even if your batch process goes on for a long time, since caching is done on a row-by-row basis. In effect, it would mean that

Re: Combining all CFs into one big one

2011-05-01 Thread Tyler Hobbs
> If you had one big cache, wouldn't it be the case that it's mostly populated with frequently accessed rows, and less populated with rarely accessed rows?
Yes.
> In fact, wouldn't one big cache dynamically and automatically give you exactly what you want? If you try to partition the same

Re: Combining all CFs into one big one

2011-05-01 Thread David Boxenhorn
If you had one big cache, wouldn't it be the case that it's mostly populated with frequently accessed rows, and less populated with rarely accessed rows? In fact, wouldn't one big cache dynamically and automatically give you exactly what you want? If you try to partition the same amount of memory

Re: Combining all CFs into one big one

2011-05-01 Thread Tyler Hobbs
On Sun, May 1, 2011 at 2:16 PM, Jake Luciani wrote:
> On Sun, May 1, 2011 at 2:58 PM, shimi wrote:
>> On Sun, May 1, 2011 at 9:48 PM, Jake Luciani wrote:
>>> If you have N column families you need N * memtable size of RAM to support this. If that's not an option you can merge them

Re: Combining all CFs into one big one

2011-05-01 Thread Jake Luciani
On Sun, May 1, 2011 at 2:58 PM, shimi wrote:
> On Sun, May 1, 2011 at 9:48 PM, Jake Luciani wrote:
>> If you have N column families you need N * memtable size of RAM to support this. If that's not an option you can merge them into one as you suggest but then you will have much larger SS

Re: Combining all CFs into one big one

2011-05-01 Thread shimi
On Sun, May 1, 2011 at 9:48 PM, Jake Luciani wrote:
> If you have N column families you need N * memtable size of RAM to support this. If that's not an option you can merge them into one as you suggest but then you will have much larger SSTables, slower compactions, etc.
> I don't necessa

Re: Combining all CFs into one big one

2011-05-01 Thread Jake Luciani
If you have N column families you need N * memtable size of RAM to support this. If that's not an option you can merge them into one as you suggest but then you will have much larger SSTables, slower compactions, etc. I don't necessarily agree with Tyler that the OS cache will be less effective..
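As a rough worked example of the N * memtable size rule stated here: the function below is a hypothetical helper, and the 64 MB memtable size is an illustrative assumption, not a figure from this thread. The count of 40 CFs comes from the original question.

```python
def memtable_ram_mb(num_cfs, memtable_mb):
    """RAM needed for memtables under the rule 'N column families * memtable size'."""
    return num_cfs * memtable_mb


if __name__ == "__main__":
    # 40 CFs as in the original question; 64 MB per memtable is an assumed value.
    print(memtable_ram_mb(num_cfs=40, memtable_mb=64))
    # The same data merged into a single CF, with the same assumed memtable size.
    print(memtable_ram_mb(num_cfs=1, memtable_mb=64))
```

This only models the rule as quoted. It says nothing about the larger SSTables and slower compactions that the merged layout trades for the saved RAM.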

Re: Combining all CFs into one big one

2011-05-01 Thread Tyler Hobbs
When you have a high number of CFs, it's a good idea to consider merging CFs with highly correlated access patterns and similar structure into one. It is *not* a good idea to merge all of your CFs into one (unless they all happen to meet these criteria). Here's why: Besides big compactions and long

Re: Combining all CFs into one big one

2011-05-01 Thread David Boxenhorn
Shouldn't these kinds of problems be solved by Cassandra? Isn't there a maximum SSTable size?
On Sun, May 1, 2011 at 3:24 PM, shimi wrote:
> Big sstables, long compactions, in major compaction you will need to have free disk space in the size of all the sstables (which you should have anyway

Re: Combining all CFs into one big one

2011-05-01 Thread shimi
Big sstables, long compactions, in major compaction you will need to have free disk space in the size of all the sstables (which you should have anyway).
Shimi
On Sun, May 1, 2011 at 2:03 PM, David Boxenhorn wrote:
> I'm having problems administering my cluster because I have too many CFs (~4

Combining all CFs into one big one

2011-05-01 Thread David Boxenhorn
I'm having problems administering my cluster because I have too many CFs (~40). I'm thinking of combining them all into one big CF. I would prefix the current CF name to the keys, repeat the CF name in a column, and index the column (so I can loop over all rows, which I have to do sometimes, for s
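The key scheme described in this message might look roughly like the sketch below. Everything named here is an assumption made for illustration: the ":" separator, the "cf" column name, and the helper functions are hypothetical, and the in-memory dict with a filter only stands in for a real column family with a secondary index on that column.

```python
SEP = ":"          # assumed separator between CF name and original row key
CF_COLUMN = "cf"   # assumed name of the column that repeats the CF name


def merged_key(cf_name, row_key):
    """Prefix the original CF name to the row key."""
    if SEP in cf_name:
        raise ValueError("CF name must not contain the separator")
    return f"{cf_name}{SEP}{row_key}"


def split_key(key):
    """Recover (cf_name, row_key) from a merged key."""
    cf_name, _, row_key = key.partition(SEP)
    return cf_name, row_key


def merged_row(cf_name, row_key, columns):
    """Build (key, columns) for the merged CF, repeating the CF name in a column."""
    row = dict(columns)
    row[CF_COLUMN] = cf_name
    return merged_key(cf_name, row_key), row


def rows_for_cf(store, cf_name):
    """Stand-in for an indexed lookup: all rows that came from one original CF."""
    return {k: v for k, v in store.items() if v.get(CF_COLUMN) == cf_name}


if __name__ == "__main__":
    store = dict([
        merged_row("Users", "alice", {"age": "30"}),
        merged_row("Orders", "alice", {"total": "9"}),
    ])
    print(sorted(store))
    print(rows_for_cf(store, "Users"))
```

The prefix keeps identical row keys from different original CFs from colliding, while the repeated column is what allows looping over all rows of one former CF.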