Re: Cassandra Scaling Questions

2010-08-06 Thread Rob Coli
On 8/5/10 1:42 AM, Oleg Anastasjev wrote: 3.) When using the random partitioner how much difference should be expected (or has been observed) between nodes? 2%? 10%? This depends on data. It will distribute keys almost equally between nodes, but sizes of row data could be different for different

Re: Cassandra Scaling Questions

2010-08-05 Thread Oleg Anastasjev
> > 1.) What have you found to be the best ratio of Cassandra row cache to memory free on the system for filesystem cache?  Are you tuning it like an RDBMS so Cassandra has the vast majority of the RAM in the system or are you letting the filesystem cache do some of the work? This depends on your

Re: Cassandra Scaling Questions

2010-08-02 Thread Aaron Morton
I *think* people lean towards more JVM than file cache. Often people email about the JVM running Out Of Memory, so give it more and see how much it's using in your case. Your nodes will have a minimum requirement for memory based on the Memtable Thresholds, cache settings and the usage patterns. It

Re: Cassandra Scaling Questions

2010-08-02 Thread Aaron Morton
Thanks for the tip. Aaron On 03 Aug, 2010, at 11:51 AM, Benjamin Black wrote: On Mon, Aug 2, 2010 at 2:24 PM, Aaron Morton wrote: > > 3.5) Yes load balance restores things, I suggest you run it on one node at a > time. Start with the node with the lowest load. Watching the progress by > watching the

Re: Cassandra Scaling Questions

2010-08-02 Thread Aaron Blew
1.) 16 to 24GB out of how much total system memory? Is this 50% of available system RAM or 90%? Thanks for the reply! -Aaron On Mon, Aug 2, 2010 at 2:24 PM, Aaron Morton wrote: > Will answer as best I can, others will know more. > > 1) Most people seem to lean towards more memory for the JVM,

Re: Cassandra Scaling Questions

2010-08-02 Thread Benjamin Black
On Mon, Aug 2, 2010 at 2:24 PM, Aaron Morton wrote: > > 3.5) Yes load balance restores things, I suggest you run it on one node at a > time. Start with the node with the lowest load. Watching the progress by > watching the streams via JMX or nodetool. > I recommend you _never_ use nodetool loadba

Re: Cassandra Scaling Questions

2010-08-02 Thread Aaron Morton
Will answer as best I can, others will know more. 1) Most people seem to lean towards more memory for the JVM, around 16 to 24GB. Memory is also used by the MemTables and I assume during the compaction processes. 2) Cannot say for sure, but I assume so. Think I've seen the cache with data in it whe

Cassandra Scaling Questions

2010-08-02 Thread Aaron Blew
Hi All, I've got a couple questions that have come up about how Cassandra works and what others are seeing in their environments. Here goes: 1.) What have you found to be the best ratio of Cassandra row cache to memory free on the system for filesystem cache? Are you tuning it like an RDBMS so C