Re: Cassandra OOM - 1.0.2

2012-02-07 Thread Ajeet Grewal
On Tue, Feb 7, 2012 at 10:45 AM, aaron morton wrote:
> Just to ask the stupid question, have you tried setting it really high?
> Like 50?

No, I have not. I moved to mmap_index_only as a stopgap solution. Is it possible for there to be that many mmaps for about 300 db files?

-- Regards,

Re: sensible data model ?

2012-02-06 Thread Ajeet Grewal
> It's also a good idea to partition time series data so that the rows do not
> grow too big. You can have 2 billion columns in a row, but big rows have
> operational down sides.

What are the down sides here? Unfortunately I have an existing system which I modeled with large rows (because I use th

Re: Cassandra OOM - 1.0.2

2012-02-06 Thread Ajeet Grewal
Here are the last few lines of strace (of one of the threads). There are a bunch of mmap system calls. Notice the last mmap call a couple of lines before the trace ends. Could the last mmap call fail?

== BEGIN STRACE ==
mmap(NULL, 2147487599, PROT_READ, MAP_SHARED, 37, 0xbb000) = 0x7709b54000
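
One way to answer the "could the last mmap call fail?" question is to scan the whole trace rather than eyeball its tail. The following is only a rough sketch, assuming the trace is in strace's usual `call(args) = return` layout, where a failed call shows `-1` followed by an errno name. The helper name `parse_mmap_calls` is made up for illustration and is not part of Cassandra or strace.

```python
import re

# Matches lines such as:
#   mmap(NULL, 2147487599, PROT_READ, MAP_SHARED, 37, 0xbb000) = 0x7709b54000
# and, for a failed call, a "-1" return followed by an errno name.
MMAP_RE = re.compile(
    r"mmap\((?P<args>[^)]*)\)\s*=\s*(?P<ret>-?\w+)(?:[ \t]+(?P<errno>E[A-Z]+))?"
)


def parse_mmap_calls(strace_text):
    """Return one dict per mmap call found in strace output (hypothetical helper)."""
    calls = []
    for match in MMAP_RE.finditer(strace_text):
        args = [arg.strip() for arg in match.group("args").split(",")]
        # The second mmap argument is the requested mapping length in bytes.
        length = int(args[1], 0) if len(args) > 1 else None
        calls.append(
            {
                "length": length,
                "ret": match.group("ret"),
                "errno": match.group("errno"),
                "failed": match.group("ret") == "-1",
            }
        )
    return calls
```

Running this over the full strace output and filtering on `failed` would show whether any mmap returned an error, and with which errno, instead of relying on the last few lines alone.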

Re: Cassandra OOM - 1.0.2

2012-02-06 Thread Ajeet Grewal
On Mon, Feb 6, 2012 at 11:50 AM, Ajeet Grewal wrote:
> On Sat, Feb 4, 2012 at 7:03 AM, Jonathan Ellis wrote:
>> Sounds like you need to increase sysctl vm.max_map_count
>
> This did not work. I increased vm.max_map_count from 65536 to 131072.
> I am still getting the same err

Re: Cassandra OOM - 1.0.2

2012-02-06 Thread Ajeet Grewal
On Sat, Feb 4, 2012 at 7:03 AM, Jonathan Ellis wrote:
> Sounds like you need to increase sysctl vm.max_map_count

This did not work. I increased vm.max_map_count from 65536 to 131072. I am still getting the same error.

ERROR [SSTableBatchOpen:4] 2012-02-06 11:43:50,463 AbstractCassandraDaemon.jav
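
Before raising vm.max_map_count further, it may help to check how close the process actually is to the limit. Below is a minimal sketch, assuming Linux and the standard `/proc/<pid>/maps` line layout (address range, permissions, offset, device, inode, optional path). The function names `count_mappings` and `mapping_headroom` are invented for this example.

```python
from collections import Counter


def count_mappings(maps_text):
    """Count mappings per backing path in /proc/<pid>/maps-style text."""
    counts = Counter()
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        path = fields[5].strip() if len(fields) == 6 else "[anonymous]"
        counts[path] += 1
    return counts


def mapping_headroom(maps_text, max_map_count):
    """Return (total mappings, mappings left before hitting the limit)."""
    total = sum(count_mappings(maps_text).values())
    return total, max_map_count - total
```

On a live node, the inputs would come from reading `/proc/<pid>/maps` for the Cassandra process and `/proc/sys/vm/max_map_count` for the limit. If the total is nowhere near the limit, the mmap failure probably has another cause, and the per-path counts show how many segments each data file is split into.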

Cassandra OOM - 1.0.2

2012-02-03 Thread Ajeet Grewal
Hey guys, I am getting an out of memory (mmap failed) error with Cassandra 1.0.2. The relevant log lines are pasted at http://pastebin.com/UM28ZC1g. Cassandra works fine until it reaches about 300-400GB of load (on one instance, I have 12 nodes RF=2). Then nodes start failing with such errors. Th