Re: OOM recovering failed node with many CFs

2011-05-26 Thread Jonathan Ellis
We've applied a fix to the 0.7 branch in https://issues.apache.org/jira/browse/CASSANDRA-2714. The patch probably applies to 0.7.6 as well.

On Thu, May 26, 2011 at 11:36 AM, Flavio Baronti wrote:
> I tried the manual copy you suggest, but the SystemTable.checkHealth() function
> complains it c

Re: OOM recovering failed node with many CFs

2011-05-26 Thread Flavio Baronti
I tried the manual copy you suggest, but the SystemTable.checkHealth() function complains it can't load the system files. The log follows; I will gather some more info and create a ticket as soon as possible.

INFO [main] 2011-05-26 18:25:36,147 AbstractCassandraDaemon.java Logging initialized
INFO

Re: OOM recovering failed node with many CFs

2011-05-26 Thread Jonathan Ellis
Sounds like a legitimate bug, although looking through the code I'm not sure what would cause a tight retry loop on migration announce/rectify. Can you create a ticket at https://issues.apache.org/jira/browse/CASSANDRA ? As a workaround, I would try manually copying the Migrations and Schema sstab

OOM recovering failed node with many CFs

2011-05-26 Thread Flavio Baronti
I can't seem to recover a failed node on a database where I made many updates to the schema. I have a small cluster with 2 nodes, around 1000 CFs (I know it's a lot, but it can't be changed right now), and ReplicationFactor=2. I shut down a node and cleaned its data entirely, then trie