Restarting servers

2011-08-12 Thread Jason Baker
So restarting Cassandra servers has a tendency to cause a lot of exceptions like "MaximumRetryException: Retried 6 times. Last failure was UnavailableException()" and "TApplicationException: Internal error processing batch_mutate" (using pycassa). If I restart the servers too quickly, I get "all s

Re: Tuning a column family for archival

2011-08-11 Thread Jason Baker
On Thu, Aug 11, 2011 at 6:14 AM, Edward Capriolo wrote: > > In many regards Cassandra automatically does the correct thing. Other than > the costs of the bloom filters for the table size being in ram, if you never > read or write to those sstables and you are not reusing the row key, the OS > will

Tuning a column family for archival

2011-08-10 Thread Jason Baker
I have a column family that I'm using to archive records. They're mostly kept around for historical purposes. Aside from that, they're mostly considered deleted. It's probably going to be very rare that anyone reads from this table *ever*. I don't really even write to it that much. Does anyone

nodetool repair: No neighbors

2011-07-30 Thread Jason Baker
When I run nodetool repair on a node on my 3-node cluster, I see 3 messages like the following: INFO [manual-repair-6d9a617f-c496-4744-9002-a56909b83d5b] 2011-07-30 18:50:28,464 AntiEntropyService.java (line 636) No neighbors to repair with for system on (0,56713727820156410577229101238628035242]
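
For reference, the upper bound of the range in that log line appears to equal 2**127 // 3, which is what evenly spaced tokens on a 3-node ring would produce. The sketch below is only an illustration: it assumes RandomPartitioner with a token space of 0 to 2**127 and evenly balanced tokens, neither of which the message states, and the helper names are hypothetical.

```python
# Assumption: RandomPartitioner, whose tokens fall in the range 0 .. 2**127.
RING_SIZE = 2 ** 127


def balanced_tokens(node_count):
    """Return evenly spaced initial tokens for node_count nodes (hypothetical helper)."""
    return [i * RING_SIZE // node_count for i in range(node_count)]


def token_ranges(tokens):
    """Return (start, end] pairs; the first pair wraps around the ring."""
    ordered = sorted(tokens)
    # ordered[i - 1] with i == 0 picks the last token, giving the wrapping range.
    return [(ordered[i - 1], ordered[i]) for i in range(len(ordered))]


tokens = balanced_tokens(3)
for start, end in token_ranges(tokens):
    # Printed in the same "(start,end]" form the repair log uses.
    print(f"({start},{end}]")
```

If the cluster really does use balanced tokens, one of the printed ranges should match the one in the log, which would suggest the ring layout itself is as expected.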

Running hadoop jobs against data in remote data center

2011-07-06 Thread Jason Baker
I'm just setting up a Cassandra cluster for my company. For a variety of reasons, we have the servers that run our Hadoop jobs in our local office and our production machines in a collocated data center. We don't want to run Hadoop jobs against Cassandra servers on the other side of the US from u