Re: Map/Reduce over Cassandra

2010-08-18 Thread Drew Dahlke
Hey Bill, a few months ago we ran an experiment with 5 Hadoop nodes pulling from 4 Cassandra nodes. The job read 1 column family with 8 small columns and just dumped the raw data to HDFS. It was cycling through around 17K map tasks per sec. The machines weren't being taxed too hard, so I'm sure t

Map/Reduce over Cassandra

2010-08-17 Thread Bill Hastings
Hi all, how performant is M/R on Cassandra compared to running it on HDFS? Does anyone have numbers they can share? Specifically, how much data the M/R job was run against, what the throughput was, etc. Any information would be very helpful. -- Cheers, Bill