Re: real-world dataset from social network?

2010-05-20 Thread Matt Revelle
It's unclear whether you're looking for data that can be stored in Cassandra or for an example of someone using Cassandra to store a network; I'm assuming the former. You will have a hard time finding a free social network dataset with relationships already well defined. I have seen crawls of Twitte

Re: timeout while running simple hadoop job

2010-05-07 Thread Matt Revelle
On May 7, 2010, at 9:40, gabriele renzi wrote: On Fri, May 7, 2010 at 2:53 PM, Matt Revelle wrote: re: not reporting, I thought this was not needed with the new mapred api (Mapper class vs Mapper interface), plus I can see that the mappers do work, report percentage and happily terminate

Re: timeout while running simple hadoop job

2010-05-07 Thread Matt Revelle
There's also the mapred.task.timeout property that can be tweaked. But reporting is the correct way to fix timeouts during execution.

On May 7, 2010, at 8:49 AM, Joseph Stein wrote:
> The problem could be that you are crunching more data than will be
> completed within the interval expire setti
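
The reporting approach mentioned in this reply can be sketched roughly as follows. This is a minimal, self-contained illustration and not code from the thread: ProgressReporter, CountingReporter, processAll, and reportEvery are hypothetical names made up here so the example runs without Hadoop on the classpath. In a real job using the new mapreduce API, the equivalent call would presumably be context.progress() on the Mapper's Context, made from inside the long-running loop.

```java
import java.util.Arrays;
import java.util.List;

public class ProgressSketch {

    /** Hypothetical stand-in for the progress hook a framework hands to a task. */
    public interface ProgressReporter {
        void progress();
    }

    /** Reporter that only counts how often it was pinged (for demonstration). */
    public static class CountingReporter implements ProgressReporter {
        public int calls = 0;

        @Override
        public void progress() {
            calls++;
        }
    }

    /**
     * Processes every record and pings the reporter after each
     * reportEvery records, so a long-running task keeps signalling
     * that it is still alive instead of relying on a longer timeout.
     */
    public static int processAll(List<String> records,
                                 ProgressReporter reporter,
                                 int reportEvery) {
        int processed = 0;
        for (String record : records) {
            expensiveWork(record);
            processed++;
            if (processed % reportEvery == 0) {
                reporter.progress();
            }
        }
        return processed;
    }

    /** Placeholder for slow per-record work (assumption: real work goes here). */
    static void expensiveWork(String record) {
        record.hashCode();
    }

    public static void main(String[] args) {
        CountingReporter reporter = new CountingReporter();
        int n = processAll(Arrays.asList("a", "b", "c", "d", "e"), reporter, 2);
        System.out.println(n + " records, " + reporter.calls + " progress calls");
    }
}
```

Reporting every N records rather than on every record keeps the overhead small while still signalling liveness well inside the timeout window. Raising mapred.task.timeout only hides the problem if the per-record work grows later.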

Re: Cassandra training on May 21 in Palo Alto

2010-05-07 Thread Matt Revelle
Reston, VA is a good spot in the DC metro area for tech events. The recent Pragmatic Programmer Clojure class sold out and already has two more return visits planned.

On May 7, 2010, at 6:42 AM, S Ahmed wrote:
> toronto :)
>
> If not toronto, Virginia.
>
> On Thu, May 6, 2010 at 5:28 PM, Jo