On Tue, Nov 27, 2012 at 11:13:25AM -0500, Ellis H. Wilson III wrote:
> Are these problems EP such that they could be entirely Map tasks?

Not at all. This particular application is to derive optimal feature
extraction algorithms from high-resolution volumetric data (mammal or
primate connectome). At ~8 nm, even a mouse will produce a mountain of
structural data.

> Because otherwise you are going to have a fairly significant shuffle
> stage in your MapReduce application that will lead to overheads moving
> the data over the network and in and out of memory/disk/etc. Shuffling
> can be a real PITA, but it tends to be present in most real-world
> applications I've run into.

The extracted feature set would be much more compact than the raw
dataset (by a factor of at least 10^3 to 10^6), and could be loaded
over the GBit/s network into the main cluster with no problems.

> Maybe you weren't referring to using Hadoop, in which case this
> basically looks just like the FAWN project I had mentioned in the past
> that came out of CMU (with the addition of tiered storage).

http://www.cs.cmu.edu/~fawnproj/ ?

Cute, and probably the right application for the Adapteva project.
If the boards are credit-card sized, you can mount them on a rackmount
tray along with a 24-port switch and a couple of fans.

However, I'm thinking about a board you directly plug your SATA or SAS
hard drive into, probably using the hard drive itself (which should be
5k rpm then) as a heatsink.

_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
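
To put rough numbers on the "mountain of structural data" and on the
GBit/s transfer of the extracted features, here is a back-of-envelope
sketch in Python. Only the ~8 nm resolution, the 10^3 to 10^6 reduction
and the GBit/s link come from the post. The brain volume (about
500 mm^3), one byte per voxel, isotropic voxels, and the helper names
`raw_bytes` and `transfer_seconds` are assumptions made up for
illustration.

```python
# Back-of-envelope estimate. Constants marked "assumed" are illustrative
# guesses, not figures from the thread.
VOXEL_NM = 8.0          # isotropic voxel edge, from the "~8 nm" in the post
BRAIN_MM3 = 500.0       # assumed mouse brain volume, roughly 0.5 cm^3
BYTES_PER_VOXEL = 1     # assumed 8-bit grayscale, uncompressed
LINK_BITS_PER_S = 1e9   # the GBit/s link mentioned in the post


def raw_bytes(volume_mm3, voxel_nm, bytes_per_voxel):
    """Uncompressed size of a volume sampled on an isotropic voxel grid."""
    voxels_per_mm = 1e6 / voxel_nm  # 1 mm = 1e6 nm
    return volume_mm3 * voxels_per_mm ** 3 * bytes_per_voxel


def transfer_seconds(n_bytes, link_bits_per_s):
    """Ideal transfer time over a link, ignoring protocol overhead."""
    return n_bytes * 8 / link_bits_per_s


if __name__ == "__main__":
    raw = raw_bytes(BRAIN_MM3, VOXEL_NM, BYTES_PER_VOXEL)
    print(f"raw volume: {raw:.2e} bytes")
    for ratio in (1e3, 1e6):  # reduction factors quoted in the post
        feats = raw / ratio
        hours = transfer_seconds(feats, LINK_BITS_PER_S) / 3600
        print(f"{ratio:.0e}x reduction: {feats:.2e} bytes, {hours:.1f} h")
```

Under these assumptions the raw volume comes out near 10^18 bytes, and
the transfer time scales linearly with the reduction factor, so the
estimate is only as good as the assumed ratio and brain volume.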