On 11/27/2012 08:14 AM, Vincent Diepeveen wrote:
> Don't post something ridiculous like that.
You both are right, so let's stop antagonizing the antagonist here.

Laptops are a reasonable place to toy around with Hadoop and educate oneself about it, but they are also not the ideal environment for Hadoop (obviously, and I don't think this was the intention of the poster). However, this is not because of InfiniBand or some such thing -- most production Hadoop setups use 1Gb Ethernet, or 10GbE if they are very lucky. Only a few use InfiniBand, and any efforts to use Hadoop over RDMA are very, very recent (saw a few at SC12) and the benefits are yet to be determined IMHO.

But networking isn't the real reason why Hadoop and portable commodity HW won't work well together -- the real reason is that to really take advantage of Hadoop you should have multiple HDDs per box and, most importantly, you need a LOT of RAM to get the best performance (complicated explanation revolving around data spilling during computation and whatnot). Laptops tend not to have lots of RAM packed in.

So, to get back to your original query here Jonathan: yes, you can run Hadoop (with some effort) on a "storage server," which I interpret in this case to be a NAS box of some sort. However, please note that typical Hadoop already couples the "compute server" and the "storage server" by co-locating both the compute and storage daemons on the same box. The MapReduce scheduler attempts (sometimes poorly, this is in need of serious improvement) to schedule tasks on machines which already hold the data, which mitigates the need for super huge network pipes to push the data to that node. So in a way, by using Hadoop in its traditional incarnation you already achieve this.

I've done some testing of Hadoop within NAS, although most of my research has been on Hadoop /on/ NAS, or in other words, getting rid of the storage aspect of Hadoop and solely leveraging the MapReduce framework atop existing NAS storage. This can be done efficiently, but it's easy to mistakenly head down an inefficient path.
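
To make the data-locality point above a bit more concrete, here is a minimal Python sketch of a greedy, locality-aware placement. It is emphatically not Hadoop's actual scheduler: the function name `assign_tasks`, the block and node names, and the slot counts are all made up for illustration, and the only idea it borrows is "prefer a node that already stores the block, fall back to a remote read otherwise."

```python
def assign_tasks(block_replicas, slots_per_node):
    """Greedily place one map task per block, preferring replica-holding nodes.

    block_replicas: dict of block id -> list of node names storing a copy
    slots_per_node: dict of node name -> number of free task slots
    Returns dict of block id -> (node, is_local). Blocks that find no free
    slot anywhere are left out (a real scheduler would queue them).
    """
    free = dict(slots_per_node)  # work on a copy; do not mutate the caller's dict
    placement = {}
    for block, replicas in block_replicas.items():
        # First choice: a node that already stores the block (no network read).
        local = [n for n in replicas if free.get(n, 0) > 0]
        if local:
            node = max(local, key=lambda n: free[n])  # most free slots wins
            is_local = True
        else:
            # Fallback: any node with a free slot; the block crosses the network.
            remote = [n for n, s in free.items() if s > 0]
            if not remote:
                continue
            node = max(remote, key=lambda n: free[n])
            is_local = False
        free[node] -= 1
        placement[block] = (node, is_local)
    return placement


if __name__ == "__main__":
    # Hypothetical three-node cluster with one free slot per node.
    blocks = {"b1": ["n1", "n2"], "b2": ["n1"], "b3": ["n1"]}
    slots = {"n1": 1, "n2": 1, "n3": 1}
    for block, (node, is_local) in assign_tasks(blocks, slots).items():
        print(block, "->", node, "(local)" if is_local else "(remote read)")
```

In this toy input the greedy pass gives b1 the only slot on n1, so b2 and b3 both end up as remote reads, even though placing b1 on n2 would have kept two of the three tasks local. That is the sort of "sometimes poorly" outcome I mean, and it is why the network still matters even with co-located compute and storage.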
Hopefully this (finally) answers some part of your original question,

ellis
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf