Jeff,

I've seen tests showing that 10 Gigabit Ethernet networking benefits
Hadoop clusters, and the benefit is especially pronounced if the Hadoop
nodes use SSDs on the back end. Also, since each node does both storage
I/O and processing, 10GbE NICs that offload protocol processing
are especially beneficial, e.g.:

 

High-Performance Networking for Optimized Hadoop Deployments
<http://www.chelsio.com/wp-content/uploads/2011/08/Hadoop-White-Paper-w-tutorial-8.11.pdf>

 

Saqib

 

-----Original Message-----
From: Jeff Kubina [mailto:[email protected]] 
Sent: Thursday, March 15, 2012 11:30 AM
To: [email protected]
Subject: Most Practical Bandwidth for a Hadoop Cluster?

 

Suppose you have Hadoop jobs that are communication-bound (due to lots of
data shuffling between maps and reduces). What is the most practical network
bandwidth to strive for in such a cluster? I think it should be the
sustained read bandwidth of the disks on the nodes times the number of
nodes, since any more bandwidth than this could not be utilized. Agree or
disagree? If you disagree, could you explain what you think it should be?
Thanks.
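
As a rough sketch of the estimate proposed above (sustained disk read
bandwidth per node times the number of nodes), the arithmetic could look
like the following. The function names and all of the numbers (20 nodes,
4 disks per node, 100 MB/s per disk) are hypothetical placeholders for
illustration, not measurements from any real cluster.

```python
def per_node_disk_bandwidth_gbps(disks_per_node, disk_read_mb_per_s):
    """Sustained disk read bandwidth of one node, in Gbit/s (decimal units)."""
    # MB/s -> Mbit/s (x8) -> Gbit/s (/1000)
    return disks_per_node * disk_read_mb_per_s * 8 / 1000


def aggregate_disk_bandwidth_gbps(nodes, disks_per_node, disk_read_mb_per_s):
    """Proposed upper bound: per-node disk bandwidth times the number of nodes."""
    return nodes * per_node_disk_bandwidth_gbps(disks_per_node, disk_read_mb_per_s)


if __name__ == "__main__":
    # Hypothetical cluster: every value here is an assumption.
    nodes = 20
    disks_per_node = 4
    disk_read_mb_per_s = 100

    per_node = per_node_disk_bandwidth_gbps(disks_per_node, disk_read_mb_per_s)
    total = aggregate_disk_bandwidth_gbps(nodes, disks_per_node, disk_read_mb_per_s)
    print(f"per-node disk bandwidth: {per_node} Gbit/s")
    print(f"aggregate disk bandwidth: {total} Gbit/s")
```

With these assumed values the per-node figure is 3.2 Gbit/s and the
aggregate is 64 Gbit/s, so under this estimate a single node's disks could
already exceed a 1 Gbit/s link while staying below a 10 Gbit/s one.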
