The current Hadoop implementation shuffles map output directly to disk; those disk files are later fetched by the target nodes responsible for running reduce() on the intermediate data.
However, this requires 2x more IO than strictly necessary. If the data were instead shuffled DIRECTLY to the target host, this IO overhead would be removed. I believe that any benefits from writing locally first (compressing, combining) and then doing a transfer can be had by simply allocating a buffer (say, 250-500MB per map task) and transferring data directly. I don't think the savings will be 100% on par with first writing locally, but remember it's already 2x faster by not having to write to disk... so any advantage from first shuffling to the local disk would have to be more than 100% to come out ahead. Writing data to the local disk first could in theory have some practical advantages under certain loads, but I don't think they matter in practice, and direct shuffling is superior. Anyone have any thoughts here?
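To make the idea concrete, here's a rough sketch of the per-map-task buffer I'm imagining. This is NOT Hadoop's actual API; the class and method names (DirectShuffleBuffer, flushToReducer) are hypothetical, and the network push is stubbed out, just to show the buffer-then-stream behavior:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a per-map-task in-memory shuffle buffer that streams
// its contents directly to the target reduce host instead of spilling to
// local disk. Not Hadoop code; names and sizes are illustrative only.
public class DirectShuffleBuffer {
    private final int capacityBytes;                 // e.g. 250-500MB per map task
    private final List<byte[]> pending = new ArrayList<>();
    private int usedBytes = 0;
    private int flushes = 0;                         // network sends so far

    public DirectShuffleBuffer(int capacityBytes) {
        this.capacityBytes = capacityBytes;
    }

    // Called by the map task for each intermediate record. When the buffer
    // would overflow, push the batch over the wire before accepting more.
    public void emit(byte[] record) {
        if (usedBytes + record.length > capacityBytes) {
            flushToReducer();
        }
        pending.add(record);
        usedBytes += record.length;
    }

    // Stand-in for an RPC/socket push to the reduce-side host. Combining and
    // compression could happen here, on the in-memory batch, before sending.
    public void flushToReducer() {
        pending.clear();
        usedBytes = 0;
        flushes++;
    }

    public int getFlushes() { return flushes; }
}
```

The point is that compressing and combining only need a batch of records in one place; an in-memory batch serves that purpose just as well as a spill file, without the extra disk round trip.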
