Hi,

On which node in an HDFS cluster is MapReduce's "Shuffle" phase, which aggregates all values corresponding to a key, performed?
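To make concrete what I mean by "Shuffle", here is a minimal Python sketch of my understanding. It is not Hadoop code, and the names `shuffle` and `map_outputs` are made up for illustration. Each mapper emits (key, value) pairs, and the shuffle groups all values for a key together before they reach the Reducer.

```python
from collections import defaultdict

def shuffle(map_outputs):
    """Group values by key across the outputs of all mappers (illustrative only)."""
    grouped = defaultdict(list)
    for pairs in map_outputs:        # one list of (key, value) pairs per mapper
        for key, value in pairs:
            grouped[key].append(value)
    # Return the groups ordered by key, as a plain dict.
    return dict(sorted(grouped.items()))

# Hypothetical outputs of two mappers running on two different datanodes.
map_outputs = [
    [("a", 1), ("b", 1)],
    [("a", 1), ("c", 1)],
]
grouped = shuffle(map_outputs)
```

What I am asking is where, physically, this grouping step runs.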
The Map phase happens on the datanode containing a block, and I assume that the Reduce phase happens on some arbitrary free node. But on which node is the Shuffle phase performed, given that it aggregates values from all datanodes before passing them to the Reducer? Is it performed on the client node?

Thank you,
-- Pratyush Das
