Reuse output collectors across maps running on the same jvm
-----------------------------------------------------------
Key: HADOOP-5830
URL: https://issues.apache.org/jira/browse/HADOOP-5830
Project: Hadoop Core
Issue Type: Improvement
Components: mapred
Reporter: Arun C Murthy
We have evidence that cutting the shuffle-crossbar between maps and reduces (m * r) leads to more performant applications, since:
# It cuts down the number of connections necessary to shuffle, and hence reduces load on the serving side (the TaskTracker) and improves latency (terasort, HADOOP-1338, HADOOP-5223)
# It reduces the seeks required for the TaskTracker to serve the map-outputs
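To make the crossbar arithmetic concrete, here is a minimal sketch. The reduce count (1,000) is an assumed, illustrative figure, not from this issue; only the map counts come from the petasort numbers below.

```java
// Illustrative arithmetic only: the shuffle crossbar grows as m * r,
// since every reduce fetches one output segment from every map.
// The reduce count of 1,000 is a hypothetical figure for illustration.
public class ShuffleCrossbar {
    public static long fetches(long maps, long reduces) {
        return maps * reduces;
    }

    public static void main(String[] args) {
        long before = fetches(800_000L, 1_000L); // 800M segment fetches
        long after  = fetches(80_000L, 1_000L);  // 80M segment fetches
        // A 10x cut in maps is a 10x cut in connections and seeks.
        System.out.println(before / after);
    }
}
```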
So far we've had to manually tune applications to cut down the shuffle-crossbar by having fatter maps with custom input formats etc. For example, we saw a significant improvement while running the petasort when we went from ~800,000 maps to 80,000 maps (1.5G to 15G per map), i.e. from 48+ hours to 16 hours.
The downsides are:
# The burden falls on the application-writer to tune this with custom
input-formats etc.
# The naive method of using a higher min.split.size leads to considerable
non-local I/O on the maps.
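To illustrate the min.split.size approach, here is a sketch of the old-API split sizing rule (splitSize = max(minSize, min(goalSize, blockSize))); the input size and 128MB block size are assumed figures for illustration. Raising minSize far above the block size does shrink the map count, but each split then spans many blocks, most of which are not local to the map.

```java
// Hedged sketch of the classic FileInputFormat split sizing rule.
// Input size and block size below are illustrative assumptions.
public class SplitSizing {
    public static long splitSize(long minSize, long goalSize, long blockSize) {
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    public static long numSplits(long totalBytes, long splitSize) {
        return (totalBytes + splitSize - 1) / splitSize; // ceiling division
    }

    public static void main(String[] args) {
        long total = 1_000L * 1024 * 1024 * 1024 * 1024; // ~1PB of input (assumed)
        long block = 128L * 1024 * 1024;                 // 128MB blocks (assumed)

        // Default: one split per block, so each map reads a local block.
        long defaults = numSplits(total, splitSize(1L, Long.MAX_VALUE, block));
        // Forcing ~1.5G splits: ~12x fewer maps, but each split covers
        // ~12 blocks, and most of those blocks are on other nodes.
        long fat = numSplits(total, splitSize(1536L * 1024 * 1024, Long.MAX_VALUE, block));
        System.out.println(defaults + " -> " + fat);
    }
}
```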
Given these, the proposal is to keep the 'output collector' open across JVM
reuse for maps, thereby enabling 'combiners' across map-tasks. This would have
the happy effect of fixing both of the above. The downsides are that it will add
latency to jobs (since map-outputs cannot be shuffled until a few maps on the
same JVM are done, followed by a final sort/merge/combine) and that the failure
cases get a bit more complicated.
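A minimal in-memory sketch of the idea, with all names hypothetical (this is not Hadoop's actual collector): the collector stays open while successive map tasks in a reused JVM append to it, and only the final close performs the sort and combine, so the combiner sees keys from all maps that ran in that JVM.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

// Hypothetical sketch, not Hadoop's MapOutputBuffer: a collector shared
// across the map tasks of one reused JVM. Each task calls collect();
// close() runs once, after the last task, doing a single sort pass and
// applying the combiner across records from *all* tasks in the JVM.
public class SharedCollector<K extends Comparable<K>, V> {
    private final List<Map.Entry<K, V>> buffer = new ArrayList<>();
    private final BiFunction<V, V, V> combiner;

    public SharedCollector(BiFunction<V, V, V> combiner) {
        this.combiner = combiner;
    }

    // Called by every map task running in this JVM; the buffer is NOT
    // flushed between tasks, which is the crux of the proposal.
    public void collect(K key, V value) {
        buffer.add(new AbstractMap.SimpleEntry<>(key, value));
    }

    // Called after the last map task finishes: sort all accumulated
    // records, then combine adjacent values sharing a key.
    public List<Map.Entry<K, V>> close() {
        buffer.sort(Map.Entry.comparingByKey());
        List<Map.Entry<K, V>> out = new ArrayList<>();
        for (Map.Entry<K, V> e : buffer) {
            int last = out.size() - 1;
            if (last >= 0 && out.get(last).getKey().compareTo(e.getKey()) == 0) {
                out.set(last, new AbstractMap.SimpleEntry<>(e.getKey(),
                        combiner.apply(out.get(last).getValue(), e.getValue())));
            } else {
                out.add(new AbstractMap.SimpleEntry<>(e.getKey(), e.getValue()));
            }
        }
        return out;
    }
}
```

The latency trade-off mentioned above is visible here: nothing is available to shuffle until close() runs, and a task failure mid-buffer dirties output from the other tasks sharing the collector.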
Thoughts? Let's discuss...
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.