[
https://issues.apache.org/jira/browse/HADOOP-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andras Bokor resolved HADOOP-2829.
----------------------------------
Resolution: Invalid
It seems obsolete.
> JT should consider the disk each task is on before scheduling jobs...
> ---------------------------------------------------------------------
>
> Key: HADOOP-2829
> URL: https://issues.apache.org/jira/browse/HADOOP-2829
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: eric baldeschwieler
> Priority: Major
>
> The DataNode can support a JBOD config, where blocks exist on explicit disks.
> But this information is not exported or considered by the JT when assigning
> tasks. This leads to non-optimal disk use. If 4 slots are used, 2 running
> tasks will likely be on the same disk, and we observe them running more slowly
> than other tasks on the same machine.
> We could follow a number of strategies to address this.
> For example, the DataNodes could support a "what disk is this block on" call.
> Then the JT could discover the info and assign jobs accordingly.
> Of course, the TT itself uses disks for merge and temp space, and the DataNodes
> on the same machine can be used by off-node sources, so it is not clear that
> optimizing all of this is simple enough to be worth it.
> This issue deserves study.
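
For illustration only, the strategy described in the issue (look up which disk
each block is on, then spread tasks across disks) could be sketched roughly as
below. This is not Hadoop code: `assign_tasks`, `block_to_disk`, and
`running_disks` are hypothetical names, and the block-to-disk map stands in for
the proposed "what disk is this block on" call, which the issue only suggests.
The sketch also ignores the TT's own merge/temp disk use and off-node reads,
which the issue notes would complicate a real solution.

```python
from collections import Counter


def assign_tasks(pending, block_to_disk, running_disks, free_slots):
    """Greedily pick up to free_slots tasks, preferring the least-loaded disk.

    pending:        list of (task_id, block_id) pairs in queue order.
    block_to_disk:  dict mapping block_id -> disk name. Hypothetical: this
                    stands in for the proposed DataNode lookup call. Every
                    pending block is assumed to be present in the map.
    running_disks:  iterable of disk names, one per task already running.
    """
    load = Counter(running_disks)  # tasks currently reading from each disk
    remaining = list(pending)
    chosen = []
    for _ in range(free_slots):
        if not remaining:
            break
        # min() returns the first minimum, so queue order breaks ties.
        best = min(remaining, key=lambda task: load[block_to_disk[task[1]]])
        remaining.remove(best)
        load[block_to_disk[best[1]]] += 1
        chosen.append(best[0])
    return chosen


# Four queued tasks whose blocks live on two disks; two free slots.
pending = [("t1", "b1"), ("t2", "b2"), ("t3", "b3"), ("t4", "b4")]
block_to_disk = {"b1": "disk0", "b2": "disk0", "b3": "disk1", "b4": "disk1"}
picked = assign_tasks(pending, block_to_disk, running_disks=[], free_slots=2)
```

With plain queue order the two slots would go to t1 and t2, both on disk0;
the greedy choice instead takes one task from each disk.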