[jira] [Resolved] (HADOOP-2829) JT should consider the disk each task is on before scheduling jobs...

Andras Bokor (JIRA) Fri, 23 Feb 2018 05:19:27 -0800

     [ https://issues.apache.org/jira/browse/HADOOP-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Andras Bokor resolved HADOOP-2829.
----------------------------------
    Resolution: Invalid

It seems obsolete.

> JT should consider the disk each task is on before scheduling jobs...
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-2829
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2829
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: eric baldeschwieler
>            Priority: Major
>
> The DataNode can support a JBOD config, where blocks exist on explicit disks.
> But this information is not exported or considered by the JT when assigning
> tasks.  This leads to non-optimal disk use.  If 4 slots are used, 2 running
> tasks will likely be on the same disk, and we observe them running more slowly
> than other tasks on the same machine.
> We could follow a number of strategies to address this.
> For example: the data nodes could support a "what disk is this block on?" call.
> Then the JT could discover the info and assign jobs accordingly.
> Of course, the TT itself uses disks for merge and temp space, and the datanodes
> on the same machine can be used by off-node sources, so it is not clear that
> optimizing all of this is simple enough to be worth it.
> This issue deserves study.
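
The strategy suggested in the description (look up which disk holds each
block, then prefer tasks whose input sits on an idle disk) might look roughly
like the sketch below. This is only an illustration and not Hadoop code: the
function pick_task, the block_disk mapping (standing in for the proposed
"what disk is this block on?" call) and the busy_disks list are all
hypothetical names invented for this example.

```python
from collections import Counter


def pick_task(pending, block_disk, busy_disks):
    """Return the pending task whose input block sits on the least-loaded disk.

    pending    -- list of (task_id, block_id) pairs waiting for a slot (hypothetical)
    block_disk -- dict of block_id -> disk name, i.e. the answer the proposed
                  DataNode call would give (hypothetical)
    busy_disks -- disks read by tasks already running on this node (hypothetical)
    """
    # Count how many running tasks are already reading from each disk.
    load = Counter(busy_disks)
    # min() returns the first minimal element, so ties keep queue order.
    return min(pending, key=lambda task: load[block_disk[task[1]]])


pending = [("t1", "blk_1"), ("t2", "blk_2"), ("t3", "blk_3")]
block_disk = {"blk_1": "disk0", "blk_2": "disk1", "blk_3": "disk0"}

# One running task is already reading disk0, so the task on disk1 is preferred.
chosen = pick_task(pending, block_disk, busy_disks=["disk0"])
```

The sketch counts only the input reads of running tasks. It ignores the
merge and temp-space traffic of the TT and reads from off-node sources, which
the description itself names as the reasons the full problem may not be
simple enough to be worth optimizing.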



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

