[ 
https://issues.apache.org/jira/browse/HADOOP-15284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Lowe updated HADOOP-15284:
--------------------------------
    Affects Version/s: 3.1.0
              Summary: Docker launch fails when user private filecache 
directory is missing  (was: Could not determine real path of mount)

ContainerLocalizer, which is run for every user-specific localization (i.e., 
PRIVATE and APPLICATION visibility), creates both the 
usercache/_user_/filecache and usercache/_user_/appcache directories whenever 
it runs (see ContainerLocalizer#initDirs).
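
For illustration only, the directory layout described above can be sketched in Java roughly as follows. This is not the actual ContainerLocalizer#initDirs code; the class name UserCacheDirsSketch, the method initUserDirs, and its parameters are hypothetical, and permissions and ownership handling are left out entirely.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper for illustration; NOT the real ContainerLocalizer code.
final class UserCacheDirsSketch {
    private UserCacheDirsSketch() {}

    /**
     * Creates usercache/<user>/filecache and usercache/<user>/appcache
     * under the given NodeManager local dir and returns both paths.
     */
    static List<Path> initUserDirs(Path localDir, String user) throws IOException {
        Path userDir = localDir.resolve("usercache").resolve(user);
        Path fileCache = userDir.resolve("filecache");
        Path appCache = userDir.resolve("appcache");
        // createDirectories also creates missing parents and does not fail
        // when the directory already exists.
        Files.createDirectories(fileCache);
        Files.createDirectories(appCache);
        return List.of(fileCache, appCache);
    }
}
```

Under these assumptions, a call such as initUserDirs(Paths.get("/tmp/hadoop-yarn/nm-local-dir"), "hbase") would leave both per-user directories in place, so the filecache path from the log would exist whenever a user-specific localization had run.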

If this directory is missing, then I'm wondering whether this is a case where 
_nothing_ was localized for this user: no PRIVATE resources and no APPLICATION 
visibility resources either (i.e., only public resources or no resources at 
all).  The only reason this would have worked before YARN-7815 is that the 
container executor creates the container work directory, which exists under 
the usercache/_user_ directory, and that is what it used to mount before the 
changes in YARN-7815.
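
As a rough sketch of why a missing directory matters here: the log in the quoted report shows the launch failing while resolving the real path of the mount source. The Java snippet below mimics that kind of check. It is only an analogy under stated assumptions, not the launcher's actual code; the class MountPathSketch and the method resolveMountSource are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.Optional;

// Hypothetical illustration of a "real path of mount" check.
final class MountPathSketch {
    private MountPathSketch() {}

    /**
     * Returns the resolved real path of a mount source, or empty if the
     * path cannot be resolved (for example because it does not exist).
     */
    static Optional<Path> resolveMountSource(Path source) {
        try {
            // toRealPath requires the file to exist and throws otherwise.
            return Optional.of(source.toRealPath());
        } catch (IOException e) {
            return Optional.empty();
        }
    }
}
```

With this sketch, usercache/_user_ resolves as soon as the container work directory has been created beneath it, while usercache/_user_/filecache resolves only if something actually created that directory.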

> Docker launch fails when user private filecache directory is missing
> --------------------------------------------------------------------
>
>                 Key: HADOOP-15284
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15284
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 3.1.0
>            Reporter: Eric Yang
>            Priority: Major
>
> Docker container is failing to launch in trunk.  The root cause is:
> {code}
> [COMPINSTANCE sleeper-1 : container_1520032931921_0001_01_000020]: 
> [2018-03-02 23:26:09.196]Exception from container-launch.
> Container id: container_1520032931921_0001_01_000020
> Exit code: 29
> Exception message: image: hadoop/centos:latest is trusted in hadoop registry.
> Could not determine real path of mount 
> '/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache'
> Could not determine real path of mount 
> '/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache'
> Invalid docker mount 
> '/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache:/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache',
>  realpath=/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache
> Error constructing docker command, docker error code=12, error 
> message='Invalid docker mount'
> Shell output: main : command provided 4
> main : run as user is hbase
> main : requested yarn user is hbase
> Creating script paths...
> Creating local dirs...
> [2018-03-02 23:26:09.240]Diagnostic message from attempt 0 : [2018-03-02 
> 23:26:09.240]
> [2018-03-02 23:26:09.240]Container exited with a non-zero exit code 29.
> [2018-03-02 23:26:39.278]Could not find 
> nmPrivate/application_1520032931921_0001/container_1520032931921_0001_01_000020//container_1520032931921_0001_01_000020.pid
>  in any of the directories
> [COMPONENT sleeper]: Failed 11 times, exceeded the limit - 10. Shutting down 
> now...
> {code}
> The filecache cannot be mounted because it doesn't exist.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
