[
https://issues.apache.org/jira/browse/HADOOP-15284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Lowe updated HADOOP-15284:
--------------------------------
Affects Version/s: 3.1.0
Summary: Docker launch fails when user private filecache
directory is missing (was: Could not determine real path of mount)
ContainerLocalizer, which is run for every user-specific localization (i.e.,
PRIVATE and APPLICATION visibility), creates both the
usercache/_user_/filecache and usercache/_user_/appcache directories whenever
it runs (see ContainerLocalizer#initDirs).
If this directory is missing, then I'm wondering if this is a case where
_nothing_ was localized for this user: no PRIVATE resources and no APPLICATION
visibility resources either (i.e., only public resources or no resources at
all). The only reason this would have worked before YARN-7815 is that the
container executor creates the container work directory, which exists under
the usercache/_user_ directory, and that is what was mounted before the
changes in YARN-7815.
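
To make the suspected failure mode concrete, here is a rough sketch in Python.
It is not the NodeManager or container-executor code. The helper names
(init_user_dirs, mount_source_ok), the application and container directory
names, and the temporary directory standing in for nm-local-dir are all
assumptions made for illustration. It only models the idea that a mount source
which cannot be resolved to a real path is rejected, and that creating the
per-user cache directories up front makes the same check pass.

```python
import tempfile
from pathlib import Path


def init_user_dirs(local_dir, user):
    """Create usercache/<user>/filecache and usercache/<user>/appcache.

    Hypothetical stand-in for the directory creation described above;
    not the actual ContainerLocalizer#initDirs implementation.
    """
    for name in ("filecache", "appcache"):
        (local_dir / "usercache" / user / name).mkdir(parents=True, exist_ok=True)


def mount_source_ok(path):
    """Return True if the mount source resolves to a real, existing path."""
    try:
        path.resolve(strict=True)  # raises FileNotFoundError if missing
        return True
    except FileNotFoundError:
        return False


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        local_dir = Path(tmp)  # stands in for nm-local-dir
        user = "hbase"
        filecache = local_dir / "usercache" / user / "filecache"

        # Only a container work directory exists (illustrative names),
        # i.e. nothing was localized for this user.
        work_dir = local_dir / "usercache" / user / "appcache" / "app_1" / "container_1"
        work_dir.mkdir(parents=True)
        before = mount_source_ok(filecache)

        # After the per-user directories are created, the check passes.
        init_user_dirs(local_dir, user)
        after = mount_source_ok(filecache)
        return before, after


if __name__ == "__main__":
    print(demo())
```

If that model is right, either the launch path needs to create the filecache
directory when no localizer ran, or the mount needs to be skipped when the
directory is absent. Which one is appropriate is a separate question.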
> Docker launch fails when user private filecache directory is missing
> --------------------------------------------------------------------
>
> Key: HADOOP-15284
> URL: https://issues.apache.org/jira/browse/HADOOP-15284
> Project: Hadoop Common
> Issue Type: Bug
> Affects Versions: 3.1.0
> Reporter: Eric Yang
> Priority: Major
>
> Docker container is failing to launch in trunk. The root cause is:
> {code}
> [COMPINSTANCE sleeper-1 : container_1520032931921_0001_01_000020]:
> [2018-03-02 23:26:09.196]Exception from container-launch.
> Container id: container_1520032931921_0001_01_000020
> Exit code: 29
> Exception message: image: hadoop/centos:latest is trusted in hadoop registry.
> Could not determine real path of mount
> '/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache'
> Could not determine real path of mount
> '/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache'
> Invalid docker mount
> '/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache:/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache',
> realpath=/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache
> Error constructing docker command, docker error code=12, error
> message='Invalid docker mount'
> Shell output: main : command provided 4
> main : run as user is hbase
> main : requested yarn user is hbase
> Creating script paths...
> Creating local dirs...
> [2018-03-02 23:26:09.240]Diagnostic message from attempt 0 : [2018-03-02
> 23:26:09.240]
> [2018-03-02 23:26:09.240]Container exited with a non-zero exit code 29.
> [2018-03-02 23:26:39.278]Could not find
> nmPrivate/application_1520032931921_0001/container_1520032931921_0001_01_000020//container_1520032931921_0001_01_000020.pid
> in any of the directories
> [COMPONENT sleeper]: Failed 11 times, exceeded the limit - 10. Shutting down
> now...
> {code}
> The filecache cannot be mounted because it doesn't exist.