On 2/14/19 11:09 PM, Vinay Kashyap wrote:
I am running hadoop on my mac and all the folders have *myuser:staff*
as the owner. I have verified the permissions for the local dirs to be
755.
This doesn't sound right. By-the-book, there are supposed to be separate
"users" for hdfs, yarn, and mapred to run their respective daemons. The
directories they read/write in are supposed to be permed and owned to
expect that. One possible approach for purposes of log-writing etc. is
to put those user accounts in a group (perhaps named "hadoop") so that
read/written areas in common are owned by that group and permed accordingly.
If you're going to ad-lib that arrangement then you'll have to ad-lib a
lot of the rest of how worker nodes and edge nodes behave accordingly.
I run all hadoop services with myuser and I have configured
*yarn.nodemanager.linux-container-executor.group=staff* accordingly
in both *yarn-site.xml* and *container-executor.cfg*.
1. Is the container-executor binary certified to work as expected on
OS X?
2. When the Linux container executor is configured, is there a hard
expectation that the users running the hadoop services be among
[*root, hdfs, yarn...*] and the group be *hadoop*, so that the
directory permissions fall in line accordingly?
Can you please help me understand this? I could not find any write-up
on this.
On Thu, Feb 14, 2019 at 11:13 PM Prabhu Josephraj
<[email protected]> wrote:
In the case of a Distributed Shell job, the ApplicationMaster runs in a
normal Linux container and the subsequent shell command runs inside a
Docker container. The job fails even before launching the AM, that is,
before starting the Docker container. I think the Distributed Shell job
will fail even without the Docker settings.
As per error code 20, it is mostly related to accessing the NM
local directory.
https://www.cloudera.com/documentation/enterprise/5-8-x/topics/cdh_sg_yarn_container_exec_errors.html
Error code 20 (INITIALIZE_USER_FAILED): Couldn't get, stat, or secure
the per-user NodeManager directory.
Can we try the steps below on every NodeManager machine?
Remove all contents under /data/yarn and make sure the /data and
/data/yarn directories have permission 755 with owner root:root, and
that the local directory is owned by yarn:hadoop.
[root@tparimi-tarunhdp26-4 ~]# ls -lrt /
drwxr-xr-x. 5 root root 44 Oct 24 11:47 data
[root@tparimi-tarunhdp26-4 ~]# ls -lrt /data/
drwxr-xr-x. 4 root root 28 Oct 24 14:30 yarn
[root@tparimi-tarunhdp26-4 ~]# ls -lrt /data/yarn/
total 4
drwxr-xr-x. 5 yarn hadoop 54 Feb 14 17:32 local
drwxrwxr-x. 10 yarn hadoop 4096 Feb 14 17:32 log
Also check whether the Distributed Shell job runs fine without the
Docker settings.
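The ownership and mode checks described above can also be scripted. The following is a minimal sketch in Python, assuming the paths, owners, and modes shown in the listing; the helper names `describe` and `check` are made up for illustration and are not part of Hadoop.

```python
import grp
import os
import pwd
import stat


def describe(path):
    """Return (owner name, group name, permission bits) for path."""
    st = os.stat(path)
    try:
        owner = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        owner = str(st.st_uid)  # uid has no passwd entry
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group = str(st.st_gid)  # gid has no group entry
    return owner, group, stat.S_IMODE(st.st_mode)


def check(path, owner, group, mode):
    """Return a list of human-readable mismatches (empty if all match)."""
    actual_owner, actual_group, actual_mode = describe(path)
    problems = []
    if actual_owner != owner:
        problems.append(f"{path}: owner is {actual_owner}, expected {owner}")
    if actual_group != group:
        problems.append(f"{path}: group is {actual_group}, expected {group}")
    if actual_mode != mode:
        problems.append(f"{path}: mode is {actual_mode:o}, expected {mode:o}")
    return problems


# Expected values copied from the listing in this message (an assumption
# for any other cluster layout).
EXPECTED = [
    ("/data", "root", "root", 0o755),
    ("/data/yarn", "root", "root", 0o755),
    ("/data/yarn/local", "yarn", "hadoop", 0o755),
    ("/data/yarn/log", "yarn", "hadoop", 0o775),
]

if __name__ == "__main__":
    for path, owner, group, mode in EXPECTED:
        try:
            for problem in check(path, owner, group, mode):
                print(problem)
        except OSError as exc:
            print(f"{path}: cannot stat ({exc})")
```

Running it on each NodeManager host prints one line per mismatch and nothing when the directories match the listing.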
On Thu, Feb 14, 2019 at 10:15 PM Vinay Kashyap
<[email protected]> wrote:
Hi Prabhu,
Thanks for your reply.
I tried the configurations as per your suggestion. But I get
the same error.
Is this related to container localization by any chance?
Also, is there any log or output which says that the
Docker container runtime has been picked up?
On Thu, Feb 14, 2019 at 9:38 PM Prabhu Josephraj
<[email protected]> wrote:
Hi Vinay,
Can you try specifying the configs below under the Docker
section in container-executor.cfg? They will allow Docker
containers to use the NM local dirs.
docker.allowed.ro-mounts=/data/yarn/local,/usr/jdk64/jdk1.8.0_112/bin
docker.allowed.rw-mounts=/data/yarn/local,/data/yarn/log
Thanks,
Prabhu Joseph
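To sanity-check that the NM local and log dirs fall under the configured mount lists, a rough sketch in Python follows. It only does a prefix comparison on absolute paths and is not how container-executor itself validates mounts; `parse_mounts` and `is_covered` are hypothetical helper names. Empty entries, such as one left by a stray double comma, are dropped.

```python
import os


def parse_mounts(value):
    """Split a comma-separated mount list, dropping empty entries."""
    return [os.path.normpath(p.strip()) for p in value.split(",") if p.strip()]


def is_covered(path, allowed):
    """True if the absolute path equals or lies under one of the allowed dirs."""
    path = os.path.normpath(path)
    return any(os.path.commonpath([path, prefix]) == prefix for prefix in allowed)


if __name__ == "__main__":
    # Value taken from the docker.allowed.rw-mounts line above.
    rw = parse_mounts("/data/yarn/local,/data/yarn/log")
    for nm_dir in ("/data/yarn/local", "/data/yarn/log"):
        print(nm_dir, "covered" if is_covered(nm_dir, rw) else "NOT covered")
```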
On Thu, Feb 14, 2019 at 9:28 PM Vinay Kashyap
<[email protected]> wrote:
I am using Hadoop 3.2.0 and trying to run a simple
application in a Docker container. I have made the
required configuration changes in both
*yarn-site.xml* and *container-executor.cfg* to
choose LinuxContainerExecutor and the Docker runtime.
I am using the distributed shell example from one of the
Hortonworks blog posts.
https://hortonworks.com/blog/trying-containerized-applications-apache-hadoop-yarn-3-1/
The problem I face is that when the application is
submitted to YARN, it fails with a directory creation
error, shown below:
2019-02-14 20:51:16,450 INFO
distributedshell.Client: Got application report
from ASM for, appId=2, clientToAMToken=null,
appDiagnostics=Application
application_1550156488785_0002 failed 2 times due
to AM Container for
appattempt_1550156488785_0002_000002 exited with
exitCode: -1000 Failing this attempt.Diagnostics:
[2019-02-14 20:51:16.282]Application
application_1550156488785_0002 initialization
failed (exitCode=20) with output: main : command
provided 0 main : user is myuser main : requested
yarn user is myuser Failed to create directory
/data/yarn/local/nmPrivate/container_1550156488785_0002_02_000001.tokens/usercache/myuser
- Not a directory
I have configured *yarn.nodemanager.local-dirs* in
yarn-site.xml and I can see the same reflected in the YARN
web UI at *localhost:8088/conf*:
<property>
  <name>yarn.nodemanager.local-dirs</name>
  <value>/data/yarn/local</value>
  <final>false</final>
  <source>yarn-site.xml</source>
</property>
I do not understand why it is trying to create the
usercache dir inside the nmPrivate directory.
Note: I have verified the permissions for myuser on
the directories and have also tried clearing the
directories manually as suggested in a related post,
but to no avail. I do not see any additional information
about the container launch failure in any other logs.
How do I debug why the usercache dir is not resolved
properly?
Really appreciate any help on this.
Thanks
Vinay Kashyap
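A "Not a directory" failure when creating a path generally means that some existing parent component of that path is not a directory. In the error above, the `.tokens` entry under nmPrivate is presumably a file, so nothing can be created beneath it. The sketch below, in Python, walks the parents of a path and reports the first one that exists but is not a directory; `first_non_directory_ancestor` is a hypothetical helper, not a Hadoop tool.

```python
import os


def first_non_directory_ancestor(path):
    """Return the first existing parent component of path that is not a
    directory, or None if every existing parent is a directory."""
    parent = os.path.dirname(os.path.abspath(path))
    current = os.sep
    for part in parent.split(os.sep):
        if not part:
            continue
        current = os.path.join(current, part)
        if not os.path.lexists(current):
            return None  # nothing exists from here on, so nothing blocks mkdir
        if not os.path.isdir(current):
            return current  # a file (or similar) sits where a directory is needed
    return None


if __name__ == "__main__":
    # Path copied from the error message in this thread.
    target = (
        "/data/yarn/local/nmPrivate/"
        "container_1550156488785_0002_02_000001.tokens/usercache/myuser"
    )
    print(first_non_directory_ancestor(target))
```

If it prints the `.tokens` path, the local-dirs argument handed to container-executor was probably combined with the token file path, which points at argument handling rather than at permissions.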
--
Thanks and regards
Vinay Kashyap