[ https://issues.apache.org/jira/browse/HADOOP-15059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16262809#comment-16262809 ]
Jason Lowe commented on HADOOP-15059:
-------------------------------------
bq. Are we going to keep binary compatibility across hadoop-2.x and hadoop-3.x?
Wire compatibility between 2.x clients and 3.x servers is a prerequisite to
supporting a rolling upgrade from 2.x to 3.x, but I do not think everyone
realizes wire compatibility between a 3.x client and a 2.x server is also very
important to many of our users. There are many cases where more than one
cluster is involved in a workflow. Requiring that all clusters upgrade from
2.x to 3.x simultaneously is a huge hurdle for adoption, and most users will
upgrade them one at a time. As individual clusters upgrade there will be
clients/jobs on a newly upgraded 3.x cluster trying to interact with an older
2.x cluster.
Back to the issue of launching jobs using an incompatible token format --
here are a couple of options we could consider:
1) YARN nodemanager writes out *two* token credential files, the original 2.x
file for backwards compatibility and a new 3.x file. The 3.x UGI code looks
for the new file and falls back to the old one if it cannot find it. The 2.x
code will simply load the old format from the original filename as it does
today.
2) Application submission context contains information on which version of
credentials to use for an application. This gets transferred to the container
launch context for each container, and the nodemanager writes out the
appropriate credentials version based on what was specified in the container
launch context. In other words, the nodemanager knows which version of the
credentials format the container is expecting to find and writes the token file
in that format.
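
To make the two options concrete, here is a rough Java sketch of the file handling each one implies. It is illustrative only: the class, the enum, and the new file name "container_tokens.v3" are hypothetical and are not existing Hadoop code. Only "container_tokens" is taken from the path in the stack trace quoted below. The byte arrays stand in for credentials that have already been serialized in each format.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Illustrative sketch only; none of these names are part of Hadoop. */
final class TokenFileSketch {
    /** File name the 2.x code already reads (seen in the stack trace). */
    static final String LEGACY_FILE = "container_tokens";
    /** Hypothetical name for the additional 3.x-format file. */
    static final String V3_FILE = "container_tokens.v3";

    /** Hypothetical marker for the format a container expects. */
    enum Format { HADOOP_2X, HADOOP_3X }

    /** Option 1, nodemanager side: write both files into the container directory. */
    static void writeBoth(Path containerDir, byte[] legacyBytes, byte[] v3Bytes)
            throws IOException {
        Files.write(containerDir.resolve(LEGACY_FILE), legacyBytes);
        Files.write(containerDir.resolve(V3_FILE), v3Bytes);
    }

    /** Option 1, 3.x reader side: prefer the new file, fall back to the old one. */
    static Path resolveTokenFile(Path containerDir) {
        Path v3 = containerDir.resolve(V3_FILE);
        return Files.exists(v3) ? v3 : containerDir.resolve(LEGACY_FILE);
    }

    /**
     * Option 2, nodemanager side: write a single file under the original name,
     * in whichever format the container launch context asked for.
     */
    static Path writeRequested(Path containerDir, Format requested,
                               byte[] legacyBytes, byte[] v3Bytes) throws IOException {
        byte[] payload = (requested == Format.HADOOP_3X) ? v3Bytes : legacyBytes;
        return Files.write(containerDir.resolve(LEGACY_FILE), payload);
    }
}
```

With option 1 a 2.x container never looks at the extra file, so it needs no changes, at the cost of writing the credentials twice. With option 2 only one file is written, but the submitting client has to declare the version it expects.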
> 3.0 deployment cannot work with old version MR tar ball which break rolling
> upgrade
> -----------------------------------------------------------------------------------
>
> Key: HADOOP-15059
> URL: https://issues.apache.org/jira/browse/HADOOP-15059
> Project: Hadoop Common
> Issue Type: Bug
> Components: security
> Reporter: Junping Du
> Priority: Blocker
>
> I tried to deploy a 3.0 cluster with a 2.9 MR tarball. The MR job failed
> with the following error:
> {noformat}
> 2017-11-21 12:42:50,911 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created MRAppMaster for application appattempt_1511295641738_0003_000001
> 2017-11-21 12:42:51,070 WARN [main] org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> 2017-11-21 12:42:51,118 FATAL [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
> java.lang.RuntimeException: Unable to determine current user
> at org.apache.hadoop.conf.Configuration$Resource.getRestrictParserDefault(Configuration.java:254)
> at org.apache.hadoop.conf.Configuration$Resource.<init>(Configuration.java:220)
> at org.apache.hadoop.conf.Configuration$Resource.<init>(Configuration.java:212)
> at org.apache.hadoop.conf.Configuration.addResource(Configuration.java:888)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1638)
> Caused by: java.io.IOException: Exception reading /tmp/nm-local-dir/usercache/jdu/appcache/application_1511295641738_0003/container_e03_1511295641738_0003_01_000001/container_tokens
> at org.apache.hadoop.security.Credentials.readTokenStorageFile(Credentials.java:208)
> at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:907)
> at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:820)
> at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:689)
> at org.apache.hadoop.conf.Configuration$Resource.getRestrictParserDefault(Configuration.java:252)
> ... 4 more
> Caused by: java.io.IOException: Unknown version 1 in token storage.
> at org.apache.hadoop.security.Credentials.readTokenStorageStream(Credentials.java:226)
> at org.apache.hadoop.security.Credentials.readTokenStorageFile(Credentials.java:205)
> ... 8 more
> 2017-11-21 12:42:51,122 INFO [main] org.apache.hadoop.util.ExitUtil: Exiting with status 1: java.lang.RuntimeException: Unable to determine current user
> {noformat}
> I think it is due to an incompatible token format change between 2.9 and 3.0.
> As we claim "rolling upgrade" is supported in Hadoop 3, we should fix this
> before we ship 3.0; otherwise all running MR applications will get stuck
> during/after upgrade.