[
https://issues.apache.org/jira/browse/HADOOP-15059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jason Lowe updated HADOOP-15059:
--------------------------------
Attachment: HADOOP-15059.004.patch
Thanks for joining the conversation, Allen, and for pointing out the
motivations behind the protobuf change. Do you know of existing use cases that
are relying on the new format?
I completely agree the new format is a great path forward for extensibility and
portability, but unfortunately it breaks a number of existing use cases.
bq. Let's be clear: this is only a problem if one has a bundled
hadoop-common.jar.
It's also important to point out that this is a rather common occurrence.
Besides the typical habit of users running their *-with-dependencies.jar on the
cluster, anyone leveraging the framework-on-HDFS approach will be bitten by
this as soon as the nodemanager upgrades.
Having frameworks deploy via HDFS rather than picking them up from the
nodemanager's jars has proven to be a very useful way to better isolate apps
during cluster rolling upgrades and support multiple versions of the framework
on the cluster simultaneously.
bq. Is the end result of this JIRA going to be that all file formats are locked
forever, regardless of where they come from?
I don't think so. As discussed above, we should be able to remove support for
the Writable format when Hadoop no longer supports 2.x apps. Yes, that's
likely quite a long time, but it does not have to be forever.
bq. Hadoop releases have broken rolling upgrade (and non-rolling upgrades, for
that matter) in the middle of the 2.x stream before by removing things such as
container execution types.
We've completed rolling upgrades across all of our clusters for every minor
release of 2.x since rolling upgrades were first supported in 2.6, so we must
not have hit this landmine. Was this the removal of the dedicated Docker
container executor in favor of the unified Linux executor that does everything?
I'm attaching a patch that implements the "bridge release(s)" approach where
the code supports reading the new format but will write the old format by
default. Code can still request the new format explicitly if necessary. The
main drawback is that we don't get to easily leverage the benefits of the new
format since it's not the default. However, I'm hoping native services
and other things that need the new protobuf format can leverage dtutil to
translate the credentials format for easier consumption.
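To make the bridge approach concrete, here is a rough, self-contained sketch of the read/write dispatch it implies. This is a hypothetical illustration, not the actual org.apache.hadoop.security.Credentials code: the class name, payload handling, and enum are invented for the sketch. The idea shown is just the one in the patch description: readers accept both version bytes, writers emit the legacy version unless the new format is explicitly requested, and an unrecognized version byte produces the same kind of "Unknown version" failure seen in the stack trace below.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class CredentialsBridgeSketch {
    static final byte[] MAGIC = "HDTS".getBytes(StandardCharsets.UTF_8);
    static final byte LEGACY_VERSION = 0;   // old Writable-style layout
    static final byte PROTOBUF_VERSION = 1; // new protobuf-style layout

    public enum SerializedFormat { WRITABLE, PROTOBUF }

    // Write the payload. Callers default to WRITABLE so that an older
    // (e.g. 2.x) framework localized from HDFS can still parse the file;
    // PROTOBUF must be requested explicitly.
    public static void write(DataOutputStream out, byte[] payload,
                             SerializedFormat fmt) throws IOException {
        out.write(MAGIC);
        out.writeByte(fmt == SerializedFormat.WRITABLE ? LEGACY_VERSION
                                                       : PROTOBUF_VERSION);
        out.writeInt(payload.length);
        out.write(payload);
    }

    // Read either version. An unknown version byte fails the same way
    // readTokenStorageStream does in the reported stack trace.
    public static byte[] read(DataInputStream in) throws IOException {
        byte[] magic = new byte[MAGIC.length];
        in.readFully(magic);
        if (!Arrays.equals(magic, MAGIC)) {
            throw new IOException("Bad header found in token storage");
        }
        byte version = in.readByte();
        if (version != LEGACY_VERSION && version != PROTOBUF_VERSION) {
            throw new IOException("Unknown version " + version
                                  + " in token storage.");
        }
        int len = in.readInt();
        byte[] payload = new byte[len];
        in.readFully(payload);
        return payload; // real code would decode per-version here
    }
}
```

The key design point is that the version negotiation lives entirely on the read side, so a bridge release can roll out ahead of any writer-side change.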
> 3.0 deployment cannot work with old version MR tar ball which break rolling
> upgrade
> -----------------------------------------------------------------------------------
>
> Key: HADOOP-15059
> URL: https://issues.apache.org/jira/browse/HADOOP-15059
> Project: Hadoop Common
> Issue Type: Bug
> Components: security
> Reporter: Junping Du
> Assignee: Jason Lowe
> Priority: Blocker
> Attachments: HADOOP-15059.001.patch, HADOOP-15059.002.patch,
> HADOOP-15059.003.patch, HADOOP-15059.004.patch
>
>
> I tried to deploy a 3.0 cluster with a 2.9 MR tar ball. The MR job failed
> because of the following error:
> {noformat}
> 2017-11-21 12:42:50,911 INFO [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Created MRAppMaster for application appattempt_1511295641738_0003_000001
> 2017-11-21 12:42:51,070 WARN [main] org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> 2017-11-21 12:42:51,118 FATAL [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
> java.lang.RuntimeException: Unable to determine current user
> 	at org.apache.hadoop.conf.Configuration$Resource.getRestrictParserDefault(Configuration.java:254)
> 	at org.apache.hadoop.conf.Configuration$Resource.<init>(Configuration.java:220)
> 	at org.apache.hadoop.conf.Configuration$Resource.<init>(Configuration.java:212)
> 	at org.apache.hadoop.conf.Configuration.addResource(Configuration.java:888)
> 	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1638)
> Caused by: java.io.IOException: Exception reading /tmp/nm-local-dir/usercache/jdu/appcache/application_1511295641738_0003/container_e03_1511295641738_0003_01_000001/container_tokens
> 	at org.apache.hadoop.security.Credentials.readTokenStorageFile(Credentials.java:208)
> 	at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:907)
> 	at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:820)
> 	at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:689)
> 	at org.apache.hadoop.conf.Configuration$Resource.getRestrictParserDefault(Configuration.java:252)
> 	... 4 more
> Caused by: java.io.IOException: Unknown version 1 in token storage.
> 	at org.apache.hadoop.security.Credentials.readTokenStorageStream(Credentials.java:226)
> 	at org.apache.hadoop.security.Credentials.readTokenStorageFile(Credentials.java:205)
> 	... 8 more
> 2017-11-21 12:42:51,122 INFO [main] org.apache.hadoop.util.ExitUtil: Exiting with status 1: java.lang.RuntimeException: Unable to determine current user
> {noformat}
> I think it is due to a token incompatibility change between 2.9 and 3.0. As
> we claim "rolling upgrade" is supported in Hadoop 3, we should fix this
> before we ship 3.0, otherwise all running MR applications will get stuck
> during/after the upgrade.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)