[
https://issues.apache.org/jira/browse/HADOOP-13770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
John Zhuge updated HADOOP-13770:
--------------------------------
Priority: Minor (was: Blocker)
> Shell.checkIsBashSupported swallowed an interrupted exception
> -------------------------------------------------------------
>
> Key: HADOOP-13770
> URL: https://issues.apache.org/jira/browse/HADOOP-13770
> Project: Hadoop Common
> Issue Type: Bug
> Components: util
> Reporter: Wei-Chiu Chuang
> Assignee: Wei-Chiu Chuang
> Priority: Minor
> Labels: oct16-easy, shell, supportability
> Fix For: 2.8.0, 3.0.0-alpha2
>
> Attachments: HADOOP-12652.001.patch, YARN-4467.001.patch
>
>
> Shell.checkIsBashSupported() runs a bash shell command to verify that the
> system supports bash. However, its error message is misleading, and the logic
> should be updated.
> If the shell command throws an IOException, that does not imply that bash did
> not run successfully. If the shell command process is interrupted, the
> internal logic throws an InterruptedIOException, which is a subclass of
> IOException, so the catch block below reports it as "Bash is not supported".
> {code:title=Shell.checkIsBashSupported|borderStyle=solid}
> ShellCommandExecutor shexec;
> boolean supported = true;
> try {
> String[] args = {"bash", "-c", "echo 1000"};
> shexec = new ShellCommandExecutor(args);
> shexec.execute();
> } catch (IOException ioe) {
> LOG.warn("Bash is not supported by the OS", ioe);
> supported = false;
> }
> {code}
> An example appeared in a recent Jenkins job:
> https://builds.apache.org/job/PreCommit-HADOOP-Build/8257/testReport/org.apache.hadoop.ipc/TestRPCWaitForProxy/testInterruptedWaitForProxy/
> The test logic in TestRPCWaitForProxy.testInterruptedWaitForProxy starts a
> thread, waits for 1 second, and interrupts the thread, expecting the thread
> to terminate. However, the method Shell.checkIsBashSupported swallowed the
> interrupt, and the test therefore failed.
> {noformat}
> 2015-12-16 21:31:53,797 WARN util.Shell (Shell.java:checkIsBashSupported(718)) - Bash is not supported by the OS
> java.io.InterruptedIOException: java.lang.InterruptedException
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:930)
> at org.apache.hadoop.util.Shell.run(Shell.java:838)
> at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1117)
> at org.apache.hadoop.util.Shell.checkIsBashSupported(Shell.java:716)
> at org.apache.hadoop.util.Shell.<clinit>(Shell.java:705)
> at org.apache.hadoop.util.StringUtils.<clinit>(StringUtils.java:79)
> at org.apache.hadoop.security.SecurityUtil.getAuthenticationMethod(SecurityUtil.java:639)
> at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:273)
> at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:261)
> at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:803)
> at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:773)
> at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:646)
> at org.apache.hadoop.ipc.RPC.waitForProtocolProxy(RPC.java:397)
> at org.apache.hadoop.ipc.RPC.waitForProtocolProxy(RPC.java:350)
> at org.apache.hadoop.ipc.RPC.waitForProxy(RPC.java:330)
> at org.apache.hadoop.ipc.TestRPCWaitForProxy$RpcThread.run(TestRPCWaitForProxy.java:115)
> Caused by: java.lang.InterruptedException
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:503)
> at java.lang.UNIXProcess.waitFor(UNIXProcess.java:264)
> at org.apache.hadoop.util.Shell.runCommand(Shell.java:920)
> ... 15 more
> {noformat}
> The original design is not desirable, as it swallowed a potential interrupt,
> causing TestRPCWaitForProxy.testInterruptedWaitForProxy to fail.
> Unfortunately, the method is called from the static initializer of a static
> member variable, and Java does not allow a static initializer to throw a
> checked exception. We should remove the static member variable so that the
> method can throw the interrupt exception. The node manager should call the
> static method instead of using the static member variable.
> This fix has an associated benefit: tests could run faster, because they
> will no longer need to spawn a bash process whenever they use a Shell static
> method or variable (which happens quite often when checking what operating
> system Hadoop is running on).
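
The change proposed in the description might look roughly like the sketch below. It is an illustration under stated assumptions, not the attached patch: `BashProbe`, `Command`, and `isSupported` are hypothetical names, and `Command` merely stands in for `ShellCommandExecutor.execute()`. The point it shows is that `InterruptedIOException` must be caught before the broader `IOException`, so that an interrupt is propagated instead of being reported as "Bash is not supported by the OS".

```java
import java.io.IOException;
import java.io.InterruptedIOException;

/** Hypothetical helper; none of these names come from Hadoop. */
final class BashProbe {

  /** Stand-in for ShellCommandExecutor.execute() (an assumption for this sketch). */
  interface Command {
    void execute() throws IOException;
  }

  private BashProbe() {
  }

  /**
   * Runs the probe command and reports whether it completed.
   *
   * @return true if the probe ran, false if it failed with an ordinary IOException
   * @throws InterruptedIOException if the probe was interrupted; the interrupt is
   *         passed to the caller rather than treated as "not supported"
   */
  static boolean isSupported(Command probe) throws InterruptedIOException {
    try {
      probe.execute();
      return true;
    } catch (InterruptedIOException iioe) {
      // This catch must come first: InterruptedIOException is a subclass of
      // IOException, so the broader handler below would otherwise swallow it.
      throw iioe;
    } catch (IOException ioe) {
      // Any other I/O failure is taken to mean the command could not be run.
      return false;
    }
  }
}
```

Because `isSupported` declares a checked exception, it cannot be used to initialize a static field directly. Callers would invoke it on demand, which matches the suggestion above that the node manager call the static method instead of reading a static member variable.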
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)