[ 
https://issues.apache.org/jira/browse/HADOOP-17812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17391079#comment-17391079
 ] 

Bobby Wang commented on HADOOP-17812:
-------------------------------------

Hi [[email protected]]

I just cherry-picked the patch to branch-3.3 and re-ran the integration tests. 
And one new test *testUnbufferMultipleReads* failed. It seems the failure test 
was not caused by my patch, since I can repro it even without my patch. I 
uploaded the result to the attachment. Please refer to 
[^3.3-branch-failsafe-report.html.gz]

 
{code:java}
java.lang.AssertionError: failed to read expected number of bytes from stream. 
This may be transient expected:<128> but was:<93> at 
org.junit.Assert.fail(Assert.java:89) at 
org.junit.Assert.failNotEquals(Assert.java:835) at 
org.junit.Assert.assertEquals(Assert.java:647) at 
org.apache.hadoop.fs.contract.AbstractContractUnbufferTest.validateFileContents(AbstractContractUnbufferTest.java:139)
 at 
org.apache.hadoop.fs.contract.AbstractContractUnbufferTest.testUnbufferMultipleReads(AbstractContractUnbufferTest.java:111)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498) at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
 at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
 at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
 at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
 at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) 
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) 
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61) at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
 at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266) at 
java.lang.Thread.run(Thread.java:748){code}

> NPE in S3AInputStream read() after failure to reconnect to store
> ----------------------------------------------------------------
>
>                 Key: HADOOP-17812
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17812
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3
>    Affects Versions: 3.2.2, 3.3.1
>            Reporter: Bobby Wang
>            Assignee: Bobby Wang
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.4.0
>
>         Attachments: 3.3-branch-failsafe-report.html.gz, 
> failsafe-report.html.gz, s3a-test.tar.gz
>
>          Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> when [reading from S3a 
> storage|https://github.com/apache/hadoop/blob/rel/release-3.2.0/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AInputStream.java#L450],
>  SSLException (which extends IOException) happens, which will trigger 
> [onReadFailure|https://github.com/apache/hadoop/blob/rel/release-3.2.0/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AInputStream.java#L458].
> onReadFailure calls "reopen". it will first close the original 
> *wrappedStream* and set *wrappedStream = null*, and then it will try to 
> [re-get 
> *wrappedStream*|https://github.com/apache/hadoop/blob/rel/release-3.2.0/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AInputStream.java#L184].
>  But what if the previous code [obtaining 
> S3Object|https://github.com/apache/hadoop/blob/rel/release-3.2.0/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AInputStream.java#L183]
>  throw exception, then "wrappedStream" will be null.
> And the 
> [retry|https://github.com/apache/hadoop/blob/rel/release-3.2.0/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AInputStream.java#L446]
>  mechanism may re-execute the 
> [wrappedStream.read|https://github.com/apache/hadoop/blob/rel/release-3.2.0/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AInputStream.java#L450]
>  and cause NPE.
>  
> For more details, please refer to 
> [https://github.com/NVIDIA/spark-rapids/issues/2915]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to