On 09/05/2019 21:53, Mark Thomas wrote:
> On 05/05/2019 09:37, r...@apache.org wrote:
>> This is an automated email from the ASF dual-hosted git repository.
>>
>> remm pushed a commit to branch master
>> in repository https://gitbox.apache.org/repos/asf/tomcat.git
>>
>>
>> The following commit(s) were added to refs/heads/master by this push:
>>      new 6232d82  Make sure timeout elapses before calling a timeout
>> 6232d82 is described below
>>
>> commit 6232d82cc5db73e5da5392d6ec6d9d01ce65c85e
>> Author: remm <r...@apache.org>
>> AuthorDate: Sun May 5 10:37:05 2019 +0200
>>
>>     Make sure timeout elapses before calling a timeout
>>     
>>     It seems there are extra stream notify as although 0 bytes have been
>>     allocated only a few ms have passed when there is a failure.
> 
> I think something else is going on here. I'm still trying to figure it out.

I can't reproduce the test failure - either with the code or some code
written to explicitly test this scenario. However, I do believe I know
what the problem is.

backLogStreams is a concurrent Map of Stream to int[]. The int[] is a
two element array where the first element is number of bytes the stream
requires and the second is the number allocated.

I believe the issue occurs as follows:
- Thread A requests more bytes for Stream A than are available in the
  connection window
- Stream A is placed into the backlog
- Thread A enters streamA.wait()
- Thread B processes a Window update message
- Thread B allocates some bytes to streamA
- Thread B calls streamA.notify()
- Thread A exits the wait but, because the elements of the int[] are not
  volatile, it does not see the window update
- Thread A treats existing the wait without an allocation as a timeout

I've left out the obtaining and releasing of locks on streamA and the
Http2UpgradeHandler for clarity but I'll note that some reads of the
allocation occur outside of any lock

The above description is consistent with a known failure trace:
https://ci.apache.org/projects/tomcat/tomcat9/logs/4288/TEST-org.apache.coyote.http2.TestHttp2Section_5_3.NIO2.txt

My plan is as follows:
- revert the "check it has really timed out fix"
- modify the access to the allocation so that updates to the allocation
  are guaranteed to be visible after the wait() exits

Mark

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org

Reply via email to