On Wed, Nov 28, 2018 at 4:20 PM Mark Thomas <ma...@apache.org> wrote:
> On 28/11/2018 15:00, ma...@apache.org wrote:
> > Author: markt
> > Date: Wed Nov 28 15:00:06 2018
> > New Revision: 1847646
> >
> > URL: http://svn.apache.org/viewvc?rev=1847646&view=rev
> > Log:
> > Fix possible cause of intermittent TestCoyoteOutputStream failures.
>
> I thought this would be worthy of a longer explanation than seemed
> appropriate for a commit message.
>

I feel bad for not thinking about it since it does sound quite logical.
BTW, the testsuite failed but it wasn't *that*. What a coincidence !

Rémy

>
> I have tried to recreate the issue locally without success. I was able
> to recreate it occasionally running the tests on silvanus.a.o (the CI
> machine that runs all our buildbot jobs).
>
> I captured a network trace that confirmed that this was a server side
> bug. What I saw was a corrupted response. The headers and first chunk
> were correct but rather than the 5 bytes of the end chunk I saw the
> other 8187 (8192-5) bytes of the buffer. It was clear the buffer was
> configured for write when it was being read.
>
> I then tried to figure out how this could happen with a view to
> reproducing the issue.
>
> There were a lot of dead ends during which I noticed that the write
> pattern varied when I added additional debug statements. I discovered
> that, depending on timing, the NIO2 endpoint would sometimes use a
> gathering write when performing a non-blocking flush.
>
> There is a non-blocking flush just before the switch back to blocking
> I/O (after the dispatch to end the async component) and it looked to be
> possible that the gathering write could still be in progress when the
> following blocking write was performed. That in turn meant that one of
> the buffers used by the gathering write could be modified during the
> following blocking write.
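
For anyone reading along, the 8187 (8192-5) figure is what plain java.nio
gives you when a buffer is read while it is still in write mode. The sketch
below is not Tomcat code. The class and method names are made up for
illustration, and it assumes the end chunk is the usual 5-byte "0\r\n\r\n"
terminator in an 8192-byte buffer, as the numbers in the trace suggest.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; none of these names exist in Tomcat.
public class BufferModeSketch {

    // Buffer size implied by the "8192-5" arithmetic in the trace.
    static final int CAPACITY = 8192;

    // The terminating chunk of a chunked response: 5 bytes.
    static final byte[] END_CHUNK =
            "0\r\n\r\n".getBytes(StandardCharsets.US_ASCII);

    // Correct case: flip() switches the buffer to read mode, so a
    // channel write would see exactly the 5 bytes that were put.
    static int readableWhenFlipped() {
        ByteBuffer buf = ByteBuffer.allocate(CAPACITY);
        buf.put(END_CHUNK);
        buf.flip(); // limit = 5, position = 0
        return buf.remaining();
    }

    // Broken case: the buffer is still configured for write, so
    // position = 5 and limit = 8192. A reader would see the rest of
    // the buffer instead of the end chunk.
    static int readableWhenStillInWriteMode() {
        ByteBuffer buf = ByteBuffer.allocate(CAPACITY);
        buf.put(END_CHUNK);
        return buf.remaining();
    }

    public static void main(String[] args) {
        System.out.println("flipped:    " + readableWhenFlipped());
        System.out.println("write mode: " + readableWhenStillInWriteMode());
    }
}
```

The second method returns the same count as the trace showed, which fits the
reading that something put the buffer back into write mode while a write was
still draining it. It says nothing about which code path did so.
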
>
> However, my current understanding of the code is that the gathering
> write will have written all the data from the buffer that is used by the
> following blocking write before that blocking write occurs. So I may
> have missed the root cause completely. It depends a lot on the internal
> workings of the AsynchronousSocketChannel.
>
> On balance, I decided to commit this fix as there does appear to be a
> bug here. Hopefully, it is the root cause of the intermittent
> TestCoyoteOutputStream failures. If it is, great. If not, I'll keep
> looking.
>
> Mark