2016-06-03 16:11 GMT+02:00 Mark Thomas <ma...@apache.org>:

> On 03/06/2016 14:36, Rémy Maucherat wrote:
> > Hi,
> >
> > With direct connect having been hacked in (err, I mean, "implemented"),
> > it is (a lot) easier to do meaningful performance tests. h2load is a
> > drop-in replacement for ab that uses HTTP/2, and it allowed doing some
> > easy profiling.
> >
> > The good news is that the code seems to be well optimized already, with
> > few visible problems. The only issue is a very heavy sync contention on
> > the socket wrapper object in Http2UpgradeHandler.writeHeaders and
> > Http2UpgradeHandler.writeBody.
>
> I suspect that is inevitable given the nature of the test. There is only
> one connection and if you have 100 streams all trying to write to the
> one connection at the same time you have to synchronise on something.
>
> > The reason for that is when you do:
> >
> >   h2load -c 1 -n 100 http://127.0.0.1:8080/tomcat.gif
> >
> > it ends up being translated in Tomcat into: process one hundred
> > concurrent streams over one connection. Although h2load is not real
> > world use, that's something that would need to be solved, as a client
> > can use a lot of threads.
>
> Hmm. We might be able to do something if we buffer writes on the server
> side (I'm thinking a buffer for streams to write into with a dedicated
> thread to do the writing) but I suspect that the bottleneck will quickly
> switch to the network in that case.
>
> > There are two main issues in HTTP/2 that could be improved:
> >
> > 1) Ideally, there should be a way to limit stream concurrency to some
> > extent and queue. But then there's a risk of stalling a useful stream
> > (that's where stream priority comes in, of course). Not easy.
>
> That should already be supported. Currently the default for concurrent
> streams is unlimited but we can make it whatever we think is reasonable.
> The HTTP/2 spec suggests it should be no lower than 100.
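Right, and capping the advertised limit is just configuration. A minimal
sketch, assuming the attribute on the HTTP/2 upgrade protocol element is
(or ends up being) called maxConcurrentStreams:

    <!-- server.xml: advertise at most 100 concurrent streams per connection -->
    <Connector port="8080" protocol="org.apache.coyote.http11.Http11NioProtocol">
        <UpgradeProtocol className="org.apache.coyote.http2.Http2Protocol"
                         maxConcurrentStreams="100" />
    </Connector>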
I am not talking about a limit on the number of concurrent streams where
excess streams are refused (and that is exposed through the settings), but
rather about the number of streams that are effectively being processed
concurrently (for example, in headersEnd we could put the StreamProcessor
in a queue rather than executing it immediately, unless it's a high
priority stream, right?).

h2load allows comparing with other servers, and JF told me httpd has a
lower HTTP/2 performance overhead than Tomcat. Given the profiling, the
problem is the heavy lock contention (no surprise, contention like that is
very expensive), and we could get better performance by controlling it.
JF's original "HTTP/2 torture test" HTML page with 1000 images probably
also runs into this. IMO we will eventually need a better execution
strategy than what is in place at the moment, since every dumb benchmark
will hit that edge case. But I agree it's only partially legitimate, as
the client has the opportunity to control it.

> > 2) All reads/writes are blocking mid frame. It's not too bad in
> > practice, but it's a useless risk, and that's where async IO can
> > provide an "easy" solution using a dedicated NIO2 implementation.
>
> They are blocking mid-frame but given the flow control provided by
> HTTP/2 the risk should be zero unless the client advertises a larger
> window than it can handle which would be the client's problem in my view.

I'm only half convinced, since it's not very modern :) We have to
experiment with our "better"/fancier async tech at some point and see the
benefits. A "selling" point of the NIO1 connector was its non-blocking
HTTP/1.1 header reading, and we no longer have that feature with HTTP/2.

With async IO on the read side, the frame-complete check can live in the
completion handler, which will then only "complete" once the frame has
been fully read (rough sketch in the P.S. below). It's a simple and
generic solution to the problem. Writes are simpler (I think). The main
pitfall in both cases is the buffering and what to do with the socket
buffer [it's probably better to use it with SSL, and better to ignore it
when unencrypted]. Of course this won't provide the full benefits if the
user code is not using Servlet 3.1 IO.

Rémy
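P.S. A rough sketch of the read side I have in mind, with hypothetical
names (not the actual Http2Parser/Http2UpgradeHandler API), just to show
the idea of the completion handler only "completing" once a whole frame
has been buffered:

    // Keep re-issuing async reads until the 9-byte frame header plus the
    // declared payload length are available, then hand off the full frame.
    private void readFrame(final AsynchronousSocketChannel channel,
                           final ByteBuffer buffer) {
        channel.read(buffer, null, new CompletionHandler<Integer, Void>() {
            @Override
            public void completed(Integer bytesRead, Void attachment) {
                if (bytesRead == -1) {
                    handleEndOfStream();          // hypothetical
                    return;
                }
                if (!frameComplete(buffer)) {
                    // Partial frame: issue another read, never block mid-frame.
                    channel.read(buffer, null, this);
                    return;
                }
                buffer.flip();
                processFrame(buffer);             // hypothetical: parse + dispatch
            }
            @Override
            public void failed(Throwable exc, Void attachment) {
                handleError(exc);                 // hypothetical
            }
        });
    }

    private static boolean frameComplete(ByteBuffer b) {
        if (b.position() < 9) {
            return false;                         // frame header not read yet
        }
        // HTTP/2 frame header starts with a 24-bit big-endian payload length.
        int payloadLength = ((b.get(0) & 0xFF) << 16)
                | ((b.get(1) & 0xFF) << 8)
                | (b.get(2) & 0xFF);
        return b.position() >= 9 + payloadLength;
    }

The write side would be the mirror image: only complete once the whole
frame has been flushed, so a thread never parks half way through a frame
while holding the socket wrapper lock.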