On Thu, Apr 23, 2015 at 1:34 AM Paul deGrandis <[email protected]> wrote:
> I'd also encourage you to reconsider your benchmark - ask yourself, "What does this really tell me?" Is the benchmark an accurate representation of the kinds of HTTP services you build? Are the payloads (parsing and generation) representative of common data you deal with in the systems you build? Is the network, topology, and traffic generation realistic (or at least analogous) to production systems? Can the results of the benchmark directly inform architectural considerations of real systems?

(I'm answering this one, but I really have to thank everyone who has chimed in; all of the comments are valuable.)

Your comments helped me understand a lot about what I'm trying to do here. The underlying goal is to expose the fallacy that peak performance of web server frameworks matters a whole lot in practice. To show that, I have a latency-oriented benchmark, which is able to show discrepancies in the high percentiles of different frameworks, usually due to GC invocations or to queueing/stalling under load. It changes the notion that you have a "winner", because many of the web servers that tend to have good median latencies turn out to have outright appalling latencies at the 99.9th percentile.

Furthermore, the test verifies that work is being evenly distributed over all 10k connections. Again, it turns out that many measurements are flawed because the web server will serve 300 connections and let 9,700 stall. The server can then show an excellent peak req/s rate and good average/median latency, but it is pretty bad for real work in a 10k-connection setting (think WebSockets).

But to tell that story I don't need a whole lot of "specimens". You made me realize that the work is not about testing each and every framework in existence, but rather about presenting differing behavior. In fact, the argument can be made that if I already have Undertow in the mix, I have fairly realistic coverage of how the JVM reacts under load.
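To make the percentile point concrete, here is a toy sketch (the server names, distributions, and pause rates are made up for illustration; this is not my benchmark code) of how two latency distributions can share the same median yet diverge wildly at the 99.9th percentile when one of them takes occasional GC-like pauses:

```python
import random

random.seed(42)

def percentile(samples, p):
    """Nearest-rank percentile (p in [0, 100]) of a list of latencies."""
    ordered = sorted(samples)
    k = max(0, min(len(ordered) - 1, round(p / 100.0 * len(ordered)) - 1))
    return ordered[k]

# "Server A": steady ~2 ms responses.
steady = [random.gauss(2.0, 0.2) for _ in range(100_000)]

# "Server B": same typical latency, but roughly 1 in 500 requests
# is delayed by a ~200 ms GC-like pause.
pausing = [random.gauss(2.0, 0.2) + (200.0 if random.random() < 0.002 else 0.0)
           for _ in range(100_000)]

for name, samples in [("steady", steady), ("pausing", pausing)]:
    print(f"{name}: p50 = {percentile(samples, 50):.1f} ms, "
          f"p99.9 = {percentile(samples, 99.9):.1f} ms")
```

Both servers report a ~2 ms median, so a median-only (or average-only) comparison declares a tie; only the high percentiles reveal that one of them regularly stalls its clients.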
It can even be argued that I shouldn't test too broadly, since people would rely too heavily on a number that doesn't match their use case. By not doing so, I would be forcing people to make their own measurements.

I do think there is some value in reporting what I've found up to this point, but I definitely plan to include a large section on caveats and known weaknesses of the test methodology. One, which Paul touches on, is that the network is "perfect" in the sense that it can handle the load from the load generator. In practice, there will be clients accessing over a WAN with highly varying network behaviour. This test can't address that point, which is very real in a practical setting.

--
You received this message because you are subscribed to the Google Groups "Clojure" group.
To post to this group, send email to [email protected]
Note that posts from new members are moderated - please be patient with your first post.
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/clojure?hl=en
