On Thu, Apr 23, 2015 at 1:34 AM Paul deGrandis <[email protected]>
wrote:

>
> I'd also encourage you to reconsider your benchmark - ask yourself, "What
> does this really tell me?"  Is the benchmark an accurate representation of
> the kinds of HTTP services you build?  Are the payloads (parsing and
> generation) representative of common data you deal with in the systems you
> build?  Is the network, topology, and traffic generation realistic (or at
> least analogous) to production systems?  Can the results of the benchmark
> directly inform architectural considerations of real systems?
>

(I'm answering this one, but I really have to thank everyone who has chimed
in, all of the comments are valuable)

Your comments helped me understand a lot about what I'm trying to do here.
The underlying goal is to expose the fallacy that peak performance of web
server frameworks matters a whole lot in practice. To show that, I have a
latency-oriented benchmark, which reveals discrepancies in the high
percentiles of different frameworks, usually due to GC invocations or to
queueing/stalling under load. It changes the notion that you have a
"winner": many of the web servers that tend to have good median latencies
turn out to have outright appalling latencies at the 99.9th percentile.
Furthermore, the test verifies that work is being evenly distributed over
all 10k connections. Again, it turns out that many measurements are flawed
because the web server will serve 300 connections and let 9,700 stall. The
server can then post an excellent peak req/s rate and a good average/median
latency, yet be pretty bad for real work in a 10k-connection setting (think
websockets).
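To make the two checks concrete, here is a minimal sketch (Python, entirely
synthetic data, and the `percentile` helper is my own, not from any
benchmark tool) of why the median hides the tail and how to verify that
requests are spread across all connections rather than a favored few:

```python
import random

random.seed(42)

# Hypothetical per-request samples: (connection_id, latency_ms).
# Models a "fast" server with rare GC-like stalls in the tail.
samples = []
for _ in range(100_000):
    conn = random.randrange(10_000)        # 10k concurrent connections
    latency = random.expovariate(1 / 2.0)  # ~2 ms typical service time
    if random.random() < 0.001:            # rare stall (e.g. a GC pause)
        latency += 250.0
    samples.append((conn, latency))

def percentile(values, p):
    """Nearest-rank percentile of a list of numbers."""
    ordered = sorted(values)
    k = max(0, min(len(ordered) - 1, round(p / 100 * len(ordered)) - 1))
    return ordered[k]

latencies = [lat for _, lat in samples]
print(f"median: {percentile(latencies, 50):.2f} ms")
print(f"p99.9 : {percentile(latencies, 99.9):.2f} ms")  # tail dwarfs the median

# Fairness check: did all 10k connections get served, or did a few hog the server?
per_conn = {}
for conn, _ in samples:
    per_conn[conn] = per_conn.get(conn, 0) + 1
served = len(per_conn)
print(f"connections served: {served} / 10000")
```

A server that stalls most of its connections would still look fine on the
first two numbers; only the per-connection count exposes it.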

But to tell that story I don't need a whole lot of "specimens". You made me
realize that the work is not about testing each and every framework in
existence, but rather about presenting differing behavior. In fact, the
argument can be made that if I already have Undertow in the mix, I have
fairly realistic coverage of how the JVM reacts under load. It can even be
argued that I shouldn't test too broadly, since people would falsely rely
on a number whose use case doesn't match theirs. By not doing so, I would
be forcing people to make their own measurements.

I do think there is some value in reporting what I've found up to this
point, but I definitely plan to include a large section on caveats and
known weaknesses of the test methodology. One, which Paul touches on, is
that the network is "perfect" in the sense that it can handle the load from
the load generator. In practice, clients will be accessing over a WAN with
highly varying network behaviour. This test can't address that point, which
is very real in a practical setting.

-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to [email protected]
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en