Re: Getting my head around NIO 'simulated' blocking (trying to)

Christopher Schultz Wed, 27 Mar 2013 13:30:12 -0700

Igaz,

On 3/24/13 11:46 PM, igaz wrote:
>> You need to read the Javadoc more carefully. 
> 
> I suggest you take a look at java.io.ByteArrayInputStream (both source and
> javadoc).
> Perfectly good InputStream (never heard anyone claim otherwise)
> Never blocks.


Yes, but it doesn't really follow the NIO model. The truth is that, if
you want asynchronous communication, you shouldn't be using standard
BIO-style request/response interaction: you want to use servlet-async,
WebSocket, or Comet.

Your pseudocode makes no sense for an InputStream: there's no selector
to register with.
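
For contrast, here is a minimal sketch of what "registering with a
selector" looks like in plain java.nio. A Pipe stands in for a client
socket purely so the example is self-contained; the class and method
names are made up for illustration and have nothing to do with Tomcat's
connector code. Note that the non-blocking read is allowed to return 0,
which is an answer InputStream.read() has no way to give.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; a Pipe is used in place of a real socket.
final class SelectorSketch {

    static String readWhenReady() throws IOException {
        Pipe pipe = Pipe.open();
        try (Selector selector = Selector.open();
             Pipe.SourceChannel source = pipe.source();
             Pipe.SinkChannel sink = pipe.sink()) {

            // Only a channel in non-blocking mode can be registered.
            source.configureBlocking(false);
            source.register(selector, SelectionKey.OP_READ);

            ByteBuffer buf = ByteBuffer.allocate(64);

            // Nothing has been written yet: the read returns at once with 0.
            int first = source.read(buf);

            // Simulate the peer sending some bytes.
            sink.write(ByteBuffer.wrap("hello".getBytes(StandardCharsets.US_ASCII)));

            // Wait until the selector reports the channel as readable.
            selector.select();
            source.read(buf);

            buf.flip();
            return first + ":" + StandardCharsets.US_ASCII.decode(buf);
        }
    }
}
```

An InputStream offers neither configureBlocking() nor register(), which
is the point being made above.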

>> While the words "blocking"
>> and "non-blocking" are not used in ServletInputStream blocking IO is the
>> only way to implement readLine. 
> Nope.  readLine is beyond trivial (just delegates to #read()) -- has nothing
> to do with blocking/non-blocking.

So if client code calls readLine but doesn't find a newline, what should
be returned to the caller? Nothing? The bytes so far? Either of those
seems like it would violate the "spec" as defined by the Javadoc, which
says that bytes are returned up to the "len" parameter or when a newline
occurs -- whichever is first. If neither condition is reached and the
stream isn't closed, the only reasonable behavior is to block.
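
To make that concrete, here is a rough sketch of a readLine that
"just delegates to read()", written against a plain InputStream so it
runs without the servlet API. It is modeled on the contract described
in the Javadoc, not copied from any container; the class name is
hypothetical. The loop has exactly three exits: newline, len bytes, or
end of stream. If the underlying read() blocks, so does readLine.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, modeled on the documented readLine contract.
final class ReadLineSketch {

    static int readLine(InputStream in, byte[] b, int off, int len)
            throws IOException {
        if (len <= 0) {
            return 0;
        }
        int count = 0;
        int c;
        // in.read() returns -1 only at end of stream; otherwise it waits
        // for a byte. There is no "nothing available yet" return value.
        while ((c = in.read()) != -1) {
            b[off + count] = (byte) c;
            count++;
            if (c == '\n' || count == len) {
                break; // newline (included in the output) or buffer limit
            }
        }
        // -1 signals end of stream reached before any byte was read.
        return count > 0 ? count : -1;
    }
}
```

Over a ByteArrayInputStream this never blocks, but only because the
stream always knows it has reached the end. A socket-backed stream
with a half-delivered line has no such knowledge.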

> Could implement a non-blocking ServletInputStream (only defines extra
> readLine method) in a minute or two.

Okay: time starts now.

> If I had to guess, you're confusing non-blocking io with asynchronous io 
> (it's a common mistake).  The servlet 3.0 spec does proscribe asynchronous
> reads when servlets access the http request body (and headers as well for
> that matter)

Where?

> The reason that I'm even bringing this up is that more and more web
> applications receive an increasing share of their traffic from GPS devices,
> cellphones, etc.  And these devices often use comparatively unreliable
> networks, where it is not uncommon for tcp segments to arrive seconds apart
> (and I mean tcp segments that are part of the same http request body).

How much body-data are these things generating?

> In such a case, Tomcat's current NIO connector leaves us in the same old
> place: a dedicated thread per socket, in an io-wait state (i.e. blocking). 
> If that's the majority of your traffic, that doesn't scale (and that's just
> an empirical statement, not a judgment).

Are you primarily worried about thread-usage, socket usage, CPU usage or
... what? Certainly "pretty and elegant code" isn't what comes to mind
when NIO is concerned...

> Are there tradeoffs with reading (not parsing, just reading) all the bytes
> from the http request body before invoking the servlet FilterChain?  Sure.
> Although for servlets that access the http request body via
> HttpRequest.getParameter('xxxx') (which I submit are the vast majority)
> there really is no tradeoff (memory usage is the same).

I disagree with your premise that most bodies are accessed via getParameter.

> You certainly identified one; I can think of others.  And if you're
> writing an online backup service, those tradeoffs aren't going to
> make you very happy.

There are lots of reasons to stream input. We routinely receive multi-MB
POSTs that aren't multipart/form-data: they are XML. We need to process
all that data in a streaming fashion, otherwise WE don't scale. So who's
right? History is on my side, and sadly in the HTTP world, history
rules. That's why there are other technologies that are piggybacking on
HTTP to make things work in this new web-scale world where for some
reason the laws of physics are all different.
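
As an illustration of the streaming style I mean, here is a small StAX
sketch that walks an XML body straight off an InputStream without ever
holding the whole document in memory. The class name and the element
name passed in are placeholders, not anything from our real code; in a
servlet the stream would come from request.getInputStream().

```java
import java.io.InputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical example class: counts elements while the body streams in.
final class StreamingXmlSketch {

    static int countElements(InputStream body, String localName)
            throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Refuse DTDs; a sensible default for untrusted request bodies.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);

        XMLStreamReader reader = factory.createXMLStreamReader(body);
        int count = 0;
        try {
            // Each next() pulls only as much input as the next event needs.
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && localName.equals(reader.getLocalName())) {
                    count++;
                }
            }
        } finally {
            reader.close();
        }
        return count;
    }
}
```

Memory use stays roughly constant no matter how large the POST is,
which is the whole reason not to have the container pre-read the body.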

> I'm sure the tomcat committers wouldn't be thrilled with yet another
> configuration parameter, yet another code execution path (I know I wouldn't
> be) ; maxPostSize is too coarse of course.  Ideally, you want something
> where the container could ask the servlet (based upon the dynamic http
> request header) - "should I read all of the request body (into memory) or
> should I defer reading and let you do it (via request.getInputStream() or
> request.getReader()).  This has to be asked *before* invoking the
> FilterChain.  Now that is definitely not in the spec (maybe it should be).

That interaction already exists: the container delegates control to the
servlet, which can do anything it wants. There's no reason for the
container to orchestrate the reading of the bytes from the client. I
would say that the servlet could even maintain its own thread pool, NIO
channels, etc. except that the HttpServletRequest, associated response,
and - yes - ServletInputStream need to stick together for the whole
request service. The spec also says that a single thread handles the
whole request, so technically delegating part of the processing to
another thread might fail to meet expectations that consumers of those
objects have -- like ThreadLocals being present, etc.
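
The ThreadLocal problem is easy to demonstrate with nothing but the
JDK. In this sketch the "request thread" sets a value and then hands
work to another thread, which does not see it. CURRENT_USER and the
value "alice" are invented for the example; real frameworks keep
things like security contexts and transactions in the same way.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example: state bound to the request thread is lost on hand-off.
final class ThreadLocalSketch {

    static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    static String valueSeenByWorker() throws Exception {
        // The calling thread plays the role of the container's request thread.
        CURRENT_USER.set("alice");

        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // The worker thread looks up its own, never-initialized slot.
            Callable<String> task = CURRENT_USER::get;
            Future<String> seen = pool.submit(task);
            return seen.get();
        } finally {
            pool.shutdown();
            CURRENT_USER.remove();
        }
    }
}
```

The worker gets null rather than the value set by the caller, so any
code that silently relies on thread-bound state breaks once part of the
request is processed elsewhere.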

You are free to rail against the spec, the implementation, etc. but I'm
not sure you have a leg to stand on other than "I don't like the
implications of all this stuff". The implications are nonetheless there,
and you have alternatives if you find the restrictions unpalatable.

-chris

