There are likely performance reasons why the IOCP model is more efficient
(e.g. faster, or with higher bandwidth utilization), even if you have to
emulate it via epoll. Even where you could get gradual notifications, you
don't want to be filling a buffer one byte at a time. That would be
massively wasteful of processor time and communication bandwidth. The
underlying layers will generally try to do block memory moves, since those
are more efficient (the fixed overhead is divided over a larger amount of
data). You have the beginning of the right idea with "break it into 1-byte
write requests", but that is the wrong granularity (too many context
switches waste effort). Instead, you want to provide a large enough block
of data in each request to ensure each packet is full, and only get
notified when there is at least that much space in the write buffer. Then
you can decide how much memory you want to pre-fill vs. how bad it is to
miss a deadline.

For each chunk, we can take the typical Ethernet MTU (1500 bytes) as a
starting point and then add two orders of magnitude (let's say N=128kB),
both to drive the per-request overhead towards zero and because many
packets can be in flight on the wire simultaneously(*). Finally, decide how
many extra blocks you want libuv to have ready to transmit. For a good
start, it's probably reasonable just to pick K=2 (this would also let you
use a ping-pong buffer strategy rather than a ring buffer, but `malloc` is
usually also just fine). Later, if that's not meeting requirements, you can
use queuing theory to estimate the optimal thresholds. (It's been a few
years since I took a class in networking, so sorry if I'm a bit hand-wavy
on some of the details, and let me know if I missed anything in my
estimation attempts.)

Finally, to tie this all together: inside the application, the goal would
be to issue `uv_write` operations on blocks of size N whenever the pending
count (it should be feasible to track this yourself, or look at the libuv
field of outstanding write reqs) is below your chosen threshold on queue
length (e.g. K=2).
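
As a rough sketch of that bookkeeping (not tested against libuv itself), the
counting logic could look something like the code below. Everything named
here is hypothetical: `pump_t`, `pump_fill`, `pump_on_complete` and
`count_submit` are made up for illustration, and the `submit` function
pointer only stands in for wherever you would call `uv_write`. The idea is
that `pump_on_complete` gets called from your write-completion callback.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the place where you would call uv_write():
 * submit `len` bytes starting at `offset`, return 0 on success. */
typedef int (*submit_fn)(void *ctx, size_t offset, size_t len);

typedef struct {
    size_t total;          /* total bytes to send */
    size_t next;           /* offset of the first byte not yet submitted */
    size_t block;          /* N: bytes per write request */
    unsigned max_inflight; /* K: threshold on queued write requests */
    unsigned inflight;     /* requests submitted but not yet completed */
    submit_fn submit;      /* assumption: wraps the real write call */
    void *ctx;             /* passed through to submit() */
} pump_t;

/* Submit blocks of up to N bytes until K requests are pending or the data
 * runs out. Returns the number of requests started by this call. */
unsigned pump_fill(pump_t *p) {
    unsigned started = 0;
    assert(p->block > 0 && p->max_inflight > 0);
    while (p->inflight < p->max_inflight && p->next < p->total) {
        size_t len = p->total - p->next;
        if (len > p->block)
            len = p->block;
        if (p->submit(p->ctx, p->next, len) != 0)
            break; /* submission failed; the caller decides what to do */
        p->next += len;
        p->inflight++;
        started++;
    }
    return started;
}

/* Call once per completed write request (e.g. from the write callback):
 * one slot is free again, so try to top the queue back up to K. */
unsigned pump_on_complete(pump_t *p) {
    assert(p->inflight > 0);
    p->inflight--;
    return pump_fill(p);
}

/* Demo stub for submit(): only adds up how many bytes were handed over.
 * `ctx` must point to a size_t accumulator. */
int count_submit(void *ctx, size_t offset, size_t len) {
    (void)offset;
    *(size_t *)ctx += len;
    return 0;
}
```

With N=128kB and K=2, a first call to `pump_fill` queues two blocks, and
every completion after that queues at most one more, so the amount of
memory pinned in pending requests stays bounded by roughly K*N.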

(*) Another way to derive this is the formula `buffer-size = ping-time *
bandwidth`, which more directly and precisely estimates just how much data
could theoretically get removed from the buffer when the ACK packets
return.
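
To put a number on that formula, here is a small sketch of the calculation.
The function name `bdp_bytes` and the link figures in the note after it are
made up for illustration; they are not measurements of any real network.

```c
#include <assert.h>
#include <stdint.h>

/* buffer-size = ping-time * bandwidth, returned in bytes.
 * bandwidth is in bits per second, round-trip time in microseconds. */
uint64_t bdp_bytes(uint64_t bandwidth_bits_per_sec, uint64_t rtt_usec) {
    /* Guard against overflow in the multiplication below. */
    assert(rtt_usec == 0 || bandwidth_bits_per_sec <= UINT64_MAX / rtt_usec);
    /* bits/s * us = bit-microseconds; /8 gives bytes, /1e6 gives seconds. */
    return bandwidth_bits_per_sec * rtt_usec / 8 / 1000000;
}
```

For example, an assumed 100 Mbit/s link with a 10 ms ping gives
`bdp_bytes(100000000, 10000)`, which works out to 125000 bytes, in the same
ballpark as the N=128kB guess above.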


On Sat, Mar 24, 2018 at 2:51 PM Michael Kilburn <[email protected]>
wrote:

> On Sat, Mar 24, 2018 at 4:51 AM, Ben Noordhuis <[email protected]> wrote:
>
>> On Thu, Mar 22, 2018 at 11:32 PM, CM <[email protected]> wrote:
>> > To be more precise -- I wonder if you can get "gradual" buffer write
>> > notifications. I.e. as OS "drains" my buffer (writes it out to the
>> > network) I'd like to receive notifications indicating how much of the
>> > buffer was written out. Is this possible with libuv (or Windows IOCP)?
>>
>> In general that's not possible.  You could hack libuv's UNIX port to
>> give you that kind of notification but it won't work on Windows.
>>
>
> Hmm... Indeed, Unix "readiness" model (where you get notified of socket
> being "ready", write as much as can fit and wait for next notification)
> works very nicely here, but it leads to one extra memory copy (moving data
> from user buffer to socket buffer). IOCP model -- not so much, but it
> potentially enables zero-copy direct-memory protocol (where you register
> your buffer(s) and network card reads it directly).
>
> So for this to happen on Linux all I need is to "pierce" the "conversion
> layer" libuv put on top of readiness model for it to work like IOCP model.
> On Windows -- the only thing that comes to mind is to break every write
> request into 1-byte write requests :-)
>
> If only we had "buffer readiness" notification added to IOCP -- similar to
> how edge-triggered epoll works... I.e. once NIC sends out some data -- it
> updates "bytes sent" counter and (unless it is already set) sets the
> "alarm", once app receives notification about alarm -- it'll read counter,
> "disarm" alarm and do smth with (now free) buffer. Very similar to Unix
> readiness model, but instead of "socket buffer is ready to receive data" it
> means "your buffer is no longer needed".
>
> --
> You received this message because you are subscribed to the Google Groups
> "libuv" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/libuv.
> For more options, visit https://groups.google.com/d/optout.
>
