There are likely performance reasons the IOCP model is more efficient (e.g., faster, or higher bandwidth utilization), even if you have to emulate it via epoll. Even where you could get gradual notifications, you don't want to be filling a buffer one byte at a time. That would be massively wasteful of processor time and communication bandwidth. The underlying layers will generally try to do block memory moves, since those are more efficient (the fixed overhead is divided over a larger amount of data).

You have the beginning of the right idea with "break it into 1-byte write requests", but that is the wrong granularity (too many context switches waste effort). Instead, you want to provide a large enough block of data in each request to ensure each packet is full, and only get notified when there is at least that much space in the write buffer. Then you can decide how much memory you want to pre-fill vs. how bad it is to miss a deadline.

For each chunk, we can take the usual MTU for TCP over Ethernet as a starting point (1500 bytes) and then go up roughly two orders of magnitude (let's say N=128kB), both to drive the overhead error towards zero and because there are many packets in flight on the wire simultaneously(*). Finally, decide how many extra blocks you want libuv to have ready to transmit. For a good start, it's probably reasonable just to pick K=2 (this would also let you use a ping-pong buffer strategy rather than a ring buffer, but `malloc` is usually also just fine). Later, if it's not meeting requirements, you can use queuing theory to estimate the optimal thresholds.

(It's been a few years since I took a class in networking, so sorry if I'm a bit hand-wavy on some of the details, and let me know if I missed anything in my estimation attempts.)

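To put rough numbers on the sizing above, here is a small C sketch of the arithmetic. The function names, the 100 Mbit/s link, and the 10 ms round trip are my own illustrative assumptions, not anything from libuv or from the original question; only the 1500-byte MTU, N=128kB, and K=2 come from the estimate above.

```c
#include <stdint.h>

/* Hypothetical helper: bandwidth-delay product in bytes, i.e. how much data
 * can be in flight during one round trip (buffer-size = ping-time * bandwidth). */
static uint64_t bdp_bytes(uint64_t bandwidth_bits_per_sec, uint64_t rtt_ms)
{
    return bandwidth_bits_per_sec / 8 * rtt_ms / 1000;
}

/* Hypothetical helper: how many full MTU-sized packets one chunk fills. */
static uint64_t packets_per_chunk(uint64_t chunk_bytes, uint64_t mtu_bytes)
{
    return chunk_bytes / mtu_bytes;
}

/* Hypothetical helper: memory the application keeps staged when it holds
 * K chunks of N bytes queued for transmission. */
static uint64_t staged_bytes(uint64_t chunk_bytes, uint64_t queued_chunks)
{
    return chunk_bytes * queued_chunks;
}
```

With the assumed 100 Mbit/s link and 10 ms round trip, `bdp_bytes(100000000, 10)` gives 125000 bytes, which is in the same ballpark as N=128kB. `packets_per_chunk(128 * 1024, 1500)` gives 87 full packets per chunk, and `staged_bytes(128 * 1024, 2)` gives 262144 bytes (256 KiB) pre-filled for K=2.
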
Finally, to tie this all together: inside the application, the goal would be to issue `uv_write` operations on blocks of size N whenever the pending count is below your chosen threshold on queue length (e.g. K=2). It should be feasible to track the pending count yourself, or you can look at libuv's own accounting of outstanding writes (e.g. `write_queue_size` on the stream, though that one counts queued bytes rather than requests).

(*) Another way to derive this is the formula `buffer-size = ping-time * bandwidth`, which more directly and precisely estimates how much data could theoretically be removed from the buffer by the time the ACK packets return.

On Sat, Mar 24, 2018 at 2:51 PM Michael Kilburn <[email protected]> wrote:

> On Sat, Mar 24, 2018 at 4:51 AM, Ben Noordhuis <[email protected]> wrote:
>
>> On Thu, Mar 22, 2018 at 11:32 PM, CM <[email protected]> wrote:
>> > To be more precise -- I wonder if you can get "gradual" buffer write
>> > notifications. I.e. as OS "drains" my buffer (writes it out to the network)
>> > I'd like to receive notifications indicating how much of the buffer was
>> > written out. Is this possible with libuv (or Windows IOCP)?
>>
>> In general that's not possible. You could hack libuv's UNIX port to
>> give you that kind of notification but it won't work on Windows.
>>
>
> Hmm... Indeed, Unix "readiness" model (where you get notified of socket
> being "ready", write as much as can fit and wait for next notification)
> works very nicely here, but it leads to one extra memory copy (moving data
> from user buffer to socket buffer). IOCP model -- not so much, but it
> potentially enables zero-copy direct-memory protocol (where you register
> you buffer(s) and network card reads it directly).
>
> So for this to happen on Linux all I need is to "pierce" the "conversion
> layer" libuv put on top of readiness model for it to work like IOCP model.
> On Windows -- the only thing that comes to mind is to break every write
> request into 1-byte write requests :-)
>
> If only we had "buffer readiness" notification added to IOCP -- similar to
> how edge-triggered epoll works... I.e. once NIC sends out some data -- it
> updates "bytes sent" counter and (unless it is already set) sets the
> "alarm", once app receives notification about alarm -- it'll read counter,
> "disarm" alarm and do smth with (now free) buffer. Very similar to Unix
> readiness model, but instead of "socket buffer is ready to receive data" it
> means "your buffer is no longer needed".
>
> --
> You received this message because you are subscribed to the Google Groups
> "libuv" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/libuv.
> For more options, visit https://groups.google.com/d/optout.
