On Thu, Jan 18, 2018 at 6:20 PM, Sowmini Varadhan <sowmini.varad...@oracle.com> wrote:
> On (01/18/18 18:09), Willem de Bruijn wrote:
>> If that is true in general for PF_RDS, then it is a reasonable approach.
>> How about treating it as a (follow-on) optimization path. Opportunistic
>> piggybacking of notifications on data reads is more widely applicable.
>
> sounds good.
>
>> > that's similar to what I have, except that it does not have the
>> > MSG_PEEK part (you'd need to enforce that the data portion
>> > is upper-bounded, and that the application has the responsibility
>> > of sending down "enough" buffer with recvmsg).
>>
>> Right. I think that an upper bound is the simplest solution here.
>>
>> By the way, if you allocate an skb immediately on page pinning, then
>> there are always sufficient skbs to store all notifications. On errqueue
>> enqueue just drop the new skb and copy its notification to the body of
>> the skb already on the queue, if one exists and it has room. That is
>> essentially what the tcp zerocopy code does with the [data, info] range.
>
> ok, I'll give that a shot (I'm working through the other review comments
> as well)
>
> fwiw, the data-corruption issue I mentioned turned out to be a day-one
> bug in rds-tcp (patched in http://patchwork.ozlabs.org/patch/863183/).
> The buffer reaping with zcopy (and aggressiveness of rds-stress) brought
> this one out..

Thanks. Good to hear that it's not in zerocopy itself.
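
For reference, the coalescing on errqueue enqueue suggested in the quoted text could look roughly like the user-space sketch below. It is only an illustration, not kernel code: struct notif_entry stands in for an skb on the error queue, struct notif_queue for the error queue itself, and MAX_COOKIES, MAX_ENTRIES and notif_enqueue() are hypothetical names chosen for the example. A real implementation would work on skbs under the queue lock and free the skb that was preallocated at page-pinning time once its notification has been merged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_COOKIES 8   /* hypothetical capacity of one entry's body */
#define MAX_ENTRIES 16  /* hypothetical queue depth for this sketch */

/* Stand-in for an skb on the error queue; the body holds cookies. */
struct notif_entry {
	uint32_t cookies[MAX_COOKIES];
	size_t ncookies;
};

/* Stand-in for the socket error queue. */
struct notif_queue {
	struct notif_entry entries[MAX_ENTRIES];
	size_t nentries;
};

/*
 * Queue one completion cookie.
 * Returns 1 if it was merged into the entry already at the tail
 * (the caller would then drop its preallocated entry), 0 if a new
 * entry was queued, and -1 if the queue is full.
 */
int notif_enqueue(struct notif_queue *q, uint32_t cookie)
{
	struct notif_entry *e;

	assert(q != NULL);

	if (q->nentries > 0) {
		struct notif_entry *tail = &q->entries[q->nentries - 1];

		/* An entry exists and has room: copy the notification in. */
		if (tail->ncookies < MAX_COOKIES) {
			tail->cookies[tail->ncookies++] = cookie;
			return 1;
		}
	}

	if (q->nentries == MAX_ENTRIES)
		return -1;

	/* Queue empty or tail full: the new entry goes on the queue. */
	e = &q->entries[q->nentries++];
	e->ncookies = 0;
	e->cookies[e->ncookies++] = cookie;
	return 0;
}
```

With this shape a single read of the error queue can return up to MAX_COOKIES notifications at once, which is the upper bound the application would have to size its buffer for.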