On 2/25/19 12:15 PM, Vasily Averin wrote:
> On 2/22/19 7:39 PM, Eric Dumazet wrote:
>> On Fri, Feb 22, 2019 at 6:02 AM Vasily Averin <v...@virtuozzo.com> wrote:
>
>>> Eric, could you please elaborate once again why tcp_sendpage() should not
>>> handle slab objects?
>>
>> Simply because SLAB has its own way to manage objects from a page, and
>> does not care about the underlying page having its refcount elevated.
>>
>> ptr = kmalloc(xx)
>> ... < here you can attempt cheating and add one to the underlying page>
>> kfree(ptr); // SLAB does not care of page count, it will effectively
>> put ptr in the free list.
>>
>> ptr2 = kmalloc(xx); //
>>
>> ptr2 can be the same than ptr (object was kfreed() earlier)
>>
>> This means that some other stuff will happily reuse the piece of
>> memory that you wanted to use for zero-copy.
>>
>> This is a serious bug IMO, since this would allow for data corruption.
>
> Thank you for explanation, however I still have some doubts.
>
> Yes, it's strange to use sendpage if we want to send some small 8-bytes-long
> slab based object, it's better to use sendmsg instead.
>
> Yes, using of sendpage for slab-based objects can require special attention
> to guarantee that slab object will not be freed until end of IO.
> However IMO this should be guaranteed if caller uses sendmsg instead of
> sendpage.
> Btw, as far as I understand in my example XFS did it correctly, submitted
> slab objects was kept in use
> and seems they should be freed after end of IO, via end_io callback.
> At least I did not found any bugs in sendpage callers.
>
> And most important, it seems for me switch from sendpage to sendmsg doe not
> resolve the problem completely:
> tcp_sendmsg_locked() under some conditions can merge neighbours slab-based
> tcp fragments,
> so local tcp_recvmsg() can trigger BUG_ON in this case too.
>
> Am I missed something probably?

It seems I missed that skb_copy_to_page_nocache() in tcp_sendmsg_locked()
copies the data out of the original slab object, so the fragments that get
merged hold copied data rather than references to the slab pages.
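
To make the difference concrete, here is a minimal userspace sketch of the
scenario Eric described. It is not kernel code: toy_page, toy_kmalloc() and
toy_kfree() are hypothetical stand-ins for a slab page and its LIFO free
list, and the refcount field only imitates the struct page refcount. It
assumes a simple LIFO free list, which is enough to show that an elevated
page refcount does not stop a freed object from being handed out again,
while a copy taken before the free is unaffected.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define OBJ_SIZE 8
#define NR_OBJS  4

/* Toy "page" carved into fixed-size objects, loosely modelled on a slab. */
struct toy_page {
	int refcount;              /* imitates the struct page refcount */
	char mem[NR_OBJS][OBJ_SIZE];
	void *freelist[NR_OBJS];   /* LIFO stack of free objects */
	int nr_free;
};

static void toy_page_init(struct toy_page *pg)
{
	pg->refcount = 1;
	pg->nr_free = 0;
	for (int i = NR_OBJS - 1; i >= 0; i--)
		pg->freelist[pg->nr_free++] = pg->mem[i];
}

static void *toy_kmalloc(struct toy_page *pg)
{
	return pg->nr_free ? pg->freelist[--pg->nr_free] : NULL;
}

static void toy_kfree(struct toy_page *pg, void *obj)
{
	assert(pg->nr_free < NR_OBJS);
	/* The refcount is never consulted: the object is simply reusable. */
	pg->freelist[pg->nr_free++] = obj;
}

/* Zero-copy style: keep a reference to the object and pin the page.
 * Returns 1 if the queued data is still intact, 0 if it was overwritten. */
static int zero_copy_survives(void)
{
	struct toy_page pg;
	toy_page_init(&pg);

	char *ptr = toy_kmalloc(&pg);
	memcpy(ptr, "payload", OBJ_SIZE);
	pg.refcount++;                    /* "cheat": pin the underlying page */
	const char *queued = ptr;         /* data is referenced, not copied */

	toy_kfree(&pg, ptr);              /* object goes back on the free list */
	char *ptr2 = toy_kmalloc(&pg);    /* LIFO: the same object comes back */
	memcpy(ptr2, "garbage", OBJ_SIZE);

	return memcmp(queued, "payload", OBJ_SIZE) == 0;
}

/* Copy style: take a private copy before the object is freed.
 * Returns 1 if the queued data is still intact, 0 if it was overwritten. */
static int copy_survives(void)
{
	struct toy_page pg;
	toy_page_init(&pg);

	char *ptr = toy_kmalloc(&pg);
	memcpy(ptr, "payload", OBJ_SIZE);
	char queued[OBJ_SIZE];
	memcpy(queued, ptr, OBJ_SIZE);    /* private copy of the data */

	toy_kfree(&pg, ptr);
	char *ptr2 = toy_kmalloc(&pg);
	memcpy(ptr2, "garbage", OBJ_SIZE);

	return memcmp(queued, "payload", OBJ_SIZE) == 0;
}
```

Calling both helpers from a small main() should show the referenced data
being overwritten while the copied data stays intact. The real allocators
are of course far more involved; the sketch only mirrors the
kmalloc/kfree/kmalloc sequence from Eric's mail.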