On 2/25/19 12:15 PM, Vasily Averin wrote:
> On 2/22/19 7:39 PM, Eric Dumazet wrote:
>> On Fri, Feb 22, 2019 at 6:02 AM Vasily Averin <v...@virtuozzo.com> wrote:
> 
>>> Eric, could you please elaborate once again why tcp_sendpage() should not 
>>> handle slab objects?
>>
>> Simply because SLAB has its own way to manage objects from a page, and
>> does not care about the underlying page having its refcount elevated.
>>
>> ptr = kmalloc(xx)
>> ...  < here you can attempt cheating and add one to the underlying page's refcount >
>> kfree(ptr); // SLAB does not care about the page count, it will effectively
>> put ptr in the free list.
>>
>> ptr2 = kmalloc(xx); //
>>
>> ptr2 can be the same as ptr (the object was kfree()d earlier)
>>
>> This means that some other stuff will happily reuse the piece of
>> memory that you wanted to use for zero-copy.
>>
>> This is a serious bug IMO, since this would allow for data corruption.
> 
> Thank you for the explanation, however I still have some doubts.
> 
> Yes, it's strange to use sendpage if we want to send some small 8-byte-long 
> slab-based object; it's better to use sendmsg instead.
> 
> Yes, using sendpage for slab-based objects can require special attention 
> to guarantee that the slab object will not be freed until the end of IO.
> However, IMO this should be guaranteed if the caller uses sendmsg instead of 
> sendpage.
> Btw, as far as I understand, in my example XFS did it correctly: the submitted 
> slab objects were kept in use,
> and it seems they should be freed after the end of IO, via the end_io callback.
> At least I did not find any bugs in the sendpage callers.
> 
> And most important, it seems to me that the switch from sendpage to sendmsg 
> does not resolve the problem completely: 
> tcp_sendmsg_locked() under some conditions can merge neighbouring slab-based 
> tcp fragments,
> so a local tcp_recvmsg() can trigger the BUG_ON in this case too.
> 
> Am I missing something?

It seems I missed that skb_copy_to_page_nocache() in tcp_sendmsg_locked() copies 
the data from the original slab object,
so the fragments it merges hold the copied data, not the slab pages themselves.
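
To illustrate the difference between referencing and copying, here is a
minimal userspace sketch. It is not kernel code: toy_slab, toy_alloc(),
toy_free(), toy_frag and the queue_by_*() helpers are hypothetical names
made up for this example, and the one-object "slab" only mimics the
behaviour Eric described (free ignores the page refcount, and the next
allocation returns the same object).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define OBJ_SIZE 16

/* Toy one-object "slab": hypothetical stand-in for kmalloc()/kfree(). */
struct toy_slab {
    char object[OBJ_SIZE];
    int in_use;
    int page_refcount;  /* what a would-be zero-copy user elevates */
};

static void *toy_alloc(struct toy_slab *s)
{
    if (s->in_use)
        return NULL;
    s->in_use = 1;
    return s->object;
}

/* Like the slab allocator, this never looks at page_refcount. */
static void toy_free(struct toy_slab *s, void *ptr)
{
    assert(ptr == s->object);
    s->in_use = 0;
}

/* A queued fragment either points at the caller's memory or owns a copy. */
struct toy_frag {
    const char *data;
    char copy[OBJ_SIZE];
};

/* sendpage-like: keep a reference and bump the page count. */
static void queue_by_reference(struct toy_frag *f, struct toy_slab *s,
                               const char *obj)
{
    s->page_refcount++;
    f->data = obj;
}

/* sendmsg-like: copy the bytes into memory owned by the fragment. */
static void queue_by_copy(struct toy_frag *f, const char *obj)
{
    memcpy(f->copy, obj, OBJ_SIZE);
    f->data = f->copy;
}

/*
 * Queue "payload", then free the object and reuse it for other data.
 * Returns 1 if the queued fragment still reads "payload", 0 otherwise.
 */
static int survives_reuse(int by_copy)
{
    struct toy_slab slab = { .in_use = 0, .page_refcount = 1 };
    struct toy_frag frag;
    char *ptr, *ptr2;

    ptr = toy_alloc(&slab);
    assert(ptr != NULL);
    strcpy(ptr, "payload");

    if (by_copy)
        queue_by_copy(&frag, ptr);
    else
        queue_by_reference(&frag, &slab, ptr);

    toy_free(&slab, ptr);       /* succeeds despite the elevated count */
    ptr2 = toy_alloc(&slab);    /* hands back the same object as ptr */
    assert(ptr2 == ptr);
    strcpy(ptr2, "other");

    return strcmp(frag.data, "payload") == 0;
}
```

In this sketch survives_reuse(0) returns 0 (the referenced fragment now
reads the reused object's data) and survives_reuse(1) returns 1 (the copy
is unaffected by the reuse).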
