On 2/20/19 6:53 PM, Eric Dumazet wrote:
> On 02/20/2019 05:34 AM, Vasily Averin wrote:
>> Dear David,
>>
>> currently do_tcp_sendpages() calls skb_can_coalesce() to merge suitable
>> TCP fragments. If these fragments are slab objects and the data is not
>> transferred out of the local host, then tcp_recvmsg() can crash the host
>> on a BUG_ON (see [2] below).
>>
>> There is a known use case where slab objects are passed to tcp_sendpage():
>> XFS over a network block device whose backend is on the local host.
>>
>> I found a few such cases:
>> - _drbd_send_page() has had a PageSlab() check for a long time.
>> - Ilya Dryomov recently fixed it in ceph
>>   with commit 7e241f647dc7 "libceph: fall back to sendmsg for slab pages"
>>
>> Recently the OpenVZ team noticed this problem during experiments with
>> XFS over an iSCSI target located on the local host.
>>
>> I would note: the triggered BUG is not a real problem but a false alarm,
>> which nevertheless crashes the host.
>>
>> I can fix the last case by adding a PageSlab() check to
>> iscsi_tcp_segment_map(), however that does not fix the problem completely:
>> it may well be reproduced again with some other filesystem
>> or with some other kind of network block device.
>>
>> David, what do you think: would it be better to add a PageSlab() check
>> directly to skb_can_coalesce()? (see [1] below)
>>
> 
> No, this would be wrong.
> 
> There is no way a page fragment can be backed by a slab object,
> since a page fragment can be shared (the page refcount needs to be
> manipulated, without slab/slub being aware of this)

Thank you for the explanation,
though this does happen in real life, and it triggers the BUG_ON only if the
receiving side is located on the same host.
Would it make sense to add a WARN_ON to skb_can_coalesce() to detect such
cases?

> Please fix the callers.

Ok, will do it.
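
For illustration, the caller-side check could take roughly the following
shape. This is only a userspace model, not kernel code: struct mock_page,
enum send_path and choose_send_path() are hypothetical names standing in for
struct page, PageSlab() and page_count(), and treating a page with a
refcount below 1 like a slab page is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the two struct page properties
 * a caller would look at; this is not the kernel's struct page. */
struct mock_page {
	int  refcount;  /* models page_count(page) */
	bool is_slab;   /* models PageSlab(page) */
};

enum send_path {
	SEND_VIA_SENDPAGE,  /* zero-copy: the page itself becomes an skb fragment */
	SEND_VIA_SENDMSG,   /* copying path: the data is copied into the skb */
};

/* Decide how a caller should hand this page to TCP. Slab-backed pages
 * must not become page fragments, so they take the copying path.
 * Treating refcount < 1 the same way is an assumption of this sketch. */
static enum send_path choose_send_path(const struct mock_page *page)
{
	assert(page != NULL);

	if (page->is_slab || page->refcount < 1)
		return SEND_VIA_SENDMSG;
	return SEND_VIA_SENDPAGE;
}
```

In a real caller the same decision would be made with PageSlab() and
page_count() before choosing between the sendpage and sendmsg paths.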
