On 2/21/19 7:00 PM, Eric Dumazet wrote:
> On Thu, Feb 21, 2019 at 7:30 AM Vasily Averin <[email protected]> wrote:
>>
>> There were a few incidents where XFS over a network block device generated
>> IO requests with slab-based metadata. If these requests are processed
>> via the sendpage path, tcp_sendpage() calls skb_can_coalesce() and merges
>> neighbouring slab objects into one skb fragment.
>>
>> If the receiving side is located on the same host, tcp_recvmsg() can trigger
>> the following BUG_ON:
>> usercopy: kernel memory exposure attempt detected
>> from XXXXXX (kmalloc-512) (1024 bytes)
>>
>> This patch helps to detect the cause of similar incidents on the sending side.
>>
>> Signed-off-by: Vasily Averin <[email protected]>
>> ---
>> net/ipv4/tcp.c | 1 +
>> 1 file changed, 1 insertion(+)
>>
>> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
>> index 2079145a3b7c..cf9572f4fc0f 100644
>> --- a/net/ipv4/tcp.c
>> +++ b/net/ipv4/tcp.c
>> @@ -996,6 +996,7 @@ ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
>>  			goto wait_for_memory;
>>
>>  		if (can_coalesce) {
>> +			WARN_ON_ONCE(PageSlab(page));
>
> Please use VM_WARN_ON_ONCE() to make this a nop for CONFIG_DEBUG_VM=n
> Also the whole tcp_sendpage() should be protected, not only the coalescing
> part.
> (The get_page() done a few lines later should not be attempted either)
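
For reference, the failure mode from the original description can be modelled
in userspace. This is only a rough sketch, not kernel code: struct frag,
frag_add(), can_coalesce() and usercopy_ok() are hypothetical names made up
for illustration, and OBJ_SIZE stands in for one kmalloc-512 object. It shows
how two neighbouring 512-byte objects become one 1024-byte fragment that a
per-object bounds check then rejects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define OBJ_SIZE 512	/* models one kmalloc-512 object (assumption) */

/* Hypothetical stand-in for an skb page fragment: a byte range in a page. */
struct frag {
	size_t offset;
	size_t len;
};

/* Models the idea behind skb_can_coalesce(): new data directly follows
 * the existing fragment in the same page. */
bool can_coalesce(const struct frag *f, size_t offset)
{
	return f->len > 0 && f->offset + f->len == offset;
}

/* Add [offset, offset + len) to the fragment, merging when contiguous.
 * Returns false when the range cannot be merged. */
bool frag_add(struct frag *f, size_t offset, size_t len)
{
	assert(f != NULL);
	if (f->len == 0) {
		f->offset = offset;
		f->len = len;
		return true;
	}
	if (!can_coalesce(f, offset))
		return false;
	f->len += len;
	return true;
}

/* Models the hardened usercopy rule for slab memory: a copy must not
 * run past the end of the object it starts in. */
bool usercopy_ok(size_t offset, size_t len)
{
	size_t in_obj = offset % OBJ_SIZE;

	return len <= OBJ_SIZE - in_obj;
}
```

Feeding frag_add() the ranges (0, 512) and (512, 512) yields a single
fragment of length 1024, and usercopy_ok(0, 1024) then fails, which
corresponds to the "(kmalloc-512) (1024 bytes)" report above.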

Eric, what do you think about the following patch?
I will validate a backported version of it on a RHEL7-based OpenVZ kernel
before sending it to mainline.

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index cf3c5095c10e..7be7b6abe8b5 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -943,6 +943,11 @@ ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
 	ssize_t copied;
 	long timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT);
 
+	if (PageSlab(page)) {
+		VM_WARN_ONCE(true, "sendpage should not handle Slab objects,"
+				   " please fix callers\n");
+		return sock_no_sendpage_locked(sk, page, offset, size, flags);
+	}
 	/* Wait for a connection to finish. One exception is TCP Fast Open
 	 * (passive side) where data is allowed to be sent before a connection
 	 * is fully established.
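
The intent of the check can be sketched in userspace as follows. This is a
simplified model under stated assumptions, not the kernel implementation:
struct fake_page, warn_once(), send_copy() and send_page() are hypothetical
names, the refcount increment stands in for get_page(), and send_copy() stands
in for the sock_no_sendpage_locked() fallback, which as far as I understand
copies the data instead of referencing the page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for struct page: a slab flag and a refcount. */
struct fake_page {
	bool slab;
	int refcount;
	char data[64];
};

enum send_path { PATH_ZEROCOPY, PATH_COPY };

/* Print the warning only on the first call, like the *_ONCE macros. */
void warn_once(const char *msg)
{
	static bool warned;

	if (!warned) {
		warned = true;
		fprintf(stderr, "WARNING: %s\n", msg);
	}
}

/* Fallback path: copy the bytes and leave the page refcount alone. */
enum send_path send_copy(const struct fake_page *p, size_t off, size_t len,
			 char *wire)
{
	memcpy(wire, p->data + off, len);
	return PATH_COPY;
}

/* Guard the whole function: slab pages never reach the zero-copy path. */
enum send_path send_page(struct fake_page *p, size_t off, size_t len,
			 char *wire)
{
	if (p->slab) {
		warn_once("sendpage should not handle Slab objects");
		return send_copy(p, off, len, wire);
	}

	p->refcount++;	/* models get_page(): the page is referenced */
	/* A real zero-copy path would attach the page to an skb; the sketch
	 * copies anyway so that the result can be inspected. */
	memcpy(wire, p->data + off, len);
	return PATH_ZEROCOPY;
}
```

With this layout a slab-backed page is neither coalesced nor referenced, which
covers both points raised in the review: the check protects the whole function
and no get_page() is attempted on a slab page.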