Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

2020-07-02 Thread Zefan Li
On 2020/7/3 0:02, Roman Gushchin wrote: > On Wed, Jul 01, 2020 at 09:48:48PM -0700, Cong Wang wrote: >> On Tue, Jun 30, 2020 at 3:48 PM Roman Gushchin wrote: >>> >>> Btw if we want to backport the problem but can't blame a specific commit, >>> we can always use something like "Cc: [3.1+]". >>

Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

2020-07-02 Thread Peter Geis
On Thu, Jul 2, 2020 at 12:03 PM Roman Gushchin wrote: > > On Wed, Jul 01, 2020 at 09:48:48PM -0700, Cong Wang wrote: > > On Tue, Jun 30, 2020 at 3:48 PM Roman Gushchin wrote: > > > > > > Btw if we want to backport the problem but can't blame a specific commit, > > > we can always use something li

Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

2020-07-02 Thread Roman Gushchin
On Wed, Jul 01, 2020 at 09:48:48PM -0700, Cong Wang wrote: > On Tue, Jun 30, 2020 at 3:48 PM Roman Gushchin wrote: > > > > Btw if we want to backport the problem but can't blame a specific commit, > > we can always use something like "Cc: [3.1+]". > > Sure, but if we don't know which is the r

Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

2020-07-02 Thread Thomas Lamprecht
On 02.07.20 06:48, Cong Wang wrote: > On Tue, Jun 30, 2020 at 3:48 PM Roman Gushchin wrote: >> >> Btw if we want to backport the problem but can't blame a specific commit, >> we can always use something like "Cc: [3.1+]". > > Sure, but if we don't know which is the right commit to blame, then

Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

2020-07-01 Thread Cong Wang
On Tue, Jun 30, 2020 at 3:48 PM Roman Gushchin wrote: > > Btw if we want to backport the problem but can't blame a specific commit, > we can always use something like "Cc: [3.1+]". Sure, but if we don't know which is the right commit to blame, then how do we know which stable version should t

Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

2020-06-30 Thread Zefan Li
On 2020/7/1 6:48, Roman Gushchin wrote: > On Tue, Jun 30, 2020 at 03:22:34PM -0700, Cong Wang wrote: >> On Sat, Jun 27, 2020 at 4:41 PM Roman Gushchin wrote: >>> >>> On Fri, Jun 26, 2020 at 10:58:14AM -0700, Cong Wang wrote: On Thu, Jun 25, 2020 at 10:23 PM Cameron Berkenpas wrote: >>>

Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

2020-06-30 Thread Roman Gushchin
On Tue, Jun 30, 2020 at 03:22:34PM -0700, Cong Wang wrote: > On Sat, Jun 27, 2020 at 4:41 PM Roman Gushchin wrote: > > > > On Fri, Jun 26, 2020 at 10:58:14AM -0700, Cong Wang wrote: > > > On Thu, Jun 25, 2020 at 10:23 PM Cameron Berkenpas > > > wrote: > > > > > > > > Hello, > > > > > > > > Somew

Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

2020-06-30 Thread Cong Wang
On Sat, Jun 27, 2020 at 4:41 PM Roman Gushchin wrote: > > On Fri, Jun 26, 2020 at 10:58:14AM -0700, Cong Wang wrote: > > On Thu, Jun 25, 2020 at 10:23 PM Cameron Berkenpas wrote: > > > > > > Hello, > > > > > > Somewhere along the way I got the impression that it generally takes > > > those affect

Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

2020-06-30 Thread Cong Wang
On Sat, Jun 27, 2020 at 3:59 PM Cameron Berkenpas wrote: > > The box has been up without issue for over 25 hours now. The patch seems > solid. That's great! Thanks for testing!

Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

2020-06-27 Thread Roman Gushchin
On Fri, Jun 26, 2020 at 10:58:14AM -0700, Cong Wang wrote: > On Thu, Jun 25, 2020 at 10:23 PM Cameron Berkenpas wrote: > > > > Hello, > > > > Somewhere along the way I got the impression that it generally takes > > those affected hours before their systems lock up. I'm (generally) able > > to repr

Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

2020-06-27 Thread Cameron Berkenpas
The box has been up without issue for over 25 hours now. The patch seems solid. On 6/26/20 3:03 PM, Cameron Berkenpas wrote: Box has been up for 25 minutes without issue. Probably the patch works, but I can further confirm by tomorrow. On 6/26/2020 10:58 AM, Cong Wang wrote: On Thu, Jun 25, 2

Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

2020-06-26 Thread Cameron Berkenpas
Box has been up for 25 minutes without issue. Probably the patch works, but I can further confirm by tomorrow. On 6/26/2020 10:58 AM, Cong Wang wrote: On Thu, Jun 25, 2020 at 10:23 PM Cameron Berkenpas wrote: Hello, Somewhere along the way I got the impression that it generally takes those a

Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

2020-06-26 Thread Cong Wang
On Thu, Jun 25, 2020 at 10:23 PM Cameron Berkenpas wrote: > > Hello, > > Somewhere along the way I got the impression that it generally takes > those affected hours before their systems lock up. I'm (generally) able > to reproduce this issue much faster than that. Regardless, I can help test. > >

Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

2020-06-25 Thread Cameron Berkenpas
Hello, Somewhere along the way I got the impression that it generally takes those affected hours before their systems lock up. I'm (generally) able to reproduce this issue much faster than that. Regardless, I can help test. Are there any patches that need testing or is this all still pending

Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

2020-06-23 Thread Roman Gushchin
On Fri, Jun 19, 2020 at 08:31:14PM -0700, Cong Wang wrote: > On Fri, Jun 19, 2020 at 5:51 PM Zefan Li wrote: > > > > On 2020/6/20 8:45, Zefan Li wrote: > > > On 2020/6/20 3:51, Cong Wang wrote: > > >> On Thu, Jun 18, 2020 at 11:40 PM Zefan Li wrote: > > >>> > > >>> On 2020/6/19 5:09, Cong Wang wrote:

Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

2020-06-23 Thread Cong Wang
On Tue, Jun 23, 2020 at 1:45 AM Zhang,Qiang wrote: > > There are some messages in kernel v5.4; I don't know if they will help. > > dmesg: > > cgroup: cgroup: disabling cgroup2 socket matching due to net_prio or > net_cls activation ... > ---[ cut here ]--- > percpu ref (cgroup_bpf_rele

Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

2020-06-23 Thread Zhang,Qiang
On Mon, Jun 22, 2020 at 11:14:20AM -0700, Cong Wang wrote: > On Sat, Jun 20, 2020 at 8:58 AM Roman Gushchin wrote: > > > > On Fri, Jun 19, 2020 at 08:00:41PM -0700, Cong Wang wrote: > > > On Fri, Jun 19, 2020 at 6:14 PM Roman Gushchin wrote: > > > > > > > > On Sat, Jun 20, 2020 at 09:00:40AM +08

Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

2020-06-23 Thread Zhang,Qiang
The tester found the following during testing. The dmesg output (kernel v5.4) is below; I don't know whether it helps with this issue. root@intel-x86-64:~# cgroup: cgroup: disabling cgroup2 socket matching due to net_prio or net_cls activation IPv6: ADDRCONF(NETDEV_CHANGE

Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

2020-06-23 Thread Zhang,Qiang
There are some messages in kernel v5.4; I don't know if they will help. dmesg: cgroup: cgroup: disabling cgroup2 socket matching due to net_prio or net_cls activation IPv6: ADDRCONF(NETDEV_CHANGE): veth4c31d8d2: link becomes ready cni0: port 1(veth4c31d8d2) entered blocking state cni0: port 1(veth

Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

2020-06-22 Thread Roman Gushchin
On Mon, Jun 22, 2020 at 11:14:20AM -0700, Cong Wang wrote: > On Sat, Jun 20, 2020 at 8:58 AM Roman Gushchin wrote: > > > > On Fri, Jun 19, 2020 at 08:00:41PM -0700, Cong Wang wrote: > > > On Fri, Jun 19, 2020 at 6:14 PM Roman Gushchin wrote: > > > > > > > > On Sat, Jun 20, 2020 at 09:00:40AM +080

Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

2020-06-22 Thread Cong Wang
On Sat, Jun 20, 2020 at 8:58 AM Roman Gushchin wrote: > > On Fri, Jun 19, 2020 at 08:00:41PM -0700, Cong Wang wrote: > > On Fri, Jun 19, 2020 at 6:14 PM Roman Gushchin wrote: > > > > > > On Sat, Jun 20, 2020 at 09:00:40AM +0800, Zefan Li wrote: > > > > I think so, though I'm not familiar with the

Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

2020-06-20 Thread Roman Gushchin
On Sat, Jun 20, 2020 at 03:52:49PM +0800, Zefan Li wrote: > On 2020/6/20 11:31, Cong Wang wrote: > > On Fri, Jun 19, 2020 at 5:51 PM Zefan Li wrote: > >> > >> On 2020/6/20 8:45, Zefan Li wrote: > >>> On 2020/6/20 3:51, Cong Wang wrote: > On Thu, Jun 18, 2020 at 11:40 PM Zefan Li wrote: > > > >>>

Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

2020-06-20 Thread Roman Gushchin
On Fri, Jun 19, 2020 at 08:00:41PM -0700, Cong Wang wrote: > On Fri, Jun 19, 2020 at 6:14 PM Roman Gushchin wrote: > > > > On Sat, Jun 20, 2020 at 09:00:40AM +0800, Zefan Li wrote: > > > I think so, though I'm not familiar with the bpf cgroup code. > > > > > > > If so, we might wanna fix it in a d

Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

2020-06-20 Thread Zefan Li
On 2020/6/20 11:31, Cong Wang wrote: > On Fri, Jun 19, 2020 at 5:51 PM Zefan Li wrote: >> >> On 2020/6/20 8:45, Zefan Li wrote: >>> On 2020/6/20 3:51, Cong Wang wrote: On Thu, Jun 18, 2020 at 11:40 PM Zefan Li wrote: > > On 2020/6/19 5:09, Cong Wang wrote: >> On Thu, Jun 18, 2020 at 12:3

Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

2020-06-19 Thread Cong Wang
On Fri, Jun 19, 2020 at 5:51 PM Zefan Li wrote: > > On 2020/6/20 8:45, Zefan Li wrote: > > On 2020/6/20 3:51, Cong Wang wrote: > >> On Thu, Jun 18, 2020 at 11:40 PM Zefan Li wrote: > >>> > >>> On 2020/6/19 5:09, Cong Wang wrote: > On Thu, Jun 18, 2020 at 12:36 PM Roman Gushchin wrote: > > >

Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

2020-06-19 Thread Cong Wang
On Fri, Jun 19, 2020 at 6:14 PM Roman Gushchin wrote: > > On Sat, Jun 20, 2020 at 09:00:40AM +0800, Zefan Li wrote: > > I think so, though I'm not familiar with the bpf cgroup code. > > > > > If so, we might wanna fix it in a different way, > > > just checking if (!(css->flags & CSS_NO_REF)) in cg

Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

2020-06-19 Thread Zefan Li
>>> If so, we might wanna fix it in a different way, >>> just checking if (!(css->flags & CSS_NO_REF)) in cgroup_bpf_put() >>> like in cgroup_put(). It feels more reliable to me. >>> >> >> Yeah I also have this idea in my mind. > > I wonder if the following patch will fix the issue? > I guess so
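The alternative floated in this message, skipping the put when the object is flagged as not reference-counted, can be pictured with a small userspace model. This is only a sketch under stated assumptions: `toy_css`, `TOY_CSS_NO_REF`, `toy_bpf_put()` and `toy_put_demo()` are hypothetical stand-ins invented for illustration, and a plain integer replaces the kernel's percpu reference. It is not the patch that was discussed or merged.

```c
#include <assert.h>

/* Hypothetical flag; stands in for the CSS_NO_REF check named in the thread. */
#define TOY_CSS_NO_REF (1u << 0)

/* Hypothetical stand-in for a cgroup subsystem state object. */
struct toy_css {
    unsigned int flags;
    int refcnt;          /* plain counter instead of a percpu ref */
};

/* Drop a reference only when the object is actually reference-counted. */
static void toy_bpf_put(struct toy_css *css)
{
    if (!(css->flags & TOY_CSS_NO_REF)) {
        assert(css->refcnt > 0);   /* an unbalanced put would trip here */
        css->refcnt--;
    }
}

/* Apply `puts` put operations and return the resulting counter. */
int toy_put_demo(unsigned int flags, int initial, int puts)
{
    struct toy_css css = { .flags = flags, .refcnt = initial };

    for (int i = 0; i < puts; i++)
        toy_bpf_put(&css);
    return css.refcnt;
}
```

With the flag set, any number of puts leaves the counter untouched, so an object that never took references cannot underflow; without the flag, each put must be matched by an earlier get.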

Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

2020-06-19 Thread Roman Gushchin
On Sat, Jun 20, 2020 at 09:00:40AM +0800, Zefan Li wrote: > On 2020/6/20 8:51, Roman Gushchin wrote: > > On Fri, Jun 19, 2020 at 02:40:19PM +0800, Zefan Li wrote: > >> On 2020/6/19 5:09, Cong Wang wrote: > >>> On Thu, Jun 18, 2020 at 12:36 PM Roman Gushchin wrote: > > On Thu, Jun 18, 202

Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

2020-06-19 Thread Zefan Li
On 2020/6/20 8:51, Roman Gushchin wrote: > On Fri, Jun 19, 2020 at 02:40:19PM +0800, Zefan Li wrote: >> On 2020/6/19 5:09, Cong Wang wrote: >>> On Thu, Jun 18, 2020 at 12:36 PM Roman Gushchin wrote: On Thu, Jun 18, 2020 at 12:19:13PM -0700, Cong Wang wrote: > On Wed, Jun 17, 2020 at

Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

2020-06-19 Thread Zefan Li
On 2020/6/20 8:45, Zefan Li wrote: > On 2020/6/20 3:51, Cong Wang wrote: >> On Thu, Jun 18, 2020 at 11:40 PM Zefan Li wrote: >>> >>> On 2020/6/19 5:09, Cong Wang wrote: On Thu, Jun 18, 2020 at 12:36 PM Roman Gushchin wrote: > > On Thu, Jun 18, 2020 at 12:19:13PM -0700, Cong Wang wrote: >

Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

2020-06-19 Thread Roman Gushchin
On Fri, Jun 19, 2020 at 02:40:19PM +0800, Zefan Li wrote: > On 2020/6/19 5:09, Cong Wang wrote: > > On Thu, Jun 18, 2020 at 12:36 PM Roman Gushchin wrote: > >> > >> On Thu, Jun 18, 2020 at 12:19:13PM -0700, Cong Wang wrote: > >>> On Wed, Jun 17, 2020 at 6:44 PM Zefan Li wrote: > > Cc: R

Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

2020-06-19 Thread Zefan Li
On 2020/6/20 3:51, Cong Wang wrote: > On Thu, Jun 18, 2020 at 11:40 PM Zefan Li wrote: >> >> On 2020/6/19 5:09, Cong Wang wrote: >>> On Thu, Jun 18, 2020 at 12:36 PM Roman Gushchin wrote: On Thu, Jun 18, 2020 at 12:19:13PM -0700, Cong Wang wrote: > On Wed, Jun 17, 2020 at 6:44 PM Ze

Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

2020-06-19 Thread Cong Wang
On Thu, Jun 18, 2020 at 11:40 PM Zefan Li wrote: > > On 2020/6/19 5:09, Cong Wang wrote: > > On Thu, Jun 18, 2020 at 12:36 PM Roman Gushchin wrote: > >> > >> On Thu, Jun 18, 2020 at 12:19:13PM -0700, Cong Wang wrote: > >>> On Wed, Jun 17, 2020 at 6:44 PM Zefan Li wrote: > > Cc: Roman G

Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

2020-06-18 Thread Zefan Li
On 2020/6/19 5:09, Cong Wang wrote: > On Thu, Jun 18, 2020 at 12:36 PM Roman Gushchin wrote: >> >> On Thu, Jun 18, 2020 at 12:19:13PM -0700, Cong Wang wrote: >>> On Wed, Jun 17, 2020 at 6:44 PM Zefan Li wrote: Cc: Roman Gushchin Thanks for fixing this. On 2020/6/17

Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

2020-06-18 Thread Peter Geis
On Thu, Jun 18, 2020 at 5:26 PM Roman Gushchin wrote: > > On Thu, Jun 18, 2020 at 02:09:43PM -0700, Cong Wang wrote: > > On Thu, Jun 18, 2020 at 12:36 PM Roman Gushchin wrote: > > > > > > On Thu, Jun 18, 2020 at 12:19:13PM -0700, Cong Wang wrote: > > > > On Wed, Jun 17, 2020 at 6:44 PM Zefan Li

Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

2020-06-18 Thread Roman Gushchin
On Thu, Jun 18, 2020 at 02:09:43PM -0700, Cong Wang wrote: > On Thu, Jun 18, 2020 at 12:36 PM Roman Gushchin wrote: > > > > On Thu, Jun 18, 2020 at 12:19:13PM -0700, Cong Wang wrote: > > > On Wed, Jun 17, 2020 at 6:44 PM Zefan Li wrote: > > > > > > > > Cc: Roman Gushchin > > > > > > > > Thanks f

Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

2020-06-18 Thread Cong Wang
On Thu, Jun 18, 2020 at 12:36 PM Roman Gushchin wrote: > > On Thu, Jun 18, 2020 at 12:19:13PM -0700, Cong Wang wrote: > > On Wed, Jun 17, 2020 at 6:44 PM Zefan Li wrote: > > > > > > Cc: Roman Gushchin > > > > > > Thanks for fixing this. > > > > > > On 2020/6/17 2:03, Cong Wang wrote: > > > > Whe

Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

2020-06-18 Thread Roman Gushchin
On Thu, Jun 18, 2020 at 12:19:13PM -0700, Cong Wang wrote: > On Wed, Jun 17, 2020 at 6:44 PM Zefan Li wrote: > > > > Cc: Roman Gushchin > > > > Thanks for fixing this. > > > > On 2020/6/17 2:03, Cong Wang wrote: > > > When we clone a socket in sk_clone_lock(), its sk_cgrp_data is > > > copied, so

Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

2020-06-18 Thread Cong Wang
On Wed, Jun 17, 2020 at 6:44 PM Zefan Li wrote: > > Cc: Roman Gushchin > > Thanks for fixing this. > > On 2020/6/17 2:03, Cong Wang wrote: > > When we clone a socket in sk_clone_lock(), its sk_cgrp_data is > > copied, so the cgroup refcnt must be taken too. And, unlike the > > sk_alloc() path, so

Re: [Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

2020-06-17 Thread Zefan Li
Cc: Roman Gushchin Thanks for fixing this. On 2020/6/17 2:03, Cong Wang wrote: > When we clone a socket in sk_clone_lock(), its sk_cgrp_data is > copied, so the cgroup refcnt must be taken too. And, unlike the > sk_alloc() path, sock_update_netprioidx() is not called here. > Therefore, it is saf

[Patch net] cgroup: fix cgroup_sk_alloc() for sk_clone_lock()

2020-06-16 Thread Cong Wang
When we clone a socket in sk_clone_lock(), its sk_cgrp_data is copied, so the cgroup refcnt must be taken too. And, unlike the sk_alloc() path, sock_update_netprioidx() is not called here. Therefore, it is safe and necessary to grab the cgroup refcnt even when cgroup_sk_alloc is disabled. sk_clone
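The reasoning in the patch description (a cloned socket copies the parent's cgroup pointer, so it needs its own reference) can be illustrated with a minimal userspace model. Everything here is an assumption made for illustration: `toy_cgroup`, `toy_sock`, `toy_sock_clone()` and the other `toy_` names are hypothetical and do not mirror the kernel's real structures or the actual patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a cgroup with a simple reference counter. */
struct toy_cgroup {
    int refcnt;
};

/* Hypothetical stand-in for a socket; cgrp models the copied sk_cgrp_data. */
struct toy_sock {
    struct toy_cgroup *cgrp;
};

static void toy_cgroup_get(struct toy_cgroup *cg)
{
    cg->refcnt++;
}

static void toy_cgroup_put(struct toy_cgroup *cg)
{
    assert(cg->refcnt > 0);   /* a put without a matching get would trip here */
    cg->refcnt--;
}

/* Clone by copying the parent, then take a reference for the copied pointer. */
static void toy_sock_clone(struct toy_sock *child, const struct toy_sock *parent)
{
    *child = *parent;
    if (child->cgrp)
        toy_cgroup_get(child->cgrp);
}

/* Freeing a socket drops the reference it holds. */
static void toy_sock_free(struct toy_sock *sk)
{
    if (sk->cgrp) {
        toy_cgroup_put(sk->cgrp);
        sk->cgrp = NULL;
    }
}

/* One parent/clone lifecycle; returns the counter left after both are freed. */
int toy_clone_demo(void)
{
    struct toy_cgroup cg = { .refcnt = 1 };   /* the creator's own reference */
    struct toy_sock parent = { .cgrp = &cg };
    struct toy_sock child;

    toy_cgroup_get(&cg);                /* reference held by the parent socket */
    toy_sock_clone(&child, &parent);    /* reference held by the clone */
    toy_sock_free(&parent);
    toy_sock_free(&child);
    return cg.refcnt;
}
```

In this model, if `toy_sock_clone()` copied the pointer without calling `toy_cgroup_get()`, freeing both sockets would drop one reference more than was taken, which is the kind of imbalance the description says must be avoided.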