On 02.07.20 06:48, Cong Wang wrote:
> On Tue, Jun 30, 2020 at 3:48 PM Roman Gushchin <g...@fb.com> wrote:
>>
>> Btw if we want to backport the problem but can't blame a specific commit,
>> we can always use something like "Cc: <sta...@vger.kernel.org> [3.1+]".
>
> Sure, but if we don't know which is the right commit to blame, then how
> do we know which stable version should the patch target? :)

We ran into a similar issue here after updating from the 5.4.41 to the
5.4.44 stable kernel. This patch addresses it: we have been running stable
for >17 hours of uptime with the patch, whereas without it we normally ran
into issues at <6 hours of uptime.

That update newly included commit 090e28b229af92dc5b ("netprio_cgroup: Fix
unlimited memory leak of v2 cgroups"), which this patch originally mentions
as "Fixes", whereas the other possible culprit mentioned, 4bfc0bb2c60e2f4c
("bpf: decouple the lifetime of cgroup_bpf from cgroup itself"), was already
included with 5.2 here and did *not* cause problems.

So, while the real culprit may be something else, a mix of them, or
something even more complex, the race is at least triggered far more
frequently with 090e28b229af92dc5b ("netprio_cgroup: Fix unlimited memory
leak of v2 cgroups") or, for the sake of completeness, possibly with
something else from the v5.4.41..v5.4.44 commit range. I have not looked
into that in detail yet.

>
> I am open to all options here, including not backporting to stable at all.

As said, the stable-5.4.y tree profits from having this patch, so there's
that.

Also, FWIW:

Tested-by: Thomas Lamprecht <t.lampre...@proxmox.com>