Hello,

On Sun, Jun 14, 2020 at 20:29, Cong Wang <xiyou.wangc...@gmail.com> wrote:
>
> Hello,
>
> On Sun, Jun 14, 2020 at 5:39 AM Daniël Sonck <dsonc...@gmail.com> wrote:
> >
> > Hello,
> >
> > I found on the archive that this bug I encountered also happened to
> > others. I too have a very similar stacktrace. The issue I'm
> > experiencing is:
> >
> > Whenever I fully boot my cluster, in some time, the host crashes with
> > the __cgroup_bpf_run_filter_skb NULL pointer dereference. This has
> > been sporadic enough before not to cause real issues. However, as of
> > lately, the bug is triggered much more frequently. I've changed my
> > server hardware so I could capture serial output in order to get the
> > trace. This trace looked very similar as reported by Lu Fengqi. As it
> > currently stands, I cannot run the cluster as it's almost instantly
> > crashing the host.
>
> This has been reported for multiple times. Are you able to test the
> attached patch? And let me know if everything goes fine with it.

I will try out the patch. Since the host reliably crashed each time I
booted up the cluster VMs, I will be able to tell whether it has any
positive effect.
>
> I suspect we may still leak some cgroup refcnt even with the patch,
> but it might be much harder to trigger with this patch applied.

I am currently applying the patch to the kernel and compiling, so I
should know in a few hours.
>
> Thanks.
