On 26/09/2017 3:51 PM, Eric Dumazet wrote:
On Tue, Sep 26, 2017 at 4:21 AM, Tariq Toukan <tar...@mellanox.com> wrote:
Hi Eric,
We see a regression introduced in this series, specifically in the patches
touching lib/kobject_uevent.c.
We tried to figure out what is wrong there, but couldn't pinpoint it.
The bug is that the mlx4 driver restart fails because mlx4_core is still in use.
According to the module dependencies, both mlx4_en and mlx4_ib should have been
unloaded at this point.
Please see log below.
This looks like some kind of race, as the repro is not deterministic.
Probably the en/ib modules are now mistakenly reloaded.
Any idea what this could be?
Regards,
Tariq
[root@reg-l-vrt-41016-009 ~]# /etc/init.d/openibd stop
Unloading HCA driver: [ OK ]
[root@reg-l-vrt-41016-009 ~]# /etc/init.d/openibd start
Loading HCA driver and Access Layer: [ OK ]
[root@reg-l-vrt-41016-009 ~]# /etc/init.d/openibd stop
Unloading mlx4_core [FAILED]
rmmod: ERROR: Module mlx4_core is in use
I have absolutely no idea. Please bisect.
We previously saw a similar issue, which was reported on the mailing list.
Dmitry Torokhov suggested the following fix:
https://lkml.org/lkml/2017/9/12/523
And indeed, it solved the issue.
We kept the suggested patch in our internal branch, and rebased.
The issue appeared again once your series was accepted.
Bisecting shows that the issue reappears with this patch:
4a336a23d619 kobject: copy env blob in one go
Are you really using netns in the first place?
No, but it seems to still affect module load/unload.
Regards,
Tariq
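
For anyone trying to reproduce this: a minimal sketch, assuming a Linux host, for checking which modules still hold mlx4_core when rmmod reports it as in use. It parses /proc/modules (the same data lsmod prints). The helper names parse_proc_modules and why_in_use are made up for illustration and are not part of openibd or any tool mentioned in this thread.

```python
from pathlib import Path


def parse_proc_modules(text):
    """Map module name -> (refcount, [holders]) from /proc/modules-style text.

    Each line looks like: name size refcount holders state address
    where holders is a comma-separated list (with a trailing comma) or "-".
    """
    modules = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        name, _size, refcount, users = fields[:4]
        holders = [] if users == "-" else [u for u in users.split(",") if u]
        # refcount is "-" on kernels built without module unload support
        count = int(refcount) if refcount.isdigit() else None
        modules[name] = (count, holders)
    return modules


def why_in_use(text, module):
    """Return (refcount, holders) for one module, or (None, []) if not loaded."""
    return parse_proc_modules(text).get(module, (None, []))


if __name__ == "__main__":
    proc_modules = Path("/proc/modules")
    if proc_modules.exists():
        # Hypothetical usage: run right after a failed "openibd stop".
        print(why_in_use(proc_modules.read_text(), "mlx4_core"))
```

If mlx4_en or mlx4_ib shows up as a holder right after the stop, that would be consistent with the modules having been loaded again. A non-zero refcount with no holders listed would point at something other than a dependent module.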