On 29.12.2018 17:59, Norbert Jurkeit wrote:
> On 29.12.18 at 16:44, Heiner Kallweit wrote:
>>
>> I don't think this patch can have any impact on the issue. Maybe WoL is
>> still active from previous test?
>> Manual WoL settings may survive a reboot, you can disable WoL by "ethtool -s
>> <if> wol d".
>
> In theory I agree, but we have seen before that it can not be predicted or
> logically explained which kernel build suffers from the issue or does not.
> WoL is definitely off. When it was enabled, the LED already turned on with
> the BIOS diagnostics screen and not at the end of the boot process as
> observed with the patched kernel.
>
>>
>> What could be helpful in addition: I provided a patch with some debug output
>> in comment 106
>> in the bug ticket (https://bugzilla.redhat.com/show_bug.cgi?id=1650984).
>> If you could apply this, trigger a fail scenario, and attach the full dmesg
>> to the bug ticket.
> I just tried the Fedora kernel provided in comment 107. Unfortunately the
> fault neither shows up with this kernel nor with the stock Fedora kernel
> 4.19.12 it is based on. I will further try to find a kernel which fails to
> bring up the link AND provides some useful debug information but can't
> anticipate if and when.
>> Thanks a lot!
> You are welcome ;-)
>
Just by chance I came across MODULE_SOFTDEP. It basically does in driver
code what people did manually as a workaround in the modprobe config files:
it ensures that the PHY driver module is loaded before the network driver.
It's still a workaround, but the most elegant one I can think of.

I'm pretty sure this reliably avoids the issue, but could you please test it?

If it works, then what I list below as one patch would be split into two:
1. add MODULE_SOFTDEP to r8169
2. remove preliminary fix

Thanks, Heiner

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index 298930d39..4c1485a42 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -706,6 +706,7 @@ module_param(use_dac, int, 0);
 MODULE_PARM_DESC(use_dac, "Enable PCI DAC. Unsafe on 32 bit PCI slot.");
 module_param_named(debug, debug.msg_enable, int, 0);
 MODULE_PARM_DESC(debug, "Debug verbosity level (0=none, ..., 16=all)");
+MODULE_SOFTDEP("pre: realtek");
 MODULE_LICENSE("GPL");
 MODULE_FIRMWARE(FIRMWARE_8168D_1);
 MODULE_FIRMWARE(FIRMWARE_8168D_2);
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 9560a2b84..80be72844 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -2259,14 +2259,6 @@ int phy_driver_register(struct phy_driver *new_driver, struct module *owner)
 	new_driver->mdiodrv.driver.remove = phy_remove;
 	new_driver->mdiodrv.driver.owner = owner;
 
-	/* The following works around an issue where the PHY driver doesn't bind
-	 * to the device, resulting in the genphy driver being used instead of
-	 * the dedicated driver. The root cause of the issue isn't known yet
-	 * and seems to be in the base driver core. Once this is fixed we may
-	 * remove this workaround.
-	 */
-	new_driver->mdiodrv.driver.probe_type = PROBE_FORCE_SYNCHRONOUS;
-
 	retval = driver_register(&new_driver->mdiodrv.driver);
 	if (retval) {
 		pr_err("%s: Error %d in registering driver\n",
-- 
2.20.1
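
For comparison, the manual workaround in the modprobe config files mentioned
above could be sketched roughly as follows. The file name is only an
assumption for illustration; the softdep line expresses the same
"pre: realtek" dependency as the MODULE_SOFTDEP line in the patch.

```
# /etc/modprobe.d/r8169-softdep.conf  (hypothetical file name)
# Ask modprobe to load the Realtek PHY driver before r8169.
softdep r8169 pre: realtek
```

Like MODULE_SOFTDEP, such an entry is only evaluated when the module is
loaded via modprobe; a plain insmod doesn't resolve soft dependencies.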