On 29.12.2018 17:59, Norbert Jurkeit wrote:
> Am 29.12.18 um 16:44 schrieb Heiner Kallweit:
>>
>> I don't think this patch can have any impact on the issue. Maybe WoL is
>> still active from previous test?
>> Manual WoL settings may survive a reboot, you can disable WoL by "ethtool -s
>> <if> wol d".
>
> In theory I agree, but we have seen before that it can not be predicted or
> logically explained which kernel build suffers from the issue or does not.
> WoL is definitely off. When it was enabled, the LED already turned on with
> the BIOS diagnostics screen and not at the end of the boot process as
> observed with the patched kernel.
>
>>
>> What could be helpful in addition: I provided a patch with some debug output
>> in comment 106
>> in the bug ticket (https://bugzilla.redhat.com/show_bug.cgi?id=1650984).
>> If you could apply this, trigger a fail scenario, and attach the full dmesg
>> to the bug ticket.
> I just tried the Fedora kernel provided in comment 107. Unfortunately the
> fault neither shows up with this kernel nor with the stock Fedora kernel
> 4.19.12 it is based on. I will further try to find a kernel which fails to
> bring up the link AND provides some useful debug information but can't
> anticipate if and when.
>> Thanks a lot!
> You are welcome ;-)
>
>
Just by chance I came across the concept of MODULE_SOFTDEP. This is
basically in driver code what people did as a workaround manually in
the modprobe config files. It ensures that the PHY driver module is
loaded before the network driver. It's still a workaround but the
most elegant I can think of. I'm pretty sure this reliably avoids
the issue, but: could you please test?
If it works, then what I list below as one patch would be splitted:
1. add MODULE_SOFTDEP to r8169
2. remove preliminary fix
Thanks, Heiner
diff --git a/drivers/net/ethernet/realtek/r8169.c
b/drivers/net/ethernet/realtek/r8169.c
index 298930d39..4c1485a42 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -706,6 +706,7 @@ module_param(use_dac, int, 0);
MODULE_PARM_DESC(use_dac, "Enable PCI DAC. Unsafe on 32 bit PCI slot.");
module_param_named(debug, debug.msg_enable, int, 0);
MODULE_PARM_DESC(debug, "Debug verbosity level (0=none, ..., 16=all)");
+MODULE_SOFTDEP("pre: realtek");
MODULE_LICENSE("GPL");
MODULE_FIRMWARE(FIRMWARE_8168D_1);
MODULE_FIRMWARE(FIRMWARE_8168D_2);
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 9560a2b84..80be72844 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -2259,14 +2259,6 @@ int phy_driver_register(struct phy_driver *new_driver,
struct module *owner)
new_driver->mdiodrv.driver.remove = phy_remove;
new_driver->mdiodrv.driver.owner = owner;
- /* The following works around an issue where the PHY driver doesn't bind
- * to the device, resulting in the genphy driver being used instead of
- * the dedicated driver. The root cause of the issue isn't known yet
- * and seems to be in the base driver core. Once this is fixed we may
- * remove this workaround.
- */
- new_driver->mdiodrv.driver.probe_type = PROBE_FORCE_SYNCHRONOUS;
-
retval = driver_register(&new_driver->mdiodrv.driver);
if (retval) {
pr_err("%s: Error %d in registering driver\n",
--
2.20.1