Hi,

While doing final testing of the mvneta changes for phylink, an easily triggered race condition was found in the Marvell PHY driver, which manifested itself as the link going down when a hibernate cycle terminates.

The issue turned out to be a race between two threads accessing the PHY - one trying to do a status read and the other configuring the PHY. The result is the configuration thread tries to read-modify-write a paged register in a non-copper page, but the status read thread switches the PHY back to the copper page half-way through.

Various solutions involving phy->lock were considered, but found to create more lock dependency issues than were nice to deal with.

The solution proposed here uses the mdiobus lock to ensure that accesses to paged registers become atomic with respect to all other bus accesses, including those from userspace.

There is an open question whether there should be generic helpers for this. Generic helpers would mean:

- Additional couple of function pointers in phy_driver to read/write the paging register. This has the restriction that there must only be one paging register.

- The helpers become more expensive, and because they're in a separate compilation unit, the compiler will be unable to optimise them by inlining the static functions.

- The helpers would be re-usable, saving replications of that code, and making it more likely for phy authors to safely access the PHY.

Another potential question is whether using the mdiobus lock (which excludes all other MII bus access) is best - while it has the advantage of also ensuring atomicity with userspace accesses, it means that no one else can access an independent PHY on the same bus while a paged access is on-going. It feels like a big hammer, but I'm not convinced that we will see a lot of contention on it.

Comments?
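
To make the locking idea concrete, here is a minimal userspace sketch of a paged read-modify-write done entirely under one bus lock. It is not the code from this series: struct sim_bus, the sim_* functions and the SIM_* constants are hypothetical names invented for illustration, a pthread mutex stands in for the mdiobus lock, and the register layout is simulated in memory.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <string.h>

#define SIM_PAGE_REG  22  /* page-select register number assumed for this sketch */
#define SIM_NUM_PAGES 4
#define SIM_NUM_REGS  32

/* Hypothetical stand-in for an MDIO bus with one paged PHY behind it. */
struct sim_bus {
	pthread_mutex_t lock;                       /* plays the role of the mdiobus lock */
	uint16_t page;                              /* currently selected page */
	uint16_t regs[SIM_NUM_PAGES][SIM_NUM_REGS]; /* per-page register file */
};

static void sim_bus_init(struct sim_bus *bus)
{
	memset(bus, 0, sizeof(*bus));
	pthread_mutex_init(&bus->lock, NULL);
}

/* Unlocked accessors: the caller must already hold bus->lock. */
static uint16_t sim_read_unlocked(struct sim_bus *bus, int reg)
{
	if (reg == SIM_PAGE_REG)
		return bus->page;
	return bus->regs[bus->page][reg];
}

static void sim_write_unlocked(struct sim_bus *bus, int reg, uint16_t val)
{
	if (reg == SIM_PAGE_REG)
		bus->page = val % SIM_NUM_PAGES;
	else
		bus->regs[bus->page][reg] = val;
}

/* Ordinary single access, as a status read would use: takes the bus lock. */
static uint16_t sim_read(struct sim_bus *bus, int reg)
{
	uint16_t val;

	pthread_mutex_lock(&bus->lock);
	val = sim_read_unlocked(bus, reg);
	pthread_mutex_unlock(&bus->lock);
	return val;
}

/*
 * Paged read-modify-write. Page select, read, write and page restore all
 * happen under one hold of the bus lock, so no other bus user can change
 * the page half-way through.
 */
static void sim_modify_paged(struct sim_bus *bus, uint16_t page, int reg,
			     uint16_t mask, uint16_t set)
{
	uint16_t old_page, val;

	pthread_mutex_lock(&bus->lock);
	old_page = sim_read_unlocked(bus, SIM_PAGE_REG);
	sim_write_unlocked(bus, SIM_PAGE_REG, page);
	val = sim_read_unlocked(bus, reg);
	sim_write_unlocked(bus, reg, (uint16_t)((val & ~mask) | set));
	sim_write_unlocked(bus, SIM_PAGE_REG, old_page);
	pthread_mutex_unlock(&bus->lock);
}
```

A caller would use sim_modify_paged(&bus, 2, 16, 0x000f, 0x0005) to update the low four bits of register 16 in page 2. A concurrent sim_read() on another thread blocks until the page has been restored. The cost is the one raised above: every other user of the same bus waits for the whole sequence.
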

 drivers/net/phy/marvell.c  | 365 +++++++++++++++++++++------------------------
 drivers/net/phy/mdio_bus.c |  65 ++++++--
 drivers/net/phy/phy-core.c |  11 +-
 include/linux/mdio.h       |   3 +
 include/linux/phy.h        |  26 ++++
 5 files changed, 256 insertions(+), 214 deletions(-)

--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up