On 24.02.2019 00:42, Andrew Lunn wrote: >> it took me quite some time to debug this issue .. >> >> At first a bisect pointed to one of my commits: >> 5502b218e001 ("net: phy: use phy_resolve_aneg_linkmode in >> genphy_read_status") >> >> Further digging lead me to some suspicious dsa code: >> In dsa_port_fixed_link_register_of() there's a call to genphy_read_status(). >> At the time of the call phydev->advertising is empty, therefore the fixed phy >> settings are overwritten with defaults (10/half) what breaks the system. >> >> Worth to be mentioned is that for the PHY these two flags are set: >> - is_pseudo_fixed_link (that's ok) >> - autoneg -> I'm not sure this is correct. >> >> It seems that you once added the code in question: >> 39b0c705195e ("net: dsa: Allow configuration of CPU & DSA port >> speeds/duplex") > > Hi Heiner > > For Ethernet switches, you can have device trees like this: > > switch0: switch@0 { > compatible = "marvell,mv88e6085"; > pinctrl-0 = <&pinctrl_gpio_switch0>; > pinctrl-names = "default"; > reg = <0>; > dsa,member = <0 0>; > interrupt-parent = <&gpio0>; > interrupts = <27 IRQ_TYPE_LEVEL_LOW>; > interrupt-controller; > #interrupt-cells = <2>; > eeprom-length = <512>; > > ports { > #address-cells = <1>; > #size-cells = <0>; > > port@0 { > reg = <0>; > label = "lan0"; > phy-handle = <&switch0phy0>; > }; > > ... > switch0port5: port@5 { > reg = <5>; > label = "dsa"; > phy-mode = "rgmii-txid"; > link = <&switch1port6 > &switch2port9>; > fixed-link { > speed = <1000>; > full-duplex; > }; > }; > > This is a DSA port. It is used to connect two Ethernet switches > together. In this case, the MACs are connected back to back. There are > no PHYs involved. These ports don't have a netdev associated with > them, since they are just pipes between switches, not user interfaces. > > If i remember correctly, the code you are looking at was to deal with > the "rgmii-txid". Without the TXID, i could not get frames to pass > between the ports. > > There is a second use case as well. The Vybrid FEC ethernet controller > is a Fast Ethernet port. The switch ports are 1G. DSA drivers set the > port connecting to the SoC to its highest speed. Again, there is no > netdev associated to the switch port, and generally the MAC ports are > connected back to back. This 100 vs 1000 does not work for the > Vybrid. So we add a fixed PHY to the CPU port. > > port@6 { > reg = <6>; > label = "cpu"; > ethernet = <&fec1>; > > fixed-link { > speed = <100>; > full-duplex; > }; > }; > > And there is a third use case: > > port@4 { > reg = <4>; > label = "optical4"; > > fixed-link { > speed = <1000>; > full-duplex; > link-gpios = <&gpio6 3 > > GPIO_ACTIVE_HIGH>; > }; > }; > > We have a GPIO which tells us if the link it up, because there is not > a PHY here, but an optical module. The GPIO tells us about loss of > signal. Unfortunately we cannot use the SFP driver here, because of a > hardware issue. > > In all cases end up in the same code. We want to tell the MAC to > configure RGMII delays, or to configure the port speed, or to have the > correct initial link state. The adjust_link() call can do this, if > passed a phydev. And we have a phydev for the fixed-link PHY. However, > since it has never been attached to a netdev, phy_start() called, etc, > the state information is maybe not correct. So we call config_init() > and read_status(), so phydev should reflect the state of the > fixed-link PHY. And a fixed-link PHY is always c22, and it can be > driven by genphy. Take a look at drivers/net/phy/swphy.c which is the > core of the simulation, and fixed_phy.c > Thanks a lot for the very comprehensive explanation!
I think what's not correct is that phydev->autoneg is set (by phy_device_create) for a fixed link. genphy_config_init() ensures that aneg isn't set in the "supported" bitmap, but it doesn't care about phydev->autoneg. IMO we need to consider this at few places: - genphy_config_init It should clear phydev->autoneg if aneg isn't supported. - phy_probe After having read the chip features we should clear phydev->autoneg if aneg isn't supported. - In __fixed_phy_register() we should clear phydev->autoneg. >> I did what I like to do most and removed some code. >> W/o the calls to genphy_config_init() and genphy_read_status() it works >> again. >> Do these calls have some purpose here with a fixed link? > > Maybe with all the core changes, these calls are no longer needed? We > just need to ensure the state in phydev reflects the state of the > fixed link, i.e. in this case 100 Full. > I think at least the call to genphy_config_init can be removed. I don't really like the name of this function anyway, because it doesn't initialize anything but is a sanity check that reads chip capabilities and removes unsupported modes from the "supported" bitmap. genphy_read_status() we need to read the link status from GPIO (if applicable in the use case). To be more precise, most likely calling genphy_update_link() would be sufficient because speed and duplex are fixed. > Looking forward, at some point we are going to have to make fixed-link > support higher speeds. That probably means we need a swphy-c45 which > emulates the standard registers for 2.5G, 5G and 10G. At that point > genphy will not work... > > Andrew > . > Heiner