On Thu, 7 May 2020 at 01:45, Florian Fainelli <f.faine...@gmail.com> wrote: > > > > On 5/6/2020 2:40 PM, Vladimir Oltean wrote: > > Hi Florian, > > > > On Thu, 7 May 2020 at 00:24, Florian Fainelli <f.faine...@gmail.com> wrote: > >> > >> > >> > >> On 5/5/2020 2:23 PM, Vivien Didelot wrote: > >>> On Tue, 5 May 2020 14:02:53 -0700, Florian Fainelli > >>> <f.faine...@gmail.com> wrote: > >>>> If we are probed through platform_data we would be intentionally > >>>> dropping the reference count on master after dev_to_net_device() > >>>> incremented it. If we are probed through Device Tree, > >>>> of_find_net_device() does not do a dev_hold() at all. > >>>> > >>>> Ensure that the DSA master device is properly reference counted by > >>>> holding it as soon as the CPU port is successfully initialized and later > >>>> released during dsa_switch_release_ports(). dsa_get_tag_protocol() does > >>>> a short de-reference, so we hold and release the master at that time, > >>>> too. > >>>> > >>>> Fixes: 83c0afaec7b7 ("net: dsa: Add new binding implementation") > >>>> Signed-off-by: Florian Fainelli <f.faine...@gmail.com> > >>> > >>> Reviewed-by: Vivien Didelot <vivien.dide...@gmail.com> > >>> > >> Andrew, Vladimir, any thoughts on that? > >> -- > >> Florian > > > > I might be completely off because I guess I just don't understand what > > is the goal of keeping a reference to the DSA master in this way for > > the entire lifetime of the DSA switch. I think that dev_hold is for > > short-term things that cannot complete atomically, but I think that > > you are trying to prevent the DSA master from getting freed from under > > our feet, which at the moment would fault the kernel instantaneously? > > Yes, that's the idea, you should not be able to rmmod/unbind the DSA > master while there is a DSA switch tree hanging off of it. > > > > > If this is correct, it certainly doesn't do what it intends to do: > > echo 0000\:00\:00.5> /sys/bus/pci/drivers/mscc_felix/unbind > > [ 71.576333] unregister_netdevice: waiting for swp0 to become free. > > Usage count = 1 > > (hangs there) > > Is this with the sja1105 switch hanging off felix?
Yes, but it actually doesn't matter that the DSA master is a DSA slave too. > If so, is not it > working as expected because you still have sja1150 being bound to one of > those ports? If not, then I will look into why. > I just unbound the driver for the DSA master and the shell got stuck in kernel process context telling me that it's waiting for the reference to be freed. So I think it's just that my "expected" is not the same as yours - it looks like what I'm doing would qualify as "incorrect usage". > > > > But if I'm right and that's indeed what you want to achieve, shouldn't > > we be using device links instead? > > https://www.kernel.org/doc/html/v4.14/driver-api/device_link.html > > device links could work but given that the struct device and struct > net_device have almost the same lifetime, with the net_device being a > little bit shorter, and that is what DSA uses, I am not sure whether > device link would bring something better. At the very least, I think they would bring us graceful teardown of consumers of the DSA master device. > -- > Florian Thanks, -Vladimir