Re: [PATCH v7 net-next 08/11] net: dsa: allow changing the tag protocol via the "tagging" device attribute

Jakub Kicinski Thu, 28 Jan 2021 18:16:02 -0800

On Thu, 28 Jan 2021 22:07:50 +0200 Vladimir Oltean wrote:
> On Wed, Jan 27, 2021 at 05:30:44PM -0800, Jakub Kicinski wrote:
> > > +const struct dsa_device_ops *dsa_find_tagger_by_name(const char *buf)
> > > +{
> > > + const struct dsa_device_ops *ops = NULL;
> > > + struct dsa_tag_driver *dsa_tag_driver;
> > > +
> > > + mutex_lock(&dsa_tag_drivers_lock);
> > > + list_for_each_entry(dsa_tag_driver, &dsa_tag_drivers_list, list) {
> > > +         const struct dsa_device_ops *tmp = dsa_tag_driver->ops;
> > > +
> > > +         if (!sysfs_streq(buf, tmp->name))
> > > +                 continue;
> > > +
> > > +         ops = tmp;
> > > +         break;
> > > + }
> > > + mutex_unlock(&dsa_tag_drivers_lock);  
> >
> > What's protecting from the tag driver unloading at this very moment?  
> 
> The user's desire to not crash the kernel, and do something productive
> instead? Anyway, I've fixed this in my tree and I will repost soon.


:)

> > > + info.tag_ops = old_tag_ops;
> > > + err = dsa_tree_notify(dst, DSA_NOTIFIER_TAG_PROTO_DEL, &info);
> > > + if (err)
> > > +         goto out_unlock;
> > > +
> > > + info.tag_ops = tag_ops;
> > > + err = dsa_tree_notify(dst, DSA_NOTIFIER_TAG_PROTO_SET, &info);  
> >
> > Not sure I should bother you about this or not, but it looks like
> > Ocelot does allocations on SET, so there is a chance of not being
> > able to return to the previous config, leaving things broken.
> >
> > There's quite a few examples where we use REPLACE instead of set,
> > so that a careful driver can prep its resources before it kills
> > the previous config. Although that's not perfect either because
> > we'd rather have as much of that logic in the core as possible.
> >
> > What are your thoughts?  
> 
> On one hand, my immediate thoughts are:
> (a) out of the 2 ocelot taggers, only tag_8021q can fail, the NPI tagger
>     always returns 0. So either the .set_tag_protocol will always
>     succeed, or we'll always be able to restore the initial tag
>     protocol.
> (b) changing the tag protocol is done with network down, so if you can't
>     allocate some memory for some TCAM rules, you're probably in the
>     wrong business anyway once the ports go back up and you'll start
>     receiving network traffic.
> 
> But either way, I could create a single .replace_tag_protocol callback
> instead of the current .set_tag_protocol and .del_tag_protocol, and have
> the felix driver still call felix_set_tag_protocol privately from
> .setup(), and .felix_del_tag_protocol from .teardown(). That's a
> relatively non-invasive change which would make zero practical
> difference to my use case due to point (a) above.
> 
> However I will not pretend that having an "atomic" .replace_tag_protocol
> is going to ensure a consistent state of the tagger, because it won't.
> In the case where you have a DSA switch tree with 7 switches, and
> .replace_tag_protocol fails for the 5th, what do you do? You create a
> transactional model, with a prepare and commit phase, right? But I am
> doing some memory allocations from callbacks of external API
> (struct dsa_8021q_ops felix_tag_8021q_ops), so unless I have a crystal
> ball to guess what parameters will tag_8021q call me with (so I could
> preallocate), my options are:
> - propagate the prepare and commit phase to tag_8021q, which I'm not
>   going to.
> - do all of my setup in the prepare phase, the one that can return
>   errors, and privately restore my tagger e.g. from tag_8021q mode to
>   NPI mode if any allocation failed. Aka just do a facade thing with the
>   whole prepare/commit model.
> And having a prepare/commit model means that you do memory allocation in
> prepare so that you can use it during commit, which means that there
> must be some structure which holds the transactional storage of the
> driver. All is well except that the preparation phase of the 5th switch
> out of 7 may still fail, so you should also have an unprepare method
> that performs the resource deallocation for the first four. Normally the
> unprepare should not fail, but if I implement it the only I said I can
> (i.e. I do all my configuration in the prepare phase, and return in
> commit), then for all practical purposes, the unprepare phase will be a
> .replace_tag_protocol in the opposite direction. Aka an operation that
> can still fail.
> 
> If you have a better idea of how I can make dsa_tree_change_tag_proto
> guarantee that all switches in the tree end up with the same tagger,
> while not sabotaging the only driver implementing that API, do let me
> know.

I'm just trying to nudge you towards perfection. Working with servers
"at scale" I really appreciate the ability to reconfigure something on
very many machines without worrying that ~1% of them will be under
memory pressure and the failure will cause it to fall off the network.

But I take your point that it may be an overkill here.

A less ambitious plan would be to retry the SET when bringing up master
if the tree was left in a funky state?

Not knowing much about DSA internals I'll leave the decision to you and
other DSA maintainers.

Re: [PATCH v7 net-next 08/11] net: dsa: allow changing the tag protocol via the "tagging" device attribute

Reply via email to