On Thu, 20 Sep 2018 11:17:17 +0200, Magnus Karlsson wrote:
> On Wed, Sep 19, 2018 at 9:18 AM Magnus Karlsson wrote:
> > On Wed, Sep 19, 2018 at 3:58 AM Jakub Kicinski wrote:
> > > On Tue, 18 Sep 2018 10:22:11 -0700, Y Song wrote:
> > > > > +/* The umem is stored both in the _rx struct and the _tx struct as we do
> > > > > + * not know if the device has more tx queues than rx, or the opposite.
> > > > > + * This might also change during run time.
> > > > > + */
> > > > > +static void xdp_reg_umem_at_qid(struct net_device *dev, struct xdp_umem *umem,
> > > > > +				u16 queue_id)
> > > > > +{
> > > > > +	if (queue_id < dev->real_num_rx_queues)
> > > > > +		dev->_rx[queue_id].umem = umem;
> > > > > +	if (queue_id < dev->real_num_tx_queues)
> > > > > +		dev->_tx[queue_id].umem = umem;
> > > > > +}
> > > > > +
> > > > > +static struct xdp_umem *xdp_get_umem_from_qid(struct net_device *dev,
> > > > > +					      u16 queue_id)
> > > > > +{
> > > > > +	if (queue_id < dev->real_num_rx_queues)
> > > > > +		return dev->_rx[queue_id].umem;
> > > > > +	if (queue_id < dev->real_num_tx_queues)
> > > > > +		return dev->_tx[queue_id].umem;
> > > > > +
> > > > > +	return NULL;
> > > > > +}
> > > > > +
> > > > > +static void xdp_clear_umem_at_qid(struct net_device *dev, u16 queue_id)
> > > > > +{
> > > > > +	/* Zero out the entry independent on how many queues are configured
> > > > > +	 * at this point in time, as it might be used in the future.
> > > > > +	 */
> > > > > +	if (queue_id < dev->num_rx_queues)
> > > > > +		dev->_rx[queue_id].umem = NULL;
> > > > > +	if (queue_id < dev->num_tx_queues)
> > > > > +		dev->_tx[queue_id].umem = NULL;
> > > > > +}
> > > > > +
> > > >
> > > > I am not sure whether the following scenario can happen or not.
> > > > Could you clarify?
> > > >
> > > > 1. Suppose initially we have num_rx_queues = num_tx_queues = 10, and
> > > >    xdp_reg_umem_at_qid() sets umem1 for queue_id = 8.
> > > > 2. num_tx_queues is changed to 5.
> > > > 3. xdp_clear_umem_at_qid() is called for queue_id = 8, and
> > > >    dev->_rx[8].umem = 0.
> > > > 4. xdp_reg_umem_at_qid() is called again to set umem2 for queue_id = 8:
> > > >    dev->_rx[8].umem = umem2.
> > > > 5. num_tx_queues is changed back to 10.
> > > >
> > > > Now dev->_rx[8].umem != dev->_tx[8].umem. Is this possible, and is it
> > > > a problem?
> > >
> > > Plus IIRC the check of qid vs real_num_[rt]x_queues in xsk_bind() is
> > > not under rtnl_lock so it doesn't count for much.  Why not do all the
> > > checks against num_[rt]x_queues here, instead of real_..?
> >
> > You are correct, using two separate rtnl_lock regions is broken. Will
> > spin a v2 tomorrow when I am back in the office.
> >
> > Thanks Jakub for catching this. I really appreciate you reviewing my
> > code.
>
> Sorry, I forgot to answer your question about why real_num_ instead of
> num_. If we used num_ instead, we would get a behavior where we can
> open a socket on a queue id, let us say 10, that will not be active if
> real_num_ is below 10. So no traffic will flow on this socket until we
> issue an ethtool command to state that we have 10 or more queues
> configured. While this behavior is sane (and consistent), I believe it
> would lead to a more complicated driver implementation, break the
> current uapi (since we would return an error if you try to bind to a
> queue id that is not active at the moment), and I like my suggestion
> below better :-).
>
> What I would like to suggest is that the current model at bind() is
> kept, that is, it will fail if you try to bind to a non-active queue
> id. But we add some code to ethtool_set_channels() in net/core/ethtool.c
> that checks whether you have an active AF_XDP socket on a queue id and
> are trying to make that queue id inactive; if so, we return an error
> and disallow the change. This would IMHO be like not allowing a loaded
> module to be unloaded while someone is using it, or not allowing a file
> system to be unmounted while files are in use. What do you think?
> If you think this is the right way to go, I can implement this in a
> follow-up patch, or in this patch set if that makes more sense; let me
> know. This suggestion would actually make it simpler to implement
> zero-copy support in the driver. Some of the complications we are
> having today are due to the fact that ethtool can come in from the
> side and change things for us while an AF_XDP socket is active on a
> queue id. And implementing zero-copy support in the driver needs to
> get simpler IMO.
>
> Please let me know what you think.
I think I'd like that:

https://patchwork.ozlabs.org/project/netdev/list/?series=57843&state=*

in particular:

https://patchwork.ozlabs.org/patch/949891/ ;)

If we fix the resize, you could indeed stick to real_*, but to be
honest, I'm not entirely clear on what the advantage would be?  The
arrays are sized to num..queues; do you envision a use case where
having xdp_get_umem_from_qid() return NULL when asked for
qid > real_.. would help?  Is there one for xdp_reg_umem_at_qid()?
I have no strong preference here, just wondering.