On Wed, 9 May 2018 17:22:50 -0700, Michael Chan wrote:
> On Wed, May 9, 2018 at 4:15 PM, Jakub Kicinski wrote:
> > On Wed, 9 May 2018 07:21:41 -0400, Michael Chan wrote:
> >> VF Queue resources are always limited and there is currently no
> >> infrastructure to allow the admin. on the host to add or reduce queue
> >> resources for any particular VF.  With the ever increasing number of
> >> VFs being supported, it is desirable to allow the admin. to configure
> >> queue resources differently for the VFs.  Some VFs may require more
> >> or fewer queues due to different bandwidth requirements or different
> >> numbers of vCPUs in the VM.  This patch adds the infrastructure to do
> >> that by adding an IFLA_VF_QUEUES netlink attribute and a new
> >> .ndo_set_vf_queues() to the net_device_ops.
> >>
> >> Four parameters are exposed for each VF:
> >>
> >> o min_tx_queues - Guaranteed or current tx queues assigned to the VF.
> >
> > This muxing of semantics may be a little awkward and unnecessary;
> > would it make sense for struct ifla_vf_info to have separate fields
> > for the current number of queues and the admin-set guaranteed minimum?
>
> The loose semantics is mainly to allow some flexibility in
> implementation.  Sure, we can tighten the definitions or add
> additional fields.

I would appreciate that, if others don't disagree.  I personally don't
see the need for flexibility (AKA per-vendor behaviour) here; quite the
opposite, min/max/current number of queues seems quite self-explanatory.
Or at least don't allow min to mean current?  Otherwise the API gets a
bit asymmetrical :(
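Something along these lines is what I have in mind -- the struct and
field names below are only a sketch to illustrate the split, not the
actual layout from your patch or from the existing struct ifla_vf_info:

#include <linux/types.h>

/* Illustrative only.  The point is that the admin-set guarantee and the
 * queue count the VF is currently running with are reported separately,
 * instead of min_*_queues standing in for both.
 */
struct demo_vf_queues_info {
	__u32 vf;
	__u32 cur_tx_queues;	/* what the VF driver uses right now */
	__u32 cur_rx_queues;
	__u32 min_tx_queues;	/* admin-set guarantee, 0 = none */
	__u32 min_rx_queues;
	__u32 max_tx_queues;	/* best-effort ceiling, not reserved */
	__u32 max_rx_queues;
};

That way user space can always tell whether a value is a guarantee or a
live reading, and min never has to double as current.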
> > Is there a real world use case for the min value or are you trying to
> > make the API feature complete?
>
> In this proposal, these parameters are mainly viewed as the bounds for
> the queues that each VF can potentially allocate.  The actual number
> of queues chosen by the VF driver or modified by the VF user can be
> any number within the bounds.

Perhaps you have misspoken here - these are not allowed bounds, right?
min is a guarantee that the queues will be available, not a requirement.
Similar to bandwidth allocation.  IOW if the bounds are set to [4, 16]
the VF may still choose to use 1 queue, even though that's not within
the bounds.

> We currently need to have min and max parameters to support the
> different modes we use to distribute the queue resources to the VFs.
> In one mode, for example, resources are statically divided and each VF
> has a small number of guaranteed queues (min = max).  In a different
> mode, we allow more flexible resource allocation, with each VF having a
> small number of guaranteed queues but a higher number of
> non-guaranteed queues (min < max).  Some VFs may be able to allocate
> many more queues than min when resources are still available, while
> others may only be able to allocate min queues when resources are used
> up.
>
> With min and max exposed, the PF user can properly tweak the resources
> for each VF as described above.

Right, I was just looking for a real world scenario where this
flexibility is going to be used - mainly because the switchdev model I
described below won't allow it.  I'm not sure users will leave a portion
of the queues to be allocated by chance.

> >> o max_tx_queues - Maximum but not necessarily guaranteed tx queues
> >>   available to the VF.
> >>
> >> o min_rx_queues - Guaranteed or current rx queues assigned to the VF.
> >>
> >> o max_rx_queues - Maximum but not necessarily guaranteed rx queues
> >>   available to the VF.
> >>
> >> The "ip link set" command will subsequently be patched to support the
> >> new operation to set the above parameters.
> >>
> >> After the admin. makes a change to the above parameters, the
> >> corresponding VF will have a new range of channels to set using
> >> ethtool -L.
> >>
> >> Signed-off-by: Michael Chan <michael.c...@broadcom.com>
> >
> > In switchdev mode we can use the number of queues on the representor
> > as a proxy for the max number of queues allowed for the ASIC port.
> > This works better when representors are muxed in the first place than
> > when they have actual queues backing them.  WDYT about such a scheme,
> > Or?  A very pleasant side-effect is that one can configure qdiscs and
> > get stats per HW queue.
>
> This is an interesting approach.  But it doesn't have the min and max
> for each VF, and also only works in switchdev mode.

It allows controlling all ports of the switch with the same, existing
and well-known API (incl. PFs).  But sadly I don't think we are at the
point where switchdev-mode solutions are considered an alternative, so
I'm only mentioning it to broaden the discussion :)  I'm not opposed to
your patches :)
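P.S.  To make the two allocation modes above a bit more concrete, this
is roughly the bookkeeping I'd expect on the PF side when the admin
sets [min, max] for a VF's tx queues.  Purely a sketch -- the names are
made up and nothing here is taken from your driver patches:

#include <linux/errno.h>
#include <linux/types.h>

#define DEMO_MAX_VFS	64

struct demo_vf_qcfg {
	u16 min_tx_queues;	/* guaranteed: reserved out of the HW pool */
	u16 max_tx_queues;	/* best-effort ceiling, not reserved */
};

struct demo_pf {
	u32 hw_tx_queue_pool;	/* total tx queues the ASIC can hand out */
	u32 reserved_tx;	/* sum of all VFs' guaranteed mins */
	struct demo_vf_qcfg vf[DEMO_MAX_VFS];
};

/* min == max gives the static split; min < max gives the flexible one,
 * where anything above min is handed out first come, first served.
 */
static int demo_set_vf_tx_queues(struct demo_pf *pf, int vf,
				 u16 min_q, u16 max_q)
{
	u32 others;

	if (vf < 0 || vf >= DEMO_MAX_VFS || min_q > max_q)
		return -EINVAL;

	/* Don't count this VF's old guarantee against the pool. */
	others = pf->reserved_tx - pf->vf[vf].min_tx_queues;
	if (others + min_q > pf->hw_tx_queue_pool)
		return -ENOSPC;	/* can't guarantee that many queues */

	pf->reserved_tx = others + min_q;
	pf->vf[vf].min_tx_queues = min_q;
	pf->vf[vf].max_tx_queues = max_q;
	return 0;
}

Only the sum of the mins has to fit in the pool; the maxes are just
caps, which is exactly why a VF may end up anywhere between min and max
depending on what its neighbours have grabbed.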