On Mon, Nov 30, 2020 at 11:36 AM Jason Wang <jasow...@redhat.com> wrote:
>
> On 2020/11/27 1:52 PM, Yongji Xie wrote:
> > On Fri, Nov 27, 2020 at 11:53 AM Jason Wang <jasow...@redhat.com> wrote:
> >
> >     On 2020/11/12 2:39 PM, Parav Pandit wrote:
> > > This patchset covers user requirements for managing existing vdpa
> > > devices, using a tool and its internal design notes for kernel drivers.
> > >
> > > Background and user requirements:
> > > ----------------------------------
> > > (1) Currently VDPA device is created by driver when driver is loaded.
> > > However, user should have a choice when to create or not create a vdpa
> > > device for the underlying parent device.
> > >
> > > For example, mlx5 PCI VF and subfunction device supports multiple
> > > classes of device such as netdev, vdpa, rdma. However, it is not
> > > required to always create a vdpa device for such device.
> > >
> > > (2) In another use case, a device may support creating one or multiple
> > > vdpa devices of same or different class such as net and block.
> > > Creating vdpa devices at driver load time further limits this use case.
> > >
> > > (3) A user should be able to monitor and query vdpa queue level or
> > > device level statistics for a given vdpa device.
> > >
> > > (4) A user should be able to query what class of vdpa devices are
> > > supported by its parent device.
> > >
> > > (5) A user should be able to view supported features and negotiated
> > > features of the vdpa device.
> > >
> > > (6) A user should be able to create a vdpa device in vendor agnostic
> > > manner using single tool.
> > >
> > > Hence, it is required to have a tool through which user can create one
> > > or more vdpa devices from a parent device which addresses above user
> > > requirements.
> > >
> > > Example devices:
> > > ----------------
> > > +-----------+ +-----------+ +---------+ +--------+ +-----------+
> > > |vdpa dev 0 | |vdpa dev 1 | |rdma dev | |netdev  | |vdpa dev 3 |
> > > |type=net   | |type=block | |mlx5_0   | |ens3f0  | |type=net   |
> > > +----+------+ +-----+-----+ +----+----+ +-----+--+ +----+------+
> > >      |              |            |            |         |
> > >      |              |            |            |         |
> > > +----+-----+        |       +----+----+       |    +----+----+
> > > |  mlx5    +--------+       |mlx5     +-------+    |mlx5     |
> > > |pci vf 2  |                |pci vf 4 |            |pci sf 8 |
> > > |03:00:2   |                |03:00.4  |            |mlx5_sf.8|
> > > +----+-----+                +----+----+            +----+----+
> > >      |                           |                      |
> > >      |                      +----+-----+                |
> > >      +----------------------+mlx5      +----------------+
> > >                             |pci pf 0  |
> > >                             |03:00.0   |
> > >                             +----------+
> > >
> > > vdpa tool:
> > > ----------
> > > vdpa tool is a tool to create, delete vdpa devices from a parent
> > > device. It is a tool that enables user to query statistics, features
> > > and may be more attributes in future.
> > >
> > > vdpa tool command draft:
> > > ------------------------
> > > (a) List parent devices which supports creating vdpa devices.
> > > It also shows which class types supported by this parent device.
> > > In below command example two parent devices support vdpa device
> > > creation.
> > > First is PCI VF whose bdf is 03.00:2.
> > > Second is PCI VF whose name is 03:00.4.
> > > Third is PCI SF whose name is mlx5_core.sf.8
> > >
> > > $ vdpa parentdev list
> > > vdpasim
> > >   supported_classes
> > >     net
> > > pci/0000:03.00:3
> > >   supported_classes
> > >     net block
> > > pci/0000:03.00:4
> > >   supported_classes
> > >     net block
> > > auxiliary/mlx5_core.sf.8
> > >   supported_classes
> > >     net
> > >
> > > (b) Now add a vdpa device of networking class and show the device.
> > > $ vdpa dev add parentdev pci/0000:03.00:2 type net name foo0
> > > $ vdpa dev show foo0
> > > foo0: parentdev pci/0000:03.00:2 type network parentdev vdpasim vendor_id 0 max_vqs 2 max_vq_size 256
> > >
> > > (c) Show features of a vdpa device
> > > $ vdpa dev features show foo0
> > > supported
> > >   iommu platform
> > >   version 1
> > >
> > > (d) Dump vdpa device statistics
> > > $ vdpa dev stats show foo0
> > > kickdoorbells 10
> > > wqes 100
> > >
> > > (e) Now delete a vdpa device previously created.
> > > $ vdpa dev del foo0
> > >
> > > vdpa tool support in this patchset:
> > > -----------------------------------
> > > vdpa tool is created to create, delete and query vdpa devices.
> > > examples:
> > > Show vdpa parent device that supports creating, deleting vdpa devices.
> > >
> > > $ vdpa parentdev show
> > > vdpasim:
> > >   supported_classes
> > >     net
> > >
> > > $ vdpa parentdev show -jp
> > > {
> > >     "show": {
> > >         "vdpasim": {
> > >             "supported_classes": {
> > >                 "net"
> > >             }
> > >         }
> > >     }
> > >
> > > Create a vdpa device of type networking named as "foo2" from the
> > > parent device vdpasim:
> > >
> > > $ vdpa dev add parentdev vdpasim type net name foo2
> > >
> > > Show the newly created vdpa device by its name:
> > > $ vdpa dev show foo2
> > > foo2: type network parentdev vdpasim vendor_id 0 max_vqs 2 max_vq_size 256
> > >
> > > $ vdpa dev show foo2 -jp
> > > {
> > >     "dev": {
> > >         "foo2": {
> > >             "type": "network",
> > >             "parentdev": "vdpasim",
> > >             "vendor_id": 0,
> > >             "max_vqs": 2,
> > >             "max_vq_size": 256
> > >         }
> > >     }
> > > }
> > >
> > > Delete the vdpa device after its use:
> > > $ vdpa dev del foo2
> > >
> > > vdpa tool support by kernel:
> > > ----------------------------
> > > vdpa tool user interface will be supported by existing vdpa kernel
> > > framework, i.e. drivers/vdpa/vdpa.c. It services user commands through
> > > a netlink interface.
> > >
> > > Each parent device registers supported callback operations with vdpa
> > > subsystem through which vdpa device(s) can be managed.
> > >
> > > FAQs:
> > > -----
> > > 1. Where does userspace vdpa tool reside which users can use?
> > > Ans: vdpa tool can possibly reside in iproute2 [1] as it enables user
> > > to create vdpa net devices.
> > >
> > > 2. Why not create and delete vdpa device using sysfs/configfs?
> > > Ans:
> > > (a) A device creation may involve passing one or more attributes.
> > > Passing multiple attributes and returning error code and more verbose
> > > information for invalid attributes cannot be handled by sysfs/configfs.
> > >
> > > (b) netlink framework is rich and enables user space and kernel driver
> > > to provide nested attributes.
> > >
> > > (c) Exposing device specific file under sysfs without net namespace
> > > awareness exposes details to multiple containers. Instead, exposing
> > > attributes via a netlink socket secures the communication channel with
> > > kernel.
> > >
> > > (d) netlink socket interface enables running syscaller kernel tests.
> > >
> > > 3. Why not use ioctl() interface?
> > > Ans: ioctl() interface replicates the necessary plumbing which already
> > > exists through netlink socket.
> > >
> > > 4. What happens when one or more user created vdpa devices exist for a
> > > parent PCI VF or SF and such parent device is removed?
> > > Ans: All user created vdpa devices that belong to a parent are removed.
> > >
> > > [1] git://git.kernel.org/pub/scm/network/iproute2/iproute2-next.git
> > >
> > > Next steps:
> > > -----------
> > > (a) Post this patchset and iproute2/vdpa inclusion, remaining two
> > > drivers will be converted to support vdpa tool instead of creating
> > > unmanaged default device on driver load.
> > > (b) More net specific parameters such as mac, mtu will be added.
> > > (c) Features bits get and set interface will be added.
> >
> >     Adding Yong Ji for sharing some thoughts from the view of userspace
> >     vDPA device.
> >
> > Thanks for adding me, Jason!
> >
> > Now I'm working on a v2 patchset for VDUSE (vDPA Device in Userspace)
> > [1]. This tool is very useful for the vduse device. So I'm considering
> > integrating this into my v2 patchset. But there is one problem:
> >
> > In this tool, vdpa device config action and enable action are combined
> > into one netlink msg: VDPA_CMD_DEV_NEW. But in the vduse case, it needs
> > to be split because a chardev should be created and opened by a
> > userspace process before we enable the vdpa device (call
> > vdpa_register_device()).
> >
> > So I'd like to know whether it's possible (or whether there are plans)
> > to add two new netlink msgs, something like VDPA_CMD_DEV_ENABLE and
> > VDPA_CMD_DEV_DISABLE, to make the config path more flexible.
> >
>
> Actually, we've discussed such an intermediate step in some early
> discussion. It looks to me like VDUSE could be one of the users of this.
>
> Or I wonder whether we can switch to using an anonymous inode (fd) for
> VDUSE and then fetch it via a VDUSE_GET_DEVICE_FD ioctl?
>
Yes, we can. Actually, the current implementation in VDUSE works like this.
But it seems this is still an intermediate step: the fd should be bound to
a name or something else, which needs to be configured beforehand.

Thanks,
Yongji
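
For illustration only, the split between "config" and "enable" discussed
in this thread can be modelled as a small state machine. This is a toy
Python sketch, not kernel or iproute2 code. VdpaMgmtModel, DevState and all
method names are hypothetical; dev_enable/dev_disable only stand in for the
proposed VDPA_CMD_DEV_ENABLE / VDPA_CMD_DEV_DISABLE messages, which do not
exist in the posted patchset, and backend_open stands in for the userspace
process opening the chardev (or fd).

```python
from enum import Enum, auto


class DevState(Enum):
    CONFIGURED = auto()  # created and configured, not yet on the vdpa bus
    ENABLED = auto()     # registered; stands in for vdpa_register_device()


class VdpaMgmtModel:
    """Toy model of the two-step flow; every name here is hypothetical."""

    def __init__(self):
        self.devs = {}              # device name -> DevState
        self.backend_ready = set()  # names whose userspace backend is open

    def dev_new(self, name):
        # Config-only step (assumed variant of VDPA_CMD_DEV_NEW).
        if name in self.devs:
            raise ValueError(f"{name} already exists")
        self.devs[name] = DevState.CONFIGURED

    def backend_open(self, name):
        # The userspace process opens the chardev/fd bound to this name.
        if name not in self.devs:
            raise KeyError(name)
        self.backend_ready.add(name)

    def dev_enable(self, name):
        # Assumed VDPA_CMD_DEV_ENABLE: only valid once the backend is open.
        if name not in self.devs:
            raise KeyError(name)
        if name not in self.backend_ready:
            raise RuntimeError(f"{name}: backend not opened yet")
        self.devs[name] = DevState.ENABLED

    def dev_disable(self, name):
        # Assumed VDPA_CMD_DEV_DISABLE: back to configured-but-unregistered.
        if self.devs.get(name) is not DevState.ENABLED:
            raise RuntimeError(f"{name}: not enabled")
        self.devs[name] = DevState.CONFIGURED
```

With this model, calling dev_enable("foo2") right after dev_new("foo2")
fails, and succeeds only after backend_open("foo2"), which is the ordering
constraint the vduse case needs.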