On Wed, Oct 22, 2025 at 1:43 PM Michael S. Tsirkin <[email protected]> wrote:
>
> On Wed, Oct 22, 2025 at 12:50:53PM +0200, Eugenio Perez Martin wrote:
> > Let me switch to MQ as I think it illustrates the point better.
> >
> > IIUC the workflow:
> > a) virtio-net sends MQ_VQ_PAIRS_SET 2 to the device
> > b) VDUSE CVQ sends ok to the virtio-net driver
> > c) VDUSE CVQ sends the command to the VDUSE device
> > d) Now the virtio-net driver sends MQ_VQ_PAIRS_SET 1
> > e) VDUSE CVQ sends ok to the virtio-net driver
> >
> > The device didn't process the MQ_VQ_PAIRS_SET 1 command at this point,
> > so it potentially uses the second rx queue. But, by the standard:
> >
> > The device MUST NOT queue packets on receive queues greater than
> > virtqueue_pairs once it has placed the VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET
> > command in a used buffer.
> >
> > So the driver does not expect rx buffers on that queue at all. From
> > the driver's POV, the device is invalid, and it could mark it as
> > broken.
>
> ok interesting. Note that if userspace processes vqs it should process
> cvq too. I don't know what to do in this case yet, I'm going on
> vacation, let me ponder this a bit.
>
Sure.

>
> > And, what's worse, how to handle it if the device now replies with
> > VIRTIO_NET_ERR to the VDUSE CVQ?
>
> this part does not bother me much. break it, probably.
>

To "successfully break it" we should implement NEED_RESET, or would it
work to just stop forwarding messages?

> > > > If we wait for the device to reply, we're in the
> > > > same situation regarding the RTNL.
> > > >
> > > > Now we receive a new state (A, B, E). We haven't sent the (A, B, D),
> > > > so it is good to just replace the (A, B, D) with that, and send it
> > > > when (A, B, C) is completed with either success or failure.
> > > >
> > > > 2) VQ_PAIRS_SET
> > > >
> > > > The driver starts with 1 vq pair. Now the driver sets 3 vq pairs, and
> > > > the VDUSE CVQ forwards the command. The driver still thinks that it is
> > > > using 1 vq pair. I can store that the driver request was 3, and it is
> > > > still in-flight. Now the timeout occurs, so the VDUSE device returns
> > > > fail to the driver, and the driver frees the vq regions etc. After
> > > > that, the device now replies OK. The memory that was sent as the new
> > > > vqs avail ring and descriptor ring now contains garbage, and it could
> > > > happen that the device starts overwriting unrelated memory.
> > > >
> > > > Not even VQ_RESET protects against it as there is still a window
> > > > between the CMD set and the VQ reset.
> > >
> > > Timeouts should be up to userspace. If userspace times out
> > > and then gets confused, kernel is not to blame.
> > >
> > >
> >
> > I meant the virtio-net driver will be confused.
>
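
To make the "replace the unsent state" idea from the quoted (A, B, C) /
(A, B, D) / (A, B, E) example concrete, here is a minimal sketch in plain
userspace C. All names (cvq_coalescer, cvq_submit, cvq_complete) are
hypothetical and are not VDUSE or virtio symbols, and each state is reduced
to a single int. It only illustrates the coalescing: at most one state is
in flight, and a newer request overwrites any state not yet forwarded.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of VDUSE: tracks one in-flight state
 * and at most one pending (not yet forwarded) state. */
struct cvq_coalescer {
	bool in_flight;      /* a state was forwarded and not completed yet */
	bool has_pending;    /* a newer state is waiting to be forwarded */
	int in_flight_state; /* e.g. 'C' for (A, B, C) */
	int pending_state;   /* e.g. 'E' for (A, B, E) */
};

/* The driver requests a new state.
 * Returns true if the caller must forward it to the device now. */
static bool cvq_submit(struct cvq_coalescer *c, int state)
{
	if (!c->in_flight) {
		c->in_flight = true;
		c->in_flight_state = state;
		return true;
	}

	/* Something is in flight: overwrite whatever was pending, since
	 * the older pending state was never sent to the device. */
	c->pending_state = state;
	c->has_pending = true;
	return false;
}

/* The device completed the in-flight state, with success or failure.
 * Returns true and fills *next if a pending state must be forwarded. */
static bool cvq_complete(struct cvq_coalescer *c, int *next)
{
	if (!c->has_pending) {
		c->in_flight = false;
		return false;
	}

	c->in_flight_state = c->pending_state;
	c->has_pending = false;
	*next = c->in_flight_state;
	return true;
}
```

With this, submitting C, then D, then E forwards only C immediately; when C
completes, E is forwarded and D is never sent. The sketch does not address
what to report to the driver for D, which is the open question above.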

