Hi,

VHOST_USER_PROTOCOL_F_STATUS is enabled by default in DPDK:

lib/vhost/vhost_user.h

    #define VHOST_USER_PROTOCOL_FEATURES ((1ULL << VHOST_USER_PROTOCOL_F_MQ) | \
                     (1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD) | \
                     (1ULL << VHOST_USER_PROTOCOL_F_RARP) | \
                     (1ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK) | \
                     (1ULL << VHOST_USER_PROTOCOL_F_NET_MTU) | \
                     (1ULL << VHOST_USER_PROTOCOL_F_SLAVE_REQ) | \
                     (1ULL << VHOST_USER_PROTOCOL_F_CRYPTO_SESSION) | \
                     (1ULL << VHOST_USER_PROTOCOL_F_SLAVE_SEND_FD) | \
                     (1ULL << VHOST_USER_PROTOCOL_F_HOST_NOTIFIER) | \
                     (1ULL << VHOST_USER_PROTOCOL_F_PAGEFAULT) | \
                     (1ULL << VHOST_USER_PROTOCOL_F_STATUS))

Removing VHOST_USER_PROTOCOL_F_STATUS from this mask disables the
VHOST_USER_SET_STATUS and VHOST_USER_GET_STATUS messages, which should
work around this issue.

Thanks,
Yajun

-----Original Message-----
From: Laurent Vivier <[email protected]>
Sent: Wednesday, January 11, 2023 5:50 PM
To: Maxime Coquelin <[email protected]>
Cc: [email protected]; Peter Maydell <[email protected]>; Yajun Wu <[email protected]>; Parav Pandit <[email protected]>; Michael S. Tsirkin <[email protected]>
Subject: Re: [PULL v4 76/83] vhost-user: Support vhost_dev_start

On 1/9/23 11:55, Michael S. Tsirkin wrote:
> On Fri, Jan 06, 2023 at 03:21:43PM +0100, Laurent Vivier wrote:
>> Hi,
>>
>> it seems this patch breaks vhost-user with DPDK.
>>
>> See
>> https://bugzilla.redhat.com/show_bug.cgi?id=2155173
>>
>> it seems QEMU doesn't receive the expected commands sequence:
>>
>> Received unexpected msg type. Expected 22 received 40
>> Fail to update device iotlb
>> Received unexpected msg type. Expected 40 received 22
>> Received unexpected msg type. Expected 22 received 11
>> Fail to update device iotlb
>> Received unexpected msg type. Expected 11 received 22
>> vhost VQ 1 ring restore failed: -71: Protocol error (71)
>> Received unexpected msg type. Expected 22 received 11
>> Fail to update device iotlb
>> Received unexpected msg type. Expected 11 received 22
>> vhost VQ 0 ring restore failed: -71: Protocol error (71)
>> unable to start vhost net: 71: falling back on userspace virtio
>>
>> It receives VHOST_USER_GET_STATUS (40) when it expects
>> VHOST_USER_IOTLB_MSG (22) and VHOST_USER_IOTLB_MSG when it expects
>> VHOST_USER_GET_STATUS,
>> and VHOST_USER_GET_VRING_BASE (11) when it expects VHOST_USER_GET_STATUS,
>> and so on.
>>
>> Any idea?
>>
>> Thanks,
>> Laurent
>
>
> So I am guessing it's coming from:
>
>     if (msg.hdr.request != request) {
>         error_report("Received unexpected msg type. Expected %d received %d",
>                      request, msg.hdr.request);
>         return -EPROTO;
>     }
>
> in process_message_reply and/or in vhost_user_get_u64.
>
>
>> On 11/7/22 23:53, Michael S. Tsirkin wrote:
>>> From: Yajun Wu <[email protected]>
>>>
>>> The motivation of adding vhost-user vhost_dev_start support is to
>>> improve backend configuration speed and reduce live migration VM
>>> downtime.
>>>
>>> Today VQ configuration is issued one by one. For virtio net with
>>> multi-queue support, backend needs to update RSS (Receive side
>>> scaling) on every rx queue enable. Updating RSS is time-consuming
>>> (typical time like 7ms).
>>>
>>> Implement already defined vhost status and message in the vhost
>>> specification [1].
>>> (a) VHOST_USER_PROTOCOL_F_STATUS
>>> (b) VHOST_USER_SET_STATUS
>>> (c) VHOST_USER_GET_STATUS
>>>
>>> Send message VHOST_USER_SET_STATUS with VIRTIO_CONFIG_S_DRIVER_OK
>>> for device start and reset(0) for device stop.
>>>
>>> On reception of the DRIVER_OK message, backend can apply the needed
>>> setting only once (instead of incremental) and also utilize
>>> parallelism on enabling queues.
>>>
>>> This improves QEMU's live migration downtime with vhost user backend
>>> implementation by great margin, specially for the large number of
>>> VQs of 64 from 800 msec to 250 msec.
>>>
>>> [1]
>>> https://qemu-project.gitlab.io/qemu/interop/vhost-user.html
>>>
>>> Signed-off-by: Yajun Wu <[email protected]>
>>> Acked-by: Parav Pandit <[email protected]>
>>> Message-Id: <[email protected]>
>>> Reviewed-by: Michael S. Tsirkin <[email protected]>
>>> Signed-off-by: Michael S. Tsirkin <[email protected]>
>
> Probably easiest to debug from dpdk side.
> Does the problem go away if you disable the feature
> VHOST_USER_PROTOCOL_F_STATUS in dpdk?

Maxime could you help to debug this?

Thanks,
Laurent
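
For illustration only, here is a minimal standalone sketch of the two
mechanisms discussed in this thread: clearing the STATUS bit from an
advertised protocol feature mask (the workaround suggested at the top),
and the reply-type check that produces the "Received unexpected msg type"
errors. It is not DPDK or QEMU code. The helper names (feature_mask,
without_status, status_negotiated, check_reply_type) are hypothetical.
The request numbers 11, 22 and 40 come from the log above; the bit
positions 0 and 16 are my reading of the vhost-user specification and
should be checked against it.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Feature bit positions: assumed from the vhost-user specification. */
#define VHOST_USER_PROTOCOL_F_MQ      0
#define VHOST_USER_PROTOCOL_F_STATUS  16

/* Request numbers as they appear in the error log in this thread. */
#define VHOST_USER_GET_VRING_BASE  11
#define VHOST_USER_IOTLB_MSG       22
#define VHOST_USER_GET_STATUS      40

/* Hypothetical helper: turn a feature bit number into a 64-bit mask. */
static inline uint64_t feature_mask(unsigned int bit)
{
    assert(bit < 64);
    return 1ULL << bit;
}

/* Hypothetical helper: the workaround, i.e. stop advertising F_STATUS. */
static inline uint64_t without_status(uint64_t features)
{
    return features & ~feature_mask(VHOST_USER_PROTOCOL_F_STATUS);
}

/*
 * Hypothetical helper: a frontend would only send SET_STATUS/GET_STATUS
 * when this returns non-zero for the negotiated feature set.
 */
static inline int status_negotiated(uint64_t features)
{
    return (features & feature_mask(VHOST_USER_PROTOCOL_F_STATUS)) != 0;
}

/*
 * Hypothetical helper mirroring the quoted check: a reply whose request
 * type differs from the one that was sent is treated as a protocol error.
 */
static inline int check_reply_type(int expected, int received)
{
    return received == expected ? 0 : -EPROTO;
}
```

With the STATUS bit cleared, status_negotiated() returns 0, so a frontend
following this logic would never send GET_STATUS and the 40-vs-22 mismatch
from the log could not occur. This only hides the ordering problem; it does
not explain why the replies are interleaved.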
