I'm not quite sure about the whole picture yet.

It seems that while QEMU is waiting for the reply to vhost_user_get_status,
DPDK sends out VHOST_USER_SLAVE_IOTLB_MSG, which triggers QEMU's
vhost_backend_update_device_iotlb.
QEMU then waits for the reply to VHOST_USER_IOTLB_MSG but gets the
VHOST_USER_GET_STATUS reply instead.

Backtrace at the first error message ("Received unexpected msg type.
Expected 22 received 40"):

#0  0x0000555555b72ed4 in process_message_reply (dev=0x5555584dd600, 
msg=0x7fffffffa330) at ../hw/virtio/vhost-user.c:445
#1  0x0000555555b77c26 in vhost_user_send_device_iotlb_msg (dev=0x5555584dd600, 
imsg=0x7fffffffa600) at ../hw/virtio/vhost-user.c:2341
#2  0x0000555555b7179e in vhost_backend_update_device_iotlb 
(dev=0x5555584dd600, iova=10442706944, uaddr=140736119902208, len=4096, 
perm=IOMMU_RW) at ../hw/virtio/vhost-backend.c:361
#3  0x0000555555b6e34c in vhost_device_iotlb_miss (dev=0x5555584dd600, 
iova=10442706944, write=1) at ../hw/virtio/vhost.c:1113
#4  0x0000555555b718d9 in vhost_backend_handle_iotlb_msg (dev=0x5555584dd600, 
imsg=0x7fffffffa7b0) at ../hw/virtio/vhost-backend.c:393
#5  0x0000555555b76144 in slave_read (ioc=0x555557a38680, condition=G_IO_IN, 
opaque=0x5555584dd600) at ../hw/virtio/vhost-user.c:1726
#6  0x0000555555c797a5 in qio_channel_fd_source_dispatch 
(source=0x555556a06fb0, callback=0x555555b75f86 <slave_read>, 
user_data=0x5555584dd600) at ../io/channel-watch.c:84
#7  0x00007ffff554895d in g_main_context_dispatch () at /lib64/libglib-2.0.so.0
#8  0x00007ffff5548d18 in g_main_context_iterate.isra () at 
/lib64/libglib-2.0.so.0
#9  0x00007ffff5549042 in g_main_loop_run () at /lib64/libglib-2.0.so.0
#10 0x0000555555b72de7 in vhost_user_read (dev=0x5555584dd600, 
msg=0x7fffffffac50) at ../hw/virtio/vhost-user.c:413
#11 0x0000555555b72e9b in process_message_reply (dev=0x5555584dd600, 
msg=0x7fffffffaf10) at ../hw/virtio/vhost-user.c:439
#12 0x0000555555b77c26 in vhost_user_send_device_iotlb_msg (dev=0x5555584dd600, 
imsg=0x7fffffffb1e0) at ../hw/virtio/vhost-user.c:2341
#13 0x0000555555b7179e in vhost_backend_update_device_iotlb 
(dev=0x5555584dd600, iova=10468392960, uaddr=140736145588224, len=4096, 
perm=IOMMU_RW) at ../hw/virtio/vhost-backend.c:361
#14 0x0000555555b6e34c in vhost_device_iotlb_miss (dev=0x5555584dd600, 
iova=10468392960, write=1) at ../hw/virtio/vhost.c:1113
#15 0x0000555555b718d9 in vhost_backend_handle_iotlb_msg (dev=0x5555584dd600, 
imsg=0x7fffffffb390) at ../hw/virtio/vhost-backend.c:393
#16 0x0000555555b76144 in slave_read (ioc=0x555557a38680, condition=G_IO_IN, 
opaque=0x5555584dd600) at ../hw/virtio/vhost-user.c:1726
#17 0x0000555555c797a5 in qio_channel_fd_source_dispatch 
(source=0x555556c70250, callback=0x555555b75f86 <slave_read>, 
user_data=0x5555584dd600) at ../io/channel-watch.c:84
#18 0x00007ffff554895d in g_main_context_dispatch () at /lib64/libglib-2.0.so.0
#19 0x00007ffff5548d18 in g_main_context_iterate.isra () at 
/lib64/libglib-2.0.so.0
#20 0x00007ffff5549042 in g_main_loop_run () at /lib64/libglib-2.0.so.0
#21 0x0000555555b72de7 in vhost_user_read (dev=0x5555584dd600, 
msg=0x7fffffffb830) at ../hw/virtio/vhost-user.c:413
#22 0x0000555555b72e9b in process_message_reply (dev=0x5555584dd600, 
msg=0x7fffffffbaf0) at ../hw/virtio/vhost-user.c:439
#23 0x0000555555b77c26 in vhost_user_send_device_iotlb_msg (dev=0x5555584dd600, 
imsg=0x7fffffffbdc0) at ../hw/virtio/vhost-user.c:2341
#24 0x0000555555b7179e in vhost_backend_update_device_iotlb 
(dev=0x5555584dd600, iova=10442702848, uaddr=140736119898112, len=4096, 
perm=IOMMU_RW) at ../hw/virtio/vhost-backend.c:361
#25 0x0000555555b6e34c in vhost_device_iotlb_miss (dev=0x5555584dd600, 
iova=10442702848, write=1) at ../hw/virtio/vhost.c:1113
#26 0x0000555555b718d9 in vhost_backend_handle_iotlb_msg (dev=0x5555584dd600, 
imsg=0x7fffffffbf70) at ../hw/virtio/vhost-backend.c:393
#27 0x0000555555b76144 in slave_read (ioc=0x555557a38680, condition=G_IO_IN, 
opaque=0x5555584dd600) at ../hw/virtio/vhost-user.c:1726
#28 0x0000555555c797a5 in qio_channel_fd_source_dispatch 
(source=0x555556f1a530, callback=0x555555b75f86 <slave_read>, 
user_data=0x5555584dd600) at ../io/channel-watch.c:84
#29 0x00007ffff554895d in g_main_context_dispatch () at /lib64/libglib-2.0.so.0
#30 0x00007ffff5548d18 in g_main_context_iterate.isra () at 
/lib64/libglib-2.0.so.0
#31 0x00007ffff5549042 in g_main_loop_run () at /lib64/libglib-2.0.so.0
#32 0x0000555555b72de7 in vhost_user_read (dev=0x5555584dd600, 
msg=0x7fffffffc420) at ../hw/virtio/vhost-user.c:413
#33 0x0000555555b754b1 in vhost_user_get_u64 (dev=0x5555584dd600, request=40, 
u64=0x7fffffffc6e0) at ../hw/virtio/vhost-user.c:1349
#34 0x0000555555b758ff in vhost_user_get_status (dev=0x5555584dd600, 
status=0x7fffffffc713 "W\020") at ../hw/virtio/vhost-user.c:1474
#35 0x0000555555b75967 in vhost_user_add_status (dev=0x5555584dd600, status=7 
'\a') at ../hw/virtio/vhost-user.c:1488
#36 0x0000555555b78bf6 in vhost_user_dev_start (dev=0x5555584dd600, 
started=true) at ../hw/virtio/vhost-user.c:2758
#37 0x0000555555b709ad in vhost_dev_start (hdev=0x5555584dd600, 
vdev=0x555557b965d0, vrings=false) at ../hw/virtio/vhost.c:1988
#38 0x000055555584291c in vhost_net_start_one (net=0x5555584dd600, 
dev=0x555557b965d0) at ../hw/net/vhost_net.c:271
#39 0x0000555555842f1e in vhost_net_start (dev=0x555557b965d0, 
ncs=0x555557bc09e0, data_queue_pairs=1, cvq=0) at ../hw/net/vhost_net.c:412
#40 0x0000555555b1bf61 in virtio_net_vhost_status (n=0x555557b965d0, status=15 
'\017') at ../hw/net/virtio-net.c:311
#41 0x0000555555b1c20c in virtio_net_set_status (vdev=0x555557b965d0, status=15 
'\017') at ../hw/net/virtio-net.c:392
#42 0x0000555555b1ed04 in virtio_net_handle_mq (n=0x555557b965d0, cmd=0 '\000', 
iov=0x555556c7ef50, iov_cnt=1) at ../hw/net/virtio-net.c:1497
#43 0x0000555555b1eef0 in virtio_net_handle_ctrl_iov (vdev=0x555557b965d0, 
in_sg=0x555556a09880, in_num=1, out_sg=0x555556a09890, out_num=1) at 
../hw/net/virtio-net.c:1534
#44 0x0000555555b1efe9 in virtio_net_handle_ctrl (vdev=0x555557b965d0, 
vq=0x7fffc04ac140) at ../hw/net/virtio-net.c:1557
#45 0x0000555555b63776 in virtio_queue_notify_vq (vq=0x7fffc04ac140) at 
../hw/virtio/virtio.c:2249
#46 0x0000555555b669dc in virtio_queue_host_notifier_read (n=0x7fffc04ac1b4) at 
../hw/virtio/virtio.c:3529
#47 0x0000555555e3f458 in aio_dispatch_handler (ctx=0x555556a016c0, 
node=0x7ffd8800e430) at ../util/aio-posix.c:369
#48 0x0000555555e3f613 in aio_dispatch_handlers (ctx=0x555556a016c0) at 
../util/aio-posix.c:412
#49 0x0000555555e3f669 in aio_dispatch (ctx=0x555556a016c0) at 
../util/aio-posix.c:422
#50 0x0000555555e585de in aio_ctx_dispatch (source=0x555556a016c0, 
callback=0x0, user_data=0x0) at ../util/async.c:321
#51 0x00007ffff554895d in g_main_context_dispatch () at /lib64/libglib-2.0.so.0
#52 0x0000555555e5abea in glib_pollfds_poll () at ../util/main-loop.c:295
#53 0x0000555555e5ac64 in os_host_main_loop_wait (timeout=0) at 
../util/main-loop.c:318
#54 0x0000555555e5ad69 in main_loop_wait (nonblocking=0) at 
../util/main-loop.c:604
#55 0x00005555559693de in qemu_main_loop () at ../softmmu/runstate.c:731
#56 0x00005555556e7c06 in qemu_default_main () at ../softmmu/main.c:37
#57 0x00005555556e7c3c in main (argc=71, argv=0x7fffffffcda8) at 
../softmmu/main.c:48



-----Original Message-----
From: Maxime Coquelin <[email protected]> 
Sent: Thursday, January 12, 2023 5:26 PM
To: Laurent Vivier <[email protected]>
Cc: [email protected]; Peter Maydell <[email protected]>; Yajun Wu 
<[email protected]>; Parav Pandit <[email protected]>; Michael S. Tsirkin 
<[email protected]>
Subject: Re: [PULL v4 76/83] vhost-user: Support vhost_dev_start


Hi Laurent,

On 1/11/23 10:50, Laurent Vivier wrote:
> On 1/9/23 11:55, Michael S. Tsirkin wrote:
>> On Fri, Jan 06, 2023 at 03:21:43PM +0100, Laurent Vivier wrote:
>>> Hi,
>>>
>>> it seems this patch breaks vhost-user with DPDK.
>>>
>>> See https://bugzilla.redhat.com/show_bug.cgi?id=2155173
>>>
>>> it seems QEMU doesn't receive the expected command sequence:
>>>
>>> Received unexpected msg type. Expected 22 received 40
>>> Fail to update device iotlb
>>> Received unexpected msg type. Expected 40 received 22
>>> Received unexpected msg type. Expected 22 received 11
>>> Fail to update device iotlb
>>> Received unexpected msg type. Expected 11 received 22
>>> vhost VQ 1 ring restore failed: -71: Protocol error (71)
>>> Received unexpected msg type. Expected 22 received 11
>>> Fail to update device iotlb
>>> Received unexpected msg type. Expected 11 received 22
>>> vhost VQ 0 ring restore failed: -71: Protocol error (71)
>>> unable to start vhost net: 71: falling back on userspace virtio
>>>
>>> It receives VHOST_USER_GET_STATUS (40) when it expects
>>> VHOST_USER_IOTLB_MSG (22), VHOST_USER_IOTLB_MSG when it expects
>>> VHOST_USER_GET_STATUS, VHOST_USER_GET_VRING_BASE (11) when it expects
>>> VHOST_USER_GET_STATUS, and so on.
>>>
>>> Any idea?

We only have a single thread on the DPDK side to handle vhost-user
requests: it reads a request, handles it, and replies to it, then reads
the next one, and so on. So I don't think it is possible to mix up the
order of request replies on the DPDK side.

Maybe there are two threads concurrently sending requests on the QEMU side?

Regards,
Maxime

>>> Thanks,
>>> Laurent
>>
>>
>> So I am guessing it's coming from:
>>
>>      if (msg.hdr.request != request) {
>>          error_report("Received unexpected msg type. Expected %d received %d",
>>                       request, msg.hdr.request);
>>          return -EPROTO;
>>      }
>>
>> in process_message_reply and/or in vhost_user_get_u64.
>>
>>
>>> On 11/7/22 23:53, Michael S. Tsirkin wrote:
>>>> From: Yajun Wu <[email protected]>
>>>>
>>>> The motivation of adding vhost-user vhost_dev_start support is to 
>>>> improve backend configuration speed and reduce live migration VM 
>>>> downtime.
>>>>
>>>> Today VQ configuration is issued one by one. For virtio-net with
>>>> multi-queue support, the backend needs to update RSS (Receive Side
>>>> Scaling) on every rx queue enable. Updating RSS is time-consuming
>>>> (typically around 7 ms).
>>>>
>>>> Implement the vhost status and messages already defined in the
>>>> vhost-user specification [1].
>>>> (a) VHOST_USER_PROTOCOL_F_STATUS
>>>> (b) VHOST_USER_SET_STATUS
>>>> (c) VHOST_USER_GET_STATUS
>>>>
>>>> Send message VHOST_USER_SET_STATUS with VIRTIO_CONFIG_S_DRIVER_OK 
>>>> for device start and reset(0) for device stop.
>>>>
>>>> On reception of the DRIVER_OK message, the backend can apply the
>>>> needed settings only once (instead of incrementally) and can also
>>>> enable queues in parallel.
>>>>
>>>> This improves QEMU's live migration downtime with the vhost-user
>>>> backend implementation by a great margin, especially for large
>>>> numbers of VQs: with 64 VQs, from 800 msec to 250 msec.
>>>>
>>>> [1] https://qemu-project.gitlab.io/qemu/interop/vhost-user.html
>>>>
>>>> Signed-off-by: Yajun Wu <[email protected]>
>>>> Acked-by: Parav Pandit <[email protected]>
>>>> Message-Id: <[email protected]>
>>>> Reviewed-by: Michael S. Tsirkin <[email protected]>
>>>> Signed-off-by: Michael S. Tsirkin <[email protected]>
>>
>> Probably easiest to debug from dpdk side.
>> Does the problem go away if you disable the feature 
>> VHOST_USER_PROTOCOL_F_STATUS in dpdk?
>
> Maxime could you help to debug this?
>
> Thanks,
> Laurent
>

