On 19/07/2023 21.56, Milan Zamazal wrote:
Thomas Huth <[email protected]> writes:
On 18/07/2023 14.55, Milan Zamazal wrote:
Thomas Huth <[email protected]> writes:
On 11/07/2023 01.02, Michael S. Tsirkin wrote:
From: Milan Zamazal <[email protected]>
We don't have a virtio-scmi implementation in QEMU and only support a
vhost-user backend. This is very similar to virtio-gpio and we add the same
set of tests, just passing some vhost-user messages over the control socket.
Signed-off-by: Milan Zamazal <[email protected]>
Acked-by: Thomas Huth <[email protected]>
Message-Id: <[email protected]>
Reviewed-by: Michael S. Tsirkin <[email protected]>
Signed-off-by: Michael S. Tsirkin <[email protected]>
---
tests/qtest/libqos/virtio-scmi.h | 34 ++++++
tests/qtest/libqos/virtio-scmi.c | 174 +++++++++++++++++++++++++++++++
tests/qtest/vhost-user-test.c | 44 ++++++++
MAINTAINERS | 1 +
tests/qtest/libqos/meson.build | 1 +
5 files changed, 254 insertions(+)
create mode 100644 tests/qtest/libqos/virtio-scmi.h
create mode 100644 tests/qtest/libqos/virtio-scmi.c
Hi!
I'm seeing some random failures with this new scmi test, so far only
on non-x86 systems, e.g.:
https://app.travis-ci.com/github/huth/qemu/jobs/606246131#L4774
It also reproduces on a s390x host here, but only if I run "make check
-j$(nproc)" - if I run the tests single-threaded, the qos-test passes
there. Seems like there is a race somewhere in this test?
Hmm, it's basically the same as the virtio-gpio.c test, so it should be
OK.
Is it possible that the two tests (virtio-gpio.c & virtio-scmi.c)
interfere with each other in some way? Is there possibly a way to
serialize them to check?
I think within one qos-test, the sub-tests are already run
serialized.
I see, OK.
But there might be multiple qos-tests running in parallel, e.g. one
for the aarch64 target and one for the ppc64 target. And indeed, I can
reproduce the problem on my x86 laptop by running this in one terminal
window:
for ((x=0;x<1000;x++)); do \
QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon \
G_TEST_DBUS_DAEMON=.tests/dbus-vmstate-daemon.sh \
QTEST_QEMU_BINARY=./qemu-system-ppc64 \
MALLOC_PERTURB_=188 QTEST_QEMU_IMG=./qemu-img \
tests/qtest/qos-test -p \
/ppc64/pseries/spapr-pci-host-bridge/pci-bus-spapr/pci-bus/vhost-user-scmi-pci/vhost-user-scmi/vhost-user-scmi-tests/scmi/read-guest-mem/memfile \
|| break ; \
done
And this in another terminal window at the same time:
for ((x=0;x<1000;x++)); do \
QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon \
G_TEST_DBUS_DAEMON=.tests/dbus-vmstate-daemon.sh \
QTEST_QEMU_BINARY=./qemu-system-aarch64 \
MALLOC_PERTURB_=188 QTEST_QEMU_IMG=./qemu-img \
tests/qtest/qos-test -p \
/aarch64/virt/generic-pcihost/pci-bus-generic/pci-bus/vhost-user-scmi-pci/vhost-user-scmi/vhost-user-scmi-tests/scmi/read-guest-mem/memfile \
|| break ; \
done
After a while, the aarch64 test broke with:
/aarch64/virt/generic-pcihost/pci-bus-generic/pci-bus/vhost-user-scmi-pci/vhost-user-scmi/vhost-user-scmi-tests/scmi/read-guest-mem/memfile:
qemu-system-aarch64: Failed to set msg fds.
qemu-system-aarch64: Failed to set msg fds.
qemu-system-aarch64: vhost VQ 0 ring restore failed: -22: Invalid argument (22)
qemu-system-aarch64: Failed to set msg fds.
qemu-system-aarch64: vhost VQ 1 ring restore failed: -22: Invalid argument (22)
qemu-system-aarch64: Failed to set msg fds.
qemu-system-aarch64: vhost_set_vring_call failed 22
qemu-system-aarch64: Failed to set msg fds.
qemu-system-aarch64: vhost_set_vring_call failed 22
qemu-system-aarch64: Failed to write msg. Wrote -1 instead of 20.
qemu-system-aarch64: Failed to set msg fds.
qemu-system-aarch64: vhost VQ 0 ring restore failed: -22: Invalid argument (22)
qemu-system-aarch64: Failed to set msg fds.
qemu-system-aarch64: vhost VQ 1 ring restore failed: -22: Invalid argument (22)
qemu-system-aarch64: ../../devel/qemu/hw/pci/msix.c:659:
msix_unset_vector_notifiers: Assertion `dev->msix_vector_use_notifier
&& dev->msix_vector_release_notifier' failed.
../../devel/qemu/tests/qtest/libqtest.c:200: kill_qemu() detected QEMU
death from signal 6 (Aborted) (core dumped)
**
ERROR:../../devel/qemu/tests/qtest/qos-test.c:191:subprocess_run_one_test:
child process
(/aarch64/virt/generic-pcihost/pci-bus-generic/pci-bus/vhost-user-scmi-pci/vhost-user-scmi/vhost-user-scmi-tests/scmi/read-guest-mem/memfile/subprocess
[488457]) failed unexpectedly
Aborted (core dumped)
Interesting, good discovery.
Can you also reproduce it this way?
Unfortunately not. I ran the loops several times and everything passed.
I tried to compile and run it in a different distro container and it
passed too. I also haven't been able to figure out how the
processes could influence each other.
What OS and what QEMU configure flags did you use to compile and run it?
I'm using RHEL 8 on an older laptop ... and maybe the latter is related: I
just noticed that I can also reproduce the problem by just running one of
the above two for-loops while putting a lot of load on the machine otherwise,
e.g. by running a "make -j$(nproc)" to rebuild the whole QEMU sources. So
it's definitely a race *within* one QEMU process.
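For anyone who wants to try the single-process reproduction, here is a rough
sketch of the loop as a reusable bash function. The name repeat_until_fail is
made up for illustration and is not part of the QEMU tree; the environment
variables and paths in the usage comment are the ones from the loops above.

```shell
#!/bin/bash
# Hypothetical helper (not part of QEMU): run a command repeatedly and stop
# at the first failure, reporting the iteration on which it failed.
# usage: repeat_until_fail MAX cmd [args...]
repeat_until_fail() {
    local max=$1
    shift
    local i
    for ((i = 1; i <= max; i++)); do
        if ! "$@"; then
            echo "failed on iteration $i"
            return 1
        fi
    done
    echo "passed $max iterations"
}

# Possible usage, assuming it is run from the build directory:
#   make -j"$(nproc)" &     # generate load on the machine in the background
#   repeat_until_fail 1000 env \
#       QTEST_QEMU_BINARY=./qemu-system-aarch64 \
#       MALLOC_PERTURB_=188 QTEST_QEMU_IMG=./qemu-img \
#       tests/qtest/qos-test -p <test path from the loops above>
```

Since the failure is timing-dependent, a clean run of all iterations does not
prove the race is absent on a given machine.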
Thomas