On Wed, Jun 04, 2025 at 03:18:43PM -0400, Stefan Hajnoczi wrote:
> Since commit 7ff9ff039380 ("meson: mitigate against use of uninitialize
> stack for exploits") the -ftrivial-auto-var-init=zero compiler option is
> used to zero local variables. While this reduces security risks
> associated with uninitialized stack data, it introduced a measurable
> bottleneck in the virtqueue_split_pop() and virtqueue_packed_pop()
> functions.
>
> These virtqueue functions are in the hot path. They are called for each
> element (request) that is popped from a VIRTIO device's virtqueue. Using
> __attribute__((uninitialized)) on large stack variables in these
> functions improves fio randread bs=4k iodepth=64 performance from 304k
> to 332k IOPS (+9%).
I ask however whether it is always not worth it for all users.
It does reduce chances of leaking stack info, does it not?
Maybe we can start with a tri-state Kconfig knob to select between
performance/balanced/paranoid for this and similar variables?
> This issue was found using perf-top(1). virtqueue_split_pop() was one of
> the top CPU consumers and the "annotate" feature showed that the memory
> zeroing instructions at the beginning of the functions were hot.
>
> Fixes: 7ff9ff039380 ("meson: mitigate against use of uninitialize stack for
> exploits")
> Cc: Daniel P. Berrangé <[email protected]>
> Signed-off-by: Stefan Hajnoczi <[email protected]>
> ---
> include/qemu/compiler.h | 12 ++++++++++++
> hw/virtio/virtio.c | 8 ++++----
> 2 files changed, 16 insertions(+), 4 deletions(-)
>
> diff --git a/include/qemu/compiler.h b/include/qemu/compiler.h
> index 496dac5ac1..fabd540b02 100644
> --- a/include/qemu/compiler.h
> +++ b/include/qemu/compiler.h
> @@ -207,6 +207,18 @@
> # define QEMU_USED
> #endif
>
> +/*
> + * Disable -ftrivial-auto-var-init on a local variable. Use this in rare
> cases
> + * when the compiler zeroes a large on-stack variable and this causes a
> + * performance bottleneck. Only use it when performance data indicates this
> is
> + * necessary since security risks increase with uninitialized stack
> variables.
> + */
> +#if __has_attribute(uninitialized)
> +# define QEMU_UNINITIALIZED __attribute__((uninitialized))
> +#else
> +# define QEMU_UNINITIALIZED
> +#endif
> +
> /*
> * http://clang.llvm.org/docs/ThreadSafetyAnalysis.html
> *
> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> index 5534251e01..82a285a31d 100644
> --- a/hw/virtio/virtio.c
> +++ b/hw/virtio/virtio.c
> @@ -1689,8 +1689,8 @@ static void *virtqueue_split_pop(VirtQueue *vq, size_t
> sz)
> VirtIODevice *vdev = vq->vdev;
> VirtQueueElement *elem = NULL;
> unsigned out_num, in_num, elem_entries;
> - hwaddr addr[VIRTQUEUE_MAX_SIZE];
> - struct iovec iov[VIRTQUEUE_MAX_SIZE];
> + hwaddr QEMU_UNINITIALIZED addr[VIRTQUEUE_MAX_SIZE];
> + struct iovec QEMU_UNINITIALIZED iov[VIRTQUEUE_MAX_SIZE];
> VRingDesc desc;
> int rc;
>
> @@ -1836,8 +1836,8 @@ static void *virtqueue_packed_pop(VirtQueue *vq, size_t
> sz)
> VirtIODevice *vdev = vq->vdev;
> VirtQueueElement *elem = NULL;
> unsigned out_num, in_num, elem_entries;
> - hwaddr addr[VIRTQUEUE_MAX_SIZE];
> - struct iovec iov[VIRTQUEUE_MAX_SIZE];
> + hwaddr QEMU_UNINITIALIZED addr[VIRTQUEUE_MAX_SIZE];
> + struct iovec QEMU_UNINITIALIZED iov[VIRTQUEUE_MAX_SIZE];
> VRingPackedDesc desc;
> uint16_t id;
> int rc;
> --
> 2.49.0