On 9/3/25 5:49 AM, Stefan Hajnoczi wrote:
On Sat, Aug 30, 2025 at 08:00:00AM -0400, Brian Song wrote:
We used fio to test a 1 GB file under both traditional FUSE and
FUSE-over-io_uring modes. The experiments covered four iodepth/numjobs
combinations (1/1, 64/1, 1/4, and 64/4) with a 70% read / 30% write mix,
giving eight test cases across the two modes, each measuring both
latency and throughput.
Test results:
https://gist.github.com/hibriansong/a4849903387b297516603e83b53bbde4
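For reference, a command along these lines exercises one of the test
cases (iodepth=64, numjobs=4). The exact fio invocation was not posted,
so the ioengine, direct-I/O, and runtime options below are assumptions:

$ fio --name=fuse-test --filename=/mnt/tmp/testfile --size=1G \
      --rw=randrw --rwmixread=70 --ioengine=io_uring --direct=1 \
      --iodepth=64 --numjobs=4 --runtime=60 --time_based \
      --group_reporting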
Hanna: You benchmarked the FUSE export coroutine implementation a little
while ago. What do you think about these results with
FUSE-over-io_uring?
What stands out to me is that iodepth=1 numjobs=4 already saturates the
system, so increasing iodepth to 64 does not improve the results much.
Brian: What is the qemu-storage-daemon command-line for the benchmark
and what are the details of /mnt/tmp/ (e.g. a preallocated 10 GB file
with an XFS file system mounted from the FUSE image)?
QMP script:
https://gist.github.com/hibriansong/399f9564a385cfb94db58669e63611f8
Or:
### NORMAL
./qemu/build/storage-daemon/qemu-storage-daemon \
--object iothread,id=iothread1 \
--object iothread,id=iothread2 \
--object iothread,id=iothread3 \
--object iothread,id=iothread4 \
--blockdev node-name=prot-node,driver=file,filename=ubuntu.qcow2 \
--blockdev node-name=fmt-node,driver=qcow2,file=prot-node \
--export \
type=fuse,id=exp0,node-name=fmt-node,mountpoint=mount-point,writable=on,iothread.0=iothread1,iothread.1=iothread2,iothread.2=iothread3,iothread.3=iothread4
### URING
echo Y > /sys/module/fuse/parameters/enable_uring
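# Reading the parameter back is a quick sanity check that the running
# kernel was built with FUSE-over-io_uring support (merged around Linux
# 6.14); the file does not exist on kernels without the feature:
cat /sys/module/fuse/parameters/enable_uring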
./qemu/build/storage-daemon/qemu-storage-daemon \
--object iothread,id=iothread1 \
--object iothread,id=iothread2 \
--object iothread,id=iothread3 \
--object iothread,id=iothread4 \
--blockdev node-name=prot-node,driver=file,filename=ubuntu.qcow2 \
--blockdev node-name=fmt-node,driver=qcow2,file=prot-node \
--export \
type=fuse,id=exp0,node-name=fmt-node,mountpoint=mount-point,writable=on,io-uring=on,iothread.0=iothread1,iothread.1=iothread2,iothread.2=iothread3,iothread.3=iothread4
ubuntu.qcow2 was preallocated and enlarged to 100 GB with:
$ qemu-img resize ubuntu.qcow2 100G
$ virt-customize \
--run-command '/bin/bash /bin/growpart /dev/sda 1' \
--run-command 'resize2fs /dev/sda1' -a ubuntu.qcow2
The image file, formatted with an ext4 filesystem, was mounted at
/mnt/tmp on my PC, which is equipped with a Kingston PCIe 4.0 NVMe SSD:
$ sudo kpartx -av mount-point
$ sudo mount /dev/mapper/loop31p1 /mnt/tmp/
Unmount the partition when done using it:
$ sudo umount /mnt/tmp
$ sudo kpartx -dv mount-point
Thanks,
Stefan
On 8/29/25 10:50 PM, Brian Song wrote:
Hi all,
This is a GSoC project. More details are available here:
https://wiki.qemu.org/Google_Summer_of_Code_2025#FUSE-over-io_uring_exports
This patch series includes:
- Add a round-robin mechanism to distribute the kernel-required Ring
Queues to FUSE Queues (a sketch of the idea follows this list)
- Support multiple in-flight requests (multiple ring entries)
- Add tests for FUSE-over-io_uring
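On the round-robin mapping: the kernel typically registers more Ring
Queues (one per CPU) than the export has FUSE Queues (one per
IOThread), so the series spreads the ring queues across the FUSE
queues. A minimal sketch of the idea, hypothetical code rather than an
excerpt from the patch:

/* Map a kernel ring queue ID onto one of the export's FUSE queues.
 * Assumes ring queue IDs are dense in 0..num_ring_queues-1, so the
 * modulo assigns them to FUSE queues in round-robin order. */
static size_t ring_queue_to_fuse_queue(size_t ring_qid,
                                       size_t num_fuse_queues)
{
    return ring_qid % num_fuse_queues;
}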
More detail in the v2 cover letter:
https://lists.nongnu.org/archive/html/qemu-block/2025-08/msg00140.html
And in the v1 cover letter:
https://lists.nongnu.org/archive/html/qemu-block/2025-07/msg00280.html
Brian Song (4):
export/fuse: add opt to enable FUSE-over-io_uring
export/fuse: process FUSE-over-io_uring requests
export/fuse: Safe termination for FUSE-uring
iotests: add tests for FUSE-over-io_uring
block/export/fuse.c | 838 +++++++++++++++++++++------
docs/tools/qemu-storage-daemon.rst | 11 +-
qapi/block-export.json | 5 +-
storage-daemon/qemu-storage-daemon.c | 1 +
tests/qemu-iotests/check | 2 +
tests/qemu-iotests/common.rc | 45 +-
util/fdmon-io_uring.c | 5 +-
7 files changed, 717 insertions(+), 190 deletions(-)