Verification
============

Performance results without the patch:
--------------------------------------

Ubuntu Release:
lsb_release -rc
Release:        25.04
Codename:       plucky

Kernel in use:
uname -srvm
Linux 6.14.0-1015-gcp #16-Ubuntu SMP Tue Aug 19 00:02:17 UTC 2025 x86_64

nvme topology:
sudo lstopo | grep -A 4 NVMExp
          PCI 05:00.1 (NVMExp)
            Block(Disk) "nvme0n1"
      HostBridge
        PCI 6b:00.0 (Co-Processor)
      HostBridge
--
          PCI 3d:00.0 (NVMExp)
            Block(Disk) "nvme1n1"
          PCI 3d:00.1 (NVMExp)
            Block(Disk) "nvme10n1"
        PCIBridge
          PCI 3e:00.0 (NVMExp)
            Block(Disk) "nvme11n1"
          PCI 3e:00.1 (NVMExp)
            Block(Disk) "nvme12n1"
  Package L#1 + L3 L#1 (105MB)
    Group0 L#2
      NUMANode L#2 (P#2 378GB)
--
          PCI 86:00.0 (NVMExp)
            Block(Disk) "nvme2n1"
          PCI 86:00.1 (NVMExp)
            Block(Disk) "nvme4n1"
        PCIBridge
          PCI 87:00.0 (NVMExp)
            Block(Disk) "nvme7n1"
          PCI 87:00.1 (NVMExp)
            Block(Disk) "nvme9n1"
      HostBridge
        PCI e8:00.0 (Co-Processor)
      HostBridge
--
          PCI b7:00.0 (NVMExp)
            Block(Disk) "nvme3n1"
          PCI b7:00.1 (NVMExp)
            Block(Disk) "nvme5n1"
        PCIBridge
          PCI b8:00.0 (NVMExp)
            Block(Disk) "nvme6n1"
          PCI b8:00.1 (NVMExp)
            Block(Disk) "nvme8n1"
...
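
The same bridge relationship can also be read from sysfs without lstopo: each NVMe controller's sysfs path encodes its full PCI ancestry, and two controllers share a bridge when the next-to-last PCI address in their paths matches. A minimal sketch — the paths below are illustrative (the bridge addresses 3c:01.0 and b6:01.0 are assumed, not taken from this machine; on real hardware use `readlink -f /sys/block/<dev>/device`):

```shell
# Illustrative sysfs paths for three of the controllers above; the
# component before the endpoint address is the upstream PCI bridge.
paths='nvme1n1 /sys/devices/pci0000:3c/0000:3c:01.0/0000:3d:00.0/nvme/nvme1
nvme10n1 /sys/devices/pci0000:3c/0000:3c:01.0/0000:3d:00.1/nvme/nvme10
nvme3n1 /sys/devices/pci0000:b6/0000:b6:01.0/0000:b7:00.0/nvme/nvme3'

# Print each device next to its upstream bridge address: devices that
# print the same bridge address sit under the same PCI bridge.
echo "$paths" | awk '{n = split($2, a, "/"); print $1, a[n-3]}'
```

Here nvme1n1 and nvme10n1 resolve to the same bridge address while nvme3n1 resolves to a different one, matching the lstopo output.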

- From the topology we see that nvme1n1 and nvme3n1 sit under different
PCI bridges. Run the fio benchmark against these two NVMes and observe
an aggregate read throughput of 6152MiB/s:

sudo fio --readwrite=randread --blocksize=4k --iodepth=32 --numjobs=8 \
    --time_based --runtime=40 --ioengine=libaio --direct=1 --group_reporting \
    --new_group --name=job1 --filename=/dev/nvme1n1 --new_group --name=job2 \
    --filename=/dev/nvme3n1
...
Jobs: 16 (f=16): [r(16)][100.0%][r=6152MiB/s][r=1575k IOPS][eta 00m:00s]
job1: (groupid=0, jobs=8): err= 0: pid=12326: Wed Sep  3 15:12:38 2025
  read: IOPS=787k, BW=3073MiB/s (3222MB/s)(120GiB/40001msec)
...
job2: (groupid=1, jobs=8): err= 0: pid=12334: Wed Sep  3 15:12:38 2025
  read: IOPS=787k, BW=3073MiB/s (3222MB/s)(120GiB/40001msec)
...
Run status group 0 (all jobs):
   READ: bw=3073MiB/s (3222MB/s), 3073MiB/s-3073MiB/s (3222MB/s-3222MB/s), 
io=120GiB (129GB), run=40001-40001msec
Run status group 1 (all jobs):
   READ: bw=3073MiB/s (3222MB/s), 3073MiB/s-3073MiB/s (3222MB/s-3222MB/s), 
io=120GiB (129GB), run=40001-40001msec
Disk stats (read/write):
  nvme1n1: ios=31460219/0, sectors=251681752/0, merge=0/0, ticks=10023925/0, 
in_queue=10023925, util=99.46%
  nvme3n1: ios=31460291/0, sectors=251682328/0, merge=0/0, ticks=10039463/0, 
in_queue=10039463, util=99.49%
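
With --group_reporting fio prints one READ summary line per group, so the aggregate bandwidth is the sum of the two groups. A small sketch of extracting it from a capture of the output above (the printf stands in for a saved log file, which is assumed):

```shell
# Sum the per-group READ bandwidths (MiB/s) from fio's group_reporting
# output; with a saved log, replace the printf with: grep 'READ:' fio.log
agg=$(printf '%s\n' \
  '   READ: bw=3073MiB/s (3222MB/s), 3073MiB/s-3073MiB/s (3222MB/s-3222MB/s),' \
  '   READ: bw=3073MiB/s (3222MB/s), 3073MiB/s-3073MiB/s (3222MB/s-3222MB/s),' |
  sed -n 's/.*bw=\([0-9]*\)MiB.*/\1/p' |
  awk '{s += $1} END {print s "MiB/s"}')
echo "$agg"
```

The group lines are whole-run averages while the live status line shows an instantaneous rate, so the sum (6146MiB/s) sits a few MiB/s below the 6152MiB/s shown during the run.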

- From the topology we also see that nvme1n1 and nvme10n1 sit under the
same PCI bridge. Run the same benchmark against these two NVMes and
note the degraded throughput (4947MiB/s), despite an initial burst at
roughly the same ~6150MiB/s:

sudo fio --readwrite=randread --blocksize=4k --iodepth=32 --numjobs=8 \
    --time_based --runtime=40 --ioengine=libaio --direct=1 --group_reporting \
    --new_group --name=job1 --filename=/dev/nvme1n1 --new_group --name=job2 \
    --filename=/dev/nvme10n1
...
Jobs: 16 (f=16): [r(16)][100.0%][r=4947MiB/s][r=1266k IOPS][eta 00m:00s]
job1: (groupid=0, jobs=8): err= 0: pid=12547: Wed Sep  3 15:13:55 2025
...
job2: (groupid=1, jobs=8): err= 0: pid=12555: Wed Sep  3 15:13:55 2025
  read: IOPS=675k, BW=2636MiB/s (2764MB/s)(103GiB/40001msec)
Run status group 0 (all jobs):
   READ: bw=2618MiB/s (2745MB/s), 2618MiB/s-2618MiB/s (2745MB/s-2745MB/s), 
io=102GiB (110GB), run=40001-40001msec
Run status group 1 (all jobs):
   READ: bw=2636MiB/s (2764MB/s), 2636MiB/s-2636MiB/s (2764MB/s-2764MB/s), 
io=103GiB (111GB), run=40001-40001msec
Disk stats (read/write):
  nvme1n1: ios=26789594/0, sectors=214316752/0, merge=0/0, ticks=2449737/0, 
in_queue=2449737, util=99.42%
  nvme10n1: ios=26972471/0, sectors=215779776/0, merge=0/0, ticks=2500999/0, 
in_queue=2500999, util=99.48%
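
For reference, the drop from 6152MiB/s to 4947MiB/s on the unpatched kernel works out to roughly a 20% regression:

```shell
# Relative slowdown between the two runs above
awk 'BEGIN { printf "%.1f%%\n", (6152 - 4947) / 6152 * 100 }'
```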

Performance results with the patch:
-----------------------------------

Upgrade to the -proposed -generic kernel

Kernel in use:

uname -srvm
Linux 6.14.0-32-generic #32-Ubuntu SMP PREEMPT_DYNAMIC Fri Aug 29 14:21:26 UTC 
2025 x86_64

- Rerun the same job for the NVMes under different bridges and observe
no regression compared to the previous kernel (throughput remains at
6153MiB/s):

sudo fio --readwrite=randread --blocksize=4k --iodepth=32 --numjobs=8 \
    --time_based --runtime=40 --ioengine=libaio --direct=1 --group_reporting \
    --new_group --name=job1 --filename=/dev/nvme1n1 --new_group --name=job2 \
    --filename=/dev/nvme3n1
...
Jobs: 16 (f=16): [r(16)][100.0%][r=6153MiB/s][r=1575k IOPS][eta 00m:00s]
job1: (groupid=0, jobs=8): err= 0: pid=18197: Wed Sep  3 22:12:26 2025
  read: IOPS=786k, BW=3070MiB/s (3219MB/s)(120GiB/40001msec)
...
job2: (groupid=1, jobs=8): err= 0: pid=18205: Wed Sep  3 22:12:26 2025
  read: IOPS=786k, BW=3071MiB/s (3220MB/s)(120GiB/40001msec)
...
Run status group 0 (all jobs):
   READ: bw=3070MiB/s (3219MB/s), 3070MiB/s-3070MiB/s (3219MB/s-3219MB/s), 
io=120GiB (129GB), run=40001-40001msec
Run status group 1 (all jobs):
   READ: bw=3071MiB/s (3220MB/s), 3071MiB/s-3071MiB/s (3220MB/s-3220MB/s), 
io=120GiB (129GB), run=40001-40001msec
Disk stats (read/write):
  nvme1n1: ios=31385500/0, sectors=251084000/0, merge=0/0, ticks=9980787/0, 
in_queue=9980787, util=99.23%
  nvme3n1: ios=31395528/0, sectors=251164224/0, merge=0/0, ticks=9984572/0, 
in_queue=9984572, util=99.35%

- Rerun the benchmark for the NVMes under the same PCI bridge and
observe that it now performs as well as the NVMes under different
bridges (6153MiB/s):

sudo fio --readwrite=randread --blocksize=4k --iodepth=32 --numjobs=8 \
    --time_based --runtime=40 --ioengine=libaio --direct=1 --group_reporting \
    --new_group --name=job1 --filename=/dev/nvme1n1 --new_group --name=job2 \
    --filename=/dev/nvme10n1
...
Jobs: 16 (f=16): [r(16)][100.0%][r=6153MiB/s][r=1575k IOPS][eta 00m:00s]
job1: (groupid=0, jobs=8): err= 0: pid=91963: Wed Sep  3 22:17:41 2025
  read: IOPS=786k, BW=3072MiB/s (3221MB/s)(120GiB/40001msec)
...
job2: (groupid=1, jobs=8): err= 0: pid=91971: Wed Sep  3 22:17:41 2025
  read: IOPS=787k, BW=3075MiB/s (3224MB/s)(120GiB/40001msec)
...
Run status group 0 (all jobs):
   READ: bw=3072MiB/s (3221MB/s), 3072MiB/s-3072MiB/s (3221MB/s-3221MB/s), 
io=120GiB (129GB), run=40001-40001msec
Run status group 1 (all jobs):
   READ: bw=3075MiB/s (3224MB/s), 3075MiB/s-3075MiB/s (3224MB/s-3224MB/s), 
io=120GiB (129GB), run=40001-40001msec
Disk stats (read/write):
  nvme1n1: ios=31433923/0, sectors=251471400/0, merge=0/0, ticks=9988587/0, 
in_queue=9988587, util=99.38%
  nvme10n1: ios=31432262/0, sectors=251458128/0, merge=0/0, ticks=9974708/0, 
in_queue=9974708, util=99.45%

The I/O performance regression for NVMes under the same PCI bridge is
fixed in this kernel.

** Tags removed: verification-needed-plucky-linux
** Tags added: verification-done-plucky-linux

https://bugs.launchpad.net/bugs/2115738

Title:
  I/O performance regression on NVMes under same bridge (dual port nvme)
