I performed a bisect between the 5.4 and 5.15 kernels. The performance regression was introduced in the 5.15.57 stable update by the following commit:
62b4db57eefec ("x86/entry: Add kernel IBRS implementation") This commit applies IBRS kernel mitigation for Spectre_v2. IBRS is: Indirect Branch Restricted Speculation. This commit was also applied up upstream stable 5.4 with the following SHA1: a3111faed5c1d ("x86/entry: Add kernel IBRS implementation") However, the backport to 5.4 did not introduce as much as a performance regression as the backport to 5.15. There are many difference between the 5.4 and 5.15 backports of the commit. Much of the assembly logic in 5.15 does not exist in 5.4, since it was not needed. There are several commits that are later applied to 5.15 stable that depend on this patch, so it would not be easily reverted. One option is to use a boot option to disable IBRS mitigation. However, the security versus performance trade-off must be considered carefully. IBRS can be disabled with the boot parameter "noibrs" There is a wiki page that describes the various boot parameters here: https://wiki.ubuntu.com/SecurityTeam/KnowledgeBase/SpectreAndMeltdown/MitigationControls -- You received this bug notification because you are a member of Kernel Packages, which is subscribed to linux in Ubuntu. https://bugs.launchpad.net/bugs/2042564 Title: Performance regression in the 5.15 Ubuntu 20.04 kernel compared to 5.4 Ubuntu 20.04 kernel Status in linux package in Ubuntu: Triaged Status in linux source package in Focal: Triaged Bug description: We in the Canonical Public Cloud team have received report from our colleagues in Google regarding a potential performance regression with the 5.15 kernel vs the 5.4 kernel on ubuntu 20.04. Their test were performed using the linux-gkeop and linux-gkeop-5.15 kernels. I have verified with the generic Ubuntu 20.04 5.4 linux-generic and the Ubuntu 20.04 5.15 linux-generic-hwe-20.04 kernels. The tests were run using `fio` fio commands: * 4k initwrite: `fio --ioengine=libaio --blocksize=4k --readwrite=write --filesize=40G --end_fsync=1 --iodepth=128 --direct=1 --group_reporting --numjobs=8 --name=fiojob1 --filename=/dev/sdc` * 4k overwrite: `fio --ioengine=libaio --blocksize=4k --readwrite=write --filesize=40G --end_fsync=1 --iodepth=128 --direct=1 --group_reporting --numjobs=8 --name=fiojob1 --filename=/dev/sdc` My reproducer was to launch an Ubuntu 20.04 cloud image locally with qemu the results are below: Using 5.4 kernel ``` ubuntu@cloudimg:~$ uname --kernel-release 5.4.0-164-generic ubuntu@cloudimg:~$ sudo fio --ioengine=libaio --blocksize=4k --readwrite=write --filesize=40G --end_fsync=1 --iodepth=128 --direct=1 --group_reporting --numjobs=8 --name=fiojob1 --filename=/dev/sda fiojob1: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=128 ... 
https://bugs.launchpad.net/bugs/2042564

Title:
  Performance regression in the 5.15 Ubuntu 20.04 kernel compared to
  5.4 Ubuntu 20.04 kernel

Status in linux package in Ubuntu:
  Triaged
Status in linux source package in Focal:
  Triaged

Bug description:

We in the Canonical Public Cloud team have received a report from our colleagues at Google regarding a potential performance regression with the 5.15 kernel vs the 5.4 kernel on Ubuntu 20.04. Their tests were performed using the linux-gkeop and linux-gkeop-5.15 kernels. I have verified with the generic Ubuntu 20.04 5.4 linux-generic and the Ubuntu 20.04 5.15 linux-generic-hwe-20.04 kernels.

The tests were run using `fio`, with the following commands:

* 4k initwrite: `fio --ioengine=libaio --blocksize=4k --readwrite=write --filesize=40G --end_fsync=1 --iodepth=128 --direct=1 --group_reporting --numjobs=8 --name=fiojob1 --filename=/dev/sdc`
* 4k overwrite: `fio --ioengine=libaio --blocksize=4k --readwrite=write --filesize=40G --end_fsync=1 --iodepth=128 --direct=1 --group_reporting --numjobs=8 --name=fiojob1 --filename=/dev/sdc`

My reproducer was to launch an Ubuntu 20.04 cloud image locally with qemu; the results are below.

Using the 5.4 kernel:

```
ubuntu@cloudimg:~$ uname --kernel-release
5.4.0-164-generic
ubuntu@cloudimg:~$ sudo fio --ioengine=libaio --blocksize=4k --readwrite=write --filesize=40G --end_fsync=1 --iodepth=128 --direct=1 --group_reporting --numjobs=8 --name=fiojob1 --filename=/dev/sda
fiojob1: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=128
...
fio-3.16
Starting 8 processes
Jobs: 8 (f=8): [W(8)][99.6%][w=925MiB/s][w=237k IOPS][eta 00m:01s]
fiojob1: (groupid=0, jobs=8): err= 0: pid=2443: Thu Nov 2 09:15:22 2023
  write: IOPS=317k, BW=1237MiB/s (1297MB/s)(320GiB/264837msec); 0 zone resets
    slat (nsec): min=628, max=37820k, avg=7207.71, stdev=101058.61
    clat (nsec): min=457, max=56099k, avg=3222240.45, stdev=1707823.38
     lat (usec): min=23, max=56100, avg=3229.78, stdev=1705.80
    clat percentiles (usec):
     |  1.00th=[  775],  5.00th=[ 1352], 10.00th=[ 1647], 20.00th=[ 2024],
     | 30.00th=[ 2343], 40.00th=[ 2638], 50.00th=[ 2933], 60.00th=[ 3261],
     | 70.00th=[ 3654], 80.00th=[ 4146], 90.00th=[ 5014], 95.00th=[ 5932],
     | 99.00th=[ 8979], 99.50th=[10945], 99.90th=[18220], 99.95th=[22676],
     | 99.99th=[32113]
   bw (  MiB/s): min=  524, max= 1665, per=100.00%, avg=1237.72, stdev=20.42, samples=4232
   iops        : min=134308, max=426326, avg=316855.16, stdev=5227.36, samples=4232
  lat (nsec)   : 500=0.01%, 750=0.01%, 1000=0.01%
  lat (usec)   : 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%, 100=0.01%
  lat (usec)   : 250=0.05%, 500=0.54%, 750=0.37%, 1000=0.93%
  lat (msec)   : 2=17.40%, 4=58.02%, 10=22.01%, 20=0.60%, 50=0.07%
  lat (msec)   : 100=0.01%
  cpu          : usr=3.29%, sys=7.45%, ctx=1262621, majf=0, minf=103
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
     issued rwts: total=0,83886080,0,8 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=128

Run status group 0 (all jobs):
  WRITE: bw=1237MiB/s (1297MB/s), 1237MiB/s-1237MiB/s (1297MB/s-1297MB/s), io=320GiB (344GB), run=264837-264837msec

Disk stats (read/write):
  sda: ios=36/32868891, merge=0/50979424, ticks=5/27498602, in_queue=1183124, util=100.00%
```
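For completeness, the upgrade between the two runs would have been along these lines (a sketch: the linux-generic-hwe-20.04 metapackage is named in the report, but the exact apt invocation is an assumption):

```
# install the Focal HWE kernel metapackage and boot into it
sudo apt update
sudo apt install linux-generic-hwe-20.04
sudo reboot
```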
After upgrading to the linux-generic-hwe-20.04 kernel and rebooting:

```
ubuntu@cloudimg:~$ uname --kernel-release
5.15.0-88-generic
ubuntu@cloudimg:~$ sudo fio --ioengine=libaio --blocksize=4k --readwrite=write --filesize=40G --end_fsync=1 --iodepth=128 --direct=1 --group_reporting --numjobs=8 --name=fiojob1 --filename=/dev/sda
fiojob1: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=128
...
fio-3.16
Starting 8 processes
Jobs: 1 (f=1): [_(7),W(1)][100.0%][w=410MiB/s][w=105k IOPS][eta 00m:00s]
fiojob1: (groupid=0, jobs=8): err= 0: pid=1438: Thu Nov 2 09:46:49 2023
  write: IOPS=155k, BW=605MiB/s (634MB/s)(320GiB/541949msec); 0 zone resets
    slat (nsec): min=660, max=325426k, avg=10351.04, stdev=232438.50
    clat (nsec): min=1100, max=782743k, avg=6595008.67, stdev=6290570.04
     lat (usec): min=86, max=782748, avg=6606.08, stdev=6294.03
    clat percentiles (usec):
     |  1.00th=[   914],  5.00th=[  2180], 10.00th=[  2802], 20.00th=[  3556],
     | 30.00th=[  4178], 40.00th=[  4817], 50.00th=[  5538], 60.00th=[  6259],
     | 70.00th=[  7177], 80.00th=[  8455], 90.00th=[ 10683], 95.00th=[ 13566],
     | 99.00th=[ 26870], 99.50th=[ 34866], 99.90th=[ 63177], 99.95th=[ 80217],
     | 99.99th=[145753]
   bw (  KiB/s): min=39968, max=1683451, per=100.00%, avg=619292.10, stdev=26377.19, samples=8656
   iops        : min= 9990, max=420862, avg=154822.58, stdev=6594.34, samples=8656
  lat (usec)   : 2=0.01%, 4=0.01%, 10=0.01%, 20=0.01%, 50=0.01%
  lat (usec)   : 100=0.01%, 250=0.01%, 500=0.05%, 750=0.48%, 1000=0.65%
  lat (msec)   : 2=2.79%, 4=23.00%, 10=60.93%, 20=10.08%, 50=1.83%
  lat (msec)   : 100=0.16%, 250=0.02%, 500=0.01%, 1000=0.01%
  cpu          : usr=3.27%, sys=7.39%, ctx=1011754, majf=0, minf=93
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.1%
     issued rwts: total=0,83886080,0,8 short=0,0,0,0 dropped=0,0,0,0
     latency   : target=0, window=0, percentile=100.00%, depth=128

Run status group 0 (all jobs):
  WRITE: bw=605MiB/s (634MB/s), 605MiB/s-605MiB/s (634MB/s-634MB/s), io=320GiB (344GB), run=541949-541949msec

Disk stats (read/write):
  sda: ios=264/31713991, merge=0/52167896, ticks=127/57278442, in_queue=57278609, util=99.95%
```

I have shared the results with xnox; the important data points are `bw=1237MiB/s` with the 5.4 kernel versus only `bw=605MiB/s` with the 5.15 kernel.

Attached are the test results initially reported by our Google colleagues.
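One way to confirm whether kernel IBRS is actually in effect on a given boot is to read the IA32_SPEC_CTRL MSR (0x48), whose bit 0 is the IBRS control. Below is a sketch using rdmsr from msr-tools (assuming the package is installed; since the read itself runs in kernel context, a kernel with kernel IBRS enabled should report bit 0 as set):

```
sudo modprobe msr   # expose /dev/cpu/*/msr
sudo rdmsr -a 0x48  # read IA32_SPEC_CTRL on every CPU; bit 0 = IBRS
# expected: bit 0 set on an affected 5.15 boot, clear after booting with "noibrs"
```

Comparing this value across a 5.4 boot, a 5.15 boot, and a 5.15 boot with "noibrs" should show whether the regression tracks the mitigation state.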