Control: tag -1 + unreproducible
On 04.09.2025 18:29, Jakub Ružička wrote:
> Source: qemu
> Version: 1:10.0.2+ds-2+deb13u1
> Severity: important
>
> Dear Maintainer,
>
> when using arm64 emulation on amd64 host with
>
>   pbuilder build --architecture arm64 --distro trixie
>
> it's incredibly slow to a point where tests fail due to timeouts and
> installation of texlive-latex-extra takes tens of minutes.
>
> This problem is present:
>
> * only on Trixie GUESTS (Bookworm, Bullseye, and even Buster guests work
>   fine)
> * on both Bookworm and Trixie amd64 HOSTS (tested on 2 Bookworm and 1 Trixie
>   host)
So, from this we can conclude it is the GUEST which behaves differently,
not qemu. I guess you can file a bug against arm64 trixie :)
I guess pbuilder here means the qemu-user package (as I noted before, I have
never used pbuilder and don't know how to use it).
I can't confirm this observation, at least not generally.
Some operations have indeed become slower. For example, I ran the openssl
speed benchmark in a bookworm guest and in a trixie guest. Here are some
results on an x86-64 host (ops/sec, more is better):
Operation    Bookworm   Trixie
md5/16        1496014   981482
md5/64        1382894   967298
md5/256       1158511   883699
md5/1024       526236   469784
md5/8192       107971   130183
md5/16384       58668    72427
sha1/16        794488   571021
sha1/64        601573   466057
sha1/256       362402   309965
sha1/1024      141595   132793
sha1/8192       21644    21597
sha1/16384      11106    10937
...
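To put a number on "slower", the trixie/bookworm ratio can be computed
from the figures above. This is only a rough sketch in Python: the names
RESULTS and ratios are mine, the values are copied from the table.

```python
# ops/sec copied from the table above: operation -> (bookworm, trixie)
RESULTS = {
    "md5/16": (1496014, 981482),
    "md5/64": (1382894, 967298),
    "md5/256": (1158511, 883699),
    "md5/1024": (526236, 469784),
    "md5/8192": (107971, 130183),
    "md5/16384": (58668, 72427),
    "sha1/16": (794488, 571021),
    "sha1/64": (601573, 466057),
    "sha1/256": (362402, 309965),
    "sha1/1024": (141595, 132793),
    "sha1/8192": (21644, 21597),
    "sha1/16384": (11106, 10937),
}


def ratios(results):
    """Return {operation: trixie / bookworm}; below 1.0 means trixie is slower."""
    return {op: trixie / bookworm for op, (bookworm, trixie) in results.items()}


if __name__ == "__main__":
    for op, ratio in ratios(RESULTS).items():
        print(f"{op:12s} {ratio:5.2f}")
```

With these numbers the small-block cases come out at about 0.66 (md5/16)
and 0.72 (sha1/16), while md5 with 8192- and 16384-byte blocks comes out
above 1.2.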
Some of these operations (md5 with large blocks) are slightly faster on
trixie, though. And some other benchmarks are significantly faster on
trixie, for example:
trixie$ 7z b
7-Zip 24.09 (arm64) : Copyright (c) 1999-2024 Igor Pavlov : 2024-11-29
64-bit arm_v:8-A locale=C.UTF-8 Threads:16 OPEN_MAX:1024, ASM
Compiler: ver:14.2.0 GCC 14.2.0 : UNALIGNED
Linux : 6.12.38+deb13-amd64 : #1 SMP PREEMPT_DYNAMIC Debian 6.12.38-1 (2025-07-16) : aarch64
PageSize:4KB hwcap:EFFFFFFB:CRC32:SHA1:SHA2:SHA3:SHA512:AES:ASIMD hwcap2:18007FC77FFF
arm64
1T CPU Freq (MHz): 2297 2308 2330 2336 2327 2328 2328
8T CPU Freq (MHz): 745% 1648 746% 1646
16T CPU Freq (MHz): 1411% 1372 1406% 1118
RAM size: 15695 MB, # CPU hardware threads: 16
RAM usage: 3559 MB, # Benchmark threads: 16
                      Compressing  |                Decompressing
Dict   Speed  Usage   R/U  Rating  |   Speed  Usage   R/U  Rating
       KiB/s      %  MIPS    MIPS  |   KiB/s      %  MIPS    MIPS
22:     5552    893   605    5402  |  132019   1412   797   11257
23:     3971    852   475    4046  |   98865   1464   584    8553
24:     5474    917   642    5886  |   85897   1435   525    7537
...
bookworm$ 7z b
7-Zip [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=C,Utf16=off,HugeFiles=on,64 bits,16 CPUs LE)
LE
CPU Freq: 3764705 - - - - - - - -
RAM size: 4096 MB, # CPU hardware threads: 16
RAM usage: 3530 MB, # Benchmark threads: 16
                      Compressing  |                Decompressing
Dict   Speed  Usage   R/U  Rating  |   Speed  Usage   R/U  Rating
       KiB/s      %  MIPS    MIPS  |   KiB/s      %  MIPS    MIPS
22:     4289    967   431    4173  |   64538   1442   382    5504
23:     3807   1017   382    3880  |   62524   1419   381    5410
24:     3872   1027   405    4164  |   65082   1468   389    5713
...
As you can see, decompression on trixie, for a dict size of 22, is about
2 times faster than on bookworm (132019 vs. 64538 KiB/s). This is most
likely the result of code optimization in 7z itself (24.09 vs. 16.02).
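The same arithmetic for the 7z rows, again only as a sketch: TRIXIE,
BOOKWORM and speedup are names I made up, the KiB/s values are the Speed
columns from the two runs above.

```python
# Speed in KiB/s from the 7z output above: dict size -> (compress, decompress)
TRIXIE = {22: (5552, 132019), 23: (3971, 98865), 24: (5474, 85897)}
BOOKWORM = {22: (4289, 64538), 23: (3807, 62524), 24: (3872, 65082)}


def speedup(new, old):
    """Return {dict_size: (compress_ratio, decompress_ratio)}, new over old."""
    return {
        size: (new[size][0] / old[size][0], new[size][1] / old[size][1])
        for size in new
    }


if __name__ == "__main__":
    for size, (comp, decomp) in speedup(TRIXIE, BOOKWORM).items():
        print(f"dict {size}: compress x{comp:.2f}, decompress x{decomp:.2f}")
```

For dict size 22 this gives roughly 1.29 for compression and 2.05 for
decompression; for 23 and 24 the decompression ratio is about 1.58 and
1.32.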
I also tried to compile a few packages in a foreign arm64 chroot on
an x86_64 machine. I don't see a dramatic speed difference there, more
like a gradual difference due to the compiler being "smarter", doing more
optimizations and checking for more cases in the code: maybe a 5%
difference overall. I observe a similar difference in speed when
building the same package on trixie and bookworm natively.
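If someone wants to repeat this kind of comparison, wall-clock timing of
the same command in two environments is enough. A minimal sketch, not
what I actually ran: time_command is a hypothetical helper, and the
LABEL=COMMAND argument format is just something I picked for
illustration.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run argv (a list of arguments) and return elapsed wall-clock seconds.

    Output is discarded; a non-zero exit status raises CalledProcessError.
    """
    start = time.perf_counter()
    subprocess.run(
        argv,
        check=True,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return time.perf_counter() - start


if __name__ == "__main__":
    # Each argument has the form LABEL=COMMAND; COMMAND is run through sh -c.
    for arg in sys.argv[1:]:
        label, _, command = arg.partition("=")
        seconds = time_command(["sh", "-c", command])
        print(f"{label}: {seconds:.1f} s")
```

For example (the chroot paths and the /build/pkg directory are made up,
and chroot needs root), saved as timebuild.py:

  python3 timebuild.py \
    "bookworm=chroot /srv/chroot/bookworm-arm64 sh -c 'cd /build/pkg && dpkg-buildpackage -b -uc -us'" \
    "trixie=chroot /srv/chroot/trixie-arm64 sh -c 'cd /build/pkg && dpkg-buildpackage -b -uc -us'"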
So this is 'unreproducible' for now.
It might be the particular source package you're building, or a particular
toolchain difference, I dunno. It does not look like a qemu problem.
Thanks,
/mjt