On Thu, Jan 02, 2025 at 09:06:11PM +0100, Mateusz Jończyk wrote:
> Hello,
> 
> I also hit this bug on 32-bit mipsel, on the Malta platform in QEMU.
> 
> I used images from 
> https://ftp.debian.org/debian/dists/Debian12.8/main/installer-mipsel/current/images/malta/netboot/
> 
> Command line:
> 
> qemu-system-mipsel -cpu 24Kc -M malta -m 512 \
>     -kernel debian12.8/installer-mipsel/malta/vmlinuz-6.1.0-27-4kc-malta \
>     -initrd debian12.8/installer-mipsel/malta/initrd.gz \
>     -hda /media/1T-data/virtual_machines/debian_mips/hda.raw \
>     -append "root=/dev/sda1 nokaslr" -nographic
[snip]
> 
> I have run gdb and the offending instruction is in the ZSTD_RowFindBestMatch 
> function and it is the "prefx" instruction.
> 
>    0x555b02f8 <+1160>:    addiu    v0,v0,31
>    0x555b02fc <+1164>:    andi    v0,v0,0xf
>    0x555b0300 <+1168>:    sll    v0,v0,0x2
>    0x555b0304 <+1172>:    addu    v0,a3,v0
>    0x555b0308 <+1176>:    lw    v0,0(v0)
>    0x555b030c <+1180>:    b    0x555b0368 <ZSTD_RowFindBestMatch_noDict_5_4+1272>
>    0x555b0310 <+1184>:    move    t7,v0
> => 0x555b0314 <+1188>:    prefx    0x6,t7(s5)
>    0x555b0318 <+1192>:    subu    v0,a1,v0
>    0x555b031c <+1196>:    sw    t7,0(a2)
>    0x555b0320 <+1200>:    addiu    t9,a0,-1
>    0x555b0324 <+1204>:    and    a1,a1,v0
>    0x555b0328 <+1208>:    and    a0,a0,t9
> 
> It appears that this instruction requires a floating-point coprocessor and is 
> a CP1X instruction.
> It is used to prefetch locations from memory.

Thanks a lot for your analysis! (and to наб for reporting the problem with
lots of detail, too)
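
For what it's worth, the quoted disassembly line can be cross-checked against
the instruction encoding. Below is a minimal sketch in C, assuming the usual
MIPS32 layout for PREFX (COP1X major opcode 0x13, function field 0x0f, with
base, index and hint in the three 5-bit fields in between). The names
decode_prefx() and prefx_fields are made up for this illustration and are not
part of any library.

```c
#include <stdint.h>

/* Assumed encoding: COP1X | base | index | hint | 00000 | PREFX */
#define MIPS_OP_COP1X 0x13u /* major opcode, bits 31..26 */
#define MIPS_FN_PREFX 0x0fu /* function field, bits 5..0 */

/* Hypothetical helper type: the three register/hint fields of PREFX. */
typedef struct {
    unsigned base;  /* bits 25..21, base register number */
    unsigned index; /* bits 20..16, index register number */
    unsigned hint;  /* bits 15..11, prefetch hint */
} prefx_fields;

/* Returns 1 and fills *out if insn looks like PREFX, 0 otherwise. */
int decode_prefx(uint32_t insn, prefx_fields *out)
{
    if ((insn >> 26) != MIPS_OP_COP1X)
        return 0;
    if ((insn & 0x3fu) != MIPS_FN_PREFX)
        return 0;
    if (((insn >> 6) & 0x1fu) != 0) /* bits 10..6 are expected to be zero */
        return 0;
    out->base = (insn >> 21) & 0x1fu;
    out->index = (insn >> 16) & 0x1fu;
    out->hint = (insn >> 11) & 0x1fu;
    return 1;
}
```

Under these assumptions, "prefx 0x6,t7(s5)" should correspond to the word
0x4eaf300f (base = s5 = $21, index = t7 = $15, hint = 6), so dumping the word
at the faulting address in gdb would be one way to double-check the decoding.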

So, hm, if this is a prefetch instruction, is it possible that it would
correspond to one of the PREFETCH_L1() invocations, either in
ZSTD_RowFindBestMatch itself, or in ZSTD_row_prefetch()?

  
  https://sources.debian.org/src/libzstd/1.5.4%2Bdfsg2-5/lib/compress/zstd_lazy.c/#L1139
  https://sources.debian.org/src/libzstd/1.5.4%2Bdfsg2-5/lib/compress/zstd_lazy.c/#L823

If I'm reading the conditional compilation directives right, for GCC >= 4
the PREFETCH_L1() macro would be left as an exercise to the compiler... erm,
I mean, would be defined as __builtin_prefetch():

  
  https://sources.debian.org/src/libzstd/1.5.4%2Bdfsg2-5/lib/common/compiler.h/#L119
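
To make that a bit more concrete, here is a rough sketch of what I understand
the GCC branch to boil down to. It is a simplified stand-in, not a copy of
compiler.h: SKETCH_PREFETCH_L1() and sketch_sum() are names invented for this
example, and the loop only imitates the "prefetch base + variable offset"
pattern that might lead a compiler to pick an indexed prefetch instruction.

```c
#include <stddef.h>

/* Assumption: simplified imitation of the GCC >= 4 case, not zstd's code. */
#if defined(__GNUC__) && (__GNUC__ >= 4)
#  define SKETCH_PREFETCH_L1(ptr) \
       __builtin_prefetch((ptr), 0 /* read */, 3 /* high temporal locality */)
#else
#  define SKETCH_PREFETCH_L1(ptr) ((void)(ptr)) /* no-op fallback */
#endif

/* Hypothetical helper: sums table[idx[i]] and prefetches the next entry,
 * so the prefetched address is a base pointer plus a run-time offset. */
unsigned sketch_sum(const unsigned char *table, const unsigned *idx, size_t n)
{
    unsigned sum = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (i + 1 < n)
            SKETCH_PREFETCH_L1(table + idx[i + 1]);
        sum += table[idx[i]];
    }
    return sum;
}
```

Which instruction the builtin turns into is entirely up to the compiler and
its target flags, so the disassembly of something like this would have to be
inspected for each toolchain.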

And in the build log for libzstd-1.5.4+dfsg2-5 (the one linked below is from the
mips64el build), it seems that this particular file, zstd_lazy.c, was not
compiled with any special flags:

  
  https://buildd.debian.org/status/fetch.php?pkg=libzstd&arch=mips64el&ver=1.5.4%2Bdfsg2-5&stamp=1679182427&raw=0

  CC obj/conf_d0b7c101029993bfb103f90ea5393d0b/zstd_lazy.o
  mips64el-linux-gnuabi64-gcc -g -O2 -ffile-prefix-map=/<<PKGBUILDDIR>>=.
     -fstack-protector-strong -Wformat -Werror=format-security -DBACKTRACE_ENABLE=0
     -Wa,--noexecstack -Wdate-time -D_FORTIFY_SOURCE=2 -DXXH_NAMESPACE=ZSTD_ -DDEBUGLEVEL=0
     -DZSTD_LEGACY_SUPPORT=5 -DZSTD_MULTITHREAD -DZSTD_GZCOMPRESS -DZSTD_GZDECOMPRESS
     -DZSTD_LZMACOMPRESS -DZSTD_LZMADECOMPRESS -DZSTD_LZ4COMPRESS -DZSTD_LZ4DECOMPRESS
     -DZSTD_LEGACY_SUPPORT=5  -c -MT obj/conf_d0b7c101029993bfb103f90ea5393d0b/zstd_lazy.o
     -MMD -MP -MF obj/conf_d0b7c101029993bfb103f90ea5393d0b/zstd_lazy.d
     -o obj/conf_d0b7c101029993bfb103f90ea5393d0b/zstd_lazy.o
     ../lib/compress/zstd_lazy.c

[snip]
> Indeed, when using the 24Kf variant (qemu-system-mipsel -cpu 24Kf), zstd -9 
> works.
> 
> The question is why the Linux kernel's math-emu module (which is compiled in 
> and enabled) didn't catch and emulate it.
> 
>     root@mateusz-debian-mips:/sys/kernel/debug/mips# cat fpuemustats_clear
>     root@mateusz-debian-mips:/sys/kernel/debug/mips# zstd -9 </etc/fstab >/dev/null
>     Caught SIGILL signal, printing stack:
>     Illegal instruction
>     root@mateusz-debian-mips:/sys/kernel/debug/mips# grep -r . fpuemustats/*
>     fpuemustats/branches:11
>     fpuemustats/cp1ops:44
>     fpuemustats/cp1xops:1
>     fpuemustats/ds_emul:1
>     fpuemustats/emulated:535
>     fpuemustats/errors:0
>     fpuemustats/ieee754_inexact:4
>     fpuemustats/ieee754_invalidop:0
>     fpuemustats/ieee754_overflow:0
>     fpuemustats/ieee754_underflow:0
>     fpuemustats/ieee754_zerodiv:0
>     [...]

So yeah, does this look like a qemu bug, or a compiler bug? I'm still
not completely sure, but from the source code it does not really
seem to be a libzstd bug - it leaves __builtin_prefetch() to GCC...

Would it be possible for somebody with access to hardware to
test it on real hardware, not in qemu?
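
In case it helps whoever tries this, something along the following lines
might serve as a much smaller test than a full zstd run. It is only a sketch:
prefetch_raises_sigill() is a name I made up, and there is no guarantee that
a given GCC will emit prefx (rather than a plain pref, or nothing) for it, so
the disassembly should be checked with objdump -d before drawing conclusions.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

static sigjmp_buf sigill_env;

/* Jump back to the probe if the prefetch instruction is rejected. */
static void on_sigill(int signo)
{
    (void)signo;
    siglongjmp(sigill_env, 1);
}

/* Hypothetical probe: returns 1 if prefetching base[index] raised SIGILL,
 * 0 if it did not, and -1 if the signal handler could not be installed.
 * noinline keeps index a run-time value instead of a folded constant. */
__attribute__((noinline))
int prefetch_raises_sigill(const char *base, size_t index)
{
    struct sigaction sa, old;
    int trapped;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigill;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGILL, &sa, &old) != 0)
        return -1;

    if (sigsetjmp(sigill_env, 1) == 0) {
        __builtin_prefetch(base + index, 0, 3);
        trapped = 0;
    } else {
        trapped = 1; /* we got here via siglongjmp() from the handler */
    }

    sigaction(SIGILL, &old, NULL); /* restore the previous handler */
    return trapped;
}
```

Calling it from a tiny main() with a local buffer and printing the result on
both the 24Kc and 24Kf QEMU variants, and then on a real board, would show
whether the behaviour differs between them.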

G'luck,
Peter

-- 
Peter Pentchev  r...@ringlet.net r...@debian.org pe...@morpheusly.com
PGP key:        https://www.ringlet.net/roam/roam.key.asc
Key fingerprint 2EE7 A7A5 17FC 124C F115  C354 651E EFB0 2527 DF13
