Bug#946940: lshw crashes with floating point exception

Dave Gomboc Thu, 19 Dec 2019 23:10:20 -0800

Thanks for the pointers.  I have installed the utilities you mentioned, and
also fetched the Debian source code for lshw.  I also made sure that I have
no USB sticks plugged in (they are often formatted as FAT) and that I have
no drive images mounted in such a way that any FAT device should be present.

I can confirm that I see the same as you did in gdb: vs.sectors_per_cluster
is zero:

Program received signal SIGFPE, Arithmetic exception.
scan_fat (n=..., id=...) at fat.cc:237
237             if (cluster_count < FAT16_MAX)
(gdb) print cluster_count
Division by zero
(gdb) print vs.sectors_per_cluster
$1 = 0 '\000'
(gdb) info locals
dir = <optimized out>
sector_size_bytes = 512
dir_entries = 0
sect_count = <optimized out>
reserved_sct = 32
fat_size_sct = <optimized out>
root_cluster = <optimized out>
dir_size_sct = 0
cluster_count = <error reading variable cluster_count (Division by zero)>
root_start_sect = <optimized out>
start_data_sect = <optimized out>
buf = 0x0
buf_size = <optimized out>
label = 0x0
next_cluster = <optimized out>
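
To illustrate the failure mode in the session above: dividing by a
sectors_per_cluster of zero is what raises SIGFPE. The following is only a
minimal sketch of a guarded computation, not lshw's actual code; the helper
name safe_cluster_count and its parameters are hypothetical.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical helper (not taken from lshw): compute the number of data
// clusters, refusing to divide when the boot sector claims zero sectors
// per cluster or when the data area would start past the end of the volume.
std::optional<std::uint64_t> safe_cluster_count(std::uint64_t total_sectors,
                                                std::uint64_t first_data_sector,
                                                std::uint8_t sectors_per_cluster)
{
    if (sectors_per_cluster == 0)
        return std::nullopt;            // would otherwise trap with SIGFPE
    if (first_data_sector > total_sectors)
        return std::nullopt;            // inconsistent geometry
    return (total_sectors - first_data_sector) / sectors_per_cluster;
}
```

A caller could treat an empty result as "not a FAT volume" and skip the
partition instead of crashing.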

I do have a pair of mirrored drives that cfdisk describes as having some
partitions of type "Microsoft basic data".

Disk: /dev/sdp (the other drive also has the same partition info)
Size: 2.75TiB, 30000592982016 bytes, 5860533168 sectors
Label: gpt, identifier: (redacted)
Device     ...  Type
/dev/sdp1       Microsoft basic data
/dev/sdp2       Linux RAID
/dev/sdp3       Linux RAID
/dev/sdp4       Microsoft basic data
Free space
/dev/sdp5       BIOS boot
/dev/sdp6       Microsoft basic data

/dev/sdp6 is actually formatted as ext2 and mounted as /boot.  I tried
mounting all of the other partitions with vfat but did not succeed:

sudo mount -o ro -t vfat /dev/sdp5 /mnt/fat
[ 3288.241356] FAT-fs (sdp5): bogus sectors per cluster 0
[ 3288.241436] FAT-fs (sdp5): Can't find a valid FAT filesystem
mount: /mnt/fat: wrong fs type, bad option, bad superblock on /dev/sdp5,
missing codepage or helper program, or other error.

A similar message is also emitted when I try to mount either /dev/sdp1 or
/dev/sdp4.

I'm obviously not going to be erasing my BIOS boot partition. :-)  It seems
that lshw contains some assumption that a valid FAT filesystem exists when
it finds something that sort of looks like it might be one but actually
isn't one.

I went back to "sudo gdb lshw":
(gdb) run
[crash info as above]
(gdb) print vs.type.sector
$1 = "\000 \000\000\000\000\000\000\002\000\000\000\001\000\006", '\000' <repeats 13 times>, "\200\000)Q%%#NO NAME    FAT32 \016\037\276t~\254\"\300t\006\264\016\315\020\353\365\264\000\315\026\264\000\315\031\353\376This partition does not have an operating system loader installed on it.\n\rPress a key to reboot...\000MSW", '\000' <repeats 292 times>...

This seems like data that the software needs to detect and not treat as a
typical FAT filesystem.
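
As a rough illustration of the kind of check I mean, here is a sketch that
rejects a boot sector whose basic parameters are implausible. It is not a
patch against lshw: the function name looks_like_fat_bpb is made up, and the
only things taken as given are the standard FAT boot sector offsets (bytes
per sector at offset 11, sectors per cluster at 13, reserved sectors at 14,
number of FATs at 16).

```cpp
#include <cstddef>
#include <cstdint>

// Read a little-endian 16-bit value from an unaligned byte pointer.
static std::uint16_t le16(const std::uint8_t* p)
{
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

static bool is_power_of_two(unsigned v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

// Hypothetical sanity check (not lshw code): returns false when the
// sector cannot describe a usable FAT volume, so that a caller never
// reaches a division by sectors_per_cluster.
bool looks_like_fat_bpb(const std::uint8_t* sector, std::size_t len)
{
    if (sector == nullptr || len < 512)
        return false;

    const unsigned bytes_per_sector    = le16(sector + 11);
    const unsigned sectors_per_cluster = sector[13];
    const unsigned reserved_sectors    = le16(sector + 14);
    const unsigned fat_count           = sector[16];

    if (!is_power_of_two(bytes_per_sector) ||
        bytes_per_sector < 512 || bytes_per_sector > 4096)
        return false;
    if (!is_power_of_two(sectors_per_cluster))   // also rejects 0
        return false;
    if (reserved_sectors == 0 || fat_count == 0)
        return false;
    return true;
}
```

Note that the "FAT32" string in the dump above is not enough on its own: the
sector carries that label while sectors_per_cluster is still zero.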

Dave

On Wed, 18 Dec 2019 at 12:00, Bernhard Übelacker <bernha...@mailbox.org>
wrote:

> Hello Dave,
> I am not involved in packaging lshw, just looking
> at some random crash bug reports.
>
> First, when reporting crashes, that line from dmesg is usually
> not sufficient. A simple way to retrieve
> some more information is to run 'catchsegv lshw'.
>
> Better would be if it is possible to install something like
> systemd-coredump. That way a backtrace should be printed to
> 'journalctl --no-pager'.
>
> And even better would be to additionally install the matching
> dbgsym packages, e.g. lshw-dbgsym.
> This requires enabling another package repository.
> More details in [1].
>
> ----
>
> But as lshw is small enough, I guess I found something.
> The instruction offset in combination with the 'divide error'
> points to this line:
>
> (gdb) list fat.cc:220
> 227             cluster_count /= vs.sectors_per_cluster;
>
> Could you confirm to have a FAT partition attached to the system?
> Maybe a damaged or ancient one, maybe a floppy?
>
> If vs.sectors_per_cluster would be 0 the crash would happen
> exactly like you experienced it.
>
> (gdb) bt
> #0  0x00005555555f2e73 in scan_fat (n=..., id=...) at fat.cc:237
> #1  0x00005555555edea4 in detect_fat (n=..., s=...) at volumes.cc:513
> #2  0x00005555555eb895 in scan_volume (n=..., s=...) at volumes.cc:1075
> #3  0x00005555555e4115 in detect_dosmap (s=..., n=...) at partitions.cc:1197
> #4  0x00005555555e0e96 in scan_partitions (n=...) at partitions.cc:1386
> #5  0x00005555555cb00a in scan_disk (n=...) at disk.cc:79
> #6  0x00005555555c81d5 in scan_sg (n=...) at scsi.cc:762
> #7  0x00005555555c9e75 in scan_scsi (n=...) at scsi.cc:909
> #8  0x000055555558bffd in scan_system (system=...) at main.cc:134
> #9  0x000055555557a50c in main (argc=<optimized out>, argv=<optimized out>) at lshw.cc:247
>
>
> Kind regards,
> Bernhard
>
> P.S.:
> You compiled upstream package lshw-B.02.18.tar.gz and experienced
> a crash. I guess that one is fixed in upstream git in the last
> commit to usb.cc, which is unrelated to the above issue.
> More details at the bottom of attached file.
>
>
> [1]
> https://wiki.debian.org/HowToGetABacktrace#Installing_the_debugging_symbols
>
