When the file system is stress-tested, the device (an internal NVMe
drive) seems to be lost.
A recent photograph:
<https://photos.app.goo.gl/wB7gZKLF5PQzusrz7>
Transcribed manually:
nvme0: Resetting controller due to a timeout.
nvme0: resetting controller
nvme0: controller ready did not become 0 within 5500 ms
nvme0: failing outstanding i/o
nvme0: WRITE sqid:2 cid:115 nsid:1 lba:296178856 len:64
nvme0: ABORTED - BY REQUEST (00/07) sqid:2 cid:115 cdw0:0
g_vfs_done():nvd0p2[WRITE(offset=151370924032, length=32768)]error = 6
UFS: forcibly unmounting /dev/nvd0p2 from /
nvme0: failing outstanding i/o
… et cetera. (If I read errno(2) right, error = 6 is ENXIO, "Device
not configured", i.e. the device went away underneath the file system.)
Is this a sure sign of a hardware problem? Or must I do something
special to gain reliability under stress?
I don't know how to interpret parts of the manual page for nvme(4).
There's direction to include this line in loader.conf(5):
nvme_load="YES"
– however, when I used kldload(8), it reported that the module was
already loaded, or built into the kernel.
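I assume a more direct check is something like this (a minimal sketch;
with a stock GENERIC kernel, where nvme(4) and nvd(4) are compiled in,
the loader.conf(5) line should be redundant):

    # Query the kernel's module list; a hit here means the driver is
    # already present, either loaded as a module or linked into the
    # kernel itself.
    kldstat -m nvme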
Using StressDisk:
<https://github.com/ncw/stressdisk>
– failures typically occur after around six minutes of testing.
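For reference, the basic invocation looks something like this (the
target directory here is only an example):

    # Fill the target directory with check files, then read them back
    # repeatedly, comparing against the reference data, until the run
    # ends or a mismatch/error occurs.
    stressdisk run /mnt/stresstest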
The drive is very new, less than 2 TB written:
<https://bsd-hardware.info/?probe=7138e2a9e7&log=smartctl>
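In case it helps, the drive's own health and error log pages can also
be read natively with nvmecontrol(8), along these lines (a sketch; page
2 is the SMART/health log, page 1 the error log):

    # SMART / health information log page
    nvmecontrol logpage -p 2 nvme0
    # Error information log page
    nvmecontrol logpage -p 1 nvme0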
I do suspect a hardware problem, because two prior installations of
Windows 10 became non-bootable.
Also: I see peculiarities when using fsck_ffs(8), which I can describe
later. Perhaps to be expected, if there's a problem with the drive.