Hi George,

It would be useful if you pasted the full SMART attribute stats here. The
command is:
sudo smartctl /dev/sda --all
Replace sda with the correct device name as needed.
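
In that output, the attributes worth watching are Reallocated_Sector_Ct,
Current_Pending_Sector and Offline_Uncorrectable. A quick way to pull just
those out (a sketch, assuming the drive reports these standard ATA attribute
names):
sudo smartctl /dev/sda --attributes | grep -E 'Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable'
A non-zero, growing Current_Pending_Sector count is what usually goes
together with the "auto reallocate failed" messages you are seeing.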

Run a "long" SMART test on this drive. It should be able to map out the bad
sectors so that Linux no longer sees the errors, unless the bad sectors keep
growing and changing because the drive is dying. Sometimes you need to
repeat the long test until the number of reallocated sectors stops growing
(at least for the time being, if the HDD is dying).

How to run the long SMART test:
sudo smartctl /dev/sda --test=long
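
The test runs in the background inside the drive and can take several hours
on a 3TB disk. You can check progress and the final result afterwards with
(any reasonably recent smartmontools should have this):
sudo smartctl /dev/sda --log=selftest
The self-test log will show "Completed without error" or "Completed: read
failure" together with the LBA of the first error, which you can compare
against the sectors in your kernel log.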

The next step, if you want to keep using this drive (only to some extent, as
you will never be able to consider it *reliable* for storing important
data), is to use the excellent (paid) SpinRite program by Steve Gibson. It
can recover data from unrecoverable sectors and map out ALL the bad sectors,
so that the remaining ones work smoothly and the Linux kernel isn't thrown
off every few minutes.

https://www.grc.com/sr/spinrite.htm
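
Separately from SpinRite, a free but destructive sketch for a single sector:
once a specific sector is known to be unreadable (sector 50280 from your
log, for example) and its data is written off, you can force the drive to
remap it by overwriting it with hdparm. This writes zeros to that sector, so
only do it on a sector whose data you have already given up on:
sudo hdparm --write-sector 50280 --yes-i-know-what-i-am-doing /dev/sdb
Overwriting an unreadable sector normally makes the firmware reallocate it
from the spare pool, which should show up as an increase in
Reallocated_Sector_Ct.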


On 10/08/2024 09:20, George at Clug wrote:
Hi,

In case there might be a known, fixable fault that could cause this, does
anyone know what the following errors indicate?

Aug 10 17:30:51 srv01 kernel: ata6: EH complete
Aug 10 17:30:54 srv01 kernel: ata6.00: exception Emask 0x0 SAct 0x4000
SErr 0xc0000 action 0x0
Aug 10 17:30:54 srv01 kernel: ata6.00: irq_stat 0x40000008
Aug 10 17:30:54 srv01 kernel: ata6: SError: { CommWake 10B8B }
Aug 10 17:30:54 srv01 kernel: ata6.00: failed command: READ FPDMA QUEUED
Aug 10 17:30:54 srv01 kernel: ata6.00: cmd
60/08:70:68:c4:00/00:00:00:00:00/40 tag 14 ncq dma 4096 in
Aug 10 17:30:54 srv01 kernel: ata6.00: status: { DRDY ERR }
Aug 10 17:30:54 srv01 kernel: ata6.00: error: { UNC }
Aug 10 17:30:54 srv01 kernel: ata6.00: configured for UDMA/133
Aug 10 17:30:54 srv01 kernel: sd 5:0:0:0: [sdb] tag#14 FAILED Result:
hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=2s
Aug 10 17:30:54 srv01 kernel: sd 5:0:0:0: [sdb] tag#14 Sense Key :
Medium Error [current]
Aug 10 17:30:54 srv01 kernel: sd 5:0:0:0: [sdb] tag#14 Add. Sense:
Unrecovered read error - auto reallocate failed
Aug 10 17:30:54 srv01 kernel: sd 5:0:0:0: [sdb] tag#14 CDB: Read(16) 88
00 00 00 00 00 00 00 c4 68 00 00 00 08 00 00
Aug 10 17:30:54 srv01 kernel: I/O error, dev sdb, sector 50280 op
0x0:(READ) flags 0x0 phys_seg 1 prio class 2
Aug 10 17:30:54 srv01 kernel: Buffer I/O error on dev sdb, logical block
6285, async page read

I am running badblocks against a Western Digital 3TB WDC
WD30EFRX-68AX9N0, and it keeps generating the above.
https://www.storagereview.com/review/western-digital-red-nas-hard-drive-review-wd30efrx
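
(For anyone following along, a typical non-destructive invocation looks
something like this; -s shows progress and -v reports errors as they are
found, while the destructive -w write-mode test is a different matter:
sudo badblocks -sv /dev/sdb
Note that badblocks counts in 1024-byte blocks by default, unless you passed
-b, so multiply its block numbers by two to get the 512-byte sector numbers
the kernel log reports.)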

I have changed the port it is connected to and the SATA cable, but the
errors still follow the disk drive.

I suspect that the error message "Medium Error" means just that: an area of
the disk has failed, hence "Unrecovered read error".

Sadly, "Unrecovered read error" also comes with "auto reallocate failed", so
whatever data was on the failed area is gone forever. Not to worry, backups
mean the important data is safe, but it does mean a few hours of effort to
replace the drive, test the replacement, and then restore the data. Sadly, I
was just starting to use this storage for testing, and now I will have to
copy the test data over to the replacement drive again.

Bad blocks start at 21632 and so far continue past 26606.

George.

Home test lab, Debian Bookworm, KVM host server. AMD Ryzen 9 3900X CPU
and motherboard. The drive was mounted as spare data storage.



--
With kindest regards, Piotr.

⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢠⠒⠀⣿⡁ Debian - The universal operating system
⢿⡄⠘⠷⠚⠋⠀ https://www.debian.org/
⠈⠳⣄⠀⠀⠀⠀
