Re: I/O errors during RAID check but no SMART errors

2024-10-10 Thread Franco Martelli
On 09/10/24 at 21:10, Jochen Spieker wrote:
> Andy Smith:
>> Hi,
>> On Wed, Oct 09, 2024 at 08:41:38PM +0200, Franco Martelli wrote:
>>> Do you know whether MD is clever enough to send an email to root when it fails the device? Or have I to keep an eye on /proc/mdstat?
>> For more than a decade mdadm has s

Re: I/O errors during RAID check but no SMART errors

2024-10-09 Thread Jochen Spieker
Andy Smith:
> Hi,
>
> On Wed, Oct 09, 2024 at 08:41:38PM +0200, Franco Martelli wrote:
>> Do you know whether MD is clever enough to send an email to root when it
>> fails the device? Or have I to keep an eye on /proc/mdstat?
>
> For more than a decade mdadm has shipped with a service that runs i

Re: I/O errors during RAID check but no SMART errors

2024-10-09 Thread Andy Smith
Hi,
On Wed, Oct 09, 2024 at 08:41:38PM +0200, Franco Martelli wrote:
> Do you know whether MD is clever enough to send an email to root when it
> fails the device? Or have I to keep an eye on /proc/mdstat?
For more than a decade mdadm has shipped with a service that runs in monitor mode to do thi

Re: I/O errors during RAID check but no SMART errors

2024-10-09 Thread Franco Martelli
On 08/10/24 at 20:40, Andy Smith wrote:
> Hi,
> On Tue, Oct 08, 2024 at 04:58:46PM +0200, Jochen Spieker wrote:
>> Why is the RAID still considered healthy? At some point I would expect the disk to be kicked from the RAID.
> This will happen when/if MD can't compensate by reading data from other m

Re: I/O errors during RAID check but no SMART errors

2024-10-09 Thread Jochen Spieker
Michael Kjörling:
> On 8 Oct 2024 11:29 -0400, from d...@randomstring.org (Dan Ritter):
>>
>> This looks like a drive which is old and starting to wear out
>> but is not there yet. The raw read error rate is starting to
>> creep up but isn't at a threshold.
>
> I agree. The almost 62000 hours is

Re: I/O errors during RAID check but no SMART errors

2024-10-09 Thread Jochen Spieker
e...@gmx.us:
> On 10/8/24 16:07, Jochen Spieker wrote:
>>| Oct 06 14:27:11 jigsaw kernel: I/O error, dev sdb, sector 9361257600 op 0x0:(READ) flags 0x0 phys_seg 150 prio class 3
>>| Oct 06 14:27:30 jigsaw kernel: I/O error, dev sdb, sector 9361275264 op 0x0:(READ) flags 0x4000 phys_seg 161 pr

Re: I/O errors during RAID check but no SMART errors

2024-10-08 Thread Michael Kjörling
On 8 Oct 2024 11:29 -0400, from d...@randomstring.org (Dan Ritter):
>> The disk has been running continuously for seven years now and I am
>> running out of space anyway, so I already ordered a replacement. But I
>> do not fully understand what is happening.
>
> The drive is dying, slowly. In this

Re: I/O errors during RAID check but no SMART errors

2024-10-08 Thread eben
On 10/8/24 16:07, Jochen Spieker wrote:
| Oct 06 14:27:11 jigsaw kernel: I/O error, dev sdb, sector 9361257600 op 0x0:(READ) flags 0x0 phys_seg 150 prio class 3
| Oct 06 14:27:30 jigsaw kernel: I/O error, dev sdb, sector 9361275264 op 0x0:(READ) flags 0x4000 phys_seg 161 prio class 3
| Oct 06 1

Re: I/O errors during RAID check but no SMART errors

2024-10-08 Thread Jochen Spieker
Andy Smith:
> On Tue, Oct 08, 2024 at 04:58:46PM +0200, Jochen Spieker wrote:
>> The way I understand these messages is that some sectors cannot be read
>> from sdb at all and the disk is unable to reallocate the data somewhere
>> else (probably because it doesn't know what the data should be in th

Re: I/O errors during RAID check but no SMART errors

2024-10-08 Thread Jochen Spieker
Dan Ritter:
> Jochen Spieker wrote:
>
>> The sector number mentioned at the bottom is increasing during the
>> check.
>
> So it repeats, and it's contiguous. That suggests a flaw in the
> drive itself.
It definitely looks like that:
| Oct 06 14:27:11 jigsaw kernel: I/O error, dev sdb, sector 9
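The "contiguous" observation can be checked by pulling the sector numbers out of the kernel messages and looking at the gaps between them. Below is a minimal Python sketch of that idea, not anything posted in the thread. The function names `failed_sectors` and `gaps` are made up for illustration, the regular expression is only fitted to the two complete log lines quoted in this thread, and other kernel versions may word the message differently.

```python
import re

# Fitted to the kernel lines quoted in the thread, e.g.
# "I/O error, dev sdb, sector 9361257600 op 0x0:(READ) ..."
# Assumption: other kernels may format this message differently.
IO_ERROR_RE = re.compile(r"I/O error, dev (\w+), sector (\d+)")


def failed_sectors(log_text, device):
    """Return the sorted sector numbers reported as failing for one device."""
    sectors = []
    for match in IO_ERROR_RE.finditer(log_text):
        if match.group(1) == device:
            sectors.append(int(match.group(2)))
    return sorted(sectors)


def gaps(sectors):
    """Return the distance between each pair of neighbouring sectors."""
    return [later - earlier for earlier, later in zip(sectors, sectors[1:])]


# The two complete log lines quoted in the thread.
SAMPLE_LOG = """\
Oct 06 14:27:11 jigsaw kernel: I/O error, dev sdb, sector 9361257600 op 0x0:(READ) flags 0x0 phys_seg 150 prio class 3
Oct 06 14:27:30 jigsaw kernel: I/O error, dev sdb, sector 9361275264 op 0x0:(READ) flags 0x4000 phys_seg 161 prio class 3
"""

if __name__ == "__main__":
    sectors = failed_sectors(SAMPLE_LOG, "sdb")
    print(sectors, gaps(sectors))
```

Small gaps relative to the size of the disk would be consistent with one damaged region rather than errors scattered over the whole surface. With real data the input would come from the system log rather than a pasted string.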

Re: I/O errors during RAID check but no SMART errors

2024-10-08 Thread Andy Smith
Hi,
On Tue, Oct 08, 2024 at 04:58:46PM +0200, Jochen Spieker wrote:
> The way I understand these messages is that some sectors cannot be read
> from sdb at all and the disk is unable to reallocate the data somewhere
> else (probably because it doesn't know what the data should be in the
> first pl

Re: I/O errors during RAID check but no SMART errors

2024-10-08 Thread Dan Ritter
Jochen Spieker wrote:
> I have two disks in a RAID-1:
>
> | $ cat /proc/mdstat
> | Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
> | md0 : active raid1 sdb1[2] sdc1[0]
> | 5860390400 blocks super 1.2 [2/2] [UU]
> | bitmap: 5/44 pages [20KB],
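For anyone who does want to keep an eye on /proc/mdstat by hand, the `[2/2] [UU]` part of the output above is the bit to look at. What follows is a rough Python sketch under stated assumptions, not a replacement for mdadm's own monitoring. The function name `degraded_arrays` is made up for illustration, array names are assumed to look like `md0`, and the sample text is a shortened copy of the output quoted in this thread. As the thread itself shows, an array can still report `[UU]` while a member disk is throwing read errors, so this only catches the case where MD has already failed a device.

```python
import re

# Matches an array header such as "md0 : active raid1 sdb1[2] sdc1[0]".
# Assumption: arrays are named md<number>; other naming schemes are ignored.
ARRAY_RE = re.compile(r"^(md\d+)\s*:")
# Matches the member summary such as "[2/2] [UU]".
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")


def degraded_arrays(mdstat_text):
    """Return the names of arrays whose status line shows a missing member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = ARRAY_RE.match(line)
        if header:
            current = header.group(1)
            continue
        status = STATUS_RE.search(line)
        if status and current is not None:
            expected, active, flags = status.groups()
            # Fewer active members than expected, or a "_" flag, means degraded.
            if int(active) < int(expected) or "_" in flags:
                degraded.append(current)
            current = None
    return degraded


# Shortened copy of the output quoted in the thread.
SAMPLE = """\
Personalities : [raid1]
md0 : active raid1 sdb1[2] sdc1[0]
      5860390400 blocks super 1.2 [2/2] [UU]
"""

if __name__ == "__main__":
    print(degraded_arrays(SAMPLE))
```

On a live system the text would be read from /proc/mdstat instead of the sample string. A healthy two-disk mirror like the one quoted yields an empty list.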

I/O errors during RAID check but no SMART errors

2024-10-08 Thread Jochen Spieker
Hey,
please forgive me for posting a question that is not Debian-specific, but maybe somebody here can explain this to me. Ten years ago I would have posted to Usenet instead.
I have two disks in a RAID-1:
| $ cat /proc/mdstat
| Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5