On Tue, Apr 30, 2019 at 8:05 AM Michelle Sullivan <[email protected]> wrote:
>
> Michelle Sullivan
> http://www.mhix.org/
> Sent from my iPad
>
> > On 01 May 2019, at 00:01, Alan Somers <[email protected]> wrote:
> >
> >> On Tue, Apr 30, 2019 at 7:30 AM Michelle Sullivan <[email protected]>
> >> wrote:
> >>
> >> Karl Denninger wrote:
> >>> On 4/30/2019 05:14, Michelle Sullivan wrote:
> >>>>>> On 30 Apr 2019, at 19:50, Xin LI <[email protected]> wrote:
> >>>>>> On Tue, Apr 30, 2019 at 5:08 PM Michelle Sullivan <[email protected]>
> >>>>>> wrote:
> >>>>>> but in my recent experience 2 issues colliding at the same time
> >>>>>> results in disaster
> >>>>> Do we know exactly what kind of corruption happen to your pool? If you
> >>>>> see it twice in a row, it might suggest a software bug that should be
> >>>>> investigated.
> >>>>>
> >>>>> All I know is it’s a checksum error on a meta slab (122) and from what
> >>>>> I can gather it’s the spacemap that is corrupt... but I am no expert.
> >>>>> I don’t believe it’s a software fault as such, because this was cause
> >>>>> by a hard outage (damaged UPSes) whilst resilvering a single (but
> >>>>> completely failed) drive. ...and after the first outage a second
> >>>>> occurred (same as the first but more damaging to the power hardware)...
> >>>>> the host itself was not damaged nor were the drives or controller.
> >>> .....
> >>>>> Note that ZFS stores multiple copies of its essential metadata, and in
> >>>>> my experience with my old, consumer grade crappy hardware (non-ECC RAM,
> >>>>> with several faulty, single hard drive pool: bad enough to crash almost
> >>>>> monthly and damages my data from time to time),
> >>>> This was a top end consumer grade mb with non ecc ram that had been
> >>>> running for 8+ years without fault (except for hard drive platter
> >>>> failures.). Uptime would have been years if it wasn’t for patching.
> >>> Yuck.
> >>>
> >>> I'm sorry, but that may well be what nailed you.
> >>>
> >>> ECC is not just about the random cosmic ray. It also saves your bacon
> >>> when there are power glitches.
> >>
> >> No. Sorry no. If the data is only half to disk, ECC isn't going to save
> >> you at all... it's all about power on the drives to complete the write.
> >
> > ECC RAM isn't about saving the last few seconds' worth of data from
> > before a power crash. It's about not corrupting the data that gets
> > written long before a crash. If you have non-ECC RAM, then a cosmic
> > ray/alpha ray/row hammer attack/bad luck can corrupt data after it's
> > been checksummed but before it gets DMAed to disk. Then disk will
> > contain corrupt data and you won't know it until you try to read it
> > back.
>
> I know this... unless I misread Karl’s message he implied the ECC would have
> saved the corruption in the crash... which is patently false... I think
> you’ll agree..

I don't think that's what Karl meant. I think he meant that the
non-ECC RAM could've caused latent corruption that was only detected
when the crash forced a reboot and resilver.

>
> Michelle
>
> >
> > -Alan
> >
> >>>
> >>> Unfortunately however there is also cache memory on most modern hard
> >>> drives, most of the time (unless you explicitly shut it off) it's on for
> >>> write caching, and it'll nail you too. Oh, and it's never, in my
> >>> experience, ECC.
> >
> > Fortunately, ZFS never sends non-checksummed data to the hard drive.
> > So an error in the hard drive's cache ram will usually get detected by
> > the ZFS checksum.
> >
> >>
> >> No comment on that - you're right in the first part, I can't comment if
> >> there are drives with ECC.
> >>
> >>>
> >>> In addition, however, and this is something I learned a LONG time ago
> >>> (think Z-80 processors!) is that as in so many very important things
> >>> "two is one and one is none."
> >>>
> >>> In other words without a backup you WILL lose data eventually, and it
> >>> WILL be important.
> >>>
> >>> Raidz2 is very nice, but as the name implies it you have two
> >>> redundancies. If you take three errors, or if, God forbid, you *write*
> >>> a block that has a bad checksum in it because it got scrambled while in
> >>> RAM, you're dead if that happens in the wrong place.
> >>
> >> Or in my case you write part data therefore invalidating the checksum...
> >>>
> >>>> Yeah.. unlike UFS that has to get really really hosed to restore from
> >>>> backup with nothing recoverable it seems ZFS can get hosed where issues
> >>>> occur in just the wrong bit... but mostly it is recoverable (and my
> >>>> experience has been some nasty shit that always ended up being
> >>>> recoverable.)
> >>>>
> >>>> Michelle
> >>> Oh that is definitely NOT true.... again, from hard experience,
> >>> including (but not limited to) on FreeBSD.
> >>>
> >>> My experience is that ZFS is materially more-resilient but there is no
> >>> such thing as "can never be corrupted by any set of events."
> >>
> >> The latter part is true - and my blog and my current situation is not
> >> limited to or aimed at FreeBSD specifically, FreeBSD is my experience.
> >> The former part... it has been very resilient, but I think (based on
> >> this certain set of events) it is easily corruptible and I have just
> >> been lucky. You just have to hit a certain write to activate the issue,
> >> and whilst that write and issue might be very very difficult (read: hit
> >> and miss) to hit in normal every day scenarios it can and will
> >> eventually happen.
> >>
> >>> Backup
> >>> strategies for moderately large (e.g. many Terabytes) to very large
> >>> (e.g. Petabytes and beyond) get quite complex but they're also very
> >>> necessary.
> >>>
> >> and there in lies the problem. If you don't have a many 10's of
> >> thousands of dollars backup solutions, you're either:
> >>
> >> 1/ down for a looooong time.
> >> 2/ losing all data and starting again...
> >>
> >> ..and that's the problem... ufs you can recover most (in most
> >> situations) and providing the *data* is there uncorrupted by the fault
> >> you can get it all off with various tools even if it is a complete
> >> mess.... here I am with the data that is apparently ok, but the
> >> metadata is corrupt (and note: as I had stopped writing to the drive
> >> when it started resilvering the data - all of it - should be intact...
> >> even if a mess.)
> >>
> >> Michelle
> >>
> >> --
> >> Michelle Sullivan
> >> http://www.mhix.org/
> >>
> >> _______________________________________________
> >> [email protected] mailing list
> >> https://lists.freebsd.org/mailman/listinfo/freebsd-stable
> >> To unsubscribe, send any mail to "[email protected]"
