On Sun, Aug 25, 2024 at 11:41:44PM +0200, Maximilian Engelhardt wrote:
> I am changing the severity back to normal as the xen package works fine for
> many people without any serious issues. From your last message it also seems

Yet for some lucky people data is corrupted/lost. There could be other
people who can reproduce this but don't send e-mail saying "me too" to
this bug report.

Presently the main reason there aren't very many reproductions is that
few people bother to use RAID with flash. Initial reports are that SSDs
have a lower failure rate than disks, but the failure rate isn't even
close to zero. Meanwhile the data loss/corruption reproduces easily.

While both cases in #988477 were on systems with AMD hardware, I am
presently doubtful that is a requirement. The most similar known bug was
found to be more severe on AMD hardware, but also occurs on Intel
hardware. I suspect this issue may be similar, simply no one has noticed
the problem yet...

> you found a workaround for your problem. Please don't change the bug severity

Something was found which seems to have made another issue more
prominent. It may reduce the rate at which data corruption occurs, but
I've since confirmed data loss/corruption continues to occur.

> without at least giving an explanation why you think the new severity is
> justified.

I had thought the original reporter's justification was sufficient.

This appears to have some specific requirements, but if you meet them
you may be in trouble before alerts trigger. So far both reports are
with AMD machines with IOMMUv2 functionality (I tried on a machine with
IOMMUv1/GART and it didn't reproduce). Both reports feature Samsung SATA
devices. An NVMe device from another manufacturer also showed the issue
(I'm almost certain Samsung NVMe devices will also show the issue).

I suspect Intel machines may also be affected by this issue, but it may
not manifest as severely. I suspect this is a case of people with AMD
machines being a bit more wary of hardware failure (thus actually
bothering to use RAID1 even with flash devices).

> From the few log lines in this bug report this seems to be an upstream issue
> with xen or the linux kernel.
> Please report your observations upstream. The
> Debian xen team does not have the resources and knowledge to debug or fix
> such problems. Once the issue has been identified and fixed upstream we can
> see if we can backport a fix to our Debian packages, but this is only
> possible once an upstream fix has landed.

Perhaps it has become easier to report things upstream, but the original
procedure was that reporters were supposed to report to bugs.debian.org
and NOT forward upstream.

The other problem is I've run into a chasm with upstream and no way to
build a bridge across. I do have one more thing to try, but don't yet
have a time-frame for when I'll check that.

-- 
(\___(\___(\______          --=> 8-) EHM <=--          ______/)___/)___/)
 \BS (    |         ehem+sig...@m5p.com  PGP 87145445         |    )   /
  \_CS\   |  _____  -O #include <stddisclaimer.h> O-   _____  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445