Package: libc6
Version: 2.19-4
Severity: grave
Justification: causes non-serious data loss

Intel Broadwell-H and Skylake-S/H have critical errata that make HLE extremely dangerous to use on those processors. With hardware lock elision enabled in glibc/libpthread, the result is unpredictable behavior: process crashes when you are lucky, data corruption when you are not.

Broadwell errata BBD50 (desktop/mobile), BDW50 (server):

  An HLE (Hardware Lock Elision) transactional region begins with an instruction with the XACQUIRE prefix. Due to this erratum, reads from within the transactional region of the memory destination of that instruction may return the value that was in memory before the transactional region began.

According to the Intel errata list, a firmware fix is possible, but I have no idea whether it is done by toggling a boot-locked MSR that disables HLE, or through a microcode update. The MSR is more likely, but if it is a microcode update, it is going to be as much of a hazard as the Haswell one that disabled TSX+HLE.

I recommend that we extend the HLE blacklist in glibc to also include CPU signature 0x40671. This will disable HLE on the Xeon E3-1200v4 and on 5th-generation Core i5/i7. These processors are supposed to already have TSX disabled (errata BBD51/BDW51).

Skylake's latest public specification update still doesn't list any HLE errata, but it is not really recent. On the other hand, there is a Gentoo user's report that Skylake is also unstable when HLE is enabled in glibc, and that the crashes stop when glibc is compiled without lock elision.

For that reason, it might be a good idea to also blacklist HLE on CPU signatures 0x506e1, 0x506e2 and 0x506e3, which would disable HLE on Skylake-S and Skylake-H (6th-generation Core i5/i7). This won't cover the Skylake Xeon E3-1200v5, for which there are no reports of breakage (nor a public specification update that I could find).

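To make the proposed blacklist entries concrete, here is a minimal sketch of how a CPU signature (the EAX value returned by CPUID leaf 1) splits into family, model and stepping, and how a check against the signatures named in this report could look. This is an illustration only, not glibc's actual code: the names HLE_BLACKLIST, decode_signature and is_hle_blacklisted are hypothetical, and the set holds only the signatures proposed above, not whatever the existing blacklist already contains.

```python
# Hypothetical sketch: NOT glibc's implementation, names are made up.
# Signatures proposed for HLE blacklisting in this report.
HLE_BLACKLIST = frozenset({
    0x40671,                    # Broadwell-H
    0x506E1, 0x506E2, 0x506E3,  # Skylake-S / Skylake-H
})


def decode_signature(sig):
    """Split a CPUID leaf-1 EAX value into (family, model, stepping)."""
    stepping = sig & 0xF
    base_model = (sig >> 4) & 0xF
    base_family = (sig >> 8) & 0xF
    ext_model = (sig >> 16) & 0xF
    ext_family = (sig >> 20) & 0xFF

    # The extended family is only added when the base family is 0xF.
    family = base_family + ext_family if base_family == 0xF else base_family
    # The extended model only applies to base families 0x6 and 0xF.
    if base_family in (0x6, 0xF):
        model = (ext_model << 4) | base_model
    else:
        model = base_model
    return family, model, stepping


def is_hle_blacklisted(sig):
    """Return True if HLE should be disabled for this exact signature."""
    return sig in HLE_BLACKLIST


if __name__ == "__main__":
    for sig in sorted(HLE_BLACKLIST):
        family, model, stepping = decode_signature(sig)
        print(f"{sig:#x}: family {family:#x}, model {model:#x}, "
              f"stepping {stepping}")
```

Matching on the full signature, as sketched here, blacklists individual steppings; that is why the Skylake proposal lists three separate values, and why a later stepping would not be covered unless it is added explicitly.
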
References:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=762195
https://bbs.archlinux.org/viewtopic.php?id=202545

In hindsight, it looks like we would have been better off disabling lock elision entirely for Debian jessie when we fixed #762195. Something to consider when the time comes to fix this bug in stable through a stable update...

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh