> -----Original Message----- > From: [email protected] <[email protected]> On > Behalf Of Borislav Petkov > Sent: Thursday, May 16, 2019 12:21 PM > To: Ghannam, Yazen <[email protected]> > Cc: Luck, Tony <[email protected]>; [email protected]; > [email protected]; [email protected] > Subject: Re: [PATCH v3 5/6] x86/MCE: Save MCA control bits that get set in > hardware > > > On Thu, May 16, 2019 at 05:09:11PM +0000, Ghannam, Yazen wrote: > > So that the sysfs files show the control values that are set in the > > hardware. It seemed like this would be more helpful than showing all > > 0xF's. > > Yeah, but it has been like that since forever and it hasn't bugged > anybody. Probably because anybody doesn't even look at those files. As > Tony says: > > "RAS is a lonely subsystem ... even EDAC gets more love." > > :-))) > > And adding yet another vendor check for this seemed just not worth it. > > > Should I send out another version of this set? > > I simply zapped 5/6. I still think your 6/6 makes sense though. > > --- > From: Yazen Ghannam <[email protected]> > Date: Tue, 30 Apr 2019 20:32:21 +0000 > Subject: [PATCH] x86/MCE: Determine MCA banks' init state properly > > The OS is expected to write all bits to MCA_CTL for each bank, > thus enabling error reporting in all banks. However, some banks > may be unused in which case the registers for such banks are > Read-as-Zero/Writes-Ignored. Also, the OS may avoid setting some control > bits because of quirks, etc. > > A bank can be considered uninitialized if the MCA_CTL register returns > zero. This is because either the OS did not write anything or because > the hardware is enforcing RAZ/WI for the bank. > > Set a bank's init value based on if the control bits are set or not in > hardware. Return an error code in the sysfs interface for uninitialized > banks. > > [ bp: Massage a bit. ] > > Signed-off-by: Yazen Ghannam <[email protected]> > Signed-off-by: Borislav Petkov <[email protected]> > Cc: "H. Peter Anvin" <[email protected]> > Cc: Ingo Molnar <[email protected]> > Cc: "[email protected]" <[email protected]> > Cc: Thomas Gleixner <[email protected]> > Cc: Tony Luck <[email protected]> > Cc: "[email protected]" <[email protected]> > Link: https://lkml.kernel.org/r/[email protected] > --- > arch/x86/kernel/cpu/mce/core.c | 17 +++++++++++++---- > 1 file changed, 13 insertions(+), 4 deletions(-) > > diff --git a/arch/x86/kernel/cpu/mce/core.c b/arch/x86/kernel/cpu/mce/core.c > index 5bcecadcf4d9..c049689f3d73 100644 > --- a/arch/x86/kernel/cpu/mce/core.c > +++ b/arch/x86/kernel/cpu/mce/core.c > @@ -1567,10 +1567,13 @@ static void __mcheck_cpu_init_clear_banks(void) > for (i = 0; i < this_cpu_read(mce_num_banks); i++) { > struct mce_bank *b = &mce_banks[i]; > > - if (!b->init) > - continue; > - wrmsrl(msr_ops.ctl(i), b->ctl); > - wrmsrl(msr_ops.status(i), 0); > + if (b->init) { > + wrmsrl(msr_ops.ctl(i), b->ctl); > + wrmsrl(msr_ops.status(i), 0); > + } > + > + /* Bank is initialized if bits are set in hardware. */ > + b->init = !!b->ctl;
We don't actually know if there are bits set in hardware until we read it back. So I don't think this is adding anything new. Thanks, Yazen

