On Fri, 18 Mar 2016, Peter Zijlstra wrote:
> It turns out AMD gets x86_max_cores wrong when there are compute
> units.
>
> The issue is that Linux assumes:
>
> nr_logical_cpus = nr_cores * nr_siblings
>
> But AMD reports its CU unit as 2 cores, but then sets num_smp_siblings
> to 2 as well.
>
> Cc: Ingo Molnar <[email protected]>
> Cc: Borislav Petkov <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: Andreas Herrmann <[email protected]>
> Reported-by: Xiong Zhou <[email protected]>
> Fixes: 1f12e32f4cd5 ("x86/topology: Create logical package id")
> Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
> Link:
> http://lkml.kernel.org/r/[email protected]
> ---
> arch/x86/kernel/cpu/amd.c | 8 ++++----
> arch/x86/kernel/smpboot.c | 11 ++++++-----
> 2 files changed, 10 insertions(+), 9 deletions(-)
>
> --- a/arch/x86/kernel/cpu/amd.c
> +++ b/arch/x86/kernel/cpu/amd.c
> @@ -313,9 +313,9 @@ static void amd_get_topology(struct cpui
> node_id = ecx & 7;
>
> /* get compute unit information */
> - smp_num_siblings = ((ebx >> 8) & 3) + 1;
> + cores_per_cu = smp_num_siblings = ((ebx >> 8) & 3) + 1;
> + c->x86_max_cores /= smp_num_siblings;
Unfortunately that will break stuff in event/amd/core.c, ras/mce_amd_inj.c
which rely on the AMD interpretation of c->x86_max_cores.
I'm seriously grumpy about pointless inconsistent representations depending on
the CPU vendor. That's just lazy and sloppy hackery from the "works for me"
departement.
For now we'll get away with Peters workaround for the Intel HT mess, which is
a different kind of mental insanity invented by HW/BIOS people, but this
topology stuff needs to be made consistent ASAP.
Thanks,
tglx