On 7/14/25 10:41, Zhao Liu wrote:
> On Mon, Jul 14, 2025 at 09:51:25AM -0500, Moger, Babu wrote:
>> Date: Mon, 14 Jul 2025 09:51:25 -0500
>> From: "Moger, Babu" <babu.mo...@amd.com>
>> Subject: Re: [PATCH v2 7/7] i386/cpu: Honor maximum value for
>> CPUID.8000001DH.EAX[25:14]
>>
>> Hi Zhao,
>>
>> On 7/14/25 03:08, Zhao Liu wrote:
>>> CPUID.8000001DH:EAX[25:14] is "NumSharingCache", and the number of
>>> logical processors sharing this cache is the value of this field
>>> incremented by 1. Because of its width limitation, the maximum value
>>> currently supported is 4095.
>>>
>>> Though at present Q35 supports up to 4096 CPUs, by constructing a
>>> specific topology, the width of the APIC ID can be extended beyond 12
>>> bits. For example, using `-smp threads=33,cores=9,modules=9` results in
>>> a die level offset of 6 + 4 + 4 = 14 bits, which can also cause
>>> overflow. Check and honor the maximum value as CPUID.04H did.
>>>
>>> Cc: Babu Moger <babu.mo...@amd.com>
>>> Signed-off-by: Zhao Liu <zhao1....@intel.com>
Reviewed-by: Babu Moger <babu.mo...@amd.com>
>>> ---
>>> Changes Since RFC v1 [*]:
>>> * Correct the RFC's description, now there's the overflow case. Provide
>>> an overflow example.
>>>
>>> RFC:
>>> * Although there are currently no overflow cases, to avoid any
>>> potential issue, add the overflow check, just as I did for Intel.
>>>
>>> [*]:
>>> https://lore.kernel.org/qemu-devel/20250227062523.124601-5-zhao1....@intel.com/
>>> ---
>>> target/i386/cpu.c | 3 ++-
>>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
>>> index fedeeea151ee..eceda9865b8f 100644
>>> --- a/target/i386/cpu.c
>>> +++ b/target/i386/cpu.c
>>> @@ -558,7 +558,8 @@ static void encode_cache_cpuid8000001d(CPUCacheInfo
>>> *cache,
>>>
>>> *eax = CACHE_TYPE(cache->type) | CACHE_LEVEL(cache->level) |
>>> (cache->self_init ? CACHE_SELF_INIT_LEVEL : 0);
>>> - *eax |= max_thread_ids_for_cache(topo_info, cache->share_level) << 14;
>>> + /* Bits 25:14 - NumSharingCache: maximum 4095. */
>>> + *eax |= MIN(max_thread_ids_for_cache(topo_info, cache->share_level),
>>> 4095) << 14;
>>
>> Will this be more meaningful?
>>
>> *eax |=
>> (max_thread_ids_for_cache(topo_info, cache->share_level) & 0xFFF) << 14
>
> Hi Babu, thank you for your feedback! This approach depends on truncation,
> which might lead to more erroneous conclusions. Currently, such cases
> shouldn't exist on actual hardware; it's only QEMU that supports so many
> CPUs and custom topologies.
>
> Previously, when Intel handled similar cases (where the topology space
> wasn't large enough), it would encode the maximum value rather than
> truncate, as I'm doing now (you can refer to the description of 0x1 in
> patch 5, and similar fixes in Intel's 0x4 leaf in patch 6). In the
> future, if actual hardware CPUs reach such numbers and have special
> behavior, we can update accordingly. I think at least for now, this
> avoids overflow caused by special topology in QEMU emulation.
>
Sure. Sounds good to me.
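
For anyone following the thread, here is a minimal standalone sketch of the
arithmetic discussed above: the 6 + 4 + 4 = 14 bit offset from the example
topology, and how clamping, truncating, and leaving the value unchecked differ
when it is placed in EAX[25:14]. The helper and macro names are hypothetical
and are not QEMU's; only the field position, the 4095 limit, and the topology
numbers come from the patch. Note that in C `<<` binds tighter than `&`, so a
masked variant needs explicit parentheses.

```c
#include <stdint.h>

/* Hypothetical names for illustration only; not taken from QEMU. */
#define NUM_SHARING_CACHE_SHIFT 14
#define NUM_SHARING_CACHE_MAX   0xFFFu  /* 12-bit field, EAX[25:14] */

/* Bits needed to number `count` IDs, i.e. ceil(log2(count)). */
static unsigned apicid_bitwidth(unsigned count)
{
    unsigned bits = 0;

    while (bits < 31 && (1u << bits) < count) {
        bits++;
    }
    return bits;
}

/* Clamp: saturate at the largest encodable value, as the patch does. */
static uint32_t encode_clamped(uint32_t max_ids)
{
    uint32_t v = max_ids < NUM_SHARING_CACHE_MAX ? max_ids
                                                 : NUM_SHARING_CACHE_MAX;
    return v << NUM_SHARING_CACHE_SHIFT;
}

/* Truncate: keep only the low 12 bits, so 4096 wraps around to 0. */
static uint32_t encode_truncated(uint32_t max_ids)
{
    return (max_ids & NUM_SHARING_CACHE_MAX) << NUM_SHARING_CACHE_SHIFT;
}

/* Unchecked: a value wider than 12 bits spills into bits 26 and above. */
static uint32_t encode_unchecked(uint32_t max_ids)
{
    return max_ids << NUM_SHARING_CACHE_SHIFT;
}
```

With `-smp threads=33,cores=9,modules=9`, apicid_bitwidth() gives 6, 4 and 4
bits. A value of (1 << 14) - 1 = 16383 passed to encode_unchecked() sets bits
above bit 25, while encode_clamped() stays inside the field. For a value such
as 4096, encode_truncated() yields a field of 0 (read back as one sharing
processor), whereas encode_clamped() reports the maximum of 4095.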
--
Thanks
Babu Moger