On 3/11/2026 6:09 AM, Bjorn Helgaas wrote:
> On Tue, Mar 10, 2026 at 10:58:49AM -0500, Jeremy Linton wrote:
>> On 3/9/26 10:20 PM, Chengwen Feng wrote:
>>> pcie_tph_get_cpu_st() is broken on ARM64:
>>> 1. pcie_tph_get_cpu_st() passes cpu_uid to the PCI ACPI DSM method.
>>> cpu_uid should be the ACPI Processor UID [1].
>>> 2. In BNXT, pcie_tph_get_cpu_st() is passed a cpu_uid obtained via
>>> cpumask_first(irq->cpu_mask) - the logical CPU ID of a CPU core,
>>> generated and managed by kernel (e.g., [0,255] for a system with 256
>>> logical CPU cores).
>>> 3. On ARM64 platforms, ACPI assigns Processor UID to cores listed in the
>>> MADT table, and this UID may not match the kernel's logical CPU ID.
>>> When this occurs, the mismatch results in the wrong CPU steer-tag.
>>> 4. On AMD x86 the logical CPU ID is identical to the ACPI Processor UID
>>> so the mismatch is not seen.
>
>>> int pcie_tph_get_cpu_st(struct pci_dev *pdev, enum tph_mem_type mem_type,
>>> - unsigned int cpu_uid, u16 *tag)
>>> + unsigned int cpu, u16 *tag)
>>> {
>>> #ifdef CONFIG_ACPI
>>> + u32 cpu_uid = acpi_get_cpu_acpi_id(cpu);
>
> From AI review (gemini/gemini-3.1-pro-preview):
>
> Does this code need to validate that `cpu` is within bounds before
> using it? Before this change, the `cpu_uid` parameter was passed
> opaquely to the ACPI firmware via `tph_invoke_dsm()`, which would
> gracefully handle invalid values.
>
> Now, `cpu` is treated as a logical CPU index and passed to
> `acpi_get_cpu_acpi_id(cpu)`. On architectures like arm64 and riscv,
> `acpi_get_cpu_acpi_id()` uses `cpu` directly as an array index
> (`&cpu_madt_gicc[cpu]` and `&cpu_madt_rintc[cpu]`). On x86, it uses
> `per_cpu(x86_cpu_to_acpiid, cpu)`.
>
> If a caller passes an out-of-bounds `cpu` index (for example, if an
> IRQ affinity mask is empty and `cpumask_first()` returns
> `nr_cpu_ids`, or if userspace passes an arbitrary ID via
> `mlx5_st_alloc_index()`), this will result in an out-of-bounds
> memory read.
>
> Consider adding a bounds check:
>
> if (cpu >= nr_cpu_ids)
> return -EINVAL;
>
> I agree that this is an issue, and I think implementations of
> acpi_get_cpu_acpi_id() should validate their inputs.
>
> I don't know if there's a value that can never be a valid ACPI CPU UID
> and could be used as an error value from acpi_get_cpu_acpi_id(). I do
> see a few mentions of a ~0 value meaning "all processors" (ACPI r6.6,
> sec 5.2.12.13).
I only have the ACPI Specification Version 6.5, so I will use v6.5 as an
example.
The ACPI specification does not define invalid value ranges for the ACPI UID.
For the arm64 platform (Section 5.2.12.14):

  ACPI Processor UID: The OS associates this GICC Structure with a
  processor device object in the namespace when the _UID child object
  of the processor device evaluates to a numeric value that matches
  the numeric value in this field.

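Since any 32-bit value may therefore be a valid UID, I am concerned that we
cannot implement it like this, because a negative errno or U32_MAX could
collide with a real UID:

	int acpi_get_cpu_uid(unsigned int cpu)
	{
		if (cpu >= nr_cpu_ids)
			return -EINVAL;
		...
	}

or:

	u32 acpi_get_cpu_uid(unsigned int cpu)
	{
		if (cpu >= nr_cpu_ids)
			return U32_MAX;
		...
	}
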
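To make the collision concrete, here is a small userspace sketch (not kernel
code). The u32 typedef and both helper names are made up for illustration;
only EINVAL comes from <errno.h>. It assumes a compiler such as gcc or clang,
where converting an out-of-range u32 to int wraps.

```c
#include <errno.h>
#include <stdint.h>

typedef uint32_t u32;	/* stand-in for the kernel type */

/* Hypothetical: what an int-returning accessor would hand back for a UID. */
static int uid_as_int(u32 uid)
{
	/* Out-of-range conversion is implementation-defined; gcc/clang wrap. */
	return (int)uid;
}

/* Hypothetical: would a caller mistake this UID for a U32_MAX error value? */
static int looks_like_u32_max_error(u32 uid)
{
	return uid == UINT32_MAX;
}
```

With these helpers, a firmware UID equal to (u32)-EINVAL comes back as
-EINVAL from the int variant, and a UID of 0xFFFFFFFF is indistinguishable
from the U32_MAX error value in the u32 variant.
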
How about implementing it as follows:

	s64 acpi_get_cpu_uid(unsigned int cpu)
	{
		if (cpu >= nr_cpu_ids)
			return -EINVAL;
		...
	}

or:

	int acpi_get_cpu_uid(unsigned int cpu, u32 *uid)
	{
		if (cpu >= nr_cpu_ids)
			return -EINVAL;
		*uid = xxx;
		return 0;
	}

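As a rough userspace sketch of how these two shapes behave (again not kernel
code), the following uses a made-up lookup table. NR_CPU_IDS, mock_cpu_uid[]
and both mock_* functions are hypothetical stand-ins, and the UID values are
invented to include the awkward ones.

```c
#include <errno.h>
#include <stdint.h>

typedef uint32_t u32;

#define NR_CPU_IDS 4	/* hypothetical stand-in for nr_cpu_ids */

/* Hypothetical logical CPU -> ACPI Processor UID table (made-up values). */
static const u32 mock_cpu_uid[NR_CPU_IDS] = {
	0x100, 0x101, 0xFFFFFFEA, 0xFFFFFFFF,
};

/* Out-parameter variant: status and UID travel separately. */
static int mock_get_cpu_uid(unsigned int cpu, u32 *uid)
{
	if (cpu >= NR_CPU_IDS)
		return -EINVAL;
	*uid = mock_cpu_uid[cpu];
	return 0;
}

/* 64-bit signed variant: negative means error, anything else is a UID. */
static int64_t mock_get_cpu_uid_s64(unsigned int cpu)
{
	if (cpu >= NR_CPU_IDS)
		return -EINVAL;
	return mock_cpu_uid[cpu];
}
```

In both variants every u32 UID, including 0xFFFFFFFF, stays distinguishable
from the error path. The out-parameter form has the advantage that callers
keep working with a plain u32 and the usual "if (ret)" check.
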
Another issue: this commit also provides an implementation for the x86
platform. However, further code analysis revealed a potential problem in it.

On SMP, acpi_get_cpu_acpi_id() retrieves the UID from x86_cpu_to_acpiid, which
is set through the call chain:

  acpi_parse_lapic() -> topology_register_apic() -> topo_register_apic() ->
  topo_set_cpuids() -> x86_cpu_to_acpiid

It appears to take the "ACPI Processor UID" from ACPI Section 5.2.12.2, but
that field is only one byte long (values 0-255), which may cause issues on
systems with a very large number of cores.

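The width concern in isolation, as a userspace sketch: a one-byte field can
only distinguish 256 UIDs. The helper name is made up, and this does not
claim that the x86 path actually truncates anything; it only shows what a
one-byte field can and cannot hold.

```c
#include <stdint.h>

/* Hypothetical: store a UID in a one-byte field. */
static uint8_t store_in_one_byte(uint32_t uid)
{
	return (uint8_t)uid;	/* keeps only the low 8 bits */
}
```

Under this sketch, UIDs 0 and 256 end up with the same stored value, so two
different processors could no longer be told apart.
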
Therefore, I suggest re-implementing acpi_get_cpu_uid() for the x86 platform.
Either I provide a default implementation (shown below), or the x86
maintainers contribute one:

	s64 acpi_get_cpu_uid(unsigned int cpu)
	{
		if (cpu >= nr_cpu_ids)
			return -EINVAL;
		return cpu;
	}

Thanks
>