On 3/26/26 6:48 PM, Erni Sri Satya Vennela wrote:
> As a part of MANA hardening for CVM, validate that max_num_sq and
> max_num_rq returned by MANA_QUERY_VPORT_CONFIG are not zero. These
> values flow into apc->num_queues, which is used as an allocation count
> and loop bound. A zero value would result in zero-size allocations and
> incorrect driver behavior.
> 
> Return -EPROTO if either value is zero.
> 
> Signed-off-by: Erni Sri Satya Vennela <[email protected]>
> ---
>  drivers/net/ethernet/microsoft/mana/mana_en.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
> index b39e8b920791..a4197b4b0597 100644
> --- a/drivers/net/ethernet/microsoft/mana/mana_en.c
> +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
> @@ -1249,6 +1249,12 @@ static int mana_query_vport_cfg(struct mana_port_context *apc, u32 vport_index,
>  
>       *max_sq = resp.max_num_sq;
>       *max_rq = resp.max_num_rq;
> +
> +     if (*max_sq == 0 || *max_rq == 0) {
> +             netdev_err(apc->ndev, "Invalid max queues from vPort config\n");
> +             return -EPROTO;

AI review says:

Will returning -EPROTO here expose a pre-existing resource leak in the
driver's teardown path?

If mana_query_vport_cfg() returns an error, mana_init_port() fails and
mana_probe_port() frees the ndev, leaving ac->ports[i] as NULL. In
mana_probe(), the port initialization loop breaks upon this error, but
the err variable is then overwritten:

mana_probe() {
    ...
    for (i = 0; i < ac->num_ports; i++) {
        err = mana_probe_port(ac, i, &ac->ports[i]);
        if (err) {
            dev_err(dev, "Probe Failed for port %d\n", i);
            break;
        }
    }

    err = add_adev(gd, "eth");
    ...
}
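
The overwrite in the excerpt above can be reproduced in a small userspace
model. This is only a sketch: every model_* name and the fail_port knob are
hypothetical stand-ins, not the real MANA functions, and model_probe_checked()
shows just one possible way to keep the first error. Whether mana_probe()
should bail out like that is a call for the driver maintainers.

```c
#include <errno.h>

#define NUM_PORTS 2

/* Hypothetical test knob: index of the port whose probe fails, -1 for none. */
static int fail_port = -1;

/* Stand-in for mana_probe_port(); fails with -EPROTO for the chosen port. */
static int model_probe_port(int i)
{
	return (i == fail_port) ? -EPROTO : 0;
}

/* Stand-in for add_adev(); assumed to succeed in this model. */
static int model_add_adev(void)
{
	return 0;
}

/* Same shape as the excerpt: the loop's error is overwritten afterwards. */
static int model_probe_masking(void)
{
	int err = 0;

	for (int i = 0; i < NUM_PORTS; i++) {
		err = model_probe_port(i);
		if (err)
			break;
	}

	err = model_add_adev();	/* the earlier error is lost here */
	return err;
}

/* One possible alternative: return the port error before going further. */
static int model_probe_checked(void)
{
	int err = 0;

	for (int i = 0; i < NUM_PORTS; i++) {
		err = model_probe_port(i);
		if (err)
			return err;
	}

	return model_add_adev();
}
```

With fail_port set to 0, model_probe_masking() reports success while
model_probe_checked() returns -EPROTO.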

If add_adev() succeeds, mana_probe() completes successfully instead of
failing, masking the earlier error while leaving the failed port's
ac->ports[i] entry as NULL (ac->ports[0] if the first port fails).
Later, when the driver is unloaded or if add_adev() fails and triggers
immediate cleanup, mana_remove() is called. It iterates over ac->ports
and, upon encountering the NULL device, immediately executes goto out:

mana_remove() {
    ...
    for (i = 0; i < ac->num_ports; i++) {
        ndev = ac->ports[i];
        if (!ndev) {
            if (i == 0)
                ...
            goto out;
        }
        ...
    }

    mana_destroy_eq(ac);
out:
    ...
}

Because the out label in mana_remove() is located after the
mana_destroy_eq(ac) call, jumping there completely skips destroying the
event queues allocated earlier by mana_create_eq(ac).

In a Confidential Virtual Machine context, could an untrusted hypervisor
repeatedly return invalid configs to continuously leak guest memory and
hardware queues?
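
The skipped cleanup described above can also be modeled in userspace. Again a
sketch only: struct model_ctx, its eq_created flag and the model_* functions
are hypothetical stand-ins that track whether the event queues were torn down.
They are not the driver's data structures, and model_remove_always() is just
one way the teardown could be made unconditional.

```c
#include <stdbool.h>
#include <stddef.h>

#define NUM_PORTS 2

/* Hypothetical stand-in for the driver context. */
struct model_ctx {
	void *ports[NUM_PORTS];	/* NULL means the port was never probed */
	bool eq_created;	/* true while the event queues exist */
};

/* Stand-in for mana_destroy_eq(). */
static void model_destroy_eq(struct model_ctx *ac)
{
	ac->eq_created = false;
}

/* Same shape as the excerpt: a NULL port jumps past the EQ teardown. */
static void model_remove_skipping(struct model_ctx *ac)
{
	for (int i = 0; i < NUM_PORTS; i++) {
		if (!ac->ports[i])
			goto out;
		ac->ports[i] = NULL;	/* per-port teardown elided */
	}

	model_destroy_eq(ac);
out:
	return;
}

/* One possible alternative: stop at the first NULL port, still destroy EQs. */
static void model_remove_always(struct model_ctx *ac)
{
	for (int i = 0; i < NUM_PORTS; i++) {
		if (!ac->ports[i])
			break;
		ac->ports[i] = NULL;	/* per-port teardown elided */
	}

	model_destroy_eq(ac);
}
```

Starting from a context whose ports[0] is NULL and eq_created is true, the
first variant leaves eq_created set and the second clears it.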

