On 3/26/26 6:30 PM, Erni Sri Satya Vennela wrote:
> As a part of MANA hardening for CVM, validate the adapter_mtu value
> returned from the MANA_QUERY_DEV_CONFIG HWC command.
>
> The adapter_mtu value is used to compute ndev->max_mtu via:
> gc->adapter_mtu - ETH_HLEN. If hardware returns a bogus adapter_mtu
> smaller than ETH_HLEN (e.g. 0), the unsigned subtraction wraps to a
> huge value, silently allowing oversized MTU settings.
>
> Add a validation check to reject adapter_mtu values below
> ETH_MIN_MTU + ETH_HLEN, returning -EPROTO to fail the device
> configuration early with a clear error message.
>
> Signed-off-by: Erni Sri Satya Vennela <[email protected]>
> ---
> drivers/net/ethernet/microsoft/mana/mana_en.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c
> index b39e8b920791..bd07d17a6017 100644
> --- a/drivers/net/ethernet/microsoft/mana/mana_en.c
> +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
> @@ -1207,10 +1207,16 @@ static int mana_query_device_cfg(struct mana_context *ac, u32 proto_major_ver,
>
>  	*max_num_vports = resp.max_num_vports;
>
> -	if (resp.hdr.response.msg_version >= GDMA_MESSAGE_V2)
> +	if (resp.hdr.response.msg_version >= GDMA_MESSAGE_V2) {
> +		if (resp.adapter_mtu < ETH_MIN_MTU + ETH_HLEN) {
> +			dev_err(dev, "Adapter MTU too small: %u\n",
> +				resp.adapter_mtu);
> +			return -EPROTO;

AI review says:

If this returns -EPROTO, does the caller mana_probe() jump to an error
label and call mana_remove()?

If so, mana_remove() unconditionally calls
disable_work_sync(&ac->link_change_work) and
cancel_delayed_work_sync(&ac->gf_stats_work).
Since mana_query_device_cfg() is called before INIT_WORK() and
INIT_DELAYED_WORK() in the probe sequence, wouldn't this result in
calling sync cancellation functions on uninitialized, zeroed work
structures?

This can lead to a WARN_ON(!work->func) in __flush_work(), or debug
object warnings if CONFIG_DEBUG_OBJECTS_WORK is enabled.

While this initialization issue appears to already exist for other early
error paths, this new error path can also trigger it.
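
For reference, the wraparound described in the commit message can be
reproduced in userspace. This is only a sketch, not driver code: it
assumes adapter_mtu is a 16-bit field and max_mtu is an unsigned int,
and the helper names max_mtu_unchecked() and max_mtu_checked() are
hypothetical. ETH_HLEN and ETH_MIN_MTU are redefined locally with the
values from linux/if_ether.h.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Local copies of the values defined in linux/if_ether.h. */
#define ETH_HLEN    14	/* Ethernet header length in bytes */
#define ETH_MIN_MTU 68	/* minimum IPv4 MTU */

/*
 * Hypothetical helper modelling the computation without validation.
 * Assumption: adapter_mtu is 16 bits wide and the result is stored in
 * an unsigned int.
 */
static unsigned int max_mtu_unchecked(uint16_t adapter_mtu)
{
	/*
	 * adapter_mtu is promoted to int, so the subtraction can go
	 * negative; converting that to unsigned int yields a huge value.
	 */
	return adapter_mtu - ETH_HLEN;
}

/*
 * Hypothetical helper modelling the check added by the patch.
 * Returns 0 and stores the result in *max_mtu, or -EPROTO when
 * adapter_mtu is below ETH_MIN_MTU + ETH_HLEN.
 */
static int max_mtu_checked(uint16_t adapter_mtu, unsigned int *max_mtu)
{
	assert(max_mtu != NULL);

	if (adapter_mtu < ETH_MIN_MTU + ETH_HLEN)
		return -EPROTO;

	*max_mtu = adapter_mtu - ETH_HLEN;
	return 0;
}
```

With a 32-bit unsigned int, max_mtu_unchecked(0) evaluates to
4294967282, while max_mtu_checked(0, &m) returns -EPROTO. An
adapter_mtu of 1514 passes the check and gives a max_mtu of 1500.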