This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
---
net: mana: Add NULL guards in teardown path to prevent panic on attach failure
When queue allocation fails partway through, the error cleanup frees
and NULLs apc->tx_qp and apc->rxqs. The patch adds NULL guards for these
pointers in teardown paths such as mana_destroy_vport(), mana_fence_rqs(),
and mana_dealloc_queues() to prevent NULL pointer dereferences.
> diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c
> b/drivers/net/ethernet/microsoft/mana/mana_en.c
> index 9afc786b297a8..9e7e4bf526bf6 100644
> --- a/drivers/net/ethernet/microsoft/mana/mana_en.c
> +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
[ ... ]
> @@ -2858,13 +2861,16 @@ static void mana_destroy_vport(struct
> mana_port_context *apc)
> struct mana_rxq *rxq;
> u32 rxq_idx;
>
> - for (rxq_idx = 0; rxq_idx < apc->num_queues; rxq_idx++) {
> - rxq = apc->rxqs[rxq_idx];
> - if (!rxq)
> - continue;
> + if (apc->rxqs) {
>
> - mana_destroy_rxq(apc, rxq, true);
> - apc->rxqs[rxq_idx] = NULL;
> + for (rxq_idx = 0; rxq_idx < apc->num_queues; rxq_idx++) {
> + rxq = apc->rxqs[rxq_idx];
> + if (!rxq)
> + continue;
> +
> + mana_destroy_rxq(apc, rxq, true);
> + apc->rxqs[rxq_idx] = NULL;
> + }
> }
If the teardown path frees apc->rxqs and sets it to NULL on attach failure,
how does the interface recover if the user subsequently brings it back up?
When mana_open() is invoked, it directly calls mana_alloc_queues(). Inside
mana_alloc_queues() -> mana_add_rx_queues(), the code does:
apc->rxqs[i] = rxq;
If apc->rxqs was left as NULL during the teardown, will this cause a NULL
pointer dereference on the next interface bring-up?
[ ... ]
> @@ -3287,33 +3294,38 @@ static int mana_dealloc_queues(struct net_device
> *ndev)
> * number of queues.
> */
>
> - for (i = 0; i < apc->num_queues; i++) {
> - txq = &apc->tx_qp[i].txq;
> - tsleep = 1000;
> - while (atomic_read(&txq->pending_sends) > 0 &&
> - time_before(jiffies, timeout)) {
> - usleep_range(tsleep, tsleep + 1000);
> - tsleep <<= 1;
> - }
> - if (atomic_read(&txq->pending_sends)) {
> - err = pcie_flr(to_pci_dev(gd->gdma_context->dev));
> - if (err) {
> - netdev_err(ndev, "flr failed %d with %d pkts
> pending in txq %u\n",
> - err,
> atomic_read(&txq->pending_sends),
> - txq->gdma_txq_id);
> + if (apc->tx_qp) {
In the error path where queue creation fails partway through, will
mana_destroy_txq() be called?
If mana_destroy_txq() unconditionally iterates over all apc->num_queues:
for (i = 0; i < apc->num_queues; i++) {
...
mana_destroy_wq_obj(apc, GDMA_SQ, apc->tx_qp[i].tx_object);
...
Will it pass 0 or INVALID_MANA_HANDLE to mana_destroy_wq_obj() for queues
that were never fully initialized?
Does mana_destroy_wq_obj() send a MANA_DESTROY_WQ_OBJ command to the hardware
with these uninitialized handles, and could handle 0 corrupt the firmware
state?
> + for (i = 0; i < apc->num_queues; i++) {
> + txq = &apc->tx_qp[i].txq;
> + tsleep = 1000;
> + while (atomic_read(&txq->pending_sends) > 0 &&
> + time_before(jiffies, timeout)) {
> + usleep_range(tsleep, tsleep + 1000);
> + tsleep <<= 1;
> + }
> + if (atomic_read(&txq->pending_sends)) {
> + err =
> + pcie_flr(to_pci_dev(gd->gdma_context->dev));
If pcie_flr() is triggered here, the PCI Function Level Reset wipes the
internal GDMA state.
Further down in this function, mana_disable_vport_rx(), mana_fence_rqs(),
and mana_destroy_vport() are called. Will these functions attempt to send
commands to the now-destroyed GDMA admin queue via mana_send_request() and
wait for completions?
Could this lead to cascading timeouts since the hardware was just reset and
cannot process admin commands?
> + if (err) {
> + netdev_err(ndev, "flr failed %d with %d
> pkts pending in txq %u\n",
> + err,
> + atomic_read(&txq->pending_sends),
> + txq->gdma_txq_id);
> + }
> + break;
> }
> - break;
> }
--
pw-bot: cr