Hi Willem,

Thanks for the review.

> -----Original Message-----
> From: Willem de Bruijn <willemdebruijn.ker...@gmail.com>
> Sent: Monday, November 2, 2020 7:12 PM
> To: George Cherian <gcher...@marvell.com>
> Cc: Network Development <netdev@vger.kernel.org>; linux-kernel
> <linux-ker...@vger.kernel.org>; Jakub Kicinski <k...@kernel.org>;
> David Miller <da...@davemloft.net>; Sunil Kovvuri Goutham
> <sgout...@marvell.com>; Linu Cherian <lcher...@marvell.com>;
> Geethasowjanya Akula <gak...@marvell.com>; masahi...@kernel.org
> Subject: Re: [net-next PATCH 2/3] octeontx2-af: Add devlink health
> reporters for NPA
>
> On Mon, Nov 2, 2020 at 12:07 AM George Cherian
> <george.cher...@marvell.com> wrote:
> >
> > Add health reporters for RVU NPA block.
> > Only reporter dump is supported
> >
> > Output:
> >  # devlink health
> >  pci/0002:01:00.0:
> >    reporter npa
> >      state healthy error 0 recover 0
> >  # devlink health dump show pci/0002:01:00.0 reporter npa
> >  NPA_AF_GENERAL:
> >         Unmap PF Error: 0
> >         Free Disabled for NIX0 RX: 0
> >         Free Disabled for NIX0 TX: 0
> >         Free Disabled for NIX1 RX: 0
> >         Free Disabled for NIX1 TX: 0
> >         Free Disabled for SSO: 0
> >         Free Disabled for TIM: 0
> >         Free Disabled for DPI: 0
> >         Free Disabled for AURA: 0
> >         Alloc Disabled for Resvd: 0
> >  NPA_AF_ERR:
> >         Memory Fault on NPA_AQ_INST_S read: 0
> >         Memory Fault on NPA_AQ_RES_S write: 0
> >         AQ Doorbell Error: 0
> >         Poisoned data on NPA_AQ_INST_S read: 0
> >         Poisoned data on NPA_AQ_RES_S write: 0
> >         Poisoned data on HW context read: 0
> >  NPA_AF_RVU:
> >         Unmap Slot Error: 0
> >
> > Signed-off-by: Sunil Kovvuri Goutham <sgout...@marvell.com>
> > Signed-off-by: Jerin Jacob <jer...@marvell.com>
> > Signed-off-by: George Cherian <george.cher...@marvell.com>
>
> > +static bool rvu_npa_af_request_irq(struct rvu *rvu, int blkaddr, int offset,
> > +                                   const char *name, irq_handler_t fn)
> > +{
> > +        struct rvu_devlink *rvu_dl = rvu->rvu_dl;
> > +        int rc;
> > +
> > +        WARN_ON(rvu->irq_allocated[offset]);
>
> Please use WARN_ON sparingly for important unrecoverable events. This
> seems like a basic precondition. If it can happen at all, can probably
> catch in a normal branch with a netdev_err. The stacktrace in the oops
> is not likely to point at the source of the non-zero value, anyway.

Okay, will fix it in v2.

> > +        rvu->irq_allocated[offset] = false;
>
> Why initialize this here? Are these fields not zeroed on alloc? Is this
> here only to safely call rvu_npa_unregister_interrupts on partial alloc?
> Then it might be simpler to just have jump labels in this function to
> free the successfully requested irqs.

It shouldn't be initialized like this; it is zeroed on alloc.
Will fix in v2.

> > +        sprintf(&rvu->irq_name[offset * NAME_SIZE], name);
> > +        rc = request_irq(pci_irq_vector(rvu->pdev, offset), fn, 0,
> > +                         &rvu->irq_name[offset * NAME_SIZE], rvu_dl);
> > +        if (rc)
> > +                dev_warn(rvu->dev, "Failed to register %s irq\n", name);
> > +        else
> > +                rvu->irq_allocated[offset] = true;
> > +
> > +        return rvu->irq_allocated[offset];
> > +}
>
> > +static int rvu_npa_health_reporters_create(struct rvu_devlink *rvu_dl)
> > +{
> > +        struct devlink_health_reporter *rvu_npa_health_reporter;
> > +        struct rvu_npa_event_cnt *npa_event_count;
> > +        struct rvu *rvu = rvu_dl->rvu;
> > +
> > +        npa_event_count = kzalloc(sizeof(*npa_event_count), GFP_KERNEL);
> > +        if (!npa_event_count)
> > +                return -ENOMEM;
> > +
> > +        rvu_dl->npa_event_cnt = npa_event_count;
> > +        rvu_npa_health_reporter = devlink_health_reporter_create(rvu_dl->dl,
> > +                                                                 &rvu_npa_hw_fault_reporter_ops,
> > +                                                                 0, rvu);
> > +        if (IS_ERR(rvu_npa_health_reporter)) {
> > +                dev_warn(rvu->dev, "Failed to create npa reporter, err =%ld\n",
> > +                         PTR_ERR(rvu_npa_health_reporter));
> > +                return PTR_ERR(rvu_npa_health_reporter);
> > +        }
> > +
> > +        rvu_dl->rvu_npa_health_reporter = rvu_npa_health_reporter;
> > +        return 0;
> > +}
> > +
> > +static void rvu_npa_health_reporters_destroy(struct rvu_devlink *rvu_dl)
> > +{
> > +        if (!rvu_dl->rvu_npa_health_reporter)
> > +                return;
> > +
> > +        devlink_health_reporter_destroy(rvu_dl->rvu_npa_health_reporter);
> > +}
> > +
> > +static int rvu_health_reporters_create(struct rvu *rvu)
> > +{
> > +        struct rvu_devlink *rvu_dl;
> > +
> > +        if (!rvu->rvu_dl)
> > +                return -EINVAL;
> > +
> > +        rvu_dl = rvu->rvu_dl;
> > +        return rvu_npa_health_reporters_create(rvu_dl);
>
> No need for local var rvu_dl. Here and below.
>
> Without that, the entire helper is probably not needed.

This helper is needed as we add support for more HW blocks.

> > +}
> > +
> > +static void rvu_health_reporters_destroy(struct rvu *rvu)
> > +{
> > +        struct rvu_devlink *rvu_dl;
> > +
> > +        if (!rvu->rvu_dl)
> > +                return;
> > +
> > +        rvu_dl = rvu->rvu_dl;
> > +        rvu_npa_health_reporters_destroy(rvu_dl);
> > +}
> > +
> >  static int rvu_devlink_info_get(struct devlink *devlink, struct devlink_info_req *req,
> >                                  struct netlink_ext_ack *extack)
> >  {
> > @@ -53,7 +483,8 @@ int rvu_register_dl(struct rvu *rvu)
> >          rvu_dl->dl = dl;
> >          rvu_dl->rvu = rvu;
> >          rvu->rvu_dl = rvu_dl;
> > -        return 0;
> > +
> > +        return rvu_health_reporters_create(rvu);
>
> when would this be called with rvu->rvu_dl == NULL?

During initialization.

Regards,
-George
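
For reference, here is a rough user-space sketch of the two review suggestions on the IRQ helper: checking the "already allocated" precondition in an ordinary branch instead of WARN_ON, and unwinding with jump labels so that only the vectors requested so far are freed. This is not the v2 patch. mock_request_irq(), mock_free_irq(), register_interrupts(), NUM_IRQS and fail_at are hypothetical stand-ins made up for illustration; real driver code would call request_irq()/free_irq() and log with dev_err() or similar.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

#define NUM_IRQS 3                      /* hypothetical vector count */

static bool irq_allocated[NUM_IRQS];    /* zero-initialized, like kzalloc'd state */
static int fail_at = -1;                /* vector that should fail; -1 means none */

/* Hypothetical stand-in for request_irq(): fails for vector 'fail_at'. */
static int mock_request_irq(int offset)
{
        if (offset == fail_at)
                return -EIO;
        irq_allocated[offset] = true;
        return 0;
}

/* Hypothetical stand-in for free_irq(). */
static void mock_free_irq(int offset)
{
        irq_allocated[offset] = false;
}

/* Request every vector; on failure, free only those already requested. */
static int register_interrupts(void)
{
        int rc;

        /* Precondition handled in a normal branch with a log line, no WARN_ON. */
        if (irq_allocated[0]) {
                fprintf(stderr, "irq 0 already allocated\n");
                return -EBUSY;
        }

        rc = mock_request_irq(0);
        if (rc)
                return rc;

        rc = mock_request_irq(1);
        if (rc)
                goto err_free_irq0;

        rc = mock_request_irq(2);
        if (rc)
                goto err_free_irq1;

        return 0;

err_free_irq1:
        mock_free_irq(1);
err_free_irq0:
        mock_free_irq(0);
        return rc;
}
```

With labels in reverse order of acquisition, the error path never touches a vector that was not requested, so there is no need to pre-clear irq_allocated[] before each request.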