On Thu, May 21, 2020 at 01:22:34PM -0700, Jacob Keller wrote:
> On 5/20/2020 5:16 PM, Jakub Kicinski wrote:
> > On Wed, 20 May 2020 17:03:02 -0700 Jacob Keller wrote:
> >> Hi Jiri, Jakub,
> >>
> >> I've been asked to investigate using devlink as a mechanism for
> >> reporting asynchronous events/messages from firmware including
> >> diagnostic messages, etc.
> >>
> >> Essentially, the ice firmware can report various status or diagnostic
> >> messages which are useful for debugging internal behavior. We want to be
> >> able to get these messages (and relevant data associated with them) in a
> >> format beyond just "dump it to the dmesg buffer and recover it later".
> >>
> >> It seems like this would be an appropriate use of devlink. I thought
> >> maybe this would work with devlink health:
> >>
> >> i.e. we create a devlink health reporter, and then when firmware sends a
> >> message, we use devlink_health_report.
> >>
> >> But when I dug into this, it doesn't seem like a natural fit. The health
> >> reporters expect to see an "error" state, and don't seem to really fit
> >> the notion of "log a message from firmware".
> >>
> >> One of the issues is that the health reporter only keeps one dump, when
> >> what we really want is a way to have a monitoring application get the
> >> dump and then store its contents.
> >>
> >> Thoughts on what might make sense for this? It feels like a stretch of
> >> the health interface...
> >>
> >> I mean basically what I am thinking of is using the devlink_fmsg
> >> interface to just send a netlink message that then gets sent over the
> >> devlink monitor socket and gets dumped immediately.
> > 
> > Why does user space need a raw firmware interface in the first place?
> > 
> > Examples?
> > 
> 
> So the ice firmware can optionally send diagnostic debug messages via
> its control queue. The current solutions we've used internally
> essentially hex-dump the binary contents to the kernel log, and then
> these get scraped and converted into a useful format for human consumption.
> 
> I'm not 100% sure of the format, but I know it's based on a decoding file
> that is specific to a given firmware image, and thus attempting to tie
> this into the driver is problematic.

You explained how it works, but not why it's needed :)

> There is also a plan to provide a simpler interface for some of the
> diagnostic messages, with a simple one-to-one mapping from a code to a
> message for a handful of events, such as when the link engine can detect
> a known reason why it wasn't able to get link. I suppose these could be
> translated and immediately printed by the driver without a special
> interface.

Petr worked on something similar last year:
https://lore.kernel.org/netdev/cover.1552672441.git.pe...@mellanox.com/

Amit is currently working on a new version based on ethtool (netlink).
