On Thu, May 21, 2020 at 01:22:34PM -0700, Jacob Keller wrote:
> On 5/20/2020 5:16 PM, Jakub Kicinski wrote:
> > On Wed, 20 May 2020 17:03:02 -0700 Jacob Keller wrote:
> >> Hi Jiri, Jakub,
> >>
> >> I've been asked to investigate using devlink as a mechanism for
> >> reporting asynchronous events/messages from firmware including
> >> diagnostic messages, etc.
> >>
> >> Essentially, the ice firmware can report various status or diagnostic
> >> messages which are useful for debugging internal behavior. We want to be
> >> able to get these messages (and relevant data associated with them) in a
> >> format beyond just "dump it to the dmesg buffer and recover it later".
> >>
> >> It seems like this would be an appropriate use of devlink. I thought
> >> maybe this would work with devlink health:
> >>
> >> i.e. we create a devlink health reporter, and then when firmware sends a
> >> message, we use devlink_health_report.
> >>
> >> But when I dug into this, it doesn't seem like a natural fit. The health
> >> reporters expect to see an "error" state, and don't seem to really fit
> >> the notion of "log a message from firmware".
> >>
> >> One of the issues is that the health reporter only keeps one dump, when
> >> what we really want is a way to have a monitoring application get the
> >> dump and then store its contents.
> >>
> >> Thoughts on what might make sense for this? It feels like a stretch of
> >> the health interface...
> >>
> >> I mean basically what I am thinking of is using the devlink_fmsg
> >> interface to just send a netlink message that then gets sent over the
> >> devlink monitor socket and gets dumped immediately.
> >
> > Why does user space need a raw firmware interface in the first place?
> >
> > Examples?
> >
>
> So the ice firmware can optionally send diagnostic debug messages via
> its control queue. The current solutions we've used internally
> essentially hex-dump the binary contents to the kernel log, and then
> these get scraped and converted into a useful format for human consumption.
>
> I'm not 100% sure of the format, but I know it's based on a decoding file
> that is specific to a given firmware image, and thus attempting to tie
> this into the driver is problematic.

You explained how it works, but not why it's needed :)

> There is also a plan to provide a simpler interface for some of the
> diagnostic messages, where there is a simple bijection between one code
> and one message for a handful of events, such as when the link engine can
> detect a known reason why it wasn't able to get link. I suppose these
> could be translated and immediately printed by the driver without a
> special interface.

Petr worked on something similar last year:
https://lore.kernel.org/netdev/cover.1552672441.git.pe...@mellanox.com/

Amit is currently working on a new version based on ethtool (netlink).
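
For the "one code to one message" case quoted above, a rough sketch of what translating in the driver could look like is a plain lookup table. This is only an illustration: the enum values, the strings and the helper name fw_link_event_to_str() are all made up here, since the thread does not give the real firmware codes, and it is written as ordinary C rather than actual ice driver code.

```c
#include <stddef.h>

/*
 * Hypothetical event codes (assumption): the real firmware codes are not
 * given in this thread, these exist only to illustrate the mapping.
 */
enum fw_link_event {
	FW_LINK_EVENT_NONE = 0,
	FW_LINK_EVENT_NO_CABLE,
	FW_LINK_EVENT_UNSUPPORTED_MODULE,
	FW_LINK_EVENT_AUTONEG_FAILED,
	FW_LINK_EVENT_MAX,	/* must stay last */
};

/* One message per code; the strings are invented for this sketch. */
static const char * const fw_link_event_str[FW_LINK_EVENT_MAX] = {
	[FW_LINK_EVENT_NONE]			= "no event",
	[FW_LINK_EVENT_NO_CABLE]		= "no cable detected",
	[FW_LINK_EVENT_UNSUPPORTED_MODULE]	= "unsupported module",
	[FW_LINK_EVENT_AUTONEG_FAILED]		= "autonegotiation failed",
};

/* Translate a code into its message, with a fallback for unknown codes. */
const char *fw_link_event_to_str(unsigned int code)
{
	if (code >= FW_LINK_EVENT_MAX || !fw_link_event_str[code])
		return "unknown firmware event";
	return fw_link_event_str[code];
}
```

A caller would pass the code taken from the firmware message and print the returned string; anything outside the table falls back to a generic message, so a newer firmware with extra codes does not index past the end of the array.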