Wed, May 27, 2020 at 10:57:11PM CEST, michael.c...@broadcom.com wrote: >On Wed, May 27, 2020 at 1:14 PM Jakub Kicinski <k...@kernel.org> wrote: >> >> On Wed, 27 May 2020 09:07:09 +0530 Vasundhara Volam wrote: >> > Here is a sample sequence of commands to do a "live reset" to get some >> > clear idea. >> > Note that I am providing the examples based on the current patchset. >> > >> > 1. FW live reset is disabled in the device/adapter. Here adapter has 2 >> > physical ports. >> > >> > $ devlink dev >> > pci/0000:3b:00.0 >> > pci/0000:3b:00.1 >> > pci/0000:af:00.0 >> > $ devlink dev param show pci/0000:3b:00.0 name allow_fw_live_reset >> > pci/0000:3b:00.0: >> > name allow_fw_live_reset type generic >> > values: >> > cmode runtime value false >> > cmode permanent value false >> > $ devlink dev param show pci/0000:3b:00.1 name allow_fw_live_reset >> > pci/0000:3b:00.1: >> > name allow_fw_live_reset type generic >> > values: >> > cmode runtime value false >> > cmode permanent value false >> >> What's the permanent value? What if after reboot the driver is too old >> to change this, is the reset still allowed? > >The permanent value should be the NVRAM value. If the NVRAM value is >false, the feature is always and unconditionally disabled. If the >permanent value is true, the feature will only be available when all >loaded drivers indicate support for it and set the runtime value to >true. If an old driver is loaded afterwards, it wouldn't indicate >support for this feature and it wouldn't set the runtime value to >true. So the feature will not be available until the old driver is >unloaded or upgraded. > >> >> > 2. If a user issues "ethtool --reset p1p1 all", the device cannot >> > perform "live reset" as capability is not enabled. >> > >> > User needs to do a driver reload, for firmware to undergo reset. >> >> Why does driver reload have anything to do with resetting a potentially >> MH device? > >I think she meant that all drivers have to be unloaded before the >reset would take place in case it's a MH device since live reset is >not supported. If it's a single function device, unloading this >driver is sufficient. > >> >> > $ ethtool --reset p1p1 all >> >> Reset probably needs to be done via devlink. In any case you need a new >> reset level for resetting MH devices and smartnics, because the current >> reset mask covers port local, and host local cases, not any form of MH. > >RIght. This reset could be just a single function reset in this example. > >> >> > ETHTOOL_RESET 0xffffffff >> > Components reset: 0xff0000 >> > Components not reset: 0xff00ffff >> > $ dmesg >> > [ 198.745822] bnxt_en 0000:3b:00.0 p1p1: Firmware reset request >> > successful. >> > [ 198.745836] bnxt_en 0000:3b:00.0 p1p1: Reload driver to complete reset >> >> You said the reset was not performed, yet there is no information to >> that effect in the log?! > >The firmware has been requested to reset, but the reset hasn't taken >place yet because live reset cannot be done. We can make the logs >more clear. > >> >> > 3. Now enable the capability in the device and reboot for device to >> > enable the capability. Firmware does not get reset just by setting the >> > param to true. >> > >> > $ devlink dev param set pci/0000:3b:00.1 name allow_fw_live_reset >> > value true cmode permanent >> > >> > 4. After reboot, values of param. >> >> Is the reboot required here? >> > >In general, our new NVRAM permanent parameters will take effect after >reset (or reboot).
Ah, you need a reboot. I was not expecting that :/ So the "devlink dev reload" attr would not work for you. MLNX hardware can change this on runtime. > >> > $ devlink dev param show pci/0000:3b:00.1 name allow_fw_live_reset >> > pci/0000:3b:00.1: >> > name allow_fw_live_reset type generic >> > values: >> > cmode runtime value true >> >> Why is runtime value true now? >> > >If the permanent (NVRAM) parameter is true, all loaded new drivers >will indicate support for this feature and set the runtime value to >true by default. The runtime value would not be true if any loaded >driver is too old or has set the runtime value to false. This is a bit odd. It is a configuration, not an indication. When you want to indicate what you support something, I think it should be done in a different place. I think that "devlink dev info" is the place to put it, I think that we need "capabilities" there.