Wed, Jul 01, 2020 at 01:59:14PM CEST, vasundhara-v.vo...@broadcom.com wrote: >On Wed, Jul 1, 2020 at 3:17 PM Jiri Pirko <j...@resnulli.us> wrote: >> >> Wed, Jul 01, 2020 at 11:25:50AM CEST, vasundhara-v.vo...@broadcom.com wrote: >> >On Wed, Jul 1, 2020 at 11:21 AM Jiri Pirko <j...@resnulli.us> wrote: >> >> >> >> Tue, Jun 30, 2020 at 05:15:18PM CEST, vasundhara-v.vo...@broadcom.com >> >> wrote: >> >> >On Tue, Jun 30, 2020 at 6:23 PM Jiri Pirko <j...@resnulli.us> wrote: >> >> >> >> >> >> Tue, Jun 30, 2020 at 01:34:06PM CEST, vasundhara-v.vo...@broadcom.com >> >> >> wrote: >> >> >> >Advanced NICs support live reset of some of the hardware >> >> >> >components, that resets the device immediately with all the >> >> >> >host drivers loaded. >> >> >> > >> >> >> >Add devlink reset subcommand to support live and deferred modes >> >> >> >of reset. It allows to reset the hardware components of the >> >> >> >entire device and supports the following fields: >> >> >> > >> >> >> >component: >> >> >> >---------- >> >> >> >1. MGMT : Management processor. >> >> >> >2. DMA : DMA engine. >> >> >> >3. RAM : RAM shared between multiple components. >> >> >> >4. AP : Application processor. >> >> >> >5. ROCE : RoCE management processor. >> >> >> >6. All : All possible components. >> >> >> > >> >> >> >Drivers are allowed to reset only a subset of requested components. >> >> >> >> >> >> I don't understand why would user ever want to do this. He does not >> >> >> care >> >> >> about some magic hw entities. He just expects the hw to work. I don't >> >> >> undestand the purpose of exposing something like this. Could you please >> >> >> explain in details? Thanks! >> >> >> >> >> >If a user requests multiple components and if the driver is only able >> >> >to honor a subset, the driver will return the components unset which >> >> >it is able to reset. For example, if a user requests MGMT, RAM and >> >> >ROCE components to be reset and driver resets only MGMT and ROCE. >> >> >Driver will unset only MGMT and ROCE bits and notifies the user that >> >> >RAM is not reset. >> >> > >> >> >This will be useful for drivers to reset only a subset of components >> >> >requested instead of returning error or silently doing only a subset >> >> >of components. >> >> > >> >> >Also, this will be helpful as user will not know the components >> >> >supported by different vendors. >> >> >> >> Your reply does not seem to be related to my question :/ >> >I thought that you were referring to: "Drivers are allowed to reset >> >only a subset of requested components." >> > >> >or were you referring to components? If yes, the user can select the >> >components that he wants to go for reset. This will be useful in the >> >case where, if the user flashed only a certain component and he wants >> >to reset that particular component. For example, in the case of SOC >> >there are 2 components: MGMT and AP. If a user flashes only >> >application processor, he can choose to reset only application >> >processor. >> >> We already have notion of "a component" in "devlink dev flash". I think >> that the reset component name should be in-sync with the flash. >Only 1 type of component "ETHTOOL_FLASH_ALL_REGIONS" is defined
I wonder why did you get impression I'm talking about ethtool. I'm not. I'm talking about "devlink dev flash". >currently. We can have same components for reset as well and extend as >needed. >> >> Thinking about it a bit more, we can extend the flash command by "reset" >> attribute that would indicate use wants to do flash&reset right away. >This will remove the freedom of user to reset later after flashing. >But I think it is fine. > >Also, I think adding reset attribute may complicate the flash command >as we need more attributes for reset alone like width and mode. >> >> Also, thinking how this all aligns with "devlink dev reload" which we >> currently have. The purpose of it is to re-instantiate driver instances, >> but in case of mlxsw it means friggering FW reset as well. >As I understand, "devlink dev reload" is to re-instantiate driver >instances and will not be able to send firmware command to request a >reset. Well, it does. >> >> Moshe (cced) is now working on "devlink dev reload" extension that would >> allow user to ask for a certain level of reload: driver instances only, >> fw reset too, live fw patching, etc. >> >> Not sure how this overlaps with your intentions. I think it would be >> great to see Moshe's RFC here as well so we can aligh the efforts. > >Sure, I will wait for RFC to get more idea. > >Thanks. >> >> >> > >> >> >> >> >> >> > >> >> >Thanks, >> >> >Vasundhara >> >> > >> >> >> >> >> >> > >> >> >> >width: >> >> >> >------ >> >> >> >1. single - single host. >> >> >> >2. multi - Multi host. >> >> >> > >> >> >> >mode: >> >> >> >----- >> >> >> >1. deferred - Reset will happen after unloading all the host drivers >> >> >> > on the device. This is be default reset type, if user >> >> >> > does not specify the type. >> >> >> >2. live - Reset will happen immediately with all host drivers loaded >> >> >> > in real time. If the live reset is not supported, driver >> >> >> > will return the error. >> >> >> > >> >> >> >This patch is a proposal in continuation to discussion to the >> >> >> >following thread: >> >> >> > >> >> >> >"[PATCH v3 net-next 0/6] bnxt_en: Add 'enable_live_dev_reset' and >> >> >> >'allow_live_dev_reset' generic devlink params." >> >> >> > >> >> >> >and here is the URL to the patch series: >> >> >> > >> >> >> >https://patchwork.ozlabs.org/project/netdev/list/?series=180426&state=* >> >> >> > >> >> >> >If the proposal looks good, I will re-send the whole patchset >> >> >> >including devlink changes and driver usage. >> >> >> > >> >> >> >Signed-off-by: Vasundhara Volam <vasundhara-v.vo...@broadcom.com> >> >> >> >Reviewed-by: Michael Chan <michael.c...@broadcom.com> >> >> >> >--- >> >> >> >v2: >> >> >> >- Switch RAM and AP component definitions. >> >> >> >- Remove IRQ, FILTER, OFFLOAD, MAC, PHY components as they are port >> >> >> >specific components. >> >> >> >- Rename function to host in width parameter. >> >> >> >--- >> >> >> > Documentation/networking/devlink/devlink-reset.rst | 50 +++++++++++++ >> >> >> > include/net/devlink.h | 2 + >> >> >> > include/uapi/linux/devlink.h | 46 ++++++++++++ >> >> >> > net/core/devlink.c | 85 >> >> >> > ++++++++++++++++++++++ >> >> >> > 4 files changed, 183 insertions(+) >> >> >> > create mode 100644 Documentation/networking/devlink/devlink-reset.rst >> >> >> > >> >> >> >diff --git a/Documentation/networking/devlink/devlink-reset.rst >> >> >> >b/Documentation/networking/devlink/devlink-reset.rst >> >> >> >new file mode 100644 >> >> >> >index 0000000..652800d >> >> >> >--- /dev/null >> >> >> >+++ b/Documentation/networking/devlink/devlink-reset.rst >> >> >> >@@ -0,0 +1,50 @@ >> >> >> >+.. SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) >> >> >> >+ >> >> >> >+.. _devlink_reset: >> >> >> >+ >> >> >> >+============= >> >> >> >+Devlink reset >> >> >> >+============= >> >> >> >+ >> >> >> >+The ``devlink-reset`` API allows reset the hardware components of >> >> >> >the device. After the reset, >> >> >> >+device loads the pending updated firmware image. >> >> >> >+Example use:: >> >> >> >+ >> >> >> >+ $ devlink dev reset pci/0000:05:00.0 components COMPONENTS >> >> >> >+ >> >> >> >+Note that user can mention multiple components. >> >> >> >+ >> >> >> >+================ >> >> >> >+Reset components >> >> >> >+================ >> >> >> >+ >> >> >> >+List of available components:: >> >> >> >+ >> >> >> >+``DEVLINK_RESET_COMP_MGMT`` - Management processor. >> >> >> >+``DEVLINK_RESET_COMP_DMA`` - DMA engine. >> >> >> >+``DEVLINK_RESET_COMP_RAM`` - RAM shared between multiple components. >> >> >> >+``DEVLINK_RESET_COMP_AP`` - Application processor. >> >> >> >+``DEVLINK_RESET_COMP_ROCE`` - RoCE management processor. >> >> >> >+``DEVLINK_RESET_COMP_ALL`` - All components. >> >> >> >+ >> >> >> >+=========== >> >> >> >+Reset width >> >> >> >+=========== >> >> >> >+ >> >> >> >+List of available widths:: >> >> >> >+ >> >> >> >+``DEVLINK_RESET_WIDTH_SINGLE`` - Device is used by single dedicated >> >> >> >host. >> >> >> >+``DEVLINK_RESET_WIDTH_MULTI`` - Device is shared across multiple >> >> >> >hosts. >> >> >> >+ >> >> >> >+Note that if user specifies DEVLINK_RESET_WIDTH_SINGLE in a >> >> >> >multi-host environment, driver returns >> >> >> >+error if it does not support resetting a single host. >> >> >> >+ >> >> >> >+=========== >> >> >> >+Reset modes >> >> >> >+=========== >> >> >> >+ >> >> >> >+List of available reset modes:: >> >> >> >+ >> >> >> >+``DEVLINK_RESET_MODE_DEFERRED`` - Reset happens after all host >> >> >> >drivers are unloaded on the device. >> >> >> >+``DEVLINK_RESET_MODE_LIVE`` - Reset happens immediately, with >> >> >> >all loaded host drivers in real >> >> >> >+ time. >> >> >> >diff --git a/include/net/devlink.h b/include/net/devlink.h >> >> >> >index 428f55f..a71c8f5 100644 >> >> >> >--- a/include/net/devlink.h >> >> >> >+++ b/include/net/devlink.h >> >> >> >@@ -1129,6 +1129,8 @@ struct devlink_ops { >> >> >> > int (*port_function_hw_addr_set)(struct devlink *devlink, >> >> >> > struct devlink_port *port, >> >> >> > const u8 *hw_addr, int >> >> >> > hw_addr_len, >> >> >> > struct netlink_ext_ack >> >> >> > *extack); >> >> >> >+ int (*reset)(struct devlink *devlink, u32 *components, u8 >> >> >> >width, u8 mode, >> >> >> >+ struct netlink_ext_ack *extack); >> >> >> > }; >> >> >> > >> >> >> > static inline void *devlink_priv(struct devlink *devlink) >> >> >> >diff --git a/include/uapi/linux/devlink.h >> >> >> >b/include/uapi/linux/devlink.h >> >> >> >index 87c83a8..6f32c00 100644 >> >> >> >--- a/include/uapi/linux/devlink.h >> >> >> >+++ b/include/uapi/linux/devlink.h >> >> >> >@@ -122,6 +122,9 @@ enum devlink_command { >> >> >> > DEVLINK_CMD_TRAP_POLICER_NEW, >> >> >> > DEVLINK_CMD_TRAP_POLICER_DEL, >> >> >> > >> >> >> >+ DEVLINK_CMD_RESET, >> >> >> >+ DEVLINK_CMD_RESET_STATUS, /* notification only */ >> >> >> >+ >> >> >> > /* add new commands above here */ >> >> >> > __DEVLINK_CMD_MAX, >> >> >> > DEVLINK_CMD_MAX = __DEVLINK_CMD_MAX - 1 >> >> >> >@@ -265,6 +268,44 @@ enum devlink_trap_type { >> >> >> > DEVLINK_TRAP_TYPE_CONTROL, >> >> >> > }; >> >> >> > >> >> >> >+/** >> >> >> >+ * enum devlink_reset_component - Reset components. >> >> >> >+ * @DEVLINK_RESET_COMP_MGMT: Management processor. >> >> >> >+ * @DEVLINK_RESET_COMP_DMA: DMA engine. >> >> >> >+ * @DEVLINK_RESET_COMP_RAM: RAM shared between multiple components. >> >> >> >+ * @DEVLINK_RESET_COMP_AP: Application processor. >> >> >> >+ * @DEVLINK_RESET_COMP_ROCE: RoCE management processor. >> >> >> >+ * @DEVLINK_RESET_COMP_ALL: All components. >> >> >> >+ */ >> >> >> >+enum devlink_reset_component { >> >> >> >+ DEVLINK_RESET_COMP_MGMT = (1 << 0), >> >> >> >+ DEVLINK_RESET_COMP_DMA = (1 << 1), >> >> >> >+ DEVLINK_RESET_COMP_RAM = (1 << 2), >> >> >> >+ DEVLINK_RESET_COMP_AP = (1 << 3), >> >> >> >+ DEVLINK_RESET_COMP_ROCE = (1 << 4), >> >> >> >+ DEVLINK_RESET_COMP_ALL = 0xffffffff, >> >> >> >+}; >> >> >> >+ >> >> >> >+/** >> >> >> >+ * enum devlink_reset_width - Number of hosts effected by reset. >> >> >> >+ * @DEVLINK_RESET_WIDTH_SINGLE: Device is used by single dedicated >> >> >> >host. >> >> >> >+ * @DEVLINK_RESET_WIDTH_MULTI: Device is shared across multiple >> >> >> >hosts. >> >> >> >+ */ >> >> >> >+enum devlink_reset_width { >> >> >> >+ DEVLINK_RESET_WIDTH_SINGLE = 0, >> >> >> >+ DEVLINK_RESET_WIDTH_MULTI = 1, >> >> >> >+}; >> >> >> >+ >> >> >> >+/** >> >> >> >+ * enum devlink_reset_mode - Modes of reset. >> >> >> >+ * @DEVLINK_RESET_MODE_DEFERRED: Reset will happen after host >> >> >> >drivers are unloaded. >> >> >> >+ * @DEVLINK_RESET_MODE_LIVE: All host drivers also will be reset >> >> >> >without reloading manually. >> >> >> >+ */ >> >> >> >+enum devlink_reset_mode { >> >> >> >+ DEVLINK_RESET_MODE_DEFERRED = 0, >> >> >> >+ DEVLINK_RESET_MODE_LIVE = 1, >> >> >> >+}; >> >> >> >+ >> >> >> > enum { >> >> >> > /* Trap can report input port as metadata */ >> >> >> > DEVLINK_ATTR_TRAP_METADATA_TYPE_IN_PORT, >> >> >> >@@ -455,6 +496,11 @@ enum devlink_attr { >> >> >> > >> >> >> > DEVLINK_ATTR_INFO_BOARD_SERIAL_NUMBER, /* string */ >> >> >> > >> >> >> >+ DEVLINK_ATTR_RESET_COMPONENTS, /* u32 */ >> >> >> >+ DEVLINK_ATTR_RESET_WIDTH, /* u8 */ >> >> >> >+ DEVLINK_ATTR_RESET_MODE, /* u8 */ >> >> >> >+ DEVLINK_ATTR_RESET_STATUS_MSG, /* string */ >> >> >> >+ >> >> >> > /* add new attributes above here, update the policy in >> >> >> > devlink.c */ >> >> >> > >> >> >> > __DEVLINK_ATTR_MAX, >> >> >> >diff --git a/net/core/devlink.c b/net/core/devlink.c >> >> >> >index 6ae3680..c0eebc5 100644 >> >> >> >--- a/net/core/devlink.c >> >> >> >+++ b/net/core/devlink.c >> >> >> >@@ -6797,6 +6797,82 @@ static int >> >> >> >devlink_nl_cmd_trap_policer_set_doit(struct sk_buff *skb, >> >> >> > return devlink_trap_policer_set(devlink, policer_item, info); >> >> >> > } >> >> >> > >> >> >> >+static int devlink_nl_reset_fill(struct sk_buff *msg, struct devlink >> >> >> >*devlink, >> >> >> >+ const char *status_msg, u32 >> >> >> >components) >> >> >> >+{ >> >> >> >+ void *hdr; >> >> >> >+ >> >> >> >+ hdr = genlmsg_put(msg, 0, 0, &devlink_nl_family, 0, >> >> >> >DEVLINK_CMD_RESET_STATUS); >> >> >> >+ if (!hdr) >> >> >> >+ return -EMSGSIZE; >> >> >> >+ >> >> >> >+ if (devlink_nl_put_handle(msg, devlink)) >> >> >> >+ goto nla_put_failure; >> >> >> >+ >> >> >> >+ if (status_msg && nla_put_string(msg, >> >> >> >DEVLINK_ATTR_RESET_STATUS_MSG, status_msg)) >> >> >> >+ goto nla_put_failure; >> >> >> >+ >> >> >> >+ if (nla_put_u32(msg, DEVLINK_ATTR_RESET_COMPONENTS, >> >> >> >components)) >> >> >> >+ goto nla_put_failure; >> >> >> >+ >> >> >> >+ genlmsg_end(msg, hdr); >> >> >> >+ return 0; >> >> >> >+ >> >> >> >+nla_put_failure: >> >> >> >+ genlmsg_cancel(msg, hdr); >> >> >> >+ return -EMSGSIZE; >> >> >> >+} >> >> >> >+ >> >> >> >+static void __devlink_reset_notify(struct devlink *devlink, const >> >> >> >char *status_msg, u32 components) >> >> >> >+{ >> >> >> >+ struct sk_buff *msg; >> >> >> >+ int err; >> >> >> >+ >> >> >> >+ msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); >> >> >> >+ if (!msg) >> >> >> >+ return; >> >> >> >+ >> >> >> >+ err = devlink_nl_reset_fill(msg, devlink, status_msg, >> >> >> >components); >> >> >> >+ if (err) >> >> >> >+ goto out; >> >> >> >+ >> >> >> >+ genlmsg_multicast_netns(&devlink_nl_family, >> >> >> >devlink_net(devlink), msg, 0, >> >> >> >+ DEVLINK_MCGRP_CONFIG, GFP_KERNEL); >> >> >> >+ return; >> >> >> >+ >> >> >> >+out: >> >> >> >+ nlmsg_free(msg); >> >> >> >+} >> >> >> >+ >> >> >> >+static int devlink_nl_cmd_reset(struct sk_buff *skb, struct >> >> >> >genl_info *info) >> >> >> >+{ >> >> >> >+ struct devlink *devlink = info->user_ptr[0]; >> >> >> >+ u32 components, req_comps; >> >> >> >+ struct nlattr *nla_type; >> >> >> >+ u8 width, mode; >> >> >> >+ int err; >> >> >> >+ >> >> >> >+ if (!devlink->ops->reset) >> >> >> >+ return -EOPNOTSUPP; >> >> >> >+ >> >> >> >+ if (!info->attrs[DEVLINK_ATTR_RESET_COMPONENTS]) >> >> >> >+ return -EINVAL; >> >> >> >+ components = >> >> >> >nla_get_u32(info->attrs[DEVLINK_ATTR_RESET_COMPONENTS]); >> >> >> >+ >> >> >> >+ nla_type = info->attrs[DEVLINK_ATTR_RESET_WIDTH]; >> >> >> >+ width = nla_type ? nla_get_u8(nla_type) : >> >> >> >DEVLINK_RESET_WIDTH_SINGLE; >> >> >> >+ >> >> >> >+ nla_type = info->attrs[DEVLINK_ATTR_RESET_MODE]; >> >> >> >+ mode = nla_type ? nla_get_u8(nla_type) : >> >> >> >DEVLINK_RESET_MODE_DEFERRED; >> >> >> >+ >> >> >> >+ req_comps = components; >> >> >> >+ __devlink_reset_notify(devlink, "Reset request", components); >> >> >> >+ err = devlink->ops->reset(devlink, &components, width, mode, >> >> >> >info->extack); >> >> >> >+ __devlink_reset_notify(devlink, "Components reset", req_comps >> >> >> >& ~components); >> >> >> >+ >> >> >> >+ return err; >> >> >> >+} >> >> >> >+ >> >> >> > static const struct nla_policy devlink_nl_policy[DEVLINK_ATTR_MAX + >> >> >> > 1] = { >> >> >> > [DEVLINK_ATTR_UNSPEC] = { .strict_start_type = >> >> >> > DEVLINK_ATTR_TRAP_POLICER_ID }, >> >> >> >@@ -6842,6 +6918,9 @@ static int >> >> >> >devlink_nl_cmd_trap_policer_set_doit(struct sk_buff *skb, >> >> >> > [DEVLINK_ATTR_TRAP_POLICER_RATE] = { .type = NLA_U64 }, >> >> >> > [DEVLINK_ATTR_TRAP_POLICER_BURST] = { .type = NLA_U64 }, >> >> >> > [DEVLINK_ATTR_PORT_FUNCTION] = { .type = NLA_NESTED }, >> >> >> >+ [DEVLINK_ATTR_RESET_COMPONENTS] = { .type = NLA_U32 }, >> >> >> >+ [DEVLINK_ATTR_RESET_WIDTH] = { .type = NLA_U8 }, >> >> >> >+ [DEVLINK_ATTR_RESET_MODE] = { .type = NLA_U8 }, >> >> >> > }; >> >> >> > >> >> >> > static const struct genl_ops devlink_nl_ops[] = { >> >> >> >@@ -7190,6 +7269,12 @@ static int >> >> >> >devlink_nl_cmd_trap_policer_set_doit(struct sk_buff *skb, >> >> >> > .flags = GENL_ADMIN_PERM, >> >> >> > .internal_flags = DEVLINK_NL_FLAG_NEED_DEVLINK, >> >> >> > }, >> >> >> >+ { >> >> >> >+ .cmd = DEVLINK_CMD_RESET, >> >> >> >+ .doit = devlink_nl_cmd_reset, >> >> >> >+ .flags = GENL_ADMIN_PERM, >> >> >> >+ .internal_flags = DEVLINK_NL_FLAG_NEED_DEVLINK, >> >> >> >+ }, >> >> >> > }; >> >> >> > >> >> >> > static struct genl_family devlink_nl_family __ro_after_init = { >> >> >> >-- >> >> >> >1.8.3.1 >> >> >> >