[PATCH dpdk v3 0/2] Fix race in ethdev telemetry

2024-10-14 Thread Robin Jarry
Following a discussion we had during the summit, here is a series to
fix a race between an application thread and the telemetry thread
handling requests on ethdev ports.

The problem may be generic to other device classes providing telemetry
callbacks, but for now, this series goes with a simple and naive
approach of putting locks in the ethdev layer.

v3: reordered callback arguments.

v2: added new telemetry api to register callbacks with a private arg.

Robin Jarry (2):
  telemetry: add api to register command with private argument
  ethdev: fix potential race in telemetry endpoints

 doc/guides/rel_notes/release_24_11.rst |  5 ++
 lib/ethdev/rte_ethdev_telemetry.c  | 66 ++
 lib/telemetry/rte_telemetry.h  | 46 ++
 lib/telemetry/telemetry.c  | 38 +++
 lib/telemetry/version.map  |  3 ++
 5 files changed, 131 insertions(+), 27 deletions(-)

-- 
2.46.2



Re: [PATCH dpdk v3 2/2] ethdev: fix potential race in telemetry endpoints

2024-10-14 Thread Stephen Hemminger
On Mon, 14 Oct 2024 21:32:37 +0200
Robin Jarry  wrote:

> While invoking telemetry commands (which may happen at any time, out of
> control of the application), an application thread may concurrently
> add/remove ports. The telemetry callbacks may then access partially
> initialized/uninitialized ethdev data.
> 
> Reuse the ethdev lock that protects port allocation/destruction and the
> new telemetry callback register api that takes an additional private
> argument. Pass eth_dev_telemetry_do as the main callback and the actual
> endpoint callbacks as private argument.
> 
> Fixes: c190daedb9b1 ("ethdev: add telemetry callbacks")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Robin Jarry 
> Acked-by: Bruce Richardson 
> ---
>  lib/ethdev/rte_ethdev_telemetry.c | 66 ++-
>  1 file changed, 47 insertions(+), 19 deletions(-)
> 
> diff --git a/lib/ethdev/rte_ethdev_telemetry.c b/lib/ethdev/rte_ethdev_telemetry.c
> index 6b873e7abe68..7599fa2852b6 100644
> --- a/lib/ethdev/rte_ethdev_telemetry.c
> +++ b/lib/ethdev/rte_ethdev_telemetry.c
> @@ -1395,45 +1395,73 @@ eth_dev_handle_port_tm_node_caps(const char *cmd __rte_unused,
>   return ret;
>  }
>  
> +static int eth_dev_telemetry_do(const char *cmd, const char *params,
> + void *arg, struct rte_tel_data *d)
> +{
> + int ret;
> + telemetry_cb fn = arg;
> + rte_spinlock_lock(rte_mcfg_ethdev_get_lock());
> + ret = fn(cmd, params, d);
> + rte_spinlock_unlock(rte_mcfg_ethdev_get_lock());
> + return ret;
> +}

If this happens often and the callback takes a long time (for example, doing I/O),
it might be worth changing this to a reader/writer lock in the future.

Also, it would be best to add a comment here stating what is being protected,
if you do another version.

Acked-by: Stephen Hemminger 


Re: [PATCH v6 2/3] doc: update graph layout and node anatomy images

2024-10-14 Thread Robin Jarry

, Oct 14, 2024 at 18:10:

From: Pavan Nikhilesh 

update the graph memory layout and node anatomy
images to reflect the xstats memory region.

Signed-off-by: Pavan Nikhilesh 
---


Reviewed-by: Robin Jarry 



[PATCH dpdk v3 1/2] telemetry: add api to register command with private argument

2024-10-14 Thread Robin Jarry
Add a new rte_telemetry_register_cmd_arg public function to register
a telemetry endpoint with a callback that takes an additional private
argument.

This will be used in the next commit to protect ethdev endpoints with
a lock.

Update perform_command() to take a struct callback object copied from
the list of callbacks and invoke the correct function pointer.

Update release notes.

Signed-off-by: Robin Jarry 
Acked-by: Bruce Richardson 
---

Notes:
v3: reorder callback arguments

 doc/guides/rel_notes/release_24_11.rst |  5 +++
 lib/telemetry/rte_telemetry.h  | 46 ++
 lib/telemetry/telemetry.c  | 38 -
 lib/telemetry/version.map  |  3 ++
 4 files changed, 84 insertions(+), 8 deletions(-)

diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index dcee09b5d0b2..26590f1b2819 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -178,6 +178,11 @@ New Features
   This field is used to pass an extra configuration settings such as ability
   to lookup IPv4 addresses in network byte order.
 
+* **Added new API to register telemetry endpoint callbacks with private arguments.**
+
+  A new ``rte_telemetry_register_cmd_arg`` function is available to pass an opaque value to
+  telemetry endpoint callback.
+
 
 Removed Items
 -------------
diff --git a/lib/telemetry/rte_telemetry.h b/lib/telemetry/rte_telemetry.h
index 463819e2bfe5..2ccfc73a5f01 100644
--- a/lib/telemetry/rte_telemetry.h
+++ b/lib/telemetry/rte_telemetry.h
@@ -336,6 +336,30 @@ rte_tel_data_add_dict_uint_hex(struct rte_tel_data *d, const char *name,
 typedef int (*telemetry_cb)(const char *cmd, const char *params,
struct rte_tel_data *info);
 
+/**
+ * This telemetry callback is used when registering a telemetry command with
+ * rte_telemetry_register_cmd_arg().
+ *
+ * It handles getting and formatting information to be returned to telemetry
+ * when requested.
+ *
+ * @param cmd
+ *   The cmd that was requested by the client.
+ * @param params
+ *   Contains data required by the callback function.
+ * @param arg
+ *   The opaque value that was passed to rte_telemetry_register_cmd_arg().
+ * @param info
+ *   The information to be returned to the caller.
+ *
+ * @return
+ *   Length of buffer used on success.
+ * @return
+ *   Negative integer on error.
+ */
+typedef int (*telemetry_arg_cb)(const char *cmd, const char *params, void *arg,
+   struct rte_tel_data *info);
+
 /**
  * Used for handling data received over a telemetry socket.
  *
@@ -367,6 +391,28 @@ typedef void * (*handler)(void *sock_id);
 int
 rte_telemetry_register_cmd(const char *cmd, telemetry_cb fn, const char *help);
 
+/**
+ * Used when registering a command and callback function with telemetry.
+ *
+ * @param cmd
+ *   The command to register with telemetry.
+ * @param fn
+ *   Callback function to be called when the command is requested.
+ * @param arg
+ *   An opaque value that will be passed to the callback function.
+ * @param help
+ *   Help text for the command.
+ *
+ * @return
+ *   0 on success.
+ * @return
+ *   -EINVAL for invalid parameters failure.
+ * @return
+ *   -ENOMEM for mem allocation failure.
+ */
+__rte_experimental
+int
+rte_telemetry_register_cmd_arg(const char *cmd, telemetry_arg_cb fn, void *arg, const char *help);
 
 /**
  * Get a pointer to a container with memory allocated. The container is to be
diff --git a/lib/telemetry/telemetry.c b/lib/telemetry/telemetry.c
index c4c5a61a5cf8..31a2c91c0657 100644
--- a/lib/telemetry/telemetry.c
+++ b/lib/telemetry/telemetry.c
@@ -37,6 +37,8 @@ client_handler(void *socket);
 struct cmd_callback {
char cmd[MAX_CMD_LEN];
telemetry_cb fn;
+   telemetry_arg_cb fn_arg;
+   void *arg;
char help[RTE_TEL_MAX_STRING_LEN];
 };
 
@@ -68,14 +70,15 @@ static rte_spinlock_t callback_sl = RTE_SPINLOCK_INITIALIZER;
 static RTE_ATOMIC(uint16_t) v2_clients;
 #endif /* !RTE_EXEC_ENV_WINDOWS */
 
-int
-rte_telemetry_register_cmd(const char *cmd, telemetry_cb fn, const char *help)
+static int
+register_cmd(const char *cmd, const char *help,
+telemetry_cb fn, telemetry_arg_cb fn_arg, void *arg)
 {
struct cmd_callback *new_callbacks;
const char *cmdp = cmd;
int i = 0;
 
-   if (strlen(cmd) >= MAX_CMD_LEN || fn == NULL || cmd[0] != '/'
+   if (strlen(cmd) >= MAX_CMD_LEN || (fn == NULL && fn_arg == NULL) || cmd[0] != '/'
|| strlen(help) >= RTE_TEL_MAX_STRING_LEN)
return -EINVAL;
 
@@ -102,6 +105,8 @@ rte_telemetry_register_cmd(const char *cmd, telemetry_cb fn, const char *help)
 
strlcpy(callbacks[i].cmd, cmd, MAX_CMD_LEN);
callbacks[i].fn = fn;
+   callbacks[i].fn_arg = fn_arg;
+   callbacks[i].arg = arg;
strlcpy(callbacks[i].help, help, RTE_TEL_MAX_STRING_LEN);
num_callbacks++;
r
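
As a rough, self-contained illustration of the mechanism this patch describes (one callback table holding either a plain callback or a callback with a private argument, and a dispatcher that works on a copy of the entry), here is a sketch that builds without DPDK. It is not the patch's code: the types plain_cb and arg_cb, the fixed-size table, the use of an int in place of struct rte_tel_data, the -ENOENT result for an unknown command, and the order in which the two pointers are tested are all assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MAX_CMD_LEN 56 /* illustrative sizes only */
#define MAX_CMDS 8

/* Hypothetical stand-ins for telemetry_cb and telemetry_arg_cb. */
typedef int (*plain_cb)(const char *cmd, const char *params, int *info);
typedef int (*arg_cb)(const char *cmd, const char *params, void *arg, int *info);

struct cmd_entry {
	char cmd[MAX_CMD_LEN];
	plain_cb fn;   /* set for callbacks without a private argument */
	arg_cb fn_arg; /* set for callbacks with a private argument */
	void *arg;     /* opaque value handed back to fn_arg */
};

static struct cmd_entry cmds[MAX_CMDS];
static int num_cmds;

/* Shared registration helper: exactly like the checks quoted in the diff,
 * at least one of the two callbacks must be provided. */
static int register_cmd(const char *cmd, plain_cb fn, arg_cb fn_arg, void *arg)
{
	if (cmd == NULL || cmd[0] != '/' || strlen(cmd) >= MAX_CMD_LEN ||
			(fn == NULL && fn_arg == NULL))
		return -EINVAL;
	if (num_cmds == MAX_CMDS)
		return -ENOMEM;

	strcpy(cmds[num_cmds].cmd, cmd); /* length was checked above */
	cmds[num_cmds].fn = fn;
	cmds[num_cmds].fn_arg = fn_arg;
	cmds[num_cmds].arg = arg;
	num_cmds++;
	return 0;
}

/* Dispatcher: copy the entry, then call whichever pointer is set. */
static int perform_command(const char *cmd, const char *params, int *info)
{
	assert(cmd != NULL && info != NULL);
	for (int i = 0; i < num_cmds; i++) {
		if (strcmp(cmds[i].cmd, cmd) != 0)
			continue;
		struct cmd_entry cb = cmds[i];
		if (cb.fn_arg != NULL)
			return cb.fn_arg(cmd, params, cb.arg, info);
		return cb.fn(cmd, params, info);
	}
	return -ENOENT;
}

/* Two demo endpoints, one of each kind. */
static int demo_plain(const char *cmd, const char *params, int *info)
{
	(void)cmd;
	(void)params;
	*info = 1;
	return 0;
}

static int demo_arg(const char *cmd, const char *params, void *arg, int *info)
{
	(void)cmd;
	(void)params;
	*info = *(int *)arg;
	return 0;
}
```

Keeping both pointers in one entry lets existing registrations keep working unchanged while new ones carry an opaque value, which is what the ethdev patch relies on.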

Re: [PATCH v4 01/47] net/bnxt: tf_core: fix wc tcam multi slice delete issue

2024-10-14 Thread Stephen Hemminger
On Fri,  4 Oct 2024 23:22:52 +0530
Sriharsha Basavapatna  wrote:

> From: Shahaji Bhosle 
> 
> FW tries to update the HWRM request data in the
> delete case to update the mode bit and also
> update invalid profile id. This update only
> happens when the data is sent over DMA. HWRM
> requests are read only buffers and cannot be
> updated. So driver now will always send WC
> tcam set message over DMA channel.
> 
> Update tunnel alloc apis to provide error message.
> 
> Fixes: ca5e61bd562d ("net/bnxt: support EM and TCAM lookup with table scope")
> Reviewed-by: Randy Schacher 
> Reviewed-by: Kishore Padmanabha 
> Signed-off-by: Shahaji Bhosle 
> Signed-off-by: Sriharsha Basavapatna 
> Reviewed-by: Ajit Khaparde 
> ---

The patch series needs to be rebased; there is some fuzz and git refuses
to apply it.

Also, not every patch builds. For example after applying the first patch
the build fails with:


../drivers/net/bnxt/tf_ulp/bnxt_tf_pmd_shim.c: In function ‘bnxt_tunnel_dst_port_alloc’:
../drivers/net/bnxt/tf_ulp/bnxt_tf_pmd_shim.c:40:17: error: implicit declaration of function ‘PMD_DRV_LOG’; did you mean ‘PMD_DRV_LOG_LINE’? [-Wimplicit-function-declaration]
   40 | PMD_DRV_LOG(ERR, "Tunnel type:%d alloc failed for port:%d error:%s\n",
      | ^~~
      | PMD_DRV_LOG_LINE
../drivers/net/bnxt/tf_ulp/bnxt_tf_pmd_shim.c:40:17: warning: nested extern declaration of ‘PMD_DRV_LOG’ [-Wnested-externs]
../drivers/net/bnxt/tf_ulp/bnxt_tf_pmd_shim.c:40:29: error: ‘ERR’ undeclared (first use in this function)
   40 | PMD_DRV_LOG(ERR, "Tunnel type:%d alloc failed for port:%d error:%s\n",
      | ^~~
../drivers/net/bnxt/tf_ulp/bnxt_tf_pmd_shim.c:40:29: note: each undeclared identifier is reported only once for each function it appears in
../drivers/net/bnxt/tf_ulp/bnxt_tf_pmd_shim.c: In function ‘bnxt_pmd_global_tunnel_set’:
../drivers/net/bnxt/tf_ulp/bnxt_tf_pmd_shim.c:601:37: error: ‘ERR’ undeclared (first use in this function)
  601 | PMD_DRV_LOG(ERR, "Tunnel type:%d alloc failed for port:%d error:%s\n",
      | ^~~
[16/2343] Compiling C object drivers/libtmp_rte_net_bnxt.a.p/net_bnxt_tf_ulp_ulp_mapper.c.o


[PATCH dpdk v3 2/2] ethdev: fix potential race in telemetry endpoints

2024-10-14 Thread Robin Jarry
While invoking telemetry commands (which may happen at any time, out of
control of the application), an application thread may concurrently
add/remove ports. The telemetry callbacks may then access partially
initialized/uninitialized ethdev data.

Reuse the ethdev lock that protects port allocation/destruction and the
new telemetry callback register api that takes an additional private
argument. Pass eth_dev_telemetry_do as the main callback and the actual
endpoint callbacks as private argument.

Fixes: c190daedb9b1 ("ethdev: add telemetry callbacks")
Cc: sta...@dpdk.org

Signed-off-by: Robin Jarry 
Acked-by: Bruce Richardson 
---
 lib/ethdev/rte_ethdev_telemetry.c | 66 ++-
 1 file changed, 47 insertions(+), 19 deletions(-)

diff --git a/lib/ethdev/rte_ethdev_telemetry.c b/lib/ethdev/rte_ethdev_telemetry.c
index 6b873e7abe68..7599fa2852b6 100644
--- a/lib/ethdev/rte_ethdev_telemetry.c
+++ b/lib/ethdev/rte_ethdev_telemetry.c
@@ -1395,45 +1395,73 @@ eth_dev_handle_port_tm_node_caps(const char *cmd __rte_unused,
return ret;
 }
 
+static int eth_dev_telemetry_do(const char *cmd, const char *params,
+   void *arg, struct rte_tel_data *d)
+{
+   int ret;
+   telemetry_cb fn = arg;
+   rte_spinlock_lock(rte_mcfg_ethdev_get_lock());
+   ret = fn(cmd, params, d);
+   rte_spinlock_unlock(rte_mcfg_ethdev_get_lock());
+   return ret;
+}
+
 RTE_INIT(ethdev_init_telemetry)
 {
-   rte_telemetry_register_cmd("/ethdev/list", eth_dev_handle_port_list,
+   rte_telemetry_register_cmd_arg("/ethdev/list",
+   eth_dev_telemetry_do, eth_dev_handle_port_list,
"Returns list of available ethdev ports. Takes no parameters");
-   rte_telemetry_register_cmd("/ethdev/stats", eth_dev_handle_port_stats,
+   rte_telemetry_register_cmd_arg("/ethdev/stats",
+   eth_dev_telemetry_do, eth_dev_handle_port_stats,
"Returns the common stats for a port. Parameters: int port_id");
-   rte_telemetry_register_cmd("/ethdev/xstats", eth_dev_handle_port_xstats,
+   rte_telemetry_register_cmd_arg("/ethdev/xstats",
+   eth_dev_telemetry_do, eth_dev_handle_port_xstats,
"Returns the extended stats for a port. Parameters: int port_id,hide_zero=true|false(Optional for indicates hide zero xstats)");
 #ifndef RTE_EXEC_ENV_WINDOWS
-   rte_telemetry_register_cmd("/ethdev/dump_priv", eth_dev_handle_port_dump_priv,
+   rte_telemetry_register_cmd_arg("/ethdev/dump_priv",
+   eth_dev_telemetry_do, eth_dev_handle_port_dump_priv,
"Returns dump private information for a port. Parameters: int port_id");
 #endif
-   rte_telemetry_register_cmd("/ethdev/link_status",
-   eth_dev_handle_port_link_status,
+   rte_telemetry_register_cmd_arg("/ethdev/link_status",
+   eth_dev_telemetry_do, eth_dev_handle_port_link_status,
"Returns the link status for a port. Parameters: int port_id");
-   rte_telemetry_register_cmd("/ethdev/info", eth_dev_handle_port_info,
+   rte_telemetry_register_cmd_arg("/ethdev/info",
+   eth_dev_telemetry_do, eth_dev_handle_port_info,
"Returns the device info for a port. Parameters: int port_id");
-   rte_telemetry_register_cmd("/ethdev/module_eeprom", eth_dev_handle_port_module_eeprom,
+   rte_telemetry_register_cmd_arg("/ethdev/module_eeprom",
+   eth_dev_telemetry_do, eth_dev_handle_port_module_eeprom,
"Returns module EEPROM info with SFF specs. Parameters: int port_id");
-   rte_telemetry_register_cmd("/ethdev/macs", eth_dev_handle_port_macs,
+   rte_telemetry_register_cmd_arg("/ethdev/macs",
+   eth_dev_telemetry_do, eth_dev_handle_port_macs,
"Returns the MAC addresses for a port. Parameters: int port_id");
-   rte_telemetry_register_cmd("/ethdev/flow_ctrl", eth_dev_handle_port_flow_ctrl,
+   rte_telemetry_register_cmd_arg("/ethdev/flow_ctrl",
+   eth_dev_telemetry_do, eth_dev_handle_port_flow_ctrl,
"Returns flow ctrl info for a port. Parameters: int port_id");
-   rte_telemetry_register_cmd("/ethdev/rx_queue", eth_dev_handle_port_rxq,
+   rte_telemetry_register_cmd_arg("/ethdev/rx_queue",
+   eth_dev_telemetry_do, eth_dev_handle_port_rxq,
"Returns Rx queue info for a port. Parameters: int port_id, int queue_id (Optional if only one queue)");
-   rte_telemetry_register_cmd("/ethdev/tx_queue", eth_dev_handle_port_txq,
+   rte_telemetry_register_cmd_arg("/ethdev/tx_queue",
+   eth_dev_telemetry_do, eth_dev_handle_port_txq,
"Returns Tx queue info for a port. Parameters: in

Re: [PATCH v5 4/5] test/graph_feature_arc: add functional tests

2024-10-14 Thread Stephen Hemminger
On Mon, 14 Oct 2024 20:03:57 +0530
Nitin Saxena  wrote:

> Added functional unit test case for verifying feature arc control plane
> and fast path APIs
> 
> How to run:
> $ echo "graph_feature_arc_autotest" | ./bin/dpdk-test
> 
> Signed-off-by: Nitin Saxena 

With the current upstream kernel checkpatch, there are additional warnings:


WARNING:MACRO_ARG_UNUSED: Argument 'idx' is not used in function-like macro
#217: FILE: app/test/test_graph_feature_arc.c:186:
+#define R(idx, node, node_cookie) {\
+   if (!strcmp(child, node)) { \
+   user_data += node_cookie;   \
+   }   \
+   }

WARNING:MACRO_ARG_UNUSED: Argument 'user_data' is not used in function-like macro
#272: FILE: app/test/test_graph_feature_arc.c:241:
+#define R(idx, _name, user_data) { \
+   if (!strcmp(node->name, _name)) {   \
+   priv->node_index = idx; \
+   }   \
+   }


Personally, I find that using macros to generate tests like this can get confusing.
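
To make the concern concrete, here is a small sketch contrasting the macro style that checkpatch flags with a table-driven alternative. It is not taken from the test file: NODE_LIST, the node names, the cookie values and all function names are hypothetical, and the table version is only one possible replacement.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical node list in the X-macro style; not from the test file. */
#define NODE_LIST \
	R(0, "node_a", 10) \
	R(1, "node_b", 20)

/* X-macro version: R is redefined at the use site and ignores 'idx',
 * which is the pattern reported as MACRO_ARG_UNUSED. */
static int cookie_sum_macro(const char *child)
{
	int user_data = 0;

	assert(child != NULL);
#define R(idx, node, node_cookie) \
	if (!strcmp(child, node)) \
		user_data += node_cookie;
	NODE_LIST
#undef R
	return user_data;
}

/* Table-driven version: the same data as plain structs. */
struct node_desc {
	int idx;
	const char *name;
	int cookie;
};

static const struct node_desc nodes[] = {
	{ 0, "node_a", 10 },
	{ 1, "node_b", 20 },
};

#define NB_NODES (sizeof(nodes) / sizeof(nodes[0]))

static int cookie_sum_table(const char *child)
{
	int user_data = 0;

	assert(child != NULL);
	for (size_t i = 0; i < NB_NODES; i++)
		if (!strcmp(child, nodes[i].name))
			user_data += nodes[i].cookie;
	return user_data;
}

/* Lookup that needs 'idx' but not 'cookie'; no unused macro argument here. */
static int node_index(const char *name)
{
	assert(name != NULL);
	for (size_t i = 0; i < NB_NODES; i++)
		if (!strcmp(name, nodes[i].name))
			return nodes[i].idx;
	return -1;
}
```

Both versions compute the same result; the table version keeps every field visible to the compiler and to a debugger, at the cost of a loop instead of unrolled comparisons.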


[PATCH v1 12/18] net/r8169: implement Tx path

2024-10-14 Thread Howard Wang
Add implementation for TX datapath.

Signed-off-by: Howard Wang 
---
 drivers/net/r8169/r8169_base.h   |   7 +
 drivers/net/r8169/r8169_ethdev.c |   6 +
 drivers/net/r8169/r8169_ethdev.h |  11 +
 drivers/net/r8169/r8169_rxtx.c   | 687 ++-
 4 files changed, 695 insertions(+), 16 deletions(-)

diff --git a/drivers/net/r8169/r8169_base.h b/drivers/net/r8169/r8169_base.h
index 53a58e10fa..043d66f6c2 100644
--- a/drivers/net/r8169/r8169_base.h
+++ b/drivers/net/r8169/r8169_base.h
@@ -589,6 +589,13 @@ enum RTL_chipset_name {
 
 #define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))
 
+#ifndef WRITE_ONCE
+#define WRITE_ONCE(var, val) (*((volatile typeof(val) *)(&(var))) = (val))
+#endif
+#ifndef READ_ONCE
+#define READ_ONCE(var) (*((volatile typeof(var) *)(&(var))))
+#endif
+
 static inline u32
 rtl_read32(volatile void *addr)
 {
diff --git a/drivers/net/r8169/r8169_ethdev.c b/drivers/net/r8169/r8169_ethdev.c
index 6c06f71385..61aa16cc10 100644
--- a/drivers/net/r8169/r8169_ethdev.c
+++ b/drivers/net/r8169/r8169_ethdev.c
@@ -81,6 +81,11 @@ static const struct eth_dev_ops rtl_eth_dev_ops = {
.rx_queue_setup   = rtl_rx_queue_setup,
.rx_queue_release = rtl_rx_queue_release,
.rxq_info_get = rtl_rxq_info_get,
+
+   .tx_queue_setup   = rtl_tx_queue_setup,
+   .tx_queue_release = rtl_tx_queue_release,
+   .tx_done_cleanup  = rtl_tx_done_cleanup,
+   .txq_info_get = rtl_txq_info_get,
 };
 
 static int
@@ -363,6 +368,7 @@ rtl_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
 
dev_info->rx_offload_capa = (rtl_get_rx_port_offloads() |
 dev_info->rx_queue_offload_capa);
+   dev_info->tx_offload_capa = rtl_get_tx_port_offloads();
 
return 0;
 }
diff --git a/drivers/net/r8169/r8169_ethdev.h b/drivers/net/r8169/r8169_ethdev.h
index cfcf576bc1..5776601081 100644
--- a/drivers/net/r8169/r8169_ethdev.h
+++ b/drivers/net/r8169/r8169_ethdev.h
@@ -77,6 +77,8 @@ struct rtl_hw {
u16 hw_clo_ptr_reg;
u16 sw_tail_ptr_reg;
u32 MaxTxDescPtrMask;
+   u32 NextHwDesCloPtr0;
+   u32 BeginHwDesCloPtr0;
 
/* Dash */
u8 HwSuppDashVer;
@@ -114,16 +116,25 @@ uint16_t rtl_recv_scattered_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
  uint16_t nb_pkts);
 
 void rtl_rx_queue_release(struct rte_eth_dev *dev, uint16_t rx_queue_id);
+void rtl_tx_queue_release(struct rte_eth_dev *dev, uint16_t tx_queue_id);
 
 void rtl_rxq_info_get(struct rte_eth_dev *dev, uint16_t queue_id,
   struct rte_eth_rxq_info *qinfo);
+void rtl_txq_info_get(struct rte_eth_dev *dev, uint16_t queue_id,
+  struct rte_eth_txq_info *qinfo);
 
 uint64_t rtl_get_rx_port_offloads(void);
+uint64_t rtl_get_tx_port_offloads(void);
 
 int rtl_rx_queue_setup(struct rte_eth_dev *dev, uint16_t queue_idx,
uint16_t nb_rx_desc, unsigned int socket_id,
const struct rte_eth_rxconf *rx_conf,
struct rte_mempool *mb_pool);
+int rtl_tx_queue_setup(struct rte_eth_dev *dev, uint16_t queue_idx,
+   uint16_t nb_tx_desc, unsigned int socket_id,
+   const struct rte_eth_txconf *tx_conf);
+
+int rtl_tx_done_cleanup(void *tx_queue, uint32_t free_cnt);
 
 int rtl_stop_queues(struct rte_eth_dev *dev);
 void rtl_free_queues(struct rte_eth_dev *dev);
diff --git a/drivers/net/r8169/r8169_rxtx.c b/drivers/net/r8169/r8169_rxtx.c
index 8c4bcdf4e5..cb354e19fe 100644
--- a/drivers/net/r8169/r8169_rxtx.c
+++ b/drivers/net/r8169/r8169_rxtx.c
@@ -29,6 +29,28 @@
 #include "r8169_hw.h"
 #include "r8169_logs.h"
 
+/* Bit mask to indicate what bits required for building TX context */
+#define RTL_TX_OFFLOAD_MASK (RTE_MBUF_F_TX_IPV6 |  \
+RTE_MBUF_F_TX_IPV4 |   \
+RTE_MBUF_F_TX_VLAN |   \
+RTE_MBUF_F_TX_IP_CKSUM |   \
+RTE_MBUF_F_TX_L4_MASK |\
+RTE_MBUF_F_TX_TCP_SEG)
+
+#define MIN_PATCH_LENGTH 47
+#define ETH_ZLEN 60 /* Min. octets in frame sans FCS */
+
+/* Struct TxDesc in kernel r8169 */
+struct rtl_tx_desc {
+   u32 opts1;
+   u32 opts2;
+   u64 addr;
+   u32 reserved0;
+   u32 reserved1;
+   u32 reserved2;
+   u32 reserved3;
+};
+
 /* Struct RxDesc in kernel r8169 */
 struct rtl_rx_desc {
u32 opts1;
@@ -36,27 +58,47 @@ struct rtl_rx_desc {
u64 addr;
 };
 
+/* Structure associated with each descriptor of the TX ring of a TX queue. */
+struct rtl_tx_entry {
+   struct rte_mbuf *mbuf;
+};
+
 /* Structure associated with each descriptor of the RX ring of a RX queue. */
 struct rtl_rx_entry {
struct rte_mbuf *mbuf;
 };
 
+/* Structure as

[PATCH v1 13/18] net/r8169: implement device statistics

2024-10-14 Thread Howard Wang
Signed-off-by: Howard Wang 
---
 drivers/net/r8169/r8169_base.h   | 16 +++
 drivers/net/r8169/r8169_ethdev.c | 49 ++-
 drivers/net/r8169/r8169_ethdev.h |  3 ++
 drivers/net/r8169/r8169_hw.c | 80 
 drivers/net/r8169/r8169_hw.h |  6 +++
 5 files changed, 153 insertions(+), 1 deletion(-)

diff --git a/drivers/net/r8169/r8169_base.h b/drivers/net/r8169/r8169_base.h
index 043d66f6c2..98c965ac23 100644
--- a/drivers/net/r8169/r8169_base.h
+++ b/drivers/net/r8169/r8169_base.h
@@ -23,6 +23,22 @@ typedef uint16_t  u16;
 typedef uint32_t  u32;
 typedef uint64_t  u64;
 
+struct rtl_counters {
+   u64 tx_packets;
+   u64 rx_packets;
+   u64 tx_errors;
+   u32 rx_errors;
+   u16 rx_missed;
+   u16 align_errors;
+   u32 tx_one_collision;
+   u32 tx_multi_collision;
+   u64 rx_unicast;
+   u64 rx_broadcast;
+   u32 rx_multicast;
+   u16 tx_aborted;
+   u16 tx_underun;
+};
+
 enum mcfg {
CFG_METHOD_1 = 1,
CFG_METHOD_2,
diff --git a/drivers/net/r8169/r8169_ethdev.c b/drivers/net/r8169/r8169_ethdev.c
index 61aa16cc10..cf9ea4dca4 100644
--- a/drivers/net/r8169/r8169_ethdev.c
+++ b/drivers/net/r8169/r8169_ethdev.c
@@ -40,7 +40,9 @@ static int rtl_dev_set_link_up(struct rte_eth_dev *dev);
 static int rtl_dev_set_link_down(struct rte_eth_dev *dev);
 static int rtl_dev_infos_get(struct rte_eth_dev *dev,
  struct rte_eth_dev_info *dev_info);
-
+static int rtl_dev_stats_get(struct rte_eth_dev *dev,
+ struct rte_eth_stats *rte_stats);
+static int rtl_dev_stats_reset(struct rte_eth_dev *dev);
 /*
  * The set of PCI devices this driver supports
  */
@@ -78,6 +80,9 @@ static const struct eth_dev_ops rtl_eth_dev_ops = {
 
.link_update  = rtl_dev_link_update,
 
+   .stats_get= rtl_dev_stats_get,
+   .stats_reset  = rtl_dev_stats_reset,
+
.rx_queue_setup   = rtl_rx_queue_setup,
.rx_queue_release = rtl_rx_queue_release,
.rxq_info_get = rtl_rxq_info_get,
@@ -242,6 +247,11 @@ rtl_dev_start(struct rte_eth_dev *dev)
goto error;
}
 
+   /* This can fail when allocating mem for tally counters */
+   err = rtl_tally_init(dev);
+   if (err)
+   goto error;
+
/* Enable uio/vfio intr/eventfd mapping */
rte_intr_enable(intr_handle);
 
@@ -288,6 +298,8 @@ rtl_dev_stop(struct rte_eth_dev *dev)
 
rtl_stop_queues(dev);
 
+   rtl_tally_free(dev);
+
/* Clear the recorded link status */
memset(&link, 0, sizeof(link));
rte_eth_linkstatus_set(dev, &link);
@@ -373,6 +385,41 @@ rtl_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
return 0;
 }
 
+static int
+rtl_dev_stats_reset(struct rte_eth_dev *dev)
+{
+   struct rtl_adapter *adapter = RTL_DEV_PRIVATE(dev);
+   struct rtl_hw *hw = &adapter->hw;
+
+   rtl_clear_tally_stats(hw);
+
+   memset(&adapter->sw_stats, 0, sizeof(adapter->sw_stats));
+
+   return 0;
+}
+
+static void
+rtl_sw_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *rte_stats)
+{
+   struct rtl_adapter *adapter = RTL_DEV_PRIVATE(dev);
+   struct rtl_sw_stats *sw_stats = &adapter->sw_stats;
+
+   rte_stats->ibytes = sw_stats->rx_bytes;
+   rte_stats->obytes = sw_stats->tx_bytes;
+}
+
+static int
+rtl_dev_stats_get(struct rte_eth_dev *dev, struct rte_eth_stats *rte_stats)
+{
+   struct rtl_adapter *adapter = RTL_DEV_PRIVATE(dev);
+   struct rtl_hw *hw = &adapter->hw;
+
+   rtl_get_tally_stats(hw, rte_stats);
+   rtl_sw_stats_get(dev, rte_stats);
+
+   return 0;
+}
+
 /* Return 0 means link status changed, -1 means not changed */
 static int
 rtl_dev_link_update(struct rte_eth_dev *dev, int wait __rte_unused)
diff --git a/drivers/net/r8169/r8169_ethdev.h b/drivers/net/r8169/r8169_ethdev.h
index 5776601081..c209b49db4 100644
--- a/drivers/net/r8169/r8169_ethdev.h
+++ b/drivers/net/r8169/r8169_ethdev.h
@@ -47,6 +47,9 @@ struct rtl_hw {
u8  mac_addr[MAC_ADDR_LEN];
u32 rx_buf_sz;
 
+   struct rtl_counters *tally_vaddr;
+   u64 tally_paddr;
+
u8  RequirePhyMdiSwapPatch;
u8  NotWrMcuPatchCode;
u8  HwSuppMacMcuVer;
diff --git a/drivers/net/r8169/r8169_hw.c b/drivers/net/r8169/r8169_hw.c
index 3be56061cf..27b67c1ed6 100644
--- a/drivers/net/r8169/r8169_hw.c
+++ b/drivers/net/r8169/r8169_hw.c
@@ -1503,3 +1503,83 @@ rtl_rar_set(struct rtl_hw *hw, uint8_t *addr)
rtl_disable_cfg9346_write(hw);
 }
 
+void
+rtl_get_tally_stats(struct rtl_hw *hw, struct rte_eth_stats *rte_stats)
+{
+   struct rtl_counters *counters;
+   uint64_t paddr;
+   u32 cmd;
+   u32 wait_cnt;
+
+   counters = hw->tally_vaddr;
+   paddr = hw->tally_paddr;
+   if (!counters)
+   return;
+
+   RTL_W32(hw, CounterAddrHigh, (u64)paddr >> 3

[PATCH v1 09/18] net/r8169: add support for hw initialization

2024-10-14 Thread Howard Wang
Signed-off-by: Howard Wang 
---
 drivers/net/r8169/meson.build|   1 +
 drivers/net/r8169/r8169_base.h   |  43 +++
 drivers/net/r8169/r8169_dash.c   |  89 +
 drivers/net/r8169/r8169_dash.h   |  35 ++
 drivers/net/r8169/r8169_ethdev.c |  47 ++-
 drivers/net/r8169/r8169_ethdev.h |  30 +-
 drivers/net/r8169/r8169_hw.c | 583 +++
 drivers/net/r8169/r8169_hw.h |  42 +++
 drivers/net/r8169/r8169_phy.h|  16 +-
 9 files changed, 876 insertions(+), 10 deletions(-)
 create mode 100644 drivers/net/r8169/r8169_dash.c
 create mode 100644 drivers/net/r8169/r8169_dash.h

diff --git a/drivers/net/r8169/meson.build b/drivers/net/r8169/meson.build
index 08995453c7..8235e8ca43 100644
--- a/drivers/net/r8169/meson.build
+++ b/drivers/net/r8169/meson.build
@@ -6,6 +6,7 @@ sources = files(
'r8169_hw.c',
'r8169_rxtx.c',
'r8169_phy.c',
+   'r8169_dash.c',
'base/rtl8125a.c',
'base/rtl8125a_mcu.c',
'base/rtl8125b.c',
diff --git a/drivers/net/r8169/r8169_base.h b/drivers/net/r8169/r8169_base.h
index e01b1e3470..2ee6fc6782 100644
--- a/drivers/net/r8169/r8169_base.h
+++ b/drivers/net/r8169/r8169_base.h
@@ -237,6 +237,10 @@ enum RTL_registers {
IMR_V4_L2_CLEAR_REG_8125 = 0x0D10,
IMR_V4_L2_SET_REG_8125   = 0x0D18,
ISR_V4_L2_8125  = 0x0D14,
+   SW_TAIL_PTR0_8125BP = 0x0D30,
+   SW_TAIL_PTR1_8125BP = 0x0D38,
+   HW_CLO_PTR0_8125BP = 0x0D34,
+   HW_CLO_PTR1_8125BP = 0x0D3C,
DOUBLE_VLAN_CONFIG = 0x1000,
TX_NEW_CTRL= 0x203E,
TNPDS_Q1_LOW_8125  = 0x2100,
@@ -482,6 +486,16 @@ enum RTL_register_content {
ISRIMR_V2_LINKCHG= (1 << 21),
 };
 
+enum RTL_chipset_name {
+   RTL8125A = 0,
+   RTL8125B,
+   RTL8168KB,
+   RTL8125BP,
+   RTL8125D,
+   RTL8126A,
+   UNKNOWN
+};
+
 #define PCI_VENDOR_ID_REALTEK 0x10EC
 
 #define RTL_PCI_REG_ADDR(hw, reg) ((u8 *)(hw)->mmio_addr + (reg))
@@ -522,6 +536,35 @@ enum RTL_register_content {
 
 #define ETH_HLEN 14
 
+#define SPEED_10   10
+#define SPEED_100  100
+#define SPEED_1000 1000
+#define SPEED_2500 2500
+#define SPEED_5000 5000
+
+#define DUPLEX_HALF 1
+#define DUPLEX_FULL 2
+
+#define AUTONEG_ENABLE 1
+#define AUTONEG_DISABLE 0
+
+#define ADVERTISE_10_HALF 0x0001
+#define ADVERTISE_10_FULL 0x0002
+#define ADVERTISE_100_HALF 0x0004
+#define ADVERTISE_100_FULL 0x0008
+#define ADVERTISE_1000_HALF   0x0010 /* Not used, just FYI */
+#define ADVERTISE_1000_FULL   0x0020
+#define ADVERTISE_2500_HALF   0x0040 /* NOT used, just FYI */
+#define ADVERTISE_2500_FULL   0x0080
+#define ADVERTISE_5000_HALF   0x0100 /* NOT used, just FYI */
+#define ADVERTISE_5000_FULL   0x0200
+
+#define RTL8126_ALL_SPEED_DUPLEX (ADVERTISE_10_HALF | ADVERTISE_10_FULL | \
+ADVERTISE_100_HALF | ADVERTISE_100_FULL | ADVERTISE_1000_FULL | \
+ADVERTISE_2500_FULL | ADVERTISE_5000_FULL)
+
+#define MAC_ADDR_LEN RTE_ETHER_ADDR_LEN
+
 static inline u32
 rtl_read32(volatile void *addr)
 {
diff --git a/drivers/net/r8169/r8169_dash.c b/drivers/net/r8169/r8169_dash.c
new file mode 100644
index 00..e803ce8305
--- /dev/null
+++ b/drivers/net/r8169/r8169_dash.c
@@ -0,0 +1,89 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Realtek Corporation. All rights reserved
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+#include "r8169_base.h"
+#include "r8169_dash.h"
+#include "r8169_hw.h"
+
+bool
+rtl_is_allow_access_dash_ocp(struct rtl_hw *hw)
+{
+   bool allow_access = false;
+   u16 mac_ocp_data;
+
+   if (!HW_DASH_SUPPORT_DASH(hw))
+   goto exit;
+
+   allow_access = true;
+   switch (hw->mcfg) {
+   case CFG_METHOD_2:
+   case CFG_METHOD_3:
+   mac_ocp_data = rtl_mac_ocp_read(hw, 0xd460);
+   if (mac_ocp_data == 0x || !(mac_ocp_data & BIT_0))
+   allow_access = false;
+   break;
+   case CFG_METHOD_8:
+   case CFG_METHOD_9:
+   mac_ocp_data = rtl_mac_ocp_read(hw, 0xd4c0);
+   if (mac_ocp_data == 0x || (mac_ocp_data & BIT_3))
+   allow_access = false;
+   break;
+   default:
+   goto exit;
+   }
+exit:
+   return allow_access;
+}
+
+static u32
+rtl_get_dash_fw_ver(struct rtl_hw *hw)
+{
+   u32 ver = 0x;
+
+   if (FALSE == HW_DASH_SUPPORT_GET_FIRMWARE_VERSION(hw))
+   goto exit;
+
+   ver = rtl_ocp_read(hw, OCP_REG_FIRMWARE_MAJOR_VERSION, 4);
+
+exit:
+   return ver;
+}
+
+static int
+_rtl_check_dash(struct rtl_hw *hw)
+{
+   if (!hw->AllowAccessDashOcp)
+   return 0;
+
+   if (HW_DASH_SUPPORT_TYPE_2(hw) || HW_DASH_SUPPORT_TYPE_4(hw)) {
+   if (rtl_ocp_read(hw, 0x128, 1) & BIT_0)
+  

[PATCH v1 11/18] net/r8169: implement Rx path

2024-10-14 Thread Howard Wang
Add implementation for RX datapath.

Signed-off-by: Howard Wang 
---
 drivers/net/r8169/r8169_base.h   |  27 ++
 drivers/net/r8169/r8169_ethdev.c |  76 ++-
 drivers/net/r8169/r8169_ethdev.h |  18 +
 drivers/net/r8169/r8169_rxtx.c   | 787 ++-
 4 files changed, 905 insertions(+), 3 deletions(-)

diff --git a/drivers/net/r8169/r8169_base.h b/drivers/net/r8169/r8169_base.h
index 2960288981..53a58e10fa 100644
--- a/drivers/net/r8169/r8169_base.h
+++ b/drivers/net/r8169/r8169_base.h
@@ -562,6 +562,33 @@ enum RTL_chipset_name {
 
 #define MAC_ADDR_LEN RTE_ETHER_ADDR_LEN
 
+#define RTL_MAX_TX_DESC 4096
+#define RTL_MAX_RX_DESC 4096
+#define RTL_MIN_TX_DESC 64
+#define RTL_MIN_RX_DESC 64
+
+#define RTL_RING_ALIGN 256
+
+#define RTL_MAX_TX_SEG 64
+#define RTL_DESC_ALIGN 64
+
+#define RTL_RX_FREE_THRESH 32
+#define RTL_TX_FREE_THRESH 32
+
+#define VLAN_TAG_SIZE   4
+
+/*
+ * The overhead from MTU to max frame size.
+ * Considering VLAN so a tag needs to be counted.
+ */
+#define RTL_ETH_OVERHEAD (RTE_ETHER_HDR_LEN + RTE_ETHER_CRC_LEN + VLAN_TAG_SIZE)
+
+#define ETH_HLEN 14
+#define VLAN_HLEN   4
+#define Jumbo_Frame_9k  (9 * 1024 - ETH_HLEN - VLAN_HLEN - RTE_ETHER_CRC_LEN)
+
+#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))
+
 static inline u32
 rtl_read32(volatile void *addr)
 {
diff --git a/drivers/net/r8169/r8169_ethdev.c b/drivers/net/r8169/r8169_ethdev.c
index ecf4a4e984..6c06f71385 100644
--- a/drivers/net/r8169/r8169_ethdev.c
+++ b/drivers/net/r8169/r8169_ethdev.c
@@ -38,6 +38,8 @@ static int rtl_dev_close(struct rte_eth_dev *dev);
 static int rtl_dev_link_update(struct rte_eth_dev *dev, int wait __rte_unused);
 static int rtl_dev_set_link_up(struct rte_eth_dev *dev);
 static int rtl_dev_set_link_down(struct rte_eth_dev *dev);
+static int rtl_dev_infos_get(struct rte_eth_dev *dev,
+ struct rte_eth_dev_info *dev_info);
 
 /*
  * The set of PCI devices this driver supports
@@ -50,6 +52,20 @@ static const struct rte_pci_id pci_id_r8169_map[] = {
{.vendor_id = 0, /* sentinel */ },
 };
 
+static const struct rte_eth_desc_lim rx_desc_lim = {
+   .nb_max   = RTL_MAX_RX_DESC,
+   .nb_min   = RTL_MIN_RX_DESC,
+   .nb_align = RTL_DESC_ALIGN,
+};
+
+static const struct rte_eth_desc_lim tx_desc_lim = {
+   .nb_max = RTL_MAX_TX_DESC,
+   .nb_min = RTL_MIN_TX_DESC,
+   .nb_align   = RTL_DESC_ALIGN,
+   .nb_seg_max = RTL_MAX_TX_SEG,
+   .nb_mtu_seg_max = RTL_MAX_TX_SEG,
+};
+
 static const struct eth_dev_ops rtl_eth_dev_ops = {
.dev_configure= rtl_dev_configure,
.dev_start= rtl_dev_start,
@@ -58,8 +74,13 @@ static const struct eth_dev_ops rtl_eth_dev_ops = {
.dev_reset= rtl_dev_reset,
.dev_set_link_up  = rtl_dev_set_link_up,
.dev_set_link_down= rtl_dev_set_link_down,
+   .dev_infos_get= rtl_dev_infos_get,
 
.link_update  = rtl_dev_link_update,
+
+   .rx_queue_setup   = rtl_rx_queue_setup,
+   .rx_queue_release = rtl_rx_queue_release,
+   .rxq_info_get = rtl_rxq_info_get,
 };
 
 static int
@@ -149,6 +170,7 @@ _rtl_setup_link(struct rte_eth_dev *dev)
 error_invalid_config:
PMD_INIT_LOG(ERR, "Invalid advertised speeds (%u) for port %u",
 dev->data->dev_conf.link_speeds, dev->data->port_id);
+   rtl_stop_queues(dev);
return -EINVAL;
 }
 
@@ -229,6 +251,7 @@ rtl_dev_start(struct rte_eth_dev *dev)
 
return 0;
 error:
+   rtl_stop_queues(dev);
return -EIO;
 }
 
@@ -258,6 +281,8 @@ rtl_dev_stop(struct rte_eth_dev *dev)
 
rtl_powerdown_pll(hw);
 
+   rtl_stop_queues(dev);
+
/* Clear the recorded link status */
memset(&link, 0, sizeof(link));
rte_eth_linkstatus_set(dev, &link);
@@ -298,6 +323,50 @@ rtl_dev_set_link_down(struct rte_eth_dev *dev)
return 0;
 }
 
+static int
+rtl_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
+{
+   struct rtl_adapter *adapter = RTL_DEV_PRIVATE(dev);
+   struct rtl_hw *hw = &adapter->hw;
+
+   dev_info->min_rx_bufsize = 1024;
+   dev_info->max_rx_pktlen = Jumbo_Frame_9k;
+   dev_info->max_mac_addrs = 1;
+
+   dev_info->max_rx_queues = 1;
+   dev_info->max_tx_queues = 1;
+
+   dev_info->default_rxconf = (struct rte_eth_rxconf) {
+   .rx_free_thresh = RTL_RX_FREE_THRESH,
+   };
+
+   dev_info->default_txconf = (struct rte_eth_txconf) {
+   .tx_free_thresh = RTL_TX_FREE_THRESH,
+   };
+
+   dev_info->rx_desc_lim = rx_desc_lim;
+   dev_info->tx_desc_lim = tx_desc_lim;
+
+   dev_info->speed_capa = RTE_ETH_LINK_SPEED_10M_HD | RTE_ETH_LINK_SPEED_10M |
+  RTE_ETH_LINK_SPEED_100M_HD | RTE_ETH_LINK_SPEED_100M |
+  RTE_ETH_LINK_SPEED_1G;
+
+   switch (hw->chipset_

[PATCH v1 10/18] net/r8169: add link status and interrupt management

2024-10-14 Thread Howard Wang
Signed-off-by: Howard Wang 
---
 drivers/net/r8169/r8169_base.h   |   5 +-
 drivers/net/r8169/r8169_ethdev.c | 279 ++-
 drivers/net/r8169/r8169_ethdev.h |   3 +
 drivers/net/r8169/r8169_hw.c |   8 +-
 drivers/net/r8169/r8169_hw.h |   3 +
 drivers/net/r8169/r8169_phy.c| 121 ++
 drivers/net/r8169/r8169_phy.h|   3 +
 7 files changed, 413 insertions(+), 9 deletions(-)

diff --git a/drivers/net/r8169/r8169_base.h b/drivers/net/r8169/r8169_base.h
index 2ee6fc6782..2960288981 100644
--- a/drivers/net/r8169/r8169_base.h
+++ b/drivers/net/r8169/r8169_base.h
@@ -379,6 +379,7 @@ enum RTL_register_content {
 
/* PHY status */
PowerSaveStatus = 0x80,
+   _5000bpsF   = 0x1000,
_2500bpsF   = 0x400,
TxFlowCtrl  = 0x40,
RxFlowCtrl  = 0x20,
@@ -559,10 +560,6 @@ enum RTL_chipset_name {
 #define ADVERTISE_5000_HALF   0x0100 /* NOT used, just FYI */
 #define ADVERTISE_5000_FULL   0x0200
 
-#define RTL8126_ALL_SPEED_DUPLEX (ADVERTISE_10_HALF | ADVERTISE_10_FULL | \
-ADVERTISE_100_HALF | ADVERTISE_100_FULL | ADVERTISE_1000_FULL | \
-ADVERTISE_2500_FULL | ADVERTISE_5000_FULL)
-
 #define MAC_ADDR_LEN    RTE_ETHER_ADDR_LEN
 
 static inline u32
diff --git a/drivers/net/r8169/r8169_ethdev.c b/drivers/net/r8169/r8169_ethdev.c
index 9af46b390c..ecf4a4e984 100644
--- a/drivers/net/r8169/r8169_ethdev.c
+++ b/drivers/net/r8169/r8169_ethdev.c
@@ -35,6 +35,9 @@ static int rtl_dev_start(struct rte_eth_dev *dev);
 static int rtl_dev_stop(struct rte_eth_dev *dev);
 static int rtl_dev_reset(struct rte_eth_dev *dev);
 static int rtl_dev_close(struct rte_eth_dev *dev);
+static int rtl_dev_link_update(struct rte_eth_dev *dev, int wait __rte_unused);
+static int rtl_dev_set_link_up(struct rte_eth_dev *dev);
+static int rtl_dev_set_link_down(struct rte_eth_dev *dev);
 
 /*
  * The set of PCI devices this driver supports
@@ -53,6 +56,10 @@ static const struct eth_dev_ops rtl_eth_dev_ops = {
.dev_stop = rtl_dev_stop,
.dev_close= rtl_dev_close,
.dev_reset= rtl_dev_reset,
+   .dev_set_link_up  = rtl_dev_set_link_up,
+   .dev_set_link_down= rtl_dev_set_link_down,
+
+   .link_update  = rtl_dev_link_update,
 };
 
 static int
@@ -61,6 +68,119 @@ rtl_dev_configure(struct rte_eth_dev *dev __rte_unused)
return 0;
 }
 
+static void
+rtl_disable_intr(struct rtl_hw *hw)
+{
+   PMD_INIT_FUNC_TRACE();
+   RTL_W32(hw, IMR0_8125, 0x0000);
+   RTL_W32(hw, ISR0_8125, RTL_R32(hw, ISR0_8125));
+}
+
+static void
+rtl_enable_intr(struct rtl_hw *hw)
+{
+   PMD_INIT_FUNC_TRACE();
+   RTL_W32(hw, IMR0_8125, LinkChg);
+}
+
+static int
+_rtl_setup_link(struct rte_eth_dev *dev)
+{
+   struct rtl_adapter *adapter = RTL_DEV_PRIVATE(dev);
+   struct rtl_hw *hw = &adapter->hw;
+   u64 adv = 0;
+   u32 *link_speeds = &dev->data->dev_conf.link_speeds;
+
+   /* Setup link speed and duplex */
+   if (*link_speeds == RTE_ETH_LINK_SPEED_AUTONEG)
+   rtl_set_link_option(hw, AUTONEG_ENABLE, SPEED_5000, DUPLEX_FULL, rtl_fc_full);
+   else if (*link_speeds != 0) {
+
+   if (*link_speeds & ~(RTE_ETH_LINK_SPEED_10M_HD | RTE_ETH_LINK_SPEED_10M |
+RTE_ETH_LINK_SPEED_100M_HD | RTE_ETH_LINK_SPEED_100M |
+RTE_ETH_LINK_SPEED_1G | RTE_ETH_LINK_SPEED_2_5G |
+RTE_ETH_LINK_SPEED_5G | RTE_ETH_LINK_SPEED_FIXED))
+   goto error_invalid_config;
+
+   if (*link_speeds & RTE_ETH_LINK_SPEED_10M_HD) {
+   hw->speed = SPEED_10;
+   hw->duplex = DUPLEX_HALF;
+   adv |= ADVERTISE_10_HALF;
+   }
+   if (*link_speeds & RTE_ETH_LINK_SPEED_10M) {
+   hw->speed = SPEED_10;
+   hw->duplex = DUPLEX_FULL;
+   adv |= ADVERTISE_10_FULL;
+   }
+   if (*link_speeds & RTE_ETH_LINK_SPEED_100M_HD) {
+   hw->speed = SPEED_100;
+   hw->duplex = DUPLEX_HALF;
+   adv |= ADVERTISE_100_HALF;
+   }
+   if (*link_speeds & RTE_ETH_LINK_SPEED_100M) {
+   hw->speed = SPEED_100;
+   hw->duplex = DUPLEX_FULL;
+   adv |= ADVERTISE_100_FULL;
+   }
+   if (*link_speeds & RTE_ETH_LINK_SPEED_1G) {
+   hw->speed = SPEED_1000;
+   hw->duplex = DUPLEX_FULL;
+   adv |= ADVERTISE_1000_FULL;
+   }
+   if (*link_speeds & RTE_ETH_LINK_SPEED_2_5G) {
+   hw->speed = SPEED_2500;
+   hw->duplex = DUPLEX_FULL;
+   adv |= ADVERTISE_2500_FULL;
+  
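
The validation step in _rtl_setup_link above rejects any link_speeds bit that falls outside the supported set before it accumulates advertisement bits. A minimal standalone sketch of that mask check follows. The SPEED_* values and the helper name speeds_are_supported are local stand-ins chosen for illustration and are not taken from the DPDK headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the RTE_ETH_LINK_SPEED_* flags. */
#define SPEED_FIXED    (1u << 0)
#define SPEED_10M_HD   (1u << 1)
#define SPEED_10M      (1u << 2)
#define SPEED_100M_HD  (1u << 3)
#define SPEED_100M     (1u << 4)
#define SPEED_1G       (1u << 5)
#define SPEED_2_5G     (1u << 6)
#define SPEED_5G       (1u << 7)
#define SPEED_10G      (1u << 8) /* deliberately outside the supported set */

#define SUPPORTED_SPEEDS (SPEED_FIXED | SPEED_10M_HD | SPEED_10M | \
			  SPEED_100M_HD | SPEED_100M | SPEED_1G | \
			  SPEED_2_5G | SPEED_5G)

/* Return 1 when every requested bit belongs to the supported set. */
static int
speeds_are_supported(uint32_t link_speeds)
{
	return (link_speeds & ~(uint32_t)SUPPORTED_SPEEDS) == 0;
}
```

A request such as SPEED_1G | SPEED_FIXED passes, while any request that includes SPEED_10G fails, which corresponds to the error_invalid_config path in the patch.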

[PATCH v1 08/18] net/r8169: add support for phy configuration

2024-10-14 Thread Howard Wang
This patch contains phy config, ephy config and so on.

Signed-off-by: Howard Wang 
---
 drivers/net/r8169/r8169_ethdev.c |  10 +
 drivers/net/r8169/r8169_ethdev.h |   6 +
 drivers/net/r8169/r8169_phy.c| 445 +++
 drivers/net/r8169/r8169_phy.h| 100 +++
 4 files changed, 561 insertions(+)

diff --git a/drivers/net/r8169/r8169_ethdev.c b/drivers/net/r8169/r8169_ethdev.c
index f4a79b494a..294d942862 100644
--- a/drivers/net/r8169/r8169_ethdev.c
+++ b/drivers/net/r8169/r8169_ethdev.c
@@ -72,6 +72,12 @@ rtl_dev_start(struct rte_eth_dev *dev)
struct rtl_hw *hw = &adapter->hw;
int err;
 
+   rtl_powerup_pll(hw);
+
+   rtl_hw_ephy_config(hw);
+
+   rtl_hw_phy_config(hw);
+
rtl_hw_config(hw);
 
/* Initialize transmission unit */
@@ -84,6 +90,8 @@ rtl_dev_start(struct rte_eth_dev *dev)
goto error;
}
 
+   rtl_mdio_write(hw, 0x1F, 0x0000);
+
hw->adapter_stopped = 0;
 
return 0;
@@ -103,6 +111,8 @@ rtl_dev_stop(struct rte_eth_dev *dev)
if (hw->adapter_stopped)
return 0;
 
+   rtl_powerdown_pll(hw);
+
hw->adapter_stopped = 1;
dev->data->dev_started = 0;
 
diff --git a/drivers/net/r8169/r8169_ethdev.h b/drivers/net/r8169/r8169_ethdev.h
index c9acaabf8e..a0da173685 100644
--- a/drivers/net/r8169/r8169_ethdev.h
+++ b/drivers/net/r8169/r8169_ethdev.h
@@ -39,6 +39,12 @@ struct rtl_hw {
 
u8  NotWrRamCodeToMicroP;
u8  HwHasWrRamCodeToMicroP;
+   u8  HwSuppCheckPhyDisableModeVer;
+
+   u16 sw_ram_code_ver;
+   u16 hw_ram_code_ver;
+
+   u32 HwSuppMaxPhyLinkSpeed;
 
/* Enable Tx No Close */
u8 EnableTxNoClose;
diff --git a/drivers/net/r8169/r8169_phy.c b/drivers/net/r8169/r8169_phy.c
index bd707ee5b6..3198116946 100644
--- a/drivers/net/r8169/r8169_phy.c
+++ b/drivers/net/r8169/r8169_phy.c
@@ -330,3 +330,448 @@ rtl_set_phy_mcu_ram_code(struct rtl_hw *hw, const u16 *ramcode, u16 codesize)
return;
 }
 
+static u8
+rtl_is_phy_disable_mode_enabled(struct rtl_hw *hw)
+{
+   u8 phy_disable_mode_enabled = FALSE;
+
+   switch (hw->HwSuppCheckPhyDisableModeVer) {
+   case 3:
+   if (RTL_R8(hw, 0xF2) & BIT_5)
+   phy_disable_mode_enabled = TRUE;
+   break;
+   }
+
+   return phy_disable_mode_enabled;
+}
+
+static u8
+rtl_is_gpio_low(struct rtl_hw *hw)
+{
+   u8 gpio_low = FALSE;
+
+   switch (hw->HwSuppCheckPhyDisableModeVer) {
+   case 3:
+   if (!(rtl_mac_ocp_read(hw, 0xDC04) & BIT_13))
+   gpio_low = TRUE;
+   break;
+   }
+
+   return gpio_low;
+}
+
+static u8
+rtl_is_in_phy_disable_mode(struct rtl_hw *hw)
+{
+   u8 in_phy_disable_mode = FALSE;
+
+   if (rtl_is_phy_disable_mode_enabled(hw) && rtl_is_gpio_low(hw))
+   in_phy_disable_mode = TRUE;
+
+   return in_phy_disable_mode;
+}
+
+static void
+rtl_wait_phy_ups_resume(struct rtl_hw *hw, u16 PhyState)
+{
+   u16 tmp_phy_state;
+   int i = 0;
+
+   switch (hw->mcfg) {
+   case CFG_METHOD_48 ... CFG_METHOD_57:
+   case CFG_METHOD_69 ... CFG_METHOD_71:
+   do {
+   tmp_phy_state = rtl_mdio_direct_read_phy_ocp(hw, 0xA420);
+   tmp_phy_state &= 0x7;
+   mdelay(1);
+   i++;
+   } while ((i < 100) && (tmp_phy_state != PhyState));
+   }
+}
+
+static void
+rtl_phy_power_up(struct rtl_hw *hw)
+{
+   if (rtl_is_in_phy_disable_mode(hw))
+   return;
+
+   rtl_mdio_write(hw, 0x1F, 0x0000);
+   rtl_mdio_write(hw, MII_BMCR, BMCR_ANENABLE);
+
+   /* Wait ups resume (phy state 3) */
+   switch (hw->mcfg) {
+   case CFG_METHOD_48 ... CFG_METHOD_57:
+   case CFG_METHOD_69 ... CFG_METHOD_71:
+   rtl_wait_phy_ups_resume(hw, 3);
+   }
+}
+
+void
+rtl_powerup_pll(struct rtl_hw *hw)
+{
+   switch (hw->mcfg) {
+   case CFG_METHOD_48 ... CFG_METHOD_57:
+   case CFG_METHOD_69 ... CFG_METHOD_71:
+   RTL_W8(hw, PMCH, RTL_R8(hw, PMCH) | BIT_7 | BIT_6);
+   }
+
+   rtl_phy_power_up(hw);
+}
+
+static void
+rtl_phy_power_down(struct rtl_hw *hw)
+{
+   rtl_mdio_write(hw, 0x1F, 0x0000);
+   rtl_mdio_write(hw, MII_BMCR, BMCR_ANENABLE | BMCR_PDOWN);
+}
+
+void
+rtl_powerdown_pll(struct rtl_hw *hw)
+{
+   if (hw->DASH)
+   return;
+
+   rtl_phy_power_down(hw);
+
+   switch (hw->mcfg) {
+   case CFG_METHOD_48 ... CFG_METHOD_57:
+   case CFG_METHOD_69 ... CFG_METHOD_71:
+   RTL_W8(hw, PMCH, RTL_R8(hw, PMCH) & ~BIT_7);
+   break;
+   }
+}
+
+void
+rtl_hw_ephy_config(struct rtl_hw *hw)
+{
+   hw->hw_ops.hw_ephy_config(hw);
+}
+
+static int
+rtl_wait_phy_reset_complete(struct rtl_hw *hw)
+{
+   int i, val;
+
+   for (i = 0; i < 2500; i++) {
+   val 
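
rtl_wait_phy_ups_resume above polls a PHY register until its low three bits reach a target state, giving up after 100 attempts. The sketch below isolates that bounded-polling pattern. The callback type, poll_phy_state, and fake_phy_read are hypothetical names introduced for illustration. Unlike the patch, which returns void, this variant reports a timeout to the caller.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register-read callback; a driver would read the PHY here. */
typedef uint16_t (*read_state_fn)(void *ctx);

/*
 * Poll until the low three bits equal `wanted` or `max_tries` is exhausted.
 * Returns the number of reads performed on success, -1 on timeout.
 */
static int
poll_phy_state(read_state_fn read_state, void *ctx, uint16_t wanted,
	       int max_tries)
{
	int i;

	for (i = 1; i <= max_tries; i++) {
		if ((read_state(ctx) & 0x7) == wanted)
			return i;
		/* A real driver would delay here, e.g. 1 ms per attempt. */
	}

	return -1;
}

/* Fake PHY for illustration: reports state 3 from the third read onward. */
static uint16_t
fake_phy_read(void *ctx)
{
	int *reads = ctx;

	(*reads)++;
	return (*reads >= 3) ? 0x3 : 0x0;
}
```

Returning a status lets the caller log or propagate a PHY that never resumes, a case the void version cannot distinguish from success.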

[PATCH v1 14/18] net/r8169: implement promisc and allmulti modes

2024-10-14 Thread Howard Wang
Add support for promiscuous/allmulticast modes configuration.

Signed-off-by: Howard Wang 
---
 drivers/net/r8169/r8169_ethdev.c | 68 
 1 file changed, 68 insertions(+)

diff --git a/drivers/net/r8169/r8169_ethdev.c b/drivers/net/r8169/r8169_ethdev.c
index cf9ea4dca4..3e6bc570d6 100644
--- a/drivers/net/r8169/r8169_ethdev.c
+++ b/drivers/net/r8169/r8169_ethdev.c
@@ -43,6 +43,11 @@ static int rtl_dev_infos_get(struct rte_eth_dev *dev,
 static int rtl_dev_stats_get(struct rte_eth_dev *dev,
  struct rte_eth_stats *rte_stats);
 static int rtl_dev_stats_reset(struct rte_eth_dev *dev);
+static int rtl_promiscuous_enable(struct rte_eth_dev *dev);
+static int rtl_promiscuous_disable(struct rte_eth_dev *dev);
+static int rtl_allmulticast_enable(struct rte_eth_dev *dev);
+static int rtl_allmulticast_disable(struct rte_eth_dev *dev);
+
 /*
  * The set of PCI devices this driver supports
  */
@@ -78,6 +83,11 @@ static const struct eth_dev_ops rtl_eth_dev_ops = {
.dev_set_link_down= rtl_dev_set_link_down,
.dev_infos_get= rtl_dev_infos_get,
 
+   .promiscuous_enable   = rtl_promiscuous_enable,
+   .promiscuous_disable  = rtl_promiscuous_disable,
+   .allmulticast_enable  = rtl_allmulticast_enable,
+   .allmulticast_disable = rtl_allmulticast_disable,
+
.link_update  = rtl_dev_link_update,
 
.stats_get= rtl_dev_stats_get,
@@ -385,6 +395,64 @@ rtl_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
return 0;
 }
 
+static int
+rtl_promiscuous_enable(struct rte_eth_dev *dev)
+{
+   struct rtl_adapter *adapter = RTL_DEV_PRIVATE(dev);
+   struct rtl_hw *hw = &adapter->hw;
+
+   int rx_mode = AcceptBroadcast | AcceptMulticast | AcceptMyPhys | AcceptAllPhys;
+
+   RTL_W32(hw, RxConfig, rx_mode | (RTL_R32(hw, RxConfig)));
+   rtl_allmulticast_enable(dev);
+
+   return 0;
+}
+
+static int
+rtl_promiscuous_disable(struct rte_eth_dev *dev)
+{
+   struct rtl_adapter *adapter = RTL_DEV_PRIVATE(dev);
+   struct rtl_hw *hw = &adapter->hw;
+   int rx_mode = ~AcceptAllPhys;
+
+   RTL_W32(hw, RxConfig, rx_mode & (RTL_R32(hw, RxConfig)));
+
+   if (dev->data->all_multicast == 1)
+   rtl_allmulticast_enable(dev);
+   else
+   rtl_allmulticast_disable(dev);
+
+   return 0;
+}
+
+static int
+rtl_allmulticast_enable(struct rte_eth_dev *dev)
+{
+   struct rtl_adapter *adapter = RTL_DEV_PRIVATE(dev);
+   struct rtl_hw *hw = &adapter->hw;
+
+   RTL_W32(hw, MAR0 + 0, 0xffffffff);
+   RTL_W32(hw, MAR0 + 4, 0xffffffff);
+
+   return 0;
+}
+
+static int
+rtl_allmulticast_disable(struct rte_eth_dev *dev)
+{
+   struct rtl_adapter *adapter = RTL_DEV_PRIVATE(dev);
+   struct rtl_hw *hw = &adapter->hw;
+
+   if (dev->data->promiscuous == 1)
+   return 0; /* Must remain in all_multicast mode */
+
+   RTL_W32(hw, MAR0 + 0, 0);
+   RTL_W32(hw, MAR0 + 4, 0);
+
+   return 0;
+}
+
 static int
 rtl_dev_stats_reset(struct rte_eth_dev *dev)
 {
-- 
2.34.1
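
The promiscuous enable and disable callbacks in this patch are read-modify-write operations on RxConfig: enable ORs in the accept bits, disable clears only AcceptAllPhys. A small sketch of that bit arithmetic on a plain integer follows. The ACCEPT_* values and helper names are assumptions made for illustration; the real bit positions are defined in r8169_base.h and are not shown in this message.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define ACCEPT_ALL_PHYS   0x01u
#define ACCEPT_MY_PHYS    0x02u
#define ACCEPT_MULTICAST  0x04u
#define ACCEPT_BROADCAST  0x08u

/* Enable: set all four accept bits, preserve everything else. */
static uint32_t
rx_config_promisc_on(uint32_t rx_config)
{
	return rx_config | ACCEPT_BROADCAST | ACCEPT_MULTICAST |
	       ACCEPT_MY_PHYS | ACCEPT_ALL_PHYS;
}

/* Disable: clear only the accept-all-physical bit. */
static uint32_t
rx_config_promisc_off(uint32_t rx_config)
{
	return rx_config & ~ACCEPT_ALL_PHYS;
}
```

Because disable clears a single bit, broadcast and multicast acceptance survive a promiscuous toggle, which is why the patch separately restores the multicast filter from dev->data->all_multicast.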



[PATCH v1 15/18] net/r8169: implement MTU configuration

2024-10-14 Thread Howard Wang
Add support for updating MTU value.

Signed-off-by: Howard Wang 
---
 drivers/net/r8169/r8169_ethdev.c | 29 +
 1 file changed, 29 insertions(+)

diff --git a/drivers/net/r8169/r8169_ethdev.c b/drivers/net/r8169/r8169_ethdev.c
index 3e6bc570d6..70c3661691 100644
--- a/drivers/net/r8169/r8169_ethdev.c
+++ b/drivers/net/r8169/r8169_ethdev.c
@@ -47,6 +47,7 @@ static int rtl_promiscuous_enable(struct rte_eth_dev *dev);
 static int rtl_promiscuous_disable(struct rte_eth_dev *dev);
 static int rtl_allmulticast_enable(struct rte_eth_dev *dev);
 static int rtl_allmulticast_disable(struct rte_eth_dev *dev);
+static int rtl_dev_mtu_set(struct rte_eth_dev *dev, uint16_t mtu);
 
 /*
  * The set of PCI devices this driver supports
@@ -93,6 +94,8 @@ static const struct eth_dev_ops rtl_eth_dev_ops = {
.stats_get= rtl_dev_stats_get,
.stats_reset  = rtl_dev_stats_reset,
 
+   .mtu_set  = rtl_dev_mtu_set,
+
.rx_queue_setup   = rtl_rx_queue_setup,
.rx_queue_release = rtl_rx_queue_release,
.rxq_info_get = rtl_rxq_info_get,
@@ -388,6 +391,9 @@ rtl_dev_infos_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info)
break;
}
 
+   dev_info->min_mtu = RTE_ETHER_MIN_MTU;
+   dev_info->max_mtu = dev_info->max_rx_pktlen - RTL_ETH_OVERHEAD;
+
dev_info->rx_offload_capa = (rtl_get_rx_port_offloads() |
 dev_info->rx_queue_offload_capa);
dev_info->tx_offload_capa = rtl_get_tx_port_offloads();
@@ -610,6 +616,29 @@ rtl_dev_close(struct rte_eth_dev *dev)
return ret_stp;
 }
 
+static int
+rtl_dev_mtu_set(struct rte_eth_dev *dev, uint16_t mtu)
+{
+   struct rte_eth_dev_info dev_info;
+   struct rtl_adapter *adapter = RTL_DEV_PRIVATE(dev);
+   struct rtl_hw *hw = &adapter->hw;
+   int ret;
+   uint32_t frame_size = mtu + RTL_ETH_OVERHEAD;
+
+   ret = rtl_dev_infos_get(dev, &dev_info);
+   if (ret != 0)
+   return ret;
+
+   if (mtu < RTE_ETHER_MIN_MTU || frame_size > dev_info.max_rx_pktlen)
+   return -EINVAL;
+
+   hw->mtu = mtu;
+
+   RTL_W16(hw, RxMaxSize, frame_size);
+
+   return 0;
+}
+
 static int
 rtl_dev_init(struct rte_eth_dev *dev)
 {
-- 
2.34.1
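
rtl_dev_mtu_set in this patch converts the MTU to a frame size by adding RTL_ETH_OVERHEAD, then rejects values below the minimum MTU or above max_rx_pktlen. The arithmetic can be checked in isolation with the sketch below. The constants mirror the macros quoted elsewhere in this series, except VLAN_TAG_SIZE, whose value of 4 is an assumption, and check_mtu is a hypothetical helper name.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define ETHER_HDR_LEN  14u
#define ETHER_CRC_LEN  4u
#define VLAN_TAG_SIZE  4u   /* assumed; not shown in the quoted patches */
#define ETH_OVERHEAD   (ETHER_HDR_LEN + ETHER_CRC_LEN + VLAN_TAG_SIZE)
#define ETHER_MIN_MTU  68u

/* Mirrors Jumbo_Frame_9k: 9 * 1024 - ETH_HLEN - VLAN_HLEN - CRC = 9194. */
#define MAX_RX_PKTLEN  (9u * 1024u - 14u - 4u - ETHER_CRC_LEN)

/* Validate an MTU and report the resulting frame size on success. */
static int
check_mtu(uint16_t mtu, uint32_t *frame_size)
{
	uint32_t size = (uint32_t)mtu + ETH_OVERHEAD;

	if (mtu < ETHER_MIN_MTU || size > MAX_RX_PKTLEN)
		return -EINVAL;

	*frame_size = size;
	return 0;
}
```

Under these assumptions an MTU of 1500 yields a 1522-byte frame, and the largest accepted MTU is 9194 - 22 = 9172, which matches the max_mtu computation added to rtl_dev_infos_get.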



[PATCH v1 05/18] net/r8169: add support for hw config

2024-10-14 Thread Howard Wang
Implement the rtl_hw_config function to configure the hardware.

Signed-off-by: Howard Wang 
---
 drivers/net/r8169/meson.build|   1 +
 drivers/net/r8169/r8169_base.h   | 125 ++
 drivers/net/r8169/r8169_ethdev.c |   2 +
 drivers/net/r8169/r8169_ethdev.h |  15 +-
 drivers/net/r8169/r8169_hw.c | 710 +++
 drivers/net/r8169/r8169_hw.h |  17 +
 drivers/net/r8169/r8169_phy.c|  41 ++
 drivers/net/r8169/r8169_phy.h|  21 +
 8 files changed, 930 insertions(+), 2 deletions(-)
 create mode 100644 drivers/net/r8169/r8169_phy.c
 create mode 100644 drivers/net/r8169/r8169_phy.h

diff --git a/drivers/net/r8169/meson.build b/drivers/net/r8169/meson.build
index ff7d6ca4b8..56f857ac8c 100644
--- a/drivers/net/r8169/meson.build
+++ b/drivers/net/r8169/meson.build
@@ -5,5 +5,6 @@ sources = files(
'r8169_ethdev.c',
'r8169_hw.c',
'r8169_rxtx.c',
+   'r8169_phy.c',
 )
 
diff --git a/drivers/net/r8169/r8169_base.h b/drivers/net/r8169/r8169_base.h
index 0e79d8d22a..2e72faeb2c 100644
--- a/drivers/net/r8169/r8169_base.h
+++ b/drivers/net/r8169/r8169_base.h
@@ -23,6 +23,117 @@ typedef uint16_t  u16;
 typedef uint32_t  u32;
 typedef uint64_t  u64;
 
+enum mcfg {
+   CFG_METHOD_1 = 1,
+   CFG_METHOD_2,
+   CFG_METHOD_3,
+   CFG_METHOD_4,
+   CFG_METHOD_5,
+   CFG_METHOD_6,
+   CFG_METHOD_7,
+   CFG_METHOD_8,
+   CFG_METHOD_9,
+   CFG_METHOD_10,
+   CFG_METHOD_11,
+   CFG_METHOD_12,
+   CFG_METHOD_13,
+   CFG_METHOD_14,
+   CFG_METHOD_15,
+   CFG_METHOD_16,
+   CFG_METHOD_17,
+   CFG_METHOD_18,
+   CFG_METHOD_19,
+   CFG_METHOD_20,
+   CFG_METHOD_21,
+   CFG_METHOD_22,
+   CFG_METHOD_23,
+   CFG_METHOD_24,
+   CFG_METHOD_25,
+   CFG_METHOD_26,
+   CFG_METHOD_27,
+   CFG_METHOD_28,
+   CFG_METHOD_29,
+   CFG_METHOD_30,
+   CFG_METHOD_31,
+   CFG_METHOD_32,
+   CFG_METHOD_33,
+   CFG_METHOD_34,
+   CFG_METHOD_35,
+   CFG_METHOD_36,
+   CFG_METHOD_37,
+   CFG_METHOD_38,
+   CFG_METHOD_39,
+   CFG_METHOD_40,
+   CFG_METHOD_41,
+   CFG_METHOD_42,
+   CFG_METHOD_43,
+   CFG_METHOD_44,
+   CFG_METHOD_45,
+   CFG_METHOD_46,
+   CFG_METHOD_47,
+   CFG_METHOD_48,
+   CFG_METHOD_49,
+   CFG_METHOD_50,
+   CFG_METHOD_51,
+   CFG_METHOD_52,
+   CFG_METHOD_53,
+   CFG_METHOD_54,
+   CFG_METHOD_55,
+   CFG_METHOD_56,
+   CFG_METHOD_57,
+   CFG_METHOD_58,
+   CFG_METHOD_59,
+   CFG_METHOD_60,
+   CFG_METHOD_61,
+   CFG_METHOD_62,
+   CFG_METHOD_63,
+   CFG_METHOD_64,
+   CFG_METHOD_65,
+   CFG_METHOD_66,
+   CFG_METHOD_67,
+   CFG_METHOD_68,
+   CFG_METHOD_69,
+   CFG_METHOD_70,
+   CFG_METHOD_71,
+   CFG_METHOD_MAX,
+   CFG_METHOD_DEFAULT = 0xFF
+};
+
+enum bits {
+   BIT_0 = (1UL << 0),
+   BIT_1 = (1UL << 1),
+   BIT_2 = (1UL << 2),
+   BIT_3 = (1UL << 3),
+   BIT_4 = (1UL << 4),
+   BIT_5 = (1UL << 5),
+   BIT_6 = (1UL << 6),
+   BIT_7 = (1UL << 7),
+   BIT_8 = (1UL << 8),
+   BIT_9 = (1UL << 9),
+   BIT_10 = (1UL << 10),
+   BIT_11 = (1UL << 11),
+   BIT_12 = (1UL << 12),
+   BIT_13 = (1UL << 13),
+   BIT_14 = (1UL << 14),
+   BIT_15 = (1UL << 15),
+   BIT_16 = (1UL << 16),
+   BIT_17 = (1UL << 17),
+   BIT_18 = (1UL << 18),
+   BIT_19 = (1UL << 19),
+   BIT_20 = (1UL << 20),
+   BIT_21 = (1UL << 21),
+   BIT_22 = (1UL << 22),
+   BIT_23 = (1UL << 23),
+   BIT_24 = (1UL << 24),
+   BIT_25 = (1UL << 25),
+   BIT_26 = (1UL << 26),
+   BIT_27 = (1UL << 27),
+   BIT_28 = (1UL << 28),
+   BIT_29 = (1UL << 29),
+   BIT_30 = (1UL << 30),
+   BIT_31 = (1UL << 31)
+};
+
 enum RTL_registers {
MAC0= 0x00, /* Ethernet hardware address */
MAC4= 0x04,
@@ -363,6 +474,8 @@ enum RTL_register_content {
INT_CFG0_ENABLE_8125= (1 << 0),
INT_CFG0_TIMEOUT0_BYPASS_8125   = (1 << 1),
INT_CFG0_MITIGATION_BYPASS_8125 = (1 << 2),
+   INT_CFG0_RDU_BYPASS_8126= (1 << 4),
+   INT_CFG0_MSIX_ENTRY_NUM_MODE= (1 << 5),
ISRIMR_V2_ROK_Q0 = (1 << 0),
ISRIMR_TOK_Q0= (1 << 16),
ISRIMR_TOK_Q1= (1 << 18),
@@ -389,6 +502,18 @@ enum RTL_register_content {
 #define msleep rte_delay_ms
 #define usleep rte_delay_us
 
+#define RX_DMA_BURST_unlimited  7   /* Maximum PCI burst, '7' is unlimited */
+#define RX_DMA_BURST_512        5
+#define TX_DMA_BURST_unlimited  7
+#define TX_DMA_BURST_1024       6
+#define TX_DMA_BURST_512        5
+#define TX_DMA_BURST_256        4
+#define TX_DMA_BURST_128        3
+#define TX_DMA_BURST_64 2
+#define TX_DMA_BURST_32 1
+#define TX_DMA_BURST_16 0
+#define InterFrameGap   0x03/* 3 means InterFrameG

[PATCH v1 03/18] net/r8169: add hardware registers access routines

2024-10-14 Thread Howard Wang
Add implementation for hardware registers access routines.

Signed-off-by: Howard Wang 
---
 drivers/net/r8169/meson.build|   1 +
 drivers/net/r8169/r8169_base.h   | 389 +++
 drivers/net/r8169/r8169_ethdev.h |   1 +
 drivers/net/r8169/r8169_hw.c |  94 
 drivers/net/r8169/r8169_hw.h |  29 +++
 5 files changed, 514 insertions(+)
 create mode 100644 drivers/net/r8169/r8169_hw.c
 create mode 100644 drivers/net/r8169/r8169_hw.h

diff --git a/drivers/net/r8169/meson.build b/drivers/net/r8169/meson.build
index e37b4fb237..f659e56192 100644
--- a/drivers/net/r8169/meson.build
+++ b/drivers/net/r8169/meson.build
@@ -3,5 +3,6 @@
 
 sources = files(
'r8169_ethdev.c',
+   'r8169_hw.c',
 )
 
diff --git a/drivers/net/r8169/r8169_base.h b/drivers/net/r8169/r8169_base.h
index c3b0186daa..0e79d8d22a 100644
--- a/drivers/net/r8169/r8169_base.h
+++ b/drivers/net/r8169/r8169_base.h
@@ -5,12 +5,401 @@
 #ifndef _R8169_BASE_H_
 #define _R8169_BASE_H_
 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
 typedef uint8_t   u8;
 typedef uint16_t  u16;
 typedef uint32_t  u32;
 typedef uint64_t  u64;
 
+enum RTL_registers {
+   MAC0= 0x00, /* Ethernet hardware address */
+   MAC4= 0x04,
+   MAR0= 0x08, /* Multicast filter */
+   CounterAddrLow  = 0x10,
+   CounterAddrHigh = 0x14,
+   CustomLED   = 0x18,
+   TxDescStartAddrLow  = 0x20,
+   TxDescStartAddrHigh = 0x24,
+   TxHDescStartAddrLow = 0x28,
+   TxHDescStartAddrHigh = 0x2C,
+   FLASH   = 0x30,
+   INT_CFG0_8125   = 0x34,
+   ERSR= 0x36,
+   ChipCmd = 0x37,
+   TxPoll  = 0x38,
+   IntrMask= 0x3C,
+   IntrStatus  = 0x3E,
+   TxConfig= 0x40,
+   RxConfig= 0x44,
+   TCTR= 0x48,
+   Cfg9346 = 0x50,
+   Config0 = 0x51,
+   Config1 = 0x52,
+   Config2 = 0x53,
+   Config3 = 0x54,
+   Config4 = 0x55,
+   Config5 = 0x56,
+   TDFNR   = 0x57,
+   TimeInt0= 0x58,
+   TimeInt1= 0x5C,
+   PHYAR   = 0x60,
+   CSIDR   = 0x64,
+   CSIAR   = 0x68,
+   PHYstatus   = 0x6C,
+   MACDBG  = 0x6D,
+   GPIO= 0x6E,
+   PMCH= 0x6F,
+   ERIDR   = 0x70,
+   ERIAR   = 0x74,
+   INT_CFG1_8125   = 0x7A,
+   EPHY_RXER_NUM   = 0x7C,
+   EPHYAR  = 0x80,
+   TimeInt2= 0x8C,
+   OCPDR   = 0xB0,
+   MACOCP  = 0xB0,
+   OCPAR   = 0xB4,
+   SecMAC0 = 0xB4,
+   SecMAC4 = 0xB8,
+   PHYOCP  = 0xB8,
+   DBG_reg = 0xD1,
+   TwiCmdReg   = 0xD2,
+   MCUCmd_reg  = 0xD3,
+   RxMaxSize   = 0xDA,
+   EFUSEAR = 0xDC,
+   CPlusCmd= 0xE0,
+   IntrMitigate= 0xE2,
+   RxDescAddrLow   = 0xE4,
+   RxDescAddrHigh  = 0xE8,
+   MTPS= 0xEC,
+   FuncEvent   = 0xF0,
+   PPSW= 0xF2,
+   FuncEventMask   = 0xF4,
+   TimeInt3= 0xF4,
+   FuncPresetState = 0xF8,
+   CMAC_IBCR0  = 0xF8,
+   CMAC_IBCR2  = 0xF9,
+   CMAC_IBIMR0 = 0xFA,
+   CMAC_IBISR0 = 0xFB,
+   FuncForceEvent  = 0xFC,
+
+   /* 8125 */
+   IMR0_8125  = 0x38,
+   ISR0_8125  = 0x3C,
+   TPPOLL_8125= 0x90,
+   IMR1_8125  = 0x800,
+   ISR1_8125  = 0x802,
+   IMR2_8125  = 0x804,
+   ISR2_8125  = 0x806,
+   IMR3_8125  = 0x808,
+   ISR3_8125  = 0x80A,
+   BACKUP_ADDR0_8125  = 0x19E0,
+   BACKUP_ADDR1_8125  = 0X19E4,
+   TCTR0_8125 = 0x0048,
+   TCTR1_8125 = 0x004C,
+   TCTR2_8125 = 0x0088,
+   TCTR3_8125 = 0x001C,
+   TIMER_INT0_8125= 0x0058,
+   TIMER_INT1_8125= 0x005C,
+   TIMER_INT2_8125= 0x008C,
+   TIMER_INT3_8125= 0x00F4,
+   INT_MITI_V2_0_RX   = 0x0A00,
+   INT_MITI_V2_0_TX   = 0x0A02,
+   INT_MITI_V2_1_RX   = 0x0A08,
+   INT_MITI_V2_1_TX   = 0x0A0A,
+   IMR_V2_CLEAR_REG_8125 = 0x0D00,
+   ISR_V2_8125   = 0x0D04,
+   IMR_V2_SET_REG_8125   = 0x0D0C,
+   TDU_STA_8125   = 0x0D08,
+   RDU_STA_8125   = 0x0D0A,
+   IMR_V4_L2_CLEAR_REG_8125 = 0x0D10,
+   IMR_V4_L2_SET_REG_8125   = 0x0D18,
+   ISR_V4_L2_8125  = 0x0D14,
+   DOUBLE_VLAN_CONFIG = 0x1000,
+   TX_NEW_CTRL= 0x203E,
+   TNPDS_Q1_LOW_8125  = 0x2100,
+   PLA_TXQ0_IDLE_CREDIT = 0x2500,
+   PLA_TXQ1_IDLE_CREDIT = 0x2504,
+   SW_TAIL_PTR0_8125  = 0x2800,
+   HW_CLO_PTR0_8125   = 0x2802,
+ 

[PATCH v1 06/18] net/r8169: add phy registers access routines

2024-10-14 Thread Howard Wang
Signed-off-by: Howard Wang 
---
 drivers/net/r8169/r8169_ethdev.h |   1 +
 drivers/net/r8169/r8169_phy.c| 219 +++
 drivers/net/r8169/r8169_phy.h|  18 +++
 3 files changed, 238 insertions(+)

diff --git a/drivers/net/r8169/r8169_ethdev.h b/drivers/net/r8169/r8169_ethdev.h
index 20dbf06c9b..9656a26eb0 100644
--- a/drivers/net/r8169/r8169_ethdev.h
+++ b/drivers/net/r8169/r8169_ethdev.h
@@ -18,6 +18,7 @@ struct rtl_hw {
u8  *mmio_addr;
u32 mcfg;
u8  HwSuppIntMitiVer;
+   u16 cur_page;
 
/* Enable Tx No Close */
u8 EnableTxNoClose;
diff --git a/drivers/net/r8169/r8169_phy.c b/drivers/net/r8169/r8169_phy.c
index f0a880eeca..cfec426ee1 100644
--- a/drivers/net/r8169/r8169_phy.c
+++ b/drivers/net/r8169/r8169_phy.c
@@ -39,3 +39,222 @@ rtl_set_mac_ocp_bit(struct rtl_hw *hw, u16 addr, u16 mask)
rtl_clear_set_mac_ocp_bit(hw, addr, 0, mask);
 }
 
+static u16
+rtl_map_phy_ocp_addr(u16 PageNum, u8 RegNum)
+{
+   u8 ocp_reg_num = 0;
+   u16 ocp_page_num = 0;
+   u16 ocp_phy_address = 0;
+
+   if (PageNum == 0) {
+   ocp_page_num = OCP_STD_PHY_BASE_PAGE + (RegNum / 8);
+   ocp_reg_num = 0x10 + (RegNum % 8);
+   } else {
+   ocp_page_num = PageNum;
+   ocp_reg_num = RegNum;
+   }
+
+   ocp_page_num <<= 4;
+
+   if (ocp_reg_num < 16)
+   ocp_phy_address = 0;
+   else {
+   ocp_reg_num -= 16;
+   ocp_reg_num <<= 1;
+
+   ocp_phy_address = ocp_page_num + ocp_reg_num;
+   }
+
+   return ocp_phy_address;
+}
+
+static u32
+rtl_mdio_real_read_phy_ocp(struct rtl_hw *hw, u32 RegAddr)
+{
+   u32 data32;
+   int i, value = 0;
+
+   data32 = RegAddr / 2;
+   data32 <<= OCPR_Addr_Reg_shift;
+
+   RTL_W32(hw, PHYOCP, data32);
+   for (i = 0; i < 100; i++) {
+   udelay(1);
+
+   if (RTL_R32(hw, PHYOCP) & OCPR_Flag)
+   break;
+   }
+   value = RTL_R32(hw, PHYOCP) & OCPDR_Data_Mask;
+
+   return value;
+}
+
+u32
+rtl_mdio_direct_read_phy_ocp(struct rtl_hw *hw, u32 RegAddr)
+{
+   return rtl_mdio_real_read_phy_ocp(hw, RegAddr);
+}
+
+static u32
+rtl_mdio_read_phy_ocp(struct rtl_hw *hw, u16 PageNum, u32 RegAddr)
+{
+   u16 ocp_addr;
+
+   ocp_addr = rtl_map_phy_ocp_addr(PageNum, RegAddr);
+
+   return rtl_mdio_direct_read_phy_ocp(hw, ocp_addr);
+}
+
+static u32
+rtl_mdio_real_read(struct rtl_hw *hw, u32 RegAddr)
+{
+   return rtl_mdio_read_phy_ocp(hw, hw->cur_page, RegAddr);
+}
+
+static void
+rtl_mdio_real_write_phy_ocp(struct rtl_hw *hw, u32 RegAddr, u32 value)
+{
+   u32 data32;
+   int i;
+
+   data32 = RegAddr / 2;
+   data32 <<= OCPR_Addr_Reg_shift;
+   data32 |= OCPR_Write | value;
+
+   RTL_W32(hw, PHYOCP, data32);
+   for (i = 0; i < 100; i++) {
+   udelay(1);
+
+   if (!(RTL_R32(hw, PHYOCP) & OCPR_Flag))
+   break;
+   }
+}
+
+void
+rtl_mdio_direct_write_phy_ocp(struct rtl_hw *hw, u32 RegAddr, u32 value)
+{
+   rtl_mdio_real_write_phy_ocp(hw, RegAddr, value);
+}
+
+static void
+rtl_mdio_write_phy_ocp(struct rtl_hw *hw, u16 PageNum, u32 RegAddr, u32 value)
+{
+   u16 ocp_addr;
+
+   ocp_addr = rtl_map_phy_ocp_addr(PageNum, RegAddr);
+
+   rtl_mdio_direct_write_phy_ocp(hw, ocp_addr, value);
+}
+
+static void
+rtl_mdio_real_write(struct rtl_hw *hw, u32 RegAddr, u32 value)
+{
+   if (RegAddr == 0x1F)
+   hw->cur_page = value;
+   rtl_mdio_write_phy_ocp(hw, hw->cur_page, RegAddr, value);
+}
+
+u32
+rtl_mdio_read(struct rtl_hw *hw, u32 RegAddr)
+{
+   return rtl_mdio_real_read(hw, RegAddr);
+}
+
+void
+rtl_mdio_write(struct rtl_hw *hw, u32 RegAddr, u32 value)
+{
+   rtl_mdio_real_write(hw, RegAddr, value);
+}
+
+void
+rtl_clear_and_set_eth_phy_ocp_bit(struct rtl_hw *hw, u16 addr, u16 clearmask,
+  u16 setmask)
+{
+   u16 phy_reg_value;
+
+   phy_reg_value = rtl_mdio_direct_read_phy_ocp(hw, addr);
+   phy_reg_value &= ~clearmask;
+   phy_reg_value |= setmask;
+   rtl_mdio_direct_write_phy_ocp(hw, addr, phy_reg_value);
+}
+
+void
+rtl_clear_eth_phy_ocp_bit(struct rtl_hw *hw, u16 addr, u16 mask)
+{
+   rtl_clear_and_set_eth_phy_ocp_bit(hw, addr, mask, 0);
+}
+
+void
+rtl_set_eth_phy_ocp_bit(struct rtl_hw *hw, u16 addr, u16 mask)
+{
+   rtl_clear_and_set_eth_phy_ocp_bit(hw, addr, 0, mask);
+}
+
+void
+rtl_ephy_write(struct rtl_hw *hw, int addr, int value)
+{
+   int i;
+
+   RTL_W32(hw, EPHYAR, EPHYAR_Write |
+   (addr & EPHYAR_Reg_Mask_v2) << EPHYAR_Reg_shift |
+   (value & EPHYAR_Data_Mask));
+
+   for (i = 0; i < 10; i++) {
+   udelay(100);
+
+   /* Check if the NIC has completed EPHY write */
+   if (!(RTL_R32(hw, EPHYAR) & EPHYAR_Flag))
+   break;
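
rtl_map_phy_ocp_addr earlier in this patch translates a (page, register) pair into a PHY OCP address: standard MII registers on page 0 are folded into pages starting at OCP_STD_PHY_BASE_PAGE, eight registers per page. The standalone copy below makes the mapping easy to check by hand. The base page value 0x0A40 is an assumption, since the macro's definition is not part of this message, and the function is renamed map_phy_ocp_addr to mark it as a sketch.

```c
#include <assert.h>
#include <stdint.h>

#define OCP_STD_PHY_BASE_PAGE 0x0A40 /* assumed value */

/* Same logic as rtl_map_phy_ocp_addr, using fixed-width standard types. */
static uint16_t
map_phy_ocp_addr(uint16_t page_num, uint8_t reg_num)
{
	uint8_t ocp_reg_num;
	uint16_t ocp_page_num;

	if (page_num == 0) {
		/* Standard MII registers: eight per OCP page. */
		ocp_page_num = OCP_STD_PHY_BASE_PAGE + (reg_num / 8);
		ocp_reg_num = 0x10 + (reg_num % 8);
	} else {
		ocp_page_num = page_num;
		ocp_reg_num = reg_num;
	}

	ocp_page_num <<= 4;

	/* Registers below 0x10 have no OCP mapping. */
	if (ocp_reg_num < 16)
		return 0;

	/* Each register is two bytes wide. */
	ocp_reg_num -= 16;
	ocp_reg_num <<= 1;

	return ocp_page_num + ocp_reg_num;
}
```

With the assumed base page, MII register 0 on page 0 maps to 0xA400 and register 9 maps to 0xA412, while page 0x0A42 register 0x10 maps to 0xA420, the address polled by rtl_wait_phy_ups_resume in patch 08/18.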

[PATCH v1 01/18] net/r8169: add PMD driver skeleton

2024-10-14 Thread Howard Wang
Meson build infrastructure, r8169_ethdev minimal skeleton,
header with Realtek NIC device and vendor IDs.

Signed-off-by: Howard Wang 
---
 MAINTAINERS  |   7 ++
 drivers/net/meson.build  |   1 +
 drivers/net/r8169/meson.build|   7 ++
 drivers/net/r8169/r8169_base.h   |  16 +++
 drivers/net/r8169/r8169_ethdev.c | 179 +++
 drivers/net/r8169/r8169_ethdev.h |  41 +++
 6 files changed, 251 insertions(+)
 create mode 100644 drivers/net/r8169/meson.build
 create mode 100644 drivers/net/r8169/r8169_base.h
 create mode 100644 drivers/net/r8169/r8169_ethdev.c
 create mode 100644 drivers/net/r8169/r8169_ethdev.h

diff --git a/MAINTAINERS b/MAINTAINERS
index c5a703b5c0..5f9eccc43f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1076,6 +1076,13 @@ F: drivers/net/memif/
 F: doc/guides/nics/memif.rst
 F: doc/guides/nics/features/memif.ini
 
+Realtek r8169
+M: Howard Wang 
+M: ChunHao Lin 
+M: Xing Wang 
+M: Realtek NIC SW 
+F: drivers/net/r8169
+
 
 Crypto Drivers
 --
diff --git a/drivers/net/meson.build b/drivers/net/meson.build
index fb6d34b782..fddcf39655 100644
--- a/drivers/net/meson.build
+++ b/drivers/net/meson.build
@@ -53,6 +53,7 @@ drivers = [
 'pfe',
 'qede',
 'ring',
+'r8169',
 'sfc',
 'softnic',
 'tap',
diff --git a/drivers/net/r8169/meson.build b/drivers/net/r8169/meson.build
new file mode 100644
index 0000000000..e37b4fb237
--- /dev/null
+++ b/drivers/net/r8169/meson.build
@@ -0,0 +1,7 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Realtek Corporation. All rights reserved
+
+sources = files(
+   'r8169_ethdev.c',
+)
+
diff --git a/drivers/net/r8169/r8169_base.h b/drivers/net/r8169/r8169_base.h
new file mode 100644
index 0000000000..c3b0186daa
--- /dev/null
+++ b/drivers/net/r8169/r8169_base.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Realtek Corporation. All rights reserved
+ */
+
+#ifndef _R8169_BASE_H_
+#define _R8169_BASE_H_
+
+typedef uint8_t   u8;
+typedef uint16_t  u16;
+typedef uint32_t  u32;
+typedef uint64_t  u64;
+
+#define PCI_VENDOR_ID_REALTEK 0x10EC
+
+#endif
+
diff --git a/drivers/net/r8169/r8169_ethdev.c b/drivers/net/r8169/r8169_ethdev.c
new file mode 100644
index 0000000000..e5f8857304
--- /dev/null
+++ b/drivers/net/r8169/r8169_ethdev.c
@@ -0,0 +1,179 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Realtek Corporation. All rights reserved
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "r8169_ethdev.h"
+#include "r8169_base.h"
+
+static int rtl_dev_configure(struct rte_eth_dev *dev __rte_unused);
+static int rtl_dev_start(struct rte_eth_dev *dev);
+static int rtl_dev_stop(struct rte_eth_dev *dev);
+static int rtl_dev_reset(struct rte_eth_dev *dev);
+static int rtl_dev_close(struct rte_eth_dev *dev);
+
+/*
+ * The set of PCI devices this driver supports
+ */
+static const struct rte_pci_id pci_id_r8169_map[] = {
+   { RTE_PCI_DEVICE(PCI_VENDOR_ID_REALTEK, 0x8125) },
+   { RTE_PCI_DEVICE(PCI_VENDOR_ID_REALTEK, 0x8162) },
+   { RTE_PCI_DEVICE(PCI_VENDOR_ID_REALTEK, 0x8126) },
+   { RTE_PCI_DEVICE(PCI_VENDOR_ID_REALTEK, 0x5000) },
+   {.vendor_id = 0, /* sentinel */ },
+};
+
+static const struct eth_dev_ops rtl_eth_dev_ops = {
+   .dev_configure= rtl_dev_configure,
+   .dev_start= rtl_dev_start,
+   .dev_stop = rtl_dev_stop,
+   .dev_close= rtl_dev_close,
+   .dev_reset= rtl_dev_reset,
+};
+
+static int
+rtl_dev_configure(struct rte_eth_dev *dev __rte_unused)
+{
+   return 0;
+}
+
+/*
+ * Configure device link speed and setup link.
+ * It returns 0 on success.
+ */
+static int
+rtl_dev_start(struct rte_eth_dev *dev)
+{
+   struct rtl_adapter *adapter = RTL_DEV_PRIVATE(dev);
+   struct rtl_hw *hw = &adapter->hw;
+
+   hw->adapter_stopped = 0;
+
+   return 0;
+}
+
+/*
+ * Stop device: disable RX and TX functions to allow for reconfiguring.
+ */
+static int
+rtl_dev_stop(struct rte_eth_dev *dev)
+{
+   struct rtl_adapter *adapter = RTL_DEV_PRIVATE(dev);
+   struct rtl_hw *hw = &adapter->hw;
+
+   if (hw->adapter_stopped)
+   return 0;
+
+   hw->adapter_stopped = 1;
+   dev->data->dev_started = 0;
+
+   return 0;
+}
+
+/*
+ * Reset and stop device.
+ */
+static int
+rtl_dev_close(struct rte_eth_dev *dev)
+{
+   int ret_stp;
+
+   if (rte_eal_process_type() != RTE_PROC_PRIMARY)
+   return 0;
+
+   ret_stp = rtl_dev_stop(dev);
+
+   return ret_stp;
+}
+
+static int
+rtl_dev_init(struct rte_eth_dev *dev)
+{
+   struct rte_pci_device *pci_dev = RTE_ETH_DEV_TO_PCI(dev);
+   struct rte_intr_handle *intr_handle 

[PATCH v1 00/18] net/r8169: add r8169 pmd to dpdk

2024-10-14 Thread Howard Wang
The R8169 PMD supports Realtek 2.5G and 5G Ethernet NICs.

Howard Wang (18):
  net/r8169: add PMD driver skeleton
  net/r8169: add logging structure
  net/r8169: add hardware registers access routines
  net/r8169: implement core logic for Tx/Rx
  net/r8169: add support for hw config
  net/r8169: add phy registers access routines
  net/r8169: add support for hardware operations
  net/r8169: add support for phy configuration
  net/r8169: add support for hw initialization
  net/r8169: add link status and interrupt management
  net/r8169: implement Rx path
  net/r8169: implement Tx path
  net/r8169: implement device statistics
  net/r8169: implement promisc and allmulti modes
  net/r8169: implement MTU configuration
  net/r8169: add support for getting fw version
  net/r8169: add driver_start and driver_stop
  doc/guides/nics: add documents for r8169 pmd

 MAINTAINERS|9 +
 doc/guides/nics/features/r8169.ini |   32 +
 doc/guides/nics/r8169.rst  |   17 +
 drivers/net/meson.build|1 +
 drivers/net/r8169/base/rtl8125a.c  |  413 
 drivers/net/r8169/base/rtl8125a_mcu.c  | 1586 +
 drivers/net/r8169/base/rtl8125a_mcu.h  |   15 +
 drivers/net/r8169/base/rtl8125b.c  |  391 
 drivers/net/r8169/base/rtl8125b_mcu.c  | 1068 +
 drivers/net/r8169/base/rtl8125b_mcu.h  |   15 +
 drivers/net/r8169/base/rtl8125bp.c |  116 +
 drivers/net/r8169/base/rtl8125bp_mcu.c |  289 +++
 drivers/net/r8169/base/rtl8125bp_mcu.h |   14 +
 drivers/net/r8169/base/rtl8125d.c  |  245 ++
 drivers/net/r8169/base/rtl8125d_mcu.c  |  618 +
 drivers/net/r8169/base/rtl8125d_mcu.h  |   14 +
 drivers/net/r8169/base/rtl8126a.c  |  534 +
 drivers/net/r8169/base/rtl8126a_mcu.c  | 2994 
 drivers/net/r8169/base/rtl8126a_mcu.h  |   17 +
 drivers/net/r8169/meson.build  |   21 +
 drivers/net/r8169/r8169_base.h |  632 +
 drivers/net/r8169/r8169_dash.c |  230 ++
 drivers/net/r8169/r8169_dash.h |   58 +
 drivers/net/r8169/r8169_ethdev.c   |  821 +++
 drivers/net/r8169/r8169_ethdev.h   |  146 ++
 drivers/net/r8169/r8169_hw.c   | 1590 +
 drivers/net/r8169/r8169_hw.h   |  115 +
 drivers/net/r8169/r8169_logs.h |   53 +
 drivers/net/r8169/r8169_phy.c  |  898 +++
 drivers/net/r8169/r8169_phy.h  |  148 ++
 drivers/net/r8169/r8169_rxtx.c | 1495 
 31 files changed, 14595 insertions(+)
 create mode 100644 doc/guides/nics/features/r8169.ini
 create mode 100644 doc/guides/nics/r8169.rst
 create mode 100644 drivers/net/r8169/base/rtl8125a.c
 create mode 100644 drivers/net/r8169/base/rtl8125a_mcu.c
 create mode 100644 drivers/net/r8169/base/rtl8125a_mcu.h
 create mode 100644 drivers/net/r8169/base/rtl8125b.c
 create mode 100644 drivers/net/r8169/base/rtl8125b_mcu.c
 create mode 100644 drivers/net/r8169/base/rtl8125b_mcu.h
 create mode 100644 drivers/net/r8169/base/rtl8125bp.c
 create mode 100644 drivers/net/r8169/base/rtl8125bp_mcu.c
 create mode 100644 drivers/net/r8169/base/rtl8125bp_mcu.h
 create mode 100644 drivers/net/r8169/base/rtl8125d.c
 create mode 100644 drivers/net/r8169/base/rtl8125d_mcu.c
 create mode 100644 drivers/net/r8169/base/rtl8125d_mcu.h
 create mode 100644 drivers/net/r8169/base/rtl8126a.c
 create mode 100644 drivers/net/r8169/base/rtl8126a_mcu.c
 create mode 100644 drivers/net/r8169/base/rtl8126a_mcu.h
 create mode 100644 drivers/net/r8169/meson.build
 create mode 100644 drivers/net/r8169/r8169_base.h
 create mode 100644 drivers/net/r8169/r8169_dash.c
 create mode 100644 drivers/net/r8169/r8169_dash.h
 create mode 100644 drivers/net/r8169/r8169_ethdev.c
 create mode 100644 drivers/net/r8169/r8169_ethdev.h
 create mode 100644 drivers/net/r8169/r8169_hw.c
 create mode 100644 drivers/net/r8169/r8169_hw.h
 create mode 100644 drivers/net/r8169/r8169_logs.h
 create mode 100644 drivers/net/r8169/r8169_phy.c
 create mode 100644 drivers/net/r8169/r8169_phy.h
 create mode 100644 drivers/net/r8169/r8169_rxtx.c

-- 
2.34.1



[PATCH v1 04/18] net/r8169: implement core logic for Tx/Rx

2024-10-14 Thread Howard Wang
Add RX/TX function prototypes for further datapath development.

Signed-off-by: Howard Wang 
---
 drivers/net/r8169/meson.build|  1 +
 drivers/net/r8169/r8169_ethdev.c | 17 ++
 drivers/net/r8169/r8169_ethdev.h |  3 ++
 drivers/net/r8169/r8169_rxtx.c   | 57 
 4 files changed, 78 insertions(+)
 create mode 100644 drivers/net/r8169/r8169_rxtx.c

diff --git a/drivers/net/r8169/meson.build b/drivers/net/r8169/meson.build
index f659e56192..ff7d6ca4b8 100644
--- a/drivers/net/r8169/meson.build
+++ b/drivers/net/r8169/meson.build
@@ -4,5 +4,6 @@
 sources = files(
'r8169_ethdev.c',
'r8169_hw.c',
+   'r8169_rxtx.c',
 )
 
diff --git a/drivers/net/r8169/r8169_ethdev.c b/drivers/net/r8169/r8169_ethdev.c
index 09e12fb56d..92121ad3fb 100644
--- a/drivers/net/r8169/r8169_ethdev.c
+++ b/drivers/net/r8169/r8169_ethdev.c
@@ -27,6 +27,8 @@
 
 #include "r8169_ethdev.h"
 #include "r8169_base.h"
+#include "r8169_logs.h"
+#include "r8169_hw.h"
 
 static int rtl_dev_configure(struct rte_eth_dev *dev __rte_unused);
 static int rtl_dev_start(struct rte_eth_dev *dev);
@@ -68,10 +70,23 @@ rtl_dev_start(struct rte_eth_dev *dev)
 {
struct rtl_adapter *adapter = RTL_DEV_PRIVATE(dev);
struct rtl_hw *hw = &adapter->hw;
+   int err;
+
+   /* Initialize transmission unit */
+   rtl_tx_init(dev);
+
+   /* This can fail when allocating mbufs for descriptor rings */
+   err = rtl_rx_init(dev);
+   if (err) {
+   PMD_INIT_LOG(ERR, "Unable to initialize RX hardware");
+   goto error;
+   }
 
hw->adapter_stopped = 0;
 
return 0;
+error:
+   return -EIO;
 }
 
 /*
@@ -117,6 +132,8 @@ rtl_dev_init(struct rte_eth_dev *dev)
struct rtl_hw *hw = &adapter->hw;
 
dev->dev_ops = &rtl_eth_dev_ops;
+   dev->tx_pkt_burst = &rtl_xmit_pkts;
+   dev->rx_pkt_burst = &rtl_recv_pkts;
 
/* For secondary processes, the primary process has done all the work */
if (rte_eal_process_type() != RTE_PROC_PRIMARY)
diff --git a/drivers/net/r8169/r8169_ethdev.h b/drivers/net/r8169/r8169_ethdev.h
index 04458dc497..7c6e110e7f 100644
--- a/drivers/net/r8169/r8169_ethdev.h
+++ b/drivers/net/r8169/r8169_ethdev.h
@@ -35,6 +35,9 @@ struct rtl_adapter {
 #define RTL_DEV_PRIVATE(eth_dev) \
((struct rtl_adapter *)((eth_dev)->data->dev_private))
 
+int rtl_rx_init(struct rte_eth_dev *dev);
+int rtl_tx_init(struct rte_eth_dev *dev);
+
 uint16_t rtl_xmit_pkts(void *txq, struct rte_mbuf **tx_pkts, uint16_t nb_pkts);
 uint16_t rtl_recv_pkts(void *rxq, struct rte_mbuf **rx_pkts, uint16_t nb_pkts);
 
diff --git a/drivers/net/r8169/r8169_rxtx.c b/drivers/net/r8169/r8169_rxtx.c
new file mode 100644
index 00..cce78d4e60
--- /dev/null
+++ b/drivers/net/r8169/r8169_rxtx.c
@@ -0,0 +1,57 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Realtek Corporation. All rights reserved
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "r8169_ethdev.h"
+#include "r8169_hw.h"
+#include "r8169_logs.h"
+
+/* -RX-- */
+int
+rtl_rx_init(struct rte_eth_dev *dev)
+{
+   return 0;
+}
+
+uint16_t
+rtl_recv_pkts(void *rxq, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
+{
+   return 0;
+}
+
+/* -TX-- */
+int
+rtl_tx_init(struct rte_eth_dev *dev)
+{
+   return 0;
+}
+
+uint16_t
+rtl_xmit_pkts(void *txq, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
+{
+   return 0;
+}
+
-- 
2.34.1



[PATCH v1 02/18] net/r8169: add logging structure

2024-10-14 Thread Howard Wang
Implement logging macros for debug purposes.

Signed-off-by: Howard Wang 
---
 drivers/net/r8169/r8169_ethdev.c | 40 
 drivers/net/r8169/r8169_logs.h   | 53 
 2 files changed, 93 insertions(+)
 create mode 100644 drivers/net/r8169/r8169_logs.h

diff --git a/drivers/net/r8169/r8169_ethdev.c b/drivers/net/r8169/r8169_ethdev.c
index e5f8857304..09e12fb56d 100644
--- a/drivers/net/r8169/r8169_ethdev.c
+++ b/drivers/net/r8169/r8169_ethdev.c
@@ -177,3 +177,43 @@ RTE_PMD_REGISTER_PCI(net_r8169, rte_r8169_pmd);
 RTE_PMD_REGISTER_PCI_TABLE(net_r8169, pci_id_r8169_map);
 RTE_PMD_REGISTER_KMOD_DEP(net_r8169, "* igb_uio | uio_pci_generic | vfio-pci");
 
+int r8169_logtype_init;
+int r8169_logtype_driver;
+
+#ifdef RTE_LIBRTE_R8169_DEBUG_RX
+int r8169_logtype_rx;
+#endif
+#ifdef RTE_LIBRTE_R8169_DEBUG_TX
+int r8169_logtype_tx;
+#endif
+#ifdef RTE_LIBRTE_R8169_DEBUG_TX_FREE
+int r8169_logtype_tx_free;
+#endif
+
+RTE_INIT(r8169_init_log)
+{
+   r8169_logtype_init = rte_log_register("pmd.net.r8169.init");
+   if (r8169_logtype_init >= 0)
+   rte_log_set_level(r8169_logtype_init, RTE_LOG_NOTICE);
+   r8169_logtype_driver = rte_log_register("pmd.net.r8169.driver");
+   if (r8169_logtype_driver >= 0)
+   rte_log_set_level(r8169_logtype_driver, RTE_LOG_NOTICE);
+#ifdef RTE_LIBRTE_R8169_DEBUG_RX
+   r8169_logtype_rx = rte_log_register("pmd.net.r8169.rx");
+   if (r8169_logtype_rx >= 0)
+   rte_log_set_level(r8169_logtype_rx, RTE_LOG_DEBUG);
+#endif
+
+#ifdef RTE_LIBRTE_R8169_DEBUG_TX
+   r8169_logtype_tx = rte_log_register("pmd.net.r8169.tx");
+   if (r8169_logtype_tx >= 0)
+   rte_log_set_level(r8169_logtype_tx, RTE_LOG_DEBUG);
+#endif
+
+#ifdef RTE_LIBRTE_R8169_DEBUG_TX_FREE
+   r8169_logtype_tx_free = rte_log_register("pmd.net.r8169.tx_free");
+   if (r8169_logtype_tx_free >= 0)
+   rte_log_set_level(r8169_logtype_tx_free, RTE_LOG_DEBUG);
+#endif
+}
+
diff --git a/drivers/net/r8169/r8169_logs.h b/drivers/net/r8169/r8169_logs.h
new file mode 100644
index 00..6ce5b4b5ac
--- /dev/null
+++ b/drivers/net/r8169/r8169_logs.h
@@ -0,0 +1,53 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Realtek Corporation. All rights reserved
+ */
+
+#ifndef _R8169_LOGS_H_
+#define _R8169_LOGS_H_
+
+#include 
+
+extern int r8169_logtype_init;
+#define PMD_INIT_LOG(level, fmt, args...) \
+   rte_log(RTE_LOG_ ## level, r8169_logtype_init, \
+   "%s(): " fmt "\n", __func__, ##args)
+
+#define PMD_INIT_FUNC_TRACE() PMD_INIT_LOG(DEBUG, " >>")
+
+#ifdef RTE_LIBRTE_R8169_DEBUG_RX
+extern int r8169_logtype_rx;
+#define PMD_RX_LOG(level, fmt, args...)\
+   rte_log(RTE_LOG_ ## level, r8169_logtype_rx,\
+   "%s(): " fmt "\n", __func__, ## args)
+#else
+#define PMD_RX_LOG(level, fmt, args...) do { } while (0)
+#endif
+
+#ifdef RTE_LIBRTE_R8169_DEBUG_TX
+extern int r8169_logtype_tx;
+#define PMD_TX_LOG(level, fmt, args...)\
+   rte_log(RTE_LOG_ ## level, r8169_logtype_tx,\
+   "%s(): " fmt "\n", __func__, ## args)
+#else
+#define PMD_TX_LOG(level, fmt, args...) do { } while (0)
+#endif
+
+#ifdef RTE_LIBRTE_R8169_DEBUG_TX_FREE
+extern int r8169_logtype_tx_free;
+#define PMD_TX_FREE_LOG(level, fmt, args...)   \
+   rte_log(RTE_LOG_ ## level, r8169_logtype_tx_free,   \
+   "%s(): " fmt "\n", __func__, ## args)
+#else
+#define PMD_TX_FREE_LOG(level, fmt, args...) do { } while (0)
+#endif
+
+extern int r8169_logtype_driver;
+#define PMD_DRV_LOG_RAW(level, fmt, args...) \
+   rte_log(RTE_LOG_ ## level, r8169_logtype_driver, "%s(): " fmt, \
+   __func__, ## args)
+
+#define PMD_DRV_LOG(level, fmt, args...) \
+   PMD_DRV_LOG_RAW(level, fmt "\n", ## args)
+
+#endif /* _R8169_LOGS_H_ */
+
-- 
2.34.1
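
The logging macros in this patch rely on token pasting (RTE_LOG_ ## level)
and the GNU named-variadic form (args... with ##args). A minimal sketch of
the same macro shape follows, assuming a GCC or Clang compiler. Every demo_*
and DEMO_* name is hypothetical, and demo_log() is only a stand-in that
formats into a buffer; it is not the rte_log() implementation.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for rte_log(): keeps the last formatted line in a
 * buffer so it can be inspected. The real function writes to the log stream. */
static char demo_log_buf[256];

static int
demo_log(int level, int logtype, const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(fmt != NULL);
	(void)level;
	(void)logtype;
	va_start(ap, fmt);
	n = vsnprintf(demo_log_buf, sizeof(demo_log_buf), fmt, ap);
	va_end(ap);
	return n;
}

#define DEMO_LOG_ERR 4 /* hypothetical level value */
static int demo_logtype_init = 1; /* hypothetical log type id */

/* Same shape as PMD_INIT_LOG: the level token is pasted onto a prefix, the
 * function name is prepended, and "##args" drops the trailing comma when no
 * extra arguments are given. */
#define DEMO_INIT_LOG(level, fmt, args...) \
	demo_log(DEMO_LOG_ ## level, demo_logtype_init, \
		"%s(): " fmt "\n", __func__, ##args)

static void
demo_start(int err)
{
	DEMO_INIT_LOG(ERR, "Unable to initialize RX hardware, err=%d", err);
}

static void
demo_stop(void)
{
	DEMO_INIT_LOG(ERR, "stopped"); /* no variadic arguments */
}

/* Compare the last formatted line with an expected string. */
static int
demo_last_log_equals(const char *expected)
{
	return strcmp(demo_log_buf, expected) == 0;
}
```

Calling demo_start(5) leaves "demo_start(): Unable to initialize RX
hardware, err=5" plus a newline in the buffer, which shows why callers of
PMD_INIT_LOG do not add their own "\n".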



[PATCH v1 17/18] net/r8169: add driver_start and driver_stop

2024-10-14 Thread Howard Wang
rtl8125ap and rtl8125bp need driver start and stop whether
dash is enabled or not.

Signed-off-by: Howard Wang 
---
 drivers/net/r8169/base/rtl8126a_mcu.h |   1 +
 drivers/net/r8169/r8169_base.h|   6 +-
 drivers/net/r8169/r8169_dash.c| 149 +-
 drivers/net/r8169/r8169_dash.h|  25 -
 drivers/net/r8169/r8169_ethdev.c  |   4 +
 drivers/net/r8169/r8169_hw.c  |   5 +
 6 files changed, 184 insertions(+), 6 deletions(-)

diff --git a/drivers/net/r8169/base/rtl8126a_mcu.h 
b/drivers/net/r8169/base/rtl8126a_mcu.h
index ae4aa5f3d4..89e600d87c 100644
--- a/drivers/net/r8169/base/rtl8126a_mcu.h
+++ b/drivers/net/r8169/base/rtl8126a_mcu.h
@@ -12,5 +12,6 @@ void rtl_set_mac_mcu_8126a_3(struct rtl_hw *hw);
 void rtl_set_phy_mcu_8126a_1(struct rtl_hw *hw);
 void rtl_set_phy_mcu_8126a_2(struct rtl_hw *hw);
 void rtl_set_phy_mcu_8126a_3(struct rtl_hw *hw);
+
 #endif
 
diff --git a/drivers/net/r8169/r8169_base.h b/drivers/net/r8169/r8169_base.h
index 98c965ac23..6e05ab3f3c 100644
--- a/drivers/net/r8169/r8169_base.h
+++ b/drivers/net/r8169/r8169_base.h
@@ -271,7 +271,11 @@ enum RTL_registers {
Q_NUM_CTRL_8125= 0x4800,
RSS_KEY_8125   = 0x4600,
RSS_INDIRECTION_TBL_8125_V2 = 0x4700,
-   EEE_TXIDLE_TIMER_8125 = 0x6048,
+   EEE_TXIDLE_TIMER_8125   = 0x6048,
+   IB2SOC_SET = 0x0010,
+   IB2SOC_DATA= 0x0014,
+   IB2SOC_CMD = 0x0018,
+   IB2SOC_IMR = 0x001C,
 };
 
 enum RTL_register_content {
diff --git a/drivers/net/r8169/r8169_dash.c b/drivers/net/r8169/r8169_dash.c
index e803ce8305..0b2fd59de1 100644
--- a/drivers/net/r8169/r8169_dash.c
+++ b/drivers/net/r8169/r8169_dash.c
@@ -26,14 +26,14 @@ rtl_is_allow_access_dash_ocp(struct rtl_hw *hw)
 
allow_access = true;
switch (hw->mcfg) {
-   case CFG_METHOD_2:
-   case CFG_METHOD_3:
+   case CFG_METHOD_48:
+   case CFG_METHOD_49:
mac_ocp_data = rtl_mac_ocp_read(hw, 0xd460);
if (mac_ocp_data == 0x || !(mac_ocp_data & BIT_0))
allow_access = false;
break;
-   case CFG_METHOD_8:
-   case CFG_METHOD_9:
+   case CFG_METHOD_54:
+   case CFG_METHOD_55:
mac_ocp_data = rtl_mac_ocp_read(hw, 0xd4c0);
if (mac_ocp_data == 0x || (mac_ocp_data & BIT_3))
allow_access = false;
@@ -87,3 +87,144 @@ rtl_check_dash(struct rtl_hw *hw)
return 0;
 }
 
+static void
+rtl8125_dash2_disable_tx(struct rtl_hw *hw)
+{
+   u16 wait_cnt = 0;
+   u8 tmp_uchar;
+
+   if (!HW_DASH_SUPPORT_CMAC(hw))
+   return;
+
+   if (!hw->DASH)
+   return;
+
+   /* Disable oob Tx */
+   RTL_CMAC_W8(hw, CMAC_IBCR2, RTL_CMAC_R8(hw, CMAC_IBCR2) & ~BIT_0);
+
+   /* Wait oob Tx disable */
+   do {
+   tmp_uchar = RTL_CMAC_R8(hw, CMAC_IBISR0);
+   if (tmp_uchar & ISRIMR_DASH_TYPE2_TX_DISABLE_IDLE)
+   break;
+
+   udelay(50);
+   wait_cnt++;
+   } while (wait_cnt < 2000);
+
+   /* Clear ISRIMR_DASH_TYPE2_TX_DISABLE_IDLE */
+   RTL_CMAC_W8(hw, CMAC_IBISR0, RTL_CMAC_R8(hw, CMAC_IBISR0) |
+   ISRIMR_DASH_TYPE2_TX_DISABLE_IDLE);
+}
+
+static void
+rtl8125_dash2_disable_rx(struct rtl_hw *hw)
+{
+   if (!HW_DASH_SUPPORT_CMAC(hw))
+   return;
+
+   if (!hw->DASH)
+   return;
+
+   RTL_CMAC_W8(hw, CMAC_IBCR0, RTL_CMAC_R8(hw, CMAC_IBCR0) & ~BIT_0);
+}
+
+void
+rtl8125_dash2_disable_txrx(struct rtl_hw *hw)
+{
+   if (!HW_DASH_SUPPORT_CMAC(hw))
+   return;
+
+   rtl8125_dash2_disable_tx(hw);
+   rtl8125_dash2_disable_rx(hw);
+}
+
+static void
+rtl8125_notify_dash_oob_cmac(struct rtl_hw *hw, u32 cmd)
+{
+   u32 tmp_value;
+
+   if (!HW_DASH_SUPPORT_CMAC(hw))
+   return;
+
+   rtl_ocp_write(hw, 0x180, 4, cmd);
+   tmp_value = rtl_ocp_read(hw, 0x30, 4);
+   tmp_value |= BIT_0;
+   rtl_ocp_write(hw, 0x30, 4, tmp_value);
+}
+
+static void
+rtl8125_notify_dash_oob_ipc2(struct rtl_hw *hw, u32 cmd)
+{
+   if (FALSE == HW_DASH_SUPPORT_TYPE_4(hw))
+   return;
+
+   rtl_ocp_write(hw, IB2SOC_DATA, 4, cmd);
+   rtl_ocp_write(hw, IB2SOC_CMD, 4, 0x00);
+   rtl_ocp_write(hw, IB2SOC_SET, 4, 0x01);
+}
+
+static void
+rtl8125_notify_dash_oob(struct rtl_hw *hw, u32 cmd)
+{
+   switch (hw->HwSuppDashVer) {
+   case 2:
+   case 3:
+   rtl8125_notify_dash_oob_cmac(hw, cmd);
+   break;
+   case 4:
+   rtl8125_notify_dash_oob_ipc2(hw, cmd);
+   break;
+   default:
+   break;
+   }
+}
+
+static int
+rtl8125_wait_dash_fw_ready(struct rtl_hw *hw)
+{
+   int rc = -1;
+   int timeout;
+
+   if (!hw->DASH)
+   goto out;
+
+   for (timeout = 0; timeout < 10; timeout++) {
+  
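
The wait loop in rtl8125_dash2_disable_tx() above polls a status bit at most
2000 times with a 50 us delay, so roughly 100 ms in the worst case, and then
carries on whether or not the bit was seen. A minimal sketch of the same
bounded-poll pattern follows, with the outcome returned so a caller could
log a timeout. All demo_* names are hypothetical, and the fake register
replaces RTL_CMAC_R8() purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register-read callback; the driver reads via RTL_CMAC_R8(). */
typedef uint8_t (*demo_read8_t)(void *ctx);

/* Poll until (value & mask) is non-zero or max_polls reads have been made.
 * Returns true if the bit was seen, false on timeout. */
static bool
demo_wait_bit_set(demo_read8_t read8, void *ctx, uint8_t mask,
		  unsigned int max_polls, void (*delay)(void))
{
	unsigned int i;

	assert(read8 != NULL);
	for (i = 0; i < max_polls; i++) {
		if (read8(ctx) & mask)
			return true;
		if (delay != NULL)
			delay(); /* e.g. a 50 us delay in a real driver */
	}
	return false;
}

/* Fake register for the demonstration: reports the bit once it has been
 * read ready_after times. */
struct demo_reg {
	unsigned int reads;
	unsigned int ready_after;
};

static uint8_t
demo_fake_read8(void *ctx)
{
	struct demo_reg *reg = ctx;

	reg->reads++;
	return reg->reads >= reg->ready_after ? 0x01 : 0x00;
}
```

Passing the delay as a callback is only a convenience for testing the sketch
without sleeping; it is not something the patch does.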

[PATCH v1 18/18] doc/guides/nics: add documents for r8169 pmd

2024-10-14 Thread Howard Wang
Signed-off-by: Howard Wang 
---
 MAINTAINERS|  2 ++
 doc/guides/nics/features/r8169.ini | 32 ++
 doc/guides/nics/r8169.rst  | 17 
 3 files changed, 51 insertions(+)
 create mode 100644 doc/guides/nics/features/r8169.ini
 create mode 100644 doc/guides/nics/r8169.rst

diff --git a/MAINTAINERS b/MAINTAINERS
index 5f9eccc43f..6f56c966fd 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1082,6 +1082,8 @@ M: ChunHao Lin 
 M: Xing Wang 
 M: Realtek NIC SW 
 F: drivers/net/r8169
+F: doc/guides/nics/r8169.rst
+F: doc/guides/nics/features/r8169.ini
 
 
 Crypto Drivers
diff --git a/doc/guides/nics/features/r8169.ini 
b/doc/guides/nics/features/r8169.ini
new file mode 100644
index 00..8e4142f64e
--- /dev/null
+++ b/doc/guides/nics/features/r8169.ini
@@ -0,0 +1,32 @@
+;
+; Supported features of the 'r8169' network poll mode driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+Speed capabilities   = Y
+Link speed configuration = Y
+Link status  = Y
+Link status event= Y
+MTU update   = Y
+Scattered Rx = Y
+TSO  = Y
+Promiscuous mode = Y
+Allmulticast mode= Y
+Unicast MAC filter   = Y
+Multicast MAC filter = Y
+Flow control = Y
+CRC offload  = Y
+L3 checksum offload  = Y
+L4 checksum offload  = Y
+Packet type parsing  = Y
+Rx descriptor status = Y
+Tx descriptor status = Y
+Basic stats  = Y
+Extended stats   = Y
+Stats per queue  = Y
+FW version   = Y
+Registers dump   = Y
+Linux= Y
+x86-32   = Y
+x86-64   = Y
diff --git a/doc/guides/nics/r8169.rst b/doc/guides/nics/r8169.rst
new file mode 100644
index 00..149276cc91
--- /dev/null
+++ b/doc/guides/nics/r8169.rst
@@ -0,0 +1,17 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright(c) 2024 Realtek Corporation. All rights reserved
+
+R8169 Poll Mode Driver
+======================
+
+The R8169 PMD provides poll mode driver support for Realtek 2.5 and 5 Gigabit
+Ethernet NICs.
+
+Features
+--------
+
+Features of the R8169 PMD are:
+
+* Checksum offload
+* TCP segmentation offload
+* Jumbo frames supported
-- 
2.34.1



[PATCH v1 16/18] net/r8169: add support for getting fw version

2024-10-14 Thread Howard Wang
Signed-off-by: Howard Wang 
---
 drivers/net/r8169/r8169_ethdev.c | 20 
 1 file changed, 20 insertions(+)

diff --git a/drivers/net/r8169/r8169_ethdev.c b/drivers/net/r8169/r8169_ethdev.c
index 70c3661691..dd2c7dda24 100644
--- a/drivers/net/r8169/r8169_ethdev.c
+++ b/drivers/net/r8169/r8169_ethdev.c
@@ -48,6 +48,8 @@ static int rtl_promiscuous_disable(struct rte_eth_dev *dev);
 static int rtl_allmulticast_enable(struct rte_eth_dev *dev);
 static int rtl_allmulticast_disable(struct rte_eth_dev *dev);
 static int rtl_dev_mtu_set(struct rte_eth_dev *dev, uint16_t mtu);
+static int rtl_fw_version_get(struct rte_eth_dev *dev, char *fw_version,
+  size_t fw_size);
 
 /*
  * The set of PCI devices this driver supports
@@ -96,6 +98,8 @@ static const struct eth_dev_ops rtl_eth_dev_ops = {
 
.mtu_set  = rtl_dev_mtu_set,
 
+   .fw_version_get   = rtl_fw_version_get,
+
.rx_queue_setup   = rtl_rx_queue_setup,
.rx_queue_release = rtl_rx_queue_release,
.rxq_info_get = rtl_rxq_info_get,
@@ -639,6 +643,22 @@ rtl_dev_mtu_set(struct rte_eth_dev *dev, uint16_t mtu)
return 0;
 }
 
+static int
+rtl_fw_version_get(struct rte_eth_dev *dev, char *fw_version, size_t fw_size)
+{
+   struct rtl_adapter *adapter = RTL_DEV_PRIVATE(dev);
+   struct rtl_hw *hw = &adapter->hw;
+   int ret;
+
+   ret = snprintf(fw_version, fw_size, "0x%08x", hw->hw_ram_code_ver);
+
+   ret += 1; /* Add the size of '\0' */
+   if (fw_size < (u32)ret)
+   return ret;
+   else
+   return 0;
+}
+
 static int
 rtl_dev_init(struct rte_eth_dev *dev)
 {
-- 
2.34.1
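
The return convention used by rtl_fw_version_get() above (0 on success, or
the required buffer size including the terminating '\0' when the buffer is
too small) can be exercised outside the driver. A minimal sketch follows.
demo_fw_version_get() is a hypothetical stand-alone helper that takes the
version word directly instead of a struct rte_eth_dev, and the check for a
negative snprintf() result is an addition that the patch does not contain.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-alone variant of the fw_version_get callback.
 * Returns 0 on success, the needed size (including '\0') if fw_size is too
 * small, or -EINVAL if formatting fails. */
static int
demo_fw_version_get(uint32_t ram_code_ver, char *fw_version, size_t fw_size)
{
	int ret;

	/* snprintf() accepts a NULL buffer only when the size is 0. */
	assert(fw_version != NULL || fw_size == 0);

	ret = snprintf(fw_version, fw_size, "0x%08x",
		       (unsigned int)ram_code_ver);
	if (ret < 0)
		return -EINVAL; /* assumption: not part of the patch */

	ret += 1; /* add the size of '\0' */
	if (fw_size < (size_t)ret)
		return ret;

	return 0;
}
```

A caller can therefore probe with a zero-sized buffer to learn the size it
needs (11 bytes for this format) and then call again with a buffer of that
size.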



Re: [PATCH v6] devtools: add .clang-format file

2024-10-14 Thread Stephen Hemminger
On Mon, 14 Oct 2024 22:15:45 +
Abdullah Ömer Yamaç  wrote:

> clang-format is a tool to format C/C++/Objective-C code. It can be used
> to reformat code to match a given coding style, or to ensure that code
> adheres to a specific coding style. It helps to maintain a consistent
> coding style across the DPDK codebase.
> 
> The .clang-format file overrides the default style options provided by
> clang-format, and a large set of IDEs and text editors support it.
> 
> Signed-off-by: Abdullah Ömer Yamaç 


Looks good. As an experiment, I ran some files through clang-format to
check that clang-format and checkpatch do not disagree.

One disagreement that showed up: clang-format likes to leave open-ended lines,
ending a line right after the opening parenthesis of a call.


CHECK:OPEN_ENDED_LINE: Lines should not end with a '('
#407: FILE: drivers/net/tap/rte_eth_tap.c:593:
+   *l4_cksum = rte_ipv4_udptcp_cksum_mbuf(



Re: [PATCH v4 0/5] power: refactor power management library

2024-10-14 Thread Stephen Hemminger
On Tue, 15 Oct 2024 02:49:53 +
Sivaprasad Tummala  wrote:

> This patchset refactors the power management library, addressing both
> core and uncore power management. The primary changes involve the
> creation of dedicated directories for each driver within
> 'drivers/power/core/*' and 'drivers/power/uncore/*'.
>   
> This refactor significantly improves code organization, enhances
> clarity, and boosts maintainability. It lays the foundation for more
> focused development on individual drivers and facilitates seamless
> integration of future enhancements, particularly the AMD uncore driver.
>  
> Furthermore, this effort aims to streamline code maintenance by
> consolidating common functions for cpufreq and cppc across various
> core drivers, thus reducing code duplication.
> 
> Sivaprasad Tummala (5):
>   power: refactor core power management library
>   power: refactor uncore power management library
>   test/power: removed function pointer validations
>   power/amd_uncore: uncore support for AMD EPYC processors
>   maintainers: update for drivers/power
> 
>  MAINTAINERS   |   1 +
>  app/test/test_power.c |  95 -
>  app/test/test_power_cpufreq.c |  52 ---
>  app/test/test_power_kvm_vm.c  |  36 --
>  drivers/meson.build   |   1 +
>  .../power/acpi/acpi_cpufreq.c |  22 +-
>  .../power/acpi/acpi_cpufreq.h |   6 +-
>  drivers/power/acpi/meson.build|  10 +
>  .../power/amd_pstate/amd_pstate_cpufreq.c |  24 +-
>  .../power/amd_pstate/amd_pstate_cpufreq.h |   8 +-
>  drivers/power/amd_pstate/meson.build  |  10 +
>  drivers/power/amd_uncore/amd_uncore.c | 329 ++
>  drivers/power/amd_uncore/amd_uncore.h | 226 
>  drivers/power/amd_uncore/meson.build  |  20 ++
>  .../power/cppc/cppc_cpufreq.c |  22 +-
>  .../power/cppc/cppc_cpufreq.h |   8 +-
>  drivers/power/cppc/meson.build|  10 +
>  .../power/intel_uncore/intel_uncore.c |  18 +-
>  .../power/intel_uncore/intel_uncore.h |   8 +-
>  drivers/power/intel_uncore/meson.build|   6 +
>  .../power/kvm_vm}/guest_channel.c |   0
>  .../power/kvm_vm}/guest_channel.h |   0
>  .../power/kvm_vm/kvm_vm.c |  22 +-
>  .../power/kvm_vm/kvm_vm.h |   6 +-
>  drivers/power/kvm_vm/meson.build  |  16 +
>  drivers/power/meson.build |  14 +
>  drivers/power/pstate/meson.build  |  10 +
>  .../power/pstate/pstate_cpufreq.c |  22 +-
>  .../power/pstate/pstate_cpufreq.h |   6 +-
>  examples/l3fwd-power/main.c   |  12 +-
>  lib/power/meson.build |   9 +-
>  lib/power/power_common.c  |   2 +-
>  lib/power/power_common.h  |  16 +-
>  lib/power/rte_power.c | 287 +--
>  lib/power/rte_power.h | 139 +---
>  lib/power/rte_power_cpufreq_api.h | 208 +++
>  lib/power/rte_power_uncore.c  | 207 +--
>  lib/power/rte_power_uncore.h  |  87 +++--
>  lib/power/rte_power_uncore_ops.h  | 239 +
>  lib/power/version.map |  15 +
>  40 files changed, 1605 insertions(+), 624 deletions(-)
>  rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c 
> (95%)
>  rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h 
> (98%)
>  create mode 100644 drivers/power/acpi/meson.build
>  rename lib/power/power_amd_pstate_cpufreq.c => 
> drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
>  rename lib/power/power_amd_pstate_cpufreq.h => 
> drivers/power/amd_pstate/amd_pstate_cpufreq.h (97%)
>  create mode 100644 drivers/power/amd_pstate/meson.build
>  create mode 100644 drivers/power/amd_uncore/amd_uncore.c
>  create mode 100644 drivers/power/amd_uncore/amd_uncore.h
>  create mode 100644 drivers/power/amd_uncore/meson.build
>  rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c 
> (95%)
>  rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h 
> (97%)
>  create mode 100644 drivers/power/cppc/meson.build
>  rename lib/power/power_intel_uncore.c => 
> drivers/power/intel_uncore/intel_uncore.c (95%)
>  rename lib/power/power_intel_uncore.h => 
> drivers/power/intel_uncore/intel_uncore.h (97%)
>  create mode 100644 drivers/power/intel_uncore/meson.build
>  rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
>  rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
>  rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
>  rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
>  create mode 100644 drivers/power/k

[PATCH v5 0/9] net/zxdh: introduce net zxdh driver

2024-10-14 Thread Junlong Wang
V5:
  - split the driver into multiple patches; this series is the first part of
    the zxdh driver, and dev start/stop, queue_setup, mac, vlan, rss, etc.
    will be provided later.
  - fix errors reported by scripts.
  - move the product link in zxdh.rst.
  - fix the meson check to use RTE_ARCH_X86_64/RTE_ARCH_ARM64.
  - address Ferruh's other comments.

V4:
  - Resolve compilation issues

Junlong Wang (9):
  net/zxdh: add zxdh ethdev pmd driver
  net/zxdh: add logging implementation
  net/zxdh: add zxdh device pci init implementation
  net/zxdh: add msg chan and msg hwlock init
  net/zxdh: add msg chan enable implementation
  net/zxdh: add zxdh get device backend infos
  net/zxdh: add configure zxdh intr implementation
  net/zxdh: add zxdh dev infos get ops
  net/zxdh: add zxdh dev configure ops

 doc/guides/nics/features/zxdh.ini |9 +
 doc/guides/nics/index.rst |1 +
 doc/guides/nics/zxdh.rst  |   30 +
 drivers/net/meson.build   |1 +
 drivers/net/zxdh/meson.build  |   22 +
 drivers/net/zxdh/zxdh_common.c|  368 +++
 drivers/net/zxdh/zxdh_common.h|   42 ++
 drivers/net/zxdh/zxdh_ethdev.c| 1021 +
 drivers/net/zxdh/zxdh_ethdev.h|  105 +++
 drivers/net/zxdh/zxdh_logs.h  |   35 +
 drivers/net/zxdh/zxdh_msg.c   |  992 
 drivers/net/zxdh/zxdh_msg.h   |  228 +++
 drivers/net/zxdh/zxdh_pci.c   |  449 +
 drivers/net/zxdh/zxdh_pci.h   |  191 ++
 drivers/net/zxdh/zxdh_queue.c |  131 
 drivers/net/zxdh/zxdh_queue.h |  280 
 drivers/net/zxdh/zxdh_rxtx.h  |   55 ++
 17 files changed, 3960 insertions(+)
 create mode 100644 doc/guides/nics/features/zxdh.ini
 create mode 100644 doc/guides/nics/zxdh.rst
 create mode 100644 drivers/net/zxdh/meson.build
 create mode 100644 drivers/net/zxdh/zxdh_common.c
 create mode 100644 drivers/net/zxdh/zxdh_common.h
 create mode 100644 drivers/net/zxdh/zxdh_ethdev.c
 create mode 100644 drivers/net/zxdh/zxdh_ethdev.h
 create mode 100644 drivers/net/zxdh/zxdh_logs.h
 create mode 100644 drivers/net/zxdh/zxdh_msg.c
 create mode 100644 drivers/net/zxdh/zxdh_msg.h
 create mode 100644 drivers/net/zxdh/zxdh_pci.c
 create mode 100644 drivers/net/zxdh/zxdh_pci.h
 create mode 100644 drivers/net/zxdh/zxdh_queue.c
 create mode 100644 drivers/net/zxdh/zxdh_queue.h
 create mode 100644 drivers/net/zxdh/zxdh_rxtx.h

-- 
2.27.0

[PATCH v5 3/9] net/zxdh: add zxdh device pci init implementation

2024-10-14 Thread Junlong Wang
Add the device PCI init implementation,
which obtains the PCI capabilities and reads the device configuration.

Signed-off-by: Junlong Wang 
---
 drivers/net/zxdh/meson.build   |   5 +-
 drivers/net/zxdh/zxdh_ethdev.c |  43 +
 drivers/net/zxdh/zxdh_ethdev.h |  20 ++-
 drivers/net/zxdh/zxdh_pci.c| 290 +
 drivers/net/zxdh/zxdh_pci.h| 151 +
 drivers/net/zxdh/zxdh_queue.h  | 109 +
 drivers/net/zxdh/zxdh_rxtx.h   |  55 +++
 7 files changed, 669 insertions(+), 4 deletions(-)
 create mode 100644 drivers/net/zxdh/zxdh_pci.c
 create mode 100644 drivers/net/zxdh/zxdh_pci.h
 create mode 100644 drivers/net/zxdh/zxdh_queue.h
 create mode 100644 drivers/net/zxdh/zxdh_rxtx.h

diff --git a/drivers/net/zxdh/meson.build b/drivers/net/zxdh/meson.build
index 58e39c1f96..080c6c7725 100644
--- a/drivers/net/zxdh/meson.build
+++ b/drivers/net/zxdh/meson.build
@@ -14,5 +14,6 @@ if not dpdk_conf.has('RTE_ARCH_X86_64') or not 
dpdk_conf.get('RTE_ARCH_64')
 endif
 
 sources = files(
-   'zxdh_ethdev.c',
-   )
+'zxdh_ethdev.c',
+'zxdh_pci.c',
+)
diff --git a/drivers/net/zxdh/zxdh_ethdev.c b/drivers/net/zxdh/zxdh_ethdev.c
index 7220770c01..bb219c189f 100644
--- a/drivers/net/zxdh/zxdh_ethdev.c
+++ b/drivers/net/zxdh/zxdh_ethdev.c
@@ -8,6 +8,40 @@
 
 #include "zxdh_ethdev.h"
 #include "zxdh_logs.h"
+#include "zxdh_pci.h"
+
+struct zxdh_hw_internal zxdh_hw_internal[RTE_MAX_ETHPORTS];
+
+static int32_t zxdh_init_device(struct rte_eth_dev *eth_dev)
+{
+   struct zxdh_hw *hw = eth_dev->data->dev_private;
+   struct rte_pci_device *pci_dev = RTE_ETH_DEV_TO_PCI(eth_dev);
+   int ret = 0;
+
+   ret = zxdh_read_pci_caps(pci_dev, hw);
+   if (ret) {
+   PMD_INIT_LOG(ERR, "port 0x%x pci caps read failed .", 
hw->port_id);
+   goto err;
+   }
+
+   zxdh_hw_internal[hw->port_id].vtpci_ops = &zxdh_dev_pci_ops;
+   zxdh_vtpci_reset(hw);
+   zxdh_get_pci_dev_config(hw);
+
+   rte_ether_addr_copy((struct rte_ether_addr *)hw->mac_addr, 
&eth_dev->data->mac_addrs[0]);
+
+   /* If host does not support both status and MSI-X then disable LSC */
+   if (vtpci_with_feature(hw, ZXDH_NET_F_STATUS) && (hw->use_msix != 
ZXDH_MSIX_NONE))
+   eth_dev->data->dev_flags |= RTE_ETH_DEV_INTR_LSC;
+   else
+   eth_dev->data->dev_flags &= ~RTE_ETH_DEV_INTR_LSC;
+
+   return 0;
+
+err:
+   PMD_INIT_LOG(ERR, "port %d init device failed", eth_dev->data->port_id);
+   return ret;
+}
 
 static int zxdh_eth_dev_init(struct rte_eth_dev *eth_dev)
 {
@@ -45,6 +79,15 @@ static int zxdh_eth_dev_init(struct rte_eth_dev *eth_dev)
hw->is_pf = 1;
}
 
+   ret = zxdh_init_device(eth_dev);
+   if (ret < 0)
+   goto err_zxdh_init;
+
+   return ret;
+
+err_zxdh_init:
+   rte_free(eth_dev->data->mac_addrs);
+   eth_dev->data->mac_addrs = NULL;
return ret;
 }
 
diff --git a/drivers/net/zxdh/zxdh_ethdev.h b/drivers/net/zxdh/zxdh_ethdev.h
index 04023bfe84..18d9916713 100644
--- a/drivers/net/zxdh/zxdh_ethdev.h
+++ b/drivers/net/zxdh/zxdh_ethdev.h
@@ -9,6 +9,7 @@
 extern "C" {
 #endif
 
+#include 
 #include "ethdev_driver.h"
 
 /* ZXDH PCI vendor/device ID. */
@@ -23,16 +24,31 @@ extern "C" {
 #define ZXDH_MAX_MC_MAC_ADDRS  32
 #define ZXDH_MAX_MAC_ADDRS (ZXDH_MAX_UC_MAC_ADDRS + ZXDH_MAX_MC_MAC_ADDRS)
 
+#define ZXDH_RX_QUEUES_MAX  128U
+#define ZXDH_TX_QUEUES_MAX  128U
+
 #define ZXDH_NUM_BARS2
 
 struct zxdh_hw {
struct rte_eth_dev *eth_dev;
-   uint64_t bar_addr[ZXDH_NUM_BARS];
+   struct zxdh_pci_common_cfg *common_cfg;
+   struct zxdh_net_config *dev_cfg;
 
-   uint32_t  speed;
+   uint64_t bar_addr[ZXDH_NUM_BARS];
+   uint64_t host_features;
+   uint64_t guest_features;
+   uint32_t max_queue_pairs;
+   uint32_t speed;
+   uint32_t notify_off_multiplier;
+   uint16_t *notify_base;
+   uint16_t pcie_id;
uint16_t device_id;
uint16_t port_id;
 
+   uint8_t *isr;
+   uint8_t weak_barriers;
+   uint8_t use_msix;
+   uint8_t mac_addr[RTE_ETHER_ADDR_LEN];
uint8_t duplex;
uint8_t is_pf;
 };
diff --git a/drivers/net/zxdh/zxdh_pci.c b/drivers/net/zxdh/zxdh_pci.c
new file mode 100644
index 00..73ec640b84
--- /dev/null
+++ b/drivers/net/zxdh/zxdh_pci.c
@@ -0,0 +1,290 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2023 ZTE Corporation
+ */
+
+#include 
+#include 
+
+#ifdef RTE_EXEC_ENV_LINUX
+ #include 
+ #include 
+#endif
+
+#include 
+#include 
+#include 
+#include 
+
+#include "zxdh_ethdev.h"
+#include "zxdh_pci.h"
+#include "zxdh_logs.h"
+#include "zxdh_queue.h"
+
+#define ZXDH_PMD_DEFAULT_GUEST_FEATURES   \
+   (1ULL << ZXDH_NET_F_MRG_RXBUF | \
+1ULL << ZXDH_NET_F_STATUS| \
+1ULL << ZXDH_NET_F_MQ| \
+1ULL << ZXDH_F_ANY_LAYOUT| \
+1ULL << ZXDH_F_VERSION_1 | \
+   

[PATCH v5 5/9] net/zxdh: add msg chan enable implementation

2024-10-14 Thread Junlong Wang
Add the msg chan enable implementation, which supports
sending messages to query device information.

Signed-off-by: Junlong Wang 
---
 drivers/net/zxdh/zxdh_ethdev.c |   6 +
 drivers/net/zxdh/zxdh_ethdev.h |  12 +
 drivers/net/zxdh/zxdh_msg.c| 655 -
 drivers/net/zxdh/zxdh_msg.h| 127 +++
 4 files changed, 796 insertions(+), 4 deletions(-)

diff --git a/drivers/net/zxdh/zxdh_ethdev.c b/drivers/net/zxdh/zxdh_ethdev.c
index 66b57c4e59..d95ab4471a 100644
--- a/drivers/net/zxdh/zxdh_ethdev.c
+++ b/drivers/net/zxdh/zxdh_ethdev.c
@@ -97,6 +97,12 @@ static int zxdh_eth_dev_init(struct rte_eth_dev *eth_dev)
goto err_zxdh_init;
}
 
+   ret = zxdh_msg_chan_enable(eth_dev);
+   if (ret != 0) {
+   PMD_INIT_LOG(ERR, "zxdh_msg_chan_enable failed ret %d", ret);
+   goto err_zxdh_init;
+   }
+
return ret;
 
 err_zxdh_init:
diff --git a/drivers/net/zxdh/zxdh_ethdev.h b/drivers/net/zxdh/zxdh_ethdev.h
index 24eb3a5ca0..a51181f1ce 100644
--- a/drivers/net/zxdh/zxdh_ethdev.h
+++ b/drivers/net/zxdh/zxdh_ethdev.h
@@ -29,10 +29,22 @@ extern "C" {
 
 #define ZXDH_NUM_BARS    2
 
+union VPORT {
+   uint16_t vport;
+   struct {
+   uint16_t vfid:8;
+   uint16_t pfid:3;
+   uint16_t vf_flag:1;
+   uint16_t epid:3;
+   uint16_t direct_flag:1;
+   };
+};
+
 struct zxdh_hw {
struct rte_eth_dev *eth_dev;
struct zxdh_pci_common_cfg *common_cfg;
struct zxdh_net_config *dev_cfg;
+   union VPORT vport;
 
uint64_t bar_addr[ZXDH_NUM_BARS];
uint64_t host_features;
diff --git a/drivers/net/zxdh/zxdh_msg.c b/drivers/net/zxdh/zxdh_msg.c
index 4928711ad8..4e4930e5a1 100644
--- a/drivers/net/zxdh/zxdh_msg.c
+++ b/drivers/net/zxdh/zxdh_msg.c
@@ -35,10 +35,82 @@
 #define MAX_EP_NUM  (4)
 #define MAX_HARD_SPINLOCK_NUM  (511)
 
-#define BAR0_SPINLOCK_OFFSET   (0x4000)
-#define FW_SHRD_OFFSET (0x5000)
-#define FW_SHRD_INNER_HW_LABEL_PAT (0x800)
-#define HW_LABEL_OFFSET            (FW_SHRD_OFFSET + FW_SHRD_INNER_HW_LABEL_PAT)
+#define LOCK_PRIMARY_ID_MASK  (0x8000)
+/* bar offset */
+#define BAR0_CHAN_RISC_OFFSET       (0x2000)
+#define BAR0_CHAN_PFVF_OFFSET       (0x3000)
+#define BAR0_SPINLOCK_OFFSET        (0x4000)
+#define FW_SHRD_OFFSET              (0x5000)
+#define FW_SHRD_INNER_HW_LABEL_PAT  (0x800)
+#define HW_LABEL_OFFSET             (FW_SHRD_OFFSET + FW_SHRD_INNER_HW_LABEL_PAT)
+#define ZXDH_CTRLCH_OFFSET          (0x2000)
+#define CHAN_RISC_SPINLOCK_OFFSET   (BAR0_SPINLOCK_OFFSET - BAR0_CHAN_RISC_OFFSET)
+#define CHAN_PFVF_SPINLOCK_OFFSET   (BAR0_SPINLOCK_OFFSET - BAR0_CHAN_PFVF_OFFSET)
+#define CHAN_RISC_LABEL_OFFSET      (HW_LABEL_OFFSET - BAR0_CHAN_RISC_OFFSET)
+#define CHAN_PFVF_LABEL_OFFSET      (HW_LABEL_OFFSET - BAR0_CHAN_PFVF_OFFSET)
+
+#define REPS_HEADER_LEN_OFFSET  1
+#define REPS_HEADER_PAYLOAD_OFFSET  4
+#define REPS_HEADER_REPLYED 0xff
+
+#define BAR_MSG_CHAN_USABLE  0
+#define BAR_MSG_CHAN_USED    1
+
+#define BAR_MSG_POL_MASK    (0x10)
+#define BAR_MSG_POL_OFFSET  (4)
+
+#define BAR_ALIGN_WORD_MASK   0xfffc
+#define BAR_MSG_VALID_MASK    1
+#define BAR_MSG_VALID_OFFSET  0
+
+#define REPS_INFO_FLAG_USABLE  0x00
+#define REPS_INFO_FLAG_USED    0xa0
+
+#define BAR_PF_NUM             7
+#define BAR_VF_NUM             256
+#define BAR_INDEX_PF_TO_VF     0
+#define BAR_INDEX_MPF_TO_MPF   0xff
+#define BAR_INDEX_MPF_TO_PFVF  0
+#define BAR_INDEX_PFVF_TO_MPF  0
+
+#define MAX_HARD_SPINLOCK_ASK_TIMES  (1000)
+#define SPINLOCK_POLLING_SPAN_US     (100)
+
+#define BAR_MSG_SRC_NUM   3
+#define BAR_MSG_SRC_MPF   0
+#define BAR_MSG_SRC_PF    1
+#define BAR_MSG_SRC_VF    2
+#define BAR_MSG_SRC_ERR   0xff
+#define BAR_MSG_DST_NUM   3
+#define BAR_MSG_DST_RISC  0
+#define BAR_MSG_DST_MPF   2
+#define BAR_MSG_DST_PFVF  1
+#define BAR_MSG_DST_ERR   0xff
+
+#define LOCK_TYPE_HARD     (1)
+#define LOCK_TYPE_SOFT     (0)
+#define BAR_INDEX_TO_RISC  0
+
+#define BAR_SUBCHAN_INDEX_SEND  0
+#define BAR_SUBCHAN_INDEX_RECV  1
+
+uint8_t subchan_id_tbl[BAR_MSG_SRC_NUM][BAR_MSG_DST_NUM] = {
+   {BAR_SUBCHAN_INDEX_SEND, BAR_SUBCHAN_INDEX_SEND, BAR_SUBCHAN_INDEX_SEND},
+   {BAR_SUBCHAN_INDEX_SEND, BAR_SUBCHAN_INDEX_SEND, BAR_SUBCHAN_INDEX_RECV},
+   {BAR_SUBCHAN_INDEX_SEND, BAR_SUBCHAN_INDEX_RECV, BAR_SUBCHAN_INDEX_RECV}
+};
+
+uint8_t chan_id_tbl[BAR_MSG_SRC_NUM][BAR_MSG_DST_NUM] = {
+   {BAR_INDEX_TO_RISC, BAR_INDEX_MPF_TO_PFVF, BAR_INDEX_MPF_TO_MPF},
+   {BAR_INDEX_TO_RISC, BAR_INDEX_PF_TO_VF, BAR_INDEX_PFVF_TO_MPF},
+   {BAR_INDEX_TO_RISC, BAR_INDEX_PF_TO_VF, BAR_INDEX_PFVF_TO_MPF}
+};
+
+uint8_t lock_type_tbl[BAR_MSG_SRC_NUM][BAR_MSG_DST_NUM] = {
+   {LOCK_TYPE_HARD, LOCK_TYPE_HARD, LOCK_TYPE_HARD},
+   {LOCK_TYPE_SOFT, LOCK_TYPE_SOFT, LOCK_TYPE_HARD},
+   {LOCK_TYPE_HARD, LOCK_TYPE_HARD, LOCK_TYPE_HARD}
+};
 
 struc

[PATCH v5 4/9] net/zxdh: add msg chan and msg hwlock init

2024-10-14 Thread Junlong Wang
Add msg channel and hwlock init implementation.

Signed-off-by: Junlong Wang 
---
 drivers/net/zxdh/meson.build   |   1 +
 drivers/net/zxdh/zxdh_ethdev.c |  15 +++
 drivers/net/zxdh/zxdh_ethdev.h |   1 +
 drivers/net/zxdh/zxdh_msg.c| 161 +
 drivers/net/zxdh/zxdh_msg.h|  65 +
 5 files changed, 243 insertions(+)
 create mode 100644 drivers/net/zxdh/zxdh_msg.c
 create mode 100644 drivers/net/zxdh/zxdh_msg.h

diff --git a/drivers/net/zxdh/meson.build b/drivers/net/zxdh/meson.build
index 080c6c7725..9d0b5b9fd3 100644
--- a/drivers/net/zxdh/meson.build
+++ b/drivers/net/zxdh/meson.build
@@ -16,4 +16,5 @@ endif
 sources = files(
 'zxdh_ethdev.c',
 'zxdh_pci.c',
+'zxdh_msg.c',
 )
diff --git a/drivers/net/zxdh/zxdh_ethdev.c b/drivers/net/zxdh/zxdh_ethdev.c
index bb219c189f..66b57c4e59 100644
--- a/drivers/net/zxdh/zxdh_ethdev.c
+++ b/drivers/net/zxdh/zxdh_ethdev.c
@@ -9,6 +9,7 @@
 #include "zxdh_ethdev.h"
 #include "zxdh_logs.h"
 #include "zxdh_pci.h"
+#include "zxdh_msg.h"
 
 struct zxdh_hw_internal zxdh_hw_internal[RTE_MAX_ETHPORTS];
 
@@ -83,9 +84,23 @@ static int zxdh_eth_dev_init(struct rte_eth_dev *eth_dev)
if (ret < 0)
goto err_zxdh_init;
 
+   ret = zxdh_msg_chan_init();
+   if (ret < 0) {
+   PMD_INIT_LOG(ERR, "Failed to init bar msg chan");
+   goto err_zxdh_init;
+   }
+   hw->msg_chan_init = 1;
+
+   ret = zxdh_msg_chan_hwlock_init(eth_dev);
+   if (ret != 0) {
+   PMD_INIT_LOG(ERR, "zxdh_msg_chan_hwlock_init failed ret %d", ret);
+   goto err_zxdh_init;
+   }
+
return ret;
 
 err_zxdh_init:
+   zxdh_bar_msg_chan_exit();
rte_free(eth_dev->data->mac_addrs);
eth_dev->data->mac_addrs = NULL;
return ret;
diff --git a/drivers/net/zxdh/zxdh_ethdev.h b/drivers/net/zxdh/zxdh_ethdev.h
index 18d9916713..24eb3a5ca0 100644
--- a/drivers/net/zxdh/zxdh_ethdev.h
+++ b/drivers/net/zxdh/zxdh_ethdev.h
@@ -51,6 +51,7 @@ struct zxdh_hw {
uint8_t mac_addr[RTE_ETHER_ADDR_LEN];
uint8_t duplex;
uint8_t is_pf;
+   uint8_t msg_chan_init;
 };
 
 #ifdef __cplusplus
diff --git a/drivers/net/zxdh/zxdh_msg.c b/drivers/net/zxdh/zxdh_msg.c
new file mode 100644
index 00..4928711ad8
--- /dev/null
+++ b/drivers/net/zxdh/zxdh_msg.c
@@ -0,0 +1,161 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 ZTE Corporation
+ */
+
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "zxdh_ethdev.h"
+#include "zxdh_logs.h"
+#include "zxdh_msg.h"
+
+#define REPS_INFO_FLAG_USABLE  0x00
+#define BAR_SEQID_NUM_MAX  256
+
+#define ZXDH_BAR0_INDEX  0
+
+#define PCIEID_IS_PF_MASK   (0x0800)
+#define PCIEID_PF_IDX_MASK  (0x0700)
+#define PCIEID_VF_IDX_MASK  (0x00ff)
+#define PCIEID_EP_IDX_MASK  (0x7000)
+/* PCIEID bit field offset */
+#define PCIEID_PF_IDX_OFFSET  (8)
+#define PCIEID_EP_IDX_OFFSET  (12)
+
+#define MULTIPLY_BY_8(x)    ((x) << 3)
+#define MULTIPLY_BY_32(x)   ((x) << 5)
+#define MULTIPLY_BY_256(x)  ((x) << 8)
+
+#define MAX_EP_NUM             (4)
+#define MAX_HARD_SPINLOCK_NUM  (511)
+
+#define BAR0_SPINLOCK_OFFSET        (0x4000)
+#define FW_SHRD_OFFSET              (0x5000)
+#define FW_SHRD_INNER_HW_LABEL_PAT  (0x800)
+#define HW_LABEL_OFFSET             (FW_SHRD_OFFSET + FW_SHRD_INNER_HW_LABEL_PAT)
+
+struct dev_stat {
+   bool is_mpf_scanned;
+   bool is_res_init;
+   int16_t dev_cnt; /* probe cnt */
+};
+struct dev_stat g_dev_stat = {0};
+
+struct seqid_item {
+   void *reps_addr;
+   uint16_t id;
+   uint16_t buffer_len;
+   uint16_t flag;
+};
+
+struct seqid_ring {
+   uint16_t cur_id;
+   pthread_spinlock_t lock;
+   struct seqid_item reps_info_tbl[BAR_SEQID_NUM_MAX];
+};
+struct seqid_ring g_seqid_ring = {0};
+
+static uint16_t pcie_id_to_hard_lock(uint16_t src_pcieid, uint8_t dst)
+{
+   uint16_t lock_id = 0;
+   uint16_t pf_idx = (src_pcieid & PCIEID_PF_IDX_MASK) >> PCIEID_PF_IDX_OFFSET;
+   uint16_t ep_idx = (src_pcieid & PCIEID_EP_IDX_MASK) >> PCIEID_EP_IDX_OFFSET;
+
+   switch (dst) {
+   /* msg to risc */
+   case MSG_CHAN_END_RISC:
+   lock_id = MULTIPLY_BY_8(ep_idx) + pf_idx;
+   break;
+   /* msg to pf/vf */
+   case MSG_CHAN_END_VF:
+   case MSG_CHAN_END_PF:
+   lock_id = MULTIPLY_BY_8(ep_idx) + pf_idx + MULTIPLY_BY_8(1 + MAX_EP_NUM);
+   break;
+   default:
+   lock_id = 0;
+   break;
+   }
+   if (lock_id >= MAX_HARD_SPINLOCK_NUM)
+   lock_id = 0;
+
+   return lock_id;
+}
+
+static void label_write(uint64_t label_lock_addr, uint32_t lock_id, uint16_t value)
+{
+   *(volatile uint16_t *)(label_lock_addr + lock_id * 2) = value;
+}
+
+static void spinlock_write(uint64_t virt_lock_addr, uint32_t lock_id, uint8_t data)
+{
+ 

[PATCH v5 7/9] net/zxdh: add configure zxdh intr implementation

2024-10-14 Thread Junlong Wang
Configure zxdh interrupts, including the risc and dtb interrupts,
and add interrupt release.

Signed-off-by: Junlong Wang 
---
 drivers/net/zxdh/zxdh_ethdev.c | 302 -
 drivers/net/zxdh/zxdh_ethdev.h |   8 +
 drivers/net/zxdh/zxdh_msg.c| 187 
 drivers/net/zxdh/zxdh_msg.h|  11 ++
 drivers/net/zxdh/zxdh_pci.c|  62 +++
 drivers/net/zxdh/zxdh_pci.h|  12 ++
 6 files changed, 581 insertions(+), 1 deletion(-)

diff --git a/drivers/net/zxdh/zxdh_ethdev.c b/drivers/net/zxdh/zxdh_ethdev.c
index ee2e1c0d5d..4f6711c9af 100644
--- a/drivers/net/zxdh/zxdh_ethdev.c
+++ b/drivers/net/zxdh/zxdh_ethdev.c
@@ -25,6 +25,302 @@ uint16_t vport_to_vfid(union VPORT v)
return (v.epid * 8 + v.pfid) + 1152;
 }
 
+static void zxdh_queues_unbind_intr(struct rte_eth_dev *dev)
+{
+   struct zxdh_hw *hw = dev->data->dev_private;
+   int32_t i;
+
+   for (i = 0; i < dev->data->nb_rx_queues; ++i) {
+   VTPCI_OPS(hw)->set_queue_irq(hw, hw->vqs[i * 2], ZXDH_MSI_NO_VECTOR);
+   VTPCI_OPS(hw)->set_queue_irq(hw, hw->vqs[i * 2 + 1], ZXDH_MSI_NO_VECTOR);
+   }
+}
+
+
+static int32_t zxdh_intr_unmask(struct rte_eth_dev *dev)
+{
+   struct zxdh_hw *hw = dev->data->dev_private;
+
+   if (rte_intr_ack(dev->intr_handle) < 0)
+   return -1;
+
+   hw->use_msix = zxdh_vtpci_msix_detect(RTE_ETH_DEV_TO_PCI(dev));
+
+   return 0;
+}
+
+static void zxdh_devconf_intr_handler(void *param)
+{
+   struct rte_eth_dev *dev = param;
+
+   if (zxdh_intr_unmask(dev) < 0)
+   PMD_DRV_LOG(ERR, "interrupt enable failed");
+}
+
+
+/* Interrupt handler triggered by NIC for handling specific interrupt. */
+static void zxdh_fromriscv_intr_handler(void *param)
+{
+   struct rte_eth_dev *dev = param;
+   struct zxdh_hw *hw = dev->data->dev_private;
+   uint64_t virt_addr = 0;
+
+   virt_addr = (uint64_t)(hw->bar_addr[ZXDH_BAR0_INDEX] + ZXDH_CTRLCH_OFFSET);
+   if (hw->is_pf) {
+   PMD_INIT_LOG(DEBUG, "zxdh_risc2pf_intr_handler");
+   zxdh_bar_irq_recv(MSG_CHAN_END_RISC, MSG_CHAN_END_PF, virt_addr, dev);
+   } else {
+   PMD_INIT_LOG(DEBUG, "zxdh_riscvf_intr_handler");
+   zxdh_bar_irq_recv(MSG_CHAN_END_RISC, MSG_CHAN_END_VF, virt_addr, dev);
+   }
+}
+
+/* Interrupt handler triggered by NIC for handling specific interrupt. */
+static void zxdh_frompfvf_intr_handler(void *param)
+{
+   struct rte_eth_dev *dev = param;
+   struct zxdh_hw *hw = dev->data->dev_private;
+   uint64_t virt_addr = 0;
+
+   virt_addr = (uint64_t)(hw->bar_addr[ZXDH_BAR0_INDEX] + ZXDH_MSG_CHAN_PFVFSHARE_OFFSET);
+   if (hw->is_pf) {
+   PMD_INIT_LOG(DEBUG, "zxdh_vf2pf_intr_handler");
+   zxdh_bar_irq_recv(MSG_CHAN_END_VF, MSG_CHAN_END_PF, virt_addr, dev);
+   } else {
+   PMD_INIT_LOG(DEBUG, "zxdh_pf2vf_intr_handler");
+   zxdh_bar_irq_recv(MSG_CHAN_END_PF, MSG_CHAN_END_VF, virt_addr, dev);
+   }
+}
+
+static void zxdh_intr_cb_reg(struct rte_eth_dev *dev)
+{
+   struct zxdh_hw *hw = dev->data->dev_private;
+
+   if (dev->data->dev_flags & RTE_ETH_DEV_INTR_LSC)
+   rte_intr_callback_unregister(dev->intr_handle, zxdh_devconf_intr_handler, dev);
+
+   /* register callback to update dev config intr */
+   rte_intr_callback_register(dev->intr_handle, zxdh_devconf_intr_handler, dev);
+   /* Register the PF/VF and RISC-V message interrupt callbacks */
+   struct rte_intr_handle *tmp = hw->risc_intr +
+   (MSIX_FROM_PFVF - ZXDH_MSIX_INTR_MSG_VEC_BASE);
+
+   rte_intr_callback_register(tmp, zxdh_frompfvf_intr_handler, dev);
+
+   tmp = hw->risc_intr + (MSIX_FROM_RISCV - ZXDH_MSIX_INTR_MSG_VEC_BASE);
+   rte_intr_callback_register(tmp, zxdh_fromriscv_intr_handler, dev);
+}
+
+static void zxdh_intr_cb_unreg(struct rte_eth_dev *dev)
+{
+   if (dev->data->dev_flags & RTE_ETH_DEV_INTR_LSC)
+   rte_intr_callback_unregister(dev->intr_handle, zxdh_devconf_intr_handler, dev);
+
+   struct zxdh_hw *hw = dev->data->dev_private;
+
+   /* unregister callback to update dev config intr */
+   rte_intr_callback_unregister(dev->intr_handle, zxdh_devconf_intr_handler, dev);
+   /* Unregister the PF/VF and RISC-V message interrupt callbacks */
+   struct rte_intr_handle *tmp = hw->risc_intr +
+   (MSIX_FROM_PFVF - ZXDH_MSIX_INTR_MSG_VEC_BASE);
+
+   rte_intr_callback_unregister(tmp, zxdh_frompfvf_intr_handler, dev);
+   tmp = hw->risc_intr + (MSIX_FROM_RISCV - ZXDH_MSIX_INTR_MSG_VEC_BASE);
+   rte_intr_callback_unregister(tmp, zxdh_fromriscv_intr_handler, dev);
+}
+
+static int32_t zxdh_intr_disable(struct rte_eth_dev *dev)
+{
+   struct zxdh_hw *hw = dev->data->dev_private;
+
+   if (!hw->intr_enabled)
+   return 0;
+
+   zxdh_intr_cb_unreg(dev);
+   if (rte_intr_disable(dev->intr_handle) < 0)
+  

[PATCH v5 6/9] net/zxdh: add zxdh get device backend infos

2024-10-14 Thread Junlong Wang
Get zxdh device backend information
by sending messages over the msg chan.

Signed-off-by: Junlong Wang 
---
 drivers/net/zxdh/meson.build   |   1 +
 drivers/net/zxdh/zxdh_common.c | 249 +
 drivers/net/zxdh/zxdh_common.h |  30 
 drivers/net/zxdh/zxdh_ethdev.c |  35 +
 drivers/net/zxdh/zxdh_ethdev.h |   5 +
 drivers/net/zxdh/zxdh_msg.c|   3 -
 drivers/net/zxdh/zxdh_msg.h|  27 +++-
 7 files changed, 346 insertions(+), 4 deletions(-)
 create mode 100644 drivers/net/zxdh/zxdh_common.c
 create mode 100644 drivers/net/zxdh/zxdh_common.h

diff --git a/drivers/net/zxdh/meson.build b/drivers/net/zxdh/meson.build
index 9d0b5b9fd3..9aec47e68f 100644
--- a/drivers/net/zxdh/meson.build
+++ b/drivers/net/zxdh/meson.build
@@ -17,4 +17,5 @@ sources = files(
 'zxdh_ethdev.c',
 'zxdh_pci.c',
 'zxdh_msg.c',
+'zxdh_common.c',
 )
diff --git a/drivers/net/zxdh/zxdh_common.c b/drivers/net/zxdh/zxdh_common.c
new file mode 100644
index 00..140d0f2322
--- /dev/null
+++ b/drivers/net/zxdh/zxdh_common.c
@@ -0,0 +1,249 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 ZTE Corporation
+ */
+
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
+#include "zxdh_ethdev.h"
+#include "zxdh_logs.h"
+#include "zxdh_msg.h"
+#include "zxdh_common.h"
+
+#define ZXDH_MSG_RSP_SIZE_MAX  512
+
+#define ZXDH_COMMON_TABLE_READ   0
+#define ZXDH_COMMON_TABLE_WRITE  1
+
+#define ZXDH_COMMON_FIELD_PHYPORT  6
+
+#define RSC_TBL_CONTENT_LEN_MAX  (257 * 2)
+
+#define REPS_HEADER_PAYLOAD_OFFSET  4
+#define TBL_MSG_PRO_SUCCESS  0xaa
+
+struct zxdh_common_msg {
+   uint8_t  type;    /* 0:read table 1:write table */
+   uint8_t  field;
+   uint16_t pcie_id;
+   uint16_t slen;    /* Data length for write table */
+   uint16_t reserved;
+} __rte_packed;
+
+struct zxdh_common_rsp_hdr {
+   uint8_t  rsp_status;
+   uint16_t rsp_len;
+   uint8_t  reserved;
+   uint8_t  payload_status;
+   uint8_t  rsv;
+   uint16_t payload_len;
+} __rte_packed;
+
+struct tbl_msg_header {
+   uint8_t  type;  /* r/w */
+   uint8_t  field;
+   uint16_t pcieid;
+   uint16_t slen;
+   uint16_t rsv;
+};
+struct tbl_msg_reps_header {
+   uint8_t  check;
+   uint8_t  rsv;
+   uint16_t len;
+};
+
+static int32_t zxdh_fill_common_msg(struct zxdh_hw *hw,
+   struct zxdh_pci_bar_msg *desc,
+   uint8_t type,
+   uint8_t field,
+   void *buff,
+   uint16_t buff_size)
+{
+   uint64_t msg_len = sizeof(struct zxdh_common_msg) + buff_size;
+
+   desc->payload_addr = rte_zmalloc(NULL, msg_len, 0);
+   if (unlikely(desc->payload_addr == NULL)) {
+   PMD_DRV_LOG(ERR, "Failed to allocate msg_data");
+   return -ENOMEM;
+   }
+   memset(desc->payload_addr, 0, msg_len);
+   desc->payload_len = msg_len;
+   struct zxdh_common_msg *msg_data = (struct zxdh_common_msg *)desc->payload_addr;
+
+   msg_data->type = type;
+   msg_data->field = field;
+   msg_data->pcie_id = hw->pcie_id;
+   msg_data->slen = buff_size;
+   if (buff_size != 0)
+   rte_memcpy(msg_data + 1, buff, buff_size);
+
+   return 0;
+}
+
+static int32_t zxdh_send_command(struct zxdh_hw *hw,
+   struct zxdh_pci_bar_msg  *desc,
+   enum bar_module_id module_id,
+   struct zxdh_msg_recviver_mem *msg_rsp)
+{
+   desc->virt_addr = (uint64_t)(hw->bar_addr[ZXDH_BAR0_INDEX] + ZXDH_CTRLCH_OFFSET);
+   desc->src = hw->is_pf ? MSG_CHAN_END_PF : MSG_CHAN_END_VF;
+   desc->dst = MSG_CHAN_END_RISC;
+   desc->module_id = module_id;
+   desc->src_pcieid = hw->pcie_id;
+
+   msg_rsp->buffer_len  = ZXDH_MSG_RSP_SIZE_MAX;
+   msg_rsp->recv_buffer = rte_zmalloc(NULL, msg_rsp->buffer_len, 0);
+   if (unlikely(msg_rsp->recv_buffer == NULL)) {
+   PMD_DRV_LOG(ERR, "Failed to allocate messages response");
+   return -ENOMEM;
+   }
+
+   if (zxdh_bar_chan_sync_msg_send(desc, msg_rsp) != BAR_MSG_OK) {
+   PMD_DRV_LOG(ERR, "Failed to send sync messages or receive response");
+   rte_free(msg_rsp->recv_buffer);
+   return -1;
+   }
+
+   return 0;
+}
+
+static int32_t zxdh_common_rsp_check(struct zxdh_msg_recviver_mem *msg_rsp,
+   void *buff, uint16_t len)
+{
+   struct zxdh_common_rsp_hdr *rsp_hdr = (struct zxdh_common_rsp_hdr *)msg_rsp->recv_buffer;
+
+   if ((rsp_hdr->payload_status != 0xaa) || (rsp_hdr->payload_len != len)) {
+   PMD_DRV_LOG(ERR, "Common response is invalid, status:0x%x rsp_len:%d",
+   rsp_hdr->payload_status, rsp_hdr->payload_len);
+   return -1;
+   }
+   if (len != 0)
+   rte_memcpy(buff, rsp_hdr + 1, len);
+
+   return 0;
+}
+
+static int32_t zxdh_common_table_read(struct zxdh_hw *hw, uint8_t field,
+  

[PATCH v5 1/9] net/zxdh: add zxdh ethdev pmd driver

2024-10-14 Thread Junlong Wang
Add basic zxdh ethdev init and register the PCI probe functions.
Update the doc files.

Signed-off-by: Junlong Wang 
---
 doc/guides/nics/features/zxdh.ini |  9 +++
 doc/guides/nics/index.rst |  1 +
 doc/guides/nics/zxdh.rst  | 30 ++
 drivers/net/meson.build   |  1 +
 drivers/net/zxdh/meson.build  | 18 ++
 drivers/net/zxdh/zxdh_ethdev.c| 92 +++
 drivers/net/zxdh/zxdh_ethdev.h| 44 +++
 7 files changed, 195 insertions(+)
 create mode 100644 doc/guides/nics/features/zxdh.ini
 create mode 100644 doc/guides/nics/zxdh.rst
 create mode 100644 drivers/net/zxdh/meson.build
 create mode 100644 drivers/net/zxdh/zxdh_ethdev.c
 create mode 100644 drivers/net/zxdh/zxdh_ethdev.h

diff --git a/doc/guides/nics/features/zxdh.ini b/doc/guides/nics/features/zxdh.ini
new file mode 100644
index 00..05c8091ed7
--- /dev/null
+++ b/doc/guides/nics/features/zxdh.ini
@@ -0,0 +1,9 @@
+;
+; Supported features of the 'zxdh' network poll mode driver.
+;
+; Refer to default.ini for the full list of available PMD features.
+;
+[Features]
+Linux= Y
+x86-64   = Y
+ARMv8= Y
diff --git a/doc/guides/nics/index.rst b/doc/guides/nics/index.rst
index c14bc7988a..8e371ac4a5 100644
--- a/doc/guides/nics/index.rst
+++ b/doc/guides/nics/index.rst
@@ -69,3 +69,4 @@ Network Interface Controller Drivers
 vhost
 virtio
 vmxnet3
+zxdh
diff --git a/doc/guides/nics/zxdh.rst b/doc/guides/nics/zxdh.rst
new file mode 100644
index 00..4cf531e967
--- /dev/null
+++ b/doc/guides/nics/zxdh.rst
@@ -0,0 +1,30 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+Copyright(c) 2023 ZTE Corporation.
+
+ZXDH Poll Mode Driver
+=====================
+
+The ZXDH PMD (**librte_net_zxdh**) provides poll mode driver support
+for 25/100 Gbps ZXDH NX Series Ethernet Controller based on
+the ZTE Ethernet Controller E310/E312.
+
+- Learning about ZXDH NX Series Ethernet Controller NICs using
+  ``_.
+
+Features
+--------
+
+Features of the zxdh PMD are:
+
+- Multi arch support: x86_64, ARMv8.
+
+
+Driver compilation and testing
+------------------------------
+
+Refer to the document :ref:`compiling and testing a PMD for a NIC 
`
+for details.
+
+Limitations or Known issues
+---------------------------
+X86-32, Power8, ARMv7 and BSD are not supported yet.
diff --git a/drivers/net/meson.build b/drivers/net/meson.build
index fb6d34b782..0a12914534 100644
--- a/drivers/net/meson.build
+++ b/drivers/net/meson.build
@@ -62,6 +62,7 @@ drivers = [
 'vhost',
 'virtio',
 'vmxnet3',
+'zxdh',
 ]
 std_deps = ['ethdev', 'kvargs'] # 'ethdev' also pulls in mbuf, net, eal etc
 std_deps += ['bus_pci'] # very many PMDs depend on PCI, so make std
diff --git a/drivers/net/zxdh/meson.build b/drivers/net/zxdh/meson.build
new file mode 100644
index 00..58e39c1f96
--- /dev/null
+++ b/drivers/net/zxdh/meson.build
@@ -0,0 +1,18 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 ZTE Corporation
+
+if not is_linux
+build = false
+reason = 'only supported on Linux'
+subdir_done()
+endif
+
+if arch_subdir != 'x86' and arch_subdir != 'arm' or not dpdk_conf.get('RTE_ARCH_64')
+build = false
+reason = 'only supported on x86_64 and aarch64'
+subdir_done()
+endif
+
+sources = files(
+   'zxdh_ethdev.c',
+   )
diff --git a/drivers/net/zxdh/zxdh_ethdev.c b/drivers/net/zxdh/zxdh_ethdev.c
new file mode 100644
index 00..75d8b28cc3
--- /dev/null
+++ b/drivers/net/zxdh/zxdh_ethdev.c
@@ -0,0 +1,92 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 ZTE Corporation
+ */
+
+#include 
+#include 
+#include 
+
+#include "zxdh_ethdev.h"
+
+static int zxdh_eth_dev_init(struct rte_eth_dev *eth_dev)
+{
+   struct rte_pci_device *pci_dev = RTE_ETH_DEV_TO_PCI(eth_dev);
+   struct zxdh_hw *hw = eth_dev->data->dev_private;
+   int ret = 0;
+
+   eth_dev->dev_ops = NULL;
+
+   /* Allocate memory for storing MAC addresses */
+   eth_dev->data->mac_addrs = rte_zmalloc("zxdh_mac",
+   ZXDH_MAX_MAC_ADDRS * RTE_ETHER_ADDR_LEN, 0);
+   if (eth_dev->data->mac_addrs == NULL)
+   return -ENOMEM;
+
+   memset(hw, 0, sizeof(*hw));
+   hw->bar_addr[0] = (uint64_t)pci_dev->mem_resource[0].addr;
+   if (hw->bar_addr[0] == 0)
+   return -EIO;
+
+   hw->device_id = pci_dev->id.device_id;
+   hw->port_id = eth_dev->data->port_id;
+   hw->eth_dev = eth_dev;
+   hw->speed = RTE_ETH_SPEED_NUM_UNKNOWN;
+   hw->duplex = RTE_ETH_LINK_FULL_DUPLEX;
+   hw->is_pf = 0;
+
+   if (pci_dev->id.device_id == ZXDH_E310_PF_DEVICEID ||
+   pci_dev->id.device_id == ZXDH_E312_PF_DEVICEID) {
+   hw->is_pf = 1;
+   }
+
+   return ret;
+}
+
+static int zxdh_eth_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,

[PATCH v5 2/9] net/zxdh: add logging implementation

2024-10-14 Thread Junlong Wang
Add zxdh logging implementation.

Signed-off-by: Junlong Wang 
---
 drivers/net/zxdh/zxdh_ethdev.c | 15 +--
 drivers/net/zxdh/zxdh_logs.h   | 35 ++
 2 files changed, 48 insertions(+), 2 deletions(-)
 create mode 100644 drivers/net/zxdh/zxdh_logs.h

diff --git a/drivers/net/zxdh/zxdh_ethdev.c b/drivers/net/zxdh/zxdh_ethdev.c
index 75d8b28cc3..7220770c01 100644
--- a/drivers/net/zxdh/zxdh_ethdev.c
+++ b/drivers/net/zxdh/zxdh_ethdev.c
@@ -7,6 +7,7 @@
 #include 
 
 #include "zxdh_ethdev.h"
+#include "zxdh_logs.h"
 
 static int zxdh_eth_dev_init(struct rte_eth_dev *eth_dev)
 {
@@ -19,13 +20,18 @@ static int zxdh_eth_dev_init(struct rte_eth_dev *eth_dev)
/* Allocate memory for storing MAC addresses */
eth_dev->data->mac_addrs = rte_zmalloc("zxdh_mac",
ZXDH_MAX_MAC_ADDRS * RTE_ETHER_ADDR_LEN, 0);
-   if (eth_dev->data->mac_addrs == NULL)
+   if (eth_dev->data->mac_addrs == NULL) {
+   PMD_INIT_LOG(ERR, "Failed to allocate %d bytes to store MAC addresses",
+   ZXDH_MAX_MAC_ADDRS * RTE_ETHER_ADDR_LEN);
return -ENOMEM;
+   }
 
memset(hw, 0, sizeof(*hw));
hw->bar_addr[0] = (uint64_t)pci_dev->mem_resource[0].addr;
-   if (hw->bar_addr[0] == 0)
+   if (hw->bar_addr[0] == 0) {
+   PMD_INIT_LOG(ERR, "Bad mem resource.");
return -EIO;
+   }
 
hw->device_id = pci_dev->id.device_id;
hw->port_id = eth_dev->data->port_id;
@@ -90,3 +96,8 @@ static struct rte_pci_driver zxdh_pmd = {
 RTE_PMD_REGISTER_PCI(net_zxdh, zxdh_pmd);
 RTE_PMD_REGISTER_PCI_TABLE(net_zxdh, pci_id_zxdh_map);
 RTE_PMD_REGISTER_KMOD_DEP(net_zxdh, "* vfio-pci");
+RTE_LOG_REGISTER_SUFFIX(zxdh_logtype_init, init, NOTICE);
+RTE_LOG_REGISTER_SUFFIX(zxdh_logtype_driver, driver, NOTICE);
+RTE_LOG_REGISTER_SUFFIX(zxdh_logtype_rx, rx, NOTICE);
+RTE_LOG_REGISTER_SUFFIX(zxdh_logtype_tx, tx, NOTICE);
+RTE_LOG_REGISTER_SUFFIX(zxdh_logtype_msg, msg, NOTICE);
diff --git a/drivers/net/zxdh/zxdh_logs.h b/drivers/net/zxdh/zxdh_logs.h
new file mode 100644
index 00..e118d26379
--- /dev/null
+++ b/drivers/net/zxdh/zxdh_logs.h
@@ -0,0 +1,35 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 ZTE Corporation
+ */
+
+#ifndef _ZXDH_LOGS_H_
+#define _ZXDH_LOGS_H_
+
+#include 
+
+extern int zxdh_logtype_init;
+#define PMD_INIT_LOG(level, fmt, args...) \
+   rte_log(RTE_LOG_ ## level, zxdh_logtype_init, \
+   "offload_zxdh %s(): " fmt "\n", __func__, ## args)
+
+extern int zxdh_logtype_driver;
+#define PMD_DRV_LOG(level, fmt, args...) \
+   rte_log(RTE_LOG_ ## level, zxdh_logtype_driver, \
+   "offload_zxdh %s(): " fmt "\n", __func__, ## args)
+
+extern int zxdh_logtype_rx;
+#define PMD_RX_LOG(level, fmt, args...) \
+   rte_log(RTE_LOG_ ## level, zxdh_logtype_rx, \
+   "offload_zxdh %s(): " fmt "\n", __func__, ## args)
+
+extern int zxdh_logtype_tx;
+#define PMD_TX_LOG(level, fmt, args...) \
+   rte_log(RTE_LOG_ ## level, zxdh_logtype_tx, \
+   "offload_zxdh %s(): " fmt "\n", __func__, ## args)
+
+extern int zxdh_logtype_msg;
+#define PMD_MSG_LOG(level, fmt, args...) \
+   rte_log(RTE_LOG_ ## level, zxdh_logtype_msg, \
+   "offload_zxdh %s(): " fmt "\n", __func__, ## args)
+
+#endif /* _ZXDH_LOGS_H_ */
-- 
2.27.0

Re: [PATCH v2 1/6] eal: add static per-lcore memory allocation facility

2024-10-14 Thread Mattias Rönnblom

On 2024-10-14 09:56, Morten Brørup wrote:

From: Jerin Jacob [mailto:jerinjac...@gmail.com]
Sent: Wednesday, 18 September 2024 12.12

On Thu, Sep 12, 2024 at 8:52 PM Jerin Jacob 
wrote:


On Thu, Sep 12, 2024 at 7:11 PM Morten Brørup

 wrote:



From: Jerin Jacob [mailto:jerinjac...@gmail.com]
Sent: Thursday, 12 September 2024 15.17

On Thu, Sep 12, 2024 at 2:40 PM Morten Brørup



wrote:



+#define LCORE_BUFFER_SIZE (RTE_MAX_LCORE_VAR * RTE_MAX_LCORE)


Considering hugepages...

Lcore variables may be allocated before DPDK's memory allocator
(rte_malloc()) is ready, so rte_malloc() cannot be used for lcore
variables.

And lcore variables are not usable (shared) for DPDK multi-process, so
the lcore_buffer could be allocated through the O/S APIs as anonymous
hugepages, instead of using rte_malloc().

The alternative, using rte_malloc(), would disallow allocating lcore
variables before DPDK's memory allocator has been initialized, which I
think is too late.

I thought it is not. A lot of the subsystems are initialized after the
memory subsystem is initialized.
[1] example given in documentation. I thought RTE_INIT needs to be
replaced if the subsystem is called after memory is initialized (which
is the case for most of the libraries).


The list of RTE_INIT functions are called before main(). It is not
very useful.

Yes, it would be good to replace (or supplement) RTE_INIT_PRIO by
something similar, which calls the list of "INIT" functions at the
appropriate time during EAL initialization.

DPDK should then use this "INIT" list for all its initialization, so
the init function of new features (such as this, and trace) can be
inserted at the correct location in the list.

Trace library had a similar situation. It is managed like [2]

Yes, if we insist on using rte_malloc() for lcore variables, the
alternative is to prohibit establishing lcore variables in functions
called through RTE_INIT.


I was not insisting on using ONLY rte_malloc(), since rte_malloc() can
be called before rte_eal_init() (it will return NULL). The alloc
routine can first check whether rte_malloc() is available and, if not,
switch over to glibc.



@Mattias Rönnblom This comment is not addressed in v7. Could you check?


Mattias, following up on Jerin's suggestion:

When allocating an lcore variable, and the buffer holding lcore variables is 
out of space (or was never allocated), a new buffer is allocated.

Here's the twist I think Jerin is asking for:
You could check if rte_malloc() is available, and use that (instead of the 
heap) when allocating a new buffer holding lcore variables.
This check can be performed (aggressively) when allocating a new lcore 
variable, or (conservatively) only when allocating a new buffer.
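
To make the fallback idea concrete, here is a minimal, self-contained
sketch in plain C. It is not the lcore variable implementation and it
does not call any DPDK API: preferred_alloc is a hypothetical hook that
stands in for a hugepage-backed allocator such as rte_malloc(), and
early_alloc_stub() is an invented stub that models an allocator called
too early and returning NULL. The check is done the conservative way,
only when a new buffer is allocated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical allocator hook (an assumption for illustration only).
 * It stands in for a hugepage-backed allocator such as rte_malloc().
 * The hook may be unset, or may return NULL when the memory subsystem
 * is not ready yet. */
typedef void *(*lcore_buf_alloc_fn)(size_t size, size_t align);

static lcore_buf_alloc_fn preferred_alloc;

/* Invented stub: models an allocator that is called too early and fails. */
static void *
early_alloc_stub(size_t size, size_t align)
{
	(void)size;
	(void)align;
	return NULL;
}

/* Allocate a new buffer for lcore variables: try the preferred
 * allocator first, then fall back to the libc heap. */
static void *
lcore_buffer_alloc(size_t size, size_t align)
{
	void *buf = NULL;

	/* The alignment must be a non-zero power of two. */
	assert(align != 0 && (align & (align - 1)) == 0);

	if (preferred_alloc != NULL)
		buf = preferred_alloc(size, align);

	if (buf == NULL) {
		/* C11 aligned_alloc() expects size to be a multiple of align. */
		size_t rounded = (size + align - 1) & ~(align - 1);

		buf = aligned_alloc(align, rounded);
	}

	return buf;
}
```

With preferred_alloc left unset, or set to early_alloc_stub, a call such
as lcore_buffer_alloc(1000, 64) is served from the libc heap. A real
implementation that ever frees these buffers would also have to record
which allocator produced each one.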


Now, if using hugepages, the value of RTE_MAX_LCORE_VAR (the maximum size of 
one lcore variable instance) becomes more important.

Let's consider systems with 2 MB hugepages:

If it supports two lcores (RTE_MAX_LCORE is 2), the current RTE_MAX_LCORE_VAR 
default of 1 MB is a perfect match; it will use 2 MB of RAM as one 2 MB 
hugepage.

If it supports 128 lcores, the current RTE_MAX_LCORE_VAR default of 1 MB will 
use 128 MB of RAM.

If we scale it back, so it only uses one 2 MB hugepage, RTE_MAX_LCORE_VAR will 
have to be 2 MB / 128 lcores = 16 KB.
16 KB might be too small. E.g. a mempool cache uses 2 * 512 * sizeof(void *) = 
8 KB + a few bytes for the information about the cache. So I can easily point 
at one example where 16 KB is going very close to the edge.

So, as you already asked, what is a reasonable default minimum value of 
RTE_MAX_LCORE_VAR?

Maybe we should just stick with your initial suggestion (1 MB) and see how it 
goes.



Sure. Let's stick with 1 MB.

I'm guessing that if/when someone takes a closer look at how to do 
per-lcore *dynamic* allocations, this API and its implementation will be 
revisited as well.





At the recent DPDK Summit, we discussed memory consumption in one of the 
workshops.
One of the possible means for reducing memory consumption is making 
RTE_MAX_LCORE dynamic, so an application using only a few cores will scale its 
per-lcore tables to the actual number of lcores, instead of scaling to some 
hardcoded maximum.

With this in mind, I'm less worried about the RTE_MAX_LCORE multiplier.




An interesting hack would be to disable huge page usage, set up a swap
file in a zram device, and then MADV_PAGEOUT the DPDK process after
startup.

I wonder how much smaller the DPDK process RSS would be once it had
paged back in all the pages that were actually required.




[PATCH v5 8/9] net/zxdh: add zxdh dev infos get ops

2024-10-14 Thread Junlong Wang
Add support for the zxdh dev infos get op.

Signed-off-by: Junlong Wang 
---
 drivers/net/zxdh/zxdh_ethdev.c | 62 +-
 1 file changed, 61 insertions(+), 1 deletion(-)

diff --git a/drivers/net/zxdh/zxdh_ethdev.c b/drivers/net/zxdh/zxdh_ethdev.c
index 4f6711c9af..e0f2c1985b 100644
--- a/drivers/net/zxdh/zxdh_ethdev.c
+++ b/drivers/net/zxdh/zxdh_ethdev.c
@@ -12,6 +12,9 @@
 #include "zxdh_msg.h"
 #include "zxdh_common.h"
 
+#define ZXDH_MIN_RX_BUFSIZE 64
+#define ZXDH_MAX_RX_PKTLEN  14000U
+
 struct zxdh_hw_internal zxdh_hw_internal[RTE_MAX_ETHPORTS];
 
 uint16_t vport_to_vfid(union VPORT v)
@@ -25,6 +28,58 @@ uint16_t vport_to_vfid(union VPORT v)
return (v.epid * 8 + v.pfid) + 1152;
 }
 
+static uint32_t zxdh_dev_speed_capa_get(uint32_t speed)
+{
+   switch (speed) {
+   case RTE_ETH_SPEED_NUM_10G:  return RTE_ETH_LINK_SPEED_10G;
+   case RTE_ETH_SPEED_NUM_20G:  return RTE_ETH_LINK_SPEED_20G;
+   case RTE_ETH_SPEED_NUM_25G:  return RTE_ETH_LINK_SPEED_25G;
+   case RTE_ETH_SPEED_NUM_40G:  return RTE_ETH_LINK_SPEED_40G;
+   case RTE_ETH_SPEED_NUM_50G:  return RTE_ETH_LINK_SPEED_50G;
+   case RTE_ETH_SPEED_NUM_56G:  return RTE_ETH_LINK_SPEED_56G;
+   case RTE_ETH_SPEED_NUM_100G: return RTE_ETH_LINK_SPEED_100G;
+   case RTE_ETH_SPEED_NUM_200G: return RTE_ETH_LINK_SPEED_200G;
+   default: return 0;
+   }
+}
+
+static int32_t zxdh_dev_infos_get(struct rte_eth_dev *dev,
+   struct rte_eth_dev_info *dev_info)
+{
+   struct zxdh_hw *hw = dev->data->dev_private;
+
+   dev_info->speed_capa   = zxdh_dev_speed_capa_get(hw->speed);
+   dev_info->max_rx_queues= RTE_MIN(hw->max_queue_pairs, 
ZXDH_RX_QUEUES_MAX);
+   dev_info->max_tx_queues= RTE_MIN(hw->max_queue_pairs, 
ZXDH_TX_QUEUES_MAX);
+   dev_info->min_rx_bufsize   = ZXDH_MIN_RX_BUFSIZE;
+   dev_info->max_rx_pktlen= ZXDH_MAX_RX_PKTLEN;
+   dev_info->max_mac_addrs= ZXDH_MAX_MAC_ADDRS;
+   dev_info->rx_offload_capa  = (RTE_ETH_RX_OFFLOAD_VLAN_STRIP |
+   RTE_ETH_RX_OFFLOAD_VLAN_FILTER |
+   RTE_ETH_RX_OFFLOAD_QINQ_STRIP);
+   dev_info->rx_offload_capa |= (RTE_ETH_RX_OFFLOAD_IPV4_CKSUM |
+   RTE_ETH_RX_OFFLOAD_UDP_CKSUM |
+   RTE_ETH_RX_OFFLOAD_TCP_CKSUM |
+   RTE_ETH_RX_OFFLOAD_OUTER_IPV4_CKSUM);
+   dev_info->rx_offload_capa |= (RTE_ETH_RX_OFFLOAD_SCATTER);
+   dev_info->rx_offload_capa |=  RTE_ETH_RX_OFFLOAD_TCP_LRO;
+   dev_info->rx_offload_capa |=  RTE_ETH_RX_OFFLOAD_RSS_HASH;
+
+   dev_info->tx_offload_capa = (RTE_ETH_TX_OFFLOAD_MULTI_SEGS);
+   dev_info->tx_offload_capa |= (RTE_ETH_TX_OFFLOAD_TCP_TSO |
+   RTE_ETH_TX_OFFLOAD_UDP_TSO);
+   dev_info->tx_offload_capa |= (RTE_ETH_TX_OFFLOAD_VLAN_INSERT |
+   RTE_ETH_TX_OFFLOAD_QINQ_INSERT |
+   RTE_ETH_TX_OFFLOAD_VXLAN_TNL_TSO);
+   dev_info->tx_offload_capa |= (RTE_ETH_TX_OFFLOAD_IPV4_CKSUM |
+   RTE_ETH_TX_OFFLOAD_UDP_CKSUM |
+   RTE_ETH_TX_OFFLOAD_TCP_CKSUM |
+   RTE_ETH_TX_OFFLOAD_OUTER_IPV4_CKSUM |
+   RTE_ETH_TX_OFFLOAD_OUTER_UDP_CKSUM);
+
+   return 0;
+}
+
 static void zxdh_queues_unbind_intr(struct rte_eth_dev *dev)
 {
struct zxdh_hw *hw = dev->data->dev_private;
@@ -321,6 +376,11 @@ static int32_t zxdh_configure_intr(struct rte_eth_dev *dev)
return ret;
 }
 
+/* dev_ops for zxdh, bare necessities for basic operation */
+static const struct eth_dev_ops zxdh_eth_dev_ops = {
+   .dev_infos_get   = zxdh_dev_infos_get,
+};
+
 static int32_t zxdh_init_device(struct rte_eth_dev *eth_dev)
 {
struct zxdh_hw *hw = eth_dev->data->dev_private;
@@ -377,7 +437,7 @@ static int zxdh_eth_dev_init(struct rte_eth_dev *eth_dev)
struct zxdh_hw *hw = eth_dev->data->dev_private;
int ret = 0;
 
-   eth_dev->dev_ops = NULL;
+   eth_dev->dev_ops = &zxdh_eth_dev_ops;
 
/* Allocate memory for storing MAC addresses */
eth_dev->data->mac_addrs = rte_zmalloc("zxdh_mac",
-- 
2.27.0

Re: [PATCH v11 1/7] eal: add static per-lcore memory allocation facility

2024-10-14 Thread Mattias Rönnblom

On 2024-10-14 10:17, Morten Brørup wrote:

From: Mattias Rönnblom [mailto:mattias.ronnb...@ericsson.com]
Sent: Monday, 14 October 2024 09.44




+struct lcore_var_buffer {
+   char data[RTE_MAX_LCORE_VAR * RTE_MAX_LCORE];
+   struct lcore_var_buffer *prev;
+};


In relation to Jerin's request for using hugepages when available, the "data" 
field should be a pointer to the memory allocated from either the heap or through 
rte_malloc. You would also need to add a flag to indicate which it is, so the correct 
deallocation function can be used to free it on cleanup.



The typing (glibc heap or DPDK heap) could be in the buffers themselves, no?
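
A minimal sketch of what "typing in the buffers themselves" could look like, assuming a per-buffer origin tag. The enum and the helper names are hypothetical, and the DPDK heap branch is only a placeholder, since this sketch does not link against DPDK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical tag recording which allocator produced the data area. */
enum buffer_origin {
	BUFFER_ORIGIN_LIBC, /* allocated with aligned_alloc() */
	BUFFER_ORIGIN_DPDK  /* would be allocated from the DPDK heap */
};

/* Hypothetical variant of the buffer header: data is a pointer, not an array. */
struct lcore_var_buffer {
	void *data;
	enum buffer_origin origin;
	struct lcore_var_buffer *prev;
};

/* Allocate a buffer from the libc heap; size must be a multiple of align. */
static struct lcore_var_buffer *
buffer_create(size_t size, size_t align, struct lcore_var_buffer *prev)
{
	struct lcore_var_buffer *buf = malloc(sizeof(*buf));

	if (buf == NULL)
		return NULL;

	buf->data = aligned_alloc(align, size);
	if (buf->data == NULL) {
		free(buf);
		return NULL;
	}
	buf->origin = BUFFER_ORIGIN_LIBC;
	buf->prev = prev;
	return buf;
}

/* Free the whole chain, picking the deallocator per buffer.
 * Returns the number of buffers released.
 */
static unsigned int
buffer_destroy_all(struct lcore_var_buffer *buf)
{
	unsigned int count = 0;

	while (buf != NULL) {
		struct lcore_var_buffer *prev = buf->prev;

		switch (buf->origin) {
		case BUFFER_ORIGIN_LIBC:
			free(buf->data);
			break;
		case BUFFER_ORIGIN_DPDK:
			/* placeholder: a real implementation would use the
			 * DPDK heap's free function here
			 */
			break;
		}
		free(buf);
		buf = prev;
		count++;
	}
	return count;
}
```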



Here's another (nice to have) idea, which does not need to be part of this 
series, but can be implemented in a separate patch:
If you move "offset" into this structure, new lcore variables can be allocated 
from any buffer, instead of only the most recently allocated buffer.
There might even be gains by picking the "optimal" buffer to allocate different 
size variables from.
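
The per-buffer offset idea can be sketched as a first-fit search, under the assumption that each buffer tracks its own fill level. All names and sizes below are hypothetical and only model the bookkeeping, not the actual memory.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_LCORE_VAR 1024 /* hypothetical per-lcore buffer size */
#define MODEL_MAX_BUFFERS 4      /* hypothetical cap, to keep the sketch static */

/* Each buffer carries its own offset (next free byte). */
struct model_buffer {
	size_t offset;
};

static struct model_buffer model_buffers[MODEL_MAX_BUFFERS];
static unsigned int model_n_buffers;

/* Round v up to the next multiple of align (align must be a power of two). */
static size_t
model_align_up(size_t v, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (v + align - 1) & ~(align - 1);
}

/* First-fit allocation: try every existing buffer before adding a new one.
 * Returns the buffer index and stores the offset within it, or -1 on failure.
 */
static int
model_var_alloc(size_t size, size_t align, size_t *offset)
{
	unsigned int i;

	if (size > MODEL_MAX_LCORE_VAR)
		return -1;

	for (i = 0; i < model_n_buffers; i++) {
		size_t start = model_align_up(model_buffers[i].offset, align);

		if (start + size <= MODEL_MAX_LCORE_VAR) {
			model_buffers[i].offset = start + size;
			*offset = start;
			return (int)i;
		}
	}

	if (model_n_buffers == MODEL_MAX_BUFFERS)
		return -1;

	model_buffers[model_n_buffers].offset = size;
	*offset = 0;
	return (int)model_n_buffers++;
}
```

With a single global offset, a small variable allocated after a large one has spilled into a second buffer can never use the tail of the first buffer. Here it can.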




If the max lcore variable size is much greater than the actual variable 
sizes, the amount of fragmentation (i.e., the space at the end) will be 
very small.


I don't think we should use huge pages for this facility, since they 
don't support demand paging.


The day we have a DPDK heap which supports lcore-affinitized allocations, 
then potentially eal_common_lcore_var.c could use that, provided it's 
available (and there is a proper way to check [or get notified] if it is 
available or not).



+
+static struct lcore_var_buffer *current_buffer;
+
+/* initialized to trigger buffer allocation on first allocation */
+static size_t offset = RTE_MAX_LCORE_VAR;




+void *
+rte_lcore_var_alloc(size_t size, size_t align)
+{
+   /* Having the per-lcore buffer size aligned on cache lines,
+* as well as having the base pointer aligned on cache line
+* size, assures that aligned offsets also translate to aligned
+* pointers across all values.
+*/
+   RTE_BUILD_BUG_ON(RTE_MAX_LCORE_VAR % RTE_CACHE_LINE_SIZE != 0);
+   RTE_ASSERT(align <= RTE_CACHE_LINE_SIZE);
+   RTE_ASSERT(size <= RTE_MAX_LCORE_VAR);


This is very slow path, please RTE_VERIFY instead of RTE_ASSERT in this 
function.



Sure. (I think I rejected that before, but now I don't agree with my old 
self.)
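
For readers not familiar with the distinction: RTE_ASSERT is only active when DPDK is built with RTE_ENABLE_ASSERT, while RTE_VERIFY is always checked. A stand-alone sketch of that behaviour, using hypothetical MY_VERIFY/MY_ASSERT macros rather than the real DPDK ones:

```c
#include <assert.h> /* standard assert(), compiled out by NDEBUG in a similar way */
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Counts how many checks were actually evaluated at run time. */
static unsigned int checks_evaluated;

/* Always-on check: evaluated in every build. */
#define MY_VERIFY(exp) do {					\
	checks_evaluated++;					\
	if (!(exp)) {						\
		fprintf(stderr, "verify failed: %s\n", #exp);	\
		abort();					\
	}							\
} while (0)

/* Debug-only check: expands to nothing unless MY_ENABLE_ASSERT is defined. */
#ifdef MY_ENABLE_ASSERT
#define MY_ASSERT(exp) MY_VERIFY(exp)
#else
#define MY_ASSERT(exp) do { } while (0)
#endif

/* Slow-path style validation: the size limit is checked in all builds. */
static size_t
checked_size(size_t size, size_t max)
{
	MY_ASSERT(size > 0);    /* skipped in a default build */
	MY_VERIFY(size <= max); /* always evaluated */
	return size;
}
```

Since allocation is slow path, the cost of the always-on variant is negligible, which is the argument made above.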





+/**
+ * Get pointer to lcore variable instance with the specified lcore id.
+ *
+ * @param lcore_id
+ *   The lcore id specifying which of the @c RTE_MAX_LCORE value
+ *   instances should be accessed. The lcore id need not be valid
+ *   (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer
+ *   is also not valid (and thus should not be dereferenced).
+ * @param handle
+ *   The lcore variable handle.
+ */
+#define RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle)\
+   ((typeof(handle))rte_lcore_var_lcore_ptr(lcore_id, handle))


Please remove the _VALUE suffix.



You changed your mind? I'm missing the rationale here.
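
For context on what the accessor macro quoted above does, here is a stand-alone model. It assumes that a handle is the address of lcore 0's instance and that the instances for other lcores sit at a fixed stride from it; the quoted excerpt does not show rte_lcore_var_lcore_ptr(), so that layout is an assumption, and all MODEL_/model_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_MAX_LCORE 4       /* hypothetical, stands in for RTE_MAX_LCORE */
#define MODEL_MAX_LCORE_VAR 256 /* hypothetical, stands in for RTE_MAX_LCORE_VAR */
#define MODEL_CACHE_LINE 64

static void *model_base;

/* Allocate one cache-line-aligned area holding every lcore's values. */
static int
model_init(void)
{
	size_t size = (size_t)MODEL_MAX_LCORE * MODEL_MAX_LCORE_VAR;

	model_base = aligned_alloc(MODEL_CACHE_LINE, size);
	if (model_base == NULL)
		return -1;
	memset(model_base, 0, size);
	return 0;
}

/* A handle is simply the address of lcore 0's instance. */
static void *
model_handle_at(size_t offset)
{
	assert(offset < MODEL_MAX_LCORE_VAR);
	return (char *)model_base + offset;
}

/* Instance for lcore_id lies lcore_id strides after lcore 0's instance. */
static void *
model_lcore_ptr(unsigned int lcore_id, void *handle)
{
	return (char *)handle + (size_t)lcore_id * MODEL_MAX_LCORE_VAR;
}

/* Typed accessor, mirroring the shape of the macro under review. */
#define MODEL_LCORE_VAR_LCORE(lcore_id, handle) \
	((__typeof__(handle))model_lcore_ptr(lcore_id, handle))
```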


+
+/**
+ * Get pointer to lcore variable instance of the current thread.
+ *
+ * May only be used by EAL threads and registered non-EAL threads.
+ */
+#define RTE_LCORE_VAR_VALUE(handle) \
+   RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle)


Please remove the _VALUE suffix.


+
+/**
+ * Iterate over each lcore id's value for an lcore variable.
+ *
+ * @param lcore_id
+ *   An unsigned int variable successively set to the
+ *   lcore id of every valid lcore id (up to @c RTE_MAX_LCORE).
+ * @param value
+ *   A pointer variable successively set to point to lcore variable
+ *   value instance of the current lcore id being processed.
+ * @param handle
+ *   The lcore variable handle.
+ */
+#define RTE_LCORE_VAR_FOREACH_VALUE(lcore_id, value, handle)


Please remove the _VALUE suffix.


\
+   for ((lcore_id) =   \
+(((value) = RTE_LCORE_VAR_LCORE_VALUE(0, handle)), 0); \
+(lcore_id) < RTE_MAX_LCORE; \
+(lcore_id)++, (value) = RTE_LCORE_VAR_LCORE_VALUE(lcore_id, \






Re: [PATCH v11 7/7] eal: keep per-lcore power intrinsics state in lcore variable

2024-10-14 Thread Mattias Rönnblom

On 2024-10-14 18:30, Stephen Hemminger wrote:

On Mon, 14 Oct 2024 09:43:48 +0200
Mattias Rönnblom  wrote:


Keep per-lcore power intrinsics state in an lcore variable to reduce
cache working set size and avoid any CPU next-line-prefetching causing
false sharing.

Signed-off-by: Mattias Rönnblom 
Acked-by: Morten Brørup 
Acked-by: Konstantin Ananyev 
Acked-by: Chengwen Feng 
Acked-by: Stephen Hemminger 


This looks like a problem.

---BEGIN LOGS

 [Begin job log] "ubuntu-22.04-clang-asan+doc+tests" at step Build and test

+ configure_coredump
+ which gdb
+ ulimit -c unlimited
+ sudo sysctl -w kernel.core_pattern=/tmp/dpdk-core.%e.%p
kernel.core_pattern = /tmp/dpdk-core.%e.%p
+ devtools/test-null.sh
=
==67776==ERROR: AddressSanitizer: invalid alignment requested in aligned_alloc: 
64, alignment must be a power of two and the requested size 0x808 must be a 
multiple of alignment (thread T0)
 #0 0x5562b2504042 in aligned_alloc 
(/home/runner/work/dpdk/dpdk/build/app/dpdk-testpmd+0xaad042) (BuildId: 
731d8ec8ca4a6bf8e01bfd7548ebeb784aece6e3)
 #1 0x5562b37f671b in lcore_var_alloc 
/home/runner/work/dpdk/dpdk/build/../lib/eal/common/eal_common_lcore_var.c:77:20
 #2 0x5562b37f671b in rte_lcore_var_alloc 
/home/runner/work/dpdk/dpdk/build/../lib/eal/common/eal_common_lcore_var.c:123:9
 #3 0x5562b341b902 in rte_power_ethdev_pmgmt_init 
/home/runner/work/dpdk/dpdk/build/../lib/power/rte_power_pmd_mgmt.c:775:2
 #4 0x7f76b7829eba in call_init csu/../csu/libc-start.c:145:3
 #5 0x7f76b7829eba in __libc_start_main csu/../csu/libc-start.c:379:5

==67776==HINT: if you don't care about these errors you may set 
allocator_may_return_null=1
SUMMARY: AddressSanitizer: invalid-aligned-alloc-alignment 
(/home/runner/work/dpdk/dpdk/build/app/dpdk-testpmd+0xaad042) (BuildId: 
731d8ec8ca4a6bf8e01bfd7548ebeb784aece6e3) in aligned_alloc
==67776==ABORTING


Yes. Thanks.
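
For reference, C11 aligned_alloc() expects the size to be a multiple of the alignment, and 0x808 is not a multiple of 64. One possible way to satisfy the requirement is to round the size up before the call. This is only a sketch with hypothetical helper names; the fix that ends up in the series may look different.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Round size up to the next multiple of align (align must be a power of two). */
static size_t
round_up_pow2(size_t size, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (size + align - 1) & ~(align - 1);
}

/* aligned_alloc() wrapper that pads the request to a multiple of align. */
static void *
aligned_alloc_padded(size_t align, size_t size)
{
	return aligned_alloc(align, round_up_pow2(size, align));
}
```

With the values from the log, round_up_pow2(0x808, 64) gives 0x840, which ASan accepts.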




Re: [PATCH v6 3/3] node: add xstats for ip4 nodes

2024-10-14 Thread Robin Jarry

, Oct 14, 2024 at 18:10:

From: Pavan Nikhilesh 

Add xstat counters for ip4 LPM lookup failures in
ip4_lookup node.
Add reassembly failure xstat counter for ip4 reassembly
node.

Signed-off-by: Pavan Nikhilesh 
Acked-by: Kiran Kumar K 
---


Reviewed-by: Robin Jarry 



[DPDK/cryptodev Bug 1565] Lots of warnings from Clang Asan build in Openssl PMD

2024-10-14 Thread bugzilla
https://bugs.dpdk.org/show_bug.cgi?id=1565

Bug ID: 1565
   Summary: Lots of warnings from Clang Asan build in Openssl PMD
   Product: DPDK
   Version: 24.11
  Hardware: All
OS: All
Status: UNCONFIRMED
  Severity: critical
  Priority: Normal
 Component: cryptodev
  Assignee: dev@dpdk.org
  Reporter: step...@networkplumber.org
  Target Milestone: ---

This was a build of the latest main branch with clang and Address Sanitizer, which
reports lots of warnings in the OpenSSL PMD. Given that this is a security-related
driver, I am setting the severity to high.


$ meson setup -Db_sanitize=address -Db_lundef=false build
...

$ clang --version
Debian clang version 16.0.6 (27+b1)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin

ninja: Entering directory `build'
[1535/3031] Compiling C object
drivers/libtmp_rte_crypto_openssl.a.p/crypto_openssl_rte_openssl_pmd.c.o
In file included from /usr/lib/gcc/x86_64-linux-gnu/14/include/immintrin.h:43,
 from ../lib/eal/x86/include/rte_rtm.h:8,
 from ../lib/eal/x86/include/rte_spinlock.h:9,
 from ../lib/mempool/rte_mempool.h:44,
 from ../lib/mbuf/rte_mbuf.h:38,
 from ../lib/cryptodev/rte_crypto.h:15,
 from ../lib/cryptodev/rte_cryptodev.h:19,
 from ../drivers/crypto/openssl/rte_openssl_pmd.c:7:
In function ‘_mm256_loadu_si256’,
inlined from ‘rte_mov32’ at ../lib/eal/x86/include/rte_memcpy.h:127:9,
inlined from ‘rte_memcpy_generic’ at
../lib/eal/x86/include/rte_memcpy.h:453:3,
inlined from ‘rte_memcpy’ at ../lib/eal/x86/include/rte_memcpy.h:757:10,
inlined from ‘openssl_set_session_auth_parameters’ at
../drivers/crypto/openssl/rte_openssl_pmd.c:695:3,
inlined from ‘openssl_set_session_parameters’ at
../drivers/crypto/openssl/rte_openssl_pmd.c:867:9:
/usr/lib/gcc/x86_64-linux-gnu/14/include/avxintrin.h:929:10: warning: array
subscript ‘__m256i_u[0]’ is partly outside array bounds of ‘const char[12]’
[-Warray-bounds=]
  929 |   return *__P;
  |  ^~~~
In function ‘_mm256_storeu_si256’,
inlined from ‘rte_mov32’ at ../lib/eal/x86/include/rte_memcpy.h:128:2,
inlined from ‘rte_memcpy_generic’ at
../lib/eal/x86/include/rte_memcpy.h:453:3,
inlined from ‘rte_memcpy’ at ../lib/eal/x86/include/rte_memcpy.h:757:10,
inlined from ‘openssl_set_session_auth_parameters’ at
../drivers/crypto/openssl/rte_openssl_pmd.c:695:3,
inlined from ‘openssl_set_session_parameters’ at
../drivers/crypto/openssl/rte_openssl_pmd.c:867:9:
/usr/lib/gcc/x86_64-linux-gnu/14/include/avxintrin.h:935:8: warning: array
subscript ‘__m256i_u[0]’ is partly outside array bounds of ‘char[16]’
[-Warray-bounds=]
  935 |   *__P = __A;
  |   ~^
../drivers/crypto/openssl/rte_openssl_pmd.c: In function
‘openssl_set_session_parameters’:
../drivers/crypto/openssl/rte_openssl_pmd.c:633:14: note: object ‘algo_name’ of
size 16
  633 | char algo_name[MAX_OSSL_ALGO_NAME_SIZE];
  |  ^
In function ‘_mm256_loadu_si256’,
inlined from ‘rte_mov32’ at ../lib/eal/x86/include/rte_memcpy.h:127:9,
inlined from ‘rte_mov64’ at ../lib/eal/x86/include/rte_memcpy.h:148:2,
inlined from ‘rte_mov128’ at ../lib/eal/x86/include/rte_memcpy.h:160:2,
inlined from ‘rte_memcpy_generic’ at
../lib/eal/x86/include/rte_memcpy.h:422:4,
inlined from ‘rte_memcpy’ at ../lib/eal/x86/include/rte_memcpy.h:757:10,
inlined from ‘openssl_set_session_auth_parameters’ at
../drivers/crypto/openssl/rte_openssl_pmd.c:695:3,
inlined from ‘openssl_set_session_parameters’ at
../drivers/crypto/openssl/rte_openssl_pmd.c:867:9:
/usr/lib/gcc/x86_64-linux-gnu/14/include/avxintrin.h:929:10: warning: array
subscript ‘__m256i_u[0]’ is partly outside array bounds of ‘const char[12]’
[-Warray-bounds=]
  929 |   return *__P;
  |  ^~~~
In function ‘_mm256_storeu_si256’,
inlined from ‘rte_mov32’ at ../lib/eal/x86/include/rte_memcpy.h:128:2,
inlined from ‘rte_mov64’ at ../lib/eal/x86/include/rte_memcpy.h:148:2,
inlined from ‘rte_mov128’ at ../lib/eal/x86/include/rte_memcpy.h:160:2,
inlined from ‘rte_memcpy_generic’ at
../lib/eal/x86/include/rte_memcpy.h:422:4,
inlined from ‘rte_memcpy’ at ../lib/eal/x86/include/rte_memcpy.h:757:10,
inlined from ‘openssl_set_session_auth_parameters’ at
../drivers/crypto/openssl/rte_openssl_pmd.c:695:3,
inlined from ‘openssl_set_session_parameters’ at
../drivers/crypto/openssl/rte_openssl_pmd.c:867:9:
/usr/lib/gcc/x86_64-linux-gnu/14/include/avxintrin.h:935:8: warning: array
subscript ‘__m256i_u[0]’ is partly outside array bounds of ‘char[16]’
[-Warray-bounds=]
  935 |   *__P = __A;
  |   ~^
../drivers/crypto/openssl/rte_openssl_pmd.c: In function
‘openssl_set_session_parameters’:
../drivers/crypto/openssl/rte_openssl_pmd.c:633:14: note: object ‘algo_name’ of
size 16
  633 | ch

[PATCH v4 1/5] power: refactor core power management library

2024-10-14 Thread Sivaprasad Tummala
This patch introduces a comprehensive refactor to the core power
management library. The primary focus is on improving modularity
and organization by relocating specific driver implementations
from the 'lib/power' directory to dedicated directories within
'drivers/power/core/*'. The adjustment of meson.build files
enables the selective activation of individual drivers.

These changes contribute to a significant enhancement in code
organization, providing a clearer structure for driver implementations.
The refactor aims to improve overall code clarity and boost
maintainability. Additionally, it establishes a foundation for
future development, allowing for more focused work on individual
drivers and seamless integration of forthcoming enhancements.

v4:
 - fixed build error with RTE_ASSERT

v3:
 - renamed rte_power_core_ops.h as rte_power_cpufreq_api.h
 - re-worked on auto detection logic

v2:
 - added NULL check for global_core_ops in rte_power_get_core_ops

Signed-off-by: Sivaprasad Tummala 
---
 drivers/meson.build   |   1 +
 .../power/acpi/acpi_cpufreq.c |  22 +-
 .../power/acpi/acpi_cpufreq.h |   6 +-
 drivers/power/acpi/meson.build|  10 +
 .../power/amd_pstate/amd_pstate_cpufreq.c |  24 +-
 .../power/amd_pstate/amd_pstate_cpufreq.h |   8 +-
 drivers/power/amd_pstate/meson.build  |  10 +
 .../power/cppc/cppc_cpufreq.c |  22 +-
 .../power/cppc/cppc_cpufreq.h |   8 +-
 drivers/power/cppc/meson.build|  10 +
 .../power/kvm_vm}/guest_channel.c |   0
 .../power/kvm_vm}/guest_channel.h |   0
 .../power/kvm_vm/kvm_vm.c |  22 +-
 .../power/kvm_vm/kvm_vm.h |   6 +-
 drivers/power/kvm_vm/meson.build  |  16 +
 drivers/power/meson.build |  12 +
 drivers/power/pstate/meson.build  |  10 +
 .../power/pstate/pstate_cpufreq.c |  22 +-
 .../power/pstate/pstate_cpufreq.h |   6 +-
 lib/power/meson.build |   7 +-
 lib/power/power_common.c  |   2 +-
 lib/power/power_common.h  |  16 +-
 lib/power/rte_power.c | 287 ++
 lib/power/rte_power.h | 139 ++---
 lib/power/rte_power_cpufreq_api.h | 208 +
 lib/power/version.map |  14 +
 26 files changed, 619 insertions(+), 269 deletions(-)
 rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c 
(95%)
 rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h 
(98%)
 create mode 100644 drivers/power/acpi/meson.build
 rename lib/power/power_amd_pstate_cpufreq.c => 
drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
 rename lib/power/power_amd_pstate_cpufreq.h => 
drivers/power/amd_pstate/amd_pstate_cpufreq.h (97%)
 create mode 100644 drivers/power/amd_pstate/meson.build
 rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c 
(95%)
 rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h 
(97%)
 create mode 100644 drivers/power/cppc/meson.build
 rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
 rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
 rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
 rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
 create mode 100644 drivers/power/kvm_vm/meson.build
 create mode 100644 drivers/power/meson.build
 create mode 100644 drivers/power/pstate/meson.build
 rename lib/power/power_pstate_cpufreq.c => 
drivers/power/pstate/pstate_cpufreq.c (96%)
 rename lib/power/power_pstate_cpufreq.h => 
drivers/power/pstate/pstate_cpufreq.h (98%)
 create mode 100644 lib/power/rte_power_cpufreq_api.h

diff --git a/drivers/meson.build b/drivers/meson.build
index 66931d4241..9d77e0deab 100644
--- a/drivers/meson.build
+++ b/drivers/meson.build
@@ -29,6 +29,7 @@ subdirs = [
 'event',  # depends on common, bus, mempool and net.
 'baseband',   # depends on common and bus.
 'gpu',# depends on common and bus.
+'power',  # depends on common (in future).
 ]
 
 if meson.is_cross_build()
diff --git a/lib/power/power_acpi_cpufreq.c b/drivers/power/acpi/acpi_cpufreq.c
similarity index 95%
rename from lib/power/power_acpi_cpufreq.c
rename to drivers/power/acpi/acpi_cpufreq.c
index abad53bef1..c3fd10f287 100644
--- a/lib/power/power_acpi_cpufreq.c
+++ b/drivers/power/acpi/acpi_cpufreq.c
@@ -10,7 +10,7 @@
 #include 
 #include 
 
-#include "power_acpi_cpufreq.h"
+#include "acpi_cpufreq.h"
 #include "power_common.h"
 
 #define STR_SIZE 1024
@@ -583,3 +583,23 @@ int power_acpi_get_capabilities(unsigned int lcore_id,
 
return 0;
 }
+
+static struct rte_power_core_ops acpi_ops = {
+   .name = "acpi",
+   .init = power_a

[PATCH v4 2/5] power: refactor uncore power management library

2024-10-14 Thread Sivaprasad Tummala
This patch refactors the power management library, addressing uncore
power management. The primary changes involve the creation of dedicated
directories for each driver within 'drivers/power/uncore/*'. The
adjustment of meson.build files enables the selective activation
of individual drivers.

This refactor significantly improves code organization, enhances
clarity and boosts maintainability. It lays the foundation for more
focused development on individual drivers and facilitates seamless
integration of future enhancements, particularly the AMD uncore driver.
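
To illustrate the driver model described above, here is a stand-alone sketch of an ops table that registers itself at load time and is later looked up by name. It assumes a constructor-based registration (a GCC/Clang extension); the struct layout, the REGISTER_UNCORE_OPS macro, and the lookup function are hypothetical and are not the definitions from rte_power_uncore_ops.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced ops table: a name plus one callback. */
struct uncore_ops {
	const char *name;
	int (*init)(unsigned int pkg, unsigned int die);
	struct uncore_ops *next;
};

/* Head of the list of registered drivers. */
static struct uncore_ops *ops_list;

/* Add a driver's ops table to the list. */
static void
register_uncore_ops(struct uncore_ops *ops)
{
	ops->next = ops_list;
	ops_list = ops;
}

/* Emit a constructor that registers the given ops table before main().
 * Invoke without a trailing semicolon, since it expands to a function body.
 */
#define REGISTER_UNCORE_OPS(ops)				\
	__attribute__((constructor)) static void		\
	register_##ops(void)					\
	{							\
		register_uncore_ops(&ops);			\
	}

/* Look up a registered driver by name; returns NULL if none matches. */
static struct uncore_ops *
find_uncore_ops(const char *name)
{
	struct uncore_ops *ops;

	for (ops = ops_list; ops != NULL; ops = ops->next)
		if (strcmp(ops->name, name) == 0)
			return ops;
	return NULL;
}

/* A dummy driver, standing in for a file under drivers/power/. */
static int
demo_uncore_init(unsigned int pkg, unsigned int die)
{
	(void)pkg;
	(void)die;
	return 0;
}

static struct uncore_ops demo_uncore_ops = {
	.name = "demo-uncore",
	.init = demo_uncore_init,
};

REGISTER_UNCORE_OPS(demo_uncore_ops)
```

With this shape, the library core never references a specific driver. Leaving a driver out of the build simply means its constructor never runs, which is what makes selective activation through meson.build possible.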

v4:
 - fixed build error with RTE_ASSERT

v3:
 - fixed typo in header file inclusion

Signed-off-by: Sivaprasad Tummala 
---
 .../power/intel_uncore/intel_uncore.c |  18 +-
 .../power/intel_uncore/intel_uncore.h |   8 +-
 drivers/power/intel_uncore/meson.build|   6 +
 drivers/power/meson.build |   3 +-
 lib/power/meson.build |   2 +-
 lib/power/rte_power_uncore.c  | 207 ++-
 lib/power/rte_power_uncore.h  |  87 ---
 lib/power/rte_power_uncore_ops.h  | 239 ++
 lib/power/version.map |   1 +
 9 files changed, 406 insertions(+), 165 deletions(-)
 rename lib/power/power_intel_uncore.c => 
drivers/power/intel_uncore/intel_uncore.c (95%)
 rename lib/power/power_intel_uncore.h => 
drivers/power/intel_uncore/intel_uncore.h (97%)
 create mode 100644 drivers/power/intel_uncore/meson.build
 create mode 100644 lib/power/rte_power_uncore_ops.h

diff --git a/lib/power/power_intel_uncore.c 
b/drivers/power/intel_uncore/intel_uncore.c
similarity index 95%
rename from lib/power/power_intel_uncore.c
rename to drivers/power/intel_uncore/intel_uncore.c
index 4eb9c5900a..804ad5d755 100644
--- a/lib/power/power_intel_uncore.c
+++ b/drivers/power/intel_uncore/intel_uncore.c
@@ -8,7 +8,7 @@
 
 #include 
 
-#include "power_intel_uncore.h"
+#include "intel_uncore.h"
 #include "power_common.h"
 
 #define MAX_NUMA_DIE 8
@@ -475,3 +475,19 @@ power_intel_uncore_get_num_dies(unsigned int pkg)
 
return count;
 }
+
+static struct rte_power_uncore_ops intel_uncore_ops = {
+   .name = "intel-uncore",
+   .init = power_intel_uncore_init,
+   .exit = power_intel_uncore_exit,
+   .get_avail_freqs = power_intel_uncore_freqs,
+   .get_num_pkgs = power_intel_uncore_get_num_pkgs,
+   .get_num_dies = power_intel_uncore_get_num_dies,
+   .get_num_freqs = power_intel_uncore_get_num_freqs,
+   .get_freq = power_get_intel_uncore_freq,
+   .set_freq = power_set_intel_uncore_freq,
+   .freq_max = power_intel_uncore_freq_max,
+   .freq_min = power_intel_uncore_freq_min,
+};
+
+RTE_POWER_REGISTER_UNCORE_OPS(intel_uncore_ops);
diff --git a/lib/power/power_intel_uncore.h 
b/drivers/power/intel_uncore/intel_uncore.h
similarity index 97%
rename from lib/power/power_intel_uncore.h
rename to drivers/power/intel_uncore/intel_uncore.h
index 20a3ba8ebe..f2ce2f0c66 100644
--- a/lib/power/power_intel_uncore.h
+++ b/drivers/power/intel_uncore/intel_uncore.h
@@ -2,8 +2,8 @@
  * Copyright(c) 2022 Intel Corporation
  */
 
-#ifndef POWER_INTEL_UNCORE_H
-#define POWER_INTEL_UNCORE_H
+#ifndef INTEL_UNCORE_H
+#define INTEL_UNCORE_H
 
 /**
  * @file
@@ -11,7 +11,7 @@
  */
 
 #include "rte_power.h"
-#include "rte_power_uncore.h"
+#include "rte_power_uncore_ops.h"
 
 #ifdef __cplusplus
 extern "C" {
@@ -223,4 +223,4 @@ power_intel_uncore_get_num_dies(unsigned int pkg);
 }
 #endif
 
-#endif /* POWER_INTEL_UNCORE_H */
+#endif /* INTEL_UNCORE_H */
diff --git a/drivers/power/intel_uncore/meson.build 
b/drivers/power/intel_uncore/meson.build
new file mode 100644
index 00..876df8ad14
--- /dev/null
+++ b/drivers/power/intel_uncore/meson.build
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2017 Intel Corporation
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+sources = files('intel_uncore.c')
+deps += ['power']
diff --git a/drivers/power/meson.build b/drivers/power/meson.build
index 8c7215c639..c83047af94 100644
--- a/drivers/power/meson.build
+++ b/drivers/power/meson.build
@@ -6,7 +6,8 @@ drivers = [
 'amd_pstate',
 'cppc',
 'kvm_vm',
-'pstate'
+'pstate',
+'intel_uncore'
 ]
 
 std_deps = ['power']
diff --git a/lib/power/meson.build b/lib/power/meson.build
index d6b86ea19c..63616e60fd 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -13,7 +13,6 @@ if not is_linux
 endif
 sources = files(
 'power_common.c',
-'power_intel_uncore.c',
 'rte_power.c',
 'rte_power_uncore.c',
 'rte_power_pmd_mgmt.c',
@@ -24,6 +23,7 @@ headers = files(
 'rte_power_guest_channel.h',
 'rte_power_pmd_mgmt.h',
 'rte_power_uncore.h',
+'rte_power_uncore_ops.h',
 )
 if cc.has_argument('-Wno-cast-qual')
 cflags += '-Wno-cast-qual'
diff --git a/lib/power/rte_power_uncore

[PATCH v4 0/5] power: refactor power management library

2024-10-14 Thread Sivaprasad Tummala
This patchset refactors the power management library, addressing both
core and uncore power management. The primary changes involve the
creation of dedicated directories for each driver within
'drivers/power/core/*' and 'drivers/power/uncore/*'.
  
This refactor significantly improves code organization, enhances
clarity, and boosts maintainability. It lays the foundation for more
focused development on individual drivers and facilitates seamless
integration of future enhancements, particularly the AMD uncore driver.
 
Furthermore, this effort aims to streamline code maintenance by
consolidating common functions for cpufreq and cppc across various
core drivers, thus reducing code duplication.

Sivaprasad Tummala (5):
  power: refactor core power management library
  power: refactor uncore power management library
  test/power: removed function pointer validations
  power/amd_uncore: uncore support for AMD EPYC processors
  maintainers: update for drivers/power

 MAINTAINERS   |   1 +
 app/test/test_power.c |  95 -
 app/test/test_power_cpufreq.c |  52 ---
 app/test/test_power_kvm_vm.c  |  36 --
 drivers/meson.build   |   1 +
 .../power/acpi/acpi_cpufreq.c |  22 +-
 .../power/acpi/acpi_cpufreq.h |   6 +-
 drivers/power/acpi/meson.build|  10 +
 .../power/amd_pstate/amd_pstate_cpufreq.c |  24 +-
 .../power/amd_pstate/amd_pstate_cpufreq.h |   8 +-
 drivers/power/amd_pstate/meson.build  |  10 +
 drivers/power/amd_uncore/amd_uncore.c | 329 ++
 drivers/power/amd_uncore/amd_uncore.h | 226 
 drivers/power/amd_uncore/meson.build  |  20 ++
 .../power/cppc/cppc_cpufreq.c |  22 +-
 .../power/cppc/cppc_cpufreq.h |   8 +-
 drivers/power/cppc/meson.build|  10 +
 .../power/intel_uncore/intel_uncore.c |  18 +-
 .../power/intel_uncore/intel_uncore.h |   8 +-
 drivers/power/intel_uncore/meson.build|   6 +
 .../power/kvm_vm}/guest_channel.c |   0
 .../power/kvm_vm}/guest_channel.h |   0
 .../power/kvm_vm/kvm_vm.c |  22 +-
 .../power/kvm_vm/kvm_vm.h |   6 +-
 drivers/power/kvm_vm/meson.build  |  16 +
 drivers/power/meson.build |  14 +
 drivers/power/pstate/meson.build  |  10 +
 .../power/pstate/pstate_cpufreq.c |  22 +-
 .../power/pstate/pstate_cpufreq.h |   6 +-
 examples/l3fwd-power/main.c   |  12 +-
 lib/power/meson.build |   9 +-
 lib/power/power_common.c  |   2 +-
 lib/power/power_common.h  |  16 +-
 lib/power/rte_power.c | 287 +--
 lib/power/rte_power.h | 139 +---
 lib/power/rte_power_cpufreq_api.h | 208 +++
 lib/power/rte_power_uncore.c  | 207 +--
 lib/power/rte_power_uncore.h  |  87 +++--
 lib/power/rte_power_uncore_ops.h  | 239 +
 lib/power/version.map |  15 +
 40 files changed, 1605 insertions(+), 624 deletions(-)
 rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c 
(95%)
 rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h 
(98%)
 create mode 100644 drivers/power/acpi/meson.build
 rename lib/power/power_amd_pstate_cpufreq.c => 
drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
 rename lib/power/power_amd_pstate_cpufreq.h => 
drivers/power/amd_pstate/amd_pstate_cpufreq.h (97%)
 create mode 100644 drivers/power/amd_pstate/meson.build
 create mode 100644 drivers/power/amd_uncore/amd_uncore.c
 create mode 100644 drivers/power/amd_uncore/amd_uncore.h
 create mode 100644 drivers/power/amd_uncore/meson.build
 rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c 
(95%)
 rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h 
(97%)
 create mode 100644 drivers/power/cppc/meson.build
 rename lib/power/power_intel_uncore.c => 
drivers/power/intel_uncore/intel_uncore.c (95%)
 rename lib/power/power_intel_uncore.h => 
drivers/power/intel_uncore/intel_uncore.h (97%)
 create mode 100644 drivers/power/intel_uncore/meson.build
 rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
 rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
 rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
 rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
 create mode 100644 drivers/power/kvm_vm/meson.build
 create mode 100644 drivers/power/meson.build
 create mode 100644 drivers/power/pstate/meson.build
 rename lib/power/power_pstate_cpufreq.c => 
drivers/power/pstate/pstate_cpufreq.c (96%)
 rename lib/power/power_pstate_cpufre

[PATCH v4 4/5] power/amd_uncore: uncore support for AMD EPYC processors

2024-10-14 Thread Sivaprasad Tummala
This patch introduces driver support for power management of uncore
components in AMD EPYC processors.

v2:
 - fixed typo in comments section.
 - added fabric frequency get support for legacy platforms.

Signed-off-by: Sivaprasad Tummala 
---
 drivers/power/amd_uncore/amd_uncore.c | 329 ++
 drivers/power/amd_uncore/amd_uncore.h | 226 ++
 drivers/power/amd_uncore/meson.build  |  20 ++
 drivers/power/meson.build |   1 +
 4 files changed, 576 insertions(+)
 create mode 100644 drivers/power/amd_uncore/amd_uncore.c
 create mode 100644 drivers/power/amd_uncore/amd_uncore.h
 create mode 100644 drivers/power/amd_uncore/meson.build

diff --git a/drivers/power/amd_uncore/amd_uncore.c 
b/drivers/power/amd_uncore/amd_uncore.c
new file mode 100644
index 00..c3e95cdc08
--- /dev/null
+++ b/drivers/power/amd_uncore/amd_uncore.c
@@ -0,0 +1,329 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#include 
+#include 
+#include 
+
+#include 
+
+#include "amd_uncore.h"
+#include "power_common.h"
+#include "e_smi/e_smi.h"
+
+#define MAX_NUMA_DIE 8
+
+struct  __rte_cache_aligned uncore_power_info {
+   unsigned int die;  /* Core die id */
+   unsigned int pkg;  /* Package id */
+   uint32_t freqs[RTE_MAX_UNCORE_FREQS];  /* Frequency array */
+   uint32_t nb_freqs; /* Number of available freqs */
+   uint32_t curr_idx; /* Freq index in freqs array */
+   uint32_t max_freq;/* System max uncore freq */
+   uint32_t min_freq;/* System min uncore freq */
+};
+
+static struct uncore_power_info uncore_info[RTE_MAX_NUMA_NODES][MAX_NUMA_DIE];
+static int esmi_initialized;
+static unsigned int hsmp_proto_ver;
+
+static int
+set_uncore_freq_internal(struct uncore_power_info *ui, uint32_t idx)
+{
+   int ret;
+
+   if (idx >= RTE_MAX_UNCORE_FREQS || idx >= ui->nb_freqs) {
+   POWER_LOG(DEBUG, "Invalid uncore frequency index %u, which "
+   "should be less than %u", idx, ui->nb_freqs);
+   return -1;
+   }
+
+   ret = esmi_apb_disable(ui->pkg, idx);
+   if (ret != ESMI_SUCCESS) {
+   POWER_LOG(ERR, "DF P-state '%u' set failed for pkg %02u",
+   idx, ui->pkg);
+   return -1;
+   }
+
+   POWER_DEBUG_LOG("DF P-state '%u' to be set for pkg %02u die %02u",
+   idx, ui->pkg, ui->die);
+
+   /* write the minimum value first if the target freq is less than current max */
+   ui->curr_idx = idx;
+
+   return 0;
+}
+
+static int
+power_init_for_setting_uncore_freq(struct uncore_power_info *ui)
+{
+   switch (hsmp_proto_ver) {
+   case HSMP_PROTO_VER5:
+   ui->max_freq = 1800000; /* Hz */
+   ui->min_freq = 1200000; /* Hz */
+   break;
+   case HSMP_PROTO_VER2:
+   default:
+   ui->max_freq = 1600000; /* Hz */
+   ui->min_freq = 1200000; /* Hz */
+   }
+
+   return 0;
+}
+
+/*
+ * Get the available uncore frequencies of the specific die.
+ */
+static int
+power_get_available_uncore_freqs(struct uncore_power_info *ui)
+{
+   ui->nb_freqs = 3;
+   if (ui->nb_freqs >= RTE_MAX_UNCORE_FREQS) {
+   POWER_LOG(ERR, "Too many available uncore frequencies: %d",
+   ui->nb_freqs);
+   return -1;
+   }
+
+   /* Generate the uncore freq bucket array. */
+   switch (hsmp_proto_ver) {
+   case HSMP_PROTO_VER5:
+   ui->freqs[0] = 1800000;
+   ui->freqs[1] = 1440000;
+   ui->freqs[2] = 1200000;
+   break;
+   case HSMP_PROTO_VER2:
+   default:
+   ui->freqs[0] = 1600000;
+   ui->freqs[1] = 1333000;
+   ui->freqs[2] = 1200000;
+   }
+
+   POWER_DEBUG_LOG("%u frequency(s) of pkg %02u die %02u are available",
+   ui->nb_freqs, ui->pkg, ui->die);
+
+   return 0;
+}
+
+static int
+check_pkg_die_values(unsigned int pkg, unsigned int die)
+{
+   unsigned int max_pkgs, max_dies;
+   max_pkgs = power_amd_uncore_get_num_pkgs();
+   if (max_pkgs == 0)
+   return -1;
+   if (pkg >= max_pkgs) {
+   POWER_LOG(DEBUG, "Package number %02u can not exceed %u",
+   pkg, max_pkgs);
+   return -1;
+   }
+
+   max_dies = power_amd_uncore_get_num_dies(pkg);
+   if (max_dies == 0)
+   return -1;
+   if (die >= max_dies) {
+   POWER_LOG(DEBUG, "Die number %02u can not exceed %u",
+   die, max_dies);
+   return -1;
+   }
+
+   return 0;
+}
+
+static void
+power_amd_uncore_esmi_init(void)
+{
+   if (esmi_init() == ESMI_SUCCESS) {
+   i

[PATCH v4 0/5] power: refactor power management library

2024-10-14 Thread Sivaprasad Tummala
This patchset refactors the power management library, addressing both
core and uncore power management. The primary changes involve the
creation of dedicated directories for each driver within
'drivers/power/core/*' and 'drivers/power/uncore/*'.
  
This refactor significantly improves code organization, enhances
clarity, and boosts maintainability. It lays the foundation for more
focused development on individual drivers and facilitates seamless
integration of future enhancements, particularly the AMD uncore driver.
 
Furthermore, this effort aims to streamline code maintenance by
consolidating common functions for cpufreq and cppc across various
core drivers, thus reducing code duplication.

Sivaprasad Tummala (5):
  power: refactor core power management library
  power: refactor uncore power management library
  test/power: removed function pointer validations
  power/amd_uncore: uncore support for AMD EPYC processors
  maintainers: update for drivers/power

 MAINTAINERS   |   1 +
 app/test/test_power.c |  95 -
 app/test/test_power_cpufreq.c |  52 ---
 app/test/test_power_kvm_vm.c  |  36 --
 drivers/meson.build   |   1 +
 .../power/acpi/acpi_cpufreq.c |  22 +-
 .../power/acpi/acpi_cpufreq.h |   6 +-
 drivers/power/acpi/meson.build|  10 +
 .../power/amd_pstate/amd_pstate_cpufreq.c |  24 +-
 .../power/amd_pstate/amd_pstate_cpufreq.h |   8 +-
 drivers/power/amd_pstate/meson.build  |  10 +
 drivers/power/amd_uncore/amd_uncore.c | 329 ++
 drivers/power/amd_uncore/amd_uncore.h | 226 
 drivers/power/amd_uncore/meson.build  |  20 ++
 .../power/cppc/cppc_cpufreq.c |  22 +-
 .../power/cppc/cppc_cpufreq.h |   8 +-
 drivers/power/cppc/meson.build|  10 +
 .../power/intel_uncore/intel_uncore.c |  18 +-
 .../power/intel_uncore/intel_uncore.h |   8 +-
 drivers/power/intel_uncore/meson.build|   6 +
 .../power/kvm_vm}/guest_channel.c |   0
 .../power/kvm_vm}/guest_channel.h |   0
 .../power/kvm_vm/kvm_vm.c |  22 +-
 .../power/kvm_vm/kvm_vm.h |   6 +-
 drivers/power/kvm_vm/meson.build  |  16 +
 drivers/power/meson.build |  14 +
 drivers/power/pstate/meson.build  |  10 +
 .../power/pstate/pstate_cpufreq.c |  22 +-
 .../power/pstate/pstate_cpufreq.h |   6 +-
 examples/l3fwd-power/main.c   |  12 +-
 lib/power/meson.build |   9 +-
 lib/power/power_common.c  |   2 +-
 lib/power/power_common.h  |  16 +-
 lib/power/rte_power.c | 287 +--
 lib/power/rte_power.h | 139 +---
 lib/power/rte_power_cpufreq_api.h | 208 +++
 lib/power/rte_power_uncore.c  | 207 +--
 lib/power/rte_power_uncore.h  |  87 +++--
 lib/power/rte_power_uncore_ops.h  | 239 +
 lib/power/version.map |  15 +
 40 files changed, 1605 insertions(+), 624 deletions(-)
 rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c 
(95%)
 rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h 
(98%)
 create mode 100644 drivers/power/acpi/meson.build
 rename lib/power/power_amd_pstate_cpufreq.c => 
drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
 rename lib/power/power_amd_pstate_cpufreq.h => 
drivers/power/amd_pstate/amd_pstate_cpufreq.h (97%)
 create mode 100644 drivers/power/amd_pstate/meson.build
 create mode 100644 drivers/power/amd_uncore/amd_uncore.c
 create mode 100644 drivers/power/amd_uncore/amd_uncore.h
 create mode 100644 drivers/power/amd_uncore/meson.build
 rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c 
(95%)
 rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h 
(97%)
 create mode 100644 drivers/power/cppc/meson.build
 rename lib/power/power_intel_uncore.c => 
drivers/power/intel_uncore/intel_uncore.c (95%)
 rename lib/power/power_intel_uncore.h => 
drivers/power/intel_uncore/intel_uncore.h (97%)
 create mode 100644 drivers/power/intel_uncore/meson.build
 rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
 rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
 rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
 rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
 create mode 100644 drivers/power/kvm_vm/meson.build
 create mode 100644 drivers/power/meson.build
 create mode 100644 drivers/power/pstate/meson.build
 rename lib/power/power_pstate_cpufreq.c => 
drivers/power/pstate/pstate_cpufreq.c (96%)
 rename lib/power/power_pstate_cpufre

[PATCH v4 3/5] test/power: removed function pointer validations

2024-10-14 Thread Sivaprasad Tummala
After refactoring the power library, power management operations are now
consistently supported regardless of the operating environment, so the
function pointer checks are unnecessary and have been removed from the
applications.

v2:
 - removed function pointer validation in l3fwd-power app.

Signed-off-by: Sivaprasad Tummala 
---
 app/test/test_power.c | 95 ---
 app/test/test_power_cpufreq.c | 52 ---
 app/test/test_power_kvm_vm.c  | 36 -
 examples/l3fwd-power/main.c   | 12 ++---
 4 files changed, 4 insertions(+), 191 deletions(-)

diff --git a/app/test/test_power.c b/app/test/test_power.c
index 403adc22d6..5df5848c70 100644
--- a/app/test/test_power.c
+++ b/app/test/test_power.c
@@ -24,86 +24,6 @@ test_power(void)
 
 #include 
 
-static int
-check_function_ptrs(void)
-{
-   enum power_management_env env = rte_power_get_env();
-
-   const bool not_null_expected = !(env == PM_ENV_NOT_SET);
-
-   const char *inject_not_string1 = not_null_expected ? " not" : "";
-   const char *inject_not_string2 = not_null_expected ? "" : " not";
-
-   if ((rte_power_freqs == NULL) == not_null_expected) {
-   printf("rte_power_freqs should%s be NULL, environment has%s 
been "
-   "initialised\n", inject_not_string1,
-   inject_not_string2);
-   return -1;
-   }
-   if ((rte_power_get_freq == NULL) == not_null_expected) {
-   printf("rte_power_get_freq should%s be NULL, environment has%s 
been "
-   "initialised\n", inject_not_string1,
-   inject_not_string2);
-   return -1;
-   }
-   if ((rte_power_set_freq == NULL) == not_null_expected) {
-   printf("rte_power_set_freq should%s be NULL, environment has%s 
been "
-   "initialised\n", inject_not_string1,
-   inject_not_string2);
-   return -1;
-   }
-   if ((rte_power_freq_up == NULL) == not_null_expected) {
-   printf("rte_power_freq_up should%s be NULL, environment has%s 
been "
-   "initialised\n", inject_not_string1,
-   inject_not_string2);
-   return -1;
-   }
-   if ((rte_power_freq_down == NULL) == not_null_expected) {
-   printf("rte_power_freq_down should%s be NULL, environment has%s 
been "
-   "initialised\n", inject_not_string1,
-   inject_not_string2);
-   return -1;
-   }
-   if ((rte_power_freq_max == NULL) == not_null_expected) {
-   printf("rte_power_freq_max should%s be NULL, environment has%s 
been "
-   "initialised\n", inject_not_string1,
-   inject_not_string2);
-   return -1;
-   }
-   if ((rte_power_freq_min == NULL) == not_null_expected) {
-   printf("rte_power_freq_min should%s be NULL, environment has%s 
been "
-   "initialised\n", inject_not_string1,
-   inject_not_string2);
-   return -1;
-   }
-   if ((rte_power_turbo_status == NULL) == not_null_expected) {
-   printf("rte_power_turbo_status should%s be NULL, environment 
has%s been "
-   "initialised\n", inject_not_string1,
-   inject_not_string2);
-   return -1;
-   }
-   if ((rte_power_freq_enable_turbo == NULL) == not_null_expected) {
-   printf("rte_power_freq_enable_turbo should%s be NULL, 
environment has%s been "
-   "initialised\n", inject_not_string1,
-   inject_not_string2);
-   return -1;
-   }
-   if ((rte_power_freq_disable_turbo == NULL) == not_null_expected) {
-   printf("rte_power_freq_disable_turbo should%s be NULL, 
environment has%s been "
-   "initialised\n", inject_not_string1,
-   inject_not_string2);
-   return -1;
-   }
-   if ((rte_power_get_capabilities == NULL) == not_null_expected) {
-   printf("rte_power_get_capabilities should%s be NULL, 
environment has%s been "
-   "initialised\n", inject_not_string1,
-   inject_not_string2);
-   return -1;
-   }
-
-   return 0;
-}
-
 static int
 test_power(void)
 {
@@ -124,10 +44,6 @@ test_power(void)
return -1;
}
 
-   /* Verify that function pointers are NULL */
-   if (check_function_ptrs() < 0)
-   goto fail_all;
-
rte_power_unset_env();
 
/* Perform tests for valid environments.*/
@@ -154,22 +70,11 @@ test_power(void)

[PATCH v4 5/5] maintainers: update for drivers/power

2024-10-14 Thread Sivaprasad Tummala
Update maintainers for drivers/power/*.

Signed-off-by: Sivaprasad Tummala 
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 6814991735..9f14e8f8d6 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1744,6 +1744,7 @@ M: Anatoly Burakov 
 M: David Hunt 
 M: Sivaprasad Tummala 
 F: lib/power/
+F: drivers/power/*
 F: doc/guides/prog_guide/power_man.rst
 F: app/test/test_power*
 F: examples/l3fwd-power/
-- 
2.34.1



Re: [v6 08/15] dma/dpaa: refactor driver

2024-10-14 Thread Stephen Hemminger
On Mon, 14 Oct 2024 15:06:32 +0530
Gagandeep Singh  wrote:

> @@ -551,7 +485,7 @@ fsl_qdma_reg_init(struct fsl_qdma_engine *fsl_qdma)
>  
>   /* Initialize the status queue mode. */
>   reg = FSL_QDMA_BSQMR_EN;
> - val = ilog2(fsl_qdma->status[j]->n_cq) - 6;
> + val = ilog2_qsize(temp_stat->n_cq);
>   reg |= FSL_QDMA_BSQMR_CQ_SIZE(val);
>   qdma_writel(reg, block + FSL_QDMA_BSQMR);
>   }
> @@ -563,158 +497,389 @@ fsl_qdma_reg_init(struct fsl_qdma_engine *fsl_qdma)
>   return 0;
>  }
>  
> -static void *
> -fsl_qdma_prep_memcpy(void *fsl_chan, dma_addr_t dst,
> -dma_addr_t src, size_t len,
> -void *call_back,
> -void *param)
> +static uint16_t
> +dpaa_qdma_block_dequeue(struct fsl_qdma_engine *fsl_qdma,
> + uint8_t block_id)
>  {
> - struct fsl_qdma_comp *fsl_comp;
> + struct fsl_qdma_status_queue *stat_queue;
> + struct fsl_qdma_queue *cmd_queue;
> + struct fsl_qdma_comp_cmd_desc *cq;
> + uint16_t start, count = 0;
> + uint8_t qid = 0;
> + uint32_t reg;
> + int ret;
> + uint8_t *block;
> + uint16_t *dq_complete;
> + struct fsl_qdma_desc *desc[FSL_QDMA_SG_MAX_ENTRY];
>  
> - fsl_comp =
> - fsl_qdma_request_enqueue_desc((struct fsl_qdma_chan *)fsl_chan);
> - if (!fsl_comp)
> - return NULL;
> + stat_queue = &fsl_qdma->stat_queues[block_id];
> + cq = stat_queue->cq;
> + start = stat_queue->complete;
> +
> + block = fsl_qdma->block_base +
> + FSL_QDMA_BLOCK_BASE_OFFSET(fsl_qdma, block_id);
>  
> - fsl_comp->qchan = fsl_chan;
> - fsl_comp->call_back_func = call_back;
> - fsl_comp->params = param;
> + do {
> + reg = qdma_readl_be(block + FSL_QDMA_BSQSR);
> + if (reg & FSL_QDMA_BSQSR_QE_BE)
> + break;
>  
> - fsl_qdma_comp_fill_memcpy(fsl_comp, dst, src, len);
> - return (void *)fsl_comp;
> + qdma_writel_be(FSL_QDMA_BSQMR_DI, block + FSL_QDMA_BSQMR);
> + ret = qdma_ccdf_get_queue(&cq[start], &qid);
> + if (ret == true) {
> + cmd_queue = &fsl_qdma->cmd_queues[block_id][qid];
> +
> + ret = rte_ring_dequeue(cmd_queue->complete_burst,
> + (void **)&dq_complete);
> + if (ret)
> + rte_panic("DQ desc number failed!\n");

Please don't panic here; either recover, log an error, or take the device
offline. Killing the whole application is not acceptable.
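
To make the suggestion concrete, here is a minimal standalone sketch of the
"log and stop the queue" option. It does not use the real driver or rte_ring;
demo_ring, demo_ring_dequeue() and demo_block_dequeue() are hypothetical
stand-ins, and the "faulted" flag is only an assumption about how a caller
could be told to take the queue offline.

```c
#include <stdint.h>
#include <stdio.h>

#define DEMO_RING_SIZE 8

/* Hypothetical stand-in for the per-queue completion bookkeeping ring. */
struct demo_ring {
	uint16_t entries[DEMO_RING_SIZE];
	unsigned int head; /* next entry to dequeue */
	unsigned int tail; /* one past the last enqueued entry */
};

/* Returns 0 on success, -1 when the ring is empty (a failed dequeue). */
static int
demo_ring_dequeue(struct demo_ring *r, uint16_t *out)
{
	if (r->head == r->tail)
		return -1;
	*out = r->entries[r->head % DEMO_RING_SIZE];
	r->head++;
	return 0;
}

/*
 * Consume the 'reported' completions. If the bookkeeping ring is
 * unexpectedly empty, log the problem, set *faulted so the caller can
 * stop using the queue, and return what was processed so far.
 * The process is never aborted.
 */
static uint16_t
demo_block_dequeue(struct demo_ring *r, uint16_t reported, int *faulted)
{
	uint16_t count = 0;
	uint16_t completed;

	*faulted = 0;
	while (count < reported) {
		if (demo_ring_dequeue(r, &completed) != 0) {
			fprintf(stderr, "demo: completion ring empty after %u entries\n",
				(unsigned int)count);
			*faulted = 1;
			break;
		}
		count++;
	}
	return count;
}
```

The caller keeps the partial count and decides what to do with a faulted
queue, so one bad queue does not take the whole application down.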


Re: [PATCH v6 1/3] graph: add support for node specific xstats

2024-10-14 Thread Jerin Jacob
On Mon, Oct 14, 2024 at 9:41 PM  wrote:
>
> From: Pavan Nikhilesh 
>
> Add ability for Nodes to advertise xstat counters
> during registration and increment them in fastpath.
> Add support for retrieving/printing stats for node
> specific xstats using rte_graph_cluster_stats_get().
> Add `rte_node_xstat_increment` API to increment node
> specific xstat counters.
>
> Signed-off-by: Pavan Nikhilesh 
> Acked-by: Kiran Kumar K 
> Reviewed-by: Robin Jarry 
> ---
>  doc/guides/prog_guide/graph_lib.rst| 22 +--
>  doc/guides/rel_notes/deprecation.rst   |  6 --
>  doc/guides/rel_notes/release_24_11.rst |  8 +++
>  lib/graph/graph_populate.c | 20 ++-
>  lib/graph/graph_private.h  |  3 +
>  lib/graph/graph_stats.c| 79 +-
>  lib/graph/node.c   | 37 +++-
>  lib/graph/rte_graph.h  | 11 
>  lib/graph/rte_graph_worker_common.h| 23 
>  lib/graph/version.map  |  7 +++
>  10 files changed, 201 insertions(+), 15 deletions(-)

>

The Doxygen comment is missing for the rte_node_xstats structure.



> +struct rte_node_xstats {
> +   uint16_t nb_xstats;  /**< Number of xstats. */
> +   char xstat_desc[][RTE_NODE_XSTAT_DESC_SIZE]; /**< Names of xstats. */
> +};
> +
>
> +/**
> + * Increment Node xstat count.
> + *
> + * Increment the count of an xstat for a given node.
> + *
> + * @param node
> + *   Pointer to the node.
> + * @param xstat_id
> + *   Error ID.

This should read "Extended stats ID".


With the above fixes:
Acked-by: Jerin Jacob 

> + * @param value
> + *   Value to increment.
> + */
> +__rte_experimental
> +static inline void
> +rte_node_xstat_increment(struct rte_node *node, uint16_t xstat_id, uint64_t 
> value)
> +{
> +   if (rte_graph_has_stats_feature()) {
> +   uint64_t *xstat = (uint64_t *)RTE_PTR_ADD(node, 
> node->xstat_off);
> +   xstat[xstat_id] += value;
> +   }
> +}
> +
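
For readers following the thread, the increment helper quoted above can be
modelled with a toy structure. This is only a sketch under assumptions:
demo_node, demo_node_init() and demo_xstat_increment() are hypothetical names,
the layout is not the real struct rte_node, and the bounds check is added for
the sketch only (the quoted patch does not show one).

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_NB_XSTATS 2

/* Toy node: the counters live in the same object at a recorded byte offset. */
struct demo_node {
	uint32_t xstat_off;              /* offset from node start to counters */
	uint64_t other_state;            /* placeholder for unrelated fields */
	uint64_t xstats[DEMO_NB_XSTATS]; /* one counter per advertised xstat */
};

static inline void
demo_node_init(struct demo_node *node)
{
	node->xstat_off = (uint32_t)offsetof(struct demo_node, xstats);
	node->other_state = 0;
	node->xstats[0] = 0;
	node->xstats[1] = 0;
}

/* Locate the counter array through the stored offset, then add 'value'. */
static inline void
demo_xstat_increment(struct demo_node *node, uint16_t xstat_id, uint64_t value)
{
	uint64_t *xstat = (uint64_t *)((uint8_t *)node + node->xstat_off);

	if (xstat_id < DEMO_NB_XSTATS) /* sketch-only guard */
		xstat[xstat_id] += value;
}
```

Storing an offset rather than a pointer keeps the fast-path access to a
single add on top of the node address.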


[PATCH v3 2/2] dts: port over unified packet suite

2024-10-14 Thread Dean Marx
Port over unified packet testing suite from old DTS. This suite
tests the ability of the PMD to recognize valid or invalid packet flags.

Signed-off-by: Dean Marx 
Reviewed-by: Jeremy Spewock 
---
 dts/framework/config/conf_yaml_schema.json |   3 +-
 dts/tests/TestSuite_uni_pkt.py | 229 +
 2 files changed, 231 insertions(+), 1 deletion(-)
 create mode 100644 dts/tests/TestSuite_uni_pkt.py

diff --git a/dts/framework/config/conf_yaml_schema.json 
b/dts/framework/config/conf_yaml_schema.json
index df390e8ae2..156fa47e94 100644
--- a/dts/framework/config/conf_yaml_schema.json
+++ b/dts/framework/config/conf_yaml_schema.json
@@ -187,7 +187,8 @@
   "enum": [
 "hello_world",
 "os_udp",
-"pmd_buffer_scatter"
+"pmd_buffer_scatter",
+"uni_pkt"
   ]
 },
 "test_target": {
diff --git a/dts/tests/TestSuite_uni_pkt.py b/dts/tests/TestSuite_uni_pkt.py
new file mode 100644
index 0000000000..e7a8e36f79
--- /dev/null
+++ b/dts/tests/TestSuite_uni_pkt.py
@@ -0,0 +1,229 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 University of New Hampshire
+
+"""Unified packet type flag testing suite.
+
+According to DPDK documentation, each Poll Mode Driver should reserve 32 bits
+of packet headers for unified packet type flags. These flags serve as an
+identifier for user applications, and are divided into subcategories:
+L2, L3, L4, tunnel, inner L2, inner L3, and inner L4 types.
+This suite verifies the ability of the driver to recognize these types.
+
+"""
+
+from scapy.contrib.nsh import NSH  # type: ignore[import-untyped]
+from scapy.layers.inet import GRE, ICMP, IP, TCP, UDP  # type: ignore[import-untyped]
+from scapy.layers.inet6 import IPv6, IPv6ExtHdrFragment  # type: ignore[import-untyped]
+from scapy.layers.l2 import ARP, Ether  # type: ignore[import-untyped]
+from scapy.layers.sctp import SCTP, SCTPChunkData  # type: ignore[import-untyped]
+from scapy.layers.vxlan import VXLAN  # type: ignore[import-untyped]
+from scapy.packet import Packet, Raw  # type: ignore[import-untyped]
+
+from framework.remote_session.testpmd_shell import (
+    RtePTypes,
+    SimpleForwardingModes,
+    TestPmdShell,
+    TestPmdVerbosePacket,
+)
+from framework.test_suite import TestSuite, func_test
+from framework.testbed_model.capability import TopologyType, requires
+
+
+@requires(topology_type=TopologyType.two_links)
+class TestUniPkt(TestSuite):
+    """DPDK Unified packet test suite.
+
+    This testing suite uses testpmd's verbose output hardware/software
+    packet type field to verify the ability of the driver to recognize
+    unified packet types when receiving different packets.
+
+    """
+
+    def check_for_matching_packet(
+        self, output: list[TestPmdVerbosePacket], flags: RtePTypes
+    ) -> bool:
+        """Returns :data:`True` if the packet in verbose output contains all specified flags."""
+        for packet in output:
+            if packet.dst_mac == "00:00:00:00:00:01":
+                if flags not in packet.hw_ptype and flags not in packet.sw_ptype:
+                    return False
+        return True
+
+    def send_packet_and_verify_flags(
+        self, expected_flag: RtePTypes, packet: Packet, testpmd: TestPmdShell
+    ) -> None:
+        """Sends a packet to the DUT and verifies the verbose ptype flags."""
+        testpmd.start()
+        self.send_packet_and_capture(packet=packet)
+        verbose_output = testpmd.extract_verbose_output(testpmd.stop())
+        valid = self.check_for_matching_packet(output=verbose_output, flags=expected_flag)
+        self.verify(valid, f"Packet type flag did not match the expected flag: {expected_flag}.")
+
+    def setup_session(
+        self, testpmd: TestPmdShell, expected_flags: list[RtePTypes], packet_list: list[Packet]
+    ) -> None:
+        """Sets the forwarding and verbose mode of each test case interactive shell session."""
+        testpmd.set_forward_mode(SimpleForwardingModes.rxonly)
+        testpmd.set_verbose(level=1)
+        for i in range(0, len(packet_list)):
+            self.send_packet_and_verify_flags(
+                expected_flag=expected_flags[i], packet=packet_list[i], testpmd=testpmd
+            )
+
+    @func_test
+    def test_l2_packet_detect(self) -> None:
+        """Verify the correct flags are shown in verbose output when sending L2 packets."""
+        mac_id = "00:00:00:00:00:01"
+        packet_list = [Ether(dst=mac_id, type=0x88F7) / Raw(), Ether(dst=mac_id) / ARP() / Raw()]
+        flag_list = [RtePTypes.L2_ETHER_TIMESYNC, RtePTypes.L2_ETHER_ARP]
+        with TestPmdShell(node=self.sut_node) as testpmd:
+            self.setup_session(testpmd=testpmd, expected_flags=flag_list, packet_list=packet_list)
+
+    @func_test
+    def test_l3_l4_packet_detect(self) -> None:
+        """Verify correct flags are shown in the verbose output when sending IP/L4 packets."""
+        mac_id = "00:00:00:00:00:01"
+ 

[PATCH v3 0/2] dts: port over unified packet type suite

2024-10-14 Thread Dean Marx
Port over unified packet type flag testing suite from old DTS.
According to DPDK documentation, each Poll Mode Driver should reserve 32
bits of packet headers for unified packet type flags. These flags serve
as an identifier for user applications, and are divided into 
subcategories: L2, L3, L4, tunnel, inner L2, inner L3, and inner L4 types.
This suite verifies the ability of the driver to recognize these types.

---
v1:
* Removed NVGRE test cases due to lack of SCAPY support
* Removed redundant packet flag verification in certain test cases

v2:
* Fixed git history issue causing apply patch failure
* Removed set_verbose duplication and added dependency

v3:
* Rebased off next-dts

Dean Marx (2):
  dts: add VXLAN port method to testpmd shell
  dts: port over unified packet suite

 dts/framework/config/conf_yaml_schema.json|   3 +-
 dts/framework/remote_session/testpmd_shell.py |  21 ++
 dts/tests/TestSuite_uni_pkt.py| 229 ++
 3 files changed, 252 insertions(+), 1 deletion(-)
 create mode 100644 dts/tests/TestSuite_uni_pkt.py

-- 
2.44.0



[PATCH v3 1/2] dts: add VXLAN port method to testpmd shell

2024-10-14 Thread Dean Marx
Add an rx_vxlan_port add/rm method to the testpmd shell for adding a
VXLAN ID to, or removing it from, the specified port filter list.

Signed-off-by: Dean Marx 
Reviewed-by: Jeremy Spewock 
---
 dts/framework/remote_session/testpmd_shell.py | 21 +++
 1 file changed, 21 insertions(+)

diff --git a/dts/framework/remote_session/testpmd_shell.py 
b/dts/framework/remote_session/testpmd_shell.py
index 16b41a7814..f3238e6b6d 100644
--- a/dts/framework/remote_session/testpmd_shell.py
+++ b/dts/framework/remote_session/testpmd_shell.py
@@ -1943,6 +1943,27 @@ def set_verbose(self, level: int, verify: bool = True) 
-> None:
 f"Testpmd failed to set verbose level to {level}."
 )
 
+    def rx_vxlan(self, vxlan_id: int, port_id: int, enable: bool, verify: bool = True) -> None:
+        """Add or remove vxlan id to/from filter list.
+
+        Args:
+            vxlan_id: VXLAN ID to add to port filter list.
+            port_id: ID of the port to modify VXLAN filter of.
+            enable: If :data:`True`, adds specified VXLAN ID, otherwise removes it.
+            verify: If :data:`True`, the output of the command is checked to verify
+                the VXLAN ID was successfully added/removed from the port.
+
+        Raises:
+            InteractiveCommandExecutionError: If `verify` is :data:`True` and VXLAN ID
+                is not successfully added or removed.
+        """
+        action = "add" if enable else "rm"
+        vxlan_output = self.send_command(f"rx_vxlan_port {action} {vxlan_id} {port_id}")
+        if verify:
+            if "udp tunneling add error" in vxlan_output:
+                self._logger.debug(f"Failed to set VXLAN:\n{vxlan_output}")
+                raise InteractiveCommandExecutionError(f"Failed to set VXLAN:\n{vxlan_output}")
+
     def _close(self) -> None:
         """Overrides :meth:`~.interactive_shell.close`."""
         self.stop()
-- 
2.44.0



Re: [v3 43/43] net/dpaa2: dpdmux single flow/multiple rules support

2024-10-14 Thread Stephen Hemminger
On Mon, 14 Oct 2024 17:31:26 +0530
vanshika.shu...@nxp.com wrote:

> From: Jun Yang 
> 
> Support multiple extractions as well as hardware descriptions
> instead of hard code.
> 
> Signed-off-by: Jun Yang 
> ---
>  drivers/net/dpaa2/dpaa2_ethdev.h |   1 +
>  drivers/net/dpaa2/dpaa2_flow.c   |  22 --
>  drivers/net/dpaa2/dpaa2_mux.c| 395 ---
>  drivers/net/dpaa2/dpaa2_parse_dump.h |   2 +
>  drivers/net/dpaa2/rte_pmd_dpaa2.h|   8 +-
>  5 files changed, 247 insertions(+), 181 deletions(-)

Please fix this spelling error in the next version.

### [PATCH] net/dpaa2: dpdmux single flow/multiple rules support

WARNING:TYPO_SPELLING: 'exended' may be misspelled - perhaps 'extended'?
#173: FILE: drivers/net/dpaa2/dpaa2_mux.c:124:
+* This can be exended to other fields using pattern->type.
   ^^^


Re: [v3 14/43] bus/fslmc: enhance MC VFIO multiprocess support

2024-10-14 Thread Stephen Hemminger
On Mon, 14 Oct 2024 17:30:57 +0530
vanshika.shu...@nxp.com wrote:

> +#ifndef RTE_LIBRTE_DPAA2_USE_PHYS_IOVA
> + if (vaddr != iovaddr) {
> + DPAA2_BUS_WARN("vaddr(0x%lx) != iovaddr(0x%lx)",
> + vaddr, iovaddr);
> + }
>  #endif

Checkpatch complains here.
Warning in drivers/bus/fslmc/fslmc_vfio.c:
Using %l format, prefer %PRI*64 if type is [u]int64_t



Re: [v3 13/43] bus/fslmc: get MC VFIO group FD directly

2024-10-14 Thread Stephen Hemminger
On Mon, 14 Oct 2024 17:30:56 +0530
vanshika.shu...@nxp.com wrote:

> +static int
> +fslmc_vfio_open_group_fd(int iommu_group_num)
> +{
> + int vfio_group_fd;
> + char filename[PATH_MAX];
> + struct rte_mp_msg mp_req, *mp_rep;
> + struct rte_mp_reply mp_reply = {0};
> + struct timespec ts = {.tv_sec = 5, .tv_nsec = 0};
> + struct vfio_mp_param *p = (struct vfio_mp_param *)mp_req.param;
> +
> + /* if primary, try to open the group */
> + if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> + /* try regular group format */
> + snprintf(filename, sizeof(filename),
> + VFIO_GROUP_FMT, iommu_group_num);
> + vfio_group_fd = open(filename, O_RDWR);
> + if (vfio_group_fd <= 0) {
> + DPAA2_BUS_ERR("Open VFIO group(%s) failed(%d)",
> + filename, vfio_group_fd);
> + }
> +
> + return vfio_group_fd;
> + }
> + /* if we're in a secondary process, request group fd from the primary
> +  * process via mp channel.
> +  */
> + p->req = SOCKET_REQ_GROUP;
> + p->group_num = iommu_group_num;
> + strcpy(mp_req.name, EAL_VFIO_MP);

Later versions of checkpatch complain that strcpy() should not be used.
Instead use strlcpy.
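
For illustration, a standalone sketch of the bounded-copy behaviour that
strlcpy() gives. bounded_copy() is a hypothetical stand-in built on snprintf()
so the snippet compiles without DPDK or libbsd; in the driver itself the call
would simply be strlcpy(mp_req.name, EAL_VFIO_MP, sizeof(mp_req.name)).

```c
#include <stdio.h>
#include <string.h>

/*
 * Copy at most size - 1 bytes of src into dst and always NUL-terminate
 * (when size > 0). The return value is the length of src, so a result
 * >= size tells the caller that truncation happened.
 */
static size_t
bounded_copy(char *dst, const char *src, size_t size)
{
	return (size_t)snprintf(dst, size, "%s", src);
}
```

Unlike strcpy(), an over-long source cannot write past the destination
buffer, and the caller can detect truncation from the return value.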



Re: [v3 16/43] bus/fslmc: dynamic IOVA mode configuration

2024-10-14 Thread Stephen Hemminger
On Mon, 14 Oct 2024 17:30:59 +0530
vanshika.shu...@nxp.com wrote:

> diff --git a/drivers/bus/fslmc/fslmc_vfio.h b/drivers/bus/fslmc/fslmc_vfio.h
> index 1695b6c078..408b35680d 100644
> --- a/drivers/bus/fslmc/fslmc_vfio.h
> +++ b/drivers/bus/fslmc/fslmc_vfio.h
> @@ -11,6 +11,10 @@
>  #include 
>  #include 
>  
> +#ifndef __hot
> +#define __hot __attribute__((hot))
> +#endif
> +

DPDK already has __rte_hot (in rte_common.h); use that instead to fix
this checkpatch warning.

Warning in drivers/bus/fslmc/fslmc_vfio.h:
Using compiler attribute directly



[PATCH v1 1/2] baseband/acc: FFT support in VRB2 PRQ device

2024-10-14 Thread Nicolas Chautru
Support a recent change in the device that extends the FFT
processing capability in the latest stepping.
Also include a cosmetic change to the VRB2 register definition.

Signed-off-by: Nicolas Chautru 
---
 drivers/baseband/acc/acc_common.h   |  2 +-
 drivers/baseband/acc/rte_vrb_pmd.c  | 30 +
 drivers/baseband/acc/vrb2_vf_enum.h |  4 ++--
 3 files changed, 29 insertions(+), 7 deletions(-)

diff --git a/drivers/baseband/acc/acc_common.h 
b/drivers/baseband/acc/acc_common.h
index 0c249d5b93..4c60b7896b 100644
--- a/drivers/baseband/acc/acc_common.h
+++ b/drivers/baseband/acc/acc_common.h
@@ -106,7 +106,7 @@
 #define ACC_MAX_FCW_SIZE  128
 #define ACC_IQ_SIZE4
 
-#define ACC_FCW_FFT_BLEN_3 28
+#define ACC_FCW_FFT_BLEN_VRB2 128
 
 /* Constants from K0 computation from 3GPP 38.212 Table 5.4.2.1-2 */
 #define ACC_N_ZC_1 66 /* N = 66 Zc for BG 1 */
diff --git a/drivers/baseband/acc/rte_vrb_pmd.c 
b/drivers/baseband/acc/rte_vrb_pmd.c
index 0455320c2a..5eb3e8dd48 100644
--- a/drivers/baseband/acc/rte_vrb_pmd.c
+++ b/drivers/baseband/acc/rte_vrb_pmd.c
@@ -1006,7 +1006,7 @@ vrb_queue_setup(struct rte_bbdev *dev, uint16_t queue_id,
case RTE_BBDEV_OP_FFT:
fcw_len = ACC_FCW_FFT_BLEN;
if (q->d->device_variant == VRB2_VARIANT)
-   fcw_len = ACC_FCW_FFT_BLEN_3;
+   fcw_len = ACC_FCW_FFT_BLEN_VRB2;
break;
case RTE_BBDEV_OP_MLDTS:
fcw_len = ACC_FCW_MLDTS_BLEN;
@@ -1402,7 +1402,11 @@ vrb_dev_info_get(struct rte_bbdev *dev, struct 
rte_bbdev_driver_info *dev_info)
RTE_BBDEV_FFT_FP16_INPUT |
RTE_BBDEV_FFT_FP16_OUTPUT |
RTE_BBDEV_FFT_POWER_MEAS |
-   RTE_BBDEV_FFT_WINDOWING_BYPASS,
+   RTE_BBDEV_FFT_WINDOWING_BYPASS |
+   
RTE_BBDEV_FFT_TIMING_OFFSET_PER_CS |
+   RTE_BBDEV_FFT_TIMING_ERROR |
+   RTE_BBDEV_FFT_DEWINDOWING |
+   RTE_BBDEV_FFT_FREQ_RESAMPLING,
.num_buffers_src = 1,
.num_buffers_dst = 1,
.fft_windows_num = ACC_MAX_FFT_WIN,
@@ -3725,6 +3729,8 @@ vrb1_fcw_fft_fill(struct rte_bbdev_fft_op *op, struct 
acc_fcw_fft *fcw)
 static inline void
 vrb2_fcw_fft_fill(struct rte_bbdev_fft_op *op, struct acc_fcw_fft_3 *fcw)
 {
+   uint8_t cs;
+
fcw->in_frame_size = op->fft.input_sequence_size;
fcw->leading_pad_size = op->fft.input_leading_padding;
fcw->out_frame_size = op->fft.output_sequence_size;
@@ -3760,6 +3766,16 @@ vrb2_fcw_fft_fill(struct rte_bbdev_fft_op *op, struct 
acc_fcw_fft_3 *fcw)
fcw->bypass = 3;
else
fcw->bypass = 0;
+
+   fcw->enable_dewin = check_bit(op->fft.op_flags, 
RTE_BBDEV_FFT_DEWINDOWING);
+   fcw->freq_resample_mode = op->fft.freq_resample_mode;
+   fcw->depad_output_size = fcw->freq_resample_mode == 0 ?
+   op->fft.output_sequence_size : 
op->fft.output_depadded_size;
+   for (cs = 0; cs < RTE_BBDEV_MAX_CS; cs++) {
+   fcw->cs_theta_0[cs] = op->fft.cs_theta_0[cs];
+   fcw->cs_theta_d[cs] = op->fft.cs_theta_d[cs];
+   fcw->cs_time_offset[cs] = op->fft.time_offset[cs];
+   }
 }
 
 static inline int
@@ -3782,8 +3798,14 @@ vrb_dma_desc_fft_fill(struct rte_bbdev_fft_op *op,
/* FCW already done */
acc_header_init(desc);
 
-   RTE_SET_USED(win_input);
-   RTE_SET_USED(win_offset);
+   if (win_en && win_input) {
+   desc->data_ptrs[bd_idx].address = 
rte_pktmbuf_iova_offset(win_input, *win_offset);
+   desc->data_ptrs[bd_idx].blen = op->fft.output_depadded_size * 2;
+   desc->data_ptrs[bd_idx].blkid = ACC_DMA_BLKID_DEWIN_IN;
+   desc->data_ptrs[bd_idx].last = 0;
+   desc->data_ptrs[bd_idx].dma_ext = 0;
+   bd_idx++;
+   }
 
desc->data_ptrs[bd_idx].address = rte_pktmbuf_iova_offset(input, 
*in_offset);
desc->data_ptrs[bd_idx].blen = op->fft.input_sequence_size * 
ACC_IQ_SIZE;
diff --git a/drivers/baseband/acc/vrb2_vf_enum.h 
b/drivers/baseband/acc/vrb2_vf_enum.h
index 9c6e451010..1cc6986c67 100644
--- a/drivers/baseband/acc/vrb2_vf_enum.h
+++ b/drivers/baseband/acc/vrb2_vf_enum.h
@@ -18,8 +18,8 @@ enum {
VRB2_VfHiInfoRingIntWrEnVf   = 0x00000020,
VRB2_VfHiInfoRingPf2VfWrEnVf = 0x00000024,
VRB2_VfHiMsixVectorMapperVf  = 0x00000060,
-   VRB2_VfHiDeviceStatus= 0x00000068,
-   VRB2_VfHiInterruptSrc= 0x00000070,
+ 

[PATCH v1 2/2] baseband/acc: saturate input to 6 bits for VRB decoder

2024-10-14 Thread Nicolas Chautru
Make the decoder more robust by forcing a default
6-bit LLR saturation on the LDPC decoder input.

Signed-off-by: Nicolas Chautru 
---
 drivers/baseband/acc/rte_vrb_pmd.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/baseband/acc/rte_vrb_pmd.c 
b/drivers/baseband/acc/rte_vrb_pmd.c
index 5eb3e8dd48..eb9892ff31 100644
--- a/drivers/baseband/acc/rte_vrb_pmd.c
+++ b/drivers/baseband/acc/rte_vrb_pmd.c
@@ -1799,6 +1799,9 @@ vrb_fcw_ld_fill(struct rte_bbdev_dec_op *op, struct 
acc_fcw_ld *fcw,
fcw->hcout_offset = 0;
}
 
+   /* Force saturation to 6 bits LLR. */
+   fcw->saturate_input = 1;
+
fcw->tb_crc_select = 0;
if (check_bit(op->ldpc_dec.op_flags, RTE_BBDEV_LDPC_CRC_TYPE_24A_CHECK))
fcw->tb_crc_select = 2;
-- 
2.34.1



[PATCH v1 0/2] baseband/acc: vrb2 FFT support

2024-10-14 Thread Nicolas Chautru
Hi,
This is an additional and final series for the VRB2 PMD.
It adds support for the latest FFT processing (available on the final
stepping of the device) and a generic improvement to the decoder configuration.

Thanks
Nic

Nicolas Chautru (2):
  baseband/acc: FFT support in VRB2 PRQ device
  baseband/acc: saturate input to 6 bits for VRB decoder

 drivers/baseband/acc/acc_common.h   |  2 +-
 drivers/baseband/acc/rte_vrb_pmd.c  | 33 +
 drivers/baseband/acc/vrb2_vf_enum.h |  4 ++--
 3 files changed, 32 insertions(+), 7 deletions(-)

-- 
2.34.1



[PATCH v2] net/mlx5: fix potential memory leak in meter

2024-10-14 Thread Shun Hao
When the meter is not enabled, avoid allocating memory for the meter
profile table, because the close process does not free it in that case.

Fixes: a295c69a8b24 ("net/mlx5: optimize meter profile lookup")
Cc: sta...@dpdk.org

Signed-off-by: Shun Hao 
Acked-by: Bing Zhao 
---
 drivers/net/mlx5/linux/mlx5_os.c   | 8 +---
 drivers/net/mlx5/mlx5_flow_meter.c | 4 ++--
 drivers/net/mlx5/windows/mlx5_os.c | 8 +---
 3 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/drivers/net/mlx5/linux/mlx5_os.c b/drivers/net/mlx5/linux/mlx5_os.c
index 0a8de88759..3881daf5cc 100644
--- a/drivers/net/mlx5/linux/mlx5_os.c
+++ b/drivers/net/mlx5/linux/mlx5_os.c
@@ -1612,9 +1612,11 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
priv->ctrl_flows = 0;
rte_spinlock_init(&priv->flow_list_lock);
TAILQ_INIT(&priv->flow_meters);
-   priv->mtr_profile_tbl = mlx5_l3t_create(MLX5_L3T_TYPE_PTR);
-   if (!priv->mtr_profile_tbl)
-   goto error;
+   if (priv->mtr_en) {
+   priv->mtr_profile_tbl = mlx5_l3t_create(MLX5_L3T_TYPE_PTR);
+   if (!priv->mtr_profile_tbl)
+   goto error;
+   }
/* Bring Ethernet device up. */
DRV_LOG(DEBUG, "port %u forcing Ethernet interface up",
eth_dev->data->port_id);
diff --git a/drivers/net/mlx5/mlx5_flow_meter.c 
b/drivers/net/mlx5/mlx5_flow_meter.c
index 19d8607070..98a61cbdd4 100644
--- a/drivers/net/mlx5/mlx5_flow_meter.c
+++ b/drivers/net/mlx5/mlx5_flow_meter.c
@@ -378,8 +378,8 @@ mlx5_flow_meter_profile_find(struct mlx5_priv *priv, 
uint32_t meter_profile_id)
 
if (priv->mtr_profile_arr)
return &priv->mtr_profile_arr[meter_profile_id];
-   if (mlx5_l3t_get_entry(priv->mtr_profile_tbl,
-  meter_profile_id, &data) || !data.ptr)
+   if (!priv->mtr_profile_tbl ||
+   mlx5_l3t_get_entry(priv->mtr_profile_tbl, meter_profile_id, &data) 
|| !data.ptr)
return NULL;
fmp = data.ptr;
/* Remove reference taken by the mlx5_l3t_get_entry. */
diff --git a/drivers/net/mlx5/windows/mlx5_os.c 
b/drivers/net/mlx5/windows/mlx5_os.c
index 0ebd233595..61ad06a373 100644
--- a/drivers/net/mlx5/windows/mlx5_os.c
+++ b/drivers/net/mlx5/windows/mlx5_os.c
@@ -521,9 +521,11 @@ mlx5_dev_spawn(struct rte_device *dpdk_dev,
claim_zero(mlx5_mac_addr_add(eth_dev, &mac, 0, 0));
priv->ctrl_flows = 0;
TAILQ_INIT(&priv->flow_meters);
-   priv->mtr_profile_tbl = mlx5_l3t_create(MLX5_L3T_TYPE_PTR);
-   if (!priv->mtr_profile_tbl)
-   goto error;
+   if (priv->mtr_en) {
+   priv->mtr_profile_tbl = mlx5_l3t_create(MLX5_L3T_TYPE_PTR);
+   if (!priv->mtr_profile_tbl)
+   goto error;
+   }
/* Bring Ethernet device up. */
DRV_LOG(DEBUG, "port %u forcing Ethernet interface up.",
eth_dev->data->port_id);
-- 
2.20.0
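
The pattern in this fix can be sketched in isolation: allocate the table only
when the feature that later frees it is enabled, and make every lookup tolerate
a missing table. The snippet below is a simplified model with hypothetical
names (demo_priv, demo_spawn, demo_profile_find, demo_close) and a plain array
instead of the mlx5 three-level table; it is not the driver code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_TBL_SIZE 4

struct demo_priv {
	bool mtr_en;      /* meter feature enabled */
	int *profile_tbl; /* allocated only when mtr_en is true */
};

/* Allocate the table only if the close path will also release it. */
static int
demo_spawn(struct demo_priv *p, bool mtr_en)
{
	p->mtr_en = mtr_en;
	p->profile_tbl = NULL;
	if (p->mtr_en) {
		p->profile_tbl = calloc(DEMO_TBL_SIZE, sizeof(*p->profile_tbl));
		if (p->profile_tbl == NULL)
			return -1;
	}
	return 0;
}

/* Lookups must cope with the table never having been created. */
static int *
demo_profile_find(struct demo_priv *p, size_t id)
{
	if (p->profile_tbl == NULL || id >= DEMO_TBL_SIZE)
		return NULL;
	return &p->profile_tbl[id];
}

/* The close path frees the table only when the meter is enabled. */
static void
demo_close(struct demo_priv *p)
{
	if (p->mtr_en) {
		free(p->profile_tbl);
		p->profile_tbl = NULL;
	}
}
```

With allocation and release guarded by the same flag, nothing is left
allocated when the meter is disabled.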



[PATCH v7 1/3] graph: add support for node specific xstats

2024-10-14 Thread pbhagavatula
From: Pavan Nikhilesh 

Add ability for Nodes to advertise xstat counters
during registration and increment them in fastpath.
Add support for retrieving/printing stats for node
specific xstats using rte_graph_cluster_stats_get().
Add `rte_node_xstat_increment` API to increment node
specific xstat counters.

Signed-off-by: Pavan Nikhilesh 
Acked-by: Kiran Kumar K 
Reviewed-by: Robin Jarry 
Acked-by: Jerin Jacob 
---
 doc/guides/prog_guide/graph_lib.rst| 22 +--
 doc/guides/rel_notes/deprecation.rst   |  6 --
 doc/guides/rel_notes/release_24_11.rst |  8 +++
 lib/graph/graph_populate.c | 20 ++-
 lib/graph/graph_private.h  |  3 +
 lib/graph/graph_stats.c| 79 +-
 lib/graph/node.c   | 37 +++-
 lib/graph/rte_graph.h  | 15 +
 lib/graph/rte_graph_worker_common.h| 23 
 lib/graph/version.map  |  7 +++
 10 files changed, 205 insertions(+), 15 deletions(-)

diff --git a/doc/guides/prog_guide/graph_lib.rst 
b/doc/guides/prog_guide/graph_lib.rst
index ad09bdfe26..4d9ae84ada 100644
--- a/doc/guides/prog_guide/graph_lib.rst
+++ b/doc/guides/prog_guide/graph_lib.rst
@@ -21,6 +21,7 @@ Features of the Graph library are:
 - Nodes as plugins.
 - Support for out of tree nodes.
 - Inbuilt nodes for packet processing.
+- Node specific xstat counts.
 - Multi-process support.
 - Low overhead graph walk and node enqueue.
 - Low overhead statistics collection infrastructure.
@@ -124,6 +125,18 @@ Source nodes are static nodes created using 
``RTE_NODE_REGISTER`` by passing
 While performing the graph walk, the ``process()`` function of all the source
 nodes will be called first. So that these nodes can be used as input nodes for 
a graph.
 
+nb_xstats:
+^^^^^^^^^^
+
+The number of xstats that this node can report. The ``xstat_desc[]`` stores 
the xstat
+descriptions which will later be propagated to stats.
+
+xstat_desc[]:
+^^^^^^^^^^^^^
+
+The dynamic array to store the xstat descriptions that will be reported by this
+node.
+
 Node creation and registration
 ~~
 * Node implementer creates the node by implementing ops and attributes of
@@ -141,13 +154,13 @@ Link the Nodes to create the graph topology
Topology after linking the nodes
 
 Once nodes are available to the program, Application or node public API
-functions can links them together to create a complex packet processing graph.
+functions can link them together to create a complex packet processing graph.
 
 There are multiple different types of strategies to link the nodes.
 
 Method (a):
 ^^^
-Provide the ``next_nodes[]`` at the node registration time. See  ``struct 
rte_node_register::nb_edges``.
+Provide the ``next_nodes[]`` at the node registration time. See ``struct 
rte_node_register::nb_edges``.
 This is a use case to address the static node scheme where one knows upfront 
the
 ``next_nodes[]`` of the node.
 
@@ -385,8 +398,9 @@ Understanding the memory layout helps to debug the graph 
library and
 improve the performance if needed.
 
 Graph object consists of a header, circular buffer to store the pending
-stream when walking over the graph, and variable-length memory to store
-the ``rte_node`` objects.
+stream when walking over the graph, variable-length memory to store
+the ``rte_node`` objects, and variable-length memory to store the xstat
+reported by each ``rte_node``.
 
 The graph_nodes_mem_create() creates and populate this memory. The functions
 such as ``rte_graph_walk()`` and ``rte_node_enqueue_*`` use this memory
diff --git a/doc/guides/rel_notes/deprecation.rst 
b/doc/guides/rel_notes/deprecation.rst
index 7bc2310bc4..20fcfedb7b 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -193,9 +193,3 @@ Deprecation Notices
   will be deprecated and subsequently removed in DPDK 24.11 release.
   Before this, the new port library API (functions rte_swx_port_*)
   will gradually transition from experimental to stable status.
-
-* graph: The graph library data structures will be modified
-  to support node specific errors.
-  The structures ``rte_node``, ``rte_node_register``
-  and ``rte_graph_cluster_node_stats`` will be extended
-  to include node error counters and error description.
diff --git a/doc/guides/rel_notes/release_24_11.rst 
b/doc/guides/rel_notes/release_24_11.rst
index dcee09b5d0..fee4b2305d 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -178,6 +178,10 @@ New Features
   This field is used to pass an extra configuration settings such as ability
   to lookup IPv4 addresses in network byte order.
 
+* **Add node specific xstats for rte_graph**
+
+  * Added ability for node to advertise and update multiple xstat counters,
+that can be retrieved using rte_graph_cluster_stats_get.
 
 Removed Items
 -
@@ -260,6 +264,10 @@ ABI Changes
 
 * eventdev: Added ``pr

[PATCH v7 3/3] node: add xstats for ip4 nodes

2024-10-14 Thread pbhagavatula
From: Pavan Nikhilesh 

Add xstat counters for ip4 LPM lookup failures in
ip4_lookup node.
Add reassembly failure xstat counter for ip4 reassembly
node.

Signed-off-by: Pavan Nikhilesh 
Acked-by: Kiran Kumar K 
Reviewed-by: Robin Jarry 
---
 lib/node/ip4_lookup.c  | 9 +
 lib/node/ip4_lookup_neon.h | 5 +
 lib/node/ip4_lookup_sse.h  | 6 ++
 lib/node/ip4_reassembly.c  | 9 +
 lib/node/node_private.h| 8 
 5 files changed, 37 insertions(+)

diff --git a/lib/node/ip4_lookup.c b/lib/node/ip4_lookup.c
index 18955971f6..53e8961bf5 100644
--- a/lib/node/ip4_lookup.c
+++ b/lib/node/ip4_lookup.c
@@ -86,6 +86,7 @@ ip4_lookup_node_process_scalar(struct rte_graph *graph, 
struct rte_node *node,
rc = rte_lpm_lookup(lpm, rte_be_to_cpu_32(ipv4_hdr->dst_addr),
&next_hop);
next_hop = (rc == 0) ? next_hop : drop_nh;
+   NODE_INCREMENT_XSTAT_ID(node, 0, (rc != 0), 1);
 
node_mbuf_priv1(mbuf, dyn)->nh = (uint16_t)next_hop;
next_hop = next_hop >> 16;
@@ -219,11 +220,19 @@ ip4_lookup_node_init(const struct rte_graph *graph, 
struct rte_node *node)
return 0;
 }
 
+static struct rte_node_xstats ip4_lookup_xstats = {
+   .nb_xstats = 1,
+   .xstat_desc = {
+   [0] = "ip4_lookup_error",
+   },
+};
+
 static struct rte_node_register ip4_lookup_node = {
.process = ip4_lookup_node_process_scalar,
.name = "ip4_lookup",
 
.init = ip4_lookup_node_init,
+   .xstats = &ip4_lookup_xstats,
 
.nb_edges = RTE_NODE_IP4_LOOKUP_NEXT_PKT_DROP + 1,
.next_nodes = {
diff --git a/lib/node/ip4_lookup_neon.h b/lib/node/ip4_lookup_neon.h
index d5c8da3719..82488a6fbc 100644
--- a/lib/node/ip4_lookup_neon.h
+++ b/lib/node/ip4_lookup_neon.h
@@ -116,6 +116,10 @@ ip4_lookup_node_process_vec(struct rte_graph *graph, 
struct rte_node *node,
priv01.u16[4] = result.u16[2];
priv23.u16[0] = result.u16[4];
priv23.u16[4] = result.u16[6];
+   NODE_INCREMENT_XSTAT_ID(node, 0, result.u16[1] == (drop_nh >> 
16), 1);
+   NODE_INCREMENT_XSTAT_ID(node, 0, result.u16[3] == (drop_nh >> 
16), 1);
+   NODE_INCREMENT_XSTAT_ID(node, 0, result.u16[5] == (drop_nh >> 
16), 1);
+   NODE_INCREMENT_XSTAT_ID(node, 0, result.u16[7] == (drop_nh >> 
16), 1);
 
node_mbuf_priv1(mbuf0, dyn)->u = priv01.u64[0];
node_mbuf_priv1(mbuf1, dyn)->u = priv01.u64[1];
@@ -202,6 +206,7 @@ ip4_lookup_node_process_vec(struct rte_graph *graph, struct 
rte_node *node,
&next_hop);
next_hop = (rc == 0) ? next_hop : drop_nh;
 
+   NODE_INCREMENT_XSTAT_ID(node, 0, (rc != 0), 1);
node_mbuf_priv1(mbuf0, dyn)->nh = (uint16_t)next_hop;
next_hop = next_hop >> 16;
next0 = (uint16_t)next_hop;
diff --git a/lib/node/ip4_lookup_sse.h b/lib/node/ip4_lookup_sse.h
index 74dbf97533..fb5f9c9b99 100644
--- a/lib/node/ip4_lookup_sse.h
+++ b/lib/node/ip4_lookup_sse.h
@@ -115,6 +115,11 @@ ip4_lookup_node_process_vec(struct rte_graph *graph, 
struct rte_node *node,
/* Perform LPM lookup to get NH and next node */
rte_lpm_lookupx4(lpm, dip, dst.u32, drop_nh);
 
+   NODE_INCREMENT_XSTAT_ID(node, 0, dst.u16[1] == (drop_nh >> 16), 
1);
+   NODE_INCREMENT_XSTAT_ID(node, 0, dst.u16[3] == (drop_nh >> 16), 
1);
+   NODE_INCREMENT_XSTAT_ID(node, 0, dst.u16[5] == (drop_nh >> 16), 
1);
+   NODE_INCREMENT_XSTAT_ID(node, 0, dst.u16[7] == (drop_nh >> 16), 
1);
+
/* Extract next node id and NH */
node_mbuf_priv1(mbuf0, dyn)->nh = dst.u32[0] & 0x;
next0 = (dst.u32[0] >> 16);
@@ -206,6 +211,7 @@ ip4_lookup_node_process_vec(struct rte_graph *graph, struct 
rte_node *node,
rc = rte_lpm_lookup(lpm, rte_be_to_cpu_32(ipv4_hdr->dst_addr),
&next_hop);
next_hop = (rc == 0) ? next_hop : drop_nh;
+   NODE_INCREMENT_XSTAT_ID(node, 0, rc != 0, 1);
 
node_mbuf_priv1(mbuf0, dyn)->nh = next_hop & 0x;
next0 = (next_hop >> 16);
diff --git a/lib/node/ip4_reassembly.c b/lib/node/ip4_reassembly.c
index 04823cc596..eb5f391114 100644
--- a/lib/node/ip4_reassembly.c
+++ b/lib/node/ip4_reassembly.c
@@ -120,6 +120,7 @@ ip4_reassembly_node_process(struct rte_graph *graph, struct 
rte_node *node, void
rte_node_next_stream_put(graph, node, 
RTE_NODE_IP4_REASSEMBLY_NEXT_PKT_DROP,
 dr->cnt);
idx += dr->cnt;
+   NODE_INCREMENT_XSTAT_ID(node, 0, dr->cnt, dr->cnt);
dr->cnt = 0;
}
 
@@ -165,11 +166,19 @@ ip4_reassembly_node_init(const struct rte_graph *graph, 
struct rte_node *

[PATCH v7 0/3] Introduce node-specific xstats in graph library

2024-10-14 Thread pbhagavatula
From: Pavan Nikhilesh 

Introduce the ability for nodes to advertise xstats counters during
registration and increment them during the node process function in
the graph library.
This enhancement allows for better stats tracking and debugging
capabilities within the graph framework.

The number of xstats and the mapping of xstat IDs to xstat descriptions
are defined during node registration.

Example:
static struct rte_node_xstats ip4_reassembly_xstats = {
.nb_xstats = 1,
.xstat_desc = {
[0] = "ip4_reassembly_error",
},
};

Here, "ip4_reassembly_error" is mapped to xstat ID 0, and the same ID is
used in the `ip4_reassembly_node_process` function to increment reassembly
errors as an xstat.
Depending on the node, there can be multiple such xstats that can be
updated independently and retrieved using `rte_graph_cluster_stats_get`.
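
As a rough illustration of the ID-to-description mapping described above,
here is a minimal sketch. The `demo_*` types and helpers are hypothetical
stand-ins and not the graph library's structures; only the `nb_xstats` and
`xstat_desc` field names and the "ip4_reassembly_error" string come from
this series.

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_XSTATS 4 /* hypothetical fixed capacity for the sketch */

/* Hypothetical stand-in for the xstat table given at registration. */
struct demo_node_xstats {
	uint16_t nb_xstats;
	const char *xstat_desc[DEMO_MAX_XSTATS];
};

/* Hypothetical stand-in for a node carrying one counter per xstat ID. */
struct demo_node {
	const struct demo_node_xstats *xstats;
	uint64_t xstat_cntrs[DEMO_MAX_XSTATS];
};

/* Add value to the counter for xstat_id; unregistered IDs are ignored. */
static void
demo_xstat_increment(struct demo_node *node, uint16_t xstat_id, uint64_t value)
{
	if (node->xstats == NULL || xstat_id >= node->xstats->nb_xstats)
		return;
	node->xstat_cntrs[xstat_id] += value;
}

/* Return the description registered for xstat_id, or NULL if none. */
static const char *
demo_xstat_name(const struct demo_node *node, uint16_t xstat_id)
{
	if (node->xstats == NULL || xstat_id >= node->xstats->nb_xstats)
		return NULL;
	return node->xstats->xstat_desc[xstat_id];
}

/* Same shape as the registration example above: ID 0 maps to one name. */
static const struct demo_node_xstats demo_reassembly_xstats = {
	.nb_xstats = 1,
	.xstat_desc = {
		[0] = "ip4_reassembly_error",
	},
};
```

A process function would call `demo_xstat_increment(node, 0, n)` on each
failure, and a stats reader would pair `demo_xstat_name()` with the counter
at the same index.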

Example:
+-------------------+-------+-----------+--------------+
|Node               |calls  |objs       |realloc_count |
+-------------------+-------+-----------+--------------+
|ip4_lookup         |1324083|338965248  |2             |
|   ip4_lookup_error|       |338965496  |              |
|pkt_drop           |1324084|338965504  |1             |
|ethdev_rx-0-0      |1324086|338966016  |2             |
|pkt_cls            |1324086|338966016  |1             |
+-------------------+-------+-----------+--------------+

v2 Changes:
- Fix compilation.
v3 Changes:
- Resend as 1/5 didn't make it through.
v4 Changes:
- Address review comments.
- Rebase on main branch.
v5 Changes:
- Shrink structure member names.(Robin)
- add rte_node_error_increment utility function. (Robin)
- Squash patches. (Robin)
- Update RN, DN. (David)
v6 Changes:
- Rename error to xstat. (Robin)
- Rearranges patches, update SVG fonts.
v7 Changes:
- Fix doxygen. (Jerin)

Pavan Nikhilesh (3):
  graph: add support for node specific xstats
  doc: update graph layout and node anatomy images
  node: add xstats for ip4 nodes

 doc/guides/prog_guide/graph_lib.rst   |  22 +-
 .../prog_guide/img/anatomy_of_a_node.svg  | 329 +--
 .../prog_guide/img/graph_mem_layout.svg   | 921 +-
 doc/guides/rel_notes/deprecation.rst  |   6 -
 doc/guides/rel_notes/release_24_11.rst|   8 +
 lib/graph/graph_populate.c|  20 +-
 lib/graph/graph_private.h |   3 +
 lib/graph/graph_stats.c   |  79 +-
 lib/graph/node.c  |  37 +-
 lib/graph/rte_graph.h |  15 +
 lib/graph/rte_graph_worker_common.h   |  23 +
 lib/graph/version.map |   7 +
 lib/node/ip4_lookup.c |   9 +
 lib/node/ip4_lookup_neon.h|   5 +
 lib/node/ip4_lookup_sse.h |   6 +
 lib/node/ip4_reassembly.c |   9 +
 lib/node/node_private.h   |   8 +
 17 files changed, 1197 insertions(+), 310 deletions(-)

--
2.25.1



[PATCH v6] devtools: add .clang-format file

2024-10-14 Thread Abdullah Ömer Yamaç
clang-format is a tool to format C/C++/Objective-C code. It can be used
to reformat code to match a given coding style, or to ensure that code
adheres to a specific coding style. It helps to maintain a consistent
coding style across the DPDK codebase.

The .clang-format file overrides the default style options provided by
clang-format, and a large set of IDEs and text editors support it.

Signed-off-by: Abdullah Ömer Yamaç 
---
 .clang-format | 153 ++
 1 file changed, 153 insertions(+)
 create mode 100644 .clang-format

diff --git a/.clang-format b/.clang-format
new file mode 100644
index 00..805c08da1d
--- /dev/null
+++ b/.clang-format
@@ -0,0 +1,153 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2023 Abdullah Ömer Yamaç.
+#
+
+---
+BasedOnStyle: LLVM
+
+# Keep opening braces on the same line for control statements; break before function bodies
+BreakBeforeBraces: Custom
+BraceWrapping:
+AfterFunction: true
+AfterControlStatement: false
+
+AllowShortEnumsOnASingleLine: false
+
+# Should be declared this way:
+SpaceBeforeParens: Custom
+SpaceBeforeParensOptions:
+AfterForeachMacros: false
+
+# Set maximum line length to 100 characters
+ColumnLimit: 100
+
+# Use LF (line feed) as the end-of-line character
+LineEnding: LF
+
+# Insert a newline at the end of the file
+InsertNewlineAtEOF: true
+
+# Set indentation width to 8 spaces
+IndentWidth: 8
+
+# Set continuation indentation width to 16 spaces (2 tabs)
+AlignAfterOpenBracket: DontAlign
+ContinuationIndentWidth: 16
+
+# Set tab width to 8 spaces
+TabWidth: 8
+
+# Use tabs for indentation
+UseTab: Always
+
+# Preserve include blocks as they are
+IncludeBlocks: Preserve
+
+# Never sort includes
+SortIncludes: Never
+
+# Always break after return type for top-level definitions
+AlwaysBreakAfterReturnType: TopLevelDefinitions
+
+# Align escaped newlines as far left as possible
+AlignEscapedNewlines: Left
+
+# Foreach macros
+ForEachMacros:
+[
+"CIRBUF_FOREACH",
+"DLB2_LIST_FOR_EACH",
+"DLB2_LIST_FOR_EACH_SAFE",
+"ECORE_LIST_FOR_EACH_ENTRY",
+"ECORE_LIST_FOR_EACH_ENTRY_SAFE",
+"FOR_EACH",
+"FOR_EACH_BUCKET",
+"FOR_EACH_CNIC_QUEUE",
+"FOR_EACH_COS_IN_TX_QUEUE",
+"FOR_EACH_ETH_QUEUE",
+"FOR_EACH_MEMBER",
+"FOR_EACH_NONDEFAULT_ETH_QUEUE",
+"FOR_EACH_NONDEFAULT_QUEUE",
+"FOR_EACH_PORT",
+"FOR_EACH_PORT_IF",
+"FOR_EACH_QUEUE",
+"FOR_EACH_SUITE_TESTCASE",
+"FOR_EACH_SUITE_TESTSUITE",
+"FOREACH_ABS_FUNC_IN_PORT",
+"FOREACH_DEVICE_ON_AUXILIARY_BUS",
+"FOREACH_DEVICE_ON_CDXBUS",
+"FOREACH_DEVICE_ON_PCIBUS",
+"FOREACH_DEVICE_ON_PLATFORM_BUS",
+"FOREACH_DEVICE_ON_UACCEBUS",
+"FOREACH_DEVICE_ON_VMBUS",
+"FOREACH_DRIVER_ON_AUXILIARY_BUS",
+"FOREACH_DRIVER_ON_CDXBUS",
+"FOREACH_DRIVER_ON_PCIBUS",
+"FOREACH_DRIVER_ON_PLATFORM_BUS",
+"FOREACH_DRIVER_ON_UACCEBUS",
+"FOREACH_DRIVER_ON_VMBUS",
+"FOREACH_SUBDEV",
+"FOREACH_SUBDEV_STATE",
+"HLIST_FOR_EACH_ENTRY",
+"ILIST_FOREACH",
+"LIST_FOR_EACH_ENTRY",
+"LIST_FOR_EACH_ENTRY_SAFE",
+"LIST_FOREACH",
+"LIST_FOREACH_FROM",
+"LIST_FOREACH_FROM_SAFE",
+"LIST_FOREACH_SAFE",
+"ML_AVG_FOREACH_QP",
+"ML_AVG_FOREACH_QP_MVTVM",
+"ML_AVG_RESET_FOREACH_QP",
+"ML_MAX_FOREACH_QP",
+"ML_MAX_FOREACH_QP_MVTVM",
+"ML_MAX_RESET_FOREACH_QP",
+"ML_MIN_FOREACH_QP",
+"ML_MIN_FOREACH_QP_MVTVM",
+"ML_MIN_RESET_FOREACH_QP",
+"MLX5_ETH_FOREACH_DEV",
+"MLX5_IPOOL_FOREACH",
+"MLX5_L3T_FOREACH",
+"OSAL_LIST_FOR_EACH_ENTRY",
+"OSAL_LIST_FOR_EACH_ENTRY_SAFE",
+"PLT_TAILQ_FOREACH_SAFE",
+"RTE_BBDEV_FOREACH",
+"RTE_DEV_FOREACH",
+"RTE_DMA_FOREACH_DEV",
+"RTE_EAL_DEVARGS_FOREACH",
+"RTE_ETH_FOREACH_DEV",
+"RTE_ETH_FOREACH_DEV_OF",
+"RTE_ETH_FOREACH_DEV_OWNED_BY",
+"RTE_ETH_FOREACH_DEV_SIBLING",
+"RTE_ETH_FOREACH_MATCHING_DEV",
+"RTE_ETH_FOREACH_VALID_DEV",
+"RTE_GPU_FOREACH",
+"RTE_GPU_FOREACH_CHILD",
+"RTE_GPU_FOREACH_PARENT",
+"RTE_LCORE_FOREACH",
+"RTE_LCORE_FOREACH_WO

[PATCH v6] devtools: add .clang-format file

2024-10-14 Thread Abdullah Ömer Yamaç
clang-format is a tool to format C/C++/Objective-C code. It can be used
to reformat code to match a given coding style, or to ensure that code
adheres to a specific coding style. It helps to maintain a consistent
coding style across the DPDK codebase.

The .clang-format file overrides the default style options provided by
clang-format, and a large set of IDEs and text editors support it.

Signed-off-by: Abdullah Ömer Yamaç 
---
 .clang-format | 153 ++
 1 file changed, 153 insertions(+)
 create mode 100644 .clang-format

diff --git a/.clang-format b/.clang-format
new file mode 100644
index 00..805c08da1d
--- /dev/null
+++ b/.clang-format
@@ -0,0 +1,153 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 Abdullah Ömer Yamaç.
+#
+
+---
+BasedOnStyle: LLVM
+
+# Keep opening braces on the same line for control statements; break before function bodies
+BreakBeforeBraces: Custom
+BraceWrapping:
+AfterFunction: true
+AfterControlStatement: false
+
+AllowShortEnumsOnASingleLine: false
+
+# Should be declared this way:
+SpaceBeforeParens: Custom
+SpaceBeforeParensOptions:
+AfterForeachMacros: false
+
+# Set maximum line length to 100 characters
+ColumnLimit: 100
+
+# Use LF (line feed) as the end-of-line character
+LineEnding: LF
+
+# Insert a newline at the end of the file
+InsertNewlineAtEOF: true
+
+# Set indentation width to 8 spaces
+IndentWidth: 8
+
+# Set continuation indentation width to 16 spaces (2 tabs)
+AlignAfterOpenBracket: DontAlign
+ContinuationIndentWidth: 16
+
+# Set tab width to 8 spaces
+TabWidth: 8
+
+# Use tabs for indentation
+UseTab: Always
+
+# Preserve include blocks as they are
+IncludeBlocks: Preserve
+
+# Never sort includes
+SortIncludes: Never
+
+# Always break after return type for top-level definitions
+AlwaysBreakAfterReturnType: TopLevelDefinitions
+
+# Align escaped newlines as far left as possible
+AlignEscapedNewlines: Left
+
+# Foreach macros
+ForEachMacros:
+[
+"CIRBUF_FOREACH",
+"DLB2_LIST_FOR_EACH",
+"DLB2_LIST_FOR_EACH_SAFE",
+"ECORE_LIST_FOR_EACH_ENTRY",
+"ECORE_LIST_FOR_EACH_ENTRY_SAFE",
+"FOR_EACH",
+"FOR_EACH_BUCKET",
+"FOR_EACH_CNIC_QUEUE",
+"FOR_EACH_COS_IN_TX_QUEUE",
+"FOR_EACH_ETH_QUEUE",
+"FOR_EACH_MEMBER",
+"FOR_EACH_NONDEFAULT_ETH_QUEUE",
+"FOR_EACH_NONDEFAULT_QUEUE",
+"FOR_EACH_PORT",
+"FOR_EACH_PORT_IF",
+"FOR_EACH_QUEUE",
+"FOR_EACH_SUITE_TESTCASE",
+"FOR_EACH_SUITE_TESTSUITE",
+"FOREACH_ABS_FUNC_IN_PORT",
+"FOREACH_DEVICE_ON_AUXILIARY_BUS",
+"FOREACH_DEVICE_ON_CDXBUS",
+"FOREACH_DEVICE_ON_PCIBUS",
+"FOREACH_DEVICE_ON_PLATFORM_BUS",
+"FOREACH_DEVICE_ON_UACCEBUS",
+"FOREACH_DEVICE_ON_VMBUS",
+"FOREACH_DRIVER_ON_AUXILIARY_BUS",
+"FOREACH_DRIVER_ON_CDXBUS",
+"FOREACH_DRIVER_ON_PCIBUS",
+"FOREACH_DRIVER_ON_PLATFORM_BUS",
+"FOREACH_DRIVER_ON_UACCEBUS",
+"FOREACH_DRIVER_ON_VMBUS",
+"FOREACH_SUBDEV",
+"FOREACH_SUBDEV_STATE",
+"HLIST_FOR_EACH_ENTRY",
+"ILIST_FOREACH",
+"LIST_FOR_EACH_ENTRY",
+"LIST_FOR_EACH_ENTRY_SAFE",
+"LIST_FOREACH",
+"LIST_FOREACH_FROM",
+"LIST_FOREACH_FROM_SAFE",
+"LIST_FOREACH_SAFE",
+"ML_AVG_FOREACH_QP",
+"ML_AVG_FOREACH_QP_MVTVM",
+"ML_AVG_RESET_FOREACH_QP",
+"ML_MAX_FOREACH_QP",
+"ML_MAX_FOREACH_QP_MVTVM",
+"ML_MAX_RESET_FOREACH_QP",
+"ML_MIN_FOREACH_QP",
+"ML_MIN_FOREACH_QP_MVTVM",
+"ML_MIN_RESET_FOREACH_QP",
+"MLX5_ETH_FOREACH_DEV",
+"MLX5_IPOOL_FOREACH",
+"MLX5_L3T_FOREACH",
+"OSAL_LIST_FOR_EACH_ENTRY",
+"OSAL_LIST_FOR_EACH_ENTRY_SAFE",
+"PLT_TAILQ_FOREACH_SAFE",
+"RTE_BBDEV_FOREACH",
+"RTE_DEV_FOREACH",
+"RTE_DMA_FOREACH_DEV",
+"RTE_EAL_DEVARGS_FOREACH",
+"RTE_ETH_FOREACH_DEV",
+"RTE_ETH_FOREACH_DEV_OF",
+"RTE_ETH_FOREACH_DEV_OWNED_BY",
+"RTE_ETH_FOREACH_DEV_SIBLING",
+"RTE_ETH_FOREACH_MATCHING_DEV",
+"RTE_ETH_FOREACH_VALID_DEV",
+"RTE_GPU_FOREACH",
+"RTE_GPU_FOREACH_CHILD",
+"RTE_GPU_FOREACH_PARENT",
+"RTE_LCORE_FOREACH",
+"RTE_LCORE_FOREACH_WO

[PATCH v3] hash: separate param checks in hash create func

2024-10-14 Thread Niall Meade
Separate the name, entries and key_len parameter checks in
rte_hash_create(). Make the error messages more informative to help
with debugging. Also add myself to the .mailmap file.

Signed-off-by: Niall Meade 

---
v3:
* code indentation fix and rte_errno set correctly
v2:
* change hash log messages to be one line

I had name set to NULL in the parameters I was passing to
rte_hash_create() and the error message I got didn't specify which
parameter was invalid.
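
For readers skimming the thread, the idea of one check per parameter can be
sketched as below. This is an illustration only: `demo_hash_params`, the
`DEMO_ENTRIES_*` limits and both helpers are hypothetical and are not the
rte_hash API; the check order follows the patch (entries, key_len, name).

```c
#include <errno.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical limits, chosen only for this sketch. */
#define DEMO_ENTRIES_MIN UINT32_C(8)
#define DEMO_ENTRIES_MAX (UINT32_C(1) << 30)

/* Hypothetical parameter struct with the three fields being checked. */
struct demo_hash_params {
	const char *name;
	uint32_t entries;
	uint32_t key_len;
};

/* Return the name of the first invalid field, or NULL if all are valid. */
static const char *
demo_invalid_param(const struct demo_hash_params *p)
{
	if (p == NULL)
		return "params";
	if (p->entries > DEMO_ENTRIES_MAX || p->entries < DEMO_ENTRIES_MIN)
		return "entries";
	if (p->key_len == 0)
		return "key_len";
	if (p->name == NULL)
		return "name";
	return NULL;
}

/* Log which field failed and return -EINVAL, or return 0 when valid. */
static int
demo_check_params(const struct demo_hash_params *p)
{
	const char *bad = demo_invalid_param(p);

	if (bad == NULL)
		return 0;
	fprintf(stderr, "%s(): invalid parameter '%s' (entries range is [%" PRIu32
		", %" PRIu32 "])\n", __func__, bad, DEMO_ENTRIES_MIN,
		DEMO_ENTRIES_MAX);
	return -EINVAL;
}
```

Because each field is tested separately, the log line names the exact
parameter at fault, which is the debugging aid this patch is after.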
---
 .mailmap   |  1 +
 lib/hash/rte_cuckoo_hash.c | 22 ++
 2 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/.mailmap b/.mailmap
index a66da3c8cb..93df2effb2 100644
--- a/.mailmap
+++ b/.mailmap
@@ -1055,6 +1055,7 @@ Nelson Escobar 
 Nemanja Marjanovic 
 Netanel Belgazal 
 Netanel Gonen 
+Niall Meade 
 Niall Power 
 Nicholas Pratte 
 Nick Connolly  
diff --git a/lib/hash/rte_cuckoo_hash.c b/lib/hash/rte_cuckoo_hash.c
index 577b5839d3..9575e8aa0c 100644
--- a/lib/hash/rte_cuckoo_hash.c
+++ b/lib/hash/rte_cuckoo_hash.c
@@ -184,17 +184,24 @@ rte_hash_create(const struct rte_hash_parameters *params)
hash_list = RTE_TAILQ_CAST(rte_hash_tailq.head, rte_hash_list);
 
if (params == NULL) {
+   rte_errno = EINVAL;
HASH_LOG(ERR, "%s has no parameters", __func__);
return NULL;
}
 
/* Check for valid parameters */
if ((params->entries > RTE_HASH_ENTRIES_MAX) ||
-   (params->entries < RTE_HASH_BUCKET_ENTRIES) ||
-   (params->name == NULL) ||
-   (params->key_len == 0)) {
+   (params->entries < RTE_HASH_BUCKET_ENTRIES)) {
+   rte_errno = EINVAL;
+   HASH_LOG(ERR, "%s() entries (%u) must be in range [%d, %d] 
inclusive",
+   __func__, params->entries, RTE_HASH_BUCKET_ENTRIES,
+   RTE_HASH_ENTRIES_MAX);
+   return NULL;
+   }
+
+   if (params->key_len == 0) {
rte_errno = EINVAL;
-   HASH_LOG(ERR, "%s has invalid parameters", __func__);
+   HASH_LOG(ERR, "%s() key_len must be greater than 0", __func__);
return NULL;
}
 
@@ -204,6 +211,13 @@ rte_hash_create(const struct rte_hash_parameters *params)
return NULL;
}
 
+   if (params->name == NULL) {
+   rte_errno = EINVAL;
+   HASH_LOG(ERR, "%s() has invalid parameters, name can't be NULL",
+   __func__);
+   return NULL;
+   }
+
/* Validate correct usage of extra options */
if ((params->extra_flag & RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY) &&
(params->extra_flag & RTE_HASH_EXTRA_FLAGS_RW_CONCURRENCY_LF)) {
-- 
2.34.1



Re: [PATCH] ip_frag: support IPv6 reassembly with extensions

2024-10-14 Thread Vignesh Purushotham Srinivas
-Original Message-
From: Konstantin Ananyev 
To: Stephen Hemminger ,
vignesh.purushotham.srini...@ericsson.com

Cc: konstantin.v.anan...@yandex.ru ,
dev@dpdk.org 
Subject: RE: [PATCH] ip_frag: support IPv6 reassembly with extensions
Date: Tue, 17 Sep 2024 17:57:59 +


> 
> On Mon, 26 Aug 2024 13:23:28 +0200
>  wrote:
> 
> > diff --git a/lib/ip_frag/ip_reassembly.h
> > b/lib/ip_frag/ip_reassembly.h
> > index 54afed5417..429e74f1b3 100644
> > --- a/lib/ip_frag/ip_reassembly.h
> > +++ b/lib/ip_frag/ip_reassembly.h
> > @@ -54,6 +54,8 @@ struct __rte_cache_aligned ip_frag_pkt {
> >     uint32_t total_size;   /* expected reassembled
> > size */
> >     uint32_t frag_size;    /* size of fragments
> > received */
> >     uint32_t last_idx; /* index of next entry
> > to fill */
> > +   uint32_t exts_len; /* length of extension
> > hdrs for first fragment */
> > +   uint8_t *next_proto;   /* pointer of the
> > next_proto field */
> >     struct ip_frag frags[IP_MAX_FRAG_NUM]; /* fragments */
> >  };
> 
> This creates a 32 bit hole in the structure.
> Better to put next_proto after the start field.

Another alternative - use offset within the mbuf instead of pointer.

ACK

> 
> > +
> > +   while (next_proto != IPPROTO_FRAGMENT &&
> > +   num_exts < MAX_NUM_IPV6_EXTS &&
> > +   (next_proto = rte_ipv6_get_next_ext(
> > +   *last_ext, next_proto, &ext_len)) >= 0) {
> 
> I would break up this loop condition for clarity.

+ 1

ACK

> Something like:
> 
>   while (next_proto != IPPROTO_FRAGMENT && num_exts < MAX_NUM_IPV6_EXTS) {
>           next_proto = rte_ipv6_get_next_ext(*last_ext, next_proto, &ext_len);
>           if (next_proto < 0)
>                   break;
> 
> Also, need a new test cases for this.

Agree, that would be a good thing to add.

ACK
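
The loop restructuring suggested above can be sketched in isolation as
follows. This is not the ip_frag code: `demo_get_next_ext()` is a
hypothetical helper written for the sketch, `DEMO_MAX_EXTS` is an arbitrary
cap, and the extra bounds checks are additions that the thread does not
discuss. The header length rule (second byte counts 8-octet units beyond the
first 8) is the standard one for hop-by-hop, routing and destination options
headers.

```c
#include <stddef.h>
#include <stdint.h>

/* IANA protocol numbers for the IPv6 extension headers used here. */
#define DEMO_PROTO_HOPOPTS   0
#define DEMO_PROTO_ROUTING  43
#define DEMO_PROTO_FRAGMENT 44
#define DEMO_PROTO_DSTOPTS  60

#define DEMO_MAX_EXTS 8 /* hypothetical cap on headers walked */

/*
 * Hypothetical helper: given the type of the header at p, return the
 * type of the following header and store this header's length in
 * *ext_len. Returns -1 if proto is not a supported extension header.
 */
static int
demo_get_next_ext(const uint8_t *p, int proto, size_t *ext_len)
{
	switch (proto) {
	case DEMO_PROTO_HOPOPTS:
	case DEMO_PROTO_ROUTING:
	case DEMO_PROTO_DSTOPTS:
		/* Byte 1 is the length in 8-octet units, minus the first 8. */
		*ext_len = ((size_t)p[1] + 1) * 8;
		return p[0];
	default:
		return -1;
	}
}

/*
 * Walk the chain starting at buf (whose first header has type proto)
 * until a fragment header is reached. On success return 0 and store
 * the fragment header's offset in *frag_off; otherwise return -1.
 */
static int
demo_find_frag_hdr(const uint8_t *buf, size_t len, int proto, size_t *frag_off)
{
	size_t off = 0, ext_len = 0;
	unsigned int num_exts = 0;

	while (proto != DEMO_PROTO_FRAGMENT && num_exts < DEMO_MAX_EXTS) {
		int next;

		if (len - off < 2)
			return -1; /* truncated header */
		next = demo_get_next_ext(buf + off, proto, &ext_len);
		if (next < 0)
			return -1; /* not an extension header */
		if (ext_len > len - off)
			return -1; /* header runs past the buffer */

		off += ext_len;
		proto = next;
		num_exts++;
	}

	if (proto != DEMO_PROTO_FRAGMENT)
		return -1; /* gave up after DEMO_MAX_EXTS headers */

	*frag_off = off;
	return 0;
}
```

Keeping the helper call inside the loop body, rather than in the loop
condition, gives each exit path its own line and its own comment.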



[PATCH v3] test: fix option devices

2024-10-14 Thread Mingjin Ye
Without using allow (-a) or block (-b), EAL loads all devices by default.
Unexpected devices may be loaded when running test cases in sub-processes.

This patch fixes the issue by copying the parameters of the master process
if the allow (-a) or block (-b) option is not used when starting the child
process.

Also, EAL does not allow the options allow (-a) and block (-b) to be used
at the same time.

Fixes: b3ce7891ad38 ("test: fix probing in secondary process")
Cc: sta...@dpdk.org

Signed-off-by: Mingjin Ye 
---
v2: The long form of the fix option is "--block".
---
v3: new scheme.
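
The v3 scheme can be summarised by the following sketch. The names are
hypothetical and the parent's device lists are reduced to two counts; like
the patch, it only matches the bare "-a", "--allow", "-b" and "--block"
tokens in the child's arguments.

```c
#include <string.h>

/* Which parent device list, if any, to copy into the child's argv. */
enum demo_forward {
	DEMO_FORWARD_NONE,
	DEMO_FORWARD_ALLOW,
	DEMO_FORWARD_BLOCK,
};

/* Return 1 if the child's own arguments already select devices. */
static int
demo_has_dev_option(const char *const argv[], int numargs)
{
	int i;

	for (i = 0; i < numargs; i++) {
		if (strcmp(argv[i], "-a") == 0 || strcmp(argv[i], "--allow") == 0 ||
		    strcmp(argv[i], "-b") == 0 || strcmp(argv[i], "--block") == 0)
			return 1;
	}
	return 0;
}

/*
 * Decide what to forward. The child's explicit options win; otherwise
 * forward the parent's allow list, or else its block list. The two are
 * never combined because EAL rejects -a and -b used together.
 */
static enum demo_forward
demo_pick_forward(const char *const argv[], int numargs,
		int parent_allow, int parent_block)
{
	if (demo_has_dev_option(argv, numargs))
		return DEMO_FORWARD_NONE;
	if (parent_allow > 0)
		return DEMO_FORWARD_ALLOW;
	if (parent_block > 0)
		return DEMO_FORWARD_BLOCK;
	return DEMO_FORWARD_NONE;
}
```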
---
 app/test/process.h | 58 ++
 1 file changed, 54 insertions(+), 4 deletions(-)

diff --git a/app/test/process.h b/app/test/process.h
index 9fb2bf481c..665abae9dc 100644
--- a/app/test/process.h
+++ b/app/test/process.h
@@ -36,6 +36,7 @@ extern uint16_t flag_for_send_pkts;
 #endif
 
 #define PREFIX_ALLOW "--allow="
+#define PREFIX_BLOCK "--block="
 
 static int
 add_parameter_allow(char **argv, int max_capacity)
@@ -44,7 +45,7 @@ add_parameter_allow(char **argv, int max_capacity)
int count = 0;
 
RTE_EAL_DEVARGS_FOREACH(NULL, devargs) {
-   if (strlen(devargs->name) == 0)
+   if (strlen(devargs->name) == 0 || devargs->type != 
RTE_DEVTYPE_ALLOWED)
continue;
 
if (devargs->data == NULL || strlen(devargs->data) == 0) {
@@ -63,6 +64,32 @@ add_parameter_allow(char **argv, int max_capacity)
return count;
 }
 
+static int
+add_parameter_block(char **argv, int max_capacity)
+{
+   struct rte_devargs *devargs;
+   int count = 0;
+
+   RTE_EAL_DEVARGS_FOREACH(NULL, devargs) {
+   if (strlen(devargs->name) == 0 || devargs->type != 
RTE_DEVTYPE_BLOCKED)
+   continue;
+
+   if (devargs->data == NULL || strlen(devargs->data) == 0) {
+   if (asprintf(&argv[count], PREFIX_BLOCK"%s", 
devargs->name) < 0)
+   break;
+   } else {
+   if (asprintf(&argv[count], PREFIX_BLOCK"%s,%s",
+devargs->name, devargs->data) < 0)
+   break;
+   }
+
+   if (++count == max_capacity)
+   break;
+   }
+
+   return count;
+}
+
 /*
  * launches a second copy of the test process using the given argv parameters,
  * which should include argv[0] as the process name. To identify in the
@@ -74,7 +101,7 @@ process_dup(const char *const argv[], int numargs, const 
char *env_value)
 {
int num = 0;
char **argv_cpy;
-   int allow_num;
+   int allow_num, block_num;
int argv_num;
int i, status;
char path[32];
@@ -89,8 +116,27 @@ process_dup(const char *const argv[], int numargs, const 
char *env_value)
if (pid < 0)
return -1;
else if (pid == 0) {
-   allow_num = rte_devargs_type_count(RTE_DEVTYPE_ALLOWED);
-   argv_num = numargs + allow_num + 1;
+   allow_num = 0;
+   block_num = 0;
+
+   for (i = 0; i < numargs; i++) {
+   if (strcmp(argv[i], "-b") == 0 ||
+   strcmp(argv[i], "--block") == 0)
+   block_num++;
+   if (strcmp(argv[i], "-a") == 0 ||
+   strcmp(argv[i], "--allow") == 0)
+   allow_num++;
+   }
+   /* If block (-b) or allow (-a) is already present, add nothing. */
+   if (!block_num && !allow_num) {
+   allow_num = rte_devargs_type_count(RTE_DEVTYPE_ALLOWED);
+   block_num = rte_devargs_type_count(RTE_DEVTYPE_BLOCKED);
+   } else {
+   allow_num = 0;
+   block_num = 0;
+   }
+
+   argv_num = numargs + allow_num + block_num + 1;
argv_cpy = calloc(argv_num, sizeof(char *));
if (!argv_cpy)
rte_panic("Memory allocation failed\n");
@@ -101,8 +147,12 @@ process_dup(const char *const argv[], int numargs, const 
char *env_value)
if (argv_cpy[i] == NULL)
rte_panic("Error dup args\n");
}
+
+   /* EAL does not allow block (-b) and allow (-a) to be used together. */
if (allow_num > 0)
num = add_parameter_allow(&argv_cpy[i], allow_num);
+   else if (block_num > 0)
+   num = add_parameter_block(&argv_cpy[i], block_num);
num += numargs;
 
 #ifdef RTE_EXEC_ENV_LINUX
-- 
2.25.1



RE: [PATCH v2] test: fix option block

2024-10-14 Thread Ye, MingjinX



> -Original Message-
> From: Stephen Hemminger 
> Sent: Sunday, October 13, 2024 6:21 AM
> To: Ye, MingjinX 
> Cc: dev@dpdk.org; sta...@dpdk.org
> Subject: Re: [PATCH v2] test: fix option block
> 
> On Sat, 12 Oct 2024 09:35:19 +
> Mingjin Ye  wrote:
> 
> > The options allow (-a) and block (-b) cannot be used at the same time.
> > Therefore, allow (-a) will not be added when block (-b) is present.
> >
> > Fixes: b3ce7891ad38 ("test: fix probing in secondary process")
> > Cc: sta...@dpdk.org
> >
> > Signed-off-by: Mingjin Ye 
> > ---
> 
> What is this patch trying to solve?
Solve the issue of allow/block devices being added accidentally.
The v3 patch will be sent.
> 
> Right now starting dpdk-test with both options together causes an error in
> EAL init.
EAL does not support adding both allow (-a) and block (-b) options.

> 
> root@hermes:/home/shemminger/DPDK/main# ./build/app/dpdk-test -a
> ae:00.0 -b 00:1f.6
> EAL: Detected CPU lcores: 8
> EAL: Detected NUMA nodes: 1
> EAL: Options allow (-a) and block (-b) can't be used at the same time
> 
> Usage: ./build/app/dpdk-test [options]
> 
> Therefore it should never get into the process_dup function at all.


Re: [PATCH v3 10/12] baseband/acc: cosmetic changes

2024-10-14 Thread Maxime Coquelin




On 10/9/24 23:13, Hernan Vargas wrote:

Cosmetic code changes.
No functional impact.

Signed-off-by: Hernan Vargas 
---
  drivers/baseband/acc/rte_acc100_pmd.c |  2 +-
  drivers/baseband/acc/rte_vrb_pmd.c| 54 +--
  2 files changed, 36 insertions(+), 20 deletions(-)



Reviewed-by: Maxime Coquelin 



[PATCH v2 00/10] net/ice: base code update for RC2

2024-10-14 Thread Bruce Richardson
A number of small fixes and other changes to enable Tx scheduler
enhancements to our DPDK driver have been added to the base code.
Upstream these changes for 24.11 RC2. Most have previously been
submitted as part of the scheduler changes [1]

[1] https://patches.dpdk.org/project/dpdk/list/?series=32758&state=*

v2: include README file update

Bruce Richardson (7):
  net/ice/base: remove 255 limit on sched child nodes
  net/ice/base: set VSI index on newly created nodes
  net/ice/base: optimize subtree searches
  net/ice/base: remove flag checks before topology upload
  net/ice/base: allow init without TC class sched nodes
  net/ice/base: read VSI layer info from VSI
  net/ice/base: update README

Dave Ertman (1):
  net/ice/base: fix VLAN replay after reset

Fabio Pricoco (1):
  net/ice/base: add bounds check

Jacob Keller (1):
  net/ice/base: re-enable bypass mode for E822

 drivers/net/ice/base/README |   2 +-
 drivers/net/ice/base/ice_controlq.c |  23 +-
 drivers/net/ice/base/ice_dcb.c  |   3 +-
 drivers/net/ice/base/ice_ddp.c  |  33 
 drivers/net/ice/base/ice_ptp_hw.c   | 117 ++--
 drivers/net/ice/base/ice_ptp_hw.h   |   2 +-
 drivers/net/ice/base/ice_sched.c|  59 +++---
 drivers/net/ice/base/ice_switch.c   |   2 -
 drivers/net/ice/base/ice_type.h |   3 +-
 drivers/net/ice/ice_ethdev.c|   2 +-
 10 files changed, 169 insertions(+), 77 deletions(-)

--
2.43.0



[PATCH v2 01/10] net/ice/base: re-enable bypass mode for E822

2024-10-14 Thread Bruce Richardson
From: Jacob Keller 

When removing bypass mode, the code for E822 bypass was completely
removed in error. This code should be maintained in DPDK so re-add the
necessary functions.

Fixes: ce9ad8c5bc6d ("net/ice/base: remove PHY port timer bypass mode")
Cc: sta...@dpdk.org

Signed-off-by: Jacob Keller 
Signed-off-by: Bruce Richardson 
---
 drivers/net/ice/base/ice_ptp_hw.c | 117 --
 drivers/net/ice/base/ice_ptp_hw.h |   2 +-
 drivers/net/ice/ice_ethdev.c  |   2 +-
 3 files changed, 113 insertions(+), 8 deletions(-)

diff --git a/drivers/net/ice/base/ice_ptp_hw.c 
b/drivers/net/ice/base/ice_ptp_hw.c
index 2a112fea12..1e92e5ff21 100644
--- a/drivers/net/ice/base/ice_ptp_hw.c
+++ b/drivers/net/ice/base/ice_ptp_hw.c
@@ -4468,18 +4468,103 @@ ice_stop_phy_timer_e822(struct ice_hw *hw, u8 port, 
bool soft_reset)
return 0;
 }
 
+/**
+ * ice_phy_cfg_fixed_tx_offset_e822 - Configure Tx offset for bypass mode
+ * @hw: pointer to the HW struct
+ * @port: the PHY port to configure
+ *
+ * Calculate and program the fixed Tx offset, and indicate that the offset is
+ * ready. This can be used when operating in bypass mode.
+ */
+static int ice_phy_cfg_fixed_tx_offset_e822(struct ice_hw *hw, u8 port)
+{
+   enum ice_ptp_link_spd link_spd;
+   enum ice_ptp_fec_mode fec_mode;
+   u64 total_offset;
+   int err;
+
+   err = ice_phy_get_speed_and_fec_e822(hw, port, &link_spd, &fec_mode);
+   if (err)
+   return err;
+
+   total_offset = ice_calc_fixed_tx_offset_e822(hw, link_spd);
+
+   /* Program the fixed Tx offset into the P_REG_TOTAL_TX_OFFSET_L
+* register, then indicate that the Tx offset is ready. After this,
+* timestamps will be enabled.
+*
+* Note that this skips including the more precise offsets generated
+* by the Vernier calibration.
+*/
+
+   err = ice_write_64b_phy_reg_e822(hw, port, P_REG_TOTAL_TX_OFFSET_L,
+total_offset);
+   if (err)
+   return err;
+
+   err = ice_write_phy_reg_e822(hw, port, P_REG_TX_OR, 1);
+   if (err)
+   return err;
+
+   return ICE_SUCCESS;
+}
+
+/**
+ * ice_phy_cfg_fixed_rx_offset_e822 - Configure fixed Rx offset for bypass mode
+ * @hw: pointer to the HW struct
+ * @port: the PHY port to configure
+ *
+ * Calculate and program the fixed Rx offset, and indicate that the offset is
+ * ready. This can be used when operating in bypass mode.
+ */
+static int ice_phy_cfg_fixed_rx_offset_e822(struct ice_hw *hw, u8 port)
+{
+   enum ice_ptp_link_spd link_spd;
+   enum ice_ptp_fec_mode fec_mode;
+   u64 total_offset;
+   int err;
+
+   err = ice_phy_get_speed_and_fec_e822(hw, port, &link_spd, &fec_mode);
+   if (err)
+   return err;
+
+   total_offset = ice_calc_fixed_rx_offset_e822(hw, link_spd);
+
+   /* Program the fixed Rx offset into the P_REG_TOTAL_RX_OFFSET_L
+* register, then indicate that the Rx offset is ready. After this,
+* timestamps will be enabled.
+*
+* Note that this skips including the more precise offsets generated
+* by Vernier calibration.
+*/
+   err = ice_write_64b_phy_reg_e822(hw, port, P_REG_TOTAL_RX_OFFSET_L,
+total_offset);
+   if (err)
+   return err;
+
+   err = ice_write_phy_reg_e822(hw, port, P_REG_RX_OR, 1);
+   if (err)
+   return err;
+
+   return ICE_SUCCESS;
+}
+
 /**
  * ice_start_phy_timer_e822 - Start the PHY clock timer
  * @hw: pointer to the HW struct
  * @port: the PHY port to start
+ * @bypass: if true, start the PHY in bypass mode
  *
  * Start the clock of a PHY port. This must be done as part of the flow to
  * re-calibrate Tx and Rx timestamping offsets whenever the clock time is
  * initialized or when link speed changes.
  *
- * Hardware will take Vernier measurements on Tx or Rx of packets.
+ * Bypass mode enables timestamps immediately without waiting for Vernier
+ * calibration to complete. Hardware will still continue taking Vernier
+ * measurements on Tx or Rx of packets, but they will not be applied to
+ * timestamps.
  */
-int ice_start_phy_timer_e822(struct ice_hw *hw, u8 port)
+int ice_start_phy_timer_e822(struct ice_hw *hw, u8 port, bool bypass)
 {
u32 lo, hi, val;
u64 incval;
@@ -4544,15 +4629,35 @@ int ice_start_phy_timer_e822(struct ice_hw *hw, u8 port)
 
ice_ptp_exec_tmr_cmd(hw);
 
+   if (bypass) {
+   /* Enter BYPASS mode, enabling timestamps immediately. */
+   val |= P_REG_PS_BYPASS_MODE_M;
+   err = ice_write_phy_reg_e822(hw, port, P_REG_PS, val);
+   if (err)
+   return err;
+   }
+
val |= P_REG_PS_ENA_CLK_M;
err = ice_write_phy_reg_e822(hw, port, P_REG_PS, val);
if (err)
return err;
 
-   val |= P_REG_PS

[PATCH v2 04/10] net/ice/base: remove 255 limit on sched child nodes

2024-10-14 Thread Bruce Richardson
The Tx scheduler in the ice driver can be configured to have large
numbers of child nodes at a given layer, but the driver code implicitly
limited the number of nodes to 255 by using a u8 datatype for the number
of children. Increase this to a 16-bit value throughout the code.

Signed-off-by: Bruce Richardson 
---
 drivers/net/ice/base/ice_dcb.c   |  3 ++-
 drivers/net/ice/base/ice_sched.c | 23 +--
 drivers/net/ice/base/ice_type.h  |  2 +-
 3 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ice/base/ice_dcb.c b/drivers/net/ice/base/ice_dcb.c
index 4ef54613b1..e97f35b4cf 100644
--- a/drivers/net/ice/base/ice_dcb.c
+++ b/drivers/net/ice/base/ice_dcb.c
@@ -1585,7 +1585,8 @@ ice_update_port_tc_tree_cfg(struct ice_port_info *pi,
struct ice_aqc_txsched_elem_data elem;
u32 teid1, teid2;
int status = 0;
-   u8 i, j;
+   u16 i;
+   u8 j;
 
if (!pi)
return ICE_ERR_PARAM;
diff --git a/drivers/net/ice/base/ice_sched.c b/drivers/net/ice/base/ice_sched.c
index 373c32a518..1d6dd2af82 100644
--- a/drivers/net/ice/base/ice_sched.c
+++ b/drivers/net/ice/base/ice_sched.c
@@ -288,7 +288,7 @@ ice_sched_get_first_node(struct ice_port_info *pi,
  */
 struct ice_sched_node *ice_sched_get_tc_node(struct ice_port_info *pi, u8 tc)
 {
-   u8 i;
+   u16 i;
 
if (!pi || !pi->root)
return NULL;
@@ -311,7 +311,7 @@ void ice_free_sched_node(struct ice_port_info *pi, struct ice_sched_node *node)
 {
struct ice_sched_node *parent;
struct ice_hw *hw = pi->hw;
-   u8 i, j;
+   u16 i, j;
 
/* Free the children before freeing up the parent node
 * The parent array is updated below and that shifts the nodes
@@ -1503,7 +1503,7 @@ ice_sched_get_free_qgrp(struct ice_port_info *pi,
struct ice_sched_node *qgrp_node, u8 owner)
 {
struct ice_sched_node *min_qgrp;
-   u8 min_children;
+   u16 min_children;
 
if (!qgrp_node)
return qgrp_node;
@@ -2063,7 +2063,7 @@ static void ice_sched_rm_agg_vsi_info(struct ice_port_info *pi, u16 vsi_handle)
  */
 static bool ice_sched_is_leaf_node_present(struct ice_sched_node *node)
 {
-   u8 i;
+   u16 i;
 
for (i = 0; i < node->num_children; i++)
if (ice_sched_is_leaf_node_present(node->children[i]))
@@ -2098,7 +2098,7 @@ ice_sched_rm_vsi_cfg(struct ice_port_info *pi, u16 vsi_handle, u8 owner)
 
ice_for_each_traffic_class(i) {
struct ice_sched_node *vsi_node, *tc_node;
-   u8 j = 0;
+   u16 j = 0;
 
tc_node = ice_sched_get_tc_node(pi, i);
if (!tc_node)
@@ -2166,7 +2166,7 @@ int ice_rm_vsi_lan_cfg(struct ice_port_info *pi, u16 vsi_handle)
  */
 bool ice_sched_is_tree_balanced(struct ice_hw *hw, struct ice_sched_node *node)
 {
-   u8 i;
+   u16 i;
 
/* start from the leaf node */
for (i = 0; i < node->num_children; i++)
@@ -2240,7 +2240,8 @@ ice_sched_get_free_vsi_parent(struct ice_hw *hw, struct ice_sched_node *node,
  u16 *num_nodes)
 {
u8 l = node->tx_sched_layer;
-   u8 vsil, i;
+   u8 vsil;
+   u16 i;
 
vsil = ice_sched_get_vsi_layer(hw);
 
@@ -2282,7 +2283,7 @@ ice_sched_update_parent(struct ice_sched_node *new_parent,
struct ice_sched_node *node)
 {
struct ice_sched_node *old_parent;
-   u8 i, j;
+   u16 i, j;
 
old_parent = node->parent;
 
@@ -2382,8 +2383,9 @@ ice_sched_move_vsi_to_agg(struct ice_port_info *pi, u16 vsi_handle, u32 agg_id,
u16 num_nodes[ICE_AQC_TOPO_MAX_LEVEL_NUM] = { 0 };
u32 first_node_teid, vsi_teid;
u16 num_nodes_added;
-   u8 aggl, vsil, i;
+   u8 aggl, vsil;
int status;
+   u16 i;
 
tc_node = ice_sched_get_tc_node(pi, tc);
if (!tc_node)
@@ -2498,7 +2500,8 @@ ice_move_all_vsi_to_dflt_agg(struct ice_port_info *pi,
 static bool
 ice_sched_is_agg_inuse(struct ice_port_info *pi, struct ice_sched_node *node)
 {
-   u8 vsil, i;
+   u8 vsil;
+   u16 i;
 
vsil = ice_sched_get_vsi_layer(pi->hw);
if (node->tx_sched_layer < vsil - 1) {
diff --git a/drivers/net/ice/base/ice_type.h b/drivers/net/ice/base/ice_type.h
index 598a80155b..6177bf4e2a 100644
--- a/drivers/net/ice/base/ice_type.h
+++ b/drivers/net/ice/base/ice_type.h
@@ -1030,9 +1030,9 @@ struct ice_sched_node {
struct ice_aqc_txsched_elem_data info;
u32 agg_id; /* aggregator group ID */
u16 vsi_handle;
+   u16 num_children;
u8 in_use;  /* suspended or in use */
u8 tx_sched_layer;  /* Logical Layer (1-9) */
-   u8 num_children;
u8 tc_num;
u8 owner;
 #define ICE_SCHED_NODE_OWNER_LAN   0
-- 
2.43.0
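
As a standalone illustration of the limit removed by this patch (a minimal
sketch, not driver code; the helper names below are hypothetical), an 8-bit
counter wraps once a node has more than 255 children, while a 16-bit counter
does not:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: count n children with an 8-bit counter, as the
 * old u8 num_children field effectively did. The result is n modulo 256.
 */
static uint8_t count_children_u8(unsigned int n)
{
	uint8_t count = 0;
	unsigned int k;

	for (k = 0; k < n; k++)
		count++;	/* wraps from 255 back to 0 */
	return count;
}

/* Hypothetical helper: the same loop with a 16-bit counter. */
static uint16_t count_children_u16(unsigned int n)
{
	uint16_t count = 0;
	unsigned int k;

	assert(n <= UINT16_MAX);	/* stay within the 16-bit range */
	for (k = 0; k < n; k++)
		count++;
	return count;
}
```

With 300 children the 8-bit version reports 44 (300 modulo 256), which is
why loop indices over children need widening along with the field itself.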



[PATCH v2 02/10] net/ice/base: add bounds check

2024-10-14 Thread Bruce Richardson
From: Fabio Pricoco 

Refactor while loop to add a check that the values read are in the
correct range.

Fixes: 6c1f26be50a2 ("net/ice/base: add control queue information")
Cc: sta...@dpdk.org

Signed-off-by: Fabio Pricoco 
Signed-off-by: Bruce Richardson 
---
 drivers/net/ice/base/ice_controlq.c | 23 +--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ice/base/ice_controlq.c b/drivers/net/ice/base/ice_controlq.c
index af27dc8542..b210495827 100644
--- a/drivers/net/ice/base/ice_controlq.c
+++ b/drivers/net/ice/base/ice_controlq.c
@@ -839,16 +839,35 @@ static u16 ice_clean_sq(struct ice_hw *hw, struct ice_ctl_q_info *cq)
struct ice_ctl_q_ring *sq = &cq->sq;
u16 ntc = sq->next_to_clean;
struct ice_aq_desc *desc;
+   u32 head;
 
desc = ICE_CTL_Q_DESC(*sq, ntc);
 
-   while (rd32(hw, cq->sq.head) != ntc) {
-   ice_debug(hw, ICE_DBG_AQ_MSG, "ntc %d head %d.\n", ntc, rd32(hw, cq->sq.head));
+   head = rd32(hw, sq->head);
+   if (head >= sq->count) {
+   ice_debug(hw, ICE_DBG_AQ_MSG,
+ "Read head value (%d) exceeds allowed range.\n",
+ head);
+   return 0;
+   }
+
+   while (head != ntc) {
+   ice_debug(hw, ICE_DBG_AQ_MSG,
+ "ntc %d head %d.\n",
+ ntc, head);
ice_memset(desc, 0, sizeof(*desc), ICE_DMA_MEM);
ntc++;
if (ntc == sq->count)
ntc = 0;
desc = ICE_CTL_Q_DESC(*sq, ntc);
+
+   head = rd32(hw, sq->head);
+   if (head >= sq->count) {
+   ice_debug(hw, ICE_DBG_AQ_MSG,
+ "Read head value (%d) exceeds allowed range.\n",
+ head);
+   return 0;
+   }
}
 
sq->next_to_clean = ntc;
-- 
2.43.0
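
The shape of the refactored loop can be sketched outside the driver as
follows. This is a simplified model under stated assumptions: struct
fake_ring and fake_read_head() are hypothetical stand-ins for the control
queue ring and the rd32() register read, and the return value here is the
number of descriptors cleaned, which is not what ice_clean_sq() returns.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical ring model; head_reads holds the values that successive
 * reads of the head register will return.
 */
struct fake_ring {
	uint16_t count;			/* number of descriptors in the ring */
	uint16_t next_to_clean;
	const uint32_t *head_reads;
	unsigned int n_reads;
	unsigned int read_idx;
};

/* Stand-in for a register read; repeats the last value once exhausted. */
static uint32_t fake_read_head(struct fake_ring *r)
{
	uint32_t v;

	assert(r->n_reads > 0);
	v = r->head_reads[r->read_idx];
	if (r->read_idx + 1 < r->n_reads)
		r->read_idx++;
	return v;
}

/* Clean descriptors up to head. Every head value is range-checked before
 * use; on an out-of-range read, return 0 and leave next_to_clean untouched.
 */
static uint16_t fake_clean_ring(struct fake_ring *r)
{
	uint16_t ntc = r->next_to_clean;
	uint16_t cleaned = 0;
	uint32_t head;

	head = fake_read_head(r);
	if (head >= r->count)
		return 0;

	while (head != ntc) {
		cleaned++;
		ntc++;
		if (ntc == r->count)
			ntc = 0;

		head = fake_read_head(r);
		if (head >= r->count)
			return 0;
	}

	r->next_to_clean = ntc;
	return cleaned;
}
```

Reading the register once per iteration into a local variable means the
value that is checked is the same value that is compared and logged.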



[PATCH v2 03/10] net/ice/base: fix VLAN replay after reset

2024-10-14 Thread Bruce Richardson
From: Dave Ertman 

If there is more than one VLAN defined when any reset that affects the
PF is initiated, after the reset rebuild, no traffic will pass on any
VLAN but the last one created.

This is caused by each iteration through the VLANs during replay
clearing the bit of the VSI being replayed in the vsi_map bitmap.  The
problem is that during the replay, the pointer to the vsi_map bitmap is
used by each successive VLAN to determine if it should be replayed on
this VSI.

The logic was that the replay of the VLAN would replace the bit in the
map before the next VLAN would iterate through.  But, since the replay
copies the old bitmap pointer to filt_replay_rules and creates a new one
for the recreated VLANs, it does not do this, and leaves the old bitmap
broken to be used to replay the remaining VLANs.

Since the old bitmap will be cleaned up in post replay cleanup, there is
no need to alter it and break following VLAN replay, so don't clear the
bit.

Fixes: c7dd15931183 ("net/ice/base: add virtual switch code")
Cc: sta...@dpdk.org

Signed-off-by: Dave Ertman 
Signed-off-by: Jacob Keller 
Signed-off-by: Bruce Richardson 
---
 drivers/net/ice/base/ice_switch.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/net/ice/base/ice_switch.c b/drivers/net/ice/base/ice_switch.c
index 96ef26d535..a3786961e6 100644
--- a/drivers/net/ice/base/ice_switch.c
+++ b/drivers/net/ice/base/ice_switch.c
@@ -10110,8 +10110,6 @@ ice_replay_vsi_fltr(struct ice_hw *hw, struct ice_port_info *pi,
if (!itr->vsi_list_info ||
!ice_is_bit_set(itr->vsi_list_info->vsi_map, vsi_handle))
continue;
-   /* Clearing it so that the logic can add it back */
-   ice_clear_bit(vsi_handle, itr->vsi_list_info->vsi_map);
f_entry.fltr_info.vsi_handle = vsi_handle;
f_entry.fltr_info.fltr_act = ICE_FWD_TO_VSI;
/* update the src in case it is VSI num */
-- 
2.43.0
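
The failure mode described above can be modelled in a few lines. This is a
deliberately simplified sketch: it assumes a single membership bitmap shared
by all VLAN filters, the function name is hypothetical, and the order in
which the driver visits VLANs is outside its scope.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: every VLAN filter consults the same bitmap to decide
 * whether it should be replayed on the given VSI. Returns how many of the
 * n_vlans filters were replayed.
 */
static unsigned int demo_replay_vlans(uint32_t *vsi_map, unsigned int vsi,
				      unsigned int n_vlans, bool clear_bit)
{
	unsigned int replayed = 0;
	unsigned int v;
	uint32_t bit;

	assert(vsi < 32);
	bit = UINT32_C(1) << vsi;

	for (v = 0; v < n_vlans; v++) {
		if (!(*vsi_map & bit))
			continue;	/* VSI no longer marked as a member */
		if (clear_bit)
			*vsi_map &= ~bit;	/* old behaviour */
		replayed++;
	}
	return replayed;
}
```

With clear_bit set, only one of three VLANs is replayed in this model;
without it, all three are, and the shared bitmap is left intact for the
later cleanup.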



Re: [PATCH v3 06/12] baseband/acc: enhance SW ring alignment

2024-10-14 Thread Maxime Coquelin




On 10/9/24 23:12, Hernan Vargas wrote:

Calculate the aligned total size required for queue rings, ensuring that
the size is a power of two for proper memory allocation.

Signed-off-by: Hernan Vargas 
---
  drivers/baseband/acc/acc_common.h | 7 ---
  1 file changed, 4 insertions(+), 3 deletions(-)



Reviewed-by: Maxime Coquelin 



[PATCH v2 08/10] net/ice/base: allow init without TC class sched nodes

2024-10-14 Thread Bruce Richardson
If DCB support is disabled via the DDP image, there will not be any traffic
class (TC) nodes in the scheduler tree immediately above the root level.
To allow the driver to work in this scenario, allow use of the root
node as a dummy TC0 node when there are no TC nodes in the
tree. For any TC other than 0 (the default used by the
driver), the existing behaviour of returning a NULL pointer is maintained.

Signed-off-by: Bruce Richardson 
---
 drivers/net/ice/base/ice_sched.c | 8 +++-
 drivers/net/ice/base/ice_type.h  | 1 +
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ice/base/ice_sched.c b/drivers/net/ice/base/ice_sched.c
index 4c5c19daf3..7e255c0337 100644
--- a/drivers/net/ice/base/ice_sched.c
+++ b/drivers/net/ice/base/ice_sched.c
@@ -293,6 +293,10 @@ struct ice_sched_node *ice_sched_get_tc_node(struct ice_port_info *pi, u8 tc)
 
if (!pi || !pi->root)
return NULL;
+   /* if no TC nodes, use root as TC node 0 */
+   if (!pi->has_tc)
+   return tc == 0 ? pi->root : NULL;
+
for (i = 0; i < pi->root->num_children; i++)
if (pi->root->children[i]->tc_num == tc)
return pi->root->children[i];
@@ -1306,7 +1310,9 @@ int ice_sched_init_port(struct ice_port_info *pi)
if (buf[0].generic[j].data.elem_type ==
ICE_AQC_ELEM_TYPE_ENTRY_POINT)
hw->sw_entry_point_layer = j;
-
+   else if (buf[0].generic[j].data.elem_type ==
+   ICE_AQC_ELEM_TYPE_TC)
+   pi->has_tc = 1;
status = ice_sched_add_node(pi, j, &buf[i].generic[j], NULL);
if (status)
goto err_init_port;
diff --git a/drivers/net/ice/base/ice_type.h b/drivers/net/ice/base/ice_type.h
index 6177bf4e2a..35f832eb9f 100644
--- a/drivers/net/ice/base/ice_type.h
+++ b/drivers/net/ice/base/ice_type.h
@@ -1260,6 +1260,7 @@ struct ice_port_info {
struct ice_qos_cfg qos_cfg;
u8 is_vf:1;
u8 is_custom_tx_enabled:1;
+   u8 has_tc:1;
 };
 
 struct ice_switch_info {
-- 
2.43.0
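
The lookup behaviour introduced by this patch can be exercised in isolation
with a small model. The demo_* types and function below are hypothetical
simplifications of ice_port_info, ice_sched_node and
ice_sched_get_tc_node(), kept only to show the fallback logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct demo_sched_node {
	uint8_t tc_num;
	uint16_t num_children;
	struct demo_sched_node **children;
};

struct demo_port {
	struct demo_sched_node *root;
	bool has_tc;	/* set when the topology contains TC nodes */
};

/* Return the node for traffic class tc, or NULL if there is none. When the
 * tree has no TC nodes at all, the root doubles as TC 0.
 */
static struct demo_sched_node *
demo_get_tc_node(struct demo_port *pi, uint8_t tc)
{
	uint16_t i;

	if (pi == NULL || pi->root == NULL)
		return NULL;

	if (!pi->has_tc)
		return tc == 0 ? pi->root : NULL;

	/* sanity check on the model: children must exist if counted */
	assert(pi->root->num_children == 0 || pi->root->children != NULL);

	for (i = 0; i < pi->root->num_children; i++)
		if (pi->root->children[i]->tc_num == tc)
			return pi->root->children[i];
	return NULL;
}
```

Callers that only ever use TC 0 keep working on a tree without TC nodes,
while requests for any other TC still fail as before.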



[PATCH v2 09/10] net/ice/base: read VSI layer info from VSI

2024-10-14 Thread Bruce Richardson
Rather than computing the layer of the VSI from the number of HW layers,
we can instead just read that info from the VSI node itself. This allows
the layer to be changed at runtime.

Signed-off-by: Bruce Richardson 
---
 drivers/net/ice/base/ice_sched.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ice/base/ice_sched.c b/drivers/net/ice/base/ice_sched.c
index 7e255c0337..9608ac7c24 100644
--- a/drivers/net/ice/base/ice_sched.c
+++ b/drivers/net/ice/base/ice_sched.c
@@ -1550,7 +1550,6 @@ ice_sched_get_free_qparent(struct ice_port_info *pi, u16 vsi_handle, u8 tc,
u16 max_children;
 
qgrp_layer = ice_sched_get_qgrp_layer(pi->hw);
-   vsi_layer = ice_sched_get_vsi_layer(pi->hw);
max_children = pi->hw->max_children[qgrp_layer];
 
vsi_ctx = ice_get_vsi_ctx(pi->hw, vsi_handle);
@@ -1560,6 +1559,7 @@ ice_sched_get_free_qparent(struct ice_port_info *pi, u16 vsi_handle, u8 tc,
/* validate invalid VSI ID */
if (!vsi_node)
return NULL;
+   vsi_layer = vsi_node->tx_sched_layer;
 
/* If the queue group and vsi layer are same then queues
 * are all attached directly to VSI
-- 
2.43.0



Re: [PATCH v3 11/12] baseband/acc: rte free refactor

2024-10-14 Thread Maxime Coquelin

I would rename the title to:

"baseband/acc: refactor resources freeing"

I can fix while applying.

On 10/9/24 23:13, Hernan Vargas wrote:

Refactor to explicitly set pointer to NULL after free to avoid double
free.

Signed-off-by: Hernan Vargas 
---
  drivers/baseband/acc/rte_acc100_pmd.c | 23 +++--
  drivers/baseband/acc/rte_vrb_pmd.c| 48 +++
  2 files changed, 39 insertions(+), 32 deletions(-)



Reviewed-by: Maxime Coquelin 

Thanks,
Maxime



[PATCH v2 05/10] net/ice/base: set VSI index on newly created nodes

2024-10-14 Thread Bruce Richardson
The ice_sched_node type has a field for the VSI to which the node
belongs. This field was not being set in "ice_sched_add_node", so add
a line setting this field for each node from its parent node.
Similarly, when searching for a qgroup node, we can check for each node
that the VSI information is correct.

Signed-off-by: Bruce Richardson 
---
 drivers/net/ice/base/ice_sched.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ice/base/ice_sched.c b/drivers/net/ice/base/ice_sched.c
index 1d6dd2af82..45934f9152 100644
--- a/drivers/net/ice/base/ice_sched.c
+++ b/drivers/net/ice/base/ice_sched.c
@@ -200,6 +200,7 @@ ice_sched_add_node(struct ice_port_info *pi, u8 layer,
node->in_use = true;
node->parent = parent;
node->tx_sched_layer = layer;
+   node->vsi_handle = parent->vsi_handle;
parent->children[parent->num_children++] = node;
node->info = elem;
return 0;
@@ -1575,7 +1576,7 @@ ice_sched_get_free_qparent(struct ice_port_info *pi, u16 vsi_handle, u8 tc,
/* make sure the qgroup node is part of the VSI subtree */
if (ice_sched_find_node_in_subtree(pi->hw, vsi_node, qgrp_node))
if (qgrp_node->num_children < max_children &&
-   qgrp_node->owner == owner)
+   qgrp_node->owner == owner && qgrp_node->vsi_handle == vsi_handle)
break;
qgrp_node = qgrp_node->sibling;
}
-- 
2.43.0



[PATCH v2 07/10] net/ice/base: remove flag checks before topology upload

2024-10-14 Thread Bruce Richardson
DPDK should support more than just 9-level or 5-level topologies, so
remove the checks for those particular settings.

Signed-off-by: Bruce Richardson 
---
 drivers/net/ice/base/ice_ddp.c | 33 -
 1 file changed, 33 deletions(-)

diff --git a/drivers/net/ice/base/ice_ddp.c b/drivers/net/ice/base/ice_ddp.c
index 90aaa6b331..c17a58eab8 100644
--- a/drivers/net/ice/base/ice_ddp.c
+++ b/drivers/net/ice/base/ice_ddp.c
@@ -2384,38 +2384,6 @@ int ice_cfg_tx_topo(struct ice_hw *hw, u8 *buf, u32 len)
return status;
}
 
-   /* Is default topology already applied ? */
-   if (!(flags & ICE_AQC_TX_TOPO_FLAGS_LOAD_NEW) &&
-   hw->num_tx_sched_layers == 9) {
-   ice_debug(hw, ICE_DBG_INIT, "Loaded default topology\n");
-   /* Already default topology is loaded */
-   return ICE_ERR_ALREADY_EXISTS;
-   }
-
-   /* Is new topology already applied ? */
-   if ((flags & ICE_AQC_TX_TOPO_FLAGS_LOAD_NEW) &&
-   hw->num_tx_sched_layers == 5) {
-   ice_debug(hw, ICE_DBG_INIT, "Loaded new topology\n");
-   /* Already new topology is loaded */
-   return ICE_ERR_ALREADY_EXISTS;
-   }
-
-   /* Is set topology issued already ? */
-   if (flags & ICE_AQC_TX_TOPO_FLAGS_ISSUED) {
-   ice_debug(hw, ICE_DBG_INIT, "Update tx topology was done by another PF\n");
-   /* add a small delay before exiting */
-   for (i = 0; i < 20; i++)
-   ice_msec_delay(100, true);
-   return ICE_ERR_ALREADY_EXISTS;
-   }
-
-   /* Change the topology from new to default (5 to 9) */
-   if (!(flags & ICE_AQC_TX_TOPO_FLAGS_LOAD_NEW) &&
-   hw->num_tx_sched_layers == 5) {
-   ice_debug(hw, ICE_DBG_INIT, "Change topology from 5 to 9 layers\n");
-   goto update_topo;
-   }
-
pkg_hdr = (struct ice_pkg_hdr *)buf;
state = ice_verify_pkg(pkg_hdr, len);
if (state) {
@@ -2462,7 +2430,6 @@ int ice_cfg_tx_topo(struct ice_hw *hw, u8 *buf, u32 len)
/* Get the new topology buffer */
new_topo = ((u8 *)section) + offset;
 
-update_topo:
/* acquire global lock to make sure that set topology issued
 * by one PF
 */
-- 
2.43.0



[PATCH v2 06/10] net/ice/base: optimize subtree searches

2024-10-14 Thread Bruce Richardson
In a number of places throughout the driver code, we want to confirm
that a scheduler node is indeed a child of another node. Currently, this
is confirmed by searching down the tree from the base until the desired
node is hit, a search which may hit many irrelevant tree nodes when
recursing down wrong branches. By switching the direction of search, to
check upwards from the node to the parent, we can avoid any incorrect
paths, and so speed up processing.

Signed-off-by: Bruce Richardson 
---
 drivers/net/ice/base/ice_sched.c | 23 +++
 1 file changed, 7 insertions(+), 16 deletions(-)

diff --git a/drivers/net/ice/base/ice_sched.c b/drivers/net/ice/base/ice_sched.c
index 45934f9152..4c5c19daf3 100644
--- a/drivers/net/ice/base/ice_sched.c
+++ b/drivers/net/ice/base/ice_sched.c
@@ -1464,25 +1464,16 @@ void ice_sched_get_psm_clk_freq(struct ice_hw *hw)
  * subtree or not
  */
 bool
-ice_sched_find_node_in_subtree(struct ice_hw *hw, struct ice_sched_node *base,
+ice_sched_find_node_in_subtree(struct ice_hw __ALWAYS_UNUSED *hw,
+  struct ice_sched_node *base,
   struct ice_sched_node *node)
 {
-   u8 i;
-
-   for (i = 0; i < base->num_children; i++) {
-   struct ice_sched_node *child = base->children[i];
-
-   if (node == child)
-   return true;
-
-   if (child->tx_sched_layer > node->tx_sched_layer)
-   return false;
-
-   /* this recursion is intentional, and wouldn't
-* go more than 8 calls
-*/
-   if (ice_sched_find_node_in_subtree(hw, child, node))
+   if (base == node)
+   return true;
+   while (node->tx_sched_layer != 0 && node->parent != NULL) {
+   if (node->parent == base)
return true;
+   node = node->parent;
}
return false;
 }
-- 
2.43.0
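
The upward search can be shown with a minimal tree that carries only parent
pointers. This is a sketch of the general technique, not the driver
function: struct demo_node and demo_node_in_subtree() are hypothetical, and
the loop simply stops at a NULL parent rather than also checking the layer
number.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_node {
	struct demo_node *parent;	/* NULL for the root */
};

/* Return true if node is base itself or a descendant of base. Walking up
 * from node visits at most one node per tree level, instead of descending
 * into every branch below base.
 */
static bool demo_node_in_subtree(const struct demo_node *base,
				 const struct demo_node *node)
{
	assert(base != NULL && node != NULL);

	for (; node != NULL; node = node->parent)
		if (node == base)
			return true;
	return false;
}
```

The cost is bounded by the depth of the tree, independent of how many
children each node has.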



[PATCH v2 10/10] net/ice/base: update README

2024-10-14 Thread Bruce Richardson
Update the README file with the date of the latest base code snapshot.

Signed-off-by: Bruce Richardson 
---
 drivers/net/ice/base/README | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ice/base/README b/drivers/net/ice/base/README
index 3c2dc43856..c32e530789 100644
--- a/drivers/net/ice/base/README
+++ b/drivers/net/ice/base/README
@@ -6,7 +6,7 @@ Intel® ICE driver
 ==
 
 This directory contains source code of ice base driver generated on
-2024-08-19 released by the team which develops
+2024-10-11 released by the team which develops
 basic drivers for any ice NIC. The directory of base/ contains the
 original source package.
 This driver is valid for the product(s) listed below
-- 
2.43.0



Re: [PATCH v3 12/12] baseband/acc: clean up of VRB1 capabilities

2024-10-14 Thread Maxime Coquelin




On 10/9/24 23:13, Hernan Vargas wrote:

The interrupt support was defeatured on the VRB1 device.

Signed-off-by: Hernan Vargas 
---
  doc/guides/bbdevs/vrb1.rst | 3 ---
  drivers/baseband/acc/rte_vrb_pmd.c | 8 ++--
  2 files changed, 2 insertions(+), 9 deletions(-)




Reviewed-by: Maxime Coquelin 

Maxime



[PATCH] common/cnxk: allow enabling IOVA field in mbuf

2024-10-14 Thread Shijith Thotton
RTE_IOVA_IN_MBUF was always disabled on cnxk platforms, as IOVA
in the mbuf is not required. This change modifies that behavior,
allowing RTE_IOVA_IN_MBUF to be enabled if the build option
-Denable_iova_as_pa=true is explicitly specified.

Signed-off-by: Shijith Thotton 
---
 config/arm/meson.build  | 8 ++--
 drivers/common/cnxk/meson.build | 9 +
 2 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/config/arm/meson.build b/config/arm/meson.build
index 012935d5d7..ca54524376 100644
--- a/config/arm/meson.build
+++ b/config/arm/meson.build
@@ -439,10 +439,7 @@ soc_cn9k = {
 'description': 'Marvell OCTEON 9',
 'implementer': '0x43',
 'part_number': '0xb2',
-'numa': false,
-'flags': [
-['RTE_IOVA_IN_MBUF', 0]
-]
+'numa': false
 }
 
 soc_cn10k = {
@@ -451,8 +448,7 @@ soc_cn10k = {
 'flags': [
 ['RTE_MAX_LCORE', 24],
 ['RTE_MAX_NUMA_NODES', 1],
-['RTE_MEMPOOL_ALIGN', 128],
-['RTE_IOVA_IN_MBUF', 0]
+['RTE_MEMPOOL_ALIGN', 128]
 ],
 'part_number': '0xd49',
 'extra_march_features': ['crypto'],
diff --git a/drivers/common/cnxk/meson.build b/drivers/common/cnxk/meson.build
index dc2ddf1f20..bba780e750 100644
--- a/drivers/common/cnxk/meson.build
+++ b/drivers/common/cnxk/meson.build
@@ -108,4 +108,13 @@ deps += ['bus_pci', 'net', 'telemetry']
 
 require_iova_in_mbuf = false
 
+cnxk_socs = ['cn9k', 'cn10k', 'cn20k']
+
+# Enable RTE_IOVA_IN_MBUF only if enable_iova_as_pa is set explicitly, else disable it
+if meson.version().version_compare('>=1.1.0')
+if '-Denable_iova_as_pa' not in meson.build_options() and soc_type in cnxk_socs
+dpdk_conf.set10('RTE_IOVA_IN_MBUF', false)
+endif
+endif
+
 annotate_locks = false
-- 
2.25.1



RE: [EXTERNAL] Re: [RFC PATCH 0/3] add feature arc in rte_graph

2024-10-14 Thread Nitin Saxena
Hi David and all,

>> I see no non-RFC series following this original submission.
>> It will slip to next release unless there is an objection.

I pushed a non-RFC patch series before the -rc1 date (11th Oct).
This patch series contains an ABI change:
https://patches.dpdk.org/project/dpdk/patch/20241010133111.2764712-3-nsax...@marvell.com/
Could you help merge this series in rc2? Otherwise it has to wait for the next LTS.

Thanks,
Nitin

> -Original Message-
> From: David Marchand 
> Sent: Tuesday, October 8, 2024 1:34 PM
> To: Nitin Saxena 
> Cc: Jerin Jacob ; Kiran Kumar Kokkilagadda
> ; Nithin Kumar Dabilpuram
> ; Zhirun Yan ;
> dev@dpdk.org; Nitin Saxena ; Robin Jarry
> ; Christophe Fontaine 
> Subject: [EXTERNAL] Re: [RFC PATCH 0/3] add feature arc in rte_graph
> 
> Hi graph guys,
> 
> On Sat, Sep 7, 2024 at 9:31 AM Nitin Saxena  wrote:
> >
> > Feature arc represents an ordered list of features/protocols at a
> > given networking layer. It is a high level abstraction to connect
> > various rte_graph nodes, as feature nodes, and allow packets steering
> > across these nodes in a generic manner.
> >
> > Features (or feature nodes) are nodes which perform partial or
> > complete handling of a protocol in fast path, like the ipv4-rewrite node,
> > which adds rewrite data to an outgoing IPv4 packet.
> >
> > However in above example, outgoing interface(say "eth0") may have
> > outbound IPsec policy enabled, hence packets must be steered from
> > ipv4-rewrite node to ipsec-outbound-policy node for outbound IPsec
> > policy lookup. On the other hand, packets routed to another interface
> > (eth1) will not be sent to ipsec-outbound-policy node as IPsec feature
> > is disabled on eth1. Feature-arc allows rte_graph applications to
> > manage such constraints easily
> >
> > Feature arc abstraction allows rte_graph based application to
> >
> > 1. Seamlessly steer packets across feature nodes based on whether
> > feature is enabled or disabled on an interface. Features enabled on
> > one interface may not be enabled on another interface within the same
> > feature arc.
> >
> > 2. Allow enabling/disabling of features on an interface at runtime, so
> > that if a feature is disabled, packets associated with that interface
> > won't be steered to corresponding feature node.
> >
> > 3. Provides mechanism to hook custom/user-defined nodes to a feature
> > node and allow packet steering from feature node to custom node
> > without changing former's fast path function
> >
> > 4. Allow expressing features in a particular sequential order so that
> > packets are steered in an ordered way across nodes in fast path. For
> > eg: if IPsec and IPv4 features are enabled on an ingress interface,
> > packets must be sent to IPsec inbound policy node first and then to
> > ipv4 lookup node.
> >
> > This patch series adds feature arc library in rte_graph and also adds
> > "ipv4-output" feature arc handling in "ipv4-rewrite" node.
> >
> > Nitin Saxena (3):
> >   graph: add feature arc support
> >   graph: add feature arc option in graph create
> >   graph: add IPv4 output feature arc
> >
> >  lib/graph/graph.c|   1 +
> >  lib/graph/graph_feature_arc.c| 959 +++
> >  lib/graph/graph_populate.c   |   7 +-
> >  lib/graph/graph_private.h|   3 +
> >  lib/graph/meson.build|   2 +
> >  lib/graph/node.c |   2 +
> >  lib/graph/rte_graph.h|   3 +
> >  lib/graph/rte_graph_feature_arc.h| 373 +
> >  lib/graph/rte_graph_feature_arc_worker.h | 548 +
> >  lib/graph/version.map|  17 +
> >  lib/node/ip4_rewrite.c   | 476 ---
> >  lib/node/ip4_rewrite_priv.h  |   9 +-
> >  lib/node/node_private.h  |  19 +-
> >  lib/node/rte_node_ip4_api.h  |   3 +
> >  14 files changed, 2325 insertions(+), 97 deletions(-)
> >  create mode 100644 lib/graph/graph_feature_arc.c
> >  create mode 100644 lib/graph/rte_graph_feature_arc.h
> >  create mode 100644 lib/graph/rte_graph_feature_arc_worker.h
> 
> I see no non-RFC series following this original submission.
> It will slip to next release unless there is an objection.
> 
> Btw, I suggest copying Robin (and Christophe) for graph related changes.
> 
> 
> --
> David Marchand



Re: [PATCH] net: improve vlan header type alignment

2024-10-14 Thread Bruce Richardson
On Sun, Oct 13, 2024 at 08:35:54AM +, Morten Brørup wrote:
> Ethernet packets can be VLAN tagged, i.e. an Ethernet header can have a
> VLAN tag (a.k.a. VLAN header) embedded.
> Since the Ethernet header is 2 byte aligned, and the VLAN tag is directly
> related to the Ethernet header, the VLAN tag is also 2 byte aligned, so
> packing the VLAN tag structure is not necessary.
> 
> Furthermore, the Ethernet header type is implicitly 2 byte aligned, so
> removed the superfluous explicit 2 byte alignment.
> 
> Added static_asserts to verify the size and alignment of the various
> Ethernet types.
> 
> Signed-off-by: Morten Brørup 
> ---
>  lib/net/rte_ether.h | 21 +++--
>  1 file changed, 19 insertions(+), 2 deletions(-)
> 
Acked-by: Bruce Richardson 
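
For readers unfamiliar with the technique, here is a minimal C11 sketch of
compile-time size and alignment checks of the kind the patch describes. The
demo_* structures are simplified stand-ins written for this illustration,
not the definitions in lib/net/rte_ether.h, and the checks assume the usual
ABI where uint16_t has 2 byte alignment.

```c
#include <assert.h>	/* static_assert */
#include <stdalign.h>	/* alignas, alignof */
#include <stdint.h>

/* Hypothetical MAC address type, explicitly 2 byte aligned. */
struct demo_ether_addr {
	alignas(2) uint8_t addr_bytes[6];
};

/* Hypothetical Ethernet header: inherits 2 byte alignment from members. */
struct demo_ether_hdr {
	struct demo_ether_addr dst_addr;
	struct demo_ether_addr src_addr;
	uint16_t ether_type;
};

/* Hypothetical VLAN tag: two 16-bit fields, so no packing is needed. */
struct demo_vlan_hdr {
	uint16_t vlan_tci;
	uint16_t eth_proto;
};

static_assert(sizeof(struct demo_ether_addr) == 6, "ether addr size");
static_assert(alignof(struct demo_ether_addr) == 2, "ether addr alignment");
static_assert(sizeof(struct demo_ether_hdr) == 14, "ether hdr size");
static_assert(alignof(struct demo_ether_hdr) == 2, "ether hdr alignment");
static_assert(sizeof(struct demo_vlan_hdr) == 4, "vlan hdr size");
static_assert(alignof(struct demo_vlan_hdr) == 2, "vlan hdr alignment");
```

If a later change added padding or altered alignment, the build would fail
at the corresponding assertion instead of misparsing packets at runtime.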


[PATCH v2 2/4] net/mlx5: fix real time counter reading from PCI BAR

2024-10-14 Thread Viacheslav Ovsiienko
From: Tim Martin 

There is the mlx5_txpp_read_clock() routine reading
the 64-bit real time counter from the device PCI BAR.
It introduced two issues:

  - it checks the PCI BAR mapping into process address
space and tries to map this on demand. This might be
problematic if something goes wrong and mapping fails.
It happens on every read_clock API call, invokes kernel
taking a long time and causing application malfunction.

  - the 64-bit counter should be read in single atomic
transaction

Fixes: 9b31fc9007f9 ("net/mlx5: fix read device clock in real time mode")
Cc: sta...@dpdk.org

Signed-off-by: Tim Martin 
Acked-by: Viacheslav Ovsiienko 
---
 .mailmap |  1 +
 drivers/net/mlx5/mlx5.c  |  4 
 drivers/net/mlx5/mlx5_tx.h   | 34 +-
 drivers/net/mlx5/mlx5_txpp.c | 11 ++-
 4 files changed, 40 insertions(+), 10 deletions(-)

diff --git a/.mailmap b/.mailmap
index b30d993f3b..3065211c0a 100644
--- a/.mailmap
+++ b/.mailmap
@@ -1505,6 +1505,7 @@ Timmons C. Player 
 Timothy McDaniel 
 Timothy Miskell 
 Timothy Redaelli 
+Tim Martin 
 Tim Shearer 
 Ting-Kai Ku 
 Ting Xu 
diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
index e36fa651a1..52b90e6ff3 100644
--- a/drivers/net/mlx5/mlx5.c
+++ b/drivers/net/mlx5/mlx5.c
@@ -2242,6 +2242,7 @@ int
 mlx5_proc_priv_init(struct rte_eth_dev *dev)
 {
struct mlx5_priv *priv = dev->data->dev_private;
+   struct mlx5_dev_ctx_shared *sh = priv->sh;
struct mlx5_proc_priv *ppriv;
size_t ppriv_size;
 
@@ -2262,6 +2263,9 @@ mlx5_proc_priv_init(struct rte_eth_dev *dev)
dev->process_private = ppriv;
if (rte_eal_process_type() == RTE_PROC_PRIMARY)
priv->sh->pppriv = ppriv;
+   /* Check and try to map HCA PCI BAR to allow reading real time. */
+   if (sh->dev_cap.rt_timestamp && mlx5_dev_is_pci(dev->device))
+   mlx5_txpp_map_hca_bar(dev);
return 0;
 }
 
diff --git a/drivers/net/mlx5/mlx5_tx.h b/drivers/net/mlx5/mlx5_tx.h
index 983913faa2..587e6a9f7d 100644
--- a/drivers/net/mlx5/mlx5_tx.h
+++ b/drivers/net/mlx5/mlx5_tx.h
@@ -372,6 +372,38 @@ mlx5_txpp_convert_tx_ts(struct mlx5_dev_ctx_shared *sh, uint64_t mts)
return ci;
 }
 
+/**
+ * Read real time clock counter directly from the device PCI BAR area.
+ * The PCI BAR must be mapped to the process memory space at initialization.
+ *
+ * @param dev
+ *   Device to read clock counter from
+ *
+ * @return
+ *   0 - if HCA BAR is not supported or not mapped.
+ *   !=0 - read 64-bit value of real-time in UTC format (nanoseconds)
+ */
+static __rte_always_inline uint64_t mlx5_read_pcibar_clock(struct rte_eth_dev *dev)
+{
+   struct mlx5_proc_priv *ppriv = dev->process_private;
+
+   if (ppriv && ppriv->hca_bar) {
+   struct mlx5_priv *priv = dev->data->dev_private;
+   struct mlx5_dev_ctx_shared *sh = priv->sh;
+   uint64_t *hca_ptr = (uint64_t *)(ppriv->hca_bar) +
+ __mlx5_64_off(initial_seg, real_time);
+   uint64_t __rte_atomic *ts_addr;
+   uint64_t ts;
+
+   ts_addr = (uint64_t __rte_atomic *)hca_ptr;
+   ts = rte_atomic_load_explicit(ts_addr, rte_memory_order_seq_cst);
+   ts = rte_be_to_cpu_64(ts);
+   ts = mlx5_txpp_convert_rx_ts(sh, ts);
+   return ts;
+   }
+   return 0;
+}
+
 /**
  * Set Software Parser flags and offsets in Ethernet Segment of WQE.
  * Flags must be preliminary initialized to zero.
@@ -822,7 +854,7 @@ mlx5_tx_cseg_init(struct mlx5_txq_data *__rte_restrict txq,
cs->flags = RTE_BE32(MLX5_COMP_ONLY_FIRST_ERR <<
 MLX5_COMP_MODE_OFFSET);
cs->misc = RTE_BE32(0);
-   if (__rte_trace_point_fp_is_enabled() && !loc->pkts_sent)
+   if (__rte_trace_point_fp_is_enabled())
rte_pmd_mlx5_trace_tx_entry(txq->port_id, txq->idx);
rte_pmd_mlx5_trace_tx_wqe((txq->wqe_ci << 8) | opcode);
 }
diff --git a/drivers/net/mlx5/mlx5_txpp.c b/drivers/net/mlx5/mlx5_txpp.c
index 4e26fa2db8..e6d3ad83e9 100644
--- a/drivers/net/mlx5/mlx5_txpp.c
+++ b/drivers/net/mlx5/mlx5_txpp.c
@@ -971,7 +971,6 @@ mlx5_txpp_read_clock(struct rte_eth_dev *dev, uint64_t 
*timestamp)
 {
struct mlx5_priv *priv = dev->data->dev_private;
struct mlx5_dev_ctx_shared *sh = priv->sh;
-   struct mlx5_proc_priv *ppriv;
uint64_t ts;
int ret;
 
@@ -997,15 +996,9 @@ mlx5_txpp_read_clock(struct rte_eth_dev *dev, uint64_t 
*timestamp)
*timestamp = ts;
return 0;
}
-   /* Check and try to map HCA PIC BAR to allow reading real time. */
-   ppriv = dev->process_private;
-   if (ppriv && !ppriv->hca_bar &&
-   sh->dev_cap.rt_timestamp && mlx5_dev_is_pci(dev->device))
-   mlx5_txpp_map_hca_bar(dev);
/* Check if we can read 

[PATCH v2 3/4] net/mlx5: fix Tx tracing to use single clock source

2024-10-14 Thread Viacheslav Ovsiienko
From: Tim Martin 

The prior commit introduced tracing for mlx5, but it mixes two unrelated
clocks: the TSC for host work submission timestamps and the NIC HW clock
for CQE completion times. Timestamps need to come from a single common
clock, and the NIC HW clock is the better choice since it can be used
with externally synchronized clocks.

This patch adds the NIC HW clock as an additional logged parameter for
trace_tx_entry, trace_tx_exit, and trace_tx_wqe.  The included trace
analysis python script is also updated to use the new clock when
it is available.
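
A minimal sketch of what "use the new clock when it is available" could
look like on the analysis side. The helper name pick_burst_ts and the
plain dict standing in for a trace event are hypothetical; only the
"real_time" field name comes from the tracepoints in this patch, and
treating 0 as "HW clock not readable" is an assumption based on
mlx5_read_pcibar_clock() returning 0 when the BAR is not mapped.

```python
def pick_burst_ts(event, default_ns):
    """Prefer the NIC real-time stamp; fall back to the host trace clock.

    `event` is a plain dict standing in for a trace event (hypothetical),
    `default_ns` is the host-side timestamp of the same event.
    """
    real_time = event.get("real_time", 0)
    # 0 is treated as "no HW clock value was captured" (assumption).
    return real_time if real_time != 0 else default_ns


# Trace recorded with the new tracepoint argument.
print(pick_burst_ts({"real_time": 1700000000123, "port_id": 0}, 42))
# Older trace without the field, or a reader that returned 0.
print(pick_burst_ts({"port_id": 0}, 42))
print(pick_burst_ts({"real_time": 0}, 42))
```

Keeping the fallback in one helper means old traces stay readable by the
same script.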

Fixes: a1e910f5b8d4 ("net/mlx5: introduce tracepoints")
Fixes: 9725191a7e14 ("net/mlx5: add Tx datapath trace analyzing script")
Cc: sta...@dpdk.org

Signed-off-by: Tim Martin 
Acked-by: Viacheslav Ovsiienko 
---
 drivers/net/mlx5/mlx5_trace.h|  9 ++---
 drivers/net/mlx5/mlx5_tx.h   | 21 +
 drivers/net/mlx5/tools/mlx5_trace.py | 12 +---
 3 files changed, 32 insertions(+), 10 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_trace.h b/drivers/net/mlx5/mlx5_trace.h
index a8f0b372c8..4fc3584acc 100644
--- a/drivers/net/mlx5/mlx5_trace.h
+++ b/drivers/net/mlx5/mlx5_trace.h
@@ -22,21 +22,24 @@ extern "C" {
 /* TX burst subroutines trace points. */
 RTE_TRACE_POINT_FP(
rte_pmd_mlx5_trace_tx_entry,
-   RTE_TRACE_POINT_ARGS(uint16_t port_id, uint16_t queue_id),
+   RTE_TRACE_POINT_ARGS(uint64_t real_time, uint16_t port_id, uint16_t queue_id),
+   rte_trace_point_emit_u64(real_time);
rte_trace_point_emit_u16(port_id);
rte_trace_point_emit_u16(queue_id);
 )
 
 RTE_TRACE_POINT_FP(
rte_pmd_mlx5_trace_tx_exit,
-   RTE_TRACE_POINT_ARGS(uint16_t nb_sent, uint16_t nb_req),
+   RTE_TRACE_POINT_ARGS(uint64_t real_time, uint16_t nb_sent, uint16_t nb_req),
+   rte_trace_point_emit_u64(real_time);
rte_trace_point_emit_u16(nb_sent);
rte_trace_point_emit_u16(nb_req);
 )
 
 RTE_TRACE_POINT_FP(
rte_pmd_mlx5_trace_tx_wqe,
-   RTE_TRACE_POINT_ARGS(uint32_t opcode),
+   RTE_TRACE_POINT_ARGS(uint64_t real_time, uint32_t opcode),
+   rte_trace_point_emit_u64(real_time);
rte_trace_point_emit_u32(opcode);
 )
 
diff --git a/drivers/net/mlx5/mlx5_tx.h b/drivers/net/mlx5/mlx5_tx.h
index 587e6a9f7d..55568c41b1 100644
--- a/drivers/net/mlx5/mlx5_tx.h
+++ b/drivers/net/mlx5/mlx5_tx.h
@@ -404,6 +404,14 @@ static __rte_always_inline uint64_t 
mlx5_read_pcibar_clock(struct rte_eth_dev *d
return 0;
 }
 
+static __rte_always_inline uint64_t mlx5_read_pcibar_clock_from_txq(struct mlx5_txq_data *txq)
+{
+   struct mlx5_txq_ctrl *txq_ctrl = container_of(txq, struct mlx5_txq_ctrl, txq);
+   struct rte_eth_dev *dev = ETH_DEV(txq_ctrl->priv);
+
+   return mlx5_read_pcibar_clock(dev);
+}
+
 /**
  * Set Software Parser flags and offsets in Ethernet Segment of WQE.
  * Flags must be preliminary initialized to zero.
@@ -841,6 +849,7 @@ mlx5_tx_cseg_init(struct mlx5_txq_data *__rte_restrict txq,
  unsigned int olx)
 {
struct mlx5_wqe_cseg *__rte_restrict cs = &wqe->cseg;
+   uint64_t real_time;
 
/* For legacy MPW replace the EMPW by TSO with modifier. */
if (MLX5_TXOFF_CONFIG(MPW) && opcode == MLX5_OPCODE_ENHANCED_MPSW)
@@ -854,9 +863,12 @@ mlx5_tx_cseg_init(struct mlx5_txq_data *__rte_restrict txq,
cs->flags = RTE_BE32(MLX5_COMP_ONLY_FIRST_ERR <<
 MLX5_COMP_MODE_OFFSET);
cs->misc = RTE_BE32(0);
-   if (__rte_trace_point_fp_is_enabled())
-   rte_pmd_mlx5_trace_tx_entry(txq->port_id, txq->idx);
-   rte_pmd_mlx5_trace_tx_wqe((txq->wqe_ci << 8) | opcode);
+   if (__rte_trace_point_fp_is_enabled()) {
+   real_time = mlx5_read_pcibar_clock_from_txq(txq);
+   if (!loc->pkts_sent)
+   rte_pmd_mlx5_trace_tx_entry(real_time, txq->port_id, txq->idx);
+   rte_pmd_mlx5_trace_tx_wqe(real_time, (txq->wqe_ci << 8) | opcode);
+   }
 }
 
 /**
@@ -3818,7 +3830,8 @@ mlx5_tx_burst_tmpl(struct mlx5_txq_data *__rte_restrict 
txq,
__mlx5_tx_free_mbuf(txq, pkts, loc.mbuf_free, olx);
/* Trace productive bursts only. */
if (__rte_trace_point_fp_is_enabled() && loc.pkts_sent)
-   rte_pmd_mlx5_trace_tx_exit(loc.pkts_sent, pkts_n);
+   rte_pmd_mlx5_trace_tx_exit(mlx5_read_pcibar_clock_from_txq(txq),
+  loc.pkts_sent, pkts_n);
return loc.pkts_sent;
 }
 
diff --git a/drivers/net/mlx5/tools/mlx5_trace.py 
b/drivers/net/mlx5/tools/mlx5_trace.py
index 67461520a9..5eb634a490 100755
--- a/drivers/net/mlx5/tools/mlx5_trace.py
+++ b/drivers/net/mlx5/tools/mlx5_trace.py
@@ -174,7 +174,9 @@ def do_tx_entry(msg, trace):
 return
 # allocate the new burst and append to the queue
 burst = MlxBurst()
-burst.call_ts = msg.default_clock_snapsho

[PATCH v2 0/4] net/mlx5: series to fix and improve tx trace capabilitie

2024-10-14 Thread Viacheslav Ovsiienko
Signed-off-by: Viacheslav Ovsiienko 

---
v1: https://inbox.dpdk.org/dev/20241009114028.973284-1-viachesl...@nvidia.com/
v2: move part of code between 2nd and 3rd patches to fix single patch 
compilation issue.

Tim Martin (2):
  net/mlx5: fix real time counter reading from PCI BAR
  net/mlx5: fix Tx tracing to use single clock source

Viacheslav Ovsiienko (2):
  net/mlx5/tools: fix trace dump multiple burst completions
  net/mlx5: update dump script to show incomplete records

 .mailmap |  1 +
 doc/guides/nics/mlx5.rst |  6 ++
 drivers/net/mlx5/mlx5.c  |  4 ++
 drivers/net/mlx5/mlx5_trace.h|  9 ++-
 drivers/net/mlx5/mlx5_tx.h   | 53 ++--
 drivers/net/mlx5/mlx5_txpp.c | 11 +---
 drivers/net/mlx5/tools/mlx5_trace.py | 90 
 7 files changed, 133 insertions(+), 41 deletions(-)

-- 
2.34.1



[PATCH v2 1/4] net/mlx5/tools: fix trace dump multiple burst completions

2024-10-14 Thread Viacheslav Ovsiienko
If a single completion covered multiple bursts, only the first burst
was moved to the done list. This situation is not typical: tracing is
mostly used to debug scheduled traffic, where each burst requests its
own completion, so there were normally no
completions with multiple bursts.
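
The intended behaviour can be sketched roughly as follows. This is an
illustration only: retire_bursts is a hypothetical helper, and strings
stand in for the script's burst objects. The point is that every burst
covered by the completion is moved, not just one of them.

```python
def retire_bursts(wait_burst, done_burst, completed):
    """Move the first `completed` bursts from the wait list to the done list.

    Returns the new (wait, done) pair; `done_burst` is extended in place.
    """
    done_burst.extend(wait_burst[:completed])
    return wait_burst[completed:], done_burst


# One completion that covers two of the three waiting bursts.
wait, done = retire_bursts(["b0", "b1", "b2"], [], 2)
print(wait)
print(done)
```

Slicing off the retired prefix keeps the remaining bursts in submission
order for the next completion.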

Fixes: 9725191a7e14 ("net/mlx5: add Tx datapath trace analyzing script")
Cc: sta...@dpdk.org

Signed-off-by: Viacheslav Ovsiienko 
Acked-by: Dariusz Sosnowski 
---
 drivers/net/mlx5/tools/mlx5_trace.py | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/mlx5/tools/mlx5_trace.py 
b/drivers/net/mlx5/tools/mlx5_trace.py
index 8c1fd0a350..67461520a9 100755
--- a/drivers/net/mlx5/tools/mlx5_trace.py
+++ b/drivers/net/mlx5/tools/mlx5_trace.py
@@ -258,13 +258,14 @@ def do_tx_complete(msg, trace):
 if burst.comp(wqe_id, wqe_ts) == 0:
 break
 rmv += 1
-# mode completed burst to done list
+# move completed burst(s) to done list
 if rmv != 0:
 idx = 0
 while idx < rmv:
+burst = queue.wait_burst[idx]
 queue.done_burst.append(burst)
 idx += 1
-del queue.wait_burst[0:rmv]
+queue.wait_burst = queue.wait_burst[rmv:]
 
 
 def do_tx(msg, trace):
-- 
2.34.1



Re: [PATCH] Revert "build: disable gcc 10 zero-length-bounds warning"

2024-10-14 Thread Bruce Richardson
On Fri, Oct 11, 2024 at 10:57:10AM -0700, Stephen Hemminger wrote:
> The zero length array warning can be re-enabled.
> The zero length marker fields are now removed by
> commit 9e152e674c77 ("mbuf: remove marker fields")
> in DPDK 24.03.
> 
> This reverts commit cfacbcb5a23bc26cb913528c372adddabbb33ca1.
> 
> Signed-off-by: Stephen Hemminger 
> ---
Acked-by: Bruce Richardson 


[PATCH v2 4/4] net/mlx5: update dump script to show incomplete records

2024-10-14 Thread Viacheslav Ovsiienko
If the trace dump is taken while some Tx transfers are still
incomplete (the WQE was pushed but the hardware has not sent
the completion yet), those incomplete bursts were not dumped by
the mlx5_trace script. In some cases (for example, if the queue
was stuck) valuable debug information was lost.

To make the dump more complete, the following optional script
arguments are added:

 -v [level] - provides the raw dump of the object record
  of the specified level (0 - bursts, 1 - WQEs,
  2+ - mbufs)
 -a - dumps all bursts, including incomplete ones
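
For illustration, options of this shape could be declared with argparse
as below. This is a sketch, not the script's actual option handling: the
long option names, the "path" positional and the None/0 defaults are
assumptions, and how the level maps onto the script's internal verbosity
checks is not shown.

```python
import argparse


def build_parser():
    """Build a parser with the two optional arguments described above."""
    parser = argparse.ArgumentParser(description="mlx5 Tx trace dump (sketch)")
    parser.add_argument("path", help="trace data folder")
    # -a: also dump bursts that never received a completion
    parser.add_argument("-a", "--all", action="store_true")
    # -v [level]: absent -> None (no raw dump), bare -v -> 0, -v N -> N
    parser.add_argument("-v", "--verbose", nargs="?", type=int,
                        const=0, default=None)
    return parser


args = build_parser().parse_args(["-a", "-v", "1", "/tmp/trace"])
print(args.all, args.verbose, args.path)
```

One argparse detail to keep in mind with nargs="?": a bare -v placed
directly before the folder would try to consume the folder as the level,
so either give the level explicitly or put the folder first.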

Signed-off-by: Viacheslav Ovsiienko 
Acked-by: Dariusz Sosnowski 

---
 drivers/net/mlx5/tools/mlx5_trace.py | 82 
 1 file changed, 59 insertions(+), 23 deletions(-)
---
 doc/guides/nics/mlx5.rst |  6 +++
 drivers/net/mlx5/tools/mlx5_trace.py | 73 
 2 files changed, 59 insertions(+), 20 deletions(-)

diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
index 1dccdaad50..f82e2d75de 100644
--- a/doc/guides/nics/mlx5.rst
+++ b/doc/guides/nics/mlx5.rst
@@ -2360,6 +2360,12 @@ Steps to enable Tx datapath tracing:
 
The parameter of the script is the trace data folder.
 
+   The optional parameter ``-a`` makes the script dump incomplete bursts as well.
+
+   The optional parameter ``-v [level]`` makes the script dump raw record data
+   for the specified level and below. Level 0 dumps bursts, level 1 dumps WQEs,
+   level 2 dumps mbufs.
+
.. code-block:: console
 
   mlx5_trace.py /var/log/rte-2023-01-23-AM-11-52-39
diff --git a/drivers/net/mlx5/tools/mlx5_trace.py 
b/drivers/net/mlx5/tools/mlx5_trace.py
index 5eb634a490..96eb82082f 100755
--- a/drivers/net/mlx5/tools/mlx5_trace.py
+++ b/drivers/net/mlx5/tools/mlx5_trace.py
@@ -21,10 +21,13 @@ def __init__(self):
 self.wait_burst = []  # waiting for completion
 self.pq_id = 0
 
-def log(self):
+def log(self, all):
 """Log all queue bursts"""
 for txb in self.done_burst:
 txb.log()
+if all == True:
+for txb in self.wait_burst:
+txb.log()
 
 
 class MlxMbuf:
@@ -147,24 +150,26 @@ def __init__(self):
 self.tx_qlst = {}  # active Tx queues per port/queue
 self.tx_wlst = {}  # wait timestamp list per CPU
 
-def run(self, msg_it):
+def run(self, msg_it, verbose):
 """Run over gathered tracing data and build database"""
 for msg in msg_it:
 if not isinstance(msg, bt2._EventMessageConst):
 continue
 event = msg.event
 if event.name.startswith(PFX_TX):
-do_tx(msg, self)
+do_tx(msg, self, verbose)
 # Handling of other log event cathegories can be added here
+if verbose:
+print("*** End of raw data dump ***")
 
-def log(self):
+def log(self, all):
 """Log gathered trace database"""
 for pq_id in self.tx_qlst:
 queue = self.tx_qlst.get(pq_id)
-queue.log()
+queue.log(all)
 
 
-def do_tx_entry(msg, trace):
+def do_tx_entry(msg, trace, verbose):
 """Handle entry Tx busrt"""
 event = msg.event
 cpu_id = event["cpu_id"]
@@ -172,6 +177,10 @@ def do_tx_entry(msg, trace):
 if burst is not None:
 # continue existing burst after WAIT
 return
+if verbose > 0:
+print("%u:%X tx_entry(real_time=%u, port_id=%u, queue_id=%u)" %
+  (msg.default_clock_snapshot.ns_from_origin, cpu_id,
+   event["real_time"], event["port_id"], event["queue_id"]))
 # allocate the new burst and append to the queue
 burst = MlxBurst()
 burst.call_ts = event["real_time"]
@@ -189,10 +198,14 @@ def do_tx_entry(msg, trace):
 queue.wait_burst.append(burst)
 
 
-def do_tx_exit(msg, trace):
+def do_tx_exit(msg, trace, verbose):
 """Handle exit Tx busrt"""
 event = msg.event
 cpu_id = event["cpu_id"]
+if verbose > 0:
+print("%u:%X tx_exit(real_time=%u, nb_sent=%u, nb_req=%u)" %
+  (msg.default_clock_snapshot.ns_from_origin, cpu_id,
+   event["real_time"], event["nb_sent"], event["nb_req"]))
 burst = trace.tx_blst.get(cpu_id)
 if burst is None:
 return
@@ -204,10 +217,14 @@ def do_tx_exit(msg, trace):
 trace.tx_blst.pop(cpu_id)
 
 
-def do_tx_wqe(msg, trace):
+def do_tx_wqe(msg, trace, verbose):
 """Handle WQE record"""
 event = msg.event
 cpu_id = event["cpu_id"]
+if verbose > 1:
+print("%u:%X tx_wqe(real_time=%u, opcode=%08X)" %
+  (msg.default_clock_snapshot.ns_from_origin, cpu_id,
+   event["real_time"], event["opcode"]))
 burst = trace.tx_blst.get(cpu_id)
 if burst is None:
 return
@@ -221,17 +238,24 @@ def do_tx_wqe(msg, trace):
 burst.wqes.append(wqe)
 
 
-def do_tx_wait(msg, trace):
+def do_tx_wait(msg, trace, verbose):
 """Handle WAIT record"""
 event = msg.even

[v3 1/5] raw/gdtc: introduce gdtc raw device driver

2024-10-14 Thread Yong Zhang
Introduce a rawdev driver for GDTC, which
connects two separate hosts with each other.

Signed-off-by: Yong Zhang 
---
 MAINTAINERS|   5 +
 doc/guides/rawdevs/gdtc.rst|  35 ++
 doc/guides/rawdevs/index.rst   |   1 +
 drivers/raw/gdtc/gdtc_rawdev.c | 212 +
 drivers/raw/gdtc/gdtc_rawdev.h | 120 +++
 drivers/raw/gdtc/meson.build   |   5 +
 drivers/raw/meson.build|   1 +
 7 files changed, 379 insertions(+)
 create mode 100644 doc/guides/rawdevs/gdtc.rst
 create mode 100644 drivers/raw/gdtc/gdtc_rawdev.c
 create mode 100644 drivers/raw/gdtc/gdtc_rawdev.h
 create mode 100644 drivers/raw/gdtc/meson.build

diff --git a/MAINTAINERS b/MAINTAINERS
index c5a703b5c0..32fc4c801e 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1511,6 +1511,11 @@ M: Gagandeep Singh 
 F: drivers/raw/dpaa2_cmdif/
 F: doc/guides/rawdevs/dpaa2_cmdif.rst
 
+ZTE GDTC
+M: Yong Zhang 
+F: drivers/raw/gdtc/
+F: doc/guides/rawdevs/gdtc.rst
+
 
 Packet processing
 -----------------
diff --git a/doc/guides/rawdevs/gdtc.rst b/doc/guides/rawdevs/gdtc.rst
new file mode 100644
index 0000000000..7e4e648c89
--- /dev/null
+++ b/doc/guides/rawdevs/gdtc.rst
@@ -0,0 +1,35 @@
+..  SPDX-License-Identifier: BSD-3-Clause
+    Copyright 2024 ZTE Corporation
+
+GDTC Rawdev Driver
+==================
+
+The ``gdtc`` rawdev driver is an implementation of the rawdev API
+that provides communication between two separate hosts.
+This is achieved by using the GDMA controller of the Dinghai SoC,
+which can be configured through the exposed MPF device.
+
+Device Setup
+------------
+
+Using the GDTC PMD driver does not require the MPF device to bind
+additional user-space IO driver.
+
+Before performing actual data transmission, it is necessary to
+call ``rte_rawdev_queue_setup()`` to obtain an available queue ID.
+
+For data transfer, utilize the standard ``rte_rawdev_enqueue_buffers()`` API.
+The data transfer status can be queried via ``rte_rawdev_dequeue_buffers()``,
+which will return the number of successfully transferred data items.
+
+Initialization
+--------------
+
+The ``gdtc`` rawdev driver needs to work in IOVA PA mode.
+Consider using ``--iova-mode=pa`` in the EAL options.
+
+Platform Requirement
+--------------------
+
+This PMD is only supported on ZTE Neo Platforms:
+- Neo X510/X512
diff --git a/doc/guides/rawdevs/index.rst b/doc/guides/rawdevs/index.rst
index f34315f051..921f3a120c 100644
--- a/doc/guides/rawdevs/index.rst
+++ b/doc/guides/rawdevs/index.rst
@@ -16,3 +16,4 @@ application through rawdev API.
 dpaa2_cmdif
 ifpga
 ntb
+gdtc
diff --git a/drivers/raw/gdtc/gdtc_rawdev.c b/drivers/raw/gdtc/gdtc_rawdev.c
new file mode 100644
index 0000000000..436658d850
--- /dev/null
+++ b/drivers/raw/gdtc/gdtc_rawdev.c
@@ -0,0 +1,212 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2024 ZTE Corporation
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "gdtc_rawdev.h"
+
+/* Register offset */
+#define ZXDH_GDMA_BASE_OFFSET   0x10
+
+#define ZXDH_GDMA_CHAN_SHIFT 0x80
+char zxdh_gdma_driver_name[] = "rawdev_zxdh_gdma";
+char dev_name[] = "zxdh_gdma";
+
+uint32_t
+zxdh_gdma_read_reg(struct rte_rawdev *dev, uint16_t queue_id, uint32_t offset)
+{
+   struct zxdh_gdma_rawdev *gdmadev = zxdh_gdma_rawdev_get_priv(dev);
+   uint32_t addr = 0;
+   uint32_t val = 0;
+
+   addr = offset + queue_id * ZXDH_GDMA_CHAN_SHIFT;
+   val = *(uint32_t *)(gdmadev->base_addr + addr);
+
+   return val;
+}
+
+void
+zxdh_gdma_write_reg(struct rte_rawdev *dev, uint16_t queue_id, uint32_t 
offset, uint32_t val)
+{
+   struct zxdh_gdma_rawdev *gdmadev = zxdh_gdma_rawdev_get_priv(dev);
+   uint32_t addr = 0;
+
+   addr = offset + queue_id * ZXDH_GDMA_CHAN_SHIFT;
+   *(uint32_t *)(gdmadev->base_addr + addr) = val;
+}
+
+static const struct rte_rawdev_ops zxdh_gdma_rawdev_ops = {
+};
+
+static int
+zxdh_gdma_map_resource(struct rte_pci_device *dev)
+{
+   int fd = -1;
+   char devname[PATH_MAX];
+   void *mapaddr = NULL;
+   struct rte_pci_addr *loc;
+
+   loc = &dev->addr;
+   snprintf(devname, sizeof(devname), "%s/" PCI_PRI_FMT "/resource0",
+   rte_pci_get_sysfs_path(),
+   loc->domain, loc->bus, loc->devid,
+   loc->function);
+
+   fd = open(devname, O_RDWR);
+   if (fd < 0) {
+   ZXDH_PMD_LOG(ERR, "Cannot open %s: %s", devname, 
strerror(errno));
+   return -1;
+   }
+
+   /* Map the PCI memory resource of device */
+   mapaddr = rte_mem_map(NULL, (size_t)dev->mem_resource[0].len,
+   RTE_PROT_READ | RTE_PROT_WRITE

[v3 4/5] raw/gdtc: add support for enqueue operation

2024-10-14 Thread Yong Zhang
Add rawdev enqueue operation for gdtc devices.

Signed-off-by: Yong Zhang 
---
 drivers/raw/gdtc/gdtc_rawdev.c | 220 +
 drivers/raw/gdtc/gdtc_rawdev.h |  19 +++
 2 files changed, 239 insertions(+)

diff --git a/drivers/raw/gdtc/gdtc_rawdev.c b/drivers/raw/gdtc/gdtc_rawdev.c
index 9a1f939ee8..03f7cc1a8e 100644
--- a/drivers/raw/gdtc/gdtc_rawdev.c
+++ b/drivers/raw/gdtc/gdtc_rawdev.c
@@ -43,10 +43,34 @@
 /* Register offset */
 #define ZXDH_GDMA_BASE_OFFSET   0x10
 #define ZXDH_GDMA_EXT_ADDR_OFFSET   0x218
+#define ZXDH_GDMA_SAR_LOW_OFFSET 0x200
+#define ZXDH_GDMA_DAR_LOW_OFFSET 0x204
+#define ZXDH_GDMA_SAR_HIGH_OFFSET   0x234
+#define ZXDH_GDMA_DAR_HIGH_OFFSET   0x238
+#define ZXDH_GDMA_XFERSIZE_OFFSET   0x208
 #define ZXDH_GDMA_CONTROL_OFFSET 0x230
+#define ZXDH_GDMA_TC_STATUS_OFFSET  0x0
+#define ZXDH_GDMA_STATUS_CLEAN_OFFSET   0x80
+#define ZXDH_GDMA_LLI_L_OFFSET  0x21c
+#define ZXDH_GDMA_LLI_H_OFFSET  0x220
+#define ZXDH_GDMA_CHAN_CONTINUE_OFFSET  0x224
 #define ZXDH_GDMA_TC_CNT_OFFSET 0x23c
 #define ZXDH_GDMA_LLI_USER_OFFSET   0x228
 
+/* Control register */
+#define ZXDH_GDMA_CHAN_ENABLE   0x1
+#define ZXDH_GDMA_CHAN_DISABLE  0
+#define ZXDH_GDMA_SOFT_CHAN 0x2
+#define ZXDH_GDMA_TC_INTR_ENABLE 0x10
+#define ZXDH_GDMA_ALL_INTR_ENABLE   0x30
+#define ZXDH_GDMA_SBS_SHIFT 6   /* src burst size */
+#define ZXDH_GDMA_SBL_SHIFT 9   /* src burst length */
+#define ZXDH_GDMA_DBS_SHIFT 13  /* dest burst size */
+#define ZXDH_GDMA_BURST_SIZE_MIN 0x1 /* 1 byte */
+#define ZXDH_GDMA_BURST_SIZE_MEDIUM 0x4 /* 4 word */
+#define ZXDH_GDMA_BURST_SIZE_MAX 0x6 /* 16 word */
+#define ZXDH_GDMA_DEFAULT_BURST_LEN 0xf /* 16 beats */
+#define ZXDH_GDMA_TC_CNT_ENABLE (1 << 27)
 #define ZXDH_GDMA_CHAN_FORCE_CLOSE  (1 << 31)
 
 /* TC count & Error interrupt status register */
@@ -58,9 +82,15 @@
 #define ZXDH_GDMA_TC_CNT_CLEAN  (1)
 
 #define ZXDH_GDMA_CHAN_SHIFT 0x80
+#define ZXDH_GDMA_LINK_END_NODE (1 << 30)
+#define ZXDH_GDMA_CHAN_CONTINUE (1)
+
 #define LOW32_MASK  0x
 #define LOW16_MASK  0x
 
+#define IDX_TO_ADDR(addr, idx, t) \
+   ((t)((uintptr_t)(addr) + (idx) * sizeof(struct zxdh_gdma_buff_desc)))
+
 static int zxdh_gdma_queue_init(struct rte_rawdev *dev, uint16_t queue_id);
 static int zxdh_gdma_queue_free(struct rte_rawdev *dev, uint16_t queue_id);
 
@@ -308,6 +338,194 @@ zxdh_gdma_rawdev_get_attr(struct rte_rawdev *dev,
 
return 0;
 }
+
+static inline void
+zxdh_gdma_control_cal(uint32_t *val, uint8_t tc_enable)
+{
+   *val = (ZXDH_GDMA_CHAN_ENABLE |
+   ZXDH_GDMA_SOFT_CHAN |
+   (ZXDH_GDMA_DEFAULT_BURST_LEN << ZXDH_GDMA_SBL_SHIFT) |
+   (ZXDH_GDMA_BURST_SIZE_MAX << ZXDH_GDMA_SBS_SHIFT) |
+   (ZXDH_GDMA_BURST_SIZE_MAX << ZXDH_GDMA_DBS_SHIFT));
+
+   if (tc_enable != 0)
+   *val |= ZXDH_GDMA_TC_CNT_ENABLE;
+}
+
+static inline uint32_t
+zxdh_gdma_user_get(struct zxdh_gdma_queue *queue, struct zxdh_gdma_job *job)
+{
+   uint32_t src_user = 0;
+   uint32_t dst_user = 0;
+
+   if ((job->flags & ZXDH_GDMA_JOB_DIR_MASK) == 0) {
+   ZXDH_PMD_LOG(DEBUG, "job flags:0x%x default user:0x%x",
+   job->flags, 
queue->user);
+   return queue->user;
+   } else if ((job->flags & ZXDH_GDMA_JOB_DIR_TX) != 0) {
+   src_user = ZXDH_GDMA_ZF_USER;
+   dst_user = ((job->pf_id << ZXDH_GDMA_PF_NUM_SHIFT) |
+   ((job->ep_id + ZXDH_GDMA_EPID_OFFSET) << 
ZXDH_GDMA_EP_ID_SHIFT));
+
+   if (job->vf_id != 0)
+   dst_user |= (ZXDH_GDMA_VF_EN |
+((job->vf_id - 1) << 
ZXDH_GDMA_VF_NUM_SHIFT));
+   } else {
+   dst_user = ZXDH_GDMA_ZF_USER;
+   src_user = ((job->pf_id << ZXDH_GDMA_PF_NUM_SHIFT) |
+   ((job->ep_id + ZXDH_GDMA_EPID_OFFSET) << 
ZXDH_GDMA_EP_ID_SHIFT));
+
+   if (job->vf_id != 0)
+   src_user |= (ZXDH_GDMA_VF_EN |
+((job->vf_id - 1) << 
ZXDH_GDMA_VF_NUM_SHIFT));
+   }
+   ZXDH_PMD_LOG(DEBUG, "job flags:0x%x ep_id:%u, pf_id:%u, vf_id:%u, 
user:0x%x",
+   job->flags, job->ep_id, 
job->pf_id, job->vf_id,
+

[v3 2/5] raw/gdtc: add support for queue setup operation

2024-10-14 Thread Yong Zhang
Add queue initialization and release interface.
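
As a rough model of the 16-bit "user" word that the queue setup code in
this patch assembles (ep_id in bits 15:12, vfunc_num in bits 11:4,
func_num in bits 3:1, vfunc_active in bit 0), the packing can be written
out in Python. The shift and offset values are copied from the defines
in the diff; the function name pack_user and the use of Python instead
of C are for illustration only, and no field-width masking is modelled.

```python
# Values taken from the ZXDH_GDMA_* defines in this patch.
PF_NUM_SHIFT = 1
VF_NUM_SHIFT = 4
EP_ID_SHIFT = 12
VF_EN = 1
EPID_OFFSET = 5


def pack_user(ep_id, pf_id, vf_id):
    """Pack endpoint, PF and VF numbers into the user word (hypothetical helper).

    vf_id == 0 means "no VF": the VF-active bit and VF number stay clear.
    Otherwise the hardware VF number is vf_id - 1.
    """
    user = (pf_id << PF_NUM_SHIFT) | ((ep_id + EPID_OFFSET) << EP_ID_SHIFT)
    if vf_id != 0:
        user |= VF_EN | ((vf_id - 1) << VF_NUM_SHIFT)
    return user


print(hex(pack_user(0, 0, 0)))  # 0x5000
print(hex(pack_user(4, 0, 0)))  # 0x9000, same value as ZXDH_GDMA_ZF_USER
print(hex(pack_user(1, 2, 3)))  # 0x6025
```

The second call shows why the comment next to ZXDH_GDMA_ZF_USER says
"ep4 pf0": endpoint 4 plus the offset of 5 gives ep_id 9 in the top
nibble.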

Signed-off-by: Yong Zhang 
---
 drivers/raw/gdtc/gdtc_rawdev.c | 242 +
 drivers/raw/gdtc/gdtc_rawdev.h |  19 +++
 2 files changed, 261 insertions(+)

diff --git a/drivers/raw/gdtc/gdtc_rawdev.c b/drivers/raw/gdtc/gdtc_rawdev.c
index 436658d850..c4f02cfd20 100644
--- a/drivers/raw/gdtc/gdtc_rawdev.c
+++ b/drivers/raw/gdtc/gdtc_rawdev.c
@@ -28,13 +28,58 @@
 
 #include "gdtc_rawdev.h"
 
+/*
+ * User define:
+ * ep_id-bit[15:12] vfunc_num-bit[11:4] func_num-bit[3:1] vfunc_active-bit0
+ * host ep_id:5~8   zf ep_id:9
+ */
+#define ZXDH_GDMA_ZF_USER   0x9000  /* ep4 pf0 */
+#define ZXDH_GDMA_PF_NUM_SHIFT  1
+#define ZXDH_GDMA_VF_NUM_SHIFT  4
+#define ZXDH_GDMA_EP_ID_SHIFT   12
+#define ZXDH_GDMA_VF_EN 1
+#define ZXDH_GDMA_EPID_OFFSET   5
+
 /* Register offset */
 #define ZXDH_GDMA_BASE_OFFSET   0x10
+#define ZXDH_GDMA_EXT_ADDR_OFFSET   0x218
+#define ZXDH_GDMA_CONTROL_OFFSET 0x230
+#define ZXDH_GDMA_TC_CNT_OFFSET 0x23c
+#define ZXDH_GDMA_LLI_USER_OFFSET   0x228
+
+#define ZXDH_GDMA_CHAN_FORCE_CLOSE  (1 << 31)
+
+/* TC count & Error interrupt status register */
+#define ZXDH_GDMA_SRC_LLI_ERR   (1 << 16)
+#define ZXDH_GDMA_SRC_DATA_ERR  (1 << 17)
+#define ZXDH_GDMA_DST_ADDR_ERR  (1 << 18)
+#define ZXDH_GDMA_ERR_STATUS (1 << 19)
+#define ZXDH_GDMA_ERR_INTR_ENABLE   (1 << 20)
+#define ZXDH_GDMA_TC_CNT_CLEAN  (1)
 
 #define ZXDH_GDMA_CHAN_SHIFT 0x80
+#define LOW32_MASK  0x
+#define LOW16_MASK  0x
+
+static int zxdh_gdma_queue_init(struct rte_rawdev *dev, uint16_t queue_id);
+static int zxdh_gdma_queue_free(struct rte_rawdev *dev, uint16_t queue_id);
+
 char zxdh_gdma_driver_name[] = "rawdev_zxdh_gdma";
 char dev_name[] = "zxdh_gdma";
 
+static inline struct zxdh_gdma_queue *
+zxdh_gdma_get_queue(struct rte_rawdev *dev, uint16_t queue_id)
+{
+   struct zxdh_gdma_rawdev *gdmadev = zxdh_gdma_rawdev_get_priv(dev);
+
+   if (queue_id >= ZXDH_GDMA_TOTAL_CHAN_NUM) {
+   ZXDH_PMD_LOG(ERR, "queue id %d is invalid", queue_id);
+   return NULL;
+   }
+
+   return &(gdmadev->vqs[queue_id]);
+}
+
 uint32_t
 zxdh_gdma_read_reg(struct rte_rawdev *dev, uint16_t queue_id, uint32_t offset)
 {
@@ -58,9 +103,206 @@ zxdh_gdma_write_reg(struct rte_rawdev *dev, uint16_t 
queue_id, uint32_t offset,
*(uint32_t *)(gdmadev->base_addr + addr) = val;
 }
 
+static int
+zxdh_gdma_rawdev_queue_setup(struct rte_rawdev *dev,
+uint16_t queue_id,
+rte_rawdev_obj_t 
queue_conf,
+size_t conf_size)
+{
+   struct zxdh_gdma_rawdev *gdmadev = NULL;
+   struct zxdh_gdma_queue *queue = NULL;
+   struct zxdh_gdma_queue_config *qconfig = NULL;
+   struct zxdh_gdma_rbp *rbp = NULL;
+   uint16_t i = 0;
+   uint8_t is_txq = 0;
+   uint32_t src_user = 0;
+   uint32_t dst_user = 0;
+
+   if (dev == NULL)
+   return -EINVAL;
+
+   if ((queue_conf == NULL) || (conf_size != sizeof(struct 
zxdh_gdma_queue_config)))
+   return -EINVAL;
+
+   gdmadev = zxdh_gdma_rawdev_get_priv(dev);
+   qconfig = (struct zxdh_gdma_queue_config *)queue_conf;
+
+   for (i = 0; i < ZXDH_GDMA_TOTAL_CHAN_NUM; i++) {
+   if (gdmadev->vqs[i].enable == 0)
+   break;
+   }
+   if (i >= ZXDH_GDMA_TOTAL_CHAN_NUM) {
+   ZXDH_PMD_LOG(ERR, "Failed to setup queue, no avail queues");
+   return -1;
+   }
+   queue_id = i;
+   if (zxdh_gdma_queue_init(dev, queue_id) != 0) {
+   ZXDH_PMD_LOG(ERR, "Failed to init queue");
+   return -1;
+   }
+   queue = &(gdmadev->vqs[queue_id]);
+
+   rbp = qconfig->rbp;
+   if ((rbp->srbp != 0) && (rbp->drbp == 0)) {
+   is_txq = 0;
+   dst_user = ZXDH_GDMA_ZF_USER;
+   src_user = ((rbp->spfid << ZXDH_GDMA_PF_NUM_SHIFT) |
+   ((rbp->sportid + ZXDH_GDMA_EPID_OFFSET) << 
ZXDH_GDMA_EP_ID_SHIFT));
+
+   if (rbp->svfid != 0)
+   src_user |= (ZXDH_GDMA_VF_EN |
+((rbp->svfid - 1) << 
ZXDH_GDMA_VF_NUM_SHIFT));
+
+   ZXDH_PMD_LOG(DEBUG, "rxq->qidx:%d setup src_user(ep:%d pf:%d 
vf:%d) success",
+   queue_id, (uint8_t)rbp->sportid, 
(uint8_t)rbp->spfid,
+   (uint8_t)rbp->svfid);
+   } else if ((rbp->srbp == 0

[v3 3/5] raw/gdtc: add support for standard rawdev operations

2024-10-14 Thread Yong Zhang
Add support for rawdev operations such as dev_start and dev_stop.

Signed-off-by: Yong Zhang 
---
 drivers/raw/gdtc/gdtc_rawdev.c | 136 -
 drivers/raw/gdtc/gdtc_rawdev.h |  10 +++
 2 files changed, 145 insertions(+), 1 deletion(-)

diff --git a/drivers/raw/gdtc/gdtc_rawdev.c b/drivers/raw/gdtc/gdtc_rawdev.c
index c4f02cfd20..9a1f939ee8 100644
--- a/drivers/raw/gdtc/gdtc_rawdev.c
+++ b/drivers/raw/gdtc/gdtc_rawdev.c
@@ -103,6 +103,96 @@ zxdh_gdma_write_reg(struct rte_rawdev *dev, uint16_t 
queue_id, uint32_t offset,
*(uint32_t *)(gdmadev->base_addr + addr) = val;
 }
 
+static int
+zxdh_gdma_rawdev_info_get(struct rte_rawdev *dev,
+ __rte_unused rte_rawdev_obj_t 
dev_info,
+ __rte_unused size_t 
dev_info_size)
+{
+   if (dev == NULL)
+   return -EINVAL;
+
+   return 0;
+}
+
+static int
+zxdh_gdma_rawdev_configure(const struct rte_rawdev *dev,
+  rte_rawdev_obj_t config,
+  size_t config_size)
+{
+   struct zxdh_gdma_config *gdma_config = NULL;
+
+   if ((dev == NULL) ||
+   (config == NULL) ||
+   (config_size != sizeof(struct zxdh_gdma_config)))
+   return -EINVAL;
+
+   gdma_config = (struct zxdh_gdma_config *)config;
+   if (gdma_config->max_vqs > ZXDH_GDMA_TOTAL_CHAN_NUM) {
+   ZXDH_PMD_LOG(ERR, "gdma supports up to %d queues", 
ZXDH_GDMA_TOTAL_CHAN_NUM);
+   return -EINVAL;
+   }
+
+   return 0;
+}
+
+static int
+zxdh_gdma_rawdev_start(struct rte_rawdev *dev)
+{
+   struct zxdh_gdma_rawdev *gdmadev = NULL;
+
+   if (dev == NULL)
+   return -EINVAL;
+
+   gdmadev = zxdh_gdma_rawdev_get_priv(dev);
+   gdmadev->device_state = ZXDH_GDMA_DEV_RUNNING;
+
+   return 0;
+}
+
+static void
+zxdh_gdma_rawdev_stop(struct rte_rawdev *dev)
+{
+   struct zxdh_gdma_rawdev *gdmadev = NULL;
+
+   if (dev == NULL)
+   return;
+
+   gdmadev = zxdh_gdma_rawdev_get_priv(dev);
+   gdmadev->device_state = ZXDH_GDMA_DEV_STOPPED;
+}
+
+static int
+zxdh_gdma_rawdev_reset(struct rte_rawdev *dev)
+{
+   if (dev == NULL)
+   return -EINVAL;
+
+   return 0;
+}
+
+static int
+zxdh_gdma_rawdev_close(struct rte_rawdev *dev)
+{
+   struct zxdh_gdma_rawdev *gdmadev = NULL;
+   struct zxdh_gdma_queue *queue = NULL;
+   uint16_t queue_id = 0;
+
+   if (dev == NULL)
+   return -EINVAL;
+
+   for (queue_id = 0; queue_id < ZXDH_GDMA_TOTAL_CHAN_NUM; queue_id++) {
+   queue = zxdh_gdma_get_queue(dev, queue_id);
+   if ((queue == NULL) || (queue->enable == 0))
+   continue;
+
+   zxdh_gdma_queue_free(dev, queue_id);
+   }
+   gdmadev = zxdh_gdma_rawdev_get_priv(dev);
+   gdmadev->device_state = ZXDH_GDMA_DEV_STOPPED;
+
+   return 0;
+}
+
 static int
 zxdh_gdma_rawdev_queue_setup(struct rte_rawdev *dev,
 uint16_t queue_id,
@@ -184,8 +274,52 @@ zxdh_gdma_rawdev_queue_setup(struct rte_rawdev *dev,
return queue_id;
 }
 
+static int
+zxdh_gdma_rawdev_queue_release(struct rte_rawdev *dev, uint16_t queue_id)
+{
+   struct zxdh_gdma_queue *queue = NULL;
+
+   if (dev == NULL)
+   return -EINVAL;
+
+   queue = zxdh_gdma_get_queue(dev, queue_id);
+   if ((queue == NULL) || (queue->enable == 0))
+   return -EINVAL;
+
+   zxdh_gdma_queue_free(dev, queue_id);
+
+   return 0;
+}
+
+static int
+zxdh_gdma_rawdev_get_attr(struct rte_rawdev *dev,
+ __rte_unused const char 
*attr_name,
+ uint64_t *attr_value)
+{
+   struct zxdh_gdma_rawdev *gdmadev = NULL;
+   struct zxdh_gdma_attr *gdma_attr = NULL;
+
+   if ((dev == NULL) || (attr_value == NULL))
+   return -EINVAL;
+
+   gdmadev   = zxdh_gdma_rawdev_get_priv(dev);
+   gdma_attr = (struct zxdh_gdma_attr *)attr_value;
+   gdma_attr->num_hw_queues = gdmadev->used_num;
+
+   return 0;
+}
 static const struct rte_rawdev_ops zxdh_gdma_rawdev_ops = {
+   .dev_info_get = zxdh_gdma_rawdev_info_get,
+   .dev_configure = zxdh_gdma_rawdev_configure,
+   .dev_start = zxdh_gdma_rawdev_start,
+   .dev_stop = zxdh_gdma_rawdev_stop,
+   .dev_close = zxdh_gdma_rawdev_close,
+   .dev_reset = zxdh_gdma_rawdev_reset,
+
.queue_setup = zxdh_gdma_rawdev_queue_setup,
+   .queue_release = zxdh_gdma_rawdev_queue_release,
+
+   .attr_get = zxdh_gdma_rawdev_get_attr,
 };
 
 static int
@@ -248,7 +382,7 @@ zxdh_gdma_queue_init(struct rte_rawdev *dev, uint16_t 
queue_id)
ZXDH_PMD_LOG(INFO, "queue%u ring phy addr:0x%"PRIx64" virt addr:%p",
 

[v3 5/5] raw/gdtc: add support for dequeue operation

2024-10-14 Thread Yong Zhang
Add rawdev dequeue operation for gdtc devices.
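
The dequeue path in this patch relies on two pieces of wrap-around
arithmetic: the number of transfers completed since the last poll, and
the advance of the software ring's used index. A small Python model of
both is given below as a sketch. The modulus of 0x10000 is an assumption
(the counter is read through a 16-bit mask, and the constant's value is
cut short in the archived text), and the helper names are hypothetical.

```python
TC_CNT_MODULUS = 0x10000  # assumption: 16-bit hardware transfer counter


def completed_since(prev_cnt, cur_cnt):
    """Transfers completed between two reads of a wrapping counter."""
    if cur_cnt >= prev_cnt:
        return cur_cnt - prev_cnt
    # The counter wrapped past its maximum between the two reads.
    return cur_cnt + TC_CNT_MODULUS - prev_cnt


def advance_used_idx(used_idx, cnt, queue_size):
    """Advance a ring index by cnt entries, wrapping at queue_size.

    Valid for cnt <= queue_size, which is all a single poll can retire.
    """
    if used_idx + cnt < queue_size:
        return used_idx + cnt
    return used_idx + cnt - queue_size


print(completed_since(10, 15))
print(completed_since(0xFFFE, 3))
print(advance_used_idx(6, 3, 8))
```

Both helpers assume at most one wrap between polls; a poll interval long
enough for the counter to wrap twice would undercount.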

Signed-off-by: Yong Zhang 
---
 drivers/raw/gdtc/gdtc_rawdev.c | 113 +
 1 file changed, 113 insertions(+)

diff --git a/drivers/raw/gdtc/gdtc_rawdev.c b/drivers/raw/gdtc/gdtc_rawdev.c
index 03f7cc1a8e..8e9543f402 100644
--- a/drivers/raw/gdtc/gdtc_rawdev.c
+++ b/drivers/raw/gdtc/gdtc_rawdev.c
@@ -88,6 +88,8 @@
 #define LOW32_MASK  0x
 #define LOW16_MASK  0x
 
+#define ZXDH_GDMA_TC_CNT_MAX 0x1
+
 #define IDX_TO_ADDR(addr, idx, t) \
((t)((uintptr_t)(addr) + (idx) * sizeof(struct zxdh_gdma_buff_desc)))
 
@@ -526,6 +528,116 @@ zxdh_gdma_rawdev_enqueue_bufs(struct rte_rawdev *dev,
 
return count;
 }
+
+static inline void
+zxdh_gdma_used_idx_update(struct zxdh_gdma_queue *queue, uint16_t cnt, uint8_t 
data_bd_err)
+{
+   uint16_t idx = 0;
+
+   if (queue->sw_ring.used_idx + cnt < queue->queue_size)
+   queue->sw_ring.used_idx += cnt;
+   else
+   queue->sw_ring.used_idx = queue->sw_ring.used_idx + cnt - 
queue->queue_size;
+
+   if (data_bd_err == 1) {
+   /* Update job status, the last job status is error */
+   if (queue->sw_ring.used_idx == 0)
+   idx = queue->queue_size - 1;
+   else
+   idx = queue->sw_ring.used_idx - 1;
+
+   queue->sw_ring.job[idx]->status = 1;
+   }
+}
+
+static int
+zxdh_gdma_rawdev_dequeue_bufs(struct rte_rawdev *dev,
+   __rte_unused struct 
rte_rawdev_buf **buffers,
+   uint32_t count,
+   rte_rawdev_obj_t context)
+{
+   struct zxdh_gdma_queue *queue = NULL;
+   struct zxdh_gdma_enqdeq *e_context = NULL;
+   uint16_t queue_id = 0;
+   uint32_t val = 0;
+   uint16_t tc_cnt = 0;
+   uint16_t diff_cnt = 0;
+   uint16_t i = 0;
+   uint16_t bd_idx = 0;
+   uint64_t next_bd_addr = 0;
+   uint8_t data_bd_err = 0;
+
+   if ((dev == NULL) || (context == NULL))
+   return -EINVAL;
+
+   e_context = (struct zxdh_gdma_enqdeq *)context;
+   queue_id = e_context->vq_id;
+   queue = zxdh_gdma_get_queue(dev, queue_id);
+   if ((queue == NULL) || (queue->enable == 0))
+   return -EINVAL;
+
+   if (queue->sw_ring.pend_cnt == 0)
+   goto deq_job;
+
+   /* Get data transmit count */
+   val = zxdh_gdma_read_reg(dev, queue_id, ZXDH_GDMA_TC_CNT_OFFSET);
+   tc_cnt = val & LOW16_MASK;
+   if (tc_cnt >= queue->tc_cnt)
+   diff_cnt = tc_cnt - queue->tc_cnt;
+   else
+   diff_cnt = tc_cnt + ZXDH_GDMA_TC_CNT_MAX - queue->tc_cnt;
+
+   queue->tc_cnt = tc_cnt;
+
+   /* Data transmit error, channel stopped */
+   if ((val & ZXDH_GDMA_ERR_STATUS) != 0) {
+   next_bd_addr  = zxdh_gdma_read_reg(dev, queue_id, ZXDH_GDMA_LLI_L_OFFSET);
+   next_bd_addr |= ((uint64_t)zxdh_gdma_read_reg(dev, queue_id,
+   ZXDH_GDMA_LLI_H_OFFSET) << 32);
+   next_bd_addr  = next_bd_addr << 6;
+   bd_idx = (next_bd_addr - queue->ring.ring_mem) / sizeof(struct zxdh_gdma_buff_desc);
+   if ((val & ZXDH_GDMA_SRC_DATA_ERR) || (val & ZXDH_GDMA_DST_ADDR_ERR)) {
+   diff_cnt++;
+   data_bd_err = 1;
+   }
+   ZXDH_PMD_LOG(INFO, "queue%d is err(0x%x) next_bd_idx:%u ll_addr:0x%"PRIx64" def user:0x%x",
+   queue_id, val, bd_idx, next_bd_addr, queue->user);
+
+   ZXDH_PMD_LOG(INFO, "Clean up error status");
+   val = ZXDH_GDMA_ERR_STATUS | ZXDH_GDMA_ERR_INTR_ENABLE;
+   zxdh_gdma_write_reg(dev, queue_id, ZXDH_GDMA_TC_CNT_OFFSET, val);
+
+   ZXDH_PMD_LOG(INFO, "Restart channel");
+   zxdh_gdma_write_reg(dev, queue_id, ZXDH_GDMA_XFERSIZE_OFFSET, 0);
+   zxdh_gdma_control_cal(&val, 0);
+   zxdh_gdma_write_reg(dev, queue_id, ZXDH_GDMA_CONTROL_OFFSET, val);
+   }
+
+   if (diff_cnt != 0) {
+   zxdh_gdma_used_idx_update(queue, diff_cnt, data_bd_err);
+   queue->sw_ring.deq_cnt += diff_cnt;
+   queue->sw_ring.pend_cnt -= diff_cnt;
+   }
+
+deq_job:
+   if (queue->sw_ring.deq_cnt == 0)
+   return 0;
+   else if (queue->sw_ring.deq_cnt < count)
+   count = queue->sw_ring.deq_cnt;
+
+   queue->sw_ring.deq_cnt -= count;
+
+   for (i = 0; i < count; i++) {
+   e_context->job[i] = queue->sw_ring.job[queue->sw_ring.deq_idx];
+   queue->sw_ring.job[queue->sw_ring.deq_idx] = NULL;
+   if (++queue->sw_ring.deq_idx >= queue->queue_size)
+ 

RE: [PATCH v11 1/7] eal: add static per-lcore memory allocation facility

2024-10-14 Thread Morten Brørup
> From: Mattias Rönnblom [mailto:mattias.ronnb...@ericsson.com]
> Sent: Monday, 14 October 2024 09.44


> +struct lcore_var_buffer {
> + char data[RTE_MAX_LCORE_VAR * RTE_MAX_LCORE];
> + struct lcore_var_buffer *prev;
> +};

In relation to Jerin's request for using hugepages when available, the "data" 
field should be a pointer to the memory allocated from either the heap or 
through rte_malloc. You would also need to add a flag to indicate which it is, 
so the correct deallocation function can be used to free it on cleanup.
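
A minimal sketch of that layout, not taken from the patch. The `backing` enum, `buffer_alloc()`, `buffer_free_all()`, `hugepage_alloc()` and `hugepage_free()` are hypothetical names, the sizes are stand-ins, and plain `aligned_alloc()`/`free()` play the role of both allocators so the snippet builds without DPDK. In real code the hugepage branch would presumably use rte_malloc()/rte_free().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in sizes; the patch uses RTE_MAX_LCORE_VAR and RTE_MAX_LCORE. */
#define MAX_LCORE_VAR 1024
#define MAX_LCORE 4
#define CACHE_LINE 64

/* Hypothetical tag recording which allocator produced the data area. */
enum backing { BACKING_HEAP, BACKING_HUGEPAGE };

struct lcore_var_buffer {
	char *data;                     /* pointer instead of embedded array */
	enum backing backing;           /* selects the matching free routine */
	struct lcore_var_buffer *prev;
};

/* Placeholders for the hugepage allocator (assumption, see above). */
static void *hugepage_alloc(size_t size) { return aligned_alloc(CACHE_LINE, size); }
static void hugepage_free(void *p) { free(p); }

static struct lcore_var_buffer *
buffer_alloc(struct lcore_var_buffer *prev, bool hugepages_available)
{
	size_t size = (size_t)MAX_LCORE_VAR * MAX_LCORE;
	struct lcore_var_buffer *buf = malloc(sizeof(*buf));

	/* aligned_alloc() wants the size to be a multiple of the alignment */
	assert(size % CACHE_LINE == 0);
	if (buf == NULL)
		return NULL;

	if (hugepages_available) {
		buf->data = hugepage_alloc(size);
		buf->backing = BACKING_HUGEPAGE;
	} else {
		buf->data = aligned_alloc(CACHE_LINE, size);
		buf->backing = BACKING_HEAP;
	}
	if (buf->data == NULL) {
		free(buf);
		return NULL;
	}
	buf->prev = prev;
	return buf;
}

/* Free the whole chain; returns the number of buffers released. */
static unsigned int
buffer_free_all(struct lcore_var_buffer *buf)
{
	unsigned int n = 0;

	while (buf != NULL) {
		struct lcore_var_buffer *prev = buf->prev;

		if (buf->backing == BACKING_HUGEPAGE)
			hugepage_free(buf->data);
		else
			free(buf->data);
		free(buf);
		buf = prev;
		n++;
	}
	return n;
}
```

Keeping the tag per buffer rather than global means a chain can mix both kinds, for example when the first buffer is allocated before hugepage memory is available.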


Here's another (nice to have) idea, which does not need to be part of this 
series, but can be implemented in a separate patch:
If you move "offset" into this structure, new lcore variables can be allocated 
from any buffer, instead of only the most recently allocated buffer.
There might even be gains by picking the "optimal" buffer to allocate different 
size variables from.
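
One way this could look, as a sketch only: each buffer tracks its own offset and the allocator takes the first buffer in the chain with enough room. `var_alloc()`, `align_up()` and the 128-byte `MAX_LCORE_VAR` are made up for illustration; first-fit is just the simplest policy, not necessarily the "optimal" one.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LCORE_VAR 128 /* stand-in for RTE_MAX_LCORE_VAR */

struct lcore_var_buffer {
	size_t offset;                  /* next free byte in this buffer */
	struct lcore_var_buffer *prev;
};

/* Round v up to a multiple of align (align must be a power of two). */
static size_t
align_up(size_t v, size_t align)
{
	return (v + align - 1) & ~(align - 1);
}

/*
 * First-fit: walk the chain from newest to oldest and carve the variable
 * out of the first buffer with room. Returns the buffer used and stores
 * the variable's offset in *off, or returns NULL if no buffer has space.
 */
static struct lcore_var_buffer *
var_alloc(struct lcore_var_buffer *newest, size_t size, size_t align,
	  size_t *off)
{
	struct lcore_var_buffer *buf;

	assert(align != 0 && (align & (align - 1)) == 0);

	for (buf = newest; buf != NULL; buf = buf->prev) {
		size_t start = align_up(buf->offset, align);

		if (start + size <= MAX_LCORE_VAR) {
			buf->offset = start + size;
			*off = start;
			return buf;
		}
	}
	return NULL;
}
```

With a single global offset, the tail of an older buffer is lost as soon as a new buffer is allocated; here a small variable can still land in that leftover space.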


> +
> +static struct lcore_var_buffer *current_buffer;
> +
> +/* initialized to trigger buffer allocation on first allocation */
> +static size_t offset = RTE_MAX_LCORE_VAR;


> +void *
> +rte_lcore_var_alloc(size_t size, size_t align)
> +{
> + /* Having the per-lcore buffer size aligned on cache lines
> +  * assures as well as having the base pointer aligned on cache
> +  * size assures that aligned offsets also translate to aligned
> +  * pointers across all values.
> +  */
> + RTE_BUILD_BUG_ON(RTE_MAX_LCORE_VAR % RTE_CACHE_LINE_SIZE != 0);
> + RTE_ASSERT(align <= RTE_CACHE_LINE_SIZE);
> + RTE_ASSERT(size <= RTE_MAX_LCORE_VAR);

This is a very slow path; please use RTE_VERIFY instead of RTE_ASSERT in this 
function.
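
For context, a sketch of the difference using stand-in macros rather than the real ones from rte_debug.h: RTE_ASSERT() expands to nothing unless RTE_ENABLE_ASSERT is defined at build time, whereas RTE_VERIFY() is always evaluated. The SKETCH_* names, `checks_run` and the two `alloc_check*()` helpers are made up so the snippet builds without DPDK.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

static int checks_run; /* counts how many checks were actually evaluated */

/* Always evaluated, like RTE_VERIFY(); the real macro calls rte_panic(). */
#define SKETCH_VERIFY(exp) do {                              \
	checks_run++;                                        \
	if (!(exp)) {                                        \
		fprintf(stderr, "check failed: %s\n", #exp); \
		abort();                                     \
	}                                                    \
} while (0)

/* Compiled out unless enabled, like RTE_ASSERT() with RTE_ENABLE_ASSERT. */
#ifdef SKETCH_ENABLE_ASSERT
#define SKETCH_ASSERT(exp) SKETCH_VERIFY(exp)
#else
#define SKETCH_ASSERT(exp) do { } while (0)
#endif

#define SKETCH_MAX_VAR 1024 /* stand-in for RTE_MAX_LCORE_VAR */

/* Debug-only check: silently skipped in a default build. */
static void
alloc_check_debug_only(size_t size)
{
	(void)size; /* avoid an unused-parameter warning when compiled out */
	SKETCH_ASSERT(size <= SKETCH_MAX_VAR);
}

/* Always-on check: acceptable here because allocation is slow path. */
static void
alloc_check(size_t size)
{
	SKETCH_VERIFY(size <= SKETCH_MAX_VAR);
}
```

In a default build only `alloc_check()` would catch an oversized request; the debug-only variant lets it through unnoticed.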


> +/**
> + * Get pointer to lcore variable instance with the specified lcore id.
> + *
> + * @param lcore_id
> + *   The lcore id specifying which of the @c RTE_MAX_LCORE value
> + *   instances should be accessed. The lcore id need not be valid
> + *   (e.g., may be @ref LCORE_ID_ANY), but in such a case, the pointer
> + *   is also not valid (and thus should not be dereferenced).
> + * @param handle
> + *   The lcore variable handle.
> + */
> +#define RTE_LCORE_VAR_LCORE_VALUE(lcore_id, handle)  \
> + ((typeof(handle))rte_lcore_var_lcore_ptr(lcore_id, handle))

Please remove the _VALUE suffix.

> +
> +/**
> + * Get pointer to lcore variable instance of the current thread.
> + *
> + * May only be used by EAL threads and registered non-EAL threads.
> + */
> +#define RTE_LCORE_VAR_VALUE(handle) \
> + RTE_LCORE_VAR_LCORE_VALUE(rte_lcore_id(), handle)

Please remove the _VALUE suffix.

> +
> +/**
> + * Iterate over each lcore id's value for an lcore variable.
> + *
> + * @param lcore_id
> + *   An unsigned int variable successively set to the
> + *   lcore id of every valid lcore id (up to @c RTE_MAX_LCORE).
> + * @param value
> + *   A pointer variable successively set to point to lcore variable
> + *   value instance of the current lcore id being processed.
> + * @param handle
> + *   The lcore variable handle.
> + */
> +#define RTE_LCORE_VAR_FOREACH_VALUE(lcore_id, value, handle)

Please remove the _VALUE suffix.

>   \
> + for ((lcore_id) =   \
> +  (((value) = RTE_LCORE_VAR_LCORE_VALUE(0, handle)), 0); \
> +  (lcore_id) < RTE_MAX_LCORE;\
> +  (lcore_id)++, (value) = RTE_LCORE_VAR_LCORE_VALUE(lcore_id, \


