date:20241008

[PATCH v16 0/1] dts: port over VLAN test suite

2024-10-08 Thread Dean Marx

Port over VLAN capabilities test suite from old DTS. This test
suite verifies that VLAN filtering, stripping, and header
insertion all function as expected. When a VLAN ID is in the
filter list, all packets with that ID should be forwarded
and all others should be dropped. While stripping is enabled,
packets sent with a VLAN ID should have the ID removed
and then be forwarded. Additionally, when header insertion
is enabled packets without a VLAN ID should have a specified
ID inserted and then be forwarded.
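
The expected behaviour described above can be sketched as a small pure-Python model. This is only an illustration of the pass/drop rules, not code from the suite: `Frame` and `ToyPort` are hypothetical names, and the model assumes that untagged frames are not affected by the filter list.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Frame:
    """Toy Ethernet frame: a payload plus an optional 802.1Q VLAN ID."""

    payload: bytes
    vlan_id: Optional[int] = None


@dataclass
class ToyPort:
    """Hypothetical model of a port's VLAN offload state (not a DTS or DPDK class)."""

    vlan_filter: set = field(default_factory=set)
    strip: bool = False
    insert_id: Optional[int] = None

    def forward(self, frame: Frame) -> Optional[Frame]:
        # Filtering: a tagged frame whose ID is not in the filter list is dropped.
        # Assumption: untagged frames are not subject to the filter.
        if frame.vlan_id is not None and frame.vlan_id not in self.vlan_filter:
            return None
        vlan_id = frame.vlan_id
        # Stripping: a tagged frame is forwarded without its tag.
        if self.strip and vlan_id is not None:
            vlan_id = None
        # Insertion: a frame that arrived untagged gets the configured ID.
        if self.insert_id is not None and frame.vlan_id is None:
            vlan_id = self.insert_id
        return Frame(frame.payload, vlan_id)
```

With `ToyPort(vlan_filter={1})`, a frame tagged 1 is forwarded with its tag and a frame tagged 2 is dropped; enabling `strip` forwards the first frame untagged.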

---
v13:
* Combined conf schema and test suite patches

v14:
* Reworded docstrings in suite
* Added flag checking to shell methods
* Fixed tx_vlan_reset method bug

v16:
* Rebased off next-dts

Dean Marx (1):
  dts: VLAN test suite implementation

 dts/framework/config/conf_yaml_schema.json |   3 +-
 dts/tests/TestSuite_vlan.py                | 167 +
 2 files changed, 169 insertions(+), 1 deletion(-)
 create mode 100644 dts/tests/TestSuite_vlan.py

-- 
2.44.0

[PATCH v16] dts: VLAN test suite implementation

2024-10-08 Thread Dean Marx

Test suite for verifying VLAN filtering, stripping, and insertion
functionality on Poll Mode Driver.

Depends-on: Patch-145473 ("dts: add VLAN methods to testpmd shell")

Signed-off-by: Dean Marx 
Reviewed-by: Jeremy Spewock 
---
 dts/framework/config/conf_yaml_schema.json |   3 +-
 dts/tests/TestSuite_vlan.py                | 167 +
 2 files changed, 169 insertions(+), 1 deletion(-)
 create mode 100644 dts/tests/TestSuite_vlan.py

diff --git a/dts/framework/config/conf_yaml_schema.json b/dts/framework/config/conf_yaml_schema.json
index df390e8ae2..d437f4db36 100644
--- a/dts/framework/config/conf_yaml_schema.json
+++ b/dts/framework/config/conf_yaml_schema.json
@@ -187,7 +187,8 @@
   "enum": [
 "hello_world",
 "os_udp",
-"pmd_buffer_scatter"
+"pmd_buffer_scatter",
+"vlan"
   ]
 },
 "test_target": {
diff --git a/dts/tests/TestSuite_vlan.py b/dts/tests/TestSuite_vlan.py
new file mode 100644
index 0000000000..19714769f4
--- /dev/null
+++ b/dts/tests/TestSuite_vlan.py
@@ -0,0 +1,167 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2024 University of New Hampshire
+
+"""Test the support of VLAN Offload Features by Poll Mode Drivers.
+
+This test suite verifies that VLAN filtering, stripping, and header insertion all
+function as expected. When a VLAN ID is in the filter list, all packets with that
+ID should be forwarded and all others should be dropped. While stripping is enabled,
+packets sent with a VLAN ID should have the ID removed and then be forwarded.
+Additionally, when header insertion is enabled packets without a
+VLAN ID should have a specified ID inserted and then be forwarded.
+
+"""
+
+from scapy.layers.l2 import Dot1Q, Ether  # type: ignore[import-untyped]
+from scapy.packet import Raw  # type: ignore[import-untyped]
+
+from framework.remote_session.testpmd_shell import SimpleForwardingModes, TestPmdShell
+from framework.test_suite import TestSuite, func_test
+from framework.testbed_model.capability import NicCapability, TopologyType, requires
+
+
+@requires(NicCapability.RX_OFFLOAD_VLAN_FILTER)
+@requires(topology_type=TopologyType.two_links)
+class TestVlan(TestSuite):
+    """DPDK VLAN test suite.
+
+    Ensures VLAN packet reception, stripping, and insertion on the Poll Mode Driver
+    when the appropriate conditions are met. The suite contains four test cases:
+
+    1. VLAN reception no stripping - verifies that a vlan packet with a tag
+        within the filter list is received.
+    2. VLAN reception stripping - verifies that a vlan packet with a tag
+        within the filter list is received without the vlan tag.
+    3. VLAN no reception - verifies that a vlan packet with a tag not within
+        the filter list is dropped.
+    4. VLAN insertion - verifies that a non vlan packet is received with a vlan
+        tag when insertion is enabled.
+    """
+
+    def send_vlan_packet_and_verify(self, should_receive: bool, strip: bool, vlan_id: int) -> None:
+        """Generate a vlan packet, send and verify packet with same payload is received on the dut.
+
+        Args:
+            should_receive: Indicate whether the packet should be successfully received.
+            strip: If :data:`False`, will verify received packets match the given VLAN ID,
+                otherwise verifies that the received packet has no VLAN ID
+                (as it has been stripped off.)
+            vlan_id: Expected vlan ID.
+        """
+        packet = Ether() / Dot1Q(vlan=vlan_id) / Raw(load="x")
+        received_packets = self.send_packet_and_capture(packet)
+        test_packet = None
+        for packet in received_packets:
+            if hasattr(packet, "load") and b"x" in packet.load:
+                test_packet = packet
+                break
+        if should_receive:
+            self.verify(
+                test_packet is not None, "Packet was dropped when it should have been received"
+            )
+            if test_packet is not None:
+                if strip:
+                    self.verify(
+                        not test_packet.haslayer(Dot1Q), "Vlan tag was not stripped successfully"
+                    )
+                else:
+                    self.verify(
+                        test_packet.vlan == vlan_id,
+                        "The received tag did not match the expected tag",
+                    )
+        else:
+            self.verify(
+                test_packet is None,
+                "Packet was received when it should have been dropped",
+            )
+
+    def send_packet_and_verify_insertion(self, expected_id: int) -> None:
+        """Generate a packet with no vlan tag, send and verify on the dut.
+
+        Args:
+            expected_id: The vlan id that is being inserted through tx_offload configuration.
+        """
+        packet = Ether() / Raw(load="x")
+        received_packets = self.send_packet_and_capture(packet)
+        test_pack

[PATCH v2] fib: network byte order IPv4 lookup

2024-10-08 Thread Vladimir Medvedkin

Previously, rte_fib_lookup required IPv4 addresses to be in host byte
order.

This patch adds a new flag, RTE_FIB_FLAG_LOOKUP_BE, that can be passed on
fib create to allow IPv4 addresses in network byte order on lookup.
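
To make the byte order distinction concrete, here is a small sketch in plain Python (not DPDK code). It shows the integer a lookup key takes in host order, the value a little-endian CPU sees when it reads the same four bytes straight from a packet, and the byte swap that relates the two. The helper names are illustrative only.

```python
import ipaddress
import struct


def host_order_key(addr: str) -> int:
    """IPv4 address as its numeric value, e.g. 10.0.0.1 -> 0x0A000001."""
    return int(ipaddress.IPv4Address(addr))


def wire_bytes_read_on_le(addr: str) -> int:
    """Value a little-endian CPU gets when loading the wire bytes as a uint32."""
    wire = ipaddress.IPv4Address(addr).packed  # big-endian bytes, as in a packet
    return struct.unpack("<I", wire)[0]


def bswap32(value: int) -> int:
    """Reverse the byte order of a 32-bit value."""
    return int.from_bytes(value.to_bytes(4, "big"), "little")
```

On a little-endian host, an address taken directly from a packet header has the second form, so without the new flag the caller would need the equivalent of `bswap32()` before each lookup.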

Signed-off-by: Vladimir Medvedkin 
---
 app/test/test_fib.c  |  2 +-
 lib/fib/dir24_8.c| 62 +++---
 lib/fib/dir24_8.h| 44 -
 lib/fib/dir24_8_avx512.c | 82 +++-
 lib/fib/dir24_8_avx512.h | 15 
 lib/fib/meson.build  | 38 +++
 lib/fib/rte_fib.c|  7 +++-
 lib/fib/rte_fib.h|  4 ++
 8 files changed, 169 insertions(+), 85 deletions(-)

diff --git a/app/test/test_fib.c b/app/test/test_fib.c
index 45dccca1f6..b0e53dbe01 100644
--- a/app/test/test_fib.c
+++ b/app/test/test_fib.c
@@ -319,7 +319,7 @@ int32_t
 test_lookup(void)
 {
struct rte_fib *fib = NULL;
-   struct rte_fib_conf config;
+   struct rte_fib_conf config = { 0 };
uint64_t def_nh = 100;
int ret;
 
diff --git a/lib/fib/dir24_8.c b/lib/fib/dir24_8.c
index c739e92304..5520f0f519 100644
--- a/lib/fib/dir24_8.c
+++ b/lib/fib/dir24_8.c
@@ -26,56 +26,72 @@
 #define ROUNDUP(x, y)   RTE_ALIGN_CEIL(x, (1 << (32 - y)))
 
 static inline rte_fib_lookup_fn_t
-get_scalar_fn(enum rte_fib_dir24_8_nh_sz nh_sz)
+get_scalar_fn(enum rte_fib_dir24_8_nh_sz nh_sz, bool be_addr)
 {
switch (nh_sz) {
case RTE_FIB_DIR24_8_1B:
-   return dir24_8_lookup_bulk_1b;
+   return (be_addr) ? dir24_8_lookup_bulk_1b_be :
+   dir24_8_lookup_bulk_1b;
case RTE_FIB_DIR24_8_2B:
-   return dir24_8_lookup_bulk_2b;
+   return (be_addr) ? dir24_8_lookup_bulk_2b_be :
+   dir24_8_lookup_bulk_2b;
case RTE_FIB_DIR24_8_4B:
-   return dir24_8_lookup_bulk_4b;
+   return (be_addr) ? dir24_8_lookup_bulk_4b_be :
+   dir24_8_lookup_bulk_4b;
case RTE_FIB_DIR24_8_8B:
-   return dir24_8_lookup_bulk_8b;
+   return (be_addr) ? dir24_8_lookup_bulk_8b_be :
+   dir24_8_lookup_bulk_8b;
default:
return NULL;
}
 }
 
 static inline rte_fib_lookup_fn_t
-get_scalar_fn_inlined(enum rte_fib_dir24_8_nh_sz nh_sz)
+get_scalar_fn_inlined(enum rte_fib_dir24_8_nh_sz nh_sz, bool be_addr)
 {
switch (nh_sz) {
case RTE_FIB_DIR24_8_1B:
-   return dir24_8_lookup_bulk_0;
+   return (be_addr) ? dir24_8_lookup_bulk_0_be :
+   dir24_8_lookup_bulk_0;
case RTE_FIB_DIR24_8_2B:
-   return dir24_8_lookup_bulk_1;
+   return (be_addr) ? dir24_8_lookup_bulk_1_be :
+   dir24_8_lookup_bulk_1;
case RTE_FIB_DIR24_8_4B:
-   return dir24_8_lookup_bulk_2;
+   return (be_addr) ? dir24_8_lookup_bulk_2_be :
+   dir24_8_lookup_bulk_2;
case RTE_FIB_DIR24_8_8B:
-   return dir24_8_lookup_bulk_3;
+   return (be_addr) ? dir24_8_lookup_bulk_3_be :
+   dir24_8_lookup_bulk_3;
default:
return NULL;
}
 }
 
 static inline rte_fib_lookup_fn_t
-get_vector_fn(enum rte_fib_dir24_8_nh_sz nh_sz)
+get_vector_fn(enum rte_fib_dir24_8_nh_sz nh_sz, bool be_addr)
 {
 #ifdef CC_DIR24_8_AVX512_SUPPORT
if ((rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F) <= 0) ||
+   (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512DQ) <= 0) ||
(rte_vect_get_max_simd_bitwidth() < RTE_VECT_SIMD_512))
return NULL;
 
+   if (be_addr && (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512BW) <= 0))
+   return NULL;
+
switch (nh_sz) {
case RTE_FIB_DIR24_8_1B:
-   return rte_dir24_8_vec_lookup_bulk_1b;
+   return (be_addr) ? rte_dir24_8_vec_lookup_bulk_1b_be :
+   rte_dir24_8_vec_lookup_bulk_1b;
case RTE_FIB_DIR24_8_2B:
-   return rte_dir24_8_vec_lookup_bulk_2b;
+   return (be_addr) ? rte_dir24_8_vec_lookup_bulk_2b_be :
+   rte_dir24_8_vec_lookup_bulk_2b;
case RTE_FIB_DIR24_8_4B:
-   return rte_dir24_8_vec_lookup_bulk_4b;
+   return (be_addr) ? rte_dir24_8_vec_lookup_bulk_4b_be :
+   rte_dir24_8_vec_lookup_bulk_4b;
case RTE_FIB_DIR24_8_8B:
-   return rte_dir24_8_vec_lookup_bulk_8b;
+   return (be_addr) ? rte_dir24_8_vec_lookup_bulk_8b_be :
+   rte_dir24_8_vec_lookup_bulk_8b;
default:
return NULL;
}
@@ -86,7 +102,7 @@ get_vecto

RE: [PATCH v2 05/10] baseband/acc: enhance SW ring alignment

2024-10-08 Thread Chautru, Nicolas

Hi Maxime, 

> -Original Message-
> From: Maxime Coquelin 
> Sent: Tuesday, October 8, 2024 12:52 AM
> To: Vargas, Hernan ; dev@dpdk.org;
> gak...@marvell.com; t...@redhat.com
> Cc: Chautru, Nicolas ; Zhang, Qi Z
> 
> Subject: Re: [PATCH v2 05/10] baseband/acc: enhance SW ring alignment
> 
> 
> 
> On 10/3/24 22:49, Hernan Vargas wrote:
> > Calculate the aligned total size required for queue rings, ensuring
> > that the size is a power of two for proper memory allocation.
> >
> > Signed-off-by: Hernan Vargas 
> > ---
> >   drivers/baseband/acc/acc_common.h | 7 ---
> >   1 file changed, 4 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/baseband/acc/acc_common.h
> > b/drivers/baseband/acc/acc_common.h
> > index 0d1c26166ff2..8ac1ca001c1d 100644
> > --- a/drivers/baseband/acc/acc_common.h
> > +++ b/drivers/baseband/acc/acc_common.h
> > @@ -767,19 +767,20 @@ alloc_sw_rings_min_mem(struct rte_bbdev
> *dev, struct acc_device *d,
> > int i = 0;
> > uint32_t q_sw_ring_size = ACC_MAX_QUEUE_DEPTH *
> get_desc_len();
> > uint32_t dev_sw_ring_size = q_sw_ring_size * num_queues;
> > -   /* Free first in case this is a reconfiguration */
> > +   uint32_t alignment = q_sw_ring_size *
> rte_align32pow2(num_queues);
> > +   /* Free first in case this is dev_sw_ring_size, q_sw_ring_size,
> > +socket); reconfiguration */
> 
> There is a copy/paste mistake in the comment?

Thanks, yes. We missed it in the rebase somehow. 

> 
> > rte_free(d->sw_rings_base);
> >
> > /* Find an aligned block of memory to store sw rings */
> > while (i < ACC_SW_RING_MEM_ALLOC_ATTEMPTS) {
> > /*
> >  * sw_ring allocated memory is guaranteed to be aligned to
> > -* q_sw_ring_size at the condition that the requested size is
> > +* alignment at the condition that the requested size is
> 
> This comment is really unclear "aligned to alignment"

Unclear indeed when reading it again. It should be "aligned to the variable `alignment` ...",
i.e. the change is purely to now use the new variable `alignment` instead of `q_sw_ring_size`.
Thanks
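
For reference, the arithmetic in the quoted hunk can be sketched in Python. `align32pow2` below is a stand-in written for this example that rounds up to the next power of two for inputs of at least 1; the ring size used in the usage note is an arbitrary value, not the driver's.

```python
def align32pow2(x: int) -> int:
    """Round x up to the next power of two (stand-in for rte_align32pow2, x >= 1)."""
    if x < 1:
        raise ValueError("this sketch only handles x >= 1")
    return 1 << (x - 1).bit_length()


def ring_sizes(q_sw_ring_size: int, num_queues: int) -> tuple:
    """Return (dev_sw_ring_size, alignment) as computed in the quoted patch."""
    dev_sw_ring_size = q_sw_ring_size * num_queues
    alignment = q_sw_ring_size * align32pow2(num_queues)
    return dev_sw_ring_size, alignment
```

For example, with a per-queue ring of 4096 bytes and 3 queues, the total size is 12288 bytes while the requested alignment is 16384 bytes; with 4 queues both are 16384 bytes.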

> 
> >  * less than the page size
> >  */
> > sw_rings_base = rte_zmalloc_socket(
> > dev->device->driver->name,
> > -   dev_sw_ring_size, q_sw_ring_size, socket);
> > +   dev_sw_ring_size, alignment, socket);
> >
> > if (sw_rings_base == NULL) {
> > rte_acc_log(ERR,

Re: [PATCH v2] fib: network byte order IPv4 lookup

2024-10-08 Thread Stephen Hemminger

On Tue,  8 Oct 2024 17:16:05 +
Vladimir Medvedkin  wrote:

> Previously when running rte_fib_lookup IPv4 addresses must have been in
> host byte order.
> 
> This patch adds a new flag RTE_FIB_FLAG_LOOKUP_BE that can be passed on
> fib create, which will allow to have IPv4 in network byte order on
> lookup.
> 
> Signed-off-by: Vladimir Medvedkin 
> ---

The GitHub build failed with this patch:

FAILED: lib/76b5a35@@rte_fib@sta/fib_dir24_8.c.o 
ccache gcc -Ilib/76b5a35@@rte_fib@sta -Ilib -I../lib -Ilib/fib -I../lib/fib 
-I. -I../ -Iconfig -I../config -Ilib/eal/include -I../lib/eal/include 
-Ilib/eal/linux/include -I../lib/eal/linux/include -Ilib/eal/x86/include 
-I../lib/eal/x86/include -Ilib/eal/common -I../lib/eal/common -Ilib/eal 
-I../lib/eal -Ilib/kvargs -I../lib/kvargs -Ilib/log -I../lib/log 
-Ilib/telemetry/../metrics -I../lib/telemetry/../metrics -Ilib/telemetry 
-I../lib/telemetry -Ilib/rib -I../lib/rib -Ilib/mempool -I../lib/mempool 
-Ilib/ring -I../lib/ring -fdiagnostics-color=always -pipe 
-D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Wextra -Werror -std=c11 -O2 -g 
-include rte_config.h -Wcast-qual -Wdeprecated -Wformat -Wformat-nonliteral 
-Wformat-security -Wmissing-declarations -Wmissing-prototypes -Wnested-externs 
-Wold-style-definition -Wpointer-arith -Wsign-compare -Wstrict-prototypes 
-Wundef -Wwrite-strings -Wno-address-of-packed-member -Wno-packed-not-aligned 
-Wno-missing-field-initializers -Wno-zero-length-bounds 
-Wno-pointer-to-int-cast -D_GNU_SOURCE -m32 -fPIC -march=corei7 -mrtm 
-DALLOW_EXPERIMENTAL_API -DALLOW_INTERNAL_API -Wno-format-truncation 
-DRTE_LOG_DEFAULT_LOGTYPE=lib.fib -MD -MQ 'lib/76b5a35@@rte_fib@sta/fib_dir24_8.c.o' 
-MF 'lib/76b5a35@@rte_fib@sta/fib_dir24_8.c.o.d' -o 
'lib/76b5a35@@rte_fib@sta/fib_dir24_8.c.o' -c ../lib/fib/dir24_8.c
../lib/fib/dir24_8.c: In function ‘get_vector_fn’:
../lib/fib/dir24_8.c:71:54: error: unused parameter ‘be_addr’ [-Werror=unused-parameter]
   71 | get_vector_fn(enum rte_fib_dir24_8_nh_sz nh_sz, bool be_addr)
  |

RE: [PATCH v4 2/4] cryptodev: add ec points to sm2 op

2024-10-08 Thread Kusztal, ArkadiuszX




> -Original Message-
> From: Stephen Hemminger 
> Sent: Tuesday, October 8, 2024 11:09 PM
> To: Kusztal, ArkadiuszX 
> Cc: dev@dpdk.org; gak...@marvell.com; Dooley, Brian
> 
> Subject: Re: [PATCH v4 2/4] cryptodev: add ec points to sm2 op
> 
> On Tue, 8 Oct 2024 21:00:50 +
> "Kusztal, ArkadiuszX"  wrote:
> 
> > Hi Stephen,
> >
> > > -Original Message-
> > > From: Stephen Hemminger 
> > > Sent: Tuesday, October 8, 2024 10:46 PM
> > > To: Kusztal, ArkadiuszX 
> > > Cc: dev@dpdk.org; gak...@marvell.com; Dooley, Brian
> > > 
> > > Subject: Re: [PATCH v4 2/4] cryptodev: add ec points to sm2 op
> > >
> > > On Tue,  8 Oct 2024 19:14:31 +0100
> > > Arkadiusz Kusztal  wrote:
> > >
> > > > +   RTE_CRYPTO_SM2_PARTIAL,
> > > > +   /**<
> > > > +* PMD does not support the full process of the
> > > > +* SM2 encryption/decryption, but the elliptic
> > > > +* curve part only
> > >
> > > Couldn't this just be:
> > >   /**< PMD only supports elliptic curve */
> >
> > SM2 encryption involves several steps: random number generation, hashing,
> > some trivial xor's etc, and calculation of elliptic curve points, what I
> > meant here is that only this EC calculation will be performed.
> > But when I read it now, I probably may need to add some more clarity to it.
> 
> 
> My point is what developers write tends to be overly wordy and redundant.
> Comments and documentation should be as succinct as possible.

I agree, I will change it to something more technical and precise.

[PATCH v10 0/1] dts: port over queue start/stop suite

2024-10-08 Thread Dean Marx

Queue start/stop suite ensures the Poll Mode Driver can functionally
enable and disable Rx/Tx queues on ports. The suite contains two test
cases:

1. All queues enabled - verifies that packets are received when all
queues on all ports are enabled.
2. Queue start/stop - verifies that packets are not received when the Rx
queue on port 0 is disabled, and then again when the Tx queue on port 1
is disabled.

An important aspect of DPDK is queueing packets for transmission to
bypass the kernel, which makes this test suite necessary to ensure
performance.
--
v8:
* Refactored to be compatible with context manager

v9: 
* Combined configuration schema and test suite patches

v10:
* Rebased off next-dts

Dean Marx (1):
  dts: port over queue start/stop suite

 dts/framework/config/conf_yaml_schema.json |  3 ++-
 dts/tests/TestSuite_queue_start_stop.py    | 16 +++-
 2 files changed, 9 insertions(+), 10 deletions(-)

-- 
2.44.0

[PATCH v10] dts: port over queue start/stop suite

2024-10-08 Thread Dean Marx

This suite tests the ability of the Poll Mode Driver to enable
and disable Rx/Tx queues on a port.

Depends-on: patch-12 ("dts: add port queue modification
and forwarding stats to testpmd")

Signed-off-by: Dean Marx 
Reviewed-by: Jeremy Spewock 
---
 dts/framework/config/conf_yaml_schema.json |  3 ++-
 dts/tests/TestSuite_queue_start_stop.py    | 16 +++-
 2 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/dts/framework/config/conf_yaml_schema.json b/dts/framework/config/conf_yaml_schema.json
index df390e8ae2..12a4a26dc8 100644
--- a/dts/framework/config/conf_yaml_schema.json
+++ b/dts/framework/config/conf_yaml_schema.json
@@ -187,7 +187,8 @@
   "enum": [
 "hello_world",
 "os_udp",
-"pmd_buffer_scatter"
+"pmd_buffer_scatter",
+"queue_start_stop"
   ]
 },
 "test_target": {
diff --git a/dts/tests/TestSuite_queue_start_stop.py b/dts/tests/TestSuite_queue_start_stop.py
index 7533f0b395..389030ae8c 100644
--- a/dts/tests/TestSuite_queue_start_stop.py
+++ b/dts/tests/TestSuite_queue_start_stop.py
@@ -17,9 +17,13 @@
 from scapy.packet import Raw  # type: ignore[import-untyped]
 
 from framework.remote_session.testpmd_shell import SimpleForwardingModes, TestPmdShell
-from framework.test_suite import TestSuite
+from framework.test_suite import TestSuite, func_test
+from framework.testbed_model.capability import NicCapability, TopologyType, requires
 
 
+@requires(topology_type=TopologyType.two_links)
+@requires(NicCapability.RUNTIME_RX_QUEUE_SETUP)
+@requires(NicCapability.RUNTIME_TX_QUEUE_SETUP)
 class TestQueueStartStop(TestSuite):
     """DPDK Queue start/stop test suite.
 
@@ -30,14 +34,6 @@ class TestQueueStartStop(TestSuite):
     queue and verify that packets are not received/forwarded.
     """
 
-    def set_up_suite(self) -> None:
-        """Set up the test suite.
-
-        Setup:
-            Verify that at least two ports are open for session.
-        """
-        self.verify(len(self._port_links) > 1, "Not enough ports")
-
     def send_packet_and_verify(self, should_receive: bool = True) -> None:
         """Generate a packet, send to the DUT, and verify it is forwarded back.
 
@@ -54,6 +50,7 @@ def send_packet_and_verify(self, should_receive: bool = True) -> None:
             f"Packet was {'dropped' if should_receive else 'received'}",
         )
 
+    @func_test
     def test_rx_queue_start_stop(self) -> None:
         """Verify packets are not received by port 0 when Rx queue is disabled.
 
@@ -72,6 +69,7 @@ def test_rx_queue_start_stop(self) -> None:
             "Packets were received on Rx queue when it should've been disabled",
         )
 
+    @func_test
     def test_tx_queue_start_stop(self) -> None:
         """Verify packets are not forwarded by port 1 when Tx queue is disabled.
 
-- 
2.44.0

RE: [RFC 0/4] ethdev: rework config restore

2024-10-08 Thread Konstantin Ananyev



>  We have been working on optimizing the latency of calls to
>  rte_eth_dev_start(), on ports spawned by mlx5 PMD. Most of the work
>  requires changes in the implementation of
>  .dev_start() PMD callback, but I also wanted to start a discussion
>  regarding configuration restore.
> 
>  rte_eth_dev_start() does a few things on top of calling .dev_start() callback:
> 
>  - Before calling it:
>      - eth_dev_mac_restore() - if device supports RTE_ETH_DEV_NOLIVE_MAC_ADDR;
>  - After calling it:
>      - eth_dev_mac_restore() - if device does not support RTE_ETH_DEV_NOLIVE_MAC_ADDR;
>      - restore promiscuous config
>      - restore all multicast config
> 
>  eth_dev_mac_restore() iterates over all known MAC addresses - stored
>  in rte_eth_dev_data.mac_addrs array - and calls
>  .mac_addr_set() and .mac_addr_add() callbacks to apply these MAC addresses.
> 
>  Promiscuous config restore checks if promiscuous mode is enabled or
>  not, and calls .promiscuous_enable() or .promiscuous_disable() callback.
> 
>  All multicast config restore checks if all multicast mode is enabled
>  or not, and calls .allmulticast_enable() or .allmulticast_disable() callback.
> 
>  Callbacks are called directly in all of these cases, to bypass the
>  checks for applying the same configuration, which exist in relevant APIs.
>  Checks are bypassed to force drivers to reapply the configuration.
> 
>  Let's consider what happens in the following sequence of API calls.
> 
>  1. rte_eth_dev_configure()
>  2. rte_eth_tx_queue_setup()
>  3. rte_eth_rx_queue_setup()
>  4. rte_eth_promiscuous_enable()
>  - Call dev->dev_ops->promiscuous_enable()
>  - Stores promiscuous state in dev->data->promiscuous
>  5. rte_eth_allmulticast_enable()
>  - Call dev->dev_ops->allmulticast_enable()
>  - Stores allmulticast state in dev->data->allmulticast
>  6. rte_eth_dev_start()
>  - Call dev->dev_ops->dev_start()
>  - Call dev->dev_ops->mac_addr_set() - apply default MAC address
>  - Call dev->dev_ops->promiscuous_enable()
>  - Call dev->dev_ops->allmulticast_enable()
> 
>  Even though all configuration is available in dev->data after step 5,
>  library forces reapplying this configuration in step 6.
> 
>  In mlx5 PMD case all relevant callbacks require communication with the
>  kernel driver, to configure the device (mlx5 PMD must create/destroy
>  new kernel flow rules and/or change netdev config).
> 
>  mlx5 PMD handles applying all configuration in .dev_start(), so the
>  following forced callbacks force additional communication with the kernel.
>  The same configuration is applied multiple times.
> 
>  As an optimization, mlx5 PMD could check if a given configuration was
>  applied, but this would duplicate the functionality of the library
>  (for example rte_eth_promiscuous_enable() does not call the driver if
>  dev->data->promiscuous is set).
> 
>  Question: Since all of the configuration is available before
>  .dev_start() callback is called, why ethdev library does not expect
>  .dev_start() to take this configuration into account?
>  In other words, why library has to reapply the configuration?
> 
>  I could not find any particular reason why configuration restore
>  exists as part of the process (it was in the initial DPDK commit).
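
The duplicated driver call described in the quoted RFC text can be illustrated with a toy Python model. It is only a sketch of the call pattern, restricted to the promiscuous setting: `ToyDriver`, `ToyEthDev`, and the `force_restore` switch are hypothetical names invented for this example and do not correspond to ethdev structures or to the flag proposed in the RFC.

```python
class ToyDriver:
    """Records which driver callbacks were invoked, in order."""

    def __init__(self) -> None:
        self.calls = []

    def promiscuous_enable(self) -> None:
        self.calls.append("promiscuous_enable")

    def dev_start(self) -> None:
        self.calls.append("dev_start")


class ToyEthDev:
    """Toy stand-in for the library layer sitting above a driver."""

    def __init__(self, driver: ToyDriver, force_restore: bool = True) -> None:
        self.driver = driver
        self.promiscuous = False
        self.force_restore = force_restore  # hypothetical switch for this sketch

    def promiscuous_enable(self) -> None:
        # API path: the driver is not called if the state is already set.
        if self.promiscuous:
            return
        self.driver.promiscuous_enable()
        self.promiscuous = True

    def start(self) -> None:
        self.driver.dev_start()
        # Restore path: the callback is invoked directly, bypassing the check above.
        if self.force_restore and self.promiscuous:
            self.driver.promiscuous_enable()
```

Calling `promiscuous_enable()` and then `start()` on a `ToyEthDev` with `force_restore=True` records the driver callback twice, once from the API call and once from the restore step; with `force_restore=False` it is recorded once.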
> 
> >>>
> >>> My assumption is .dev_stop() cause these values reset in some devices, so
> >>> .dev_start() restores them back.
> >>> @Bruce or @Konstantin may remember the history.
> >
> > Yep, as I remember, at least some Intel PMDs call hw_reset() at dev_stop() and
> > even dev_start() to make sure that HW is in a clean (known) state.
> >
> >>>
> >>> But I agree this is device specific behavior, and can be managed by what
> >>> device requires.
> >
> > Probably yes.
> >
> >>>
>  The patches included in this RFC, propose a mechanism which would help
>  with managing which drivers rely on forceful configuration restore.
>  Drivers could advertise if forceful configuration restore is needed
>  through `RTE_ETH_DEV_*_FORCE_RESTORE` device flag. If this flag is
>  set, then the driver in question requires ethdev to forcefully restore
>  configuration.
> 
> >>>
> >>> OK to use flag for it, but not sure about using 'dev_info->dev_flags'
> >>> (RTE_ETH_DEV_*) for this, as this flag is shared with user and this is
> >>> all dpdk internal.
> >>>
> >>> What about to have a dedicated flag for it? We can have a dedicated set
> >>> of flag values for restore.
> >>
> >> Agreed. What do you think about the following?
> >
> > Instead of exposing that, can we probably make it transparent to the user
> > and probab

[PATCH v3 1/5] power: refactor core power management library

2024-10-08 Thread Sivaprasad Tummala

This patch introduces a comprehensive refactor to the core power
management library. The primary focus is on improving modularity
and organization by relocating specific driver implementations
from the 'lib/power' directory to dedicated directories within
'drivers/power/core/*'. The adjustment of meson.build files
enables the selective activation of individual drivers.

These changes contribute to a significant enhancement in code
organization, providing a clearer structure for driver implementations.
The refactor aims to improve overall code clarity and boost
maintainability. Additionally, it establishes a foundation for
future development, allowing for more focused work on individual
drivers and seamless integration of forthcoming enhancements.

v3:
 - renamed rte_power_core_ops.h as rte_power_cpufreq_api.h
 - re-worked on auto detection logic

v2:
 - added NULL check for global_core_ops in rte_power_get_core_ops

Signed-off-by: Sivaprasad Tummala 
---
 drivers/meson.build   |   1 +
 .../power/acpi/acpi_cpufreq.c |  22 +-
 .../power/acpi/acpi_cpufreq.h |   6 +-
 drivers/power/acpi/meson.build|  10 +
 .../power/amd_pstate/amd_pstate_cpufreq.c |  24 +-
 .../power/amd_pstate/amd_pstate_cpufreq.h |   8 +-
 drivers/power/amd_pstate/meson.build  |  10 +
 .../power/cppc/cppc_cpufreq.c |  22 +-
 .../power/cppc/cppc_cpufreq.h |   8 +-
 drivers/power/cppc/meson.build|  10 +
 .../power/kvm_vm}/guest_channel.c |   0
 .../power/kvm_vm}/guest_channel.h |   0
 .../power/kvm_vm/kvm_vm.c |  22 +-
 .../power/kvm_vm/kvm_vm.h |   6 +-
 drivers/power/kvm_vm/meson.build  |  16 +
 drivers/power/meson.build |  12 +
 drivers/power/pstate/meson.build  |  10 +
 .../power/pstate/pstate_cpufreq.c |  22 +-
 .../power/pstate/pstate_cpufreq.h |   6 +-
 lib/power/meson.build |   7 +-
 lib/power/power_common.c  |   2 +-
 lib/power/power_common.h  |  16 +-
 lib/power/rte_power.c | 286 ++
 lib/power/rte_power.h | 139 ++---
 lib/power/rte_power_cpufreq_api.h | 208 +
 lib/power/version.map |  14 +
 26 files changed, 618 insertions(+), 269 deletions(-)
 rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
 rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
 create mode 100644 drivers/power/acpi/meson.build
 rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
 rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (97%)
 create mode 100644 drivers/power/amd_pstate/meson.build
 rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
 rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
 create mode 100644 drivers/power/cppc/meson.build
 rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
 rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
 rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
 rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
 create mode 100644 drivers/power/kvm_vm/meson.build
 create mode 100644 drivers/power/meson.build
 create mode 100644 drivers/power/pstate/meson.build
 rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
 rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
 create mode 100644 lib/power/rte_power_cpufreq_api.h

diff --git a/drivers/meson.build b/drivers/meson.build
index 66931d4241..9d77e0deab 100644
--- a/drivers/meson.build
+++ b/drivers/meson.build
@@ -29,6 +29,7 @@ subdirs = [
 'event',  # depends on common, bus, mempool and net.
 'baseband',   # depends on common and bus.
 'gpu',# depends on common and bus.
+'power',  # depends on common (in future).
 ]
 
 if meson.is_cross_build()
diff --git a/lib/power/power_acpi_cpufreq.c b/drivers/power/acpi/acpi_cpufreq.c
similarity index 95%
rename from lib/power/power_acpi_cpufreq.c
rename to drivers/power/acpi/acpi_cpufreq.c
index 81996e1c13..8637c69703 100644
--- a/lib/power/power_acpi_cpufreq.c
+++ b/drivers/power/acpi/acpi_cpufreq.c
@@ -10,7 +10,7 @@
 #include 
 #include 
 
-#include "power_acpi_cpufreq.h"
+#include "acpi_cpufreq.h"
 #include "power_common.h"
 
 #define STR_SIZE 1024
@@ -577,3 +577,23 @@ int power_acpi_get_capabilities(unsigned int lcore_id,
 
return 0;
 }
+
+static struct rte_power_core_ops acpi_ops = {
+   .name = "acpi",
+   .init = power_acpi_cpufreq_init,
+   .exit = power_ac

[PATCH v3 0/5] power: refactor power management library

2024-10-08 Thread Sivaprasad Tummala

This patchset refactors the power management library, addressing both
core and uncore power management. The primary changes involve the
creation of dedicated directories for each driver within
'drivers/power/core/*' and 'drivers/power/uncore/*'.

This refactor significantly improves code organization, enhances
clarity, and boosts maintainability. It lays the foundation for more
focused development on individual drivers and facilitates seamless
integration of future enhancements, particularly the AMD uncore driver.

Furthermore, this effort aims to streamline code maintenance by
consolidating common functions for cpufreq and cppc across various
core drivers, thus reducing code duplication.

Sivaprasad Tummala (5):
  power: refactor core power management library
  power: refactor uncore power management library
  test/power: removed function pointer validations
  power/amd_uncore: uncore support for AMD EPYC processors
  maintainers: update for drivers/power

 MAINTAINERS   |   1 +
 app/test/test_power.c |  95 -
 app/test/test_power_cpufreq.c |  52 ---
 app/test/test_power_kvm_vm.c  |  36 --
 drivers/meson.build   |   1 +
 .../power/acpi/acpi_cpufreq.c |  22 +-
 .../power/acpi/acpi_cpufreq.h |   6 +-
 drivers/power/acpi/meson.build|  10 +
 .../power/amd_pstate/amd_pstate_cpufreq.c |  24 +-
 .../power/amd_pstate/amd_pstate_cpufreq.h |   8 +-
 drivers/power/amd_pstate/meson.build  |  10 +
 drivers/power/amd_uncore/amd_uncore.c | 329 ++
 drivers/power/amd_uncore/amd_uncore.h | 226 
 drivers/power/amd_uncore/meson.build  |  20 ++
 .../power/cppc/cppc_cpufreq.c |  22 +-
 .../power/cppc/cppc_cpufreq.h |   8 +-
 drivers/power/cppc/meson.build|  10 +
 .../power/intel_uncore/intel_uncore.c |  18 +-
 .../power/intel_uncore/intel_uncore.h |   8 +-
 drivers/power/intel_uncore/meson.build|   6 +
 .../power/kvm_vm}/guest_channel.c |   0
 .../power/kvm_vm}/guest_channel.h |   0
 .../power/kvm_vm/kvm_vm.c |  22 +-
 .../power/kvm_vm/kvm_vm.h |   6 +-
 drivers/power/kvm_vm/meson.build  |  16 +
 drivers/power/meson.build |  14 +
 drivers/power/pstate/meson.build  |  10 +
 .../power/pstate/pstate_cpufreq.c |  22 +-
 .../power/pstate/pstate_cpufreq.h |   6 +-
 examples/l3fwd-power/main.c   |  12 +-
 lib/power/meson.build |   9 +-
 lib/power/power_common.c  |   2 +-
 lib/power/power_common.h  |  16 +-
 lib/power/rte_power.c | 286 +--
 lib/power/rte_power.h | 139 +---
 lib/power/rte_power_cpufreq_api.h | 208 +++
 lib/power/rte_power_uncore.c  | 206 +--
 lib/power/rte_power_uncore.h  |  87 +++--
 lib/power/rte_power_uncore_ops.h  | 239 +
 lib/power/version.map |  15 +
 40 files changed, 1603 insertions(+), 624 deletions(-)
 rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
 rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
 create mode 100644 drivers/power/acpi/meson.build
 rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
 rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (97%)
 create mode 100644 drivers/power/amd_pstate/meson.build
 create mode 100644 drivers/power/amd_uncore/amd_uncore.c
 create mode 100644 drivers/power/amd_uncore/amd_uncore.h
 create mode 100644 drivers/power/amd_uncore/meson.build
 rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
 rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
 create mode 100644 drivers/power/cppc/meson.build
 rename lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c (95%)
 rename lib/power/power_intel_uncore.h => drivers/power/intel_uncore/intel_uncore.h (97%)
 create mode 100644 drivers/power/intel_uncore/meson.build
 rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
 rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
 rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
 rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
 create mode 100644 drivers/power/kvm_vm/meson.build
 create mode 100644 drivers/power/meson.build
 create mode 100644 drivers/power/pstate/meson.build
 rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
 rename lib/power/power_pstate_cpufreq.h

[PATCH v3 3/5] test/power: removed function pointer validations

2024-10-08 Thread Sivaprasad Tummala

After refactoring the power library, power management operations are now
consistently supported regardless of the operating environment. Function
pointer checks are therefore unnecessary and have been removed from the
applications.

v2:
 - removed function pointer validation in l3fwd-power app.

Signed-off-by: Sivaprasad Tummala 
---
 app/test/test_power.c | 95 ---
 app/test/test_power_cpufreq.c | 52 ---
 app/test/test_power_kvm_vm.c  | 36 -
 examples/l3fwd-power/main.c   | 12 ++---
 4 files changed, 4 insertions(+), 191 deletions(-)

diff --git a/app/test/test_power.c b/app/test/test_power.c
index 403adc22d6..5df5848c70 100644
--- a/app/test/test_power.c
+++ b/app/test/test_power.c
@@ -24,86 +24,6 @@ test_power(void)
 
 #include 
 
-static int
-check_function_ptrs(void)
-{
-   enum power_management_env env = rte_power_get_env();
-
-   const bool not_null_expected = !(env == PM_ENV_NOT_SET);
-
-   const char *inject_not_string1 = not_null_expected ? " not" : "";
-   const char *inject_not_string2 = not_null_expected ? "" : " not";
-
-   if ((rte_power_freqs == NULL) == not_null_expected) {
-   printf("rte_power_freqs should%s be NULL, environment has%s been "
-   "initialised\n", inject_not_string1,
-   inject_not_string2);
-   return -1;
-   }
-   if ((rte_power_get_freq == NULL) == not_null_expected) {
-   printf("rte_power_get_freq should%s be NULL, environment has%s been "
-   "initialised\n", inject_not_string1,
-   inject_not_string2);
-   return -1;
-   }
-   if ((rte_power_set_freq == NULL) == not_null_expected) {
-   printf("rte_power_set_freq should%s be NULL, environment has%s been "
-   "initialised\n", inject_not_string1,
-   inject_not_string2);
-   return -1;
-   }
-   if ((rte_power_freq_up == NULL) == not_null_expected) {
-   printf("rte_power_freq_up should%s be NULL, environment has%s been "
-   "initialised\n", inject_not_string1,
-   inject_not_string2);
-   return -1;
-   }
-   if ((rte_power_freq_down == NULL) == not_null_expected) {
-   printf("rte_power_freq_down should%s be NULL, environment has%s been "
-   "initialised\n", inject_not_string1,
-   inject_not_string2);
-   return -1;
-   }
-   if ((rte_power_freq_max == NULL) == not_null_expected) {
-   printf("rte_power_freq_max should%s be NULL, environment has%s been "
-   "initialised\n", inject_not_string1,
-   inject_not_string2);
-   return -1;
-   }
-   if ((rte_power_freq_min == NULL) == not_null_expected) {
-   printf("rte_power_freq_min should%s be NULL, environment has%s been "
-   "initialised\n", inject_not_string1,
-   inject_not_string2);
-   return -1;
-   }
-   if ((rte_power_turbo_status == NULL) == not_null_expected) {
-   printf("rte_power_turbo_status should%s be NULL, environment has%s been "
-   "initialised\n", inject_not_string1,
-   inject_not_string2);
-   return -1;
-   }
-   if ((rte_power_freq_enable_turbo == NULL) == not_null_expected) {
-   printf("rte_power_freq_enable_turbo should%s be NULL, environment has%s been "
-   "initialised\n", inject_not_string1,
-   inject_not_string2);
-   return -1;
-   }
-   if ((rte_power_freq_disable_turbo == NULL) == not_null_expected) {
-   printf("rte_power_freq_disable_turbo should%s be NULL, environment has%s been "
-   "initialised\n", inject_not_string1,
-   inject_not_string2);
-   return -1;
-   }
-   if ((rte_power_get_capabilities == NULL) == not_null_expected) {
-   printf("rte_power_get_capabilities should%s be NULL, environment has%s been "
-   "initialised\n", inject_not_string1,
-   inject_not_string2);
-   return -1;
-   }
-
-   return 0;
-}
-
 static int
 test_power(void)
 {
@@ -124,10 +44,6 @@ test_power(void)
return -1;
}
 
-   /* Verify that function pointers are NULL */
-   if (check_function_ptrs() < 0)
-   goto fail_all;
-
rte_power_unset_env();
 
/* Perform tests for valid environments.*/
@@ -154,22 +70,11 @@ test_power(void)

Re: [PATCH v2 1/2] fib: implement RCU rule reclamation

2024-10-08 Thread Stephen Hemminger

On Tue,  8 Oct 2024 17:55:23 +
Vladimir Medvedkin  wrote:

> @@ -569,7 +600,60 @@ dir24_8_free(void *p)
>  {
>   struct dir24_8_tbl *dp = (struct dir24_8_tbl *)p;
>  
> + if (dp->dq != NULL)
> + rte_rcu_qsbr_dq_delete(dp->dq);
> +

Side note:
rte_rcu_qsbr_dq_delete should be changed to accept NULL as a no-op,
like all the other free routines.
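
A minimal sketch of what that could look like, using hypothetical names
(`struct dq`, `dq_create`, `dq_delete`) as stand-ins rather than the real
rte_rcu_qsbr_dq API:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a defer queue; not the real rte_rcu_qsbr_dq. */
struct dq {
	int *slots;
};

/* Hypothetical constructor, present only to keep the sketch self-contained. */
static struct dq *
dq_create(size_t n)
{
	struct dq *q = calloc(1, sizeof(*q));

	if (q == NULL)
		return NULL;
	q->slots = calloc(n, sizeof(*q->slots));
	if (q->slots == NULL) {
		free(q);
		return NULL;
	}
	return q;
}

/* Delete routine that treats NULL as a no-op, mirroring free(NULL). */
static int
dq_delete(struct dq *q)
{
	if (q == NULL)
		return 0;
	free(q->slots);
	free(q);
	return 0;
}

/* Usage: callers can delete unconditionally, without their own NULL check. */
static void
dq_delete_example(void)
{
	struct dq *q = dq_create(4);

	assert(q != NULL);
	assert(dq_delete(q) == 0);
	assert(dq_delete(NULL) == 0);	/* no-op */
}
```

With a delete routine shaped like this, a caller such as dir24_8_free()
could drop its own NULL check.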

Re: [OS-Team] [dpdklab] Re: [PATCH 0/5] Increase minimum meson version

2024-10-08 Thread Patrick Robb

On Tue, Oct 8, 2024 at 4:28 AM David Marchand 
wrote:

>
> This series can't be merged until the (UNH and LoongArch) CI are ready
> for such a change.
>
> TL;DR: the meson minimum version is being changed from 0.53.2 to 0.57
> in the current release.
>
> @UNH @Min Zhou
> How long would it take for all CI to be ready for this change?
>
>
Thanks for the heads up. So, as far as I can tell, this will require an
update to the dpdk/.ci/linux-setup.sh script (which I have just submitted)
as I think various labs rely on it including the github robot, loongson,
Intel (Maybe, I don't know). UNH does not use it much as we opt to meet the
meson dependency separately in the dpdk-ci project's container template
engine.

It will also require updates to the container template engine, which I can
get Cody started on tomorrow.

> Important note: if relevant to your CI, testing against LTS branches
> must still be done with the 0.53.2 version, so no change relying on
> post 0.53.2 meson feature gets backported.
>

Okay, full disclosure I don't think this is something we handled the last
time the meson version got bumped in 2022. So, back then we just bumped the
meson version for all environments to .53, then did LTS testing for 19.11,
20.11, 21.11 from environments running meson .53. But, I understand how
this is an issue and something we should avoid this go around.

However, it is not ideal to set the meson version "at runtime" for CI
testing based on the repo under test (mainline and next-* want .57, old LTS
versions want .53). It would be possible to modify our jenkinsfiles
(automation scripts) to check the DPDK version, and run pip commands
resetting the meson version accordingly, at the start of each testing
job... but I have a couple concerns here with regards to
stability/maintenance.

Another option, which Adam is suggesting, is to create a dedicated
environment which is version locked to .53 (it can just be an ubuntu
container image), label it as a meson .53 environment, and add that to the
total list of dpdk build environments which are run when we do testing for
either 22.11, or 23.11. Then, we could run the rest of the testing from the
same container images we use for mainline (that have .57 baked in), and
this would not be a problem because we would have that 1 environment doing
a dpdk build, which is guaranteed to be on .53. Bruce/David let me know if
you can think of any issues with this.

This is also very similar to a Community Lab request from a few months ago
(which we have an open internal Jira ticket for), which is to add a VM
environment which is locked to the minimum supported kernel version for
DPDK. But, that's another story...

Anyways, in terms of the timeline... the Jenkins script updates are
probably the most difficult in that they will require a PR review, dry run
tests etc. but it's still fairly trivial. Cody can probably update the
meson dependencies on the template engine and submit that to the dpdk-ci
project by end of week. So, I would say CI should be ready by next Tuesday,
provided the patches which will be incoming to dev and ci mailing lists can
be merged. Is this timeline okay?

Re: [OS-Team] [dpdklab] Re: [PATCH 0/5] Increase minimum meson version

2024-10-08 Thread Bruce Richardson

On Tue, Oct 08, 2024 at 03:49:12PM -0400, Patrick Robb wrote:
>On Tue, Oct 8, 2024 at 4:28 AM David Marchand
><[1]david.march...@redhat.com> wrote:
> 
>  This series can't be merged until the (UNH and LoongArch) CI are
>  ready
>  for such a change.
>  TL;DR: the meson minimum version is being changed from 0.53.2 to
>  0.57
>  in the current release.
>  @UNH @Min Zhou
>  How long would it take for all CI to be ready for this change?
> 
>Thanks for the heads up. So, as far as I can tell, this will require an
>update to the dpdk/.ci/linux-setup.sh script (which I have just
>submitted) as I think various labs rely on it including the github
>robot, loongson, Intel (Maybe, I don't know).

The update to that linux-setup file is included already in the first patch
of the series. No additional updates needed for jobs that rely on it.

> UNH does not use it much
>as we opt to meet the meson dependency separately in the dpdk-ci
>project's container template engine.

That's a bit of a pity, since we can't update the meson version
automatically as part of the meson update as we do with CIs using the
linux-setup script. Is there some other file in the DPDK main repo that
could contain these details, so that regular DPDK patches can update the
CI too as part of a meson update?

>It will also require updates to the container template engine, which I
>can get Cody started on tomorrow.
> 
>  Important note: if relevant to your CI, testing against LTS branches
>  must still be done with the 0.53.2 version, so no change relying on
>  post 0.53.2 meson feature gets backported.
> 
>Okay, full disclosure I don't think this is something we handled the
>last time the meson version got bumped in 2022. So, back then we just
>bumped the meson version for all environments to .53, then did LTS
>testing for 19.11, 20.11, 21.11 from environments running meson .53.
>But, I understand how this is an issue and something we should avoid
>this go around.
>However, it is not ideal to set the meson version "at runtime" for CI
>testing based on the repo under test (mainline and next-* want .57, old
>LTS versions want .53). It would be possible to modify our jenkinsfiles
>(automation scripts) to check the DPDK version, and run pip commands
>resetting the meson version accordingly, at the start of each testing
>job... but I have a couple concerns here with regards to
>stability/maintenance.

I would recommend against using the DPDK version as a guide. However, the
meson version is included in the project options in the meson.build file of
the code. Could you use "grep meson_version meson.build" in a script to
extract the required version info? If it's helpful, we can possibly provide
some sort of compatibility guarantee of the format of this line.
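
For illustration, the extraction could look roughly like the sketch below.
It is written in C only to match the rest of the code on this list; a CI
script would more likely use grep and sed. The function name
`parse_meson_version` and the assumed line format
(`meson_version: '>= 0.57'`) are hypothetical, not a guaranteed format:

```c
#include <assert.h>
#include <string.h>

/*
 * Copy the version number from a line such as
 *     meson_version: '>= 0.57'
 * into out. Returns 0 on success, -1 if the line does not match
 * or the output buffer is too small.
 */
static int
parse_meson_version(const char *line, char *out, size_t out_sz)
{
	const char *p = strstr(line, "meson_version");
	size_t n;

	if (p == NULL)
		return -1;
	/* Skip ahead to the first digit of the version number. */
	p += strcspn(p, "0123456789");
	if (*p == '\0')
		return -1;
	/* The version is the run of digits and dots that follows. */
	n = strspn(p, "0123456789.");
	if (n >= out_sz)
		return -1;
	memcpy(out, p, n);
	out[n] = '\0';
	return 0;
}

/* Usage: one matching line and one non-matching line. */
static void
parse_meson_version_example(void)
{
	char ver[16];

	assert(parse_meson_version("\tmeson_version: '>= 0.57'",
			ver, sizeof(ver)) == 0);
	assert(strcmp(ver, "0.57") == 0);
	assert(parse_meson_version("license: 'BSD'", ver, sizeof(ver)) == -1);
}
```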


Regards,
/Bruce

Re: [PATCH] Increasing ci meson version to .57

2024-10-08 Thread Bruce Richardson

On Tue, Oct 08, 2024 at 03:25:43PM -0400, Patrick Robb wrote:
> There is a proposed increase in the minimum meson version to .57
> This patch aligns the linux setup ci script with this change.
> 
> Signed-off-by: Patrick Robb 
> ---
>  .ci/linux-setup.sh | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/.ci/linux-setup.sh b/.ci/linux-setup.sh
> index 975bf32144..c7b6a86d38 100755
> --- a/.ci/linux-setup.sh
> +++ b/.ci/linux-setup.sh
> @@ -4,7 +4,7 @@
>  [ "$(id -u)" != '0' ] || alias sudo=
>  
>  # need to install as 'root' since some of the unit tests won't run without it
> -sudo python3 -m pip install --upgrade 'meson==0.53.2'
> +sudo python3 -m pip install --upgrade 'meson==0.57'
>  

This change should be already covered in [1].

Regards,
/Bruce

[1] 
https://patches.dpdk.org/project/dpdk/patch/20240920125737.1197969-2-bruce.richard...@intel.com/

Re: [PATCH] Increasing ci meson version to .57

2024-10-08 Thread Patrick Robb

Haha... I guess that serves as a lesson.

Thanks Bruce.

[v4 02/15] dma/dpaa2: refactor driver code

2024-10-08 Thread Gagandeep Singh

From: Jun Yang 

refactor the driver code with changes in:
- multiple HW queues
- SMA single copy and SG copy
- silent mode

Signed-off-by: Jun Yang 
---
 doc/guides/dmadevs/dpaa2.rst   |8 +
 drivers/dma/dpaa2/dpaa2_qdma.c | 2208 
 drivers/dma/dpaa2/dpaa2_qdma.h |  148 +-
 drivers/dma/dpaa2/rte_pmd_dpaa2_qdma.h |  130 +-
 drivers/dma/dpaa2/version.map  |   13 -
 5 files changed, 1158 insertions(+), 1349 deletions(-)
 delete mode 100644 drivers/dma/dpaa2/version.map

diff --git a/doc/guides/dmadevs/dpaa2.rst b/doc/guides/dmadevs/dpaa2.rst
index d2c26231e2..079337e61c 100644
--- a/doc/guides/dmadevs/dpaa2.rst
+++ b/doc/guides/dmadevs/dpaa2.rst
@@ -73,3 +73,11 @@ Platform Requirement
 
 DPAA2 drivers for DPDK can only work on NXP SoCs as listed in the
 ``Supported DPAA2 SoCs``.
+
+Device Arguments
+
+1. Use dev arg option ``fle_pre_populate=1`` to pre-populate all
+   DMA descriptors with pre-initialized values.
+   usage example: ``fslmc:dpdmai.1,fle_pre_populate=1``
+2. Use dev arg option ``desc_debug=1`` to enable descriptor debugs.
+   usage example: ``fslmc:dpdmai.1,desc_debug=1``
diff --git a/drivers/dma/dpaa2/dpaa2_qdma.c b/drivers/dma/dpaa2/dpaa2_qdma.c
index 5d4749eae3..6c77dc32c4 100644
--- a/drivers/dma/dpaa2/dpaa2_qdma.c
+++ b/drivers/dma/dpaa2/dpaa2_qdma.c
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: BSD-3-Clause
- * Copyright 2018-2022 NXP
+ * Copyright 2018-2024 NXP
  */
 
 #include 
@@ -14,220 +14,378 @@
 #include "dpaa2_qdma.h"
 #include "dpaa2_qdma_logs.h"
 
-#define DPAA2_QDMA_PREFETCH "prefetch"
+#define DPAA2_QDMA_FLE_PRE_POPULATE "fle_pre_populate"
+#define DPAA2_QDMA_DESC_DEBUG "desc_debug"
 
-uint32_t dpaa2_coherent_no_alloc_cache;
-uint32_t dpaa2_coherent_alloc_cache;
+static uint32_t dpaa2_coherent_no_alloc_cache;
+static uint32_t dpaa2_coherent_alloc_cache;
 
-static inline int
-qdma_populate_fd_pci(phys_addr_t src, phys_addr_t dest,
-uint32_t len, struct qbman_fd *fd,
-struct dpaa2_qdma_rbp *rbp, int ser)
+static struct fsl_mc_io s_proc_mc_reg;
+
+static int
+check_devargs_handler(__rte_unused const char *key, const char *value,
+ __rte_unused void *opaque)
 {
-   fd->simple_pci.saddr_lo = lower_32_bits((uint64_t) (src));
-   fd->simple_pci.saddr_hi = upper_32_bits((uint64_t) (src));
+   if (strcmp(value, "1"))
+   return -1;
 
-   fd->simple_pci.len_sl = len;
+   return 0;
+}
 
-   fd->simple_pci.bmt = 1;
-   fd->simple_pci.fmt = 3;
-   fd->simple_pci.sl = 1;
-   fd->simple_pci.ser = ser;
+static int
+dpaa2_qdma_get_devargs(struct rte_devargs *devargs, const char *key)
+{
+   struct rte_kvargs *kvlist;
 
-   fd->simple_pci.sportid = rbp->sportid;  /*pcie 3 */
-   fd->simple_pci.srbp = rbp->srbp;
-   if (rbp->srbp)
-   fd->simple_pci.rdttype = 0;
-   else
-   fd->simple_pci.rdttype = dpaa2_coherent_alloc_cache;
+   if (!devargs)
+   return 0;
 
-   /*dest is pcie memory */
-   fd->simple_pci.dportid = rbp->dportid;  /*pcie 3 */
-   fd->simple_pci.drbp = rbp->drbp;
-   if (rbp->drbp)
-   fd->simple_pci.wrttype = 0;
-   else
-   fd->simple_pci.wrttype = dpaa2_coherent_no_alloc_cache;
+   kvlist = rte_kvargs_parse(devargs->args, NULL);
+   if (!kvlist)
+   return 0;
 
-   fd->simple_pci.daddr_lo = lower_32_bits((uint64_t) (dest));
-   fd->simple_pci.daddr_hi = upper_32_bits((uint64_t) (dest));
+   if (!rte_kvargs_count(kvlist, key)) {
+   rte_kvargs_free(kvlist);
+   return 0;
+   }
 
-   return 0;
+   if (rte_kvargs_process(kvlist, key,
+  check_devargs_handler, NULL) < 0) {
+   rte_kvargs_free(kvlist);
+   return 0;
+   }
+   rte_kvargs_free(kvlist);
+
+   return 1;
 }
 
 static inline int
-qdma_populate_fd_ddr(phys_addr_t src, phys_addr_t dest,
-uint32_t len, struct qbman_fd *fd, int ser)
+qdma_cntx_idx_ring_eq(struct qdma_cntx_idx_ring *ring,
+   const uint16_t *elem, uint16_t nb,
+   uint16_t *free_space)
 {
-   fd->simple_ddr.saddr_lo = lower_32_bits((uint64_t) (src));
-   fd->simple_ddr.saddr_hi = upper_32_bits((uint64_t) (src));
-
-   fd->simple_ddr.len = len;
-
-   fd->simple_ddr.bmt = 1;
-   fd->simple_ddr.fmt = 3;
-   fd->simple_ddr.sl = 1;
-   fd->simple_ddr.ser = ser;
-   /**
-* src If RBP=0 {NS,RDTTYPE[3:0]}: 0_1011
-* Coherent copy of cacheable memory,
-   * lookup in downstream cache, no allocate
-* on miss
-*/
-   fd->simple_ddr.rns = 0;
-   fd->simple_ddr.rdttype = dpaa2_coherent_alloc_cache;
-   /**
-* dest If RBP=0 {NS,WRTTYPE[3:0]}: 0_0111
-* Coherent write of cacheable memory,
-* lookup in downstream cache, no allocate on

[v4 14/15] dma/dpaa: add DMA error checks

2024-10-08 Thread Gagandeep Singh

From: Jun Yang 

Add user-configurable DMA error checks.

Signed-off-by: Jun Yang 
Signed-off-by: Gagandeep Singh 
---
 doc/guides/dmadevs/dpaa.rst  |   6 ++
 drivers/dma/dpaa/dpaa_qdma.c | 135 ++-
 drivers/dma/dpaa/dpaa_qdma.h |  42 ++
 drivers/net/dpaa2/dpaa2_ethdev.c |   2 +-
 4 files changed, 183 insertions(+), 2 deletions(-)

diff --git a/doc/guides/dmadevs/dpaa.rst b/doc/guides/dmadevs/dpaa.rst
index 8a7c0befc3..a60457229a 100644
--- a/doc/guides/dmadevs/dpaa.rst
+++ b/doc/guides/dmadevs/dpaa.rst
@@ -69,3 +69,9 @@ Platform Requirement
 
 DPAA DMA driver for DPDK can only work on NXP SoCs
 as listed in the `Supported DPAA SoCs`_.
+
+Device Arguments
+
+
+Use dev arg option ``dpaa_dma_err_check=1`` to check DMA errors at
+driver level. usage example: ``dpaa_bus:dpaa_qdma-1,dpaa_dma_err_check=1``
diff --git a/drivers/dma/dpaa/dpaa_qdma.c b/drivers/dma/dpaa/dpaa_qdma.c
index 0aa3575fe9..3fcd9b8904 100644
--- a/drivers/dma/dpaa/dpaa_qdma.c
+++ b/drivers/dma/dpaa/dpaa_qdma.c
@@ -4,11 +4,15 @@
 
 #include 
 #include 
+#include 
 
 #include "dpaa_qdma.h"
 #include "dpaa_qdma_logs.h"
 
 static uint32_t s_sg_max_entry_sz = 2000;
+static bool s_hw_err_check;
+
+#define DPAA_DMA_ERROR_CHECK "dpaa_dma_err_check"
 
 static inline void
 qdma_desc_addr_set64(struct fsl_qdma_comp_cmd_desc *ccdf, u64 addr)
@@ -638,7 +642,7 @@ fsl_qdma_enqueue_overflow(struct fsl_qdma_queue *fsl_queue)
 
check_num = 0;
 overflow_check:
-   if (fsl_qdma->is_silent) {
+   if (fsl_qdma->is_silent || unlikely(s_hw_err_check)) {
reg = qdma_readl_be(block +
 FSL_QDMA_BCQSR(fsl_queue->queue_id));
overflow = (reg & FSL_QDMA_BCQSR_QF_XOFF_BE) ?
@@ -1076,13 +1080,81 @@ dpaa_qdma_copy_sg(void *dev_private,
return ret;
 }
 
+static int
+dpaa_qdma_err_handle(struct fsl_qdma_err_reg *reg)
+{
+   struct fsl_qdma_err_reg local;
+   size_t i, offset = 0;
+   char err_msg[512];
+
+   local.dedr_be = rte_read32(&reg->dedr_be);
+   if (!local.dedr_be)
+   return 0;
+   offset = sprintf(err_msg, "ERR detected:");
+   if (local.dedr.ere) {
+   offset += sprintf(&err_msg[offset],
+   " ere(Enqueue rejection error)");
+   }
+   if (local.dedr.dde) {
+   offset += sprintf(&err_msg[offset],
+   " dde(Destination descriptor error)");
+   }
+   if (local.dedr.sde) {
+   offset += sprintf(&err_msg[offset],
+   " sde(Source descriptor error)");
+   }
+   if (local.dedr.cde) {
+   offset += sprintf(&err_msg[offset],
+   " cde(Command descriptor error)");
+   }
+   if (local.dedr.wte) {
+   offset += sprintf(&err_msg[offset],
+   " wte(Write transaction error)");
+   }
+   if (local.dedr.rte) {
+   offset += sprintf(&err_msg[offset],
+   " rte(Read transaction error)");
+   }
+   if (local.dedr.me) {
+   offset += sprintf(&err_msg[offset],
+   " me(Multiple errors of the same type)");
+   }
+   DPAA_QDMA_ERR("%s", err_msg);
+   for (i = 0; i < FSL_QDMA_DECCD_ERR_NUM; i++) {
+   local.deccd_le[FSL_QDMA_DECCD_ERR_NUM - 1 - i] =
+   QDMA_IN(&reg->deccd_le[i]);
+   }
+   local.deccqidr_be = rte_read32(&reg->deccqidr_be);
+   local.decbr = rte_read32(&reg->decbr);
+
+   offset = sprintf(err_msg, "ERR command:");
+   offset += sprintf(&err_msg[offset],
+   " status: %02x, ser: %d, offset:%d, fmt: %02x",
+   local.err_cmd.status, local.err_cmd.ser,
+   local.err_cmd.offset, local.err_cmd.format);
+   offset += sprintf(&err_msg[offset],
+   " address: 0x%"PRIx64", queue: %d, dd: %02x",
+   (uint64_t)local.err_cmd.addr_hi << 32 |
+   local.err_cmd.addr_lo,
+   local.err_cmd.queue, local.err_cmd.dd);
+   DPAA_QDMA_ERR("%s", err_msg);
+   DPAA_QDMA_ERR("ERR command block: %d, queue: %d",
+   local.deccqidr.block, local.deccqidr.queue);
+
+   rte_write32(local.dedr_be, &reg->dedr_be);
+
+   return -EIO;
+}
+
 static uint16_t
 dpaa_qdma_dequeue_status(void *dev_private, uint16_t vchan,
const uint16_t nb_cpls, uint16_t *last_idx,
enum rte_dma_status_code *st)
 {
struct fsl_qdma_engine *fsl_qdma = dev_private;
+   int err;
struct fsl_qdma_queue *fsl_queue = fsl_qdma->chan[vchan];
+   void *status = fsl_qdma->status_base;
struct fsl_qdma_desc *desc_complete[nb_cpls];
uint16_t i, dq_num;
 
@@ -1107,6 +1179,12 @@ dpaa_qdma_dequeue_status(void *dev_private, uint16_t vchan,
st[i] = RTE_DMA_STATUS_SUCCESSFUL;
}
 
+   if (s_hw_err_check) {
+   err = dpaa_qdma_err_handle(status

[v4 11/15] dma/dpaa: add workaround for ERR050757

2024-10-08 Thread Gagandeep Singh

From: Jun Yang 

ERR050757 on LS104x indicates:

For outbound PCIe read transactions, a completion buffer is used
to store the PCIe completions until the data is passed back to the
initiator. At most 16 outstanding transactions are allowed and the
maximum read request is 256 bytes. The completion buffer inside the
controller therefore needs to be at least 4 KB, but the PCIe
controller has only 3 KB of buffer. If the total size of pending
outbound read transactions exceeds 3 KB, the PCIe controller
may drop the incoming completions without notifying the initiator
of the transaction, leaving transactions unfinished. All
subsequent outbound reads to PCIe are then blocked permanently.
To avoid a qDMA hang, where it keeps waiting for data that was
silently dropped, set stride mode for qDMA.
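
The arithmetic behind the erratum can be sketched as follows. The three
constants come from the description above; the enum and function names are
hypothetical and are not part of the driver:

```c
#include <assert.h>
#include <stdint.h>

enum {
	MAX_OUTSTANDING_READS = 16,		/* from the erratum text */
	MAX_READ_REQ_BYTES = 256,		/* from the erratum text */
	COMPLETION_BUF_BYTES = 3 * 1024		/* from the erratum text */
};

/* Worst-case bytes of completions that may be in flight at once. */
static uint32_t
worst_case_pending_bytes(void)
{
	return (uint32_t)MAX_OUTSTANDING_READS * MAX_READ_REQ_BYTES;
}

/* Largest number of maximum-sized reads the 3 KB buffer can hold. */
static uint32_t
safe_outstanding_reads(void)
{
	return COMPLETION_BUF_BYTES / MAX_READ_REQ_BYTES;
}

/* Usage: the worst case exceeds the buffer, which is the erratum. */
static void
err050757_arithmetic_example(void)
{
	assert(worst_case_pending_bytes() == 4096);
	assert(worst_case_pending_bytes() > COMPLETION_BUF_BYTES);
	assert(safe_outstanding_reads() == 12);
}
```

Sixteen 256-byte reads need 4096 bytes, which exceeds the 3072-byte
buffer; at most twelve maximum-sized reads fit.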

Signed-off-by: Jun Yang 
Signed-off-by: Gagandeep Singh 
---
 config/arm/meson.build   |  3 ++-
 doc/guides/dmadevs/dpaa.rst  |  2 ++
 drivers/dma/dpaa/dpaa_qdma.c | 38 +---
 drivers/dma/dpaa/dpaa_qdma.h | 19 +++---
 4 files changed, 46 insertions(+), 16 deletions(-)

diff --git a/config/arm/meson.build b/config/arm/meson.build
index 012935d5d7..f81e466318 100644
--- a/config/arm/meson.build
+++ b/config/arm/meson.build
@@ -468,7 +468,8 @@ soc_dpaa = {
 ['RTE_MACHINE', '"dpaa"'],
 ['RTE_LIBRTE_DPAA2_USE_PHYS_IOVA', false],
 ['RTE_MAX_LCORE', 16],
-['RTE_MAX_NUMA_NODES', 1]
+['RTE_MAX_NUMA_NODES', 1],
+   ['RTE_DMA_DPAA_ERRATA_ERR050757', true]
 ],
 'numa': false
 }
diff --git a/doc/guides/dmadevs/dpaa.rst b/doc/guides/dmadevs/dpaa.rst
index f99bfc6087..746919ec6b 100644
--- a/doc/guides/dmadevs/dpaa.rst
+++ b/doc/guides/dmadevs/dpaa.rst
@@ -42,6 +42,8 @@ Compilation
 For builds using ``meson`` and ``ninja``, the driver will be built when the
 target platform is dpaa-based. No additional compilation steps are necessary.
 
+- ``RTE_DMA_DPAA_ERRATA_ERR050757`` - enable software workaround for Errata-A050757
+
 Initialization
 --
 
diff --git a/drivers/dma/dpaa/dpaa_qdma.c b/drivers/dma/dpaa/dpaa_qdma.c
index 041446b5bc..dbc53b784f 100644
--- a/drivers/dma/dpaa/dpaa_qdma.c
+++ b/drivers/dma/dpaa/dpaa_qdma.c
@@ -167,7 +167,6 @@ fsl_qdma_pre_comp_sd_desc(struct fsl_qdma_queue *queue)
 
/* Descriptor Buffer */
sdf->srttype = FSL_QDMA_CMD_RWTTYPE;
-
ddf->dwttype = FSL_QDMA_CMD_RWTTYPE;
ddf->lwc = FSL_QDMA_CMD_LWC;
 
@@ -449,8 +448,9 @@ fsl_qdma_reg_init(struct fsl_qdma_engine *fsl_qdma)
 
/* Initialize the queue mode. */
reg = FSL_QDMA_BCQMR_EN;
-   reg |= FSL_QDMA_BCQMR_CD_THLD(ilog2(temp->n_cq) - 4);
-   reg |= FSL_QDMA_BCQMR_CQ_SIZE(ilog2(temp->n_cq) - 6);
+   reg |= FSL_QDMA_BCQMR_CD_THLD(ilog2_qthld(temp->n_cq));
+   reg |= FSL_QDMA_BCQMR_CQ_SIZE(ilog2_qsize(temp->n_cq));
+   temp->le_cqmr = reg;
qdma_writel(reg, block + FSL_QDMA_BCQMR(i));
}
 
@@ -694,6 +694,9 @@ fsl_qdma_enqueue_desc_single(struct fsl_qdma_queue *fsl_queue,
struct fsl_qdma_comp_sg_desc *csgf_src, *csgf_dest;
struct fsl_qdma_cmpd_ft *ft;
int ret;
+#ifdef RTE_DMA_DPAA_ERRATA_ERR050757
+   struct fsl_qdma_sdf *sdf;
+#endif
 
ret = fsl_qdma_enqueue_overflow(fsl_queue);
if (unlikely(ret))
@@ -701,6 +704,19 @@ fsl_qdma_enqueue_desc_single(struct fsl_qdma_queue *fsl_queue,
 
ft = fsl_queue->ft[fsl_queue->ci];
 
+#ifdef RTE_DMA_DPAA_ERRATA_ERR050757
+   sdf = &ft->df.sdf;
+   sdf->srttype = FSL_QDMA_CMD_RWTTYPE;
+   if (len > FSL_QDMA_CMD_SS_ERR050757_LEN) {
+   sdf->ssen = 1;
+   sdf->sss = FSL_QDMA_CMD_SS_ERR050757_LEN;
+   sdf->ssd = FSL_QDMA_CMD_SS_ERR050757_LEN;
+   } else {
+   sdf->ssen = 0;
+   sdf->sss = 0;
+   sdf->ssd = 0;
+   }
+#endif
csgf_src = &ft->desc_sbuf;
csgf_dest = &ft->desc_dbuf;
qdma_desc_sge_addr_set64(csgf_src, src);
@@ -733,6 +749,9 @@ fsl_qdma_enqueue_desc_sg(struct fsl_qdma_queue *fsl_queue)
uint32_t total_len;
uint16_t start, idx, num, i, next_idx;
int ret;
+#ifdef RTE_DMA_DPAA_ERRATA_ERR050757
+   struct fsl_qdma_sdf *sdf;
+#endif
 
 eq_sg:
total_len = 0;
@@ -798,6 +817,19 @@ fsl_qdma_enqueue_desc_sg(struct fsl_qdma_queue *fsl_queue)
ft->desc_dsge[num - 1].final = 1;
csgf_src->length = total_len;
csgf_dest->length = total_len;
+#ifdef RTE_DMA_DPAA_ERRATA_ERR050757
+   sdf = &ft->df.sdf;
+   sdf->srttype = FSL_QDMA_CMD_RWTTYPE;
+   if (total_len > FSL_QDMA_CMD_SS_ERR050757_LEN) {
+   sdf->ssen = 1;
+   sdf->sss = FSL_QDMA_CMD_SS_ERR050757_LEN;
+   sdf->ssd = FSL_QDMA_CMD_SS_ERR050757_LEN;
+   } else {
+   sdf->ssen

[v4 10/15] dma/dpaa: add silent mode support

2024-10-08 Thread Gagandeep Singh

From: Jun Yang 

add silent mode support.

Signed-off-by: Jun Yang 
Signed-off-by: Gagandeep Singh 
---
 drivers/dma/dpaa/dpaa_qdma.c | 46 
 drivers/dma/dpaa/dpaa_qdma.h |  1 +
 2 files changed, 42 insertions(+), 5 deletions(-)

diff --git a/drivers/dma/dpaa/dpaa_qdma.c b/drivers/dma/dpaa/dpaa_qdma.c
index 94be9c5fd1..041446b5bc 100644
--- a/drivers/dma/dpaa/dpaa_qdma.c
+++ b/drivers/dma/dpaa/dpaa_qdma.c
@@ -119,6 +119,7 @@ dma_pool_alloc(char *nm, int size, int aligned, dma_addr_t *phy_addr)
 static int
 fsl_qdma_pre_comp_sd_desc(struct fsl_qdma_queue *queue)
 {
+   struct fsl_qdma_engine *fsl_qdma = queue->engine;
struct fsl_qdma_sdf *sdf;
struct fsl_qdma_ddf *ddf;
struct fsl_qdma_comp_cmd_desc *ccdf;
@@ -173,7 +174,8 @@ fsl_qdma_pre_comp_sd_desc(struct fsl_qdma_queue *queue)
ccdf = &queue->cq[i];
qdma_desc_addr_set64(ccdf, phy_ft);
ccdf->format = FSL_QDMA_COMP_SG_FORMAT;
-
+   if (!fsl_qdma->is_silent)
+   ccdf->ser = 1;
ccdf->queue = queue->queue_id;
}
queue->ci = 0;
@@ -575,9 +577,12 @@ static int
 fsl_qdma_enqueue_desc_to_ring(struct fsl_qdma_queue *fsl_queue,
uint16_t num)
 {
+   struct fsl_qdma_engine *fsl_qdma = fsl_queue->engine;
uint16_t i, idx, start, dq;
int ret, dq_cnt;
 
+   if (fsl_qdma->is_silent)
+   return 0;
 
fsl_queue->desc_in_hw[fsl_queue->ci] = num;
 eq_again:
@@ -622,17 +627,34 @@ static int
 fsl_qdma_enqueue_overflow(struct fsl_qdma_queue *fsl_queue)
 {
int overflow = 0;
+   uint32_t reg;
uint16_t blk_drain, check_num, drain_num;
+   uint8_t *block = fsl_queue->block_vir;
const struct rte_dma_stats *st = &fsl_queue->stats;
struct fsl_qdma_engine *fsl_qdma = fsl_queue->engine;
 
check_num = 0;
 overflow_check:
-   overflow = (fsl_qdma_queue_bd_in_hw(fsl_queue) >=
+   if (fsl_qdma->is_silent) {
+   reg = qdma_readl_be(block +
+FSL_QDMA_BCQSR(fsl_queue->queue_id));
+   overflow = (reg & FSL_QDMA_BCQSR_QF_XOFF_BE) ?
+   1 : 0;
+   } else {
+   overflow = (fsl_qdma_queue_bd_in_hw(fsl_queue) >=
QDMA_QUEUE_CR_WM) ? 1 : 0;
+   }
 
-   if (likely(!overflow))
+   if (likely(!overflow)) {
return 0;
+   } else if (fsl_qdma->is_silent) {
+   check_num++;
+   if (check_num >= 1) {
+   DPAA_QDMA_WARN("Waiting for HW complete in silent mode");
+   check_num = 0;
+   }
+   goto overflow_check;
+   }
 
DPAA_QDMA_DP_DEBUG("TC%d/Q%d submitted(%"PRIu64")-completed(%"PRIu64") >= %d",
fsl_queue->block_id, fsl_queue->queue_id,
@@ -877,10 +899,13 @@ dpaa_get_channel(struct fsl_qdma_engine *fsl_qdma,
 }
 
 static int
-dpaa_qdma_configure(__rte_unused struct rte_dma_dev *dmadev,
-   __rte_unused const struct rte_dma_conf *dev_conf,
+dpaa_qdma_configure(struct rte_dma_dev *dmadev,
+   const struct rte_dma_conf *dev_conf,
__rte_unused uint32_t conf_sz)
 {
+   struct fsl_qdma_engine *fsl_qdma = dmadev->data->dev_private;
+
+   fsl_qdma->is_silent = dev_conf->enable_silent;
return 0;
 }
 
@@ -966,6 +991,12 @@ dpaa_qdma_dequeue_status(void *dev_private, uint16_t vchan,
struct fsl_qdma_desc *desc_complete[nb_cpls];
uint16_t i, dq_num;
 
+   if (unlikely(fsl_qdma->is_silent)) {
+   DPAA_QDMA_WARN("Can't dq in silent mode");
+
+   return 0;
+   }
+
dq_num = dpaa_qdma_block_dequeue(fsl_qdma,
fsl_queue->block_id);
DPAA_QDMA_DP_DEBUG("%s: block dq(%d)",
@@ -995,6 +1026,11 @@ dpaa_qdma_dequeue(void *dev_private,
struct fsl_qdma_desc *desc_complete[nb_cpls];
uint16_t i, dq_num;
 
+   if (unlikely(fsl_qdma->is_silent)) {
+   DPAA_QDMA_WARN("Can't dq in silent mode");
+
+   return 0;
+   }
 
*has_error = false;
dq_num = dpaa_qdma_block_dequeue(fsl_qdma,
diff --git a/drivers/dma/dpaa/dpaa_qdma.h b/drivers/dma/dpaa/dpaa_qdma.h
index 75c014f32f..9b69db517e 100644
--- a/drivers/dma/dpaa/dpaa_qdma.h
+++ b/drivers/dma/dpaa/dpaa_qdma.h
@@ -257,6 +257,7 @@ struct fsl_qdma_engine {
struct fsl_qdma_queue *chan[QDMA_BLOCKS * QDMA_QUEUES];
uint32_t num_blocks;
int block_offset;
+   int is_silent;
 };
 
 #endif /* _DPAA_QDMA_H_ */
-- 
2.25.1

[v4 13/15] dma/dpaa: add Scatter Gather support

2024-10-08 Thread Gagandeep Singh

From: Jun Yang 

Support copy_sg operation for scatter gather.
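
The pending-descriptor indexing used by dpaa_qdma_copy_sg() in the diff
below wraps with a bit mask. A minimal sketch of that technique, assuming
the ring size is a power of two (`RING_SZ` and `ring_slot` are hypothetical
names, not driver symbols):

```c
#include <assert.h>
#include <stdint.h>

#define RING_SZ 8u	/* hypothetical; must be a power of two */

/* Slot for the i-th new entry, given the ring start and pending count. */
static uint16_t
ring_slot(uint16_t start, uint16_t pending, uint16_t i)
{
	return (uint16_t)((start + pending + i) & (RING_SZ - 1));
}

/* Usage: entries appended near the end of the ring wrap to slot 0. */
static void
ring_slot_example(void)
{
	assert(ring_slot(6, 1, 0) == 7);
	assert(ring_slot(6, 1, 1) == 0);	/* wrapped */
	assert(ring_slot(0, 0, 3) == 3);
}
```

The mask only works as a modulo when the ring size is a power of two; for
other sizes a `%` operation would be needed instead.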

Signed-off-by: Jun Yang 
Signed-off-by: Gagandeep Singh 
---
 drivers/dma/dpaa/dpaa_qdma.c | 55 
 drivers/dma/dpaa/dpaa_qdma.h | 10 ++-
 2 files changed, 64 insertions(+), 1 deletion(-)

diff --git a/drivers/dma/dpaa/dpaa_qdma.c b/drivers/dma/dpaa/dpaa_qdma.c
index 6d8e9c8183..0aa3575fe9 100644
--- a/drivers/dma/dpaa/dpaa_qdma.c
+++ b/drivers/dma/dpaa/dpaa_qdma.c
@@ -1021,6 +1021,60 @@ dpaa_qdma_enqueue(void *dev_private, uint16_t vchan,
return ret;
 }
 
+static int
+dpaa_qdma_copy_sg(void *dev_private,
+   uint16_t vchan,
+   const struct rte_dma_sge *src,
+   const struct rte_dma_sge *dst,
+   uint16_t nb_src, uint16_t nb_dst,
+   uint64_t flags)
+{
+   int ret;
+   uint16_t i, start, idx;
+   struct fsl_qdma_engine *fsl_qdma = dev_private;
+   struct fsl_qdma_queue *fsl_queue = fsl_qdma->chan[vchan];
+   const uint16_t *idx_addr = NULL;
+
+   if (unlikely(nb_src != nb_dst)) {
+   DPAA_QDMA_ERR("%s: nb_src(%d) != nb_dst(%d) on  queue%d",
+   __func__, nb_src, nb_dst, vchan);
+   return -EINVAL;
+   }
+
+   if ((fsl_queue->pending_num + nb_src) > FSL_QDMA_SG_MAX_ENTRY) {
+   DPAA_QDMA_ERR("Too many pending jobs on queue%d",
+   vchan);
+   return -ENOSPC;
+   }
+   start = fsl_queue->pending_start + fsl_queue->pending_num;
+   start = start & (fsl_queue->pending_max - 1);
+   idx = start;
+
+   idx_addr = DPAA_QDMA_IDXADDR_FROM_SG_FLAG(flags);
+
+   for (i = 0; i < nb_src; i++) {
+   if (unlikely(src[i].length != dst[i].length)) {
+   DPAA_QDMA_ERR("src.len(%d) != dst.len(%d)",
+   src[i].length, dst[i].length);
+   return -EINVAL;
+   }
+   idx = (start + i) & (fsl_queue->pending_max - 1);
+   fsl_queue->pending_desc[idx].src = src[i].addr;
+   fsl_queue->pending_desc[idx].dst = dst[i].addr;
+   fsl_queue->pending_desc[idx].len = dst[i].length;
+   fsl_queue->pending_desc[idx].flag = idx_addr[i];
+   }
+   fsl_queue->pending_num += nb_src;
+
+   if (!(flags & RTE_DMA_OP_FLAG_SUBMIT))
+   return idx;
+
+   ret = fsl_qdma_enqueue_desc(fsl_queue);
+   if (!ret)
+   return fsl_queue->pending_start;
+
+   return ret;
+}
 
 static uint16_t
 dpaa_qdma_dequeue_status(void *dev_private, uint16_t vchan,
@@ -1235,6 +1289,7 @@ dpaa_qdma_probe(__rte_unused struct rte_dpaa_driver 
*dpaa_drv,
dmadev->device = &dpaa_dev->device;
dmadev->fp_obj->dev_private = dmadev->data->dev_private;
dmadev->fp_obj->copy = dpaa_qdma_enqueue;
+   dmadev->fp_obj->copy_sg = dpaa_qdma_copy_sg;
dmadev->fp_obj->submit = dpaa_qdma_submit;
dmadev->fp_obj->completed = dpaa_qdma_dequeue;
dmadev->fp_obj->completed_status = dpaa_qdma_dequeue_status;
diff --git a/drivers/dma/dpaa/dpaa_qdma.h b/drivers/dma/dpaa/dpaa_qdma.h
index 171c093117..1e820d0207 100644
--- a/drivers/dma/dpaa/dpaa_qdma.h
+++ b/drivers/dma/dpaa/dpaa_qdma.h
@@ -24,8 +24,13 @@
 #define QDMA_STATUS_REGION_OFFSET \
(QDMA_CTRL_REGION_OFFSET + QDMA_CTRL_REGION_SIZE)
 #define QDMA_STATUS_REGION_SIZE 0x1
-#define DPAA_QDMA_COPY_IDX_OFFSET 8
+
 #define DPAA_QDMA_FLAGS_INDEX RTE_BIT64(63)
+#define DPAA_QDMA_COPY_IDX_OFFSET 8
+#define DPAA_QDMA_SG_IDX_ADDR_ALIGN \
+   RTE_BIT64(DPAA_QDMA_COPY_IDX_OFFSET)
+#define DPAA_QDMA_SG_IDX_ADDR_MASK \
+   (DPAA_QDMA_SG_IDX_ADDR_ALIGN - 1)
 
 #define FSL_QDMA_DMR   0x0
 #define FSL_QDMA_DSR   0x4
@@ -194,6 +199,9 @@ struct fsl_qdma_cmpd_ft {
uint64_t phy_df;
 } __rte_packed;
 
+#define DPAA_QDMA_IDXADDR_FROM_SG_FLAG(flag) \
+   ((void *)(uintptr_t)((flag) - ((flag) & DPAA_QDMA_SG_IDX_ADDR_MASK)))
+
 #define DPAA_QDMA_IDX_FROM_FLAG(flag) \
((flag) >> DPAA_QDMA_COPY_IDX_OFFSET)
 
-- 
2.25.1
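
Note on the index-array encoding used by the patch above: the SG path recovers a per-job index array from the `flags` argument by clearing the low DPAA_QDMA_COPY_IDX_OFFSET bits. The following is a minimal standalone sketch of that packing idea, assuming the caller ORs an address aligned to 256 bytes with op flags that fit in the low 8 bits. The names COPY_IDX_OFFSET, pack_sg_flags, idx_addr_from_flags and op_flags_from_flags are hypothetical and are not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins mirroring the macros added by the patch. */
#define COPY_IDX_OFFSET   8
#define SG_IDX_ADDR_ALIGN (UINT64_C(1) << COPY_IDX_OFFSET)
#define SG_IDX_ADDR_MASK  (SG_IDX_ADDR_ALIGN - 1)

/* Combine an aligned index-array address with low-bit op flags. */
static inline uint64_t
pack_sg_flags(const uint16_t *idx_addr, uint64_t op_flags)
{
	uint64_t addr = (uint64_t)(uintptr_t)idx_addr;

	/* The scheme only works if the low address bits are free. */
	assert((addr & SG_IDX_ADDR_MASK) == 0);
	/* Assumption: op flags must fit entirely in the masked low bits. */
	assert((op_flags & ~SG_IDX_ADDR_MASK) == 0);
	return addr | op_flags;
}

/* Recover the index-array address the same way the patch's macro does. */
static inline const uint16_t *
idx_addr_from_flags(uint64_t flags)
{
	return (const uint16_t *)(uintptr_t)
		(flags - (flags & SG_IDX_ADDR_MASK));
}

/* Recover the op flags held in the low bits. */
static inline uint64_t
op_flags_from_flags(uint64_t flags)
{
	return flags & SG_IDX_ADDR_MASK;
}
```

The alignment requirement is what makes the low bits of the address available for flags; an unaligned index array would be silently truncated by the mask.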

[v4 12/15] dma/dpaa: qdma stall workaround for ERR050265

2024-10-08 Thread Gagandeep Singh

From: Jun Yang 

Non-prefetchable read setting in the source descriptor may be
required for targets other than local memory. Prefetchable read
setting will offer better performance for misaligned transfers
in the form of fewer transactions and should be set if possible.
This patch also fixes a QDMA stall issue caused by unaligned
transactions.

Signed-off-by: Jun Yang 
Signed-off-by: Gagandeep Singh 
---
 config/arm/meson.build   | 3 ++-
 doc/guides/dmadevs/dpaa.rst  | 1 +
 drivers/dma/dpaa/dpaa_qdma.c | 9 +
 3 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/config/arm/meson.build b/config/arm/meson.build
index f81e466318..f63ef41130 100644
--- a/config/arm/meson.build
+++ b/config/arm/meson.build
@@ -469,7 +469,8 @@ soc_dpaa = {
 ['RTE_LIBRTE_DPAA2_USE_PHYS_IOVA', false],
 ['RTE_MAX_LCORE', 16],
 ['RTE_MAX_NUMA_NODES', 1],
-   ['RTE_DMA_DPAA_ERRATA_ERR050757', true]
+   ['RTE_DMA_DPAA_ERRATA_ERR050757', true],
+   ['RTE_DMA_DPAA_ERRATA_ERR050265', true]
 ],
 'numa': false
 }
diff --git a/doc/guides/dmadevs/dpaa.rst b/doc/guides/dmadevs/dpaa.rst
index 746919ec6b..8a7c0befc3 100644
--- a/doc/guides/dmadevs/dpaa.rst
+++ b/doc/guides/dmadevs/dpaa.rst
@@ -43,6 +43,7 @@ For builds using ``meson`` and ``ninja``, the driver will be 
built when the
 target platform is dpaa-based. No additional compilation steps are necessary.
 
 - ``RTE_DMA_DPAA_ERRATA_ERR050757`` - enable software workaround for 
Errata-A050757
+- ``RTE_DMA_DPAA_ERRATA_ERR050265`` - enable software workaround for 
Errata-A050265
 
 Initialization
 --
diff --git a/drivers/dma/dpaa/dpaa_qdma.c b/drivers/dma/dpaa/dpaa_qdma.c
index dbc53b784f..6d8e9c8183 100644
--- a/drivers/dma/dpaa/dpaa_qdma.c
+++ b/drivers/dma/dpaa/dpaa_qdma.c
@@ -167,6 +167,9 @@ fsl_qdma_pre_comp_sd_desc(struct fsl_qdma_queue *queue)
 
/* Descriptor Buffer */
sdf->srttype = FSL_QDMA_CMD_RWTTYPE;
+#ifdef RTE_DMA_DPAA_ERRATA_ERR050265
+   sdf->prefetch = 1;
+#endif
ddf->dwttype = FSL_QDMA_CMD_RWTTYPE;
ddf->lwc = FSL_QDMA_CMD_LWC;
 
@@ -707,6 +710,9 @@ fsl_qdma_enqueue_desc_single(struct fsl_qdma_queue 
*fsl_queue,
 #ifdef RTE_DMA_DPAA_ERRATA_ERR050757
sdf = &ft->df.sdf;
sdf->srttype = FSL_QDMA_CMD_RWTTYPE;
+#ifdef RTE_DMA_DPAA_ERRATA_ERR050265
+   sdf->prefetch = 1;
+#endif
if (len > FSL_QDMA_CMD_SS_ERR050757_LEN) {
sdf->ssen = 1;
sdf->sss = FSL_QDMA_CMD_SS_ERR050757_LEN;
@@ -820,6 +826,9 @@ fsl_qdma_enqueue_desc_sg(struct fsl_qdma_queue *fsl_queue)
 #ifdef RTE_DMA_DPAA_ERRATA_ERR050757
sdf = &ft->df.sdf;
sdf->srttype = FSL_QDMA_CMD_RWTTYPE;
+#ifdef RTE_DMA_DPAA_ERRATA_ERR050265
+   sdf->prefetch = 1;
+#endif
if (total_len > FSL_QDMA_CMD_SS_ERR050757_LEN) {
sdf->ssen = 1;
sdf->sss = FSL_QDMA_CMD_SS_ERR050757_LEN;
-- 
2.25.1

[v4 08/15] dma/dpaa: refactor driver

2024-10-08 Thread Gagandeep Singh

From: Jun Yang 

This patch refactors the DPAA DMA driver code with the following changes:
 - Rename the HW descriptors and update them with details.
 - Update the qdma engine and queue structures.
 - Use rte_ring APIs for enqueue and dequeue.

Signed-off-by: Jun Yang 
Signed-off-by: Gagandeep Singh 
---
 drivers/dma/dpaa/dpaa_qdma.c | 1330 ++
 drivers/dma/dpaa/dpaa_qdma.h |  222 +++---
 2 files changed, 864 insertions(+), 688 deletions(-)

diff --git a/drivers/dma/dpaa/dpaa_qdma.c b/drivers/dma/dpaa/dpaa_qdma.c
index 3d4fd818f8..a10a867580 100644
--- a/drivers/dma/dpaa/dpaa_qdma.c
+++ b/drivers/dma/dpaa/dpaa_qdma.c
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: BSD-3-Clause
- * Copyright 2021 NXP
+ * Copyright 2021-2024 NXP
  */
 
 #include 
@@ -8,69 +8,71 @@
 #include "dpaa_qdma.h"
 #include "dpaa_qdma_logs.h"
 
+static uint32_t s_sg_max_entry_sz = 2000;
+
 static inline void
-qdma_desc_addr_set64(struct fsl_qdma_format *ccdf, u64 addr)
+qdma_desc_addr_set64(struct fsl_qdma_comp_cmd_desc *ccdf, u64 addr)
 {
ccdf->addr_hi = upper_32_bits(addr);
ccdf->addr_lo = rte_cpu_to_le_32(lower_32_bits(addr));
 }
 
-static inline u64
-qdma_ccdf_get_queue(const struct fsl_qdma_format *ccdf)
+static inline void
+qdma_desc_sge_addr_set64(struct fsl_qdma_comp_sg_desc *sge, u64 addr)
 {
-   return ccdf->cfg8b_w1 & 0xff;
+   sge->addr_hi = upper_32_bits(addr);
+   sge->addr_lo = rte_cpu_to_le_32(lower_32_bits(addr));
 }
 
 static inline int
-qdma_ccdf_get_offset(const struct fsl_qdma_format *ccdf)
+qdma_ccdf_get_queue(struct fsl_qdma_comp_cmd_desc *ccdf,
+   uint8_t *queue_idx)
 {
-   return (rte_le_to_cpu_32(ccdf->cfg) & QDMA_CCDF_MASK)
-   >> QDMA_CCDF_OFFSET;
-}
+   uint64_t addr = ((uint64_t)ccdf->addr_hi) << 32 | ccdf->addr_lo;
+
+   if (addr && queue_idx)
+   *queue_idx = ccdf->queue;
+   if (addr) {
+   ccdf->addr_hi = 0;
+   ccdf->addr_lo = 0;
+   return true;
+   }
 
-static inline void
-qdma_ccdf_set_format(struct fsl_qdma_format *ccdf, int offset)
-{
-   ccdf->cfg = rte_cpu_to_le_32(QDMA_CCDF_FOTMAT | offset);
+   return false;
 }
 
 static inline int
-qdma_ccdf_get_status(const struct fsl_qdma_format *ccdf)
+ilog2(int x)
 {
-   return (rte_le_to_cpu_32(ccdf->status) & QDMA_CCDF_MASK)
-   >> QDMA_CCDF_STATUS;
-}
+   int log = 0;
 
-static inline void
-qdma_ccdf_set_ser(struct fsl_qdma_format *ccdf, int status)
-{
-   ccdf->status = rte_cpu_to_le_32(QDMA_CCDF_SER | status);
+   x >>= 1;
+
+   while (x) {
+   log++;
+   x >>= 1;
+   }
+   return log;
 }
 
-static inline void
-qdma_csgf_set_len(struct fsl_qdma_format *csgf, int len)
+static inline int
+ilog2_qsize(uint32_t q_size)
 {
-   csgf->cfg = rte_cpu_to_le_32(len & QDMA_SG_LEN_MASK);
+   return (ilog2(q_size) - ilog2(64));
 }
 
-static inline void
-qdma_csgf_set_f(struct fsl_qdma_format *csgf, int len)
+static inline int
+ilog2_qthld(uint32_t q_thld)
 {
-   csgf->cfg = rte_cpu_to_le_32(QDMA_SG_FIN | (len & QDMA_SG_LEN_MASK));
+   return (ilog2(q_thld) - ilog2(16));
 }
 
 static inline int
-ilog2(int x)
+fsl_qdma_queue_bd_in_hw(struct fsl_qdma_queue *fsl_queue)
 {
-   int log = 0;
-
-   x >>= 1;
+   struct rte_dma_stats *stats = &fsl_queue->stats;
 
-   while (x) {
-   log++;
-   x >>= 1;
-   }
-   return log;
+   return (stats->submitted - stats->completed);
 }
 
 static u32
@@ -97,12 +99,12 @@ qdma_writel_be(u32 val, void *addr)
QDMA_OUT_BE(addr, val);
 }
 
-static void
-*dma_pool_alloc(int size, int aligned, dma_addr_t *phy_addr)
+static void *
+dma_pool_alloc(char *nm, int size, int aligned, dma_addr_t *phy_addr)
 {
void *virt_addr;
 
-   virt_addr = rte_malloc("dma pool alloc", size, aligned);
+   virt_addr = rte_zmalloc(nm, size, aligned);
if (!virt_addr)
return NULL;
 
@@ -111,268 +113,221 @@ static void
return virt_addr;
 }
 
-static void
-dma_pool_free(void *addr)
-{
-   rte_free(addr);
-}
-
-static void
-fsl_qdma_free_chan_resources(struct fsl_qdma_chan *fsl_chan)
-{
-   struct fsl_qdma_queue *fsl_queue = fsl_chan->queue;
-   struct fsl_qdma_engine *fsl_qdma = fsl_chan->qdma;
-   struct fsl_qdma_comp *comp_temp, *_comp_temp;
-   int id;
-
-   if (--fsl_queue->count)
-   goto finally;
-
-   id = (fsl_qdma->block_base - fsl_queue->block_base) /
- fsl_qdma->block_offset;
-
-   while (rte_atomic32_read(&wait_task[id]) == 1)
-   rte_delay_us(QDMA_DELAY);
-
-   list_for_each_entry_safe(comp_temp, _comp_temp,
-&fsl_queue->comp_used, list) {
-   list_del(&comp_temp->list);
-   dma_pool_free(comp_temp->virt_addr);
-   dma_pool_free(comp_temp->desc_virt_addr);
-   rte_free(comp_temp);
-   }
-
-

[v4 04/15] dma/dpaa2: add short FD support

2024-10-08 Thread Gagandeep Singh

From: Jun Yang 

Short FD can be used for the single-transfer scenario, where it shows
higher performance than FLE.
1) Save the index context in the FD att field for short and FLE (non-SG) formats.
2) Identify the FD type by the att field of the FD.
3) Force 48-bit addresses for the source address and FLE according to the spec.

Signed-off-by: Jun Yang 
---
 doc/guides/dmadevs/dpaa2.rst   |   2 +
 drivers/dma/dpaa2/dpaa2_qdma.c | 314 +++--
 drivers/dma/dpaa2/dpaa2_qdma.h |  69 --
 drivers/dma/dpaa2/rte_pmd_dpaa2_qdma.h |  13 -
 4 files changed, 286 insertions(+), 112 deletions(-)

diff --git a/doc/guides/dmadevs/dpaa2.rst b/doc/guides/dmadevs/dpaa2.rst
index 079337e61c..a358434aca 100644
--- a/doc/guides/dmadevs/dpaa2.rst
+++ b/doc/guides/dmadevs/dpaa2.rst
@@ -81,3 +81,5 @@ Device Argumenst
usage example: ``fslmc:dpdmai.1,fle_pre_populate=1``
 2. Use dev arg option ``desc_debug=1`` to enable descriptor debugs.
usage example: ``fslmc:dpdmai.1,desc_debug=1``
+2. Use dev arg option ``short_fd=1`` to enable short FDs.
+   usage example: ``fslmc:dpdmai.1,short_fd=1``
diff --git a/drivers/dma/dpaa2/dpaa2_qdma.c b/drivers/dma/dpaa2/dpaa2_qdma.c
index 3a6aa69e8b..23ecf4c5ac 100644
--- a/drivers/dma/dpaa2/dpaa2_qdma.c
+++ b/drivers/dma/dpaa2/dpaa2_qdma.c
@@ -16,6 +16,7 @@
 
 #define DPAA2_QDMA_FLE_PRE_POPULATE "fle_pre_populate"
 #define DPAA2_QDMA_DESC_DEBUG "desc_debug"
+#define DPAA2_QDMA_USING_SHORT_FD "short_fd"
 
 static uint32_t dpaa2_coherent_no_alloc_cache;
 static uint32_t dpaa2_coherent_alloc_cache;
@@ -560,7 +561,6 @@ dpaa2_qdma_long_fmt_dump(const struct qbman_fle *fle)
const struct qdma_cntx_fle_sdd *fle_sdd;
const struct qdma_sdd *sdd;
const struct qdma_cntx_sg *cntx_sg = NULL;
-   const struct qdma_cntx_long *cntx_long = NULL;
 
fle_sdd = container_of(fle, const struct qdma_cntx_fle_sdd, fle[0]);
sdd = fle_sdd->sdd;
@@ -583,11 +583,8 @@ dpaa2_qdma_long_fmt_dump(const struct qbman_fle *fle)
QBMAN_FLE_WORD4_FMT_SGE) {
cntx_sg = container_of(fle_sdd, const struct qdma_cntx_sg,
fle_sdd);
-   } else if (fle[DPAA2_QDMA_SRC_FLE].word4.fmt ==
+   } else if (fle[DPAA2_QDMA_SRC_FLE].word4.fmt !=
QBMAN_FLE_WORD4_FMT_SBF) {
-   cntx_long = container_of(fle_sdd, const struct qdma_cntx_long,
-   fle_sdd);
-   } else {
DPAA2_QDMA_ERR("Unsupported fle format:%d",
fle[DPAA2_QDMA_SRC_FLE].word4.fmt);
return;
@@ -598,11 +595,6 @@ dpaa2_qdma_long_fmt_dump(const struct qbman_fle *fle)
dpaa2_qdma_sdd_dump(&sdd[i]);
}
 
-   if (cntx_long) {
-   DPAA2_QDMA_INFO("long format/Single buffer cntx idx:%d",
-   cntx_long->cntx_idx);
-   }
-
if (cntx_sg) {
DPAA2_QDMA_INFO("long format/SG format, job number:%d",
cntx_sg->job_nb);
@@ -620,6 +612,8 @@ dpaa2_qdma_long_fmt_dump(const struct qbman_fle *fle)
DPAA2_QDMA_INFO("cntx_idx[%d]:%d", i,
cntx_sg->cntx_idx[i]);
}
+   } else {
+   DPAA2_QDMA_INFO("long format/Single buffer cntx");
}
 }
 
@@ -682,7 +676,7 @@ dpaa2_qdma_copy_sg(void *dev_private,
offsetof(struct qdma_cntx_sg, fle_sdd) +
offsetof(struct qdma_cntx_fle_sdd, fle);
 
-   DPAA2_SET_FD_ADDR(fd, fle_iova);
+   dpaa2_qdma_fd_set_addr(fd, fle_iova);
DPAA2_SET_FD_COMPOUND_FMT(fd);
DPAA2_SET_FD_FLC(fd, (uint64_t)cntx_sg);
 
@@ -718,6 +712,7 @@ dpaa2_qdma_copy_sg(void *dev_private,
if (unlikely(qdma_vq->flags & DPAA2_QDMA_DESC_DEBUG_FLAG))
dpaa2_qdma_long_fmt_dump(cntx_sg->fle_sdd.fle);
 
+   dpaa2_qdma_fd_save_att(fd, 0, DPAA2_QDMA_FD_SG);
qdma_vq->fd_idx++;
qdma_vq->silent_idx =
(qdma_vq->silent_idx + 1) & (DPAA2_QDMA_MAX_DESC - 1);
@@ -734,74 +729,178 @@ dpaa2_qdma_copy_sg(void *dev_private,
return ret;
 }
 
+static inline void
+qdma_populate_fd_pci(uint64_t src, uint64_t dest,
+   uint32_t len, struct qbman_fd *fd,
+   struct dpaa2_qdma_rbp *rbp, int ser)
+{
+   fd->simple_pci.saddr_lo = lower_32_bits(src);
+   fd->simple_pci.saddr_hi = upper_32_bits(src);
+
+   fd->simple_pci.len_sl = len;
+
+   fd->simple_pci.bmt = DPAA2_QDMA_BMT_DISABLE;
+   fd->simple_pci.fmt = DPAA2_QDMA_FD_SHORT_FORMAT;
+   fd->simple_pci.sl = 1;
+   fd->simple_pci.ser = ser;
+   if (ser)
+   fd->simple.frc |= QDMA_SER_CTX;
+
+   fd->simple_pci.sportid = rbp->sportid;
+
+   fd->simple_pci.svfid = rbp->svfid;
+   fd->simple_pci.spfid = rbp->spfid;
+   fd->simple_pci.svfa = rbp->svfa;
+   fd->simple_pci.dvfid = rbp->dvfid;
+   fd->simple_pci.dpfid = rbp->dpfid;
+   fd->simple_pci.dvfa = rbp->dvfa;
+
+   fd->simple_pci.srbp = rbp->s

[v4 07/15] dma/dpaa2: move the qdma header to common place

2024-10-08 Thread Gagandeep Singh

From: Jun Yang 

Include rte_pmd_dpaax_qdma.h instead of rte_pmd_dpaa2_qdma.h
and change code accordingly.

Signed-off-by: Jun Yang 
---
 doc/api/doxy-api-index.md |  2 +-
 doc/api/doxy-api.conf.in  |  2 +-
 drivers/common/dpaax/meson.build  |  3 +-
 drivers/common/dpaax/rte_pmd_dpaax_qdma.h | 23 +++
 drivers/dma/dpaa2/dpaa2_qdma.c| 84 +++
 drivers/dma/dpaa2/dpaa2_qdma.h| 10 +--
 drivers/dma/dpaa2/meson.build |  4 +-
 drivers/dma/dpaa2/rte_pmd_dpaa2_qdma.h| 23 ---
 8 files changed, 72 insertions(+), 79 deletions(-)
 create mode 100644 drivers/common/dpaax/rte_pmd_dpaax_qdma.h
 delete mode 100644 drivers/dma/dpaa2/rte_pmd_dpaa2_qdma.h

diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index f9f0300126..5a4411eb4a 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -57,7 +57,7 @@ The public API headers are grouped by topics:
   [mlx5](@ref rte_pmd_mlx5.h),
   [dpaa2_mempool](@ref rte_dpaa2_mempool.h),
   [dpaa2_cmdif](@ref rte_pmd_dpaa2_cmdif.h),
-  [dpaa2_qdma](@ref rte_pmd_dpaa2_qdma.h),
+  [dpaax](@ref rte_pmd_dpaax_qdma.h),
   [crypto_scheduler](@ref rte_cryptodev_scheduler.h),
   [dlb2](@ref rte_pmd_dlb2.h),
   [ifpga](@ref rte_pmd_ifpga.h)
diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in
index a8823c046f..33250d867c 100644
--- a/doc/api/doxy-api.conf.in
+++ b/doc/api/doxy-api.conf.in
@@ -8,7 +8,7 @@ INPUT   = @TOPDIR@/doc/api/doxy-api-index.md \
   @TOPDIR@/drivers/bus/vdev \
   @TOPDIR@/drivers/crypto/cnxk \
   @TOPDIR@/drivers/crypto/scheduler \
-  @TOPDIR@/drivers/dma/dpaa2 \
+  @TOPDIR@/drivers/common/dpaax \
   @TOPDIR@/drivers/event/dlb2 \
   @TOPDIR@/drivers/event/cnxk \
   @TOPDIR@/drivers/mempool/cnxk \
diff --git a/drivers/common/dpaax/meson.build b/drivers/common/dpaax/meson.build
index a162779116..db61b76ce3 100644
--- a/drivers/common/dpaax/meson.build
+++ b/drivers/common/dpaax/meson.build
@@ -1,5 +1,5 @@
 # SPDX-License-Identifier: BSD-3-Clause
-# Copyright(c) 2018 NXP
+# Copyright 2018, 2024 NXP
 
 if not is_linux
 build = false
@@ -16,3 +16,4 @@ endif
 if cc.has_argument('-Wno-pointer-arith')
 cflags += '-Wno-pointer-arith'
 endif
+headers = files('rte_pmd_dpaax_qdma.h')
diff --git a/drivers/common/dpaax/rte_pmd_dpaax_qdma.h 
b/drivers/common/dpaax/rte_pmd_dpaax_qdma.h
new file mode 100644
index 00..2552a4adfb
--- /dev/null
+++ b/drivers/common/dpaax/rte_pmd_dpaax_qdma.h
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2021-2024 NXP
+ */
+
+#ifndef _RTE_PMD_DPAAX_QDMA_H_
+#define _RTE_PMD_DPAAX_QDMA_H_
+
+#include 
+
+#define RTE_DPAAX_QDMA_COPY_IDX_OFFSET 8
+#define RTE_DPAAX_QDMA_SG_IDX_ADDR_ALIGN \
+   RTE_BIT64(RTE_DPAAX_QDMA_COPY_IDX_OFFSET)
+#define RTE_DPAAX_QDMA_SG_IDX_ADDR_MASK \
+   (RTE_DPAAX_QDMA_SG_IDX_ADDR_ALIGN - 1)
+#define RTE_DPAAX_QDMA_SG_SUBMIT(idx_addr, flag) \
+   (((uint64_t)idx_addr) | (flag))
+
+#define RTE_DPAAX_QDMA_COPY_SUBMIT(idx, flag) \
+   ((idx << RTE_DPAAX_QDMA_COPY_IDX_OFFSET) | (flag))
+
+#define RTE_DPAAX_QDMA_JOB_SUBMIT_MAX 64
+#define RTE_DMA_CAPA_DPAAX_QDMA_FLAGS_INDEX RTE_BIT64(63)
+#endif /* _RTE_PMD_DPAAX_QDMA_H_ */
diff --git a/drivers/dma/dpaa2/dpaa2_qdma.c b/drivers/dma/dpaa2/dpaa2_qdma.c
index 180ffb3468..c36cf6cbe6 100644
--- a/drivers/dma/dpaa2/dpaa2_qdma.c
+++ b/drivers/dma/dpaa2/dpaa2_qdma.c
@@ -10,7 +10,7 @@
 
 #include 
 
-#include "rte_pmd_dpaa2_qdma.h"
+#include 
 #include "dpaa2_qdma.h"
 #include "dpaa2_qdma_logs.h"
 
@@ -251,16 +251,16 @@ fle_sdd_pre_populate(struct qdma_cntx_fle_sdd *fle_sdd,
}
/* source frame list to source buffer */
DPAA2_SET_FLE_ADDR(&fle[DPAA2_QDMA_SRC_FLE], src);
-#ifdef RTE_LIBRTE_DPAA2_USE_PHYS_IOVA
-   DPAA2_SET_FLE_BMT(&fle[DPAA2_QDMA_SRC_FLE]);
-#endif
+   /** IOMMU is always on for either VA or PA mode,
+* so Bypass Memory Translation should be disabled.
+*
+* DPAA2_SET_FLE_BMT(&fle[DPAA2_QDMA_SRC_FLE]);
+* DPAA2_SET_FLE_BMT(&fle[DPAA2_QDMA_DST_FLE]);
+*/
fle[DPAA2_QDMA_SRC_FLE].word4.fmt = fmt;
 
/* destination frame list to destination buffer */
DPAA2_SET_FLE_ADDR(&fle[DPAA2_QDMA_DST_FLE], dest);
-#ifdef RTE_LIBRTE_DPAA2_USE_PHYS_IOVA
-   DPAA2_SET_FLE_BMT(&fle[DPAA2_QDMA_DST_FLE]);
-#endif
fle[DPAA2_QDMA_DST_FLE].word4.fmt = fmt;
 
/* Final bit: 1, for last frame list */
@@ -274,23 +274,21 @@ sg_entry_pre_populate(struct qdma_cntx_sg *sg_cntx)
struct qdma_sg_entry *src_sge = sg_cntx->sg_src_entry;
struct qdma_sg_entry *dst_sge = sg_cntx->sg_dst_entry;
 
-   for (i = 0; i < RTE_DPAA2_QDMA_JOB_SUBMIT_MAX; i++) {
+   for (i = 0; i < RTE_DPAAX_QD

[v4 09/15] dma/dpaa: support burst capacity API

2024-10-08 Thread Gagandeep Singh

From: Jun Yang 

This patch improves the dpaa qdma driver and
adds the dpaa_qdma_burst_capacity API, which returns the
remaining space in the descriptor ring.

Signed-off-by: Jun Yang 
Signed-off-by: Gagandeep Singh 
---
 drivers/dma/dpaa/dpaa_qdma.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/drivers/dma/dpaa/dpaa_qdma.c b/drivers/dma/dpaa/dpaa_qdma.c
index a10a867580..94be9c5fd1 100644
--- a/drivers/dma/dpaa/dpaa_qdma.c
+++ b/drivers/dma/dpaa/dpaa_qdma.c
@@ -1039,6 +1039,15 @@ dpaa_qdma_stats_reset(struct rte_dma_dev *dmadev, 
uint16_t vchan)
return 0;
 }
 
+static uint16_t
+dpaa_qdma_burst_capacity(const void *dev_private, uint16_t vchan)
+{
+   const struct fsl_qdma_engine *fsl_qdma = dev_private;
+   struct fsl_qdma_queue *fsl_queue = fsl_qdma->chan[vchan];
+
+   return fsl_queue->pending_max - fsl_queue->pending_num;
+}
+
 static struct rte_dma_dev_ops dpaa_qdma_ops = {
.dev_info_get = dpaa_qdma_info_get,
.dev_configure= dpaa_qdma_configure,
@@ -1152,6 +1161,7 @@ dpaa_qdma_probe(__rte_unused struct rte_dpaa_driver 
*dpaa_drv,
dmadev->fp_obj->submit = dpaa_qdma_submit;
dmadev->fp_obj->completed = dpaa_qdma_dequeue;
dmadev->fp_obj->completed_status = dpaa_qdma_dequeue_status;
+   dmadev->fp_obj->burst_capacity = dpaa_qdma_burst_capacity;
 
/* Invoke PMD device initialization function */
ret = dpaa_qdma_init(dmadev);
-- 
2.25.1
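
Note on the capacity calculation above: the remaining space is simply pending_max minus pending_num, and elsewhere in this series the slot for a new job is found by masking with pending_max - 1. A minimal standalone sketch of that bookkeeping follows, assuming a ring whose size is a power of two. The struct pending_ring type, PENDING_MAX value and helper names are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define PENDING_MAX 8 /* assumed ring size, must be a power of two */

/* Hypothetical pending-descriptor ring. */
struct pending_ring {
	uint16_t start;            /* index of the oldest pending entry */
	uint16_t num;              /* number of pending entries */
	uint32_t len[PENDING_MAX]; /* payload per slot (length only here) */
};

/* Free slots left, analogous to pending_max - pending_num. */
static inline uint16_t
ring_capacity(const struct pending_ring *r)
{
	return (uint16_t)(PENDING_MAX - r->num);
}

/* Append one entry; returns its slot index, or -1 when the ring is full. */
static inline int
ring_push(struct pending_ring *r, uint32_t len)
{
	uint16_t idx;

	if (ring_capacity(r) == 0)
		return -1;
	/* Wrap around using the power-of-two mask. */
	idx = (uint16_t)((r->start + r->num) & (PENDING_MAX - 1));
	r->len[idx] = len;
	r->num++;
	return idx;
}
```

An application would typically query the capacity first and only enqueue as many jobs as fit, rather than relying on the enqueue call to fail.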

[v4 06/15] dma/dpaa2: change the DMA copy return value

2024-10-08 Thread Gagandeep Singh

From: Jun Yang 

On success, the return value of DMA copy/SG copy should be the
index of the copied descriptor.

Signed-off-by: Jun Yang 
---
 drivers/dma/dpaa2/dpaa2_qdma.c | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/dma/dpaa2/dpaa2_qdma.c b/drivers/dma/dpaa2/dpaa2_qdma.c
index 23ecf4c5ac..180ffb3468 100644
--- a/drivers/dma/dpaa2/dpaa2_qdma.c
+++ b/drivers/dma/dpaa2/dpaa2_qdma.c
@@ -644,6 +644,11 @@ dpaa2_qdma_copy_sg(void *dev_private,
return -ENOTSUP;
}
 
+   if (unlikely(!nb_src)) {
+   DPAA2_QDMA_ERR("No SG entry specified");
+   return -EINVAL;
+   }
+
if (unlikely(nb_src > RTE_DPAA2_QDMA_JOB_SUBMIT_MAX)) {
DPAA2_QDMA_ERR("SG entry number(%d) > MAX(%d)",
nb_src, RTE_DPAA2_QDMA_JOB_SUBMIT_MAX);
@@ -720,10 +725,13 @@ dpaa2_qdma_copy_sg(void *dev_private,
if (flags & RTE_DMA_OP_FLAG_SUBMIT) {
expected = qdma_vq->fd_idx;
ret = dpaa2_qdma_multi_eq(qdma_vq);
-   if (likely(ret == expected))
-   return 0;
+   if (likely(ret == expected)) {
+   qdma_vq->copy_num += nb_src;
+   return (qdma_vq->copy_num - 1) & UINT16_MAX;
+   }
} else {
-   return 0;
+   qdma_vq->copy_num += nb_src;
+   return (qdma_vq->copy_num - 1) & UINT16_MAX;
}
 
return ret;
-- 
2.25.1
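
Note on the returned index: the patch accumulates the number of accepted jobs in copy_num and returns the index of the last one, wrapped to 16 bits. A minimal standalone sketch of that arithmetic follows. The struct vq_state type and the account_and_index helper are hypothetical, and the 64-bit width of the counter is an assumption, since the field's declaration is not shown in this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-queue state; only the running job counter matters. */
struct vq_state {
	uint64_t copy_num; /* total jobs accepted so far (width assumed) */
};

/* Account for nb jobs and return the 16-bit index of the last one. */
static inline int
account_and_index(struct vq_state *vq, uint16_t nb)
{
	vq->copy_num += nb;
	/* Index of the last accepted job, wrapped to the 16-bit range. */
	return (int)((vq->copy_num - 1) & UINT16_MAX);
}
```

Because the result is masked with UINT16_MAX, it is never negative and so cannot be confused with a negative errno return.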

Re: [PATCH v2 04/10] baseband/acc: future proof structure comparison

2024-10-08 Thread Maxime Coquelin

On 10/3/24 22:49, Hernan Vargas wrote:

Some code in the PMD relies on size assumptions about the bbdev
structures; it should use sizeof instead, to be more future-proof
in case these structures change.

Signed-off-by: Hernan Vargas 
---
  drivers/baseband/acc/acc_common.h | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)


Reviewed-by: Maxime Coquelin 

Thanks,
Maxime

Re: [PATCH v2 05/10] baseband/acc: enhance SW ring alignment

2024-10-08 Thread Maxime Coquelin





On 10/3/24 22:49, Hernan Vargas wrote:

Calculate the aligned total size required for queue rings, ensuring that
the size is a power of two for proper memory allocation.

Signed-off-by: Hernan Vargas 
---
  drivers/baseband/acc/acc_common.h | 7 ---
  1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/baseband/acc/acc_common.h 
b/drivers/baseband/acc/acc_common.h
index 0d1c26166ff2..8ac1ca001c1d 100644
--- a/drivers/baseband/acc/acc_common.h
+++ b/drivers/baseband/acc/acc_common.h
@@ -767,19 +767,20 @@ alloc_sw_rings_min_mem(struct rte_bbdev *dev, struct 
acc_device *d,
int i = 0;
uint32_t q_sw_ring_size = ACC_MAX_QUEUE_DEPTH * get_desc_len();
uint32_t dev_sw_ring_size = q_sw_ring_size * num_queues;
-   /* Free first in case this is a reconfiguration */
+   uint32_t alignment = q_sw_ring_size * rte_align32pow2(num_queues);
+   /* Free first in case this is dev_sw_ring_size, q_sw_ring_size, 
socket); reconfiguration */


There is a copy/paste mistake in the comment?


rte_free(d->sw_rings_base);
  
  	/* Find an aligned block of memory to store sw rings */

while (i < ACC_SW_RING_MEM_ALLOC_ATTEMPTS) {
/*
 * sw_ring allocated memory is guaranteed to be aligned to
-* q_sw_ring_size at the condition that the requested size is
+* alignment at the condition that the requested size is


This comment is really unclear "aligned to alignment"


 * less than the page size
 */
sw_rings_base = rte_zmalloc_socket(
dev->device->driver->name,
-   dev_sw_ring_size, q_sw_ring_size, socket);
+   dev_sw_ring_size, alignment, socket);
  
  		if (sw_rings_base == NULL) {

rte_acc_log(ERR,
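
Note on the alignment computation discussed above: the patch replaces the per-queue ring size with q_sw_ring_size multiplied by the queue count rounded up to a power of two. A minimal standalone sketch of that arithmetic follows. The align32pow2 helper is a local stand-in written for this example rather than the DPDK function itself, and the sizes in the usage note are made-up figures, not values from the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to the next power of two (v > 0). Local stand-in helper. */
static inline uint32_t
align32pow2(uint32_t v)
{
	v--;
	v |= v >> 1;
	v |= v >> 2;
	v |= v >> 4;
	v |= v >> 8;
	v |= v >> 16;
	return v + 1;
}

/* Alignment as computed in the patch: per-queue size times pow2(queues). */
static inline uint32_t
sw_ring_alignment(uint32_t q_sw_ring_size, uint32_t num_queues)
{
	return q_sw_ring_size * align32pow2(num_queues);
}
```

For example, with an assumed per-queue ring size of 4096 bytes and 5 queues, the total size is 20480 bytes while the alignment becomes 32768 bytes, which is a power of two as long as the per-queue size is one.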

Re: [PATCH v5 1/2] timer: lower rounding of TSC estimation to 100KHz

2024-10-08 Thread David Marchand

On Thu, Oct 3, 2024 at 5:13 PM Stephen Hemminger
 wrote:
> On Thu, 3 Oct 2024 15:05:02 +0100
> Bruce Richardson  wrote:
>
> > On Thu, Oct 03, 2024 at 03:26:03PM +0300, Isaac Boukris wrote:
> > > In practice, the frequency is often not a nice round number, while
> > > the estimation results are rather accurate, just a couple of KHz
> > > away from the kernel's tsc_khz value, so it should suffice.
> > >
> > > Rounding to 10MHz can cause a significant drift from real time,
> > > up to a second per 10 minutes.
> > >
> > > See also bugzilla: 959
> > >
> > > Signed-off-by: Isaac Boukris 
> > Acked-by: Bruce Richardson 
> Acked-by: Stephen Hemminger 

Series applied.
Thanks Isaac.


-- 
David Marchand
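
Note on the drift figure quoted above: rounding to the nearest 10MHz allows an error of up to 5MHz, and on a TSC of roughly 3GHz (an assumed figure, used only for illustration) that is about one part in 600, i.e. about one second per 10 minutes. Rounding to 100KHz bounds the error at 50KHz, about 10 ms over the same interval. A minimal standalone sketch of the arithmetic follows; round_hz and drift_us are hypothetical helpers, not the EAL implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Round hz to the nearest multiple of step. */
static inline uint64_t
round_hz(uint64_t hz, uint64_t step)
{
	return ((hz + step / 2) / step) * step;
}

/*
 * Drift in microseconds accumulated over interval_s seconds when a
 * counter really running at real_hz is assumed to run at est_hz.
 */
static inline uint64_t
drift_us(uint64_t real_hz, uint64_t est_hz, uint64_t interval_s)
{
	uint64_t err = real_hz > est_hz ? real_hz - est_hz : est_hz - real_hz;

	return err * interval_s * UINT64_C(1000000) / real_hz;
}
```

With a real frequency of 2995000000 Hz, 10MHz rounding yields 3000000000 Hz, while 100KHz rounding leaves the value unchanged.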

Re: [PATCH v2 06/10] baseband/acc: remove soft output bypass

2024-10-08 Thread Maxime Coquelin





On 10/3/24 22:49, Hernan Vargas wrote:

Removing soft output bypass capability due to device limitations.


It should be specified this is for VRB2 device variant.

And this should be backported, so pass Fixes tag and cc stable as it was
introduced in v23.11 LTS.

Fixes: b49fe052f9cd ("baseband/acc: add FEC capabilities for VRB2 variant")
Cc: sta...@dpdk.org

Thanks,
Maxime



Signed-off-by: Hernan Vargas 
---
  drivers/baseband/acc/rte_vrb_pmd.c | 7 +++
  1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/baseband/acc/rte_vrb_pmd.c 
b/drivers/baseband/acc/rte_vrb_pmd.c
index 26335d55ba3b..88201d11de88 100644
--- a/drivers/baseband/acc/rte_vrb_pmd.c
+++ b/drivers/baseband/acc/rte_vrb_pmd.c
@@ -1359,7 +1359,6 @@ vrb_dev_info_get(struct rte_bbdev *dev, struct 
rte_bbdev_driver_info *dev_info)
RTE_BBDEV_LDPC_HARQ_4BIT_COMPRESSION |
RTE_BBDEV_LDPC_LLR_COMPRESSION |
RTE_BBDEV_LDPC_SOFT_OUT_ENABLE |
-   RTE_BBDEV_LDPC_SOFT_OUT_RM_BYPASS |
RTE_BBDEV_LDPC_SOFT_OUT_DEINTERLEAVER_BYPASS |
RTE_BBDEV_LDPC_DEC_INTERRUPTS,
.llr_size = 8,
@@ -1736,18 +1735,18 @@ vrb_fcw_ld_fill(struct rte_bbdev_dec_op *op, struct 
acc_fcw_ld *fcw,
fcw->so_en = check_bit(op->ldpc_dec.op_flags, 
RTE_BBDEV_LDPC_SOFT_OUT_ENABLE);
fcw->so_bypass_intlv = check_bit(op->ldpc_dec.op_flags,
RTE_BBDEV_LDPC_SOFT_OUT_DEINTERLEAVER_BYPASS);
-   fcw->so_bypass_rm = check_bit(op->ldpc_dec.op_flags,
-   RTE_BBDEV_LDPC_SOFT_OUT_RM_BYPASS);
+   fcw->so_bypass_rm = 0;
fcw->minsum_offset = 1;
fcw->dec_llrclip   = 2;
}
  
  	/*

-* These are all implicitly set
+* These are all implicitly set:
 * fcw->synd_post = 0;
 * fcw->dec_convllr = 0;
 * fcw->hcout_convllr = 0;
 * fcw->hcout_size1 = 0;
+* fcw->so_it = 0;
 * fcw->hcout_offset = 0;
 * fcw->negstop_th = 0;
 * fcw->negstop_it = 0;

[v4 01/15] dma/dpaa2: configure route by port by PCIe port param

2024-10-08 Thread Gagandeep Singh

From: Jun Yang 

struct {
uint64_t coreid : 4; /**--rbp.sportid / rbp.dportid*/
uint64_t pfid : 8; /**--rbp.spfid / rbp.dpfid*/
uint64_t vfen : 1; /**--rbp.svfa / rbp.dvfa*/
uint64_t vfid : 16; /**--rbp.svfid / rbp.dvfid*/
.
} pcie;

Signed-off-by: Jun Yang 
---
 .../bus/fslmc/qbman/include/fsl_qbman_base.h  | 29 ++---
 drivers/dma/dpaa2/dpaa2_qdma.c| 59 +--
 drivers/dma/dpaa2/dpaa2_qdma.h| 38 +++-
 drivers/dma/dpaa2/rte_pmd_dpaa2_qdma.h| 55 +
 drivers/dma/dpaa2/version.map |  1 -
 5 files changed, 100 insertions(+), 82 deletions(-)

diff --git a/drivers/bus/fslmc/qbman/include/fsl_qbman_base.h 
b/drivers/bus/fslmc/qbman/include/fsl_qbman_base.h
index 48ffb1b46e..7528b610e1 100644
--- a/drivers/bus/fslmc/qbman/include/fsl_qbman_base.h
+++ b/drivers/bus/fslmc/qbman/include/fsl_qbman_base.h
@@ -1,7 +1,7 @@
 /* SPDX-License-Identifier: BSD-3-Clause
  *
  * Copyright (C) 2014 Freescale Semiconductor, Inc.
- * Copyright 2017-2019 NXP
+ * Copyright 2017-2024 NXP
  *
  */
 #ifndef _FSL_QBMAN_BASE_H
@@ -141,12 +141,23 @@ struct qbman_fd {
uint32_t saddr_hi;
 
uint32_t len_sl:18;
-   uint32_t rsv1:14;
-
+   uint32_t rsv13:2;
+   uint32_t svfid:6;
+   uint32_t rsv12:2;
+   uint32_t spfid:2;
+   uint32_t rsv1:2;
uint32_t sportid:4;
-   uint32_t rsv2:22;
+   uint32_t rsv2:1;
+   uint32_t sca:1;
+   uint32_t sat:2;
+   uint32_t sattr:3;
+   uint32_t svfa:1;
+   uint32_t stc:3;
uint32_t bmt:1;
-   uint32_t rsv3:1;
+   uint32_t dvfid:6;
+   uint32_t rsv3:2;
+   uint32_t dpfid:2;
+   uint32_t rsv31:2;
uint32_t fmt:2;
uint32_t sl:1;
uint32_t rsv4:1;
@@ -154,12 +165,14 @@ struct qbman_fd {
uint32_t acc_err:4;
uint32_t rsv5:4;
uint32_t ser:1;
-   uint32_t rsv6:3;
+   uint32_t rsv6:2;
+   uint32_t wns:1;
uint32_t wrttype:4;
uint32_t dqos:3;
uint32_t drbp:1;
uint32_t dlwc:2;
-   uint32_t rsv7:2;
+   uint32_t rsv7:1;
+   uint32_t rns:1;
uint32_t rdttype:4;
uint32_t sqos:3;
uint32_t srbp:1;
@@ -182,7 +195,7 @@ struct qbman_fd {
uint32_t saddr_lo;
 
uint32_t saddr_hi:17;
-   uint32_t rsv1:15;
+   uint32_t rsv1_att:15;
 
uint32_t len;
 
diff --git a/drivers/dma/dpaa2/dpaa2_qdma.c b/drivers/dma/dpaa2/dpaa2_qdma.c
index 5780e49297..5d4749eae3 100644
--- a/drivers/dma/dpaa2/dpaa2_qdma.c
+++ b/drivers/dma/dpaa2/dpaa2_qdma.c
@@ -22,7 +22,7 @@ uint32_t dpaa2_coherent_alloc_cache;
 static inline int
 qdma_populate_fd_pci(phys_addr_t src, phys_addr_t dest,
 uint32_t len, struct qbman_fd *fd,
-struct rte_dpaa2_qdma_rbp *rbp, int ser)
+struct dpaa2_qdma_rbp *rbp, int ser)
 {
fd->simple_pci.saddr_lo = lower_32_bits((uint64_t) (src));
fd->simple_pci.saddr_hi = upper_32_bits((uint64_t) (src));
@@ -93,7 +93,7 @@ qdma_populate_fd_ddr(phys_addr_t src, phys_addr_t dest,
 static void
 dpaa2_qdma_populate_fle(struct qbman_fle *fle,
uint64_t fle_iova,
-   struct rte_dpaa2_qdma_rbp *rbp,
+   struct dpaa2_qdma_rbp *rbp,
uint64_t src, uint64_t dest,
size_t len, uint32_t flags, uint32_t fmt)
 {
@@ -114,7 +114,6 @@ dpaa2_qdma_populate_fle(struct qbman_fle *fle,
/* source */
sdd->read_cmd.portid = rbp->sportid;
sdd->rbpcmd_simple.pfid = rbp->spfid;
-   sdd->rbpcmd_simple.vfa = rbp->vfa;
sdd->rbpcmd_simple.vfid = rbp->svfid;
 
if (rbp->srbp) {
@@ -127,7 +126,6 @@ dpaa2_qdma_populate_fle(struct qbman_fle *fle,
/* destination */
sdd->write_cmd.portid = rbp->dportid;
sdd->rbpcmd_simple.pfid = rbp->dpfid;
-   sdd->rbpcmd_simple.vfa = rbp->vfa;
sdd->rbpcmd_simple.vfid = rbp->dvfid;
 
if (rbp->drbp) {
@@ -178,7 +176,7 @@ dpdmai_dev_set_fd_us(struct qdma_virt_queue *qdma_vq,

[v4 03/15] bus/fslmc: enhance the qbman dq storage logic

2024-10-08 Thread Gagandeep Singh

From: Jun Yang 

Multiple DQ storages are used across multiple cores. The single DQ
storage in the first union member is leaked if multiple storages are
allocated. It does not make sense to keep the single DQ storage in the
union, so remove it and reuse the first of the multiple DQ storages
for this case.

Signed-off-by: Jun Yang 
---
 drivers/bus/fslmc/portal/dpaa2_hw_dpci.c| 25 ++-
 drivers/bus/fslmc/portal/dpaa2_hw_dpio.c|  7 +-
 drivers/bus/fslmc/portal/dpaa2_hw_pvt.h | 38 +-
 drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c | 23 ++
 drivers/crypto/dpaa2_sec/dpaa2_sec_raw_dp.c |  4 +-
 drivers/dma/dpaa2/dpaa2_qdma.c  | 41 ++-
 drivers/net/dpaa2/dpaa2_ethdev.c| 81 -
 drivers/net/dpaa2/dpaa2_rxtx.c  | 19 +++--
 drivers/raw/dpaa2_cmdif/dpaa2_cmdif.c   |  4 +-
 9 files changed, 102 insertions(+), 140 deletions(-)

diff --git a/drivers/bus/fslmc/portal/dpaa2_hw_dpci.c 
b/drivers/bus/fslmc/portal/dpaa2_hw_dpci.c
index 7e858a113f..160126f6d6 100644
--- a/drivers/bus/fslmc/portal/dpaa2_hw_dpci.c
+++ b/drivers/bus/fslmc/portal/dpaa2_hw_dpci.c
@@ -81,22 +81,10 @@ rte_dpaa2_create_dpci_device(int vdev_fd __rte_unused,
}
 
/* Allocate DQ storage for the DPCI Rx queues */
-   rxq = &(dpci_node->rx_queue[i]);
-   rxq->q_storage = rte_malloc("dq_storage",
-   sizeof(struct queue_storage_info_t),
-   RTE_CACHE_LINE_SIZE);
-   if (!rxq->q_storage) {
-   DPAA2_BUS_ERR("q_storage allocation failed");
-   ret = -ENOMEM;
+   rxq = &dpci_node->rx_queue[i];
+   ret = dpaa2_queue_storage_alloc(rxq, 1);
+   if (ret)
goto err;
-   }
-
-   memset(rxq->q_storage, 0, sizeof(struct queue_storage_info_t));
-   ret = dpaa2_alloc_dq_storage(rxq->q_storage);
-   if (ret) {
-   DPAA2_BUS_ERR("dpaa2_alloc_dq_storage failed");
-   goto err;
-   }
}
 
/* Enable the device */
@@ -141,12 +129,9 @@ rte_dpaa2_create_dpci_device(int vdev_fd __rte_unused,
 
 err:
for (i = 0; i < DPAA2_DPCI_MAX_QUEUES; i++) {
-   struct dpaa2_queue *rxq = &(dpci_node->rx_queue[i]);
+   struct dpaa2_queue *rxq = &dpci_node->rx_queue[i];
 
-   if (rxq->q_storage) {
-   dpaa2_free_dq_storage(rxq->q_storage);
-   rte_free(rxq->q_storage);
-   }
+   dpaa2_queue_storage_free(rxq, 1);
}
rte_free(dpci_node);
 
diff --git a/drivers/bus/fslmc/portal/dpaa2_hw_dpio.c 
b/drivers/bus/fslmc/portal/dpaa2_hw_dpio.c
index 4aec7b2cd8..a8afc772fd 100644
--- a/drivers/bus/fslmc/portal/dpaa2_hw_dpio.c
+++ b/drivers/bus/fslmc/portal/dpaa2_hw_dpio.c
@@ -574,6 +574,7 @@ dpaa2_free_dq_storage(struct queue_storage_info_t 
*q_storage)
 
for (i = 0; i < NUM_DQS_PER_QUEUE; i++) {
rte_free(q_storage->dq_storage[i]);
+   q_storage->dq_storage[i] = NULL;
}
 }
 
@@ -583,7 +584,7 @@ dpaa2_alloc_dq_storage(struct queue_storage_info_t 
*q_storage)
int i = 0;
 
for (i = 0; i < NUM_DQS_PER_QUEUE; i++) {
-   q_storage->dq_storage[i] = rte_malloc(NULL,
+   q_storage->dq_storage[i] = rte_zmalloc(NULL,
dpaa2_dqrr_size * sizeof(struct qbman_result),
RTE_CACHE_LINE_SIZE);
if (!q_storage->dq_storage[i])
@@ -591,8 +592,10 @@ dpaa2_alloc_dq_storage(struct queue_storage_info_t 
*q_storage)
}
return 0;
 fail:
-   while (--i >= 0)
+   while (--i >= 0) {
rte_free(q_storage->dq_storage[i]);
+   q_storage->dq_storage[i] = NULL;
+   }
 
return -1;
 }
diff --git a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h 
b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
index 169c7917ea..1ce481c88d 100644
--- a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
+++ b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h
@@ -1,7 +1,7 @@
 /* SPDX-License-Identifier: BSD-3-Clause
  *
  *   Copyright (c) 2016 Freescale Semiconductor, Inc. All rights reserved.
- *   Copyright 2016-2021 NXP
+ *   Copyright 2016-2024 NXP
  *
  */
 
@@ -165,7 +165,9 @@ struct __rte_cache_aligned dpaa2_queue {
uint64_t tx_pkts;
uint64_t err_pkts;
union {
-   struct queue_storage_info_t *q_storage;
+   /**Ingress*/
+   struct queue_storage_info_t *q_storage[RTE_MAX_LCORE];
+   /**Egress*/
struct qbman_result *cscn;
};
struct rte_event ev;
@@ -186,6 +188,38 @@ struct swp_active_dqs {
uint64_t reserved[7];
 };
 
+#define dpaa2_queue_storage_alloc(q, num) \
+({ \
+   int ret = 0, i; \
+   \
+   for (i = 0; i < (num); i++) {
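
The `dpaa2_hw_dpio.c` hunks above switch the allocation to a zeroing allocator and reset each freed pointer to NULL, presumably so that a later free of the same storage is harmless. A minimal sketch of that allocate-all-or-roll-back pattern follows, using the C standard library in place of `rte_zmalloc()` and `rte_free()`. The `sketch_` names and the slot count are hypothetical and are not DPDK identifiers.

```c
#include <assert.h>
#include <stdlib.h>

#define SKETCH_NUM_SLOTS 4 /* hypothetical stand-in for NUM_DQS_PER_QUEUE */

struct sketch_storage {
	void *slot[SKETCH_NUM_SLOTS];
};

/* Free every slot and clear the pointer so a repeated call is a no-op. */
static inline void
sketch_storage_free(struct sketch_storage *s)
{
	assert(s != NULL);
	for (int i = 0; i < SKETCH_NUM_SLOTS; i++) {
		free(s->slot[i]); /* free(NULL) is defined to do nothing */
		s->slot[i] = NULL;
	}
}

/* Allocate zeroed memory for every slot, or roll back and return -1. */
static inline int
sketch_storage_alloc(struct sketch_storage *s, size_t size)
{
	int i;

	assert(s != NULL);
	for (i = 0; i < SKETCH_NUM_SLOTS; i++) {
		s->slot[i] = calloc(1, size);
		if (!s->slot[i])
			goto fail;
	}
	return 0;
fail:
	/* Undo only the slots that were successfully allocated. */
	while (--i >= 0) {
		free(s->slot[i]);
		s->slot[i] = NULL;
	}
	return -1;
}
```

Because both the free path and the rollback path leave NULL behind, a caller can invoke the free routine unconditionally in its error handling, which is what the simplified `err:` loop in `dpaa2_hw_dpci.c` relies on.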

[v4 00/15] NXP DMA driver fixes and Enhancements

2024-10-08 Thread Gagandeep Singh

V4 changes:
* rebased series to latest commit and patches reduced.

V3 changes:
* fix 32 bit compilation issue

V2 changes:
* fix compilation issue on ubuntu 22.04

Hemant Agrawal (1):
  bus/dpaa: add port bmi stats

Jun Yang (14):
  dma/dpaa2: configure route by port by PCIe port param
  dma/dpaa2: refactor driver code
  bus/fslmc: enhance the qbman dq storage logic
  dma/dpaa2: add short FD support
  dma/dpaa2: limit the max descriptor number
  dma/dpaa2: change the DMA copy return value
  dma/dpaa2: move the qdma header to common place
  dma/dpaa: refactor driver
  dma/dpaa: support burst capacity API
  dma/dpaa: add silent mode support
  dma/dpaa: add workaround for ERR050757
  dma/dpaa: qdma stall workaround for ERR050265
  dma/dpaa: add Scatter Gather support
  dma/dpaa: add DMA error checks

 config/arm/meson.build|4 +-
 doc/api/doxy-api-index.md |2 +-
 doc/api/doxy-api.conf.in  |2 +-
 doc/guides/dmadevs/dpaa.rst   |9 +
 doc/guides/dmadevs/dpaa2.rst  |   10 +
 drivers/bus/dpaa/base/fman/fman_hw.c  |   65 +-
 drivers/bus/dpaa/include/fman.h   |4 +-
 drivers/bus/dpaa/include/fsl_fman.h   |   12 +
 drivers/bus/dpaa/version.map  |4 +
 drivers/bus/fslmc/portal/dpaa2_hw_dpci.c  |   25 +-
 drivers/bus/fslmc/portal/dpaa2_hw_dpio.c  |7 +-
 drivers/bus/fslmc/portal/dpaa2_hw_pvt.h   |   38 +-
 .../bus/fslmc/qbman/include/fsl_qbman_base.h  |   29 +-
 drivers/common/dpaax/meson.build  |3 +-
 drivers/common/dpaax/rte_pmd_dpaax_qdma.h |   23 +
 drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c   |   23 +-
 drivers/crypto/dpaa2_sec/dpaa2_sec_raw_dp.c   |4 +-
 drivers/dma/dpaa/dpaa_qdma.c  | 1593 +++
 drivers/dma/dpaa/dpaa_qdma.h  |  292 +-
 drivers/dma/dpaa2/dpaa2_qdma.c| 2446 +
 drivers/dma/dpaa2/dpaa2_qdma.h|  243 +-
 drivers/dma/dpaa2/meson.build |4 +-
 drivers/dma/dpaa2/rte_pmd_dpaa2_qdma.h|  177 --
 drivers/dma/dpaa2/version.map |   14 -
 drivers/net/dpaa/dpaa_ethdev.c|   46 +-
 drivers/net/dpaa/dpaa_ethdev.h|   12 +
 drivers/net/dpaa2/dpaa2_ethdev.c  |   83 +-
 drivers/net/dpaa2/dpaa2_rxtx.c|   19 +-
 drivers/raw/dpaa2_cmdif/dpaa2_cmdif.c |4 +-
 29 files changed, 2899 insertions(+), 2298 deletions(-)
 create mode 100644 drivers/common/dpaax/rte_pmd_dpaax_qdma.h
 delete mode 100644 drivers/dma/dpaa2/rte_pmd_dpaa2_qdma.h
 delete mode 100644 drivers/dma/dpaa2/version.map

-- 
2.25.1

[v4 05/15] dma/dpaa2: limit the max descriptor number

2024-10-08 Thread Gagandeep Singh

From: Jun Yang 

For the non-SG format, the index is saved in the FD with a width of
DPAA2_QDMA_FD_ATT_TYPE_OFFSET (13) bits.

The max descriptor number of a ring is a power of 2, so the
eventual max number is:
((1 << DPAA2_QDMA_FD_ATT_TYPE_OFFSET) / 2)

Signed-off-by: Jun Yang 
---
 drivers/dma/dpaa2/dpaa2_qdma.h | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/dma/dpaa2/dpaa2_qdma.h b/drivers/dma/dpaa2/dpaa2_qdma.h
index 0be65e1cc6..250c83c83c 100644
--- a/drivers/dma/dpaa2/dpaa2_qdma.h
+++ b/drivers/dma/dpaa2/dpaa2_qdma.h
@@ -8,8 +8,6 @@
 #include "portal/dpaa2_hw_pvt.h"
 #include "portal/dpaa2_hw_dpio.h"
 
-#define DPAA2_QDMA_MAX_DESC4096
-#define DPAA2_QDMA_MIN_DESC1
 #define DPAA2_QDMA_MAX_VHANS   64
 
 #define DPAA2_DPDMAI_MAX_QUEUES16
@@ -169,10 +167,15 @@ enum dpaa2_qdma_fd_type {
 };
 
 #define DPAA2_QDMA_FD_ATT_TYPE_OFFSET 13
+#define DPAA2_QDMA_FD_ATT_MAX_IDX \
+   ((1 << DPAA2_QDMA_FD_ATT_TYPE_OFFSET) - 1)
 #define DPAA2_QDMA_FD_ATT_TYPE(att) \
(att >> DPAA2_QDMA_FD_ATT_TYPE_OFFSET)
 #define DPAA2_QDMA_FD_ATT_CNTX(att) \
-   (att & ((1 << DPAA2_QDMA_FD_ATT_TYPE_OFFSET) - 1))
+   (att & DPAA2_QDMA_FD_ATT_MAX_IDX)
+
+#define DPAA2_QDMA_MAX_DESC ((DPAA2_QDMA_FD_ATT_MAX_IDX + 1) / 2)
+#define DPAA2_QDMA_MIN_DESC 1
 
 static inline void
 dpaa2_qdma_fd_set_addr(struct qbman_fd *fd,
@@ -186,6 +189,7 @@ static inline void
 dpaa2_qdma_fd_save_att(struct qbman_fd *fd,
uint16_t job_idx, enum dpaa2_qdma_fd_type type)
 {
+   RTE_ASSERT(job_idx <= DPAA2_QDMA_FD_ATT_MAX_IDX);
fd->simple_ddr.rsv1_att = job_idx |
(type << DPAA2_QDMA_FD_ATT_TYPE_OFFSET);
 }
-- 
2.25.1
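
The limit described in the commit message above can be checked with a little arithmetic. The following is a minimal standalone sketch, not driver code: the `SKETCH_` macros and `sketch_` helpers are hypothetical stand-ins that mirror the patch's `DPAA2_QDMA_FD_ATT_*` definitions, and `type` is a plain integer here rather than `enum dpaa2_qdma_fd_type`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins mirroring the macros added by the patch. */
#define SKETCH_FD_ATT_TYPE_OFFSET 13
#define SKETCH_FD_ATT_MAX_IDX ((1 << SKETCH_FD_ATT_TYPE_OFFSET) - 1)
#define SKETCH_MAX_DESC ((SKETCH_FD_ATT_MAX_IDX + 1) / 2)

/* Pack a job index (low 13 bits) and a type (bits 13 and up). */
static inline uint16_t
sketch_pack_att(uint16_t job_idx, uint16_t type)
{
	assert(job_idx <= SKETCH_FD_ATT_MAX_IDX);
	return (uint16_t)(job_idx | (type << SKETCH_FD_ATT_TYPE_OFFSET));
}

/* Recover the type field from a packed attribute. */
static inline uint16_t
sketch_att_type(uint16_t att)
{
	return (uint16_t)(att >> SKETCH_FD_ATT_TYPE_OFFSET);
}

/* Recover the job index from a packed attribute. */
static inline uint16_t
sketch_att_idx(uint16_t att)
{
	return (uint16_t)(att & SKETCH_FD_ATT_MAX_IDX);
}
```

With a 13-bit index field the largest index is 8191, and the ring limit works out to (8191 + 1) / 2 = 4096, the same value as the constant the patch removes.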

[v4 15/15] bus/dpaa: add port bmi stats

2024-10-08 Thread Gagandeep Singh

From: Hemant Agrawal 

Add BMI statistics and fix the existing extended
statistics.

Signed-off-by: Hemant Agrawal 
Signed-off-by: Gagandeep Singh 
---
 drivers/bus/dpaa/base/fman/fman_hw.c | 65 +++-
 drivers/bus/dpaa/include/fman.h  |  4 +-
 drivers/bus/dpaa/include/fsl_fman.h  | 12 +
 drivers/bus/dpaa/version.map |  4 ++
 drivers/net/dpaa/dpaa_ethdev.c   | 46 +---
 drivers/net/dpaa/dpaa_ethdev.h   | 12 +
 6 files changed, 134 insertions(+), 9 deletions(-)

diff --git a/drivers/bus/dpaa/base/fman/fman_hw.c 
b/drivers/bus/dpaa/base/fman/fman_hw.c
index 24a99f7235..27b39a4975 100644
--- a/drivers/bus/dpaa/base/fman/fman_hw.c
+++ b/drivers/bus/dpaa/base/fman/fman_hw.c
@@ -244,8 +244,8 @@ fman_if_stats_get_all(struct fman_if *p, uint64_t *value, 
int n)
uint64_t base_offset = offsetof(struct memac_regs, reoct_l);
 
for (i = 0; i < n; i++)
-   value[i] = (((u64)in_be32((char *)regs + base_offset + 8 * i) |
-   (u64)in_be32((char *)regs + base_offset +
+   value[i] = ((u64)in_be32((char *)regs + base_offset + 8 * i) |
+   ((u64)in_be32((char *)regs + base_offset +
8 * i + 4)) << 32);
 }
 
@@ -266,6 +266,67 @@ fman_if_stats_reset(struct fman_if *p)
;
 }
 
+void
+fman_if_bmi_stats_enable(struct fman_if *p)
+{
+   struct __fman_if *m = container_of(p, struct __fman_if, __if);
+   struct rx_bmi_regs *regs = (struct rx_bmi_regs *)m->bmi_map;
+   uint32_t tmp;
+
+   tmp = in_be32(&regs->fmbm_rstc);
+
+   tmp |= FMAN_BMI_COUNTERS_EN;
+
+   out_be32(&regs->fmbm_rstc, tmp);
+}
+
+void
+fman_if_bmi_stats_disable(struct fman_if *p)
+{
+   struct __fman_if *m = container_of(p, struct __fman_if, __if);
+   struct rx_bmi_regs *regs = (struct rx_bmi_regs *)m->bmi_map;
+   uint32_t tmp;
+
+   tmp = in_be32(&regs->fmbm_rstc);
+
+   tmp &= ~FMAN_BMI_COUNTERS_EN;
+
+   out_be32(&regs->fmbm_rstc, tmp);
+}
+
+void
+fman_if_bmi_stats_get_all(struct fman_if *p, uint64_t *value)
+{
+   struct __fman_if *m = container_of(p, struct __fman_if, __if);
+   struct rx_bmi_regs *regs = (struct rx_bmi_regs *)m->bmi_map;
+   int i = 0;
+
+   value[i++] = (u32)in_be32(&regs->fmbm_rfrc);
+   value[i++] = (u32)in_be32(&regs->fmbm_rfbc);
+   value[i++] = (u32)in_be32(&regs->fmbm_rlfc);
+   value[i++] = (u32)in_be32(&regs->fmbm_rffc);
+   value[i++] = (u32)in_be32(&regs->fmbm_rfdc);
+   value[i++] = (u32)in_be32(&regs->fmbm_rfldec);
+   value[i++] = (u32)in_be32(&regs->fmbm_rodc);
+   value[i++] = (u32)in_be32(&regs->fmbm_rbdc);
+}
+
+void
+fman_if_bmi_stats_reset(struct fman_if *p)
+{
+   struct __fman_if *m = container_of(p, struct __fman_if, __if);
+   struct rx_bmi_regs *regs = (struct rx_bmi_regs *)m->bmi_map;
+
+   out_be32(&regs->fmbm_rfrc, 0);
+   out_be32(&regs->fmbm_rfbc, 0);
+   out_be32(&regs->fmbm_rlfc, 0);
+   out_be32(&regs->fmbm_rffc, 0);
+   out_be32(&regs->fmbm_rfdc, 0);
+   out_be32(&regs->fmbm_rfldec, 0);
+   out_be32(&regs->fmbm_rodc, 0);
+   out_be32(&regs->fmbm_rbdc, 0);
+}
+
 void
 fman_if_promiscuous_enable(struct fman_if *p)
 {
diff --git a/drivers/bus/dpaa/include/fman.h b/drivers/bus/dpaa/include/fman.h
index f918836ec2..1f120b7614 100644
--- a/drivers/bus/dpaa/include/fman.h
+++ b/drivers/bus/dpaa/include/fman.h
@@ -56,6 +56,8 @@
 #define FMAN_PORT_BMI_FIFO_UNITS   0x100
 #define FMAN_PORT_IC_OFFSET_UNITS  0x10
 
+#define FMAN_BMI_COUNTERS_EN 0x8000
+
 #define FMAN_ENABLE_BPOOL_DEPLETION0xF0F0
 
 #define HASH_CTRL_MCAST_EN 0x0100
@@ -260,7 +262,7 @@ struct rx_bmi_regs {
/**< Buffer Manager pool Information-*/
uint32_t fmbm_acnt[FMAN_PORT_MAX_EXT_POOLS_NUM];
/**< Allocate Counter-*/
-   uint32_t reserved0130[8];
+   uint32_t reserved0120[16];
/**< 0x130/0x140 - 0x15F reserved -*/
uint32_t fmbm_rcgm[FMAN_PORT_CG_MAP_NUM];
/**< Congestion Group Map*/
diff --git a/drivers/bus/dpaa/include/fsl_fman.h 
b/drivers/bus/dpaa/include/fsl_fman.h
index 20690f8329..5a9750ad0c 100644
--- a/drivers/bus/dpaa/include/fsl_fman.h
+++ b/drivers/bus/dpaa/include/fsl_fman.h
@@ -60,6 +60,18 @@ void fman_if_stats_reset(struct fman_if *p);
 __rte_internal
 void fman_if_stats_get_all(struct fman_if *p, uint64_t *value, int n);
 
+__rte_internal
+void fman_if_bmi_stats_enable(struct fman_if *p);
+
+__rte_internal
+void fman_if_bmi_stats_disable(struct fman_if *p);
+
+__rte_internal
+void fman_if_bmi_stats_get_all(struct fman_if *p, uint64_t *value);
+
+__rte_internal
+void fman_if_bmi_stats_reset(struct fman_if *p);
+
 /* Set ignore pause option for a specific interface */
 void fman_if_set_rx_ignore_pause_frames(struct fman_if *p, bool enable);
 
diff --git a
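
The first hunk in `fman_hw.c` above changes how the two 32-bit halves of a statistics register pair are combined into one 64-bit value. A minimal sketch of the difference follows, with plain integers in place of the `in_be32()` register reads. The function names are hypothetical, and the assignment of the first word to the low half simply follows the patched expression.

```c
#include <assert.h>
#include <stdint.h>

/* Pre-patch grouping: the OR happens first and the whole result is then
 * shifted, so the low 32 bits of the returned value are always zero. */
static inline uint64_t
sketch_combine_old(uint32_t first, uint32_t second)
{
	return ((uint64_t)first | (uint64_t)second) << 32;
}

/* Patched grouping: only the second word is shifted into the high half. */
static inline uint64_t
sketch_combine_new(uint32_t first, uint32_t second)
{
	return (uint64_t)first | ((uint64_t)second << 32);
}

/* Worked example with first = 1 and second = 2. */
static inline void
sketch_combine_demo(void)
{
	assert(sketch_combine_old(1, 2) == 0x300000000ULL); /* halves merged, low word lost */
	assert(sketch_combine_new(1, 2) == 0x200000001ULL); /* both halves preserved */
}
```

The old form mixes both words into the upper half, which is why the extended statistics needed the fix mentioned in the commit message.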

[PATCH v3 1/4] cryptodev: add partial sm2 feature flag

2024-10-08 Thread Arkadiusz Kusztal

Due to complex ways of handling asymmetric cryptography algorithms,
capabilities may differ between hardware and software PMDs,
or even between hardware PMDs. One example is algorithms that
need an additional round of hashing, such as SM2.

Signed-off-by: Arkadiusz Kusztal 
---
 lib/cryptodev/rte_cryptodev.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/cryptodev/rte_cryptodev.h b/lib/cryptodev/rte_cryptodev.h
index bec947f6d5..c0e816b17f 100644
--- a/lib/cryptodev/rte_cryptodev.h
+++ b/lib/cryptodev/rte_cryptodev.h
@@ -554,6 +554,8 @@ rte_cryptodev_asym_get_xform_string(enum 
rte_crypto_asym_xform_type xform_enum);
 /**< Support inner checksum computation/verification */
 #define RTE_CRYPTODEV_FF_SECURITY_RX_INJECT(1ULL << 28)
 /**< Support Rx injection after security processing */
+#define RTE_CRYPTODEV_FF_ASYM_PARTIAL_SM2  (1ULL << 29)
+/**< Support the elliptic curve part only in SM2 */
 
 /**
  * Get the name of a crypto device feature flag
-- 
2.13.6
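
An application would typically test this flag before choosing between the full SM2 path and the elliptic-curve-only path. The following is a minimal sketch with a local stand-in for the flag: the `SKETCH_` macro and `sketch_supports_partial_sm2()` are hypothetical, and real code would take `feature_flags` from `rte_cryptodev_info_get()`, as the test in patch 4/4 of this series does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for RTE_CRYPTODEV_FF_ASYM_PARTIAL_SM2 (bit 29 in the patch). */
#define SKETCH_FF_ASYM_PARTIAL_SM2 (1ULL << 29)

/* Return true when the device advertises the EC-only SM2 capability. */
static inline bool
sketch_supports_partial_sm2(uint64_t feature_flags)
{
	return (feature_flags & SKETCH_FF_ASYM_PARTIAL_SM2) != 0;
}

/* Worked example: bit 29 set versus only the neighbouring bit 28 set. */
static inline void
sketch_flag_demo(void)
{
	assert(sketch_supports_partial_sm2(1ULL << 29));
	assert(!sketch_supports_partial_sm2(1ULL << 28));
}
```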

[PATCH v3 2/4] cryptodev: add ec points to sm2 op

2024-10-08 Thread Arkadiusz Kusztal

When a PMD cannot support the full SM2 process, but only the
elliptic curve computation, additional fields are needed to
handle this case.

Points C1 and kP were therefore added to the SM2 crypto operation struct.

Signed-off-by: Arkadiusz Kusztal 
---
 lib/cryptodev/rte_crypto_asym.h | 119 
 1 file changed, 71 insertions(+), 48 deletions(-)

diff --git a/lib/cryptodev/rte_crypto_asym.h b/lib/cryptodev/rte_crypto_asym.h
index 39d3da3952..f59759062f 100644
--- a/lib/cryptodev/rte_crypto_asym.h
+++ b/lib/cryptodev/rte_crypto_asym.h
@@ -600,40 +600,6 @@ struct rte_crypto_ecpm_op_param {
 };
 
 /**
- * Asymmetric crypto transform data
- *
- * Structure describing asym xforms.
- */
-struct rte_crypto_asym_xform {
-   struct rte_crypto_asym_xform *next;
-   /**< Pointer to next xform to set up xform chain.*/
-   enum rte_crypto_asym_xform_type xform_type;
-   /**< Asymmetric crypto transform */
-
-   union {
-   struct rte_crypto_rsa_xform rsa;
-   /**< RSA xform parameters */
-
-   struct rte_crypto_modex_xform modex;
-   /**< Modular Exponentiation xform parameters */
-
-   struct rte_crypto_modinv_xform modinv;
-   /**< Modular Multiplicative Inverse xform parameters */
-
-   struct rte_crypto_dh_xform dh;
-   /**< DH xform parameters */
-
-   struct rte_crypto_dsa_xform dsa;
-   /**< DSA xform parameters */
-
-   struct rte_crypto_ec_xform ec;
-   /**< EC xform parameters, used by elliptic curve based
-* operations.
-*/
-   };
-};
-
-/**
  * SM2 operation params.
  */
 struct rte_crypto_sm2_op_param {
@@ -658,20 +624,43 @@ struct rte_crypto_sm2_op_param {
 * will be overwritten by the PMD with the decrypted length.
 */
 
-   rte_crypto_param cipher;
-   /**<
-* Pointer to input data
-* - to be decrypted for SM2 private decrypt.
-*
-* Pointer to output data
-* - for SM2 public encrypt.
-* In this case the underlying array should have been allocated
-* with enough memory to hold ciphertext output (at least X bytes
-* for prime field curve of N bytes and for message M bytes,
-* where X = (C1 || C2 || C3) and computed based on SM2 RFC as
-* C1 (1 + N + N), C2 = M, C3 = N. The cipher.length field will
-* be overwritten by the PMD with the encrypted length.
-*/
+   union {
+   rte_crypto_param cipher;
+   /**<
+* Pointer to input data
+* - to be decrypted for SM2 private decrypt.
+*
+* Pointer to output data
+* - for SM2 public encrypt.
+* In this case the underlying array should have been allocated
+* with enough memory to hold ciphertext output (at least X 
bytes
+* for prime field curve of N bytes and for message M bytes,
+* where X = (C1 || C2 || C3) and computed based on SM2 RFC as
+* C1 (1 + N + N), C2 = M, C3 = N. The cipher.length field will
+* be overwritten by the PMD with the encrypted length.
+*/
+   struct {
+   struct rte_crypto_ec_point C1;
+   /**<
+* This field is used only when PMD does not support 
the full
+* process of the SM2 encryption/decryption, but the 
elliptic
+* curve part only.
+*
+* In the case of encryption, it is an output - point 
C1 = (x1,y1).
+* In the case of decryption, it is an input - point C1 
= (x1,y1)
+*
+*/
+   struct rte_crypto_ec_point kP;
+   /**<
+* This field is used only when PMD does not support 
the full
+* process of the SM2 encryption/decryption, but the 
elliptic
+* curve part only.
+*
+* It is an output in the encryption case, it is a point
+* [k]P = (x2,y2)
+*/
+   };
+   };
 
rte_crypto_uint id;
/**< The SM2 id used by signer and verifier. */
@@ -698,6 +687,40 @@ struct rte_crypto_sm2_op_param {
 };
 
 /**
+ * Asymmetric crypto transform data
+ *
+ * Structure describing asym xforms.
+ */
+struct rte_crypto_asym_xform {
+   struct rte_crypto_asym_xform *next;
+   /**< Pointer to next xform to set up xform chain.*/
+   enum rte_crypto_asym_xform_type xform_type;
+   /**< Asymmetric crypto transform */
+
+   union {
+   struct rte_crypto_rsa_xform rsa;
+   /**<
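
The buffer-size rule in the `cipher` comment above (X = C1 || C2 || C3, with C1 = 1 + N + N, C2 = M and C3 = N) can be worked through for a concrete case. This is a minimal sketch with hypothetical helper names. The single extra byte in C1 is assumed to be the point-format prefix of an uncompressed point, which the comment does not state explicitly.

```c
#include <assert.h>
#include <stddef.h>

/* Length of C1: assumed format byte plus the x and y coordinates (N each). */
static inline size_t
sketch_sm2_c1_len(size_t n)
{
	return 1 + n + n;
}

/* Minimum ciphertext buffer for a curve of N bytes and a message of M bytes:
 * C1 (1 + 2N) followed by C2 (M) and C3 (N). */
static inline size_t
sketch_sm2_cipher_len(size_t n, size_t m)
{
	return sketch_sm2_c1_len(n) + m + n;
}

/* Worked example: 32-byte (256-bit) curve and a 16-byte message. */
static inline void
sketch_sm2_len_demo(void)
{
	assert(sketch_sm2_c1_len(32) == 65);
	assert(sketch_sm2_cipher_len(32, 16) == 113); /* 65 + 16 + 32 */
}
```

In the partial mode added by this patch the PMD returns the points C1 and kP instead of filling `cipher`, so this sizing applies only to the full encryption path.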

Re: [PATCH v2 03/10] baseband/acc: configure max queues per device

2024-10-08 Thread Maxime Coquelin





On 10/3/24 22:49, Hernan Vargas wrote:

Configure max_queues based on the number of queue groups and numbers of
AQS per device variant.

Signed-off-by: Hernan Vargas 
---
  drivers/baseband/acc/rte_vrb_pmd.c | 9 +++--
  1 file changed, 7 insertions(+), 2 deletions(-)



Reviewed-by: Maxime Coquelin 

Thanks,
Maxime

Re: [PATCH v2 02/10] baseband/acc: queue allocation refactor

2024-10-08 Thread Maxime Coquelin


Hi Nicolas,

On 10/7/24 18:45, Chautru, Nicolas wrote:

Hi Maxime,


-Original Message-
From: Maxime Coquelin 
Sent: Monday, October 7, 2024 2:31 AM
To: Chautru, Nicolas ; Vargas, Hernan
; dev@dpdk.org; gak...@marvell.com;
t...@redhat.com
Cc: Zhang, Qi Z 
Subject: Re: [PATCH v2 02/10] baseband/acc: queue allocation refactor

Hi Nicolas,

On 10/4/24 20:19, Chautru, Nicolas wrote:

Hi Maxime,


-Original Message-
From: Maxime Coquelin 
Sent: Friday, October 4, 2024 5:08 AM
To: Vargas, Hernan ; dev@dpdk.org;
gak...@marvell.com; t...@redhat.com
Cc: Chautru, Nicolas ; Zhang, Qi Z

Subject: Re: [PATCH v2 02/10] baseband/acc: queue allocation refactor



On 10/3/24 22:49, Hernan Vargas wrote:

Refactor to manage queue memory per operation more flexibly for VRB
devices.

Signed-off-by: Hernan Vargas 
---
drivers/baseband/acc/acc_common.h  |   5 +
drivers/baseband/acc/rte_vrb_pmd.c | 214 --

---

2 files changed, 157 insertions(+), 62 deletions(-)

diff --git a/drivers/baseband/acc/acc_common.h
b/drivers/baseband/acc/acc_common.h
index b1f81e73e68d..adbac0dcca70 100644
--- a/drivers/baseband/acc/acc_common.h
+++ b/drivers/baseband/acc/acc_common.h
@@ -149,6 +149,8 @@
#define VRB2_VF_ID_SHIFT 6

#define ACC_MAX_FFT_WIN  16
+#define ACC_MAX_RING_BUFFER  64
+#define VRB2_MAX_Q_PER_OP 256

extern int acc_common_logtype;

@@ -581,6 +583,9 @@ struct acc_device {
void *sw_rings_base;  /* Base addr of un-aligned memory for sw
rings

*/

void *sw_rings;  /* 64MBs of 64MB aligned memory for sw rings */
rte_iova_t sw_rings_iova;  /* IOVA address of sw_rings */
+   void *sw_rings_array[ACC_MAX_RING_BUFFER];  /* Array of aligned

memory for sw rings. */

+   rte_iova_t sw_rings_iova_array[ACC_MAX_RING_BUFFER];  /* Array

of sw_rings IOVA. */

+   uint32_t queue_index[ACC_MAX_RING_BUFFER]; /* Tracking queue

index

+per ring buffer. */
/* Virtual address of the info memory routed to the this function under
 * operation, whether it is PF or VF.
 * HW may DMA information data at this location asynchronously
diff --git a/drivers/baseband/acc/rte_vrb_pmd.c
b/drivers/baseband/acc/rte_vrb_pmd.c
index bae01e563826..2c62a5b3e329 100644
--- a/drivers/baseband/acc/rte_vrb_pmd.c
+++ b/drivers/baseband/acc/rte_vrb_pmd.c
@@ -281,7 +281,7 @@ fetch_acc_config(struct rte_bbdev *dev)
/* Check the depth of the AQs. */
reg_len0 = acc_reg_read(d, d->reg_addr->depth_log0_offset);
reg_len1 = acc_reg_read(d, d->reg_addr->depth_log1_offset);
-   for (acc = 0; acc < NUM_ACC; acc++) {
+   for (acc = 0; acc < VRB1_NUM_ACCS; acc++) {
qtopFromAcc(&q_top, acc, acc_conf);
if (q_top->first_qgroup_index <

ACC_NUM_QGRPS_PER_WORD)

q_top->aq_depth_log2 =
@@ -290,7 +290,7 @@ fetch_acc_config(struct rte_bbdev *dev)
q_top->aq_depth_log2 = (reg_len1 >> ((q_top-
first_qgroup_index -


ACC_NUM_QGRPS_PER_WORD) * 4)) & 0xF;

}
-   } else {
+   } else if (d->device_variant == VRB2_VARIANT) {
reg0 = acc_reg_read(d, d->reg_addr->qman_group_func);
reg1 = acc_reg_read(d, d->reg_addr->qman_group_func + 4);
reg2 = acc_reg_read(d, d->reg_addr->qman_group_func + 8);

@@

-308,7 +308,7 @@ fetch_acc_config(struct rte_bbdev *dev)
idx = (reg2 >> ((qg %

ACC_NUM_QGRPS_PER_WORD) * 4)) & 0x7;

else
idx = (reg3 >> ((qg %

ACC_NUM_QGRPS_PER_WORD) * 4)) & 0x7;

-   if (idx < VRB_NUM_ACCS) {
+   if (idx < VRB2_NUM_ACCS) {
acc = qman_func_id[idx];
updateQtop(acc, qg, acc_conf, d);
}
@@ -321,7 +321,7 @@ fetch_acc_config(struct rte_bbdev *dev)
reg_len2 = acc_reg_read(d, d->reg_addr->depth_log0_offset +

8);

reg_len3 = acc_reg_read(d, d->reg_addr->depth_log0_offset +

12);


-   for (acc = 0; acc < NUM_ACC; acc++) {
+   for (acc = 0; acc < VRB2_NUM_ACCS; acc++) {
qtopFromAcc(&q_top, acc, acc_conf);
if (q_top->first_qgroup_index /

ACC_NUM_QGRPS_PER_WORD == 0)

q_top->aq_depth_log2 = (reg_len0 >> ((q_top-
first_qgroup_index %


This function could be heavily refactored.
If we look at what is actually performed, the VRB1 and VRB2 logic is the
same; just a couple of values differ (they could be set at probe time).

I might propose something in the future.


@@ -543,6 +543,7 @@ vrb_setup_queues(struct rte_bbdev *dev,

uint16_t

num_queues, int socket_id)

{
uint32_t phys_low, phys_high,

RE: [PATCH v6 1/6] cryptodev: add EDDSA asymmetric crypto algorithm

2024-10-08 Thread Kusztal, ArkadiuszX

Acked-by: Arkadiusz Kusztal 

> -Original Message-
> From: Kusztal, ArkadiuszX 
> Sent: Monday, October 7, 2024 6:04 PM
> To: Gowrishankar Muthukrishnan ;
> dev@dpdk.org; Akhil Goyal ; Fan Zhang
> 
> Cc: Anoob Joseph ; Richardson, Bruce
> ; jer...@marvell.com; Ji, Kai ;
> jack.bond-pres...@foss.arm.com; Marchand, David
> ; hemant.agra...@nxp.com; De Lara Guarch,
> Pablo ; Trahe, Fiona
> ; Doherty, Declan ;
> ma...@nvidia.com; ruifeng.w...@arm.com
> Subject: RE: [PATCH v6 1/6] cryptodev: add EDDSA asymmetric crypto algorithm
> 
> Hi Gowrishankar,
> 
> I like the idea of adding EdDSA, but I have several comments.
> 
> > -Original Message-
> > From: Gowrishankar Muthukrishnan 
> > Sent: Friday, October 4, 2024 10:26 AM
> > To: dev@dpdk.org; Akhil Goyal ; Fan Zhang
> > 
> > Cc: Anoob Joseph ; Richardson, Bruce
> > ; jer...@marvell.com; Kusztal, ArkadiuszX
> > ; Ji, Kai ; jack.bond-
> > pres...@foss.arm.com; Marchand, David ;
> > hemant.agra...@nxp.com; De Lara Guarch, Pablo
> > ; Trahe, Fiona
> > ; Doherty, Declan ;
> > ma...@nvidia.com; ruifeng.w...@arm.com; Gowrishankar Muthukrishnan
> > 
> > Subject: [PATCH v6 1/6] cryptodev: add EDDSA asymmetric crypto
> > algorithm
> >
> > Add support for asymmetric EDDSA in cryptodev, as referenced in RFC:
> > https://datatracker.ietf.org/doc/html/rfc8032
> >
> > Signed-off-by: Gowrishankar Muthukrishnan 
> > ---
> >  doc/guides/cryptodevs/features/default.ini |  1 +
> >  doc/guides/prog_guide/cryptodev_lib.rst|  2 +-
> >  lib/cryptodev/rte_crypto_asym.h| 47 ++
> >  3 files changed, 49 insertions(+), 1 deletion(-)
> >
> > diff --git a/doc/guides/cryptodevs/features/default.ini
> > b/doc/guides/cryptodevs/features/default.ini
> > index f411d4bab7..3073753911 100644
> > --- a/doc/guides/cryptodevs/features/default.ini
> > +++ b/doc/guides/cryptodevs/features/default.ini
> > @@ -130,6 +130,7 @@ ECDSA   =
> >  ECPM=
> >  ECDH=
> >  SM2 =
> > +EDDSA   =
> >
> >  ;
> >  ; Supported Operating systems of a default crypto driver.
> > diff --git a/doc/guides/prog_guide/cryptodev_lib.rst
> > b/doc/guides/prog_guide/cryptodev_lib.rst
> > index 2b513bbf82..dd636ba5ef 100644
> > --- a/doc/guides/prog_guide/cryptodev_lib.rst
> > +++ b/doc/guides/prog_guide/cryptodev_lib.rst
> > @@ -927,7 +927,7 @@ Asymmetric Cryptography  The cryptodev library
> > currently provides support for the following asymmetric  Crypto
> > operations; RSA, Modular exponentiation and inversion, Diffie-Hellman
> > and  Elliptic Curve Diffie-Hellman public and/or private key
> > generation and shared -secret compute, DSA Signature generation and
> verification.
> > +secret compute, DSA and EdDSA Signature generation and verification.
> >
> >  Session and Session Management
> >  ~~
> > diff --git a/lib/cryptodev/rte_crypto_asym.h
> > b/lib/cryptodev/rte_crypto_asym.h index 39d3da3952..fe4194c184 100644
> > --- a/lib/cryptodev/rte_crypto_asym.h
> > +++ b/lib/cryptodev/rte_crypto_asym.h
> > @@ -49,6 +49,10 @@ rte_crypto_asym_op_strings[];
> >   * and if the flag is not set, shared secret will be padded to the left 
> > with
> >   * zeros to the size of the underlying algorithm (default)
> >   */
> > +#define RTE_CRYPTO_ASYM_FLAG_PUB_KEY_COMPRESSED
> > RTE_BIT32(2)
> > +/**<
> > + * Flag to denote public key will be returned in compressed form  */
> >
> >  /**
> >   * List of elliptic curves. This enum aligns with @@ -65,9 +69,22 @@
> > enum rte_crypto_curve_id {
> > RTE_CRYPTO_EC_GROUP_SECP256R1 = 23,
> > RTE_CRYPTO_EC_GROUP_SECP384R1 = 24,
> > RTE_CRYPTO_EC_GROUP_SECP521R1 = 25,
> > +   RTE_CRYPTO_EC_GROUP_ED25519   = 29,
> > +   RTE_CRYPTO_EC_GROUP_ED448 = 30,
> > RTE_CRYPTO_EC_GROUP_SM2   = 41,
> >  };
> >
> > +/**
> > + * List of Edwards curve instances as per RFC 8032 (Section 5).
> > + */
> > +enum rte_crypto_edward_instance {
> > +   RTE_CRYPTO_EDCURVE_25519,
> > +   RTE_CRYPTO_EDCURVE_25519CTX,
> > +   RTE_CRYPTO_EDCURVE_25519PH,
> > +   RTE_CRYPTO_EDCURVE_448,
> > +   RTE_CRYPTO_EDCURVE_448PH
> > +};
> > +
> >  /**
> >   * Asymmetric crypto transformation types.
> >   * Each xform type maps to one asymmetric algorithm @@ -119,6 +136,10
> > @@ enum rte_crypto_asym_xform_type {
> >  * Performs Encrypt, Decrypt, Sign and Verify.
> >  * Refer to rte_crypto_asym_op_type.
> >  */
> > +   RTE_CRYPTO_ASYM_XFORM_EDDSA,
> > +   /**< Edwards Curve Digital Signature Algorithm
> > +* Perform Signature Generation and Verification.
> > +*/
> > RTE_CRYPTO_ASYM_XFORM_TYPE_LIST_END
> > /**< End of list */
> >  };
> > @@ -585,6 +606,31 @@ struct rte_crypto_ecdsa_op_param {
> >  */
> >  };
> >
> > +/**
> > + * EdDSA operation params
> > + */
> > +struct rte_crypto_eddsa_op_param {
> > +   enum rte_crypto_asym_op_type op_type;
> > +   /**< Signature generation or verification */
> > +
> > +   rte_cryp

[PATCH v2] rawdev: add API to get device from index

2024-10-08 Thread Akhil Goyal

Add an internal API for PMDs to get the raw device pointer
from a device ID.

Signed-off-by: Akhil Goyal 
---
- resend patch for main branch separated from rvu_lf raw driver
https://patches.dpdk.org/project/dpdk/list/?series=32949

 lib/rawdev/rte_rawdev_pmd.h | 24 
 1 file changed, 24 insertions(+)

diff --git a/lib/rawdev/rte_rawdev_pmd.h b/lib/rawdev/rte_rawdev_pmd.h
index 22b406444d..8339122348 100644
--- a/lib/rawdev/rte_rawdev_pmd.h
+++ b/lib/rawdev/rte_rawdev_pmd.h
@@ -102,6 +102,30 @@ rte_rawdev_pmd_get_named_dev(const char *name)
return NULL;
 }
 
+/**
+ * Get the rte_rawdev structure device pointer for given device ID.
+ *
+ * @param dev_id
+ *   raw device index.
+ *
+ * @return
+ *   - The rte_rawdev structure pointer for the given device ID.
+ */
+static inline struct rte_rawdev *
+rte_rawdev_pmd_get_dev(uint8_t dev_id)
+{
+   struct rte_rawdev *dev;
+
+   if (dev_id >= RTE_RAWDEV_MAX_DEVS)
+   return NULL;
+
+   dev = &rte_rawdevs[dev_id];
+   if (dev->attached == RTE_RAWDEV_ATTACHED)
+   return dev;
+
+   return NULL;
+}
+
 /**
  * Validate if the raw device index is a valid attached raw device.
  *
-- 
2.25.1
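
The helper added above follows a common bounds-check-then-state-check lookup. A minimal self-contained sketch of the same shape is shown here, with a mock device table instead of the real `rte_rawdevs` array. Every `sketch_`/`SKETCH_` name and the table size are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_DEVS 4 /* hypothetical stand-in for RTE_RAWDEV_MAX_DEVS */
#define SKETCH_ATTACHED 1 /* hypothetical stand-in for RTE_RAWDEV_ATTACHED */

struct sketch_dev {
	uint8_t attached;
};

/* Mock device table; zero-initialised, so no device starts attached. */
static struct sketch_dev sketch_devs[SKETCH_MAX_DEVS];

/* Return the device for dev_id, or NULL if out of range or not attached. */
static inline struct sketch_dev *
sketch_get_dev(uint8_t dev_id)
{
	if (dev_id >= SKETCH_MAX_DEVS)
		return NULL;
	if (sketch_devs[dev_id].attached != SKETCH_ATTACHED)
		return NULL;
	return &sketch_devs[dev_id];
}

/* Worked example: only slot 1 is attached. */
static inline void
sketch_get_dev_demo(void)
{
	sketch_devs[1].attached = SKETCH_ATTACHED;
	assert(sketch_get_dev(1) == &sketch_devs[1]);
	assert(sketch_get_dev(0) == NULL);               /* in range, not attached */
	assert(sketch_get_dev(SKETCH_MAX_DEVS) == NULL); /* out of range */
}
```

Since NULL covers both failure cases, callers need to check the returned pointer before dereferencing it.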

[PATCH v3 4/4] app/test: add test sm2 C1/Kp test cases

2024-10-08 Thread Arkadiusz Kusztal

This commit adds test cases to be used when C1 or kP elliptic
curve points need to be computed.

Signed-off-by: Arkadiusz Kusztal 
---
 app/test/test_cryptodev_asym.c | 148 -
 app/test/test_cryptodev_sm2_test_vectors.h | 112 +-
 2 files changed, 256 insertions(+), 4 deletions(-)

diff --git a/app/test/test_cryptodev_asym.c b/app/test/test_cryptodev_asym.c
index f0b5d38543..cb28179562 100644
--- a/app/test/test_cryptodev_asym.c
+++ b/app/test/test_cryptodev_asym.c
@@ -2635,6 +2635,8 @@ test_sm2_sign(void)
asym_op->sm2.k.data = input_params.k.data;
asym_op->sm2.k.length = input_params.k.length;
}
+   asym_op->sm2.k.data = input_params.k.data;
+   asym_op->sm2.k.length = input_params.k.length;
 
/* Init out buf */
asym_op->sm2.r.data = output_buf_r;
@@ -3184,7 +3186,7 @@ static int send_one(void)
ticks++;
if (ticks >= DEQ_TIMEOUT) {
RTE_LOG(ERR, USER1,
-   "line %u FAILED: Cannot dequeue the crypto op 
on device %d",
+   "line %u FAILED: Cannot dequeue the crypto op 
on device, timeout %d",
__LINE__, params->valid_devs[0]);
return TEST_FAILED;
}
@@ -3489,6 +3491,142 @@ kat_rsa_decrypt_crt(const void *data)
return 0;
 }
 
+static int
+test_sm2_partial_encryption(const void *data)
+{
+   struct rte_crypto_asym_xform xform = { 0 };
+   const uint8_t dev_id = params->valid_devs[0];
+   const struct crypto_testsuite_sm2_params *test_vector = data;
+   uint8_t result_C1_x1[TEST_DATA_SIZE] = { 0 };
+   uint8_t result_C1_y1[TEST_DATA_SIZE] = { 0 };
+   uint8_t result_kP_x1[TEST_DATA_SIZE] = { 0 };
+   uint8_t result_kP_y1[TEST_DATA_SIZE] = { 0 };
+   const struct rte_cryptodev_asymmetric_xform_capability *capa;
+   struct rte_cryptodev_asym_capability_idx idx;
+   struct rte_cryptodev_info dev_info;
+
+   rte_cryptodev_info_get(dev_id, &dev_info);
+   if (!(dev_info.feature_flags &
+   RTE_CRYPTODEV_FF_ASYM_PARTIAL_SM2)) {
+   RTE_LOG(INFO, USER1,
+   "Device doesn't support partial SM2. Test Skipped\n");
+   return TEST_SKIPPED;
+   }
+
+   idx.type = RTE_CRYPTO_ASYM_XFORM_SM2;
+   capa = rte_cryptodev_asym_capability_get(dev_id, &idx);
+   if (capa == NULL)
+   return TEST_SKIPPED;
+
+   xform.xform_type = RTE_CRYPTO_ASYM_XFORM_SM2;
+   xform.ec.curve_id = RTE_CRYPTO_EC_GROUP_SM2;
+   xform.ec.q = test_vector->pubkey;
+   self->op->asym->sm2.op_type = RTE_CRYPTO_ASYM_OP_ENCRYPT;
+   self->op->asym->sm2.k = test_vector->k;
+   if (rte_cryptodev_asym_session_create(dev_id, &xform,
+   params->session_mpool, &self->sess) < 0) {
+   RTE_LOG(ERR, USER1, "line %u FAILED: Session creation failed",
+   __LINE__);
+   return TEST_FAILED;
+   }
+   rte_crypto_op_attach_asym_session(self->op, self->sess);
+
+   self->op->asym->sm2.C1.x.data = result_C1_x1;
+   self->op->asym->sm2.C1.y.data = result_C1_y1;
+   self->op->asym->sm2.kP.x.data = result_kP_x1;
+   self->op->asym->sm2.kP.y.data = result_kP_y1;
+   TEST_ASSERT_SUCCESS(send_one(),
+   "Failed to process crypto op");
+
+   debug_hexdump(stdout, "C1[x]", self->op->asym->sm2.C1.x.data,
+   self->op->asym->sm2.C1.x.length);
+   debug_hexdump(stdout, "C1[y]", self->op->asym->sm2.C1.y.data,
+   self->op->asym->sm2.C1.y.length);
+   debug_hexdump(stdout, "kP[x]", self->op->asym->sm2.kP.x.data,
+   self->op->asym->sm2.kP.x.length);
+   debug_hexdump(stdout, "kP[y]", self->op->asym->sm2.kP.y.data,
+   self->op->asym->sm2.kP.y.length);
+
+   TEST_ASSERT_BUFFERS_ARE_EQUAL(test_vector->C1.x.data,
+   self->op->asym->sm2.C1.x.data,
+   test_vector->C1.x.length,
+   "Incorrect value of C1[x]\n");
+   TEST_ASSERT_BUFFERS_ARE_EQUAL(test_vector->C1.y.data,
+   self->op->asym->sm2.C1.y.data,
+   test_vector->C1.y.length,
+   "Incorrect value of C1[y]\n");
+   TEST_ASSERT_BUFFERS_ARE_EQUAL(test_vector->kP.x.data,
+   self->op->asym->sm2.kP.x.data,
+   test_vector->kP.x.length,
+   "Incorrect value of kP[x]\n");
+   TEST_ASSERT_BUFFERS_ARE_EQUAL(test_vector->kP.y.data,
+   self->op->asym->sm2.kP.y.data,
+   test_vector->kP.y.length,
+   "Incorrect value of kP[y]\n");
+
+   return TEST_SUCCESS;
+}
+
+static int
+test_sm2_partial_decryption(const void *data)
+{
+   struct rte_crypto_asym_xform xform = {};
+   const uint8_t dev_id = params->valid_devs[0];
+   const struct c

[PATCH v3 3/4] crypto/qat: add sm2 encryption/decryption function

2024-10-08 Thread Arkadiusz Kusztal

This commit adds SM2 elliptic curve based asymmetric
encryption and decryption to the Intel QuickAssist
Technology PMD.

Signed-off-by: Arkadiusz Kusztal 
---
 doc/guides/cryptodevs/features/qat.ini  |   1 +
 doc/guides/rel_notes/release_24_11.rst  |   4 +
 drivers/common/qat/qat_adf/icp_qat_fw_mmp_ids.h |   3 +
 drivers/common/qat/qat_adf/qat_pke.h|  20 
 drivers/crypto/qat/dev/qat_asym_pmd_gen1.c  |   3 +-
 drivers/crypto/qat/qat_asym.c   | 140 +++-
 6 files changed, 164 insertions(+), 7 deletions(-)

diff --git a/doc/guides/cryptodevs/features/qat.ini 
b/doc/guides/cryptodevs/features/qat.ini
index f41d29158f..219dd1e011 100644
--- a/doc/guides/cryptodevs/features/qat.ini
+++ b/doc/guides/cryptodevs/features/qat.ini
@@ -71,6 +71,7 @@ ZUC EIA3 = Y
 AES CMAC (128) = Y
 SM3  = Y
 SM3 HMAC = Y
+SM2  = Y
 
 ;
 ; Supported AEAD algorithms of the 'qat' crypto driver.
diff --git a/doc/guides/rel_notes/release_24_11.rst 
b/doc/guides/rel_notes/release_24_11.rst
index 0ff70d9057..85f4a2dd97 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -55,6 +55,10 @@ New Features
  Also, make sure to start the actual text at the margin.
  ===
 
+* **Updated the QuickAssist Technology (QAT) Crypto PMD.**
+
+  * Added SM2 encryption and decryption algorithms.
+
 
 Removed Items
 -
diff --git a/drivers/common/qat/qat_adf/icp_qat_fw_mmp_ids.h 
b/drivers/common/qat/qat_adf/icp_qat_fw_mmp_ids.h
index 630c6e1a9b..aa49612ca1 100644
--- a/drivers/common/qat/qat_adf/icp_qat_fw_mmp_ids.h
+++ b/drivers/common/qat/qat_adf/icp_qat_fw_mmp_ids.h
@@ -1542,6 +1542,9 @@ icp_qat_fw_mmp_ecdsa_verify_gfp_521_input::in in @endlink
  * @li no output parameters
  */
 
+#define PKE_ECSM2_ENCRYPTION 0x25221720
+#define PKE_ECSM2_DECRYPTION 0x201716e6
+
 #define PKE_LIVENESS 0x0001
 /**< Functionality ID for PKE_LIVENESS
  * @li 0 input parameter(s)
diff --git a/drivers/common/qat/qat_adf/qat_pke.h 
b/drivers/common/qat/qat_adf/qat_pke.h
index f88932a275..ac051e965d 100644
--- a/drivers/common/qat/qat_adf/qat_pke.h
+++ b/drivers/common/qat/qat_adf/qat_pke.h
@@ -334,4 +334,24 @@ get_sm2_ecdsa_verify_function(void)
return qat_function;
 }
 
+static struct qat_asym_function
+get_sm2_encryption_function(void)
+{
+   struct qat_asym_function qat_function = {
+   PKE_ECSM2_ENCRYPTION, 32
+   };
+
+   return qat_function;
+}
+
+static struct qat_asym_function
+get_sm2_decryption_function(void)
+{
+   struct qat_asym_function qat_function = {
+   PKE_ECSM2_DECRYPTION, 32
+   };
+
+   return qat_function;
+}
+
 #endif
diff --git a/drivers/crypto/qat/dev/qat_asym_pmd_gen1.c 
b/drivers/crypto/qat/dev/qat_asym_pmd_gen1.c
index 67b1892c32..f991729dd9 100644
--- a/drivers/crypto/qat/dev/qat_asym_pmd_gen1.c
+++ b/drivers/crypto/qat/dev/qat_asym_pmd_gen1.c
@@ -87,7 +87,8 @@ qat_asym_crypto_feature_flags_get_gen1(
RTE_CRYPTODEV_FF_HW_ACCELERATED |
RTE_CRYPTODEV_FF_ASYM_SESSIONLESS |
RTE_CRYPTODEV_FF_RSA_PRIV_OP_KEY_EXP |
-   RTE_CRYPTODEV_FF_RSA_PRIV_OP_KEY_QT;
+   RTE_CRYPTODEV_FF_RSA_PRIV_OP_KEY_QT |
+   RTE_CRYPTODEV_FF_ASYM_PARTIAL_SM2;
 
return feature_flags;
 }
diff --git a/drivers/crypto/qat/qat_asym.c b/drivers/crypto/qat/qat_asym.c
index 491f5ecd5b..e1ada8629e 100644
--- a/drivers/crypto/qat/qat_asym.c
+++ b/drivers/crypto/qat/qat_asym.c
@@ -932,6 +932,15 @@ sm2_ecdsa_sign_set_input(struct icp_qat_fw_pke_request 
*qat_req,
qat_req->input_param_count = 3;
qat_req->output_param_count = 2;
 
+   HEXDUMP("SM2 K test", asym_op->sm2.k.data,
+   cookie->alg_bytesize);
+   HEXDUMP("SM2 K", cookie->input_array[0],
+   cookie->alg_bytesize);
+   HEXDUMP("SM2 msg", cookie->input_array[1],
+   cookie->alg_bytesize);
+   HEXDUMP("SM2 pkey", cookie->input_array[2],
+   cookie->alg_bytesize);
+
return RTE_CRYPTO_OP_STATUS_SUCCESS;
 }
 
@@ -983,6 +992,114 @@ sm2_ecdsa_sign_collect(struct rte_crypto_asym_op *asym_op,
 }
 
 static int
+sm2_encryption_set_input(struct icp_qat_fw_pke_request *qat_req,
+   struct qat_asym_op_cookie *cookie,
+   const struct rte_crypto_asym_op *asym_op,
+   const struct rte_crypto_asym_xform *xform)
+{
+   const struct qat_asym_function qat_function =
+   get_sm2_encryption_function();
+   const uint32_t qat_func_alignsize =
+   qat_function.bytesize;
+
+   SET_PKE_LN(asym_op->sm2.k, qat_func_alignsize, 0);
+   SET_PKE_LN(xform->ec.q.x, qat_func_alignsize, 1);
+   SET_PKE_LN(xform->ec.q.y, qat_func_alignsize, 2);
+
+   cookie->alg_bytesize = qat_function.bytesize;
+   cookie->qat_func_ali

Re: [RFC PATCH 0/3] add feature arc in rte_graph

2024-10-08 Thread David Marchand

Hi graph guys,

On Sat, Sep 7, 2024 at 9:31 AM Nitin Saxena  wrote:
>
> Feature arc represents an ordered list of features/protocols at a given
> networking layer. It is a high level abstraction to connect various
> rte_graph nodes, as feature nodes, and allow packets steering across
> these nodes in a generic manner.
>
> Features (or feature nodes) are nodes which handle a protocol, partially
> or completely, in the fast path. For example, the ipv4-rewrite node adds
> rewrite data to an outgoing IPv4 packet.
>
> However in above example, outgoing interface(say "eth0") may have
> outbound IPsec policy enabled, hence packets must be steered from
> ipv4-rewrite node to ipsec-outbound-policy node for outbound IPsec
> policy lookup. On the other hand, packets routed to another interface
> (eth1) will not be sent to ipsec-outbound-policy node as IPsec feature
> is disabled on eth1. Feature-arc allows rte_graph applications to manage
> such constraints easily
>
> Feature arc abstraction allows rte_graph based application to
>
> 1. Seamlessly steer packets across feature nodes based on whether a feature
> is enabled or disabled on an interface. Features enabled on one
> interface may not be enabled on another interface within the same feature
> arc.
>
> 2. Allow enabling/disabling of features on an interface at runtime,
> so that if a feature is disabled, packets associated with that interface
> won't be steered to corresponding feature node.
>
> 3. Provide a mechanism to hook custom/user-defined nodes to a feature
> node and allow packet steering from feature node to custom node without
> changing the former's fast path function.
>
> 4. Allow expressing features in a particular sequential order so that
> packets are steered in an ordered way across nodes in fast path. For
> example, if IPsec and IPv4 features are enabled on an ingress interface,
> packets must be sent to the IPsec inbound policy node first and then to
> the ipv4 lookup node.
>
> This patch series adds feature arc library in rte_graph and also adds
> "ipv4-output" feature arc handling in "ipv4-rewrite" node.
>
> Nitin Saxena (3):
>   graph: add feature arc support
>   graph: add feature arc option in graph create
>   graph: add IPv4 output feature arc
>
>  lib/graph/graph.c|   1 +
>  lib/graph/graph_feature_arc.c| 959 +++
>  lib/graph/graph_populate.c   |   7 +-
>  lib/graph/graph_private.h|   3 +
>  lib/graph/meson.build|   2 +
>  lib/graph/node.c |   2 +
>  lib/graph/rte_graph.h|   3 +
>  lib/graph/rte_graph_feature_arc.h| 373 +
>  lib/graph/rte_graph_feature_arc_worker.h | 548 +
>  lib/graph/version.map|  17 +
>  lib/node/ip4_rewrite.c   | 476 ---
>  lib/node/ip4_rewrite_priv.h  |   9 +-
>  lib/node/node_private.h  |  19 +-
>  lib/node/rte_node_ip4_api.h  |   3 +
>  14 files changed, 2325 insertions(+), 97 deletions(-)
>  create mode 100644 lib/graph/graph_feature_arc.c
>  create mode 100644 lib/graph/rte_graph_feature_arc.h
>  create mode 100644 lib/graph/rte_graph_feature_arc_worker.h

I see no non-RFC series following this original submission.
It will slip to next release unless there is an objection.

Btw, I suggest copying Robin (and Christophe) for graph related changes.


-- 
David Marchand

Re: [PATCH v2 08/10] baseband/acc: remove check on HARQ memory

2024-10-08 Thread Maxime Coquelin





On 10/3/24 22:49, Hernan Vargas wrote:

Automatically reset HARQ memory to prevent errors and simplify usage.
In a way we can assume that the HARQ output operation will always
overwrite the buffer, so we can reset this from the driver to prevent
an error being reported when application fails to do this explicitly.



Should it be backported?


Signed-off-by: Hernan Vargas 
---
  drivers/baseband/acc/rte_vrb_pmd.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/baseband/acc/rte_vrb_pmd.c 
b/drivers/baseband/acc/rte_vrb_pmd.c
index 865a050e1b19..27c8bdca3d08 100644
--- a/drivers/baseband/acc/rte_vrb_pmd.c
+++ b/drivers/baseband/acc/rte_vrb_pmd.c
@@ -2595,8 +2595,9 @@ vrb_enqueue_ldpc_dec_one_op_cb(struct acc_queue *q, 
struct rte_bbdev_dec_op *op,
/* Hard output. */
mbuf_append(h_output_head, h_output, h_out_length);
if (op->ldpc_dec.harq_combined_output.length > 0) {
-   /* Push the HARQ output into host memory. */
+   /* Push the HARQ output into host memory overwriting existing 
data. */
struct rte_mbuf *hq_output_head, *hq_output;
+   op->ldpc_dec.harq_combined_output.data->data_len = 0;
hq_output_head = op->ldpc_dec.harq_combined_output.data;
hq_output = op->ldpc_dec.harq_combined_output.data;
hq_len = op->ldpc_dec.harq_combined_output.length;


Reviewed-by: Maxime Coquelin 

Thanks,
Maxime

Re: [PATCH v2 07/10] baseband/acc: algorithm tuning for LDPC decoder

2024-10-08 Thread Maxime Coquelin





On 10/3/24 22:49, Hernan Vargas wrote:

Reverting to MS1 version of the algorithm to improve MU1 fading
conditions.

Signed-off-by: Hernan Vargas 
---
  drivers/baseband/acc/rte_vrb_pmd.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)



Reviewed-by: Maxime Coquelin 

Thanks,
Maxime

Re: [PATCH v2 09/10] baseband/acc: reset ring data valid bit

2024-10-08 Thread Maxime Coquelin





On 10/3/24 22:49, Hernan Vargas wrote:

Reset only the valid bit to keep info ring data notably for dumping.

Signed-off-by: Hernan Vargas 
---
  drivers/baseband/acc/rte_vrb_pmd.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/baseband/acc/rte_vrb_pmd.c 
b/drivers/baseband/acc/rte_vrb_pmd.c
index 27c8bdca3d08..5f7568a4b7ea 100644
--- a/drivers/baseband/acc/rte_vrb_pmd.c
+++ b/drivers/baseband/acc/rte_vrb_pmd.c
@@ -411,7 +411,7 @@ vrb_check_ir(struct acc_device *acc_dev)
rte_bbdev_log(WARNING, "InfoRing: ITR:%d Info:0x%x",
int_nb, ring_data->detailed_info);
/* Initialize Info Ring entry and move forward. */
-   ring_data->val = 0;
+   ring_data->valid = 0;
}
info_ring_head++;
ring_data = acc_dev->info_ring + (info_ring_head & 
ACC_INFO_RING_MASK);


Reviewed-by: Maxime Coquelin 

Thanks,
Maxime

Re: [PATCH v2 10/10] baseband/acc: cosmetic changes

2024-10-08 Thread Maxime Coquelin





On 10/3/24 22:49, Hernan Vargas wrote:

Cosmetic code changes.
No functional impact.

Signed-off-by: Hernan Vargas 
---
  drivers/baseband/acc/rte_acc100_pmd.c |  2 +-
  drivers/baseband/acc/rte_vrb_pmd.c| 62 +++
  2 files changed, 44 insertions(+), 20 deletions(-)

diff --git a/drivers/baseband/acc/rte_acc100_pmd.c 
b/drivers/baseband/acc/rte_acc100_pmd.c
index e3a523946448..c33e2758b100 100644
--- a/drivers/baseband/acc/rte_acc100_pmd.c
+++ b/drivers/baseband/acc/rte_acc100_pmd.c
@@ -4199,7 +4199,7 @@ poweron_cleanup(struct rte_bbdev *bbdev, struct 
acc_device *d,
acc_reg_write(d, HWPfQmgrIngressAq + 0x100, enq_req.val);
usleep(ACC_LONG_WAIT * 100);
if (desc->req.word0 != 2)
-   rte_bbdev_log(WARNING, "DMA Response %#"PRIx32, 
desc->req.word0);
+   rte_bbdev_log(WARNING, "DMA Response %#"PRIx32"", 
desc->req.word0);
}
  
  	/* Reset LDPC Cores */

diff --git a/drivers/baseband/acc/rte_vrb_pmd.c 
b/drivers/baseband/acc/rte_vrb_pmd.c
index 5f7568a4b7ea..c8875447d3d0 100644
--- a/drivers/baseband/acc/rte_vrb_pmd.c
+++ b/drivers/baseband/acc/rte_vrb_pmd.c
@@ -956,6 +956,9 @@ vrb_queue_setup(struct rte_bbdev *dev, uint16_t queue_id,
struct acc_queue *q;
int32_t q_idx;
int ret;
+   union acc_dma_desc *desc = NULL;
+   unsigned int desc_idx, b_idx;
+   int fcw_len;
  
  	if (d == NULL) {

rte_bbdev_log(ERR, "Undefined device");
@@ -982,16 +985,33 @@ vrb_queue_setup(struct rte_bbdev *dev, uint16_t queue_id,
}
  
  	/* Prepare the Ring with default descriptor format. */

-   union acc_dma_desc *desc = NULL;
-   unsigned int desc_idx, b_idx;
-   int fcw_len = (conf->op_type == RTE_BBDEV_OP_LDPC_ENC ?
-   ACC_FCW_LE_BLEN : (conf->op_type == 
RTE_BBDEV_OP_TURBO_DEC ?
-   ACC_FCW_TD_BLEN : (conf->op_type == 
RTE_BBDEV_OP_LDPC_DEC ?
-   ACC_FCW_LD_BLEN : (conf->op_type == RTE_BBDEV_OP_FFT ?
-   ACC_FCW_FFT_BLEN : ACC_FCW_MLDTS_BLEN;
-
-   if ((q->d->device_variant == VRB2_VARIANT) && (conf->op_type == 
RTE_BBDEV_OP_FFT))
-   fcw_len = ACC_FCW_FFT_BLEN_3;
+   switch (conf->op_type) {
+   case RTE_BBDEV_OP_LDPC_ENC:
+   fcw_len = ACC_FCW_LE_BLEN;
+   break;
+   case RTE_BBDEV_OP_LDPC_DEC:
+   fcw_len = ACC_FCW_LD_BLEN;
+   break;
+   case RTE_BBDEV_OP_TURBO_DEC:
+   fcw_len = ACC_FCW_TD_BLEN;
+   break;
+   case RTE_BBDEV_OP_TURBO_ENC:
+   fcw_len = ACC_FCW_TE_BLEN;
+   break;
+   case RTE_BBDEV_OP_FFT:
+   fcw_len = ACC_FCW_FFT_BLEN;
+   if (q->d->device_variant == VRB2_VARIANT)
+   fcw_len = ACC_FCW_FFT_BLEN_3;
+   break;
+   case RTE_BBDEV_OP_MLDTS:
+   fcw_len = ACC_FCW_MLDTS_BLEN;
+   break;
+   default:
+   /* NOT REACHED. */
+   fcw_len = 0;
+   rte_bbdev_log(ERR, "Unexpected error in %s using type %d", 
__func__, conf->op_type);
+   break;
+   }


This part is useful as it makes the code clearer.


for (desc_idx = 0; desc_idx < d->sw_ring_max_depth; desc_idx++) {
desc = q->ring_addr + desc_idx;
@@ -1757,8 +1777,7 @@ vrb_fcw_ld_fill(struct rte_bbdev_dec_op *op, struct 
acc_fcw_ld *fcw,
if (fcw->hcout_en > 0) {
parity_offset = (op->ldpc_dec.basegraph == 1 ? 20 : 8)
* op->ldpc_dec.z_c - op->ldpc_dec.n_filler;
-   k0_p = (fcw->k0 > parity_offset) ?
-   fcw->k0 - op->ldpc_dec.n_filler : fcw->k0;
+   k0_p = (fcw->k0 > parity_offset) ? fcw->k0 - op->ldpc_dec.n_filler 
: fcw->k0;
ncb_p = fcw->ncb - op->ldpc_dec.n_filler;
l = k0_p + fcw->rm_e;
harq_out_length = (uint16_t) fcw->hcin_size0;
@@ -2000,16 +2019,15 @@ vrb_dma_desc_ld_fill(struct rte_bbdev_dec_op *op,
next_triplet++;
}
  
-	if (check_bit(op->ldpc_dec.op_flags,

-   RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
+   if (check_bit(op->ldpc_dec.op_flags, 
RTE_BBDEV_LDPC_HQ_COMBINE_OUT_ENABLE)) {
if (op->ldpc_dec.harq_combined_output.data == 0) {
rte_bbdev_log(ERR, "HARQ output is not defined");
return -1;
}
  
-		/* Pruned size of the HARQ */

+   /* Pruned size of the HARQ. */
h_p_size = fcw->hcout_size0 + fcw->hcout_size1;
-   /* Non-Pruned size of the HARQ */
+   /* Non-Pruned size of the HARQ. */
h_np_size = fcw->hcout_offset > 0 ?
fcw->hcout_offset + fcw->hcout_size1 :
h_p_siz

Re: [PATCH v4 1/5] graph: add support for node specific errors

2024-10-08 Thread David Marchand

Hello Kiran,

On Thu, Aug 22, 2024 at 8:38 AM Kiran Kumar Kokkilagadda
 wrote:
> > -Original Message-
> > From: pbhagavat...@marvell.com 
> > Sent: Friday, August 16, 2024 8:39 PM
> > To: Jerin Jacob ; Nithin Kumar Dabilpuram
> > ; Kiran Kumar Kokkilagadda
> > ; zhirun@intel.com; Zhirun Yan
> > 
> > Cc: dev@dpdk.org; Pavan Nikhilesh Bhagavatula 
> > Subject: [PATCH v4 1/5] graph: add support for node specific errors
> >
> > From: Pavan Nikhilesh 
> >
> > Add ability for Nodes to advertise error counters
> > during registration.
> >
> > Signed-off-by: Pavan Nikhilesh 
> > ---
> Acked-by: Kiran Kumar Kokkilagadda 

When acking, please strip the rest of the mail.
Otherwise it leaves an impression that you left some comment later in the mail.


-- 
David Marchand

Re: [PATCH 0/5] Increase minimum meson version

2024-10-08 Thread David Marchand

Hello CI guys,

On Fri, Sep 20, 2024 at 2:57 PM Bruce Richardson
 wrote:
>
> This patchset proposes increasing the minimum meson version to 0.57
> and makes changes to update our build files appropriately for that
> change: replacing deprecated functions, removing unnecessary version
> checks and taking advantage of some new capabilities.
>
> Why 0.57? No one particular reason; it's mainly a conservative version
> bump that doesn't have many impacts, but still gives us the minimum
> updates we need to replace the deprecated get_cross_properties fn
> and have a few extra features guaranteed available.
>
> Bruce Richardson (5):
>   build: increase minimum meson version to 0.57
>   build: remove version check on compiler links function
>   build: remove unnecessary version checks
>   build: use version file support from meson
>   build: replace deprecated meson function
>
>  .ci/linux-setup.sh| 2 +-
>  config/arm/meson.build| 4 ++--
>  config/meson.build| 8 
>  config/riscv/meson.build  | 4 ++--
>  doc/api/meson.build   | 2 +-
>  doc/guides/linux_gsg/sys_reqs.rst | 2 +-
>  doc/guides/prog_guide/build-sdk-meson.rst | 2 +-
>  drivers/common/qat/meson.build| 2 +-
>  drivers/crypto/ipsec_mb/meson.build   | 2 +-
>  drivers/event/cnxk/meson.build| 2 +-
>  drivers/meson.build   | 7 ++-
>  drivers/net/cnxk/meson.build  | 2 +-
>  lib/meson.build   | 6 --
>  meson.build   | 7 ++-
>  14 files changed, 20 insertions(+), 32 deletions(-)

This series can't be merged until the (UNH and LoongArch) CI are ready
for such a change.

TL;DR: the meson minimum version is being changed from 0.53.2 to 0.57
in the current release.

@UNH @Min Zhou
How long would it take for all CI to be ready for this change?

Important note: if relevant to your CI, testing against LTS branches
must still be done with the 0.53.2 version, so no change relying on
post 0.53.2 meson feature gets backported.


-- 
David Marchand

RE: [PATCH v3 17/18] eal: add function attributes for allocation functions

2024-10-08 Thread Morten Brørup

> From: Stephen Hemminger [mailto:step...@networkplumber.org]
> Sent: Sunday, 29 September 2024 17.35
> 
> The allocation functions take an alignment argument that
> can be useful to hint the compiler optimizer.
> 
> This is supported by GCC and Clang but only useful with
> GCC because Clang gives a warning if alignment is 0.

This patch defines and uses __rte_alloc_align(). OK.

> 
> Recent versions of GCC have a malloc attribute that can
> be used to find mismatches between allocation and free;
> the typical problem caught is a pointer allocated with
> rte_malloc() that is then incorrectly freed using free().

This patch defines __rte_alloc_func(), but it is only used in the next patch
in the series.
I suggest either doing both here, or moving the definition of
__rte_alloc_func() to the next patch.


> +/**
> + * Tells the compiler this is a function like malloc and that the
> pointer
> + * returned cannot alias any other pointer (ie new memory).

There's a good example of its use here:
https://developers.redhat.com/blog/2021/04/30/detecting-memory-management-bugs-with-gcc-11-part-1-understanding-dynamic-allocation#detecting_mismatched_deallocations

The attribute applies not only to memory, but also to pointers used as handles.
You might want to replace "ie new memory" with "ie new object" or similar.


Please add the optional arguments to pass to __rte_alloc_func to the macro
description, e.g.:
@param [free_func]
  The name of the deallocation function to free the allocated object
@param [free_func_ptr_index]
  The deallocation function's argument index of the object pointer.

PS: The brackets indicate that the parameter is optional. I didn't know the
convention, so this is what I found on the internet.

> + *
> + * Also, recent GCC versions are able to track that the proper
> + * deallocator function is used for this pointer.
> + */
> +#if defined(RTE_TOOLCHAIN_GCC) && (GCC_VERSION >= 11)
> +#define __rte_alloc_func(...) \
> + __attribute__((malloc, malloc(__VA_ARGS__)))
> +
> +#elif defined(RTE_CC_GCC) || defined(RTE_CC_CLANG)
> +#define __rte_alloc_func(...) \
> + __attribute__((malloc))
> +#else
> +#define __rte_alloc_func(...)
> +#endif

The _func postfix seems superfluous. Macros hinting about hot and cold
functions are simply __rte_hot and __rte_cold, without a _func postfix.
It's probably a matter of taste, so I'll leave it up to you.

Minor detail:
When looking at the code using the macro, it seems somewhat confusing that the
macro name is "__rte_alloc" when its arguments describe the associated free
function.
But I have no ideas for a better name...
Even if the two arguments were required, the primary purpose of the macro is to
inform the compiler that the function is an allocation function, so that must
be dominant in the name of the macro, which it is with the current name.
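
For illustration only, here is a minimal standalone sketch of how the two
optional arguments pair an allocator with its deallocator. The names
my_alloc_attr, obj_alloc and obj_free are hypothetical, and the version test
uses the compiler's built-in __GNUC__ rather than the DPDK toolchain macros:

```c
#include <stdlib.h>

/* Hypothetical macro; it only mirrors the shape of the quoted definition. */
#if defined(__GNUC__) && !defined(__clang__) && (__GNUC__ >= 11)
#define my_alloc_attr(...) __attribute__((malloc, malloc(__VA_ARGS__)))
#elif defined(__GNUC__) || defined(__clang__)
#define my_alloc_attr(...) __attribute__((malloc))
#else
#define my_alloc_attr(...)
#endif

struct obj {
	int value;
};

/* The deallocator must be declared before the attribute can name it. */
void obj_free(struct obj *o);

/* Argument 1 of obj_free() is the pointer returned by obj_alloc(). */
struct obj *obj_alloc(int value) my_alloc_attr(obj_free, 1);

struct obj *
obj_alloc(int value)
{
	struct obj *o = malloc(sizeof(*o));

	if (o != NULL)
		o->value = value;
	return o;
}

void
obj_free(struct obj *o)
{
	free(o);
}
```

With GCC 11 or later, passing the result of obj_alloc() to plain free()
instead of obj_free() should then be reported by -Wmismatched-dealloc.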

With the macro description updated,
Series-Acked-by: Morten Brørup

Re: [PATCH 3/5] build: remove unnecessary version checks

2024-10-08 Thread David Marchand

On Fri, Sep 20, 2024 at 2:58 PM Bruce Richardson
 wrote:
>
> Since minimum meson version is now 0.57 we can remove all version checks
> for versions lower than that.
>
> Signed-off-by: Bruce Richardson 
> ---
>  config/meson.build  | 2 +-
>  doc/api/meson.build | 2 +-
>  drivers/meson.build | 3 ---
>  lib/meson.build | 6 --
>  4 files changed, 2 insertions(+), 11 deletions(-)
>
> diff --git a/config/meson.build b/config/meson.build
> index 8c8b019c25..913825b1ca 100644
> --- a/config/meson.build
> +++ b/config/meson.build
> @@ -97,7 +97,7 @@ eal_pmd_path = join_paths(get_option('prefix'), 
> driver_install_path)
>  if not is_windows
>  meson.add_install_script('../buildtools/symlink-drivers-solibs.sh',
>  get_option('libdir'), pmd_subdir_opt)
> -elif meson.version().version_compare('>=0.55.0')
> +else
>  # 0.55.0 is required to use external program with add_install_script

Nit: this comment can be removed (I intend to do so when applying).


>  meson.add_install_script(py3,
>  files('../buildtools/symlink-drivers-solibs.py'),


-- 
David Marchand

RE: [PATCH v6 1/6] cryptodev: add EDDSA asymmetric crypto algorithm

2024-10-08 Thread Gowrishankar Muthukrishnan

> > > > > +/**
> > > > > + * EdDSA operation params
> > > > > + */
> > > > > +struct rte_crypto_eddsa_op_param {
> > > > > + enum rte_crypto_asym_op_type op_type;
> > > > > + /**< Signature generation or verification */
> > > > > +
> > > > > + rte_crypto_param message;
> > > > > + /**< Input message digest to be signed or verified */
> > > > HashEdDSA will require a message digest; pure EdDSA will require
> > > > the message itself. For HW it will be more complicated.
> >
> > Do you mean some hardware may not have HashEdDSA support ?
> Not in full. For example: ECDSA in QAT and Octeon accepts a digest, not a
> message. So it does not support the full process, but EdDSA is more
> complicated than that because of the two hash rounds, similar to the SM2.
> 
> For now we have only OpenSSL PMD that supports it, and it accepts a
> message not a digest, so this should be changed to "message to be signed".
> 
Ack.


> > > > All instances are using the same curve, where they differ is the
> > > > way of handling input message.
> > > > And I think this should be a session variable -> new xform for the 
> > > > EdDSA.
> >
> > Based on prehash and context string, these instances are listed in RFC.
> > A context string per operation helps ensure each signature is uniquely
> > tied to its specific context, thereby preventing reuse of signatures
> > across different contexts or operations.
> > Prehashing adds additional security by ensuring new prehash is
> > computed from the message.
> > Therefore it is more appropriate to treat both of these as operational
> > variables.
> 
> Different 'instance' are basically different algorithms.
> 
> About the 'context' I am not sure, as no major protocol specifies its usage
> (TLS and IKEv2 forbid PH though). But from RFC8032, it looks like it was
> defined to be used on a per-protocol basis, or in some subprotocol routine.
> But about this I am not sure.
> 
> Yet, EdDSA should not be delayed really; it has basically been a network
> standard for quite some time.
> These changes may be discussed later.
> 

Sure Arkadiusz.

Regards,
Gowrishankar

Re: [PATCH] Increasing ci meson version to .57

2024-10-08 Thread Stephen Hemminger

On Tue,  8 Oct 2024 15:25:43 -0400
Patrick Robb  wrote:

> There is a proposed increase in the minimum meson version to .57
> This patch aligns the linux setup ci script with this change.
> 
> Signed-off-by: Patrick Robb 

I wonder if we shouldn't push it to something later.
Debian stable is using 1.0.1 and testing is up to 1.5.2.

0.57.0 was released on Feb 14, 2021
1.0.0 was released on Dec 23, 2022

So moving to a more recent release makes sense.
Users stuck on enterprise distros are going to have to resort
to pip anyway to get a new version.

RE: [PATCH v2] cryptodev: add asymmetric operational capability

2024-10-08 Thread Gowrishankar Muthukrishnan

> Acked-by: Arkadiusz Kusztal  

Thanks.
> With some comments.
> 


> > diff --git a/drivers/crypto/openssl/rte_openssl_pmd_ops.c
> > b/drivers/crypto/openssl/rte_openssl_pmd_ops.c
> > index b7b612fc57..6f81bcb110 100644
> > --- a/drivers/crypto/openssl/rte_openssl_pmd_ops.c
> > +++ b/drivers/crypto/openssl/rte_openssl_pmd_ops.c
> > @@ -598,15 +598,22 @@ static const struct rte_cryptodev_capabilities
> > openssl_pmd_capabilities[] = {
> > {.asym = {
> > .xform_capa = {
> > .xform_type =
> > RTE_CRYPTO_ASYM_XFORM_SM2,
> > -   .hash_algos = (1 <<
> RTE_CRYPTO_AUTH_SM3),
> > .op_types =
> > -   ((1<<RTE_CRYPTO_ASYM_OP_SIGN) |
> > +   ((1 << RTE_CRYPTO_ASYM_OP_SIGN) |
> >  (1 << RTE_CRYPTO_ASYM_OP_VERIFY) |
> >  (1 << RTE_CRYPTO_ASYM_OP_ENCRYPT) |
> >  (1 << RTE_CRYPTO_ASYM_OP_DECRYPT)),
> > -   {.internal_rng = 1
> > -   }
> > -   }
> Designated initializers could probably help with readability.
> > +   .op_capa = {
> > +
> [RTE_CRYPTO_ASYM_OP_ENCRYPT] = (1 << RTE_CRYPTO_SM2_RNG) |
> > +   (1 << RTE_CRYPTO_SM2_PKE_KDF),
> > +
> [RTE_CRYPTO_ASYM_OP_DECRYPT] = (1 << RTE_CRYPTO_SM2_RNG) |
> > +   (1 << RTE_CRYPTO_SM2_PKE_KDF),
> > +
> [RTE_CRYPTO_ASYM_OP_SIGN] = (1 << RTE_CRYPTO_SM2_RNG) |
> > +   (1 << RTE_CRYPTO_SM2_PH),
> > +
> [RTE_CRYPTO_ASYM_OP_VERIFY] = (1 << RTE_CRYPTO_SM2_RNG) |
> > +   (1 << RTE_CRYPTO_SM2_PH)
> > +   },
> > +   },
> > }
> 
Ack.

> Probably driver/test changes should be in different patches.
> 

Ack.

Regards,
Gowrishankar

Re: [PATCH] net/mvneta: fix possible out-of-bounds write

2024-10-08 Thread Stephen Hemminger

On Wed, 9 Oct 2024 02:23:42 +
Chengwen Feng  wrote:

> The mvneta_ifnames_get() function will save 'iface' value to ifnames,
> it will out-of-bounds write if passed many iface pairs (e.g.
> 'iface=xxx,iface=xxx,...').
> 
> Fixes: 4ccc8d770d3b ("net/mvneta: add PMD skeleton")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Chengwen Feng 
> Acked-by: Ferruh Yigit 
> ---
>  drivers/net/mvneta/mvneta_ethdev.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/net/mvneta/mvneta_ethdev.c 
> b/drivers/net/mvneta/mvneta_ethdev.c
> index 3841c1ebe9..c49f083efa 100644
> --- a/drivers/net/mvneta/mvneta_ethdev.c
> +++ b/drivers/net/mvneta/mvneta_ethdev.c
> @@ -91,6 +91,9 @@ mvneta_ifnames_get(const char *key __rte_unused, const char 
> *value,
>  {
>   struct mvneta_ifnames *ifnames = extra_args;
>  
> + if (ifnames->idx >= NETA_NUM_ETH_PPIO)
> + return -EINVAL;
> +

Looks like a reasonable fix, but if a user tries to set up too many
devices, it would be best to add a log message with ERR severity so they know why.
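
A minimal sketch of what such a check with a log message might look like, as
standalone C. MAX_IFACES and the fprintf() call are stand-ins chosen for this
example; the driver would use its own limit and logging macro:

```c
#include <errno.h>
#include <stdio.h>

#define MAX_IFACES 3 /* hypothetical stand-in for NETA_NUM_ETH_PPIO */

struct ifnames {
	const char *names[MAX_IFACES];
	int idx;
};

/* kvargs-style callback: store one "iface=" value, refusing any overflow. */
static int
ifnames_get(const char *key, const char *value, void *extra_args)
{
	struct ifnames *ifnames = extra_args;

	(void)key;
	if (ifnames->idx >= MAX_IFACES) {
		/* Tell the user why the extra argument was rejected. */
		fprintf(stderr, "Too many ifaces (max %d), rejecting '%s'\n",
			MAX_IFACES, value);
		return -EINVAL;
	}

	ifnames->names[ifnames->idx++] = value;
	return 0;
}
```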

Re: [PATCH] net/mvneta: fix possible out-of-bounds write

2024-10-08 Thread zhoumin


Recheck-request: loongarch-compilation

--
Just for a test, please ignore.

Re: [PATCH v4 2/4] cryptodev: add ec points to sm2 op

2024-10-08 Thread Stephen Hemminger

On Tue, 8 Oct 2024 21:00:50 +
"Kusztal, ArkadiuszX"  wrote:

> Hi Stephen,
> 
> > -Original Message-
> > From: Stephen Hemminger 
> > Sent: Tuesday, October 8, 2024 10:46 PM
> > To: Kusztal, ArkadiuszX 
> > Cc: dev@dpdk.org; gak...@marvell.com; Dooley, Brian
> > 
> > Subject: Re: [PATCH v4 2/4] cryptodev: add ec points to sm2 op
> > 
> > On Tue,  8 Oct 2024 19:14:31 +0100
> > Arkadiusz Kusztal  wrote:
> >   
> > > + RTE_CRYPTO_SM2_PARTIAL,
> > > + /**<
> > > +  * PMD does not support the full process of the
> > > +  * SM2 encryption/decryption, but the elliptic
> > > +  * curve part only  
> > 
> > Couldn't this just be:
> > /**< PMD only supports elliptic curve */  
> 
> SM2 encryption involves several steps: random number generation, hashing,
> some trivial XORs, and calculation of elliptic curve points. What I meant
> here is that only this EC calculation will be performed.
> Reading it again now, I probably need to make that clearer.


My point is what developers write tends to be overly wordy and redundant.
Comments and documentation should be as succinct as possible.

RE: [PATCH v4 2/4] cryptodev: add ec points to sm2 op

2024-10-08 Thread Kusztal, ArkadiuszX

Hi Stephen,

> -Original Message-
> From: Stephen Hemminger 
> Sent: Tuesday, October 8, 2024 10:46 PM
> To: Kusztal, ArkadiuszX 
> Cc: dev@dpdk.org; gak...@marvell.com; Dooley, Brian
> 
> Subject: Re: [PATCH v4 2/4] cryptodev: add ec points to sm2 op
> 
> On Tue,  8 Oct 2024 19:14:31 +0100
> Arkadiusz Kusztal  wrote:
> 
> > +   RTE_CRYPTO_SM2_PARTIAL,
> > +   /**<
> > +* PMD does not support the full process of the
> > +* SM2 encryption/decryption, but the elliptic
> > +* curve part only
> 
> Couldn't this just be:
>   /**< PMD only supports elliptic curve */

SM2 encryption involves several steps: random number generation, hashing, some
trivial XORs, and calculation of elliptic curve points. What I meant here
is that only this EC calculation will be performed.
Reading it again now, I probably need to make that clearer.

Re: [PATCH] net/gve: add IO memory barriers before reading descriptors

2024-10-08 Thread Ferruh Yigit

On 10/4/2024 2:05 AM, Joshua Washington wrote:
> Without memory barriers, there is no guarantee that the CPU will
> actually wait until after the descriptor has been fully written before
> loading descriptor data. In this case, it is possible that stale data is
> read and acted on by the driver when processing TX or RX completions.
> 
> This change adds read memory barriers just after the generation bit is
> read in both the RX and the TX path to ensure that the NIC has properly
> passed ownership to the driver before descriptor data is read in full.
> 
> Note that memory barriers should not be needed after writing the RX
> buffer queue/TX descriptor queue tails because rte_write32 includes an
> implicit write memory barrier.
> 
> Fixes: 4022ff56 ("net/gve: support basic Tx data path for DQO")
> Fixes: 45da16b5b181 ("net/gve: support basic Rx data path for DQO")
> Cc: junfeng@intel.com
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Joshua Washington 
> Reviewed-by: Praveen Kaligineedi 
> Reviewed-by: Rushil Gupta 
>

Applied to dpdk-next-net/main, thanks.
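
For readers of the archive, a rough sketch of the ordering the commit message
describes, written with C11 atomics. The descriptor layout and function name
are invented for this example, and the C11 acquire fence only stands in for
the read barrier added by the patch; the actual driver change is not shown:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical completion descriptor: payload plus a generation bit. */
struct compl_desc {
	uint32_t data;
	_Atomic uint8_t generation;
};

/*
 * Read the payload only once the generation bit shows that ownership has
 * passed to the driver. Returns false if the descriptor is not ready yet.
 */
static bool
read_completion(struct compl_desc *desc, uint8_t expected_gen, uint32_t *out)
{
	uint8_t gen = atomic_load_explicit(&desc->generation,
			memory_order_relaxed);

	if (gen != expected_gen)
		return false;

	/* Keep the payload load from being ordered before the check above. */
	atomic_thread_fence(memory_order_acquire);

	*out = desc->data;
	return true;
}
```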

Re: [PATCH v3] app/proc-info: add rte_eal_cleanup() to avoid memory leak

2024-10-08 Thread fengchengwen

Acked-by: Chengwen Feng 

On 2024/10/4 10:48, Stephen Hemminger wrote:
> From: Fidaullah Noonari 
> 
> When the app is launched with -m, proc-info exits without calling
> rte_eal_cleanup(), causing a memory leak. This commit resolves the
> memory leak and closes the app properly.
> 
> Bugzilla id: 898
> Signed-off-by: Fidaullah Noonari 
> Acked-by: Stephen Hemminger 
> ---
> v3 - handle eventdev_xstats as well
>  rebase to 24.11

RE: rte_ring move head question for machines with relaxed MO (arm/ppc)

2024-10-08 Thread Wathsala Wathawana Vithanage

> 
> > > 1. rte_ring_generic_pvt.h:
> > > ==========================
> > >
> > > pseudo-c-code                        //related armv8 instructions
> > > ------------------------------------------------------------------
> > > head.load()                          //ldr [head]
> > > rte_smp_rmb()                        //dmb ishld
> > > opposite_tail.load()                 //ldr [opposite_tail]
> > > ...
> > > rte_atomic32_cmpset(head, ...)       //ldrex[head];... stlex[head]
> > >
> > >
> > > 2. rte_ring_c11_pvt.h
> > > =====================
> > >
> > > pseudo-c-code                        //related armv8 instructions
> > > ------------------------------------------------------------------
> > > head.atomic_load(relaxed)            //ldr[head]
> > > atomic_thread_fence(acquire)         //dmb ish
> > > opposite_tail.atomic_load(acquire)   //lda[opposite_tail]
> > > ...
> > > head.atomic_cas(..., relaxed)        //ldrex[head]; ... strex[head]
> > >
> > >
> > > 3. rte_ring_hts_elem_pvt.h
> > > ==========================
> > >
> > > pseudo-c-code                        //related armv8 instructions
> > > ------------------------------------------------------------------
> > > head.atomic_load(acquire)            //lda [head]
> > > opposite_tail.load()                 //ldr [opposite_tail]
> > > ...
> > > head.atomic_cas(..., acquire)        // ldaex[head]; ... strex[head]
> > >
> > > The questions that arose from these observations:
> > > a) are all 3 approaches equivalent in terms of functionality?
> > Different, lda (Load with acquire semantics) and ldr (load) are different.
> 
> I understand that, my question was:
> lda [head]; ldr [tail]
> vs
> ldr [head]; dmb ishld; ldr [tail];
> 
> Is there any difference in terms of functionality (memory ops
> ordering/observability)?
> 
> >
> > > b) if yes, is there any difference in terms of performance between:
> > >  "ldr; dmb; ldr;"   vs "lda; ldr;"
> > >   ?
> > dmb is a full barrier, performance is poor.
> > I would assume (haven't measured) ldr; dmb; ldr to be less performant
> > than lda;ldr;
> 
> Throughout this mail I am talking about 'dmb ishld', sorry for not being
> clear upfront.
> 
> >
> > > c) Comparing 1) and 2) above, the combination of
> > >ldr [head]; dmb; lda [opposite_tail]:
> > >looks like an overkill to me.  Wouldn't just:
> > >ldr [head]; dmb; ldr[opposite_tail];
> > >be sufficient here?
> > lda [opposite_tail]: synchronizes with stlr in tail update that happens
> > after array update.
> > So, it cannot be changed to ldr.
> 
> Can you explain a bit more why it is not possible?
> From here:
> https://developer.arm.com/documentation/dui0802/b/A32-and-T32-Instructions/LDA-and-STL
> "There is no requirement that a load-acquire and store-release be paired."
> Do I misinterpret this statement somehow?

There is no architectural requirement for them to be paired.
But C11 seems to have such a requirement, such that prod: lda[cons-tail]
synchronizes with cons: stl[cons-tail].
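
As a sketch of that pairing in plain C11 (a simplified, hypothetical ring with
only the fields needed here, not the rte_ring code), the producer's acquire
load of the consumer tail is the counterpart of the consumer's release store:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, stripped-down ring: just the two indexes discussed above. */
struct ring {
	_Atomic uint32_t prod_head;
	_Atomic uint32_t cons_tail;
	uint32_t capacity;
};

/* Producer side: try to reserve n slots; on success return the old head. */
static bool
prod_move_head(struct ring *r, uint32_t n, uint32_t *old_head)
{
	uint32_t head = atomic_load_explicit(&r->prod_head,
			memory_order_relaxed);

	for (;;) {
		/* Order the head load before the tail load below. */
		atomic_thread_fence(memory_order_acquire);

		/* Acquire: pairs with the release store in cons_update_tail(). */
		uint32_t tail = atomic_load_explicit(&r->cons_tail,
				memory_order_acquire);
		uint32_t free_slots = r->capacity + tail - head;

		if (n > free_slots)
			return false;

		/* On failure, head is reloaded with the current value. */
		if (atomic_compare_exchange_strong_explicit(&r->prod_head,
				&head, head + n,
				memory_order_relaxed, memory_order_relaxed)) {
			*old_head = head;
			return true;
		}
	}
}

/* Consumer side: publish that all slots before new_tail have been read. */
static void
cons_update_tail(struct ring *r, uint32_t new_tail)
{
	/* Release: the consumer's earlier slot reads happen before this. */
	atomic_store_explicit(&r->cons_tail, new_tail, memory_order_release);
}
```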

Re: [PATCH] Increasing ci meson version to .57

2024-10-08 Thread Aaron Conole

Patrick Robb  writes:

> Haha... I guess that serves as a lesson.
>
> Thanks Bruce.

Probably worth it to try that patch out and add your tested-by tag :)

Re: [PATCH v2 05/50] net/ntnic: extend and fix logging implementation

2024-10-08 Thread Ferruh Yigit

On 10/7/2024 8:33 PM, Serhii Iliushyk wrote:
> @@ -682,7 +682,7 @@ static void qsfp28_wait_for_ready_after_reset(nim_i2c_ctx_p ctx)
>   init_complete_flag_present = (data & (1 << 4)) != 0;
>   }
>  
> - NT_LOG(DBG, NTHW, "NIM InitCompleteFlagPresent = %d", init_complete_flag_present);
> + NT_LOG(DBG, NTHW, "NIM InitCompleteFlagPresent = %d\n", init_complete_flag_present);
>  

There is a commit that standardizes not having '\n' [1], please follow
the same.

[1]
https://git.dpdk.org/next/dpdk-next-net/commit/?id=f665790a5dbad7b645ff46f31d65e977324e7bfc

Just for a test, please ignore

2024-10-08 Thread zhoumin


Recheck-request: loongarch-compilation

--
Just for a test, please ignore.

Re: [PATCH v3 5/5] net/cxgbe: use rte macro instead of GCC attribute

2024-10-08 Thread Potnuri Bharat Teja

On Monday, October 10/07/24, 2024 at 13:18:30 -0700, Stephen Hemminger wrote:
> On Thu, 13 Jun 2024 16:05:10 +0200
> David Marchand  wrote:
> 
> > On Thu, Jun 13, 2024 at 3:44 PM David Marchand
> >  wrote:
> > >
> > > On Wed, Jun 12, 2024 at 10:16 AM David Marchand
> > >  wrote:  
> > > >
> > > > On Wed, Mar 6, 2024 at 11:14 PM Tyler Retzlaff
> > > >  wrote:  
> > > > >
> > > > > Use existing __rte_may_alias macro from rte_common.h instead of
> > > > > directly using __attribute__((__may_alias__)).
> > > > >
> > > > > Signed-off-by: Tyler Retzlaff 
> > > > > ---
> > > > >  drivers/net/cxgbe/base/common.h  | 2 +-
> > > > >  drivers/net/cxgbe/base/t4_hw.c   | 2 +-
> > > > >  drivers/net/cxgbe/base/t4vf_hw.c | 2 +-
> > > > >  3 files changed, 3 insertions(+), 3 deletions(-)  
> > > >
> > > > Adding cxgbe maintainer.
> > > >
> > > > This patch is touching base/ driver code.
> > > > Rahul, is this change ok for you?  
> > >
> > > I got a bounce on the previous mail.
> > > Trying again.  
> > 
> > So again, no luck.
> > I tried to contact some people at chelsio.
> > 
> > I'll keep this patch on hold for now but apply the rest of the series.
> > 
> > 
> 
> Cleaning up the outstanding patch list.
> 
> Could we get an ack from the new maintainer?

reviewed it here: 
https://lore.kernel.org/dpdk-dev/98cbd80474fa8b44bf855df32c47dc35e9f...@smartserver.smartshare.dk/T/#m61bf7e81f61ce7d713872184d033078f37f903bb
Reviewed-by: Potnuri Bharat Teja

RE: rte_ring move head question for machines with relaxed MO (arm/ppc)

2024-10-08 Thread Wathsala Wathawana Vithanage

> > > 1. rte_ring_generic_pvt.h:
> > > =
> > >
> > > pseudo-c-code                    //related armv8 instructions
> > > ------------------------------------------------------------
> > >  head.load()                     //ldr [head]
> > >  rte_smp_rmb()                   //dmb ishld
> > >  opposite_tail.load()            //ldr [opposite_tail]
> > >  ...
> > >  rte_atomic32_cmpset(head, ...)  //ldrex[head];... stlex[head]
> > >
> > >
> > > 2. rte_ring_c11_pvt.h
> > > =
> > >
> > > pseudo-c-code                        //related armv8 instructions
> > > ------------------------------------------------------------
> > > head.atomic_load(relaxed)            //ldr[head]
> > > atomic_thread_fence(acquire)         //dmb ish
> > > opposite_tail.atomic_load(acquire)   //lda[opposite_tail]
> > > ...
> > > head.atomic_cas(..., relaxed)        //ldrex[head]; ... strex[head]
> > >
> > >
> > > 3.   rte_ring_hts_elem_pvt.h
> > > ==
> > >
> > > pseudo-c-code                  //related armv8 instructions
> > > ------------------------------------------------------------
> > > head.atomic_load(acquire)      //lda [head]
> > > opposite_tail.load()           //ldr [opposite_tail]
> > > ...
> > > head.atomic_cas(..., acquire)  //ldaex[head]; ... strex[head]
> > >
> > > The questions that arose from these observations:
> > > a) are all 3 approaches equivalent in terms of functionality?
> > Different, lda (Load with acquire semantics) and ldr (load) are different.
> 
> I understand that, my question was:
> lda [head]; ldr [tail]
> vs
> ldr [head]; dmb ishld; ldr [tail];
> 
> Is there any difference in terms of functionality (memory ops
> ordering/observability)?
> 
> >
> > > b) if yes, is there any difference in terms of performance between:
> > >  "ldr; dmb; ldr;"   vs "lda; ldr;"
> > >   ?
> > dmb is a full barrier, performance is poor.
> > I would assume (haven't measured) ldr; dmb; ldr to be less performant
> > than lda;ldr;
> 
> Throughout this mail I am talking about 'dmb ishld'; sorry for not being clear
> upfront.
>
A: ldr; dmb ishld; ldr; -> the load before the dmb ishld should be observed by
the inner shareable domain before the second ldr executes.
(This also applies to stores that follow the dmb ishld in program order.)

B: lda; ldr; -> the second load cannot execute before the load-acquire.
(This also applies to stores that follow the lda in program order.)

In theory, I would assume them to be at least roughly equal in performance, if
B is not in fact more performant than A.
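
For reference, the two sequences compared above can be written in C11 roughly
as follows. This is only a sketch: struct demo_headtail and the demo_* function
names are hypothetical, and the exact instructions emitted for each variant
depend on the compiler and target flags, so the mapping to "ldr; dmb ishld;
ldr" and "lda; ldr" is an expectation rather than a guarantee.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pair of indices, standing in for head and opposite tail. */
struct demo_headtail {
	_Atomic uint32_t head;
	_Atomic uint32_t tail;
};

/* Variant A: relaxed load, acquire fence, relaxed load.
 * The fence orders the head load before the tail load.
 */
static uint32_t
demo_distance_fence(struct demo_headtail *ht)
{
	uint32_t head, tail;

	assert(ht != NULL);
	head = atomic_load_explicit(&ht->head, memory_order_relaxed);
	atomic_thread_fence(memory_order_acquire);
	tail = atomic_load_explicit(&ht->tail, memory_order_relaxed);
	return tail - head;
}

/* Variant B: acquire load of head, then a relaxed load of tail.
 * The acquire load prevents the later tail load from moving above it.
 */
static uint32_t
demo_distance_acquire(struct demo_headtail *ht)
{
	uint32_t head, tail;

	assert(ht != NULL);
	head = atomic_load_explicit(&ht->head, memory_order_acquire);
	tail = atomic_load_explicit(&ht->tail, memory_order_relaxed);
	return tail - head;
}
```

Both variants keep the tail load from being reordered before the head load;
whether one is faster than the other would need to be measured on the target.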


> >
> > > c) Comparing 1) and 2) above, the combination of
> > >ldr [head]; dmb; lda [opposite_tail]:
> > >looks like an overkill to me.  Wouldn't just:
> > >ldr [head]; dmb; ldr[opposite_tail];
> > >be sufficient here?
> > lda [opposite_tail]: synchronizes with stlr in tail update that happens
> > after array update.
> > So, it cannot be changed to ldr.
> 
> Can you explain to me a bit more why it is not possible?
> From here:
> https://developer.arm.com/documentation/dui0802/b/A32-and-T32-Instructions/LDA-and-STL
> "There is no requirement that a load-acquire and store-release be paired."
> Do I misinterpret this statement somehow?
> 
> > lda can be replaced with ldapr (LDA with release consistency -
> > processor consistency) which is more performant as ldapr is allowed to
> > rise above stlr. Can be done with -mcpu=+rcpc
> >
> > --wathsala
> >

[PATCH] net/mvneta: fix possible out-of-bounds write

2024-10-08 Thread Chengwen Feng

The mvneta_ifnames_get() function saves each 'iface' value into ifnames;
it writes out of bounds if too many iface pairs are passed (e.g.
'iface=xxx,iface=xxx,...').

Fixes: 4ccc8d770d3b ("net/mvneta: add PMD skeleton")
Cc: sta...@dpdk.org

Signed-off-by: Chengwen Feng 
Acked-by: Ferruh Yigit 
---
 drivers/net/mvneta/mvneta_ethdev.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/mvneta/mvneta_ethdev.c b/drivers/net/mvneta/mvneta_ethdev.c
index 3841c1ebe9..c49f083efa 100644
--- a/drivers/net/mvneta/mvneta_ethdev.c
+++ b/drivers/net/mvneta/mvneta_ethdev.c
@@ -91,6 +91,9 @@ mvneta_ifnames_get(const char *key __rte_unused, const char *value,
 {
struct mvneta_ifnames *ifnames = extra_args;
 
+   if (ifnames->idx >= NETA_NUM_ETH_PPIO)
+   return -EINVAL;
+
ifnames->names[ifnames->idx++] = value;
 
return 0;
-- 
2.17.1
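
The bounds check added by this patch can be exercised in isolation with a small
stand-alone model. The sketch below is not the driver code: DEMO_MAX_IFACES is
a hypothetical stand-in for NETA_NUM_ETH_PPIO (its value here is arbitrary),
and struct demo_ifnames and demo_ifnames_get() only mirror the shape of the
handler shown in the diff.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical limit; the real driver uses NETA_NUM_ETH_PPIO. */
#define DEMO_MAX_IFACES 3

struct demo_ifnames {
	const char *names[DEMO_MAX_IFACES];
	int idx;
};

/* kvargs-style handler: store one more interface name, or fail with
 * -EINVAL once the array is full instead of writing past its end.
 */
static int
demo_ifnames_get(const char *key, const char *value, void *extra_args)
{
	struct demo_ifnames *ifnames = extra_args;

	(void)key;
	assert(ifnames != NULL);

	if (ifnames->idx >= DEMO_MAX_IFACES)
		return -EINVAL;

	ifnames->names[ifnames->idx++] = value;

	return 0;
}
```

Without the check, a fourth 'iface=...' pair in this model would write to
names[3], one element past the end of the array.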

Re: [PATCH v2 00/50] Provide: flow filter init API, Enable virtual queues, fix ntnic issues for release 24.07

2024-10-08 Thread Ferruh Yigit

On 10/7/2024 8:33 PM, Serhii Iliushyk wrote:
> The list of updates provided by the patchset:
>   * Update the supported version of the FPGA to 9563.55.49
>   * Fix Coverity issues
>   * Fix issues related to release 24.07
>   * Extended and fixed the implementation of the logging
>   * Added NT flow filter init API
>   * Added NT flow backend initialization API
>   * Added initialization of FPGA modules related to flow HW offload
>   * Added basic handling of the virtual queues
>   * Update documentation
> 
> Danylo Vodopianov (15):
>   net/ntnic: fix coverity issues:
>   net/ntnic: extend and fix logging implementation
>   net/ntnic: add basic queue operations
>   net/ntnic: enhance Ethernet device configuration
>   net/ntnic: add scatter-gather HW deallocation
>   net/ntnic: add queue setup operations
>   net/ntnic: add packet handler for virtio queues
>   net/ntnic: add init for virt queues in the DBS
>   net/ntnic: add split-queue support
>   net/ntnic: add functions for availability monitor management
>   net/ntnic: used writer data handling functions
>   net/ntnic: add descriptor reader data handling functions
>   net/ntnic: virtqueue setup managed packed-ring was added
>   net/ntnic: add functions for releasing virt queues
>   net/ntnic: add functions for retrieving and managing packets
> 
> Oleksandr Kolomeiets (33):
>   net/ntnic: update NT NiC PMD driver with FPGA version
>   net/ntnic: update documentation
>   net/ntnic: remove extra calling of the API for release port
>   net/ntnic: add flow filter init API
>   net/ntnic: add flow filter deinitialization API
>   net/ntnic: add flow backend initialization API
>   net/ntnic: add flow backend deinitialization API
>   net/ntnic: add INFO flow module
>   net/ntnic: add categorizer (CAT) flow module
>   net/ntnic: add key match (KM) flow module
>   net/ntnic: add flow matcher (FLM) flow module
>   net/ntnic: add IP fragmenter (IFR) flow module
>   net/ntnic: add hasher (HSH) flow module
>   net/ntnic: add queue select (QSL) flow module
>   net/ntnic: add slicer (SLC LR) flow module
>   net/ntnic: add packet descriptor builder (PDB) flow module
>   net/ntnic: add header field update (HFU) flow module
>   net/ntnic: add RPP local retransmit (RPP LR) flow module
>   net/ntnic: add copier (Tx CPY) flow module
>   net/ntnic: add checksum update (CSU) flow module
>   net/ntnic: add insert (Tx INS) flow module
>   net/ntnic: add replacer (Tx RPL) flow module
>   net/ntnic: add base init and deinit of the NT flow API
>   net/ntnic: add base init and deinit the NT flow backend
>   net/ntnic: add categorizer (CAT) FPGA module
>   net/ntnic: add key match (KM) FPGA module
>   net/ntnic: add flow matcher (FLM) FPGA module
>   net/ntnic: add hasher (HSH) FPGA module
>   net/ntnic: add queue select (QSL) FPGA module
>   net/ntnic: add slicer (SLC LR) FPGA module
>   net/ntnic: add packet descriptor builder (PDB) FPGA module
>   net/ntnic: add Tx Packet Editor (TPE) FPGA module
>   net/ntnic: add receive MAC converter (RMC) core module
> 
> Serhii Iliushyk (2):
>   net/ntnic: add Tx Packet Editor (TPE) flow module
>   net/ntnic: update FPGA registeris related to DBS
>

Hi Serhii,

What is the status of the driver after these patches? Does Rx/Tx work?

[PATCH v6 1/4] kvargs: add one new process API

2024-10-08 Thread Chengwen Feng

rte_kvargs_process() is used to handle key=value arguments (e.g.
socket_id=0); it also supports only-key arguments (e.g. socket_id).
But many drivers' callbacks can only handle key=value and will
segfault when given an only-key argument, so the patchset [1] was
introduced.

Because the patchset [1] modified too many drivers, this patch instead:
1) Introduces a new API, rte_kvargs_process_opt(), which inherits the
behavior of rte_kvargs_process() and handles both key=value and
only-key cases.
2) Constrains rte_kvargs_process() to handle only key=value cases;
it returns -1 when it meets an only-key case (that is, the matched key's
value is NULL).

This patch also makes sure that rte_kvargs_process_opt() and
rte_kvargs_process() both return -1 when the kvlist parameter is
NULL.

[1] 
https://patches.dpdk.org/project/dpdk/patch/20230320092110.37295-1-fengcheng...@huawei.com/

Signed-off-by: Chengwen Feng 
---
 doc/guides/rel_notes/release_24_11.rst | 13 
 lib/kvargs/rte_kvargs.c| 43 --
 lib/kvargs/rte_kvargs.h| 39 +--
 lib/kvargs/version.map |  7 +
 4 files changed, 90 insertions(+), 12 deletions(-)

diff --git a/doc/guides/rel_notes/release_24_11.rst b/doc/guides/rel_notes/release_24_11.rst
index e0a9aa55a1..873f0639dc 100644
--- a/doc/guides/rel_notes/release_24_11.rst
+++ b/doc/guides/rel_notes/release_24_11.rst
@@ -96,6 +96,19 @@ API Changes
Also, make sure to start the actual text at the margin.
===
 
+* **Updated kvargs process API.**
+
+  * Introduced rte_kvargs_process_opt() API, which inherits the function
+of rte_kvargs_process() and could handle both key=value and only-key
+cases.
+
+  * Constraint rte_kvargs_process() API can only handle key=value cases,
+it will return -1 when handle only-key case (that is the matched key's
+value is NULL).
+
+  * Make sure rte_kvargs_process_opt() and rte_kvargs_process() API both
+return -1 when the kvlist parameter is NULL.
+
 
 ABI Changes
 ---
diff --git a/lib/kvargs/rte_kvargs.c b/lib/kvargs/rte_kvargs.c
index c77bb82feb..b02f22f5a2 100644
--- a/lib/kvargs/rte_kvargs.c
+++ b/lib/kvargs/rte_kvargs.c
@@ -167,31 +167,56 @@ rte_kvargs_count(const struct rte_kvargs *kvlist, const char *key_match)
return ret;
 }
 
-/*
- * For each matching key, call the given handler function.
- */
-int
-rte_kvargs_process(const struct rte_kvargs *kvlist,
-   const char *key_match,
-   arg_handler_t handler,
-   void *opaque_arg)
+static int
+kvargs_process_common(const struct rte_kvargs *kvlist,
+ const char *key_match,
+ arg_handler_t handler,
+ void *opaque_arg,
+ bool support_only_key)
 {
const struct rte_kvargs_pair *pair;
unsigned i;
 
if (kvlist == NULL)
-   return 0;
+   return -1;
 
for (i = 0; i < kvlist->count; i++) {
pair = &kvlist->pairs[i];
if (key_match == NULL || strcmp(pair->key, key_match) == 0) {
+   if (!support_only_key && pair->value == NULL)
+   return -1;
if ((*handler)(pair->key, pair->value, opaque_arg) < 0)
return -1;
}
}
+
return 0;
 }
 
+/*
+ * For each matching key in key=value, call the given handler function.
+ */
+int
+rte_kvargs_process(const struct rte_kvargs *kvlist,
+  const char *key_match,
+  arg_handler_t handler,
+  void *opaque_arg)
+{
+   return kvargs_process_common(kvlist, key_match, handler, opaque_arg, false);
+}
+
+/*
+ * For each matching key in key=value or only-key, call the given handler function.
+ */
+int
+rte_kvargs_process_opt(const struct rte_kvargs *kvlist,
+  const char *key_match,
+  arg_handler_t handler,
+  void *opaque_arg)
+{
+   return kvargs_process_common(kvlist, key_match, handler, opaque_arg, true);
+}
+
 /* free the rte_kvargs structure */
 void
 rte_kvargs_free(struct rte_kvargs *kvlist)
diff --git a/lib/kvargs/rte_kvargs.h b/lib/kvargs/rte_kvargs.h
index b0d1301c61..b37cd4902f 100644
--- a/lib/kvargs/rte_kvargs.h
+++ b/lib/kvargs/rte_kvargs.h
@@ -6,6 +6,8 @@
 #ifndef _RTE_KVARGS_H_
 #define _RTE_KVARGS_H_
 
+#include 
+
 /**
  * @file
  * RTE Argument parsing
@@ -166,14 +168,17 @@ const char *rte_kvargs_get_with_value(const struct rte_kvargs *kvlist,
  const char *key, const char *value);
 
 /**
- * Call a handler function for each key/value matching the key
+ * Call a handler function for each key=value matching the key
  *
- * For each key/value association that matches the given key, calls the
+ * For each key=value association tha

[PATCH v6 4/4] common/nfp: use new API to parse kvargs

2024-10-08 Thread Chengwen Feng

The nfp_parse_class_options() function could handle both key=value and
only-key, so it should use rte_kvargs_process_opt() instead of
rte_kvargs_process() to parse.

Signed-off-by: Chengwen Feng 
---
 drivers/common/nfp/nfp_common_pci.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/common/nfp/nfp_common_pci.c b/drivers/common/nfp/nfp_common_pci.c
index 723035d0f7..5c36052f9d 100644
--- a/drivers/common/nfp/nfp_common_pci.c
+++ b/drivers/common/nfp/nfp_common_pci.c
@@ -170,10 +170,8 @@ nfp_parse_class_options(const struct rte_devargs *devargs)
if (kvargs == NULL)
return dev_class;
 
-   if (rte_kvargs_count(kvargs, RTE_DEVARGS_KEY_CLASS) != 0) {
-   rte_kvargs_process(kvargs, RTE_DEVARGS_KEY_CLASS,
-   nfp_kvarg_dev_class_handler, &dev_class);
-   }
+   rte_kvargs_process_opt(kvargs, RTE_DEVARGS_KEY_CLASS,
+  nfp_kvarg_dev_class_handler, &dev_class);
 
rte_kvargs_free(kvargs);
 
-- 
2.17.1

[PATCH v6 2/4] net/sfc: use new API to parse kvargs

2024-10-08 Thread Chengwen Feng

Add the sfc_kvargs_process_opt() function to handle the only-key case, and
remove the redundant NULL checks on value because rte_kvargs_process()
(which is invoked in sfc_kvargs_process()) now handles that case.

Signed-off-by: Chengwen Feng 
---
 drivers/common/sfc_efx/sfc_efx.c |  3 ---
 drivers/net/sfc/sfc_ethdev.c | 12 ++--
 drivers/net/sfc/sfc_kvargs.c | 12 +++-
 drivers/net/sfc/sfc_kvargs.h |  2 ++
 4 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/drivers/common/sfc_efx/sfc_efx.c b/drivers/common/sfc_efx/sfc_efx.c
index 5eeffb065b..458efacba5 100644
--- a/drivers/common/sfc_efx/sfc_efx.c
+++ b/drivers/common/sfc_efx/sfc_efx.c
@@ -23,9 +23,6 @@ sfc_efx_kvarg_dev_class_handler(__rte_unused const char *key,
 {
enum sfc_efx_dev_class *dev_class = opaque;
 
-   if (class_str == NULL)
-   return *dev_class;
-
if (strcmp(class_str, "vdpa") == 0) {
*dev_class = SFC_EFX_DEV_CLASS_VDPA;
} else if (strcmp(class_str, "net") == 0) {
diff --git a/drivers/net/sfc/sfc_ethdev.c b/drivers/net/sfc/sfc_ethdev.c
index 3480a51642..89444f0b4a 100644
--- a/drivers/net/sfc/sfc_ethdev.c
+++ b/drivers/net/sfc/sfc_ethdev.c
@@ -2835,8 +2835,8 @@ sfc_eth_dev_set_ops(struct rte_eth_dev *dev)
if (encp->enc_rx_es_super_buffer_supported)
avail_caps |= SFC_DP_HW_FW_CAP_RX_ES_SUPER_BUFFER;
 
-   rc = sfc_kvargs_process(sa, SFC_KVARG_RX_DATAPATH,
-   sfc_kvarg_string_handler, &rx_name);
+   rc = sfc_kvargs_process_opt(sa, SFC_KVARG_RX_DATAPATH,
+   sfc_kvarg_string_handler, &rx_name);
if (rc != 0)
goto fail_kvarg_rx_datapath;
 
@@ -2878,8 +2878,8 @@ sfc_eth_dev_set_ops(struct rte_eth_dev *dev)
 
sfc_notice(sa, "use %s Rx datapath", sas->dp_rx_name);
 
-   rc = sfc_kvargs_process(sa, SFC_KVARG_TX_DATAPATH,
-   sfc_kvarg_string_handler, &tx_name);
+   rc = sfc_kvargs_process_opt(sa, SFC_KVARG_TX_DATAPATH,
+   sfc_kvarg_string_handler, &tx_name);
if (rc != 0)
goto fail_kvarg_tx_datapath;
 
@@ -3073,8 +3073,8 @@ sfc_parse_switch_mode(struct sfc_adapter *sa, bool has_representors)
 
sfc_log_init(sa, "entry");
 
-   rc = sfc_kvargs_process(sa, SFC_KVARG_SWITCH_MODE,
-   sfc_kvarg_string_handler, &switch_mode);
+   rc = sfc_kvargs_process_opt(sa, SFC_KVARG_SWITCH_MODE,
+   sfc_kvarg_string_handler, &switch_mode);
if (rc != 0)
goto fail_kvargs;
 
diff --git a/drivers/net/sfc/sfc_kvargs.c b/drivers/net/sfc/sfc_kvargs.c
index 783cb43ae6..eb36fa98ca 100644
--- a/drivers/net/sfc/sfc_kvargs.c
+++ b/drivers/net/sfc/sfc_kvargs.c
@@ -73,6 +73,16 @@ sfc_kvargs_process(struct sfc_adapter *sa, const char *key_match,
return -rte_kvargs_process(sa->kvargs, key_match, handler, opaque_arg);
 }
 
+int
+sfc_kvargs_process_opt(struct sfc_adapter *sa, const char *key_match,
+  arg_handler_t handler, void *opaque_arg)
+{
+   if (sa->kvargs == NULL)
+   return 0;
+
+   return -rte_kvargs_process_opt(sa->kvargs, key_match, handler, opaque_arg);
+}
+
 int
 sfc_kvarg_bool_handler(__rte_unused const char *key,
   const char *value_str, void *opaque)
@@ -104,7 +114,7 @@ sfc_kvarg_long_handler(__rte_unused const char *key,
long value;
char *endptr;
 
-   if (!value_str || !opaque)
+   if (!opaque)
return -EINVAL;
 
value = strtol(value_str, &endptr, 0);
diff --git a/drivers/net/sfc/sfc_kvargs.h b/drivers/net/sfc/sfc_kvargs.h
index 2226f2b3d9..4dcc61e973 100644
--- a/drivers/net/sfc/sfc_kvargs.h
+++ b/drivers/net/sfc/sfc_kvargs.h
@@ -83,6 +83,8 @@ void sfc_kvargs_cleanup(struct sfc_adapter *sa);
 
 int sfc_kvargs_process(struct sfc_adapter *sa, const char *key_match,
   arg_handler_t handler, void *opaque_arg);
+int sfc_kvargs_process_opt(struct sfc_adapter *sa, const char *key_match,
+  arg_handler_t handler, void *opaque_arg);
 
 int sfc_kvarg_bool_handler(const char *key, const char *value_str,
   void *opaque);
-- 
2.17.1

[PATCH v6 0/4] fix segment fault when parse args

2024-10-08 Thread Chengwen Feng

rte_kvargs_process() is used to parse key-value arguments (e.g. socket_id=0);
it also supports parsing only-key arguments (e.g. socket_id). But many drivers'
callbacks can only handle key-value and will segfault when given an
only-key argument, so the patchset [1] was introduced.

Because the patchset [1] modified too many drivers, this series instead:
1) Introduces a new API, rte_kvargs_process_opt(), which inherits the
behavior of rte_kvargs_process() and can parse both key-value and
only-key.
2) Constrains rte_kvargs_process() to parse only key-value.

[1] 
https://patches.dpdk.org/project/dpdk/patch/20230320092110.37295-1-fengcheng...@huawei.com/

Chengwen Feng (4):
  kvargs: add one new process API
  net/sfc: use new API to parse kvargs
  net/tap: use new API to parse kvargs
  common/nfp: use new API to parse kvargs

---
v6: rebase to 24.11, refine the net/sfc modification, and make mvneta an
independent commit, which addresses Stephen's comments.
v5: remove the redundant rte_kvargs_count() call in the 4/5 commit, which
addresses Stephen's comment.
v4: refine the API's definition and implementation, which addresses Ferruh's
comments; add the common/nfp change commit.
v3: introduce a new API instead of modifying too many drivers, which addresses
Ferruh's comments.

 doc/guides/rel_notes/release_24_11.rst | 13 
 drivers/common/nfp/nfp_common_pci.c|  6 ++--
 drivers/common/sfc_efx/sfc_efx.c   |  3 --
 drivers/net/sfc/sfc_ethdev.c   | 12 +++
 drivers/net/sfc/sfc_kvargs.c   | 12 ++-
 drivers/net/sfc/sfc_kvargs.h   |  2 ++
 drivers/net/tap/rte_eth_tap.c  | 10 +++---
 lib/kvargs/rte_kvargs.c| 43 --
 lib/kvargs/rte_kvargs.h| 39 +--
 lib/kvargs/version.map |  7 +
 10 files changed, 116 insertions(+), 31 deletions(-)

-- 
2.17.1
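
The difference between the two entry points described in this cover letter can
be shown with a stand-alone model. The sketch below does not use the DPDK
headers: struct demo_pair, demo_handler_t and the demo_* functions are
hypothetical names that only mirror the logic of kvargs_process_common() from
patch 1/4, so that the return values for key=value and only-key input can be
checked without building DPDK.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for struct rte_kvargs_pair and arg_handler_t. */
struct demo_pair {
	const char *key;
	const char *value; /* NULL for an only-key argument */
};

typedef int (*demo_handler_t)(const char *key, const char *value, void *opaque);

static int
demo_process_common(const struct demo_pair *pairs, size_t count,
		    const char *key_match, demo_handler_t handler,
		    void *opaque, bool support_only_key)
{
	size_t i;

	assert(handler != NULL);
	if (pairs == NULL)
		return -1;

	for (i = 0; i < count; i++) {
		if (key_match == NULL || strcmp(pairs[i].key, key_match) == 0) {
			/* Strict mode rejects only-key before calling the handler. */
			if (!support_only_key && pairs[i].value == NULL)
				return -1;
			if (handler(pairs[i].key, pairs[i].value, opaque) < 0)
				return -1;
		}
	}

	return 0;
}

/* Models rte_kvargs_process(): key=value only. */
static int
demo_process(const struct demo_pair *pairs, size_t count,
	     const char *key_match, demo_handler_t handler, void *opaque)
{
	return demo_process_common(pairs, count, key_match, handler, opaque, false);
}

/* Models rte_kvargs_process_opt(): key=value or only-key. */
static int
demo_process_opt(const struct demo_pair *pairs, size_t count,
		 const char *key_match, demo_handler_t handler, void *opaque)
{
	return demo_process_common(pairs, count, key_match, handler, opaque, true);
}

/* Example handler that tolerates a NULL value and counts its invocations. */
static int
demo_count_handler(const char *key, const char *value, void *opaque)
{
	(void)key;
	(void)value;
	++*(int *)opaque;
	return 0;
}
```

In this model a handler registered through demo_process() never sees a NULL
value, which is the property that lets drivers drop their own NULL checks.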

[PATCH v6 3/4] net/tap: use new API to parse kvargs

2024-10-08 Thread Chengwen Feng

Some kvargs can be either key=value or only-key, so they should be
handled with rte_kvargs_process_opt() instead of
rte_kvargs_process().

Signed-off-by: Chengwen Feng 
---
 drivers/net/tap/rte_eth_tap.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
index c5af5751f6..5ad3bbadd1 100644
--- a/drivers/net/tap/rte_eth_tap.c
+++ b/drivers/net/tap/rte_eth_tap.c
@@ -2291,7 +2291,7 @@ rte_pmd_tun_probe(struct rte_vdev_device *dev)
kvlist = rte_kvargs_parse(params, valid_arguments);
if (kvlist) {
if (rte_kvargs_count(kvlist, ETH_TAP_IFACE_ARG) == 1) {
-   ret = rte_kvargs_process(kvlist,
+   ret = rte_kvargs_process_opt(kvlist,
ETH_TAP_IFACE_ARG,
&set_interface_name,
tun_name);
@@ -2487,10 +2487,10 @@ rte_pmd_tap_probe(struct rte_vdev_device *dev)
kvlist = rte_kvargs_parse(params, valid_arguments);
if (kvlist) {
if (rte_kvargs_count(kvlist, ETH_TAP_IFACE_ARG) == 1) {
-   ret = rte_kvargs_process(kvlist,
-ETH_TAP_IFACE_ARG,
-&set_interface_name,
-tap_name);
+   ret = rte_kvargs_process_opt(kvlist,
+ETH_TAP_IFACE_ARG,
+                &set_interface_name,
+tap_name);
if (ret == -1)
goto leave;
}
-- 
2.17.1

Re: [PATCH v5 0/5] fix segment fault when parse args

2024-10-08 Thread fengchengwen

Hi Stephen,

On 2024/10/5 9:19, Stephen Hemminger wrote:
> On Mon, 6 Nov 2023 07:31:19 +
> Chengwen Feng  wrote:
> 
>> The rte_kvargs_process() was used to parse key-value (e.g. socket_id=0),
>> it also supports to parse only-key (e.g. socket_id). But many drivers's
>> callback can only handle key-value, it will segment fault if handles
>> only-key. so the patchset [1] was introduced.
>>
>> Because the patchset [1] modified too much drivers, therefore:
>> 1) A new API rte_kvargs_process_opt() was introduced, it inherits the
>> function of rte_kvargs_process() which could parse both key-value and
>> only-key.
>> 2) Constraint the rte_kvargs_process() can only parse key-value.
>>
>> This patchset also include one bugfix for kvargs of mvneta driver.
>>
>> [1] 
>> https://patches.dpdk.org/project/dpdk/patch/20230320092110.37295-1-fengcheng...@huawei.com/
>>
>> Chengwen Feng (5):
>>   kvargs: add one new process API
>>   net/sfc: use new API to parse kvargs
>>   net/tap: use new API to parse kvargs
>>   common/nfp: use new API to parse kvargs
>>   net/mvneta: fix possible out-of-bounds write
> 
> Not sure why the patchset never got more attention.
> Yes it is a real bug, and this looks like a reasonable way to address it.
> 
> It does need to be rebased to current 24.11 tree to have a chance,
> and would be good to add more documentation to the API and remove
> cases in drivers that have unnecessary NULL checks after this.
> But those changes can be follow ups.
> 
> Also the mvneta patch probably should be sent as separate it does
> not depend on anything here.

Done

> 
> Bottom line resubmit it, and I will ack the new version.

I have already sent v6; please review it.

Thanks

[v2] crypto/dpaa2_sec: rework debug code

2024-10-08 Thread Gagandeep Singh

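Rework the debug output: replace the per-function DPAA2_SEC_DP_DEBUG()
calls with a common dpaa2_sec_dp_fd_dump() helper that prints the frame
descriptor details according to the direction (Tx or Rx) and the buffer
pool type (SW or BMAN).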

Signed-off-by: Jun Yang 
Signed-off-by: Gagandeep Singh 
---
 drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c | 72 +
 1 file changed, 46 insertions(+), 26 deletions(-)

diff --git a/drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c b/drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c
index 2cdf9308f8..0c96ca0023 100644
--- a/drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c
+++ b/drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c
@@ -65,6 +65,47 @@ enum dpaa2_sec_dump_levels {
 uint8_t cryptodev_driver_id;
 uint8_t dpaa2_sec_dp_dump = DPAA2_SEC_DP_ERR_DUMP;
 
+static inline void
+dpaa2_sec_dp_fd_dump(const struct qbman_fd *fd, uint16_t bpid,
+struct rte_mbuf *mbuf, bool tx)
+{
+#if (RTE_LOG_DEBUG <= RTE_LOG_DP_LEVEL)
+   char debug_str[1024];
+   int offset;
+
+   if (tx) {
+   offset = sprintf(debug_str,
+   "CIPHER SG: fdaddr =%" PRIx64 ", from %s pool ",
+   DPAA2_GET_FD_ADDR(fd),
+   bpid < MAX_BPID ? "SW" : "BMAN");
+   if (bpid < MAX_BPID) {
+   offset += sprintf(&debug_str[offset],
+   "bpid = %d ", bpid);
+   }
+   } else {
+   offset = sprintf(debug_str, "Mbuf %p from %s pool ",
+mbuf, DPAA2_GET_FD_IVP(fd) ? "SW" : "BMAN");
+   if (!DPAA2_GET_FD_IVP(fd)) {
+   offset += sprintf(&debug_str[offset], "bpid = %d ",
+ DPAA2_GET_FD_BPID(fd));
+   }
+   }
+   offset += sprintf(&debug_str[offset],
+   "private size = %d ",
+   mbuf->pool->private_data_size);
+   offset += sprintf(&debug_str[offset],
+   "addr %p, fdaddr =%" PRIx64 ", off =%d, len =%d",
+   mbuf->buf_addr, DPAA2_GET_FD_ADDR(fd),
+   DPAA2_GET_FD_OFFSET(fd), DPAA2_GET_FD_LEN(fd));
+   DPAA2_SEC_DP_DEBUG("%s", debug_str);
+#else
+   RTE_SET_USED(bpid);
+   RTE_SET_USED(tx);
+   RTE_SET_USED(mbuf);
+   RTE_SET_USED(fd);
+#endif
+}
+
 static inline void
 free_fle(const struct qbman_fd *fd, struct dpaa2_sec_qp *qp)
 {
@@ -1097,7 +1138,7 @@ build_auth_fd(dpaa2_sec_session *sess, struct rte_crypto_op *op,
 
 static int
 build_cipher_sg_fd(dpaa2_sec_session *sess, struct rte_crypto_op *op,
-   struct qbman_fd *fd, __rte_unused uint16_t bpid)
+   struct qbman_fd *fd, uint16_t bpid)
 {
struct rte_crypto_sym_op *sym_op = op->sym;
struct qbman_fle *ip_fle, *op_fle, *sge, *fle;
@@ -1212,14 +1253,8 @@ build_cipher_sg_fd(dpaa2_sec_session *sess, struct rte_crypto_op *op,
DPAA2_SET_FD_COMPOUND_FMT(fd);
DPAA2_SET_FD_FLC(fd, DPAA2_VADDR_TO_IOVA(flc));
 
-   DPAA2_SEC_DP_DEBUG(
-   "CIPHER SG: fdaddr =%" PRIx64 " bpid =%d meta =%d"
-   " off =%d, len =%d",
-   DPAA2_GET_FD_ADDR(fd),
-   DPAA2_GET_FD_BPID(fd),
-   rte_dpaa2_bpid_info[bpid].meta_data_size,
-   DPAA2_GET_FD_OFFSET(fd),
-   DPAA2_GET_FD_LEN(fd));
+   dpaa2_sec_dp_fd_dump(fd, bpid, mbuf, true);
+
return 0;
 }
 
@@ -1326,14 +1361,7 @@ build_cipher_fd(dpaa2_sec_session *sess, struct rte_crypto_op *op,
DPAA2_SET_FLE_FIN(sge);
DPAA2_SET_FLE_FIN(fle);
 
-   DPAA2_SEC_DP_DEBUG(
-   "CIPHER: fdaddr =%" PRIx64 " bpid =%d meta =%d"
-   " off =%d, len =%d",
-   DPAA2_GET_FD_ADDR(fd),
-   DPAA2_GET_FD_BPID(fd),
-   rte_dpaa2_bpid_info[bpid].meta_data_size,
-   DPAA2_GET_FD_OFFSET(fd),
-   DPAA2_GET_FD_LEN(fd));
+   dpaa2_sec_dp_fd_dump(fd, bpid, dst, true);
 
return 0;
 }
@@ -1604,15 +1632,7 @@ sec_fd_to_mbuf(const struct qbman_fd *fd, struct dpaa2_sec_qp *qp)
dst->data_len = len;
}
 
-   DPAA2_SEC_DP_DEBUG("mbuf %p BMAN buf addr %p,"
-   " fdaddr =%" PRIx64 " bpid =%d meta =%d off =%d, len =%d",
-   (void *)dst,
-   dst->buf_addr,
-   DPAA2_GET_FD_ADDR(fd),
-   DPAA2_GET_FD_BPID(fd),
-   rte_dpaa2_bpid_info[DPAA2_GET_FD_BPID(fd)].meta_data_size,
-   DPAA2_GET_FD_OFFSET(fd),
-   DPAA2_GET_FD_LEN(fd));
+   dpaa2_sec_dp_fd_dump(fd, 0, dst, false);
 
/* free the fle memory */
if (likely(rte_pktmbuf_is_contiguous(src))) {
-- 
2.25.1
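
The helper in this patch builds its message by appending to a fixed 1024-byte
buffer with sprintf() and a running offset. For readers who want the same
pattern with explicit bounds, the sketch below shows one possible variant. It
is not part of the patch: demo_append() is a hypothetical helper built on the
standard vsnprintf(), and it simply stops appending once the buffer is full.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Append formatted text at buf[offset] without writing past buf[size - 1].
 * Returns the new offset, i.e. the length of the string now held in buf.
 */
static size_t
demo_append(char *buf, size_t size, size_t offset, const char *fmt, ...)
{
	va_list ap;
	int n;

	assert(buf != NULL && size > 0);

	if (offset >= size - 1)
		return offset; /* no room left, keep what is already there */

	va_start(ap, fmt);
	n = vsnprintf(buf + offset, size - offset, fmt, ap);
	va_end(ap);

	if (n < 0)
		return offset; /* formatting error, report nothing appended */
	if ((size_t)n >= size - offset)
		return size - 1; /* output was truncated to fit the buffer */

	return offset + (size_t)n;
}
```

A caller would chain it as offset = demo_append(buf, sizeof(buf), offset, ...),
the same way the patch chains offset += sprintf(&debug_str[offset], ...).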

Re: [PATCH v9 05/17] examples/vhost: fix free function mismatch

2024-10-08 Thread Chenbo Xia



> On Oct 9, 2024, at 00:47, Stephen Hemminger wrote:
> 
> The pointer bdev is allocated with rte_zmalloc() and then
> incorrectly freed with free(), which will lead to pool corruption.
> 
> Bugzilla ID: 1553
> Fixes: c19beb3f38cd ("examples/vhost_blk: introduce vhost storage sample")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Stephen Hemminger 
> Acked-by: Chengwen Feng 
> ---
> examples/vhost_blk/vhost_blk.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/examples/vhost_blk/vhost_blk.c b/examples/vhost_blk/vhost_blk.c
> index 03f1ac9c3f..9c9e326949 100644
> --- a/examples/vhost_blk/vhost_blk.c
> +++ b/examples/vhost_blk/vhost_blk.c
> @@ -776,7 +776,7 @@ vhost_blk_bdev_construct(const char *bdev_name,
>bdev->data = rte_zmalloc(NULL, blk_cnt * blk_size, 0);
>if (!bdev->data) {
>fprintf(stderr, "No enough reserved huge memory for disk\n");
> -   free(bdev);
> +   rte_free(bdev);
>return NULL;
>}
> 
> --
> 2.45.2
> 

Reviewed-by: Chenbo Xia

[PATCH v3 1/5] power: refactor core power management library

2024-10-08 Thread Sivaprasad Tummala

This patch introduces a comprehensive refactor to the core power
management library. The primary focus is on improving modularity
and organization by relocating specific driver implementations
from the 'lib/power' directory to dedicated directories within
'drivers/power/core/*'. The adjustment of meson.build files
enables the selective activation of individual drivers.

These changes contribute to a significant enhancement in code
organization, providing a clearer structure for driver implementations.
The refactor aims to improve overall code clarity and boost
maintainability. Additionally, it establishes a foundation for
future development, allowing for more focused work on individual
drivers and seamless integration of forthcoming enhancements.
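
As a rough illustration of the kind of structure such a split allows, the
sketch below shows a generic "ops table" pattern in which each driver
registers a named set of callbacks and the library looks one up by name. It
is only an assumption about the general shape: struct demo_power_ops,
demo_register_ops() and demo_find_ops() are hypothetical names, not the
interface introduced by this series.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-driver callback table. */
struct demo_power_ops {
	const char *name;
	int (*init)(unsigned int lcore_id);
	int (*exit)(unsigned int lcore_id);
};

#define DEMO_MAX_OPS 8

static const struct demo_power_ops *demo_ops_table[DEMO_MAX_OPS];
static size_t demo_ops_count;

/* Called once per driver; rejects incomplete tables and overflow. */
static int
demo_register_ops(const struct demo_power_ops *ops)
{
	if (ops == NULL || ops->name == NULL ||
	    ops->init == NULL || ops->exit == NULL)
		return -1;
	if (demo_ops_count >= DEMO_MAX_OPS)
		return -1;

	demo_ops_table[demo_ops_count++] = ops;
	return 0;
}

/* Library side: find a registered driver by name, or NULL if absent. */
static const struct demo_power_ops *
demo_find_ops(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < demo_ops_count; i++) {
		if (strcmp(demo_ops_table[i]->name, name) == 0)
			return demo_ops_table[i];
	}
	return NULL;
}

/* Example driver with trivial callbacks. */
static int demo_acpi_init(unsigned int lcore_id) { (void)lcore_id; return 0; }
static int demo_acpi_exit(unsigned int lcore_id) { (void)lcore_id; return 0; }

static const struct demo_power_ops demo_acpi_ops = {
	.name = "acpi",
	.init = demo_acpi_init,
	.exit = demo_acpi_exit,
};
```

With this shape the library only depends on the table type, and a driver that
is disabled at build time simply never registers itself.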

v3:
 - renamed rte_power_core_ops.h as rte_power_cpufreq_api.h
 - reworked the auto detection logic

v2:
 - added NULL check for global_core_ops in rte_power_get_core_ops

Signed-off-by: Sivaprasad Tummala 
---
 drivers/meson.build   |   1 +
 .../power/acpi/acpi_cpufreq.c |  22 +-
 .../power/acpi/acpi_cpufreq.h |   6 +-
 drivers/power/acpi/meson.build|  10 +
 .../power/amd_pstate/amd_pstate_cpufreq.c |  24 +-
 .../power/amd_pstate/amd_pstate_cpufreq.h |   8 +-
 drivers/power/amd_pstate/meson.build  |  10 +
 .../power/cppc/cppc_cpufreq.c |  22 +-
 .../power/cppc/cppc_cpufreq.h |   8 +-
 drivers/power/cppc/meson.build|  10 +
 .../power/kvm_vm}/guest_channel.c |   0
 .../power/kvm_vm}/guest_channel.h |   0
 .../power/kvm_vm/kvm_vm.c |  22 +-
 .../power/kvm_vm/kvm_vm.h |   6 +-
 drivers/power/kvm_vm/meson.build  |  16 +
 drivers/power/meson.build |  12 +
 drivers/power/pstate/meson.build  |  10 +
 .../power/pstate/pstate_cpufreq.c |  22 +-
 .../power/pstate/pstate_cpufreq.h |   6 +-
 lib/power/meson.build |   7 +-
 lib/power/power_common.c  |   2 +-
 lib/power/power_common.h  |  16 +-
 lib/power/rte_power.c | 286 ++
 lib/power/rte_power.h | 139 ++---
 lib/power/rte_power_cpufreq_api.h | 208 +
 lib/power/version.map |  14 +
 26 files changed, 618 insertions(+), 269 deletions(-)
 rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
 rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
 create mode 100644 drivers/power/acpi/meson.build
 rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
 rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (97%)
 create mode 100644 drivers/power/amd_pstate/meson.build
 rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
 rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
 create mode 100644 drivers/power/cppc/meson.build
 rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
 rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
 rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
 rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
 create mode 100644 drivers/power/kvm_vm/meson.build
 create mode 100644 drivers/power/meson.build
 create mode 100644 drivers/power/pstate/meson.build
 rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
 rename lib/power/power_pstate_cpufreq.h => drivers/power/pstate/pstate_cpufreq.h (98%)
 create mode 100644 lib/power/rte_power_cpufreq_api.h

diff --git a/drivers/meson.build b/drivers/meson.build
index 66931d4241..9d77e0deab 100644
--- a/drivers/meson.build
+++ b/drivers/meson.build
@@ -29,6 +29,7 @@ subdirs = [
 'event',  # depends on common, bus, mempool and net.
 'baseband',   # depends on common and bus.
 'gpu',# depends on common and bus.
+'power',  # depends on common (in future).
 ]
 
 if meson.is_cross_build()
diff --git a/lib/power/power_acpi_cpufreq.c b/drivers/power/acpi/acpi_cpufreq.c
similarity index 95%
rename from lib/power/power_acpi_cpufreq.c
rename to drivers/power/acpi/acpi_cpufreq.c
index 81996e1c13..8637c69703 100644
--- a/lib/power/power_acpi_cpufreq.c
+++ b/drivers/power/acpi/acpi_cpufreq.c
@@ -10,7 +10,7 @@
 #include 
 #include 
 
-#include "power_acpi_cpufreq.h"
+#include "acpi_cpufreq.h"
 #include "power_common.h"
 
 #define STR_SIZE 1024
@@ -577,3 +577,23 @@ int power_acpi_get_capabilities(unsigned int lcore_id,
 
return 0;
 }
+
+static struct rte_power_core_ops acpi_ops = {
+   .name = "acpi",
+   .init = power_acpi_cpufreq_init,
+   .exit = power_ac

[PATCH v3 4/5] power/amd_uncore: uncore support for AMD EPYC processors

2024-10-08 Thread Sivaprasad Tummala

This patch introduces driver support for power management of uncore
components in AMD EPYC processors.

v2:
 - fixed typo in comments section.
 - added fabric frequency get support for legacy platforms.

Signed-off-by: Sivaprasad Tummala 
---
 drivers/power/amd_uncore/amd_uncore.c | 328 ++
 drivers/power/amd_uncore/amd_uncore.h | 226 ++
 drivers/power/amd_uncore/meson.build  |  20 ++
 drivers/power/meson.build |   1 +
 4 files changed, 575 insertions(+)
 create mode 100644 drivers/power/amd_uncore/amd_uncore.c
 create mode 100644 drivers/power/amd_uncore/amd_uncore.h
 create mode 100644 drivers/power/amd_uncore/meson.build

diff --git a/drivers/power/amd_uncore/amd_uncore.c b/drivers/power/amd_uncore/amd_uncore.c
new file mode 100644
index 00..e667a783cd
--- /dev/null
+++ b/drivers/power/amd_uncore/amd_uncore.c
@@ -0,0 +1,328 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2024 Advanced Micro Devices, Inc.
+ */
+
+#include 
+#include 
+#include 
+
+#include 
+
+#include "amd_uncore.h"
+#include "power_common.h"
+#include "e_smi/e_smi.h"
+
+#define MAX_NUMA_DIE 8
+
+struct  __rte_cache_aligned uncore_power_info {
+   unsigned int die;  /* Core die id */
+   unsigned int pkg;  /* Package id */
+   uint32_t freqs[RTE_MAX_UNCORE_FREQS];  /* Frequency array */
+   uint32_t nb_freqs; /* Number of available freqs */
+   uint32_t curr_idx; /* Freq index in freqs array */
+   uint32_t max_freq;/* System max uncore freq */
+   uint32_t min_freq;/* System min uncore freq */
+};
+
+static struct uncore_power_info uncore_info[RTE_MAX_NUMA_NODES][MAX_NUMA_DIE];
+static int esmi_initialized;
+static int hsmp_proto_ver;
+
+static int
+set_uncore_freq_internal(struct uncore_power_info *ui, uint32_t idx)
+{
+   int ret;
+
+   if (idx >= RTE_MAX_UNCORE_FREQS || idx >= ui->nb_freqs) {
+   POWER_LOG(DEBUG, "Invalid uncore frequency index %u, which "
+   "should be less than %u", idx, ui->nb_freqs);
+   return -1;
+   }
+
+   ret = esmi_apb_disable(ui->pkg, idx);
+   if (ret != ESMI_SUCCESS) {
+   POWER_LOG(ERR, "DF P-state '%u' set failed for pkg %02u",
+   idx, ui->pkg);
+   return -1;
+   }
+
+   POWER_DEBUG_LOG("DF P-state '%u' to be set for pkg %02u die %02u",
+   idx, ui->pkg, ui->die);
+
+   /* write the minimum value first if the target freq is less than current max */
+   ui->curr_idx = idx;
+
+   return 0;
+}
+
+static int
+power_init_for_setting_uncore_freq(struct uncore_power_info *ui)
+{
+   switch (hsmp_proto_ver) {
+   case HSMP_PROTO_VER5:
+   ui->max_freq = 180; /* Hz */
+   ui->min_freq = 120; /* Hz */
+   break;
+   case HSMP_PROTO_VER2:
+   default:
+   ui->max_freq = 160; /* Hz */
+   ui->min_freq = 120; /* Hz */
+   }
+
+   return 0;
+}
+
+/*
+ * Get the available uncore frequencies of the specific die.
+ */
+static int
+power_get_available_uncore_freqs(struct uncore_power_info *ui)
+{
+   ui->nb_freqs = 3;
+   if (ui->nb_freqs >= RTE_MAX_UNCORE_FREQS) {
+   POWER_LOG(ERR, "Too many available uncore frequencies: %u",
+   ui->nb_freqs);
+   return -1;
+   }
+
+   /* Generate the uncore freq bucket array. */
+   switch (hsmp_proto_ver) {
+   case HSMP_PROTO_VER5:
+   ui->freqs[0] = 180;
+   ui->freqs[1] = 144;
+   ui->freqs[2] = 120;
+   break;
+   case HSMP_PROTO_VER2:
+   default:
+   ui->freqs[0] = 160;
+   ui->freqs[1] = 1333000;
+   ui->freqs[2] = 120;
+   }
+
+   POWER_DEBUG_LOG("%u frequency(s) of pkg %02u die %02u are available",
+   ui->nb_freqs, ui->pkg, ui->die);
+
+   return 0;
+}
+
+static int
+check_pkg_die_values(unsigned int pkg, unsigned int die)
+{
+   unsigned int max_pkgs, max_dies;
+   max_pkgs = power_amd_uncore_get_num_pkgs();
+   if (max_pkgs == 0)
+   return -1;
+   if (pkg >= max_pkgs) {
+   POWER_LOG(DEBUG, "Package number %02u can not exceed %u",
+   pkg, max_pkgs);
+   return -1;
+   }
+
+   max_dies = power_amd_uncore_get_num_dies(pkg);
+   if (max_dies == 0)
+   return -1;
+   if (die >= max_dies) {
+   POWER_LOG(DEBUG, "Die number %02u can not exceed %u",
+   die, max_dies);
+   return -1;
+   }
+
+   return 0;
+}
+
+static void
+power_amd_uncore_esmi_init(void)
+{
+   if (esmi_init() == ESMI_SUCCESS) {
+   if (esmi_hsmp_proto_ver_get(&

[PATCH v3 2/5] power: refactor uncore power management library

2024-10-08 Thread Sivaprasad Tummala

This patch refactors the power management library, addressing uncore
power management. The primary changes involve the creation of dedicated
directories for each driver within 'drivers/power/uncore/*'. The
adjustment of meson.build files enables the selective activation
of individual drivers.

This refactor significantly improves code organization, enhances
clarity and boosts maintainability. It lays the foundation for more
focused development on individual drivers and facilitates seamless
integration of future enhancements, particularly the AMD uncore driver.

v3:
 - fixed typo in header file inclusion

Signed-off-by: Sivaprasad Tummala 
---
 .../power/intel_uncore/intel_uncore.c |  18 +-
 .../power/intel_uncore/intel_uncore.h |   8 +-
 drivers/power/intel_uncore/meson.build|   6 +
 drivers/power/meson.build |   3 +-
 lib/power/meson.build |   2 +-
 lib/power/rte_power_uncore.c  | 206 ++-
 lib/power/rte_power_uncore.h  |  87 ---
 lib/power/rte_power_uncore_ops.h  | 239 ++
 lib/power/version.map |   1 +
 9 files changed, 405 insertions(+), 165 deletions(-)
 rename lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c (95%)
 rename lib/power/power_intel_uncore.h => drivers/power/intel_uncore/intel_uncore.h (97%)
 create mode 100644 drivers/power/intel_uncore/meson.build
 create mode 100644 lib/power/rte_power_uncore_ops.h

diff --git a/lib/power/power_intel_uncore.c b/drivers/power/intel_uncore/intel_uncore.c
similarity index 95%
rename from lib/power/power_intel_uncore.c
rename to drivers/power/intel_uncore/intel_uncore.c
index 4eb9c5900a..804ad5d755 100644
--- a/lib/power/power_intel_uncore.c
+++ b/drivers/power/intel_uncore/intel_uncore.c
@@ -8,7 +8,7 @@
 
 #include 
 
-#include "power_intel_uncore.h"
+#include "intel_uncore.h"
 #include "power_common.h"
 
 #define MAX_NUMA_DIE 8
@@ -475,3 +475,19 @@ power_intel_uncore_get_num_dies(unsigned int pkg)
 
return count;
 }
+
+static struct rte_power_uncore_ops intel_uncore_ops = {
+   .name = "intel-uncore",
+   .init = power_intel_uncore_init,
+   .exit = power_intel_uncore_exit,
+   .get_avail_freqs = power_intel_uncore_freqs,
+   .get_num_pkgs = power_intel_uncore_get_num_pkgs,
+   .get_num_dies = power_intel_uncore_get_num_dies,
+   .get_num_freqs = power_intel_uncore_get_num_freqs,
+   .get_freq = power_get_intel_uncore_freq,
+   .set_freq = power_set_intel_uncore_freq,
+   .freq_max = power_intel_uncore_freq_max,
+   .freq_min = power_intel_uncore_freq_min,
+};
+
+RTE_POWER_REGISTER_UNCORE_OPS(intel_uncore_ops);
diff --git a/lib/power/power_intel_uncore.h b/drivers/power/intel_uncore/intel_uncore.h
similarity index 97%
rename from lib/power/power_intel_uncore.h
rename to drivers/power/intel_uncore/intel_uncore.h
index 20a3ba8ebe..f2ce2f0c66 100644
--- a/lib/power/power_intel_uncore.h
+++ b/drivers/power/intel_uncore/intel_uncore.h
@@ -2,8 +2,8 @@
  * Copyright(c) 2022 Intel Corporation
  */
 
-#ifndef POWER_INTEL_UNCORE_H
-#define POWER_INTEL_UNCORE_H
+#ifndef INTEL_UNCORE_H
+#define INTEL_UNCORE_H
 
 /**
  * @file
@@ -11,7 +11,7 @@
  */
 
 #include "rte_power.h"
-#include "rte_power_uncore.h"
+#include "rte_power_uncore_ops.h"
 
 #ifdef __cplusplus
 extern "C" {
@@ -223,4 +223,4 @@ power_intel_uncore_get_num_dies(unsigned int pkg);
 }
 #endif
 
-#endif /* POWER_INTEL_UNCORE_H */
+#endif /* INTEL_UNCORE_H */
diff --git a/drivers/power/intel_uncore/meson.build b/drivers/power/intel_uncore/meson.build
new file mode 100644
index 00..876df8ad14
--- /dev/null
+++ b/drivers/power/intel_uncore/meson.build
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2017 Intel Corporation
+# Copyright(c) 2024 Advanced Micro Devices, Inc.
+
+sources = files('intel_uncore.c')
+deps += ['power']
diff --git a/drivers/power/meson.build b/drivers/power/meson.build
index 8c7215c639..c83047af94 100644
--- a/drivers/power/meson.build
+++ b/drivers/power/meson.build
@@ -6,7 +6,8 @@ drivers = [
 'amd_pstate',
 'cppc',
 'kvm_vm',
-'pstate'
+'pstate',
+'intel_uncore'
 ]
 
 std_deps = ['power']
diff --git a/lib/power/meson.build b/lib/power/meson.build
index d6b86ea19c..63616e60fd 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -13,7 +13,6 @@ if not is_linux
 endif
 sources = files(
 'power_common.c',
-'power_intel_uncore.c',
 'rte_power.c',
 'rte_power_uncore.c',
 'rte_power_pmd_mgmt.c',
@@ -24,6 +23,7 @@ headers = files(
 'rte_power_guest_channel.h',
 'rte_power_pmd_mgmt.h',
 'rte_power_uncore.h',
+'rte_power_uncore_ops.h',
 )
 if cc.has_argument('-Wno-cast-qual')
 cflags += '-Wno-cast-qual'
diff --git a/lib/power/rte_power_uncore.c b/lib/power/rte_power_uncore.c
index 48
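
The intel_uncore_ops table and the RTE_POWER_REGISTER_UNCORE_OPS() call in the patch above follow a register-by-name pattern. The following is only a minimal sketch of that pattern, assuming a constructor-based singly linked list; every demo_* name is hypothetical and this is not the DPDK implementation. The constructor attribute is a GCC/Clang extension.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical ops table; the real one has many more callbacks. */
struct demo_uncore_ops {
	const char *name;
	int (*init)(unsigned int pkg, unsigned int die);
	struct demo_uncore_ops *next;
};

/* Head of the list of registered drivers. */
static struct demo_uncore_ops *demo_ops_list;

/* Push a driver's ops table onto the list. */
void
demo_register_uncore_ops(struct demo_uncore_ops *ops)
{
	ops->next = demo_ops_list;
	demo_ops_list = ops;
}

/* Look a driver up by name; returns NULL when nothing matches. */
struct demo_uncore_ops *
demo_find_uncore_ops(const char *name)
{
	struct demo_uncore_ops *ops;

	for (ops = demo_ops_list; ops != NULL; ops = ops->next)
		if (strcmp(ops->name, name) == 0)
			return ops;
	return NULL;
}

/* Emit a constructor that registers the table before main() runs.
 * Used without a trailing semicolon.
 */
#define DEMO_REGISTER_UNCORE_OPS(ops) \
	static void __attribute__((constructor)) demo_reg_##ops(void) \
	{ \
		demo_register_uncore_ops(&ops); \
	}

/* Stand-in for a driver init callback. */
int
demo_intel_init(unsigned int pkg, unsigned int die)
{
	(void)pkg;
	(void)die;
	return 0;
}

static struct demo_uncore_ops demo_intel_ops = {
	.name = "intel-uncore",
	.init = demo_intel_init,
};

DEMO_REGISTER_UNCORE_OPS(demo_intel_ops)
```

With this layout the library only needs demo_find_uncore_ops() to reach a driver, so a driver can be left out of the build without touching library code.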

[PATCH v3 0/5] power: refactor power management library

2024-10-08 Thread Sivaprasad Tummala

This patchset refactors the power management library, addressing both
core and uncore power management. The primary changes involve the
creation of dedicated directories for each driver within
'drivers/power/core/*' and 'drivers/power/uncore/*'.

This refactor significantly improves code organization, enhances
clarity, and boosts maintainability. It lays the foundation for more
focused development on individual drivers and facilitates seamless
integration of future enhancements, particularly the AMD uncore driver.

Furthermore, this effort aims to streamline code maintenance by
consolidating common functions for cpufreq and cppc across various
core drivers, thus reducing code duplication.

Sivaprasad Tummala (5):
  power: refactor core power management library
  power: refactor uncore power management library
  test/power: removed function pointer validations
  power/amd_uncore: uncore support for AMD EPYC processors
  maintainers: update for drivers/power

 MAINTAINERS   |   1 +
 app/test/test_power.c |  95 -
 app/test/test_power_cpufreq.c |  52 ---
 app/test/test_power_kvm_vm.c  |  36 --
 drivers/meson.build   |   1 +
 .../power/acpi/acpi_cpufreq.c |  22 +-
 .../power/acpi/acpi_cpufreq.h |   6 +-
 drivers/power/acpi/meson.build|  10 +
 .../power/amd_pstate/amd_pstate_cpufreq.c |  24 +-
 .../power/amd_pstate/amd_pstate_cpufreq.h |   8 +-
 drivers/power/amd_pstate/meson.build  |  10 +
 drivers/power/amd_uncore/amd_uncore.c | 328 ++
 drivers/power/amd_uncore/amd_uncore.h | 226 
 drivers/power/amd_uncore/meson.build  |  20 ++
 .../power/cppc/cppc_cpufreq.c |  22 +-
 .../power/cppc/cppc_cpufreq.h |   8 +-
 drivers/power/cppc/meson.build|  10 +
 .../power/intel_uncore/intel_uncore.c |  18 +-
 .../power/intel_uncore/intel_uncore.h |   8 +-
 drivers/power/intel_uncore/meson.build|   6 +
 .../power/kvm_vm}/guest_channel.c |   0
 .../power/kvm_vm}/guest_channel.h |   0
 .../power/kvm_vm/kvm_vm.c |  22 +-
 .../power/kvm_vm/kvm_vm.h |   6 +-
 drivers/power/kvm_vm/meson.build  |  16 +
 drivers/power/meson.build |  14 +
 drivers/power/pstate/meson.build  |  10 +
 .../power/pstate/pstate_cpufreq.c |  22 +-
 .../power/pstate/pstate_cpufreq.h |   6 +-
 examples/l3fwd-power/main.c   |  12 +-
 lib/power/meson.build |   9 +-
 lib/power/power_common.c  |   2 +-
 lib/power/power_common.h  |  16 +-
 lib/power/rte_power.c | 286 +--
 lib/power/rte_power.h | 139 +---
 lib/power/rte_power_cpufreq_api.h | 208 +++
 lib/power/rte_power_uncore.c  | 206 +--
 lib/power/rte_power_uncore.h  |  87 +++--
 lib/power/rte_power_uncore_ops.h  | 239 +
 lib/power/version.map |  15 +
 40 files changed, 1602 insertions(+), 624 deletions(-)
 rename lib/power/power_acpi_cpufreq.c => drivers/power/acpi/acpi_cpufreq.c (95%)
 rename lib/power/power_acpi_cpufreq.h => drivers/power/acpi/acpi_cpufreq.h (98%)
 create mode 100644 drivers/power/acpi/meson.build
 rename lib/power/power_amd_pstate_cpufreq.c => drivers/power/amd_pstate/amd_pstate_cpufreq.c (95%)
 rename lib/power/power_amd_pstate_cpufreq.h => drivers/power/amd_pstate/amd_pstate_cpufreq.h (97%)
 create mode 100644 drivers/power/amd_pstate/meson.build
 create mode 100644 drivers/power/amd_uncore/amd_uncore.c
 create mode 100644 drivers/power/amd_uncore/amd_uncore.h
 create mode 100644 drivers/power/amd_uncore/meson.build
 rename lib/power/power_cppc_cpufreq.c => drivers/power/cppc/cppc_cpufreq.c (95%)
 rename lib/power/power_cppc_cpufreq.h => drivers/power/cppc/cppc_cpufreq.h (97%)
 create mode 100644 drivers/power/cppc/meson.build
 rename lib/power/power_intel_uncore.c => drivers/power/intel_uncore/intel_uncore.c (95%)
 rename lib/power/power_intel_uncore.h => drivers/power/intel_uncore/intel_uncore.h (97%)
 create mode 100644 drivers/power/intel_uncore/meson.build
 rename {lib/power => drivers/power/kvm_vm}/guest_channel.c (100%)
 rename {lib/power => drivers/power/kvm_vm}/guest_channel.h (100%)
 rename lib/power/power_kvm_vm.c => drivers/power/kvm_vm/kvm_vm.c (82%)
 rename lib/power/power_kvm_vm.h => drivers/power/kvm_vm/kvm_vm.h (98%)
 create mode 100644 drivers/power/kvm_vm/meson.build
 create mode 100644 drivers/power/meson.build
 create mode 100644 drivers/power/pstate/meson.build
 rename lib/power/power_pstate_cpufreq.c => drivers/power/pstate/pstate_cpufreq.c (96%)
 rename lib/power/power_pstate_cpufreq.h

[PATCH v3 3/5] test/power: removed function pointer validations

2024-10-08 Thread Sivaprasad Tummala

After refactoring the power library, power management operations are
consistently supported regardless of the operating environment, so the
function pointer checks are unnecessary and have been removed from the
tests and applications.

v2:
 - removed function pointer validation in l3fwd-power app.

Signed-off-by: Sivaprasad Tummala 
---
 app/test/test_power.c | 95 ---
 app/test/test_power_cpufreq.c | 52 ---
 app/test/test_power_kvm_vm.c  | 36 -
 examples/l3fwd-power/main.c   | 12 ++---
 4 files changed, 4 insertions(+), 191 deletions(-)

diff --git a/app/test/test_power.c b/app/test/test_power.c
index 403adc22d6..5df5848c70 100644
--- a/app/test/test_power.c
+++ b/app/test/test_power.c
@@ -24,86 +24,6 @@ test_power(void)
 
 #include 
 
-static int
-check_function_ptrs(void)
-{
-   enum power_management_env env = rte_power_get_env();
-
-   const bool not_null_expected = !(env == PM_ENV_NOT_SET);
-
-   const char *inject_not_string1 = not_null_expected ? " not" : "";
-   const char *inject_not_string2 = not_null_expected ? "" : " not";
-
-   if ((rte_power_freqs == NULL) == not_null_expected) {
-   printf("rte_power_freqs should%s be NULL, environment has%s 
been "
-   "initialised\n", inject_not_string1,
-   inject_not_string2);
-   return -1;
-   }
-   if ((rte_power_get_freq == NULL) == not_null_expected) {
-   printf("rte_power_get_freq should%s be NULL, environment has%s 
been "
-   "initialised\n", inject_not_string1,
-   inject_not_string2);
-   return -1;
-   }
-   if ((rte_power_set_freq == NULL) == not_null_expected) {
-   printf("rte_power_set_freq should%s be NULL, environment has%s 
been "
-   "initialised\n", inject_not_string1,
-   inject_not_string2);
-   return -1;
-   }
-   if ((rte_power_freq_up == NULL) == not_null_expected) {
-   printf("rte_power_freq_up should%s be NULL, environment has%s 
been "
-   "initialised\n", inject_not_string1,
-   inject_not_string2);
-   return -1;
-   }
-   if ((rte_power_freq_down == NULL) == not_null_expected) {
-   printf("rte_power_freq_down should%s be NULL, environment has%s 
been "
-   "initialised\n", inject_not_string1,
-   inject_not_string2);
-   return -1;
-   }
-   if ((rte_power_freq_max == NULL) == not_null_expected) {
-   printf("rte_power_freq_max should%s be NULL, environment has%s 
been "
-   "initialised\n", inject_not_string1,
-   inject_not_string2);
-   return -1;
-   }
-   if ((rte_power_freq_min == NULL) == not_null_expected) {
-   printf("rte_power_freq_min should%s be NULL, environment has%s 
been "
-   "initialised\n", inject_not_string1,
-   inject_not_string2);
-   return -1;
-   }
-   if ((rte_power_turbo_status == NULL) == not_null_expected) {
-   printf("rte_power_turbo_status should%s be NULL, environment 
has%s been "
-   "initialised\n", inject_not_string1,
-   inject_not_string2);
-   return -1;
-   }
-   if ((rte_power_freq_enable_turbo == NULL) == not_null_expected) {
-   printf("rte_power_freq_enable_turbo should%s be NULL, 
environment has%s been "
-   "initialised\n", inject_not_string1,
-   inject_not_string2);
-   return -1;
-   }
-   if ((rte_power_freq_disable_turbo == NULL) == not_null_expected) {
-   printf("rte_power_freq_disable_turbo should%s be NULL, 
environment has%s been "
-   "initialised\n", inject_not_string1,
-   inject_not_string2);
-   return -1;
-   }
-   if ((rte_power_get_capabilities == NULL) == not_null_expected) {
-   printf("rte_power_get_capabilities should%s be NULL, 
environment has%s been "
-   "initialised\n", inject_not_string1,
-   inject_not_string2);
-   return -1;
-   }
-
-   return 0;
-}
-
 static int
 test_power(void)
 {
@@ -124,10 +44,6 @@ test_power(void)
return -1;
}
 
-   /* Verify that function pointers are NULL */
-   if (check_function_ptrs() < 0)
-   goto fail_all;
-
rte_power_unset_env();
 
/* Perform tests for valid environments.*/
@@ -154,22 +70,11 @@ test_power(void)

[PATCH v3 5/5] maintainers: update for drivers/power

2024-10-08 Thread Sivaprasad Tummala

Update maintainers for drivers/power/*.

Signed-off-by: Sivaprasad Tummala 
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 812463fe9f..7d2868fe30 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1737,6 +1737,7 @@ M: Anatoly Burakov 
 M: David Hunt 
 M: Sivaprasad Tummala 
 F: lib/power/
+F: drivers/power/*
 F: doc/guides/prog_guide/power_man.rst
 F: app/test/test_power*
 F: examples/l3fwd-power/
-- 
2.34.1

[PATCH] fib6: add runtime checks for vector lookup

2024-10-08 Thread Vladimir Medvedkin

The AVX512 lookup function requires the CPU to support RTE_CPUFLAG_AVX512DQ
and RTE_CPUFLAG_AVX512BW. Add runtime checks for these two flags when
deciding whether the vector function can be used.

Fixes: c3e12e0f0354 ("fib: add dataplane algorithm for IPv6")
Cc: sta...@dpdk.org

Signed-off-by: Vladimir Medvedkin 
---
 lib/fib/trie.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/fib/trie.c b/lib/fib/trie.c
index 09470e7287..805e21d090 100644
--- a/lib/fib/trie.c
+++ b/lib/fib/trie.c
@@ -47,6 +47,8 @@ get_vector_fn(enum rte_fib_trie_nh_sz nh_sz)
 {
 #ifdef CC_TRIE_AVX512_SUPPORT
if ((rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F) <= 0) ||
+   (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512DQ) <= 0) ||
+   (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512BW) <= 0) ||
(rte_vect_get_max_simd_bitwidth() < RTE_VECT_SIMD_512))
return NULL;
switch (nh_sz) {
-- 
2.34.1
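
The check added above amounts to "use the vector path only if every required CPU flag is present and the SIMD width limit allows it". A minimal sketch of that selection logic follows, assuming the CPU features are handed in as a bitmask; the demo_* names and the bitmask encoding are hypothetical and are not DPDK's rte_cpu_get_flag_enabled() interface.

```c
#include <stddef.h>

/* Hypothetical feature bits, one per required AVX512 subset. */
enum demo_cpu_flag {
	DEMO_AVX512F  = 1u << 0,
	DEMO_AVX512DQ = 1u << 1,
	DEMO_AVX512BW = 1u << 2,
};

typedef int (*demo_lookup_fn)(unsigned int key);

/* Stand-ins for the scalar and vector lookup implementations. */
int demo_scalar_lookup(unsigned int key) { return (int)key; }
int demo_vector_lookup(unsigned int key) { return (int)key; }

/* Return the vector function only when all three flags are set and
 * the allowed SIMD width is at least 512 bits; otherwise fall back.
 */
demo_lookup_fn
demo_select_lookup(unsigned int cpu_flags, unsigned int max_simd_bits)
{
	const unsigned int required =
		DEMO_AVX512F | DEMO_AVX512DQ | DEMO_AVX512BW;

	if ((cpu_flags & required) != required || max_simd_bits < 512)
		return demo_scalar_lookup;
	return demo_vector_lookup;
}
```

Checking AVX512F alone would pick the vector path on a CPU that lacks DQ or BW, which is the case the patch closes.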

Re: [RFC v3 1/3] uapi: introduce kernel uAPI headers import

2024-10-08 Thread Maxime Coquelin





On 9/17/24 13:36, David Marchand wrote:

On Wed, Sep 11, 2024 at 9:32 PM Maxime Coquelin
 wrote:


This patch introduces uAPI headers import into the DPDK
repository. This import is possible thanks to Linux Kernel
licence exception for syscalls:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/LICENSES/exceptions/Linux-syscall-note

Header files have to be explicitly imported.

Guidelines are provided in the documentation, and helper
scripts are also provided to ensure proper importation of the
header (unmodified content from a released Kernel version):
  - import-linux-uapi.sh: used to add and update headers and
  their dependencies to linux-headers/uapi/
  - check-linux-uapi.sh: used to check all headers are valid

Signed-off-by: Maxime Coquelin 


I have been trying this script to import linux/vfio.h and cleanup its
usage in DPDK.

There is one issue that was raised.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/include/uapi/linux/vfio.h#n1573

struct vfio_bitmap {
__u64pgsize; /* page size for bitmap in bytes */
__u64size; /* in bytes */
__u64 __user *data; /* one bit per page */
};

The __user annotation is sanitized by the headers install script in
the kernel, but the dpdk import script is missing this part.
Such sanitization breaks the check script.

We could invert the logic in the check script: instead of "restoring"
an imported header, the check would convert a freshly downloaded
header and compare it against the imported header in dpdk.
One thing though is that we would need a copy of the "conversion"
function in the two scripts.

One idea.. can we have a single script?

# Interactive mode, with questions about what to import if dependencies exist:
$ devtools/linux-uapi.sh import linux/vfio.h v6.10

# Non interactive mode, the script uses the version file and imported headers:
$ devtools/linux-uapi.sh check


Regardless of this suggestion, I have some nits about the shell scripts below:


---
  devtools/check-linux-uapi.sh   |  85 ++
  devtools/import-linux-uapi.sh  | 119 +
  doc/guides/contributing/index.rst  |   1 +
  doc/guides/contributing/linux_uapi.rst |  71 +++
  linux-headers/uapi/.gitignore  |   4 +
  linux-headers/uapi/version |   1 +
  meson.build|   8 +-
  7 files changed, 287 insertions(+), 2 deletions(-)
  create mode 100755 devtools/check-linux-uapi.sh
  create mode 100755 devtools/import-linux-uapi.sh
  create mode 100644 doc/guides/contributing/linux_uapi.rst
  create mode 100644 linux-headers/uapi/.gitignore
  create mode 100644 linux-headers/uapi/version



Thanks for the deep review!
I think I addressed all the comments in upcoming V1

Maxime

RE: [EXTERNAL] Re: [PATCH v6 0/2] devtools: add tracepoint check in checkpatch

2024-10-08 Thread Ankur Dwivedi




>-Original Message-
>From: Stephen Hemminger 
>Sent: Tuesday, October 8, 2024 6:11 AM
>To: Ankur Dwivedi 
>Cc: dev@dpdk.org; tho...@monjalon.net; Jerin Jacob 
>Subject: [EXTERNAL] Re: [PATCH v6 0/2] devtools: add tracepoint check in
>checkpatch
>
>On Wed, 17 Jul 2024 12:09:53 +
>Ankur Dwivedi  wrote:
>
>> >-Original Message-
>> >From: Ankur Dwivedi 
>> >Sent: Friday, December 15, 2023 12:14 PM
>> >To: dev@dpdk.org
>> >Cc: tho...@monjalon.net; Jerin Jacob Kollanukkaran
>> >; Ankur Dwivedi 
>> >Subject: [PATCH v6 0/2] devtools: add tracepoint check in checkpatch
>> >
>> >This patch series adds a validation in checkpatch tool to check if
>> >tracepoint is present in any new function added in ethdev, eventdev
>> >cryptodev and mempool library.
>>
>> Please let me know if this patch series can be merged in DPDK or if there are
>any comments.
>
>Not sure why the patch got ignored.
>Perhaps if check-tracepoint was run first against existing code; add to check-
>patch later.

check-tracepoint reads a patch and checks whether a newly added library
function has a trace in it.
For existing code, traces can be added manually. Traces were added for
existing functions in the 23.03 release.
>
>And the skip list is empty, is that right?
Yes.
If a trace is not required for a new library function, the function name can
be added to the skip list.
Checkpatch will then skip the trace check for that function.
> is all of existing cryptodev ethdev ... ok
>now?

No, it's not completely OK. A few functions do not have a trace added; the
majority do.

[PATCH v2] net/mvneta: fix possible out-of-bounds write

2024-10-08 Thread Chengwen Feng

The mvneta_ifnames_get() function saves each 'iface' value to ifnames;
it writes out of bounds if too many iface pairs are passed (e.g.
'iface=xxx,iface=xxx,...').

Fixes: 4ccc8d770d3b ("net/mvneta: add PMD skeleton")
Cc: sta...@dpdk.org

Signed-off-by: Chengwen Feng 
Acked-by: Ferruh Yigit 

---
v2: add error log, which addresses Stephen's comment.

---
 drivers/net/mvneta/mvneta_ethdev.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/net/mvneta/mvneta_ethdev.c b/drivers/net/mvneta/mvneta_ethdev.c
index 3841c1ebe9..e641f19266 100644
--- a/drivers/net/mvneta/mvneta_ethdev.c
+++ b/drivers/net/mvneta/mvneta_ethdev.c
@@ -91,6 +91,11 @@ mvneta_ifnames_get(const char *key __rte_unused, const char *value,
 {
struct mvneta_ifnames *ifnames = extra_args;
 
+   if (ifnames->idx >= NETA_NUM_ETH_PPIO) {
+   MVNETA_LOG(ERROR, "Detect too many ifnames!");
+   return -EINVAL;
+   }
+
ifnames->names[ifnames->idx++] = value;
 
return 0;
-- 
2.17.1
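
The fix above can be exercised in isolation. The following is a minimal sketch, assuming a fixed-size name table and a per-pair callback; DEMO_MAX_IFNAMES stands in for NETA_NUM_ETH_PPIO, and the demo_* names are hypothetical rather than the driver's own.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical capacity; stands in for NETA_NUM_ETH_PPIO. */
#define DEMO_MAX_IFNAMES 3

struct demo_ifnames {
	const char *names[DEMO_MAX_IFNAMES];
	int idx;
};

/* Called once per "iface=<name>" pair. Rejects the pair instead of
 * writing past the end of names[] once the table is full.
 */
int
demo_ifnames_get(const char *key, const char *value, void *extra_args)
{
	struct demo_ifnames *ifnames = extra_args;

	(void)key;

	if (ifnames->idx >= DEMO_MAX_IFNAMES)
		return -EINVAL;

	ifnames->names[ifnames->idx++] = value;
	return 0;
}
```

Without the bounds check, the fourth call would store into names[3], one element past the array.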

Re: [PATCH v2] rawdev: add API to get device from index

2024-10-08 Thread Hemant Agrawal


Reviewed-by:  Hemant Agrawal 

On 08-10-2024 13:10, Akhil Goyal wrote:

Added an internal API for PMDs to get raw device pointer
from a device id.

Signed-off-by: Akhil Goyal 
---
- resend patch for main branch separated from rvu_lf raw driver
https://patches.dpdk.org/project/dpdk/list/?series=32949

  lib/rawdev/rte_rawdev_pmd.h | 24 
  1 file changed, 24 insertions(+)

diff --git a/lib/rawdev/rte_rawdev_pmd.h b/lib/rawdev/rte_rawdev_pmd.h
index 22b406444d..8339122348 100644
--- a/lib/rawdev/rte_rawdev_pmd.h
+++ b/lib/rawdev/rte_rawdev_pmd.h
@@ -102,6 +102,30 @@ rte_rawdev_pmd_get_named_dev(const char *name)
return NULL;
  }
  
+/**
+ * Get the rte_rawdev structure device pointer for given device ID.
+ *
+ * @param dev_id
+ *   raw device index.
+ *
+ * @return
+ *   - The rte_rawdev structure pointer for the given device ID.
+ */
+static inline struct rte_rawdev *
+rte_rawdev_pmd_get_dev(uint8_t dev_id)
+{
+   struct rte_rawdev *dev;
+
+   if (dev_id >= RTE_RAWDEV_MAX_DEVS)
+   return NULL;
+
+   dev = &rte_rawdevs[dev_id];
+   if (dev->attached == RTE_RAWDEV_ATTACHED)
+   return dev;
+
+   return NULL;
+}
+
  /**
   * Validate if the raw device index is a valid attached raw device.
   *

[PATCH v1] doc: update supported MEV TS version for CPFL

2024-10-08 Thread Soumyadeep Hore

The current MEV TS IPU supports FW version 1.6. Update the
documentation accordingly.

Signed-off-by: Soumyadeep Hore 
---
 doc/guides/nics/cpfl.rst | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/doc/guides/nics/cpfl.rst b/doc/guides/nics/cpfl.rst
index 69eabf5616..154201e745 100644
--- a/doc/guides/nics/cpfl.rst
+++ b/doc/guides/nics/cpfl.rst
@@ -37,6 +37,8 @@ Here is the suggested matching list which has been tested and verified.
++--+
|24.07   |   1.4|
++--+
+   |24.11   |   1.6|
+   ++--+
 
 
 Configuration
-- 
2.34.1

rte_ring move head question for machines with relaxed MO (arm/ppc)

2024-10-08 Thread Konstantin Ananyev

Hi lads,

Looking at rte_ring move_head functions I noticed that all of them
use slightly different approach to guarantee desired order of memory accesses:


1. rte_ring_generic_pvt.h:
==========================

pseudo-c-code                        //related armv8 instructions
------------------------------------------------------------------
 head.load()                         //ldr [head]
 rte_smp_rmb()                       //dmb ishld
 opposite_tail.load()                //ldr [opposite_tail]
 ...
 rte_atomic32_cmpset(head, ...)      //ldrex[head];... stlex[head]


2. rte_ring_c11_pvt.h
=====================

pseudo-c-code                        //related armv8 instructions
------------------------------------------------------------------
head.atomic_load(relaxed)            //ldr[head]
atomic_thread_fence(acquire)         //dmb ish
opposite_tail.atomic_load(acquire)   //lda[opposite_tail]
...
head.atomic_cas(..., relaxed)        //ldrex[head]; ... strex[head]


3. rte_ring_hts_elem_pvt.h
==========================

pseudo-c-code                        //related armv8 instructions
------------------------------------------------------------------
head.atomic_load(acquire)            //lda [head]
opposite_tail.load()                 //ldr [opposite_tail]
...
head.atomic_cas(..., acquire)        //ldaex[head]; ... strex[head]

The questions that arose from these observations:
a) are all 3 approaches equivalent in terms of functionality?
b) if yes, is there any difference in terms of performance between:
 "ldr; dmb; ldr;"   vs "lda; ldr;"
  ?
c) Comparing 1) and 2) above, the combination of
   ldr [head]; dmb; lda [opposite_tail]:
   looks like overkill to me.  Wouldn't just:
   ldr [head]; dmb; ldr[opposite_tail];
   be sufficient here?
I.e., for reading the tail value we don't need to use load(acquire).
Or am I missing something obvious here?

Thanks
Konstantin

For convenience, I created a godbolt page with all these variants:
https://godbolt.org/z/Yjj73b8xa
   
#1 - __rte_ring_headtail_move_head()
#2 - __rte_ring_headtail_move_head_c11_v1  
#3 - __rte_ring_headtail_move_head_c11_v2
#2 with c) -  __rte_ring_headtail_move_head_c11_v3
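
For readers who cannot open the godbolt link, variants 2 and 3 above can be
approximated with plain C11 atomics as in the sketch below. This is not the
DPDK code: struct headtail, move_head_c11() and move_head_acq() are
simplified, assumed names, only the producer-side arithmetic is shown, and
the wait loop of the real HTS code is left out. It is meant only to make the
memory orderings from the tables concrete.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified head/tail pair; not the real struct rte_ring layout. */
struct headtail {
	_Atomic uint32_t head;
	_Atomic uint32_t tail;
};

/*
 * Variant 2 (c11 style): relaxed load of head, acquire fence,
 * acquire load of the opposite tail, relaxed CAS on head.
 */
bool
move_head_c11(struct headtail *own, struct headtail *opposite,
	      uint32_t capacity, uint32_t n, uint32_t *old_head)
{
	uint32_t head, tail;

	assert(n > 0);
	head = atomic_load_explicit(&own->head, memory_order_relaxed);
	do {
		atomic_thread_fence(memory_order_acquire);
		tail = atomic_load_explicit(&opposite->tail,
					    memory_order_acquire);
		/* Unsigned arithmetic handles index wrap-around. */
		if (n > capacity + tail - head)
			return false;
	} while (!atomic_compare_exchange_strong_explicit(&own->head, &head,
			head + n, memory_order_relaxed, memory_order_relaxed));

	*old_head = head;
	return true;
}

/*
 * Variant 3 (hts style): acquire load of head, relaxed load of the
 * opposite tail, acquire CAS on head.
 */
bool
move_head_acq(struct headtail *own, struct headtail *opposite,
	      uint32_t capacity, uint32_t n, uint32_t *old_head)
{
	uint32_t head, tail;

	assert(n > 0);
	head = atomic_load_explicit(&own->head, memory_order_acquire);
	do {
		tail = atomic_load_explicit(&opposite->tail,
					    memory_order_relaxed);
		if (n > capacity + tail - head)
			return false;
	} while (!atomic_compare_exchange_strong_explicit(&own->head, &head,
			head + n, memory_order_acquire, memory_order_acquire));

	*old_head = head;
	return true;
}
```

Single-threaded, both functions behave identically; the open question in
this mail is only which orderings are required once several threads are
involved.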

RE: rte_ring move head question for machines with relaxed MO (arm/ppc)

2024-10-08 Thread Wathsala Wathawana Vithanage

> 1. rte_ring_generic_pvt.h:
> ==========================
>
> pseudo-c-code                        //related armv8 instructions
> ------------------------------------------------------------------
>  head.load()                         //ldr [head]
>  rte_smp_rmb()                       //dmb ishld
>  opposite_tail.load()                //ldr [opposite_tail]
>  ...
>  rte_atomic32_cmpset(head, ...)      //ldrex[head];... stlex[head]
>
>
> 2. rte_ring_c11_pvt.h
> =====================
>
> pseudo-c-code                        //related armv8 instructions
> ------------------------------------------------------------------
> head.atomic_load(relaxed)            //ldr[head]
> atomic_thread_fence(acquire)         //dmb ish
> opposite_tail.atomic_load(acquire)   //lda[opposite_tail]
> ...
> head.atomic_cas(..., relaxed)        //ldrex[head]; ... strex[head]
>
>
> 3. rte_ring_hts_elem_pvt.h
> ==========================
>
> pseudo-c-code                        //related armv8 instructions
> ------------------------------------------------------------------
> head.atomic_load(acquire)            //lda [head]
> opposite_tail.load()                 //ldr [opposite_tail]
> ...
> head.atomic_cas(..., acquire)        //ldaex[head]; ... strex[head]
>
> The questions that arose from these observations:
> a) are all 3 approaches equivalent in terms of functionality?
No. lda (load with acquire semantics) and ldr (plain load) are different.

> b) if yes, is there any difference in terms of performance between:
>  "ldr; dmb; ldr;"   vs "lda; ldr;"
>   ?
dmb is a full barrier, so performance is poor.
I would assume (haven't measured) ldr; dmb; ldr to be less performant than
lda; ldr.

> c) Comparing 1) and 2) above, the combination of
>    ldr [head]; dmb; lda [opposite_tail]:
>    looks like overkill to me.  Wouldn't just:
>    ldr [head]; dmb; ldr[opposite_tail];
>    be sufficient here?
lda [opposite_tail] synchronizes with the stlr in the tail update that happens
after the array update, so it cannot be changed to ldr.

lda can be replaced with ldapr (LDA with release consistency - processor
consistency), which is more performant as lda is allowed to rise above stlr.
This can be done with -mcpu=+rcpc

--wathsala

RE: rte_ring move head question for machines with relaxed MO (arm/ppc)

2024-10-08 Thread Wathsala Wathawana Vithanage

> 
> lda can be replaced with ldapr (LDA with release consistency - processor
> consistency) which is more performant as lda is allowed to rise above stlr.
> Can be done with -mcpu=+rcpc
> 

Correction: ldapr is allowed to rise above stlr.

-- wathsala

Re: [PATCH v2 0/4] simplify doing 32-bit DPDK builds

2024-10-08 Thread David Marchand

On Thu, Sep 19, 2024 at 10:02 AM Bruce Richardson
 wrote:
> > I was then surprised to read the result:
> > ...
> > 2024-09-19T07:22:12.6485260Z Checking for size of "void *" : 8
> > 2024-09-19T07:22:12.6485592Z Checking for size of "void *" : 8
> > ...
> >
> >
> > *scratch* *scratch*
> > So I retested the series locally on my f39 (the series seemed ok so
> > far) but I downgraded meson to 0.53.2 (which is the version forced in
> > GHA) and now I observe the same issue.
> >
> > I suspect something changed in the cross file handling in more recent
> > meson versions.
> > Likely, the c_args= or [built-in options] part is not read.
> >
> > Am I doing something wrong?
> >
> From the docs on cross-files [1], it appears that significant changes to
> the cross-file handling were made in the 0.56 release. That may be the cause.
> I'll have to try some testing myself.
>
> Overall, I think we haven't increased our minimum meson version in some
> time. Maybe it's time to consider doing so in this release or the next one?
> Need to look through release notes to see how far forward to jump to see
> what extra features might be useful for us to leverage.

Just a note that I think it is safer to wait for the upgrade to meson
0.57 before merging this series.
If this upgrade does not happen in this release, I'll merge at least
the first patches but keep the old way of testing 32 bits in the
devtools script.

Ok for you?


-- 
David Marchand

Re: [PATCH v2 0/8] centralize AVX-512 feature detection

2024-10-08 Thread David Marchand

On Tue, Oct 1, 2024 at 1:19 PM Bruce Richardson
 wrote:
>
> The meson code to detect CPU and compiler support for AVX512 was duplicated
> across multiple drivers. Do all detection in just a single place to simplify
> the code.
>
> v2: ensure that target_has_avx512 is always defined on x86 to fix build errors
>
> Bruce Richardson (8):
>   config/x86: add global defines for checking AVX-512
>   event/dlb2: use global AVX-512 variables
>   common/idpf: use global AVX-512 variables
>   net/cpfl: use global AVX-512 variables
>   net/i40e: use global AVX-512 variables
>   net/iavf: use global AVX-512 variables
>   net/ice: use global AVX-512 variables
>   net/idpf: use global AVX-512 variables
>
>  config/x86/meson.build  | 19 +++
>  drivers/common/idpf/meson.build | 17 ++---
>  drivers/event/dlb2/meson.build  | 42 +++--
>  drivers/net/cpfl/meson.build| 19 ++-
>  drivers/net/i40e/meson.build| 13 ++
>  drivers/net/iavf/meson.build| 13 ++
>  drivers/net/ice/meson.build | 15 ++--
>  drivers/net/idpf/meson.build| 19 ++-
>  8 files changed, 36 insertions(+), 121 deletions(-)

Thanks for this cleanup, I have two comments.

- Some drivers were going to great lengths to check that individual
avx512 features were available.
With this series, we end up requiring support for all features to
announce avx512 availability.
Are we perhaps disabling AVX512 support with some toolchains, out
there, supporting only part of the set?

- Some drivers were checking for presence of -mno-avx512f in
machine_args as a way to disable building any AVX512 stuff.
This gets discarded with this series.


-- 
David Marchand

Re: [v2 1/5] raw/zxdh: introduce zxdh raw device driver

2024-10-08 Thread Stephen Hemminger

On Mon, 12 Aug 2024 15:31:24 +0800
Yong Zhang  wrote:

> diff --git a/doc/guides/rawdevs/zxdh.rst b/doc/guides/rawdevs/zxdh.rst
> new file mode 100644
> index 00..fa7ada1004
> --- /dev/null
> +++ b/doc/guides/rawdevs/zxdh.rst
> @@ -0,0 +1,30 @@
> +..  SPDX-License-Identifier: BSD-3-Clause
> +Copyright 2024 ZTE Corporation
> +
> +ZXDH Rawdev Driver
> +==================
> +
> +The ``zxdh`` rawdev driver is an implementation of the rawdev API,
> +that provides communication between two separate hosts.
> +This is achieved via using the GDMA controller of Dinghai SoC,
> +which can be configured through exposed MPF devices.

This is awkward use of passive voice and could be simplified and
shortened.

> +
> +Device Setup
> +------------
> +
> +It is recommended to bind the ZXDH MPF kernel driver for MPF devices (Not mandatory).
> +The kernel drivers can be downloaded at `ZTE Official Website
> +`_.

What works (and what doesn't) without the kernel driver?
Has this driver been submitted to upstream kernel.org?
DPDK does not want to require or use 3rd party kernel drivers with
non-GPLv2 licenses.

> +
> +Initialization
> +--------------
> +
> +The ``zxdh`` rawdev driver needs to work in IOVA PA mode.
> +Consider using ``--iova-mode=pa`` in the EAL options.
> +
> +Platform Requirement
> +--------------------
> +
> +This PMD is only supported on ZTE Neo Platforms:
> +- Neo X510/X512
> +


Adding a blank line at the end of the file causes git to complain when merging.

> diff --git a/drivers/raw/meson.build b/drivers/raw/meson.build
> index 05cad143fe..237d1bdd80 100644
> --- a/drivers/raw/meson.build
> +++ b/drivers/raw/meson.build
> @@ -12,5 +12,6 @@ drivers = [
>  'ifpga',
>  'ntb',
>  'skeleton',
> +'zxdh',
>  ]
>  std_deps = ['rawdev']
> diff --git a/drivers/raw/zxdh/meson.build b/drivers/raw/zxdh/meson.build
> new file mode 100644
> index 00..266d3db6d8
> --- /dev/null
> +++ b/drivers/raw/zxdh/meson.build
> @@ -0,0 +1,5 @@
> +#SPDX-License-Identifier: BSD-3-Clause
> +#Copyright 2024 ZTE Corporation
> +
> +deps += ['rawdev', 'kvargs', 'mbuf', 'bus_pci']
> +sources = files('zxdh_rawdev.c')
> diff --git a/drivers/raw/zxdh/zxdh_rawdev.c b/drivers/raw/zxdh/zxdh_rawdev.c
> new file mode 100644
> index 00..269c4f92e0
> --- /dev/null
> +++ b/drivers/raw/zxdh/zxdh_rawdev.c
> @@ -0,0 +1,220 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2024 ZTE Corporation
> + */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 

This is a large set of includes and looks like it was copy-pasted from
elsewhere.

For example: you are including rte_kvargs.h but never use kvargs in the driver.

If I run iwyu on the driver, it produces:

$ iwyu -I lib/eal/include -I drivers/bus/pci drivers/raw/zxdh/zxdh_rawdev.c

drivers/raw/zxdh/zxdh_rawdev.h should add these lines:
#include   // for rte_spinlock_t
#include// for RTE_LOG_COMMA, RTE_LOG_LINE_PREFIX
#include // for uint32_t, uint16_t, uint8_t, uint64_t
#include "rte_common.h"// for rte_iova_t

drivers/raw/zxdh/zxdh_rawdev.h should remove these lines:
- #include   // lines 13-13

The full include-list for drivers/raw/zxdh/zxdh_rawdev.h:
#include   // for rte_spinlock_t
#include// for RTE_LOG_COMMA, RTE_LOG_LINE_PREFIX
#include // for rte_rawdev
#include // for uint32_t, uint16_t, uint8_t, uint64_t
#include "rte_common.h"// for rte_iova_t
---

drivers/raw/zxdh/zxdh_rawdev.c should add these lines:
#include  // for PATH_MAX
#include// for rte_spinlock_lock, rte_spinlock_u...
#include "rte_branch_prediction.h"  // for unlikely

drivers/raw/zxdh/zxdh_rawdev.c should remove these lines:
- #include   // lines 5-5
- #include   // lines 16-16
- #include   // lines 28-28
- #include   // lines 19-19
- #include   // lines 21-21
- #include   // lines 22-22
- #include   // lines 26-26
- #include   // lines 25-25
- #include   // lines 7-7
- #include   // lines 9-9
- #include   // lines 12-12

The full include-list for drivers/raw/zxdh/zxdh_rawdev.c:
#include "zxdh_rawdev.h"
#include  // for rte_pci_device, rte_pci_get_sysfs...
#include   // for EINVAL, ENOMEM, EEXIST, errno
#include   // for open, O_RDWR
#include// for uint16_t, uint32_t, uint8_t, uint...
#include  // for PATH_MAX
#include  // for rte_wmb
#include  // for __rte_unused, RTE_PRIORITY_LAST
#include // for rte_mem_resource, RTE_PMD_REGISTE...
#include

Re: [PATCH v2 0/8] centralize AVX-512 feature detection

2024-10-08 Thread David Marchand

On Tue, Oct 8, 2024 at 12:03 PM Bruce Richardson
 wrote:
>
> On Tue, Oct 08, 2024 at 10:49:39AM +0200, David Marchand wrote:
> > On Tue, Oct 1, 2024 at 1:19 PM Bruce Richardson
> >  wrote:
> > >
> > > The meson code to detect CPU and compiler support for AVX512 was duplicated
> > > across multiple drivers. Do all detection in just a single place to simplify
> > > the code.
> > >
> > > v2: ensure that target_has_avx512 is always defined on x86 to fix build errors
> > >
> > > Bruce Richardson (8):
> > >   config/x86: add global defines for checking AVX-512
> > >   event/dlb2: use global AVX-512 variables
> > >   common/idpf: use global AVX-512 variables
> > >   net/cpfl: use global AVX-512 variables
> > >   net/i40e: use global AVX-512 variables
> > >   net/iavf: use global AVX-512 variables
> > >   net/ice: use global AVX-512 variables
> > >   net/idpf: use global AVX-512 variables
> > >
> > >  config/x86/meson.build  | 19 +++
> > >  drivers/common/idpf/meson.build | 17 ++---
> > >  drivers/event/dlb2/meson.build  | 42 +++--
> > >  drivers/net/cpfl/meson.build| 19 ++-
> > >  drivers/net/i40e/meson.build| 13 ++
> > >  drivers/net/iavf/meson.build| 13 ++
> > >  drivers/net/ice/meson.build | 15 ++--
> > >  drivers/net/idpf/meson.build| 19 ++-
> > >  8 files changed, 36 insertions(+), 121 deletions(-)
> >
> > Thanks for this cleanup, I have two comments.
> >
> > - Some drivers were going to great lengths to check that individual
> > avx512 features were available.
> > With this series, we end up requiring support for all features to
> > announce avx512 availability.
> > Are we perhaps disabling AVX512 support with some toolchains, out
> > there, supporting only part of the set?
> >
>
> The various AVX-512 feature sets checked for (F, BW, VL, DQ) were all
> introduced in the same hardware generation - all are available in gcc when
> using -march=skylake-avx512 or later, or -march=znver4. On the toolchain
> side, gcc introduced all these flags simultaneously in gcc-6 [1]. For
> clang/llvm, testing with godbolt for compiler errors/warnings indicates
> that all these 4 avx512 flags are available from clang 3.6 - the minimum we
> support in DPDK [2]
>
> [1] https://gcc.gnu.org/gcc-6/changes.html
> [2] https://doc.dpdk.org/guides/linux_gsg/sys_reqs.html#compilation-of-the-dpdk

Perfect, thanks for the details.
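
As a side note for readers, the "all four or nothing" policy can also be
checked from C, since gcc and clang predefine one macro per enabled AVX-512
feature set (__AVX512F__, __AVX512BW__, __AVX512VL__, __AVX512DQ__). The
sketch below is not part of the series, which does this in meson; the
function names and bit assignments are made up for illustration.

```c
#include <stdbool.h>

/* One bit per feature set discussed in this thread (F, BW, VL, DQ). */
unsigned int
avx512_feature_mask(void)
{
	unsigned int mask = 0;

#ifdef __AVX512F__
	mask |= 1u << 0;
#endif
#ifdef __AVX512BW__
	mask |= 1u << 1;
#endif
#ifdef __AVX512VL__
	mask |= 1u << 2;
#endif
#ifdef __AVX512DQ__
	mask |= 1u << 3;
#endif
	return mask;
}

/* True only when the compiler targets all four feature sets at once. */
bool
cc_targets_full_avx512(void)
{
	return avx512_feature_mask() == 0xFu;
}
```

Built with -march=skylake-avx512 the mask would be expected to be 0xF; with
a baseline x86-64 target it would be 0.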

>
> > - Some drivers were checking for presence of -mno-avx512f in
> > machine_args as a way to disable building any AVX512 stuff.
> > This gets discarded with this series.
> >
>
> Yes, because it should no longer be necessary. The places in the build
> system where we set the no-avx512f flag are reworked so that we don't have
> cc_has_avx512 set.

Ok, it is clearer now.

Last comment on style:
$ git grep cc_avx512_flags drivers/
drivers/common/idpf/meson.build:avx512_args = [cflags] + cc_avx512_flags
drivers/event/dlb2/meson.build:   c_args: cflags + cc_avx512_flags)
drivers/net/i40e/meson.build:avx512_args = cflags + cc_avx512_flags
drivers/net/iavf/meson.build:avx512_args = cflags + cc_avx512_flags
drivers/net/ice/meson.build:avx512_args = cflags + cc_avx512_flags

I think it is safe to remove the [] around cflags in common/idpf, right?


Do you have some cycles to send a v2 and convert lib/net and net/virtio ?
Otherwise, can you do a followup patch for rc2?


Thanks.

-- 
David Marchand

Re: [PATCH v2 0/8] centralize AVX-512 feature detection

2024-10-08 Thread Bruce Richardson

On Tue, Oct 08, 2024 at 01:33:16PM +0200, David Marchand wrote:
> On Tue, Oct 8, 2024 at 12:03 PM Bruce Richardson
>  wrote:
> >
> > On Tue, Oct 08, 2024 at 10:49:39AM +0200, David Marchand wrote:
> > > On Tue, Oct 1, 2024 at 1:19 PM Bruce Richardson
> > >  wrote:
> > > >
> > > > The meson code to detect CPU and compiler support for AVX512 was duplicated
> > > > across multiple drivers. Do all detection in just a single place to simplify
> > > > the code.
> > > >
> > > > v2: ensure that target_has_avx512 is always defined on x86 to fix build errors
> > > >
> > > > Bruce Richardson (8):
> > > >   config/x86: add global defines for checking AVX-512
> > > >   event/dlb2: use global AVX-512 variables
> > > >   common/idpf: use global AVX-512 variables
> > > >   net/cpfl: use global AVX-512 variables
> > > >   net/i40e: use global AVX-512 variables
> > > >   net/iavf: use global AVX-512 variables
> > > >   net/ice: use global AVX-512 variables
> > > >   net/idpf: use global AVX-512 variables
> > > >
> > > >  config/x86/meson.build  | 19 +++
> > > >  drivers/common/idpf/meson.build | 17 ++---
> > > >  drivers/event/dlb2/meson.build  | 42 +++--
> > > >  drivers/net/cpfl/meson.build| 19 ++-
> > > >  drivers/net/i40e/meson.build| 13 ++
> > > >  drivers/net/iavf/meson.build| 13 ++
> > > >  drivers/net/ice/meson.build | 15 ++--
> > > >  drivers/net/idpf/meson.build| 19 ++-
> > > >  8 files changed, 36 insertions(+), 121 deletions(-)
> > >
> > > Thanks for this cleanup, I have two comments.
> > >
> > > - Some drivers were going to great lengths to check that individual
> > > avx512 features were available.
> > > With this series, we end up requiring support for all features to
> > > announce avx512 availability.
> > > Are we perhaps disabling AVX512 support with some toolchains, out
> > > there, supporting only part of the set?
> > >
> >
> > The various AVX-512 feature sets checked for (F, BW, VL, DQ) were all
> > introduced in the same hardware generation - all are available in gcc when
> > using -march=skylake-avx512 or later, or -march=znver4. On the toolchain
> > side, gcc introduced all these flags simultaneously in gcc-6 [1]. For
> > clang/llvm, testing with godbolt for compiler errors/warnings indicates
> > that all these 4 avx512 flags are available from clang 3.6 - the minimum we
> > support in DPDK [2]
> >
> > [1] https://gcc.gnu.org/gcc-6/changes.html
> > [2] https://doc.dpdk.org/guides/linux_gsg/sys_reqs.html#compilation-of-the-dpdk
> 
> Perfect, thanks for the details.
> 
> >
> > > - Some drivers were checking for presence of -mno-avx512f in
> > > machine_args as a way to disable building any AVX512 stuff.
> > > This gets discarded with this series.
> > >
> >
> > Yes, because it should no longer be necessary. The places in the build
> > system where we set the no-avx512f flag are reworked so that we don't have
> > cc_has_avx512 set.
> 
> Ok, it is clearer now.
> 
> Last comment on style:
> $ git grep cc_avx512_flags drivers/
> drivers/common/idpf/meson.build:avx512_args = [cflags] + cc_avx512_flags
> drivers/event/dlb2/meson.build:   c_args: cflags + cc_avx512_flags)
> drivers/net/i40e/meson.build:avx512_args = cflags + cc_avx512_flags
> drivers/net/iavf/meson.build:avx512_args = cflags + cc_avx512_flags
> drivers/net/ice/meson.build:avx512_args = cflags + cc_avx512_flags
> 
> I think it is safe to remove the [] around cflags in common/idpf, right?
> 
Yep.

> 
> Do you have some cycles to send a v2 and convert lib/net and net/virtio ?
> Otherwise, can you do a followup patch for rc2?
> 
I'll see what I can do today.

/Bruce
