The 01/16/2018 10:49, Andrew Rybchenko wrote:
> On 01/16/2018 04:10 AM, Yongseok Koh wrote:
> >This commit introduces rte_dma_wmb() and rte_dma_rmb(), in order to
> >guarantee the ordering of coherent shared memory between the CPU and a DMA
> >capable device.
> >
> >Signed-off-by: Yongseok Koh
> >
of the entry in order to guarantee data is not stale.
>
> Fixes: 570acdb1da8a ("net/mlx5: add vectorized Rx/Tx burst for ARM")
> Cc: sta...@dpdk.org
>
> Signed-off-by: Yongseok Koh
> Acked-by: Shahaf Shuler
> Acked-by: Nelio Laranjeiro
Acked-by: Jianbo
The 01/15/2018 17:10, Yongseok Koh wrote:
> Cc: Thomas Speier
>
> Signed-off-by: Yongseok Koh
> Acked-by: Thomas Speier
Acked-by: Jianbo Liu
> ---
> lib/librte_eal/common/include/arch/arm/rte_atomic_64.h | 4
> 1 file changed, 4 insertions(+)
>
> diff --
>
> +#define rte_dma_wmb() rte_wmb()
> +
> +#define rte_dma_rmb() rte_rmb()
> +
> #ifdef __cplusplus
> }
> #endif
> --
> 2.11.0
>
Acked-by: Jianbo Liu
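For readers skimming the thread, the kind of ordering these two macros are meant to provide can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch itself: the descriptor layout, the ownership flag and every sketch_* name are hypothetical, and C11 fences stand in for the real barriers (on arm64 the quoted diff maps them to rte_wmb() and rte_rmb()). C11 fences order CPU threads only; they do not order device DMA.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for rte_dma_wmb()/rte_dma_rmb(). */
#define sketch_dma_wmb() atomic_thread_fence(memory_order_release)
#define sketch_dma_rmb() atomic_thread_fence(memory_order_acquire)

/* Hypothetical descriptor shared between the CPU and a DMA-capable device. */
struct sketch_desc {
    uint64_t addr;                     /* buffer address */
    uint32_t len;                      /* buffer length */
    volatile uint32_t owned_by_device; /* 1 while the device owns the entry */
};

/* Fill the descriptor, then hand it over: payload stores must be
 * visible before the ownership flag is. */
static void sketch_post_desc(struct sketch_desc *d, uint64_t addr, uint32_t len)
{
    assert(d != NULL);
    d->addr = addr;
    d->len = len;
    sketch_dma_wmb();
    d->owned_by_device = 1;
}

/* Check ownership first, then read the payload: the flag load must be
 * ordered before the payload loads so the data is not stale. */
static int sketch_poll_desc(const struct sketch_desc *d, uint32_t *len)
{
    assert(d != NULL && len != NULL);
    if (d->owned_by_device)
        return 0;
    sketch_dma_rmb();
    *len = d->len;
    return 1;
}
```

The write barrier sits between the payload and the flag on the producer side, and the read barrier sits between the flag and the payload on the consumer side; that pairing is the point of having both macros.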
--
IMPORTANT NOTICE: The contents of this email and any attachments are
confidential and may also be privileged
The 01/15/2018 17:10, Yongseok Koh wrote:
> This commit introduces rte_dma_wmb() and rte_dma_rmb(), in order to
> guarantee the ordering of coherent shared memory between the CPU and a DMA
> capable device.
>
> Signed-off-by: Yongseok Koh
Acked-by: Jianbo Liu
> ---
>
MAX_MEMZONE + 1].
>
> Signed-off-by: Phil Yang
Acked-by: Jianbo Liu
> ---
> test/test/test_memzone.c | 6 +-
> 1 file changed, 1 insertion(+), 5 deletions(-)
>
> diff --git a/test/test/test_memzone.c b/test/test/test_memzone.c
> index 6e80977..24e29a7 100644
> --- a/
The 12/26/2017 20:28, Yongseok Koh wrote:
> Instead of using system-wide 'dsb' instruction for IO barriers, 'dmb' is
> sufficient and could bring better performance. Using 'dmb' with Outer
> Shareable Domain option is also consistent with linux kernel.
But in kernel dsb is used for io barriers.
ht
The 11/30/2017 08:10, Ravi Kumar wrote:
> Signed-off-by: Ravi Kumar
> ---
> drivers/net/axgbe/axgbe_common.h | 1645
> +-
> 1 file changed, 1644 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/axgbe/axgbe_common.h
> b/drivers/net/axgbe/axgbe_common.
https://github.com/freebsd/freebsd/blob/master/sys/sys/buf_ring.h#L170
> [2] http://dpdk.org/ml/archives/dev/2017-October/080861.html
>
> Signed-off-by: Jia He
> Suggested-by: Jerin Jacob
> Acked-by: Jerin Jacob
Acked-by: Jianbo Liu
> ---
> config/common_armv8a_
e | 3 ++-
> drivers/crypto/mrvl/rte_mrvl_compat.h | 1 +
> 4 files changed, 9 insertions(+), 25 deletions(-)
>
> --
> 2.7.4
>
Acked-by: Jianbo Liu
-build.sh | 1 +
> doc/guides/nics/mrvl.rst | 69 ---
> drivers/net/mrvl/Makefile | 4 +-
> drivers/net/mrvl/mrvl_ethdev.c | 92
> +-
> drivers/net/mrvl/mrvl_ethdev.h | 5 +++
> 5 files changed, 117 insertions(+), 54 deletions(-
The 11/30/2017 14:32, Tomasz Duszynski wrote:
> Add extra error logs in a few places.
>
> Signed-off-by: Tomasz Duszynski
> ---
> drivers/net/mrvl/mrvl_ethdev.c | 9 ++---
> 1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/mrvl/mrvl_ethdev.c b/drivers/net/mrvl/mrvl
The 11/30/2017 14:32, Tomasz Duszynski wrote:
> Following changes are needed to switch to musdk-17.10:
>
> - With a new version of the musdk library it's no longer necessary to
> explicitly define MVCONF_ARCH_DMA_ADDR_T_64BIT and
> CONF_PP2_BPOOL_COOKIE_SIZE.
>
> Proper defines are autogenerat
STYLE: Block comments use a trailing */ on a
> separate line
> #58: FILE: lib/librte_ring/rte_ring.h:414:
> + * memory model. It is noop on x86 */
>
> WARNING:BLOCK_COMMENT_STYLE: Block comments use a trailing */ on a
> separate line
> #70: FILE: lib/librte_rin
The 11/10/2017 10:06, Jia He wrote:
>
>
> On 11/9/2017 5:38 PM, Ananyev, Konstantin Wrote:
> >
> >>-Original Message-----
> >>From: Jianbo Liu [mailto:jianbo@arm.com]
> >>Sent: Thursday, November 9, 2017 4:56 AM
> >>To
The 11/09/2017 09:38, Ananyev, Konstantin wrote:
>
>
> > -Original Message-
> > From: Jianbo Liu [mailto:jianbo@arm.com]
> > Sent: Thursday, November 9, 2017 4:56 AM
> > To: Jia He
> > Cc: Richardson, Bruce ;
> > jerin.ja...@caviumnetworks
The 11/09/2017 12:43, Jia He wrote:
> Hi Jianbo
>
>
> On 11/9/2017 11:21 AM, Jianbo Liu Wrote:
> >The 11/09/2017 11:14, Jia He wrote:
> >>
> >>On 11/9/2017 9:22 AM, Jia He Wrote:
> >>>Hi Bruce
> >>>
> >>>
> >>>On
The 11/09/2017 11:14, Jia He wrote:
>
>
> On 11/9/2017 9:22 AM, Jia He Wrote:
> >Hi Bruce
> >
> >
> >On 11/8/2017 6:28 PM, Bruce Richardson Wrote:
> >>On Wed, Nov 08, 2017 at 06:17:10AM +, Jia He wrote:
> >>>for the code as follows:
> >>>if (condition)
> >>>rte_smp_rmb();
> >>>else
> >>>
d pn. This breaks port grouping algorithm.
> >
> > This patch eliminates the above problem by introducing a compiler
> > barrier between the instructions that depend on pnum, pn and lp.
> >
> > Fixes: 569b290cdb36 ("examples/l3fwd: add NEON implementation")
>
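What a compiler barrier does here can be sketched as below. This is only an assumption-laden illustration: the function and its arguments are hypothetical and do not reproduce the l3fwd port-grouping code. The empty asm statement with a "memory" clobber is the usual GCC/Clang idiom (DPDK's rte_compiler_barrier() is defined the same way); it emits no instruction and only stops the compiler from moving or caching memory accesses across it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* GCC/Clang idiom: no instruction is emitted, but the compiler may not
 * reorder or cache memory accesses across this point. */
#define sketch_compiler_barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical example: 'last' may point into 'counts'. The barrier
 * makes sure the store to counts[0] is emitted before *last is touched
 * and that counts[0] is reloaded afterwards. */
static uint16_t sketch_update(uint16_t *counts, uint16_t *last)
{
    assert(counts != NULL && last != NULL);
    counts[0] = 1;
    sketch_compiler_barrier();
    *last += 1;
    sketch_compiler_barrier();
    return counts[0];
}
```

A compiler barrier does not order accesses as seen by other cores or devices; for that a hardware barrier is still needed.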
Fix i40e stop receiving on ARM, as the statuses of RX descriptors
are not consistent, which is caused by cacheable hugepages.
Fixes: ae0eb310f253 ("net/i40e: implement vector PMD for ARM")
Cc: sta...@dpdk.org
Signed-off-by: Jianbo Liu
---
drivers/net/i40e/i40e_rxtx_vec_n
d NEON implementation")
>
> Signed-off-by: Guduri Prathyusha
Acked-by: Jianbo Liu
> ---
>
> v2:
>
> * fix as suggested by Jianbo Liu
> ---
> examples/l3fwd/l3fwd_neon.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/examples/l3fwd/l3fw
The 10/29/2017 13:18, Guduri Prathyusha wrote:
> To group consecutive packets with same destination port in bursts of 4
> neon intrinsic data types dp1 and dp2 are calculated such that if
> dst_port[]={a,b,c,d,e,f,g,h,i...} dp1 should contain: and
> dp2 should contain: in the first iteration. dp1
The 10/27/2017 10:01, Dumitrescu, Cristian wrote:
>
>
> > -Original Message-
> > From: Jianbo Liu [mailto:jianbo@arm.com]
> > Sent: Friday, October 27, 2017 3:55 AM
> > To: dev@dpdk.org; Dumitrescu, Cristian
> > Cc: Jianbo Liu
> > Subjec
From: Jianbo Liu
Implement the same hash functions with crc32 on arm platform.
Signed-off-by: Jianbo Liu
---
examples/ip_pipeline/pipeline/hash_func.h | 2 +
examples/ip_pipeline/pipeline/hash_func_arm64.h | 261
2 files changed, 263 insertions(+)
create mode
Hash table function will check if the input bucket size is power of 2,
so the parameter should be rounded up before sending to the creating function.
Signed-off-by: Jianbo Liu
---
examples/ip_pipeline/pipeline/pipeline_flow_classification_be.c | 2 +-
examples/ip_pipeline/pipeline
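The rounding described in this commit message can be sketched as follows. This is a minimal illustration, assuming 32-bit sizes; the sketch_* names are hypothetical and the excerpt does not show which helper the patch actually calls (DPDK itself provides rte_align32pow2() for this purpose).

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if v is a power of two, 0 otherwise. */
static int sketch_is_pow2(uint32_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/* Round v up to the next power of two by smearing the highest set bit
 * of (v - 1) into every lower bit and then adding one. */
static uint32_t sketch_round_up_pow2(uint32_t v)
{
    assert(v != 0 && v <= 0x80000000u);
    v--;
    v |= v >> 1;
    v |= v >> 2;
    v |= v >> 4;
    v |= v >> 8;
    v |= v >> 16;
    return v + 1;
}
```

A caller would pass sketch_round_up_pow2(n_buckets) to the table-creation function, so that a power-of-two check inside it cannot fail for sizes such as 1000.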
t on different platforms,
> the total execution times of the aligned/unaligned memcpy tests are
> provided to allow comparison between platforms.
>
> Signed-off-by: Herbert Guan
Acked-by: Jianbo Liu
> ---
> test/test/test_memcpy_perf.c | 50
> +-
On 24 October 2017 at 23:38, Dumitrescu, Cristian
wrote:
> Hi Jianbo,
>
> ...
>
>> >> > As mentioned in one of our deprecation notices, I am actively working
>> >> > (not ready for 17.8 unfortunately) to add a key mask parameter to these
>> >> functions, so more work on these hash functions is lik
Hi Herbert,
The 10/23/2017 10:35, Herbert Guan wrote:
> The printed time values presented in TSC are not straightforward in
> showing the performance difference. And if the high resolution
> counter is not enabled, time value is too small to show the actual
> performance (e.g. "1 - 1" seems the same
The 10/13/2017 07:19, Jerin Jacob wrote:
> -Original Message-
> > Date: Fri, 13 Oct 2017 09:16:31 +0800
> > From: Jia He
> > To: Jerin Jacob , "Ananyev, Konstantin"
> >
> > Cc: Olivier MATZ , "dev@dpdk.org" ,
> > "jia...@hxt-semitech.com" ,
> > "jie2@hxt-semitech.com" ,
> > "bing.
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -392,6 +392,16 @@ F: drivers/net/mlx5/
> F: doc/guides/nics/mlx5.rst
> F: doc/guides/nics/features/mlx5.ini
>
> +Marvell mrvl
> +M: Jacek Siuda
> +M: Tomasz Duszynski
> +M: Dmitri Epshtein
> +M: Natalie Samsonov
> +
Update my email to jianbo@arm.com.
Signed-off-by: Jianbo Liu
---
MAINTAINERS | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/MAINTAINERS b/MAINTAINERS
index 1b74f98..7c6ec95 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -148,13 +148,13 @@ F: test/test
period != 0) {
> @@ -2444,7 +2455,7 @@ main(int argc, char** argv)
> /* Convert to number of cycles */
> timer_period = stats_period * rte_get_timer_hz();
>
> - while (1) {
> + while (f_quit == 0) {
> cur_time = rte_get_timer_cycles();
> diff_time += cur_time - prev_time;
>
> --
> 2.7.4
>
Acked-by: Jianbo Liu
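The pattern behind the `while (f_quit == 0)` change in the quoted diff can be sketched in isolation as below. This is a simplified stand-in, not testpmd code: the sketch_* names and the iteration cap are hypothetical, and the loop body is reduced to a counter.

```c
#include <assert.h>
#include <signal.h>

/* Flag written by the signal handler and polled by the loop.
 * volatile sig_atomic_t is the type the C standard allows a handler to set. */
static volatile sig_atomic_t sketch_quit;

static void sketch_on_sigint(int signum)
{
    (void)signum;
    sketch_quit = 1;
}

/* Install the handler; returns 0 on success, -1 on failure. */
static int sketch_install_handler(void)
{
    return signal(SIGINT, sketch_on_sigint) == SIG_ERR ? -1 : 0;
}

/* Periodic loop that ends as soon as the flag is set, instead of
 * spinning in while (1). Returns the number of iterations executed. */
static unsigned sketch_stats_loop(unsigned max_iters)
{
    unsigned iters = 0;

    assert(max_iters > 0);
    while (sketch_quit == 0 && iters < max_iters)
        iters++; /* a real loop would print statistics here */
    return iters;
}
```

With `while (1)` the handler can set the flag but nothing ever looks at it, which matches the symptom described in the patch: the process does not quit normally after SIGINT.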
On 18 September 2017 at 11:40, Phil Yang wrote:
> While running testpmd in a container with the stats-period option, it can't
> quit normally after receiving SIGINT.
>
> Signed-off-by: Phil Yang
> ---
> app/test-pmd/testpmd.c | 8 +++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git
On 14 September 2017 at 17:41, Ferruh Yigit wrote:
> On 9/14/2017 6:23 AM, santosh wrote:
>>
>> On Thursday 14 September 2017 09:23 AM, Jianbo Liu wrote:
>>> The kernel patch was merged to support pci resource mapping.
>>> https://patchwork.kernel.org/patch/9677
The kernel patch was merged to support pci resource mapping.
https://patchwork.kernel.org/patch/9677441/
So enable igb_uio in the default arm64 configuration.
v2:
- keep headline format
Signed-off-by: Jianbo Liu
---
config/common_armv8a_linuxapp | 2 --
1 file changed, 2 deletions(-)
diff
The kernel patch was merged to support pci resource mapping.
https://patchwork.kernel.org/patch/9677441/
So enable igb_uio in the default arm64 configuration.
Signed-off-by: Jianbo Liu
---
config/common_armv8a_linuxapp | 2 --
1 file changed, 2 deletions(-)
diff --git a/config
On 13 August 2017 at 15:03, Jerin Jacob wrote:
> Use cntvct_el0 system register to get the system counter frequency.
>
> If the system is configured with RTE_ARM_EAL_RDTSC_USE_PMU then
> return 0 (let the common code calibrate the tsc frequency).
>
> CC: Jianbo Liu
> Signe
@@ -2789,7 +2789,7 @@ struct rte_fdir_conf fdir_conf = {
> static int
> test_balance_l23_tx_burst_ipv4_toggle_ip_addr(void)
> {
> - return balance_l23_tx_burst(0, 1, 1, 0);
> + return balance_l23_tx_burst(0, 1, 0, 1);
> }
>
> static int
> --
> 1.8.3.1
>
Acked-by: Jianbo Liu
+#endif
> +
> /* NEON intrinsic vreinterpretq_u64_p128() is supported since GCC version 7
> */
> static inline uint64x2_t
> vreinterpretq_u64_p128(poly128_t x)
> --
> 1.8.3.1
>
Acked-by: Jianbo Liu
-
> - for (i = 0; i <
> TEST_ADAPTIVE_TRANSMIT_LOAD_BALANCING_RX_BURST_SLAVE_COUNT; i++) {
> - for (j = 0; j < MAX_PKT_BURST; j++) {
> - if (pkt_burst[i][j] != NULL) {
> - rte_pktmbuf_free(pkt_burst[i][j]);
> - pkt_burst[i][j] = NULL;
> - }
> - }
> - }
> -
> -
> /* Clean up and remove slaves from bonded device */
> return remove_slaves_and_stop_bonded_device();
> }
> --
> 1.8.3.1
>
Acked-by: Jianbo Liu
On 9 July 2017 at 01:08, Thomas Monjalon wrote:
> 07/07/2017 18:26, Jerin Jacob:
>> vaddvq_u16() is not available for armv7.
>> Emulate the vaddvq_u16() using armv7 NEON intrinsics.
>
> After implementing this function, another missing function appears:
>
> lib/librte_sched/rte_sched.c:174
uint64x1_t o = vget_low_u64(n) + vget_high_u64(n);
> +
> + return vget_lane_u32((uint32x2_t)o, 0);
> +}
> +
> #endif
>
> #if defined(RTE_TOOLCHAIN_GCC) && (GCC_VERSION < 7)
> --
> 2.13.2
>
Acked-by: Jianbo Liu
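The quoted fragment looks like the tail of a horizontal-add emulation: the two 64-bit halves are added and the low bits returned. A scalar model of such a reduction over eight 16-bit lanes is sketched below. It is portable C rather than NEON code, the function name is hypothetical, and the first two widening steps are an assumption, since the excerpt shows only the final one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar model of summing all eight 16-bit lanes of a vector.
 * The result is truncated to 16 bits, i.e. it wraps modulo 65536. */
static uint16_t sketch_hadd_u16x8(const uint16_t lanes[8])
{
    uint32_t m[4];
    uint64_t n[2];
    uint64_t o;
    size_t i;

    assert(lanes != NULL);

    /* Step 1 (assumed): pairwise add, widening 16-bit lanes to 32 bits. */
    for (i = 0; i < 4; i++)
        m[i] = (uint32_t)lanes[2 * i] + lanes[2 * i + 1];

    /* Step 2 (assumed): pairwise add, widening 32-bit lanes to 64 bits. */
    for (i = 0; i < 2; i++)
        n[i] = (uint64_t)m[2 * i] + m[2 * i + 1];

    /* Step 3 (as in the quoted fragment): add the low and high halves. */
    o = n[0] + n[1];

    return (uint16_t)o;
}
```

Widening before adding keeps the intermediate sums exact; only the final return narrows the value.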
On 4 July 2017 at 21:55, De Lara Guarch, Pablo
wrote:
>
>
>> -Original Message-
>> From: Thomas Monjalon [mailto:tho...@monjalon.net]
>> Sent: Tuesday, July 4, 2017 12:26 AM
>> To: Dumitrescu, Cristian ; De Lara Guarch,
>> Pablo
>>
Use ARM NEON intrinsics to accelerate l3 forwarding.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_em.c| 4 +-
examples/l3fwd/l3fwd_em_hlm.h| 17 ++-
examples/l3fwd/l3fwd_em_hlm_neon.h | 74 ++
examples/l3fwd/l3fwd_em_sequential.h | 18 ++-
examples/l3fwd
As l3fwd_em_sse.h is renamed to l3fwd_em_sequential.h, change the macro
to __L3FWD_EM_SEQUENTIAL_H__ to maintain consistency.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_em_sequential.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/examples/l3fwd
New macro to define how many times of hash lookup in one time, and this
makes the code more concise.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_em_hlm.h | 241 +-
1 file changed, 71 insertions(+), 170 deletions(-)
diff --git a/examples/l3fwd
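The idea of a single macro controlling how many hash lookups are issued per batch can be sketched as follows. The macro name, its value and the struct are hypothetical, since the excerpt does not show the patch body; the sketch only counts how a burst would be split between batched and one-by-one processing.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name and value: number of hash lookups issued per batch.
 * Different architectures could define a different count. */
#define SKETCH_HASH_LOOKUP_COUNT 8

struct sketch_split {
    uint32_t batches;    /* number of full batches processed */
    uint32_t sequential; /* leftover packets processed one by one */
};

/* Split a burst of nb_rx packets into full batches plus a remainder. */
static struct sketch_split sketch_split_burst(uint32_t nb_rx)
{
    struct sketch_split s = {0, 0};
    uint32_t n, i;

    assert(SKETCH_HASH_LOOKUP_COUNT > 0);

    /* Largest multiple of the batch size not exceeding nb_rx. */
    n = nb_rx - (nb_rx % SKETCH_HASH_LOOKUP_COUNT);

    for (i = 0; i < n; i += SKETCH_HASH_LOOKUP_COUNT)
        s.batches++;    /* a real loop would do a multi-lookup here */

    for (; i < nb_rx; i++)
        s.sequential++; /* a real loop would do a single lookup here */

    return s;
}
```

Driving both loops from one macro is what lets per-batch code written out separately for each batch size collapse into a single generic loop.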
Keep x86 related code in l3fwd_sse.h, and move common code to
l3fwd_common.h, which will be used by other Archs.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_common.h | 293 ++
examples/l3fwd/l3fwd_sse.h| 261
The l3fwd_em_sse.h is enabled by NO_HASH_LOOKUP_MULTI.
Renaming it because it's only for sequential hash lookup,
and doesn't include any x86 SSE instructions.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_em.c| 2 +-
examples/l3fwd/{l3fw
Implement vcopyq_laneq_u32 if gcc version is lower than 7.
Signed-off-by: Jianbo Liu
---
lib/librte_eal/common/include/arch/arm/rte_vect.h | 9 +
1 file changed, 9 insertions(+)
diff --git a/lib/librte_eal/common/include/arch/arm/rte_vect.h
b/lib/librte_eal/common/include/arch/arm
Some common code can be used by other ARCHs, move to l3fwd_lpm.c
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_lpm.c | 83 ++
examples/l3fwd/l3fwd_lpm.h | 26 +
examples/l3fwd/l3fwd_lpm_sse.h | 66
Extract common code from l3fwd_em_hlm_sse.h, and add to the new file
l3fwd_em_hlm.h.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_em.c | 2 +-
examples/l3fwd/l3fwd_em_hlm.h | 302 ++
examples/l3fwd/l3fwd_em_hlm_sse.h | 276
in git log
- Ashwin's suggestions for performance on ThunderX
v2:
- change name of l3fwd_em_sse.h to l3fwd_em_sequential.h
- add the times of hash multi-lookup for different Archs
- performance tuning on ThunderX: prefetching, set NO_HASH_LOOKUP_MULTI ...
Jianbo Liu (8):
examples/l3fw
On 5 June 2017 at 16:58, Jerin Jacob wrote:
> CC: Jianbo Liu
> Signed-off-by: Jerin Jacob
> ---
> v2:
> - Removed YIELD instruction comment, as it is implementation
> specific (Jianbo)
> ---
> lib/librte_eal/common/include/arch/arm/rte_pause.h | 4 ++
>
On 18 May 2017 at 18:16, Jerin Jacob wrote:
> -Original Message-
>> Date: Thu, 18 May 2017 17:40:58 +0800
>> From: Jianbo Liu
>> To: Jerin Jacob
>> Cc: dev@dpdk.org, tho...@monjalon.net, Jan Viktorin
>>
>> Subject: Re: [dpdk-dev] [PATCH 3/6]
On 11 May 2017 at 18:10, Jerin Jacob wrote:
> CC: Jianbo Liu
> Signed-off-by: Jerin Jacob
> ---
> lib/librte_eal/common/include/arch/arm/rte_pause.h | 4 ++
> .../common/include/arch/arm/rte_pause_64.h | 55
> ++
> 2 files changed, 59 inserti
On 18 May 2017 at 16:11, Jan Viktorin wrote:
> On Thu, 11 May 2017 15:40:42 +0530
> Jerin Jacob wrote:
>
>> The patch does not provide any functional change for ARM32
>> with respect to existing rte_pause() definition.
>>
>> CC: Jan Viktorin
>> CC: Ji
Implement the same hash functions with crc32 on arm platform.
Signed-off-by: Jianbo Liu
---
examples/ip_pipeline/pipeline/hash_func.h | 2 +
examples/ip_pipeline/pipeline/hash_func_arm64.h | 245
2 files changed, 247 insertions(+)
create mode 100644 examples
On 18 May 2017 at 02:44, Jerin Jacob wrote:
> -Original Message-
>> Date: Wed, 17 May 2017 11:19:49 -0700
>> From: Ashwin Sekhar T K
>> To: jerin.ja...@caviumnetworks.com, john.mcnam...@intel.com,
>> jianbo@linaro.org
>> Cc: dev@dpdk.org, Ashwin Sekhar T K
>> Subject: [dpdk-dev] [PA
As l3fwd_em_sse.h is renamed to l3fwd_em_sequential.h, change the macro
to __L3FWD_EM_SEQUENTIAL_H__ to maintain consistency.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_em_sequential.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/examples/l3fwd
New macro to define how many times of hash lookup in one time, and this
makes the code more concise.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_em_hlm.h | 241 +-
1 file changed, 71 insertions(+), 170 deletions(-)
diff --git a/examples/l3fwd
Use ARM NEON intrinsics to accelerate l3 forwarding.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_em.c| 4 +-
examples/l3fwd/l3fwd_em_hlm.h| 17 ++-
examples/l3fwd/l3fwd_em_hlm_neon.h | 74 ++
examples/l3fwd/l3fwd_em_sequential.h | 18 ++-
examples/l3fwd
Signed-off-by: Jianbo Liu
Some common code can be used by other ARCHs, move to l3fwd_lpm.c
---
examples/l3fwd/l3fwd_lpm.c | 83 ++
examples/l3fwd/l3fwd_lpm.h | 26 +
examples/l3fwd/l3fwd_lpm_sse.h | 66
Implement vcopyq_laneq_u32 if gcc version is lower than 7.
Signed-off-by: Jianbo Liu
---
lib/librte_eal/common/include/arch/arm/rte_vect.h | 9 +
1 file changed, 9 insertions(+)
diff --git a/lib/librte_eal/common/include/arch/arm/rte_vect.h
b/lib/librte_eal/common/include/arch/arm
The l3fwd_em_sse.h is enabled by NO_HASH_LOOKUP_MULTI.
Renaming it because it's only for sequential hash lookup,
and doesn't include any x86 SSE instructions.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_em.c| 2 +-
examples/l3fwd/{l3fw
Keep x86 related code in l3fwd_sse.h, and move common code to
l3fwd_common.h, which will be used by other Archs.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_common.h | 293 ++
examples/l3fwd/l3fwd_sse.h| 255
erent Archs
- performance tuning on ThunderX: prefetching, set NO_HASH_LOOKUP_MULTI ...
Jianbo Liu (8):
examples/l3fwd: extract arch independent code from multi hash lookup
examples/l3fwd: rename l3fwd_em_sse.h to l3fwd_em_sequential.h
examples/l3fwd: extract common code from multi packet
Extract common code from l3fwd_em_hlm_sse.h, and add to the new file
l3fwd_em_hlm.h.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_em.c | 2 +-
examples/l3fwd/l3fwd_em_hlm.h | 302 ++
examples/l3fwd/l3fwd_em_hlm_sse.h | 280
librte_net/net_crc_neon.h | 297
> ++
> lib/librte_net/rte_net_crc.c | 34 ++-
> lib/librte_net/rte_net_crc.h | 2 +
> 5 files changed, 416 insertions(+), 6 deletions(-)
> create mode 100644 lib/librte_net/net_crc_neon.h
>
Acked-by: Jianbo Liu
ekhar T K
> Reviewed-by: Jan Viktorin
> ---
> lib/librte_eal/common/include/rte_common.h | 6 ++
> lib/librte_table/rte_lru.h | 10 ++
> 2 files changed, 8 insertions(+), 8 deletions(-)
>
Acked-by: Jianbo Liu
On 12 May 2017 at 15:25, Sekhar, Ashwin wrote:
> On Fri, 2017-05-12 at 13:51 +0800, Jianbo Liu wrote:
>> On 9 May 2017 at 17:53, Ashwin Sekhar T K
>> wrote:
>> >
>> > Added CRC compute APIs for arm64 utilizing the pmull
>> > capability
>> >
>
644 config/defconfig_arm64-armv8a-linuxapp-clang
>
Acked-by: Jianbo Liu
On 9 May 2017 at 17:53, Ashwin Sekhar T K
wrote:
> Added CRC compute APIs for arm64 utilizing the pmull
> capability
>
> Added new file net_crc_neon.h to hold the arm64 pmull
> CRC implementation
>
> Verified the changes with crc_autotest unit test case
>
> Signed-off-by: Ashwin Sekhar T K
> ---
On 11 May 2017 at 18:27, Sekhar, Ashwin wrote:
> On Thu, 2017-05-11 at 18:01 +0800, Jianbo Liu wrote:
>> On 11 May 2017 at 17:49, Sekhar, Ashwin
>> wrote:
>> >
>> > Hi Jianbo,
>> >
>> > Thanks for v3. Small compilation error. See inline comme
On 11 May 2017 at 17:49, Sekhar, Ashwin wrote:
> Hi Jianbo,
>
> Thanks for v3. Small compilation error. See inline comment. Otherwise
> it looks fine.
>
> On Thu, 2017-05-11 at 17:25 +0800, Jianbo Liu wrote:
>> Use ARM NEON intrinsics to accelerate l3 forwarding.
>>
New macro to define how many times of hash lookup in one time, and this
makes the code more concise.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_em_hlm.h | 241 +-
1 file changed, 71 insertions(+), 170 deletions(-)
diff --git a/examples/l3fwd
As l3fwd_em_sse.h is renamed to l3fwd_em_sequential.h, change the macro
to __L3FWD_EM_SEQUENTIAL_H__ to maintain consistency.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_em_sequential.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/examples/l3fwd
Use ARM NEON intrinsics to accelerate l3 forwarding.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_em.c| 4 +-
examples/l3fwd/l3fwd_em_hlm.h| 17 ++-
examples/l3fwd/l3fwd_em_hlm_neon.h | 74 ++
examples/l3fwd/l3fwd_em_sequential.h | 18 ++-
examples/l3fwd
ching, set NO_HASH_LOOKUP_MULTI ...
Jianbo Liu (7):
examples/l3fwd: extract arch independent code from multi hash lookup
examples/l3fwd: rename l3fwd_em_sse.h to l3fwd_em_sequential.h
examples/l3fwd: extract common code from multi packet send
examples/l3fwd: rearrange the code for lpm_
The l3fwd_em_sse.h is enabled by NO_HASH_LOOKUP_MULTI.
Renaming it because it's only for sequential hash lookup,
and doesn't include any x86 SSE instructions.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_em.c| 2 +-
examples/l3fwd/{l3fw
Extract common code from l3fwd_em_hlm_sse.h, and add to the new file
l3fwd_em_hlm.h.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_em.c | 2 +-
examples/l3fwd/l3fwd_em_hlm.h | 302 ++
examples/l3fwd/l3fwd_em_hlm_sse.h | 280
Signed-off-by: Jianbo Liu
Some common code can be used by other ARCHs, move to l3fwd_lpm.c
---
examples/l3fwd/l3fwd_lpm.c | 83 ++
examples/l3fwd/l3fwd_lpm.h | 26 +
examples/l3fwd/l3fwd_lpm_sse.h | 66
Keep x86 related code in l3fwd_sse.h, and move common code to
l3fwd_common.h, which will be used by other Archs.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_common.h | 293 ++
examples/l3fwd/l3fwd_sse.h| 255
On 11 May 2017 at 12:27, Sekhar, Ashwin wrote:
>
> On Thu, 2017-05-11 at 04:14 +, Sekhar, Ashwin wrote:
> ...
>> > > Combining all the above comments, I made some changes on top of
>> > > your
>> > > patch. These changes are giving 3-4% improvement over your
>> > > version.
>> > >
>> > > You m
which helped improve performance on my
> Thunderx setup. For details see comments inline.
>
>
> On Wed, 2017-05-10 at 10:30 +0800, Jianbo Liu wrote:
>> Use ARM NEON intrinsics to accelerate l3 forwarding.
>>
>> Signed-off-by: Jianbo Liu
>> ---
>> e
Hi Ashwin,
On 9 May 2017 at 16:10, Sekhar, Ashwin wrote:
> On Fri, 2017-05-05 at 13:43 +0800, Jianbo Liu wrote:
>> On 5 May 2017 at 12:24, Sekhar, Ashwin
>> wrote:
>> >
>> > On Thu, 2017-05-04 at 16:42 +0800, Jianbo Liu wrote:
>> > >
>>
New macro to define how many times of hash lookup in one time, and this
makes the code more concise.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_em_hlm.h | 241 +-
1 file changed, 71 insertions(+), 170 deletions(-)
diff --git a/examples/l3fwd
Signed-off-by: Jianbo Liu
Some common code can be used by other ARCHs, move to l3fwd_lpm.c
---
examples/l3fwd/l3fwd_lpm.c | 83 ++
examples/l3fwd/l3fwd_lpm.h | 26 +
examples/l3fwd/l3fwd_lpm_sse.h | 66
As l3fwd_em_sse.h is renamed to l3fwd_em_sequential.h, change the macro
to __L3FWD_EM_SEQUENTIAL_H__ to maintain consistency.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_em_sequential.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/examples/l3fwd
Use ARM NEON intrinsics to accelerate l3 forwarding.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_em.c| 4 +-
examples/l3fwd/l3fwd_em_hlm.h| 19 ++-
examples/l3fwd/l3fwd_em_hlm_neon.h | 74 ++
examples/l3fwd/l3fwd_em_sequential.h | 20 ++-
examples/l3fwd
The l3fwd_em_sse.h is enabled by NO_HASH_LOOKUP_MULTI.
Renaming it because it's only for sequential hash lookup,
and doesn't include any x86 SSE instructions.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_em.c| 2 +-
examples/l3fwd/{l3fw
Extract common code from l3fwd_em_hlm_sse.h, and add to the new file
l3fwd_em_hlm.h.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_em.c | 2 +-
examples/l3fwd/l3fwd_em_hlm.h | 302 ++
examples/l3fwd/l3fwd_em_hlm_sse.h | 280
v2:
- change name of l3fwd_em_sse.h to l3fwd_em_sequential.h
- add the times of hash multi-lookup for different Archs
- performance tuning on ThunderX: prefetching, set NO_HASH_LOOKUP_MULTI ...
Jianbo Liu (7):
examples/l3fwd: extract arch independent code from multi hash lookup
examples
Keep x86 related code in l3fwd_sse.h, and move common code to
l3fwd_common.h, which will be used by other Archs.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_common.h | 293 ++
examples/l3fwd/l3fwd_sse.h| 255
On 5 May 2017 at 12:24, Sekhar, Ashwin wrote:
> On Thu, 2017-05-04 at 16:42 +0800, Jianbo Liu wrote:
>> Hi Ashwin,
>>
>> On 3 May 2017 at 13:24, Jianbo Liu wrote:
>> >
>> > Hi Ashwin,
>> >
>> > On 2 May 2017 at 19:47, Sekhar, Ashwin
>
Hi Ashwin,
On 3 May 2017 at 13:24, Jianbo Liu wrote:
> Hi Ashwin,
>
> On 2 May 2017 at 19:47, Sekhar, Ashwin wrote:
>> Hi Jianbo,
>>
>> I tested your neon changes on thunderx. I am seeing a performance
>> regression of ~10% for LPM case and ~20% for EM case
Hi Ashwin,
On 2 May 2017 at 19:47, Sekhar, Ashwin wrote:
> Hi Jianbo,
>
> I tested your neon changes on thunderx. I am seeing a performance
> regression of ~10% for LPM case and ~20% for EM case with your changes.
> Did you see improvement on any arm64 platform with these changes. If
> yes, how m
On 2 May 2017 at 14:41, Jerin Jacob wrote:
> -Original Message-
>> Date: Mon, 1 May 2017 22:59:53 -0700
>> From: Ashwin Sekhar T K
>> To: byron.mar...@intel.com, pablo.de.lara.gua...@intel.com,
>> jerin.ja...@caviumnetworks.com, jianbo@linaro.org
>> Cc: dev@dpdk.org, Ashwin Sekhar T
Keep x86 related code in l3fwd_sse.h, and move common code to
l3fwd_common.h, which will be used by other Archs.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd_common.h | 293 ++
examples/l3fwd/l3fwd_sse.h| 255
Use ARM NEON intrinsics to accelerate l3 forwarding.
Signed-off-by: Jianbo Liu
---
examples/l3fwd/l3fwd.h | 4 -
examples/l3fwd/l3fwd_em.c | 4 +-
examples/l3fwd/l3fwd_em_hlm.h | 5 +
examples/l3fwd/l3fwd_em_hlm_neon.h | 74 +++
examples/l3fwd