Re: [PATCH v6 7/7] vhost: optimize memcpy routines when cc memcpy is used

2024-10-10 Thread Mattias Rönnblom
On 2024-10-09 23:57, Stephen Hemminger wrote:
On Fri, 20 Sep 2024 12:27:16 +0200 Mattias Rönnblom wrote:
+#if defined(RTE_USE_CC_MEMCPY) && defined(RTE_ARCH_X86_64)
+static __rte_always_inline void
+pktcpy(void *restrict in_dst, const void *restrict in_src, size_t len)
+{
+ void *dst = _

Re: [PATCH v6 7/7] vhost: optimize memcpy routines when cc memcpy is used

2024-10-10 Thread Mattias Rönnblom
On 2024-10-09 23:25, Morten Brørup wrote:
+#if defined(RTE_USE_CC_MEMCPY) && defined(RTE_ARCH_X86_64)
+static __rte_always_inline void
+pktcpy(void *restrict in_dst, const void *restrict in_src, size_t len)
+{
A comment describing why batch_copy_elem.dst and src point to 16 byte aligned data w

RE: [PATCH v6 7/7] vhost: optimize memcpy routines when cc memcpy is used

2024-10-09 Thread Morten Brørup
> +#if defined(RTE_USE_CC_MEMCPY) && defined(RTE_ARCH_X86_64)
> +static __rte_always_inline void
> +pktcpy(void *restrict in_dst, const void *restrict in_src, size_t len)
> +{

A comment describing why batch_copy_elem.dst and src point to 16 byte aligned data would be nice.

> + void *dst = __

Re: [PATCH v6 7/7] vhost: optimize memcpy routines when cc memcpy is used

2024-10-03 Thread Maxime Coquelin
On 9/20/24 12:27, Mattias Rönnblom wrote:
In builds where use_cc_memcpy is set to true, the vhost user PMD suffers a large performance drop on Intel P-cores for small packets, at least when built by GCC and (to a much lesser extent) clang. This patch addresses that issue by using a custom virt

[PATCH v6 7/7] vhost: optimize memcpy routines when cc memcpy is used

2024-09-20 Thread Mattias Rönnblom
In builds where use_cc_memcpy is set to true, the vhost user PMD suffers a large performance drop on Intel P-cores for small packets, at least when built by GCC and (to a much lesser extent) clang. This patch addresses that issue by using a custom virtio memcpy()-based packet copying routine. Perf
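
For readers skimming the thread, here is a minimal sketch of what a memcpy()-based small-packet copy routine could look like. It is not the body of the patch's pktcpy(), which is truncated in the previews above. The name small_pkt_copy, the 256-byte cutoff and the 32-byte chunk size are assumptions made for illustration. The 16-byte alignment requirement mirrors the review comment about batch_copy_elem.dst and src. __builtin_assume_aligned is a GCC/clang builtin.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical thresholds, chosen for illustration only. */
#define SMALL_COPY_MAX 256
#define CHUNK 32

/*
 * Hypothetical helper, not the patch's pktcpy(): copy len bytes between
 * non-overlapping buffers that the caller guarantees are 16-byte aligned.
 */
static inline void
small_pkt_copy(void *restrict in_dst, const void *restrict in_src, size_t len)
{
	/* Debug-time check of the caller's alignment guarantee. */
	assert(((uintptr_t)in_dst & 15) == 0);
	assert(((uintptr_t)in_src & 15) == 0);

	/* Tell the compiler about the alignment so it can pick aligned moves. */
	unsigned char *dst = __builtin_assume_aligned(in_dst, 16);
	const unsigned char *src = __builtin_assume_aligned(in_src, 16);

	if (len <= SMALL_COPY_MAX) {
		size_t left;

		/* Constant-size memcpy() calls are usually expanded inline. */
		for (left = len; left >= CHUNK; left -= CHUNK) {
			memcpy(dst, src, CHUNK);
			dst += CHUNK;
			src += CHUNK;
		}

		/* Tail of fewer than CHUNK bytes (possibly zero). */
		memcpy(dst, src, left);
	} else {
		/* Larger copies go straight to the C library routine. */
		memcpy(dst, src, len);
	}
}
```

The idea behind this shape is that fixed-size memcpy() calls give the compiler a known length to inline, while the variable-length call is kept only for the short tail and for large copies. Whether this helps on a given CPU and compiler would need to be measured.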