> -----Original Message-----
> From: Zhang, Qi Z
> Sent: Tuesday, November 20, 2018 4:58 PM
> To: Ananyev, Konstantin <konstantin.anan...@intel.com>; Richardson, Bruce 
> <bruce.richard...@intel.com>; Wiles, Keith
> <keith.wi...@intel.com>
> Cc: dev@dpdk.org; Lu, Wenzhuo <wenzhuo...@intel.com>; Iremonger, Bernard 
> <bernard.iremon...@intel.com>; sta...@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH] app/testpmd: improve MAC swap performance
> 
> 
> 
> > -----Original Message-----
> > From: Ananyev, Konstantin
> > Sent: Tuesday, November 20, 2018 1:17 AM
> > To: Zhang, Qi Z <qi.z.zh...@intel.com>; Richardson, Bruce
> > <bruce.richard...@intel.com>; Wiles, Keith <keith.wi...@intel.com>
> > Cc: dev@dpdk.org; Lu, Wenzhuo <wenzhuo...@intel.com>; Iremonger, Bernard
> > <bernard.iremon...@intel.com>; Zhang, Qi Z <qi.z.zh...@intel.com>;
> > sta...@dpdk.org
> > Subject: RE: [dpdk-dev] [PATCH] app/testpmd: improve MAC swap performance
> >
> > Hi Qi,
> >
> > > -----Original Message-----
> > > From: dev [mailto:dev-boun...@dpdk.org] On Behalf Of Qi Zhang
> > > Sent: Tuesday, November 20, 2018 4:46 AM
> > > To: Richardson, Bruce <bruce.richard...@intel.com>; Wiles, Keith
> > > <keith.wi...@intel.com>
> > > Cc: dev@dpdk.org; Lu, Wenzhuo <wenzhuo...@intel.com>; Iremonger,
> > > Bernard <bernard.iremon...@intel.com>; Zhang, Qi Z
> > > <qi.z.zh...@intel.com>; sta...@dpdk.org
> > > Subject: [dpdk-dev] [PATCH] app/testpmd: improve MAC swap performance
> > >
> > > > The patch optimizes the MAC swap operation by taking advantage of SSE
> > > > instructions; it only affects the x86 platform.
> > >
> > > Cc: sta...@dpdk.org
> > >
> > > Signed-off-by: Qi Zhang <qi.z.zh...@intel.com>
> > > ---
> > >  app/test-pmd/macswap.c | 16 +++++++++++++++-
> > >  1 file changed, 15 insertions(+), 1 deletion(-)
> > >
> > > > diff --git a/app/test-pmd/macswap.c b/app/test-pmd/macswap.c
> > > > index a8384d5b8..0722782b0 100644
> > > --- a/app/test-pmd/macswap.c
> > > +++ b/app/test-pmd/macswap.c
> > > @@ -78,7 +78,6 @@ pkt_burst_mac_swap(struct fwd_stream *fs)
> > >   struct rte_port  *txp;
> > >   struct rte_mbuf  *mb;
> > >   struct ether_hdr *eth_hdr;
> > > - struct ether_addr addr;
> > >   uint16_t nb_rx;
> > >   uint16_t nb_tx;
> > >   uint16_t i;
> > > @@ -95,6 +94,15 @@ pkt_burst_mac_swap(struct fwd_stream *fs)
> > >   start_tsc = rte_rdtsc();
> > >  #endif
> > >
> > > +#ifdef RTE_ARCH_X86
> > > + __m128i addr;
> > > + __m128i shfl_msk = _mm_set_epi8(15, 14, 13, 12,
> > > +                                 5, 4, 3, 2,
> > > +                                 1, 0, 11, 10,
> > > +                                 9, 8, 7, 6);
> > > +#else
> > > + struct ether_addr addr;
> > > +#endif
> >
> > I think it would be better to place IA specific code into a separate function
> > (and probably into a separate .h file).
> 
> OK, I will think about how to rework this.

Ideally it would be good to have a generic one and an IA-optimized version.
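
A minimal sketch of that split, not taken from the patch. The function
names (mac_swap_generic, mac_swap_x86, mac_swap) are hypothetical, the
header is treated as a raw byte pointer instead of struct ether_hdr, and
the fast path is guarded by the compiler's __SSSE3__ macro rather than
RTE_ARCH_X86, since _mm_shuffle_epi8 is an SSSE3 intrinsic. The x86
version assumes at least 16 readable and writable bytes at the header
address.

```c
/* Sketch only: the names below are hypothetical, not part of DPDK. */
#include <assert.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSSE3__)
#include <tmmintrin.h>	/* _mm_shuffle_epi8 (SSSE3) */
#endif

#define MAC_LEN 6

/* Generic version: swap bytes 0..5 with bytes 6..11 via a temporary. */
static inline void
mac_swap_generic(uint8_t *hdr)
{
	uint8_t tmp[MAC_LEN];

	memcpy(tmp, hdr, MAC_LEN);
	memcpy(hdr, hdr + MAC_LEN, MAC_LEN);
	memcpy(hdr + MAC_LEN, tmp, MAC_LEN);
}

#if defined(__SSSE3__)
/*
 * x86 version: one 16-byte load, one shuffle, one 16-byte store.
 * Assumes 16 accessible bytes at hdr; bytes 12..15 are written back
 * unchanged.
 */
static inline void
mac_swap_x86(uint8_t *hdr)
{
	const __m128i shfl_msk = _mm_set_epi8(15, 14, 13, 12,
					      5, 4, 3, 2,
					      1, 0, 11, 10,
					      9, 8, 7, 6);
	__m128i v = _mm_loadu_si128((const __m128i *)hdr);

	v = _mm_shuffle_epi8(v, shfl_msk);
	_mm_storeu_si128((__m128i *)hdr, v);
}
#endif

/* Single entry point, so the caller carries no #ifdef. */
static inline void
mac_swap(uint8_t *hdr)
{
	assert(hdr != NULL);
#if defined(__SSSE3__)
	mac_swap_x86(hdr);
#else
	mac_swap_generic(hdr);
#endif
}
```

With something like this in a separate header, the forwarding loop would
call mac_swap() once per packet and macswap.c itself would stay free of
architecture checks.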

> 
> > BTW, just curious what % of improvement it gives?
> 
> So far, the only server I can test on is a 1.6GHz Broadwell server with 2 ports
> on 1 i40e 25G.
> The macswap performance increases from 16.8 Mpps to 20 Mpps (about a 19%
> improvement).

Quite a lot, it definitely looks worth it.

> 
> > Konstantin
> >
> >
> > >   /*
> > >    * Receive a burst of packets and forward them.
> > >    */
> > > @@ -123,9 +131,15 @@ pkt_burst_mac_swap(struct fwd_stream *fs)
> > >           eth_hdr = rte_pktmbuf_mtod(mb, struct ether_hdr *);
> > >
> > >           /* Swap dest and src mac addresses. */
> > > +#ifdef RTE_ARCH_X86
> > > +         addr = _mm_loadu_si128((__m128i *)eth_hdr);
> > > +         addr = _mm_shuffle_epi8(addr, shfl_msk);
> > > > +         _mm_storeu_si128((__m128i *)eth_hdr, addr);
> > > > +#else
> > >           ether_addr_copy(&eth_hdr->d_addr, &addr);
> > >           ether_addr_copy(&eth_hdr->s_addr, &eth_hdr->d_addr);
> > >           ether_addr_copy(&addr, &eth_hdr->s_addr);
> > > +#endif
> > >
> > >           mb->ol_flags &= IND_ATTACHED_MBUF | EXT_ATTACHED_MBUF;
> > >           mb->ol_flags |= ol_flags;
> > > --
> > > 2.13.6
