On Tue, Feb 10, 2026 at 10:28:10AM +0100, Morten Brørup wrote:
> > From: Bruce Richardson [mailto:[email protected]]
> > Sent: Tuesday, 10 February 2026 10.04
> > 
> > On Tue, Feb 10, 2026 at 12:08:44AM +0100, Morten Brørup wrote:
> > > > +static inline void
> > > > +write_txd(volatile void *txd, uint64_t qw0, uint64_t qw1)
> > > > +{
> > > > +       uint64_t *txd_qw = __rte_assume_aligned(RTE_CAST_PTR(void *, txd), 16);
> > > > +
> > > > +       txd_qw[0] = rte_cpu_to_le_64(qw0);
> > > > +       txd_qw[1] = rte_cpu_to_le_64(qw1);
> > > > +}
> > >
> > > How about using __rte_aligned() instead, something like this
> > (untested):
> > >
> > > struct __rte_aligned(16) txd_t {
> > >   uint64_t        qw0;
> > >   uint64_t        qw1;
> > > };
> > 
> > I can see if this works for us...
> > 
> > >
> > > *RTE_CAST_PTR(volatile struct txd_t *, txd) = (struct txd_t){ rte_cpu_to_le_64(qw0), rte_cpu_to_le_64(qw1) };
> > >
> > >
> > > And why strip the "volatile"?
> > >
> > 
> > For the descriptor writes, it doesn't matter in which order the
> > descriptors and the descriptor fields are actually written, since the NIC
> > relies upon the tail pointer update - which includes a fence - to inform
> > it of when the descriptors are ready. The volatile is necessary for
> > reads, though, which is why the ring is marked as such, but for Tx it
> > prevents the compiler from opportunistically e.g. converting two 64-bit
> > writes into a 128-bit write.
> 
> Makes sense.
> Suggest that you spread out a few comments about this at the relevant 
> locations in the source code.
> 
Adding an explanation as part of the write_txd function, which is where the
volatile gets cast away.

/Bruce
