> From: Bruce Richardson [mailto:[email protected]]
> Sent: Tuesday, 10 February 2026 10.04
> 
> On Tue, Feb 10, 2026 at 12:08:44AM +0100, Morten Brørup wrote:
> > > +static inline void
> > > +write_txd(volatile void *txd, uint64_t qw0, uint64_t qw1)
> > > +{
> > > + uint64_t *txd_qw = __rte_assume_aligned(RTE_CAST_PTR(void *, txd), 16);
> > > +
> > > + txd_qw[0] = rte_cpu_to_le_64(qw0);
> > > + txd_qw[1] = rte_cpu_to_le_64(qw1);
> > > +}
> >
> > How about using __rte_aligned() instead, something like this (untested):
> >
> > struct __rte_aligned(16) txd_t {
> >     uint64_t        qw0;
> >     uint64_t        qw1;
> > };
> 
> I can see if this works for us...
> 
> >
> > *RTE_CAST_PTR(volatile struct txd_t *, txd) = (struct txd_t){
> >     rte_cpu_to_le_64(qw0), rte_cpu_to_le_64(qw1) };
> >
> >
> > And why strip the "volatile"?
> >
> 
> For the descriptor writes, the order in which the descriptors and the
> descriptor fields are actually written doesn't matter, since the NIC relies
> upon the tail pointer update - which includes a fence - to inform it of when
> the descriptors are ready. The volatile is necessary for reads, though, which
> is why the ring is marked as such, but for Tx it prevents the compiler from
> opportunistically e.g. converting two 64-bit writes into a 128-bit write.

Makes sense.
I suggest adding a few comments about this at the relevant locations
in the source code.
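
As an illustration of the kind of comment meant here, a standalone sketch
(untested against the driver) could look like the following. It uses plain
C11 and the GCC/Clang builtin __builtin_assume_aligned() in place of the DPDK
macros, omits the rte_cpu_to_le_64() conversion (i.e. it assumes a
little-endian host), and the names struct tx_desc and write_txd_sketch() are
hypothetical:

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16-byte Tx descriptor: two 64-bit quadwords. */
struct tx_desc {
	alignas(16) uint64_t qw0;
	uint64_t qw1;
};

static_assert(sizeof(struct tx_desc) == 16, "descriptor must be 16 bytes");
static_assert(alignof(struct tx_desc) == 16, "descriptor must be 16-byte aligned");

/*
 * Write one Tx descriptor.
 *
 * The ring is declared volatile because descriptor reads need it.
 * For Tx writes the qualifier is deliberately cast away: the NIC only
 * looks at the descriptors after the tail pointer update, which includes
 * a fence, so the order of the individual stores does not matter.
 * Without volatile, and with the 16-byte alignment hint, the compiler is
 * free to merge the two 64-bit stores into a single 128-bit store.
 *
 * Byte-order conversion is omitted here (little-endian host assumed).
 */
static inline void
write_txd_sketch(volatile void *txd, uint64_t qw0, uint64_t qw1)
{
	uint64_t *txd_qw = __builtin_assume_aligned((void *)txd, 16);

	txd_qw[0] = qw0;
	txd_qw[1] = qw1;
}
```

Whether the stores actually get merged depends on the compiler and the
target flags; the comment only documents why merging is permitted.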

