> @@ -394,25 +394,30 @@ static void __packet_set_status(struct packet_sock *po, > void *frame, int status) > static int __packet_get_status(struct packet_sock *po, void *frame) > { > union tpacket_uhdr h; > - > - smp_rmb(); > + int status; > > h.raw = frame; > switch (po->tp_version) { > case TPACKET_V1: > flush_dcache_page(pgv_to_page(&h.h1->tp_status)); > - return h.h1->tp_status; > + status = h.h1->tp_status; > + break; > case TPACKET_V2: > flush_dcache_page(pgv_to_page(&h.h2->tp_status)); > - return h.h2->tp_status; > + status = h.h2->tp_status; > + break; > case TPACKET_V3: > flush_dcache_page(pgv_to_page(&h.h3->tp_status)); > - return h.h3->tp_status; > + status = h.h3->tp_status; > + break; > default: > WARN(1, "TPACKET version not supported.\n"); > BUG(); > - return 0; > + status = 0; > + break; > } > + smp_rmb(); > + return status; > } > > static __u32 tpacket_get_timestamp(struct sk_buff *skb, struct timespec *ts, > -- > 2.17.1 > > Sorry for a lack of information- it is the first my patch. > > The opposie smp_wmb() is in: > https://elixir.bootlin.com/linux/v4.15/source/net/packet/af_packet.c#L415
__packet_get_status in the common path synchronizes with userspace code, not with __packet_set_status. See __v2_tx_user_ready in tools/testing/selftests/net/psock_tpacket.c from an example. On receive, the smp_wmb in tpacket_rcv ensures that a frame is filled before the kernel updates the status field to TP_STATUS_USER. So that is not the purpose of the smp_wmb in __packet_set_status. Digging through the original PACKET_TX_RING discussion (below) it appears that the intent of that smp_wmb was intended either to ensure that the frame was fully written before freeing the skb (when called from tpacket_destruct_skb) or even to ensure that the frame was fully written before calling flush_dcache_page on the underlying shared page. The latter would explain its analogue in __packet_get_status, though it is not implemented as a barrier between the field access and page flush in either function in the final patch. >From http://lists.openwall.net/netdev/2008/10/31/70: >> Also, when you set the status back to TP_STATUS_KERNEL in the >> destructor, you need to add the following barriers: >> >> __packet_set_status(po, ph, TP_STATUS_KERNEL); >> smp_mb(); // make sure the TP_STATUS_KERNEL was actually written to >> memory before this - couldn't this actually be just a smp_wmb? >> flush_dcache_page(virt_to_page(ph)); // on non-x86 architectures like >> ARM that have a moronic cache (i.e cache by virtual rather than >> physical address). on x86 this is a noop. >> > > So, If my understanding of those memory barriers is correct, we should > have a smp_rmb() before status reading and smp_wmb() after status > writing in skb destructor and send procedure. > On my eye smp_rmb() should be moved *after* actual read of status to be > ensured that no read after __packet_get_status() happens before actual read > of status. > That sounds reasonable. But we should understand the intent of the existing code before making changes. That was not the purpose of this barrier, it appears.