On Sun, May 31, 2020 at 11:46:49PM +0200, Lorenzo Bianconi wrote:
> +
> + prog = READ_ONCE(rcpu->prog);
> for (i = 0; i < n; i++) {
> - void *f = frames[i];
> + void *f = xdp_frames[i];
> struct page *page = virt_to_page(f);
> + struct xdp_frame *xdpf;
> + struct xdp_buff xdp;
> + u32 act;
> + int err;
>
> /* Bring struct page memory area to curr CPU. Read by
> * build_skb_around via page_is_pfmemalloc(), and when
> * freed written by page_frag_free call.
> */
> prefetchw(page);
> + if (!prog) {
> + frames[nframes++] = xdp_frames[i];
> + continue;
> + }
I'm not sure the compiler will be smart enough to hoist the !prog check out of the loop.
Otherwise the default cpumap case will be a bit slower.
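If the compiler doesn't hoist it, the split can be done by hand. A rough
user-space sketch of the idea (simplified stand-in, not the real cpumap code:
hoisted_filter and the always-pass verdict are placeholders):

```c
#include <stddef.h>

/* Check prog once, then run one of two loops: the no-prog path keeps
 * zero per-frame branches, matching today's cpumap fast path.
 */
static int hoisted_filter(void *prog, void **xdp_frames,
			  void **frames, int n)
{
	int nframes = 0;
	int i;

	if (!prog) {
		/* No program attached: pass every frame through. */
		for (i = 0; i < n; i++)
			frames[nframes++] = xdp_frames[i];
		return nframes;
	}

	for (i = 0; i < n; i++) {
		/* Here the real code would run prog on xdp_frames[i]
		 * and keep the frame only on XDP_PASS; this sketch
		 * just passes everything through as a placeholder.
		 */
		frames[nframes++] = xdp_frames[i];
	}
	return nframes;
}
```

That keeps the default (no prog) case branch-free per frame regardless of
what the compiler decides to do.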
I'd like to see performance numbers before/after and acks from folks
who are using cpumap before applying.
Also, please add a selftest for it; samples/bpf/ in patch 6 is not enough.
Other than the above, the feature looks good to me. It nicely complements devmap.