Jesper Dangaard Brouer <bro...@redhat.com> writes:

> On Tue, 26 May 2020 22:05:38 +0800
> Hangbin Liu <liuhang...@gmail.com> wrote:
>
>> diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
>> index a51d9fb7a359..ecc5c44a5bab 100644
>> --- a/kernel/bpf/devmap.c
>> +++ b/kernel/bpf/devmap.c
> [...]
>
>> +int dev_map_enqueue_multi(struct xdp_buff *xdp, struct net_device *dev_rx,
>> +                      struct bpf_map *map, struct bpf_map *ex_map,
>> +                      bool exclude_ingress)
>> +{
>> +    struct bpf_dtab_netdev *obj = NULL;
>> +    struct xdp_frame *xdpf, *nxdpf;
>> +    struct net_device *dev;
>> +    bool first = true;
>> +    u32 key, next_key;
>> +    int err;
>> +
>> +    devmap_get_next_key(map, NULL, &key);
>> +
>> +    xdpf = convert_to_xdp_frame(xdp);
>> +    if (unlikely(!xdpf))
>> +            return -EOVERFLOW;
>> +
>> +    for (;;) {
>> +            switch (map->map_type) {
>> +            case BPF_MAP_TYPE_DEVMAP:
>> +                    obj = __dev_map_lookup_elem(map, key);
>> +                    break;
>> +            case BPF_MAP_TYPE_DEVMAP_HASH:
>> +                    obj = __dev_map_hash_lookup_elem(map, key);
>> +                    break;
>> +            default:
>> +                    break;
>> +            }
>> +
>> +            if (!obj || dev_in_exclude_map(obj, ex_map,
>> +                                           exclude_ingress ? 
>> dev_rx->ifindex : 0))
>> +                    goto find_next;
>> +
>> +            dev = obj->dev;
>> +
>> +            if (!dev->netdev_ops->ndo_xdp_xmit)
>> +                    goto find_next;
>> +
>> +            err = xdp_ok_fwd_dev(dev, xdp->data_end - xdp->data);
>> +            if (unlikely(err))
>> +                    goto find_next;
>> +
>> +            if (!first) {
>> +                    nxdpf = xdpf_clone(xdpf);
>> +                    if (unlikely(!nxdpf))
>> +                            return -ENOMEM;
>> +
>> +                    bq_enqueue(dev, nxdpf, dev_rx);
>> +            } else {
>> +                    bq_enqueue(dev, xdpf, dev_rx);
>
> This looks racy.  You enqueue the original frame, and then later
> xdpf_clone it.  The original frame might have been freed at that
> point.

This was actually my suggestion; on the assumption that bq_enqueue()
just puts the frame on a list that won't be flushed until we exit the
NAPI loop.

But I guess now that you mention it that bq_enqueue() may flush the
queue, so you're right that this won't work. Sorry about that, Hangbin :/

Jesper, the reason I suggested this was to avoid an "extra" copy (i.e.,
if we have two destinations, ideally we should only clone once instead
of twice). Got any clever ideas for a safe way to achieve this? :)

-Toke

Reply via email to