Jakub Kicinski <jakub.kicin...@netronome.com> [Mon, 2018-07-09 11:01 -0700]:
> More advanced applications may want to only replace programs without
> destroying associated maps.  Allow libbpf users to achieve that.
> Instead of always creating all of the maps at load time, expose to
> users an API to reconstruct the map object from already existing
> map.
> 
> The map parameters are read from the kernel and replace the parameters
> of the ELF map.  libbpf does not restrict the map replacement, i.e.
> the reused map does not have to be compatible with the ELF map
> definition.  We rely on the verifier for checking the compatibility
> between maps and programs.  The ELF map definition is completely
> overwritten by the information read from the kernel, to make sure
> libbpf's view of map object corresponds to the actual map.

Thanks for working on this, Jakub! I ran into this shortcoming of
libbpf as well and was planning to fix it, but you beat me to it :)


> Signed-off-by: Jakub Kicinski <jakub.kicin...@netronome.com>
> Reviewed-by: Quentin Monnet <quentin.mon...@netronome.com>
> ---
>  tools/lib/bpf/libbpf.c | 35 +++++++++++++++++++++++++++++++++++
>  tools/lib/bpf/libbpf.h |  1 +
>  2 files changed, 36 insertions(+)
> 
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index b653dbb266c7..c80033fe66c3 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -215,6 +215,7 @@ struct bpf_map {
>       int fd;
>       char *name;
>       size_t offset;
> +     bool fd_preset;

Any reason not to use map->fd itself to identify if fd is present?

The fd of every map is set to -1 in bpf_object__init_maps(), which, in
turn, is called from __bpf_object__open():

        for (i = 0; i < nr_maps; i++)
                obj->maps[i].fd = -1;

Later it will contain either a valid fd (>= 0) or that same -1, which
should be enough to tell whether an fd is present.
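
To make the suggestion concrete, here is a minimal sketch of treating
fd >= 0 as the "preset" marker. It is not libbpf code: struct demo_map
and the demo_* helpers are made up for illustration, and only the -1
initialisation mirrors what the library does today.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for libbpf's private struct bpf_map. */
struct demo_map {
	int fd;		/* -1 until a map is created or reused */
};

/* Same initialisation as at open time: no map owns an fd yet. */
static void demo_init_maps(struct demo_map *maps, size_t nr_maps)
{
	for (size_t i = 0; i < nr_maps; i++)
		maps[i].fd = -1;
}

/* A map counts as preset exactly when it already holds a valid fd. */
static bool demo_map_has_fd(const struct demo_map *map)
{
	return map->fd >= 0;
}

/* Counts the maps a create loop would still have to create. */
static size_t demo_count_maps_to_create(const struct demo_map *maps,
					size_t nr_maps)
{
	size_t n = 0;

	for (size_t i = 0; i < nr_maps; i++) {
		if (demo_map_has_fd(&maps[i]))
			continue;	/* skip map create (preset) */
		n++;
	}
	return n;
}
```

With a check like demo_map_has_fd() in the create loop, the extra bool
in the struct would not be needed.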


>       int map_ifindex;
>       struct bpf_map_def def;
>       uint32_t btf_key_type_id;
> @@ -1082,6 +1083,34 @@ static int bpf_map_find_btf_info(struct bpf_map *map, const struct btf *btf)
>       return 0;
>  }
>  
> +int bpf_map__reuse_fd(struct bpf_map *map, int fd)
> +{
> +     struct bpf_map_info info = {};
> +     __u32 len = sizeof(info);
> +     int err;
> +
> +     err = bpf_obj_get_info_by_fd(fd, &info, &len);
> +     if (err)
> +             return err;
> +

Should there be a check that map->fd doesn't already hold a valid fd
(>= 0) before overwriting it? Otherwise, if it does (e.g. because the
function is called after bpf_object__load() by mistake), the current
map->fd will be leaked.
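
Something along these lines is what I have in mind. This is only a
sketch under assumptions: struct sketch_map and sketch_map_reuse_fd()
are hypothetical stand-ins for the real struct and function, and
returning -EBUSY is just one possible policy (closing the old fd before
overwriting it would be another).

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical stand-in for libbpf's private struct bpf_map. */
struct sketch_map {
	int fd;		/* -1 means "no fd yet" */
};

/* Refuse to overwrite an fd the map already owns, so it cannot leak. */
static int sketch_map_reuse_fd(struct sketch_map *map, int fd)
{
	int new_fd;

	if (map->fd >= 0)
		return -EBUSY;	/* assumed policy, see the note above */

	/* Same call as in the patch; close-on-exec is a separate question. */
	new_fd = dup(fd);
	if (new_fd < 0)
		return -errno;

	map->fd = new_fd;
	return 0;
}
```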


> +     map->fd = dup(fd);

Unfortunately, the new descriptor created by dup(2) will not have O_CLOEXEC
set, in contrast to the original fd returned by the kernel on map creation.

libbpf has other interface shortcomings where this comes up. E.g. struct
bpf_object owns all the descriptors it contains (progs, maps) and closes
them in bpf_object__close(). If one wants to open/load an ELF, then close
it but keep, say, a prog fd to attach it to a cgroup some time later, then
the fd has to be duplicated as well to get a new one not owned by
bpf_object.

Currently I use this workaround to avoid a window in which the new fd
doesn't have O_CLOEXEC set:

        int new_prog_fd = open("/dev/null", O_RDONLY | O_CLOEXEC);
        if (new_prog_fd < 0 ||
            dup3(bpf_program__fd(prog), new_prog_fd, O_CLOEXEC) == -1) {
                /* .. handle error .. */
                close(new_prog_fd);
        }
        /* .. use new_prog_fd with O_CLOEXEC set */

I'm not sure how to simplify it. dup2() has the same problem with regard
to O_CLOEXEC.
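
One candidate that might be simpler, offered as a sketch rather than a
tested libbpf change: fcntl(2) with F_DUPFD_CLOEXEC duplicates a
descriptor and sets close-on-exec in the same call (Linux 2.6.24+,
POSIX.1-2008). dup_cloexec() below is a hypothetical helper name, not
an existing libbpf function.

```c
#define _POSIX_C_SOURCE 200809L	/* for F_DUPFD_CLOEXEC */
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: duplicate fd with close-on-exec already set on
 * the copy, so the new descriptor never exists without the flag.
 * Returns the new fd, or -1 with errno set.
 */
static int dup_cloexec(int fd)
{
	/* The third argument is the lowest acceptable new fd number. */
	return fcntl(fd, F_DUPFD_CLOEXEC, 0);
}
```

Unlike the open()/dup3() pair, this needs no placeholder descriptor and
leaves nothing to clean up on failure.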

Use case: a standalone server application that uses libbpf and calls
fork()/execve() a lot.


> +     if (map->fd < 0)
> +             return map->fd;
> +     map->fd_preset = true;
> +
> +     free(map->name);
> +     map->name = strdup(info.name);
> +     map->def.type = info.type;
> +     map->def.key_size = info.key_size;
> +     map->def.value_size = info.value_size;
> +     map->def.max_entries = info.max_entries;
> +     map->def.map_flags = info.map_flags;
> +     map->btf_key_type_id = info.btf_key_type_id;
> +     map->btf_value_type_id = info.btf_value_type_id;
> +
> +     return 0;
> +}
> +
>  static int
>  bpf_object__create_maps(struct bpf_object *obj)
>  {
> @@ -1094,6 +1123,12 @@ bpf_object__create_maps(struct bpf_object *obj)
>               struct bpf_map_def *def = &map->def;
>               int *pfd = &map->fd;
>  
> +             if (map->fd_preset) {
> +                     pr_debug("skip map create (preset) %s: fd=%d\n",
> +                              map->name, map->fd);
> +                     continue;
> +             }
> +
>               create_attr.name = map->name;
>               create_attr.map_ifindex = map->map_ifindex;
>               create_attr.map_type = def->type;
> diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
> index 60593ac44700..8e709a74f47c 100644
> --- a/tools/lib/bpf/libbpf.h
> +++ b/tools/lib/bpf/libbpf.h
> @@ -261,6 +261,7 @@ typedef void (*bpf_map_clear_priv_t)(struct bpf_map *, void *);
>  int bpf_map__set_priv(struct bpf_map *map, void *priv,
>                     bpf_map_clear_priv_t clear_priv);
>  void *bpf_map__priv(struct bpf_map *map);
> +int bpf_map__reuse_fd(struct bpf_map *map, int fd);
>  bool bpf_map__is_offload_neutral(struct bpf_map *map);
>  void bpf_map__set_ifindex(struct bpf_map *map, __u32 ifindex);
>  int bpf_map__pin(struct bpf_map *map, const char *path);
> -- 
> 2.17.1
> 

-- 
Andrey Ignatov
