Jakub Kicinski <jakub.kicin...@netronome.com> [Mon, 2018-07-09 11:01 -0700]:
> More advanced applications may want to only replace programs without
> destroying associated maps. Allow libbpf users to achieve that.
> Instead of always creating all of the maps at load time, expose to
> users an API to reconstruct the map object from already existing
> map.
>
> The map parameters are read from the kernel and replace the parameters
> of the ELF map. libbpf does not restrict the map replacement, i.e.
> the reused map does not have to be compatible with the ELF map
> definition. We relay on the verifier for checking the compatibility
> between maps and programs. The ELF map definition is completely
> overwritten by the information read from the kernel, to make sure
> libbpf's view of map object corresponds to the actual map.

Thanks for working on this, Jakub! I ran into this shortcoming of libbpf
as well and was planning to fix it, but you beat me to it :)

> Signed-off-by: Jakub Kicinski <jakub.kicin...@netronome.com>
> Reviewed-by: Quentin Monnet <quentin.mon...@netronome.com>
> ---
>  tools/lib/bpf/libbpf.c | 35 +++++++++++++++++++++++++++++++++++
>  tools/lib/bpf/libbpf.h |  1 +
>  2 files changed, 36 insertions(+)
>
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index b653dbb266c7..c80033fe66c3 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -215,6 +215,7 @@ struct bpf_map {
>  	int fd;
>  	char *name;
>  	size_t offset;
> +	bool fd_preset;

Any reason not to use map->fd itself to tell whether an fd is present?

The fd of every map is set to -1 in bpf_object__init_maps(), which in
turn is called from __bpf_object__open():

	for (i = 0; i < nr_maps; i++)
		obj->maps[i].fd = -1;

Later it will contain either a valid fd (>= 0) or that same -1, which
should be enough to identify fd presence.

>  	int map_ifindex;
>  	struct bpf_map_def def;
>  	uint32_t btf_key_type_id;
> @@ -1082,6 +1083,34 @@ static int bpf_map_find_btf_info(struct bpf_map *map,
> const struct btf *btf)
>  	return 0;
>  }
>
> +int bpf_map__reuse_fd(struct bpf_map *map, int fd)
> +{
> +	struct bpf_map_info info = {};
> +	__u32 len = sizeof(info);
> +	int err;
> +
> +	err = bpf_obj_get_info_by_fd(fd, &info, &len);
> +	if (err)
> +		return err;
> +

Should there be a check that map->fd doesn't already hold a valid fd
(>= 0) before overwriting it, so that if it does (e.g. because the
function is called after bpf_object__load() by mistake), the current
map->fd isn't leaked?

> +	map->fd = dup(fd);

Unfortunately, the new descriptor created by dup(2) will not have
O_CLOEXEC set, in contrast to the original fd returned by the kernel on
map creation.

libbpf has other interface shortcomings where this comes up. E.g. struct
bpf_object owns all descriptors it contains (progs, maps) and closes
them in bpf_object__close().
If one wants to open/load an ELF object, then close it but keep, say, a
prog fd to attach it to a cgroup some time later, then the fd has to be
duplicated as well to get a new one not owned by bpf_object.

Currently I use this workaround to avoid a window in which the new fd
doesn't have O_CLOEXEC:

	int new_prog_fd = open("/dev/null", O_RDONLY | O_CLOEXEC);
	if (new_prog_fd < 0 ||
	    dup3(bpf_program__fd(prog), new_prog_fd, O_CLOEXEC) == -1) {
		/* .. handle error .. */
		close(new_prog_fd);
	}
	/* .. use new_prog_fd with O_CLOEXEC set */

Not sure how to simplify it. dup2() has the same problem with regard to
O_CLOEXEC.

Use-case: a standalone server application that uses libbpf and does
fork()/execve() a lot.

> +	if (map->fd < 0)
> +		return map->fd;
> +	map->fd_preset = true;
> +
> +	free(map->name);
> +	map->name = strdup(info.name);
> +	map->def.type = info.type;
> +	map->def.key_size = info.key_size;
> +	map->def.value_size = info.value_size;
> +	map->def.max_entries = info.max_entries;
> +	map->def.map_flags = info.map_flags;
> +	map->btf_key_type_id = info.btf_key_type_id;
> +	map->btf_value_type_id = info.btf_value_type_id;
> +
> +	return 0;
> +}
> +
>  static int
>  bpf_object__create_maps(struct bpf_object *obj)
>  {
> @@ -1094,6 +1123,12 @@ bpf_object__create_maps(struct bpf_object *obj)
>  		struct bpf_map_def *def = &map->def;
>  		int *pfd = &map->fd;
>
> +		if (map->fd_preset) {
> +			pr_debug("skip map create (preset) %s: fd=%d\n",
> +				 map->name, map->fd);
> +			continue;
> +		}
> +
>  		create_attr.name = map->name;
>  		create_attr.map_ifindex = map->map_ifindex;
>  		create_attr.map_type = def->type;
> diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
> index 60593ac44700..8e709a74f47c 100644
> --- a/tools/lib/bpf/libbpf.h
> +++ b/tools/lib/bpf/libbpf.h
> @@ -261,6 +261,7 @@ typedef void (*bpf_map_clear_priv_t)(struct bpf_map *,
> void *);
>  int bpf_map__set_priv(struct bpf_map *map, void *priv,
>  		      bpf_map_clear_priv_t clear_priv);
>  void *bpf_map__priv(struct bpf_map *map);
> +int bpf_map__reuse_fd(struct bpf_map *map, int fd);
>  bool bpf_map__is_offload_neutral(struct bpf_map *map);
>  void bpf_map__set_ifindex(struct bpf_map *map, __u32 ifindex);
>  int bpf_map__pin(struct bpf_map *map, const char *path);
> --
> 2.17.1
>

-- 
Andrey Ignatov
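
On the O_CLOEXEC question, one possible simplification is fcntl(2) with
F_DUPFD_CLOEXEC, which duplicates a descriptor and sets close-on-exec on
the copy in a single call, so there is no window in which the new fd
lacks the flag. The sketch below assumes a platform that provides
F_DUPFD_CLOEXEC (Linux 2.6.24 or later, POSIX.1-2008). dup_cloexec() is
a hypothetical helper name used for illustration only, not a libbpf API.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* exposes F_DUPFD_CLOEXEC under strict -std= modes */
#endif

#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of libbpf): duplicate fd so that the
 * copy has FD_CLOEXEC set from the moment it is created.
 *
 * F_DUPFD_CLOEXEC picks the lowest free descriptor number that is
 * greater than or equal to the third argument (0 here).
 *
 * Returns the new descriptor, or -1 with errno set on failure.
 */
int dup_cloexec(int fd)
{
	return fcntl(fd, F_DUPFD_CLOEXEC, 0);
}
```

With such a helper the workaround would reduce to
new_prog_fd = dup_cloexec(bpf_program__fd(prog)), and the same call
could stand in for dup(fd) in bpf_map__reuse_fd().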