2018-11-02 10:08 UTC+0100 ~ Daniel Borkmann <dan...@iogearbox.net>
> On 11/01/2018 06:18 PM, Quentin Monnet wrote:
>> 2018-10-30 15:23 UTC+0000 ~ Quentin Monnet <quentin.mon...@netronome.com>
>>> The limit for memory locked in the kernel by a process is usually set to
>>> 64 kbytes by default. This can be an issue when creating large BPF maps.
>>> A workaround is to raise this limit for the current process before
>>> trying to create a new BPF map. Raising the hard limit requires the
>>> CAP_SYS_RESOURCE capability and can usually only be done by the root
>>> user (but then only root can create BPF maps).
>>
>> Sorry, the parenthesis is not correct: non-root users can in fact create
>> BPF maps as well. If a non-root user calls the function to create a map,
>> setrlimit() will fail silently (but set errno), and the program will
>> simply go on with its rlimit unchanged.
>>
>>> As far as I know there is no API to get the current amount of memory
>>> locked for a user, therefore we cannot raise the limit only when
>>> required. One solution, used by bcc, is to try to create the map and, on
>>> getting an EPERM error, raise the limit to infinity before giving it
>>> another try. Another approach, used in iproute2, is to raise the limit
>>> in all cases, before trying to create the map.
>>>
>>> Here we do the same as in iproute2: the rlimit is raised to infinity
>>> before trying to load the map.
>>>
>>> I am sending this patch as an RFC to see if people would prefer the bcc
>>> approach instead, or the rlimit change to be in bpftool rather than in
>>> libbpf.
>
> I'd avoid doing something like this in a generic library; it's basically an
> ugly hack for the kind of accounting we're doing and only shows that while
> this was "good enough" to start off with in the early days, we should be
> doing something better today; if every application raises it to inf anyway,
> then it's broken. :) It just shows that this missed its purpose.
> Similarly to the jit_limit discussion on rlimit, perhaps we should be
> considering switching to something else entirely from kernel side. Could
> be something like memcg but this definitely needs some more evaluation
> first.

Changing the way limitations are enforced sounds like a cleaner long-term
approach indeed.

> (Meanwhile I'd not change the lib but callers instead and once we have
> something better in place we remove this type of "raising to inf" from
> the tree ...)

Understood, for the time being I'll repost a patch adding the modification
to bpftool once bpf-next is open.

Thanks Daniel!
Quentin
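
For reference, the two caller-side strategies mentioned in the thread could be sketched roughly as below. This is only an illustration and not the code from bcc, iproute2, bpftool or libbpf: the names raise_memlock_rlimit(), create_with_retry(), create_fn and fake_create() are hypothetical, and fake_create() is a stand-in for a real map creation call so that the sketch can be run without any BPF privileges.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/resource.h>

/* Hypothetical callback type: returns a file descriptor (>= 0) on success,
 * or -1 with errno set on failure, like a map creation wrapper would. */
typedef int (*create_fn)(void *ctx);

/* Unconditional approach: raise RLIMIT_MEMLOCK to infinity.
 * Returns 0 on success, or -1 with errno set by setrlimit()
 * (typically EPERM when the caller lacks CAP_SYS_RESOURCE). */
int raise_memlock_rlimit(void)
{
	struct rlimit rl = {
		.rlim_cur = RLIM_INFINITY,
		.rlim_max = RLIM_INFINITY,
	};

	return setrlimit(RLIMIT_MEMLOCK, &rl);
}

/* Retry approach: try first, and only on EPERM raise the limit and
 * try once more. If the limit cannot be raised, give up and return -1
 * with errno left as set by setrlimit(). */
int create_with_retry(create_fn create, void *ctx)
{
	int fd = create(ctx);

	if (fd >= 0 || errno != EPERM)
		return fd;

	if (raise_memlock_rlimit() != 0)
		return -1;

	return create(ctx);
}

/* Hypothetical stand-in for a real map creation call: ctx points to a
 * call counter; the first call fails with EPERM, later calls return a
 * dummy descriptor value (no descriptor is actually opened). */
int fake_create(void *ctx)
{
	int *calls = ctx;

	if ((*calls)++ == 0) {
		errno = EPERM;
		return -1;
	}
	return 3;
}
```

A caller wanting the unconditional behaviour would simply call raise_memlock_rlimit() once at startup and ignore a failure, which matches the "fail silently" behaviour described above for non-root users. The retry variant leaves the limit untouched as long as map creation succeeds.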