From: Alexei Starovoitov <[email protected]>
Date: Tue, 21 Mar 2017 19:05:04 -0700
> In both kmalloc and prealloc mode the bpf_map_update_elem() is using
> per-cpu extra_elems to do atomic update when the map is full.
> There are two issues with it. The logic can be misused, since it allows
> max_entries+num_cpus elements to be present in the map. And
> alloc_extra_elems() at map creation time can fail percpu alloc for large
> map values with a warn:
> WARNING: CPU: 3 PID: 2752 at ../mm/percpu.c:892 pcpu_alloc+0x119/0xa60
> illegal size (32824) or align (8) for percpu allocation
>
> The fixes for both of these issues are different for kmalloc and prealloc
> modes.
> For prealloc mode allocate extra num_possible_cpus elements and store
> their pointers into extra_elems array instead of actual elements.
> Hence we can use these hidden (spare) elements not only when the map is full
> but also during bpf_map_update_elem() that replaces an existing element.
> That also improves performance, since pcpu_freelist_pop/push is avoided.
> Unfortunately this approach cannot be used for kmalloc mode, which needs
> to kfree elements after an rcu grace period. Therefore switch it back to
> normal kmalloc even when full and an old element exists, as it was prior to
> commit 6c9059817432 ("bpf: pre-allocate hash map elements").
>
> Add tests to check for over max_entries and large map values.
>
> Reported-by: Dave Jones <[email protected]>
> Fixes: 6c9059817432 ("bpf: pre-allocate hash map elements")
> Signed-off-by: Alexei Starovoitov <[email protected]>
> Acked-by: Daniel Borkmann <[email protected]>
> Acked-by: Martin KaFai Lau <[email protected]>
Applied and queued up for -stable, thanks.
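
For readers following along, the prealloc-mode idea in the quoted commit
message can be sketched in plain userspace C. This is only a toy model and
not the kernel code: struct toy_map, toy_init(), toy_find(), toy_update(),
toy_lookup(), MAX_ENTRIES and NUM_CPUS are all hypothetical names, the hash
table is reduced to a linear array, and locking and RCU are left out. It
assumes max_entries + num_cpus elements are allocated up front, and that each
CPU owns a pointer to one hidden spare element which it swaps with the old
element whenever an update replaces an existing key.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ENTRIES 4   /* hypothetical map capacity (max_entries) */
#define NUM_CPUS    2   /* hypothetical stand-in for num_possible_cpus() */

struct elem {
    int key;
    long value;
};

struct toy_map {
    struct elem pool[MAX_ENTRIES + NUM_CPUS]; /* everything preallocated */
    struct elem *slots[MAX_ENTRIES];          /* visible entries */
    int count;                                /* number of visible entries */
    struct elem *freelist[MAX_ENTRIES];       /* stack of unused elements */
    int nfree;
    struct elem *extra_elems[NUM_CPUS];       /* per-cpu spare POINTERS */
};

void toy_init(struct toy_map *m)
{
    memset(m, 0, sizeof(*m));
    /* The first MAX_ENTRIES elements back ordinary inserts. */
    for (int i = 0; i < MAX_ENTRIES; i++)
        m->freelist[m->nfree++] = &m->pool[i];
    /* The last NUM_CPUS elements are the hidden spares, one per cpu. */
    for (int c = 0; c < NUM_CPUS; c++)
        m->extra_elems[c] = &m->pool[MAX_ENTRIES + c];
}

/* Return the slot index holding key, or -1 if the key is absent. */
int toy_find(const struct toy_map *m, int key)
{
    for (int i = 0; i < m->count; i++)
        if (m->slots[i]->key == key)
            return i;
    return -1;
}

/* Return 0 on success, -1 if the map is full and key is not present. */
int toy_update(struct toy_map *m, int cpu, int key, long value)
{
    struct elem *new_elem;
    int i;

    assert(cpu >= 0 && cpu < NUM_CPUS);
    i = toy_find(m, key);
    if (i >= 0) {
        /* Replace: fill this cpu's spare, publish it, and keep the old
         * element as the new spare. The free list is never touched. */
        new_elem = m->extra_elems[cpu];
        new_elem->key = key;
        new_elem->value = value;
        m->extra_elems[cpu] = m->slots[i];
        m->slots[i] = new_elem;
        return 0;
    }
    if (m->nfree == 0)
        return -1;  /* full: spares are never used to grow the map */
    new_elem = m->freelist[--m->nfree];
    new_elem->key = key;
    new_elem->value = value;
    m->slots[m->count++] = new_elem;
    return 0;
}

/* Copy the value for key into *value; return 0 if found, -1 otherwise. */
int toy_lookup(const struct toy_map *m, int key, long *value)
{
    int i = toy_find(m, key);

    if (i < 0)
        return -1;
    *value = m->slots[i]->value;
    return 0;
}
```

In this model a caller can fill the map with MAX_ENTRIES keys, see a further
insert of a new key fail, and still replace an existing key from any cpu. The
visible count never exceeds MAX_ENTRIES because a spare only ever trades
places with an element that is already in the map.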