On 05/16/2018 11:46 PM, John Fastabend wrote:
> In the sockmap design BPF programs (SK_SKB_STREAM_PARSER and
> SK_SKB_STREAM_VERDICT) are attached to the sockmap map type and when
> a sock is added to the map the programs are used by the socket.
> However, sockmap updates from both userspace and BPF programs can
> happen concurrently with the attach and detach of these programs.
> 
> To resolve this we use the bpf_prog_inc_not_zero and a READ_ONCE()
> primitive to ensure the program pointer is not refetched and
> possibly NULL'd before the refcnt increment. This happens inside
> a RCU critical section so although the pointer reference in the map
> object may be NULL (by a concurrent detach operation) the reference
> from READ_ONCE will not be free'd until after grace period. This
> ensures the object returned by READ_ONCE() is valid through the
> RCU critical section and safe to use as long as we "know" it may
> be free'd shortly.
> 
> Daniel spotted a case in the sock update API where instead of using
> the READ_ONCE() program reference we used the pointer from the
> original map, stab->bpf_{verdict|parse}. The problem with this is
> the logic checks the object returned from the READ_ONCE() is not
> NULL and then tries to reference the object again but using the
> above map pointer, which may have already been NULL'd by a parallel
> detach operation. If this happened bpf_prog_inc_not_zero could
> dereference a NULL pointer.
> 
> Fix this by using variable returned by READ_ONCE() that is checked
> for NULL.
> 
> Fixes: 2f857d04601a ("bpf: sockmap, remove STRPARSER map_flags and add multi-map support")
> Reported-by: Daniel Borkmann <dan...@iogearbox.net>
> Signed-off-by: John Fastabend <john.fastab...@gmail.com>
> ---
>  kernel/bpf/sockmap.c |    4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/bpf/sockmap.c b/kernel/bpf/sockmap.c
> index f03aaa8..583c1eb 100644
> --- a/kernel/bpf/sockmap.c
> +++ b/kernel/bpf/sockmap.c
> @@ -1703,11 +1703,11 @@ static int sock_map_ctx_update_elem(struct bpf_sock_ops_kern *skops,
>                * we increment the refcnt. If this is the case abort with an
>                * error.
>                */
> -             verdict = bpf_prog_inc_not_zero(stab->bpf_verdict);
> +             verdict = bpf_prog_inc_not_zero(verdict);
>               if (IS_ERR(verdict))
>                       return PTR_ERR(verdict);
>  
> -             parse = bpf_prog_inc_not_zero(stab->bpf_parse);
> +             parse = bpf_prog_inc_not_zero(parse);
>               if (IS_ERR(parse)) {
>                       bpf_prog_put(verdict);
>                       return PTR_ERR(parse);

Isn't the same sort of behavior also possible with
bpf_prog_inc_not_zero(stab->bpf_tx_msg)?
Meaning, verdict and parse are now covered by the patch, but the tx_msg
pointer we fetched earlier via READ_ONCE(), where the same would apply,
is not (yet)?
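
To make the concern concrete, here is a rough userspace sketch of the
pattern, not the actual sockmap code. The types struct prog and struct stab
and the helpers prog_inc_not_zero() and get_tx_msg() are hypothetical
stand-ins; a C11 atomic load plays the role of READ_ONCE(), and the helper
returns NULL on failure instead of an ERR_PTR() for simplicity.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel objects; not the real definitions. */
struct prog {
	atomic_int refcnt;
};

struct stab {
	struct prog *_Atomic bpf_tx_msg;	/* may be NULL'd by a parallel detach */
};

/* Take a reference unless the count has already dropped to zero. */
static struct prog *prog_inc_not_zero(struct prog *p)
{
	int old = atomic_load(&p->refcnt);

	while (old != 0) {
		/* On failure, old is reloaded with the current count. */
		if (atomic_compare_exchange_weak(&p->refcnt, &old, old + 1))
			return p;
	}
	return NULL;
}

static struct prog *get_tx_msg(struct stab *stab)
{
	/* Snapshot the map pointer exactly once (READ_ONCE() analogue). */
	struct prog *tx_msg = atomic_load(&stab->bpf_tx_msg);

	if (!tx_msg)
		return NULL;

	/*
	 * Use the checked local copy. Passing stab->bpf_tx_msg here would
	 * re-read the map pointer, which a detach may have set to NULL
	 * after the check above.
	 */
	return prog_inc_not_zero(tx_msg);
}
```

The point is only that the pointer that was NULL-checked and the pointer
handed to the refcount helper must be the same local variable, for tx_msg
just as for verdict and parse.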
