On 4/9/18 11:41 AM, Yonghong Song wrote:


On 4/9/18 9:47 AM, Alexei Starovoitov wrote:
On 4/9/18 9:18 AM, Yonghong Song wrote:
syzbot reported a possible deadlock in perf_event_detach_bpf_prog.
...
@@ -985,16 +986,31 @@ int perf_event_query_prog_array(struct
perf_event *event, void __user *info)
         return -EINVAL;
     if (copy_from_user(&query, uquery, sizeof(query)))
         return -EFAULT;
-    if (query.ids_len > BPF_TRACE_MAX_PROGS)
+
+    ids_len = query.ids_len;
+    if (ids_len > BPF_TRACE_MAX_PROGS)
         return -E2BIG;
+    ids = kcalloc(ids_len, sizeof(u32), GFP_USER | __GFP_NOWARN);
+    if (!ids)
+        return -ENOMEM;

     mutex_lock(&bpf_event_mutex);
     ret = bpf_prog_array_copy_info(event->tp_event->prog_array,
-                       uquery->ids,
-                       query.ids_len,
-                       &uquery->prog_cnt);
+                       ids,
+                       ids_len,
+                       &prog_cnt);
     mutex_unlock(&bpf_event_mutex);

+    if (!ret || ret == -ENOSPC) {
+        if (copy_to_user(&uquery->prog_cnt, &prog_cnt,
sizeof(prog_cnt)) ||
+            copy_to_user(uquery->ids, ids, ids_len * sizeof(u32))) {
+            ret = -EFAULT;
+            goto out;
+        }
+    }
+
+out:
+    kfree(ids);

alloc/free just to avoid this locking dependency feels suboptimal.

We actually already did kcalloc/kfree in bpf_prog_array_copy_to_user.
In that function, we did not copy_to_user one id at a time.
We allocate a temporary array and store the result there
and at the end, we call one copy_to_user to copy to the user buffer.

The patch here just moved this allocation and associated copy_to_user
out of the function and bpf_event_mutex. It did not introduce new
allocations.

I see, so the patch essentially open coding
bpf_prog_array_copy_to_user()
can we share the code then?
bpf/core.c callsite used by trace/bpf_trace.c
and similar callsite in bpf/cgroup.c
should be using common helper.



may be we can get rid of bpf_event_mutex in some cases?
the perf event itself is locked via perf_event_ctx_lock() when we're
calling perf_event_query_prog_array, perf_event_attach|detach_bpf_prog.
I forgot what was the motivation for us to introduce it in the
first place.

The original motivation for the lock to make sure bpf_prog_array
does not change during middle of attach/detach/query. it looks like
we have:
   . perf_event_attach|query under perf_event_ctx_lock
   . perf_event_detach not under perf_event_ctx_lock
Introducing perf_event_ctx_lock to perf_event_detach could still
have the deadlock.

ahh, right, since the progs are in even->tp_event which can
be shared by multiple perf_events.
Scratch that idea.

Reply via email to