Move the bpf_link_free call into delayed processing so we don't
need to wait for it when releasing the link.
For example, bpf_tracing_link_release can take a considerable
amount of time in bpf_trampoline_put due to the
synchronize_rcu_tasks call.
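
For reference, the deferred path reuses the existing
bpf_link_put_deferred worker, which recovers the link from its
embedded work struct and frees it from workqueue context. A minimal
sketch of that worker (as in kernel/bpf/syscall.c):

	static void bpf_link_put_deferred(struct work_struct *work)
	{
		struct bpf_link *link = container_of(work, struct bpf_link, work);

		/* Runs in process context, so blocking in
		 * synchronize_rcu_tasks (via bpf_trampoline_put)
		 * no longer stalls the task that dropped the
		 * last reference.
		 */
		bpf_link_free(link);
	}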
This speeds up bpftrace release time in the following example:
Before:

  Performance counter stats for './src/bpftrace -ve kfunc:__x64_sys_s*
  { printf("test\n"); } i:ms:10 { printf("exit\n"); exit();}' (5 runs):

    3,290,457,628      cycles:k                 ( +- 0.27% )
      933,581,973      cycles:u                 ( +- 0.20% )

            50.25 +- 4.79 seconds time elapsed  ( +- 9.53% )

After:

  Performance counter stats for './src/bpftrace -ve kfunc:__x64_sys_s*
  { printf("test\n"); } i:ms:10 { printf("exit\n"); exit();}' (5 runs):

    2,535,458,767      cycles:k                 ( +- 0.55% )
      940,046,382      cycles:u                 ( +- 0.27% )

            33.60 +- 3.27 seconds time elapsed  ( +- 9.73% )
Signed-off-by: Jiri Olsa <[email protected]>
---
kernel/bpf/syscall.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 1110ecd7d1f3..61ef29f9177d 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -2346,12 +2346,8 @@ void bpf_link_put(struct bpf_link *link)
 	if (!atomic64_dec_and_test(&link->refcnt))
 		return;
 
-	if (in_atomic()) {
-		INIT_WORK(&link->work, bpf_link_put_deferred);
-		schedule_work(&link->work);
-	} else {
-		bpf_link_free(link);
-	}
+	INIT_WORK(&link->work, bpf_link_put_deferred);
+	schedule_work(&link->work);
 }
 
 static int bpf_link_release(struct inode *inode, struct file *filp)
--
2.26.2