On 01/08/2018 04:35 AM, Alexei Starovoitov wrote:
> The BPF interpreter has been used as part of the spectre 2 attack 
> CVE-2017-5715.
> 
> A quote from goolge project zero blog:
> "At this point, it would normally be necessary to locate gadgets in
> the host kernel code that can be used to actually leak data by reading
> from an attacker-controlled location, shifting and masking the result
> appropriately and then using the result of that as offset to an
> attacker-controlled address for a load. But piecing gadgets together
> and figuring out which ones work in a speculation context seems annoying.
> So instead, we decided to use the eBPF interpreter, which is built into
> the host kernel - while there is no legitimate way to invoke it from inside
> a VM, the presence of the code in the host kernel's text section is sufficient
> to make it usable for the attack, just like with ordinary ROP gadgets."
> 
> To make attacker job harder introduce BPF_JIT_ALWAYS_ON config
> option that removes interpreter from the kernel in favor of JIT-only mode.
> So far eBPF JIT is supported by:
> x64, arm64, arm32, sparc64, s390, powerpc64, mips64
> 
> The start of JITed program is randomized and code page is marked as read-only.
> In addition "constant blinding" can be turned on with net.core.bpf_jit_harden
> 
> Signed-off-by: Alexei Starovoitov <a...@kernel.org>
> ---
>  init/Kconfig               | 7 +++++++
>  kernel/bpf/core.c          | 9 +++++++++
>  kernel/bpf/verifier.c      | 4 ++++
>  net/core/sysctl_net_core.c | 9 +++++++++
>  4 files changed, 29 insertions(+)
> 
> diff --git a/init/Kconfig b/init/Kconfig
> index 2934249fba46..5e2a4a391ba9 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1392,6 +1392,13 @@ config BPF_SYSCALL
>         Enable the bpf() system call that allows to manipulate eBPF
>         programs and maps via file descriptors.
>  
> +config BPF_JIT_ALWAYS_ON
> +     bool "Permanently enable BPF JIT and remove BPF interpreter"
> +     depends on BPF_SYSCALL && HAVE_EBPF_JIT && BPF_JIT
> +     help
> +       Enables BPF JIT and removes BPF interpreter to avoid
> +       speculative execution of BPF instructions by the interpreter
> +
>  config USERFAULTFD
>       bool "Enable userfaultfd() system call"
>       select ANON_INODES
> diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
> index 70a534549cd3..42756c434e0b 100644
> --- a/kernel/bpf/core.c
> +++ b/kernel/bpf/core.c
> @@ -781,6 +781,7 @@ noinline u64 __bpf_call_base(u64 r1, u64 r2, u64 r3, u64 
> r4, u64 r5)
>  }
>  EXPORT_SYMBOL_GPL(__bpf_call_base);
>  
> +#ifndef CONFIG_BPF_JIT_ALWAYS_ON
>  /**
>   *   __bpf_prog_run - run eBPF program on a given context
>   *   @ctx: is the data we are operating on
> @@ -1376,6 +1377,7 @@ void bpf_patch_call_args(struct bpf_insn *insn, u32 
> stack_depth)
>               __bpf_call_base_args;
>       insn->code = BPF_JMP | BPF_CALL_ARGS;
>  }
> +#endif
>  
>  bool bpf_prog_array_compatible(struct bpf_array *array,
>                              const struct bpf_prog *fp)
> @@ -1427,9 +1429,11 @@ static int bpf_check_tail_call(const struct bpf_prog 
> *fp)
>   */
>  struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err)
>  {
> +#ifndef CONFIG_BPF_JIT_ALWAYS_ON
>       u32 stack_depth = max_t(u32, fp->aux->stack_depth, 1);
>  
>       fp->bpf_func = interpreters[(round_up(stack_depth, 32) / 32) - 1];
> +#endif
>  
>       /* eBPF JITs can rewrite the program in case constant
>        * blinding is active. However, in case of error during
> @@ -1453,6 +1457,11 @@ struct bpf_prog *bpf_prog_select_runtime(struct 
> bpf_prog *fp, int *err)
>        */
>       *err = bpf_check_tail_call(fp);
>  
> +#ifdef CONFIG_BPF_JIT_ALWAYS_ON
> +     if (!fp->jited)
> +             *err = -ENOTSUPP;
> +#endif

This part here and ...

>       return fp;
>  }
>  EXPORT_SYMBOL_GPL(bpf_prog_select_runtime);
[...]
> @@ -524,6 +530,9 @@ static __net_initdata struct pernet_operations 
> sysctl_core_ops = {
>  
>  static __init int sysctl_core_init(void)
>  {
> +#if defined(CONFIG_BPF_JIT) && defined(CONFIG_BPF_JIT_ALWAYS_ON)
> +     bpf_jit_enable = 1;
> +#endif

... this one will race and break stuff in the current shape, one example
is the PTP classifier in the tree: sysctl_core_init() is done in fs_initcall(),
whereas ptp_classifier_init() is done in sock_init() which is done out of
core_initcall().

So what will happen is that at this point in time bpf_jit_enable is not yet
set to 1, so when ptp_classifier_init() calls the cBPF bpf_prog_create(), it
will migrate the insns over to eBPF and in bpf_prog_select_runtime() called
from bpf_migrate_filter() have the assumption that we always succeed here
since when JIT fails, we will fall back to the interpreter anyway. The only
error up until now in bpf_prog_select_runtime() that could happen is out of
native eBPF prog load, so bpf_migrate_filter() will thus return just fine
and on first call to PTP classifier from a network packet, we'll get NULL
pointer deref since the fp->bpf_func is still NULL. So this would rather
need to be set much earlier on init or e.g. in the JITs themselves.

Other than that I was wondering whether the arm32 eBPF JIT could cause
trouble for cBPF as well, but it looks not the case since only alu64 div/mod
and xadd is not implemented there yet, so that should be ok since not used
in the migration.

>       register_net_sysctl(&init_net, "net/core", net_core_table);
>       return register_pernet_subsys(&sysctl_core_ops);
>  }
> 

Reply via email to