Re: [PATCH RFC v2 25/29] mm: asi: Restricted execution fore bare-metal processes

2025-02-28 Thread Yosry Ahmed
(Trimming the CC list as my email server refuses the number of CCs)

On Fri, Jan 10, 2025 at 06:40:51PM +, Brendan Jackman wrote:
> Now userspace gets a restricted address space too. The critical section
> begins on exit to userspace and ends when it makes a system call.
> Other entries from userspace just interrupt the critical section via
> asi_intr_enter().
> 
> The reason why system calls have to actually asi_relax() (i.e. fully
> terminate the critical section instead of just interrupting it) is that
> system calls are the type of kernel entry that can lead to transition
> into a _different_ ASI domain, namely the KVM one: it is not supported
> to transition into a different domain while a critical section exists
> (i.e. while asi_state.target is not NULL), even if it has been paused by
> asi_intr_enter() (i.e. even if asi_state.intr_nest_depth is nonzero) -
> there must be an asi_relax() between any two asi_enter()s.
> 
> The restricted address space for bare-metal tasks naturally contains the
> entire userspace address region, although the task's own memory is still
> missing from the direct map.
> 
> This implementation creates new userspace-specific APIs for asi_init(),
> asi_destroy() and asi_enter(), which seems a little ugly, maybe this
> suggest a general rework of these APIs given that the "generic" version
> only has one caller. For RFC code this seems good enough though.
> 
> Signed-off-by: Brendan Jackman 
> ---
>  arch/x86/include/asm/asi.h   |  8 ++--
>  arch/x86/mm/asi.c| 49 
> 
>  include/asm-generic/asi.h|  9 +++-
>  include/linux/entry-common.h | 11 ++
>  init/main.c  |  2 ++
>  kernel/entry/common.c|  1 +
>  kernel/fork.c|  4 +++-
>  7 files changed, 76 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/x86/include/asm/asi.h b/arch/x86/include/asm/asi.h
> index 
> e925d7d2cfc85bca8480c837548654e7a5a7009e..c3c1a57f0147ae9bd11d89c8bf7c8a4477728f51
>  100644
> --- a/arch/x86/include/asm/asi.h
> +++ b/arch/x86/include/asm/asi.h
> @@ -140,19 +140,23 @@ DECLARE_PER_CPU_ALIGNED(struct asi *, curr_asi);
>  
>  void asi_check_boottime_disable(void);
>  
> -void asi_init_mm_state(struct mm_struct *mm);
> +int asi_init_mm_state(struct mm_struct *mm);
>  
>  int asi_init_class(enum asi_class_id class_id, struct asi_taint_policy 
> *taint_policy);
> +void asi_init_userspace_class(void);
>  void asi_uninit_class(enum asi_class_id class_id);
>  const char *asi_class_name(enum asi_class_id class_id);
>  
>  int asi_init(struct mm_struct *mm, enum asi_class_id class_id, struct asi 
> **out_asi);
>  void asi_destroy(struct asi *asi);
> +void asi_destroy_userspace(struct mm_struct *mm);
>  void asi_clone_user_pgtbl(struct mm_struct *mm, pgd_t *pgdp);
>  
>  /* Enter an ASI domain (restricted address space) and begin the critical 
> section. */
>  void asi_enter(struct asi *asi);
>  
> +void asi_enter_userspace(void);
> +
>  /*
>   * Leave the "tense" state if we are in it, i.e. end the critical section. We
>   * will stay relaxed until the next asi_enter.
> @@ -294,7 +298,7 @@ void asi_handle_switch_mm(void);
>   */
>  static inline bool asi_maps_user_addr(enum asi_class_id class_id)
>  {
> - return false;
> + return class_id == ASI_CLASS_USERSPACE;
>  }
>  
>  #endif /* CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION */
> diff --git a/arch/x86/mm/asi.c b/arch/x86/mm/asi.c
> index 
> 093103c1bc2677c81d68008aca064fab53b73a62..1e9dc568e79e8686a4dbf47f765f2c2535d025ec
>  100644
> --- a/arch/x86/mm/asi.c
> +++ b/arch/x86/mm/asi.c
> @@ -25,6 +25,7 @@ const char *asi_class_names[] = {
>  #if IS_ENABLED(CONFIG_KVM)
>   [ASI_CLASS_KVM] = "KVM",
>  #endif
> + [ASI_CLASS_USERSPACE] = "userspace",
>  };
>  
>  DEFINE_PER_CPU_ALIGNED(struct asi *, curr_asi);
> @@ -67,6 +68,32 @@ int asi_init_class(enum asi_class_id class_id, struct 
> asi_taint_policy *taint_po
>  }
>  EXPORT_SYMBOL_GPL(asi_init_class);
>  
> +void __init asi_init_userspace_class(void)
> +{
> + static struct asi_taint_policy policy = {
> + /*
> +  * Prevent going to userspace with sensitive data potentially
> +  * left in sidechannels by code running in the unrestricted
> +  * address space, or another MM. Note we don't check for guest
> +  * data here. This reflects the assumption that the guest trusts
> +  * its VMM (absent fancy HW features, which are orthogonal).
> +  */
> + .protect_data = ASI_TAINT_KERNEL_DATA | ASI_TAINT_OTHER_MM_DATA,
> + /*
> +  * Don't go into userspace with control flow state controlled by
> +  * other processes, or any KVM guest the process is running.
> +  * Note this bit is about protecting userspace from other parts
> +  * of the system, while data_taints is about protecting other
> +  * parts of the system from the 

Re: [PATCH v7 7/8] execmem: add support for cache of large ROX pages

2025-02-28 Thread Mike Rapoport
Hi Ryan,

On Thu, Feb 27, 2025 at 11:13:29AM +, Ryan Roberts wrote:
> Hi Mike,
> 
> Drive by review comments below...
> 
> 
> On 23/10/2024 17:27, Mike Rapoport wrote:
> > From: "Mike Rapoport (Microsoft)" 
> > 
> > Using large pages to map text areas reduces iTLB pressure and improves
> > performance.
> > 
> > Extend execmem_alloc() with an ability to use huge pages with ROX
> > permissions as a cache for smaller allocations.
> > 
> > To populate the cache, a writable large page is allocated from vmalloc with
> > VM_ALLOW_HUGE_VMAP, filled with invalid instructions and then remapped as
> > ROX.
> > 
> > The direct map alias of that large page is exculded from the direct map.
> > 
> > Portions of that large page are handed out to execmem_alloc() callers
> > without any changes to the permissions.
> > 
> > When the memory is freed with execmem_free() it is invalidated again so
> > that it won't contain stale instructions.
> > 
> > An architecture has to implement execmem_fill_trapping_insns() callback
> > and select ARCH_HAS_EXECMEM_ROX configuration option to be able to use
> > the ROX cache.
> > 
> > The cache is enabled on per-range basis when an architecture sets
> > EXECMEM_ROX_CACHE flag in definition of an execmem_range.
> > 
> > Signed-off-by: Mike Rapoport (Microsoft) 
> > Reviewed-by: Luis Chamberlain 
> > Tested-by: kdevops 
> > ---
> 
> [...]
> 
> > +
> > +static int execmem_cache_populate(struct execmem_range *range, size_t size)
> > +{
> > +   unsigned long vm_flags = VM_ALLOW_HUGE_VMAP;
> > +   unsigned long start, end;
> > +   struct vm_struct *vm;
> > +   size_t alloc_size;
> > +   int err = -ENOMEM;
> > +   void *p;
> > +
> > +   alloc_size = round_up(size, PMD_SIZE);
> > +   p = execmem_vmalloc(range, alloc_size, PAGE_KERNEL, vm_flags);
> 
> Shouldn't this be passing PAGE_KERNEL_ROX? Otherwise I don't see how the
> allocated memory is ROX? I don't see any call below where you change the 
> permission.

The memory is allocated RW, filled with invalid instructions, unammped in
vmalloc space, removed from the direct map and then mapped as ROX in
vmalloc address space.
 
> Given the range has the pgprot in it, you could just drop passing the pgprot
> explicitly here and have execmem_vmalloc() use range->pgprot directly?

Here range->prprot and the prot passed to vmalloc are different.
 
> Thanks,
> Ryan
> 
> > +   if (!p)
> > +   return err;
> > +
> > +   vm = find_vm_area(p);
> > +   if (!vm)
> > +   goto err_free_mem;
> > +
> > +   /* fill memory with instructions that will trap */
> > +   execmem_fill_trapping_insns(p, alloc_size, /* writable = */ true);
> > +
> > +   start = (unsigned long)p;
> > +   end = start + alloc_size;
> > +
> > +   vunmap_range(start, end);
> > +
> > +   err = execmem_set_direct_map_valid(vm, false);
> > +   if (err)
> > +   goto err_free_mem;
> > +
> > +   err = vmap_pages_range_noflush(start, end, range->pgprot, vm->pages,
> > +  PMD_SHIFT);
> > +   if (err)
> > +   goto err_free_mem;
> > +
> > +   err = execmem_cache_add(p, alloc_size);
> > +   if (err)
> > +   goto err_free_mem;
> > +
> > +   return 0;
> > +
> > +err_free_mem:
> > +   vfree(p);
> > +   return err;
> > +}
> 
> [...]
> 

-- 
Sincerely yours,
Mike.

___
linux-snps-arc mailing list
linux-snps-arc@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-snps-arc


Re: [PATCH RFC v2 02/29] x86: Create CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION

2025-02-28 Thread Mike Rapoport
Hi Brendan,

On Fri, Jan 10, 2025 at 06:40:28PM +, Brendan Jackman wrote:
> Currently a nop config. Keeping as a separate commit for easy review of
> the boring bits. Later commits will use and enable this new config.
> 
> This config is only added for non-UML x86_64 as other architectures do
> not yet have pending implementations. It also has somewhat artificial
> dependencies on !PARAVIRT and !KASAN which are explained in the Kconfig
> file.
> 
> Co-developed-by: Junaid Shahid 
> Signed-off-by: Junaid Shahid 
> Signed-off-by: Brendan Jackman 
> ---
>  arch/alpha/include/asm/Kbuild  |  1 +
>  arch/arc/include/asm/Kbuild|  1 +
>  arch/arm/include/asm/Kbuild|  1 +
>  arch/arm64/include/asm/Kbuild  |  1 +
>  arch/csky/include/asm/Kbuild   |  1 +
>  arch/hexagon/include/asm/Kbuild|  1 +
>  arch/loongarch/include/asm/Kbuild  |  3 +++
>  arch/m68k/include/asm/Kbuild   |  1 +
>  arch/microblaze/include/asm/Kbuild |  1 +
>  arch/mips/include/asm/Kbuild   |  1 +
>  arch/nios2/include/asm/Kbuild  |  1 +
>  arch/openrisc/include/asm/Kbuild   |  1 +
>  arch/parisc/include/asm/Kbuild |  1 +
>  arch/powerpc/include/asm/Kbuild|  1 +
>  arch/riscv/include/asm/Kbuild  |  1 +
>  arch/s390/include/asm/Kbuild   |  1 +
>  arch/sh/include/asm/Kbuild |  1 +
>  arch/sparc/include/asm/Kbuild  |  1 +
>  arch/um/include/asm/Kbuild |  2 +-
>  arch/x86/Kconfig   | 14 ++
>  arch/xtensa/include/asm/Kbuild |  1 +
>  include/asm-generic/asi.h  |  5 +
>  22 files changed, 41 insertions(+), 1 deletion(-)

I don't think this all is needed. You can put asi.h with stubs used outside
of arch/x86 in include/linux and save you the hassle of updating every
architecture.
 
> diff --git a/arch/sparc/include/asm/Kbuild b/arch/sparc/include/asm/Kbuild
> index 
> 43b0ae4c2c2112d4d4d3cb3c60e787b175172dea..cb9062c9be17fe276cc92d2ac99d8b165f6297bf
>  100644
> --- a/arch/sparc/include/asm/Kbuild
> +++ b/arch/sparc/include/asm/Kbuild
> @@ -4,3 +4,4 @@ generated-y += syscall_table_64.h
>  generic-y += agp.h
>  generic-y += kvm_para.h
>  generic-y += mcs_spinlock.h
> +generic-y += asi.h

sparc already has include/asm/asi.h, this will break the build

> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 
> 7b9a7e8f39acc8e9aeb7d4213e87d71047865f5c..5a50582eb210e9d1309856a737d32b76fa1bfc85
>  100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -2519,6 +2519,20 @@ config MITIGATION_PAGE_TABLE_ISOLATION
>  
> See Documentation/arch/x86/pti.rst for more details.
>  
> +config MITIGATION_ADDRESS_SPACE_ISOLATION
> + bool "Allow code to run with a reduced kernel address space"
> + default n
> + depends on X86_64 && !PARAVIRT && !UML
> + help
> +   This feature provides the ability to run some kernel code
> +   with a reduced kernel address space. This can be used to
> +   mitigate some speculative execution attacks.
> +
> +   The !PARAVIRT dependency is only because of lack of testing; in theory
> +   the code is written to work under paravirtualization. In practice
> +   there are likely to be unhandled cases, in particular concerning TLB
> +   flushes.
> +

If you expect other architectures might implement ASI the config would better
fit into init/Kconfig or mm/Kconfig and in arch/x86/Kconfig will define
ARCH_HAS_MITIGATION_ADDRESS_SPACE_ISOLATION.

>  config MITIGATION_RETPOLINE
>   bool "Avoid speculative indirect branches in kernel"
>   select OBJTOOL if HAVE_OBJTOOL
> diff --git a/arch/xtensa/include/asm/Kbuild b/arch/xtensa/include/asm/Kbuild
> index 
> fa07c686cbcc2153776a478ac4093846f01eddab..07cea6902f98053be244d026ed594fe7246755a6
>  100644
> --- a/arch/xtensa/include/asm/Kbuild
> +++ b/arch/xtensa/include/asm/Kbuild
> @@ -8,3 +8,4 @@ generic-y += parport.h
>  generic-y += qrwlock.h
>  generic-y += qspinlock.h
>  generic-y += user.h
> +generic-y += asi.h
> diff --git a/include/asm-generic/asi.h b/include/asm-generic/asi.h
> new file mode 100644
> index 
> ..c4d9a5ff860a96428422a15000c622aeecc2d664
> --- /dev/null
> +++ b/include/asm-generic/asi.h
> @@ -0,0 +1,5 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef __ASM_GENERIC_ASI_H
> +#define __ASM_GENERIC_ASI_H
> +
> +#endif

IMHO it should be include/linux/asi.h, with something like

#infdef __LINUX_ASI_H
#define __LINUX_ASI_H

#ifdef CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION

#include 

#else /* CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION */

/* stubs for functions used outside arch/ */

#endif /* CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION */

#endif /* __LINUX_ASI_H */

-- 
Sincerely yours,
Mike.

___
linux-snps-arc mailing list
linux-snps-arc@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-snps-arc