Re: [PATCH RFC v2 25/29] mm: asi: Restricted execution for bare-metal processes
(Trimming the CC list as my email server refuses the number of CCs)

On Fri, Jan 10, 2025 at 06:40:51PM +, Brendan Jackman wrote:
> Now userspace gets a restricted address space too. The critical section
> begins on exit to userspace and ends when it makes a system call.
> Other entries from userspace just interrupt the critical section via
> asi_intr_enter().
>
> The reason why system calls have to actually asi_relax() (i.e. fully
> terminate the critical section instead of just interrupting it) is that
> system calls are the type of kernel entry that can lead to transition
> into a _different_ ASI domain, namely the KVM one: it is not supported
> to transition into a different domain while a critical section exists
> (i.e. while asi_state.target is not NULL), even if it has been paused by
> asi_intr_enter() (i.e. even if asi_state.intr_nest_depth is nonzero) -
> there must be an asi_relax() between any two asi_enter()s.
>
> The restricted address space for bare-metal tasks naturally contains the
> entire userspace address region, although the task's own memory is still
> missing from the direct map.
>
> This implementation creates new userspace-specific APIs for asi_init(),
> asi_destroy() and asi_enter(), which seems a little ugly; maybe this
> suggests a general rework of these APIs given that the "generic" version
> only has one caller. For RFC code this seems good enough though.
>
> Signed-off-by: Brendan Jackman
> ---
>  arch/x86/include/asm/asi.h   |  8 ++--
>  arch/x86/mm/asi.c            | 49
>  include/asm-generic/asi.h    |  9 +++-
>  include/linux/entry-common.h | 11 ++
>  init/main.c                  |  2 ++
>  kernel/entry/common.c        |  1 +
>  kernel/fork.c                |  4 +++-
>  7 files changed, 76 insertions(+), 8 deletions(-)
>
> diff --git a/arch/x86/include/asm/asi.h b/arch/x86/include/asm/asi.h
> index e925d7d2cfc85bca8480c837548654e7a5a7009e..c3c1a57f0147ae9bd11d89c8bf7c8a4477728f51 100644
> --- a/arch/x86/include/asm/asi.h
> +++ b/arch/x86/include/asm/asi.h
> @@ -140,19 +140,23 @@ DECLARE_PER_CPU_ALIGNED(struct asi *, curr_asi);
>
>  void asi_check_boottime_disable(void);
>
> -void asi_init_mm_state(struct mm_struct *mm);
> +int asi_init_mm_state(struct mm_struct *mm);
>
>  int asi_init_class(enum asi_class_id class_id, struct asi_taint_policy *taint_policy);
> +void asi_init_userspace_class(void);
>  void asi_uninit_class(enum asi_class_id class_id);
>  const char *asi_class_name(enum asi_class_id class_id);
>
>  int asi_init(struct mm_struct *mm, enum asi_class_id class_id, struct asi **out_asi);
>  void asi_destroy(struct asi *asi);
> +void asi_destroy_userspace(struct mm_struct *mm);
>  void asi_clone_user_pgtbl(struct mm_struct *mm, pgd_t *pgdp);
>
>  /* Enter an ASI domain (restricted address space) and begin the critical section. */
>  void asi_enter(struct asi *asi);
>
> +void asi_enter_userspace(void);
> +
>  /*
>   * Leave the "tense" state if we are in it, i.e. end the critical section. We
>   * will stay relaxed until the next asi_enter.
> @@ -294,7 +298,7 @@ void asi_handle_switch_mm(void);
>   */
>  static inline bool asi_maps_user_addr(enum asi_class_id class_id)
>  {
> -        return false;
> +        return class_id == ASI_CLASS_USERSPACE;
>  }
>
>  #endif /* CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION */
> diff --git a/arch/x86/mm/asi.c b/arch/x86/mm/asi.c
> index 093103c1bc2677c81d68008aca064fab53b73a62..1e9dc568e79e8686a4dbf47f765f2c2535d025ec 100644
> --- a/arch/x86/mm/asi.c
> +++ b/arch/x86/mm/asi.c
> @@ -25,6 +25,7 @@ const char *asi_class_names[] = {
>  #if IS_ENABLED(CONFIG_KVM)
>          [ASI_CLASS_KVM] = "KVM",
>  #endif
> +        [ASI_CLASS_USERSPACE] = "userspace",
>  };
>
>  DEFINE_PER_CPU_ALIGNED(struct asi *, curr_asi);
> @@ -67,6 +68,32 @@ int asi_init_class(enum asi_class_id class_id, struct asi_taint_policy *taint_po
>  }
>  EXPORT_SYMBOL_GPL(asi_init_class);
>
> +void __init asi_init_userspace_class(void)
> +{
> +        static struct asi_taint_policy policy = {
> +                /*
> +                 * Prevent going to userspace with sensitive data potentially
> +                 * left in sidechannels by code running in the unrestricted
> +                 * address space, or another MM. Note we don't check for guest
> +                 * data here. This reflects the assumption that the guest trusts
> +                 * its VMM (absent fancy HW features, which are orthogonal).
> +                 */
> +                .protect_data = ASI_TAINT_KERNEL_DATA | ASI_TAINT_OTHER_MM_DATA,
> +                /*
> +                 * Don't go into userspace with control flow state controlled by
> +                 * other processes, or any KVM guest the process is running.
> +                 * Note this bit is about protecting userspace from other parts
> +                 * of the system, while data_taints is about protecting other
> +                 * parts of the system from the
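For orientation, the critical-section lifecycle described in the commit message
above can be summarised as follows. This is a sketch, not the patch's actual
call sites: the example_* wrappers are made-up placeholders standing in for the
generic entry/exit hooks, while asi_enter_userspace(), asi_relax(),
asi_intr_enter() and asi_intr_exit() are the entry points the series declares.

/* Exit to userspace: enter the restricted address space and begin the
 * "tense" critical section for the ASI_CLASS_USERSPACE domain. */
static void example_exit_to_user_mode(void)
{
        asi_enter_userspace();
}

/* A system call ends the critical section completely, because the kernel
 * may subsequently asi_enter() a *different* domain (e.g. KVM's), and two
 * asi_enter()s must always have an asi_relax() between them. */
static void example_syscall_entry(void)
{
        asi_relax();
}

/* Any other entry from userspace (IRQ, exception, NMI) merely pauses the
 * critical section by bumping asi_state.intr_nest_depth. */
static void example_interrupt_entry(void)
{
        asi_intr_enter();
        /* ... handle the event ... */
        asi_intr_exit();
}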
Re: [PATCH v7 7/8] execmem: add support for cache of large ROX pages
Hi Ryan,

On Thu, Feb 27, 2025 at 11:13:29AM +, Ryan Roberts wrote:
> Hi Mike,
>
> Drive by review comments below...
>
>
> On 23/10/2024 17:27, Mike Rapoport wrote:
> > From: "Mike Rapoport (Microsoft)"
> >
> > Using large pages to map text areas reduces iTLB pressure and improves
> > performance.
> >
> > Extend execmem_alloc() with an ability to use huge pages with ROX
> > permissions as a cache for smaller allocations.
> >
> > To populate the cache, a writable large page is allocated from vmalloc with
> > VM_ALLOW_HUGE_VMAP, filled with invalid instructions and then remapped as
> > ROX.
> >
> > The direct map alias of that large page is excluded from the direct map.
> >
> > Portions of that large page are handed out to execmem_alloc() callers
> > without any changes to the permissions.
> >
> > When the memory is freed with execmem_free() it is invalidated again so
> > that it won't contain stale instructions.
> >
> > An architecture has to implement execmem_fill_trapping_insns() callback
> > and select ARCH_HAS_EXECMEM_ROX configuration option to be able to use
> > the ROX cache.
> >
> > The cache is enabled on per-range basis when an architecture sets
> > EXECMEM_ROX_CACHE flag in definition of an execmem_range.
> >
> > Signed-off-by: Mike Rapoport (Microsoft)
> > Reviewed-by: Luis Chamberlain
> > Tested-by: kdevops
> > ---
>
> [...]
>
> > +
> > +static int execmem_cache_populate(struct execmem_range *range, size_t size)
> > +{
> > +        unsigned long vm_flags = VM_ALLOW_HUGE_VMAP;
> > +        unsigned long start, end;
> > +        struct vm_struct *vm;
> > +        size_t alloc_size;
> > +        int err = -ENOMEM;
> > +        void *p;
> > +
> > +        alloc_size = round_up(size, PMD_SIZE);
> > +        p = execmem_vmalloc(range, alloc_size, PAGE_KERNEL, vm_flags);
>
> Shouldn't this be passing PAGE_KERNEL_ROX? Otherwise I don't see how the
> allocated memory is ROX? I don't see any call below where you change the
> permission.

The memory is allocated RW, filled with invalid instructions, unmapped in
vmalloc space, removed from the direct map and then mapped as ROX in the
vmalloc address space.

> Given the range has the pgprot in it, you could just drop passing the pgprot
> explicitly here and have execmem_vmalloc() use range->pgprot directly?

Here range->pgprot and the prot passed to vmalloc are different.

> Thanks,
> Ryan
>
> > +        if (!p)
> > +                return err;
> > +
> > +        vm = find_vm_area(p);
> > +        if (!vm)
> > +                goto err_free_mem;
> > +
> > +        /* fill memory with instructions that will trap */
> > +        execmem_fill_trapping_insns(p, alloc_size, /* writable = */ true);
> > +
> > +        start = (unsigned long)p;
> > +        end = start + alloc_size;
> > +
> > +        vunmap_range(start, end);
> > +
> > +        err = execmem_set_direct_map_valid(vm, false);
> > +        if (err)
> > +                goto err_free_mem;
> > +
> > +        err = vmap_pages_range_noflush(start, end, range->pgprot, vm->pages,
> > +                                       PMD_SHIFT);
> > +        if (err)
> > +                goto err_free_mem;
> > +
> > +        err = execmem_cache_add(p, alloc_size);
> > +        if (err)
> > +                goto err_free_mem;
> > +
> > +        return 0;
> > +
> > +err_free_mem:
> > +        vfree(p);
> > +        return err;
> > +}
>
> [...]

-- 
Sincerely yours,
Mike.
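To make the above concrete, this is roughly what a consumer of the ROX cache
looks like: the allocation comes back already read-only-executable and
pre-filled with trapping instructions, so the caller never changes permissions
itself and writes its code through a text-poking helper. The helper name
execmem_update_copy() and the EXECMEM_MODULE_TEXT range are assumptions based
on the wider series; treat this as a sketch rather than the exact API.

#include <linux/execmem.h>

/* Sketch: allocate executable memory from a ROX-cache-backed range and
 * copy generated code into it without ever making the mapping writable. */
static void *example_load_text(const void *insns, size_t size)
{
        void *buf;

        /* Served from a PMD-sized ROX page when the range has
         * EXECMEM_ROX_CACHE set; otherwise a regular allocation. */
        buf = execmem_alloc(EXECMEM_MODULE_TEXT, size);
        if (!buf)
                return NULL;

        /* The mapping is ROX, so write through the text-poking path
         * instead of a plain memcpy(). */
        if (!execmem_update_copy(buf, insns, size)) {
                /* Freeing re-fills the chunk with trapping instructions. */
                execmem_free(buf);
                return NULL;
        }

        return buf;
}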
Re: [PATCH RFC v2 02/29] x86: Create CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION
Hi Brendan,

On Fri, Jan 10, 2025 at 06:40:28PM +, Brendan Jackman wrote:
> Currently a nop config. Keeping as a separate commit for easy review of
> the boring bits. Later commits will use and enable this new config.
>
> This config is only added for non-UML x86_64 as other architectures do
> not yet have pending implementations. It also has somewhat artificial
> dependencies on !PARAVIRT and !KASAN which are explained in the Kconfig
> file.
>
> Co-developed-by: Junaid Shahid
> Signed-off-by: Junaid Shahid
> Signed-off-by: Brendan Jackman
> ---
>  arch/alpha/include/asm/Kbuild      |  1 +
>  arch/arc/include/asm/Kbuild        |  1 +
>  arch/arm/include/asm/Kbuild        |  1 +
>  arch/arm64/include/asm/Kbuild      |  1 +
>  arch/csky/include/asm/Kbuild       |  1 +
>  arch/hexagon/include/asm/Kbuild    |  1 +
>  arch/loongarch/include/asm/Kbuild  |  3 +++
>  arch/m68k/include/asm/Kbuild       |  1 +
>  arch/microblaze/include/asm/Kbuild |  1 +
>  arch/mips/include/asm/Kbuild       |  1 +
>  arch/nios2/include/asm/Kbuild      |  1 +
>  arch/openrisc/include/asm/Kbuild   |  1 +
>  arch/parisc/include/asm/Kbuild     |  1 +
>  arch/powerpc/include/asm/Kbuild    |  1 +
>  arch/riscv/include/asm/Kbuild      |  1 +
>  arch/s390/include/asm/Kbuild       |  1 +
>  arch/sh/include/asm/Kbuild         |  1 +
>  arch/sparc/include/asm/Kbuild      |  1 +
>  arch/um/include/asm/Kbuild         |  2 +-
>  arch/x86/Kconfig                   | 14 ++
>  arch/xtensa/include/asm/Kbuild     |  1 +
>  include/asm-generic/asi.h          |  5 +
>  22 files changed, 41 insertions(+), 1 deletion(-)

I don't think all of this is needed. You can put asi.h with stubs used
outside of arch/x86 in include/linux and save yourself the hassle of
updating every architecture.

> diff --git a/arch/sparc/include/asm/Kbuild b/arch/sparc/include/asm/Kbuild
> index 43b0ae4c2c2112d4d4d3cb3c60e787b175172dea..cb9062c9be17fe276cc92d2ac99d8b165f6297bf 100644
> --- a/arch/sparc/include/asm/Kbuild
> +++ b/arch/sparc/include/asm/Kbuild
> @@ -4,3 +4,4 @@ generated-y += syscall_table_64.h
>  generic-y += agp.h
>  generic-y += kvm_para.h
>  generic-y += mcs_spinlock.h
> +generic-y += asi.h

sparc already has include/asm/asi.h; this will break the build.

> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 7b9a7e8f39acc8e9aeb7d4213e87d71047865f5c..5a50582eb210e9d1309856a737d32b76fa1bfc85 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -2519,6 +2519,20 @@ config MITIGATION_PAGE_TABLE_ISOLATION
>
>           See Documentation/arch/x86/pti.rst for more details.
>
> +config MITIGATION_ADDRESS_SPACE_ISOLATION
> +        bool "Allow code to run with a reduced kernel address space"
> +        default n
> +        depends on X86_64 && !PARAVIRT && !UML
> +        help
> +          This feature provides the ability to run some kernel code
> +          with a reduced kernel address space. This can be used to
> +          mitigate some speculative execution attacks.
> +
> +          The !PARAVIRT dependency is only because of lack of testing; in theory
> +          the code is written to work under paravirtualization. In practice
> +          there are likely to be unhandled cases, in particular concerning TLB
> +          flushes.
> +

If you expect other architectures might implement ASI, the config would
better fit into init/Kconfig or mm/Kconfig, and arch/x86/Kconfig would
define ARCH_HAS_MITIGATION_ADDRESS_SPACE_ISOLATION.
>  config MITIGATION_RETPOLINE
>          bool "Avoid speculative indirect branches in kernel"
>          select OBJTOOL if HAVE_OBJTOOL
> diff --git a/arch/xtensa/include/asm/Kbuild b/arch/xtensa/include/asm/Kbuild
> index fa07c686cbcc2153776a478ac4093846f01eddab..07cea6902f98053be244d026ed594fe7246755a6 100644
> --- a/arch/xtensa/include/asm/Kbuild
> +++ b/arch/xtensa/include/asm/Kbuild
> @@ -8,3 +8,4 @@ generic-y += parport.h
>  generic-y += qrwlock.h
>  generic-y += qspinlock.h
>  generic-y += user.h
> +generic-y += asi.h
> diff --git a/include/asm-generic/asi.h b/include/asm-generic/asi.h
> new file mode 100644
> index ..c4d9a5ff860a96428422a15000c622aeecc2d664
> --- /dev/null
> +++ b/include/asm-generic/asi.h
> @@ -0,0 +1,5 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef __ASM_GENERIC_ASI_H
> +#define __ASM_GENERIC_ASI_H
> +
> +#endif

IMHO it should be include/linux/asi.h, with something like

#ifndef __LINUX_ASI_H
#define __LINUX_ASI_H

#ifdef CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION

#include <asm/asi.h>

#else /* CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION */

/* stubs for functions used outside arch/ */

#endif /* CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION */

#endif /* __LINUX_ASI_H */

-- 
Sincerely yours,
Mike.
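To flesh out the "stubs for functions used outside arch/" comment in the header
sketched above, the #else branch could look something like the following. The
function names are taken from the declarations quoted earlier in this thread
(patch 25/29); the exact set of stubs that generic code ends up needing may
well differ, so this is only a sketch.

#ifndef CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION

struct mm_struct;

/* No-op stubs so that fork, init and the generic entry code can call the
 * ASI hooks unconditionally when the mitigation is compiled out. */
static inline int asi_init_mm_state(struct mm_struct *mm) { return 0; }
static inline void asi_destroy_userspace(struct mm_struct *mm) { }
static inline void asi_init_userspace_class(void) { }
static inline void asi_enter_userspace(void) { }
static inline void asi_relax(void) { }

#endif /* CONFIG_MITIGATION_ADDRESS_SPACE_ISOLATION */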