Re: [PATCH v5 7/8] execmem: add support for cache of large ROX pages
On Wed, Oct 09, 2024 at 11:58:33PM -0700, Christoph Hellwig wrote: > On Wed, Oct 09, 2024 at 09:08:15PM +0300, Mike Rapoport wrote: > > /** > > * struct execmem_info - architecture parameters for code allocations > > + * @fill_trapping_insns: set memory to contain instructions that will trap > > * @ranges: array of parameter sets defining architecture specific > > * parameters for executable memory allocations. The ranges that are not > > * explicitly initialized by an architecture use parameters defined for > > * @EXECMEM_DEFAULT. > > */ > > struct execmem_info { > > + void (*fill_trapping_insns)(void *ptr, size_t size, bool writable); > > struct execmem_rangeranges[EXECMEM_TYPE_MAX]; > > Why is the filler an indirect function call and not an architecture > hook? The idea is to keep everything together and have execmem_info describe all that architecture needs. -- Sincerely yours, Mike. ___ linux-snps-arc mailing list linux-snps-arc@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-snps-arc
Re: [PATCH] asm-generic: provide generic page_to_phys and phys_to_page implementations
On Thu, Oct 10, 2024 at 09:03:42AM +0200, Christoph Hellwig wrote: > > I think we should try to have a little fewer nested macros > > to evaluate here, right now this ends up expanding > > __pfn_to_phys, PFN_PHYS, PAGE_SHIFT, CONFIG_PAGE_SHIFT, > > page_to_pfn and __page_to_pfn. While the behavior is fine, > > modern gcc versions list all of those in an warning message > > if someone passes the wrong arguments. > > > > Changing the two macros above into inline functions > > would help as well, but may cause other problems. > > Doing them as inlines seems useful to me, let me throw that at > the buildbot and see if anything explodes. The inline version instantly blows up, so I'll try just open coding the phys to/from pfn translation instead. ___ linux-snps-arc mailing list linux-snps-arc@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-snps-arc
Re: [PATCH v5 2/8] mm: vmalloc: don't account for number of nodes for HUGE_VMAP allocations
Looks good: Reviewed-by: Christoph Hellwig ___ linux-snps-arc mailing list linux-snps-arc@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-snps-arc
Re: [PATCH v5 6/8] x86/module: perpare module loading for ROX allocations of text
Hi Mike, On Wed, Oct 09, 2024 at 09:08:14PM +0300, Mike Rapoport wrote: > From: "Mike Rapoport (Microsoft)" > > When module text memory will be allocated with ROX permissions, the > memory at the actual address where the module will live will contain > invalid instructions and there will be a writable copy that contains the > actual module code. > > Update relocations and alternatives patching to deal with it. > > Signed-off-by: Mike Rapoport (Microsoft) I bisected a boot failure that I see with CONFIG_CFI_CLANG enabled to this change as commit be712757cabd ("x86/module: perpare module loading for ROX allocations of text") in -next. $ echo CONFIG_CFI_CLANG=y >arch/x86/configs/cfi.config $ make -skj"$(nproc)" ARCH=x86_64 LLVM=1 mrproper defconfig cfi.config bzImage $ curl -LSs https://github.com/ClangBuiltLinux/boot-utils/releases/download/20230707-182910/x86_64-rootfs.cpio.zst | zstd -d >rootfs.cpio $ qemu-system-x86_64 \ -display none \ -nodefaults \ -M q35 \ -d unimp,guest_errors \ -append 'console=ttyS0 earlycon=uart8250,io,0x3f8' \ -kernel arch/x86/boot/bzImage \ -initrd rootfs.cpio \ -cpu host \ -enable-kvm \ -m 512m \ -smp 8 \ -serial mon:stdio [0.00] Linux version 6.12.0-rc2-00140-gbe712757cabd (nathan@n3-xlarge-x86) (ClangBuiltLinux clang version 19.1.0 (https://github.com/llvm/llvm-project.git a4bf6cd7cfb1a1421ba92bca9d017b49936c55e4), ClangBuiltLinux LLD 19.1.0 (https://github.com/llvm/llvm-project.git a4bf6cd7cfb1a1421ba92bca9d017b49936c55e4)) #1 SMP PREEMPT_DYNAMIC Thu Oct 10 22:42:57 UTC 2024 ... [0.092204] Speculative Store Bypass: Mitigation: Speculative Store Bypass disabled via prctl [0.093207] TAA: Mitigation: TSX disabled [0.093711] MMIO Stale Data: Mitigation: Clear CPU buffers [0.094228] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers' [0.095203] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers' [0.096203] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers' [0.097203] x86/fpu: Supporting XSAVE feature 0x020: 'AVX-512 opmask' [0.098003] x86/fpu: Supporting XSAVE feature 0x040: 'AVX-512 Hi256' [0.098203] x86/fpu: Supporting XSAVE feature 0x080: 'AVX-512 ZMM_Hi256' [0.099203] x86/fpu: Supporting XSAVE feature 0x200: 'Protection Keys User registers' [0.100204] x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256 [0.101204] x86/fpu: xstate_offset[5]: 832, xstate_sizes[5]: 64 [0.102203] x86/fpu: xstate_offset[6]: 896, xstate_sizes[6]: 512 [0.103204] x86/fpu: xstate_offset[7]: 1408, xstate_sizes[7]: 1024 [0.104051] x86/fpu: xstate_offset[9]: 2432, xstate_sizes[9]:8 [0.104204] x86/fpu: Enabled xstate features 0x2e7, context size is 2440 bytes, using 'compacted' format. then nothing after that. Boot is successful if CFI is not enabled (the initrd will just shutdown the machine after printing the version string). If there is any further information I can provide or patches I can test, I am more than happy to do so. Cheers, Nathan ___ linux-snps-arc mailing list linux-snps-arc@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-snps-arc
Re: [PATCH v5 6/8] x86/module: perpare module loading for ROX allocations of text
> +extern void apply_alternatives(struct alt_instr *start, struct alt_instr > *end, > +struct module *mod); > +extern void apply_retpolines(s32 *start, s32 *end, struct module *mod); > +extern void apply_returns(s32 *start, s32 *end, struct module *mod); > +extern void apply_seal_endbr(s32 *start, s32 *end, struct module *mod); > extern void apply_fineibt(s32 *start_retpoline, s32 *end_retpoine, > - s32 *start_cfi, s32 *end_cfi); > - > -struct module; > + s32 *start_cfi, s32 *end_cfi, struct module *mod); Please drop all the pointless externs while you're at it. Otherwise looks good: Reviewed-by: Christoph Hellwig ___ linux-snps-arc mailing list linux-snps-arc@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-snps-arc
Re: [PATCH v5 7/8] execmem: add support for cache of large ROX pages
On Wed, Oct 09, 2024 at 01:24:27PM -0700, Andrew Morton wrote: > On Wed, 9 Oct 2024 21:08:15 +0300 Mike Rapoport wrote: > > > Using large pages to map text areas reduces iTLB pressure and improves > > performance. > > Are there any measurable performance improvements? I don't have any numbers, I just followed the common sense of "less TLB entries is better" and relied on Thomas comments from previous discussions. > What are the effects of this series upon overall memory consumption? There will be some execmem cache fragmentation and an increase in memory consumption. It depends on the actual modules loaded and how large it the fragmentation. For a set of pretty randomly chosen modules where most come from net/netfilter I see an increase from 19M to 25M. > The lack of acks is a bit surprising for a v5 patch, but I'll add all > this to mm.git for some testing, thanks. > -- Sincerely yours, Mike. ___ linux-snps-arc mailing list linux-snps-arc@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-snps-arc
Re: Bisected: [PATCH v5 8/8] x86/module: enable ROX caches for module text
On Thu, Oct 10, 2024 at 05:30:33PM +0900, Sergey Senozhatsky wrote: > On (24/10/09 21:08), Mike Rapoport wrote: > > From: "Mike Rapoport (Microsoft)" > > > > Enable execmem's cache of PMD_SIZE'ed pages mapped as ROX for module > > text allocations. > > > > With this modprobe disappoints kmemleak > > [ 12.700128] kmemleak: Found object by alias at 0xa000a000 > [ 12.702179] CPU: 5 UID: 0 PID: 410 Comm: modprobe Tainted: G > N 6.12.0-rc2+ #760 > [ 12.704656] Tainted: [N]=TEST > [ 12.705526] Call Trace: > [ 12.706250] > [ 12.706888] dump_stack_lvl+0x3e/0xdb > [ 12.707961] __find_and_get_object+0x100/0x110 > [ 12.709256] kmemleak_no_scan+0x2e/0xb0 > [ 12.710354] kmemleak_load_module+0xad/0xe0 > [ 12.711557] load_module+0x2391/0x45a0 > [ 12.712507] __se_sys_finit_module+0x4e0/0x7a0 > [ 12.713599] do_syscall_64+0x54/0xf0 > [ 12.714477] ? irqentry_exit_to_user_mode+0x33/0x100 > [ 12.715696] entry_SYSCALL_64_after_hwframe+0x4b/0x53 > [ 12.716931] RIP: 0033:0x7fc7af51f059 > [ 12.717816] Code: 08 89 e8 5b 5d c3 66 2e 0f 1f 84 00 00 00 00 00 90 48 89 > f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d > 01 f0 ff ff 73 01 c3 48 8b 0d 8f 1d 0d 00 f7 d8 64 89 01 48 > [ 12.722324] RSP: 002b:7ffc1d0b0c18 EFLAGS: 0246 ORIG_RAX: > 0139 > [ 12.724173] RAX: ffda RBX: 5618a9439b20 RCX: > 7fc7af51f059 > [ 12.725884] RDX: RSI: 56187aea098b RDI: > 0003 > [ 12.727617] RBP: R08: 0060 R09: > 5618a943af60 > [ 12.729361] R10: 0038 R11: 0246 R12: > 56187aea098b > [ 12.731101] R13: 0004 R14: 5618a9439ac0 R15: > > [ 12.732814] Below is a quick fix, I'll revisit module - kmemleak interaction in v6 diff --git a/kernel/module/debug_kmemleak.c b/kernel/module/debug_kmemleak.c index b4cc03842d70..df873dad049d 100644 --- a/kernel/module/debug_kmemleak.c +++ b/kernel/module/debug_kmemleak.c @@ -14,7 +14,8 @@ void kmemleak_load_module(const struct module *mod, { /* only scan writable, non-executable sections */ for_each_mod_mem_type(type) { - if (type != MOD_DATA && type != MOD_INIT_DATA) + if (type != MOD_DATA && type != MOD_INIT_DATA && + !mod->mem[type].is_rox) kmemleak_no_scan(mod->mem[type].base); } } -- Sincerely yours, Mike. ___ linux-snps-arc mailing list linux-snps-arc@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-snps-arc
Re: [PATCH v5 4/8] module: prepare to handle ROX allocations for text
On Wed, Oct 9, 2024 at 10:47 PM Mike Rapoport wrote: > > On Wed, Oct 09, 2024 at 03:23:40PM -0700, Song Liu wrote: > > On Wed, Oct 9, 2024 at 11:10 AM Mike Rapoport wrote: > > [...] > > > diff --git a/include/linux/module.h b/include/linux/module.h > > > index 88ecc5e9f523..7039f609c6ef 100644 > > > --- a/include/linux/module.h > > > +++ b/include/linux/module.h > > > @@ -367,6 +367,8 @@ enum mod_mem_type { > > > > > > struct module_memory { > > > void *base; > > > + void *rw_copy; > > > + bool is_rox; > > > unsigned int size; > > > > Do we really need to hold the rw_copy all the time? > > We hold it only during module initialization, it's freed in > post_relocation. Ah, I missed this part. Sorry for the noise. Song ___ linux-snps-arc mailing list linux-snps-arc@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-snps-arc
Re: [PATCH v5 3/8] asm-generic: introduce text-patching.h
On Wed, Oct 9, 2024, at 18:08, Mike Rapoport wrote: > From: "Mike Rapoport (Microsoft)" > > Several architectures support text patching, but they name the header > files that declare patching functions differently. > > Make all such headers consistently named text-patching.h and add an empty > header in asm-generic for architectures that do not support text patching. > > Signed-off-by: Mike Rapoport (Microsoft) Acked-by: Arnd Bergmann ___ linux-snps-arc mailing list linux-snps-arc@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-snps-arc
Bisected: [PATCH v5 8/8] x86/module: enable ROX caches for module text
On (24/10/09 21:08), Mike Rapoport wrote: > From: "Mike Rapoport (Microsoft)" > > Enable execmem's cache of PMD_SIZE'ed pages mapped as ROX for module > text allocations. > With this modprobe disappoints kmemleak [ 12.700128] kmemleak: Found object by alias at 0xa000a000 [ 12.702179] CPU: 5 UID: 0 PID: 410 Comm: modprobe Tainted: G N 6.12.0-rc2+ #760 [ 12.704656] Tainted: [N]=TEST [ 12.705526] Call Trace: [ 12.706250] [ 12.706888] dump_stack_lvl+0x3e/0xdb [ 12.707961] __find_and_get_object+0x100/0x110 [ 12.709256] kmemleak_no_scan+0x2e/0xb0 [ 12.710354] kmemleak_load_module+0xad/0xe0 [ 12.711557] load_module+0x2391/0x45a0 [ 12.712507] __se_sys_finit_module+0x4e0/0x7a0 [ 12.713599] do_syscall_64+0x54/0xf0 [ 12.714477] ? irqentry_exit_to_user_mode+0x33/0x100 [ 12.715696] entry_SYSCALL_64_after_hwframe+0x4b/0x53 [ 12.716931] RIP: 0033:0x7fc7af51f059 [ 12.717816] Code: 08 89 e8 5b 5d c3 66 2e 0f 1f 84 00 00 00 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8f 1d 0d 00 f7 d8 64 89 01 48 [ 12.722324] RSP: 002b:7ffc1d0b0c18 EFLAGS: 0246 ORIG_RAX: 0139 [ 12.724173] RAX: ffda RBX: 5618a9439b20 RCX: 7fc7af51f059 [ 12.725884] RDX: RSI: 56187aea098b RDI: 0003 [ 12.727617] RBP: R08: 0060 R09: 5618a943af60 [ 12.729361] R10: 0038 R11: 0246 R12: 56187aea098b [ 12.731101] R13: 0004 R14: 5618a9439ac0 R15: [ 12.732814] [ 12.733362] kmemleak: Object 0xa000 (size 2097152): [ 12.734800] kmemleak: comm "modprobe", pid 410, jiffies 4294880489 [ 12.736334] kmemleak: min_count = 2 [ 12.737228] kmemleak: count = 0 [ 12.738043] kmemleak: flags = 0x5 [ 12.738917] kmemleak: checksum = 0 [ 12.739783] kmemleak: backtrace: [ 12.740606] kmemleak_vmalloc+0x29/0xc0 [ 12.741532] kasan_alloc_module_shadow+0xbe/0xe0 [ 12.742649] execmem_vmalloc+0x116/0x220 [ 12.743596] execmem_alloc+0xfb/0x3d0 [ 12.744479] load_module+0x1e84/0x45a0 [ 12.745383] __se_sys_finit_module+0x4e0/0x7a0 [ 12.746452] do_syscall_64+0x54/0xf0 [ 12.747319] entry_SYSCALL_64_after_hwframe+0x4b/0x53 [ 12.748772] kmemleak: Not scanning unknown object at 0xa000a000 [ 12.750364] CPU: 5 UID: 0 PID: 410 Comm: modprobe Tainted: G N 6.12.0-rc2+ #760 [ 12.752441] Tainted: [N]=TEST [ 12.753165] Call Trace: [ 12.753760] [ 12.754279] dump_stack_lvl+0x3e/0xdb [ 12.755165] kmemleak_load_module+0xad/0xe0 [ 12.756165] load_module+0x2391/0x45a0 [ 12.757068] __se_sys_finit_module+0x4e0/0x7a0 [ 12.758135] do_syscall_64+0x54/0xf0 [ 12.759099] ? irqentry_exit_to_user_mode+0x33/0x100 [ 12.760292] entry_SYSCALL_64_after_hwframe+0x4b/0x53 [ 12.761508] RIP: 0033:0x7fc7af51f059 [ 12.762372] Code: 08 89 e8 5b 5d c3 66 2e 0f 1f 84 00 00 00 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8f 1d 0d 00 f7 d8 64 89 01 48 [ 12.772361] RSP: 002b:7ffc1d0b0c18 EFLAGS: 0246 ORIG_RAX: 0139 [ 12.774957] RAX: ffda RBX: 5618a9439b20 RCX: 7fc7af51f059 [ 12.776635] RDX: RSI: 56187aea098b RDI: 0003 [ 12.778283] RBP: R08: 0060 R09: 5618a943af60 [ 12.779949] R10: 0038 R11: 0246 R12: 56187aea098b [ 12.781619] R13: 0004 R14: 5618a9439ac0 R15: [ 12.783319] ___ linux-snps-arc mailing list linux-snps-arc@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-snps-arc
Re: [PATCH] asm-generic: provide generic page_to_phys and phys_to_page implementations
On Wed, Oct 09, 2024 at 02:06:27PM +, Arnd Bergmann wrote: > This is clearly a good idea, and I'm happy to take that through > the asm-generic tree if there are no complaints. > > Do you have any other patches that depend on it? Well, I have new code that would benefit from these helpers, but just open coding it for now and then doing a swipe to clean that up later together with the existing open coded versions is easy enough. > > -/* > > - * Change "struct page" to physical address. > > - */ > > -static inline phys_addr_t page_to_phys(struct page *page) > > -{ > > - unsigned long pfn = page_to_pfn(page); > > - > > - WARN_ON(IS_ENABLED(CONFIG_DEBUG_VIRTUAL) && !pfn_valid(pfn)); > > - > > - return PFN_PHYS(pfn); > > -} > > This part is technically a change in behavior, not sure how > much anyone cares. Well, the only other comment to the patch so far mentioned it. It also feels like a useful check, but I'm a bit worried about it triggering in various new places. Although that's just with CONFIG_DEBUG_VIRTUAL and probably points to real bugs, so maybe adding it everywhere is a good idea. > > +#define page_to_phys(page) __pfn_to_phys(page_to_pfn(page)) > > +#define phys_to_page(phys) pfn_to_page(__phys_to_pfn(phys)) > > I think we should try to have a little fewer nested macros > to evaluate here, right now this ends up expanding > __pfn_to_phys, PFN_PHYS, PAGE_SHIFT, CONFIG_PAGE_SHIFT, > page_to_pfn and __page_to_pfn. While the behavior is fine, > modern gcc versions list all of those in an warning message > if someone passes the wrong arguments. > > Changing the two macros above into inline functions > would help as well, but may cause other problems. Doing them as inlines seems useful to me, let me throw that at the buildbot and see if anything explodes. > On a related note, it would be even better if we could come > up with a generic definition for either __pa/__va or > virt_to_phys/phys_to_virt. Most architectures define one > of the two pairs in terms of the other, which leads to > confusion with header include order. Agreed, but that's a separate project. ___ linux-snps-arc mailing list linux-snps-arc@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-snps-arc
Re: [PATCH v5 7/8] execmem: add support for cache of large ROX pages
Op 09-10-2024 om 20:08 schreef Mike Rapoport: From: "Mike Rapoport (Microsoft)" Using large pages to map text areas reduces iTLB pressure and improves performance. Extend execmem_alloc() with an ability to use huge pages with ROX permissions as a cache for smaller allocations. To populate the cache, a writable large page is allocated from vmalloc with VM_ALLOW_HUGE_VMAP, filled with invalid instructions and then remapped as ROX. Portions of that large page are handed out to execmem_alloc() callers without any changes to the permissions. When the memory is freed with execmem_free() it is invalidated again so that it won't contain stale instructions. The cache is enabled when an architecture sets EXECMEM_ROX_CACHE flag in definition of an execmem_range. Signed-off-by: Mike Rapoport (Microsoft) --- include/linux/execmem.h | 2 + mm/execmem.c| 317 +++- mm/internal.h | 1 + mm/vmalloc.c| 5 + 4 files changed, 320 insertions(+), 5 deletions(-) [...] +static void execmem_cache_clean(struct work_struct *work) +{ + struct maple_tree *free_areas = &execmem_cache.free_areas; + struct mutex *mutex = &execmem_cache.mutex; + MA_STATE(mas, free_areas, 0, ULONG_MAX); + void *area; + + mutex_lock(mutex); + mas_for_each(&mas, area, ULONG_MAX) { + size_t size; + No need to check for !area, because it is already guaranteed by the while loop condition (mas_for_each) + if (!area) + continue; + + size = mas_range_len(&mas); + + if (IS_ALIGNED(size, PMD_SIZE) && + IS_ALIGNED(mas.index, PMD_SIZE)) { + struct vm_struct *vm = find_vm_area(area); + + execmem_set_direct_map_valid(vm, true); + mas_store_gfp(&mas, NULL, GFP_KERNEL); + vfree(area); + } + } + mutex_unlock(mutex); +} ___ linux-snps-arc mailing list linux-snps-arc@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-snps-arc
Re: [PATCH v5 3/8] asm-generic: introduce text-patching.h
On Wed, Oct 9, 2024 at 8:09 PM Mike Rapoport wrote: > From: "Mike Rapoport (Microsoft)" > > Several architectures support text patching, but they name the header > files that declare patching functions differently. > > Make all such headers consistently named text-patching.h and add an empty > header in asm-generic for architectures that do not support text patching. > > Signed-off-by: Mike Rapoport (Microsoft) > arch/m68k/include/asm/Kbuild | 1 + Acked-by: Geert Uytterhoeven # m68k Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds ___ linux-snps-arc mailing list linux-snps-arc@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-snps-arc