Re: [PATCH RFC v3 1/2] mm: Add personality flag to limit address to 47 bits
Charlie Jenkins writes:
> Create a personality flag ADDR_LIMIT_47BIT to support applications
> that wish to transition from running in environments that support at
> most 47-bit VAs to environments that support larger VAs. This
> personality can be set to cause all allocations to be below the 47-bit
> boundary. Using MAP_FIXED with mmap() will bypass this restriction.
>
> Signed-off-by: Charlie Jenkins
> ---
>  include/uapi/linux/personality.h | 1 +
>  mm/mmap.c                        | 3 +++
>  2 files changed, 4 insertions(+)
>
> diff --git a/include/uapi/linux/personality.h b/include/uapi/linux/personality.h
> index 49796b7756af..cd3b8c154d9b 100644
> --- a/include/uapi/linux/personality.h
> +++ b/include/uapi/linux/personality.h
> @@ -22,6 +22,7 @@ enum {
>  	WHOLE_SECONDS =		0x2000000,
>  	STICKY_TIMEOUTS =	0x4000000,
>  	ADDR_LIMIT_3GB =	0x8000000,
> +	ADDR_LIMIT_47BIT =	0x10000000,
>  };

I wonder if ADDR_LIMIT_128T would be clearer?

Have you looked at writing an update for the personality(2) man page? :)

cheers

___
linux-snps-arc mailing list
linux-snps-arc@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-snps-arc
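For readers following along, here is a minimal userspace sketch of how a process might request the proposed flag. It is an illustration only: ADDR_LIMIT_47BIT is not in released uapi headers, so the constant below is a hypothetical stand-in named RFC_ADDR_LIMIT_47BIT. Its value assumes the next free bit after ADDR_LIMIT_3GB (0x8000000); the archived copy of the diff appears to have lost a run of zeros from these constants. The helper names are likewise made up for this sketch.

```c
#include <sys/personality.h>

/*
 * Hypothetical: the value the RFC would assign, i.e. the bit after
 * ADDR_LIMIT_3GB (0x8000000). Not present in released uapi headers.
 */
#define RFC_ADDR_LIMIT_47BIT 0x10000000UL

/* Pure helper: the persona value with the proposed flag ORed in. */
unsigned long persona_with_47bit_limit(unsigned long persona)
{
	return persona | RFC_ADDR_LIMIT_47BIT;
}

/*
 * Query the current persona (0xffffffff queries without changing it),
 * then request the proposed flag on top of it.
 * Returns 0 on success, -1 on error.
 */
int request_47bit_limit(void)
{
	int cur = personality(0xffffffff);

	if (cur == -1)
		return -1;
	if (personality(persona_with_47bit_limit((unsigned int)cur)) == -1)
		return -1;
	return 0;
}
```

A launcher would typically call request_47bit_limit() and then exec the target program, much as setarch(8) does for existing personality flags. On a kernel without the patch the call is expected to succeed but have no effect on mmap() placement.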
Re: [PATCH RFC v3 1/2] mm: Add personality flag to limit address to 47 bits
On Thu, Sep 5, 2024, at 21:15, Charlie Jenkins wrote:
> Create a personality flag ADDR_LIMIT_47BIT to support applications
> that wish to transition from running in environments that support at
> most 47-bit VAs to environments that support larger VAs. This
> personality can be set to cause all allocations to be below the 47-bit
> boundary. Using MAP_FIXED with mmap() will bypass this restriction.
>
> Signed-off-by: Charlie Jenkins

I think having an architecture-independent mechanism to limit the size
of the 64-bit address space is useful in general, and we've discussed
the same thing for arm64 in the past, though we have not actually
reached an agreement on the ABI previously.

> @@ -22,6 +22,7 @@ enum {
>  	WHOLE_SECONDS =		0x2000000,
>  	STICKY_TIMEOUTS =	0x4000000,
>  	ADDR_LIMIT_3GB =	0x8000000,
> +	ADDR_LIMIT_47BIT =	0x10000000,
>  };

I'm a bit worried about having this done specifically in the
personality flag bits, as they are rather limited. We obviously
don't want to add many more such flags when there could be
a way to just set the default limit.

It's also unclear to me how we want this flag to interact with
the existing logic in arch_get_mmap_end(), which attempts to
limit the default mapping to a 47-bit address space already.

For some reason, it appears that the arch_get_mmap_end()
logic on RISC-V defaults to the maximum address
space for the 'addr==0' case which is inconsistent with
the other architectures, so we should probably fix that
part first, possibly moving more of that logic into a
shared implementation.

      Arnd
Re: [PATCH RFC v3 1/2] mm: Add personality flag to limit address to 47 bits
On Fri, Sep 06, 2024 at 07:17:44AM GMT, Arnd Bergmann wrote:
> On Thu, Sep 5, 2024, at 21:15, Charlie Jenkins wrote:
> > Create a personality flag ADDR_LIMIT_47BIT to support applications
> > that wish to transition from running in environments that support at
> > most 47-bit VAs to environments that support larger VAs. This
> > personality can be set to cause all allocations to be below the 47-bit
> > boundary. Using MAP_FIXED with mmap() will bypass this restriction.
> >
> > Signed-off-by: Charlie Jenkins
>
> I think having an architecture-independent mechanism to limit the size
> of the 64-bit address space is useful in general, and we've discussed
> the same thing for arm64 in the past, though we have not actually
> reached an agreement on the ABI previously.

The thread on the original proposals attests to this being rather a fraught
topic, and I think the weight of opinion was more so in favour of opt-in
rather than opt-out.

>
> > @@ -22,6 +22,7 @@ enum {
> >  	WHOLE_SECONDS =		0x2000000,
> >  	STICKY_TIMEOUTS =	0x4000000,
> >  	ADDR_LIMIT_3GB =	0x8000000,
> > +	ADDR_LIMIT_47BIT =	0x10000000,
> >  };
>
> I'm a bit worried about having this done specifically in the
> personality flag bits, as they are rather limited. We obviously
> don't want to add many more such flags when there could be
> a way to just set the default limit.

Since I'm the one who suggested it, I feel I should offer some kind of
vague defence here :)

We shouldn't let perfect be the enemy of the good. This is a relatively
straightforward means of achieving the aim (assuming your concern about
arch_get_mmap_end() below isn't a blocker) which has the least impact on
existing code.

Of course we can end up in absurdities where we start doing
ADDR_LIMIT_xxBIT... but again - it's simple, shouldn't represent an
egregious maintenance burden and is entirely opt-in so has things going for
it.

>
> It's also unclear to me how we want this flag to interact with
> the existing logic in arch_get_mmap_end(), which attempts to
> limit the default mapping to a 47-bit address space already.

How does ADDR_LIMIT_3GB presently interact with that?

>
> For some reason, it appears that the arch_get_mmap_end()
> logic on RISC-V defaults to the maximum address
> space for the 'addr==0' case which is inconsistent with
> the other architectures, so we should probably fix that
> part first, possibly moving more of that logic into a
> shared implementation.
>
>       Arnd
Re: [PATCH RFC v3 1/2] mm: Add personality flag to limit address to 47 bits
On Fri, Sep 6, 2024, at 08:14, Lorenzo Stoakes wrote:
> On Fri, Sep 06, 2024 at 07:17:44AM GMT, Arnd Bergmann wrote:
>> On Thu, Sep 5, 2024, at 21:15, Charlie Jenkins wrote:
>> > Create a personality flag ADDR_LIMIT_47BIT to support applications
>> > that wish to transition from running in environments that support at
>> > most 47-bit VAs to environments that support larger VAs. This
>> > personality can be set to cause all allocations to be below the 47-bit
>> > boundary. Using MAP_FIXED with mmap() will bypass this restriction.
>> >
>> > Signed-off-by: Charlie Jenkins
>>
>> I think having an architecture-independent mechanism to limit the size
>> of the 64-bit address space is useful in general, and we've discussed
>> the same thing for arm64 in the past, though we have not actually
>> reached an agreement on the ABI previously.
>
> The thread on the original proposals attests to this being rather a fraught
> topic, and I think the weight of opinion was more so in favour of opt-in
> rather than opt-out.

You mean opt-in to using the larger addresses like we do on arm64 and
powerpc, while "opt-out" means a limit as Charlie suggested?

>> > @@ -22,6 +22,7 @@ enum {
>> >  	WHOLE_SECONDS =		0x2000000,
>> >  	STICKY_TIMEOUTS =	0x4000000,
>> >  	ADDR_LIMIT_3GB =	0x8000000,
>> > +	ADDR_LIMIT_47BIT =	0x10000000,
>> >  };
>>
>> I'm a bit worried about having this done specifically in the
>> personality flag bits, as they are rather limited. We obviously
>> don't want to add many more such flags when there could be
>> a way to just set the default limit.
>
> Since I'm the one who suggested it, I feel I should offer some kind of
> vague defence here :)
>
> We shouldn't let perfect be the enemy of the good. This is a relatively
> straightforward means of achieving the aim (assuming your concern about
> arch_get_mmap_end() below isn't a blocker) which has the least impact on
> existing code.
>
> Of course we can end up in absurdities where we start doing
> ADDR_LIMIT_xxBIT... but again - it's simple, shouldn't represent an
> egregious maintenance burden and is entirely opt-in so has things going for
> it.

I'm more confused now, I think most importantly we should try to
handle this consistently across all architectures. The proposed
implementation seems to completely block addresses above BIT(47)
even for applications that opt in by calling mmap(BIT(47), ...),
which seems to break the existing applications.

If we want this flag for RISC-V and also keep the behavior of
defaulting to >BIT(47) addresses for mmap(0, ...) how about
changing arch_get_mmap_end() to return the limit based on
ADDR_LIMIT_47BIT and then make this default to enabled on
arm64 and powerpc but disabled on riscv?

>> It's also unclear to me how we want this flag to interact with
>> the existing logic in arch_get_mmap_end(), which attempts to
>> limit the default mapping to a 47-bit address space already.
>
> How does ADDR_LIMIT_3GB presently interact with that?

That is x86 specific and only relevant to compat tasks, limiting
them to 3 instead of 4 GB. There is also ADDR_LIMIT_32BIT, which
on arm32 is always set in practice to allow 32-bit addressing
as opposed to ARMv2 style 26-bit addressing (IIRC ARMv3 supported
both 26-bit and 32-bit addressing, while ARMv4 through ARMv7 are
32-bit only).

      Arnd
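The suggestion of a per-architecture default for the flag can be modelled in a few lines of plain C. This is only a sketch of one reading of the proposal, restricted to the mmap(0, ...) case: the enum, the function names, and the idea of passing the task size in as a parameter are all assumptions made for illustration, not kernel interfaces.

```c
#include <stdbool.h>
#include <stdint.h>

#define WINDOW_47BIT (UINT64_C(1) << 47)

/* Hypothetical set of architectures, for illustration only. */
enum arch { ARCH_ARM64, ARCH_POWERPC, ARCH_RISCV };

/*
 * Assumed default state of the flag: enabled on arm64 and powerpc,
 * disabled on riscv, as floated in the message above.
 */
bool limit_default_enabled(enum arch a)
{
	return a != ARCH_RISCV;
}

/*
 * Model of the upper bound used for a search with no address hint:
 * the 47-bit window when the flag is set (and the task size is larger
 * than the window), otherwise the full task size.
 */
uint64_t model_mmap_end(bool limit_47bit, uint64_t task_size)
{
	if (limit_47bit && task_size > WINDOW_47BIT)
		return WINDOW_47BIT;
	return task_size;
}
```

Under this model a riscv process would keep today's behaviour unless it sets the flag, while arm64 and powerpc processes would keep theirs unless they clear it.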
Re: [PATCH RFC v3 1/2] mm: Add personality flag to limit address to 47 bits
On Fri, Sep 6, 2024 at 3:18 PM Arnd Bergmann wrote:
>
> On Thu, Sep 5, 2024, at 21:15, Charlie Jenkins wrote:
> > Create a personality flag ADDR_LIMIT_47BIT to support applications
> > that wish to transition from running in environments that support at
> > most 47-bit VAs to environments that support larger VAs. This
> > personality can be set to cause all allocations to be below the 47-bit
> > boundary. Using MAP_FIXED with mmap() will bypass this restriction.
> >
> > Signed-off-by: Charlie Jenkins
>
> I think having an architecture-independent mechanism to limit the size
> of the 64-bit address space is useful in general, and we've discussed
> the same thing for arm64 in the past, though we have not actually
> reached an agreement on the ABI previously.
>
> > @@ -22,6 +22,7 @@ enum {
> >  	WHOLE_SECONDS =		0x2000000,
> >  	STICKY_TIMEOUTS =	0x4000000,
> >  	ADDR_LIMIT_3GB =	0x8000000,
> > +	ADDR_LIMIT_47BIT =	0x10000000,
> >  };
>
> I'm a bit worried about having this done specifically in the
> personality flag bits, as they are rather limited. We obviously
> don't want to add many more such flags when there could be
> a way to just set the default limit.
>
> It's also unclear to me how we want this flag to interact with
> the existing logic in arch_get_mmap_end(), which attempts to
> limit the default mapping to a 47-bit address space already.

To optimize RISC-V progress, I recommend:

Step 1: Approve the patch.
Step 2: Update Go and OpenJDK's RISC-V backend to utilize it.
Step 3: Wait approximately several iterations for Go & OpenJDK
Step 4: Remove the 47-bit constraint in arch_get_mmap_end()

>
> For some reason, it appears that the arch_get_mmap_end()
> logic on RISC-V defaults to the maximum address
> space for the 'addr==0' case which is inconsistent with
> the other architectures, so we should probably fix that
> part first, possibly moving more of that logic into a
> shared implementation.
>
>       Arnd
>

--
Best Regards
 Guo Ren
Re: [PATCH v2 7/8] execmem: add support for cache of large ROX pages
On 26/08/2024 at 08:55, Mike Rapoport wrote:
> From: "Mike Rapoport (Microsoft)"
>
> Using large pages to map text areas reduces iTLB pressure and improves
> performance.
>
> Extend execmem_alloc() with an ability to use PMD_SIZE'ed pages with ROX
> permissions as a cache for smaller allocations.

Why only PMD_SIZE?

On power 8xx, PMD_SIZE is 4M and the 8xx doesn't have such a page size.
When you call vmalloc() with VM_ALLOW_HUGE_VMAP you get 16k pages or
512k pages depending on the size you ask for; see function
arch_vmap_pte_supported_shift().

> To populate the cache, a writable large page is allocated from vmalloc with
> VM_ALLOW_HUGE_VMAP, filled with invalid instructions and then remapped as
> ROX.
>
> Portions of that large page are handed out to execmem_alloc() callers
> without any changes to the permissions.
>
> When the memory is freed with execmem_free() it is invalidated again so
> that it won't contain stale instructions.
>
> The cache is enabled when an architecture sets EXECMEM_ROX_CACHE flag in
> definition of an execmem_range.

Christophe
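As a rough userspace analogy of the cache population described in the quoted patch text, the sketch below allocates a writable region, fills it with a placeholder byte, flips it to read+execute, and then hands out portions without touching permissions. It is not the kernel implementation: struct rox_cache, both functions, and the filler byte are hypothetical, a real invalid-instruction pattern is architecture specific, and the invalidate-on-free step is left out.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical bookkeeping for one cached region. */
struct rox_cache {
	unsigned char *base;	/* start of the region */
	size_t size;		/* total bytes in the region */
	size_t used;		/* bytes already handed out */
};

/* Placeholder filler; a real implementation uses an arch-specific trap. */
#define FILL_BYTE 0xcc

/* Allocate RW, fill, then remap as read+execute. Returns 0 or -1. */
int rox_cache_init(struct rox_cache *c, size_t size)
{
	void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED)
		return -1;
	memset(p, FILL_BYTE, size);
	if (mprotect(p, size, PROT_READ | PROT_EXEC) != 0) {
		munmap(p, size);
		return -1;
	}
	c->base = p;
	c->size = size;
	c->used = 0;
	return 0;
}

/* Hand out the next len bytes, or NULL if the region is exhausted. */
void *rox_cache_alloc(struct rox_cache *c, size_t len)
{
	void *p;

	if (len == 0 || len > c->size - c->used)
		return NULL;
	p = c->base + c->used;
	c->used += len;
	return p;
}
```

The region size passed to rox_cache_init() is the knob the PMD_SIZE question above is about: on a platform whose large page sizes differ from PMD_SIZE, a fixed PMD_SIZE region would not map onto a single large page.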
Re: [PATCH RFC v3 1/2] mm: Add personality flag to limit address to 47 bits
(Sorry, having issues with my IPv6 setup that duplicated the original email...)

On Fri, Sep 06, 2024 at 09:14:08AM GMT, Arnd Bergmann wrote:
> On Fri, Sep 6, 2024, at 08:14, Lorenzo Stoakes wrote:
> > On Fri, Sep 06, 2024 at 07:17:44AM GMT, Arnd Bergmann wrote:
> >> On Thu, Sep 5, 2024, at 21:15, Charlie Jenkins wrote:
> >> > Create a personality flag ADDR_LIMIT_47BIT to support applications
> >> > that wish to transition from running in environments that support at
> >> > most 47-bit VAs to environments that support larger VAs. This
> >> > personality can be set to cause all allocations to be below the 47-bit
> >> > boundary. Using MAP_FIXED with mmap() will bypass this restriction.
> >> >
> >> > Signed-off-by: Charlie Jenkins
> >>
> >> I think having an architecture-independent mechanism to limit the size
> >> of the 64-bit address space is useful in general, and we've discussed
> >> the same thing for arm64 in the past, though we have not actually
> >> reached an agreement on the ABI previously.
> >
> > The thread on the original proposals attests to this being rather a fraught
> > topic, and I think the weight of opinion was more so in favour of opt-in
> > rather than opt-out.
>
> You mean opt-in to using the larger addresses like we do on arm64 and
> powerpc, while "opt-out" means a limit as Charlie suggested?

I guess I'm not using brilliant terminology here haha!

To clarify - the weight of opinion was for a situation where the address
space is limited, except if you set a hint above that (you could call that
opt-out or opt-in depending which way you look at it, so yeah ok very
unclear sorry!).

It was against the MAP_ flag and also I think a _flexible_ per-process
limit is also questionable as you might end up setting a limit which breaks
something else, and this starts getting messy quick.

To be clear, the ADDR_LIMIT_47BIT suggestion is absolutely a compromise and
practical suggestion.

>
> >> > @@ -22,6 +22,7 @@ enum {
> >> >  	WHOLE_SECONDS =		0x2000000,
> >> >  	STICKY_TIMEOUTS =	0x4000000,
> >> >  	ADDR_LIMIT_3GB =	0x8000000,
> >> > +	ADDR_LIMIT_47BIT =	0x10000000,
> >> >  };
> >>
> >> I'm a bit worried about having this done specifically in the
> >> personality flag bits, as they are rather limited. We obviously
> >> don't want to add many more such flags when there could be
> >> a way to just set the default limit.
> >
> > Since I'm the one who suggested it, I feel I should offer some kind of
> > vague defence here :)
> >
> > We shouldn't let perfect be the enemy of the good. This is a relatively
> > straightforward means of achieving the aim (assuming your concern about
> > arch_get_mmap_end() below isn't a blocker) which has the least impact on
> > existing code.
> >
> > Of course we can end up in absurdities where we start doing
> > ADDR_LIMIT_xxBIT... but again - it's simple, shouldn't represent an
> > egregious maintenance burden and is entirely opt-in so has things going for
> > it.
>
> I'm more confused now, I think most importantly we should try to
> handle this consistently across all architectures. The proposed
> implementation seems to completely block addresses above BIT(47)
> even for applications that opt in by calling mmap(BIT(47), ...),
> which seems to break the existing applications.

Hm, I thought the commit message suggested the hint overrides it still?

The intent is to optionally be able to run a process that keeps higher bits
free for tagging and to be sure no memory mapping in the process will
clobber these (correct me if I'm wrong Charlie! :)

So you really wouldn't want this if you are using tagged pointers, you'd
want to be sure literally nothing touches the higher bits.

>
> If we want this flag for RISC-V and also keep the behavior of
> defaulting to >BIT(47) addresses for mmap(0, ...) how about
> changing arch_get_mmap_end() to return the limit based on
> ADDR_LIMIT_47BIT and then make this default to enabled on
> arm64 and powerpc but disabled on riscv?

But you wouldn't necessarily want all processes to be so restricted, I
think this is what Charlie's trying to avoid :)

On the other hand - I'm not sure there are many processes on any arch
that'd want the higher mappings.

So that'd push us again towards risc v just limiting to 48-bits and only
mapping above this if a hint is provided like x86-64 does (and as you
mentioned via IRC - it seems risc v is an outlier in that
DEFAULT_MAP_WINDOW == TASK_SIZE).

This would be more consistent vs. other arches.

>
> >> It's also unclear to me how we want this flag to interact with
> >> the existing logic in arch_get_mmap_end(), which attempts to
> >> limit the default mapping to a 47-bit address space already.
> >
> > How does ADDR_LIMIT_3GB presently interact with that?
>
> That is x86 specific and only relevant to compat tasks, limiting
> them to 3 instead of 4 GB. There is also ADDR_LIMIT_32BIT, which
> on arm32 is always set in practice to allow 32-bit addressing
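To make the tagging concern concrete, here is a small sketch of why a runtime that stores tags in the upper pointer bits needs every mapping to stay below a known boundary. The 16-bit tag in bits 48..63 is a hypothetical scheme chosen for illustration, as are the function names; real runtimes differ in where and how they place tags.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if no address bit at or above position `bits` is set. */
bool fits_below(uint64_t addr, unsigned bits)
{
	return bits >= 64 || (addr >> bits) == 0;
}

/* Hypothetical scheme: keep a 16-bit tag in bits 48..63 of a pointer. */
uint64_t tag_pointer(uint64_t addr, uint16_t tag)
{
	return addr | ((uint64_t)tag << 48);
}

/* Strip the tag again; only correct if addr never used bits 48..63. */
uint64_t untag_pointer(uint64_t tagged)
{
	return tagged & ((UINT64_C(1) << 48) - 1);
}
```

An address below the 47-bit boundary survives a tag/untag round trip, whereas an address handed out above bit 48 comes back altered, which is the clobbering described above.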
Re: [PATCH RFC v3 1/2] mm: Add personality flag to limit address to 47 bits
On Fri, Sep 6, 2024, at 09:14, Guo Ren wrote:
> On Fri, Sep 6, 2024 at 3:18 PM Arnd Bergmann wrote:
>>
>> It's also unclear to me how we want this flag to interact with
>> the existing logic in arch_get_mmap_end(), which attempts to
>> limit the default mapping to a 47-bit address space already.
>
> To optimize RISC-V progress, I recommend:
>
> Step 1: Approve the patch.
> Step 2: Update Go and OpenJDK's RISC-V backend to utilize it.
> Step 3: Wait approximately several iterations for Go & OpenJDK
> Step 4: Remove the 47-bit constraint in arch_get_mmap_end()

I really want to first see a plausible explanation about why
RISC-V can't just implement this using a 47-bit DEFAULT_MAP_WINDOW
like all the other major architectures (x86, arm64, powerpc64),
e.g. something like the patch below (untested, probably slightly
wrong but should illustrate my point).

     Arnd

diff --git a/arch/riscv/include/asm/processor.h b/arch/riscv/include/asm/processor.h
index 8702b8721a27..de9863be1efd 100644
--- a/arch/riscv/include/asm/processor.h
+++ b/arch/riscv/include/asm/processor.h
@@ -20,17 +20,8 @@
  * mmap_end < addr, being mmap_end the top of that address space.
  * See Documentation/arch/riscv/vm-layout.rst for more details.
  */
-#define arch_get_mmap_end(addr, len, flags)			\
-({								\
-	unsigned long mmap_end;					\
-	typeof(addr) _addr = (addr);				\
-	if ((_addr) == 0 || is_compat_task() ||			\
-	    ((_addr + len) > BIT(VA_BITS - 1)))			\
-		mmap_end = STACK_TOP_MAX;			\
-	else							\
-		mmap_end = (_addr + len);			\
-	mmap_end;						\
-})
+#define arch_get_mmap_end(addr, len, flags) \
+		(((addr) > DEFAULT_MAP_WINDOW) ? TASK_SIZE : DEFAULT_MAP_WINDOW)
 
 #define arch_get_mmap_base(addr, base)				\
 ({								\
@@ -47,7 +38,7 @@
 })
 
 #ifdef CONFIG_64BIT
-#define DEFAULT_MAP_WINDOW	(UL(1) << (MMAP_VA_BITS - 1))
+#define DEFAULT_MAP_WINDOW	(is_compat_task() ? (UL(1) << (MMAP_VA_BITS - 1)) : TASK_SIZE_32)
 #define STACK_TOP_MAX		TASK_SIZE_64
 #else
 #define DEFAULT_MAP_WINDOW	TASK_SIZE
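The difference between the macro removed by the diff and the one it adds can be seen by modelling both as plain functions. This is a userspace sketch with the kernel constants passed in as parameters; struct mm_params and the function names are invented for illustration, and the compat-task handling in the DEFAULT_MAP_WINDOW hunk is not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-ins for the kernel constants used by the two macros. */
struct mm_params {
	uint64_t va_half;		/* BIT(VA_BITS - 1) */
	uint64_t stack_top_max;		/* STACK_TOP_MAX */
	uint64_t task_size;		/* TASK_SIZE */
	uint64_t default_map_window;	/* DEFAULT_MAP_WINDOW */
};

/* Behaviour of the arch_get_mmap_end() definition the diff removes. */
uint64_t old_mmap_end(const struct mm_params *p, uint64_t addr,
		      uint64_t len, bool compat)
{
	if (addr == 0 || compat || addr + len > p->va_half)
		return p->stack_top_max;
	return addr + len;
}

/* Behaviour of the arch_get_mmap_end() definition the diff adds. */
uint64_t new_mmap_end(const struct mm_params *p, uint64_t addr)
{
	if (addr > p->default_map_window)
		return p->task_size;
	return p->default_map_window;
}
```

With illustrative Sv57-like values (a 2^56 task size and a 2^47 window), a hint of 0 yields the full task size under the old definition but the 47-bit window under the new one, while a hint above the window still unlocks the full range.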
Re: [PATCH RFC v3 1/2] mm: Add personality flag to limit address to 47 bits
On Fri, Sep 06, 2024 at 09:55:42AM +0000, Arnd Bergmann wrote:
> On Fri, Sep 6, 2024, at 09:14, Guo Ren wrote:
> > On Fri, Sep 6, 2024 at 3:18 PM Arnd Bergmann wrote:
> >> It's also unclear to me how we want this flag to interact with
> >> the existing logic in arch_get_mmap_end(), which attempts to
> >> limit the default mapping to a 47-bit address space already.
> >
> > To optimize RISC-V progress, I recommend:
> >
> > Step 1: Approve the patch.
> > Step 2: Update Go and OpenJDK's RISC-V backend to utilize it.
> > Step 3: Wait approximately several iterations for Go & OpenJDK
> > Step 4: Remove the 47-bit constraint in arch_get_mmap_end()
>
> I really want to first see a plausible explanation about why
> RISC-V can't just implement this using a 47-bit DEFAULT_MAP_WINDOW
> like all the other major architectures (x86, arm64, powerpc64),

FWIW arm64 actually limits DEFAULT_MAP_WINDOW to 48-bit in the default
configuration. We end up with a 47-bit limit with 16K pages but for a
different reason that has to do with LPA2 support (I doubt we need this
for the user mapping but we need to untangle some of the macros there;
that's for a separate discussion).

That said, we haven't encountered any user space problems with a 48-bit
DEFAULT_MAP_WINDOW. So I also think RISC-V should follow a similar
approach (47 or 48 bit default limit). Better to have some ABI
consistency between architectures. One can still ask for addresses above
this default limit via mmap().

--
Catalin
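For scale, the two default limits under discussion work out as follows. The helper names are made up for this sketch; the arithmetic is simply 2^bits bytes expressed in TiB (2^40 bytes).

```c
#include <stdint.h>

/* Size in bytes of an address window of `bits` bits (bits must be < 64). */
uint64_t window_bytes(unsigned bits)
{
	return UINT64_C(1) << bits;
}

/* The same window expressed in TiB (bits must be >= 40 and < 64). */
uint64_t window_tib(unsigned bits)
{
	return window_bytes(bits) >> 40;
}
```

A 47-bit window is 128 TiB, which is where the ADDR_LIMIT_128T naming suggested earlier in the thread comes from, and a 48-bit window is 256 TiB.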