Re: [PATCH 1/3] futex: remove duplicated code
On Fri, Mar 03, 2017 at 01:27:10PM +0100, Jiri Slaby wrote:
> There is code duplicated over all architecture's headers for
> futex_atomic_op_inuser. Namely op decoding, access_ok check for uaddr,
> and comparison of the result.
>
> Remove this duplication and leave up to the arches only the needed
> assembly which is now in arch_futex_atomic_op_inuser.
>
> Note that s390 removed access_ok check in d12a29703 ("s390/uaccess:
> remove pointless access_ok() checks") as access_ok there returns true.
> We introduce it back to the helper for the sake of simplicity (it gets
> optimized away anyway).

Overall I'm in favor of this patch, and it's close to what I had in
mind in the commit message for
00b73d8d1b7131da03aec73011a7286f566fe87f. But I'd actually like to see
it go further.

These ops are mainly (only?) used for the (almost never used)
FUTEX_WAKE_OP operation, and there's very little sense in trying to
optimize them with dedicated arch-specific forms like "lock xadd".
Instead the entire logic should be in an arch-generic file, and all
the arch should need to provide is a cmpxchg-on-user-memory primitive
for it to use. On most archs, the same cmpxchg used in kernelspace
should also work for user addresses, meaning a huge amount of
unmaintained, largely untested, junk code can be removed.

Rich

___
linux-snps-arc mailing list
linux-snps-arc@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-snps-arc
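A rough user-space sketch of the generic approach argued for above: every FUTEX_WAKE_OP operation expressed as a compare-and-exchange retry loop, so that cmpxchg is the only primitive an architecture has to supply. This is not kernel code. C11 atomics stand in for a cmpxchg-on-user-memory primitive, and the names wake_op, apply_wake_op and generic_atomic_op are hypothetical; only the operation numbering follows FUTEX_OP_SET through FUTEX_OP_XOR from <linux/futex.h>.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Same numbering as FUTEX_OP_SET..FUTEX_OP_XOR; redefined here
 * (hypothetical names) so the sketch is self-contained. */
enum wake_op {
    WAKE_OP_SET  = 0,
    WAKE_OP_ADD  = 1,
    WAKE_OP_OR   = 2,
    WAKE_OP_ANDN = 3,
    WAKE_OP_XOR  = 4
};

/* Pure computation of the new value; no architecture-specific code. */
uint32_t apply_wake_op(enum wake_op op, uint32_t old, uint32_t arg)
{
    switch (op) {
    case WAKE_OP_SET:  return arg;
    case WAKE_OP_ADD:  return old + arg;
    case WAKE_OP_OR:   return old | arg;
    case WAKE_OP_ANDN: return old & ~arg;
    case WAKE_OP_XOR:  return old ^ arg;
    }
    return old;
}

/* Atomically replaces *uaddr with op(*uaddr, arg) and returns the
 * previous value. The compare-exchange is the only atomic primitive. */
uint32_t generic_atomic_op(_Atomic uint32_t *uaddr, enum wake_op op,
                           uint32_t arg)
{
    uint32_t old = atomic_load(uaddr);
    uint32_t newval;

    do {
        newval = apply_wake_op(op, old, arg);
        /* On failure, 'old' is refreshed with the current value. */
    } while (!atomic_compare_exchange_weak(uaddr, &old, newval));

    return old;
}
```

The retry loop is slower than a dedicated instruction such as "lock xadd" under contention, which is the trade-off the message considers acceptable for a rarely used operation.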
Re: [RFC PATCH 00/13] Introduce first class virtual address spaces
On Wed, Mar 15, 2017 at 12:44:47PM -0700, Till Smejkal wrote:
> On Wed, 15 Mar 2017, Andy Lutomirski wrote:
> > > One advantage of VAS segments is that they can be globally queried
> > > by user programs which means that VAS segments can be shared by
> > > applications that not necessarily have to be related. If I am not
> > > mistaken, MAP_SHARED of pure in memory data will only work if the
> > > tasks that share the memory region are related (aka. have a common
> > > parent that initialized the shared mapping). Otherwise, the shared
> > > mapping have to be backed by a file.
> >
> > What's wrong with memfd_create()?
> >
> > > VAS segments on the other side allow sharing of pure in memory data
> > > by arbitrary related tasks without the need of a file. This becomes
> > > especially interesting if one combines VAS segments with
> > > non-volatile memory since one can keep data structures in the NVM
> > > and still be able to share them between multiple tasks.
> >
> > What's wrong with regular mmap?
>
> I never wanted to say that there is something wrong with regular mmap.
> We just figured that with VAS segments you could remove the need to
> mmap your shared data but instead can keep everything purely in memory.
>
> Unfortunately, I am not at full speed with memfds. Is my understanding
> correct that if the last user of such a file descriptor closes it, the
> corresponding memory is freed? Accordingly, memfd cannot be used to
> keep data in memory while no program is currently using it, can it? To
> be able to do this you need again some representation

I have a name for application-allocated kernel resources that persist
without a process holding a reference to them or a node in the
filesystem: a bug. See: sysvipc.

Rich
Re: [RFC PATCH v2 0/2] Randomization of address chosen by mmap.
On Fri, Mar 23, 2018 at 05:48:06AM -0700, Matthew Wilcox wrote:
> On Thu, Mar 22, 2018 at 07:36:36PM +0300, Ilya Smith wrote:
> > Current implementation doesn't randomize address returned by mmap.
> > All the entropy ends with choosing mmap_base_addr at the process
> > creation. After that mmap build very predictable layout of address
> > space. It allows to bypass ASLR in many cases. This patch make
> > randomization of address on any mmap call.
>
> Why should this be done in the kernel rather than libc? libc is perfectly
> capable of specifying random numbers in the first argument of mmap.

Generally libc does not have a view of the current vm maps, and thus
in passing "random numbers", they would have to be uniform across the
whole vm space and thus non-uniform once the kernel rounds up to avoid
existing mappings.

Also this would impose requirements that libc be aware of the kernel's
use of the virtual address space and what's available to userspace --
for example, on 32-bit archs whether 2GB, 3GB, or full 4GB (for
32-bit-user-on-64-bit-kernel) is available, and on 64-bit archs where
fewer than the full 64 bits are actually valid in addresses, what the
actual usable pointer size is. There is currently no clean way of
conveying this information to userspace.

Rich
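To make the objection concrete, here is a minimal sketch of what "libc passes random numbers in the first argument of mmap" could look like. The helper names and the va_bits parameter are assumptions made up for illustration. va_bits is exactly the piece of information that, per the message above, userspace has no clean way to obtain, and without MAP_FIXED the kernel remains free to place the mapping somewhere other than the hint.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/random.h>

/* Hypothetical helper: returns a page-aligned random address below
 * 2^va_bits, or NULL (meaning "no hint") if no randomness is available.
 * va_bits must be less than 64 and is a guess supplied by the caller;
 * page_size must be a power of two. */
void *random_hint(unsigned va_bits, size_t page_size)
{
    uint64_t r = 0;

    if (getrandom(&r, sizeof r, 0) != (ssize_t)sizeof r)
        return NULL;

    r &= (UINT64_C(1) << va_bits) - 1;   /* clamp to the assumed VA size */
    r &= ~(uint64_t)(page_size - 1);     /* align down to a page boundary */
    return (void *)(uintptr_t)r;
}

/* Anonymous private mapping placed at (or wherever the kernel decides
 * near) a random hint. Returns MAP_FAILED on error, like mmap. */
void *mmap_near_random_hint(size_t len, unsigned va_bits, size_t page_size)
{
    return mmap(random_hint(va_bits, page_size), len,
                PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
}
```

The hint is drawn uniformly with no knowledge of existing mappings, so whenever it collides with one the final address is chosen by the kernel, and the resulting distribution is no longer uniform.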
Re: [RFC PATCH v2 0/2] Randomization of address chosen by mmap.
On Fri, Mar 23, 2018 at 12:06:18PM -0700, Matthew Wilcox wrote:
> On Fri, Mar 23, 2018 at 02:00:24PM -0400, Rich Felker wrote:
> > On Fri, Mar 23, 2018 at 05:48:06AM -0700, Matthew Wilcox wrote:
> > > On Thu, Mar 22, 2018 at 07:36:36PM +0300, Ilya Smith wrote:
> > > > Current implementation doesn't randomize address returned by mmap.
> > > > All the entropy ends with choosing mmap_base_addr at the process
> > > > creation. After that mmap build very predictable layout of address
> > > > space. It allows to bypass ASLR in many cases. This patch make
> > > > randomization of address on any mmap call.
> > >
> > > Why should this be done in the kernel rather than libc? libc is perfectly
> > > capable of specifying random numbers in the first argument of mmap.
> >
> > Generally libc does not have a view of the current vm maps, and thus
> > in passing "random numbers", they would have to be uniform across the
> > whole vm space and thus non-uniform once the kernel rounds up to avoid
> > existing mappings.
>
> I'm aware that you're the musl author, but glibc somehow manages to
> provide etext, edata and end, demonstrating that it does know where at
> least some of the memory map lies.

Yes, but that's pretty minimal info.

> Virtually everything after that is
> brought into the address space via mmap, which at least glibc intercepts,

There's also vdso, the program interpreter (ldso), and theoretically
other things the kernel might add. I agree you _could_ track most of
this (and all if you want to open /proc/self/maps), but it seems
hackish and wrong (violating clean boundaries between userspace and
kernel responsibility).

> > Also this would impose requirements that libc be
> > aware of the kernel's use of the virtual address space and what's
> > available to userspace -- for example, on 32-bit archs whether 2GB,
> > 3GB, or full 4GB (for 32-bit-user-on-64-bit-kernel) is available, and
> > on 64-bit archs where fewer than the full 64 bits are actually valid
> > in addresses, what the actual usable pointer size is. There is
> > currently no clean way of conveying this information to userspace.
>
> Huh, I thought libc was aware of this. Also, I'd expect a libc-based
> implementation to restrict itself to, eg, only loading libraries in
> the bottom 1GB to avoid applications who want to map huge things from
> running out of unfragmented address space.

That seems like a rather arbitrary expectation and I'm not sure why
you'd expect it to result in less fragmentation rather than more. For
example if it started from 1GB and worked down, you'd immediately
reduce the contiguous free space from ~3GB to ~2GB, and if it started
from the bottom and worked up, brk would immediately become
unavailable, increasing mmap pressure elsewhere.

Rich
Re: [RFC PATCH v2 0/2] Randomization of address chosen by mmap.
On Fri, Mar 23, 2018 at 12:29:52PM -0700, Matthew Wilcox wrote:
> On Fri, Mar 23, 2018 at 03:16:21PM -0400, Rich Felker wrote:
> > > Huh, I thought libc was aware of this. Also, I'd expect a libc-based
> > > implementation to restrict itself to, eg, only loading libraries in
> > > the bottom 1GB to avoid applications who want to map huge things from
> > > running out of unfragmented address space.
> >
> > That seems like a rather arbitrary expectation and I'm not sure why
> > you'd expect it to result in less fragmentation rather than more. For
> > example if it started from 1GB and worked down, you'd immediately
> > reduce the contiguous free space from ~3GB to ~2GB, and if it started
> > from the bottom and worked up, brk would immediately become
> > unavailable, increasing mmap pressure elsewhere.
>
> By *not* limiting yourself to the bottom 1GB, you'll almost immediately
> fragment the address space even worse. Just looking at 'ls' as a
> hopefully-good example of a typical app, it maps:
>
>     linux-vdso.so.1 (0x7ffef5eef000)
>     libselinux.so.1 => /lib/x86_64-linux-gnu/libselinux.so.1 (0x7fb3657f5000)
>     libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x7fb36543b000)
>     libpcre.so.3 => /lib/x86_64-linux-gnu/libpcre.so.3 (0x7fb3651c9000)
>     libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x7fb364fc5000)
>     /lib64/ld-linux-x86-64.so.2 (0x7fb365c3f000)
>     libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x7fb364da7000)
>
> The VDSO wouldn't move, but look at the distribution of mapping 6 things
> into a 3GB address space in random locations. What are the odds you have
> a contiguous 1GB chunk of address space? If you restrict yourself to the
> bottom 1GB before running out of room and falling back to a sequential
> allocation, you'll prevent a lot of fragmentation.

Oh, you're talking about the "with random locations" case. Randomizing
each map just hopelessly fragments things no matter what you do on
32-bit. If you reduce the space over which you randomize to the point
where it's not fragmenting/killing your available vm space, there are
so few degrees of freedom left that it's trivial to brute-force.

Maybe "libs randomized in low 1GB, everything else near-sequential in
high addresses" works half decently, but I have a hard time believing
you can get any ASLR that's significantly better than snake oil in a
32-bit address space, and you certainly do pay a high price in total
available vm space.

Rich
Re: [RFC PATCH v2 0/2] Randomization of address chosen by mmap.
On Tue, Mar 27, 2018 at 06:16:35PM -0400, Theodore Y. Ts'o wrote:
> On Tue, Mar 27, 2018 at 04:51:08PM +0300, Ilya Smith wrote:
> > > /dev/[u]random is not sufficient?
> >
> > Using /dev/[u]random makes 3 syscalls - open, read, close. This is a
> > performance issue.
>
> You may want to take a look at the getrandom(2) system call, which is
> the recommended way getting secure random numbers from the kernel.

Yes, while opening /dev/urandom is not acceptable due to needing an
fd, getrandom and existing fallbacks for it have this covered if
needed.

> > > Well, I am pretty sure userspace can implement proper free ranges
> > > tracking…
> >
> > I think we need to know what libc developers will say on implementing
> > ASLR in user-mode. I am pretty sure they will say ‘never’ or
> > ‘some-day’. And problem of ASLR will stay forever.
>
> Why can't you send patches to the libc developers?

I can tell you right now that any patch submitted for musl that
depended on trying to duplicate knowledge of the entire virtual
address space layout in userspace as part of mmap would be rejected,
and I would recommend glibc do the same.

Not only does it vastly increase complexity; it also has all sorts of
failure modes (fd exhaustion, etc.) which would either introduce new
and unwanted ways for mmap to fail, or would force fallback to the
normal (no extra randomization) strategy under conditions an attacker
could potentially control, defeating the whole purpose. It would also
potentially make it easier for an attacker to examine the vm layout
for attacks, since it would be recorded in userspace.

There's also the issue of preserving AS-safety of mmap. POSIX does not
actually require mmap to be AS-safe, and on musl munmap is not fully
AS-safe anyway because of some obscure issues it compensates for, but
we may be able to make it AS-safe (this is a low-priority open issue).
If mmap were manipulating data structures representing the vm space in
userspace, though, the only way to make it anywhere near AS-safe would
be to block all signals and take a lock every time mmap or munmap is
called. This would significantly increase the cost of each call,
especially now that meltdown/spectre mitigations have greatly
increased the overhead of each syscall.

Overall, asking userspace to take a lead role in management of process
vm space is a radical change in the split of what user and kernel are
responsible for, and it really does not make sense as part of a
dubious hardening measure. Something this big would need to be really
well-motivated.

Rich
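The "block all signals and take a lock" cost described above can be sketched as follows. This is an illustration only, not musl or glibc code: tracked_mmap, vm_lock and tracked_bytes are hypothetical names, and the byte counter merely stands in for whatever userspace structure would record the address-space layout.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>

/* Stand-ins for userspace bookkeeping of the address space. */
pthread_mutex_t vm_lock = PTHREAD_MUTEX_INITIALIZER;
size_t tracked_bytes;

/* mmap wrapper that keeps the bookkeeping consistent even if a signal
 * handler also calls it: signals are blocked while the lock is held. */
void *tracked_mmap(size_t len)
{
    sigset_t all, saved;
    void *p;

    sigfillset(&all);
    pthread_sigmask(SIG_BLOCK, &all, &saved);    /* extra syscall */
    pthread_mutex_lock(&vm_lock);

    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p != MAP_FAILED)
        tracked_bytes += len;                    /* update the "map" */

    pthread_mutex_unlock(&vm_lock);
    pthread_sigmask(SIG_SETMASK, &saved, NULL);  /* extra syscall */
    return p;
}
```

Each call now pays for two additional signal-mask syscalls plus the lock, on top of the mmap syscall itself.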
Re: [RFC PATCH v2 0/2] Randomization of address chosen by mmap.
On Tue, Mar 27, 2018 at 04:49:04PM -0700, Matthew Wilcox wrote:
> On Tue, Mar 27, 2018 at 03:53:53PM -0700, Kees Cook wrote:
> > I agree: pushing this off to libc leaves a lot of things unprotected.
> > I think this should live in the kernel. The question I have is about
> > making it maintainable/readable/etc.
> >
> > The state-of-the-art for ASLR is moving to finer granularity (over
> > just base-address offset), so I'd really like to see this supported in
> > the kernel. We'll be getting there for other things in the future, and
> > I'd like to have a working production example for researchers to
> > study, etc.
>
> One thing we need is to limit the fragmentation of this approach.
> Even on 64-bit systems, we can easily get into a situation where there isn't
> space to map a contiguous terabyte.

The default limit of only 65530 VMAs (vm.max_map_count) will also
quickly come into play if consecutive anon mmaps don't get merged. Of
course this can be raised, but it has significant resource and
performance (fork) costs.

Rich
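The VMA limit discussed above is exposed on Linux as the vm.max_map_count sysctl. A small sketch for reading the current value from /proc follows; the function name is made up for illustration.

```c
#include <stdio.h>

/* Returns the current per-process VMA limit, or -1 if it cannot be
 * read (for example when /proc is not mounted). */
long read_max_map_count(void)
{
    long n = -1;
    FILE *f = fopen("/proc/sys/vm/max_map_count", "r");

    if (!f)
        return -1;
    if (fscanf(f, "%ld", &n) != 1)
        n = -1;
    fclose(f);
    return n;
}
```

With per-mmap randomization, adjacent anonymous mappings can no longer be merged into one VMA, so a process approaches this limit much sooner than it would with a sequential layout.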
Re: [RFC PATCH v2 0/2] Randomization of address chosen by mmap.
On Fri, Mar 30, 2018 at 09:55:08AM +0200, Pavel Machek wrote:
> Hi!
>
> > Current implementation doesn't randomize address returned by mmap.
> > All the entropy ends with choosing mmap_base_addr at the process
> > creation. After that mmap build very predictable layout of address
> > space. It allows to bypass ASLR in many cases. This patch make
> > randomization of address on any mmap call.
>
> How will this interact with people debugging their application, and
> getting different behaviours based on memory layout?
>
> strace, strace again, get different results?

Normally gdb disables ASLR for the process when invoking a program to
debug. I don't see why that would be terribly useful with strace but
you can do the same if you want.

Rich
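For reference, a minimal sketch of how a launcher can "do the same": set the ADDR_NO_RANDOMIZE personality flag before exec, which is the mechanism gdb uses on Linux. The function names are hypothetical, and the sketch assumes personality(2) is permitted in the calling environment (some sandboxes filter it).

```c
#include <sys/personality.h>
#include <unistd.h>

/* Pure helper: the persona value with address randomization disabled. */
unsigned long persona_without_aslr(unsigned long persona)
{
    return persona | ADDR_NO_RANDOMIZE;
}

/* Disables ASLR for this process image and everything it execs, then
 * runs argv[0]. Returns -1 if anything fails; does not return on
 * success. */
int exec_without_aslr(char *const argv[])
{
    int cur = personality(0xffffffff);   /* query without changing */

    if (cur == -1)
        return -1;
    if (personality(persona_without_aslr((unsigned long)cur)) == -1)
        return -1;
    execvp(argv[0], argv);
    return -1;
}
```

From a shell, util-linux's setarch offers the same switch, for example `setarch "$(uname -m)" -R strace ./prog`.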
Re: Detecting libc in perf (was Re: perf tools build broken after v5.1-rc1)
On Tue, Apr 30, 2019 at 03:53:18PM +, Vineet Gupta wrote:
> On 4/29/19 6:18 PM, Arnaldo Carvalho de Melo wrote:
> >>> Auto-detecting system features:
> >>> ...                         dwarf: [ OFF ]
> >>> ...            dwarf_getlocations: [ OFF ]
> >>> ...                         glibc: [ on  ]
> >> Not related to current issue, this run uses a uClibc toolchain and yet it is
> >> detecting glibc - doesn't seem right to me.
> > Ok, I'll improve that, I think it just tries to detect a libc, yeah,
> > see:
> >
> > [acme@quaco linux]$ cat tools/build/feature/test-glibc.c
> > // SPDX-License-Identifier: GPL-2.0
> > #include <stdlib.h>
> >
> > #if !defined(__UCLIBC__)
> > #include <gnu/libc-version.h>
> > #else
> > #define XSTR(s) STR(s)
> > #define STR(s) #s
> > #endif
> >
> > int main(void)
> > {
> > #if !defined(__UCLIBC__)
> > 	const char *version = gnu_get_libc_version();
> > #else
> > 	const char *version = XSTR(__GLIBC__) "." XSTR(__GLIBC_MINOR__);
> > #endif
> >
> > 	return (long)version;
> > }
> > [acme@quaco linux]$
> >
> > [perfbuilder@59ca4b424ded /]$ grep __GLIBC__ /arc_gnu_2017.09-rc2_prebuilt_uclibc_le_arc700_linux_install/arc-snps-linux-uclibc/sysroot/usr/include/*.h
> > /arc_gnu_2017.09-rc2_prebuilt_uclibc_le_arc700_linux_install/arc-snps-linux-uclibc/sysroot/usr/include/features.h:   The macros `__GNU_LIBRARY__', `__GLIBC__', and `__GLIBC_MINOR__' are
> > /arc_gnu_2017.09-rc2_prebuilt_uclibc_le_arc700_linux_install/arc-snps-linux-uclibc/sysroot/usr/include/features.h:#define __GLIBC__ 2
> > /arc_gnu_2017.09-rc2_prebuilt_uclibc_le_arc700_linux_install/arc-snps-linux-uclibc/sysroot/usr/include/features.h: ((__GLIBC__ << 16) + __GLIBC_MINOR__ >= ((maj) << 16) + (min))
> > [perfbuilder@59ca4b424ded /]$
> >
> > Isn't that part of uClibc?
>
> Right you are. Per the big fat comment right above that code, this gross hack in
> uclibc is unavoidable as applications tend to rely on that define.
> So a better fix would be to check for various !GLIBC libs explicitly.
>
> #ifdef __UCLIBC__
>
> #elif defined __MUSL__
>
>
> Not pretty from app usage pov, but that seems to be the only sane way of
> doing it.

What are you trying to achieve? I was just CC'd and I'm missing the
context.

Rich
Re: Detecting libc in perf (was Re: perf tools build broken after v5.1-rc1)
On Tue, Apr 30, 2019 at 10:13:40AM -0700, Vineet Gupta wrote:
> On 4/30/19 10:04 AM, Rich Felker wrote:
> > On Tue, Apr 30, 2019 at 03:53:18PM +, Vineet Gupta wrote:
> >> On 4/29/19 6:18 PM, Arnaldo Carvalho de Melo wrote:
> >>>>> Auto-detecting system features:
> >>>>> ...                         dwarf: [ OFF ]
> >>>>> ...            dwarf_getlocations: [ OFF ]
> >>>>> ...                         glibc: [ on  ]
> >>>> Not related to current issue, this run uses a uClibc toolchain and yet it is
> >>>> detecting glibc - doesn't seem right to me.
> >>> Ok, I'll improve that, I think it just tries to detect a libc, yeah,
> >>> see:
> >>>
> >>> [acme@quaco linux]$ cat tools/build/feature/test-glibc.c
> >>> // SPDX-License-Identifier: GPL-2.0
> >>> #include <stdlib.h>
> >>>
> >>> #if !defined(__UCLIBC__)
> >>> #include <gnu/libc-version.h>
> >>> #else
> >>> #define XSTR(s) STR(s)
> >>> #define STR(s) #s
> >>> #endif
> >>>
> >>> int main(void)
> >>> {
> >>> #if !defined(__UCLIBC__)
> >>> 	const char *version = gnu_get_libc_version();
> >>> #else
> >>> 	const char *version = XSTR(__GLIBC__) "." XSTR(__GLIBC_MINOR__);
> >>> #endif
> >>>
> >>> 	return (long)version;
> >>> }
> >>> [acme@quaco linux]$
> >>>
> >>> [perfbuilder@59ca4b424ded /]$ grep __GLIBC__ /arc_gnu_2017.09-rc2_prebuilt_uclibc_le_arc700_linux_install/arc-snps-linux-uclibc/sysroot/usr/include/*.h
> >>> /arc_gnu_2017.09-rc2_prebuilt_uclibc_le_arc700_linux_install/arc-snps-linux-uclibc/sysroot/usr/include/features.h:   The macros `__GNU_LIBRARY__', `__GLIBC__', and `__GLIBC_MINOR__' are
> >>> /arc_gnu_2017.09-rc2_prebuilt_uclibc_le_arc700_linux_install/arc-snps-linux-uclibc/sysroot/usr/include/features.h:#define __GLIBC__ 2
> >>> /arc_gnu_2017.09-rc2_prebuilt_uclibc_le_arc700_linux_install/arc-snps-linux-uclibc/sysroot/usr/include/features.h: ((__GLIBC__ << 16) + __GLIBC_MINOR__ >= ((maj) << 16) + (min))
> >>> [perfbuilder@59ca4b424ded /]$
> >>>
> >>> Isn't that part of uClibc?
> >>
> >> Right you are. Per the big fat comment right above that code, this gross hack in
> >> uclibc is unavoidable as applications tend to rely on that define.
> >> So a better fix would be to check for various !GLIBC libs explicitly.
> >>
> >> #ifdef __UCLIBC__
> >>
> >> #elif defined __MUSL__
> >>
> >>
> >>
> >> Not pretty from app usage pov, but that seems to be the only sane way of
> >> doing it.
> >
> > What are you trying to achieve? I was just CC'd and I'm missing the
> > context.
>
> Sorry I added you as a subject matter expert but didn't provide enough
> context.
>
> The original issue [1] was perf failing to build on ARC due to perf tools
> needing a copy of unistd.h but this thread [2] was a small side issue of
> auto-detecting libc variant in perf tools where despite uClibc tools,
> glibc is declared to be detected, due to uClibc's historical hack of
> defining __GLIBC__. So __GLIBC__ is not sufficient (and probably not the
> right interface to begin with) to ensure glibc.
>
> [1] http://lists.infradead.org/pipermail/linux-snps-arc/2019-April/005676.html
> [2] http://lists.infradead.org/pipermail/linux-snps-arc/2019-April/005684.html

I think you misunderstood -- I'm asking what you're trying to achieve
by detecting whether the libc is glibc, rather than whether it has
some particular interface you want to conditionally use. This is a
major smell and is usually something wrong that shouldn't be done.

Rich
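One way to illustrate the distinction drawn above, probing for an interface instead of identifying the libc, is a weak reference to the function in question. This is only a sketch and not what perf does: it relies on the GCC/Clang `weak` attribute, libc_version_if_available is a hypothetical name, and gnu_get_libc_version is used purely because it is the interface the quoted test program calls.

```c
#include <stddef.h>

/* Weak reference: resolves to a null address if the libc being linked
 * against does not provide this function. No __GLIBC__, __UCLIBC__ or
 * similar identity macro is consulted. */
extern const char *gnu_get_libc_version(void) __attribute__((weak));

/* Returns the version string when the interface exists, else NULL. */
const char *libc_version_if_available(void)
{
    if (gnu_get_libc_version)
        return gnu_get_libc_version();
    return NULL;
}
```

The same idea applies at build time: a feature test that compiles and links a call to the specific function answers "is this interface usable?" regardless of which libc defines which macros.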
Re: [PATCH 2/2] futex: remove futex_cmpxchg detection
On Tue, Oct 26, 2021 at 12:03:48PM +0200, Arnd Bergmann wrote:
> From: Arnd Bergmann
>
> Now that all architectures have a working futex implementation
> in any configuration, remove the runtime detection code.
>
> Signed-off-by: Arnd Bergmann
> ---
>  arch/arc/Kconfig              |  1 -
>  arch/arm/Kconfig              |  1 -
>  arch/arm64/Kconfig            |  1 -
>  arch/csky/Kconfig             |  1 -
>  arch/m68k/Kconfig             |  1 -
>  arch/riscv/Kconfig            |  1 -
>  arch/s390/Kconfig             |  1 -
>  arch/sh/Kconfig               |  1 -
>  arch/um/Kconfig               |  1 -
>  arch/um/kernel/skas/uaccess.c |  1 -
>  arch/xtensa/Kconfig           |  1 -
>  init/Kconfig                  |  8 --------
>  kernel/futex/core.c           | 35 -----------------------------------
>  kernel/futex/futex.h          |  6 ------
>  kernel/futex/syscalls.c       | 22 ----------------------
>  15 files changed, 82 deletions(-)

Acked-by: Rich Felker