Re: BASH recursion segfault, FUNCNEST doesn't help
On 6/6/22 16:14, Chet Ramey wrote:
> On 6/2/22 4:00 PM, Gergely wrote:
>
>> I could not produce a scenario in 15 minutes that would indicate that
>> this corrupts other sections, as there is a considerable gap between the
>> stack and everything else. This is OS-dependent though and bash has no
>> control over what happens should this occur.
>
> Because you haven't forced bash to write outside its own address space or
> corrupt another area on the stack. This is a resource exhaustion issue,
> no more.

I did force it to write out of bounds, hence the segfault.

>> Well, the issue is not the fact that this is a resource exhaustion, but
>> rather the fact that it's entirely OS-dependent and the programmer has
>> zero control over it.
>
> The programmer has complete control over this, at least in the scenario you
> reported.

Not really, a programmer can't know how large the stack is and how many
more recursions bash can take. This is also kernel/distro/platform
dependent. I get that it's a hard limit to hit, but to say the programmer
has complete control is not quite true.

Also, this is a point about the "protection" being OS-dependent. On
embedded devices the stack might very well be next to the heap, in which
case this can be a legitimate issue. Even if Busybox is preferred on these
devices, it's something worth considering (at least for IoT maintainers).
Busybox is also vulnerable to this, by the way.

>> What happens should the situation occur is not up
>> to bash or the programmer. The behaviour is not portable and not
>> recoverable. A programmer might expect a situation like this, but there
>> is no knob to turn to prevent an abrupt termination, unlike FUNCNEST.
>
> If you think it's more valuable, you can build bash with a definition for
> SOURCENEST_MAX that you find acceptable. There's no user-visible variable
> to control that; it's just not something that many people request. But it's
> there if you (or a distro) want to build it in.

Recompiling works perfectly fine; however, there is no configure switch,
so I had to edit the code. This might be why the distributions are not
setting this? I'm not sure. At least it's there. This will not help
programmers, though, who just want something that Just Works.

>> Speaking for myself, I'd find an error a much MUCH more palatable
>> condition than a segfault in this case. In the case of an error I at
>> least have a chance to do cleanup or emit a message, as opposed to just
>> terminating out of the blue. I don't think most bash programs are
>> written with the expectation that they might cease to run at any moment
>> without any warning.
>
> I think anyone who codes up an infinite recursion should expect abrupt
> termination. Any other scenario is variable and controlled by resource
> limits.

Sure, for unmitigated disasters of code like infinite recursions, I agree
with you. This problem is not about that though. It's about a bounded -
albeit large - number of recursions.

For the sake of example, consider a program with a somewhat slow signal
handler. This program might be forced to segfault by another program that
can send it large amounts of signals in quick succession. Something like
this:

# terminal 1
$ cat signal.sh
#!/bin/bash
echo $$
export FUNCNEST=100
trap 'echo TRAP; sleep 0.01' SIGUSR1
while true
do
    sleep 1
    date
done
$ ./signal.sh
39817
Tue Jun 7 01:35:41 PM UTC 2022
...
TRAP
./signal.sh: line 1: echo: write error: Interrupted system call
Segmentation fault

# terminal 2
$ while :; do kill -SIGUSR1 39817; done
bash: kill: (39817) - No such process
...

Gergely
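A minimal sketch of the recompilation route Chet describes, assuming the
build honors a SOURCENEST_MAX definition passed through CFLAGS (there is no
dedicated configure switch); the value 512, the file name self.sh, and the
expected error wording are illustrative assumptions:

$ ./configure CFLAGS='-g -O2 -DSOURCENEST_MAX=512'
$ make
$ echo '. ./self.sh' > self.sh
$ ./bash -c '. ./self.sh'   # expected: a source nesting limit error, not a segfault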
Re: BASH recursion segfault, FUNCNEST doesn't help
On 6/7/22 7:57 AM, Gergely wrote:
>> Because you haven't forced bash to write outside its own address space or
>> corrupt another area on the stack. This is a resource exhaustion issue,
>> no more.
>
> I did force it to write out of bounds, hence the segfault.

That's backwards. You got a SIGSEGV, but it doesn't mean you forced bash to
write beyond its address space. You get SIGSEGV when you exceed your stack
or VM resource limits. Given the nature of the original script, it's
probably the former.

> Not really, a programmer can't know how large the stack is and how many
> more recursions bash can take. This is also kernel/distro/platform
> dependent. I get that it's a hard limit to hit, but to say the programmer
> has complete control is not quite true.

True, the programmer can't know the stack size. But in a scenario where you
really need to recurse hundreds or thousands of times (is there one?), the
programmer can try to increase the stack size with `ulimit -s' and warn the
user if that fails.

> Sure, for unmitigated disasters of code like infinite recursions, I agree
> with you. This problem is not about that though. It's about a bounded -
> albeit large - number of recursions.

This is not an example of a bounded number of recursions, since the second
process sends a continuous stream of SIGUSR1s.

> For the sake of example, consider a program with a somewhat slow signal
> handler. This program might be forced to segfault by another program that
> can send it large amounts of signals in quick succession.

This is another example of recursive execution that results in a stack size
resource limit failure, and wouldn't be helped by any of the things we're
talking about -- though there is an EVALNEST_MAX define that could.

-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
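To make the `ulimit -s' suggestion above concrete, a script that expects to
recurse deeply could try to raise its own soft stack limit up front and warn
the user when that fails; the 65536 KB figure and the message wording are
illustrative assumptions:

#!/bin/bash
# Sketch: attempt to raise the soft stack limit before recursing deeply.
# ulimit -s takes the new size in kilobytes and can only be raised up to
# the hard limit, so the failure case has to be handled.
if ! ulimit -s 65536 2>/dev/null; then
    echo "warning: could not raise stack limit (still $(ulimit -s) KB)" >&2
    echo "warning: deep recursion may terminate this shell abruptly" >&2
fi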
Re: BASH recursion segfault, FUNCNEST doesn't help
On 6/7/22 15:49, Chet Ramey wrote:
> On 6/7/22 7:57 AM, Gergely wrote:
>
>>> Because you haven't forced bash to write outside its own address space or
>>> corrupt another area on the stack. This is a resource exhaustion issue,
>>> no more.
>>
>> I did force it to write out of bounds, hence the segfault.
>
> That's backwards. You got a SIGSEGV, but it doesn't mean you forced bash to
> write beyond its address space. You get SIGSEGV when you exceed your stack
> or VM resource limits. Given the nature of the original script, it's
> probably the former.

I am not saying the write was successful, but the only reason it wasn't is
that the kernel doesn't map pages there. Bash not caring about this means
it's relying on the kernel behaving "right".

Here's a very trivial example that'll show $rsp containing an address that
is outside of the stack:

$ gdb --args bash -c 'cat /proc/self/maps; echo ". a" > a; . a'
(gdb) r
Starting program: /usr/bin/bash -c cat\ /proc/self/maps\;\ echo\ \".\ a\"\ \>\ a\;\ .\ a
[Detaching after fork from child process 45053]
4000-6000 r--p fd:00 1573537 /usr/bin/cat
6000-b000 r-xp 2000 fd:00 1573537 /usr/bin/cat
b000-d000 r--p 7000 fd:00 1573537 /usr/bin/cat
e000-f000 r--p 9000 fd:00 1573537 /usr/bin/cat
f000-5556 rw-p a000 fd:00 1573537 /usr/bin/cat
5556-55581000 rw-p 00:00 0 [heap]
77803000-77dbb000 r--p fd:00 1573144 /usr/lib/locale/locale-archive
77dbb000-77dbe000 rw-p 00:00 0
77dbe000-77de r--p fd:00 1576105 /usr/lib/x86_64-linux-gnu/libc-2.33.so
77de-77f38000 r-xp 00022000 fd:00 1576105 /usr/lib/x86_64-linux-gnu/libc-2.33.so
77f38000-77f88000 r--p 0017a000 fd:00 1576105 /usr/lib/x86_64-linux-gnu/libc-2.33.so
77f88000-77f8c000 r--p 001c9000 fd:00 1576105 /usr/lib/x86_64-linux-gnu/libc-2.33.so
77f8c000-77f8e000 rw-p 001cd000 fd:00 1576105 /usr/lib/x86_64-linux-gnu/libc-2.33.so
77f8e000-77f97000 rw-p 00:00 0
77fa2000-77fc6000 rw-p 00:00 0
77fc6000-77fca000 r--p 00:00 0 [vvar]
77fca000-77fcc000 r-xp 00:00 0 [vdso]
77fcc000-77fcd000 r--p fd:00 1576101 /usr/lib/x86_64-linux-gnu/ld-2.33.so
77fcd000-77ff1000 r-xp 1000 fd:00 1576101 /usr/lib/x86_64-linux-gnu/ld-2.33.so
77ff1000-77ffb000 r--p 00025000 fd:00 1576101 /usr/lib/x86_64-linux-gnu/ld-2.33.so
77ffb000-77ffd000 r--p 0002e000 fd:00 1576101 /usr/lib/x86_64-linux-gnu/ld-2.33.so
77ffd000-77fff000 rw-p 0003 fd:00 1576101 /usr/lib/x86_64-linux-gnu/ld-2.33.so
7ffde000-7000 rw-p 00:00 0 [stack]

Program received signal SIGSEGV, Segmentation fault.
0x5558e963 in yyparse () at ./build-bash/y.tab.c:1744
1744    ./build-bash/y.tab.c: No such file or directory.
(gdb) info frame 0
Stack frame at 0x7f7ffc40:
 rip = 0x5558e963 in yyparse (./build-bash/y.tab.c:1744); saved rip = 0x55585547
 called by frame at 0x7f7ffc60
 source language c.
 Arglist at 0x7f7fed98, args:
 Locals at 0x7f7fed98, Previous frame's sp is 0x7f7ffc40
 Saved registers:
  rbx at 0x7f7ffc08, rbp at 0x7f7ffc10, r12 at 0x7f7ffc18, r13 at 0x7f7ffc20,
  r14 at 0x7f7ffc28, r15 at 0x7f7ffc30, rip at 0x7f7ffc38
(gdb) disas yyparse
Dump of assembler code for function yyparse:
...
   0x5558e959 <+57>:    xor    %eax,%eax
   0x5558e95b <+59>:    lea    0x1d0(%rsp),%rbx
=> 0x5558e963 <+67>:    mov    %ax,0x40(%rsp)
   0x5558e968 <+72>:    mov    %r8,%rbp
...
(gdb)

Here $rsp points to an invalid address. In this case the access that faults
is the store to 0x40(%rsp), and exactly what gets read or written through
$rsp depends on what the given function does with its local variables.

>> Not really, a programmer can't know how large the stack is and how many
>> more recursions bash can take. This is also kernel/distro/platform
>> dependent. I get that it's a hard limit to hit, but to say the programmer
>> has complete control is not quite true.
>
> True, the programmer can't know the stack size. But in a scenario where you
> really need to recurse hundreds or thousands of times (is there one?), the
> programmer can try to increase the stack size with `ulimit -s' and warn the
> user if that fails.

If there's a way for an attacker to make bash allocate very large stack
frames, this number doesn't have to be very big.

>> Sure, for unmitigated disasters of code like infinite recursions, I agree
>> with you.
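To make the recurring point concrete that a script cannot know in advance
how many more recursions bash can take, a throwaway probe along the
following lines (file names and paths are hypothetical) records the last
source nesting depth reached before the crash; the figure varies with
kernel, distro, resource limits, and build:

$ cat > probe.sh <<'EOF'
depth=$((depth + 1))
echo $depth > /tmp/depth
. ./probe.sh
EOF
$ bash -c 'depth=0; . ./probe.sh'
Segmentation fault
$ cat /tmp/depth    # last depth reached; machine- and build-dependent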