Chet Ramey wrote:
On 9/28/12 9:54 AM, Yuxiang Cao wrote:
> I use ulimit -s to find the stack size, which is 8192 kbytes. Then I use
> valgrind to record the stack size, which gives me this information:
>
>     test.sh: xmalloc: ../bash/unwind_prot.c:308: cannot allocate 172 bytes (8359936 bytes allocated)
>
> So from the above information I think this is not a stack overflow, and
> that is a real fault in this program.

It's not; deep-enough recursion will eventually exhaust available resources, no matter their limits, and cause the process to crash. If the kernel decides that you can't have any more heap space when malloc requests it, then that's that.
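To make the mechanism concrete, here is a minimal standalone C sketch of that failure mode. The xmalloc() below is modeled loosely on bash's wrapper (which prints a diagnostic and aborts when malloc() returns NULL); the 172-byte size is taken from the report above, and everything else is illustrative, not bash's actual code. Running it under a tight limit (e.g. `ulimit -v 20000` first) makes the abort arrive quickly:

    /* Sketch: unbounded recursion that allocates a little heap per call,
     * with an xmalloc-style wrapper that aborts on allocation failure. */
    #include <stdio.h>
    #include <stdlib.h>

    static void *xmalloc(size_t bytes)
    {
        void *p = malloc(bytes);
        if (p == NULL) {
            /* bash prints a similar diagnostic and exits; once malloc
             * says no, there is no heap left to recover with. */
            fprintf(stderr, "xmalloc: cannot allocate %zu bytes\n", bytes);
            abort();
        }
        return p;
    }

    static void recurse(unsigned long depth)
    {
        /* Each frame pins a small heap allocation, much as each level of
         * a recursive shell function pins bookkeeping in bash.  Nothing
         * is freed until the recursion unwinds, which it never does. */
        char *frame = xmalloc(172);   /* 172 bytes, as in the report */
        frame[0] = (char) depth;      /* touch the memory so it is used */
        recurse(depth + 1);
    }

    int main(void)
    {
        recurse(0);
        return 0;
    }

Whether the stack or the heap gives out first depends on the limits in force, but one of them always does eventually; that is the point being made above.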
---

So when a program makes an 'exec' call, if the kernel has run out of process descriptors and reached its limit, you believe the correct response is NOT to return EAGAIN, but to panic the kernel?

Why shouldn't bash fail at the point it hits resource exhaustion and return an error condition -- EAGAIN, ENOBUFS, ENOMEM, etc.? Bash should catch its own resource-allocation faults and not rely on something external to itself to clean up its mess. Dumping core means bash lost control. If you really run out of resources, there are standard error codes for exactly that, and I can see no reason why bash shouldn't return one of them. It shouldn't be possible for a user-level script (one that isn't trying to overwrite memory in the bash process) to crash bash.

Not crashing has been a basic design requirement of all programs, and handling resource exhaustion gracefully has always been expected of system-level software. Two examples: a print job that generated one job for each character in a 2-3 MB binary, and running 'make -j' to build the kernel (and watching the load go over 200). Neither should crash the system. They may take ridiculously long to complete if the machine they are on is overwhelmed (indeed, the print job effectively mounted a denial of service on the print server for the next 5-6 hours; we decided to let it run and see if it recovered, and it did -- a well-designed OS). The Linux make only needs a few minutes for a transient resource crisis to pass. If it died every time resources got tight, it would be considered unacceptably unstable.

I feel that crashing should always be considered a bug in system- and production-level code. Do you really want the kernel to crash every time it hits a resource-exhaustion point? Or should it fail the job that is causing the problem? Why should bash be different?
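To be concrete about what "failing the job" could look like, here is a hedged C sketch: retry fork() when it fails with EAGAIN or ENOMEM, then report an error for that one command instead of killing the shell. fork_with_backoff, the retry count, and the backoff delays are my own illustrative choices, not anything taken from bash's source:

    /* Sketch: treat fork() failure as a per-job error, not a fatal one. */
    #include <errno.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    #define FORK_RETRIES 5

    /* Retry fork() on transient failures, with exponential backoff. */
    static pid_t fork_with_backoff(void)
    {
        for (int attempt = 0; attempt < FORK_RETRIES; attempt++) {
            pid_t pid = fork();
            if (pid >= 0)
                return pid;                  /* parent or child */
            if (errno != EAGAIN && errno != ENOMEM)
                break;                       /* not a transient condition */
            sleep(1u << attempt);            /* 1, 2, 4, ... seconds */
        }
        /* Fail only the job that hit the wall; the shell keeps running. */
        fprintf(stderr, "fork: %s\n", strerror(errno));
        return -1;
    }

    int main(void)
    {
        pid_t pid = fork_with_backoff();
        if (pid == 0) {                      /* child: run some command */
            execlp("echo", "echo", "child ran", (char *)NULL);
            _exit(127);                      /* exec itself failed */
        }
        if (pid > 0)
            waitpid(pid, NULL, 0);
        return pid < 0 ? 1 : 0;
    }

The design point is in the last few lines of fork_with_backoff: a transient resource crisis becomes a delayed job or a failed command with a diagnostic, never a dead shell.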
