Severe memleak in sequence expressions?

2011-11-29 Thread Marc Schiffbauer
Hi all,

Please Cc me on answers, as I am not on the list.

I think I found a severe memleak in bash.

I had this effect with bash 4.2.10 (Ubuntu Linux) as well as 
bash 4.1.9 on Gentoo Linux.

In short:

echo {0..1000}>/dev/null

This makes my system start to swap, as bash will use several GiB of
memory.

If I choose a way bigger number bash "just" seems to crash:

mschiff@lisa ~ $ echo {1..100}>/dev/null
bash: xmalloc: cannot allocate 11892949016 bytes (135168 bytes allocated)
mschiff@lisa ~ $

Is this a bug?

-Marc
-- 
8AAC 5F46 83B4 DB70 8317  3723 296C 6CCA 35A6 4134




Severe memleak in sequence expressions?

2011-11-30 Thread Marc Schiffbauer
Hi all,

I think there is a severe memleak in bash.

I had this effect with bash 4.2.10 (Ubuntu Linux) as well as 
bash 4.1.9 on Gentoo Linux.

In short:

echo {0..1000}>/dev/null

This makes my system start to swap, as bash will allocate several GiB of
memory.

If I choose a way bigger number bash "just" seems to crash:

mschiff@lisa ~ $ echo {1..100}>/dev/null
bash: xmalloc: cannot allocate 11892949016 bytes (135168 bytes allocated)
mschiff@lisa ~ $

Is this a bug?

Please do not try this on a production machine!
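
If you do want to reproduce the effect without bringing a machine to
its knees, one way is to cap the shell's address space first. A rough
sketch, assuming the kernel honours RLIMIT_AS via ulimit -v (the range
below is only illustrative, not the exact one from the report):

  # Run in a subshell so the limit does not stick to the login shell.
  ( ulimit -v 131072                 # cap virtual memory at 128 MiB
    echo {0..10000000} >/dev/null )  # fails with an xmalloc error
                                     # instead of swapping the system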

-Marc
-- 
8AAC 5F46 83B4 DB70 8317  3723 296C 6CCA 35A6 4134




Re: Severe memleak in sequence expressions?

2011-11-30 Thread Marc Schiffbauer
* Chet Ramey wrote on 30.11.11 at 14:23:
> > Hi all,
> > 
> > I think there is a severe memleak in bash.
> 
> It's not a memory leak.  It might reveal a sub-optimal memory allocation
> pattern -- asking for an array with that many strings is going to gobble
> a lot of memory -- but it's not a leak in the sense that bash loses
> track of an allocated chunk.

Well, but how do you explain this:

mschiff@moe:~$ bash
mschiff@moe:~$ ps u $$
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
mschiff  14156  5.2  0.1   8836  3692 pts/1    S    23:28   0:00 bash
mschiff@moe:~$ time echo {0..100}>/dev/null

real    0m2.307s
user    0m2.084s
sys     0m0.196s
mschiff@moe:~$ ps u $$
USER       PID %CPU %MEM    VSZ    RSS TTY     STAT START   TIME COMMAND
mschiff  14156 13.2  6.2 196272 191036 pts/1   S    23:28   0:02 bash
mschiff@moe:~$


-Marc
-- 
8AAC 5F46 83B4 DB70 8317  3723 296C 6CCA 35A6 4134



Re: Severe memleak in sequence expressions?

2011-11-30 Thread Marc Schiffbauer
* Greg Wooledge wrote on 30.11.11 at 14:28:
> On Wed, Nov 30, 2011 at 12:37:36AM +0100, Marc Schiffbauer wrote:
> > echo {0..1000}>/dev/null
> > 
> > This makes my system start to swap, as bash will use several GiB of
> > memory.
> 
> Brace expansion is just a short way of typing a longer list of words.
> If you type {0..9} bash still has to expand it out to 0 1 2 3 4 5 6 7 8 9
> and then pass all of those words as arguments to echo.  (Granted, echo
> is a builtin, so there *could* be some sort of shortcut there)
> 
> > If I choose a way bigger number bash "just" seems to crash:
> > 
> > mschiff@lisa ~ $ echo {1..100}>/dev/null
> > bash: xmalloc: cannot allocate 11892949016 bytes (135168 bytes allocated)
> > mschiff@lisa ~ $
> > 
> > Is this a bug?
> 
> In my opinion, no.  You're asking bash to generate a list of words from 0
> to 100 all at once.  It faithfully attempts to do so.

Yeah, ok but it will not free the mem it allocated later on (see
other mail)
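
For what it's worth, the fact that the whole word list is generated up
front, no matter what command consumes it, is easy to see on a tiny
scale:

  set -- {0..9999}   # bash builds all 10,000 words before set even runs
  echo "$#"          # prints 10000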

> If you want to loop over integers, "for" is much better suited:
> 
> for ((i=1; i<=100; i++)); do
>   whatever
> done

OK, sure. I was just wondering whether that massive amount of memory
use is to be expected.

> 
> This will not require that all of the words from 1 to 100
> be generated in advance.  It will simply keep one word at a time in the
> loop variable.
> 
> (Bear in mind that bash can only handle 64-bit integer arithmetic; and
> bash 2.05 and earlier can only handle 32-bit integer arithmetic.)

will do, thx ;)
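
One side note on the 64-bit limit Greg mentions: bash arithmetic does
not flag overflow, it silently wraps around, so very large loop bounds
need a little care. On a typical 64-bit build:

  echo $(( 9223372036854775807 + 1 ))   # prints -9223372036854775808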

-Marc
-- 
8AAC 5F46 83B4 DB70 8317  3723 296C 6CCA 35A6 4134



Re: Severe memleak in sequence expressions?

2011-12-01 Thread Marc Schiffbauer
* Chet Ramey wrote on 01.12.11 at 02:54:
> That's probably the result of the power-of-two allocation policy in the
> bash malloc.  When this came up before, I wrote:
> 
> ==
> That's not a memory leak.  Malloc implementations need not release
> memory back to the kernel; the bash malloc (and others) do so only
> under limited circumstances.  Memory obtained from the kernel using
> mmap or sbrk and kept in a cache by a malloc implementation doesn't
> constitute a leak.  A leak is memory for which there is no longer a
> handle, by the application or by malloc itself.
> ==

Thanks for the explanation, Chet.

-Marc
-- 
8AAC 5F46 83B4 DB70 8317  3723 296C 6CCA 35A6 4134



Re: Severe memleak in sequence expressions?

2011-12-01 Thread Marc Schiffbauer
* Bob Proulx wrote on 01.12.11 at 05:34:
> Marc Schiffbauer wrote:
> > Greg Wooledge schrieb:
> > > Marc Schiffbauer wrote:
> > > > echo {0..1000}>/dev/null
> > > > 
> > > > This makes my system start to swap, as bash will use several GiB of
> > > > memory.
> > >
> > > In my opinion, no.  You're asking bash to generate a list of words from 0
> > > to 100 all at once.  It faithfully attempts to do so.
> > 
> > Yeah, ok but it will not free the mem it allocated later on (see
> > other mail)
> 

Hi Bob,

[...]
> In total to generate all of the arguments for {0..1000} consumes
> at least 78,888,899 bytes or 75 megabytes of memory(!) if I did all of
> the math right.  Each order of magnitude added grows the amount of
> required memory by an *order of magnitude*.  This should not in any
> way be surprising.  In order to generate 100 arguments
> it might consume 7.8e7 * 1e10 equals 7.8e17 bytes ignoring the smaller
> second order effects.  That is a lot of petabytes of memory!  And it
> is terribly inefficient.  You would never really want to do it this
> way.  You wouldn't want to burn that much memory all at once.  Instead
> you would want to make a for-loop to iterate over the sequence such as
> the "for ((i=1; i<=100; i++)); do" construct that Greg
> suggested.  That is a much more efficient way to do a loop over that
> many items.  And it will execute much faster.  Although a loop that
> large will take a long time to complete.
> 

I was hit by that by accident. Normally I use plain for-loops instead,
so I was a bit surprised that my machine was not responding
anymore ;-)

When I think about it again, it is more or less obvious.
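
Bob's figure is easy to check with a line of shell arithmetic,
assuming the expansion in question was {0..10000000} (ten million and
one words, which is the range his 78,888,899-byte total works out to),
and counting only the digits plus one trailing NUL byte per word:

  # 10 one-digit words, 90 two-digit words, ... up to the single
  # eight-digit word "10000000"; each word also needs one NUL byte.
  echo $(( 10*2 + 90*3 + 900*4 + 9000*5 + 90000*6 + 900000*7 + 9000000*8 + 1*9 ))
  # prints 78888899

The argv pointer array and malloc bookkeeping come on top of that,
which would help explain why the resident size in the transcript
earlier in the thread was closer to 190 MB than 75 MB.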

> Put yourself in a shell author's position.  What would you think of
> this situation?  Trying to generate an unreasonably large number of
> program arguments is, well, unreasonable.  I think this is clearly an
> abuse of the feature.  You can't expect any program to be able to
> generate and use that much memory.

ACK

> And as for whether a program should return unused memory back to the
> operating system for better or worse very few programs actually do it.
> It isn't simple.  It requires more accounting to keep track of memory
> in order to know what can be returned.  It adds to the complexity of
> the code and complexity tends to create bugs.  I would rather have a
> simple and bug free program than one that is full of features but also
> full of bugs.  Especially the shell where bugs are really bad.
> Especially in a case like this where that large memory footprint was
> only due to the unreasonably large argument list it was asked to
> create.  Using a more efficient language construct avoids the memory
> growth, which is undesirable no matter what, and once that memory
> growth is avoided then there isn't a need to return the memory it
> isn't using to the system either.
> 
> If you want bash to be reduced to a smaller size try exec'ing itself
> in order to do this.
> 
>   $ exec bash
> 
> That is my 2 cents worth plus a little more for free. :-)
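
The exec trick is easy to verify; a sketch of what one might expect,
with the RSS values and the range only illustrative (the numbers echo
the transcript earlier in the thread):

  $ ps -o rss= -p $$            # e.g.   3692 KiB in a fresh shell
  $ echo {0..10000000} >/dev/null
  $ ps -o rss= -p $$            # e.g. 191036 KiB; malloc keeps the pages cached
  $ exec bash                   # replace the process image, same PID
  $ ps -o rss= -p $$            # small again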


Thank you for the explanation. 

I will not consider this a bug anymore ;-)

-Marc
-- 
8AAC 5F46 83B4 DB70 8317  3723 296C 6CCA 35A6 4134