Bash monopolizing or eating the RAM MEMORY

2017-03-20 Thread Noilson Caio
Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS:  -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64'
-DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-redhat-linux-gnu'
-DCONF_VENDOR='redhat' -DLOCALEDIR='/usr/share/locale' -DPACKAGE='bash'
-DSHELL -DHAVE_CONFIG_H   -I.  -I. -I./include -I./lib  -D_GNU_SOURCE
-DRECYCLES_PIDS -DDEFAULT_PATH_VALUE='/usr/local/bin:/usr/bin'  -O2 -g
-pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong
--param=ssp-buffer-size=4 -grecord-gcc-switches   -m64 -mtune=generic
uname output: Linux SPFBL-POC-CENTOS-7 3.10.0-514.2.2.el7.x86_64 #1 SMP Tue
Dec 6 23:06:41 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
Machine Type: x86_64-redhat-linux-gnu

Bash Version: 4.2
Patch Level: 46
Release Status: release

Description:
Hello bash crew.
My name is Noilson Caio, and I believe there is something strange in bash.
I work with huge folder structures and large numbers of files every day,
and the best way to describe the 'problem' is with an example.

Example task: build a folder structure using digits 0-9, two levels deep.
Something like this:

.
|-- 0
|   |-- 0
|   |-- 1
|   |-- 2
|   |-- 3
|   |-- 4
|   |-- 5
|   |-- 6
|   |-- 7
|   |-- 8
|   `-- 9
|-- 1
|   |-- 0
|   |-- 1
|   |-- 2
|   |-- 3
|   |-- 4
|   |-- 5
|   |-- 6
|   |-- 7
|   |-- 8
|   `-- 9
|-- 2
|   |-- 0
|   |-- 1
|   |-- 2
|   |-- 3
|   |-- 4
|   |-- 5
|   |-- 6
|   |-- 7
|   |-- 8
|   `-- 9
|-- 3
|   |-- 0
|   |-- 1
|   |-- 2
|   |-- 3
|   |-- 4
|   |-- 5
|   |-- 6
|   |-- 7
|   |-- 8
|   `-- 9
|-- 4
|   |-- 0
|   |-- 1
|   |-- 2
|   |-- 3
|   |-- 4
|   |-- 5
|   |-- 6
|   |-- 7
|   |-- 8
|   `-- 9
|-- 5
|   |-- 0
|   |-- 1
|   |-- 2
|   |-- 3
|   |-- 4
|   |-- 5
|   |-- 6
|   |-- 7
|   |-- 8
|   `-- 9
|-- 6
|   |-- 0
|   |-- 1
|   |-- 2
|   |-- 3
|   |-- 4
|   |-- 5
|   |-- 6
|   |-- 7
|   |-- 8
|   `-- 9
|-- 7
|   |-- 0
|   |-- 1
|   |-- 2
|   |-- 3
|   |-- 4
|   |-- 5
|   |-- 6
|   |-- 7
|   |-- 8
|   `-- 9
|-- 8
|   |-- 0
|   |-- 1
|   |-- 2
|   |-- 3
|   |-- 4
|   |-- 5
|   |-- 6
|   |-- 7
|   |-- 8
|   `-- 9
`-- 9
    |-- 0
    |-- 1
    |-- 2
    |-- 3
    |-- 4
    |-- 5
    |-- 6
    |-- 7
    |-- 8
    `-- 9

110 directories, 0 files

For this kind of job I have been using curly-brace expansion ('{}') for
almost 10 years. The answer to the task above is: mkdir -p {0..9}/{0..9}/
Well, so far so good. But when I grow the argument list (more folder
levels), strange things happen =]. Let me show examples and facts.

1 - Using mkdir -p {0..9}/{0..9}/{0..9}/{0..9}/{0..9}/ - (5 levels): no
problems.

2 - Using mkdir -p {0..9}/{0..9}/{0..9}/{0..9}/{0..9}/{0..9}/ - (6 levels):
we have a problem, "Argument list too long". Not really a problem for bash
itself; it's a problem for the system operator. I know this is an ARG_MAX
limitation. When it happens, the operator works around it with other tools
or by splitting the job. It makes little sense to keep adding arguments to
this task, but let's go on.
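To make the ARG_MAX side of this concrete, here is a small sanity check one could run before attempting a large expansion (a sketch; the byte math assumes single-digit path components, as in the examples above):

```shell
# Estimate whether an N-level {0..9} expansion fits within ARG_MAX
# before running it.  Each expanded word looks like "0/0/.../0/",
# i.e. two characters per level, plus a terminating NUL byte.
levels=6
count=$((10 ** levels))          # number of expanded words
per_arg=$((2 * levels + 1))      # bytes per word, including the NUL
bytes=$((count * per_arg))
limit=$(getconf ARG_MAX)         # kernel limit on argv + environ size

echo "$count words, ~$bytes bytes (ARG_MAX = $limit)"
if [ "$bytes" -gt "$limit" ]; then
    echo "a single mkdir call would fail: Argument list too long"
fi
```

For six levels this estimates roughly 13 MB of arguments, far beyond the typical 2 MB Linux ARG_MAX, which matches the failure described above.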

3 - Using mkdir -p {0..9}/{0..9}/{0..9}/{0..9}/{0..9}/{0..9}/{0..9}/ - (7
levels): oops, no more "Argument list too long"; now we get "Cannot
allocate memory".
Strace sample:

access("/usr/bin/mkdir", X_OK) = 0
stat("/usr/bin/mkdir", {st_mode=S_IFREG|0755, st_size=39680, ...}) = 0
geteuid() = 0
getegid() = 0
getuid() = 0
getgid() = 0
access("/usr/bin/mkdir", R_OK) = 0
stat("/usr/bin/mkdir", {st_mode=S_IFREG|0755, st_size=39680, ...}) = 0
stat("/usr/bin/mkdir", {st_mode=S_IFREG|0755, st_size=39680, ...}) = 0
geteuid() = 0
getegid() = 0
getuid() = 0
getgid() = 0
access("/usr/bin/mkdir", X_OK) = 0
stat("/usr/bin/mkdir", {st_mode=S_IFREG|0755, st_size=39680, ...}) = 0
geteuid() = 0
getegid() = 0
getuid() = 0
getgid() = 0
access("/usr/bin/mkdir", R_OK) = 0
rt_sigprocmask(SIG_BLOCK, [INT CHLD], [], 8) = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7fc12637e9d0) = -1 ENOMEM (Cannot allocate memory)
write(2, "-bash: fork: Cannot allocate mem"..., 36) = 36

Basically, all RAM was eaten; after that, bash can no longer fork.

4 - Using mkdir -p {0..9}/{0..9}/{0..9}/{0..9}/{0..9}/{0..9}/{0..9}/{0..9}/
- (8 levels or more): in this case all RAM and swap (if present) are
consumed, and it only stops when the kernel's OOM killer sends a signal to
kill the bash process. Exhaustive brk() calls. Maybe the operative limit
is virtual memory, which is unlimited by default.
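One way to illustrate the virtual-memory point above (the 256 MiB figure is an arbitrary example, not a recommendation): cap the shell's address space before attempting the expansion, so it fails fast instead of consuming all RAM and swap.

```shell
# Run the risky expansion in a subshell with a virtual-memory cap.
# Under the cap, a huge brace expansion dies early with an allocation
# error instead of dragging the machine into swap and the OOM killer.
(
    ulimit -v $((256 * 1024))    # ulimit -v takes KiB: cap at 256 MiB
    ulimit -v                    # confirm the cap: prints 262144
    # a command like mkdir -p {0..9}/.../{0..9} would now fail fast
)
```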


PS:

Maximum RAM tested: 16 GB
Other bash versions tested:

--

Re: Bash monopolizing or eating the RAM MEMORY

2017-03-20 Thread Noilson Caio
Thank you so much, Mr. Wooledge. I guess BUG is too strong a word for this
case. I fully agree that "this is not a bash bug, it's a problem with your
approach"; actually, that's my concern. Can you help me understand why 10^6
strings trigger "Argument list too long" but 10^7 (the next level) doesn't?
I'm afraid a non-root user could compromise a Linux box intentionally: the
memory keeps being eaten until some other threshold breaks it.

Thank you again.

On Mon, Mar 20, 2017 at 1:50 PM, Greg Wooledge  wrote:

> On Mon, Mar 20, 2017 at 12:17:39PM -0300, Noilson Caio wrote:
> > 1 - Using mkdir -p {0..9}/{0..9}/{0..9}/{0..9}/{0..9}/ - (5 levels): no
> > problems
>
> 10 to the 5th power (100,000) strings generated.  Sloppy, but viable on
> today's computers.  You're relying on your operating system to allow an
> extraordinarily large set of arguments to processes.  I'm guessing
> Linux.
>
> > 2 - Using mkdir -p {0..9}/{0..9}/{0..9}/{0..9}/{0..9}/{0..9}/ - (6
> > levels): we have a problem, "Argument list too long".
>
> You have two problems.  The first is that you are generating 10^6
> (1 million) strings in memory, all at once.  The second is that you are
> attempting to pass all of these strings as arguments to a single mkdir
> process.  Apparently even your system won't permit that.
>
> > 3 - Using mkdir -p {0..9}/{0..9}/{0..9}/{0..9}/{0..9}/{0..9}/{0..9}/ -
> > (7 levels): oops, no more "Argument list too long"; now we get
> > "Cannot allocate memory".
>
> 10 million strings, all at once.  Each one is ~15 bytes (counting the NUL
> and slashes), so you're looking at something like 150 megabytes.
>
> This is not a bash bug.  It's a problem with your approach.  You wouldn't
> call it a bug in C, if you wrote a C program that tried to allocate 150
> megabytes of variables and got an "out of memory" as a result.  The same
> applies to any other programming language.
>
> What you need to do is actually think about how big a chunk of memory
> (and argument list) you can handle in a single call to mkdir -p, and
> just do that many at once.  Call mkdir multiple times, in order to get
> the full task done.  Don't assume bash will handle that for you.
>
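A minimal sketch of the chunking Wooledge describes, scaled down to four levels so it finishes quickly (the temporary directory is just for the demo): iterating the outer level in the shell means each mkdir call expands only a fraction of the tree.

```shell
# Build a 4-level 0-9 tree without handing the whole expansion to a
# single mkdir: the outer level is iterated by the shell, so each of
# the ten mkdir calls expands only 10^3 paths instead of 10^4.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

for a in {0..9}; do
    mkdir -p "$a"/{0..9}/{0..9}/{0..9}
done

# 10 + 100 + 1000 + 10000 directories in total
find . -mindepth 1 -type d | wc -l    # prints 11110
```

For the original 7-level task, looping over the first two or three levels in the shell keeps each call's argument list comfortably below ARG_MAX.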



-- 
Noilson Caio Teixeira de Araújo
https://ncaio.wordpress.com
https://br.linkedin.com/in/ncaio
https://twitter.com/noilsoncaio
https://jammer4.wordpress.com/
http://8bit.academy


Re: Bash monopolizing or eating the RAM MEMORY

2017-03-20 Thread Noilson Caio
Thank you.

On Mon, Mar 20, 2017 at 4:03 PM, Greg Wooledge  wrote:

> On Mon, Mar 20, 2017 at 03:54:37PM -0300, Noilson Caio wrote:
> > Thank you so much, Mr. Wooledge. I guess BUG is too strong a word for
> > this case. I fully agree that "this is not a bash bug, it's a problem
> > with your approach"; actually, that's my concern. Can you help me
> > understand why 10^6 strings trigger "Argument list too long" but 10^7
> > (the next level) doesn't? I'm afraid a non-root user could compromise
> > a Linux box intentionally: the memory keeps being eaten until some
> > other threshold breaks it.
>
> It's not a "compromise".  Any user on a computer can run a program that
> uses (or tries to use) a bunch of memory.  You as the system admin can
> set resource limits on the user's processes.  This is outside the scope
> of the bug-bash mailing list (try a Linux sys admin list).
>
> As far as "argument list too long", I believe you already posted a link
> to Sven Mascheck's ARG_MAX web page.  This is the best explanation of
> the concept and the details.  For those that may have missed it,
> see http://www.in-ulm.de/~mascheck/various/argmax/
>
> If you want to create 1 million (or 10 million) directories from a bash
> script, you're going to have to call mkdir repeatedly.  If this is a
> problem, then I suggest you rewrite in a language that can call mkdir()
> as a function without forking a whole process to do so.  Pretty much
> every language that isn't a shell should allow this.  Pick your favorite.
>
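For completeness, xargs can do the "call mkdir repeatedly" batching automatically (a sketch at three levels; note the shell still builds every string in memory first, so this addresses ARG_MAX, not the RAM cost of the expansion itself):

```shell
# Stream the expanded names to xargs, which splits them across as many
# mkdir invocations as ARG_MAX requires.  printf is a bash builtin, so
# passing it a huge word list does not hit the exec-time limit.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

printf '%s\0' {0..9}/{0..9}/{0..9} | xargs -0 mkdir -p

# 10 + 100 + 1000 directories created
find . -mindepth 1 -type d | wc -l    # prints 1110
```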


