Re: Bug/limitation in 'time' (kernel setings?)...

2013-03-20 Thread Linda Walsh


Bruce Dawson wrote:
> The reply below is way too long, but oh well.
> 
> The summary is:
> I am 100% certain that my ExprCount() loop is CPU bound.

Your stats claim 530 iterations/second when spawning expr.

That's ~2ms/process create.

In an article from 14 years ago on a 200Mhz Pentium, Linux process
creates were ~1ms.

http://www.linuxjournal.com/article/2941
(table @
http://www.linuxjournal.com/files/linuxjournal.com/linuxjournal/articles/029/2941/2941f1.jpg)

If you believe that a modern processor running at 1.6GHz, takes twice as long
as a 14 year old pentium, then there is little likelihood that you will be
convinced by any logical argument.



You need to rebuild your kernel with the config option turned on -- in fact,
you need to build your own kernel if you want to do any sort of kernel 
benchmarks.


> You seem convinced that the task is not CPU bound on my machine, but I'm
> convinced that it is.
---
Then your kernel is misconfigured for your workload.

You said:
> I am 100% certain that my ExprCount() loop is CPU bound. It is unfortunate
> that time gives inaccurate information about this. It would be nice if the
> documentation acknowledged that in order to save future generations from
> confusion.

What documentation would you have updated?


Here is a perl program:
#!/usr/bin/perl
die "need count" unless @ARGV;
my $count=$ARGV[0];
my $counter=$count;
while ($counter-->0) {
  my $pid;
  if (my $pid=fork() ) {
waitpid($pid,0);
  }else {
exec "/bin/true"; # use 'exit(0)' to just time forks w/no image load
  }
}
-
> /usr/bin/time /tmp/sp.pl 1
 time schedtool -a 3 -e /tmp/sp.pl 1
6.83sec 0.05usr 0.97sys (15.05% cpu)
---
But if I don't confine it to 1 cpu:
> time /tmp/sp.pl 1  
10.54sec 0.20usr 2.59sys (26.52% cpu)

Just forks:
> time /tmp/sp.pl 1
7.23sec 0.12usr 2.36sys (34.35% cpu)
> time /tmp/sp.pl 1
7.21sec 0.12usr 2.35sys (34.39% cpu)
---
1cpu forks:

> time schedtool -a 3 -e /tmp/sp.pl 1   
4.32sec 0.06usr 0.95sys (23.69% cpu)
Ishtar:law> time schedtool -a 3 -e /tmp/sp.pl 1
4.29sec 0.05usr 0.96sys (23.80% cpu)

-

Right there, it shows you that the scheduler is taking almost 33% of your
time.

That may be cpu, but it is not cpu that your program is using.
That's why I say your accounting is off.







>From these figures, I would say it's closer to 1000/second





RE: Bug/limitation in 'time' (kernel setings?)...

2013-03-20 Thread Bruce Dawson
Note that the ~2ms/process create is when the power management fails to get
invoked. It drops to ~0.6ms/process create when the CPU ramps up to full
speed. The chart you list from twelve years ago shows 6.0 ms to spawn a new
process, so my machine is about ten times faster -- seems about right.
Cloning a process is not comparable because in my test the kernel has to
load and exec() expr, not just fork() bash.

> Right there, it shows you that the scheduler is taking almost 33% of your
time.

I don't follow your logic. How do you know that it is the scheduler? I can
see that when you run your schedtool script it shows 15% to 34.4% of CPU
time consumed. I say that the rest is process startup, you say scheduler.
I'm not sure what evidence points to the scheduler, and I'm not sure what
math you used to come up with the 33% figure. You'll have to elucidate.

> Then your kernel is misconfigured for your workload.

What? My kernel is executing the tasks fine. When my CPU runs at full speed
it is slightly faster than yours, and when it runs at 1/3 speed it is slower
than yours. In both cases 'time' reports that the CPU is not pegged when it
is. Our kernels are behaving quite similarly.

> What documentation would you have updated?

The bash documentation where it mentions the 'time' command, and the 'time'
documentation.

-Original Message-
From: Linda Walsh [mailto:b...@tlinx.org] 
Sent: Wednesday, March 20, 2013 1:12 AM
To: Bruce Dawson
Cc: bug-bash
Subject: Re: Bug/limitation in 'time' (kernel setings?)...



Bruce Dawson wrote:
> The reply below is way too long, but oh well.
> 
> The summary is:
> I am 100% certain that my ExprCount() loop is CPU bound.

Your stats claim 530 iterations/second when spawning expr.

That's ~2ms/process create.

In an article from 14 years ago on a 200Mhz Pentium, Linux process creates
were ~1ms.

http://www.linuxjournal.com/article/2941
(table @
http://www.linuxjournal.com/files/linuxjournal.com/linuxjournal/articles/029
/2941/2941f1.jpg)

If you believe that a modern processor running at 1.6GHz, takes twice as
long as a 14 year old pentium, then there is little likelihood that you will
be convinced by any logical argument.



You need to rebuild your kernel with the config option turned on -- in fact,
you need to build your own kernel if you want to do any sort of kernel
benchmarks.


> You seem convinced that the task is not CPU bound on my machine, but 
> I'm convinced that it is.
---
Then your kernel is misconfigured for your workload.

You said:
> I am 100% certain that my ExprCount() loop is CPU bound. It is 
> unfortunate that time gives inaccurate information about this. It 
> would be nice if the documentation acknowledged that in order to save 
> future generations from confusion.

What documentation would you have updated?


Here is a perl program:
#!/usr/bin/perl
die "need count" unless @ARGV;
my $count=$ARGV[0];
my $counter=$count;
while ($counter-->0) {
  my $pid;
  if (my $pid=fork() ) {
waitpid($pid,0);
  }else {
exec "/bin/true"; # use 'exit(0)' to just time forks w/no image load
  }
}
-
> /usr/bin/time /tmp/sp.pl 1
 time schedtool -a 3 -e /tmp/sp.pl 1 6.83sec 0.05usr 0.97sys (15.05%
cpu)
---
But if I don't confine it to 1 cpu:
> time /tmp/sp.pl 1  
10.54sec 0.20usr 2.59sys (26.52% cpu)

Just forks:
> time /tmp/sp.pl 1
7.23sec 0.12usr 2.36sys (34.35% cpu)
> time /tmp/sp.pl 1
7.21sec 0.12usr 2.35sys (34.39% cpu)
---
1cpu forks:

> time schedtool -a 3 -e /tmp/sp.pl 1   
4.32sec 0.06usr 0.95sys (23.69% cpu)
Ishtar:law> time schedtool -a 3 -e /tmp/sp.pl 1 4.29sec 0.05usr 0.96sys
(23.80% cpu)

-

Right there, it shows you that the scheduler is taking almost 33% of your
time.

That may be cpu, but it is not cpu that your program is using.
That's why I say your accounting is off.







>From these figures, I would say it's closer to 1000/second





RE: Bug/limitation in 'time' (kernel setings?)...

2013-03-20 Thread Bruce Dawson
I looked at the CPU sampling data to try to confirm what is going on. The
profile data (captured by zoom, using perf) shows ~90% of the samples
hitting in various invocations of expr. The tool won't summarize call stacks
across processes so I manually looked at a bunch of them. I looked at
samples in 15 invocations of expr. Each invocation runs for less than a ms
and the sampling frequency is one KHz so no individual process got more than
one sample. The samples I saw were: 

- page_fault in module vmlinux
- page_fault-> flush_tlb_page in module vmlinux
- do_lookup_x in ld-2.15.so
- perf_event__output_id_sample in vmlinux
- page_fault-> do_page_fault in vmlinux
- do_lookup_x in ld-2.15.so
- stub_execve-> load_elf_binary-> exec_mmap in vmlinux
- page_fault-> handle_pte_fault-> copy_page_c in vmlinux
- page_fault-> handle_pte_fault-> page_add_file_rmap in vmlinux
- system_call_fastpath-> sys_mmap-> mmap_region in vmlinux
- page_fault-> do_page_fault-> handle_mm_fault in vmlinux
-  in unknown module
- page_fault-> handle_pte_fault-> page_add_file_rmap in vmlinux
-  in unknown module
- page_fault-> do_page_fault-> up_read in vmlinux

To summarize, the zoom profiler says that ~90% of CPU time when running
ExprCount() is spent in expr. Of that time it says that more than half of it
is spent in page_fault or mmap. That sounds consistent with the claim that
it is process startup that is consuming the missing CPU time that 'time'
fails to account for.

I see no indications in any of the data of a significant amount of CPU time
being spent in the OS scheduler. Neither the context switch data nor the
sampling data supports that conclusion.

I could be wrong of course but I think my conclusion makes sense, and it is
supported by significant evidence.

FWIW.

-Original Message-
From: Bruce Dawson [mailto:brucedaw...@cygnus-software.com] 
Sent: Wednesday, March 20, 2013 7:44 AM
To: 'Linda Walsh'
Cc: 'bug-bash'
Subject: RE: Bug/limitation in 'time' (kernel setings?)...

Note that the ~2ms/process create is when the power management fails to get
invoked. It drops to ~0.6ms/process create when the CPU ramps up to full
speed. The chart you list from twelve years ago shows 6.0 ms to spawn a new
process, so my machine is about ten times faster -- seems about right.
Cloning a process is not comparable because in my test the kernel has to
load and exec() expr, not just fork() bash.

> Right there, it shows you that the scheduler is taking almost 33% of your
time.

I don't follow your logic. How do you know that it is the scheduler? I can
see that when you run your schedtool script it shows 15% to 34.4% of CPU
time consumed. I say that the rest is process startup, you say scheduler.
I'm not sure what evidence points to the scheduler, and I'm not sure what
math you used to come up with the 33% figure. You'll have to elucidate.

> Then your kernel is misconfigured for your workload.

What? My kernel is executing the tasks fine. When my CPU runs at full speed
it is slightly faster than yours, and when it runs at 1/3 speed it is slower
than yours. In both cases 'time' reports that the CPU is not pegged when it
is. Our kernels are behaving quite similarly.

> What documentation would you have updated?

The bash documentation where it mentions the 'time' command, and the 'time'
documentation.

-Original Message-
From: Linda Walsh [mailto:b...@tlinx.org] 
Sent: Wednesday, March 20, 2013 1:12 AM
To: Bruce Dawson
Cc: bug-bash
Subject: Re: Bug/limitation in 'time' (kernel setings?)...



Bruce Dawson wrote:
> The reply below is way too long, but oh well.
> 
> The summary is:
> I am 100% certain that my ExprCount() loop is CPU bound.

Your stats claim 530 iterations/second when spawning expr.

That's ~2ms/process create.

In an article from 14 years ago on a 200Mhz Pentium, Linux process creates
were ~1ms.

http://www.linuxjournal.com/article/2941
(table @
http://www.linuxjournal.com/files/linuxjournal.com/linuxjournal/articles/029
/2941/2941f1.jpg)

If you believe that a modern processor running at 1.6GHz, takes twice as
long as a 14 year old pentium, then there is little likelihood that you will
be convinced by any logical argument.



You need to rebuild your kernel with the config option turned on -- in fact,
you need to build your own kernel if you want to do any sort of kernel
benchmarks.


> You seem convinced that the task is not CPU bound on my machine, but 
> I'm convinced that it is.
---
Then your kernel is misconfigured for your workload.

You said:
> I am 100% certain that my ExprCount() loop is CPU bound. It is 
> unfortunate that time gives inaccurate information about this. It 
> would be nice if the documentation acknowledged that in order to save 
> future generations from confusion.

What documentation would you have updated?


Here is a perl program:
#!/usr/bin/perl
die "need count" unless @ARGV;
my $count=$ARGV[0];
my $counter=$count;
while ($counter-->0) {
  my $pid;
  if (my $p

Re: Bug/limitation in 'time' (kernel setings?)...

2013-03-20 Thread Linda Walsh
1) Why not use your 'null' program, which in your blog, you state makes little
difference.
2) Better, why not use the simple 'fork'/exit' which doesn't call a separate
program;  (that's
   one reason for switching to perl or better, would be C;
Note the perl prog:
#!/usr/bin/perl
die "need count" unless @ARGV;
my $count=$ARGV[0];
while ($count-- > 0) {
  my $pid;
  if (my $pid=fork()) {  # if $pid  is set, then it is the pid of a child
waitpid($pid,0); # so parent waits for child to finish before continuing
  } else {   # else we are in child (or fork failed);
 # If fork failed prog will exit
exit 0;  # if we are in child, immediately exit (no exec);
#exec "/bin/true";   # alternate test case to load a simple prog
  }
}

schedtool is not a script -- it is system tool to control scheduler operations.

see also 'cpupower' to set/fix frequency and/or monitor idle-info

Note that in executing 'no program' (just doing a fork), but with the task
switching (scheduler activity), we get, at best ~ 35% cpu usage:

> time /tmp/sp.pl 1
7.29sec 0.11usr 2.41sys (34.59% cpu)

Now change the perl program to do all the forks at once.
then wait for all of them at once.
(so the parent and child won't alternate in the scheduler).

program for that is:
#!/usr/bin/perl
die "need count" unless @ARGV;
my $count=$ARGV[0];
my $counter=$count;
while ($counter-->0) { fork() || exit(0); }
while (waitpid(-1,0)>0) {};
print "counter=$counter\n";

Note I print counter at the end to verify that it executed the full count
if it did, it will be at "-1";
Now note the run time and cpu usage:
> time /tmp/sp1.pl 1
counter=-1
1.50sec 0.04usr 1.43sys (98.03% cpu)

Notice its about 1/5th the time.  I still forked the same number of
processes, and still waited for all of them to exit, but this
one was done  without invoking the scheduler each time through the loop.

The difference between the fork {wait/exit}
and the fork(all) wait(all); is
that the scheduler is not invoked each time through the loop.

Nearly 75-80% of the time is spent in the scheduler.




Unexpected behavior of single quotes when used in the patsub PE.

2013-03-20 Thread Eduardo A . Bustamante López

Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS:  -DPROGRAM='bash=' -DCONF_HOSTTYPE='x86_64' 
-DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-unknown-linux-gnu' 
-DCONF_VENDOR='unknown' -DLOCALEDIR='/home/dualbus/local/share/locale' 
-DPACKAGE='bash' -DSHELL -DHAVE_CONFIG_H -DDEBUG -DMALLOC_DEBUG -I.  -I. 
-I./include -I./lib   -g -O2
uname output: Linux claret 3.7.9-1-ARCH #1 SMP PREEMPT Mon Feb 18 02:13:30 EET 
2013 x86_64 GNU/Linux
Machine Type: x86_64-unknown-linux-gnu

Bash Version: 4.3
Patch Level: 0
Release Status: alpha

Description:
  The rules for escaping single quotes inside the
  pat and rep arguments are somehow fuzzy. There's a difference
  between what I expected to work, and how really bash treats strings
  there. I see that that specific PE has changed in every bash major
  version, and even within minor releases it has incompatible
  behavior.

  I include a script that -- I expect -- explains the issue better
  than my words. The left side of the `|' is what bash generates, the
  right side is what I expect.

Repeat-By:

-
#!/usr/bin/env bash
for shell in bash{4,+,=} mksh ksh zsh; do
"$shell" <<'EOF'
  t() {
printf '%s  |  %s\n' "$1" "$2"
  }
  v="'"   # v <- '

  printf '===\n%s %s\n---\n' \
"$0" \
"$BASH_VERSION"

  #--
  t "${v/$'\''/$'\''}""'"
  t  ${v/$'\''/$'\''} "'"
  t "${v/$'\''/x}""x"
  t  ${v/$'\''/x} "x"
  t "${v/x/$'\''}""'"
  t  ${v/x/$'\''} "'"
  t "${v/x/$'\x5c\''}""'" #< I would actually expect these to
  t  ${v/x/$'\x5c\''} "'" #< be \'
  t "${v/\'/\'}"  "'"
  t  ${v/\'/\'}   "'"
EOF
done

: <<'EOF'
http://austingroupbugs.net/view.php?id=221
ansiexpand
sh_single_quote
expand_string_if_necessary
EOF
-

The output in my machine:
-
===
bash4 4.2.45(2)-release
---
'  |  '
'  |  '
bash4: line 13: bad substitution: no closing `}' in "${v/'/x}"
x  |  x
bash4: line 15: bad substitution: no closing `}' in "${v/x/'}"
'  |  '
'  |  '
'  |  '
\'  |  '
'  |  '
===
bash+ 4.3.0(1)-alpha
---
'  |  '
'  |  '
bash+: line 13: bad substitution: no closing `}' in "${v/'/x}"
x  |  x
bash+: line 15: bad substitution: no closing `}' in "${v/x/'}"
'  |  '
'  |  '
'  |  '
'  |  '
'  |  '
===
bash= 4.3.0(2)-alpha
---
'  |  '
'  |  '
x  |  x
x  |  x
'  |  '
'  |  '
'  |  '
'  |  '
'  |  '
'  |  '
===
mksh 
---
'  |  '
'  |  '
x  |  x
x  |  x
'  |  '
'  |  '
'  |  '
'  |  '
'  |  '
'  |  '
===
ksh 
---
'  |  '
'  |  '
x  |  x
x  |  x
'  |  '
'  |  '
'  |  '
'  |  '
'  |  '
'  |  '
===
zsh 
---
$'\''  |  '
'  |  '
x  |  x
x  |  x
'  |  '
'  |  '
'  |  '
'  |  '
\'  |  '
'  |  '
-


I add a fix for the `bad substitution' error. I'm not aware if it
breaks anything. I ran the tests, and there was no evident breakage,
but I didn't investigate the issue further.

As for the "${v/x/$'\x5c\''}" expansion, I don't think that
interpreting the expanded $'\x5c' is correct, but all the shells I
tested do that, so I guess that's expected behavior.


Fix:

---
 parse.y | 28 ++--
 1 file changed, 6 insertions(+), 22 deletions(-)

diff --git a/parse.y b/parse.y
index 61f0f7c..729cf3b 100644
--- a/parse.y
+++ b/parse.y
@@ -3352,17 +3352,9 @@ parse_matched_pair (qc, open, close, lenp, flags)
  ttrans = ansiexpand (nestret, 0, nestlen - 1, &ttranslen);
  xfree (nestret);
 
- if ((rflags & P_DQUOTE) == 0)
-   {
- nestret = sh_single_quote (ttrans);
- free (ttrans);
- nestlen = strlen (nestret);
-   }
- else
-   {
- nestret = ttrans;
- nestlen = ttranslen;
-   }
+  nestret = sh_single_quote (ttrans);
+  free (ttrans);
+  nestlen = strlen (nestret);
  retind -= 2;  /* back up before the $' */
}
  else if MBTEST((tflags & LEX_WASDOL) && ch == '"' && 
(extended_quote || (rflags & P_DQUOTE) == 0))
@@ -3830,17 +3822,9 @@ eof_error:
  ttrans = ansiexpand (nestret, 0, nestlen - 1, &ttranslen);
  xfree (nestret);
 
- if ((rflags & P_DQUOTE) == 0)
-   {
- nestret = sh_single_quote (ttrans);
- free (ttrans);
- nestlen = strlen (nestret);
-   }
- else
-   {
- nestret = ttrans;
- nestlen = ttranslen;
-   }
+nestret = sh_single_quote (ttrans);
+free (ttrans);
+nestlen = strlen (nestret);
  retind -= 2;  /* back up before the $' */
}
  else if MBTEST((tflags & LEX_WASDOL) && ch == '"' &&

Re: Unexpected behavior of single quotes when used in the patsub PE.

2013-03-20 Thread Chet Ramey
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 3/20/13 5:56 PM, Eduardo A. Bustamante López wrote:

> Bash Version: 4.3
> Patch Level: 0
> Release Status: alpha
> 
> Description:
>   The rules for escaping single quotes inside the
>   pat and rep arguments are somehow fuzzy. There's a difference
>   between what I expected to work, and how really bash treats strings
>   there.

You're correct that this specific behavior changed between bash-4.2 and
bash-4.3.  There was extensive discussion of this point on bug-bash,
starting with

http://lists.gnu.org/archive/html/bug-bash/2012-02/msg00106.html

That resulted in the change, which is documented in the CHANGES file in
bash-4.3-alpha (as well as COMPAT and in the change log):

zz. When using the pattern substitution word expansion, bash now runs the
replacement string through quote removal, since it allows quotes in that
string to act as escape characters.  This is not backwards compatible, so
it can be disabled by setting the bash compatibility mode to 4.2.

That change is described in the section of the man page that covers the
`compat42' shell option.

>   I see that that specific PE has changed in every bash major
>   version, and even within minor releases it has incompatible
>   behavior.

You say that, but you don't provide any actual evidence to back is up.
The behavior with this expansion was the same from bash-3.1 (as far back
as I went) through bash-4.2.

>   I include a script that -- I expect -- explains the issue better
>   than my words. The left side of the `|' is what bash generates, the
>   right side is what I expect.

Thanks for the script.  I will look at the fix for the bad substitution
error (I suspect your fix is not correct because the code should not treat
single quotes as special when parsing a double-quoted string, and the
error message is printed elsewhere).

Chet

- -- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRUc...@case.eduhttp://cnswww.cns.cwru.edu/~chet/
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.12 (Darwin)
Comment: GPGTools - http://gpgtools.org
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iEYEARECAAYFAlFKXyUACgkQu1hp8GTqdKsGygCghSkLLCjoJsi738/NnLDTejDW
C2QAnjL1RisKOGvlLJnhSxBNPSSM02BT
=SjNR
-END PGP SIGNATURE-



Re: Unexpected behavior of single quotes when used in the patsub PE.

2013-03-20 Thread Eduardo A . Bustamante López
On Wed, Mar 20, 2013 at 09:15:24PM -0400, Chet Ramey wrote:

> http://lists.gnu.org/archive/html/bug-bash/2012-02/msg00106.html
Ah, I rembember that thread, but somehow just read the beginning of
it. Until now I realize it developed a lot further.

> You say that, but you don't provide any actual evidence to back is up.
> The behavior with this expansion was the same from bash-3.1 (as far back
> as I went) through bash-4.2.

I'm sorry, I meant the change from 4.2 -> 4.3 (don't know the exact
patch level). Running the test script I provided gives the following
output for bash2 and bash3:


===
bash2 2.00.0(1)-release
---
bash2: line 20: unexpected EOF while looking for matching `''
bash2: line 21: syntax error: unexpected end of file
===
bash3 3.00.0(3)-rc1
---
''\'''  |  '
'  |  '
x  |  x
x  |  x
'  |  '
'  |  '
'  |  '
'  |  '
\'  |  '
'  |  '

But I didn't include it initially, because I didn't thought it was
useful. But since you say it's like that since bash 3.1, I guess I
overestimated the behavior.

> Thanks for the script.  I will look at the fix for the bad substitution
> error (I suspect your fix is not correct because the code should not treat
> single quotes as special when parsing a double-quoted string, and the
> error message is printed elsewhere).
Yes, I suspect it's not correct either, but I didn't understand the
bit about removing quotes if the expression was double quoted, so I
went for the straight change that fixed it apparently.

Thanks!

-- 
Eduardo A. Bustamante López


pgpcqJXxL_8_H.pgp
Description: PGP signature


RE: Bug/limitation in 'time' (kernel setings?)...

2013-03-20 Thread Bruce Dawson
Hey, fascinating stuff here. I'm learning a lot of useful new tools and
techniques. I like your perl scripts.

I tend to avoid perl because I don't know it well and I'm not a fan.
Personal preference. Good point about using my 'nop' program. It might
simplify testing slightly, but shouldn't change the results.

> 2) Better, why not use the simple 'fork'/exit' which doesn't call a
separate program;

Remember that this investigation started with a real problem, and that real
problem was associated with using $(expr) for counting. fork/exit is a
totally different scenario.

When I run your first perl program -- I call it serialperl.pl because it is
a serial process where, roughly speaking, only one process is executing at a
time, it can do about 5,000 process creates per second. 'time' shows 21% CPU
usage on my four-core eight-thread system. However "iostat -c 1" shows about
13.6% CPU utilization, which corresponds to a bit more than one core. So, as
I have always claimed, 'time' is underestimating the CPU time required to
execute my task. The sampling data I provided (separate e-mail) shows that
this is mostly from page faults, which are associated with process startup.

Your second perl program -- I call it parallel perl because it runs a lot of
processes in parallel -- is trickier to measure. It runs really fast, but if
I pass too big a number to it in order to get it to run long enough for a
useful test then it kills my box. Apparently having 25,000 processes running
in parallel is too much for Ubuntu to handle, even with 32 GB of RAM. Pity.

However, I did get enough data to confirm my theory about why it runs so
much faster than serialper.pl. It's not because scheduler overhead is lower
(in fact, having 20,000 processes running in parallel is a huge burden to
the scheduler). It's because it is running in parallel. When I run
"parallelperl.pl 2" then I see CPU usage (measured by "iostat -c 1"
running in another window) spike to ~80%. It would probably go higher if I
could get a long enough running test without DOSing my machine.

So, the reason parallelperl.pl runs faster is because it is running in
parallel.

> Note that in executing 'no program' (just doing a fork), but with the task
switching (scheduler activity), we get, at best ~ 35% cpu usage:

To be clear, we get 35% CPU usage reported by 'time'. That is entirely
different from actual total CPU usage. I think we agree about that, we're
just debating where that extra time is going. Clarity is critical in these
tricky discussions. In fact, I think it is crucial to report both what
'time' is saying the process and its children are doing, and what "iostat -c
1" says the overall system is doing during the run. Otherwise we are trying
to infer complex behavior with one eye closed. The 'time' results are only
of interest if we compare them to "iostat -c 1" or some similar system-wide
measurement.

> The difference between the fork {wait/exit} and the fork(all) wait(all);
is that the scheduler is not invoked each time through the loop.

I don't think that is a correct analysis and it is not justified by the
evidence. I think the change in performance is because the forked copies are
able to run in parallel. "iostat -c 1" strongly supports this conclusion.
Even if I'm wrong you have hardly shown that the difference is the
scheduler. To do that you would have to have sampling profiler data that
showed a majority of samples hitting in the scheduler. I've done that test
and it shows the majority of samples hitting in page faults, associated with
process startup.

I've provided significant evidence that process startup is the culprit:
- CPU scheduling graphs (pretty pictures) showing expr consuming the extra
time
- sampling data showing page faults in expr consuming the extra time

I have seen no evidence for the scheduler being the issue. If the scheduler
was the issue then that would suggest a hugely inefficient scheduler, which
is a bold claim to make.

Either way, the consistent and pervasive conclusion is that time's CPU
consumption results are significantly understated when multiple processes
are involved. It would be nice to have that acknowledged in the
documentation. If not, well, that's why I have a blog.

-Original Message-
From: Linda Walsh [mailto:b...@tlinx.org] 
Sent: Wednesday, March 20, 2013 11:38 AM
To: Bruce Dawson
Cc: 'bug-bash'
Subject: Re: Bug/limitation in 'time' (kernel setings?)...

1) Why not use your 'null' program, which in your blog, you state makes
little difference.
2) Better, why not use the simple 'fork'/exit' which doesn't call a separate
program;  (that's
   one reason for switching to perl or better, would be C; Note the perl
prog:
#!/usr/bin/perl
die "need count" unless @ARGV;
my $count=$ARGV[0];
while ($count-- > 0) {
  my $pid;
  if (my $pid=fork()) {  # if $pid  is set, then it is the pid of a child
waitpid($pid,0); # so parent waits for child to finish before
continuing
  } else {   # els