Thanks for the suggestions. I have captured a bunch more data and hence the 
delay. First I have started by reproducing the behavior with 4.2.1:

gmake realclean >& cleanlog.txt
/usr/bin/gmake-4.2.1 -j32 -Otarget dirsparallel=all release >& 
gmake421dpa32.log; date
Tue Jul 19 11:24:47 EDT 2016
Tue Jul 19 20:43:17 EDT 2016
gmake realclean >& cleanlog.txt
/usr/bin/gmake-4.2.1 -j32 -Onone dirsparallel=all release > & gmake421dpn32.log 
; date
Tue Jul 19 21:01:48 EDT 2016
Tue Jul 19 22:05:40 EDT 2016

Essentially the same as 4.1. This –Otarget run is not just “a little slower,” 
it is pathologically slow. The 9 hour build time is slower than if I had run the
build with no parallelization whatsoever. (The –Onone time is a bit faster than 
what I see doing parallelization at just the leaf level)

Our build is a relatively straight-forward “choreographed” build as followed:

     Top-Level Makefile
            Makefile.libs – This is a list of directories containing C++ code 
getting built into different relocatable libraries (one per dir)
                  (sub-directory Makefiles)
            Mikefile.bins – This is a list of directories that link executables
                  (sub-directory Makefiles)

There are O(120) libraries to build, each of which ranges from 10-50 objects. 
There are about 30 executables to build.  What “dirsparallel=all” does is to 
turn on parallel make for Makefile.libs, Makefile.dirs, etc. The actual build 
is more complicated,  having codegen, unit test, lint, and packaging  
activities, but it follows this simple model.

The defining characteristic here is a wide fan-out. There are a large number of 
relatively simple and brief makefiles. There will therefore be a few “middle 
management” makefiles that are waiting on children to complete (I imagine each 
of these occupies a make job??).

I have confirmed that my count of “g++” invocations is exactly the same for the 
two runs (4403)

The TMPDIR variable was not set. I set it to /tmp to insure that it is on a 
physical device. (The build and our home directories are NFS mounted) The total 
elapsed time for the “-Otarget” run was then 6 hours. This is enough of a 
reduction to be a little suspicious, but not much more.

I ran with strace for both the “none” case and the “target” case.  The 
summaries are attached. Unfortunately, I don’t think there is a smoking gun 
here.

I have also run the “-Otrace” case at a few different job-levels:

j8 – 6:34
j32 – 6:00
j256 – 9:49

From: David Boyce [mailto:david.s.bo...@gmail.com]
Sent: Sunday, July 17, 2016 5:03 PM
To: Gardell, Steven <sgard...@sonusnet.com>
Cc: psm...@gnu.org; Bug-make@gnu.org
Subject: Re: Parallel builds across makefiles

A couple of suggestions:

1. Check your value of TMPDIR if any. All -O is doing is redirecting output 
into a temp file and dumping it later. Effectively it turns command "foo" into 
"foo > $TMPDIR/blah; cat $TMPDIR/blah; unlink $TMPDIR/blah". This is why it 
seems almost impossible for it to slow things down appreciably, and a slow temp 
dir device might be one explanation. Along similar lines you could try 
"TMPDIR=/tmp make ..."

2. Try strace. It has a mode which will timestamp each system call and another 
which prints a table of system calls used and how much time it took. One of 
these will probably be instructive, especially when compared with the same 
without -O.

David

On Sun, Jul 17, 2016 at 12:22 PM, Gardell, Steven 
<sgard...@sonusnet.com<mailto:sgard...@sonusnet.com>> wrote:
No I have not gotten to 4.2.1 yet.

I have experimented using j values ranging from 16 to 64. The majority of the 
testing is with -Otarget. But I have tried all of them.

-----Original Message-----
From: Paul Smith [mailto:psm...@gnu.org<mailto:psm...@gnu.org>]
Sent: Sunday, July 17, 2016 3:14 PM
To: Gardell, Steven <sgard...@sonusnet.com<mailto:sgard...@sonusnet.com>>; 
Bug-make@gnu.org<mailto:Bug-make@gnu.org>
Subject: Re: Parallel builds across makefiles
On Sun, 2016-07-17 at 15:48 +0000, Gardell, Steven wrote:
> OK. Thanks! I will try the latest version.
>
> FWIW, I am measuring is total wall clock time to complete the build.
> (date; gmake... ; date) This goes up substantially with all sync modes
> other than "none" if gmake has parallel invocations of sub-makes going
> (e.g. "gmake  -f SomeOtherMakefile").  When I am only parallelizing
> individual makefiles (with ACE, these tend to be simple lists of files
> to compile), then the behavior is as promised by the documentation.

Assuming you determine the behavior exists even in version 4.2.1, please 
provide more details.  What level of -j are you using?  Which sync mode are you 
using?  Do you get the same behavior even with lower levels of -j, if yours is 
very large?

_______________________________________________
Bug-make mailing list
Bug-make@gnu.org<mailto:Bug-make@gnu.org>
https://lists.gnu.org/mailman/listinfo/bug-make

gmake-4.2.1: *** [Makefile:333: release] Error 1
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 93.92 48988.895003      281029    174320     42047 wait4
  3.56 1857.549713        8133    228409     47431 futex
  1.12  582.114172          10  59230554  48886748 stat
  0.81  421.206442          11  37319045  29591972 open
  0.40  209.945060           3  71120290   3458259 lstat
  0.06   33.636522           2  13616492      4796 read
  0.02   11.818771           1   8065605     14388 close
  0.02    9.514375           1   7618567           fstat
  0.01    7.582567           2   3842736           munmap
  0.01    7.392948           1   5568114           mmap
  0.01    7.124394          92     77606           sched_yield
  0.01    2.657384          12    213754           getdents
  0.00    2.140097           3    847396    371561 access
  0.00    2.071326           1   3078031           write
  0.00    1.887237           4    538715      3218 fcntl
  0.00    1.749332           1   1578161           brk
  0.00    1.286288           0   8765372      5684 lseek
  0.00    1.036779          93     11125           pwrite
  0.00    0.898698           2    593430           mprotect
  0.00    0.840105           0   1990271           rt_sigprocmask
  0.00    0.712259           3    243835    137443 execve
  0.00    0.699540          11     65687        14 vfork
  0.00    0.682775           1    964050        28 rt_sigaction
  0.00    0.611575         273      2237           symlink
  0.00    0.428026          67      6358      3569 mkdir
  0.00    0.413624           2    189901       584 readlink
  0.00    0.364815          10     37611     35308 ioctl
  0.00    0.286657           1    192127           dup2
  0.00    0.225329         112      2005       627 unlinkat
  0.00    0.175723          16     10961         2 unlink
  0.00    0.175075           1    161868           getuid
  0.00    0.152934           1    160100           geteuid
  0.00    0.144751           1    161788           getgid
  0.00    0.141394           1    158100           getegid
  0.00    0.140271          15      9374      3420 pselect6
  0.00    0.110696          42      2650       627 newfstatat
  0.00    0.105560          12      9017           ftruncate
  0.00    0.095245           1    106117           arch_prctl
  0.00    0.092465           1     70043           pipe
  0.00    0.086211           1     80684           getrlimit
  0.00    0.083629          69      1218           rename
  0.00    0.077480           2     49361           pread
  0.00    0.074814           1     50945           clone
  0.00    0.066864           9      7545           poll
  0.00    0.058354           1     85242      3614 rt_sigreturn
  0.00    0.045559           1     65173           uname
  0.00    0.036858           0     75027           setrlimit
  0.00    0.036661           2     16581           madvise
  0.00    0.030539          46       669           fchmodat
  0.00    0.023980          45       538           chmod
  0.00    0.022452           1     37443           getcwd
  0.00    0.021574           2      8659           statfs
  0.00    0.019853           5      3970           chdir
  0.00    0.019411           1     33296           getpgrp
  0.00    0.018386           2      8888           socket
  0.00    0.015994          49       326           fchmod
  0.00    0.015333           3      5238           sched_getaffinity
  0.00    0.013102           0     34870           getpid
  0.00    0.012723           0     33299           getppid
  0.00    0.009763          11       907       906 connect
  0.00    0.009151          29       321           fchown
  0.00    0.008814          45       197           chown
  0.00    0.008683           5      1706      1227 utimensat
  0.00    0.008137           1      7948      7512 bind
  0.00    0.007875           0     22244           mremap
  0.00    0.007423           2      4418           getrusage
  0.00    0.006924           1      9939           set_robust_list
  0.00    0.005535           1      7513           sendto
  0.00    0.005124           1      7977           set_tid_address
  0.00    0.003982           0      9345           dup
  0.00    0.003629           1      4886           getgroups
  0.00    0.002308           8       279           utimes
  0.00    0.001354           0      9256           setsockopt
  0.00    0.001206           0      7513           recvfrom
  0.00    0.001084           1      2142           umask
  0.00    0.001003           0      5041           gettid
  0.00    0.000981          36        27           openat
  0.00    0.000950           3       321           getxattr
  0.00    0.000643          18        36           faccessat
  0.00    0.000183           0      2107      2107 getpeername
  0.00    0.000091           0       493        57 getsockname
  0.00    0.000075           0       478           fchdir
  0.00    0.000020           0        71           clock_getres
  0.00    0.000000           0       202           writev
  0.00    0.000000           0        12         1 select
  0.00    0.000000           0         3           alarm
  0.00    0.000000           0        22           socketpair
  0.00    0.000000           0       872           getsockopt
  0.00    0.000000           0         6           kill
  0.00    0.000000           0         3           creat
  0.00    0.000000           0         6           sysinfo
  0.00    0.000000           0       177           times
  0.00    0.000000           0         4           sigaltstack
  0.00    0.000000           0         5         5 fsetxattr
  0.00    0.000000           0         5         5 fgetxattr
  0.00    0.000000           0         1           linkat
------ ----------- ----------- --------- --------- ----------------
100.00 52157.980642             227737277  82623160 total
System call usage summary for 32 bit mode:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 66.72   49.342526       54764       901           wait4
 29.96   22.157386         332     66751      7059 futex
  1.23    0.911634           9     96806     65138 open
  0.91    0.673931           0   2312182           write
  0.46    0.340610           1    556057           read
  0.19    0.139320           0    997680           _llseek
  0.16    0.120858           4     31291       157 close
  0.10    0.073414           2     30895      1751 stat64
  0.07    0.052509          21      2538       856 unlink
  0.05    0.037090           0    320275           gettimeofday
  0.03    0.022479           1     28604           munmap
  0.02    0.015609           1     12976           rt_sigaction
  0.02    0.013604          27       508           rename
  0.02    0.012859           0     60486           brk
  0.01    0.009089           0     27988           old_mmap
  0.01    0.008663           0     19261           fstat64
  0.01    0.005501           1      5980           getpid
  0.01    0.005287           1      4598      3014 lstat64
  0.01    0.004514           5       925           execve
  0.00    0.002443           0      5415           mmap2
  0.00    0.001559           1      2299           times
  0.00    0.000977           1      1283           time
  0.00    0.000689          14        48           poll
  0.00    0.000649           1       530           getcwd
  0.00    0.000593           0      3954           clock_gettime
  0.00    0.000563           0      2756           mprotect
  0.00    0.000401          17        24           ftruncate
  0.00    0.000327           0       901           fork
  0.00    0.000322           1       332           lseek
  0.00    0.000296           1       333       285 access
  0.00    0.000197           0      1209           uname
  0.00    0.000146           1       168           clone
  0.00    0.000144           3        48           getrlimit
  0.00    0.000115           0       456           sched_getaffinity
  0.00    0.000096           0       456           rt_sigprocmask
  0.00    0.000077           2        48           setsockopt
  0.00    0.000073           2        48           sendto
  0.00    0.000071           1        96           getdents64
  0.00    0.000060           1        48           old_getrlimit
  0.00    0.000048           0       312           sigaltstack
  0.00    0.000043           2        24        24 mkdir
  0.00    0.000038           0        96           madvise
  0.00    0.000027           0       192       192 sched_setscheduler
  0.00    0.000024           1        48           pread64
  0.00    0.000018           0        48           set_tid_address
  0.00    0.000000           0        24           setrlimit
  0.00    0.000000           0        48           readlink
  0.00    0.000000           0        48           getdents
  0.00    0.000000           0        24           sched_getparam
  0.00    0.000000           0        24           sched_getscheduler
  0.00    0.000000           0        24           sched_yield
  0.00    0.000000           0        72           getuid32
  0.00    0.000000           0        48           getgid32
  0.00    0.000000           0        72           geteuid32
  0.00    0.000000           0        48           getegid32
  0.00    0.000000           0        48           fcntl64
  0.00    0.000000           0       216           gettid
  0.00    0.000000           0        87           set_thread_area
  0.00    0.000000           0       216           set_robust_list
  0.00    0.000000           0        96           socket
  0.00    0.000000           0        48        48 bind
  0.00    0.000000           0        48        48 connect
  0.00    0.000000           0        48           recvfrom
------ ----------- ----------- --------- --------- ----------------
100.00   73.956879               4599113     78572 total

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 93.82 53268.050950      302081    176337     41624 wait4
  3.18 1805.057765        7853    229844     48664 futex
  1.06  599.730842           8  71119946   3458260 lstat
  0.94  532.243350          14  37295656  29590966 open
  0.73  416.897838           7  59083695  48774860 stat
  0.09   50.603194           4  13780850      4946 read
  0.04   21.309529           3   8034030     14388 close
  0.03   17.913204           2   7594725           fstat
  0.02   14.098519           3   5545626           mmap
  0.02   10.953696           3   3820511           munmap
  0.01    5.575016          75     74393           sched_yield
  0.01    3.563169           4    847404    371559 access
  0.01    3.497227           1   3051847           write
  0.01    3.432796           0   8694798      5684 lseek
  0.01    3.343298           2   1576613           brk
  0.01    2.970389          14    213808           getdents
  0.00    2.511414           1   1989877           rt_sigprocmask
  0.00    2.191222           4    598520           mprotect
  0.00    2.034284           2    963896        28 rt_sigaction
  0.00    1.013485           4    243818    137443 execve
  0.00    0.952604          86     11125           pwrite
  0.00    0.642269           3    189873       584 readlink
  0.00    0.574819           9     65681        32 vfork
  0.00    0.536335          14     37671     35305 ioctl
  0.00    0.532600         238      2237           symlink
  0.00    0.479347           3    161899           getuid
  0.00    0.448964           3    158131           getegid
  0.00    0.432548           1    462935      1835 fcntl
  0.00    0.425225           3    160131           geteuid
  0.00    0.418771          65      6407      3618 mkdir
  0.00    0.407211           3    161819           getgid
  0.00    0.395323           3    121333           dup2
  0.00    0.354416           3    106100           arch_prctl
  0.00    0.265068           3     80654           getrlimit
  0.00    0.254673          23     11070      4965 pselect6
  0.00    0.222961           3     70009           pipe
  0.00    0.206641         103      2005       627 unlinkat
  0.00    0.171408         106      1622         2 unlink
  0.00    0.161237           2     85127      5053 rt_sigreturn
  0.00    0.151686           2     65149           uname
  0.00    0.124745         102      1218           rename
  0.00    0.113329           2     50950           clone
  0.00    0.095758          36      2650       627 newfstatat
  0.00    0.087777           2     49361           pread
  0.00    0.068028           1     74988           setrlimit
  0.00    0.051289           2     33287           getpgrp
  0.00    0.049880           7      7545           poll
  0.00    0.043773           3     16604           madvise
  0.00    0.043232           1     33290           getppid
  0.00    0.038585           4      8672           statfs
  0.00    0.037752           1     34861           getpid
  0.00    0.037340           8      4418           getrusage
  0.00    0.034400          20      1703      1025 utimensat
  0.00    0.032469          60       538           chmod
  0.00    0.030341          45       669           fchmodat
  0.00    0.026653           1     37441           getcwd
  0.00    0.024115           6      4006           chdir
  0.00    0.021799          67       326           fchmod
  0.00    0.016171           2      9931           set_robust_list
  0.00    0.013941           3      5234           sched_getaffinity
  0.00    0.012925           3      4886           getgroups
  0.00    0.011171           1      8888           socket
  0.00    0.009249           1      7971           set_tid_address
  0.00    0.008972           0     22551           mremap
  0.00    0.008657           1      7948      7512 bind
  0.00    0.008099           9       907       906 connect
  0.00    0.007414          23       321           fchown
  0.00    0.005842           3      2155           umask
  0.00    0.004866          25       197           chown
  0.00    0.004684           1      7513           sendto
  0.00    0.003978          14       279           utimes
  0.00    0.002132          79        27           openat
  0.00    0.001911           6       321           getxattr
  0.00    0.001591           3       478           fchdir
  0.00    0.001353          38        36           faccessat
  0.00    0.001293           0      9256           setsockopt
  0.00    0.001183           0      7513           recvfrom
  0.00    0.001000         333         3           creat
  0.00    0.001000         250         4           sigaltstack
  0.00    0.000967           0      5039           gettid
  0.00    0.000439           4       107           ftruncate
  0.00    0.000173           1       202           writev
  0.00    0.000110           0       493        57 getsockname
  0.00    0.000090           1       177           times
  0.00    0.000079           0      2107      2107 getpeername
  0.00    0.000016           0       872           getsockopt
  0.00    0.000016           0        71           clock_getres
  0.00    0.000000           0        12         1 select
  0.00    0.000000           0         6           dup
  0.00    0.000000           0         3           alarm
  0.00    0.000000           0        22           socketpair
  0.00    0.000000           0         6           kill
  0.00    0.000000           0         6           sysinfo
  0.00    0.000000           0         5         5 fsetxattr
  0.00    0.000000           0         5         5 fgetxattr
  0.00    0.000000           0         1           linkat
------ ----------- ----------- --------- --------- ----------------
100.00 56776.115880             227365252  82512688 total
System call usage summary for 32 bit mode:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 67.23   48.972573       54354       901           wait4
 29.39   21.408221         314     68089      7725 futex
  1.19    0.863531           9     96806     65138 open
  0.94    0.682799           0   2312048           write
  0.46    0.334636           1    556033           read
  0.18    0.130719           0    997635           _llseek
  0.17    0.122515           4     31291       157 close
  0.11    0.077986          31      2538       856 unlink
  0.10    0.073506           2     30895      1751 stat64
  0.07    0.047623           0    320530           gettimeofday
  0.03    0.022181           0     60615           brk
  0.03    0.021145           1     28624           munmap
  0.02    0.017376          34       508           rename
  0.02    0.014333           1     12976           rt_sigaction
  0.02    0.012528           0     27981           old_mmap
  0.01    0.009627           0     19261           fstat64
  0.01    0.008195           2      4598      3014 lstat64
  0.01    0.005199           1      5416           mmap2
  0.01    0.004555           1      5980           getpid
  0.00    0.003601           1      4058           clock_gettime
  0.00    0.002974           2      1209           uname
  0.00    0.001965           1      2299           times
  0.00    0.001364           1      1283           time
  0.00    0.001354           1       925           execve
  0.00    0.000944           0      2705           mprotect
  0.00    0.000421           1       530           getcwd
  0.00    0.000228           1       456           rt_sigprocmask
  0.00    0.000209           3        72           getuid32
  0.00    0.000181           0       456           sched_getaffinity
  0.00    0.000163           1       168           clone
  0.00    0.000125           0       333       285 access
  0.00    0.000102           2        48        48 bind
  0.00    0.000079           0       216           set_robust_list
  0.00    0.000064           0       312           sigaltstack
  0.00    0.000036           0       192       192 sched_setscheduler
  0.00    0.000031           1        48           getdents
  0.00    0.000031           0        96           socket
  0.00    0.000030           0        72           geteuid32
  0.00    0.000025           1        48           pread64
  0.00    0.000023           0        48           set_tid_address
  0.00    0.000021           0        96           madvise
  0.00    0.000000           0       901           fork
  0.00    0.000000           0       332           lseek
  0.00    0.000000           0        24        24 mkdir
  0.00    0.000000           0        24           setrlimit
  0.00    0.000000           0        48           old_getrlimit
  0.00    0.000000           0        48           readlink
  0.00    0.000000           0        24           ftruncate
  0.00    0.000000           0        24           sched_getparam
  0.00    0.000000           0        24           sched_getscheduler
  0.00    0.000000           0        24           sched_yield
  0.00    0.000000           0        48           poll
  0.00    0.000000           0        48           getrlimit
  0.00    0.000000           0        48           getgid32
  0.00    0.000000           0        48           getegid32
  0.00    0.000000           0        96           getdents64
  0.00    0.000000           0        48           fcntl64
  0.00    0.000000           0       216           gettid
  0.00    0.000000           0        87           set_thread_area
  0.00    0.000000           0        48        48 connect
  0.00    0.000000           0        48           sendto
  0.00    0.000000           0        48           recvfrom
  0.00    0.000000           0        48           setsockopt
------ ----------- ----------- --------- --------- ----------------
100.00   72.843219               4600699     79238 total

_______________________________________________
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make

Reply via email to