Hi Stephane,

* Stephane Chazelas wrote on Tue, Oct 28, 2008 at 11:26:18AM CET:
>
> I have to admit I would have thought the code above to be safe
> as well and I wonder if it's the same on all systems. But I can
> reproduce the problem on Linux. As far as I can tell, if you
> don't use O_APPEND, the system doesn't guarantee the write(2) to
> be atomic, so I suppose you can get this kind of behavior if a
> context switch occurs in the middle of a write(2) system call.

thanks for the feedback, that looks spot-on!  It is supported by the
fact that the log:

> > <http://buildbot.proulx.com:9003/amd64-gnu-linux/builds/961/step-test/0>

shows that the per-test testsuite.log file contains all the output,
while the 'stdout' file did not.  The former is always generated by
either
  tee -a testsuite.log
or
  cat >> testsuite.log

Also, I have not been able to provoke lossage on an unredirected
standard output (manually running ./micro-suite in the test dir).

> That wouldn't have anything to do with the shell.

Yep.

> Replacing foo.sh > stdout 2> stderr with
> : > stdout > stderr
> ./foo.sh >> stdout 2>> stderr
>
> should be guaranteed to work.

Yes.  For shell portability, I'll write the first line as
  : > stdout
  : > stderr

though.

> I think
>
> { ./foo.sh | cat > stdout; } 2>&1 | cat > stderr
>
> should be OK as well as write(2)s to a pipe are meant to be
> atomic as long as they are less than PIPE_BUF bytes (a page size
> on Linux) and even if they were not atomic, I would still
> consider it a bug if one process' output to a pipe was to
> overwrite another one's.

I agree.  However, this solution requires two or three more processes
than the first one.

Consequently, I think the patch below should fix the failure.  I've
tried it out on a couple of GNU/Linux systems, and been unable to
provoke the failure after an hour or so.

I've pushed the change, and put Stéphane in THANKS.

Cheers,
Ralf, a lot less worried about parallel Autotest now  :-)

Fix parallel test execution output lossage.

* lib/autotest/general.m4 (_AT_CHECK): Truncate files to hold
standard output and standard error before the test, use append
mode for writing.
* THANKS: Update.
Caught by Bob Proulx' build daemons, analysis and suggested fix
by Stephane Chazelas.

diff --git a/lib/autotest/general.m4 b/lib/autotest/general.m4
index 4d7c0f5..03d3902 100644
--- a/lib/autotest/general.m4
+++ b/lib/autotest/general.m4
@@ -1893,16 +1893,22 @@ m4_define([AT_DIFF_STDOUT()],
 #
 #     ( $at_traceon; $1 ) >at-stdout 2>at-stder1
 #
+# Note that we truncate and append to the output files, to avoid losing
+# output from multiple concurrent processes, e.g., an inner testsuite
+# with parallel jobs.
 m4_define([_AT_CHECK],
 [{ $at_traceoff
 AS_ECHO(["$at_srcdir/AT_LINE: AS_ESCAPE([$1])"])
 echo AT_LINE >"$at_check_line_file"
 
+: >"$at_stdout"
 if _AT_DECIDE_TRACEABLE([$1]); then
-    ( $at_traceon; $1 ) >"$at_stdout" 2>"$at_stder1"
+    : >"$at_stder1"
+    ( $at_traceon; $1 ) >>"$at_stdout" 2>>"$at_stder1"
     at_func_filter_trace $?
 else
-    ( :; $1 ) >"$at_stdout" 2>"$at_stderr"
+    : >"$at_stderr"
+    ( :; $1 ) >>"$at_stdout" 2>>"$at_stderr"
 fi
 at_status=$?
 at_failed=false
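
For reference, the truncate-then-append idiom used in the patch can be
tried outside of Autotest with a small standalone script.  This is only
a sketch under assumptions: the writers function, the file names and the
line counts are made up for illustration and are not part of
general.m4, and mktemp -d is assumed to be available.

```shell
#!/bin/sh
# Illustrative sketch only: names here are hypothetical, not from Autotest.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1

# Stand-in for a test command that starts several concurrent jobs,
# all of which inherit the same standard output and standard error.
writers() {
  for id in 1 2 3 4; do
    (
      i=0
      while [ "$i" -lt 50 ]; do
        echo "writer $id line $i"
        echo "writer $id err $i" >&2
        i=$((i + 1))
      done
    ) &
  done
  wait
}

# Truncate each file with its own command, for shell portability.
: >stdout
: >stderr

# Open both files in append mode (O_APPEND), so that every write(2)
# goes to the current end of the file rather than to a stale offset.
writers >>stdout 2>>stderr

# Count what arrived; the arithmetic expansion strips any padding
# that some wc implementations emit.
out_lines=$(( $(wc -l <stdout) ))
err_lines=$(( $(wc -l <stderr) ))
echo "stdout lines: $out_lines, stderr lines: $err_lines"
```

With four writers producing 50 lines each on both streams, each file
should end up with 200 lines.  Whether the plain ">" variant actually
loses output depends on the system and on timing, so the script only
demonstrates the safe form and does not try to reproduce the failure.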