Hi Branden,

G. Branden Robinson wrote on Fri, May 21, 2021 at 04:57:59PM +1000:

> I have a reproducible problem that I don't understand.

Not very surprisingly, i didn't instantly manage to reproduce your
problem on a different machine and operating system. That's typical
for races, they depend on a lot of details.

> I've elided my actual build directory name below because it's not
> important.
>
> $ echo '@c' >> doc/groff.texi
> $ (cd build && make -j all check)

The next line is from "make check":

> make check-am

The next line is from "make all":

> LANG=C \
> LC_ALL=C \
> makeinfo -o doc/groff.info --enable-encoding -I.../groff/build/../doc
> .../groff/build/../doc/groff.texi

Strangely, i don't get the next line at all:

> make[1]: Entering directory '.../groff/build'

The next line is again from "make check":

> make

I don't get the following line either:

> make[2]: Entering directory '.../groff/build'

I do not get the following second copy of the makeinfo invocation:

> LANG=C \
> LC_ALL=C \
> makeinfo -o doc/groff.info --enable-encoding -I.../groff/build/../doc
> .../groff/build/../doc/groff.texi
> makeinfo: rename doc/groff.info failed: No such file or directory
> make[2]: *** [Makefile:12224: doc/groff.info] Error 1

[...]

> Essential parts of this include:
> (1) I have to actually modify groff.texi. Touching its timestamp does
> not suffice.
> (2) I have to call the "all" _and_ "check" targets.
> (3) I have to use the "-j" flag.

In my tests, i'm using "make -j 4 all check" because without
specifying the maximum number of parallel jobs as an argument "4"
to the "-j" option, i get nothing but

   $ make -j all check
   make: illegal argument to -j option -- all -- invalid
   usage: make [-BeiknpqrSst] [-C directory] [-D variable] [-d flags]
          [-f mk] [-I directory] [-j max_processes] [-m directory]
          [-V variable] [NAME=value] [target ...]

> What I think is happening is that make(1) is forking off a job for each
> of the "all" and "check" targets, and they are racing against each
> other. One of them always loses, so I always get the error.

That seems likely, yes.

> I see Ingo is feeling feisty,

Heh. :-|

> so I'll add this: please don't advise me to not use -j.
> Our *.am files work fine with it in most respects.

No, and yes. Parallel builds can be useful, not only for wasting
less time waiting for builds to finish but also for finding bugs
in dependency specifications in Makefiles. More often than not,
build failures that only happen with -j indicate Makefile bugs,
for example bugs in dependency specifications, and sometimes other
bugs, too. What you describe certainly feels like a build system
bug to me.

> In a minor mystery, I don't know what's generating that ENOENT
> diagnostic; on my system, makeinfo is a symlink to texi2any (as it is
> for most people, I expect),

For me, /usr/local/bin/gmakeinfo is just a Perl script, not a symlink,
installed by the texinfo-6.5p4 package.

> and in that Perl script I can't find a line corresponding to it.

Well, makeinfo is a large beast, including lots and lots of modules:

   require Texinfo::ModulePath;
   use Texinfo::Common;
   use Texinfo::Convert::Converter;
   require Texinfo::Parser;
   require Texinfo::Convert::HTML;

Just as a few examples... It won't be easy finding anything in there.

Oh wait, grep(1) to the rescue:

   $ grep -R 'rename.*failed' /usr/local/share/texinfo/Texinfo
   /usr/local/share/texinfo/Texinfo/Convert/Info.pm:
     $self->document_error(sprintf($self->__("rename %s failed: %s"),

The line before that is:

   unless (rename($self->{'output_file'},
       $self->{'output_file'}.'-'.$out_file_nr)) {

So makeinfo(1) writes stuff to doc/groff.info, then renames the
written file to groff.info-1 and so on.

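A minimal sketch of that write-then-rename step, in Python rather than
the Perl of Info.pm. The function names and the temporary directory are
made up for illustration, and this is not makeinfo's actual code. It
only shows what rename(2) reports when the same source name is renamed
a second time after the file has already been moved away:

```python
import errno
import os
import tempfile


def write_then_rename(output_file, part_number, text):
    """Write text to output_file, then rename it to output_file-N."""
    with open(output_file, "w") as handle:
        handle.write(text)
    os.rename(output_file, f"{output_file}-{part_number}")


def second_rename_error(workdir):
    """Return the error raised when the same rename is attempted twice."""
    # Hypothetical stand-in for doc/groff.info inside a scratch directory.
    output_file = os.path.join(workdir, "groff.info")
    write_then_rename(output_file, 1, "first part\n")
    try:
        # A second job that still believes groff.info is its own file.
        os.rename(output_file, f"{output_file}-1")
    except FileNotFoundError as error:
        return error
    return None


with tempfile.TemporaryDirectory() as workdir:
    error = second_rename_error(workdir)

# The second rename fails with ENOENT, matching the diagnostic text
# "No such file or directory".
print(type(error).__name__, error.errno == errno.ENOENT)
```

The second rename fails even though the target file already exists
with plausible content, which is the failure mode quoted above.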
Now if two jobs do that at the same time, we get this beautiful
example of cooperation:

   job 1: starts writing doc/groff.info
   job 2: starts writing doc/groff.info
          - same name, different inode; the file being written
            by job 1 is now unlinked but still open
   job 1: finishes writing the unlinked file
   job 1: renames doc/groff.info to doc/groff.info-1
          - and actually, that is the file being written by job 2
   job 2: finishes writing the file that it originally called
          doc/groff.info but that is now already called
          doc/groff.info-1 instead
   job 2: tries to rename doc/groff.info to doc/groff.info-1
          - BOOM, because doc/groff.info no longer exists

Quite funny that the content of doc/groff.info-1 is possibly
already correct, and so is the file name, and yet makeinfo(1)
crashes.

Now, obviously this code in makeinfo(1) is rather fragile.
The standard idiom for handling such tasks is using mkstemp(3)
rather than using constant filenames for temporary files, which
would mitigate the problem somewhat. But that's not our task
here, improving that is a job for the texinfo crowd over there --->. ;-)

Our job here is to make sure our build system doesn't run makeinfo(1)
twice in the same make job - not doing that makes sense anyway, and
even more so given makeinfo's apparent fragility.

> Any ideas?

I don't understand yet why your make falls into the trap of running
makeinfo twice and mine doesn't, but i thought i might share these
partial results right away to reduce the risk of multiple people
doing the same analysis in parallel. Races between developers, y'know.

Yours,
  Ingo
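
P.S. A rough sketch of the mkstemp(3) idea, using Python's
tempfile.mkstemp() as a stand-in for the C function. The helper names
start_job() and finish_job() and the "groff.info." prefix are invented
for this illustration; nothing here is taken from texinfo. Each job
writes to its own uniquely named file and only then renames it onto
the final name, so the interleaving described above no longer makes
either rename fail:

```python
import os
import tempfile


def start_job(directory):
    """Open a uniquely named temporary file in directory for writing."""
    # mkstemp creates the file itself, so no two jobs share a name.
    fd, temp_path = tempfile.mkstemp(dir=directory, prefix="groff.info.")
    return os.fdopen(fd, "w"), temp_path


def finish_job(handle, temp_path, final_path, text):
    """Write text, close the file, and rename it onto final_path."""
    handle.write(text)
    handle.close()
    # os.replace overwrites an existing final_path instead of failing.
    os.replace(temp_path, final_path)


with tempfile.TemporaryDirectory() as workdir:
    final_path = os.path.join(workdir, "groff.info-1")
    # Same interleaving as in the race: both jobs start before either ends.
    handle_1, temp_1 = start_job(workdir)
    handle_2, temp_2 = start_job(workdir)
    finish_job(handle_1, temp_1, final_path, "job 1\n")
    finish_job(handle_2, temp_2, final_path, "job 2\n")
    leftover = sorted(os.listdir(workdir))
    with open(final_path) as handle:
        final_text = handle.read()

print(leftover, repr(final_text))
```

Both jobs still do the work twice and the last rename wins, so this
only mitigates the symptom; not running makeinfo twice remains the
real fix on the build system side.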