Update of bug #41781 (project make):

    Assigned to:    None => psmith
    Triage Status:  None => Medium Effort
_______________________________________________________

Follow-up Comment #1:

There's some confusion in the reading material you mention, and in the bug report, about how GNU make actually works. It's not true that enabling parallel builds will magically turn on the -k (keep-going) flag and that it can't be turned off.

Make with parallel builds enabled, and without -k, handles failed builds in exactly the same way that it would otherwise: it stops building things "as soon as it can". When a parallel make instance detects a failed build, it does not start any new jobs; it waits for all the currently-running parallel jobs to complete, then exits with an error code. Make doesn't try to kill any running jobs, it waits for them to finish: killing jobs can lead to corrupted builds if the recipe doesn't expect to be killed (arguably this is a bug in the recipe, since the user could always use CTRL-C, but nevertheless it's not expected that make will kill things).

The problem you're seeing likely happens because you are using recursive make instances. Suppose you have a makefile which runs three recursive instances of make. You start it with -j2, so you have make instance A (the top one) spawning instances B and C (the children). The third recursive make instance is not started (yet) because there are only 2 job slots available.

Now say that during the run of make instance B a job fails. In that case, instance B detects that a failure happened: it won't start any new jobs, and it will exit with an error as soon as all currently running jobs _it started_ have completed. When B exits with an error, instance A (the root) sees the failure and won't start up the final recursive make at all. However, the already-running make instance C has no idea that B saw an error, so C keeps building all its targets as usual, then exits.
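
The A/B/C scenario above can be illustrated with a rough simulation. This is only a sketch and is not how GNU make is implemented: the names SubMake and run_top_level, the third instance "D", and the job lists are all hypothetical, and each sub-make is assumed to run one recipe at a time in lock-step rounds.

```python
from dataclasses import dataclass


@dataclass
class SubMake:
    """A hypothetical recursive make instance (not a real make data structure)."""
    name: str
    jobs: list            # one bool per recipe: True = succeeds, False = fails
    done: int = 0         # number of recipes completed successfully
    started: bool = False
    failed: bool = False


def run_top_level(children, slots):
    """Simulate the top-level make (instance A) without -k.

    Assumption: every running sub-make executes one recipe per round.
    Returns 2 if any sub-make failed, otherwise 0.
    """
    if slots < 1:
        raise ValueError("need at least one job slot")
    pending = list(children)
    running = []
    error_seen = False
    while running or (pending and not error_seen):
        # A starts new sub-makes only while it has seen no error
        # and a job slot is free.
        while pending and not error_seen and len(running) < slots:
            child = pending.pop(0)
            child.started = True
            running.append(child)
        for child in list(running):
            if child.done < len(child.jobs) and not child.jobs[child.done]:
                # The recipe failed: this sub-make stops and exits with an
                # error. Its siblings are not told and are not killed.
                child.failed = True
                error_seen = True
                running.remove(child)
                continue
            if child.done < len(child.jobs):
                child.done += 1
            if child.done == len(child.jobs):
                running.remove(child)   # finished all of its targets
    return 2 if error_seen else 0


b = SubMake("B", [True, False, True])   # second recipe fails
c = SubMake("C", [True] * 5)            # unaware sibling
d = SubMake("D", [True])                # third instance, waiting for a slot
status = run_top_level([b, c, d], slots=2)
for m in (b, c, d):
    print(m.name, "started:", m.started, "failed:", m.failed, "done:", m.done)
print("top-level exit status:", status)
```

In this model B stops at its failing recipe, C still runs every one of its recipes to completion, and D is never started, which matches the behavior described above.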

If you have a large amount of parallelism and a recursive make environment, then you can get a large number of directories building happily along, unaware that one of their siblings has detected an error.

I've been thinking about this problem (which is not really the point of the patch you mention: that patch allows some finite number of errors to be ignored, fewer than the "all" you get with -k). I think the right solution is that when a make instance gets an error, it notifies all the other make instances about the error. I think the right way to do that is to start putting back a different token into the jobserver queue, one which means "error detected". Then when other make instances get that token they'll know one of their siblings had an error and they can stop building.

This means that make will still not kill any existing jobs that are running, but it won't start any new ones either, even in recursive environments.

The big open question is: if a make instance detects that some other make instance failed, and so it wants to stop early, what should its exit code be? I think it must exit with an error code, even though it did not itself detect any error, because it didn't completely build its target; we have to signal that to the parent in case the parent is relying on that for further processing. On the other hand, that means you'll get a list of error messages for all the recursive make invocations, which might be unpleasant.

_______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?41781>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/

_______________________________________________
Bug-make mailing list
Bug-make@gnu.org
https://lists.gnu.org/mailman/listinfo/bug-make