The biggest change since V1 of this patch is dropping the changes to STACK_CHECK_MOVING_SP. They're not needed.
This patch also refactors a bit of the new code in explow.c. In particular it pulls out three chunks of code for protecting dynamic stack adjustments so they can be re-used by backends that have their own allocation routines for dynamic stack data.

--

The key goal of this patch is to introduce stack clash protection for dynamically allocated stack space and to route uses of STACK_CHECK_PROTECT through a new function, get_stack_check_protect.

Those two changes accomplish two things. First, most targets gain protection of dynamically allocated space (the exceptions are targets with their own expanders to allocate dynamic stack space, such as ppc). Second, targets which will not get -fstack-clash-protection prologue support until later in this series, but which do support -fstack-check=specific, still get a fair amount of protection from -fstack-clash-protection.

We essentially vector into a totally different routine to allocate and probe the dynamic stack space when -fstack-clash-protection is active. It differs from the existing routine in that it allocates PROBE_INTERVAL chunks and probes them as they are allocated; the existing code would allocate the entire space as a single hunk, then probe PROBE_INTERVAL chunks within the hunk. (A sketch contrasting the two strategies appears at the end of this message.) The new routine is never presented with constant allocations on x86, but it is on other architectures. It will optimize cases where it knows it does not need the loop or the residual allocation after the loop. It does not have an unrolled loop mode, but one could be added; it didn't seem worth the effort. The test checks that the loop is avoided in one case where it makes sense. It does not check for avoiding the residual allocation, but it could probably be made to do so.

The indirection for STACK_CHECK_PROTECT via get_stack_check_protect is worth some further discussion as well. Early in the development of the stack-clash mitigation patches we thought we could get away with re-using much of the existing target code for -fstack-check=specific. Essentially that code starts a probing loop at STACK_CHECK_PROTECT and probes 2-3 pages beyond the current function's needs. The problem was that starting at STACK_CHECK_PROTECT skips probes in the first couple of pages, leaving the code vulnerable.

So the idea was to avoid using STACK_CHECK_PROTECT directly and instead indirect through the new get_stack_check_protect function, which returns 0 for -fstack-clash-protection and STACK_CHECK_PROTECT for -fstack-check=specific. (A sketch of that function also appears at the end of this message.)

That scheme works reasonably well, except that it still tends to allocate a large (larger than PROBE_INTERVAL) chunk of memory at once, then go back and probe regions of PROBE_INTERVAL size within it. That introduces an unfortunate race condition with asynchronous signals and also crashes valgrind on ppc and aarch64.

Rather than throw that code away, I'm including it here; it may still be valuable to targets that have -fstack-check=specific support but no -fstack-clash-protection support.

OK for the trunk?
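
For reference, below is a hypothetical C rendering of the two probing strategies. The real code emits RTL from explow.c; the function names, the fake_stack buffer, and the hard-wired 4096-byte PROBE_INTERVAL are illustrative only.

  #include <stddef.h>

  #define PROBE_INTERVAL 4096

  /* Existing style: move the stack pointer by the whole SIZE in one
     step, then walk through the new hunk probing at PROBE_INTERVAL
     boundaries.  The window between the big adjustment and the probes
     is the source of the signal race noted above.  */
  static char *
  alloc_then_probe (char *sp, size_t size)
  {
    sp -= size;                 /* One big allocation.  */
    for (size_t off = PROBE_INTERVAL; off <= size; off += PROBE_INTERVAL)
      sp[size - off] = 0;       /* Probe within the hunk.  */
    return sp;
  }

  /* -fstack-clash-protection style: allocate PROBE_INTERVAL at a time
     and probe each chunk as soon as it exists, so the stack pointer
     never runs more than one interval ahead of the last probe.  */
  static char *
  alloc_and_probe_in_chunks (char *sp, size_t size)
  {
    while (size >= PROBE_INTERVAL)
      {
        sp -= PROBE_INTERVAL;   /* Allocate one chunk ...  */
        *sp = 0;                /* ... and probe it immediately.  */
        size -= PROBE_INTERVAL;
      }
    sp -= size;                 /* Residual, smaller than PROBE_INTERVAL.  */
    return sp;
  }

  int
  main (void)
  {
    static char fake_stack[64 * 1024];
    char *sp = fake_stack + sizeof fake_stack;  /* Grows downward.  */
    sp = alloc_and_probe_in_chunks (sp, 10000);
    sp = alloc_then_probe (sp, 10000);
    (void) sp;
    return 0;
  }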
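
And the new indirection itself; this is roughly the shape of what the patch adds to explow.c (HOST_WIDE_INT, flag_stack_clash_protection and STACK_CHECK_PROTECT are the existing GCC-internal type, flag and macro, so this fragment only makes sense inside GCC).

  /* Return the offset at which the -fstack-check=specific probing
     loop starts.  Returning 0 when -fstack-clash-protection is
     active means the first pages get probed too, rather than being
     skipped as they are when the loop starts at STACK_CHECK_PROTECT.  */
  HOST_WIDE_INT
  get_stack_check_protect (void)
  {
    if (flag_stack_clash_protection)
      return 0;
    return STACK_CHECK_PROTECT;
  }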