https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63890
--- Comment #11 from Jan Hubicka <hubicka at ucw dot cz> ---
>
> Index: config/i386/i386.h
> ===================================================================
> --- config/i386/i386.h  (revision 220946)
> +++ config/i386/i386.h  (working copy)
> @@ -1606,7 +1606,7 @@ enum reg_class
>
>  #define ACCUMULATE_OUTGOING_ARGS \
>    ((TARGET_ACCUMULATE_OUTGOING_ARGS && optimize_function_for_speed_p (cfun)) \
> -   || TARGET_STACK_PROBE || TARGET_64BIT_MS_ABI)
> +   || TARGET_STACK_PROBE || TARGET_64BIT_MS_ABI || crtl->profile)

I do not see how ACCUMULATE_OUTGOING_ARGS is going to ensure mcount stack
alignment.  The calls are output into assembly code by:

/* Output assembler code to FILE to increment profiler label # LABELNO
   for profiling a function entry.  */
void
x86_function_profiler (FILE *file, int labelno ATTRIBUTE_UNUSED)
{
  const char *mcount_name = (flag_fentry ? MCOUNT_NAME_BEFORE_PROLOGUE
                                         : MCOUNT_NAME);

  if (TARGET_64BIT)
    {
#ifndef NO_PROFILE_COUNTERS
      fprintf (file, "\tleaq\t%sP%d(%%rip),%%r11\n", LPREFIX, labelno);
#endif

      if (!TARGET_PECOFF && flag_pic)
        fprintf (file, "1:\tcall\t*%s@GOTPCREL(%%rip)\n", mcount_name);
      else
        x86_print_call_or_nop (file, mcount_name);
    }

That does not care about ACCUMULATE_OUTGOING_ARGS.  We basically get alignment
by accident because we push RBP.

My guess is that you want to bump crtl->stack_alignment_needed to 64bits for
every function with crtl->profile
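
To make the "alignment by accident" remark concrete, here is a small arithmetic sketch in C. It is not GCC code: the helper names (sp_is_aligned, sp_after_push, sp_at_mcount_call) are hypothetical, and it assumes only the System V x86-64 rule that the stack pointer is 16-byte aligned immediately before a call instruction. It ignores any other prologue stack adjustment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative model only; none of these names exist in GCC.  */

/* Return true if SP is a multiple of ALIGN (a power of two, in bytes).  */
static bool
sp_is_aligned (uint64_t sp, uint64_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return (sp & (align - 1)) == 0;
}

/* A 64-bit push (or the return address pushed by a call) moves the
   stack pointer down by 8 bytes.  */
static uint64_t
sp_after_push (uint64_t sp)
{
  return sp - 8;
}

/* Stack pointer at the point where the profiled function calls mcount.
   SP_BEFORE_CALL is the caller's stack pointer just before its call
   instruction, assumed 16-byte aligned.  PUSHES_RBP models whether the
   prologue has executed "push %rbp" before the mcount call.  */
static uint64_t
sp_at_mcount_call (uint64_t sp_before_call, bool pushes_rbp)
{
  uint64_t sp = sp_after_push (sp_before_call);  /* return address */
  if (pushes_rbp)
    sp = sp_after_push (sp);                     /* push %rbp */
  return sp;
}
```

In this model the return address leaves the stack pointer 8 bytes off a 16-byte boundary, and the single push of RBP happens to restore the alignment. Without that push nothing in the emitted call sequence realigns the stack, which is the concern raised above.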