https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113689

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hjl.tools at gmail dot com,
                   |                            |hubicka at gcc dot gnu.org,
                   |                            |jakub at gcc dot gnu.org,
                   |                            |uros at gcc dot gnu.org
           Priority|P3                          |P2
   Target Milestone|---                         |11.5
            Summary|wrong code with unused      |[11/12/13/14 Regression]
                   |_BitInt() division with -O2 |wrong code with -fprofile
                   |-fprofile -mcmodel=large    |-mcmodel=large when needing
                   |-mavx                       |drap register since
                   |                            |r11-6548

--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Seems like a backend bug to me: a collision between the function profiler
emitted after the prologue and the drap register.
I see
foo:
        leaq    8(%rsp), %r10
        andq    $-32, %rsp
        pushq   -8(%r10)
        pushq   %rbp
        movq    %rsp, %rbp
        pushq   %r14
        pushq   %r13
        pushq   %r12
        pushq   %r10
        pushq   %rbx
        subq    $200, %rsp
1:      movabsq $mcount, %r10
        call    *%r10
        xorl    %eax, %eax
        xorl    %edx, %edx
        movl    $-511, %r9d
        addb    $-1, %dl
        movq    %rax, %rdx
        sbbq    (%r10), %rdx

This function is stack_realign_drap; find_drap_reg returns R10_REG, so
ix86_get_drap_rtx uses %r10 as the drap register.  Later on, pro_and_epilogue
initializes the drap register in the prologue.  Then final.cc, on seeing
NOTE_INSN_PROLOGUE_END, emits the late FUNCTION_PROFILER, which seems to have
a clobber of %r10 (and/or %r11) hardcoded in it, so it overwrites the drap
value.  In the assembly above, the movabsq after label 1: replaces the drap
value in %r10 with the address of mcount, and the later sbbq (%r10), %rdx
then reads through the wrong pointer.
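
To make the ordering problem concrete, here is a toy model in C.  It is not
GCC code: the register file, the function names (prologue_set_drap,
profiler_call, drap_survives) and the addresses are all made up for
illustration.  It only mimics the two steps seen in the assembly, the
prologue storing the incoming argument pointer in the drap register and the
profiler sequence then loading the mcount address into its scratch register.

```c
#include <stdint.h>

/* Toy model, not GCC code: only the two registers relevant here.  */
enum reg { R10, R11, NUM_REGS };

struct regs { uintptr_t r[NUM_REGS]; };

/* Models "leaq 8(%rsp), %r10": the drap register now holds the
   address of the incoming argument area.  */
static void
prologue_set_drap (struct regs *rf, enum reg drap, uintptr_t incoming_args)
{
  rf->r[drap] = incoming_args;
}

/* Models "movabsq $mcount, %r10; call *%r10": the profiler sequence
   loads the mcount address into a scratch register.  */
static void
profiler_call (struct regs *rf, enum reg scratch, uintptr_t mcount_addr)
{
  rf->r[scratch] = mcount_addr;
}

/* Returns 1 if the drap register still holds the incoming argument
   address after the profiler sequence, 0 if it was overwritten.  */
static int
drap_survives (enum reg drap, enum reg scratch)
{
  struct regs rf = { { 0, 0 } };
  const uintptr_t incoming_args = 0x7ffc1000u;  /* arbitrary value */
  const uintptr_t mcount_addr = 0x401000u;      /* arbitrary value */

  prologue_set_drap (&rf, drap, incoming_args);
  profiler_call (&rf, scratch, mcount_addr);
  return rf.r[drap] == incoming_args;
}
```

In this model drap_survives (R10, R10) returns 0, which corresponds to the
code shown above, while drap_survives (R10, R11) returns 1.  The second call
only shows that disjoint registers do not collide in the model; it is not a
claim about how the backend should be fixed.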

One doesn't need _BitInt to reproduce:

/* { dg-do run { target lp64 } } */
/* { dg-options "-O2 -fprofile -mcmodel=large" } */

__attribute__((noipa)) void
bar (char *x, char *y, int *z)
{
  x[0] = 42;
  y[0] = 42;
  if (z[0] != 16)
    __builtin_abort ();
}

__attribute__((noipa)) void
foo (int c, int d, int e, int f, int g, int h, int z)
{
  typedef char B[32];
  B b __attribute__((aligned (32)));
  bar (&b[0], __builtin_alloca (z), &z);
}

int
main ()
{
  foo (0, 0, 0, 0, 0, 0, 16);
}

Started with r11-6548-g1b885264a48dcd71b7aeb26c0abeb91246724897
