https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82683

            Bug ID: 82683
           Summary: GCC generates bad code with -tune=thunderx2t99
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Keywords: wrong-code
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: sje at gcc dot gnu.org
  Target Milestone: ---

Created attachment 42448
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42448&action=edit
Test case

I am compiling the GCC benchmark from SPEC CPU 2017 on aarch64.  If I compile
it with -mtune=thunderxt88 it works, and if I compile it with
-mtune=thunderx2t99 it fails.  The tune option should only affect how fast a
program runs on a given processor; it should never result in bad code.

I have attached a cut-down test case (compilable but not runnable) to show the
problem.  In the good case you should see two sxtw sign-extend instructions:

        sxtw    x20, w0 
        cbz     x1, .L2 
        ldr     w0, [x1, x20, lsl 2]
        sxtw    x20, w0 // 21   
.L2:

In the bad case we only get one:

        sxtw    x20, w0 
        cbz     x1, .L2 
        ldr     w0, [x1, x20, lsl 2]
.L2:

If I insert the missing sxtw by hand everything works fine for me.  The sxtw
seems to go missing during combine but I do not know why.  Notice that in
addition to not doing the sxtw, we leave the loaded value in w0 and do not
put it in x20 like the good code does.
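
For illustration only, here is a C sketch of the kind of source pattern that
needs both sign extensions.  It is not taken from the attached test case; the
function names and the table contents are hypothetical.  The second function
is just a model of what a 64-bit consumer would see if the loaded word were
used without the sxtw (a 32-bit ldr zeroes the upper half of the X register);
it is an assumption about one possible consequence, not a description of the
actual SPEC failure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reconstruction; names are illustrative only. */
int64_t next_index(const int32_t *table, int32_t i)
{
    int64_t idx = i;            /* first sign extension:  sxtw x20, w0 */
    if (table != NULL)          /* cbz x1, .L2 */
        idx = table[idx];       /* ldr w0, [x1, x20, lsl 2]; sxtw x20, w0 */
    return idx;
}

/* Assumed model of the missing sxtw: the loaded 32-bit word is widened
   as if it were unsigned, so negative entries become large positives. */
int64_t next_index_missing_sxtw(const int32_t *table, int32_t i)
{
    int64_t idx = i;
    if (table != NULL)
        idx = (int64_t)(uint32_t)table[idx];
    return idx;
}

/* Self-check: the two versions differ only for negative table entries. */
void check_negative_entry(void)
{
    const int32_t table[] = { -1, 2, 0 };
    assert(next_index(table, 0) == -1);
    assert(next_index_missing_sxtw(table, 0) == 4294967295LL);
    assert(next_index(table, 1) == next_index_missing_sxtw(table, 1));
}
```

With non-negative table entries both versions agree, which would be
consistent with a missing sign extension going unnoticed until a negative
value is loaded.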

In addition to the -mtune argument I am compiling with:
-std=c11 -O2 -fno-inline -fno-schedule-insns -fno-schedule-insns2
-fno-strict-aliasing
