https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88770

            Bug ID: 88770
           Summary: Redundant load opt. or CSE pessimizes code
           Product: gcc
           Version: 8.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: bisqwit at iki dot fi
  Target Milestone: ---

For this code (-xc -std=c99 or -xc++ -std=c++17):

    struct guu { int a; int b; float c; char d; };

    extern void test(struct guu);

    void caller()
    {
        test( (struct guu){.a = 3, .b = 5, .c = 7, .d = 9} );
        test( (struct guu){.a = 3, .b = 5, .c = 7, .d = 9} );
    }

CSE (or some other form of redundant-load optimization) pessimizes the code.
The problem occurs at optimization levels -O1 and higher, including -Os.

If the function "caller" calls test() just once, the resulting code is (-O3
-fno-optimize-sibling-calls, stack alignment/push/pops omitted for brevity):

        movabs  rdi, 21474836483
        movabs  rsi, 39743127552
        call    test
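
For reference, the two immediates can be reproduced by hand. The following is
a minimal sketch, assuming the x86-64 System V calling convention (the 16-byte
struct travels as two integer eightbytes in rdi and rsi), little-endian
layout, a 32-bit IEEE-754 float and zeroed padding. The helper names
float_bits, guu_low and guu_high are made up for this illustration:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Bit pattern of a float; assumes float is a 32-bit IEEE-754 value. */
static uint32_t float_bits(float f)
{
    uint32_t u;
    assert(sizeof u == sizeof f);
    memcpy(&u, &f, sizeof u);
    return u;
}

/* First eightbyte of struct guu (a in the low half, b in the high half):
   the value expected in rdi. */
static uint64_t guu_low(int a, int b)
{
    return (uint64_t)(uint32_t)a | ((uint64_t)(uint32_t)b << 32);
}

/* Second eightbyte (c in the low half, d in the next byte, padding assumed
   to be zero): the value expected in rsi. */
static uint64_t guu_high(float c, char d)
{
    return (uint64_t)float_bits(c) | ((uint64_t)(unsigned char)d << 32);
}
```

Under these assumptions, guu_low(3, 5) gives 21474836483 and
guu_high(7.0f, 9) gives 39743127552, matching the movabs operands above.
The two pieces of the second value are 9 << 32 = 38654705664 and
float_bits(7.0f) = 1088421888.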

If "caller" calls test() twice, the code is considerably longer, more than
twice the length of the single-call version (stack alignment/push/pops omitted
for brevity):

        movabs  rbp, 21474836483
        mov     rdi, rbp
        movabs  rbx, 38654705664
        mov     rsi, rbx
        or      rbx, 1088421888
        or      rsi, 1088421888
        call    test
        mov     rsi, rbx
        mov     rdi, rbp
        call    test

If we change caller() such that the parameters in the two calls are not
identical:

    void caller()
    {
        test( (struct guu){.a = 3, .b = 5, .c = 7, .d = 9} );
        test( (struct guu){.a = 3, .b = 6, .c = 7, .d = 10} );
    }

The generated code is then optimal again, as expected:

        movabs  rdi, 21474836483
        movabs  rsi, 39743127552
        call    test
        movabs  rdi, 25769803779
        movabs  rsi, 44038094848
        call    test

In the two-call example with identical arguments, the problem is that the
compiler sees the same argument values being used twice and tries to keep them
in callee-saved registers in order to reuse them for the second call. However,
re-initializing the registers from scratch would have been more efficient.
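
As a rough illustration of the difference, here is a back-of-the-envelope
code-size estimate. It is only a sketch: the byte counts are assumptions based
on the usual x86-64 encodings (10 bytes for movabs r64, imm64; 3 for a
register-to-register mov; 7 for or r64, imm32; 5 for call rel32; 1 for each
push/pop of rbx or rbp), and it ignores stack alignment and the final ret,
which should be the same in both variants. The function names are
hypothetical:

```c
#include <assert.h>

/* Assumed instruction sizes in bytes (standard x86-64 encodings). */
enum {
    MOVABS_R64_IMM64 = 10,
    MOV_R64_R64      = 3,
    OR_R64_IMM32     = 7,
    CALL_REL32       = 5,
    PUSH_OR_POP_R64  = 1
};

/* Two-call listing as emitted: values kept in rbx/rbp across the calls,
   which also requires saving and restoring those two registers. */
static int bytes_shared(void)
{
    return 2 * MOVABS_R64_IMM64 + 4 * MOV_R64_R64 + 2 * OR_R64_IMM32
         + 2 * CALL_REL32 + 4 * PUSH_OR_POP_R64;
}

/* Alternative: reload both immediates before each call. */
static int bytes_rematerialized(void)
{
    return 4 * MOVABS_R64_IMM64 + 2 * CALL_REL32;
}

/* Sanity check of the comparison under the assumed sizes. */
static void check_rematerialization_is_smaller(void)
{
    assert(bytes_rematerialized() < bytes_shared());
}
```

With these assumed sizes the emitted sequence comes to 60 bytes and the
reload-from-scratch sequence to 50, in addition to needing fewer instructions
and no callee-saved registers.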

The problem occurs in GCC versions 4.8.1 and newer. It does not occur in GCC
version 4.7.4, which generates different code that is inefficient in other
ways.

For reference, the problem also exists in Clang versions 3.5 and newer, but not
in versions 3.4 and earlier.
