[Bug rtl-optimization/88428] New: Fails to consider lea -1(%rax), %rax compared to sub 1, %rax failing to CSE test

rguenth at gcc dot gnu.org Mon, 10 Dec 2018 04:10:13 -0800

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88428


            Bug ID: 88428
           Summary: Fails to consider lea -1(%rax), %rax compared to sub
                    1, %rax failing to CSE test
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

The following GIMPLE testcase results in the non-optimal x86-64 assembly
shown after it:

long mask;
void bar ();
__GIMPLE () void foo (int a, int b)
{
  long _3;
  _3 = a_1(D) < b_2(D) ? _Literal (long) -1l : 0l;
  mask = _3;
  if (a_1(D) < b_2(D))
    goto bb1;
  else
    goto bb2;

bb1:
    bar ();

bb2:
  return;
}

foo:
.LFB0:
        .cfi_startproc
        xorl    %eax, %eax
        cmpl    %esi, %edi
        setge   %al
        subq    $1, %rax
        movq    %rax, mask(%rip)
        cmpl    %esi, %edi
        jl      .L5
...

Here subq clobbers the flags, so the cmpl has to be repeated.  I believe
we could instead use lea, which leaves the flags untouched and has the
same size,

        leaq    -0x1(%rax), %rax

and then elide the redundant cmpl.  For my purposes the store to mask is
unnecessary; it is there only to simplify the testcase.  A GIMPLE testcase
was necessary to get the COND_EXPR and the non-jumpy code through
optimization.
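
As a rough illustration of what the setge/sub sequence computes, here is a
minimal C sketch.  The function names are made up for this example and are
not part of the testcase; the comments map each step to the instructions
quoted above.  It only shows that the sub and lea forms produce the same
value; it does not reproduce the code generation issue itself, since plain
C will likely not reach RTL in the same shape.

```c
#include <assert.h>

/* Reference form: the COND_EXPR from the GIMPLE testcase. */
static long mask_select(int a, int b)
{
    return a < b ? -1L : 0L;
}

/* Branchless form mirroring the generated code (hypothetical helper):
   setge yields 0 or 1, and subtracting 1 maps that to -1 or 0. */
static long mask_setge_minus_one(int a, int b)
{
    long ge = (a >= b);   /* xorl %eax, %eax; cmpl %esi, %edi; setge %al */
    return ge - 1;        /* subq $1, %rax  or  leaq -1(%rax), %rax */
}
```

Both functions agree for every pair of inputs; the only difference between
the two instruction choices is whether the flags set by the first cmpl
survive until the jl.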

I'm not sure at which point during RTL optimization we commit to using a
CC-clobbering sub vs. a non-CC-clobbering lea, but maybe cmpelim could
replace one with the other here?
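
To make the suggested transformation concrete, below is a toy sketch in C.
It is not GCC's cmpelim pass and does not use RTL; the instruction model,
the register numbering and the function elide_redundant_cmp are all
assumptions made up for this illustration.  Within one straight-line
block it looks for a second, identical compare, rewrites any intervening
sub-immediate into the flag-preserving lea form, and drops the repeated
compare.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy instruction kinds; this is NOT GCC's RTL representation. */
enum op { OP_CMP, OP_SETCC, OP_SUB_IMM, OP_LEA_IMM, OP_STORE, OP_JCC, OP_NOP };

/* OP_CMP:     a, b are the compared registers.
   OP_SUB_IMM: a is the immediate, b the register updated in place.
   OP_LEA_IMM: a is the displacement, b the register updated in place.
   OP_SETCC:   b is the destination register.
   OP_STORE:   b is the stored register (no register is written). */
struct insn { enum op op; int a, b; };

static bool writes_reg(const struct insn *x)
{
    return x->op == OP_SETCC || x->op == OP_SUB_IMM || x->op == OP_LEA_IMM;
}

/* Returns the number of compares removed.  In this model the only flag
   clobberers are OP_CMP and OP_SUB_IMM, so every clobber between two
   identical compares can be turned into an lea. */
static size_t elide_redundant_cmp(struct insn *code, size_t n)
{
    size_t removed = 0;
    for (size_t i = 0; i < n; i++) {
        if (code[i].op != OP_CMP)
            continue;
        for (size_t j = i + 1; j < n; j++) {
            struct insn *x = &code[j];
            if (x->op == OP_JCC)
                break;                      /* end of the toy basic block */
            if (x->op == OP_CMP) {
                if (x->a != code[i].a || x->b != code[i].b)
                    break;                  /* different compare: give up */
                for (size_t k = i + 1; k < j; k++)
                    if (code[k].op == OP_SUB_IMM) {
                        code[k].op = OP_LEA_IMM;   /* sub $1 -> lea -1(reg) */
                        code[k].a = -code[k].a;
                    }
                x->op = OP_NOP;             /* flags from code[i] still valid */
                removed++;
                continue;
            }
            if (writes_reg(x) && (x->b == code[i].a || x->b == code[i].b))
                break;                      /* a compared register changed */
        }
    }
    return removed;
}
```

Applied to a model of the sequence above (cmp, setcc, sub, store, cmp,
jcc), the sketch turns the sub into an lea with displacement -1 and
replaces the second cmp with a nop.  A real implementation would also have
to weigh instruction cost and any other flag users, which this toy model
ignores.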
