https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88428
            Bug ID: 88428
           Summary: Fails to consider lea -1(%rax), %rax compared to sub 1, %rax failing to CSE test
           Product: gcc
           Version: 9.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

The following GIMPLE test shows non-optimal assembly

long mask;
void bar ();

__GIMPLE () void foo (int a, int b)
{
  long _3;

  _3 = a_1(D) < b_2(D) ? _Literal (long) -1l : 0l;
  mask = _3;
  if (a_1(D) < b_2(D))
    goto bb1;
  else
    goto bb2;

bb1:
  bar ();

bb2:
  return;
}

foo:
.LFB0:
        .cfi_startproc
        xorl    %eax, %eax
        cmpl    %esi, %edi
        setge   %al
        subq    $1, %rax
        movq    %rax, mask(%rip)
        cmpl    %esi, %edi
        jl      .L5
...

here subq clobbers flags and thus the cmpl has to be repeated.  I believe
we could use lea which also has the same size

        leaq    -0x1(%rax), %rax

here instead and elide the redundant cmpl.

For my purpose the store to mask is unnecessary, it was placed to simplify
the testcase.  A GIMPLE testcase was necessary to get the COND_EXPR and
non-jumpy code through optimization.

I'm not sure at which point during RTL we commit to using a CC clobbering
sub vs. a non-CC clobbering lea, but maybe cmpelim could replace one with
the other here?
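
For readers following the arithmetic: the xorl/setge/subq sequence computes the mask as (a >= b) - 1, which matches the COND_EXPR a < b ? -1 : 0. Below is a minimal C sketch of that equivalence. The function names (mask_select, mask_setge_minus_one, check_masks) are made up for illustration and do not come from the testcase or from GCC. Plain C like this is not expected to reproduce the reported code generation, since the report notes a GIMPLE testcase was needed for that; the sketch only checks the value computed.

```c
#include <assert.h>
#include <limits.h>

/* Reference form: the COND_EXPR from the GIMPLE testcase. */
long mask_select(int a, int b)
{
    return a < b ? -1L : 0L;
}

/* Shape of the emitted sequence: setge produces 0 or 1 in a zeroed
   register, then 1 is subtracted (subq $1, or equivalently leaq -1). */
long mask_setge_minus_one(int a, int b)
{
    long r = (a >= b);  /* xorl %eax,%eax ; cmpl ; setge %al */
    return r - 1;       /* 1 - 1 = 0 when a >= b, 0 - 1 = -1 when a < b */
}

/* Hypothetical self-check over a few boundary values. */
void check_masks(void)
{
    static const int samples[] = { INT_MIN, -1, 0, 1, INT_MAX };
    const int n = (int)(sizeof samples / sizeof samples[0]);

    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            assert(mask_select(samples[i], samples[j])
                   == mask_setge_minus_one(samples[i], samples[j]));
}
```

Both sub and lea produce the same value in %rax here. The difference the report cares about is only that lea leaves the flags from the first cmpl intact.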