https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109483
Bug ID: 109483
Summary: Unoptimal jump threading with assembler flag output
Product: gcc
Version: 13.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ubizjak at gmail dot com
Target Milestone: ---
The following testcase (the int3 mnemonic serves only as a marker):
--cut here--
_Bool foo (int cnt)
{
  if (cnt == -1)
    {
      _Bool success;
      asm volatile("int3" : "=@ccz" (success));
      if (!success)
        return 0;
    }
  asm volatile ("" ::: "memory");
  return 1;
}
--cut here--
compiles with -O2 on x86_64 to:
0000000000000000 <foo>:
   0:   83 ff ff                cmp    $0xffffffff,%edi
   3:   74 0b                   je     10 <foo+0x10>
   5:   b8 01 00 00 00          mov    $0x1,%eax
   a:   c3                      retq
   b:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
  10:   cc                      int3
  11:   0f 94 c0                sete   %al
  14:   74 ef                   je     5 <foo+0x5>
  16:   c3                      retq
Please note that %al is set (sete) before the conditional jump. The
instruction could be moved after the jump, where the register could instead be
cleared using "xor %eax, %eax", similar to what clang generates:
0000000000000000 <foo>:
   0:   83 ff ff                cmp    $0xffffffff,%edi
   3:   75 06                   jne    b <foo+0xb>
   5:   cc                      int3
   6:   74 03                   je     b <foo+0xb>
   8:   31 c0                   xor    %eax,%eax
   a:   c3                      retq
   b:   b0 01                   mov    $0x1,%al
   d:   c3                      retq
Also note that for ZF=1 gcc sets %al to 1, then jumps to offset 5, where the
register is set to 1 again. The clang code has no such redundant store.
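
For illustration only, the two instruction sequences can be modelled in
portable C. This is a rough sketch, not compiler output: the parameter zf is a
hypothetical stand-in for the ZF value left by the asm statement, and the
function names shape_gcc, shape_clang and check_shapes_agree are invented here.
--cut here--
```c
#include <assert.h>
#include <stdbool.h>

/* Model of the gcc sequence: %al is materialized from ZF before the
   branch, and the ZF=1 path then stores 1 a second time at offset 5.  */
static bool shape_gcc (int cnt, bool zf)
{
  if (cnt != -1)
    return true;          /* offset 5: mov $0x1,%eax; retq */
  bool al = zf;           /* offset 11: sete %al */
  if (zf)
    {
      al = true;          /* je 5: redundant second store of 1 */
      return al;
    }
  return al;              /* offset 16: retq with %al == 0 */
}

/* Model of the clang sequence: branch directly on ZF and materialize
   the constant only on the path that needs it.  */
static bool shape_clang (int cnt, bool zf)
{
  if (cnt != -1)
    return true;          /* offset b: mov $0x1,%al; retq */
  if (zf)
    return true;          /* je b */
  return false;           /* offset 8: xor %eax,%eax; retq */
}

/* Both shapes must return the same value for every input; only the
   number of executed stores differs.  */
static void check_shapes_agree (void)
{
  const int cnts[] = { -1, 0, 7 };
  for (int i = 0; i < 3; i++)
    for (int zf = 0; zf <= 1; zf++)
      assert (shape_gcc (cnts[i], zf != 0) == shape_clang (cnts[i], zf != 0));
}
```
--cut here--
The model only shows that the two shapes are functionally equivalent; the
difference reported here is the extra sete and the repeated store on the ZF=1
path.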