https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89877
            Bug ID: 89877
           Summary: [ARC] miscompilation due to missing cc clobber in
                    longlong.h: add_ssaaaa()/sub_ddmmss()
           Product: gcc
           Version: 8.3.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vgupta at synopsys dot com
  Target Milestone: ---

Created attachment 46051
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46051&action=edit
test case, build with -O2 to show issue

A glibc build with -mcpu=hs4x showed wrong printed values for the test case
below (the problem was first seen in the multibench test harness, which
printed wrong values).

    #include <stdio.h>
    #include <stddef.h>

    int main(int argc, char *argv[])
    {
        size_t total_time = 115424;
        double secs = (double)total_time/(double)1000;
        printf("%s %zu %lf\n", "secs", total_time, secs);  // prints 113.504 (expected 115.424)
        printf("%zu\n", (size_t)secs);
        return 0;
    }

The code path leads to glibc stdlib/divrem.c: __mpn_divrem(), which in turn
uses the target-defined inline asm macros in stdlib/longlong.h (which in turn
is synced from gcc include/longlong.h).

These inline asm macros clobber the CPU flags but do not list "cc" in their
clobber list. Since gcc is not told that the asm modifies the flags, it may
schedule a flag-setting CMP instruction (or ADD.f) before the clobbering
ADD.f/SUB.f instructions, so a subsequent conditional branch uses a stale
flag.

__mpn_divrem:
...
.L135:
...
        st      -1,[r0]
        cmp_s   r10,-1          <-- intended flag
        sub     r0,r0,4
        sub     r4,r2,r9
        add.f   r2, r18, r9     <-- clobbered
        adc     r3, r4, 0
        beq_s   @.L72           <-- stale flag used

-mcpu=hs4x + cc clobber fix
---------------------------
        st      -1,[r0]
        sub     r4,r2,r9
        sub     r0,r0,4
        add.f   r2, r18, r9
        adc     r3, r4, 0
        cmp_s   r10,-1          <-- intended flag
        beq_s   @.L72           <-- right flag used

The issue doesn't happen with the default -mcpu=hs38, as the instruction
scheduling already delays the CMP for some reason.

-mcpu=hs38
----------
        st      -1,[r0]
        sub     r4,r2,r9
        sub     r0,r0,4
        add.f   r2, r18, r9
        adc     r3, r4, 0
        cmp_s   r10,-1
        beq_s   @.L72
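
For illustration, the fix described above amounts to adding "cc" to the clobber list of the two asm statements. The following is only a minimal sketch, not the actual longlong.h text: the helper names add_ssaaaa_sketch() and sub_ddmmss_sketch() are hypothetical, plain "r" constraints are assumed instead of the ARC-specific ones used by the real macros, and a portable C fallback is included so the sketch also builds on non-ARC hosts.

```c
#include <stdint.h>

/* Hypothetical helper: (*sh:*sl) = (ah:al) + (bh:bl), 64-bit add in 32-bit halves. */
static inline void add_ssaaaa_sketch(uint32_t *sh, uint32_t *sl,
                                     uint32_t ah, uint32_t al,
                                     uint32_t bh, uint32_t bl)
{
#if defined(__arc__)
    uint32_t hi, lo;
    /* add.f sets the carry flag and adc consumes it, so the asm modifies
       the flags. The "cc" clobber tells gcc not to keep a live flag
       result (e.g. from an earlier CMP) across this statement. */
    __asm__ ("add.f %1, %4, %5\n\t"
             "adc   %0, %2, %3"
             : "=r" (hi), "=&r" (lo)   /* lo is written before ah/bh are read */
             : "r" (ah), "r" (bh), "r" (al), "r" (bl)
             : "cc");
    *sh = hi;
    *sl = lo;
#else
    /* Portable fallback: carry out of the low word is (lo < al). */
    uint32_t lo = al + bl;
    *sl = lo;
    *sh = ah + bh + (lo < al);
#endif
}

/* Hypothetical helper: (*sh:*sl) = (ah:al) - (bh:bl). */
static inline void sub_ddmmss_sketch(uint32_t *sh, uint32_t *sl,
                                     uint32_t ah, uint32_t al,
                                     uint32_t bh, uint32_t bl)
{
#if defined(__arc__)
    uint32_t hi, lo;
    /* sub.f sets the flags and sbc consumes the borrow: same "cc" clobber. */
    __asm__ ("sub.f %1, %4, %5\n\t"
             "sbc   %0, %2, %3"
             : "=r" (hi), "=&r" (lo)
             : "r" (ah), "r" (bh), "r" (al), "r" (bl)
             : "cc");
    *sh = hi;
    *sl = lo;
#else
    /* Portable fallback: borrow from the high word is (al < bl). */
    *sl = al - bl;
    *sh = ah - bh - (al < bl);
#endif
}
```

Without the "cc" clobber, gcc treats the asm as leaving the flags untouched, which is what allows the CMP to be hoisted above the ADD.f in the first listing.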