https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100152
Iain Sandoe <iains at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|x86_64-apple-darwin19.6.0   |x86_64-apple-darwin*
            Summary|Possible 10.3 bad code      |[10.3, 11, 12 Regression]
                   |generation regression from  |[Darwin, X86] used
                   |10.2/9.3 on Mac OS 10.15.7  |caller-saved register not
                   |(Catalina)                  |preserved across a call.
      Known to fail|                            |10.3.0
   Target Milestone|---                         |10.4
--- Comment #24 from Iain Sandoe <iains at gcc dot gnu.org> ---
O1:
movl $0, %ebx
L756:
movl 0(%rbp,%rbx,4), %esi
movq %r14, %rdi
call ____UTF_8_put
movq %rbx, %rax
addq $1, %rbx
cmpq %rax, %r13
jne L756
This works because %rbx is callee-saved, so the loop index survives the call.
----
O2:
xorl %r10d, %r10d
.p2align 4,,10
.p2align 3
L938:
movl 0(%rbp,%r10,4), %esi
call ____UTF_8_put
movq %r10, %rax
addq $1, %r10
cmpq %rax, %r12
jne L938
This fails because %r10 is not callee-saved: nothing preserves it across the
call, and it is clobbered by the lazy symbol resolver.
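For reference, the loop in both listings has roughly the shape sketched below in C. This is only an illustration of an index that stays live across a call; it is not the project's actual source, and `put_char_stub`, `write_chars`, `put_count` and `put_sum` are hypothetical names. Because the stub here is defined in the same file, it will not go through a lazily bound symbol stub and is not expected to reproduce the failure.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping so the effect of the loop can be observed. */
static size_t put_count = 0;
static long put_sum = 0;

/* Hypothetical stand-in for ____UTF_8_put. In the real project the callee
   lives elsewhere and is reached through a lazily bound stub on Darwin. */
void put_char_stub(void *port, int ch)
{
    (void)port;          /* the port argument is unused in this sketch */
    put_count++;
    put_sum += ch;
}

/* The index i is read after every call, so its value must survive the call.
   In the -O1 listing it is held in %rbx; in the -O2 listing it is in %r10. */
void write_chars(void *port, const int *buf, size_t n)
{
    assert(buf != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        put_char_stub(port, buf[i]);
}
```

Calling `write_chars(NULL, data, 4)` on a four-element array should invoke the stub once per element, which is the behaviour the -O2 code loses when the index register is overwritten during the call.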
10.2 uses %rbx at -O2, and so does Linux. It is of course hard to be 100% sure
that the same problem cannot occur on other platforms, since there is
relatively little Darwin-specific code in the x86 backend, especially for
x86_64.
I did see a failure [wrong code] with 11.1, and would expect it to be present
in master too. Whether the code crashes depends on which register happens to
be used: for example, %r8 could survive the call (even though it is not saved),
but %r10 will always be clobbered by the lazy symbol resolver.
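As a reminder of which registers a callee is obliged to preserve, here is a small lookup sketch in C based on the System V x86-64 calling convention, which Darwin also follows for x86_64. The function name `is_callee_saved` and the table are illustrative only and are not part of GCC or of the project.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* General-purpose registers a callee must preserve under the System V
   x86-64 ABI. %rsp is included because it must hold the same value on
   return. */
static const char *const callee_saved_regs[] = {
    "rbx", "rbp", "rsp", "r12", "r13", "r14", "r15"
};

/* Hypothetical helper: returns true if reg (written without the '%'
   prefix) is callee-saved. */
bool is_callee_saved(const char *reg)
{
    assert(reg != NULL);
    size_t count = sizeof callee_saved_regs / sizeof callee_saved_regs[0];
    for (size_t i = 0; i < count; i++) {
        if (strcmp(reg, callee_saved_regs[i]) == 0)
            return true;
    }
    return false;
}
```

With this table, `is_callee_saved("rbx")` is true while `is_callee_saved("r8")` and `is_callee_saved("r10")` are false, matching the distinction drawn above: neither %r8 nor %r10 is guaranteed to survive a call, even if one of them happens to in practice.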
A workaround is to build c_intf.o with -O1. Unfortunately, the project's
configuration does not allow selecting the RTS optimisation level; it is fixed
at the highest level found. Adding or modifying a rule for that object will
work in the short term. Locally, I added a --enable-c-opt-rts option to allow
testing; you're welcome to that patch if it's helpful.
The next step is to bisect to find the change that caused this, but obviously
that will not be done before 10.4 / 11.2, so the workaround is probably needed.