https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116719
Bug ID: 116719
Summary: [SH] missed folding of fp factor constant
Product: gcc
Version: 14.1.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: olegendo at gcc dot gnu.org
Target Milestone: ---

The following code, compiled on sh-elf with
-std=c11 -O2 -ml -m4-single -ffast-math -mfsrra -mfsca

void fisincosr(float angle, float *s, float *c)
{
    float __r = angle / 10430.37835f;
    *s = __builtin_sinf(__r);
    *c = __builtin_cosf(__r);
}

void fisincos(float angle, float *s, float *c)
{
    fisincosr(angle * 182.04444443f, s, c);
}

produces:

_fisincosr:
        ftrc    fr5,fpul
        fsca    fpul,dr2
        fmov.s  fr2,@r4
        rts
        fmov.s  fr3,@r5

_fisincos:
        mova    .L4,r0
        fmov.s  @r0+,fr1
        fmul    fr1,fr5
        fmov.s  @r0+,fr1
        fmul    fr1,fr5
        ftrc    fr5,fpul
        fsca    fpul,dr2
        fmov.s  fr2,@r4
        rts
        fmov.s  fr3,@r5
.L4:
        .long   1016003126
        .long   1176697219

In the second function 'fisincos' it fails to combine the factor 182.04444443
with the other factor 10430.37835 which is emitted by the 'sincossf3' pattern.
The two fmul insns remain where a single one would do.

At some point, combine is trying to combine the two sf constants and emit a
multiplication insn:

Trying 8, 10, 9 -> 11:
    8: {r168:SF=1.74532942473888397216796875e-2;use fpscr0:SI;clobber scratch;}
   10: {r174:SF=1.0430378350470453e+4;use fpscr0:SI;clobber scratch;}
    9: {r162:SF=r181:SF*r168:SF;clobber fpscr1:SI;use fpscr0:SI;}
      REG_DEAD r181:SF
      REG_UNUSED fpscr1:SI
      REG_DEAD r168:SF
   11: {r171:SF=r162:SF*r174:SF;clobber fpscr1:SI;use fpscr0:SI;}
      REG_DEAD r174:SF
      REG_DEAD r162:SF
      REG_UNUSED fpscr1:SI
      REG_EQUAL r162:SF*1.0430378350470453e+4
Failed to match this instruction:
(parallel [
        (set (reg:SF 171)
            (mult:SF (reg:SF 181)
                (const_double:SF 1.820444488525390625e+2 [0x0.b60b61p+8])))
        (clobber (reg:SI 155 fpscr1))
        (use (reg:SI 154 fpscr0))
    ])

... but it fails because SH doesn't have any fp insns that can have constants
as operands, only registers. It would require splitting the insn and emitting
a constant load, just like the expanders do. But that's not done during
combine.
Such patterns could be added, but then it'd be good to hoist the constant
loads out of loops. See also PR 54089 and its attachment
https://gcc.gnu.org/bugzilla/attachment.cgi?id=55543, which addresses the
problem of constants being formed later during RTL optimization.

Another option could be to extend the combine pass to do this automatically,
but that would probably have bigger repercussions on other targets as well.
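
For reference, the arithmetic behind the missed fold can be sketched in plain
C. This is only an illustration, not code from GCC: the helper names
(user_factor, fsca_factor, scale_two_step, scale_one_step, check_fold) are
made up here, and the value used for the 'sincossf3' factor is simply the
constant shown in insn 10 of the combine dump. The two-step form corresponds
to the two fmul insns currently emitted for fisincos, the one-step form to
the single multiplication by 1.8204445e+2 that combine builds but cannot
match. Reassociating the multiplications like this is only permitted because
of -ffast-math.

```c
#include <assert.h>
#include <math.h>

/* Factor left over from the user code after inlining:
   angle * 182.04444443f / 10430.37835f (r168 in the dump). */
static float user_factor(void)
{
    return 182.04444443f / 10430.37835f;
}

/* Factor assumed to come from the sincossf3 expansion (r174 in the dump). */
static float fsca_factor(void)
{
    return 10430.378350470453f;
}

/* Two multiplications, as in the code currently generated for fisincos. */
static float scale_two_step(float angle)
{
    return (angle * user_factor()) * fsca_factor();
}

/* One multiplication by the pre-folded constant, as in the insn that
   combine constructs but fails to match. */
static float scale_one_step(float angle)
{
    return angle * (user_factor() * fsca_factor());
}

/* Sanity check: both forms agree up to rounding differences. */
static void check_fold(float angle)
{
    float a = scale_two_step(angle);
    float b = scale_one_step(angle);
    assert(fabsf(a - b) <= 1e-5f * fabsf(a) + 1e-5f);
}
```

With these helpers, user_factor() * fsca_factor() comes out at roughly
182.0444, which matches the const_double in the rejected insn.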