https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116719

            Bug ID: 116719
           Summary: [SH] missed folding of fp factor constant
           Product: gcc
           Version: 14.1.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: olegendo at gcc dot gnu.org
  Target Milestone: ---

The following code, compiled on sh-elf with -std=c11 -O2 -ml -m4-single
-ffast-math -mfsrra -mfsca:

void fisincosr(float angle, float *s, float *c) {
    float __r = angle / 10430.37835f;

    *s = __builtin_sinf(__r);
    *c = __builtin_cosf(__r);
}

void fisincos(float angle, float *s, float *c) {
    fisincosr(angle * 182.04444443f, s, c);
}

produces:

_fisincosr:
        ftrc    fr5,fpul
        fsca    fpul,dr2
        fmov.s  fr2,@r4
        rts     
        fmov.s  fr3,@r5
_fisincos:
        mova    .L4,r0
        fmov.s  @r0+,fr1
        fmul    fr1,fr5
        fmov.s  @r0+,fr1
        fmul    fr1,fr5
        ftrc    fr5,fpul
        fsca    fpul,dr2
        fmov.s  fr2,@r4
        rts     
        fmov.s  fr3,@r5
.L4:
        .long   1016003126
        .long   1176697219


In the second function 'fisincos', GCC fails to combine the factor 182.04444443
with the other factor 10430.37835, which is emitted by the 'sincossf3' pattern,
so two fmul insns remain where one would suffice.
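
As a quick sanity check of the arithmetic, here is a minimal C sketch. The
two constants are copied from insns 8 and 10 of the combine dump further
down; the helper names are made up for illustration. It assumes that
10430.37835 is 65536 / (2*pi) and 182.04444443 is 65536 / 360, i.e. that
fsca takes its angle in units of 1/65536 of a full turn.

```c
#include <math.h>

/* Constants copied from the combine dump (insn 8 and insn 10). */
static const float K_SCALE = 1.74532942473888397216796875e-2f;
static const float K_FSCA  = 1.0430378350470453e+4f;

/* Hypothetical helper: the single factor that folding both
   multiplications would leave behind. */
static float folded_factor(void)
{
    return K_SCALE * K_FSCA;
}

/* Hypothetical helper: 65536 / 360, i.e. fsca angle units per degree,
   under the assumption stated above. */
static double units_per_degree(void)
{
    return 65536.0 / 360.0;
}
```

Both helpers should land very close to 182.04444, which is the constant
combine forms in its failed attempt.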

At some point, the combine pass tries to fold the two SF constants and emit a
single multiplication insn:

Trying 8, 10, 9 -> 11:
    8: {r168:SF=1.74532942473888397216796875e-2;use fpscr0:SI;clobber scratch;}
   10: {r174:SF=1.0430378350470453e+4;use fpscr0:SI;clobber scratch;}
    9: {r162:SF=r181:SF*r168:SF;clobber fpscr1:SI;use fpscr0:SI;}
      REG_DEAD r181:SF
      REG_UNUSED fpscr1:SI
      REG_DEAD r168:SF
   11: {r171:SF=r162:SF*r174:SF;clobber fpscr1:SI;use fpscr0:SI;}
      REG_DEAD r174:SF
      REG_DEAD r162:SF
      REG_UNUSED fpscr1:SI
      REG_EQUAL r162:SF*1.0430378350470453e+4
Failed to match this instruction:
(parallel [
        (set (reg:SF 171)
            (mult:SF (reg:SF 181)
                (const_double:SF 1.820444488525390625e+2 [0x0.b60b61p+8])))
        (clobber (reg:SI 155 fpscr1))
        (use (reg:SI 154 fpscr0))
    ])


... but it fails because SH doesn't have any fp insns that can take constants
as operands, only registers.  It would require splitting the insn and emitting
a constant load, just like the expanders do.  But that's not done during
combine.

Such patterns could be added, but then it'd be good to hoist the constant loads
out of loops.  See also PR 54089 and its attachment
https://gcc.gnu.org/bugzilla/attachment.cgi?id=55543, which addresses the
problem of constants being formed later during RTL optimization.

Another option could be to extend the combine pass to do this automatically,
but that would probably have wider repercussions on other targets as well.
