https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79593
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |uros at gcc dot gnu.org
--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
That said, the reason why there is fld1 followed by fld %st(0) is that 1.0 is
used multiple times:
(insn 41 64 42 8 (set (reg:SF 114)
(mem/u/c:SF (symbol_ref/u:SI ("*.LC1") [flags 0x2]) [4 S4 A32]))
"pr79593.c":17 125 {*movsf_internal}
(expr_list:REG_EQUAL (const_double:SF 1.0e+0 [0x0.8p+1])
(nil)))
(insn 42 41 43 8 (set (reg:XF 118 [ delta ])
(float_extend:XF (reg:SF 114))) "pr79593.c":17 153 {*extendsfxf2_i387}
(expr_list:REG_EQUAL (const_double:XF 1.0e+0 [0x0.8p+1])
(nil)))
...
(insn 69 65 47 9 (set (reg:XF 110 [ delta ])
(float_extend:XF (reg:SF 114))) "pr79593.c":17 153 {*extendsfxf2_i387}
(expr_list:REG_DEAD (reg:SF 114)
(expr_list:REG_EQUAL (const_double:XF 1.0e+0 [0x0.8p+1])
(nil))))
in multiple basic blocks with a conditional jump in between, so the combiner
doesn't combine it into (set (reg:XF ...) (const_double:XF 1.0e+0)).
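To illustrate the restriction just described, here is a deliberately simplified sketch in C. It is not GCC's combine pass: the `struct use` record and the `foldable_into_single_use` helper are hypothetical names invented for this example, and the rule it encodes (fold a definition into its user only when there is exactly one use and it sits in the defining basic block) is a toy approximation of the real logic.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified record of one use of a pseudo register.
   Not a GCC data structure.  */
struct use
{
  int insn_uid; /* insn that reads the register */
  int block;    /* basic block containing that insn */
};

/* Toy rule (an assumption for illustration, not combine's real logic):
   a definition can be folded into its user and then deleted only if
   there is exactly one use and that use is in the defining block.  */
bool
foldable_into_single_use (int def_block, const struct use *uses,
                          size_t n_uses)
{
  if (n_uses != 1)
    return false;
  return uses[0].block == def_block;
}
```

In the dump above, (reg:SF 114) is set by insn 41 in block 8 and read by insn 42 in block 8 and insn 69 in block 9, so this toy rule rejects the fold.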
Still in *.peephole2 we have:
(insn 82 64 42 8 (set (reg:SF 10 st(2) [114])
(const_double:SF 1.0e+0 [0x0.8p+1])) "pr79593.c":17 125
{*movsf_internal}
(expr_list:REG_EQUAL (const_double:SF 1.0e+0 [0x0.8p+1])
(nil)))
(insn 42 82 83 8 (set (reg:XF 9 st(1) [orig:118 delta ] [118])
(float_extend:XF (reg:SF 10 st(2) [114]))) "pr79593.c":17 153
{*extendsfxf2_i387}
(expr_list:REG_EQUIV (const_double:XF 1.0e+0 [0x0.8p+1])
(nil)))
...
(insn 69 65 47 9 (set (reg:XF 8 st [orig:110 delta ] [110])
(float_extend:XF (reg:SF 10 st(2) [114]))) "pr79593.c":17 153
{*extendsfxf2_i387}
(expr_list:REG_DEAD (reg:SF 10 st(2) [114])
(expr_list:REG_EQUAL (const_double:XF 1.0e+0 [0x0.8p+1])
(nil))))
Only the regstack pass optimizes those two into one, but it isn't able to
peephole or otherwise combine the result:
(insn:TI 82 64 42 7 (set (reg:SF 8 st)
(const_double:SF 1.0e+0 [0x0.8p+1])) "pr79593.c":17 125
{*movsf_internal}
(expr_list:REG_EQUAL (const_double:SF 1.0e+0 [0x0.8p+1])
(nil)))
(insn:TI 42 82 83 7 (set (reg:XF 8 st)
(float_extend:XF (reg:SF 8 st))) "pr79593.c":17 153 {*extendsfxf2_i387}
(expr_list:REG_EQUIV (const_double:XF 1.0e+0 [0x0.8p+1])
(nil)))
and there is no peephole2 pass afterwards, so either regstack itself would need
to do this, or the machine reorg pass.
I still have no idea why this is considered a regression; with gcc 5.4.1
20160721 I get:
subl $12, %esp
fldz
movl 16(%esp), %edx
movl 20(%esp), %eax
cmpl %eax, (%edx)
jbe .L2
flds global_data
flds global_data+4
fxch %st(2)
fcomp %st(1)
fnstsw %ax
sahf
ja .L13
fxch %st(1)
fsubrs 4(%edx)
.L5:
fdivp %st, %st(1)
ftst
fnstsw %ax
sahf
jnb .L6
fstp %st(0)
fldz
.L6:
fld1
fld %st(0)
fcomp %st(2)
fnstsw %ax
sahf
jnb .L14
fstp %st(1)
jmp .L7
.p2align 4,,10
.p2align 3
.L14:
fstp %st(0)
.L7:
.L2:
addl $12, %esp
ret