https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80706
--- Comment #10 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Uroš Bizjak from comment #9)
> (In reply to Jakub Jelinek from comment #8)
> > The #c5 patch obviously doesn't help here, because the testcase triggers the
> > last of these 4 peephole2s. But #c7 works.
>
> Thanks! It looks like we'll have to live with extra stores then...
Can't we improve it in the combiner?
For the PR71245 testcase, we have:
(insn 5 2 6 2 (parallel [
            (set (reg:DI 89 [ _4 ])
                (unspec:DI [
                        (mem/v:DI (symbol_ref:SI ("d") [flags 0x2] <var_decl 0x7fcf8ee5c510 d>) [-1 S8 A64])
                    ] UNSPEC_LDA))
            (clobber (mem/c:DI (plus:SI (reg/f:SI 20 frame)
                        (const_int -8 [0xfffffffffffffff8])) [0 S8 A64]))
            (clobber (scratch:DF))
        ]) "/usr/include/c++/6.3.1/atomic":235 4970 {atomic_loaddi_fpu}
     (nil))
...
(insn 8 7 9 2 (set (reg:DF 91)
        (plus:DF (subreg:DF (reg:DI 89 [ _4 ]) 0)
            (reg:DF 92))) "pr71245.C":5 805 {*fop_df_comm}
     (expr_list:REG_DEAD (reg:DF 92)
        (expr_list:REG_DEAD (reg:DI 89 [ _4 ])
            (nil))))
and apparently the combiner attempts to match:
(set (reg:DF 92)
    (subreg:DF (unspec:DI [
                (mem/v:DI (symbol_ref:SI ("d") [flags 0x2] <var_decl 0x7fcf8ee5c510 d>) [-1 S8 A64])
            ] UNSPEC_LDA) 0))
Perhaps if we had such a pattern, which we'd then split into a normal DFmode
load (perhaps with an unspec before reload to guarantee it is an atomic load),
we wouldn't need the temporary at all?
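
For reference, source of roughly the following shape is what I'd expect to
produce an atomic DImode load whose result is used directly by a DFmode add
when compiled for 32-bit x86 (e.g. with -m32 -O2). This is only a sketch: it
is an assumption about what such a testcase looks like, not a copy of
pr71245.C, and the function name load_and_add is made up for illustration.

```cpp
#include <atomic>

// Global atomic double.  On ia32 the load has to be done as a single
// 64-bit access to be atomic.
std::atomic<double> d{5.0};

// Hypothetical helper (the name is an assumption, not from the PR):
// atomically load d and add x to it.  The loaded value feeds straight
// into a floating-point add, so ideally no stack temporary is needed
// between the load and the add.
double load_and_add(double x)
{
  return d.load() + x;
}
```

Calling load_and_add(1.5) with the initial value above yields 6.5; the
interesting part is the generated code for the load, not the result.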