https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100182

Uroš Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                      |Added
----------------------------------------------------------------------------
             Status|NEW                          |ASSIGNED
                 CC|uros at gcc dot gnu.org       |
           Assignee|unassigned at gcc dot gnu.org |ubizjak at gmail dot com

--- Comment #7 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Jakub Jelinek from comment #1)
> In this particular case it is the sync.md:398 peephole2:
> (define_peephole2
>   [(set (match_operand:DF 0 "memory_operand")
>         (match_operand:DF 1 "any_fp_register_operand"))
>    (set (mem:BLK (scratch:SI))
>         (unspec:BLK [(mem:BLK (scratch:SI))] UNSPEC_MEMORY_BLOCKAGE))
>    (set (match_operand:DF 2 "fp_register_operand")
>         (unspec:DF [(match_operand:DI 3 "memory_operand")]
>                    UNSPEC_FILD_ATOMIC))
>    (set (match_operand:DI 4 "memory_operand")
>         (unspec:DI [(match_dup 2)]
>                    UNSPEC_FIST_ATOMIC))]
>   "!TARGET_64BIT
>    && peep2_reg_dead_p (4, operands[2])
>    && rtx_equal_p (XEXP (operands[0], 0), XEXP (operands[3], 0))"
>   [(const_int 0)]
> {
>   emit_insn (gen_memory_blockage ());
>   emit_move_insn (gen_lowpart (DFmode, operands[4]), operands[1]);
>   DONE;
> })
> that triggers here but from what I can read, all the r7-1112 peephole2s
> optimize away stores to some memory on the assumption that the memory is
> read only once (in another insn matched by the same peephole2).
> I'm not 100% sure if we can rely for it on spill slots for which r7-112
> seems to have been written, but for other memory we'd need to prove that the
> memory is dead.
> Rather than removing those peephole2s altogether, I wonder if we just
> shouldn't check that the memory_operand which we'd optimize away stores to
> aren't spill slots.

Actually, these peepholes are too eager: they also remove the store to memory
operand 0, on the assumption that the operand is used only within the peephole2
sequence. As shown in the testcase, this is not always true; operand 0 can
also be accessed after the peephole2'd sequence.
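
For illustration only, here is a rough C sketch of the shape of code in question. It is not the testcase from this PR; the names `union slot` and `store_then_reread` are hypothetical, and whether the compiler actually produces the store/fild/fistp sequence for it depends on the target and options (the peephole2s only apply with !TARGET_64BIT). The point is just that the memory written by the first store is read again after the atomic copy, so the store must not be dropped.

```c
#include <assert.h>   /* only needed for the check in the usage note */
#include <stdint.h>

/* Hypothetical type: a double and a 64-bit integer at the same address,
   so a DFmode store and a DImode atomic load refer to the same memory.  */
union slot { double d; uint64_t bits; };

/* Hypothetical helper, not taken from the PR.  Returns 1 when the value
   stored first is still visible after the atomic copy.  */
int
store_then_reread (union slot *s, uint64_t *dst, double x)
{
  s->d = x;                                          /* store to "operand 0" */
  uint64_t v = __atomic_load_n (&s->bits, __ATOMIC_SEQ_CST);   /* load side */
  __atomic_store_n (dst, v, __ATOMIC_SEQ_CST);       /* store to "operand 4" */
  return s->d == x && *dst == s->bits;               /* later read of operand 0 */
}
```

A caller can check the result with something like `assert (store_then_reread (&s, &dst, 1.5));` for any non-NaN value. If the first store were removed, the final read of `s->d` would see stale memory.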

The solution is to not remove the store to operand 0. This will probably leave
some unneeded stores in the code, but IMO this is a small price to pay for
correctness. And we still remove the fild/fistp pair.

I'm testing the following patch:

--cut here--
diff --git a/gcc/config/i386/sync.md b/gcc/config/i386/sync.md
index c7c508c8de8..538d1f89497 100644
--- a/gcc/config/i386/sync.md
+++ b/gcc/config/i386/sync.md
@@ -392,7 +392,8 @@
   "!TARGET_64BIT
    && peep2_reg_dead_p (3, operands[2])
    && rtx_equal_p (XEXP (operands[0], 0), XEXP (operands[3], 0))"
-  [(set (match_dup 5) (match_dup 1))]
+  [(set (match_dup 0) (match_dup 1))
+   (set (match_dup 5) (match_dup 1))]
   "operands[5] = gen_lowpart (DFmode, operands[4]);")
 
 (define_peephole2
@@ -411,6 +412,7 @@
    && rtx_equal_p (XEXP (operands[0], 0), XEXP (operands[3], 0))"
   [(const_int 0)]
 {
+  emit_move_insn (operands[0], operands[1]);
   emit_insn (gen_memory_blockage ());
   emit_move_insn (gen_lowpart (DFmode, operands[4]), operands[1]);
   DONE;
@@ -428,7 +430,8 @@
   "!TARGET_64BIT
    && peep2_reg_dead_p (3, operands[2])
    && rtx_equal_p (XEXP (operands[0], 0), XEXP (operands[3], 0))"
-  [(set (match_dup 5) (match_dup 1))]
+  [(set (match_dup 0) (match_dup 1))
+   (set (match_dup 5) (match_dup 1))]
   "operands[5] = gen_lowpart (DFmode, operands[4]);")
 
 (define_peephole2
@@ -447,6 +450,7 @@
    && rtx_equal_p (XEXP (operands[0], 0), XEXP (operands[3], 0))"
   [(const_int 0)]
 {
+  emit_move_insn (operands[0], operands[1]);
   emit_insn (gen_memory_blockage ());
   emit_move_insn (gen_lowpart (DFmode, operands[4]), operands[1]);
   DONE;
--cut here--