On Tue, Jan 08, 2019 at 08:29:03AM +0100, Uros Bizjak wrote: > Is there a reason stack registers are excluded? Before stackreg pass, > these registers are just like other hard registers.
I was just afraid of those, after stack pass removing a store but not load would result in the stack getting out of sync, and I wasn't sure if the stack pass will tolerate a RA chosen stack reg not being touched anymore. Also, doesn't loading a float/double/long double into a stack reg and storing again canonicalize it in any way? Say NaNs? Or unnormal or pseudo-denormal long doubles etc.? Tried that now with a sNaN and the load/store didn't change it, but haven't tried with exceptions enabled to see if it would raise one. If that is fine, I'll leave that out. Looking around, I was using a wrong macro anyway and am surprised nothing complained, it should have been STACK_REG_P rather than STACK_REGNO_P. > Other that that, there is no need for REG_P predicate; after reload we > don't have subregs and register_operand will match only hard regs. Ok, I wasn't sure because "register_operand" allows (subreg (reg)) even if reload_completed, just disallows subregs of mem after that. > Also, please put peep2_reg_dead_p predicate in the pattern predicate. I don't see how, that would mean I'd have to write two peephole2s instead of one. It tries to deal with two different cases, one is where the temporary reg is dead, in that case we can optimize away both the load or store, the second case is where the temporary reg isn't dead, in that case we can optimize away the store, but not the load. With the optimizing away of both load and store I was just trying to do a cheap DCE there. Looking around more, I actually think I need to replace (match_dup 1) with (match_operand 2 "memory_operand"), add rtx_equal_p (operands[1], operands[2]) and !MEM_VOLATILE_P (operands[2]), because apparently rtx_equal_p doesn't check the MEM_VOLATILE_P bit. > > 2019-01-07 Jakub Jelinek <ja...@redhat.com> > > > > PR rtl-optimization/79593 > > * config/i386/i386.md (reg = mem; mem = reg): New define_peephole2. > > > > --- gcc/config/i386/i386.md.jj 2019-01-01 12:37:31.564738571 +0100 > > +++ gcc/config/i386/i386.md 2019-01-07 17:11:21.056392168 +0100 > > @@ -18740,6 +18740,21 @@ (define_peephole2 > > const0_rtx); > > }) > > > > +;; Attempt to optimize away memory stores of values the memory already > > +;; has. See PR79593. > > +(define_peephole2 > > + [(set (match_operand 0 "register_operand") > > + (match_operand 1 "memory_operand")) > > + (set (match_dup 1) (match_dup 0))] > > + "REG_P (operands[0]) > > + && !STACK_REGNO_P (operands[0]) > > + && !MEM_VOLATILE_P (operands[1])" > > + [(set (match_dup 0) (match_dup 1))] > > +{ > > + if (peep2_reg_dead_p (1, operands[0])) > > + DONE; > > +}) > > + > > ;; Attempt to always use XOR for zeroing registers (including FP modes). > > (define_peephole2 > > [(set (match_operand 0 "general_reg_operand") Jakub