On Wed, Dec 22, 2021 at 11:26 AM Uros Bizjak <ubiz...@gmail.com> wrote:
>
> On Wed, Dec 22, 2021 at 10:26 AM Roger Sayle <ro...@nextmovesoftware.com> wrote:
> >
> >
> > Hi Uros,
> > Would you consider the following variant that disables this optimization
> > when a red zone is used by the current function?  You're right that cfun's
> > red_zone_size is recalculated dynamically, but ix86_red_zone_used should be
> > a better "gate" given that this logic resides very late during compilation,
> > in the output templates, where whether or not a red zone is used is known.
> >
> > On CSiBE, disabling this optimization in non-leaf functions that use a
> > red zone costs 219 bytes, but remains a significant win over -Os.  (Alas
> > the absolute numbers aren't comparable as this testing included the 0/-1
> > write to memory changes).
> >
> > Tested (overnight) on x86_64-pc-linux-gnu with make bootstrap and
> > make -k check with no new failures.
> >
> > 2021-12-22  Roger Sayle  <ro...@nextmovesoftware.com>
> >
> > gcc/ChangeLog
> >         PR target/103773
> >         * config/i386/i386.md (*movdi_internal): Only use short
> >         push/pop sequence for register (non-memory) destinations
> >         when the current function doesn't make use of a red zone.
> >         (*movsi_internal): Likewise.
> >
> > gcc/testsuite/ChangeLog
> >         PR target/103773
> >         * gcc.target/i386/pr103773.c: New test case.
> >
> > Please let me know what you think.  I'll revert, if this tweak doesn't
> > address your concerns.
>
> Yes, using ix86_red_zone_used looks safe.
>
> OTOH, is there a reason the transformation is not implemented via
> peephole2 pass? IIRC, frame is stable after pro_and_epilogue_pass, and
> peephole2 pass is instanced well after register allocation.

Something like the attached patch.

Uros.

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 58b10643fcb..e5d603f0025 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -2514,6 +2514,24 @@
 	   ]
 	   (symbol_ref "true")))])
 
+(define_peephole2
+  [(set (match_operand:SWI48 0 "general_reg_operand")
+        (match_operand:SWI48 1 "const_int_operand"))]
+  "optimize_insn_for_size_p () && optimize_size > 1
+   && IN_RANGE (INTVAL (operands[1]), -128, 127)
+   && !ix86_red_zone_used"
+  [(set (match_dup 2) (match_dup 1))
+   (set (match_dup 0) (match_dup 3))]
+{
+  if (GET_MODE (operands[0]) != word_mode)
+    operands[0] = gen_rtx_REG (word_mode, REGNO (operands[0]));
+
+  operands[2] = gen_rtx_MEM (word_mode,
+                             gen_rtx_PRE_DEC (Pmode, stack_pointer_rtx));
+  operands[3] = gen_rtx_MEM (word_mode,
+                             gen_rtx_POST_INC (Pmode, stack_pointer_rtx));
+})
+
 (define_insn "*movhi_internal"
   [(set (match_operand:HI 0 "nonimmediate_operand" "=r,r,r,m ,*k,*k ,r ,m ,*k ,?r,?*v,*v,*v,*v,m")
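
For readers following the thread, here is a rough standalone C sketch of what the peephole2 condition gates and why the red zone matters. It is not GCC code: the helper names (use_push_pop, mov_imm32_size, push_pop_size, red_zone_value_after_push_pop) are made up for illustration, the byte counts assume the 32-bit "mov $imm32, %reg" encoding on x86_64, and the stack is a toy array rather than real memory.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not GCC code: mirrors the peephole2 condition
   from the patch for loading an immediate into a general register.  */
static bool
use_push_pop (int64_t imm, int optimize_size_level, bool red_zone_used)
{
  return optimize_size_level > 1
         && imm >= -128 && imm <= 127
         && !red_zone_used;
}

/* Encoded length in bytes of "movl $imm32, %r32" (opcode B8+rd, imm32),
   plus one REX prefix byte for r8d..r15d.  */
static int
mov_imm32_size (bool extended_reg)
{
  return 5 + (extended_reg ? 1 : 0);
}

/* "push $imm8" (6A ib) is 2 bytes and "pop %r64" (58+rd) is 1 byte,
   plus one REX prefix byte for r8..r15.  */
static int
push_pop_size (bool extended_reg)
{
  return 2 + 1 + (extended_reg ? 1 : 0);
}

/* Toy model of the hazard.  A function that uses the red zone keeps
   data below the stack pointer without adjusting it.  A push stores
   to the slot just below the stack pointer, so it overwrites that data.
   Returns the value of the red-zone local after the push/pop pair.  */
static int64_t
red_zone_value_after_push_pop (int64_t local, int64_t imm)
{
  int64_t stack[32] = { 0 };
  int sp = 16;                 /* index of the current stack pointer */

  stack[sp - 1] = local;       /* local lives in the red zone, below sp */

  sp -= 1;                     /* push imm: pre-decrement, then store */
  stack[sp] = imm;

  int64_t reg = stack[sp];     /* pop reg: load, then post-increment */
  sp += 1;
  (void) reg;

  return stack[sp - 1];        /* the red-zone slot now holds imm */
}
```

Under these assumptions the push/pop pair saves 2 bytes per immediate load (3 vs 5, or 4 vs 6 with an extended register), and the toy model shows the red-zone local being overwritten by the pushed immediate, which is the case !ix86_red_zone_used is meant to exclude.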