On Wed, Dec 22, 2021 at 11:26 AM Uros Bizjak <ubiz...@gmail.com> wrote:
>
> On Wed, Dec 22, 2021 at 10:26 AM Roger Sayle <ro...@nextmovesoftware.com> wrote:
> >
> >
> > Hi Uros,
> > Would you consider the following variant that disables this optimization
> > when a red zone is used by the current function?  You're right that cfun's
> > red_zone_size is recalculated dynamically, but ix86_red_zone_used should be
> > a better "gate" given that this logic resides very late during compilation,
> > in the output templates, where whether or not a red zone is used is known.
> >
> > On CSiBE, disabling this optimization in non-leaf functions that use a red
> > zone costs 219 bytes, but remains a significant win over -Os.  (Alas the
> > absolute numbers aren't comparable as this testing included the 0/-1 write
> > to memory changes).
> >
> > Tested (overnight) on x86_64-pc-linux-gnu with make bootstrap and
> > make -k check with no new failures.
> >
> > 2021-12-22  Roger Sayle  <ro...@nextmovesoftware.com>
> >
> > gcc/ChangeLog
> >         PR target/103773
> >         * config/i386/i386.md (*movdi_internal): Only use short
> >         push/pop sequence for register (non-memory) destinations
> >         when the current function doesn't make use of a red zone.
> >         (*movsi_internal): Likewise.
> >
> > gcc/testsuite/ChangeLog
> >         PR target/103773
> >         * gcc.target/i386/pr103773.c: New test case.
> >
> > Please let me know what you think.  I'll revert, if this tweak doesn't
> > address your concerns.
>
> Yes, using ix86_red_zone_used looks safe.
>
> OTOH, is there a reason the transformation is not implemented via a
> peephole2 pass? IIRC, the frame is stable after pro_and_epilogue_pass, and
> the peephole2 pass runs well after register allocation.

Something like the attached patch.

Uros.
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 58b10643fcb..e5d603f0025 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -2514,6 +2514,24 @@
           ]
           (symbol_ref "true")))])
 
+(define_peephole2
+  [(set (match_operand:SWI48 0 "general_reg_operand")
+       (match_operand:SWI48 1 "const_int_operand"))]
+  "optimize_insn_for_size_p () && optimize_size > 1
+   && IN_RANGE (INTVAL (operands[1]), -128, 127)
+   && !ix86_red_zone_used"
+  [(set (match_dup 2) (match_dup 1))
+   (set (match_dup 0) (match_dup 3))]
+{
+  if (GET_MODE (operands[0]) != word_mode)
+    operands[0] = gen_rtx_REG (word_mode, REGNO (operands[0]));
+
+  operands[2] = gen_rtx_MEM (word_mode,
+                            gen_rtx_PRE_DEC (Pmode, stack_pointer_rtx));
+  operands[3] = gen_rtx_MEM (word_mode,
+                            gen_rtx_POST_INC (Pmode, stack_pointer_rtx));
+})
+
 (define_insn "*movhi_internal"
   [(set (match_operand:HI 0 "nonimmediate_operand"
     "=r,r,r,m ,*k,*k ,r ,m ,*k ,?r,?*v,*v,*v,*v,m")

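For readers following the size argument, here is a rough sketch of the byte
counts behind the push/pop sequence. It is not part of the patch: the helper
names (mov_imm_size, push_pop_size, bytes_saved) are hypothetical, the
immediate is assumed to already be in the -128..127 range that the peephole2
condition checks, and alternatives such as xor for zero are ignored. Note
that the push stores just below the stack pointer, which is why the
transformation has to be gated on the red zone not being in use.

```c
#include <stdbool.h>

/* Hypothetical helpers, for illustration only; not taken from GCC.
   All sizes are x86-64 instruction encodings in bytes.  */

/* Size of "mov $imm, %reg" for an immediate in [-128, 127].
   ext_reg is true for %r8..%r15, which need a REX prefix.  */
static int mov_imm_size(bool dest_is_64bit, bool ext_reg, long imm)
{
    /* A negative 64-bit value needs the sign-extending form:
       REX.W C7 /0 id = 7 bytes (the REX byte also covers %r8..%r15).  */
    if (dest_is_64bit && imm < 0)
        return 7;
    /* Otherwise the 32-bit form suffices (it zero-extends to 64 bits):
       [REX] B8+rd id = 5 bytes, or 6 with a REX prefix.  */
    return 5 + (ext_reg ? 1 : 0);
}

/* Size of "push $imm8; pop %reg": 6A ib (2 bytes) + [REX] 58+rd.  */
static int push_pop_size(bool ext_reg)
{
    return 2 + 1 + (ext_reg ? 1 : 0);
}

/* Bytes saved by using push/pop instead of mov.  */
static int bytes_saved(bool dest_is_64bit, bool ext_reg, long imm)
{
    return mov_imm_size(dest_is_64bit, ext_reg, imm) - push_pop_size(ext_reg);
}
```

Under these assumptions, bytes_saved (false, false, 42) gives 2 (movl $42,
%eax is 5 bytes against 3 for pushq $42; popq %rax), and bytes_saved (true,
false, -1) gives 4.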