[Bug middle-end/85090] [8 Regression] wrong code with -O2 -fno-tree-dominator-opts -mavx512f -fira-algorithm=priority

vmakarov at gcc dot gnu.org Fri, 06 Apr 2018 14:33:07 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85090


--- Comment #13 from Vladimir Makarov <vmakarov at gcc dot gnu.org> ---
(In reply to Uroš Bizjak from comment #11)
> (In reply to Jakub Jelinek from comment #5)
> > I guess it depends on what exactly a normal subreg on lhs means.
> > The documentation says:
> >           When used as an lvalue, 'subreg' is a word-based accessor.
> >           Storing to a 'subreg' modifies all the words of REG that
> >           overlap the 'subreg', but it leaves the other words of REG
> >           alone.
> 
> 
> But this wording applies only to multi-word registers. We can't use the
> above wording for a 512-bit single-word register, since we don't know how the
> move will affect the bits outside the subreg. We can say that the move
> "modifies all the words of REG that overlap the 'subreg'", since we have only
> one 512-bit word of a 512-bit register.
> 

OK.

> So, I think that the transformation in comment 10 is invalid for
> registers that can't be decomposed into independent word-sized registers (to
> use the "word-based accessor"), e.g. V32HImode xmm20. Perhaps the mentioned
> alter_subreg should choose the correct approach based on
> TARGET_HARD_REGNO_NREGS?

Actually, I do the same thing the old reload does.  It has practically the
same alter_subreg code.  Maybe the reload and LRA code is not up to date enough
to treat this situation correctly.  I don't know.

  What I can do is generate
(strict_low_part (subreg:DI (reg:V32HI <sse pseudo>)))
to reflect the new semantics.  Something like:

Index: lra.c
===================================================================
--- lra.c       (revision 258691)
+++ lra.c       (working copy)
@@ -487,14 +487,26 @@ int lra_curr_reload_num;
 void
 lra_emit_move (rtx x, rtx y)
 {
-  int old;
-
+  int old, regno;
+  machine_mode mode;
+  rtx reg;
+
   if (GET_CODE (y) != PLUS)
     {
       if (rtx_equal_p (x, y))
        return;
       old = max_reg_num ();
-      emit_move_insn (x, y);
+      if (GET_CODE (x) == SUBREG
+         && REG_P (reg = SUBREG_REG (x))
+         && GET_MODE_SIZE (mode = GET_MODE (reg)).to_constant () > UNITS_PER_WORD
+         && (regno = REGNO (reg)) >= FIRST_PSEUDO_REGISTER
+         && ira_reg_class_max_nregs[lra_get_allocno_class (regno)][mode] == 1)
+       {
+         x = gen_rtx_STRICT_LOW_PART (VOIDmode, x);
+         emit_insn (gen_rtx_SET (x, y));
+       }
+      else
+       emit_move_insn (x, y);
       if (REG_P (x))
        lra_reg_info[ORIGINAL_REGNO (x)].last_reload = ++lra_curr_reload_num;
       /* Function emit_move can create pseudos -- so expand the pseudo


  But we need insn patterns for such cases, and they are absent from the i386
md files.  Without adding them, the compiler will crash in LRA.
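
  To make the difference between the two kinds of store concrete, here is a
small standalone toy model in C.  It is not GCC code: the helper names, the
64-byte register and the POISON marker are assumptions made only for
illustration.  It follows the reading discussed above, where a plain subreg
store is allowed to disturb every word it overlaps, while a strict_low_part
store changes only the bytes of the subreg itself.

```c
#include <stdint.h>
#include <string.h>

/* Toy model only (hypothetical names, not GCC internals).  A register is
   REG_BYTES bytes split into words of WORD_BYTES each.  POISON marks bytes
   whose value a store leaves unspecified.  */
enum { REG_BYTES = 64, SUBREG_BYTES = 8, POISON = 0xEE };

/* Plain (set (subreg ...)) store at byte offset OFF: every word that
   overlaps the subreg counts as modified, so the bytes of those words
   outside the subreg are poisoned.  Other words are left alone.  */
static void
store_subreg (uint8_t *reg, int word_bytes, int off,
              const uint8_t *src, int len)
{
  int first = off / word_bytes * word_bytes;
  int last = (off + len + word_bytes - 1) / word_bytes * word_bytes;
  memset (reg + first, POISON, last - first);
  memcpy (reg + off, src, len);
}

/* (set (strict_low_part (subreg ...))) store: only the subreg bytes
   change, everything else keeps its old value.  */
static void
store_strict_low_part (uint8_t *reg, int off, const uint8_t *src, int len)
{
  memcpy (reg + off, src, len);
}

/* Store an 8-byte value into the low part of a 64-byte register and count
   how many of the remaining bytes still hold their original value.  */
int
preserved_after_store (int word_bytes, int strict)
{
  uint8_t reg[REG_BYTES], src[SUBREG_BYTES];
  int i, kept = 0;

  memset (reg, 0x11, sizeof reg);
  memset (src, 0x22, sizeof src);
  if (strict)
    store_strict_low_part (reg, 0, src, SUBREG_BYTES);
  else
    store_subreg (reg, word_bytes, 0, src, SUBREG_BYTES);
  for (i = SUBREG_BYTES; i < REG_BYTES; i++)
    kept += reg[i] == 0x11;
  return kept;
}
```

  In this model preserved_after_store (8, 0) keeps all 56 upper bytes (the
multi-word case), preserved_after_store (64, 0) keeps none of them (one
512-bit word), and preserved_after_store (64, 1) keeps all 56 again, which is
the behaviour the strict_low_part form is meant to express.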
