Hello!

As explained in comment #4 of the PR [1], this is a target problem
caused by invalid RTL sharing.

Invalid sharing is created by the misaligned move expansion code in
i386.c when subregs are involved. The vec_extract_hi_v32qi pattern is
generated in the loop2_invariant pass when a misaligned V8SI move is
emitted, and the later cprop3 pass propagates a register inside a
subreg. Because the RTX is shared, the pass updates both instances of
(reg:V8SI 181) to (reg:V8SI 175), in (insn 197) and (insn 198), in one
step. However, the already renamed (reg 175) no longer registers as a
change for (insn 198) in the substitution loop, so the rescan of
(insn 198) is missed.

The solution is to avoid invalid sharing by copying RTXes when subregs
are created.
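
To illustrate the failure mode, here is a toy C++ model. It is only a
sketch and not GCC code: Reg, Insn, substitute and propagate are
hypothetical names invented for this example, and a std::set stands in
for the rescan bookkeeping. When two insns point at the same operand
object, renaming it through the first insn silently renames it in the
second one too, so the second insn never sees a change and is never
queued for rescan. Giving the second insn its own copy (the role
copy_rtx plays in the patch) lets each insn be updated and rescanned
on its own.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <set>

// Toy stand-ins for RTL objects; none of these types exist in GCC.
struct Reg { int regno; };            // plays the role of (reg:V8SI N)
struct Insn {
  int uid;                            // insn number, e.g. 197 or 198
  std::shared_ptr<Reg> operand;       // operand, possibly shared with another insn
};

// Rename old_regno to new_regno in insn by mutating the operand in place.
// Only an insn whose operand actually changed here is marked for rescan.
bool substitute(Insn &insn, int old_regno, int new_regno,
                std::set<int> &rescanned) {
  if (insn.operand->regno != old_regno)
    return false;                     // nothing to do, so no rescan either
  insn.operand->regno = new_regno;
  rescanned.insert(insn.uid);
  return true;
}

// Run the substitution loop over two insns that use the same register.
// Returns how many insns were marked for rescan.
std::size_t propagate(bool copy_operand) {
  auto op1 = std::make_shared<Reg>(Reg{181});
  Insn a{197, op1};
  // Without the copy both insns point at one Reg object (invalid sharing).
  Insn b{198, copy_operand ? std::make_shared<Reg>(*op1) : op1};

  std::set<int> rescanned;
  substitute(a, 181, 175, rescanned);
  substitute(b, 181, 175, rescanned);

  // Either way both insns end up referring to register 175 ...
  assert(a.operand->regno == 175 && b.operand->regno == 175);
  // ... but only the unshared variant marks both insns for rescan.
  return rescanned.size();
}
```

In this model propagate(false) marks only insn 197 for rescan, while
propagate(true) marks both 197 and 198.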

2016-06-06  Uros Bizjak  <ubiz...@gmail.com>

    PR target/71389
    * config/i386/i386.c (ix86_avx256_split_vector_move_misalign):
    Copy op1 RTX to avoid invalid sharing.
    (ix86_expand_vector_move_misalign): Ditto.

testsuite/ChangeLog:

2016-06-06  Uros Bizjak  <ubiz...@gmail.com>

    PR target/71389
    * g++.dg/pr71389.C: New test.

The patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN; it will be backported to release branches.

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71389

Uros.
Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c  (revision 237110)
+++ config/i386/i386.c  (working copy)
@@ -19552,7 +19552,7 @@ ix86_avx256_split_vector_move_misalign (rtx op0, r
       m = adjust_address (op0, mode, 0);
       emit_insn (extract (m, op1, const0_rtx));
       m = adjust_address (op0, mode, 16);
-      emit_insn (extract (m, op1, const1_rtx));
+      emit_insn (extract (m, copy_rtx (op1), const1_rtx));
     }
   else
     gcc_unreachable ();
@@ -19724,7 +19724,7 @@ ix86_expand_vector_move_misalign (machine_mode mod
          m = adjust_address (op0, V2SFmode, 0);
          emit_insn (gen_sse_storelps (m, op1));
          m = adjust_address (op0, V2SFmode, 8);
-         emit_insn (gen_sse_storehps (m, op1));
+         emit_insn (gen_sse_storehps (m, copy_rtx (op1)));
        }
     }
   else
Index: testsuite/g++.dg/pr71389.C
===================================================================
--- testsuite/g++.dg/pr71389.C  (nonexistent)
+++ testsuite/g++.dg/pr71389.C  (working copy)
@@ -0,0 +1,23 @@
+// { dg-do compile { target i?86-*-* x86_64-*-* } }
+// { dg-options "-std=c++11 -O3 -march=ivybridge" }
+
+#include <functional>
+
+extern int le_s6, le_s9, le_s11;
+long foo_v14[16][16];
+
+void fn1() {
+  std::array<std::array<int, 16>, 16> v13;
+  for (; le_s6;)
+    for (int k1 = 2; k1 < 4; k1 = k1 + 1) {
+      for (int n1 = 0; n1 < le_s9; n1 = 8) {
+        *foo_v14[6] = 20923310;
+        for (int i2 = n1; i2 < n1 + 8; i2 = i2 + 1)
+          v13.at(5).at(i2 + 6 - n1) = 306146921;
+      }
+
+      for (int l2 = 0; l2 < le_s11; l2 = l2 + 1)
+          *(l2 + v13.at(5).begin()) = 306146921;
+    }
+  v13.at(le_s6 - 4);
+}
