permute simplification

rguenth at gcc dot gnu.org Tue, 01 Sep 2020 02:32:58 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94301


--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Testcase

typedef double v1df __attribute__((vector_size(8)));
typedef double v2df __attribute__((vector_size(16)));
typedef long v2di __attribute__((vector_size(16)));

v2df __GIMPLE(ssa) foo (v1df x, v1df z)
{
  v2df y;

  __BB(2):
  y_2 = _Literal (v2df) { x_1(D), _Literal (v1df) { 0.0 } };
  y_3 = _Literal (v2df) { z_4(D), _Literal (v1df) { 0.0 } };
  y_5 = __VEC_PERM (y_2, y_3, _Literal (v2di) { 0l, 2l });
  y_6 = __VEC_PERM (y_5, y_5, _Literal (v2di) { 0l, 0l });
  return y_6;
}
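
As a reading aid, here is a rough plain-C model of what the testcase computes. It is only a sketch: the helper names vec_perm2 and model_foo are hypothetical, vectors are modelled as two-element arrays, and the selector is assumed to be reduced modulo twice the lane count, as documented for two-operand __builtin_shuffle.

```c
#include <stddef.h>

/* Model of a two-input permute on 2-lane double vectors: lane i of the
   result is element sel[i] of the concatenation { a, b }.  Selector
   values are reduced modulo 4 (twice the lane count).  */
static void vec_perm2(const double a[2], const double b[2],
                      const long sel[2], double out[2])
{
    double tmp[2];  /* temporary, so that out may alias a or b */
    for (size_t i = 0; i < 2; i++) {
        unsigned long idx = (unsigned long)sel[i] & 3u;
        tmp[i] = idx < 2 ? a[idx] : b[idx - 2];
    }
    out[0] = tmp[0];
    out[1] = tmp[1];
}

/* Hypothetical scalar model of foo () from the testcase; the
   single-element vectors x and z are represented by their one lane.  */
static void model_foo(double x, double z, double out[2])
{
    const double y2[2] = { x, 0.0 };   /* y_2 = { x, { 0.0 } } */
    const double y3[2] = { z, 0.0 };   /* y_3 = { z, { 0.0 } } */
    const long sel02[2] = { 0, 2 };
    const long sel00[2] = { 0, 0 };
    double y5[2];

    vec_perm2(y2, y3, sel02, y5);      /* y_5 = { x, z } */
    vec_perm2(y5, y5, sel00, out);     /* y_6 = { x, x } */
}
```

In this model the result depends only on x, which is what makes the CTOR / permute chain a candidate for simplification.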


> ./cc1 -quiet t.c -fgimple -O
during RTL pass: expand
t.c: In function 'foo':
t.c:5:20: internal compiler error: in require, at machmode.h:293
    5 | v2df __GIMPLE(ssa) foo (v1df x, v1df z)
      |                    ^~~
0xb84809 opt_mode<scalar_int_mode>::require() const
        /home/rguenther/src/gcc2/gcc/machmode.h:293
0xd5da8c store_integral_bit_field
        /home/rguenther/src/gcc2/gcc/expmed.c:1006
0xd5d2fe store_bit_field_1
        /home/rguenther/src/gcc2/gcc/expmed.c:873


This works with -O0.  With -O we expand from

  y_2 = {x_1(D), { 0.0 }};
  y_4 = BIT_INSERT_EXPR <y_2, z_3(D), 64>;
  y_5 = VEC_PERM_EXPR <y_4, y_4, { 0, 0 }>;
  return y_5;

and the issue is that 'value' is

(mem/c:BLK (plus:DI (reg/f:DI 76 virtual-incoming-args)
        (const_int 8 [0x8])) [1 z+0 S8 A64])

because V1DF isn't a supported vector mode on x86_64 and vector lowering
doesn't do anything to it either.  Eventually V1m types should fall back to
the component mode transparently.  ABI-wise we seem to pass V1DF on the
stack ...

So the "simple" patch

diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c
index bde6fa22b58..90fc34e5a2c 100644
--- a/gcc/stor-layout.c
+++ b/gcc/stor-layout.c
@@ -512,6 +512,10 @@ mode_for_vector (scalar_mode innermode, poly_uint64 nunits)
        return mode;
     }

+  /* For single-element vectors, map it to the component mode.  */
+  if (known_eq (nunits, 1))
+    return innermode;
+
   return opt_machine_mode ();
 }


not only fixes the ICE and generates optimal code but also changes the ABI...

[Bug tree-optimization/94301] Missed vector-vector CTOR / permute simplification
