On Wed, Nov 19, 2014 at 09:59:45AM +0100, Richard Biener wrote:
> > OImode/XImode on i?86/x86_64 are not <= MAX_BITSIZE_MODE_ANY_INT, because
> > they are never used for integer arithmetics (and there is no way
> > to represent all their values in RTL if not using CONST_WIDE_INT).
> > As the following testcase shows, simplify_immed_subreg can be called
> > with such modes though, e.g. trying to forward propagate a CONST_VECTOR
> > (i?86/x86_64 handles all zeros and all ones as CONST_VECTORs that can appear
> > in the IL directly) into a SUBREG_REG.
> 
> So we have (subreg:OI (reg:V4DF ...) ...) for example?  What do we
> end doing with that OI mode subreg?  (why can't we simply use the
> V4DF one?)

propagate_rtx_1 is called on
(subreg:OI (reg:V8DI 89) 0)
with old_rtx
(reg:V8DI 89)
and new_rtx
(const_vector:V8DI [
        (const_int 0 [0])
        (const_int 0 [0])
        (const_int 0 [0])
        (const_int 0 [0])
        (const_int 0 [0])
        (const_int 0 [0])
        (const_int 0 [0])
        (const_int 0 [0])
    ])

Seems we throw away the result in that case though, because
gen_lowpart_common doesn't like to return low parts of VOIDmode constants
larger if the mode is larger than double int:
1353      innermode = GET_MODE (x);
1354      if (CONST_INT_P (x)
1355          && msize * BITS_PER_UNIT <= HOST_BITS_PER_WIDE_INT)
1356        innermode = mode_for_size (HOST_BITS_PER_WIDE_INT, MODE_INT, 0);
1357      else if (innermode == VOIDmode)
1358        innermode = mode_for_size (HOST_BITS_PER_DOUBLE_INT, MODE_INT, 0);
we hit the last if and as mode is wider than innermode, we return NULL later
on.

> > The following patch instead of ICE handles the most common cases (all 0
> > and all 1 CONST_VECTORs) and returns NULL otherwise.
> > 
> > Before wide-int got merged, the testcase worked, though the code didn't
> > bother checking anything, just created 0 or constm1_rtx for the two cases
> > that could happen and if something else appeared, could just return what
> > matched low TImode (or DImode for -m32).
> > 
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> Looks ok to me.  Not sure if the zero/all-ones case really happens that
> much to be worth special-casing - I think you could use
> fixed_wide_int<proper-size> and simply see if the result is representable
> in a CONST_INT/CONST_DOUBLE?  Can you try if that works?  It looks like
> the patch may be smaller for that.

So perhaps something like this?  Don't know how much more inefficient it is
compared to what it used to do before.

Or just keep the existing code and just remove the assert and instead return
NULL whenever outer_submode is wider than MAX_BITSIZE_MODE_ANY_INT?  At
least during propagation that will make zero change.
Though, in that case I have still doubts about the current code handling right
modes wider than HOST_BITS_PER_DOUBLE_INT but smaller than
MAX_BITSIZE_MODE_ANY_INT (none on i?86/x86_64).  If TARGET_SUPPORTS_WIDE_INT
== 0, we still silently throw away the upper bits, don't we?

BTW, to Mike, the assert has been misplaced, there was first buffer overflow
and only after that the assert.

2014-11-19  Jakub Jelinek  <ja...@redhat.com>

        PR target/63910
        * simplify-rtx.c (simplify_immed_subreg): Handle integer modes wider
        than MAX_BITSIZE_MODE_ANY_INT.

        * gcc.target/i386/pr63910.c: New test.

--- gcc/simplify-rtx.c.jj       2014-11-19 09:17:15.491327992 +0100
+++ gcc/simplify-rtx.c  2014-11-19 12:28:16.223808178 +0100
@@ -5501,8 +5501,12 @@ simplify_immed_subreg (machine_mode oute
            int units
              = (GET_MODE_BITSIZE (outer_submode) + HOST_BITS_PER_WIDE_INT - 1)
              / HOST_BITS_PER_WIDE_INT;
-           HOST_WIDE_INT tmp[MAX_BITSIZE_MODE_ANY_INT / 
HOST_BITS_PER_WIDE_INT];
-           wide_int r;
+           const int tmpsize = (MAX_BITSIZE_MODE_ANY_MODE 
+                                + HOST_BITS_PER_WIDE_INT - 1)
+                               / HOST_BITS_PER_WIDE_INT;
+           HOST_WIDE_INT tmp[tmpsize];
+           typedef FIXED_WIDE_INT (tmpsize * HOST_BITS_PER_WIDE_INT) imm_int;
+           imm_int r;
 
            for (u = 0; u < units; u++)
              {
@@ -5515,10 +5519,12 @@ simplify_immed_subreg (machine_mode oute
                tmp[u] = buf;
                base += HOST_BITS_PER_WIDE_INT;
              }
-           gcc_assert (GET_MODE_PRECISION (outer_submode)
-                       <= MAX_BITSIZE_MODE_ANY_INT);
-           r = wide_int::from_array (tmp, units,
-                                     GET_MODE_PRECISION (outer_submode));
+           r = imm_int::from_array (tmp, units,
+                                    GET_MODE_PRECISION (outer_submode));
+#if TARGET_SUPPORTS_WIDE_INT == 0
+           if (wi::min_precision (r, SIGNED) > HOST_BITS_PER_DOUBLE_INT)
+             return NULL_RTX;
+#endif
            elems[elem] = immed_wide_int_const (r, outer_submode);
          }
          break;
--- gcc/testsuite/gcc.target/i386/pr63910.c.jj  2014-11-19 12:04:23.490489130 
+0100
+++ gcc/testsuite/gcc.target/i386/pr63910.c     2014-11-19 12:04:23.490489130 
+0100
@@ -0,0 +1,12 @@
+/* PR target/63910 */
+/* { dg-do compile } */
+/* { dg-options "-O -mstringop-strategy=vector_loop -mavx512f" } */
+
+extern void bar (float *c);
+
+void
+foo (void)
+{
+  float c[1024] = { };
+  bar (c);
+}


        Jakub

Reply via email to