On Thu, 20 Oct 2011, Jakub Jelinek wrote:

> On Thu, Oct 20, 2011 at 11:42:01AM +0200, Richard Guenther wrote:
> > > +  if (TREE_CODE (scalar_dest) == VIEW_CONVERT_EXPR
> > > +      && is_pattern_stmt_p (stmt_info))
> > > +    scalar_dest = TREE_OPERAND (scalar_dest, 0);
> > >    if (TREE_CODE (scalar_dest) != ARRAY_REF
> > >        && TREE_CODE (scalar_dest) != INDIRECT_REF
> > >        && TREE_CODE (scalar_dest) != COMPONENT_REF
> > 
> > Just change the if () stmt to
> > 
> >  if (!handled_component_p (scalar_dest)
> >      && TREE_CODE (scalar_dest) != MEM_REF)
> >    return false;
> 
> That will accept BIT_FIELD_REF and ARRAY_RANGE_REF (as well as VCE outside
> of pattern stmts).
> The VCEs I hope don't appear, but the first two might, and I'm not sure
> we are prepared to handle them.  Certainly not BIT_FIELD_REFs.
> 
> > > +      rhs = adjust_bool_pattern (var, TREE_TYPE (vectype), NULL_TREE, stmts);
> > > +      if (TREE_CODE (lhs) == MEM_REF || TREE_CODE (lhs) == TARGET_MEM_REF)
> > > + {
> > > +   lhs = copy_node (lhs);
> > 
> > We don't handle TARGET_MEM_REF in vectorizable_store, so no need to
> > do it here.  In fact, just unconditionally do ...
> > 
> > > +   TREE_TYPE (lhs) = TREE_TYPE (vectype);
> > > + }
> > > +      else
> > > + lhs = build1 (VIEW_CONVERT_EXPR, TREE_TYPE (vectype), lhs);
> > 
> > ... this (wrap it in a V_C_E).  No need to special-case any
> > MEM_REFs.
> 
> Ok.  After all it seems vectorizable_store pretty much ignores it
> (except for the scalar_dest check above).  For aliasing it uses the type
> from DR_REF and otherwise it uses the vectorized type.
> 
> > > +      if (!useless_type_conversion_p (TREE_TYPE (lhs), TREE_TYPE (rhs)))
> > 
> > This should never be false, so you can as well unconditionally build
> > the conversion stmt.
> 
> You mean because currently adjust_bool_pattern will prefer signed types
> over unsigned while here lhs will be unsigned?  I guess I should
> change it to use signed type for the memory store too to avoid the extra
> cast instead.  Both types can be certainly the same precision, e.g. for:
> unsigned char a[N], b[N];
> unsigned int d[N], e[N];
> bool c[N];
> ...
>   for (i = 0; i < N; ++i)
>     c[i] = a[i] < b[i];
> or different precision, e.g. for:
>   for (i = 0; i < N; ++i)
>     c[i] = d[i] < e[i];
> 
> > > @@ -347,6 +347,28 @@ vect_determine_vectorization_factor (loo
> > >         gcc_assert (STMT_VINFO_DATA_REF (stmt_info)
> > >                     || is_pattern_stmt_p (stmt_info));
> > >         vectype = STMT_VINFO_VECTYPE (stmt_info);
> > > +       if (STMT_VINFO_DATA_REF (stmt_info))
> > > +         {
> > > +           struct data_reference *dr = STMT_VINFO_DATA_REF (stmt_info);
> > > +           tree scalar_type = TREE_TYPE (DR_REF (dr));
> > > +           /* vect_analyze_data_refs will allow bool writes through,
> > > +              in order to allow vect_recog_bool_pattern to transform
> > > +              those.  If they couldn't be transformed, give up now.  */
> > > +           if (((TYPE_PRECISION (scalar_type) == 1
> > > +                 && TYPE_UNSIGNED (scalar_type))
> > > +                || TREE_CODE (scalar_type) == BOOLEAN_TYPE)
> > 
> > Shouldn't it be always possible to vectorize those?  For loads
> > we can assume the memory contains only 1 or 0 (we assume that for
> > scalar loads), for stores we can mask out all other bits explicitly
> > if you add support for truncating conversions to non-mode precision
> > (in fact, we could support non-mode precision vectorization that way,
> > if not support bitfield loads or extending conversions).
> 
> Not without the pattern recognizer transforming it into something.
> That is something we've discussed on IRC before I started working on the
> first vect_recog_bool_pattern patch, we'd need to special case bool and
> one-bit precision types in way too many places all around the vectorizer.
> Another reason for that was that what vect_recog_bool_pattern does currently
> is certainly way faster than what we would end up with if we just handled
> bool as unsigned (or signed?) char with masking on casts and stores
> - the ability to use any integer type for the bools rather than char
> as appropriate means we can avoid many VEC_PACK_TRUNC_EXPRs and
> corresponding VEC_UNPACK_{LO,HI}_EXPRs.
> So the chosen solution was to attempt to transform some of the bool patterns
> into something the vectorizer can handle easily.
> What it handles can be extended over time.
> 
> The above just reflects it, probably just me trying to be too cautious,
> the vectorization would likely fail on the stmt feeding the store, because
> get_vectype_for_scalar_type would fail on it.
> 
> If we wanted to support general TYPE_PRECISION != GET_MODE_BITSIZE (TYPE_MODE)
> vectorization (hopefully with still preserving the pattern bool recognizer
> for the above stated reasons), we'd start with changing
> get_vectype_for_scalar_type to handle those types (then the
> tree-vect-data-refs.c and tree-vect-loop.c changes from this patch would
> be unnecessary), but then we'd need to handle it in other places too
> (I guess loads would be fine (unless BIT_FIELD_REF loads), but then
> casts and stores need extra code).

This is what I have right now, bootstrapped and tested on 
x86_64-unknown-linux-gnu.  I do see

FAIL: gfortran.dg/logical_dot_product.f90  -O3 -fomit-frame-pointer  (internal compiler error)
FAIL: gfortran.dg/mapping_1.f90  -O3 -fomit-frame-pointer  (internal compiler error)
FAIL: gfortran.fortran-torture/execute/pr43390.f90,  -O3 -g  (internal compiler error)

so there is some fallout, but somebody broke dejagnu enough that
I can't easily debug this right now, so I'm postponing it until
that is fixed.

It doesn't seem to break any testcases for bool vectorization.

I probably should factor out the precision test.
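
Factoring it out could look roughly like the sketch below.  This is a
standalone model, not GCC code: struct int_type_model and both function
names are hypothetical stand-ins for TYPE_PRECISION,
GET_MODE_PRECISION (TYPE_MODE (type)) and TYPE_UNSIGNED on a tree type.
The second function mirrors the condition added to
vectorizable_assignment in the patch below.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the properties of an integral tree type
   that the patch inspects.  */
struct int_type_model
{
  unsigned precision;       /* models TYPE_PRECISION */
  unsigned mode_precision;  /* models GET_MODE_PRECISION (TYPE_MODE (type)) */
  bool is_unsigned;         /* models TYPE_UNSIGNED */
};

/* Return true if the type uses every bit of its machine mode.  */
bool
type_has_mode_precision_p (struct int_type_model t)
{
  assert (t.precision <= t.mode_precision);
  return t.precision == t.mode_precision;
}

/* Return true if a conversion from SRC to DEST would be rejected:
   either side is a bit-precision type, and the conversion is not a
   widening from an unsigned source (which keeps the bit pattern).  */
bool
conversion_unsupported_p (struct int_type_model dest,
                          struct int_type_model src)
{
  bool bit_precision = (!type_has_mode_precision_p (dest)
                        || !type_has_mode_precision_p (src));
  bool keeps_bit_pattern = (dest.precision > src.precision
                            && src.is_unsigned);
  return bit_precision && !keeps_bit_pattern;
}
```

With such a helper the shift, operation, demotion and promotion checks
would each reduce to one or two calls instead of repeating the
TYPE_PRECISION / GET_MODE_PRECISION comparison.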

Thanks,
Richard.

2011-10-24  Richard Guenther  <rguent...@suse.de>

        * tree-vect-stmts.c (vectorizable_assignment): Bail out for
        non-mode-precision operations.
        (vectorizable_shift): Likewise.
        (vectorizable_operation): Likewise.
        (vectorizable_type_demotion): Likewise.
        (vectorizable_type_promotion): Likewise.
        (vectorizable_store): Handle non-mode-precision stores.
        (vectorizable_load): Handle non-mode-precision loads.
        (get_vectype_for_scalar_type_and_size): Return a vector type
        for non-mode-precision integers.

        * gcc.dg/vect/vect-bool-1.c: New testcase.

Index: gcc/tree-vect-stmts.c
===================================================================
*** gcc/tree-vect-stmts.c       (revision 180380)
--- gcc/tree-vect-stmts.c       (working copy)
*************** vectorizable_assignment (gimple stmt, gi
*** 2173,2178 ****
--- 2173,2197 ----
              != GET_MODE_SIZE (TYPE_MODE (vectype_in)))))
      return false;
  
+   /* We do not handle bit-precision changes.  */
+   if ((CONVERT_EXPR_CODE_P (code)
+        || code == VIEW_CONVERT_EXPR)
+       && INTEGRAL_TYPE_P (TREE_TYPE (scalar_dest))
+       && ((TYPE_PRECISION (TREE_TYPE (scalar_dest))
+          != GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (scalar_dest))))
+         || ((TYPE_PRECISION (TREE_TYPE (op))
+              != GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (op))))))
+       /* But a conversion that does not change the bit-pattern is ok.  */
+       && !((TYPE_PRECISION (TREE_TYPE (scalar_dest))
+           > TYPE_PRECISION (TREE_TYPE (op)))
+          && TYPE_UNSIGNED (TREE_TYPE (op))))
+     {
+       if (vect_print_dump_info (REPORT_DETAILS))
+         fprintf (vect_dump, "type conversion to/from bit-precision "
+                "unsupported.");
+       return false;
+     }
+ 
    if (!vec_stmt) /* transformation not required.  */
      {
        STMT_VINFO_TYPE (stmt_info) = assignment_vec_info_type;
*************** vectorizable_shift (gimple stmt, gimple_
*** 2326,2331 ****
--- 2345,2357 ----
  
    scalar_dest = gimple_assign_lhs (stmt);
    vectype_out = STMT_VINFO_VECTYPE (stmt_info);
+   if (TYPE_PRECISION (TREE_TYPE (scalar_dest))
+       != GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (scalar_dest))))
+     {
+       if (vect_print_dump_info (REPORT_DETAILS))
+         fprintf (vect_dump, "bit-precision shifts not supported.");
+       return false;
+     }
  
    op0 = gimple_assign_rhs1 (stmt);
    if (!vect_is_simple_use_1 (op0, loop_vinfo, bb_vinfo,
*************** vectorizable_operation (gimple stmt, gim
*** 2660,2665 ****
--- 2686,2706 ----
    scalar_dest = gimple_assign_lhs (stmt);
    vectype_out = STMT_VINFO_VECTYPE (stmt_info);
  
+   /* Most operations cannot handle bit-precision types without extra
+      truncations.  */
+   if ((TYPE_PRECISION (TREE_TYPE (scalar_dest))
+        != GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (scalar_dest))))
+       /* Exceptions are bitwise operations.  */
+       && code != BIT_IOR_EXPR
+       && code != BIT_XOR_EXPR
+       && code != BIT_AND_EXPR
+       && code != BIT_NOT_EXPR)
+     {
+       if (vect_print_dump_info (REPORT_DETAILS))
+         fprintf (vect_dump, "bit-precision arithmetic not supported.");
+       return false;
+     }
+ 
    op0 = gimple_assign_rhs1 (stmt);
    if (!vect_is_simple_use_1 (op0, loop_vinfo, bb_vinfo,
                             &def_stmt, &def, &dt[0], &vectype))
*************** vectorizable_type_demotion (gimple stmt,
*** 3082,3090 ****
    if (! ((INTEGRAL_TYPE_P (TREE_TYPE (scalar_dest))
          && INTEGRAL_TYPE_P (TREE_TYPE (op0)))
         || (SCALAR_FLOAT_TYPE_P (TREE_TYPE (scalar_dest))
!            && SCALAR_FLOAT_TYPE_P (TREE_TYPE (op0))
!            && CONVERT_EXPR_CODE_P (code))))
      return false;
    if (!vect_is_simple_use_1 (op0, loop_vinfo, bb_vinfo,
                             &def_stmt, &def, &dt[0], &vectype_in))
      {
--- 3123,3142 ----
    if (! ((INTEGRAL_TYPE_P (TREE_TYPE (scalar_dest))
          && INTEGRAL_TYPE_P (TREE_TYPE (op0)))
         || (SCALAR_FLOAT_TYPE_P (TREE_TYPE (scalar_dest))
!            && SCALAR_FLOAT_TYPE_P (TREE_TYPE (op0)))))
      return false;
+ 
+   if (INTEGRAL_TYPE_P (TREE_TYPE (scalar_dest))
+       && ((TYPE_PRECISION (TREE_TYPE (scalar_dest))
+          != GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (scalar_dest))))
+         || ((TYPE_PRECISION (TREE_TYPE (op0))
+              != GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (op0)))))))
+     {
+       if (vect_print_dump_info (REPORT_DETAILS))
+         fprintf (vect_dump, "type demotion to/from bit-precision unsupported.");
+       return false;
+     }
+ 
    if (!vect_is_simple_use_1 (op0, loop_vinfo, bb_vinfo,
                             &def_stmt, &def, &dt[0], &vectype_in))
      {
*************** vectorizable_type_promotion (gimple stmt
*** 3365,3370 ****
--- 3417,3435 ----
             && SCALAR_FLOAT_TYPE_P (TREE_TYPE (op0))
             && CONVERT_EXPR_CODE_P (code))))
      return false;
+ 
+   if (INTEGRAL_TYPE_P (TREE_TYPE (scalar_dest))
+       && ((TYPE_PRECISION (TREE_TYPE (scalar_dest))
+          != GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (scalar_dest))))
+         || ((TYPE_PRECISION (TREE_TYPE (op0))
+              != GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (op0)))))))
+     {
+       if (vect_print_dump_info (REPORT_DETAILS))
+         fprintf (vect_dump, "type promotion to/from bit-precision "
+                "unsupported.");
+       return false;
+     }
+ 
    if (!vect_is_simple_use_1 (op0, loop_vinfo, bb_vinfo,
                             &def_stmt, &def, &dt[0], &vectype_in))
      {
*************** vectorizable_store (gimple stmt, gimple_
*** 3673,3689 ****
        return false;
      }
  
-   /* The scalar rhs type needs to be trivially convertible to the vector
-      component type.  This should always be the case.  */
    elem_type = TREE_TYPE (vectype);
-   if (!useless_type_conversion_p (elem_type, TREE_TYPE (op)))
-     {
-       if (vect_print_dump_info (REPORT_DETAILS))
-         fprintf (vect_dump, "???  operands of different types");
-       return false;
-     }
- 
    vec_mode = TYPE_MODE (vectype);
    /* FORNOW. In some cases can vectorize even if data-type not supported
       (e.g. - array initialization with 0).  */
    if (optab_handler (mov_optab, vec_mode) == CODE_FOR_nothing)
--- 3738,3746 ----
        return false;
      }
  
    elem_type = TREE_TYPE (vectype);
    vec_mode = TYPE_MODE (vectype);
+ 
    /* FORNOW. In some cases can vectorize even if data-type not supported
       (e.g. - array initialization with 0).  */
    if (optab_handler (mov_optab, vec_mode) == CODE_FOR_nothing)
*************** vectorizable_load (gimple stmt, gimple_s
*** 4117,4123 ****
    bool strided_load = false;
    bool load_lanes_p = false;
    gimple first_stmt;
-   tree scalar_type;
    bool inv_p;
    bool negative;
    bool compute_in_loop = false;
--- 4174,4179 ----
*************** vectorizable_load (gimple stmt, gimple_s
*** 4192,4198 ****
        return false;
      }
  
!   scalar_type = TREE_TYPE (DR_REF (dr));
    mode = TYPE_MODE (vectype);
  
    /* FORNOW. In some cases can vectorize even if data-type not supported
--- 4248,4254 ----
        return false;
      }
  
!   elem_type = TREE_TYPE (vectype);
    mode = TYPE_MODE (vectype);
  
    /* FORNOW. In some cases can vectorize even if data-type not supported
*************** vectorizable_load (gimple stmt, gimple_s
*** 4204,4219 ****
        return false;
      }
  
-   /* The vector component type needs to be trivially convertible to the
-      scalar lhs.  This should always be the case.  */
-   elem_type = TREE_TYPE (vectype);
-   if (!useless_type_conversion_p (TREE_TYPE (scalar_dest), elem_type))
-     {
-       if (vect_print_dump_info (REPORT_DETAILS))
-         fprintf (vect_dump, "???  operands of different types");
-       return false;
-     }
- 
    /* Check if the load is a part of an interleaving chain.  */
    if (STMT_VINFO_STRIDED_ACCESS (stmt_info))
      {
--- 4260,4265 ----
*************** vectorizable_load (gimple stmt, gimple_s
*** 4560,4566 ****
                    msq = new_temp;
  
                    bump = size_binop (MULT_EXPR, vs_minus_1,
!                                      TYPE_SIZE_UNIT (scalar_type));
                    ptr = bump_vector_ptr (dataref_ptr, NULL, gsi, stmt, bump);
                    new_stmt = gimple_build_assign_with_ops
                                 (BIT_AND_EXPR, NULL_TREE, ptr,
--- 4606,4612 ----
                    msq = new_temp;
  
                    bump = size_binop (MULT_EXPR, vs_minus_1,
!                                      TYPE_SIZE_UNIT (elem_type));
                    ptr = bump_vector_ptr (dataref_ptr, NULL, gsi, stmt, bump);
                    new_stmt = gimple_build_assign_with_ops
                                 (BIT_AND_EXPR, NULL_TREE, ptr,
*************** get_vectype_for_scalar_type_and_size (tr
*** 5441,5453 ****
    if (nbytes < TYPE_ALIGN_UNIT (scalar_type))
      return NULL_TREE;
  
!   /* If we'd build a vector type of elements whose mode precision doesn't
!      match their types precision we'll get mismatched types on vector
!      extracts via BIT_FIELD_REFs.  This effectively means we disable
!      vectorization of bool and/or enum types in some languages.  */
    if (INTEGRAL_TYPE_P (scalar_type)
        && GET_MODE_BITSIZE (inner_mode) != TYPE_PRECISION (scalar_type))
!     return NULL_TREE;
  
    if (GET_MODE_CLASS (inner_mode) != MODE_INT
        && GET_MODE_CLASS (inner_mode) != MODE_FLOAT)
--- 5487,5500 ----
    if (nbytes < TYPE_ALIGN_UNIT (scalar_type))
      return NULL_TREE;
  
!   /* For vector types of elements whose mode precision doesn't
!      match their type's precision we use an element type of mode
!      precision.  The vectorization routines will have to make sure
!      they support the proper result truncation/extension.  */
    if (INTEGRAL_TYPE_P (scalar_type)
        && GET_MODE_BITSIZE (inner_mode) != TYPE_PRECISION (scalar_type))
!     scalar_type = build_nonstandard_integer_type (GET_MODE_BITSIZE (inner_mode),
!                                                 TYPE_UNSIGNED (scalar_type));
  
    if (GET_MODE_CLASS (inner_mode) != MODE_INT
        && GET_MODE_CLASS (inner_mode) != MODE_FLOAT)
Index: gcc/testsuite/gcc.dg/vect/vect-bool-1.c
===================================================================
*** gcc/testsuite/gcc.dg/vect/vect-bool-1.c     (revision 0)
--- gcc/testsuite/gcc.dg/vect/vect-bool-1.c     (revision 0)
***************
*** 0 ****
--- 1,15 ----
+ /* { dg-do compile } */
+ /* { dg-require-effective-target vect_int } */
+ 
+ _Bool a[1024];
+ _Bool b[1024];
+ _Bool c[1024];
+ void foo (void)
+ {
+   unsigned i;
+   for (i = 0; i < 1024; ++i)
+     a[i] = b[i] | c[i];
+ }
+ 
+ /* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" } } */
+ /* { dg-final { cleanup-tree-dump "vect" } } */
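
To illustrate why the testcase uses a bitwise OR, and why
vectorizable_operation above exempts the binary bitwise codes, here is
a small standalone model.  It is only a sketch: the function names are
hypothetical, and 8-bit lanes holding 1-bit values are assumed as the
mode-precision element type.  OR of lanes holding 0 or 1 stays within
0 or 1, while an addition without an extra truncation does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Per-lane OR on 8-bit elements, modelling what a vectorized
   a[i] = b[i] | c[i] computes for each lane.  */
void
lane_or (unsigned char *dst, const unsigned char *x,
         const unsigned char *y, size_t n)
{
  assert (dst != NULL && x != NULL && y != NULL);
  for (size_t i = 0; i < n; ++i)
    dst[i] = (unsigned char) (x[i] | y[i]);
}

/* Per-lane addition with no truncation back to 1-bit precision.  */
void
lane_add_untruncated (unsigned char *dst, const unsigned char *x,
                      const unsigned char *y, size_t n)
{
  assert (dst != NULL && x != NULL && y != NULL);
  for (size_t i = 0; i < n; ++i)
    dst[i] = (unsigned char) (x[i] + y[i]);
}

/* Return true if every lane still holds a valid 1-bit value.  */
bool
lanes_are_bool (const unsigned char *p, size_t n)
{
  for (size_t i = 0; i < n; ++i)
    if (p[i] > 1)
      return false;
  return true;
}
```

Feeding both functions lanes that contain only 0 and 1, lane_or keeps
lanes_are_bool true, whereas lane_add_untruncated produces 2 for the
1 + 1 lane, which is the kind of result that would need the extra
truncation the patch does not implement.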
