On Mon, 24 Oct 2011, Richard Guenther wrote:
> On Thu, 20 Oct 2011, Jakub Jelinek wrote:
>
> > On Thu, Oct 20, 2011 at 11:42:01AM +0200, Richard Guenther wrote:
> > > > + if (TREE_CODE (scalar_dest) == VIEW_CONVERT_EXPR
> > > > + && is_pattern_stmt_p (stmt_info))
> > > > + scalar_dest = TREE_OPERAND (scalar_dest, 0);
> > > > if (TREE_CODE (scalar_dest) != ARRAY_REF
> > > > && TREE_CODE (scalar_dest) != INDIRECT_REF
> > > > && TREE_CODE (scalar_dest) != COMPONENT_REF
> > >
> > > Just change the if () stmt to
> > >
> > > if (!handled_component_p (scalar_dest)
> > > && TREE_CODE (scalar_dest) != MEM_REF)
> > > return false;
> >
> > That will accept BIT_FIELD_REF and ARRAY_RANGE_REF (as well as VCE outside
> > of pattern stmts).
> > The VCEs I hope don't appear, but the first two might, and I'm not sure
> > we are prepared to handle them. Certainly not BIT_FIELD_REFs.
> >
> > > > +	  rhs = adjust_bool_pattern (var, TREE_TYPE (vectype), NULL_TREE, stmts);
> > > > +	  if (TREE_CODE (lhs) == MEM_REF || TREE_CODE (lhs) == TARGET_MEM_REF)
> > > > + {
> > > > + lhs = copy_node (lhs);
> > >
> > > We don't handle TARGET_MEM_REF in vectorizable_store, so no need to
> > > do it here. In fact, just unconditionally do ...
> > >
> > > > + TREE_TYPE (lhs) = TREE_TYPE (vectype);
> > > > + }
> > > > + else
> > > > + lhs = build1 (VIEW_CONVERT_EXPR, TREE_TYPE (vectype), lhs);
> > >
> > > ... this (wrap it in a V_C_E). No need to special-case any
> > > MEM_REFs.
> >
> > Ok. After all it seems vectorizable_store pretty much ignores it
> > (except for the scalar_dest check above). For aliasing it uses the type
> > from DR_REF and otherwise it uses the vectorized type.
> >
> > > > +      if (!useless_type_conversion_p (TREE_TYPE (lhs), TREE_TYPE (rhs)))
> > >
> > > This should never be false, so you can as well unconditionally build
> > > the conversion stmt.
> >
> > You mean because currently adjust_bool_pattern will prefer signed types
> > over unsigned while here lhs will be unsigned?  I guess I should
> > change it to use a signed type for the memory store too to avoid the extra
> > cast instead.  Both types can certainly be of the same precision, e.g. for:
> > unsigned char a[N], b[N];
> > unsigned int d[N], e[N];
> > bool c[N];
> > ...
> > for (i = 0; i < N; ++i)
> > c[i] = a[i] < b[i];
> > or different precision, e.g. for:
> > for (i = 0; i < N; ++i)
> > c[i] = d[i] < e[i];
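To make the two cases concrete, here is a scalar sketch of the per-lane
computation the pattern effectively produces (the masking and the exact
types are illustrative, not the generated IL):

```c
#include <assert.h>

#define N 8
unsigned char a[N], b[N];
unsigned int d[N], e[N];
_Bool c[N];

/* Same-precision case: the comparison result fits a char-wide signed
   lane (all-ones or zero) and only needs masking to 0/1 at the store.  */
void same_prec (void)
{
  int i;
  for (i = 0; i < N; ++i)
    {
      signed char t = a[i] < b[i] ? -1 : 0;	/* lane-wide mask */
      c[i] = t & 1;				/* 0 or 1 for the bool store */
    }
}

/* Different-precision case: the mask is computed in int-wide lanes and
   must be narrowed before the one-byte store (in vector form this is
   where VEC_PACK_TRUNC_EXPR comes in).  */
void diff_prec (void)
{
  int i;
  for (i = 0; i < N; ++i)
    {
      int t = d[i] < e[i] ? -1 : 0;
      c[i] = (unsigned char) t & 1;
    }
}
```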
> >
> > > > @@ -347,6 +347,28 @@ vect_determine_vectorization_factor (loo
> > > > gcc_assert (STMT_VINFO_DATA_REF (stmt_info)
> > > > || is_pattern_stmt_p (stmt_info));
> > > > vectype = STMT_VINFO_VECTYPE (stmt_info);
> > > > + if (STMT_VINFO_DATA_REF (stmt_info))
> > > > + {
> > > > +	      struct data_reference *dr = STMT_VINFO_DATA_REF (stmt_info);
> > > > +	      tree scalar_type = TREE_TYPE (DR_REF (dr));
> > > > +	      /* vect_analyze_data_refs will allow bool writes through,
> > > > +		 in order to allow vect_recog_bool_pattern to transform
> > > > +		 those.  If they couldn't be transformed, give up now.  */
> > > > + if (((TYPE_PRECISION (scalar_type) == 1
> > > > + && TYPE_UNSIGNED (scalar_type))
> > > > + || TREE_CODE (scalar_type) == BOOLEAN_TYPE)
> > >
> > > Shouldn't it be always possible to vectorize those? For loads
> > > we can assume the memory contains only 1 or 0 (we assume that for
> > > scalar loads), for stores we can mask out all other bits explicitly
> > > if you add support for truncating conversions to non-mode precision
> > > (in fact, we could support non-mode precision vectorization that way,
> > > if not support bitfield loads or extending conversions).
> >
> > Not without the pattern recognizer transforming it into something.
> > That is something we've discussed on IRC before I started working on the
> > first vect_recog_bool_pattern patch, we'd need to special case bool and
> > one-bit precision types in way too many places all around the vectorizer.
> > Another reason for that was that what vect_recog_bool_pattern currently does
> > is certainly way faster than what we would end up with if we just handled
> > bool as unsigned (or signed?) char with masking on casts and stores
> > - the ability to use any integer type for the bools rather than char
> > as appropriate means we can avoid many VEC_PACK_TRUNC_EXPRs and
> > corresponding VEC_UNPACK_{LO,HI}_EXPRs.
> > So the chosen solution was to attempt to transform some of the bool patterns
> > into something the vectorizer can handle easily,
> > and what it handles can be extended over time.
> >
> > The above just reflects that; probably I am just being too cautious,
> > since the vectorization would likely fail on the stmt feeding the store
> > anyway, because get_vectype_for_scalar_type would fail on it.
> >
> > If we wanted to support general TYPE_PRECISION != GET_MODE_BITSIZE (TYPE_MODE)
> > vectorization (hopefully while still preserving the bool pattern recognizer
> > for the above stated reasons), we'd start with changing
> > get_vectype_for_scalar_type to handle those types (then the
> > tree-vect-data-refs.c and tree-vect-loop.c changes from this patch would
> > be unnecessary), but then we'd need to handle it in other places too
> > (I guess loads would be fine (unless BIT_FIELD_REF loads), but then
> > casts and stores need extra code).
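To sketch the extra code such casts and stores would need, here is a
minimal illustration of truncation to non-mode precision by masking
(the helper name and the 3-bit width are made up for the example):

```c
#include <assert.h>

/* Emulate truncating a value computed in full mode width (say QImode)
   back to a narrower bit precision, as the vectorizer would have to do
   after arithmetic on a sub-mode-precision type.  */
unsigned trunc_to_prec (unsigned x, unsigned prec)
{
  return x & ((1u << prec) - 1);	/* keep only the low PREC bits */
}
```

E.g. 3-bit unsigned arithmetic wraps: 5 + 6 = 11 truncates to 3, and a
bool store masks any nonzero mask value down to 1.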
>
> This is what I have right now, bootstrapped and tested on
> x86_64-unknown-linux-gnu. I do see
>
> FAIL: gfortran.dg/logical_dot_product.f90  -O3 -fomit-frame-pointer  (internal compiler error)
> FAIL: gfortran.dg/mapping_1.f90  -O3 -fomit-frame-pointer  (internal compiler error)
> FAIL: gfortran.fortran-torture/execute/pr43390.f90,  -O3 -g  (internal compiler error)
>
> so there is some fallout, but somebody broke dejagnu enough that
> I can't easily debug this right now, so I'm postponing it until
> that is fixed.
>
> It doesn't seem to break any testcases for Bool vectorization.
This one bootstraps and regtests fine on x86_64-unknown-linux-gnu.
I didn't find a good pattern to split out; eventually how we call
the vectorizable_* routines should be refactored a bit.
Does this look ok to you?
Thanks,
Richard.
2011-10-24 Richard Guenther <[email protected]>
* tree-vect-stmts.c (vect_get_vec_def_for_operand): Convert constants
to vector element type.
(vectorizable_assignment): Bail out for non-mode-precision operations.
(vectorizable_shift): Likewise.
(vectorizable_operation): Likewise.
(vectorizable_type_demotion): Likewise.
(vectorizable_type_promotion): Likewise.
(vectorizable_store): Handle non-mode-precision stores.
(vectorizable_load): Handle non-mode-precision loads.
(get_vectype_for_scalar_type_and_size): Return a vector type
for non-mode-precision integers.
* tree-vect-loop.c (vectorizable_reduction): Bail out for
non-mode-precision reductions.
* gcc.dg/vect/vect-bool-1.c: New testcase.
Index: gcc/tree-vect-stmts.c
===================================================================
*** gcc/tree-vect-stmts.c (revision 180380)
--- gcc/tree-vect-stmts.c (working copy)
*************** vect_get_vec_def_for_operand (tree op, g
*** 1204,1210 ****
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "Create vector_cst. nunits = %d", nunits);
! vec_cst = build_vector_from_val (vector_type, op);
return vect_init_vector (stmt, vec_cst, vector_type, NULL);
}
--- 1204,1212 ----
if (vect_print_dump_info (REPORT_DETAILS))
fprintf (vect_dump, "Create vector_cst. nunits = %d", nunits);
! vec_cst = build_vector_from_val (vector_type,
! fold_convert (TREE_TYPE (vector_type),
! op));
return vect_init_vector (stmt, vec_cst, vector_type, NULL);
}
*************** vectorizable_assignment (gimple stmt, gi
*** 2173,2178 ****
--- 2175,2199 ----
!= GET_MODE_SIZE (TYPE_MODE (vectype_in)))))
return false;
+ /* We do not handle bit-precision changes. */
+ if ((CONVERT_EXPR_CODE_P (code)
+ || code == VIEW_CONVERT_EXPR)
+ && INTEGRAL_TYPE_P (TREE_TYPE (scalar_dest))
+ && ((TYPE_PRECISION (TREE_TYPE (scalar_dest))
+ != GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (scalar_dest))))
+ || ((TYPE_PRECISION (TREE_TYPE (op))
+ != GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (op))))))
+ /* But a conversion that does not change the bit-pattern is ok. */
+ && !((TYPE_PRECISION (TREE_TYPE (scalar_dest))
+ > TYPE_PRECISION (TREE_TYPE (op)))
+ && TYPE_UNSIGNED (TREE_TYPE (op))))
+ {
+ if (vect_print_dump_info (REPORT_DETAILS))
+ fprintf (vect_dump, "type conversion to/from bit-precision "
+ "unsupported.");
+ return false;
+ }
+
if (!vec_stmt) /* transformation not required. */
{
STMT_VINFO_TYPE (stmt_info) = assignment_vec_info_type;
*************** vectorizable_shift (gimple stmt, gimple_
*** 2326,2331 ****
--- 2347,2359 ----
scalar_dest = gimple_assign_lhs (stmt);
vectype_out = STMT_VINFO_VECTYPE (stmt_info);
+ if (TYPE_PRECISION (TREE_TYPE (scalar_dest))
+ != GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (scalar_dest))))
+ {
+ if (vect_print_dump_info (REPORT_DETAILS))
+ fprintf (vect_dump, "bit-precision shifts not supported.");
+ return false;
+ }
op0 = gimple_assign_rhs1 (stmt);
if (!vect_is_simple_use_1 (op0, loop_vinfo, bb_vinfo,
*************** vectorizable_operation (gimple stmt, gim
*** 2660,2665 ****
--- 2688,2708 ----
scalar_dest = gimple_assign_lhs (stmt);
vectype_out = STMT_VINFO_VECTYPE (stmt_info);
+ /* Most operations cannot handle bit-precision types without extra
+ truncations. */
+ if ((TYPE_PRECISION (TREE_TYPE (scalar_dest))
+ != GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (scalar_dest))))
+ /* Exception are bitwise operations. */
+ && code != BIT_IOR_EXPR
+ && code != BIT_XOR_EXPR
+ && code != BIT_AND_EXPR
+ && code != BIT_NOT_EXPR)
+ {
+ if (vect_print_dump_info (REPORT_DETAILS))
+ fprintf (vect_dump, "bit-precision arithmetic not supported.");
+ return false;
+ }
+
op0 = gimple_assign_rhs1 (stmt);
if (!vect_is_simple_use_1 (op0, loop_vinfo, bb_vinfo,
&def_stmt, &def, &dt[0], &vectype))
*************** vectorizable_type_demotion (gimple stmt,
*** 3082,3090 ****
if (! ((INTEGRAL_TYPE_P (TREE_TYPE (scalar_dest))
&& INTEGRAL_TYPE_P (TREE_TYPE (op0)))
|| (SCALAR_FLOAT_TYPE_P (TREE_TYPE (scalar_dest))
! && SCALAR_FLOAT_TYPE_P (TREE_TYPE (op0))
! && CONVERT_EXPR_CODE_P (code))))
return false;
if (!vect_is_simple_use_1 (op0, loop_vinfo, bb_vinfo,
&def_stmt, &def, &dt[0], &vectype_in))
{
--- 3125,3144 ----
if (! ((INTEGRAL_TYPE_P (TREE_TYPE (scalar_dest))
&& INTEGRAL_TYPE_P (TREE_TYPE (op0)))
|| (SCALAR_FLOAT_TYPE_P (TREE_TYPE (scalar_dest))
! && SCALAR_FLOAT_TYPE_P (TREE_TYPE (op0)))))
return false;
+
+ if (INTEGRAL_TYPE_P (TREE_TYPE (scalar_dest))
+ && ((TYPE_PRECISION (TREE_TYPE (scalar_dest))
+ != GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (scalar_dest))))
+ || ((TYPE_PRECISION (TREE_TYPE (op0))
+ != GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (op0)))))))
+ {
+ if (vect_print_dump_info (REPORT_DETAILS))
+       fprintf (vect_dump, "type demotion to/from bit-precision "
+ 	       "unsupported.");
+ return false;
+ }
+
if (!vect_is_simple_use_1 (op0, loop_vinfo, bb_vinfo,
&def_stmt, &def, &dt[0], &vectype_in))
{
*************** vectorizable_type_promotion (gimple stmt
*** 3365,3370 ****
--- 3419,3437 ----
&& SCALAR_FLOAT_TYPE_P (TREE_TYPE (op0))
&& CONVERT_EXPR_CODE_P (code))))
return false;
+
+ if (INTEGRAL_TYPE_P (TREE_TYPE (scalar_dest))
+ && ((TYPE_PRECISION (TREE_TYPE (scalar_dest))
+ != GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (scalar_dest))))
+ || ((TYPE_PRECISION (TREE_TYPE (op0))
+ != GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (op0)))))))
+ {
+ if (vect_print_dump_info (REPORT_DETAILS))
+ fprintf (vect_dump, "type promotion to/from bit-precision "
+ "unsupported.");
+ return false;
+ }
+
if (!vect_is_simple_use_1 (op0, loop_vinfo, bb_vinfo,
&def_stmt, &def, &dt[0], &vectype_in))
{
*************** vectorizable_store (gimple stmt, gimple_
*** 3673,3689 ****
return false;
}
- /* The scalar rhs type needs to be trivially convertible to the vector
- component type. This should always be the case. */
elem_type = TREE_TYPE (vectype);
- if (!useless_type_conversion_p (elem_type, TREE_TYPE (op)))
- {
- if (vect_print_dump_info (REPORT_DETAILS))
- fprintf (vect_dump, "??? operands of different types");
- return false;
- }
-
vec_mode = TYPE_MODE (vectype);
/* FORNOW. In some cases can vectorize even if data-type not supported
(e.g. - array initialization with 0). */
if (optab_handler (mov_optab, vec_mode) == CODE_FOR_nothing)
--- 3740,3748 ----
return false;
}
elem_type = TREE_TYPE (vectype);
vec_mode = TYPE_MODE (vectype);
+
/* FORNOW. In some cases can vectorize even if data-type not supported
(e.g. - array initialization with 0). */
if (optab_handler (mov_optab, vec_mode) == CODE_FOR_nothing)
*************** vectorizable_load (gimple stmt, gimple_s
*** 4117,4123 ****
bool strided_load = false;
bool load_lanes_p = false;
gimple first_stmt;
- tree scalar_type;
bool inv_p;
bool negative;
bool compute_in_loop = false;
--- 4176,4181 ----
*************** vectorizable_load (gimple stmt, gimple_s
*** 4192,4198 ****
return false;
}
! scalar_type = TREE_TYPE (DR_REF (dr));
mode = TYPE_MODE (vectype);
/* FORNOW. In some cases can vectorize even if data-type not supported
--- 4250,4256 ----
return false;
}
! elem_type = TREE_TYPE (vectype);
mode = TYPE_MODE (vectype);
/* FORNOW. In some cases can vectorize even if data-type not supported
*************** vectorizable_load (gimple stmt, gimple_s
*** 4204,4219 ****
return false;
}
- /* The vector component type needs to be trivially convertible to the
- scalar lhs. This should always be the case. */
- elem_type = TREE_TYPE (vectype);
- if (!useless_type_conversion_p (TREE_TYPE (scalar_dest), elem_type))
- {
- if (vect_print_dump_info (REPORT_DETAILS))
- fprintf (vect_dump, "??? operands of different types");
- return false;
- }
-
/* Check if the load is a part of an interleaving chain. */
if (STMT_VINFO_STRIDED_ACCESS (stmt_info))
{
--- 4262,4267 ----
*************** vectorizable_load (gimple stmt, gimple_s
*** 4560,4566 ****
msq = new_temp;
bump = size_binop (MULT_EXPR, vs_minus_1,
! TYPE_SIZE_UNIT (scalar_type));
ptr = bump_vector_ptr (dataref_ptr, NULL, gsi, stmt, bump);
new_stmt = gimple_build_assign_with_ops
(BIT_AND_EXPR, NULL_TREE, ptr,
--- 4608,4614 ----
msq = new_temp;
bump = size_binop (MULT_EXPR, vs_minus_1,
! TYPE_SIZE_UNIT (elem_type));
ptr = bump_vector_ptr (dataref_ptr, NULL, gsi, stmt, bump);
new_stmt = gimple_build_assign_with_ops
(BIT_AND_EXPR, NULL_TREE, ptr,
*************** get_vectype_for_scalar_type_and_size (tr
*** 5441,5453 ****
if (nbytes < TYPE_ALIGN_UNIT (scalar_type))
return NULL_TREE;
! /* If we'd build a vector type of elements whose mode precision doesn't
! match their types precision we'll get mismatched types on vector
! extracts via BIT_FIELD_REFs. This effectively means we disable
! vectorization of bool and/or enum types in some languages. */
if (INTEGRAL_TYPE_P (scalar_type)
&& GET_MODE_BITSIZE (inner_mode) != TYPE_PRECISION (scalar_type))
! return NULL_TREE;
if (GET_MODE_CLASS (inner_mode) != MODE_INT
&& GET_MODE_CLASS (inner_mode) != MODE_FLOAT)
--- 5489,5502 ----
if (nbytes < TYPE_ALIGN_UNIT (scalar_type))
return NULL_TREE;
! /* For vector types of elements whose mode precision doesn't
!    match their type's precision we use an element type of mode
!    precision.  The vectorization routines will have to make sure
!    they support the proper result truncation/extension.  */
if (INTEGRAL_TYPE_P (scalar_type)
&& GET_MODE_BITSIZE (inner_mode) != TYPE_PRECISION (scalar_type))
!     scalar_type = build_nonstandard_integer_type (GET_MODE_BITSIZE (inner_mode),
! 						  TYPE_UNSIGNED (scalar_type));
if (GET_MODE_CLASS (inner_mode) != MODE_INT
&& GET_MODE_CLASS (inner_mode) != MODE_FLOAT)
Index: gcc/tree-vect-loop.c
===================================================================
*** gcc/tree-vect-loop.c (revision 180380)
--- gcc/tree-vect-loop.c (working copy)
*************** vectorizable_reduction (gimple stmt, gim
*** 4422,4427 ****
--- 4422,4432 ----
&& !SCALAR_FLOAT_TYPE_P (scalar_type))
return false;
+ /* Do not try to vectorize bit-precision reductions. */
+ if ((TYPE_PRECISION (scalar_type)
+ != GET_MODE_PRECISION (TYPE_MODE (scalar_type))))
+ return false;
+
/* All uses but the last are expected to be defined in the loop.
The last use is the reduction variable. In case of nested cycle this
assumption is not true: we use reduc_index to record the index of the
Index: gcc/testsuite/gcc.dg/vect/vect-bool-1.c
===================================================================
*** gcc/testsuite/gcc.dg/vect/vect-bool-1.c (revision 0)
--- gcc/testsuite/gcc.dg/vect/vect-bool-1.c (revision 0)
***************
*** 0 ****
--- 1,15 ----
+ /* { dg-do compile } */
+ /* { dg-require-effective-target vect_int } */
+
+ _Bool a[1024];
+ _Bool b[1024];
+ _Bool c[1024];
+ void foo (void)
+ {
+ unsigned i;
+ for (i = 0; i < 1024; ++i)
+ a[i] = b[i] | c[i];
+ }
+
+ /* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" } } */
+ /* { dg-final { cleanup-tree-dump "vect" } } */
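
For reference, the reason the testcase needs no truncation step: bitwise
operations on operands already restricted to 0/1 stay within 0/1, so the
bitwise-op exception in vectorizable_operation applies.  A scalar sketch
of that invariant (illustrative only):

```c
#include <assert.h>

/* IOR/AND/XOR of values in {0, 1} stay in {0, 1}; no masking is
   needed, which is why bitwise ops are exempt from the bit-precision
   bail-out.  */
unsigned char bool_ior (unsigned char x, unsigned char y)
{
  return x | y;
}
```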