On 12/11/18 14:30, Richard Biener wrote:
I guess I already asked this question when WIDEN_MULT_PLUS_EXPR was
introduced - but isn't that fully contained within a DOT_PROD_EXPR?
I'm not sure what exactly you mean here.
A mailing list search to find that post was unsuccessful. The earliest
it found for
widen_sum_expr and richard and (biener or guenther) was from October 2012,
and WIDEN_SUM_EXPR already existed then as was visibile in the context
of the
patch under discussion.
From the perspective of expressing scalar operations, or expressing
reduction operations on
(originally) scalars in the vectorizer, WIDEN_SUM_EXPR is redundant, as
we could use DOT_PROD_EXPR with one constant input.
For describing vectorizing narrow vector operations, WIDEN_SUM_EXPR
would be more suitable, as
we wouldn't need to bring in matrix operations, but the current
non-widening semantics suffer
from being insufficiently precisely defined in what they are actually
summing up.
From the perspective of providing optabs, there are two aspects to
consider: First, if a dot_prod
optab were to be to be used for expanding WIDEN_SUM_EXPR (by that or any
other name), a
target that implements WIDEN_SUM but not DOT_PROD in general would have
to have a dot_prod
expander that requires a vector of constant ones for one of its
operands. And then vectorizer, when
it wants to consider DOT_PROD_EXPR for two vector variables, would have
to test the operand
predicates of the expander. That sounds like a lot more trouble than
the little reduction in the
set of optabs is worth.
Some comments on the patch.
+ tree vecotype
+ = build_vector_type (otype, GET_MODE_NUNITS (TYPE_MODE (vecitype)));
TYPE_VECTOR_SUBPARTS (vecitype)
You want to pass in the half/full types and use get_vectype_for_scalar_type
which also makes sure the target supports the vector type.
WIDEN_MULT_PLUS is special on our target in that it creates double-sized
vectors. To get a full-sized vector in the reduction loop, you have to have
a double-sized vector result. We already make the reduction special in that
the result can vary in size, it's scalar for an ordinary DOT_PROD_EXPR,
while
a vector for other operations.
I think you want to extend and re-use supportable_widening_operation
here anyways.
I see. Other targets generally use even/odd or lo/hi vectorized
widening mult operations.
ia64/vect.md, s390/vector.md, spu/spu.md and tilegx/tilegx.md have
mult_lo / mult_even patterns, but no dot_prod pattern, so if I go via an
enhanced supportable_widening_operation, my patch should become relevant
to these platforms.
To make it cover our target as well, I'll have to make it able, if the
optabs indicate that, to specify a single WIDEN_MULT_PLUS_EXPR with a
double-sized vector result instead of a split into two halves.
Index: gcc/tree-vect-stmts.c
===================================================================
--- gcc/tree-vect-stmts.c (revision 266008)
+++ gcc/tree-vect-stmts.c (working copy)
@@ -10638,7 +10638,11 @@ vect_get_vector_types_for_stmt (stmt_vec
scalar_type);
if (maybe_ne (GET_MODE_SIZE (TYPE_MODE (vectype)),
- GET_MODE_SIZE (TYPE_MODE (nunits_vectype))))
+ GET_MODE_SIZE (TYPE_MODE (nunits_vectype)))
+ /* Reductions that use a widening reduction would show
+ a mismatch but that's already been checked to be OK. */
+ && STMT_VINFO_DEF_TYPE (stmt_info) != vect_reduction_def)
+
return opt_result::failure_at (stmt,
"not vectorized: different sized vector "
"types in statement, %T and %T\n",
that change doesn't look good.
Would it become acceptable if I made it specific to WIDEN_MULT_PLUS_EXPR ?