On Thu, Oct 20, 2011 at 12:31 PM, Jakub Jelinek <ja...@redhat.com> wrote:
> On Thu, Oct 20, 2011 at 11:42:01AM +0200, Richard Guenther wrote:
>> > +  if (TREE_CODE (scalar_dest) == VIEW_CONVERT_EXPR
>> > +      && is_pattern_stmt_p (stmt_info))
>> > +    scalar_dest = TREE_OPERAND (scalar_dest, 0);
>> >    if (TREE_CODE (scalar_dest) != ARRAY_REF
>> >        && TREE_CODE (scalar_dest) != INDIRECT_REF
>> >        && TREE_CODE (scalar_dest) != COMPONENT_REF
>>
>> Just change the if () stmt to
>>
>>  if (!handled_component_p (scalar_dest)
>>      && TREE_CODE (scalar_dest) != MEM_REF)
>>    return false;
>
> That will accept BIT_FIELD_REF and ARRAY_RANGE_REF (as well as VCE outside of 
> pattern stmts).
> I hope the VCEs don't appear, but the first two might, and I'm not sure
> we are prepared to handle them.  Certainly not BIT_FIELD_REFs.
>
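A minimal sketch of the stricter check being discussed, assuming the
surrounding vectorizable_store context; the explicit rejections of
BIT_FIELD_REF, ARRAY_RANGE_REF and stray VIEW_CONVERT_EXPRs are
illustrative, not committed code:

  if (TREE_CODE (scalar_dest) == VIEW_CONVERT_EXPR
      && is_pattern_stmt_p (stmt_info))
    scalar_dest = TREE_OPERAND (scalar_dest, 0);
  /* Accept handled components and MEM_REFs, but explicitly reject
     the refs the vectorizer is not prepared to handle.  */
  if (TREE_CODE (scalar_dest) == BIT_FIELD_REF
      || TREE_CODE (scalar_dest) == ARRAY_RANGE_REF
      || TREE_CODE (scalar_dest) == VIEW_CONVERT_EXPR
      || (!handled_component_p (scalar_dest)
          && TREE_CODE (scalar_dest) != MEM_REF))
    return false;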
>> > +      rhs = adjust_bool_pattern (var, TREE_TYPE (vectype), NULL_TREE, stmts);
>> > +      if (TREE_CODE (lhs) == MEM_REF || TREE_CODE (lhs) == TARGET_MEM_REF)
>> > +   {
>> > +     lhs = copy_node (lhs);
>>
>> We don't handle TARGET_MEM_REF in vectorizable_store, so no need to
>> do it here.  In fact, just unconditionally do ...
>>
>> > +     TREE_TYPE (lhs) = TREE_TYPE (vectype);
>> > +   }
>> > +      else
>> > +   lhs = build1 (VIEW_CONVERT_EXPR, TREE_TYPE (vectype), lhs);
>>
>> ... this (wrap it in a V_C_E).  No need to special-case any
>> MEM_REFs.
>
> Ok.  After all it seems vectorizable_store pretty much ignores it
> (except for the scalar_dest check above).  For aliasing it uses the type
> from DR_REF and otherwise it uses the vectorized type.
>
>> > +      if (!useless_type_conversion_p (TREE_TYPE (lhs), TREE_TYPE (rhs)))
>>
>> This should never be false, so you might as well unconditionally build
>> the conversion stmt.
>
> You mean because currently adjust_bool_pattern will prefer signed types
> over unsigned while here lhs will be unsigned?  I guess I should
> change it to use a signed type for the memory store too and avoid the extra
> cast.  Both types can certainly be the same precision, e.g. for:
> unsigned char a[N], b[N];
> unsigned int d[N], e[N];
> bool c[N];
> ...
>  for (i = 0; i < N; ++i)
>    c[i] = a[i] < b[i];
> or different precision, e.g. for:
>  for (i = 0; i < N; ++i)
>    c[i] = d[i] < e[i];
>
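For the different-precision case, a hedged scalar model of the shape the
pattern stmts take (illustrative only, not the actual adjust_bool_pattern
output):

  int tmp = d[i] < e[i] ? 1 : 0;   /* COND_EXPR in an int-wide signed type */
  c[i] = (unsigned char) tmp;      /* narrowing conversion for the bool store */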
>> > @@ -347,6 +347,28 @@ vect_determine_vectorization_factor (loo
>> >           gcc_assert (STMT_VINFO_DATA_REF (stmt_info)
>> >                       || is_pattern_stmt_p (stmt_info));
>> >           vectype = STMT_VINFO_VECTYPE (stmt_info);
>> > +         if (STMT_VINFO_DATA_REF (stmt_info))
>> > +           {
>> > +             struct data_reference *dr = STMT_VINFO_DATA_REF (stmt_info);
>> > +             tree scalar_type = TREE_TYPE (DR_REF (dr));
>> > +             /* vect_analyze_data_refs will allow bool writes through,
>> > +                in order to allow vect_recog_bool_pattern to transform
>> > +                those.  If they couldn't be transformed, give up now.  */
>> > +             if (((TYPE_PRECISION (scalar_type) == 1
>> > +                   && TYPE_UNSIGNED (scalar_type))
>> > +                  || TREE_CODE (scalar_type) == BOOLEAN_TYPE)
>>
>> Shouldn't it be always possible to vectorize those?  For loads
>> we can assume the memory contains only 1 or 0 (we assume that for
>> scalar loads), for stores we can mask out all other bits explicitly
>> if you add support for truncating conversions to non-mode precision
>> (in fact, we could support non-mode-precision vectorization that way,
>> even if we don't support bitfield loads or extending conversions).
>
> Not without the pattern recognizer transforming it into something.
> That is something we've discussed on IRC before I started working on the
> first vect_recog_bool_pattern patch: we'd need to special-case bool and
> one-bit precision types in way too many places all around the vectorizer.
> Another reason was that what vect_recog_bool_pattern does currently
> is certainly way faster than what we would end up with if we just handled
> bool as unsigned (or signed?) char with masking on casts and stores
> - the ability to use any integer type for the bools rather than char
> as appropriate means we can avoid many VEC_PACK_TRUNC_EXPRs and
> corresponding VEC_UNPACK_{LO,HI}_EXPRs.
> So the chosen solution was to attempt to transform some of the bool
> patterns into something the vectorizer can handle easily, and what it
> handles can be extended over time.
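As a hedged illustration of that saving, for the unsigned char example
above the pattern can keep the whole computation char-wide (scalar model,
names illustrative):

  signed char t = a[i] < b[i] ? 1 : 0;  /* COND_EXPR built in a char-wide type */
  c[i] = t;                             /* compare, select and store at one width */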
>
> The above just reflects that; it's probably me being too cautious, since
> the vectorization would likely fail anyway on the stmt feeding the store,
> because get_vectype_for_scalar_type would fail on it.
>
> If we wanted to support general TYPE_PRECISION != GET_MODE_BITSIZE (TYPE_MODE)
> vectorization (hopefully with still preserving the pattern bool recognizer
> for the above stated reasons), we'd start with changing
> get_vectype_for_scalar_type to handle those types (then the
> tree-vect-data-refs.c and tree-vect-loop.c changes from this patch would
> be unnecessary), but then we'd need to handle it in other places too
> (I guess loads would be fine, except for BIT_FIELD_REF loads, but
> casts and stores would need extra code).
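A hedged sketch of the explicit truncation such stores would need
(function and parameter names are illustrative; assumes precision is
smaller than the width of the stored mode):

  /* Scalar model of a store to a type whose precision is narrower than
     its mode: mask off the bits above the precision before storing.  */
  void
  store_masked (unsigned char *p, unsigned int val, unsigned int precision)
  {
    *p = val & ((1u << precision) - 1);
  }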

I'll try to poke at that a bit, i.e. support general bit-precision types for
loads and stores and the few operations that are safe on them.  If you
have a store to a bool like

int *a, *b;
_Bool *c;

for (i = 0; i < N; ++i)
  c[i] = a[i] < b[i];

will the compare choose an int vector type and then demote it to
char for the store?  I suppose trying to generally handle loads/stores
for these types shouldn't interfere too much with this.  But I'll see ...
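A hedged scalar model of the demotion being asked about (the vectorizer
would do this lane-wise; VEC_PACK_TRUNC_EXPR would provide the int-to-char
step, all other names illustrative):

  void
  foo (int *a, int *b, _Bool *c, int n)
  {
    int i;
    for (i = 0; i < n; ++i)
      {
        int mask = -(a[i] < b[i]);               /* int-wide compare: 0 or -1 */
        signed char narrow = (signed char) mask; /* demotion (VEC_PACK_TRUNC_EXPR) */
        c[i] = narrow & 1;                       /* explicit truncation for _Bool */
      }
  }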

Richard.

>        Jakub
>
