Christopher Bazley <[email protected]> writes:
>>> Agreed. The only valid situations seem to be:
>>>
>>> (1) a duplicate of a single zero, where:
>>>
>>> npatterns == nelts_per_pattern == encoded_nelts == 1
>>>
>>> and the only encoded value is zero
>>>
>>> (2) the combination of:
>>>
>>> - nelts_per_pattern == 2
>>> - multiple_p (TYPE_VECTOR_SUBPARTS (type), npatterns)
>>> - the second half of the encoded elements are all zeros
>>>
>>> But these combinations would not come about by chance. The caller
>>> would have to take steps to ensure that they're true. So rather
>>> than check for these relatively complex conditions, it might
>>> be clearer to add a new gimple_build interface that explicitly
>>> fills with zeros, using a normal array (instead of a
>>> tree_vector_builder) for the explicitly-initialised elements.
>>
>> Would a new gimple_build_*_with_zeros function remove the need for
>> vect_create_constant_vectors to pad with zeros at all?
>>
>> The design of vect_create_constant_vectors seems to be heavily built
>> around use of a tree_vector_builder. I'm a bit reluctant to do
>> anything that would require significant refactoring of
>> vect_create_constant_vectors, or that would require this seemingly
>> rather ordinary case to be treated specially.

The current code is built for the normal VLA loop case, where the
sequence of scalar constants needs to be repeated to fill a vector.
For example, in:

  for (int i = 0; i < 100; ++i)
    {
      x[i*2] += 1;
      x[i*2 + 1] += 2;
    }

we need { 1, 2, 1, 2, 1, 2, ... }.

We can't do that filling explicitly at compile time because we don't
know how many copies are needed -- that depends on the runtime vector
length.  So instead we use a tree_vector_builder that encodes { 1, 2 }
and says that the pattern needs to be repeated to fill a vector.
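
As a rough illustration of that encoding, here is a toy model in plain
C++.  It is only a sketch: toy_encoding, toy_elt and toy_expand are
made-up names rather than GCC interfaces, and the element type is
simplified to long.  The encoding stores the first
npatterns * nelts_per_pattern elements, and later elements are derived
from the last stored element(s) of the pattern they belong to.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for the vector encoding (hypothetical, not a GCC type).
struct toy_encoding
{
  std::size_t npatterns;          // number of interleaved patterns
  std::size_t nelts_per_pattern;  // 1, 2 or 3
  std::vector<long> encoded;      // the first npatterns * nelts_per_pattern
                                  // elements of the vector
};

// Return element I of the (conceptually unbounded) vector that ENC
// describes.
long
toy_elt (const toy_encoding &enc, std::size_t i)
{
  assert (enc.npatterns > 0);
  assert (enc.nelts_per_pattern >= 1 && enc.nelts_per_pattern <= 3);
  assert (enc.encoded.size () == enc.npatterns * enc.nelts_per_pattern);

  // Explicitly encoded elements are returned directly.
  if (i < enc.encoded.size ())
    return enc.encoded[i];

  // Otherwise work from the last encoded element of the same pattern.
  std::size_t pattern = i % enc.npatterns;
  std::size_t count = i / enc.npatterns;
  std::size_t last = (enc.nelts_per_pattern - 1) * enc.npatterns + pattern;
  if (enc.nelts_per_pattern < 3)
    return enc.encoded[last];

  // Three elements per pattern: the last two define a linear series.
  long step = enc.encoded[last] - enc.encoded[last - enc.npatterns];
  return enc.encoded[last] + static_cast<long> (count - 2) * step;
}

// Materialise the first NELTS elements, as a stand-in for "whatever the
// runtime vector length turns out to be".
std::vector<long>
toy_expand (const toy_encoding &enc, std::size_t nelts)
{
  std::vector<long> out;
  out.reserve (nelts);
  for (std::size_t i = 0; i < nelts; ++i)
    out.push_back (toy_elt (enc, i));
  return out;
}
```

With npatterns == 2, nelts_per_pattern == 1 and encoded elements
{ 1, 2 }, toy_expand gives 1, 2, 1, 2, ... for any requested length,
which is the loop case above.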

This also works for fixed-length loop vectorisation because, in the
general case, filling is needed there too.  We could of course do the
filling explicitly at compile time, but it would be somewhat wasted
effort, since the resulting constant would be canonicalised back to
the "{ 1, 2 } repeating" encoding.
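
To make the "wasted effort" point concrete, the sketch below takes an
explicitly filled fixed-length constant and recovers the shortest plain
repeat.  It is a simplified stand-in, not GCC's actual canonicalisation
(which handles more than simple repeats), and toy_smallest_period is a
hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Return the smallest P that divides ELTS.size () and for which ELTS is
// just its first P elements repeated.  Returns 0 for an empty vector.
std::size_t
toy_smallest_period (const std::vector<long> &elts)
{
  std::size_t n = elts.size ();
  for (std::size_t p = 1; p <= n; ++p)
    {
      if (n % p != 0)
        continue;

      // Check that every element equals the one P positions earlier.
      bool repeats = true;
      for (std::size_t i = p; i < n && repeats; ++i)
        repeats = (elts[i] == elts[i - p]);
      if (repeats)
        return p;
    }
  return 0;
}
```

For an explicitly filled { 1, 2, 1, 2, 1, 2, 1, 2 } this returns 2, so
the explicit fill carries no more information than { 1, 2 } plus
"repeat".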

If you want to do something different for BB SLP then I think it makes
sense that there is some difference in the way that the constant is
constructed.  It doesn't need to be a big difference.  tree_vector_builder
inherits from auto_vec, so it would be possible to create a new
gimple_build_* that takes a vec (or, better, an array_slice) and still
share the current tree_vector_builder code in vect_create_constant_vectors.
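
Very roughly, and only as a toy model, the zero-filling variant could
look like the sketch below.  Everything here is hypothetical:
toy_build_with_zeros is not a proposed signature, a pointer and length
stand in for an array_slice, and long stands in for the element trees.
It produces the nelts_per_pattern == 2 form from case (2) above, with
the second half of the encoded elements all zero.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for the encoded constant (hypothetical, not a GCC type).
struct toy_padded_encoding
{
  std::size_t npatterns;
  std::size_t nelts_per_pattern;
  std::vector<long> encoded;
};

// Hypothetical helper: take N explicitly initialised elements and
// describe "those elements, then zeros for the rest of the vector".
// The real equivalent would also need the vector's element count to be
// a multiple of npatterns, which this toy does not check.
toy_padded_encoding
toy_build_with_zeros (const long *elts, std::size_t n)
{
  assert (n > 0);
  toy_padded_encoding enc;
  enc.npatterns = n;
  enc.nelts_per_pattern = 2;
  enc.encoded.assign (elts, elts + n);  // first element of each pattern
  enc.encoded.resize (2 * n);           // second elements: all zero
  return enc;
}

// Element I of the vector described by ENC (two elements per pattern).
long
toy_padded_elt (const toy_padded_encoding &enc, std::size_t i)
{
  if (i < enc.encoded.size ())
    return enc.encoded[i];
  // Beyond the encoded elements, each pattern repeats its second
  // element, which is zero by construction.
  return enc.encoded[enc.npatterns + i % enc.npatterns];
}
```

Given the explicit elements { 1, 2, 3, 4 }, elements 0 to 3 read back
as 1, 2, 3, 4 and every later element reads back as zero, without the
caller having to set up the encoding by hand.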

Richard