On Thu, 3 Jan 2019, Jakub Jelinek wrote:
On Thu, Jan 03, 2019 at 11:48:12AM +0100, Marc Glisse wrote:
The following patch adds support for the __builtin_convertvector builtin.
C casts on generic vectors just reinterpret the bits (i.e. a
VIEW_CONVERT_EXPR, VCE); this builtin allows casting int/unsigned elements
to float or vice versa, or promoting/demoting them. The doc/ change is
missing; I will write it soon.
The builtin appeared in, I think, clang 3.4 and is apparently in real-world
use as e.g. Honza reported. The first argument is an expression with vector
type, the second argument is a vector type (similarly e.g. to va_arg), to
which the first argument should be converted. Both vector types need to
have the same number of elements.
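
To make the difference between a plain cast and the builtin concrete, here is a minimal sketch. It assumes a compiler that already provides __builtin_convertvector (clang, or GCC with this patch applied); the typedef and function names are made up for illustration and are not part of the patch.

```c
/* Hypothetical 4-element vector types with the same element count. */
typedef int   v4si __attribute__((vector_size(16)));
typedef float v4sf __attribute__((vector_size(16)));

/* Value conversion: each int element becomes the float of equal value.
   The second argument is a type, much like the type operand of va_arg. */
v4sf convert_values(v4si v)
{
    return __builtin_convertvector(v, v4sf);
}

/* Plain C cast on a generic vector: the 128 bits are reinterpreted
   unchanged, so this is not a numeric conversion. */
v4sf reinterpret_bits(v4si v)
{
    return (v4sf) v;
}
```

With an input of {1, 2, 3, 4}, convert_values yields {1.0f, 2.0f, 3.0f, 4.0f}, while reinterpret_bits yields the floats whose bit patterns happen to be the integers 1 to 4.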
I've implemented conversions with the same element size (and thus the same
whole-vector size) efficiently: signed to unsigned and vice versa, or to the
same vector type, use just a VCE; e.g. int <-> float or long long <-> double
use the appropriate optab, possibly repeated multiple times for very large
vectors.
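
As an illustration of the same-size sign-change case, the sketch below shows why a VCE is sufficient there: converting int elements to unsigned keeps every bit pattern, so the builtin and a plain vector cast give the same result. It assumes a compiler that provides the builtin; the type and function names are hypothetical.

```c
/* Hypothetical vector types: same element size, different signedness. */
typedef int      v4si __attribute__((vector_size(16)));
typedef unsigned v4su __attribute__((vector_size(16)));

/* Element-wise int -> unsigned conversion through the builtin. */
v4su to_unsigned_convert(v4si v)
{
    return __builtin_convertvector(v, v4su);
}

/* Bit reinterpretation through a plain vector cast. */
v4su to_unsigned_cast(v4si v)
{
    return (v4su) v;
}
```

For int <-> float of the same size the two forms differ, which is where the optab comes in.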
IIUC, you only lower __builtin_convertvector to VCE or FLOAT_EXPR or
whatever in tree-vect-generic. That seems quite late. At least for the
"easy" same-size case, I think we should do it early (gimplification?),
No, it must not be done at gimplification time. Think about OpenMP/OpenACC
offloading: the target before IPA optimizations might not be the target
after them. While they have to agree on ABI issues, the optabs definitely
can be and are different, and these optabs, originally added for the
vectorizer, don't have a fallback; whatever introduces them into the IL is
responsible for verifying that they are supported.
Ah, I was missing this. And I don't see why we should keep it that way. As
long as the vectorizer was the only producer, it made sense not to have a
fallback, it was not needed. But now that we are talking of having the
user produce it almost directly, it would make sense for it to behave like
other vector operations (say PLUS_EXPR).
That said, I'm not sure whether e.g. using an opaque builtin for the
conversion that supportable_convert_operation sometimes uses is better than
this ifn. What exact optimization opportunities are you looking for if it is
lowered earlier? I have the VECTOR_CST folding in place...
I don't know, any kind of optimization we currently do on scalars... For
conversions between integers and floats, that seems to be very limited,
maybe combine consecutive casts in rare cases. For sign changes, we have a
number of transformations in match.pd that are fine with an intermediate
cast that only changes the sign (I even introduced nop_convert to handle
vectors at the same time). I guess we could handle this IFN as well. It is
just that having 2 ways to express the same thing tends to cause code
duplication.
On the other hand, for narrowing/widening conversions, keeping it as one
stmt with your ifn may be more convenient to optimize than a large mess of
VEC_UNPACK_FLOAT_HI_EXPR and friends. Again I am thinking more of match.pd
type of transformation, nothing that looks at the target.
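
For reference, the narrowing/widening case looks like this at the source level: a single call, with the element size (and so the whole vector size) changing while the element count stays the same. This is only a sketch with made-up type and function names, assuming a compiler that provides the builtin; how it gets lowered depends on the target.

```c
/* Hypothetical 4-element vector types of different element sizes. */
typedef signed char v4qi __attribute__((vector_size(4)));
typedef short       v4hi __attribute__((vector_size(8)));
typedef int         v4si __attribute__((vector_size(16)));

/* Widening: each signed char element is sign-extended to int. */
v4si widen_chars(v4qi v)
{
    return __builtin_convertvector(v, v4si);
}

/* Narrowing: each int element is converted to short. */
v4hi narrow_ints(v4si v)
{
    return __builtin_convertvector(v, v4hi);
}
```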
--
Marc Glisse