Am Mon, Aug 05, 2024 at 01:00:31PM +0200 schrieb Richard Biener:
> On Fri, Aug 2, 2024 at 2:43 PM Juergen Christ <[email protected]> wrote:
> >
> > Do not convert floats to ints in multiple step if trapping math is
> > enabled. This might hide some inexact signals.
> >
> > Also use correct sign (the sign of the target integer type) for the
> > intermediate steps. This only affects undefined behaviour (casting
> > floats to unsigned datatype where the float is negative).
> >
> > gcc/ChangeLog:
> >
> > * tree-vect-stmts.cc (vectorizable_conversion): multi-step
> > float to int conversion only with trapping math and correct
> > sign.
> >
> > Signed-off-by: Juergen Christ <[email protected]>
> >
> > Bootstrapped and tested on x84 and s390. Ok for trunk?
> >
> > ---
> > gcc/tree-vect-stmts.cc | 8 +++++---
> > 1 file changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
> > index fdcda0d2abae..2ddd13383193 100644
> > --- a/gcc/tree-vect-stmts.cc
> > +++ b/gcc/tree-vect-stmts.cc
> > @@ -5448,7 +5448,8 @@ vectorizable_conversion (vec_info *vinfo,
> > break;
> >
> > cvt_type
> > - = build_nonstandard_integer_type (GET_MODE_BITSIZE (rhs_mode),
> > 0);
> > + = build_nonstandard_integer_type (GET_MODE_BITSIZE (rhs_mode),
> > + TYPE_UNSIGNED (lhs_type));
>
> But lhs_type should be a float type here, the idea that for a
> FLOAT_EXPR (int -> float)
> a signed integer type is the natural one to use - as it's 2x wider
> than the original
> RHS type it's signedness doesn't matter. Note all float types should be
> !TYPE_UNSIGNED so this hunk is a no-op but still less clear on the intent IMO.
>
> Please drop it.
Will do. Sorry about that.
> > cvt_type = get_same_sized_vectype (cvt_type, vectype_in);
> > if (cvt_type == NULL_TREE)
> > goto unsupported;
> > @@ -5505,10 +5506,11 @@ vectorizable_conversion (vec_info *vinfo,
> > if (GET_MODE_SIZE (lhs_mode) >= GET_MODE_SIZE (rhs_mode))
> > goto unsupported;
> >
> > - if (code == FIX_TRUNC_EXPR)
> > + if (code == FIX_TRUNC_EXPR && !flag_trapping_math)
> > {
> > cvt_type
> > - = build_nonstandard_integer_type (GET_MODE_BITSIZE (rhs_mode),
> > 0);
> > + = build_nonstandard_integer_type (GET_MODE_BITSIZE (rhs_mode),
> > + TYPE_UNSIGNED (lhs_type));
>
> Here it might be relevant for correctness - we have to choose between
> sfix and ufix for the float -> [u]int conversion.
>
> Do you have a testcase? Shouldn't the exactness be independent of the
> integer
> type we convert to?
I was looking at this little program which contains undefined behaviour:
#include <stdio.h>
__attribute__((noinline,noclone,noipa))
void
vec_pack_ufix_trunc_v2df (double *in, unsigned int *out);
void
vec_pack_ufix_trunc_v2df (double *in, unsigned int *out)
{
out[0] = in[0];
out[1] = in[1];
out[2] = in[2];
out[3] = in[3];
}
int main()
{
double in[] = {-1,-2,-3,-4};
unsigned int out[4];
vec_pack_ufix_trunc_v2df (in, out);
for (int i = 0; i < 4; ++i)
printf("out[%d] = %u\n", i, out[i]);
return 0;
}
On s390x, I get different results after vectorization:
out[0] = 4294967295
out[1] = 4294967294
out[2] = 4294967293
out[3] = 4294967292
than without vectorization:
out[0] = 0
out[1] = 0
out[2] = 0
out[3] = 0
Even if this is undefined behaviour, I think it would be nice to have
consistent results here.
Also, while I added an expander to circumvent this problem in a
previous patch, reviewers requested to hide this behind trapping math.
Thus, I looked into this.
Seeing the result from the CI for aarch64, I guess there are some
tests that actually expect this vectorization to always happen even
though it might not be save w.r.t. trapping math.
>
> > cvt_type = get_same_sized_vectype (cvt_type, vectype_in);
> > if (cvt_type == NULL_TREE)
> > goto unsupported;
> > --
> > 2.43.5
> >