Re: [AArch64] Fix vcvt_high_f64_f32 and vcvt_figh_f32_f64 intrinsics.

Alan Lawrence Mon, 21 Sep 2015 02:45:08 -0700

[Resending in plain text] This makes sense to me now, although I find
your comment slightly confusing:


[....] in that
+;; the meaning of HI and LO is always taken with a little-endian view of
+;; the vector

You mean vec_unpacks_{hi,lo} (which seems to go against the
*architectural* bit after this), or hi/lo in cases other than
vec_unpack (=> not "always"), or something else?

maybe s/always/usually/ or s/always/otherwise/ ?

Cheers, Alan

On 10 September 2015 at 15:02, James Greenhalgh
<[email protected]> wrote:
>
> On Wed, Sep 09, 2015 at 10:28:28AM +0100, Christophe Lyon wrote:
>> On 9 September 2015 at 10:31, James Greenhalgh <[email protected]> 
>> wrote:
>> >
>> > Hi,
>> >
>> > This patch clears up some remaining confusion in the vector lane orderings
>> > for the two intrinsics mentioned in the title.
>> >
>> > Bootstrapped on aarch64-none-linux-gnu and regression tested for
>> > aarch64_be-none-elf with no issues.
>> >
>>
>> Does this actually fix an existing testcase?
>
> Yes, of course, sorry - that was a useless introduction to the patch!
>
> First, I've updated the patch with a testcase, which fails for me on
> aarch64_be-none-elf but not aarch64-*-*.
>
> The issue is that the RTL folding routines will happily fold through
> a vec_concat or a vec_select, which we have given the wrong operands
> to when in BYTES_BIG_ENDIAN mode.  The fix is similar to that which we
> have elsewhere in aarch64-simd.md, which is to split out the big
> and little endian forms of the patterns which need vec_concat, and
> to build a vec_par_cnst_*_half mask for the patterns which need
> vec_select. This keeps us in the GCC-view of lane ordering.
>
> There is test coverage that these patterns do the right thing for the
> vectorizer (I know, because I initially typoed s/le/be and saw tests
> gcc.dg/vect fall over), and the new testcase adds coverage for the
> expansion path through intrinsics.
>
> I've rebased on top of Alan's patch, which goes halfway to fixing the
> issue, but which didn't fix the float_truncate patterns, which had an
> incorrect vec_concat. That simplifies the patch considerably.
>
> Rechecked on aarch64_be-none-elf and aarch64-none-linux-gnu with no
> issues.
>
> OK?
>
> Thanks,
> James
>
> ---
> gcc/
>
> 2015-09-09  James Greenhalgh  <[email protected]>
>
>         * config/aarch64/aarch64-simd.md
>
>         (aarch64_float_truncate_hi_v4sf): Rewrite as an expand.
>         (aarch64_float_truncate_hi_v4sf_le): New.
>         (aarch64_float_truncate_hi_v4sf_be): Likewise.
>
> gcc/testsuite/
>
> 2015-09-09  James Greenhalgh  <[email protected]>
>
>         * gcc.target/aarch64/advsimd-intrinsics/vcvt_high_1.c: New.
>

Re: [AArch64] Fix vcvt_high_f64_f32 and vcvt_figh_f32_f64 intrinsics.

Reply via email to