Jonathan Wright <jonathan.wri...@arm.com> writes:
> Hi,
>
> A previous commit "aarch64: Remove macros for vld4[q]_lane Neon
> intrinsics" introduced some float <-> int type conversion errors.
> This patch fixes those errors.

It looks like:

__extension__ extern __inline float64x1x3_t
__attribute__ ((__always_inline__, __gnu_inline__,__artificial__))
vld3_lane_f64 (const float64_t * __ptr, float64x1x3_t __b, const int __c)
{
  …
  __o = __builtin_aarch64_ld3_lanedi (
          (__builtin_aarch64_simd_di *) __ptr, __o, __c);
  …
}

also has the wrong type, although the call itself is self-consistent.
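
To illustrate what "self-consistent but wrong type" means here, a minimal sketch follows. It is not the arm_neon.h code: load_lane_di and load_lane_df are hypothetical stand-ins for the DI (integer) and DF (float) builtin variants, and it assumes double and int64_t have the same size and alignment.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for an integer (DI) and a float (DF) lane-load
   builtin; these are NOT real GCC builtins.  */
static int64_t
load_lane_di (const int64_t *p)
{
  int64_t v;
  memcpy (&v, p, sizeof v);   /* read via memcpy to stay clear of aliasing rules */
  return v;
}

static double
load_lane_df (const double *p)
{
  return *p;
}

/* Self-consistent but wrongly typed: the cast matches the DI helper, so
   it compiles cleanly, yet the data behind the pointer is a double.  */
static double
lane_via_di (const double *ptr)
{
  int64_t bits = load_lane_di ((const int64_t *) ptr);
  double d;
  memcpy (&d, &bits, sizeof d);   /* reinterpret the bits as a double */
  return d;
}

/* Consistently typed: a float pointer cast paired with the DF helper.  */
static double
lane_via_df (const double *ptr)
{
  return load_lane_df ((const double *) ptr);
}
```

Both wrappers return the same value at run time. Mixing the pairs, for example passing the (const int64_t *) cast to load_lane_df, is what a C++ compiler rejects with a "cannot convert" error of the kind quoted below.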

OK with that changed too, thanks.

Richard

> Bootstrapped and regression tested on aarch64-none-linux-gnu - no
> issues.
>
> Ok for master?
>
> Thanks,
> Jonathan
>
> ---
>
> gcc/ChangeLog:
>
> 2021-08-18  Jonathan Wright  <jonathan.wri...@arm.com>
>
>         * config/aarch64/arm_neon.h (vld4_lane_f32): Use float RTL
>         pattern.
>         (vld4q_lane_f64): Use float type cast.
>
>
>
> From: Andreas Schwab <sch...@linux-m68k.org>
> Sent: 18 August 2021 13:09
> To: Jonathan Wright via Gcc-patches <gcc-patches@gcc.gnu.org>
> Cc: Jonathan Wright <jonathan.wri...@arm.com>; Richard Sandiford 
> <richard.sandif...@arm.com>
> Subject: Re: [PATCH 3/3] aarch64: Remove macros for vld4[q]_lane Neon 
> intrinsics
>
> I think this patch breaks bootstrap.
>
> In file included from ../../libcpp/lex.c:756:
> /opt/gcc/gcc-20210818/Build/prev-gcc/include/arm_neon.h: In function 'float32x2x4_t vld4_lane_f32(const float32_t*, float32x2x4_t, int)':
> /opt/gcc/gcc-20210818/Build/prev-gcc/include/arm_neon.h:21081:11: error: cannot convert 'float*' to 'const int*'
> 21081 |           (__builtin_aarch64_simd_sf *) __ptr, __o, __c);
>       |           ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>       |           |
>       |           float*
> <built-in>: note:   initializing argument 1 of '__builtin_aarch64_simd_xi __builtin_aarch64_ld4_lanev2si(const int*, __builtin_aarch64_simd_xi, int)'
> /opt/gcc/gcc-20210818/Build/prev-gcc/include/arm_neon.h: In function 'float64x2x4_t vld4q_lane_f64(const float64_t*, float64x2x4_t, int)':
> /opt/gcc/gcc-20210818/Build/prev-gcc/include/arm_neon.h:21384:9: error: cannot convert 'long int*' to 'const double*'
> 21384 |         (__builtin_aarch64_simd_di *) __ptr, __o, __c);
>       |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>       |         |
>       |         long int*
> <built-in>: note:   initializing argument 1 of '__builtin_aarch64_simd_xi __builtin_aarch64_ld4_lanev2df(const double*, __builtin_aarch64_simd_xi, int)'
>
> Andreas.
>
> --
> Andreas Schwab, sch...@linux-m68k.org
> GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
> "And now for something completely different."
>
> diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
> index d8b29706a2078f4be374d4c2b0d5882d820ba8e0..98427b345ee2ab9d7a15eed9d784aa7f8c4da168 100644
> --- a/gcc/config/aarch64/arm_neon.h
> +++ b/gcc/config/aarch64/arm_neon.h
> @@ -21077,7 +21077,7 @@ vld4_lane_f32 (const float32_t * __ptr, float32x2x4_t __b, const int __c)
>    __o = __builtin_aarch64_set_qregxiv4sf (__o, (float32x4_t) __temp.val[1], 1);
>    __o = __builtin_aarch64_set_qregxiv4sf (__o, (float32x4_t) __temp.val[2], 2);
>    __o = __builtin_aarch64_set_qregxiv4sf (__o, (float32x4_t) __temp.val[3], 3);
> -  __o =      __builtin_aarch64_ld4_lanev2si (
> +  __o =      __builtin_aarch64_ld4_lanev2sf (
>         (__builtin_aarch64_simd_sf *) __ptr, __o, __c);
>    __b.val[0] = (float32x2_t) __builtin_aarch64_get_dregxidi (__o, 0);
>    __b.val[1] = (float32x2_t) __builtin_aarch64_get_dregxidi (__o, 1);
> @@ -21381,7 +21381,7 @@ vld4q_lane_f64 (const float64_t * __ptr, float64x2x4_t __b, const int __c)
>    __o = __builtin_aarch64_set_qregxiv4si (__o, (int32x4_t) __b.val[2], 2);
>    __o = __builtin_aarch64_set_qregxiv4si (__o, (int32x4_t) __b.val[3], 3);
>    __o = __builtin_aarch64_ld4_lanev2df (
> -     (__builtin_aarch64_simd_di *) __ptr, __o, __c);
> +     (__builtin_aarch64_simd_df *) __ptr, __o, __c);
>    ret.val[0] = (float64x2_t) __builtin_aarch64_get_qregxiv4si (__o, 0);
>    ret.val[1] = (float64x2_t) __builtin_aarch64_get_qregxiv4si (__o, 1);
>    ret.val[2] = (float64x2_t) __builtin_aarch64_get_qregxiv4si (__o, 2);
