Kewen:
On 6/4/24 00:19, Kewen.Lin wrote:
> Hi,
>
> on 2024/5/29 23:58, Carl Love wrote:
>> Updated the patch per the feedback comments from the previous version.
>>
>> Carl
>> -------------------------------------------------------
>>
>> rs6000, extend the current vec_{un,}signed{e,o} built-ins
>>
>> The built-ins __builtin_vsx_xvcvspsxds and __builtin_vsx_xvcvspuxds
>> convert a vector of floats to signed/unsigned long long ints. Extend the
>> existing vec_{un,}signed{e,o} built-ins to handle the argument
>> vector of floats to return the even/odd signed/unsigned integers.
>>
>> The define expands vsignede_v4sf, vsignedo_v4sf, vunsignede_v4sf,
>> vunsignedo_v4sf are added to support the new vec_{un,}signed{e,o}
>> built-ins.
>>
>> The built-ins __builtin_vsx_xvcvspsxds and __builtin_vsx_xvcvspuxds are
>> now for internal use only. They are not documented and they do not
>> have testcases.
>>
>> The built-in __builtin_vsx_xvcvdpsxws is redundant as it is covered by
>> vec_signed{e,o}, remove.
>>
>> The built-in __builtin_vsx_xvcvdpuxws is redundant as it is covered by
>> vec_unsigned{e,o}, remove.
>>
>> The built-in __builtin_vsx_xvcvdpuxds_uns is redundant as it is covered by
>> vec_unsigned, remove.
>>
>> The built-in __builtin_vsx_xvcvspuxws is redundant as it is covered by
>> vec_unsigned, remove.
>
> I prefer to move these removals into sub-patch 2/13 or split them out into
> a new patch, since they don't match the subject of this patch. Moving it
> to sub-patch 2/13 looks good as they are all about vec_{un,}signed{,e,o}.
Yes, we need to have all of the vec_unsigned changes in the same patch. Moved
the removal of __builtin_vsx_xvcvdpuxds_uns and __builtin_vsx_xvcvspuxws to
patch 2.
>
>>
>> Add testcases and update documentation.
>>
>> gcc/ChangeLog:
>> * config/rs6000/rs6000-builtins.def (__builtin_vsx_xvcvspsxds_low,
>> __builtin_vsx_xvcvspuxds_low): New built-in definitions.
>> (__builtin_vsx_xvcvspuxds): Fix return type.
>> (XVCVSPSXDS, XVCVSPUXDS): Renamed VEC_VSIGNEDE_V4SF,
>> VEC_VUNSIGNEDE_V4SF respectively.
>> (vsx_xvcvspsxds, vsx_xvcvspuxds): Renamed vsignede_v4sf,
>> vunsignede_v4sf respectively.
>> (__builtin_vsx_xvcvdpsxws, __builtin_vsx_xvcvdpuxws,
>> __builtin_vsx_xvcvdpuxds_uns, __builtin_vsx_xvcvspuxws): Removed.
>> * config/rs6000/rs6000-overload.def (vec_signede, vec_signedo,
>> vec_unsignede,vec_unsignedo): Add new overloaded specifications.
>> * config/rs6000/vsx.md (vsignede_v4sf, vsignedo_v4sf,
>> vunsignede_v4sf, vunsignedo_v4sf): New define_expands.
>> * doc/extend.texi (vec_signedo, vec_signede): Add documentation.
>>
>> gcc/testsuite/ChangeLog:
>> * gcc.target/powerpc/builtins-3-runnable.c: New tests for the added
>> overloaded built-ins.
>> ---
>> gcc/config/rs6000/rs6000-builtins.def | 25 ++----
>> gcc/config/rs6000/rs6000-overload.def | 8 ++
>> gcc/config/rs6000/vsx.md | 88 +++++++++++++++++++
>> gcc/doc/extend.texi | 10 +++
>> .../gcc.target/powerpc/builtins-3-runnable.c | 51 +++++++++--
>> 5 files changed, 157 insertions(+), 25 deletions(-)
>>
>> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
>> index bf9a0ae22fc..cea2649b86c 100644
>> --- a/gcc/config/rs6000/rs6000-builtins.def
>> +++ b/gcc/config/rs6000/rs6000-builtins.def
>> @@ -1688,32 +1688,23 @@
>> const vsll __builtin_vsx_xvcvdpsxds_scale (vd, const int);
>> XVCVDPSXDS_SCALE vsx_xvcvdpsxds_scale {}
>>
>> - const vsi __builtin_vsx_xvcvdpsxws (vd);
>> - XVCVDPSXWS vsx_xvcvdpsxws {}
>> -
>> - const vsll __builtin_vsx_xvcvdpuxds (vd);
>> - XVCVDPUXDS vsx_fixuns_truncv2dfv2di2 {}
>> -
>> const vsll __builtin_vsx_xvcvdpuxds_scale (vd, const int);
>> XVCVDPUXDS_SCALE vsx_xvcvdpuxds_scale {}
>>
>> - const vull __builtin_vsx_xvcvdpuxds_uns (vd);
>> - XVCVDPUXDS_UNS vsx_fixuns_truncv2dfv2di2 {}
>> -
>> - const vsi __builtin_vsx_xvcvdpuxws (vd);
>> - XVCVDPUXWS vsx_xvcvdpuxws {}
>> -
>> const vd __builtin_vsx_xvcvspdp (vf);
>> XVCVSPDP vsx_xvcvspdp {}
>>
>> const vsll __builtin_vsx_xvcvspsxds (vf);
>> - XVCVSPSXDS vsx_xvcvspsxds {}
>> + VEC_VSIGNEDE_V4SF vsignede_v4sf {}
>
> We should rename __builtin_vsx_xvcvspsxds to
> __builtin_vsx_vsignede_v4sf, one reason is to align with
> the existing others, one more important thing
> is that it doesn't generate 1-1 mapping xvcvspsxds,
> putting that mnemonic can be misleading.
Yes, that would be more consistent. Changed.
>
>> +
>> + const vsll __builtin_vsx_xvcvspsxds_low (vf);
>
> Ditto.
Changed.
>
>> + VEC_VSIGNEDO_V4SF vsignedo_v4sf {}
>>
>> - const vsll __builtin_vsx_xvcvspuxds (vf);
>> - XVCVSPUXDS vsx_xvcvspuxds {}
>> + const vull __builtin_vsx_xvcvspuxds (vf);
>
> Ditto.
Changed.
>
>> + VEC_VUNSIGNEDE_V4SF vunsignede_v4sf {}
>>
>> - const vsi __builtin_vsx_xvcvspuxws (vf);
>> - XVCVSPUXWS vsx_fixuns_truncv4sfv4si2 {}
>> + const vull __builtin_vsx_xvcvspuxds_low (vf);
>
> Ditto.
Changed.
>
>> + VEC_VUNSIGNEDO_V4SF vunsignedo_v4sf {}
>>
>> const vd __builtin_vsx_xvcvsxddp (vsll);
>> XVCVSXDDP vsx_floatv2div2df2 {}
>> diff --git a/gcc/config/rs6000/rs6000-overload.def b/gcc/config/rs6000/rs6000-overload.def
>> index 84bd9ae6554..4d857bb1af3 100644
>> --- a/gcc/config/rs6000/rs6000-overload.def
>> +++ b/gcc/config/rs6000/rs6000-overload.def
>> @@ -3307,10 +3307,14 @@
>> [VEC_SIGNEDE, vec_signede, __builtin_vec_vsignede]
>> vsi __builtin_vec_vsignede (vd);
>> VEC_VSIGNEDE_V2DF
>> + vsll __builtin_vec_vsignede (vf);
>> + VEC_VSIGNEDE_V4SF
>>
>> [VEC_SIGNEDO, vec_signedo, __builtin_vec_vsignedo]
>> vsi __builtin_vec_vsignedo (vd);
>> VEC_VSIGNEDO_V2DF
>> + vsll __builtin_vec_vsignedo (vf);
>> + VEC_VSIGNEDO_V4SF
>>
>> [VEC_SIGNEXTI, vec_signexti, __builtin_vec_signexti]
>> vsi __builtin_vec_signexti (vsc);
>> @@ -4433,10 +4437,14 @@
>> [VEC_UNSIGNEDE, vec_unsignede, __builtin_vec_vunsignede]
>> vui __builtin_vec_vunsignede (vd);
>> VEC_VUNSIGNEDE_V2DF
>> + vull __builtin_vec_vunsignede (vf);
>> + VEC_VUNSIGNEDE_V4SF
>>
>> [VEC_UNSIGNEDO, vec_unsignedo, __builtin_vec_vunsignedo]
>> vui __builtin_vec_vunsignedo (vd);
>> VEC_VUNSIGNEDO_V2DF
>> + vull __builtin_vec_vunsignedo (vf);
>> + VEC_VUNSIGNEDO_V4SF
>>
>> [VEC_VEE, vec_extract_exp, __builtin_vec_extract_exp]
>> vui __builtin_vec_extract_exp (vf);
>> diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
>> index f135fa079bd..a8f3d459232 100644
>> --- a/gcc/config/rs6000/vsx.md
>> +++ b/gcc/config/rs6000/vsx.md
>> @@ -2704,6 +2704,94 @@ (define_expand "vsx_xvcvsp<su>xds"
>> DONE;
>> })
>>
>> +;; Convert low vector elements of 32-bit floating point numbers to vector of
>> +;; 64-bit signed
>
> Maybe:
>
> ;; Convert float vector even elements to {un,}signed long long vector
Changed the four comments to the suggested pattern.
>
>> +(define_expand "vsignede_v4sf"
>> + [(match_operand:V2DI 0 "vsx_register_operand")
>> + (match_operand:V4SF 1 "vsx_register_operand")]
>> + "VECTOR_UNIT_VSX_P (V2DFmode)"
>> +{
>> + if (BYTES_BIG_ENDIAN)
>> + {
>> + /* Shift left one word to put even word in correct location */
>> + rtx rtx_tmp = gen_reg_rtx (V4SFmode);
>> + rtx rtx_val = GEN_INT (4);
>> + emit_insn (gen_altivec_vsldoi_v4sf (rtx_tmp, operands[1], operands[1],
>> + rtx_val));
>> + emit_insn (gen_vsx_xvcvspsxds_be (operands[0], rtx_tmp));
>> + }
>
> I think this is wrong, even elements on BE is word 0 and 2, it doesn't
> requires vector shifting (similar to doublee<mode>2), while LE needs.
OK, I went through this again and used gdb to look at how things get laid out
in the registers. I agree it looks like I have the shifting backwards for LE/BE
on the even/odd handling. Fixing this also requires fixing the expected results
in the corresponding test case, as they are backwards.
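
To double check the corrected expectations, here is a minimal scalar sketch in
plain C. It is only a model of the element-order semantics (even means
elements 0 and 2, odd means elements 1 and 3), not what the define_expands
emit, and the helper names model_signed_eo and model_signed_eo_check are
hypothetical. It uses the same input vector as the new tests.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model, not the GCC implementation: element i of the result is
   the truncated value of source element 2*i + first, where first is 0
   for the "even" variant and 1 for the "odd" variant.  */
static void
model_signed_eo (const float src[4], int first, int64_t dst[2])
{
  for (int i = 0; i < 2; i++)
    dst[i] = (int64_t) src[2 * i + first];  /* C casts truncate toward zero.  */
}

/* Check the model against the input vector used in the test case.  */
static void
model_signed_eo_check (void)
{
  const float v[4] = { 14.930f, 834.49f, -3.3f, -5.4f };
  int64_t even[2], odd[2];

  model_signed_eo (v, 0, even);
  model_signed_eo (v, 1, odd);

  /* Even elements are v[0] and v[2]; odd elements are v[1] and v[3].  */
  assert (even[0] == 14 && even[1] == -3);
  assert (odd[0] == 834 && odd[1] == -5);
}
```

Under this model the even variant yields {14, -3} and the odd variant yields
{834, -5}, which is the reverse of what the test case in this version of the
patch expects.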
>
>> + else
>> + emit_insn (gen_vsx_xvcvspsxds_le (operands[0], operands[1]));
>> +
>> + DONE;
>> +})
>> +
>> +;; Convert high vector elements of 32-bit floating point numbers to vector of
>> +;; 64-bit signed
>
> Ditto.
Changed.
>
>> +(define_expand "vsignedo_v4sf"
>> + [(match_operand:V2DI 0 "vsx_register_operand")
>> + (match_operand:V4SF 1 "vsx_register_operand")]
>> + "VECTOR_UNIT_VSX_P (V2DFmode)"
>> +{
>> + if (BYTES_BIG_ENDIAN)
>> + emit_insn (gen_vsx_xvcvspsxds_be (operands[0], operands[1]));
>
> As above, this is for odd elements, so BE needs vector shifting while LE
> doesn't.
Changed this file and the corresponding expected test case results.
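
The unsigned variants follow the same element selection, with the extra
wrinkle that negative inputs come back as 0 in the test's expected values. A
rough scalar sketch of that behavior is below. The helper names are
hypothetical, and the clamp at the top of the range is my assumption about
saturation rather than something the test case checks.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of a saturating float to unsigned 64-bit conversion.
   Negative values, zero and NaN map to 0, matching the expected values
   in the test.  The upper clamp is an assumption.  */
static uint64_t
model_float_to_u64 (float f)
{
  if (!(f > 0.0f))
    return 0;
  if (f >= 18446744073709551616.0f)  /* 2^64, assumed saturation point.  */
    return UINT64_MAX;
  return (uint64_t) f;               /* In range: truncates toward zero.  */
}

/* first is 0 for the "even" variant and 1 for the "odd" variant.  */
static void
model_unsigned_eo (const float src[4], int first, uint64_t dst[2])
{
  for (int i = 0; i < 2; i++)
    dst[i] = model_float_to_u64 (src[2 * i + first]);
}

/* Check the model against the input vector used in the test case.  */
static void
model_unsigned_eo_check (void)
{
  const float v[4] = { 14.930f, 834.49f, -3.3f, -5.4f };
  uint64_t even[2], odd[2];

  model_unsigned_eo (v, 0, even);
  model_unsigned_eo (v, 1, odd);

  assert (even[0] == 14 && even[1] == 0);
  assert (odd[0] == 834 && odd[1] == 0);
}
```

The explicit clamp is there because converting a negative float directly to an
unsigned type is undefined behavior in C, so a plain cast would not be a valid
model.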
>
> The vunsigned* below need the corresponding fixes.
>
>> + else
>> + {
>> + /* Shift left one word to put even word in correct location */
>> + rtx rtx_tmp = gen_reg_rtx (V4SFmode);
>> + rtx rtx_val = GEN_INT (4);
>> + emit_insn (gen_altivec_vsldoi_v4sf (rtx_tmp, operands[1], operands[1],
>> + rtx_val));
>> + emit_insn (gen_vsx_xvcvspsxds_le (operands[0], rtx_tmp));
>> + }
>> +
>> + DONE;
>> +})
>> +
>> +;; Convert low vector elements of 32-bit floating point numbers to vector of
>> +;; 64-bit unsigned integers.
Changed comment as suggested above.
>> +(define_expand "vunsignede_v4sf"
>> + [(match_operand:V2DI 0 "vsx_register_operand")
>> + (match_operand:V4SF 1 "vsx_register_operand")]
>> + "VECTOR_UNIT_VSX_P (V2DFmode)"
>> +{
>> + if (BYTES_BIG_ENDIAN)
>> + {
>> + /* Shift left one word to put even word in correct location */
>> + rtx rtx_tmp = gen_reg_rtx (V4SFmode);
>> + rtx rtx_val = GEN_INT (4);
>> + emit_insn (gen_altivec_vsldoi_v4sf (rtx_tmp, operands[1], operands[1],
>> + rtx_val));
>> + emit_insn (gen_vsx_xvcvspuxds_be (operands[0], rtx_tmp));
>> + }
>> + else
>> + emit_insn (gen_vsx_xvcvspuxds_le (operands[0], operands[1]));
>> +
>> + DONE;
>> +})
>> +
>> +;; Convert high vector elements of 32-bit floating point numbers to vector of
>> +;; 64-bit unsigned integers.
Changed comment as suggested above.
>> +(define_expand "vunsignedo_v4sf"
>> + [(match_operand:V2DI 0 "vsx_register_operand")
>> + (match_operand:V4SF 1 "vsx_register_operand")]
>> + "VECTOR_UNIT_VSX_P (V2DFmode)"
>> +{
>> + if (BYTES_BIG_ENDIAN)
>> + emit_insn (gen_vsx_xvcvspuxds_be (operands[0], operands[1]));
>> + else
>> + {
>> + /* Shift left one word to put even word in correct location */
>> + rtx rtx_tmp = gen_reg_rtx (V4SFmode);
>> + rtx rtx_val = GEN_INT (4);
>> + emit_insn (gen_altivec_vsldoi_v4sf (rtx_tmp, operands[1], operands[1],
>> + rtx_val));
>> + emit_insn (gen_vsx_xvcvspuxds_le (operands[0], rtx_tmp));
>> + }
>> +
>> + DONE;
>> +})
>> +
>> ;; Generate float2 double
>> ;; convert two double to float
>> (define_expand "float2_v2df"
>> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
>> index 267fccd1512..b88e61641a2 100644
>> --- a/gcc/doc/extend.texi
>> +++ b/gcc/doc/extend.texi
>> @@ -22577,6 +22577,16 @@ if the VSX instruction set is available. The @samp{vec_vsx_ld} and
>> @samp{vec_vsx_st} built-in functions always generate the VSX @samp{LXVD2X},
>> @samp{LXVW4X}, @samp{STXVD2X}, and @samp{STXVW4X} instructions.
>>
>> +@smallexample
>> +vector signed signed long long vec_signedo (vector float);
>> +vector signed signed long long vec_signede (vector float);
>> +vector unsigned signed long long vec_signedo (vector float);
>> +vector unsigned signed long long vec_signede (vector float);
>> +@end smallexample
>
> Nit: s/signed long/long/
Yes, a little verbose there... :-) Fixed.
>
> BR,
> Kewen
>
>> +
>> +The overloaded built-ins @code{vec_signedo} and @code{vec_signede} are
>> +additional extensions to the built-ins as documented in the PVIPR.
>> +
>> @node PowerPC AltiVec Built-in Functions Available on ISA 2.07
>> @subsubsection PowerPC AltiVec Built-in Functions Available on ISA 2.07
>>
>> diff --git a/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c b/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c
>> index 5dcdfbee791..557befc9a4a 100644
>> --- a/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c
>> +++ b/gcc/testsuite/gcc.target/powerpc/builtins-3-runnable.c
>> @@ -3,7 +3,7 @@
>> /* { dg-options "-maltivec -mvsx" } */
>>
>> #include <altivec.h> // vector
>> -
>> +#define DEBUG 1
>> #ifdef DEBUG
>> #include <stdio.h>
>> #endif
>> @@ -81,14 +81,15 @@ void test_unsigned_int_result(int check, vector unsigned int vec_result,
>> }
>>
>> void test_ll_int_result(vector long long int vec_result,
>> - vector long long int vec_expected)
>> + vector long long int vec_expected,
>> + char *string)
>> {
>> int i;
>>
>> for (i = 0; i < 2; i++)
>> if (vec_result[i] != vec_expected[i]) {
>> #ifdef DEBUG
>> - printf("Test_ll_int_result: ");
>> + printf("Test_ll_int_result %s: ", string);
>> printf("vec_result[%d] (%lld) != vec_expected[%d] (%lld)\n",
>> i, vec_result[i], i, vec_expected[i]);
>> #else
>> @@ -98,14 +99,15 @@ void test_ll_int_result(vector long long int vec_result,
>> }
>>
>> void test_ll_unsigned_int_result(vector long long unsigned int vec_result,
>> - vector long long unsigned int vec_expected)
>> + vector long long unsigned int vec_expected,
>> + char *string)
>> {
>> int i;
>>
>> for (i = 0; i < 2; i++)
>> if (vec_result[i] != vec_expected[i]) {
>> #ifdef DEBUG
>> - printf("Test_ll_unsigned_int_result: ");
>> + printf("Test_ll_unsigned_int_result %s: ", string);
>> printf("vec_result[%d] (%lld) != vec_expected[%d] (%lld)\n",
>> i, vec_result[i], i, vec_expected[i]);
>> #else
>> @@ -292,7 +294,8 @@ int main()
>> vec_dble0 = (vector double){-124.930, 81234.49};
>> vec_ll_int_expected = (vector long long signed int){-124, 81234};
>> vec_ll_int_result = vec_signed (vec_dble0);
>> - test_ll_int_result (vec_ll_int_result, vec_ll_int_expected);
>> + test_ll_int_result (vec_ll_int_result, vec_ll_int_expected,
>> + "vec_signed");
>>
>> /* Convert double precision vector float to vector int, even words */
>> vec_dble0 = (vector double){-124.930, 81234.49};
>> @@ -321,12 +324,44 @@ int main()
>> test_unsigned_int_result (ALL, vec_uns_int_result,
>> vec_uns_int_expected);
>>
>> + /* Convert single precision vector float, even args, to vector
>> + signed long long int. */
>> + vec_flt0 = (vector float){14.930, 834.49, -3.3, -5.4};
>> + vec_ll_int_expected = (vector signed long long int){834, -5};
>> + vec_ll_int_result = vec_signede (vec_flt0);
>> + test_ll_int_result (vec_ll_int_result, vec_ll_int_expected,
>> + "vec_signede");
>> +
>> + /* Convert single precision vector float, odd args, to vector
>> + signed long long int. */
>> + vec_flt0 = (vector float){14.930, 834.49, -3.3, -5.4};
>> + vec_ll_int_expected = (vector signed long long int){14, -3};
>> + vec_ll_int_result = vec_signedo (vec_flt0);
>> + test_ll_int_result (vec_ll_int_result, vec_ll_int_expected,
>> + "vec_signedo");
>> +
>> + /* Convert single precision vector float, even args, to vector
>> + unsigned long long int. */
>> + vec_flt0 = (vector float){14.930, 834.49, -3.3, -5.4};
>> + vec_ll_uns_int_expected = (vector unsigned long long int){834, 0};
>> + vec_ll_uns_int_result = vec_unsignede (vec_flt0);
>> + test_ll_unsigned_int_result (vec_ll_uns_int_result,
>> + vec_ll_uns_int_expected, "vec_unsignede");
>> +
>> + /* Convert single precision vector float, odd args, to vector
>> + unsigned long long int. */
>> + vec_flt0 = (vector float){14.930, 834.49, -3.3, -5.4};
>> + vec_ll_uns_int_expected = (vector unsigned long long int){14, 0};
>> + vec_ll_uns_int_result = vec_unsignedo (vec_flt0);
>> + test_ll_unsigned_int_result (vec_ll_uns_int_result,
>> + vec_ll_uns_int_expected, "vec_unsignedo");
>> +
>> /* Convert double precision float to long long unsigned int */
>> vec_dble0 = (vector double){124.930, 8134.49};
>> vec_ll_uns_int_expected = (vector long long unsigned int){124, 8134};
>> vec_ll_uns_int_result = vec_unsigned (vec_dble0);
>> test_ll_unsigned_int_result (vec_ll_uns_int_result,
>> - vec_ll_uns_int_expected);
>> + vec_ll_uns_int_expected, "vec_unsigned");
>>
>> /* Convert double precision float to long long unsigned int. Negative
>> arguments. */
>> @@ -334,7 +369,7 @@ int main()
>> vec_ll_uns_int_expected = (vector long long unsigned int){0, 0};
>> vec_ll_uns_int_result = vec_unsigned (vec_dble0);
>> test_ll_unsigned_int_result (vec_ll_uns_int_result,
>> - vec_ll_uns_int_expected);
>> + vec_ll_uns_int_expected, "vec_unsigned");
>>
>> /* Convert double precision vector float to vector unsigned int,
>> even words. Negative arguments */
>