[gcc r15-1121] doc: Remove link to www.amelek.gda.pl/avr/
https://gcc.gnu.org/g:ad2775b0e3d65b0b844bfd13e2f8b15240fb3b93

commit r15-1121-gad2775b0e3d65b0b844bfd13e2f8b15240fb3b93
Author: Gerald Pfeifer
Date: Sun Jun 9 09:24:14 2024 +0200

    doc: Remove link to www.amelek.gda.pl/avr/

    The entire server/site appears to have been gone for a while.

    gcc:
    * doc/install.texi (avr): Remove link to www.amelek.gda.pl/avr/.

Diff:
---
 gcc/doc/install.texi | 2 --
 1 file changed, 2 deletions(-)

diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 906c78aaca5..2addafd2465 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -4016,8 +4016,6 @@ can also be obtained from:
 @itemize @bullet
 @item
 @uref{http://www.nongnu.org/avr/,,http://www.nongnu.org/avr/}
-@item
-@uref{http://www.amelek.gda.pl/avr/,,http://www.amelek.gda.pl/avr/}
 @end itemize
 
 The following error:
[gcc r15-1122] i386: Implement .SAT_SUB for unsigned scalar integers [PR112600]
https://gcc.gnu.org/g:8bb6b2f4ae19c3aab7d7a5e5c8f5965f89d90e01

commit r15-1122-g8bb6b2f4ae19c3aab7d7a5e5c8f5965f89d90e01
Author: Uros Bizjak
Date: Sun Jun 9 12:09:13 2024 +0200

    i386: Implement .SAT_SUB for unsigned scalar integers [PR112600]

    The following testcase:

    unsigned
    sub_sat (unsigned x, unsigned y)
    {
      unsigned res;
      res = x - y;
      res &= -(x >= y);
      return res;
    }

    currently compiles (-O2) to:

    sub_sat:
            movl    %edi, %edx
            xorl    %eax, %eax
            subl    %esi, %edx
            cmpl    %esi, %edi
            setnb   %al
            negl    %eax
            andl    %edx, %eax
            ret

    We can expand through the ussub{m}3 optab to use the carry flag from
    the subtraction and generate code using the SBB instruction,
    implementing:

    unsigned res = x - y;
    res &= ~(-(x < y));

    sub_sat:
            subl    %esi, %edi
            sbbl    %eax, %eax
            notl    %eax
            andl    %edi, %eax
            ret

    PR target/112600

    gcc/ChangeLog:
    * config/i386/i386.md (ussub3): New expander.
    (sub_3): Ditto.

    gcc/testsuite/ChangeLog:
    * gcc.target/i386/pr112600-b.c: New test.

Diff:
---
 gcc/config/i386/i386.md                    | 31 ++-
 gcc/testsuite/gcc.target/i386/pr112600-b.c | 40 ++
 2 files changed, 70 insertions(+), 1 deletion(-)

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index bc2ef819df6..d69bc8d6e48 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -8436,6 +8436,14 @@
 "ix86_fixup_binary_operands_no_copy (MINUS, mode, operands, TARGET_APX_NDD);")
 
+(define_expand "sub_3"
+ [(parallel [(set (reg:CC FLAGS_REG)
+ (compare:CC
+(match_operand:SWI 1 "nonimmediate_operand")
+(match_operand:SWI 2 "")))
+ (set (match_operand:SWI 0 "register_operand")
+ (minus:SWI (match_dup 1) (match_dup 2)))])])
+
 (define_insn "*sub_3"
 [(set (reg FLAGS_REG)
 (compare (match_operand:SWI 1 "nonimmediate_operand" "0,0,rm,r")
@@ -9883,7 +9891,28 @@
 emit_insn (gen_add3_cc_overflow_1 (res, operands[1], operands[2]));
 emit_insn (gen_x86_movcc_0_m1_neg (msk));
 dst = expand_simple_binop (mode, IOR, res, msk,
-operands[0], 1, OPTAB_DIRECT);
+operands[0], 1, OPTAB_WIDEN);
+
+ if (!rtx_equal_p (dst, operands[0]))
+emit_move_insn (operands[0], dst);
+ DONE;
+})
+
+(define_expand "ussub3"
+ [(set (match_operand:SWI 0 "register_operand")
+ (us_minus:SWI (match_operand:SWI 1 "register_operand")
+ (match_operand:SWI 2 "")))]
+ ""
+{
+ rtx res = gen_reg_rtx (mode);
+ rtx msk = gen_reg_rtx (mode);
+ rtx dst;
+
+ emit_insn (gen_sub_3 (res, operands[1], operands[2]));
+ emit_insn (gen_x86_movcc_0_m1_neg (msk));
+ msk = expand_simple_unop (mode, NOT, msk, NULL, 1);
+ dst = expand_simple_binop (mode, AND, res, msk,
+operands[0], 1, OPTAB_WIDEN);
 
 if (!rtx_equal_p (dst, operands[0]))
 emit_move_insn (operands[0], dst);
diff --git a/gcc/testsuite/gcc.target/i386/pr112600-b.c b/gcc/testsuite/gcc.target/i386/pr112600-b.c
new file mode 100644
index 000..ea14bb9738b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr112600-b.c
@@ -0,0 +1,40 @@
+/* PR target/112600 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler-times "sbb" 4 } } */
+
+unsigned char
+sub_sat_char (unsigned char x, unsigned char y)
+{
+ unsigned char res;
+ res = x - y;
+ res &= -(x >= y);
+ return res;
+}
+
+unsigned short
+sub_sat_short (unsigned short x, unsigned short y)
+{
+ unsigned short res;
+ res = x - y;
+ res &= -(x >= y);
+ return res;
+}
+
+unsigned int
+sub_sat_int (unsigned int x, unsigned int y)
+{
+ unsigned int res;
+ res = x - y;
+ res &= -(x >= y);
+ return res;
+}
+
+unsigned long
+sub_sat_long (unsigned long x, unsigned long y)
+{
+ unsigned long res;
+ res = x - y;
+ res &= -(x >= y);
+ return res;
+}
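
The two mask formulations in the commit message above can be compared in plain C. The following is only a sketch for checking the arithmetic, not part of the patch: the function names (`sub_sat_ref`, `sub_sat_ge_mask`, `sub_sat_borrow_mask`, `check_sub_sat_forms`) are made up for illustration, and whether a compiler emits SBB for either form depends on the target and optimization level.

```c
#include <assert.h>
#include <limits.h>

/* Reference: saturating unsigned subtraction written with a branch.  */
unsigned sub_sat_ref(unsigned x, unsigned y)
{
    return x >= y ? x - y : 0u;
}

/* Form used in the testcase: keep the difference only when x >= y.  */
unsigned sub_sat_ge_mask(unsigned x, unsigned y)
{
    unsigned res = x - y;
    res &= -(x >= y);       /* all-ones when x >= y, zero otherwise */
    return res;
}

/* Form described for the SBB sequence: clear the result on borrow.  */
unsigned sub_sat_borrow_mask(unsigned x, unsigned y)
{
    unsigned res = x - y;
    res &= ~(-(x < y));     /* zero when x < y, all-ones otherwise */
    return res;
}

/* Compare all three forms on a few edge-case operands.  */
void check_sub_sat_forms(void)
{
    static const unsigned samples[] = {
        0u, 1u, 2u, UINT_MAX / 2, UINT_MAX / 2 + 1, UINT_MAX
    };
    const int n = (int)(sizeof samples / sizeof samples[0]);

    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++) {
            unsigned ref = sub_sat_ref(samples[i], samples[j]);
            assert(sub_sat_ge_mask(samples[i], samples[j]) == ref);
            assert(sub_sat_borrow_mask(samples[i], samples[j]) == ref);
        }
}
```

Calling `check_sub_sat_forms()` from a test driver exercises both mask forms against the branchy reference.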
[gcc r15-1123] [committed] [RISC-V] Fix false-positive uninitialized variable
https://gcc.gnu.org/g:932c6f8dd8859afb13475c2de466bd1a159530da

commit r15-1123-g932c6f8dd8859afb13475c2de466bd1a159530da
Author: Jeff Law
Date: Sun Jun 9 09:17:55 2024 -0600

    [committed] [RISC-V] Fix false-positive uninitialized variable

    Andreas noted we were getting an uninit warning after the recent
    constant synthesis changes. Essentially there's no way for the uninit
    analysis code to know the first entry in the CODES array is an UNKNOWN
    which will set X before its first use.

    So trivial initialization with NULL_RTX is the obvious fix.

    Pushed to the trunk.

    gcc/
    * config/riscv/riscv.cc (riscv_move_integer): Initialize "x".

Diff:
---
 gcc/config/riscv/riscv.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 95f3636f8e4..c17141d909a 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -2720,7 +2720,7 @@ riscv_move_integer (rtx temp, rtx dest, HOST_WIDE_INT value,
 struct riscv_integer_op codes[RISCV_MAX_INTEGER_OPS];
 machine_mode mode;
 int i, num_ops;
- rtx x;
+ rtx x = NULL_RTX;
 
 mode = GET_MODE (dest);
 /* We use the original mode for the riscv_build_integer call, because HImode
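
The pattern behind this warning can be reproduced outside GCC. The sketch below is a simplified, hypothetical model (the names `op_code`, `integer_op` and `apply_ops` are invented here and do not appear in riscv.cc): the first table entry always assigns `x`, but that invariant lives in the data, so a flow-based uninitialized-use analysis cannot see it and a trivial initializer is the simplest way to keep it quiet.

```c
#include <assert.h>

/* Hypothetical stand-ins for the operation table; not GCC's types.  */
enum op_code { OP_UNKNOWN, OP_PLUS, OP_ASHIFT };

struct integer_op {
    enum op_code code;
    long value;
};

long apply_ops(const struct integer_op *codes, int num_ops)
{
    /* Without "= 0" a compiler may warn that x is used uninitialized
       in the OP_PLUS and OP_ASHIFT cases, even though the first entry
       always takes the OP_UNKNOWN path and assigns x.  */
    long x = 0;

    /* The invariant the analysis cannot prove from the code alone.  */
    assert(num_ops > 0 && codes[0].code == OP_UNKNOWN);

    for (int i = 0; i < num_ops; i++) {
        switch (codes[i].code) {
        case OP_UNKNOWN:
            x = codes[i].value;      /* load the starting constant */
            break;
        case OP_PLUS:
            x += codes[i].value;
            break;
        case OP_ASHIFT:
            x <<= codes[i].value;
            break;
        }
    }
    return x;
}
```

For example, the table `{OP_UNKNOWN, 3}, {OP_ASHIFT, 4}, {OP_PLUS, 1}` builds the value 49 in three steps.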
[gcc r15-1124] FreeBSD: Stop linking _p libs for -pg as of FreeBSD 14
https://gcc.gnu.org/g:48abb540701447b0cd9df7542720ab65a34fc1b1

commit r15-1124-g48abb540701447b0cd9df7542720ab65a34fc1b1
Author: Andreas Tobler
Date: Sun Jun 9 23:18:04 2024 +0200

    FreeBSD: Stop linking _p libs for -pg as of FreeBSD 14

    As of FreeBSD version 14, FreeBSD no longer provides profiled system
    libraries like libc_p and libpthread_p. Stop linking against them if
    the FreeBSD major version is 14 or more.

    gcc:
    * config/freebsd-spec.h: Change fbsd-lib-spec for FreeBSD > 13,
    do not link against profiled system libraries if -pg is invoked.
    Add a define to note about this change.
    * config/aarch64/aarch64-freebsd.h: Use the note to inform if
    -pg is invoked on FreeBSD > 13.
    * config/arm/freebsd.h: Likewise.
    * config/i386/freebsd.h: Likewise.
    * config/i386/freebsd64.h: Likewise.
    * config/riscv/freebsd.h: Likewise.
    * config/rs6000/freebsd64.h: Likewise.
    * config/rs6000/sysv4.h: Likewise.

Diff:
---
 gcc/config/aarch64/aarch64-freebsd.h | 1 +
 gcc/config/arm/freebsd.h             | 1 +
 gcc/config/freebsd-spec.h            | 18 ++
 gcc/config/i386/freebsd.h            | 1 +
 gcc/config/i386/freebsd64.h          | 1 +
 gcc/config/riscv/freebsd.h           | 1 +
 gcc/config/rs6000/freebsd64.h        | 1 +
 gcc/config/rs6000/sysv4.h            | 1 +
 8 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-freebsd.h b/gcc/config/aarch64/aarch64-freebsd.h
index 53cc17a1caf..e26d69ce46c 100644
--- a/gcc/config/aarch64/aarch64-freebsd.h
+++ b/gcc/config/aarch64/aarch64-freebsd.h
@@ -35,6 +35,7 @@
 #undef FBSD_TARGET_LINK_SPEC
 #define FBSD_TARGET_LINK_SPEC " \
 %{p:%nconsider using `-pg' instead of `-p' with gprof (1)} \
+" FBSD_LINK_PG_NOTE " \
 %{v:-V} \
 %{assert*} %{R*} %{rpath*} %{defsym*} \
 %{shared:-Bshareable %{h*} %{soname*}} \
diff --git a/gcc/config/arm/freebsd.h b/gcc/config/arm/freebsd.h
index 9d0a5a842ab..ee4860ae637 100644
--- a/gcc/config/arm/freebsd.h
+++ b/gcc/config/arm/freebsd.h
@@ -47,6 +47,7 @@
 #undef LINK_SPEC
 #define LINK_SPEC "\
 %{p:%nconsider using `-pg' instead of `-p' with gprof (1)} \
+ " FBSD_LINK_PG_NOTE " \
 %{v:-V} \
 %{assert*} %{R*} %{rpath*} %{defsym*} \
 %{shared:-Bshareable %{h*} %{soname*}} \
diff --git a/gcc/config/freebsd-spec.h b/gcc/config/freebsd-spec.h
index a6d1ad1280f..f43056bf2cf 100644
--- a/gcc/config/freebsd-spec.h
+++ b/gcc/config/freebsd-spec.h
@@ -92,19 +92,29 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
 libc, depending on whether we're doing profiling or need threads
 support. (similar to the default, except no -lg, and no -p). */
 
+#if FBSD_MAJOR < 14
+#define FBSD_LINK_PG_NOTHREADS "%{!pg: -lc} %{pg: -lc_p}"
+#define FBSD_LINK_PG_THREADS "%{!pg: %{pthread:-lpthread} -lc} " \
+ "%{pg: %{pthread:-lpthread} -lc_p}"
+#define FBSD_LINK_PG_NOTE ""
+#else
+#define FBSD_LINK_PG_NOTHREADS "%{-lc} "
+#define FBSD_LINK_PG_THREADS "%{pthread:-lpthread} -lc "
+#define FBSD_LINK_PG_NOTE "%{pg:%nFreeBSD no longer provides profiled "\
+ "system libraries}"
+#endif
+
 #ifdef FBSD_NO_THREADS
 #define FBSD_LIB_SPEC " \
 %{pthread: %eThe -pthread option is only supported on FreeBSD when gcc \
 is built with the --enable-threads configure-time option.} \
 %{!shared: \
-%{!pg: -lc} \
-%{pg: -lc_p} \
+" FBSD_LINK_PG_NOTHREADS " \
 }"
 #else
 #define FBSD_LIB_SPEC " \
 %{!shared: \
-%{!pg: %{pthread:-lpthread} -lc} \
-%{pg: %{pthread:-lpthread_p} -lc_p} \
+" FBSD_LINK_PG_THREADS " \
 }\
 %{shared:\
 %{pthread:-lpthread} -lc \
diff --git a/gcc/config/i386/freebsd.h b/gcc/config/i386/freebsd.h
index 3c57dc7cfae..583c752bb76 10
[gcc r14-10294] libstdc++: Use __builtin_shufflevector for simd split and concat
https://gcc.gnu.org/g:ff4646793f2805f0c66705469becdfdd4b5356d1

commit r14-10294-gff4646793f2805f0c66705469becdfdd4b5356d1
Author: Matthias Kretz
Date: Mon May 6 12:13:55 2024 +0200

    libstdc++: Use __builtin_shufflevector for simd split and concat

    Signed-off-by: Matthias Kretz

    libstdc++-v3/ChangeLog:

    PR libstdc++/114958
    * include/experimental/bits/simd.h (__as_vector): Return scalar
    simd as one-element vector. Return vector from single-vector
    fixed_size simd.
    (__vec_shuffle): New.
    (__extract_part): Adjust return type signature.
    (split): Use __extract_part for any split into non-fixed_size
    simds.
    (concat): If the return type stores a single vector, use
    __vec_shuffle (which calls __builtin_shufflevector) to produce
    the return value.
    * include/experimental/bits/simd_builtin.h
    (__shift_elements_right): Removed.
    (__extract_part): Return single elements directly. Use
    __vec_shuffle (which calls __builtin_shufflevector) for all
    non-trivial cases.
    * include/experimental/bits/simd_fixed_size.h (__extract_part):
    Return single elements directly.
    * testsuite/experimental/simd/pr114958.cc: New test.

(cherry picked from commit fb1649f8b4ad5043dd0e65e4e3a643a0ced018a9) Diff: --- libstdc++-v3/include/experimental/bits/simd.h | 161 +++-- .../include/experimental/bits/simd_builtin.h | 152 +-- .../include/experimental/bits/simd_fixed_size.h| 4 +- .../testsuite/experimental/simd/pr114958.cc| 20 +++ 4 files changed, 145 insertions(+), 192 deletions(-) diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h index 6ef9c955cfa..6a6fd4f109d 100644 --- a/libstdc++-v3/include/experimental/bits/simd.h +++ b/libstdc++-v3/include/experimental/bits/simd.h @@ -1651,7 +1651,24 @@ template if constexpr (__is_vector_type_v<_V>) return __x; else if constexpr (is_simd<_V>::value || is_simd_mask<_V>::value) - return __data(__x)._M_data; + { + if constexpr (__is_fixed_size_abi_v) + { + static_assert(is_simd<_V>::value); + static_assert(_V::abi_type::template __traits< + typename _V::value_type>::_SimdMember::_S_tuple_size == 1); + return __as_vector(__data(__x).first); + } + else if constexpr (_V::size() > 1) + return __data(__x)._M_data; + else + { + static_assert(is_simd<_V>::value); + using _Tp = typename _V::value_type; + using _RV [[__gnu__::__vector_size__(sizeof(_Tp))]] = _Tp; + return _RV{__data(__x)}; + } + } else if constexpr (__is_vectorizable_v<_V>) return __vector_type_t<_V, 2>{__x}; else @@ -2061,6 +2078,60 @@ template > return ~__a; } +// }}} +// __vec_shuffle{{{ +template + _GLIBCXX_SIMD_INTRINSIC constexpr auto + __vec_shuffle(_T0 __x, _T1 __y, index_sequence<_Is...> __seq, _Fun __idx_perm) + { +constexpr int _N0 = sizeof(__x) / sizeof(__x[0]); +constexpr int _N1 = sizeof(__y) / sizeof(__y[0]); +#if __has_builtin(__builtin_shufflevector) +#ifdef __clang__ +// Clang requires _T0 == _T1 +if constexpr (sizeof(__x) > sizeof(__y) and _N1 == 1) + return __vec_shuffle(__x, _T0{__y[0]}, __seq, __idx_perm); +else if constexpr (sizeof(__x) > sizeof(__y)) + return __vec_shuffle(__x, __intrin_bitcast<_T0>(__y), __seq, __idx_perm); 
+else if constexpr (sizeof(__x) < sizeof(__y) and _N0 == 1) + return __vec_shuffle(_T1{__x[0]}, __y, __seq, [=](int __i) { + __i = __idx_perm(__i); + return __i < _N0 ? __i : __i - _N0 + _N1; +}); +else if constexpr (sizeof(__x) < sizeof(__y)) + return __vec_shuffle(__intrin_bitcast<_T1>(__x), __y, __seq, [=](int __i) { + __i = __idx_perm(__i); + return __i < _N0 ? __i : __i - _N0 + _N1; +}); +else +#endif + return __builtin_shufflevector(__x, __y, [=] { + constexpr int __j = __idx_perm(_Is); + static_assert(__j < _N0 + _N1); + return __j; +}()...); +#else +using _Tp = __remove_cvref_t; +return __vector_type_t<_Tp, sizeof...(_Is)> { + [=]() -> _Tp { + constexpr int __j = __idx_perm(_Is); + static_assert(__j < _N0 + _N1); + if constexpr (__j < 0) + return 0; + else if constexpr (__j < _N0) + return __x[__j]; + else + return __y[__j - _N0]; + }()... +}; +#endif + } + +template + _GLIBCXX_SIMD_INTRINSIC constexpr auto + __vec_shuffle(_T0 __x, _Seq __seq, _Fun __idx_perm) + { return __vec_shuffle(__x, _T0(), __seq, __idx_perm); } + // }}} // __concat{{{ te
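
For readers unfamiliar with the builtin this commit relies on, here is a small C sketch of concatenating and splitting vectors with `__builtin_shufflevector`. It is an illustration under assumptions, not the libstdc++ code: the typedefs and the helpers `concat_v2si`, `upper_half_v4si` and `check_concat_split` are invented for this example, it needs the GNU `vector_size` extension (GCC or Clang), and it falls back to element-wise construction when the builtin is unavailable, in the same spirit as the `#else` branch in the diff.

```c
#include <assert.h>

#ifndef __has_builtin
#define __has_builtin(x) 0   /* older compilers: take the fallback path */
#endif

/* GNU vector types holding two and four ints (illustrative names).  */
typedef int v2si __attribute__((vector_size(2 * sizeof(int))));
typedef int v4si __attribute__((vector_size(4 * sizeof(int))));

/* Concatenate two 2-element vectors into one 4-element vector.
   Indices 0..1 select from LO, indices 2..3 select from HI.  */
static inline v4si concat_v2si(v2si lo, v2si hi)
{
#if __has_builtin(__builtin_shufflevector)
    return __builtin_shufflevector(lo, hi, 0, 1, 2, 3);
#else
    v4si r = { lo[0], lo[1], hi[0], hi[1] };
    return r;
#endif
}

/* Extract the upper two elements of a 4-element vector.  */
static inline v2si upper_half_v4si(v4si v)
{
#if __has_builtin(__builtin_shufflevector)
    return __builtin_shufflevector(v, v, 2, 3);
#else
    v2si r = { v[2], v[3] };
    return r;
#endif
}

/* Round trip: concat two halves, then split the upper half back out.  */
static inline void check_concat_split(void)
{
    v2si lo = { 1, 2 }, hi = { 3, 4 };
    v4si all = concat_v2si(lo, hi);
    v2si back = upper_half_v4si(all);

    assert(all[0] == 1 && all[1] == 2 && all[2] == 3 && all[3] == 4);
    assert(back[0] == 3 && back[1] == 4);
}
```

The number of indices passed to the builtin determines the element count of the result, which is why the same primitive can serve both concat and split.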
[gcc r14-10295] libstdc++: Avoid MMX return types from __builtin_shufflevector
https://gcc.gnu.org/g:237f060033bc119461c43aae482254463f01b29e commit r14-10295-g237f060033bc119461c43aae482254463f01b29e Author: Matthias Kretz Date: Wed May 15 11:02:22 2024 +0200 libstdc++: Avoid MMX return types from __builtin_shufflevector This resolves a regression on i686 that was introduced with r15-429-gfb1649f8b4ad50. Signed-off-by: Matthias Kretz libstdc++-v3/ChangeLog: PR libstdc++/115247 * include/experimental/bits/simd.h (__as_vector): Don't use vector_size(8) on __i386__. (__vec_shuffle): Never return MMX vectors, widen to 16 bytes instead. (concat): Fix padding calculation to pick up widening logic from __as_vector. (cherry picked from commit 241a6cc88d866fb36bd35ddb3edb659453d6322e) Diff: --- libstdc++-v3/include/experimental/bits/simd.h | 39 +++ 1 file changed, 28 insertions(+), 11 deletions(-) diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h index 6a6fd4f109d..7c524625719 100644 --- a/libstdc++-v3/include/experimental/bits/simd.h +++ b/libstdc++-v3/include/experimental/bits/simd.h @@ -1665,7 +1665,12 @@ template { static_assert(is_simd<_V>::value); using _Tp = typename _V::value_type; +#ifdef __i386__ + constexpr auto __bytes = sizeof(_Tp) == 8 ? 
16 : sizeof(_Tp); + using _RV [[__gnu__::__vector_size__(__bytes)]] = _Tp; +#else using _RV [[__gnu__::__vector_size__(sizeof(_Tp))]] = _Tp; +#endif return _RV{__data(__x)}; } } @@ -2081,11 +2086,14 @@ template > // }}} // __vec_shuffle{{{ template - _GLIBCXX_SIMD_INTRINSIC constexpr auto + _GLIBCXX_SIMD_INTRINSIC constexpr + __vector_type_t()[0])>, sizeof...(_Is)> __vec_shuffle(_T0 __x, _T1 __y, index_sequence<_Is...> __seq, _Fun __idx_perm) { constexpr int _N0 = sizeof(__x) / sizeof(__x[0]); constexpr int _N1 = sizeof(__y) / sizeof(__y[0]); +using _Tp = remove_reference_t()[0])>; +using _RV [[maybe_unused]] = __vector_type_t<_Tp, sizeof...(_Is)>; #if __has_builtin(__builtin_shufflevector) #ifdef __clang__ // Clang requires _T0 == _T1 @@ -2105,14 +2113,23 @@ template }); else #endif - return __builtin_shufflevector(__x, __y, [=] { - constexpr int __j = __idx_perm(_Is); - static_assert(__j < _N0 + _N1); - return __j; -}()...); + { + const auto __r = __builtin_shufflevector(__x, __y, [=] { + constexpr int __j = __idx_perm(_Is); + static_assert(__j < _N0 + _N1); + return __j; +}()...); +#ifdef __i386__ + if constexpr (sizeof(__r) == sizeof(_RV)) + return __r; + else + return _RV {__r[_Is]...}; +#else + return __r; +#endif + } #else -using _Tp = __remove_cvref_t; -return __vector_type_t<_Tp, sizeof...(_Is)> { +return _RV { [=]() -> _Tp { constexpr int __j = __idx_perm(_Is); static_assert(__j < _N0 + _N1); @@ -4393,9 +4410,9 @@ template __vec_shuffle(__as_vector(__xs)..., std::make_index_sequence<_RW::_S_full_size>(), [](int __i) { constexpr int __sizes[2] = {int(simd_size_v<_Tp, _As>)...}; - constexpr int __padding0 - = sizeof(__vector_type_t<_Tp, __sizes[0]>) / sizeof(_Tp) - - __sizes[0]; + constexpr int __vsizes[2] + = {int(sizeof(__as_vector(__xs)) / sizeof(_Tp))...}; + constexpr int __padding0 = __vsizes[0] - __sizes[0]; return __i >= _Np ? -1 : __i < __sizes[0] ? __i : __i + __padding0; })}; }
[gcc r14-10296] libstdc++: Fix simd conversion for -fno-signed-char for Clang
https://gcc.gnu.org/g:489b58b79782fa361c0d7e852e0e684d743c8399

commit r14-10296-g489b58b79782fa361c0d7e852e0e684d743c8399
Author: Matthias Kretz
Date: Mon Jun 3 12:02:07 2024 +0200

    libstdc++: Fix simd conversion for -fno-signed-char for Clang

    The special case for Clang in the trait producing a signed integer
    type led to the trait returning 'char' where it should have been
    'signed char'. This workaround was introduced because on Clang the
    return type of vector compares was not convertible to
    '_SimdWrapper< __int_for_sizeof_t<...' unless '__int_for_sizeof_t'
    was an alias for 'char'. In order to not rewrite the complete mask
    type code (there is code scattered around the implementation
    assuming signed integers), this needs to be 'signed char'; so the
    special case for Clang needs to be removed.
    The conversion issue is now solved in _SimdWrapper, which now
    additionally allows conversion from vector types with compatible
    integral type.

    Signed-off-by: Matthias Kretz

    libstdc++-v3/ChangeLog:

    PR libstdc++/115308
    * include/experimental/bits/simd.h (__int_for_sizeof): Remove
    special cases for __clang__.
    (_SimdWrapper): Change constructor overload set to allow
    conversion from vector types with integral conversions via bit
    reinterpretation.

(cherry picked from commit 8e36cf4c5c9140915d001db132a900b48037) Diff: --- libstdc++-v3/include/experimental/bits/simd.h | 45 --- 1 file changed, 27 insertions(+), 18 deletions(-) diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h index 7c524625719..cb1f13d8ba6 100644 --- a/libstdc++-v3/include/experimental/bits/simd.h +++ b/libstdc++-v3/include/experimental/bits/simd.h @@ -606,19 +606,12 @@ template static_assert(_Bytes > 0); if constexpr (_Bytes == sizeof(int)) return int(); - #ifdef __clang__ -else if constexpr (_Bytes == sizeof(char)) - return char(); - #else else if constexpr (_Bytes == sizeof(_SChar)) return _SChar(); - #endif else if constexpr (_Bytes == sizeof(short)) return short(); - #ifndef __clang__ else if constexpr (_Bytes == sizeof(long)) return long(); - #endif else if constexpr (_Bytes == sizeof(_LLong)) return _LLong(); #ifdef __SIZEOF_INT128__ @@ -2747,6 +2740,8 @@ template // }}} // _SimdWrapper{{{ +struct _DisabledSimdWrapper; + template struct _SimdWrapper< _Tp, _Width, @@ -2756,16 +2751,17 @@ template == sizeof(__vector_type_t<_Tp, _Width>), __vector_type_t<_Tp, _Width>> { -using _Base - = _SimdWrapperBase<__has_iec559_behavior<__signaling_NaN, _Tp>::value - && sizeof(_Tp) * _Width - == sizeof(__vector_type_t<_Tp, _Width>), -__vector_type_t<_Tp, _Width>>; +static constexpr bool _S_need_default_init + = __has_iec559_behavior<__signaling_NaN, _Tp>::value + and sizeof(_Tp) * _Width == sizeof(__vector_type_t<_Tp, _Width>); + +using _BuiltinType = __vector_type_t<_Tp, _Width>; + +using _Base = _SimdWrapperBase<_S_need_default_init, _BuiltinType>; static_assert(__is_vectorizable_v<_Tp>); static_assert(_Width >= 2); // 1 doesn't make sense, use _Tp directly then -using _BuiltinType = __vector_type_t<_Tp, _Width>; using value_type = _Tp; static inline constexpr size_t _S_full_size @@ -2801,13 +2797,26 @@ template _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper& operator=(_SimdWrapper&&) = 
default; -template >, -is_same<_V, __intrinsic_type_t<_Tp, _Width> +// Convert from exactly matching __vector_type_t +using _SimdWrapperBase<_S_need_default_init, _BuiltinType>::_SimdWrapperBase; + +// Convert from __intrinsic_type_t if __intrinsic_type_t and __vector_type_t differ, otherwise +// this ctor should not exist. Making the argument type unusable is our next best solution. +_GLIBCXX_SIMD_INTRINSIC constexpr +_SimdWrapper(conditional_t>, + _DisabledSimdWrapper, __intrinsic_type_t<_Tp, _Width>> __x) +: _Base(__vector_bitcast<_Tp, _Width>(__x)) {} + +// Convert from different __vector_type_t, but only if bit reinterpretation is a correct +// conversion of the value_type +template , + typename = enable_if_t + and is_integral_v>> _GLIBCXX_SIMD_INTRINSIC constexpr _SimdWrapper(_V __x) - // __vector_bitcast can convert e.g. __m128 to __vector(2) float - : _Base(__vector_bitcast<_Tp, _Width>(__x)) {} + : _Base(reinterpret_cast<_BuiltinType>(__x)) {} template && ...)
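
The distinction between 'char' and 'signed char' that this commit turns on can be seen in a few lines of C. This is a sketch for illustration only (the helper names are invented and nothing here is taken from simd.h): plain `char` changes signedness with the target or with flags such as `-funsigned-char`, while `signed char` does not, so an all-ones mask lane reads as -1 only when the type is guaranteed signed.

```c
#include <assert.h>
#include <limits.h>

/* Nonzero when plain char is a signed type in this compilation.  */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* An all-ones byte, as a "true" mask lane, read through signed char.  */
int true_lane_as_signed_char(void)
{
    signed char lane = -1;
    return lane;              /* always -1 */
}

/* The same byte read through plain char: -1 or UCHAR_MAX,
   depending on the signedness of char.  */
int true_lane_as_plain_char(void)
{
    char lane = (char)-1;
    return lane;
}

/* Code that assumes signed mask elements only holds for signed char.  */
void check_mask_lane_sign(void)
{
    assert(true_lane_as_signed_char() < 0);
    assert(true_lane_as_plain_char()
           == (plain_char_is_signed() ? -1 : (int)UCHAR_MAX));
}
```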
[gcc r15-1126] tree-optimization/115383 - EXTRACT_LAST_REDUCTION with multiple stmt copies
https://gcc.gnu.org/g:c1429e3a8da0cdfe9391e1e9b2c7228d896a3a87

commit r15-1126-gc1429e3a8da0cdfe9391e1e9b2c7228d896a3a87
Author: Richard Biener
Date: Fri Jun 7 12:15:31 2024 +0200

    tree-optimization/115383 - EXTRACT_LAST_REDUCTION with multiple stmt copies

    The EXTRACT_LAST_REDUCTION code isn't ready to deal with multiple stmt
    copies but SLP no longer checks for this. The following adjusts
    code generation to handle the situation.

    PR tree-optimization/115383
    * tree-vect-stmts.cc (vectorizable_condition): Handle
    generating a chain of .FOLD_EXTRACT_LAST.

    * gcc.dg/vect/pr115383.c: New testcase.

Diff:
---
 gcc/testsuite/gcc.dg/vect/pr115383.c | 20
 gcc/tree-vect-stmts.cc               | 20 +++-
 2 files changed, 35 insertions(+), 5 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/vect/pr115383.c b/gcc/testsuite/gcc.dg/vect/pr115383.c
new file mode 100644
index 000..92c24699146
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr115383.c
@@ -0,0 +1,20 @@
+#include "tree-vect.h"
+
+int __attribute__((noipa))
+s331 (int i, int n)
+{
+ int j = 0;
+ for (; i < n; i++)
+if ((float)i < 0.)
+ j = i;
+ return j;
+}
+
+int main()
+{
+ check_vect ();
+ int j = s331(-13, 17);
+ if (j != -1)
+abort ();
+ return 0;
+}
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 5098b7fab6a..05a169ecb2d 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -12415,6 +12415,9 @@ vectorizable_condition (vec_info *vinfo,
 reduction_type != EXTRACT_LAST_REDUCTION
 ? else_clause : NULL, vectype, &vec_oprnds3);
 
+ if (reduction_type == EXTRACT_LAST_REDUCTION)
+vec_else_clause = else_clause;
+
 /* Arguments are ready. Create the new vector stmt. */
 FOR_EACH_VEC_ELT (vec_oprnds0, i, vec_cond_lhs)
 {
@@ -12557,17 +12560,24 @@ vectorizable_condition (vec_info *vinfo,
 {
 gimple *old_stmt = vect_orig_stmt (stmt_info)->stmt;
 tree lhs = gimple_get_lhs (old_stmt);
+ if ((unsigned)i != vec_oprnds0.length () - 1)
+ lhs = copy_ssa_name (lhs);
 if (len)
 new_stmt = gimple_build_call_internal
- (IFN_LEN_FOLD_EXTRACT_LAST, 5, else_clause, vec_compare,
-vec_then_clause, len, bias);
+ (IFN_LEN_FOLD_EXTRACT_LAST, 5, vec_else_clause, vec_compare,
+vec_then_clause, len, bias);
 else
 new_stmt = gimple_build_call_internal
- (IFN_FOLD_EXTRACT_LAST, 3, else_clause, vec_compare,
-vec_then_clause);
+ (IFN_FOLD_EXTRACT_LAST, 3, vec_else_clause, vec_compare,
+vec_then_clause);
 gimple_call_set_lhs (new_stmt, lhs);
 SSA_NAME_DEF_STMT (lhs) = new_stmt;
- if (old_stmt == gsi_stmt (*gsi))
+ if ((unsigned)i != vec_oprnds0.length () - 1)
+ {
+ vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi);
+ vec_else_clause = lhs;
+ }
+ else if (old_stmt == gsi_stmt (*gsi))
 vect_finish_replace_stmt (vinfo, stmt_info, new_stmt);
 else
 {
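
The chaining that this change generates can be modelled in scalar C. The sketch below is a hypothetical model, not vectorizer code: `LANES`, `fold_extract_last` and `extract_last_chain` are names invented here, and each "vector copy" is just a group of `LANES` array elements. It shows why each copy's result must be fed to the next copy as its fallback ("else") value instead of every copy using the original one.

```c
#include <assert.h>

enum { LANES = 4 };   /* assumed vector length for this model */

/* Scalar model of one fold-extract-last step: return the last element
   of VEC whose MASK lane is set, or ELSE_VAL if no lane is set.  */
int fold_extract_last(int else_val, const int *mask, const int *vec)
{
    int result = else_val;
    for (int lane = 0; lane < LANES; lane++)
        if (mask[lane])
            result = vec[lane];
    return result;
}

/* Chain NCOPIES steps: the result of copy k becomes the fallback
   value of copy k + 1, so an all-false later copy keeps the value
   found by an earlier one.  */
int extract_last_chain(int init, const int *mask, const int *vec,
                       int ncopies)
{
    assert(ncopies >= 1);

    int carried = init;
    for (int copy = 0; copy < ncopies; copy++)
        carried = fold_extract_last(carried,
                                    mask + copy * LANES,
                                    vec + copy * LANES);
    return carried;
}
```

With two copies and a mask that is set only in the first copy, the chain still returns the element selected there; without chaining, the second copy would fall back to the initial value.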
[gcc r15-1127] MIPS/testsuite: add -mno-branch-likely to r10k-cache-barrier-13.c
https://gcc.gnu.org/g:8e2eb6039d183b7c571da9eb83b933021c5b29be

commit r15-1127-g8e2eb6039d183b7c571da9eb83b933021c5b29be
Author: YunQiang Su
Date: Sat Jun 8 16:05:13 2024 +0800

    MIPS/testsuite: add -mno-branch-likely to r10k-cache-barrier-13.c

    In mips.cc(mips_reorg_process_insns), there is this claim:

    Also delete cache barriers if the last instruction
    was an annulled branch. INSN will not be speculatively
    executed.

    And with -O1 on mips64, we can generate binary code like this,
    which fails this test.

    gcc/testsuite
    * gcc.target/mips/r10k-cache-barrier-13.c: Add
    -mno-branch-likely option.

Diff:
---
 gcc/testsuite/gcc.target/mips/r10k-cache-barrier-13.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/mips/r10k-cache-barrier-13.c b/gcc/testsuite/gcc.target/mips/r10k-cache-barrier-13.c
index ee9c84b5988..ac005fb08b3 100644
--- a/gcc/testsuite/gcc.target/mips/r10k-cache-barrier-13.c
+++ b/gcc/testsuite/gcc.target/mips/r10k-cache-barrier-13.c
@@ -1,4 +1,4 @@
-/* { dg-options "-mr10k-cache-barrier=store" } */
+/* { dg-options "-mr10k-cache-barrier=store -mno-branch-likely" } */
 
 /* Test that indirect calls are protected. */
[gcc r15-1128] libgcc/aarch64: also provide AT_HWCAP2 fallback
https://gcc.gnu.org/g:48d6d8c9e91018a625a797d50ac4def88376a515

commit r15-1128-g48d6d8c9e91018a625a797d50ac4def88376a515
Author: Jan Beulich
Date: Mon Jun 10 08:47:58 2024 +0200

    libgcc/aarch64: also provide AT_HWCAP2 fallback

    Provide a fallback definition, much like the one already provided for
    AT_HWCAP, in case the platform headers don't have the value (yet).

    libgcc/
    * config/aarch64/cpuinfo.c: Provide AT_HWCAP2.

Diff:
---
 libgcc/config/aarch64/cpuinfo.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/libgcc/config/aarch64/cpuinfo.c b/libgcc/config/aarch64/cpuinfo.c
index 544c5516133..ec36d105738 100644
--- a/libgcc/config/aarch64/cpuinfo.c
+++ b/libgcc/config/aarch64/cpuinfo.c
@@ -146,6 +146,9 @@ struct {
 #define HWCAP_PACG (1UL << 31)
 #endif
 
+#ifndef AT_HWCAP2
+#define AT_HWCAP2 26
+#endif
 #ifndef HWCAP2_DCPODP
 #define HWCAP2_DCPODP (1 << 0)
 #endif
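
The fallback pattern from this patch can be tried in a standalone program. The sketch below assumes a Linux system whose C library provides `getauxval` in `<sys/auxv.h>`; the helper names `read_hwcap2`, `has_dcpodp` and `check_at_hwcap2_tag` are invented for this example and are not part of cpuinfo.c. The `HWCAP2_DCPODP` bit is only meaningful on AArch64; on other architectures the same bit position means something else.

```c
#include <assert.h>
#include <sys/auxv.h>

/* Same fallbacks as in the patch, for headers that predate them.  */
#ifndef AT_HWCAP2
#define AT_HWCAP2 26
#endif
#ifndef HWCAP2_DCPODP
#define HWCAP2_DCPODP (1 << 0)
#endif

/* Read the second hardware-capability word from the auxiliary vector.
   getauxval returns 0 when the kernel did not supply the entry.  */
unsigned long read_hwcap2(void)
{
    return getauxval(AT_HWCAP2);
}

/* Nonzero when the DCPODP bit is set (AArch64 interpretation).  */
int has_dcpodp(void)
{
    return (read_hwcap2() & HWCAP2_DCPODP) != 0;
}

/* The tag value is 26 whether it came from the system headers or
   from the fallback definition above.  */
void check_at_hwcap2_tag(void)
{
    assert(AT_HWCAP2 == 26);
}
```

Guarding each definition with `#ifndef` keeps the file building against both old and new headers without changing behaviour where the headers already supply the value.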