[gcc r15-7026] c++: Copy over further 2 flags for !TREE_PUBLIC in copy_linkage [PR118513]
https://gcc.gnu.org/g:20a4306793e4978dfff13ca669739eb46915d4e4

commit r15-7026-g20a4306793e4978dfff13ca669739eb46915d4e4
Author: Jakub Jelinek
Date: Sat Jan 18 21:50:23 2025 +0100

c++: Copy over further 2 flags for !TREE_PUBLIC in copy_linkage [PR118513]

The following testcase ICEs in import_export_decl.
When cp_finish_decomp handles std::tuple* using structured binding,
it calls copy_linkage to copy various VAR_DECL flags from the structured
binding base to the individual sb variables.
In this case the base variable is in an anonymous namespace, so we call
constrain_visibility (..., VISIBILITY_ANON, ...) on it, which e.g.
clears TREE_PUBLIC etc. (flags which copy_linkage copies), but
copy_linkage doesn't copy over DECL_INTERFACE_KNOWN/DECL_NOT_REALLY_EXTERN.
When cp_finish_decl calls determine_visibility on the individual sb
variables, those are !TREE_PUBLIC because of copy_linkage, and so nothing
tries to determine visibility and nothing sets DECL_INTERFACE_KNOWN and
DECL_NOT_REALLY_EXTERN.
Now, this isn't a big deal without modules: the individual variables are
var_finalized_p and so nothing really cares about the missing
DECL_INTERFACE_KNOWN. But in the module case the variables are streamed
out and in, and the streaming cares about those bits.

The following patch is an attempt to copy over those flags as well (but
I've limited it to the !TREE_PUBLIC case just in case). Other options
would be to copy them unconditionally, or to call constrain_visibility
with VISIBILITY_ANON for !TREE_PUBLIC (but do all !TREE_PUBLIC decls have
constrained visibility?), or to do it only in the cp_finish_decomp case
after the copy_linkage call there.

2025-01-18 Jakub Jelinek

PR c++/118513
* decl2.cc (copy_linkage): If not TREE_PUBLIC, also set
DECL_INTERFACE_KNOWN, assert it was set on decl and copy
DECL_NOT_REALLY_EXTERN flags.

* g++.dg/modules/decomp-3_a.H: New test.
* g++.dg/modules/decomp-3_b.C: New test.
Diff: --- gcc/cp/decl2.cc | 7 +++ gcc/testsuite/g++.dg/modules/decomp-3_a.H | 20 gcc/testsuite/g++.dg/modules/decomp-3_b.C | 12 3 files changed, 39 insertions(+) diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc index f64aa848b94a..55f056ef9dc6 100644 --- a/gcc/cp/decl2.cc +++ b/gcc/cp/decl2.cc @@ -3651,6 +3651,13 @@ copy_linkage (tree guard, tree decl) comdat_linkage (guard); DECL_VISIBILITY (guard) = DECL_VISIBILITY (decl); DECL_VISIBILITY_SPECIFIED (guard) = DECL_VISIBILITY_SPECIFIED (decl); + if (!TREE_PUBLIC (decl)) + { + gcc_checking_assert (DECL_INTERFACE_KNOWN (decl)); + DECL_INTERFACE_KNOWN (guard) = 1; + if (DECL_LANG_SPECIFIC (decl) && DECL_LANG_SPECIFIC (guard)) + DECL_NOT_REALLY_EXTERN (guard) = DECL_NOT_REALLY_EXTERN (decl); + } } } diff --git a/gcc/testsuite/g++.dg/modules/decomp-3_a.H b/gcc/testsuite/g++.dg/modules/decomp-3_a.H new file mode 100644 index ..74223cc60e2a --- /dev/null +++ b/gcc/testsuite/g++.dg/modules/decomp-3_a.H @@ -0,0 +1,20 @@ +// PR c++/118513 +// { dg-additional-options -fmodule-header } +// { dg-module-cmi {} } + +namespace std { + template struct tuple_size; + template struct tuple_element; +} + +struct A { + int a, b; + template int &get () { if (I == 0) return a; else return b; } +}; + +template <> struct std::tuple_size { static const int value = 2; }; +template struct std::tuple_element { using type = int; }; + +namespace { +auto [x, y] = A { 42, 43 }; +} diff --git a/gcc/testsuite/g++.dg/modules/decomp-3_b.C b/gcc/testsuite/g++.dg/modules/decomp-3_b.C new file mode 100644 index ..00566ab1232c --- /dev/null +++ b/gcc/testsuite/g++.dg/modules/decomp-3_b.C @@ -0,0 +1,12 @@ +// PR c++/118513 +// { dg-do run } +// { dg-additional-options "-fmodules-ts" } + +import "decomp-3_a.H"; + +int +main () +{ + if (x != 42 || y != 43) +__builtin_abort (); +}
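The declaration pattern that triggers the ICE can be tried outside the modules testsuite. The following is only a sketch, assuming C++17 and a single translation unit (no header unit involved, so it does not reproduce the ICE itself). The template parameter lists are written out here as an assumption, since the archived diff above lost them, and sum_bindings is a hypothetical helper added for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>   // primary templates std::tuple_size / std::tuple_element

struct A {
  int a, b;
  // Member get<I>() is what the tuple-like structured binding protocol calls.
  template <std::size_t I> int &get () { return I == 0 ? a : b; }
};

namespace std {
  // Opt A into the tuple-like protocol: two elements, both of type int.
  template <> struct tuple_size<A> { static constexpr size_t value = 2; };
  template <size_t I> struct tuple_element<I, A> { using type = int; };
}

namespace {
  // Internal linkage: the hidden base object and the bindings x and y are
  // not public symbols, which is the !TREE_PUBLIC case in the commit message.
  auto [x, y] = A { 42, 43 };
}

// Hypothetical helper that reads both bindings.
int sum_bindings () { return x + y; }
```

With the header-unit variant from the testcase, x and y additionally have to survive being streamed out and back in, which is where the missing flags mattered.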
[gcc r15-7025] [RISC-V][PR target/116308] Fix generation of initial RTL for atomics
https://gcc.gnu.org/g:deb3a4ae5dc04616dff893de074de0797594c98e

commit r15-7025-gdeb3a4ae5dc04616dff893de074de0797594c98e
Author: Jeff Law
Date: Sat Jan 18 13:44:33 2025 -0700

[RISC-V][PR target/116308] Fix generation of initial RTL for atomics

While this wasn't originally marked as a regression, it almost certainly is
given that older versions of GCC would have used libatomic and would not have
ICE'd on this code.

Basically this is another case where we directly used simplify_gen_subreg when
we should have used gen_lowpart.

When I fixed a similar bug a while back I noted the code in question as needing
another looksie. I think at that time my brain saw the mixed modes (SI & QI)
and locked up. But the QI stuff is just the shift count, not some deeper
issue. So fixing is trivial.

We just replace the simplify_gen_subreg with a gen_lowpart and get on with our
lives.

Tested on rv64 and rv32 in my tester. Waiting on pre-commit testing for final
verdict.

PR target/116308
gcc/
* config/riscv/riscv.cc (riscv_lshift_subword): Use gen_lowpart rather
than simplify_gen_subreg.

gcc/testsuite/
* gcc.target/riscv/pr116308.c: New test.
Diff:
---
 gcc/config/riscv/riscv.cc | 4 +---
 gcc/testsuite/gcc.target/riscv/pr116308.c | 9 +
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 9a1db2d2b380..f5e672bb7f50 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -11963,9 +11963,7 @@ riscv_lshift_subword (machine_mode mode, rtx value, rtx shift,
                       rtx *shifted_value)
 {
   rtx value_reg = gen_reg_rtx (SImode);
-  emit_move_insn (value_reg, simplify_gen_subreg (SImode, value,
-                                                  mode, 0));
-
+  emit_move_insn (value_reg, gen_lowpart (SImode, value));
   emit_move_insn (*shifted_value, gen_rtx_ASHIFT (SImode, value_reg,
                                                   gen_lowpart (QImode, shift)));
 }
diff --git a/gcc/testsuite/gcc.target/riscv/pr116308.c b/gcc/testsuite/gcc.target/riscv/pr116308.c
new file mode 100644
index ..241df14bd922
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/pr116308.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-Ofast -march=rv64gc -mabi=lp64d" { target rv64 } } */
+/* { dg-options "-Ofast -march=rv32gc -mabi=ilp32" { target rv32 } } */
+
+_Float16 test__Float16_post_inc()
+{
+_Atomic _Float16 n;
+return n++;
+}
[gcc r15-7031] testsuite: Fixes for test case pr117546.c
https://gcc.gnu.org/g:34c51485808188eec3ebdacf969dd335e908aab3

commit r15-7031-g34c51485808188eec3ebdacf969dd335e908aab3
Author: Dimitar Dimitrov
Date: Sat Jan 18 20:19:43 2025 +0200

testsuite: Fixes for test case pr117546.c

This test fails on AVR.

Debugging the test on x86 host, I noticed that u in function s sometimes
has value 16128. The "t <= 3 * u" expression in the same function
results in signed integer overflow for targets with sizeof(int)=2.

Fix by requiring int32 effective target.

Also add return statement for the main function.

gcc/testsuite/ChangeLog:

* gcc.dg/torture/pr117546.c: Require effective target int32.
(main): Add return statement.

Signed-off-by: Dimitar Dimitrov

Diff:
---
 gcc/testsuite/gcc.dg/torture/pr117546.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/torture/pr117546.c b/gcc/testsuite/gcc.dg/torture/pr117546.c
index 21e2aef18b9a..b60f877a9063 100644
--- a/gcc/testsuite/gcc.dg/torture/pr117546.c
+++ b/gcc/testsuite/gcc.dg/torture/pr117546.c
@@ -1,4 +1,4 @@
-/* { dg-do run } */
+/* { dg-do run { target int32 } } */
 
 typedef struct {
   int a;
@@ -81,4 +81,6 @@ int main() {
   l.glyf.coords[4] = (e){2, 206};
   l.glyf.coords[6] = (e){0, 308, 5};
   w(&l);
+
+  return 0;
 }
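The overflow the message refers to is plain arithmetic: 3 * 16128 = 48384, which exceeds 32767, the largest value of a 16-bit int. A small sketch of that check follows; triple_overflows is a hypothetical helper, not part of the testcase, and it does the multiplication in a 64-bit type so that the check itself cannot overflow.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: true if 3 * u is not representable in the signed
// integer type Int.  The product is computed in 64 bits, so the check is safe.
template <typename Int>
constexpr bool triple_overflows (std::int64_t u)
{
  return 3 * u > std::numeric_limits<Int>::max ()
         || 3 * u < std::numeric_limits<Int>::min ();
}

// A 16-bit int (as on AVR) cannot hold 3 * 16128 = 48384.
static_assert (triple_overflows<std::int16_t> (16128), "48384 > 32767");
// A 32-bit int can, which is why the test now requires the int32 target.
static_assert (!triple_overflows<std::int32_t> (16128), "fits in 32 bits");
```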
[gcc r15-7029] doc: Adjust link to OpenMP specifications
https://gcc.gnu.org/g:1cc063e070bad7c20a34db3f5e534d7cf036ef83 commit r15-7029-g1cc063e070bad7c20a34db3f5e534d7cf036ef83 Author: Gerald Pfeifer Date: Sun Jan 19 08:52:55 2025 +0800 doc: Adjust link to OpenMP specifications gcc: * doc/extend.texi (OpenMP): Adjust link to specifications. Diff: --- gcc/doc/extend.texi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi index dd9a8d2f8ba5..b0bb0d47230e 100644 --- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -29395,7 +29395,7 @@ architectures, including Unix and Microsoft Windows platforms. It consists of a set of compiler directives, library routines, and environment variables that influence run-time behavior. -GCC implements all of the @uref{https://openmp.org/specifications/, +GCC implements all of the @uref{https://www.openmp.org/specifications/, OpenMP Application Program Interface v4.5}, and many features from later versions of the OpenMP specification. @xref{OpenMP Implementation Status,,,libgomp,
[gcc r15-7030] doc: Move modula2.org link to https
https://gcc.gnu.org/g:f7e0ac1dc147e2c8224dbcd1ecf2a98fd7882902 commit r15-7030-gf7e0ac1dc147e2c8224dbcd1ecf2a98fd7882902 Author: Gerald Pfeifer Date: Sun Jan 19 09:40:15 2025 +0800 doc: Move modula2.org link to https gcc: * doc/gm2.texi (Type compatibility): Move modula2.org link to https. Diff: --- gcc/doc/gm2.texi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/doc/gm2.texi b/gcc/doc/gm2.texi index 0bace308d112..5af8b228831f 100644 --- a/gcc/doc/gm2.texi +++ b/gcc/doc/gm2.texi @@ -1913,7 +1913,7 @@ Expression compatibility is a symmetric relation. For example two sub expressions of @code{INTEGER} and @code{CARDINAL} are not expression compatible -(@uref{http://freepages.modula2.org/report4/modula-2.html} and ISO +(@uref{https://freepages.modula2.org/report4/modula-2.html} and ISO Modula-2). In GNU Modula-2 this rule is also extended across all fixed sized data
[gcc r15-7014] c++: Fix up find_array_ctor_elt RAW_DATA_CST handling [PR118534]
https://gcc.gnu.org/g:413985b632afb07032d3b32d992029fced187814

commit r15-7014-g413985b632afb07032d3b32d992029fced187814
Author: Jakub Jelinek
Date: Sat Jan 18 09:14:27 2025 +0100

c++: Fix up find_array_ctor_elt RAW_DATA_CST handling [PR118534]

This is the third bug discovered today with the
https://gcc.gnu.org/pipermail/gcc-patches/2025-January/673945.html
hack but then turned into proper testcases where embed-24.C FAILed
since introduction of optimized #embed support and the others when
optimizing large C++ initializers using RAW_DATA_CST.

find_array_ctor_elt already has RAW_DATA_CST support, but on the
following testcases it misses one case.
The CONSTRUCTORs in question went through the braced_list_to_string
optimization which can turn INTEGER_CST RAW_DATA_CST INTEGER_CST
into just larger RAW_DATA_CST covering even those 2 bytes around it
(if they appear there in the underlying RAW_DATA_OWNER).
With this optimization, RAW_DATA_CST can be the last CONSTRUCTOR_ELTS
elt in a CONSTRUCTOR, either the sole one or say preceded by some
unrelated other elements. Now, if RAW_DATA_CST is the only one or
if there are no RAW_DATA_CSTs earlier in CONSTRUCTOR_ELTS, we can
trigger a bug in find_array_ctor_elt.
It has a smart optimization for the very common case where
CONSTRUCTOR_ELTS have indexes and index of the last elt is equal
to CONSTRUCTOR_NELTS (ary) - 1, then obviously we know there are
no RAW_DATA_CSTs before it and the indexes just go from 0 to nelts-1,
so when we care about any of those earlier indexes, we can just return i;
and not worry about anything.
Except it uses
  if (i < end)
    return i;
rather than
  if (i < end - 1)
    return i;
For the latter cases, i.e. anything before the last elt, we know there
are no surprises and return i; is right. But for the if (i == end - 1)
case, return i; is only correct if the last elt is not RAW_DATA_CST, if it
is RAW_DATA_CST, we still need to split it, which is handled by the code
later in the function.
So, for that we need begin = end - 1, so that the binary search will just care about that last element. 2025-01-18 Jakub Jelinek PR c++/118534 * constexpr.cc (find_array_ctor_elt): Don't return i early if i == end - 1 and the last elt's value is RAW_DATA_CST. * g++.dg/cpp/embed-24.C: New test. * g++.dg/cpp1y/pr118534.C: New test. Diff: --- gcc/cp/constexpr.cc | 15 ++- gcc/testsuite/g++.dg/cpp/embed-24.C | 30 ++ gcc/testsuite/g++.dg/cpp1y/pr118534.C | 31 +++ 3 files changed, 71 insertions(+), 5 deletions(-) diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc index c898e3bfa6ea..7ff38f8b5e52 100644 --- a/gcc/cp/constexpr.cc +++ b/gcc/cp/constexpr.cc @@ -4155,12 +4155,17 @@ find_array_ctor_elt (tree ary, tree dindex, bool insert) else if (TREE_CODE (cindex) == INTEGER_CST && compare_tree_int (cindex, end - 1) == 0) { - if (i < end) - return i; tree value = (*elts)[end - 1].value; - if (TREE_CODE (value) == RAW_DATA_CST - && wi::to_offset (dindex) < (wi::to_offset (cindex) - + RAW_DATA_LENGTH (value))) + if (i < end) + { + if (i == end - 1 && TREE_CODE (value) == RAW_DATA_CST) + begin = end - 1; + else + return i; + } + else if (TREE_CODE (value) == RAW_DATA_CST + && wi::to_offset (dindex) < (wi::to_offset (cindex) + + RAW_DATA_LENGTH (value))) begin = end - 1; else begin = end; diff --git a/gcc/testsuite/g++.dg/cpp/embed-24.C b/gcc/testsuite/g++.dg/cpp/embed-24.C new file mode 100644 index ..baaad7273e35 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp/embed-24.C @@ -0,0 +1,30 @@ +// PR c++/118534 +// { dg-do compile { target c++14 } } +// { dg-options "" } + +template +constexpr bool +foo () +{ + T x[160] = { +#embed __FILE__ limit (160) + }; + const int y[160] = { +42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, 42, +#embed __FILE__ limit (147) gnu::offset (13) + }; + unsigned long n = 13; + for (T *p = x; n; --n, p++) +*p = 42; + for (int i = 0; i < 160; ++i) +if (x[i] != y[i]) + return false; + return true; +} + +int +main () +{ + static_assert (foo (), ""); + 
static_assert (foo (), ""); +} diff --git a/gcc/testsuite/g++.dg/cpp1y/pr118534.C b/gcc/testsuite/g++.dg/cpp1y/pr118534.C new file mode 100644 index ..72ffdd3b6c5a --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp1y/pr118534.C @@ -0,0 +1,31 @@ +// PR c++/118534 +// { dg-do compile { target c++14 } } + +template +constexpr bool +foo () +{ + T x[160] = {
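The corrected fast path described in the commit message can be modelled in a few lines. This is a simplified sketch, not GCC code: Elt, Lookup and fast_path are hypothetical names, raw_len > 0 stands in for a RAW_DATA_CST value covering that many indexes, and the precondition (the last element's index equals the element count minus one) is assumed rather than checked.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of one CONSTRUCTOR element.
struct Elt {
  long index;    // array index of the (first) value
  long raw_len;  // 0 for an ordinary value, > 0 for a RAW_DATA_CST-like run
};

enum class Kind { Direct, Search };

// Direct: pos is the element to return.  Search: pos is where the later
// binary search (which may split a raw run) has to begin.
struct Lookup { Kind kind; std::size_t pos; };

// Assumes elts is non-empty and elts.back ().index == elts.size () - 1,
// i.e. no raw runs before the last element.
Lookup fast_path (const std::vector<Elt> &elts, long i)
{
  const long end = static_cast<long> (elts.size ());
  const Elt &last = elts[end - 1];
  if (i < end)
    {
      // The fix: the last element may be a raw run that still needs
      // splitting, so it must not be returned directly.
      if (i == end - 1 && last.raw_len > 0)
        return { Kind::Search, static_cast<std::size_t> (end - 1) };
      return { Kind::Direct, static_cast<std::size_t> (i) };
    }
  // Index past the element count: it may still fall inside a trailing run.
  if (last.raw_len > 0 && i < last.index + last.raw_len)
    return { Kind::Search, static_cast<std::size_t> (end - 1) };
  return { Kind::Search, static_cast<std::size_t> (end) };
}
```

Before the fix, the equivalent of the first branch returned Direct for i == end - 1 even when the last element was a raw run.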
[gcc r15-7018] AArch64: Use standard names for saturating arithmetic
https://gcc.gnu.org/g:aa361611490947eb228e5b625a3f0f23ff647dbd commit r15-7018-gaa361611490947eb228e5b625a3f0f23ff647dbd Author: Akram Ahmad Date: Fri Jan 17 17:43:49 2025 + AArch64: Use standard names for saturating arithmetic This renames the existing {s,u}q{add,sub} instructions to use the standard names {s,u}s{add,sub}3 which are used by IFN_SAT_ADD and IFN_SAT_SUB. The NEON intrinsics for saturating arithmetic and their corresponding builtins are changed to use these standard names too. Using the standard names for the instructions causes 32 and 64-bit unsigned scalar saturating arithmetic to use the NEON instructions, resulting in an additional (and inefficient) FMOV to be generated when the original operands are in GP registers. This patch therefore also restores the original behaviour of using the adds/subs instructions in this circumstance. Additional tests are written for the scalar and Adv. SIMD cases to ensure that the correct instructions are used. The NEON intrinsics are already tested elsewhere. gcc/ChangeLog: * config/aarch64/aarch64-builtins.cc: Expand iterators. * config/aarch64/aarch64-simd-builtins.def: Use standard names * config/aarch64/aarch64-simd.md: Use standard names, split insn definitions on signedness of operator and type of operands. * config/aarch64/arm_neon.h: Use standard builtin names. * config/aarch64/iterators.md: Add VSDQ_I_QI_HI iterator to simplify splitting of insn for unsigned scalar arithmetic. gcc/testsuite/ChangeLog: * gcc.target/aarch64/scalar_intrinsics.c: Update testcases. * gcc.target/aarch64/advsimd-intrinsics/saturating_arithmetic_autovect.inc: Template file for unsigned vector saturating arithmetic tests. * gcc.target/aarch64/advsimd-intrinsics/saturating_arithmetic_autovect_1.c: 8-bit vector type tests. * gcc.target/aarch64/advsimd-intrinsics/saturating_arithmetic_autovect_2.c: 16-bit vector type tests. * gcc.target/aarch64/advsimd-intrinsics/saturating_arithmetic_autovect_3.c: 32-bit vector type tests. 
* gcc.target/aarch64/advsimd-intrinsics/saturating_arithmetic_autovect_4.c: 64-bit vector type tests. * gcc.target/aarch64/saturating_arithmetic.inc: Template file for scalar saturating arithmetic tests. * gcc.target/aarch64/saturating_arithmetic_1.c: 8-bit tests. * gcc.target/aarch64/saturating_arithmetic_2.c: 16-bit tests. * gcc.target/aarch64/saturating_arithmetic_3.c: 32-bit tests. * gcc.target/aarch64/saturating_arithmetic_4.c: 64-bit tests. Co-authored-by: Tamar Christina Diff: --- gcc/config/aarch64/aarch64-builtins.cc | 12 + gcc/config/aarch64/aarch64-simd-builtins.def | 8 +- gcc/config/aarch64/aarch64-simd.md | 207 +++- gcc/config/aarch64/arm_neon.h | 96 gcc/config/aarch64/iterators.md| 4 + .../saturating_arithmetic_autovect.inc | 58 + .../saturating_arithmetic_autovect_1.c | 79 ++ .../saturating_arithmetic_autovect_2.c | 79 ++ .../saturating_arithmetic_autovect_3.c | 75 ++ .../saturating_arithmetic_autovect_4.c | 77 ++ .../aarch64/saturating-arithmetic-signed.c | 270 + .../gcc.target/aarch64/saturating_arithmetic.inc | 39 +++ .../gcc.target/aarch64/saturating_arithmetic_1.c | 36 +++ .../gcc.target/aarch64/saturating_arithmetic_2.c | 36 +++ .../gcc.target/aarch64/saturating_arithmetic_3.c | 30 +++ .../gcc.target/aarch64/saturating_arithmetic_4.c | 30 +++ .../gcc.target/aarch64/scalar_intrinsics.c | 32 +-- 17 files changed, 1096 insertions(+), 72 deletions(-) diff --git a/gcc/config/aarch64/aarch64-builtins.cc b/gcc/config/aarch64/aarch64-builtins.cc index 86eebc168859..6d5479c2e449 100644 --- a/gcc/config/aarch64/aarch64-builtins.cc +++ b/gcc/config/aarch64/aarch64-builtins.cc @@ -5039,6 +5039,18 @@ aarch64_general_gimple_fold_builtin (unsigned int fcode, gcall *stmt, new_stmt = gimple_build_assign (gimple_call_lhs (stmt), LSHIFT_EXPR, args[0], args[1]); break; + /* lower saturating add/sub neon builtins to gimple. 
*/ + BUILTIN_VSDQ_I (BINOP, ssadd, 3, DEFAULT) + BUILTIN_VSDQ_I (BINOPU, usadd, 3, DEFAULT) + new_stmt = gimple_build_call_internal (IFN_SAT_ADD, 2, args[0], args[1]); + gimple_call_set_lhs (new_stmt, gimple_call_lhs (stmt)); + break; + BUILTIN_VSDQ_I (BINOP, sssub, 3, DEFAULT) + BUILTIN_VSDQ_I (BINOPU, ussub, 3, DEFAULT) + new_stmt = gimple_build_call_intern
[gcc r15-7015] Revert "AArch64: Use standard names for SVE saturating arithmetic"
https://gcc.gnu.org/g:8787f63de6e51bc43f86bb08c8a5f4a370246a90 commit r15-7015-g8787f63de6e51bc43f86bb08c8a5f4a370246a90 Author: Tamar Christina Date: Sat Jan 18 11:12:35 2025 + Revert "AArch64: Use standard names for SVE saturating arithmetic" This reverts commit 26b2d9f27ca24f0705641a85f29d179fa0600869. Diff: --- gcc/config/aarch64/aarch64-sve.md | 4 +- .../aarch64/sve/saturating_arithmetic.inc | 68 -- .../aarch64/sve/saturating_arithmetic_1.c | 60 --- .../aarch64/sve/saturating_arithmetic_2.c | 60 --- .../aarch64/sve/saturating_arithmetic_3.c | 62 .../aarch64/sve/saturating_arithmetic_4.c | 62 6 files changed, 2 insertions(+), 314 deletions(-) diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md index e975286a0190..ba4b4d904c77 100644 --- a/gcc/config/aarch64/aarch64-sve.md +++ b/gcc/config/aarch64/aarch64-sve.md @@ -4449,7 +4449,7 @@ ;; - ;; Unpredicated saturating signed addition and subtraction. -(define_insn "s3" +(define_insn "@aarch64_sve_" [(set (match_operand:SVE_FULL_I 0 "register_operand") (SBINQOPS:SVE_FULL_I (match_operand:SVE_FULL_I 1 "register_operand") @@ -4465,7 +4465,7 @@ ) ;; Unpredicated saturating unsigned addition and subtraction. -(define_insn "s3" +(define_insn "@aarch64_sve_" [(set (match_operand:SVE_FULL_I 0 "register_operand") (UBINQOPS:SVE_FULL_I (match_operand:SVE_FULL_I 1 "register_operand") diff --git a/gcc/testsuite/gcc.target/aarch64/sve/saturating_arithmetic.inc b/gcc/testsuite/gcc.target/aarch64/sve/saturating_arithmetic.inc deleted file mode 100644 index 0b3ebbcb0d6f.. --- a/gcc/testsuite/gcc.target/aarch64/sve/saturating_arithmetic.inc +++ /dev/null @@ -1,68 +0,0 @@ -/* Template file for vector saturating arithmetic validation. - - This file defines saturating addition and subtraction functions for a given - scalar type, testing the auto-vectorization of these two operators. 
This - type, along with the corresponding minimum and maximum values for that type, - must be defined by any test file which includes this template file. */ - -#ifndef SAT_ARIT_AUTOVEC_INC -#define SAT_ARIT_AUTOVEC_INC - -#include -#include - -#ifndef UT -#define UT uint32_t -#define UMAX UINT_MAX -#define UMIN 0 -#endif - -void uaddq (UT *out, UT *a, UT *b, int n) -{ - for (int i = 0; i < n; i++) -{ - UT sum = a[i] + b[i]; - out[i] = sum < a[i] ? UMAX : sum; -} -} - -void uaddq2 (UT *out, UT *a, UT *b, int n) -{ - for (int i = 0; i < n; i++) -{ - UT sum; - if (!__builtin_add_overflow(a[i], b[i], &sum)) - out[i] = sum; - else - out[i] = UMAX; -} -} - -void uaddq_imm (UT *out, UT *a, int n) -{ - for (int i = 0; i < n; i++) -{ - UT sum = a[i] + 50; - out[i] = sum < a[i] ? UMAX : sum; -} -} - -void usubq (UT *out, UT *a, UT *b, int n) -{ - for (int i = 0; i < n; i++) -{ - UT sum = a[i] - b[i]; - out[i] = sum > a[i] ? UMIN : sum; -} -} - -void usubq_imm (UT *out, UT *a, int n) -{ - for (int i = 0; i < n; i++) -{ - UT sum = a[i] - 50; - out[i] = sum > a[i] ? UMIN : sum; -} -} - -#endif \ No newline at end of file diff --git a/gcc/testsuite/gcc.target/aarch64/sve/saturating_arithmetic_1.c b/gcc/testsuite/gcc.target/aarch64/sve/saturating_arithmetic_1.c deleted file mode 100644 index 6936e9a27044.. --- a/gcc/testsuite/gcc.target/aarch64/sve/saturating_arithmetic_1.c +++ /dev/null @@ -1,60 +0,0 @@ -/* { dg-do compile { target { aarch64*-*-* } } } */ -/* { dg-options "-O2 --save-temps -ftree-vectorize" } */ -/* { dg-final { check-function-bodies "**" "" "" } } */ - -/* -** uaddq: -** ... -** ld1b\tz([0-9]+)\.b, .* -** ld1b\tz([0-9]+)\.b, .* -** uqadd\tz\2.b, z\1\.b, z\2\.b -** ... -** ldr\tb([0-9]+), .* -** ldr\tb([0-9]+), .* -** uqadd\tb\4, b\3, b\4 -** ... -*/ -/* -** uaddq2: -** ... -** ld1b\tz([0-9]+)\.b, .* -** ld1b\tz([0-9]+)\.b, .* -** uqadd\tz\2.b, z\1\.b, z\2\.b -** ... -** ldr\tb([0-9]+), .* -** ldr\tb([0-9]+), .* -** uqadd\tb\4, b\3, b\4 -** ... 
-*/ -/* -** uaddq_imm: -** ... -** ld1b\tz([0-9]+)\.b, .* -** uqadd\tz\1.b, z\1\.b, #50 -** ... -** movi\tv([0-9]+)\.8b, 0x32 -** ... -** ldr\tb([0-9]+), .* -** uqadd\tb\3, b\3, b\2 -** ... -*/ -/* -** usubq: { xfail *-*-* } -** ... -** ld1b\tz([0-9]+)\.b, .* -** ld1b\tz([0-9]+)\.b, .* -** uqsub\tz\2.b, z\1\.b, z\2\.b -** ... -** ldr\tb([0-9]+), .* -** ldr\tb([0-9]+), .* -** uqsub\tb\4, b\3, b\4 -** ... -*/ - -#include - -#define UT unsigned char -#define UMAX UCHAR_MAX -#define UMIN 0 - -#include "saturating_arithmetic.inc" \ No newline at end of file diff --git a/gcc/testsuite/gcc
[gcc r15-7016] Revert "AArch64: Use standard names for saturating arithmetic"
https://gcc.gnu.org/g:1775a7280a230776927897147f1b07964cf5cfc7 commit r15-7016-g1775a7280a230776927897147f1b07964cf5cfc7 Author: Tamar Christina Date: Sat Jan 18 11:12:38 2025 + Revert "AArch64: Use standard names for saturating arithmetic" This reverts commit 5f5833a4107ddfbcd87651bf140151de043f4c36. Diff: --- gcc/config/aarch64/aarch64-builtins.cc | 12 - gcc/config/aarch64/aarch64-simd-builtins.def | 8 +- gcc/config/aarch64/aarch64-simd.md | 207 +--- gcc/config/aarch64/arm_neon.h | 96 gcc/config/aarch64/iterators.md| 4 - .../saturating_arithmetic_autovect.inc | 58 - .../saturating_arithmetic_autovect_1.c | 79 -- .../saturating_arithmetic_autovect_2.c | 79 -- .../saturating_arithmetic_autovect_3.c | 75 -- .../saturating_arithmetic_autovect_4.c | 77 -- .../aarch64/saturating-arithmetic-signed.c | 270 - .../gcc.target/aarch64/saturating_arithmetic.inc | 39 --- .../gcc.target/aarch64/saturating_arithmetic_1.c | 36 --- .../gcc.target/aarch64/saturating_arithmetic_2.c | 36 --- .../gcc.target/aarch64/saturating_arithmetic_3.c | 30 --- .../gcc.target/aarch64/saturating_arithmetic_4.c | 30 --- .../gcc.target/aarch64/scalar_intrinsics.c | 32 +-- 17 files changed, 72 insertions(+), 1096 deletions(-) diff --git a/gcc/config/aarch64/aarch64-builtins.cc b/gcc/config/aarch64/aarch64-builtins.cc index 6d5479c2e449..86eebc168859 100644 --- a/gcc/config/aarch64/aarch64-builtins.cc +++ b/gcc/config/aarch64/aarch64-builtins.cc @@ -5039,18 +5039,6 @@ aarch64_general_gimple_fold_builtin (unsigned int fcode, gcall *stmt, new_stmt = gimple_build_assign (gimple_call_lhs (stmt), LSHIFT_EXPR, args[0], args[1]); break; - /* lower saturating add/sub neon builtins to gimple. 
*/ - BUILTIN_VSDQ_I (BINOP, ssadd, 3, DEFAULT) - BUILTIN_VSDQ_I (BINOPU, usadd, 3, DEFAULT) - new_stmt = gimple_build_call_internal (IFN_SAT_ADD, 2, args[0], args[1]); - gimple_call_set_lhs (new_stmt, gimple_call_lhs (stmt)); - break; - BUILTIN_VSDQ_I (BINOP, sssub, 3, DEFAULT) - BUILTIN_VSDQ_I (BINOPU, ussub, 3, DEFAULT) - new_stmt = gimple_build_call_internal (IFN_SAT_SUB, 2, args[0], args[1]); - gimple_call_set_lhs (new_stmt, gimple_call_lhs (stmt)); - break; - BUILTIN_VSDQ_I_DI (BINOP, sshl, 0, DEFAULT) BUILTIN_VSDQ_I_DI (BINOP_UUS, ushl, 0, DEFAULT) { diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def index 6cc45b18a723..286272a33118 100644 --- a/gcc/config/aarch64/aarch64-simd-builtins.def +++ b/gcc/config/aarch64/aarch64-simd-builtins.def @@ -71,10 +71,10 @@ BUILTIN_VSDQ_I (BINOP, sqrshl, 0, DEFAULT) BUILTIN_VSDQ_I (BINOP_UUS, uqrshl, 0, DEFAULT) /* Implemented by aarch64_. */ - BUILTIN_VSDQ_I (BINOP, ssadd, 3, DEFAULT) - BUILTIN_VSDQ_I (BINOPU, usadd, 3, DEFAULT) - BUILTIN_VSDQ_I (BINOP, sssub, 3, DEFAULT) - BUILTIN_VSDQ_I (BINOPU, ussub, 3, DEFAULT) + BUILTIN_VSDQ_I (BINOP, sqadd, 0, DEFAULT) + BUILTIN_VSDQ_I (BINOPU, uqadd, 0, DEFAULT) + BUILTIN_VSDQ_I (BINOP, sqsub, 0, DEFAULT) + BUILTIN_VSDQ_I (BINOPU, uqsub, 0, DEFAULT) /* Implemented by aarch64_qadd. 
*/ BUILTIN_VSDQ_I (BINOP_SSU, suqadd, 0, DEFAULT) BUILTIN_VSDQ_I (BINOP_UUS, usqadd, 0, DEFAULT) diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index e2afe87e5130..eeb626f129a8 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -5162,214 +5162,15 @@ ) ;; q -(define_insn "s3" - [(set (match_operand:VSDQ_I_QI_HI 0 "register_operand" "=w") - (BINQOPS:VSDQ_I_QI_HI - (match_operand:VSDQ_I_QI_HI 1 "register_operand" "w") - (match_operand:VSDQ_I_QI_HI 2 "register_operand" "w")))] +(define_insn "aarch64_q" + [(set (match_operand:VSDQ_I 0 "register_operand" "=w") + (BINQOPS:VSDQ_I (match_operand:VSDQ_I 1 "register_operand" "w") + (match_operand:VSDQ_I 2 "register_operand" "w")))] "TARGET_SIMD" "q\\t%0, %1, %2" [(set_attr "type" "neon_q")] ) -(define_expand "s3" - [(parallel -[(set (match_operand:GPI 0 "register_operand") - (SBINQOPS:GPI (match_operand:GPI 1 "register_operand") - (match_operand:GPI 2 "aarch64_plus_operand"))) -(clobber (scratch:GPI)) -(clobber (reg:CC CC_REGNUM))])] -) - -;; Introducing a temporary GP reg allows signed saturating arithmetic with GPR -;; operands to be calculated without the use of costly transfers to and from FP -;; registers. For example, saturating addition usually uses three FMOVs: -;; -;; fmov d0, x0 -;; fmov d1, x1 -;; sqadd d0, d0, d1 -;; fmov x0, d0 -;; -;
[gcc r15-7017] AArch64: Use standard names for SVE saturating arithmetic
https://gcc.gnu.org/g:8f8ca83f2f6f165c4060ee1fc18ed3c74571ab7a commit r15-7017-g8f8ca83f2f6f165c4060ee1fc18ed3c74571ab7a Author: Akram Ahmad Date: Fri Jan 17 17:44:23 2025 + AArch64: Use standard names for SVE saturating arithmetic Rename the existing SVE unpredicated saturating arithmetic instructions to use standard names which are used by IFN_SAT_ADD and IFN_SAT_SUB. gcc/ChangeLog: * config/aarch64/aarch64-sve.md: Rename insns gcc/testsuite/ChangeLog: * gcc.target/aarch64/sve/saturating_arithmetic.inc: Template file for auto-vectorizer tests. * gcc.target/aarch64/sve/saturating_arithmetic_1.c: Instantiate 8-bit vector tests. * gcc.target/aarch64/sve/saturating_arithmetic_2.c: Instantiate 16-bit vector tests. * gcc.target/aarch64/sve/saturating_arithmetic_3.c: Instantiate 32-bit vector tests. * gcc.target/aarch64/sve/saturating_arithmetic_4.c: Instantiate 64-bit vector tests. Diff: --- gcc/config/aarch64/aarch64-sve.md | 4 +- .../aarch64/sve/saturating_arithmetic.inc | 68 ++ .../aarch64/sve/saturating_arithmetic_1.c | 60 +++ .../aarch64/sve/saturating_arithmetic_2.c | 60 +++ .../aarch64/sve/saturating_arithmetic_3.c | 62 .../aarch64/sve/saturating_arithmetic_4.c | 62 6 files changed, 314 insertions(+), 2 deletions(-) diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md index ba4b4d904c77..e975286a0190 100644 --- a/gcc/config/aarch64/aarch64-sve.md +++ b/gcc/config/aarch64/aarch64-sve.md @@ -4449,7 +4449,7 @@ ;; - ;; Unpredicated saturating signed addition and subtraction. -(define_insn "@aarch64_sve_" +(define_insn "s3" [(set (match_operand:SVE_FULL_I 0 "register_operand") (SBINQOPS:SVE_FULL_I (match_operand:SVE_FULL_I 1 "register_operand") @@ -4465,7 +4465,7 @@ ) ;; Unpredicated saturating unsigned addition and subtraction. 
-(define_insn "@aarch64_sve_" +(define_insn "s3" [(set (match_operand:SVE_FULL_I 0 "register_operand") (UBINQOPS:SVE_FULL_I (match_operand:SVE_FULL_I 1 "register_operand") diff --git a/gcc/testsuite/gcc.target/aarch64/sve/saturating_arithmetic.inc b/gcc/testsuite/gcc.target/aarch64/sve/saturating_arithmetic.inc new file mode 100644 index ..0b3ebbcb0d6f --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/saturating_arithmetic.inc @@ -0,0 +1,68 @@ +/* Template file for vector saturating arithmetic validation. + + This file defines saturating addition and subtraction functions for a given + scalar type, testing the auto-vectorization of these two operators. This + type, along with the corresponding minimum and maximum values for that type, + must be defined by any test file which includes this template file. */ + +#ifndef SAT_ARIT_AUTOVEC_INC +#define SAT_ARIT_AUTOVEC_INC + +#include +#include + +#ifndef UT +#define UT uint32_t +#define UMAX UINT_MAX +#define UMIN 0 +#endif + +void uaddq (UT *out, UT *a, UT *b, int n) +{ + for (int i = 0; i < n; i++) +{ + UT sum = a[i] + b[i]; + out[i] = sum < a[i] ? UMAX : sum; +} +} + +void uaddq2 (UT *out, UT *a, UT *b, int n) +{ + for (int i = 0; i < n; i++) +{ + UT sum; + if (!__builtin_add_overflow(a[i], b[i], &sum)) + out[i] = sum; + else + out[i] = UMAX; +} +} + +void uaddq_imm (UT *out, UT *a, int n) +{ + for (int i = 0; i < n; i++) +{ + UT sum = a[i] + 50; + out[i] = sum < a[i] ? UMAX : sum; +} +} + +void usubq (UT *out, UT *a, UT *b, int n) +{ + for (int i = 0; i < n; i++) +{ + UT sum = a[i] - b[i]; + out[i] = sum > a[i] ? UMIN : sum; +} +} + +void usubq_imm (UT *out, UT *a, int n) +{ + for (int i = 0; i < n; i++) +{ + UT sum = a[i] - 50; + out[i] = sum > a[i] ? 
UMIN : sum; +} +} + +#endif \ No newline at end of file diff --git a/gcc/testsuite/gcc.target/aarch64/sve/saturating_arithmetic_1.c b/gcc/testsuite/gcc.target/aarch64/sve/saturating_arithmetic_1.c new file mode 100644 index ..6936e9a27044 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/saturating_arithmetic_1.c @@ -0,0 +1,60 @@ +/* { dg-do compile { target { aarch64*-*-* } } } */ +/* { dg-options "-O2 --save-temps -ftree-vectorize" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +/* +** uaddq: +** ... +** ld1b\tz([0-9]+)\.b, .* +** ld1b\tz([0-9]+)\.b, .* +** uqadd\tz\2.b, z\1\.b, z\2\.b +** ... +** ldr\tb([0-9]+), .* +** ldr\tb([0-9]+), .* +** uqadd\tb\4, b\3, b\4 +** ... +*/ +/* +** uaddq2: +** ... +** ld1b\tz([0-9]+)\.b, .* +** ld1b\tz([0-9]+)\.b, .* +** uqadd\tz\2.b, z\1\.b,
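The template file above relies on the usual wrap-around idioms for unsigned saturating arithmetic. A standalone sketch of the same two idioms follows; sat_add and sat_sub are hypothetical names (the testsuite uses uaddq and usubq over arrays), written here as scalar templates so the behaviour at the limits can be checked directly.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>

// Unsigned saturating add: if the wrapped sum is smaller than an operand,
// the true sum overflowed, so clamp to the maximum.
template <typename UT>
constexpr UT sat_add (UT a, UT b)
{
  static_assert (std::is_unsigned<UT>::value, "unsigned types only");
  const UT sum = static_cast<UT> (a + b);   // wraps modulo 2^N
  return sum < a ? std::numeric_limits<UT>::max () : sum;
}

// Unsigned saturating subtract: if the wrapped difference is larger than
// the minuend, the true difference was negative, so clamp to zero.
template <typename UT>
constexpr UT sat_sub (UT a, UT b)
{
  static_assert (std::is_unsigned<UT>::value, "unsigned types only");
  const UT diff = static_cast<UT> (a - b);  // wraps modulo 2^N
  return diff > a ? UT{0} : diff;
}

static_assert (sat_add<std::uint8_t> (200, 100) == 255, "clamps at UCHAR_MAX");
static_assert (sat_sub<std::uint8_t> (5, 50) == 0, "clamps at zero");
```

These are the source patterns that the auto-vectorizer tests expect to be recognised as IFN_SAT_ADD and IFN_SAT_SUB.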
[gcc r15-7021] [PR target/118357] RISC-V: Disable fusing vsetvl instructions by VSETVL_VTYPE_CHANGE_ONLY for XThead
https://gcc.gnu.org/g:b9493e98da58c7689645b4ee1a2f653b86a5d758

commit r15-7021-gb9493e98da58c7689645b4ee1a2f653b86a5d758
Author: Jin Ma
Date:   Sat Jan 18 07:43:17 2025 -0700

    [PR target/118357] RISC-V: Disable fusing vsetvl instructions by VSETVL_VTYPE_CHANGE_ONLY for XTheadVector.

    In RVV 1.0, the instruction "vsetvli zero,zero,*" indicates that the
    available vector length (avl) does not change.  However, in XTheadVector,
    this same instruction signifies that the avl should take the maximum value.
    Consequently, when fusing vsetvl instructions, the optimization labeled
    "VSETVL_VTYPE_CHANGE_ONLY" is disabled for XTheadVector.

            PR target/118357

    gcc/ChangeLog:

            * config/riscv/riscv-vsetvl.cc: Function change_vtype_only_p always
            returns false for XTheadVector.

    gcc/testsuite/ChangeLog:

            * gcc.target/riscv/rvv/xtheadvector/pr118357.c: New test.

Diff:
---
 gcc/config/riscv/riscv-vsetvl.cc                           |  3 ++-
 gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr118357.c | 13 +++++++++++++
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index a4016beebc0c..72c4c59514e5 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -903,7 +903,8 @@ public:
   bool valid_p () const { return m_state == state_type::VALID; }
   bool unknown_p () const { return m_state == state_type::UNKNOWN; }
   bool empty_p () const { return m_state == state_type::EMPTY; }
-  bool change_vtype_only_p () const { return m_change_vtype_only; }
+  bool change_vtype_only_p () const { return m_change_vtype_only
+                                              && !TARGET_XTHEADVECTOR; }
 
   void set_valid () { m_state = state_type::VALID; }
   void set_unknown () { m_state = state_type::UNKNOWN; }
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr118357.c b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr118357.c
new file mode 100644
index 000000000000..aebb0e3088ab
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/xtheadvector/pr118357.c
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { rv64 } } } */
+/* { dg-options "-march=rv64gc_xtheadvector -mabi=lp64d -O2" } */
+
+#include
+
+vfloat16m4_t foo (float *ptr, size_t vl)
+{
+  vfloat32m8_t _p = __riscv_vle32_v_f32m8 (ptr, vl);
+  vfloat16m4_t _half = __riscv_vfncvt_f_f_w_f16m4 (_p, vl);
+  return _half;
+}
+
+/* { dg-final { scan-assembler-not {th.vsetvli\tzero,zero} } }*/
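
As a rough illustration of the predicate change in r15-7021, the standalone C++ sketch below models how the vtype-only form is suppressed for XTheadVector. Everything named here (vsetvl_info_sketch, vsetvl_form, pick_form, the m_target_xtheadvector field) is a hypothetical stand-in; the real code tests the TARGET_XTHEADVECTOR macro inside GCC's vsetvl pass, and this is not that implementation.

```cpp
// Hypothetical stand-in for GCC's vsetvl_info; only the predicate touched
// by the patch is modelled.
struct vsetvl_info_sketch
{
  bool m_change_vtype_only = false;
  // Assumption: plays the role of the TARGET_XTHEADVECTOR target macro.
  bool m_target_xtheadvector = false;

  // Mirrors the patched predicate: never report "vtype change only"
  // when targeting XTheadVector.
  bool change_vtype_only_p () const
  {
    return m_change_vtype_only && !m_target_xtheadvector;
  }
};

// Hypothetical classification of the two instruction forms discussed in
// the commit message.
enum class vsetvl_form
{
  KEEP_AVL, // the "zero,zero" form, which leaves avl unchanged in RVV 1.0
  SET_AVL   // a form that names the avl explicitly
};

// Hypothetical helper: choose the form a backend could use.
vsetvl_form pick_form (const vsetvl_info_sketch &info)
{
  return info.change_vtype_only_p () ? vsetvl_form::KEEP_AVL
                                     : vsetvl_form::SET_AVL;
}
```

With m_target_xtheadvector set, pick_form never selects KEEP_AVL, which matches the intent of the new test's scan-assembler-not check for "th.vsetvli zero,zero".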
[gcc r15-7019] AVR: Fix a plenk in doc/invoke.texi.
https://gcc.gnu.org/g:0b2f2c62f654d36b0d0056428bc973605a09b10f

commit r15-7019-g0b2f2c62f654d36b0d0056428bc973605a09b10f
Author: Georg-Johann Lay
Date:   Sat Jan 18 14:44:04 2025 +0100

    AVR: Fix a plenk in doc/invoke.texi.

    gcc/
            * doc/invoke.texi (AVR Options): Fix plenk at -msplit-ldst.

Diff:
---
 gcc/doc/invoke.texi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 13afb4a0d0d8..9723d9cd0148 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -24423,8 +24423,8 @@ This optimization is turned on per default for @option{-O2} and higher,
 including @option{-Os} but excluding @option{-Oz}.  Splitting of shifts
 with a constant offset that is a multiple of 8 is controlled by
 @option{-mfuse-move}.
-@opindex msplit-ldst
 
+@opindex msplit-ldst
 @item -msplit-ldst
 Split multi-byte loads and stores into several byte loads and stores.
 This optimization is turned on per default for @option{-O2} and higher.
[gcc r15-7020] tree-optimization/118529 - ICE with condition vectorization
https://gcc.gnu.org/g:c81543b3379fa11742d2178b87edbf1e72799d61

commit r15-7020-gc81543b3379fa11742d2178b87edbf1e72799d61
Author: Richard Biener
Date:   Fri Jan 17 15:41:19 2025 +0100

    tree-optimization/118529 - ICE with condition vectorization

    On sparc we end up choosing vector(8) for the condition but
    vector(2) int for the value of a COND_EXPR but we fail to verify
    their shapes match and thus things go downhill.

    This is a missed-optimization on the pattern recognition side
    as well as unhandled vector decomposition in vectorizable_condition.
    The following plugs just the observed ICE for now.

            PR tree-optimization/118529
            * tree-vect-stmts.cc (vectorizable_condition): Check the
            shape of the vector and condition vector type are compatible.

            * gcc.target/sparc/pr118529.c: New testcase.

Diff:
---
 gcc/testsuite/gcc.target/sparc/pr118529.c | 17 +++++++++++++++++
 gcc/tree-vect-stmts.cc                    |  5 +++--
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/sparc/pr118529.c b/gcc/testsuite/gcc.target/sparc/pr118529.c
new file mode 100644
index 000000000000..1393763e9db2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/sparc/pr118529.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -mvis3" } */
+
+long c;
+int d[10];
+int e;
+void g() {
+  int b = 1 & e;
+  int *f = d;
+  b = -b;
+  c = 0;
+  for (; c < 10; c++) {
+    int h = f[c] ^ c;
+    h &= b;
+    f[c] ^= h;
+  }
+}
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index b5dd1a2e40f1..833029fcb001 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -12676,8 +12676,9 @@ vectorizable_condition (vec_info *vinfo,
 
   masked = !COMPARISON_CLASS_P (cond_expr);
   vec_cmp_type = truth_type_for (comp_vectype);
-
-  if (vec_cmp_type == NULL_TREE)
+  if (vec_cmp_type == NULL_TREE
+      || maybe_ne (TYPE_VECTOR_SUBPARTS (vectype),
+                   TYPE_VECTOR_SUBPARTS (vec_cmp_type)))
     return false;
 
   cond_code = TREE_CODE (cond_expr);
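
The shape check added to vectorizable_condition in r15-7020 can be pictured with the minimal C++ sketch below. It assumes constant lane counts, whereas GCC compares poly_uint64 values with maybe_ne; vec_type_sketch and condition_shape_ok are hypothetical names used only for illustration, not GCC APIs.

```cpp
#include <cstdint>

// Hypothetical simplified vector type: only what the check needs.
struct vec_type_sketch
{
  bool valid;           // false models a NULL_TREE result
  std::uint64_t nunits; // models TYPE_VECTOR_SUBPARTS as a constant lane count
};

// Returns false when the condition cannot be vectorized as-is: either no
// mask type exists, or value and mask vectors have different lane counts.
bool condition_shape_ok (const vec_type_sketch &vectype,
                         const vec_type_sketch &vec_cmp_type)
{
  if (!vec_cmp_type.valid || vectype.nunits != vec_cmp_type.nunits)
    return false;
  return true;
}
```

For the case described in the commit message, a 2-lane value vector paired with an 8-lane condition vector is rejected instead of reaching later code that assumes matching shapes.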
[gcc r15-7022] RISC-V: Disable RV64-only crc testcases for RV32
https://gcc.gnu.org/g:729591f1017bf72f924d2bb6ebbad202da95171d

commit r15-7022-g729591f1017bf72f924d2bb6ebbad202da95171d
Author: Bohan Lei
Date:   Sat Jan 18 08:09:48 2025 -0700

    RISC-V: Disable RV64-only crc testcases for RV32

    These testcases require RV64 targets.  They fail when -march=rv32* is
    specified while using an riscv64* compiler.

    gcc/testsuite/ChangeLog:

            * gcc.target/riscv/crc-21-rv64-zbc.c: Disallow rv32 targets.
            * gcc.target/riscv/crc-21-rv64-zbkc.c: Ditto.

Diff:
---
 gcc/testsuite/gcc.target/riscv/crc-21-rv64-zbc.c  | 5 ++---
 gcc/testsuite/gcc.target/riscv/crc-21-rv64-zbkc.c | 5 ++---
 2 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/gcc/testsuite/gcc.target/riscv/crc-21-rv64-zbc.c b/gcc/testsuite/gcc.target/riscv/crc-21-rv64-zbc.c
index 503b412f2e19..bfb724a0f70e 100644
--- a/gcc/testsuite/gcc.target/riscv/crc-21-rv64-zbc.c
+++ b/gcc/testsuite/gcc.target/riscv/crc-21-rv64-zbc.c
@@ -1,6 +1,5 @@
-/* { dg-do run { target { riscv64*-*-* && riscv_zbc_ok } } } */
-/* { dg-options "-march=rv64gc_zbc -fdump-tree-crc -fdump-rtl-dfinish" { target { rv64 } } } */
-/* { dg-options "-march=rv32gc_zbc -fdump-tree-crc -fdump-rtl-dfinish" { target { rv32 } } } */
+/* { dg-do run { target { rv64 && riscv_zbc_ok } } } */
+/* { dg-options "-march=rv64gc_zbc -fdump-tree-crc -fdump-rtl-dfinish" } */
 /* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Og" } } */
 
 #include "../../gcc.dg/torture/crc-21.c"
diff --git a/gcc/testsuite/gcc.target/riscv/crc-21-rv64-zbkc.c b/gcc/testsuite/gcc.target/riscv/crc-21-rv64-zbkc.c
index 2bf0172a8377..92a9ca8398a7 100644
--- a/gcc/testsuite/gcc.target/riscv/crc-21-rv64-zbkc.c
+++ b/gcc/testsuite/gcc.target/riscv/crc-21-rv64-zbkc.c
@@ -1,6 +1,5 @@
-/* { dg-do run { target { riscv64*-*-* && riscv_zbkc_ok } } } */
-/* { dg-options "-march=rv64gc_zbkc -fdump-tree-crc -fdump-rtl-dfinish" { target { rv64 } } } */
-/* { dg-options "-march=rv32gc_zbkc -fdump-tree-crc -fdump-rtl-dfinish" { target { rv32 } } } */
+/* { dg-do run { target { rv64 && riscv_zbkc_ok } } } */
+/* { dg-options "-march=rv64gc_zbkc -fdump-tree-crc -fdump-rtl-dfinish" } */
 /* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Og" } } */
 
 #include "../../gcc.dg/torture/crc-21.c"
[gcc r15-7023] Fix bootstrap failure on SPARC with -O3 -mcpu=niagara4
https://gcc.gnu.org/g:d309844d6fe02e695eb82cbd30fd135e836f24eb

commit r15-7023-gd309844d6fe02e695eb82cbd30fd135e836f24eb
Author: Eric Botcazou
Date:   Sat Jan 18 18:58:02 2025 +0100

    Fix bootstrap failure on SPARC with -O3 -mcpu=niagara4

    This is a regression present on the mainline only, but the underlying issue
    has been latent for years: the compiler and the assembler disagree on the
    support of the VIS 3B SIMD ISA, the former bundling it with VIS 3 but not
    the latter.  IMO the documentation is not very clear, so this patch just
    aligns the compiler with the assembler.

    gcc/
            PR target/118512
            * config/sparc/sparc-c.cc (sparc_target_macros): Deal with VIS 3B.
            * config/sparc/sparc.cc (dump_target_flag_bits): Likewise.
            (sparc_option_override): Likewise.
            (sparc_vis_init_builtins): Likewise.
            * config/sparc/sparc.md (fpcmp_vis): Replace TARGET_VIS3 with
            TARGET_VIS3B.
            (vec_cmp): Likewise.
            (fpcmpu_vis): Likewise.
            (vec_cmpu): Likewise.
            (vcond_mask_): Likewise.
            * config/sparc/sparc.opt (VIS3B): New target mask.
            * doc/invoke.texi (SPARC options): Document -mvis3b.

    gcc/testsuite/
            * gcc.target/sparc/20230328-1.c: Pass -mvis3b instead of -mvis3.
            * gcc.target/sparc/20230328-4.c: Likewise.
            * gcc.target/sparc/fucmp.c: Likewise.
            * gcc.target/sparc/vis3misc.c: Likewise.

Diff:
---
 gcc/config/sparc/sparc-c.cc                 |  5 ++
 gcc/config/sparc/sparc.cc                   | 74 -
 gcc/config/sparc/sparc.md                   | 12 ++---
 gcc/config/sparc/sparc.opt                  |  6 ++-
 gcc/doc/invoke.texi                         | 22 +++--
 gcc/testsuite/gcc.target/sparc/20230328-1.c |  2 +-
 gcc/testsuite/gcc.target/sparc/20230328-4.c |  2 +-
 gcc/testsuite/gcc.target/sparc/fucmp.c      |  2 +-
 gcc/testsuite/gcc.target/sparc/vis3misc.c   |  3 +-
 9 files changed, 81 insertions(+), 47 deletions(-)

diff --git a/gcc/config/sparc/sparc-c.cc b/gcc/config/sparc/sparc-c.cc
index 47a22d583b69..d365da3a10be 100644
--- a/gcc/config/sparc/sparc-c.cc
+++ b/gcc/config/sparc/sparc-c.cc
@@ -52,6 +52,11 @@ sparc_target_macros (void)
       cpp_define (parse_in, "__VIS__=0x400");
       cpp_define (parse_in, "__VIS=0x400");
     }
+  else if (TARGET_VIS3B)
+    {
+      cpp_define (parse_in, "__VIS__=0x310");
+      cpp_define (parse_in, "__VIS=0x310");
+    }
   else if (TARGET_VIS3)
     {
       cpp_define (parse_in, "__VIS__=0x300");
diff --git a/gcc/config/sparc/sparc.cc b/gcc/config/sparc/sparc.cc
index a62b8033954f..2196a0c4498e 100644
--- a/gcc/config/sparc/sparc.cc
+++ b/gcc/config/sparc/sparc.cc
@@ -1671,6 +1671,8 @@ dump_target_flag_bits (const int flags)
     fprintf (stderr, "VIS2 ");
   if (flags & MASK_VIS3)
     fprintf (stderr, "VIS3 ");
+  if (flags & MASK_VIS3B)
+    fprintf (stderr, "VIS3B ");
   if (flags & MASK_VIS4)
     fprintf (stderr, "VIS4 ");
   if (flags & MASK_VIS4B)
@@ -1919,19 +1921,23 @@ sparc_option_override (void)
   if (TARGET_VIS3)
     target_flags |= MASK_VIS2 | MASK_VIS;
 
-  /* -mvis4 implies -mvis3, -mvis2 and -mvis.  */
-  if (TARGET_VIS4)
+  /* -mvis3b implies -mvis3, -mvis2 and -mvis.  */
+  if (TARGET_VIS3B)
     target_flags |= MASK_VIS3 | MASK_VIS2 | MASK_VIS;
 
-  /* -mvis4b implies -mvis4, -mvis3, -mvis2 and -mvis */
+  /* -mvis4 implies -mvis3b, -mvis3, -mvis2 and -mvis.  */
+  if (TARGET_VIS4)
+    target_flags |= MASK_VIS3B | MASK_VIS3 | MASK_VIS2 | MASK_VIS;
+
+  /* -mvis4b implies -mvis4, -mvis3b, -mvis3, -mvis2 and -mvis */
   if (TARGET_VIS4B)
-    target_flags |= MASK_VIS4 | MASK_VIS3 | MASK_VIS2 | MASK_VIS;
+    target_flags |= MASK_VIS4 | MASK_VIS3B | MASK_VIS3 | MASK_VIS2 | MASK_VIS;
 
-  /* Don't allow -mvis, -mvis2, -mvis3, -mvis4, -mvis4b, -mfmaf and -mfsmuld if
-     FPU is disabled.  */
+  /* Don't allow -mvis, -mvis2, -mvis3, -mvis3b, -mvis4, -mvis4b, -mfmaf and
+     -mfsmuld if FPU is disabled.  */
   if (!TARGET_FPU)
-    target_flags &= ~(MASK_VIS | MASK_VIS2 | MASK_VIS3 | MASK_VIS4
-                      | MASK_VIS4B | MASK_FMAF | MASK_FSMULD);
+    target_flags &= ~(MASK_VIS | MASK_VIS2 | MASK_VIS3 | MASK_VIS3B
+                      | MASK_VIS4 | MASK_VIS4B | MASK_FMAF | MASK_FSMULD);
 
   /* -mvis assumes UltraSPARC+, so we are sure v9 instructions
      are available; -m64 also implies v9.  */
@@ -11451,10 +11457,6 @@ sparc_vis_init_builtins (void)
       def_builtin_const ("__builtin_vis_fmean16", CODE_FOR_fmean16_vis,
                          SPARC_BUILTIN_FMEAN16, v4hi_ftype_v4hi_v4hi);
 
-      def_builtin_const ("__builtin_vis_fpadd64", CODE_FOR_fpadd64_vis,
-                         SPARC_BUILTIN_FPADD64, di_ftype_di_di);
-      def_builtin_const ("__builtin_vis_fpsub64", CODE_FOR_fpsub64_vis,
-                         SPARC_BUILTIN_FPSUB64, di_ftype_di_di);
       def_builtin
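
The option implication chain that r15-7023 introduces in sparc_option_override can be sketched as below. The SKETCH_* bit values and the function apply_vis_implications are hypothetical (the real MASK_* bits are generated from sparc.opt), and the -mfmaf/-mfsmuld handling is left out; this only illustrates how VIS 3B slots between VIS 3 and VIS 4.

```cpp
#include <cstdint>

// Hypothetical bit assignments, chosen only for this illustration.
constexpr std::uint32_t SKETCH_VIS   = 1u << 0;
constexpr std::uint32_t SKETCH_VIS2  = 1u << 1;
constexpr std::uint32_t SKETCH_VIS3  = 1u << 2;
constexpr std::uint32_t SKETCH_VIS3B = 1u << 3;
constexpr std::uint32_t SKETCH_VIS4  = 1u << 4;
constexpr std::uint32_t SKETCH_VIS4B = 1u << 5;
constexpr std::uint32_t SKETCH_FPU   = 1u << 6;

// Applies the implications shown in the patched hunk, in the same order.
std::uint32_t apply_vis_implications (std::uint32_t flags)
{
  // -mvis3 implies -mvis2 and -mvis.
  if (flags & SKETCH_VIS3)
    flags |= SKETCH_VIS2 | SKETCH_VIS;
  // -mvis3b implies -mvis3, -mvis2 and -mvis.
  if (flags & SKETCH_VIS3B)
    flags |= SKETCH_VIS3 | SKETCH_VIS2 | SKETCH_VIS;
  // -mvis4 implies -mvis3b and everything below it.
  if (flags & SKETCH_VIS4)
    flags |= SKETCH_VIS3B | SKETCH_VIS3 | SKETCH_VIS2 | SKETCH_VIS;
  // -mvis4b implies -mvis4 and everything below it.
  if (flags & SKETCH_VIS4B)
    flags |= SKETCH_VIS4 | SKETCH_VIS3B | SKETCH_VIS3 | SKETCH_VIS2
             | SKETCH_VIS;
  // No VIS level is allowed when the FPU is disabled.
  if (!(flags & SKETCH_FPU))
    flags &= ~(SKETCH_VIS | SKETCH_VIS2 | SKETCH_VIS3 | SKETCH_VIS3B
               | SKETCH_VIS4 | SKETCH_VIS4B);
  return flags;
}
```

Note that requesting only VIS 3 does not turn on VIS 3B in this sketch, which is the separation the commit message describes as aligning the compiler with the assembler.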
[gcc r15-7024] Fix uniqueness of symtab_node::get_dump_name.
https://gcc.gnu.org/g:557d1a44ece3b9cf0084a4ebcc2e50875d788393

commit r15-7024-g557d1a44ece3b9cf0084a4ebcc2e50875d788393
Author: Michal Jires
Date:   Thu Jan 16 14:42:59 2025 +0100

    Fix uniqueness of symtab_node::get_dump_name.

    symtab_node::get_dump_name uses node order to identify nodes.
    Order is no longer unique because of Incremental LTO patches.
    This patch moves uid from cgraph_node node to symtab_node,
    so get_dump_name can use uid instead and get back unique dump names.

    In inlining passes, uid is replaced with more appropriate (more compact
    for indexing) summary id.

    Bootstrapped/regtested on x86_64-linux.
    Ok for trunk?

    gcc/ChangeLog:

            * cgraph.cc (symbol_table::create_empty): Move uid to symtab_node.
            (test_symbol_table_test): Change expected dump id.
            * cgraph.h (struct cgraph_node): Move uid to symtab_node.
            (symbol_table::register_symbol): Likewise.
            * dumpfile.cc (test_capture_of_dump_calls): Change expected dump id.
            * ipa-inline.cc (update_caller_keys): Use summary id instead of uid.
            (update_callee_keys): Likewise.
            * symtab.cc (symtab_node::get_dump_name): Use uid instead of order.

    gcc/testsuite/ChangeLog:

            * gcc.dg/live-patching-1.c: Change expected dump id.
            * gcc.dg/live-patching-4.c: Likewise.

Diff:
---
 gcc/cgraph.cc                          |  4 ++--
 gcc/cgraph.h                           | 25 ++---
 gcc/dumpfile.cc                        |  8
 gcc/ipa-inline.cc                      |  6 +++---
 gcc/symtab.cc                          |  2 +-
 gcc/testsuite/gcc.dg/live-patching-1.c |  2 +-
 gcc/testsuite/gcc.dg/live-patching-4.c |  2 +-
 7 files changed, 26 insertions(+), 23 deletions(-)

diff --git a/gcc/cgraph.cc b/gcc/cgraph.cc
index 83a9b59ef302..d0b19ad850e0 100644
--- a/gcc/cgraph.cc
+++ b/gcc/cgraph.cc
@@ -290,7 +290,7 @@ cgraph_node *
 symbol_table::create_empty (void)
 {
   cgraph_count++;
-  return new (ggc_alloc ()) cgraph_node (cgraph_max_uid++);
+  return new (ggc_alloc ()) cgraph_node ();
 }
 
 /* Register HOOK to be called with DATA on each removed edge.  */
@@ -4338,7 +4338,7 @@ test_symbol_table_test ()
       /* Verify that the node has order 0 on both iterations,
          and thus that nodes have predictable dump names in selftests.  */
       ASSERT_EQ (node->order, 0);
-      ASSERT_STREQ (node->dump_name (), "test_decl/0");
+      ASSERT_STREQ (node->dump_name (), "test_decl/1");
     }
 }
 
diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index 7856d53c9e92..065fcc742e8b 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -124,7 +124,7 @@ public:
       order (-1), next_sharing_asm_name (NULL),
       previous_sharing_asm_name (NULL), same_comdat_group (NULL), ref_list (),
       alias_target (NULL), lto_file_data (NULL), aux (NULL),
-      x_comdat_group (NULL_TREE), x_section (NULL)
+      x_comdat_group (NULL_TREE), x_section (NULL), m_uid (-1)
   {}
 
   /* Return name.  */
@@ -492,6 +492,12 @@ public:
   /* Perform internal consistency checks, if they are enabled.  */
   static inline void checking_verify_symtab_nodes (void);
 
+  /* Get unique identifier of the node.  */
+  inline int get_uid ()
+  {
+    return m_uid;
+  }
+
   /* Type of the symbol.  */
   ENUM_BITFIELD (symtab_type) type : 8;
 
@@ -668,6 +674,9 @@ protected:
                                       void *data,
                                       bool include_overwrite);
 private:
+  /* Unique id of the node.  */
+  int m_uid;
+
   /* Workers for set_section.  */
   static bool set_section_from_string (symtab_node *n, void *s);
   static bool set_section_from_node (symtab_node *n, void *o);
@@ -882,7 +891,7 @@ struct GTY((tag ("SYMTAB_FUNCTION"))) cgraph_node : public symtab_node
   friend class symbol_table;
 
   /* Constructor.  */
-  explicit cgraph_node (int uid)
+  explicit cgraph_node ()
     : symtab_node (SYMTAB_FUNCTION), callees (NULL), callers (NULL),
       indirect_calls (NULL),
       next_sibling_clone (NULL), prev_sibling_clone (NULL), clones (NULL),
@@ -903,7 +912,7 @@ struct GTY((tag ("SYMTAB_FUNCTION"))) cgraph_node : public symtab_node
       redefined_extern_inline (false), tm_may_enter_irr (false),
       ipcp_clone (false), gc_candidate (false),
       called_by_ifunc_resolver (false), has_omp_variant_constructs (false),
-      m_uid (uid), m_summary_id (-1)
+      m_summary_id (-1)
   {}
 
   /* Remove the node from cgraph and all inline clones inlined into it.
@@ -1304,12 +1313,6 @@ struct GTY((tag ("SYMTAB_FUNCTION"))) cgraph_node : public symtab_node
     dump_cgraph (stderr);
   }
 
-  /* Get unique identifier of the node.  */
-  inline int get_uid ()
-  {
-    return m_uid;
-  }
-
   /* Get summary id of the node.  */
   inline int get_summary_id ()
   {
@@ -1503,8 +1506,6 @@ struct GTY((tag ("SY