[gcc] Deleted branch 'majin/heads/dev' in namespace 'refs/users'
The branch 'majin/heads/dev' in namespace 'refs/users' was deleted. It previously pointed to: d0acb7b2b26d... PR modula2/118010 m2 libc lseek procedure interface correct
[gcc r15-7368] libstdc++: Fix gnu.ver CXXABI_1.3.16 for Solaris [PR118701]
https://gcc.gnu.org/g:6b49883e62a1ecf01ffd78c4c20fa7af87f8ec4d commit r15-7368-g6b49883e62a1ecf01ffd78c4c20fa7af87f8ec4d Author: Rainer Orth Date: Wed Feb 5 09:59:56 2025 +0100 libstdc++: Fix gnu.ver CXXABI_1.3.16 for Solaris [PR118701] This patch commit c6977f765838a5ca8d321d916221a7368622bdd9 Author: Andreas Schwab Date: Tue Jan 21 23:50:15 2025 +0100 libstdc++: correct symbol version of typeinfo for bfloat16_t on RISC-V broke the libstdc++-abi/abi_check test on Solaris: the log shows 1 incompatible symbols 0 Argument "{CXXABI_1.3.15}" isn't numeric in numeric eq (==) at /vol/gcc/src/hg/master/local/libstdc++-v3/scripts/extract_symvers.pl line 129. version status: incompatible type: uncategorized status: added The problem has two parts: * The patch above introduced a new version in libstdc++.so, CXXABI_1.3.16, which everywhere but on RISC-V contains no symbols (a weak version). This is the first time this happened in libstdc++. * Solaris uses scripts/extract_symvers.pl to determine the version info. The script currently chokes on the pvs output for weak versions: libstdc++.so.6.0.34 - CXXABI_1.3.16 [WEAK]: {CXXABI_1.3.15}; instead of libstdc++.so.6.0.34 - CXXABI_1.3.16: {CXXABI_1.3.15}; While this patch hardens the script to cope with weak versions, there's no reason to introduce them in the first place. So the new version is only created on __riscv. Tested on i386-pc-solaris2.11, sparc-sun-solaris2.11, and x86_64-pc-linux-gnu. 2025-01-29 Rainer Orth Jonathan Wakely libstdc++-v3: PR libstdc++/118701 * config/abi/pre/gnu.ver (CXXABI_1.3.16): Move __riscv guard around version. * scripts/extract_symvers.pl: Allow for weak versions. * testsuite/util/testsuite_abi.cc (check_version): Wrap CXXABI_1.3.16 in __riscv. 
Diff: --- libstdc++-v3/config/abi/pre/gnu.ver | 4 ++-- libstdc++-v3/scripts/extract_symvers.pl | 14 -- libstdc++-v3/testsuite/util/testsuite_abi.cc | 2 ++ 3 files changed, 16 insertions(+), 4 deletions(-) diff --git a/libstdc++-v3/config/abi/pre/gnu.ver b/libstdc++-v3/config/abi/pre/gnu.ver index 84ce874fe036..adadc62e3533 100644 --- a/libstdc++-v3/config/abi/pre/gnu.ver +++ b/libstdc++-v3/config/abi/pre/gnu.ver @@ -2874,15 +2874,15 @@ CXXABI_1.3.15 { } CXXABI_1.3.14; +#ifdef __riscv CXXABI_1.3.16 { -#ifdef __riscv _ZTIDF16b; _ZTIPDF16b; _ZTIPKDF16b; -#endif } CXXABI_1.3.15; +#endif # Symbols in the support library (libsupc++) supporting transactional memory. CXXABI_TM_1 { diff --git a/libstdc++-v3/scripts/extract_symvers.pl b/libstdc++-v3/scripts/extract_symvers.pl index e0e6e5b70c4a..fb18e11a1b3e 100644 --- a/libstdc++-v3/scripts/extract_symvers.pl +++ b/libstdc++-v3/scripts/extract_symvers.pl @@ -34,8 +34,18 @@ while () { # Remove trailing semicolon. s/;$//; -# shared object, dash, version, symbol, [size] -(undef, undef, $version, $symbol, $size) = split; +if (/\[WEAK\]/) { + # Allow for weak versions like + # libstdc++.so.6.0.34 - CXXABI_1.3.16 [WEAK]: {CXXABI_1.3.15}; + # + # shared object, dash, version "[WEAK]", symbol, [size] + (undef, undef, $version, undef, $symbol, $size) = split; +} else { + # libstdc++.so.6.0.34 - CXXABI_1.3.16: {CXXABI_1.3.15}; + # + # shared object, dash, version, symbol, [size] + (undef, undef, $version, $symbol, $size) = split; +} # Remove colon separator from version field. 
$version =~ s/:$//; diff --git a/libstdc++-v3/testsuite/util/testsuite_abi.cc b/libstdc++-v3/testsuite/util/testsuite_abi.cc index 0d6080fb92c0..1b4044c95188 100644 --- a/libstdc++-v3/testsuite/util/testsuite_abi.cc +++ b/libstdc++-v3/testsuite/util/testsuite_abi.cc @@ -237,7 +237,9 @@ check_version(symbol& test, bool added) known_versions.push_back("CXXABI_1.3.13"); known_versions.push_back("CXXABI_1.3.14"); known_versions.push_back("CXXABI_1.3.15"); +#ifdef __riscv known_versions.push_back("CXXABI_1.3.16"); +#endif known_versions.push_back("CXXABI_IEEE128_1.3.13"); known_versions.push_back("CXXABI_TM_1"); known_versions.push_back("CXXABI_FLOAT128");
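The field handling that the extract_symvers.pl hunk adds can be restated in C for readers who do not read Perl. This is only a sketch of the same splitting logic: parse_pvs_line is a hypothetical helper, not part of libstdc++, and the two input lines it is meant for are the pvs examples quoted in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of libstdc++): split one pvs-style line
   into its version and symbol fields, skipping an optional "[WEAK]" tag
   the same way the patched Perl script does.  */
static bool
parse_pvs_line (const char *line, char *version, size_t vlen,
                char *symbol, size_t slen)
{
  char buf[256];
  char *fields[6];
  size_t n = 0;

  assert (line && version && symbol);
  if (strlen (line) >= sizeof buf)
    return false;
  strcpy (buf, line);

  /* Remove trailing semicolon.  */
  size_t len = strlen (buf);
  if (len > 0 && buf[len - 1] == ';')
    buf[len - 1] = '\0';

  bool weak = strstr (buf, "[WEAK]") != NULL;

  /* Split on whitespace: shared object, dash, version, ["[WEAK]:",] symbol.  */
  for (char *tok = strtok (buf, " \t"); tok && n < 6;
       tok = strtok (NULL, " \t"))
    fields[n++] = tok;

  size_t sym_idx = weak ? 4 : 3;
  if (n <= sym_idx)
    return false;

  snprintf (version, vlen, "%s", fields[2]);
  snprintf (symbol, slen, "%s", fields[sym_idx]);

  /* Remove colon separator from version field.  */
  len = strlen (version);
  if (len > 0 && version[len - 1] == ':')
    version[len - 1] = '\0';
  return true;
}
```

For both the weak and the non-weak form of the line, the sketch yields version CXXABI_1.3.16 and symbol {CXXABI_1.3.15}, which is the outcome the script change is after.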
[gcc r15-7369] testsuite: Revert to the original version of pr100056.c
https://gcc.gnu.org/g:754137d9cb741dcbbd27b03bddb3dc10d1376421 commit r15-7369-g754137d9cb741dcbbd27b03bddb3dc10d1376421 Author: Richard Sandiford Date: Wed Feb 5 09:05:05 2025 + testsuite: Revert to the original version of pr100056.c r15-268-g9dbff9c05520 restored the original GCC 11 output for pr100056.c, so this patch reverts the changes made to the test in r12-7259-g25332d2325c7. (The code parts of r12-7259 still seem useful, as a belt-and-braces thing.) gcc/testsuite/ * gcc.target/aarch64/pr100056.c: Restore the original version of the scan-assemblers. Diff: --- gcc/testsuite/gcc.target/aarch64/pr100056.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/gcc/testsuite/gcc.target/aarch64/pr100056.c b/gcc/testsuite/gcc.target/aarch64/pr100056.c index 70499772d285..0b77824da457 100644 --- a/gcc/testsuite/gcc.target/aarch64/pr100056.c +++ b/gcc/testsuite/gcc.target/aarch64/pr100056.c @@ -1,9 +1,7 @@ /* PR target/100056 */ /* { dg-do compile } */ /* { dg-options "-O2" } */ -/* { dg-final { scan-assembler-not {\t[us]bfiz\tw[0-9]+, w[0-9]+, 11} { xfail *-*-* } } } */ -/* { dg-final { scan-assembler-times {\t[us]bfiz\tw[0-9]+, w[0-9]+, 11} 2 } } */ -/* { dg-final { scan-assembler-times {\tadd\tw[0-9]+, w[0-9]+, w[0-9]+, uxtb\n} 2 } } */ +/* { dg-final { scan-assembler-not {\t[us]bfiz\tw[0-9]+, w[0-9]+, 11} } } */ int or_shift_u8 (unsigned char i)
[gcc(refs/users/mikael/heads/refactor_descriptor_v01)] Revert dump change for assumed_rank_12.f90
https://gcc.gnu.org/g:d3da1c5219031e93be3da8c0cf2d972612e86509 commit d3da1c5219031e93be3da8c0cf2d972612e86509 Author: Mikael Morin Date: Wed Feb 5 11:45:00 2025 +0100 Annulation modif dump assumed_rank_12.f90 Diff: --- gcc/fortran/trans-array.cc | 126 - 1 file changed, 124 insertions(+), 2 deletions(-) diff --git a/gcc/fortran/trans-array.cc b/gcc/fortran/trans-array.cc index 90eafe7ffe18..531281049646 100644 --- a/gcc/fortran/trans-array.cc +++ b/gcc/fortran/trans-array.cc @@ -1085,11 +1085,131 @@ field_count (tree type) } -bool +#if 0 +static bool complete_init_p (tree type, vec *init_values) { return (unsigned) field_count (type) == vec_safe_length (init_values); } +#endif + + +static int +cmp_wi (const void *x, const void *y) +{ + const offset_int *wix = (const offset_int *) x; + const offset_int *wiy = (const offset_int *) y; + + return wi::cmpu (*wix, *wiy); +} + + +static offset_int +get_offset_bits (tree field) +{ + offset_int field_offset = wi::to_offset (DECL_FIELD_OFFSET (field)); + offset_int field_bit_offset = wi::to_offset (DECL_FIELD_BIT_OFFSET (field)); + unsigned long offset_align = DECL_OFFSET_ALIGN (field); + + return field_offset * offset_align + field_bit_offset; +} + + +static bool +check_cleared_low_bits (const offset_int &val, int bitcount) +{ + if (bitcount == 0) +return true; + + offset_int mask = wi::mask (bitcount, false); + if ((val & mask) != 0) +return false; + + return true; +} + + +static bool +right_shift_if_clear (const offset_int &val, int bitcount, offset_int *result) +{ + if (bitcount == 0) +{ + *result = val; + return true; +} + + if (!check_cleared_low_bits (val, bitcount)) +return false; + + *result = val >> bitcount; + return true; +} + + +static bool +contiguous_init_p (tree type, tree value) +{ + gcc_assert (TREE_CODE (value) == CONSTRUCTOR); + auto_vec field_offsets; + int count = field_count (type); + field_offsets.reserve (count); + + tree field = TYPE_FIELDS (type); + offset_int expected_offset = 0; + while (field != 
NULL_TREE) +{ + offset_int field_offset_bits = get_offset_bits (field); + offset_int field_offset; + if (!right_shift_if_clear (field_offset_bits, 3, &field_offset)) + return false; + + offset_int type_size = wi::to_offset (TYPE_SIZE_UNIT (TREE_TYPE (field))); + int align = wi::ctz (type_size); + if (!check_cleared_low_bits (field_offset, align)) + return false; + + if (field_offset != expected_offset) + return false; + + expected_offset += type_size; + field_offsets.quick_push (field_offset); + + field = DECL_CHAIN (field); +} + + auto_vec value_offsets; + value_offsets.reserve (count); + + unsigned i; + tree field_init; + FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (value), i, field, field_init) +{ + if (TREE_TYPE (field) != TREE_TYPE (field_init)) + return false; + + offset_int field_offset_bits = get_offset_bits (field); + offset_int field_offset; + if (!right_shift_if_clear (field_offset_bits, 3, &field_offset)) + return false; + + value_offsets.quick_push (field_offset); +} + + value_offsets.qsort (cmp_wi); + + unsigned idx = 0; + offset_int field_off, val_off; + while (field_offsets.iterate (idx, &field_off) +&& value_offsets.iterate (idx, &val_off)) +{ + if (val_off != field_off) + return false; + + idx++; +} + + return true; +} static bool @@ -1161,7 +1281,9 @@ init_struct (stmtblock_t *block, tree data_ref, init_kind kind, if (TREE_STATIC (data_ref) || !modifiable_p (data_ref)) DECL_INITIAL (data_ref) = value; - else if (TREE_CODE (value) == CONSTRUCTOR) + else if (TREE_CODE (value) == CONSTRUCTOR + && !(TREE_CONSTANT (value) + && contiguous_init_p (type, value))) { unsigned i; tree field, field_init;
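The two small helpers in this diff, check_cleared_low_bits and right_shift_if_clear, convert a bit offset to a byte offset only when the conversion is exact. A minimal sketch of that idea on plain 64-bit integers follows; the names low_bits_clear and shift_right_if_clear are illustrative, and uint64_t stands in for GCC's offset_int, which is an assumption made purely to keep the example self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in for check_cleared_low_bits: true when the low
   BITCOUNT bits of VAL are all zero.  */
static bool
low_bits_clear (uint64_t val, unsigned bitcount)
{
  assert (bitcount < 64);
  uint64_t mask = (UINT64_C (1) << bitcount) - 1;
  return (val & mask) == 0;
}

/* Illustrative stand-in for right_shift_if_clear: shift VAL right by
   BITCOUNT only if no set bit would be lost; otherwise report failure
   and leave *RESULT untouched.  */
static bool
shift_right_if_clear (uint64_t val, unsigned bitcount, uint64_t *result)
{
  if (!low_bits_clear (val, bitcount))
    return false;
  *result = val >> bitcount;
  return true;
}
```

With a bit count of 3, as in contiguous_init_p, a field at bit offset 64 maps to byte offset 8, while a bit offset of 65 is rejected because it is not byte-aligned.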
[gcc(refs/users/mikael/heads/refactor_descriptor_v01)] Split dump patterns for assumed_rank_12.f90
https://gcc.gnu.org/g:543f6360597cdbf1c6fc208116fd2a18c12d871e commit 543f6360597cdbf1c6fc208116fd2a18c12d871e Author: Mikael Morin Date: Wed Feb 5 11:57:09 2025 +0100 Split dump patterns for assumed_rank_12.f90 Diff: --- gcc/testsuite/gfortran.dg/assumed_rank_12.f90 | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/gcc/testsuite/gfortran.dg/assumed_rank_12.f90 b/gcc/testsuite/gfortran.dg/assumed_rank_12.f90 index 873498f82d76..cacfb7ed52af 100644 --- a/gcc/testsuite/gfortran.dg/assumed_rank_12.f90 +++ b/gcc/testsuite/gfortran.dg/assumed_rank_12.f90 @@ -16,5 +16,9 @@ function f() result(res) end function f end -! { dg-final { scan-tree-dump " = f \\(\\);.*desc.0.dtype = .*;.*desc.0.data = .void .. D.*;.*sub \\(&desc.0\\);.*D.*= .integer.kind=4. .. desc.0.data;" "original" } } +! { dg-final { scan-tree-dump " = f \\(\\);" "original" } } +! { dg-final { scan-tree-dump "desc.0.dtype = .*;" "original" } } +! { dg-final { scan-tree-dump "desc.0.data = .void .. D.*;" "original" } } +! { dg-final { scan-tree-dump "sub \\(&desc.0\\);" "original" } } +! { dg-final { scan-tree-dump "D.*= .integer.kind=4. .. desc.0.data;" "original" } }
[gcc r15-7367] MAINTAINERS: Add myself to write after approval
https://gcc.gnu.org/g:884893ae87ae9a562c38f997d9b332c3591b commit r15-7367-g884893ae87ae9a562c38f997d9b332c3591b Author: Jin Ma Date: Tue Dec 3 15:50:14 2024 +0800 MAINTAINERS: Add myself to write after approval ChangeLog: * MAINTAINERS: Add myself. Diff: --- MAINTAINERS | 1 + 1 file changed, 1 insertion(+) diff --git a/MAINTAINERS b/MAINTAINERS index 44367b27b415..c423dd6e7874 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -641,6 +641,7 @@ H.J. Lu hjl Xiong Hu Luo- Bin Bin Lv shlb Christophe Lyon clyon +Jin Ma majin Jun Ma junma Andrew MacLeod - Luis Machadoluisgpm
[gcc(refs/users/majin/heads/master)] MAINTAINERS: Add myself to write after approval
https://gcc.gnu.org/g:8cae77a5be9c59aa511cd957ea6ea700605a5d97 commit 8cae77a5be9c59aa511cd957ea6ea700605a5d97 Author: Jin Ma Date: Tue Dec 3 15:50:14 2024 +0800 MAINTAINERS: Add myself to write after approval ChangeLog: * MAINTAINERS: Add myself. Diff: --- MAINTAINERS | 1 + 1 file changed, 1 insertion(+) diff --git a/MAINTAINERS b/MAINTAINERS index 44367b27b415..c423dd6e7874 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -641,6 +641,7 @@ H.J. Lu hjl Xiong Hu Luo- Bin Bin Lv shlb Christophe Lyon clyon +Jin Ma majin Jun Ma junma Andrew MacLeod - Luis Machadoluisgpm
[gcc r15-7370] vect: Fix wrong code with pr108692.c on targets with only non-widening ABD [PR118727]
https://gcc.gnu.org/g:da88e7027a34a44de84f6d8d5a96d262c29080a7

commit r15-7370-gda88e7027a34a44de84f6d8d5a96d262c29080a7
Author: Xi Ruoyao
Date: Sun Feb 2 21:22:36 2025 +0800

    vect: Fix wrong code with pr108692.c on targets with only non-widening ABD [PR118727]

    With things like

        // signed char a_14, b_16;
        a.0_4 = (unsigned char) a_14;
        _5 = (int) a.0_4;
        b.1_6 = (unsigned char) b_16;
        _7 = (int) b.1_6;
        c_17 = _5 - _7;
        _8 = ABS_EXPR ;
        r_18 = _8 + r_23;

    An ABD pattern will be recognized for _8:

        patt_31 = .ABD (a.0_4, b.1_6);

    It's still correct. But then when the SAD pattern is recognized:

        patt_29 = SAD_EXPR ;

    This is not correct. This only happens for targets with both uabd and
    sabd but not vec_widen_{s,u}abd; currently LoongArch is the only target
    affected.

    The problem is that vect_look_through_possible_promotion will throw away
    a series of conversions if the effect is equivalent to a sign change and
    a promotion, but here the sign change is definitely relevant, and the
    promotion is also relevant for "mixed sign" cases like

        r += abs((unsigned int)(unsigned char) a - (signed int)(signed char) b)

    (we need to promote to HImode as the difference can exceed the range of
    QImode).

    If there were any redundant promotion, it should have been stripped in
    vect_recog_abd_pattern (i.e. when patt_31 = .ABD (a.0_4, b.1_6) is
    recognized) instead of in vect_recog_sad_pattern, or we'd have a
    missed optimization if the ABD output is not summarized. So
    vect_recog_sad_pattern is simply not a proper location to call
    vect_look_through_possible_promotion for the ABD inputs; remove the
    calls to fix the issue.

    gcc/ChangeLog:

    PR tree-optimization/118727
    * tree-vect-patterns.cc (vect_recog_sad_pattern): Don't call
    vect_look_through_possible_promotion on ABD inputs.

    gcc/testsuite/ChangeLog:

    PR tree-optimization/118727
    * gcc.dg/pr108692.c: Mention PR 118727 in the comment.
    * gcc.dg/pr118727.c: New test case.
Diff: --- gcc/testsuite/gcc.dg/pr108692.c | 1 + gcc/testsuite/gcc.dg/pr118727.c | 32 gcc/tree-vect-patterns.cc | 11 ++- 3 files changed, 35 insertions(+), 9 deletions(-) diff --git a/gcc/testsuite/gcc.dg/pr108692.c b/gcc/testsuite/gcc.dg/pr108692.c index 13a27496ad9f..22032817 100644 --- a/gcc/testsuite/gcc.dg/pr108692.c +++ b/gcc/testsuite/gcc.dg/pr108692.c @@ -1,4 +1,5 @@ /* PR tree-optimization/108692 */ +/* PR tree-optimization/118727 */ /* { dg-do run } */ /* { dg-options "-O2 -ftree-vectorize" } */ diff --git a/gcc/testsuite/gcc.dg/pr118727.c b/gcc/testsuite/gcc.dg/pr118727.c new file mode 100644 index ..2ee5fa782362 --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr118727.c @@ -0,0 +1,32 @@ +/* PR tree-optimization/118727 */ +/* { dg-do run } */ +/* { dg-options "-O2 -ftree-vectorize" } */ + +__attribute__((noipa)) int +foo (signed char *x, signed char *y, int n) +{ + int i, r = 0; + signed char a, b; + for (i = 0; i < n; i++) +{ + a = x[i]; + b = y[i]; + /* Slightly twisted from pr108692.c. */ + int c = (unsigned int)(unsigned char) a - (signed int)(signed char) b; + r = r + (c < 0 ? 
-c : c); +} + return r; +} + +int +main () +{ + signed char x[64] = {}, y[64] = {}; + if (__CHAR_BIT__ != 8 || __SIZEOF_INT__ != 4) +return 0; + x[32] = -1; + y[32] = -128; + if (foo (x, y, 64) != 383) +__builtin_abort (); + return 0; +} diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc index 5aebf9505485..6fc97d1b6ef9 100644 --- a/gcc/tree-vect-patterns.cc +++ b/gcc/tree-vect-patterns.cc @@ -1405,15 +1405,8 @@ vect_recog_sad_pattern (vec_info *vinfo, tree abd_oprnd0 = gimple_call_arg (abd_stmt, 0); tree abd_oprnd1 = gimple_call_arg (abd_stmt, 1); - if (gimple_call_internal_fn (abd_stmt) == IFN_ABD) - { - if (!vect_look_through_possible_promotion (vinfo, abd_oprnd0, -&unprom[0]) - || !vect_look_through_possible_promotion (vinfo, abd_oprnd1, - &unprom[1])) - return NULL; - } - else if (gimple_call_internal_fn (abd_stmt) == IFN_VEC_WIDEN_ABD) + if (gimple_call_internal_fn (abd_stmt) == IFN_ABD + || gimple_call_internal_fn (abd_stmt) == IFN_VEC_WIDEN_ABD) { unprom[0].op = abd_oprnd0; unprom[0].type = TREE_TYPE (abd_oprnd0);
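The expected value 383 in the new test comes from a single lane, x[32] = -1 and y[32] = -128, and it shows why the sign change and the promotion both matter. The sketch below works through that lane in scalar C. The functions uabd8 and sabd8 are simplified, hypothetical models of a same-width (non-widening) absolute difference, not GCC's internal functions; they are only here to show that neither 8-bit interpretation can produce the mixed-sign result.

```c
#include <assert.h>

/* The scalar computation from pr118727.c for one element: A is read as
   unsigned char, B as signed char, and the difference is taken in int.  */
static int
mixed_sign_abs_diff (signed char a, signed char b)
{
  int c = (int) (unsigned char) a - (int) (signed char) b;
  return c < 0 ? -c : c;
}

/* Hypothetical model of an unsigned 8-bit absolute difference.  */
static int
uabd8 (unsigned char a, unsigned char b)
{
  return a > b ? a - b : b - a;
}

/* Hypothetical model of a signed 8-bit absolute difference.  */
static int
sabd8 (signed char a, signed char b)
{
  return a > b ? a - b : b - a;
}

/* Self-check for the lane used by the test: 255 - (-128) = 383, which
   does not fit in 8 bits, while both same-width models give 127.  */
static void
check_pr118727_lane (void)
{
  assert (mixed_sign_abs_diff (-1, -128) == 383);
  assert (uabd8 ((unsigned char) -1, (unsigned char) -128) == 127);
  assert (sabd8 (-1, -128) == 127);
}
```

All other lanes in the test are zero, so the sum over 64 elements equals the value of this one lane.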
[gcc r15-7371] arm: cleanup code in ldm_stm_operation_p; relax limits on ldm/stm
https://gcc.gnu.org/g:aead1d44b7df50c77ff63482f5548f237ff29033 commit r15-7371-gaead1d44b7df50c77ff63482f5548f237ff29033 Author: Richard Earnshaw Date: Thu Dec 19 15:32:36 2024 + arm: cleanup code in ldm_stm_operation_p; relax limits on ldm/stm I needed to make some adjustments to this function to permit a push or pop of a single register in thumb2 code, since ldm/stm can be a two-byte instruction instead of 4. Trying to read the code as it was made me scratch my head as the logic was not very clear. So this patch cleans up the code somewhat, fixes a couple of minor bugs and removes the limit of having to use multiple registers when using this form of the instruction (the shape of this pattern is such that I can't see it being generated automatically by the compiler, so there should be no adverse effects of this). Buglets fixed: - Validate that the first element contains RETURN if we're matching a return instruction. - Don't allow the base address register to be stored if saving regs and the address is being updated (this is unpredictable in the architecture). - Verify that the last register loaded in a RETURN insn is the PC. gcc/ * config/arm/arm.cc (decompose_addr_for_ldm_stm): New function. (ldm_stm_operation_p): Rework to clarify logic. Allow single registers to be pushed or popped using LDM/STM. Diff: --- gcc/config/arm/arm.cc | 224 -- 1 file changed, 126 insertions(+), 98 deletions(-) diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc index 86838ebde5f8..4ee84361dc6e 100644 --- a/gcc/config/arm/arm.cc +++ b/gcc/config/arm/arm.cc @@ -14267,6 +14267,30 @@ adjacent_mem_locations (rtx a, rtx b) return 0; } +/* Helper routine for ldm_stm_operation_p. Decompose a simple offset + address into the base register and the offset. Return false iff + it is more complex than this. 
*/ +static inline bool +decompose_addr_for_ldm_stm (rtx addr, rtx *base, HOST_WIDE_INT *offset) +{ + if (REG_P (addr)) +{ + *base = addr; + *offset = 0; + return true; +} + else if (GET_CODE (addr) == PLUS + && REG_P (XEXP (addr, 0)) + && CONST_INT_P (XEXP (addr, 1))) +{ + *base = XEXP (addr, 0); + *offset = INTVAL (XEXP (addr, 1)); + return true; +} + + return false; +} + /* Return true if OP is a valid load or store multiple operation. LOAD is true for load operations, false for store operations. CONSECUTIVE is true if the register numbers in the operation must be consecutive in the register @@ -14282,23 +14306,25 @@ adjacent_mem_locations (rtx a, rtx b) 1. If offset is 0, first insn should be (SET (R_d0) (MEM (src_addr))). 2. REGNO (R_d0) < REGNO (R_d1) < ... < REGNO (R_dn). 3. If consecutive is TRUE, then for kth register being loaded, - REGNO (R_dk) = REGNO (R_d0) + k. +REGNO (R_dk) = REGNO (R_d0) + k. The pattern for store is similar. */ bool ldm_stm_operation_p (rtx op, bool load, machine_mode mode, - bool consecutive, bool return_pc) +bool consecutive, bool return_pc) { - HOST_WIDE_INT count = XVECLEN (op, 0); - rtx reg, mem, addr; - unsigned regno; - unsigned first_regno; - HOST_WIDE_INT i = 1, base = 0, offset = 0; + int count = XVECLEN (op, 0); + rtx reg, mem; + rtx addr_base; + int reg_loc, mem_loc; + unsigned prev_regno; + HOST_WIDE_INT addr_offset; rtx elt; bool addr_reg_in_reglist = false; bool update = false; - int reg_increment; - int offset_adj; - int regs_per_val; + int reg_bytes; + int words_per_reg; /* How many words in memory a register takes. */ + int elt_num = 0; + int base_elt_num; /* Element number of the first transfer operation. */ /* If not in SImode, then registers must be consecutive (e.g., VLDM instructions for DFmode). */ @@ -14306,138 +14332,140 @@ ldm_stm_operation_p (rtx op, bool load, machine_mode mode, /* Setting return_pc for stores is illegal. 
*/ gcc_assert (!return_pc || load); - /* Set up the increments and the regs per val based on the mode. */ - reg_increment = GET_MODE_SIZE (mode); - regs_per_val = reg_increment / 4; - offset_adj = return_pc ? 1 : 0; + /* Set up the increments and sizes for the mode. */ + reg_bytes = GET_MODE_SIZE (mode); + words_per_reg = ARM_NUM_REGS (mode); + + /* If this is a return, then the first element in the par must be + (return). */ + if (return_pc) +{ + if (GET_CODE (XVECEXP (op, 0, 0)) != RETURN) + return false; + elt_num++; +} - if (count <= 1 - || GET_CODE (XVECEXP (op, 0, offset_adj)) != SET - || (load && !REG_P (SET_DEST (XVECEXP (op, 0, offset_adj) + if (elt_num >= count) return false; /* Check if this is a write-back. */ - elt = XVECEXP (op, 0, offset_adj); + elt
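The new decompose_addr_for_ldm_stm helper accepts exactly two address shapes: a bare register, or a register plus a constant. The sketch below mirrors that control flow on a toy expression type. struct toy_rtx and toy_decompose_addr are hypothetical and deliberately much simpler than GCC's rtx; they only illustrate the accept/reject logic, not the real RTL API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for GCC's rtx; NOT the real RTL representation.  */
enum toy_code { TOY_REG, TOY_CONST_INT, TOY_PLUS, TOY_MULT };

struct toy_rtx
{
  enum toy_code code;
  long value;                   /* Register number or constant value.  */
  const struct toy_rtx *op0;    /* Operands of a binary expression.  */
  const struct toy_rtx *op1;
};

/* Decompose ADDR into a base register and a constant offset.  Return
   false if ADDR is anything other than REG or (PLUS REG CONST_INT).  */
static bool
toy_decompose_addr (const struct toy_rtx *addr,
                    const struct toy_rtx **base, long *offset)
{
  assert (addr && base && offset);

  if (addr->code == TOY_REG)
    {
      *base = addr;
      *offset = 0;
      return true;
    }

  if (addr->code == TOY_PLUS
      && addr->op0 && addr->op0->code == TOY_REG
      && addr->op1 && addr->op1->code == TOY_CONST_INT)
    {
      *base = addr->op0;
      *offset = addr->op1->value;
      return true;
    }

  return false;
}
```

Anything more complex, such as a scaled index, is rejected, which matches the "Return false iff it is more complex than this" wording of the helper's comment.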
[gcc r15-7373] arm: Use POP {pc} to return when returning [PR118089]
https://gcc.gnu.org/g:5163cf2ae14c5e7ec730ad72680564001d0d0441 commit r15-7373-g5163cf2ae14c5e7ec730ad72680564001d0d0441 Author: Richard Earnshaw Date: Thu Dec 19 16:00:48 2024 + arm: Use POP {pc} to return when returning [PR118089] When generating thumb2 code, LDM SP!, {PC} is a two-byte instruction, whereas LDR PC, [SP], #4 needs 4 bytes. When optimizing for size, or when there's no obvious performance benefit, prefer the former. gcc/ChangeLog: PR target/118089 * config/arm/arm.cc (thumb2_expand_return): Use LDM SP!, {PC} when optimizing for size, or when there's no performance benefit over LDR PC, [SP], #4. (arm_expand_epilogue): Likewise. Diff: --- gcc/config/arm/arm.cc | 62 +-- 1 file changed, 35 insertions(+), 27 deletions(-) diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc index 4ee84361dc6e..7e2082101d83 100644 --- a/gcc/config/arm/arm.cc +++ b/gcc/config/arm/arm.cc @@ -27762,35 +27762,40 @@ thumb2_expand_return (bool simple_return) /* TODO: Verify that this path is never taken for cmse_nonsecure_entry functions or adapt code to handle according to ACLE. This path should not be reachable for cmse_nonsecure_entry functions though we prefer -to assert it for now to ensure that future code changes do not silently -change this behavior. */ +to assert it for now to ensure that future code changes do not +silently change this behavior. 
*/ gcc_assert (!IS_CMSE_ENTRY (arm_current_func_type ())); if (arm_current_function_pac_enabled_p ()) -{ - gcc_assert (!(saved_regs_mask & (1 << PC_REGNUM))); - arm_emit_multi_reg_pop (saved_regs_mask); - emit_insn (gen_aut_nop ()); - emit_jump_insn (simple_return_rtx); -} - else if (num_regs == 1) -{ - rtx par = gen_rtx_PARALLEL (VOIDmode, rtvec_alloc (2)); - rtx reg = gen_rtx_REG (SImode, PC_REGNUM); - rtx addr = gen_rtx_MEM (SImode, - gen_rtx_POST_INC (SImode, -stack_pointer_rtx)); - set_mem_alias_set (addr, get_frame_alias_set ()); - XVECEXP (par, 0, 0) = ret_rtx; - XVECEXP (par, 0, 1) = gen_rtx_SET (reg, addr); - RTX_FRAME_RELATED_P (XVECEXP (par, 0, 1)) = 1; - emit_jump_insn (par); -} + { + gcc_assert (!(saved_regs_mask & (1 << PC_REGNUM))); + arm_emit_multi_reg_pop (saved_regs_mask); + emit_insn (gen_aut_nop ()); + emit_jump_insn (simple_return_rtx); + } + /* Use LDR PC, [sp], #4. Only do this if not optimizing for size and +there's a known performance benefit (we don't know this exactly, but +preferring LDRD/STRD over LDM/STM is a reasonable proxy). 
*/ + else if (num_regs == 1 + && !optimize_size + && current_tune->prefer_ldrd_strd) + { + rtx par = gen_rtx_PARALLEL (VOIDmode, rtvec_alloc (2)); + rtx reg = gen_rtx_REG (SImode, PC_REGNUM); + rtx addr = gen_rtx_MEM (SImode, + gen_rtx_POST_INC (SImode, + stack_pointer_rtx)); + set_mem_alias_set (addr, get_frame_alias_set ()); + XVECEXP (par, 0, 0) = ret_rtx; + XVECEXP (par, 0, 1) = gen_rtx_SET (reg, addr); + RTX_FRAME_RELATED_P (XVECEXP (par, 0, 1)) = 1; + emit_jump_insn (par); + } else -{ - saved_regs_mask &= ~ (1 << LR_REGNUM); - saved_regs_mask |= (1 << PC_REGNUM); - arm_emit_multi_reg_pop (saved_regs_mask); -} + { + saved_regs_mask &= ~ (1 << LR_REGNUM); + saved_regs_mask |= (1 << PC_REGNUM); + arm_emit_multi_reg_pop (saved_regs_mask); + } } else { @@ -28204,7 +28209,10 @@ arm_expand_epilogue (bool really_return) return_in_pc = true; } - if (num_regs == 1 && (!IS_INTERRUPT (func_type) || !return_in_pc)) + if (num_regs == 1 + && !optimize_size + && current_tune->prefer_ldrd_strd + && !(IS_INTERRUPT (func_type) && return_in_pc)) { for (i = 0; i <= LAST_ARM_REGNUM; i++) if (saved_regs_mask & (1 << i))
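The selection rule this patch introduces in thumb2_expand_return can be summarized as a small predicate. The sketch below is a simplified model only: toy_choose_return and its enum are hypothetical names, the PAC-enabled path handled earlier in the real function is left out, and the three parameters stand in for num_regs, optimize_size and current_tune->prefer_ldrd_strd from the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the two return sequences discussed above.  */
enum toy_return_insn
{
  TOY_LDR_PC_POSTINC,   /* LDR PC, [SP], #4 (4 bytes in Thumb-2).  */
  TOY_POP_PC            /* LDM SP!, {..., PC} (2 bytes when only PC).  */
};

/* Simplified model of the choice made after this patch; ignores the
   pointer-authentication path of the real function.  */
static enum toy_return_insn
toy_choose_return (int num_regs, bool optimize_size, bool prefer_ldrd_strd)
{
  assert (num_regs >= 1);

  /* Only keep the LDR form when a single register is restored, size is
     not the priority, and the tuning suggests a performance benefit.  */
  if (num_regs == 1 && !optimize_size && prefer_ldrd_strd)
    return TOY_LDR_PC_POSTINC;

  return TOY_POP_PC;
}
```

Under this model, a single-register return at -Os always takes the POP form, which is the size saving the commit message describes.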
[gcc r15-7372] arm: remove constraints from *pop_multiple_with_writeback_and_return
https://gcc.gnu.org/g:b47c7a5a3c8280ea64754a6c24582236eacef8a2 commit r15-7372-gb47c7a5a3c8280ea64754a6c24582236eacef8a2 Author: Richard Earnshaw Date: Thu Dec 19 15:54:16 2024 + arm: remove constraints from *pop_multiple_with_writeback_and_return This pattern is intended to be used only by the epilogue generation code and will always use fixed hard registers. As such, it does not need any register constraints, which might be misleading if a post-reload pass wanted to try renumbering various registers. So remove the constraints. Furthermore, to permit this pattern to match when popping just the PC (which is not a valid register_operand), remove the match on the first transfer register: pop_multiple_return will validate everything it needs to. gcc/ChangeLog: * config/arm/arm.md (*pop_multiple_with_writeback_and_return): Remove constraints. Don't validate the first transfer register here. Diff: --- gcc/config/arm/arm.md | 8 +++- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index 842903e0bcdb..442d86b93292 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -11964,12 +11964,10 @@ (define_insn "*pop_multiple_with_writeback_and_return" [(match_parallel 0 "pop_multiple_return" [(return) - (set (match_operand:SI 1 "s_register_operand" "+rk") + (set (match_operand:SI 1 "register_operand" "") (plus:SI (match_dup 1) - (match_operand:SI 2 "const_int_I_operand" "I"))) - (set (match_operand:SI 3 "s_register_operand" "=rk") - (mem:SI (match_dup 1))) -])] + (match_operand:SI 2 "const_int_I_operand" ""))) +])] "TARGET_32BIT && (reload_in_progress || reload_completed)" "* {
[gcc r15-7381] [committed] Disable ABS instruction on bfin port
https://gcc.gnu.org/g:3e08a4ecea27c54fda90e8f58641b1986ad957e1 commit r15-7381-g3e08a4ecea27c54fda90e8f58641b1986ad957e1 Author: Jeff Law Date: Wed Feb 5 14:22:33 2025 -0700 [committed] Disable ABS instruction on bfin port I was looking at a regression on the bfin port with a recent change to the IRA and stumbled across this just doing a general port healthiness evaluation. The ABS instruction in the blackfin ISA is defined as saturating on INT_MIN, which is a bit unexpected. We certainly can't use it when -fwrapv is enabled. Given the failures on the C23 uabs tests, I'm inclined to just disable the pattern completely. Fixes pr23047, uabs-2 and uabs-3. While it's not a regression, it's the blackfin port, so I think we've got a higher degree of freedom here. Pushing to the trunk. gcc/ * config/bfin/bfin.md (abssi): Disable pattern. Diff: --- gcc/config/bfin/bfin.md | 15 +-- 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/gcc/config/bfin/bfin.md b/gcc/config/bfin/bfin.md index 810bd52cadf0..27c156b6c1e8 100644 --- a/gcc/config/bfin/bfin.md +++ b/gcc/config/bfin/bfin.md @@ -1440,12 +1440,15 @@ "%0 = min(%1,%2)%!" [(set_attr "type" "dsp32")]) -(define_insn "abssi2" - [(set (match_operand:SI 0 "register_operand" "=d") - (abs:SI (match_operand:SI 1 "register_operand" "d")))] - "" - "%0 = abs %1%!" - [(set_attr "type" "dsp32")]) +;; The ABS instruction is defined as saturating. So at the least +;; it is inappropriate for -fwrapv. This also fixes the C23 uabs +;; tests. +;;(define_insn "abssi2" +;; [(set (match_operand:SI 0 "register_operand" "=d") +;; (abs:SI (match_operand:SI 1 "register_operand" "d")))] +;; "" +;; "%0 = abs %1%!" +;; [(set_attr "type" "dsp32")]) (define_insn "ssabssi2" [(set (match_operand:SI 0 "register_operand" "=d")
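The mismatch described in this commit shows up only at INT_MIN. The sketch below contrasts a saturating absolute value, as the commit message describes the Blackfin instruction, with an unsigned absolute value of the kind the C23 uabs tests expect. Both saturating_abs and uabs_model are illustrative scalar models written for this example; they are not the port's code or the C library's functions.

```c
#include <assert.h>
#include <limits.h>

/* Model of a saturating ABS: INT_MIN clamps to INT_MAX.  */
static int
saturating_abs (int x)
{
  if (x == INT_MIN)
    return INT_MAX;
  return x < 0 ? -x : x;
}

/* Model of an unsigned absolute value: the magnitude of X as an
   unsigned int, computed without signed overflow.  */
static unsigned int
uabs_model (int x)
{
  return x < 0 ? 0u - (unsigned int) x : (unsigned int) x;
}

/* Self-check: the two models agree everywhere except INT_MIN, where
   the saturating result is one less than the true magnitude.  */
static void
check_int_min_divergence (void)
{
  assert (saturating_abs (-5) == 5 && uabs_model (-5) == 5u);
  assert (saturating_abs (INT_MIN) == INT_MAX);
  assert (uabs_model (INT_MIN) == (unsigned int) INT_MAX + 1u);
}
```

A pattern that maps a plain abs to the saturating instruction therefore gives a result that is off by one for INT_MIN whenever the unsigned magnitude (or the -fwrapv wrapped value) is what the source program asked for.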
[gcc(refs/users/mikael/heads/refactor_descriptor_v01)] Fill in token via gfc_set_descriptor_from_scalar.
https://gcc.gnu.org/g:b626ff646018c285848ad420a72a43b1fba1a751 commit b626ff646018c285848ad420a72a43b1fba1a751 Author: Mikael Morin Date: Wed Feb 5 15:12:25 2025 +0100 Renseignement token par gfc_set_descriptor_from_scalar. Diff: --- gcc/fortran/trans-array.cc | 27 --- gcc/fortran/trans-array.h | 2 +- gcc/fortran/trans-expr.cc | 15 +++ 3 files changed, 32 insertions(+), 12 deletions(-) diff --git a/gcc/fortran/trans-array.cc b/gcc/fortran/trans-array.cc index 531281049646..c09b9bdab155 100644 --- a/gcc/fortran/trans-array.cc +++ b/gcc/fortran/trans-array.cc @@ -682,6 +682,7 @@ public: virtual bool set_span () const { return false; } virtual bool set_token () const { return true; } virtual tree get_data_value () const { return NULL_TREE; } + virtual tree get_caf_token () const { return null_pointer_node; } virtual bt get_type_type (const gfc_typespec &) const { return BT_UNKNOWN; } virtual tree get_length (gfc_typespec *ts) const { return get_size_info (*ts); } }; @@ -751,22 +752,24 @@ private: bool initialisation; gfc_typespec *ts; tree value; + tree caf_token; bool use_tree_type_; bool clear_token; tree get_elt_type () const; public: scalar_value(gfc_typespec &arg_ts, tree arg_value) -: initialisation(true), ts(&arg_ts), value(arg_value), use_tree_type_ (false), clear_token(true) { } - scalar_value(tree arg_value) -: initialisation(true), ts(nullptr), value(arg_value), use_tree_type_ (true), clear_token(false) { } +: initialisation(true), ts(&arg_ts), value(arg_value), caf_token (NULL_TREE), use_tree_type_ (false), clear_token(true) { } + scalar_value(tree arg_value, tree arg_caf_token) +: initialisation(true), ts(nullptr), value(arg_value), caf_token (arg_caf_token), use_tree_type_ (true), clear_token(false) { } virtual bool is_initialization () const { return initialisation; } virtual bool initialize_data () const { return true; } virtual tree get_data_value () const; virtual gfc_typespec *get_type () const { return ts; } virtual bool set_span () const { return 
true; } virtual bool use_tree_type () const { return use_tree_type_; } - virtual bool set_token () const { return clear_token; } + virtual bool set_token () const { return clear_token || caf_token != NULL_TREE; } + virtual tree get_caf_token () const; virtual bt get_type_type (const gfc_typespec &) const; virtual tree get_length (gfc_typespec *ts) const; }; @@ -838,6 +841,16 @@ scalar_value::get_length (gfc_typespec * type_info) const return size; } +tree +scalar_value::get_caf_token () const +{ + if (set_token () + && caf_token != NULL_TREE) +return caf_token; + else +return modify_info::get_caf_token (); +} + static tree build_dtype (gfc_typespec *ts, int rank, const symbol_attribute &, @@ -933,7 +946,7 @@ get_descriptor_init (tree type, gfc_typespec *ts, int rank, tree token_field = gfc_advance_chain (fields, CAF_TOKEN_FIELD - (!dim_present)); tree token_value = fold_convert (TREE_TYPE (token_field), - null_pointer_node); + init.get_caf_token ()); CONSTRUCTOR_APPEND_ELT (v, token_field, token_value); } @@ -1430,11 +1443,11 @@ gfc_set_scalar_descriptor (stmtblock_t *block, tree descriptor, void gfc_set_descriptor_from_scalar (stmtblock_t *block, tree desc, tree scalar, - symbol_attribute *attr) + symbol_attribute *attr, tree caf_token) { init_struct (block, desc, get_descriptor_init (TREE_TYPE (desc), nullptr, 0, attr, - scalar_value (scalar))); + scalar_value (scalar, caf_token))); } diff --git a/gcc/fortran/trans-array.h b/gcc/fortran/trans-array.h index 97cf7f8cb41f..2dad79aa9993 100644 --- a/gcc/fortran/trans-array.h +++ b/gcc/fortran/trans-array.h @@ -149,7 +149,7 @@ void gfc_set_descriptor_with_shape (stmtblock_t *, tree, tree, gfc_expr *, locus *); tree gfc_get_scalar_to_descriptor_type (tree scalar, symbol_attribute attr); void gfc_set_descriptor_from_scalar (stmtblock_t *, tree, tree, -symbol_attribute *); +symbol_attribute *, tree = NULL_TREE); void gfc_copy_sequence_descriptor (stmtblock_t &, tree, tree, bool); void gfc_set_gfc_from_cfi (stmtblock_t *, 
stmtblock_t *, tree, tree, tree, gfc_symbol *, bool, bool, bool); diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc index 39bd7178c3c0..13a1ec1e8fe3 100644 --- a/gcc/fortran/trans-expr.cc +++ b/gcc/fortran/trans-expr.cc @@ -883,14 +883,20 @@ gfc_conv_derived_to_class (gfc_se *parmse, gfc_expr *e, gfc_symbol *fsym, /* Now set the data field. */ ctree = gfc_
[gcc r15-7374] cselib: For CALL_INSNs to const/pure fns invalidate memory below sp [PR117239]
https://gcc.gnu.org/g:886ce970eb096bb302228c891f0c8a889c79ad40 commit r15-7374-g886ce970eb096bb302228c891f0c8a889c79ad40 Author: Jakub Jelinek Date: Wed Feb 5 13:16:17 2025 +0100 cselib: For CALL_INSNs to const/pure fns invalidate memory below sp [PR117239] The following testcase is miscompiled on x86_64 during postreload. After reload (with IPA-RA figuring out the calls don't modify any registers but %rax for return value) postreload sees (insn 14 12 15 2 (set (mem:DI (plus:DI (reg/f:DI 7 sp) (const_int 16 [0x10])) [0 S8 A64]) (reg:DI 1 dx [orig:105 q+16 ] [105])) "pr117239.c":18:7 95 {*movdi_internal} (nil)) (call_insn/i 15 14 16 2 (set (reg:SI 0 ax) (call (mem:QI (symbol_ref:DI ("baz") [flags 0x3] ) [0 baz S1 A8]) (const_int 24 [0x18]))) "pr117239.c":18:7 1476 {*call_value} (expr_list:REG_CALL_DECL (symbol_ref:DI ("baz") [flags 0x3] ) (expr_list:REG_EH_REGION (const_int 0 [0]) (nil))) (nil)) (insn 16 15 18 2 (parallel [ (set (reg/f:DI 7 sp) (plus:DI (reg/f:DI 7 sp) (const_int 24 [0x18]))) (clobber (reg:CC 17 flags)) ]) "pr117239.c":18:7 285 {*adddi_1} (expr_list:REG_ARGS_SIZE (const_int 0 [0]) (nil))) ... (call_insn/i 19 18 21 2 (set (reg:SI 0 ax) (call (mem:QI (symbol_ref:DI ("foo") [flags 0x3] ) [0 foo S1 A8]) (const_int 0 [0]))) "pr117239.c":19:3 1476 {*call_value} (expr_list:REG_CALL_DECL (symbol_ref:DI ("foo") [flags 0x3] ) (expr_list:REG_EH_REGION (const_int 0 [0]) (nil))) (nil)) (insn 21 19 26 2 (parallel [ (set (reg/f:DI 7 sp) (plus:DI (reg/f:DI 7 sp) (const_int -24 [0xffe8]))) (clobber (reg:CC 17 flags)) ]) "pr117239.c":19:3 discrim 1 285 {*adddi_1} (expr_list:REG_ARGS_SIZE (const_int 24 [0x18]) (nil))) (insn 26 21 24 2 (set (mem:DI (plus:DI (reg/f:DI 7 sp) (const_int 16 [0x10])) [0 S8 A64]) (reg:DI 1 dx [orig:105 q+16 ] [105])) "pr117239.c":19:3 discrim 1 95 {*movdi_internal} (nil)) i.e. movq%rdx, 16(%rsp) callbaz addq$24, %rsp ... 
call foo
subq $24, %rsp
movq %rdx, 16(%rsp)

Now, postreload uses cselib, and cselib remembered that the %rdx value has been stored into 16(%rsp). Both baz and foo are pure calls. If they weren't, when processing those CALL_INSNs cselib would invalidate all MEMs:

if (RTL_LOOPING_CONST_OR_PURE_CALL_P (insn)
    || !(RTL_CONST_OR_PURE_CALL_P (insn)))
  cselib_invalidate_mem (callmem);

where callmem is (mem:BLK (scratch)). But they are pure, so instead the code just invalidates the argument slots from CALL_INSN_FUNCTION_USAGE. The calls actually clobber more than that: even const/pure calls clobber all memory below the stack pointer. And that is something that hasn't been invalidated.

In this failing testcase, the call to baz is not a big deal, because we don't have anything remembered in memory below %rsp at that call. But then we increment %rsp by 24, so %rsp+16 is now 8 bytes below the stack pointer, and do the call to foo. And that call now actually, not just in theory, clobbers the memory below the stack pointer (in particular, it overwrites it with the return value). But cselib does not invalidate it. Then %rsp is decremented again (in preparation for another call, to bar) and cselib processes the store of %rdx (which IPA-RA says has not been modified by either the baz or the foo call) to %rsp + 16. It sees that the memory already has that value, so it concludes the store is useless and removes it. But the store is not useless: the call to foo has changed that memory, so the value needs to be stored again.

The following patch adds targeted invalidation of memory below the stack pointer (or on SPARC memory below stack pointer + 2047 when stack bias is used, or on PA memory above the stack pointer instead). It does so only in !ACCUMULATE_OUTGOING_ARGS or cfun->calls_alloca functions, because in other functions the stack pointer should be constant from the end of the prologue till the start of the epilogue and so nothing should be stored within the function below the stack pointer.
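The sequence above can be replayed in a toy model. This is only a sketch under stated assumptions, not cselib's real data structures: ToyMemCache, record_store, store_is_redundant and invalidate_below_sp are hypothetical names, addresses are plain integers on a downward-growing stack, and 105 is just an arbitrary id for the value in %rdx.

```cpp
#include <cstdint>
#include <map>

// Toy stand-in for cselib's memory table: absolute address -> id of the
// value known to be stored there.  All names here are hypothetical.
struct ToyMemCache {
  std::map<std::int64_t, int> known;

  void record_store(std::int64_t addr, int value_id) { known[addr] = value_id; }

  // True if storing value_id to addr would look like a no-op.
  bool store_is_redundant(std::int64_t addr, int value_id) const {
    auto it = known.find(addr);
    return it != known.end() && it->second == value_id;
  }

  // Model of the added invalidation: forget every remembered slot whose
  // address is strictly below the current stack pointer.
  void invalidate_below_sp(std::int64_t sp) {
    known.erase(known.begin(), known.lower_bound(sp));
  }
};

// Replays the insn sequence from the commit message and reports whether the
// second "movq %rdx, 16(%rsp)" would be treated as redundant.
bool second_store_looks_redundant(bool invalidate_on_call) {
  ToyMemCache cache;
  std::int64_t sp = 1000;
  const int rdx_value = 105;

  cache.record_store(sp + 16, rdx_value);                 // movq %rdx, 16(%rsp)
  if (invalidate_on_call) cache.invalidate_below_sp(sp);  // call baz
  sp += 24;                                               // addq $24, %rsp
  if (invalidate_on_call) cache.invalidate_below_sp(sp);  // call foo
  sp -= 24;                                               // subq $24, %rsp
  return cache.store_is_redundant(sp + 16, rdx_value);
}
```

Without the invalidation the model wrongly reports the second store as redundant; with it, the slot at the old %rsp+16 is forgotten at the call to foo and the store is kept.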
Now, memory below stack pointer is special, except for functions using alloca/VLAs I believe no addressable memory should be there, it should be purely outgoing function argument area, if we take address of some automatic variable, it should live all the time above the outgoing function argument area. So on top of just trying to flush memory below stack pointer (represented by %rsp - PTRDIFF_MAX with
[gcc r15-7376] Fortran/OpenMP: Add location data to 'sorry' [PR118740]
https://gcc.gnu.org/g:6f95af4f22b641fbb3509f1436bce811d4e4acad commit r15-7376-g6f95af4f22b641fbb3509f1436bce811d4e4acad Author: Tobias Burnus Date: Wed Feb 5 14:03:47 2025 +0100 Fortran/OpenMP: Add location data to 'sorry' [PR118740] PR fortran/118740 gcc/fortran/ChangeLog: * openmp.cc (gfc_match_omp_context_selector, match_omp_metadirective): Change sorry to sorry_at and use gfc_current_locus as location. * trans-openmp.cc (gfc_trans_omp_clauses): Likewise, but use n->where. gcc/testsuite/ChangeLog: * gfortran.dg/gomp/append_args-2.f90: Update for line change. Diff: --- gcc/fortran/openmp.cc| 6 -- gcc/fortran/trans-openmp.cc | 8 +--- gcc/testsuite/gfortran.dg/gomp/append_args-2.f90 | 2 +- 3 files changed, 10 insertions(+), 6 deletions(-) diff --git a/gcc/fortran/openmp.cc b/gcc/fortran/openmp.cc index b1684f841f5b..e8df9d63fec2 100644 --- a/gcc/fortran/openmp.cc +++ b/gcc/fortran/openmp.cc @@ -6536,7 +6536,8 @@ gfc_match_omp_context_selector (gfc_omp_set_selector *oss, /* FIXME: The "requires" selector was added in OpenMP 5.1. Currently only the now-deprecated syntax from OpenMP 5.0 is supported. 
*/ - sorry ("% selector is not supported yet"); + sorry_at (gfc_get_location (&gfc_current_locus), + "% selector is not supported yet"); return MATCH_ERROR; } else @@ -6942,7 +6943,8 @@ match_omp_metadirective (bool begin_p) gfc_matching_omp_context_selector = false; if (is_omp_declarative_stmt (directive)) - sorry ("declarative directive variants are not supported"); + sorry_at (gfc_get_location (&gfc_current_locus), + "declarative directive variants are not supported"); if (gfc_error_flag_test ()) { diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc index e29ef85ae398..e9103cd3bac3 100644 --- a/gcc/fortran/trans-openmp.cc +++ b/gcc/fortran/trans-openmp.cc @@ -3345,7 +3345,8 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, if (openacc && n->sym->ts.type == BT_CLASS) { if (n->sym->attr.optional) - sorry ("optional class parameter"); + sorry_at (gfc_get_location (&n->where), + "optional class parameter"); tree ptr = gfc_class_data_get (decl); ptr = build_fold_indirect_ref (ptr); OMP_CLAUSE_DECL (node) = ptr; @@ -3761,7 +3762,8 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, gcc_assert (!ref->next); } else - sorry ("unhandled expression type"); + sorry_at (gfc_get_location (&n->where), + "unhandled expression type"); } tree inner = se.expr; @@ -4041,7 +4043,7 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses, gcc_unreachable (); } else - sorry ("unhandled expression"); + sorry_at (gfc_get_location (&n->where), "unhandled expression"); finalize_map_clause: diff --git a/gcc/testsuite/gfortran.dg/gomp/append_args-2.f90 b/gcc/testsuite/gfortran.dg/gomp/append_args-2.f90 index a20f610a03dc..7a68977ed4d0 100644 --- a/gcc/testsuite/gfortran.dg/gomp/append_args-2.f90 +++ b/gcc/testsuite/gfortran.dg/gomp/append_args-2.f90 @@ -40,7 +40,7 @@ contains subroutine f5 () !$omp declare variant (f1ox) match(user={condition(flag)}) & ! 
{ dg-error "the 'append_args' clause can only be specified if the 'dispatch' selector of the construct selector set appears in the 'match' clause at .1." } !$omp& append_args ( interop ( target , targetsync) ) -! { dg-error "'q' at .1. must be a nonpointer, nonallocatable scalar integer dummy argument of 'omp_interop_kind' kind as it utilized with the 'append_args' clause at .2." "" { target *-*-* } .-2 } +! { dg-error "'q' at .1. must be a nonpointer, nonallocatable scalar integer dummy argument of 'omp_interop_kind' kind as it utilized with the 'append_args' clause at .2." "" { target *-*-* } .-1 } end subroutine subroutine f6 (x, y)
[gcc r15-7377] aarch64: Fix sve/acle/general/ldff1_8.c failures
https://gcc.gnu.org/g:50a31b6765fe17aee22a1fc1457c762a53140c8e commit r15-7377-g50a31b6765fe17aee22a1fc1457c762a53140c8e Author: Richard Sandiford Date: Wed Feb 5 15:35:13 2025 + aarch64: Fix sve/acle/general/ldff1_8.c failures gcc.target/aarch64/sve/acle/general/ldff1_8.c and gcc.target/aarch64/sve/ptest_1.c were failing because the aarch64 port was giving a zero (unknown) cost to instructions that compute two results in parallel. This was latent until r15-1575-gea8061f46a30, which fixed rtl-ssa to treat zero costs as unknown. A long-standing todo here is to make insn_cost derive costs from md information, rather than having to write a lot of matching code in aarch64_rtx_costs. But that's not something we can do for GCC 15. This patch instead treats the cost of a PARALLEL as being the maximum cost of its constituent sets. I don't like this very much, since it isn't really target-specific behaviour. If it were stage 1, I'd be trying to change pattern_cost instead. gcc/ * config/aarch64/aarch64.cc (aarch64_insn_cost): Give PARALLELs the same cost as the costliest SET. Diff: --- gcc/config/aarch64/aarch64.cc | 19 ++- 1 file changed, 18 insertions(+), 1 deletion(-) diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index 16754fa9e7bd..c1e40200806a 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -15889,7 +15889,24 @@ aarch64_insn_cost (rtx_insn *insn, bool speed) { if (rtx set = single_set (insn)) return set_rtx_cost (set, speed); - return pattern_cost (PATTERN (insn), speed); + + /* If the instruction does multiple sets in parallel, use the cost + of the most expensive set. This copes with instructions that set + the flags to a useful value as a side effect. 
*/ + rtx pat = PATTERN (insn); + if (GET_CODE (pat) == PARALLEL) +{ + int max_cost = 0; + for (int i = 0; i < XVECLEN (pat, 0); ++i) + { + rtx x = XVECEXP (pat, 0, i); + if (GET_CODE (x) == SET) + max_cost = std::max (max_cost, set_rtx_cost (x, speed)); + } + return max_cost; +} + + return pattern_cost (pat, speed); } /* Implement TARGET_INIT_BUILTINS. */
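The "most expensive SET" rule can be illustrated outside GCC with a small sketch. Kind, Elem and parallel_cost are hypothetical stand-ins for rtx codes, the elements of a PARALLEL and the new loop in aarch64_insn_cost; the cost numbers are made up.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical miniature of the elements of a PARALLEL; not GCC's rtx.
enum class Kind { Set, Clobber, Use };

struct Elem {
  Kind kind;
  int cost;  // cost of this element if it is a Set; ignored otherwise
};

// Cost of a PARALLEL-like pattern: the most expensive constituent Set.
// Clobbers and uses do not contribute to the result.
int parallel_cost(const std::vector<Elem>& elems) {
  int max_cost = 0;
  for (const Elem& e : elems)
    if (e.kind == Kind::Set)
      max_cost = std::max(max_cost, e.cost);
  return max_cost;
}
```

Note that a pattern with no SET at all still yields zero in this sketch, which per the message above is what rtl-ssa now reads as "unknown".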
[gcc r15-7378] go: update builtin function attributes
https://gcc.gnu.org/g:0006c07b7ac6594195d5db322e39907203be4c2a commit r15-7378-g0006c07b7ac6594195d5db322e39907203be4c2a Author: Ian Lance Taylor Date: Wed Feb 5 10:14:57 2025 -0800 go: update builtin function attributes PR go/118746 * go-gcc.cc (class Gcc_backend): Define builtin_cold, builtin_leaf, builtin_nonnull. Alphabetize constants. (Gcc_backend::Gcc_backend): Update attributes for builtin functions to match builtins.def. (Gcc_backend::define_builtin): Split out attribute setting into set_attributes. (Gcc_backend::set_attributes): New method split out of define_builtin. Support new flag values. Diff: --- gcc/go/go-gcc.cc | 262 ++- 1 file changed, 141 insertions(+), 121 deletions(-) diff --git a/gcc/go/go-gcc.cc b/gcc/go/go-gcc.cc index cf0c84b9fe3b..9917812d0eaf 100644 --- a/gcc/go/go-gcc.cc +++ b/gcc/go/go-gcc.cc @@ -540,16 +540,22 @@ class Gcc_backend : public Backend convert_tree(tree, tree, Location); private: - static const int builtin_const = 1 << 0; - static const int builtin_noreturn = 1 << 1; - static const int builtin_novops = 1 << 2; - static const int builtin_pure = 1 << 3; - static const int builtin_nothrow = 1 << 4; + static const int builtin_cold = 1 << 0; + static const int builtin_const = 1 << 1; + static const int builtin_leaf = 1 << 2; + static const int builtin_nonnull = 1 << 3; + static const int builtin_noreturn = 1 << 4; + static const int builtin_nothrow = 1 << 5; + static const int builtin_novops = 1 << 6; + static const int builtin_pure = 1 << 7; void define_builtin(built_in_function bcode, const char* name, const char* libname, tree fntype, int flags); + void + set_attributes(tree decl, int flags); + // A mapping of the GCC built-ins exposed to GCCGo. 
std::map builtin_functions_; }; @@ -571,22 +577,26 @@ Gcc_backend::Gcc_backend() tree t = this->integer_type(true, BITS_PER_UNIT)->get_tree(); tree p = build_pointer_type(build_qualified_type(t, TYPE_QUAL_VOLATILE)); this->define_builtin(BUILT_IN_SYNC_ADD_AND_FETCH_1, "__sync_fetch_and_add_1", - NULL, build_function_type_list(t, p, t, NULL_TREE), 0); + NULL, build_function_type_list(t, p, t, NULL_TREE), + builtin_leaf); t = this->integer_type(true, BITS_PER_UNIT * 2)->get_tree(); p = build_pointer_type(build_qualified_type(t, TYPE_QUAL_VOLATILE)); this->define_builtin(BUILT_IN_SYNC_ADD_AND_FETCH_2, "__sync_fetch_and_add_2", - NULL, build_function_type_list(t, p, t, NULL_TREE), 0); + NULL, build_function_type_list(t, p, t, NULL_TREE), + builtin_leaf); t = this->integer_type(true, BITS_PER_UNIT * 4)->get_tree(); p = build_pointer_type(build_qualified_type(t, TYPE_QUAL_VOLATILE)); this->define_builtin(BUILT_IN_SYNC_ADD_AND_FETCH_4, "__sync_fetch_and_add_4", - NULL, build_function_type_list(t, p, t, NULL_TREE), 0); + NULL, build_function_type_list(t, p, t, NULL_TREE), + builtin_leaf); t = this->integer_type(true, BITS_PER_UNIT * 8)->get_tree(); p = build_pointer_type(build_qualified_type(t, TYPE_QUAL_VOLATILE)); this->define_builtin(BUILT_IN_SYNC_ADD_AND_FETCH_8, "__sync_fetch_and_add_8", - NULL, build_function_type_list(t, p, t, NULL_TREE), 0); + NULL, build_function_type_list(t, p, t, NULL_TREE), + builtin_leaf); // We use __builtin_expect for magic import functions. this->define_builtin(BUILT_IN_EXPECT, "__builtin_expect", NULL, @@ -594,7 +604,7 @@ Gcc_backend::Gcc_backend() long_integer_type_node, long_integer_type_node, NULL_TREE), - builtin_const); + builtin_const | builtin_nothrow | builtin_leaf); // We use __builtin_memcmp for struct comparisons. 
this->define_builtin(BUILT_IN_MEMCMP, "__builtin_memcmp", "memcmp", @@ -603,7 +613,7 @@ Gcc_backend::Gcc_backend() const_ptr_type_node, size_type_node, NULL_TREE), - builtin_pure | builtin_nothrow); + builtin_pure | builtin_nothrow | builtin_nonnull); // We use __builtin_memmove for copying data. this->define_builtin(BUILT_IN_MEMMOVE, "__builtin_memmove", "memmove", @@ -612,7 +622,7 @@ Gcc_backend::Gcc_backend() const_ptr_type_node, size_type_node,
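For readers unfamiliar with the flag encoding, here is a minimal standalone sketch of how such bit flags combine and can be decoded. The constants mirror the renumbered values in the diff above; attribute_names is a hypothetical helper for illustration only. The real set_attributes works on a tree decl and is not reproduced here.

```cpp
#include <string>
#include <vector>

// Flag values as renumbered by the patch (alphabetical order).
constexpr int builtin_cold     = 1 << 0;
constexpr int builtin_const    = 1 << 1;
constexpr int builtin_leaf     = 1 << 2;
constexpr int builtin_nonnull  = 1 << 3;
constexpr int builtin_noreturn = 1 << 4;
constexpr int builtin_nothrow  = 1 << 5;
constexpr int builtin_novops   = 1 << 6;
constexpr int builtin_pure     = 1 << 7;

// Hypothetical helper: list the attribute names selected by FLAGS, in the
// same alphabetical order as the constants above.
std::vector<std::string> attribute_names(int flags) {
  static const struct { int bit; const char* name; } table[] = {
    {builtin_cold, "cold"},         {builtin_const, "const"},
    {builtin_leaf, "leaf"},         {builtin_nonnull, "nonnull"},
    {builtin_noreturn, "noreturn"}, {builtin_nothrow, "nothrow"},
    {builtin_novops, "novops"},     {builtin_pure, "pure"},
  };
  std::vector<std::string> names;
  for (const auto& entry : table)
    if (flags & entry.bit)
      names.push_back(entry.name);
  return names;
}
```

With this encoding, the memcmp combination from the diff, builtin_pure | builtin_nothrow | builtin_nonnull, decodes to nonnull, nothrow, pure.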
[gcc r15-7379] [PR115568][LRA]: Use more strict output reload check in rematerialization
https://gcc.gnu.org/g:98545441308c2ae4d535f14b108ad6551fd927d5 commit r15-7379-g98545441308c2ae4d535f14b108ad6551fd927d5 Author: Vladimir N. Makarov Date: Wed Feb 5 14:23:23 2025 -0500 [PR115568][LRA]: Use more strict output reload check in rematerialization In this PR, LRA rematerialized a value from an inheritance insn instead of an output reload one. As a result, a rematerialization candidate value was considered available when it actually was not. Consequently, an insn after the rematerialization used the unexpected value, and this use resulted in an FP exception. The patch fixes this bug. gcc/ChangeLog: PR rtl-optimization/115568 * lra-remat.cc (create_cands): Check that output reload insn is adjacent to given insn. Update a comment. gcc/testsuite/ChangeLog: PR rtl-optimization/115568 * gcc.target/i386/pr115568.c: New. Diff: --- gcc/lra-remat.cc | 10 + gcc/testsuite/gcc.target/i386/pr115568.c | 38 2 files changed, 44 insertions(+), 4 deletions(-) diff --git a/gcc/lra-remat.cc b/gcc/lra-remat.cc index bb13c616a740..2f3afffcf5be 100644 --- a/gcc/lra-remat.cc +++ b/gcc/lra-remat.cc @@ -459,7 +459,8 @@ create_cands (void) if (insn2 != NULL && dst_regno >= FIRST_PSEUDO_REGISTER && reg_renumber[dst_regno] < 0 - && BLOCK_FOR_INSN (insn2) == BLOCK_FOR_INSN (insn)) + && BLOCK_FOR_INSN (insn2) == BLOCK_FOR_INSN (insn) + && insn2 == prev_nonnote_insn (insn)) { create_cand (insn2, regno_potential_cand[src_regno].nop, dst_regno, insn); @@ -473,9 +474,10 @@ create_cands (void) gcc_assert (REG_P (*id->operand_loc[nop])); int regno = REGNO (*id->operand_loc[nop]); gcc_assert (regno >= FIRST_PSEUDO_REGISTER); - /* If we're setting an unrenumbered pseudo, make a candidate immediately. - If it's an output reload register, save it for later; the code above - looks for output reload insns later on. */ + /* If we're setting an unrenumbered pseudo, make a candidate + immediately. 
If it's a potential output reload register, save + it for later; the code above looks for output reload insns later + on. */ if (reg_renumber[regno] < 0) create_cand (insn, nop, regno); else if (regno >= lra_constraint_new_regno_start) diff --git a/gcc/testsuite/gcc.target/i386/pr115568.c b/gcc/testsuite/gcc.target/i386/pr115568.c new file mode 100644 index ..cedc7ac3843d --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr115568.c @@ -0,0 +1,38 @@ +/* { dg-do run } */ +/* { dg-options "-O2 -fno-tree-sink -fno-tree-ter -fschedule-insns" } */ + +int a, c, d = 1, e, f = 1, h, i, j; +unsigned b = 1, g; +int main() { + for (; h < 2; h++) { +int k = ~(b || 0), l = ((~e - j) ^ a % b) % k, m = (b ^ -1) + e; +unsigned o = ~a % ~1; +if (f) { + l = d; + m = 10; + i = e; + d = -(~e + b); + g = o % m; + e = -1; +n: + a = a % ~i; + b = ~k; + if (!g) { +b = e + o % -1; +continue; + } + if (!l) +break; +} +int q = (~d + g) << ~e, p = (~d - q) & a >> b; +unsigned s = ~((g & e) + (p | (b ^ (d + k; +int r = (e & s) + p, u = d | ~a, +t = ((~(q + (~a + (s + e & u) | (-g & (c << d ^ p)); +if (t) + if (!r) +goto n; +g = m; +e = i; + } + return 0; +}
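The difference between the old and new conditions in create_cands can be sketched with a toy insn list. ToyInsn, prev_nonnote, old_check and new_check are hypothetical names; the insn stream is modelled as a vector rather than GCC's doubly linked chain, and the other parts of the real condition (pseudo number, reg_renumber) are left out.

```cpp
#include <vector>

// Hypothetical miniature of an insn stream; not GCC's rtx_insn.
struct ToyInsn {
  bool is_note;  // notes are skipped when looking for the previous insn
  int block;     // basic block number
};

// Index of the nearest preceding non-note insn, or -1 if there is none.
int prev_nonnote(const std::vector<ToyInsn>& insns, int i) {
  for (int j = i - 1; j >= 0; --j)
    if (!insns[j].is_note)
      return j;
  return -1;
}

// Old condition: the two insns merely share a basic block.
bool old_check(const std::vector<ToyInsn>& insns, int insn2, int insn) {
  return insns[insn2].block == insns[insn].block;
}

// New condition: same block, and insn2 is directly before insn once notes
// are ignored.
bool new_check(const std::vector<ToyInsn>& insns, int insn2, int insn) {
  return old_check(insns, insn2, insn) && insn2 == prev_nonnote(insns, insn);
}
```

In a block laid out as insn 0, insn 1, a note, insn 3, the old check accepts the pair (0, 3) even though insn 1 sits in between; the new check accepts only (1, 3).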
[gcc r15-7380] c++: Reject default arguments for template class friend functions [PR118319]
https://gcc.gnu.org/g:198f4df07d6a1db9c8ef39536da56c1b596c57a8 commit r15-7380-g198f4df07d6a1db9c8ef39536da56c1b596c57a8 Author: Simon Martin Date: Wed Feb 5 20:35:34 2025 +0100 c++: Reject default arguments for template class friend functions [PR118319] We segfault upon the following invalid code === cut here === template struct S { friend void foo (int a = []{}()); }; void foo (int a) {} int main () { S<0> t; foo (); } === cut here === The problem is that we end up with a LAMBDA_EXPR callee in set_flags_from_callee, and dereference its NULL_TREE TREE_TYPE (TREE_TYPE (..)). This patch sets the default argument to error_mark_node and gives a hard error for template class friend functions that do not meet the requirement in C++17 11.3.6/4 (the change is restricted to templates per discussion with Jason). PR c++/118319 gcc/cp/ChangeLog: * decl.cc (grokfndecl): Inspect all friend function parameters. If it's not valid for them to have a default value and we're processing a template, set the default value to error_mark_node and give a hard error. gcc/testsuite/ChangeLog: * g++.dg/parse/defarg18.C: New test. * g++.dg/parse/defarg18a.C: New test. Diff: --- gcc/cp/decl.cc | 22 +--- gcc/testsuite/g++.dg/parse/defarg18.C | 48 ++ gcc/testsuite/g++.dg/parse/defarg18a.C | 33 +++ 3 files changed, 99 insertions(+), 4 deletions(-) diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc index b7af33b32317..4238314558b1 100644 --- a/gcc/cp/decl.cc +++ b/gcc/cp/decl.cc @@ -11213,14 +11213,28 @@ grokfndecl (tree ctype, expression, that declaration shall be a definition..." 
*/ if (friendp && !funcdef_flag) { + bool has_errored = false; for (tree t = FUNCTION_FIRST_USER_PARMTYPE (decl); t && t != void_list_node; t = TREE_CHAIN (t)) if (TREE_PURPOSE (t)) { - permerror (DECL_SOURCE_LOCATION (decl), - "friend declaration of %qD specifies default " - "arguments and isn%'t a definition", decl); - break; + diagnostic_t diag_kind = DK_PERMERROR; + /* For templates, mark the default argument as erroneous and give a + hard error. */ + if (processing_template_decl) + { + diag_kind = DK_ERROR; + TREE_PURPOSE (t) = error_mark_node; + } + if (!has_errored) + { + has_errored = true; + emit_diagnostic (diag_kind, +DECL_SOURCE_LOCATION (decl), +/*diagnostic_option_id=*/0, +"friend declaration of %qD specifies default " +"arguments and isn%'t a definition", decl); + } } } diff --git a/gcc/testsuite/g++.dg/parse/defarg18.C b/gcc/testsuite/g++.dg/parse/defarg18.C new file mode 100644 index ..62c8f15f2843 --- /dev/null +++ b/gcc/testsuite/g++.dg/parse/defarg18.C @@ -0,0 +1,48 @@ +// PR c++/118319 +// { dg-do "compile" { target c++11 } } + +// Template case, that used to crash. + +template +struct S { + friend void foo1 (int a = []{}()); // { dg-error "specifies default|only declaration" } + friend void foo3 (int a, // { dg-error "specifies default|only declaration" } + int b = []{}(), + int c = []{}()); +}; + +void foo1 (int a) {} +void foo3 (int a, int b, int c) {} + +void hello (){ + S<0> t; + foo1 (); + foo3 (1, 2); +} + + +// Template case, that already worked. + +template +struct T { + friend void bar (int a = []{}()); // { dg-error "specifies default|only declaration" } +}; + +void hallo (){ + T<0> t; + bar (); // { dg-error "not declared" } +} + + +// Non template case, that already worked. 
+ +struct NoTemplate { + friend void baz (int a = []{}()); // { dg-error "specifies default|could not convert" } +}; + +void baz (int a) {} // { dg-error "only declaration" } + +void ola (){ + NoTemplate t; + baz (); // { dg-error "void value not ignored" } +} diff --git a/gcc/testsuite/g++.dg/parse/defarg18a.C b/gcc/testsuite/g++.dg/parse/defarg18a.C new file mode 100644 index ..9157a4d5158f --- /dev/null +++ b/gcc/testsuite/g++.dg/parse/defarg18a.C @@ -0,0 +1,33 @@ +// PR c++/118319 - With -fpermissive +// { dg-do "compile" { target c++11 } } +// { dg-additional-options "-fpermissive" } + +// Template case, that used to crash. +// Check that we error-out even with -fpermissive. + +template +struct S { // { dg-error "instantiating erroneous template" } + friend void foo1 (int a = []{}()); // { dg-warning "specifies default|only declaration" } +
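For contrast with the rejected testcases, here is one way to write a friend with a default argument that meets the quoted requirement, namely by making the friend declaration a definition. This is an illustrative sketch, not part of the patch or its testsuite; S and foo are reused only as familiar names, and the extra S parameter is an assumption added so that the friend can be found by argument-dependent lookup.

```cpp
// A friend declaration that specifies a default argument must also be a
// definition, so the body is given inline in the class.
template <int N>
struct S {
  // Found via argument-dependent lookup on the S<N> argument.
  friend int foo(S, int a = 42) { return a + N; }
};
```

Each instantiation S<N> defines its own overload foo(S<N>, int), so the definition is the only declaration of that function.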
[gcc(refs/users/meissner/heads/work192-bugs)] Update ChangeLog.*
https://gcc.gnu.org/g:28ad7e300ddd4fbe27e5e227003fe19aef48433d commit 28ad7e300ddd4fbe27e5e227003fe19aef48433d Author: Michael Meissner Date: Thu Feb 6 00:15:42 2025 -0500 Update ChangeLog.* Diff: --- gcc/ChangeLog.bugs | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/gcc/ChangeLog.bugs b/gcc/ChangeLog.bugs index f4df387fdb86..285b35930a1c 100644 --- a/gcc/ChangeLog.bugs +++ b/gcc/ChangeLog.bugs @@ -82,11 +82,13 @@ power9/power10 systems and there were no regressions. Can I check this patch into the GCC trunk, and after a waiting period, can I check this into the active older branches? -2025-02-03 Michael Meissner +2025-02-06 Michael Meissner gcc/ PR target/118541 + * config/rs6000/predicates.md (invert_fpmask_comparison_operator): Do + not allow UNLT and UNLE unless -ffast-math. * config/rs6000/rs6000-protos.h (REVERSE_COND_ORDERED_OK): New macro. (REVERSE_COND_NO_ORDERED): Likewise. (rs6000_reverse_condition): Add argument. @@ -102,6 +104,8 @@ gcc/testsuite/ PR target/118541 * gcc.target/powerpc/pr118541.c: New test. + Branch work192-bugs, patch #212 was reverted + Branch work192-bugs, patch #211 was reverted Branch work192-bugs, patch #210 was reverted Branch work192-bugs, patch #202
[gcc(refs/users/meissner/heads/work192-bugs)] Revert changes
https://gcc.gnu.org/g:5e977a841603fa542cd7a990530d692324f3c90f commit 5e977a841603fa542cd7a990530d692324f3c90f Author: Michael Meissner Date: Thu Feb 6 00:08:32 2025 -0500 Revert changes Diff: --- gcc/config/rs6000/rs6000-protos.h | 6 +--- gcc/config/rs6000/rs6000.cc | 29 ++- gcc/config/rs6000/rs6000.h | 10 ++- gcc/config/rs6000/rs6000.md | 24 ++-- gcc/testsuite/gcc.target/powerpc/pr118541.c | 43 - 5 files changed, 20 insertions(+), 92 deletions(-) diff --git a/gcc/config/rs6000/rs6000-protos.h b/gcc/config/rs6000/rs6000-protos.h index 112332660d3b..4619142d197b 100644 --- a/gcc/config/rs6000/rs6000-protos.h +++ b/gcc/config/rs6000/rs6000-protos.h @@ -114,12 +114,8 @@ extern const char *rs6000_sibcall_template (rtx *, unsigned int); extern const char *rs6000_indirect_call_template (rtx *, unsigned int); extern const char *rs6000_indirect_sibcall_template (rtx *, unsigned int); extern const char *rs6000_pltseq_template (rtx *, int); - -#define REVERSE_COND_ORDERED_OKfalse -#define REVERSE_COND_NO_ORDEREDtrue - extern enum rtx_code rs6000_reverse_condition (machine_mode, - enum rtx_code, bool); + enum rtx_code); extern rtx rs6000_emit_eqne (machine_mode, rtx, rtx, rtx); extern rtx rs6000_emit_fp_cror (rtx_code, machine_mode, rtx); extern void rs6000_emit_sCOND (machine_mode, rtx[]); diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc index d1bf2d29f3da..f9f9a0b931db 100644 --- a/gcc/config/rs6000/rs6000.cc +++ b/gcc/config/rs6000/rs6000.cc @@ -15360,27 +15360,17 @@ rs6000_print_patchable_function_entry (FILE *file, } enum rtx_code -rs6000_reverse_condition (machine_mode cc_mode, - enum rtx_code code, - bool no_ordered) +rs6000_reverse_condition (machine_mode mode, enum rtx_code code) { /* Reversal of FP compares takes care -- an ordered compare - becomes an unordered compare and vice versa. - - However, this is not safe for ordered comparisons (i.e. for isgreater, - etc.) 
starting with the power9 because ifcvt.cc will want to create a fp - cmove, and the x{s,v}cmp{eq,gt,ge}{dp,qp} instructions will trap if one of - the arguments is a signalling NaN. */ - - if (cc_mode == CCFPmode + becomes an unordered compare and vice versa. */ + if (mode == CCFPmode && (!flag_finite_math_only || code == UNLT || code == UNLE || code == UNGT || code == UNGE || code == UNEQ || code == LTGT)) -return (no_ordered - ? UNKNOWN - : reverse_condition_maybe_unordered (code)); - - return reverse_condition (code); +return reverse_condition_maybe_unordered (code); + else +return reverse_condition (code); } /* Check if C (as 64bit integer) can be rotated to a constant which constains @@ -15990,14 +15980,11 @@ rs6000_emit_sCOND (machine_mode mode, rtx operands[]) rtx not_result = gen_reg_rtx (CCEQmode); rtx not_op, rev_cond_rtx; machine_mode cc_mode; - enum rtx_code rev; cc_mode = GET_MODE (XEXP (condition_rtx, 0)); - rev = rs6000_reverse_condition (cc_mode, cond_code, - REVERSE_COND_ORDERED_OK); - rev_cond_rtx = gen_rtx_fmt_ee (rev, SImode, XEXP (condition_rtx, 0), -const0_rtx); + rev_cond_rtx = gen_rtx_fmt_ee (rs6000_reverse_condition (cc_mode, cond_code), +SImode, XEXP (condition_rtx, 0), const0_rtx); not_op = gen_rtx_COMPARE (CCEQmode, rev_cond_rtx, const0_rtx); emit_insn (gen_rtx_SET (not_result, not_op)); condition_rtx = gen_rtx_EQ (VOIDmode, not_result, const0_rtx); diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h index c595d7138bcd..ec08c96d0f67 100644 --- a/gcc/config/rs6000/rs6000.h +++ b/gcc/config/rs6000/rs6000.h @@ -1812,17 +1812,11 @@ extern scalar_int_mode rs6000_pmode; /* Can the condition code MODE be safely reversed? This is safe in all cases on this port, because at present it doesn't use the - trapping FP comparisons (fcmpo). - - However, this is not safe for ordered comparisons (i.e. for isgreater, etc.) 
- starting with the power9 because ifcvt.cc will want to create a fp cmove, - and the x{s,v}cmp{eq,gt,ge}{dp,qp} instructions will trap if one of the - arguments is a signalling NaN. */ + trapping FP comparisons (fcmpo). */ #define REVERSIBLE_CC_MODE(MODE) 1 /* Given a condition code and a mode, return the inverse condition. */ -#define REVERSE_CONDITION(CODE, MODE) \ - rs6000_reverse_condition (MODE, CODE, REVERSE_COND_NO_ORDERED) +#define REVERSE_CONDITION(CODE, MODE) rs6000_reverse_condition (MODE, CODE) /* Target cpu costs. */ diff --git a/gcc/config/rs6000/rs6000.m
[gcc(refs/users/meissner/heads/work192-bugs)] Fix PR 118541, do not generate unordered fp cmoves for IEEE compares.
https://gcc.gnu.org/g:513f333adf91d7478eace35bd5df922fa4dce9cf commit 513f333adf91d7478eace35bd5df922fa4dce9cf Author: Michael Meissner Date: Thu Feb 6 00:13:36 2025 -0500 Fix PR 118541, do not generate unordered fp cmoves for IEEE compares. In bug PR target/118541 on power9, power10, and power11 systems, for the function: extern double __ieee754_acos (double); double __acospi (double x) { double ret = __ieee754_acos (x) / 3.14; return __builtin_isgreater (ret, 1.0) ? 1.0 : ret; } GCC currently generates the following code: Power9 Power10 and Power11 == === bl __ieee754_acos bl __ieee754_acos@notoc nop plfd 0,.LC0@pcrel addis 9,2,.LC2@toc@ha xxspltidp 12,1065353216 addi 1,1,32 addi 1,1,32 lfd 0,.LC2@toc@l(9) ld 0,16(1) addis 9,2,.LC0@toc@ha fdiv 0,1,0 ld 0,16(1) mtlr 0 lfd 12,.LC0@toc@l(9)xscmpgtdp 1,0,12 fdiv 0,1,0 xxsel 1,0,12,1 mtlr 0 blr xscmpgtdp 1,0,12 xxsel 1,0,12,1 blr This is because ifcvt.c optimizes the conditional floating point move to use the XSCMPGTDP instruction. However, the XSCMPGTDP instruction will generate an interrupt if one of the arguments is a signalling NaN and signalling NaNs can generate an interrupt. The IEEE comparison functions (isgreater, etc.) require that the comparison not raise an interrupt. The following patch changes the PowerPC back end so that ifcvt.c will not change the if/then test and move into a conditional move if the comparison is one of the comparisons that do not raise an error with signalling NaNs and -Ofast is not used. If a normal comparison is used or -Ofast is used, GCC will continue to generate XSCMPGTDP and XXSEL. For the following code: double ordered_compare (double a, double b, double c, double d) { return __builtin_isgreater (a, b) ? c : d; } /* Verify normal > does generate xscmpgtdp. */ double normal_compare (double a, double b, double c, double d) { return a > b ? 
c : d; } with the following patch, GCC generates the following for power9, power10, and power11: ordered_compare: fcmpu 0,1,2 fmr 1,4 bnglr 0 fmr 1,3 blr normal_compare: xscmpgtdp 1,1,2 xxsel 1,4,3,1 blr I have built bootstrap compilers on big endian power9 systems and little endian power9/power10 systems and there were no regressions. Can I check this patch into the GCC trunk, and after a waiting period, can I check this into the active older branches? 2025-02-06 Michael Meissner gcc/ PR target/118541 * config/rs6000/predicates.md (invert_fpmask_comparison_operator): Do not allow UNLT and UNLE unless -ffast-math. * config/rs6000/rs6000-protos.h (REVERSE_COND_ORDERED_OK): New macro. (REVERSE_COND_NO_ORDERED): Likewise. (rs6000_reverse_condition): Add argument. * config/rs6000/rs6000.cc (rs6000_reverse_condition): Do not allow ordered comparisons to be reversed for floating point cmoves. (rs6000_emit_sCOND): Adjust rs6000_reverse_condition call. * config/rs6000/rs6000.h (REVERSE_CONDITION): Likewise. * config/rs6000/rs6000.md (reverse_branch_comparison): Name insn. Adjust rs6000_reverse_condition call. gcc/testsuite/ PR target/118541 * gcc.target/powerpc/pr118541.c: New test. Diff: --- gcc/config/rs6000/predicates.md | 8 -- gcc/config/rs6000/rs6000-protos.h | 6 +++- gcc/config/rs6000/rs6000.cc | 29 +-- gcc/config/rs6000/rs6000.h | 10 +-- gcc/config/rs6000/rs6000.md | 24 ++-- gcc/testsuite/gcc.target/powerpc/pr118541.c | 43 + 6 files changed, 98 insertions(+), 22 deletions(-) diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md index 647e89afb6a7..700b266b62f5 100644 --- a/gcc/config/rs6000/predicates.md +++ b/gcc/config/rs6000/predicates.md @@ -1465,9 +1465,13 @@ ;; Return 1 if OP is a comparison operator suitab
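Why reversing a floating-point comparison needs care can be shown with a small standalone sketch. Cmp, eval and the two reverse_gt_* helpers are hypothetical and only model the semantics of the RTL codes GT, LE and UNLE; they are not rs6000_reverse_condition itself, and the sketch assumes default (non -ffast-math) floating-point semantics.

```cpp
#include <cmath>

// Hypothetical model of three RTL comparison codes.
enum class Cmp { GT, LE, UNLE };

// Evaluate a comparison on doubles.  UNLE is "unordered or less-or-equal".
bool eval(Cmp c, double a, double b) {
  switch (c) {
    case Cmp::GT:   return a > b;
    case Cmp::LE:   return a <= b;
    case Cmp::UNLE: return std::isunordered(a, b) || a <= b;
  }
  return false;
}

// Reversal that is only valid when NaNs cannot occur.
Cmp reverse_gt_assuming_ordered() { return Cmp::LE; }

// Reversal that stays correct when an operand may be a NaN.
Cmp reverse_gt_maybe_unordered() { return Cmp::UNLE; }
```

For a NaN operand, a > b is false, so its logical negation is true; LE also evaluates to false and therefore is not the negation, while UNLE evaluates to true. That is the distinction between reverse_condition and reverse_condition_maybe_unordered that the function touched by this patch relies on.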
[gcc r15-7375] cselib: Fix up previous patch for SPARC [PR117239]
https://gcc.gnu.org/g:6094801d6fd7849d2d95ce78f7c6ef01686b9f63

commit r15-7375-g6094801d6fd7849d2d95ce78f7c6ef01686b9f63
Author: Jakub Jelinek
Date: Wed Feb 5 14:06:42 2025 +0100

cselib: Fix up previous patch for SPARC [PR117239]

Sorry, our CI bot just notified me I broke the SPARC build. There are two #ifdef STACK_ADDRESS_OFFSET guarded snippets and the macro is only defined on the SPARC target, so I didn't notice there was a syntax error. Fixed thusly.

2025-02-05 Jakub Jelinek

PR rtl-optimization/117239
* cselib.cc (cselib_init): Remove spurious closing paren in the #ifdef STACK_ADDRESS_OFFSET specific code.

Diff:
---
 gcc/cselib.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/cselib.cc b/gcc/cselib.cc
index d18208e51440..7f1991b09c3c 100644
--- a/gcc/cselib.cc
+++ b/gcc/cselib.cc
@@ -3394,7 +3394,7 @@ cselib_init (int record_what)
 #ifdef STACK_ADDRESS_OFFSET
       /* On SPARC take stack pointer bias into account as well. */
       off += (STACK_ADDRESS_OFFSET
-              - FIRST_PARM_OFFSET (current_function_decl)));
+              - FIRST_PARM_OFFSET (current_function_decl));
 #endif
       callmem[1] = plus_constant (Pmode, stack_pointer_rtx, off);
     }