[gcc r15-2037] [APX NF] Add a pass to convert legacy insn to NF insns
https://gcc.gnu.org/g:681ff5ccca153864eb86099eed201838d8d98bc2

commit r15-2037-g681ff5ccca153864eb86099eed201838d8d98bc2
Author: Hongyu Wang
Date:   Thu Apr 18 16:53:26 2024 +0800

    [APX NF] Add a pass to convert legacy insn to NF insns

    For APX ccmp, the current infrastructure always generates a cstore for
    the ccmp flag user, like

        cmpe    %rcx, %r8
        ccmpnel %rax, %rbx
        seta    %dil
        add     %rcx, %r9
        add     %r9, %rdx
        testb   %dil, %dil
        je      .L2

    In such a case the legacy add clobbers FLAGS_REG, so an extra cstore
    is needed to keep the flag from being reset before it is used. If the
    instructions between the flag producer and the flag user are NF insns,
    the setcc/test sequence is not required.

    Add a pass to convert legacy flag-clobbering insns to their NF
    counterparts. The conversion only happens when
    1. APX_NF is enabled.
    2. For a BB, a cstore was found, and there are insns between that
       cstore and the next explicit set insn to FLAGS_REG (test or cmp).
    3. All the insns found have an NF counterpart.

    The pass is added after rtl-ifcvt, which eliminates some branches when
    profitable and can thereby place flag-clobbering insns between the
    cstore and the jcc.

    gcc/ChangeLog:

            * config/i386/i386.md (has_nf): New define_attr, add to all
            nf related patterns.
            * config/i386/i386-features.cc (apx_nf_convert): New function
            to convert Non-NF insns to their NF counterparts.
            (class pass_apx_nf_convert): New pass class.
            (make_pass_apx_nf_convert): New.
            * config/i386/i386-passes.def: Add pass_apx_nf_convert after
            rtl_ifcvt.
            * config/i386/i386-protos.h (make_pass_apx_nf_convert): Declare.

    gcc/testsuite/ChangeLog:

            * gcc.target/i386/apx-nf-2.c: New test.
Diff:
---
 gcc/config/i386/i386-features.cc         | 163 +++
 gcc/config/i386/i386-passes.def          |   1 +
 gcc/config/i386/i386-protos.h            |   1 +
 gcc/config/i386/i386.md                  |  67 -
 gcc/testsuite/gcc.target/i386/apx-nf-2.c |  32 ++
 5 files changed, 259 insertions(+), 5 deletions(-)

diff --git a/gcc/config/i386/i386-features.cc b/gcc/config/i386/i386-features.cc
index fc224ed06b0e..3da56ddbdccd 100644
--- a/gcc/config/i386/i386-features.cc
+++ b/gcc/config/i386/i386-features.cc
@@ -3259,6 +3259,169 @@ make_pass_remove_partial_avx_dependency (gcc::context *ctxt)
   return new pass_remove_partial_avx_dependency (ctxt);
 }
 
+/* Convert legacy instructions that clobbers EFLAGS to APX_NF
+   instructions when there are no flag set between a flag
+   producer and user.  */
+
+static unsigned int
+ix86_apx_nf_convert (void)
+{
+  timevar_push (TV_MACH_DEP);
+
+  basic_block bb;
+  rtx_insn *insn;
+  hash_map converting_map;
+  auto_vec current_convert_list;
+
+  bool converting_seq = false;
+  rtx cc = gen_rtx_REG (CCmode, FLAGS_REG);
+
+  FOR_EACH_BB_FN (bb, cfun)
+    {
+      /* Reset conversion for each bb.  */
+      converting_seq = false;
+      FOR_BB_INSNS (bb, insn)
+        {
+          if (!NONDEBUG_INSN_P (insn))
+            continue;
+
+          if (recog_memoized (insn) < 0)
+            continue;
+
+          /* Convert candidate insns after cstore, which should
+             satisify the two conditions:
+             1. Is not flag user or producer, only clobbers
+             FLAGS_REG.
+             2. Have corresponding nf pattern.  */
+
+          rtx pat = PATTERN (insn);
+
+          /* Starting convertion at first cstorecc.  */
+          rtx set = NULL_RTX;
+          if (!converting_seq
+              && (set = single_set (insn))
+              && ix86_comparison_operator (SET_SRC (set), VOIDmode)
+              && reg_overlap_mentioned_p (cc, SET_SRC (set))
+              && !reg_overlap_mentioned_p (cc, SET_DEST (set)))
+            {
+              converting_seq = true;
+              current_convert_list.truncate (0);
+            }
+          /* Terminate at the next explicit flag set.  */
+          else if (reg_set_p (cc, pat)
+                   && GET_CODE (set_of (cc, pat)) != CLOBBER)
+            converting_seq = false;
+
+          if (!converting_seq)
+            continue;
+
+          if (get_attr_has_nf (insn)
+              && GET_CODE (pat) == PARALLEL)
+            {
+              /* Record the insn to candidate map.  */
+              current_convert_list.safe_push (insn);
+              converting_map.put (insn, pat);
+            }
+          /* If the insn clobbers flags but has no nf_attr,
+             revoke all previous candidates.  */
+          else if (!get_attr_has_nf (insn)
+                   && reg_set_p (cc, pat)
+                   && GET_CODE (set_of (cc, pat)) == CLOBBER)
+            {
+              for (auto item : current_convert_list)
+                converting_map.remove (item);
+
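The candidate selection in the hunk above can be modeled outside GCC. The following is a minimal sketch, not the pass itself: `Kind` and `select_nf_candidates` are hypothetical names, every insn is reduced to one of five kinds, and the sketch assumes that scanning stops after a revocation until the next cstore (the diff is cut off before that point, so this is a guess).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical reduction of an RTL insn to what the scan cares about.
enum class Kind {
  CStore,        // reads FLAGS into a register (setcc); starts a sequence
  FlagSet,       // explicitly sets FLAGS (cmp/test); ends a sequence
  ClobberHasNF,  // only clobbers FLAGS and has an NF counterpart
  ClobberNoNF,   // only clobbers FLAGS and has no NF counterpart
  Other          // does not touch FLAGS
};

// Return the indices of the insns in one basic block that would be
// rewritten to their NF form.
std::vector<std::size_t> select_nf_candidates(const std::vector<Kind>& bb) {
  std::vector<std::size_t> selected;
  std::size_t seq_start = 0;  // size of `selected` when the sequence began
  bool converting = false;

  for (std::size_t i = 0; i < bb.size(); ++i) {
    const Kind k = bb[i];
    if (!converting && k == Kind::CStore) {
      converting = true;             // start collecting after a cstore
      seq_start = selected.size();
    } else if (k == Kind::FlagSet) {
      converting = false;            // next explicit flag set ends it
    }
    if (!converting)
      continue;

    if (k == Kind::ClobberHasNF) {
      selected.push_back(i);         // record the candidate
    } else if (k == Kind::ClobberNoNF) {
      selected.resize(seq_start);    // revoke this sequence's candidates
      converting = false;            // assumption: wait for the next cstore
    }
  }
  return selected;
}
```

In this model a single flag clobber without an NF form discards every candidate of the current sequence, because converting only some of the insns would still leave the flags clobbered before the user.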
[gcc r15-2038] Fix SSA_NAME leak due to def_stmt is removed before use_stmt.
https://gcc.gnu.org/g:f27bf48e0204524ead795fe618cd8b1224f72fd4

commit r15-2038-gf27bf48e0204524ead795fe618cd8b1224f72fd4
Author: liuhongt
Date:   Fri Jul 12 09:39:23 2024 +0800

    Fix SSA_NAME leak due to def_stmt is removed before use_stmt.

    -  _5 = __atomic_fetch_or_8 (&set_work_pending_p, 1, 0);
    -  # DEBUG old => (long int) _5
    +  _6 = .ATOMIC_BIT_TEST_AND_SET (&set_work_pending_p, 0, 1, 0, __atomic_fetch_or_8);
    +  # DEBUG old => NULL
       # DEBUG BEGIN_STMT
    -  # DEBUG D#2 => _5 & 1
    +  # DEBUG D#2 => NULL
    ...
    -  _10 = ~_5;
    -  _8 = (_Bool) _10;
    -  # DEBUG ret => _8
    +  _8 = _6 == 0;
    +  # DEBUG ret => (_Bool) _10

    confirmed.

    convert_atomic_bit_not does this, it checks for single_use and removes
    the def, failing to release the name (which would fix this up IIRC).

    Note the function removes stmts in "wrong" order (before uses of LHS
    are removed), so it requires larger surgery. And it leaks SSA names.

    gcc/ChangeLog:

            PR target/115872
            * tree-ssa-ccp.cc (convert_atomic_bit_not): Remove use_stmt after
            use_nop_stmt is removed.
            (optimize_atomic_bit_test_and): Ditto.

    gcc/testsuite/ChangeLog:

            * gcc.target/i386/pr115872.c: New test.
Diff:
---
 gcc/testsuite/gcc.target/i386/pr115872.c | 16
 gcc/tree-ssa-ccp.cc                      | 12
 2 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/gcc.target/i386/pr115872.c b/gcc/testsuite/gcc.target/i386/pr115872.c
new file mode 100644
index ..937004456d37
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr115872.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -g" } */
+
+long set_work_pending_p;
+_Bool set_work_pending() {
+  _Bool __trans_tmp_1;
+  long mask = 1, old = __atomic_fetch_or(&set_work_pending_p, mask, 0);
+  __trans_tmp_1 = old & mask;
+  return !__trans_tmp_1;
+}
+void __queue_work() {
+  _Bool ret = set_work_pending();
+  if (ret)
+    __queue_work();
+}
+
diff --git a/gcc/tree-ssa-ccp.cc b/gcc/tree-ssa-ccp.cc
index 3749126b5f7c..de83d26d311a 100644
--- a/gcc/tree-ssa-ccp.cc
+++ b/gcc/tree-ssa-ccp.cc
@@ -3332,9 +3332,10 @@ convert_atomic_bit_not (enum internal_fn fn, gimple *use_stmt,
     return nullptr;
 
   gimple_stmt_iterator gsi;
-  gsi = gsi_for_stmt (use_stmt);
-  gsi_remove (&gsi, true);
   tree var = make_ssa_name (TREE_TYPE (lhs));
+  /* use_stmt need to be removed after use_nop_stmt,
+     so use_lhs can be released.  */
+  gimple *use_stmt_removal = use_stmt;
   use_stmt = gimple_build_assign (var, BIT_AND_EXPR, lhs, and_mask);
   gsi = gsi_for_stmt (use_not_stmt);
   gsi_insert_before (&gsi, use_stmt, GSI_NEW_STMT);
@@ -3344,6 +3345,8 @@ convert_atomic_bit_not (enum internal_fn fn, gimple *use_stmt,
   gsi_insert_after (&gsi, g, GSI_NEW_STMT);
   gsi = gsi_for_stmt (use_not_stmt);
   gsi_remove (&gsi, true);
+  gsi = gsi_for_stmt (use_stmt_removal);
+  gsi_remove (&gsi, true);
   return use_stmt;
 }
 
@@ -3646,8 +3649,7 @@ optimize_atomic_bit_test_and (gimple_stmt_iterator *gsip,
                */
             }
           var = make_ssa_name (TREE_TYPE (use_rhs));
-          gsi = gsi_for_stmt (use_stmt);
-          gsi_remove (&gsi, true);
+          gimple* use_stmt_removal = use_stmt;
           g = gimple_build_assign (var, BIT_AND_EXPR, use_rhs, and_mask);
           gsi = gsi_for_stmt (use_nop_stmt);
@@ -3664,6 +3666,8 @@ optimize_atomic_bit_test_and (gimple_stmt_iterator *gsip,
           gsi_insert_after (&gsi, g, GSI_NEW_STMT);
           gsi = gsi_for_stmt (use_nop_stmt);
           gsi_remove (&gsi, true);
+          gsi = gsi_for_stmt (use_stmt_removal);
+          gsi_remove (&gsi, true);
         }
     }
   else
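The source-level shape that this code path handles is the one in the new testcase: fetch-or a single-bit mask, then test that same bit in the old value. As a minimal sketch of that shape in portable C++ (the names `work_pending` and `set_work_pending` are illustrative; whether a given compiler actually turns it into a bit-test-and-set depends on the target and options, which this sketch does not assert):

```cpp
#include <atomic>

// Hypothetical flag word, mirroring set_work_pending_p in the testcase.
std::atomic<long> work_pending{0};

// Set bit 0 and report whether this call was the one that set it.
// Only bit 0 of the old value is inspected, which is what lets the
// fetch_or be treated as an atomic bit test-and-set.
bool set_work_pending() {
  const long mask = 1;
  const long old = work_pending.fetch_or(mask, std::memory_order_relaxed);
  return !(old & mask);
}
```

The first caller sees the bit clear and gets `true`; later callers get `false` until the bit is cleared again.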
[gcc r15-2039] varasm: Add support for emitting binary data with the new gas .base64 directive
https://gcc.gnu.org/g:9964edfb4abdec25f8be48e667afcae30ce03a37

commit r15-2039-g9964edfb4abdec25f8be48e667afcae30ce03a37
Author: Jakub Jelinek
Date:   Mon Jul 15 09:48:38 2024 +0200

    varasm: Add support for emitting binary data with the new gas .base64 directive

    Nick has implemented a new .base64 directive in gas (to be shipped in
    the upcoming binutils 2.43; big thanks for that).
    See https://sourceware.org/bugzilla/show_bug.cgi?id=31964

    The following patch adjusts default_elf_asm_output_ascii (i.e.
    ASM_OUTPUT_ASCII elfos.h implementation) to use it if it detects binary
    data and gas supports it.

    Without this patch, we emit stuff like:
            .string "\177ELF\002\001\001\003"
            .string ""
            .string ""
            .string ""
            .string ""
            .string ""
            .string ""
            .string ""
            .string "\002"
            .string ">"
    ...
            .string "\324\001\236 0FS\202\002E\n0@\203\004\005&\202\021\337)\021\203C\020A\300\220I\004\t\b\206(\234\0132l\004b\300\bK\006\220$0\303\020P$\233\211\002D\f"
    etc., with this patch more compact
            .base64 "f0VMRgIBAQMAAAIAPgABABf3AABAAACneB0AAEAAOAAOAEAALAArAAYEQABAAEAAAEAAQAAAEAMQAwgAAwQAAABQAwAAAFADQAAAUANcABwAAQABBABAAEAAADBwOQAAMHA5EAEFAIA5gHkA"
            .base64 "AACAeQAAxSSgAgDFJKACAAAQAQQAsNkCAACwGQMAALAZAwDMtc0AAMy1zQAAABABBgAAAGhmpwMAaHbnAwBoducDAOAMAQAA4MEeEAIGkH2nAwCQjecDAJCN5wMAQAIAAABAAggABAQAAABwAwAAAHADQAAAcANAAABAAEAACAAA"
            .base64 "AAAEBLADsANAAACwA0AAACAAIAAEAAcEaGanAwBoducDAGh25wMQAAgAU+V0ZAQAAABwAwAAAHADQAAAcANAAABAAEAACABQ5XRkBAw/WAMADD+YAwAMP5gDAPy7CgAA/LsKAAAEAFHldGQG"
            .base64 "ABAAUuV0ZAQAAABoZqcDAGh25wMAaHbnAwCYGQAAAJgZAQAvbGliNjQvbGQtbGludXgteDg2LTY0LnNvLjIAAAQwBQAAAEdOVQACgADABAEAAQABwAQJAAIAAcAEAwAEEAEAAABHTlUAAAMCAAOAAACsqAAAgS0AAOJWAAAjNwAAXjAAF1gAAHsxAABBBwAA"
            .base64 "G0kAALGmAACwoACQhOw1AACNYgAAAFQox3UAALZAiIUAALGeAABBlAAAWEsAAPmRAACmOgAAADh3lCBymgAAaosAAMIjAAAKMQAAMkIAADU05ZwAAFIdAAAIGQAAAMFbAAAoTQAAGDcAAIRgAAA6HgAAlxwAAADOlgAAAEhPAAARiwAAMGgAAOVtAADMFgCrjgAAYl4AACZVAAA/HgBqPwAA"

    The patch attempts to juggle between readability and compactness, so
    if it detects some hunk of the initializer that would be shorter to be
    emitted as .string/.ascii directive, it does so, but if it previously
    used .base64 directive it switches mode only if there is a 16+ char
    ASCII-ish string.

    On my #embed testcase from yesterday
    unsigned char a[] = {
    #embed "cc1plus"
    };
    without this patch it emits 2.4GB of assembly, while with this patch
    649M.
    Compile times (trunk, so yes,rtl,extra checking) are:
    time ./xgcc -B ./ -S -std=c23 -O2 embed-11.c

    real    0m13.647s
    user    0m7.157s
    sys     0m2.597s
    time ./xgcc -B ./ -c -std=c23 -O2 embed-11.c

    real    0m28.649s
    user    0m26.653s
    sys     0m1.958s
    without the patch and
    time ./xgcc -B ./ -S -std=c23 -O2 embed-11.c

    real    0m4.283s
    user    0m2.288s
    sys     0m0.859s
    time ./xgcc -B ./ -c -std=c23 -O2 embed-11.c

    real    0m6.888s
    user    0m5.876s
    sys     0m1.002s
    with the patch, so that feels like significant improvement.
    The resulting embed-11.o is identical between the two ways of
    expressing the mostly binary data in the assembly.

    But note that there are portions like:
            .base64 "nAAvZRcAIgAOAFAzMwEABgCEQBgAEgAOAFBHcAIA7AAAX19nbXB6X2dldF9zaQBtcGZyX3NldF9zaV8yZXhwAG1wZnJfY29zaABtcGZyX3RhbmgAbXBmcl9zZXRfbmFuAG1wZnJfc3ViAG1wZnJfdGFuAG1wZnJfc3RydG9mcgBfX2dtcHpfc3ViX3VpAF9fZ21wX2dldF9tZW1vcnlfZnVuY3Rpb25zAF9fZ21wel9zZXRfdWkAbXBmcl9wb3cAX19nbXB6X3N1YgBfX2dtcHpfZml0c19zbG9uZ19wAG1wZnJfYXRh"
            .base64 "bjIAX19nbXB6X2RpdmV4YWN0AG1wZnJfc2V0X2VtaW4AX19nbXB6X3NldABfX2dtcHpfbXVsAG1wZnJfY2xlYXIAbXBmcl9sb2cAbXBmcl9hdGFuaABfX2dtcHpfc3dhcABtcGZyX2FzaW5oAG1wZnJfYXNpbgBtcGZyX2NsZWFycwBfX2dtcHpfbXVsXzJleHAAX19nbXB6X2FkZG11bABtcGZyX3NpbmgAX19nbXB6X2FkZF91aQBfX2dtcHFfY2xlYXIAX19nbW9uX3N0YXJ0X18AbXBmcl9hY29zAG1wZnJfc2V0X2VtYXgAbXBmcl9jb3MAbXBmcl9zaW4A
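To see how the `.string` and `.base64` forms in the message correspond, here is a small standalone base64 encoder. It is only an illustration of the standard encoding, not the code added to varasm; `base64_encode` is a name made up for this sketch. Feeding it the first bytes of the example (`\177ELF\002\001\001\003` followed by the NUL that `.string` appends) yields the first twelve characters of the first `.base64` line.

```cpp
#include <cstddef>
#include <string>

// Encode `len` bytes using the standard base64 alphabet with '=' padding.
std::string base64_encode(const unsigned char* data, std::size_t len) {
  static const char tbl[] =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  std::string out;
  out.reserve((len + 2) / 3 * 4);

  std::size_t i = 0;
  // Full 3-byte groups become 4 output characters each.
  for (; i + 2 < len; i += 3) {
    const unsigned v = (unsigned(data[i]) << 16) |
                       (unsigned(data[i + 1]) << 8) | unsigned(data[i + 2]);
    out += tbl[(v >> 18) & 63];
    out += tbl[(v >> 12) & 63];
    out += tbl[(v >> 6) & 63];
    out += tbl[v & 63];
  }
  // A trailing group of 1 or 2 bytes is zero-extended and padded with '='.
  if (i < len) {
    const bool two = (i + 1 < len);
    unsigned v = unsigned(data[i]) << 16;
    if (two)
      v |= unsigned(data[i + 1]) << 8;
    out += tbl[(v >> 18) & 63];
    out += tbl[(v >> 12) & 63];
    out += two ? tbl[(v >> 6) & 63] : '=';
    out += '=';
  }
  return out;
}
```

Every three input bytes cost four characters, whereas an octal escape such as `\177` costs four characters for a single byte, which is where the size reduction for mostly binary data comes from.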
[gcc r15-2040] AVR: avr-md - Simplify GET_MODE and GET_MODE_BITSIZE.
https://gcc.gnu.org/g:8f87b3c5ecd47f6ac0d7407ae5d436a12fb169dd

commit r15-2040-g8f87b3c5ecd47f6ac0d7407ae5d436a12fb169dd
Author: Georg-Johann Lay
Date:   Mon Jul 15 09:12:03 2024 +0200

    AVR: avr-md - Simplify GET_MODE and GET_MODE_BITSIZE.

    gcc/
            * config/avr/avr.md: Simplify mode usage.
            (GET_MODE_SIZE (mode)): Use instead.
            (GET_MODE_BITSIZE (mode) - 1): Use instead.
            (GET_MODE_MASK (QImode)): Use 0xff instead.
            * config/avr/avr-fixed.md: Same.

Diff:
---
 gcc/config/avr/avr-fixed.md |  8
 gcc/config/avr/avr.md       | 36 ++--
 2 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/gcc/config/avr/avr-fixed.md b/gcc/config/avr/avr-fixed.md
index ca0e254e314b..911b8b2cd67c 100644
--- a/gcc/config/avr/avr-fixed.md
+++ b/gcc/config/avr/avr-fixed.md
@@ -231,7 +231,7 @@
         (match_dup 2))]
   ""
   {
-    operands[2] = gen_rtx_REG (mode, 26 - GET_MODE_SIZE (mode));
+    operands[2] = gen_rtx_REG (mode, 26 - );
   })
 
 ;; "*ssneghq2" "*ssnegha2"
@@ -651,7 +651,7 @@
   {
     if (CONST_INT_P (operands[2])
         && !(optimize_size
-             && 4 == GET_MODE_SIZE (mode)))
+             && 4 == ))
       {
         emit_insn (gen_round3_const (operands[0], operands[1], operands[2]));
         DONE;
@@ -661,8 +661,8 @@
     const unsigned int regno_in[] = { -1U, 22, 22, -1U, 18 };
     const unsigned int regno_out[] = { -1U, 24, 24, -1U, 22 };
 
-    operands[3] = gen_rtx_REG (mode, regno_out[(size_t) GET_MODE_SIZE (mode)]);
-    operands[4] = gen_rtx_REG (mode, regno_in[(size_t) GET_MODE_SIZE (mode)]);
+    operands[3] = gen_rtx_REG (mode, regno_out[(size_t) ]);
+    operands[4] = gen_rtx_REG (mode, regno_in[(size_t) ]);
     avr_fix_inputs (operands, 1 << 2, regmask (mode, REGNO (operands[4])));
     operands[5] = simplify_gen_subreg (QImode, force_reg (HImode, operands[2]), HImode, 0);
     // $2 is no more needed, but is referenced for expand.
diff --git a/gcc/config/avr/avr.md b/gcc/config/avr/avr.md
index 8c3e55a91ee0..e67284421b64 100644
--- a/gcc/config/avr/avr.md
+++ b/gcc/config/avr/avr.md
@@ -556,7 +556,7 @@
    && REG_Z == REGNO (XEXP (operands[0], 0))
    && reload_completed"
   {
-    operands[0] = GEN_INT (GET_MODE_SIZE (mode));
+    operands[0] = GEN_INT ();
     return "%~call __load_%0";
   }
   [(set_attr "length" "1,2")
@@ -679,7 +679,7 @@
   "avr_xload_libgcc_p (mode)
    && reload_completed"
   {
-    rtx x_bytes = GEN_INT (GET_MODE_SIZE (mode));
+    rtx x_bytes = GEN_INT ();
 
     output_asm_insn ("%~call __xload_%0", &x_bytes);
     return "";
@@ -1023,7 +1023,7 @@
     operands[2] = replace_equiv_address (operands[1], gen_rtx_POST_INC (Pmode, addr));
     operands[3] = addr;
-    operands[4] = gen_int_mode (-GET_MODE_SIZE (mode), HImode);
+    operands[4] = gen_int_mode (-, HImode);
   })
 
 
@@ -4789,7 +4789,7 @@
   [(set (match_dup 2) (match_dup 3))
    (set (match_dup 4) (match_dup 5))]
   {
-    machine_mode mode_hi = 4 == GET_MODE_SIZE (mode) ? HImode : QImode;
+    machine_mode mode_hi = == 4 ? HImode : QImode;
     bool lo_first = REGNO (operands[0]) < REGNO (operands[1]);
     rtx dst_lo = simplify_gen_subreg (HImode, operands[0], mode, 0);
     rtx src_lo = simplify_gen_subreg (HImode, operands[1], mode, 0);
@@ -4833,7 +4833,7 @@
    && reload_completed"
   [(const_int 1)]
   {
-    for (int i = 0; i < GET_MODE_SIZE (mode); i++)
+    for (int i = 0; i < ; i++)
       {
         rtx dst = simplify_gen_subreg (QImode, operands[0], mode, i);
         rtx src = simplify_gen_subreg (QImode, operands[1], mode, i);
@@ -4962,7 +4962,7 @@
         operands[3] = gen_rtx_SCRATCH (QImode);
       }
     else if (offset == 1
-             || offset == GET_MODE_BITSIZE (mode) -1)
+             || offset == )
       {
         // Support rotate left/right by 1.
@@ -5117,7 +5117,7 @@
    (clobber (match_scratch: 3 "="))]
   "AVR_HAVE_MOVW
    && CONST_INT_P (operands[2])
-   && GET_MODE_SIZE (mode) % 2 == 0
+   && % 2 == 0
    && 0 == INTVAL (operands[2]) % 16"
   "#"
   "&& reload_completed"
@@ -5141,7 +5141,7 @@
   "CONST_INT_P (operands[2])
    && (8 == INTVAL (operands[2]) % 16
        || ((!AVR_HAVE_MOVW
-            || GET_MODE_SIZE (mode) % 2 != 0)
+            || % 2 != 0)
            && 0 == INTVAL (operands[2]) % 16))"
   "#"
   "&& reload_completed"
@@ -6658,7 +6658,7 @@
         (compare:CC (any_extend:HISI (match_operand:QIPSI 0 "register_operand" "r"))
                     (match_operand:HISI 1 "register_operand" "r")))]
   "reload_completed
-   && GET_MODE_SIZE (mode) > GET_MODE_SIZE (mode)"
+   && > "
   {
     return avr_out_cmp_ext (operands, , nullptr);
   }
@@ -6671,7 +6671,7 @@
         (compare:CC (match_operand:HISI 0 "register_operand" "r")
                     (any_extend:HISI (match_operand:QIPSI 1 "register_operand" "r"]
   "reload_completed
-   && GET_MODE_SIZE (mode) > GE
[gcc(refs/users/meissner/heads/work171-bugs)] Revert changes
https://gcc.gnu.org/g:aa366303d519602bef7b7425d15ce517a31e3291

commit aa366303d519602bef7b7425d15ce517a31e3291
Author: Michael Meissner
Date:   Mon Jul 15 12:34:03 2024 -0400

    Revert changes

Diff:
---
 gcc/ChangeLog.bugs                  | 22 +-
 libgcc/config.host                  | 16 ++--
 libgcc/config/rs6000/t-float128     |  2 +-
 libgcc/config/rs6000/t-float128-vsx |  3 ---
 libgcc/configure                    | 29 +
 libgcc/configure.ac                 | 17 +
 6 files changed, 18 insertions(+), 71 deletions(-)

diff --git a/gcc/ChangeLog.bugs b/gcc/ChangeLog.bugs
index f75ce1a0c83a..076ccebfc45e 100644
--- a/gcc/ChangeLog.bugs
+++ b/gcc/ChangeLog.bugs
@@ -1,24 +1,4 @@
- Branch work171-bugs, patch #311
-
-Use -mcpu=power7 if needed and not -mvsx to build float128 support.
-
-2024-07-12  Michael Meissner
-
-libgcc/
-
-       PR target/115800
-       PR target/113652
-       * config.host (powerpc*-*-linux*): Do not enable the float128 hardware
-       and float128 power10 hardware support unless the basic float128 support
-       is added.  Add support for building the float128 support when the default
-       compiler does not enable VSX.
-       * config/rs6000/t-float128 (FP128_CFLAGS_SW): Do not use -mvsx, instead
-       use FP128_CFLAGS_VSX to optionally add -mcpu=power7.
-       * config/rs6000/t-float128-vsx: New file.
-       * configure.ac (powerpc*-*-linux*): Determine if the default powerpc cpu
-       includes VSX support.
-       * configure: Regenerate.
-
+ Branch work171-bugs, patch #311 was reverted
  Branch work171-bugs, patch #310 was reverted
  Branch work171-bugs, patch #304 was reverted
  Branch work171-bugs, patch #303 was reverted
diff --git a/libgcc/config.host b/libgcc/config.host
index 804d12e8fd6a..9fae51d4ce7d 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -1291,19 +1291,15 @@ powerpc*-*-linux*)
         esac
 
         if test $libgcc_cv_powerpc_float128 = yes; then
-                if test $libgcc_cv_powerpc_vsx = no; then
-                        tmake_file="${tmake_file} rs6000/t-float128-vsx"
-                fi
-
                 tmake_file="${tmake_file} rs6000/t-float128"
+        fi
 
-                if test $libgcc_cv_powerpc_float128_hw = yes; then
-                        tmake_file="${tmake_file} rs6000/t-float128-hw"
+        if test $libgcc_cv_powerpc_float128_hw = yes; then
+                tmake_file="${tmake_file} rs6000/t-float128-hw"
+        fi
 
-                        if test $libgcc_cv_powerpc_3_1_float128_hw = yes; then
-                                tmake_file="${tmake_file} rs6000/t-float128-p10-hw"
-                        fi
-                fi
+        if test $libgcc_cv_powerpc_3_1_float128_hw = yes; then
+                tmake_file="${tmake_file} rs6000/t-float128-p10-hw"
         fi
 
         extra_parts="$extra_parts ecrti.o ecrtn.o ncrti.o ncrtn.o"
diff --git a/libgcc/config/rs6000/t-float128 b/libgcc/config/rs6000/t-float128
index 8037a6290a82..b09b5664af0e 100644
--- a/libgcc/config/rs6000/t-float128
+++ b/libgcc/config/rs6000/t-float128
@@ -74,7 +74,7 @@ fp128_includes= $(srcdir)/soft-fp/double.h \
                           $(srcdir)/soft-fp/soft-fp.h
 
 # Build the emulator without ISA 3.0 hardware support.
-FP128_CFLAGS_SW         = -Wno-type-limits $(FLOAT128_CFLAGS_VSX) -mfloat128 \
+FP128_CFLAGS_SW         = -Wno-type-limits -mvsx -mfloat128 \
                            -mno-float128-hardware -mno-gnu-attribute \
                            -I$(srcdir)/soft-fp \
                            -I$(srcdir)/config/rs6000 \
diff --git a/libgcc/config/rs6000/t-float128-vsx b/libgcc/config/rs6000/t-float128-vsx
deleted file mode 100644
index c691546242d9..
--- a/libgcc/config/rs6000/t-float128-vsx
+++ /dev/null
@@ -1,3 +0,0 @@
-# Add -mcpu=power7 option if the default compiler does not support VSX
-
-FLOAT128_CFLAGS_VSX= -mabi=altivec -mcpu=power7
diff --git a/libgcc/configure b/libgcc/configure
index ad12baf965f4..a69d314374a3 100755
--- a/libgcc/configure
+++ b/libgcc/configure
@@ -5180,32 +5180,13 @@ esac
 esac
 
 case ${host} in
-# Test if the default compiler enables VSX.  If it does not, we need to build
-# the float128 bit support using -mcpu=power7 to enable the VSX instruction set.
-#
-# Also check if a new glibc is being used so that __builtin_cpu_supports can be
-# used.
+# At present, we cannot turn -mfloat128 on via #pragma GCC target, so just
+# check if we have VSX (ISA 2.06) support to build the software libraries, and
+# whether the assembler can handle xsaddqp for hardware support.  Also check if
+# a new glibc is being used so that __builtin_cpu_supports can be used.
 powerpc*-*-linux*
[gcc r15-2041] RISC-V: Fix testcase for vector .SAT_SUB in zip benchmark
https://gcc.gnu.org/g:4306f76192bc7ab71c5997a7e2c95320505029ab

commit r15-2041-g4306f76192bc7ab71c5997a7e2c95320505029ab
Author: Edwin Lu
Date:   Fri Jul 12 11:31:16 2024 -0700

    RISC-V: Fix testcase for vector .SAT_SUB in zip benchmark

    The following testcase was not properly testing anything due to an
    uninitialized variable. As a result, the loop was not iterating through
    the testing data, but instead on undefined values which could cause an
    unexpected abort.

    gcc/testsuite/ChangeLog:

            * gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_vx.h:
            initialize variable

    Signed-off-by: Edwin Lu

Diff:
---
 gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_vx.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_vx.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_vx.h
index d238c6392def..309d63377d53 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_vx.h
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_vx.h
@@ -9,6 +9,7 @@ main ()
 
   for (i = 0; i < sizeof (DATA) / sizeof (DATA[0]); i++)
     {
+      d = DATA[i];
       RUN_BINARY_VX (&d.x[N], d.b, N);
 
       for (k = 0; k < N; k++)
[gcc r15-2042] [i386] adjust flag_omit_frame_pointer in a single function [PR113719]
https://gcc.gnu.org/g:bf8e80f9d164f8778d86a3dc50e501cf19a9eff1

commit r15-2042-gbf8e80f9d164f8778d86a3dc50e501cf19a9eff1
Author: Alexandre Oliva
Date:   Mon Jul 15 14:00:36 2024 -0300

    [i386] adjust flag_omit_frame_pointer in a single function [PR113719]

    The first two patches for PR113719 have each regressed
    gcc.dg/ipa/iinline-attr.c on a different target. The reason for this
    instability is that there are competing flag_omit_frame_pointer
    overriders on x86:

    - ix86_recompute_optlev_based_flags computes and sets a
      -f[no-]omit-frame-pointer default depending on
      USE_IX86_FRAME_POINTER and, in 32-bit mode, optimize_size

    - ix86_option_override_internal enables flag_omit_frame_pointer for
      -momit-leaf-frame-pointer to take effect

    ix86_option_override[_internal] calls
    ix86_recompute_optlev_based_flags before setting
    flag_omit_frame_pointer. It is called during global process_options.

    But ix86_recompute_optlev_based_flags is also called by
    parse_optimize_options, during attribute processing, and at that
    point, ix86_option_override is not called, so the final overrider for
    global options is not applied to the optimize attributes. If they
    differ, the testcase fails.

    In order to fix this, we need to process all overriders of this option
    whenever we process any of them. Since this setting is affected by
    optimization options, it makes sense to compute it in
    parse_optimize_options, rather than in process_options.

    for  gcc/ChangeLog

            PR target/113719
            * config/i386/i386-options.cc (ix86_option_override_internal):
            Move flag_omit_frame_pointer final overrider...
            (ix86_recompute_optlev_based_flags): ... here.
Diff:
---
 gcc/config/i386/i386-options.cc | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/gcc/config/i386/i386-options.cc b/gcc/config/i386/i386-options.cc
index 5824c0cb072e..059ef3ae6ad4 100644
--- a/gcc/config/i386/i386-options.cc
+++ b/gcc/config/i386/i386-options.cc
@@ -1911,6 +1911,12 @@ ix86_recompute_optlev_based_flags (struct gcc_options *opts,
         opts->x_flag_pcc_struct_return = DEFAULT_PCC_STRUCT_RETURN;
     }
 }
+
+  /* Keep nonleaf frame pointers.  */
+  if (opts->x_flag_omit_frame_pointer)
+    opts->x_target_flags &= ~MASK_OMIT_LEAF_FRAME_POINTER;
+  else if (TARGET_OMIT_LEAF_FRAME_POINTER_P (opts->x_target_flags))
+    opts->x_flag_omit_frame_pointer = 1;
 }
 
 /* Implement part of TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE hook.  */
@@ -2590,12 +2596,6 @@ ix86_option_override_internal (bool main_args_p,
       opts->x_target_flags |= MASK_NO_RED_ZONE;
     }
 
-  /* Keep nonleaf frame pointers.  */
-  if (opts->x_flag_omit_frame_pointer)
-    opts->x_target_flags &= ~MASK_OMIT_LEAF_FRAME_POINTER;
-  else if (TARGET_OMIT_LEAF_FRAME_POINTER_P (opts->x_target_flags))
-    opts->x_flag_omit_frame_pointer = 1;
-
   /* If we're doing fast math, we don't care about comparison
      order wrt NaNs.  This lets us use a shorter comparison sequence.  */
   if (opts->x_flag_finite_math_only)
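The block that moves between the two functions is small enough to model on its own. The sketch below is a simplified stand-in, not GCC code: `FramePointerOpts` and `apply_final_overrider` are hypothetical names, and the two booleans stand in for `x_flag_omit_frame_pointer` and the `MASK_OMIT_LEAF_FRAME_POINTER` bit of `x_target_flags`.

```cpp
// Hypothetical two-field stand-in for the relevant parts of gcc_options.
struct FramePointerOpts {
  bool omit_frame_pointer;       // models opts->x_flag_omit_frame_pointer
  bool omit_leaf_frame_pointer;  // models MASK_OMIT_LEAF_FRAME_POINTER
};

// Models the "Keep nonleaf frame pointers" block from the diff:
// if frame pointers are omitted everywhere, the leaf-only request is
// redundant and is dropped; otherwise a leaf-only request switches the
// general flag on so that it can take effect.
FramePointerOpts apply_final_overrider(FramePointerOpts o) {
  if (o.omit_frame_pointer)
    o.omit_leaf_frame_pointer = false;
  else if (o.omit_leaf_frame_pointer)
    o.omit_frame_pointer = true;
  return o;
}
```

The point of the commit is where this step runs: placing it in the function that both the global option path and the optimize-attribute path call means both paths see the same final value.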
[gcc r15-2043] RISC-V: Attribute parser: Use alloca() instead of new + std::unique_ptr
https://gcc.gnu.org/g:5040c273484d7123a40a99cdeb434cecbd17a2e9

commit r15-2043-g5040c273484d7123a40a99cdeb434cecbd17a2e9
Author: Christoph Müllner
Date:   Fri Jul 5 04:48:15 2024 +0200

    RISC-V: Attribute parser: Use alloca() instead of new + std::unique_ptr

    Allocating an object on the heap with new, wrapping it in a
    std::unique_ptr and finally getting the buffer via buf.get() is a
    correct way to allocate a buffer that is automatically freed on
    return. However, a simple invocation of alloca() does the same with
    less overhead.

    gcc/ChangeLog:

            * config/riscv/riscv-target-attr.cc (riscv_target_attr_parser::parse_arch):
            Replace new + std::unique_ptr by alloca().
            (riscv_process_one_target_attr): Likewise.
            (riscv_process_target_attr): Likewise.

    Signed-off-by: Christoph Müllner

Diff:
---
 gcc/config/riscv/riscv-target-attr.cc | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/gcc/config/riscv/riscv-target-attr.cc b/gcc/config/riscv/riscv-target-attr.cc
index 0bbe7df25d19..3d7753f64574 100644
--- a/gcc/config/riscv/riscv-target-attr.cc
+++ b/gcc/config/riscv/riscv-target-attr.cc
@@ -109,8 +109,7 @@ riscv_target_attr_parser::parse_arch (const char *str)
     {
       /* Parsing the extension list like "+[,+]*".  */
       size_t len = strlen (str);
-      std::unique_ptr buf (new char[len+1]);
-      char *str_to_check = buf.get ();
+      char *str_to_check = (char *) alloca (len + 1);
       strcpy (str_to_check, str);
       const char *token = strtok_r (str_to_check, ",", &str_to_check);
       m_subset_list = riscv_cmdline_subset_list ()->clone ();
@@ -247,8 +246,7 @@ riscv_process_one_target_attr (char *arg_str,
       return false;
     }
 
-  std::unique_ptr buf (new char[len+1]);
-  char *str_to_check = buf.get();
+  char *str_to_check = (char *) alloca (len + 1);
   strcpy (str_to_check, arg_str);
 
   char *arg = strchr (str_to_check, '=');
@@ -334,8 +332,7 @@ riscv_process_target_attr (tree fndecl, tree args, location_t loc,
       return false;
     }
 
-  std::unique_ptr buf (new char[len+1]);
-  char *str_to_check = buf.get ();
+  char *str_to_check = (char *) alloca (len + 1);
   strcpy (str_to_check, TREE_STRING_POINTER (args));
 
   /* Used to catch empty spaces between semi-colons i.e.
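The two allocation styles compared in this commit can be put side by side in a standalone sketch. This is not the RISC-V parser: `split_heap` and `split_stack` are names invented here, the comma-separated input is only an example, and the code assumes a platform that provides `alloca` via `<alloca.h>` and the POSIX `strtok_r` (as glibc does). One caveat the sketch cannot show is that `alloca` has no failure path, so it is only reasonable when the length is known to be small.

```cpp
#include <alloca.h>   // alloca: non-standard, assumed available (glibc)
#include <string.h>   // strlen, strcpy, strtok_r (POSIX)
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Tokenize `buf` in place on ',' and copy the tokens out.
static std::vector<std::string> tokenize(char* buf) {
  std::vector<std::string> out;
  char* save = nullptr;
  for (char* tok = strtok_r(buf, ",", &save); tok != nullptr;
       tok = strtok_r(nullptr, ",", &save))
    out.emplace_back(tok);
  return out;
}

// Heap variant: scratch copy owned by std::unique_ptr, freed on return.
std::vector<std::string> split_heap(const char* str) {
  const std::size_t len = strlen(str);
  std::unique_ptr<char[]> buf(new char[len + 1]);
  strcpy(buf.get(), str);
  return tokenize(buf.get());
}

// Stack variant: scratch copy from alloca, released when this function
// returns; no allocator call, but also no check against stack overflow.
std::vector<std::string> split_stack(const char* str) {
  const std::size_t len = strlen(str);
  char* buf = static_cast<char*>(alloca(len + 1));
  strcpy(buf, str);
  return tokenize(buf);
}
```

Both variants produce the same tokens; the difference is only where the temporary copy lives and what it costs to obtain it.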
[gcc(refs/users/meissner/heads/work171-bugs)] Do not add -mvsx when building or testing the float128 support.
https://gcc.gnu.org/g:5b62b27b2f30b678e2f9ba4f743ce080ea51f1dc

commit 5b62b27b2f30b678e2f9ba4f743ce080ea51f1dc
Author: Michael Meissner
Date:   Mon Jul 15 13:11:49 2024 -0400

    Do not add -mvsx when building or testing the float128 support.

    In the past, we would add -mvsx when building the float128 support in
    libgcc. This allowed us to build the float128 support on a big endian
    system where the default cpu is power4. While the libgcc support can be
    built there, no glibc support for float128 is available.

    However, adding -mvsx and building the libgcc float128 support causes
    problems if you set the default cpu to something like a 7450, which
    does not have VSX support. The assembler complains that when the code
    does a ".machine 7450", you cannot use VSX instructions.

    With these patches, the float128 libgcc support is only built if the
    default compiler has VSX support. If somebody wanted to enable the
    glibc support for big endian, they would need to set the base cpu to
    power8 to enable building the libgcc float128 libraries.

    In addition to the changes in libgcc, this patch also changes the GCC
    tests so that it will only test float128 if the default compiler
    enables the VSX instruction set. Otherwise all of the float128 tests
    will fail because the libgcc support is not available.

    2024-07-15  Michael Meissner

    gcc/testsuite/

            PR target/115800
            PR target/113652
            * lib/target-supports.exp (check_ppc_float128_sw_available): Do not add
            the -mvsx option.
            (check_effective_target___float128): Likewise.

    libgcc/

            PR target/115800
            PR target/113652
            * config.host (powerpc*-*-linux*): Do not add t-float128-hw or
            t-float128-p10-hw if the default compiler does not support float128.
            * config/rs6000/t-float128 (FP128_CFLAGS_SW): Do not add -mvsx when
            building the basic float128 support.
            * config/rs6000/t-float128-hw (FP128_CFLAGS_HW): Likewise.
            * config/rs6000/t-float128-p10-hw (FP128_3_1_CFLAGS_HW): Likewise.
            * configure.ac (powerpc*-*-linux*): Do not add -mvsx when testing
            whether to build the float128 support.
            * configure: Regenerate.

Diff:
---
 gcc/testsuite/lib/target-supports.exp  |  8
 libgcc/config.host                     | 12 ++--
 libgcc/config/rs6000/t-float128        |  8 +++-
 libgcc/config/rs6000/t-float128-hw     |  3 +--
 libgcc/config/rs6000/t-float128-p10-hw |  3 +--
 libgcc/configure                       |  8 +++-
 libgcc/configure.ac                    |  8 +++-
 7 files changed, 33 insertions(+), 17 deletions(-)

diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index b7df6150bcbd..ca5276873064 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -2979,7 +2979,7 @@ proc check_ppc_float128_sw_available { } {
              || [istarget *-*-darwin*]} {
             expr 0
         } else {
-            set options "-mfloat128 -mvsx"
+            set options "-mfloat128"
             check_runtime_nocache ppc_float128_sw_available {
                 volatile __float128 x = 1.0q;
                 volatile __float128 y = 2.0q;
@@ -3005,7 +3005,7 @@ proc check_ppc_float128_hw_available { } {
              || [istarget *-*-darwin*]} {
             expr 0
         } else {
-            set options "-mfloat128 -mvsx -mfloat128-hardware -mcpu=power9"
+            set options "-mfloat128 -mfloat128-hardware -mcpu=power9"
             check_runtime_nocache ppc_float128_hw_available {
                 volatile __float128 x = 1.0q;
                 volatile __float128 y = 2.0q;
@@ -3947,7 +3947,7 @@ proc check_effective_target___float128 { } {
 
 proc add_options_for___float128 { flags } {
     if { [istarget powerpc*-*-linux*] } {
-        return "$flags -mfloat128 -mvsx"
+        return "$flags -mfloat128"
     }
     return "$flags"
 }
@@ -7234,7 +7234,7 @@ proc check_effective_target_powerpc_float128_sw_ok { } {
                 __float128 z = x + y;
                 return (z == 3.0q);
             }
-        } "-mfloat128 -mvsx"]
+        } "-mfloat128"]
     } else {
         return 0
     }
diff --git a/libgcc/config.host b/libgcc/config.host
index 9fae51d4ce7d..261b08859a4d 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -1292,14 +1292,14 @@ powerpc*-*-linux*)
 
         if test $libgcc_cv_powerpc_float128 = yes; then
                 tmake_file="${tmake_file} rs6000/t-float128"
-        fi
 
-        if test $libgcc_cv_powerpc_float128_hw = yes; then
-                tmake_file="${tmake_file} rs6000/t-float128-hw"
-        fi
+                if test $libgcc_cv_powerpc_float128_hw = yes; then
+                        tmake_file="${tmake_file} rs6000/t-float128-hw"
 
-
[gcc(refs/users/meissner/heads/work171-bugs)] Update ChangeLog.*
https://gcc.gnu.org/g:3c3f8727179681e80cf28e57bbdb18f672c2e8d1

commit 3c3f8727179681e80cf28e57bbdb18f672c2e8d1
Author: Michael Meissner
Date:   Mon Jul 15 13:13:06 2024 -0400

    Update ChangeLog.*

Diff:
---
 gcc/ChangeLog.bugs | 48
 1 file changed, 48 insertions(+)

diff --git a/gcc/ChangeLog.bugs b/gcc/ChangeLog.bugs
index 076ccebfc45e..5b1df025f236 100644
--- a/gcc/ChangeLog.bugs
+++ b/gcc/ChangeLog.bugs
@@ -1,3 +1,51 @@
+ Branch work171-bugs, patch #320
+
+Do not add -mvsx when building or testing the float128 support.
+
+In the past, we would add -mvsx when building the float128 support in libgcc.
+This allowed us to build the float128 support on a big endian system where the
+default cpu is power4.  While the libgcc support can be built, given there is no
+glibc support for float128 available.
+
+However, adding -mvsx and building the libgcc float128 support causes problems
+if you set the default cpu to something like a 7540, which does not have VSX
+support.  The assembler complains that when the code does a ".machine 7450", you
+cannot use VSX instructions.
+
+With these patches, the float128 libgcc support is only built if the default
+compiler has VSX support.  If somebody wanted to enable the glibc support for
+big endian, they would need to set the base cpu to power8 to enable building the
+libgcc float128 libraries.
+
+In addition to the changes in libgcc, this patch also changes the GCC tests so
+that it will only test float128 if the default compiler enables the VSX
+instruction set.  Otherwise all of the float128 tests will fail because the
+libgcc support is not available.
+
+2024-07-15  Michael Meissner
+
+gcc/testsuite/
+
+       PR target/115800
+       PR target/113652
+       * lib/target-supports.exp (check_ppc_float128_sw_available): Do not add
+       the -mvsx option.
+       (check_effective_target___float128): Likewise.
+
+libgcc/
+
+       PR target/115800
+       PR target/113652
+       * config.host (powerpc*-*-linux*): Do not add t-float128-hw or
+       t-float128-p10-hw if the default compiler does not support float128.
+       * config/rs6000/t-float128 (FP128_CFLAGS_SW): Do not add -mvsx when
+       building the basic float128 support.
+       * config/rs6000/t-float128-hw (FP128_CFLAGS_HW): Likewise.
+       * config/rs6000/t-float128-p10-hw (FP128_3_1_CFLAGS_HW): Likewise.
+       * configure.ac (powerpc*-*-linux*): Do not add -mvsx when testing
+       whether to build the float128 support.
+       * configure: Regenerate.
+
  Branch work171-bugs, patch #311 was reverted
  Branch work171-bugs, patch #310 was reverted
  Branch work171-bugs, patch #304 was reverted
[gcc r14-10423] Fortran: improve attribute conflict checking [PR93635]
https://gcc.gnu.org/g:71ec9ed7a7353f66d55b034a45336bf43a026b1d commit r14-10423-g71ec9ed7a7353f66d55b034a45336bf43a026b1d Author: Harald Anlauf Date: Thu May 23 21:13:00 2024 +0200 Fortran: improve attribute conflict checking [PR93635] gcc/fortran/ChangeLog: PR fortran/93635 * symbol.cc (conflict_std): Helper function for reporting attribute conflicts depending on the Fortran standard version. (conf_std): Helper macro for checking standard-dependent conflicts. (gfc_check_conflict): Use it. gcc/testsuite/ChangeLog: PR fortran/93635 * gfortran.dg/c-interop/c1255-2.f90: Adjust pattern. * gfortran.dg/pr87907.f90: Likewise. * gfortran.dg/pr93635.f90: New test. Co-authored-by: Steven G. Kargl (cherry picked from commit 9561cf550a66a89e7c8d31202a03c4fddf82a3f2) Diff: --- gcc/fortran/symbol.cc | 63 +++-- gcc/testsuite/gfortran.dg/c-interop/c1255-2.f90 | 4 +- gcc/testsuite/gfortran.dg/pr87907.f90 | 8 ++-- gcc/testsuite/gfortran.dg/pr93635.f90 | 19 4 files changed, 54 insertions(+), 40 deletions(-) diff --git a/gcc/fortran/symbol.cc b/gcc/fortran/symbol.cc index 0a1646def678..5db3c887127b 100644 --- a/gcc/fortran/symbol.cc +++ b/gcc/fortran/symbol.cc @@ -407,18 +407,36 @@ gfc_check_function_type (gfc_namespace *ns) / Symbol attribute stuff */ +/* Older standards produced conflicts for some attributes that are allowed + in newer standards. Check for the conflict and issue an error depending + on the standard in play. */ + +static bool +conflict_std (int standard, const char *a1, const char *a2, const char *name, + locus *where) +{ + if (name == NULL) +{ + return gfc_notify_std (standard, "%s attribute conflicts " +"with %s attribute at %L", a1, a2, +where); +} + else +{ + return gfc_notify_std (standard, "%s attribute conflicts " +"with %s attribute in %qs at %L", +a1, a2, name, where); +} +} + /* This is a generic conflict-checker. We do this to avoid having a single conflict in two places. 
*/ #define conf(a, b) if (attr->a && attr->b) { a1 = a; a2 = b; goto conflict; } #define conf2(a) if (attr->a) { a2 = a; goto conflict; } -#define conf_std(a, b, std) if (attr->a && attr->b)\ - {\ -a1 = a;\ -a2 = b;\ -standard = std;\ -goto conflict_std;\ - } +#define conf_std(a, b, std) if (attr->a && attr->b \ + && !conflict_std (std, a, b, name, where)) \ + return false; bool gfc_check_conflict (symbol_attribute *attr, const char *name, locus *where) @@ -451,7 +469,6 @@ gfc_check_conflict (symbol_attribute *attr, const char *name, locus *where) "OACC DECLARE DEVICE_RESIDENT"; const char *a1, *a2; - int standard; if (attr->artificial) return true; @@ -460,20 +477,10 @@ gfc_check_conflict (symbol_attribute *attr, const char *name, locus *where) where = &gfc_current_locus; if (attr->pointer && attr->intent != INTENT_UNKNOWN) -{ - a1 = pointer; - a2 = intent; - standard = GFC_STD_F2003; - goto conflict_std; -} +conf_std (pointer, intent, GFC_STD_F2003); - if (attr->in_namelist && (attr->allocatable || attr->pointer)) -{ - a1 = in_namelist; - a2 = attr->allocatable ? allocatable : pointer; - standard = GFC_STD_F2003; - goto conflict_std; -} + conf_std (in_namelist, allocatable, GFC_STD_F2003); + conf_std (in_namelist, pointer, GFC_STD_F2003); /* Check for attributes not allowed in a BLOCK DATA. 
*/ if (gfc_current_state () == COMP_BLOCK_DATA) @@ -922,20 +929,6 @@ conflict: a1, a2, name, where); return false; - -conflict_std: - if (name == NULL) -{ - return gfc_notify_std (standard, "%s attribute conflicts " - "with %s attribute at %L", a1, a2, - where); -} - else -{ - return gfc_notify_std (standard, "%s attribute conflicts " -"with %s attribute in %qs at %L", - a1, a2, name, where); -} } #undef conf diff --git a/gcc/testsuite/gfortran.dg/c-interop/c1255-2.f90 b/gcc/testsuite/gfortran.dg/c-interop/c1255-2.f90 index 0e5505a01835..feed2e7645fd 100644 --- a/gcc/testsuite/gfortran.dg/c-interop/c1255-2.f90 +++ b/gcc/testsuite/gfortran.dg/c-interop/c1255-2.f90 @@ -92,12 +92,12 @@ module m2 end function ! function result is a
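The refactoring above replaces a shared `goto conflict_std` label with a helper that reports the conflict and a macro that returns from the caller only when the helper signals an error. A minimal sketch of that shape follows; `notify_std`, `attrs`, `CONF_STD` and `check_conflict` are hypothetical stand-ins rather than gfortran's own definitions, and the macro stringifies its arguments as a simplification where the original passes attribute-name strings.

```cpp
#include <cstdio>

// Hypothetical stand-in for gfc_notify_std: returns true when the
// construct is accepted under the selected standard, false on error.
static bool notify_std (bool allowed_by_std, const char *a1, const char *a2)
{
  if (!allowed_by_std)
    std::fprintf (stderr, "%s attribute conflicts with %s attribute\n",
                  a1, a2);
  return allowed_by_std;
}

// Hypothetical attribute set with just the two flags used below.
struct attrs
{
  bool pointer;
  bool intent;
};

// Same shape as the new conf_std: test both flags, call the helper, and
// leave the enclosing function only if the helper reports an error.
#define CONF_STD(a, b, ok)                                  \
  if (attr->a && attr->b && !notify_std (ok, #a, #b))       \
    return false;

bool check_conflict (const attrs *attr, bool std_allows)
{
  CONF_STD (pointer, intent, std_allows)
  return true;
}
```

Because the helper's return value decides whether the caller returns, checking can continue past a conflict that the selected standard permits, which a jump to a terminal label could not do.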
[gcc r15-2046] Revert "RISC-V: Attribute parser: Use alloca() instead of new + std::unique_ptr"
https://gcc.gnu.org/g:eb0c163aada970b8351067b17121f013fc58dbc9

commit r15-2046-geb0c163aada970b8351067b17121f013fc58dbc9
Author: Christoph Müllner
Date:   Mon Jul 15 23:42:39 2024 +0200

    Revert "RISC-V: Attribute parser: Use alloca() instead of new + std::unique_ptr"

    This reverts commit 5040c273484d7123a40a99cdeb434cecbd17a2e9.

Diff:
---
 gcc/config/riscv/riscv-target-attr.cc | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/gcc/config/riscv/riscv-target-attr.cc b/gcc/config/riscv/riscv-target-attr.cc
index 57235c9c0a7e..1645a6692177 100644
--- a/gcc/config/riscv/riscv-target-attr.cc
+++ b/gcc/config/riscv/riscv-target-attr.cc
@@ -101,7 +101,8 @@ riscv_target_attr_parser::parse_arch (const char *str)
     {
       /* Parsing the extension list like "+[,+]*". */
       size_t len = strlen (str);
-      char *str_to_check = (char *) alloca (len + 1);
+      std::unique_ptr<char[]> buf (new char[len+1]);
+      char *str_to_check = buf.get ();
       strcpy (str_to_check, str);
       const char *token = strtok_r (str_to_check, ",", &str_to_check);
       const char *local_arch_str = global_options.x_riscv_arch_string;
@@ -253,7 +254,8 @@ riscv_process_one_target_attr (char *arg_str,
       return false;
     }
 
-  char *str_to_check = (char *) alloca (len + 1);
+  std::unique_ptr<char[]> buf (new char[len+1]);
+  char *str_to_check = buf.get();
   strcpy (str_to_check, arg_str);
 
   char *arg = strchr (str_to_check, '=');
@@ -339,7 +341,8 @@ riscv_process_target_attr (tree args, location_t loc)
       return false;
     }
 
-  char *str_to_check = (char *) alloca (len + 1);
+  std::unique_ptr<char[]> buf (new char[len+1]);
+  char *str_to_check = buf.get ();
   strcpy (str_to_check, TREE_STRING_POINTER (args));
 
   /* Used to catch empty spaces between semi-colons i.e.
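The revert above restores a heap-backed scratch buffer in place of `alloca`. A minimal standalone sketch of that pattern follows, assuming a POSIX `strtok_r`; `split_on_commas` is a hypothetical helper for illustration, not a function from riscv-target-attr.cc.

```cpp
#include <cstring>
#include <memory>
#include <string>
#include <vector>
#include <string.h>  // strtok_r (POSIX)

// Hypothetical helper: split STR on commas using a writable heap copy.
std::vector<std::string> split_on_commas (const char *str)
{
  size_t len = strlen (str);
  // The unique_ptr owns the buffer, so it is released on every return
  // path, and its size is not limited by the available stack space.
  std::unique_ptr<char[]> buf (new char[len + 1]);
  char *str_to_check = buf.get ();
  strcpy (str_to_check, str);

  std::vector<std::string> tokens;
  char *save = nullptr;
  // strtok_r writes NUL bytes into the copy, never into STR itself.
  for (char *tok = strtok_r (str_to_check, ",", &save); tok != nullptr;
       tok = strtok_r (nullptr, ",", &save))
    tokens.push_back (tok);
  return tokens;
}
```

Keeping the raw `char *` separate from the owning `buf` matters here: the tokenizer may advance or overwrite the raw pointer, while `buf` still refers to the start of the allocation when it is destroyed.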
[gcc r15-2047] c++: alias template with dependent attributes [PR115897]
https://gcc.gnu.org/g:7954bb4fcb6fa80f6bb840133314885011821188 commit r15-2047-g7954bb4fcb6fa80f6bb840133314885011821188 Author: Patrick Palka Date: Mon Jul 15 18:07:55 2024 -0400 c++: alias template with dependent attributes [PR115897] Here we're prematurely stripping the dependent alias template-id A to its defining-type-id T when used as a template argument, which in turn causes us to essentially ignore A's vector_size attribute in the outer template-id. This has always been a problem for class template-ids it seems, and after r14-2170 variable template-ids are affected as well. This patch marks alias templates that have a dependent attribute as complex (as with e.g. constrained alias templates) so that we don't look through them prematurely. PR c++/115897 gcc/cp/ChangeLog: * pt.cc (complex_alias_template_p): Return true for an alias template with attributes. (get_underlying_template): Don't look through an alias template with attributes. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/alias-decl-77.C: New test. Reviewed-by: Jason Merrill Diff: --- gcc/cp/pt.cc | 10 ++ gcc/testsuite/g++.dg/cpp0x/alias-decl-77.C | 32 ++ 2 files changed, 42 insertions(+) diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc index e38e02488be1..4d72ff60cb8e 100644 --- a/gcc/cp/pt.cc +++ b/gcc/cp/pt.cc @@ -6628,6 +6628,11 @@ complex_alias_template_p (const_tree tmpl, tree *seen_out) if (get_constraints (tmpl)) return true; + /* An alias with dependent type attributes is complex. */ + if (any_dependent_type_attributes_p (DECL_ATTRIBUTES + (DECL_TEMPLATE_RESULT (tmpl +return true; + if (!complex_alias_tmpl_info) complex_alias_tmpl_info = hash_map::create_ggc (13); @@ -6780,6 +6785,11 @@ get_underlying_template (tree tmpl) if (!at_least_as_constrained (underlying, tmpl)) break; + /* If TMPL adds dependent type attributes, it isn't equivalent. */ + if (any_dependent_type_attributes_p (DECL_ATTRIBUTES + (DECL_TEMPLATE_RESULT (tmpl + break; + /* Alias is equivalent. Strip it and repeat. 
*/ tmpl = underlying; } diff --git a/gcc/testsuite/g++.dg/cpp0x/alias-decl-77.C b/gcc/testsuite/g++.dg/cpp0x/alias-decl-77.C new file mode 100644 index ..f72e4cc26538 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp0x/alias-decl-77.C @@ -0,0 +1,32 @@ +// PR c++/115897 +// { dg-do compile { target c++11 } } + +template +struct is_same { static constexpr bool value = __is_same(T, U); }; + +#if __cpp_variable_templates +template +constexpr bool is_same_v = __is_same(T, U); +#endif + +template +using A [[gnu::vector_size(16)]] = T; + +template +using B = T; + +template +using C [[gnu::vector_size(16)]] = B; + +template +void f() { + static_assert(!is_same>::value, ""); + static_assert(is_same, A>::value, ""); + +#if __cpp_variable_templates + static_assert(!is_same_v>, ""); + static_assert(is_same_v, A>, ""); +#endif +}; + +template void f();
[gcc r15-2048] Fix sign/carry bit handling in ext-dce.
https://gcc.gnu.org/g:94b21f13763638f64e83e7f9959c7f1523b9eaed commit r15-2048-g94b21f13763638f64e83e7f9959c7f1523b9eaed Author: Jeff Law Date: Mon Jul 15 16:57:44 2024 -0600 Fix sign/carry bit handling in ext-dce. My change to fix a ubsan issue broke handling propagation of the carry/sign bit down through a right shift. Thanks to Andreas for the analysis and proposed fix and Sergei for the testcase. PR rtl-optimization/115876 PR rtl-optimization/115916 gcc/ * ext-dce.cc (carry_backpropagate): Make return type unsigned as well. Cast to signed for right shift to preserve sign bit. gcc/testsuite/ * g++.dg/torture/pr115916.C: New test. Co-author: Andreas Schwab Co-author: Sergei Trofimovich Diff: --- gcc/ext-dce.cc | 4 +- gcc/testsuite/g++.dg/torture/pr115916.C | 90 + 2 files changed, 92 insertions(+), 2 deletions(-) diff --git a/gcc/ext-dce.cc b/gcc/ext-dce.cc index 91789d283fcd..2869a389c3aa 100644 --- a/gcc/ext-dce.cc +++ b/gcc/ext-dce.cc @@ -373,7 +373,7 @@ binop_implies_op2_fully_live (rtx_code code) binop_implies_op2_fully_live (e.g. shifts), the computed mask may exclusively pertain to the first operand. 
*/ -HOST_WIDE_INT +unsigned HOST_WIDE_INT carry_backpropagate (unsigned HOST_WIDE_INT mask, enum rtx_code code, rtx x) { if (mask == 0) @@ -393,7 +393,7 @@ carry_backpropagate (unsigned HOST_WIDE_INT mask, enum rtx_code code, rtx x) case ASHIFT: if (CONSTANT_P (XEXP (x, 1)) && known_lt (UINTVAL (XEXP (x, 1)), GET_MODE_BITSIZE (mode))) - return mask >> INTVAL (XEXP (x, 1)); + return (HOST_WIDE_INT)mask >> INTVAL (XEXP (x, 1)); return (2ULL << floor_log2 (mask)) - 1; /* We propagate for the shifted operand, but not the shift diff --git a/gcc/testsuite/g++.dg/torture/pr115916.C b/gcc/testsuite/g++.dg/torture/pr115916.C new file mode 100644 index ..3d788678eaa3 --- /dev/null +++ b/gcc/testsuite/g++.dg/torture/pr115916.C @@ -0,0 +1,90 @@ +/* { dg-do run } */ + +#include +#include + +struct ve { +ve() = default; +ve(const ve&) = default; +ve& operator=(const ve&) = default; + +// note that the code usually uses the first half of this array +uint8_t raw[16] = {}; +}; + +static ve First8_(void) { +ve m; +__builtin_memset(m.raw, 0xff, 8); +return m; +} + +static ve And_(ve a, ve b) { +ve au; +__builtin_memcpy(au.raw, a.raw, 16); +for (size_t i = 0; i < 8; ++i) { +au.raw[i] &= b.raw[i]; +} +return au; +} + +__attribute__((noipa, optimize(0))) +static void vec_assert(ve a) { +if (a.raw[6] != 0x06 && a.raw[6] != 0x07) +__builtin_trap(); +} + +static ve Reverse4_(ve v) { +ve ret; +for (size_t i = 0; i < 8; i += 4) { +ret.raw[i + 0] = v.raw[i + 3]; +ret.raw[i + 1] = v.raw[i + 2]; +ret.raw[i + 2] = v.raw[i + 1]; +ret.raw[i + 3] = v.raw[i + 0]; +} +return ret; +} + +static ve DupEven_(ve v) { +for (size_t i = 0; i < 8; i += 2) { +v.raw[i + 1] = v.raw[i]; +} +return v; +} + +template +ve Per4LaneBlockShuffle_(ve v) { +if (b) { +return Reverse4_(v); +} else { +return DupEven_(v); +} +} + +template +static inline __attribute__((always_inline)) void DoTestPer4LaneBlkShuffle(const ve v) { +ve actual = Per4LaneBlockShuffle_(v); +const auto valid_lanes_mask = First8_(); +ve actual_masked 
= And_(valid_lanes_mask, actual); +vec_assert(actual_masked); +} + +static void DoTestPer4LaneBlkShuffles(const ve v) { +alignas(128) uint8_t src_lanes[8]; +__builtin_memcpy(src_lanes, v.raw, 8); +// need both, hm +DoTestPer4LaneBlkShuffle(v); +DoTestPer4LaneBlkShuffle(v); +} + +__attribute__((noipa, optimize(0))) +static void bug(void) { + uint8_t iv[8] = {1,2,3,4,5,6,7,8}; + ve v; + __builtin_memcpy(v.raw, iv, 8); + DoTestPer4LaneBlkShuffles(v); +} + +int main(void) { +bug(); +} +
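The one-line change in `carry_backpropagate` hinges on the difference between right-shifting an unsigned and a signed value. A small sketch of the two behaviours follows, using 64-bit fixed-width types as an assumed stand-in for `HOST_WIDE_INT`; the function names are hypothetical.

```cpp
#include <cstdint>

// Logical shift: the vacated high bits are filled with zeros, so a set
// top bit in MASK is lost.
uint64_t shift_logical (uint64_t mask, unsigned count)
{
  return mask >> count;
}

// Arithmetic shift: cast to signed first so the top bit is replicated
// into the vacated positions.  This is guaranteed since C++20; earlier
// standards leave it implementation-defined, and GCC shifts
// arithmetically.
uint64_t shift_arithmetic (uint64_t mask, unsigned count)
{
  return static_cast<uint64_t> (static_cast<int64_t> (mask) >> count);
}
```

With only the top bit set and a count of 4, the logical form yields `0x0800000000000000` while the arithmetic form yields `0xF800000000000000`, which is the sign-bit propagation the commit message refers to.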
[gcc(refs/users/meissner/heads/work171-bugs)] Do not add -mfloat128 to the tests.
https://gcc.gnu.org/g:9cbff1336d43a8c13fc6d77ee93d165f6bc86de6 commit 9cbff1336d43a8c13fc6d77ee93d165f6bc86de6 Author: Michael Meissner Date: Mon Jul 15 19:06:08 2024 -0400 Do not add -mfloat128 to the tests. 2024-07-15 Michael Meissner gcc/testsuite/ PR target/115800 PR target/113652 * lib/target-supports.exp (check_ppc_float128_sw_available): Do not add -mfloat128 on PowerPC tests. (check_ppc_float128_hw_available): Likewise. (check_effective_target_ppc_ieee128_ok): Likewise. (add_options_for___float128): Likewise. (check_effective_target_power10_ok): Likewise. (check_effective_target_powerpc_float128_sw_ok): Likewise. (check_effective_target_powerpc_float128_hw_ok): Likewise. Diff: --- gcc/testsuite/lib/target-supports.exp | 18 +++--- 1 file changed, 7 insertions(+), 11 deletions(-) diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp index ca5276873064..8ab8f6a10586 100644 --- a/gcc/testsuite/lib/target-supports.exp +++ b/gcc/testsuite/lib/target-supports.exp @@ -2979,7 +2979,6 @@ proc check_ppc_float128_sw_available { } { || [istarget *-*-darwin*]} { expr 0 } else { - set options "-mfloat128" check_runtime_nocache ppc_float128_sw_available { volatile __float128 x = 1.0q; volatile __float128 y = 2.0q; @@ -3005,7 +3004,7 @@ proc check_ppc_float128_hw_available { } { || [istarget *-*-darwin*]} { expr 0 } else { - set options "-mfloat128 -mfloat128-hardware -mcpu=power9" + set options "-mfloat128-hardware -mcpu=power9" check_runtime_nocache ppc_float128_hw_available { volatile __float128 x = 1.0q; volatile __float128 y = 2.0q; @@ -3030,7 +3029,6 @@ proc check_effective_target_ppc_ieee128_ok { } { || [istarget *-*-vxworks*]} { expr 0 } else { - set options "-mfloat128" check_runtime_nocache ppc_ieee128_ok { int main() { @@ -3946,9 +3944,6 @@ proc check_effective_target___float128 { } { } proc add_options_for___float128 { flags } { -if { [istarget powerpc*-*-linux*] } { - return "$flags -mfloat128" -} return "$flags" } @@ 
-7217,8 +7212,9 @@ proc check_effective_target_power10_ok { } { } } -# Return 1 if this is a PowerPC target supporting -mfloat128 via either -# software emulation on power7/power8 systems or hardware support on power9. +# Return 1 if this is a PowerPC target supporting IEEE 128-bit floating point +# via either software emulation on power7/power8 systems or hardware support on +# power9. proc check_effective_target_powerpc_float128_sw_ok { } { if { [istarget powerpc*-*-*] @@ -7234,14 +7230,14 @@ proc check_effective_target_powerpc_float128_sw_ok { } { __float128 z = x + y; return (z == 3.0q); } - } "-mfloat128"] + }] } else { return 0 } } -# Return 1 if this is a PowerPC target supporting -mfloat128 via hardware -# support on power9. +# Return 1 if this is a PowerPC target supporting IEEE 128-bit floating point +# via hardware support on power9. proc check_effective_target_powerpc_float128_hw_ok { } { if { [istarget powerpc*-*-*]
[gcc(refs/users/meissner/heads/work171-bugs)] Update ChangeLog.*
https://gcc.gnu.org/g:befe8a6d0c46daad90a4c3631c00dd5f40a6b1c4

commit befe8a6d0c46daad90a4c3631c00dd5f40a6b1c4
Author: Michael Meissner
Date:   Mon Jul 15 19:06:53 2024 -0400

    Update ChangeLog.*

Diff:
---
 gcc/ChangeLog.bugs | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/gcc/ChangeLog.bugs b/gcc/ChangeLog.bugs
index 5b1df025f236..7a40e702d934 100644
--- a/gcc/ChangeLog.bugs
+++ b/gcc/ChangeLog.bugs
@@ -1,3 +1,22 @@
+ Branch work171-bugs, patch #321
+
+Do not add -mfloat128 to the tests.
+
+2024-07-15 Michael Meissner
+
+gcc/testsuite/
+
+        PR target/115800
+        PR target/113652
+        * lib/target-supports.exp (check_ppc_float128_sw_available): Do not add
+        -mfloat128 on PowerPC tests.
+        (check_ppc_float128_hw_available): Likewise.
+        (check_effective_target_ppc_ieee128_ok): Likewise.
+        (add_options_for___float128): Likewise.
+        (check_effective_target_power10_ok): Likewise.
+        (check_effective_target_powerpc_float128_sw_ok): Likewise.
+        (check_effective_target_powerpc_float128_hw_ok): Likewise.
+
  Branch work171-bugs, patch #320
 
 Do not add -mvsx when building or testing the float128 support.
[gcc r15-2049] Fix liveness computation for shift/rotate counts in ext-dce
https://gcc.gnu.org/g:b31b8af807f5459674b0b310cb62a5bc81b676e7

commit r15-2049-gb31b8af807f5459674b0b310cb62a5bc81b676e7
Author: Jeff Law
Date:   Mon Jul 15 18:15:33 2024 -0600

    Fix liveness computation for shift/rotate counts in ext-dce

    So as I've noted before I believe the control flow in ext-dce.cc is
    horribly messy. While investigating a fix for 115877 I came across
    another problem related to control flow handling.

    Specifically, if we have an binary op which implies the 2nd operand is
    fully live, then we'd actually fail to mark that operand as live. We
    essentially broke out of the loop which was supposed to be safe. But Y
    was a REG and if Y is a REG or CONST_INT we skip sub-rtxs and thus
    failed to process that operand (the shift count) at all.

    Rather than muck around with control flow, we can just set all the bits
    as live in DST_MASK and let normal processing continue. With all the
    bits live IN DST_MASK all the bits implied by the mode of the argument
    will also be live.

    No testcase. Bootstrapped and regression tested on x86. Pushing to the
    trunk.

    gcc/
            * ext-dce.cc (ext_dce_process_uses): Simplify control flow and
            fix liveness computation for shift/rotate counts.

Diff:
---
 gcc/ext-dce.cc | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/gcc/ext-dce.cc b/gcc/ext-dce.cc
index 2869a389c3aa..6c961feee635 100644
--- a/gcc/ext-dce.cc
+++ b/gcc/ext-dce.cc
@@ -632,10 +632,11 @@ ext_dce_process_uses (rtx_insn *insn, rtx obj, bitmap live_tmp)
               else if (!CONSTANT_P (y))
                 break;
 
-              /* We might have (ashift (const_int 1) (reg...)) */
-              /* XXX share this logic with code below. */
+              /* We might have (ashift (const_int 1) (reg...))
+                 By setting dst_mask we can continue iterating on the
+                 the next operand and it will be considered fully live. */
               if (binop_implies_op2_fully_live (GET_CODE (src)))
-                break;
+                dst_mask = -1;
 
               /* If this was anything but a binary operand, break the inner
                  loop. This is conservatively correct as it will cause the
[gcc r15-2051] libbacktrace: support FDPIC
https://gcc.gnu.org/g:c6803cdaba7a7bf933e52cd36f901430253cc2b0 commit r15-2051-gc6803cdaba7a7bf933e52cd36f901430253cc2b0 Author: Ian Lance Taylor Date: Mon Jul 15 17:27:18 2024 -0700 libbacktrace: support FDPIC Based on patch by Max Filippov. * internal.h: If FDPIC, #include and/or . (libbacktrace_using_fdpic): Define. (struct libbacktrace_base_address): Define. (libbacktrace_add_base): Define. (backtrace_dwarf_add): Change base_address to struct libbacktrace_base_address. * dwarf.c (struct dwarf_data): Change base_address to struct libbacktrace_base_address. (add_ranges, find_address_ranges, build_ddress_map): Likewise. (build_dwarf_data, build_dwarf_add): Likewise. (add_low_high_range): Change base_address to struct libbacktrace_base_address. Use libbacktrace_add_base. (add_ranges_from_ranges, add_ranges_from_rnglists): Likewise. (add_line): Use libbacktrace_add_base. * elf.c (elf_initialize_syminfo): Change base_address to struct libbacktrace_base_address. Use libbacktrace_add_base. (elf_add): Change base_address to struct libbacktrace_base_address. (phdr_callback): Likewise. Initialize base_address.m. (backtrace_initialize): If using FDPIC, don't call elf_add with main executable; always use dl_iterate_phdr. * macho.c (macho_add_symtab): Change base_address to struct libbacktrace_base_address. Use libbacktrace_add_base. (macho_syminfo): Change base_address to struct libbacktrace_base_address. (macho_add_fat, macho_add_dsym, macho_add): Likewise. (backtrace_initialize): Likewise. Initialize base_address.m. * pecoff.c (coff_initialize_syminfo): Change base_address to struct libbacktrace_base_address. Use libbacktrace_add_base. (coff_add): Change base_address to struct libbacktrace_base_address. Initialize base_address.m. 
Diff: --- libbacktrace/dwarf.c| 63 +++-- libbacktrace/elf.c | 31 +--- libbacktrace/internal.h | 36 +++- libbacktrace/macho.c| 25 libbacktrace/pecoff.c | 30 --- 5 files changed, 123 insertions(+), 62 deletions(-) diff --git a/libbacktrace/dwarf.c b/libbacktrace/dwarf.c index cc36a0a2990d..96ffc4cc481b 100644 --- a/libbacktrace/dwarf.c +++ b/libbacktrace/dwarf.c @@ -388,8 +388,8 @@ struct dwarf_data struct dwarf_data *next; /* The data for .gnu_debugaltlink. */ struct dwarf_data *altlink; - /* The base address for this file. */ - uintptr_t base_address; + /* The base address mapping for this file. */ + struct libbacktrace_base_address base_address; /* A sorted list of address ranges. */ struct unit_addrs *addrs; /* Number of address ranges in list. */ @@ -1610,8 +1610,9 @@ update_pcrange (const struct attr* attr, const struct attr_val* val, static int add_low_high_range (struct backtrace_state *state, const struct dwarf_sections *dwarf_sections, - uintptr_t base_address, int is_bigendian, - struct unit *u, const struct pcrange *pcrange, + struct libbacktrace_base_address base_address, + int is_bigendian, struct unit *u, + const struct pcrange *pcrange, int (*add_range) (struct backtrace_state *state, void *rdata, uintptr_t lowpc, uintptr_t highpc, @@ -1646,8 +1647,8 @@ add_low_high_range (struct backtrace_state *state, /* Add in the base address of the module when recording PC values, so that we can look up the PC directly. 
*/ - lowpc += base_address; - highpc += base_address; + lowpc = libbacktrace_add_base (lowpc, base_address); + highpc = libbacktrace_add_base (highpc, base_address); return add_range (state, rdata, lowpc, highpc, error_callback, data, vec); } @@ -1659,7 +1660,7 @@ static int add_ranges_from_ranges ( struct backtrace_state *state, const struct dwarf_sections *dwarf_sections, -uintptr_t base_address, int is_bigendian, +struct libbacktrace_base_address base_address, int is_bigendian, struct unit *u, uintptr_t base, const struct pcrange *pcrange, int (*add_range) (struct backtrace_state *state, void *rdata, @@ -1705,10 +1706,11 @@ add_ranges_from_ranges ( base = (uintptr_t) high; else { - if (!add_range (state, rdata, - (uintptr_t) low + base + base_address, - (uintptr_t) high + base + base_address, - error_callback, data, vec)) + uin
[gcc r15-2052] i386: extend trunc{128}2{16,32,64}'s scope.
https://gcc.gnu.org/g:a902e35396d68f10bd27477153fafa4f5ac9c319 commit r15-2052-ga902e35396d68f10bd27477153fafa4f5ac9c319 Author: Hu, Lin1 Date: Thu Jul 11 15:03:22 2024 +0800 i386: extend trunc{128}2{16,32,64}'s scope. Based on actual usage, trunc{128}2{16,32,64} use some instructions from sse/sse3, so extend their scope to extend the scope of optimization. gcc/ChangeLog: PR target/107432 * config/i386/sse.md (PMOV_SRC_MODE_3_AVX2): Add TARGET_AVX2 for V4DI and V8SI. (PMOV_SRC_MODE_4): Add TARGET_AVX2 for V4DI. (trunc2): Change constraint from TARGET_AVX2 to TARGET_SSSE3. (trunc2): Ditto. (truncv2div2si2): Change constraint from TARGET_AVX2 to TARGET_SSE. gcc/testsuite/ChangeLog: PR target/107432 * gcc.target/i386/pr107432-10.c: New test. Diff: --- gcc/config/i386/sse.md | 11 gcc/testsuite/gcc.target/i386/pr107432-10.c | 41 + 2 files changed, 47 insertions(+), 5 deletions(-) diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index c134494cd200..e44822f705b4 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -15000,7 +15000,8 @@ "TARGET_AVX512VL") (define_mode_iterator PMOV_SRC_MODE_3 [V4DI V2DI V8SI V4SI (V8HI "TARGET_AVX512BW")]) -(define_mode_iterator PMOV_SRC_MODE_3_AVX2 [V4DI V2DI V8SI V4SI V8HI]) +(define_mode_iterator PMOV_SRC_MODE_3_AVX2 + [(V4DI "TARGET_AVX2") V2DI (V8SI "TARGET_AVX2") V4SI V8HI]) (define_mode_attr pmov_dst_3_lower [(V4DI "v4qi") (V2DI "v2qi") (V8SI "v8qi") (V4SI "v4qi") (V8HI "v8qi")]) (define_mode_attr pmov_dst_3 @@ -15014,7 +15015,7 @@ [(set (match_operand: 0 "register_operand") (truncate: (match_operand:PMOV_SRC_MODE_3_AVX2 1 "register_operand")))] - "TARGET_AVX2" + "TARGET_SSSE3" { if (TARGET_AVX512VL && (mode != V8HImode || TARGET_AVX512BW)) @@ -15390,7 +15391,7 @@ (match_dup 2)))] "operands[0] = adjust_address_nv (operands[0], V8QImode, 0);") -(define_mode_iterator PMOV_SRC_MODE_4 [V4DI V2DI V4SI]) +(define_mode_iterator PMOV_SRC_MODE_4 [(V4DI "TARGET_AVX2") V2DI V4SI]) (define_mode_attr pmov_dst_4 [(V4DI 
"V4HI") (V2DI "V2HI") (V4SI "V4HI")]) (define_mode_attr pmov_dst_4_lower @@ -15404,7 +15405,7 @@ [(set (match_operand: 0 "register_operand") (truncate: (match_operand:PMOV_SRC_MODE_4 1 "register_operand")))] - "TARGET_AVX2" + "TARGET_SSSE3" { if (TARGET_AVX512VL) { @@ -15659,7 +15660,7 @@ [(set (match_operand:V2SI 0 "register_operand") (truncate:V2SI (match_operand:V2DI 1 "register_operand")))] - "TARGET_AVX2" + "TARGET_SSE" { if (TARGET_AVX512VL) { diff --git a/gcc/testsuite/gcc.target/i386/pr107432-10.c b/gcc/testsuite/gcc.target/i386/pr107432-10.c new file mode 100644 index ..57edf7cfc781 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr107432-10.c @@ -0,0 +1,41 @@ +/* { dg-do compile } */ +/* { dg-options "-march=x86-64-v2 -O2" } */ +/* { dg-final { scan-assembler-times "shufps" 1 } } */ +/* { dg-final { scan-assembler-times "pshufb" 5 } } */ + +#include + +typedef short __v2hi __attribute__ ((__vector_size__ (4))); +typedef char __v2qi __attribute__ ((__vector_size__ (2))); +typedef char __v4qi __attribute__ ((__vector_size__ (4))); +typedef char __v8qi __attribute__ ((__vector_size__ (8))); + +__v2si mm_cvtepi64_epi32_builtin_convertvector(__v2di a) +{ + return __builtin_convertvector((__v2di)a, __v2si); +} + +__v2hi mm_cvtepi64_epi16_builtin_convertvector(__m128i a) +{ + return __builtin_convertvector((__v2di)a, __v2hi); +} + +__v4hi mm_cvtepi32_epi16_builtin_convertvector(__m128i a) +{ + return __builtin_convertvector((__v4si)a, __v4hi); +} + +__v2qi mm_cvtepi64_epi8_builtin_convertvector(__m128i a) +{ + return __builtin_convertvector((__v2di)a, __v2qi); +} + +__v4qi mm_cvtepi32_epi8_builtin_convertvector(__m128i a) +{ + return __builtin_convertvector((__v4si)a, __v4qi); +} + +__v8qi mm_cvtepi16_epi8_builtin_convertvector(__m128i a) +{ + return __builtin_convertvector((__v8hi)a, __v8qi); +}
[gcc r14-10425] x86: Update branch hint for Redwood Cove.
https://gcc.gnu.org/g:1fff665a51e221a578a92631fc8ea62dd79fa3b6 commit r14-10425-g1fff665a51e221a578a92631fc8ea62dd79fa3b6 Author: H.J. Lu Date: Tue Apr 26 11:08:55 2022 -0700 x86: Update branch hint for Redwood Cove. According to Intel® 64 and IA-32 Architectures Optimization Reference Manual[1], Branch Hint is updated for Redwood Cove. cut from [1]- Starting with the Redwood Cove microarchitecture, if the predictor has no stored information about a branch, the branch has the Intel® SSE2 branch taken hint (i.e., instruction prefix 3EH), When the codec decodes the branch, it flips the branch’s prediction from not-taken to taken. It then flushes the pipeline in front of it and steers this pipeline to fetch the taken path of the branch. cut end - Split tune branch_prediction_hints into branch_prediction_hints_taken and branch_prediction_hints_not_taken, always generate branch hint for conditional branches, both tunes are disabled by default. [1] https://www.intel.com/content/www/us/en/content-details/821612/intel-64-and-ia-32-architectures-optimization-reference-manual-volume-1.html gcc/ * config/i386/i386.cc (ix86_print_operand): Always generate branch hint for conditional branches. * config/i386/i386.h (TARGET_BRANCH_PREDICTION_HINTS): Split into .. (TARGET_BRANCH_PREDICTION_HINTS_TAKEN): .. this, and .. (TARGET_BRANCH_PREDICTION_HINTS_NOT_TAKEN): .. this. * config/i386/x86-tune.def (X86_TUNE_BRANCH_PREDICTION_HINTS): Split into .. (X86_TUNE_BRANCH_PREDICTION_HINTS_TAKEN): .. this, and .. (X86_TUNE_BRANCH_PREDICTION_HINTS_NOT_TAKEN): .. this. 
(cherry picked from commit a910c30c7c27cd0f6d2d2694544a09fb11d611b9) Diff: --- gcc/config/i386/i386.cc | 29 + gcc/config/i386/i386.h | 6 -- gcc/config/i386/x86-tune.def | 13 +++-- 3 files changed, 24 insertions(+), 24 deletions(-) diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc index 984ba37beeb9..3827e2b61fe4 100644 --- a/gcc/config/i386/i386.cc +++ b/gcc/config/i386/i386.cc @@ -14203,7 +14203,8 @@ ix86_print_operand (FILE *file, rtx x, int code) if (!optimize || optimize_function_for_size_p (cfun) - || !TARGET_BRANCH_PREDICTION_HINTS) + || (!TARGET_BRANCH_PREDICTION_HINTS_NOT_TAKEN + && !TARGET_BRANCH_PREDICTION_HINTS_TAKEN)) return; x = find_reg_note (current_output_insn, REG_BR_PROB, 0); @@ -14212,25 +14213,13 @@ ix86_print_operand (FILE *file, rtx x, int code) int pred_val = profile_probability::from_reg_br_prob_note (XINT (x, 0)).to_reg_br_prob_base (); - if (pred_val < REG_BR_PROB_BASE * 45 / 100 - || pred_val > REG_BR_PROB_BASE * 55 / 100) - { - bool taken = pred_val > REG_BR_PROB_BASE / 2; - bool cputaken - = final_forward_branch_p (current_output_insn) == 0; - - /* Emit hints only in the case default branch prediction - heuristics would fail. */ - if (taken != cputaken) - { - /* We use 3e (DS) prefix for taken branches and - 2e (CS) prefix for not taken branches. */ - if (taken) - fputs ("ds ; ", file); - else - fputs ("cs ; ", file); - } - } + bool taken = pred_val > REG_BR_PROB_BASE / 2; + /* We use 3e (DS) prefix for taken branches and + 2e (CS) prefix for not taken branches. 
*/ + if (taken && TARGET_BRANCH_PREDICTION_HINTS_TAKEN) + fputs ("ds ; ", file); + else if (!taken && TARGET_BRANCH_PREDICTION_HINTS_NOT_TAKEN) + fputs ("cs ; ", file); } return; } diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h index 529edff93a41..26e15d2677fb 100644 --- a/gcc/config/i386/i386.h +++ b/gcc/config/i386/i386.h @@ -306,8 +306,10 @@ extern unsigned char ix86_tune_features[X86_TUNE_LAST]; #define TARGET_ZERO_EXTEND_WITH_AND \ ix86_tune_features[X86_TUNE_ZERO_EXTEND_WITH_AND] #define TARGET_UNROLL_STRLEN ix86_tune_features[X86_TUNE_UNROLL_STRLEN] -#define TARGET_BRANCH_PREDICTION_HINTS \ - ix86_tune_features[X86_TUNE_BRANCH_PREDICTION_HINTS] +#define TARGET_BRANCH_PREDICTION_HINTS_NOT_TAKEN \ + ix86_tune_features[X86_TUNE_BRANCH_PREDICTION_HINTS_NOT_TAKEN] +#define TARGET_BRANCH_PREDICTION_HINTS_TAKEN \ + ix86_tune_features[X8
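The simplified hint selection in `ix86_print_operand` can be read as a small pure function. The sketch below mirrors the visible logic under stated assumptions: `PROB_BASE` stands in for `REG_BR_PROB_BASE` (assumed here to be 10000), the two booleans stand in for the two new tuning flags, and `branch_hint_prefix` is a hypothetical name.

```cpp
#include <string>

// Assumption: probabilities are scaled to this base, in the manner of
// GCC's REG_BR_PROB_BASE.
constexpr int PROB_BASE = 10000;

// Return the prefix text that would be printed before a conditional
// branch with predicted probability PRED_VAL, given the two tuning flags.
std::string branch_hint_prefix (int pred_val, bool hints_taken,
                                bool hints_not_taken)
{
  // A branch counts as taken only when strictly above one half.
  bool taken = pred_val > PROB_BASE / 2;
  if (taken && hints_taken)
    return "ds ; ";  // 3e (DS) prefix for taken branches
  if (!taken && hints_not_taken)
    return "cs ; ";  // 2e (CS) prefix for not-taken branches
  return "";         // the matching tuning flag is off: no hint
}
```

Unlike the removed code, this form has no 45%-55% dead band and no comparison against the static forward/backward heuristic; the hint depends only on the probability and on which of the two flags is enabled.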
[gcc(refs/users/meissner/heads/work171-bugs)] Fix typos.
https://gcc.gnu.org/g:6482b9e47c5ff9557c720eedabdfccfb50b70edd

commit 6482b9e47c5ff9557c720eedabdfccfb50b70edd
Author: Michael Meissner
Date:   Mon Jul 15 23:16:02 2024 -0400

    Fix typos.

    2024-07-15 Michael Meissner

    gcc/testsuite/

            PR target/115800
            PR target/113652
            * lib/target-supports.exp (check_ppc_float128_sw_available): Fix
            typo in last change.
            (check_effective_target_ppc_ieee128_ok): Likewise.

Diff:
---
 gcc/testsuite/lib/target-supports.exp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 8ab8f6a10586..6b460f24cc3a 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -2987,7 +2987,7 @@ proc check_ppc_float128_sw_available { } {
                     __float128 z = x + y;
                     return (z != 3.0q);
                 }
-            } $options
+            } ""
         }
     }]
 }
@@ -3035,7 +3035,7 @@ proc check_effective_target_ppc_ieee128_ok { } {
                     __ieee128 a;
                     return 0;
                 }
-            } $options
+            } ""
         }
     }]
 }
[gcc(refs/users/meissner/heads/work171-bugs)] Update ChangeLog.*
https://gcc.gnu.org/g:2cccf2a28a48b07dc553f9b34db1b44377e1ff6c

commit 2cccf2a28a48b07dc553f9b34db1b44377e1ff6c
Author: Michael Meissner
Date:   Mon Jul 15 23:17:16 2024 -0400

    Update ChangeLog.*

Diff:
---
 gcc/ChangeLog.bugs | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/gcc/ChangeLog.bugs b/gcc/ChangeLog.bugs
index 7a40e702d934..c2eb08c8f8c2 100644
--- a/gcc/ChangeLog.bugs
+++ b/gcc/ChangeLog.bugs
@@ -1,3 +1,17 @@
+ Branch work171-bugs, patch #322
+
+Fix typos.
+
+2024-07-15  Michael Meissner
+
+gcc/testsuite/
+
+	PR target/115800
+	PR target/113652
+	* lib/target-supports.exp (check_ppc_float128_sw_available): Fix typo in
+	last change.
+	(check_effective_target_ppc_ieee128_ok): Likewise.
+
  Branch work171-bugs, patch #321
 
 Do not add -mfloat128 to the tests.
[gcc(refs/users/meissner/heads/work171-bugs)] Remove -mfloat128 option.
https://gcc.gnu.org/g:a05125de6ec8a4d6fe02e3fa88cca3633b95de1a

commit a05125de6ec8a4d6fe02e3fa88cca3633b95de1a
Author: Michael Meissner
Date:   Tue Jul 16 02:37:03 2024 -0400

    Remove -mfloat128 option.

    2024-07-15  Michael Meissner

    gcc/testsuite/

            PR target/115800
            PR target/113652
            * gcc.target/powerpc/abs128-1.c: Remove passing -mfloat128.  If needed,
            add explicit requires for float128.
            * gcc.target/powerpc/copysign128-1.c: Likewise.
            * gcc.target/powerpc/divkc3-1.c: Likewise.
            * gcc.target/powerpc/float128-3.c: Likewise.
            * gcc.target/powerpc/float128-5.c: Likewise.
            * gcc.target/powerpc/float128-math.c: Likewise.
            * gcc.target/powerpc/inf128-1.c: Likewise.
            * gcc.target/powerpc/mulkc3-1.c: Likewise.
            * gcc.target/powerpc/nan128-1.c: Likewise.
            * gcc.target/powerpc/p9-lxvx-stxvx-3.c: Likewise.
            * gcc.target/powerpc/pr104253.c: Likewise.
            * gcc.target/powerpc/pr70640.c: Likewise.
            * gcc.target/powerpc/pr70669.c: Likewise.
            * gcc.target/powerpc/pr79004.c: Likewise.
            * gcc.target/powerpc/pr79038-1.c: Likewise.
            * gcc.target/powerpc/pr81959.c: Likewise.
            * gcc.target/powerpc/pr85657-1.c: Likewise.
            * gcc.target/powerpc/pr85657-2.c: Likewise.
            * gcc.target/powerpc/pr99708.c: Likewise.
            * gcc.target/powerpc/signbit-1.c: Likewise.
            * gcc.target/powerpc/signbit-2.c: Likewise.
            * gcc.target/powerpc/signbit-3.c: Likewise.

Diff:
---
 gcc/testsuite/gcc.target/powerpc/abs128-1.c        | 4 ++--
 gcc/testsuite/gcc.target/powerpc/copysign128-1.c   | 4 ++--
 gcc/testsuite/gcc.target/powerpc/divkc3-1.c        | 4 ++--
 gcc/testsuite/gcc.target/powerpc/float128-3.c      | 1 +
 gcc/testsuite/gcc.target/powerpc/float128-5.c      | 1 +
 gcc/testsuite/gcc.target/powerpc/float128-math.c   | 3 ++-
 gcc/testsuite/gcc.target/powerpc/inf128-1.c        | 3 ++-
 gcc/testsuite/gcc.target/powerpc/mulkc3-1.c        | 3 ++-
 gcc/testsuite/gcc.target/powerpc/nan128-1.c        | 3 ++-
 gcc/testsuite/gcc.target/powerpc/p9-lxvx-stxvx-3.c | 2 +-
 gcc/testsuite/gcc.target/powerpc/pr104253.c        | 2 +-
 gcc/testsuite/gcc.target/powerpc/pr70640.c         | 2 +-
 gcc/testsuite/gcc.target/powerpc/pr70669.c         | 3 ++-
 gcc/testsuite/gcc.target/powerpc/pr79004.c         | 3 ++-
 gcc/testsuite/gcc.target/powerpc/pr79038-1.c       | 3 ++-
 gcc/testsuite/gcc.target/powerpc/pr81959.c         | 3 ++-
 gcc/testsuite/gcc.target/powerpc/pr85657-1.c       | 2 +-
 gcc/testsuite/gcc.target/powerpc/pr85657-2.c       | 2 +-
 gcc/testsuite/gcc.target/powerpc/pr99708.c         | 2 +-
 gcc/testsuite/gcc.target/powerpc/signbit-1.c       | 2 +-
 gcc/testsuite/gcc.target/powerpc/signbit-2.c       | 2 +-
 gcc/testsuite/gcc.target/powerpc/signbit-3.c       | 2 +-
 22 files changed, 33 insertions(+), 23 deletions(-)

diff --git a/gcc/testsuite/gcc.target/powerpc/abs128-1.c b/gcc/testsuite/gcc.target/powerpc/abs128-1.c
index fe5206daff8c..3449c9ca94d8 100644
--- a/gcc/testsuite/gcc.target/powerpc/abs128-1.c
+++ b/gcc/testsuite/gcc.target/powerpc/abs128-1.c
@@ -1,5 +1,5 @@
-/* { dg-do run { target { powerpc64*-*-* && vsx_hw } } } */
-/* { dg-options "-mfloat128 -mvsx" } */
+/* { dg-do run { target { powerpc64*-*-* && vsx_hw && ppc_float128_sw } } } */
+/* { dg-options "-mvsx" } */
 
 void abort ();
 
diff --git a/gcc/testsuite/gcc.target/powerpc/copysign128-1.c b/gcc/testsuite/gcc.target/powerpc/copysign128-1.c
index 429dfc072e3b..1e8ae5fa7533 100644
--- a/gcc/testsuite/gcc.target/powerpc/copysign128-1.c
+++ b/gcc/testsuite/gcc.target/powerpc/copysign128-1.c
@@ -1,5 +1,5 @@
-/* { dg-do run { target { powerpc64*-*-* && vsx_hw } } } */
-/* { dg-options "-mfloat128 -mvsx" } */
+/* { dg-do run { target { powerpc64*-*-* && vsx_hw && ppc_float128_sw } } } */
+/* { dg-options "-mvsx" } */
 
 void abort ();
 
diff --git a/gcc/testsuite/gcc.target/powerpc/divkc3-1.c b/gcc/testsuite/gcc.target/powerpc/divkc3-1.c
index 89bf04f12a97..2b4f08ecef51 100644
--- a/gcc/testsuite/gcc.target/powerpc/divkc3-1.c
+++ b/gcc/testsuite/gcc.target/powerpc/divkc3-1.c
@@ -1,5 +1,5 @@
-/* { dg-do run { target { powerpc64*-*-* && p8vector_hw } } } */
-/* { dg-options "-mfloat128 -mvsx" } */
+/* { dg-do run { target { powerpc64*-*-* && p8vector_hw && ppc_float128_sw } } } */
+/* { dg-options "-mvsx" } */
 
 void abort ();
 
diff --git a/gcc/testsuite/gcc.target/powerpc/float128-3.c b/gcc/testsuite/gcc.target/powerpc/float128-3.c
index e62ad5f5247f..e58bccdfa159 100644
--- a/gcc/testsuite/gcc.target/powerpc/float128-3.c
+++ b/gcc/testsuite/gcc.target/powerpc/float128-3.c
@@ -1,6 +1,7 @@
 /* { dg-do compile { target { powerpc*-*-linux* } } } */
 /* { dg-options "-O2 -mvsx -mno-float128" } */
 /* { dg-require-effective-target powerpc_vsx } */
+/* { dg-require-effective-target ppc_floa
[gcc(refs/users/meissner/heads/work171-bugs)] Update ChangeLog.*
https://gcc.gnu.org/g:fc7d1bdcd0cd73e800030aa4845324ec5071cd55

commit fc7d1bdcd0cd73e800030aa4845324ec5071cd55
Author: Michael Meissner
Date:   Tue Jul 16 02:39:13 2024 -0400

    Update ChangeLog.*

Diff:
---
 gcc/ChangeLog.bugs | 35 +++
 1 file changed, 35 insertions(+)

diff --git a/gcc/ChangeLog.bugs b/gcc/ChangeLog.bugs
index c2eb08c8f8c2..99b0a6a4ec40 100644
--- a/gcc/ChangeLog.bugs
+++ b/gcc/ChangeLog.bugs
@@ -1,3 +1,38 @@
+ Branch work171-bugs, patch #323
+
+Remove -mfloat128 option.
+
+2024-07-16  Michael Meissner
+
+gcc/testsuite/
+
+	PR target/115800
+	PR target/113652
+
+	* gcc.target/powerpc/abs128-1.c: Remove passing -mfloat128.  If needed,
+	add explicit requires for float128.
+	* gcc.target/powerpc/copysign128-1.c: Likewise.
+	* gcc.target/powerpc/divkc3-1.c: Likewise.
+	* gcc.target/powerpc/float128-3.c: Likewise.
+	* gcc.target/powerpc/float128-5.c: Likewise.
+	* gcc.target/powerpc/float128-math.c: Likewise.
+	* gcc.target/powerpc/inf128-1.c: Likewise.
+	* gcc.target/powerpc/mulkc3-1.c: Likewise.
+	* gcc.target/powerpc/nan128-1.c: Likewise.
+	* gcc.target/powerpc/p9-lxvx-stxvx-3.c: Likewise.
+	* gcc.target/powerpc/pr104253.c: Likewise.
+	* gcc.target/powerpc/pr70640.c: Likewise.
+	* gcc.target/powerpc/pr70669.c: Likewise.
+	* gcc.target/powerpc/pr79004.c: Likewise.
+	* gcc.target/powerpc/pr79038-1.c: Likewise.
+	* gcc.target/powerpc/pr81959.c: Likewise.
+	* gcc.target/powerpc/pr85657-1.c: Likewise.
+	* gcc.target/powerpc/pr85657-2.c: Likewise.
+	* gcc.target/powerpc/pr99708.c: Likewise.
+	* gcc.target/powerpc/signbit-1.c: Likewise.
+	* gcc.target/powerpc/signbit-2.c: Likewise.
+	* gcc.target/powerpc/signbit-3.c: Likewise.
+
  Branch work171-bugs, patch #322
 
 Fix typos.