[gcc(refs/users/mikael/heads/stabilisation_descriptor_v02)] fortran: Delay evaluation of array bounds after reallocation
https://gcc.gnu.org/g:6b68aa726ed5a45e33bd2d87636e2daa5059fd65

commit 6b68aa726ed5a45e33bd2d87636e2daa5059fd65
Author: Mikael Morin
Date:   Mon Jul 7 11:46:08 2025 +0200

    fortran: Delay evaluation of array bounds after reallocation

    Delay the evaluation of bounds, offset, etc. until after the reallocation,
    for the scalarization of allocatable arrays on the left hand side of
    assignments.

    Before this change, the code preceding the scalarization loop is like:

        D.4757 = ref2.offset;
        D.4759 = ref2.dim[0].ubound;
        D.4762 = ref2.dim[0].lbound;
        {
          if (ref2.data == 0B) goto realloc;
          if (ref2.dim[0].lbound + 4 != ref2.dim[0].ubound) goto realloc;
          goto L.10;
          realloc:
          ... change offset and bounds ...
          D.4757 = ref2.offset;
          D.4762 = NON_LVALUE_EXPR ;
          ... reallocation ...
          L.10:;
        }
        while (1)
          {
            ... scalarized code ...

    so the bounds etc. are evaluated first to variables, and the reallocation
    code takes care to update the variables during the reallocation.  This is
    problematic because the variables' initialization references the array
    bounds, which for unallocated arrays are uninitialized at the evaluation
    point.  This used to (correctly) cause uninitialized warnings (see PR
    fortran/108889), and a workaround for variables was found that
    initializes the bounds of array variables to some value beforehand if
    they are unallocated.  For allocatable components, there is no warning
    but the problem remains: some uninitialized values are used, even if
    discarded later.

    After this change the code becomes:

        {
          if (ref2.data == 0B) goto realloc;
          if (ref2.dim[0].lbound + 4 != ref2.dim[0].ubound) goto realloc;
          goto L.10;
          realloc:;
          ... change offset and bounds ...
          ... reallocation ...
          L.10:;
        }
        D.4762 = ref2.offset;
        D.4763 = ref2.dim[0].lbound;
        D.4764 = ref2.dim[0].ubound;
        while (1)
          {
            ... scalarized code

    so the scalarizer avoids storing the values to variables at the time it
    evaluates them, if the array is reallocatable on assignment.  Instead, it
    keeps expressions with references to the array descriptor fields,
    expressions that remain valid through reallocation.  After the
    reallocation code has been generated, the expressions stored by the
    scalarizer are evaluated in place to variables.

    The decision to delay evaluation is based on the existing field
    is_alloc_lhs, which requires a few tweaks to be always correct with
    respect to what its name suggests.  Namely, it should be set even if the
    assignment right hand side is an intrinsic function, and it should not be
    set if the right hand side is a scalar, nor if the -fno-realloc-lhs flag
    is passed to the compiler.

    gcc/fortran/ChangeLog:

            * trans-array.cc (gfc_conv_ss_descriptor): Don't evaluate
            offset and data to a variable if is_alloc_lhs is set.  Move the
            existing evaluation decision condition for data...
            (save_descriptor_data): ... here as a new predicate.
            (evaluate_bound): Add argument save_value.  Omit the evaluation
            of the value to a variable if that argument isn't set.
            (gfc_conv_expr_descriptor): Update caller.
            (gfc_conv_section_startstride): Update caller.  Set save_value
            if is_alloc_lhs is not set.  Omit the evaluation of stride to a
            variable if save_value isn't set.
            (gfc_set_delta): Omit the evaluation of delta to a variable
            if is_alloc_lhs is set.
            (gfc_is_reallocatable_lhs): Return false if flag_realloc_lhs
            isn't set.
            (gfc_alloc_allocatable_for_assignment): Don't update
            the variables that may be stored in saved_offset, delta, and
            data.  Call instead...
            (update_reallocated_descriptor): ... this new procedure.
            * trans-expr.cc (gfc_trans_assignment_1): Don't omit setting the
            is_alloc_lhs flag if the right hand side is an intrinsic
            function.  Clear the flag if the right hand side is scalar.
Diff: --- gcc/fortran/trans-array.cc | 137 - gcc/fortran/trans-expr.cc | 14 ++--- 2 files changed, 104 insertions(+), 47 deletions(-) diff --git a/gcc/fortran/trans-array.cc b/gcc/fortran/trans-array.cc index 7be2d7b11a62..7b83d3fab8d7 100644 --- a/gcc/fortran/trans-array.cc +++ b/gcc/fortran/trans-array.cc @@ -3420,6 +3420,23 @@ gfc_add_loop_ss_code (gfc_loopinfo * loop, gfc_ss * ss, bool subscript, } +/* Given an array descriptor expression DESCR and its data point
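The ordering described in the commit message above can be pictured with a small C model. This is only a sketch under stated assumptions: `struct desc`, `realloc_if_needed` and `assign_with_realloc` are hypothetical names invented for the illustration, the descriptor is reduced to one dimension, and none of it is gfortran's generated code. The point it shows is that the descriptor fields are copied to locals only after the reallocation decision, so an unallocated descriptor's bounds are never read.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, heavily simplified one-dimensional array descriptor.  */
struct desc
{
  int *data;
  long offset;
  long lbound;
  long ubound;
};

/* Reallocate D for N elements if it is unallocated or has another extent.
   The bounds are read only when DATA is non-null.  Returns 0 on success.  */
static int
realloc_if_needed (struct desc *d, long n)
{
  if (d->data != NULL && d->lbound + n - 1 == d->ubound)
    return 0;

  int *p = realloc (d->data, (size_t) n * sizeof *p);
  if (p == NULL)
    return -1;

  d->data = p;
  d->lbound = 1;
  d->ubound = n;
  d->offset = -d->lbound;
  return 0;
}

/* Assign SRC[0..N-1] to the array described by D, reallocating first.  */
int
assign_with_realloc (struct desc *d, const int *src, long n)
{
  assert (d != NULL && src != NULL && n > 0);

  if (realloc_if_needed (d, n) != 0)
    return -1;

  /* Only now are the descriptor fields evaluated to local variables,
     so the locals never hold values from before the reallocation.  */
  long offset = d->offset;
  long lbound = d->lbound;
  long ubound = d->ubound;
  int *data = d->data;

  for (long i = lbound; i <= ubound; i++)
    data[i + offset] = src[i - lbound];
  return 0;
}
```

Calling `assign_with_realloc` on a zero-initialized `struct desc` allocates it; calling it again with a different length reallocates before any bound is cached.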
[gcc(refs/users/mikael/heads/stabilisation_descriptor_v02)] fortran: Amend descriptor bounds init if unallocated
https://gcc.gnu.org/g:0312dae9024cd86b355bbaf790c97759b287c8a0

commit 0312dae9024cd86b355bbaf790c97759b287c8a0
Author: Mikael Morin
Date:   Wed Jul 9 09:40:32 2025 +0200

    fortran: Amend descriptor bounds init if unallocated

    Always generate the conditional initialization of unallocated variables,
    regardless of the basic variable allocation tracking done in the
    frontend, and with an additional always-false condition.

    The scalarizer used to always evaluate array bounds, including in the
    case of unallocated arrays on the left hand side of an assignment.  This
    was (correctly) causing uninitialized warnings, even if the uninitialized
    values were in the end discarded.

    Since the fix for PR fortran/108889, an initialization of the descriptor
    bounds is added to silence the uninitialized warnings, conditional on the
    array being unallocated.  This initialization is not useful in the
    execution of the program, and it is removed if the compiler can prove
    that the variable is unallocated (in the case of a local variable for
    example).  Unfortunately, the compiler is not always able to prove it and
    the useless initialization may remain in the final code.  Moreover, the
    generated code that was causing the evaluation of uninitialized variables
    has been changed to avoid them, so we can try to remove or revisit that
    unallocated variable bounds initialization tweak.

    Unfortunately, just removing the extra initialization restores the
    warnings at -O0, as there is no dead code removal at that optimization
    level.  Instead, this change keeps the initialization and modifies its
    guarding condition with an extra always-false variable, so that if
    optimizations are enabled the whole initialization block is removed, and
    if they are disabled it remains and is sufficient to prevent the warning.

    The new variable requires the code generation to be done earlier in the
    function so that the variable declaration and usage are in the same
    scope.
As the modified condition guarantees the removal of the block with optimizations, we can emit it more broadly and remove the basic allocation tracking that was done in the frontend to limit its emission. gcc/fortran/ChangeLog: * gfortran.h (gfc_symbol): Remove field allocated_in_scope. * trans-array.cc (gfc_array_allocate): Don't set it. (gfc_alloc_allocatable_for_assignment): Likewise. Generate the unallocated descriptor bounds initialisation before the opening of the reallocation code block. Create a variable and use it as additional condition to the unallocated descriptor bounds initialisation. Diff: --- gcc/fortran/gfortran.h | 4 -- gcc/fortran/trans-array.cc | 91 -- 2 files changed, 48 insertions(+), 47 deletions(-) diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h index 6848bd1762d3..69367e638c5b 100644 --- a/gcc/fortran/gfortran.h +++ b/gcc/fortran/gfortran.h @@ -2028,10 +2028,6 @@ typedef struct gfc_symbol /* Set if this should be passed by value, but is not a VALUE argument according to the Fortran standard. */ unsigned pass_as_value:1; - /* Set if an allocatable array variable has been allocated in the current - scope. Used in the suppression of uninitialized warnings in reallocation - on assignment. */ - unsigned allocated_in_scope:1; /* Set if an external dummy argument is called with different argument lists. This is legal in Fortran, but can cause problems with autogenerated C prototypes for C23. 
*/ diff --git a/gcc/fortran/trans-array.cc b/gcc/fortran/trans-array.cc index 7b83d3fab8d7..52888c1e1f1b 100644 --- a/gcc/fortran/trans-array.cc +++ b/gcc/fortran/trans-array.cc @@ -6800,8 +6800,6 @@ gfc_array_allocate (gfc_se * se, gfc_expr * expr, tree status, tree errmsg, else gfc_add_expr_to_block (&se->pre, set_descriptor); - expr->symtree->n.sym->allocated_in_scope = 1; - return true; } @@ -11495,14 +11493,60 @@ gfc_alloc_allocatable_for_assignment (gfc_loopinfo *loop, && !expr2->value.function.isym) expr2->ts.u.cl->backend_decl = rss->info->string_length; - gfc_start_block (&fblock); - /* Since the lhs is allocatable, this must be a descriptor type. Get the data and array size. */ desc = linfo->descriptor; gcc_assert (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (desc))); array1 = gfc_conv_descriptor_data_get (desc); + /* If the data is null, set the descriptor bounds and offset. This suppresses + the maybe used uninitialized warning. Note that the always false variable + prevents this block from from ever being executed. The whole block should + be removed by optimizations. Component references are not subject to the + warnings, so we don't uselessly complicate the generated code
[gcc(refs/users/mikael/heads/stabilisation_descriptor_v02)] fortran: Factor array descriptor references
https://gcc.gnu.org/g:8c7924c0e3ad450e98ae2081dce8fa2a9916479d

commit 8c7924c0e3ad450e98ae2081dce8fa2a9916479d
Author: Mikael Morin
Date:   Wed Jul 9 21:18:18 2025 +0200

    fortran: Factor array descriptor references

    Save parts of array descriptor references to a variable so that all the
    expressions using the descriptor as base object benefit from the
    simplified reference.

    gcc/fortran/ChangeLog:

            * trans-array.cc (gfc_conv_ss_descriptor): Move the descriptor
            reference initialisation...
            (set_factored_descriptor_value): ... to this new function.
            Walk the reference passed as arguments and try to simplify some
            of it to a variable.

Diff:
---
 gcc/fortran/trans-array.cc | 79 +-
 1 file changed, 78 insertions(+), 1 deletion(-)

diff --git a/gcc/fortran/trans-array.cc b/gcc/fortran/trans-array.cc
index 52888c1e1f1b..51ec1c78a28c 100644
--- a/gcc/fortran/trans-array.cc
+++ b/gcc/fortran/trans-array.cc
@@ -3437,6 +3437,83 @@ save_descriptor_data (tree descr, tree data)
 }
 
 
+/* Save the descriptor reference VALUE to storage pointed by DESC_PTR.  As there
+   may be a lot of code using subreferences of the descriptor, try to factor
+   them by evaluating the leading part of the data reference to a variable,
+   adding extra code to BLOCK.
+
+   To avoid copying large amounts of data we only save pointers in the reference
+   chain, and as late in the chain as possible.*/
+
+void
+set_factored_descriptor_value (stmtblock_t *block, tree *desc_ptr, tree value)
+{
+  /* As the reference is processed from last to first, statements will be
+     generate in reversed order, so can't be put directly in BLOCK.  We use
+     TMP_BLOCK instead.  */
+  stmtblock_t tmp_block;
+  tree accumulated_code = NULL_TREE;
+
+  gfc_init_block (&tmp_block);
+
+  tree *ptr_ref = nullptr;
+
+  tree data_ref = value;
+  bool seen_component = false;
+  while (true)
+    {
+      if (TREE_CODE (data_ref) == INDIRECT_REF)
+        {
+          /* If there is no component reference after the pointer dereference in
+             the reference chain, the pointer can't be saved to a variable as
+             it may be a pointer or allocatable, and we have to keep the parent
+             reference to be able to update the pointer value.  Otherwise the
+             pointer can be saved to a variable.  */
+          if (seen_component)
+            {
+              /* Don't evaluate the pointer to a variable yet; do it only if the
+                 variable would be significantly more simple than the reference
+                 it replaces.  That is if the reference contains anything
+                 different from a NOP, a COMPONENT or a DECL.  */
+              ptr_ref = &TREE_OPERAND (data_ref, 0);
+            }
+
+          data_ref = TREE_OPERAND (data_ref, 0);
+        }
+      else if (TREE_CODE (data_ref) == COMPONENT_REF)
+        {
+          seen_component = true;
+          data_ref = TREE_OPERAND (data_ref, 0);
+        }
+      else if (TREE_CODE (data_ref) == NOP_EXPR)
+        data_ref = TREE_OPERAND (data_ref, 0);
+      else
+        {
+          if (DECL_P (data_ref))
+            break;
+
+          if (ptr_ref != nullptr)
+            {
+              /* We have seen a pointer before, and its reference appears to be
+                 worth saving.  Do it now.  */
+              tree ptr = *ptr_ref;
+              *ptr_ref = gfc_evaluate_now (ptr, &tmp_block);
+              gfc_add_expr_to_block (&tmp_block, accumulated_code);
+              accumulated_code = gfc_finish_block (&tmp_block);
+            }
+
+          if (TREE_CODE (data_ref) == ARRAY_REF)
+            data_ref = TREE_OPERAND (data_ref, 0);
+          else
+            break;
+        }
+    }
+
+  *desc_ptr = value;
+  gfc_add_expr_to_block (block, accumulated_code);
+}
+
+
 /* Translate expressions for the descriptor and data pointer of a SS.  */
 /*GCC ARRAYS*/
 
@@ -3457,7 +3534,7 @@ gfc_conv_ss_descriptor (stmtblock_t * block, gfc_ss * ss, int base)
   se.descriptor_only = 1;
   gfc_conv_expr_lhs (&se, ss_info->expr);
   gfc_add_block_to_block (block, &se.pre);
-  info->descriptor = se.expr;
+  set_factored_descriptor_value (block, &info->descriptor, se.expr);
   ss_info->string_length = se.string_length;
   ss_info->class_container = se.class_container;
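The reference walk in the patch above can be modelled outside of GCC with a toy reference chain. The sketch below is an assumption-laden simplification, not the patch's code: `enum ref_kind`, `struct ref` and `pointer_worth_saving` are hypothetical names, the chain is a plain linked list instead of GCC trees, and the function merely returns the pointer-valued subreference that would be evaluated to a variable (stopping at the first one) instead of emitting statements.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the tree codes the walk distinguishes.  */
enum ref_kind
{
  REF_DECL, REF_COMPONENT, REF_INDIRECT, REF_NOP, REF_ARRAY, REF_OTHER
};

/* One link of a reference chain; BASE plays the role of operand 0.  */
struct ref
{
  enum ref_kind kind;
  struct ref *base;
};

/* Walk VALUE from its last reference towards its base object and return
   the pointer-valued subreference worth saving to a variable, or NULL
   if the chain is simple enough to be left alone.  */
struct ref *
pointer_worth_saving (struct ref *value)
{
  struct ref *candidate = NULL;
  int seen_component = 0;

  for (struct ref *r = value; r != NULL; r = r->base)
    {
      switch (r->kind)
        {
        case REF_INDIRECT:
          /* A dereference always has an operand.  */
          assert (r->base != NULL);
          /* Remember the pointer only if a component follows it.  */
          if (seen_component)
            candidate = r->base;
          break;
        case REF_COMPONENT:
          seen_component = 1;
          break;
        case REF_NOP:
          break;
        case REF_DECL:
          /* Only NOPs, components and dereferences of a declaration:
             nothing complex enough to justify a temporary.  */
          return NULL;
        case REF_ARRAY:
          /* Something more complex: save the remembered pointer, if any.  */
          if (candidate != NULL)
            return candidate;
          break;
        case REF_OTHER:
          return candidate;
        }
    }
  return NULL;
}
```

For a chain shaped like a component of a dereferenced array element, the function returns the array element node; for a component of a dereferenced plain declaration it returns NULL.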
[gcc] Created branch 'mikael/heads/stabilisation_descriptor_v02' in namespace 'refs/users'
The branch 'mikael/heads/stabilisation_descriptor_v02' was created in namespace 'refs/users' pointing to: 8c7924c0e3ad... fortran: Factor array descriptor references
[gcc r16-2173] [PATCH] libgcc: PR target/116363 Fix SFtype to UDWtype conversion
https://gcc.gnu.org/g:e6f2daff77ee1f709105cb9f8e3e92f04c179431

commit r16-2173-ge6f2daff77ee1f709105cb9f8e3e92f04c179431
Author: Jan Dubiec
Date:   Thu Jul 10 07:41:08 2025 -0600

    [PATCH] libgcc: PR target/116363 Fix SFtype to UDWtype conversion

    This patch fixes SFtype to UDWtype (aka float to unsigned long long)
    conversion on targets without DFmode, e.g. H8/300H.  It relies solely on
    SFtype->UWtype and UWtype->UDWtype conversions/casts.  The existing code
    in line 2218 (counter = a) assigns/casts a float that is *never* less
    than Wtype_MAXp1_F to a UWtype integer, which of course does not have
    enough capacity.

            PR target/116363

    libgcc/ChangeLog:

            * libgcc2.c (__fixunssfDI): Fix SFtype to UDWtype conversion for
            targets without LIBGCC2_HAS_DF_MODE defined.

Diff:
---
 libgcc/libgcc2.c | 41 +
 1 file changed, 13 insertions(+), 28 deletions(-)

diff --git a/libgcc/libgcc2.c b/libgcc/libgcc2.c
index faefff3730ca..df99c78eb204 100644
--- a/libgcc/libgcc2.c
+++ b/libgcc/libgcc2.c
@@ -2187,36 +2187,21 @@ __fixunssfDI (SFtype a)
   if (a < 1)
     return 0;
   if (a < Wtype_MAXp1_F)
-    return (UWtype)a;
+    return (UWtype) a;
   if (a < Wtype_MAXp1_F * Wtype_MAXp1_F)
     {
-      /* Since we know that there are fewer significant bits in the SFmode
-         quantity than in a word, we know that we can convert out all the
-         significant bits in one step, and thus avoid losing bits.  */
-
-      /* ??? This following loop essentially performs frexpf.  If we could
-         use the real libm function, or poke at the actual bits of the fp
-         format, it would be significantly faster.  */
-
-      UWtype shift = 0, counter;
-      SFtype msb;
-
-      a /= Wtype_MAXp1_F;
-      for (counter = W_TYPE_SIZE / 2; counter != 0; counter >>= 1)
-        {
-          SFtype counterf = (UWtype)1 << counter;
-          if (a >= counterf)
-            {
-              shift |= counter;
-              a /= counterf;
-            }
-        }
-
-      /* Rescale into the range of one word, extract the bits of that
-         one word, and shift the result into position.  */
-      a *= Wtype_MAXp1_F;
-      counter = a;
-      return (DWtype)counter << shift;
+      /* We assume that SFtype -> UWtype and UWtype -> UDWtype casts work
+         properly.  Obviously, we *cannot* assume that SFtype -> UDWtype
+         works as expected.  */
+      SFtype a_hi, a_lo;
+
+      a_hi = a / Wtype_MAXp1_F;
+      a_lo = a - a_hi * Wtype_MAXp1_F;
+
+      /* A lot of parentheses.  This is to make it very clear what is
+         the sequence of operations.  */
+      return ((UDWtype) ((UWtype) a_hi)) << W_TYPE_SIZE
+             | (UDWtype) ((UWtype) a_lo);
     }
   return -1;
 #else
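The idea of building a 64-bit result from two word-sized float-to-integer conversions can be tried out in portable C. The following is a sketch only, assuming a 32-bit word and IEEE single precision; `float_to_u64_by_words` and `WORD_MAXP1_F` are hypothetical names, and this is not libgcc's `__fixunssfDI`. Note one deliberate difference from the hunk quoted above: the sketch truncates the high part to an integer before subtracting, so that the remainder carries the bits destined for the low word.

```c
#include <assert.h>
#include <stdint.h>

/* 2^32 as a float; stands in for Wtype_MAXp1_F with a 32-bit word
   (an assumption made for this sketch).  */
#define WORD_MAXP1_F 4294967296.0f

/* Convert A to a 64-bit unsigned integer using only float -> 32-bit
   and 32-bit -> 64-bit conversions.  Values below 1 give 0; values of
   2^64 or more give all ones.  */
uint64_t
float_to_u64_by_words (float a)
{
  if (a < 1.0f)
    return 0;
  if (a < WORD_MAXP1_F)
    return (uint32_t) a;
  if (a < WORD_MAXP1_F * WORD_MAXP1_F)
    {
      /* High word: integer part of a / 2^32 (the division is exact).  */
      uint32_t hi = (uint32_t) (a / WORD_MAXP1_F);

      /* Low word: what is left once the high word is taken out.  Both
         operands are exactly representable, and so is the difference.  */
      float lo_f = a - (float) hi * WORD_MAXP1_F;
      assert (lo_f >= 0.0f && lo_f < WORD_MAXP1_F);

      return ((uint64_t) hi << 32) | (uint64_t) (uint32_t) lo_f;
    }
  return UINT64_MAX;
}
```

For instance, an input of 1.5 * 2^32 splits into a high word of 1 and a low word of 2^31.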
[gcc r16-2174] RISC-V: Make zero-stride load broadcast a tunable.
https://gcc.gnu.org/g:dcba959fb30dc250eeb6fdd05aa878e5f1fc8c2d commit r16-2174-gdcba959fb30dc250eeb6fdd05aa878e5f1fc8c2d Author: Robin Dapp Date: Thu Jul 10 09:41:48 2025 +0200 RISC-V: Make zero-stride load broadcast a tunable. This patch makes the zero-stride load broadcast idiom dependent on a uarch-tunable "use_zero_stride_load". Right now we have quite a few paths that reach a strided load and some of them are not exactly straightforward. While broadcast is relatively rare on rv64 targets it is more common on rv32 targets that want to vectorize 64-bit elements. While the patch is more involved than I would have liked it could have even touched more places. The whole broadcast-like insn path feels a bit hackish due to the several optimizations we employ. Some of the complications stem from the fact that we lump together real broadcasts, vector single-element sets, and strided broadcasts. The strided-load alternatives currently require a memory_constraint to work properly which causes more complications when trying to disable just these. In short, the whole pred_broadcast handling in combination with the sew64_scalar_helper could use work in the future. I was about to start with it in this patch but soon realized that it would only distract from the original intent. What can help in the future is split strided and non-strided broadcast entirely, as well as the single-element sets. Yet unclear is whether we need to pay special attention for misaligned strided loads (PR120782). I regtested on rv32 and rv64 with strided_load_broadcast_p forced to true and false. With either I didn't observe any new execution failures but obviously there are new scan failures with strided broadcast turned off. PR target/118734 gcc/ChangeLog: * config/riscv/constraints.md (Wdm): Use tunable for Wdm constraint. * config/riscv/riscv-protos.h (emit_avltype_insn): Declare. (can_be_broadcasted_p): Rename to... (can_be_broadcast_p): ...this. 
* config/riscv/predicates.md: Use renamed function. (strided_load_broadcast_p): Declare. * config/riscv/riscv-selftests.cc (run_broadcast_selftests): Only run broadcast selftest if strided broadcasts are OK. * config/riscv/riscv-v.cc (emit_avltype_insn): New function. (sew64_scalar_helper): Only emit a pred_broadcast if the new tunable says so. (can_be_broadcasted_p): Rename to... (can_be_broadcast_p): ...this and use new tunable. * config/riscv/riscv.cc (struct riscv_tune_param): Add strided broad tunable. (strided_load_broadcast_p): Implement. * config/riscv/vector.md: Use strided_load_broadcast_p () and work around 64-bit broadcast on rv32 targets. Diff: --- gcc/config/riscv/constraints.md | 7 ++-- gcc/config/riscv/predicates.md | 2 +- gcc/config/riscv/riscv-protos.h | 4 ++- gcc/config/riscv/riscv-selftests.cc | 10 -- gcc/config/riscv/riscv-v.cc | 58 +++- gcc/config/riscv/riscv.cc | 20 +++ gcc/config/riscv/vector.md | 66 +++-- 7 files changed, 133 insertions(+), 34 deletions(-) diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md index ccab1a2e29df..5ecaa19eb014 100644 --- a/gcc/config/riscv/constraints.md +++ b/gcc/config/riscv/constraints.md @@ -237,10 +237,11 @@ (and (match_code "const_vector") (match_test "rtx_equal_p (op, riscv_vector::gen_scalar_move_mask (GET_MODE (op)))"))) -(define_memory_constraint "Wdm" +(define_constraint "Wdm" "Vector duplicate memory operand" - (and (match_code "mem") - (match_code "reg" "0"))) + (and (match_test "strided_load_broadcast_p ()") + (and (match_code "mem") + (match_code "reg" "0" ;; Vendor ISA extension constraints. diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md index 8baad2fae7a9..1f9a6b562e53 100644 --- a/gcc/config/riscv/predicates.md +++ b/gcc/config/riscv/predicates.md @@ -617,7 +617,7 @@ ;; The scalar operand can be directly broadcast by RVV instructions. 
(define_predicate "direct_broadcast_operand" - (match_test "riscv_vector::can_be_broadcasted_p (op)")) + (match_test "riscv_vector::can_be_broadcast_p (op)")) ;; A CONST_INT operand that has exactly two bits cleared. (define_predicate "const_nottwobits_operand" diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index 38f63ea84248..a41c4c299fac 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -604,6 +604,7 @@ void emit_vlmax_vsetvl (machine_mode, rtx); void emit_hard_vlmax_vsetvl (machine_mode, rtx);
[gcc r16-2175] expand: ICE if asked to expand RDIV with non-float type.
https://gcc.gnu.org/g:5aa21765236730c1772c19454cbb71365b84d583

commit r16-2175-g5aa21765236730c1772c19454cbb71365b84d583
Author: Robin Dapp
Date:   Wed Jul 9 15:58:05 2025 +0200

    expand: ICE if asked to expand RDIV with non-float type.

    This patch adds asserts that ensure we only expand an RDIV_EXPR with
    actual float mode.  It also replaces the RDIV_EXPR in setting a
    vectorized loop's length by EXACT_DIV_EXPR.  The code in question is only
    used with length-control targets (riscv, powerpc, s390).

            PR target/121014

    gcc/ChangeLog:

            * cfgexpand.cc (expand_debug_expr): Assert FLOAT_MODE_P.
            * optabs-tree.cc (optab_for_tree_code): Assert FLOAT_TYPE_P.
            * tree-vect-loop.cc (vect_get_loop_len): Use EXACT_DIV_EXPR.

Diff:
---
 gcc/cfgexpand.cc      | 2 ++
 gcc/optabs-tree.cc    | 2 ++
 gcc/tree-vect-loop.cc | 2 +-
 3 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
index 33649d43f71c..a656ccebf176 100644
--- a/gcc/cfgexpand.cc
+++ b/gcc/cfgexpand.cc
@@ -5358,6 +5358,8 @@ expand_debug_expr (tree exp)
       return simplify_gen_binary (MULT, mode, op0, op1);
 
     case RDIV_EXPR:
+      gcc_assert (FLOAT_MODE_P (mode));
+      /* Fall through.  */
     case TRUNC_DIV_EXPR:
     case EXACT_DIV_EXPR:
       if (unsignedp)
diff --git a/gcc/optabs-tree.cc b/gcc/optabs-tree.cc
index 6dfe8ee4c4e4..9308a6dfd65c 100644
--- a/gcc/optabs-tree.cc
+++ b/gcc/optabs-tree.cc
@@ -82,6 +82,8 @@ optab_for_tree_code (enum tree_code code, const_tree type,
         return unknown_optab;
       /* FALLTHRU */
     case RDIV_EXPR:
+      gcc_assert (FLOAT_TYPE_P (type));
+      /* FALLTHRU */
     case TRUNC_DIV_EXPR:
     case EXACT_DIV_EXPR:
       if (TYPE_SATURATING (type))
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 8ea0f45d79fc..56f80db57bbc 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -11079,7 +11079,7 @@ vect_get_loop_len (loop_vec_info loop_vinfo, gimple_stmt_iterator *gsi,
           factor = exact_div (nunits1, nunits2).to_constant ();
           tree iv_type = LOOP_VINFO_RGROUP_IV_TYPE (loop_vinfo);
           gimple_seq seq = NULL;
-          loop_len = gimple_build (&seq, RDIV_EXPR, iv_type, loop_len,
+          loop_len = gimple_build (&seq, EXACT_DIV_EXPR, iv_type, loop_len,
                                    build_int_cst (iv_type, factor));
           if (seq)
             gsi_insert_seq_before (gsi, seq, GSI_SAME_STMT);
[gcc] Deleted branch 'mikael/heads/stabilisation_descriptor_v01' in namespace 'refs/users'
The branch 'mikael/heads/stabilisation_descriptor_v01' in namespace 'refs/users' was deleted.
It previously pointed to:

 a8bc113ef2e4... Revert "Ajout directive warning"

Diff:

!!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST):
---

  a8bc113... Revert "Ajout directive warning"
  3ade607... Ajout directive warning
  662b288... Revert "Revert ajout code mort"
  13f5c49... Déplacement variables après réallocation
[gcc(refs/users/mikael/heads/stabilisation_descriptor_v01)] Correction array_constructor_1
https://gcc.gnu.org/g:41b730b8a79522e8e5a6115f01a02968a571e85b commit 41b730b8a79522e8e5a6115f01a02968a571e85b Author: Mikael Morin Date: Sat Jul 5 15:05:20 2025 +0200 Correction array_constructor_1 Diff: --- gcc/testsuite/gfortran.dg/asan/array_constructor_1.f90 | 2 ++ 1 file changed, 2 insertions(+) diff --git a/gcc/testsuite/gfortran.dg/asan/array_constructor_1.f90 b/gcc/testsuite/gfortran.dg/asan/array_constructor_1.f90 index 45eafacd5a67..a0c55076a9ae 100644 --- a/gcc/testsuite/gfortran.dg/asan/array_constructor_1.f90 +++ b/gcc/testsuite/gfortran.dg/asan/array_constructor_1.f90 @@ -9,6 +9,8 @@ program grow_type_array type(container), allocatable :: list(:) +allocate(list(0)) + list = [list, new_elem(5)] deallocate(list)
[gcc(refs/users/mikael/heads/stabilisation_descriptor_v01)] fortran: Delay evaluation of array bounds after reallocation
https://gcc.gnu.org/g:f6115ed47ddc5e89ea058afc98f43bae5d49cbf0

commit f6115ed47ddc5e89ea058afc98f43bae5d49cbf0
Author: Mikael Morin
Date:   Mon Jul 7 11:46:08 2025 +0200

    fortran: Delay evaluation of array bounds after reallocation

    Delay the evaluation of bounds, offset, etc. until after the reallocation,
    for the scalarization of allocatable arrays on the left hand side of
    assignments.

    Before this change, the code preceding the scalarization loop is like:

        D.4757 = ref2.offset;
        D.4759 = ref2.dim[0].ubound;
        D.4762 = ref2.dim[0].lbound;
        {
          if (ref2.data == 0B) goto realloc;
          if (ref2.dim[0].lbound + 4 != ref2.dim[0].ubound) goto realloc;
          goto L.10;
          realloc:
          ... change offset and bounds ...
          D.4757 = ref2.offset;
          D.4762 = NON_LVALUE_EXPR ;
          ... reallocation ...
          L.10:;
        }
        while (1)
          {
            ... scalarized code ...

    so the bounds etc. are evaluated first to variables, and the reallocation
    code takes care to update the variables during the reallocation.  This is
    problematic because the variables' initialization references the array
    bounds, which for unallocated arrays are uninitialized at the evaluation
    point.  This used to (correctly) cause uninitialized warnings (see PR
    fortran/108889), and a workaround for variables was found that
    initializes the bounds of array variables to some value beforehand if
    they are unallocated.  For allocatable components, there is no warning
    but the problem remains: some uninitialized values are used, even if
    discarded later.

    After this change the code becomes:

        {
          if (ref2.data == 0B) goto realloc;
          if (ref2.dim[0].lbound + 4 != ref2.dim[0].ubound) goto realloc;
          goto L.10;
          realloc:;
          ... change offset and bounds ...
          ... reallocation ...
          L.10:;
        }
        D.4762 = ref2.offset;
        D.4763 = ref2.dim[0].lbound;
        D.4764 = ref2.dim[0].ubound;
        while (1)
          {
            ... scalarized code

    so the scalarizer avoids storing the values to variables at the time it
    evaluates them, if the array is reallocatable on assignment.  Instead, it
    keeps expressions with references to the array descriptor fields,
    expressions that remain valid through reallocation.  After the
    reallocation code has been generated, the expressions stored by the
    scalarizer are evaluated in place to variables.

    The decision to delay evaluation is based on the existing field
    is_alloc_lhs, which requires a few tweaks to be always correct with
    respect to what its name suggests.  Namely, it should be set even if the
    assignment right hand side is an intrinsic function, and it should not be
    set if the right hand side is a scalar, nor if the -fno-realloc-lhs flag
    is passed to the compiler.

    gcc/fortran/ChangeLog:

            * trans-array.cc (gfc_conv_ss_descriptor): Don't evaluate
            offset and data to a variable if is_alloc_lhs is set.  Move the
            existing evaluation decision condition for data...
            (save_descriptor_data): ... here as a new predicate.
            (evaluate_bound): Add argument save_value.  Omit the evaluation
            of the value to a variable if that argument isn't set.
            (gfc_conv_expr_descriptor): Update caller.
            (gfc_conv_section_startstride): Update caller.  Set save_value
            if is_alloc_lhs is not set.  Omit the evaluation of stride to a
            variable if save_value isn't set.
            (gfc_set_delta): Omit the evaluation of delta to a variable
            if is_alloc_lhs is set.
            (gfc_is_reallocatable_lhs): Return false if flag_realloc_lhs
            isn't set.
            (gfc_alloc_allocatable_for_assignment): Don't update
            the variables that may be stored in saved_offset, delta, and
            data.  Call instead...
            (update_reallocated_descriptor): ... this new procedure.
            * trans-expr.cc (gfc_trans_assignment_1): Don't omit setting the
            is_alloc_lhs flag if the right hand side is an intrinsic
            function.  Clear the flag if the right hand side is scalar.
Diff: --- gcc/fortran/trans-array.cc | 137 - gcc/fortran/trans-expr.cc | 14 ++--- 2 files changed, 104 insertions(+), 47 deletions(-) diff --git a/gcc/fortran/trans-array.cc b/gcc/fortran/trans-array.cc index 7be2d7b11a62..7b83d3fab8d7 100644 --- a/gcc/fortran/trans-array.cc +++ b/gcc/fortran/trans-array.cc @@ -3420,6 +3420,23 @@ gfc_add_loop_ss_code (gfc_loopinfo * loop, gfc_ss * ss, bool subscript, } +/* Given an array descriptor expression DESCR and its data poin
[gcc(refs/users/mikael/heads/stabilisation_descriptor_v01)] fortran: generate array reallocation out of loops
https://gcc.gnu.org/g:a1e8410d02b3c9cd658dab13fa5422a3b99c2230

commit a1e8410d02b3c9cd658dab13fa5422a3b99c2230
Author: Mikael Morin
Date:   Sun Jul 6 16:56:16 2025 +0200

    fortran: generate array reallocation out of loops

    Generate the array reallocation on assignment code before entering the
    scalarization loops.  This doesn't move the generated code itself, which
    was already put before the outermost loop, but only changes the current
    scope at the time the code is generated.  This is a prerequisite for a
    followup patch that makes the reallocation code create new variables.
    Without this change the new variables would be declared in the innermost
    loop body and couldn't be used outside of it.

    gcc/fortran/ChangeLog:

            * trans-expr.cc (gfc_trans_assignment_1): Generate array
            reallocation code before entering the scalarisation loops.

Diff:
---
 gcc/fortran/trans-expr.cc | 21 -
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
index 3e0d763d2fb0..760c8c4e72bd 100644
--- a/gcc/fortran/trans-expr.cc
+++ b/gcc/fortran/trans-expr.cc
@@ -12943,6 +12943,7 @@ gfc_trans_assignment_1 (gfc_expr * expr1, gfc_expr * expr2, bool init_flag,
       rhs_caf_attr = gfc_caf_attr (expr2, false, &rhs_refs_comp);
     }
 
+  tree reallocation = NULL_TREE;
   if (lss != gfc_ss_terminator)
     {
       /* The assignment needs scalarization.  */
@@ -13011,6 +13012,15 @@ gfc_trans_assignment_1 (gfc_expr * expr1, gfc_expr * expr2, bool init_flag,
           ompws_flags |= OMPWS_SCALARIZER_WS | OMPWS_SCALARIZER_BODY;
         }
 
+      /* F2003: Allocate or reallocate lhs of allocatable array.  */
+      if (realloc_flag)
+        {
+          realloc_lhs_warning (expr1->ts.type, true, &expr1->where);
+          ompws_flags &= ~OMPWS_SCALARIZER_WS;
+          reallocation = gfc_alloc_allocatable_for_assignment (&loop, expr1,
+                                                               expr2);
+        }
+
       /* Start the scalarized loop body.  */
       gfc_start_scalarized_body (&loop, &body);
     }
@@ -13319,15 +13329,8 @@ gfc_trans_assignment_1 (gfc_expr * expr1, gfc_expr * expr2, bool init_flag,
           gfc_add_expr_to_block (&body, tmp);
         }
 
-      /* F2003: Allocate or reallocate lhs of allocatable array.  */
-      if (realloc_flag)
-        {
-          realloc_lhs_warning (expr1->ts.type, true, &expr1->where);
-          ompws_flags &= ~OMPWS_SCALARIZER_WS;
-          tmp = gfc_alloc_allocatable_for_assignment (&loop, expr1, expr2);
-          if (tmp != NULL_TREE)
-            gfc_add_expr_to_block (&loop.code[expr1->rank - 1], tmp);
-        }
+      if (reallocation != NULL_TREE)
+        gfc_add_expr_to_block (&loop.code[loop.dimen - 1], reallocation);
 
       if (maybe_workshare)
         ompws_flags &= ~OMPWS_SCALARIZER_BODY;
[gcc] Created branch 'mikael/heads/stabilisation_descriptor_v01' in namespace 'refs/users'
The branch 'mikael/heads/stabilisation_descriptor_v01' was created in namespace 'refs/users' pointing to: 7e72a078ae71... fortran: Amend descriptor bounds init if unallocated
[gcc r16-2160] Remove vect_dissolve_slp_only_groups
https://gcc.gnu.org/g:e13076208452c001fd831eaaaebe1fd34762dc31

commit r16-2160-ge13076208452c001fd831eaaaebe1fd34762dc31
Author: Richard Biener
Date:   Wed Jul 9 15:10:26 2025 +0200

    Remove vect_dissolve_slp_only_groups

    This function dissolves DR groups that are not subject to SLP, which
    means it is no longer necessary.

            * tree-vect-loop.cc (vect_dissolve_slp_only_groups): Remove.
            (vect_analyze_loop_2): Do not call it.

Diff:
---
 gcc/tree-vect-loop.cc | 75 ---
 1 file changed, 75 deletions(-)

diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index d57d34dfad27..2d5ea414559f 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -2260,78 +2260,6 @@ vect_get_datarefs_in_loop (loop_p loop, basic_block *bbs,
   return opt_result::success ();
 }
 
-/* Look for SLP-only access groups and turn each individual access into its own
-   group.  */
-static void
-vect_dissolve_slp_only_groups (loop_vec_info loop_vinfo)
-{
-  unsigned int i;
-  struct data_reference *dr;
-
-  DUMP_VECT_SCOPE ("vect_dissolve_slp_only_groups");
-
-  vec datarefs = LOOP_VINFO_DATAREFS (loop_vinfo);
-  FOR_EACH_VEC_ELT (datarefs, i, dr)
-    {
-      gcc_assert (DR_REF (dr));
-      stmt_vec_info stmt_info
-        = vect_stmt_to_vectorize (loop_vinfo->lookup_stmt (DR_STMT (dr)));
-
-      /* Check if the load is a part of an interleaving chain.  */
-      if (STMT_VINFO_GROUPED_ACCESS (stmt_info))
-        {
-          stmt_vec_info first_element = DR_GROUP_FIRST_ELEMENT (stmt_info);
-          dr_vec_info *dr_info = STMT_VINFO_DR_INFO (first_element);
-          unsigned int group_size = DR_GROUP_SIZE (first_element);
-
-          /* Check if SLP-only groups.  */
-          if (!STMT_SLP_TYPE (stmt_info)
-              && STMT_VINFO_SLP_VECT_ONLY (first_element))
-            {
-              /* Dissolve the group.  */
-              STMT_VINFO_SLP_VECT_ONLY (first_element) = false;
-
-              stmt_vec_info vinfo = first_element;
-              while (vinfo)
-                {
-                  stmt_vec_info next = DR_GROUP_NEXT_ELEMENT (vinfo);
-                  DR_GROUP_FIRST_ELEMENT (vinfo) = vinfo;
-                  DR_GROUP_NEXT_ELEMENT (vinfo) = NULL;
-                  DR_GROUP_SIZE (vinfo) = 1;
-                  if (STMT_VINFO_STRIDED_P (first_element)
-                      /* We cannot handle stores with gaps.  */
-                      || DR_IS_WRITE (dr_info->dr))
-                    {
-                      STMT_VINFO_STRIDED_P (vinfo) = true;
-                      DR_GROUP_GAP (vinfo) = 0;
-                    }
-                  else
-                    DR_GROUP_GAP (vinfo) = group_size - 1;
-                  /* Duplicate and adjust alignment info, it needs to
-                     be present on each group leader, see dr_misalignment.  */
-                  if (vinfo != first_element)
-                    {
-                      dr_vec_info *dr_info2 = STMT_VINFO_DR_INFO (vinfo);
-                      dr_info2->target_alignment = dr_info->target_alignment;
-                      int misalignment = dr_info->misalignment;
-                      if (misalignment != DR_MISALIGNMENT_UNKNOWN)
-                        {
-                          HOST_WIDE_INT diff
-                            = (TREE_INT_CST_LOW (DR_INIT (dr_info2->dr))
-                               - TREE_INT_CST_LOW (DR_INIT (dr_info->dr)));
-                          unsigned HOST_WIDE_INT align_c
-                            = dr_info->target_alignment.to_constant ();
-                          misalignment = (misalignment + diff) % align_c;
-                        }
-                      dr_info2->misalignment = misalignment;
-                    }
-                  vinfo = next;
-                }
-            }
-        }
-    }
-}
-
 /* Determine if operating on full vectors for LOOP_VINFO might leave
    some scalar iterations still to do.  If so, decide how we should
    handle those scalar iterations.  The possibilities are:
@@ -2687,9 +2615,6 @@ start_over:
         goto again;
     }
 
-  /* Dissolve SLP-only groups.  */
-  vect_dissolve_slp_only_groups (loop_vinfo);
-
   /* For now, we don't expect to mix both masking and length approaches for one
      loop, disable it if both are recorded.  */
   if (LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo)
[gcc r16-2158] Remove non-SLP vectorization factor determining
https://gcc.gnu.org/g:4b47acfe2b626d1276e229a0cf165e934813df6c

commit r16-2158-g4b47acfe2b626d1276e229a0cf165e934813df6c
Author: Richard Biener
Date: Wed Jul 9 12:53:45 2025 +0200

Remove non-SLP vectorization factor determining

The following removes the VF determining step from non-SLP stmts. For now we keep setting STMT_VINFO_VECTYPE for all stmts; there are too many places to fix, including some more complicated ones, so this is deferred to a followup.

Along with this, vect_update_vf_for_slp is removed, merging the check for present hybrid SLP stmts into vect_detect_hybrid_slp and failing analysis early. This also removes the essentially duplicate check in the stmt walk of vect_analyze_loop_operations. Getting rid of that, and performing some other checks earlier, is also deferred to a followup.

  * tree-vect-loop.cc (vect_determine_vf_for_stmt_1): Rename to ...
  (vect_determine_vectype_for_stmt_1): ... this and only set STMT_VINFO_VECTYPE. Fail for single-element vector types.
  (vect_determine_vf_for_stmt): Rename to ...
  (vect_determine_vectype_for_stmt): ... this and only set STMT_VINFO_VECTYPE. Fail for single-element vector types.
  (vect_determine_vectorization_factor): Rename to ...
  (vect_set_stmts_vectype): ... this and only set STMT_VINFO_VECTYPE.
  (vect_update_vf_for_slp): Remove.
  (vect_analyze_loop_operations): Remove walk over stmts.
  (vect_analyze_loop_2): Call vect_set_stmts_vectype instead of vect_determine_vectorization_factor. Set vectorization factor from LOOP_VINFO_SLP_UNROLLING_FACTOR. Fail if vect_detect_hybrid_slp detects hybrid stmts or when vect_make_slp_decision finds nothing to SLP.
  * tree-vect-slp.cc (vect_detect_hybrid_slp): Move check whether we have any hybrid stmts here from vect_update_vf_for_slp.
  * tree-vect-stmts.cc (vect_analyze_stmt): Remove loop over stmts.
  * tree-vectorizer.h (vect_detect_hybrid_slp): Update.
Diff: --- gcc/tree-vect-loop.cc | 220 ++--- gcc/tree-vect-slp.cc | 48 ++- gcc/tree-vect-stmts.cc | 12 ++- gcc/tree-vectorizer.h | 2 +- 4 files changed, 100 insertions(+), 182 deletions(-) diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc index 42e00159ff82..98ac528e3a97 100644 --- a/gcc/tree-vect-loop.cc +++ b/gcc/tree-vect-loop.cc @@ -168,9 +168,8 @@ static stmt_vec_info vect_is_simple_reduction (loop_vec_info, stmt_vec_info, may already be set for general statements (not just data refs). */ static opt_result -vect_determine_vf_for_stmt_1 (vec_info *vinfo, stmt_vec_info stmt_info, - bool vectype_maybe_set_p, - poly_uint64 *vf) +vect_determine_vectype_for_stmt_1 (vec_info *vinfo, stmt_vec_info stmt_info, + bool vectype_maybe_set_p) { gimple *stmt = stmt_info->stmt; @@ -192,6 +191,12 @@ vect_determine_vf_for_stmt_1 (vec_info *vinfo, stmt_vec_info stmt_info, if (stmt_vectype) { + if (known_le (TYPE_VECTOR_SUBPARTS (stmt_vectype), 1U)) + return opt_result::failure_at (STMT_VINFO_STMT (stmt_info), + "not vectorized: unsupported " + "data-type in %G", + STMT_VINFO_STMT (stmt_info)); + if (STMT_VINFO_VECTYPE (stmt_info)) /* The only case when a vectype had been already set is for stmts that contain a data ref, or for "pattern-stmts" (stmts generated @@ -203,9 +208,6 @@ vect_determine_vf_for_stmt_1 (vec_info *vinfo, stmt_vec_info stmt_info, STMT_VINFO_VECTYPE (stmt_info) = stmt_vectype; } - if (nunits_vectype) -vect_update_max_nunits (vf, nunits_vectype); - return opt_result::success (); } @@ -215,13 +217,12 @@ vect_determine_vf_for_stmt_1 (vec_info *vinfo, stmt_vec_info stmt_info, or false if something prevented vectorization. 
*/ static opt_result -vect_determine_vf_for_stmt (vec_info *vinfo, - stmt_vec_info stmt_info, poly_uint64 *vf) +vect_determine_vectype_for_stmt (vec_info *vinfo, stmt_vec_info stmt_info) { if (dump_enabled_p ()) dump_printf_loc (MSG_NOTE, vect_location, "==> examining statement: %G", stmt_info->stmt); - opt_result res = vect_determine_vf_for_stmt_1 (vinfo, stmt_info, false, vf); + opt_result res = vect_determine_vectype_for_stmt_1 (vinfo, stmt_info, false); if (!res) return res; @@ -240,7 +241,7 @@ vect_determine_vf_for_stmt (vec_info *vinfo, dump_printf_loc (MSG_NOTE, vect_location, "==> examining pattern def stmt: %G", def_stmt_info->stmt); -
[gcc r16-2159] Remove vect_analyze_loop_operations
https://gcc.gnu.org/g:3bf2aa834e1270e3167c9559bef9a8ef1f668604 commit r16-2159-g3bf2aa834e1270e3167c9559bef9a8ef1f668604 Author: Richard Biener Date: Wed Jul 9 15:04:12 2025 +0200 Remove vect_analyze_loop_operations This removes the remains of vect_analyze_loop_operations. All the checks it does still on LC PHIs of inner loops in outer loop vectorization should be handled by vectorizable_lc_phi. * tree-vect-loop.cc (vect_active_double_reduction_p): Remove. (vect_analyze_loop_operations): Remove. (vect_analyze_loop_2): Do not call it. Diff: --- gcc/tree-vect-loop.cc | 137 -- 1 file changed, 137 deletions(-) diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc index 98ac528e3a97..d57d34dfad27 100644 --- a/gcc/tree-vect-loop.cc +++ b/gcc/tree-vect-loop.cc @@ -1960,133 +1960,6 @@ vect_create_loop_vinfo (class loop *loop, vec_info_shared *shared, -/* Return true if STMT_INFO describes a double reduction phi and if - the other phi in the reduction is also relevant for vectorization. - This rejects cases such as: - - outer1: - x_1 = PHI ; - ... - - inner: - x_2 = ...; - ... - - outer2: - x_3 = PHI ; - - if nothing in x_2 or elsewhere makes x_1 relevant. */ - -static bool -vect_active_double_reduction_p (stmt_vec_info stmt_info) -{ - if (STMT_VINFO_DEF_TYPE (stmt_info) != vect_double_reduction_def) -return false; - - return STMT_VINFO_RELEVANT_P (STMT_VINFO_REDUC_DEF (stmt_info)); -} - -/* Function vect_analyze_loop_operations. - - Scan the loop stmts and make sure they are all vectorizable. 
*/ - -static opt_result -vect_analyze_loop_operations (loop_vec_info loop_vinfo) -{ - class loop *loop = LOOP_VINFO_LOOP (loop_vinfo); - basic_block *bbs = LOOP_VINFO_BBS (loop_vinfo); - int nbbs = loop->num_nodes; - int i; - stmt_vec_info stmt_info; - - DUMP_VECT_SCOPE ("vect_analyze_loop_operations"); - - for (i = 0; i < nbbs; i++) -{ - basic_block bb = bbs[i]; - - for (gphi_iterator si = gsi_start_phis (bb); !gsi_end_p (si); - gsi_next (&si)) -{ - gphi *phi = si.phi (); - - stmt_info = loop_vinfo->lookup_stmt (phi); - if (dump_enabled_p ()) - dump_printf_loc (MSG_NOTE, vect_location, "examining phi: %G", -(gimple *) phi); - if (virtual_operand_p (gimple_phi_result (phi))) - continue; - - /* ??? All of the below unconditional FAILs should be in -done earlier after analyzing cycles, possibly when -determining stmt relevancy? */ - - /* Inner-loop loop-closed exit phi in outer-loop vectorization - (i.e., a phi in the tail of the outer-loop). */ - if (! is_loop_header_bb_p (bb)) -{ - /* FORNOW: we currently don't support the case that these phis - are not used in the outerloop (unless it is double reduction, - i.e., this phi is vect_reduction_def), cause this case - requires to actually do something here. */ - if (STMT_VINFO_LIVE_P (stmt_info) - && !vect_active_double_reduction_p (stmt_info)) - return opt_result::failure_at (phi, - "Unsupported loop-closed phi" - " in outer-loop.\n"); - - /* If PHI is used in the outer loop, we check that its operand - is defined in the inner loop. 
*/ - if (STMT_VINFO_RELEVANT_P (stmt_info)) -{ - tree phi_op; - - if (gimple_phi_num_args (phi) != 1) -return opt_result::failure_at (phi, "unsupported phi"); - - phi_op = PHI_ARG_DEF (phi, 0); - stmt_vec_info op_def_info = loop_vinfo->lookup_def (phi_op); - if (!op_def_info) - return opt_result::failure_at (phi, "unsupported phi\n"); - - if (STMT_VINFO_RELEVANT (op_def_info) != vect_used_in_outer - && (STMT_VINFO_RELEVANT (op_def_info) - != vect_used_in_outer_by_reduction)) - return opt_result::failure_at (phi, "unsupported phi\n"); - - if ((STMT_VINFO_DEF_TYPE (stmt_info) == vect_internal_def - || (STMT_VINFO_DEF_TYPE (stmt_info) - == vect_double_reduction_def)) - && ! PURE_SLP_STMT (stmt_info)) - return opt_result::failure_at (phi, "unsupported phi\n"); -} - - continue; -} - - gcc_assert (stmt_info); - - if ((STMT_VINFO_RELEVANT (stmt_info) == vect_used_in_scope - || STMT_VINFO_LIVE_P (stmt_info)) - && STMT_VINFO_DEF_TYPE (stmt_info) != ve
[gcc r16-2171] testsuite: Add -funwind-tables to sve*/pfalse* tests
https://gcc.gnu.org/g:2ff8da46152cbade579700823cc7b1460ddd91b8 commit r16-2171-g2ff8da46152cbade579700823cc7b1460ddd91b8 Author: Richard Sandiford Date: Thu Jul 10 14:23:57 2025 +0100 testsuite: Add -funwind-tables to sve*/pfalse* tests The SVE svpfalse folding tests use CFI directives to delimit the function bodies. That requires -funwind-tables to be enabled, which is true by default for *-linux-gnu targets, but not for *-elf. gcc/testsuite/ * gcc.target/aarch64/sve/pfalse-binary.c: Add -funwind-tables. * gcc.target/aarch64/sve/pfalse-binary_int_opt_n.c: Likewise. * gcc.target/aarch64/sve/pfalse-binary_opt_n.c: Likewise. * gcc.target/aarch64/sve/pfalse-binary_opt_single_n.c: Likewise. * gcc.target/aarch64/sve/pfalse-binary_rotate.c: Likewise. * gcc.target/aarch64/sve/pfalse-binary_uint64_opt_n.c: Likewise. * gcc.target/aarch64/sve/pfalse-binary_uint_opt_n.c: Likewise. * gcc.target/aarch64/sve/pfalse-binaryxn.c: Likewise. * gcc.target/aarch64/sve/pfalse-clast.c: Likewise. * gcc.target/aarch64/sve/pfalse-compare_opt_n.c: Likewise. * gcc.target/aarch64/sve/pfalse-compare_wide_opt_n.c: Likewise. * gcc.target/aarch64/sve/pfalse-count_pred.c: Likewise. * gcc.target/aarch64/sve/pfalse-fold_left.c: Likewise. * gcc.target/aarch64/sve/pfalse-load.c: Likewise. * gcc.target/aarch64/sve/pfalse-load_ext.c: Likewise. * gcc.target/aarch64/sve/pfalse-load_ext_gather_index.c: Likewise. * gcc.target/aarch64/sve/pfalse-load_ext_gather_offset.c: Likewise. * gcc.target/aarch64/sve/pfalse-load_gather_sv.c: Likewise. * gcc.target/aarch64/sve/pfalse-load_gather_vs.c: Likewise. * gcc.target/aarch64/sve/pfalse-load_replicate.c: Likewise. * gcc.target/aarch64/sve/pfalse-prefetch.c: Likewise. * gcc.target/aarch64/sve/pfalse-prefetch_gather_index.c: Likewise. * gcc.target/aarch64/sve/pfalse-prefetch_gather_offset.c: Likewise. * gcc.target/aarch64/sve/pfalse-ptest.c: Likewise. * gcc.target/aarch64/sve/pfalse-rdffr.c: Likewise. * gcc.target/aarch64/sve/pfalse-reduction.c: Likewise. 
* gcc.target/aarch64/sve/pfalse-reduction_wide.c: Likewise. * gcc.target/aarch64/sve/pfalse-shift_right_imm.c: Likewise. * gcc.target/aarch64/sve/pfalse-store.c: Likewise. * gcc.target/aarch64/sve/pfalse-store_scatter_index.c: Likewise. * gcc.target/aarch64/sve/pfalse-store_scatter_offset.c: Likewise. * gcc.target/aarch64/sve/pfalse-storexn.c: Likewise. * gcc.target/aarch64/sve/pfalse-ternary_opt_n.c: Likewise. * gcc.target/aarch64/sve/pfalse-ternary_rotate.c: Likewise. * gcc.target/aarch64/sve/pfalse-unary.c: Likewise. * gcc.target/aarch64/sve/pfalse-unary_convert_narrowt.c: Likewise. * gcc.target/aarch64/sve/pfalse-unary_convertxn.c: Likewise. * gcc.target/aarch64/sve/pfalse-unary_n.c: Likewise. * gcc.target/aarch64/sve/pfalse-unary_pred.c: Likewise. * gcc.target/aarch64/sve/pfalse-unary_to_uint.c: Likewise. * gcc.target/aarch64/sve/pfalse-unaryxn.c: Likewise. * gcc.target/aarch64/sve2/pfalse-binary.c: Likewise. * gcc.target/aarch64/sve2/pfalse-binary_int_opt_n.c: Likewise. * gcc.target/aarch64/sve2/pfalse-binary_int_opt_single_n.c: Likewise. * gcc.target/aarch64/sve2/pfalse-binary_opt_n.c: Likewise. * gcc.target/aarch64/sve2/pfalse-binary_opt_single_n.c: Likewise. * gcc.target/aarch64/sve2/pfalse-binary_to_uint.c: Likewise. * gcc.target/aarch64/sve2/pfalse-binary_uint_opt_n.c: Likewise. * gcc.target/aarch64/sve2/pfalse-binary_wide.c: Likewise. * gcc.target/aarch64/sve2/pfalse-compare.c: Likewise. * gcc.target/aarch64/sve2/pfalse-load_ext_gather_index_restricted.c, * gcc.target/aarch64/sve2/pfalse-load_ext_gather_offset_restricted.c, * gcc.target/aarch64/sve2/pfalse-load_gather_sv_restricted.c: Likewise. * gcc.target/aarch64/sve2/pfalse-load_gather_vs.c: Likewise. * gcc.target/aarch64/sve2/pfalse-shift_left_imm_to_uint.c: Likewise. * gcc.target/aarch64/sve2/pfalse-shift_right_imm.c: Likewise. 
* gcc.target/aarch64/sve2/pfalse-store_scatter_index_restricted.c, * gcc.target/aarch64/sve2/pfalse-store_scatter_offset_restricted.c, * gcc.target/aarch64/sve2/pfalse-unary.c: Likewise. * gcc.target/aarch64/sve2/pfalse-unary_convert.c: Likewise. * gcc.target/aarch64/sve2/pfalse-unary_convert_narrowt.c: Likewise. * gcc.target/aarch64/sve2/pfalse-unary_to_int.c: Likewise. Diff: --- gcc/testsuite/gcc
[gcc r16-2172] [RISC-V] Detect new fusions for RISC-V
https://gcc.gnu.org/g:742f55622690d35c6cc95f2b8722307699731571

commit r16-2172-g742f55622690d35c6cc95f2b8722307699731571
Author: Daniel Barboza
Date: Thu Jul 10 07:28:38 2025 -0600

[RISC-V] Detect new fusions for RISC-V

This is primarily Daniel's work... He's chasing things in QEMU & LLVM right now so I'm doing a bit of clean-up and shepherding this patch forward.

--

Instruction fusion is a reasonably common way to improve the performance of code on many architectures/designs. A few years ago we submitted (via VRULL I suspect) fusion support for a number of cases in the RISC-V space. We made each type of fusion selectable independently in the tuning structure so that designs which implemented some particular set of fusions could select just the ones their design implemented.

This patch adds to that generic infrastructure. In particular we're introducing additional load fusions, store pair fusions, bitfield extractions and a few B extension related fusions.

Conceptually for the new load fusions we're adding the ability to fuse most add/shNadd instructions with a subsequent load. There are a couple of exceptions, but in general the expectation is that if we have add/shNadd for address computation, then they can potentially fuse with the load where the address gets used.

We've had limited forms of store pair fusion for a while. Essentially we required both stores to be 64 bits wide and land on opposite sides of a 128 bit cache line. That was enough to help prologues and a few other things, but was fairly restrictive. The new cases capture store pairs where the two stores have the same size and hit consecutive memory locations. For example, storing consecutive bytes with sb+sb is fusible.

For bitfield extractions we can fuse together a shift left followed by a shift right for arbitrary shift counts, whereas previously we restricted the shift counts to those implementing sign/zero extensions of 8 and 16 bit objects.

Finally some B extension fusions.
orc.b+not which shows up in string comparisons, ctz+andi (deepsjeng?), neg+max (synthesized abs). I hope these prove to be useful to other RISC-V designs. I wouldn't be surprised if we have to break down the new load fusions further for some designs. If we need to do that it wouldn't be hard. FWIW, our data indicates the generalized store fusions followed by the expanded load fusions are the most important cases for the new code. These have been tested with crosses and bootstrapped on the BPI. Waiting on pre-commit CI before moving forward (though it has been failing to pick up some patches recently...) gcc/ * config/riscv/riscv.cc (riscv_fusion_pairs): Add new cases. (riscv_set_is_add): New function. (riscv_set_is_addi, riscv_set_is_adduw, riscv_set_is_shNadd): Likewise. (riscv_set_is_shNadduw): Likewise. (riscv_macro_fusion_pair_p): Add new fusion cases. Co-authored-by: Jeff Law Diff: --- gcc/config/riscv/riscv.cc | 383 +- 1 file changed, 382 insertions(+), 1 deletion(-) diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index b868a503a35f..023adc3284df 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -283,6 +283,10 @@ enum riscv_fusion_pairs RISCV_FUSE_AUIPC_LD = (1 << 7), RISCV_FUSE_LDPREINCREMENT = (1 << 8), RISCV_FUSE_ALIGNED_STD = (1 << 9), + RISCV_FUSE_CACHE_ALIGNED_STD = (1 << 10), + RISCV_FUSE_BFEXT = (1 << 11), + RISCV_FUSE_EXPANDED_LD = (1 << 12), + RISCV_FUSE_B_ALUI = (1 << 13), }; /* Costs of various operations on the different architectures. 
*/ @@ -10205,6 +10209,81 @@ riscv_fusion_enabled_p(enum riscv_fusion_pairs op) return tune_param->fusible_ops & op; } +/* Matches an add: + (set (reg:DI rd) (plus:SI (reg:SI rs1) (reg:SI rs2))) */ + +static bool +riscv_set_is_add (rtx set) +{ + return (GET_CODE (SET_SRC (set)) == PLUS + && REG_P (XEXP (SET_SRC (set), 0)) + && REG_P (XEXP (SET_SRC (set), 1)) + && REG_P (SET_DEST (set))); +} + +/* Matches an addi: + (set (reg:DI rd) (plus:SI (reg:SI rs1) (const_int imm))) */ + +static bool +riscv_set_is_addi (rtx set) +{ + return (GET_CODE (SET_SRC (set)) == PLUS + && REG_P (XEXP (SET_SRC (set), 0)) + && CONST_INT_P (XEXP (SET_SRC (set), 1)) + && REG_P (SET_DEST (set))); +} + +/* Matches an add.uw: + (set (reg:DI rd) +(plus:DI (zero_extend:DI (reg:SI rs1)) (reg:DI rs2))) */ + +static bool +riscv_set_is_adduw (rtx set) +{ + return (GET_CODE (SET_SRC (set)) == PLUS + && GET_CODE (XEXP (SET_SRC (set), 0)) == ZERO_EXTEND + && REG_P (XEXP (XEXP (SET_SRC (set), 0), 0)) + && REG_P (XEXP (SET_SRC (set), 1))
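Note on the fusion rules described in the commit message above: the generalized store-pair rule (same size, consecutive memory locations) and the shift-left/shift-right bitfield extraction can be illustrated with a small standalone sketch. This is not the code in riscv.cc; `store_access`, `stores_are_consecutive` and `extract_bits` are hypothetical names, both stores are assumed to use the same base register, and only ascending order is modelled.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a store: byte offset from a common base register
// and access size in bytes.  Not a GCC data structure.
struct store_access
{
  int64_t offset;
  unsigned size;
};

// Sketch of the generalized rule: two stores pair up when they have the
// same size and the second starts exactly where the first ends.
// Assumption: only ascending order is considered here.
static bool
stores_are_consecutive (const store_access &first, const store_access &second)
{
  return first.size == second.size
         && second.offset == first.offset + (int64_t) first.size;
}

// A shift left followed by a logical shift right extracts a bitfield:
// LEFT discards the high bits, RIGHT then discards the low bits.
// Both counts must be smaller than 64.
static uint64_t
extract_bits (uint64_t x, unsigned left, unsigned right)
{
  return (x << left) >> right;
}
```

With this model, two byte stores at offsets 0 and 1 (the sb+sb case from the message) qualify, while two 8-byte stores at offsets 0 and 4 do not. Using equal counts of 56 or 48 in `extract_bits` gives the 8 and 16 bit zero extensions that were the only forms handled previously.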
[gcc r16-2176] Fixes to auto-profile and Gimple matching.
https://gcc.gnu.org/g:50f3a6a437ad4f2438191b6d9aa9aed8575b9372

commit r16-2176-g50f3a6a437ad4f2438191b6d9aa9aed8575b9372
Author: Jan Hubicka
Date: Thu Jul 10 16:56:21 2025 +0200

Fixes to auto-profile and Gimple matching.

This patch fixes several issues I noticed in gimple matching and the -Wauto-profile warning. One problem is that we mismatched symbols with user names, such as "*strlen" instead of "strlen". I added raw_symbol_name to strip the extra '*', which is OK on ELF targets, which are the only targets we support with auto-profile, but eventually we will want to add the user prefix. There is a sorry about this. Also I think dwarf2out is wrong:

  static void
  add_linkage_attr (dw_die_ref die, tree decl)
  {
    const char *name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl));

    /* Mimic what assemble_name_raw does with a leading '*'. */
    if (name[0] == '*')
      name = &name[1];

The patch also fixes the locations of warnings. I used the location of the problematic statement as the warning_at parameter but also included info about the containing function. This made warning_at ignore the first location, which is fixed now.

I also fixed the ICE with -Wno-auto-profile discussed earlier.

Bootstrapped/regtested x86_64-linux. Autoprofiled bootstrap now fails for weird reasons for me (it does not build the training stage), so I will try to debug this before committing.

gcc/ChangeLog:

  * auto-profile.cc: Include output.h.
  (function_instance::set_call_location): Also sanity check that location is known.
  (raw_symbol_name): Two new static functions.
  (dump_inline_stack): Use it.
  (string_table::get_index_by_decl): Likewise.
  (function_instance::get_cgraph_node): Likewise.
  (function_instance::get_function_instance_by_decl): Fix typo in warning; use raw names; fix lineno decoding.
  (match_with_target): Add containing function parameter; correctly output function and call location in warning.
  (function_instance::lookup_count): Fix warning locations.
(function_instance::match): Fix warning locations; avoid crash with mismatched callee; do not warn about broken callsites twice. (autofdo_source_profile::offline_external_functions): Use raw_assembler_name. (walk_block): Use raw_assembler_name. gcc/testsuite/ChangeLog: * gcc.dg/tree-prof/afdo-inline.c: Add user symbol names. Diff: --- gcc/auto-profile.cc | 231 +-- gcc/testsuite/gcc.dg/tree-prof/afdo-inline.c | 9 ++ 2 files changed, 156 insertions(+), 84 deletions(-) diff --git a/gcc/auto-profile.cc b/gcc/auto-profile.cc index 219676012e76..5226e4550257 100644 --- a/gcc/auto-profile.cc +++ b/gcc/auto-profile.cc @@ -53,6 +53,7 @@ along with GCC; see the file COPYING3. If not see #include "auto-profile.h" #include "tree-pretty-print.h" #include "gimple-pretty-print.h" +#include "output.h" /* The following routines implements AutoFDO optimization. @@ -430,7 +431,8 @@ public: void set_call_location (location_t l) { -gcc_checking_assert (call_location_ == UNKNOWN_LOCATION); +gcc_checking_assert (call_location_ == UNKNOWN_LOCATION +&& l != UNKNOWN_LOCATION); call_location_= l; } @@ -685,6 +687,26 @@ dump_afdo_loc (FILE *f, unsigned loc) fprintf (f, "%i", loc >> 16); } +/* Return assembler name as in symbol table and DW_AT_linkage_name. */ + +static const char * +raw_symbol_name (const char *asmname) +{ + /* If we start supporting user_label_prefixes, add_linkage_attr will also + need to be fixed. */ + if (strlen (user_label_prefix)) +sorry ("auto-profile is not supported for targets with user label prefix"); + return asmname + (asmname[0] == '*'); +} + +/* Convenience wrapper that looks up assembler name. */ + +static const char * +raw_symbol_name (tree decl) +{ + return raw_symbol_name (IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl))); +} + /* Dump STACK to F. */ static void @@ -695,7 +717,7 @@ dump_inline_stack (FILE *f, inline_stack *stack) { fprintf (f, "%s%s:", first ? 
"" : "; ", - IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (p.decl))); + raw_symbol_name (p.decl)); dump_afdo_loc (f, p.afdo_loc); first = false; } @@ -817,7 +839,7 @@ string_table::get_index (const char *name) const int string_table::get_index_by_decl (tree decl) const { - const char *name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl)); + const char *name = raw_symbol_name (decl); int ret = get_index (name); if (ret != -1) return ret; @@ -880,10 +902,9 @@ function_instance::~function_instan
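Note on raw_symbol_name: the '*' stripping that the patch adds relies on a comparison result being 0 or 1. The sketch below isolates that idiom in a standalone function. `strip_leading_star` is a hypothetical name, and the check of user_label_prefix with its call to sorry () from the real function is deliberately left out.

```cpp
#include <cassert>
#include <cstring>

// Standalone sketch of the '*' stripping done by raw_symbol_name in the
// patch.  The user_label_prefix check of the real function is omitted.
static const char *
strip_leading_star (const char *asmname)
{
  // The comparison yields 0 or 1, so the pointer advances by one
  // character only when the name starts with '*'.
  return asmname + (asmname[0] == '*');
}
```

Because the empty string has '\0' as its first character, the comparison is false there and the pointer is returned unchanged, so no separate length check is needed.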
[gcc r16-2177] cobol: Add PUSH and POP to CDF.
https://gcc.gnu.org/g:3f59a1cac717f8af84e884e9ec0f6ef14e102e6e commit r16-2177-g3f59a1cac717f8af84e884e9ec0f6ef14e102e6e Author: James K. Lowden Date: Wed Jul 9 18:14:40 2025 -0400 cobol: Add PUSH and POP to CDF. Introduce cdf_directives_t class to centralize management of CDF state. Move existing CDF state variables and functions into the new class. gcc/cobol/ChangeLog: PR cobol/120765 * cdf.y: Extend grammar for new CDF syntax, relocate dictionary. * cdfval.h (cdf_dictionary): Use new CDF dictionary. * dts.h: Remove useless assignment, note incorrect behavior. * except.cc: Remove obsolete EC state. * gcobol.1: Document CDF in its own section. * genapi.cc (parser_statement_begin): Use new EC state function. (parser_file_merge): Same. (parser_check_fatal_exception): Same. * genutil.cc (get_and_check_refstart_and_reflen): Same. (get_depending_on_value_from_odo): Same. (get_data_offset): Same. (process_this_exception): Same. * lexio.cc (check_push_pop_directive): New function. (check_source_format_directive): Restrict regex search to 1 line. (cdftext::free_form_reference_format): Use new function. * parse.y: Define new CDF tokens, use new CDF state. * parse_ante.h (cdf_tokens): Use new CDF state. (redefined_token): Same. (class prog_descr_t): Remove obsolete CDF state. (class program_stack_t): Same. (current_call_convention): Same. * scan.l: Recognize new CDF tokens. * scan_post.h (is_cdf_token): Same. * symbols.h (cdf_current_tokens): Change current_call_convention to return void. * token_names.h: Regenerate. * udf/stored-char-length.cbl: Use new PUSH/POP CDF functionality. * util.cc (class cdf_directives_t): Define cdf_directives_t. (current_call_convention): Same. (cdf_current_tokens): Same. (cdf_dictionary): Same. (cdf_enabled_exceptions): Same. (cdf_push): Same. (cdf_push_call_convention): Same. (cdf_push_current_tokens): Same. (cdf_push_dictionary): Same. (cdf_push_enabled_exceptions): Same. (cdf_push_source_format): Same. (cdf_pop): Same. 
(cdf_pop_call_convention): Same. (cdf_pop_current_tokens): Same. (cdf_pop_dictionary): Same. (cdf_pop_enabled_exceptions): Same. (cdf_pop_source_format): Same. * util.h (cdf_push): Declare cdf_directives_t. (cdf_push_call_convention): Same. (cdf_push_current_tokens): Same. (cdf_push_dictionary): Same. (cdf_push_enabled_exceptions): Same. (cdf_push_source_format): Same. (cdf_pop): Same. (cdf_pop_call_convention): Same. (cdf_pop_current_tokens): Same. (cdf_pop_dictionary): Same. (cdf_pop_source_format): Same. (cdf_pop_enabled_exceptions): Same. libgcobol/ChangeLog: * common-defs.h (cdf_enabled_exceptions): Use new CDF state. Diff: --- gcc/cobol/cdf.y | 94 +- gcc/cobol/cdfval.h |4 + gcc/cobol/dts.h | 14 +- gcc/cobol/except.cc |2 - gcc/cobol/gcobol.1 | 192 +-- gcc/cobol/genapi.cc |6 +- gcc/cobol/genutil.cc |7 + gcc/cobol/lexio.cc | 72 +- gcc/cobol/parse.y| 21 +- gcc/cobol/parse_ante.h | 48 +- gcc/cobol/scan.l | 13 + gcc/cobol/scan_post.h|2 + gcc/cobol/symbols.h |3 +- gcc/cobol/token_names.h | 2228 +- gcc/cobol/udf/stored-char-length.cbl |4 + gcc/cobol/util.cc| 90 +- gcc/cobol/util.h | 15 + libgcobol/common-defs.h |2 +- 18 files changed, 1541 insertions(+), 1276 deletions(-) diff --git a/gcc/cobol/cdf.y b/gcc/cobol/cdf.y index f1a791245854..840eb5033151 100644 --- a/gcc/cobol/cdf.y +++ b/gcc/cobol/cdf.y @@ -105,14 +105,14 @@ void input_file_status_notify(); using std::map; - static map dictionary; - #pragma GCC diagnostic push #pragma GCC diagnostic ignored "-Wunused-function" static bool cdfval_add( const char name[], const cdfval_t& value, bool override = false ) { +cdf_values_t& dictionary( cdf_dictionary() ); + if( scanner_parsing() ) { if( ! override ) { if( dictionary.find(name) != dictionary.end() ) return false; @@ -123,6 +123,8 @@ void input_file_status_notify(); } static void
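Note on PUSH and POP: the commit centralizes CDF state in cdf_directives_t and adds push/pop functions for each piece of state. The following is only a generic sketch of that save/restore pattern, assuming PUSH saves the current setting and POP restores the most recently saved one. `saved_state` is a hypothetical class and does not reflect the actual layout of cdf_directives_t in util.cc.

```cpp
#include <cassert>
#include <stack>
#include <string>

// Hypothetical stand-in for one piece of directive state (for example a
// source format or call convention) that can be saved and restored.
template <typename T>
class saved_state
{
  T current_;
  std::stack<T> saved_;

public:
  explicit saved_state (T initial) : current_ (initial) {}

  // Access the value currently in effect.
  T &current () { return current_; }

  // PUSH: remember the current value.
  void push () { saved_.push (current_); }

  // POP: restore the most recently pushed value.  Returns false when
  // there is nothing to restore, leaving the current value untouched.
  bool pop ()
  {
    if (saved_.empty ())
      return false;
    current_ = saved_.top ();
    saved_.pop ();
    return true;
  }
};
```

A caller would push before changing a setting and pop afterwards to get the earlier value back. Returning false from an unmatched pop is a design choice of this sketch, not something taken from the patch.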
[gcc r16-2178] aarch64: Fix LD1Q and ST1Q failures for big-endian
https://gcc.gnu.org/g:e7f049471c6caf22c65ac48773d864fca7a4cdc4 commit r16-2178-ge7f049471c6caf22c65ac48773d864fca7a4cdc4 Author: Richard Sandiford Date: Thu Jul 10 16:54:45 2025 +0100 aarch64: Fix LD1Q and ST1Q failures for big-endian LD1Q gathers and ST1Q scatters are unusual in that they operate on 128-bit blocks (effectively VNx1TI). However, we don't have modes or ACLE types for 128-bit integers, and 128-bit integers are not the intended use case. Instead, the instructions are intended to be used in "hybrid VLA" operations, where each 128-bit block is an Advanced SIMD vector. The normal SVE modes therefore capture the intended use case better than VNx1TI would. For example, VNx2DI is effectively N copies of V2DI, VNx4SI N copies of V4SI, etc. Since there is only one LD1Q instruction and one ST1Q instruction, the ACLE support used a single pattern for each, with the loaded or stored data having mode VNx2DI. The ST1Q pattern was generated by: rtx data = e.args.last (); e.args.last () = force_lowpart_subreg (VNx2DImode, data, GET_MODE (data)); e.prepare_gather_address_operands (1, false); return e.use_exact_insn (CODE_FOR_aarch64_scatter_st1q); where the force_lowpart_subreg bitcast the stored data to VNx2DI. But such subregs require an element reverse on big-endian targets (see the comment at the head of aarch64-sve.md), which wasn't the intention. The code should have used aarch64_sve_reinterpret instead. The LD1Q pattern was used as follows: e.prepare_gather_address_operands (1, false); return e.use_exact_insn (CODE_FOR_aarch64_gather_ld1q); which always returns a VNx2DI value, leaving the caller to bitcast that to the correct mode. That bitcast again uses subregs and has the same problem as above. However, for the reasons explained in the comment, using aarch64_sve_reinterpret does not work well for LD1Q. The patch instead parameterises the LD1Q based on the required data mode. gcc/ * config/aarch64/aarch64-sve2.md (aarch64_gather_ld1q): Replace with... 
(@aarch64_gather_ld1q): ...this, parameterizing based on mode. * config/aarch64/aarch64-sve-builtins-sve2.cc (svld1q_gather_impl::expand): Update accordingly. (svst1q_scatter_impl::expand): Use aarch64_sve_reinterpret instead of force_lowpart_subreg. Diff: --- gcc/config/aarch64/aarch64-sve-builtins-sve2.cc | 5 +++-- gcc/config/aarch64/aarch64-sve2.md | 21 +++-- 2 files changed, 18 insertions(+), 8 deletions(-) diff --git a/gcc/config/aarch64/aarch64-sve-builtins-sve2.cc b/gcc/config/aarch64/aarch64-sve-builtins-sve2.cc index d9922de7ca5a..abe21a8b61c6 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins-sve2.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins-sve2.cc @@ -316,7 +316,8 @@ public: expand (function_expander &e) const override { e.prepare_gather_address_operands (1, false); -return e.use_exact_insn (CODE_FOR_aarch64_gather_ld1q); +auto icode = code_for_aarch64_gather_ld1q (e.tuple_mode (0)); +return e.use_exact_insn (icode); } }; @@ -722,7 +723,7 @@ public: expand (function_expander &e) const override { rtx data = e.args.last (); -e.args.last () = force_lowpart_subreg (VNx2DImode, data, GET_MODE (data)); +e.args.last () = aarch64_sve_reinterpret (VNx2DImode, data); e.prepare_gather_address_operands (1, false); return e.use_exact_insn (CODE_FOR_aarch64_scatter_st1q); } diff --git a/gcc/config/aarch64/aarch64-sve2.md b/gcc/config/aarch64/aarch64-sve2.md index 789ec0dd1a3c..660901d4b3f1 100644 --- a/gcc/config/aarch64/aarch64-sve2.md +++ b/gcc/config/aarch64/aarch64-sve2.md @@ -334,12 +334,21 @@ ;; - LD1Q (SVE2p1) ;; - -;; Model this as operating on the largest valid element size, which is DI. -;; This avoids having to define move patterns & more for VNx1TI, which would -;; be difficult without a non-gather form of LD1Q. 
-(define_insn "aarch64_gather_ld1q" - [(set (match_operand:VNx2DI 0 "register_operand") - (unspec:VNx2DI +;; For little-endian targets, it would be enough to use a single pattern, +;; with a subreg to bitcast the result to whatever mode is needed. +;; However, on big-endian targets, the bitcast would need to be an +;; aarch64_sve_reinterpret instruction. That would interact badly +;; with the "&" and "?" constraints in this pattern: if the result +;; of the reinterpret needs to be in the same register as the index, +;; the RA would tend to prefer to allocate a separate register for the +;; intermediate (uncast) result, even if the reinterpret prefers tying. +;; +;; The index is logically VNx1DI rather than VNx2DI, but introducing +;; and using VNx1DI would just create mor
[gcc(refs/users/mikael/heads/stabilisation_descriptor_v01)] fortran: Amend descriptor bounds init if unallocated
https://gcc.gnu.org/g:7e72a078ae71594f6f34d406a80b47ca90cf876e

commit 7e72a078ae71594f6f34d406a80b47ca90cf876e
Author: Mikael Morin
Date: Wed Jul 9 09:40:32 2025 +0200

fortran: Amend descriptor bounds init if unallocated

Always generate the conditional initialization of unallocated variables regardless of the basic variable allocation tracking done in the frontend and with an additional always false condition.

The scalarizer used to always evaluate array bounds, including in the case of unallocated arrays on the left hand side of an assignment. This was (correctly) causing uninitialized warnings, even if the uninitialized values were in the end discarded.

Since the fix for PR fortran/108889, an initialization of the descriptor bounds is added to silence the uninitialized warnings, conditional on the array being unallocated. This initialization is not useful in the execution of the program, and it is removed if the compiler can prove that the variable is unallocated (in the case of a local variable for example). Unfortunately, the compiler is not always able to prove it and the useless initialization may remain in the final code.

Moreover, the generated code that was causing the evaluation of uninitialized variables has been changed to avoid them, so we can try to remove or revisit that unallocated variable bounds initialization tweak.

Unfortunately, just removing the extra initialization restores the warnings at -O0, as there is no dead code removal at that optimization level. Instead, this change keeps the initialization and modifies its guarding condition with an extra always false variable, so that if optimizations are enabled the whole initialization block is removed, and if they are disabled it remains and is sufficient to prevent the warning.

The new variable requires the code generation to be done earlier in the function so that the variable declaration and usage are in the same scope.
As the modified condition guarantees the removal of the block with optimizations, we can emit it more broadly and remove the basic allocation tracking that was done in the frontend to limit its emission. gcc/fortran/ChangeLog: * gfortran.h (gfc_symbol): Remove field allocated_in_scope. * trans-array.cc (gfc_array_allocate): Don't set it. (gfc_alloc_allocatable_for_assignment): Likewise. Generate the unallocated descriptor bounds initialisation before the opening of the reallocation code block. Create a variable and use it as additional condition to the unallocated descriptor bounds initialisation. Diff: --- gcc/fortran/gfortran.h | 4 -- gcc/fortran/trans-array.cc | 91 -- 2 files changed, 48 insertions(+), 47 deletions(-) diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h index 6848bd1762d3..69367e638c5b 100644 --- a/gcc/fortran/gfortran.h +++ b/gcc/fortran/gfortran.h @@ -2028,10 +2028,6 @@ typedef struct gfc_symbol /* Set if this should be passed by value, but is not a VALUE argument according to the Fortran standard. */ unsigned pass_as_value:1; - /* Set if an allocatable array variable has been allocated in the current - scope. Used in the suppression of uninitialized warnings in reallocation - on assignment. */ - unsigned allocated_in_scope:1; /* Set if an external dummy argument is called with different argument lists. This is legal in Fortran, but can cause problems with autogenerated C prototypes for C23. 
*/ diff --git a/gcc/fortran/trans-array.cc b/gcc/fortran/trans-array.cc index 7b83d3fab8d7..52888c1e1f1b 100644 --- a/gcc/fortran/trans-array.cc +++ b/gcc/fortran/trans-array.cc @@ -6800,8 +6800,6 @@ gfc_array_allocate (gfc_se * se, gfc_expr * expr, tree status, tree errmsg, else gfc_add_expr_to_block (&se->pre, set_descriptor); - expr->symtree->n.sym->allocated_in_scope = 1; - return true; } @@ -11495,14 +11493,60 @@ gfc_alloc_allocatable_for_assignment (gfc_loopinfo *loop, && !expr2->value.function.isym) expr2->ts.u.cl->backend_decl = rss->info->string_length; - gfc_start_block (&fblock); - /* Since the lhs is allocatable, this must be a descriptor type. Get the data and array size. */ desc = linfo->descriptor; gcc_assert (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (desc))); array1 = gfc_conv_descriptor_data_get (desc); + /* If the data is null, set the descriptor bounds and offset. This suppresses + the maybe used uninitialized warning. Note that the always false variable + prevents this block from from ever being executed. The whole block should + be removed by optimizations. Component references are not subject to the + warnings, so we don't uselessly complicate the generated code
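The guarding scheme described in the commit message above (an initialization block kept behind an extra always-false variable) can be sketched in plain C. This is only a rough analogue under stated assumptions: `struct descriptor`, `extent_after_realloc` and the static `storage` buffer are hypothetical names invented for illustration, not gfortran's descriptor layout or its generated code, and the sketch makes no claim about which warnings a C compiler would emit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an array descriptor; not GCC's real layout.  */
struct descriptor
{
  int *data;
  long lbound;
  long ubound;
};

/* Return the number of elements a following loop would iterate over,
   (re)allocating first if the descriptor has no data.  */
long
extent_after_realloc (struct descriptor *d, long new_len)
{
  /* Always-false guard variable (hypothetical name).  Without optimization
     the guarded block stays in the code; with optimization it is expected
     to be removed as dead code.  It never executes either way.  */
  int always_false = 0;

  if (d->data == NULL && always_false)
    {
      d->lbound = 1;
      d->ubound = 0;
    }

  /* Reallocation path: the bounds receive their real values here.  */
  if (d->data == NULL)
    {
      static int storage[16];	/* placeholder for a real allocation */
      d->data = storage;
      d->lbound = 1;
      d->ubound = new_len;
    }

  /* The bounds are read only after the reallocation code has run.  */
  return d->ubound - d->lbound + 1;
}
```

Reading the bounds only after the reallocation block mirrors the ordering the commit message relies on; the guarded block exists purely so that something visibly initializes the bounds when no dead code removal happens.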
[gcc r16-2164] aarch64: Extend HVLA permutations to big-endian
https://gcc.gnu.org/g:3b870131487d786a74f27a89d0415c8207770f14 commit r16-2164-g3b870131487d786a74f27a89d0415c8207770f14 Author: Richard Sandiford Date: Thu Jul 10 10:57:28 2025 +0100 aarch64: Extend HVLA permutations to big-endian TARGET_VECTORIZE_VEC_PERM_CONST has code to match the SVE2.1 "hybrid VLA" DUPQ, EXTQ, UZPQ{1,2}, and ZIPQ{1,2} instructions. This matching was conditional on !BYTES_BIG_ENDIAN. The ACLE code also lowered the associated SVE2.1 intrinsics into suitable VEC_PERM_EXPRs. This lowering was not conditional on !BYTES_BIG_ENDIAN. The mismatch led to lots of ICEs in the ACLE tests on big-endian targets: we lowered to VEC_PERM_EXPRs that are not supported. I think the !BYTES_BIG_ENDIAN restriction was unnecessary. SVE maps the first memory element to the least significant end of the register for both endiannesses, so no endian correction or lane number adjustment is necessary. This is in some ways a bit counterintuitive. ZIPQ1 is conceptually "apply Advanced SIMD ZIP1 to each 128-bit block" and endianness does matter when choosing between Advanced SIMD ZIP1 and ZIP2. For example, the V4SI permute selector { 0, 4, 1, 5 } corresponds to ZIP1 for little-endian and ZIP2 for big-endian. But the difference between the hybrid VLA and Advanced SIMD permute selectors is a consequence of the difference between the SVE and Advanced SIMD element orders. The same thing applies to ACLE intrinsics. The current lowering of svzipq1 etc. is correct for both endiannesses. If ACLE code does: 2x svld1_s32 + svzipq1_s32 + svst1_s32 then the byte-for-byte result is the same for both endiannesses. On big-endian targets, this is different from using the Advanced SIMD sequence below for each 128-bit block: 2x LDR + ZIP1 + STR In contrast, the byte-for-byte result of: 2x svld1q_gather_s32 + svzipq1_s32 + svst1q_scatter_s32 depends on endianness, since the quadword gathers and scatters use Advanced SIMD byte ordering for each 128-bit block.
This gather/scatter sequence behaves in the same way as the Advanced SIMD LDR+ZIP1+STR sequence for both endiannesses. Programmers writing ACLE code have to be aware of this difference if they want to support both endiannesses. The patch includes some new execution tests to verify the expansion of the VEC_PERM_EXPRs. gcc/ * doc/sourcebuild.texi (aarch64_sve2_hw, aarch64_sve2p1_hw): Document. * config/aarch64/aarch64.cc (aarch64_evpc_hvla): Extend to BYTES_BIG_ENDIAN. gcc/testsuite/ * lib/target-supports.exp (check_effective_target_aarch64_sve2p1_hw): New proc. * gcc.target/aarch64/sve2/dupq_1.c: Extend to big-endian. Add noipa attributes. * gcc.target/aarch64/sve2/extq_1.c: Likewise. * gcc.target/aarch64/sve2/uzpq_1.c: Likewise. * gcc.target/aarch64/sve2/zipq_1.c: Likewise. * gcc.target/aarch64/sve2/dupq_1_run.c: New test. * gcc.target/aarch64/sve2/extq_1_run.c: Likewise. * gcc.target/aarch64/sve2/uzpq_1_run.c: Likewise. * gcc.target/aarch64/sve2/zipq_1_run.c: Likewise. Diff: --- gcc/config/aarch64/aarch64.cc | 1 - gcc/doc/sourcebuild.texi | 6 ++ gcc/testsuite/gcc.target/aarch64/sve2/dupq_1.c | 26 +++ gcc/testsuite/gcc.target/aarch64/sve2/dupq_1_run.c | 87 ++ gcc/testsuite/gcc.target/aarch64/sve2/extq_1.c | 20 ++--- gcc/testsuite/gcc.target/aarch64/sve2/extq_1_run.c | 73 ++ gcc/testsuite/gcc.target/aarch64/sve2/uzpq_1.c | 18 ++--- gcc/testsuite/gcc.target/aarch64/sve2/uzpq_1_run.c | 78 +++ gcc/testsuite/gcc.target/aarch64/sve2/zipq_1.c | 18 ++--- gcc/testsuite/gcc.target/aarch64/sve2/zipq_1_run.c | 78 +++ gcc/testsuite/lib/target-supports.exp | 17 + 11 files changed, 380 insertions(+), 42 deletions(-) diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index 477cbece6c98..27c315fc35e8 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -26801,7 +26801,6 @@ aarch64_evpc_hvla (struct expand_vec_perm_d *d) machine_mode vmode = d->vmode; if (!TARGET_SVE2p1 || !TARGET_NON_STREAMING - || BYTES_BIG_ENDIAN || 
d->vec_flags != VEC_SVE_DATA || GET_MODE_UNIT_BITSIZE (vmode) > 64) return false; diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi index 6c5586e4b034..85fb810d96c5 100644 --- a/gcc/doc/sourcebuild.texi +++ b/gcc/doc/sourcebuild.texi @@ -2373,6 +2373,12 @@ whether it does so by default). @itemx aarch64_sve1024_hw @itemx aarch64_sve2048_hw Like @code{aarch64_sve_hw}, but also test for an exact ha
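To make the V4SI selector { 0, 4, 1, 5 } from the commit message above concrete, here is a small scalar model of a two-input permute in the style of VEC_PERM_EXPR, where selector values index into the concatenation of the two inputs. The function name `perm_v4si` is hypothetical, and the sketch only shows the selector semantics in array (memory) order; it does not model register lane numbering, which is where the endianness difference discussed above comes from.

```c
#include <assert.h>

/* Scalar model of a two-input permute on 4-element vectors: selector
   values 0..3 pick an element of A, values 4..7 pick an element of B.  */
void
perm_v4si (const int a[4], const int b[4], const unsigned sel[4], int out[4])
{
  for (int i = 0; i < 4; i++)
    out[i] = sel[i] < 4 ? a[sel[i]] : b[sel[i] - 4];
}
```

With the selector { 0, 4, 1, 5 }, the result interleaves the first two elements of each input in array order. Which Advanced SIMD instruction implements that (ZIP1 or ZIP2) is the endianness-dependent part that the commit message describes.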
[gcc r16-2167] Avoid vect_is_simple_use call from get_load_store_type
https://gcc.gnu.org/g:13beea469554efcffd0f2cda6f0484a603577f27 commit r16-2167-g13beea469554efcffd0f2cda6f0484a603577f27 Author: Richard Biener Date: Thu Jul 10 10:25:03 2025 +0200 Avoid vect_is_simple_use call from get_load_store_type This isn't the required refactoring of vect_check_gather_scatter but it avoids a now unnecessary call to vect_is_simple_use which is problematic because it looks at STMT_VINFO_VECTYPE which we want to get rid of. SLP build already ensures vect_is_simple_use on all lane defs, so all we need is to populate the offset_vectype and offset_dt which is not always set by vect_check_gather_scatter. That's both easy to get from the SLP child directly. * tree-vect-stmts.cc (get_load_store_type): Do not use vect_is_simple_use to fill gather/scatter offset operand vectype and dt. Diff: --- gcc/tree-vect-stmts.cc | 15 --- 1 file changed, 4 insertions(+), 11 deletions(-) diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index e5971e4a357b..4aa69da2218b 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -2466,17 +2466,10 @@ get_load_store_type (vec_info *vinfo, stmt_vec_info stmt_info, vls_type == VLS_LOAD ? "gather" : "scatter"); return false; } - else if (!vect_is_simple_use (gs_info->offset, vinfo, - &gs_info->offset_dt, - &gs_info->offset_vectype)) - { - if (dump_enabled_p ()) - dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, -"%s index use not simple.\n", -vls_type == VLS_LOAD ? "gather" : "scatter"); - return false; - } - else if (gs_info->ifn == IFN_LAST && !gs_info->decl) + slp_tree offset_node = SLP_TREE_CHILDREN (slp_node)[0]; + gs_info->offset_dt = SLP_TREE_DEF_TYPE (offset_node); + gs_info->offset_vectype = SLP_TREE_VECTYPE (offset_node); + if (gs_info->ifn == IFN_LAST && !gs_info->decl) { if (!TYPE_VECTOR_SUBPARTS (vectype).is_constant () || !TYPE_VECTOR_SUBPARTS (gs_info->offset_vectype).is_constant ()
[gcc r16-2168] Avoid vect_is_simple_use call from vectorizable_reduction
https://gcc.gnu.org/g:31c96621cc307fed1a0a01c0c2f18afaaf50b256 commit r16-2168-g31c96621cc307fed1a0a01c0c2f18afaaf50b256 Author: Richard Biener Date: Thu Jul 10 11:21:26 2025 +0200 Avoid vect_is_simple_use call from vectorizable_reduction When analyzing the reduction cycle we look to determine the reduction input vector type, for lane-reducing ops we look at the input but instead of using vect_is_simple_use which is problematic for SLP we should simply get at the SLP operands vector type. If that's not set and we make up one we should also ensure it stays so. * tree-vect-loop.cc (vectorizable_reduction): Avoid vect_is_simple_use and record a vector type if we come up with one. Diff: --- gcc/tree-vect-loop.cc | 13 + 1 file changed, 5 insertions(+), 8 deletions(-) diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc index 7b260c34a846..8ea0f45d79fc 100644 --- a/gcc/tree-vect-loop.cc +++ b/gcc/tree-vect-loop.cc @@ -7378,23 +7378,20 @@ vectorizable_reduction (loop_vec_info loop_vinfo, if (lane_reducing_op_p (op.code)) { - enum vect_def_type dt; - tree vectype_op; - /* The last operand of lane-reducing operation is for reduction. */ gcc_assert (reduc_idx > 0 && reduc_idx == (int) op.num_ops - 1); - if (!vect_is_simple_use (op.ops[0], loop_vinfo, &dt, &vectype_op)) - return false; - + slp_tree op_node = SLP_TREE_CHILDREN (slp_for_stmt_info)[0]; + tree vectype_op = SLP_TREE_VECTYPE (op_node); tree type_op = TREE_TYPE (op.ops[0]); - if (!vectype_op) { vectype_op = get_vectype_for_scalar_type (loop_vinfo, type_op); - if (!vectype_op) + if (!vectype_op + || !vect_maybe_update_slp_op_vectype (op_node, + vectype_op)) return false; }
[gcc r15-9949] Fix 'main' function in 'gcc.dg/builtin-dynamic-object-size-pr120780.c'
https://gcc.gnu.org/g:57eae2c32f2ce654053f5ce4b6fb4eb79381d7da commit r15-9949-g57eae2c32f2ce654053f5ce4b6fb4eb79381d7da Author: Thomas Schwinge Date: Wed Jul 9 10:06:39 2025 +0200 Fix 'main' function in 'gcc.dg/builtin-dynamic-object-size-pr120780.c' Fix-up for commit 72e85d46472716e670cbe6e967109473b8d12d38 "tree-optimization/120780: Support object size for containing objects". 'size_t sz' is unused here, and GCC/nvptx doesn't accept this: spawn -ignore SIGHUP [...]/nvptx-none-run ./builtin-dynamic-object-size-pr120780.exe error : Prototype doesn't match for 'main' in 'input file 1 at offset 1924', first defined in 'input file 1 at offset 1924' nvptx-run: cuLinkAddData failed: unknown error (CUDA_ERROR_UNKNOWN, 999) FAIL: gcc.dg/builtin-dynamic-object-size-pr120780.c execution test gcc/testsuite/ * gcc.dg/builtin-dynamic-object-size-pr120780.c: Fix 'main' function. (cherry picked from commit c6ca6e57004653b787d2d6243fe5ee00cda8aad0) Diff: --- gcc/testsuite/gcc.dg/builtin-dynamic-object-size-pr120780.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/testsuite/gcc.dg/builtin-dynamic-object-size-pr120780.c b/gcc/testsuite/gcc.dg/builtin-dynamic-object-size-pr120780.c index 0d6593ec8289..12e6c29569c7 100644 --- a/gcc/testsuite/gcc.dg/builtin-dynamic-object-size-pr120780.c +++ b/gcc/testsuite/gcc.dg/builtin-dynamic-object-size-pr120780.c @@ -207,7 +207,7 @@ test5 (size_t sz) } int -main (size_t sz) +main (void) { test1 (sizeof (struct container)); test1 (sizeof (struct container) - sizeof (int));
[gcc(refs/users/omachota/heads/rtl-ssa-dce)] rtl-ssa-dce: format code
https://gcc.gnu.org/g:e5a639732d48e976af0466bb3721d0e0df3da8da commit e5a639732d48e976af0466bb3721d0e0df3da8da Author: Ondřej Machota Date: Thu Jul 10 18:08:51 2025 +0200 rtl-ssa-dce: format code Diff: --- gcc/dce.cc | 61 +++-- 1 file changed, 31 insertions(+), 30 deletions(-) diff --git a/gcc/dce.cc b/gcc/dce.cc index 67fb42541d84..4691901f56d6 100644 --- a/gcc/dce.cc +++ b/gcc/dce.cc @@ -1386,6 +1386,7 @@ private: sbitmap m_marked_phis; }; +// Return true if INSN cannot be deleted. bool rtl_ssa_dce::is_rtx_pattern_prelive (const_rtx insn) { @@ -1393,7 +1394,7 @@ rtl_ssa_dce::is_rtx_pattern_prelive (const_rtx insn) { case PREFETCH: case UNSPEC: -case TRAP_IF: /* testsuite/gcc.c-torture/execute/20020418-1.c */ +case TRAP_IF: return true; default: @@ -1401,7 +1402,7 @@ rtl_ssa_dce::is_rtx_pattern_prelive (const_rtx insn) } } -// Return true if an call INSN can be deleted +// Return true if an call INSN can be deleted. bool rtl_ssa_dce::can_delete_call (const_rtx insn) { @@ -1418,6 +1419,7 @@ rtl_ssa_dce::can_delete_call (const_rtx insn) && cfun->can_delete_dead_exceptions && insn_nothrow_p (insn); } +// Return true if rtx INSN is prelive. bool rtl_ssa_dce::is_rtx_prelive (const_rtx insn) { @@ -1461,33 +1463,34 @@ rtl_ssa_dce::is_rtx_prelive (const_rtx insn) } } +// Return true if INSN is prelive - cannot be deleted. bool rtl_ssa_dce::is_prelive (insn_info *insn) { // Bb head and end contain artificial uses that we need to mark as prelive. // Debug instructions are also prelive, however, they are not added to the - // worklist + // worklist. if (insn->is_bb_head () || insn->is_bb_end () || insn->is_debug_insn ()) return true; - // Phi instructions are never prelive + // Phi instructions are never prelive. if (insn->is_artificial ()) return false; - gcc_assert (insn->is_real ()); + gcc_checking_assert (insn->is_real ()); for (def_info *def : insn->defs ()) { - // The purpose of this pass is not to eliminate memory stores... 
+ // The purpose of this pass is not to eliminate memory stores. if (def->is_mem ()) return true; gcc_checking_assert (def->is_reg ()); - // We should not delete the frame pointer because of the dward2frame pass + // We should not delete the frame pointer because of the dward2frame pass. if (frame_pointer_needed && def->regno () == HARD_FRAME_POINTER_REGNUM) return true; - // Skip clobbers, they are handled inside is_rtx_prelive + // Skip clobbers, they are handled inside is_rtx_prelive. if (def->kind () == access_kind::CLOBBER) continue; @@ -1509,7 +1512,7 @@ rtl_ssa_dce::is_prelive (insn_info *insn) // Mark SET as visited and return true if SET->insn() is not nullptr and SET // has not been visited. Otherwise return false. bool -rtl_ssa_dce::mark_if_not_visited (const set_info *set) +rtl_ssa_dce::mark_if_not_visited (set_info *set) { insn_info *insn = set->insn (); if (!insn) @@ -1517,20 +1520,20 @@ rtl_ssa_dce::mark_if_not_visited (const set_info *set) if (insn->is_phi ()) { - const phi_info *phi = static_cast (set); - auto uid = phi->uid (); + phi_info *phi = static_cast (set); + unsigned int uid = phi->uid (); if (bitmap_bit_p (m_marked_phis, uid)) return false; bitmap_set_bit (m_marked_phis, uid); if (dump_file) - fprintf (dump_file, "Phi node %d:%d marked as live\n", set->regno (), + fprintf (dump_file, "Phi node %d in insn %d marked as live\n", uid, insn->uid ()); } else { - auto uid = insn->uid (); + unsigned int uid = insn->uid (); if (m_marked.get_bit (uid)) return false; @@ -1550,8 +1553,6 @@ rtl_ssa_dce::append_not_visited_sets (auto_vec &worklist, { for (use_info *use : uses) { - // This seems to be a good idea, however there is a problem is - // process_uses_of_deleted_def if (use->only_occurs_in_notes ()) continue; @@ -1562,27 +1563,23 @@ rtl_ssa_dce::append_not_visited_sets (auto_vec &worklist, if (!mark_if_not_visited (parent_set)) continue; - // mark_if_not_visited will return false if insn is nullptr - // insn_info *insn = parent_set->insn (); - 
// gcc_checking_assert (insn); - - // if (dump_file) - // fprintf (dump_file, "Adding insn %d to worklist\n", insn->uid ()); worklist.safe_push (parent_set); } } -// Mark INSN and add its uses to WORKLIST if INSN is not a debug instruction +// Mark INSN and add its uses to WORKLIST if INSN is not a debug instruction. void rtl_ssa_dce::mark_prelive_insn (insn_info *insn, auto_vec &worklist) { if (dump_file) fprintf (dump_file, "Insn %d marked as prelive\n", insn->uid ()); - // A phi node will never be pre
[gcc r16-2165] aarch64: PR target/120999: Adjust operands for movprfx alternative of NBSL implementation of NOR
https://gcc.gnu.org/g:b7bd72ce71df5266e7a7039da318e49862389a72 commit r16-2165-gb7bd72ce71df5266e7a7039da318e49862389a72 Author: Kyrylo Tkachov Date: Wed Jul 9 10:04:01 2025 -0700 aarch64: PR target/120999: Adjust operands for movprfx alternative of NBSL implementation of NOR While the SVE2 NBSL instruction accepts MOVPRFX to add more flexibility due to its tied operands, the destination of the movprfx cannot be also a source operand. But the offending pattern in aarch64-sve2.md tries to do exactly that for the "=?&w,w,w" alternative and gas warns for the attached testcase. This patch adjusts that alternative to avoid taking operand 0 as an input in the NBSL again. So for the testcase in the patch we now generate: nor_z: movprfx z0, z1 nbsl z0.d, z0.d, z2.d, z1.d ret instead of the previous: nor_z: movprfx z0, z1 nbsl z0.d, z0.d, z2.d, z0.d ret which generated a gas warning. Bootstrapped and tested on aarch64-none-linux-gnu. Signed-off-by: Kyrylo Tkachov gcc/ PR target/120999 * config/aarch64/aarch64-sve2.md (*aarch64_sve2_nor): Adjust movprfx alternative. gcc/testsuite/ PR target/120999 * gcc.target/aarch64/sve2/pr120999.c: New test.
Diff: --- gcc/config/aarch64/aarch64-sve2.md | 2 +- gcc/testsuite/gcc.target/aarch64/sve2/pr120999.c | 17 + 2 files changed, 18 insertions(+), 1 deletion(-) diff --git a/gcc/config/aarch64/aarch64-sve2.md b/gcc/config/aarch64/aarch64-sve2.md index 62524f36de65..789ec0dd1a3c 100644 --- a/gcc/config/aarch64/aarch64-sve2.md +++ b/gcc/config/aarch64/aarch64-sve2.md @@ -1628,7 +1628,7 @@ "TARGET_SVE2" {@ [ cons: =0 , %1 , 2 ; attrs: movprfx ] [ w, 0 , w ; * ] nbsl\t%0.d, %0.d, %2.d, %0.d - [ ?&w , w , w ; yes] movprfx\t%0, %1\;nbsl\t%0.d, %0.d, %2.d, %0.d + [ ?&w , w , w ; yes] movprfx\t%0, %1\;nbsl\t%0.d, %0.d, %2.d, %1.d } "&& !CONSTANT_P (operands[3])" { diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pr120999.c b/gcc/testsuite/gcc.target/aarch64/sve2/pr120999.c new file mode 100644 index ..2dca36aea228 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pr120999.c @@ -0,0 +1,17 @@ +/* PR target/120999. */ +/* { dg-do assemble } */ +/* { dg-options "-O2 --save-temps" } */ +/* { dg-final { check-function-bodies "**" "" "" } } */ + +#include + +#define NOR(x, y) (~((x) | (y))) + +/* +** nor_z: +** movprfx z0, z1 +** nbsl z0.d, z0.d, z2.d, z1.d +** ret +*/ +svuint64_t nor_z(svuint64_t c, svuint64_t a, svuint64_t b) { return NOR(a, b); } +
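Why the fixed sequence still computes a NOR can be checked with a scalar model. The sketch below assumes that NBSL is the bitwise inverse of a bitwise select whose last source operand is the selector (bits of the first source where the selector is 1, bits of the second source where it is 0); that reading of the instruction, and the helper names `bsl_model`, `nbsl_model` and `nor_via_nbsl`, are assumptions made for illustration rather than anything taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bitwise-select semantics: bits of N where K is 1, bits of M
   where K is 0.  */
uint64_t
bsl_model (uint64_t n, uint64_t m, uint64_t k)
{
  return (n & k) | (m & ~k);
}

/* Assumed NBSL semantics: the inverse of the select above.  */
uint64_t
nbsl_model (uint64_t n, uint64_t m, uint64_t k)
{
  return ~bsl_model (n, m, k);
}

/* Model of the fixed sequence, with A standing for z1 and B for z2.  */
uint64_t
nor_via_nbsl (uint64_t a, uint64_t b)
{
  uint64_t z0 = a;			/* movprfx z0, z1 */
  return nbsl_model (z0, b, a);		/* nbsl z0.d, z0.d, z2.d, z1.d */
}
```

Under that assumption the result is ~((a & a) | (b & ~a)), which simplifies to ~(a | b). Because z0 holds a copy of z1 after the movprfx, reading the selector from z1 gives the same value as reading it from z0, while no longer using the movprfx destination as an additional source.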
[gcc r15-9948] tree-optimization/120780: Support object size for containing objects
https://gcc.gnu.org/g:63c4d4f59a92007c6d0f35e4d7aa1a97691306db commit r15-9948-g63c4d4f59a92007c6d0f35e4d7aa1a97691306db Author: Siddhesh Poyarekar Date: Thu Jun 26 17:46:00 2025 -0400 tree-optimization/120780: Support object size for containing objects MEM_REF cast of a subobject to its containing object has negative offsets, which objsz sees as an invalid access. Support this use case by peeking into the structure to validate that the containing object indeed contains a type of the subobject at that offset and if present, adjust the wholesize for the object to allow the negative offset. gcc/ChangeLog: PR tree-optimization/120780 * tree-object-size.cc (inner_at_offset, get_wholesize_for_memref): New functions. (addr_object_size): Call get_wholesize_for_memref. gcc/testsuite/ChangeLog: PR tree-optimization/120780 * gcc.dg/builtin-dynamic-object-size-pr120780.c: New test case. Signed-off-by: Siddhesh Poyarekar (cherry picked from commit 72e85d46472716e670cbe6e967109473b8d12d38) Diff: --- .../gcc.dg/builtin-dynamic-object-size-pr120780.c | 233 + gcc/tree-object-size.cc| 90 +++- 2 files changed, 322 insertions(+), 1 deletion(-) diff --git a/gcc/testsuite/gcc.dg/builtin-dynamic-object-size-pr120780.c b/gcc/testsuite/gcc.dg/builtin-dynamic-object-size-pr120780.c new file mode 100644 index ..0d6593ec8289 --- /dev/null +++ b/gcc/testsuite/gcc.dg/builtin-dynamic-object-size-pr120780.c @@ -0,0 +1,233 @@ +/* { dg-do run } */ +/* { dg-options "-O2" } */ + +#include "builtin-object-size-common.h" +typedef __SIZE_TYPE__ size_t; +#define NUM_MCAST_RATE 6 + +#define MIN(a,b) ((a) < (b) ? (a) : (b)) +#define MAX(a,b) ((a) > (b) ? 
(a) : (b)) + +struct inner +{ + int dummy[4]; +}; + +struct container +{ + int mcast_rate[NUM_MCAST_RATE]; + struct inner mesh; +}; + +static void +test1_child (struct inner *ifmsh, size_t expected) +{ + struct container *sdata = +(struct container *) ((void *) ifmsh + - __builtin_offsetof (struct container, mesh)); + + if (__builtin_dynamic_object_size (sdata->mcast_rate, 1) + != sizeof (sdata->mcast_rate)) +FAIL (); + + if (__builtin_dynamic_object_size (&sdata->mesh, 1) != expected) +FAIL (); +} + +void +__attribute__((noinline)) +test1 (size_t sz) +{ + struct container *sdata = __builtin_malloc (sz); + struct inner *ifmsh = &sdata->mesh; + + test1_child (ifmsh, + (sz > sizeof (sdata->mcast_rate) + ? sz - sizeof (sdata->mcast_rate) : 0)); + + __builtin_free (sdata); +} + +struct container2 +{ + int mcast_rate[NUM_MCAST_RATE]; + union +{ + int dummy; + double dbl; + struct inner mesh; +} u; +}; + +static void +test2_child (struct inner *ifmsh, size_t sz) +{ + struct container2 *sdata = +(struct container2 *) ((void *) ifmsh + - __builtin_offsetof (struct container2, u.mesh)); + + if (__builtin_dynamic_object_size (sdata->mcast_rate, 1) + != sizeof (sdata->mcast_rate)) +FAIL (); + + size_t diff = sizeof (*sdata) - sz; + size_t expected = MIN(sizeof (double), MAX (sizeof (sdata->u), diff) - diff); + + if (__builtin_dynamic_object_size (&sdata->u.dbl, 1) != expected) +FAIL (); + + expected = MAX (sizeof (sdata->u.mesh), diff) - diff; + if (__builtin_dynamic_object_size (&sdata->u.mesh, 1) != expected) +FAIL (); +} + +void +__attribute__((noinline)) +test2 (size_t sz) +{ + struct container2 *sdata = __builtin_malloc (sz); + struct inner *ifmsh = &sdata->u.mesh; + + test2_child (ifmsh, sz);; + + __builtin_free (sdata); +} + +struct container3 +{ + int mcast_rate[NUM_MCAST_RATE]; + char mesh[8]; +}; + +static void +test3_child (char ifmsh[], size_t expected) +{ + struct container3 *sdata = +(struct container3 *) ((void *) ifmsh + - __builtin_offsetof (struct 
container3, mesh)); + + if (__builtin_dynamic_object_size (sdata->mcast_rate, 1) + != sizeof (sdata->mcast_rate)) +FAIL (); + + if (__builtin_dynamic_object_size (sdata->mesh, 1) != expected) +FAIL (); +} + +void +__attribute__((noinline)) +test3 (size_t sz) +{ + struct container3 *sdata = __builtin_malloc (sz); + char *ifmsh = sdata->mesh; + size_t diff = sizeof (*sdata) - sz; + + test3_child (ifmsh, MAX(sizeof (sdata->mesh), diff) - diff); + + __builtin_free (sdata); +} + + +struct container4 +{ + int mcast_rate[NUM_MCAST_RATE]; + struct +{ + int dummy; + struct inner mesh; +} s; +}; + +static void +test4_child (struct inner *ifmsh, size_t expected) +{ + struct container4 *sdata = +(struct container4 *) ((void *) ifmsh + - __builtin_offsetof (struct container4, s.mesh)); + + + if (__builtin_dynamic_object_size (sdata->mcast_rate, 1) + != sizeof (sdata->mcast_ra
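The "cast of a subobject to its containing object" that the commit message above refers to is the familiar container-of idiom. A minimal sketch, reusing the `struct inner` and `struct container` layout from the test case (with NUM_MCAST_RATE written out as 6); the helper name `container_of_mesh` is hypothetical, and the sketch uses `char *` arithmetic with the standard `offsetof` macro instead of the GNU `void *` arithmetic and `__builtin_offsetof` used in the test:

```c
#include <assert.h>
#include <stddef.h>

struct inner
{
  int dummy[4];
};

struct container
{
  int mcast_rate[6];
  struct inner mesh;
};

/* Step back from a pointer to the MESH member to the enclosing object.
   The subtraction is the negative offset mentioned in the commit
   message.  */
struct container *
container_of_mesh (struct inner *ifmsh)
{
  return (struct container *) ((char *) ifmsh
			       - offsetof (struct container, mesh));
}
```

The result is only meaningful when the pointer really does point at the `mesh` member of a `struct container`, which is the property the commit message says the object size pass now validates before adjusting the whole-object size.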
[gcc r16-2166] Pass SLP node down to cost hook for reduction cost
https://gcc.gnu.org/g:b57c6b5d27dd1840e9d466a5717476280287a322 commit r16-2166-gb57c6b5d27dd1840e9d466a5717476280287a322 Author: Richard Biener Date: Thu Jul 10 10:08:23 2025 +0200 Pass SLP node down to cost hook for reduction cost The following arranges vector reduction costs to hand down the SLP node (of the reduction stmt) to the cost hooks, not only the stmt_info. This also avoids accessing STMT_VINFO_VECTYPE of an unrelated stmt to the node that is subject to code generation. * tree-vect-loop.cc (vect_model_reduction_cost): Get SLP node instead of stmt_info and use that when recording costs. Diff: --- gcc/tree-vect-loop.cc | 37 +++-- 1 file changed, 19 insertions(+), 18 deletions(-) diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc index 6f9765b54594..7b260c34a846 100644 --- a/gcc/tree-vect-loop.cc +++ b/gcc/tree-vect-loop.cc @@ -5001,7 +5001,7 @@ vect_is_emulated_mixed_dot_prod (stmt_vec_info stmt_info) static void vect_model_reduction_cost (loop_vec_info loop_vinfo, - stmt_vec_info stmt_info, internal_fn reduc_fn, + slp_tree node, internal_fn reduc_fn, vect_reduction_type reduction_type, int ncopies, stmt_vector_for_cost *cost_vec) { @@ -5017,9 +5017,10 @@ vect_model_reduction_cost (loop_vec_info loop_vinfo, if (reduction_type == COND_REDUCTION) ncopies *= 2; - vectype = STMT_VINFO_VECTYPE (stmt_info); + vectype = SLP_TREE_VECTYPE (node); mode = TYPE_MODE (vectype); - stmt_vec_info orig_stmt_info = vect_orig_stmt (stmt_info); + stmt_vec_info orig_stmt_info += vect_orig_stmt (SLP_TREE_REPRESENTATIVE (node)); gimple_match_op op; if (!gimple_extract_op (orig_stmt_info->stmt, &op)) @@ -5037,16 +5038,16 @@ vect_model_reduction_cost (loop_vec_info loop_vinfo, if (reduc_fn != IFN_LAST) /* Count one reduction-like operation per vector. */ inside_cost = record_stmt_cost (cost_vec, ncopies, vec_to_scalar, - stmt_info, 0, vect_body); + node, 0, vect_body); else { /* Use NELEMENTS extracts and NELEMENTS scalar ops. 
*/ unsigned int nelements = ncopies * vect_nunits_for_cost (vectype); inside_cost = record_stmt_cost (cost_vec, nelements, - vec_to_scalar, stmt_info, 0, + vec_to_scalar, node, 0, vect_body); inside_cost += record_stmt_cost (cost_vec, nelements, - scalar_stmt, stmt_info, 0, + scalar_stmt, node, 0, vect_body); } } @@ -5063,7 +5064,7 @@ vect_model_reduction_cost (loop_vec_info loop_vinfo, /* We need the initial reduction value. */ prologue_stmts = 1; prologue_cost += record_stmt_cost (cost_vec, prologue_stmts, -scalar_to_vec, stmt_info, 0, +scalar_to_vec, node, 0, vect_prologue); } @@ -5080,24 +5081,24 @@ vect_model_reduction_cost (loop_vec_info loop_vinfo, { /* An EQ stmt and an COND_EXPR stmt. */ epilogue_cost += record_stmt_cost (cost_vec, 2, -vector_stmt, stmt_info, 0, +vector_stmt, node, 0, vect_epilogue); /* Reduction of the max index and a reduction of the found values. */ epilogue_cost += record_stmt_cost (cost_vec, 2, -vec_to_scalar, stmt_info, 0, +vec_to_scalar, node, 0, vect_epilogue); /* A broadcast of the max value. */ epilogue_cost += record_stmt_cost (cost_vec, 1, -scalar_to_vec, stmt_info, 0, +scalar_to_vec, node, 0, vect_epilogue); } else { epilogue_cost += record_stmt_cost (cost_vec, 1, vector_stmt, -stmt_info, 0, vect_epilogue); +node, 0, vect_epilogue); epilogue_cost += record_stmt_cost (cost_vec, 1, -vec_to_scalar, stmt_info, 0, +vec_to_scalar, node, 0,
[gcc r16-2170] Handle failed gcond pattern gracefully
https://gcc.gnu.org/g:2f2e9bcfb0fd9cbf46e2d0d03b3f32f7df8d4fff commit r16-2170-g2f2e9bcfb0fd9cbf46e2d0d03b3f32f7df8d4fff Author: Richard Biener Date: Thu Jul 10 11:26:04 2025 +0200 Handle failed gcond pattern gracefully SLP analysis of early break conditions asserts pattern recognition canonicalized all of them. But the pattern can fail, for example when vector types cannot be computed. So be graceful here, so we don't ICE when we didn't yet compute vector types. * tree-vect-slp.cc (vect_analyze_slp): Fail for non-canonical gconds. Diff: --- gcc/tree-vect-slp.cc | 12 +--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc index 5ef45fd60f57..ad75386926a8 100644 --- a/gcc/tree-vect-slp.cc +++ b/gcc/tree-vect-slp.cc @@ -5068,9 +5068,15 @@ vect_analyze_slp (vec_info *vinfo, unsigned max_tree_size, tree args0 = gimple_cond_lhs (stmt); tree args1 = gimple_cond_rhs (stmt); - /* These should be enforced by cond lowering. */ - gcc_assert (gimple_cond_code (stmt) == NE_EXPR); - gcc_assert (zerop (args1)); + /* These should be enforced by cond lowering, but if it failed +bail. */ + if (gimple_cond_code (stmt) != NE_EXPR + || TREE_TYPE (args0) != boolean_type_node + || !integer_zerop (args1)) + { + roots.release (); + continue; + } /* An argument without a loop def will be codegened from vectorizing the root gcond itself. As such we don't need to try to build an SLP tree
[gcc r16-2169] Adjust reduction with conversion SLP build
https://gcc.gnu.org/g:2b99395c312883ccf114476347a7f5174fde436d commit r16-2169-g2b99395c312883ccf114476347a7f5174fde436d Author: Richard Biener Date: Thu Jul 10 11:23:59 2025 +0200 Adjust reduction with conversion SLP build The following adjusts how we set SLP_TREE_VECTYPE for the conversion node we build when fixing up the reduction with conversion SLP instance. This should probably see more TLC, but the following avoids relying on STMT_VINFO_VECTYPE for this. * tree-vect-slp.cc (vect_build_slp_instance): Do not use SLP_TREE_VECTYPE to determine the conversion back to the reduction IV. Diff: --- gcc/tree-vect-slp.cc | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc index 68ef1ddda77a..5ef45fd60f57 100644 --- a/gcc/tree-vect-slp.cc +++ b/gcc/tree-vect-slp.cc @@ -4067,7 +4067,12 @@ vect_build_slp_instance (vec_info *vinfo, for (unsigned i = 0; i < group_size; ++i) scalar_stmts.quick_push (next_info); slp_tree conv = vect_create_new_slp_node (scalar_stmts, 1); - SLP_TREE_VECTYPE (conv) = STMT_VINFO_VECTYPE (next_info); + SLP_TREE_VECTYPE (conv) + = get_vectype_for_scalar_type (vinfo, + TREE_TYPE +(gimple_assign_lhs + (scalar_def)), + group_size); SLP_TREE_CHILDREN (conv).quick_push (node); SLP_INSTANCE_TREE (new_instance) = conv; /* We also have to fake this conversion stmt as SLP reduction
[gcc r16-2163] Remove dead code dealing with non-SLP
https://gcc.gnu.org/g:18c48295afb424bfc5c1fbb812e68119e9eb4ccb

commit r16-2163-g18c48295afb424bfc5c1fbb812e68119e9eb4ccb
Author: Richard Biener
Date:   Thu Jul 10 09:44:50 2025 +0200

    Remove dead code dealing with non-SLP

    After vect_analyze_loop_operations is gone we can clean up
    vect_analyze_stmt as it is no longer called out of SLP context.

            * tree-vectorizer.h (vect_analyze_stmt): Remove stmt-info and
            need_to_vectorize arguments.
            * tree-vect-slp.cc (vect_slp_analyze_node_operations_1): Adjust.
            * tree-vect-stmts.cc (can_vectorize_live_stmts): Remove
            stmt_info argument and remove non-SLP path.
            (vect_analyze_stmt): Remove stmt_info and need_to_vectorize
            argument and prune paths no longer reachable.
            (vect_transform_stmt): Adjust.

Diff:
---
 gcc/tree-vect-slp.cc   |   6 +-
 gcc/tree-vect-stmts.cc | 180 ++---
 gcc/tree-vectorizer.h  |   3 +-
 3 files changed, 38 insertions(+), 151 deletions(-)

diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index f97a3635cff1..68ef1ddda77a 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -7898,8 +7898,6 @@ vect_slp_analyze_node_operations_1 (vec_info *vinfo, slp_tree node,
                                     slp_instance node_instance,
                                     stmt_vector_for_cost *cost_vec)
 {
-  stmt_vec_info stmt_info = SLP_TREE_REPRESENTATIVE (node);
-
   /* Calculate the number of vector statements to be created for the scalar
      stmts in this node.  It is the number of scalar elements in one scalar
      iteration (DR_GROUP_SIZE) multiplied by VF divided by the number of
@@ -7928,9 +7926,7 @@ vect_slp_analyze_node_operations_1 (vec_info *vinfo, slp_tree node,
       return true;
     }

-  bool dummy;
-  return vect_analyze_stmt (vinfo, stmt_info, &dummy,
-                            node, node_instance, cost_vec);
+  return vect_analyze_stmt (vinfo, node, node_instance, cost_vec);
 }

 static int
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 081dd653fd46..e5971e4a357b 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -13186,37 +13186,27 @@ vectorizable_early_exit (vec_info *vinfo, stmt_vec_info stmt_info,
    VEC_STMT_P is as for vectorizable_live_operation.  */

 static bool
-can_vectorize_live_stmts (vec_info *vinfo, stmt_vec_info stmt_info,
+can_vectorize_live_stmts (vec_info *vinfo,
                           slp_tree slp_node, slp_instance slp_node_instance,
                           bool vec_stmt_p,
                           stmt_vector_for_cost *cost_vec)
 {
   loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo);
-  if (slp_node)
-    {
-      stmt_vec_info slp_stmt_info;
-      unsigned int i;
-      FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (slp_node), i, slp_stmt_info)
-        {
-          if (slp_stmt_info
-              && (STMT_VINFO_LIVE_P (slp_stmt_info)
-                  || (loop_vinfo
-                      && LOOP_VINFO_EARLY_BREAKS (loop_vinfo)
-                      && STMT_VINFO_DEF_TYPE (slp_stmt_info)
-                      == vect_induction_def))
-              && !vectorizable_live_operation (vinfo, slp_stmt_info, slp_node,
-                                               slp_node_instance, i,
-                                               vec_stmt_p, cost_vec))
-            return false;
-        }
+  stmt_vec_info slp_stmt_info;
+  unsigned int i;
+  FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (slp_node), i, slp_stmt_info)
+    {
+      if (slp_stmt_info
+          && (STMT_VINFO_LIVE_P (slp_stmt_info)
+              || (loop_vinfo
+                  && LOOP_VINFO_EARLY_BREAKS (loop_vinfo)
+                  && STMT_VINFO_DEF_TYPE (slp_stmt_info)
+                  == vect_induction_def))
+          && !vectorizable_live_operation (vinfo, slp_stmt_info, slp_node,
+                                           slp_node_instance, i,
+                                           vec_stmt_p, cost_vec))
+        return false;
     }
-  else if ((STMT_VINFO_LIVE_P (stmt_info)
-            || (LOOP_VINFO_EARLY_BREAKS (loop_vinfo)
-                && STMT_VINFO_DEF_TYPE (stmt_info) == vect_induction_def))
-           && !vectorizable_live_operation (vinfo, stmt_info,
-                                            slp_node, slp_node_instance, -1,
-                                            vec_stmt_p, cost_vec))
-    return false;

   return true;
 }
@@ -13225,115 +13215,42 @@ can_vectorize_live_stmts (vec_info *vinfo, stmt_vec_info stmt_info,

 opt_result
 vect_analyze_stmt (vec_info *vinfo,
-                   stmt_vec_info stmt_info, bool *need_to_vectorize,
                    slp_tree node, slp_instance node_instance,
                    stmt_vector_for_cost *cost_vec)
 {
+  stmt_vec_info stmt_info = SLP_TREE_REPRESENTATIVE (node);
   bb_vec_info bb_vinfo = dyn_cast <bb_vec_info> (vinfo);
   enum vect_relevant relevance = STMT_VINFO_RELEVANT (stmt_
[gcc r16-2161] Change bellow in comments to below
https://gcc.gnu.org/g:0931cea0e3a67e6a17790aeb676c793bccb2039a

commit r16-2161-g0931cea0e3a67e6a17790aeb676c793bccb2039a
Author: Jakub Jelinek
Date:   Thu Jul 10 10:16:43 2025 +0200

    Change bellow in comments to below

    While I'm not a native English speaker, I believe all the uses
    of bellow (roar/bark/...) in comments in gcc are meant to be
    below (beneath/under/...).

    2025-07-10  Jakub Jelinek

    gcc/
            * tree-vect-loop.cc (scale_profile_for_vect_loop): Comment
            spelling fix: bellow -> below.
            * ipa-polymorphic-call.cc (record_known_type): Likewise.
            * config/i386/x86-tune.def: Likewise.
            * config/riscv/vector.md (*vsetvldi_no_side_effects_si_extend):
            Likewise.
            * tree-scalar-evolution.cc (iv_can_overflow_p): Likewise.
            * ipa-devirt.cc (add_type_duplicate): Likewise.
            * tree-ssa-loop-niter.cc (maybe_lower_iteration_bound): Likewise.
            * gimple-ssa-sccopy.cc: Likewise.
            * cgraphunit.cc: Likewise.
            * graphite.h (struct poly_dr): Likewise.
            * ipa-reference.cc (ignore_edge_p): Likewise.
            * tree-ssa-alias.cc (ao_compare::compare_ao_refs): Likewise.
            * profile-count.h (profile_probability::probably_reliable_p):
            Likewise.
            * ipa-inline-transform.cc (inline_call): Likewise.
    gcc/ada/
            * par-load.adb: Comment spelling fix: bellow -> below.
            * libgnarl/s-taskin.ads: Likewise.
    gcc/testsuite/
            * gfortran.dg/g77/980310-3.f: Comment spelling fix: bellow -> below.
            * jit.dg/test-debuginfo.c: Likewise.
    libstdc++-v3/
            * testsuite/22_locale/codecvt/codecvt_unicode.h
            (ucs2_to_utf8_out_error): Comment spelling fix: bellow -> below.
            (utf16_to_ucs2_in_error): Likewise.

Diff:
---
 gcc/ada/libgnarl/s-taskin.ads                              | 2 +-
 gcc/ada/par-load.adb                                       | 2 +-
 gcc/cgraphunit.cc                                          | 2 +-
 gcc/config/i386/x86-tune.def                               | 2 +-
 gcc/config/riscv/vector.md                                 | 2 +-
 gcc/gimple-ssa-sccopy.cc                                   | 2 +-
 gcc/graphite.h                                             | 2 +-
 gcc/ipa-devirt.cc                                          | 2 +-
 gcc/ipa-inline-transform.cc                                | 2 +-
 gcc/ipa-polymorphic-call.cc                                | 2 +-
 gcc/ipa-reference.cc                                       | 2 +-
 gcc/profile-count.h                                        | 2 +-
 gcc/testsuite/gfortran.dg/g77/980310-3.f                   | 2 +-
 gcc/testsuite/jit.dg/test-debuginfo.c                      | 2 +-
 gcc/tree-scalar-evolution.cc                               | 2 +-
 gcc/tree-ssa-alias.cc                                      | 2 +-
 gcc/tree-ssa-loop-niter.cc                                 | 2 +-
 gcc/tree-vect-loop.cc                                      | 2 +-
 libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.h | 4 ++--
 19 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/gcc/ada/libgnarl/s-taskin.ads b/gcc/ada/libgnarl/s-taskin.ads
index d68e199e6262..dbf2e7bf91ec 100644
--- a/gcc/ada/libgnarl/s-taskin.ads
+++ b/gcc/ada/libgnarl/s-taskin.ads
@@ -390,7 +390,7 @@ package System.Tasking is
    System_Domain : Dispatching_Domain_Access;
    --  All processors belong to default system dispatching domain at start up.
    --  We use a pointer which creates the actual variable for the reasons
-   --  explained bellow in Dispatching_Domain_Tasks.
+   --  explained below in Dispatching_Domain_Tasks.

    Dispatching_Domains_Frozen : Boolean := False;
    --  True when the main procedure has been called. Hence, no new dispatching
diff --git a/gcc/ada/par-load.adb b/gcc/ada/par-load.adb
index 96fa7e85938d..4a97f14ffb51 100644
--- a/gcc/ada/par-load.adb
+++ b/gcc/ada/par-load.adb
@@ -83,7 +83,7 @@ procedure Load is
    --  withed units and the second round handles Ada 2005 limited-withed units.
    --  This is required to allow the low-level circuitry that detects circular
    --  dependencies of units the correct notification of errors (see comment
-   --  bellow). This variable is used to indicate that the second round is
+   --  below). This variable is used to indicate that the second round is
    --  required.

    function Same_File_Name_Except_For_Case
diff --git a/gcc/cgraphunit.cc b/gcc/cgraphunit.cc
index fa54a59d02b8..8e8d85562b03 100644
--- a/gcc/cgraphunit.cc
+++ b/gcc/cgraphunit.cc
@@ -63,7 +63,7 @@ along with GCC; see the file COPYING3.  If not see
       final assembler is generated.  This is done in the following way. Note
       that with link time optimization the process is split into three
       stages (compile time, linktime analysis and parallel linktime as
-      indicated bellow)
[gcc r16-2162] Comment spelling fix: tunning -> tuning
https://gcc.gnu.org/g:60a7c817d2deb640e9649825a8e4e05293a7ba2d

commit r16-2162-g60a7c817d2deb640e9649825a8e4e05293a7ba2d
Author: Jakub Jelinek
Date:   Thu Jul 10 10:23:31 2025 +0200

    Comment spelling fix: tunning -> tuning

    Kyrylo noticed another spelling bug and like usually, the same mistake
    happens in multiple places.

    2025-07-10  Jakub Jelinek

            * config/i386/x86-tune.def: Change "Tunning the" to "tuning" in
            comment and use semicolon instead of dot in comment.
            * loop-unroll.cc (decide_unroll_stupid): Comment spelling fix,
            tunning -> tuning.

Diff:
---
 gcc/config/i386/x86-tune.def | 2 +-
 gcc/loop-unroll.cc           | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config/i386/x86-tune.def b/gcc/config/i386/x86-tune.def
index a039db3cfced..a86cbad281c1 100644
--- a/gcc/config/i386/x86-tune.def
+++ b/gcc/config/i386/x86-tune.def
@@ -31,7 +31,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
      - Updating ix86_issue_rate and ix86_adjust_cost in i386.md
      - possibly updating ia32_multipass_dfa_lookahead, ix86_sched_reorder
        and ix86_sched_init_global if those tricks are needed.
-    - Tunning the flags below.  Those are split into sections and each
+    - tuning flags below; those are split into sections and each
       section is very roughly ordered by importance.  */

 /*/
diff --git a/gcc/loop-unroll.cc b/gcc/loop-unroll.cc
index 6149cecb28de..c80a6cb6cd0c 100644
--- a/gcc/loop-unroll.cc
+++ b/gcc/loop-unroll.cc
@@ -1185,7 +1185,7 @@ decide_unroll_stupid (class loop *loop, int flags)

   /* Do not unroll loops with branches inside -- it increases number
      of mispredicts.
-     TODO: this heuristic needs tunning; call inside the loop body
+     TODO: this heuristic needs tuning; call inside the loop body
      is also relatively good reason to not unroll.  */
   if (num_loop_branches (loop) > 1)
     {
[gcc r15-9947] aarch64: Add support for NVIDIA GB10
https://gcc.gnu.org/g:aad37494dc0b96e95501190b93a32ff7c85debfc

commit r15-9947-gaad37494dc0b96e95501190b93a32ff7c85debfc
Author: Kyrylo Tkachov
Date:   Mon Jun 2 07:08:12 2025 -0700

    aarch64: Add support for NVIDIA GB10

    This adds support for -mcpu=gb10.  This is a big.LITTLE configuration
    involving Cortex-X925 and Cortex-A725 cores.  The appropriate MIDR numbers
    are added to detect them in -mcpu=native.  We did not add an
    -mcpu=cortex-x925.cortex-a725 option because GB10 does include the crypto
    instructions which we want on by default, and the current convention is to
    not enable such extensions for Arm Cortex cores in -mcpu where they are
    optional in the IP.

    Bootstrapped and tested on aarch64-none-linux-gnu.

    Signed-off-by: Kyrylo Tkachov

    gcc/

            * config/aarch64/aarch64-cores.def (gb10): New entry.
            * config/aarch64/aarch64-tune.md: Regenerate.
            * doc/invoke.texi (AArch64 Options): Document the above.

    (cherry picked from commit 9ff6ade24cae5a51d1ee9d9ad4b4a5c682e4a5ed)

Diff:
---
 gcc/config/aarch64/aarch64-cores.def | 3 +++
 gcc/config/aarch64/aarch64-tune.md   | 2 +-
 gcc/doc/invoke.texi                  | 2 +-
 3 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aarch64-cores.def
index 24b7cd362aaf..8040409d2830 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -226,6 +226,9 @@ AARCH64_CORE("demeter", demeter, cortexa57, V9A, (I8MM, BF16, SVE2_BITPERM, RNG,

 /* NVIDIA ('N') cores.  */
 AARCH64_CORE("olympus", olympus, cortexa57, V9_2A, (SVE2_BITPERM, RNG, LS64, MEMTAG, PROFILE, FAMINMAX, FP8FMA, FP8DOT2, FP8DOT4, LUT, SVE2_AES, SVE2_SHA3, SVE2_SM4), neoversev3, 0x4e, 0x10, -1)
+/* Armv9-A big.LITTLE processors.  */
+AARCH64_CORE("gb10", gb10, cortexa57, V9_2A, (SVE2_BITPERM, SVE2_AES, SVE2_SHA3, SVE2_SM4, MEMTAG, PROFILE), cortexx925, 0x41, AARCH64_BIG_LITTLE (0xd85, 0xd87), -1)
+
 /* Generic Architecture Processors.
*/ AARCH64_CORE("generic", generic, cortexa53, V8A, (), generic, 0x0, 0x0, -1) AARCH64_CORE("generic-armv8-a", generic_armv8_a, cortexa53, V8A, (), generic_armv8_a, 0x0, 0x0, -1) diff --git a/gcc/config/aarch64/aarch64-tune.md b/gcc/config/aarch64/aarch64-tune.md index 982074c2c21e..40ff147d6f83 100644 --- a/gcc/config/aarch64/aarch64-tune.md +++ b/gcc/config/aarch64/aarch64-tune.md @@ -1,5 +1,5 @@ ;; -*- buffer-read-only: t -*- ;; Generated automatically by gentune.sh from aarch64-cores.def (define_attr "tune" - "cortexa34,cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88,thunderxt88p1,octeontx,octeontxt81,octeontxt83,thunderxt81,thunderxt83,ampere1,ampere1a,ampere1b,emag,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,cortexa76ae,cortexa77,cortexa78,cortexa78ae,cortexa78c,cortexa65,cortexa65ae,cortexx1,cortexx1c,neoversen1,ares,neoversee1,octeontx2,octeontx2t98,octeontx2t96,octeontx2t93,octeontx2f95,octeontx2f95n,octeontx2f95mm,a64fx,fujitsu_monaka,tsv110,thunderx3t110,neoversev1,zeus,neoverse512tvb,saphira,oryon1,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55,cortexr82,cortexr82ae,applea12,applem1_0,applem1_1,applem1_2,applem1_3,applem2_0,applem2_1,applem2_2,applem2_3,applem3_0,cortexa510,cortexa520,cortexa520ae,cortexa710,cortexa715,cortexa720,cortexa720ae,cortexa725,cortexx2,cortexx3,cortexx4,cortexx925,neoversen2,cobalt100,neoversen3,neoversev2 ,grace,neoversev3,neoversev3ae,demeter,olympus,generic,generic_armv8_a,generic_armv9_a" + 
"cortexa34,cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88,thunderxt88p1,octeontx,octeontxt81,octeontxt83,thunderxt81,thunderxt83,ampere1,ampere1a,ampere1b,emag,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,cortexa76ae,cortexa77,cortexa78,cortexa78ae,cortexa78c,cortexa65,cortexa65ae,cortexx1,cortexx1c,neoversen1,ares,neoversee1,octeontx2,octeontx2t98,octeontx2t96,octeontx2t93,octeontx2f95,octeontx2f95n,octeontx2f95mm,a64fx,fujitsu_monaka,tsv110,thunderx3t110,neoversev1,zeus,neoverse512tvb,saphira,oryon1,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55,cortexr82,cortexr82ae,applea12,applem1_0,applem1_1,applem1_2,applem1_3,applem2_0,applem2_1,applem2_2,applem2_3,applem3_0,cortexa510,cortexa520,cortexa520ae,cortexa710,cortexa715,cortexa720,cortexa720ae,cortexa725,cortexx2,cortexx3,cortexx4,cortexx925,neoversen2,cobalt100,neoversen3,neoversev2 ,grace,neoversev3,neoversev3ae,demeter,olympus,gb10,generic,generic_armv8_a,generic_armv9_a" (const (symbol_ref "((enum attr_tune) aarch64_tune)"))) diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 14750aed64db..eca871b93d97 100644 --- a/gcc/doc/invoke.texi
[gcc] Created tag 'releases/gcc-12.5.0'
The signed tag 'releases/gcc-12.5.0' was created pointing to:

 c17d40bb3778... Update ChangeLog and version files for release

Tagger: Richard Biener
Date: Fri Jul 11 06:34:05 2025 +

    GCC 12.5.0 release
[gcc r16-2188] Stop updating gcc-12 branch
https://gcc.gnu.org/g:14076f15bf618d8febd1e4c6a86995f057408de8

commit r16-2188-g14076f15bf618d8febd1e4c6a86995f057408de8
Author: Richard Biener
Date:   Fri Jul 11 08:32:26 2025 +0200

    Stop updating gcc-12 branch

    contrib/
            * gcc-changelog/git_update_version.py: Stop updating gcc-12 branch.

Diff:
---
 contrib/gcc-changelog/git_update_version.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/gcc-changelog/git_update_version.py b/contrib/gcc-changelog/git_update_version.py
index aa9adee58fef..b3ea33bb5161 100755
--- a/contrib/gcc-changelog/git_update_version.py
+++ b/contrib/gcc-changelog/git_update_version.py
@@ -85,7 +85,7 @@ def prepend_to_changelog_files(repo, folder, git_commit, add_to_git):
         repo.git.add(full_path)


-active_refs = ['master', 'releases/gcc-12',
+active_refs = ['master',
                'releases/gcc-13', 'releases/gcc-14', 'releases/gcc-15']

 parser = argparse.ArgumentParser(description='Update DATESTAMP and generate '
[gcc r12-11261] Update ChangeLog and version files for release
https://gcc.gnu.org/g:c17d40bb3778bca5e81595f033df9222b66658eb

commit r12-11261-gc17d40bb3778bca5e81595f033df9222b66658eb
Author: Richard Biener
Date:   Fri Jul 11 06:33:59 2025 +

    Update ChangeLog and version files for release

Diff:
---
 ChangeLog                         | 4
 c++tools/ChangeLog                | 4
 config/ChangeLog                  | 4
 contrib/ChangeLog                 | 4
 contrib/header-tools/ChangeLog    | 4
 contrib/reghunt/ChangeLog         | 4
 contrib/regression/ChangeLog      | 4
 fixincludes/ChangeLog             | 4
 gcc/BASE-VER                      | 2 +-
 gcc/ChangeLog                     | 4
 gcc/ada/ChangeLog                 | 4
 gcc/analyzer/ChangeLog            | 4
 gcc/c-family/ChangeLog            | 4
 gcc/c/ChangeLog                   | 4
 gcc/cp/ChangeLog                  | 4
 gcc/d/ChangeLog                   | 4
 gcc/fortran/ChangeLog             | 4
 gcc/go/ChangeLog                  | 4
 gcc/jit/ChangeLog                 | 4
 gcc/lto/ChangeLog                 | 4
 gcc/objc/ChangeLog                | 4
 gcc/objcp/ChangeLog               | 4
 gcc/po/ChangeLog                  | 4
 gcc/testsuite/ChangeLog           | 4
 gnattools/ChangeLog               | 4
 gotools/ChangeLog                 | 4
 include/ChangeLog                 | 4
 intl/ChangeLog                    | 4
 libada/ChangeLog                  | 4
 libatomic/ChangeLog               | 4
 libbacktrace/ChangeLog            | 4
 libcc1/ChangeLog                  | 4
 libcody/ChangeLog                 | 4
 libcpp/ChangeLog                  | 4
 libcpp/po/ChangeLog               | 4
 libdecnumber/ChangeLog            | 4
 libffi/ChangeLog                  | 4
 libgcc/ChangeLog                  | 4
 libgcc/config/avr/libf7/ChangeLog | 4
 libgcc/config/libbid/ChangeLog    | 4
 libgfortran/ChangeLog             | 4
 libgomp/ChangeLog                 | 4
 libiberty/ChangeLog               | 4
 libitm/ChangeLog                  | 4
 libobjc/ChangeLog                 | 4
 liboffloadmic/ChangeLog           | 4
 libphobos/ChangeLog               | 4
 libquadmath/ChangeLog             | 4
 libsanitizer/ChangeLog            | 4
 libssp/ChangeLog                  | 4
 libstdc++-v3/ChangeLog            | 4
 libvtv/ChangeLog                  | 4
 lto-plugin/ChangeLog              | 4
 maintainer-scripts/ChangeLog      | 4
 zlib/ChangeLog                    | 4
 55 files changed, 217 insertions(+), 1 deletion(-)

diff --git a/ChangeLog b/ChangeLog
index 3e0631117ef8..9ccf9972fcfa 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,7 @@
+2025-07-11  Release Manager
+
+        * GCC 12.5.0 released.
+
 2024-06-20  Release Manager

         * GCC 12.4.0 released.
diff --git a/c++tools/ChangeLog b/c++tools/ChangeLog
index 5e961ce0c226..dd92a5011707 100644
--- a/c++tools/ChangeLog
+++ b/c++tools/ChangeLog
@@ -1,3 +1,7 @@
+2025-07-11  Release Manager
+
+        * GCC 12.5.0 released.
+
 2025-04-20  John David Anglin

         PR other/107616
diff --git a/config/ChangeLog b/config/ChangeLog
index 40932d6a39af..44e0c840e1fa 100644
--- a/config/ChangeLog
+++ b/config/ChangeLog
@@ -1,3 +1,7 @@
+2025-07-11  Release Manager
+
+        * GCC 12.5.0 released.
+
 2024-06-20  Release Manager

         * GCC 12.4.0 released.
diff --git a/contrib/ChangeLog b/contrib/ChangeLog
index c412d61b5795..1ee43d73955d 100644
--- a/contrib/ChangeLog
+++ b/contrib/ChangeLog
@@ -1,3 +1,7 @@
+2025-07-11  Release Manager
+
+        * GCC 12.5.0 released.
+
 2024-06-20  Release Manager

         * GCC 12.4.0 released.
diff --git a/contrib/header-tools/ChangeLog b/contrib/header-tools/ChangeLog
index c834c0a87c16..0665fbfd0f91 100644
--- a/contrib/header-tools/ChangeLog
+++ b/contrib/header-tools/ChangeLog
@@ -1,3 +1,7 @@
+2025-07-11  Release Manager
+
+        * GCC 12.5.0 released.
+
 2024-06-20  Release Manager

         * GCC 12.4.0 released.
diff --git a/contrib/reghunt/ChangeLog b/contrib/reghunt/ChangeLog
index 1de203aa1b06..8a77aeae4f56 100644
--- a/contrib/reghunt/ChangeLog
+++ b/contrib/reghunt/ChangeLog
@@ -1,3 +1,7 @@
+2025-07-11  Release Manager
+
+        * GCC 12.5.0 released.
+
 2024-06-20  Release Manager

         * GCC 12.4.0 released.
diff --git a/contrib/regression/ChangeLog b/contrib/regression/ChangeLog
index fbea8904965c..6845969a18c2 100644
--- a/contrib/regression/ChangeLog
+++ b/contrib/regression/ChangeLog
@@ -1,3 +1,7 @@
+2025-07-11  Release Manager
+
+        * GCC 12.5.0 released.
+
 2024-06-20  Release Manager

         * GCC 12.4.0 released.
diff --git a/fixincludes/ChangeLog b/fixincludes/ChangeLog
index d693ea068ae1..efc0bdf10c5c 100644
--- a/fixincludes/ChangeLog
+++ b/fixincludes/ChangeLog
@@ -1,3 +1,7 @@
+2025-07-11  Release Manager
+
+        * GCC 12.
[gcc r16-2185] c++: Fix up final handling in C++98 [PR120628]
https://gcc.gnu.org/g:8f063b40e5b8f23cb89fee21afaa71deedbdf2aa

commit r16-2185-g8f063b40e5b8f23cb89fee21afaa71deedbdf2aa
Author: Jakub Jelinek
Date:   Thu Jul 10 23:47:42 2025 +0200

    c++: Fix up final handling in C++98 [PR120628]

    The following patch is on top of the
    https://gcc.gnu.org/pipermail/gcc-patches/2025-June/686210.html
    patch which stopped treating override as conditional keyword in
    class properties.

    This PR mentions another problem; we emit a bogus warning on code like
    struct C {}; struct C final = {};
    in C++98.  In this case we parse final as conditional keyword in C++
    (including pedwarn) but the caller then immediately aborts the tentative
    parse because it isn't followed by { nor (in some cases) : .
    I think we certainly shouldn't pedwarn on it, but I think we even
    shouldn't warn for it say for -Wc++11-compat, because we don't actually
    treat the identifier as conditional keyword even in C++11 and later.
    The patch only does this if final is the only class property conditional
    keyword, if one uses
    struct S __final final __final = {};
    one gets the warning and duplicate diagnostics and later parsing errors.

    2025-07-10  Jakub Jelinek

            PR c++/120628
            * parser.cc (cp_parser_elaborated_type_specifier): Use
            cp_parser_nth_token_starts_class_definition_p with extra argument 1
            instead of cp_parser_next_token_starts_class_definition_p.
            (cp_parser_class_property_specifier_seq_opt): For final conditional
            keyword in C++98 check if the token after it isn't
            cp_parser_nth_token_starts_class_definition_p nor CPP_NAME and in
            that case break without consuming it nor warning.
            (cp_parser_class_head): Use
            cp_parser_nth_token_starts_class_definition_p with extra argument 1
            instead of cp_parser_next_token_starts_class_definition_p.
            (cp_parser_next_token_starts_class_definition_p): Renamed to ...
            (cp_parser_nth_token_starts_class_definition_p): ... this.  Add N
            argument.  Use cp_lexer_peek_nth_token instead of
            cp_lexer_peek_token.

            * g++.dg/cpp0x/final1.C: New test.
            * g++.dg/cpp0x/final2.C: New test.
            * g++.dg/cpp0x/override6.C: New test.

Diff:
---
 gcc/cp/parser.cc                       | 21 ++---
 gcc/testsuite/g++.dg/cpp0x/final1.C    | 11 +++
 gcc/testsuite/g++.dg/cpp0x/final2.C    | 26 ++
 gcc/testsuite/g++.dg/cpp0x/override6.C | 26 ++
 4 files changed, 77 insertions(+), 7 deletions(-)

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 1f58425a70b6..21bec72c7961 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -3091,8 +3091,8 @@ static cp_token *cp_parser_require_keyword
   (cp_parser *, enum rid, required_token);
 static bool cp_parser_token_starts_function_definition_p
   (cp_token *);
-static bool cp_parser_next_token_starts_class_definition_p
-  (cp_parser *);
+static bool cp_parser_nth_token_starts_class_definition_p
+  (cp_parser *, size_t);
 static bool cp_parser_next_token_ends_template_argument_p
   (cp_parser *);
 static bool cp_parser_nth_token_starts_template_argument_list_p
@@ -22031,7 +22031,7 @@ cp_parser_elaborated_type_specifier (cp_parser* parser,

           bool template_p =
             (template_parm_lists_apply
-             && (cp_parser_next_token_starts_class_definition_p (parser)
+             && (cp_parser_nth_token_starts_class_definition_p (parser, 1)
                  || cp_lexer_next_token_is (parser->lexer, CPP_SEMICOLON)));
           /* An unqualified name was used to reference this type, so
              there were no qualifying templates.  */
@@ -28095,6 +28095,13 @@ cp_parser_class_property_specifier_seq_opt (cp_parser *parser)
         break;
       if (id_equal (token->u.value, "final"))
         {
+          /* For C++98, quietly ignore final in e.g.
+             struct S final = 24;  */
+          if (cxx_dialect == cxx98
+              && virt_specifiers == VIRT_SPEC_UNSPECIFIED
+              && !cp_parser_nth_token_starts_class_definition_p (parser, 2)
+              && !cp_lexer_nth_token_is (parser->lexer, 2, CPP_NAME))
+            break;
           maybe_warn_cpp0x (CPP0X_OVERRIDE_CONTROLS);
           virt_specifier = VIRT_SPEC_FINAL;
         }
@@ -28318,7 +28325,7 @@ cp_parser_class_head (cp_parser* parser,
      class-head, since a class-head only appears as part of a
      class-specifier.  We have to detect this situation before calling
      xref_tag, since that has irreversible side-effects.  */
-  if (!cp_parser_next_token_starts_class_definition_p (parser))
+  if (!cp_parser_nth_token_starts_class_definition_p (parser, 1))
     {
       cp_parser_error (parser, "expected %<{%> or %<:%>");
       type = error_mark_node;
@@ -35696,15 +35703,15 @@ cp_parser_token_starts_function_definition_p (cp_token*
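
A minimal sketch of the declaration form discussed in the PR120628 commit message above; it is not taken from the new tests. Because `final` is followed by `=` rather than `{` or `:`, it is an ordinary identifier that names a variable of type `C`. The member `value` and the helper `read_final` are hypothetical and exist only so that the stored value can be checked.

```cpp
#include <cassert>

// An ordinary aggregate; nothing about it is polymorphic or "final".
struct C { int value; };

// "final" is followed by '=', not by '{' or ':', so this is not a class
// definition: it declares a variable named "final" of type C.
struct C final = { 42 };

// Hypothetical helper: read the variable back and sanity-check it.
int read_final ()
{
  assert (final.value == 42);
  return final.value;
}
```

Per the commit message, the pre-fix compiler accepted this form but emitted a bogus diagnostic in C++98 mode; the sketch only illustrates the syntax, not the diagnostic.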
[gcc r16-2186] c++: Save 8 further bytes from lang_type allocations
https://gcc.gnu.org/g:bdb0a6be69b3b3e8f94aa72a9263810a80cb9a5f

commit r16-2186-gbdb0a6be69b3b3e8f94aa72a9263810a80cb9a5f
Author: Jakub Jelinek
Date:   Fri Jul 11 00:05:23 2025 +0200

    c++: Save 8 further bytes from lang_type allocations

    The following patch implements the
    /* FIXME reuse another field?  */
    comment on the lambda_expr member.
    I think (and asserts in the patch seem to confirm) CLASSTYPE_KEY_METHOD
    is only ever non-NULL for TYPE_POLYMORPHIC_P and on the other side
    CLASSTYPE_LAMBDA_EXPR is only used on closure types which are never
    polymorphic.

    So, the patch just uses one member for both, with the accessor macros
    changed to be no longer lvalues and adding SET_* variants of the macros
    for setters.

    2025-07-11  Jakub Jelinek

            * cp-tree.h (struct lang_type): Add comment before key_method.
            Remove lambda_expr.
            (CLASSTYPE_KEY_METHOD): Give NULL_TREE if not TYPE_POLYMORPHIC_P.
            (SET_CLASSTYPE_KEY_METHOD): Define.
            (CLASSTYPE_LAMBDA_EXPR): Give NULL_TREE if TYPE_POLYMORPHIC_P.
            Use key_method member instead of lambda_expr.
            (SET_CLASSTYPE_LAMBDA_EXPR): Define.
            * class.cc (determine_key_method): Use SET_CLASSTYPE_KEY_METHOD
            macro.
            * decl.cc (xref_tag): Use SET_CLASSTYPE_LAMBDA_EXPR macro.
            * lambda.cc (begin_lambda_type): Likewise.
            * module.cc (trees_in::read_class_def): Use
            SET_CLASSTYPE_LAMBDA_EXPR and SET_CLASSTYPE_KEY_METHOD macros,
            assert lambda is NULL if TYPE_POLYMORPHIC_P and otherwise assert
            key_method is NULL.

Diff:
---
 gcc/cp/class.cc  |  2 +-
 gcc/cp/cp-tree.h | 19 +++
 gcc/cp/decl.cc   |  2 +-
 gcc/cp/lambda.cc |  2 +-
 gcc/cp/module.cc | 10 --
 5 files changed, 26 insertions(+), 9 deletions(-)

diff --git a/gcc/cp/class.cc b/gcc/cp/class.cc
index 9a41c00788a8..151ee2bc4714 100644
--- a/gcc/cp/class.cc
+++ b/gcc/cp/class.cc
@@ -7452,7 +7452,7 @@ determine_key_method (tree type)
           && ! DECL_DECLARED_INLINE_P (method)
           && ! DECL_PURE_VIRTUAL_P (method))
         {
-          CLASSTYPE_KEY_METHOD (type) = method;
+          SET_CLASSTYPE_KEY_METHOD (type, method);
           break;
         }

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 43705733d514..90816281224d 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -2519,11 +2519,11 @@ struct GTY(()) lang_type {
   vec<tree, va_gc> *pure_virtuals;
   tree friend_classes;
   vec<tree, va_gc> * GTY((reorder ("resort_type_member_vec"))) members;
+  /* CLASSTYPE_KEY_METHOD for TYPE_POLYMORPHIC_P types, CLASSTYPE_LAMBDA_EXPR
+     otherwise.  */
   tree key_method;
   tree decl_list;
   tree befriending_classes;
-  /* FIXME reuse another field?  */
-  tree lambda_expr;
   union maybe_objc_info {
     /* If not c_dialect_objc, this part is not even allocated.  */
     char GTY((tag ("0"))) non_objc;
@@ -2646,7 +2646,13 @@ struct GTY(()) lang_type {
 /* The member function with which the vtable will be emitted:
    the first noninline non-pure-virtual member function.  NULL_TREE
    if there is no key function or if this is a class template */
-#define CLASSTYPE_KEY_METHOD(NODE) (LANG_TYPE_CLASS_CHECK (NODE)->key_method)
+#define CLASSTYPE_KEY_METHOD(NODE) \
+  (TYPE_POLYMORPHIC_P (NODE) \
+   ? LANG_TYPE_CLASS_CHECK (NODE)->key_method \
+   : NULL_TREE)
+#define SET_CLASSTYPE_KEY_METHOD(NODE, VALUE) \
+  (gcc_checking_assert (TYPE_POLYMORPHIC_P (NODE)), \
+   LANG_TYPE_CLASS_CHECK (NODE)->key_method = (VALUE))

 /* Vector of members.  During definition, it is unordered and only
    member functions are present.  After completion it is sorted and
@@ -2778,7 +2784,12 @@ struct GTY(()) lang_type {

 /* The associated LAMBDA_EXPR that made this class.  */
 #define CLASSTYPE_LAMBDA_EXPR(NODE) \
-  (LANG_TYPE_CLASS_CHECK (NODE)->lambda_expr)
+  (TYPE_POLYMORPHIC_P (NODE) \
+   ? NULL_TREE \
+   : LANG_TYPE_CLASS_CHECK (NODE)->key_method)
+#define SET_CLASSTYPE_LAMBDA_EXPR(NODE, VALUE) \
+  (gcc_checking_assert (!TYPE_POLYMORPHIC_P (NODE)), \
+   LANG_TYPE_CLASS_CHECK (NODE)->key_method = (VALUE))
 /* The extra mangling scope for this closure type.  */
 #define LAMBDA_TYPE_EXTRA_SCOPE(NODE) \
   (LAMBDA_EXPR_EXTRA_SCOPE (CLASSTYPE_LAMBDA_EXPR (NODE)))
diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index 664dbbec2796..843f0e4fd160 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -17289,7 +17289,7 @@ xref_tag (enum tag_types tag_code, tree name,
           if (IDENTIFIER_LAMBDA_P (name))
             /* Mark it as a lambda type right now.  Our caller will
                correct the value.  */
-            CLASSTYPE_LAMBDA_EXPR (t) = error_mark_node;
+            SET_CLASSTYPE_LAMBDA_EXPR (t, error_mark_node);
           t = pushtag (name, t, how);
         }
       else
diff --git a/gcc/cp/lambda.cc b/gcc/cp/lambda.cc
ind
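
The field-sharing idea in the commit above can be sketched outside GCC as follows. This is a toy analogue under stated assumptions, not GCC code: `node`, `class_info`, `shared_slot` and all four accessors are hypothetical names, with `polymorphic` standing in for TYPE_POLYMORPHIC_P. One slot serves two mutually exclusive roles, the read accessors return values rather than lvalues and mask the role that does not apply, and the setters assert that the slot is used in the matching role.

```cpp
#include <cassert>

// Hypothetical stand-in for GCC's "tree" node pointer.
typedef const char *node;

// Toy analogue of lang_type: one slot, two mutually exclusive meanings.
struct class_info
{
  bool polymorphic;   // plays the role of TYPE_POLYMORPHIC_P
  node shared_slot;   // key method if polymorphic, lambda expression otherwise
};

// Read accessors return a value (not an lvalue) and hide the other role.
inline node key_method (const class_info &c)
{ return c.polymorphic ? c.shared_slot : nullptr; }

inline node lambda_expr (const class_info &c)
{ return c.polymorphic ? nullptr : c.shared_slot; }

// Setters check that the slot is written in the role that applies.
inline void set_key_method (class_info &c, node value)
{ assert (c.polymorphic); c.shared_slot = value; }

inline void set_lambda_expr (class_info &c, node value)
{ assert (!c.polymorphic); c.shared_slot = value; }

// Exercise both roles and report whether each accessor sees only its own.
inline bool demo ()
{
  class_info poly = { true, nullptr };
  class_info closure = { false, nullptr };
  set_key_method (poly, "vfn");
  set_lambda_expr (closure, "lambda");
  return key_method (poly) != nullptr && lambda_expr (poly) == nullptr
         && key_method (closure) == nullptr && lambda_expr (closure) != nullptr;
}
```

Splitting each accessor into a getter and a setter is what makes the sharing safe to audit: every write site has to name the role it intends, which mirrors why the patch replaces assignments through the old macros with SET_* calls.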
[gcc r16-2184] c++: Don't incorrectly reject override after class head name [PR120569]
https://gcc.gnu.org/g:bcb51fe0e26bed7e2c44c4822ca6dec135ba61f3

commit r16-2184-gbcb51fe0e26bed7e2c44c4822ca6dec135ba61f3
Author: Jakub Jelinek
Date:   Thu Jul 10 23:41:56 2025 +0200

    c++: Don't incorrectly reject override after class head name [PR120569]

    While the
    https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2025/p2786r13.html#c03-compatibility-changes-for-annex-c-diff.cpp03.dcl.dcl
    hunk dropped because
    struct C {}; struct C final {};
    is actually not valid C++98 (which didn't have list initialization), we
    actually also reject
    struct D {}; struct D override {};
    and that IMHO is valid all the way from C++11 onwards.
    Especially in the light of P2786R13 adding new contextual keywords, I think
    it is better to use a separate routine for parsing the
    class-virt-specifier-seq (in C++11, there was export next to final),
    class-virt-specifier (in C++14 to C++23) and
    class-property-specifier-seq (in C++26) instead of using the same function
    for virt-specifier-seq and class-property-specifier-seq.

    2025-07-10  Jakub Jelinek

            PR c++/120569
            * parser.cc (cp_parser_class_property_specifier_seq_opt): New
            function.
            (cp_parser_class_head): Use it instead of
            cp_parser_property_specifier_seq_opt.  Don't diagnose
            VIRT_SPEC_OVERRIDE here.  Formatting fix.

            * g++.dg/cpp0x/override2.C: Expect different diagnostics with
            override or duplicate final.
            * g++.dg/cpp0x/override5.C: New test.
            * g++.dg/cpp0x/duplicate1.C: Expect different diagnostics with
            duplicate final.

Diff:
---
 gcc/cp/parser.cc                        | 68 ++---
 gcc/testsuite/g++.dg/cpp0x/duplicate1.C |  2 +-
 gcc/testsuite/g++.dg/cpp0x/override2.C  |  6 +--
 gcc/testsuite/g++.dg/cpp0x/override5.C  | 26 +
 4 files changed, 85 insertions(+), 17 deletions(-)

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index d96fdf8f9271..1f58425a70b6 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -28068,6 +28068,57 @@ cp_parser_class_specifier (cp_parser* parser)
   return type;
 }

+/* Parse an (optional) class-property-specifier-seq.
+
+   class-property-specifier-seq:
+     class-property-specifier class-property-specifier-seq [opt]
+
+   class-property-specifier:
+     final
+
+   Returns a bitmask representing the class-property-specifiers.  */
+
+static cp_virt_specifiers
+cp_parser_class_property_specifier_seq_opt (cp_parser *parser)
+{
+  cp_virt_specifiers virt_specifiers = VIRT_SPEC_UNSPECIFIED;
+
+  while (true)
+    {
+      cp_token *token;
+      cp_virt_specifiers virt_specifier;
+
+      /* Peek at the next token.  */
+      token = cp_lexer_peek_token (parser->lexer);
+      /* See if it's a class-property-specifier.  */
+      if (token->type != CPP_NAME)
+        break;
+      if (id_equal (token->u.value, "final"))
+        {
+          maybe_warn_cpp0x (CPP0X_OVERRIDE_CONTROLS);
+          virt_specifier = VIRT_SPEC_FINAL;
+        }
+      else if (id_equal (token->u.value, "__final"))
+        virt_specifier = VIRT_SPEC_FINAL;
+      else
+        break;
+
+      if (virt_specifiers & virt_specifier)
+        {
+          gcc_rich_location richloc (token->location);
+          richloc.add_fixit_remove ();
+          error_at (&richloc, "duplicate %qD specifier", token->u.value);
+          cp_lexer_purge_token (parser->lexer);
+        }
+      else
+        {
+          cp_lexer_consume_token (parser->lexer);
+          virt_specifiers |= virt_specifier;
+        }
+    }
+  return virt_specifiers;
+}
+
 /* Parse a class-head.

    class-head:
@@ -28258,12 +28309,10 @@ cp_parser_class_head (cp_parser* parser,
   pop_deferring_access_checks ();

   if (id)
-    {
-      cp_parser_check_for_invalid_template_id (parser, id,
-                                               class_key,
-                                               type_start_token->location);
-    }
-  virt_specifiers = cp_parser_virt_specifier_seq_opt (parser);
+    cp_parser_check_for_invalid_template_id (parser, id,
+                                             class_key,
+                                             type_start_token->location);
+  virt_specifiers = cp_parser_class_property_specifier_seq_opt (parser);

   /* If it's not a `:' or a `{' then we can't really be looking at a
      class-head, since a class-head only appears as part of a
@@ -28279,13 +28328,6 @@ cp_parser_class_head (cp_parser* parser,
   /* At this point, we're going ahead with the class-specifier, even
      if some other problem occurs.  */
   cp_parser_commit_to_tentative_parse (parser);
-  if (virt_specifiers & VIRT_SPEC_OVERRIDE)
-    {
-      cp_parser_error (parser,
-                       "cannot specify %<override%> for a class");
-      type = error_mark_node;
-      goto out;
-    }
   /* Issue the error about the overly-qualified name now.  */
   if (qualified
[gcc r16-2181] Reduce the # of arguments of .ACCESS_WITH_SIZE from 6 to 4.
https://gcc.gnu.org/g:e53f481141f1415847329f3bef906e5eb91226ad commit r16-2181-ge53f481141f1415847329f3bef906e5eb91226ad Author: Qing Zhao Date: Wed Jul 9 21:31:55 2025 + Reduce the # of arguments of .ACCESS_WITH_SIZE from 6 to 4. This is an improvement to the design of internal function .ACCESS_WITH_SIZE. Currently, the .ACCESS_WITH_SIZE is designed as: ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, CLASS_OF_SIZE, TYPE_OF_SIZE, ACCESS_MODE, TYPE_SIZE_UNIT for element) which returns the REF_TO_OBJ same as the 1st argument; 1st argument REF_TO_OBJ: The reference to the object; 2nd argument REF_TO_SIZE: The reference to the size of the object, 3rd argument CLASS_OF_SIZE: The size referenced by the REF_TO_SIZE represents 0: the number of bytes. 1: the number of the elements of the object type; 4th argument TYPE_OF_SIZE: A constant 0 with its TYPE being the same as the TYPE of the object referenced by REF_TO_SIZE 5th argument ACCESS_MODE: -1: Unknown access semantics 0: none 1: read_only 2: write_only 3: read_write 6th argument: The TYPE_SIZE_UNIT of the element TYPE of the FAM when 3rd argument is 1. NULL when 3rd argument is 0. Among the 6 arguments: A. The 3rd argument CLASS_OF_SIZE is not needed. If the REF_TO_SIZE represents the number of bytes, simply pass 1 to the TYPE_SIZE_UNIT argument. B. The 4th and the 5th arguments can be combined into 1 argument, whose TYPE represents the TYPE_OF_SIZE, and the constant value represents the ACCESS_MODE. As a result, the new design of the .ACCESS_WITH_SIZE is: ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, TYPE_OF_SIZE + ACCESS_MODE, TYPE_SIZE_UNIT for element) which returns the REF_TO_OBJ same as the 1st argument; 1st argument REF_TO_OBJ: The reference to the object; 2nd argument REF_TO_SIZE: The reference to the size of the object, 3rd argument TYPE_OF_SIZE + ACCESS_MODE: An integer constant with a pointer TYPE. The pointee TYPE of the pointer TYPE is the TYPE of the object referenced by REF_TO_SIZE. 
The integer constant value represents the ACCESS_MODE: 0: none 1: read_only 2: write_only 3: read_write 4th argument: The TYPE_SIZE_UNIT of the element TYPE of the array. gcc/c-family/ChangeLog: * c-ubsan.cc (get_bound_from_access_with_size): Adjust the position of the arguments per the new design. gcc/c/ChangeLog: * c-typeck.cc (build_access_with_size_for_counted_by): Update comments. Adjust the arguments per the new design. gcc/ChangeLog: * internal-fn.cc (expand_ACCESS_WITH_SIZE): Update comments. * internal-fn.def (ACCESS_WITH_SIZE): Update comments. * tree-object-size.cc (access_with_size_object_size): Update comments. Adjust the arguments per the new design. Diff: --- gcc/c-family/c-ubsan.cc | 10 ++ gcc/c/c-typeck.cc | 18 +- gcc/internal-fn.cc | 28 +--- gcc/internal-fn.def | 2 +- gcc/tree-object-size.cc | 34 +- 5 files changed, 38 insertions(+), 54 deletions(-) diff --git a/gcc/c-family/c-ubsan.cc b/gcc/c-family/c-ubsan.cc index 78b786854699..a4dc31066afb 100644 --- a/gcc/c-family/c-ubsan.cc +++ b/gcc/c-family/c-ubsan.cc @@ -397,8 +397,7 @@ get_bound_from_access_with_size (tree call) return NULL_TREE; tree ref_to_size = CALL_EXPR_ARG (call, 1); - unsigned int class_of_size = TREE_INT_CST_LOW (CALL_EXPR_ARG (call, 2)); - tree type = TREE_TYPE (CALL_EXPR_ARG (call, 3)); + tree type = TREE_TYPE (TREE_TYPE (CALL_EXPR_ARG (call, 2))); tree size = fold_build2 (MEM_REF, type, unshare_expr (ref_to_size), build_int_cst (ptr_type_node, 0)); /* If size is negative value, treat it as zero. */ @@ -410,12 +409,7 @@ get_bound_from_access_with_size (tree call) build_zero_cst (type), size); } - /* Only when class_of_size is 1, i.e, the number of the elements of - the object type, return the size. 
*/ - if (class_of_size != 1) -return NULL_TREE; - else -size = fold_convert (sizetype, size); + size = fold_convert (sizetype, size); return size; } diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc index de3d6c78db88..9a5eb0da3a1d 100644 --- a/gcc/c/c-typeck.cc +++ b/gcc/c/c-typeck.cc @@ -2982,7 +2982,7 @@ build_counted_by_ref (tree datum, tree subdatum, tree *counted_by_type) to: - (*.ACCESS_WITH_SIZE (REF, COUNTED_BY_REF, 1, (TYPE_OF_SIZE)0, -1, + (*.ACCESS_WITH_SIZE (REF, COUNTED_BY_REF, (* TYPE_OF_SIZE)0,
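For orientation: .ACCESS_WITH_SIZE is the internal function that the C front end builds (in build_access_with_size_for_counted_by, touched by this commit) for accesses to a flexible array member that carries the counted_by attribute. The following is a minimal user-level sketch of such a member, not part of the commit. The names struct packet, packet_new and packet_sum are hypothetical, and the COUNTED_BY macro is an assumption that expands to nothing on compilers without the attribute.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Use counted_by where the compiler supports it; otherwise expand to nothing
   so that the example still builds (hypothetical helper macro).  */
#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define COUNTED_BY(member) __attribute__((counted_by(member)))
# endif
#endif
#ifndef COUNTED_BY
# define COUNTED_BY(member)
#endif

/* Hypothetical structure: `count` holds the number of elements of the
   flexible array member `payload`.  */
struct packet {
  int count;
  int payload[] COUNTED_BY(count);
};

/* Allocate a packet with N elements.  The counter is stored before any
   element is accessed, so the size information is valid at each access.  */
struct packet *packet_new(int n)
{
  assert(n >= 0);
  struct packet *p = malloc(sizeof *p + (size_t) n * sizeof p->payload[0]);
  if (p == NULL)
    return NULL;
  p->count = n;
  for (int i = 0; i < n; i++)
    p->payload[i] = i * i;
  return p;
}

/* Sum the elements; every index stays below p->count.  */
int packet_sum(const struct packet *p)
{
  int sum = 0;
  for (int i = 0; i < p->count; i++)
    sum += p->payload[i];
  return sum;
}
```

With the attribute in place, the size reference and element size recorded in the call are what the bounds sanitizer and object-size code listed in the ChangeLog consume; the sketch only shows the source-level shape that leads there.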
[gcc r16-2180] Passing TYPE_SIZE_UNIT of the element as the 6th argument to .ACCESS_WITH_SIZE (PR121000)
https://gcc.gnu.org/g:1cf8d08a977f528c6e81601b7586ccf8bc8aa2a6

commit r16-2180-g1cf8d08a977f528c6e81601b7586ccf8bc8aa2a6
Author: Qing Zhao
Date: Wed Jul 9 20:10:30 2025 +

    Passing TYPE_SIZE_UNIT of the element as the 6th argument to .ACCESS_WITH_SIZE (PR121000)

    The size of the element of the FAM _cannot_ reliably depend on the original
    TYPE of the FAM that we passed as the 6th parameter to the .ACCESS_WITH_SIZE:

         TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (gimple_call_arg (call, 5))))

    when the element of the FAM has a variable length type. Since the variable
    that represents TYPE_SIZE_UNIT has no explicit usage in the original IL,
    compiler transformations (such as DSE) that are applied before the
    object_size phase might eliminate the whole definition of the variable that
    represents the TYPE_SIZE_UNIT of the element of the FAM.

    In order to resolve this issue, instead of passing the original TYPE of the
    FAM as the 6th argument to .ACCESS_WITH_SIZE, we should explicitly pass the
    original TYPE_SIZE_UNIT of the element TYPE of the FAM as the 6th argument
    to the call to .ACCESS_WITH_SIZE.

            PR middle-end/121000

    gcc/c/ChangeLog:

            * c-typeck.cc (build_access_with_size_for_counted_by): Update comments.
            Pass TYPE_SIZE_UNIT of the element as the 6th argument.

    gcc/ChangeLog:

            * internal-fn.cc (expand_ACCESS_WITH_SIZE): Update comments.
            * internal-fn.def (ACCESS_WITH_SIZE): Update comments.
            * tree-object-size.cc (access_with_size_object_size): Update comments.
            Get the element_size from the 6th argument directly.

    gcc/testsuite/ChangeLog:

            * gcc.dg/flex-array-counted-by-pr121000.c: New test.
Diff: --- gcc/c/c-typeck.cc | 10 +++-- gcc/internal-fn.cc | 10 ++--- gcc/internal-fn.def| 2 +- .../gcc.dg/flex-array-counted-by-pr121000.c| 43 ++ gcc/tree-object-size.cc| 28 +++--- 5 files changed, 69 insertions(+), 24 deletions(-) diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc index e24629be918b..de3d6c78db88 100644 --- a/gcc/c/c-typeck.cc +++ b/gcc/c/c-typeck.cc @@ -2983,7 +2983,7 @@ build_counted_by_ref (tree datum, tree subdatum, tree *counted_by_type) to: (*.ACCESS_WITH_SIZE (REF, COUNTED_BY_REF, 1, (TYPE_OF_SIZE)0, -1, - (TYPE_OF_ARRAY *)0)) + TYPE_SIZE_UNIT for element) NOTE: The return type of this function is the POINTER type pointing to the original flexible array type. @@ -2995,8 +2995,8 @@ build_counted_by_ref (tree datum, tree subdatum, tree *counted_by_type) The 4th argument of the call is a constant 0 with the TYPE of the object pointed by COUNTED_BY_REF. - The 6th argument of the call is a constant 0 with the pointer TYPE - to the original flexible array type. + The 6th argument of the call is the TYPE_SIZE_UNIT of the element TYPE + of the FAM. */ static tree @@ -3007,6 +3007,8 @@ build_access_with_size_for_counted_by (location_t loc, tree ref, gcc_assert (c_flexible_array_member_type_p (TREE_TYPE (ref))); /* The result type of the call is a pointer to the flexible array type. */ tree result_type = c_build_pointer_type (TREE_TYPE (ref)); + tree element_size = TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (ref))); + tree first_param = c_fully_fold (array_to_pointer_conversion (loc, ref), false, NULL); tree second_param @@ -3020,7 +3022,7 @@ build_access_with_size_for_counted_by (location_t loc, tree ref, build_int_cst (integer_type_node, 1), build_int_cst (counted_by_type, 0), build_int_cst (integer_type_node, -1), - build_int_cst (result_type, 0)); + element_size); /* Wrap the call with an INDIRECT_REF with the flexible array type. 
*/ call = build1 (INDIRECT_REF, TREE_TYPE (ref), call); SET_EXPR_LOCATION (call, loc); diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc index ed6ef0e4c647..c6e705cb6f5e 100644 --- a/gcc/internal-fn.cc +++ b/gcc/internal-fn.cc @@ -3443,7 +3443,7 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt) /* Expand the IFN_ACCESS_WITH_SIZE function: ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, CLASS_OF_SIZE, -TYPE_OF_SIZE, ACCESS_MODE) +TYPE_OF_SIZE, ACCESS_MODE, TYPE_SIZE_UNIT for element) which returns the REF_TO_OBJ same as the 1st argument; 1st argument REF_TO_OBJ: The reference to the object; @@ -3451,16 +3451,16 @@ expand_DEFERRED_INIT (internal_fn, gcall *stmt) 3rd argument CLASS_OF_SIZE: The size referenced by the REF_TO_SIZE represents 0: the number of
[gcc] Deleted branch 'mikael/heads/stabilisation_descriptor_v01' in namespace 'refs/users'
The branch 'mikael/heads/stabilisation_descriptor_v01' in namespace 'refs/users' was deleted. It previously pointed to: 7e72a078ae71... fortran: Amend descriptor bounds init if unallocated Diff: !!! WARNING: THE FOLLOWING COMMITS ARE NO LONGER ACCESSIBLE (LOST): --- 7e72a07... fortran: Amend descriptor bounds init if unallocated f6115ed... fortran: Delay evaluation of array bounds after reallocatio a1e8410... fortran: generate array reallocation out of loops 41b730b... Correction array_constructor_1
[gcc(refs/users/mikael/heads/stabilisation_descriptor_v01)] fortran: Generate array reallocation out of loops
https://gcc.gnu.org/g:3d82df83f96e3c0bef0fe042bdcf1a2f71b78045 commit 3d82df83f96e3c0bef0fe042bdcf1a2f71b78045 Author: Mikael Morin Date: Thu Jul 10 21:32:46 2025 +0200 fortran: Generate array reallocation out of loops Regression tested on x86_64-pc-linux-gnu. OK for master? -- >8 -- Generate the array reallocation on assignment code before entering the scalarization loops. This doesn't move the generated code itself, which was already put before the outermost loop, but only changes the current scope at the time the code is generated. This is a prerequisite for a followup patch that makes the reallocation code create new variables. Without this change the new variables would be declared in the innermost loop body and couldn't be used outside of it. gcc/fortran/ChangeLog: * trans-expr.cc (gfc_trans_assignment_1): Generate array reallocation code before entering the scalarisation loops. Diff: --- gcc/fortran/trans-expr.cc | 21 - 1 file changed, 12 insertions(+), 9 deletions(-) diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc index 3e0d763d2fb0..760c8c4e72bd 100644 --- a/gcc/fortran/trans-expr.cc +++ b/gcc/fortran/trans-expr.cc @@ -12943,6 +12943,7 @@ gfc_trans_assignment_1 (gfc_expr * expr1, gfc_expr * expr2, bool init_flag, rhs_caf_attr = gfc_caf_attr (expr2, false, &rhs_refs_comp); } + tree reallocation = NULL_TREE; if (lss != gfc_ss_terminator) { /* The assignment needs scalarization. */ @@ -13011,6 +13012,15 @@ gfc_trans_assignment_1 (gfc_expr * expr1, gfc_expr * expr2, bool init_flag, ompws_flags |= OMPWS_SCALARIZER_WS | OMPWS_SCALARIZER_BODY; } + /* F2003: Allocate or reallocate lhs of allocatable array. */ + if (realloc_flag) + { + realloc_lhs_warning (expr1->ts.type, true, &expr1->where); + ompws_flags &= ~OMPWS_SCALARIZER_WS; + reallocation = gfc_alloc_allocatable_for_assignment (&loop, expr1, + expr2); + } + /* Start the scalarized loop body. 
*/ gfc_start_scalarized_body (&loop, &body); } @@ -13319,15 +13329,8 @@ gfc_trans_assignment_1 (gfc_expr * expr1, gfc_expr * expr2, bool init_flag, gfc_add_expr_to_block (&body, tmp); } - /* F2003: Allocate or reallocate lhs of allocatable array. */ - if (realloc_flag) - { - realloc_lhs_warning (expr1->ts.type, true, &expr1->where); - ompws_flags &= ~OMPWS_SCALARIZER_WS; - tmp = gfc_alloc_allocatable_for_assignment (&loop, expr1, expr2); - if (tmp != NULL_TREE) - gfc_add_expr_to_block (&loop.code[expr1->rank - 1], tmp); - } + if (reallocation != NULL_TREE) + gfc_add_expr_to_block (&loop.code[loop.dimen - 1], reallocation); if (maybe_workshare) ompws_flags &= ~OMPWS_SCALARIZER_BODY;
[gcc(refs/users/mikael/heads/stabilisation_descriptor_v01)] fortran: Delay evaluation of array bounds after reallocation
https://gcc.gnu.org/g:8e09c2418a0bcba8e170398a6173b2d950b47ac4

commit 8e09c2418a0bcba8e170398a6173b2d950b47ac4
Author: Mikael Morin
Date: Thu Jul 10 21:32:57 2025 +0200

    fortran: Delay evaluation of array bounds after reallocation

    Regression tested on x86_64-pc-linux-gnu.
    OK for master?

    -- >8 --

    Delay the evaluation of bounds, offset, etc. until after the reallocation,
    for the scalarization of allocatable arrays on the left hand side of
    assignments.

    Before this change, the code preceding the scalarization loop is like:

        D.4757 = ref2.offset;
        D.4759 = ref2.dim[0].ubound;
        D.4762 = ref2.dim[0].lbound;
        {
          if (ref2.data == 0B) goto realloc;
          if (ref2.dim[0].lbound + 4 != ref2.dim[0].ubound) goto realloc;
          goto L.10;
          realloc:
          ... change offset and bounds ...
          D.4757 = ref2.offset;
          D.4762 = NON_LVALUE_EXPR ;
          ... reallocation ...
          L.10:;
        }
        while (1)
          {
            ... scalarized code ...

    so the bounds etc. are first evaluated to variables, and the reallocation
    code takes care to update the variables during the reallocation. This is
    problematic because the variables' initialization references the array
    bounds, which for unallocated arrays are uninitialized at the evaluation
    point. This used to (correctly) cause uninitialized warnings (see PR
    fortran/108889), and a workaround for variables was found that initializes
    the bounds of array variables to some value beforehand if they are
    unallocated. For allocatable components, there is no warning but the
    problem remains: some uninitialized values are used, even if discarded
    later.

    After this change the code becomes:

        {
          if (ref2.data == 0B) goto realloc;
          if (ref2.dim[0].lbound + 4 != ref2.dim[0].ubound) goto realloc;
          goto L.10;
          realloc:;
          ... change offset and bounds ...
          ... reallocation ...
          L.10:;
        }
        D.4762 = ref2.offset;
        D.4763 = ref2.dim[0].lbound;
        D.4764 = ref2.dim[0].ubound;
        while (1)
          {
            ... scalarized code

    so the scalarizer avoids storing the values to variables at the time it
    evaluates them, if the array is reallocatable on assignment. Instead, it
    keeps expressions with references to the array descriptor fields,
    expressions that remain valid through reallocation. After the reallocation
    code has been generated, the expressions stored by the scalarizer are
    evaluated in place to variables.

    The decision to delay evaluation is based on the existing field
    is_alloc_lhs, which requires a few tweaks to be always correct with respect
    to what its name suggests. Namely, it should be set even if the assignment
    right hand side is an intrinsic function, and it should not be set if the
    right hand side is a scalar, nor if the -fno-realloc-lhs flag is passed to
    the compiler.

    gcc/fortran/ChangeLog:

            * trans-array.cc (gfc_conv_ss_descriptor): Don't evaluate offset and
            data to a variable if is_alloc_lhs is set. Move the existing
            evaluation decision condition for data...
            (save_descriptor_data): ... here as a new predicate.
            (evaluate_bound): Add argument save_value. Omit the evaluation of
            the value to a variable if that argument isn't set.
            (gfc_conv_expr_descriptor): Update caller.
            (gfc_conv_section_startstride): Update caller. Set save_value if
            is_alloc_lhs is not set. Omit the evaluation of stride to a variable
            if save_value isn't set.
            (gfc_set_delta): Omit the evaluation of delta to a variable if
            is_alloc_lhs is set.
            (gfc_is_reallocatable_lhs): Return false if flag_realloc_lhs isn't
            set.
            (gfc_alloc_allocatable_for_assignment): Don't update the variables
            that may be stored in saved_offset, delta, and data. Call instead...
            (update_reallocated_descriptor): ... this new procedure.
            * trans-expr.cc (gfc_trans_assignment_1): Don't omit setting the
            is_alloc_lhs flag if the right hand side is an intrinsic function.
            Clear the flag if the right hand side is scalar.
Diff: --- gcc/fortran/trans-array.cc | 137 - gcc/fortran/trans-expr.cc | 14 ++--- 2 files changed, 104 insertions(+), 47 deletions(-) diff --git a/gcc/fortran/trans-array.cc b/gcc/fortran/trans-array.cc index 7be2d7b11a62..7b83d3fab8d7 100644 --- a/gcc/fortran/trans-array.cc +++ b/gcc/fortran/trans-array.cc @@ -3420,6 +3420,23 @@ gfc_add_loop_ss_code (gfc_loopinfo * loop, gfc_ss * ss, bool subscript, } +/* Given an array descriptor expression DESCR and its data pointer D
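The ordering described in this commit message (reallocation check first, bounds copied to temporaries only afterwards) can be modelled in plain C. This is a simplified sketch, not gfortran's generated code: struct desc and assign_realloc are hypothetical names, only rank 1 with int elements is handled, and the lower bound is assumed to become 1 whenever storage is (re)allocated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical rank-1 array descriptor, loosely modelled on the fields
   named in the dumps above (data, offset, lbound, ubound).  */
struct desc {
  int *data;        /* NULL while the array is unallocated */
  ptrdiff_t offset; /* element i is stored at data[i + offset] */
  ptrdiff_t lbound;
  ptrdiff_t ubound;
};

/* Model of "lhs = rhs" where rhs has N elements and lhs is reallocated on
   assignment.  Returns 0 on success, -1 if the allocation fails.  */
int assign_realloc(struct desc *lhs, const int *rhs, ptrdiff_t n)
{
  assert(lhs != NULL && rhs != NULL && n >= 0);

  /* Step 1: reallocation check.  The bounds are read only when data is
     non-null, so the uninitialized bounds of an unallocated descriptor are
     never evaluated.  */
  if (lhs->data == NULL || lhs->ubound - lhs->lbound + 1 != n) {
    size_t bytes = (n > 0 ? (size_t) n : 1) * sizeof *lhs->data;
    int *p = realloc(lhs->data, bytes);
    if (p == NULL)
      return -1;
    lhs->data = p;
    lhs->lbound = 1;            /* assumption of this sketch */
    lhs->ubound = n;
    lhs->offset = -lhs->lbound;
  }

  /* Step 2: only now copy the descriptor fields to temporaries; they are
     valid whether or not the reallocation branch was taken.  */
  const ptrdiff_t offset = lhs->offset;
  const ptrdiff_t lbound = lhs->lbound;
  const ptrdiff_t ubound = lhs->ubound;

  /* Step 3: the "scalarized" loop uses the temporaries.  */
  for (ptrdiff_t i = lbound; i <= ubound; i++)
    lhs->data[i + offset] = rhs[i - lbound];
  return 0;
}
```

Moving step 2 before step 1 would reproduce the old ordering: the temporaries would be loaded from a descriptor whose bounds may be uninitialized and would then have to be patched inside the reallocation branch.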
[gcc(refs/users/mikael/heads/stabilisation_descriptor_v01)] fortran: Amend descriptor bounds init if unallocated
https://gcc.gnu.org/g:a4b9621bfff13aa051078e07f6dec483faeb631b

commit a4b9621bfff13aa051078e07f6dec483faeb631b
Author: Mikael Morin
Date: Thu Jul 10 21:33:09 2025 +0200

    fortran: Amend descriptor bounds init if unallocated

    Regression tested on x86_64-pc-linux-gnu.
    OK for master?

    -- >8 --

    Always generate the conditional initialization of unallocated variables
    regardless of the basic variable allocation tracking done in the frontend
    and with an additional always false condition.

    The scalarizer used to always evaluate array bounds, including in the case
    of unallocated arrays on the left hand side of an assignment. This was
    (correctly) causing uninitialized warnings, even if the uninitialized
    values were in the end discarded.

    Since the fix for PR fortran/108889, an initialization of the descriptor
    bounds is added to silence the uninitialized warnings, conditional on the
    array being unallocated. This initialization is not useful in the execution
    of the program, and it is removed if the compiler can prove that the
    variable is unallocated (in the case of a local variable for example).
    Unfortunately, the compiler is not always able to prove it and the useless
    initialization may remain in the final code. Moreover, the generated code
    that was causing the evaluation of uninitialized variables has been changed
    to avoid them, so we can try to remove or revisit that unallocated variable
    bounds initialization tweak.

    Unfortunately, just removing the extra initialization restores the warnings
    at -O0, as there is no dead code removal at that optimization level.
    Instead, this change keeps the initialization and modifies its guarding
    condition with an extra always false variable, so that if optimizations are
    enabled the whole initialization block is removed, and if they are disabled
    it remains and is sufficient to prevent the warning.

    The new variable requires the code generation to be done earlier in the
    function so that the variable declaration and usage are in the same scope.
As the modified condition guarantees the removal of the block with optimizations, we can emit it more broadly and remove the basic allocation tracking that was done in the frontend to limit its emission. gcc/fortran/ChangeLog: * gfortran.h (gfc_symbol): Remove field allocated_in_scope. * trans-array.cc (gfc_array_allocate): Don't set it. (gfc_alloc_allocatable_for_assignment): Likewise. Generate the unallocated descriptor bounds initialisation before the opening of the reallocation code block. Create a variable and use it as additional condition to the unallocated descriptor bounds initialisation. Diff: --- gcc/fortran/gfortran.h | 4 -- gcc/fortran/trans-array.cc | 92 -- 2 files changed, 49 insertions(+), 47 deletions(-) diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h index 6848bd1762d3..69367e638c5b 100644 --- a/gcc/fortran/gfortran.h +++ b/gcc/fortran/gfortran.h @@ -2028,10 +2028,6 @@ typedef struct gfc_symbol /* Set if this should be passed by value, but is not a VALUE argument according to the Fortran standard. */ unsigned pass_as_value:1; - /* Set if an allocatable array variable has been allocated in the current - scope. Used in the suppression of uninitialized warnings in reallocation - on assignment. */ - unsigned allocated_in_scope:1; /* Set if an external dummy argument is called with different argument lists. This is legal in Fortran, but can cause problems with autogenerated C prototypes for C23. 
*/ diff --git a/gcc/fortran/trans-array.cc b/gcc/fortran/trans-array.cc index 7b83d3fab8d7..1561936daf1c 100644 --- a/gcc/fortran/trans-array.cc +++ b/gcc/fortran/trans-array.cc @@ -6800,8 +6800,6 @@ gfc_array_allocate (gfc_se * se, gfc_expr * expr, tree status, tree errmsg, else gfc_add_expr_to_block (&se->pre, set_descriptor); - expr->symtree->n.sym->allocated_in_scope = 1; - return true; } @@ -11495,14 +11493,61 @@ gfc_alloc_allocatable_for_assignment (gfc_loopinfo *loop, && !expr2->value.function.isym) expr2->ts.u.cl->backend_decl = rss->info->string_length; - gfc_start_block (&fblock); - /* Since the lhs is allocatable, this must be a descriptor type. Get the data and array size. */ desc = linfo->descriptor; gcc_assert (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (desc))); array1 = gfc_conv_descriptor_data_get (desc); + /* If the data is null, set the descriptor bounds and offset. This suppresses + the maybe used uninitialized warning. Note that the always false variable + prevents this block from ever being executed, and makes sure that the + optimizers are able to remove it. Component references ar
[gcc(refs/users/mikael/heads/stabilisation_descriptor_v01)] Correction array_constructor_1
https://gcc.gnu.org/g:41b730b8a79522e8e5a6115f01a02968a571e85b commit 41b730b8a79522e8e5a6115f01a02968a571e85b Author: Mikael Morin Date: Sat Jul 5 15:05:20 2025 +0200 Correction array_constructor_1 Diff: --- gcc/testsuite/gfortran.dg/asan/array_constructor_1.f90 | 2 ++ 1 file changed, 2 insertions(+) diff --git a/gcc/testsuite/gfortran.dg/asan/array_constructor_1.f90 b/gcc/testsuite/gfortran.dg/asan/array_constructor_1.f90 index 45eafacd5a67..a0c55076a9ae 100644 --- a/gcc/testsuite/gfortran.dg/asan/array_constructor_1.f90 +++ b/gcc/testsuite/gfortran.dg/asan/array_constructor_1.f90 @@ -9,6 +9,8 @@ program grow_type_array type(container), allocatable :: list(:) +allocate(list(0)) + list = [list, new_elem(5)] deallocate(list)
[gcc] Created branch 'mikael/heads/stabilisation_descriptor_v01' in namespace 'refs/users'
The branch 'mikael/heads/stabilisation_descriptor_v01' was created in namespace 'refs/users' pointing to: a4b9621bfff1... fortran: Amend descriptor bounds init if unallocated
[gcc r16-2182] aarch64: Guard VF-based costing with !m_costing_for_scalar
https://gcc.gnu.org/g:a1e616955e9971fda54a160a49e6cf70dd838a0c commit r16-2182-ga1e616955e9971fda54a160a49e6cf70dd838a0c Author: Richard Sandiford Date: Thu Jul 10 22:00:41 2025 +0100 aarch64: Guard VF-based costing with !m_costing_for_scalar g:4b47acfe2b626d1276e229a0cf165e934813df6c caused a segfault in aarch64_vector_costs::analyze_loop_vinfo when costing scalar code, since we'd end up dividing by a zero VF. Much of the structure of the aarch64 costing code dates from a stage 4 patch, when we had to work within the bounds of what the target-independent code did. Some of it could do with a rework now that we're not so constrained. This patch is therefore an emergency fix rather than the best long-term solution. I'll revisit when I have more time to think about it. gcc/ * config/aarch64/aarch64.cc (aarch64_vector_costs::add_stmt_cost): Guard VF-based costing with !m_costing_for_scalar. Diff: --- gcc/config/aarch64/aarch64.cc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index 27c315fc35e8..10b8ed5d3874 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -17932,7 +17932,7 @@ aarch64_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind, /* Do one-time initialization based on the vinfo. */ loop_vec_info loop_vinfo = dyn_cast (m_vinfo); - if (!m_analyzed_vinfo) + if (!m_analyzed_vinfo && !m_costing_for_scalar) { if (loop_vinfo) analyze_loop_vinfo (loop_vinfo);
[gcc r16-2179] testsuite: Fix unallocated array usage in test
https://gcc.gnu.org/g:ca034694757f0fb3461a1d0c22708a3e4c0e40fa commit r16-2179-gca034694757f0fb3461a1d0c22708a3e4c0e40fa Author: Mikael Morin Date: Sat Jul 5 15:05:20 2025 +0200 testsuite: Fix unallocated array usage in test gcc/testsuite/ChangeLog: * gfortran.dg/asan/array_constructor_1.f90: Allocate array before using it. Diff: --- gcc/testsuite/gfortran.dg/asan/array_constructor_1.f90 | 2 ++ 1 file changed, 2 insertions(+) diff --git a/gcc/testsuite/gfortran.dg/asan/array_constructor_1.f90 b/gcc/testsuite/gfortran.dg/asan/array_constructor_1.f90 index 45eafacd5a67..a0c55076a9ae 100644 --- a/gcc/testsuite/gfortran.dg/asan/array_constructor_1.f90 +++ b/gcc/testsuite/gfortran.dg/asan/array_constructor_1.f90 @@ -9,6 +9,8 @@ program grow_type_array type(container), allocatable :: list(:) +allocate(list(0)) + list = [list, new_elem(5)] deallocate(list)