Re: [PATCH 30/61] MSA: Make MSA and microMIPS R5 unsupported
Xi Ruoyao writes:
> On Wed, 2025-04-23 at 12:43 +, Aleksandar Rakic wrote:
>> From 16b3207aed5e4846fde4f3ffa1253c65ef6ba056 Mon Sep 17 00:00:00 2001
>> From: Aleksandar Rakic
>> Date: Wed, 23 Apr 2025 14:14:17 +0200
>> Subject: [PATCH] Make MSA and microMIPS R5 unsupported
>>
>> There are no platforms nor simulators for MSA and microMIPS R5 so
>> turning off this support for now.
>>
>> gcc/ChangeLog:
>>
>>         * config/mips/mips.cc (mips_option_override): Error out for
>>         -mmicromips -mips32r5 -mmsa.
>>
>> Cherry-picked 1009d6ff7a8d3b56e0224a6b193c5a7b3c29aa5f
>> from https://github.com/MIPS/gcc
>>
>> Signed-off-by: Matthew Fortune
>> Signed-off-by: Faraz Shahbazker
>> Signed-off-by: Aleksandar Rakic
>> ---
>>  gcc/config/mips/mips.cc | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc
>> index 0d3d0263f2d..23205dfb616 100644
>> --- a/gcc/config/mips/mips.cc
>> +++ b/gcc/config/mips/mips.cc
>> @@ -20414,6 +20414,7 @@ static void
>>  mips_option_override (void)
>>  {
>>    int i, regno, mode;
>> +  unsigned int is_micromips;
>>
>>    if (OPTION_SET_P (mips_isa_option))
>>      mips_isa_option_info = &mips_cpu_info_table[mips_isa_option];
>> @@ -20434,6 +20435,7 @@ mips_option_override (void)
>>    /* Save the base compression state and process flags as though we
>>       were generating uncompressed code.  */
>>    mips_base_compression_flags = TARGET_COMPRESSION;
>> +  is_micromips = TARGET_MICROMIPS;
>>    target_flags &= ~TARGET_COMPRESSION;
>>    mips_base_code_readable = mips_code_readable;
>>
>> @@ -20678,7 +20680,7 @@ mips_option_override (void)
>>                 "-mcompact-branches=never");
>>      }
>>
>> -  if (is_micromips && TARGET_MSA)
>> +  if (is_micromips && mips_isa_rev <= 5 && TARGET_MSA)
>
> Why not just "TARGET_MICROMIPS && mips_isa_rev <= 5 && TARGET_MSA"?
>
>>      error ("unsupported combination: %s", "-mmicromips -mmsa");
>
> And should this line be updated too like "-mmicromips -mmsa is only
> supported for MIPSr6"?
>
> Unfortunately the original patch is already applied and breaking even a
> non-bootstrapping build for MIPS.  Thus a fix is needed ASAP or we'd
> revert the original patch.

i.e. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119929#c2.

Also, Aleksandar, do you have an account on Bugzilla? It'd be useful to be
able to CC you on any MIPS-related issues with the upstreaming of these
patches.

Thanks.
Re: [PATCH] gcc: For Windows x86-32, always attempt to realign stack regardless of SSE
> For Windows x86-32 targets, the Microsoft ABI only guarantees that the
> stack is aligned to 4-byte boundaries. GCC knows about the default
> alignment of the stack. However, before this commit, it did not realign the
> stack unless SSE was also enabled.
>
> When a stricter (larger) alignment is requested, it's always necessary to
> realign the stack, as what Solaris does.

Yes, or else if you configure the compiler --with-fpmath=sse (which is IMO
the right thing to do for native 32-bit x86 platforms nowadays).

> PR target/07
> * config/i386/cygming.h (STACK_REALIGN_DEFAULT): Copy from sol2.h.

FWIW looks good to me.

-- 
Eric Botcazou
[PATCH] gcc: For Windows x86-32, always attempt to realign stack regardless of SSE
From d043b30147e00231a99012d631bfc6291340b283 Mon Sep 17 00:00:00 2001
From: LIU Hao
Date: Sun, 27 Apr 2025 18:18:34 +0800
Subject: [PATCH] gcc: For Windows x86-32, always attempt to realign stack
 regardless of SSE

For Windows x86-32 targets, the Microsoft ABI only guarantees that the
stack is aligned to 4-byte boundaries. GCC knows about the default
alignment of the stack. However, before this commit, it did not realign the
stack unless SSE was also enabled.

When a stricter (larger) alignment is requested, it's always necessary to
realign the stack, as what Solaris does.

Reference: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=07#c14
Signed-off-by: LIU Hao

gcc/ChangeLog:

        PR target/07
        * config/i386/cygming.h (STACK_REALIGN_DEFAULT): Copy from sol2.h.
---
 gcc/config/i386/cygming.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/i386/cygming.h b/gcc/config/i386/cygming.h
index 3ddcbecb22fd..d587d25a58a8 100644
--- a/gcc/config/i386/cygming.h
+++ b/gcc/config/i386/cygming.h
@@ -36,7 +36,7 @@ along with GCC; see the file COPYING3.  If not see
 /* 32-bit Windows aligns the stack on a 4-byte boundary but SSE
    instructions may require 16-byte alignment.  */
 #undef STACK_REALIGN_DEFAULT
-#define STACK_REALIGN_DEFAULT TARGET_SSE
+#define STACK_REALIGN_DEFAULT (TARGET_64BIT ? 0 : 1)
 
 /* Support hooks for SEH.  */
 #undef TARGET_ASM_UNWIND_EMIT
-- 
2.49.0
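As a rough illustration of the case the commit message describes, here is a minimal sketch of a function that requests a stricter alignment than a 4-byte-aligned incoming stack can provide. The helper name aligned_local_ok and the 32-byte figure are assumptions chosen for the example; they do not come from the patch.

```c
#include <stdint.h>

/* Hypothetical helper: reports whether an over-aligned local really
   lands on a 32-byte boundary.  noinline keeps the local in this
   function's own frame, so the prologue has to provide the alignment.  */
__attribute__((noinline)) int
aligned_local_ok (void)
{
  _Alignas (32) volatile char buf[32];
  buf[0] = 1;
  return ((uintptr_t) buf % 32) == 0;
}
```

On a target whose ABI only promises 4-byte stack alignment, the compiler can only honor the request by realigning the stack in the prologue, which is the behavior STACK_REALIGN_DEFAULT switches on by default.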
RE: [PATCH] AArch64: Fold LD1/ST1 with ptrue to LDR/STR for 128-bit VLS
> -----Original Message-----
> From: Richard Sandiford
> Sent: Friday, April 25, 2025 6:55 PM
> To: Jennifer Schmitz
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] AArch64: Fold LD1/ST1 with ptrue to LDR/STR for 128-bit
> VLS
>
> Jennifer Schmitz writes:
> > If -msve-vector-bits=128, SVE loads and stores (LD1 and ST1) with a
> > ptrue predicate can be replaced by neon instructions (LDR and STR),
> > thus avoiding the predicate altogether. This also enables formation of
> > LDP/STP pairs.
> >
> > For example, the test cases
> >
> > svfloat64_t
> > ptrue_load (float64_t *x)
> > {
> >   svbool_t pg = svptrue_b64 ();
> >   return svld1_f64 (pg, x);
> > }
> > void
> > ptrue_store (float64_t *x, svfloat64_t data)
> > {
> >   svbool_t pg = svptrue_b64 ();
> >   return svst1_f64 (pg, x, data);
> > }
> >
> > were previously compiled to
> > (with -O2 -march=armv8.2-a+sve -msve-vector-bits=128):
> >
> > ptrue_load:
> >         ptrue   p3.b, vl16
> >         ld1d    z0.d, p3/z, [x0]
> >         ret
> > ptrue_store:
> >         ptrue   p3.b, vl16
> >         st1d    z0.d, p3, [x0]
> >         ret
> >
> > Now they are compiled to:
> >
> > ptrue_load:
> >         ldr     q0, [x0]
> >         ret
> > ptrue_store:
> >         str     q0, [x0]
> >         ret
> >
> > The implementation includes the if-statement
> > if (known_eq (BYTES_PER_SVE_VECTOR, 16)
> >     && known_eq (GET_MODE_SIZE (mode), 16))
> >
> > which checks for 128-bit VLS and excludes partial modes with a
> > mode size < 128 (e.g. VNx2QI).
>
> I think it would be better to use:
>
>   if (known_eq (GET_MODE_SIZE (mode), 16)
>       && aarch64_classify_vector_mode (mode) == VEC_SVE_DATA
>
> to defend against any partial structure modes that might be added in future.
>

Hi Both,

Just a suggestion so feel free to ignore, but I do wonder if this
optimization shouldn't look at the predicate bits rather than the mode size.

Since this is valid for any load where the predicate uses the lower N bits,
where N corresponds to an Adv. SIMD register size, e.g. it should be valid
for:

#include <arm_sve.h>

svfloat64_t
ptrue_load (float64_t *x)
{
  svbool_t pg = svptrue_pat_b8 (SV_VL16);
  return svld1_f64 (pg, x);
}
void
ptrue_store (float64_t *x, svfloat64_t data)
{
  svbool_t pg = svptrue_pat_b8 (SV_VL16);
  return svst1_f64 (pg, x, data);
}

In general, along with

#include <arm_sve.h>

svfloat64_t
ptrue_load (float64_t *x)
{
  svbool_t pg = svptrue_pat_b8 (SV_VL8);
  return svld1_f64 (pg, x);
}
void
ptrue_store (float64_t *x, svfloat64_t data)
{
  svbool_t pg = svptrue_pat_b8 (SV_VL8);
  return svst1_f64 (pg, x, data);
}

It just so happens that at VL128 the SV_VL16 == SV_ALL.

Looking at the predicate bits instead would help optimize all codegen.

Thanks,
Tamar

> >
> > The patch was bootstrapped and tested on aarch64-linux-gnu, no regression.
> > OK for mainline?
> >
> > Signed-off-by: Jennifer Schmitz
> >
> > gcc/
> >     * config/aarch64/aarch64.cc (aarch64_emit_sve_pred_move):
> >     Fold LD1/ST1 with ptrue to LDR/STR for 128-bit VLS.
> >
> > gcc/testsuite/
> >     * gcc.target/aarch64/sve/ldst_ptrue_128_to_neon.c: New test.
> >     * gcc.target/aarch64/sve/cond_arith_6.c: Adjust expected outcome.
> >     * gcc.target/aarch64/sve/pst/return_4_128.c: Likewise.
> >     * gcc.target/aarch64/sve/pst/return_5_128.c: Likewise.
> >     * gcc.target/aarch64/sve/pst/struct_3_128.c: Likewise.
> > ---
> >  gcc/config/aarch64/aarch64.cc | 27 ++--
> >  .../gcc.target/aarch64/sve/cond_arith_6.c | 3 +-
> >  .../aarch64/sve/ldst_ptrue_128_to_neon.c | 36 +++
> >  .../gcc.target/aarch64/sve/pcs/return_4_128.c | 39 ---
> >  .../gcc.target/aarch64/sve/pcs/return_5_128.c | 39 ---
> >  .../gcc.target/aarch64/sve/pcs/struct_3_128.c | 64 +--
> >  6 files changed, 102 insertions(+), 106 deletions(-)
> >  create mode 100644
> gcc/testsuite/gcc.target/aarch64/sve/ldst_ptrue_128_to_neon.c
> >
> > diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> > index f7bccf532f8..ac01149276b 100644
> > --- a/gcc/config/aarch64/aarch64.cc
> > +++ b/gcc/config/aarch64/aarch64.cc
> > @@ -6416,13 +6416,28 @@ aarch64_stack_protect_canary_mem
> (machine_mode mode, rtx decl_rtl,
> >  void
> >  aarch64_emit_sve_pred_move (rtx dest, rtx pred, rtx src)
> >  {
> > -  expand_operand ops[3];
> >    machine_mode mode = GET_MODE (dest);
> > -  create_output_operand (&ops[0], dest, mode);
> > -  create_input_operand (&ops[1], pred, GET_MODE(pred));
> > -  create_input_operand (&ops[2], src, mode);
> > -  temporary_volatile_ok v (true);
> > -  expand_insn (code_for_aarch64_pred_mov (mode), 3, ops);
> > +  if ((MEM_P (dest) || MEM_P (src))
> > +      && known_eq (BYTES_PER_SVE_VECTOR, 16)
> > +      && known_eq (GET_MODE_SIZE (mode), 16)
> > +      && !BYTES_BIG_ENDIAN)
> > +    {
> > +      rtx tmp = gen_reg_rtx (V16QImode);
> > +      emit_move_insn (tmp, lowpart_subreg (V16QImode, src, mode));
> > +      if (MEM_P (src))
> > +
[PATCH] ipa, cgraph: Enable constant propagation to OpenMP kernels
This patch enables constant propagation to outlined OpenMP kernels and
improves support for optimizing callback functions in general. It
implements the attribute 'callback' as found in clang, though argument
numbering is a bit different, as described below. The title says OpenMP,
but it can be used for any function which takes a callback argument, such
as pthread functions, qsort and others.

The attribute 'callback' captures the notion of a function calling one
of its arguments with some of its parameters as arguments. An OpenMP
example of such a function is GOMP_parallel.
We implement the attribute with new callgraph edges called 'callback'
edges. They are imaginary edges pointing from the caller of the function
with the attribute (e.g. caller of GOMP_parallel) to the body function
itself (e.g. the outlined OpenMP body). They share their call statement
with the edge from which they are derived (direct edge caller -> GOMP_parallel
in this case). These edges allow passes such as ipa-cp to see the
hidden call site to the body function and optimize the function accordingly.

To illustrate on an example, the body of GOMP_parallel looks something
like this:

void GOMP_parallel (void (*fn) (void *), void *data, /* ... */)
{
  /* ... */
  fn (data);
  /* ... */
}

If we extend it with the attribute 'callback(1, 2)', we express that the
function calls its first argument and passes it its second argument.
This is represented in the call graph in this manner:

          direct          indirect
caller -> GOMP_parallel ---> fn
   |
   --> fn
     callback

The direct edge is then the parent edge, with all callback edges being
the child edges.
While constant propagation is the main focus of this patch, callback
edges can be useful for different passes (for example, it improves icf
for OpenMP kernels), as they allow for address redirection.
If the outlined body function gets optimized and cloned, from body_fn to
body_fn.optimized, the callback edge allows us to replace the
address in the arguments list:

GOMP_parallel (body_fn, &data_struct, /* ... */);

becomes

GOMP_parallel (body_fn.optimized, &data_struct, /* ... */);

This redirection is possible for any function with the attribute.

This callback attribute implementation is partially compatible with
clang's implementation. Its semantics, arguments and argument indexing style
are the same, but we represent an unknown argument position with 0
(precedent set by attributes such as 'format'), while clang uses -1 or '?'.
We also allow for multiple callback attributes on the same function,
while clang only allows one.

The attribute allows us to propagate constants into body functions of
OpenMP constructs. Currently, GCC won't propagate the value 'c' into the
OpenMP body in the following example:

int a[100];
void test(int c) {
#pragma omp parallel for
  for (int i = 0; i < c; i++) {
    if (!__builtin_constant_p(c)) {
      __builtin_abort();
    }
    a[i] = i;
  }
}
int main() {
  test(100);
  return a[5] - 5;
}

With this patch, the body function will get cloned and the constant 'c'
will get propagated.

Bootstrapped and regtested on x86_64-linux. OK for master?

Thanks,
Josef Melcr

gcc/ChangeLog:

        * builtin-attrs.def (0): New int list.
        (ATTR_CALLBACK): Callback attribute identifier.
        (DEF_CALLBACK_ATTRIBUTE): Macro for callback attribute creation.
        (GOMP): Attributes for libgomp functions.
        (OACC): Attribute used for oacc functions.
        (ATTR_CALLBACK_GOMP_LIST): ATTR_NOTHROW_LIST but with the
        callback attribute added, used for many libgomp functions.
        (ATTR_CALLBACK_GOMP_TASK_HELPER_LIST): Helper list for the
        construction of ATTR_CALLBACK_GOMP_TASK_LIST.
        (ATTR_CALLBACK_GOMP_TASK_LIST): New attribute list for
        GOMP_task, includes two callback attributes.
        (ATTR_CALLBACK_OACC_LIST): Same as ATTR_CALLBACK_GOMP_LIST, used
        for oacc builtins.
        * cgraph.cc (cgraph_add_edge_to_call_site_hash): When hashing
        callback edges, always hash the parent edge.
        (cgraph_node::get_edge): Always return callback parent edge.
        (cgraph_edge::set_call_stmt): Add cascade for callback edges.
        (symbol_table::create_edge): Allow callback edges to share the
        same call statement.
        (cgraph_edge::make_callback): New method, derives a callback
        edge this method is called on.
        (cgraph_edge::get_callback_parent_edge): New method.
        (cgraph_edge::first_callback_target): New method.
        (cgraph_edge::next_callback_target): New method.
        (cgraph_edge::purge_callback_children): New method.
        (cgraph_edge::redirect_call_stmt_to_callee): Add callback edge
        redirection, set call statements for child edges when updating
        the parent's statement.
        (cgraph_node::remove_callers): Remove child edges when removing
        their parent.
        (c
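For readers who want to try the 'callback(1, 2)' form outside libgomp, the following is a minimal sketch of a user-level function annotated in the way the message describes for GOMP_parallel. The names run_body, body_fn, body_data, sum_below and the CALLBACK_1_2 macro are made up for this example, and the __has_attribute guard is an assumption so that compilers without the attribute still build the file.

```c
#include <stddef.h>

/* Use the attribute only where the compiler knows it; other compilers
   fall back to a plain declaration.  */
#if defined(__has_attribute)
# if __has_attribute(callback)
#  define CALLBACK_1_2 __attribute__((callback (1, 2)))
# endif
#endif
#ifndef CALLBACK_1_2
# define CALLBACK_1_2
#endif

/* Hypothetical stand-in for GOMP_parallel: it calls its first argument
   and passes it its second argument.  */
CALLBACK_1_2 void run_body (void (*fn) (void *), void *data);

void
run_body (void (*fn) (void *), void *data)
{
  fn (data);
}

struct body_data { int c; int sum; };

/* Plays the role of the outlined body function.  */
static void
body_fn (void *p)
{
  struct body_data *d = p;
  for (int i = 0; i < d->c; i++)
    d->sum += i;
}

/* The caller: with a callback edge, a pass could see that body_fn is
   reached from here with a known data argument.  */
int
sum_below (int c)
{
  struct body_data d = { c, 0 };
  run_body (body_fn, &d);
  return d.sum;
}
```

The sketch only shows how the attribute is spelled and where it sits; whether a given compiler then clones body_fn or propagates constants into it depends on the compiler version and options.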
Re: [PATCH] ipa, cgraph: Enable constant propagation to OpenMP kernels
On Sun, Apr 27, 2025 at 2:58 AM Josef Melcr wrote:
>
> This patch enables constant propagation to outlined OpenMP kernels and
> improves support for optimizing callback functions in general. It
> implements the attribute 'callback' as found in clang, though argument
> numbering is a bit different, as described below. The title says OpenMP,
> but it can be used for any function which takes a callback argument, such
> as pthread functions, qsort and others.
>
> [...]
>
> With this patch, the body function will get cloned and the constant 'c'
> will get propagated.
>
> Bootstrapped and regtested on x86_64-linux. OK for master?

This seems like it could also improve code dealing with C++ lambdas.
Have you thought of that?

Thanks,
Andrew

> [...]
Re: [PATCH] ipa, cgraph: Enable constant propagation to OpenMP kernels
Lambdas have crossed my mind, but I have not yet had the time to look
thoroughly into their implementation and the issues they face. I do plan
to look into them once I am done with some incremental improvements for
the attribute and callback edges, as lambdas seem like a good candidate
for this sort of thing, given their use case.

Thanks,
Josef Melcr

On 4/27/25 19:36, Andrew Pinski wrote:
> On Sun, Apr 27, 2025 at 2:58 AM Josef Melcr wrote:
>> [...]
>> Bootstrapped and regtested on x86_64-linux. OK for master?
>
> This seems like it could also improve code dealing with C++ lambdas.
> Have you thought of that?
>
> Thanks,
> Andrew
>
>> [...]
Re: [PATCH] cfgexpand: Change __builtin_unreachable to __builtin_trap if only thing in function [PR109267]
> On 27 Apr 2025, at 00:06, Andrew Pinski wrote:
>
> When we have an empty function, things can go wrong with
> cfi_startproc/cfi_endproc and a few other things like exceptions. So if
> the only thing the function does is a call to __builtin_unreachable,
> let's expand that to a __builtin_trap instead. For most targets that is
> one instruction wide so it won't hurt things that much and we get correct
> behavior for exceptions and some linkers will be better for it.
>
> Bootstrapped and tested on x86_64-linux-gnu.

This also works to restore bootstrap for aarch64-darwin and is preferable
to the patch I suggested (since it is narrower in application).

A couple of typographical nits below …
thanks
Iain

>
> PR middle-end/109267
>
> gcc/ChangeLog:
>
> * cfgexpand.cc (expand_gimple_basic_block): If the first non debug
> statement in the
> first (and only) basic block is a call to __builtin_unreachable change
> it to a call
> to __builtin_trap.

some of these lines look quite long in the patch?

>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/pr109267-1.c: New test.
> * gcc.dg/pr109267-2.c: New test.
>
> Signed-off-by: Andrew Pinski
> ---
>  gcc/cfgexpand.cc | 8
>  gcc/testsuite/gcc.dg/pr109267-1.c | 14 ++
>  gcc/testsuite/gcc.dg/pr109267-2.c | 13 +
>  3 files changed, 35 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/pr109267-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/pr109267-2.c
>
> diff --git a/gcc/cfgexpand.cc b/gcc/cfgexpand.cc
> index e84f12a5e93..e14df760b7a 100644
> --- a/gcc/cfgexpand.cc
> +++ b/gcc/cfgexpand.cc
> @@ -6206,6 +6206,14 @@ expand_gimple_basic_block (basic_block bb, bool
> disable_tail_calls)
>        basic_block new_bb;
>
>        stmt = gsi_stmt (gsi);
> +
> +      /* If we are expanding the first (and only) bb and the only non debug
> +         statement is __builtin_unreachable call, then replace it with a trap
> +         so the function is at least one instruction in size.  */
> +      if (!nondebug_stmt_seen && bb->index == NUM_FIXED_BLOCKS
> +          && gimple_call_builtin_p (stmt, BUILT_IN_UNREACHABLE))

^^ whitespace glitch?

> +        gimple_call_set_fndecl(stmt, builtin_decl_implicit (BUILT_IN_TRAP));
> +
>        if (!is_gimple_debug (stmt))
>          nondebug_stmt_seen = true;
>
> diff --git a/gcc/testsuite/gcc.dg/pr109267-1.c
> b/gcc/testsuite/gcc.dg/pr109267-1.c
> new file mode 100644
> index 000..4f1da8b41e3
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr109267-1.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-rtl-expand-details" } */
> +
> +/* PR middle-end/109267 */
> +
> +int f(void)
> +{
> +  __builtin_unreachable();
> +}
> +
> +/* This unreachable should expand as trap. */
> +
> +/* { dg-final { scan-rtl-dump-times "__builtin_trap " 1 "expand"} } */
> +/* { dg-final { scan-rtl-dump-times "__builtin_unreachable " 1 "expand"} } */
> diff --git a/gcc/testsuite/gcc.dg/pr109267-2.c
> b/gcc/testsuite/gcc.dg/pr109267-2.c
> new file mode 100644
> index 000..e6da4860998
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr109267-2.c
> @@ -0,0 +1,13 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-rtl-expand-details" } */
> +
> +/* PR middle-end/109267 */
> +void g(void);
> +int f(int *t)
> +{
> +  g();
> +  __builtin_unreachable();
> +}
> +
> +/* This should be expanded to unreachable so it should show up twice. */
> +/* { dg-final { scan-rtl-dump-times "__builtin_unreachable " 2 "expand"} } */
> --
> 2.43.0
>
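To make the two shapes discussed in this thread concrete, here is a minimal sketch that places a function consisting only of __builtin_unreachable next to one that traps explicitly. The names only_unreachable, only_trap and both_have_addresses are invented for the example; neither of the first two functions may ever be called.

```c
#include <stddef.h>

/* A function whose whole body is __builtin_unreachable ().  This is the
   shape the patch targets; it must never actually be called.  */
int
only_unreachable (void)
{
  __builtin_unreachable ();
}

/* Writing the trap explicitly gives the body at least one instruction.  */
int
only_trap (void)
{
  __builtin_trap ();
}

/* Hypothetical helper: takes both addresses without calling either.  */
int
both_have_addresses (void)
{
  int (*volatile a) (void) = only_unreachable;
  int (*volatile b) (void) = only_trap;
  return a != NULL && b != NULL;
}
```

Comparing the assembly for the two functions at -O2 is one way to see what the change does on a particular target; the sketch itself makes no claim about the output of any specific GCC version.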
Re: [PATCH] tailc: Improve tail recursion handling [PR119493]
On Thu, 24 Apr 2025, Jakub Jelinek wrote:

> On Tue, Apr 01, 2025 at 11:51:49AM +0200, Jakub Jelinek wrote:
> > Here it is, ok if it passes bootstrap/regtest? I'll queue the interdiff
> > between this patch and the previous one for GCC 16.
>
> Here is the interdiff to improve the tail recursion handling also for
> non-musttail calls.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK

> 2025-04-24  Jakub Jelinek
>
>         PR tree-optimization/119493
>         * tree-tailcall.cc (find_tail_calls): Handle non-gimple_reg_type
>         arguments which aren't just passed through for tail recursions
>         even for non-musttail calls.
>
> --- gcc/tree-tailcall.cc.jj 2025-04-01 16:47:30.373502796 +0200
> +++ gcc/tree-tailcall.cc 2025-04-01 20:08:34.578787921 +0200
> @@ -685,8 +685,7 @@ find_tail_calls (basic_block bb, struct
>                ? !is_gimple_reg (param)
>                : (!is_gimple_variable (param)
>                   || TREE_THIS_VOLATILE (param)
> -                 || may_be_aliased (param)
> -                 || !gimple_call_must_tail_p (call)))
> +                 || may_be_aliased (param)))
>              break;
>          }
>      }
>
>
>         Jakub
>
>

-- 
Richard Biener
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
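As an illustration of the kind of source the ChangeLog entry is talking about, here is a minimal sketch of a self-recursive call in tail position whose argument is an aggregate (so not a gimple register type) and is modified before the call rather than passed through unchanged. The names acc, sum_to and triangular are invented for the example.

```c
/* An aggregate parameter: not a gimple register type in GCC terms.  */
struct acc { long sum; long n; };

/* Self-recursive call in tail position; the aggregate argument is
   updated before the call, so it is not simply passed through.  */
long
sum_to (struct acc a)
{
  if (a.n == 0)
    return a.sum;
  a.sum += a.n;
  a.n -= 1;
  return sum_to (a);
}

/* Hypothetical wrapper so callers can pass a plain integer.  */
long
triangular (long n)
{
  struct acc a = { 0, n };
  return sum_to (a);
}
```

The function computes the same result whether or not the recursion is turned into a loop; whether a given GCC build performs that transformation here is not something this sketch asserts.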
Re: [PATCH v2] gcc: do not apply store motion on loop with no exits.
ywgrit writes:

> I encountered one problem with loop-im pass.
> I compiled the program dhry2reg which belongs to
> unixbench(https://github.com/kdlucas/byte-unixbench).
>
> The gcc used
> gcc (GCC) 12.3.0
>
> The commands executed as following
> make
> ./Run -c -i 1 dhry2reg
>
> The results are shown below.
> Dhrystone 2 using register variables            0.1 lps   (10.0 s, 1 samples)
>
> System Benchmarks Partial Index              BASELINE       RESULT    INDEX
> Dhrystone 2 using register variables         116700.0          0.1      0.0
>
> System Benchmarks Index Score (Partial Only)                           10.0
>
> Obviously, the "INDEX" is abnormal.
> I wrote a demo named dhry.c based on the dhry2reg logic.

It's best to file a bug report so we can:
a) discuss the validity of the testcase, and
b) reference it in review of the commit & even once it is in

.. but that said, I thought this looked familiar, and I found PR117695
which is marked as INVALID.

> [...]

FWIW, all the analysis should go in the commit message, as well as
including a testcase in the commit itself. The commit message should
also have a ChangeLog.

See these links:
* https://gcc.gnu.org/contribute.html
* https://gcc.gnu.org/codingconventions.html

But these are general remarks. I can't approve the patch (that is for
others) but I'm not sure if the testcase is valid anyway.

> [...]

thanks,
sam
[PATCH] Fix name mismatch for fortran.
From: "hongtao.liu"

The function name in afdo_string_table is step3d_t_tile, but
IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (edge->callee->decl)) gets
__step3d_t_mod_MOD_step3d_t_tile. It looks like the prefix is not in the
debug string table, so let's also check directly for
afdo_string_table->get_index_by_decl (edge->callee->decl).

Tested with autofdo enabled, the issue is fixed by the patch.
Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?

gcc/ChangeLog:

        PR gcov-profile/118508
        * auto-profile.cc
        (autofdo_source_profile::get_callsite_total_count): Fix name
        mismatch for fortran.
---
 gcc/auto-profile.cc | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/auto-profile.cc b/gcc/auto-profile.cc
index aa4d1634f01..2d2d4a428f2 100644
--- a/gcc/auto-profile.cc
+++ b/gcc/auto-profile.cc
@@ -837,8 +837,10 @@ autofdo_source_profile::get_callsite_total_count (
 
   function_instance *s = get_function_instance_by_inline_stack (stack);
   if (s == NULL
-      || afdo_string_table->get_index (IDENTIFIER_POINTER (
-             DECL_ASSEMBLER_NAME (edge->callee->decl))) != s->name ())
+      || (afdo_string_table->get_index (IDENTIFIER_POINTER (
+             DECL_ASSEMBLER_NAME (edge->callee->decl))) != s->name ()
+          && afdo_string_table->get_index_by_decl (edge->callee->decl)
+          != s->name()))
     return 0;
 
   return s->total_count ();
-- 
2.34.1
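The mismatch described above comes from the gfortran module prefix in the assembler name. The following is a minimal sketch of the same "accept either name" idea on plain strings. The helpers strip_module_prefix and names_match are hypothetical and are not what the patch does: the patch asks afdo_string_table->get_index_by_decl, whose internals are not shown in this message.

```c
#include <string.h>

/* Hypothetical helper, not part of the patch: returns the part of a
   gfortran module-procedure assembler name after "_MOD_", or the name
   itself when there is no module prefix.  */
const char *
strip_module_prefix (const char *asm_name)
{
  const char *p = strstr (asm_name, "_MOD_");
  return p ? p + strlen ("_MOD_") : asm_name;
}

/* Returns 1 if the profile name matches either the full assembler name
   or the name with the module prefix removed, mirroring the "only fail
   when both lookups mismatch" shape of the new condition.  */
int
names_match (const char *asm_name, const char *profile_name)
{
  return strcmp (asm_name, profile_name) == 0
         || strcmp (strip_module_prefix (asm_name), profile_name) == 0;
}
```

With the names from the message, names_match ("__step3d_t_mod_MOD_step3d_t_tile", "step3d_t_tile") accepts the pair that the old single comparison rejected.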
[PATCH] [autofdo] Annotate bb with all debug_stmt with location of phi in the single_succ.
From: "hongtao.liu" For BB with all debug_stmt, it will be ignored by afdo_set_bb_count, but it can be set with count of single successors PHIs which edge from the BB.(only nonzero count is annotatted). Tested with -march=x86-64-v3 -O2 autofdo enabled, the issue in the PR is fixed. Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. Ok for trunk? gcc/ChangeLog: PR gcov-profile/118581 * auto-profile.cc (autofdo_source_profile::get_count_info): Overload the function with parameter gimple location instead of stmt. (afdo_set_bb_count): For !has_annotated BB, Check single successors PHIs corresponding to the block and use those count. --- gcc/auto-profile.cc | 53 ++--- 1 file changed, 50 insertions(+), 3 deletions(-) diff --git a/gcc/auto-profile.cc b/gcc/auto-profile.cc index 2d2d4a428f2..1fa2946c065 100644 --- a/gcc/auto-profile.cc +++ b/gcc/auto-profile.cc @@ -303,6 +303,10 @@ public: in INFO and return true; otherwise return false. */ bool get_count_info (gimple *stmt, count_info *info) const; + /* Find count_info for a given gimple location GIMPLE_LOC. If found, + store the count_info in INFO and return true; otherwise return false. */ + bool get_count_info (location_t gimple_loc, count_info *info) const; + /* Find total count of the callee of EDGE. 
*/ gcov_type get_callsite_total_count (struct cgraph_edge *edge) const; @@ -724,11 +728,18 @@ autofdo_source_profile::get_function_instance_by_decl (tree decl) const bool autofdo_source_profile::get_count_info (gimple *stmt, count_info *info) const { - if (LOCATION_LOCUS (gimple_location (stmt)) == cfun->function_end_locus) + return get_count_info (gimple_location (stmt), info); +} + +bool +autofdo_source_profile::get_count_info (location_t gimple_loc, + count_info *info) const +{ + if (LOCATION_LOCUS (gimple_loc) == cfun->function_end_locus) return false; inline_stack stack; - get_inline_stack (gimple_location (stmt), &stack); + get_inline_stack (gimple_loc, &stack); if (stack.length () == 0) return false; function_instance *s = get_function_instance_by_inline_stack (stack); @@ -1132,7 +1143,43 @@ afdo_set_bb_count (basic_block bb, const stmt_set &promoted) } if (!has_annotated) -return false; +{ + /* For BB with all debug stmt which assigne a value with constant, +check successors PHIs corresponding to the block and +use those counts. */ + edge tmp_e; + edge_iterator tmp_ei; + FOR_EACH_EDGE (tmp_e, tmp_ei, bb->succs) + { + basic_block bb_succ = tmp_e->dest; + for (gphi_iterator gpi = gsi_start_phis (bb_succ); + !gsi_end_p (gpi); + gsi_next (&gpi)) + { + gphi *phi = gpi.phi (); + size_t i; + for (i = 0; i < gimple_phi_num_args (phi); i++) + { + edge e = gimple_phi_arg_edge (phi, i); + if (e->src != bb) + continue; + location_t phi_loc = gimple_phi_arg_location (phi, i); + inline_stack stack; + count_info info; + if (afdo_source_profile->get_count_info (phi_loc, &info) + && info.count != 0) + { + if (info.count > max_count) + max_count = info.count; + has_annotated = true; + } + } + } + } + + if (!has_annotated) + return false; +} for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) afdo_source_profile->mark_annotated (gimple_location (gsi_stmt (gsi))); -- 2.34.1
[PATCH] RISC-V: Implement H modifier for printing the next register name
For RV32 inline assembly, when handling 64-bit integer data, it is often necessary to process the lower and upper 32 bits separately. Unfortunately, we can only output the current register name (lower 32 bits) but not the next register name (upper 32 bits). To address this, the modifier 'H' has been added to allow users to handle the upper 32 bits of the data. While I believe the modifier 'N' (representing the next register name) might be more suitable for this functionality, 'N' is already in use. Therefore, 'H' (representing the high register) was chosen instead. Co-Authored-By: Dimitar Dimitrov gcc/ChangeLog: * config/riscv/riscv.cc (riscv_print_operand): Add H. * doc/extend.texi: Document for H. gcc/testsuite/ChangeLog: * gcc.target/riscv/modifier-H-error-1.c: New test. * gcc.target/riscv/modifier-H-error-2.c: New test. * gcc.target/riscv/modifier-H.c: New test. --- gcc/config/riscv/riscv.cc | 22 +++ gcc/doc/extend.texi | 1 + .../gcc.target/riscv/modifier-H-error-1.c | 13 +++ .../gcc.target/riscv/modifier-H-error-2.c | 11 ++ gcc/testsuite/gcc.target/riscv/modifier-H.c | 22 +++ 5 files changed, 69 insertions(+) create mode 100644 gcc/testsuite/gcc.target/riscv/modifier-H-error-1.c create mode 100644 gcc/testsuite/gcc.target/riscv/modifier-H-error-2.c create mode 100644 gcc/testsuite/gcc.target/riscv/modifier-H.c diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index bad59e248d0..c5eec7a0136 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -6879,6 +6879,7 @@ riscv_asm_output_opcode (FILE *asm_out_file, const char *p) 'T' Print shift-index of inverted single-bit mask OP. '~' Print w if TARGET_64BIT is true; otherwise not print anything. 'N' Print register encoding as integer (0-31). + 'H' Print the name of the next register for integer. Note please keep this list and the list in riscv.md in sync. 
*/ @@ -7174,6 +7175,27 @@ riscv_print_operand (FILE *file, rtx op, int letter) asm_fprintf (file, "%u", (regno - offset)); break; } +case 'H': + { + if (!REG_P (op)) + { + output_operand_lossage ("modifier 'H' require register operand"); + break; + } + if (REGNO (op) > 31) + { + output_operand_lossage ("modifier 'H' is for integer registers only"); + break; + } + if (REGNO (op) == 31) + { + output_operand_lossage ("modifier 'H' cannot be applied to R31"); + break; + } + + fputs (reg_names[REGNO (op) + 1], file); + break; + } default: switch (code) { diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi index 0978c4c41b2..212d2487558 100644 --- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -12585,6 +12585,7 @@ The list below describes the supported modifiers and their effects for RISC-V. @item @code{z} @tab Print ''@code{zero}'' instead of 0 if the operand is an immediate with a value of zero. @item @code{i} @tab Print the character ''@code{i}'' if the operand is an immediate. @item @code{N} @tab Print the register encoding as integer (0 - 31). +@item @code{H} @tab Print the name of the next register for integer. 
@end multitable @anchor{shOperandmodifiers} diff --git a/gcc/testsuite/gcc.target/riscv/modifier-H-error-1.c b/gcc/testsuite/gcc.target/riscv/modifier-H-error-1.c new file mode 100644 index 000..43ecff6498e --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/modifier-H-error-1.c @@ -0,0 +1,13 @@ +/* { dg-do compile { target { rv32 } } } */ +/* { dg-skip-if "" { *-*-* } { "-flto" } { "" } } */ +/* { dg-options "-march=rv32gc -mabi=ilp32d -O0" } */ + +float foo () +{ + float ret; + asm ("fld\t%H0,(a0)\n\t":"=f"(ret)); + + return ret; +} + +/* { dg-error "modifier 'H' is for integer registers only" "" { target { "riscv*-*-*" } } 0 } */ diff --git a/gcc/testsuite/gcc.target/riscv/modifier-H-error-2.c b/gcc/testsuite/gcc.target/riscv/modifier-H-error-2.c new file mode 100644 index 000..db478b6ddf6 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/modifier-H-error-2.c @@ -0,0 +1,11 @@ +/* { dg-do compile { target { rv32 } } } */ +/* { dg-skip-if "" { *-*-* } { "-flto" } { "" } } */ +/* { dg-options "-march=rv32gc -mabi=ilp32d -O0 " } */ + +void foo () +{ + register int x31 __asm__ ("x31"); + asm ("li\t%H0,1\n\t":"=r"(x31)); +} + +/* { dg-error "modifier 'H' cannot be applied to R31" "" { target { "riscv*-*-*" } } 0 } */ diff --git a/gcc/testsuite/gcc.target/riscv/modifier-H.c b/gcc/testsuite/gcc.target/riscv/modifier-H.c new file mode 100644 index 000..3571ea966f0 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/modifier-H.c @@ -0,0 +1,22 @@ +/* { dg-do compile { target { rv32 } } } */ +/* { dg-skip-if "" { *-*-* } { "-flto" } { "" } } */ +/
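The checks that the 'H' case performs can be modelled in a few lines of standalone C++. This is a sketch, not the riscv.cc code: `next_int_reg_name` is a hypothetical helper, the register table lists the standard RISC-V ABI names for x0..x31, and an empty string stands in for the two `output_operand_lossage` paths (non-integer register, and x31 having no successor).

```cpp
#include <cassert>
#include <string>

// ABI names of the 32 RISC-V integer registers x0..x31.
static const char *const int_reg_names[32] = {
  "zero", "ra", "sp", "gp", "tp", "t0", "t1", "t2",
  "s0", "s1", "a0", "a1", "a2", "a3", "a4", "a5",
  "a6", "a7", "s2", "s3", "s4", "s5", "s6", "s7",
  "s8", "s9", "s10", "s11", "t3", "t4", "t5", "t6"
};

// Hypothetical model of the 'H' modifier: returns the name of register
// regno + 1, or an empty string where the patch would report an error.
std::string next_int_reg_name(unsigned regno) {
  if (regno > 31)
    return "";  // not an integer register
  if (regno == 31)
    return "";  // x31 has no next register
  return int_reg_names[regno + 1];
}
```

For a 64-bit value held in a0/a1 on RV32, `%0` would print a0 and, under the patch, `%H0` would print a1.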
Re: [PATCH 30/61] MSA: Make MSA and microMIPS R5 unsupported
On Wed, 2025-04-23 at 12:43 +, Aleksandar Rakic wrote:
> From 16b3207aed5e4846fde4f3ffa1253c65ef6ba056 Mon Sep 17 00:00:00 2001
> From: Aleksandar Rakic
> Date: Wed, 23 Apr 2025 14:14:17 +0200
> Subject: [PATCH] Make MSA and microMIPS R5 unsupported
>
> There are no platforms nor simulators for MSA and microMIPS R5 so
> turning off this support for now.
>
> gcc/ChangeLog:
>
> * config/mips/mips.cc (mips_option_override): Error out for
> -mmicromips -mips32r5 -mmsa.
>
> Cherry-picked 1009d6ff7a8d3b56e0224a6b193c5a7b3c29aa5f
> from https://github.com/MIPS/gcc
>
> Signed-off-by: Matthew Fortune
> Signed-off-by: Faraz Shahbazker
> Signed-off-by: Aleksandar Rakic
> ---
> gcc/config/mips/mips.cc | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc
> index 0d3d0263f2d..23205dfb616 100644
> --- a/gcc/config/mips/mips.cc
> +++ b/gcc/config/mips/mips.cc
> @@ -20414,6 +20414,7 @@ static void
> mips_option_override (void)
> {
> int i, regno, mode;
> + unsigned int is_micromips;
>
> if (OPTION_SET_P (mips_isa_option))
> mips_isa_option_info = &mips_cpu_info_table[mips_isa_option];
> @@ -20434,6 +20435,7 @@ mips_option_override (void)
> /* Save the base compression state and process flags as though we
> were generating uncompressed code. */
> mips_base_compression_flags = TARGET_COMPRESSION;
> + is_micromips = TARGET_MICROMIPS;
> target_flags &= ~TARGET_COMPRESSION;
> mips_base_code_readable = mips_code_readable;
>
> @@ -20678,7 +20680,7 @@ mips_option_override (void)
> "-mcompact-branches=never");
> }
>
> - if (is_micromips && TARGET_MSA)
> + if (is_micromips && mips_isa_rev <= 5 && TARGET_MSA)

Why not just "TARGET_MICROMIPS && mips_isa_rev <= 5 && TARGET_MSA"?

> error ("unsupported combination: %s", "-mmicromips -mmsa");

And should this line be updated too like "-mmicromips -mmsa is only supported for MIPSr6"?

Unfortunately the original patch is already applied and breaking even a non-bootstrapping build for MIPS. Thus a fix is needed ASAP or we'd revert the original patch.

--
Xi Ruoyao
School of Aerospace Science and Technology, Xidian University
Re: [PATCH] Add testcase for bogus Warray-bounds warning dealing with __builtin_unreachable [PR100038]
On Sat, Apr 5, 2025 at 4:56 AM Andrew Pinski wrote:
>
> After EVRP was switched to the ranger (r12-2305-g398572c1544d8b), we are
> better handling the case
> where __builtin_unreachable comes after a loop. Instead of removing
> __builtin_unreachable and having
> the loop become an infinite one; it is kept around longer and allows GCC to
> unroll the loop 2 times instead
> of 3 times. When GCC unrolled the loop 3 times, GCC would produce a bogus
> Warray-bounds warning for the 3rd
> iteration.
> This adds the testcase to make sure we don't regress on this case. It is
> originally extracted from LLVM source
> code too.

Ping?
It would be useful to have this testcase so we don't regress the warning; especially since this is extracted from LLVM (with asserts turned off which IIRC is the default way of building LLVM these days).

Thanks,
Andrew

>
> PR tree-optimization/100038
>
> gcc/testsuite/ChangeLog:
>
> * g++.dg/tree-ssa/pr100038.C: New test.
>
> Signed-off-by: Andrew Pinski
> ---
> gcc/testsuite/g++.dg/tree-ssa/pr100038.C | 17 +
> 1 file changed, 17 insertions(+)
> create mode 100644 gcc/testsuite/g++.dg/tree-ssa/pr100038.C
>
> diff --git a/gcc/testsuite/g++.dg/tree-ssa/pr100038.C
> b/gcc/testsuite/g++.dg/tree-ssa/pr100038.C
> new file mode 100644
> index 000..7024c4db2b2
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/tree-ssa/pr100038.C
> @@ -0,0 +1,17 @@
> +// { dg-do compile }
> +// { dg-options "-O2 -Wextra -Wall -Warray-bounds" }
> +
> +struct SparseBitVectorElement {
> + long Bits[2];
> + int find_first() const;
> +};
> +
> +// we should not get an `array subscript 2 is above array bounds of`
> +// warning here because we have an unreachable at that point
> +
> +int SparseBitVectorElement::find_first() const {
> + for (unsigned i = 0; i < 2; ++i)
> +if (Bits[i]) // { dg-bogus "is above array bounds of" }
> + return i;
> + __builtin_unreachable();
> +}
> --
> 2.43.0
>
[PATCH] c++: Add attribute handles_virtual_move_assign
This patch should make it easier to selectively disable -Wvirtual-move-assign errors by adding an attribute for move assignment operators which marks them as handling duplicate calls. gcc/cp/ChangeLog: * method.cc: Include "attribs.h". (synthesized_method_walk): Avoid outputting -Wvirtual_move_assign when the base class' move assignment operator has the handles_virtual_move_assign attribute. * tree.cc (handle_handles_virtual_move_assign): Add. (cxx_gnu_attributes): Add handles_virtual_move_assign to the attribute list. gcc/ChangeLog: * doc/extend.texi (C++-Specific Variable, Function, and Type Attributes): Document handles_virtual_move_assign. gcc/testsuite/ChangeLog: * g++.dg/warn/Wvirtual-move-assign-1.C: New test. Signed-off-by: Owen Avery --- gcc/cp/method.cc | 5 ++- gcc/cp/tree.cc| 28 gcc/doc/extend.texi | 13 .../g++.dg/warn/Wvirtual-move-assign-1.C | 32 +++ 4 files changed, 77 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/g++.dg/warn/Wvirtual-move-assign-1.C diff --git a/gcc/cp/method.cc b/gcc/cp/method.cc index 05c19cf0661..898f05c9b7d 100644 --- a/gcc/cp/method.cc +++ b/gcc/cp/method.cc @@ -33,6 +33,7 @@ along with GCC; see the file COPYING3. 
If not see #include "toplev.h" #include "intl.h" #include "common/common-target.h" +#include "attribs.h" static void do_build_copy_assign (tree); static void do_build_copy_constructor (tree); @@ -2949,7 +2950,9 @@ synthesized_method_walk (tree ctype, special_function_kind sfk, bool const_p, && BINFO_VIRTUAL_P (base_binfo) && fn && TREE_CODE (fn) == FUNCTION_DECL && move_fn_p (fn) && !trivial_fn_p (fn) - && vbase_has_user_provided_move_assign (BINFO_TYPE (base_binfo))) + && vbase_has_user_provided_move_assign (BINFO_TYPE (base_binfo)) + && !lookup_attribute ("handles_virtual_move_assign", + DECL_ATTRIBUTES (fn))) warning (OPT_Wvirtual_move_assign, "defaulted move assignment for %qT calls a non-trivial " "move assignment operator for virtual base %qT", diff --git a/gcc/cp/tree.cc b/gcc/cp/tree.cc index 5863b6878f0..4efd5121319 100644 --- a/gcc/cp/tree.cc +++ b/gcc/cp/tree.cc @@ -48,6 +48,8 @@ static tree handle_init_priority_attribute (tree *, tree, tree, int, bool *); static tree handle_abi_tag_attribute (tree *, tree, tree, int, bool *); static tree handle_contract_attribute (tree *, tree, tree, int, bool *); static tree handle_no_dangling_attribute (tree *, tree, tree, int, bool *); +static tree handle_handles_virtual_move_assign (tree *, tree, tree, int, + bool *); /* If REF is an lvalue, returns the kind of lvalue that REF is. Otherwise, returns clk_none. */ @@ -5234,6 +5236,8 @@ static const attribute_spec cxx_gnu_attributes[] = handle_abi_tag_attribute, NULL }, { "no_dangling", 0, 1, false, true, false, false, handle_no_dangling_attribute, NULL }, + { "handles_virtual_move_assign", 0, 0, false, false, false, false, +handle_handles_virtual_move_assign, NULL }, }; const scoped_attribute_specs cxx_gnu_attribute_table = @@ -5565,6 +5569,30 @@ handle_no_dangling_attribute (tree *node, tree name, tree args, int, return NULL_TREE; } +/* Handle a "handles_virtual_move_assign" attribute; arguments as in + struct attribute_spec.handler. 
*/ + +tree +handle_handles_virtual_move_assign (tree *node, tree name, tree /*args*/, + int /*flags*/, bool *no_add_attrs) +{ + if (TREE_CODE (*node) != FUNCTION_DECL || DECL_CONSTRUCTOR_P (*node) + || !move_fn_p (*node)) +{ + warning ( + OPT_Wattributes, + "%qE attribute ignored; valid only for move assignment operators", + name); + *no_add_attrs = true; +} + else +{ + *no_add_attrs = false; +} + + return NULL_TREE; +} + /* Return a new PTRMEM_CST of the indicated TYPE. The MEMBER is the thing pointed to by the constant. */ diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi index 0978c4c41b2..39b3455909d 100644 --- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -30412,6 +30412,19 @@ decltype(auto) foo(T&& t) @{ @}; @end smallexample +@cindex @code{handles_virtual_move_assign} function attribute +@item handles_virtual_move_assign + +If a C++ type has a default move assignment operator and virtually +inherits from a base class with a non-trivial move assignment operator, +the default move assignment operator may call the non-trivial assigment +operator multiple times. This causes gcc to emit a +@code{virtual-move-assign} warning, even if the non-trivial assignment +operator is written to handle this. This attribute can be used on a +base class' move as
Re: [PATCH] c++: Add attribute handles_virtual_move_assign
I'm open to renaming the attribute and/or test file, of course.
Re: [PATCH v2 3/3] xtensa: Make large const_int legitimate during RTL instruction combination pass
Hi Suwa-san, On Thu, Apr 24, 2025 at 12:07 AM Takayuki 'January June' Suwa wrote: > > Recent gcc versions tend to convert constants for which > TARGET_LEGITIMATE_CONSTANT_P returns false into references to literal pool > entries during the RTL instruction combination pass for pattern matching. > > For example, the following pattern will currently never match unless either > TARGET_CONST16 or TARGET_AUTO_LITPOOLS is enabled: > >[(set (match_operand:SI 0 "register_operand" "=a") > (match_operator:SI 2 "boolean_operator" > [(match_operand:SI 1 "register_operand" "r") > (const_int -2147483648)]))] > > because INT_MIN will be put into literal pool during the combination. > > This patch avoids the above problem in the way described in the title. > > gcc/ChangeLog: > > * config/xtensa/xtensa.cc (xtensa_legitimate_constant_p): > Add a logical OR with !xtensa_split1_finished_p() to the condition > that returns true to include the RTL instruction combination pass. > (xtensa_emit_move_sequence): Make its behavior consistent with > the above change. 
> --- > gcc/config/xtensa/xtensa.cc | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) This change results in the following regression in the gcc testsuite: -PASS: gcc.c-torture/execute/pr68328.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects (test for excess errors) -PASS: gcc.c-torture/execute/pr68328.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects execution test +FAIL: gcc.c-torture/execute/pr68328.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects (internal compiler error: in extract_constrain_insn, at recog.cc:2783) +FAIL: gcc.c-torture/execute/pr68328.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects (test for excess errors) with the following diagnostics: /home/jcmvbkbc/ws/tensilica/gcc/gcc/gcc/testsuite/gcc.c-torture/execute/pr68328.c: In function 'bar': /home/jcmvbkbc/ws/tensilica/gcc/gcc/gcc/testsuite/gcc.c-torture/execute/pr68328.c:16:1: error: unrecognizable insn: (insn 7 4 8 2 (parallel [ (asm_operands/v ("") ("") 0 [ (const_int 1193046 [0x123456]) (const_int 0 [0]) ] [ (asm_input:SI ("g") /home/jcmvbkbc/ws/tensilica/gcc/gcc/gcc/testsuite/gcc.c-torture/execute/pr68328.c:13) (asm_input:SI ("g") /home/jcmvbkbc/ws/tensilica/gcc/gcc/gcc/testsuite/gcc.c-torture/execute/pr68328.c:13) ] [] /home/jcmvbkbc/ws/tensilica/gcc/gcc/gcc/testsuite/gcc.c-torture/execute/pr68328.c:13) (clobber (mem:BLK (scratch) [0 A8])) ]) "/home/jcmvbkbc/ws/tensilica/gcc/gcc/gcc/testsuite/gcc.c-torture/execute/pr68328.c":13:3 -1 (nil)) during RTL pass: reload /home/jcmvbkbc/ws/tensilica/gcc/gcc/gcc/testsuite/gcc.c-torture/execute/pr68328.c:16:1: internal compiler error: in extract_constrain_insn, at recog.cc:2783 0x1b1de5f internal_error(char const*, ...) 
/home/jcmvbkbc/ws/tensilica/gcc/gcc/gcc/diagnostic-global-context.cc:517 0x8242ea fancy_abort(char const*, int, char const*) /home/jcmvbkbc/ws/tensilica/gcc/gcc/gcc/diagnostic.cc:1749 0x6c3448 _fatal_insn(char const*, rtx_def const*, char const*, int, char const*) /home/jcmvbkbc/ws/tensilica/gcc/gcc/gcc/rtl-error.cc:108 0x6c3464 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*) /home/jcmvbkbc/ws/tensilica/gcc/gcc/gcc/rtl-error.cc:116 0x6c1ff9 extract_constrain_insn(rtx_insn*) /home/jcmvbkbc/ws/tensilica/gcc/gcc/gcc/recog.cc:2783 0xc20fd7 check_rtl /home/jcmvbkbc/ws/tensilica/gcc/gcc/gcc/lra.cc:2202 0xc2517b lra(_IO_FILE*, int) /home/jcmvbkbc/ws/tensilica/gcc/gcc/gcc/lra.cc:2636 0xbdbe77 do_reload /home/jcmvbkbc/ws/tensilica/gcc/gcc/gcc/ira.cc:5987 0xbdbe77 execute /home/jcmvbkbc/ws/tensilica/gcc/gcc/gcc/ira.cc:6175 -- Thanks. -- Max
[PATCH] i386: Quote user-defined symbols in assembly in Intel syntax
Hello,

I'm sending this patch again after GCC 15 has been released. This patch was sent in February, but there were no comments:

https://patchwork.sourceware.org/project/gcc/patch/eca6660c-6578-4e39-8aa9-be9fdd013...@126.com/

--
Best regards,
LIU Hao

From f6c09e9397d5fe9c0dd1f7a02c90536732aed3df Mon Sep 17 00:00:00 2001
From: LIU Hao
Date: Sat, 22 Feb 2025 13:11:51 +0800
Subject: [PATCH] i386: Quote user-defined symbols in assembly in Intel syntax

With `-masm=intel`, GCC generates registers without % prefixes. If a user-declared symbol happens to match a register, it will confuse the assembler. User-defined symbols should be quoted, so they are not to be mistaken for registers or operators.

Support for quoted symbols was added in Binutils 2.26, originally for ARM assembly, where registers are also unprefixed:
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=d02603dc201f80cd9d2a1f4b1a16110b1e04222b

This change is required for `@SECREL32` to work in Intel syntax when targeting Windows, where `@` is allowed as part of a symbol. GNU AS fails to parse a plain symbol with that suffix:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80881#c79

gcc/config/:

PR target/53929
PR target/80881
* gcc/config/i386/i386-protos.h (ix86_asm_output_labelref): Declare new function for quoting user-defined symbols in Intel syntax.
* gcc/config/i386/i386.cc (ix86_asm_output_labelref): Implement it.
* gcc/config/i386/i386.h (ASM_OUTPUT_LABELREF): Use it.
* gcc/config/i386/cygming.h (ASM_OUTPUT_LABELREF): Use it.
--- gcc/config/i386/cygming.h | 5 +++-- gcc/config/i386/i386-protos.h | 1 + gcc/config/i386/i386.cc | 13 + gcc/config/i386/i386.h| 7 +++ 4 files changed, 24 insertions(+), 2 deletions(-) diff --git a/gcc/config/i386/cygming.h b/gcc/config/i386/cygming.h index 3ddcbecb22fd..4a192900045a 100644 --- a/gcc/config/i386/cygming.h +++ b/gcc/config/i386/cygming.h @@ -247,9 +247,10 @@ do { \ #undef ASM_OUTPUT_LABELREF #define ASM_OUTPUT_LABELREF(STREAM, NAME) \ do { \ + const char* prefix = ""; \ if ((NAME)[0] != FASTCALL_PREFIX)\ -fputs (user_label_prefix, (STREAM)); \ - fputs ((NAME), (STREAM));\ +prefix = user_label_prefix;\ + ix86_asm_output_labelref ((STREAM), prefix, (NAME)); \ } while (0) /* This does much the same in memory rather than to a stream. */ diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h index bea3fd4b2e2a..3b9e28ced91c 100644 --- a/gcc/config/i386/i386-protos.h +++ b/gcc/config/i386/i386-protos.h @@ -198,6 +198,7 @@ extern int ix86_attr_length_vex_default (rtx_insn *, bool, bool); extern rtx ix86_libcall_value (machine_mode); extern bool ix86_function_arg_regno_p (int); extern void ix86_asm_output_function_label (FILE *, const char *, tree); +extern void ix86_asm_output_labelref (FILE *, const char *, const char *); extern void ix86_call_abi_override (const_tree); extern int ix86_reg_parm_stack_space (const_tree); diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc index 3128973ba79c..d9dea86afa9c 100644 --- a/gcc/config/i386/i386.cc +++ b/gcc/config/i386/i386.cc @@ -1709,6 +1709,19 @@ ix86_asm_output_function_label (FILE *out_file, const char *fname, } } +/* Output a user-defined label. In AT&T syntax, registers are prefixed + with %, so labels require no punctuation. In Intel syntax, registers + are unprefixed, so labels may clash with registers or other operators, + and require quoting. 
*/ +void +ix86_asm_output_labelref (FILE *file, const char *prefix, const char *label) +{ + if (ASSEMBLER_DIALECT == ASM_ATT) +fprintf (file, "%s%s", prefix, label); + else +fprintf (file, "\"%s%s\"", prefix, label); +} + /* Implementation of call abi switching target hook. Specific to FNDECL the specific call register sets are set. See also ix86_conditional_register_usage for more details. */ diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h index 40b1aa4e6dfe..79a1afdde02c 100644 --- a/gcc/config/i386/i386.h +++ b/gcc/config/i386/i386.h @@ -2251,6 +2251,13 @@ extern unsigned int const svr4_debugger_register_map[FIRST_PSEUDO_REGISTER]; } while (0) #endif +/* In Intel syntax, we have to quote user-defined labels that would + match (unprefixed) registers or operators. */ + +#undef ASM_OUTPUT_LABELREF +#define ASM_OUTPUT_LABELREF(STREAM, NAME) \ + ix86_asm_output_labelref ((STREAM), user_label_prefix, (NAME)) + /* Under some conditions we need jump tables in the text section, because the assembler cannot handle label differences between sections. */ -- 2.48.1
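The quoting rule in the patch is small enough to model on its own. The sketch below is a standalone approximation, not GCC code: `AsmDialect` and `format_labelref` are hypothetical names, and the function returns a string instead of writing to a `FILE *` so the result is easy to inspect.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the ASSEMBLER_DIALECT distinction.
enum class AsmDialect { ATT, Intel };

// Mirrors the logic of ix86_asm_output_labelref in the patch: AT&T syntax
// emits prefix + label unchanged, Intel syntax wraps them in double quotes
// so that a symbol such as "eax" is not parsed as a register.
std::string format_labelref(AsmDialect dialect, const std::string &prefix,
                            const std::string &label) {
  if (dialect == AsmDialect::ATT)
    return prefix + label;
  return "\"" + prefix + label + "\"";
}
```

A user symbol named `eax` would then be emitted as `"eax"` in Intel syntax, while AT&T output stays `eax` because registers there carry a `%` prefix.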
RE: [PATCH v1][GCC16-Stage-1] RISC-V: Remove unnecessary frm restore volatile define_insn
Kindly ping. Pan -Original Message- From: Li, Pan2 Sent: Wednesday, April 16, 2025 10:57 PM To: gcc-patches@gcc.gnu.org Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; jeffreya...@gmail.com; rdapp@gmail.com; Chen, Ken ; Li, Pan2 Subject: [PATCH v1][GCC16-Stage-1] RISC-V: Remove unnecessary frm restore volatile define_insn From: Pan Li After we add the frm register to the global_regs, we may not need to define_insn that volatile to emit the frm restore insns. The cooperatively-managed global register will help to handle this, instead of emit the volatile define_insn explicitly. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_emit_frm_mode_set): Refactor the frm mode set by removing fsrmsi_restore_volatile. * config/riscv/vector-iterators.md (unspecv): Remove as unnecessary. * config/riscv/vector.md (fsrmsi_restore_volatile): Ditto. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/float-point-dynamic-frm-49.c: Adjust the asm dump check times. * gcc.target/riscv/rvv/base/float-point-dynamic-frm-50.c: Ditto. * gcc.target/riscv/rvv/base/float-point-dynamic-frm-52.c: Ditto. * gcc.target/riscv/rvv/base/float-point-dynamic-frm-74.c: Ditto. * gcc.target/riscv/rvv/base/float-point-dynamic-frm-75.c: Ditto. 
Signed-off-by: Pan Li --- gcc/config/riscv/riscv.cc | 43 ++- gcc/config/riscv/vector-iterators.md | 4 -- gcc/config/riscv/vector.md| 13 -- .../rvv/base/float-point-dynamic-frm-49.c | 2 +- .../rvv/base/float-point-dynamic-frm-50.c | 2 +- .../rvv/base/float-point-dynamic-frm-52.c | 2 +- .../rvv/base/float-point-dynamic-frm-74.c | 2 +- .../rvv/base/float-point-dynamic-frm-75.c | 2 +- 8 files changed, 28 insertions(+), 42 deletions(-) diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index 38f3ae7cd84..3878702e3a1 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -12047,27 +12047,30 @@ riscv_emit_frm_mode_set (int mode, int prev_mode) if (prev_mode == riscv_vector::FRM_DYN_CALL) emit_insn (gen_frrmsi (backup_reg)); /* Backup frm when DYN_CALL. */ - if (mode != prev_mode) -{ - rtx frm = gen_int_mode (mode, SImode); - - if (mode == riscv_vector::FRM_DYN_CALL - && prev_mode != riscv_vector::FRM_DYN && STATIC_FRM_P (cfun)) - /* No need to emit when prev mode is DYN already. */ - emit_insn (gen_fsrmsi_restore_volatile (backup_reg)); - else if (mode == riscv_vector::FRM_DYN_EXIT && STATIC_FRM_P (cfun) - && prev_mode != riscv_vector::FRM_DYN - && prev_mode != riscv_vector::FRM_DYN_CALL) - /* No need to emit when prev mode is DYN or DYN_CALL already. */ - emit_insn (gen_fsrmsi_restore_volatile (backup_reg)); - else if (mode == riscv_vector::FRM_DYN - && prev_mode != riscv_vector::FRM_DYN_CALL) - /* Restore frm value from backup when switch to DYN mode. */ - emit_insn (gen_fsrmsi_restore (backup_reg)); - else if (riscv_static_frm_mode_p (mode)) - /* Set frm value when switch to static mode. */ - emit_insn (gen_fsrmsi_restore (frm)); + if (mode == prev_mode) +return; + + if (riscv_static_frm_mode_p (mode)) +{ + /* Set frm value when switch to static mode. */ + emit_insn (gen_fsrmsi_restore (gen_int_mode (mode, SImode))); + return; } + + bool restore_p += /* No need to emit when prev mode is DYN. 
*/ + (STATIC_FRM_P (cfun) && mode == riscv_vector::FRM_DYN_CALL + && prev_mode != riscv_vector::FRM_DYN) + /* No need to emit if prev mode is DYN or DYN_CALL. */ + || (STATIC_FRM_P (cfun) && mode == riscv_vector::FRM_DYN_EXIT + && prev_mode != riscv_vector::FRM_DYN + && prev_mode != riscv_vector::FRM_DYN_CALL) + /* Restore frm value when switch to DYN mode. */ + || (mode == riscv_vector::FRM_DYN + && prev_mode != riscv_vector::FRM_DYN_CALL); + + if (restore_p) +emit_insn (gen_fsrmsi_restore (backup_reg)); } /* Implement Mode switching. */ diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md index f8da71b1d65..28f52481952 100644 --- a/gcc/config/riscv/vector-iterators.md +++ b/gcc/config/riscv/vector-iterators.md @@ -122,10 +122,6 @@ (define_c_enum "unspec" [ UNSPEC_SF_VFNRCLIPU ]) -(define_c_enum "unspecv" [ - UNSPECV_FRM_RESTORE_EXIT -]) - ;; Subset of VI with fractional LMUL types (define_mode_iterator VI_FRAC [ RVVMF2QI RVVMF4QI (RVVMF8QI "TARGET_VECTOR_ELEN_64") diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md index 51eb64fb122..9dae11a7849 100644 --- a/gcc/config/riscv/vector.md +++ b/gcc/config/riscv/vector.md @@ -1116,19 +1116,6 @@ (define_insn "fsrmsi_restore" (set_attr "mode" "SI")] ) -;; The volatile fsrmsi restore is used for the exit point for the -;; dynamic mode switchin
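The refactored control flow in riscv_emit_frm_mode_set can be read as a small decision function. The sketch below is an assumption-laden model, not the GCC implementation: the `FrmMode` enumerators only imitate the riscv_vector::FRM_* names, `is_static_mode` is a guess at what riscv_static_frm_mode_p tests, `static_frm_p` stands in for STATIC_FRM_P (cfun), and the initial frm backup for DYN_CALL is left out.

```cpp
#include <cassert>

// Assumed model of the frm modes; the five rounding modes are "static".
enum FrmMode {
  FRM_RNE, FRM_RTZ, FRM_RDN, FRM_RUP, FRM_RMM,
  FRM_DYN, FRM_DYN_EXIT, FRM_DYN_CALL, FRM_NONE
};

// What the mode switch would emit in this model.
enum class FrmAction { None, SetStatic, RestoreBackup };

// Guess at riscv_static_frm_mode_p: true for the fixed rounding modes.
static bool is_static_mode(FrmMode m) { return m <= FRM_RMM; }

// Mirrors the early returns and the restore_p expression from the patch.
FrmAction frm_mode_set_action(FrmMode mode, FrmMode prev, bool static_frm_p) {
  if (mode == prev)
    return FrmAction::None;
  if (is_static_mode(mode))
    return FrmAction::SetStatic;  // fsrmsi_restore with a constant

  bool restore_p =
      (static_frm_p && mode == FRM_DYN_CALL && prev != FRM_DYN)
      || (static_frm_p && mode == FRM_DYN_EXIT && prev != FRM_DYN
          && prev != FRM_DYN_CALL)
      || (mode == FRM_DYN && prev != FRM_DYN_CALL);

  // fsrmsi_restore from the backup register, now the only restore insn.
  return restore_p ? FrmAction::RestoreBackup : FrmAction::None;
}
```

Written this way, every path that used to emit fsrmsi_restore_volatile falls into the same RestoreBackup action as the plain DYN case, which is what lets the volatile pattern be removed.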
[PATCH v3] x86: Properly find the maximum stack slot alignment
On Wed, Apr 23, 2025 at 1:56 PM Uros Bizjak wrote:
> +static void
> +ix86_find_all_reg_uses_1 (HARD_REG_SET &regset,
> + rtx set, unsigned int regno,
> + auto_bitmap &worklist)
> +{
> + rtx dest = SET_DEST (set);
> +
> + if (!REG_P (dest))
> +return;
> +
> + /* Reject non-Pmode modes. */
> + if (GET_MODE (dest) != Pmode)
> +return;
>
> We can reject non-Pmode modes.
>
> OTOH, if the patch is OK for you, I think it is good to go forward.
>

Here is the v3 patch. The only change I made is

rtx set = single_set (insn);
if (set)
{
ix86_find_all_reg_uses_1 (regset, set, ref_regno, worklist);
continue; I added.
}

rtx pat = PATTERN (insn);
if (GET_CODE (pat) != PARALLEL)
continue;

OK for master?

Thanks.

Don't assume that stack slots can only be accessed by stack or frame registers. We first find all registers defined by stack or frame registers. Then check memory accesses by such registers, including stack and frame registers.

gcc/

PR target/109780
PR target/109093
* config/i386/i386.cc (stack_access_data): New.
(ix86_update_stack_alignment): Likewise.
(ix86_find_all_reg_use_1): Likewise.
(ix86_find_all_reg_use): Likewise.
(ix86_find_max_used_stack_alignment): Also check memory accesses from registers defined by stack or frame registers.

gcc/testsuite/

PR target/109780
PR target/109093
* g++.target/i386/pr109780-1.C: New test.
* gcc.target/i386/pr109093-1.c: Likewise.
* gcc.target/i386/pr109780-1.c: Likewise.
* gcc.target/i386/pr109780-2.c: Likewise.
* gcc.target/i386/pr109780-3.c: Likewise.

--
H.J.

From 2233834e398711b65c8b8eeefbf6fa830a6c2974 Mon Sep 17 00:00:00 2001
From: "H.J. Lu"
Date: Tue, 14 Mar 2023 11:41:51 -0700
Subject: [PATCH] x86: Properly find the maximum stack slot alignment

Don't assume that stack slots can only be accessed by stack or frame registers. We first find all registers defined by stack or frame registers. Then check memory accesses by such registers, including stack and frame registers.

gcc/

	PR target/109780
	PR target/109093
	* config/i386/i386.cc (stack_access_data): New.
	(ix86_update_stack_alignment): Likewise.
	(ix86_find_all_reg_use_1): Likewise.
	(ix86_find_all_reg_use): Likewise.
	(ix86_find_max_used_stack_alignment): Also check memory accesses
	from registers defined by stack or frame registers.

gcc/testsuite/

	PR target/109780
	PR target/109093
	* g++.target/i386/pr109780-1.C: New test.
	* gcc.target/i386/pr109093-1.c: Likewise.
	* gcc.target/i386/pr109780-1.c: Likewise.
	* gcc.target/i386/pr109780-2.c: Likewise.
	* gcc.target/i386/pr109780-3.c: Likewise.

Signed-off-by: H.J. Lu
Co-Authored-By: Uros Bizjak
---
 gcc/config/i386/i386.cc                    | 195 ++---
 gcc/testsuite/g++.target/i386/pr109780-1.C |  72
 gcc/testsuite/gcc.target/i386/pr109093-1.c |  33
 gcc/testsuite/gcc.target/i386/pr109780-1.c |  14 ++
 gcc/testsuite/gcc.target/i386/pr109780-2.c |  21 +++
 gcc/testsuite/gcc.target/i386/pr109780-3.c |  46 +
 6 files changed, 360 insertions(+), 21 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/i386/pr109780-1.C
 create mode 100644 gcc/testsuite/gcc.target/i386/pr109093-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr109780-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr109780-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr109780-3.c

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 3171d6e0ad4..dd076242177 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -8473,6 +8473,123 @@ output_probe_stack_range (rtx reg, rtx end)
   return "";
 }

+/* Data passed to ix86_update_stack_alignment. */
+struct stack_access_data
+{
+  /* The stack access register. */
+  const_rtx reg;
+  /* Pointer to stack alignment. */
+  unsigned int *stack_alignment;
+};
+
+/* Update the maximum stack slot alignment from memory alignment in PAT. */
+
+static void
+ix86_update_stack_alignment (rtx, const_rtx pat, void *data)
+{
+  /* This insn may reference stack slot. Update the maximum stack slot
+     alignment if the memory is referenced by the stack access register. */
+  stack_access_data *p = (stack_access_data *) data;
+
+  subrtx_iterator::array_type array;
+  FOR_EACH_SUBRTX (iter, array, pat, ALL)
+    {
+      auto op = *iter;
+      if (MEM_P (op) && reg_mentioned_p (p->reg, op))
+        {
+          unsigned int alignment = MEM_ALIGN (op);
+
+          if (alignment > *p->stack_alignment)
+            *p->stack_alignment = alignment;
+          break;
+        }
+    }
+}
+
+/* Helper function for ix86_find_all_reg_uses. */
+
+static void
+ix86_find_all_reg_uses_1 (HARD_REG_SET &regset,
+                          rtx set, unsigned int regno,
+                          auto_bitmap &worklist)
+{
+  rtx dest = SET_DEST (set);
+
+  if (!REG_P (dest))
+    return;
+
+  /* Reject non-Pmode modes. */
+  if (GET_MODE (dest) != Pmode)
+    return;
+
+  unsigned int dst_regno = REGNO (dest);
+
+  if (TEST_HARD_REG_
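
The two-phase idea in the commit message above (first collect every register defined from the stack or frame register, then check memory accesses through any of them) can be sketched on a toy instruction list. This is only an illustration under loud assumptions: `ToyInsn`, the register numbers, and both function names are hypothetical, the model ignores redefinitions and control flow, and nothing here is GCC's RTL or the code in the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <vector>

// Toy model only: hypothetical types, not GCC's RTL.
struct ToyInsn {
  int dest;        // register written by the insn, or -1 if none
  int src;         // register read to compute dest, or -1 if none
  int mem_base;    // base register of a memory access, or -1 if none
  unsigned align;  // alignment (in bits) of that memory access
};

constexpr int kStackPointer = 7;  // hypothetical register numbers
constexpr int kFramePointer = 6;

// Phase 1: collect every register whose value derives from SP or FP,
// using a worklist seeded with SP and FP themselves.
std::set<int> stack_derived_regs(const std::vector<ToyInsn>& insns) {
  std::set<int> derived = {kStackPointer, kFramePointer};
  std::vector<int> worklist(derived.begin(), derived.end());
  while (!worklist.empty()) {
    int reg = worklist.back();
    worklist.pop_back();
    for (const ToyInsn& insn : insns)
      if (insn.src == reg && insn.dest >= 0
          && derived.insert(insn.dest).second)
        worklist.push_back(insn.dest);  // newly found register: visit its uses
  }
  return derived;
}

// Phase 2: take the maximum alignment over all memory accesses whose base
// register is SP, FP, or a register derived from them.
unsigned max_stack_alignment(const std::vector<ToyInsn>& insns,
                             unsigned initial) {
  const std::set<int> derived = stack_derived_regs(insns);
  unsigned result = initial;
  for (const ToyInsn& insn : insns)
    if (insn.mem_base >= 0 && derived.count(insn.mem_base))
      result = std::max(result, insn.align);
  return result;
}
```

In this sketch an access through a register copied from a stack-derived register still raises the result, while an access through an unrelated register does not.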
Re: [PATCH] c++: Add attribute handles_virtual_move_assign
On 4/27/25 15:57, Owen Avery wrote:
> This patch should make it easier to selectively disable
> -Wvirtual-move-assign errors by adding an attribute for move assignment
> operators which marks them as handling duplicate calls.

I'm only qualified to comment on the documentation part of the patch.

> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index 0978c4c41b2..39b3455909d 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -30412,6 +30412,19 @@ decltype(auto) foo(T&& t) @{
>  @};
>  @end smallexample
>
> +@cindex @code{handles_virtual_move_assign} function attribute
> +@item handles_virtual_move_assign
> +
> +If a C++ type has a default move assignment operator and virtually
> +inherits from a base class with a non-trivial move assignment operator,
> +the default move assignment operator may call the non-trivial assigment

s/assigment/assignment/

> +operator multiple times. This causes gcc to emit a

s/gcc/GCC/

> +@code{virtual-move-assign} warning, even if the non-trivial assignment

I don't think the manual names warnings like that or refers to them with
that kind of markup. I'd just say "emit a warning" here. Instead, you
need to mention and cross-reference @option{-Wvirtual-move-assign} (or
its negative form), and make corresponding changes to the docs for that
option to link here.

> +operator is written to handle this. This attribute can be used on a
> +base class' move assignment operator declaration to indicate that it

s/class'/class's/

> +can handle the described situation, and that gcc should avoid emitting

s/gcc/GCC/

> +a warning.
> +
>  @cindex @code{warn_unused} type attribute
>  @item warn_unused

-Sandra
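
For readers unfamiliar with the situation the proposed documentation describes, here is a minimal sketch of a class hierarchy in which a defaulted move assignment may reach a virtual base's non-trivial move assignment operator more than once. All type and function names are hypothetical. The proposed attribute is deliberately not applied, since it exists only in the patch under review; how many times the base operator runs is left unspecified by the C++ standard, so the sketch only counts the calls.

```cpp
#include <cassert>
#include <utility>

// Hypothetical types for illustration; none of this comes from the patch.
static int base_move_assign_calls = 0;

struct Base {
  Base() = default;
  Base(Base&&) = default;
  // Non-trivial move assignment that is safe to call repeatedly: it only
  // copies a value.  The proposed attribute would go on this declaration.
  Base& operator=(Base&& other) {
    ++base_move_assign_calls;
    value = other.value;
    return *this;
  }
  int value = 0;
};

struct Left : virtual Base {};
struct Right : virtual Base {};

// The implicit move assignment of Derived assigns Left and Right, and each
// of those may assign the shared virtual Base.  GCC may warn here under
// -Wvirtual-move-assign.
struct Derived : Left, Right {};

// Returns how many times Base::operator=(Base&&) ran for one move assignment.
int count_base_move_assignments() {
  base_move_assign_calls = 0;
  Derived a, b;
  b.value = 42;
  a = std::move(b);
  assert(a.value == 42);  // repeated assignment is harmless for this Base
  return base_move_assign_calls;
}
```

Because `Base::operator=` leaves its source intact, a second call cannot lose data, which is the kind of operator the attribute is meant to mark.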
Re: [PATCH] c-family: Improve location for -Wunknown-pragmas in a _Pragma [PR118838]
On Mon, Apr 07, 2025 at 01:58:08PM -0400, Marek Polacek wrote:
> On Wed, Feb 12, 2025 at 08:27:37PM -0500, Lewis Hyatt wrote:
> > Hello-
> >
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118838
> >
> > This patch addresses the issue mentioned in the PR (another instance of
> > _Pragma string location issues). bootstrap + regtest all languages on
> > aarch64 looks good. Is it OK please for now or for stage 1? Note, it is not
> > a regression, since this never worked in C or C++ frontends; but on the
> > other hand, r15-4505 for GCC 15 fixed some related issues, so it could be
> > nice if this one gets in along with it. Thanks!
> >
> > -Lewis
> >
> > -- >8 --
> >
> > The warning for -Wunknown-pragmas is issued at the location provided by
> > libcpp to the def_pragma() callback. This location is
> > cpp_reader::directive_line, which is a location for the start of the line
> > only; it is also not a valid location in case the unknown pragma was lexed
> > from a _Pragma string. These factors make it impossible to suppress
> > -Wunknown-pragmas via _Pragma("GCC diagnostic...") directives on the same
> > source line, as in the PR and the test case. Address that by issuing the
> > warning at a better location returned by cpp_get_diagnostic_override_loc().
> > libcpp already maintains this location to handle _Pragma-related diagnostics
> > internally; it was needed also to make a publicly accessible version of it.
> >
> > gcc/c-family/ChangeLog:
> >
> > 	PR c/118838
> > 	* c-lex.cc (cb_def_pragma): Call cpp_get_diagnostic_override_loc()
> > 	to get a valid location at which to issue -Wunknown-pragmas, in case
> > 	it was triggered from a _Pragma.
> >
> > libcpp/ChangeLog:
> >
> > 	PR c/118838
> > 	* errors.cc (cpp_get_diagnostic_override_loc): New function.
> > 	* include/cpplib.h (cpp_get_diagnostic_override_loc): Declare.
> >
> > gcc/testsuite/ChangeLog:
> >
> > 	PR c/118838
> > 	* c-c++-common/cpp/pragma-diagnostic-loc-2.c: New test.
> > 	* g++.dg/gomp/macro-4.C: Adjust expected output.
> > 	* gcc.dg/gomp/macro-4.c: Likewise.
> > 	* gcc.dg/cpp/Wunknown-pragmas-1.c: Likewise.
> > ---
> >  libcpp/errors.cc                              | 10 +
> >  libcpp/include/cpplib.h                       |  5 +
> >  gcc/c-family/c-lex.cc                         |  7 +-
> >  .../cpp/pragma-diagnostic-loc-2.c             | 15 +
> >  gcc/testsuite/g++.dg/gomp/macro-4.C           |  8 +++
> >  gcc/testsuite/gcc.dg/cpp/Wunknown-pragmas-1.c | 22 +++
> >  gcc/testsuite/gcc.dg/gomp/macro-4.c           |  8 +++
> >  7 files changed, 57 insertions(+), 18 deletions(-)
> >  create mode 100644 gcc/testsuite/c-c++-common/cpp/pragma-diagnostic-loc-2.c
> >
> > diff --git a/libcpp/errors.cc b/libcpp/errors.cc
> > index 9621c4b66ea..d9efb6acd30 100644
> > --- a/libcpp/errors.cc
> > +++ b/libcpp/errors.cc
> > @@ -52,6 +52,16 @@ cpp_diagnostic_get_current_location (cpp_reader *pfile)
> >      }
> >  }
> >
> > +/* Sometimes a diagnostic needs to be generated before libcpp has been able
> > +   to generate a valid location for the current token; in that case, the
> > +   non-zero location returned by this function is the preferred one to use. */
> > +
> > +location_t
> > +cpp_get_diagnostic_override_loc (const cpp_reader *pfile)
> > +{
> > +  return pfile->diagnostic_override_loc;
> > +}
> > +
> >  /* Print a diagnostic at the given location. */
> >
> >  ATTRIBUTE_CPP_PPDIAG (5, 0)
> > diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h
> > index 90aa3160ebf..04d4621da3c 100644
> > --- a/libcpp/include/cpplib.h
> > +++ b/libcpp/include/cpplib.h
> > @@ -1168,6 +1168,11 @@ extern const char *cpp_probe_header_unit (cpp_reader *, const char *file,
> >  extern const char *cpp_get_narrow_charset_name (cpp_reader *) ATTRIBUTE_PURE;
> >  extern const char *cpp_get_wide_charset_name (cpp_reader *) ATTRIBUTE_PURE;
> >
> > +/* Sometimes a diagnostic needs to be generated before libcpp has been able
> > +   to generate a valid location for the current token; in that case, the
> > +   non-zero location returned by this function is the preferred one to use. */
>
> I don't love duplicating the comment like this, it's going to get out of sync.
>
> > +extern location_t cpp_get_diagnostic_override_loc (const cpp_reader *);
> > +
> >  /* This function reads the file, but does not start preprocessing. It
> >     returns the name of the original file; this is the same as the
> >     input file, except for preprocessed input. This will generate at
> > diff --git a/gcc/c-family/c-lex.cc b/gcc/c-family/c-lex.cc
> > index e450c9a57f0..df84020de62 100644
> > --- a/gcc/c-family/c-lex.cc
> > +++ b/gcc/c-family/c-lex.cc
> > @@ -248,7 +248,12 @@ cb_def_pragma (cpp_reader *pfile, location_t loc)
> >  {
> >    const unsigned char *space, *name;
> >    const cpp_token *s;
> > -  location_t fe_loc = loc;
> > +
> > +
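
As a rough illustration of the usage pattern the commit message refers to, the sketch below wraps an unknown pragma in `_Pragma("GCC diagnostic ...")` operators inside one macro, so that after expansion everything sits on a single source line. The macro name, the pragma name `example_unknown_pragma`, and `kAnswer` are hypothetical; this is not the test case from the PR, and whether the warning is actually suppressed depends on the compiler version and on `-Wunknown-pragmas` being enabled.

```cpp
#include <cassert>

// Hypothetical macro: push the diagnostic state, ignore -Wunknown-pragmas,
// emit a pragma the compiler does not know, then pop the state again.
// All four _Pragma operators expand onto the line where the macro is used.
#define DECLARE_WITH_UNKNOWN_PRAGMA(name, value)            \
  _Pragma("GCC diagnostic push")                            \
  _Pragma("GCC diagnostic ignored \"-Wunknown-pragmas\"")   \
  _Pragma("example_unknown_pragma")                         \
  _Pragma("GCC diagnostic pop")                             \
  static const int name = (value);

DECLARE_WITH_UNKNOWN_PRAGMA(kAnswer, 42)

// Returns the constant declared through the macro above.
int answer() { return kAnswer; }
```

The suppression can only work if the warning is reported at a location inside the push/pop range, which is why the location handed to the front end matters.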
Re: [PATCH] Fix size_t in id-15.c and infoleak-net-ethtool-ioctl.c for llp64
On 4/24/25 7:49 AM, Jonathan Yong wrote:
> Attached patch OK for master branch? Will push soon if there are no
> objections.
>
> gcc/testsuite/ChangeLog:
>
> 	* gcc.dg/graphite/id-15.c: Use __SIZE_TYPE__ instead of unsigned long.
> 	* gcc.dg/plugin/infoleak-net-ethtool-ioctl.c: ditto.

Pushed to master branch.
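
A small sketch of why the tests switch types: on LLP64 targets such as 64-bit Windows, `unsigned long` is 32 bits wide while `size_t` is 64 bits, so a hand-written `unsigned long` stand-in for `size_t` is wrong there. `__SIZE_TYPE__` is a macro predefined by GCC (and Clang) that expands to the target's real `size_t` type; the typedef and function names below are hypothetical, and the pointer-width check holds on common targets rather than being guaranteed by the standard.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// __SIZE_TYPE__ is predefined by GCC and Clang; other compilers may lack it.
typedef __SIZE_TYPE__ test_size_t;

// The compiler-provided type is the same type as the library's size_t.
static_assert(std::is_same<test_size_t, std::size_t>::value,
              "__SIZE_TYPE__ should name the same type as size_t");

// True on LP64 (e.g. 64-bit Linux), false on LLP64 (e.g. 64-bit Windows).
bool unsigned_long_is_pointer_sized() {
  return sizeof(unsigned long) == sizeof(void*);
}

// Expected to hold on both LP64 and LLP64 targets.
bool size_type_is_pointer_sized() {
  return sizeof(test_size_t) == sizeof(void*);
}
```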
Re: [PATCH] gcc: For Windows x86-32, always attempt to realign stack regardless of SSE
On 4/27/25 2:49 PM, Eric Botcazou wrote:
> > For Windows x86-32 targets, the Microsoft ABI only guarantees that the
> > stack is aligned to 4-byte boundaries. GCC knows about the default
> > alignment of the stack. However, before this commit, it did not realign
> > the stack unless SSE was also enabled. When a stricter (larger) alignment
> > is requested, it's always necessary to realign the stack, as what Solaris
> > does.
>
> Yes, or else if you configure the compiler --with-fpmath=sse (which is IMO
> the right thing to do for native 32-bit x86 platforms nowadays).
>
> > 	PR target/07
> > 	* config/i386/cygming.h (STACK_REALIGN_DEFAULT): Copy from sol2.h.
>
> FWIW looks good to me.

Thanks, pushed to master branch.
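
The kind of request the commit message means by "a stricter (larger) alignment" can be sketched with an over-aligned local variable. With only a 4-byte incoming stack guarantee, a 16-byte-aligned local can be placed correctly only if the function realigns its stack. The function name is hypothetical and the check is a runtime observation, not a reproduction of the PR.

```cpp
#include <cassert>
#include <cstdint>

// Returns true when a local that requests 16-byte alignment really lands on
// a 16-byte boundary.  volatile keeps the buffer from being optimized away.
bool over_aligned_local_is_aligned() {
  alignas(16) volatile unsigned char buffer[16] = {};
  auto address = reinterpret_cast<std::uintptr_t>(&buffer[0]);
  return address % 16 == 0;
}
```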
[PATCH] RISC-V: Fix register move cost for SIBCALL_REGS/JALR_REGS
Hi,

Following Jeff's request
(https://gcc.gnu.org/pipermail/gcc-patches/2025-April/681864.html), I have
split the change to riscv_register_move_cost into a separate patch. Please
review.

Thanks.
Zhijin

From b4c581393e864619192034bd8000c7e89443c19a Mon Sep 17 00:00:00 2001
From: Zhijin Zeng