[PATCH] gas: m68k M680LC040 F-LINE NOP insertion fixing emulation of fpu instructions
Hi,

Early revisions of the m68k 68LC040 have an issue after executing F-line
(coprocessor MMU/FPU) instructions and then taking a trap in the case of FPU
emulation. Unfortunately, due to the time that has passed, the processor
erratum describing how to address the issue has been lost. Fortunately the
issue was documented in a NetBSD PR:

http://gnats.netbsd.org/13078

The PR proposes a solution to the problem, which I have implemented as a patch
to the GNU assembler. As I have an affected machine with an LC040 processor, I
can confirm that all programmes compiled with the patched assembler work as
expected.

Best regards,
Nat

--- a/external/gpl3/binutils/dist/gas/config/tc-m68k.c Sun Dec 01 20:36:00 2024 +
+++ b/external/gpl3/binutils/dist/gas/config/tc-m68k.c Mon Dec 02 19:03:37 2024 +1100
@@ -74,6 +74,7 @@
 static int flag_short_refs; /* -l option. */
 static int flag_long_jumps; /* -S option. */
 static int flag_keep_pcrel; /* --pcrel option. */
+static bool lcfix = true;
 #ifdef REGISTER_PREFIX_OPTIONAL
 int flag_reg_prefix_optional = REGISTER_PREFIX_OPTIONAL;
@@ -4272,8 +4273,24 @@
 }
 }
+ bool hasnop = false;
+ char nop[4] = "nop";
+ toP = NULL;
+next:
 memset (&the_ins, '\0', sizeof (the_ins));
+
 m68k_ip (str);
+
+ if (lcfix == true && hasnop == false &&
+ (the_ins.opcode[0] & 0xf000) == 0xf000)
+{
+ memset (&the_ins, '\0', sizeof (the_ins));
+ m68k_ip (nop);
+ hasnop = true;
+}
+ else
+hasnop = false;
+
 er = the_ins.error;
 if (!er)
 {
@@ -4349,6 +4366,8 @@
 if (the_ins.reloc[m].wid == 'B')
 fixP->fx_signed = 1;
 }
+ if (hasnop == true)
+ goto next;
 return;
 }
@@ -4447,6 +4466,8 @@
 the_ins.reloc[m].pic_reloc));
 fixP->fx_pcrel_adjust = the_ins.reloc[m].pcrel_fix;
 }
+ if (hasnop == true)
+goto next;
 }
 /* Comparison function used by qsort to rank the opcode entries by name. */
@@ -7455,6 +7476,8 @@
 ;
 else if (m68k_set_cpu (arg, 0, 1))
 ;
+ else if (startswith (arg, "no-lcfix"))
+ lcfix = false;
 else
 return 0;
 break;
@@ -7556,6 +7579,7 @@
 fprintf (stream, _("\
 -march= set architecture\n\
 -mcpu= set cpu [default %s]\n\
+-mno-lcfix no compatability with lc040 nop before f-line\n\
 "), default_cpu);
 for (i = 0; m68k_extensions[i].name; i++)
 fprintf (stream, _("\
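
For readers who want the opcode test from the patch in isolation, here is a minimal sketch. It assumes 16-bit opcode words and only mirrors the condition `(the_ins.opcode[0] & 0xf000) == 0xf000`; the names `is_fline_opcode` and `needs_nop_before` are hypothetical and do not exist in gas.

```cpp
#include <cstdint>

// Hypothetical helper (not part of gas): an opcode word is "F-line" when its
// top four bits are all ones, which is the mask test used in the patch.
constexpr bool is_fline_opcode(std::uint16_t first_word) {
    return (first_word & 0xf000) == 0xf000;
}

// Hypothetical mirror of the patch's decision: a NOP goes in front of the
// instruction only when the fix is enabled and the instruction is F-line.
constexpr bool needs_nop_before(std::uint16_t first_word, bool lcfix) {
    return lcfix && is_fline_opcode(first_word);
}
```

With this shape, passing `false` for `lcfix` corresponds to what `-mno-lcfix` appears to select in the patch: no extra NOP is requested for any opcode.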
[COMMITTED 138/146] gccrs: derive(Clone): Use lang item for PhantomData in Clone
From: Arthur Cohen

gcc/rust/ChangeLog:

 * expand/rust-derive-clone.cc (DeriveClone::visit_union): Create a lang
 item path instead of a regular path.
---
 gcc/rust/expand/rust-derive-clone.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/rust/expand/rust-derive-clone.cc b/gcc/rust/expand/rust-derive-clone.cc
index 8093bf67ff0..2f733fae910 100644
--- a/gcc/rust/expand/rust-derive-clone.cc
+++ b/gcc/rust/expand/rust-derive-clone.cc
@@ -263,7 +263,7 @@ DeriveClone::visit_union (Union &item)
 {StructField (
 Identifier ("_t"),
 builder.single_generic_type_path (
- "PhantomData",
+ LangItem::Kind::PHANTOM_DATA,
 GenericArgs (
 {}, {GenericArg::create_type (builder.single_type_path ("T"))}, {})),
 Visibility::create_private (), item.get_locus ())});
--
2.45.2
RE: [PATCH] RISC-V: Add pattern for vector-scalar multiply-add/sub [PR119100]
> Note that since this isn't a regression it'll need to wait for gcc-16
> development to open before the patch can go forward.

> Pan Li sent a similar patch for vadd.vv/vadd.vx I think in November and I
> believe he intended to continue when stage 1 opens.

Yes, thanks Robin and Jeff. I will re-send the vadd.vv/vx patch after stage 1
opens, and then follow up with all the other similar cases.

Pan

-----Original Message-----
From: Jeff Law
Sent: Sunday, March 30, 2025 8:31 AM
To: Robin Dapp; Paul-Antoine Arras; gcc-patches@gcc.gnu.org; Li, Pan2
Subject: Re: [PATCH] RISC-V: Add pattern for vector-scalar multiply-add/sub [PR119100]

On 3/27/25 1:39 PM, Robin Dapp wrote:
> Hi Paul-Antoine,
>
>> This pattern enables the combine pass to merge a vec_duplicate into a
>> plus-mult or minus-mult RTL instruction.
>>
>> Before this patch, we have two instructions, e.g.:
>> vfmv.v.f v6,fa0
>> vfmadd.vv v9,v6,v7
>>
>> After, we get only one:
>> vfmadd.vf v9,fa0,v7
>>
>> On SPEC2017's 503.bwaves_r, depending on the workload, the reduction in
>> dynamic instruction count varies from -4.66% to -4.75%.
>
> The general issue with this kind of optimization (we have discussed it a
> few times already) is that, depending on the uarch, we want the local
> combine optimization that you show but not the fwprop/late-combine one
> where we propagate a vector broadcast into a loop.
>
> So IMHO in order to continue with this and similar patterns we need at
> least accompanying rtx_cost handling that would allow us to tune per uarch.
>
> Pan Li sent a similar patch for vadd.vv/vadd.vx I think in November and
> I believe he intended to continue when stage 1 opens.
>
> An outstanding question is how to distinguish the combine case from the
> late-combine case. I haven't yet thought about that in detail.

The other thing we should consider is that, while we can certainly theorize
that this kind of register file crossing case can have an extra penalty (it
traditionally does), we don't have actual evidence that it's causing a problem
on any RISC-V designs.

So maybe the way to go is to add a field to the uarch tuning structure
indicating the additional cost (if any) of a register file crossing vector op
of this nature. Then query that in riscv_rtx_costs or whatever our rtx_cost
function is named. Default that additional cost to zero initially. Then uarch
experts can fill in the appropriate value. Yea, such a simplistic approach
wouldn't handle cases like ours where you really need nearby context to be
sure, but I don't think we want to over-engineer this solution too badly right
now.

Note that since this isn't a regression it'll need to wait for gcc-16
development to open before the patch can go forward.

Thanks!
Jeff

ps. I know Baylibre's remit was to test dynamic icounts and there were good
reasons for that. So don't worry about not having run it on design. If you
happen to still have executables, pass them along privately, I can run them on
a BPI. ROMS is a few hours of runtime, but that's not a big deal.
Re: [PATCH] RISC-V: Add pattern for vector-scalar multiply-add/sub [PR119100]
On 3/27/25 1:39 PM, Robin Dapp wrote:
> Hi Paul-Antoine,
>
>> This pattern enables the combine pass to merge a vec_duplicate into a
>> plus-mult or minus-mult RTL instruction.
>>
>> Before this patch, we have two instructions, e.g.:
>> vfmv.v.f v6,fa0
>> vfmadd.vv v9,v6,v7
>>
>> After, we get only one:
>> vfmadd.vf v9,fa0,v7
>>
>> On SPEC2017's 503.bwaves_r, depending on the workload, the reduction in
>> dynamic instruction count varies from -4.66% to -4.75%.
>
> The general issue with this kind of optimization (we have discussed it a
> few times already) is that, depending on the uarch, we want the local
> combine optimization that you show but not the fwprop/late-combine one
> where we propagate a vector broadcast into a loop.
>
> So IMHO in order to continue with this and similar patterns we need at
> least accompanying rtx_cost handling that would allow us to tune per uarch.
>
> Pan Li sent a similar patch for vadd.vv/vadd.vx I think in November and
> I believe he intended to continue when stage 1 opens.
>
> An outstanding question is how to distinguish the combine case from the
> late-combine case. I haven't yet thought about that in detail.

The other thing we should consider is that, while we can certainly theorize
that this kind of register file crossing case can have an extra penalty (it
traditionally does), we don't have actual evidence that it's causing a problem
on any RISC-V designs.

So maybe the way to go is to add a field to the uarch tuning structure
indicating the additional cost (if any) of a register file crossing vector op
of this nature. Then query that in riscv_rtx_costs or whatever our rtx_cost
function is named. Default that additional cost to zero initially. Then uarch
experts can fill in the appropriate value. Yea, such a simplistic approach
wouldn't handle cases like ours where you really need nearby context to be
sure, but I don't think we want to over-engineer this solution too badly right
now.

Note that since this isn't a regression it'll need to wait for gcc-16
development to open before the patch can go forward.

Thanks!
Jeff

ps. I know Baylibre's remit was to test dynamic icounts and there were good
reasons for that. So don't worry about not having run it on design. If you
happen to still have executables, pass them along privately, I can run them on
a BPI. ROMS is a few hours of runtime, but that's not a big deal.
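
The tuning-field idea above can be sketched in a few lines. This is only an illustration of the shape of the proposal, not backend code: the struct `tune_sketch`, its field `vec_scalar_crossing_cost`, and the function `vector_op_cost` are all hypothetical names invented here, and the real cost hook works on RTL rather than on a boolean flag.

```cpp
// Hypothetical, simplified stand-in for a per-uarch tuning record. The field
// name is invented for this sketch and is not taken from the RISC-V backend.
struct tune_sketch {
    // Extra cost of a vector op that takes a scalar-register operand, i.e.
    // one that crosses register files. Zero by default, as suggested above.
    int vec_scalar_crossing_cost = 0;
};

// Hypothetical cost query: the crossing penalty is added only when the
// operation actually uses a scalar operand.
constexpr int vector_op_cost(const tune_sketch &tune, int base_cost,
                             bool has_scalar_operand) {
    return has_scalar_operand ? base_cost + tune.vec_scalar_crossing_cost
                              : base_cost;
}
```

With the default of zero the scalar form never looks more expensive than the vector-vector form, so the combine stays as attractive as it is in the patch; a uarch that pays for the crossing would set a positive value.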
Re: [PATCH] avoid-store-forwarding: Fix reg init on load-elimination [PR119160]
On 3/28/25 5:12 AM, Konstantinos Eleftheriou wrote:
> In the case that we are eliminating the load instruction, we use zero_extend
> for the initialization of the base register for the zero-offset store.
> This causes issues when the store and the load use the same mode, as we are
> trying to generate a zero_extend with the same inner and outer modes.
>
> This patch fixes the issue by zero-extending the value stored in the base
> register only when the load's mode is wider than the store's mode.
>
> PR rtl-optimization/119160
>
> gcc/ChangeLog:
>
> * avoid-store-forwarding.cc (process_store_forwarding):
> Zero-extend the value stored in the base register, in case of
> load-elimination, only when the mode of the destination is wider.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/pr119160.c: New test.

OK

jeff
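
The rule described in the patch summary can be illustrated with plain bit widths. This is a sketch under stated assumptions: the real pass works on machine modes and RTL, the names `init_kind` and `base_reg_init` are hypothetical, and the "plain copy" alternative is an assumption about what happens when no extension is needed.

```cpp
// Hypothetical description of how the base register gets initialized.
enum class init_kind { plain_copy, zero_extend };

// Zero-extend only when the load is strictly wider than the store. When the
// widths are equal, an extension would have identical inner and outer modes,
// which is the invalid case the patch avoids.
constexpr init_kind base_reg_init(unsigned store_bits, unsigned load_bits) {
    return load_bits > store_bits ? init_kind::zero_extend
                                  : init_kind::plain_copy;
}
```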
Re: [PATCH] combine: Special case set_noop_p in two spots
On 3/28/25 5:20 AM, Jakub Jelinek wrote:
> Hi!
>
> Here is the incremental patch I was talking about.
> For noop sets, we don't need to test much, they can go to i2 unless
> that would violate i3 JUMP condition.
>
> With this the try_combine on the pr119291.c testcase doesn't fail,
> but succeeds and we get
> (insn 22 21 23 4 (set (pc)
>         (pc)) "pr119291.c":27:15 2147483647 {NOOP_MOVE}
>      (nil))
> (insn 23 22 24 4 (set (reg/v:SI 117 [ e ])
>         (reg/v:SI 116 [ e ])) 96 {*movsi_internal}
>      (expr_list:REG_DEAD (reg/v:SI 116 [ e ])
>         (nil)))
> (note 24 23 25 4 NOTE_INSN_DELETED)
> (insn 25 24 26 4 (set (reg/v:SI 116 [ e ])
>         (const_int 0 [0])) "pr119291.c":28:13 96 {*movsi_internal}
>      (nil))
> (note 26 25 27 4 NOTE_INSN_DELETED)
> (insn 27 26 28 4 (set (reg:DI 128 [ _9 ])
>         (const_int 0 [0])) "pr119291.c":28:13 95 {*movdi_internal}
>      (nil))
> after it.
>
> Ok for trunk if this passes bootstrap/regtest?
>
> 2025-03-28 Jakub Jelinek
>
> * combine.cc (try_combine): Sets which satisfy set_noop_p can go
> to i2 unless i3 is a jump and the other set is not.

Shouldn't this wait for gcc-16? Or can you make a reasonable case that the
nop moves constitute a code quality regression?

OK for gcc-16. OK for gcc-15 if those nop moves are a code quality regression.

jeff
[PATCH] gcc.dg/analyzer/torture/switch-3.c: fix llp64 warnings
Patch OK for master branch?

gcc/testsuite/ChangeLog:

 * gcc.dg/analyzer/torture/switch-3.c: fix llp64 warnings

diff --git a/gcc/testsuite/gcc.dg/analyzer/torture/switch-3.c b/gcc/testsuite/gcc.dg/analyzer/torture/switch-3.c
index 57b8acdb292..b40be664d38 100644
--- a/gcc/testsuite/gcc.dg/analyzer/torture/switch-3.c
+++ b/gcc/testsuite/gcc.dg/analyzer/torture/switch-3.c
@@ -68,7 +68,7 @@ extern void check_init_u32 (__u32 v);
 /* Adapted/reduced from arch/x86/kernel/cpu/mtrr/if.c: mtrr_ioctl,
    which is GPL-2.0 */
-long mtrr_ioctl(unsigned int cmd, unsigned long __arg) {
+long mtrr_ioctl(unsigned int cmd, __UINTPTR_TYPE__ __arg) {
 int err = 0;
 struct mtrr_sentry sentry;
 struct mtrr_gentry gentry;
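
The patch does not spell out the cause of the warnings. A likely reading, stated here as an assumption, is that on LLP64 targets `unsigned long` is 32 bits while pointers are 64 bits, so an integer argument that is later treated as a pointer needs a pointer-sized type. The sketch below shows that idea with hypothetical helper names; it is not taken from the test file.

```cpp
#include <cstdint>

// Hypothetical helper: can integer type T hold any object pointer value on
// the current target without losing bits?
template <typename T>
constexpr bool wide_enough_for_pointer() {
    return sizeof(T) >= sizeof(void *);
}

// Sketch of the portable shape: take the argument as a pointer-sized integer
// so that converting it back to a pointer never changes its width.
inline void *arg_to_pointer(std::uintptr_t arg) {
    return reinterpret_cast<void *>(arg);
}
```

On an LP64 target `wide_enough_for_pointer<unsigned long>()` holds as well, which is why such code only starts to warn when built for an LLP64 target.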
Re: [PATCH v3 16/19] Refactor FMV frontend hooks and logic.
> On 27 Mar 2025, at 16:45, Alfie Richards wrote: > > > This change refactors FMV handling in the frontend to allows greater > reasoning about versions in shared code. > > This is needed for target_version semantics and allowing target_clones > and target_versions to both be used for the declaration there are now > two questions that need to be answered for the front end. > > 1. Are these two declarations completely distinct FMV declarations > (ie. the versions they define have no overlap). If so, they don't match. > 2. Are these two declarations matching and therefore mergeable. > (ie. two target_clone decls that define the same set of versions, or > an un-annotated declaration, and a target_clones definition containing the > default version). If so, the existing merging logic should be used to > try to merge these and diagnose if it's not possible. If not, then this > needs to be diagnosed. > > To do this the common_function_versions function has been renamed > distinct_function_versions (meaning, are the versions defined by these > two functions completely distinct from eachother). > > The common function version hook was changed to instead take two > string_slice's and determine if they define the same version. > > There is a new function, called mergeable_version_decls which checks > if two decls (which define overlapping versions) can be merged. > For example, if they are two target_clone decls which define the exact > same set of versions. > > This change also records the conflicting version so that it can be > included in diagnostics. > > gcc/ChangeLog: > > * attribs.cc (attr_strcmp): Moved to target specific code. > (sorted_attr_string): Moved to target specific code. > (common_function_versions): New function. > * attribs.h (sorted_attr_string): Removed. > (common_function_versions): New function. > * config/aarch64/aarch64.cc (aarch64_common_function_versions): > New function. > * config/riscv/riscv.cc (riscv_common_function_versions): New function. 
> * doc/tm.texi: Regenerated. > * target.def: Change common_function_versions hook. > * tree.cc (distinct_version_decls): New function. > (mergeable_version_decls): Ditto. > * tree.h (distinct_version_decls): New function. > (mergeable_version_decls): Ditto. > > gcc/cp/ChangeLog: > > * class.cc (resolve_address_of_overloaded_function): Updated to use > distinct_version_decls instead of common_function_version hook. > * cp-tree.h (decls_match): Updated to use > distinct_version_decls instead of common_function_version hook. > * decl.cc (decls_match): Refacture to use distinct_version_decls and > to pass through conflicting_version argument. > (maybe_version_functions): Updated to use > distinct_version_decls instead of common_function_version hook. > (duplicate_decls): Add logic to handle conflicting unmergable decls > and improve diagnostics for conflicting versions. > * decl2.cc (check_classfn): Updated to use > distinct_version_decls instead of common_function_version hook. > --- > gcc/attribs.cc| 75 ++--- > gcc/attribs.h | 3 +- > gcc/config/aarch64/aarch64.cc | 16 ++- > gcc/config/riscv/riscv.cc | 32 +++--- > gcc/cp/class.cc | 4 +- > gcc/cp/cp-tree.h | 2 +- > gcc/cp/decl.cc| 43 +-- > gcc/cp/decl2.cc | 2 +- > gcc/doc/tm.texi | 4 +- > gcc/target.def| 6 +- > gcc/tree.cc | 204 ++ > gcc/tree.h| 6 + > 12 files changed, 293 insertions(+), 104 deletions(-) > --- > diff --git a/gcc/attribs.cc b/gcc/attribs.cc > index 80833388ff2..13ddee3376b 100644 > --- a/gcc/attribs.cc > +++ b/gcc/attribs.cc > @@ -1086,7 +1086,14 @@ make_attribute (string_slice name, string_slice > arg_name, tree chain) > return attr; > } > > -/* Common functions used for target clone support. */ > +/* Used for targets with target_version semantics. 
*/ > + > +bool > +common_function_versions (string_slice fn1 ATTRIBUTE_UNUSED, > + string_slice fn2 ATTRIBUTE_UNUSED) > +{ > + gcc_unreachable(); > +} > > /* Comparator function to be used in qsort routine to sort attribute >specification strings to "target". */ > @@ -1176,72 +1183,6 @@ sorted_attr_string (tree arglist) > XDELETEVEC (attr_str); > return ret_str; > } > - > - > -/* This function returns true if FN1 and FN2 are versions of the same > function, > - that is, the target strings of the function decls are different. This > assumes > - that FN1 and FN2 have the same signature. */ > - > -bool > -common_function_versions (tree fn1, tree fn2) > -{ > - tree attr1, attr2; > - char *target1, *target2; > - bool result; > - > - if (TREE_CODE (fn1) != FUNCTION_DECL > - || TREE_CODE (fn2) != FUNCTION_DECL) > -return false; > - > - attr1 = lookup_attribute ("target", DECL_ATTRIBUTES (fn1)); > - attr2 = lookup_attribute ("target", DECL_ATTRIBUTES (fn2)); > - > - /* At le
[PATCH v2] RISC-V: vsetvl: skip abnormal edge on vsetvl insertion [PR119533]
changes since v1
 - dump log snafu

---
vsetvl phase4 uses LCM guided info to insert VSETVL insns.
It has an additional loop to insert missing vsetvls on certain edges.
Currently it asserts/aborts on encountering EDGE_ABNORMAL.

When enabling go frontend with V enabled, libgo build hits the assert.

It seems to be safe to just skip the abnormal edge.

Verified that a go toolchain build with the fix completes successfully
and runs the testsuite.

rv64imafdcv_zicbom_zicboz_zicntr_zicond_zicsr_zifencei_zihintntl_zihintpause_zihpm_zfa_zfhmin_zba_zbb_zbs_zkt_zvbb_zvkt/ lp64d/ medlow | 738 / 146 | 7 / 3 | 72 / 12 |

Also to make sure nothing regressed on rvv and go side, did additional
2 sets of runs.

1. RVV build, go disabled, w/ and w/o fix

rv64imafdcv_zvl256b_zba_zbb_zbs_zicond/ lp64d/ medlow | 244 / 96 | 7 / 3 | 67 / 12 |
rv64imafdcv_zvl256b_zba_zbb_zbs_zicond/ lp64d/ medlow | 244 / 96 | 7 / 3 | 67 / 12 |

2. go enabled, RVV disabled, w/ and w/o fix

rv64imafdc_zba_zbb_zbs_zicond_zfa/ lp64d/ medlow | 155 / 47 | 0 / 0 | 0 / 0 |
rv64imafdc_zba_zbb_zbs_zicond_zfa/ lp64d/ medlow | 155 / 47 | 0 / 0 | 0 / 0 |

 PR target/119533

gcc/ChangeLog:

 * config/riscv/riscv-vsetvl.cc (pre_vsetvl::emit_vsetvl): skip
 EDGE_ABNORMAL.

gcc/testsuite/ChangeLog:

 * go.dg/pr119533-riscv.go: New test.

Signed-off-by: Vineet Gupta
---
 gcc/config/riscv/riscv-vsetvl.cc | 7 +-
 gcc/testsuite/go.dg/pr119533-riscv.go | 120 ++
 2 files changed, 126 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/go.dg/pr119533-riscv.go

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 0ac2538f596f..dd966b5ed5d9 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -3391,7 +3391,6 @@ pre_vsetvl::emit_vsetvl ()
 edge_iterator eg_iterator;
 FOR_EACH_EDGE (eg, eg_iterator, bb->cfg_bb ()->succs)
 {
- gcc_assert (!(eg->flags & EDGE_ABNORMAL));
 if (dump_file)
 {
 fprintf (
@@ -3400,6 +3399,12 @@ pre_vsetvl::emit_vsetvl ()
 eg->src->index, eg->dest->index);
 footer_info.dump (dump_file, "");
 }
+ if (eg->flags & EDGE_ABNORMAL)
+ {
+ if (dump_file)
+ fprintf (dump_file, "\nskipping EDGE_ABNORMAL\n");
+ continue;
+ }
 start_sequence ();
 insert_vsetvl_insn (EMIT_DIRECT, footer_info);
 rtx_insn *rinsn = get_insns ();
diff --git a/gcc/testsuite/go.dg/pr119533-riscv.go b/gcc/testsuite/go.dg/pr119533-riscv.go
new file mode 100644
index ..30f52d267c5f
--- /dev/null
+++ b/gcc/testsuite/go.dg/pr119533-riscv.go
@@ -0,0 +1,120 @@
+// { dg-do compile { target riscv64*-*-* } }
+// { dg-options "-O2 -march=rv64gcv -mabi=lp64d" }
+
+// Reduced from libgo build (multi-file reduction, merged manually
+// and hand reduced again).
+ +package ast +import ( + "go/token" + "go/scanner" + "reflect" +) +type v struct {} +type w func( string, reflect.Value) bool +func x( string, reflect.Value) bool +type r struct { + scanner.ErrorList +} +type ab interface {} +type ae interface {} +type af interface {} +type ag struct {} +func (ag) Pos() token.Pos +func (ag) ah() token.Pos +type c struct { + ajae } +type ak struct { + al[]c } +type ( + am struct { + anstring} + bs struct { + Valuestring + } +) +func ao(string) *am +type ( + ap interface {} + aq struct { + arbs } +as struct { + bt ak + an am} +) +type File struct { + *ag + token.Pos + *am + at []af + *v + au[]*aq + av *am + aw []*ag } +type ax struct { + anstring + *v + ay map[string]File } +func a(az *token.FileSet, b token.Pos) int +type k struct { + l token.Pos + ah token.Pos +} +type m struct { + bb bool + bc *ag +} + +type bi uint +func bj(a *as) string { + if b := a.bt; len(b.al) == 1 { + c := b.al[0].aj + if e := c; e != nil {} + } + return a.an.an +} +func MergePackageFiles(f ax, g bi) *File { + h := 0 + bk := 0 + k := 0 + bl := make([]string, len(f.ay)) + i := 0 + for bm, a := range f.ay { + bl[i] = bm + k += len(a.at) + } + var bn *ag + var l token.Pos + if h > 0 {} + var bo []af + bu := make(map[string]int) + m := 0 + for _, bm := range bl { + a := f.ay[bm] + for _, d := range a.at { + if g!= 0 { +
Re: [PATCH] RISC-V: vsetvl: skip abnormal edge on vsetvl insertion [PR119533]
On 3/29/25 13:36, Andreas Schwab wrote:
>> + if (eg->flags & EDGE_ABNORMAL)
>> + {
>> + fprintf (dump_file, "\nskipping EDGE_ABNORMAL\n");
> This will crash if dump_file is NULL.

Sorry, last minute update. Fixed, v2 posted.

Thx,
-Vineet
Re: [PATCH] gcc/mingw: Align `.refptr.` to 8-byte boundaries for 64-bit targets
On 3/29/25 5:56 PM, LIU Hao wrote:
> This is a minor change, bootstrapped on x86_64-w64-mingw32.

Thanks, applied to master branch as obvious/trivial.
Re: [PATCH v2] RISC-V: vsetvl: skip abnormal edge on vsetvl insertion [PR119533]
On 3/29/25 6:49 PM, Vineet Gupta wrote:
> changes since v1
>  - dump log snafu
>
> ---
> vsetvl phase4 uses LCM guided info to insert VSETVL insns.
> It has an additional loop to insert missing vsetvls on certain edges.
> Currently it asserts/aborts on encountering EDGE_ABNORMAL.
>
> When enabling go frontend with V enabled, libgo build hits the assert.
>
> It seems to be safe to just skip the abnormal edge.
>
> Verified that a go toolchain build with the fix completes successfully
> and runs the testsuite.
>
> rv64imafdcv_zicbom_zicboz_zicntr_zicond_zicsr_zifencei_zihintntl_zihintpause_zihpm_zfa_zfhmin_zba_zbb_zbs_zkt_zvbb_zvkt/ lp64d/ medlow | 738 / 146 | 7 / 3 | 72 / 12 |
>
> Also to make sure nothing regressed on rvv and go side, did additional
> 2 sets of runs.
>
> 1. RVV build, go disabled, w/ and w/o fix
>
> rv64imafdcv_zvl256b_zba_zbb_zbs_zicond/ lp64d/ medlow | 244 / 96 | 7 / 3 | 67 / 12 |
> rv64imafdcv_zvl256b_zba_zbb_zbs_zicond/ lp64d/ medlow | 244 / 96 | 7 / 3 | 67 / 12 |
>
> 2. go enabled, RVV disabled, w/ and w/o fix
>
> rv64imafdc_zba_zbb_zbs_zicond_zfa/ lp64d/ medlow | 155 / 47 | 0 / 0 | 0 / 0 |
> rv64imafdc_zba_zbb_zbs_zicond_zfa/ lp64d/ medlow | 155 / 47 | 0 / 0 | 0 / 0 |
>
> PR target/119533
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-vsetvl.cc (pre_vsetvl::emit_vsetvl): skip
> EDGE_ABNORMAL.
>
> gcc/testsuite/ChangeLog:
>
> * go.dg/pr119533-riscv.go: New test.

So presumably it wants to insert on the EH edge for a reason. Just skipping
the edge is probably wrong.

The way these scenarios are handled in the more generic LCM passes is to kill
all values on the EH edge. With all values killed on the EH edge, no
redundancy exists which should require insertion on the edge.

Jeff
Re: [PATCH RFC] c++: optimize push_to_top_level [PR64500]
> On 29.03.2025 at 00:05, Jason Merrill wrote:
>
> Tested x86_64-pc-linux-gnu, initially with extra checking to make sure that
> indeed nothing got saved from a namespace level.
>
> This isn't a regression, but a 20% speedup for a simple change is pretty
> attractive; what do people think about this change for GCC 15?

Go for it.

Richard

> -- 8< --
>
> Profiling showed that the loop to save away IDENTIFIER_BINDINGs from open
> binding levels was taking 5% of total compilation time in the PR116285
> testcase. This turned out to be because we were unnecessarily trying to do
> this for namespaces, whose bindings are found through
> DECL_NAMESPACE_BINDINGS, not IDENTIFIER_BINDING.
>
> As a result we would frequently loop through everything in std::, checking
> whether it needs to be stored, and never storing anything.
>
> This change actually appears to speed up compilation for the PR116285
> testcase by ~20%.
>
> The replaced comments referred either to long-replaced handling of classes
> and templates, or to wanting b to point to :: when the loop exits.
>
> PR c++/64500
> PR c++/116285
>
> gcc/cp/ChangeLog:
>
> * name-lookup.cc (push_to_top_level): Don't try to store_bindings
> for namespace levels.
> ---
> gcc/cp/name-lookup.cc | 20
> 1 file changed, 12 insertions(+), 8 deletions(-)
>
> diff --git a/gcc/cp/name-lookup.cc b/gcc/cp/name-lookup.cc
> index 7f1ee869d52..61b7bfcaf94 100644
> --- a/gcc/cp/name-lookup.cc
> +++ b/gcc/cp/name-lookup.cc
> @@ -8675,6 +8675,9 @@ store_class_bindings (vec > *names,
>
> static GTY((deletable)) struct saved_scope *free_saved_scope;
>
> +/* Temporarily make the current scope the global namespace, saving away
> + the current scope for pop_from_top_level. */
> +
> void
> push_to_top_level (void)
> {
> @@ -8716,18 +8719,19 @@ push_to_top_level (void)
> store_class_bindings (previous_class_level->class_shadowed,
> &s->old_bindings);
>
> - /* Have to include the global scope, because class-scope decls
> - aren't listed anywhere useful. */
> + /* Save and clear any IDENTIFIER_BINDING from local scopes. */
> for (; b; b = b->level_chain)
> {
> tree t;
>
> - /* Template IDs are inserted into the global level. If they were
> - inserted into namespace level, finish_file wouldn't find them
> - when doing pending instantiations. Therefore, don't stop at
> - namespace level, but continue until :: . */
> - if (global_scope_p (b))
> -break;
> + /* We don't need to consider namespace scopes, they don't affect
> + IDENTIFIER_BINDING. */
> + if (b->kind == sk_namespace)
> +{
> + /* Jump straight to '::'. */
> + b = NAMESPACE_LEVEL (global_namespace);
> + break;
> +}
>
> store_bindings (b->names, &s->old_bindings);
> /* We also need to check class_shadowed to save class-level type
>
> base-commit: d9b56c65a2697e0d7a6c0f15f1977803dc94579b
> --
> 2.49.0
>
[COMMITTED 096/146] gccrs: nr2.0: Resolve type aliases inside trait definitions
From: Owen Avery

gcc/rust/ChangeLog:

 * resolve/rust-toplevel-name-resolver-2.0.cc (TopLevel::visit): Add
 visitor for TraitItemType.
 * resolve/rust-toplevel-name-resolver-2.0.h (TopLevel::visit): Likewise.

gcc/testsuite/ChangeLog:

 * rust/compile/nr2/exclude: Remove entries.

Signed-off-by: Owen Avery
---
 gcc/rust/resolve/rust-toplevel-name-resolver-2.0.cc | 9 +
 gcc/rust/resolve/rust-toplevel-name-resolver-2.0.h | 1 +
 gcc/testsuite/rust/compile/nr2/exclude | 5 -
 3 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/gcc/rust/resolve/rust-toplevel-name-resolver-2.0.cc b/gcc/rust/resolve/rust-toplevel-name-resolver-2.0.cc
index 6d52fcaaac8..4833233e025 100644
--- a/gcc/rust/resolve/rust-toplevel-name-resolver-2.0.cc
+++ b/gcc/rust/resolve/rust-toplevel-name-resolver-2.0.cc
@@ -109,6 +109,15 @@ TopLevel::visit (AST::Trait &trait)
 DefaultResolver::visit (trait);
 }

+void
+TopLevel::visit (AST::TraitItemType &trait_item)
+{
+ insert_or_error_out (trait_item.get_identifier ().as_string (), trait_item,
+ Namespace::Types);
+
+ DefaultResolver::visit (trait_item);
+}
+
 template
 static void
 insert_macros (std::vector &macros, NameResolutionContext &ctx)
diff --git a/gcc/rust/resolve/rust-toplevel-name-resolver-2.0.h b/gcc/rust/resolve/rust-toplevel-name-resolver-2.0.h
index 7f4e29585de..f540ab9ae61 100644
--- a/gcc/rust/resolve/rust-toplevel-name-resolver-2.0.h
+++ b/gcc/rust/resolve/rust-toplevel-name-resolver-2.0.h
@@ -148,6 +148,7 @@ private:
 void visit (AST::Module &module) override;
 void visit (AST::Trait &trait) override;
+ void visit (AST::TraitItemType &trait_item) override;
 void visit (AST::MacroRulesDefinition &macro) override;
 void visit (AST::Function &function) override;
 void visit (AST::BlockExpr &expr) override;
diff --git a/gcc/testsuite/rust/compile/nr2/exclude b/gcc/testsuite/rust/compile/nr2/exclude
index e23669f309b..da5880d9a57 100644
--- a/gcc/testsuite/rust/compile/nr2/exclude
+++ b/gcc/testsuite/rust/compile/nr2/exclude
@@ -74,8 +74,6 @@ issue-2139.rs
 issue-2142.rs
 issue-2165.rs
 issue-2166.rs
-issue-2190-1.rs
-issue-2190-2.rs
 issue-2238.rs
 issue-2304.rs
 issue-2330.rs
@@ -85,7 +83,6 @@ issue-2723-1.rs
 issue-2723-2.rs
 issue-2772-2.rs
 issue-2775.rs
-issue-2747.rs
 issue-2782.rs
 issue-2812.rs
 issue-850.rs
@@ -98,7 +95,6 @@ macros/mbe/macro-issue1233.rs
 macros/mbe/macro-issue1400.rs
 macros/mbe/macro13.rs
 macros/mbe/macro15.rs
-macros/mbe/macro20.rs
 macros/mbe/macro23.rs
 macros/mbe/macro40.rs
 macros/mbe/macro43.rs
@@ -198,7 +194,6 @@ iflet.rs
 issue-3033.rs
 issue-3009.rs
 issue-2323.rs
-issue-2953-1.rs
 issue-2953-2.rs
 issue-1773.rs
 issue-2905-1.rs
--
2.45.2
[PATCH] RISC-V: vsetvl: skip abnormal edge on vsetvl insertion [PR119533]
vsetvl phase4 uses LCM guided info to insert VSETVL insns.
It has an additional loop to insert missing vsetvls on certain edges.
Currently it asserts/aborts on encountering EDGE_ABNORMAL.

When enabling go frontend with V enabled, libgo build hits the assert.

It seems to be safe to just skip the abnormal edge.

Verified that a go toolchain build with the fix completes successfully
and runs the testsuite.

rv64imafdcv_zicbom_zicboz_zicntr_zicond_zicsr_zifencei_zihintntl_zihintpause_zihpm_zfa_zfhmin_zba_zbb_zbs_zkt_zvbb_zvkt/ lp64d/ medlow | 738 / 146 | 7 / 3 | 72 / 12 |

Also to make sure nothing regressed on rvv and go side, did additional
2 sets of runs.

1. RVV build, go disabled, w/ and w/o fix

rv64imafdcv_zvl256b_zba_zbb_zbs_zicond/ lp64d/ medlow | 244 / 96 | 7 / 3 | 67 / 12 |
rv64imafdcv_zvl256b_zba_zbb_zbs_zicond/ lp64d/ medlow | 244 / 96 | 7 / 3 | 67 / 12 |

2. go enabled, RVV disabled, w/ and w/o fix

rv64imafdc_zba_zbb_zbs_zicond_zfa/ lp64d/ medlow | 155 / 47 | 0 / 0 | 0 / 0 |
rv64imafdc_zba_zbb_zbs_zicond_zfa/ lp64d/ medlow | 155 / 47 | 0 / 0 | 0 / 0 |

 PR target/119533

gcc/ChangeLog:

 * config/riscv/riscv-vsetvl.cc (pre_vsetvl::emit_vsetvl): skip
 EDGE_ABNORMAL.

gcc/testsuite/ChangeLog:

 * go.dg/pr119533-riscv.go: New test.

Signed-off-by: Vineet Gupta
---
 gcc/config/riscv/riscv-vsetvl.cc | 6 +-
 gcc/testsuite/go.dg/pr119533-riscv.go | 120 ++
 2 files changed, 125 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/go.dg/pr119533-riscv.go

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 0ac2538f596f..3b81e7f09924 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -3391,7 +3391,6 @@ pre_vsetvl::emit_vsetvl ()
 edge_iterator eg_iterator;
 FOR_EACH_EDGE (eg, eg_iterator, bb->cfg_bb ()->succs)
 {
- gcc_assert (!(eg->flags & EDGE_ABNORMAL));
 if (dump_file)
 {
 fprintf (
@@ -3400,6 +3399,11 @@ pre_vsetvl::emit_vsetvl ()
 eg->src->index, eg->dest->index);
 footer_info.dump (dump_file, "");
 }
+ if (eg->flags & EDGE_ABNORMAL)
+ {
+ fprintf (dump_file, "\nskipping EDGE_ABNORMAL\n");
+ continue;
+ }
 start_sequence ();
 insert_vsetvl_insn (EMIT_DIRECT, footer_info);
 rtx_insn *rinsn = get_insns ();
diff --git a/gcc/testsuite/go.dg/pr119533-riscv.go b/gcc/testsuite/go.dg/pr119533-riscv.go
new file mode 100644
index ..30f52d267c5f
--- /dev/null
+++ b/gcc/testsuite/go.dg/pr119533-riscv.go
@@ -0,0 +1,120 @@
+// { dg-do compile { target riscv64*-*-* } }
+// { dg-options "-O2 -march=rv64gcv -mabi=lp64d" }
+
+// Reduced from libgo build (multi-file reduction, merged manually
+// and hand reduced again).
+ +package ast +import ( + "go/token" + "go/scanner" + "reflect" +) +type v struct {} +type w func( string, reflect.Value) bool +func x( string, reflect.Value) bool +type r struct { + scanner.ErrorList +} +type ab interface {} +type ae interface {} +type af interface {} +type ag struct {} +func (ag) Pos() token.Pos +func (ag) ah() token.Pos +type c struct { + ajae } +type ak struct { + al[]c } +type ( + am struct { + anstring} + bs struct { + Valuestring + } +) +func ao(string) *am +type ( + ap interface {} + aq struct { + arbs } +as struct { + bt ak + an am} +) +type File struct { + *ag + token.Pos + *am + at []af + *v + au[]*aq + av *am + aw []*ag } +type ax struct { + anstring + *v + ay map[string]File } +func a(az *token.FileSet, b token.Pos) int +type k struct { + l token.Pos + ah token.Pos +} +type m struct { + bb bool + bc *ag +} + +type bi uint +func bj(a *as) string { + if b := a.bt; len(b.al) == 1 { + c := b.al[0].aj + if e := c; e != nil {} + } + return a.an.an +} +func MergePackageFiles(f ax, g bi) *File { + h := 0 + bk := 0 + k := 0 + bl := make([]string, len(f.ay)) + i := 0 + for bm, a := range f.ay { + bl[i] = bm + k += len(a.at) + } + var bn *ag + var l token.Pos + if h > 0 {} + var bo []af + bu := make(map[string]int) + m := 0 + for _, bm := range bl { + a := f.ay[bm] + for _, d := range a.at { + if g!= 0 { + if a, p := d.(*as); p { +
Re: [PATCH] libstdc++: Constrain formatters for chrono types [PR119517]
On Fri, Mar 28, 2025 at 9:33 PM Jonathan Wakely wrote: > On 28/03/25 16:31 +0100, Tomasz Kamiński wrote: > >The formatters for chrono types defined the parse/format methods > >as accepting unconstrained types, this in combination with lack > >of a constraint on _CharT led to them falsely satisfying formattable > >requirements for any type used as character. > > > >This patch adjusts the formatter::parse signature to: > > constexpr typename basic_format_parse_context<_CharT>::iterator > > parse(basic_format_parse_context<_CharT>& __pc); > >And formatter::format to: > > template > > typename basic_format_context<_Out, _CharT>::iterator > > format(const T& __t, > > basic_format_context<_Out, _CharT>& __fc) const; > > > >Furthermore we constrain _CharT with __format::__char (char or wchar_t), > > > > PR libstdc++/119517 > > > >libstdc++-v3/ChangeLog: > > > > * include/bits/chrono_io.h (formatter): > > Add __format::__char for _CharT and adjust parse and format > > method signatures. > > * testsuite/std/time/format/pr119517.cc: New test. > >--- > >Testing on x86_64-linux, std/time/format tests passed. > >OK for trunk? 
> > > > libstdc++-v3/include/bits/chrono_io.h | 448 +- > > .../testsuite/std/time/format/pr119517.cc | 44 ++ > > 2 files changed, 262 insertions(+), 230 deletions(-) > > create mode 100644 libstdc++-v3/testsuite/std/time/format/pr119517.cc > > > >diff --git a/libstdc++-v3/include/bits/chrono_io.h > b/libstdc++-v3/include/bits/chrono_io.h > >index c55b651d049..3a5bc5695fb 100644 > >--- a/libstdc++-v3/include/bits/chrono_io.h > >+++ b/libstdc++-v3/include/bits/chrono_io.h > >@@ -1785,277 +1785,272 @@ namespace __format > > __format::__formatter_chrono<_CharT> _M_f; > > }; > > > >- template > >+ template<__format::__char _CharT> > > struct formatter > > { > >- template > >- constexpr typename _ParseContext::iterator > >- parse(_ParseContext& __pc) > >- { return _M_f._M_parse(__pc, __format::_Day); } > >+ constexpr typename basic_format_parse_context<_CharT>::iterator > >+ parse(basic_format_parse_context<_CharT>& __pc) > >+ { return _M_f._M_parse(__pc, __format::_Day); } > > > >- template > >- typename _FormatContext::iterator > >- format(const chrono::day& __t, _FormatContext& __fc) const > >+ template > >+ typename basic_format_context<_Out, _CharT>::iterator > >+ format(const chrono::day& __t, > >+ basic_format_context<_Out, _CharT>& __fc) const > > { return _M_f._M_format(__t, __fc); } > > > > private: > > __format::__formatter_chrono<_CharT> _M_f; > > }; > > > >- template > >+ template<__format::__char _CharT> > > struct formatter > > { > >- template > >- constexpr typename _ParseContext::iterator > >- parse(_ParseContext& __pc) > >- { return _M_f._M_parse(__pc, __format::_Month); } > >+ constexpr typename basic_format_parse_context<_CharT>::iterator > >+ parse(basic_format_parse_context<_CharT>& __pc) > >+ { return _M_f._M_parse(__pc, __format::_Month); } > > > >- template > >- typename _FormatContext::iterator > >- format(const chrono::month& __t, _FormatContext& __fc) const > >+ template > >+ typename basic_format_context<_Out, _CharT>::iterator > >+ 
format(const chrono::month& __t, > >+ basic_format_context<_Out, _CharT>& __fc) const > > { return _M_f._M_format(__t, __fc); } > > > > private: > > __format::__formatter_chrono<_CharT> _M_f; > > }; > > > >- template > >+ template<__format::__char _CharT> > > struct formatter > > { > >- template > >- constexpr typename _ParseContext::iterator > >- parse(_ParseContext& __pc) > >- { return _M_f._M_parse(__pc, __format::_Year); } > >+ constexpr typename basic_format_parse_context<_CharT>::iterator > >+ parse(basic_format_parse_context<_CharT>& __pc) > >+ { return _M_f._M_parse(__pc, __format::_Year); } > > > >- template > >- typename _FormatContext::iterator > >- format(const chrono::year& __t, _FormatContext& __fc) const > >+ template > >+ typename basic_format_context<_Out, _CharT>::iterator > >+ format(const chrono::year& __t, > >+ basic_format_context<_Out, _CharT>& __fc) const > > { return _M_f._M_format(__t, __fc); } > > > > private: > > __format::__formatter_chrono<_CharT> _M_f; > > }; > > > >- template > >+ template<__format::__char _CharT> > > struct formatter > > { > >- template > >- constexpr typename _ParseContext::iterator > >- parse(_ParseContext& __pc) > >- { return _M_f._M_parse(__pc, __format::_Weekday); } > >+ constexpr typename basic_format_parse_context<_CharT>::iterator > >+ parse(basic_format_parse_context<_CharT>& __pc) > >+ { return _M_f._M_parse(__pc, __format::_Weekday); } > > > >- template
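A rough C++17 sketch of the problem described in the patch above, not the libstdc++ code itself. It shows how an unconstrained member template can make a "can this be parsed?" check pass for a non-character type, and how tying parse to the concrete context type and to character types makes the check fail. Every name here (parse_context, is_char_like, loose_formatter, strict_formatter, can_parse) is a hypothetical stand-in; the real patch uses the __format::__char concept on the partial specializations.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical stand-in for std::basic_format_parse_context<CharT>.
template<typename CharT>
struct parse_context
{
  using iterator = const CharT*;
  constexpr iterator begin() const { return nullptr; }
};

// Hypothetical stand-in for __format::__char: only char and wchar_t qualify.
template<typename T>
inline constexpr bool is_char_like
  = std::is_same_v<T, char> || std::is_same_v<T, wchar_t>;

// "Before": any CharT is accepted and parse takes any context type.
template<typename CharT>
struct loose_formatter
{
  template<typename ParseContext>
  constexpr typename ParseContext::iterator
  parse(ParseContext& pc)
  { return pc.begin(); }
};

// "After": parse names the concrete context type ...
template<typename CharT>
struct parse_impl
{
  constexpr typename parse_context<CharT>::iterator
  parse(parse_context<CharT>& pc)
  { return pc.begin(); }
};

struct disabled { };

// ... and is only present when CharT is a character type.
template<typename CharT>
struct strict_formatter
  : std::conditional_t<is_char_like<CharT>, parse_impl<CharT>, disabled>
{ };

// Detection check, a loose analogue of one requirement of std::formattable:
// is f.parse(pc) well-formed for a parse context over CharT?
template<typename F, typename CharT, typename = void>
struct can_parse : std::false_type { };

template<typename F, typename CharT>
struct can_parse<F, CharT,
  std::void_t<decltype(std::declval<F&>().parse(
    std::declval<parse_context<CharT>&>()))>>
  : std::true_type { };

// The loose version wrongly claims to support int as a character type.
static_assert(can_parse<loose_formatter<int>, int>::value);
// The strict version rejects int but still accepts char.
static_assert(!can_parse<strict_formatter<int>, int>::value);
static_assert(can_parse<strict_formatter<char>, char>::value);
```

The sketch only models the parse side; the format side of the patch follows the same idea by naming basic_format_context<_Out, _CharT> instead of an arbitrary _FormatContext.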
Re: [PATCH] cobol: Fix up cobol/{charmaps,valconv}.cc rules
> On 29 Mar 2025, at 15:56, Jakub Jelinek wrote: > > On Sat, Mar 29, 2025 at 03:50:54PM +, Iain Sandoe wrote: >>> I'm not sure if sed -E is portable enough (sure, I know it is in POSIX, but >>> that is not enough). >>> How about just >>> sed -e '/^#include/s,"\([^"]*.h\)","../../libgcobol/\1",' $& > $@ >> >> This, unfortunately, works too well (with s/&/^) .. because it also >> processes #include “config.h” >> which then points to a non-existent file. I think we want to include config >> for both FE and >> library (so we cannot get around it by indenting the config.h include - well >> we could, but …) > > Neither libgcobol/charmaps.cc nor libgcobol/valconv.cc has config.h include. but it’s an approved patch (just waiting for the main config change to be reviewed) and needed for the other lib changes. I’ll investigate if we could find a way to drop it from those two files. Iain
Re: [PATCH] cobol: Fix up cobol/{charmaps,valconv}.cc rules
> On 29 Mar 2025, at 16:12, Jakub Jelinek wrote: > > On Sat, Mar 29, 2025 at 04:03:07PM +, Iain Sandoe wrote: >> >> >>> On 29 Mar 2025, at 15:56, Jakub Jelinek wrote: >>> >>> On Sat, Mar 29, 2025 at 03:50:54PM +, Iain Sandoe wrote: > I'm not sure if sed -E is portable enough (sure, I know it is in POSIX, > but > that is not enough). > How about just > sed -e '/^#include/s,"\([^"]*.h\)","../../libgcobol/\1",' $& > $@ This, unfortunately, works too well (with s/&/^) .. because it also processes #include “config.h” which then points to a non-existent file. I think we want to include config for both FE and library (so we cannot get around it by indenting the config.h include - well we could, but …) >>> >>> Neither libgcobol/charmaps.cc nor libgcobol/valconv.cc has config.h include. >> >> but it’s an approved patch (just waiting for the main config change to be >> reviewed) and needed >> for the other lib changes. I’ll investigate if we could find a way to drop >> it from those two files. > > config.h is the only header ending with g in there, so > sed -e '/^#include/s,"\([^"]*[^g"].h\)","../../libgcobol/\1",' $^ > $@ > then? Not necessary (at the moment), with the recent header shuffles I no longer need to include config.h in the shared sources. So your previous edition is fine (tested only on x86_64-darwin, so far). So, with current trunk + your previous (amended) patch - Darwin can now build the FE. Perhaps one might forecast needing more configure tests as other platforms start to pick this up. Tomorrow, I will post a summary of what’s needed to get the library building using libquadmath - most of the patches are actually approved - but I need to post the changes for the lib files. thanks Iain
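To make the difference between the two substitutions discussed in this thread concrete, here is a small sketch that replays them in C++. It is an illustration only: the sed basic regular expressions are hand-translated to ECMAScript syntax for std::regex (\( \) becomes ( ), \1 becomes $1), and the helper name rewrite and the constants catch_all and skip_config are made up for this example.

```cpp
#include <regex>
#include <string>

// Rewrite one source line roughly the way the sed recipes would
// (hypothetical helper, not part of any Makefile).
std::string rewrite(const std::string& line, const std::string& pattern)
{
  // Mirror sed's /^#include/ address: other lines pass through untouched.
  if (line.rfind("#include", 0) != 0)
    return line;
  // sed's s command without the g flag replaces only the first match.
  return std::regex_replace(line, std::regex(pattern),
                            "\"../../libgcobol/$1\"",
                            std::regex_constants::format_first_only);
}

// Translation of "\([^"]*.h\)": any quoted header name.
const std::string catch_all = R"re("([^"]*.h)")re";

// Translation of "\([^"]*[^g"].h\)": the character before ".h" must not
// be 'g', which leaves "config.h" alone.
const std::string skip_config = R"re("([^"]*[^g"].h)")re";
```

With these definitions, rewrite("#include \"config.h\"", catch_all) redirects config.h into ../../libgcobol/, which is the unwanted effect reported above, while the same call with skip_config returns the line unchanged. Lines such as #include <stdio.h> contain no double quotes and are left alone by both patterns.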
[committed] c++: Fix comment typo
Hi! Found a typo in a comment. Committed as obvious. 2025-03-29 Jakub Jelinek * name-lookup.cc (maybe_lazily_declare): Fix comment typo, anout -> about. --- gcc/cp/name-lookup.cc.jj2025-03-27 19:13:58.168298922 +0100 +++ gcc/cp/name-lookup.cc 2025-03-29 11:29:48.421292457 +0100 @@ -2012,8 +2012,8 @@ get_class_binding_direct (tree klass, tr static void maybe_lazily_declare (tree klass, tree name) { - /* See big comment anout module_state::write_pendings regarding adding a check - bit. */ + /* See big comment about module_state::write_pendings regarding adding + a check bit. */ if (modules_p ()) lazy_load_pendings (TYPE_NAME (klass)); Jakub
[PATCH] cobol, libgcobol: Currently libgcobol depends on libstdc++.
Tested on x86_64 linux, darwin, aarch64 linux, OK for trunk? thanks, Iain --- 8< --- We need to add libstdc++ to link lines even when the link is not '-static' since libgcobol depends on libstdc++. gcc/cobol/ChangeLog: * gcobolspec.cc (lang_specific_driver): Add libstdc++ for any link line. Signed-off-by: Iain Sandoe --- gcc/cobol/gcobolspec.cc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/cobol/gcobolspec.cc b/gcc/cobol/gcobolspec.cc index 7de41fd037a..364c14c8a70 100644 --- a/gcc/cobol/gcobolspec.cc +++ b/gcc/cobol/gcobolspec.cc @@ -600,7 +600,7 @@ lang_specific_driver (struct cl_decoded_option **in_decoded_options, { add_arg_lib(DL_LIBRARY, static_in_general); } - if( need_libstdc && static_in_general ) + if( need_libstdc ) { add_arg_lib(STDCPP_LIBRARY, static_in_general); } -- 2.39.2 (Apple Git-143)
Re: [PATCH 7/8] target/119010 - Zen4/Zen5 reservations for movlhps loads
> The following fixes up the ssemov2 type introduction, amending > the znver4_sse_mov_fp_load reservation. This fixes > > ;; 14--> b 0: i1436 xmm6=vec_concat(xmm6,[ax+0x8]) :nothing > > Bootstrapped and tested on x86_64-unknown-linux-gnu, OK? > > PR target/119010 > * config/i386/zn4zn5.md (znver4_sse_mov_fp_load, > znver5_sse_mov_fp_load): Also match ssemov2. OK, Thanks! Honza > --- > gcc/config/i386/zn4zn5.md | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/gcc/config/i386/zn4zn5.md b/gcc/config/i386/zn4zn5.md > index 1ac1d07c04b..ecb1e3bbedb 100644 > --- a/gcc/config/i386/zn4zn5.md > +++ b/gcc/config/i386/zn4zn5.md > @@ -1036,14 +1036,14 @@ > > (define_insn_reservation "znver4_sse_mov_fp_load" 6 >(and (eq_attr "cpu" "znver4") > - (and (eq_attr "type" "ssemov") > + (and (eq_attr "type" "ssemov,ssemov2") > (and (eq_attr "mode" > "V16SF,V8DF,V8SF,V4DF,V4SF,V2DF,V2SF,V1DF,DF,SF") > (eq_attr "memory" "load" >"znver4-direct,znver4-load,znver4-fpu") > > (define_insn_reservation "znver5_sse_mov_fp_load" 6 >(and (eq_attr "cpu" "znver5") > - (and (eq_attr "type" "ssemov") > + (and (eq_attr "type" "ssemov,ssemov2") > (and (eq_attr "mode" > "V16SF,V8DF,V8SF,V4DF,V4SF,V2DF,V2SF,V1DF,DF,SF") > (eq_attr "memory" "load" >"znver4-direct,znver5-load,znver4-fpu") > -- > 2.43.0 >
Re: [PATCH 2/8] target/119010 - missing reservations for Zen4/5 and SSE compares
> There's the znver4_sse_test reservation which matches the memory-less > SSE compares but currently requires prefix_extra == 1. The old > znver automata in this case sometimes uses znver1-double instead of > znver1-direct, but it's quite a maze. The following simply drops prefix_extra is used to determine instruction length (whether there is an extra byte for the prefix). I believe that the double versions are for the cases where zens split long vectors to halves, but I also find it somewhat confusing. > the prefix_extra requirement, but I have no idea what I'm doing here. > There doesn't seem to be any documentation on the scheduler relevant > attributes used, or at least I cannot find that. > > Bootstrapped and tested on x86_64-unknown-linux-gnu, OK? > > PR target/119010 > * config/i386/zn4zn5.md (znver4_sse_test): Drop test of > prefix_extra attribute. Yes, this is OK. While znver4 halves AVX512, this is still an improvement over what we have and I don't think it really matters to model the double issue precisely here. > --- > gcc/config/i386/zn4zn5.md | 5 ++--- > 1 file changed, 2 insertions(+), 3 deletions(-) > > diff --git a/gcc/config/i386/zn4zn5.md b/gcc/config/i386/zn4zn5.md > index fb856e9dc98..40e51456a46 100644 > --- a/gcc/config/i386/zn4zn5.md > +++ b/gcc/config/i386/zn4zn5.md > @@ -953,9 +953,8 @@ > > (define_insn_reservation "znver4_sse_test" 1 >(and (eq_attr "cpu" "znver4,znver5") > - (and (eq_attr "prefix_extra" "1") > -(and (eq_attr "type" "ssecomi") > - (eq_attr "memory" "none" > + (and (eq_attr "type" "ssecomi") > + (eq_attr "memory" "none"))) >"znver4-direct,znver4-fpu1|znver4-fpu2") > > (define_insn_reservation "znver4_sse_test_load" 6 > -- > 2.43.0 >
Re: [PATCH 3/8] target/119010 - add reservations for integer vector compares to zen4/zen5
> The following handles TI, OI and XI mode in the respective EVEX > compare reservations that do not use memory (I've not yet run into > ones with). The znver automata has separate reservations for > integer compares (but only for zen1, for zen2 and zen3 there are > no compare reservations at all), but I don't see why that should > be necessary here. > > Bootstrapped and tested on x86_64-unknown-linux-gnu, OK? > > PR target/119010 > * config/i386/zn4zn5.md (znver4_sse_cmp_avx128, > znver5_sse_cmp_avx128): Handle TImode. > (znver4_sse_cmp_avx256, znver5_sse_cmp_avx256): Handle OImode. > (znver4_sse_cmp_avx512, znver5_sse_cmp_avx512): Handle XImode. OK, thanks! Honza
[pushed] c++/modules: unexported friend template
Tested x86_64-pc-linux-gnu, applying to trunk. Do you agree with my choice of how to adjust duplicate_decls? -- 8< -- Here we were failing to match the injected friend declaration to the definition because the latter isn't exported. But the friend is attached to the module, so we need to look for any reachable declaration in that module, not just the exports. The duplicate_decls change is to avoid clobbering DECL_MODULE_IMPORT_P on the imported definition; matching an injected friend doesn't change that it's imported. I considered checking get_originating_module == 0 or !decl_defined_p instead of DECL_UNIQUE_FRIEND_P there, but I think this situation is specific to friends. I removed an assert because we have a test for the same condition a few lines above. gcc/cp/ChangeLog: * decl.cc (duplicate_decls): Don't clobber DECL_MODULE_IMPORT_P with an injected friend. * name-lookup.cc (check_module_override): Look at all reachable decls in decl's originating module. gcc/testsuite/ChangeLog: * g++.dg/modules/friend-9_a.C: New test. * g++.dg/modules/friend-9_b.C: New test. --- gcc/cp/decl.cc| 6 -- gcc/cp/name-lookup.cc | 19 ++- gcc/testsuite/g++.dg/modules/friend-9_a.C | 13 + gcc/testsuite/g++.dg/modules/friend-9_b.C | 13 + 4 files changed, 40 insertions(+), 11 deletions(-) create mode 100644 gcc/testsuite/g++.dg/modules/friend-9_a.C create mode 100644 gcc/testsuite/g++.dg/modules/friend-9_b.C diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc index a785d5e79cb..7d10b228ec6 100644 --- a/gcc/cp/decl.cc +++ b/gcc/cp/decl.cc @@ -2539,8 +2539,10 @@ duplicate_decls (tree newdecl, tree olddecl, bool hiding, bool was_hidden) } /* Propagate purviewness and importingness as with -set_instantiating_module. */ - if (modules_p () && DECL_LANG_SPECIFIC (new_result)) +set_instantiating_module, unless newdecl is a friend injection. 
*/ + if (modules_p () && DECL_LANG_SPECIFIC (new_result) + && !(TREE_CODE (new_result) == FUNCTION_DECL + && DECL_UNIQUE_FRIEND_P (new_result))) { if (DECL_MODULE_PURVIEW_P (new_result)) DECL_MODULE_PURVIEW_P (old_result) = true; diff --git a/gcc/cp/name-lookup.cc b/gcc/cp/name-lookup.cc index df033edafc7..7fadbccfe39 100644 --- a/gcc/cp/name-lookup.cc +++ b/gcc/cp/name-lookup.cc @@ -3777,6 +3777,10 @@ check_module_override (tree decl, tree mvec, bool hiding, any reachable declaration, so we should check for overriding here too. */ bool any_reachable = deduction_guide_p (decl); + /* DECL might have an originating module if it's an instantiation of a + friend; we want to look at all reachable decls in that module. */ + unsigned decl_mod = get_originating_module (decl); + if (BINDING_VECTOR_SLOTS_PER_CLUSTER == BINDING_SLOTS_FIXED) { cluster++; @@ -3789,18 +3793,15 @@ check_module_override (tree decl, tree mvec, bool hiding, /* Are we importing this module? */ if (cluster->indices[jx].span != 1) continue; - if (!cluster->indices[jx].base) + unsigned cluster_mod = cluster->indices[jx].base; + if (!cluster_mod) continue; - if (!any_reachable - && !bitmap_bit_p (imports, cluster->indices[jx].base)) + bool c_any_reachable = (any_reachable || cluster_mod == decl_mod); + if (!c_any_reachable && !bitmap_bit_p (imports, cluster_mod)) continue; /* Is it loaded? */ if (cluster->slots[jx].is_lazy ()) - { - gcc_assert (cluster->indices[jx].span == 1); - lazy_load_binding (cluster->indices[jx].base, - scope, name, &cluster->slots[jx]); - } + lazy_load_binding (cluster_mod, scope, name, &cluster->slots[jx]); tree bind = cluster->slots[jx]; if (!bind) /* Errors could cause there to be nothing. */ @@ -3812,7 +3813,7 @@ check_module_override (tree decl, tree mvec, bool hiding, /* If there was a matching STAT_TYPE here then xref_tag should have found it, but we need to check anyway because a conflicting using-declaration may exist. 
*/ - if (any_reachable) + if (c_any_reachable) { type = STAT_TYPE (bind); bind = STAT_DECL (bind); diff --git a/gcc/testsuite/g++.dg/modules/friend-9_a.C b/gcc/testsuite/g++.dg/modules/friend-9_a.C new file mode 100644 index 000..ca95027b470 --- /dev/null +++ b/gcc/testsuite/g++.dg/modules/friend-9_a.C @@ -0,0 +1,13 @@ +// { dg-additional-options -fmodules } +// { dg-module-cmi M } +// { dg-module-do link } + +export module M; + +export template struct A +{ + template friend void f (U); +}; + +template +void f(U u) { } diff --git a/gcc/testsuite/g++.dg/modules/friend-9_b.C b/gcc/testsuit
Re: [PATCH] ipa-sra: Don't change return type to void if there are musttail calls [PR119484]
> Hi! > > The following testcase is rejected, because IPA-SRA decides to > turn bar.constprop call into bar.constprop.isra which returns void. > While there is no explicit lhs on the call, as it is a musttail call > the tailc pass checks if IPA-VRP returns singleton from that function > and the function returns the same value and in that case it still turns > it into a tail call. This can't work with IPA-SRA changing it into > void returning function though. > > The following patch fixes this by forcing returning the original type > if there are musttail calls. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? > > 2025-03-27 Jakub Jelinek > > PR ipa/119484 > * ipa-sra.cc (isra_analyze_call): Don't set m_return_ignored if > gimple_call_must_tail_p even if it doesn't have lhs. > > * c-c++-common/pr119484.c: New test. OK Honza
Re: [PATCH 8/8] target/119010 - add mode attribute to *vmovv16si_constm1_pternlog_false_dep
> Like the other instances. This avoids > > ;; 1--> b 0: i6540 {xmm2=const_vector;unspec[xmm2] 38;}:nothing > > Bootstrapped and tested on x86_64-unknown-linux-gnu, OK? > > PR target/119010 > * config/i386/sse.md (*vmov_constm1_pternlog_false_dep): > Add mode attribute. OK Thanks, Honza
Re: [PATCH] cobol: Fix up cobol/{charmaps,valconv}.cc rules
On Sat, Mar 29, 2025 at 03:18:20PM +, Iain Sandoe wrote: > Hi Jakub > > Thanks for doing this... > > > On 28 Mar 2025, at 14:39, Jakub Jelinek wrote: > > > > +cobol/charmaps.cc cobol/valconv.cc: cobol/%.cc: $(LIB_SOURCE)/%.cc > > + -l='ec\|common-defs\|io\|gcobolio\|libgcobol\|gfileio\|charmaps'; \ > > + l=$$l'\|valconv\|exceptl'; \ > > + sed -e '/^#include/s,"\('$$l'\)\.h","../../libgcobol/\1.h",' $^ > $@ > > .. however, this does not work with the BSD sed on Darwin (although it does > with GNU sed on Darwin). > > The issue appears to be that alternation is an ERE addition from my reading > of > https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_03 > (the Darwin BRE sed works fine when there is only one match in the > sub-expression, > but fails as soon as any alternate is added). > > Darwin’s sed (and at least x86_64 / aarch64 Linux sed) work fine with the > updated patch > to use an ERE instead. Do you think this is an acceptable update? (it is > still POSIX sed). I'm not sure if sed -E is portable enough (sure, I know it is in POSIX, but that is not enough). How about just sed -e '/^#include/s,"\([^"]*.h\)","../../libgcobol/\1",' $& > $@ ? I mean, both charmaps.cc and valconv.cc use #include <> for system headers and #include "" for local headers and we want to adjust all of the latter and none of the former? > diff --git a/gcc/cobol/Make-lang.in b/gcc/cobol/Make-lang.in > index 990d51a8578..bbf1c9ef30e 100644 > --- a/gcc/cobol/Make-lang.in > +++ b/gcc/cobol/Make-lang.in > @@ -88,9 +88,8 @@ cobol1_OBJS =\ > # so that the .h files can be found. 
> > cobol/charmaps.cc cobol/valconv.cc: cobol/%.cc: $(LIB_SOURCE)/%.cc > - -l='ec\|common-defs\|io\|gcobolio\|gfileio\|charmaps'; \ > - l=$$l'\|valconv\|exceptl'; \ > - sed -e '/^#include/s,"\('$$l'\)\.h","../../libgcobol/\1.h",' $^ > $@ > + -l='ec|common-defs|io|gcobolio|gfileio|charmaps|valconv|exceptl'; \ > + sed -E -e '/^#include/s,"('$$l')\.h","../../libgcobol/\1.h",' $^ > $@ > > LIB_SOURCE_H=$(wildcard $(LIB_SOURCE)/*.h) > Jakub
[pushed] jit, Darwin: Update exports with ABI 28 through 34.
Tested on x86-64-darwin21, pushed to trunk, thanks, Iain --- 8< --- Synchronise the darwin export list with the current map. gcc/jit/ChangeLog: * libgccjit.exports: Add symbols for ABI 28 to 34. Signed-off-by: Iain Sandoe --- gcc/jit/libgccjit.exports | 21 + 1 file changed, 21 insertions(+) diff --git a/gcc/jit/libgccjit.exports b/gcc/jit/libgccjit.exports index e32bbe2fd40..26dc634a0e8 100644 --- a/gcc/jit/libgccjit.exports +++ b/gcc/jit/libgccjit.exports @@ -230,4 +230,25 @@ _gcc_jit_function_add_integer_array_attribute # LIBGCCJIT_ABI_27 _gcc_jit_context_new_sizeof +# LIBGCCJIT_ABI_28 +_gcc_jit_context_new_alignof + +# LIBGCCJIT_ABI_29 +_gcc_jit_global_set_readonly + +# LIBGCCJIT_ABI_30 +_gcc_jit_context_convert_vector + +# LIBGCCJIT_ABI_31 +_gcc_jit_context_new_vector_access +_gcc_jit_context_new_rvalue_vector_perm + +# LIBGCCJIT_ABI_32 +_gcc_jit_context_get_target_builtin_function + +# LIBGCCJIT_ABI_33 +_gcc_jit_function_new_temp + +# LIBGCCJIT_ABI_34 +_gcc_jit_context_set_output_ident -- 2.39.2 (Apple Git-143)
Re: [PATCH 1/8] target/119010 - fixup zn4zn5 reservation for move from const_vector
> movv8si_internal uses sselog1 and V4SFmode for an instruction like > > (insn 363 2437 371 97 (set (reg:V8SI 46 xmm10 [1125]) > (const_vector:V8SI [ > (const_int 0 [0]) repeated x8 > ])) "ComputeNonbondedUtil.C":185:21 2402 {movv8si_internal} > > this wasn't caught by the existing znver4_sse_log1 reservation, > I think the znver automaton catches this with the generic > > (define_insn_reservation "znver1_sse_log1" 1 > (and (eq_attr "cpu" "znver1,znver2,znver3") > (and (eq_attr "type" "sselog1") >(eq_attr "memory" "none"))) > "znver1-direct,znver1-fp1|znver1-fp2") > > which does not look at the mode at all. The zn4zn5 automaton lacks > this and instead has separated store and load-store reservations > in odd ways. The following renames the store one and introduces > a none variant. > > Bootstrapped and tested on x86_64-unknown-linux-gnu, OK? > > PR target/119010 > * config/i386/zn4zn5.md (znver4_sse_log1): Rename to > znver4_sse_log1_store. > (znver5_sse_log1): Rename to znver5_sse_log1_store. > (znver4_sse_log1): New memory-less variant. This is OK. An alternative would be to place more specific avx512 patterns first and avoid the long list of modes later. But I suppose the list is not going to get longer in future. 
Honza > --- > gcc/config/i386/zn4zn5.md | 9 - > 1 file changed, 8 insertions(+), 1 deletion(-) > > diff --git a/gcc/config/i386/zn4zn5.md b/gcc/config/i386/zn4zn5.md > index 75e31020215..fb856e9dc98 100644 > --- a/gcc/config/i386/zn4zn5.md > +++ b/gcc/config/i386/zn4zn5.md > @@ -893,13 +893,20 @@ >"znver4-direct,znver5-load,znver4-fpu") > > (define_insn_reservation "znver4_sse_log1" 1 > + (and (eq_attr "cpu" "znver4,znver5") > + (and (eq_attr "type" "sselog1") > +(and (eq_attr "mode" > "V4SF,V8SF,V2DF,V4DF,QI,HI,SI,DI,TI,OI") > + (eq_attr "memory" "none" > + "znver4-direct,znver4-fpu1|znver4-fpu2") > + > +(define_insn_reservation "znver4_sse_log1_store" 1 >(and (eq_attr "cpu" "znver4") > (and (eq_attr "type" "sselog1") > (and (eq_attr "mode" > "V4SF,V8SF,V2DF,V4DF,QI,HI,SI,DI,TI,OI") > (eq_attr "memory" "store" > > "znver4-direct,znver4-fpu1|znver4-fpu2,znver4-fp-store") > > -(define_insn_reservation "znver5_sse_log1" 1 > +(define_insn_reservation "znver5_sse_log1_store" 1 >(and (eq_attr "cpu" "znver5") > (and (eq_attr "type" "sselog1") > (and (eq_attr "mode" > "V4SF,V8SF,V2DF,V4DF,QI,HI,SI,DI,TI,OI") > -- > 2.43.0 >
Re: [PATCH 5/8] target/119010 - fixup Zen4/Zen5 fp<->int convert reservations
> They were using ssecvt instead of sseicvt, I've also added handling > for sseicvt2 which was introduced without fixing up automata, and > the relevant instruction uses DFmode. IMO this is a quite messy > area that could need TLC in the machine description itself. > > Bootstrapped and tested on x86_64-unknown-linux-gnu, OK? > > PR target/119010 > * config/i386/zn4zn5.md (znver4_sse_icvt): Use sseicvt. > (znver4_sse_icvt_store): Likewise. > (znver5_sse_icvt_store): Likewise. > (znver4_sse_icvt2): New. OK Thanks, Honza > --- > gcc/config/i386/zn4zn5.md | 13 ++--- > 1 file changed, 10 insertions(+), 3 deletions(-) > > diff --git a/gcc/config/i386/zn4zn5.md b/gcc/config/i386/zn4zn5.md > index e89d0f49ec8..6720fda1705 100644 > --- a/gcc/config/i386/zn4zn5.md > +++ b/gcc/config/i386/zn4zn5.md > @@ -1263,21 +1263,28 @@ > > (define_insn_reservation "znver4_sse_icvt" 3 >(and (eq_attr "cpu" "znver4,znver5") > - (and (eq_attr "type" "ssecvt") > + (and (eq_attr "type" "sseicvt") > (and (eq_attr "mode" "SI") > (eq_attr "memory" "none" >"znver4-direct,znver4-fpu2|znver4-fpu3") > > +(define_insn_reservation "znver4_sse_icvt2" 3 > + (and (eq_attr "cpu" "znver4,znver5") > + (and (eq_attr "type" "sseicvt2") > +(and (eq_attr "mode" "DF") > + (eq_attr "memory" "none" > + "znver4-direct,znver4-fpu2|znver4-fpu3") > + > (define_insn_reservation "znver4_sse_icvt_store" 4 >(and (eq_attr "cpu" "znver4") > - (and (eq_attr "type" "ssecvt") > + (and (eq_attr "type" "sseicvt") > (and (eq_attr "mode" "SI") > (eq_attr "memory" "store" > > "znver4-double,znver4-fpu2|znver4-fpu3,znver4-fp-store") > > (define_insn_reservation "znver5_sse_icvt_store" 4 >(and (eq_attr "cpu" "znver5") > - (and (eq_attr "type" "ssecvt") > + (and (eq_attr "type" "sseicvt") > (and (eq_attr "mode" "SI") > (eq_attr "memory" "store" > > "znver4-double,znver4-fpu2|znver4-fpu3,znver5-fp-store256") > -- > 2.43.0 >
Re: [PATCH 4/8] target/119010 - handle DFmode in SSE divide reservations for Zen4/Zen5
> Like the other DFmode cases. > > Bootstrapped and tested on x86_64-unknown-linux-gnu, OK? > > PR target/119010 > * config/i386/zn4zn5.md (znver4_sse_div_pd, > znver4_sse_div_pd_load, znver5_sse_div_pd_load): Handle DFmode. OK, thanks! Honza > --- > gcc/config/i386/zn4zn5.md | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/gcc/config/i386/zn4zn5.md b/gcc/config/i386/zn4zn5.md > index c7ced5411f0..e89d0f49ec8 100644 > --- a/gcc/config/i386/zn4zn5.md > +++ b/gcc/config/i386/zn4zn5.md > @@ -1156,7 +1156,7 @@ > (define_insn_reservation "znver4_sse_div_pd" 13 >(and (eq_attr "cpu" "znver4,znver5") > (and (eq_attr "type" "ssediv") > -(and (eq_attr "mode" "V4DF,V2DF,V1DF") > +(and (eq_attr "mode" "V4DF,V2DF,V1DF,DF") > (eq_attr "memory" "none" >"znver4-direct,znver4-fdiv*5") > > @@ -1170,14 +1170,14 @@ > (define_insn_reservation "znver4_sse_div_pd_load" 18 >(and (eq_attr "cpu" "znver4") > (and (eq_attr "type" "ssediv") > -(and (eq_attr "mode" "V4DF,V2DF,V1DF") > +(and (eq_attr "mode" "V4DF,V2DF,V1DF,DF") > (eq_attr "memory" "load" >"znver4-direct,znver4-load,znver4-fdiv*5") > > (define_insn_reservation "znver5_sse_div_pd_load" 18 >(and (eq_attr "cpu" "znver5") > (and (eq_attr "type" "ssediv") > -(and (eq_attr "mode" "V4DF,V2DF,V1DF") > +(and (eq_attr "mode" "V4DF,V2DF,V1DF,DF") > (eq_attr "memory" "load" >"znver4-direct,znver5-load,znver4-fdiv*5") > > -- > 2.43.0 >
Re: [PATCH] cobol: Fix up cobol/{charmaps,valconv}.cc rules
On Sat, Mar 29, 2025 at 03:50:54PM +, Iain Sandoe wrote: > > I'm not sure if sed -E is portable enough (sure, I know it is in POSIX, but > > that is not enough). > > How about just > > sed -e '/^#include/s,"\([^"]*.h\)","../../libgcobol/\1",' $& > $@ > > This, unfortunately, works too well (with s/&/^) .. because it also processes > #include “config.h” > which then points to a non-existent file. I think we want to include config > for both FE and > library (so we cannot get around it by indenting the config.h include - well > we could, but …) Neither libgcobol/charmaps.cc nor libgcobol/valconv.cc has config.h include. Jakub
Re: [PATCH] cobol: Fix up cobol/{charmaps,valconv}.cc rules
> On 29 Mar 2025, at 15:28, Jakub Jelinek wrote: > > On Sat, Mar 29, 2025 at 03:18:20PM +, Iain Sandoe wrote: >> Hi Jakub >> >> Thanks for doing this... >> >>> On 28 Mar 2025, at 14:39, Jakub Jelinek wrote: >>> >>> +cobol/charmaps.cc cobol/valconv.cc: cobol/%.cc: $(LIB_SOURCE)/%.cc >>> + -l='ec\|common-defs\|io\|gcobolio\|libgcobol\|gfileio\|charmaps'; \ >>> + l=$$l'\|valconv\|exceptl'; \ >>> + sed -e '/^#include/s,"\('$$l'\)\.h","../../libgcobol/\1.h",' $^ > $@ >> >> .. however, this does not work with the BSD sed on Darwin (although it does >> with GNU sed on Darwin). >> >> The issue appears to be that alternation is an ERE addition from my reading >> of >> https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_03 >> (the Darwin BRE sed works fine when there is only one match in the >> sub-expression, >> but fails as soon as any alternate is added). >> >> Darwin’s sed (and at least x86_64 / aarch64 Linux sed) work fine with the >> updated patch >> to use an ERE instead. Do you think this is an acceptable update? (it is >> still POSIX sed). > > I'm not sure if sed -E is portable enough (sure, I know it is in POSIX, but > that is not enough). > How about just > sed -e '/^#include/s,"\([^"]*.h\)","../../libgcobol/\1",' $& > $@ This, unfortunately, works too well (with s/&/^) .. because it also processes #include “config.h” which then points to a non-existent file. I think we want to include config for both FE and library (so we cannot get around it by indenting the config.h include - well we could, but …) trying to think of variations on this theme. Iain > ? > I mean, both charmaps.cc and valconv.cc use #include <> for system headers > and #include "" for local headers and we want to adjust all of the latter > and none of the former? 
> >> diff --git a/gcc/cobol/Make-lang.in b/gcc/cobol/Make-lang.in >> index 990d51a8578..bbf1c9ef30e 100644 >> --- a/gcc/cobol/Make-lang.in >> +++ b/gcc/cobol/Make-lang.in >> @@ -88,9 +88,8 @@ cobol1_OBJS =\ >> # so that the .h files can be found. >> >> cobol/charmaps.cc cobol/valconv.cc: cobol/%.cc: $(LIB_SOURCE)/%.cc >> - -l='ec\|common-defs\|io\|gcobolio\|gfileio\|charmaps'; \ >> - l=$$l'\|valconv\|exceptl'; \ >> - sed -e '/^#include/s,"\('$$l'\)\.h","../../libgcobol/\1.h",' $^ > $@ >> + -l='ec|common-defs|io|gcobolio|gfileio|charmaps|valconv|exceptl'; \ >> + sed -E -e '/^#include/s,"('$$l')\.h","../../libgcobol/\1.h",' $^ > $@ >> >> LIB_SOURCE_H=$(wildcard $(LIB_SOURCE)/*.h) >> > > Jakub >
Re: [PATCH v3 05/19] Update is_function_default_version to work with target_version.
> On 27 Mar 2025, at 16:45, Alfie Richards wrote: > > > Notably this respects target_version semantics where an unannotated > function can be the default version. > > gcc/ChangeLog: > > * attribs.cc (is_function_default_version): Add target_version logic. > --- > gcc/attribs.cc | 27 --- > 1 file changed, 20 insertions(+), 7 deletions(-) > > Should we have this for GCC 15?
[PATCH] gcc/mingw: Align `.refptr.` to 8-byte boundaries for 64-bit targets
This is a minor change, bootstrapped on x86_64-w64-mingw32. -- Best regards, LIU Hao From 83c3e90432f9ebc97785d81be7a94066d9923920 Mon Sep 17 00:00:00 2001 From: LIU Hao Date: Sat, 29 Mar 2025 22:47:54 +0800 Subject: [PATCH] gcc/mingw: Align `.refptr.` to 8-byte boundaries for 64-bit targets Windows only requires sections to be aligned on a 4-byte boundary. This used to work because in binutils the `.rdata` section is over-aligned to a 16-byte boundary, which will be fixed in the future. This matches the output of Clang. Signed-off-by: LIU Hao gcc/ChangeLog: * config/mingw/winnt.cc (mingw_pe_file_end): Add `.p2align`. --- gcc/config/mingw/winnt.cc | 1 + 1 file changed, 1 insertion(+) diff --git a/gcc/config/mingw/winnt.cc b/gcc/config/mingw/winnt.cc index ed80fdac21b4..f22496615eda 100644 --- a/gcc/config/mingw/winnt.cc +++ b/gcc/config/mingw/winnt.cc @@ -833,6 +833,7 @@ mingw_pe_file_end (void) } fprintf (asm_out_file, "\t.section\t.rdata$%s, \"dr\"\n" + "\t.p2align\t3, 0\n" "\t.globl\t%s\n" "\t.linkonce\tdiscard\n", oname, oname); fprintf (asm_out_file, "%s:\n\t.quad\t%s\n", oname, name); -- 2.49.0
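As a small worked illustration of the directive added by this patch: `.p2align 3, 0` asks the assembler to pad with zero bytes up to the next multiple of 2^3 = 8, which matches the size of the `.quad` that follows. The arithmetic can be sketched as below; the function names p2align_bytes and align_up are hypothetical and exist only for this example.

```cpp
#include <cstdint>

// Bytes of alignment requested by ".p2align n": 2 to the power n.
constexpr std::uint64_t p2align_bytes(unsigned n)
{ return std::uint64_t{1} << n; }

// Round an offset up to the next multiple of a power-of-two alignment.
constexpr std::uint64_t align_up(std::uint64_t offset, std::uint64_t align)
{ return (offset + align - 1) & ~(align - 1); }

// .p2align 3 means 8-byte alignment, .p2align 2 means 4-byte alignment.
static_assert(p2align_bytes(3) == 8);
static_assert(p2align_bytes(2) == 4);

// An offset that is only 4-byte aligned is moved to the next 8-byte
// boundary; one that is already 8-byte aligned stays where it is.
static_assert(align_up(12, p2align_bytes(3)) == 16);
static_assert(align_up(16, p2align_bytes(3)) == 16);
```

The offset 12 case corresponds to the situation the commit message worries about: a section placed on a 4-byte boundary that would leave the 8-byte pointer misaligned without the explicit directive.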
[ping] [PATCH] includes, Darwin: Handle modular use for macOS SDKs [PR116827].
C++ modules are not really usable on latest Darwin without resolving this, thanks Iain > On 23 Mar 2025, at 12:29, Iain Sandoe wrote: > > From: Iain Sandoe > > Tested on x86_64/aarch64 Darwin and x86_64-linux, > OK for trunk? > backports to branches supporting modules? > thanks > Iain > > --- 8< --- > > Recent changes to the OS SDKs have altered the way in which include guards > are used for a number of headers when C++ modules are enabled. Instead of > placing the guards in the included header, they are being placed in the > including header. This breaks the assumptions in the current GCC stddef.h > specifically, that the presence of __PTRDIFF_T and __SIZE_T means that the > relevant defs are already made. However in the case of the module-enabled > C++ with these SDKs, that is no longer true. > > stddef.h has a large body of special-cases already, but it seems that the > only viable solution here is to add a new one specifically for __APPLE__ > and modular code. > > This fixes around 280 new fails in the modules test-suite; it is needed on > all open branches that support modules. > > PR target/116827 > > gcc/ChangeLog: > > * ginclude/stddef.h: Undefine __PTRDIFF_T and __SIZE_T for module- > enabled c++ on Darwin/macOS platforms. > > Signed-off-by: Iain Sandoe > --- > gcc/ginclude/stddef.h | 11 +++ > 1 file changed, 11 insertions(+) > > diff --git a/gcc/ginclude/stddef.h b/gcc/ginclude/stddef.h > index 0d53103ce20..bf9c6e609dc 100644 > --- a/gcc/ginclude/stddef.h > +++ b/gcc/ginclude/stddef.h > @@ -89,6 +89,17 @@ see the files COPYING3 and COPYING.RUNTIME respectively. > If not, see > #undef _PTRDIFF_T_ > #endif > > +#if defined (__APPLE__) > +# if defined(__has_feature) && __has_feature(modules) > +# if defined (__need_ptrdiff_t) > +# undef __PTRDIFF_T > +# endif > +# if defined (__need_size_t) > +# undef __SIZE_T > +# endif > +# endif > +#endif > + > /* On VxWorks, may have defined macros like >_TYPE_size_t which will typedef size_t. 
fixincludes patched the >vxTypesBase.h so that this macro is only defined if _GCC_SIZE_T is > -- > 2.39.2 (Apple Git-143) >
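The failure mode described in the quoted patch can be reproduced in miniature. Everything below is a made-up model: `DEMO_SIZE_T`, `demo_need_size_t` and `demo_size_t` are hypothetical names standing in for `__SIZE_T`, `__need_size_t` and `size_t`, and the three steps are collapsed into one file instead of separate headers.

```cpp
// Step 1: the *including* header sets the guard itself (as the newer SDKs
// are described as doing in module-enabled builds) and requests the type.
#define DEMO_SIZE_T
#define demo_need_size_t

// Step 2: the workaround, modelled on the patch: an explicit request for
// the type discards a guard that somebody else has already set.
#if defined(demo_need_size_t)
# undef DEMO_SIZE_T
#endif

// Step 3: the included header's traditional logic, which assumes that a
// defined guard means the typedef already exists.
#ifndef DEMO_SIZE_T
# define DEMO_SIZE_T
typedef unsigned long demo_size_t;
#endif

// This compiles only because step 2 removed the stale guard.
demo_size_t demo_count = 3;
```

Deleting step 2 makes step 3 skip the typedef, and the last line then fails with an unknown type name, which is the kind of breakage the patch addresses for `__PTRDIFF_T` and `__SIZE_T`.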
Re: [gcc-wwwdocs PATCH] gcc-14/15: Mention recent change for Intel x86_64
On Mon, 24 Mar 2025, Haochen Jiang wrote: > Mention AVX10.1 option changes, revise AVX10.2 option and mention > APX_F new feature in GCC 15. > --- >New ISA extension support for Intel AVX10.1 was added. > - AVX10.1 intrinsics are available via the -mavx10.1 or > - -mavx10.1-256 compiler switch with 256-bit vector size > - support. 512-bit vector size support for AVX10.1 intrinsics are > - available via the -mavx10.1-512 compiler switch. > + AVX10.1 intrinsics are available via the -mavx10.1-256 > + compiler switch with 256-bit vector size support. 512-bit vector size > + support for AVX10.1 intrinsics are available via the > + -mavx10.1-512 compiler switch. -mavx10.1 > + enables AVX10.1 intrinsics with 256-bit vector size support in GCC 14.1 I suggest to just use "256-bit vector support", dropping the word "size", and similar for the 512-bit case. > + and GCC 14.2. Since GCC 14.3, it enables AVX10.1 intrinsics with > 512-bit > + vector size support. Since GCC 14.3, using -mavx10.1 will > + emit a warning due to this behavior change. How about streamlining this to "Since GCC 14.3, it enables AVX10.1 intrinsics with 512-bit vector support (and emits a warning due to this behavior change)." ? > + compiler switch. MOVRS vector intrinsics are available via > + the -mmovrs -mavx10.2 compiler switch. Technically this is not one switch, so "switches"? > +AMX-FP8, AMX-MOVRS, AMX-TF32, AMX-TRANSPOSE, APX_F, AVX10.2, AVX-IFMA, > +AVX-NE-CONVERT, AVX-VNNI-INT16, AVX-VNNI-INT8, CMPccXADD, MOVRS, SHA512, > +SM3, SM4 and USER_MSR ISA extensions. We usually go for an Oxford comma, so "...SM4, and USER_MSR...". > + -mavx10.1-256, -mavx10.1-512 and > + -mevex512 are marked as deprecated. Meanwhile, How about just "... are deprecated"? > + -mavx10.1 enables AVX10.1 intrinsics with 512-bit > + vector size support, while in GCC 14.1 and GCC 14.2, it only enables > + 256-bit vector size support. GCC will emit a warning when using these > + compiler switches. 
-mavx10.1-256, > -mavx10.1-512 > + and -mevex512 will be removed in GCC 16, while the warning > + for the behavior change on -mavx10.1 will also be removed. How about "...in GCC 16 together with the warning..." or "in GCC 16, as will the warning..."? This is fine with these (or similar) changes. Thank you, Gerald
Re: [PATCH] libstdc++: Constrain formatters for chrono types [PR119517]
On Sat, 29 Mar 2025 at 18:16, Tomasz Kaminski wrote: > > > > On Fri, Mar 28, 2025 at 9:33 PM Jonathan Wakely wrote: >> >> On 28/03/25 16:31 +0100, Tomasz Kamiński wrote: >> >The formatters for chrono types defined the parse/format methods >> >as accepting unconstrained types, this in combination with lack >> >of constraint on _CharT led to them falsely satisfying formattable >> >requirements for any type used as character. >> > >> >This patch adjusts the formatter::parse signature to: >> > constexpr typename basic_format_parse_context<_CharT>::iterator >> > parse(basic_format_parse_context<_CharT>& __pc); >> >And formatter::format to: >> > template<typename _Out> >> > typename basic_format_context<_Out, _CharT>::iterator >> > format(const T& __t, >> > basic_format_context<_Out, _CharT>& __fc) const; >> > >> >Furthermore we constrain _CharT with __format::__char (char or wchar_t). >> > >> > PR libstdc++/119517 >> > >> >libstdc++-v3/ChangeLog: >> > >> > * include/bits/chrono_io.h (formatter): >> > Add __format::__char for _CharT and adjust parse and format >> > method signatures. >> > * testsuite/std/time/format/pr119517.cc: New test. >> >--- >> >Testing on x86_64-linux, std/time/format tests passed. >> >OK for trunk? 
>> > >> > libstdc++-v3/include/bits/chrono_io.h | 448 +- >> > .../testsuite/std/time/format/pr119517.cc | 44 ++ >> > 2 files changed, 262 insertions(+), 230 deletions(-) >> > create mode 100644 libstdc++-v3/testsuite/std/time/format/pr119517.cc >> > >> >diff --git a/libstdc++-v3/include/bits/chrono_io.h >> >b/libstdc++-v3/include/bits/chrono_io.h >> >index c55b651d049..3a5bc5695fb 100644 >> >--- a/libstdc++-v3/include/bits/chrono_io.h >> >+++ b/libstdc++-v3/include/bits/chrono_io.h >> >@@ -1785,277 +1785,272 @@ namespace __format >> > __format::__formatter_chrono<_CharT> _M_f; >> > }; >> > >> >- template >> >+ template<__format::__char _CharT> >> > struct formatter >> > { >> >- template >> >- constexpr typename _ParseContext::iterator >> >- parse(_ParseContext& __pc) >> >- { return _M_f._M_parse(__pc, __format::_Day); } >> >+ constexpr typename basic_format_parse_context<_CharT>::iterator >> >+ parse(basic_format_parse_context<_CharT>& __pc) >> >+ { return _M_f._M_parse(__pc, __format::_Day); } >> > >> >- template >> >- typename _FormatContext::iterator >> >- format(const chrono::day& __t, _FormatContext& __fc) const >> >+ template >> >+ typename basic_format_context<_Out, _CharT>::iterator >> >+ format(const chrono::day& __t, >> >+ basic_format_context<_Out, _CharT>& __fc) const >> > { return _M_f._M_format(__t, __fc); } >> > >> > private: >> > __format::__formatter_chrono<_CharT> _M_f; >> > }; >> > >> >- template >> >+ template<__format::__char _CharT> >> > struct formatter >> > { >> >- template >> >- constexpr typename _ParseContext::iterator >> >- parse(_ParseContext& __pc) >> >- { return _M_f._M_parse(__pc, __format::_Month); } >> >+ constexpr typename basic_format_parse_context<_CharT>::iterator >> >+ parse(basic_format_parse_context<_CharT>& __pc) >> >+ { return _M_f._M_parse(__pc, __format::_Month); } >> > >> >- template >> >- typename _FormatContext::iterator >> >- format(const chrono::month& __t, _FormatContext& __fc) const >> >+ template >> >+ typename 
basic_format_context<_Out, _CharT>::iterator >> >+ format(const chrono::month& __t, >> >+ basic_format_context<_Out, _CharT>& __fc) const >> > { return _M_f._M_format(__t, __fc); } >> > >> > private: >> > __format::__formatter_chrono<_CharT> _M_f; >> > }; >> > >> >- template >> >+ template<__format::__char _CharT> >> > struct formatter >> > { >> >- template >> >- constexpr typename _ParseContext::iterator >> >- parse(_ParseContext& __pc) >> >- { return _M_f._M_parse(__pc, __format::_Year); } >> >+ constexpr typename basic_format_parse_context<_CharT>::iterator >> >+ parse(basic_format_parse_context<_CharT>& __pc) >> >+ { return _M_f._M_parse(__pc, __format::_Year); } >> > >> >- template >> >- typename _FormatContext::iterator >> >- format(const chrono::year& __t, _FormatContext& __fc) const >> >+ template >> >+ typename basic_format_context<_Out, _CharT>::iterator >> >+ format(const chrono::year& __t, >> >+ basic_format_context<_Out, _CharT>& __fc) const >> > { return _M_f._M_format(__t, __fc); } >> > >> > private: >> > __format::__formatter_chrono<_CharT> _M_f; >> > }; >> > >> >- template >> >+ template<__format::__char _CharT> >> > struct formatter >> > { >> >- template >> >- constexpr typename _ParseContext::iterator >> >- parse(_ParseContext& __pc) >> >- { return _M_f._M_parse(__pc, __format::_Weekday); } >> >+ constexpr type
Re: [PATCH v3 14/19] Add reject_target_clone hook and filter target_clone versions.
Hi Alfie, It appears that you've duplicated patch 14/19. The only difference between them is the title, which replaces "and" with "in order to". I think the latter version is what you intended. Thanks, Yangyu Chen
Re: [PATCH 6/8] target/119010 - reservations for Zen4/Zen5 movhlps to memory
> The following adds missing reservations for the store variant of > sselog reservations covering > > ;; 112--> b 0: i1499 [dx-0x10]=vec_select(xmm10,parallel):nothing > > Bootstrapped and tested on x86_64-unknown-linux-gnu, OK? > > PR target/119010 > * config/i386/zn4zn5.md (znver4_sse_log_evex_store, > znver5_sse_log_evex_store): New reservations. OK, Thanks! Honza > --- > gcc/config/i386/zn4zn5.md | 14 ++ > 1 file changed, 14 insertions(+) > > diff --git a/gcc/config/i386/zn4zn5.md b/gcc/config/i386/zn4zn5.md > index 6720fda1705..1ac1d07c04b 100644 > --- a/gcc/config/i386/zn4zn5.md > +++ b/gcc/config/i386/zn4zn5.md > @@ -1367,6 +1367,20 @@ > (eq_attr "memory" "load" > > "znver4-direct,znver5-load,znver4-fpu0|znver4-fpu1|znver4-fpu2|znver4-fpu3") > > +(define_insn_reservation "znver4_sse_log_evex_store" 1 > + (and (eq_attr "cpu" "znver4") > + (and (eq_attr "type" "sselog") > +(and (eq_attr "mode" "V16SF,V8DF,XI") > + (eq_attr "memory" "store" > + > "znver4-direct,znver4-store,znver4-fpu0*2|znver4-fpu1*2|znver4-fpu2*2|znver4-fpu3*2") > + > +(define_insn_reservation "znver5_sse_log_evex_store" 1 > + (and (eq_attr "cpu" "znver5") > + (and (eq_attr "type" "sselog") > +(and (eq_attr "mode" "V16SF,V8DF,XI") > + (eq_attr "memory" "store" > + > "znver4-direct,znver5-store,znver4-fpu0|znver4-fpu1|znver4-fpu2|znver4-fpu3") > + > (define_insn_reservation "znver4_sse_log1_evex" 1 >(and (eq_attr "cpu" "znver4") > (and (eq_attr "type" "sselog1") > -- > 2.43.0 >
[PATCH] testsuite: arm: fixup more dg-final syntax
... as Richard E mentioned on the ML. Followup to r15-8956-ge90d6c2639c392. gcc/testsuite/ChangeLog: * gcc.target/arm/short-vfp-1.c: Add whitespace around brace. --- Pushed. gcc/testsuite/gcc.target/arm/short-vfp-1.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/gcc/testsuite/gcc.target/arm/short-vfp-1.c b/gcc/testsuite/gcc.target/arm/short-vfp-1.c index ddab09a4b7fb..18d38a580377 100644 --- a/gcc/testsuite/gcc.target/arm/short-vfp-1.c +++ b/gcc/testsuite/gcc.target/arm/short-vfp-1.c @@ -38,8 +38,8 @@ test_sihi (short x) return (int)x; } -/* { dg-final { scan-assembler-times {vcvt\.s32\.f32\ts[0-9]+,s[0-9]+} 2 }} */ -/* { dg-final { scan-assembler-times {vcvt\.f32\.s32\ts[0-9]+,s[0-9]+} 2 }} */ -/* { dg-final { scan-assembler-times {vmov\tr[0-9]+,s[0-9]+} 2 }} */ -/* { dg-final { scan-assembler-times {vmov\ts[0-9]+,r[0-9]+} 2 }} */ -/* { dg-final { scan-assembler-times {sxth\tr[0-9]+,r[0-9]+} 2 }} */ +/* { dg-final { scan-assembler-times {vcvt\.s32\.f32\ts[0-9]+,s[0-9]+} 2 } } */ +/* { dg-final { scan-assembler-times {vcvt\.f32\.s32\ts[0-9]+,s[0-9]+} 2 } } */ +/* { dg-final { scan-assembler-times {vmov\tr[0-9]+,s[0-9]+} 2 } } */ +/* { dg-final { scan-assembler-times {vmov\ts[0-9]+,r[0-9]+} 2 } } */ +/* { dg-final { scan-assembler-times {sxth\tr[0-9]+,r[0-9]+} 2 } } */ base-commit: eb26b667518c951d06f3c51118a1d41dcdda8b99 -- 2.49.0