Re: [PATCH] Fortran: reject SAVE of a COMMON in a BLOCK construct [PR119199]
Hi Harald, Regtested on x86_64-pc-linux-gnu. OK for mainline? Looks good to me. Thanks for the patch! Best regards Thomas
Re: [RFC] [C]New syntax for the argument of counted_by attribute for C language
Am Montag, dem 10.03.2025 um 19:30 -0400 schrieb John McCall: > On 10 Mar 2025, at 18:30, Martin Uecker wrote: > > Am Montag, dem 10.03.2025 um 16:45 -0400 schrieb John McCall: > > > > .. > > > > > > > > > > > > While the next example is also ok in C++. > > > > > > > > > > > > constexpr int n = 2; > > > > > > > > > > > > struct foo { > > > > > > char buf[n]; > > > > > > }; > > > > > > > > > > > > With both declarations of 'n' the example has UB in C++. > > > > > > So I am not convinced the proposed rules make a lot > > > > > > of sense for C++ either. > > > > > > > > If C required a diagnostic in your first example, it would actually > > > > put a fair amount of pressure on the C++ committee to get rid of > > > > this spurious UB rule. > > > > Why would C want a diagnostic here? > > When I said “your first example”, Martin, I did actually mean your > first example: Sorry, I meant "there". No reason to be condescending though. > But I think it’s clear that you and I just differ on some basic design > philosophy, so let’s just end the conversation here. I think the issue that if one does not agree with the design decisions made previously for the name lookup rules in the C and C++ languages and wants to change those (or adding new inconsistent ones), then this is not simply a question of language design preferences. Martin > > John. > > > > > > > I still think one could use designator syntax, i.e. '.n', which > > > > > > would be clearer and intuitive for both C and C++ programmers. > > > > > > > > This doesn’t really solve the ambiguity problem. If n is a field name, > > > > a programmer who writes __counted_by(n) almost certainly means to name > > > > the field. “The proper syntax is .n” is the cause of the bug, not its > > > > solution. > > > > Field names in C are in a different namespace. So far, when you write > > 'n' this *never* refers to field member in any context. And I have > > never seen anybody request a warning for the examples above. So, no, > > "a programmer almost certainly means this" can not possible be true. > > > > Martin
Re: [RFC] [C]New syntax for the argument of counted_by attribute for C language
> On Mar 10, 2025, at 11:04 PM, Martin Uecker wrote: > > Am Montag, dem 10.03.2025 um 19:30 -0400 schrieb John McCall: >> On 10 Mar 2025, at 18:30, Martin Uecker wrote: >>> Am Montag, dem 10.03.2025 um 16:45 -0400 schrieb John McCall: > > > .. > >>> > >>> While the next example is also ok in C++. >>> >>> constexpr int n = 2; >>> >>> struct foo { >>> char buf[n]; >>> }; >>> >>> With both declarations of 'n' the example has UB in C++. >>> So I am not convinced the proposed rules make a lot >>> of sense for C++ either. > > If C required a diagnostic in your first example, it would actually > put a fair amount of pressure on the C++ committee to get rid of > this spurious UB rule. >>> >>> Why would C want a diagnostic here? >> >> When I said “your first example”, Martin, I did actually mean your >> first example: > > Sorry, I meant "there". No reason to be condescending though. > >> But I think it’s clear that you and I just differ on some basic design >> philosophy, so let’s just end the conversation here. > > I think the issue that if one does not agree with the > design decisions made previously for the name lookup rules > in the C and C++ languages and wants to change those (or > adding new inconsistent ones), then this is not simply > a question of language design preferences. > > Martin > >> >> John. >> >>> I still think one could use designator syntax, i.e. '.n', which >>> would be clearer and intuitive for both C and C++ programmers. > > This doesn’t really solve the ambiguity problem. If n is a field name, > a programmer who writes __counted_by(n) almost certainly means to name > the field. “The proper syntax is .n” is the cause of the bug, not its > solution. >>> >>> Field names in C are in a different namespace. So far, when you write >>> 'n' this *never* refers to field member in any context. And I have >>> never seen anybody request a warning for the examples above. So, no, >>> "a programmer almost certainly means this" can not possible be true. >>> >>> Martin > I won't speak for John, but the way I see it you can optimise for theoretically consistent minimalist semantics, or strive to optimise each feature for its most common use cases. That's a difference in design philosophy. Although I don't think these ambiguities matter in practice (as long as there are workarounds) since their occurrence will be vanishingly rare, the fact that 'n' currently means one thing doesn't mean that it should have to mean the same thing in another context with different design goals, especially when that is a completely new context. The context of "__counted_by(...)" surrounding the expression should be enough to infer what is meant. The fact that code could be structured in a way to mislead the reader is nothing new to C, and when the context where that even could be a concern is largely hypothetical I don't think eliminating it should be a priority. Henrik
Re: [PATCH v2] i386: Verify that argument registers are spilled properly
On Sun, Mar 9, 2025 at 11:34 PM H.J. Lu wrote: > > While working on a local x86 patch, which passed the GCC testsuite, I got > a compiler error: > > In function ‘paravirt_read_msr’, > inlined from ‘perf_ibs_handle_irq’ at arch/x86/events/amd/ibs.c:1055:2: > ./arch/x86/include/asm/paravirt_types.h:397:17: error: ‘asm’ operand has > impossible constraints or there are not enough registers > 397 | asm volatile(ALTERNATIVE(PARAVIRT_CALL, > ALT_CALL_INSTR, \ > | ^~~ > > when building x86-64 Linux kernel. RDI, RSI, RDX and RCX registers are > used to pass arguments in 64-bit mode. EAX, EDX and ECX registers are > used to pass arguments in 32-bit mode. But there is no coverage in the > GCC testsuite. Add tests to verify that argument registers are spilled > properly. > > PR target/119171 > * gcc.target/i386/pr119171-1.c: New test. > * gcc.target/i386/pr119171-2.c: Likewise. OK. Thanks, Uros. > > Signed-off-by: H.J. Lu > --- > gcc/testsuite/gcc.target/i386/pr119171-1.c | 14 ++ > gcc/testsuite/gcc.target/i386/pr119171-2.c | 13 + > 2 files changed, 27 insertions(+) > create mode 100644 gcc/testsuite/gcc.target/i386/pr119171-1.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr119171-2.c > > diff --git a/gcc/testsuite/gcc.target/i386/pr119171-1.c > b/gcc/testsuite/gcc.target/i386/pr119171-1.c > new file mode 100644 > index 000..a017e6e215f > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr119171-1.c > @@ -0,0 +1,14 @@ > +/* { dg-do compile { target { ! ia32 } } } */ > +/* { dg-options "-O2" } */ > + > +extern long a1, a2, a3, a4; > +extern void foo (void *, void *, void *, void *); > +void > +bar (void *rdi, void *rsi, void *rdx, void *rcx) > +{ > + asm ("" : "=D"(a1) : "D"(0)); > + asm ("" : "=S"(a2) : "S"(0)); > + asm ("" : "=d"(a3) : "d"(0)); > + asm ("" : "=c"(a4) : "c"(0)); > + foo (rdi, rsi, rdx, rcx); > +} > diff --git a/gcc/testsuite/gcc.target/i386/pr119171-2.c > b/gcc/testsuite/gcc.target/i386/pr119171-2.c > new file mode 100644 > index 000..5c209f9aed0 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr119171-2.c > @@ -0,0 +1,13 @@ > +/* { dg-do compile { target ia32 } } */ > +/* { dg-options "-O2 -mregparm=3" } */ > + > +extern long a1, a2, a3; > +extern void foo (void *, void *, void *); > +void > +bar (void *eax, void *edx, void *ecx) > +{ > + asm ("" : "=a"(a1) : "a"(0)); > + asm ("" : "=d"(a2) : "d"(0)); > + asm ("" : "=c"(a3) : "c"(0)); > + foo (eax, edx, ecx); > +} > -- > 2.48.1 >
[PATCH v2] driver: Fix multilib_os_dir and multiarch_dir for those target use TARGET_COMPUTE_MULTILIB
This patch fixes the multilib_os_dir and multiarch_dir for those targets that use TARGET_COMPUTE_MULTILIB, since the TARGET_COMPUTE_MULTILIB hook only update/fix the multilib_dir but not the multilib_os_dir and multiarch_dir, so the multilib_os_dir and multiarch_dir are not set correctly for those targets. Use RISC-V linux target (riscv64-unknown-linux-gnu) as an example: ``` $ riscv64-unknown-linux-gnu-gcc -print-multi-lib .; lib32/ilp32;@march=rv32imac@mabi=ilp32 lib32/ilp32d;@march=rv32imafdc@mabi=ilp32d lib64/lp64;@march=rv64imac@mabi=lp64 lib64/lp64d;@march=rv64imafdc@mabi=lp64d ``` If we use the exactly same -march and -mabi options to compile a source file, the multilib_os_dir and multiarch_dir are set correctly: ``` $ riscv64-unknown-linux-gnu-gcc -print-multi-os-directory -march=rv64imafdc -mabi=lp64d ../lib64/lp64d $ riscv64-unknown-linux-gnu-gcc -print-multi-directory -march=rv64imafdc -mabi=lp64d lib64/lp64d ``` However if we use the -march=rv64imafdcv -mabi=lp64d option to compile a source file, the multilib_os_dir and multiarch_dir are not set correctly: ``` $ riscv64-unknown-linux-gnu-gcc -print-multi-os-directory -march=rv64imafdc -mabi=lp64d lib64/lp64d $ riscv64-unknown-linux-gnu-gcc -print-multi-directory -march=rv64imafdc -mabi=lp64d lib64/lp64d ``` That's because the TARGET_COMPUTE_MULTILIB hook only update/fix the multilib_dir but not the multilib_os_dir, so the multilib_os_dir is blank and will use same value as multilib_dir, but that is not correct. So we introduce second chance to fix the multilib_os_dir if it's not set, we do also try to fix the multiarch_dir, because it may also not set correctly if multilib_os_dir is not set. Changes since v1: - Fix non-multilib build. - Fix fix indentation. gcc/ChangeLog: * gcc.c (find_multilib_os_dir_by_multilib_dir): New. (set_multilib_dir): Fix multilib_os_dir and multiarch_dir if multilib_os_dir is not set. --- gcc/gcc.cc | 108 - 1 file changed, 107 insertions(+), 1 deletion(-) diff --git a/gcc/gcc.cc b/gcc/gcc.cc index 04b3736a5da..89944c33133 100644 --- a/gcc/gcc.cc +++ b/gcc/gcc.cc @@ -9736,6 +9736,103 @@ default_arg (const char *p, int len) return 0; } +/* Use multilib_dir as key to find corresponding multilib_os_dir and + multiarch_dir. */ + +static void +find_multilib_os_dir_by_multilib_dir (const char *multilib_dir, + const char **p_multilib_os_dir, + const char **p_multiarch_dir) +{ + const char *p = multilib_select; + unsigned int this_path_len; + const char *this_path; + int ok = 0; + + while (*p != '\0') +{ + /* Ignore newlines. */ + if (*p == '\n') + { + ++p; + continue; + } + + /* Get the initial path. */ + this_path = p; + while (*p != ' ') + { + if (*p == '\0') + { + fatal_error (input_location, "multilib select %qs %qs is invalid", + multilib_select, multilib_reuse); + } + ++p; + } + this_path_len = p - this_path; + + ok = 0; + + /* Skip any arguments, we don't care at this stage. */ + while (*++p != ';'); + + if (this_path_len != 1 + || this_path[0] != '.') + { + char *new_multilib_dir = XNEWVEC (char, this_path_len + 1); + char *q; + + strncpy (new_multilib_dir, this_path, this_path_len); + new_multilib_dir[this_path_len] = '\0'; + q = strchr (new_multilib_dir, ':'); + if (q != NULL) + *q = '\0'; + + if (strcmp (new_multilib_dir, multilib_dir) == 0) + ok = 1; + } + + /* Found matched multilib_dir, update multilib_os_dir and +multiarch_dir. */ + if (ok) + { + const char *q = this_path, *end = this_path + this_path_len; + + while (q < end && *q != ':') + q++; + if (q < end) + { + const char *q2 = q + 1, *ml_end = end; + char *new_multilib_os_dir; + + while (q2 < end && *q2 != ':') + q2++; + if (*q2 == ':') + ml_end = q2; + if (ml_end - q == 1) + *p_multilib_os_dir = xstrdup ("."); + else + { + new_multilib_os_dir = XNEWVEC (char, ml_end - q); + memcpy (new_multilib_os_dir, q + 1, ml_end - q - 1); + new_multilib_os_dir[ml_end - q - 1] = '\0'; + *p_multilib_os_dir = new_multilib_os_dir; + } + + if (q2 < end && *q2 == ':') + { + char *new_multiarch_dir = XNEWVEC (char, end - q2); + memcpy (new_multiarch_dir, q2 + 1, end - q2 - 1); + new_multiarch_dir[end - q2 - 1] = '\0'; + *p_multiarch_dir = new_multiarch_dir; + } +
[PATCH] Fix a pasto in ao_compare::compare_ao_refs
Hi, when reading the function ao_compare::compare_ao_refs I came accross what I believe to ba a copy-and-paste error which this patch fixes. Bootstrapped, LTO-bootstrapped and tested on x86_64-linux. OK for master? Thanks, Martin gcc/ChangeLog: 2025-03-10 Martin Jambor * tree-ssa-alias.cc (ao_compare::compare_ao_refs): Fix a copy-and-paste error. --- gcc/tree-ssa-alias.cc | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/gcc/tree-ssa-alias.cc b/gcc/tree-ssa-alias.cc index 2489aa6b808..e93d5187d50 100644 --- a/gcc/tree-ssa-alias.cc +++ b/gcc/tree-ssa-alias.cc @@ -4355,12 +4355,13 @@ ao_compare::compare_ao_refs (ao_ref *ref1, ao_ref *ref2, c1 = p1, nskipped1 = i; i++; } + i = 0; for (tree p2 = ref2->ref; handled_component_p (p2); p2 = TREE_OPERAND (p2, 0)) { if (component_ref_to_zero_sized_trailing_array_p (p2)) end_struct_ref2 = p2; if (ends_tbaa_access_path_p (p2)) - c2 = p2, nskipped1 = i; + c2 = p2, nskipped2 = i; i++; } -- 2.47.1
RE: [PATCH] combine: Add REG_DEAD notes to the last instruction after a split [PR118914]
> -Original Message- > From: Andrew Pinski (QUIC) > Sent: Wednesday, February 19, 2025 8:14 PM > To: gcc-patches@gcc.gnu.org > Cc: Andrew Pinski (QUIC) > Subject: [PATCH] combine: Add REG_DEAD notes to the last > instruction after a split [PR118914] > > So gcc.target/aarch64/rev16_2.c started to fail after r15-268- > g9dbff9c05520a7, the problem is combine now rejects the > instruction combine. This happens because after a different > combine which uses a define_split and that define_split > creates a new pseudo register. That new pseudo register is > dead after the last instruction in the stream BUT combine > never creates a REG_DEAD for it. So combine thinks it is still > needed after and now with the i2 not changing, combine > rejects the combine. > > Now combine should be creating a REG_DEAD for the new > pseudo registers for the last instruction of the split. This fixes > rev16_2.c and also improves the situtation in other cases by > removing other dead instructions. > > Bootstrapped and tested on aarch64-linux-gnu and x86_64- > linux-gnu. Ping? This fixes a regression on aarch64. > > gcc/ChangeLog: > > PR rtl-optimization/118914 > * combine.cc (recog_for_combine): Add old_nregs and > new_nregs > argument (defaulting to 0). Update call to > recog_for_combine_1. > (combine_split_insns): Add old_nregs and new_nregs > arguments, > store the old and new max registers to them. > (try_combine): Update calls to combine_split_insns and > pass old_nregs and new_nregs for the i3 call to > recog_for_combine. > (find_split_point): Update call to combine_split_insns; > ignoring > the values there. > (recog_for_combine_1): Add old_nregs and new_nregs > arguments, > if the insn was recognized (and not to no-op move), add > the > REG_DEAD notes to pnotes argument. > > Signed-off-by: Andrew Pinski > --- > gcc/combine.cc | 46 +++ > --- > 1 file changed, 31 insertions(+), 15 deletions(-) > > diff --git a/gcc/combine.cc b/gcc/combine.cc index > 3beeb514b81..a687c7c2015 100644 > --- a/gcc/combine.cc > +++ b/gcc/combine.cc > @@ -454,7 +454,7 @@ static bool merge_outer_ops (enum > rtx_code *, HOST_WIDE_INT *, enum rtx_code, static rtx > simplify_shift_const_1 (enum rtx_code, machine_mode, rtx, > int); static rtx simplify_shift_const (rtx, enum rtx_code, > machine_mode, rtx, >int); > -static int recog_for_combine (rtx *, rtx_insn *, rtx *); > +static int recog_for_combine (rtx *, rtx_insn *, rtx *, unsigned > = 0, > +unsigned = 0); > static rtx gen_lowpart_for_combine (machine_mode, rtx); > static enum rtx_code simplify_compare_const (enum > rtx_code, machine_mode, >rtx *, rtx *); > @@ -515,18 +515,22 @@ target_canonicalize_comparison > (enum rtx_code *code, rtx *op0, rtx *op1, > > /* Try to split PATTERN found in INSN. This returns NULL_RTX > if > PATTERN cannot be split. Otherwise, it returns an insn > sequence. > + Updates OLD_NREGS with the max number of regs before > the split > + and NEW_NREGS after the split. > This is a wrapper around split_insns which ensures that the > reg_stat vector is made larger if the splitter creates a new > register. */ > > static rtx_insn * > -combine_split_insns (rtx pattern, rtx_insn *insn) > +combine_split_insns (rtx pattern, rtx_insn *insn, > + unsigned int *old_nregs, > + unsigned int *new_regs) > { >rtx_insn *ret; >unsigned int nregs; > - > + *old_nregs = max_reg_num (); >ret = split_insns (pattern, insn); > - nregs = max_reg_num (); > + *new_regs = nregs = max_reg_num (); >if (nregs > reg_stat.length ()) > reg_stat.safe_grow_cleared (nregs, true); >return ret; > @@ -3566,12 +3570,13 @@ try_combine (rtx_insn *i3, > rtx_insn *i2, rtx_insn *i1, rtx_insn *i0, > { >rtx parallel, *split; >rtx_insn *m_split_insn; > + unsigned int old_nregs, new_nregs; > >/* See if the MD file can split NEWPAT. If it can't, see if > letting it >use I2DEST as a scratch register will help. In the latter > case, >convert I2DEST to the mode of the source of NEWPAT if we > can. */ > > - m_split_insn = combine_split_insns (newpat, i3); > + m_split_insn = combine_split_insns (newpat, i3, > &old_nregs, > + &new_nregs); > >/* We can only use I2DEST as a scratch reg if it doesn't > overlap any >inputs of NEWPAT. */ > @@ -3599,7 +3604,7 @@ try_combine (rtx_insn *i3, rtx_insn > *i2, rtx_insn *i1, rtx_insn *i0, > gen_rtvec (2, newpat, > gen_rtx_CLOBBER (VOIDmode, > i2dest))); > - m_split_insn = combine_split_insns (parallel, i3); > + m_split_insn = combine_split_insns (parallel, i3,
Re: The COBOL front end, version 3, now in 14 easy pieces
On Mon, Mar 10, 2025 at 10:40 PM David Malcolm wrote: > > On Mon, 2025-03-10 at 19:07 +0100, Richard Biener wrote: > > On Mon, Mar 10, 2025 at 5:34 PM Richard Biener > > wrote: > > > > > > On Sat, Mar 8, 2025 at 12:12 AM Robert Dubner > > > wrote: > > > > > > > > Richard and Jakub, and everybody else who has offered advice and > > > > help > > > > > > > > I trust you realize that your message means Jim and I are > > > > reaching the top > > > > of a mountain we've been climbing for several years now. > > > > > > > > This is very exciting. > > > > > > > > Thank you. Thank you very much. > > > > > > During this I saw > > > > > > index 10a42cb1dd7..8e18ef1 100644 > > > --- a/gcc/Makefile.in > > > +++ b/gcc/Makefile.in > > > ... > > > +# user-defined functions for GCOBOL > > > +udfdir = $(datadir)/gcobol/udf > > > .. > > > @@ -4031,7 +4035,9 @@ installdirs: > > > $(mkinstalldirs) $(DESTDIR)$(includedir) > > > $(mkinstalldirs) $(DESTDIR)$(infodir) > > > $(mkinstalldirs) $(DESTDIR)$(man1dir) > > > + $(mkinstalldirs) $(DESTDIR)$(man3dir) > > > $(mkinstalldirs) $(DESTDIR)$(man7dir) > > > + $(mkinstalldirs) $(DESTDIR)$(udfdir) > > > > > > while generating man3dir is probably OK when not configuring COBOL > > > the udfdir creation looks spurious and is repeated(?) in > > > cobol/Make-lang.in: > > > > > > cobol.install-common: installdirs > > > $(INSTALL_PROGRAM) gcobol$(exeext) > > > $(DESTDIR)$(bindir)/ > > > $(INSTALL_PROGRAM) cobol1$(exeext) > > > $(DESTDIR)$(libexecsubdir)/ > > > $(INSTALL) -m 755 $(srcdir)/cobol/gcobc > > > $(DESTDIR)$(bindir)/ > > > mkdir -p $(DESTDIR)$(datadir)/gcobol/udf > > > $(INSTALL_DATA) $(srcdir)/cobol/udf/* > > > $(DESTDIR)$(datadir)/gcobol/udf/ > > > > > > I don't know exactly what UDF is, but I'm going to prune the > > > gcc/Makefile.in > > > UDF changes. Does that analysis look OK? > > > > > > Building gcobol with GCC 7 shows > > > > > > gcc/cobol/except.cc:285:70: sorry, unimplemented: non-trivial > > > designated initializers not supported > > > > > > that needs to be sorted out (post-merge is OK). > > > > > > I'm preparing a branch with squashed merged from cobol-patched and > > > will push that later > > > today to a branch on gcc.gnu.org git and tomorrow from there to > > > trunk. > > > > So I've pushed refs/users/rguenth/heads/cobol which has picked all > > changes from the cobol-patched branch, re-ordered and squashed as > > shown by git log. The directories and ChangeLog prerequesites are > > already merged. I've kept two revs, one for libgcobol and one for > > gcc/cobol > > population and the toplevel configury/make is almost last to keep all > > revs > > building (without cobol). > > > > commit b7de2a002baf3cbdb42e0fdb14175637fb7c449a (HEAD -> cobol-merge, > > users/rguenth/cobol) > > For reference (for the curious), this can be seen in the web UI > here:https://gcc.gnu.org/git/?p=gcc.git;a=shortlog;h=refs/users/rguenth/heads/cobol > > Sounds like it will go onto trunk tomorrow; thanks! > > I noticed that the copyright dates don't include 2025 in these files: > gcc/cobol/LICENSE > gcc/cobol/Make-lang.in > gcc/cobol/cobol1.cc > gcc/cobol/gcobolspec.cc > gcc/cobol/lang.opt I've fixed those in my copy as well as the same issue in some libgcobol files. > FWIW gcc/cobol/lang.opt.urls has some D-specific things that look like > copy-and-paste cruft, but hopefully it won't cause problems. I guess we'll re-generate those eventually. I have bootstrapped the merge tree one with and once without cobol enabled on x86_64-unknown-linux-gnu and checked the installed contents. I have created a 'cobol' component in bugzilla. Robert, James - you want to create gcc.gnu.org/bugzilla accounts with the @gcc.gnu.org e-mail you are using for the git write access - that will get you bug edit privileges. I will open a bugreport tracking things to be done before the release. Pushed as r15-7942-g86c692c51c3569 Congrats! Richard. > > Dave >
Re: [PATCH] c, v2: do not warn about truncating NUL char when initializing nonstring arrays [PR117178]
On Mon, Mar 10, 2025 at 01:03:44PM -0700, Kees Cook wrote: > Hi! Thanks again for continuing to poke at this. Even with commit > 1301e18f69ce ("gimple-ssa-warn-access: Adjust maybe_warn_nonstring_arg > for nonstring multidimensional arrays [PR117178]"), this is still not > working. Does the above test pass for you? I still get warnings with > the latest ToT GCC. Oh! I see from the bug that that commit wasn't intended to fix it. I see the proposed patch[1] there. I will go test this now... -Kees [1] https://gcc.gnu.org/bugzilla/attachment.cgi?id=60692&action=diff -- Kees Cook
Re: [PATCH] LoongArch: Fix ICE when trying to recognize bitwise + alsl.w pair [PR119127]
LGTM. Thanks. 在 2025/3/10 下午2:40, Xi Ruoyao 写道: When we call loongarch_reassoc_shift_bitwise for _alsl_reversesi_extend, the mask is in DImode but we are trying to operate it in SImode, causing an ICE. To fix the issue sign-extend the mask into the mode we want. And also specially handle the case the mask is extended into -1 to avoid a miss-optimization. gcc/ChangeLog: PR target/119127 * config/loongarch/loongarch.cc (loongarch_reassoc_shift_bitwise): Sign extend mask to mode, specially handle the case it's extended to -1. * config/loongarch/loongarch.md (loongarch_reassoc_shift_bitwise): Update the comment for the special case. --- Bootstrapped and regtested on loongarch64-linux-gnu. Ok for trunk? gcc/config/loongarch/loongarch.cc | 22 +-- gcc/config/loongarch/loongarch.md | 6 ++--- gcc/testsuite/gcc.target/loongarch/pr119127.c | 14 3 files changed, 31 insertions(+), 11 deletions(-) create mode 100644 gcc/testsuite/gcc.target/loongarch/pr119127.c diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc index 3779e283f8d..f21b8ae0ea3 100644 --- a/gcc/config/loongarch/loongarch.cc +++ b/gcc/config/loongarch/loongarch.cc @@ -4575,8 +4575,22 @@ loongarch_reassoc_shift_bitwise (bool is_and, rtx shamt, rtx mask, if (ctz_hwi (INTVAL (mask)) < INTVAL (shamt)) return NULL_RTX; + /* When trying alsl.w, deliberately ignore the high bits. */ + mask = gen_int_mode (UINTVAL (mask), mode); + rtx new_mask = simplify_const_binary_operation (LSHIFTRT, mode, mask, shamt); + + /* Do an arithmetic shift for checking ins_zero_bitmask_operand or -1: + ashiftrt (0x, 2) is 0x6000 which is an + ins_zero_bitmask_operand, but lshiftrt will produce + 0x3fff6000. */ + rtx new_mask_1 = simplify_const_binary_operation (ASHIFTRT, mode, mask, + shamt); + + if (is_and && const_m1_operand (new_mask_1, mode)) +return new_mask_1; + if (const_uns_arith_operand (new_mask, mode)) return new_mask; @@ -4586,13 +4600,7 @@ loongarch_reassoc_shift_bitwise (bool is_and, rtx shamt, rtx mask, if (low_bitmask_operand (new_mask, mode)) return new_mask; - /* Do an arithmetic shift for checking ins_zero_bitmask_operand: - ashiftrt (0x, 2) is 0x6000 which is an - ins_zero_bitmask_operand, but lshiftrt will produce - 0x3fff6000. */ - new_mask = simplify_const_binary_operation (ASHIFTRT, mode, mask, - shamt); - return ins_zero_bitmask_operand (new_mask, mode) ? new_mask : NULL_RTX; + return ins_zero_bitmask_operand (new_mask_1, mode) ? new_mask_1 : NULL_RTX; } /* Implement TARGET_CONSTANT_ALIGNMENT. */ diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md index 478f859051c..a13398fdff4 100644 --- a/gcc/config/loongarch/loongarch.md +++ b/gcc/config/loongarch/loongarch.md @@ -3230,10 +3230,8 @@ (define_insn_and_split "_alsl_reversesi_extended" emit_insn (gen_di3 (operands[0], operands[1], operands[3])); else { - /* Hmm would we really reach here? If we reach here we'd have - a miss-optimization in the generic code (as it should have - optimized this to alslsi3_extend_subreg). But let's be safe - than sorry. */ + /* We can end up here with things like: + x:DI = sign_extend(a:SI + ((b:DI << 2) & 0xfffc)#0) */ gcc_checking_assert (); emit_move_insn (operands[0], operands[1]); } diff --git a/gcc/testsuite/gcc.target/loongarch/pr119127.c b/gcc/testsuite/gcc.target/loongarch/pr119127.c new file mode 100644 index 000..4e253beb0f4 --- /dev/null +++ b/gcc/testsuite/gcc.target/loongarch/pr119127.c @@ -0,0 +1,14 @@ +/* PR target/119127: ICE caused by operating DImode const in SImode */ +/* { dg-do compile } */ +/* { dg-options "-O2 -march=loongarch64 -mabi=lp64d" } */ + +int x; +struct Type { + unsigned SubclassData : 24; +} y; + +void +test (void) +{ + x = y.SubclassData * 37; +}
Re: [PATCH] libgcc: Fix up unwind-dw2-btree.h [PR119151]
Hi Jakub, What differs from the textbook implementations is mostly that the leaf nodes don't include just address as a key, but address range, address + size (where we don't insert any ranges with zero size) and the lookups can be performed for any address in the [address, address + size) range. The keys on inner nodes are still just address-1, so the child covers all nodes where addr <= key unless it is covered already in children to the left. The user (static executables or JIT) should always ensure there is no overlap in between any of the ranges. thanks for tracking this down (and sorry for me messing it up in the first place). I checked your logic and your patch, and both look fine to me. Best Thomas
Re: [PATCH] gimple-ssa-warn-access: Adjust maybe_warn_nonstring_arg for nonstring multidimensional arrays [PR117178]
> Am 10.03.2025 um 08:35 schrieb Jakub Jelinek : > > Hi! > > I've rushed up the PR117178 multi-dimensional nonstring support. Before > a week off I've just bootstrapped/regtested the actual change, so > handle_nonstring_attribute/get_attr_nonstring_decl adjustments after playing > with simple short tests and left the testcases for the next day, where I've > basically copied the attr-nonstring-{1,...,8}.c tests to > attr-nonstring-{9,...,16}.c and adjusted them so that they test > multi-dimensional nonstring instead of single-dimensional ones. But I > forgot I haven't finished adjusting the dg-* lines and a few others to make > all the tests pass and posted the patch for review anyway. > When it got approved recently, I've only found out that some of the tests > still FAIL when about to commit it and quickly adjusted them, but I've added > 4 xfails in there. Sorry. > > The following patch fixes those 4 xfails (and results in 2 false positive > warnings not being produced either). > The thing is that maybe_warn_nonstring_arg simply assumed that nonstring > arrays must be single-dimensional, so when it sees a nonstring decl with > ARRAY_TYPE, it just used its dimension. With multi-dimensional arrays > that is not the right dimension to use though, it can be dimension of > some outer dimension, e.g. if we have > char a[5][6][7] __attribute__((nonstring)) if decl is > a[5] it would assume maximum non-NUL terminated string length of 5 rather than > 7, if a[5][6] it would assume 6 and only for a[5][6][0] it would assume the > correct 7. So, the following patch looks through all the outer dimensions > to reach the innermost one (which for attribute nonstring is guaranteed to > have char/unsigned char/signed char element type). > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Ok Richard > 2025-03-10 Jakub Jelinek > >PR c/117178 >* gimple-ssa-warn-access.cc (maybe_warn_nonstring_arg): Look through >multi-dimensional array types, stop at the innermost ARRAY_TYPE. > >* c-c++-common/attr-nonstring-11.c: Remove xfails. >* c-c++-common/attr-nonstring-12.c (warn_strcmp_cst_1, >warn_strcmp_cst_2): Don't expect any warnings here. >(warn_strcmp_cst_3, warn_strcmp_cst_4): New functions with expected >warnings. > > --- gcc/gimple-ssa-warn-access.cc.jj2025-01-02 11:23:17.0 +0100 > +++ gcc/gimple-ssa-warn-access.cc2025-03-08 11:44:09.304064889 +0100 > @@ -602,6 +602,10 @@ maybe_warn_nonstring_arg (tree fndecl, G > bool known_size = false; > tree type = TREE_TYPE (decl); > > + while (TREE_CODE (type) == ARRAY_TYPE > + && TREE_CODE (TREE_TYPE (type)) == ARRAY_TYPE) > +type = TREE_TYPE (type); > + > /* Determine the array size. For arrays of unknown bound and > pointers reset BOUND to trigger the appropriate warning. */ > if (TREE_CODE (type) == ARRAY_TYPE) > --- gcc/testsuite/c-c++-common/attr-nonstring-11.c.jj2025-03-08 > 00:49:09.958121143 +0100 > +++ gcc/testsuite/c-c++-common/attr-nonstring-11.c2025-03-08 > 11:50:15.513041709 +0100 > @@ -165,8 +165,8 @@ void test_strcmp (struct MemArrays *p) > void test_strncmp_warn (struct MemArrays *p) > { > enum { N = sizeof arr[2] }; > - T (strncmp (str[2], arr[2], N));/* { dg-bogus "argument 2 declared > attribute 'nonstring' is smaller than the specified bound 4" "" { xfail *-*-* > } } */ > - T (strncmp (arr[2], str[2], N));/* { dg-bogus "argument 1 declared > attribute 'nonstring' is smaller than the specified bound 4" "" { xfail *-*-* > } } */ > + T (strncmp (str[2], arr[2], N)); > + T (strncmp (arr[2], str[2], N)); > > T (strncmp (str[2], arr[2], N + 1)); /* { dg-warning "argument 2 declared > attribute .nonstring. is smaller than the specified bound 5" } */ > T (strncmp (arr[2], str[2], N + 1)); /* { dg-warning "argument 1 declared > attribute .nonstring. is smaller than the specified bound 5" } */ > @@ -237,7 +237,7 @@ void test_stpncpy_warn (struct MemArrays > enum { N = sizeof arr[2] }; > > T (stpncpy (str[2], str[2], N)); > - T (stpncpy (str[2], arr[2], N));/* { dg-bogus "argument 2 declared > attribute 'nonstring' is smaller than the specified bound 4" "" { xfail *-*-* > } } */ > + T (stpncpy (str[2], arr[2], N)); > T (stpncpy (arr[2], str[2], N)); > > T (stpncpy (str[2], *ptr, N)); > @@ -370,7 +370,7 @@ void test_stnrdup_warn (struct MemArrays > enum { N = sizeof arr[2] }; > > T (strndup (str[2], N)); > - T (strndup (arr[2], N));/* { dg-bogus "argument 1 declared attribute > 'nonstring' is smaller than the specified bound 4" "" { xfail *-*-* } } */ > + T (strndup (arr[2], N)); > > T (strndup (*ptr, N)); > T (strndup (*parr, N)); > --- gcc/testsuite/c-c++-common/attr-nonstring-12.c.jj2025-03-08 > 00:07:01.879907901 +0100 > +++ gcc/testsuite/c-c++-common/attr-nonstring-12.c2025-03-08 > 11:52:11.850445944 +0100 > @@ -26,12 +26,22 @@ enum { X = sizeof ar5[2]
[PATCH] libgcc: Fix up unwind-dw2-btree.h [PR119151]
Hi! The following testcase shows a bug in unwind-dw2-btree.h. In short, the header provides lock-free btree data structure (so no parent link on nodes, both insertion and deletion are done in top-down walks with some locking of just a few nodes at a time so that lookups can notice concurrent modifications and retry, non-leaf (inner) nodes contain keys which are initially the base address of the left-most leaf entry of the following child (or all ones if there is none) minus one, insertion ensures balancing of the tree to ensure [d/2, d] entries filled through aggressive splitting if it sees a full tree while walking, deletion performs various operations like merging neighbour trees, merging into parent or moving some nodes from neighbour to the current one). What differs from the textbook implementations is mostly that the leaf nodes don't include just address as a key, but address range, address + size (where we don't insert any ranges with zero size) and the lookups can be performed for any address in the [address, address + size) range. The keys on inner nodes are still just address-1, so the child covers all nodes where addr <= key unless it is covered already in children to the left. The user (static executables or JIT) should always ensure there is no overlap in between any of the ranges. In the testcase a bunch of insertions are done, always followed by one removal, followed by one insertion of a range slightly different from the removed one. E.g. in the first case [&code[0x50], &code[0x59]] range is removed and then we insert [&code[0x4c], &code[0x53]] range instead. This is valid, it doesn't overlap anything. But the problem is that some non-leaf (inner) one used the &code[0x4f] key (after the 11 insertions completely correctly). On removal, nothing adjusts the keys on the parent nodes (it really can't in the top-down only walk, the keys could be many nodes above it and unlike insertion, removal only knows the start address, doesn't know the removed size and so will discover it only when reaching the leaf node which contains it; plus even if it knew the address and size, it still doesn't know what the second left-most leaf node will be (i.e. the one after removal)). And on insertion, if nodes aren't split at a level, nothing adjusts the inner keys either. If a range is inserted and is either fully bellow key (keys are - 1, so having address + size - 1 being equal to key is fine) or fully after key (i.e. address > key), it works just fine, but if the key is in a middle of the range like in this case, &code[0x4f] is in the middle of the [&code[0x4c], &code[0x53]] range, then insertion works fine (we only use size on the leaf nodes), and lookup of the addresses below the key work fine too (i.e. [&code[0x4c], &code[0x4f]] will succeed). The problem is with lookups after the key (i.e. [&code[0x50, &code[0x53]]), the lookup looks for them in different children of the btree and doesn't find an entry and returns NULL. As users need to ensure non-overlapping entries at any time, the following patch fixes it by adjusting keys during insertion where we know not just the address but also size; if we find during the top-down walk a key which is in the middle of the range being inserted, we simply increase the key to be equal to address + size - 1 of the range being inserted. There can't be any existing leaf nodes overlapping the range in correct programs and the btree rebalancing done on deletion ensures we don't have any empty nodes which would also cause problems. The patch adjusts the keys in two spots, once for the current node being walked (the last hunk in the header, with large comment trying to explain it) and once during inner node splitting in a parent node if we'd otherwise try to add that key in the middle of the range being inserted into the parent node (in that case it would be missed in the last hunk). The testcase covers both of those spots, so succeeds with GCC 12 (which didn't have btrees) and fails with vanilla GCC trunk and also fails if either the if (fence < base + size - 1) fence = iter->content.children[slot].separator = base + size - 1; or if (left_fence >= target && left_fence < target + size - 1) left_fence = target + size - 1; hunk is removed (of course, only with the current node sizes, i.e. up to 15 children of inner nodes and up to 10 entries in leaf nodes). Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk and after a while affected rel? 2025-03-10 Jakub Jelinek Michael Leuchtenburg PR libgcc/119151 * unwind-dw2-btree.h (btree_split_inner): Add size argument. If left_fence is in the middle of [target,size) range, increase it to target + size - 1. (btree_insert): Adjust btree_split_inner caller. If fence is smaller than base + size - 1, increase it and separator of the slot to base + size - 1. * gcc.dg/pr119151.c: New test. --- libgcc/unwind-dw2-btree.h.
[PATCH] libgcc: Formatting fixes for unwind-dw2-btree.h
Hi! Studying unwind-dw2-btree.h was really hard for me because the formatting is wrong or weird in many ways all around the code and that kept distracting my attention. That includes all kinds of things, including wrong indentation, using {} around single statement substatements, excessive use of ()s around some parts of expressions which don't increase code clarity, no space after dot in comments, some comments not starting with capital letters, some not ending with dot, adding {} around some parts of code without any obvious reason (and when it isn't done in a similar neighboring function) or ( at the end of line without any reason. The following patch fixes the formatting issues I found, no functional changes. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Or ok for GCC 16? 2025-03-10 Jakub Jelinek * unwind-dw2-btree.h: Formatting fixes. --- libgcc/unwind-dw2-btree.h.jj2025-03-08 10:23:28.437681484 +0100 +++ libgcc/unwind-dw2-btree.h 2025-03-08 10:58:16.041085778 +0100 @@ -31,12 +31,12 @@ see the files COPYING3 and COPYING.RUNTI // Common logic for version locks. struct version_lock { - // The lock itself. The lowest bit indicates an exclusive lock, - // the second bit indicates waiting threads. All other bits are + // The lock itself. The lowest bit indicates an exclusive lock, + // the second bit indicates waiting threads. All other bits are // used as counter to recognize changes. // Overflows are okay here, we must only prevent overflow to the // same value within one lock_optimistic/validate - // range. Even on 32 bit platforms that would require 1 billion + // range. Even on 32 bit platforms that would require 1 billion // frame registrations within the time span of a few assembler // instructions. uintptr_type version_lock; @@ -60,10 +60,10 @@ version_lock_initialize_locked_exclusive static inline bool version_lock_try_lock_exclusive (struct version_lock *vl) { - uintptr_type state = __atomic_load_n (&(vl->version_lock), __ATOMIC_SEQ_CST); + uintptr_type state = __atomic_load_n (&vl->version_lock, __ATOMIC_SEQ_CST); if (state & 1) return false; - return __atomic_compare_exchange_n (&(vl->version_lock), &state, state | 1, + return __atomic_compare_exchange_n (&vl->version_lock, &state, state | 1, false, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST); } @@ -78,10 +78,10 @@ restart: // We should virtually never get contention here, as frame // changes are rare. - uintptr_type state = __atomic_load_n (&(vl->version_lock), __ATOMIC_SEQ_CST); + uintptr_type state = __atomic_load_n (&vl->version_lock, __ATOMIC_SEQ_CST); if (!(state & 1)) { - if (__atomic_compare_exchange_n (&(vl->version_lock), &state, state | 1, + if (__atomic_compare_exchange_n (&vl->version_lock, &state, state | 1, false, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST)) return; @@ -90,13 +90,13 @@ restart: // We did get contention, wait properly. #ifdef __GTHREAD_HAS_COND __gthread_mutex_lock (&version_lock_mutex); - state = __atomic_load_n (&(vl->version_lock), __ATOMIC_SEQ_CST); + state = __atomic_load_n (&vl->version_lock, __ATOMIC_SEQ_CST); while (true) { // Check if the lock is still held. if (!(state & 1)) { - if (__atomic_compare_exchange_n (&(vl->version_lock), &state, + if (__atomic_compare_exchange_n (&vl->version_lock, &state, state | 1, false, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST)) { @@ -104,15 +104,13 @@ restart: return; } else - { - continue; - } + continue; } // Register waiting thread. if (!(state & 2)) { - if (!__atomic_compare_exchange_n (&(vl->version_lock), &state, + if (!__atomic_compare_exchange_n (&vl->version_lock, &state, state | 2, false, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST)) continue; @@ -120,7 +118,7 @@ restart: // And sleep. __gthread_cond_wait (&version_lock_cond, &version_lock_mutex); - state = __atomic_load_n (&(vl->version_lock), __ATOMIC_SEQ_CST); + state = __atomic_load_n (&vl->version_lock, __ATOMIC_SEQ_CST); } #else // Spin if we do not have condition variables available. @@ -133,15 +131,15 @@ restart: static void version_lock_unlock_exclusive (struct version_lock *vl) { - // increase version, reset exclusive lock bits - uintptr_type state = __atomic_load_n (&(vl->version_lock), __ATOMIC_SEQ_CST); - uintptr_type ns = (state + 4) & (~((uintptr_type) 3)); - state = __atomic_exchange_n (&(vl->version_lock), ns, __ATOMIC_SEQ_CST); +
Ping: [PATCH] LoongArch: combine related slli operations
Ping 在 2025/1/7 下午8:45, Zhou Zhao 写道: 在 2025/1/7 下午7:49, Lulu Cheng 写道: 在 2025/1/2 下午5:46, Zhou Zhao 写道: If SImode reg is continuous left shifted twice, combine related instruction to one. gcc/ChangeLog: * config/loongarch/loongarch.md (extsv_ashlsi3): New template Hi, zhaozhou: The indentation here is wrong, it needs to be aligned with *. Hi, Lulu cheng: Thanks for your advice, I have revised. gcc/testsuite/ChangeLog: * gcc.target/loongarch/slli-1.c: New test. --- gcc/config/loongarch/loongarch.md | 13 + gcc/testsuite/gcc.target/loongarch/slli-1.c | 10 ++ 2 files changed, 23 insertions(+) create mode 100644 gcc/testsuite/gcc.target/loongarch/slli-1.c diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md index 19a22a93de3..1c62e5c02a1 100644 --- a/gcc/config/loongarch/loongarch.md +++ b/gcc/config/loongarch/loongarch.md @@ -3048,6 +3048,19 @@ (define_insn "si3_extend" [(set_attr "type" "shift") (set_attr "mode" "SI")]) +(define_insn "extsv_ashlsi3" + [(set (match_operand:DI 0 "register_operand" "=r") + (ashift:DI + (sign_extract:DI (match_operand:DI 1 "register_operand" "r") + (match_operand:SI 2 "const_uimm5_operand") + (const_int 0)) + (match_operand:SI 3 "const_uimm5_operand")))] + "TARGET_64BIT + &&(INTVAL (operands[2]) + INTVAL (operands[3])) == 0x20" + "slli.w\t%0,%1,%3" + [(set_attr "type" "shift") + (set_attr "mode" "SI")]) + (define_insn "*rotr3" [(set (match_operand:GPR 0 "register_operand" "=r,r") (rotatert:GPR (match_operand:GPR 1 "register_operand" "r,r") diff --git a/gcc/testsuite/gcc.target/loongarch/slli-1.c b/gcc/testsuite/gcc.target/loongarch/slli-1.c new file mode 100644 index 000..891d6457b12 --- /dev/null +++ b/gcc/testsuite/gcc.target/loongarch/slli-1.c @@ -0,0 +1,10 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +int +foo (int x) +{ + return (x << 2) * 8; +} + +/* { dg-final { scan-assembler-times "slli\.\[dw\]" 1} } */
Re: [Ping, Patch, www-docs, Fortran, Coarray-ABI] Announce coarray-ABI changes in gfortran-15
Hi Steve and Jerry, thanks for the review and the proposed changes. I have based on them, but needed to adapt some places, because the meaning was changed. Can you please take another look? Jerry, where do I find this check-script? In bin/ nothing jumps out at me to be a check-script. Thanks, Andre On Thu, 6 Mar 2025 10:22:32 -0800 Jerry D wrote: > On 3/6/25 10:02 AM, Steve Kargl wrote: > > Andre, > > > > Here's a bit of wordsmith. I removed the abbreviation "Esp." > > I'm not sure if there is additional markup needed; especially, > > with the "-fcoarray=single" I inserted. > > > >Coarray support has been reworked to allow access to components > >in derived types that have not been compiled with coarray support > >enabled; especially, when the derived type is in a binary only > >module. This has changed the ABI and may lead to link-time errors > >with libraries compiled with a previous GCC version and the > >-fcoarray=single option. If this option has been used, it is > >recommended to recompile the libraries. The OpenCoarrays library > >is not affected, because it provides support backwards compatibility > >with the older ABI. > > > > I'm not sure how to test the change. > > > > OK to commit with or without my suggested change. > > > > For www-docs there is an html check script provided. You can also review > by looking with a web browser. > > For texi files, if make info succeeds and review the resulting info > files installed in the usr/share if I recall correctly. > > Jerry -- Andre Vehreschild * Email: vehre ad gmx dot de From 4be8081974ea721301b6b60a0b5d6e2a19502a28 Mon Sep 17 00:00:00 2001 From: Andre Vehreschild Date: Thu, 20 Feb 2025 10:47:22 +0100 Subject: [PATCH] gcc-15/changes: Document coarray changes. ABI of coarrays has changed. Document possible linker errors for caf_single. --- htdocs/gcc-15/changes.html | 9 + 1 file changed, 9 insertions(+) diff --git a/htdocs/gcc-15/changes.html b/htdocs/gcc-15/changes.html index bc071501..7e5da369 100644 --- a/htdocs/gcc-15/changes.html +++ b/htdocs/gcc-15/changes.html @@ -444,6 +444,15 @@ asm (".text; %cc0: mov %cc2, %%r0; .previous;" incompatible with the module format generated by GCC 8 - 14, but GCC 15 can for compatibility still read GCC 8 - 14 created module files. + Coarray support has been reworked to allow access to components in derived + types that have not been compiled with coarray support enabled; + especially, when the derived type is in a binary only module. This has + changed the ABI and may lead to link-time errors with object files + generated with a previous GCC version and to be linked to the current + caf_single library. If this library is to be used, then + it is recommended to recompile all artifacts. The OpenCoarrays library + is not affected, because it provides backwards compatibility with the + older ABI. -- 2.48.1
Re: AArch64: Turn off outline atomics with -mcmodel=large (PR112465)
On Fri, Mar 7, 2025 at 11:07 PM Richard Sandiford wrote: > > Wilco Dijkstra writes: > > Hi Richard, > > > >>> Basically the small and large model are fundamentally incompatible. The > >>> infamous > >>> "dumb linker" approach means it doesn't try to sort sections, so an ADRP > >>> relocation > >>> will be out of reach if its data is placed after a huge array. Static > >>> linking with GLIBC or > >>> enabling outline atomics are in fact the same issue. > >> > >> This is probably a daft question, but why does the incompatibility > >> only affect static linking? Couldn't the same problem affect dynamic > >> executables if the code mixes medium and large code in an unfortunate way? > > > > Any mixing of large and small means you get the lowest common denominator. > > Statically linking with a small model library (like outline atomics or > > GLIBC) is exactly > > the same as linking in a small object. You effectively end up with max > > sizes from the > > small model with all the limitations of the large model (no PIC or PIE). > > > > I can't decode the other paragraphs at all - what exactly would you do > > differently in > > each compilation unit? > > That was also what I was trying to say. In the worst case, the linked > object has to meet the requirements of the lowest common denominator. > > And my supposition was that that isn't a property of static vs dynamic. > It applies to both (assuming that dynamic is supported by -mcmodel=large, > despite the current documentation). If you link medium and large code > together statically, the result (in general) has to be small enough for > the medium model. The same is true if you link medium and large code > together in a dynamic executable. > > So -static isn't intrinsically incompatible with -mcmodel=large. > The documentation seems to be describing a property of the way that > GLIBC is built on GNU/Linux systems, rather than a feature of -static > vs -mcmodel=large per se. I agree. Referring to these options as incompatible implies that this just wont work and there are correctness issues. They are incompatible in the sense that it depends on link order whether the ultimate elf module can be constructed or not and whether the final ELF module whether a shared object or an executable ( statically or dynamically linked) fits in the space requirements of the memory model. If it doesn't yes it's an unpleasant user experience but it's fundamentally not incompatible. Anecdotally I have very recently heard of a couple of cases where folks use linker scripts to place such sections to make this work : today IIUC they could write a linker script and use that at link time to get an application together. Some users might prefer that others might prefer other tricks and it's difficult to have a one size fits all solution. Are we leaving users without an option if the compiler won't even allow -mcmodel=large + static linking. ? I see a few users who have large applications that they are bringing on to AArch64 linux -mcmodel=large is probably the only escape hook currently until some of this is fixed in the tools and that catches up via versions in the distros. Cheers Ramana > > On aarch64-elf, everything is linked statically (unless you go to > heroic lengths to use aarch64-elf on a "full" custom OS). And that's > fine provided that everything is built with the right model. > > Thanks, > Richard
Re: Ping: [PATCH] LoongArch: combine related slli operations
在 2025/3/10 下午3:24, Zhou Zhao 写道: Ping Hi, You need to send another patch for version V2 with modifications. However, I am currently testing the value of align, and GCC15 is already in stage 4, so this patch might have to wait until the GCC16 stage 1 phase before it can be integrated. Thanks. 在 2025/1/7 下午8:45, Zhou Zhao 写道: 在 2025/1/7 下午7:49, Lulu Cheng 写道: 在 2025/1/2 下午5:46, Zhou Zhao 写道: If SImode reg is continuous left shifted twice, combine related instruction to one. gcc/ChangeLog: * config/loongarch/loongarch.md (extsv_ashlsi3): New template Hi, zhaozhou: The indentation here is wrong, it needs to be aligned with *. Hi, Lulu cheng: Thanks for your advice, I have revised. gcc/testsuite/ChangeLog: * gcc.target/loongarch/slli-1.c: New test. --- gcc/config/loongarch/loongarch.md | 13 + gcc/testsuite/gcc.target/loongarch/slli-1.c | 10 ++ 2 files changed, 23 insertions(+) create mode 100644 gcc/testsuite/gcc.target/loongarch/slli-1.c diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md index 19a22a93de3..1c62e5c02a1 100644 --- a/gcc/config/loongarch/loongarch.md +++ b/gcc/config/loongarch/loongarch.md @@ -3048,6 +3048,19 @@ (define_insn "si3_extend" [(set_attr "type" "shift") (set_attr "mode" "SI")]) +(define_insn "extsv_ashlsi3" + [(set (match_operand:DI 0 "register_operand" "=r") + (ashift:DI + (sign_extract:DI (match_operand:DI 1 "register_operand" "r") + (match_operand:SI 2 "const_uimm5_operand") + (const_int 0)) + (match_operand:SI 3 "const_uimm5_operand")))] + "TARGET_64BIT + &&(INTVAL (operands[2]) + INTVAL (operands[3])) == 0x20" + "slli.w\t%0,%1,%3" + [(set_attr "type" "shift") + (set_attr "mode" "SI")]) + (define_insn "*rotr3" [(set (match_operand:GPR 0 "register_operand" "=r,r") (rotatert:GPR (match_operand:GPR 1 "register_operand" "r,r") diff --git a/gcc/testsuite/gcc.target/loongarch/slli-1.c b/gcc/testsuite/gcc.target/loongarch/slli-1.c new file mode 100644 index 000..891d6457b12 --- /dev/null +++ b/gcc/testsuite/gcc.target/loongarch/slli-1.c @@ -0,0 +1,10 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +int +foo (int x) +{ + return (x << 2) * 8; +} + +/* { dg-final { scan-assembler-times "slli\.\[dw\]" 1} } */
Re: [PATCH] libgcc: Fix up unwind-dw2-btree.h [PR119151]
> Am 10.03.2025 um 08:16 schrieb Jakub Jelinek : > > Hi! > > The following testcase shows a bug in unwind-dw2-btree.h. > In short, the header provides lock-free btree data structure (so no parent > link on nodes, both insertion and deletion are done in top-down walks > with some locking of just a few nodes at a time so that lookups can notice > concurrent modifications and retry, non-leaf (inner) nodes contain keys > which are initially the base address of the left-most leaf entry of the > following child (or all ones if there is none) minus one, insertion ensures > balancing of the tree to ensure [d/2, d] entries filled through aggressive > splitting if it sees a full tree while walking, deletion performs various > operations like merging neighbour trees, merging into parent or moving some > nodes from neighbour to the current one). > What differs from the textbook implementations is mostly that the leaf nodes > don't include just address as a key, but address range, address + size > (where we don't insert any ranges with zero size) and the lookups can be > performed for any address in the [address, address + size) range. The keys > on inner nodes are still just address-1, so the child covers all nodes > where addr <= key unless it is covered already in children to the left. > The user (static executables or JIT) should always ensure there is no > overlap in between any of the ranges. > > In the testcase a bunch of insertions are done, always followed by one > removal, followed by one insertion of a range slightly different from the > removed one. E.g. in the first case [&code[0x50], &code[0x59]] range > is removed and then we insert [&code[0x4c], &code[0x53]] range instead. > This is valid, it doesn't overlap anything. But the problem is that some > non-leaf (inner) one used the &code[0x4f] key (after the 11 insertions > completely correctly). On removal, nothing adjusts the keys on the parent > nodes (it really can't in the top-down only walk, the keys could be many nodes > above it and unlike insertion, removal only knows the start address, doesn't > know the removed size and so will discover it only when reaching the leaf > node which contains it; plus even if it knew the address and size, it still > doesn't know what the second left-most leaf node will be (i.e. the one after > removal)). And on insertion, if nodes aren't split at a level, nothing > adjusts the inner keys either. If a range is inserted and is either fully > bellow key (keys are - 1, so having address + size - 1 being equal to key is > fine) or fully after key (i.e. address > key), it works just fine, but if > the key is in a middle of the range like in this case, &code[0x4f] is in the > middle of the [&code[0x4c], &code[0x53]] range, then insertion works fine > (we only use size on the leaf nodes), and lookup of the addresses below > the key work fine too (i.e. [&code[0x4c], &code[0x4f]] will succeed). > The problem is with lookups after the key (i.e. [&code[0x50, &code[0x53]]), > the lookup looks for them in different children of the btree and doesn't > find an entry and returns NULL. > > As users need to ensure non-overlapping entries at any time, the following > patch fixes it by adjusting keys during insertion where we know not just > the address but also size; if we find during the top-down walk a key > which is in the middle of the range being inserted, we simply increase the > key to be equal to address + size - 1 of the range being inserted. > There can't be any existing leaf nodes overlapping the range in correct > programs and the btree rebalancing done on deletion ensures we don't have > any empty nodes which would also cause problems. > > The patch adjusts the keys in two spots, once for the current node being > walked (the last hunk in the header, with large comment trying to explain > it) and once during inner node splitting in a parent node if we'd otherwise > try to add that key in the middle of the range being inserted into the > parent node (in that case it would be missed in the last hunk). > The testcase covers both of those spots, so succeeds with GCC 12 (which > didn't have btrees) and fails with vanilla GCC trunk and also fails if > either the > if (fence < base + size - 1) > fence = iter->content.children[slot].separator = base + size - 1; > or > if (left_fence >= target && left_fence < target + size - 1) > left_fence = target + size - 1; > hunk is removed (of course, only with the current node sizes, i.e. up to > 15 children of inner nodes and up to 10 entries in leaf nodes). > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk > and after a while affected rel? Ok Thanks, Richard > 2025-03-10 Jakub Jelinek > Michael Leuchtenburg > > PR libgcc/119151 > * unwind-dw2-btree.h (btree_split_inner): Add size argument. If > left_fence is in the middle of [target,size) range, increase it to > target + size - 1. > (btree_insert): Adjust btree_split_inner caller.
Re: [PATCH] contrib: Clean up outdated parts of gcc-git-customization.sh
On 07/03/2025 21:12, Jonathan Wakely wrote: > It's very unlikely that anybody is still using the old remotes/$user Git > repo setup and still needs this script to be able to migrate it to the > remotes/users/$user structure. Simplify the script by removing those > parts. > > This fixes an error that gets displayed in some circumstances: > fatal: no such section: remote.me > > contrib/ChangeLog: > > * gcc-git-customization.sh: Delete outdated commands for > migrating from very old git setups. > --- > > Tested in a fresh clone, run multiple times to check it's idempotent. > > OK for trunk? LGTM. R. > > contrib/gcc-git-customization.sh | 46 > 1 file changed, 46 deletions(-) > > diff --git a/contrib/gcc-git-customization.sh > b/contrib/gcc-git-customization.sh > index 54bd35ea1aa..e58220fb342 100755 > --- a/contrib/gcc-git-customization.sh > +++ b/contrib/gcc-git-customization.sh > @@ -156,45 +156,7 @@ if [ "x$dohook" = xyes ]; then > fi > fi > > -# Scan the existing settings to see if there are any we need to rewrite. > -vendors=$(git config --get-all "remote.${upstream}.fetch" "refs/vendors/" | > sed 's:.*refs/vendors/\([^/][^/]*\)/.*:\1:' | sort | uniq) > url=$(git config --get "remote.${upstream}.url") > -pushurl=$(git config --get "remote.${upstream}.pushurl") > -for v in $vendors > -do > -echo "Migrating vendor \"$v\" to new remote \"vendors/$v\"" > -git config --unset-all "remote.${upstream}.fetch" "refs/vendors/$v/" > -git config --unset-all "remote.${upstream}.push" "refs/vendors/$v/" > -git config "remote.vendors/${v}.url" "${url}" > -if [ "x$pushurl" != "x" ] > -then > - git config "remote.vendors/${v}.pushurl" "${pushurl}" > -fi > -git config --add "remote.vendors/${v}.fetch" > "+refs/vendors/$v/heads/*:refs/remotes/vendors/${v}/*" > -git config --add "remote.vendors/${v}.fetch" > "+refs/vendors/$v/tags/*:refs/tags/vendors/${v}/*" > -done > - > -# Convert the remote 'pfx' to users/pfx to avoid problems with ambiguous refs > -# on user branches > -old_remote=$(git config --get "remote.${old_pfx}.url") > -if [ -n "${old_remote}" ] > -then > -echo "Migrating remote \"${old_pfx}\" to new remote \"users/${new_pfx}\"" > -# Create a dummy fetch rule that will cause the subsequent prune to > remove the old remote refs. > -git config --replace-all "remote.${old_pfx}.fetch" > "+refs/empty/*:refs/remotes/${old_pfx}/*" > -# Remove any remotes > -git remote prune ${old_pfx} > -git config --remove-section "remote.${old_pfx}" > -for br in $(git branch --list "${old_pfx}/*") > -do > - old_remote=$(git config --get "branch.${br}.remote") > - if [ "${old_remote}" = "${old_pfx}" ] > - then > - git config "branch.${br}.remote" "users/${new_pfx}" > - fi > -done > -fi > - > echo "Setting up tracking for personal namespace $remote_id in > remotes/users/${new_pfx}" > git config "remote.users/${new_pfx}.url" "${url}" > if [ "x$pushurl" != "x" ] > @@ -205,12 +167,4 @@ git config --replace-all "remote.users/${new_pfx}.fetch" > "+refs/users/${remote_i > git config --replace-all "remote.users/${new_pfx}.fetch" > "+refs/users/${remote_id}/tags/*:refs/tags/users/${new_pfx}/*" > "refs/users/${remote_id}/tags/" > git config --replace-all "remote.users/${new_pfx}.push" > "refs/heads/${new_pfx}/*:refs/users/${remote_id}/heads/*" > "refs/users/${remote_id}" > > -if [ "$old_pfx" != "$new_pfx" -a "$old_pfx" != "${upstream}" ] > -then > -git config --remove-section "remote.${old_pfx}" > -fi > - > -git config --unset-all "remote.${upstream}.fetch" "refs/users/${remote_id}/" > -git config --unset-all "remote.${upstream}.push" "refs/users/${remote_id}/" > - > git fetch "users/${new_pfx}"
Re: [PATCH] arm: [MVE] Fix predicates for vec_cmp, vec_vcmpu and vcond_mask (PR 115439)
On 16/01/2025 14:20, Christophe Lyon wrote: > When compiling c-c++-common/vector-compare-3.c with > -march=armv8.1-m.main+mve+fp.dp -mfloat-abi=hard -mfpu=auto > (which enables MVE), we fail to match vcond_mask because operand 3 has > s_register_operand as predicate for a MVE_VPRED mode, but we try to > match: > (insn 26 25 27 2 (set (reg:V4SI 137) > (unspec:V4SI [ > (reg:V4SI 144) > (reg:V4SI 145) > (subreg:V4BI (reg:HI 143) 0) > ] VPSELQ_S)) > "/src/gcc/testsuite/c-c++-common/vector-compare-3.c":23:6 -1 > (nil)) > > The fix is to use the right predicate: vpr_register_operand. > > The patch also fixes vec_cmp and vec_cmpu in the same way. > > When testing with > -mthumb/-march=armv8.1-m.main+mve.fp+fp.dp/-mtune=cortex-m55/-mfloat-abi=hard/-mfpu=auto > it fixes the ICES in c-c++-common/vector-compare-3.c, > g++.dg/opt/pr79734.C, g++.dg/tree-ssa/pr50.C and > gcc.dg/tree-ssa/pr50.c > > gcc/ChangeLog > > PR target/115439 > * config/arm/mve.md (vec_vcmp, vec_vcmpu, vcond_mask): Use > vpr_register_operand predicate for MVE_VPRED operands. This is OK. But I think we should extend vpr_register_operand to allow type-punning subregs, otherwise we might end up with some redundant move operations. I'll work on a patch for that, unless you want to. R. > --- > gcc/config/arm/mve.md | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md > index 0c0337f9ee2..8527bd753e3 100644 > --- a/gcc/config/arm/mve.md > +++ b/gcc/config/arm/mve.md > @@ -4587,7 +4587,7 @@ (define_expand "mov" > ;; Expanders for vec_cmp and vcond > > (define_expand "vec_cmp" > - [(set (match_operand: 0 "s_register_operand") > + [(set (match_operand: 0 "vpr_register_operand") > (match_operator: 1 "comparison_operator" > [(match_operand:MVE_VLD_ST 2 "s_register_operand") > (match_operand:MVE_VLD_ST 3 "reg_or_zero_operand")]))] > @@ -4600,7 +4600,7 @@ (define_expand "vec_cmp" > }) > > (define_expand "vec_cmpu" > - [(set (match_operand: 0 "s_register_operand") > + [(set (match_operand: 0 "vpr_register_operand") > (match_operator: 1 "comparison_operator" > [(match_operand:MVE_2 2 "s_register_operand") > (match_operand:MVE_2 3 "reg_or_zero_operand")]))] > @@ -4614,7 +4614,7 @@ (define_expand "vec_cmpu" > (define_expand "vcond_mask_" >[(set (match_operand:MVE_VLD_ST 0 "s_register_operand") > (if_then_else:MVE_VLD_ST > - (match_operand: 3 "s_register_operand") > + (match_operand: 3 "vpr_register_operand") > (match_operand:MVE_VLD_ST 1 "s_register_operand") > (match_operand:MVE_VLD_ST 2 "s_register_operand")))] >"TARGET_HAVE_MVE"
[PATCH] s390: fix delegitimization of addresses
In legitimize_pic_address we create a (const (unspec ... UNSPEC_GOTENT)) in the GOT offset might be >= 4k. However, the s390_delegitimize_address does not contain a case for this scenario. gcc/ChangeLog: * config/s390/s390.cc (s390_delegitimize_address): Add missing case. gcc/testsuite/ChangeLog: * gcc.target/s390/delegitimize.c: New test. Bootstrapped and tested on s390x. Okay for trunk? Signed-off-by: Juergen Christ --- gcc/config/s390/s390.cc | 15 + gcc/testsuite/gcc.target/s390/delegitimize.c | 22 2 files changed, 37 insertions(+) create mode 100644 gcc/testsuite/gcc.target/s390/delegitimize.c diff --git a/gcc/config/s390/s390.cc b/gcc/config/s390/s390.cc index 29aef501fdd2..535659ee5181 100644 --- a/gcc/config/s390/s390.cc +++ b/gcc/config/s390/s390.cc @@ -8218,6 +8218,21 @@ s390_delegitimize_address (rtx orig_x) return plus_constant (Pmode, XVECEXP (y, 0, 0), offset); } + if (GET_CODE (x) == CONST) +{ + /* Extract the symbol ref from: +(const:DI (unspec:DI [(symbol_ref:DI ("foo"))] + UNSPEC_PLT/GOTENT)) */ + + y = XEXP (x, 0); + if (GET_CODE (y) == UNSPEC + && (XINT (y, 1) == UNSPEC_GOTENT + || XINT (y, 1) == UNSPEC_PLT31)) + return XVECEXP (y, 0, 0); + else + return orig_x; +} + if (GET_CODE (x) != MEM) return orig_x; diff --git a/gcc/testsuite/gcc.target/s390/delegitimize.c b/gcc/testsuite/gcc.target/s390/delegitimize.c new file mode 100644 index ..bf143745dca2 --- /dev/null +++ b/gcc/testsuite/gcc.target/s390/delegitimize.c @@ -0,0 +1,22 @@ +/* { dg-do compile } */ +/* { dg-options "-nostdinc -std=gnu11 -fshort-wchar -funsigned-char -fno-common -fno-PIE -fno-strict-aliasing -m64 -fPIC -mpacked-stack -mbackchain -msoft-float -march=z13 -mtune=z13 -mindirect-branch=thunk-extern -mfunction-return=thunk-extern -mindirect-branch-table -DCC_USING_EXPOLINE -pipe -Wno-sign-compare -fno-asynchronous-unwind-tables -DCONFIG_AS_CFI_VAL_OFFSET=1 -fno-delete-null-pointer-checks -O2 -fno-allow-store-data-races -fno-stack-protector -ftrivial-auto-var-init=zero -fno-stack-clash-protection -pg -mrecord-mcount -mnop-mcount -mfentry -fno-inline-functions-called-once -fmin-function-alignment=8 -fstrict-flex-arrays=3 -fno-strict-overflow -fno-stack-check -fconserve-stack -Wall -Wundef -Werror=implicit-function-declaration -Werror=implicit-int -Werror=return-type -Werror=strict-prototypes -Wno-format-security -Wno-trigraphs -Wno-frame-address -Wno-address-of-packed-member -Wmissing-declarations -Wmissing-prototypes -Wframe-larger-than=2048 -Wno-main -Wno-dangling-pointer -Wvla -Wno-pointer-sign -Wcast-function-type -Wno-stringop-overflow -Wno-array-bounds -Wno-alloc-size-larger-than -Wimplicit-fallthrough=5 -Werror=date-time -Werror=incompatible-pointer-types -Werror=designated-init -Wenum-conversion -Wextra -Wunused -Wno-unused-but-set-variable -Wno-unused-const-variable -Wno-packed-not-aligned -Wno-format-overflow -Wno-format-truncation -Wno-stringop-truncation -Wno-override-init -Wno-missing-field-initializers -Wno-type-limits -Wno-shift-negative-value -Wno-maybe-uninitialized -Wno-sign-compare -Wno-unused-parameter -g -gdwarf-4 -fdump-rtl-final-details" } */ + +struct sk_buff { + struct { +struct { + struct { +int inner_ipproto; + }; +}; + }; +}; +void skb_udp_tunnel_segment(struct sk_buff *skb); +const int *inet_offloads[42], *inet6_offloads[42]; +_Bool skb_udp_tunnel_segment_is_ipv6; +void skb_udp_tunnel_segment(struct sk_buff *skb) { + const int **offloads = + skb_udp_tunnel_segment_is_ipv6 ? inet6_offloads : inet_offloads; + *(volatile typeof(_Generic(0, default : 0)) *)&offloads[skb->inner_ipproto]; +} + +/* { dg-final { scan-rtl-dump-not "Failed to expand as dwarf:" "final" } } */ -- 2.43.5
Re: c++/modules: Stream section, tls_model, and comdat_group
On 3/10/25 9:52 AM, Nathaniel Shead wrote: Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk? Or should this wait for GCC16? -- >8 -- While looking at PR c++/119154 I noticed some more properties that we don't stream or check properly. This patch adds the ones we were missing, and adds checks that the values don't clash with existing decls. These seem to me like properties that should be recomputed rather than streamed; aren't we already streaming the section attribute? This comment also applies to the existing streaming of tls_model. There's probably a lot more duplicate_decls-like work that should be done in is_matching_decl (notably decl attributes?) but this is at least a slight improvement to the status-quo. gcc/cp/ChangeLog: * module.cc (trees_out::core_vals): Stream TLS model, section name, and comdat group. (trees_in::core_vals): Read them. (trees_out::decl_value): No longer need to stream TLS model. (trees_in::decl_value): Likewise; free symtab node of discarded decls when we're done with them. (trees_in::is_matching_decl): Add checks for corresponding TLS model and section name. gcc/testsuite/ChangeLog: * g++.dg/modules/attrib-3_a.C: New test. * g++.dg/modules/attrib-3_b.C: New test. Signed-off-by: Nathaniel Shead --- gcc/cp/module.cc | 104 +++--- gcc/testsuite/g++.dg/modules/attrib-3_a.C | 9 ++ gcc/testsuite/g++.dg/modules/attrib-3_b.C | 14 +++ 3 files changed, 117 insertions(+), 10 deletions(-) create mode 100644 gcc/testsuite/g++.dg/modules/attrib-3_a.C create mode 100644 gcc/testsuite/g++.dg/modules/attrib-3_b.C diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc index 8e0fa3c5bfc..9aa86c939f6 100644 --- a/gcc/cp/module.cc +++ b/gcc/cp/module.cc @@ -6399,6 +6399,9 @@ trees_out::core_vals (tree t) /* Decls. */ case VAR_DECL: + if (streaming_p ()) + if (CP_DECL_THREAD_LOCAL_P (t)) + u (decl_tls_model (t)); if (DECL_CONTEXT (t) && TREE_CODE (DECL_CONTEXT (t)) != FUNCTION_DECL) { @@ -6691,6 +6694,22 @@ trees_out::core_vals (tree t) break; } + if (VAR_OR_FUNCTION_DECL_P (t) + && (DECL_EXTERNAL (t) || TREE_PUBLIC (t) || TREE_STATIC (t))) +{ + if (streaming_p ()) + { + const char *name = DECL_SECTION_NAME (t); + size_t len = name ? strlen (name) : 0; + str (name, len); + } + + tree comdat_group = NULL_TREE; + if (DECL_ONE_ONLY (t)) + comdat_group = symtab_node::get (t)->get_comdat_group (); + WT (comdat_group); +} + if (CODE_CONTAINS_STRUCT (code, TS_TYPED)) { /* We want to stream the type of a expression-like nodes /after/ @@ -6933,6 +6952,11 @@ trees_in::core_vals (tree t) /* Decls. */ case VAR_DECL: + if (CP_DECL_THREAD_LOCAL_P (t)) + { + enum tls_model model = tls_model (u()); + set_decl_tls_model (t, model); + } if (DECL_CONTEXT (t) && TREE_CODE (DECL_CONTEXT (t)) != FUNCTION_DECL) { @@ -7227,6 +7251,23 @@ trees_in::core_vals (tree t) break; } + if (VAR_OR_FUNCTION_DECL_P (t) + && (DECL_EXTERNAL (t) || TREE_PUBLIC (t) || TREE_STATIC (t))) +{ + size_t section_len = 0; + const char *section_name = str (§ion_len); + if (section_len) + set_decl_section_name (t, section_name); + + if (tree comdat_group = tree_node ()) + { + if (TREE_CODE (t) == FUNCTION_DECL) + cgraph_node::get_create (t)->set_comdat_group (comdat_group); + else + varpool_node::get_create (t)->set_comdat_group (comdat_group); + } +} + if (CODE_CONTAINS_STRUCT (code, TS_TYPED)) { tree type = tree_node (); @@ -8316,9 +8357,6 @@ trees_out::decl_value (tree decl, depset *dep) decl, cloned_p ? "" : "not "); } - if (streaming_p () && VAR_P (decl) && CP_DECL_THREAD_LOCAL_P (decl)) -u (decl_tls_model (decl)); - if (streaming_p ()) dump (dumper::TREE) && dump ("Written decl:%d %C:%N", tag, TREE_CODE (decl), decl); @@ -8757,6 +8795,12 @@ trees_in::decl_value () DECL_CONTEXT (parm) = e_inner; } + /* We will not need to track the comdat group etc of the +to-be-discarded decl, so remove it from the symbol table. */ + if (VAR_OR_FUNCTION_DECL_P (inner)) + if (struct symtab_node *snode = symtab_node::get (inner)) + snode->remove (); + /* And our result is the existing node. */ decl = existing; } @@ -8802,13 +8846,6 @@ trees_in::decl_value () } } - if (VAR_P (decl) && CP_DECL_THREAD_LOCAL_P (decl)) -{ - enum tls_model model = tls_model (u ()); - if (is_new) - set_decl_tls_model (decl, model); -} - i
[pushed][PR114991][IRA]: Improve reg equiv invariant calculation
The following patch solves performance issue mentioned in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114991 The patch was successfully bootstrapped and tested on x86_64, aarch64, ppc64le. commit e355fe414aa3aaf215c7dd9dd789ce217a1b458c Author: Vladimir N. Makarov Date: Mon Mar 10 16:26:59 2025 -0400 [PR114991][IRA]: Improve reg equiv invariant calculation In PR test case IRA preferred to allocate hard reg to a pseudo instead of its equivalence. This resulted in allocating caller-saved hard reg and generating save/restore insns in the function prologue/epilogue. The equivalence is an invariant (stack pointer plus offset) and the pseudo is used mostly as memory address. This happened as there was no simplification of insn after the invariant substitution. The patch adds the necessary code. gcc/ChangeLog: PR target/114991 * ira-costs.cc (equiv_can_be_consumed_p): Add new argument invariant_p. Add code for dealing with the invariant. (calculate_equiv_gains): Don't consider init insns. Pass the new argument to equiv_can_be_consumed_p. Don't treat invariant as memory. gcc/testsuite/ChangeLog: PR target/114991 * gcc.target/aarch64/pr114991.c: New test. diff --git a/gcc/ira-costs.cc b/gcc/ira-costs.cc index a404e9f2690..b568c7d0326 100644 --- a/gcc/ira-costs.cc +++ b/gcc/ira-costs.cc @@ -1788,11 +1788,27 @@ validate_autoinc_and_mem_addr_p (rtx x) MEM_ADDR_SPACE (x))); } -/* Check that reg REGNO can be changed by TO in INSN. Return true in case the - result insn would be valid one. */ +/* Check that reg REGNO in INSN can be changed by TO (which is an invariant + equiv when INVARIANT_P is true). Return true in case the result insn would + be valid one. */ static bool -equiv_can_be_consumed_p (int regno, rtx to, rtx_insn *insn) +equiv_can_be_consumed_p (int regno, rtx to, rtx_insn *insn, bool invariant_p) { + if (invariant_p) +{ + /* We use more expensive code for the invariant because we need to + simplify the result insn as the invariant can be arithmetic rtx + inserted into another arithmetic rtx. */ + rtx pat = PATTERN (insn); + int code = INSN_CODE (insn); + PATTERN (insn) = copy_rtx (pat); + PATTERN (insn) + = simplify_replace_rtx (PATTERN (insn), regno_reg_rtx[regno], to); + bool res = !insn_invalid_p (insn, false); + PATTERN (insn) = pat; + INSN_CODE (insn) = code; + return res; +} validate_replace_src_group (regno_reg_rtx[regno], to, insn); /* We can change register to equivalent memory in autoinc rtl. Some code including verify_changes assumes that autoinc contains only a register. @@ -1910,6 +1926,14 @@ calculate_equiv_gains (void) || !get_equiv_regno (PATTERN (insn), regno, subreg) || !bitmap_bit_p (&equiv_pseudos, regno)) continue; + + rtx_insn_list *x; + for (x = ira_reg_equiv[regno].init_insns; x != NULL; x = x->next ()) + if (insn == x->insn ()) + break; + if (x != NULL) + continue; /* skip equiv init insn */ + rtx subst = ira_reg_equiv[regno].memory; if (subst == NULL) @@ -1919,13 +1943,17 @@ calculate_equiv_gains (void) ira_assert (subst != NULL); mode = PSEUDO_REGNO_MODE (regno); ira_init_register_move_cost_if_necessary (mode); - bool consumed_p = equiv_can_be_consumed_p (regno, subst, insn); + bool consumed_p + = equiv_can_be_consumed_p (regno, subst, insn, + subst == ira_reg_equiv[regno].invariant); rclass = pref[COST_INDEX (regno)]; if (MEM_P (subst) /* If it is a change of constant into double for example, the result constant probably will be placed in memory. */ - || (subreg != NULL_RTX && !INTEGRAL_MODE_P (GET_MODE (subreg + || (ira_reg_equiv[regno].invariant == NULL + && subreg != NULL_RTX + && !INTEGRAL_MODE_P (GET_MODE (subreg cost = ira_memory_move_cost[mode][rclass][1] + (consumed_p ? 0 : 1); else if (consumed_p) continue; diff --git a/gcc/testsuite/gcc.target/aarch64/pr114991.c b/gcc/testsuite/gcc.target/aarch64/pr114991.c new file mode 100644 index 000..d3c7bd131dd --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/pr114991.c @@ -0,0 +1,15 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fno-shrink-wrap" } */ + +typedef struct { int arr[16]; } S; + +void g (S *); +void h (S); +void f(int x) +{ + S s; + g (&s); + h (s); +} + +/* { dg-final { scan-assembler-not "\[ \t\]?str\[ \t\]x" } } */
Re: The COBOL front end, version 3, now in 14 easy pieces
On Mon, 10 Mar 2025 18:05:21 +0100 Jakub Jelinek wrote: > Designated initializers are C++20, so you should just avoid that. So, > I'd recommend just: > cbl_field_data_t data = { /* memsize= */capacity_cast(len), > /* capacity= */capacity_cast(len), > /* digits= */0, > /* rdigits= */0, > /* initial= */reinterpret_cast > (blob), /* picture= */reinterpret_cast(blob) }; Very well. Now that we're in the tree and official maintainers, this will be a good learning experience for doing things the right way within the gcc repository. IIUC, we should remove all designated initializers ASAP, definitely before gcc-15 is branched. I think I can have repairs made this week. There was some hiccup setting up my write access, but I would hope that will be resolved soon enough, and Richard will be the first to know. --jkl
New German PO file for 'cpplib' (version 15-b20250216)
Hello, gentle maintainer. This is a message from the Translation Project robot. A revised PO file for textual domain 'cpplib' has been submitted by the German team of translators. The file is available at: https://translationproject.org/latest/cpplib/de.po (This file, 'cpplib-15-b20250216.de.po', has just now been sent to you in a separate email.) All other PO files for your package are available in: https://translationproject.org/latest/cpplib/ Please consider including all of these in your next release, whether official or a pretest. Whenever you have a new distribution with a new version number ready, containing a newer POT file, please send the URL of that distribution tarball to the address below. The tarball may be just a pretest or a snapshot, it does not even have to compile. It is just used by the translators when they need some extra translation context. The following HTML page has been updated: https://translationproject.org/domain/cpplib.html If any question arises, please contact the translation coordinator. Thank you for all your work, The Translation Project robot, in the name of your translation coordinator.
Contents of PO file 'cpplib-15-b20250216.de.po'
cpplib-15-b20250216.de.po.gz Description: Binary data The Translation Project robot, in the name of your translation coordinator.
Re: [patch, fortran] Fix PR 119078, putting a procedure in an abstract interface into global namespace
Hi Thomas, Am 10.03.25 um 21:01 schrieb Thomas Koenig: Hello world, the attached patch makes sure that procedures from abstract interfaces and dummy arguments are not put into the global symbol table, and are not checked against global symbols. Regression-tested. OK for trunk? Best regards Thomas Abstract interfaces and dummy arguments are not global. gcc/fortran/ChangeLog: PR fortran/119078 * frontend-passes.cc (check_against_globals): Do not check for abstract interfaces or dummy arguments. * resolve.cc (gfc_verify_binding_labels): Adjust comment. Do not put abstract interfaces or dummy argument into global namespace. gcc/testsuite/ChangeLog: PR fortran/119078 * gfortran.dg/interface_58.f90: New test the patch looks basically fine but I cannot find the testcase. Cheers, Harald
Re: [RFC] [C]New syntax for the argument of counted_by attribute for C language
On 10 Mar 2025, at 15:34, Martin Uecker wrote: > Am Montag, dem 10.03.2025 um 15:00 -0400 schrieb John McCall: >> That said, my preference is still to just give preference to the field name, >> which sidesteps any need for disambiguation syntax and avoids this whole >> problem where structs can be broken by just adding a global variable that >> happens to collide with a field. > > I don't think it is a good idea when the 'n' in 'buf' refers to the > previous global 'n' coming before and the 'n' in attribute > refers to a member 'n' coming later in the following example. I agree. I think compilers ought to diagnose this with a warning, and arguably it should be invalid under the standard. (It would be very easy to do this; you just keep a set of global declarations referenced by expressions within the struct body and diagnose if any of them collide with fields when the body is complete. That is not a particularly expensive thing to track, most importantly because the number of declarations referenced by expressions in a struct body is modally zero.) Fortunately, this sort of thing is largely theoretical in actual code today. Unfortunately, it is not even slightly theoretical with bounds-safety attributes. > And neither global names nor struct members may always be under > the control of the programmer. A programmer adding an attribute to a struct field can certainly declare a new global constant immediately before the struct. > Also that simply bringing > a new identifier into scope can break code elsewhere worries me. Yes, but this is only a problem if we diagnose an ambiguity. There is no such problem with just preferring a struct field, unless you’re equally concerned about shadowing whenever you create a local variable. > Finally, the following does not even compile in C++. > > struct foo { > char buf[n]; > const static int n = 2; > }; And I am not suggesting that it should. Anything that would change the type of a declaration really has to continue to be processed immediately. But code being invalid does not create inconsistency. Non-constant array bounds are invalid in structs, but that is not inconsistent, it’s a limitation in expressivity. What’s important is that code be consistently interpreted everywhere it is valid. > While the next example is also ok in C++. > > constexpr int n = 2; > > struct foo { > char buf[n]; > }; > > With both declarations of 'n' the example has UB in C++. > So I am not convinced the proposed rules make a lot > of sense for C++ either. If C required a diagnostic in your first example, it would actually put a fair amount of pressure on the C++ committee to get rid of this spurious UB rule. > I still think one could use designator syntax, i.e. '.n', which > would be clearer and intuitive for both C and C++ programmers. This doesn’t really solve the ambiguity problem. If n is a field name, a programmer who writes __counted_by(n) almost certainly means to name the field. “The proper syntax is .n” is the cause of the bug, not its solution. John.
Re: [COMMITTED] Sanitizer: Mention -g option in documentation [PR56682]
Hi! On Fri, 7 Mar 2025 at 23:57, Sandra Loosemore wrote: > > gcc/ChangeLog > PR sanitizer/56682 > * doc/invoke.texi (Instrumentation Options): Document that -g > is useful with -fsanitize=thread and -fsanitize=address. > Also mention -fno-omit-frame-pointer per the asan wiki. > --- > gcc/doc/invoke.texi | 9 - > 1 file changed, 8 insertions(+), 1 deletion(-) > > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi > index c61f3514070..e01d64c5229 100644 > --- a/gcc/doc/invoke.texi > +++ b/gcc/doc/invoke.texi > @@ -17861,12 +17861,16 @@ The option cannot be combined with > @option{-fsanitize=thread} or > (only with @code{-mlam=u48} or @code{-mlam=u57} options) and AArch64, > in both cases only in ABIs with 64-bit pointers. > > +When compiling with @option{-fsanitize=address}, you should also also I think you made a typo: "also also" ? Thanks, Christophe > +use @option{-g} to produce more meaningful output. > To get more accurate stack traces, it is possible to use options such as > @option{-O0}, @option{-O1}, or @option{-Og} (which, for instance, prevent > most function inlining), @option{-fno-optimize-sibling-calls} (which prevents > optimizing sibling and tail recursive calls; this option is implicit for > @option{-O0}, @option{-O1}, or @option{-Og}), or @option{-fno-ipa-icf} (which > -disables Identical Code Folding for functions). Since multiple runs of the > +disables Identical Code Folding for functions). > +Using @option{-fno-omit-frame-pointer} also improves stack traces. > +Since multiple runs of the > program may yield backtraces with different addresses due to ASLR (Address > Space Layout Randomization), it may be desirable to turn ASLR off. On Linux, > this can be achieved with @samp{setarch `uname -m` -R ./prog}. > @@ -17972,6 +17976,9 @@ supported options. > The option cannot be combined with @option{-fsanitize=address}, > @option{-fsanitize=leak}. > > +When compiling with @option{-fsanitize=thread}, you should also use > +@option{-g} to produce more meaningful output. > + > Note that sanitized atomic builtins cannot throw exceptions when > operating on invalid memory addresses with non-call exceptions > (@option{-fnon-call-exceptions}). > -- > 2.34.1 >
[committed] libphobos: Default to libc closefrom in spawnProcessPosix [PR119112]
Hi, This is a backport of two changes in upstream Phobos. - The current implementation of spawnProcessPosix is broken on systems with a large `ulimit -n' because it always OOMs making it impossible to spawn processes. Using the libc implementation, when available, for doing file descriptor operations en-mass solves this problem. - The fallback code now reads the list of file descriptors in use from /dev/fd or /proc/this/fd, avoiding the need to scroll through the entire list of possible file descriptors. This fixes the fork process being very slow and memory intensive when RLIMIT_NOFILE is too high. Bootstrapped and regression tested on x86_64-linux-gnu/-m32, committed to the releases/gcc-14 branch. Regards, Iain. --- PR d/119112 libphobos/ChangeLog: * libdruntime/core/sys/freebsd/unistd.d (closefrom): Add binding. * libdruntime/core/sys/linux/unistd.d (closefrom): Likewise. * libdruntime/core/sys/openbsd/unistd.d (closefrom): Likewise. * src/std/process.d (enum InternalError): Add closefds_dup2. (spawnProcessPosix): Use closefrom when available. Reviewed-on: https://github.com/dlang/phobos/pull/9048 Reviewed-on: https://github.com/dlang/phobos/pull/9077 --- .../libdruntime/core/sys/freebsd/unistd.d | 2 + libphobos/libdruntime/core/sys/linux/unistd.d | 4 + .../libdruntime/core/sys/openbsd/unistd.d | 2 + libphobos/src/std/process.d | 194 ++ 4 files changed, 158 insertions(+), 44 deletions(-) diff --git a/libphobos/libdruntime/core/sys/freebsd/unistd.d b/libphobos/libdruntime/core/sys/freebsd/unistd.d index 493cda1c8c9..ebb7afa2726 100644 --- a/libphobos/libdruntime/core/sys/freebsd/unistd.d +++ b/libphobos/libdruntime/core/sys/freebsd/unistd.d @@ -17,3 +17,5 @@ extern(C): nothrow: int getosreldate() pure @trusted; + +void closefrom(int); diff --git a/libphobos/libdruntime/core/sys/linux/unistd.d b/libphobos/libdruntime/core/sys/linux/unistd.d index faa226cf407..548870268db 100644 --- a/libphobos/libdruntime/core/sys/linux/unistd.d +++ b/libphobos/libdruntime/core/sys/linux/unistd.d @@ -5,6 +5,7 @@ public import core.sys.posix.unistd; version (linux): extern(C): nothrow: +@nogc: // Additional seek constants for sparse file handling // from Linux's unistd.h, stdio.h, and linux/fs.h @@ -21,3 +22,6 @@ char* getpass(const(char)* prompt); // Exit all threads in a process void exit_group(int status); + +/// Close all open file descriptors greater or equal to `lowfd` +void closefrom(int lowfd); diff --git a/libphobos/libdruntime/core/sys/openbsd/unistd.d b/libphobos/libdruntime/core/sys/openbsd/unistd.d index 4232c036049..9618543d8da 100644 --- a/libphobos/libdruntime/core/sys/openbsd/unistd.d +++ b/libphobos/libdruntime/core/sys/openbsd/unistd.d @@ -19,3 +19,5 @@ int getthrname(pid_t, char*, size_t); int pledge(const scope char*, const scope char*); int setthrname(pid_t, const scope char*); int unveil(const scope char*, const scope char*); + +int closefrom(int); diff --git a/libphobos/src/std/process.d b/libphobos/src/std/process.d index 494910f3535..3b892217917 100644 --- a/libphobos/src/std/process.d +++ b/libphobos/src/std/process.d @@ -872,6 +872,7 @@ version (Posix) private enum InternalError : ubyte doubleFork, malloc, preExec, +closefds_dup2, } /* @@ -1000,7 +1001,7 @@ private Pid spawnProcessPosix(scope const(char[])[] args, if (config.flags & Config.Flags.detached) close(pidPipe[0]); close(forkPipe[0]); -immutable forkPipeOut = forkPipe[1]; +auto forkPipeOut = forkPipe[1]; immutable pidPipeOut = pidPipe[1]; // Set the working directory. @@ -1034,56 +1035,157 @@ private Pid spawnProcessPosix(scope const(char[])[] args, if (!(config.flags & Config.Flags.inheritFDs)) { -// NOTE: malloc() and getrlimit() are not on the POSIX async -// signal safe functions list, but practically this should -// not be a problem. Java VM and CPython also use malloc() -// in its own implementation via opendir(). -import core.stdc.stdlib : malloc; -import core.sys.posix.poll : pollfd, poll, POLLNVAL; -import core.sys.posix.sys.resource : rlimit, getrlimit, RLIMIT_NOFILE; - -// Get the maximum number of file descriptors that could be open. -rlimit r; -if (getrlimit(RLIMIT_NOFILE, &r) != 0) -{ -abortOnError(forkPipeOut, InternalError.getrlimit, .errno); -} -immutable maxDescriptors = cast(int) r.rlim_cur; +version (FreeBSD) +import core.sys.freebsd.unistd : closefrom; +else version (OpenBSD) +import core.sys.openbsd.unistd : closefrom; -// The above, less stdin, stdout, a