C++ patch ping
Hi! I'd like to ping the following patches libcpp: __VA_OPT__ p1042r1 placemarker changes [PR101488] https://gcc.gnu.org/pipermail/gcc-patches/2021-July/575621.html together with your https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577602.html incremental patch (successfully tested on x86_64-linux and i686-linux). libcpp, v2: Implement C++23 P1949R7 - C++ Identifier Syntax using Unicode Standard Annex 31 [PR100977] https://gcc.gnu.org/pipermail/gcc-patches/2021-August/576854.html The incremental patch for splitting tokens at unsupported characters withdrawn, the above is the base patch. Thanks. Jakub
Re: [PATCH take 3] Experimental -fpreserve-traps option
On Sun, Aug 29, 2021 at 11:28 AM Roger Sayle wrote: > > > This is another attempt to add an -fpreserve-traps command line > option to GCC. Many thanks to Richard Beiner for approving the > code clean-up pieces of my earlier submission: This revision > contains just the actual functional change(s). > > My recent attempt at humour in comment #8 of PR 77980 has > reminded me that we still really need to tackle/improve the > semantics of trapping instructions. > > Firstly, I completely agree with Richard (and Eric) that the > current situation with the semantics of -fexceptions and > -fnon-call-exceptions (especially the latter without the > former) is a mess. Alas, the testsuite (and source) is full > of checks that the current status quo is preserved, so any > attempts to tweak exception handling leads to a world of pain. I'm giving the following a try now which at least would remove the '-fno-exceptions -fnon-call-exceptions' state (it then might ask for unification of the flag_ vars to an enum, but that would be for a followup). diff --git a/gcc/common.opt b/gcc/common.opt index ed8ab5fbe13..7d69ab5ef7c 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -1509,7 +1509,7 @@ Common Var(flag_emit_class_debug_always) Init(0) Do not suppress C++ class debug information. fexceptions -Common Var(flag_exceptions) Optimization +Common Var(flag_exceptions) Optimization EnabledBy(fnon-call-exceptions) Enable exception handling. fexpensive-optimizations > The most relevant thing that I would like to point out is that > this proposed option, -fpreserve-traps, and the semantics of > trapping instructions are completely orthogonal to exception > handling. As the term EH implies, exceptions are a form of flow > control (on those targets that support it). Traps, signals(?), > segmentation faults, illegal accesses and invalid instructions > are things that can't (always?) be controlled. Exceptional yes, > but not handleable. Are they really independent? Consider -fnon-call-exceptions where trapping instructions (may) raise exceptions - now with your change (ENOPATCH btw...) any such instructions would be considered having side-effects despite the 'trapping' bit now covered by EH IL. Now - individual stmts can be marked as known to not throw, what does this now mean for its trapping state? How do -fpreserve-traps and -fdelete-dead-exceptions interact here? I think that going down the -fpreserve-traps way opens a whole can of worms with respects to users doing weird things and us having to maintain sth not well thought out - so IMHO adding a user-visible option shouldn't be the first thing to do. IIRC we got -ftrapv because there were frontends requiring its semantics (Java)? But even -ftrapv is utterly broken and only "works" in very small circumstances (SImode and DImode operations). But now -ftrapv interacts with -fpreserve-traps as well given it makes some (not very well defined) operations possibly trapping. What does preserving a trap mean btw, does it require to preserve the exact trapping instruction (a trap handler might inspect the trapping instruction after all)? Or are we allowed to replace a known to be trapping instruction by sth else that also traps? I suppose traps will be non-recoverable (you have -fnon-call-exceptions and EH as the only mechanism to resume after a trap). Does -fpreserve-traps imply that a trap is now sth well-defined and that for example a division by zero is no longer considered undefined behavior (with all consequences for VRP and thus removing unused divisions)? Is any operation triggering undefined behavior now possibly trapping? In the end the main question to ask is probably what the motivation behind adding -fpreserve-traps is? Is it making a quite vaguely defined part of GCCs behavior properly defined? Is it giving users much needed control about "preserving traps"? I hoped the one jumping on the "traps" bandwagon would have tackled -ftrapv first (thankfully very few references in GCC itself remain), by nuking flag_trapv from the middle-end and reflecting -ftrapv to the IL from the two FEs it is exposed to (C and C++). We did have this situation with adding some extra option, only confusing things further, in the past - not sure if you remember -fstrict-overflow which added a third kind of overflow behavior. Thanks, Richard. > I also agree with Richard's sentiment that GCC should aim to > store semantics in the representation, not in global variables. > Interestingly, whether a tree or rtx can trap has been explicitly > represented in every allocated node for decades; it's what the > compiler should do (or is allowed to do) with that information > that has been missing. > > My proposal is to treat -fpreserve-traps as an experiment for > the time being. Allowing all the places in the compiler that > make assumptions about traps to be found/documented (by searching > for flag_perserve_traps in the source code), and their influence > on performance be
[PATCH] Make sure -fexceptions is enabled when -fnon-call-exceptions is
This makes -fexceptions enabled by -fnon-call-exceptions, removing the odd state of !flag_exceptions && flag_non_call_exceptions from middle-end consideration. Bootstrapped and tested on x86_64-unknown-linux-gnu for all languages. OK? Richard. 2021-08-30 Richard Biener * common.opt (fexceptions): Mark EnabledBy(fnon-call-exceptions). * doc/invoke.texi (fnon-call-exceptions): Document this enables -fexceptions. --- gcc/common.opt | 2 +- gcc/doc/invoke.texi | 3 ++- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/gcc/common.opt b/gcc/common.opt index ed8ab5fbe13..7d69ab5ef7c 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -1509,7 +1509,7 @@ Common Var(flag_emit_class_debug_always) Init(0) Do not suppress C++ class debug information. fexceptions -Common Var(flag_exceptions) Optimization +Common Var(flag_exceptions) Optimization EnabledBy(fnon-call-exceptions) Enable exception handling. fexpensive-optimizations diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index b83bd902cec..f7bb193b51d 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -16410,7 +16410,8 @@ Note that this requires platform-specific runtime support that does not exist everywhere. Moreover, it only allows @emph{trapping} instructions to throw exceptions, i.e.@: memory references or floating-point instructions. It does not allow exceptions to be thrown from -arbitrary signal handlers such as @code{SIGALRM}. +arbitrary signal handlers such as @code{SIGALRM}. This enables +@option{-fexceptions}. @item -fdelete-dead-exceptions @opindex fdelete-dead-exceptions -- 2.31.1
Re: [PATCH] Make sure -fexceptions is enabled when -fnon-call-exceptions is
> This makes -fexceptions enabled by -fnon-call-exceptions, removing > the odd state of !flag_exceptions && flag_non_call_exceptions from > middle-end consideration. FWIW fine with me. -- Eric Botcazou
Re: [PATCH][pushed] Add -fprofile-reproducible=parallel-runs to STAGEfeedback_CFLAGS to Makefile.tpl.
On 8/27/21 19:45, Gleb Fotengauer-Malinovskiy wrote: Hi, On Thu, Mar 11, 2021 at 04:19:51PM +0100, Martin Liška wrote: Pushed as obvious, the original change was done in g:e05a117dc4b98f3ac60851608f532ba7cee7343a. Martin ChangeLog: * Makefile.tpl: The change was done Makefile.in which is generated file. --- Makefile.tpl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile.tpl b/Makefile.tpl index 3b88f351d5b..6e0337fb48f 100644 --- a/Makefile.tpl +++ b/Makefile.tpl @@ -488,7 +488,7 @@ STAGEprofile_TFLAGS = $(STAGE2_TFLAGS) STAGEtrain_CFLAGS = $(filter-out -fchecking=1,$(STAGE3_CFLAGS)) STAGEtrain_TFLAGS = $(filter-out -fchecking=1,$(STAGE3_TFLAGS)) -STAGEfeedback_CFLAGS = $(STAGE4_CFLAGS) -fprofile-use +STAGEfeedback_CFLAGS = $(STAGE4_CFLAGS) -fprofile-use -fprofile-reproducible=parallel-runs Did you mean to add -fprofile-reproducible flag to STAGEprofile_CFLAGS instead? No. I suppose this flag doesn't mean anything without -fprofile-generate, which is passed through STAGEprofile_CFLAGS. The flag -fprofile-reproducible is used when we use a collected profile, so it works *only* with -fprofile-use flag. Thanks for heads up! Cheers, Martin Thanks,
Re: [PATCH] libffi: Fix MIPS r6 support
在 2021/8/30 5:00, Jeff Law 写道: On 8/28/2021 1:23 AM, Xi Ruoyao wrote: On Fri, 2021-08-27 at 15:28 -0600, Jeff Law via Gcc-patches wrote: On 8/26/2021 10:58 PM, YunQiang Su wrote: for some instructions, MIPS r6 uses different encoding other than the previous releases. 1. mips/n32.S disable .set mips4: since it casuses old insn encoding is used. https://github.com/libffi/libffi/pull/396 2. mips/ffi.c: the encoding for JR is hardcoded: we need to use different value for r6 and pre-r6. https://github.com/libffi/libffi/pull/401 libffi/ PR libffi/83636 * src/mips/n32.S: disable .set mips4 * src/mips/ffi.c: use different JR encoding for r6. These should go to the upstream libffi project. Once accepted there you can add them to GCC. Hi Jeff, The two PRs are already merged, and released since libffi-3.3.0 (now the upstream latest release is 3.4.2). ACK. Thanks for confirming. It's always OK to cherrypick/backport from libffi back to GCC. I don't have a MIPSr6 so I can't test though. Understood. Me neither, but I really should get a tiny chroot for mipsr6 so that my tester can validate it regularly. We have port Debian to MIPS64r6el. http://58.246.137.130:20180/tools/tarball/ You can use both buster(Debian 10) or bullseye (Debian 11). And both qemu-system and qemu-user can work. Jeff .
Re: [PATCH v3] Fix incomplete computation in fill_always_executed_in_1
On 2021/8/27 15:45, Richard Biener wrote: On Thu, 26 Aug 2021, Xionghu Luo wrote: On 2021/8/24 16:20, Richard Biener wrote: On Tue, 24 Aug 2021, Xionghu Luo wrote: On 2021/8/19 20:11, Richard Biener wrote: - class loop *inn_loop = loop; if (ALWAYS_EXECUTED_IN (loop->header) == NULL) { @@ -3232,19 +3231,6 @@ fill_always_executed_in_1 (class loop *loop, sbitmap contains_call) to disprove this if possible). */ if (bb->flags & BB_IRREDUCIBLE_LOOP) break; - - if (!flow_bb_inside_loop_p (inn_loop, bb)) - break; - - if (bb->loop_father->header == bb) - { - if (!dominated_by_p (CDI_DOMINATORS, loop->latch, bb)) - break; - - /* In a loop that is always entered we may proceed anyway. -But record that we entered it and stop once we leave it. */ - inn_loop = bb->loop_father; - } } while (1) I'm not sure this will work correct (I'm not sure how the existing code makes it so either...). That said, I can't poke any hole into the change. What I see is that definitely if (dominated_by_p (CDI_DOMINATORS, loop->latch, bb)) last = bb; if (bitmap_bit_p (contains_call, bb->index)) break; doesn't work reliably since the DOM ordering will process blocks A B and C in random order for for (;;) { if (cond) { A: foo (); } else B:; C:; } and thus we can end up setting 'last' to C_before_ processing 'A' and thus arriving at the call foo () ... get_loop_body_in_dom_order does some "special sauce" but not to address the above problem - but it might be that a subtle issue like the above is the reason for the inner loop handling. The inner loop block order does_not_ adhere to this "special sauce", that is - the "Additionally, if a basic block s dominates the latch, then only blocks dominated by s are be after it." guarantee holds for the outer loop latch, not for the inner. Digging into the history of fill_always_executed_in_1 doesn't reveal anything - the inner loop handling has been present since introduction by Zdenek - but usually Zdenek has a reason for doing things as he does;) Yes, this is really complicated usage, thanks for point it out. :) I constructed two cases to verify this with inner loop includes "If A; else B; C". Finding that fill_sons_in_loop in get_loop_body_in_dom_order will also checks whether the bb domintes outer loop’s latch, if C dominate outer loop’s latch, C is postponed, the access order is ABC, 'last' won’t be set to C if A or B contains call; But it depends on the order of visiting ABC and that's hard to put into a testcase since it depends on the order of edges and the processing of the dominance computation. ABC are simply unordered with respect to a dominator walk. Otherwise if C doesn’t dominate outer loop’s latch in fill_sons_in_loop, the access order is CAB, but 'last' also won’t be updated to C in fill_always_executed_in_1 since there is also dominate check, then if A or B contains call, it could break successfully. C won't be set to ALWAYS EXECUTED for both circumstance. Note it might be simply a measure against quadratic complexity, esp. since with your patch we also dive into not always executed subloops as you remove the if (!dominated_by_p (CDI_DOMINATORS, loop->latch, bb)) break; check. I suggest to evaluate behavior of the patch on a testcase like void foo (int n, int **k) { for (int i = 0; i < n; ++i) if (k[0][i]) for (int j = 0; j < n; ++j) if (k[1][j]) for (int l = 0; l < n; ++l) if (k[2][l]) ... } Theoretically the complexity is changing from L1(bbs) to L1(bbs)+L2(bbs)+L3(bbs)+…+Ln(bbs), so fill_always_executed_in_1's execution time is supposed to be increase from O(n) to O(n2)? The time should depend on loop depth and bb counts. I also drafted a test case has 73-depth loop function with 25 no-ipa function copies each compiled in lim2 and lim4 dependently. Total execution time of fill_always_executed_in_1 is increased from 32ms to 58ms, almost doubled but not quadratic? It's more like n + (n-1) + (n-2) + ... + 1 which is n^2/2 but that's still O(n^2). It seems reasonable to see compiling time getting longer since most bbs are checked more but a MUST to ensure early break correctly in every loop level... Though loop nodes could be huge, loop depth will never be so large in actual code? The "in practice" argument is almost always defeated by automatic program generators ;) I suspect you'll see quadratic behavior with your patch. You should be at least able to preserve a check like /* Do not process not always executed subloops to avoid quadratic behavior. */
RE: [PATCH take 3] Experimental -fpreserve-traps option
Hi Richard, > I hoped the one jumping on the "traps" bandwagon would have tackled -ftrapv > first. I did, back in 2003 I introduced (or helped introduce) the flag "-fwrapv" 😊 https://gcc.gnu.org/legacy-ml/gcc-patches/2003-03/msg02126.html Prior to that, signed overflow was undefined in the middle-end, and it was useful to have some way to specify the sensible behaviour of having signed and unsigned arithmetic behave the same way. I'm delighted that those semantics have become the default. Doh! Sorry for the ENOPATCH. I'll wait to hear how your experiments with -fnon-call-execeptions go before trying again, but an interesting additional usage is in match.pd: diff --git a/gcc/match.pd b/gcc/match.pd index f421c74..1eeb5eb 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -348,7 +348,9 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) /* X / bool_range_Y is X. */ (simplify (div @0 SSA_NAME@1) - (if (INTEGRAL_TYPE_P (type) && ssa_name_has_boolean_range (@1)) + (if (INTEGRAL_TYPE_P (type) + && ssa_name_has_boolean_range (@1) + && !flag_preserve_traps) @0)) /* X / X is one. */ (simplify This is the transformation that leads to the counter-intuitive "quantum" behaviour in PR 77980. It's difficult to believe that all the language front-ends supported by GCC consider this transformation as "safe". Perhaps 20 years from now, preserving traps will be the sensible default. Cheers, Roger -- -Original Message- From: Richard Biener Sent: 30 August 2021 08:13 To: Roger Sayle Cc: GCC Patches ; Eric Botcazou Subject: Re: [PATCH take 3] Experimental -fpreserve-traps option On Sun, Aug 29, 2021 at 11:28 AM Roger Sayle wrote: > > > This is another attempt to add an -fpreserve-traps command line option > to GCC. Many thanks to Richard Beiner for approving the code clean-up > pieces of my earlier submission: This revision contains just the > actual functional change(s). > > My recent attempt at humour in comment #8 of PR 77980 has reminded me > that we still really need to tackle/improve the semantics of trapping > instructions. > > Firstly, I completely agree with Richard (and Eric) that the current > situation with the semantics of -fexceptions and -fnon-call-exceptions > (especially the latter without the > former) is a mess. Alas, the testsuite (and source) is full of checks > that the current status quo is preserved, so any attempts to tweak > exception handling leads to a world of pain. I'm giving the following a try now which at least would remove the '-fno-exceptions -fnon-call-exceptions' state (it then might ask for unification of the flag_ vars to an enum, but that would be for a followup). diff --git a/gcc/common.opt b/gcc/common.opt index ed8ab5fbe13..7d69ab5ef7c 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -1509,7 +1509,7 @@ Common Var(flag_emit_class_debug_always) Init(0) Do not suppress C++ class debug information. fexceptions -Common Var(flag_exceptions) Optimization +Common Var(flag_exceptions) Optimization +EnabledBy(fnon-call-exceptions) Enable exception handling. fexpensive-optimizations > The most relevant thing that I would like to point out is that this > proposed option, -fpreserve-traps, and the semantics of trapping > instructions are completely orthogonal to exception handling. As the > term EH implies, exceptions are a form of flow control (on those > targets that support it). Traps, signals(?), segmentation faults, > illegal accesses and invalid instructions are things that can't > (always?) be controlled. Exceptional yes, but not handleable. Are they really independent? Consider -fnon-call-exceptions where trapping instructions (may) raise exceptions - now with your change (ENOPATCH btw...) any such instructions would be considered having side-effects despite the 'trapping' bit now covered by EH IL. Now - individual stmts can be marked as known to not throw, what does this now mean for its trapping state? How do -fpreserve-traps and -fdelete-dead-exceptions interact here? I think that going down the -fpreserve-traps way opens a whole can of worms with respects to users doing weird things and us having to maintain sth not well thought out - so IMHO adding a user-visible option shouldn't be the first thing to do. IIRC we got -ftrapv because there were frontends requiring its semantics (Java)? But even -ftrapv is utterly broken and only "works" in very small circumstances (SImode and DImode operations). But now -ftrapv interacts with -fpreserve-traps as well given it makes some (not very well defined) operations possibly trapping. What does preserving a trap mean btw, does it require to preserve the exact trapping instruction (a trap handler might inspect the trapping instruction after all)? Or are we allowed to replace a known to be trapping instruction by sth else that also traps? I suppose traps will be non-recoverable (you have -fnon-call-exceptions and EH as the only mechanism to re
Re: [PATCH v3] Fix incomplete computation in fill_always_executed_in_1
On Mon, 30 Aug 2021, Xionghu Luo wrote: > > > On 2021/8/27 15:45, Richard Biener wrote: > > On Thu, 26 Aug 2021, Xionghu Luo wrote: > > > >> > >> > >> On 2021/8/24 16:20, Richard Biener wrote: > >>> On Tue, 24 Aug 2021, Xionghu Luo wrote: > >>> > > > On 2021/8/19 20:11, Richard Biener wrote: > >> - class loop *inn_loop = loop; > >> > >> if (ALWAYS_EXECUTED_IN (loop->header) == NULL) > >> { > >> @@ -3232,19 +3231,6 @@ fill_always_executed_in_1 (class loop *loop, > >> @@ sbitmap contains_call) > >> to disprove this if possible). */ > >>if (bb->flags & BB_IRREDUCIBLE_LOOP) > >>break; > >> - > >> -if (!flow_bb_inside_loop_p (inn_loop, bb)) > >> - break; > >> - > >> -if (bb->loop_father->header == bb) > >> - { > >> -if (!dominated_by_p (CDI_DOMINATORS, loop->latch, bb)) > >> - break; > >> - > >> -/* In a loop that is always entered we may proceed > >> anyway. > >> - But record that we entered it and stop once we leave > >> it. */ > >> -inn_loop = bb->loop_father; > >> - } > >> } > >> > >> while (1) > > I'm not sure this will work correct (I'm not sure how the existing > > code makes it so either...). That said, I can't poke any hole > > into the change. What I see is that definitely > > > > if (dominated_by_p (CDI_DOMINATORS, loop->latch, bb)) > >last = bb; > > > > if (bitmap_bit_p (contains_call, bb->index)) > >break; > > > > doesn't work reliably since the DOM ordering will process blocks > > A B and C in random order for > > > > for (;;) > > { > > if (cond) > >{ > > A: foo (); > >} > > else B:; > > C:; > > } > > > > and thus we can end up setting 'last' to C_before_ processing > > 'A' and thus arriving at the call foo () ... > > > > get_loop_body_in_dom_order does some "special sauce" but not > > to address the above problem - but it might be that a subtle > > issue like the above is the reason for the inner loop handling. > > The inner loop block order does_not_ adhere to this "special sauce", > > that is - the "Additionally, if a basic block s dominates > > the latch, then only blocks dominated by s are be after it." > > guarantee holds for the outer loop latch, not for the inner. > > > > Digging into the history of fill_always_executed_in_1 doesn't > > reveal anything - the inner loop handling has been present > > since introduction by Zdenek - but usually Zdenek has a reason > > for doing things as he does;) > > Yes, this is really complicated usage, thanks for point it out. :) > I constructed two cases to verify this with inner loop includes "If A; > else B; C". > Finding that fill_sons_in_loop in get_loop_body_in_dom_order will also > checks > whether the bb domintes outer loop’s latch, if C dominate outer loop’s > latch, > C is postponed, the access order is ABC, 'last' won’t be set to C if A or > B contains call; > >>> > >>> But it depends on the order of visiting ABC and that's hard to put into > >>> a testcase since it depends on the order of edges and the processing > >>> of the dominance computation. ABC are simply unordered with respect > >>> to a dominator walk. > >>> > Otherwise if C doesn’t dominate outer loop’s latch in fill_sons_in_loop, > the access order is CAB, but 'last' also won’t be updated to C in > fill_always_executed_in_1 > since there is also dominate check, then if A or B contains call, it > could break > successfully. > > C won't be set to ALWAYS EXECUTED for both circumstance. > > > > > Note it might be simply a measure against quadratic complexity, > > esp. since with your patch we also dive into not always executed > > subloops as you remove the > > > > if (!dominated_by_p (CDI_DOMINATORS, loop->latch, bb)) > >break; > > > > check. I suggest to evaluate behavior of the patch on a testcase > > like > > > > void foo (int n, int **k) > > { > > for (int i = 0; i < n; ++i) > >if (k[0][i]) > > for (int j = 0; j < n; ++j) > >if (k[1][j]) > > for (int l = 0; l < n; ++l) > >if (k[2][l]) > > ... > > } > > Theoretically the complexity is changing from L1(bbs) to > L1(bbs)+L2(bbs)+L3(bbs)+…+Ln(bbs), > so fill_always_executed_in_1's execution time is supposed to be increase > from >
Re: [PATCH take 3] Experimental -fpreserve-traps option
On Mon, Aug 30, 2021 at 11:13 AM Roger Sayle wrote: > > > Hi Richard, > > > I hoped the one jumping on the "traps" bandwagon would have tackled -ftrapv > > first. > > I did, back in 2003 I introduced (or helped introduce) the flag "-fwrapv" 😊 > https://gcc.gnu.org/legacy-ml/gcc-patches/2003-03/msg02126.html > > Prior to that, signed overflow was undefined in the middle-end, and it was > useful to > have some way to specify the sensible behaviour of having signed and unsigned > arithmetic > behave the same way. I'm delighted that those semantics have become the > default. Actually -fno-wrapv is the default nowadays. That still is a hassle whenever the compiler itself tries to emit twos-complement arithmetic - while there's the possibility to use unsigned arithmetic for most cases there's no substitute for signed division (albeit the possibly overflowing case is rare). What's most annoying is IMHO that we have a bunch of flags controlling overflow behavior - flag_wrapv, flag_trapv plus flag_sanitize & SANITIZE_SI_OVERFLOW (and we've had a separate flag_strict_overflow and we do have a separate flag_wrapv_pointer now) but not all possible states have well-defined behavior, like flag_trapv && flag_wrapv (we disallow this specific combination of course). > Doh! Sorry for the ENOPATCH. I'll wait to hear how your experiments with > -fnon-call-execeptions > go before trying again, but an interesting additional usage is in match.pd: > > diff --git a/gcc/match.pd b/gcc/match.pd > index f421c74..1eeb5eb 100644 > --- a/gcc/match.pd > +++ b/gcc/match.pd > @@ -348,7 +348,9 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > /* X / bool_range_Y is X. */ > (simplify >(div @0 SSA_NAME@1) > - (if (INTEGRAL_TYPE_P (type) && ssa_name_has_boolean_range (@1)) > + (if (INTEGRAL_TYPE_P (type) > + && ssa_name_has_boolean_range (@1) > + && !flag_preserve_traps) > @0)) > /* X / X is one. */ > (simplify > > This is the transformation that leads to the counter-intuitive "quantum" > behaviour in PR 77980. > It's difficult to believe that all the language front-ends supported by GCC > consider this transformation > as "safe". Perhaps 20 years from now, preserving traps will be the sensible > default. Yes, but there's interplay with what GCC considers undefined behavior. So wouldn't it better to have flag_divde_by_zero_traps which says divide by zero is _not_ undefined behavior? I think that -fpreserve-traps applying to -ftrapv is not what people would expect (they'd expect dead code to be elided). Richard. > Cheers, > Roger > -- > > -Original Message- > From: Richard Biener > Sent: 30 August 2021 08:13 > To: Roger Sayle > Cc: GCC Patches ; Eric Botcazou > > Subject: Re: [PATCH take 3] Experimental -fpreserve-traps option > > On Sun, Aug 29, 2021 at 11:28 AM Roger Sayle > wrote: > > > > > > This is another attempt to add an -fpreserve-traps command line option > > to GCC. Many thanks to Richard Beiner for approving the code clean-up > > pieces of my earlier submission: This revision contains just the > > actual functional change(s). > > > > My recent attempt at humour in comment #8 of PR 77980 has reminded me > > that we still really need to tackle/improve the semantics of trapping > > instructions. > > > > Firstly, I completely agree with Richard (and Eric) that the current > > situation with the semantics of -fexceptions and -fnon-call-exceptions > > (especially the latter without the > > former) is a mess. Alas, the testsuite (and source) is full of checks > > that the current status quo is preserved, so any attempts to tweak > > exception handling leads to a world of pain. > > I'm giving the following a try now which at least would remove the > '-fno-exceptions -fnon-call-exceptions' state (it then might ask for > unification of the flag_ vars to an enum, but that would be for a followup). > > diff --git a/gcc/common.opt b/gcc/common.opt index ed8ab5fbe13..7d69ab5ef7c > 100644 > --- a/gcc/common.opt > +++ b/gcc/common.opt > @@ -1509,7 +1509,7 @@ Common Var(flag_emit_class_debug_always) Init(0) Do > not suppress C++ class debug information. > > fexceptions > -Common Var(flag_exceptions) Optimization > +Common Var(flag_exceptions) Optimization > +EnabledBy(fnon-call-exceptions) > Enable exception handling. > > fexpensive-optimizations > > > The most relevant thing that I would like to point out is that this > > proposed option, -fpreserve-traps, and the semantics of trapping > > instructions are completely orthogonal to exception handling. As the > > term EH implies, exceptions are a form of flow control (on those > > targets that support it). Traps, signals(?), segmentation faults, > > illegal accesses and invalid instructions are things that can't > > (always?) be controlled. Exceptional yes, but not handleable. > > Are they really independent? Consider -fnon-call-exceptions where trapping > instructions (may) raise exceptions - now with your change (ENOPATCH btw...) >
New Swedish PO file for 'gcc' (version 11.2.0)
Hello, gentle maintainer. This is a message from the Translation Project robot. A revised PO file for textual domain 'gcc' has been submitted by the Swedish team of translators. The file is available at: https://translationproject.org/latest/gcc/sv.po (This file, 'gcc-11.2.0.sv.po', has just now been sent to you in a separate email.) All other PO files for your package are available in: https://translationproject.org/latest/gcc/ Please consider including all of these in your next release, whether official or a pretest. Whenever you have a new distribution with a new version number ready, containing a newer POT file, please send the URL of that distribution tarball to the address below. The tarball may be just a pretest or a snapshot, it does not even have to compile. It is just used by the translators when they need some extra translation context. The following HTML page has been updated: https://translationproject.org/domain/gcc.html If any question arises, please contact the translation coordinator. Thank you for all your work, The Translation Project robot, in the name of your translation coordinator.
Re: [PATCH] Use __builtin_trap() for abort() if inhibit_libc
Hi, On Tue, Aug 17, 2021 at 10:43 AM Sebastian Huber < sebastian.hu...@embedded-brains.de> wrote: > abort() is used in gcc_assert() and gcc_unreachable() which is used by > target > libraries such as libgcov.a. This patch changes the abort() definition > under > certain conditions. If inhibit_libc is defined and abort is not already > defined, then abort() is defined to __builtin_trap(). > > The inhibit_libc define is usually defined if GCC is built for targets > running > in embedded systems which may optionally use a C standard library. If > inhibit_libc is defined, then there may be still a full featured abort() > available. abort() is a heavy weight function which depends on signals and > file streams. For statically linked applications, this means that a > dependency > on gcc_assert() pulls in the support for signals and file streams. This > could > prevent using gcov to test low end targets for example. Using > __builtin_trap() > avoids these dependencies if the target implements a "trap" instruction. > The > application or operating system could use a trap handler to react to > failed GCC > runtime checks which caused a trap. > > gcc/ > > * tsystem.h (abort): Define abort() if inhibit_libc is defined and > it > is not already defined. > This causes a build failure on arm: In file included from /libgcc/config/arm/unwind-arm.c:144: /libgcc/unwind-arm-common.inc:55:24: error: macro "abort" passed 1 arguments, but takes just 0 55 | extern void abort (void); This small patch should fix it: diff --git a/libgcc/unwind-arm-common.inc b/libgcc/unwind-arm-common.inc index c9b70c10d4f..77ec02ec811 100644 --- a/libgcc/unwind-arm-common.inc +++ b/libgcc/unwind-arm-common.inc @@ -50,10 +50,6 @@ #define ARM_SIGCONTEXT_R0 0xc #endif -/* We add a prototype for abort here to avoid creating a dependency on - target headers. */ -extern void abort (void); - /* Definitions for C++ runtime support routines. We make these weak declarations to avoid pulling in libsupc++ unnecessarily. */ typedef unsigned char bool; Christophe > --- > gcc/tsystem.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/gcc/tsystem.h b/gcc/tsystem.h > index e1e6a96a4f48..5c72c69ff3ed 100644 > --- a/gcc/tsystem.h > +++ b/gcc/tsystem.h > @@ -59,7 +59,7 @@ extern int atexit (void (*)(void)); > #endif > > #ifndef abort > -extern void abort (void) __attribute__ ((__noreturn__)); > +#define abort() __builtin_trap () > #endif > > #ifndef strlen > -- > 2.26.2 > >
[PATCH] [i386] Unify UNSPEC_MASKED_EQ/GT to the form of UNSPEC_PCMP.
Currently for evex vpcmpeqb instruction, we have two forms of rtl template representation, one is (unspec [op1 op2] UNSPEC_MASK_EQ), the other is (unspec [op1, op2, const_int 0] UNSPEC_PCMP), which increases the maintenance burden, such as optimization (not: vpcmpeqb) to (vpcmpneqb) requires two define_insn_and_split to match the two forms respectively, this patch removes UNSPEC_MASK_EQ/GT, unifying them into the form of UNSPEC_PCMP. Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. Pushed to trunk. gcc/ChangeLog: * config/i386/sse.md (*_ucmp3_1): Change from define_split to define_insn_and_split. (*avx2_eq3): Removed. (_eq3): Adjust pattern (_eq3_1): Rename to .. (*_eq3_1): .. this, and adjust pattern. (*avx2_gt3): Removed. (_gt3): Change from define_insn to define_expand, and adjust pattern. (UNSPEC_MASKED_EQ, UNSPEC_MASKED_GT): Removed. gcc/testsuite/ChangeLog: * gcc.target/i386/avx512bw-vpcmpeqb-1.c: Adjust testcase. * gcc.target/i386/avx512bw-vpcmpeqw-1.c: Ditto. * gcc.target/i386/avx512bw-vpcmpgtb-1.c: Ditto. * gcc.target/i386/avx512bw-vpcmpgtw-1.c: Ditto. * gcc.target/i386/avx512f-vpcmpeqd-1.c: Ditto. * gcc.target/i386/avx512f-vpcmpeqq-1.c: Ditto. * gcc.target/i386/avx512f-vpcmpgtd-1.c: Ditto. * gcc.target/i386/avx512f-vpcmpgtq-1.c: Ditto. * gcc.target/i386/avx512vl-vpcmpeqd-1.c: Ditto. * gcc.target/i386/avx512vl-vpcmpeqq-1.c: Ditto. * gcc.target/i386/avx512vl-vpcmpgtd-1.c: Ditto. * gcc.target/i386/avx512vl-vpcmpgtq-1.c: Ditto. * gcc.target/i386/bitwise_mask_op-1.c: Ditto. * gcc.target/i386/bitwise_mask_op-2.c: Ditto. --- gcc/config/i386/sse.md| 100 ++ .../gcc.target/i386/avx512bw-vpcmpeqb-1.c | 12 +-- .../gcc.target/i386/avx512bw-vpcmpeqw-1.c | 12 +-- .../gcc.target/i386/avx512bw-vpcmpgtb-1.c | 12 +-- .../gcc.target/i386/avx512bw-vpcmpgtw-1.c | 12 +-- .../gcc.target/i386/avx512f-vpcmpeqd-1.c | 4 +- .../gcc.target/i386/avx512f-vpcmpeqq-1.c | 4 +- .../gcc.target/i386/avx512f-vpcmpgtd-1.c | 4 +- .../gcc.target/i386/avx512f-vpcmpgtq-1.c | 4 +- .../gcc.target/i386/avx512vl-vpcmpeqd-1.c | 8 +- .../gcc.target/i386/avx512vl-vpcmpeqq-1.c | 8 +- .../gcc.target/i386/avx512vl-vpcmpgtd-1.c | 8 +- .../gcc.target/i386/avx512vl-vpcmpgtq-1.c | 8 +- .../gcc.target/i386/bitwise_mask_op-1.c | 6 -- .../gcc.target/i386/bitwise_mask_op-2.c | 1 - 15 files changed, 78 insertions(+), 125 deletions(-) diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index ac0c46328f2..5785e73241c 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -100,8 +100,6 @@ (define_c_enum "unspec" [ UNSPEC_COMPRESS UNSPEC_COMPRESS_STORE UNSPEC_EXPAND - UNSPEC_MASKED_EQ - UNSPEC_MASKED_GT ;; Mask operations UNSPEC_MASKOP @@ -3426,7 +3424,7 @@ (define_int_attr pcmp_signed_mask ;; For signed comparison, handle EQ 0: NEQ 4, ;; for unsigned comparison extra handle LE:2, NLE:6, equivalent to EQ and NEQ. -(define_split +(define_insn_and_split "*_ucmp3_1" [(set (match_operand: 0 "register_operand") (unspec: [(us_minus:VI12_AVX512VL @@ -3435,9 +3433,11 @@ (define_split (match_operand:VI12_AVX512VL 3 "const0_operand") (match_operand:SI 4 "const_0_to_7_operand")] UNSPEC_PCMP_ITER))] - "TARGET_AVX512BW + "TARGET_AVX512BW && ix86_pre_reload_split () && ix86_binary_operator_ok (US_MINUS, mode, operands) && (INTVAL (operands[4]) & ) == 0" + "#" + "&& 1" [(const_int 0)] { /* LE: 2, NLT: 5, NLE: 6, LT: 1 */ @@ -13801,24 +13801,6 @@ (define_insn "*avx2_eq3" (set_attr "prefix" "vex") (set_attr "mode" "OI")]) -(define_insn_and_split "*avx2_eq3" - [(set (match_operand:VI_128_256 0 "register_operand") - (vec_merge:VI_128_256 - (match_operand:VI_128_256 1 "vector_all_ones_operand") - (match_operand:VI_128_256 2 "const0_operand") - (unspec: - [(match_operand:VI_128_256 3 "nonimmediate_operand") -(match_operand:VI_128_256 4 "nonimmediate_operand")] -UNSPEC_MASKED_EQ)))] - "TARGET_AVX512VL && ix86_pre_reload_split () - && !(MEM_P (operands[3]) && MEM_P (operands[4]))" - "#" - "&& 1" - [(set (match_dup 0) - (eq:VI_128_256 - (match_dup 3) - (match_dup 4)))]) - (define_insn_and_split "*avx2_pcmp3_1" [(set (match_operand:VI_128_256 0 "register_operand") (vec_merge:VI_128_256 @@ -13978,8 +13960,9 @@ (define_expand "_eq3" [(set (match_operand: 0 "register_operand") (unspec: [(match_operand:VI12_AVX512VL 1 "nonimmediate_operand") - (match_operand:VI12_AVX512VL 2 "nonimmediate_operand")] - UNSPEC_MASKED_EQ))] + (match_operand:VI12_AVX512VL 2 "nonim
Fix 'hash_table::expand' to destruct stale Value objects (was: 'hash_map>')
Hi! Ping -- we still need to plug the memory leak; see patch attached, and/or long discussion here: On 2021-08-16T14:10:00-0600, Martin Sebor wrote: > On 8/16/21 6:44 AM, Thomas Schwinge wrote: >> On 2021-08-12T17:15:44-0600, Martin Sebor via Gcc wrote: >>> On 8/6/21 10:57 AM, Thomas Schwinge wrote: So I'm trying to do some C++... ;-) Given: /* A map from SSA names or var decls to record fields. */ typedef hash_map field_map_t; /* For each propagation record type, this is a map from SSA names or var decls to propagate, to the field in the record type that should be used for transmission and reception. */ typedef hash_map record_field_map_t; Thus, that's a 'hash_map>'. (I may do that, right?) Looking through GCC implementation files, very most of all uses of 'hash_map' boil down to pointer key ('tree', for example) and pointer/integer value. >>> >>> Right. Because most GCC containers rely exclusively on GCC's own >>> uses for testing, if your use case is novel in some way, chances >>> are it might not work as intended in all circumstances. >>> >>> I've wrestled with hash_map a number of times. A use case that's >>> close to yours (i.e., a non-trivial value type) is in cp/parser.c: >>> see class_to_loc_map_t. >> >> Indeed, at the time you sent this email, I already had started looking >> into that one! (The Fortran test cases that I originally analyzed, which >> triggered other cases of non-POD/non-trivial destructor, all didn't >> result in a memory leak, because the non-trivial constructor doesn't >> actually allocate any resources dynamically -- that's indeed different in >> this case here.) ..., and indeed: >> >>> (I don't remember if I tested it for leaks >>> though. It's used to implement -Wmismatched-tags so compiling >>> a few tests under Valgrind should show if it does leak.) >> >> ... it does leak memory at present. :-| (See attached commit log for >> details for one example.) (Attached "Fix 'hash_table::expand' to destruct stale Value objects" again.) >> To that effect, to document the current behavior, I propose to >> "Add more self-tests for 'hash_map' with Value type with non-trivial >> constructor/destructor" (We've done that in commit e4f16e9f357a38ec702fb69a0ffab9d292a6af9b "Add more self-tests for 'hash_map' with Value type with non-trivial constructor/destructor", quickly followed by bug fix commit bb04a03c6f9bacc890118b9e12b657503093c2f8 "Make 'gcc/hash-map-tests.c:test_map_of_type_with_ctor_and_dtor_expand' work on 32-bit architectures [PR101959]". >> (Also cherry-pick into release branches, eventually?) Then: record_field_map_t field_map ([...]); // see below for ([...]) { tree record_type = [...]; [...] bool existed; field_map_t &fields = field_map.get_or_insert (record_type, &existed); gcc_checking_assert (!existed); [...] for ([...]) fields.put ([...], [...]); [...] } [stuff that looks up elements from 'field_map'] field_map.empty (); This generally works. If I instantiate 'record_field_map_t field_map (40);', Valgrind is happy. If however I instantiate 'record_field_map_t field_map (13);' (where '13' would be the default for 'hash_map'), Valgrind complains: 2,080 bytes in 10 blocks are definitely lost in loss record 828 of 876 at 0x483DD99: calloc (vg_replace_malloc.c:762) by 0x175F010: xcalloc (xmalloc.c:162) by 0xAF4A2C: hash_table>>> simple_hashmap_traits, tree_node*> >::hash_entry, false, xcallocator>::hash_table(unsigned long, bool, bool, bool, mem_alloc_origin) (hash-table.h:275) by 0x15E0120: hash_map>>> simple_hashmap_traits, tree_node*> >::hash_map(unsigned long, bool, bool, bool) (hash-map.h:143) by 0x15DEE87: hash_map>>> tree_node*, simple_hashmap_traits, tree_node*> >, simple_hashmap_traits, hash_map>>> simple_hashmap_traits, tree_node*> > > >::get_or_insert(tree_node* const&, bool*) (hash-map.h:205) by 0x15DD52C: execute_omp_oacc_neuter_broadcast() (omp-oacc-neuter-broadcast.cc:1371) [...] (That's with '#pragma GCC optimize "O0"' at the top of the 'gcc/*.cc' file.) My suspicion was that it is due to the 'field_map' getting resized as it incrementally grows (and '40' being big enough for that to never happen), and somehow the non-POD (?) value objects not being properly handled during that. Working my way a bit through 'gcc/hash-map.*' and 'gcc/hash-table.*' (but not claiming that I understand all that, off hand), it seems as if my theory is right: I'm
[wwwdocs] gcc-12/changes.html: nvptx - new __PTX_SM__ macro
Document Roger's patch https://gcc.gnu.org/g:3c496e92d795a8fe5c527e3c5b5a6606669ae50d OK? Suggestions? Tobias PS: I have a pending wwwdocs patch for OpenMP, but I think I will hold off with an updated version until Jakub's next patch – to avoid too many updates. PPS: I also have a pending wwwdocs patch for GCN, but there one tuning follow-up patch is missing before it is fully implemented. Hence, holding off that one as well. - Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 gcc-12/changes.html: nvptx - new __PTX_SM__ macro diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html index a8859882..d217f1c8 100644 --- a/htdocs/gcc-12/changes.html +++ b/htdocs/gcc-12/changes.html @@ -149,6 +149,8 @@ a work-in-progress. for the generated code; permitted values are 3.1 (default, matches previous GCC versions) and 6.3. + The __PTX_SM__ predefined macro allows code to check the + compute model being targeted by the compiler.
Re: [PATCH] Use __builtin_trap() for abort() if inhibit_libc
Hello Christophe, it seems there are a couple of more abort() declarations: libgcc/unwind-arm-common.inc:extern void abort (void); libgcc/config/c6x/pr-support.c:extern void abort (void); libgcc/config/arm/pr-support.c:extern void abort (void); libgcc/config/arm/linux-atomic-64bit.c:extern void abort (void); libgcc/fp-bit.c: external to abort if abort is not used by the function, and the stubs libgcc/fp-bit.c:extern void abort (void); I will prepare a patch. -- embedded brains GmbH Herr Sebastian HUBER Dornierstr. 4 82178 Puchheim Germany email: sebastian.hu...@embedded-brains.de phone: +49-89-18 94 741 - 16 fax: +49-89-18 94 741 - 08 Registergericht: Amtsgericht München Registernummer: HRB 157899 Vertretungsberechtigte Geschäftsführer: Peter Rasmussen, Thomas Dörfler Unsere Datenschutzerklärung finden Sie hier: https://embedded-brains.de/datenschutzerklaerung/
Re: [RFA] Some libgcc headers are missing the runtime exception
Hi! Ping. For easy reference I've again attached Richard Sandiford's "libgcc: Add missing runtime exception notices". On 2021-07-12T17:34:09+0100, Richard Sandiford via Gcc-patches wrote: > David Edelsohn writes: >> On Mon, Jul 12, 2021 at 11:58 AM Richard Sandiford >> wrote: >>> David Edelsohn writes: >>> > On Fri, Jul 9, 2021 at 1:31 PM Richard Sandiford >>> > wrote: >>> >> David Edelsohn writes: >>> >> > On Fri, Jul 9, 2021 at 12:53 PM Richard Sandiford via Gcc >>> >> > wrote: >>> >> >> It was pointed out to me off-list that config/aarch64/value-unwind.h >>> >> >> is missing the runtime exception. It looks like a few other files >>> >> >> are too; a fuller list is: >>> >> >> >>> >> >> libgcc/config/aarch64/value-unwind.h >>> >> >> libgcc/config/frv/frv-abi.h >>> >> >> libgcc/config/i386/value-unwind.h >>> >> >> libgcc/config/pa/pa64-hpux-lib.h >>> >> >> >>> >> >> Certainly for the aarch64 file this was simply a mistake; >>> >> >> it seems to have been copied from the i386 version, both of which >>> >> >> reference the runtime exception but don't actually include it. >>> >> >> >>> >> >> What's the procedure for fixing this? Can we treat it as a textual >>> >> >> error or do the files need to be formally relicensed? >>> >> > >>> >> > I'm unsure what you mean by "formally relicensed". >>> >> >>> >> It seemed like there were two possibilities: the licence of the files >>> >> is actually GPL + exception despite what the text says (the textual >>> >> error case), or the licence of the files is plain GPL because the text >>> >> has said so since the introduction of the files. In the latter case >>> >> I'd have imagined that someone would need to relicense the code so >>> >> that it is GPL + exception. >>> >> >>> >> > It generally is considered a textual omission. The runtime library >>> >> > components of GCC are intended to be licensed under the runtime >>> >> > exception, which was granted and approved at the time of introduction. >>> >> >>> >> OK, thanks. So would a patch to fix at least the i386 and aarch64 header >>> >> files be acceptable? (I'm happy to fix the other two as well if that's >>> >> definitely the right thing to do. It's just that there's more history >>> >> involved there…) >>> > >>> > Please correct the text in the files. The files in libgcc used in the >>> > GCC runtime are intended to be licensed with the runtime exception and >>> > GCC previously was granted approval for that licensing and purpose. >>> > >>> > As you are asking the question, I sincerely doubt that ARM and Cavium >>> > intended to apply a license without the exception to those files. And >>> > similarly for Intel and FRV. >>> >>> FTR, I think only Linaro (rather than Arm) touched the aarch64 file. >>> >>> > The runtime exception explicitly was intended for this purpose and >>> > usage at the time that GCC received approval to apply the exception. >>> >>> Ack. Is the patch below OK for trunk and branches? >> >> I'm not certain whom you are asking for approval, > > I was assuming it would need a global reviewer. > >> but it looks good to me. > > Thanks. So in addition to David, would a Global Reviewer please review this? Grüße Thomas - Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955 >From a601ac8ea9be14a898215456c22cd826e8fd92d9 Mon Sep 17 00:00:00 2001 From: Richard Sandiford Date: Mon, 12 Jul 2021 13:04:56 +0100 Subject: [PATCH] libgcc: Add missing runtime exception notices MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Quoting from https://gcc.gnu.org/pipermail/gcc/2021-July/236716.html: It was pointed out to me off-list that config/aarch64/value-unwind.h is missing the runtime exception. It looks like a few other files are too; a fuller list is: libgcc/config/aarch64/value-unwind.h libgcc/config/frv/frv-abi.h libgcc/config/i386/value-unwind.h libgcc/config/pa/pa64-hpux-lib.h Certainly for the aarch64 file this was simply a mistake; it seems to have been copied from the i386 version, both of which reference the runtime exception but don't actually include it. Similarly, frv-abi.h referenced the exception but didn't include it. pa64-hpux-lib.h was missing any reference to the exception. The decision was that this was simply a mistake [https://gcc.gnu.org/pipermail/gcc/2021-July/236717.html]: […] It generally is considered a textual omission. The runtime library components of GCC are intended to be licensed under the runtime exception, which was granted and approved at the time of introduction. --
Re: [PATCH] Use __builtin_trap() for abort() if inhibit_libc
On Mon, Aug 30, 2021 at 12:55 PM Sebastian Huber wrote: > > Hello Christophe, > > it seems there are a couple of more abort() declarations: > > libgcc/unwind-arm-common.inc:extern void abort (void); > libgcc/config/c6x/pr-support.c:extern void abort (void); > libgcc/config/arm/pr-support.c:extern void abort (void); > libgcc/config/arm/linux-atomic-64bit.c:extern void abort (void); > libgcc/fp-bit.c: external to abort if abort is not used by the > function, and the stubs > libgcc/fp-bit.c:extern void abort (void); > > I will prepare a patch. Less likley to break might be to simply wrap 'abort' in parentheses to avoid macro expansion? > -- > embedded brains GmbH > Herr Sebastian HUBER > Dornierstr. 4 > 82178 Puchheim > Germany > email: sebastian.hu...@embedded-brains.de > phone: +49-89-18 94 741 - 16 > fax: +49-89-18 94 741 - 08 > > Registergericht: Amtsgericht München > Registernummer: HRB 157899 > Vertretungsberechtigte Geschäftsführer: Peter Rasmussen, Thomas Dörfler > Unsere Datenschutzerklärung finden Sie hier: > https://embedded-brains.de/datenschutzerklaerung/
Re: [RFA] Some libgcc headers are missing the runtime exception
On Mon, Aug 30, 2021 at 12:59 PM Thomas Schwinge wrote: > > Hi! > > Ping. For easy reference I've again attached Richard Sandiford's > "libgcc: Add missing runtime exception notices". > > On 2021-07-12T17:34:09+0100, Richard Sandiford via Gcc-patches > wrote: > > David Edelsohn writes: > >> On Mon, Jul 12, 2021 at 11:58 AM Richard Sandiford > >> wrote: > >>> David Edelsohn writes: > >>> > On Fri, Jul 9, 2021 at 1:31 PM Richard Sandiford > >>> > wrote: > >>> >> David Edelsohn writes: > >>> >> > On Fri, Jul 9, 2021 at 12:53 PM Richard Sandiford via Gcc > >>> >> > wrote: > >>> >> >> It was pointed out to me off-list that config/aarch64/value-unwind.h > >>> >> >> is missing the runtime exception. It looks like a few other files > >>> >> >> are too; a fuller list is: > >>> >> >> > >>> >> >> libgcc/config/aarch64/value-unwind.h > >>> >> >> libgcc/config/frv/frv-abi.h > >>> >> >> libgcc/config/i386/value-unwind.h > >>> >> >> libgcc/config/pa/pa64-hpux-lib.h > >>> >> >> > >>> >> >> Certainly for the aarch64 file this was simply a mistake; > >>> >> >> it seems to have been copied from the i386 version, both of which > >>> >> >> reference the runtime exception but don't actually include it. > >>> >> >> > >>> >> >> What's the procedure for fixing this? Can we treat it as a textual > >>> >> >> error or do the files need to be formally relicensed? > >>> >> > > >>> >> > I'm unsure what you mean by "formally relicensed". > >>> >> > >>> >> It seemed like there were two possibilities: the licence of the files > >>> >> is actually GPL + exception despite what the text says (the textual > >>> >> error case), or the licence of the files is plain GPL because the text > >>> >> has said so since the introduction of the files. In the latter case > >>> >> I'd have imagined that someone would need to relicense the code so > >>> >> that it is GPL + exception. > >>> >> > >>> >> > It generally is considered a textual omission. The runtime library > >>> >> > components of GCC are intended to be licensed under the runtime > >>> >> > exception, which was granted and approved at the time of > >>> >> > introduction. > >>> >> > >>> >> OK, thanks. So would a patch to fix at least the i386 and aarch64 > >>> >> header > >>> >> files be acceptable? (I'm happy to fix the other two as well if that's > >>> >> definitely the right thing to do. It's just that there's more history > >>> >> involved there…) > >>> > > >>> > Please correct the text in the files. The files in libgcc used in the > >>> > GCC runtime are intended to be licensed with the runtime exception and > >>> > GCC previously was granted approval for that licensing and purpose. > >>> > > >>> > As you are asking the question, I sincerely doubt that ARM and Cavium > >>> > intended to apply a license without the exception to those files. And > >>> > similarly for Intel and FRV. > >>> > >>> FTR, I think only Linaro (rather than Arm) touched the aarch64 file. > >>> > >>> > The runtime exception explicitly was intended for this purpose and > >>> > usage at the time that GCC received approval to apply the exception. > >>> > >>> Ack. Is the patch below OK for trunk and branches? > >> > >> I'm not certain whom you are asking for approval, > > > > I was assuming it would need a global reviewer. > > > >> but it looks good to me. > > > > Thanks. > > So in addition to David, would a Global Reviewer please review this? OK. Thanks, Richard. > > Grüße > Thomas > > > - > Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 > München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas > Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht > München, HRB 106955
Re: [PATCH] Use __builtin_trap() for abort() if inhibit_libc
On 30/08/2021 13:44, Richard Biener wrote: On Mon, Aug 30, 2021 at 12:55 PM Sebastian Huber wrote: Hello Christophe, it seems there are a couple of more abort() declarations: libgcc/unwind-arm-common.inc:extern void abort (void); libgcc/config/c6x/pr-support.c:extern void abort (void); libgcc/config/arm/pr-support.c:extern void abort (void); libgcc/config/arm/linux-atomic-64bit.c:extern void abort (void); libgcc/fp-bit.c: external to abort if abort is not used by the function, and the stubs libgcc/fp-bit.c:extern void abort (void); I will prepare a patch. Less likley to break might be to simply wrap 'abort' in parentheses to avoid macro expansion? I think the abort declaration in libgcc/unwind-arm-common.inc is the only problematic spot since unwind-arm-common.inc includes tsystem.h. -- embedded brains GmbH Herr Sebastian HUBER Dornierstr. 4 82178 Puchheim Germany email: sebastian.hu...@embedded-brains.de phone: +49-89-18 94 741 - 16 fax: +49-89-18 94 741 - 08 Registergericht: Amtsgericht München Registernummer: HRB 157899 Vertretungsberechtigte Geschäftsführer: Peter Rasmussen, Thomas Dörfler Unsere Datenschutzerklärung finden Sie hier: https://embedded-brains.de/datenschutzerklaerung/
[PATCH] Fix arm target build with inhibit_libc
Do not declare abort in "libgcc/unwind-arm-common.inc" since it is already provided by "tsystem.h". It fixes the following build error: In file included from libgcc/config/arm/unwind-arm.c:144: libgcc/unwind-arm-common.inc:55:24: error: macro "abort" passed 1 arguments, but takes just 0 55 | extern void abort (void); libgcc/ * unwind-arm-common.inc (abort): Remove. --- libgcc/unwind-arm-common.inc | 4 1 file changed, 4 deletions(-) diff --git a/libgcc/unwind-arm-common.inc b/libgcc/unwind-arm-common.inc index c9b70c10d4f5..77ec02ec811e 100644 --- a/libgcc/unwind-arm-common.inc +++ b/libgcc/unwind-arm-common.inc @@ -50,10 +50,6 @@ #define ARM_SIGCONTEXT_R0 0xc #endif -/* We add a prototype for abort here to avoid creating a dependency on - target headers. */ -extern void abort (void); - /* Definitions for C++ runtime support routines. We make these weak declarations to avoid pulling in libsupc++ unnecessarily. */ typedef unsigned char bool; -- 2.26.2
Re: [PATCH] Set bound/cmp/control for until wrap loop.
On Mon, 30 Aug 2021, guojiufu wrote: > On 2021-08-30 14:15, Jiufu Guo wrote: > > Hi, > > > > In patch r12-3136, niter->control, niter->bound and niter->cmp are > > derived from number_of_iterations_lt. While for 'until wrap condition', > > the calculation in number_of_iterations_lt is not align the requirements > > on the define of them and requirements in determine_exit_conditions. > > > > This patch calculate niter->control, niter->bound and niter->cmp in > > number_of_iterations_until_wrap. > > > > The ICEs in the PR are pass with this patch. > > Bootstrap and reg-tests pass on ppc64/ppc64le and x86. > > Is this ok for trunk? > > > > BR. > > Jiufu Guo > > > Add ChangeLog: > gcc/ChangeLog: > > 2021-08-30 Jiufu Guo > > PR tree-optimization/102087 > * tree-ssa-loop-niter.c (number_of_iterations_until_wrap): > Set bound/cmp/control for niter. > > gcc/testsuite/ChangeLog: > > 2021-08-30 Jiufu Guo > > PR tree-optimization/102087 > * gcc.dg/vect/pr101145_3.c: Update tests. > * gcc.dg/pr102087.c: New test. > > > --- > > gcc/tree-ssa-loop-niter.c | 14 +- > > gcc/testsuite/gcc.dg/pr102087.c| 25 + > > gcc/testsuite/gcc.dg/vect/pr101145_3.c | 4 +++- > > 3 files changed, 41 insertions(+), 2 deletions(-) > > create mode 100644 gcc/testsuite/gcc.dg/pr102087.c > > > > diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c > > index 7af92d1c893..747f04d3ce0 100644 > > --- a/gcc/tree-ssa-loop-niter.c > > +++ b/gcc/tree-ssa-loop-niter.c > > @@ -1482,7 +1482,7 @@ number_of_iterations_until_wrap (class loop *, > > tree type, affine_iv *iv0, > > affine_iv *iv1, class tree_niter_desc *niter) > > { > >tree niter_type = unsigned_type_for (type); > > - tree step, num, assumptions, may_be_zero; > > + tree step, num, assumptions, may_be_zero, span; > >wide_int high, low, max, min; > > > >may_be_zero = fold_build2 (LE_EXPR, boolean_type_node, iv1->base, > > iv0->base); > > @@ -1513,6 +1513,8 @@ number_of_iterations_until_wrap (class loop *, > > tree type, affine_iv *iv0, > > low = wi::to_wide (iv0->base); > > else > > low = min; > > + > > + niter->control = *iv1; > > } > >/* {base, -C} < n. */ > >else if (tree_int_cst_sign_bit (iv0->step) && integer_zerop > > (iv1->step)) > > @@ -1533,6 +1535,8 @@ number_of_iterations_until_wrap (class loop *, > > tree type, affine_iv *iv0, > > high = wi::to_wide (iv1->base); > > else > > high = max; > > + > > + niter->control = *iv0; > > } > >else > > return false; it looks like the above two should already be in effect from the caller (guarding with integer_nozerop)? > > @@ -1556,6 +1560,14 @@ number_of_iterations_until_wrap (class loop *, > > tree type, affine_iv *iv0, > >niter->assumptions, assumptions); > > > >niter->control.no_overflow = false; > > + niter->control.base = fold_build2 (MINUS_EXPR, niter_type, > > +niter->control.base, > > niter->control.step); how do we know IVn - STEP doesn't already wrap? A comment might be good to explain you're turning the simplified exit condition into { IVbase - STEP, +, STEP } != niter * STEP + (IVbase - STEP) which, when mathematically looking at it makes me wonder why there's the seemingly redundant '- STEP' term? Also is NE_EXPR really correct since STEP might be not 1? Only for non equality compares the '- STEP' should matter? Richard. > > + span = fold_build2 (MULT_EXPR, niter_type, niter->niter, > > + fold_convert (niter_type, niter->control.step)); > > + niter->bound = fold_build2 (PLUS_EXPR, niter_type, span, > > + fold_convert (niter_type, niter->control.base)); > > + niter->bound = fold_convert (type, niter->bound); > > + niter->cmp = NE_EXPR; > > > >return true; > > } > > diff --git a/gcc/testsuite/gcc.dg/pr102087.c > > b/gcc/testsuite/gcc.dg/pr102087.c > > new file mode 100644 > > index 000..ef1f9f5cba9 > > --- /dev/null > > +++ b/gcc/testsuite/gcc.dg/pr102087.c > > @@ -0,0 +1,25 @@ > > +/* { dg-do compile } */ > > +/* { dg-options "-O3" } */ > > + > > +unsigned __attribute__ ((noinline)) > > +foo (int *__restrict__ a, int *__restrict__ b, unsigned l, unsigned n) > > +{ > > + while (n < ++l) > > +*a++ = *b++ + 1; > > + return l; > > +} > > + > > +volatile int a[1]; > > +unsigned b; > > +int c; > > + > > +int > > +check () > > +{ > > + int d; > > + for (; b > 1; b++) > > +for (c = 0; c < 2; c++) > > + for (d = 0; d < 2; d++) > > + a[0]; > > + return 0; > > +} > > diff --git a/gcc/testsuite/gcc.dg/vect/pr101145_3.c > > b/gcc/testsuite/gcc.dg/vect/pr101145_3.c > > index 99289afec0b..40cb0240aaa 100644 > > --- a/gcc/testsuite/gcc.dg/vect/pr101145_3.c > > +++ b/gcc/testsuite/gcc.dg/vect/pr101145_3.c > > @@ -1,5 +1,6 @@ > > /* { dg-require-effective-target ve
[PATCH] tree-optimization/102128 - rework if-converted BB vect heuristic
This reworks the previous attempt to avoid leaving around if-converted scalar code in BB vectorized loop bodies to keep costing independent subgraphs which should address the observed regression with 519.lbm_r. For this to work we now first cost all subgraphs and only after doing that proceed to emit vectorized code. Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed. 2021-08-30 Richard Biener PR tree-optimization/102128 * tree-vect-slp.c (vect_bb_vectorization_profitable_p): Move scanning for if-converted scalar code to the caller and instead delay clearing the visited flag for profitable subgraphs. (vect_slp_region): Cost all subgraphs before scheduling. For if-converted BB vectorization scan for scalar COND_EXPRs and do not vectorize if any found and the cost model is very-cheap. --- gcc/tree-vect-slp.c | 112 +++- 1 file changed, 58 insertions(+), 54 deletions(-) diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c index 4d688c7a267..4ca24408249 100644 --- a/gcc/tree-vect-slp.c +++ b/gcc/tree-vect-slp.c @@ -5275,34 +5275,6 @@ vect_bb_vectorization_profitable_p (bb_vec_info bb_vinfo, vector_costs.safe_splice (instance->cost_vec); instance->cost_vec.release (); } - /* When we're vectorizing an if-converted loop body with the - very-cheap cost model make sure we vectorized all if-converted - code. */ - bool force_not_profitable = false; - if (orig_loop && flag_vect_cost_model == VECT_COST_MODEL_VERY_CHEAP) -{ - gcc_assert (bb_vinfo->bbs.length () == 1); - for (gimple_stmt_iterator gsi = gsi_start_bb (bb_vinfo->bbs[0]); - !gsi_end_p (gsi); gsi_next (&gsi)) - { - /* The costing above left us with DCEable vectorized scalar -stmts having the visited flag set. */ - if (gimple_visited_p (gsi_stmt (gsi))) - continue; - - if (gassign *ass = dyn_cast (gsi_stmt (gsi))) - if (gimple_assign_rhs_code (ass) == COND_EXPR) - { - force_not_profitable = true; - break; - } - } -} - - /* Unset visited flag. */ - stmt_info_for_cost *cost; - FOR_EACH_VEC_ELT (scalar_costs, i, cost) -gimple_set_visited (cost->stmt_info->stmt, false); if (dump_enabled_p ()) dump_printf_loc (MSG_NOTE, vect_location, "Cost model analysis: \n"); @@ -5319,6 +5291,7 @@ vect_bb_vectorization_profitable_p (bb_vec_info bb_vinfo, li_scalar_costs (scalar_costs.length ()); auto_vec > li_vector_costs (vector_costs.length ()); + stmt_info_for_cost *cost; FOR_EACH_VEC_ELT (scalar_costs, i, cost) { unsigned l = gimple_bb (cost->stmt_info->stmt)->loop_father->num; @@ -5341,6 +5314,7 @@ vect_bb_vectorization_profitable_p (bb_vec_info bb_vinfo, /* Now cost the portions individually. */ unsigned vi = 0; unsigned si = 0; + bool profitable = true; while (si < li_scalar_costs.length () && vi < li_vector_costs.length ()) { @@ -5407,30 +5381,29 @@ vect_bb_vectorization_profitable_p (bb_vec_info bb_vinfo, example). */ if (vec_outside_cost + vec_inside_cost > scalar_cost) { - scalar_costs.release (); - vector_costs.release (); - return false; + profitable = false; + break; } } - if (vi < li_vector_costs.length ()) + if (profitable && vi < li_vector_costs.length ()) { if (dump_enabled_p ()) dump_printf_loc (MSG_NOTE, vect_location, "Excess vector cost for part in loop %d:\n", li_vector_costs[vi].first); - scalar_costs.release (); - vector_costs.release (); - return false; + profitable = false; } - if (dump_enabled_p () && force_not_profitable) -dump_printf_loc (MSG_NOTE, vect_location, -"not profitable because of unprofitable if-converted " -"scalar code\n"); + /* Unset visited flag. This is delayed when the subgraph is profitable + and we process the loop for remaining unvectorized if-converted code. */ + if (orig_loop && !profitable) +FOR_EACH_VEC_ELT (scalar_costs, i, cost) + gimple_set_visited (cost->stmt_info->stmt, false); scalar_costs.release (); vector_costs.release (); - return !force_not_profitable; + + return profitable; } /* qsort comparator for lane defs. */ @@ -5884,9 +5857,8 @@ vect_slp_region (vec bbs, vec datarefs, bb_vinfo->shared->check_datarefs (); - unsigned i; - slp_instance instance; - FOR_EACH_VEC_ELT (BB_VINFO_SLP_INSTANCES (bb_vinfo), i, instance) + auto_vec profitable_subgraphs; + for (slp_instance instance : BB_VINFO_SLP_INSTANCES (bb_vinfo)) { if (instance->subgraph_entries.is_empty ()) continue; @@ -5894,9 +5866,7 @
Re: [PATCH] Make sure -fexceptions is enabled when -fnon-call-exceptions is
On Mon, 30 Aug 2021, Eric Botcazou wrote: > > This makes -fexceptions enabled by -fnon-call-exceptions, removing > > the odd state of !flag_exceptions && flag_non_call_exceptions from > > middle-end consideration. > > FWIW fine with me. Thanks - pushed. Richard.
RE: [PATCH] [MIPS] Hazard barrier return support
> -Original Message- > From: Dragan Mladjenovic > Sent: 17 August 2021 17:59 > To: 'Andrew Pinski' > Cc: 'gcc-patches@gcc.gnu.org' > Subject: RE: [PATCH] [MIPS] Hazard barrier return support > > > > > -Original Message- > > From: Dragan Mladjenovic > > Sent: 16 August 2021 22:40 > > To: 'Andrew Pinski' > > Cc: gcc-patches@gcc.gnu.org > > Subject: RE: [PATCH] [MIPS] Hazard barrier return support > > > > > > > > > -Original Message- > > > From: Andrew Pinski [mailto:pins...@gmail.com] > > > Sent: 16 August 2021 21:17 > > > To: Dragan Mladjenovic > > > Cc: gcc-patches@gcc.gnu.org > > > Subject: Re: [PATCH] [MIPS] Hazard barrier return support > > > > > > On Mon, Aug 16, 2021 at 7:43 AM Dragan Mladjenovic via Gcc-patches > > > wrote: > > > > > > > > This patch allows a function to request clearing of all > > > > instruction and execution hazards upon normal return via > > > > __attribute__ > > > ((use_hazard_barrier_return)). > > > > > > > > 2017-04-25 Prachi Godbole > > > > > > > > gcc/ > > > > * config/mips/mips.h (machine_function): New variable > > > > use_hazard_barrier_return_p. > > > > * config/mips/mips.md (UNSPEC_JRHB): New unspec. > > > > (mips_hb_return_internal): New insn pattern. > > > > * config/mips/mips.c (mips_attribute_table): Add attribute > > > > use_hazard_barrier_return. > > > > (mips_use_hazard_barrier_return_p): New static function. > > > > (mips_function_attr_inlinable_p): Likewise. > > > > (mips_compute_frame_info): Set use_hazard_barrier_return_p. > > > > Emit error for unsupported architecture choice. > > > > (mips_function_ok_for_sibcall, mips_can_use_return_insn): > > > > Return false for use_hazard_barrier_return. > > > > (mips_expand_epilogue): Emit hazard barrier return. > > > > * doc/extend.texi: Document use_hazard_barrier_return. > > > > > > > > gcc/testsuite/ > > > > * gcc.target/mips/hazard-barrier-return-attribute.c: New test. > > > > --- > > > > Rehash of original patch posted by Prachi with minimal changes. > > > > Tested against mips-mti-elf with mips32r2/-EB and mips32r2/-EB/- > > micromips. > > > > > > > > gcc/config/mips/mips.c| 58 +-- > > > > gcc/config/mips/mips.h| 3 + > > > > gcc/config/mips/mips.md | 15 + > > > > gcc/doc/extend.texi | 6 ++ > > > > .../mips/hazard-barrier-return-attribute.c| 20 +++ > > > > 5 files changed, 98 insertions(+), 4 deletions(-) create mode > > > > 100644 > > > > gcc/testsuite/gcc.target/mips/hazard-barrier-return-attribute.c > > > > > > > > diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c index > > > > 89d1be6cea6..6ce12fce52e 100644 > > > > --- a/gcc/config/mips/mips.c > > > > +++ b/gcc/config/mips/mips.c > > > > @@ -630,6 +630,7 @@ static const struct attribute_spec > > > mips_attribute_table[] = { > > > > mips_handle_use_shadow_register_set_attr, NULL }, > > > >{ "keep_interrupts_masked", 0, 0, false, true, true, false, NULL, > NULL }, > > > >{ "use_debug_exception_return", 0, 0, false, true, true, false, > > > > NULL, NULL }, > > > > + { "use_hazard_barrier_return", 0, 0, true, false, false, false, > > > > + NULL, NULL }, > > > >{ NULL, 0, 0, false, false, false, false, NULL, NULL } > > > > }; > > > > > > > > @@ -1309,6 +1310,16 @@ mips_use_debug_exception_return_p (tree > > > type) > > > >TYPE_ATTRIBUTES (type)) != NULL; } > > > > > > > > +/* Check if the attribute to use hazard barrier return is set for > > > > + the function declaration DECL. */ > > > > + > > > > +static bool > > > > +mips_use_hazard_barrier_return_p (const_tree decl) { > > > > + return lookup_attribute ("use_hazard_barrier_return", > > > > + DECL_ATTRIBUTES (decl)) != NULL; } > > > > + > > > > /* Return the set of compression modes that are explicitly required > > > > by the attributes in ATTRIBUTES. */ > > > > > > > > @@ -1494,6 +1505,19 @@ mips_can_inline_p (tree caller, tree callee) > > > >return default_target_can_inline_p (caller, callee); } > > > > > > > > +/* Implement TARGET_FUNCTION_ATTRIBUTE_INLINABLE_P. > > > > + > > > > + A function requesting clearing of all instruction and execution > hazards > > > > + before returning cannot be inlined - thereby not clearing any > > > > hazards. > > > > + All our other function attributes are related to how out-of-line > > > > copies > > > > + should be compiled or called. They don't in themselves > > > > + prevent inlining. */ > > > > + > > > > +static bool > > > > +mips_function_attr_inlinable_p (const_tree decl) { > > > > + return !mips_use_hazard_barrier_return_p (decl); } > > > > + > > > > /* Handle an "interrupt" attribute with an optional argument. */ > > > > > > > > static tree > > > > @@ -7921,6
Re: [PATCH] Check the type of mask while generating cond_op in gimple simplication.
On Fri, Aug 27, 2021 at 8:53 AM liuhongt wrote: > > When gimple simplifcation try to combine op and vec_cond_expr to cond_op, > it doesn't check if mask type matches. It causes an ICE when expand cond_op > with mismatched mode. > This patch add a function named cond_vectorized_internal_fn_supported_p > to additionally check mask type than vectorized_internal_fn_supported_p. > > Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. > Ok for trunk? > > gcc/ChangeLog: > > PR middle-end/102080 > * internal-fn.c (cond_vectorized_internal_fn_supported_p): New > functions. > * internal-fn.h (cond_vectorized_internal_fn_supported_p): New > declaration. > * match.pd: Check the type of mask while generating cond_op in > gimple simplication. > > gcc/testsuite/ChangeLog: > > PR middle-end/102080 > * gcc.target/i386/pr102080.c: New test. > --- > gcc/internal-fn.c| 22 ++ > gcc/internal-fn.h| 1 + > gcc/match.pd | 24 > gcc/testsuite/gcc.target/i386/pr102080.c | 16 > 4 files changed, 55 insertions(+), 8 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/i386/pr102080.c > > diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c > index 1360a00f0b9..8b2b65db1a7 100644 > --- a/gcc/internal-fn.c > +++ b/gcc/internal-fn.c > @@ -4102,6 +4102,28 @@ expand_internal_call (gcall *stmt) >expand_internal_call (gimple_call_internal_fn (stmt), stmt); > } > > +/* Check cond_op for vector modes since vectorized_internal_fn_supported_p > + doesn't check if mask type matches. */ > +bool > +cond_vectorized_internal_fn_supported_p (internal_fn ifn, tree type, > +tree mask_type) > +{ > + if (!vectorized_internal_fn_supported_p (ifn, type)) > +return false; > + > + machine_mode mask_mode; > + machine_mode vmode = TYPE_MODE (type); > + int size1, size2; > + if (VECTOR_MODE_P (vmode) > + && targetm.vectorize.get_mask_mode (vmode).exists(&mask_mode) > + && GET_MODE_SIZE (mask_mode).is_constant (&size1) > + && GET_MODE_SIZE (TYPE_MODE (mask_type)).is_constant (&size2) > + && size1 != size2) Why do we check for equal size rather than just mode equality which I think would work for non-constant sized modes as well? And when using sizes you'd instead use maybe_ne (GET_MODE_SIZE (mask_mode), GET_MODE_SIZE (TYPE_MODE (mask_type))) Thanks, Richard. > +return false; > + > + return true; > +} > + > /* If TYPE is a vector type, return true if IFN is a direct internal > function that is supported for that type. If TYPE is a scalar type, > return true if IFN is a direct internal function that is supported for > diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h > index 19d0f849a5a..f0aea00103c 100644 > --- a/gcc/internal-fn.h > +++ b/gcc/internal-fn.h > @@ -236,5 +236,6 @@ extern void expand_PHI (internal_fn, gcall *); > extern void expand_SHUFFLEVECTOR (internal_fn, gcall *); > > extern bool vectorized_internal_fn_supported_p (internal_fn, tree); > +extern bool cond_vectorized_internal_fn_supported_p (internal_fn, tree, > tree); > > #endif > diff --git a/gcc/match.pd b/gcc/match.pd > index e5bbb123a6a..72b1bc674db 100644 > --- a/gcc/match.pd > +++ b/gcc/match.pd > @@ -6987,14 +6987,18 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > cond_op (COND_BINARY) > (simplify >(vec_cond @0 (view_convert? (uncond_op@4 @1 @2)) @3) > - (with { tree op_type = TREE_TYPE (@4); } > - (if (vectorized_internal_fn_supported_p (as_internal_fn (cond_op), > op_type) > + (with { tree op_type = TREE_TYPE (@4); > + tree mask_type = TREE_TYPE (@0); } > + (if (cond_vectorized_internal_fn_supported_p (as_internal_fn (cond_op), > +op_type, mask_type) > && element_precision (type) == element_precision (op_type)) > (view_convert (cond_op @0 @1 @2 (view_convert:op_type @3)) > (simplify >(vec_cond @0 @1 (view_convert? (uncond_op@4 @2 @3))) > - (with { tree op_type = TREE_TYPE (@4); } > - (if (vectorized_internal_fn_supported_p (as_internal_fn (cond_op), > op_type) > + (with { tree op_type = TREE_TYPE (@4); > + tree mask_type = TREE_TYPE (@0); } > + (if (cond_vectorized_internal_fn_supported_p (as_internal_fn (cond_op), > +op_type, mask_type) > && element_precision (type) == element_precision (op_type)) > (view_convert (cond_op (bit_not @0) @2 @3 (view_convert:op_type @1))) > > @@ -7003,14 +7007,18 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) > cond_op (COND_TERNARY) > (simplify >(vec_cond @0 (view_convert? (uncond_op@5 @1 @2 @3)) @4) > - (with { tree op_type = TREE_TYPE (@5); } > - (if (vectorized_internal_fn_supported_p (as_internal_fn (cond_op), > op_type) > + (with { tree op_type = TREE_TYPE (@5);
Re: [PATCH v2 1/2] MIPS: use mips_isa enum instead hardcoded numbers
On 8/30/2021 12:52 AM, Xi Ruoyao via Gcc-patches wrote: These two patches look good to me. Still, need a maintainer's approval. Consider them approved. Thanks, Jeff
Re: [committed] Reduce vector comparison of uniform vectors to a scalar comparison
On 8/30/2021 12:51 AM, Richard Biener wrote: On Fri, Aug 27, 2021 at 9:31 PM Jeff Law via Gcc-patches wrote: I was working some aspects of our port's vector support and stumbled across a bit of silly code. We were comparing two vectors that were both uniform. We can simplify a comparison of uniform vectors to a comparison of a representative element from each. That in turn may expose const/copy propagation opportunities or even jump threading opportunities. And that's precisely what this patch does inside DOM. At the start of statement processing we see if we can reduce a vector comparison to a scalar comparison. You can see this in pr95308.C on x86. As we enter dom3 we have: ;; basic block 2, loop depth 0 ;;pred: ENTRY e.1_1 = e; _27 = f_26(D) != 0; _163 = e.1_1 != 0; _164 = _27 & _163; _196 = _164 ? -1 : 0; vect_cst__194 = {_196, _196, _196, _196, _196, _196, _196, _196}; [ ... ] ;; basic block 8, loop depth 1 ;;pred: 7 ;;6 if (vect_cst__194 == { 0, 0, 0, 0, 0, 0, 0, 0 }) goto ; [100.00%] else goto ; [20.00%] ;;succ: 10 ;;9 There's another similar comparison later. We transform the comparison into: ;; basic block 8, loop depth 1 ;;pred: 7 ;;6 if (_196 == 0) goto ; [100.00%] else goto ; [20.00%] ;;succ: 13 ;;9 There's nice secondary effects that show up after the transformation as well. It's an interesting case indeed, but I think a match.pd rule would have been better to address this. Sth like (for cmp (eq ne) (simplify (cmp vec_same_elem_p@0 vec_same_elem_p@1) (if (INTEGRAL_TYPE_P (type) && (TREE_CODE (type) == BOOLEAN_TYPE || TYPE_PRECISION (type) == 1)) (cmp { uniform_vector_p (@0); } { uniform_vector_p (@1); } should do the trick. That makes the transform available to more places (and DOM should also pick it up this way). Good point. I'll give that a whirl. jeff
Re: [PATCH] libffi: Fix MIPS r6 support
On 8/30/2021 2:47 AM, YunQiang Su wrote: 在 2021/8/30 5:00, Jeff Law 写道: On 8/28/2021 1:23 AM, Xi Ruoyao wrote: On Fri, 2021-08-27 at 15:28 -0600, Jeff Law via Gcc-patches wrote: On 8/26/2021 10:58 PM, YunQiang Su wrote: for some instructions, MIPS r6 uses different encoding other than the previous releases. 1. mips/n32.S disable .set mips4: since it casuses old insn encoding is used. https://github.com/libffi/libffi/pull/396 2. mips/ffi.c: the encoding for JR is hardcoded: we need to use different value for r6 and pre-r6. https://github.com/libffi/libffi/pull/401 libffi/ PR libffi/83636 * src/mips/n32.S: disable .set mips4 * src/mips/ffi.c: use different JR encoding for r6. These should go to the upstream libffi project. Once accepted there you can add them to GCC. Hi Jeff, The two PRs are already merged, and released since libffi-3.3.0 (now the upstream latest release is 3.4.2). ACK. Thanks for confirming. It's always OK to cherrypick/backport from libffi back to GCC. I don't have a MIPSr6 so I can't test though. Understood. Me neither, but I really should get a tiny chroot for mipsr6 so that my tester can validate it regularly. We have port Debian to MIPS64r6el. http://58.246.137.130:20180/tools/tarball/ You can use both buster(Debian 10) or bullseye (Debian 11). And both qemu-system and qemu-user can work. Understood. I need the minimal chroot to compress down under 100M to make gitlab happy (that's where my scripts & jenkins system expect to find them). My scripts to build the chroots also ensure a consistent set of packages & version #s across the different systems being tested. I've just never gotten around to building one for mips64r6. I could probably take yours trim out unnecessary stuff and use it for now. Getting a bootstrap test on the target weekly is definitely helpful. jeff
Re: [PATCH v2] x86-64: Add ABI warning for 64-bit vectors
On Sun, Aug 29, 2021 at 12:11:23PM -0700, H.J. Lu wrote: > --- a/gcc/config/i386/i386.c > +++ b/gcc/config/i386/i386.c > @@ -1840,6 +1840,54 @@ init_cumulative_args (CUMULATIVE_ARGS *cum, /* > Argument info to initialize */ >cfun->machine->arg_reg_available = (cum->nregs > 0); > } > > +/* Return the single 64-bit vector type of TYPE. */ > + > +static const_tree > +single_m64_base_type (const_tree type) > +{ > + if ((TREE_CODE (type) == RECORD_TYPE > + || TREE_CODE (type) == UNION_TYPE > + || TREE_CODE (type) == QUAL_UNION_TYPE) > + && int_size_in_bytes (type) == 8) > +{ > + const_tree field; > + const_tree first_field = nullptr; > + > + for (field = TYPE_FIELDS (type); > +field; > +field = DECL_CHAIN (field)) > + if (TREE_CODE (field) == FIELD_DECL) > + { > + if (TREE_TYPE (field) == error_mark_node) > + continue; > + > + /* Skip if structure has more than one field. */ > + if (first_field) > + return nullptr; > + > + first_field = field; > + } > + > + /* Skip if structure doesn't have any fields. */ > + if (!first_field) > + return nullptr; Is this an attempt to emulate compute_record_mode or something else? How should it treat zero-width bitfields (either the C ones kept in the structures or C++ ones formerly removed from them and now no longer)? compute_record_mode actually has more complicated details... Jakub
rs6000: Backports of mode promote patches to GCC 11
Hi! I have backported 9080a3bf2329, a3f6bd789149, and f0529d96f567 to the GCC 11 branch. This solves PR102062, but is a >2% performance win for such a trivial patch, too, which is enough reason on its own :-) Segher
[PING #2][PATCH] enable ranger and caching in pass_waccess
Ping: https://gcc.gnu.org/pipermail/gcc-patches/2021-August/578135.html Are there any further comments on the patch? Richard, you were kind enough to review the first two patches in this series. Would you mind doing the same for this one? It continues in the same vein. Martin On 8/25/21 3:26 PM, Martin Sebor wrote: On 8/25/21 10:14 AM, Andrew MacLeod wrote: On 8/25/21 11:20 AM, Martin Sebor wrote: Ping: Andrew, did I answer your questions? Do you (or anyone else) have any other comments on the latest patch below? https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577865.html I wasn't attempting to block it, its outside my review purview.. I was merely commenting that you should not need a pointer to a range_query at all anymore Are you planning to transition to using the get_range_query() interface instead of keeping a range_query pointer in the pointer_query class? This pass and to a smaller extent the pointer_query class that's used by it and the strlen pass are still a work in progress. I also still need to convert the strlen pass to use Ranger and I expect it will take some changes to pointer_query. So at that point, if going through get_range_query (cfun) everywhere is what you recommend, I'm happy to do it. absolutely. you should not need to even know whether you have a ranger instance running or not. get_range_query will give you whichever is active, and there is ALWAYS something active.. defaulting to the global version. the code in get_range() seems to be the end of the call chain which uses the pointer and should be consolidated to something much simpler if (rvals && stmt) { if (!rvals->range_of_expr (vr, val, stmt)) return NULL_TREE; <...> // ?? This entire function should use get_range_query or get_global_range_query (), // instead of doing something different for RVALS and global ranges. if (!get_global_range_query ()->range_of_expr (vr, val) || vr.undefined_p ()) return NULL_TREE; This entire section can boil down to something like if (!get_range_query (cfun)->range_of_expr (vr, val, stmt)) return NULL; So get_range_query (cfun) will never be null? And no special handling of INTEGER_CST is needed, or checking for SSA_NAME. Nice. Attached is another revision of the same patch with this function simplified. (FYI: I expect the whole function to eventually go away, and to make other similar simplifications, but I'm not there yet.) PS There has been an effort to get rid of global variables from GCC, or, as the first step, to avoid accessing them directly(*). If and when that happens, it seems like each pass will have to store either the ranger instance as a member (directly or indirectly, via a member of a class that stores it) or the function passed to pass::execute() if it wants to access either. [*] https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573975.html The patch at the link above wasn't approved but IIUC removing globals from GCC is still a goal. I have no idea what direction that is going, but the net effect will be the same in the end. You'll be calling get_range_query() with either the function pointer, or the pass pointer, or something.. but you should never need to pass around a pointer to either a ranger or range-query. As I said earlier, this will likely be a pass property and accessed thru the pass eventually... but until then, use what we have.. cfun. It'll be trivial to transition down the road to whatever the ultimate solution is. Okay, I'll do that in my future changes. Martin
[committed] libphobos: Compile configure tests with -fno-druntime
Hi, This patch changes WITH_LOCAL_DRUNTIME to build with `-fno-druntime'. The D tests done at configure-time for libphobos don't require a functional D run-time, so don't enable any run-time features. Bootstrapped and regression tested on x86_64-linux-gnu/-m32/-mx32, and committed to mainline. Regards, Iain. --- libphobos/ChangeLog: * configure: Regenerate. * m4/autoconf.m4 (AC_LANG_PROGRAM): Declare module name 'object'. * m4/gcc_support.m4 (WITH_LOCAL_DRUNTIME): Compile tests with -fno-druntime. --- libphobos/configure | 28 ++-- libphobos/m4/autoconf.m4| 2 +- libphobos/m4/gcc_support.m4 | 2 +- 3 files changed, 16 insertions(+), 16 deletions(-) diff --git a/libphobos/configure b/libphobos/configure index b1c8ecb5673..14298a0dc4f 100755 --- a/libphobos/configure +++ b/libphobos/configure @@ -5382,7 +5382,7 @@ fi gdc_save_DFLAGS=$GDCFLAGS - GDCFLAGS="-fno-moduleinfo -nostdinc -I $phobos_cv_abs_srcdir/libdruntime $GDCFLAGS" + GDCFLAGS="-fno-druntime -nostdinc -I $phobos_cv_abs_srcdir/libdruntime $GDCFLAGS" ac_ext=d ac_compile='$GDC -c $GDCFLAGS conftest.$ac_ext >&5' @@ -5392,7 +5392,7 @@ ac_compiler_gnu=yes { $as_echo "$as_me:${as_lineno-$LINENO}: checking If $GDC can compile D sources" >&5 $as_echo_n "checking If $GDC can compile D sources... " >&6; } cat > conftest.$ac_ext <<_ACEOF -module mod; +module object; extern(C) int main() { @@ -12097,7 +12097,7 @@ CC="$lt_save_CC" gdc_save_DFLAGS=$GDCFLAGS - GDCFLAGS="-fno-moduleinfo -nostdinc -I $phobos_cv_abs_srcdir/libdruntime $GDCFLAGS" + GDCFLAGS="-fno-druntime -nostdinc -I $phobos_cv_abs_srcdir/libdruntime $GDCFLAGS" # Source file extension for D test sources. @@ -14089,7 +14089,7 @@ fi gdc_save_DFLAGS=$GDCFLAGS - GDCFLAGS="-fno-moduleinfo -nostdinc -I $phobos_cv_abs_srcdir/libdruntime -nophoboslib $GDCFLAGS" + GDCFLAGS="-fno-druntime -nostdinc -I $phobos_cv_abs_srcdir/libdruntime -nophoboslib $GDCFLAGS" ac_ext=d ac_compile='$GDC -c $GDCFLAGS conftest.$ac_ext >&5' @@ -14098,7 +14098,7 @@ ac_compiler_gnu=yes GDCFLAGS="$GDCFLAGS -g -Werror -ffunction-sections -fdata-sections" cat > conftest.$ac_ext <<_ACEOF -module mod; +module object; int foo; void bar() { } extern(C) int main() { @@ -14562,7 +14562,7 @@ ac_compiler_gnu=$ac_cv_c_compiler_gnu gdc_save_DFLAGS=$GDCFLAGS - GDCFLAGS="-fno-moduleinfo -nostdinc -I $phobos_cv_abs_srcdir/libdruntime -nophoboslib $GDCFLAGS" + GDCFLAGS="-fno-druntime -nostdinc -I $phobos_cv_abs_srcdir/libdruntime -nophoboslib $GDCFLAGS" ac_ext=d ac_compile='$GDC -c $GDCFLAGS conftest.$ac_ext >&5' @@ -14576,7 +14576,7 @@ if ${ac_cv_search_malloc+:} false; then : else ac_func_search_save_LIBS=$LIBS cat > conftest.$ac_ext <<_ACEOF -module mod; +module object; extern(C) int malloc(); extern(C) int main() { @@ -14622,7 +14622,7 @@ if ${ac_cv_search_pthread_create+:} false; then : else ac_func_search_save_LIBS=$LIBS cat > conftest.$ac_ext <<_ACEOF -module mod; +module object; extern(C) int pthread_create(); extern(C) int main() { @@ -14668,7 +14668,7 @@ if ${ac_cv_search_cosf+:} false; then : else ac_func_search_save_LIBS=$LIBS cat > conftest.$ac_ext <<_ACEOF -module mod; +module object; extern(C) int cosf(); extern(C) int main() { @@ -14714,7 +14714,7 @@ if ${ac_cv_search_clock_gettime+:} false; then : else ac_func_search_save_LIBS=$LIBS cat > conftest.$ac_ext <<_ACEOF -module mod; +module object; extern(C) int clock_gettime(); extern(C) int main() { @@ -14765,7 +14765,7 @@ $as_echo_n "checking for atomic builtins for byte... " >&6; } else cat > conftest.$ac_ext <<_ACEOF -module mod; +module object; import gcc.builtins; extern(C) int main() { @@ -14799,7 +14799,7 @@ $as_echo_n "checking for atomic builtins for short... " >&6; } else cat > conftest.$ac_ext <<_ACEOF -module mod; +module object; import gcc.builtins; extern(C) int main() { @@ -14833,7 +14833,7 @@ $as_echo_n "checking for atomic builtins for int... " >&6; } else cat > conftest.$ac_ext <<_ACEOF -module mod; +module object; import gcc.builtins; extern(C) int main() { @@ -14867,7 +14867,7 @@ $as_echo_n "checking for atomic builtins for long... " >&6; } else cat > conftest.$ac_ext <<_ACEOF -module mod; +module object; import gcc.builtins; extern(C) int main() { diff --git a/libphobos/m4/autoconf.m4 b/libphobos/m4/autoconf.m4 index f46a7809f43..f0ca947478c 100644 --- a/libphobos/m4/autoconf.m4 +++ b/libphobos/m4/autoconf.m4 @@ -27,7 +27,7 @@ AU_DEFUN([AC_LANG_D], [AC_LANG(D)]) # AC_LANG_PROGRAM(D)([PROLOGUE], [BODY]) # --- m4_define([AC_LANG_PROGRAM(D)], -[module mod; +[module object; $1 extern(C) int main() { diff --git a/libphobos/m4/gcc_support.m4 b/libphobos/m4/gcc_support.m4 index 0903ed4b45f..cc1acb4a230 100644 --- a/libphobos/m4/gcc_support.m4 +++ b/libphobos/m4/
[PATCH] c++: New way to sanitize destructor calls [PR101355]
Hi! The following patch changes the way destructor calls are sanitized. Currently, they are sanitized just like any other member calls, transforming `a::~a(this)' to `a::~a(.UBSAN_NULL(SAVE_EXPR(this)), SAVE_EXPR(this))'. However, this is a problem for coroutines. In some cases, a destructor call is cloned to another location, so then the saved expression is not evaluated before the second call to the destructor. The patch gets rid of SAVE_EXPRs by creating a temporary variable, so the calls are transformed to something resembling: { void *ptr; ptr = this; .UBSAN_NULL(ptr); a::~a(ptr); } Unfortunately, the implementation is not so straight-forward. Firstly, we do not want to sanitize again already sanitized call. I accomplish this by creating a hash set of instrumented expressions on the way to the tree's root. Secondly, the affected calls are cloned, so I need to explicitly copy the tree node to change it's argument. I am not sure if the proposed solution is optimal and will not cause the creation of an exponential number of nodes. The patch survives bootstrapping and regression testing on x86_64-pc-linux-gnu. If approved, I would like to note that I do not have write access to the repository and this is my first contribution to GCC. 2021-08-30 Dan Klishch PR c++/101355 gcc/testsuite/ * g++.dg/coroutines/pr101355.C: New test. gcc/c-family/ * c-ubsan.h (ubsan_maybe_instrument_member_call): Change prototype. * c-ubsan.c: Add new include. (ubsan_should_instrument_reference_or_call): Add new function. (ubsan_maybe_instrument_reference_or_call): Add possibility to convert call-expr to bind-expr, so it would not use save-expr. (ubsan_maybe_instrument_member_call): Add flag to allow using new functionality of the previous. (ubsan_maybe_instrument_reference): Fix arguments of call to ubsan_maybe_instrument_reference_or_call. gcc/cp/ * cp-gimplify.c (struct cp_genericize_data): Add instrumented_dtor_set field. (cp_genericize_r): Tell ubsan to instrument destructors without using save-expr. (cp_genericize_tree): Add initialization of instrumented_dtor_set. diff --git a/gcc/c-family/c-ubsan.c b/gcc/c-family/c-ubsan.c index 12a7bca5c32..593836984a5 100644 --- a/gcc/c-family/c-ubsan.c +++ b/gcc/c-family/c-ubsan.c @@ -23,6 +23,7 @@ along with GCC; see the file COPYING3. If not see #include "coretypes.h" #include "tm.h" #include "c-family/c-common.h" +#include "tree-iterator.h" #include "ubsan.h" #include "c-family/c-ubsan.h" #include "stor-layout.h" @@ -398,22 +399,15 @@ ubsan_maybe_instrument_array_ref (tree *expr_p, bool ignore_off_by_one) } } -static tree -ubsan_maybe_instrument_reference_or_call (location_t loc, tree op, tree ptype, - enum ubsan_null_ckind ckind) +static bool +ubsan_should_instrument_reference_or_call (tree op, tree ptype, + unsigned int &mina) { - if (!sanitize_flags_p (SANITIZE_ALIGNMENT | SANITIZE_NULL) - || current_function_decl == NULL_TREE) -return NULL_TREE; - - tree type = TREE_TYPE (ptype); - tree orig_op = op; bool instrument = false; - unsigned int mina = 0; if (sanitize_flags_p (SANITIZE_ALIGNMENT)) { - mina = min_align_of_type (type); + mina = min_align_of_type (TREE_TYPE (ptype)); if (mina <= 1) mina = 0; } @@ -454,19 +448,68 @@ ubsan_maybe_instrument_reference_or_call (location_t loc, tree op, tree ptype, instrument = true; } } - if (!instrument) + return instrument; +} + +static tree +ubsan_maybe_instrument_reference_or_call (location_t loc, tree op, tree ptype, + enum ubsan_null_ckind ckind, + bool use_save_expr, tree *stmt_p, + tree *copied_call) +{ + if (!sanitize_flags_p (SANITIZE_ALIGNMENT | SANITIZE_NULL) + || current_function_decl == NULL_TREE) return NULL_TREE; - op = save_expr (orig_op); + + unsigned int mina = 0; + if (!ubsan_should_instrument_reference_or_call (op, ptype, mina)) +return NULL_TREE; + gcc_assert (POINTER_TYPE_P (ptype)); if (TREE_CODE (ptype) == REFERENCE_TYPE) ptype = build_pointer_type (TREE_TYPE (ptype)); tree kind = build_int_cst (ptype, ckind); tree align = build_int_cst (pointer_sized_int_node, mina); - tree call -= build_call_expr_internal_loc (loc, IFN_UBSAN_NULL, void_type_node, - 3, op, kind, align); - TREE_SIDE_EFFECTS (call) = 1; - return fold_build2 (COMPOUND_EXPR, TREE_TYPE (op), call, op); + + if (use_save_expr) +{ + op = save_expr (op); + tree call = build_call_expr_internal_loc ( + loc, IFN_UBSAN_NULL, void_type_node, 3, op, kind, align); + TREE_SIDE_EFFECTS (call) = 1; + return fold_build2 (COMPOUND_
Re: [PATCH] Make sure we're playing with integral modes before call extract_integral_bit_field.
This commit introduces an ICE building libgcc for 32-bit SPARC. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102133 -- Joseph S. Myers jos...@codesourcery.com
[PATCH] libiberty, configure, Darwin: Avoid detecting deprecated sbrk.
Hi, Darwin provides an implementation of sbrk, which is detected by the libiberty configuration process. However, (like most of the BSD-derivatives) sbrk/brk are deprecated on Darwin which leads to build-time warnings. It seems that the configure process does not see the deprecation warnings as reason for excluding the fn. Darwin should use the malloc-based implementation. This patch works around the issue by removing sbrk from the functions searched (for Darwin only, although it’s likely that other BSD-ish ports might wish to do the same). Open to more elegant solutions, of course, tested on powerpc,i686,x86_64-darwin, x86_64, powerpc64- linux, OK for master? thanks Iain Signed-off-by: Iain Sandoe libiberty/ChangeLog: * configure: Regenerate. * configure.ac: Do not search for sbrk on Darwin. * xmalloc.c: Do not declare sbrk unless it has been found by configure. --- libiberty/configure| 43 -- libiberty/configure.ac | 15 +-- libiberty/xmalloc.c| 2 ++ 3 files changed, 35 insertions(+), 25 deletions(-) diff --git a/libiberty/configure.ac b/libiberty/configure.ac index a85ff25501a..4b78c1830c7 100644 --- a/libiberty/configure.ac +++ b/libiberty/configure.ac @@ -395,9 +395,16 @@ vars="sys_errlist sys_nerr sys_siglist" checkfuncs="__fsetlocking canonicalize_file_name dup3 getrlimit getrusage \ getsysinfo gettimeofday on_exit pipe2 psignal pstat_getdynamic pstat_getstatic \ - realpath setrlimit sbrk spawnve spawnvpe strerror strsignal sysconf sysctl \ + realpath setrlimit spawnve spawnvpe strerror strsignal sysconf sysctl \ sysmp table times wait3 wait4" +# Darwin has sbrk, but it is deprecated and that produces build-time warnings +# so do not check for it. +case "${host}" in + *-*-darwin*) ;; + *) checkfuncs="$checkfuncs sbrk" +esac + # These are neither executed nor required, but they help keep # autoheader happy without adding a bunch of text to acconfig.h. if test "x" = "y"; then @@ -695,7 +702,11 @@ if test -z "${setobjs}"; then AC_CHECK_FUNCS($checkfuncs) AC_CHECK_DECLS([basename(char *), ffs, asprintf, vasprintf, snprintf, vsnprintf]) - AC_CHECK_DECLS([calloc, getenv, getopt, malloc, realloc, sbrk]) + AC_CHECK_DECLS([calloc, getenv, getopt, malloc, realloc]) + case "${host}" in + *-*-darwin*) ;; # Darwin's sbrk implementation is deprecated. + *) AC_CHECK_DECLS([sbrk]);; + esac AC_CHECK_DECLS([strtol, strtoul, strtoll, strtoull]) AC_CHECK_DECLS([strverscmp]) AC_CHECK_DECLS([strnlen]) diff --git a/libiberty/xmalloc.c b/libiberty/xmalloc.c index 85826c1e10f..b9e23c38643 100644 --- a/libiberty/xmalloc.c +++ b/libiberty/xmalloc.c @@ -87,7 +87,9 @@ extern "C" { void *malloc (size_t); void *realloc (void *, size_t); void *calloc (size_t, size_t); +#ifdef HAVE_SBRK void *sbrk (ptrdiff_t); +#endif #ifdef __cplusplus } #endif /* __cplusplus */
[committed] hppa: Fix libgfortran build on hppa*-hp-hpux[01]*
The following change fixes the build of libgfortran on hppa*-hp-hpux[01]*. Tested on hppa64-hp-hpux11.11 and hppa2.0w-hp-hpux11.11. Committed to trunk. Dave --- Fix libgfortran build on hppa*-hp-hpux[01]* Add include hack to define PRIdPTR, PRIiPTR, PRIoPTR, PRIuPTR, PRIxPTR and PRIXPTR in inttypes.h. 2021-08-30 John David Anglin fixincludes/ChangeLog: * inclhack.def (hpux_c99_inttypes5): New hack to define PRIdPTR, etc. * fixincl.x: Regenerate. * tests/base/inttypes.h: Update. diff --git a/fixincludes/inclhack.def b/fixincludes/inclhack.def index b7ad6982e96..46e3b8c993a 100644 --- a/fixincludes/inclhack.def +++ b/fixincludes/inclhack.def @@ -2668,6 +2668,34 @@ fix = { "#define SCNxMAX SCNx32\n"; }; +/* + * Fix missing PRIdPTR defines in inttypes.h + */ +fix = { +hackname = hpux_c99_inttypes5; +mach = "hppa*-hp-hpux11.[01]*"; +files = inttypes.h; +select= "#ifndef[ \t]_KERNEL[ \t]*\n"; +c_fix = format; +c_fix_arg = "#ifndef __LP64__\n" + "#define PRIdPTR\t\tPRId32\n" + "#define PRIiPTR\t\tPRIi32\n" + "#define PRIoPTR\t\tPRIo32\n" + "#define PRIuPTR\t\tPRIu32\n" + "#define PRIxPTR\t\tPRIx32\n" + "#define PRIXPTR\t\tPRIX32\n" + "#else\n" + "#define PRIdPTR\t\tPRId64\n" + "#define PRIiPTR\t\tPRIi64\n" + "#define PRIoPTR\t\tPRIo64\n" + "#define PRIuPTR\t\tPRIu64\n" + "#define PRIxPTR\t\tPRIx64\n" + "#define PRIXPTR\t\tPRIX64\n" + "#endif\n\n" + "#ifndef _KERNEL\n"; +test_text = "#ifndef _KERNEL\n"; +}; + /* * Fix hpux broken ctype macros */ diff --git a/fixincludes/tests/base/inttypes.h b/fixincludes/tests/base/inttypes.h index 144ea6596e8..9c1f93eb736 100644 --- a/fixincludes/tests/base/inttypes.h +++ b/fixincludes/tests/base/inttypes.h @@ -42,3 +42,25 @@ #define SCNuMAX SCNu32 #endif /* HPUX_C99_INTTYPES4_CHECK */ + + +#if defined( HPUX_C99_INTTYPES5_CHECK ) +#ifndef __LP64__ +#define PRIdPTRPRId32 +#define PRIiPTRPRIi32 +#define PRIoPTRPRIo32 +#define PRIuPTRPRIu32 +#define PRIxPTRPRIx32 +#define PRIXPTRPRIX32 +#else +#define PRIdPTRPRId64 +#define PRIiPTRPRIi64 +#define PRIoPTRPRIo64 +#define PRIuPTRPRIu64 +#define PRIxPTRPRIx64 +#define PRIXPTRPRIX64 +#endif + +#ifndef _KERNEL + +#endif /* HPUX_C99_INTTYPES5_CHECK */
[PATCH] gdb: Add a dependency between gdb and libbacktrace
I plan to make use of libbacktrace within GDB. I believe that the patch below needs to be merged into GCCs toplevel directory and then back-ported to the binutils-gdb repository. Is this OK to merge? Thanks, Andrew --- GDB is going to start using libbacktrace, so add a build dependency between the two projects. This change needs to be added into the GCC toplevel files, and then back-ported to the binutils-gdb repository. ChangeLog: * Makefile.def: Add all-gdb dependency on all-libbacktrace. * Makefile.in: Regenerate. --- ChangeLog| 5 + Makefile.def | 1 + Makefile.in | 1 + 3 files changed, 7 insertions(+) diff --git a/Makefile.def b/Makefile.def index fbfdb6fee08..de3e0052106 100644 --- a/Makefile.def +++ b/Makefile.def @@ -423,6 +423,7 @@ dependencies = { module=all-gdb; on=all-sim; }; dependencies = { module=all-gdb; on=all-libdecnumber; }; dependencies = { module=all-gdb; on=all-libtermcap; }; dependencies = { module=all-gdb; on=all-libctf; }; +dependencies = { module=all-gdb; on=all-libbacktrace; }; // Host modules specific to gdbserver. dependencies = { module=configure-gdbserver; on=all-gnulib; }; diff --git a/Makefile.in b/Makefile.in index 5c85fcc5dd0..61af99dc75a 100644 --- a/Makefile.in +++ b/Makefile.in @@ -61237,6 +61237,7 @@ all-gdb: maybe-all-libiconv all-gdb: maybe-all-opcodes all-gdb: maybe-all-libdecnumber all-gdb: maybe-all-libctf +all-gdb: maybe-all-libbacktrace all-gdbserver: maybe-all-libiberty configure-gdbsupport: maybe-configure-intl all-gdbsupport: maybe-all-intl -- 2.25.4
Re: [EXTERNAL] Re: [PATCH] Propagate get_nonzero_bits information in division [PR77980]
Thanks Jeff. I've reached out to Roger to see if the fix would be better suited in CCP. If that isn't the right spot, I'll reach out to Aldy and Andrew about getting the fix in VRP/Ranger. From: Jeff Law Sent: Sunday, August 22, 2021 8:11 PM To: Victor Tong ; gcc-patches@gcc.gnu.org ; pins...@gcc.gnu.org Subject: [EXTERNAL] Re: [PATCH] Propagate get_nonzero_bits information in division [PR77980] On 7/26/2021 6:45 PM, Victor Tong via Gcc-patches wrote: > This change enables the "t1 != 0" check to be optimized away in this code: > > int x1 = 0; > unsigned int x2 = 1; > > int main () > { > int t1 = x1*(1/(x2+x2)); > if (t1 != 0) __builtin_abort(); > return 0; > } > > The change utilizes the VRP framework to propagate the get_nonzero_bits > information from the "x2+x2" expression to the "1/(x2+x2)" division > expression. Specifically, the framework knows that the least significant bit > of the "x2+x2" expression must be zero. > > The get_nonzero_bits information of the left hand side and right hand side of > expressions needed to be passed down to operator_div::wi_fold() in the VRP > framework. The majority of this change involves adding two additional > parameters to propagate this information. There are future opportunities to > use the non zero bit information to perform better optimizations in other > types of expressions. > > The changes were tested against x86_64-pc-linux-gnu and all tests in "make -k > check" passed. > > The original approach was to implement a match.pd pattern to support this but > the pattern wasn't being triggered. More context is available in: > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.gnu.org%2Fbugzilla%2Fshow_bug.cgi%3Fid%3D77980&data=04%7C01%7Cvitong%40microsoft.com%7C08dcd27e5482418d559708d965e3a826%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637652850726534114%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=S1WWWvuXwEIA20pmv%2Fn8kk6xlwv9VwsS%2BTl0jqEHf4w%3D&reserved=0 So you're going to want to sync with Aldy & Andrew as they're the experts on the Ranger design & implementation. This hits the Ranger API as well as design questions about how best to tie in the nonzero_bits capabilities. You might also want to reach out to Roger Sayle. He's been poking around in a closely related area, though more focused on the bitwise conditional constant propagation rather than Ranger/VRP. In fact, I just acked a patch of his that looks closely related. https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgcc.gnu.org%2Fpipermail%2Fgcc-patches%2F2021-August%2F577888.html&data=04%7C01%7Cvitong%40microsoft.com%7C08dcd27e5482418d559708d965e3a826%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C637652850726534114%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=gn1L1exTW5TjdV8CDBDqB0Z5a4V7EP2tBM2uTF2ihck%3D&reserved=0 Jeff
[pushed] c++: Add warning about missing 'requires'
I noticed that concepts-lambda14.C had two useless requires-expressions: static_assert(requires { C; }); always succeeds, because C is always a valid expression for any type, regardless of whether C is satisfied for a particular type. Presumably the user means static_assert(requires { requires C; }); to make the C a nested-requirement. Of course, static_assert(C); is much simpler and means the same thing; this is more relevant in the middle of a longer requires-expression, such as the bug this warning found in cmcstl2: template META_CONCEPT input_iterator = input_or_output_iterator && readable && requires(I& i, const I& ci) { typename iterator_category_t; derived_from, input_iterator_tag>; i++; }; where 'requires' is missing before 'derived_from'. Tested x86_64-pc-linux-gnu, applying to trunk. gcc/ChangeLog: * doc/invoke.texi: Document -Wmissing-requires. gcc/c-family/ChangeLog: * c.opt: Add -Wmissing-requires. gcc/cp/ChangeLog: * parser.c (cp_parser_simple_requirement): Warn about missing requires. gcc/testsuite/ChangeLog: * g++.dg/cpp2a/concepts-lambda14.C: Add expected warnings. --- gcc/doc/invoke.texi | 22 +++ gcc/c-family/c.opt| 4 gcc/cp/parser.c | 19 .../g++.dg/cpp2a/concepts-lambda14.C | 4 ++-- 4 files changed, 47 insertions(+), 2 deletions(-) diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index f7bb193b51d..8969bac664d 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -8752,6 +8752,28 @@ s x = @{ @}; This warning is included in @option{-Wextra}. To get other @option{-Wextra} warnings without this one, use @option{-Wextra -Wno-missing-field-initializers}. +@item -Wno-missing-requires +@opindex Wmissing-requires +@opindex Wno-missing-requires + +By default, the compiler warns about a concept-id appearing as a C++20 simple-requirement: + +@smallexample +bool satisfied = requires @{ C @}; +@end smallexample + +Here @samp{satisfied} will be true if @samp{C} is a valid +expression, which it is for all T. Presumably the user meant to write + +@smallexample +bool satisfied = requires @{ requires C @}; +@end smallexample + +so @samp{satisfied} is only true if concept @samp{C} is satisfied for +type @samp{T}. + +This warning can be disabled with @option{-Wno-missing-requires}. + @item -Wno-multichar @opindex Wno-multichar @opindex Wmultichar diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt index 91929706aff..c5fe90003f2 100644 --- a/gcc/c-family/c.opt +++ b/gcc/c-family/c.opt @@ -839,6 +839,10 @@ Wmissing-field-initializers C ObjC C++ ObjC++ Var(warn_missing_field_initializers) Warning EnabledBy(Wextra) Warn about missing fields in struct initializers. +Wmissing-requires +C++ ObjC++ Var(warn_missing_requires) Init(1) Warning +Warn about likely missing requires keyword. + Wmultistatement-macros C ObjC C++ ObjC++ Var(warn_multistatement_macros) Warning LangEnabledBy(C ObjC C++ ObjC++,Wall) Warn about unsafe macros expanding to multiple statements used as a body of a clause such as if, else, while, switch, or for. diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c index a959c71dfa3..797e70ba5bb 100644 --- a/gcc/cp/parser.c +++ b/gcc/cp/parser.c @@ -29911,6 +29911,25 @@ cp_parser_simple_requirement (cp_parser *parser) if (expr.get_location() == UNKNOWN_LOCATION) expr.set_location (start); + for (tree t = expr; ; ) +{ + if (TREE_CODE (t) == TRUTH_ANDIF_EXPR + || TREE_CODE (t) == TRUTH_ORIF_EXPR) + { + t = TREE_OPERAND (t, 0); + continue; + } + if (concept_check_p (t)) + { + gcc_rich_location richloc (get_start (start)); + richloc.add_fixit_insert_before (start, "requires "); + warning_at (&richloc, OPT_Wmissing_requires, "testing " + "if a concept-id is a valid expression; add " + "% to check satisfaction"); + } + break; +} + return finish_simple_requirement (expr.get_location (), expr); } diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-lambda14.C b/gcc/testsuite/g++.dg/cpp2a/concepts-lambda14.C index bdc893da857..02b6b6a8438 100644 --- a/gcc/testsuite/g++.dg/cpp2a/concepts-lambda14.C +++ b/gcc/testsuite/g++.dg/cpp2a/concepts-lambda14.C @@ -11,9 +11,9 @@ void foo() noexcept(!__is_same(T, void)) { } template auto f() { - return [](T, bool a = requires { C; }){ + return [](T, bool a = requires { C; }){ // { dg-warning Wmissing-requires } static_assert(requires { requires C && (C || C); }); // { dg-error "assert" } -static_assert(requires { C; }); +static_assert(requires { C; }); // { dg-warning Wmissing-requires } static_assert(requires { { foo() } noexcept -> C; }); static_assert(!requires { type
[PATCH] Fix PR 90142: contrib/download_prerequisites uses test ==
From: Andrew Pinski Since == is not portable, it is better to use = in contrib/ download_prerequisites. The only place == was used is inside the function md5_check which is used only on Mac OS X. Tested on Mac OS X as: ./contrib/download_prerequisites --md5 Both with all files having the correct checksum and one with a broken one. contrib/ChangeLog: * download_prerequisites (md5_check): Replace == inside test with = to be more portable. --- contrib/download_prerequisites | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/contrib/download_prerequisites b/contrib/download_prerequisites index 8f69b61f5a9..11c283ecb1a 100755 --- a/contrib/download_prerequisites +++ b/contrib/download_prerequisites @@ -113,7 +113,7 @@ md5_check() { md5_checksum_output=$(md5 -r "${file_to_check}") # Grab the text before the first space md5_checksum_detected="${md5_checksum_output%% *}" - [ "${md5_checksum_expected}" == "${md5_checksum_detected}" ] \ + [ "${md5_checksum_expected}" = "${md5_checksum_detected}" ] \ || die "Cannot verify integrity of possibly corrupted file ${file_to_check}" echo "${file_to_check}: OK" } -- 2.17.1
[PATCH, committed] PR fortran/101349 - ICE in gfc_get_descriptor_field, at fortran/trans-array.c:140
The check of arguments to ALLOCATE did not properly implement F2008:C628 / F2018:C932, as it excluded unlimited polymophics, in contrast to the standard text. Fix this. Committed Steve's patch as obvious after regtesting on x86_64-pc-linux-gnu. Thanks, Harald Fortran - correct check for constraint F2008:C628 / F2018:C932 gcc/fortran/ChangeLog: PR fortran/101349 * resolve.c (resolve_allocate_expr): An unlimited polymorphic argument to ALLOCATE must be ALLOCATABLE or a POINTER. Fix the corresponding check. gcc/testsuite/ChangeLog: PR fortran/101349 * gfortran.dg/unlimited_polymorphic_33.f90: New test. diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c index f641d0d4dae..d7aa2868cef 100644 --- a/gcc/fortran/resolve.c +++ b/gcc/fortran/resolve.c @@ -7829,8 +7829,9 @@ resolve_allocate_expr (gfc_expr *e, gfc_code *code, bool *array_alloc_wo_spec) } } - /* Check for F08:C628. */ - if (allocatable == 0 && pointer == 0 && !unlimited) + /* Check for F08:C628 (F2018:C932). Each allocate-object shall be a data + pointer or an allocatable variable. */ + if (allocatable == 0 && pointer == 0) { gfc_error ("Allocate-object at %L must be ALLOCATABLE or a POINTER", &e->where); diff --git a/gcc/testsuite/gfortran.dg/unlimited_polymorphic_33.f90 b/gcc/testsuite/gfortran.dg/unlimited_polymorphic_33.f90 new file mode 100644 index 000..8bb8864cb26 --- /dev/null +++ b/gcc/testsuite/gfortran.dg/unlimited_polymorphic_33.f90 @@ -0,0 +1,18 @@ +! { dg-do compile } +! PR fortran/101349 - ICE in gfc_get_descriptor_field +! Check constraint F2008:C628 / F2018:C932 + +subroutine s(x) + class(*) :: x(:) + allocate (x, source=['abc']) ! { dg-error "must be ALLOCATABLE or a POINTER" } +end + +subroutine t(x) + class(*), allocatable :: x(:) + allocate (x, source=['abc']) +end + +subroutine u(x) + class(*), pointer :: x(:) + allocate (x, source=['abc']) +end
[PATCH, committed] PR fortran/102113 - parsing error in assigned goto
A whitespace issue during parsing. Committed Steve's patch as obvious after regtesting on x86_64-pc-linux-gnu. Thanks, Harald Fortran - fix whitespace issue during parsing of assigned goto gcc/fortran/ChangeLog: PR fortran/102113 * match.c (gfc_match_goto): Allow for whitespace in parsing list of labels. gcc/testsuite/ChangeLog: PR fortran/102113 * gfortran.dg/goto_9.f90: New test.
Re: [PATCH] Make sure we're playing with integral modes before call extract_integral_bit_field.
On 8/30/2021 1:09 PM, Joseph Myers wrote: This commit introduces an ICE building libgcc for 32-bit SPARC. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102133 I've sent Hongtao a testcase which is ICE-ing. It was mcore-elf, but I saw similar failures on a half-dozen architectures before I reverted the change locally. jeff
Re: [PATCH v3 1/6] rs6000: Support SSE4.1 "round" intrinsics
On Fri, Aug 27, 2021 at 08:44:43AM -0500, Bill Schmidt via Gcc-patches wrote: > On 8/23/21 2:03 PM, Paul A. Clarke wrote: > > + __fpscr_save.__fr = __builtin_mffsl (); > > As pointed out in the v1 review, __builtin_mffsl is enabled (or supposed to > be) only for POWER9 and later. This will fail to work on POWER8 and earlier > when the new builtins support is complete and this is enforced more > carefully. Please #ifdef and use __builtin_mffs on earlier processors. > Please do this everywhere this occurs. > > I think you got some contradictory guidance on this, but trust me, this will > break. The confusing thing is that _builtin_mffsl is explicitly supported on earlier processors, if I read the code right (from gcc/config/rs6000/rs6000.md): -- (define_expand "rs6000_mffsl" [(set (match_operand:DF 0 "gpc_reg_operand") (unspec_volatile:DF [(const_int 0)] UNSPECV_MFFSL))] "TARGET_HARD_FLOAT" { /* If the low latency mffsl instruction (ISA 3.0) is available use it, otherwise fall back to the older mffs instruction to emulate the mffsl instruction. */ if (!TARGET_P9_MISC) { rtx tmp1 = gen_reg_rtx (DFmode); /* The mffs instruction reads the entire FPSCR. Emulate the mffsl instruction using the mffs instruction and masking the result. */ emit_insn (gen_rs6000_mffs (tmp1)); ... -- Is that going away? If so, that would be a possible (undesirable?) API change, no? PC
Re: [PATCH] Fix x86/56337 : 1<<28 alignment is broken
On Fri, Jul 23, 2021 at 1:33 PM apinski--- via Gcc-patches wrote: > > From: Andrew Pinski > > The problem here is the x86_64 back-end uses a signed integer > for alignment and then divides by BITS_PER_UNIT so if we had > INT_MIN (which is what 1<<28*8 is), we would get the wrong result. > > This fixes the problem by using unsigned for the argument to > x86_output_aligned_bss and x86_output_aligned_bss. > > OK? Bootstrapped and tested on x86_64-linux-gnu. Ping? > > gcc/ChangeLog: > > PR target/56337 > * config/i386/i386-protos.h (x86_output_aligned_bss): > Change align argument to unsigned type. > (x86_elf_aligned_decl_common): Likewise. > * config/i386/i386.c (x86_elf_aligned_decl_common): Likewise. > (x86_output_aligned_bss): Likewise. > --- > gcc/config/i386/i386-protos.h | 4 ++-- > gcc/config/i386/i386.c| 4 ++-- > 2 files changed, 4 insertions(+), 4 deletions(-) > > diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h > index 07ac02aff69..fa1b0d0d787 100644 > --- a/gcc/config/i386/i386-protos.h > +++ b/gcc/config/i386/i386-protos.h > @@ -325,9 +325,9 @@ struct ix86_address > extern int ix86_decompose_address (rtx, struct ix86_address *); > extern int memory_address_length (rtx, bool); > extern void x86_output_aligned_bss (FILE *, tree, const char *, > - unsigned HOST_WIDE_INT, int); > + unsigned HOST_WIDE_INT, unsigned); > extern void x86_elf_aligned_decl_common (FILE *, tree, const char *, > -unsigned HOST_WIDE_INT, int); > +unsigned HOST_WIDE_INT, unsigned); > > #ifdef RTX_CODE > extern void ix86_fp_comparison_codes (enum rtx_code code, enum rtx_code *, > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c > index 876a19f4c1f..f86d11dfb11 100644 > --- a/gcc/config/i386/i386.c > +++ b/gcc/config/i386/i386.c > @@ -837,7 +837,7 @@ x86_64_elf_unique_section (tree decl, int reloc) > void > x86_elf_aligned_decl_common (FILE *file, tree decl, > const char *name, unsigned HOST_WIDE_INT size, > - int align) > + unsigned align) > { >if ((ix86_cmodel == CM_MEDIUM || ix86_cmodel == CM_MEDIUM_PIC) >&& size > (unsigned int)ix86_section_threshold) > @@ -858,7 +858,7 @@ x86_elf_aligned_decl_common (FILE *file, tree decl, > > void > x86_output_aligned_bss (FILE *file, tree decl, const char *name, > - unsigned HOST_WIDE_INT size, int align) > + unsigned HOST_WIDE_INT size, unsigned align) > { >if ((ix86_cmodel == CM_MEDIUM || ix86_cmodel == CM_MEDIUM_PIC) >&& size > (unsigned int)ix86_section_threshold) > -- > 2.17.1 >
[pushed] c++: fold function template args sooner [PR101460]
As discussed in the PR, we were giving a lot of unnecessary errors for this testcase because we didn't try to do constant evaluation until convert_nontype_argument, which happens for each of the candidates. But when looking at a template-id as the function operand of a call, we can try to fold arguments before we get into overload resolution. Tested x86_64-pc-linux-gnu, applying to trunk. PR c++/101460 gcc/cp/ChangeLog: * cp-tree.h (cxx_constant_value_sfinae): Declare. * constexpr.c (cxx_constant_value_sfinae): New. * pt.c (fold_targs_r, maybe_fold_fn_template_args): New. (tsubst_copy_and_build) [CALL_EXPR]: Call maybe_fold_fn_template_args. gcc/testsuite/ChangeLog: * g++.dg/template/explicit-args6.C: New test. --- gcc/cp/cp-tree.h | 1 + gcc/cp/constexpr.c| 12 gcc/cp/pt.c | 60 +++ .../g++.dg/template/explicit-args6.C | 34 +++ 4 files changed, 107 insertions(+) create mode 100644 gcc/testsuite/g++.dg/template/explicit-args6.C diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h index 75ee88721b4..6a179375a56 100644 --- a/gcc/cp/cp-tree.h +++ b/gcc/cp/cp-tree.h @@ -8266,6 +8266,7 @@ extern bool require_constant_expression (tree); extern bool require_rvalue_constant_expression (tree); extern bool require_potential_rvalue_constant_expression (tree); extern tree cxx_constant_value (tree, tree = NULL_TREE); +extern tree cxx_constant_value_sfinae (tree, tsubst_flags_t); extern void cxx_constant_dtor (tree, tree); extern tree cxx_constant_init (tree, tree = NULL_TREE); extern tree maybe_constant_value (tree, tree = NULL_TREE, bool = false); diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c index b9c006217be..9606719bc73 100644 --- a/gcc/cp/constexpr.c +++ b/gcc/cp/constexpr.c @@ -7458,6 +7458,18 @@ cxx_constant_value (tree t, tree decl) return cxx_eval_outermost_constant_expr (t, false, true, true, false, decl); } +/* As above, but respect SFINAE. */ + +tree +cxx_constant_value_sfinae (tree t, tsubst_flags_t complain) +{ + bool sfinae = !(complain & tf_error); + tree r = cxx_eval_outermost_constant_expr (t, sfinae, true, true); + if (sfinae && !TREE_CONSTANT (r)) +r = error_mark_node; + return r; +} + /* Like cxx_constant_value, but used for evaluation of constexpr destructors of constexpr variables. The actual initializer of DECL is not modified. */ diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c index 020a4bf2f6d..d7d0dce691c 100644 --- a/gcc/cp/pt.c +++ b/gcc/cp/pt.c @@ -19465,6 +19465,63 @@ out: return r; } +/* Subroutine of maybe_fold_fn_template_args. */ + +static bool +fold_targs_r (tree targs, tsubst_flags_t complain) +{ + int len = TREE_VEC_LENGTH (targs); + for (int i = 0; i < len; ++i) +{ + tree &elt = TREE_VEC_ELT (targs, i); + if (!elt || TYPE_P (elt) + || TREE_CODE (elt) == TEMPLATE_DECL) + continue; + if (TREE_CODE (elt) == NONTYPE_ARGUMENT_PACK) + { + if (!fold_targs_r (ARGUMENT_PACK_ARGS (elt), complain)) + return false; + } + else if (/* We can only safely preevaluate scalar prvalues. */ + SCALAR_TYPE_P (TREE_TYPE (elt)) + && !glvalue_p (elt) + && !TREE_CONSTANT (elt)) + { + elt = cxx_constant_value_sfinae (elt, complain); + if (elt == error_mark_node) + return false; + } +} + + return true; +} + +/* Try to do constant evaluation of any explicit template arguments in FN + before overload resolution, to get any errors only once. Return true iff + we didn't have any problems folding. */ + +static bool +maybe_fold_fn_template_args (tree fn, tsubst_flags_t complain) +{ + if (processing_template_decl || fn == NULL_TREE) +return true; + if (fn == error_mark_node) +return false; + if (TREE_CODE (fn) == OFFSET_REF + || TREE_CODE (fn) == COMPONENT_REF) +fn = TREE_OPERAND (fn, 1); + if (BASELINK_P (fn)) +fn = BASELINK_FUNCTIONS (fn); + if (TREE_CODE (fn) != TEMPLATE_ID_EXPR) +return true; + tree targs = TREE_OPERAND (fn, 1); + if (targs == NULL_TREE) +return true; + if (targs == error_mark_node) +return false; + return fold_targs_r (targs, complain); +} + /* Like tsubst but deals with expressions and performs semantic analysis. FUNCTION_P is true if T is the "F" in "F (ARGS)" or "F (ARGS)". */ @@ -20343,6 +20400,9 @@ tsubst_copy_and_build (tree t, && !mark_used (function, complain) && !(complain & tf_error)) RETURN (error_mark_node); + if (!maybe_fold_fn_template_args (function, complain)) + return error_mark_node; + /* Put back tf_decltype for the actual call. */ complain |= decltype_flag; diff --git a/gcc/testsuite/g++.dg/template/explicit-args6
Re: [PATCH v3 1/6] rs6000: Support SSE4.1 "round" intrinsics
Hi Paul, On 8/30/21 4:16 PM, Paul A. Clarke wrote: On Fri, Aug 27, 2021 at 08:44:43AM -0500, Bill Schmidt via Gcc-patches wrote: On 8/23/21 2:03 PM, Paul A. Clarke wrote: + __fpscr_save.__fr = __builtin_mffsl (); As pointed out in the v1 review, __builtin_mffsl is enabled (or supposed to be) only for POWER9 and later. This will fail to work on POWER8 and earlier when the new builtins support is complete and this is enforced more carefully. Please #ifdef and use __builtin_mffs on earlier processors. Please do this everywhere this occurs. I think you got some contradictory guidance on this, but trust me, this will break. The confusing thing is that _builtin_mffsl is explicitly supported on earlier processors, if I read the code right (from gcc/config/rs6000/rs6000.md): -- (define_expand "rs6000_mffsl" [(set (match_operand:DF 0 "gpc_reg_operand") (unspec_volatile:DF [(const_int 0)] UNSPECV_MFFSL))] "TARGET_HARD_FLOAT" { /* If the low latency mffsl instruction (ISA 3.0) is available use it, otherwise fall back to the older mffs instruction to emulate the mffsl instruction. */ if (!TARGET_P9_MISC) { rtx tmp1 = gen_reg_rtx (DFmode); /* The mffs instruction reads the entire FPSCR. Emulate the mffsl instruction using the mffs instruction and masking the result. */ emit_insn (gen_rs6000_mffs (tmp1)); ... -- Is that going away? If so, that would be a possible (undesirable?) API change, no? Hm, I see. I missed that in the builtins conversion. Apparently there's nothing in the test suite that verifies this work on P9, which is a hole that could use fixing. This usage isn't documented anywhere near the builtin machinery, either. I'll patch the new builtins code to move this to a more permissive stanza and document why. You can leave your code as is. Thanks! Bill PC
[pushed] c++: preserve location through constexpr
While working on the patch for PR101460, I noticed that we were losing the expression location when folding class prvalue expressions. The final patch doesn't fold class prvalues, but this still seems a worthwhile change. I don't add location wrappers for scalar prvalues because many callers are trying to fold them away. Tested x86_64-pc-linux-gnu, applying to trunk. gcc/cp/ChangeLog: * constexpr.c (cxx_eval_outermost_constant_expr): Copy expr location to result. --- gcc/cp/constexpr.c | 5 + 1 file changed, 5 insertions(+) diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c index 9606719bc73..e78fdf021b2 100644 --- a/gcc/cp/constexpr.c +++ b/gcc/cp/constexpr.c @@ -7445,6 +7445,11 @@ cxx_eval_outermost_constant_expr (tree t, bool allow_non_constant, } } + /* Remember the original location if that wouldn't need a wrapper. */ + if (location_t loc = EXPR_LOCATION (t)) +if (CAN_HAVE_LOCATION_P (r)) + SET_EXPR_LOCATION (r, loc); + return r; } base-commit: a8de832470f78a40a0e2c8de866a471bf74bf0ab -- 2.27.0
[pushed] c++: limit instantiation with ill-formed class [PR96286]
I noticed that after the static_assert failures in lwg3466.cc, we got various follow-on errors because we went ahead and tried to instantiate the promise member functions even after instantiating the class itself ran into problems. Interrupting instantiation of the class itself seems likely to cause error-recovery problems, but preventing instantiation of member functions seems strictly better for error-recovery. This doesn't fix any of the specific testcases in PR96286, but addresses part of that problem space. Tested x86_64-pc-linux-gnu, applying to trunk. PR c++/96286 gcc/cp/ChangeLog: * cp-tree.h (struct lang_type): Add erroneous bit-field. (CLASSTYPE_ERRONEOUS): New. * pt.c (limit_bad_template_recursion): Check it. (instantiate_class_template_1): Set it. libstdc++-v3/ChangeLog: * testsuite/30_threads/promise/requirements/lwg3466.cc: Remove dg-prune-outputs. gcc/testsuite/ChangeLog: * g++.dg/template/access2.C: Split struct A. --- gcc/cp/cp-tree.h | 7 ++- gcc/cp/pt.c| 14 +- gcc/testsuite/g++.dg/template/access2.C| 6 +- .../30_threads/promise/requirements/lwg3466.cc | 4 4 files changed, 24 insertions(+), 7 deletions(-) diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h index 6a179375a56..ce7ca53a113 100644 --- a/gcc/cp/cp-tree.h +++ b/gcc/cp/cp-tree.h @@ -2336,6 +2336,7 @@ struct GTY(()) lang_type { unsigned has_constexpr_ctor : 1; unsigned unique_obj_representations : 1; unsigned unique_obj_representations_set : 1; + bool erroneous : 1; /* When adding a flag here, consider whether or not it ought to apply to a template instance if it applies to the template. If @@ -2344,7 +2345,7 @@ struct GTY(()) lang_type { /* There are some bits left to fill out a 32-bit word. Keep track of this by updating the size of this bitfield whenever you add or remove a flag. */ - unsigned dummy : 5; + unsigned dummy : 4; tree primary_base; vec *vcall_indices; @@ -2660,6 +2661,10 @@ struct GTY(()) lang_type { /* Nonzero if a _DECL node requires us to output debug info for this class. */ #define CLASSTYPE_DEBUG_REQUESTED(NODE) \ (LANG_TYPE_CLASS_CHECK (NODE)->debug_requested) + +/* True if we saw errors while instantiating this class. */ +#define CLASSTYPE_ERRONEOUS(NODE) \ + (LANG_TYPE_CLASS_CHECK (NODE)->erroneous) /* Additional macros for inheritance information. */ diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c index d7d0dce691c..fcf3ac31b25 100644 --- a/gcc/cp/pt.c +++ b/gcc/cp/pt.c @@ -10885,9 +10885,14 @@ limit_bad_template_recursion (tree decl) { struct tinst_level *lev = current_tinst_level; int errs = errorcount + sorrycount; - if (lev == NULL || errs == 0 || !neglectable_inst_p (decl)) + if (errs == 0 || !neglectable_inst_p (decl)) return false; + /* Avoid instantiating members of an ill-formed class. */ + if (DECL_CLASS_SCOPE_P (decl) + && CLASSTYPE_ERRONEOUS (DECL_CONTEXT (decl))) +return true; + for (; lev; lev = lev->next) if (neglectable_inst_p (lev->maybe_get_node ())) break; @@ -12212,6 +12217,13 @@ instantiate_class_template_1 (tree type) finish_struct_1 (type); TYPE_BEING_DEFINED (type) = 0; + /* Remember if instantiating this class ran into errors, so we can avoid + instantiating member functions in limit_bad_template_recursion. We set + this flag even if the problem was in another instantiation triggered by + this one, as that will likely also cause trouble for member functions. */ + if (errorcount + sorrycount > current_tinst_level->errors) +CLASSTYPE_ERRONEOUS (type) = true; + /* We don't instantiate default arguments for member functions. 14.7.1: The implicit instantiation of a class template specialization causes diff --git a/gcc/testsuite/g++.dg/template/access2.C b/gcc/testsuite/g++.dg/template/access2.C index 0620c10f79d..4a80bb4d280 100644 --- a/gcc/testsuite/g++.dg/template/access2.C +++ b/gcc/testsuite/g++.dg/template/access2.C @@ -5,6 +5,9 @@ template struct A { typename T::X x; // { dg-error "this context" } +}; + +template struct A2 { int f() { return T::i; } // { dg-error "this context" } }; @@ -16,5 +19,6 @@ class B { int main() { A ab; // { dg-message "required" } - ab.f(); // { dg-message "required" } + A2 a2b; + a2b.f(); // { dg-message "required" } } diff --git a/libstdc++-v3/testsuite/30_threads/promise/requirements/lwg3466.cc b/libstdc++-v3/testsuite/30_threads/promise/requirements/lwg3466.cc index bb04bf0737c..acef47f12f3 100644 --- a/libstdc++-v3/testsuite/30_threads/promise/requirements/lwg3466.cc +++ b/libstdc++-v3/testsuite/30_threads/promise/requirements/lwg3466.cc @@ -27,17 +27,13 @@ std::promise good2; st
[PATCH] PR fortran/101327 - ICE in find_array_element, at fortran/expr.c:1355
There was an issue when trying to use an element from an array constructor which was a broken in a way probably only Gerhard could conceive. We hit an assert that can be replaced by more robust code. Patch is basically Steve's. Regtested on x86_64-pc-linux-gnu. OK for mainline? Thanks, Harald Fortran - improve error recovery determining array element from constructor gcc/fortran/ChangeLog: PR fortran/101327 * expr.c (find_array_element): When bounds cannot be determined as constant, return error instead of aborting. gcc/testsuite/ChangeLog: PR fortran/101327 * gfortran.dg/pr101327.f90: New test. diff --git a/gcc/fortran/expr.c b/gcc/fortran/expr.c index 35563a78697..dfecc3012e1 100644 --- a/gcc/fortran/expr.c +++ b/gcc/fortran/expr.c @@ -1337,7 +1337,9 @@ find_array_element (gfc_constructor_base base, gfc_array_ref *ar, for (i = 0; i < ar->dimen; i++) { if (!gfc_reduce_init_expr (ar->as->lower[i]) - || !gfc_reduce_init_expr (ar->as->upper[i])) + || !gfc_reduce_init_expr (ar->as->upper[i]) + || ar->as->upper[i]->expr_type != EXPR_CONSTANT + || ar->as->lower[i]->expr_type != EXPR_CONSTANT) { t = false; cons = NULL; @@ -1351,9 +1353,6 @@ find_array_element (gfc_constructor_base base, gfc_array_ref *ar, goto depart; } - gcc_assert (ar->as->upper[i]->expr_type == EXPR_CONSTANT - && ar->as->lower[i]->expr_type == EXPR_CONSTANT); - /* Check the bounds. */ if ((ar->as->upper[i] && mpz_cmp (e->value.integer, diff --git a/gcc/testsuite/gfortran.dg/pr101327.f90 b/gcc/testsuite/gfortran.dg/pr101327.f90 new file mode 100644 index 000..f4377aabe7a --- /dev/null +++ b/gcc/testsuite/gfortran.dg/pr101327.f90 @@ -0,0 +1,11 @@ +! { dg-do compile } +! PR fortran/101327 - ICE in find_array_element, at fortran/expr.c:1355 + +subroutine s + integer, parameter :: n([2]) = [1, 2] ! { dg-error "must be scalar" } + type t + integer :: a(n(1):n(2)) + end type +end + +! { dg-error "cannot be automatic or of deferred shape" " " { target *-*-* } 5 }
[committed] analyzer: support "bifurcation"; reimplement realloc [PR99260]
Most of the state-management code in the analyzer involves modifying state objects in-place, which implies a single outcome. (I originally implemented in-place modification because I wanted to avoid having to create copies of state objects, and it's now very difficult to change this aspect of the analyzer's design) However, there are various special-cases such as "realloc" for which it's best to split the state into multiple outcomes. This patch adds a mechanism for "bifurcating" the analysis in places where there isn't a split in the CFG, and uses it to implement realloc, in this case treating it as having 3 possible outcomes: - failure, returning NULL - success, growing the buffer in-place without moving it - success, allocating a new buffer, copying the content of the old buffer to it, and freeing the old buffer. Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. Pushed to trunk as r12-3237-geafa9d969237fd8f712c4b25a8c58932c01f44b4. gcc/ChangeLog: PR analyzer/99260 * Makefile.in (ANALYZER_OBJS): Add analyzer/call-info.o. gcc/analyzer/ChangeLog: PR analyzer/99260 * analyzer.h (class custom_edge_info): New class, adapted from exploded_edge::custom_info_t. Make member functions const. Make update_model return bool, converting edge param from reference to a pointer, and adding a ctxt param. (class path_context): New class. * call-info.cc: New file. * call-info.h: New file. * engine.cc: Include "analyzer/call-info.h" and . (impl_region_model_context::impl_region_model_context): Update for new m_path_ctxt field. (impl_region_model_context::bifurcate): New. (impl_region_model_context::terminate_path): New. (impl_region_model_context::get_malloc_map): New. (impl_sm_context::impl_sm_context): Update for new m_path_ctxt field. (impl_sm_context::get_fndecl_for_call): Likewise. (impl_sm_context::set_next_state): Likewise. (impl_sm_context::warn): Likewise. (impl_sm_context::is_zero_assignment): Likewise. (impl_sm_context::get_path_context): New. (impl_sm_context::m_path_ctxt): New. (impl_region_model_context::on_condition): Update for new path_ctxt param. Handle m_enode_for_diag being NULL. (impl_region_model_context::on_phi): Update for new path_ctxt param. (exploded_node::on_stmt): Add path_ctxt param, updating ctor calls to use it as necessary. Use it to bail out after sm-handling, if needed. (exploded_node::detect_leaks): Update for new path_ctxt param. (dynamic_call_info_t::update_model): Update for conversion of exploded_edge::custom_info_t to custom_edge_info. (dynamic_call_info_t::add_events_to_path): Likewise. (rewind_info_t::update_model): Likewise. (rewind_info_t::add_events_to_path): Likewise. (exploded_edge::exploded_edge): Likewise. (exploded_graph::add_edge): Likewise. (exploded_graph::maybe_process_run_of_before_supernode_enodes): Update for new path_ctxt param. (class impl_path_context): New. (exploded_graph::process_node): Update for new path_ctxt param. Create an impl_path_context and pass it to exploded_node::on_stmt. Use it to terminate iterating stmts if terminate_path is called on it. After processing a run of stmts, query path_ctxt to potentially terminate the analysis path, and/or to "bifurcate" the analysis into multiple additional paths. (feasibility_state::maybe_update_for_edge): Update for new update_model ctxt param. * exploded-graph.h (impl_region_model_context::impl_region_model_context): Add path_ctxt param. (impl_region_model_context::bifurcate): New. (impl_region_model_context::terminate_path): New (impl_region_model_context::get_ext_state): New. (impl_region_model_context::get_malloc_map): New. (impl_region_model_context::m_path_ctxt): New field. (exploded_node::on_stmt): Add path_ctxt param. (class exploded_edge::custom_info_t): Move to analyzer.h, renaming to custom_edge_info, and making the changes as noted in analyzer.h above. (exploded_edge::exploded_edge): Update for these changes to exploded_edge::custom_info_t. (exploded_edge::m_custom_info): Likewise. (class dynamic_call_info_t): Likewise. (class rewind_info_t): Likewise. (exploded_graph::add_edge): Likewise. * program-state.cc (program_state::on_edge): Update for new path_ctxt param. (program_state::push_call): Likewise. (program_state::returning_call): Likewise. (program_state::prune_for_point): Likewise. * region-model-impl-calls.cc: Include "analyzer/call-info.h". (call_details::get_fndecl_for_call): New.
Re: [pushed] c++: preserve location through constexpr
On Mon, Aug 30, 2021 at 05:25:01PM -0400, Jason Merrill via Gcc-patches wrote: > While working on the patch for PR101460, I noticed that we were losing the > expression location when folding class prvalue expressions. The final patch > doesn't fold class prvalues, but this still seems a worthwhile change. I > don't add location wrappers for scalar prvalues because many callers are > trying to fold them away. > > Tested x86_64-pc-linux-gnu, applying to trunk. > > gcc/cp/ChangeLog: > > * constexpr.c (cxx_eval_outermost_constant_expr): Copy > expr location to result. > --- > gcc/cp/constexpr.c | 5 + > 1 file changed, 5 insertions(+) > > diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c > index 9606719bc73..e78fdf021b2 100644 > --- a/gcc/cp/constexpr.c > +++ b/gcc/cp/constexpr.c > @@ -7445,6 +7445,11 @@ cxx_eval_outermost_constant_expr (tree t, bool > allow_non_constant, > } > } > > + /* Remember the original location if that wouldn't need a wrapper. */ > + if (location_t loc = EXPR_LOCATION (t)) > +if (CAN_HAVE_LOCATION_P (r)) > + SET_EXPR_LOCATION (r, loc); Can we use protected_set_expr_location or did you not want that? Marek
Re: [PATCH][pushed] Add -fprofile-reproducible=parallel-runs to STAGEfeedback_CFLAGS to Makefile.tpl.
On Mon, Aug 30, 2021 at 10:13:21AM +0200, Martin Liška wrote: > On 8/27/21 19:45, Gleb Fotengauer-Malinovskiy wrote: > > Hi, > > > > On Thu, Mar 11, 2021 at 04:19:51PM +0100, Martin Liška wrote: > >> Pushed as obvious, the original change was done > >> in g:e05a117dc4b98f3ac60851608f532ba7cee7343a. > >> > >> Martin > >> > >> ChangeLog: > >> > >>* Makefile.tpl: The change was done Makefile.in which > >>is generated file. > >> --- > >>Makefile.tpl | 2 +- > >>1 file changed, 1 insertion(+), 1 deletion(-) > >> > >> diff --git a/Makefile.tpl b/Makefile.tpl > >> index 3b88f351d5b..6e0337fb48f 100644 > >> --- a/Makefile.tpl > >> +++ b/Makefile.tpl > >> @@ -488,7 +488,7 @@ STAGEprofile_TFLAGS = $(STAGE2_TFLAGS) > >>STAGEtrain_CFLAGS = $(filter-out -fchecking=1,$(STAGE3_CFLAGS)) > >>STAGEtrain_TFLAGS = $(filter-out -fchecking=1,$(STAGE3_TFLAGS)) > >> > >> -STAGEfeedback_CFLAGS = $(STAGE4_CFLAGS) -fprofile-use > >> +STAGEfeedback_CFLAGS = $(STAGE4_CFLAGS) -fprofile-use > >> -fprofile-reproducible=parallel-runs > > > > Did you mean to add -fprofile-reproducible flag to STAGEprofile_CFLAGS > > instead? > > No. > > > I suppose this flag doesn't mean anything without > > -fprofile-generate, which is passed through STAGEprofile_CFLAGS. > > The flag -fprofile-reproducible is used when we use a collected profile, so > it works > *only* with -fprofile-use flag. Thank you for the answer, now I see how it works. I think that it means that the documentation is rather misleading about this flag: '-fprofile-reproducible=[multithreaded|parallel-runs|serial]' Control level of reproducibility of profile gathered by '-fprofile-generate'. [...] It convinced me that this flag changes the way the code is instrumented for profile generation. The rest of it convinced me even more, actually. > Thanks for heads up! > Cheers, > Martin > > > > > Thanks, > > -- glebfm
[PATCH] jit : Generate debug info for variables
Hi, This is a patch to generate debug info for local variables as well as globals. With this, "ptype foo", "info variables", "info locals" etc works when debugging in GDB. Finalizing of global variable declares are moved to after locations are handled and done as Fortran, C, Go etc do it. Also, primitive types have their TYPE_NAME set for debug info on types to work. Below are the patch, and I attached a testcase. Since it requires GDB to run it might not be suitable? Make check-jit runs fine on Debian x64. Regards, >From d77e77104024c7ae9ce31b419dad1f0a5801fda7 Mon Sep 17 00:00:00 2001 From: Petter Tomner Date: Mon, 30 Aug 2021 01:44:07 +0200 Subject: [PATCH 1/2] libgccjit: Generate debug info for variables Finalize declares via available helpers after location is set. Set TYPE_NAME of primitives and friends to "int" etc. Debug info is now set properly for variables. 2021-08-31 Petter Tomner gcc/jit jit-playback.c jit-playback.h gcc/testsuite/jit.dg/ test-error-array-bounds.c: Array is not unsigned --- gcc/jit/jit-playback.c| 76 ++- gcc/jit/jit-playback.h| 5 ++ .../jit.dg/test-error-array-bounds.c | 2 +- 3 files changed, 65 insertions(+), 18 deletions(-) diff --git a/gcc/jit/jit-playback.c b/gcc/jit/jit-playback.c index 79ac525e5df..74321b0946a 100644 --- a/gcc/jit/jit-playback.c +++ b/gcc/jit/jit-playback.c @@ -165,7 +165,8 @@ gt_ggc_mx () /* Given an enum gcc_jit_types value, get a "tree" type. */ -static tree +tree +playback::context:: get_tree_node_for_type (enum gcc_jit_types type_) { switch (type_) @@ -192,11 +193,7 @@ get_tree_node_for_type (enum gcc_jit_types type_) return short_unsigned_type_node; case GCC_JIT_TYPE_CONST_CHAR_PTR: - { - tree const_char = build_qualified_type (char_type_node, - TYPE_QUAL_CONST); - return build_pointer_type (const_char); - } + return m_const_char_ptr; case GCC_JIT_TYPE_INT: return integer_type_node; @@ -579,10 +576,6 @@ playback::lvalue * playback::context:: global_finalize_lvalue (tree inner) { - varpool_node::get_create (inner); - - varpool_node::finalize_decl (inner); - m_globals.safe_push (inner); return new lvalue (this, inner); @@ -2952,9 +2945,7 @@ replay () { JIT_LOG_SCOPE (get_logger ()); - m_const_char_ptr -= build_pointer_type (build_qualified_type (char_type_node, - TYPE_QUAL_CONST)); + init_types (); /* Replay the recorded events: */ timevar_push (TV_JIT_REPLAY); @@ -2984,15 +2975,22 @@ replay () { int i; function *func; - + tree global; /* No GC can happen yet; process the cached source locations. */ handle_locations (); + /* Finalize globals. See how FORTRAN 95 does it in gfc_be_parse_file() + for a simple reference. */ + FOR_EACH_VEC_ELT (m_globals, i, global) +rest_of_decl_compilation (global, true, true); + + wrapup_global_declarations (m_globals.address(), m_globals.length()); + /* We've now created tree nodes for the stmts in the various blocks -in each function, but we haven't built each function's single stmt -list yet. Do so now. */ + in each function, but we haven't built each function's single stmt + list yet. Do so now. */ FOR_EACH_VEC_ELT (m_functions, i, func) - func->build_stmt_list (); + func->build_stmt_list (); /* No GC can have happened yet. */ @@ -3081,6 +3079,50 @@ location_comparator (const void *lhs, const void *rhs) return loc_lhs->get_column_num () - loc_rhs->get_column_num (); } +/* Initialize the NAME_TYPE of the primitive types as well as some + others. */ +void +playback::context:: +init_types () +{ + /* See lto_init() in lto-lang.c or void visit (TypeBasic *t) in D's types.cc + for reference. If TYPE_NAME is not set, debug info will not contain types */ +#define NAME_TYPE(t,n) \ +if (t) \ + TYPE_NAME (t) = build_decl (UNKNOWN_LOCATION, TYPE_DECL, \ + get_identifier (n), t) + + NAME_TYPE (integer_type_node, "int"); + NAME_TYPE (char_type_node, "char"); + NAME_TYPE (long_integer_type_node, "long int"); + NAME_TYPE (unsigned_type_node, "unsigned int"); + NAME_TYPE (long_unsigned_type_node, "long unsigned int"); + NAME_TYPE (long_long_integer_type_node, "long long int"); + NAME_TYPE (long_long_unsigned_type_node, "long long unsigned int"); + NAME_TYPE (short_integer_type_node, "short int"); + NAME_TYPE (short_unsigned_type_node, "short unsigned int"); + if (signed_char_type_node != char_type_node) +NAME_TYPE (signed_char_type_node, "signed char"); + if (unsigned_char_type_node != char_type_node) +NAME_TYPE (unsigned_char_type_node, "unsigned char"); + NAME_TYPE (float_type_node, "float"); + NAME_
Sv: [PATCH] jit : Generate debug info for variables
Well I seemed to have attached the wrong testcase. Here is the proper one attached. Regards, -Ursprungligt meddelande- Från: Petter Tomner Skickat: den 31 augusti 2021 02:14 Till: gcc-patches@gcc.gnu.org; j...@gcc.gnu.org Ämne: [PATCH] jit : Generate debug info for variables Hi, This is a patch to generate debug info for local variables as well as globals. With this, "ptype foo", "info variables", "info locals" etc works when debugging in GDB. Finalizing of global variable declares are moved to after locations are handled and done as Fortran, C, Go etc do it. Also, primitive types have their TYPE_NAME set for debug info on types to work. Below are the patch, and I attached a testcase. Since it requires GDB to run it might not be suitable? Make check-jit runs fine on Debian x64. Regards, 0002-libgccjit-Test-cases-for-debug-info.patch Description: 0002-libgccjit-Test-cases-for-debug-info.patch
[PATCH] c++: check arity before deduction w/ explicit targs [PR12672]
During overload resolution, when the arity of a function template clearly disagrees with the arity of the call, no specialization of the function template could yield a viable candidate. The deduction routine type_unification_real already notices this situation, but not before it substitutes explicit template arguments into the template, a step which could induce a hard error. Although it's necessary to perform this substitution first in order to check arity perfectly (since the substitution can e.g. expand a non-trailing parameter pack), in most cases we can determine ahead of time whether there's an arity disagreement without needing to perform deduction. To that end, this patch implements an (approximate) arity check in add_template_candidate_real that guards actual deduction. It's enabled only when there are explicit template arguments since that's when deduction can force otherwise avoidable template instantiations. (I experimented with enabling it unconditionally as an optimization and saw some compile-time improvements of about 5% but also some regressions by about the same magnitude, so kept it conditional.) In passing, this adds a least_p parameter to arity_rejection for sake of consistent diagnostics with unify_arity. A couple of testcases needed to be adjusted so that deduction continues to occur as intended after this change. Except in unify6.C, where we were expecting foo to be ill-formed due to substitution yielding a function type with an added 'const', but I think this is permitted by [dcl.fct]/7, so I changed the test accordingly. Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk? PR c++/12672 gcc/cp/ChangeLog: * call.c (rejection_reason::call_varargs_p): Rename this previously unused member to ... (rejection_reason::least_p): ... this. (arity_rejection): Add least_p parameter. (add_template_candidate_real): When there are explicit template arguments, check that the arity of the call agrees with the arity of the function before attempting deduction. (print_arity_information): Add least_p parameter. (print_z_candidate): Adjust call to print_arity_information. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/decltype29.C: Adjust. * g++.dg/template/error56.C: Adjust. * g++.old-deja/g++.pt/unify6.C: Adjust. * g++.dg/template/explicit-args6.C: New test. --- gcc/cp/call.c | 67 --- gcc/testsuite/g++.dg/cpp0x/decltype29.C | 4 +- gcc/testsuite/g++.dg/template/error56.C | 4 +- .../g++.dg/template/explicit-args6.C | 33 + gcc/testsuite/g++.old-deja/g++.pt/unify6.C| 4 +- 5 files changed, 96 insertions(+), 16 deletions(-) create mode 100644 gcc/testsuite/g++.dg/template/explicit-args6.C diff --git a/gcc/cp/call.c b/gcc/cp/call.c index e4df72ec1a3..80e6121ce44 100644 --- a/gcc/cp/call.c +++ b/gcc/cp/call.c @@ -455,8 +455,8 @@ struct rejection_reason { int expected; /* The actual number of arguments in the call. */ int actual; - /* Whether the call was a varargs call. */ - bool call_varargs_p; + /* Whether EXPECTED should be treated as a lower bound. */ + bool least_p; } arity; /* Information about an argument conversion mismatch. */ struct conversion_info conversion; @@ -628,12 +628,13 @@ alloc_rejection (enum rejection_reason_code code) } static struct rejection_reason * -arity_rejection (tree first_arg, int expected, int actual) +arity_rejection (tree first_arg, int expected, int actual, bool least_p = false) { struct rejection_reason *r = alloc_rejection (rr_arity); int adjust = first_arg != NULL_TREE; r->u.arity.expected = expected - adjust; r->u.arity.actual = actual - adjust; + r->u.arity.least_p = least_p; return r; } @@ -3452,6 +3453,44 @@ add_template_candidate_real (struct z_candidate **candidates, tree tmpl, } gcc_assert (ia == nargs_without_in_chrg); + if (!obj && explicit_targs) +{ + /* Check that there's no obvious arity mismatch before proceeding with +deduction. This avoids substituting explicit template arguments +into the template (which could result in an error outside the +immediate context) when the resulting candidate would be unviable +anyway. */ + int min_arity = 0, max_arity = 0; + tree parms = TYPE_ARG_TYPES (TREE_TYPE (tmpl)); + parms = skip_artificial_parms_for (tmpl, parms); + for (; parms != void_list_node; parms = TREE_CHAIN (parms)) + { + if (!parms || PACK_EXPANSION_P (TREE_VALUE (parms))) + { + max_arity = -1; + break; + } + if (TREE_PURPOSE (parms)) + /* A parameter with a default argument. */ + ++max_arity; + else + ++min_arity, ++max_arity; + } + if (ia < (unsigne
[PATCH] c++: shortcut bad convs during overload resolution [PR101904]
In the context of overload resolution we have the notion of a "bad" argument conversion, which is a conversion that "would be a permitted with a bending of the language standards", and we handle such bad conversions specially. In particular, we rank a bad conversion as better than no conversion but worse than a good conversion, and a bad conversion doesn't necessarily make a candidate unviable. With the flag -fpermissive, we permit the situation where overload resolution selects a candidate that contains a bad conversion (which we call a non-strictly viable candidate). And without the flag we issue a distinct permerror in this situation instead. One consequence of this defacto behavior is that in order to distinguish a non-strictly viable candidate from an unviable candidate, if we encounter a bad argument conversion during overload resolution we must keep converting subsequent arguments because a subsequent conversion could render the candidate unviable instead of just non-strictly viable. But checking subsequent arguments can force template instantiations and result in otherwise avoidable hard errors. And in particular, all 'this' conversions are at worst bad, so this means the const/ref-qualifiers of a member function can't be used to prune a candidate quickly, which is the subject of the mentioned PR. This patch tries to improve the situation without changing the defacto output of add_candidates. Specifically, when considering a candidate during overload resolution this patch makes us shortcut argument conversion checking upon encountering the first bad conversion (tentatively marking the candidate as non-strictly viable, though it could ultimately be unviable) under the assumption that we'll eventually find a strictly viable candidate anyway (rendering the distinction between non-strictly viable and unviable moot, since both are worse than a strictly viable candidate). If this assumption turns out to be false, we'll fully reconsider the candidate under the defacto behavior (without the shortcutting). So in the best case (there's a strictly viable candidate), we avoid some argument conversions and/or template argument deduction that may cause a hard error. In the worst case (there's no such candidate), we have to redundantly consider some candidates twice. (In a previous version of the patch, to avoid this redundant checking I created a new "deferred" conversion type that represents a conversion that is yet to be performed, and instead of reconsidering a candidate I just realized its deferred conversions. But it doesn't seem this redundancy is a significant performance issue to justify the added complexity of this other approach.) Lots of care was taken to preserve the defacto behavior w.r.t. non-strictly viable candidates, but I wonder how important this behavior is nowadays? Can the notion of a non-strictly viable candidate be done away with, or is it here to stay? Bootstrapped and regtested on x86_64-pc-linux-gnu. PR c++/101904 gcc/cp/ChangeLog: * call.c (build_this_conversion): New function, split out from add_function_candidate. (add_function_candidate): New parameter shortcut_bad_convs. Document it. Use build_this_conversion. Stop at the first bad argument conversion when shortcut_bad_convs is true. (add_template_candidate_real): New parameter shortcut_bad_convs. Use build_this_conversion to check the 'this' conversion before attempting deduction. When the rejection reason code is rr_bad_arg_conversion, pass -1 instead of 0 as the viable parameter to add_candidate. Pass 'convs' to add_candidate. (add_template_candidate): New parameter shortcut_bad_convs. (add_template_conv_candidate): Pass false as shortcut_bad_convs to add_template_candidate_real. (add_candidates): Prefer to shortcut bad conversions during overload resolution under the assumption that we'll eventually see a strictly viable candidate. If this assumption turns out to be false, re-process the non-strictly viable candidates without shortcutting those bad conversions. gcc/testsuite/ChangeLog: * g++.dg/template/conv17.C: New test. --- gcc/cp/call.c | 250 + gcc/testsuite/g++.dg/template/conv17.C | 56 ++ 2 files changed, 233 insertions(+), 73 deletions(-) create mode 100644 gcc/testsuite/g++.dg/template/conv17.C diff --git a/gcc/cp/call.c b/gcc/cp/call.c index e4df72ec1a3..2a0bf933a89 100644 --- a/gcc/cp/call.c +++ b/gcc/cp/call.c @@ -175,17 +175,17 @@ static struct z_candidate *splice_viable (struct z_candidate *, bool, bool *); static bool any_strictly_viable (struct z_candidate *); static struct z_candidate *add_template_candidate (struct z_candidate **, tree, tree, tree, tree, const vec *, -tree, tree, tree, int, unification_kind_t, tsubst_flags_t); +tree, tree, tre
[PATCH] c++: parameter pack inside constexpr if [PR101764]
Here when partially substituting into the pack expansion, substitution into the constexpr if yields a still-dependent tree, so tsubst_expr returns an IF_STMT with an unsubstituted IF_COND and with IF_STMT_EXTRA_ARGS added to. Hence after partial substitution the pack expansion pattern still refers to the parameter pack 'ts' of level 2 (and it's thus represented in the new PACK_EXPANSION_PARAMETER_PACKS) even though the partially instantiated generic lambda admits only one level of template arguments. This causes us to crash during the subsequent instantiation with the lambda's template arguments because of the level mismatch. (Likewise when the constexpr if is replaced by a requires-expr, which too uses the extra args mechanism for delaying partial instantiation.) So essentially, a pack expansion pattern that contains a parameter pack inside an "extra args" tree doesn't play well with partial substitution thereof. This patch fixes this by forcing such pack expansions to use the extra args mechanism as well. Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk? PR c++/101764 gcc/cp/ChangeLog: * cp-tree.h (PACK_EXPANSION_FORCE_EXTRA_ARGS_P): New accessor macro. * pt.c (uses_extra_args_mechanism_p): New function. (find_parameter_pack_data::found_pack_within_extra_args_tree_p): New data member. (find_parameter_pack_data::inside_extra_args_tree_p): Likewise. (find_parameter_packs_r): Detect parameter packs within "extra args" trees and set found_pack_within_extra_args_tree_p appropriately. (make_pack_expansion): Set PACK_EXPANSION_FORCE_EXTRA_ARGS_P if found_pack_within_extra_args_tree_p. (use_pack_expansion_extra_args_p): Return true if there were unsubstituted packs and PACK_EXPANSION_FORCE_EXTRA_ARGS_P. (tsubst_pack_expansion): Pass the pack expansion to use_pack_expansion_extra_args_p. gcc/testsuite/ChangeLog: * g++.dg/cpp1z/constexpr-if35.C: New test. --- gcc/cp/cp-tree.h| 5 ++ gcc/cp/pt.c | 69 - gcc/testsuite/g++.dg/cpp1z/constexpr-if35.C | 18 ++ 3 files changed, 90 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/g++.dg/cpp1z/constexpr-if35.C diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h index ce7ca53a113..06dec495428 100644 --- a/gcc/cp/cp-tree.h +++ b/gcc/cp/cp-tree.h @@ -493,6 +493,7 @@ extern GTY(()) tree cp_global_trees[CPTI_MAX]; CONSTRUCTOR_C99_COMPOUND_LITERAL (in CONSTRUCTOR) OVL_NESTED_P (in OVERLOAD) DECL_MODULE_EXPORT_P (in _DECL) + PACK_EXPANSION_FORCE_EXTRA_ARGS_P (in *_PACK_EXPANSION) 4: IDENTIFIER_MARKED (IDENTIFIER_NODEs) TREE_HAS_CONSTRUCTOR (in INDIRECT_REF, SAVE_EXPR, CONSTRUCTOR, CALL_EXPR, or FIELD_DECL). @@ -3902,6 +3903,10 @@ struct GTY(()) lang_decl { /* True iff this pack expansion is for auto... in lambda init-capture. */ #define PACK_EXPANSION_AUTO_P(NODE) TREE_LANG_FLAG_2 (NODE) +/* True if we must use PACK_EXPANSION_EXTRA_ARGS to avoid partial + substitution into this pack expansion. */ +#define PACK_EXPANSION_FORCE_EXTRA_ARGS_P(NODE) TREE_LANG_FLAG_3 (NODE) + /* True iff the wildcard can match a template parameter pack. */ #define WILDCARD_PACK_P(NODE) TREE_LANG_FLAG_0 (NODE) diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c index fcf3ac31b25..a92dff88d9d 100644 --- a/gcc/cp/pt.c +++ b/gcc/cp/pt.c @@ -3855,6 +3855,20 @@ expand_builtin_pack_call (tree call, tree args, tsubst_flags_t complain, return NULL_TREE; } +/* Return true if the tree T uses the extra args mechanism for + deferring partial substitution into it. */ + +static bool +uses_extra_args_mechanism_p (tree t) +{ + return (PACK_EXPANSION_P (t) + || TREE_CODE (t) == REQUIRES_EXPR + || (TREE_CODE (t) == IF_STMT + && IF_STMT_CONSTEXPR_P (t))); +} + +static tree find_parameter_packs_r (tree *, int *, void*); + /* Structure used to track the progress of find_parameter_packs_r. */ struct find_parameter_pack_data { @@ -3867,6 +3881,16 @@ struct find_parameter_pack_data /* True iff we're making a type pack expansion. */ bool type_pack_expansion_p; + + /* True iff we found a pack inside a subtree that uses the extra + args mechanism. */ + bool found_pack_within_extra_args_tree_p = false; + +private: + /* True iff find_parameter_packs_r is currently visiting a tree + that uses the extra args mechanism or a subtree thereof. */ + bool inside_extra_args_tree_p = false; + friend tree find_parameter_packs_r (tree *, int *, void*); }; /* Identifies all of the argument packs that occur in a template @@ -3968,6 +3992,37 @@ find_parameter_packs_r (tree *tp, int *walk_subtrees, void* data) *ppd->parameter_packs = tree_cons (NULL_TREE, t, *ppd->parameter_packs); } + if (!ppd->inside_extra_args_tree_p + && uses_extra_args
[PATCH] Fix PR driver/79181 (and others), not deleting some /tmp/cc* files for LTO.
From: Andrew Pinski So the main issue here is that some signals are not setup unlike collect2. So this merges the setting up of the signal handlers to one function in collect-utils and has collect2 and lto-wrapper call that function. OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions. gcc/ChangeLog: PR driver/79181 * collect-utils.c (setup_signals): New declaration. * collect-utils.h (setup_signals): New function. * collect2.c (handler): Delete. (main): Instead of manually setting up the signals, just call setup_signals. * lto-wrapper.c (main): Likewise. --- gcc/collect-utils.c | 37 + gcc/collect-utils.h | 1 + gcc/collect2.c | 36 +--- gcc/lto-wrapper.c | 18 +- 4 files changed, 40 insertions(+), 52 deletions(-) diff --git a/gcc/collect-utils.c b/gcc/collect-utils.c index 6b5d61d5162..19423d31885 100644 --- a/gcc/collect-utils.c +++ b/gcc/collect-utils.c @@ -57,6 +57,43 @@ fatal_signal (int signum) so its normal effect occurs. */ kill (getpid (), signum); } + +/* Setup the signal handlers for the utils. */ +void +setup_signals (void) +{ +#ifdef SIGQUIT + if (signal (SIGQUIT, SIG_IGN) != SIG_IGN) +signal (SIGQUIT, fatal_signal); +#endif + if (signal (SIGINT, SIG_IGN) != SIG_IGN) +signal (SIGINT, fatal_signal); +#ifdef SIGALRM + if (signal (SIGALRM, SIG_IGN) != SIG_IGN) +signal (SIGALRM, fatal_signal); +#endif +#ifdef SIGHUP + if (signal (SIGHUP, SIG_IGN) != SIG_IGN) +signal (SIGHUP, fatal_signal); +#endif + if (signal (SIGSEGV, SIG_IGN) != SIG_IGN) +signal (SIGSEGV, fatal_signal); + if (signal (SIGTERM, SIG_IGN) != SIG_IGN) +signal (SIGTERM, fatal_signal); +#ifdef SIGPIPE + if (signal (SIGPIPE, SIG_IGN) != SIG_IGN) +signal (SIGPIPE, fatal_signal); +#endif +#ifdef SIGBUS + if (signal (SIGBUS, SIG_IGN) != SIG_IGN) +signal (SIGBUS, fatal_signal); +#endif +#ifdef SIGCHLD + /* We *MUST* set SIGCHLD to SIG_DFL so that the wait4() call will + receive the signal. A different setting is inheritable */ + signal (SIGCHLD, SIG_DFL); +#endif +} /* Wait for a process to finish, and exit if a nonzero status is found. */ diff --git a/gcc/collect-utils.h b/gcc/collect-utils.h index 4f0e3ce9832..15f831d778a 100644 --- a/gcc/collect-utils.h +++ b/gcc/collect-utils.h @@ -24,6 +24,7 @@ along with GCC; see the file COPYING3. If not see extern void notice (const char *, ...) __attribute__ ((format (printf, 1, 2))); extern void fatal_signal (int); +extern void setup_signals (void); extern struct pex_obj *collect_execute (const char *, char **, const char *, const char *, diff --git a/gcc/collect2.c b/gcc/collect2.c index 07092c2733a..cf04a58ba4d 100644 --- a/gcc/collect2.c +++ b/gcc/collect2.c @@ -301,7 +301,6 @@ const char tool_name[] = "collect2"; static symkind is_ctor_dtor (const char *); -static void handler (int); static void maybe_unlink_list (char **); static void add_to_list (struct head *, const char *); static int extract_init_priority (const char *); @@ -408,14 +407,6 @@ collect_atexit (void) tool_cleanup (false); } -static void -handler (int signo) -{ - tool_cleanup (true); - - signal (signo, SIG_DFL); - raise (signo); -} /* Notify user of a non-error, without translating the format string. */ void notice_translated (const char *cmsgid, ...) @@ -907,11 +898,7 @@ main (int argc, char **argv) COLLECT2_HOST_INITIALIZATION; #endif -#ifdef SIGCHLD - /* We *MUST* set SIGCHLD to SIG_DFL so that the wait4() call will - receive the signal. A different setting is inheritable */ - signal (SIGCHLD, SIG_DFL); -#endif + setup_signals (); /* Unlock the stdio streams. */ unlock_std_streams (); @@ -1051,27 +1038,6 @@ main (int argc, char **argv) if (argc < 2) fatal_error (input_location, "no arguments"); -#ifdef SIGQUIT - if (signal (SIGQUIT, SIG_IGN) != SIG_IGN) -signal (SIGQUIT, handler); -#endif - if (signal (SIGINT, SIG_IGN) != SIG_IGN) -signal (SIGINT, handler); -#ifdef SIGALRM - if (signal (SIGALRM, SIG_IGN) != SIG_IGN) -signal (SIGALRM, handler); -#endif -#ifdef SIGHUP - if (signal (SIGHUP, SIG_IGN) != SIG_IGN) -signal (SIGHUP, handler); -#endif - if (signal (SIGSEGV, SIG_IGN) != SIG_IGN) -signal (SIGSEGV, handler); -#ifdef SIGBUS - if (signal (SIGBUS, SIG_IGN) != SIG_IGN) -signal (SIGBUS, handler); -#endif - /* Extract COMPILER_PATH and PATH into our prefix list. */ prefix_from_env ("COMPILER_PATH", &cpath); prefix_from_env ("PATH", &path); diff --git a/gcc/lto-wrapper.c b/gcc/lto-wrapper.c index aae48aff100..903c258a03a 100644 --- a/gcc/lto-wrapper.c +++ b/gcc/lto-wrapper.c @@ -2125,23 +2125,7 @@ main (int argc, char *argv[]) if (atexit (lto_wrapper_cleanup) != 0) fatal_error (input_location, "% failed"); - if (signal (SIGINT, SIG_IGN) != S
Re: [PATCH] libffi: Fix MIPS r6 support
Jeff Law via Gcc-patches 于2021年8月30日周一 下午9:48写道: > > > > On 8/30/2021 2:47 AM, YunQiang Su wrote: > > 在 2021/8/30 5:00, Jeff Law 写道: > >> > >> > >> On 8/28/2021 1:23 AM, Xi Ruoyao wrote: > >>> On Fri, 2021-08-27 at 15:28 -0600, Jeff Law via Gcc-patches wrote: > > On 8/26/2021 10:58 PM, YunQiang Su wrote: > > for some instructions, MIPS r6 uses different encoding other than > > the previous releases. > > > > 1. mips/n32.S disable .set mips4: since it casuses old insn encoding > > is used. > > https://github.com/libffi/libffi/pull/396 > > 2. mips/ffi.c: the encoding for JR is hardcoded: we need to use > > different value for r6 and pre-r6. > > https://github.com/libffi/libffi/pull/401 > > > > libffi/ > > PR libffi/83636 > > * src/mips/n32.S: disable .set mips4 > > * src/mips/ffi.c: use different JR encoding for r6. > These should go to the upstream libffi project. Once accepted there > you > can add them to GCC. > >>> Hi Jeff, > >>> > >>> The two PRs are already merged, and released since libffi-3.3.0 (now > >>> the > >>> upstream latest release is 3.4.2). > >> ACK. Thanks for confirming. It's always OK to cherrypick/backport > >> from libffi back to GCC. > >> > >>> > >>> I don't have a MIPSr6 so I can't test though. > >> Understood. Me neither, but I really should get a tiny chroot for > >> mipsr6 so that my tester can validate it regularly. > >> > > > > We have port Debian to MIPS64r6el. > > http://58.246.137.130:20180/tools/tarball/ > > You can use both buster(Debian 10) or bullseye (Debian 11). > > > > And both qemu-system and qemu-user can work. > Understood. I need the minimal chroot to compress down under 100M to A minimal tarball is ready now with size about 87M. > make gitlab happy (that's where my scripts & jenkins system expect to > find them). My scripts to build the chroots also ensure a consistent > set of packages & version #s across the different systems being > tested. I've just never gotten around to building one for mips64r6. I > could probably take yours trim out unnecessary stuff and use it for > now. Getting a bootstrap test on the target weekly is definitely helpful. > That's great. Since the minimal rootfs lacks of lots of tools, and if you need some tools, you can just apt install them,then. > jeff
Re: [PATCH] Set bound/cmp/control for until wrap loop.
On 2021-08-30 20:02, Richard Biener wrote: On Mon, 30 Aug 2021, guojiufu wrote: On 2021-08-30 14:15, Jiufu Guo wrote: > Hi, > > In patch r12-3136, niter->control, niter->bound and niter->cmp are > derived from number_of_iterations_lt. While for 'until wrap condition', > the calculation in number_of_iterations_lt is not align the requirements > on the define of them and requirements in determine_exit_conditions. > > This patch calculate niter->control, niter->bound and niter->cmp in > number_of_iterations_until_wrap. > > The ICEs in the PR are pass with this patch. > Bootstrap and reg-tests pass on ppc64/ppc64le and x86. > Is this ok for trunk? > > BR. > Jiufu Guo > Add ChangeLog: gcc/ChangeLog: 2021-08-30 Jiufu Guo PR tree-optimization/102087 * tree-ssa-loop-niter.c (number_of_iterations_until_wrap): Set bound/cmp/control for niter. gcc/testsuite/ChangeLog: 2021-08-30 Jiufu Guo PR tree-optimization/102087 * gcc.dg/vect/pr101145_3.c: Update tests. * gcc.dg/pr102087.c: New test. > --- > gcc/tree-ssa-loop-niter.c | 14 +- > gcc/testsuite/gcc.dg/pr102087.c| 25 + > gcc/testsuite/gcc.dg/vect/pr101145_3.c | 4 +++- > 3 files changed, 41 insertions(+), 2 deletions(-) > create mode 100644 gcc/testsuite/gcc.dg/pr102087.c > > diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c > index 7af92d1c893..747f04d3ce0 100644 > --- a/gcc/tree-ssa-loop-niter.c > +++ b/gcc/tree-ssa-loop-niter.c > @@ -1482,7 +1482,7 @@ number_of_iterations_until_wrap (class loop *, > tree type, affine_iv *iv0, > affine_iv *iv1, class tree_niter_desc *niter) > { >tree niter_type = unsigned_type_for (type); > - tree step, num, assumptions, may_be_zero; > + tree step, num, assumptions, may_be_zero, span; >wide_int high, low, max, min; > >may_be_zero = fold_build2 (LE_EXPR, boolean_type_node, iv1->base, > iv0->base); > @@ -1513,6 +1513,8 @@ number_of_iterations_until_wrap (class loop *, > tree type, affine_iv *iv0, > low = wi::to_wide (iv0->base); > else >low = min; > + > + niter->control = *iv1; > } >/* {base, -C} < n. */ >else if (tree_int_cst_sign_bit (iv0->step) && integer_zerop > (iv1->step)) > @@ -1533,6 +1535,8 @@ number_of_iterations_until_wrap (class loop *, > tree type, affine_iv *iv0, > high = wi::to_wide (iv1->base); > else >high = max; > + > + niter->control = *iv0; > } >else > return false; it looks like the above two should already be in effect from the caller (guarding with integer_nozerop)? I add them just because set these fields in one function. Yes, they have been set in caller already, I could remove them here. > @@ -1556,6 +1560,14 @@ number_of_iterations_until_wrap (class loop *, > tree type, affine_iv *iv0, >niter->assumptions, assumptions); > >niter->control.no_overflow = false; > + niter->control.base = fold_build2 (MINUS_EXPR, niter_type, > + niter->control.base, > niter->control.step); how do we know IVn - STEP doesn't already wrap? The last IV value is just cross the max/min value of the type at the last iteration, then IVn - STEP is the nearest value to max(or min) and not wrap. A comment might be good to explain you're turning the simplified exit condition into { IVbase - STEP, +, STEP } != niter * STEP + (IVbase - STEP) which, when mathematically looking at it makes me wonder why there's the seemingly redundant '- STEP' term? Also is NE_EXPR really correct since STEP might be not 1? Only for non equality compares the '- STEP' should matter? I need to add comments for this. This is a little tricky. The last value of the original IV just cross max/min at most one STEP, at there wrapping already happen. Using "{IVbase, +, STEP} != niter * STEP + IVbase" is not wrong in the aspect of exit condition. But this would not work well with existing code: like determine_exit_conditions, which will convert NE_EXP to LT_EXPR/GT_EXPR. And so, the '- STEP' is added to adjust the IV.base and bound, with '- STEP' the bound will be the last value just before wrap. Thanks again for your review! BR. Jiufu Richard. > + span = fold_build2 (MULT_EXPR, niter_type, niter->niter, > +fold_convert (niter_type, niter->control.step)); > + niter->bound = fold_build2 (PLUS_EXPR, niter_type, span, > +fold_convert (niter_type, niter->control.base)); > + niter->bound = fold_convert (type, niter->bound); > + niter->cmp = NE_EXPR; > >return true; > } > diff --git a/gcc/testsuite/gcc.dg/pr102087.c > b/gcc/testsuite/gcc.dg/pr102087.c > new file mode 100644 > index 000..ef1f9f5cba9 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/pr102087.c > @@ -0,0 +1,25 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O3" } */ > + > +unsigned __attribute__ ((noinline)) > +foo (int *__restrict__
Re: [PATCH 19/34] rs6000: Handle overloads during program parsing
Hi Segher, On 8/27/21 6:07 PM, Segher Boessenkool wrote: Hi! On Thu, Jul 29, 2021 at 08:31:06AM -0500, Bill Schmidt wrote: Although this patch looks quite large, the changes are fairly minimal. Most of it is duplicating the large function that does the overload resolution using the automatically generated data structures instead of the old hand-generated ones. This doesn't make the patch terribly easy to review, unfortunately. Yeah, and it pretty important that it does the same as the old code did. Just be aware that generally we aren't changing the logic and functionality of overload handling. Good :-) * config/rs6000/rs6000-c.c (rs6000-builtins.h): New include. (altivec_resolve_new_overloaded_builtin): New forward decl. (rs6000_new_builtin_type_compatible): New function. (altivec_resolve_overloaded_builtin): Call altivec_resolve_new_overloaded_builtin. (altivec_build_new_resolved_builtin): New function. (altivec_resolve_new_overloaded_builtin): Likewise. * config/rs6000/rs6000-call.c (rs6000_new_builtin_is_supported_p): Likewise. No "_p" please (it already has a verb in the name, and explicit ones are much clearer anyway). OK. This requires a small change to the gen program to match. Does everything else belong in the C frontend file but this last function not? Maybe this could be split up better. Maybe there should be a separate file for just the builtin support, it probably is big enough? (This is all for later of course, but please think about it. Code rearrangement (or even refactoring) can be done at any time, there is no time pressure on it). Yes, it's a fair point to figure out whether this should be refactored better. This particular function is used both by the overloading support and in other places, so it really belongs elsewhere, I think. In general, we might consider simplifying the target hooks that land in the C frontend file and move the guts into a separate builtin support file. Certainly rs6000-call.c is still too large, and factoring out the builtins parts would be an improvement. I'll put this on the list for follow-up after the main series lands. +static tree +altivec_resolve_new_overloaded_builtin (location_t, tree, void *); This fits on one line, please do so (this is a declaration, not a function definition; those are put with the name in the first column, to make searching for them a lot easier). +static bool +rs6000_new_builtin_type_compatible (tree t, tree u) +{ + if (t == error_mark_node) +return false; + + if (INTEGRAL_TYPE_P (t) && INTEGRAL_TYPE_P (u)) +return true; + + if (TARGET_IEEEQUAD && TARGET_LONG_DOUBLE_128 + && is_float128_p (t) && is_float128_p (u)) Indent is wrong here? Yeah, how'd that happen... + if (POINTER_TYPE_P (t) && POINTER_TYPE_P (u)) +{ + t = TREE_TYPE (t); + u = TREE_TYPE (u); + if (TYPE_READONLY (u)) + t = build_qualified_type (t, TYPE_QUAL_CONST); +} Just use TYPE_MAIN_VARIANT, don't make garbage here? I don't understand how TYPE_MAIN_VARIANT is appropriate. Please explain? + /* The AltiVec overloading implementation is overall gross, but this + is particularly disgusting. The vec_{all,any}_{ge,le} builtins + are completely different for floating-point vs. integer vector + types, because the former has vcmpgefp, but the latter should use + vcmpgtXX. Yes. The integer comparisons were reduced to just two in original VMX (eq and gt, well signed and unsigned versions of the latter), but that cannot be done for floating point (all of ab can be false at the same time (for a or b NaN), so this cannot be compressed to just two functions, we really need three at least). We have the same thing in xscmp* btw, it's not just VMX. Having three ops for the majority of comparisons (and doing the rest with two or so) is nicer than having 14 :-) There aren't builtins for most of that, thankfully. + if (TARGET_DEBUG_BUILTIN) +fprintf (stderr, "altivec_resolve_overloaded_builtin, code = %4d, %s\n", +(int)fcode, IDENTIFIER_POINTER (DECL_NAME (fndecl))); (space after cast, also in debug code) + const char *name + = fcode == RS6000_OVLD_VEC_ADDE ? "vec_adde": "vec_sube"; (space before and after colon) Yeah, there is a lot of bad formatting in the copied code. I probably should have tried to clean it all up, but generally just did where I was making specific changes. I'll see what I can do about these things. + const char *name = fcode == RS6000_OVLD_VEC_ADDEC ? + "vec_addec": "vec_subec"; (ditto. also, ? cannot end a line. maybe just const char *name; name = fcode == RS6000_OVLD_VEC_ADDEC ? "vec_addec" : "vec_subec"; ) + const char *name + = fcode == RS6000_OVLD_VEC_SPLATS ? "vec_splats": "vec_promote"; (more) + case E_SFmode: type = V4SF_type_node; size = 4; break;
[PATCH] Fix gcc.dg/ipa/inline-8.c for -fPIC
From: Andrew Pinski The problem here is with -fPIC, both cmp and move don't bind locally so they are not even tried to be inlined. This fixes the issue by marking both functions as static and now the testcase passes for both -fPIC and -fno-PIC cases. OK? Tested on x86_64-linux-gnu. gcc/testsuite/ChangeLog: * gcc.dg/ipa/inline-8.c: Mark cmp and move as static so they both bind local and available for inlinine. --- gcc/testsuite/gcc.dg/ipa/inline-8.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/gcc/testsuite/gcc.dg/ipa/inline-8.c b/gcc/testsuite/gcc.dg/ipa/inline-8.c index 388283ca213..c51eec20fc8 100644 --- a/gcc/testsuite/gcc.dg/ipa/inline-8.c +++ b/gcc/testsuite/gcc.dg/ipa/inline-8.c @@ -6,13 +6,13 @@ #include extern int isnanf (float); /* Can't be inlined because isnanf will be optimized out. */ -int +static int cmp (float a) { return isnanf (a); } /* Can be inlined. */ -int +static int move (int a) { return a; -- 2.17.1
Re: [PING #2][PATCH] enable ranger and caching in pass_waccess
On Mon, Aug 30, 2021 at 4:27 PM Martin Sebor wrote: > > Ping: https://gcc.gnu.org/pipermail/gcc-patches/2021-August/578135.html > > Are there any further comments on the patch? > > Richard, you were kind enough to review the first two patches in this > series. Would you mind doing the same for this one? It continues in > the same vein. OK. Richard. > Martin > > On 8/25/21 3:26 PM, Martin Sebor wrote: > > On 8/25/21 10:14 AM, Andrew MacLeod wrote: > >> On 8/25/21 11:20 AM, Martin Sebor wrote: > >>> Ping: Andrew, did I answer your questions? Do you (or anyone else) > >>> have any other comments on the latest patch below? > >>> > >>> https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577865.html > >> > >> > >> I wasn't attempting to block it, its outside my review purview.. > >> > >> I was merely commenting that you should not need a pointer to a > >> range_query at all anymore > >> > >> > >>> > > > > > Are you planning to transition to using the get_range_query() > > interface instead of keeping a range_query pointer in the > > pointer_query class? > > This pass and to a smaller extent the pointer_query class that's > used by it and the strlen pass are still a work in progress. > I also still need to convert the strlen pass to use Ranger and > I expect it will take some changes to pointer_query. So at that > point, if going through get_range_query (cfun) everywhere is what > you recommend, I'm happy to do it. > > >> absolutely. you should not need to even know whether you have a ranger > >> instance running or not. get_range_query will give you whichever is > >> active, and there is ALWAYS something active.. defaulting to the > >> global version. > >> > >> the code in get_range() seems to be the end of the call chain which > >> uses the pointer and should be consolidated to something much simpler > >> > >> if (rvals && stmt) > >> { > >>if (!rvals->range_of_expr (vr, val, stmt)) > >> return NULL_TREE; > >> > >> <...> > >> > >>// ?? This entire function should use get_range_query or > >> get_global_range_query (), > >>// instead of doing something different for RVALS and global > >> ranges. > >> > >>if (!get_global_range_query ()->range_of_expr (vr, val) || > >> vr.undefined_p ()) > >> return NULL_TREE; > >> > >> > >> This entire section can boil down to something like > >> > >> if (!get_range_query (cfun)->range_of_expr (vr, val, stmt)) > >>return NULL; > > > > So get_range_query (cfun) will never be null? And no special handling > > of INTEGER_CST is needed, or checking for SSA_NAME. Nice. Attached > > is another revision of the same patch with this function simplified. > > > > (FYI: I expect the whole function to eventually go away, and to make > > other similar simplifications, but I'm not there yet.) > > > PS There has been an effort to get rid of global variables from GCC, > or, as the first step, to avoid accessing them directly(*). If and > when that happens, it seems like each pass will have to store either > the ranger instance as a member (directly or indirectly, via a member > of a class that stores it) or the function passed to pass::execute() > if it wants to access either. > > [*] https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573975.html > The patch at the link above wasn't approved but IIUC removing globals > from GCC is still a goal. > >>> > >> I have no idea what direction that is going, but the net effect will > >> be the same in the end. You'll be calling get_range_query() with > >> either the function pointer, or the pass pointer, or something.. but > >> you should never need to pass around a pointer to either a ranger or > >> range-query. As I said earlier, this will likely be a pass property > >> and accessed thru the pass eventually... but until then, use what we > >> have.. cfun. It'll be trivial to transition down the road to whatever > >> the ultimate solution is. > > > > Okay, I'll do that in my future changes. > > > > Martin >
Re: [PATCH] gdb: Add a dependency between gdb and libbacktrace
On Mon, Aug 30, 2021 at 10:22 PM Andrew Burgess wrote: > > I plan to make use of libbacktrace within GDB. I believe that the > patch below needs to be merged into GCCs toplevel directory and then > back-ported to the binutils-gdb repository. > > Is this OK to merge? OK. > Thanks, > Andrew > > --- > > GDB is going to start using libbacktrace, so add a build dependency > between the two projects. This change needs to be added into the GCC > toplevel files, and then back-ported to the binutils-gdb repository. > > ChangeLog: > > * Makefile.def: Add all-gdb dependency on all-libbacktrace. > * Makefile.in: Regenerate. > --- > ChangeLog| 5 + > Makefile.def | 1 + > Makefile.in | 1 + > 3 files changed, 7 insertions(+) > > diff --git a/Makefile.def b/Makefile.def > index fbfdb6fee08..de3e0052106 100644 > --- a/Makefile.def > +++ b/Makefile.def > @@ -423,6 +423,7 @@ dependencies = { module=all-gdb; on=all-sim; }; > dependencies = { module=all-gdb; on=all-libdecnumber; }; > dependencies = { module=all-gdb; on=all-libtermcap; }; > dependencies = { module=all-gdb; on=all-libctf; }; > +dependencies = { module=all-gdb; on=all-libbacktrace; }; > > // Host modules specific to gdbserver. > dependencies = { module=configure-gdbserver; on=all-gnulib; }; > diff --git a/Makefile.in b/Makefile.in > index 5c85fcc5dd0..61af99dc75a 100644 > --- a/Makefile.in > +++ b/Makefile.in > @@ -61237,6 +61237,7 @@ all-gdb: maybe-all-libiconv > all-gdb: maybe-all-opcodes > all-gdb: maybe-all-libdecnumber > all-gdb: maybe-all-libctf > +all-gdb: maybe-all-libbacktrace > all-gdbserver: maybe-all-libiberty > configure-gdbsupport: maybe-configure-intl > all-gdbsupport: maybe-all-intl > -- > 2.25.4 >
Re: [PATCH] Fix PR 90142: contrib/download_prerequisites uses test ==
On Mon, Aug 30, 2021 at 10:49 PM apinski--- via Gcc-patches wrote: > > From: Andrew Pinski > > Since == is not portable, it is better to use = in contrib/ > download_prerequisites. The only place == was used is inside > the function md5_check which is used only on Mac OS X. > > Tested on Mac OS X as: > ./contrib/download_prerequisites --md5 > Both with all files having the correct checksum and one with a broken one. Ok. > contrib/ChangeLog: > > * download_prerequisites (md5_check): Replace == inside > test with = to be more portable. > --- > contrib/download_prerequisites | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/contrib/download_prerequisites b/contrib/download_prerequisites > index 8f69b61f5a9..11c283ecb1a 100755 > --- a/contrib/download_prerequisites > +++ b/contrib/download_prerequisites > @@ -113,7 +113,7 @@ md5_check() { >md5_checksum_output=$(md5 -r "${file_to_check}") ># Grab the text before the first space >md5_checksum_detected="${md5_checksum_output%% *}" > - [ "${md5_checksum_expected}" == "${md5_checksum_detected}" ] \ > + [ "${md5_checksum_expected}" = "${md5_checksum_detected}" ] \ > || die "Cannot verify integrity of possibly corrupted file > ${file_to_check}" >echo "${file_to_check}: OK" > } > -- > 2.17.1 >
Re: *PING**2 – Re: [Patch] Fortran: Fix Bind(C) char-len check, add ptr-contiguous check
Hi Tobias, s/However, as argument they are also iteroperable/However, as an argument they are also interoperable/ s/ /* else: valid only sind F2018 - and an assumed-shape/rank array; however, gfc_notify_std is already called when those array type are used. Thus, silently accept F200x. */ / /* else: valid only since F2018 - and an assumed-shape/rank array; however, gfc_notify_std is already called when those array types are used. Thus, silently accept F200x. */ Apart from those nits, it looks good to me. It even regtests OK :-) Thanks for sorting out the standard-ese. OK for mainline and, I would suggest 11-branch. Cheers Paul On Sun, 29 Aug 2021 at 08:35, Tobias Burnus wrote: > PING**2 > > On 25.08.21 20:58, Tobias Burnus wrote: > > Early *PING*. > > (I also should still review several Fortan patches... There are lots of > > patches waiting for review :-/) > > > > On 20.08.21 19:24, Tobias Burnus wrote: > >> The following is about interoperability (BIND(C)) only. > >> > >> > >> * The patch adds a missing check for pointer + contiguous. > >> (Rejected to avoid copy-in issues? Or checking issues?) > >> > >> > >> * And it corrects an issue regarding len > 1 characters. While > >> > >> subroutine foo(x) > >> character(len=2) :: x(*) > >> > >> is valid Fortran code (the argument can be "abce" or ['a','b','c','d'] > >> or ...) > >> – and would work also with bind(C) as the len=2 does not need to be > >> passed > >> as hidden argument as len is a constant. > >> However, it is not valid nonetheless. > >> > >> > >> OK? Comments? > >> > >> Tobias > >> > >> > >> PS: Referenced locations in the standard (F2018): > >> > >> C1554 If proc-language-binding-spec is specified for a procedure, > >> each of its dummy arguments shall be an interoperable procedure (18.3.6) > >> or a variable that is interoperable (18.3.4, 18.3.5), assumed-shape, > >> assumed-rank, assumed-type, of type CHARACTER with assumed length, > >> or that has the ALLOCATABLE or POINTER attribute. > >> > >> 18.3.1: "... If the type is character, the length type parameter is > >> interoperable if and only if its value is one. ..." > >> > >> "18.3.4 Interoperability of scalar variables": > >> "... A named scalar Fortran variable is interoperable ... if it > >> is of type character12its length is not assumed or declared by > >> an expression that is not a constant expression." > >> > >> 18.3.5: Likewise but for arrays. > >> > >> 18.3.6 "... Fortran procedure interface is interoperable with a C > >> function prototype ..." > >> "(5) any dummy argument without the VALUE attribute corresponds > >> to a formal parameter of the prototype that is of a pointer type, > >> and either > >> • the dummy argument is interoperable with an entity of the > >> referenced type ..." > >> (Remark: those are passed as byte stream) > >> "• the dummy argument is a nonallocatable nonpointer variable of > >> type > >> CHARACTER with assumed character length and the formal > >> parameter is > >> a pointer to CFI_cdesc_t, > >> • the dummy argument is allocatable, assumed-shape, > >> assumed-rank, or > >> a pointer without the CONTIGUOUS attribute, and the formal > >> parameter > >> is a pointer to CFI_cdesc_t, or > >> (Remark: those two use an array descriptor, also for > >> explicit-size/assumed-size > >> arrays or for scalars.) > >> • the dummy argument is assumed-type ..." > >> > > - > > Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße > > 201, 80634 München; Gesellschaft mit beschränkter Haftung; > > Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: > > München; Registergericht München, HRB 106955 > -- "If you can't explain it simply, you don't understand it well enough" - Albert Einstein
Re: [PATCH] Make sure we're playing with integral modes before call extract_integral_bit_field.
On Fri, Aug 27, 2021 at 6:50 AM Hongtao Liu wrote: > > On Thu, Aug 26, 2021 at 7:09 PM Richard Biener via Gcc-patches > wrote: > > > > On Thu, Aug 26, 2021 at 12:50 PM Richard Sandiford > > wrote: > > > > > > Richard Biener via Gcc-patches writes: > > > > On Thu, Aug 26, 2021 at 11:06 AM Richard Sandiford > > > > wrote: > > > >> > > > >> Richard Biener via Gcc-patches writes: > > > >> > One thought I had is whether we can "fix" validate_subreg to have > > > >> > less > > > >> > "weird" allowed float-int > > > >> > special cases. As said upthread I think that we either should allow > > > >> > all of those, implying that > > > >> > subregs work semantically as if there's subregs to same-sized integer > > > >> > modes inbetween or > > > >> > disallow them all and make sure we're actually doing that > > > >> > explicitely. > > > >> > > > > >> > For example > > > >> > > > > >> > /* ??? Similarly, e.g. with (subreg:DF (reg:TI)). Though > > > >> > store_bit_field > > > >> > is the culprit here, and not the backends. */ > > > >> > else if (known_ge (osize, regsize) && known_ge (isize, osize)) > > > >> > ; > > > >> > > > > >> > I can't decipther rtl.text as to what the semantics of such a subreg > > > >> > is > > > >> > given the docs hand-wave about WORDS_BIG_ENDIAN vs. > > > >> > FLOAT_WORDS_BIG_ENDIAN but don't actually say what happens > > > >> > when you mix those in a subreg. So maybe the above should > > > >> > have explicitely have WORDS_BIG_ENDIAN == FLOAT_WORDS_BIG_ENDIAN. > > > >> > > > > >> > But then the world would be much simpler if subregs of non-same size > > > >> > modes have explicit documentation for the mode kinds we have. > > > >> > > > >> Yeah. Although validate_subreg was a good idea, some of the mode > > > >> checks > > > >> are IMO a failed experiment. The hope was that eventually we'd remove > > > >> all those special exceptions once the culprit has been fixed. However, > > > >> the code is over 16 years old at this point and those changes never > > > >> happened. > > > >> > > > >> Nested subregs aren't a thing (thankfully) and one of the big > > > >> disadvantages > > > >> of the current validate_subreg mode-changing rules is that they aren't > > > >> transitive. This can artificially require temporary pseudos for things > > > >> that could be expressed directly as a single subreg. > > > > > > > > And that's what the proposed patch does (add same-mode size integer mode > > > > punning intermediate subregs). > > > > > > > > So if that's not supposed to be necessary then why restrict subregs at > > > > all? > > > > > > I was trying to say: I'm not sure we should. > > > > > > > I mean you seem to imply that the semantics would be clear and > > > > well-defined > > > > (to you - not to me). The only thing is that of course not all subregs > > > > are > > > > "implemented" by a target (or can be, w/o spilling). > > > > > > Yeah. That's for TARGET_CAN_CHANGE_MODE_CLASS to decide. > > > But it only comes in to play during RA or when trying to take > > > the subreg of a particular hard register. Transitivity doesn't > > > matter so much for the hard register case since the result of > > > simplify_gen_subreg should then be another hard register. > > > > > > > Which means - we should adjust validate_subreg with another special-case > > > > or rather generalize the existing ones to an overall set that makes more > > > > sense? > > > > > > Maybe it's too radical, but I would whether we should just get rid of: > > > > > > /* ??? This should not be here. Temporarily continue to allow word_mode > > > subregs of anything. The most common offender is (subreg:SI > > > (reg:DF)). > > > Generally, backends are doing something sketchy but it'll take time > > > to > > > fix them all. */ > > > if (omode == word_mode) > > > ; > > > /* ??? Similarly, e.g. with (subreg:DF (reg:TI)). Though > > > store_bit_field > > > is the culprit here, and not the backends. */ > > > else if (known_ge (osize, regsize) && known_ge (isize, osize)) > > > ; > > > /* Allow component subregs of complex and vector. Though given the > > > below > > > extraction rules, it's not always clear what that means. */ > > > else if ((COMPLEX_MODE_P (imode) || VECTOR_MODE_P (imode)) > > >&& GET_MODE_INNER (imode) == omode) > > > ; > > > /* ??? x86 sse code makes heavy use of *paradoxical* vector subregs, > > > i.e. (subreg:V4SF (reg:SF) 0) or (subreg:V4SF (reg:V2SF) 0). This > > > surely isn't the cleanest way to represent this. It's questionable > > > if this ought to be represented at all -- why can't this all be > > > hidden > > > in post-reload splitters that make arbitrarily mode changes to the > > > registers themselves. */ > > > else if (VECTOR_MODE_P (omode) > > >&& GET_MODE_INNER (omode) == GET_MODE_INNER (imode)) > > > ; > > > /* Subregs involving floating point modes are not
Re: [PATCH] Fix x86/56337 : 1<<28 alignment is broken
On Mon, Aug 30, 2021 at 11:38 PM Andrew Pinski via Gcc-patches wrote: > > On Fri, Jul 23, 2021 at 1:33 PM apinski--- via Gcc-patches > wrote: > > > > From: Andrew Pinski > > > > The problem here is the x86_64 back-end uses a signed integer > > for alignment and then divides by BITS_PER_UNIT so if we had > > INT_MIN (which is what 1<<28*8 is), we would get the wrong result. > > > > This fixes the problem by using unsigned for the argument to > > x86_output_aligned_bss and x86_output_aligned_bss. > > > > OK? Bootstrapped and tested on x86_64-linux-gnu. > > Ping? OK. Thanks, Richard. > > > > gcc/ChangeLog: > > > > PR target/56337 > > * config/i386/i386-protos.h (x86_output_aligned_bss): > > Change align argument to unsigned type. > > (x86_elf_aligned_decl_common): Likewise. > > * config/i386/i386.c (x86_elf_aligned_decl_common): Likewise. > > (x86_output_aligned_bss): Likewise. > > --- > > gcc/config/i386/i386-protos.h | 4 ++-- > > gcc/config/i386/i386.c| 4 ++-- > > 2 files changed, 4 insertions(+), 4 deletions(-) > > > > diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h > > index 07ac02aff69..fa1b0d0d787 100644 > > --- a/gcc/config/i386/i386-protos.h > > +++ b/gcc/config/i386/i386-protos.h > > @@ -325,9 +325,9 @@ struct ix86_address > > extern int ix86_decompose_address (rtx, struct ix86_address *); > > extern int memory_address_length (rtx, bool); > > extern void x86_output_aligned_bss (FILE *, tree, const char *, > > - unsigned HOST_WIDE_INT, int); > > + unsigned HOST_WIDE_INT, unsigned); > > extern void x86_elf_aligned_decl_common (FILE *, tree, const char *, > > -unsigned HOST_WIDE_INT, int); > > +unsigned HOST_WIDE_INT, unsigned); > > > > #ifdef RTX_CODE > > extern void ix86_fp_comparison_codes (enum rtx_code code, enum rtx_code *, > > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c > > index 876a19f4c1f..f86d11dfb11 100644 > > --- a/gcc/config/i386/i386.c > > +++ b/gcc/config/i386/i386.c > > @@ -837,7 +837,7 @@ x86_64_elf_unique_section (tree decl, int reloc) > > void > > x86_elf_aligned_decl_common (FILE *file, tree decl, > > const char *name, unsigned HOST_WIDE_INT size, > > - int align) > > + unsigned align) > > { > >if ((ix86_cmodel == CM_MEDIUM || ix86_cmodel == CM_MEDIUM_PIC) > >&& size > (unsigned int)ix86_section_threshold) > > @@ -858,7 +858,7 @@ x86_elf_aligned_decl_common (FILE *file, tree decl, > > > > void > > x86_output_aligned_bss (FILE *file, tree decl, const char *name, > > - unsigned HOST_WIDE_INT size, int align) > > + unsigned HOST_WIDE_INT size, unsigned align) > > { > >if ((ix86_cmodel == CM_MEDIUM || ix86_cmodel == CM_MEDIUM_PIC) > >&& size > (unsigned int)ix86_section_threshold) > > -- > > 2.17.1 > >
Re: [PATCH] Fix PR driver/79181 (and others), not deleting some /tmp/cc* files for LTO.
On Tue, Aug 31, 2021 at 4:17 AM apinski--- via Gcc-patches wrote: > > From: Andrew Pinski > > So the main issue here is that some signals are not setup unlike collect2. > So this merges the setting up of the signal handlers to one function in > collect-utils and has collect2 and lto-wrapper call that function. > > OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions. OK. > gcc/ChangeLog: > > PR driver/79181 > * collect-utils.c (setup_signals): New declaration. > * collect-utils.h (setup_signals): New function. > * collect2.c (handler): Delete. > (main): Instead of manually setting up the signals, > just call setup_signals. > * lto-wrapper.c (main): Likewise. > --- > gcc/collect-utils.c | 37 + > gcc/collect-utils.h | 1 + > gcc/collect2.c | 36 +--- > gcc/lto-wrapper.c | 18 +- > 4 files changed, 40 insertions(+), 52 deletions(-) > > diff --git a/gcc/collect-utils.c b/gcc/collect-utils.c > index 6b5d61d5162..19423d31885 100644 > --- a/gcc/collect-utils.c > +++ b/gcc/collect-utils.c > @@ -57,6 +57,43 @@ fatal_signal (int signum) > so its normal effect occurs. */ >kill (getpid (), signum); > } > + > +/* Setup the signal handlers for the utils. */ > +void > +setup_signals (void) > +{ > +#ifdef SIGQUIT > + if (signal (SIGQUIT, SIG_IGN) != SIG_IGN) > +signal (SIGQUIT, fatal_signal); > +#endif > + if (signal (SIGINT, SIG_IGN) != SIG_IGN) > +signal (SIGINT, fatal_signal); > +#ifdef SIGALRM > + if (signal (SIGALRM, SIG_IGN) != SIG_IGN) > +signal (SIGALRM, fatal_signal); > +#endif > +#ifdef SIGHUP > + if (signal (SIGHUP, SIG_IGN) != SIG_IGN) > +signal (SIGHUP, fatal_signal); > +#endif > + if (signal (SIGSEGV, SIG_IGN) != SIG_IGN) > +signal (SIGSEGV, fatal_signal); > + if (signal (SIGTERM, SIG_IGN) != SIG_IGN) > +signal (SIGTERM, fatal_signal); > +#ifdef SIGPIPE > + if (signal (SIGPIPE, SIG_IGN) != SIG_IGN) > +signal (SIGPIPE, fatal_signal); > +#endif > +#ifdef SIGBUS > + if (signal (SIGBUS, SIG_IGN) != SIG_IGN) > +signal (SIGBUS, fatal_signal); > +#endif > +#ifdef SIGCHLD > + /* We *MUST* set SIGCHLD to SIG_DFL so that the wait4() call will > + receive the signal. A different setting is inheritable */ > + signal (SIGCHLD, SIG_DFL); > +#endif > +} > > /* Wait for a process to finish, and exit if a nonzero status is found. */ > > diff --git a/gcc/collect-utils.h b/gcc/collect-utils.h > index 4f0e3ce9832..15f831d778a 100644 > --- a/gcc/collect-utils.h > +++ b/gcc/collect-utils.h > @@ -24,6 +24,7 @@ along with GCC; see the file COPYING3. If not see > extern void notice (const char *, ...) >__attribute__ ((format (printf, 1, 2))); > extern void fatal_signal (int); > +extern void setup_signals (void); > > extern struct pex_obj *collect_execute (const char *, char **, > const char *, const char *, > diff --git a/gcc/collect2.c b/gcc/collect2.c > index 07092c2733a..cf04a58ba4d 100644 > --- a/gcc/collect2.c > +++ b/gcc/collect2.c > @@ -301,7 +301,6 @@ const char tool_name[] = "collect2"; > > static symkind is_ctor_dtor (const char *); > > -static void handler (int); > static void maybe_unlink_list (char **); > static void add_to_list (struct head *, const char *); > static int extract_init_priority (const char *); > @@ -408,14 +407,6 @@ collect_atexit (void) >tool_cleanup (false); > } > > -static void > -handler (int signo) > -{ > - tool_cleanup (true); > - > - signal (signo, SIG_DFL); > - raise (signo); > -} > /* Notify user of a non-error, without translating the format string. */ > void > notice_translated (const char *cmsgid, ...) > @@ -907,11 +898,7 @@ main (int argc, char **argv) >COLLECT2_HOST_INITIALIZATION; > #endif > > -#ifdef SIGCHLD > - /* We *MUST* set SIGCHLD to SIG_DFL so that the wait4() call will > - receive the signal. A different setting is inheritable */ > - signal (SIGCHLD, SIG_DFL); > -#endif > + setup_signals (); > >/* Unlock the stdio streams. */ >unlock_std_streams (); > @@ -1051,27 +1038,6 @@ main (int argc, char **argv) >if (argc < 2) > fatal_error (input_location, "no arguments"); > > -#ifdef SIGQUIT > - if (signal (SIGQUIT, SIG_IGN) != SIG_IGN) > -signal (SIGQUIT, handler); > -#endif > - if (signal (SIGINT, SIG_IGN) != SIG_IGN) > -signal (SIGINT, handler); > -#ifdef SIGALRM > - if (signal (SIGALRM, SIG_IGN) != SIG_IGN) > -signal (SIGALRM, handler); > -#endif > -#ifdef SIGHUP > - if (signal (SIGHUP, SIG_IGN) != SIG_IGN) > -signal (SIGHUP, handler); > -#endif > - if (signal (SIGSEGV, SIG_IGN) != SIG_IGN) > -signal (SIGSEGV, handler); > -#ifdef SIGBUS > - if (signal (SIGBUS, SIG_IGN) != SIG_IGN) > -signal (SIGBUS, handler); > -#endif > - >/* Extract COMPILER_PATH and PATH into our prefix list. */ >prefix_from_env ("COMPILER_PATH", &cpath); >
Re: [PATCH] Fix gcc.dg/ipa/inline-8.c for -fPIC
On Tue, Aug 31, 2021 at 7:41 AM apinski--- via Gcc-patches wrote: > > From: Andrew Pinski > > The problem here is with -fPIC, both cmp and move > don't bind locally so they are not even tried to be > inlined. This fixes the issue by marking both > functions as static and now the testcase passes > for both -fPIC and -fno-PIC cases. > > OK? Tested on x86_64-linux-gnu. OK > gcc/testsuite/ChangeLog: > > * gcc.dg/ipa/inline-8.c: Mark cmp and move as > static so they both bind local and available for > inlinine. > --- > gcc/testsuite/gcc.dg/ipa/inline-8.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/gcc/testsuite/gcc.dg/ipa/inline-8.c > b/gcc/testsuite/gcc.dg/ipa/inline-8.c > index 388283ca213..c51eec20fc8 100644 > --- a/gcc/testsuite/gcc.dg/ipa/inline-8.c > +++ b/gcc/testsuite/gcc.dg/ipa/inline-8.c > @@ -6,13 +6,13 @@ > #include > extern int isnanf (float); > /* Can't be inlined because isnanf will be optimized out. */ > -int > +static int > cmp (float a) > { >return isnanf (a); > } > /* Can be inlined. */ > -int > +static int > move (int a) > { >return a; > -- > 2.17.1 >
[PATCH] PR middle-end/100810: Penalize IV candidates with undefined value bases
Time to hopefully earn some goodwill from the team; this patch fixes a P1 wrong-code-on-valid regression in ivopts. Many thanks to Andrew Pinski for help with the analysis. Consider the code fragment below: int i; for (j=0; j<10; j++) i++; This results in a loop containing two induction variables, i and j, where j is initialized, but i isn't (typically indicated in tree dumps by qualified ssa names like i(D). In PR 100810, the loop optimizers end up selecting i as the "best" candidate (perhaps because having no initialization it's cheaper) which leads to problems in later passes when (the equivalent of) j is considered not to have the value 10 after the loop, as its definition is now computed from an undefined/uninitialized value. The fix below is to add a field to track whether any "iv_group" contains a use of an iv based on a undefined value, and then prohibit IVs that are based on undefined values from being candidates for groups that don't use undefined values. This may seem lenient, but it allows an IV with an undefined base to be a candidate for itself, and is a sufficient condition to avoid the above bug/regression. A stricter condition might be to only allow "undefined_value iv"s as candidates for iv_groups where *all* uses are to "undefined_value ivs"? My concern was that this might lead to cases/loops that no longer have suitable candidates (i.e. a possible performance regression). Hopefully, the tree-loop optimization experts agree with my analysis/fix. This patch has been tested on x86_64-pc-linux-gnu with a "make bootstrap" and "make -k check" with no new failures. Ok for mainline? 2021-08-31 Roger Sayle Andrew Pinski gcc/ChangeLog PR middle-end/100810 * tree-ssa-loop-ivopts.c (struct iv_group): Add a new uses_undefined_value_p field, indicating this group has uses of iv's whose base is ssa_undefined_value_p. (record_use): Update uses_undefined_value_p as required. (record_group): Initialize uses_undefined_value_p to false. (determine_group_iv_cost_generic): Consider a candidate with a ssa_undefined_value_p base to have infinite_cost for a group where uses_undefined_value_p is false. gcc/testsuite/ChangeLog PR middle-end/100810 * gcc.dg/pr100810.c: New test case. Roger -- diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c index 4a498ab..ca8f526 100644 --- a/gcc/tree-ssa-loop-ivopts.c +++ b/gcc/tree-ssa-loop-ivopts.c @@ -432,6 +432,8 @@ struct iv_group struct iv_cand *selected; /* To indicate this is a doloop use group. */ bool doloop_p; + /* This group uses undefined values. */ + bool uses_undefined_value_p; /* Uses in the group. */ vec vuses; }; @@ -1540,6 +1542,12 @@ record_use (struct iv_group *group, tree *use_p, struct iv *iv, use->addr_offset = addr_offset; group->vuses.safe_push (use); + + /* Record uses of undefined values. */ + if (TREE_CODE (iv->base) == SSA_NAME + && ssa_undefined_value_p (iv->base)) +group->uses_undefined_value_p = true; + return use; } @@ -1582,6 +1590,7 @@ record_group (struct ivopts_data *data, enum use_type type) group->related_cands = BITMAP_ALLOC (NULL); group->vuses.create (1); group->doloop_p = false; + group->uses_undefined_value_p = false; data->vgroups.safe_push (group); return group; @@ -4960,6 +4969,12 @@ determine_group_iv_cost_generic (struct ivopts_data *data, the candidate. */ if (cand->pos == IP_ORIGINAL && cand->incremented_at == use->stmt) cost = no_cost; + /* Disallow using an iv based on an undefined value as a candidate + replacement for a group that uses only defined values. */ + else if (!group->uses_undefined_value_p + && TREE_CODE (cand->iv->base) == SSA_NAME + && ssa_undefined_value_p (cand->iv->base)) +cost = infinite_cost; else cost = get_computation_cost (data, use, cand, false, &inv_vars, NULL, &inv_expr); /* { dg-do run } */ /* { dg-options "-O2" } */ int a, b = 1, c = 1, e, f = 1, g, h, j; volatile int d; static void k() { int i; h = b; if (c && a >= 0) { while (a) { i++; h--; } if (g) for (h = 0; h < 2; h++) ; if (!b) i &&d; } } static void l() { for (; j < 1; j++) if (!e && c && f) k(); } int main() { if (f) l(); if (h != 1) __builtin_abort(); return 0; }
Re: [PATCH] Make sure we're playing with integral modes before call extract_integral_bit_field.
On Tue, Aug 31, 2021 at 2:11 PM Richard Biener wrote: > > On Fri, Aug 27, 2021 at 6:50 AM Hongtao Liu wrote: > > > > On Thu, Aug 26, 2021 at 7:09 PM Richard Biener via Gcc-patches > > wrote: > > > > > > On Thu, Aug 26, 2021 at 12:50 PM Richard Sandiford > > > wrote: > > > > > > > > Richard Biener via Gcc-patches writes: > > > > > On Thu, Aug 26, 2021 at 11:06 AM Richard Sandiford > > > > > wrote: > > > > >> > > > > >> Richard Biener via Gcc-patches writes: > > > > >> > One thought I had is whether we can "fix" validate_subreg to have > > > > >> > less > > > > >> > "weird" allowed float-int > > > > >> > special cases. As said upthread I think that we either should > > > > >> > allow > > > > >> > all of those, implying that > > > > >> > subregs work semantically as if there's subregs to same-sized > > > > >> > integer > > > > >> > modes inbetween or > > > > >> > disallow them all and make sure we're actually doing that > > > > >> > explicitely. > > > > >> > > > > > >> > For example > > > > >> > > > > > >> > /* ??? Similarly, e.g. with (subreg:DF (reg:TI)). Though > > > > >> > store_bit_field > > > > >> > is the culprit here, and not the backends. */ > > > > >> > else if (known_ge (osize, regsize) && known_ge (isize, osize)) > > > > >> > ; > > > > >> > > > > > >> > I can't decipther rtl.text as to what the semantics of such a > > > > >> > subreg is > > > > >> > given the docs hand-wave about WORDS_BIG_ENDIAN vs. > > > > >> > FLOAT_WORDS_BIG_ENDIAN but don't actually say what happens > > > > >> > when you mix those in a subreg. So maybe the above should > > > > >> > have explicitely have WORDS_BIG_ENDIAN == FLOAT_WORDS_BIG_ENDIAN. > > > > >> > > > > > >> > But then the world would be much simpler if subregs of non-same > > > > >> > size > > > > >> > modes have explicit documentation for the mode kinds we have. > > > > >> > > > > >> Yeah. Although validate_subreg was a good idea, some of the mode > > > > >> checks > > > > >> are IMO a failed experiment. The hope was that eventually we'd > > > > >> remove > > > > >> all those special exceptions once the culprit has been fixed. > > > > >> However, > > > > >> the code is over 16 years old at this point and those changes never > > > > >> happened. > > > > >> > > > > >> Nested subregs aren't a thing (thankfully) and one of the big > > > > >> disadvantages > > > > >> of the current validate_subreg mode-changing rules is that they > > > > >> aren't > > > > >> transitive. This can artificially require temporary pseudos for > > > > >> things > > > > >> that could be expressed directly as a single subreg. > > > > > > > > > > And that's what the proposed patch does (add same-mode size integer > > > > > mode > > > > > punning intermediate subregs). > > > > > > > > > > So if that's not supposed to be necessary then why restrict subregs > > > > > at all? > > > > > > > > I was trying to say: I'm not sure we should. > > > > > > > > > I mean you seem to imply that the semantics would be clear and > > > > > well-defined > > > > > (to you - not to me). The only thing is that of course not all > > > > > subregs are > > > > > "implemented" by a target (or can be, w/o spilling). > > > > > > > > Yeah. That's for TARGET_CAN_CHANGE_MODE_CLASS to decide. > > > > But it only comes in to play during RA or when trying to take > > > > the subreg of a particular hard register. Transitivity doesn't > > > > matter so much for the hard register case since the result of > > > > simplify_gen_subreg should then be another hard register. > > > > > > > > > Which means - we should adjust validate_subreg with another > > > > > special-case > > > > > or rather generalize the existing ones to an overall set that makes > > > > > more > > > > > sense? > > > > > > > > Maybe it's too radical, but I would whether we should just get rid of: > > > > > > > > /* ??? This should not be here. Temporarily continue to allow > > > > word_mode > > > > subregs of anything. The most common offender is (subreg:SI > > > > (reg:DF)). > > > > Generally, backends are doing something sketchy but it'll take > > > > time to > > > > fix them all. */ > > > > if (omode == word_mode) > > > > ; > > > > /* ??? Similarly, e.g. with (subreg:DF (reg:TI)). Though > > > > store_bit_field > > > > is the culprit here, and not the backends. */ > > > > else if (known_ge (osize, regsize) && known_ge (isize, osize)) > > > > ; > > > > /* Allow component subregs of complex and vector. Though given the > > > > below > > > > extraction rules, it's not always clear what that means. */ > > > > else if ((COMPLEX_MODE_P (imode) || VECTOR_MODE_P (imode)) > > > >&& GET_MODE_INNER (imode) == omode) > > > > ; > > > > /* ??? x86 sse code makes heavy use of *paradoxical* vector subregs, > > > > i.e. (subreg:V4SF (reg:SF) 0) or (subreg:V4SF (reg:V2SF) 0). This > > > > surely isn't the cleanest way to represent this. It's que
Re: [PATCH] Make sure we're playing with integral modes before call extract_integral_bit_field.
On Tue, Aug 31, 2021 at 2:30 PM Hongtao Liu wrote: > > On Tue, Aug 31, 2021 at 2:11 PM Richard Biener > wrote: > > > > On Fri, Aug 27, 2021 at 6:50 AM Hongtao Liu wrote: > > > > > > On Thu, Aug 26, 2021 at 7:09 PM Richard Biener via Gcc-patches > > > wrote: > > > > > > > > On Thu, Aug 26, 2021 at 12:50 PM Richard Sandiford > > > > wrote: > > > > > > > > > > Richard Biener via Gcc-patches writes: > > > > > > On Thu, Aug 26, 2021 at 11:06 AM Richard Sandiford > > > > > > wrote: > > > > > >> > > > > > >> Richard Biener via Gcc-patches writes: > > > > > >> > One thought I had is whether we can "fix" validate_subreg to > > > > > >> > have less > > > > > >> > "weird" allowed float-int > > > > > >> > special cases. As said upthread I think that we either should > > > > > >> > allow > > > > > >> > all of those, implying that > > > > > >> > subregs work semantically as if there's subregs to same-sized > > > > > >> > integer > > > > > >> > modes inbetween or > > > > > >> > disallow them all and make sure we're actually doing that > > > > > >> > explicitely. > > > > > >> > > > > > > >> > For example > > > > > >> > > > > > > >> > /* ??? Similarly, e.g. with (subreg:DF (reg:TI)). Though > > > > > >> > store_bit_field > > > > > >> > is the culprit here, and not the backends. */ > > > > > >> > else if (known_ge (osize, regsize) && known_ge (isize, osize)) > > > > > >> > ; > > > > > >> > > > > > > >> > I can't decipther rtl.text as to what the semantics of such a > > > > > >> > subreg is > > > > > >> > given the docs hand-wave about WORDS_BIG_ENDIAN vs. > > > > > >> > FLOAT_WORDS_BIG_ENDIAN but don't actually say what happens > > > > > >> > when you mix those in a subreg. So maybe the above should > > > > > >> > have explicitely have WORDS_BIG_ENDIAN == FLOAT_WORDS_BIG_ENDIAN. > > > > > >> > > > > > > >> > But then the world would be much simpler if subregs of non-same > > > > > >> > size > > > > > >> > modes have explicit documentation for the mode kinds we have. > > > > > >> > > > > > >> Yeah. Although validate_subreg was a good idea, some of the mode > > > > > >> checks > > > > > >> are IMO a failed experiment. The hope was that eventually we'd > > > > > >> remove > > > > > >> all those special exceptions once the culprit has been fixed. > > > > > >> However, > > > > > >> the code is over 16 years old at this point and those changes never > > > > > >> happened. > > > > > >> > > > > > >> Nested subregs aren't a thing (thankfully) and one of the big > > > > > >> disadvantages > > > > > >> of the current validate_subreg mode-changing rules is that they > > > > > >> aren't > > > > > >> transitive. This can artificially require temporary pseudos for > > > > > >> things > > > > > >> that could be expressed directly as a single subreg. > > > > > > > > > > > > And that's what the proposed patch does (add same-mode size integer > > > > > > mode > > > > > > punning intermediate subregs). > > > > > > > > > > > > So if that's not supposed to be necessary then why restrict subregs > > > > > > at all? > > > > > > > > > > I was trying to say: I'm not sure we should. > > > > > > > > > > > I mean you seem to imply that the semantics would be clear and > > > > > > well-defined > > > > > > (to you - not to me). The only thing is that of course not all > > > > > > subregs are > > > > > > "implemented" by a target (or can be, w/o spilling). > > > > > > > > > > Yeah. That's for TARGET_CAN_CHANGE_MODE_CLASS to decide. > > > > > But it only comes in to play during RA or when trying to take > > > > > the subreg of a particular hard register. Transitivity doesn't > > > > > matter so much for the hard register case since the result of > > > > > simplify_gen_subreg should then be another hard register. > > > > > > > > > > > Which means - we should adjust validate_subreg with another > > > > > > special-case > > > > > > or rather generalize the existing ones to an overall set that makes > > > > > > more > > > > > > sense? > > > > > > > > > > Maybe it's too radical, but I would whether we should just get rid of: > > > > > > > > > > /* ??? This should not be here. Temporarily continue to allow > > > > > word_mode > > > > > subregs of anything. The most common offender is (subreg:SI > > > > > (reg:DF)). > > > > > Generally, backends are doing something sketchy but it'll take > > > > > time to > > > > > fix them all. */ > > > > > if (omode == word_mode) > > > > > ; > > > > > /* ??? Similarly, e.g. with (subreg:DF (reg:TI)). Though > > > > > store_bit_field > > > > > is the culprit here, and not the backends. */ > > > > > else if (known_ge (osize, regsize) && known_ge (isize, osize)) > > > > > ; > > > > > /* Allow component subregs of complex and vector. Though given the > > > > > below > > > > > extraction rules, it's not always clear what that means. */ > > > > > else if ((COMPLEX_MODE_P (imode) || VECTOR_MODE_P (imode)) > > > > >&&
Re: [PATCH][pushed] Add -fprofile-reproducible=parallel-runs to STAGEfeedback_CFLAGS to Makefile.tpl.
On 8/31/21 01:03, Gleb Fotengauer-Malinovskiy wrote: On Mon, Aug 30, 2021 at 10:13:21AM +0200, Martin Liška wrote: On 8/27/21 19:45, Gleb Fotengauer-Malinovskiy wrote: Hi, On Thu, Mar 11, 2021 at 04:19:51PM +0100, Martin Liška wrote: Pushed as obvious, the original change was done in g:e05a117dc4b98f3ac60851608f532ba7cee7343a. Martin ChangeLog: * Makefile.tpl: The change was done Makefile.in which is generated file. --- Makefile.tpl | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile.tpl b/Makefile.tpl index 3b88f351d5b..6e0337fb48f 100644 --- a/Makefile.tpl +++ b/Makefile.tpl @@ -488,7 +488,7 @@ STAGEprofile_TFLAGS = $(STAGE2_TFLAGS) STAGEtrain_CFLAGS = $(filter-out -fchecking=1,$(STAGE3_CFLAGS)) STAGEtrain_TFLAGS = $(filter-out -fchecking=1,$(STAGE3_TFLAGS)) -STAGEfeedback_CFLAGS = $(STAGE4_CFLAGS) -fprofile-use +STAGEfeedback_CFLAGS = $(STAGE4_CFLAGS) -fprofile-use -fprofile-reproducible=parallel-runs Did you mean to add -fprofile-reproducible flag to STAGEprofile_CFLAGS instead? No. I suppose this flag doesn't mean anything without -fprofile-generate, which is passed through STAGEprofile_CFLAGS. The flag -fprofile-reproducible is used when we use a collected profile, so it works *only* with -fprofile-use flag. Thank you for the answer, now I see how it works. Good. I think that it means that the documentation is rather misleading about this flag: '-fprofile-reproducible=[multithreaded|parallel-runs|serial]' Control level of reproducibility of profile gathered by '-fprofile-generate'. [...] Well, feel free to send a patch that can improve the wording. Cheers, Martin It convinced me that this flag changes the way the code is instrumented for profile generation. The rest of it convinced me even more, actually. Thanks for heads up! Cheers, Martin Thanks,