[PATCH][PR65802] Mark ifn_va_arg with ECF_NOTHROW
Hi,

this patch fixes PR65802.

The problem described in PR65802 is that when compiling the test-case (included in the patch below) at -O0, the compiler runs into a gcc_assert ICE in redirect_eh_edge_1 during pass_cleanup_eh:
...
  gcc_assert (lookup_stmt_eh_lp (throw_stmt) == old_lp_nr);
...

In more detail, during compilation the ifn_va_arg is marked as a throwing function. That causes exception handling code to be generated, with exception handling edges:
...
;; basic block 2, loop depth 0, count 0, freq 0, maybe hot
;; prev block 0, next block 3, flags: (NEW, REACHABLE)
;; pred: ENTRY (FALLTHRU)
[LP 1] # .MEM_5 = VDEF <.MEM_4(D)>
# USE = anything
# CLB = anything
_6 = VA_ARG (&cD.2333, 0B);
;; succ: 7 (EH)
;; 3 (FALLTHRU)
...

After pass_lower_vaarg, the expansion of ifn_va_arg is spread over several basic blocks:
...
;; basic block 2, loop depth 0, count 0, freq 0, maybe hot
;;prev block 0, next block 11, flags: (NEW, REACHABLE)
;;pred: ENTRY (FALLTHRU)
;;succ: 11 [100.0%] (FALLTHRU)

;; basic block 11, loop depth 0, count 0, freq 0, maybe hot
;;prev block 2, next block 12, flags: (NEW)
;;pred: 2 [100.0%] (FALLTHRU)
# VUSE <.MEM_4(D)>
_22 = cD.2333.gp_offsetD.5;
if (_22 >= 48)
goto ();
else
goto ();
;;succ: 13 (TRUE_VALUE)
;;12 (FALSE_VALUE)

;; basic block 12, loop depth 0, count 0, freq 0, maybe hot
;;prev block 11, next block 13, flags: (NEW)
;;pred: 11 (FALSE_VALUE)
:
# VUSE <.MEM_4(D)>
_23 = cD.2333.reg_save_areaD.8;
# VUSE <.MEM_4(D)>
_24 = cD.2333.gp_offsetD.5;
_25 = (sizetype) _24;
addr.1_26 = _23 + _25;
# VUSE <.MEM_4(D)>
_27 = cD.2333.gp_offsetD.5;
_28 = _27 + 8;
# .MEM_29 = VDEF <.MEM_4(D)>
cD.2333.gp_offsetD.5 = _28;
goto ();
;;succ: 14 (FALLTHRU)

;; basic block 13, loop depth 0, count 0, freq 0, maybe hot
;;prev block 12, next block 14, flags: (NEW)
;;pred: 11 (TRUE_VALUE)
:
# VUSE <.MEM_4(D)>
_30 = cD.2333.overflow_arg_areaD.7;
addr.1_31 = _30;
_32 = _30 + 8;
# .MEM_33 = VDEF <.MEM_4(D)>
cD.2333.overflow_arg_areaD.7 = _32;
;;succ: 14 (FALLTHRU)

;; basic block 14, loop depth 0, count 0, freq 0, maybe hot
;;prev block 13, next block 15, flags: (NEW)
;;pred: 12 (FALLTHRU)
;;13 (FALLTHRU)
# .MEM_20 = PHI <.MEM_29(12), .MEM_33(13)>
# addr.1_21 = PHI
:
# VUSE <.MEM_20>
_6 = MEM[(intD.9 * * {ref-all})addr.1_21];
;;succ: 15 (FALLTHRU)

;; basic block 15, loop depth 0, count 0, freq 0, maybe hot
;;prev block 14, next block 3, flags: (NEW)
;;pred: 14 (FALLTHRU)
;;succ: 7 (EH)
;;3 (FALLTHRU)
...

And an ICE is triggered in redirect_eh_edge_1, because the code expects the last statement in a BB with an outgoing EH edge to be a throwing statement.

That's obviously not the case, since bb15 is empty. But also all the other statements in the expansion are non-throwing.

Looking at the representation before the ifn_va_arg, VA_ARG_EXPR is non-throwing (even with -fnon-call-exceptions).

And looking at the situation before the introduction of ifn_va_arg, the expansion of VA_ARG_EXPR also didn't contain any throwing statements.

This patch fixes the ICE by marking ifn_va_arg with ECF_NOTHROW.

Bootstrapped and reg-tested on x86_64.

OK for trunk?

Thanks,
- Tom

Mark ifn_va_arg with ECF_NOTHROW

2015-04-20 Tom de Vries

PR tree-optimization/65802
* internal-fn.def (VA_ARG): Add ECF_NOTHROW to flags.

* g++.dg/pr65802.C: New test.
---
 gcc/internal-fn.def            |  2 +-
 gcc/testsuite/g++.dg/pr65802.C | 29 +
 2 files changed, 30 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/pr65802.C

diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index f557c64..7e19313 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -62,4 +62,4 @@ DEF_INTERNAL_FN (ADD_OVERFLOW, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (SUB_OVERFLOW, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (MUL_OVERFLOW, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (TSAN_FUNC_EXIT, ECF_NOVOPS | ECF_LEAF | ECF_NOTHROW, NULL)
-DEF_INTERNAL_FN (VA_ARG, 0, NULL)
+DEF_INTERNAL_FN (VA_ARG, ECF_NOTHROW, NULL)
diff --git a/gcc/testsuite/g++.dg/pr65802.C b/gcc/testsuite/g++.dg/pr65802.C
new file mode 100644
index 000..26e5317
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr65802.C
@@ -0,0 +1,29 @@
+// { dg-do compile }
+// { dg-options "-O0" }
+
+typedef int tf ();
+
+struct S
+{
+  tf m_fn1;
+} a;
+
+void
+fn1 ()
+{
+  try
+    {
+      __builtin_va_list c;
+      {
+        int *d = __builtin_va_arg (c, int *);
+        int **e = &d;
+        __asm__("" : "=d"(e));
+        a.m_fn1 ();
+      }
+      a.m_fn1 ();
+    }
+  catch (...)
+    {
+
+    }
+}
--
1.9.1
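The effect of the one-line flag change can be sketched outside the compiler. This is a simplified model and not GCC code: the flag bit values, the `internal_fn_flags` table and the `call_may_throw` helper are hypothetical stand-ins for the idea that a per-function flag word decides whether a call is treated as throwing (and therefore gets an EH edge).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits; GCC's real ECF_* values are not reproduced here.  */
#define ECF_CONST   (1u << 0)
#define ECF_LEAF    (1u << 1)
#define ECF_NOTHROW (1u << 2)

/* Two VA_ARG entries model the table before and after the patch.  */
enum internal_fn
{
  IFN_ADD_OVERFLOW,
  IFN_VA_ARG_OLD,
  IFN_VA_ARG_NEW,
  IFN_LAST
};

static const unsigned internal_fn_flags[IFN_LAST] = {
  [IFN_ADD_OVERFLOW] = ECF_CONST | ECF_LEAF | ECF_NOTHROW,
  [IFN_VA_ARG_OLD] = 0,            /* No flags: treated as possibly throwing.  */
  [IFN_VA_ARG_NEW] = ECF_NOTHROW,  /* After the patch.  */
};

/* A call is considered throwing unless its flags say otherwise.  */
static bool
call_may_throw (enum internal_fn fn)
{
  assert (fn < IFN_LAST);
  return (internal_fn_flags[fn] & ECF_NOTHROW) == 0;
}
```

In this model only the old VA_ARG entry reports that it may throw, which corresponds to the EH edge that was left dangling on the empty bb15 after lowering.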
Re: Ping^3 : [PATCH] [gcc, combine] PR46164: Don't combine the insns if a volatile register is contained.
On Tue, Apr 21, 2015 at 11:03 AM, Segher Boessenkool wrote:
> On Tue, Apr 21, 2015 at 09:39:16AM +0800, Terry Guo wrote:
>> Is this one ok to trunk?
>
> Probably, if you send the patch + changelog entry :-)
>
> Did you fix the comment? REG_USERVAR_P and HARD_REGISTER_P can be
> set for more than just register asm.
>
>
> Segher

Sorry for missing the patch. I believe that I addressed your comment. Please review it again to make sure my understanding is correct. The patch is attached and here is the URL to it https://gcc.gnu.org/ml/gcc-patches/2015-02/msg01593.html.

The ChangeLog:

gcc/ChangeLog:

2015-04-21 Terry Guo

PR rtl-optimization/64818
* combine.c (can_combine_p): Don't combine if DEST is a user-specified register.

gcc/testsuite/ChangeLog:

2015-04-21 Terry Guo

PR rtl-optimization/64818
* gcc.target/arm/pr64818.c: New.
Re: [PATCH][PR65802] Mark ifn_va_arg with ECF_NOTHROW
On Tue, 21 Apr 2015, Tom de Vries wrote:
> This patch fixes the ICE by marking ifn_va_arg with ECF_NOTHROW.
>
> Bootstrapped and reg-tested on x86_64.
>
> OK for trunk?

Ok.

Thanks,
Richard.

--
Richard Biener
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Jennifer Guild, Dilip Upmanyu, Graham Norton HRB 21284 (AG Nuernberg)
Re: [PATCH][PR65802] Mark ifn_va_arg with ECF_NOTHROW
> Mark ifn_va_arg with ECF_NOTHROW

You can definitely make it ECF_LEAF too. I wonder if we can make it ECF_CONST or at least PURE; this would help to keep variadic functions const/pure, which may be moderately interesting in practice.

Honza
Re: [PATCH][PR65802] Mark ifn_va_arg with ECF_NOTHROW
On Tue, 21 Apr 2015, Jan Hubicka wrote:
> > Mark ifn_va_arg with ECF_NOTHROW
>
> You can definitely make it ECF_LEAF too. I wonder if we can make it ECF_CONST or at least PURE;
> this would help to keep variadic functions const/pure, which may be moderately interesting
> in practice.

Yes to ECF_LEAF, but it isn't const or pure, as it modifies the valist argument, so you can't, for example, DCE va_arg (...) if the result isn't needed.

Richard.
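Richard's point about va_arg not being const or pure can be illustrated with ordinary C. The sketch below is only an illustration with hypothetical function names: the first `va_arg` result is unused, yet the call still advances the `va_list`, so deleting it as dead code changes what the next `va_arg` reads.

```c
#include <assert.h>
#include <stdarg.h>

/* Returns the second variadic int.  The first va_arg result is unused,
   but the call still advances AP, so it must be kept.  */
static int
second_arg (int count, ...)
{
  va_list ap;
  int value;

  assert (count >= 2);
  va_start (ap, count);
  (void) va_arg (ap, int);   /* Result discarded; side effect on AP kept.  */
  value = va_arg (ap, int);
  va_end (ap);
  return value;
}

/* The same function with the "dead" va_arg removed, as a DCE pass would
   do if va_arg were const or pure.  It now reads the first int instead.  */
static int
second_arg_broken (int count, ...)
{
  va_list ap;
  int value;

  assert (count >= 1);
  va_start (ap, count);
  value = va_arg (ap, int);
  va_end (ap);
  return value;
}
```

Calling both with the arguments `(2, 10, 20)` gives 20 from the first and 10 from the second, which is why the internal function can be ECF_LEAF and ECF_NOTHROW but not ECF_CONST or ECF_PURE.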
Re: [PATCH][AArch64] Implement -m{cpu,tune,arch}=native using only /proc/cpuinfo
On 21/04/15 05:41, Kumar, Venkataramanan wrote:

Hi Kyrill,

In AMD Seattle board, I see that CPU implementer is 0x41 and CPU part is 0xd07. CPU variant is 1 but you don’t do anything with that. It matches with cortex-a57 and its features.

Thanks, that's a Cortex-A57.

I will try a bootstrap test as well.

Awesome. I'd like to have a --with-{arch,tune,cpu}=native configure option at some point in the future but I'm not sure at the moment how that would be done without some refactoring.

Kyrill

Regards,
Venkat.

-----Original Message-----
From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-ow...@gcc.gnu.org] On Behalf Of Kyrill Tkachov
Sent: Monday, April 20, 2015 9:18 PM
To: GCC Patches
Cc: Marcus Shawcroft; Richard Earnshaw; James Greenhalgh; Evandro Menezes; Andrew Pinski; James Greenhalgh
Subject: [PATCH][AArch64] Implement -m{cpu,tune,arch}=native using only /proc/cpuinfo

Hi all,

This is an attempt to add native CPU detection to AArch64 GNU/Linux targets. Similar to other ports we use SPEC rewriting to rewrite -m{cpu,tune,arch}=native options into the appropriate CPU/architecture and the architecture extension options when appropriate (i.e. +crypto/+crc etc).

For CPU/architecture detection it gets a bit involved, especially when running on a big.LITTLE system. My proposed approach is to look at /proc/cpuinfo and search for the implementer id and part number fields that uniquely identify each core (appropriate identifying information is added to aarch64-cores.def). If we find two types of core we have a big.LITTLE system, so search through the core definitions extracted from aarch64-cores.def to find if we support such a combination (currently only cortex-a57.cortex-a53 and cortex-a72.cortex-a53) and make sure that the implementer id field matches up.

I tested this on a 4xCortex-A53 + 2xCortex-A57 big.LITTLE Ubuntu GNU/Linux system. There are two formats for /proc/cpuinfo that I'm aware of.
The first (old) one has the format: -- processor: 0 processor: 1 processor: 2 processor: 3 processor: 4 processor: 5 Features: fp asimd evtstrm aes pmull sha1 sha2 crc32 CPU implementer: 0x41 CPU architecture: AArch64 CPU variant: 0x0 CPU part: 0xd03 -- In this format it lists the 6 cores but the CPU part it reports is only the one for the core from which /proc/cpuinfo was read from (!), in this case one of the Cortex-A53 cores. This means we detect a different CPU depending on which core GCC was invoked on. Not ideal really, but there's no more information that we can extract. Given the /proc/cpuinfo above, this patch will rewrite -mcpu=native into -mcpu=cortex-a53+fp+simd+crypto+crc The newer /proc/cpuinfo format proposed at https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=44b82b7700d05a52cd983799d3ecde1a976b3bed looks like this: -- processor : 0 Features: fp asimd evtstrm aes pmull sha1 sha2 crc32 CPU implementer : 0x41 CPU architecture: 8 CPU variant : 0x0 CPU part: 0xd03 CPU revision: 0 processor : 1 Features: fp asimd evtstrm aes pmull sha1 sha2 crc32 CPU implementer : 0x41 CPU architecture: 8 CPU variant : 0x0 CPU part: 0xd03 CPU revision: 0 processor : 2 Features: fp asimd evtstrm aes pmull sha1 sha2 crc32 CPU implementer : 0x41 CPU architecture: 8 CPU variant : 0x0 CPU part: 0xd03 CPU revision: 0 processor : 3 Features: fp asimd evtstrm aes pmull sha1 sha2 crc32 CPU implementer : 0x41 CPU architecture: 8 CPU variant : 0x0 CPU part: 0xd03 CPU revision: 0 processor : 4 Features: fp asimd evtstrm aes pmull sha1 sha2 crc32 CPU implementer : 0x41 CPU architecture: 8 CPU variant : 0x0 CPU part: 0xd07 CPU revision: 0 processor : 5 Features: fp asimd evtstrm aes pmull sha1 sha2 crc32 CPU implementer : 0x41 CPU architecture: 8 CPU variant : 0x0 CPU part: 0xd07 CPU revision: 0 -- The Features field is used to detect the architectural features that we map to GCC option extensions i.e. +fp,+crypto,+simd,+crc etc. 
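The part-number scan described above can be sketched on an in-memory copy of the file. This is a minimal illustration and not the code from the patch: `collect_cpu_parts`, `core_name` and the `known_cores` table are hypothetical names, and only the two part numbers mentioned in this thread (0xd03 for Cortex-A53, 0xd07 for Cortex-A57) are listed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Part numbers taken from the thread; the table layout is hypothetical.  */
struct core_id
{
  unsigned part;
  const char *name;
};

static const struct core_id known_cores[] = {
  { 0xd03, "cortex-a53" },
  { 0xd07, "cortex-a57" },
};

/* Map a part number to a core name, or NULL if it is not recognised.  */
static const char *
core_name (unsigned part)
{
  for (size_t i = 0; i < sizeof known_cores / sizeof known_cores[0]; i++)
    if (known_cores[i].part == part)
      return known_cores[i].name;
  return NULL;
}

/* Collect the distinct "CPU part" values found in TEXT into PARTS.
   Returns the number found, or 0 if none were found or if more than
   MAX distinct values appear (the caller should then give up).  */
static size_t
collect_cpu_parts (const char *text, unsigned *parts, size_t max)
{
  size_t n = 0;
  const char *p = text;

  assert (text != NULL && parts != NULL);
  while ((p = strstr (p, "CPU part")) != NULL)
    {
      const char *colon = strchr (p, ':');
      if (colon == NULL)
        break;

      /* strtoul with base 16 skips blanks and accepts a "0x" prefix.  */
      unsigned part = (unsigned) strtoul (colon + 1, NULL, 16);
      size_t i;
      for (i = 0; i < n; i++)
        if (parts[i] == part)
          break;
      if (i == n)
        {
          if (n == max)
            return 0;
          parts[n++] = part;
        }
      p = colon + 1;
    }
  return n;
}
```

With the newer format, a six-core listing like the one above yields two distinct parts, which is the signal for a big.LITTLE system; with the older format only one part is ever seen.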
Similarly, -march=native would be rewritten into -march=armv8-a+fp+simd+crypto+crc while -mtune=native into -mtune=cortex-a57.cortex-a53 (the arch extension options are not valid for -mtune). If it detects more than one implementer ID or the implementer IDs not matching up somewhere or some other weirdness in /proc/cpuinfo or fails to recognise the CPU it will bail out and ignore the option entirely (similarly to other ports). The patch works fine with both /proc/cpuinfo formats although, as mentioned above, it will not be able to detect the big.LITTLE combination from the first format. I've filled in the implemente
Re: [PATCH][AArch64] Increase static buffer size in aarch64_rewrite_selected_cpu
On 20/04/15 21:30, James Greenhalgh wrote:

On Mon, Apr 20, 2015 at 05:24:39PM +0100, Kyrill Tkachov wrote:

Hi all,

When trying to compile a testcase with -mcpu=cortex-a57+crypto+nocrc I got the weird assembler error:

Assembler messages:
Error: missing architectural extension
Error: unrecognized option -mcpu=cortex-a57+crypto+no

The problem is the aarch64_rewrite_selected_cpu that is used to rewrite -mcpu for big.LITTLE options has a limit of 20 characters in what it handles, which we can exhaust quickly if we specify architectural extensions in a fine-grained manner. This patch increases that character limit to 128 and adds an assert to confirm that no bad things happen.

You've implemented this as a hard ICE, was that intended?

Yeah, the idea is that before this we would silently truncate i.e. do the wrong thing. Now, if we exceed the limit we ICE. I don't think it should be a user error because it's not really the user's fault that the compiler doesn't handle crazy long strings but handling arbitrary large strings would make this function more complex than I think is needed for the majority of cases. If you plan to rewrite this in the future, we can revisit that.

It also fixes another problem: If we pass a big.LITTLE combination with feature modifiers like: -mcpu=cortex-a57.cortex-a53+nosimd the code will truncate everything after '.', thus destroying the extensions that we want to pass. The patch adds code to stitch the extensions back on after the LITTLE cpu is removed.

UGH, I should not be allowed near strings! This code is on my list of things I'd love to rewrite this year! For now, this is OK and please also queue it for 5.2 when that opens for patches.

Committed to trunk with r58. Thanks for looking at it,
Kyrill

Ok for trunk?

Yes, thanks. And sorry again for introducing this in the first place.

James
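The two fixes discussed in this thread (a larger fixed buffer guarded by an assert, and stitching the extensions back on after the LITTLE core is dropped) can be sketched as follows. This is an illustration under assumptions, not the committed code: `rewrite_selected_cpu` and `REWRITE_BUF_SIZE` are hypothetical names, and only the 128-character limit comes from the thread.

```c
#include <assert.h>
#include <string.h>

#define REWRITE_BUF_SIZE 128   /* Limit mentioned in the thread.  */

/* Rewrite "big.little+ext" into "big+ext" in a static buffer.
   Hypothetical helper; the real function's signature may differ.  */
static const char *
rewrite_selected_cpu (const char *name)
{
  static char out[REWRITE_BUF_SIZE];
  const char *dot = strchr (name, '.');
  const char *ext = strchr (name, '+');
  size_t big_len, ext_len;

  /* A '.' after the first '+' is not a big.LITTLE separator.  */
  if (dot != NULL && ext != NULL && dot > ext)
    dot = NULL;

  /* The big core name ends at the '.', else at the first '+',
     else at the end of the string.  */
  big_len = dot ? (size_t) (dot - name)
            : ext ? (size_t) (ext - name)
            : strlen (name);
  ext_len = ext ? strlen (ext) : 0;

  /* Fail loudly instead of silently truncating.  */
  assert (big_len + ext_len < REWRITE_BUF_SIZE);

  memcpy (out, name, big_len);
  memcpy (out + big_len, ext ? ext : "", ext_len);
  out[big_len + ext_len] = '\0';
  return out;
}
```

The assert mirrors the design choice argued for above: an over-long option string aborts rather than producing a truncated `-mcpu` value that the assembler later rejects with a confusing message.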
Re: [PATCH] testsuite gcc.target/i386/ avx512*
Hello Andreas, On 19 Apr 21:56, Andreas Tobler wrote: > Done so and tested on FreeBSD amd64-unknown-freebsd11.0 and CentOS7.1. > > Ok for trunk? The patch is OK for trunk and for gcc-5 branch (when it is open). Thanks for fixing this! -- K
Re: [PATCH][expr.c] PR 65358 Avoid clobbering partial argument during sibcall
On 20/04/15 19:02, Jeff Law wrote: On 04/20/2015 02:25 AM, Kyrill Tkachov wrote: Hi Jeff, Hmmm, so what happens if the difference is < 0? I'd be a bit worried about that case for the PA (for example). So how about asserting that the INTVAL is >= 0 prior to returning so that we catch that case if it ever occurs? INTVAL being >= 0 is the case that I want to catch with this function. INTVAL <0 is the usual case on leaf call optimisation. On arm, at least, it means that x and y use the same base register (i.e. same stack frame) but the offsets are such that reading SIZE bytes from X will not overlap with Y, thus not requiring the workaround in this patch. Thus, asserting that the result is positive is not right here. What characteristic on pa makes this problematic? Is it the STACK_GROWS_UPWARD? Yea or more correctly that {STACK,FRAME}_GROWS_UPWARD and ARGS_GROW_DOWNWARD. I think the stormy16 may have downward growing args too. Should I then extend this function to do something like: HOST_WIDE_INT res = INTVAL (sub); #ifndef STACK_GROWS_DOWNWARD res = -res; #endif return res? It certainly feels like something is needed for targets where growth is in the opposite direction -- but my guess is that without a concrete case that triggers on those targets (just the PA in 64 bit mode and stormy?) we'll probably get it wrong in one way or another. Hence my suggestion that we assert rather than try to handle it and silently generate incorrect code in the process. However, this function is expected to return negative numbers when there is no overlap i.e. in the vast majority of cases when this bug doesn't manifest. So asserting that it's positive is just going to ICE at -O2 in almost any code. From reading config/stormy16/stormy-abi it seems to me that we don't pass arguments partially in stormy16, so this code would never be called there. That leaves pa as the potential problematic target. I don't suppose there's an easy way to test on pa? 
My checkout of binutils doesn't seem to include a sim target for it. Kyrill Jeff
[PATCH][AArch64] Add zero_extend variants of logical+not ops
Hi all,

We were missing the patterns for the zero-extend versions of the negated-logic ops, bic,orn,eon leading to redundant zero-extends being generated for code like:

unsigned long
bar (unsigned int a, unsigned int b)
{
  return a ^ ~b;
}

unsigned long
bar2 (unsigned int a, unsigned int b)
{
  return a & ~b;
}

With this patch for the above we can generate:

bar:
        eon     w0, w1, w0
        ret
bar2:
        bic     w0, w0, w1
        ret

instead of:

bar:
        eon     w0, w1, w0
        uxtw    x0, w0
        ret
bar2:
        bic     w0, w0, w1
        uxtw    x0, w0
        ret

Bootstrapped and tested on aarch64-linux.

Ok for trunk?

Thanks,
Kyrill

2015-04-21 Kyrylo Tkachov

* config/aarch64/aarch64.md (*_one_cmplsidi3_ze): New pattern.
(*xor_one_cmplsidi3_ze): Likewise.

commit 8ff76787ce2674b918e1e6ed8b09cafb6b7a
Author: Kyrylo Tkachov
Date: Mon Mar 2 16:20:10 2015 +

[AArch64] Add zero_extend variants of logical+not ops

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 4aa8f5c..1a7f888 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -3058,6 +3058,26 @@ (define_insn "*_one_cmpl3"
 (set_attr "simd" "*,yes")]
 )

+(define_insn "*_one_cmplsidi3_ze"
+ [(set (match_operand:DI 0 "register_operand" "=r")
+ (zero_extend:DI
+ (NLOGICAL:SI (not:SI (match_operand:SI 1 "register_operand" "r"))
+ (match_operand:SI 2 "register_operand" "r"]
+ ""
+ "\\t%w0, %w2, %w1"
+ [(set_attr "type" "logic_reg")]
+)
+
+(define_insn "*xor_one_cmplsidi3_ze"
+ [(set (match_operand:DI 0 "register_operand" "=r")
+(zero_extend:DI
+ (not:SI (xor:SI (match_operand:SI 1 "register_operand" "r")
+ (match_operand:SI 2 "register_operand" "r")]
+ ""
+ "eon\\t%w0, %w1, %w2"
+ [(set_attr "type" "logic_reg")]
+)
+
 ;; (xor (not a) b) is simplify_rtx-ed down to (not (xor a b)).
 ;; eon does not operate on SIMD registers so the vector variant must be split.
 (define_insn_and_split "*xor_one_cmpl3"
[C++ Patch, committed] PR 65801
Hi, I committed the below to trunk as approved by Jason on the audit trail. Will go in gcc-5 branch too for 5.2. Tested x86_64-linux. Thanks, Paolo. // /cp 2015-04-20 Paolo Carlini PR c++/65801 * typeck2.c (check_narrowing): In C++11 mode too, -Wno-narrowing suppresses the diagnostic. 2015-04-20 Paolo Carlini PR c++/65801 * doc/invoke.texi ([-Wnarrowing]): Update. /testsuite 2015-04-20 Paolo Carlini PR c++/65801 * g++.dg/cpp0x/Wnarrowing2.C: New. Index: cp/typeck2.c === --- cp/typeck2.c(revision 40) +++ cp/typeck2.c(working copy) @@ -957,9 +957,13 @@ check_narrowing (tree type, tree init, tsubst_flag } } else if (complain & tf_error) - error_at (EXPR_LOC_OR_LOC (init, input_location), - "narrowing conversion of %qE from %qT to %qT inside { }", - init, ftype, type); + { + global_dc->pedantic_errors = 1; + pedwarn (EXPR_LOC_OR_LOC (init, input_location), OPT_Wnarrowing, + "narrowing conversion of %qE from %qT to %qT inside { }", + init, ftype, type); + global_dc->pedantic_errors = flag_pedantic_errors; + } } return cxx_dialect == cxx98 || ok; Index: doc/invoke.texi === --- doc/invoke.texi (revision 40) +++ doc/invoke.texi (working copy) @@ -2706,10 +2706,10 @@ int i = @{ 2.2 @}; // error: narrowing from double This flag is included in @option{-Wall} and @option{-Wc++11-compat}. -With @option{-std=c++11}, @option{-Wno-narrowing} suppresses for -non-constants the diagnostic required by the standard. Note that this -does not affect the meaning of well-formed code; narrowing conversions -are still considered ill-formed in SFINAE context. +With @option{-std=c++11}, @option{-Wno-narrowing} suppresses the diagnostic +required by the standard. Note that this does not affect the meaning +of well-formed code; narrowing conversions are still considered +ill-formed in SFINAE context. 
@item -Wnoexcept @r{(C++ and Objective-C++ only)} @opindex Wnoexcept Index: testsuite/g++.dg/cpp0x/Wnarrowing2.C === --- testsuite/g++.dg/cpp0x/Wnarrowing2.C(revision 0) +++ testsuite/g++.dg/cpp0x/Wnarrowing2.C(working copy) @@ -0,0 +1,5 @@ +// PR c++/65801 +// { dg-do compile { target c++11 } } +// { dg-options "-Wno-narrowing" } + +static struct zai { unsigned int x; } x = {-1};
[PATCH][ARM][stage-1] Initialise cost to COSTS_N_INSNS (1) and increment in arm rtx costs
Hi all, This is the first of a series to clean up and simplify the arm rtx costs function. This patch initialises the cost to COSTS_N_INSNS (1) at the top and increments it when appropriate in the rest of the function. This makes it more similar to the aarch64 rtx costs function and saves us the trouble of having to remember to initialise the cost to COSTS_N_INSNS (1) in each case of the switch statement. Bootstrapped and tested arm-none-linux-gnueabihf. Compiled some large programs with no codegen difference, except some DIV synthesis algorithms were changed, presumably due to the cost of SDIV/UDIV, which is now being correctly calculated (before it was missing the baseline COSTS_N_INSNS (1)). Ok for trunk? Thanks, Kyrill 2015-04-21 Kyrylo Tkachov * config/arm/arm.c (arm_new_rtx_costs): Initialise cost to COSTS_N_INSNS (1) and increment it appropriately throughout the function. commit 8c4d923b6a2fc902a1a195e2e8c5f934e571d8dd Author: Kyrylo Tkachov Date: Thu Apr 2 11:44:54 2015 +0100 [ARM] Initialise rtx cost to COSTS_N_INSNS (1) once at the beginning diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 4dfe4a7..00da2b7 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -9704,6 +9704,8 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code, { machine_mode mode = GET_MODE (x); + *cost = COSTS_N_INSNS (1); + if (TARGET_THUMB1) { if (speed_p) @@ -9804,8 +9806,6 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code, bool is_ldm = load_multiple_operation (x, SImode); bool is_stm = store_multiple_operation (x, SImode); - *cost = COSTS_N_INSNS (1); - if (is_ldm || is_stm) { if (speed_p) @@ -9832,10 +9832,10 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code, case UDIV: if (TARGET_HARD_FLOAT && GET_MODE_CLASS (mode) == MODE_FLOAT && (mode == SFmode || !TARGET_VFP_SINGLE)) - *cost = COSTS_N_INSNS (speed_p - ? extra_cost->fp[mode != SFmode].div : 1); + *cost += COSTS_N_INSNS (speed_p + ? 
extra_cost->fp[mode != SFmode].div : 0); else if (mode == SImode && TARGET_IDIV) - *cost = COSTS_N_INSNS (speed_p ? extra_cost->mult[0].idiv : 1); + *cost += COSTS_N_INSNS (speed_p ? extra_cost->mult[0].idiv : 0); else *cost = LIBCALL_COST (2); return false; /* All arguments must be in registers. */ @@ -9848,7 +9848,7 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code, case ROTATE: if (mode == SImode && REG_P (XEXP (x, 1))) { - *cost = (COSTS_N_INSNS (2) + *cost += (COSTS_N_INSNS (1) + rtx_cost (XEXP (x, 0), code, 0, speed_p)); if (speed_p) *cost += extra_cost->alu.shift_reg; @@ -9861,7 +9861,7 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code, case ASHIFTRT: if (mode == DImode && CONST_INT_P (XEXP (x, 1))) { - *cost = (COSTS_N_INSNS (3) + *cost += (COSTS_N_INSNS (2) + rtx_cost (XEXP (x, 0), code, 0, speed_p)); if (speed_p) *cost += 2 * extra_cost->alu.shift; @@ -9869,8 +9869,7 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code, } else if (mode == SImode) { - *cost = (COSTS_N_INSNS (1) - + rtx_cost (XEXP (x, 0), code, 0, speed_p)); + *cost += rtx_cost (XEXP (x, 0), code, 0, speed_p); /* Slightly disparage register shifts at -Os, but not by much. */ if (!CONST_INT_P (XEXP (x, 1))) *cost += (speed_p ? extra_cost->alu.shift_reg : 1 @@ -9882,8 +9881,7 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code, { if (code == ASHIFT) { - *cost = (COSTS_N_INSNS (1) - + rtx_cost (XEXP (x, 0), code, 0, speed_p)); + *cost += rtx_cost (XEXP (x, 0), code, 0, speed_p); /* Slightly disparage register shifts at -Os, but not by much. */ if (!CONST_INT_P (XEXP (x, 1))) @@ -9895,14 +9893,13 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code, if (arm_arch_thumb2 && CONST_INT_P (XEXP (x, 1))) { /* Can use SBFX/UBFX. 
*/ - *cost = COSTS_N_INSNS (1); if (speed_p) *cost += extra_cost->alu.bfx; *cost += rtx_cost (XEXP (x, 0), code, 0, speed_p); } else { - *cost = COSTS_N_INSNS (2); + *cost += COSTS_N_INSNS (1); *cost += rtx_cost (XEXP (x, 0), code, 0, speed_p); if (speed_p) { @@ -9919,7 +9916,7 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code, } else /* Rotates. */ { - *cost = COSTS_N_INSNS (3 + !CONST_INT_P (XEXP (x, 1))); + *cost += COSTS_N_INSNS (2 + !CONST_INT_P (XEXP (x, 1))); *cost += rtx_cost (XEXP (x, 0), code, 0, speed_p); if (speed_p) { @@ -9943,7 +9940,6 @@ arm_new_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer_code, { if (mode == SImode) { -
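The refactoring pattern in this patch (set the one-instruction baseline once, then have each case add only its increment) can be sketched in a standalone form. The `enum op` codes and `insn_cost` are hypothetical and far simpler than arm_new_rtx_costs; the `COSTS_N_INSNS` definition is written from memory of GCC's macro and should be treated as an assumption. The two increments match the ROTATE-by-register and DImode constant-shift hunks in the diff above.

```c
#include <assert.h>

/* Assumed to have the same shape as GCC's macro.  */
#define COSTS_N_INSNS(n) ((n) * 4)

/* Hypothetical operation codes for the sketch.  */
enum op
{
  OP_SIMPLE_ALU,
  OP_ROTATE_BY_REG,
  OP_DI_SHIFT_BY_CONST,
  OP_LAST
};

static int
insn_cost (enum op code)
{
  int cost = COSTS_N_INSNS (1);   /* Baseline, set once at the top.  */

  assert (code < OP_LAST);
  switch (code)
    {
    case OP_SIMPLE_ALU:
      break;                      /* One instruction; nothing to add.  */
    case OP_ROTATE_BY_REG:
      cost += COSTS_N_INSNS (1);  /* Was "cost = COSTS_N_INSNS (2)".  */
      break;
    case OP_DI_SHIFT_BY_CONST:
      cost += COSTS_N_INSNS (2);  /* Was "cost = COSTS_N_INSNS (3)".  */
      break;
    default:
      break;
    }
  return cost;
}
```

Because the baseline is always present, a case that forgets to mention it (as SDIV/UDIV did before the patch) can no longer end up one instruction too cheap.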
Re: [PATCH][combine] Do not call rtx costs on potentially unrecognisable rtxes in combine
On 20/04/15 19:51, Jeff Law wrote:

On 04/20/2015 08:04 AM, Kyrill Tkachov wrote:

Hi all,

I'm trying to reduce the cases where the midend calls the backend rtx costs on bogus rtl for which the backend doesn't have patterns or ways of handling. Having to handle these kinds of rtxes sanely bloats those functions and makes them harder to maintain.

One of the cases where this occurs is in combine and distribute_and_simplify_rtx in particular. Citing the comment at that function:

" See if X is of the form (* (+ A B) C), and if so convert to (+ (* A C) (* B C)) and try to simplify. Most of the time, this results in no change. However, if some of the operands are the same or inverses of each other, simplifications will result."

The problem is that after it applies the distributive law it calls rtx costs to figure out whether the rtx became simpler. This rtx can get pretty complex. For example, on arm I've seen it try to cost:

(plus:SI (mult:SI (plus:SI (reg:SI 232 [ m1 ])
                           (const_int 1 [0x1]))
                  (reg:SI 232 [ m1 ]))
         (plus:SI (reg:SI 232 [ m1 ])
                  (const_int 1 [0x1])))

which is never going to match anything on arm anyway, so why should the costs function handle it?

In any case, I believe combine's design is such that it should first be attempting to call recog and split on the rtxes, and only if that succeeds should it be making a target-specific decision on which rtx to prefer. distribute_and_simplify_rtx goes against that by calling rtx costs on an unverified rtx in an attempt to gauge its complexity.

This patch remedies that by removing the call to rtx costs and instead manually performing a relatively simple check on whether the resultant rtx was simplified. That is, using the example from the comment, whether (+ (* A C) (* B C)) still has + at the top and * in the two operands. This should give a good indication on whether any meaningful simplification was made (The '+' and '*' operators in the example can be any operators that can be distributed over).
Initially, I wanted to just return the distributed version and let recog reject the invalid rtxes but that caused some code quality regressions on arm where the original rtx would not recog but would match a beneficial splitter, whereas the distributed rtx would not.

With this patch I saw almost no codegen differences on arm for the whole of SPEC2006. The one exception was 416.gamess where it managed to merge a mul and an add into an mla which resulted in a slightly better code sequence. That was in a pretty large file and I don't speak Fortran'ese, so I couldn't really reduce a testcase for it, but my guess is that before the patch the costs would return some essentially random value for an arbitrarily complex rtx that it was passed, which changed the decision in distribute_and_simplify_rtx on whether to return the distributed rtx, which could have impacted further optimisations in combine.

I tried it on x86_64 as well. Again, there were almost no codegen differences. The exceptions were tonto and wrf, where a few instructions were eliminated, but with no significant difference. The resultant binaries for these two were a tiny bit smaller, with no impact on runtime.

Therefore I claim that this is a safe thing to do, as it leaves the target-specific rtx cost judgements in combine to be made only on valid recog-ed rtxes, rather than having them cancel optimisations early due to rtx costs not handling arbitrary rtxes well.

Bootstrapped on arm, x86_64, aarch64 (all linux). Tested on arm, aarch64.

Ok for trunk?

Thanks,
Kyrill

2015-04-20 Kyrylo Tkachov

* combine.c (distribute_and_simplify_rtx): Do not check rtx costs. Look at the rtx codes to see if a simplification occurred.

OK. Thanks

Though I do wonder if, in practice, we can identify those cases that do simplify more directly a priori and just punt everything else rather than this rather convoluted approach.

You mean like calling simplify_binary_operation that returns NULL if no simplification is possible?

Kyrill

jeff
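The structural check described in the patch (does the result still have + at the top and * in both operands?) can be sketched on a toy expression tree. This is only an illustration under stated assumptions: `Expr`, `Code`, `leaf`, `node` and `still_distributed` are hypothetical names invented here, not GCC's rtx representation or anything in combine.c, and the real patch may differ in detail.

```cpp
#include <memory>
#include <utility>

// Toy stand-in for rtx codes (hypothetical, not GCC's enum rtx_code).
enum class Code { Reg, Plus, Mult };

struct Expr {
  Code code;
  std::shared_ptr<Expr> op0, op1;  // both null for leaves
};

using ExprPtr = std::shared_ptr<Expr>;

// Build a leaf (a register in this toy model).
ExprPtr leaf() {
  return std::make_shared<Expr>(Expr{Code::Reg, nullptr, nullptr});
}

// Build a binary node with the given code and operands.
ExprPtr node(Code c, ExprPtr a, ExprPtr b) {
  return std::make_shared<Expr>(Expr{c, std::move(a), std::move(b)});
}

// True when X still has the shape (outer (inner ...) (inner ...)),
// i.e. applying the distributive law did not lead to any simplification.
bool still_distributed(const Expr &x, Code outer, Code inner) {
  return x.code == outer && x.op0 && x.op1
         && x.op0->code == inner && x.op1->code == inner;
}
```

A caller in this model would keep the original expression whenever `still_distributed` returns true, and only hand the distributed form on when the shape changed.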
Re: [PATCH][AArch64] Add zero_extend variants of logical+not ops
On 21/04/15 09:44, Kyrill Tkachov wrote:
> Hi all,
>
> We were missing the patterns for the zero-extend versions of the
> negated-logic ops, bic,orn,eon
> leading to redundant zero-extends being generated for code like:
>
> unsigned long
> bar (unsigned int a, unsigned int b)
> {
>   return a ^ ~b;
> }
>
> unsigned long
> bar2 (unsigned int a, unsigned int b)
> {
>   return a & ~b;
> }
>
>
> With this patch for the above we can generate:
> bar:
>         eon     w0, w1, w0
>         ret
>
> bar2:
>         bic     w0, w0, w1
>         ret
>
>
> instead of:
> bar:
>         eon     w0, w1, w0
>         uxtw    x0, w0
>         ret
>
> bar2:
>         bic     w0, w0, w1
>         uxtw    x0, w0
>         ret
>
>
> Bootstrapped and tested on aarch64-linux.
> Ok for trunk?
>
> Thanks,
> Kyrill
>
> 2015-04-21 Kyrylo Tkachov
>
> * config/aarch64/aarch64.md (*_one_cmplsidi3_ze):
> New pattern.
> (*xor_one_cmplsidi3_ze): Likewise.
>
> aarch64-ze-logic.patch
>
>
> commit 8ff76787ce2674b918e1e6ed8b09cafb6b7a
> Author: Kyrylo Tkachov
> Date: Mon Mar 2 16:20:10 2015 +
>
> [AArch64] Add zero_extend variants of logical+not ops
>
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 4aa8f5c..1a7f888 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -3058,6 +3058,26 @@ (define_insn "*_one_cmpl3"
> (set_attr "simd" "*,yes")]
> )
>
> +(define_insn "*_one_cmplsidi3_ze"
> + [(set (match_operand:DI 0 "register_operand" "=r")
> + (zero_extend:DI
> + (NLOGICAL:SI (not:SI (match_operand:SI 1 "register_operand" "r"))
> + (match_operand:SI 2 "register_operand" "r"))))]
> + ""
> + "\\t%w0, %w2, %w1"
> + [(set_attr "type" "logic_reg")]
> +)
> +
> +(define_insn "*xor_one_cmplsidi3_ze"
> + [(set (match_operand:DI 0 "register_operand" "=r")
> + (zero_extend:DI
> + (not:SI (xor:SI (match_operand:SI 1 "register_operand" "r")
> + (match_operand:SI 2 "register_operand" "r")))))]
> + ""
> + "eon\\t%w0, %w1, %w2"
> + [(set_attr "type" "logic_reg")]
> +)
> +

I would have thought combine ought to know how to canonicalize this last case into the form
supported above. That helps if one of the operands is a constant, since then you can eliminate the NOT entirely. Anyway, that's probably best held for a follow-up. OK. R.
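The arithmetic behind the two forms discussed here can be checked on any host. The sketch below restates the two functions from the patch description with fixed-width types (an assumption made for portability; the original uses `unsigned int` and `unsigned long`) and adds `eon_canonical`, a hypothetical helper showing that `a ^ ~b` and `~(a ^ b)` are the same value. It says nothing about which instructions a given compiler emits.

```cpp
#include <cstdint>

// 32-bit results widened to 64 bits. Because the operands are unsigned,
// the widening is a zero-extension, so the upper 32 bits are always 0.
std::uint64_t bar(std::uint32_t a, std::uint32_t b) { return a ^ ~b; }
std::uint64_t bar2(std::uint32_t a, std::uint32_t b) { return a & ~b; }

// The other way to write the EON operation: NOT applied to the XOR.
std::uint64_t eon_canonical(std::uint32_t a, std::uint32_t b) {
  return ~(a ^ b);
}
```

Since a 32-bit operation already leaves the upper half of the 64-bit result clear, an explicit extension after it is redundant, which is the point of the new patterns.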
[wwwdocs] PATCH for Re: GCC Plugin Announcement; CTraps - Lightweight dynamic analysis for concurrent code
Hi Brandon, On Wed, 23 Jan 2013, Brandon Lucia wrote: > I have implemented a GCC plugin that I have found useful for doing > dynamic program analysis, debugging, and performance tuning in > concurrent code. > > The plugin is called CTraps, short for Communication Traps. The main > idea behind CTraps is that a compiler pass implemented as a GCC plugin > instruments instructions that access memory locations that might be > shared between threads. The instrumentation inserts a function call > before such accesses. I added this to our extensions page at https://gcc.gnu.org/extensions.html per the patch below. If you have further updates or changes, just advise. Gerald PS: The README file on github felt a bit confusing/not as clear as your e-mail here. Index: extensions.html === RCS file: /cvs/gcc/wwwdocs/htdocs/extensions.html,v retrieving revision 1.54 diff -u -r1.54 extensions.html --- extensions.html 20 Apr 2015 22:52:58 - 1.54 +++ extensions.html 21 Apr 2015 10:10:38 - @@ -12,6 +12,14 @@ tree. Please direct feedback and bug reports to their respective maintainers, not our mailing lists. +https://github.com/blucia0a/CTraps-gcc";>CTraps plugin for GCC + +CTraps, short for Communication Traps, adds a compiler pass as +a plugin that instruments instructions that access memory locations +that might be shared between threads. It supports dynamic program +analysis, debugging, and performance tuning in concurrent code. + + http://gcc-melt.org";>GCC MELT MELT is a high-level domain specific language to ease the
RE: [PATCH, ping1] Fix removing of df problem in df_finish_pass
Committed. I'll wait a week and then ask for approval for a backport to 5.1.1 once 5.1 is released. Best regards, Thomas > -Original Message- > From: Kenneth Zadeck [mailto:zad...@naturalbridge.com] > Sent: Monday, April 20, 2015 9:26 PM > To: Thomas Preud'homme; 'Bernhard Reutner-Fischer'; gcc- > patc...@gcc.gnu.org; 'Paolo Bonzini'; 'Seongbae Park' > Subject: Re: [PATCH, ping1] Fix removing of df problem in df_finish_pass > > As a dataflow maintainer, I approve this patch for the next release. > However, you will have to get approval of a release manager to get it > into 5.0. > > > > On 04/20/2015 04:22 AM, Thomas Preud'homme wrote: > > Ping? > > > >> -Original Message- > >> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > >> ow...@gcc.gnu.org] On Behalf Of Thomas Preud'homme > >> Sent: Tuesday, March 03, 2015 12:02 PM > >> To: 'Bernhard Reutner-Fischer'; gcc-patches@gcc.gnu.org; 'Paolo > Bonzini'; > >> 'Seongbae Park'; 'Kenneth Zadeck' > >> Subject: RE: [PATCH] Fix removing of df problem in df_finish_pass > >> > >>> From: Bernhard Reutner-Fischer [mailto:rep.dot@gmail.com] > >>> Sent: Saturday, February 28, 2015 4:00 AM > use df_remove_problem rather than manually removing > problems, > >>> living > >>> > >>> leaving > >> Indeed. Please find updated changelog below: > >> > >> 2015-03-03 Thomas Preud'homme > > >> > >>* df-core.c (df_finish_pass): Iterate over df- > >>> problems_by_index[] and > >>use df_remove_problem rather than manually removing > >> problems, leaving > >>holes in df->problems_in_order[]. > >> > >> Best regards, > >> > >> Thomas > >> > >> > >> > >> > > > >
[patch] Document libstdc++ dual ABI
This adds some proper documentation for the ABI changes. Committed to trunk. commit 738e20c17326a4d966b24d081549991f0a318774 Author: Jonathan Wakely Date: Mon Apr 20 22:49:43 2015 +0100 * doc/xml/manual/configure.xml: Update descriptions of options affecting dual ABI and add cross-references. * doc/xml/manual/strings.xml: Clarify that string isn't COW now. * doc/xml/manual/using.xml: Document ABI transition. * doc/html/*: Regenerate. diff --git a/libstdc++-v3/doc/xml/manual/configure.xml b/libstdc++-v3/doc/xml/manual/configure.xml index a6e0c21..56d071e 100644 --- a/libstdc++-v3/doc/xml/manual/configure.xml +++ b/libstdc++-v3/doc/xml/manual/configure.xml @@ -385,18 +385,22 @@ --disable-libstdcxx-dual-abi - Disable support for the new, C++11-conforming std::string - implementation. This option changes the library ABI. + Disable support for the new, C++11-conforming implementations of + std::string, std::list etc. so that the + library only provides definitions of types using the old ABI + (see ). + This option changes the library ABI. ---with-default-libstdcxx-abi +--with-default-libstdcxx-abi=OPTION - By default, the new std::string implementation will be - declared and a macro must be defined to declare the old implementation - instead. That default can be reversed by configuring the library with - --with-default-libstdcxx-abi=c++98. + Set the default value for the _GLIBCXX_USE_CXX11_ABI + macro (see ). + The default is OPTION=c++11 which sets the macro to + 1, + use OPTION=c++98 to set it to 0. This option does not change the library ABI. diff --git a/libstdc++-v3/doc/xml/manual/strings.xml b/libstdc++-v3/doc/xml/manual/strings.xml index 6a94fa2..101f8cd 100644 --- a/libstdc++-v3/doc/xml/manual/strings.xml +++ b/libstdc++-v3/doc/xml/manual/strings.xml @@ -353,7 +353,7 @@ stringtok(Container &container, string const &in, a vector's memory usage (see this FAQ entry) but the regular copy constructor cannot be used - because libstdc++'s string is Copy-On-Write. 
+ because libstdc++'s string is Copy-On-Write in GCC 3.
 In C++11 mode you can call s.shrink_to_fit() to achieve the same effect as
diff --git a/libstdc++-v3/doc/xml/manual/using.xml b/libstdc++-v3/doc/xml/manual/using.xml
index 0ce4407..8b4af1a 100644
--- a/libstdc++-v3/doc/xml/manual/using.xml
+++ b/libstdc++-v3/doc/xml/manual/using.xml
@@ -875,6 +875,22 @@ g++ -Winvalid-pch -I. -include stdc++.h -H -g -O2 hello.cc -o test.exe
+_GLIBCXX_USE_CXX11_ABI
+
+
+Defined to the value 1 by default.
+Configurable via --disable-libstdcxx-dual-abi
+and/or --with-default-libstdcxx-abi.
+ABI-changing.
+When defined to a non-zero value the library headers will use the
+new C++11-conforming ABI introduced in GCC 5, rather than the older
+ABI introduced in GCC 3.4. This changes the definition of several
+class templates, including std::string,
+std::list and some locale facets.
+For more details see .
+
+
+
 _GLIBCXX_CONCEPT_CHECKS
@@ -922,6 +938,94 @@ g++ -Winvalid-pch -I. -include stdc++.h -H -g -O2 hello.cc -o test.exe
+
+ Dual ABI
+
+
+ In the GCC 5.1 release libstdc++ introduced a new library ABI that
+ includes new implementations of std::string and
+ std::list. These changes were necessary to conform
+ to the 2011 C++ standard which forbids Copy-On-Write strings and requires
+ lists to keep track of their size.
+
+
+ In order to maintain backwards compatibility for existing code linked
+ to libstdc++ the library's soname has not changed and the old
+ implementations are still supported in parallel with the new ones.
+ This is achieved by defining the new implementations in an inline namespace
+ so they have different names for linkage purposes, e.g. the new version of
+ std::list<int> is actually defined as
+ std::__cxx11::list<int>. Because the symbols
+ for the new implementations have different names the definitions for both
+ versions can be present in the same library.
+ + + The _GLIBCXX_USE_CXX11_ABI macro (see +) controls whether + the declarations in the library headers use the old or new ABI. + So the decision of which ABI to use can be made separately for each + source file being compiled. + Using the default configuration options for GCC the default value + of the macro is 1 which causes the new ABI to be active, + so to use the old ABI you must explicitly define the macro to + 0 before including any library headers. + (Be aware that some GNU/Linux distributions configure GCC 5 differently so + that the default value of the macro is 0 and users must + define it to 1 to enable the new ABI.) + + + Although the changes were made for C++11 conformance, the choice of ABI + to use is independent of the -std
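The inline-namespace technique that the new manual section describes can be tried in isolation. The following is a minimal sketch with made-up names: the namespace `mylib`, the version tag `v2` and the member `size_cache` are assumptions for illustration and are not libstdc++'s actual definitions.

```cpp
#include <type_traits>

namespace mylib {
// Members of an inline namespace are found as if they were declared in
// mylib itself, but the inner namespace is part of their linkage name.
inline namespace v2 {
struct list {
  int size_cache = 0;  // e.g. a list that keeps track of its size
};
}  // namespace v2
}  // namespace mylib

// mylib::list and mylib::v2::list name one and the same type.
constexpr bool same_type =
    std::is_same<mylib::list, mylib::v2::list>::value;
```

User code keeps writing `mylib::list`, while an older, differently laid out `list` could live directly in `mylib` of another build without the two symbols clashing at link time.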
[patch] Document effects of -std=c++14 and -std=c++03 in libstdc++ manual
A small doc patch that could also go to the 4.9 and 5 branches. Committed only to trunk for now. commit c5a5a32af8b7cb69c14decbfca9c1a3175e7c535 Author: Jonathan Wakely Date: Mon Apr 20 13:20:16 2015 +0100 * doc/xml/manual/abi.xml: Use uppercase for C++ Standard Library. * doc/xml/manual/using.xml: Document newer -std options. Use better examples of nested namespaces. diff --git a/libstdc++-v3/doc/xml/manual/abi.xml b/libstdc++-v3/doc/xml/manual/abi.xml index ee3a27e..86c591d 100644 --- a/libstdc++-v3/doc/xml/manual/abi.xml +++ b/libstdc++-v3/doc/xml/manual/abi.xml @@ -66,7 +66,7 @@ Putting all of these ideas together results in the C++ Standard -library ABI, which is the compilation of a given library API by a +Library ABI, which is the compilation of a given library API by a given compiler ABI. In a nutshell: diff --git a/libstdc++-v3/doc/xml/manual/using.xml b/libstdc++-v3/doc/xml/manual/using.xml index f6f615e..0ce4407 100644 --- a/libstdc++-v3/doc/xml/manual/using.xml +++ b/libstdc++-v3/doc/xml/manual/using.xml @@ -13,7 +13,10 @@ - By default, g++ is equivalent to g++ -std=gnu++98. The standard library also defaults to this dialect. + The standard library conforms to the dialect of C++ specified by the + -std option passed to the compiler. + By default, g++ is equivalent to + g++ -std=gnu++98. @@ -32,12 +35,14 @@ - -std=c++98 + -std=c++98 or -std=c++03 + Use the 1998 ISO C++ standard plus amendments. - -std=gnu++98 + -std=gnu++98 or -std=gnu++03 + As directly above, with GNU extensions. @@ -52,6 +57,16 @@ + -std=c++14 + Use the 2014 ISO C++ standard. + + + + -std=gnu++14 + As directly above, with GNU extensions. + + + -fexceptions See exception-free dialect @@ -923,8 +938,8 @@ g++ -Winvalid-pch -I. -include stdc++.h -H -g -O2 hello.cc -o test.exe std The ISO C++ standards specify that "all library entities are defined within namespace std." This includes namespaces nested -within namespace std, such as namespace -std::tr1. 
+within namespace std, such as namespace +std::chrono. abi
[C/C++ PATCH] Improve -Wlogical-op (PR c/63357)
This patch improves -Wlogical-op so that it also warns about cases such as P && P or P || P. I made use of what merge_ranges computes: if we have equal operands with the same ranges, warn -- that seems to work well. (-Wlogical-op is still enabled by neither -Wall nor -Wextra.)

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2015-04-21 Marek Polacek

PR c/63357
* c-common.c (warn_logical_operator): Warn if the operands have the same expressions.
* doc/invoke.texi: Update description of -Wlogical-op.
* c-c++-common/Wlogical-op-1.c: New test.

diff --git gcc/c-family/c-common.c gcc/c-family/c-common.c
index 7fe7fa6..6eecc73 100644
--- gcc/c-family/c-common.c
+++ gcc/c-family/c-common.c
@@ -1772,22 +1772,35 @@ warn_logical_operator (location_t location, enum tree_code code, tree type,
 return;

 /* If both expressions have the same operand, if we can merge the
- ranges, and if the range test is always false, then warn. */
+ ranges, ... */
 if (operand_equal_p (lhs, rhs, 0)
 && merge_ranges (&in_p, &low, &high, in0_p, low0, high0,
- in1_p, low1, high1)
- && 0 != (tem = build_range_check (UNKNOWN_LOCATION,
- type, lhs, in_p, low, high))
- && integer_zerop (tem))
+ in1_p, low1, high1))
 {
- if (or_op)
-warning_at (location, OPT_Wlogical_op,
-"logical %<or%> "
-"of collectively exhaustive tests is always true");
- else
-warning_at (location, OPT_Wlogical_op,
-"logical %<and%> "
-"of mutually exclusive tests is always false");
+ tem = build_range_check (UNKNOWN_LOCATION, type, lhs, in_p, low, high);
+ /* ... and if the range test is always false, then warn. */
+ if (tem && integer_zerop (tem))
+ {
+ if (or_op)
+ warning_at (location, OPT_Wlogical_op,
+ "logical %<or%> of collectively exhaustive tests is "
+ "always true");
+ else
+ warning_at (location, OPT_Wlogical_op,
+ "logical %<and%> of mutually exclusive tests is "
+ "always false");
+ }
+ /* Or warn if the operands have exactly the same range, e.g.
+A > 0 && A > 0.
*/
+ else if (low0 == low1 && high0 == high1)
+ {
+ if (or_op)
+ warning_at (location, OPT_Wlogical_op,
+ "logical %<or%> of equal expressions");
+ else
+ warning_at (location, OPT_Wlogical_op,
+ "logical %<and%> of equal expressions");
+ }
 }
 }

diff --git gcc/doc/invoke.texi gcc/doc/invoke.texi
index c20dd4d..8ce233b 100644
--- gcc/doc/invoke.texi
+++ gcc/doc/invoke.texi
@@ -4936,7 +4936,12 @@ programmer intended to use @code{strcmp}. This warning is enabled by
 @opindex Wno-logical-op
 Warn about suspicious uses of logical operators in expressions.
 This includes using logical operators in contexts where a
-bit-wise operator is likely to be expected.
+bit-wise operator is likely to be expected. Also warns when
+the operands of a logical operator are the same:
+@smallexample
+extern int a;
+if (a < 0 && a < 0) @{ @dots{} @}
+@end smallexample

 @item -Wlogical-not-parentheses
 @opindex Wlogical-not-parentheses
diff --git gcc/testsuite/c-c++-common/Wlogical-op-1.c gcc/testsuite/c-c++-common/Wlogical-op-1.c
index e69de29..33d4f38 100644
--- gcc/testsuite/c-c++-common/Wlogical-op-1.c
+++ gcc/testsuite/c-c++-common/Wlogical-op-1.c
@@ -0,0 +1,109 @@
+/* PR c/63357 */
+/* { dg-do compile } */
+/* { dg-options "-Wlogical-op" } */
+
+#ifndef __cplusplus
+# define bool _Bool
+# define true 1
+# define false 0
+#endif
+
+extern int bar (void);
+extern int *p;
+struct R { int a, b; } S;
+
+void
+andfn (int a, int b)
+{
+ if (a && a) {} /* { dg-warning "logical .and. of equal expressions" } */
+ if (!a && !a) {} /* { dg-warning "logical .and. of equal expressions" } */
+ if (!!a && !!a) {} /* { dg-warning "logical .and. of equal expressions" } */
+ if (a > 0 && a > 0) {} /* { dg-warning "logical .and. of equal expressions" } */
+ if (a < 0 && a < 0) {} /* { dg-warning "logical .and. of equal expressions" } */
+ if (a == 0 && a == 0) {} /* { dg-warning "logical .and. of equal expressions" } */
+ if (a <= 0 && a <= 0) {} /* { dg-warning "logical .and.
of equal expressions" } */ + if (a >= 0 && a >= 0) {} /* { dg-warning "logical .and. of equal expressions" } */ + if (a == 0 && !(a != 0)) {} /* { dg-warning "logical .and. of equal expressions" } */ + + if (a && a && a) {} /* { dg-warning "logical .and. of equal expressions" } */ + if ((a + 1) && (a + 1)) {} /* { dg-warning "logical .and. of equal expressions" } */ + if ((10 * a) && (a * 10)) {} /* { dg-warning "logical .and. of equal expressions" }
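As a hedged illustration of the kind of mistake the extended warning is aimed at: going by the test file above, a compiler with this patch and `-Wlogical-op` would be expected to flag both conditions below. The function names are made up for the example, and the intended condition named in the comments is an assumption about what the programmer meant.

```cpp
// Both operands of && are the same test; probably a typo for
// (a > 0 && b > 0). Expected diagnostic: logical 'and' of equal expressions.
bool both_positive(int a, int b) {
  (void)b;  // b is accidentally never consulted
  return a > 0 && a > 0;
}

// Same problem with ||; probably meant (a < 0 || b < 0).
// Expected diagnostic: logical 'or' of equal expressions.
bool either_negative(int a, int b) {
  (void)b;
  return a < 0 || a < 0;
}
```

Both functions compile and run, which is exactly why the mistake is easy to miss: `both_positive(1, -5)` reports true even though the second argument is negative.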
[PATCH] Skip preprocessor directives in mklog
Hi all,

contrib/mklog is currently fooled by preprocessor directives inside functions into producing an invalid ChangeLog. The attached patch fixes this.

Tested with my local mklog testsuite and http://paste.debian.net/167999/ . Ok to commit?

-Y

commit 23a738d05393676e72db82cb527d5fb1b3060e2f
Author: Yury Gribov
Date: Tue Apr 21 14:17:23 2015 +0300

2015-04-21 Yury Gribov

* mklog: Ignore preprocessor directives.

diff --git a/contrib/mklog b/contrib/mklog
index f7974a7..455614b 100755
--- a/contrib/mklog
+++ b/contrib/mklog
@@ -131,7 +131,6 @@ sub is_unified_hunk_start {
 }

 # Check if line is a top-level declaration.
-# TODO: ignore preprocessor directives except maybe #define ?
 sub is_top_level {
 my ($function, $is_context_diff) = (@_);
 if (is_unified_hunk_start ($function)
@@ -143,7 +142,7 @@ sub is_top_level {
 } else {
 $function =~ s/^.//;
 }
- return $function && $function !~ /^[\s{]/;
+ return $function && $function !~ /^[\s{#]/;
 }

 # Read contents of .diff file
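The effect of adding `#` to the character class can be modelled outside Perl. The sketch below is a simplified C++ rendering of the final test in `is_top_level` only: `looks_top_level` is a hypothetical name, the hunk-start and context-diff branches of the real script are left out, and Perl's rule that the string "0" is false is not reproduced.

```cpp
#include <cctype>
#include <string>

// LINE is one line of a unified diff. Its first character is the diff
// marker (' ', '+' or '-'), which is dropped before testing, mirroring
// the s/^.// in the script.
bool looks_top_level(const std::string &line) {
  if (line.size() < 2) return false;  // nothing left after the marker
  unsigned char first = static_cast<unsigned char>(line[1]);
  // Perl: $function !~ /^[\s{#]/
  return !(std::isspace(first) || first == '{' || first == '#');
}
```

With the old class `[\s{]`, a line such as `-#ifdef EH_RETURN_DATA_REGNO` counted as a top-level declaration; with `#` included it is skipped.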
[PATCH][i386] Properly scale vec_construct cost
Hi,

currently vec_construct cost is simply TYPE_VECTOR_SUBPARTS / 2 + 1, a reasonable estimate only if other target stmt costs are close to 1. The idea was that you need that many vector stmts, thus the following patch, which should fix skewed costs for bdver2, for example, with a vec_stmt_cost of 6.

Fixing this gets important for a fix for PR62283 which will consider building vectors up from parts during basic-block vectorization and relies on the cost model to reject too expensive ones.

For example gcc.dg/vect/bb-slp-14.c will now be vectorized (with the generic cost model and just SSE2) as

Cost model analysis:
  Vector inside of basic block cost: 2
  Vector prologue cost: 7
  Vector epilogue cost: 0
  Scalar cost of basic block: 10

.LFB7:
        .cfi_startproc
        subq    $24, %rsp
        .cfi_def_cfa_offset 32
        movl    in+12(%rip), %eax
        testl   %edi, %edi
        movd    in+4(%rip), %xmm0
        movd    in(%rip), %xmm1
        movl    %eax, 12(%rsp)
        movd    in+4(%rip), %xmm4
        movd    12(%rsp), %xmm3
        movl    %edi, 12(%rsp)
        punpckldq       %xmm4, %xmm1
        punpckldq       %xmm3, %xmm0
        punpcklqdq      %xmm0, %xmm1
        movd    12(%rsp), %xmm0
        movl    %esi, 12(%rsp)
        movd    12(%rsp), %xmm5
        paddd   .LC2(%rip), %xmm1
        movdqa  %xmm1, %xmm2
        psrlq   $32, %xmm1
        punpckldq       %xmm5, %xmm0
        punpcklqdq      %xmm0, %xmm0
        pmuludq %xmm0, %xmm2
        psrlq   $32, %xmm0
        pmuludq %xmm1, %xmm0
        pshufd  $8, %xmm2, %xmm1
        pshufd  $8, %xmm0, %xmm0
        punpckldq       %xmm0, %xmm1
        movaps  %xmm1, out(%rip)
        je      .L12

vs. the scalar variant

.LFB7:
        .cfi_startproc
        subq    $8, %rsp
        .cfi_def_cfa_offset 16
        movl    in(%rip), %edx
        movl    in+4(%rip), %eax
        movl    in+12(%rip), %ecx
        addl    $23, %edx
        imull   %edi, %edx
        leal    31(%rcx), %r8d
        movl    %edx, out(%rip)
        leal    142(%rax), %edx
        addl    $2, %eax
        imull   %edi, %eax
        imull   %esi, %edx
        movl    %eax, out+8(%rip)
        movl    %r8d, %eax
        imull   %esi, %eax
        testl   %edi, %edi
        movl    %edx, out+4(%rip)
        movl    %eax, out+12(%rip)
        je      .L12

Some excessive PRE across the conditional asm() keeps part of the scalar computes live (yes, the cost model accounts for that). Previously we didn't vectorize the basic-block because the loads from in[] could not be vectorized.
Now we will build up a vector from the scalar loads. The vectorized code is generated from:

  vect_cst_.19_43 = {x_10(D), y_13(D), x_10(D), y_13(D)};
  _3 = in[0];
  _5 = in[1];
  _8 = in[3];
  vect_cst_.16_47 = {_3, _5, _5, _8};
  vect_a0_4.15_42 = vect_cst_.16_47 + { 23, 142, 2, 31 };
  vect__11.18_44 = vect_a0_4.15_42 * vect_cst_.19_43;
  MEM[(unsigned int *)&out] = vect__11.18_44;

thus the code we generate for

  _3 = in[0];
  _5 = in[1];
  _8 = in[3];
  vect_cst_.16_47 = {_3, _5, _5, _8};

is quite bad. It gets better for -mavx but I wonder where we should try to optimize code generation for constructors... (we can vectorize the loads by enhancing load permutation support, of course - another vectorizer improvement I have some partial patches for).

Well, anyway - below for the "obvious" cost model patch.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Ok for trunk?

Thanks,
Richard.

2015-04-21 Richard Biener

* config/i386/i386.c (ix86_builtin_vectorization_cost): Scale vec_construct cost by vec_stmt_cost.

Index: gcc/config/i386/i386.c
===
--- gcc/config/i386/i386.c (revision 30)
+++ gcc/config/i386/i386.c (working copy)
@@ -46731,7 +46731,7 @@ ix86_builtin_vectorization_cost (enum ve

 case vec_construct:
 elements = TYPE_VECTOR_SUBPARTS (vectype);
- return elements / 2 + 1;
+ return ix86_cost->vec_stmt_cost * (elements / 2 + 1);

 default:
 gcc_unreachable ();
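The one-line change is easiest to see as arithmetic. The two functions below restate the old and new formulas from the diff as free functions; the function names are made up for this sketch, and the only per-target number used is the bdver2 `vec_stmt_cost` of 6 quoted in the message.

```cpp
// Cost before the patch: an estimate of the number of vector statements
// needed to build the vector, independent of the target's cost scale.
int vec_construct_cost_old(int elements) { return elements / 2 + 1; }

// Cost after the patch: the same statement count, scaled by what one
// vector statement costs on the target.
int vec_construct_cost_new(int vec_stmt_cost, int elements) {
  return vec_stmt_cost * (elements / 2 + 1);
}
```

For a 4-element vector the statement count is 4 / 2 + 1 = 3. With a `vec_stmt_cost` of 1 both formulas give 3, but with a `vec_stmt_cost` of 6 the new formula gives 18, so constructing a vector is no longer priced far below the other vector statements on such a target.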
Re: [Patch] pr65779 - [5/6 Regression] undefined local symbol on powerpc
On Mon, Apr 20, 2015 at 03:17:21PM +0200, Jakub Jelinek wrote: > On Mon, Apr 20, 2015 at 10:30:32PM +0930, Alan Modra wrote: > Zapping is conservatively correct, if you don't know where the var lives in > or how to compute it, you tell the debugger you don't know it. > Of course, it is a QoI issue, if there is an easy way how to reconstruct the > value otherwise, it is always better to do so. That's what this revised patch does, fix the easy cases. > > Of course, all this moving for shrink-wrap is senseless in a block > > that contains a call. > > Yeah, such blocks clearly aren't going to be shrink-wrapped, so there is no > point to move it that far, right? It's not where we're moving to, but from. The first block in the function has a call, but prepare_shrink_wrap goes ahead regardless, moving reg copies and initialization out of the block. Ideally none of the moves would be committed until we decide that we can shrink wrap. The tricky part is that we need to perform the moves in order to update dataflow info used to decide whether other moves can happen. So I think the only way to get back to the original insn stream is keep info around for an undo. Anyway, here's the current patch. The debug_loc info looks much better, so we should see fewer of those messages from gdb. Cures a dozen quality fails on powerpc64 too (all in one testcase). Bootstrapped and regression tested powerpc64-linux and x86_64-linux. gcc/ PR debug/65779 * shrink-wrap.c (insn_uses_reg): New function. (move_insn_for_shrink_wrap): Try to fix up debug insns related to the moved insn. gcc/testsuite/ * gcc.dg/pr65779.c: New. Index: shrink-wrap.c === --- shrink-wrap.c (revision 27) +++ shrink-wrap.c (working copy) @@ -182,6 +182,24 @@ live_edge_for_reg (basic_block bb, int regno, int return live_edge; } +/* Return true if INSN df shows a use of a reg in the range + [REGNO,END_REGNO). 
*/ + +static bool +insn_uses_reg (rtx_insn *insn, unsigned int regno, unsigned int end_regno) +{ + df_ref use; + + FOR_EACH_INSN_USE (use, insn) +{ + rtx reg = DF_REF_REG (use); + + if (REG_P (reg) && REGNO (reg) >= regno && REGNO (reg) < end_regno) + return true; +} + return false; +} + /* Try to move INSN from BB to a successor. Return true on success. USES and DEFS are the set of registers that are used and defined after INSN in BB. SPLIT_P indicates whether a live edge from BB @@ -342,8 +360,11 @@ move_insn_for_shrink_wrap (basic_block bb, rtx_ins /* At this point we are committed to moving INSN, but let's try to move it as far as we can. */ + auto_vec live_bbs; do { + if (MAY_HAVE_DEBUG_INSNS) + live_bbs.safe_push (bb); live_out = df_get_live_out (bb); live_in = df_get_live_in (next_block); bb = next_block; @@ -426,6 +447,54 @@ move_insn_for_shrink_wrap (basic_block bb, rtx_ins SET_REGNO_REG_SET (bb_uses, i); } + /* Try to fix up debug insns in the tail of the entry block and any + intervening blocks that use regs set by the insn we are moving. */ + if (MAY_HAVE_DEBUG_INSNS) +{ + while (!live_bbs.is_empty ()) + { + rtx_insn *dinsn; + basic_block tmp_bb = live_bbs.pop (); + + FOR_BB_INSNS_REVERSE (tmp_bb, dinsn) + { + if (dinsn == insn) + break; + if (DEBUG_INSN_P (dinsn) + && insn_uses_reg (dinsn, dregno, end_dregno)) + { + if (live_bbs.is_empty ()) + /* Put debug info for the insn we'll be moving + into the destination block. */ + { + rtx_insn *newdinsn + = emit_debug_insn_after (copy_rtx (PATTERN (dinsn)), +bb_note (bb)); + df_insn_rescan (newdinsn); + } + + /* If the insn is a simple reg-reg copy, then reset +the debug insn to point to src. */ + if (REG_P (src) && GET_MODE (src) == GET_MODE (dest)) + { + INSN_VAR_LOCATION_LOC (dinsn) + = simplify_replace_rtx (INSN_VAR_LOCATION_LOC (dinsn), + dest, src); + df_insn_rescan (dinsn); + } + else + { + /* Otherwise remove anything about this variable. 
*/ + INSN_VAR_LOCATION_LOC (dinsn) + = gen_rtx_UNKNOWN_VAR_LOC (); + df_insn_rescan_debug_internal (dinsn); + } + break; + } + } + } +} + emit_insn_after (PATTERN (insn), bb_note (bb)); delete_insn (insn); return t
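The new `insn_uses_reg` helper in this patch is a half-open range test over the registers an insn uses. A toy model of that test, with an insn reduced to a plain list of register numbers (an assumption made for the sketch; the real function walks df use records with FOR_EACH_INSN_USE), could look like this:

```cpp
#include <vector>

// Toy model: USES lists the register numbers one instruction reads.
// Returns true if any of them lies in the half-open range
// [regno, end_regno), i.e. end_regno itself is excluded.
bool uses_reg_in_range(const std::vector<unsigned> &uses,
                       unsigned regno, unsigned end_regno) {
  for (unsigned r : uses)
    if (r >= regno && r < end_regno)
      return true;
  return false;
}
```

The half-open form matters for multi-register values: a value occupying registers 9 and 10 is described as [9, 11), and a use of register 11 must not match.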
Re: [PATCH][i386] Properly scale vec_construct cost
On Tue, Apr 21, 2015 at 1:30 PM, Richard Biener wrote:
> Well, anyway - below for the "obvious" cost model patch.
>
> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
>
> Ok for trunk?
>
> Thanks,
> Richard.
>
> 2015-04-21 Richard Biener
>
> * config/i386/i386.c (ix86_builtin_vectorization_cost): Scale
> vec_construct cost by vec_stmt_cost.

OK.

Thanks,
Uros.
[wwwdocs] Add libstdc++ ABI changes to /gcc-5/changes.html
I plan to commit this to wwwdocs later today, it adds a caveat to the top of the file, with a link to a larger description in the libstdc++ section, which links to the new page I've just added to the manual. It also clarifies that the deprecations apply to C++, so people who don't care about C++ can ignore that item. Index: htdocs/gcc-5/changes.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v retrieving revision 1.109 diff -u -r1.109 changes.html --- htdocs/gcc-5/changes.html 20 Apr 2015 08:22:35 - 1.109 +++ htdocs/gcc-5/changes.html 21 Apr 2015 11:45:51 - @@ -16,15 +16,17 @@ The default mode for C is now -std=gnu11 instead of -std=gnu89. +The C++ runtime library (libstdc++) uses a new ABI by default +(see below). The Graphite framework for loop optimizations no longer requires the CLooG library, only ISL version 0.14 (recommended) or 0.12.2. The installation manual contains more information about requirements to build GCC. -The non-standard type traits +The non-standard C++0x type traits has_trivial_default_constructor, has_trivial_copy_constructor and has_trivial_copy_assign have been deprecated and will -be removed in a future version. The standard traits +be removed in a future version. The standard C++11 traits is_trivially_default_constructible, is_trivially_copy_constructible and is_trivially_copy_assignable should be used instead. @@ -415,6 +417,11 @@ Runtime Library (libstdc++) +A Dual +ABI is provided by the library. A new ABI is enabled by default. +The old ABI is still supported and can be used by defining the macro +_GLIBCXX_USE_CXX11_ABI to 0 before +including any C++ standard library headers. A new implementation of std::string is enabled by default, using the small string optimization instead of copy-on-write reference counting.
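For readers who want to check which ABI a translation unit ends up with, here is a small sketch. The helper name `abi_in_use` is made up; `__GLIBCXX__` and `_GLIBCXX_USE_CXX11_ABI` are the library's own macros. The old ABI would be requested by defining `_GLIBCXX_USE_CXX11_ABI` to 0 before the first standard header, for example with `-D_GLIBCXX_USE_CXX11_ABI=0` on the command line; the sketch itself only reports and does not change the setting.

```cpp
#include <string>

// Reports which libstdc++ ABI this translation unit was compiled against.
// Any standard header has to be included first so that the library's
// configuration macros are visible.
const char *abi_in_use() {
#if !defined(__GLIBCXX__)
  return "not libstdc++";
#elif defined(_GLIBCXX_USE_CXX11_ABI) && _GLIBCXX_USE_CXX11_ABI
  return "new";
#else
  return "old";
#endif
}
```

Because the choice is made per translation unit, all objects that exchange `std::string` or `std::list` values need to agree on the setting.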
[patch, libgomp] Re-factor GOMP_MAP_POINTER handling
Hi, while investigating some issues in the variable mapping code, I observed that the GOMP_MAP_POINTER handling is essentially duplicated under the PSET case. This patch abstracts and unifies the handling code, basically just a cleanup patch. Ran libgomp tests to ensure no regressions, ok for trunk? Thanks, Chung-Lin 2015-04-21 Chung-Lin Tang libgomp/ * target.c (gomp_map_pointer): New function abstracting out GOMP_MAP_POINTER handling. (gomp_map_vars): Remove GOMP_MAP_POINTER handling code and use gomp_map_pointer(). Index: target.c === --- target.c (revision 448412) +++ target.c (working copy) @@ -163,6 +163,60 @@ get_kind (bool is_openacc, void *kinds, int idx) : ((unsigned char *) kinds)[idx]; } +static void +gomp_map_pointer (struct target_mem_desc *tgt, uintptr_t host_ptr, + uintptr_t target_offset, uintptr_t bias) +{ + struct gomp_device_descr *devicep = tgt->device_descr; + struct splay_tree_s *mem_map = &devicep->mem_map; + struct splay_tree_key_s cur_node; + + cur_node.host_start = host_ptr; + if (cur_node.host_start == (uintptr_t) NULL) +{ + cur_node.tgt_offset = (uintptr_t) NULL; + /* FIXME: see comment about coalescing host/dev transfers below. */ + devicep->host2dev_func (devicep->target_id, + (void *) (tgt->tgt_start + target_offset), + (void *) &cur_node.tgt_offset, + sizeof (void *)); + return; +} + /* Add bias to the pointer value. */ + cur_node.host_start += bias; + cur_node.host_end = cur_node.host_start + 1; + splay_tree_key n = splay_tree_lookup (mem_map, &cur_node); + if (n == NULL) +{ + /* Could be possibly zero size array section. 
*/ + cur_node.host_end--; + n = splay_tree_lookup (mem_map, &cur_node); + if (n == NULL) + { + cur_node.host_start--; + n = splay_tree_lookup (mem_map, &cur_node); + cur_node.host_start++; + } +} + if (n == NULL) +{ + gomp_mutex_unlock (&devicep->lock); + gomp_fatal ("Pointer target of array section wasn't mapped"); +} + cur_node.host_start -= n->host_start; + cur_node.tgt_offset += n->tgt->tgt_start + n->tgt_offset + cur_node.host_start; + /* At this point tgt_offset is target address of the + array section. Now subtract bias to get what we want + to initialize the pointer with. */ + cur_node.tgt_offset -= bias; + /* FIXME: see comment about coalescing host/dev transfers below. */ + devicep->host2dev_func (devicep->target_id, + (void *) (tgt->tgt_start + target_offset), + (void *) &cur_node.tgt_offset, + sizeof (void *)); +} + attribute_hidden struct target_mem_desc * gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum, void **hostaddrs, void **devaddrs, size_t *sizes, void *kinds, @@ -336,54 +390,8 @@ gomp_map_vars (struct gomp_device_descr *devicep, k->host_end - k->host_start); break; case GOMP_MAP_POINTER: - cur_node.host_start - = (uintptr_t) *(void **) k->host_start; - if (cur_node.host_start == (uintptr_t) NULL) - { - cur_node.tgt_offset = (uintptr_t) NULL; - /* FIXME: see above FIXME comment. */ - devicep->host2dev_func (devicep->target_id, - (void *) (tgt->tgt_start - + k->tgt_offset), - (void *) &cur_node.tgt_offset, - sizeof (void *)); - break; - } - /* Add bias to the pointer value. */ - cur_node.host_start += sizes[i]; - cur_node.host_end = cur_node.host_start + 1; - n = splay_tree_lookup (mem_map, &cur_node); - if (n == NULL) - { - /* Could be possibly zero size array section. 
*/ - cur_node.host_end--; - n = splay_tree_lookup (mem_map, &cur_node); - if (n == NULL) - { - cur_node.host_start--; - n = splay_tree_lookup (mem_map, &cur_node); - cur_node.host_start++; - } - } - if (n == NULL) - { - gomp_mutex_unlock (&devicep->lock); - gomp_fatal ("Pointer target of array section " -"wasn't mapped"); - } - cur_node.host_start -= n->host_start; - cur_node.tgt_offset = n->tgt->tgt_start + n->tgt_offset - + cur_node.host_start; - /* At this point tgt_offset is target address of the - array section. Now subtract bias to get what we want - to initialize the pointer with. */ - cur_node.tgt_offset -= sizes[i]; - /* FIXME: see above FIXME comment. */ - devicep->host2dev_func (devicep->target_id, - (void *) (tgt->tgt_start - + k->tgt_offset), - (void *) &cur_node.tgt_offset, - sizeof (void *)); + gomp_map_pointer (tgt, (uintptr_t) *(void **) k->host_start, + k->tgt_offset, sizes[i]); break; case GOMP_MAP_TO_PSET: /* FIXME: see above FIXME comment. */ @@ -405,58 +413,12 @@ gomp_map_vars (struct gomp_device_descr *devicep, { tgt->list[j] = k;
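For readers following the address arithmetic in gomp_map_pointer, here is a minimal standalone sketch of the same idea. It is not libgomp code: struct mapping, translate_pointer and the linear search are hypothetical stand-ins for the splay tree lookup, and the zero-size array section fallback from the patch is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one mapped host range [host_start, host_end)
   and the device address of its first byte.  */
struct mapping
{
  uintptr_t host_start;
  uintptr_t host_end;
  uintptr_t tgt_start;
};

/* Translate HOST_PTR to the value the device-side pointer should hold.
   BIAS is added before the lookup and subtracted again afterwards, as in
   the patch.  Returns 0 on success, -1 if the biased address is not
   mapped.  A NULL host pointer becomes a NULL device pointer.  */
int
translate_pointer (const struct mapping *maps, size_t nmaps,
                   uintptr_t host_ptr, uintptr_t bias, uintptr_t *tgt_ptr)
{
  assert (tgt_ptr != NULL);

  if (host_ptr == 0)
    {
      *tgt_ptr = 0;
      return 0;
    }

  uintptr_t biased = host_ptr + bias;
  for (size_t i = 0; i < nmaps; i++)
    if (biased >= maps[i].host_start && biased < maps[i].host_end)
      {
        /* Device address of the biased location, then undo the bias.  */
        *tgt_ptr = maps[i].tgt_start + (biased - maps[i].host_start) - bias;
        return 0;
      }

  return -1;
}
```

The real function reports the unmapped case with gomp_fatal and writes the result to the device with host2dev_func. The sketch only returns the computed value.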
Re: [PATCH] Skip preprocessor directives in mklog
On 21-04-15 13:26, Yury Gribov wrote:
> Hi all,
>
> Contrib/mklog is currently fooled by preprocessor directives inside
> functions to produce invalid ChangeLog.

Hi Yury,

The effect of the patch on the mklog output using the pastebin input is:
...
@@ -2,11 +2,13 @@
 2015-04-21 x

-	* builtins.c:
+	* builtins.c (expand_builtin):
 	* defaults.h:
-	* df-scan.c:
+	* df-scan.c (df_bb_refs_collect):
+	(df_get_exit_block_use_set):
 	* except.c:
-	* haifa-sched.c:
-	* ira-lives.c:
-	* lra-lives.c:
+	* haifa-sched.c (initiate_bb_reg_pressure_info):
+	* ira-lives.c (process_bb_node_lives):
+	* lra-lives.c (process_bb_lives):
...

So, for instance, for this patch hunk:
...
diff --git a/gcc/builtins.c b/gcc/builtins.c
index 9263777..028d793 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -6510,10 +6510,8 @@ expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode,
       expand_builtin_eh_return (CALL_EXPR_ARG (exp, 0),
                                 CALL_EXPR_ARG (exp, 1));
       return const0_rtx;
-#ifdef EH_RETURN_DATA_REGNO
     case BUILT_IN_EH_RETURN_DATA_REGNO:
       return expand_builtin_eh_return_data_regno (exp);
-#endif
     case BUILT_IN_EXTEND_POINTER:
       return expand_builtin_extend_pointer (CALL_EXPR_ARG (exp, 0));
     case BUILT_IN_EH_POINTER:
...
with the patch we output:
...
	* builtins.c (expand_builtin):
...
instead of:
...
	* builtins.c:
...

That looks like an improvement to me.

Thanks,
- Tom

> The attached patch fixes this. Tested with my local mklog testsuite and
> http://paste.debian.net/167999/ . Ok to commit?
>
> -Y
>
> mklog-1.diff
>
> commit 23a738d05393676e72db82cb527d5fb1b3060e2f
> Author: Yury Gribov
> Date: Tue Apr 21 14:17:23 2015 +0300
>
>     2015-04-21 Yury Gribov
>
>     * mklog: Ignore preprocessor directives.
>
> diff --git a/contrib/mklog b/contrib/mklog
> index f7974a7..455614b 100755
> --- a/contrib/mklog
> +++ b/contrib/mklog
> @@ -131,7 +131,6 @@ sub is_unified_hunk_start {
>  }
>
>  # Check if line is a top-level declaration.
> -# TODO: ignore preprocessor directives except maybe #define ?
>  sub is_top_level {
>    my ($function, $is_context_diff) = (@_);
>    if (is_unified_hunk_start ($function)
> @@ -143,7 +142,7 @@ sub is_top_level {
>    } else {
>      $function =~ s/^.//;
>    }
> -  return $function && $function !~ /^[\s{]/;
> +  return $function && $function !~ /^[\s{#]/;
>  }
>
>  # Read contents of .diff file
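The one-character change to the regular expression can be restated in C as a rough illustration. The function name looks_top_level is hypothetical, and the sketch only roughly mirrors the Perl test: it does not model Perl's treatment of the string "0" as false.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if TEXT (a diff line with its leading marker already
   stripped) looks like a top-level declaration: it is non-empty and
   does not start with whitespace, '{' or '#'.  The '#' case is the one
   the patch adds, so preprocessor directives are no longer mistaken
   for declarations.  */
bool
looks_top_level (const char *text)
{
  assert (text != NULL);

  unsigned char first = (unsigned char) text[0];
  if (first == '\0')
    return false;

  return !(isspace (first) || first == '{' || first == '#');
}
```

With this check, a line such as "#ifdef EH_RETURN_DATA_REGNO" is skipped, so the function name seen earlier in the hunk header stays in effect.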
[PATCH] Fix PR65788
The following fixes PR65788. We need to use UNDEFINED whenever possible to not get spurious invalid lattice transitions later. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk. Richard. 2015-04-21 Richard Biener PR tree-optimization/65788 * tree-ssa-ccp.c (evaluate_stmt): Evaluate to UNDEFINED early. Index: gcc/tree-ssa-ccp.c === *** gcc/tree-ssa-ccp.c (revision 27) --- gcc/tree-ssa-ccp.c (working copy) *** evaluate_stmt (gimple stmt) *** 1756,1761 --- 1756,1769 val.mask = 0; } } + /* If the statement result is likely UNDEFINED, make it so. */ + else if (likelyvalue == UNDEFINED) + { + val.lattice_val = UNDEFINED; + val.value = NULL_TREE; + val.mask = 0; + return val; + } /* Resort to simplification for bitwise tracking. */ if (flag_tree_bit_ccp *** evaluate_stmt (gimple stmt) *** 1890,1896 if (flag_tree_bit_ccp && ((is_constant && TREE_CODE (val.value) == INTEGER_CST) ! || (!is_constant && likelyvalue != UNDEFINED)) && gimple_get_lhs (stmt) && TREE_CODE (gimple_get_lhs (stmt)) == SSA_NAME) { --- 1898,1904 if (flag_tree_bit_ccp && ((is_constant && TREE_CODE (val.value) == INTEGER_CST) ! || !is_constant) && gimple_get_lhs (stmt) && TREE_CODE (gimple_get_lhs (stmt)) == SSA_NAME) { *** evaluate_stmt (gimple stmt) *** 1918,1939 } } if (!is_constant) { ! /* The statement produced a nonconstant value. If the statement !had UNDEFINED operands, then the result of the statement !should be UNDEFINED. Otherwise, the statement is VARYING. */ ! if (likelyvalue == UNDEFINED) ! { ! val.lattice_val = likelyvalue; ! val.mask = 0; ! } ! else ! { ! val.lattice_val = VARYING; ! val.mask = -1; ! } ! val.value = NULL_TREE; } --- 1926,1936 } } + /* The statement produced a nonconstant value. */ if (!is_constant) { ! val.lattice_val = VARYING; ! val.mask = -1; val.value = NULL_TREE; }
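As a toy model of the ordering this fix relies on, the sketch below treats the lattice as UNDEFINED, then CONSTANT, then VARYING, and lets a result be UNDEFINED as soon as any operand is. It is an assumption-laden simplification: the names valid_transition and evaluate_operands are hypothetical, and the real likely_value and evaluate_stmt logic in tree-ssa-ccp.c has more cases (masks, an UNINITIALIZED state, per-operation exceptions).

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified lattice, lowest to highest.  */
enum lattice_val { UNDEFINED, CONSTANT, VARYING };

/* A value may stay where it is or move towards VARYING, never back.  */
bool
valid_transition (enum lattice_val old_val, enum lattice_val new_val)
{
  return new_val >= old_val;
}

/* Toy evaluation of a statement from its operand states: return
   UNDEFINED early if any operand is UNDEFINED, CONSTANT if all operands
   are CONSTANT, and VARYING otherwise.  */
enum lattice_val
evaluate_operands (const enum lattice_val *ops, size_t nops)
{
  bool all_constant = true;

  for (size_t i = 0; i < nops; i++)
    {
      if (ops[i] == UNDEFINED)
        return UNDEFINED;
      if (ops[i] != CONSTANT)
        all_constant = false;
    }

  return all_constant ? CONSTANT : VARYING;
}
```

In this model, a statement first seen with an UNDEFINED operand starts at the bottom of the lattice, so any later re-evaluation can only be a valid transition.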
Re: [PATCH] 65479 - sanitizer stack trace missing frames past #0 on powerpc64
On Tue, 2015-04-21 at 08:22 +0200, Jakub Jelinek wrote:
> > -#if defined(__powerpc__) || defined(__powerpc64__)
> > -  // PCs are always 4 byte aligned.
> > -  return pc - 4;
> > -#elif defined(__sparc__) || defined(__mips__)
> > -  return pc - 8;
>
> The SPARC/MIPS case is of course needed, because on these architectures
> the call is followed by a delay slot. But I wonder why you need anything
> special on any other architecture, why pc - 1 isn't good enough for those.
> The point isn't to find a PC of the call instruction, on some targets that
> is very hard and you need to disassemble, but to just find some byte in the
> call instruction.

I wrote the "pc - 4" code for powerpc* and I guess I was just being
pedantic about returning the first address of the instruction. If using
"pc - 1" works, then I'm fine with that.

Peter
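The rule being agreed on here can be written down as a small sketch. It is not the sanitizer's actual function: enum arch and previous_instruction_pc are hypothetical names, and the architecture is passed as a run-time argument only so the example is self-contained, where the quoted code selects it with preprocessor checks.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical architecture tag for this sketch.  */
enum arch { ARCH_SPARC, ARCH_MIPS, ARCH_OTHER };

/* Given a return address PC, return an address that lies inside the
   call instruction.  SPARC and MIPS keep the "pc - 8" from the quoted
   code because the call is followed by a delay slot.  Everywhere else
   any byte of the call will do, so "pc - 1" is enough.  */
uintptr_t
previous_instruction_pc (enum arch arch, uintptr_t pc)
{
  assert (pc >= 8);

  if (arch == ARCH_SPARC || arch == ARCH_MIPS)
    return pc - 8;

  return pc - 1;
}
```

The result is only used to look up which instruction the frame belongs to, which is why it does not need to be the first byte of the call.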
Re: [PATCH][expmed] Properly account for the cost and latency of shift+add ops when synthesizing mults
On 20 April 2015 at 16:12, Kyrill Tkachov wrote:
> Thanks,
> I could've sworn I had sent this version out a couple hours ago.
> My mail client has been playing up.
>
> Here it is with 6 tests. For the tests corresponding to f1/f3 in my
> example above I scan that we don't use the 'w1' reg.
>
> I'll give the AArch64 maintainers to comment on the tests for a day or two
> before committing.

Using scan-assembler-times is more robust than scan-assembler.
Otherwise, OK by me.
/Marcus

> Thanks,
> Kyrill
>
> 2015-04-20 Kyrylo Tkachov
>
> * expmed.c: (synth_mult): Only assume overlapping
> shift with previous steps in alg_sub_t_m2 case.
>
> 2015-04-20 Kyrylo Tkachov
>
> * gcc.target/aarch64/mult-synth_1.c: New test.
> * gcc.target/aarch64/mult-synth_2.c: Likewise.
> * gcc.target/aarch64/mult-synth_3.c: Likewise.
> * gcc.target/aarch64/mult-synth_4.c: Likewise.
> * gcc.target/aarch64/mult-synth_5.c: Likewise.
> * gcc.target/aarch64/mult-synth_6.c: Likewise.
>>
>> jeff
Re: [Patch] pr65779 - [5/6 Regression] undefined local symbol on powerpc
On Tue, Apr 21, 2015 at 09:08:04PM +0930, Alan Modra wrote:
> +          if (DEBUG_INSN_P (dinsn)
> +              && insn_uses_reg (dinsn, dregno, end_dregno))
> +            {
> +              if (live_bbs.is_empty ())
> +                /* Put debug info for the insn we'll be moving
> +                   into the destination block.  */
> +                {
> +                  rtx_insn *newdinsn
> +                    = emit_debug_insn_after (copy_rtx (PATTERN (dinsn)),
> +                                             bb_note (bb));
> +                  df_insn_rescan (newdinsn);
> +                }

This isn't safe. There could be a debug_insn for the same decl anywhere
in between the dinsn and bb_note (bb) on the chosen live path; if there is,
this change will break stuff.

> +              /* If the insn is a simple reg-reg copy, then reset
> +                 the debug insn to point to src.  */
> +              if (REG_P (src) && GET_MODE (src) == GET_MODE (dest))
> +                {
> +                  INSN_VAR_LOCATION_LOC (dinsn)
> +                    = simplify_replace_rtx (INSN_VAR_LOCATION_LOC (dinsn),
> +                                            dest, src);
> +                  df_insn_rescan (dinsn);
> +                }
> +              else
> +                {
> +                  /* Otherwise remove anything about this variable.  */
> +                  INSN_VAR_LOCATION_LOC (dinsn)
> +                    = gen_rtx_UNKNOWN_VAR_LOC ();
> +                  df_insn_rescan_debug_internal (dinsn);
> +                }

This works (though the simplify_replace_rtx alone is dangerous, you'd
better use propagate_for_debug), but is unnecessarily limiting.
You could just insert a debug insn with a debug temp before the original
insn and replace all the uses of the reg with the debug temporary.
And, as you are walking all the bbs on the path insn by insn anyway,
supposedly you could instead use the valtrack APIs for that.
Thus, call
  dead_debug_local_init (&debug, NULL, NULL);
before walking the first bb, then call dead_debug_add on each
FOR_EACH_INSN_INFO_USE of the debug insns that overlaps the dest REG,
then dead_debug_insert_temp with DEBUG_TEMP_BEFORE_WITH_VALUE, and
finally dead_debug_local_finish.
Of course, all of this should be guarded with MAY_HAVE_DEBUG_INSNS.

Jakub
Re: [PATCH] Fix PR65650 (1/n in merging CCP and copyprop)
On Thu, 2 Apr 2015, Richard Biener wrote:

>
> The following makes CCP track copies which avoids pass ordering
> issues between CCP and copyprop as seen from the testcase.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu, queued for stage1.
>
> For stage1 I'd like to get rid of copyprop completely, a 2nd patch
> in the series will remove the copyprop instances immediately
> preceding/following CCP.
>
> CCP needs some TLC and I'm going to apply that during stage1.

With the propagator engine improvement this needs some extra
testcase adjustments.

Re-bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Note that the "obvious" improvement of ccp_lattice_meet to

  if (val1->lattice_val == UNDEFINED
      /* For UNDEFINED M SSA we can't always SSA because its definition
         may not dominate the PHI node.  */
      && (val2->lattice_val != CONSTANT
          || TREE_CODE (val2->value) != SSA_NAME
          || SSA_NAME_IS_DEFAULT_DEF (val2->value)
          || (gimple_bb (SSA_NAME_DEF_STMT (val2->value)) != where
              && dominated_by_p (CDI_DOMINATORS, where,
                                 gimple_bb (SSA_NAME_DEF_STMT (val2->value))

to enable optimistic copy propagation (yes, copyprop doesn't do that)
regresses quite a few gcc.dg/uninit-pred*.c testcases, so I'm at least
not enabling this with this patch. For example in
gcc.dg/uninit-pred-2_a.c:

int foo (int n, int m, int r)
{
  int flag = 0;
  int v;

  if (n)
    {
      v = r;
      flag = 1;
    }

  if (m)
    g++;
  else
    bar();

  /* Wrong guard */
  if (!flag)
    blah(v); /* { dg-warning "uninitialized" "real uninitialized var warning" } */

we see that we can optimistically propagate r into v for the call
due to the PHI v_3 = PHI and v being uninitialized on one path.

We have similar missed uninit warnings for optimistic constant
propagations already so I think optimistically propagating copies
isn't wrong. We might want to provide a flag to turn both off,
of course.

I'll send a followup enabling optimistic copyprop once I get my mind
around how to fix the testcases.

Richard.
2015-04-16 Richard Biener PR tree-optimization/65650 * tree-ssa-ccp.c (valid_lattice_transition): Allow lattice transitions involving copies. (set_lattice_value): Adjust for copy lattice state. (ccp_lattice_meet): Do not merge UNDEFINED and a copy to the copy if that doesn't dominate the merge point. (bit_value_unop): Adjust what we treat as varying mask. (bit_value_binop): Likewise. (bit_value_assume_aligned): Likewise. (evaluate_stmt): When we simplified to a SSA name record a copy instead of dropping to varying. (visit_assignment): Simplify. * gimple-match.h (gimple_simplify): Add another callback. * gimple-fold.c (fold_stmt_1): Adjust caller. (gimple_fold_stmt_to_constant_1): Likewise - pass valueize for the 2nd callback. * gimple-match-head.c (gimple_simplify): Add a callback that is used to valueize the stmt operands and use it that way. * gcc.dg/tree-ssa/ssa-ccp-37.c: New testcase. * gcc.dg/tree-ssa/forwprop-11.c: Adjust. * gcc.dg/tree-ssa/ssa-fre-3.c: Likewise. * gcc.dg/tree-ssa/ssa-fre-4.c: Likewise. * gcc.dg/tree-ssa/ssa-fre-5.c: Likewise. * gcc.dg/tree-ssa/ssa-fre-32.c: Likewise. Index: gcc/gimple-fold.c === *** gcc/gimple-fold.c (revision 66) --- gcc/gimple-fold.c (working copy) *** fold_stmt_1 (gimple_stmt_iterator *gsi, *** 3621,3627 gimple_seq seq = NULL; code_helper rcode; tree ops[3] = {}; ! if (gimple_simplify (stmt, &rcode, ops, inplace ? NULL : &seq, valueize)) { if (replace_stmt_with_simplification (gsi, rcode, ops, &seq, inplace)) changed = true; --- 3621,3628 gimple_seq seq = NULL; code_helper rcode; tree ops[3] = {}; ! if (gimple_simplify (stmt, &rcode, ops, inplace ? NULL : &seq, ! valueize, valueize)) { if (replace_stmt_with_simplification (gsi, rcode, ops, &seq, inplace)) changed = true; *** gimple_fold_stmt_to_constant_1 (gimple s *** 4928,4934 edges if there are intermediate VARYING defs. For this reason do not follow SSA edges here even though SCCVN can technically just deal fine with that. */ ! 
if (gimple_simplify (stmt, &rcode, ops, NULL, gvalueize) && rcode.is_tree_code () && (TREE_CODE_LENGTH ((tree_code) rcode) == 0 || ((tree_code) rcode) == ADDR_EXPR) --- 4929,4935 edges if there are intermediate VARYING defs. For this reason do not follow SSA edges here even though SCCVN can technically just deal fine with that. */ ! if (gimple_simplify (stmt, &rcod
Re: [PATCH][expmed] Properly account for the cost and latency of shift+add ops when synthesizing mults
On 21/04/15 13:46, Marcus Shawcroft wrote: On 20 April 2015 at 16:12, Kyrill Tkachov wrote: Thanks, I could've sworn I had sent this version out a couple hours ago. My mail client has been playing up. Here it is with 6 tests. For the tests corresponding to f1/f3 in my example above I scan that we don't use the 'w1' reg. I'll give the AArch64 maintainers to comment on the tests for a day or two before committing. Using scan-assembler-times is more robust than scan-assembler. Otherwise, OK by me. /Marcus Thanks, I used scan-assembler-times for those tests. Attached is what I committed with r68. Kyrill Thanks, Kyrill 2015-04-20 Kyrylo Tkachov * expmed.c: (synth_mult): Only assume overlapping shift with previous steps in alg_sub_t_m2 case. 2015-04-20 Kyrylo Tkachov * gcc.target/aarch64/mult-synth_1.c: New test. * gcc.target/aarch64/mult-synth_2.c: Likewise. * gcc.target/aarch64/mult-synth_3.c: Likewise. * gcc.target/aarch64/mult-synth_4.c: Likewise. * gcc.target/aarch64/mult-synth_5.c: Likewise. * gcc.target/aarch64/mult-synth_6.c: Likewise. jeff Index: gcc/ChangeLog === --- gcc/ChangeLog (revision 66) +++ gcc/ChangeLog (working copy) @@ -1,3 +1,8 @@ +2015-04-21 Kyrylo Tkachov + + * expmed.c: (synth_mult): Only assume overlapping + shift with previous steps in alg_sub_t_m2 case. 
+ 2015-04-21 Richard Biener PR tree-optimization/65788 Index: gcc/testsuite/gcc.target/aarch64/mult-synth_2.c === --- gcc/testsuite/gcc.target/aarch64/mult-synth_2.c (revision 0) +++ gcc/testsuite/gcc.target/aarch64/mult-synth_2.c (revision 0) @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mcpu=cortex-a57 -save-temps" } */ + +int +foo (int x) +{ + return x * 25; +} + +/* { dg-final { scan-assembler-times "mul\tw\[0-9\]+, w\[0-9\]+, w\[0-9\]+" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ Index: gcc/testsuite/gcc.target/aarch64/mult-synth_3.c === --- gcc/testsuite/gcc.target/aarch64/mult-synth_3.c (revision 0) +++ gcc/testsuite/gcc.target/aarch64/mult-synth_3.c (revision 0) @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mcpu=cortex-a57 -save-temps" } */ + +int +foo (int x) +{ + return x * 11; +} + +/* { dg-final { scan-assembler-times "mul\tw\[0-9\]+, w\[0-9\]+, w\[0-9\]+" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ Index: gcc/testsuite/gcc.target/aarch64/mult-synth_4.c === --- gcc/testsuite/gcc.target/aarch64/mult-synth_4.c (revision 0) +++ gcc/testsuite/gcc.target/aarch64/mult-synth_4.c (revision 0) @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mcpu=cortex-a57 -save-temps" } */ + +long +foo (int x, int y) +{ + return (long)x * 6L; +} + +/* { dg-final { scan-assembler-times "smull\tx\[0-9\]+, w\[0-9\]+, w\[0-9\]+" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ Index: gcc/testsuite/gcc.target/aarch64/mult-synth_5.c === --- gcc/testsuite/gcc.target/aarch64/mult-synth_5.c (revision 0) +++ gcc/testsuite/gcc.target/aarch64/mult-synth_5.c (revision 0) @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mcpu=cortex-a57 -save-temps" } */ + +int +foo (int x) +{ + return x * 10; +} + +/* { dg-final { scan-assembler-not "\tw1" } } */ +/* { dg-final { cleanup-saved-temps } } */ Index: gcc/testsuite/gcc.target/aarch64/mult-synth_6.c === --- gcc/testsuite/gcc.target/aarch64/mult-synth_6.c 
(revision 0) +++ gcc/testsuite/gcc.target/aarch64/mult-synth_6.c (revision 0) @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mcpu=cortex-a57 -save-temps" } */ + +int +foo (int x) +{ + return x * 20; +} + +/* { dg-final { scan-assembler-not "\tw1" } } */ +/* { dg-final { cleanup-saved-temps } } */ Index: gcc/testsuite/gcc.target/aarch64/mult-synth_1.c === --- gcc/testsuite/gcc.target/aarch64/mult-synth_1.c (revision 0) +++ gcc/testsuite/gcc.target/aarch64/mult-synth_1.c (revision 0) @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mcpu=cortex-a57 -save-temps" } */ + +int +foo (int x) +{ + return x * 100; +} + +/* { dg-final { scan-assembler-times "mul\tw\[0-9\]+, w\[0-9\]+, w\[0-9\]+" 1 } } */ +/* { dg-final { cleanup-saved-temps } } */ Index: gcc/testsuite/ChangeLog === --- gcc/testsuite/ChangeLog (revision 66) +++ gcc/testsuite/ChangeLog (working copy) @@ -1,3 +1,12 @@ +2015-04-21 Kyrylo Tkachov + + * gcc.target/aarch64/mult-synth_1.c: New test. + * gcc.target/aarch64/mult-synth_2.c: Likewise. + * gcc.target/aarch64/mult-synth_3.c: Likewi
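The mult-synth_5 and mult-synth_6 tests check that multiplying by 10 and by 20 does not touch 'w1', which is what a shift-and-add sequence gives since no constant has to be loaded into a second register. As a rough illustration in plain C, not the sequence expmed.c actually emits, and with hypothetical function names:

```c
#include <stdint.h>

/* x * 10 as (x + (x << 2)) << 1: one add of a shifted operand,
   then one shift.  Unsigned arithmetic keeps the shifts well defined
   and wraps the same way a 32-bit multiply would.  */
uint32_t
times10 (uint32_t x)
{
  return (x + (x << 2)) << 1;
}

/* x * 20 as (x + (x << 2)) << 2.  */
uint32_t
times20 (uint32_t x)
{
  return (x + (x << 2)) << 2;
}
```

For the constants in the other tests (100, 25, 11) the expected output is a single mul, presumably because the shift-and-add chain is no longer cheaper once its cost and latency are counted properly.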
Re: [patch,wwwdocs] Add gcc-5 caveats for avr.
Am 04/20/2015 um 09:02 PM schrieb Gerald Pfeifer: Hi Johann, On Mon, 20 Apr 2015, Georg-Johann Lay wrote: Okay to install? +The AVR port uses a new scheme to describe supported devices: +For each supported device the compiler provides a device-specific +http://gcc.gnu.org/onlinedocs/gcc/Spec-Files.html";>spec file. +If the compiler is used together with AVR-LibC, this requires at +least GCC 5.2 and a version of AVR-LibC which implements +http://savannah.nongnu.org/bugs/?44574";#44574. Can you please make the two links https-links? (Especially the one to gcc.gnu.org actually redirects.) Just using "#44574" for a reference, may that be a little confusing, or is it sufficiently clear to AVR users? + A new command option -nodevicelib has been added. "command-line option" +If this option is turned on the compiler won't link against AVR-LibC's +device-specific library libdevice.a by omitting +-ldevice from the linker's command line. How about making this "...-nodevicelib prevents the compiler from linking against"? +If the compiler had not been +http://gcc.gnu.org/install/configure.html";>configured +to be used with AVR-LibC, the compiler will not link against that +library and the option has no effect. "was not" (or "is") instead of "had not", and can you please use https here as well? Though, really, could this be just simplified to "If the compiler is not configured for use with AVR-LibC to begin with, this option has no effect"? Your patch is fine with the above changes or considering them and deciding not go for one or the other. Gerald Thanks for your support. The new entry also contains more topics. 
Index: gcc-5/changes.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v retrieving revision 1.109 diff -u -p -r1.109 changes.html --- gcc-5/changes.html 20 Apr 2015 08:22:35 - 1.109 +++ gcc-5/changes.html 21 Apr 2015 13:00:11 - @@ -28,6 +28,14 @@ is_trivially_default_constructible, is_trivially_copy_constructible and is_trivially_copy_assignable should be used instead. +On AVR, support has been added for the devices ATtiny4/5/9/10/20/40. +This requires Binutils 2.25 or newer. +The AVR port uses a new scheme to describe supported devices: +For each supported device the compiler provides a device-specific +https://gcc.gnu.org/onlinedocs/gcc/Spec-Files.html";>spec file. +If the compiler is used together with AVR-LibC, this requires at +least GCC 5.2 and a version of AVR-LibC which implements +https://savannah.nongnu.org/bugs/?44574";feature #44574. General Optimizer Improvements @@ -690,6 +698,57 @@ here. +AVR + + The compiler no more supports individual devices like ATmega8. +Specifying, say, -mmcu=atmega8 triggers the usage of the +device-specific +https://gcc.gnu.org/onlinedocs/gcc/Spec-Files.html";>spec file +specs-atmega8 which is part of the installation and describes +options for the sub-processes like compiler proper, assembler and linker. +You can add support for a new device -mmcu=mydevice as follows: + + In an empty directory /someplace, create a new + directory device-specs. + Copy a device spec file from the installed device-specs +folder, follow the comments in that file and then save it as +/someplace/device-specs/specs-mydevice. + Add -B /someplace -mmcu=mydevice to the +compiler's command-line options. Notice that /someplace +must specify an absolute path and that mydevice must +not start with "avr". + Provided you have a device-specific library +libmydevice.a available, you can put it at +/someplace, dito for a device-specific startup +file crtmydevice.o. 
+ +The contents of the device spec files depend on the compiler's +configuration, in particular on --with-avrlibc=no and +whether or not it is configured for RTEMS. + + A new command-line option -nodevicelib has been added. +It prevents the compiler from linking against AVR-LibC's +device-specific library libdevice.a. + The following three command-line options have been added: + + -mrmw + Set if the device supports the read-modify-write instructions +LAC, LAS, LAT +and XCH. + -mn-flash=size + Specify the flash size of the device in units of 64 KiB, +rounded up to the next integer as needed. This option affects the +availability of the +https://gcc.gnu.org/onlinedocs/gcc/Named-Address-Spaces.html";>AVR + address-spaces. + -mskip-bug + Set if the device is affected by the respective silicon bug. + +In general, you don't need to set these options by ha
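The description of -mn-flash=size gives the unit (64 KiB) and the rounding (up to the next integer). A small sketch of that arithmetic follows. The helper name n_flash_segments is hypothetical, and the changes entry does not say how the compiler itself derives the value.

```c
/* Number of 64 KiB flash segments needed to cover FLASH_BYTES,
   rounded up to the next integer.  */
unsigned int
n_flash_segments (unsigned long flash_bytes)
{
  const unsigned long segment = 64UL * 1024UL;

  return (unsigned int) ((flash_bytes + segment - 1) / segment);
}
```

Under this reading, a device with 8 KiB of flash would use 1 and a device with 128 KiB would use 2.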
[PATCH 01/12] add default definition of EH_RETURN_DATA_REGNO
From: Trevor Saunders gcc/ChangeLog: 2015-04-21 Trevor Saunders * defaults.h: New definition of EH_RETURN_DATA_REGNO. * except.c: Remove definition of EH_RETURN_DATA_REGNO. * builtins.c (expand_builtin): Remove check if EH_RETURN_DATA_REGNO is defined. * df-scan.c (df_bb_refs_collect): Likewise. (df_get_exit_block_use_set): Likewise. * haifa-sched.c (initiate_bb_reg_pressure_info): Likewise. * ira-lives.c (process_bb_node_lives): Likewise. * lra-lives.c (process_bb_lives): Likewise. --- gcc/builtins.c| 2 -- gcc/defaults.h| 6 ++ gcc/df-scan.c | 4 gcc/except.c | 6 -- gcc/haifa-sched.c | 2 -- gcc/ira-lives.c | 2 -- gcc/lra-lives.c | 2 -- 7 files changed, 6 insertions(+), 18 deletions(-) diff --git a/gcc/builtins.c b/gcc/builtins.c index 9263777..028d793 100644 --- a/gcc/builtins.c +++ b/gcc/builtins.c @@ -6510,10 +6510,8 @@ expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode, expand_builtin_eh_return (CALL_EXPR_ARG (exp, 0), CALL_EXPR_ARG (exp, 1)); return const0_rtx; -#ifdef EH_RETURN_DATA_REGNO case BUILT_IN_EH_RETURN_DATA_REGNO: return expand_builtin_eh_return_data_regno (exp); -#endif case BUILT_IN_EXTEND_POINTER: return expand_builtin_extend_pointer (CALL_EXPR_ARG (exp, 0)); case BUILT_IN_EH_POINTER: diff --git a/gcc/defaults.h b/gcc/defaults.h index 1d54798..911c2f8 100644 --- a/gcc/defaults.h +++ b/gcc/defaults.h @@ -377,6 +377,12 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see #endif #endif +/* Provide defaults for stuff that may not be defined when using + sjlj exceptions. */ +#ifndef EH_RETURN_DATA_REGNO +#define EH_RETURN_DATA_REGNO(N) INVALID_REGNUM +#endif + /* If we have named section and we support weak symbols, then use the .jcr section for recording java classes which need to be registered at program start-up time. 
*/ diff --git a/gcc/df-scan.c b/gcc/df-scan.c index 1700be9..b2e2e5d 100644 --- a/gcc/df-scan.c +++ b/gcc/df-scan.c @@ -3332,7 +3332,6 @@ df_bb_refs_collect (struct df_collection_rec *collection_rec, basic_block bb) return; } -#ifdef EH_RETURN_DATA_REGNO if (bb_has_eh_pred (bb)) { unsigned int i; @@ -3346,7 +3345,6 @@ df_bb_refs_collect (struct df_collection_rec *collection_rec, basic_block bb) bb, NULL, DF_REF_REG_DEF, DF_REF_AT_TOP); } } -#endif /* Add the hard_frame_pointer if this block is the target of a non-local goto. */ @@ -3751,7 +3749,6 @@ df_get_exit_block_use_set (bitmap exit_block_uses) bitmap_set_bit (exit_block_uses, i); } -#ifdef EH_RETURN_DATA_REGNO /* Mark the registers that will contain data for the handler. */ if (reload_completed && crtl->calls_eh_return) for (i = 0; ; ++i) @@ -3761,7 +3758,6 @@ df_get_exit_block_use_set (bitmap exit_block_uses) break; bitmap_set_bit (exit_block_uses, regno); } -#endif #ifdef EH_RETURN_STACKADJ_RTX if ((!HAVE_epilogue || ! epilogue_completed) diff --git a/gcc/except.c b/gcc/except.c index 833ec21..7573c88 100644 --- a/gcc/except.c +++ b/gcc/except.c @@ -174,12 +174,6 @@ along with GCC; see the file COPYING3. If not see #include "cfgloop.h" #include "builtins.h" -/* Provide defaults for stuff that may not be defined when using - sjlj exceptions. 
*/ -#ifndef EH_RETURN_DATA_REGNO -#define EH_RETURN_DATA_REGNO(N) INVALID_REGNUM -#endif - static GTY(()) int call_site_base; struct tree_hash_traits : default_hashmap_traits diff --git a/gcc/haifa-sched.c b/gcc/haifa-sched.c index ad2450b..d47cb8c 100644 --- a/gcc/haifa-sched.c +++ b/gcc/haifa-sched.c @@ -1070,7 +1070,6 @@ initiate_bb_reg_pressure_info (basic_block bb) if (NONDEBUG_INSN_P (insn)) setup_ref_regs (PATTERN (insn)); initiate_reg_pressure_info (df_get_live_in (bb)); -#ifdef EH_RETURN_DATA_REGNO if (bb_has_eh_pred (bb)) for (i = 0; ; ++i) { @@ -1082,7 +1081,6 @@ initiate_bb_reg_pressure_info (basic_block bb) mark_regno_birth_or_death (curr_reg_live, curr_reg_pressure, regno, true); } -#endif } /* Save current register pressure related info. */ diff --git a/gcc/ira-lives.c b/gcc/ira-lives.c index b29f572..2837349 100644 --- a/gcc/ira-lives.c +++ b/gcc/ira-lives.c @@ -1319,7 +1319,6 @@ process_bb_node_lives (ira_loop_tree_node_t loop_tree_node) curr_point++; } -#ifdef EH_RETURN_DATA_REGNO if (bb_has_eh_pred (bb)) for (j = 0; ; ++j) { @@ -1328,7 +1327,6 @@ process_bb_node_lives (ira_loop_tree_node_t loop_tree_node) break; make_hard_regno_born (regno); } -#endif /* Allocnos can't go in stack regs at the start of a basic block that is reached by an abnormal edge. Likewise for call diff
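The pattern in this patch is that a default in defaults.h lets every user of the macro compile unconditionally. A minimal standalone sketch of why the loops shown in the hunks stay correct without the #ifdef follows. INVALID_REGNUM is given a stand-in definition here, and count_eh_data_regs is a hypothetical function, not something in GCC.

```c
/* Stand-in for GCC's INVALID_REGNUM so the sketch is self-contained.  */
#define INVALID_REGNUM (~(unsigned int) 0)

/* Default as added to defaults.h by the patch: a target that does not
   define the macro reports "no such register" for every index.  */
#ifndef EH_RETURN_DATA_REGNO
#define EH_RETURN_DATA_REGNO(N) INVALID_REGNUM
#endif

/* Same loop shape as the code the patch un-ifdefs: walk the EH data
   registers until the macro returns INVALID_REGNUM.  With the default
   definition the loop exits on the first iteration.  */
unsigned int
count_eh_data_regs (void)
{
  unsigned int count = 0;

  for (unsigned int i = 0; ; ++i)
    {
      unsigned int regno = EH_RETURN_DATA_REGNO (i);
      if (regno == INVALID_REGNUM)
        break;
      count++;
    }

  return count;
}
```

Because the default makes the loop a no-op, removing the #ifdef does not change behaviour on targets that never defined the macro, while the code is now compiled and type-checked for every configuration.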
[PATCH 00/12] Reduce conditional compilation
From: Trevor Saunders

Hi,

This is a first round of patches to reduce the amount of code within
#if / #ifdef. This makes it incrementally easier to not break configs
other than the one being built, and moves things slightly closer to
using target hooks for everything.

Each commit was bootstrapped and regtested on x86_64-linux-gnu without
regression, and the whole patch set was run through config-list.mk without
issue. Ok?

Trevor Saunders (12):
  add default definition of EH_RETURN_DATA_REGNO
  remove some ifdef HAVE_cc0
  more HAVE_cc0
  always define HAVE_cc0
  make some HAVE_cc0 code always compiled
  provide default for RETURN_ADDR_OFFSET
  provide default for MASK_RETURN_ADDR
  reduce conditional compilation for HARD_FRAME_POINTER_IS_FRAME_POINTER
  remove #if for PIC_OFFSET_TABLE_REGNUM
  remove more ifdefs for HAVE_cc0
  provide default for INSN_SETS_ARE_DELAYED
  add default for INSN_REFERENCES_ARE_DELAYED

 gcc/alias.c           |  7 ++---
 gcc/builtins.c        |  2 --
 gcc/caller-save.c     |  4 +--
 gcc/cfgcleanup.c      | 26 +---
 gcc/cfgrtl.c          | 12 ++--
 gcc/combine.c         | 84 ++-
 gcc/conditions.h      |  6
 gcc/cprop.c           |  4 +--
 gcc/cse.c             | 22 +-
 gcc/defaults.h        | 23 ++
 gcc/df-problems.c     |  9 ++
 gcc/df-scan.c         | 46 +++-
 gcc/emit-rtl.c        |  8 ++---
 gcc/except.c          | 26 ++--
 gcc/final.c           | 43 --
 gcc/function.c        |  5 ++-
 gcc/gcse.c            | 24 ---
 gcc/genconfig.c       |  1 +
 gcc/haifa-sched.c     |  5 +--
 gcc/ira-lives.c       |  2 --
 gcc/ira.c             | 33 +---
 gcc/jump.c            |  3 --
 gcc/loop-invariant.c  |  4 +--
 gcc/lra-constraints.c |  6 ++--
 gcc/lra-lives.c       |  2 --
 gcc/optabs.c          |  2 +-
 gcc/postreload.c      |  4 +--
 gcc/recog.c           |  2 --
 gcc/recog.h           |  2 --
 gcc/reginfo.c         |  5 ++-
 gcc/regrename.c       |  5 ++-
 gcc/reload.c          | 12 +++-
 gcc/reload1.c         | 10 +++---
 gcc/reorg.c           | 68 ++---
 gcc/resource.c        | 15 +++--
 gcc/rtlanal.c         |  2 --
 gcc/sched-deps.c      |  5 +--
 gcc/sched-rgn.c       |  4 +--
 gcc/simplify-rtx.c    |  5 ++-
 39 files changed, 199 insertions(+), 349 deletions(-)

--
2.3.0.80.g18d0fec.dirty
[PATCH 02/12] remove some ifdef HAVE_cc0
From: Trevor Saunders gcc/ChangeLog: 2015-04-21 Trevor Saunders * conditions.h: Define macros even if HAVE_cc0 is undefined. * emit-rtl.c: Define functions even if HAVE_cc0 is undefined. * final.c: Likewise. * jump.c: Likewise. * recog.c: Likewise. * recog.h: Declare functions even when HAVE_cc0 is undefined. * sched-deps.c (sched_analyze_2): Always compile case for cc0. --- gcc/conditions.h | 6 -- gcc/emit-rtl.c | 2 -- gcc/final.c | 2 -- gcc/jump.c | 3 --- gcc/recog.c | 2 -- gcc/recog.h | 2 -- gcc/sched-deps.c | 5 +++-- 7 files changed, 3 insertions(+), 19 deletions(-) diff --git a/gcc/conditions.h b/gcc/conditions.h index 2308bfc..7cd1e1c 100644 --- a/gcc/conditions.h +++ b/gcc/conditions.h @@ -20,10 +20,6 @@ along with GCC; see the file COPYING3. If not see #ifndef GCC_CONDITIONS_H #define GCC_CONDITIONS_H -/* None of the things in the files exist if we don't use CC0. */ - -#ifdef HAVE_cc0 - /* The variable cc_status says how to interpret the condition code. It is set by output routines for an instruction that sets the cc's and examined by output routines for jump instructions. @@ -117,6 +113,4 @@ extern CC_STATUS cc_status; (cc_status.flags = 0, cc_status.value1 = 0, cc_status.value2 = 0, \ CC_STATUS_MDEP_INIT) -#endif - #endif /* GCC_CONDITIONS_H */ diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c index 483eacb..c1974bb 100644 --- a/gcc/emit-rtl.c +++ b/gcc/emit-rtl.c @@ -3541,7 +3541,6 @@ prev_active_insn (rtx uncast_insn) return insn; } -#ifdef HAVE_cc0 /* Return the next insn that uses CC0 after INSN, which is assumed to set it. This is the inverse of prev_cc0_setter (i.e., prev_cc0_setter applied to the result of this function should yield INSN). @@ -3589,7 +3588,6 @@ prev_cc0_setter (rtx uncast_insn) return insn; } -#endif #ifdef AUTO_INC_DEC /* Find a RTX_AUTOINC class rtx which matches DATA. 
*/ diff --git a/gcc/final.c b/gcc/final.c index 1fa93d9..41f6bd9 100644 --- a/gcc/final.c +++ b/gcc/final.c @@ -191,7 +191,6 @@ static rtx last_ignored_compare = 0; static int insn_counter = 0; -#ifdef HAVE_cc0 /* This variable contains machine-dependent flags (defined in tm.h) set and examined by output routines that describe how to interpret the condition codes properly. */ @@ -202,7 +201,6 @@ CC_STATUS cc_status; from before the insn. */ CC_STATUS cc_prev_status; -#endif /* Number of unmatched NOTE_INSN_BLOCK_BEG notes we have seen. */ diff --git a/gcc/jump.c b/gcc/jump.c index 34b3b7b..bc91550 100644 --- a/gcc/jump.c +++ b/gcc/jump.c @@ -1044,8 +1044,6 @@ jump_to_label_p (const rtx_insn *insn) && JUMP_LABEL (insn) != NULL && !ANY_RETURN_P (JUMP_LABEL (insn))); } -#ifdef HAVE_cc0 - /* Return nonzero if X is an RTX that only sets the condition codes and has no side effects. */ @@ -1094,7 +1092,6 @@ sets_cc0_p (const_rtx x) } return 0; } -#endif /* Find all CODE_LABELs referred to in X, and increment their use counts. If INSN is a JUMP_INSN and there is at least one diff --git a/gcc/recog.c b/gcc/recog.c index a9d3b1f..c3ad86f 100644 --- a/gcc/recog.c +++ b/gcc/recog.c @@ -971,7 +971,6 @@ validate_simplify_insn (rtx insn) return ((num_changes_pending () > 0) && (apply_change_group () > 0)); } -#ifdef HAVE_cc0 /* Return 1 if the insn using CC0 set by INSN does not contain any ordered tests applied to the condition codes. EQ and NE tests do not count. */ @@ -988,7 +987,6 @@ next_insn_tests_no_inequality (rtx insn) return (INSN_P (next) && ! inequality_comparisons_p (PATTERN (next))); } -#endif /* Return 1 if OP is a valid general operand for machine mode MODE. 
This is either a register reference, a memory reference, diff --git a/gcc/recog.h b/gcc/recog.h index 45ea671..8a38b26 100644 --- a/gcc/recog.h +++ b/gcc/recog.h @@ -112,9 +112,7 @@ extern void validate_replace_rtx_group (rtx, rtx, rtx); extern void validate_replace_src_group (rtx, rtx, rtx); extern bool validate_simplify_insn (rtx insn); extern int num_changes_pending (void); -#ifdef HAVE_cc0 extern int next_insn_tests_no_inequality (rtx); -#endif extern bool reg_fits_class_p (const_rtx, reg_class_t, int, machine_mode); extern int offsettable_memref_p (rtx); diff --git a/gcc/sched-deps.c b/gcc/sched-deps.c index 5434831..31de6be 100644 --- a/gcc/sched-deps.c +++ b/gcc/sched-deps.c @@ -2608,8 +2608,10 @@ sched_analyze_2 (struct deps_desc *deps, rtx x, rtx_insn *insn) return; -#ifdef HAVE_cc0 case CC0: +#ifdef HAVE_cc0 + gcc_unreachable (); +#endif /* User of CC0 depends on immediately preceding insn. */ SCHED_GROUP_P (insn) = 1; /* Don't move CC0 setter to another block (it can set up the @@ -2620,7 +2622,6 @@ sched_analyze_2 (struct deps_desc *deps, rtx x, rtx_insn *insn) sched_deps_info->finish_rhs (); return; -#endif case R
[PATCH 05/12] make some HAVE_cc0 code always compiled
From: Trevor Saunders gcc/ChangeLog: 2015-04-21 Trevor Saunders * cfgrtl.c (rtl_merge_blocks): Change #if HAVE_cc0 to if (HAVE_cc0) (try_redirect_by_replacing_jump): Likewise. (rtl_tidy_fallthru_edge): Likewise. * combine.c (insn_a_feeds_b): Likewise. (find_split_point): Likewise. (simplify_set): Likewise. * cprop.c (cprop_jump): Likewise. * cse.c (cse_extended_basic_block): Likewise. * df-problems.c (can_move_insns_across): Likewise. * function.c (emit_use_return_register_into_block): Likewise. * haifa-sched.c (sched_init): Likewise. * ira.c (find_moveable_pseudos): Likewise. * loop-invariant.c (find_invariant_insn): Likewise. * lra-constraints.c (curr_insn_transform): Likewise. * postreload.c (reload_combine_recognize_const_pattern): * Likewise. * reload.c (find_reloads): Likewise. * reorg.c (delete_scheduled_jump): Likewise. (steal_delay_list_from_target): Likewise. (steal_delay_list_from_fallthrough): Likewise. (redundant_insn): Likewise. (fill_simple_delay_slots): Likewise. (fill_slots_from_thread): Likewise. (delete_computation): Likewise. * sched-rgn.c (add_branch_dependences): Likewise. --- gcc/cfgrtl.c | 12 +++- gcc/combine.c | 10 ++ gcc/cprop.c | 4 +--- gcc/cse.c | 4 +--- gcc/df-problems.c | 4 +--- gcc/function.c| 5 ++--- gcc/haifa-sched.c | 3 +-- gcc/ira.c | 5 ++--- gcc/loop-invariant.c | 4 +--- gcc/lra-constraints.c | 6 ++ gcc/postreload.c | 4 +--- gcc/reload.c | 10 +++--- gcc/reorg.c | 32 gcc/sched-rgn.c | 4 +--- 14 files changed, 29 insertions(+), 78 deletions(-) diff --git a/gcc/cfgrtl.c b/gcc/cfgrtl.c index 4c1708f..d93a49e 100644 --- a/gcc/cfgrtl.c +++ b/gcc/cfgrtl.c @@ -893,10 +893,9 @@ rtl_merge_blocks (basic_block a, basic_block b) del_first = a_end; -#if HAVE_cc0 /* If this was a conditional jump, we need to also delete the insn that set cc0. 
*/ - if (only_sets_cc0_p (prev)) + if (HAVE_cc0 && only_sets_cc0_p (prev)) { rtx_insn *tmp = prev; @@ -905,7 +904,6 @@ rtl_merge_blocks (basic_block a, basic_block b) prev = BB_HEAD (a); del_first = tmp; } -#endif a_end = PREV_INSN (del_first); } @@ -1064,11 +1062,9 @@ try_redirect_by_replacing_jump (edge e, basic_block target, bool in_cfglayout) /* In case we zap a conditional jump, we'll need to kill the cc0 setter too. */ kill_from = insn; -#if HAVE_cc0 - if (reg_mentioned_p (cc0_rtx, PATTERN (insn)) + if (HAVE_cc0 && reg_mentioned_p (cc0_rtx, PATTERN (insn)) && only_sets_cc0_p (PREV_INSN (insn))) kill_from = PREV_INSN (insn); -#endif /* See if we can create the fallthru edge. */ if (in_cfglayout || can_fallthru (src, target)) @@ -1825,12 +1821,10 @@ rtl_tidy_fallthru_edge (edge e) delete_insn (table); } -#if HAVE_cc0 /* If this was a conditional jump, we need to also delete the insn that set cc0. */ - if (any_condjump_p (q) && only_sets_cc0_p (PREV_INSN (q))) + if (HAVE_cc0 && any_condjump_p (q) && only_sets_cc0_p (PREV_INSN (q))) q = PREV_INSN (q); -#endif q = PREV_INSN (q); } diff --git a/gcc/combine.c b/gcc/combine.c index 430084e..d71f863 100644 --- a/gcc/combine.c +++ b/gcc/combine.c @@ -1141,10 +1141,8 @@ insn_a_feeds_b (rtx_insn *a, rtx_insn *b) FOR_EACH_LOG_LINK (links, b) if (links->insn == a) return true; -#if HAVE_cc0 - if (sets_cc0_p (a)) + if (HAVE_cc0 && sets_cc0_p (a)) return true; -#endif return false; } @@ -4816,7 +4814,6 @@ find_split_point (rtx *loc, rtx_insn *insn, bool set_src) break; case SET: -#if HAVE_cc0 /* If SET_DEST is CC0 and SET_SRC is not an operand, a COMPARE, or a ZERO_EXTRACT, the most likely reason why this doesn't match is that we need to put the operand into a register. So split at that @@ -4829,7 +4826,6 @@ find_split_point (rtx *loc, rtx_insn *insn, bool set_src) && ! (GET_CODE (SET_SRC (x)) == SUBREG && OBJECT_P (SUBREG_REG (SET_SRC (x) return &SET_SRC (x); -#endif /* See if we can split SET_SRC as it stands. 
*/ split = find_split_point (&SET_SRC (x), insn, true); @@ -6582,13 +6578,12 @@ simplify_set (rtx x) else compare_mode = SELECT_CC_MODE (new_code, op0, op1); -#if !HAVE_cc0 /* If the mode changed, we have to change SET_DEST, the mode in the compare, and the mode in the place SET_DEST is used. If SET_DEST is a hard register, just build new versions with the proper mode. If it is a pseudo, we lose unless it is only time we set the pseudo, in
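The conversion in this patch only works because HAVE_cc0 is always defined, so an ordinary C condition can stand in for the preprocessor test. A minimal standalone sketch of the idea, assuming a 0/1 feature macro; HAVE_FEATURE, only_sets_flag_p and insns_to_delete are hypothetical names and none of this is GCC code:

```c
/* Hypothetical stand-in for HAVE_cc0: always defined, to 0 or 1.  */
#ifndef HAVE_FEATURE
#define HAVE_FEATURE 0
#endif

/* Hypothetical predicate.  It is parsed and type-checked on every
   target, even where HAVE_FEATURE is 0.  */
static int
only_sets_flag_p (int insn)
{
  return insn < 0;
}

/* Return how many insns to delete when removing a jump: the jump
   itself, plus the preceding flag setter when the feature exists.  */
static int
insns_to_delete (int prev_insn)
{
  /* Old style: "#if HAVE_FEATURE ... #endif" around this test.
     New style: a constant condition, so the compiler can drop the
     dead branch after it has checked it.  */
  if (HAVE_FEATURE && only_sets_flag_p (prev_insn))
    return 2;
  return 1;
}
```

With HAVE_FEATURE defined to 0 the guarded branch is never taken, but a typo inside it would still break the build on every target, which is presumably the point of the change.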
[PATCH 04/12] always define HAVE_cc0
From: Trevor Saunders gcc/ChangeLog: 2015-04-21 Trevor Saunders * genconfig.c (main): Always define HAVE_cc0. * caller-save.c (insert_one_insn): Change ifdef HAVE_cc0 to #if HAVE_cc0. * cfgcleanup.c (flow_find_cross_jump): Likewise. (flow_find_head_matching_sequence): Likewise. (try_head_merge_bb): Likewise. * cfgrtl.c (rtl_merge_blocks): Likewise. (try_redirect_by_replacing_jump): Likewise. (rtl_tidy_fallthru_edge): Likewise. * combine.c (do_SUBST_MODE): Likewise. (insn_a_feeds_b): Likewise. (combine_instructions): Likewise. (can_combine_p): Likewise. (try_combine): Likewise. (find_split_point): Likewise. (subst): Likewise. (simplify_set): Likewise. (distribute_notes): Likewise. * cprop.c (cprop_jump): Likewise. * cse.c (cse_extended_basic_block): Likewise. * df-problems.c (can_move_insns_across): Likewise. * final.c (final): Likewise. (final_scan_insn): Likewise. * function.c (emit_use_return_register_into_block): Likewise. * gcse.c (insert_insn_end_basic_block): Likewise. * haifa-sched.c (sched_init): Likewise. * ira.c (find_moveable_pseudos): Likewise. * loop-invariant.c (find_invariant_insn): Likewise. * lra-constraints.c (curr_insn_transform): Likewise. * optabs.c (prepare_cmp_insn): Likewise. * postreload.c (reload_combine_recognize_const_pattern): * Likewise. * reload.c (find_reloads): Likewise. (find_reloads_address_1): Likewise. * reorg.c (delete_scheduled_jump): Likewise. (steal_delay_list_from_target): Likewise. (steal_delay_list_from_fallthrough): Likewise. (try_merge_delay_insns): Likewise. (redundant_insn): Likewise. (fill_simple_delay_slots): Likewise. (fill_slots_from_thread): Likewise. (delete_computation): Likewise. (relax_delay_slots): Likewise. * sched-deps.c (sched_analyze_2): Likewise. * sched-rgn.c (add_branch_dependences): Likewise. 
--- gcc/caller-save.c | 2 +- gcc/cfgcleanup.c | 12 ++-- gcc/cfgrtl.c | 6 +++--- gcc/combine.c | 36 ++-- gcc/cprop.c | 2 +- gcc/cse.c | 2 +- gcc/df-problems.c | 4 ++-- gcc/final.c | 14 +++--- gcc/function.c| 2 +- gcc/gcse.c| 2 +- gcc/genconfig.c | 1 + gcc/haifa-sched.c | 2 +- gcc/ira.c | 4 ++-- gcc/loop-invariant.c | 2 +- gcc/lra-constraints.c | 2 +- gcc/optabs.c | 2 +- gcc/postreload.c | 2 +- gcc/reload.c | 6 +++--- gcc/reorg.c | 30 +++--- gcc/sched-deps.c | 2 +- gcc/sched-rgn.c | 2 +- 21 files changed, 69 insertions(+), 68 deletions(-) diff --git a/gcc/caller-save.c b/gcc/caller-save.c index 3b01941..fc575eb 100644 --- a/gcc/caller-save.c +++ b/gcc/caller-save.c @@ -1400,7 +1400,7 @@ insert_one_insn (struct insn_chain *chain, int before_p, int code, rtx pat) rtx_insn *insn = chain->insn; struct insn_chain *new_chain; -#ifdef HAVE_cc0 +#if HAVE_cc0 /* If INSN references CC0, put our insns in front of the insn that sets CC0. This is always safe, since the only way we could be passed an insn that references CC0 is for a restore, and doing a restore earlier diff --git a/gcc/cfgcleanup.c b/gcc/cfgcleanup.c index cee152e..58d235e 100644 --- a/gcc/cfgcleanup.c +++ b/gcc/cfgcleanup.c @@ -1416,7 +1416,7 @@ flow_find_cross_jump (basic_block bb1, basic_block bb2, rtx_insn **f1, i2 = PREV_INSN (i2); } -#ifdef HAVE_cc0 +#if HAVE_cc0 /* Don't allow the insn after a compare to be shared by cross-jumping unless the compare is also shared. */ if (ninsns && reg_mentioned_p (cc0_rtx, last1) && ! sets_cc0_p (last1)) @@ -1539,7 +1539,7 @@ flow_find_head_matching_sequence (basic_block bb1, basic_block bb2, rtx_insn **f i2 = NEXT_INSN (i2); } -#ifdef HAVE_cc0 +#if HAVE_cc0 /* Don't allow a compare to be shared by cross-jumping unless the insn after the compare is also shared. 
*/ if (ninsns && reg_mentioned_p (cc0_rtx, last1) && sets_cc0_p (last1)) @@ -2330,7 +2330,7 @@ try_head_merge_bb (basic_block bb) cond = get_condition (jump, &move_before, true, false); if (cond == NULL_RTX) { -#ifdef HAVE_cc0 +#if HAVE_cc0 if (reg_mentioned_p (cc0_rtx, jump)) move_before = prev_nonnote_nondebug_insn (jump); else @@ -2499,7 +2499,7 @@ try_head_merge_bb (basic_block bb) cond = get_condition (jump, &move_before, true, false); if (cond == NULL_RTX) { -#ifdef HAVE_cc0 +#if HAVE_cc0 if (reg_mentioned_p (cc0_rtx, jump)) move_before = prev_nonnote_nondebug_insn (jump); else @@ -2522,7 +2522,7 @@ try_head_merge_bb
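Once genconfig always defines HAVE_cc0, every "#ifdef HAVE_cc0" has to become "#if HAVE_cc0", because "#ifdef" only asks whether the macro exists. A small sketch of the difference, assuming the macro is defined to 0 on targets without the feature; HAVE_FEATURE and both function names are hypothetical:

```c
/* Hypothetical feature macro, always defined; 0 means "absent".  */
#define HAVE_FEATURE 0

/* With #ifdef this returns 1 even though the feature is absent,
   because the macro is defined (to 0).  */
static int
feature_seen_by_ifdef (void)
{
#ifdef HAVE_FEATURE
  return 1;
#else
  return 0;
#endif
}

/* With #if the value of the macro is tested, so this returns 0.  */
static int
feature_seen_by_if (void)
{
#if HAVE_FEATURE
  return 1;
#else
  return 0;
#endif
}
```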
[PATCH 03/12] more removal of ifdef HAVE_cc0
From: Trevor Saunders gcc/ChangeLog: 2015-04-21 Trevor Saunders * combine.c (find_single_use): Remove HAVE_cc0 ifdef for code that is trivially ded on non cc0 targets. (simplify_set): Likewise. (mark_used_regs_combine): Likewise. * cse.c (new_basic_block): Likewise. (fold_rtx): Likewise. (cse_insn): Likewise. (cse_extended_basic_block): Likewise. (set_live_p): Likewise. * rtlanal.c (canonicalize_condition): Likewise. * simplify-rtx.c (simplify_binary_operation_1): Likewise. --- gcc/combine.c | 6 -- gcc/cse.c | 18 -- gcc/rtlanal.c | 2 -- gcc/simplify-rtx.c | 5 ++--- 4 files changed, 2 insertions(+), 29 deletions(-) diff --git a/gcc/combine.c b/gcc/combine.c index 46cd6db..0a35b8f 100644 --- a/gcc/combine.c +++ b/gcc/combine.c @@ -686,7 +686,6 @@ find_single_use (rtx dest, rtx_insn *insn, rtx_insn **ploc) rtx *result; struct insn_link *link; -#ifdef HAVE_cc0 if (dest == cc0_rtx) { next = NEXT_INSN (insn); @@ -699,7 +698,6 @@ find_single_use (rtx dest, rtx_insn *insn, rtx_insn **ploc) *ploc = next; return result; } -#endif if (!REG_P (dest)) return 0; @@ -6724,7 +6722,6 @@ simplify_set (rtx x) src = SET_SRC (x), dest = SET_DEST (x); } -#ifdef HAVE_cc0 /* If we have (set (cc0) (subreg ...)), we try to remove the subreg in SRC. */ if (dest == cc0_rtx @@ -6744,7 +6741,6 @@ simplify_set (rtx x) src = SET_SRC (x); } } -#endif #ifdef LOAD_EXTEND_OP /* If we have (set FOO (subreg:M (mem:N BAR) 0)) with M wider than N, this @@ -13193,11 +13189,9 @@ mark_used_regs_combine (rtx x) case ADDR_VEC: case ADDR_DIFF_VEC: case ASM_INPUT: -#ifdef HAVE_cc0 /* CC0 must die in the insn after it is set, so we don't need to take special note of it here. */ case CC0: -#endif return; case CLOBBER: diff --git a/gcc/cse.c b/gcc/cse.c index 2a33827..d184d27 100644 --- a/gcc/cse.c +++ b/gcc/cse.c @@ -281,7 +281,6 @@ struct qty_table_elem /* The table of all qtys, indexed by qty number. 
*/ static struct qty_table_elem *qty_table; -#ifdef HAVE_cc0 /* For machines that have a CC0, we do not record its value in the hash table since its use is guaranteed to be the insn immediately following its definition and any other insn is presumed to invalidate it. @@ -293,7 +292,6 @@ static struct qty_table_elem *qty_table; static rtx this_insn_cc0, prev_insn_cc0; static machine_mode this_insn_cc0_mode, prev_insn_cc0_mode; -#endif /* Insn being scanned. */ @@ -884,9 +882,7 @@ new_basic_block (void) } } -#ifdef HAVE_cc0 prev_insn_cc0 = 0; -#endif } /* Say that register REG contains a quantity in mode MODE not in any @@ -3166,10 +3162,8 @@ fold_rtx (rtx x, rtx_insn *insn) case EXPR_LIST: return x; -#ifdef HAVE_cc0 case CC0: return prev_insn_cc0; -#endif case ASM_OPERANDS: if (insn) @@ -3223,7 +3217,6 @@ fold_rtx (rtx x, rtx_insn *insn) const_arg = folded_arg; break; -#ifdef HAVE_cc0 case CC0: /* The cc0-user and cc0-setter may be in different blocks if the cc0-setter potentially traps. In that case PREV_INSN_CC0 @@ -3247,7 +3240,6 @@ fold_rtx (rtx x, rtx_insn *insn) const_arg = equiv_constant (folded_arg); } break; -#endif default: folded_arg = fold_rtx (folded_arg, insn); @@ -4522,11 +4514,9 @@ cse_insn (rtx_insn *insn) sets = XALLOCAVEC (struct set, XVECLEN (x, 0)); this_insn = insn; -#ifdef HAVE_cc0 /* Records what this insn does to set CC0. */ this_insn_cc0 = 0; this_insn_cc0_mode = VOIDmode; -#endif /* Find all regs explicitly clobbered in this insn, to ensure they are not replaced with any other regs @@ -5541,7 +5531,6 @@ cse_insn (rtx_insn *insn) } } -#ifdef HAVE_cc0 /* If setting CC0, record what it was set to, or a constant, if it is equivalent to a constant. If it is being set to a floating-point value, make a COMPARE with the appropriate constant of 0. 
If we @@ -5556,7 +5545,6 @@ cse_insn (rtx_insn *insn) this_insn_cc0 = gen_rtx_COMPARE (VOIDmode, this_insn_cc0, CONST0_RTX (mode)); } -#endif } /* Now enter all non-volatile source expressions in the hash table @@ -6604,11 +6592,9 @@ cse_extended_basic_block (struct cse_basic_block_data *ebb_data) record_jump_equiv (insn, taken); } -#ifdef HAVE_cc0 /* Clear the CC0-tracking related insns, they can't provide useful information across basic block boundaries. */ prev_insn_cc0 = 0; -#endif } gcc_assert (next_qty <= max_qty); @@ -6859,21 +6845,17 @@ static bool set_live_p (rtx set, rtx_in
[PATCH 06/12] provide default for RETURN_ADDR_OFFSET
From: Trevor Saunders

gcc/ChangeLog:

2015-04-21  Trevor Saunders

	* defaults.h (RETURN_ADDR_OFFSET): New definition.
	* except.c (expand_builtin_extract_return_addr): Remove ifdef
	RETURN_ADDR_OFFSET.
	(expand_builtin_frob_return_addr): Likewise.
---
 gcc/defaults.h |  5 +++++
 gcc/except.c   | 14 +++++++-------
 2 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/gcc/defaults.h b/gcc/defaults.h
index 911c2f8..767901a 100644
--- a/gcc/defaults.h
+++ b/gcc/defaults.h
@@ -383,6 +383,11 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
 #define EH_RETURN_DATA_REGNO(N) INVALID_REGNUM
 #endif
 
+/* Offset between the eh handler address and entry in eh tables. */
+#ifndef RETURN_ADDR_OFFSET
+#define RETURN_ADDR_OFFSET 0
+#endif
+
 /* If we have named section and we support weak symbols, then use the
    .jcr section for recording java classes which need to be registered
    at program start-up time. */
diff --git a/gcc/except.c b/gcc/except.c
index 7573c88..c98163d 100644
--- a/gcc/except.c
+++ b/gcc/except.c
@@ -2189,9 +2189,8 @@ expand_builtin_extract_return_addr (tree addr_tree)
 #endif
 
   /* Then adjust to find the real return address. */
-#if defined (RETURN_ADDR_OFFSET)
-  addr = plus_constant (Pmode, addr, RETURN_ADDR_OFFSET);
-#endif
+  if (RETURN_ADDR_OFFSET)
+    addr = plus_constant (Pmode, addr, RETURN_ADDR_OFFSET);
 
   return addr;
 }
@@ -2207,10 +2206,11 @@ expand_builtin_frob_return_addr (tree addr_tree)
 
   addr = convert_memory_address (Pmode, addr);
 
-#ifdef RETURN_ADDR_OFFSET
-  addr = force_reg (Pmode, addr);
-  addr = plus_constant (Pmode, addr, -RETURN_ADDR_OFFSET);
-#endif
+  if (RETURN_ADDR_OFFSET)
+    {
+      addr = force_reg (Pmode, addr);
+      addr = plus_constant (Pmode, addr, -RETURN_ADDR_OFFSET);
+    }
 
   return addr;
 }
-- 
2.3.0.80.g18d0fec.dirty
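The pattern in this patch, a default of 0 in defaults.h plus an ordinary "if" at the use sites, can be tried outside GCC. The sketch below is only an illustration under assumptions: addresses are modelled as plain long values instead of RTL, and extract_return_addr and frob_return_addr are hypothetical stand-ins for the two except.c functions.

```c
/* Default in the style of the defaults.h hunk: targets that do not
   define the offset get 0.  */
#ifndef RETURN_ADDR_OFFSET
#define RETURN_ADDR_OFFSET 0
#endif

/* Hypothetical stand-in for expand_builtin_extract_return_addr,
   working on plain integers instead of RTL.  */
static long
extract_return_addr (long addr)
{
  if (RETURN_ADDR_OFFSET)
    addr += RETURN_ADDR_OFFSET;
  return addr;
}

/* Hypothetical stand-in for expand_builtin_frob_return_addr: the
   inverse adjustment.  */
static long
frob_return_addr (long addr)
{
  if (RETURN_ADDR_OFFSET)
    addr -= RETURN_ADDR_OFFSET;
  return addr;
}
```

With the default of 0 both functions return their argument unchanged; a target that defines a nonzero offset gets the adjustment without any #ifdef at the use site.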
[PATCH 07/12] provide default for MASK_RETURN_ADDR
From: Trevor Saunders

gcc/ChangeLog:

2015-04-21  Trevor Saunders

	* defaults.h (MASK_RETURN_ADDR): New definition.
	* except.c (expand_builtin_extract_return_addr): Remove ifdef
	MASK_RETURN_ADDR.
---
 gcc/defaults.h | 4 ++++
 gcc/except.c   | 6 +++---
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/gcc/defaults.h b/gcc/defaults.h
index 767901a..843d7e2 100644
--- a/gcc/defaults.h
+++ b/gcc/defaults.h
@@ -388,6 +388,10 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
 #define RETURN_ADDR_OFFSET 0
 #endif
 
+#ifndef MASK_RETURN_ADDR
+#define MASK_RETURN_ADDR NULL_RTX
+#endif
+
 /* If we have named section and we support weak symbols, then use the
    .jcr section for recording java classes which need to be registered
    at program start-up time. */
diff --git a/gcc/except.c b/gcc/except.c
index c98163d..5b24006 100644
--- a/gcc/except.c
+++ b/gcc/except.c
@@ -2184,9 +2184,9 @@ expand_builtin_extract_return_addr (tree addr_tree)
     }
 
   /* First mask out any unwanted bits. */
-#ifdef MASK_RETURN_ADDR
-  expand_and (Pmode, addr, MASK_RETURN_ADDR, addr);
-#endif
+  rtx mask = MASK_RETURN_ADDR;
+  if (mask)
+    expand_and (Pmode, addr, mask, addr);
 
   /* Then adjust to find the real return address. */
   if (RETURN_ADDR_OFFSET)
-- 
2.3.0.80.g18d0fec.dirty
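Here the default is a null value rather than 0, and the use site copies the macro into a local before testing it. A rough sketch of that shape, assuming a plain pointer in place of an rtx; mask_return_addr is a hypothetical name and the reason given in the comment is a guess, not something stated in the patch:

```c
#include <stddef.h>

/* Default in the style of the defaults.h hunk, with a null pointer
   standing in for NULL_RTX.  */
#ifndef MASK_RETURN_ADDR
#define MASK_RETURN_ADDR ((const unsigned long *) NULL)
#endif

/* Hypothetical stand-in for the masking step in
   expand_builtin_extract_return_addr.  */
static unsigned long
mask_return_addr (unsigned long addr)
{
  /* Expand the macro once.  Assumption: a target definition may be an
     expression that is not free to evaluate twice.  */
  const unsigned long *mask = MASK_RETURN_ADDR;
  if (mask)
    addr &= *mask;
  return addr;
}
```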
[PATCH 08/12] reduce conditional compilation for HARD_FRAME_POINTER_IS_FRAME_POINTER
From: Trevor Saunders gcc/ChangeLog: 2015-04-21 Trevor Saunders * alias.c (init_alias_target): Remove ifdef * HARD_FRAME_POINTER_IS_FRAME_POINTER. * df-scan.c (df_insn_refs_collect): Likewise. (df_get_regular_block_artificial_uses): Likewise. (df_get_eh_block_artificial_uses): Likewise. (df_get_entry_block_def_set): Likewise. (df_get_exit_block_use_set): Likewise. * emit-rtl.c (gen_rtx_REG): Likewise. * ira.c (ira_setup_eliminable_regset): Likewise. * reginfo.c (init_reg_sets_1): Likewise. * regrename.c (rename_chains): Likewise. * reload1.c (reload): Likewise. (eliminate_regs_in_insn): Likewise. * resource.c (mark_referenced_resources): Likewise. (init_resource_info): Likewise. --- gcc/alias.c | 7 +++ gcc/df-scan.c | 35 +-- gcc/emit-rtl.c | 6 +++--- gcc/ira.c | 23 --- gcc/reginfo.c | 5 ++--- gcc/regrename.c | 5 ++--- gcc/reload1.c | 10 -- gcc/resource.c | 11 +-- 8 files changed, 48 insertions(+), 54 deletions(-) diff --git a/gcc/alias.c b/gcc/alias.c index a7160f3..8f48660 100644 --- a/gcc/alias.c +++ b/gcc/alias.c @@ -2765,10 +2765,9 @@ init_alias_target (void) = unique_base_value (UNIQUE_BASE_VALUE_ARGP); static_reg_base_value[FRAME_POINTER_REGNUM] = unique_base_value (UNIQUE_BASE_VALUE_FP); -#if !HARD_FRAME_POINTER_IS_FRAME_POINTER - static_reg_base_value[HARD_FRAME_POINTER_REGNUM] -= unique_base_value (UNIQUE_BASE_VALUE_HFP); -#endif + if (!HARD_FRAME_POINTER_IS_FRAME_POINTER) +static_reg_base_value[HARD_FRAME_POINTER_REGNUM] + = unique_base_value (UNIQUE_BASE_VALUE_HFP); } /* Set MEMORY_MODIFIED when X modifies DATA (that is assumed diff --git a/gcc/df-scan.c b/gcc/df-scan.c index b2e2e5d..69332a8 100644 --- a/gcc/df-scan.c +++ b/gcc/df-scan.c @@ -3247,12 +3247,11 @@ df_insn_refs_collect (struct df_collection_rec *collection_rec, regno_reg_rtx[FRAME_POINTER_REGNUM], NULL, bb, insn_info, DF_REF_REG_USE, 0); -#if !HARD_FRAME_POINTER_IS_FRAME_POINTER - df_ref_record (DF_REF_BASE, collection_rec, - regno_reg_rtx[HARD_FRAME_POINTER_REGNUM], - NULL, bb, insn_info, 
- DF_REF_REG_USE, 0); -#endif + if (!HARD_FRAME_POINTER_IS_FRAME_POINTER) + df_ref_record (DF_REF_BASE, collection_rec, + regno_reg_rtx[HARD_FRAME_POINTER_REGNUM], + NULL, bb, insn_info, + DF_REF_REG_USE, 0); break; default: break; @@ -3442,9 +3441,9 @@ df_get_regular_block_artificial_uses (bitmap regular_block_artificial_uses) reference of the frame pointer. */ bitmap_set_bit (regular_block_artificial_uses, FRAME_POINTER_REGNUM); -#if !HARD_FRAME_POINTER_IS_FRAME_POINTER - bitmap_set_bit (regular_block_artificial_uses, HARD_FRAME_POINTER_REGNUM); -#endif + if (!HARD_FRAME_POINTER_IS_FRAME_POINTER) + bitmap_set_bit (regular_block_artificial_uses, + HARD_FRAME_POINTER_REGNUM); #if FRAME_POINTER_REGNUM != ARG_POINTER_REGNUM /* Pseudos with argument area equivalences may require @@ -3494,9 +3493,9 @@ df_get_eh_block_artificial_uses (bitmap eh_block_artificial_uses) if (frame_pointer_needed) { bitmap_set_bit (eh_block_artificial_uses, FRAME_POINTER_REGNUM); -#if !HARD_FRAME_POINTER_IS_FRAME_POINTER - bitmap_set_bit (eh_block_artificial_uses, HARD_FRAME_POINTER_REGNUM); -#endif + if (!HARD_FRAME_POINTER_IS_FRAME_POINTER) + bitmap_set_bit (eh_block_artificial_uses, + HARD_FRAME_POINTER_REGNUM); } #if FRAME_POINTER_REGNUM != ARG_POINTER_REGNUM if (fixed_regs[ARG_POINTER_REGNUM]) @@ -3580,11 +3579,11 @@ df_get_entry_block_def_set (bitmap entry_block_defs) /* Any reference to any pseudo before reload is a potential reference of the frame pointer. */ bitmap_set_bit (entry_block_defs, FRAME_POINTER_REGNUM); -#if !HARD_FRAME_POINTER_IS_FRAME_POINTER + /* If they are different, also mark the hard frame pointer as live. */ - if (!LOCAL_REGNO (HARD_FRAME_POINTER_REGNUM)) + if (!HARD_FRAME_POINTER_IS_FRAME_POINTER + && !LOCAL_REGNO (HARD_FRAME_POINTER_REGNUM)) bitmap_set_bit (entry_block_defs, HARD_FRAME_POINTER_REGNUM); -#endif } /* These registers are live everywhere. 
*/ @@ -3718,11 +3717,11 @@ df_get_exit_block_use_set (bitmap exit_block_uses) if ((!reload_completed) || frame_pointer_needed) { bitmap_set_bit (exit_block_uses, FRAME_POINTER_REGNUM); -#if !HARD_FRAME_POINTER_IS_FRAME_POINTER + /* If they are different, also mark the hard frame pointer as live. */ - if (
[PATCH 10/12] remove more ifdefs for HAVE_cc0
From: Trevor Saunders gcc/ChangeLog: 2015-04-21 Trevor Saunders * caller-save.c (insert_one_insn): Remove ifdef HAVE_cc0. * cfgcleanup.c (flow_find_cross_jump): Likewise. (flow_find_head_matching_sequence): Likewise. (try_head_merge_bb): Likewise. * combine.c (can_combine_p): Likewise. (try_combine): Likewise. (distribute_notes): Likewise. * df-problems.c (can_move_insns_across): Likewise. * final.c (final): Likewise. * gcse.c (insert_insn_end_basic_block): Likewise. * ira.c (find_moveable_pseudos): Likewise. * reorg.c (try_merge_delay_insns): Likewise. (fill_simple_delay_slots): Likewise. (fill_slots_from_thread): Likewise. * sched-deps.c (sched_analyze_2): Likewise. --- gcc/caller-save.c | 4 +--- gcc/cfgcleanup.c | 26 -- gcc/combine.c | 54 +- gcc/df-problems.c | 5 + gcc/final.c | 29 ++--- gcc/gcse.c| 24 +--- gcc/ira.c | 5 + gcc/reorg.c | 26 +++--- gcc/sched-deps.c | 6 +++--- 9 files changed, 69 insertions(+), 110 deletions(-) diff --git a/gcc/caller-save.c b/gcc/caller-save.c index fc575eb..76c3a7e 100644 --- a/gcc/caller-save.c +++ b/gcc/caller-save.c @@ -1400,18 +1400,16 @@ insert_one_insn (struct insn_chain *chain, int before_p, int code, rtx pat) rtx_insn *insn = chain->insn; struct insn_chain *new_chain; -#if HAVE_cc0 /* If INSN references CC0, put our insns in front of the insn that sets CC0. This is always safe, since the only way we could be passed an insn that references CC0 is for a restore, and doing a restore earlier isn't a problem. We do, however, assume here that CALL_INSNs don't reference CC0. Guard against non-INSN's like CODE_LABEL. 
*/ - if ((NONJUMP_INSN_P (insn) || JUMP_P (insn)) + if (HAVE_cc0 && (NONJUMP_INSN_P (insn) || JUMP_P (insn)) && before_p && reg_referenced_p (cc0_rtx, PATTERN (insn))) chain = chain->prev, insn = chain->insn; -#endif new_chain = new_insn_chain (); if (before_p) diff --git a/gcc/cfgcleanup.c b/gcc/cfgcleanup.c index 58d235e..e5c4747 100644 --- a/gcc/cfgcleanup.c +++ b/gcc/cfgcleanup.c @@ -1416,12 +1416,11 @@ flow_find_cross_jump (basic_block bb1, basic_block bb2, rtx_insn **f1, i2 = PREV_INSN (i2); } -#if HAVE_cc0 /* Don't allow the insn after a compare to be shared by cross-jumping unless the compare is also shared. */ - if (ninsns && reg_mentioned_p (cc0_rtx, last1) && ! sets_cc0_p (last1)) + if (HAVE_cc0 && ninsns && reg_mentioned_p (cc0_rtx, last1) + && ! sets_cc0_p (last1)) last1 = afterlast1, last2 = afterlast2, last_dir = afterlast_dir, ninsns--; -#endif /* Include preceding notes and labels in the cross-jump. One, this may bring us to the head of the blocks as requested above. @@ -1539,12 +1538,11 @@ flow_find_head_matching_sequence (basic_block bb1, basic_block bb2, rtx_insn **f i2 = NEXT_INSN (i2); } -#if HAVE_cc0 /* Don't allow a compare to be shared by cross-jumping unless the insn after the compare is also shared. 
*/ - if (ninsns && reg_mentioned_p (cc0_rtx, last1) && sets_cc0_p (last1)) + if (HAVE_cc0 && ninsns && reg_mentioned_p (cc0_rtx, last1) + && sets_cc0_p (last1)) last1 = beforelast1, last2 = beforelast2, ninsns--; -#endif if (ninsns) { @@ -2330,11 +2328,9 @@ try_head_merge_bb (basic_block bb) cond = get_condition (jump, &move_before, true, false); if (cond == NULL_RTX) { -#if HAVE_cc0 - if (reg_mentioned_p (cc0_rtx, jump)) + if (HAVE_cc0 && reg_mentioned_p (cc0_rtx, jump)) move_before = prev_nonnote_nondebug_insn (jump); else -#endif move_before = jump; } @@ -2499,11 +2495,9 @@ try_head_merge_bb (basic_block bb) cond = get_condition (jump, &move_before, true, false); if (cond == NULL_RTX) { -#if HAVE_cc0 - if (reg_mentioned_p (cc0_rtx, jump)) + if (HAVE_cc0 && reg_mentioned_p (cc0_rtx, jump)) move_before = prev_nonnote_nondebug_insn (jump); else -#endif move_before = jump; } } @@ -2522,12 +2516,10 @@ try_head_merge_bb (basic_block bb) /* Try again, using a different insertion point. */ move_before = jump; -#if HAVE_cc0 /* Don't try moving before a cc0 user, as that may invalidate the cc0. */ - if (reg_mentioned_p (cc0_rtx, jump)) + if (HAVE_cc0 && reg_mentioned_p (cc0_rtx, jump)) break; -#endif continue; } @@ -2582,12 +2574,10 @@ try_head_merge_bb (basic_block bb) /* For the unmerged insns, try a different insertion point. */ move_before = jump; -#if HAVE_cc0 /* Don't try moving before a cc0 user,
[PATCH 09/12] remove #if for PIC_OFFSET_TABLE_REGNUM
From: Trevor Saunders

gcc/ChangeLog:

2015-04-21  Trevor Saunders

	* df-scan.c (df_get_entry_block_def_set): Remove #ifdef
	PIC_OFFSET_TABLE_REGNUM.
---
 gcc/df-scan.c | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/gcc/df-scan.c b/gcc/df-scan.c
index 69332a8..4232ec8 100644
--- a/gcc/df-scan.c
+++ b/gcc/df-scan.c
@@ -3589,10 +3589,6 @@ df_get_entry_block_def_set (bitmap entry_block_defs)
   /* These registers are live everywhere. */
   if (!reload_completed)
     {
-#ifdef PIC_OFFSET_TABLE_REGNUM
-      unsigned int picreg = PIC_OFFSET_TABLE_REGNUM;
-#endif
-
 #if FRAME_POINTER_REGNUM != ARG_POINTER_REGNUM
       /* Pseudos with argument area equivalences may require
          reloading via the argument pointer. */
@@ -3600,13 +3596,12 @@ df_get_entry_block_def_set (bitmap entry_block_defs)
         bitmap_set_bit (entry_block_defs, ARG_POINTER_REGNUM);
 #endif
 
-#ifdef PIC_OFFSET_TABLE_REGNUM
       /* Any constant, or pseudo with constant equivalences, may
          require reloading from memory using the pic register. */
+      unsigned int picreg = PIC_OFFSET_TABLE_REGNUM;
       if (picreg != INVALID_REGNUM
           && fixed_regs[picreg])
         bitmap_set_bit (entry_block_defs, picreg);
-#endif
     }
 
 #ifdef INCOMING_RETURN_ADDR_RTX
-- 
2.3.0.80.g18d0fec.dirty
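Removing the #ifdef is only safe if PIC_OFFSET_TABLE_REGNUM is defined on every target, presumably through an existing default of INVALID_REGNUM; the patch itself does not add one. A standalone sketch of the sentinel check under that assumption, with NUM_REGS, the local definitions and pic_reg_live_on_entry all hypothetical:

```c
#include <stdbool.h>

#define NUM_REGS 8              /* hypothetical register count */
#define INVALID_REGNUM (~0u)    /* stand-in sentinel value */

/* Assumed default: no PIC register unless the target names one.  */
#ifndef PIC_OFFSET_TABLE_REGNUM
#define PIC_OFFSET_TABLE_REGNUM INVALID_REGNUM
#endif

static bool fixed_regs[NUM_REGS];

/* Return true if the PIC register should be marked live on entry.  */
static bool
pic_reg_live_on_entry (void)
{
  unsigned int picreg = PIC_OFFSET_TABLE_REGNUM;
  /* The sentinel test comes first, so fixed_regs is never indexed
     with INVALID_REGNUM.  */
  return picreg != INVALID_REGNUM && fixed_regs[picreg];
}
```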
[PATCH 12/12] add default for INSN_REFERENCES_ARE_DELAYED
From: Trevor Saunders

gcc/ChangeLog:

2015-04-21  Trevor Saunders

	* defaults.h (INSN_REFERENCES_ARE_DELAYED): New definition.
	* reorg.c (redundant_insn): Remove ifdef INSN_REFERENCES_ARE_DELAYED.
	* resource.c (mark_referenced_resources): Likewise.
---
 gcc/defaults.h | 4 ++++
 gcc/reorg.c    | 4 ----
 gcc/resource.c | 2 --
 3 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/gcc/defaults.h b/gcc/defaults.h
index 79cb599..cafcb1e 100644
--- a/gcc/defaults.h
+++ b/gcc/defaults.h
@@ -1205,6 +1205,10 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
 #define INSN_SETS_ARE_DELAYED(INSN) false
 #endif
 
+#ifndef INSN_REFERENCES_ARE_DELAYED
+#define INSN_REFERENCES_ARE_DELAYED(INSN) false
+#endif
+
 #ifdef GCC_INSN_FLAGS_H
 /* Dependent default target macro definitions
 
diff --git a/gcc/reorg.c b/gcc/reorg.c
index ae77f0a..d8d8ab69 100644
--- a/gcc/reorg.c
+++ b/gcc/reorg.c
@@ -1558,10 +1558,8 @@ redundant_insn (rtx insn, rtx_insn *target, rtx delay_list)
           if (INSN_SETS_ARE_DELAYED (seq->insn (0)))
             return 0;
 
-#ifdef INSN_REFERENCES_ARE_DELAYED
           if (INSN_REFERENCES_ARE_DELAYED (seq->insn (0)))
             return 0;
-#endif
 
           /* See if any of the insns in the delay slot match, updating
              resource requirements as we go. */
@@ -1658,10 +1656,8 @@ redundant_insn (rtx insn, rtx_insn *target, rtx delay_list)
           if (INSN_SETS_ARE_DELAYED (control))
             return 0;
 
-#ifdef INSN_REFERENCES_ARE_DELAYED
           if (INSN_REFERENCES_ARE_DELAYED (control))
             return 0;
-#endif
 
           if (JUMP_P (control))
             annul_p = INSN_ANNULLED_BRANCH_P (control);
diff --git a/gcc/resource.c b/gcc/resource.c
index 5af9376..26d9fca 100644
--- a/gcc/resource.c
+++ b/gcc/resource.c
@@ -392,11 +392,9 @@ mark_referenced_resources (rtx x, struct resources *res,
                             include_delayed_effects
                             ? MARK_SRC_DEST_CALL : MARK_SRC_DEST);
 
-#ifdef INSN_REFERENCES_ARE_DELAYED
       if (! include_delayed_effects
           && INSN_REFERENCES_ARE_DELAYED (as_a (x)))
         return;
-#endif
 
       /* No special processing, just speed up. */
       mark_referenced_resources (PATTERN (x), res, include_delayed_effects);
-- 
2.3.0.80.g18d0fec.dirty
[PATCH 11/12] provide default for INSN_SETS_ARE_DELAYED
From: Trevor Saunders

gcc/ChangeLog:

2015-04-21  Trevor Saunders

	* defaults.h (INSN_SETS_ARE_DELAYED): New definition.
	* reorg.c (redundant_insn): Remove ifdef INSN_SETS_ARE_DELAYED.
	* resource.c (mark_set_resources): Likewise.
---
 gcc/defaults.h | 4 ++++
 gcc/reorg.c    | 4 ----
 gcc/resource.c | 2 --
 3 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/gcc/defaults.h b/gcc/defaults.h
index 843d7e2..79cb599 100644
--- a/gcc/defaults.h
+++ b/gcc/defaults.h
@@ -1201,6 +1201,10 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
 #define DEFAULT_PCC_STRUCT_RETURN 1
 #endif
 
+#ifndef INSN_SETS_ARE_DELAYED
+#define INSN_SETS_ARE_DELAYED(INSN) false
+#endif
+
 #ifdef GCC_INSN_FLAGS_H
 /* Dependent default target macro definitions
 
diff --git a/gcc/reorg.c b/gcc/reorg.c
index b7228f2..ae77f0a 100644
--- a/gcc/reorg.c
+++ b/gcc/reorg.c
@@ -1555,10 +1555,8 @@ redundant_insn (rtx insn, rtx_insn *target, rtx delay_list)
              slots because it is difficult to track its resource needs
              correctly. */
 
-#ifdef INSN_SETS_ARE_DELAYED
           if (INSN_SETS_ARE_DELAYED (seq->insn (0)))
             return 0;
-#endif
 
 #ifdef INSN_REFERENCES_ARE_DELAYED
           if (INSN_REFERENCES_ARE_DELAYED (seq->insn (0)))
@@ -1657,10 +1655,8 @@ redundant_insn (rtx insn, rtx_insn *target, rtx delay_list)
           /* If this is an INSN or JUMP_INSN with delayed effects, it
              is hard to track the resource needs properly, so give up. */
 
-#ifdef INSN_SETS_ARE_DELAYED
           if (INSN_SETS_ARE_DELAYED (control))
             return 0;
-#endif
 
 #ifdef INSN_REFERENCES_ARE_DELAYED
           if (INSN_REFERENCES_ARE_DELAYED (control))
diff --git a/gcc/resource.c b/gcc/resource.c
index 9a013b3..5af9376 100644
--- a/gcc/resource.c
+++ b/gcc/resource.c
@@ -696,11 +696,9 @@ mark_set_resources (rtx x, struct resources *res, int in_dest,
       /* An insn consisting of just a CLOBBER (or USE) is just for flow
          and doesn't actually do anything, so we ignore it. */
 
-#ifdef INSN_SETS_ARE_DELAYED
       if (mark_type != MARK_SRC_DEST_CALL
           && INSN_SETS_ARE_DELAYED (as_a (x)))
         return;
-#endif
 
       x = PATTERN (x);
       if (GET_CODE (x) != USE && GET_CODE (x) != CLOBBER)
-- 
2.3.0.80.g18d0fec.dirty
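For a function-like target macro the default is simply a constant false, which lets every caller test it unconditionally. A tiny sketch of the idea, assuming insns are modelled as plain ints; may_be_redundant is a hypothetical stand-in for one check in redundant_insn, not the real function:

```c
#include <stdbool.h>

/* Default in the style of the defaults.h hunk: targets without
   delayed sets get a constant false.  */
#ifndef INSN_SETS_ARE_DELAYED
#define INSN_SETS_ARE_DELAYED(INSN) false
#endif

/* Hypothetical check: give up on insns with delayed sets, and treat
   negative values as invalid insns.  */
static int
may_be_redundant (int insn)
{
  if (INSN_SETS_ARE_DELAYED (insn))
    return 0;
  return insn >= 0;
}
```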
Re: [AArch64][PR65139] use clobber with match_scratch for aarch64_lshr_sisd_or_int_3
On 18/04/15 19:17, Maxim Kuvyrkov wrote: >> On Apr 18, 2015, at 8:21 PM, Richard Earnshaw >> wrote: >> >> On 18/04/15 16:13, Jakub Jelinek wrote: >>> On Sat, Apr 18, 2015 at 03:07:16PM +0100, Richard Earnshaw wrote: You need to ensure that your scratch register cannot overlap op1, since the scratch is written before op1 is read. >>> >>> - (clobber (match_scratch:QI 3 "=X,w,X"))] >>> + (clobber (match_scratch:QI 3 "=X,&w,X"))] >>> >>> incremental diff should ensure that, right? >>> >>> Jakub >>> >> >> >> Sorry, where in the patch is that hunk? >> >> I see just: >> >> + (clobber (match_scratch:QI 3 "=X,w,X"))] > > Jakub's suggestion is an incremental patch on top of Kugan's. > Ah, sorry, I though he was implying it was already in the patch somewhere. >> >> And why would early clobbering the scratch be notably better than the >> original? >> > > It will still be better. With this patch we want to allow RA freedom to > optimally handle both of the following cases: > > 1. operand[1] dies after the instruction. In this case we want operand[0] > and operand[1] to be assigned to the same reg, and operand[3] to be assigned > to a different register to provide a temporary. In this case we don't care > whether operand[3] is early-clobber or not. This case is not optimally > handled with current insn patterns. > > 2. operand[1] lives on after the instruction. In this case we want > operand[0] and operand[3] to be assigned to the same reg, and not clobber > operand[1]. By marking operand[3] early-clobber we ensure that operand[1] is > in a different register from what operand[0] and operand[3] were assigned to. > This case should be handled equally well before and after the patch. > > My understanding is that Kugan's patch with Jakub's fix on top satisfy both > of these cases. > I still don't think it handles all cases efficiently. 
If we really want the result in a different register from both of the inputs, then we now need two registers, one for the result and another for the temporary. In that case we could have used the result register as the scratch, but now we can't.

Maybe we can provide two alternatives, one that early-clobbers the result register but doesn't need a scratch and one that doesn't early-clobber the result, but does need a scratch. So something like

(define_insn "aarch64_lshr_sisd_or_int_3"
  [(set (match_operand:GPI 0 "register_operand" "=w,&w,w,r")
        (lshiftrt:GPI
          (match_operand:GPI 1 "register_operand" "w,w,w,r")
          (match_operand:QI 2 "aarch64_reg_or_shift_imm_" "Us,w,w,rUs")))
   (clobber (match_scratch:QI 3 "=X,X,w,X"))]
  ...

but I haven't tested any of that.

I would also note the conversation in https://gcc.gnu.org/ml/gcc/2015-04/msg00240.html. That seems to suggest we should be wary of using scratch sequences since the register allocator doesn't account for them properly.

R.

> --
> Maxim Kuvyrkov
> www.linaro.org
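The overlap hazard discussed in this thread can be illustrated with a small C model. This is only a sketch under stated assumptions: the register file `regs`, the helper `run`, and the two-step sequence in `lshr_via_scratch` are hypothetical stand-ins, not the instructions the AArch64 pattern actually emits. The only property taken from the thread is that the scratch is written before operand 1 is read.

```c
#include <stdint.h>

/* Toy register file; register numbers are illustrative only.  */
static uint64_t regs[4];

/* Hypothetical two-step expansion: the scratch register is written
   first, and operand 1 is read only afterwards.  */
static void
lshr_via_scratch (int dst, int op1, int op2, int scratch)
{
  regs[scratch] = regs[op2] & 63;          /* step 1: scratch is written */
  regs[dst] = regs[op1] >> regs[scratch];  /* step 2: op1 is read */
}

/* Load VALUE and SHIFT into the chosen registers, run the sequence and
   return what ends up in the destination register.  */
static uint64_t
run (int dst, int op1, int op2, int scratch, uint64_t value, uint64_t shift)
{
  regs[0] = regs[1] = regs[2] = regs[3] = 0;
  regs[op1] = value;
  regs[op2] = shift;
  lshr_via_scratch (dst, op1, op2, scratch);
  return regs[dst];
}
```

In this model, `run (0, 1, 2, 3, 256, 4)` uses a separate scratch and yields 16. `run (0, 1, 2, 1, 256, 4)` lets the scratch share a register with operand 1, so the input is overwritten before it is read and the result is no longer 16; that is the situation the early-clobber modifier rules out. `run (0, 1, 2, 0, 256, 4)` reuses the destination as the scratch and still yields 16, which corresponds to the alternative of early-clobbering the result register instead of allocating a separate scratch.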
Re: [PATCH 11/12] provide default for INSN_SETS_ARE_DELAYED
On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote: From: Trevor Saunders gcc/ChangeLog: 2015-04-21 Trevor Saunders * defaults.h (INSN_SETS_ARE_DELAYED): New definition. * reorg.c (redundant_insn): Remove ifdef INSN_SETS_ARE_DELAYED. * resource.c (mark_set_resources): Likewise. OK. Jeff
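For readers following the series, the general shape of a "provide a default" patch can be sketched as below. This is a minimal illustration, not the actual GCC change: `struct insn`, `skip_marking_sets`, and the fallback value `false` are assumptions made for the example. The point is that once a shared header supplies a fallback, the caller no longer needs an #ifdef around the test.

```c
#include <stdbool.h>

/* Illustrative type only; GCC's real insn representation differs.  */
struct insn { bool delayed; };

/* A target header could define the macro, for example:
   #define INSN_SETS_ARE_DELAYED(INSN) ((INSN)->delayed)
   Most targets leave it undefined.  */

/* Stand-in for the fallback a shared defaults header would supply.
   The value 'false' is an assumption for this sketch.  */
#ifndef INSN_SETS_ARE_DELAYED
#define INSN_SETS_ARE_DELAYED(INSN) false
#endif

/* The caller is compiled on every target, with no #ifdef.  On targets
   using the fallback the condition folds to false.  */
static bool
skip_marking_sets (const struct insn *insn, bool is_call_context)
{
  if (!is_call_context && INSN_SETS_ARE_DELAYED (insn))
    return true;
  return false;
}
```

With the fallback in effect, `skip_marking_sets` returns false for any insn, so behaviour on targets that never defined the macro should be unchanged while the code is still parsed and type-checked everywhere.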
Re: [PATCH 12/12] add default for INSN_REFERENCES_ARE_DELAYED
On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote: From: Trevor Saunders gcc/ChangeLog: 2015-04-21 Trevor Saunders * defaults.h (INSN_REFERENCES_ARE_DELAYED): New definition. * reorg.c (redundant_insn): Remove ifdef INSN_REFERENCES_ARE_DELAYED. * resource.c (mark_referenced_resources): Likewise. OK. Jeff
Re: [PATCH 06/12] provide default for RETURN_ADDR_OFFSET
On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote: From: Trevor Saunders gcc/ChangeLog: 2015-04-21 Trevor Saunders * defaults.h (RETURN_ADDR_OFFSET): New definition. * except.c (expand_builtin_extract_return_addr): Remove ifdef RETURN_ADDR_OFFSET. (expand_builtin_frob_return_addr): Likewise. OK. jeff
Re: [PATCH 07/12] provide default for MASK_RETURN_ADDR
On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote: From: Trevor Saunders gcc/ChangeLog: 2015-04-21 Trevor Saunders * defaults.h (MASK_RETURN_ADDR): New definition. * except.c (expand_builtin_extract_return_addr): Remove ifdef MASK_RETURN_ADDR. OK. jeff
[PATCH, i386]: Some spring cleaning in i386.h
Hello! This patch redefines various hard register numbers with ones from i386.md. Also, the patch reshuffles some defines to group them together in a better way. No functional changes. 2015-04-21 Uros Bizjak * config/i386/i386.md (ARGP_REG, FRAME_REG, BND2_REG, BND3_REG, FIRST_PSEUDO_REG): New. * config/i386/i386.h (STACK_POINTER_REGNUM): Define to SP_REG. (ARG_POINTER_REGNUM): Define to ARGP_REG. (FRAME_POINTER_REGNUM): Define to FRAME_REG. (HARD_FRAME_POINTER_REGNUM): Define to BP_REG. (FIRST_PSEUDO_REGISTER): Define to FIRST_PSEUDO_REG. (FIRST_INT_REG): New. (LAST_INT_REG): New. (FIRST_*_REG): Define using *_REG. (LAST_*_REG): Ditto. (QI_REGNO_P): Define using FIRST_QI_REG and LAST_QI_REG. (LEGACY_INT_REGNO_P): Define using FIRST_INT_REG and LAST_INT_REG. (FIRST_FLOAT_REG): Define to FIRST_STACK_REG. Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}, committed to mainline SVN. Uros. Index: config/i386/i386.h === --- config/i386/i386.h (revision 57) +++ config/i386/i386.h (working copy) @@ -957,7 +957,7 @@ extern const char *host_detect_local_cpu (int argc eliminated during reloading in favor of either the stack or frame pointer. */ -#define FIRST_PSEUDO_REGISTER 81 +#define FIRST_PSEUDO_REGISTER FIRST_PSEUDO_REG /* Number of hardware registers that go into the DWARF-2 unwind info. If not defined, equals FIRST_PSEUDO_REGISTER. 
*/ @@ -1100,7 +1100,7 @@ extern const char *host_detect_local_cpu (int argc || (MODE) == V16SImode || (MODE) == V16SFmode || (MODE) == V32HImode \ || (MODE) == V4TImode) -#define VALID_AVX512VL_128_REG_MODE(MODE) \ +#define VALID_AVX512VL_128_REG_MODE(MODE) \ ((MODE) == V2DImode || (MODE) == V2DFmode || (MODE) == V16QImode \ || (MODE) == V4SImode || (MODE) == V4SFmode || (MODE) == V8HImode) @@ -1121,6 +1121,10 @@ extern const char *host_detect_local_cpu (int argc || (MODE) == V2SImode || (MODE) == SImode \ || (MODE) == V4HImode || (MODE) == V8QImode) +#define VALID_MASK_REG_MODE(MODE) ((MODE) == HImode || (MODE) == QImode) + +#define VALID_MASK_AVX512BW_MODE(MODE) ((MODE) == SImode || (MODE) == DImode) + #define VALID_BND_REG_MODE(MODE) \ (TARGET_64BIT ? (MODE) == BND64mode : (MODE) == BND32mode) @@ -1150,10 +1154,16 @@ extern const char *host_detect_local_cpu (int argc || (MODE) == V16SImode || (MODE) == V32HImode || (MODE) == V8DFmode \ || (MODE) == V16SFmode) -#define VALID_MASK_REG_MODE(MODE) ((MODE) == HImode || (MODE) == QImode) +#define X87_FLOAT_MODE_P(MODE) \ + (TARGET_80387 && ((MODE) == SFmode || (MODE) == DFmode || (MODE) == XFmode)) -#define VALID_MASK_AVX512BW_MODE(MODE) ((MODE) == SImode || (MODE) == DImode) +#define SSE_FLOAT_MODE_P(MODE) \ + ((TARGET_SSE && (MODE) == SFmode) || (TARGET_SSE2 && (MODE) == DFmode)) +#define FMA4_VEC_FLOAT_MODE_P(MODE) \ + (TARGET_FMA4 && ((MODE) == V4SFmode || (MODE) == V2DFmode \ + || (MODE) == V8SFmode || (MODE) == V4DFmode)) + /* Value is 1 if hard register REGNO can hold a value of machine-mode MODE. */ #define HARD_REGNO_MODE_OK(REGNO, MODE)\ @@ -1198,42 +1208,46 @@ extern const char *host_detect_local_cpu (int argc register. The ordinary mov instructions won't work */ /* #define PC_REGNUM */ +/* Base register for access to arguments of the function. */ +#define ARG_POINTER_REGNUM ARGP_REG + /* Register to use for pushing function arguments. 
*/ -#define STACK_POINTER_REGNUM 7 +#define STACK_POINTER_REGNUM SP_REG /* Base register for access to local variables of the function. */ -#define HARD_FRAME_POINTER_REGNUM 6 +#define FRAME_POINTER_REGNUM FRAME_REG +#define HARD_FRAME_POINTER_REGNUM BP_REG -/* Base register for access to local variables of the function. */ -#define FRAME_POINTER_REGNUM 20 +#define FIRST_INT_REG AX_REG +#define LAST_INT_REG SP_REG -/* First floating point reg */ -#define FIRST_FLOAT_REG 8 +#define FIRST_QI_REG AX_REG +#define LAST_QI_REG BX_REG /* First & last stack-like regs */ -#define FIRST_STACK_REG FIRST_FLOAT_REG -#define LAST_STACK_REG (FIRST_FLOAT_REG + 7) +#define FIRST_STACK_REG ST0_REG +#define LAST_STACK_REG ST7_REG -#define FIRST_SSE_REG (FRAME_POINTER_REGNUM + 1) -#define LAST_SSE_REG (FIRST_SSE_REG + 7) +#define FIRST_SSE_REG XMM0_REG +#define LAST_SSE_REG XMM7_REG -#define FIRST_MMX_REG (LAST_SSE_REG + 1) /*29*/ -#define LAST_MMX_REG (FIRST_MMX_REG + 7) +#define FIRST_MMX_REG MM0_REG +#define LAST_MMX_REG MM7_REG -#define FIRST_REX_INT_REG (LAST_MMX_REG + 1) /*37*/ -#define LAST_REX_INT_REG (FIRST_REX_INT_REG + 7) +#define FIRST_REX_INT_REG R8_REG +#define LAST_REX_INT_REG R15_REG -#define FIRST_REX_SSE_REG (LAST_REX_INT_REG + 1) /*45*/ -#define LAST_REX_SSE_REG (FIRST_REX_SSE_REG + 7) +#define FIRST_R
Re: [PATCH 09/12] remove #if for PIC_OFFSET_TABLE_REGNUM
On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote: From: Trevor Saunders gcc/ChangeLog: 2015-04-21 Trevor Saunders * df-scan.c (df_get_entry_block_def_set): Remove #ifdef PIC_OFFSET_TABLE_REGNUM. OK. jeff
Re: [PATCH 08/12] reduce conditional compilation for HARD_FRAME_POINTER_IS_FRAME_POINTER
On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote: From: Trevor Saunders gcc/ChangeLog: 2015-04-21 Trevor Saunders * alias.c (init_alias_target): Remove ifdef * HARD_FRAME_POINTER_IS_FRAME_POINTER. * df-scan.c (df_insn_refs_collect): Likewise. (df_get_regular_block_artificial_uses): Likewise. (df_get_eh_block_artificial_uses): Likewise. (df_get_entry_block_def_set): Likewise. (df_get_exit_block_use_set): Likewise. * emit-rtl.c (gen_rtx_REG): Likewise. * ira.c (ira_setup_eliminable_regset): Likewise. * reginfo.c (init_reg_sets_1): Likewise. * regrename.c (rename_chains): Likewise. * reload1.c (reload): Likewise. (eliminate_regs_in_insn): Likewise. * resource.c (mark_referenced_resources): Likewise. (init_resource_info): Likewise. OK. jeff
Re: [PATCH 01/12] add default definition of EH_RETURN_DATA_REGNO
On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote: From: Trevor Saunders gcc/ChangeLog: 2015-04-21 Trevor Saunders * defaults.h: New definition of EH_RETURN_DATA_REGNO. * except.c: Remove definition of EH_RETURN_DATA_REGNO. * builtins.c (expand_builtin): Remove check if EH_RETURN_DATA_REGNO is defined. * df-scan.c (df_bb_refs_collect): Likewise. (df_get_exit_block_use_set): Likewise. * haifa-sched.c (initiate_bb_reg_pressure_info): Likewise. * ira-lives.c (process_bb_node_lives): Likewise. * lra-lives.c (process_bb_lives): Likewise. This one wasn't as obvious as the others, but is clearly OK once the full loops being guarded by EH_RETURN_DATA_REGNO are examined. Jeff
Re: [PATCH 00/12] Reduce conditional compilation
On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:
> From: Trevor Saunders
>
> Hi,
>
> This is a first round of patches to reduce the amount of code within
> #if / #ifdef. This makes it incrementally easier to not break configs
> other than the one being built, and moves things slightly closer to using
> target hooks for everything.
>
> each commit bootstrapped and regtested on x86_64-linux-gnu without
> regression, and whole patch set run through config-list.mk without issue,
> ok?

Thanks for tackling this. It's not particularly deep work, but I do think it'll help reduce the long term maintenance costs and make developers' lives easier.

Onward to the HAVE_cc0 patches :-)

Jeff

ps. You hit a good window, my daughter was up late last night and is sleeping in a bit, so I've got unexpected time this morning before my meetings.
Re: [PATCH 02/12] remove some ifdef HAVE_cc0
On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote: From: Trevor Saunders gcc/ChangeLog: 2015-04-21 Trevor Saunders * conditions.h: Define macros even if HAVE_cc0 is undefined. * emit-rtl.c: Define functions even if HAVE_cc0 is undefined. * final.c: Likewise. * jump.c: Likewise. * recog.c: Likewise. * recog.h: Declare functions even when HAVE_cc0 is undefined. * sched-deps.c (sched_analyze_2): Always compile case for cc0. OK. Note for anyone else reading at home, some of the functions being unconditionally compiled now already had unconditional prototypes in the header files. So not everything needed a .h file change. jeff
Re: [PATCH 03/12] more removal of ifdef HAVE_cc0
On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote: From: Trevor Saunders gcc/ChangeLog: 2015-04-21 Trevor Saunders * combine.c (find_single_use): Remove HAVE_cc0 ifdef for code that is trivially dead on non cc0 targets. (simplify_set): Likewise. (mark_used_regs_combine): Likewise. * cse.c (new_basic_block): Likewise. (fold_rtx): Likewise. (cse_insn): Likewise. (cse_extended_basic_block): Likewise. (set_live_p): Likewise. * rtlanal.c (canonicalize_condition): Likewise. * simplify-rtx.c (simplify_binary_operation_1): Likewise. OK. I find myself wondering if the conditionals should look like if (HAVE_cc0 && (whatever)) But I doubt it makes any measurable difference. It's something we can always add in the future if we feel the need to avoid the runtime checks for things that aren't ever going to happen on most modern targets. jeff
Re: [PATCH 04/12] always define HAVE_cc0
On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote: From: Trevor Saunders gcc/ChangeLog: 2015-04-21 Trevor Saunders * genconfig.c (main): Always define HAVE_cc0. * caller-save.c (insert_one_insn): Change ifdef HAVE_cc0 to #if HAVE_cc0. * cfgcleanup.c (flow_find_cross_jump): Likewise. (flow_find_head_matching_sequence): Likewise. (try_head_merge_bb): Likewise. * cfgrtl.c (rtl_merge_blocks): Likewise. (try_redirect_by_replacing_jump): Likewise. (rtl_tidy_fallthru_edge): Likewise. * combine.c (do_SUBST_MODE): Likewise. (insn_a_feeds_b): Likewise. (combine_instructions): Likewise. (can_combine_p): Likewise. (try_combine): Likewise. (find_split_point): Likewise. (subst): Likewise. (simplify_set): Likewise. (distribute_notes): Likewise. * cprop.c (cprop_jump): Likewise. * cse.c (cse_extended_basic_block): Likewise. * df-problems.c (can_move_insns_across): Likewise. * final.c (final): Likewise. (final_scan_insn): Likewise. * function.c (emit_use_return_register_into_block): Likewise. * gcse.c (insert_insn_end_basic_block): Likewise. * haifa-sched.c (sched_init): Likewise. * ira.c (find_moveable_pseudos): Likewise. * loop-invariant.c (find_invariant_insn): Likewise. * lra-constraints.c (curr_insn_transform): Likewise. * optabs.c (prepare_cmp_insn): Likewise. * postreload.c (reload_combine_recognize_const_pattern): * Likewise. * reload.c (find_reloads): Likewise. (find_reloads_address_1): Likewise. * reorg.c (delete_scheduled_jump): Likewise. (steal_delay_list_from_target): Likewise. (steal_delay_list_from_fallthrough): Likewise. (try_merge_delay_insns): Likewise. (redundant_insn): Likewise. (fill_simple_delay_slots): Likewise. (fill_slots_from_thread): Likewise. (delete_computation): Likewise. (relax_delay_slots): Likewise. * sched-deps.c (sched_analyze_2): Likewise. * sched-rgn.c (add_branch_dependences): Likewise. Doesn't go as far as I'd like, but it's still an improvement. OK. jeff
Re: [PATCH 05/12] make some HAVE_cc0 code always compiled
On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote: From: Trevor Saunders gcc/ChangeLog: 2015-04-21 Trevor Saunders * cfgrtl.c (rtl_merge_blocks): Change #if HAVE_cc0 to if (HAVE_cc0) (try_redirect_by_replacing_jump): Likewise. (rtl_tidy_fallthru_edge): Likewise. * combine.c (insn_a_feeds_b): Likewise. (find_split_point): Likewise. (simplify_set): Likewise. * cprop.c (cprop_jump): Likewise. * cse.c (cse_extended_basic_block): Likewise. * df-problems.c (can_move_insns_across): Likewise. * function.c (emit_use_return_register_into_block): Likewise. * haifa-sched.c (sched_init): Likewise. * ira.c (find_moveable_pseudos): Likewise. * loop-invariant.c (find_invariant_insn): Likewise. * lra-constraints.c (curr_insn_transform): Likewise. * postreload.c (reload_combine_recognize_const_pattern): * Likewise. * reload.c (find_reloads): Likewise. * reorg.c (delete_scheduled_jump): Likewise. (steal_delay_list_from_target): Likewise. (steal_delay_list_from_fallthrough): Likewise. (redundant_insn): Likewise. (fill_simple_delay_slots): Likewise. (fill_slots_from_thread): Likewise. (delete_computation): Likewise. * sched-rgn.c (add_branch_dependences): Likewise. OK. This is what I expected to see a lot of :-0 jeff
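The progression in patches 04 and 05 (from #ifdef HAVE_cc0, to #if HAVE_cc0, to an ordinary if (HAVE_cc0)) can be sketched as follows. This is an illustrative example only: `merge_blocks`, `fix_up_cc0_user`, and `cc0_fixups` are made-up names, and the macro is set to 0 here on the assumption that the generated configuration always defines it to 0 or 1.

```c
/* Assumed: the generated config header always defines HAVE_cc0 to
   0 or 1.  This sketch models a target without cc0.  */
#ifndef HAVE_cc0
#define HAVE_cc0 0
#endif

/* Counts how often the cc0-only path ran (illustrative).  */
static int cc0_fixups;

static void
fix_up_cc0_user (void)
{
  cc0_fixups++;
}

/* Hypothetical caller.  Because the test is a plain 'if', the cc0
   path is parsed and type-checked on every target, and the compiler
   can drop it as dead code when HAVE_cc0 is 0.  */
static int
merge_blocks (int n_insns)
{
  if (HAVE_cc0)
    fix_up_cc0_user ();
  return n_insns;
}
```

The design trade-off is the one mentioned in the reviews: a constant-folded `if` costs nothing at run time on non-cc0 targets, but a build on one configuration now catches syntax and type errors in code that only another configuration would execute.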
Re: [PATCH 10/12] remove more ifdefs for HAVE_cc0
On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote: From: Trevor Saunders gcc/ChangeLog: 2015-04-21 Trevor Saunders * caller-save.c (insert_one_insn): Remove ifdef HAVE_cc0. * cfgcleanup.c (flow_find_cross_jump): Likewise. (flow_find_head_matching_sequence): Likewise. (try_head_merge_bb): Likewise. * combine.c (can_combine_p): Likewise. (try_combine): Likewise. (distribute_notes): Likewise. * df-problems.c (can_move_insns_across): Likewise. * final.c (final): Likewise. * gcse.c (insert_insn_end_basic_block): Likewise. * ira.c (find_moveable_pseudos): Likewise. * reorg.c (try_merge_delay_insns): Likewise. (fill_simple_delay_slots): Likewise. (fill_slots_from_thread): Likewise. * sched-deps.c (sched_analyze_2): Likewise. OK. Jeff
Re: [PATCH 00/12] Reduce conditional compilation
On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:
> From: Trevor Saunders
>
> Hi,
>
> This is a first round of patches to reduce the amount of code within
> #if / #ifdef. This makes it incrementally easier to not break configs
> other than the one being built, and moves things slightly closer to using
> target hooks for everything.
>
> each commit bootstrapped and regtested on x86_64-linux-gnu without
> regression, and whole patch set run through config-list.mk without issue,
> ok?

So I think after looking at this patchset, any changes of a similar nature you want to make should be considered pre-approved. Just post them for archival purposes, but no need for you to wait for review as long as they have the same purpose and overall structure as was seen in these patches.

jeff
RE: [PATCH 6/13] mips musl support
Szabolcs Nagy writes:
> Set up dynamic linker name for mips.
>
> gcc/Changelog:
>
> 2015-04-16 Gregor Richards
>
> * config/mips/linux.h (MUSL_DYNAMIC_LINKER): Define.

I understand that mips musl is currently o32 only; is that correct? There do, however, appear to be both soft and hard float variants listed in the musl docs. Do you plan on using the same dynamic linker name for both float variants? No problem if so, but someone must have decided to have unique names for big and little endian, so I thought it worth checking.

Also, are you aware of the two NaN encoding formats that MIPS has and the support present in glibc's dynamic linker to deal with them?

I wonder if it would be wise to refuse to target musl unless the ABI is known to be supported, so that we can avoid compatibility issues when different ABI variants are added in musl.

Thanks,
Matthew
[PATCH][AArch64] Add branch-cost to cpu tuning information.
The AArch64 backend sets BRANCH_COST to be the constant value 2 for all cpus, meaning that the compiler thinks that branches cost the same across all cpus. This patch reworks the handling of branch costs to allow per-cpu values to be set. The actual values of the branch costs are unchanged, as the correct values will need to be decided for each core. Tested aarch64-none-linux-gnu with gcc-check. Ok for trunk? Matthew 2015-05-21 Matthew Wahab * gcc/config/aarch64-protos.h (struct cpu_branch_cost): New. (tune_params): Add field branch_costs. (aarch64_branch_cost): Declare. * gcc/config/aarch64.c (generic_branch_cost): New. (generic_tunings): Set field cpu_branch_cost to generic_branch_cost. (cortexa53_tunings): Likewise. (cortexa57_tunings): Likewise. (thunderx_tunings): Likewise. (xgene1_tunings): Likewise. (aarch64_branch_cost): Define. * gcc/config/aarch64/aarch64.h (BRANCH_COST): Redefine. diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h index 8676c5c..77b01fa 100644 --- a/gcc/config/aarch64/aarch64-protos.h +++ b/gcc/config/aarch64/aarch64-protos.h @@ -162,12 +162,20 @@ struct cpu_vector_cost const int cond_not_taken_branch_cost; /* Cost of not taken branch. */ }; +/* Branch costs. */ +struct cpu_branch_cost +{ + const int predictable;/* Predictable branch or optimizing for size. */ + const int unpredictable; /* Unpredictable branch or optimizing for speed. 
*/ +}; + struct tune_params { const struct cpu_cost_table *const insn_extra_cost; const struct cpu_addrcost_table *const addr_cost; const struct cpu_regmove_cost *const regmove_cost; const struct cpu_vector_cost *const vec_costs; + const struct cpu_branch_cost *const branch_costs; const int memmov_cost; const int issue_rate; const unsigned int fuseable_ops; @@ -259,6 +267,8 @@ void aarch64_print_operand (FILE *, rtx, char); void aarch64_print_operand_address (FILE *, rtx); void aarch64_emit_call_insn (rtx); +int aarch64_branch_cost (bool, bool); + /* Initialize builtins for SIMD intrinsics. */ void init_aarch64_simd_builtins (void); diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 77a641e..a020316 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -339,12 +339,20 @@ static const struct cpu_vector_cost xgene1_vector_cost = #define AARCH64_FUSE_ADRP_LDR (1 << 3) #define AARCH64_FUSE_CMP_BRANCH (1 << 4) +/* Generic costs for branch instructions. */ +static const struct cpu_branch_cost generic_branch_cost = +{ + 2, /* Predictable. */ + 2 /* Unpredictable. 
*/ +}; + static const struct tune_params generic_tunings = { &cortexa57_extra_costs, &generic_addrcost_table, &generic_regmove_cost, &generic_vector_cost, + &generic_branch_cost, 4, /* memmov_cost */ 2, /* issue_rate */ AARCH64_FUSE_NOTHING, /* fuseable_ops */ @@ -362,6 +370,7 @@ static const struct tune_params cortexa53_tunings = &generic_addrcost_table, &cortexa53_regmove_cost, &generic_vector_cost, + &generic_branch_cost, 4, /* memmov_cost */ 2, /* issue_rate */ (AARCH64_FUSE_MOV_MOVK | AARCH64_FUSE_ADRP_ADD @@ -380,6 +389,7 @@ static const struct tune_params cortexa57_tunings = &cortexa57_addrcost_table, &cortexa57_regmove_cost, &cortexa57_vector_cost, + &generic_branch_cost, 4, /* memmov_cost */ 3, /* issue_rate */ (AARCH64_FUSE_MOV_MOVK | AARCH64_FUSE_ADRP_ADD @@ -398,6 +408,7 @@ static const struct tune_params thunderx_tunings = &generic_addrcost_table, &thunderx_regmove_cost, &generic_vector_cost, + &generic_branch_cost, 6, /* memmov_cost */ 2, /* issue_rate */ AARCH64_FUSE_CMP_BRANCH, /* fuseable_ops */ @@ -415,6 +426,7 @@ static const struct tune_params xgene1_tunings = &xgene1_addrcost_table, &xgene1_regmove_cost, &xgene1_vector_cost, + &generic_branch_cost, 6, /* memmov_cost */ 4, /* issue_rate */ AARCH64_FUSE_NOTHING, /* fuseable_ops */ @@ -5361,6 +5373,19 @@ aarch64_address_cost (rtx x, return cost; } +int +aarch64_branch_cost (bool speed_p, bool predictable_p) +{ + /* When optimizing for speed, use the cost of unpredictable branches. */ + const struct cpu_branch_cost *branch_costs = +aarch64_tune_params->branch_costs; + + if (!speed_p || predictable_p) +return branch_costs->predictable; + else +return branch_costs->unpredictable; +} + /* Return true if the RTX X in mode MODE is a zero or sign extract usable in an ADD or SUB (extended register) instruction. 
*/ static bool diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h index bf59e40..93a32f5 100644 --- a/gcc/config/aarch64/aarch64.h +++ b/gcc/config/aarch64/aarch64.h @@ -823,7 +823,8 @@ do { \ #define TRAMPOLINE_SECTION text_section /* To start with. */ -#define BRANCH_COST(SPEED_P, PREDICTABLE_P) 2 +#d
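The selection logic of the patch above can be tried out in isolation with the following standalone sketch. The struct layout and the predictable/unpredictable rule follow the diff; everything else is an assumption made for illustration: `branch_cost` takes the cost table as an explicit parameter instead of reading the backend's tuning structure, and `example_branch_cost` is an invented table whose numbers do not describe any real core.

```c
#include <stdbool.h>

/* Same shape as the structure added by the patch.  */
struct cpu_branch_cost
{
  const int predictable;    /* Predictable branch or optimizing for size.  */
  const int unpredictable;  /* Unpredictable branch or optimizing for speed.  */
};

/* Values from the patch: both costs stay at 2.  */
static const struct cpu_branch_cost generic_branch_cost = { 2, 2 };

/* Invented table for illustration; not taken from any real tuning.  */
static const struct cpu_branch_cost example_branch_cost = { 1, 3 };

/* Standalone version of the selection rule: the unpredictable cost is
   used only when optimizing for speed and the branch is not
   predictable.  */
static int
branch_cost (const struct cpu_branch_cost *costs, bool speed_p,
             bool predictable_p)
{
  if (!speed_p || predictable_p)
    return costs->predictable;
  return costs->unpredictable;
}
```

With the generic table every query returns 2, matching the statement that the patch leaves the effective cost unchanged. With the invented table, only the combination of optimizing for speed and an unpredictable branch returns the higher value.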
Re: [PATCH 01/12] add default definition of EH_RETURN_DATA_REGNO
On Tue, Apr 21, 2015 at 07:40:37AM -0600, Jeff Law wrote: > On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote: > >From: Trevor Saunders > > > >gcc/ChangeLog: > > > >2015-04-21 Trevor Saunders > > > > * defaults.h: New definition of EH_RETURN_DATA_REGNO. > > * except.c: Remove definition of EH_RETURN_DATA_REGNO. > > * builtins.c (expand_builtin): Remove check if > > EH_RETURN_DATA_REGNO is defined. > > * df-scan.c (df_bb_refs_collect): Likewise. > > (df_get_exit_block_use_set): Likewise. > > * haifa-sched.c (initiate_bb_reg_pressure_info): Likewise. > > * ira-lives.c (process_bb_node_lives): Likewise. > > * lra-lives.c (process_bb_lives): Likewise. > This one wasn't as obvious as the others, but is clearly OK once the full > loops being guarded by EH_RETURN_DATA_REGNO are examined. Except that the bb_has_eh_pred predicate might burn CPU time for basic blocks with many predecessors. Though, the question is if there are any important targets that don't define EH_RETURN_DATA_REGNO already. Jakub
Re: [PATCH 01/12] add default definition of EH_RETURN_DATA_REGNO
On 04/21/2015 08:00 AM, Jakub Jelinek wrote: On Tue, Apr 21, 2015 at 07:40:37AM -0600, Jeff Law wrote: On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote: From: Trevor Saunders gcc/ChangeLog: 2015-04-21 Trevor Saunders * defaults.h: New definition of EH_RETURN_DATA_REGNO. * except.c: Remove definition of EH_RETURN_DATA_REGNO. * builtins.c (expand_builtin): Remove check if EH_RETURN_DATA_REGNO is defined. * df-scan.c (df_bb_refs_collect): Likewise. (df_get_exit_block_use_set): Likewise. * haifa-sched.c (initiate_bb_reg_pressure_info): Likewise. * ira-lives.c (process_bb_node_lives): Likewise. * lra-lives.c (process_bb_lives): Likewise. This one wasn't as obvious as the others, but is clearly OK once the full loops being guarded by EH_RETURN_DATA_REGNO are examined. Except that the bb_has_eh_pred predicate might burn CPU time for basic blocks with many predecessors. Though, the question is if there are any important targets that don't define EH_RETURN_DATA_REGNO already. Probably not since they'll blow up elsewhere (I was recently helping someone with a private port that didn't define EH_RETURN_DATA_REGNO) :-) jeff
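The cost concern raised here comes from the shape of such a predicate: it has to walk the block's incoming edges, so its cost grows with the number of predecessors. The sketch below is a simplified model, not GCC's implementation: `struct edge`, `struct basic_block`, `EDGE_EH`, and `has_eh_pred` are illustrative stand-ins for the real CFG types and for bb_has_eh_pred.

```c
#include <stdbool.h>
#include <stddef.h>

#define EDGE_EH 1  /* illustrative flag value */

/* Simplified stand-ins for CFG edges and blocks.  */
struct edge { int flags; };
struct basic_block { struct edge *preds; size_t n_preds; };

/* Return true if any incoming edge is an exception-handling edge.
   In the worst case every predecessor edge is inspected.  */
static bool
has_eh_pred (const struct basic_block *bb)
{
  for (size_t i = 0; i < bb->n_preds; i++)
    if (bb->preds[i].flags & EDGE_EH)
      return true;
  return false;
}
```

When the loop guarded by the predicate was inside an #ifdef, a target without the macro paid nothing; with a default definition the scan runs on every target, which only matters if a relevant target leaves the macro undefined.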
Re: [PATCH][combine] Do not call rtx costs on potentially unrecognisable rtxes in combine
On 04/21/2015 03:18 AM, Kyrill Tkachov wrote:
>> Though I do wonder if, in practice, we can identify those cases that do
>> simplify more directly a priori and just punt everything else rather than
>> this rather convoluted approach.
>
> You mean like calling simplify_binary_operation that returns NULL if no
> simplification is possible?

Not entirely sure, just a general sense that we're doing far more work here than is justified by the potential gains. The cases we care about are very limited (negated or duplicated arguments) and I'd be surprised if they're still showing up in combine.c these days. I didn't look at the history of that code, but I suspect it is *very very* old.

I'm not asking you to tackle this problem, it was more meant as an observation. But if you want to dig deeper, go for it. If it were me, the first thing I'd do is try to construct a testcase that would get me into that code -- I'd bet it's hard, particularly with the tree and rtl reassociations we do these days.

Jeff
[patch] [java] bump libgcj soname
bump the libgcj soname on the trunk, as done for every release cycle, and update the cygwin/mingw32 files. ok for the trunk? Matthias gcc/ 2015-04-21 Matthias Klose * config/i386/cygwin.h (LIBGCJ_SONAME): Set libgcj version to -17. * config/i386/mingw32.h (LIBGCJ_SONAME): Set libgcj version to -17. libjava/ 2015-04-21 Matthias Klose * libtool-version: Bump soversion. Index: gcc/config/i386/cygwin.h === --- gcc/config/i386/cygwin.h (revision 68) +++ gcc/config/i386/cygwin.h (working copy) @@ -154,5 +154,5 @@ #define LIBGCC_SONAME "cyggcc_s" LIBGCC_EH_EXTN "-1.dll" /* We should find a way to not have to update this manually. */ -#define LIBGCJ_SONAME "cyggcj" /*LIBGCC_EH_EXTN*/ "-16.dll" +#define LIBGCJ_SONAME "cyggcj" /*LIBGCC_EH_EXTN*/ "-17.dll" Index: gcc/config/i386/mingw32.h === --- gcc/config/i386/mingw32.h (revision 68) +++ gcc/config/i386/mingw32.h (working copy) @@ -254,4 +254,4 @@ #define LIBGCC_SONAME "libgcc_s" LIBGCC_EH_EXTN "-1.dll" /* We should find a way to not have to update this manually. */ -#define LIBGCJ_SONAME "libgcj" /*LIBGCC_EH_EXTN*/ "-16.dll" +#define LIBGCJ_SONAME "libgcj" /*LIBGCC_EH_EXTN*/ "-17.dll" Index: libjava/libtool-version === --- libjava/libtool-version (revision 68) +++ libjava/libtool-version (working copy) @@ -5,4 +5,4 @@ # Note: When changing the version here, please do also update LIBGCJ_SONAME # in gcc/config/i386/cygwin.h and gcc/config/i386/mingw32.h. # CURRENT:REVISION:AGE -16:0:0 +17:0:0
Re: [PATCH][expr.c] PR 65358 Avoid clobbering partial argument during sibcall
On 04/21/2015 02:30 AM, Kyrill Tkachov wrote: From reading config/stormy16/stormy-abi it seems to me that we don't pass arguments partially in stormy16, so this code would never be called there. That leaves pa as the potential problematic target. I don't suppose there's an easy way to test on pa? My checkout of binutils doesn't seem to include a sim target for it. No simulator, no machines in the testfarm, the box I had access to via parisc-linux.org seems dead and my ancient PA overheats well before a bootstrap could complete. I often regret knowing about the backwards way many things were done on the PA because it makes me think about cases that only matter on dead architectures. Jeff
Re: [PATCH][combine] Do not call rtx costs on potentially unrecognisable rtxes in combine
On 21/04/15 15:06, Jeff Law wrote:
> On 04/21/2015 03:18 AM, Kyrill Tkachov wrote:
>>> Though I do wonder if, in practice, we can identify those cases that do
>>> simplify more directly a priori and just punt everything else rather
>>> than this rather convoluted approach.
>>
>> You mean like calling simplify_binary_operation that returns NULL if no
>> simplification is possible?
>
> Not entirely sure, just a general sense that we're doing far more work here
> than is justified by the potential gains. The cases we care about are very
> limited (negated or duplicated arguments) and I'd be surprised if they're
> still showing up in combine.c these days. I didn't look at the history of
> that code, but I suspect it is *very very* old.

I had a look when I was writing that patch and it was from 2005 (r96681).

> I'm not asking you to tackle this problem, it was more meant as an
> observation. But if you want to dig deeper, go for it. If it were me, the
> first thing I'd do is try to construct a testcase that would get me into
> that code -- I'd bet it's hard, particularly with the tree and rtl
> reassociations we do these days.

Yeah, the comment does mention that it's supposed to trigger rarely. I'm looking at it from the perspective of cleaning up rtx cost usages though.

Thanks,
Kyrill

> Jeff
Re: [patch] [java] bump libgcj soname
On Tue, Apr 21, 2015 at 04:07:13PM +0200, Matthias Klose wrote: > bump the libgcj soname on the trunk, as done for every release cycle, Is that really needed though these days? Weren't there basically zero changes to libjava (both libjava and libjava/classpath) in the last 2 or more years? The few ones were mostly updating Copyright notices, minor configure changes, but I really haven't seen anything ABI changing for quite a while. Jakub
Re: [patch] [java] bump libgcj soname
On 04/21/2015 04:11 PM, Jakub Jelinek wrote: > On Tue, Apr 21, 2015 at 04:07:13PM +0200, Matthias Klose wrote: >> bump the libgcj soname on the trunk, as done for every release cycle, > > Is that really needed though these days? > Weren't there basically zero changes to libjava (both libjava and > libjava/classpath) in the last 2 or more years? > The few ones were mostly updating Copyright notices, minor configure > changes, but I really haven't seen anything ABI changing for quite a while. yes, the GCC version is embedded in the GCJ_VERSIONED_LIBDIR which is defined as gcjsubdir=gcj-$gcjversion-$libgcj_soversion dbexecdir='$(toolexeclibdir)/'$gcjsubdir
Re: [PATCH 3/13] aarch64 musl support
> On Apr 20, 2015, at 11:52 AM, Szabolcs Nagy wrote:
>
> Set up dynamic linker name for aarch64.
>
> gcc/Changelog:
>
> 2015-04-16 Gregor Richards
>            Szabolcs Nagy
>
> * config/aarch64/aarch64-linux.h (MUSL_DYNAMIC_LINKER): Define.

I don't think you need to check whether the default is little- or big-endian here, as the specs always have one or the other passed through. Also, if musl does not support ilp32, you might want to error out. Or even define the dynamic linker name before support goes into musl.

Thanks,
Andrew

> <03-aarch64.patch>
Re: [C/C++ PATCH] Improve -Wlogical-op (PR c/63357)
On 21/04/15 13:16, Marek Polacek wrote:
> (-Wlogical-op still isn't enabled by either -Wall or -Wextra.)

The reason is https://gcc.gnu.org/PR61534, which means we don't want to warn for:

extern int xxx;
#define XXX xxx
int test (void)
{
  if (!XXX && xxx)
    return 4;
  else
    return 0;
}

(gcc/testsuite/gcc.dg/pr40172-3.c, although it should be moved to c-c++-common)

As noted in the PR: The problem is that !XXX becomes XXX == 0, but it has the location of "!", which is not virtual. If we look at the argument of the expression, then XXX is actually a var_decl, whose location corresponds to the declaration and not the use, and it is not virtual either. This is PR43486.

> Bootstrapped/regtested on x86_64-linux, ok for trunk?

Does it pass bootstrap if you enable it? That is, is GCC itself -Wlogical-op clean?

Cheers,

Manuel.
Re: [patch] [java] bump libgcj soname
On Tue, Apr 21, 2015 at 04:16:18PM +0200, Matthias Klose wrote: > On 04/21/2015 04:11 PM, Jakub Jelinek wrote: > > On Tue, Apr 21, 2015 at 04:07:13PM +0200, Matthias Klose wrote: > >> bump the libgcj soname on the trunk, as done for every release cycle, > > > > Is that really needed though these days? > > Weren't there basically zero changes to libjava (both libjava and > > libjava/classpath) in the last 2 or more years? > > The few ones were mostly updating Copyright notices, minor configure > > changes, but I really haven't seen anything ABI changing for quite a while. > > yes, the GCC version is embedded in the GCJ_VERSIONED_LIBDIR > > which is defined as > > gcjsubdir=gcj-$gcjversion-$libgcj_soversion > dbexecdir='$(toolexeclibdir)/'$gcjsubdir But why is that an argument for bumping it? If both GCC 5 and GCC 6 will (likely) provide the same ABI in the library, there is no reason not to use the same directory for those. Jakub
Re: [PATCH 02/12] remove some ifdef HAVE_cc0
On Tue, Apr 21, 2015 at 3:24 PM, wrote: > From: Trevor Saunders > > gcc/ChangeLog: > > 2015-04-21 Trevor Saunders > > * conditions.h: Define macros even if HAVE_cc0 is undefined. > * emit-rtl.c: Define functions even if HAVE_cc0 is undefined. > * final.c: Likewise. > * jump.c: Likewise. > * recog.c: Likewise. > * recog.h: Declare functions even when HAVE_cc0 is undefined. > * sched-deps.c (sched_analyze_2): Always compile case for cc0. > --- > gcc/conditions.h | 6 -- > gcc/emit-rtl.c | 2 -- > gcc/final.c | 2 -- > gcc/jump.c | 3 --- > gcc/recog.c | 2 -- > gcc/recog.h | 2 -- > gcc/sched-deps.c | 5 +++-- > 7 files changed, 3 insertions(+), 19 deletions(-) > > diff --git a/gcc/conditions.h b/gcc/conditions.h > index 2308bfc..7cd1e1c 100644 > --- a/gcc/conditions.h > +++ b/gcc/conditions.h > @@ -20,10 +20,6 @@ along with GCC; see the file COPYING3. If not see > #ifndef GCC_CONDITIONS_H > #define GCC_CONDITIONS_H > > -/* None of the things in the files exist if we don't use CC0. */ > - > -#ifdef HAVE_cc0 > - > /* The variable cc_status says how to interpret the condition code. > It is set by output routines for an instruction that sets the cc's > and examined by output routines for jump instructions. > @@ -117,6 +113,4 @@ extern CC_STATUS cc_status; > (cc_status.flags = 0, cc_status.value1 = 0, cc_status.value2 = 0, \ >CC_STATUS_MDEP_INIT) > > -#endif > - > #endif /* GCC_CONDITIONS_H */ > diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c > index 483eacb..c1974bb 100644 > --- a/gcc/emit-rtl.c > +++ b/gcc/emit-rtl.c > @@ -3541,7 +3541,6 @@ prev_active_insn (rtx uncast_insn) >return insn; > } > > -#ifdef HAVE_cc0 > /* Return the next insn that uses CC0 after INSN, which is assumed to > set it. This is the inverse of prev_cc0_setter (i.e., prev_cc0_setter > applied to the result of this function should yield INSN). > @@ -3589,7 +3588,6 @@ prev_cc0_setter (rtx uncast_insn) > >return insn; > } > -#endif > > #ifdef AUTO_INC_DEC > /* Find a RTX_AUTOINC class rtx which matches DATA. 
*/ > diff --git a/gcc/final.c b/gcc/final.c > index 1fa93d9..41f6bd9 100644 > --- a/gcc/final.c > +++ b/gcc/final.c > @@ -191,7 +191,6 @@ static rtx last_ignored_compare = 0; > > static int insn_counter = 0; > > -#ifdef HAVE_cc0 > /* This variable contains machine-dependent flags (defined in tm.h) > set and examined by output routines > that describe how to interpret the condition codes properly. */ > @@ -202,7 +201,6 @@ CC_STATUS cc_status; > from before the insn. */ > > CC_STATUS cc_prev_status; > -#endif > > /* Number of unmatched NOTE_INSN_BLOCK_BEG notes we have seen. */ > > diff --git a/gcc/jump.c b/gcc/jump.c > index 34b3b7b..bc91550 100644 > --- a/gcc/jump.c > +++ b/gcc/jump.c > @@ -1044,8 +1044,6 @@ jump_to_label_p (const rtx_insn *insn) > && JUMP_LABEL (insn) != NULL && !ANY_RETURN_P (JUMP_LABEL (insn))); > } > > -#ifdef HAVE_cc0 > - > /* Return nonzero if X is an RTX that only sets the condition codes > and has no side effects. */ > > @@ -1094,7 +1092,6 @@ sets_cc0_p (const_rtx x) > } >return 0; > } > -#endif > > /* Find all CODE_LABELs referred to in X, and increment their use > counts. If INSN is a JUMP_INSN and there is at least one > diff --git a/gcc/recog.c b/gcc/recog.c > index a9d3b1f..c3ad86f 100644 > --- a/gcc/recog.c > +++ b/gcc/recog.c > @@ -971,7 +971,6 @@ validate_simplify_insn (rtx insn) >return ((num_changes_pending () > 0) && (apply_change_group () > 0)); > } > > -#ifdef HAVE_cc0 > /* Return 1 if the insn using CC0 set by INSN does not contain > any ordered tests applied to the condition codes. > EQ and NE tests do not count. */ > @@ -988,7 +987,6 @@ next_insn_tests_no_inequality (rtx insn) >return (INSN_P (next) > && ! inequality_comparisons_p (PATTERN (next))); > } > -#endif > > /* Return 1 if OP is a valid general operand for machine mode MODE. 
> This is either a register reference, a memory reference, > diff --git a/gcc/recog.h b/gcc/recog.h > index 45ea671..8a38b26 100644 > --- a/gcc/recog.h > +++ b/gcc/recog.h > @@ -112,9 +112,7 @@ extern void validate_replace_rtx_group (rtx, rtx, rtx); > extern void validate_replace_src_group (rtx, rtx, rtx); > extern bool validate_simplify_insn (rtx insn); > extern int num_changes_pending (void); > -#ifdef HAVE_cc0 > extern int next_insn_tests_no_inequality (rtx); > -#endif > extern bool reg_fits_class_p (const_rtx, reg_class_t, int, machine_mode); > > extern int offsettable_memref_p (rtx); > diff --git a/gcc/sched-deps.c b/gcc/sched-deps.c > index 5434831..31de6be 100644 > --- a/gcc/sched-deps.c > +++ b/gcc/sched-deps.c > @@ -2608,8 +2608,10 @@ sched_analyze_2 (struct deps_desc *deps, rtx x, > rtx_insn *insn) > >return; > > -#ifdef HAVE_cc0 > case CC0: > +#ifdef HAVE_cc0 #ifndef ? > + gcc_unreachable (); > +#endif >/* Us
Re: [patch] [java] bump libgcj soname
On 04/21/2015 04:19 PM, Jakub Jelinek wrote: > On Tue, Apr 21, 2015 at 04:16:18PM +0200, Matthias Klose wrote: >> On 04/21/2015 04:11 PM, Jakub Jelinek wrote: >>> On Tue, Apr 21, 2015 at 04:07:13PM +0200, Matthias Klose wrote: bump the libgcj soname on the trunk, as done for every release cycle, >>> >>> Is that really needed though these days? >>> Weren't there basically zero changes to libjava (both libjava and >>> libjava/classpath) in the last 2 or more years? >>> The few ones were mostly updating Copyright notices, minor configure >>> changes, but I really haven't seen anything ABI changing for quite a while. >> >> yes, the GCC version is embedded in the GCJ_VERSIONED_LIBDIR >> >> which is defined as >> >> gcjsubdir=gcj-$gcjversion-$libgcj_soversion >> dbexecdir='$(toolexeclibdir)/'$gcjsubdir > > But why is that an argument for bumping it? If both GCC 5 and GCC 6 will > (likely) provide the same ABI in the library, there is no reason not to use > the same directory for those. but currently there are different directories used (gcjversion already changed on the trunk) and compiled into the library. Do you mean that gcjsubdir should be just defined as gcj? Matthias
[PATCH][libstdc++-v3] Add new dg-require-thread-fence directive.
Hi all,

This patch defines a new dg-require-thread-fence directive, and three test cases are updated to use it.

The new directive is used to check whether the target supports a thread fence, either through the target back-end or through an external library function call. A thread fence is required to expand atomic load/store.

In some cases a call to an external __sync_synchronize will be emitted, and it is not implemented. You will then get linking errors like this: undefined reference to `__sync_synchronize`.

Test cases which are gated by this directive will be skipped if no thread fence is available, for example the three test cases updated here. They fail on the arm-none-eabi target, where __sync_synchronize() isn't implemented and the target cpu has no memory_barrier.

__sync_synchronize () is used to check whether a thread fence is available. In GCC, __sync_synchronize is expanded as expand_mem_thread_fence (MEMMODEL_SEQ_CST).

Okay to commit?

libstdc++-v3/ChangeLog:

2015-04-21 Renlin Li

	* testsuite/lib/dg-options.exp (dg-require-thread-fence): New.
	* testsuite/lib/libstdc++.exp (check_v3_target_thread_fence): New.
	* testsuite/29_atomics/atomic_flag/clear/1.cc: Use it.
	* testsuite/29_atomics/atomic_flag/test_and_set/explicit.cc: Likewise.
	* testsuite/29_atomics/atomic_flag/test_and_set/implicit.cc: Likewise.

diff --git a/libstdc++-v3/testsuite/29_atomics/atomic_flag/clear/1.cc b/libstdc++-v3/testsuite/29_atomics/atomic_flag/clear/1.cc
index 0a4219c..a6e2299 100644
--- a/libstdc++-v3/testsuite/29_atomics/atomic_flag/clear/1.cc
+++ b/libstdc++-v3/testsuite/29_atomics/atomic_flag/clear/1.cc
@@ -1,4 +1,5 @@
 // { dg-options "-std=gnu++11" }
+// { dg-require-thread-fence "" }
 
 // Copyright (C) 2009-2015 Free Software Foundation, Inc.
// diff --git a/libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/explicit.cc b/libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/explicit.cc index 2ff740b..0655be4 100644 --- a/libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/explicit.cc +++ b/libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/explicit.cc @@ -1,4 +1,5 @@ // { dg-options "-std=gnu++11" } +// { dg-require-thread-fence "" } // Copyright (C) 2008-2015 Free Software Foundation, Inc. // diff --git a/libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/implicit.cc b/libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/implicit.cc index 6ac20c0..a867da2 100644 --- a/libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/implicit.cc +++ b/libstdc++-v3/testsuite/29_atomics/atomic_flag/test_and_set/implicit.cc @@ -1,4 +1,5 @@ // { dg-options "-std=gnu++11" } +// { dg-require-thread-fence "" } // Copyright (C) 2008-2015 Free Software Foundation, Inc. // diff --git a/libstdc++-v3/testsuite/lib/dg-options.exp b/libstdc++-v3/testsuite/lib/dg-options.exp index 38c8206..56ca896 100644 --- a/libstdc++-v3/testsuite/lib/dg-options.exp +++ b/libstdc++-v3/testsuite/lib/dg-options.exp @@ -115,6 +115,15 @@ proc dg-require-cmath { args } { return } +proc dg-require-thread-fence { args } { +if { ![ check_v3_target_thread_fence ] } { + upvar dg-do-what dg-do-what + set dg-do-what [list [lindex ${dg-do-what} 0] "N" "P"] + return +} +return +} + proc dg-require-atomic-builtins { args } { if { ![ check_v3_target_atomic_builtins ] } { upvar dg-do-what dg-do-what diff --git a/libstdc++-v3/testsuite/lib/libstdc++.exp b/libstdc++-v3/testsuite/lib/libstdc++.exp index b2f7d00..9e395e2 100644 --- a/libstdc++-v3/testsuite/lib/libstdc++.exp +++ b/libstdc++-v3/testsuite/lib/libstdc++.exp @@ -1221,6 +1221,62 @@ proc check_v3_target_cmath { } { return $et_c99_math } +proc check_v3_target_thread_fence { } { +global cxxflags +global DEFAULT_CXXFLAGS +global et_thread_fence + +global tool + 
+if { ![info exists et_thread_fence_target_name] } { + set et_thread_fence_target_name "" +} + +# If the target has changed since we set the cached value, clear it. +set current_target [current_target_name] +if { $current_target != $et_thread_fence_target_name } { + verbose "check_v3_target_thread_fence: `$et_thread_fence_target_name'" 2 + set et_thread_fence_target_name $current_target + if [info exists et_thread_fence] { + verbose "check_v3_target_thread_fence: removing cached result" 2 + unset et_thread_fence + } +} + +if [info exists et_thread_fence] { + verbose "check_v3_target_thread_fence: using cached result" 2 +} else { + set et_thread_fence 0 + + # Set up and preprocess a C++11 test program that depends + # on the thread fence to be available. + set src thread_fence[pid].cc + + set f [open $src "w"] + puts $f "int main() {" + puts $f "__sync_synchronize ();" + puts $f "return 0;" + puts $f "}" + close $f + + set cxxflags_saved $cxxflags + set cxxflags "$cxxflags $DEFAULT_CXXFLAGS -Werror -std=gnu++11" + + set lines [v3_target_compile $src /dev/null executable ""] + set cxxflags $cxxflag
Re: [patch] [java] bump libgcj soname
On Tue, Apr 21, 2015 at 04:29:52PM +0200, Matthias Klose wrote: > On 04/21/2015 04:19 PM, Jakub Jelinek wrote: > > On Tue, Apr 21, 2015 at 04:16:18PM +0200, Matthias Klose wrote: > >> On 04/21/2015 04:11 PM, Jakub Jelinek wrote: > >>> On Tue, Apr 21, 2015 at 04:07:13PM +0200, Matthias Klose wrote: > bump the libgcj soname on the trunk, as done for every release cycle, > >>> > >>> Is that really needed though these days? > >>> Weren't there basically zero changes to libjava (both libjava and > >>> libjava/classpath) in the last 2 or more years? > >>> The few ones were mostly updating Copyright notices, minor configure > >>> changes, but I really haven't seen anything ABI changing for quite a > >>> while. > >> > >> yes, the GCC version is embedded in the GCJ_VERSIONED_LIBDIR > >> > >> which is defined as > >> > >> gcjsubdir=gcj-$gcjversion-$libgcj_soversion > >> dbexecdir='$(toolexeclibdir)/'$gcjsubdir > > > > But why is that an argument for bumping it? If both GCC 5 and GCC 6 will > > (likely) provide the same ABI in the library, there is no reason not to use > > the same directory for those. > > but currently there are different directories used (gcjversion already changed > on the trunk) and compiled into the library. Do you mean that gcjsubdir > should > be just defined as gcj? What depends on BASE-VER sure, that is bumped automatically and should track the gcc version. But the soname, which is an unrelated number, there is no point to bump it. If you have a packaging issue, just solve it on the packaging side, but really there is no point to yearly bump a soname of something that doesn't change at all (and is really dead project for many years). Jakub
Re: [PATCH] 65479 - sanitizer stack trace missing frames past #0 on powerpc64
--- a/libsanitizer/ChangeLog +++ b/libsanitizer/ChangeLog @@ -1,3 +1,15 @@ +2015-04-19 Martin Sebor + + PR sanitizer/65479 + * libsanitizer/sanitizer_common/sanitizer_stacktrace.h + (StackTrace::signaled, StackTrace::min_insn_bytes): New data members. + (StackTrace::StackTrace): Initialize signaled. + * libsanitizer/sanitizer_common/sanitizer_stacktrace.cc + (StackTrace::GetPreviousInstructionPc): Rewrite. + * libsanitizer/sanitizer_common/sanitizer_stacktrace_libcdep.cc + (StackTrace::Print): Use min_insn_bytes to adjust PC value. + (BufferedStackTrace::Unwind): Set signaled. libsanitizer/ should not show up in the ChangeLog entry. But as somebody said earlier, the libsanitizer changes really should go to LLVM compiler-rt repo first and then be just backported, either cherry-picked (probably the case for the 5 branch backport later on) or go in full merge from compiler-rt. Okay, let me submit the sanitizer changes there. Since the tests will continue to fail without it, the libbacktrace change can go in later if that's preferable. --- a/libsanitizer/sanitizer_common/sanitizer_stacktrace.cc +++ b/libsanitizer/sanitizer_common/sanitizer_stacktrace.cc @@ -15,19 +15,33 @@ namespace __sanitizer { -uptr StackTrace::GetPreviousInstructionPc(uptr pc) { -#if defined(__arm__) - // Cancel Thumb bit. - pc = pc & (~1); -#endif Your code loses this, which is undesirable. The original function fails to return the pc value on ARM so I just took it out. I didn't look into what the intent was but all the tests pass with the patch on aarch64 (after applying the Fedora gcc 5 patch you mentioned yesterday). -#if defined(__powerpc__) || defined(__powerpc64__) - // PCs are always 4 byte aligned. - return pc - 4; -#elif defined(__sparc__) || defined(__mips__) - return pc - 8; The SPARC/MIPS case is of course needed, because on these architectures the call is followed by a delay slot. 
But I wonder why you need anything special on any other architecture, why pc - 1 isn't good enough for those. The point isn't to find a PC of the call instruction, on some targets that is very hard and you need to disassemble, but to just find some byte in the call instruction. I forgot about the delay slot. Thanks for the reminder. +const unsigned StackTrace::min_insn_bytes = +#if defined __ia64__ +// Intel Itanium has 5 byte instructions. +5 E.g. this is wrong, ia64 doesn't have 5 byte instructions, but has VLIW bundles, where in the 16 byte bundle there are up to 3 41-bit instructions plus template. But, ia64 isn't supported by libsanitizer and I doubt there are enough users that would be interested in writing support for a dead architecture. I suppose with the sanitizer output referencing the unmodified PC values on the stack the computation can be simplified to just subtract (and later add) 1 on all targets. Let me change that. Martin
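To make the point about "some byte in the call instruction" concrete, here is a minimal sketch of the adjustment as discussed in this thread. It is not the libsanitizer implementation: the function name previous_instruction_pc and the local uptr typedef are assumptions made for illustration. It keeps the Thumb-bit masking and the delay-slot case mentioned above, and falls back to pc - 1 everywhere else.

```c
#include <stdint.h>

typedef uintptr_t uptr;

/* Hypothetical helper, not the libsanitizer function: map a return
   address to some address inside the call instruction that produced it.  */
static uptr
previous_instruction_pc (uptr pc)
{
#if defined(__arm__)
  /* Cancel the Thumb bit.  */
  pc &= ~(uptr) 1;
#endif
#if defined(__sparc__) || defined(__mips__)
  /* The call is followed by a delay slot, so step back over both.  */
  return pc - 8;
#else
  /* Any byte of the call instruction is enough for a debug-info lookup.  */
  return pc - 1;
#endif
}
```

The result is only meant as a lookup key for debug info; the unmodified PC is what would be printed in the stack trace.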
Re: [PATCH 6/13] mips musl support
On Tue, Apr 21, 2015 at 01:58:02PM +, Matthew Fortune wrote: > Szabolcs Nagy writes: > > Set up dynamic linker name for mips. > > > > gcc/Changelog: > > > > 2015-04-16 Gregor Richards > > > > * config/mips/linux.h (MUSL_DYNAMIC_LINKER): Define. > > I understand that mips musl is o32 only currently is that correct? This is correct. Other ABIs if/when we support them will have different names. > There does however appear to be both soft and hard float variants > listed in the musl docs. Do you plan on using the same dynamic linker > name for both float variants? No problem if so but someone must have > decided to have unique names for big and little endian so I thought > it worth checking. No, it's supposed to be different (-sf suffix for soft float; see arch/mips/reloc.h in musl source). If this didn't make it into the patches it's an omission, probably because we didn't officially support the sf ABI at all for a long time. > Also, are you aware of the two nan encoding formats that MIPS has > and the support present in glibc's dynamic linker to deal with it? I am aware but somewhat skeptical of treating it as yet another dimension to ABI and the resulting ABI combinatorics. The vast majority of programs couldn't care less which is which and whether a NAN is quiet or signaling. Officially we just use the classic mips ABI (with qnan/snan swapped vs other archs) but there's no harm in somebody doing the opposite if they really know what they're doing. > I wonder if it would be wise to refuse to target musl unless the > ABI is known to be supported so that we can avoid compatibility > issues when different ABI variants are added in musl. Possibly, though this might make bootstrapping new ABIs harder. Rich
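As an illustration of the naming scheme being discussed, the sketch below builds a dynamic linker path for o32 MIPS from the endianness and float ABI. The "-sf" suffix is the one mentioned above; the "/lib/ld-musl-ARCH.so.1" layout and the "el" suffix follow musl's usual convention and should be treated as assumptions here, as should the helper name musl_mips_ldso_name, which is not part of GCC or musl.

```c
#include <stdio.h>

/* Hypothetical helper: build a musl-style dynamic linker path for o32 MIPS.
   LITTLE_ENDIAN_P selects the "el" suffix and SOFT_FLOAT_P the "-sf" suffix.
   Returns the snprintf result (length of the full name).  */
static int
musl_mips_ldso_name (char *buf, size_t len, int little_endian_p,
                     int soft_float_p)
{
  return snprintf (buf, len, "/lib/ld-musl-mips%s%s.so.1",
                   little_endian_p ? "el" : "",
                   soft_float_p ? "-sf" : "");
}
```

In GCC itself this selection would be expressed in the spec string of MUSL_DYNAMIC_LINKER rather than in C code; the function only shows which combinations get distinct names.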
Re: [PATCH] PR 62173, re-shuffle insns for RTL loop invariant hoisting
Jiong Wang writes: > 2015-04-14 18:24 GMT+01:00 Jeff Law : >> On 04/14/2015 10:48 AM, Steven Bosscher wrote: So I think this stage2/3 binary difference is acceptable? >>> >>> >>> No, they should be identical. If there's a difference, then there's a >>> bug - which, it seems, you've already found, too. >> >> RIght. And so the natural question is how to fix. >> >> At first glance it would seem like having this new code ignore dependencies >> rising from debug insns would work. >> >> Which then begs the question, what happens to the debug insn -- it's >> certainly not going to be correct anymore if the transformation is made. > > Exactly. > > The debug_insn 2776 in my example is to record the base address of a > local array. the new code is doing correctly here by not shuffling the > operands of insn 2556 and 2557 as there is additional reference of > reg:1473 from debug insn, although the code will still execute correctly > if we do the transformation. > > my understanding to fix this: > > * delete the out-of-date mismatch debug_insn? as there is no guarantee > to generate accurate debug info under -O2. > > IMO, this debug_insn may affect "DW_AT_location" field for variable > descrption of "classes" in .debug_info section, but it's omitted in > the final output already. > > <3><38a4d>: Abbrev Number: 137 (DW_TAG_variable) > <38a4f> DW_AT_name : (indirect string, offset: 0x18db): classes > <38a53> DW_AT_decl_file : 1 > <38a54> DW_AT_decl_line : 548 > <38a56> DW_AT_type: <0x38cb4> > > * update the debug_insn? if the following change is OK with dwarf standard > >from > > insn0: reg0 = fp + reg1 > debug_insn: var_loc = reg0 + const_off > insn1: reg2 = reg0 + const_off > >to > > insn0: reg0 = fp + const_off > debug_insn: var_loc = reg0 + reg1 > insn1: reg2 = reg0 + reg1 > > Thanks, > And attachment is the new patch which will update debug_insn as described in the second solution above. Now the stage2/3 binary differences on AArch64 gone away. Bootstrap OK. 
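The re-shuffling quoted above can be sketched in plain C. This is only an arithmetic illustration under assumed names (addr_before, addr_after), not code from the patch: both orders compute the same address, but in the second one the intermediate value depends only on the frame pointer and a constant, so it is invariant in a loop where only reg1 changes.

```c
#include <stdint.h>

/* Address as computed before the transformation:
   reg0 = fp + reg1; result = reg0 + const_off.  */
static uintptr_t
addr_before (uintptr_t fp, uintptr_t reg1, uintptr_t const_off)
{
  uintptr_t reg0 = fp + reg1;
  return reg0 + const_off;
}

/* After re-shuffling: reg0 = fp + const_off no longer depends on reg1,
   so it can be hoisted out of a loop in which only reg1 varies.  */
static uintptr_t
addr_after (uintptr_t fp, uintptr_t reg1, uintptr_t const_off)
{
  uintptr_t reg0 = fp + const_off;   /* loop-invariant part */
  return reg0 + reg1;
}
```

Any debug_insn that described a variable location as reg0 + const_off has to be rewritten to reg0 + reg1 at the same time, which is what the second option above proposes.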
On AArch64, this patch gives 600+ new rtl loop invariants found across spec2k6 float, and a +4.5% perf improvement on 436.cactusADM because four new invariants are found in the critical function "regex_compile". Similar improvements may be achieved on other RISC backends like powerpc/mips, I guess.

One thing to mention: for AArch64, one minor glitch in aarch64_legitimize_address needs to be fixed to let this patch take effect. I will send out that patch later as it's a separate issue. Powerpc/Mips don't have this glitch in the LEGITIMIZE_ADDRESS hook, so they should be OK, and I verified that the base address of the local array in the testcase given by Seb on pr62173 is hoisted on ppc64 now. I think pr62173 is fixed on those 64bit archs by this patch.

Thoughts?

Thanks.

2015-04-21 Jiong Wang

gcc/
	* loop-invariant.c (find_defs): Enable DF_DU_CHAIN build.
	(vfp_const_iv): New hash table.
	(expensive_addr_check_p): New boolean.
	(init_inv_motion_data): Initialize new variables.
	(free_inv_motion_data): Release hash table.
	(create_new_invariant): Set cheap_address to false for iv in
	vfp_const_iv table.
	(find_invariant_insn): Skip dependencies check for iv in
	vfp_const_iv table.
	(use_for_single_du): New function.
	(reshuffle_insn_with_vfp): Likewise.
	(find_invariants_bb): Call reshuffle_insn_with_vfp.

gcc/testsuite/
	* gcc.dg/pr62173.c: New testcase.

--
Regards,
Jiong

diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c index f79b497..f70dfb0 100644 --- a/gcc/loop-invariant.c +++ b/gcc/loop-invariant.c @@ -203,6 +203,8 @@ typedef struct invariant *invariant_p; /* The invariants. */ static vec invariants; +static hash_table > *vfp_const_iv; +static bool need_expensive_addr_check_p; /* Check the size of the invariant table and realloc if necessary. 
*/ @@ -695,7 +697,7 @@ find_defs (struct loop *loop) df_remove_problem (df_chain); df_process_deferred_rescans (); - df_chain_add_problem (DF_UD_CHAIN); + df_chain_add_problem (DF_UD_CHAIN + DF_DU_CHAIN); df_set_flags (DF_RD_PRUNE_DEAD_DEFS); df_analyze_loop (loop); check_invariant_table_size (); @@ -742,6 +744,9 @@ create_new_invariant (struct def *def, rtx_insn *insn, bitmap depends_on, See http://gcc.gnu.org/ml/gcc-patches/2009-10/msg01210.html . */ inv->cheap_address = address_cost (SET_SRC (set), word_mode, ADDR_SPACE_GENERIC, speed) < 3; + + if (need_expensive_addr_check_p && vfp_const_iv->find (insn)) + inv->cheap_address = false; } else { @@ -952,7 +957,8 @@ find_invariant_insn (rtx_insn *insn, bool always_reached, bool always_executed) return; depends_on = BITMAP_ALLOC (NULL); - if (!check_dependencies (insn, depends_on)) + if (!vfp_const_iv->find (insn) + && !check_dependencies (insn, depends_on)) { BITMAP_FREE (depends_on); return; @@ -1007,6 +1013,180 @@ find_invariants_insn (rtx_insn *in
Re: [PATCH] 65479 - sanitizer stack trace missing frames past #0 on powerpc64
On 04/21/2015 06:39 AM, Peter Bergner wrote: On Tue, 2015-04-21 at 08:22 +0200, Jakub Jelinek wrote: -#if defined(__powerpc__) || defined(__powerpc64__) - // PCs are always 4 byte aligned. - return pc - 4; -#elif defined(__sparc__) || defined(__mips__) - return pc - 8; The SPARC/MIPS case is of course needed, because on these architectures the call is followed by a delay slot. But I wonder why you need anything special on any other architecture, why pc - 1 isn't good enough for those. The point isn't to find a PC of the call instruction, on some targets that is very hard and you need to disassemble, but to just find some byte in the call instruction. I wrote the "pc - 4" code for powerpc* and I guess I was just being pedantic on returning the first address of the instruction. If using "pc - 1" works, then I'm fine with that. It works fine with the patch and produces sensible output because the decremented address is only used to look up the debug info and restored before it's output. Otherwise (with the unpatched code) we'd end up with odd PC addresses in the stack trace. Martin Peter
[patch, avr] extend part-clobbered check to AVR_TINY architecture
Hi,

When I tried backporting AVR_TINY architecture support to 4.9, the build failed in libgcc for AVR_TINY. The failure was due to the same ICE as in:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53065

The fix provided for that bug checks whether the mode crosses the callee-saved register boundary. The patch below updates that check, as AVR_TINY has a different set of callee-saved registers (r18 and r19). This patch is against trunk.

NOTE: The ICE is reproducible only with 4.9 + tiny patch and --with-dwarf2 enabled.

Is this ok for trunk?

diff --git a/gcc/config/avr/avr.c b/gcc/config/avr/avr.c
index 68d5ddc..2f441e5 100644
--- a/gcc/config/avr/avr.c
+++ b/gcc/config/avr/avr.c
@@ -11333,9 +11333,10 @@ avr_hard_regno_call_part_clobbered (unsigned regno, machine_mode mode)
     return 0;
 
   /* Return true if any of the following boundaries is crossed:
-     17/18, 27/28 and 29/30.  */
+     17/18 or 19/20 (if AVR_TINY), 27/28 and 29/30.  */
 
-  return ((regno < 18 && regno + GET_MODE_SIZE (mode) > 18)
+  return ((regno <= LAST_CALLEE_SAVED_REG &&
+           regno + GET_MODE_SIZE (mode) > (LAST_CALLEE_SAVED_REG + 1))
           || (regno < REG_Y && regno + GET_MODE_SIZE (mode) > REG_Y)
           || (regno < REG_Z && regno + GET_MODE_SIZE (mode) > REG_Z));
 }

Regards,
Pitchumani
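The boundary test in the patch can be tried out in isolation. The sketch below is a stand-alone approximation, not the GCC function: crosses_clobber_boundary is a hypothetical name, the last callee-saved register is passed in as a parameter (17, or 19 for AVR_TINY, per the patch comment), and the early-return checks that precede the final expression in avr_hard_regno_call_part_clobbered are left out.

```c
#include <stdbool.h>

/* Register numbers of the Y and Z pointer registers on AVR.  */
enum { REG_Y = 28, REG_Z = 30 };

/* Hypothetical stand-alone version of the boundary test.  REGNO is the
   first hard register of the value, SIZE its size in bytes (one byte per
   register), and LAST_CALLEE_SAVED is 17, or 19 for AVR_TINY.  */
static bool
crosses_clobber_boundary (unsigned regno, unsigned size,
                          unsigned last_callee_saved)
{
  return ((regno <= last_callee_saved
           && regno + size > last_callee_saved + 1)
          || (regno < REG_Y && regno + size > REG_Y)
          || (regno < REG_Z && regno + size > REG_Z));
}
```

With last_callee_saved set to 17, a 4-byte value starting in r16 crosses the 17/18 boundary; with 19, the same check moves to the 19/20 boundary, so a 4-byte value starting in r18 is the one that is partly clobbered.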
Re: [PATCH 3/13] aarch64 musl support
On 21/04/15 15:16, pins...@gmail.com wrote:
>
> I don't think you need to check if defaulting to little or big-endian here,
> as the specs always have one or the other passing through.
>

I was not aware of this. Maybe the ifdef is not necessary for other archs either; I will check.

> Also if musl does not support ilp32, you might want to error out. Or even
> define the dynamic linker name even before support goes into musl.
>

OK, I guess adding %{mabi=ilp32:_ilp32} won't hurt us.
RE: [PATCH 6/13] mips musl support
Rich Felker writes: > On Tue, Apr 21, 2015 at 01:58:02PM +, Matthew Fortune wrote: > > Szabolcs Nagy writes: > > > Set up dynamic linker name for mips. > > > > > > gcc/Changelog: > > > > > > 2015-04-16 Gregor Richards > > > > > > * config/mips/linux.h (MUSL_DYNAMIC_LINKER): Define. > > > > I understand that mips musl is o32 only currently is that correct? > > This is correct. Other ABIs if/when we support them will have different > names. > > > There does however appear to be both soft and hard float variants > > listed in the musl docs. Do you plan on using the same dynamic linker > > name for both float variants? No problem if so but someone must have > > decided to have unique names for big and little endian so I thought > it > > worth checking. > > No, it's supposed to be different (-sf suffix for soft float; see > arch/mips/reloc.h in musl source). If this didn't make it into the > patches it's an omission, probably because we didn't officially support > the sf ABI at all for a long time. > > > Also, are you aware of the two nan encoding formats that MIPS has and > > the support present in glibc's dynamic linker to deal with it? > > I am aware but somewhat skeptical of treating it as yet another > dimension to ABI and the resulting ABI combinatorics. The vast majority > of programs couldn't care less which is which and whether a NAN is > quiet or signaling. Officially we just use the classic mips ABI (with > qnan/snan swapped vs other archs) but there's no harm in somebody doing > the opposite if they really know what they're doing. Couldn't agree more here but I know some people have been concerned about it so the strict rules were put in place. I will attempt to remember and copy the musl list when putting out a plan for formally relaxing the nan encoding rules. The proposal is probably less than 2 weeks away from being ready to review, it does of course make certain assumptions originating from glibc as reference but is an independent ABI proposal. 
> > I wonder if it would be wise to refuse to target musl unless the ABI > > is known to be supported so that we can avoid compatibility issues > > when different ABI variants are added in musl. > > Possibly, though this might make bootstrapping new ABIs harder. Indeed. The other alternative would be to set the dynamic linker name to something slightly silly for unsupported ABIs like /lib/fixme.so which would make it possible to bootstrap via the addition of a symlink but it is clearly not the approved name. thanks, Matthew
Re: [PATCH 04/12] always define HAVE_cc0
On Tue, Apr 21, 2015 at 07:53:05AM -0600, Jeff Law wrote: > On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote: > >From: Trevor Saunders > > > >gcc/ChangeLog: > > > >2015-04-21 Trevor Saunders > > > > * genconfig.c (main): Always define HAVE_cc0. > > * caller-save.c (insert_one_insn): Change ifdef HAVE_cc0 to #if > > HAVE_cc0. > > * cfgcleanup.c (flow_find_cross_jump): Likewise. > > (flow_find_head_matching_sequence): Likewise. > > (try_head_merge_bb): Likewise. > > * cfgrtl.c (rtl_merge_blocks): Likewise. > > (try_redirect_by_replacing_jump): Likewise. > > (rtl_tidy_fallthru_edge): Likewise. > > * combine.c (do_SUBST_MODE): Likewise. > > (insn_a_feeds_b): Likewise. > > (combine_instructions): Likewise. > > (can_combine_p): Likewise. > > (try_combine): Likewise. > > (find_split_point): Likewise. > > (subst): Likewise. > > (simplify_set): Likewise. > > (distribute_notes): Likewise. > > * cprop.c (cprop_jump): Likewise. > > * cse.c (cse_extended_basic_block): Likewise. > > * df-problems.c (can_move_insns_across): Likewise. > > * final.c (final): Likewise. > > (final_scan_insn): Likewise. > > * function.c (emit_use_return_register_into_block): Likewise. > > * gcse.c (insert_insn_end_basic_block): Likewise. > > * haifa-sched.c (sched_init): Likewise. > > * ira.c (find_moveable_pseudos): Likewise. > > * loop-invariant.c (find_invariant_insn): Likewise. > > * lra-constraints.c (curr_insn_transform): Likewise. > > * optabs.c (prepare_cmp_insn): Likewise. > > * postreload.c (reload_combine_recognize_const_pattern): > > * Likewise. > > * reload.c (find_reloads): Likewise. > > (find_reloads_address_1): Likewise. > > * reorg.c (delete_scheduled_jump): Likewise. > > (steal_delay_list_from_target): Likewise. > > (steal_delay_list_from_fallthrough): Likewise. > > (try_merge_delay_insns): Likewise. > > (redundant_insn): Likewise. > > (fill_simple_delay_slots): Likewise. > > (fill_slots_from_thread): Likewise. > > (delete_computation): Likewise. 
> > (relax_delay_slots): Likewise. > > * sched-deps.c (sched_analyze_2): Likewise. > > * sched-rgn.c (add_branch_dependences): Likewise. > Doesn't go as far as I'd like, but it's still an improvement. Yeah, this one really just enables other nice things. I really dislike big patches since there's invariably something wrong somewhere and if you don't really know the code in question it can be next to impossible to figure out where the problem is. Trev > > OK. > > jeff >
Re: [PATCH 03/12] more removal of ifdef HAVE_cc0
On Tue, Apr 21, 2015 at 07:51:14AM -0600, Jeff Law wrote:
> On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:
> >From: Trevor Saunders
> >
> >gcc/ChangeLog:
> >
> >2015-04-21 Trevor Saunders
> >
> >	* combine.c (find_single_use): Remove HAVE_cc0 ifdef for code
> >	that is trivially dead on non cc0 targets.
> >	(simplify_set): Likewise.
> >	(mark_used_regs_combine): Likewise.
> >	* cse.c (new_basic_block): Likewise.
> >	(fold_rtx): Likewise.
> >	(cse_insn): Likewise.
> >	(cse_extended_basic_block): Likewise.
> >	(set_live_p): Likewise.
> >	* rtlanal.c (canonicalize_condition): Likewise.
> >	* simplify-rtx.c (simplify_binary_operation_1): Likewise.
> OK. I find myself wondering if the conditionals should look like
>
> if (HAVE_cc0
>     && (whatever))
>
> But I doubt it makes any measurable difference. It's something we can
> always add in the future if we feel the need to avoid the runtime checks for
> things that aren't ever going to happen on most modern targets.

Yeah, it seems reasonably likely the branch predictor can deal with this for us (I tried to ensure things handled this way didn't do much other than a compare). If not, well, that's what profiling is for :-)

Trev

>
> jeff
>
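For readers following the series, the difference between the #ifdef form and the "if (HAVE_cc0 && ...)" form suggested above can be sketched as follows. The names uses_cc0_p and needs_cc0_handling are hypothetical stand-ins, and the sketch assumes HAVE_cc0 is always defined as 0 or 1, which is what the later "always define HAVE_cc0" patch in this series arranges.

```c
/* Assumption for this sketch: the generated configuration always defines
   HAVE_cc0, to 0 on targets without cc0.  */
#ifndef HAVE_cc0
#define HAVE_cc0 0
#endif

/* Hypothetical stand-in for a cc0-specific predicate.  */
static int
uses_cc0_p (int insn_code)
{
  return insn_code < 0;
}

/* Old style: the cc0 code is invisible to the compiler on non-cc0 targets.
     #ifdef HAVE_cc0
       if (uses_cc0_p (insn_code)) return 1;
     #endif
   New style: always compiled, and folded away when HAVE_cc0 is 0.  */
static int
needs_cc0_handling (int insn_code)
{
  if (HAVE_cc0 && uses_cc0_p (insn_code))
    return 1;
  return 0;
}
```

The benefit of the second form is that the cc0 path is still parsed and type-checked on every target, so it is less likely to bit-rot, while a constant-zero HAVE_cc0 lets the compiler drop the branch.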
Re: [PATCH 02/12] remove some ifdef HAVE_cc0
On Tue, Apr 21, 2015 at 04:14:01PM +0200, Richard Biener wrote: > On Tue, Apr 21, 2015 at 3:24 PM, wrote: > > From: Trevor Saunders > > > > gcc/ChangeLog: > > > > 2015-04-21 Trevor Saunders > > > > * conditions.h: Define macros even if HAVE_cc0 is undefined. > > * emit-rtl.c: Define functions even if HAVE_cc0 is undefined. > > * final.c: Likewise. > > * jump.c: Likewise. > > * recog.c: Likewise. > > * recog.h: Declare functions even when HAVE_cc0 is undefined. > > * sched-deps.c (sched_analyze_2): Always compile case for cc0. > > --- > > gcc/conditions.h | 6 -- > > gcc/emit-rtl.c | 2 -- > > gcc/final.c | 2 -- > > gcc/jump.c | 3 --- > > gcc/recog.c | 2 -- > > gcc/recog.h | 2 -- > > gcc/sched-deps.c | 5 +++-- > > 7 files changed, 3 insertions(+), 19 deletions(-) > > > > diff --git a/gcc/conditions.h b/gcc/conditions.h > > index 2308bfc..7cd1e1c 100644 > > --- a/gcc/conditions.h > > +++ b/gcc/conditions.h > > @@ -20,10 +20,6 @@ along with GCC; see the file COPYING3. If not see > > #ifndef GCC_CONDITIONS_H > > #define GCC_CONDITIONS_H > > > > -/* None of the things in the files exist if we don't use CC0. */ > > - > > -#ifdef HAVE_cc0 > > - > > /* The variable cc_status says how to interpret the condition code. > > It is set by output routines for an instruction that sets the cc's > > and examined by output routines for jump instructions. > > @@ -117,6 +113,4 @@ extern CC_STATUS cc_status; > > (cc_status.flags = 0, cc_status.value1 = 0, cc_status.value2 = 0, \ > >CC_STATUS_MDEP_INIT) > > > > -#endif > > - > > #endif /* GCC_CONDITIONS_H */ > > diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c > > index 483eacb..c1974bb 100644 > > --- a/gcc/emit-rtl.c > > +++ b/gcc/emit-rtl.c > > @@ -3541,7 +3541,6 @@ prev_active_insn (rtx uncast_insn) > >return insn; > > } > > > > -#ifdef HAVE_cc0 > > /* Return the next insn that uses CC0 after INSN, which is assumed to > > set it. 
This is the inverse of prev_cc0_setter (i.e., prev_cc0_setter > > applied to the result of this function should yield INSN). > > @@ -3589,7 +3588,6 @@ prev_cc0_setter (rtx uncast_insn) > > > >return insn; > > } > > -#endif > > > > #ifdef AUTO_INC_DEC > > /* Find a RTX_AUTOINC class rtx which matches DATA. */ > > diff --git a/gcc/final.c b/gcc/final.c > > index 1fa93d9..41f6bd9 100644 > > --- a/gcc/final.c > > +++ b/gcc/final.c > > @@ -191,7 +191,6 @@ static rtx last_ignored_compare = 0; > > > > static int insn_counter = 0; > > > > -#ifdef HAVE_cc0 > > /* This variable contains machine-dependent flags (defined in tm.h) > > set and examined by output routines > > that describe how to interpret the condition codes properly. */ > > @@ -202,7 +201,6 @@ CC_STATUS cc_status; > > from before the insn. */ > > > > CC_STATUS cc_prev_status; > > -#endif > > > > /* Number of unmatched NOTE_INSN_BLOCK_BEG notes we have seen. */ > > > > diff --git a/gcc/jump.c b/gcc/jump.c > > index 34b3b7b..bc91550 100644 > > --- a/gcc/jump.c > > +++ b/gcc/jump.c > > @@ -1044,8 +1044,6 @@ jump_to_label_p (const rtx_insn *insn) > > && JUMP_LABEL (insn) != NULL && !ANY_RETURN_P (JUMP_LABEL > > (insn))); > > } > > > > -#ifdef HAVE_cc0 > > - > > /* Return nonzero if X is an RTX that only sets the condition codes > > and has no side effects. */ > > > > @@ -1094,7 +1092,6 @@ sets_cc0_p (const_rtx x) > > } > >return 0; > > } > > -#endif > > > > /* Find all CODE_LABELs referred to in X, and increment their use > > counts. If INSN is a JUMP_INSN and there is at least one > > diff --git a/gcc/recog.c b/gcc/recog.c > > index a9d3b1f..c3ad86f 100644 > > --- a/gcc/recog.c > > +++ b/gcc/recog.c > > @@ -971,7 +971,6 @@ validate_simplify_insn (rtx insn) > >return ((num_changes_pending () > 0) && (apply_change_group () > 0)); > > } > > > > -#ifdef HAVE_cc0 > > /* Return 1 if the insn using CC0 set by INSN does not contain > > any ordered tests applied to the condition codes. > > EQ and NE tests do not count. 
*/ > > @@ -988,7 +987,6 @@ next_insn_tests_no_inequality (rtx insn) > >return (INSN_P (next) > > && ! inequality_comparisons_p (PATTERN (next))); > > } > > -#endif > > > > /* Return 1 if OP is a valid general operand for machine mode MODE. > > This is either a register reference, a memory reference, > > diff --git a/gcc/recog.h b/gcc/recog.h > > index 45ea671..8a38b26 100644 > > --- a/gcc/recog.h > > +++ b/gcc/recog.h > > @@ -112,9 +112,7 @@ extern void validate_replace_rtx_group (rtx, rtx, rtx); > > extern void validate_replace_src_group (rtx, rtx, rtx); > > extern bool validate_simplify_insn (rtx insn); > > extern int num_changes_pending (void); > > -#ifdef HAVE_cc0 > > extern int next_insn_tests_no_inequality (rtx); > > -#endif > > extern bool reg_fits_class_p (const_rtx, reg_class_t, int, machine_mode); > > > > extern int offsettable_memref_p (rtx); > > diff --git
Re: [PATCH 00/12] Reduce conditional compilation
On Tue, Apr 21, 2015 at 07:57:19AM -0600, Jeff Law wrote:
> On 04/21/2015 07:24 AM, tbsaunde+...@tbsaunde.org wrote:
> >From: Trevor Saunders
> >
> >Hi,
> >
> >This is a first round of patches to reduce the amount of code within #if /
> >#ifdef.  This makes it incrementally easier to not break configs other than
> >the one being built, and moves things slightly closer to using target hooks
> >for everything.
> >
> >each commit bootstrapped and regtested on x86_64-linux-gnu without
> >regression, and whole patch set run through config-list.mk without issue, ok?
> So I think after looking at this patchset, any changes of a similar nature
> you want to make should be considered pre-approved.  Just post them for
> archival purposes, but no need for you to wait for review as long as they
> have the same purpose and overall structure as was seen in these patches.

thanks!  It's also always nice to have someone double check your logic :-)

Trev

>
> jeff
>
[WIP] OpenMP 4 NVPTX support
Hi! Attached is a minimal patch to get at least a trivial OpenMP 4.0 testcase offloading to NVPTX (the first patch). The second patch is WIP, just first few needed changes to make libgomp to build for NVPTX (several weeks of work at least). The following seems to work and the output suggests that it was offloaded to a non-SHM arch: int main () { int v = 0; int *w = 0; int x = 0; #pragma omp target { v = 6; w = &v; x = 1; // omp_is_initial_device (); } __builtin_printf ("%d %p %p %d\n", v, &v, w, x); return 0; } but already tiny bit more complicated testcase: extern void *malloc (__SIZE_TYPE__); extern void free (void *); int main () { int v = 0; int *w = 0; int x = 0; #pragma omp target { v = 6; w = &v; char *p = malloc (64); x = 1; // omp_is_initial_device (); free (p); } __builtin_printf ("%d %p %p %d\n", v, &v, w, x); return 0; } suggests that while it is nice that when building nvptx accel compiler we build libgcc.a, libc.a, libm.a, libgfortran.a (and in the future hopefully libgomp.a), nothing attempts to link those in :(. Is the plan to link those in at mkoffload time (haven't seen any attempt of mkoffload to invoke the nvptx-none-ld linker though), or link those in somehow at link_ptx time in the plugin? In either case, it isn't clear to me how things will work (if at all) in the case where multiple shared libraries (or executable and at least one shared library) have their own offloading bits, and if you try to e.g. call an offloaded function defined in the shared library from an offloaded kernel in the executable, because if any library needs some global singleton case, if it is linked multiple times, no idea what the PTX JIT will do. Once that is resolved, another thing will be to figure out how to efficiently implement the TLS libgomp needs for its ICVs and other state - right now it uses either __thread, or pthread_getspecific, neither of these is usable of course. 
I've been thinking about an array of those structures in .shared memory indexed by %tid.x, but I guess that runs into the issue that the array would need to be declared fixed size and there is a very small size limitation on .shared memory size. So perhaps a file scope .shared pointer to global memory, where whomever launches an OpenMP 4.0 kernel (either the libgomp-plugin-nvptx.so.1 doing GOMP_run, or later on dynamic parallelism from GOMP_target in the nvptx libgomp.a) allocates the memory and some wrapper sets the .shared variable to that allocated memory, then calls the kernel? Jakub --- libgomp/plugin/plugin-nvptx.c.jj2015-04-21 08:38:00.0 +0200 +++ libgomp/plugin/plugin-nvptx.c 2015-04-21 16:55:25.247470080 +0200 @@ -978,8 +978,8 @@ event_add (enum ptx_event_type type, CUe void nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs, - size_t *sizes, unsigned short *kinds, int num_gangs, int num_workers, - int vector_length, int async, void *targ_mem_desc) + size_t *sizes, unsigned short *kinds, int num_gangs, + int num_workers, int vector_length, int async, void *targ_mem_desc) { struct targ_fn_descriptor *targ_fn = (struct targ_fn_descriptor *) fn; CUfunction function; @@ -1137,7 +1137,6 @@ nvptx_host2dev (void *d, const void *h, CUresult r; CUdeviceptr pb; size_t ps; - struct nvptx_thread *nvthd = nvptx_thread (); if (!s) return 0; @@ -1162,7 +1161,8 @@ nvptx_host2dev (void *d, const void *h, GOMP_PLUGIN_fatal ("invalid size"); #ifndef DISABLE_ASYNC - if (nvthd->current_stream != nvthd->ptx_dev->null_stream) + struct nvptx_thread *nvthd = nvptx_thread (); + if (nvthd && nvthd->current_stream != nvthd->ptx_dev->null_stream) { CUevent *e; @@ -1202,7 +1202,6 @@ nvptx_dev2host (void *h, const void *d, CUresult r; CUdeviceptr pb; size_t ps; - struct nvptx_thread *nvthd = nvptx_thread (); if (!s) return 0; @@ -1227,7 +1226,8 @@ nvptx_dev2host (void *h, const void *d, GOMP_PLUGIN_fatal ("invalid size"); #ifndef DISABLE_ASYNC - if 
(nvthd->current_stream != nvthd->ptx_dev->null_stream) + struct nvptx_thread *nvthd = nvptx_thread (); + if (nvthd && nvthd->current_stream != nvthd->ptx_dev->null_stream) { CUevent *e; @@ -1559,7 +1559,8 @@ GOMP_OFFLOAD_get_name (void) unsigned int GOMP_OFFLOAD_get_caps (void) { - return GOMP_OFFLOAD_CAP_OPENACC_200; + return GOMP_OFFLOAD_CAP_OPENACC_200 +| GOMP_OFFLOAD_CAP_OPENMP_400; } int @@ -1759,7 +1760,7 @@ GOMP_OFFLOAD_openacc_parallel (void (*fn void *targ_mem_desc) { nvptx_exec (fn, mapnum, hostaddrs, devaddrs, sizes, kinds, num_gangs, - num_workers, vector_length, async, targ_mem_desc); + num_workers, vector_length, async, targ_mem_desc); } void @@ -1889,3 +1890,27 @@ GOMP_OFFLOAD_openacc_set_cuda_stream (in { return nvptx_set_cuda_stream (async, stream); } + +void
Re: [PATCH][doc] Improve pipeline description docs a bit
On 04/20/2015 04:31 AM, Kyrill Tkachov wrote: Hi all, This patch attempts to improve the pipeline description documentation. It fixes some grammar errors,typos and clarifies some concepts. The sections on the syntactic constructs are formatted to have a small description, and example, description of syntax elements and some elaboration. Is this ok for trunk? Thanks, Kyrill 2014-04-20 Kyrylo Tkachov * doc/md.texi (Specifying processor pipeline description): Improve wording. Clarify some constructs. H. I guess overall this is an improvement, but I still see quite a few things that need tweaking (and I wasn't even looking very hard). +latency time}. Instructions may not complete execution until all inputs +to the instruction have been evaluated and are available for use. +Taking data dependence delays into account is simple. I don't think the above sentence adds anything and could be deleted. +The data dependence (true, output, and anti-dependence) delay between two +instructions is modelled as being constant. In most cases this approach is +adequate. The second kind of interlock delays is a reservation delay. +The reservation delay means that two or more executing instructions will require s/will require/require/ + +The define_automaton construct declares the names of automata. +It takes the following form: @smallexample (define_automaton @var{automata-names}) @end smallexample @var{automata-names} is a string giving names of the automata. The -names are separated by commas. All the automata should have unique names. -The automaton name is used in the constructions @code{define_cpu_unit} and -@code{define_query_cpu_unit}. +names are separated by commas. All the automata must have unique names. +The automaton name is used to bind @code{define_cpu_unit} and +@code{define_query_cpu_unit} constructs to specific automata. + +This construct declares the names of automata. You already said that a few sentences above; delete this one. 
+The define_query_cpu_unit construct can be used to define units Add @code{} markup here. -@var{default_latency} is a number giving latency time of the +@var{default_latency} is a number giving the latency of the instruction. There is an important difference between the old description and the automaton based pipeline description. The latency -time is used for all dependencies when we use the old description. In -the automaton based pipeline description, the given latency time is only -used for true dependencies. The cost of anti-dependencies is always -zero and the cost of output dependencies is the difference between -latency times of the producing and consuming insns (if the difference -is negative, the cost is considered to be zero). You can always -change the default costs for any description by using the target hook +is used for all types of dependencies when we used the old description. In +the automaton based pipeline description, the latency is only taken into +account when analysing true dependencies (i.e. not output or +anti-dependencies). The cost of anti-dependencies is always zero and the +cost of output dependencies is the difference between the latencies +of the producing and consuming insns (if the difference is negative, the +cost is considered to be zero). You can always change the default cost +between any pair of insns by using the target hook @code{TARGET_SCHED_ADJUST_COST} (@pxref{Scheduling}). Here I am confused. What is the "old description"? If this is a leftover of some obsolete way of doing things, the references to it should be deleted. +construct. You must avoid having more than one +@code{define_insn_reservation} matching any one RTL insn, as the behaviour is s/behaviour/behavior/ +The following construct is used to describe a bypass i.e. an exception +in the execution latency between a pair of instructions: @dfn{bypass} ?? 
@var{guard} is an optional string giving the name of a C function which -defines an additional guard for the bypass. The function will get the +defines an additional guard for the bypass. The function will take the two insns as parameters. If the function returns zero the bypass will be ignored for this case. The additional guard is necessary to s/will take/takes/ s/will be ignored/is ignored/ +If there is more one bypass with the same output and input insns, the +chosen bypass is the first bypass with a guard function in its definition that +returns nonzero. If there is no such bypass, then a bypass without a guard +function is chosen. These constructs can be used to describe, for example, +forwarding paths in a processor pipeline. I don't understand what the last sentence has to do with the rest of this paragraph. If this is part of the general discussion of what define_bypass does, it should be moved up to the paragraph where the concept of a bypass is introduced. -@var{unit-names} is a string giving names o
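The bypass selection rule quoted from the patch above (the first bypass whose guard returns nonzero wins, otherwise a bypass without a guard is used) can be sketched in plain C. This is only an illustration of the rule as worded in the documentation text: struct bypass, choose_bypass and the two sample guards are hypothetical, insns are modelled as plain ints, and taking the first unguarded bypass when several exist is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical guard signature: nonzero means the bypass applies.  */
typedef int (*bypass_guard_fn) (int producer, int consumer);

struct bypass
{
  int latency;
  bypass_guard_fn guard;   /* NULL when the bypass has no guard.  */
};

/* Sample guards, used only to exercise the selection rule.  */
int
guard_never (int producer, int consumer)
{
  (void) producer;
  (void) consumer;
  return 0;
}

int
guard_always (int producer, int consumer)
{
  (void) producer;
  (void) consumer;
  return 1;
}

/* Pick a bypass for one producer/consumer pair from the N candidates
   in BYPASSES, all of which are assumed to match the same pair.
   Returns NULL when no bypass applies.  */
const struct bypass *
choose_bypass (const struct bypass *bypasses, size_t n,
               int producer, int consumer)
{
  assert (bypasses != NULL || n == 0);
  const struct bypass *unguarded = NULL;
  for (size_t i = 0; i < n; i++)
    {
      if (bypasses[i].guard != NULL)
        {
          /* First guarded bypass whose guard accepts the pair wins.  */
          if (bypasses[i].guard (producer, consumer) != 0)
            return &bypasses[i];
        }
      else if (unguarded == NULL)
        unguarded = &bypasses[i];   /* Remember the fallback.  */
    }
  return unguarded;
}
```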
[PATCH][AARCH64]Use mov for add with large immediate.
Hi all,

This is a simple patch to generate a move instruction to temporarily hold
the large immediate for an add instruction.

GCC regression test has been run using aarch64-none-elf toolchain.  NO new
issues.

Okay for trunk?

Regards,
Renlin Li

gcc/ChangeLog:

2015-04-21  Renlin Li

  * config/aarch64/aarch64.md (add3): Use mov when allowed.

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 1f4169e..9ea1939 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1414,18 +1414,28 @@
   "
   if (! aarch64_plus_operand (operands[2], VOIDmode))
     {
-      rtx subtarget = ((optimize && can_create_pseudo_p ())
-                       ? gen_reg_rtx (mode) : operands[0]);
       HOST_WIDE_INT imm = INTVAL (operands[2]);
-
-      if (imm < 0)
-        imm = -(-imm & ~0xfff);
+      if (aarch64_move_imm (imm, mode)
+          && can_create_pseudo_p ())
+        {
+          rtx tmp = gen_reg_rtx (mode);
+          emit_move_insn (tmp, operands[2]);
+          operands[2] = tmp;
+        }
       else
-        imm &= ~0xfff;
+        {
+          rtx subtarget = ((optimize && can_create_pseudo_p ())
+                           ? gen_reg_rtx (mode) : operands[0]);
+
+          if (imm < 0)
+            imm = -(-imm & ~0xfff);
+          else
+            imm &= ~0xfff;
 
-      emit_insn (gen_add3 (subtarget, operands[1], GEN_INT (imm)));
-      operands[1] = subtarget;
-      operands[2] = GEN_INT (INTVAL (operands[2]) - imm);
+          emit_insn (gen_add3 (subtarget, operands[1], GEN_INT (imm)));
+          operands[1] = subtarget;
+          operands[2] = GEN_INT (INTVAL (operands[2]) - imm);
+        }
     }
   "
 )
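The fallback path kept by the patch splits the immediate into two pieces that are added one after the other. The arithmetic can be checked in isolation with a small sketch; split_add_immediate and struct add_split are hypothetical names, and whether each piece is actually encodable is decided by aarch64_plus_operand in the real code, which is not modelled here.

```c
#include <assert.h>
#include <stdint.h>

struct add_split
{
  int64_t high;   /* Part with the low 12 bits of the magnitude cleared.  */
  int64_t low;    /* Remainder; its magnitude is below 4096.  */
};

/* Mirror the arithmetic of the expander's fallback path:
   IMM == high + low, where high keeps everything above bit 11 of
   |IMM| and low carries the remaining 12 bits with the sign of IMM.  */
struct add_split
split_add_immediate (int64_t imm)
{
  /* -INT64_MIN overflows, so this sketch rejects it.  */
  assert (imm != INT64_MIN);

  struct add_split s;
  if (imm < 0)
    s.high = -(-imm & ~INT64_C (0xfff));
  else
    s.high = imm & ~INT64_C (0xfff);
  s.low = imm - s.high;
  return s;
}
```

For example, 0x12345 splits into 0x12000 and 0x345, and -0x12345 splits into -0x12000 and -0x345, so the two adds together apply the original immediate.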
Re: [RFC stage 1] Proposed new warning: -Wmisleading-indentation
On Thu, 2015-04-16 at 10:26 -0700, Mike Stump wrote: > On Apr 16, 2015, at 8:01 AM, David Malcolm wrote: > > Attached is a work-in-progress patch for a new > > -Wmisleading-indentation > > warning I've been experimenting with, for GCC 6. > > Seems like a nice idea in general. > > Does it also handle: > > if (cone); > stmt; > > ? Would be good to add that to the test suite, as that is another hard to > spot common error that should be caught. Not yet, but I agree that it would be a good thing to issue a warning for. > I do think that it is reasonable to warn for things like: > > stmt; > stmt; > > one of those two lines is likely misindented, though, maybe you want to start > with the high payback things first. > > An issue here is how to determine (i), or if it's OK to default to 8 > > Yes, 8 is the proper value to default it to. > > > and have a command-line option (param?) to override it? (though what about, > > say, each header file?) > > I’ll abstain from this. The purist in me says no option for other > than 8, life goes on. 20 years ago, someone was confused over hard v > soft tabbing and what exactly the editor key TAB does. That confusion > is over, the 8 people have won. Catering to other than 8 gives the > impression that the people that lost still have a chance at > winning. :-) > > > Thoughts on this, and on the patch? > > Would be nice to have a stricter version that warns about all wildly > inconsistently or wrongly indented lines. > > { > stmt; > stmt; // must be same as above > } > > { > stmt; // must be indented at least 1 > } > > if (cond) > stmt; // must be indented at least 1 I think I want to make a distinction between (A) classic C "gotchas", like the one in my mail and the: if (cond); stmt; one you mentioned above vs (B) wrong/inconsistent indentation. I think (A) is high-value, since it detects subtly wrong code, likely to have misled the reader, whereas I don't find (B) as interesting. 
I think (A) is "misleading", whereas (B) is "wrong"; the ugliness of the (B) cases tends to give me a "this code is ugly; beware, danger Will Robinson!" reaction, whereas (A) is less ugly and thus more dangerous. (if that makes sense; this may just be my own visceral reaction to the erroneous code). Or to put it another way, I hope to make (A) good enough to go into -Wall, whereas I think (B) would meet more resistance. Also, I think autogenerated code is more likely to run into (B) than (A). I have the patch working now for the C++ frontend. Am attaching the work-in-progress (sans ChangeLog). This one (v2) bootstrapped and regrtested on x86_64-unknown-linux-gnu (Fedora 20), with: 63 new "PASS" results in gcc.sum 189 new "PASS" results in g++.sum for the new test cases (relative to a control build of r48). I also moved the visual-parser.c/h to c-family, to make use of the -ftabstop option Tom mentioned in another mail. I also made it identify the kind of clause, so error messages say things like: ./Wmisleading-indentation-1.c:10:7: warning: statement is indented as if it were guarded by... [-Wmisleading-indentation] ./Wmisleading-indentation-1.c:8:3: note: ...this 'if' clause, but it is not which makes it easier to read, especially when dealing with nesting. This hasn't yet had any performance/leak fixes so it isn't ready as is. I plan to look at making it warn about the: if (cond); stmt; gotcha next, before trying to optimize it. 
(and no ChangeLog yet) Dave diff --git a/gcc/Makefile.in b/gcc/Makefile.in index 80c91f0..8154469 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -1143,7 +1143,8 @@ C_COMMON_OBJS = c-family/c-common.o c-family/c-cppbuiltin.o c-family/c-dump.o \ c-family/c-ppoutput.o c-family/c-pragma.o c-family/c-pretty-print.o \ c-family/c-semantics.o c-family/c-ada-spec.o \ c-family/c-cilkplus.o \ - c-family/array-notation-common.o c-family/cilk.o c-family/c-ubsan.o + c-family/array-notation-common.o c-family/cilk.o c-family/c-ubsan.o \ + c-family/visual-parser.o # Language-independent object files. # We put the insn-*.o files first so that a parallel make will build diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt index 983f4a8..88f1f94 100644 --- a/gcc/c-family/c.opt +++ b/gcc/c-family/c.opt @@ -554,6 +554,10 @@ Wmemset-transposed-args C ObjC C++ ObjC++ Var(warn_memset_transposed_args) Warning LangEnabledBy(C ObjC C++ ObjC++,Wall) Warn about suspicious calls to memset where the third argument is constant literal zero and the second is not +Wmisleading-indentation +C C++ Common Var(warn_misleading_indentation) Warning +Warn when the indentation of the code does not reflect the block structure + Wmissing-braces C ObjC C++ ObjC++ Var(warn_missing_braces) Warning LangEnabledBy(C ObjC,Wall) Warn about possibly missing braces around initializers diff --git a/gcc/c-family/visual-parser.c b/gcc/c-family/visual-parser.c new file mode 100644 index 000..b1fcb8b --- /dev/null
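As an illustration of the category (A) case the warning text above describes, here is a small, deliberately mis-indented function. It is a sketch written for this note, not one of the test cases from the patch; clamp_to_limit is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Clamp VALUE to LIMIT and try to record whether clamping happened.
   The second statement is indented as if it were guarded by the 'if',
   but without braces only the first statement is.  */
int
clamp_to_limit (int value, int limit, int *was_clamped)
{
  assert (was_clamped != NULL);
  *was_clamped = 0;
  if (value > limit)
    value = limit;
    *was_clamped = 1;   /* Always executed: not part of the 'if'.  */
  return value;
}
```

Calling clamp_to_limit (5, 10, &flag) returns 5 but still sets flag to 1, which is the kind of subtly wrong behaviour the indentation hides from a reader.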
Re: [RFC stage 1] Proposed new warning: -Wmisleading-indentation
On Tue, Apr 21, 2015 at 12:07:00PM -0400, David Malcolm wrote: > On Thu, 2015-04-16 at 10:26 -0700, Mike Stump wrote: > > On Apr 16, 2015, at 8:01 AM, David Malcolm wrote: > > > Attached is a work-in-progress patch for a new > > > -Wmisleading-indentation > > > warning I've been experimenting with, for GCC 6. > > > > Seems like a nice idea in general. > > > > Does it also handle: > > > > if (cone); > > stmt; > > > > ? Would be good to add that to the test suite, as that is another hard to > > spot common error that should be caught. > > Not yet, but I agree that it would be a good thing to issue a warning > for. > > > I do think that it is reasonable to warn for things like: > > > > stmt; > > stmt; > > > > one of those two lines is likely misindented, though, maybe you want to > > start with the high payback things first. > > > > An issue here is how to determine (i), or if it's OK to default to 8 > > > > Yes, 8 is the proper value to default it to. > > > > > and have a command-line option (param?) to override it? (though what > > > about, > > > say, each header file?) > > > > I’ll abstain from this. The purist in me says no option for other > > than 8, life goes on. 20 years ago, someone was confused over hard v > > soft tabbing and what exactly the editor key TAB does. That confusion > > is over, the 8 people have won. Catering to other than 8 gives the > > impression that the people that lost still have a chance at > > winning. :-) > > > > > Thoughts on this, and on the patch? > > > > Would be nice to have a stricter version that warns about all wildly > > inconsistently or wrongly indented lines. 
> > > > { > > stmt; > > stmt; // must be same as above > > } > > > > { > > stmt; // must be indented at least 1 > > } > > > > if (cond) > > stmt; // must be indented at least 1 > > I think I want to make a distinction between > > (A) classic C "gotchas", like the one in my mail and the: > > if (cond); > stmt; > > one you mentioned above > > vs > > (B) wrong/inconsistent indentation. > > I think (A) is high-value, since it detects subtly wrong code, likely to > have misled the reader, whereas I don't find (B) as interesting. I > think (A) is "misleading", whereas (B) is "wrong"; the ugliness of the > (B) cases tends to give me a "this code is ugly; beware, danger Will > Robinson!" reaction, whereas (A) is less ugly and thus more dangerous. So, while I was working on ifdef stuff in gcc I found the following pattern #ifdef FOO if (FOO) #endif bar (); which you may want to handle somehow. In that sort of case one side of the ifdef will necessarily have the B type of miss indentation. Trev > > (if that makes sense; this may just be my own visceral reaction to the > erroneous code). > > Or to put it another way, I hope to make (A) good enough to go into > -Wall, whereas I think (B) would meet more resistance. > Also, I think autogenerated code is more likely to run into (B) than > (A). > > I have the patch working now for the C++ frontend. Am attaching the > work-in-progress (sans ChangeLog). This one (v2) bootstrapped and > regrtested on x86_64-unknown-linux-gnu (Fedora 20), with: > 63 new "PASS" results in gcc.sum > 189 new "PASS" results in g++.sum > for the new test cases (relative to a control build of r48). > > I also moved the visual-parser.c/h to c-family, to make use of the > -ftabstop option Tom mentioned in another mail. > > I also made it identify the kind of clause, so error messages say things > like: > > ./Wmisleading-indentation-1.c:10:7: warning: statement is indented as if > it were guarded by... 
[-Wmisleading-indentation] > ./Wmisleading-indentation-1.c:8:3: note: ...this 'if' clause, but it is > not > > which makes it easier to read, especially when dealing with nesting. > > This hasn't yet had any performance/leak fixes so it isn't ready as is. > I plan to look at making it warn about the: > > if (cond); > stmt; > > gotcha next, before trying to optimize it. > > (and no ChangeLog yet) > > Dave > diff --git a/gcc/Makefile.in b/gcc/Makefile.in > index 80c91f0..8154469 100644 > --- a/gcc/Makefile.in > +++ b/gcc/Makefile.in > @@ -1143,7 +1143,8 @@ C_COMMON_OBJS = c-family/c-common.o > c-family/c-cppbuiltin.o c-family/c-dump.o \ >c-family/c-ppoutput.o c-family/c-pragma.o c-family/c-pretty-print.o \ >c-family/c-semantics.o c-family/c-ada-spec.o \ >c-family/c-cilkplus.o \ > - c-family/array-notation-common.o c-family/cilk.o c-family/c-ubsan.o > + c-family/array-notation-common.o c-family/cilk.o c-family/c-ubsan.o \ > + c-family/visual-parser.o > > # Language-independent object files. > # We put the insn-*.o files first so that a parallel make will build > diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt > index 983f4a8..88f1f94 100644 > --- a/gcc/c-family/c.opt > +++ b/gcc/c-family/c.opt > @@ -554,6 +554,10 @@ Wmemset-transposed-args > C ObjC C++ ObjC++ Var(warn_memset_transp
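The #ifdef pattern Trevor describes can be written out as a compilable sketch. FOO is the placeholder macro from his mail; bar taking a counter argument and the maybe_bar wrapper are assumptions added here so the effect can be observed.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the guarded call; it only counts invocations.  */
void
bar (int *calls)
{
  assert (calls != NULL);
  ++*calls;
}

/* With FOO undefined the call is unconditional and the indentation
   below is right.  With FOO defined the same line becomes the body of
   the 'if' yet sits at the same column as it, so one of the two
   configurations is always mis-indented in the type (B) sense.  */
void
maybe_bar (int *calls)
{
#ifdef FOO
  if (FOO)
#endif
  bar (calls);
}
```

Building once as-is and once with -DFOO=0 shows the two behaviours: the call happens in the first build and is skipped in the second, with identical source indentation.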
Re: [RFC stage 1] Proposed new warning: -Wmisleading-indentation
On 21/04/15 18:07, David Malcolm wrote:
> On Thu, 2015-04-16 at 10:26 -0700, Mike Stump wrote:
> > Does it also handle:
> >
> > if (cond);
> >   stmt;
> >
> > ?  Would be good to add that to the test suite, as that is another hard to
> > spot common error that should be caught.
>
> Not yet, but I agree that it would be a good thing to issue a warning
> for.

GCC already warns for the above:

test.c:3:9: warning: suggest braces around empty body in an ‘if’ statement [-Wempty-body]
  if (a);
        ^

Cheers,

Manuel.
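A self-contained version of the `if (cond); stmt;` gotcha, for anyone who wants to reproduce the diagnostic. This is a sketch with a hypothetical function name; -Wempty-body is enabled by -Wextra, so compiling with that option should report the stray semicolon.

```c
#include <assert.h>
#include <stddef.h>

/* Intended: set *WAS_NEGATIVE only when VALUE is negative.
   The stray ';' ends the 'if' with an empty body, so the assignment
   below it runs unconditionally.  */
void
note_if_negative (int value, int *was_negative)
{
  assert (was_negative != NULL);
  *was_negative = 0;
  if (value < 0);
    *was_negative = 1;
}
```

With this code, note_if_negative (5, &flag) leaves flag set to 1 even though 5 is not negative.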
Re: [RFC] Dynamically aligning the stack
On Tue, 2015-04-14 at 10:08 -0700, H.J. Lu wrote:
> We have done just that in GCC 4.4 to implement dynamic stack
> alignment on x86 :-).  Some of x86 backend changes for dynamic
> stack alignment are x86 psABI specific.  Others are historical,
> like -mstackrealign, which was the old attempt for dynamic stack
> alignment.

I am a bit confused about the history of stack alignment on x86.  So I
guess -mpreferred-stack-boundary=X came first and is now
obsolete/deprecated.  But I thought -mstackrealign=X was the current
method of aligning the stack, but based on this comment and the patches
you pointed me at I guess this is also obsolete (or at least deprecated)
and that -mincoming-stack-boundary=X is the current option that should
be used.

But I am not sure how this option works.  Obviously it tells GCC what
assumption to make about stack alignment at the start of a function, but
how do you tell GCC what alignment you want for the function?  Or does
GCC figure that out for itself based on the instructions and data types
it sees in the function?

Steve Ellcey
sell...@imgtec.com
Re: [patch] [java] bump libgcj soname
----- Original Message -----
> On Tue, Apr 21, 2015 at 04:07:13PM +0200, Matthias Klose wrote:
> > bump the libgcj soname on the trunk, as done for every release cycle,
>
> Is that really needed though these days?
> Weren't there basically zero changes to libjava (both libjava and
> libjava/classpath) in the last 2 or more years?
> The few ones were mostly updating Copyright notices, minor configure
> changes, but I really haven't seen anything ABI changing for quite a while.
>

On the Classpath side, there's a bunch of stuff to merge in that would
change the ABI.  It's a matter of finding a good point at which to do it
and time to do so.  I keep missing the right point in the gcc lifecycle.

> Jakub
>

--
Andrew :)

Free Java Software Engineer
Red Hat, Inc. (http://www.redhat.com)

PGP Key: ed25519/35964222 (hkp://keys.gnupg.net)
Fingerprint = 5132 579D D154 0ED2 3E04 C5A0 CFDA 0F9B 3596 4222

PGP Key: rsa4096/248BDC07 (hkp://keys.gnupg.net)
Fingerprint = EC5A 1F5E C0AD 1D15 8F1F 8F91 3B96 A578 248B DC07
Re: [RFC stage 1] Proposed new warning: -Wmisleading-indentation
On Apr 21, 2015, at 9:07 AM, David Malcolm wrote:
> I think I want to make a distinction between
>
> (A) classic C "gotchas", like the one in my mail and the:
>
>   if (cond);
>     stmt;
>
> one you mentioned above
>
> vs
>
> (B) wrong/inconsistent indentation.
>
> I think (A) is high-value, since it detects subtly wrong code, likely to
> have misled the reader, whereas I don't find (B) as interesting.

Ok.  I don’t have any problem with that.  Going for the high value only
makes the problem space smaller, more likely to implement and do a good
job and avoids false positives and all sorts of what ifs that the other
class would expose you to.

I like your work and your plan.
Re: [patch] [java] bump libgcj soname
On Tue, Apr 21, 2015 at 01:04:04PM -0400, Andrew Hughes wrote:
> ----- Original Message -----
> > On Tue, Apr 21, 2015 at 04:07:13PM +0200, Matthias Klose wrote:
> > > bump the libgcj soname on the trunk, as done for every release cycle,
> >
> > Is that really needed though these days?
> > Weren't there basically zero changes to libjava (both libjava and
> > libjava/classpath) in the last 2 or more years?
> > The few ones were mostly updating Copyright notices, minor configure
> > changes, but I really haven't seen anything ABI changing for quite a while.
> >
>
> On the Classpath side, there's a bunch of stuff to merge in that would
> change the ABI.  It's a matter of finding a good point at which to do it
> and time to do so.  I keep missing the right point in the gcc lifecycle.

Now might be a good time (any time in the next 6.5 months or so), and if
that is done, surely I have no issue with bumping the soname.

Jakub