Re: PING: [ARM] Model automodified addresses in the Cortex A8 and A9 schedulers -- NEON
Ramana Radhakrishnan writes: > I do have a one nit on the ml bit though I must say I'm not an ML > expert which is why I resisted for a while. The one comment that I > have and I should have realized earlier was that the file had been > parameterized by the core in quite a few places and I would like to > still retain that capability. > > This look better ? I have retained your original logic and only > parameterized wblatency on the core . Looks good, thanks. Richard
[RFC] Add FMA support to sparc backend
Eric, this is a preliminary version of the FMA patch I've been working on. Just so you can see what I'm doing. First, ignore the fact that there are two configure tests for the presence of support for these instructions. I'm busy normalizing the -xarch options which binutils supports so that they are the same as Sun AS and therefore just one test is necessary. Second, like rs6000 the sparc negate fused multiply instructions negate the full result, not the multiply result. So we cannot use those instructions for the fnmadf4/fnmsdf4/fnmasf4/fnmssf4 patterns. Since rs6000 provides patterns for such negate operations (presumably just in case the combiner creates a match) I have done so for sparc as well. I was really surprised that cpu designers haven't settled on a consistent formula for negated fused multiply add/sub instructions. Ho hum... For now my plan is to turn these fused multiply instructions on if you ask to compile targetting a cpu that supports them. I'll write a suitable changelog etc. once everything is finalized, this patch posting is just to elicit feedback. Thanks! diff --git a/gcc/config.in b/gcc/config.in index d202b038..4bb6271 100644 --- a/gcc/config.in +++ b/gcc/config.in @@ -272,6 +272,12 @@ #endif +/* Define if your assembler supports FMAF instructions. */ +#ifndef USED_FOR_TARGET +#undef HAVE_AS_GAS_FMAF +#endif + + /* Define if your assembler supports the --gdwarf2 option. */ #ifndef USED_FOR_TARGET #undef HAVE_AS_GDWARF2_DEBUG_FLAG @@ -479,6 +485,12 @@ #endif +/* Define if your assembler supports FMAF instructions. */ +#ifndef USED_FOR_TARGET +#undef HAVE_AS_SUN_FMAF +#endif + + /* Define if your assembler and linker support thread-local storage. */ #ifndef USED_FOR_TARGET #undef HAVE_AS_TLS @@ -1047,12 +1059,6 @@ #endif -/* Define if _Unwind_GetIPInfo is available. */ -#ifndef USED_FOR_TARGET -#undef HAVE_GETIPINFO -#endif - - /* Define to 1 if you have the `getrlimit' function. */ #ifndef USED_FOR_TARGET #undef HAVE_GETRLIMIT diff --git a/gcc/config/sparc/sparc.c b/gcc/config/sparc/sparc.c index cf9e197..ce15730 100644 --- a/gcc/config/sparc/sparc.c +++ b/gcc/config/sparc/sparc.c @@ -752,9 +752,9 @@ sparc_option_override (void) /* UltraSPARC T2 */ { MASK_ISA, MASK_V9}, /* UltraSPARC T3 */ -{ MASK_ISA, MASK_V9}, +{ MASK_ISA, MASK_V9 | MASK_FMAF}, /* UltraSPARC T4 */ -{ MASK_ISA, MASK_V9}, +{ MASK_ISA, MASK_V9 | MASK_FMAF}, }; const struct cpu_table *cpu; unsigned int i; @@ -833,9 +833,9 @@ sparc_option_override (void) if (target_flags_explicit & MASK_FPU) target_flags = (target_flags & ~MASK_FPU) | fpu; - /* Don't allow -mvis if FPU is disabled. */ + /* Don't allow -mvis or -mfmaf if FPU is disabled. */ if (! TARGET_FPU) -target_flags &= ~MASK_VIS; +target_flags &= ~(MASK_VIS | MASK_FMAF); /* -mvis assumes UltraSPARC+, so we are sure v9 instructions are available. @@ -9505,6 +9505,25 @@ sparc_rtx_costs (rtx x, int code, int outer_code, int opno ATTRIBUTE_UNUSED, *total = COSTS_N_INSNS (1); return false; +case FMA: + { + rtx sub; + + gcc_assert (float_mode_p); + *total = sparc_costs->float_mul; + + sub = XEXP (x, 0); + if (GET_CODE (sub) == NEG) + sub = XEXP (sub, 0); + *total += rtx_cost (sub, FMA, 0, speed); + + sub = XEXP (x, 2); + if (GET_CODE (sub) == NEG) + sub = XEXP (sub, 0); + *total += rtx_cost (sub, FMA, 2, speed); + return true; + } + case MULT: if (float_mode_p) *total = sparc_costs->float_mul; diff --git a/gcc/config/sparc/sparc.h b/gcc/config/sparc/sparc.h index afdca1e..c8174ce 100644 --- a/gcc/config/sparc/sparc.h +++ b/gcc/config/sparc/sparc.h @@ -1880,6 +1880,11 @@ extern int sparc_indent_opcode; #define TARGET_SUN_TLS TARGET_TLS #define TARGET_GNU_TLS 0 +#if !(defined(HAVE_AS_GAS_FMAF) || defined(HAVE_AS_SUN_FMAF)) +#undef TARGET_FMAF +#define TARGET_FMAF 0 +#endif + /* The number of Pmode words for the setjmp buffer. */ #define JMP_BUF_SIZE 12 diff --git a/gcc/config/sparc/sparc.md b/gcc/config/sparc/sparc.md index 721db93..58ac6d7 100644 --- a/gcc/config/sparc/sparc.md +++ b/gcc/config/sparc/sparc.md @@ -5345,6 +5345,78 @@ "fmuls\t%1, %2, %0" [(set_attr "type" "fpmul")]) +(define_insn "fmadf4" + [(set (match_operand:DF 0 "register_operand" "=e") +(fma:DF (match_operand:DF 1 "register_operand" "e") + (match_operand:DF 2 "register_operand" "e") + (match_operand:DF 3 "register_operand" "e")))] + "TARGET_FMAF" + "fmaddd\t%1, %2, %3, %0" + [(set_attr "type" "fpmul")]) + +(define_insn "fmsdf4" + [(set (match_operand:DF 0 "register_operand" "=e") +(fma:DF (match_operand:DF 1 "register_operand" "e") + (match_operand:DF 2 "register_operand" "e") + (neg:DF (match_operand:DF 3 "register_operand" "e"] + "TARGET_FMAF" + "fmsubd\t%1, %2,
Re: [PATCH, PR50251] set DRAP type stack realignment for stack_restore
On 09/04/2011 03:44 PM, Tom de Vries wrote: > On 09/04/2011 11:10 AM, Richard Guenther wrote: >> On Sun, Sep 4, 2011 at 11:00 AM, Tom de Vries wrote: >>> Hi, >>> >>> this patch fixes PR50251, which was caused by r178353. >>> >>> The patch was bootstrapped and reg-tested on i686 and x86_64. >>> On i686, the test-cases reported failing in PR50251 pass again. >>> >>> The patch selects the DRAP type stack realignment method in case a >>> stack_restore >>> is used. If that is not done, the presence of the stack_restore at reload >>> leaves >>> FRAME_POINTER without an elimination rule for i386 port. >>> >>> OK for trunk? >> >> Shouldn't we eventually simply set cfun->calls_alloca when expanding either >> stack save or restore? Similar to how it is set from >> allocate_dynamic_stack_space? >> I'm not sure we encountered the dead stack save/restore pair before >> this new folding, >> so, just to make other targets not confused about them as well? > > Setting cfun->calls_alloca will work as well, but doesn't seem necessary to > me. > AFAIU, since other targets don't define MAX_STACK_ALIGNMENT, they don't need > to > do a realign themselves. If I disable MAX_STACK_ALIGNMENT on i386, the > middle-end handles the realign and the stack_restore causes no problems. So to > me this seems a problem with the i386-specific implementation of realignment. > > I'm more worried about other similar cases not working for i386 than about > other > ports. Another, more involved way to fix this, would be in reload to: > - calculate which registers we cannot use for elimination (which we already > do) > - pass that as context to target.can_eliminate. The i386 target can then > fulfill its requirement to be able to eliminate the frame pointer by > setting need_drap if the stack pointer is not available. > I think this way we could remove most if not all of the > 'crtl->need_drap = true' snippets, and have a completer solution and perhaps > also more optimal. > > Is this simple crtl->need_drap fix ok for now, or should I start testing the > cfun->calls_alloca fix? > Ping. I have tested the attached patch on both x86_64 and i686 and it fixes all failures in PR50251 without any regressions. Since I don't feel like the right person to dig into reload guts to implement the solution above, I want to check in this fix to i386 backend. Uros, Richard H., OK for trunk? Thank you, - Tom 2011-09-14 Tom de Vries * explow.c (emit_stack_restore): Set crtl->need_drap if stack_restore is emitted. Index: gcc/explow.c === --- gcc/explow.c (revision 178145) +++ gcc/explow.c (working copy) @@ -1062,6 +1062,20 @@ emit_stack_restore (enum save_level save /* The default is that we use a move insn. */ rtx (*fcn) (rtx, rtx) = gen_move_insn; + /* If stack_realign_drap, the x86 backend emits a prologue that aligns both + STACK_POINTER and HARD_FRAME_POINTER. + If stack_realign_fp, the x86 backend emits a prologue that aligns only + STACK_POINTER. This renders the HARD_FRAME_POINTER unusable for accessing + aligned variables, which is reflected in ix86_can_eliminate. + We normally still have the realigned STACK_POINTER that we can use. + But if there is a stack restore still present at reload, it can trigger + mark_not_eliminable for the STACK_POINTER, leaving no way to eliminate + FRAME_POINTER into a hard reg. + To prevent this situation, we force need_drap if we emit a stack + restore. */ + if (SUPPORTS_STACK_ALIGNMENT) +crtl->need_drap = true; + /* See if this machine has anything special to do for this kind of save. */ switch (save_level) {
Re: PING: [ARM] Model automodified addresses in the Cortex A8 and A9 schedulers -- NEON
On 14 September 2011 08:47, Richard Sandiford wrote: > Ramana Radhakrishnan writes: >> I do have a one nit on the ml bit though I must say I'm not an ML >> expert which is why I resisted for a while. The one comment that I >> have and I should have realized earlier was that the file had been >> parameterized by the core in quite a few places and I would like to >> still retain that capability. >> >> This look better ? I have retained your original logic and only >> parameterized wblatency on the core . > > Looks good, thanks. OK - please apply if no one objects in 24 hours. Ramana > > Richard >
Re: Missing SCALAR_INT_MODE_P checks in simplify_subreg
> Tested on arm-linux-gnueabi and x86_64-linux-gnu. Although I suppose > I could self-approve this, I'd appreciate it if someone would double-check. > > Richard > > > gcc/ > * simplify-rtx.c (simplify_subreg): Check that the inner mode is > a scalar integer before applying integer-only optimisations to > inner arithmetic. Looks good to me. -- Eric Botcazou
Re: [S390] Use match_test rather than eq/ne symbol_ref
> gcc/ > * config/s390/s390.md: Use match_test rather than eq/ne symbol_ref > throughout file. This is ok. Thanks! -Andreas-
Re: [Patch, Ada, Darwin] Remove use of "flat_namespace" linker flag.
Iain, > So is the following OK for trunk? > (no Ada regressions on *-darwin9, x86_64-darwin10). > > cheers > Iain > > ada: > > * gcc-interface/Makefile.in (darwin): Do not issue the > '-flat_namespace' linker flag during Ada build. >* mlib-tgt-specific-darwin.adb: Remove 'flat_namespace' flag from > the default shared library options. After double checking on our various test suites, this change is OK, thanks. Arno
Re: [PLUGIN] Fix PLUGIN_FINISH_TYPE
Hi, I tried to fix PLUGIN_FINISH_DECL as well to include typedefs in C++. The followings does not currently trigger the PLUGIN_FINISH_DECL (or not in all cases), but should them ? - function parameters (in the function prototype) - definition (with a function body) of a top-level function (while the exact same function definition enclosed in a class definition will trigger PLUGIN_FINISH_DECL) - label declaration - constants defined by enums - namespace Romain.
Re: Prevent inliner from removing aliases of used comdats
> Jan, > > Any testcase do can add? Sadly not really - in both cases (openoffice and mozilla) the modules involved are too large for testsuite. I tried to delta it but didn't get anything useful (about 100k of C++) Honza > > Graham
Re: [0/4] Modulo scheduling with haifa-sched for C6X
Bernd Schmidt writes: > I have added support for this to haifa-sched.c. I expect the question > "why not use SMS" to come up; there were a number of reasons why I felt > that code is unsuitable: Fully agree that SMS is unsuitable here FWIW, but... > There are (or were, when I started) some glaring weaknesses in SMS, such > as giving up when the loop contains autoincrement addresses (which is > the case in the example above), and by the looks of it fairly poor > memory disambiguation compared to sched-ebb with cselib. ...I didn't see from an admittedly quick read of the patch how you handle memory disambiguation between iterations. If a loop includes: lb $3,($4) sb $5,1($4) then the two instructions can be reordered by normal ebb scheduling, but the inter-iteration conflict is important for modulo scheduling. Richard
Re: [PATCH, PR50251] set DRAP type stack realignment for stack_restore
On Wed, Sep 14, 2011 at 10:20 AM, Tom de Vries wrote: this patch fixes PR50251, which was caused by r178353. The patch was bootstrapped and reg-tested on i686 and x86_64. On i686, the test-cases reported failing in PR50251 pass again. The patch selects the DRAP type stack realignment method in case a stack_restore is used. If that is not done, the presence of the stack_restore at reload leaves FRAME_POINTER without an elimination rule for i386 port. OK for trunk? >>> >>> Shouldn't we eventually simply set cfun->calls_alloca when expanding either >>> stack save or restore? Similar to how it is set from >>> allocate_dynamic_stack_space? >>> I'm not sure we encountered the dead stack save/restore pair before >>> this new folding, >>> so, just to make other targets not confused about them as well? >> >> Setting cfun->calls_alloca will work as well, but doesn't seem necessary to >> me. >> AFAIU, since other targets don't define MAX_STACK_ALIGNMENT, they don't need >> to >> do a realign themselves. If I disable MAX_STACK_ALIGNMENT on i386, the >> middle-end handles the realign and the stack_restore causes no problems. So >> to >> me this seems a problem with the i386-specific implementation of realignment. >> >> I'm more worried about other similar cases not working for i386 than about >> other >> ports. Another, more involved way to fix this, would be in reload to: >> - calculate which registers we cannot use for elimination (which we already >> do) >> - pass that as context to target.can_eliminate. The i386 target can then >> fulfill its requirement to be able to eliminate the frame pointer by >> setting need_drap if the stack pointer is not available. >> I think this way we could remove most if not all of the >> 'crtl->need_drap = true' snippets, and have a completer solution and perhaps >> also more optimal. >> >> Is this simple crtl->need_drap fix ok for now, or should I start testing the >> cfun->calls_alloca fix? >> > > Ping. > > I have tested the attached patch on both x86_64 and i686 and it fixes all > failures in PR50251 without any regressions. > > Since I don't feel like the right person to dig into reload guts to implement > the solution above, I want to check in this fix to i386 backend. > > Uros, Richard H., OK for trunk? > > Thank you, > - Tom > > 2011-09-14 Tom de Vries > > * explow.c (emit_stack_restore): Set crtl->need_drap if > stack_restore is emitted. Since this approach just follows the approach taken in other builtin_* functions, I'd say this is an oversight for stack_save/restore builtins and the patch is OK. Any enhancements in this area can be implemented independently of the fix. Please also add the test from PR, with -mpreferred-stack-boundary=12 that currently fails for 32bit and 64bit x86 targets. Thanks, Uros.
Re: [0/4] Modulo scheduling with haifa-sched for C6X
On 09/14/11 11:03, Richard Sandiford wrote: > ...I didn't see from an admittedly quick read of the patch how you > handle memory disambiguation between iterations. If a loop includes: > > lb $3,($4) > sb $5,1($4) > > then the two instructions can be reordered by normal ebb scheduling, > but the inter-iteration conflict is important for modulo scheduling. There's nothing special to handle, I think. sched-deps should see that the ld in iteration 1 is DEP_ANTI against the sb in iteration 0 (assuming there's also an increment). Bernd
Re: PING: [ARM] Model automodified addresses in the Cortex A8 and A9 schedulers -- NEON
On Tue, 13 Sep 2011 21:44:54 +0100 Ramana Radhakrishnan wrote: > On 9 September 2011 13:56, Richard Sandiford > wrote: > > Ping for this patch: > > I do have a one nit on the ml bit though I must say I'm not an ML > expert which is why I resisted for a while. The one comment that I > have and I should have realized earlier was that the file had been > parameterized by the core in quite a few places and I would like to > still retain that capability. > > This look better ? I have retained your original logic and only > parameterized wblatency on the core . > > Thoughts ? Can someone else also give the ML bits a once-over ? It > appears to generate identical descriptions to yours. A couple of stylistic points: +(* The latency to use on all address register writeback dependencies. *) +let writebackLatency whichcore = +match whichcore with +CortexA8 -> 1 +| CortexA9 -> 1 This always returns 1? Why not just: let writebackLatency _ = 1 Also you could write: let writebackLatency = function CortexA8 -> 1 | CortexA9 -> 1 which is a little more concise. + if (try ignore (Hashtbl.find ht (guard, latency)); false + with Not_found -> true) There's no need for this construct. Use Hashtbl.mem ("membership") instead, e.g.: if not (Hashtbl.mem ht (guard, latency)) then ... (I see that you're just modifying existing code, but still...) +let comp_fn (guard1, latency1) (guard2, latency2) = + if latency1 > latency2 then -1 + else if latency1 < latency2 then 1 + else if guard1 > guard2 then -1 + else if guard2 > guard1 then 1 + else 0 You can use the built-in function compare (e.g. "compare guard2 guard1") here, instead of the if/else chain. It has exactly the semantics you need, i.e. it returns -1/0/1 for less-than/equal/greater-than. I have no comments on the patch algorithmically speaking. Thanks, Julian
[PATCH, PR50322] Fix test-case ivopts-lt.c to use int of same size as pointer.
Mike, Zdenek, The attached patch fixes PR50322. The test-case is supposed to succeed if the loop counter data-type has the same size as a pointer. The patch defines TYPE to be an int datatype of the same size as a pointer, and uses it. After this fix, there's no need for the avr xfails anymore. tested with avr, x86_64 and x86_64 -m32. OK for trunk? Thanks, - Tom 2011-09-14 Tom de Vries PR testsuite/50322 * gcc.dg/tree-ssa/ivopts-lt.c: Define TYPE. Use TYPE in f1. Undo avr xfails. Index: gcc/testsuite/gcc.dg/tree-ssa/ivopts-lt.c === --- gcc/testsuite/gcc.dg/tree-ssa/ivopts-lt.c (revision 178804) +++ gcc/testsuite/gcc.dg/tree-ssa/ivopts-lt.c (working copy) @@ -1,8 +1,18 @@ /* { dg-do compile } */ /* { dg-options "-O2 -fdump-tree-ivopts" } */ +#if (__SIZEOF_LONG_LONG__ == __SIZEOF_POINTER__) +#define TYPE unsigned long long int +#elif (__SIZEOF_LONG__ == __SIZEOF_POINTER__) +#define TYPE unsigned long int +#elif (__SIZEOF_INT__ == __SIZEOF_POINTER__) +#define TYPE unsigned int +#else +#error Add target support here +#endif + void -f1 (char *p, unsigned long int i, unsigned long int n) +f1 (char *p, TYPE i, TYPE n) { p += i; do @@ -14,8 +24,7 @@ f1 (char *p, unsigned long int i, unsign while (i < n); } -/* For the fails on avr see PR tree-optimization/50322. */ -/* { dg-final { scan-tree-dump-times "PHI" 1 "ivopts" { xfail { "avr-*-*" } } } } */ +/* { dg-final { scan-tree-dump-times "PHI" 1 "ivopts" } } */ /* { dg-final { scan-tree-dump-times "PHI
[patch] Allow not simple ivs in SLP
Hi, This patch makes data-refs analysis to not fail if simple_iv returns false in basic block SLP. Bootstrapped and tested on powerpc64-suse-linux. OK for mainline? Thanks, Ira ChangeLog: * tree-data-ref.c (dr_analyze_innermost): Rename to... (dr_analyze_innermost_1): ... this. Add new argument. Allow not simple iv if analyzing basic block. (dr_analyze_innermost): Call dr_analyze_innermost_1. (create_data_ref): Call dr_analyze_innermost_1. testsuite/ChangeLog: * gcc.dg/vect/bb-slp-24.c: New test. Index: tree-data-ref.c === --- tree-data-ref.c (revision 178755) +++ tree-data-ref.c (working copy) @@ -722,11 +722,11 @@ canonicalize_base_object_address (tree addr) } /* Analyzes the behavior of the memory reference DR in the innermost loop or - basic block that contains it. Returns true if analysis succeed or false + basic block that contains it. Returns true if analysis succeed or false otherwise. */ -bool -dr_analyze_innermost (struct data_reference *dr) +static bool +dr_analyze_innermost_1 (struct data_reference *dr, struct loop *nest) { gimple stmt = DR_STMT (dr); struct loop *loop = loop_containing_stmt (stmt); @@ -769,14 +769,25 @@ canonicalize_base_object_address (tree addr) } else base = build_fold_addr_expr (base); + if (in_loop) { if (!simple_iv (loop, loop_containing_stmt (stmt), base, &base_iv, false)) { - if (dump_file && (dump_flags & TDF_DETAILS)) - fprintf (dump_file, "failed: evolution of base is not affine.\n"); - return false; + if (nest) +{ + if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, "failed: evolution of base is not" +" affine.\n"); +return false; +} + else +{ + base_iv.base = base; + base_iv.step = ssize_int (0); + base_iv.no_overflow = true; +} } } else @@ -801,10 +812,18 @@ canonicalize_base_object_address (tree addr) else if (!simple_iv (loop, loop_containing_stmt (stmt), poffset, &offset_iv, false)) { - if (dump_file && (dump_flags & TDF_DETAILS)) -fprintf (dump_file, "failed: evolution of offset is not" -" affine.\n"); - return false; + if (nest) +{ + if (dump_file && (dump_flags & TDF_DETAILS)) +fprintf (dump_file, "failed: evolution of offset is not" +" affine.\n"); + return false; +} + else +{ + offset_iv.base = poffset; + offset_iv.step = ssize_int (0); +} } } @@ -832,6 +851,19 @@ canonicalize_base_object_address (tree addr) return true; } +/* Analyzes the behavior of the memory reference DR in the innermost loop or + basic block that contains it. Returns true if analysis succeed or false + otherwise. */ + +bool +dr_analyze_innermost (struct data_reference *dr) +{ + gimple stmt = DR_STMT (dr); + struct loop *loop = loop_containing_stmt (stmt); + + return dr_analyze_innermost_1 (dr, loop); +} + /* Determines the base object and the list of indices of memory reference DR, analyzed in LOOP and instantiated in loop nest NEST. */ @@ -972,7 +1004,7 @@ create_data_ref (loop_p nest, loop_p loop, tree me DR_REF (dr) = memref; DR_IS_READ (dr) = is_read; - dr_analyze_innermost (dr); + dr_analyze_innermost_1 (dr, nest); dr_analyze_indices (dr, nest, loop); dr_analyze_alias (dr); Index: testsuite/gcc.dg/vect/bb-slp-24.c === --- testsuite/gcc.dg/vect/bb-slp-24.c (revision 0) +++ testsuite/gcc.dg/vect/bb-slp-24.c (revision 0) @@ -0,0 +1,55 @@ +/* { dg-require-effective-target vect_int } */ + +#include +#include "tree-vect.h" + +#define A 3 +#define N 256 + +short src[N], dst[N]; + +void foo (short * __restrict dst, short * __restrict src, int h, int stride) +{ +int i; +h /= 8; +for (i = 0; i < h; i++) { +dst[0] += A*src[0]; +dst[1] += A*src[1]; +dst[2] += A*src[2]; +dst[3] += A*src[3]; +dst[4] += A*src[4]; +dst[5] += A*src[5]; +dst[6] += A*src[6]; +dst[7] += A*src[7]; +dst += stride; +src += stride; +} +} + + +int main (void) +{ + int i; + + check_vect (); + + for (i = 0; i < N; i++) +{ + dst[i] = 0; + src[i] = i; +} + + foo (dst, src, N, 8); + + for (i = 0; i < N; i++) +{ + if (dst[i] != A * i) +abort (); +} + + return 0; +} + +/* { dg-final { scan-tree-dump-times "basic block vectorized using SLP" 1 "slp" { target vect_element_a
Re: Missing SCALAR_INT_MODE_P checks in simplify_subreg
Eric Botcazou writes: >> Tested on arm-linux-gnueabi and x86_64-linux-gnu. Although I suppose >> I could self-approve this, I'd appreciate it if someone would double-check. >> >> Richard >> >> >> gcc/ >> * simplify-rtx.c (simplify_subreg): Check that the inner mode is >> a scalar integer before applying integer-only optimisations to >> inner arithmetic. > > Looks good to me. Thanks. I went ahead and applied it. Richard
[ARM] Optimise handling of neon_vget_high/low
neon_vget_high and neon_vget_low extract one half of a vector. The patterns look like: (define_insn "neon_vget_highv16qi" [(set (match_operand:V8QI 0 "s_register_operand" "=w") (vec_select:V8QI (match_operand:V16QI 1 "s_register_operand" "w") (parallel [(const_int 8) (const_int 9) (const_int 10) (const_int 11) (const_int 12) (const_int 13) (const_int 14) (const_int 15)])))] "TARGET_NEON" { int dest = REGNO (operands[0]); int src = REGNO (operands[1]); if (dest != src + 2) return "vmov\t%P0, %f1"; else return ""; } [(set_attr "neon_type" "neon_bp_simple")] ) But there's nothing here to tell the register allocator what's expected of it, so we do often get the move. The patch below makes the patterns expand to normal subreg moves instead. Unfortunately, when I first tried this, I ran across some bugs in simplify-rtx.c. They should be fixed now. Of course, I can't guarantee that there are other similar bugs elsewhere, but I'll try to fix any that crop up. The new patterns preserve the current treatment on big-endian targets. Namely, the "low" half is always in the lower-numbered registers (subreg byte offset 0). Tested on arm-linux-gnueabi. OK to install? Richard gcc/ * config/arm/neon.md (neon_vget_highv16qi, neon_vget_highv8hi) (neon_vget_highv4si, neon_vget_highv4sf, neon_vget_highv2di) (neon_vget_lowv16qi, neon_vget_lowv8hi, neon_vget_lowv4si) (neon_vget_lowv4sf, neon_vget_lowv2di): Turn into define_expands that produce subreg moves. Define using VQX iterators. Index: gcc/config/arm/neon.md === --- gcc/config/arm/neon.md 2011-09-13 13:33:29.0 +0100 +++ gcc/config/arm/neon.md 2011-09-13 16:21:23.189304498 +0100 @@ -2969,183 +2969,27 @@ (define_insn "neon_vcombine" (set_attr "neon_type" "neon_bp_simple")] ) -(define_insn "neon_vget_highv16qi" - [(set (match_operand:V8QI 0 "s_register_operand" "=w") - (vec_select:V8QI (match_operand:V16QI 1 "s_register_operand" "w") - (parallel [(const_int 8) (const_int 9) - (const_int 10) (const_int 11) - (const_int 12) (const_int 13) - (const_int 14) (const_int 15)])))] - "TARGET_NEON" -{ - int dest = REGNO (operands[0]); - int src = REGNO (operands[1]); - - if (dest != src + 2) -return "vmov\t%P0, %f1"; - else -return ""; -} - [(set_attr "neon_type" "neon_bp_simple")] -) - -(define_insn "neon_vget_highv8hi" - [(set (match_operand:V4HI 0 "s_register_operand" "=w") - (vec_select:V4HI (match_operand:V8HI 1 "s_register_operand" "w") -(parallel [(const_int 4) (const_int 5) - (const_int 6) (const_int 7)])))] - "TARGET_NEON" -{ - int dest = REGNO (operands[0]); - int src = REGNO (operands[1]); - - if (dest != src + 2) -return "vmov\t%P0, %f1"; - else -return ""; -} - [(set_attr "neon_type" "neon_bp_simple")] -) - -(define_insn "neon_vget_highv4si" - [(set (match_operand:V2SI 0 "s_register_operand" "=w") - (vec_select:V2SI (match_operand:V4SI 1 "s_register_operand" "w") -(parallel [(const_int 2) (const_int 3)])))] - "TARGET_NEON" -{ - int dest = REGNO (operands[0]); - int src = REGNO (operands[1]); - - if (dest != src + 2) -return "vmov\t%P0, %f1"; - else -return ""; -} - [(set_attr "neon_type" "neon_bp_simple")] -) - -(define_insn "neon_vget_highv4sf" - [(set (match_operand:V2SF 0 "s_register_operand" "=w") - (vec_select:V2SF (match_operand:V4SF 1 "s_register_operand" "w") -(parallel [(const_int 2) (const_int 3)])))] - "TARGET_NEON" -{ - int dest = REGNO (operands[0]); - int src = REGNO (operands[1]); - - if (dest != src + 2) -return "vmov\t%P0, %f1"; - else -return ""; -} - [(set_attr "neon_type" "neon_bp_simple")] -) - -(define_insn "neon_vget_highv2di" - [(set (match_operand:DI 0 "s_register_operand" "=w") - (vec_select:DI (match_operand:V2DI 1 "s_register_operand" "w") - (parallel [(const_int 1)])))] - "TARGET_NEON" -{ - int dest = REGNO (operands[0]); - int src = REGNO (operands[1]); - - if (dest != src + 2) -return "vmov\t%P0, %f1"; - else -return ""; -} - [(set_attr "neon_type" "neon_bp_simple")] -) - -(define_insn "neon_vget_lowv16qi" - [(set (match_operand:V8QI 0 "s_register_operand" "=w") - (vec_select:V8QI (match_operand:V16QI 1 "s_register_operand" "w") - (parallel [(const_int 0) (const_int 1) - (const_int 2) (const_int 3) - (const_int 4) (const_int 5) - (const_int 6) (const_int 7)])))
[C++ Patch] PR 50391
Hi, here I'm just avoiding calling build_exception_variant when specs is an error_mark_node. Tested x86_64-linux. Ok for mainline? Thanks, Paolo. / /cp 2011-09-14 Paolo Carlini PR c++/50391 * pt.c (regenerate_decl_from_template): Don't pass an error_mark_node to build_exception_variant. /testsuite 2011-09-14 Paolo Carlini PR c++/50391 * g++.dg/cpp0x/noexcept15.C: New. Index: testsuite/g++.dg/cpp0x/noexcept15.C === --- testsuite/g++.dg/cpp0x/noexcept15.C (revision 0) +++ testsuite/g++.dg/cpp0x/noexcept15.C (revision 0) @@ -0,0 +1,34 @@ +// PR c++/50391 +// { dg-options -std=c++0x } + +#include + +template + struct single + { +Tp elem; // { dg-error "incomplete type" } + +constexpr single(const Tp& e) +: elem(e) { } // { dg-error "invalid field" } + +single(single&& s) // { dg-error "not a member" } +noexcept(std::is_nothrow_move_constructible::value) +: elem(s.elem) { } // { dg-error "invalid field|no member" } + }; + +template + constexpr single::type> + make_single(Tp&& x) + { +return single::type>(x); + } + +class Blob; // { dg-error "forward declaration" } + +void +foo(Blob *b) +{ + make_single(*b); +} + +// { dg-prune-output "include" } Index: cp/pt.c === --- cp/pt.c (revision 178845) +++ cp/pt.c (working copy) @@ -17729,7 +17729,7 @@ regenerate_decl_from_template (tree decl, tree tmp specs = tsubst_exception_specification (TREE_TYPE (code_pattern), args, tf_error, NULL_TREE, /*defer_ok*/false); - if (specs) + if (specs && specs != error_mark_node) TREE_TYPE (decl) = build_exception_variant (TREE_TYPE (decl), specs);
[PATCH] [MELT] Fix CUMULMD5 leading space
Since v3.82, make issued some strange warning on duplicated targets. It turns out that those were due to a leading space hidden in CUMULMD5 variable, leading to target breakage. --- gcc/ChangeLog.MELT |4 gcc/melt-build.mk | 18 +- gcc/melt-build.tpl |2 +- 3 files changed, 14 insertions(+), 10 deletions(-) diff --git a/gcc/ChangeLog.MELT b/gcc/ChangeLog.MELT index 78c6ae6..6efd8d9 100644 --- a/gcc/ChangeLog.MELT +++ b/gcc/ChangeLog.MELT @@ -1,3 +1,7 @@ +2011-09-14 Alexandre Lissy + * melt-build.tpl: Remove leading space on CUMULMD5, breaking some make + targets. + * melt-build.mk: Regenerate. 2011-09-14 Basile Starynkevitch * melt-runtime.c (melt_dbgshortbacktrace): Print just the basename diff --git a/gcc/melt-build.mk b/gcc/melt-build.mk index af9e266..fd9d8a2 100644 --- a/gcc/melt-build.mk +++ b/gcc/melt-build.mk @@ -15,7 +15,7 @@ # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # -#@ from melt-build.tpl line 21 generated by Autogen 5.12 using melt-build.def +#@ from melt-build.tpl line 21 generated by Autogen 5.11.9 using melt-build.def ## the following Makefile variables are expected to be set @@ -147,7 +147,7 @@ MELT_GENERATED_FIRST_C_FILES= \ MELT_GENERATED_FIRST_BASE= \ $(basename $(notdir $(MELT_GENERATED_FIRST_C_FILES))) -MELT_GENERATED_FIRST_CUMULMD5 := $(shell $(GAWK) -F\" '/melt_cumulated_hexmd5/{print $$2}' $(melt_make_source_dir)/generated/warmelt-first+meltdesc.c) +MELT_GENERATED_FIRST_CUMULMD5 := $(shell $(GAWK) -F\" '/melt_cumulated_hexmd5/{print $$2}' $(melt_make_source_dir)/generated/warmelt-first+meltdesc.c) #@ from melt-build.tpl line 135 MELT_GENERATED_BASE_C_FILES= \ @@ -156,7 +156,7 @@ MELT_GENERATED_BASE_C_FILES= \ MELT_GENERATED_BASE_BASE= \ $(basename $(notdir $(MELT_GENERATED_BASE_C_FILES))) -MELT_GENERATED_BASE_CUMULMD5 := $(shell $(GAWK) -F\" '/melt_cumulated_hexmd5/{print $$2}' $(melt_make_source_dir)/generated/warmelt-base+meltdesc.c) +MELT_GENERATED_BASE_CUMULMD5 := $(shell $(GAWK) -F\" '/melt_cumulated_hexmd5/{print $$2}' $(melt_make_source_dir)/generated/warmelt-base+meltdesc.c) #@ from melt-build.tpl line 135 MELT_GENERATED_DEBUG_C_FILES= \ @@ -165,7 +165,7 @@ MELT_GENERATED_DEBUG_C_FILES= \ MELT_GENERATED_DEBUG_BASE= \ $(basename $(notdir $(MELT_GENERATED_DEBUG_C_FILES))) -MELT_GENERATED_DEBUG_CUMULMD5 := $(shell $(GAWK) -F\" '/melt_cumulated_hexmd5/{print $$2}' $(melt_make_source_dir)/generated/warmelt-debug+meltdesc.c) +MELT_GENERATED_DEBUG_CUMULMD5 := $(shell $(GAWK) -F\" '/melt_cumulated_hexmd5/{print $$2}' $(melt_make_source_dir)/generated/warmelt-debug+meltdesc.c) #@ from melt-build.tpl line 135 MELT_GENERATED_MACRO_C_FILES= \ @@ -174,7 +174,7 @@ MELT_GENERATED_MACRO_C_FILES= \ MELT_GENERATED_MACRO_BASE= \ $(basename $(notdir $(MELT_GENERATED_MACRO_C_FILES))) -MELT_GENERATED_MACRO_CUMULMD5 := $(shell $(GAWK) -F\" '/melt_cumulated_hexmd5/{print $$2}' $(melt_make_source_dir)/generated/warmelt-macro+meltdesc.c) +MELT_GENERATED_MACRO_CUMULMD5 := $(shell $(GAWK) -F\" '/melt_cumulated_hexmd5/{print $$2}' $(melt_make_source_dir)/generated/warmelt-macro+meltdesc.c) #@ from melt-build.tpl line 135 MELT_GENERATED_NORMAL_C_FILES= \ @@ -183,7 +183,7 @@ MELT_GENERATED_NORMAL_C_FILES= \ MELT_GENERATED_NORMAL_BASE= \ $(basename $(notdir $(MELT_GENERATED_NORMAL_C_FILES))) -MELT_GENERATED_NORMAL_CUMULMD5 := $(shell $(GAWK) -F\" '/melt_cumulated_hexmd5/{print $$2}' $(melt_make_source_dir)/generated/warmelt-normal+meltdesc.c) +MELT_GENERATED_NORMAL_CUMULMD5 := $(shell $(GAWK) -F\" '/melt_cumulated_hexmd5/{print $$2}' $(melt_make_source_dir)/generated/warmelt-normal+meltdesc.c) #@ from melt-build.tpl line 135 MELT_GENERATED_NORMATCH_C_FILES= \ @@ -192,7 +192,7 @@ MELT_GENERATED_NORMATCH_C_FILES= \ MELT_GENERATED_NORMATCH_BASE= \ $(basename $(notdir $(MELT_GENERATED_NORMATCH_C_FILES))) -MELT_GENERATED_NORMATCH_CUMULMD5 := $(shell $(GAWK) -F\" '/melt_cumulated_hexmd5/{print $$2}' $(melt_make_source_dir)/generated/warmelt-normatch+meltdesc.c) +MELT_GENERATED_NORMATCH_CUMULMD5 := $(shell $(GAWK) -F\" '/melt_cumulated_hexmd5/{print $$2}' $(melt_make_source_dir)/generated/warmelt-normatch+meltdesc.c) #@ from melt-build.tpl line 135 MELT_GENERATED_GENOBJ_C_FILES= \ @@ -201,7 +201,7 @@ MELT_GENERATED_GENOBJ_C_FILES= \ MELT_GENERATED_GENOBJ_BASE= \ $(basename $(notdir $(MELT_GENERATED_GENOBJ_C_FILES))) -MELT_GENERATED_GENOBJ_CUMULMD5 := $(shell $(GAWK) -F\" '/melt_cumulated_hexmd5/{print $$2}' $(melt_make_source_dir)/generated/warmelt-genobj+meltdesc.c) +MELT_GENERATED_GENOBJ_CUMULMD5 := $(shell $(GAWK) -F\" '/melt_cumulated_hexmd5/{print $$2}' $(melt_make_source_dir)/generated/warmelt-genobj+meltdesc.c) #@ from melt-build.tpl line 135 MELT
Re: [PATCH, ARM] Unaligned accesses for builtin memcpy [2/2]
On Fri, 6 May 2011 14:13:32 +0100 Julian Brown wrote: > Hi, > > This is the second of two patches to add unaligned-access support to > the ARM backend. It builds on the first patch to provide support for > unaligned accesses when expanding block moves (i.e. for builtin memcpy > operations). It makes some effort to use load/store multiple > instructions where appropriate (when accessing sufficiently-aligned > source or destination addresses), and also makes some effort to > generate fast code (for -O1/2/3) or small code (for -Os), though some > of the heuristics may need tweaking still. The preceding patch to this one has now been approved & applied: http://gcc.gnu.org/ml/gcc-patches/2011-09/msg00575.html FAOD, is this one OK too? Thanks, Julian
Re: [RFC] Builtin infrastructure change
On Tue, 13 Sep 2011, Michael Meissner wrote: > implicit_built_in_decls to the new macros/functions, and poison > the old names. Make sure 0 is not a legitimate builtin index. I think the poisoning belongs in system.h; direct dependencies on the host compiler (such as GCC_VERSION and __GNUC__ tests) should be kept as local as possible, in just a few headers such as system.h and hwint.h, with other places testing feature macros defined in system.h (or through autoconf) if it's really necessary to have such dependencies there. -- Joseph S. Myers jos...@codesourcery.com
Re: [PATCH, testsuite, ARM] Change to expected failure for g++.dg/abi/local1.C on ARM target
On Wed, 14 Sep 2011, Jiangning Liu wrote: > +// { dg-do run { xfail { arm*-*-eabi* } } } That's specific to bare-metal targets rather than all ARM EABI (such as arm-none-linux-gnueabi or arm-none-symbianelf), is that your intent? If you wanted to cover all ARM EABI targets you'd use the arm_eabi effective-target. But maybe you actually want a new effective-target keyword for targets defining __GXX_TYPEINFO_EQUALITY_INLINE to 0 rather than just ARM EABI? I've seen this test fail on i686-mingw32 and m68k-uclinux as well. -- Joseph S. Myers jos...@codesourcery.com
RFC: allowing fwprop to propagate subregs
At the moment, fwprop will propagate constants and registers even if no further rtl simplifications are possible: if (REG_P (new_rtx) || CONSTANT_P (new_rtx)) flags |= PR_CAN_APPEAR; What do you think about extending this to subregs? The reason for asking is that on NEON, vector loads like vld4 are represented as a load of a single monolithic register followed by subreg extractions of each vector: (set (reg:OI FULL) (...)) (set (reg:V2SI V0) (subreg:V2SI (reg:OI FULL) 0)) (set (reg:V2SI V1) (subreg:V2SI (reg:OI FULL) 16)) (set (reg:V2SI V2) (subreg:V2SI (reg:OI FULL) 32)) (set (reg:V2SI V3) (subreg:V2SI (reg:OI FULL) 48)) Nothing ever propagates these subregs, so the separate moves survive until IRA. This has three problems: - We generally want the registers allocated to V0...V3 to be the same as FULL, so that the four subreg moves become nops. And this often happens in simple examples. But if register pressure is relatively high, these moves can sometimes cause IRA to spill in cases where it doesn't if the subregs are used instead of each Vi. - Perhaps related, register pressure becomes harder to estimate. - These moves can interfere with pre-reload scheduling. In combination with the MODES_TIEABLE_P patch that I posted here: http://gcc.gnu.org/ml/gcc-patches/2011-09/msg00626.html this patch significantly improves the code generated for several libav loops. Unfortunately, I don't have a setup that can do meaningful x86_64 performance measurements, but a diff of the before and after output for libav showed many cases where the patch removed moves. What do you think? Alternatives include propagating in lower-subreg, or maybe only in the second fwprop pass. Richard gcc/ * fwprop.c (propagate_rtx): Also set PR_CAN_APPEAR for subregs. Index: gcc/fwprop.c === --- gcc/fwprop.c2011-08-26 09:58:28.829540497 +0100 +++ gcc/fwprop.c2011-08-26 10:14:03.767707504 +0100 @@ -664,7 +664,7 @@ propagate_rtx (rtx x, enum machine_mode return NULL_RTX; flags = 0; - if (REG_P (new_rtx) || CONSTANT_P (new_rtx)) + if (REG_P (new_rtx) || CONSTANT_P (new_rtx) || GET_CODE (new_rtx) == SUBREG) flags |= PR_CAN_APPEAR; if (!for_each_rtx (&new_rtx, varying_mem_p, NULL)) flags |= PR_HANDLE_MEM;
Re: RFC: allowing fwprop to propagate subregs
On Wed, Sep 14, 2011 at 8:24 AM, Richard Sandiford wrote: > At the moment, fwprop will propagate constants and registers > even if no further rtl simplifications are possible: > > if (REG_P (new_rtx) || CONSTANT_P (new_rtx)) > flags |= PR_CAN_APPEAR; > > What do you think about extending this to subregs? The reason for > asking is that on NEON, vector loads like vld4 are represented as a load > of a single monolithic register followed by subreg extractions of each > vector: > > (set (reg:OI FULL) (...)) > (set (reg:V2SI V0) (subreg:V2SI (reg:OI FULL) 0)) > (set (reg:V2SI V1) (subreg:V2SI (reg:OI FULL) 16)) > (set (reg:V2SI V2) (subreg:V2SI (reg:OI FULL) 32)) > (set (reg:V2SI V3) (subreg:V2SI (reg:OI FULL) 48)) > > Nothing ever propagates these subregs, so the separate moves > survive until IRA. This has three problems: > > - We generally want the registers allocated to V0...V3 to be the same > as FULL, so that the four subreg moves become nops. And this often > happens in simple examples. But if register pressure is relatively > high, these moves can sometimes cause IRA to spill in cases where > it doesn't if the subregs are used instead of each Vi. > > - Perhaps related, register pressure becomes harder to estimate. > > - These moves can interfere with pre-reload scheduling. > > In combination with the MODES_TIEABLE_P patch that I posted here: > > http://gcc.gnu.org/ml/gcc-patches/2011-09/msg00626.html > > this patch significantly improves the code generated for several libav > loops. Unfortunately, I don't have a setup that can do meaningful > x86_64 performance measurements, but a diff of the before and after > output for libav showed many cases where the patch removed moves. > > What do you think? Alternatives include propagating in lower-subreg, > or maybe only in the second fwprop pass. > > Richard > > > gcc/ > * fwprop.c (propagate_rtx): Also set PR_CAN_APPEAR for subregs. > > Index: gcc/fwprop.c > === > --- gcc/fwprop.c 2011-08-26 09:58:28.829540497 +0100 > +++ gcc/fwprop.c 2011-08-26 10:14:03.767707504 +0100 > @@ -664,7 +664,7 @@ propagate_rtx (rtx x, enum machine_mode > return NULL_RTX; > > flags = 0; > - if (REG_P (new_rtx) || CONSTANT_P (new_rtx)) > + if (REG_P (new_rtx) || CONSTANT_P (new_rtx) || GET_CODE (new_rtx) == > SUBREG) > flags |= PR_CAN_APPEAR; > if (!for_each_rtx (&new_rtx, varying_mem_p, NULL)) > flags |= PR_HANDLE_MEM; > A SUBREG may not be REG nor CONSTANT. Don't you need to check REG_P/CONSTANT_P on SUBREG? -- H.J.
Re: [MICROBLAZE] Use match_test rather than eq/ne symbol_ref
On 09/13/2011 10:58 AM, Richard Sandiford wrote: As per the subject. Tested by making sure that there were no new warnings building microblaze-elf, and that there were no changes in the assembly output for the C and C++ testsuite. OK to install? OK. -- Michael Eagerea...@eagercon.com 1960 Park Blvd., Palo Alto, CA 94306 650-325-8077
Re: [C++ Patch] PR 50391
OK. Jason
Re: RFC: allowing fwprop to propagate subregs
"H.J. Lu" writes: > On Wed, Sep 14, 2011 at 8:24 AM, Richard Sandiford > wrote: >> At the moment, fwprop will propagate constants and registers >> even if no further rtl simplifications are possible: >> >> if (REG_P (new_rtx) || CONSTANT_P (new_rtx)) >> flags |= PR_CAN_APPEAR; >> >> What do you think about extending this to subregs? The reason for >> asking is that on NEON, vector loads like vld4 are represented as a load >> of a single monolithic register followed by subreg extractions of each >> vector: >> >> (set (reg:OI FULL) (...)) >> (set (reg:V2SI V0) (subreg:V2SI (reg:OI FULL) 0)) >> (set (reg:V2SI V1) (subreg:V2SI (reg:OI FULL) 16)) >> (set (reg:V2SI V2) (subreg:V2SI (reg:OI FULL) 32)) >> (set (reg:V2SI V3) (subreg:V2SI (reg:OI FULL) 48)) >> >> Nothing ever propagates these subregs, so the separate moves >> survive until IRA. This has three problems: >> >> - We generally want the registers allocated to V0...V3 to be the same >> as FULL, so that the four subreg moves become nops. And this often >> happens in simple examples. But if register pressure is relatively >> high, these moves can sometimes cause IRA to spill in cases where >> it doesn't if the subregs are used instead of each Vi. >> >> - Perhaps related, register pressure becomes harder to estimate. >> >> - These moves can interfere with pre-reload scheduling. >> >> In combination with the MODES_TIEABLE_P patch that I posted here: >> >> http://gcc.gnu.org/ml/gcc-patches/2011-09/msg00626.html >> >> this patch significantly improves the code generated for several libav >> loops. Unfortunately, I don't have a setup that can do meaningful >> x86_64 performance measurements, but a diff of the before and after >> output for libav showed many cases where the patch removed moves. >> >> What do you think? Alternatives include propagating in lower-subreg, >> or maybe only in the second fwprop pass. >> >> Richard >> >> >> gcc/ >> * fwprop.c (propagate_rtx): Also set PR_CAN_APPEAR for subregs. >> >> Index: gcc/fwprop.c >> === >> --- gcc/fwprop.c 2011-08-26 09:58:28.829540497 +0100 >> +++ gcc/fwprop.c 2011-08-26 10:14:03.767707504 +0100 >> @@ -664,7 +664,7 @@ propagate_rtx (rtx x, enum machine_mode >> return NULL_RTX; >> >> flags = 0; >> - if (REG_P (new_rtx) || CONSTANT_P (new_rtx)) >> + if (REG_P (new_rtx) || CONSTANT_P (new_rtx) || GET_CODE (new_rtx) == >> SUBREG) >> flags |= PR_CAN_APPEAR; >> if (!for_each_rtx (&new_rtx, varying_mem_p, NULL)) >> flags |= PR_HANDLE_MEM; >> > > A SUBREG may not be REG nor CONSTANT. Don't you need > to check REG_P/CONSTANT_P on SUBREG? Yeah, good point. There should be a "&& REG_P (SUBREG_REG (new_rtx))" in there. Probably also worth checking for non-paradoxical subregs. Richard
[Patch, committed] Add myself to MAINTAINERS
Added myself to MAINTAINERS (write after approval). Thanks, - Tom 2011-09-14 Tom de Vries * MAINTAINERS (Write After Approval): Add myself. Index: MAINTAINERS === --- MAINTAINERS (revision 178804) +++ MAINTAINERS (working copy) @@ -505,6 +505,7 @@ Konrad Trifunovickonrad.trifunovic@i David Ung dav...@mips.com Neil Vachharajaninvach...@gmail.com Kris Van Hees kris.van.h...@oracle.com +Tom de Vries t...@codesourcery.com Nenad Vukicevic ne...@intrepid.com Feng Wang fengw...@nudt.edu.cn Stephen M. Webb stephen.w...@bregmasoft.com
Re: [PATCH] [Annotalysis] Fix internal compiler errors in thread safety analysis.
On Tue, Sep 13, 2011 at 18:27, Delesley Hutchins wrote: > This patch fixes two bugs which cause an internal compiler error in > annotalysis. The first change fixes a failure when a variable is > typecast from an unknown (forward-defined) type. The second change > fixes a case in which an assert was triggered, because subexpressions > were not canonicalized correctly. > > Bootstrapped and passed gcc regression testsuite on > x86_64-unknown-linux-gnu. Tested on google core components. > > Okay for google/gcc-4_6? OK with a few nits below. > 2011-9-13 DeLesley Hutchins > > * gcc/cp/class.c (cp_get_virtual_function_decl) bugfix where > type is uknown > * gcc/tree-threadsafe-analyze.c (get_canonical_lock_expr) > don't remove & on recursive call Missing ':' after ')'. This entry should be split, the change to tree-threadsafe-analyze.c goes in gcc/ChangeLog.google-4_6. The change to gcc/cp/class.c goes in gcc/cp/ChangeLog.google-4_6. Also, you need a ChangeLog entry for gcc/testsuite/ChangeLog.google-4_6 to describe the changes to the testsuite files. > @@ -8384,9 +8384,15 @@ cp_get_virtual_function_decl (tree ref, tree known > { > HOST_WIDE_INT index = tree_low_cst (OBJ_TYPE_REF_TOKEN (ref), 1); > HOST_WIDE_INT i = 0; > - tree v = BINFO_VIRTUALS (TYPE_BINFO (known_type)); > + tree binfo = TYPE_BINFO(known_type); space before '('. > @@ -884,10 +884,12 @@ get_canonical_lock_expr (tree lock, tree base_obj, > { > tree base = TREE_OPERAND (lock, 0); > tree canon_base; > - /* When the expr is a pointer to a lockable type (i.e. mu.Lock() > + /* For expressions that denote locks, > + When the expr is a pointer to a lockable type (i.e. mu.Lock() > or Lock(&mu) internally), we don't need the address-taken > operator (&). */ > - if (lookup_attribute("lockable", TYPE_ATTRIBUTES (TREE_TYPE > (base > + if (!is_temp_expr && > + lookup_attribute("lockable", TYPE_ATTRIBUTES (TREE_TYPE > (base '&&' goes in the line below. Diego.
Re: [RFC] Builtin infrastructure change
On Wed, Sep 14, 2011 at 02:33:12PM +, Joseph S. Myers wrote: > On Tue, 13 Sep 2011, Michael Meissner wrote: > > > implicit_built_in_decls to the new macros/functions, and poison > > the old names. Make sure 0 is not a legitimate builtin index. > > I think the poisoning belongs in system.h; direct dependencies on the host > compiler (such as GCC_VERSION and __GNUC__ tests) should be kept as local > as possible, in just a few headers such as system.h and hwint.h, with > other places testing feature macros defined in system.h (or through > autoconf) if it's really necessary to have such dependencies there. Ok, I wasn't sure where the best place to put the poison. System.h sounds like the best place. -- Michael Meissner, IBM 5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA meiss...@linux.vnet.ibm.com fax +1 (978) 399-6899
Re: RFC: allowing fwprop to propagate subregs
On 09/14/2011 05:44 PM, Richard Sandiford wrote: > A SUBREG may not be REG nor CONSTANT. Don't you need > to check REG_P/CONSTANT_P on SUBREG? Yeah, good point. There should be a "&& REG_P (SUBREG_REG (new_rtx))" in there. Probably also worth checking for non-paradoxical subregs. I was going to suggest the former. Paradoxical subregs are already handled elsewhere in fwprop, but it also shouldn't hurt to include them. Paolo
Re: [PATCH, PR50322] Fix test-case ivopts-lt.c to use int of same size as pointer.
Hi, > The attached patch fixes PR50322. > > The test-case is supposed to succeed if the loop counter data-type has the > same > size as a pointer. The patch defines TYPE to be an int datatype of the same > size > as a pointer, and uses it. After this fix, there's no need for the avr xfails > anymore. > > tested with avr, x86_64 and x86_64 -m32. what about using uintptr_t? Zdenek > Index: gcc/testsuite/gcc.dg/tree-ssa/ivopts-lt.c > === > --- gcc/testsuite/gcc.dg/tree-ssa/ivopts-lt.c (revision 178804) > +++ gcc/testsuite/gcc.dg/tree-ssa/ivopts-lt.c (working copy) > @@ -1,8 +1,18 @@ > /* { dg-do compile } */ > /* { dg-options "-O2 -fdump-tree-ivopts" } */ > > +#if (__SIZEOF_LONG_LONG__ == __SIZEOF_POINTER__) > +#define TYPE unsigned long long int > +#elif (__SIZEOF_LONG__ == __SIZEOF_POINTER__) > +#define TYPE unsigned long int > +#elif (__SIZEOF_INT__ == __SIZEOF_POINTER__) > +#define TYPE unsigned int > +#else > +#error Add target support here > +#endif > + > void > -f1 (char *p, unsigned long int i, unsigned long int n) > +f1 (char *p, TYPE i, TYPE n) > { >p += i; >do > @@ -14,8 +24,7 @@ f1 (char *p, unsigned long int i, unsign >while (i < n); > } > > -/* For the fails on avr see PR tree-optimization/50322. */ > -/* { dg-final { scan-tree-dump-times "PHI" 1 "ivopts" { xfail { "avr-*-*" } > } } } */ > +/* { dg-final { scan-tree-dump-times "PHI" 1 "ivopts" } } */ > /* { dg-final { scan-tree-dump-times "PHI -/* { dg-final { scan-tree-dump-times "p_\[0-9\]* <" 1 "ivopts" { xfail { > "avr-*-*" } } } } */ > +/* { dg-final { scan-tree-dump-times "p_\[0-9\]* <" 1 "ivopts" } } */ > /* { dg-final { cleanup-tree-dump "ivopts" } } */
[PATCH] [MELT] Fix CUMULMD5 leading space
Since v3.82, make issued some strange warning on duplicated targets. It turns out that those were due to a leading space hidden in CUMULMD5 variable, leading to target breakage. --- gcc/ChangeLog.MELT |4 gcc/melt-build.mk | 18 +- gcc/melt-build.tpl |2 +- 3 files changed, 14 insertions(+), 10 deletions(-) diff --git a/gcc/ChangeLog.MELT b/gcc/ChangeLog.MELT index 78c6ae6..6efd8d9 100644 --- a/gcc/ChangeLog.MELT +++ b/gcc/ChangeLog.MELT @@ -1,3 +1,7 @@ +2011-09-14 Alexandre Lissy + * melt-build.tpl: Remove leading space on CUMULMD5, breaking some make + targets. + * melt-build.mk: Regenerate. 2011-09-14 Basile Starynkevitch * melt-runtime.c (melt_dbgshortbacktrace): Print just the basename diff --git a/gcc/melt-build.mk b/gcc/melt-build.mk index af9e266..fd9d8a2 100644 --- a/gcc/melt-build.mk +++ b/gcc/melt-build.mk @@ -15,7 +15,7 @@ # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # -#@ from melt-build.tpl line 21 generated by Autogen 5.12 using melt-build.def +#@ from melt-build.tpl line 21 generated by Autogen 5.11.9 using melt-build.def ## the following Makefile variables are expected to be set @@ -147,7 +147,7 @@ MELT_GENERATED_FIRST_C_FILES= \ MELT_GENERATED_FIRST_BASE= \ $(basename $(notdir $(MELT_GENERATED_FIRST_C_FILES))) -MELT_GENERATED_FIRST_CUMULMD5 := $(shell $(GAWK) -F\" '/melt_cumulated_hexmd5/{print $$2}' $(melt_make_source_dir)/generated/warmelt-first+meltdesc.c) +MELT_GENERATED_FIRST_CUMULMD5 := $(shell $(GAWK) -F\" '/melt_cumulated_hexmd5/{print $$2}' $(melt_make_source_dir)/generated/warmelt-first+meltdesc.c) #@ from melt-build.tpl line 135 MELT_GENERATED_BASE_C_FILES= \ @@ -156,7 +156,7 @@ MELT_GENERATED_BASE_C_FILES= \ MELT_GENERATED_BASE_BASE= \ $(basename $(notdir $(MELT_GENERATED_BASE_C_FILES))) -MELT_GENERATED_BASE_CUMULMD5 := $(shell $(GAWK) -F\" '/melt_cumulated_hexmd5/{print $$2}' $(melt_make_source_dir)/generated/warmelt-base+meltdesc.c) +MELT_GENERATED_BASE_CUMULMD5 := $(shell $(GAWK) -F\" '/melt_cumulated_hexmd5/{print $$2}' $(melt_make_source_dir)/generated/warmelt-base+meltdesc.c) #@ from melt-build.tpl line 135 MELT_GENERATED_DEBUG_C_FILES= \ @@ -165,7 +165,7 @@ MELT_GENERATED_DEBUG_C_FILES= \ MELT_GENERATED_DEBUG_BASE= \ $(basename $(notdir $(MELT_GENERATED_DEBUG_C_FILES))) -MELT_GENERATED_DEBUG_CUMULMD5 := $(shell $(GAWK) -F\" '/melt_cumulated_hexmd5/{print $$2}' $(melt_make_source_dir)/generated/warmelt-debug+meltdesc.c) +MELT_GENERATED_DEBUG_CUMULMD5 := $(shell $(GAWK) -F\" '/melt_cumulated_hexmd5/{print $$2}' $(melt_make_source_dir)/generated/warmelt-debug+meltdesc.c) #@ from melt-build.tpl line 135 MELT_GENERATED_MACRO_C_FILES= \ @@ -174,7 +174,7 @@ MELT_GENERATED_MACRO_C_FILES= \ MELT_GENERATED_MACRO_BASE= \ $(basename $(notdir $(MELT_GENERATED_MACRO_C_FILES))) -MELT_GENERATED_MACRO_CUMULMD5 := $(shell $(GAWK) -F\" '/melt_cumulated_hexmd5/{print $$2}' $(melt_make_source_dir)/generated/warmelt-macro+meltdesc.c) +MELT_GENERATED_MACRO_CUMULMD5 := $(shell $(GAWK) -F\" '/melt_cumulated_hexmd5/{print $$2}' $(melt_make_source_dir)/generated/warmelt-macro+meltdesc.c) #@ from melt-build.tpl line 135 MELT_GENERATED_NORMAL_C_FILES= \ @@ -183,7 +183,7 @@ MELT_GENERATED_NORMAL_C_FILES= \ MELT_GENERATED_NORMAL_BASE= \ $(basename $(notdir $(MELT_GENERATED_NORMAL_C_FILES))) -MELT_GENERATED_NORMAL_CUMULMD5 := $(shell $(GAWK) -F\" '/melt_cumulated_hexmd5/{print $$2}' $(melt_make_source_dir)/generated/warmelt-normal+meltdesc.c) +MELT_GENERATED_NORMAL_CUMULMD5 := $(shell $(GAWK) -F\" '/melt_cumulated_hexmd5/{print $$2}' $(melt_make_source_dir)/generated/warmelt-normal+meltdesc.c) #@ from melt-build.tpl line 135 MELT_GENERATED_NORMATCH_C_FILES= \ @@ -192,7 +192,7 @@ MELT_GENERATED_NORMATCH_C_FILES= \ MELT_GENERATED_NORMATCH_BASE= \ $(basename $(notdir $(MELT_GENERATED_NORMATCH_C_FILES))) -MELT_GENERATED_NORMATCH_CUMULMD5 := $(shell $(GAWK) -F\" '/melt_cumulated_hexmd5/{print $$2}' $(melt_make_source_dir)/generated/warmelt-normatch+meltdesc.c) +MELT_GENERATED_NORMATCH_CUMULMD5 := $(shell $(GAWK) -F\" '/melt_cumulated_hexmd5/{print $$2}' $(melt_make_source_dir)/generated/warmelt-normatch+meltdesc.c) #@ from melt-build.tpl line 135 MELT_GENERATED_GENOBJ_C_FILES= \ @@ -201,7 +201,7 @@ MELT_GENERATED_GENOBJ_C_FILES= \ MELT_GENERATED_GENOBJ_BASE= \ $(basename $(notdir $(MELT_GENERATED_GENOBJ_C_FILES))) -MELT_GENERATED_GENOBJ_CUMULMD5 := $(shell $(GAWK) -F\" '/melt_cumulated_hexmd5/{print $$2}' $(melt_make_source_dir)/generated/warmelt-genobj+meltdesc.c) +MELT_GENERATED_GENOBJ_CUMULMD5 := $(shell $(GAWK) -F\" '/melt_cumulated_hexmd5/{print $$2}' $(melt_make_source_dir)/generated/warmelt-genobj+meltdesc.c) #@ from melt-build.tpl line 135 MELT
Re: Prevent inliner from removing aliases of used comdats
Would a testcase from PR50226 suffice?
Re: [trans-mem] Fix compilation of libitm with recent gcc
Patrick Marlier writes: > Here the error when gcc compiles libitm: > > ../../../transactional-memory/libitm/aatree.h: In constructor > ‘GTM::aa_node_base::aa_node_base(GTM::aa_node_base::level_type)’: > ../../../transactional-memory/libitm/aatree.h:53:16: error: > list-initializer for non-class type must not be parenthesized > [-Werror] > > The fix just consists to remove the parenthesis. I have run into the same problem while doing a merge from trunk. This will be fixed (with your patch ;-)) once I commit in the next few days. Oh yeah, expect many changes to the branch once I commit. We hadn't done a merge in almost 1.5 years. Thanks Patrick.
Go patch committed: Fix defer/recover at high optimization levels
The Go defer/panic/recover handling works through a little dance that includes comparing return addresses to see if a particular call to recover should pick up a particular call to panic. This dance only works if the appropriate functions are not inlined. I messed up the condition for whether the functions could be inlined, such that it usually worked but broke at -O3. This patch fixes the problem. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline. Ian diff -r b740d140ea5f go/gogo-tree.cc --- a/go/gogo-tree.cc Tue Sep 13 14:32:18 2011 -0700 +++ b/go/gogo-tree.cc Wed Sep 14 11:30:56 2011 -0700 @@ -1159,8 +1159,11 @@ // If a function calls the predeclared recover function, we // can't inline it, because recover behaves differently in a - // function passed directly to defer. - if (this->calls_recover_ && !this->is_recover_thunk_) + // function passed directly to defer. If this is a recover + // thunk that we built to test whether a function can be + // recovered, we can't inline it, because that will mess up + // our return address comparison. + if (this->calls_recover_ || this->is_recover_thunk_) DECL_UNINLINABLE(decl) = 1; // If this is a thunk created to call a function which calls
Re: Prevent inliner from removing aliases of used comdats
> Would a testcase from PR50226 suffice? Sure, fell free to commit it! Honza
Re: hashtable exception safety patch
On 09/13/2011 10:14 PM, Paolo Carlini wrote: Hi, To rebase would have forced me to delay the patch of one day which I try to avoid... without success. Here it is again. Sorry about that, I was annoyed seeing a bit of - surely, trivial - work I had done on the orig patch nullified. Thanks for your patience. Paolo Attached patch applied. 2011-09-14 François Dumont Paolo Carlini * include/bits/hashtable.h (_Hashtable<>::_M_rehash): Take and restore hash policy _M_prev_resize on exception. (_Hashtable<>::_M_insert_bucket): Capture hash policy next resize before using it and use latter method to have it restored on exception. (_Hashtable<>::_M_insert(_Arg&& __v, std::false_type): Likewise. (_Hashtable<>::insert(_InputIterator, _InputIterator): Likewise. (_Hashtable<>::rehash): Likewise. * testsuite/23_containers/unordered_set/insert/hash_policy.cc: New. * testsuite/23_containers/unordered_multiset/insert/hash_policy.cc: Likewise. François Index: include/bits/hashtable.h === --- include/bits/hashtable.h (revision 178792) +++ include/bits/hashtable.h (working copy) @@ -458,8 +458,9 @@ // reserve, if present, comes from _Rehash_base. private: - // Unconditionally change size of bucket array to n. - void _M_rehash(size_type __n); + // Unconditionally change size of bucket array to n, restore hash policy + // resize value to __next_resize on exception. + void _M_rehash(size_type __n, size_type __next_resize); }; @@ -743,7 +744,7 @@ _M_rehash_policy = __pol; size_type __n_bkt = __pol._M_bkt_for_elements(_M_element_count); if (__n_bkt > _M_bucket_count) - _M_rehash(__n_bkt); + _M_rehash(__n_bkt, __pol._M_next_resize); } template __do_rehash = _M_rehash_policy._M_need_rehash(_M_bucket_count, _M_element_count, 1); @@ -920,14 +922,14 @@ __n = this->_M_bucket_index(__k, __code, __do_rehash.second); } - // Allocate the new node before doing the rehash so that we don't - // do a rehash if the allocation throws. - _Node* __new_node = _M_allocate_node(std::forward<_Arg>(__v)); - + _Node* __new_node = 0; __try { + // Allocate the new node before doing the rehash so that we + // don't do a rehash if the allocation throws. + __new_node = _M_allocate_node(std::forward<_Arg>(__v)); if (__do_rehash.first) - _M_rehash(__do_rehash.second); + _M_rehash(__do_rehash.second, __saved_next_resize); __new_node->_M_next = _M_buckets[__n]; this->_M_store_code(__new_node, __code); @@ -939,7 +941,10 @@ } __catch(...) { - _M_deallocate_node(__new_node); + if (!__new_node) + _M_rehash_policy._M_next_resize = __saved_next_resize; + else + _M_deallocate_node(__new_node); __throw_exception_again; } } @@ -981,11 +986,12 @@ _H1, _H2, _Hash, _RehashPolicy, __chc, __cit, __uk>:: _M_insert(_Arg&& __v, std::false_type) { + const size_type __saved_next_resize = _M_rehash_policy._M_next_resize; std::pair __do_rehash = _M_rehash_policy._M_need_rehash(_M_bucket_count, _M_element_count, 1); if (__do_rehash.first) - _M_rehash(__do_rehash.second); + _M_rehash(__do_rehash.second, __saved_next_resize); const key_type& __k = this->_M_extract(__v); typename _Hashtable::_Hash_code_type __code = this->_M_hash_code(__k); @@ -1024,11 +1030,12 @@ insert(_InputIterator __first, _InputIterator __last) { size_type __n_elt = __detail::__distance_fw(__first, __last); + const size_type __saved_next_resize = _M_rehash_policy._M_next_resize; std::pair __do_rehash = _M_rehash_policy._M_need_rehash(_M_bucket_count, _M_element_count, __n_elt); if (__do_rehash.first) - _M_rehash(__do_rehash.second); + _M_rehash(__do_rehash.second, __saved_next_resize); for (; __first != __last; ++__first) this->insert(*__first); @@ -1184,9 +1191,11 @@ _H1, _H2, _Hash, _RehashPolicy, __chc, __cit, __uk>:: rehash(size_type __n) { + const size_type __saved_next_resize = _M_rehash_policy._M_next_resize; _M_rehash(std::max(_M_rehash_policy._M_next_bkt(__n), _M_rehash_policy._M_bkt_for_elements(_M_element_count - + 1))); + + 1)), + __saved_next_resize); } template:: -_M_rehash(size_type __n) +_M_rehash(size_type __n, size_type __next_resize) { - _Node** __new_array = _M_allocate_buckets(__n); + _Node** __new_array = 0; __try { + __new_array = _M_allocate_buckets(__n); _M_begin_bucket_index = __n; for (size_type __i = 0; __i < _M_bucket_count; ++__i) while (_Node* __p = _M_buckets[__i]) @@ -1218,15 +1228,23 @@ } __catch(...) { - // A failure here means that a hash function threw an exception. - // We can't
Re: [RFC] Builtin infrastructure change
On 09/13/2011 04:27 PM, Michael Meissner wrote: > This patch touches a bunch of files, but most of the changes are fairly > mechanical. What this does is change the infrastructure of how machine > independent builtin functions are handled. At this point, there is no change > to the compiler from a user perspective. Instead of having two parallel > arrays > that have tree pointers for built-in functions: > > built_in_decls > implicit_built_in_decls > > I have replaced these by two macros that call inline functions: > > BUILT_IN_DECLS > IMPLICIT_BUILT_IN_DECLS > > and changed the other parts that want to modify the builtins. By changing it > to a functional approach instead of an array, we can more easily change the > code underneath. Why bother with the case change and poisoning? Merely changing the type of the symbol from object to function ought to be sufficient to catch all users at compile time. I say this because I'm not keen on functions using all-caps names... r~
Re: [PATCH 2/7] Generate virtual locations for tokens
On 09/14/2011 04:50 AM, Dodji Seketeli wrote: To comply with this invariant, the initial patch of Tom was cloning T into T', and was using T'->src_loc to track the virtual location of T. Which is close to what you are proposing, while respecting the invariant. Yes, that's what I had in mind. I didn't realize that there was only one token for FOO. But it turned out that was using too much memory :-(. So we devised this scheme instead. Ah. :( + void *macro; This should be a union rather than an untyped pointer. + else if (context->tokens_kind == TOKENS_KIND_EXTENDED) + { + /* So we are in presence of an extended token context, which +means that each token in this context has a virtual +location attached to it. So let's not forget to update +the pointer to the current virtual location of the +current token when we update the pointer to the current +token */ + + rhs = *FIRST (context).ptoken++; + if (context->macro) The other places that deal with TOKENS_KIND_EXTENDED don't test that context->macro is non-null. Why is it needed here? +{ + cpp_hashnode *macro; + if (context->tokens_kind == TOKENS_KIND_EXTENDED) + { + macro_context *mc = (macro_context *) context->macro; + macro = mc->macro_node; + /* If context->buff is set, it means the life time of tokens +is bound to the life time of this context; so we must +free the tokens; that means we must free the virtual +locations of these tokens too. */ + if (context->buff && mc->virt_locs) + { + free (mc->virt_locs); + mc->virt_locs = NULL; + } + free (mc); + context->macro = NULL; + } + else + macro = (cpp_hashnode *) context->macro; + + if (macro != NULL) + macro->flags &= ~NODE_DISABLED; How can macro end up NULL if context->macro was set? +/* In the traditionnal mode of the preprocessor, if we are currently + location if we are in the traditionnal mode, and just returns "traditional" I don't think we need to talk about virtual locations before cpp_get_token_1; it's not an external interface, and it's redundant with the description before cpp_get_token_with_location. + result = cpp_get_token_1 (pfile, loc); if (pfile->context->macro) -*loc = pfile->invocation_location; +{ + if (!CPP_OPTION (pfile, track_macro_expansion)) + *loc = pfile->invocation_location; +} else *loc = result->src_loc; + *loc = maybe_adjust_loc_for_trad_cpp (pfile, *loc); Let's move this code into cpp_get_token_1 so that all the location tweaking is in one place. + switch (it->kind) +{ +case MACRO_ARG_TOKEN_NORMAL: +case MACRO_ARG_TOKEN_EXPANDED: + it->token_ptr++; + if (track_macro_exp_p) + it->location_ptr++; + break; +case MACRO_ARG_TOKEN_STRINGIFIED: +#ifdef ENABLE_CHECKING + if (it->num_forwards > 0) + abort (); + it->num_forwards++; +#endif + break; +} Don't you want to increment num_forwards in the normal/expanded cases, too? +tokens_buff_append_token (cpp_reader *pfile, + _cpp_buff *buffer, + source_location *virt_locs, + const cpp_token *token, + source_location def_loc, + source_location parm_def_loc, + const struct line_map *map, + unsigned int *macro_token_index) Why is macro_token_index a pointer? Nothing seems to modify the referent. +/* Appends a token to the end of the token buffer BUFFER. Note that + this function doesn't enlarge BUFFER; it overwrite the last memory + location of BUFFER that holds a token. That doesn't sound like appending. Jason
Go patch committed: Fix handling of structs with embedded builtin types
The Go frontend was mishandling structs with embedded builtin types, as in struct { int } This is because builtin type names are not packed with the package name, whereas field names are. Rather than try to sort out that confusion, I just added another hack around it. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline. Ian diff -r 24e17efc20cf go/expressions.cc --- a/go/expressions.cc Wed Sep 14 11:46:20 2011 -0700 +++ b/go/expressions.cc Wed Sep 14 15:23:39 2011 -0700 @@ -12640,6 +12640,16 @@ { const Struct_field* sf = st->field(fre->field_index()); name = sf->field_name(); + + // See below. FIXME. + if (!Gogo::is_hidden_name(name) + && name[0] >= 'a' + && name[0] <= 'z') + { + if (gogo->lookup_global(name.c_str()) != NULL) + name = gogo->pack_hidden_name(name, false); + } + char buf[20]; snprintf(buf, sizeof buf, "%u", fre->field_index()); size_t buflen = strlen(buf); @@ -12671,7 +12681,7 @@ // A predefined name won't be packed. If it starts with a // lower case letter we need to check for that case, because - // the field name will be packed. + // the field name will be packed. FIXME. if (!Gogo::is_hidden_name(name) && name[0] >= 'a' && name[0] <= 'z') @@ -13505,7 +13515,10 @@ Ast_dump_context* ast_dump_context) const { ast_dump_context->ostream() << "unsafe.Offsetof("; - ast_dump_context->ostream() << this->field_->field_name(); + ast_dump_context->dump_type(this->type_); + ast_dump_context->ostream() << '.'; + ast_dump_context->ostream() << +Gogo::message_name(this->field_->field_name()); ast_dump_context->ostream() << ")"; } diff -r 24e17efc20cf go/types.cc --- a/go/types.cc Wed Sep 14 11:46:20 2011 -0700 +++ b/go/types.cc Wed Sep 14 15:23:39 2011 -0700 @@ -1297,28 +1297,28 @@ if (!this->has_pointer()) runtime_type_kind |= RUNTIME_TYPE_KIND_NO_POINTERS; Struct_field_list::const_iterator p = fields->begin(); - go_assert(p->field_name() == "Kind"); + go_assert(p->is_field_name("Kind")); mpz_t iv; mpz_init_set_ui(iv, runtime_type_kind); vals->push_back(Expression::make_integer(&iv, p->type(), bloc)); ++p; - go_assert(p->field_name() == "align"); + go_assert(p->is_field_name("align")); Expression::Type_info type_info = Expression::TYPE_INFO_ALIGNMENT; vals->push_back(Expression::make_type_info(this, type_info)); ++p; - go_assert(p->field_name() == "fieldAlign"); + go_assert(p->is_field_name("fieldAlign")); type_info = Expression::TYPE_INFO_FIELD_ALIGNMENT; vals->push_back(Expression::make_type_info(this, type_info)); ++p; - go_assert(p->field_name() == "size"); + go_assert(p->is_field_name("size")); type_info = Expression::TYPE_INFO_SIZE; vals->push_back(Expression::make_type_info(this, type_info)); ++p; - go_assert(p->field_name() == "hash"); + go_assert(p->is_field_name("hash")); mpz_set_ui(iv, this->hash_for_method(gogo)); vals->push_back(Expression::make_integer(&iv, p->type(), bloc)); @@ -1327,7 +1327,7 @@ this->type_functions(&hash_fn, &equal_fn); ++p; - go_assert(p->field_name() == "hashfn"); + go_assert(p->is_field_name("hashfn")); Function_type* fntype = p->type()->function_type(); Named_object* no = Named_object::make_function_declaration(hash_fn, NULL, fntype, @@ -1336,14 +1336,14 @@ vals->push_back(Expression::make_func_reference(no, NULL, bloc)); ++p; - go_assert(p->field_name() == "equalfn"); + go_assert(p->is_field_name("equalfn")); fntype = p->type()->function_type(); no = Named_object::make_function_declaration(equal_fn, NULL, fntype, bloc); no->func_declaration_value()->set_asm_name(equal_fn); vals->push_back(Expression::make_func_reference(no, NULL, bloc)); ++p; - go_assert(p->field_name() == "string"); + go_assert(p->is_field_name("string")); Expression* s = Expression::make_string((name != NULL ? name->reflection(gogo) : this->reflection(gogo)), @@ -1351,7 +1351,7 @@ vals->push_back(Expression::make_unary(OPERATOR_AND, s, bloc)); ++p; - go_assert(p->field_name() == "uncommonType"); + go_assert(p->is_field_name("uncommonType")); if (name == NULL && methods == NULL) vals->push_back(Expression::make_nil(bloc)); else @@ -1365,7 +1365,7 @@ } ++p; - go_assert(p->field_name() == "ptrToThis"); + go_assert(p->is_field_name("ptrToThis")); if (name == NULL) vals->push_back(Expression::make_nil(bloc)); else @@ -1402,10 +1402,10 @@ vals->reserve(3); Struct_field_list::const_iterator p = fields->begin(); - go_assert(p->field_name() == "name"); + go_assert(p->is_field_name("name")); ++p; - go_assert(p->field_name() == "pkgPath"); + go_assert(p->is_field_name("pkgPath")); if (name == NULL) { @@ -1444,7 +1444,7 @@ } ++p; - go_assert(p->field_name() == "methods"); + go_assert(p->is_field_name("
Re: Initial shrink-wrapping patch
On 09/13/2011 08:36 AM, Bernd Schmidt wrote: > On 09/13/11 15:05, Richard Sandiford wrote: >> It just feels like checking for trap_if or turning off cross-jumping >> are working around problems in the representation of shrink-wrapped >> functions. There should be something in the IL to say that those >> two blocks cannot be merged for CFI reasons. > > There is - JUMP_LABELs and such, and the simple_return vs return > distinction. This works for essentially all the interesting cases. The > problem here is that we don't have a jump as the last insn. So how about > the solution in crossjumping as below? > >> Maybe two flags on >> the basic block to say whether they start (resp. end) with the >> "wrapped" version of the CFI? (Which unfortunately would need >> to be checked explicitly.) > > I think that's overdesigning it, and it breaks as soon as something > discards the bb info (reorg...) or puts a label in the middle of a > prologue or epilogue. Keeping that up-to-date would be much more > fragile than just manually dealing with the few cases where we can't > tell what's going on. > >> OTOH, if another reviewer thinks that's unreasnable, I'll happily >> defer to them. > > Cc'ing rth for a second opinion... It feels hacky, but I don't have anything better to suggest. I think the patch is ok. r~
Handle multi-word regsiters in REG_CFA_RESTORE notes
While testing the altest iteration of shrink-wrapping on mips-elf, a new failure showed up in gcc.dg/pr43139.c. When restoring floating-point registers, we attach REG_CFA_RESTORE notes for DFmode registers, but the dwarf2cfi code only records a single regno for such a multiword hard reg. Fixed with this patch, tested with shrink-wrapping on mips-elf (default multilib plus two with -mips16). Ok? Bernd * dwarf2cfi.c (dwarf2out_frame_debug_cfa_expression): Handle multi-word registers. Index: gcc/dwarf2cfi.c === --- gcc/dwarf2cfi.c (revision 178734) +++ gcc/dwarf2cfi.c (working copy) @@ -1236,10 +1236,15 @@ dwarf2out_frame_debug_cfa_expression (rt static void dwarf2out_frame_debug_cfa_restore (rtx reg) { - unsigned int regno = dwf_regno (reg); + unsigned int orig_regno = REGNO (reg); + int nregs = hard_regno_nregs[orig_regno][GET_MODE (reg)]; + while (nregs-- > 0) +{ + unsigned int regno = DWARF_FRAME_REGNUM (orig_regno + nregs); - add_cfi_restore (regno); - update_row_reg_save (cur_row, regno, NULL); + add_cfi_restore (regno); + update_row_reg_save (cur_row, regno, NULL); +} } /* A subroutine of dwarf2out_frame_debug, process a REG_CFA_WINDOW_SAVE.
[4.6] Fix set-but-not-used warning [committed] (issue5013044)
The GCC version we use internally to build 4.6 treats unused variable warnings as errors. This was causing bootstraps to fail. Tested on x86_64. Committed to gcc-4_6-branch. * tree-vect-stmts.c (vect_transform_stmt): Remove unused local variable ORIG_SCALAR_STMT. Index: tree-vect-stmts.c === --- tree-vect-stmts.c (revision 178875) +++ tree-vect-stmts.c (working copy) @@ -4814,7 +4814,7 @@ vect_transform_stmt (gimple stmt, gimple bool is_store = false; gimple vec_stmt = NULL; stmt_vec_info stmt_info = vinfo_for_stmt (stmt); - gimple orig_stmt_in_pattern, orig_scalar_stmt = stmt; + gimple orig_stmt_in_pattern; bool done; switch (STMT_VINFO_TYPE (stmt_info)) -- This patch is available for review at http://codereview.appspot.com/5013044
[ARM testsuite] remove option "-march=armv5te" for pr46934.c testcase
Hi, Bug PR46934 exists on all thumb1 targets, for example cortex-m0. This patch removes "-march=armv5te" option for testcase pr46934.c, allowing the test to be run on all thumb1 targets, not just armv5te. Is it ok? Thanks. gcc/testsuite/ChangeLog: 2011-09-15 Cheng Bin * gcc.target/arm/pr46934.c: remove -march=armv5te option. Thanks-chengbin relax-option-for-pr46934-testcase.patch Description: Binary data
[ARM] pass "--be8" to linker when linking for M profile
Hi, The linker should do endian swizzling at link-time according to "--be8" option. This patch modifies BE8_LINK_SPEC by adding cortex-m processors in the specs string. Since R-profile supports configurable big-endian instruction fetch, I didn't include it here. Is it ok? Thanks. 2011-09-15 Cheng Bin * config/arm/bpabi.h (BE8_LINK_SPEC): add cortex-m arch and processors. Thanks-chengbin gcc-be8-for-m-profile.patch Description: Binary data