Re: [C++ Patch and pubnames 2/2] Adjust c decl pretty printer to match demangler (issue6195056)
On 9 May 2012 01:34, Cary Coutant wrote: >>> A suggestion: Make dwarf_name call the demangler, and then a (new?) a >>> function that converts a mangled decl to a human-readable string. In >>> any case, the pretty-printer does a lot of stuff that is mostly >>> useless for just printing a declaration (translation, wrapping, etc.). >>> >>> Bonus point if all GNU toolchain program use the same functions for >>> demangling and undemagling (because I guess they actually don't, no?) >>> >>> I guess it cannot be so easy, so I apologize in advance for saying nonsense. >> >> It makes sense; and I don't think it is as complicated as it might sound. > > dwarf_name takes a tree; the demangler takes a mangled name. We don't > have mangled names for many of the names we want to enter into the > pubnames table. Sorry that was a typo. My proposal is to mangle the tree (GCC already does that) and then demangle it to a human-readable string (GCC also does that already, no? Or at least GDB and other programs have to do that anyway, no? In any case, the demangling code should be shared and, hence, consistent.) For the names that the mangler/demangler do not support, I guess it doesn't make sense to extend the demangler, so you will need to handle them explicitly. You may call then the pretty-printer, since anyway GDB and the other programs cannot demangle those names, so whatever is used by GCC should be fine. Or implement a custom pretty-printer for them. A custom pretty-printer will be for sure much simpler than the diagnostics pretty-printer, since you don't have to worry about all kinds of trees, wrapping, inheritance, i18n, etc. Probably I am missing some details that may make the plan above infeasible. In any case, I think your intention to make the output of the different toolchain programs more consistent is worthwhile. If dwarf_name has to use the pretty-printer, then I'd prefer that you changed the demangler to match the diagnostics output, rather than the other way around. Also, I am no maintainer, so these are merely suggestions that you may freely ignore. :-) Cheers, Manuel..
Re: [PATCH] Add option for dumping to stderr (issue6190057)
On Wed, May 9, 2012 at 1:46 AM, Sharad Singhai wrote: [...] > +@item -fdump-rtl-all=stderr > +@opindex fdump-rtl-all=stderr You do not need to have a separate index entry for '=stderr' or '=stdout'. Rather, expand the description to state this in all the documentation for -fdump-xxx=yyy. [...] > +/* Return non-zero iff the USER_FILENAME corresponds to stdout or > + stderr stream. */ > + > +static int > +dump_stream_p (const char *user_filename) > +{ > + if (user_filename) > + return !strncmp ("stderr", user_filename, 6) || > + !strncmp ("stdout", user_filename, 6); > + else > + return 0; > +} The name is ambiguous. This function is testing whether its string argument designates one of the *standard* output streams. Name it to reflect that.. Have it take the dump state context. Also the coding convention: the binary operator "||" should be on next line. In fact the thing could be simpler. Instead of testing over and over again against "stderr" (once in this function, then again later), just return the corresponding standard FILE* pointer. Also, this is a case of overuse of strncmp. If you name the function dump_get_standard_stream: return strcmp("stderr", dfi->user_filename) == 0 ? stderr : stdcmp("stdout", dfi->use_filename) ? stdout : NULL; you can simplify: > - name = get_dump_file_name (phase); > dfi = get_dump_file_info (phase); > - stream = fopen (name, dfi->state < 0 ? "w" : "a"); > - if (!stream) > - error ("could not open dump file %qs: %m", name); > + if (dump_stream_p (dfi->user_filename)) > + { > + if (!strncmp ("stderr", dfi->user_filename, 6)) > + stream = stderr; > + else > + if (!strncmp ("stdout", dfi->user_filename, 6)) > + stream = stdout; > + else > + error ("unknown stream: %qs: %m", dfi->user_filename); > + dfi->state = 1; > + } > else > - dfi->state = 1; > - free (name); > + { > + name = get_dump_file_name (phase); > + stream = fopen (name, dfi->state < 0 ? "w" : "a"); > + if (!stream) > + error ("could not open dump file %qs: %m", name); > + else > + dfi->state = 1; > + free (name); > + } > > if (flag_ptr) > *flag_ptr = dfi->flags; > @@ -987,35 +1017,45 @@ dump_flag_name (int phase) > dump_begin. */ > > void > -dump_end (int phase ATTRIBUTE_UNUSED, FILE *stream) > +dump_end (int phase, FILE *stream) > { > - fclose (stream); > + struct dump_file_info *dfi = get_dump_file_info (phase); > + if (!dump_stream_p (dfi->user_filename)) > + fclose (stream); > } > > /* Enable all tree dumps. Return number of enabled tree dumps. */ > > static int > -dump_enable_all (int flags) > +dump_enable_all (int flags, const char *user_filename) > { > int ir_dump_type = (flags & (TDF_TREE | TDF_RTL | TDF_IPA)); > int n = 0; > size_t i; > > for (i = TDI_none + 1; i < (size_t) TDI_end; i++) > - if ((dump_files[i].flags & ir_dump_type)) > - { > - dump_files[i].state = -1; > - dump_files[i].flags |= flags; > - n++; > - } > + { > + if ((dump_files[i].flags & ir_dump_type)) > + { > + dump_files[i].state = -1; > + dump_files[i].flags |= flags; > + n++; > + } > + if (user_filename) > + dump_files[i].user_filename = user_filename; > + } > > for (i = 0; i < extra_dump_files_in_use; i++) > - if ((extra_dump_files[i].flags & ir_dump_type)) > - { > - extra_dump_files[i].state = -1; > - extra_dump_files[i].flags |= flags; > - n++; > - } > + { > + if ((extra_dump_files[i].flags & ir_dump_type)) > + { > + extra_dump_files[i].state = -1; > + extra_dump_files[i].flags |= flags; > + n++; > + } > + if (user_filename) > + extra_dump_files[i].user_filename = user_filename; > + } > > return n; > } > @@ -1037,7 +1077,7 @@ dump_switch_p_1 (const char *arg, struct dump_file > if (!option_value) > return 0; > > - if (*option_value && *option_value != '-') > + if (*option_value && *option_value != '-' && *option_value != '=') > return 0; > > ptr = option_value; > @@ -1052,17 +1092,30 @@ dump_switch_p_1 (const char *arg, struct dump_file > while (*ptr == '-') > ptr++; > end_ptr = strchr (ptr, '-'); > + > if (!end_ptr) > end_ptr = ptr + strlen (ptr); > length = end_ptr - ptr; > > + if (*ptr == '=') > + { > + /* Interpret rest of the argument as a dump filename. The > + user provided filename overrides generated dump names as > + well as other command line filenames. */ > + flags |= TDF_USER_FILENAME; > + if (dfi->user_filename) > + free (dfi->user_filename); > + dfi->user_filename = xstrdup (ptr + 1); > + break; > + } > + > for (option_ptr = dump_options; option_ptr->na
Re: [PATCH] Add -feliminate-malloc to enable/disable elimination of redundant malloc/free pairs
On Tue, May 8, 2012 at 6:18 PM, Xinliang David Li wrote: > To be clear, this flag is for malloc implementation (such as tcmalloc) > with side effect unknown to the compiler. Using -fno-builtin-xxx is > too conservative for that purpose. I don't think that flys. Btw, the patch also guards alloca - alloca is purely GCC internal. What's the "unknown side-effects" that are also important to preserve for free(malloc(4))? Richard. > David > > On Tue, May 8, 2012 at 7:43 AM, Dehao Chen wrote: >> Hello, >> >> This patch adds a flag to guard the optimization that optimize the >> following code away: >> >> free (malloc (4)); >> >> In some cases, we'd like this type of malloc/free pairs to remain in >> the optimized code. >> >> Tested with bootstrap, and no regression in the gcc testsuite. >> >> Is it ok for mainline? >> >> Thanks, >> Dehao >> >> gcc/ChangeLog >> 2012-05-08 Dehao Chen >> >> * common.opt (feliminate-malloc): New. >> * doc/invoke.texi: Document it. >> * tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Honor it. >> >> gcc/testsuite/ChangeLog >> 2012-05-08 Dehao Chen >> >> * gcc.dg/free-malloc.c: Check if -fno-eliminate-malloc is working >> as expected. >> >> Index: gcc/doc/invoke.texi >> === >> --- gcc/doc/invoke.texi (revision 187277) >> +++ gcc/doc/invoke.texi (working copy) >> @@ -360,7 +360,8 @@ >> -fcx-limited-range @gol >> -fdata-sections -fdce -fdelayed-branch @gol >> -fdelete-null-pointer-checks -fdevirtualize -fdse @gol >> --fearly-inlining -fipa-sra -fexpensive-optimizations -ffat-lto-objects @gol >> +-fearly-inlining -feliminate-malloc -fipa-sra -fexpensive-optimizations @gol >> +-ffat-lto-objects @gol >> -ffast-math -ffinite-math-only -ffloat-store >> -fexcess-precision=@var{style} @gol >> -fforward-propagate -ffp-contract=@var{style} -ffunction-sections @gol >> -fgcse -fgcse-after-reload -fgcse-las -fgcse-lm -fgraphite-identity @gol >> @@ -6238,6 +6239,7 @@ >> -fdefer-pop @gol >> -fdelayed-branch @gol >> -fdse @gol >> +-feliminate-malloc @gol >> -fguess-branch-probability @gol >> -fif-conversion2 @gol >> -fif-conversion @gol >> @@ -6762,6 +6764,11 @@ >> Perform dead store elimination (DSE) on RTL@. >> Enabled by default at @option{-O} and higher. >> >> +@item -feliminate-malloc >> +@opindex feliminate-malloc >> +Eliminate unnecessary malloc/free pairs. >> +Enabled by default at @option{-O} and higher. >> + >> @item -fif-conversion >> @opindex fif-conversion >> Attempt to transform conditional jumps into branch-less equivalents. This >> Index: gcc/testsuite/gcc.dg/free-malloc.c >> === >> --- gcc/testsuite/gcc.dg/free-malloc.c (revision 0) >> +++ gcc/testsuite/gcc.dg/free-malloc.c (revision 0) >> @@ -0,0 +1,12 @@ >> +/* { dg-do compile } */ >> +/* { dg-options "-O2 -fno-eliminate-malloc" } */ >> +/* { dg-final { scan-assembler-times "malloc" 2} } */ >> +/* { dg-final { scan-assembler-times "free" 2} } */ >> + >> +extern void * malloc (unsigned long); >> +extern void free (void *); >> + >> +void test () >> +{ >> + free (malloc (10)); >> +} >> Index: gcc/common.opt >> === >> --- gcc/common.opt (revision 187277) >> +++ gcc/common.opt (working copy) >> @@ -1474,6 +1474,10 @@ >> Common Var(flag_dce) Init(1) Optimization >> Use the RTL dead code elimination pass >> >> +feliminate-malloc >> +Common Var(flag_eliminate_malloc) Init(1) Optimization >> +Eliminate unnecessary malloc/free pairs >> + >> fdse >> Common Var(flag_dse) Init(1) Optimization >> Use the RTL dead store elimination pass >> Index: gcc/tree-ssa-dce.c >> === >> --- gcc/tree-ssa-dce.c (revision 187277) >> +++ gcc/tree-ssa-dce.c (working copy) >> @@ -309,6 +309,8 @@ >> case BUILT_IN_CALLOC: >> case BUILT_IN_ALLOCA: >> case BUILT_IN_ALLOCA_WITH_ALIGN: >> + if (!flag_eliminate_malloc) >> + mark_stmt_necessary (stmt, true); >> return; >> >> default:;
Re: [PATCH] Fix endless loop in forwprop (PR tree-optimization/53226)
On Tue, May 8, 2012 at 8:33 PM, Jakub Jelinek wrote: > Hi! > > The attached testcase loops endlessly, using more and more memory. > The problem is that the prev stmt iterator sometimes references stmts that > remove_prop_source_from_use decides to remove, and since Michael's > gimple seq changes that seems to be fatal. > > Fixed by not keeping an iterator, but instead marking stmts that don't need > revisiting and restarting with first stmt that needs revisiting. This > assumes that new stmts will have uid 0, but I believe that is the case. > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Using the stmt UID looks odd as you are using it as flag - why not use one of the two stmt flags available to passes? Btw, the other possibility would be to simply change the various combiners to require them to update the iterator to point at the first statement that they have inserted. I have some TLC ideas in this area, let's see if I can get to them. Ok with using gimple_set_plf and friends. Thanks, Richard. > 2012-05-08 Jakub Jelinek > > PR tree-optimization/53226 > * tree-ssa-forwprop.c (ssa_forward_propagate_and_combine): Remove > prev and prev_initialized vars, gimple_set_uid (stmt, 0) before > processing it and gimple_set_uid (stmt, 1) if it doesn't need to be > revisited, look for earliest stmt with uid 0 if something changed. > > * gcc.c-torture/compile/pr53226.c: New test. > > --- gcc/tree-ssa-forwprop.c.jj 2012-05-03 08:35:52.0 +0200 > +++ gcc/tree-ssa-forwprop.c 2012-05-08 18:10:19.662061709 +0200 > @@ -2677,8 +2677,7 @@ ssa_forward_propagate_and_combine (void) > > FOR_EACH_BB (bb) > { > - gimple_stmt_iterator gsi, prev; > - bool prev_initialized; > + gimple_stmt_iterator gsi; > > /* Apply forward propagation to all stmts in the basic-block. > Note we update GSI within the loop as necessary. */ > @@ -2771,12 +2770,14 @@ ssa_forward_propagate_and_combine (void) > > /* Combine stmts with the stmts defining their operands. > Note we update GSI within the loop as necessary. */ > - prev_initialized = false; > for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);) > { > gimple stmt = gsi_stmt (gsi); > bool changed = false; > > + /* Mark stmt as potentially needing revisiting. */ > + gimple_set_uid (stmt, 0); > + > switch (gimple_code (stmt)) > { > case GIMPLE_ASSIGN: > @@ -2856,18 +2857,18 @@ ssa_forward_propagate_and_combine (void) > { > /* If the stmt changed then re-visit it and the statements > inserted before it. */ > - if (!prev_initialized) > + for (; !gsi_end_p (gsi); gsi_prev (&gsi)) > + if (gimple_uid (gsi_stmt (gsi))) > + break; > + if (gsi_end_p (gsi)) > gsi = gsi_start_bb (bb); > else > - { > - gsi = prev; > - gsi_next (&gsi); > - } > + gsi_next (&gsi); > } > else > { > - prev = gsi; > - prev_initialized = true; > + /* Stmt no longer needs to be revisited. */ > + gimple_set_uid (stmt, 1); > gsi_next (&gsi); > } > } > --- gcc/testsuite/gcc.c-torture/compile/pr53226.c.jj 2012-05-08 > 18:07:40.007000510 +0200 > +++ gcc/testsuite/gcc.c-torture/compile/pr53226.c 2012-05-08 > 18:07:35.071029578 +0200 > @@ -0,0 +1,13 @@ > +/* PR tree-optimization/53226 */ > + > +void > +foo (unsigned long *x, char y, char z) > +{ > + int i; > + for (i = y; i < z; ++i) > + { > + unsigned long a = ((unsigned char) i) & 63UL; > + unsigned long b = 1ULL << a; > + *x |= b; > + } > +} > > Jakub
Re: [PATCH] Add -feliminate-malloc to enable/disable elimination of redundant malloc/free pairs
On Wed, May 9, 2012 at 4:12 PM, Richard Guenther wrote: > On Tue, May 8, 2012 at 6:18 PM, Xinliang David Li wrote: >> To be clear, this flag is for malloc implementation (such as tcmalloc) >> with side effect unknown to the compiler. Using -fno-builtin-xxx is >> too conservative for that purpose. > > I don't think that flys. Btw, the patch also guards alloca - alloca is purely > GCC internal. > > What's the "unknown side-effects" that are also important to preserve > for free(malloc(4))? Malloc implementation may record some info to a global structure, and a program may use this free(malloc()) pair to simulate the real runs to get some data, such as peak memory requirement. Dehao > > Richard. > >> David >> >> On Tue, May 8, 2012 at 7:43 AM, Dehao Chen wrote: >>> Hello, >>> >>> This patch adds a flag to guard the optimization that optimize the >>> following code away: >>> >>> free (malloc (4)); >>> >>> In some cases, we'd like this type of malloc/free pairs to remain in >>> the optimized code. >>> >>> Tested with bootstrap, and no regression in the gcc testsuite. >>> >>> Is it ok for mainline? >>> >>> Thanks, >>> Dehao >>> >>> gcc/ChangeLog >>> 2012-05-08 Dehao Chen >>> >>> * common.opt (feliminate-malloc): New. >>> * doc/invoke.texi: Document it. >>> * tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Honor it. >>> >>> gcc/testsuite/ChangeLog >>> 2012-05-08 Dehao Chen >>> >>> * gcc.dg/free-malloc.c: Check if -fno-eliminate-malloc is working >>> as expected. >>> >>> Index: gcc/doc/invoke.texi >>> === >>> --- gcc/doc/invoke.texi (revision 187277) >>> +++ gcc/doc/invoke.texi (working copy) >>> @@ -360,7 +360,8 @@ >>> -fcx-limited-range @gol >>> -fdata-sections -fdce -fdelayed-branch @gol >>> -fdelete-null-pointer-checks -fdevirtualize -fdse @gol >>> --fearly-inlining -fipa-sra -fexpensive-optimizations -ffat-lto-objects @gol >>> +-fearly-inlining -feliminate-malloc -fipa-sra -fexpensive-optimizations >>> @gol >>> +-ffat-lto-objects @gol >>> -ffast-math -ffinite-math-only -ffloat-store >>> -fexcess-precision=@var{style} @gol >>> -fforward-propagate -ffp-contract=@var{style} -ffunction-sections @gol >>> -fgcse -fgcse-after-reload -fgcse-las -fgcse-lm -fgraphite-identity @gol >>> @@ -6238,6 +6239,7 @@ >>> -fdefer-pop @gol >>> -fdelayed-branch @gol >>> -fdse @gol >>> +-feliminate-malloc @gol >>> -fguess-branch-probability @gol >>> -fif-conversion2 @gol >>> -fif-conversion @gol >>> @@ -6762,6 +6764,11 @@ >>> Perform dead store elimination (DSE) on RTL@. >>> Enabled by default at @option{-O} and higher. >>> >>> +@item -feliminate-malloc >>> +@opindex feliminate-malloc >>> +Eliminate unnecessary malloc/free pairs. >>> +Enabled by default at @option{-O} and higher. >>> + >>> @item -fif-conversion >>> @opindex fif-conversion >>> Attempt to transform conditional jumps into branch-less equivalents. This >>> Index: gcc/testsuite/gcc.dg/free-malloc.c >>> === >>> --- gcc/testsuite/gcc.dg/free-malloc.c (revision 0) >>> +++ gcc/testsuite/gcc.dg/free-malloc.c (revision 0) >>> @@ -0,0 +1,12 @@ >>> +/* { dg-do compile } */ >>> +/* { dg-options "-O2 -fno-eliminate-malloc" } */ >>> +/* { dg-final { scan-assembler-times "malloc" 2} } */ >>> +/* { dg-final { scan-assembler-times "free" 2} } */ >>> + >>> +extern void * malloc (unsigned long); >>> +extern void free (void *); >>> + >>> +void test () >>> +{ >>> + free (malloc (10)); >>> +} >>> Index: gcc/common.opt >>> === >>> --- gcc/common.opt (revision 187277) >>> +++ gcc/common.opt (working copy) >>> @@ -1474,6 +1474,10 @@ >>> Common Var(flag_dce) Init(1) Optimization >>> Use the RTL dead code elimination pass >>> >>> +feliminate-malloc >>> +Common Var(flag_eliminate_malloc) Init(1) Optimization >>> +Eliminate unnecessary malloc/free pairs >>> + >>> fdse >>> Common Var(flag_dse) Init(1) Optimization >>> Use the RTL dead store elimination pass >>> Index: gcc/tree-ssa-dce.c >>> === >>> --- gcc/tree-ssa-dce.c (revision 187277) >>> +++ gcc/tree-ssa-dce.c (working copy) >>> @@ -309,6 +309,8 @@ >>> case BUILT_IN_CALLOC: >>> case BUILT_IN_ALLOCA: >>> case BUILT_IN_ALLOCA_WITH_ALIGN: >>> + if (!flag_eliminate_malloc) >>> + mark_stmt_necessary (stmt, true); >>> return; >>> >>> default:;
Re: Fix gcc.dg/lower-subreg-1.c failure
Hans-Peter Nilsson writes: >> From: Richard Sandiford >> Date: Tue, 1 May 2012 16:46:38 +0200 > >> To repeat: as things stand, very few targets define proper rtx costs >> for SET. > > IMHO it's wrong to start blaming targets when rtx_cost doesn't > take the mode in account in the first place, for the default > cost. (Well, except for the modes-tieable subreg special-case.) > The targets where an operation in N * word_mode costs no more > than one in word_mode, if there even is one, is a minority, > let's adjust the defaults to that. I'll pass on approving or disapproving this patch, but for the record: a factor of word_mode seems a bit too simplistic. It's OK for moves and logic ops, but addition of multiword modes is a bit more complicated. Multiplication and division by multiword modes is even more so, of course. >> Of course, if the costs are defined properly and lower-subreg still >> makes the wrong choice, we need to look at why. > > By the way, regarding validity of rtx_cost calls: > >> +++ gcc/lower-subreg.c 2012-05-01 09:46:48.473830772 +0100 > >> +/* Return the cost of a CODE shift in mode MODE by OP1 bits, using the >> + rtxes in RTXES. SPEED_P selects between the speed and size cost. */ >> + >> +static int >> +shift_cost (bool speed_p, struct cost_rtxes *rtxes, enum rtx_code code, >> + enum machine_mode mode, int op1) >> +{ >> + PUT_MODE (rtxes->target, mode); >> + PUT_CODE (rtxes->shift, code); >> + PUT_MODE (rtxes->shift, mode); >> + PUT_MODE (rtxes->source, mode); >> + XEXP (rtxes->shift, 1) = GEN_INT (op1); >> + SET_SRC (rtxes->set) = rtxes->shift; >> + return insn_rtx_cost (rtxes->set, speed_p); >> +} >> + >> +/* For each X in the range [0, BITS_PER_WORD), set SPLITTING[X] >> + to true if it is profitable to split a double-word CODE shift >> + of X + BITS_PER_WORD bits. SPEED_P says whether we are testing >> + for speed or size profitability. >> + >> + Use the rtxes in RTXES to calculate costs. WORD_MOVE_ZERO_COST is >> + the cost of moving zero into a word-mode register. WORD_MOVE_COST >> + is the cost of moving between word registers. */ >> + >> +static void >> +compute_splitting_shift (bool speed_p, struct cost_rtxes *rtxes, >> +bool *splitting, enum rtx_code code, >> +int word_move_zero_cost, int word_move_cost) >> +{ > > I think there should be a gating check whether the target > implements that kind of shift in that mode at all, before > checking the cost. Not sure whether it's generally best to put > that test here, or to make the rtx_cost function return the cost > of a libcall for that mode when that happens. Similar for the > other insns. This has come up a few times in past discussions about rtx_cost (as I'm sure you remember :-)). On the one hand, I agree it might be nice to shield the backend from invalid insns. That would effectively mean calling expand on each insn though, which would be rather expensive. Especially in a GC world. It would also prevent the target from being able to take things like pipeline characteristics into account. E.g. if you know that something takes 2 operations that must be issued sequentially, you might want to add in an extra (sub-COSTS_N_INSN (1)) cost compared to 2 operations that can be issued together, even if the individual operations take the same number of cycles. The same goes for multiplication in general. On some targets the speed of a multiplication can depend on the second operand, so knowing its value is useful even if the target doesn't have a multiplication instruction that takes constant operands. As things stand, rtx_cost is intentionally used for more than just valid target insns. One of the main uses has always been to decide whether it is better to implement multiplications by a shift-and-add sequence, or whether to use multiplication instructions. The associated expmed code has never tried to decide which shifts are actually representable as matching rtl insns and which aren't. The same goes for division-using-multiplication. So I think this patch is using rtx_cost according to its current interface. If someone wants to change or restrict that interface, than that's a separate change IMO. And it should be done consistently rather than in this one place. In this case it doesn't matter anyway. If we never see a shift in mode M by amount X, we'll never need to make a decision about whether to split it. > Isn't the below better than doing virtually the same in each > target's rtx_costs? FWIW, MIPS, SH and x86 all used something slightly different (and more complicated). I imagine PowerPC and SPARC will too. So "each" seems a bit strong. That's not an objection to the patch though. I realise some ports do have very regular architectures where every register is the same width and has the same move cost. Richard
Re: [PATCH] Add -feliminate-malloc to enable/disable elimination of redundant malloc/free pairs
On Wed, May 9, 2012 at 10:38 AM, Dehao Chen wrote: > On Wed, May 9, 2012 at 4:12 PM, Richard Guenther > wrote: >> On Tue, May 8, 2012 at 6:18 PM, Xinliang David Li wrote: >>> To be clear, this flag is for malloc implementation (such as tcmalloc) >>> with side effect unknown to the compiler. Using -fno-builtin-xxx is >>> too conservative for that purpose. >> >> I don't think that flys. Btw, the patch also guards alloca - alloca is >> purely >> GCC internal. >> >> What's the "unknown side-effects" that are also important to preserve >> for free(malloc(4))? > > Malloc implementation may record some info to a global structure, and > a program may use this free(malloc()) pair to simulate the real runs > to get some data, such as peak memory requirement. So why not use an alternate interface into this special allocator for this purpose? > Dehao > >> >> Richard. >> >>> David >>> >>> On Tue, May 8, 2012 at 7:43 AM, Dehao Chen wrote: Hello, This patch adds a flag to guard the optimization that optimize the following code away: free (malloc (4)); In some cases, we'd like this type of malloc/free pairs to remain in the optimized code. Tested with bootstrap, and no regression in the gcc testsuite. Is it ok for mainline? Thanks, Dehao gcc/ChangeLog 2012-05-08 Dehao Chen * common.opt (feliminate-malloc): New. * doc/invoke.texi: Document it. * tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Honor it. gcc/testsuite/ChangeLog 2012-05-08 Dehao Chen * gcc.dg/free-malloc.c: Check if -fno-eliminate-malloc is working as expected. Index: gcc/doc/invoke.texi === --- gcc/doc/invoke.texi (revision 187277) +++ gcc/doc/invoke.texi (working copy) @@ -360,7 +360,8 @@ -fcx-limited-range @gol -fdata-sections -fdce -fdelayed-branch @gol -fdelete-null-pointer-checks -fdevirtualize -fdse @gol --fearly-inlining -fipa-sra -fexpensive-optimizations -ffat-lto-objects @gol +-fearly-inlining -feliminate-malloc -fipa-sra -fexpensive-optimizations @gol +-ffat-lto-objects @gol -ffast-math -ffinite-math-only -ffloat-store -fexcess-precision=@var{style} @gol -fforward-propagate -ffp-contract=@var{style} -ffunction-sections @gol -fgcse -fgcse-after-reload -fgcse-las -fgcse-lm -fgraphite-identity @gol @@ -6238,6 +6239,7 @@ -fdefer-pop @gol -fdelayed-branch @gol -fdse @gol +-feliminate-malloc @gol -fguess-branch-probability @gol -fif-conversion2 @gol -fif-conversion @gol @@ -6762,6 +6764,11 @@ Perform dead store elimination (DSE) on RTL@. Enabled by default at @option{-O} and higher. +@item -feliminate-malloc +@opindex feliminate-malloc +Eliminate unnecessary malloc/free pairs. +Enabled by default at @option{-O} and higher. + @item -fif-conversion @opindex fif-conversion Attempt to transform conditional jumps into branch-less equivalents. This Index: gcc/testsuite/gcc.dg/free-malloc.c === --- gcc/testsuite/gcc.dg/free-malloc.c (revision 0) +++ gcc/testsuite/gcc.dg/free-malloc.c (revision 0) @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fno-eliminate-malloc" } */ +/* { dg-final { scan-assembler-times "malloc" 2} } */ +/* { dg-final { scan-assembler-times "free" 2} } */ + +extern void * malloc (unsigned long); +extern void free (void *); + +void test () +{ + free (malloc (10)); +} Index: gcc/common.opt === --- gcc/common.opt (revision 187277) +++ gcc/common.opt (working copy) @@ -1474,6 +1474,10 @@ Common Var(flag_dce) Init(1) Optimization Use the RTL dead code elimination pass +feliminate-malloc +Common Var(flag_eliminate_malloc) Init(1) Optimization +Eliminate unnecessary malloc/free pairs + fdse Common Var(flag_dse) Init(1) Optimization Use the RTL dead store elimination pass Index: gcc/tree-ssa-dce.c === --- gcc/tree-ssa-dce.c (revision 187277) +++ gcc/tree-ssa-dce.c (working copy) @@ -309,6 +309,8 @@ case BUILT_IN_CALLOC: case BUILT_IN_ALLOCA: case BUILT_IN_ALLOCA_WITH_ALIGN: + if (!flag_eliminate_malloc) + mark_stmt_necessary (stmt, true); return; default:;
Re: RFA: PR target/53120, constraint modifier "+" on operand tied by matching-constraint, "0".
Hi DJ, Thanks for the test program. With that I can confirm that the match_dup does work and that the bset instruction is still generated even with my patch applied. OK to apply ? Cheers Nick
Re: [PATCH] MIPS16: Remove DWARF-2 location information from GP accesses
On Tue, 8 May 2012, Richard Sandiford wrote: > > gcc-mips16-gp-pseudo-loc.patch > > Index: gcc-fsf-trunk-quilt/gcc/config/mips/mips.c > > === > > --- gcc-fsf-trunk-quilt.orig/gcc/config/mips/mips.c 2012-05-02 > > 23:42:46.185566469 +0100 > > +++ gcc-fsf-trunk-quilt/gcc/config/mips/mips.c 2012-05-03 > > 18:55:28.775580939 +0100 > > @@ -2622,7 +2622,8 @@ mips16_gp_pseudo_reg (void) > > scan = NEXT_INSN (scan); > > > >insn = gen_load_const_gp (cfun->machine->mips16_gp_pseudo_rtx); > > - emit_insn_after (insn, scan); > > + insn = emit_insn_after (insn, scan); > > + INSN_LOCATOR (insn) = 0; > > > >pop_topmost_sequence (); > > } > > An alternative would be to use prologue_locator, like ARM does. Is this instruction guaranteed to be emitted once per function only? > I'm not sure whether that's an improvement though, so the patch > is OK as-is, thanks. Applied now, thanks. Maciej
Re: [PATCH] MIPS16: Remove DWARF-2 location information from GP accesses
"Maciej W. Rozycki" writes: > On Tue, 8 May 2012, Richard Sandiford wrote: >> > gcc-mips16-gp-pseudo-loc.patch >> > Index: gcc-fsf-trunk-quilt/gcc/config/mips/mips.c >> > === >> > --- gcc-fsf-trunk-quilt.orig/gcc/config/mips/mips.c2012-05-02 >> > 23:42:46.185566469 +0100 >> > +++ gcc-fsf-trunk-quilt/gcc/config/mips/mips.c 2012-05-03 >> > 18:55:28.775580939 +0100 >> > @@ -2622,7 +2622,8 @@ mips16_gp_pseudo_reg (void) >> >scan = NEXT_INSN (scan); >> > >> >insn = gen_load_const_gp (cfun->machine->mips16_gp_pseudo_rtx); >> > - emit_insn_after (insn, scan); >> > + insn = emit_insn_after (insn, scan); >> > + INSN_LOCATOR (insn) = 0; >> > >> >pop_topmost_sequence (); >> > } >> >> An alternative would be to use prologue_locator, like ARM does. > > Is this instruction guaranteed to be emitted once per function only? Yes.
Re: [committed] Fix lower-subreg cost calculation
On 08/05/12 22:42, Richard Sandiford wrote: > Richard Earnshaw writes: >> FTR, this caused >> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53278 > > Well, this really has been a brown-paper-bag patch. Fixed as below. > Tested on x86_64-linux-gnu and applied as obvious. > > Richard > > > gcc/ > PR rtl-optimization/53278 > * lower-subreg.c (decompose_multiword_subregs): Remove left-over > speed_p code from earlier patch. > > Index: gcc/lower-subreg.c > === > --- gcc/lower-subreg.c2012-05-08 19:45:31.0 +0100 > +++ gcc/lower-subreg.c2012-05-08 19:45:31.793855523 +0100 > @@ -1487,9 +1487,7 @@ decompose_multiword_subregs (void) >FOR_EACH_BB (bb) > { > rtx insn; > - bool speed_p; > > - speed_p = optimize_bb_for_speed_p (bb); > FOR_BB_INSNS (bb, insn) > { > rtx pat; > Which begs the question as to why -wshadow didn't pick this up... R.
Re: [PATCH] Add -feliminate-malloc to enable/disable elimination of redundant malloc/free pairs
On Wed, May 9, 2012 at 5:22 PM, Richard Guenther wrote: > On Wed, May 9, 2012 at 10:38 AM, Dehao Chen wrote: >> On Wed, May 9, 2012 at 4:12 PM, Richard Guenther >> wrote: >>> On Tue, May 8, 2012 at 6:18 PM, Xinliang David Li >>> wrote: To be clear, this flag is for malloc implementation (such as tcmalloc) with side effect unknown to the compiler. Using -fno-builtin-xxx is too conservative for that purpose. >>> >>> I don't think that flys. Btw, the patch also guards alloca - alloca is >>> purely >>> GCC internal. >>> >>> What's the "unknown side-effects" that are also important to preserve >>> for free(malloc(4))? >> >> Malloc implementation may record some info to a global structure, and >> a program may use this free(malloc()) pair to simulate the real runs >> to get some data, such as peak memory requirement. > > So why not use an alternate interface into this special allocator for this > purpose? There can be the following scenario: We want to add a module to an existing app. Before implementing the module, we want to collect some statistics on real runs. In this scenario, we need: * No change to the legacy code * Optimized build for the simulation run * Provide accurate statistical info We want to collect data for both new module and the legacy code without changing the later, thus we cannot use a new malloc/free interface. Thanks, Dehao > >> Dehao >> >>> >>> Richard. >>> David On Tue, May 8, 2012 at 7:43 AM, Dehao Chen wrote: > Hello, > > This patch adds a flag to guard the optimization that optimize the > following code away: > > free (malloc (4)); > > In some cases, we'd like this type of malloc/free pairs to remain in > the optimized code. > > Tested with bootstrap, and no regression in the gcc testsuite. > > Is it ok for mainline? > > Thanks, > Dehao > > gcc/ChangeLog > 2012-05-08 Dehao Chen > > * common.opt (feliminate-malloc): New. > * doc/invoke.texi: Document it. > * tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Honor it. > > gcc/testsuite/ChangeLog > 2012-05-08 Dehao Chen > > * gcc.dg/free-malloc.c: Check if -fno-eliminate-malloc is working > as expected. > > Index: gcc/doc/invoke.texi > === > --- gcc/doc/invoke.texi (revision 187277) > +++ gcc/doc/invoke.texi (working copy) > @@ -360,7 +360,8 @@ > -fcx-limited-range @gol > -fdata-sections -fdce -fdelayed-branch @gol > -fdelete-null-pointer-checks -fdevirtualize -fdse @gol > --fearly-inlining -fipa-sra -fexpensive-optimizations -ffat-lto-objects > @gol > +-fearly-inlining -feliminate-malloc -fipa-sra -fexpensive-optimizations > @gol > +-ffat-lto-objects @gol > -ffast-math -ffinite-math-only -ffloat-store > -fexcess-precision=@var{style} @gol > -fforward-propagate -ffp-contract=@var{style} -ffunction-sections @gol > -fgcse -fgcse-after-reload -fgcse-las -fgcse-lm -fgraphite-identity @gol > @@ -6238,6 +6239,7 @@ > -fdefer-pop @gol > -fdelayed-branch @gol > -fdse @gol > +-feliminate-malloc @gol > -fguess-branch-probability @gol > -fif-conversion2 @gol > -fif-conversion @gol > @@ -6762,6 +6764,11 @@ > Perform dead store elimination (DSE) on RTL@. > Enabled by default at @option{-O} and higher. > > +@item -feliminate-malloc > +@opindex feliminate-malloc > +Eliminate unnecessary malloc/free pairs. > +Enabled by default at @option{-O} and higher. > + > @item -fif-conversion > @opindex fif-conversion > Attempt to transform conditional jumps into branch-less equivalents. > This > Index: gcc/testsuite/gcc.dg/free-malloc.c > === > --- gcc/testsuite/gcc.dg/free-malloc.c (revision 0) > +++ gcc/testsuite/gcc.dg/free-malloc.c (revision 0) > @@ -0,0 +1,12 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fno-eliminate-malloc" } */ > +/* { dg-final { scan-assembler-times "malloc" 2} } */ > +/* { dg-final { scan-assembler-times "free" 2} } */ > + > +extern void * malloc (unsigned long); > +extern void free (void *); > + > +void test () > +{ > + free (malloc (10)); > +} > Index: gcc/common.opt > === > --- gcc/common.opt (revision 187277) > +++ gcc/common.opt (working copy) > @@ -1474,6 +1474,10 @@ > Common Var(flag_dce) Init(1) Optimization > Use the RTL dead code elimination pass > > +feliminate-malloc > +Common Var(flag_eliminate_malloc) Init(1) Optimization > +Eliminate unnecessary malloc/free pairs > + > fdse > Common Var(flag_dse) Init(1) Optimizati
Re: [PATCH] Add -feliminate-malloc to enable/disable elimination of redundant malloc/free pairs
On Wed, May 9, 2012 at 11:48 AM, Dehao Chen wrote: > On Wed, May 9, 2012 at 5:22 PM, Richard Guenther > wrote: >> On Wed, May 9, 2012 at 10:38 AM, Dehao Chen wrote: >>> On Wed, May 9, 2012 at 4:12 PM, Richard Guenther >>> wrote: On Tue, May 8, 2012 at 6:18 PM, Xinliang David Li wrote: > To be clear, this flag is for malloc implementation (such as tcmalloc) > with side effect unknown to the compiler. Using -fno-builtin-xxx is > too conservative for that purpose. I don't think that flys. Btw, the patch also guards alloca - alloca is purely GCC internal. What's the "unknown side-effects" that are also important to preserve for free(malloc(4))? >>> >>> Malloc implementation may record some info to a global structure, and >>> a program may use this free(malloc()) pair to simulate the real runs >>> to get some data, such as peak memory requirement. >> >> So why not use an alternate interface into this special allocator for this >> purpose? > > There can be the following scenario: > > We want to add a module to an existing app. Before implementing the > module, we want to collect some statistics on real runs. In this > scenario, we need: > > * No change to the legacy code > * Optimized build for the simulation run > * Provide accurate statistical info > > We want to collect data for both new module and the legacy code > without changing the later, thus we cannot use a new malloc/free > interface. So put in a optimization barrier then. Like free (({ void * x = malloc (4); __asm ("" : "+m" (x)); __x; }); Btw, why can't you simply build the new module (which doesn't something real anyway, just fake stuff) without optimization? > Thanks, > Dehao > >> >>> Dehao >>> Richard. > David > > On Tue, May 8, 2012 at 7:43 AM, Dehao Chen wrote: >> Hello, >> >> This patch adds a flag to guard the optimization that optimize the >> following code away: >> >> free (malloc (4)); >> >> In some cases, we'd like this type of malloc/free pairs to remain in >> the optimized code. >> >> Tested with bootstrap, and no regression in the gcc testsuite. >> >> Is it ok for mainline? >> >> Thanks, >> Dehao >> >> gcc/ChangeLog >> 2012-05-08 Dehao Chen >> >> * common.opt (feliminate-malloc): New. >> * doc/invoke.texi: Document it. >> * tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Honor it. >> >> gcc/testsuite/ChangeLog >> 2012-05-08 Dehao Chen >> >> * gcc.dg/free-malloc.c: Check if -fno-eliminate-malloc is working >> as expected. >> >> Index: gcc/doc/invoke.texi >> === >> --- gcc/doc/invoke.texi (revision 187277) >> +++ gcc/doc/invoke.texi (working copy) >> @@ -360,7 +360,8 @@ >> -fcx-limited-range @gol >> -fdata-sections -fdce -fdelayed-branch @gol >> -fdelete-null-pointer-checks -fdevirtualize -fdse @gol >> --fearly-inlining -fipa-sra -fexpensive-optimizations -ffat-lto-objects >> @gol >> +-fearly-inlining -feliminate-malloc -fipa-sra -fexpensive-optimizations >> @gol >> +-ffat-lto-objects @gol >> -ffast-math -ffinite-math-only -ffloat-store >> -fexcess-precision=@var{style} @gol >> -fforward-propagate -ffp-contract=@var{style} -ffunction-sections @gol >> -fgcse -fgcse-after-reload -fgcse-las -fgcse-lm -fgraphite-identity @gol >> @@ -6238,6 +6239,7 @@ >> -fdefer-pop @gol >> -fdelayed-branch @gol >> -fdse @gol >> +-feliminate-malloc @gol >> -fguess-branch-probability @gol >> -fif-conversion2 @gol >> -fif-conversion @gol >> @@ -6762,6 +6764,11 @@ >> Perform dead store elimination (DSE) on RTL@. >> Enabled by default at @option{-O} and higher. >> >> +@item -feliminate-malloc >> +@opindex feliminate-malloc >> +Eliminate unnecessary malloc/free pairs. >> +Enabled by default at @option{-O} and higher. >> + >> @item -fif-conversion >> @opindex fif-conversion >> Attempt to transform conditional jumps into branch-less equivalents. >> This >> Index: gcc/testsuite/gcc.dg/free-malloc.c >> === >> --- gcc/testsuite/gcc.dg/free-malloc.c (revision 0) >> +++ gcc/testsuite/gcc.dg/free-malloc.c (revision 0) >> @@ -0,0 +1,12 @@ >> +/* { dg-do compile } */ >> +/* { dg-options "-O2 -fno-eliminate-malloc" } */ >> +/* { dg-final { scan-assembler-times "malloc" 2} } */ >> +/* { dg-final { scan-assembler-times "free" 2} } */ >> + >> +extern void * malloc (unsigned long); >> +extern void free (void *); >> + >> +void test () >> +{ >> + free (malloc (10)); >> +} >> Index: gcc/common.opt >>
Re: [PATCH] Add -feliminate-malloc to enable/disable elimination of redundant malloc/free pairs
On Wed, May 09, 2012 at 05:48:25PM +0800, Dehao Chen wrote: > > So why not use an alternate interface into this special allocator for this > > purpose? > > There can be the following scenario: > > We want to add a module to an existing app. Before implementing the > module, we want to collect some statistics on real runs. In this > scenario, we need: > > * No change to the legacy code > * Optimized build for the simulation run > * Provide accurate statistical info > > We want to collect data for both new module and the legacy code > without changing the later, thus we cannot use a new malloc/free > interface. Note that even in the glibc testsuite there had to be workarounds for this (asm barrier was used): http://sources.redhat.com/ml/libc-alpha/2012-05/msg00138.html Jakub
[PATCH] ARM/NEON: vld1q_dup_s64 builtin
Hello, On ARM+Neon, the expansion of vld1q_dup_s64() and vld1q_dup_u64() builtins currently fails to load the second vector element. Here is a small patch to address this problem: 2012-05-07 Christophe Lyon * gcc/config/arm/neon.md (neon_vld1_dup): Fix vld1q_dup_s64. Index: gcc/config/arm/neon.md === --- gcc/config/arm/neon.md(revision 2659) +++ gcc/config/arm/neon.md(revision 2660) @@ -4203,7 +4203,7 @@ if (GET_MODE_NUNITS (mode) > 2) return "vld1.\t{%e0[], %f0[]}, %A1"; else -return "vld1.\t%h0, %A1"; +return "vld1.\t%e0, %A1 \;vmov\t%f0, %e0"; } [(set (attr "neon_type") (if_then_else (gt (const_string "") (const_string "1")) OK? Thanks, Christophe.
Backport: fma3 instruction generation for 'march=native' in AMD processors
Hello, Below is the patch that has been committed in trunk (Revision: 187075). We like to backport it to GCC 4.7 branch as couple of AMD processors require this change for fma3 instruction generation. Bootstrapping and testing are successful. Is it OK to commit in GCC 4.7 branch? Regards Ganesh PATCH = * config/i386/driver-i386.c (host_detect_local_cpu): Reset has_fma4 for AMD processors with both fma3 and fma4 support. Index: config/i386/driver-i386.c === --- config/i386/driver-i386.c (revision 186897) +++ config/i386/driver-i386.c (working copy) @@ -472,6 +472,8 @@ has_abm = ecx & bit_ABM; has_lwp = ecx & bit_LWP; has_fma4 = ecx & bit_FMA4; + if (vendor == SIG_AMD && has_fma4 && has_fma) + has_fma4 = 0; has_xop = ecx & bit_XOP; has_tbm = ecx & bit_TBM; has_lzcnt = ecx & bit_LZCNT;
Re: [RFC] improve caret diagnostics for overload failures
Someone opened a bug about this: http://gcc.gnu.org/PR53289 Pinging: http://gcc.gnu.org/ml/gcc-patches/2012-04/msg01836.html On 29 April 2012 12:28, Manuel López-Ibáñez wrote: > A new version using unsigned int for the flag type. It also adds > another use in the C FE. > > I am not asking for approval, only whether this > approach/implementation is the way to go. > > Cheers, > > Manuel. > > On 23 April 2012 20:09, Manuel López-Ibáñez wrote: >> So, apart from the type of the flag, are there any other comments on >> the patch? Is the approach acceptable? >> >> On 21 April 2012 17:51, Gabriel Dos Reis >> wrote: >>> On Sat, Apr 21, 2012 at 9:42 AM, Jakub Jelinek wrote: On Sat, Apr 21, 2012 at 04:26:32PM +0200, Manuel López-Ibáñez wrote: > On 21 April 2012 16:22, Gabriel Dos Reis > wrote: > > Do no use 'char' as the type of a flag. Prefer 'unsigned int'. > > > > Thanks, good catch! Should I worry about memory here and use something > shorter? If it is a bool flag, you certainly should use bool type, which is shorter. >>> >>> It is a bit flag -- see the patch in his original message and 'enum >>> diagnostic_info_flags'.
Re: [testsuite] Fix gcc.target/i386/hle-* testcases with Sun as
Hi Mike, > On May 8, 2012, at 10:19 AM, Rainer Orth wrote: >> Several /gcc.target/i386/hle-*.c tests are currently failing on Solaris >> 9/x86 with Sun as: >> >> FAIL: gcc.target/i386/hle-add-acq-1.c scan-assembler lock[ >> \\n\\t]+(xacquire|.byte[ \\t]+0xf2)[ \\t\\n]+add >> >> The .s file has >> >>lock; >>.byte 0xf2 >> >> but the scan-assembler regex currently doesn't allow for the ; (which is >> not present with gas 2.22). >> >> The patch below does just that. Tested with the appropriate runtest >> invocation on i386-pc-solaris2.9 configured with as and gas >> respectively. >> >> Ok for mainline? > > Ok, assuming that the ; has to be there. If it doesn't have to be I just rechecked: this is covered by the gcc_cv_as_ix86_rep_lock_prefix test in gcc/configure.as which fails with the Solaris (8 and) 9 native assembler. > there, an alternative patch might be to remove it from the port now > instead of the patch. Right, that's why I was asking for review rather than just installing on my own. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
[PATCH] Vectorizer versioning TLC
This continues the series of patches cleaning up versioning / peeling in the vectorizer. This moves computation of whether to do a cost model check to a central place where it is also easy to see where it will eventually end up generated. Apart from this re-org this recognizes that runtime checks are pointless if we could have done the check fully at compile-time. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk. Richard. 2012-05-09 Richard Guenther * tree-vectorizer.h (vect_loop_versioning): Adjust prototype. (vect_do_peeling_for_loop_bound): Likewise. (vect_do_peeling_for_alignment): Likewise. * tree-vect-loop-manip.c (conservative_cost_threshold): Remove. (vect_do_peeling_for_loop_bound): Get check_profitability and threshold as parameters. (vect_do_peeling_for_alignment): Likewise. (vect_loop_versioning): Likewise. * tree-vect-loop.c (vect_transform_loop): Compute check_profitability and threshold here. Control where to put the check here. Index: gcc/tree-vectorizer.h === *** gcc/tree-vectorizer.h (revision 187316) --- gcc/tree-vectorizer.h (working copy) *** extern LOC vect_loop_location; *** 807,816 in tree-vect-loop-manip.c. */ extern void slpeel_make_loop_iterate_ntimes (struct loop *, tree); extern bool slpeel_can_duplicate_loop_p (const struct loop *, const_edge); ! extern void vect_loop_versioning (loop_vec_info); extern void vect_do_peeling_for_loop_bound (loop_vec_info, tree *, ! tree, gimple_seq); ! extern void vect_do_peeling_for_alignment (loop_vec_info); extern LOC find_loop_location (struct loop *); extern bool vect_can_advance_ivs_p (loop_vec_info); --- 807,816 in tree-vect-loop-manip.c. */ extern void slpeel_make_loop_iterate_ntimes (struct loop *, tree); extern bool slpeel_can_duplicate_loop_p (const struct loop *, const_edge); ! extern void vect_loop_versioning (loop_vec_info, unsigned int, bool); extern void vect_do_peeling_for_loop_bound (loop_vec_info, tree *, ! unsigned int, bool); ! extern void vect_do_peeling_for_alignment (loop_vec_info, unsigned int, bool); extern LOC find_loop_location (struct loop *); extern bool vect_can_advance_ivs_p (loop_vec_info); Index: gcc/tree-vect-loop-manip.c === *** gcc/tree-vect-loop-manip.c (revision 187316) --- gcc/tree-vect-loop-manip.c (working copy) *** vect_update_ivs_after_vectorizer (loop_v *** 1853,1886 } } - /* Return the more conservative threshold between the -min_profitable_iters returned by the cost model and the user -specified threshold, if provided. */ - - static unsigned int - conservative_cost_threshold (loop_vec_info loop_vinfo, -int min_profitable_iters) - { - unsigned int th; - int min_scalar_loop_bound; - - min_scalar_loop_bound = ((PARAM_VALUE (PARAM_MIN_VECT_LOOP_BOUND) - * LOOP_VINFO_VECT_FACTOR (loop_vinfo)) - 1); - - /* Use the cost model only if it is more conservative than user specified - threshold. */ - th = (unsigned) min_scalar_loop_bound; - if (min_profitable_iters - && (!min_scalar_loop_bound - || min_profitable_iters > min_scalar_loop_bound)) - th = (unsigned) min_profitable_iters; - - if (th && vect_print_dump_info (REPORT_COST)) - fprintf (vect_dump, "Profitability threshold is %u loop iterations.", th); - - return th; - } - /* Function vect_do_peeling_for_loop_bound Peel the last iterations of the loop represented by LOOP_VINFO. --- 1853,1858 *** conservative_cost_threshold (loop_vec_in *** 1896,1902 void vect_do_peeling_for_loop_bound (loop_vec_info loop_vinfo, tree *ratio, ! tree cond_expr, gimple_seq cond_expr_stmt_list) { tree ni_name, ratio_mult_vf_name; struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo); --- 1868,1874 void vect_do_peeling_for_loop_bound (loop_vec_info loop_vinfo, tree *ratio, ! unsigned int th, bool check_profitability) { tree ni_name, ratio_mult_vf_name; struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo); *** vect_do_peeling_for_loop_bound (loop_vec *** 1904,1913 edge update_e; basic_block preheader; int loop_num; - bool check_profitability = false; - unsigned int th = 0; - int min_profitable_iters; int max_iter; if (vect_print_dump_info (REPORT_DETAILS)) fprintf (vect_dump, "=== vect_do_peeling_for_loop_bound ==="); --- 1876,1884 edge update_e; basic_block preheader; int loop_num; int max_iter; + tree cond_expr = NULL_TREE; + gimple_seq cond_expr_st
Ping: [PATCH] Add -fdump-rtl--quiet
Ping. On 02/05/12 10:49, Andrew Stubbs wrote: On 19/04/12 13:58, Andrew Stubbs wrote: In the meantime, Mr Maintainers, can I commit my patch while we wait for the new world order? I'm happy to change the option name "quiet" to something else if necessary. Ping. I think David may have a point about 'quiet' being an inappropriate name for this option. He proposed 'ir' which I don't like but will do the job just fine. Ok, with that change? Andrew
Re: Backport: fma3 instruction generation for 'march=native' in AMD processors
Hello! > Below is the patch that has been committed in trunk (Revision: 187075). > We like to backport it to GCC 4.7 branch as couple of AMD processors require > this change for fma3 instruction generation. > Is it OK to commit in GCC 4.7 branch? > * config/i386/driver-i386.c (host_detect_local_cpu): Reset > has_fma4 for AMD processors with both fma3 and fma4 support. OK. Thanks, Uros.
Re: [testsuite] Fix gcc.target/i386/hle-* testcases with Sun as
On Wed, May 9, 2012 at 12:53 PM, Rainer Orth wrote: >>> Several /gcc.target/i386/hle-*.c tests are currently failing on Solaris >>> 9/x86 with Sun as: >>> >>> FAIL: gcc.target/i386/hle-add-acq-1.c scan-assembler lock[ >>> \\n\\t]+(xacquire|.byte[ \\t]+0xf2)[ \\t\\n]+add >>> >>> The .s file has >>> >>> lock; >>> .byte 0xf2 >>> >>> but the scan-assembler regex currently doesn't allow for the ; (which is >>> not present with gas 2.22). >>> >>> The patch below does just that. Tested with the appropriate runtest >>> invocation on i386-pc-solaris2.9 configured with as and gas >>> respectively. >>> >>> Ok for mainline? >> >> Ok, assuming that the ; has to be there. If it doesn't have to be > > I just rechecked: this is covered by the gcc_cv_as_ix86_rep_lock_prefix > test in gcc/configure.as which fails with the Solaris (8 and) 9 native > assembler. > >> there, an alternative patch might be to remove it from the port now >> instead of the patch. > > Right, that's why I was asking for review rather than just installing on > my own. I'd rather see that we remove semicolon in this case, but please note that xchg with memory operand doesn't need lock prefix. If it is not too much trouble... Uros.
RE: Propose to add a new argument MULTILIB_REQUIRED in multilib framework
> -Original Message- > From: Joseph Myers [mailto:jos...@codesourcery.com] > Sent: Thursday, May 03, 2012 8:49 PM > To: Terry Guo > Cc: gcc-patches@gcc.gnu.org > Subject: RE: Propose to add a new argument MULTILIB_REQUIRED in > multilib framework > > On Thu, 3 May 2012, Terry Guo wrote: > > > 2012-05-03 Terry Guo > > > > * Makefile.in (s-mlib): Add new argument MULTILIB_REQUIRED. > > * genmultilib (MULTILIB_REQUIRED): New. > > * doc/fragments.texi: Document the MULTILIB_REQUIRED. > > This is OK, with copyright dates on genmultilib and fragments.texi > updated, in the absence of any build system maintainer objections > within > 72 hours. > Committed into trunk with this patch at revision 187325 and copyright patch at revision 187327. BR, Terry
Re: [C++ Patch] PR 53158
Hi again, On 05/08/2012 03:00 PM, Jason Merrill wrote: On 05/07/2012 11:28 PM, Paolo Carlini wrote: error: could not convert ‘b.main()::()’ from ‘void’ to ‘bool’ It wouldn't say "operator()"? I think I'd leave that alone; it is somewhat more informative (about what b() expands to) and we're moving toward replacing %qE with caret anyway. The patch to ocp_convert is OK. As you may have noticed, the patchlet is still unapplied (I'm attaching below what I have tested and ready to get in per your approval). Is unapplied because I was really nervous due to the wrong location (thus caret) of the error call, at the end of the whole condition. Now, I'm wondering, shall we consistently use error_at (location_of (expr), ... for the error messages produced by the *convert* functions? The below quick fix makes me *much* more happy, the caret points to the closed round brace of the b() call. Can I trust all the exprs to come with an embedded location *at least* as accurate as input_location, normally better? In case I can do a pass through all of cvt.c etc and repost a largish patch... Thanks! Paolo. Index: testsuite/g++.dg/cpp0x/lambda/lambda-err2.C === --- testsuite/g++.dg/cpp0x/lambda/lambda-err2.C (revision 0) +++ testsuite/g++.dg/cpp0x/lambda/lambda-err2.C (revision 0) @@ -0,0 +1,12 @@ +// PR c++/53158 +// { dg-do compile { target c++11 } } + +int main() +{ + auto a = []() { return true; }; + auto b = []() { return a(); }; // { dg-error "'a' is not captured" } + int c, d; + while (b() && c < d) // { dg-error "could not convert" } +{ +} +} Index: cp/cvt.c === --- cp/cvt.c(revision 187280) +++ cp/cvt.c(working copy) @@ -743,6 +743,12 @@ ocp_convert (tree type, tree expr, int convtype, i } if (code == BOOLEAN_TYPE) { + if (TREE_CODE (intype) == VOID_TYPE) + { + error ("could not convert %qE from % to %", expr); + return error_mark_node; + } + /* We can't implicitly convert a scoped enum to bool, so convert to the underlying type first. */ if (SCOPED_ENUM_P (intype) && (convtype & CONV_STATIC)) Index: cvt.c === --- cvt.c (revision 187290) +++ cvt.c (working copy) @@ -743,6 +743,14 @@ ocp_convert (tree type, tree expr, int convtype, i } if (code == BOOLEAN_TYPE) { + if (TREE_CODE (intype) == VOID_TYPE) + { + error_at (location_of (expr), + "could not convert %qE from % to %", + expr); + return error_mark_node; + } + /* We can't implicitly convert a scoped enum to bool, so convert to the underlying type first. */ if (SCOPED_ENUM_P (intype) && (convtype & CONV_STATIC))
Re: [PATCH] Fix endless loop in forwprop (PR tree-optimization/53226)
On Wed, May 09, 2012 at 10:21:04AM +0200, Richard Guenther wrote: > Using the stmt UID looks odd as you are using it as flag - why not use > one of the two stmt flags available to passes? > > Btw, the other possibility would be to simply change the various combiners > to require them to update the iterator to point at the first statement that > they have inserted. I have some TLC ideas in this area, let's see if I can > get to them. > > Ok with using gimple_set_plf and friends. Here is what I've committed after doing another bootstrap/regtest: 2012-05-09 Jakub Jelinek PR tree-optimization/53226 * tree-ssa-forwprop.c (ssa_forward_propagate_and_combine): Remove prev and prev_initialized vars, gimple_set_plf (stmt, GF_PLF_1, false) before processing it and gimple_set_plf (stmt, GF_PLF_1, true) if it doesn't need to be revisited, look for earliest stmt with !gimple_plf (stmt, GF_PLF_1) if something changed. * gcc.c-torture/compile/pr53226.c: New test. --- gcc/tree-ssa-forwprop.c.jj 2012-05-03 08:35:52.0 +0200 +++ gcc/tree-ssa-forwprop.c 2012-05-08 18:10:19.662061709 +0200 @@ -2677,8 +2677,7 @@ ssa_forward_propagate_and_combine (void) FOR_EACH_BB (bb) { - gimple_stmt_iterator gsi, prev; - bool prev_initialized; + gimple_stmt_iterator gsi; /* Apply forward propagation to all stmts in the basic-block. Note we update GSI within the loop as necessary. */ @@ -2771,12 +2770,14 @@ ssa_forward_propagate_and_combine (void) /* Combine stmts with the stmts defining their operands. Note we update GSI within the loop as necessary. */ - prev_initialized = false; for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);) { gimple stmt = gsi_stmt (gsi); bool changed = false; + /* Mark stmt as potentially needing revisiting. */ + gimple_set_plf (stmt, GF_PLF_1, false); + switch (gimple_code (stmt)) { case GIMPLE_ASSIGN: @@ -2856,18 +2857,18 @@ ssa_forward_propagate_and_combine (void) { /* If the stmt changed then re-visit it and the statements inserted before it. */ - if (!prev_initialized) + for (; !gsi_end_p (gsi); gsi_prev (&gsi)) + if (gimple_plf (gsi_stmt (gsi), GF_PLF_1)) + break; + if (gsi_end_p (gsi)) gsi = gsi_start_bb (bb); else - { - gsi = prev; - gsi_next (&gsi); - } + gsi_next (&gsi); } else { - prev = gsi; - prev_initialized = true; + /* Stmt no longer needs to be revisited. */ + gimple_set_plf (stmt, GF_PLF_1, true); gsi_next (&gsi); } } --- gcc/testsuite/gcc.c-torture/compile/pr53226.c.jj2012-05-08 18:07:40.007000510 +0200 +++ gcc/testsuite/gcc.c-torture/compile/pr53226.c 2012-05-08 18:07:35.071029578 +0200 @@ -0,0 +1,13 @@ +/* PR tree-optimization/53226 */ + +void +foo (unsigned long *x, char y, char z) +{ + int i; + for (i = y; i < z; ++i) +{ + unsigned long a = ((unsigned char) i) & 63UL; + unsigned long b = 1ULL << a; + *x |= b; +} +} Jakub
Fix PR53185 (vectorizer segfault)
Hi, the current code for strided loads can't deal with the situation when a prologue loop (peeling for alignment) is created after analyzing the data refs. There are multiple issues (non-constant steps in DRs mainly), so this is a simple stop gap. Regtesting on x86_64-linux (all langs) in progress. Okay for trunk? Ciao, Michael. PR tree-optimization/53185 * tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Disable peeling when we see strided loads. testsuite/ * gcc.dg/vect/pr53185.c: New test. Index: tree-vect-data-refs.c === --- tree-vect-data-refs.c (revision 187287) +++ tree-vect-data-refs.c (working copy) @@ -1507,6 +1507,17 @@ vect_enhance_data_refs_alignment (loop_v && GROUP_FIRST_ELEMENT (stmt_info) != stmt) continue; + /* FORNOW: Any strided load prevents peeling. The induction + variable analysis will fail when the prologue loop is generated, +and so we can't generate the new base for the pointer. */ + if (STMT_VINFO_STRIDE_LOAD_P (stmt_info)) + { + if (vect_print_dump_info (REPORT_DETAILS)) + fprintf (vect_dump, "strided load prevents peeling"); + do_peeling = false; + break; + } + /* For invariant accesses there is nothing to enhance. */ if (integer_zerop (DR_STEP (dr))) continue; Index: testsuite/gcc.dg/vect/pr53185.c === --- testsuite/gcc.dg/vect/pr53185.c (revision 0) +++ testsuite/gcc.dg/vect/pr53185.c (revision 0) @@ -0,0 +1,16 @@ +/* { dg-do compile } */ +/* { dg-options "-O3 -ftree-vectorize" } */ +unsigned short a, e; +int *b, *d; +int c; +extern int fn2(); +void fn1 () { + void *f; + for (;;) { +fn2 (); +b = f; +e = 0; +for (; e < a; ++e) + b[e] = d[e * c]; + } +}
Re: PR 40752: -Wconversion generates false warnings for operands not larger than target type
On 7 May 2012 16:52, Paolo Carlini wrote: > Hi, > > > On 06/11/2010 10:23 PM, Manuel López-Ibáñez wrote: >> >> On 11 August 2009 02:01, Joseph S. Myers wrote: >>> >>> On Tue, 11 Aug 2009, Manuel López-Ibáñez wrote: >>> Modified the patch to make use of the new c-c++-common testsuite. Bootstrapped and regression tested on x86_64-linux-gnu. OK for trunk? >>> >>> I still think the warnings for these cases are mostly correct (there are >>> cases where you may be able to make deductions about the range of >>> possible >>> values of the expression being converted) and appropriate, and if >>> disabled >>> should be disabled under some separate -Wno-conversion-whatever option. >> >> -Wno-conversion-after-promotion ? >> >> I am not sure what would be a good name, but I think it is worth to >> allow silencing these warnings, so suggestions are welcome. > > it looks like this PR is still open today, and I think resolving it one way > or the other isn't much work... Thus I'm asking: shall we actually have a > new -Wno-conversion-after-promotion? Or, alternately - the issue came up > quite often in recent times, because other widespread compilers don't warn - > suppress the warning only in case of conditional expressions: > > char > foo(bool haveBar, char bar_) > { > return haveBar ? bar_ : 0; > }; > > What do you think? I think Joseph has said before that if the range can be deduced to stay within limits of the target type, then we should never warn, no new option needed. This is exactly the case if you recurse only on conditional expression operands and casts, and it will fix the warning above. The other cases should be deal separately. Cheers, Manuel.
Re: [RFC ivopts] ARM - Make ivopts take into account whether pre and post increments are actually supported on targets.
> I would like another set of eyes on the backend specific changes - I > am currently regression testing this final version on FSF trunk. After testing and benchmarking and getting some private feedback about the patch, this is what I ended up committing. I have a follow up patch coming to adjust legitimate_address and friends for some of these modes. regards, Ramana 2012-05-09 Ramana Radhakrishnan * tree-ssa-loop-ivopts.c (add_autoinc_candidates, get_address_cost): Replace use of HAVE_{POST/PRE}_{INCREMENT/DECREMENT} with USE_{LOAD/STORE}_{PRE/POST}_{INCREMENT/DECREMENT} appropriately. * config/arm/arm.h (ARM_AUTOINC_VALID_FOR_MODE_P): New. (USE_LOAD_POST_INCREMENT): Define. (USE_LOAD_PRE_INCREMENT): Define. (USE_LOAD_POST_DECREMENT): Define. (USE_LOAD_PRE_DECREMENT): Define. (USE_STORE_PRE_DECREMENT): Define. (USE_STORE_PRE_INCREMENT): Define. (USE_STORE_POST_DECREMENT): Define. (USE_STORE_POST_INCREMENT): Define. (arm_auto_incmodes): Add enumeration. * config/arm/arm-protos.h (arm_autoinc_modes_ok_p): Declare. * config/arm/arm.c (arm_autoinc_modes_ok_p): Define. > > 2012-04-10 Ramana Radhakrishnan > > * tree-ssa-loop-ivopts.c (add_autoinc_candidates, get_address_cost): > Replace use of HAVE_{POST/PRE}_{INCREMENT/DECREMENT} with > USE_{LOAD/STORE}_{PRE/POST}_{INCREMENT/DECREMENT} appropriately. > * config/arm/arm.h (ARM_AUTOINC_VALID_FOR_MODE_P): New. > (USE_LOAD_POST_INCREMENT): Define. > (USE_LOAD_PRE_INCREMENT): Define. > (USE_LOAD_POST_DECREMENT): Define. > (USE_LOAD_PRE_DECREMENT): Define. > (USE_STORE_PRE_DECREMENT): Define. > (USE_STORE_PRE_INCREMENT): Define. > (USE_STORE_POST_DECREMENT): Define. > (USE_STORE_POST_INCREMENT): Define. > (arm_auto_incmodes): Add enumeration. > * config/arm/arm-protos.h (arm_autoinc_modes_ok_p): Declare. > * config/arm/arm.c (arm_autoinc_modes_ok_p): Define. > (arm_rtx_costs_1): Adjust costs for > auto-inc modes and pre / post modify in floating point mode. > (arm_size_rtx_costs): Likewise. > > > regards, > Ramana > > > >> Richard. >> >>> Richard. >>> Ramana Index: gcc/tree-ssa-loop-ivopts.c === --- gcc/tree-ssa-loop-ivopts.c (revision 187327) +++ gcc/tree-ssa-loop-ivopts.c (working copy) @@ -2362,8 +2362,12 @@ cstepi = int_cst_value (step); mem_mode = TYPE_MODE (TREE_TYPE (*use->op_p)); - if ((HAVE_PRE_INCREMENT && GET_MODE_SIZE (mem_mode) == cstepi) - || (HAVE_PRE_DECREMENT && GET_MODE_SIZE (mem_mode) == -cstepi)) + if (((USE_LOAD_PRE_INCREMENT (mem_mode) + || USE_STORE_PRE_INCREMENT (mem_mode)) + && GET_MODE_SIZE (mem_mode) == cstepi) + || ((USE_LOAD_PRE_DECREMENT (mem_mode) + || USE_STORE_PRE_DECREMENT (mem_mode)) + && GET_MODE_SIZE (mem_mode) == -cstepi)) { enum tree_code code = MINUS_EXPR; tree new_base; @@ -2380,8 +2384,12 @@ add_candidate_1 (data, new_base, step, important, IP_BEFORE_USE, use, use->stmt); } - if ((HAVE_POST_INCREMENT && GET_MODE_SIZE (mem_mode) == cstepi) - || (HAVE_POST_DECREMENT && GET_MODE_SIZE (mem_mode) == -cstepi)) + if (((USE_LOAD_POST_INCREMENT (mem_mode) + || USE_STORE_POST_INCREMENT (mem_mode)) + && GET_MODE_SIZE (mem_mode) == cstepi) + || ((USE_LOAD_POST_DECREMENT (mem_mode) + || USE_STORE_POST_DECREMENT (mem_mode)) + && GET_MODE_SIZE (mem_mode) == -cstepi)) { add_candidate_1 (data, base, step, important, IP_AFTER_USE, use, use->stmt); @@ -3315,25 +3323,29 @@ reg0 = gen_raw_REG (address_mode, LAST_VIRTUAL_REGISTER + 1); reg1 = gen_raw_REG (address_mode, LAST_VIRTUAL_REGISTER + 2); - if (HAVE_PRE_DECREMENT) + if (USE_LOAD_PRE_DECREMENT (mem_mode) + || USE_STORE_PRE_DECREMENT (mem_mode)) { addr = gen_rtx_PRE_DEC (address_mode, reg0); has_predec[mem_mode] = memory_address_addr_space_p (mem_mode, addr, as); } - if (HAVE_POST_DECREMENT) + if (USE_LOAD_POST_DECREMENT (mem_mode) + || USE_STORE_POST_DECREMENT (mem_mode)) { addr = gen_rtx_POST_DEC (address_mode, reg0); has_postdec[mem_mode] = memory_address_addr_space_p (mem_mode, addr, as); } - if (HAVE_PRE_INCREMENT) + if (USE_LOAD_PRE_INCREMENT (mem_mode) + || USE_STORE_PRE_DECREMENT (mem_mode)) { addr = gen_rtx_PRE_INC (address_mode, reg0); has_preinc[mem_mode] = memory_address_addr_space_p (mem_mode, addr, as); } - if (HAVE_POST_INCREMENT) + if (USE_LOAD_POST_INCREMENT (mem_mode) + || USE_STORE_POST_INCREMENT (mem_mode)) { addr = gen_r
[PATCH] Move unexecuted vect testcase
Committed as obvious (passes on x86_64). Richard. 2012-05-09 Richard Guenther PR tree-optimization/18437 * gfortran.dg/vect/rnflow-trs2a2.f90: Move ... * gfortran.dg/vect/fast-math-rnflow-trs2a2.f90: ... here.
RFA: Fix detecting of in-tree MPFR 3.1.0 sources
Hi Guys, http://www.mpfr.org/mpfr-current/#changes The current release of the MPFR library (v3.1.0) has reorganized its sources such the mpfr.h header file is now in a sub-directory called 'src', rather than being at the top level. This has broken GCC's use of in-tree MPFR sources. I am asking for permission to apply the patch below to fix the problem. I tested it by building an i686-pc-linux-gnu toolchain on a machine with no MPFR libraries installed, but with a copy of the mpfr 3.1.0 sources installed in-tree. I also built a second toolchain with an in-tree copy of the mpfr 2.4.2 sources, just to make sure that the old paths still worked. Both builds worked. OK to apply ? Cheers Nick gcc/ChangeLog 2012-05-09 Nick Clifton * configure.ac (mpfr-dir): When using in-tree MPFR sources allow for the fact that from release v3.1.0 of MPFR the source files were moved into a src sub-directory. * configure: Regenerate. Index: configure.ac === --- configure.ac(revision 187320) +++ configure.ac(working copy) @@ -1289,9 +1289,16 @@ gmplibs="-L$with_mpfr_lib $gmplibs" fi if test "x$with_mpfr$with_mpfr_include$with_mpfr_lib" = x && test -d ${srcdir}/mpfr; then - gmplibs='-L$$r/$(HOST_SUBDIR)/mpfr/'"$lt_cv_objdir $gmplibs" - gmpinc='-I$$r/$(HOST_SUBDIR)/mpfr -I$$s/mpfr '"$gmpinc" - extra_mpc_mpfr_configure_flags='--with-mpfr-include=$$s/mpfr --with-mpfr-lib=$$r/$(HOST_SUBDIR)/mpfr/'"$lt_cv_objdir" + # MPFR v3.1.0 moved the sources into a src sub-directory. + if test -d ${srcdir}/mpfr/src; then +gmplibs='-L$$r/$(HOST_SUBDIR)/mpfr/src/'"$lt_cv_objdir $gmplibs" +gmpinc='-I$$r/$(HOST_SUBDIR)/mpfr/src -I$$s/mpfr/src '"$gmpinc" +extra_mpc_mpfr_configure_flags='--with-mpfr-include=$$s/mpfr/src --with-mpfr-lib=$$r/$(HOST_SUBDIR)/mpfr/src/'"$lt_cv_objdir" + else +gmplibs='-L$$r/$(HOST_SUBDIR)/mpfr/'"$lt_cv_objdir $gmplibs" +gmpinc='-I$$r/$(HOST_SUBDIR)/mpfr -I$$s/mpfr '"$gmpinc" +extra_mpc_mpfr_configure_flags='--with-mpfr-include=$$s/mpfr --with-mpfr-lib=$$r/$(HOST_SUBDIR)/mpfr/'"$lt_cv_objdir" + fi # Do not test the mpfr version. Assume that it is sufficient, since # it is in the source tree, and the library has not been built yet # but it would be included on the link line in the version check below
Re: [C++ Patch] PR 53158
On 05/09/2012 08:21 AM, Paolo Carlini wrote: Is unapplied because I was really nervous due to the wrong location (thus caret) of the error call, at the end of the whole condition. Now, I'm wondering, shall we consistently use error_at (location_of (expr), ... for the error messages produced by the *convert* functions? Sure. Note that as an alternative to error_at (location_of (expr) you can just use %q+E; the + means "use the location of this argument". The below quick fix makes me *much* more happy, the caret points to the closed round brace of the b() call. Can I trust all the exprs to come with an embedded location *at least* as accurate as input_location, normally better? I would think so, yes. If an expression doesn't have a location, location_of/%q+E will just use input_location. Jason
Re: [RFC] improve caret diagnostics for overload failures
On 04/29/2012 06:28 AM, Manuel López-Ibáñez wrote: A new version using unsigned int for the flag type. It also adds another use in the C FE. I am not asking for approval, only whether this approach/implementation is the way to go. That looks good. I would still also adjust the caret printer to avoid printing the same caret twice in a row even when we haven't updated the caller. Jason
Re: [C++ Patch] PR 53158
On 9 May 2012 14:59, Jason Merrill wrote: > On 05/09/2012 08:21 AM, Paolo Carlini wrote: >> >> Is unapplied because I was really nervous due to the wrong location >> (thus caret) of the error call, at the end of the whole condition. Now, >> I'm wondering, shall we consistently use error_at (location_of (expr), >> ... for the error messages produced by the *convert* functions? > > > Sure. Note that as an alternative to error_at (location_of (expr) you can > just use %q+E; the + means "use the location of this argument". This far less clear than error_at(location, "whatever"). And it requires the diagnostics machinery to know about input_location. I removed %H precisely for those reasons. Please, let's stop using "+" in diagnostics and always use explicit locations. Cheers, Manuel.
Re: [RFC] improve caret diagnostics for overload failures
On 9 May 2012 15:04, Jason Merrill wrote: > On 04/29/2012 06:28 AM, Manuel López-Ibáńez wrote: >> >> A new version using unsigned int for the flag type. It also adds >> another use in the C FE. >> >> I am not asking for approval, only whether this >> approach/implementation is the way to go. > > > That looks good. I would still also adjust the caret printer to avoid > printing the same caret twice in a row even when we haven't updated the > caller. I could implement that by storing the last location in the diagnostic_context or using a static location_t in diagnostic_show_locus. What is your preference? Cheers, Manuel.
Re: [C++ Patch] PR 53158
Hi, On 05/09/2012 02:59 PM, Jason Merrill wrote: On 05/09/2012 08:21 AM, Paolo Carlini wrote: Is unapplied because I was really nervous due to the wrong location (thus caret) of the error call, at the end of the whole condition. Now, I'm wondering, shall we consistently use error_at (location_of (expr), ... for the error messages produced by the *convert* functions? Sure. Note that as an alternative to error_at (location_of (expr) you can just use %q+E; the + means "use the location of this argument". A great, very concise, didn't know that (but maybe we want to make Manuel happier ;) The below quick fix makes me *much* more happy, the caret points to the closed round brace of the b() call. Can I trust all the exprs to come with an embedded location *at least* as accurate as input_location, normally better? I would think so, yes. If an expression doesn't have a location, location_of/%q+E will just use input_location. In the meanwhile I found a "counterexample" ;) Consider: enum E { e1 }; E e = e1; void bar(int a) { if (e = a) ; } right now the location for the "invalid conversion from ‘int’ to ‘E’" error message seems pretty good, points to the a in "if (e = a)". If I do: Index: call.c === --- call.c (revision 187290) +++ call.c (working copy) @@ -5704,7 +5704,7 @@ convert_like_real (conversion *convs, tree expr, t break; } - permerror (input_location, "invalid conversion from %qT to %qT", + permerror (location_of (expr), "invalid conversion from %qT to %qT", TREE_TYPE (expr), totype); if (fn) permerror (DECL_SOURCE_LOCATION (fn), then an horrible thing happen: the error message points to the b in bar in "void bar(int a)" !!! What the heck has that to do with the actual conversion two lines below?!? Isn't even pointing to the a in "void bar(int a)"! So... I guess I could start experimenting with my idea, but we should be ready to fix regressions... Or we can already take from the "counterexample" a general lesson? Or maybe a subset of the conversion* functions, those in cvt.c?, seems first blush safer to tweak? Paolo.
Re: PATCH: Update longlong.h from GLIBC
On Tue, May 8, 2012 at 4:05 AM, Andreas Jaeger wrote: > On Tuesday, May 08, 2012 11:59:34 Richard Earnshaw wrote: >> On 08/05/12 10:04, Andreas Jaeger wrote: >> > On Tuesday, May 08, 2012 10:43:14 Richard Guenther wrote: >> >> On Mon, May 7, 2012 at 11:11 PM, H.J. Lu wrote: >> >>> Hi, >> >>> >> >>> I am preparing to update GLIBC longlong.h from GCC. This patch >> >>> updates GCC longlong.h to use a URL instead of an FSF postal address >> >>> and replace spaces with tab. OK to install? >> >>> >> >>> Since I'd like to simply copy longlong.h from GCC release branch to >> >>> GLIBC, Is this also OK for 4.7 branch? >> >> >> >> Why? Does it fix anything there? >> > >> > It makes sharing the file between gcc and glibc easier, >> > >> > Andreas >> >> Why should glibc be depending on the GCC release branch? Sounds like >> the tail wagging the dog. > > Ah, you discuss the release branch ;) Let HJ defend that one. > > >> Changing this file has quite a high potential for introducing >> regressions. I don't think we should risk that on the release branch. > > It's only whitespace IMO. I'm arguing for the trunk to take the change, > I will take whatever I can get. Ian, is this OK for trunk? Thanks. -- H.J.
Re: [C++ Patch] PR 53158
I think the problem is you really want to use EXPR_LOC_OR_HERE rather than location_of; if the argument happens to be a DECL, location_of will give you the declaration location rather than the use location. Jason
Re: [C++ Patch] PR 53158
On 05/09/2012 09:04 AM, Manuel López-Ibáñez wrote: This far less clear than error_at(location, "whatever"). And it requires the diagnostics machinery to know about input_location. I removed %H precisely for those reasons. Please, let's stop using "+" in diagnostics and always use explicit locations. OK. Jason
Re: [RFC] improve caret diagnostics for overload failures
On 05/09/2012 09:07 AM, Manuel López-Ibáñez wrote: I could implement that by storing the last location in the diagnostic_context or using a static location_t in diagnostic_show_locus. What is your preference? diagnostic_context, I guess. Jason
Re: PR 40752: -Wconversion generates false warnings for operands not larger than target type
Hi, On 05/09/2012 02:54 PM, Manuel López-Ibáñez wrote: On 7 May 2012 16:52, Paolo Carlini wrote: Hi, On 06/11/2010 10:23 PM, Manuel López-Ibáñez wrote: On 11 August 2009 02:01, Joseph S. Myerswrote: On Tue, 11 Aug 2009, Manuel López-Ibáñez wrote: Modified the patch to make use of the new c-c++-common testsuite. Bootstrapped and regression tested on x86_64-linux-gnu. OK for trunk? I still think the warnings for these cases are mostly correct (there are cases where you may be able to make deductions about the range of possible values of the expression being converted) and appropriate, and if disabled should be disabled under some separate -Wno-conversion-whatever option. -Wno-conversion-after-promotion ? I am not sure what would be a good name, but I think it is worth to allow silencing these warnings, so suggestions are welcome. it looks like this PR is still open today, and I think resolving it one way or the other isn't much work... Thus I'm asking: shall we actually have a new -Wno-conversion-after-promotion? Or, alternately - the issue came up quite often in recent times, because other widespread compilers don't warn - suppress the warning only in case of conditional expressions: char foo(bool haveBar, char bar_) { return haveBar ? bar_ : 0; }; What do you think? I think Joseph has said before that if the range can be deduced to stay within limits of the target type, then we should never warn, no new option needed. This is exactly the case if you recurse only on conditional expression operands and casts, and it will fix the warning above. The other cases should be deal separately. Good, I even have a patch essentially ready for that, but can you point me to that detailed feedback? I couldn't find it when I looked for it. Thanks Paolo.
Re: [C++ Patch] PR 53158
On 05/09/2012 03:20 PM, Jason Merrill wrote: I think the problem is you really want to use EXPR_LOC_OR_HERE rather than location_of; if the argument happens to be a DECL, location_of will give you the declaration location rather than the use location. Ah! Thanks a lot, now all those small details I always see in the diagnostics are becoming much more clear ;) Let's see what I can do... Paolo.
Re: Fix PR53185 (vectorizer segfault)
On Wed, May 9, 2012 at 2:38 PM, Michael Matz wrote: > Hi, > > the current code for strided loads can't deal with the situation when a > prologue loop (peeling for alignment) is created after analyzing the data > refs. There are multiple issues (non-constant steps in DRs mainly), so > this is a simple stop gap. > > Regtesting on x86_64-linux (all langs) in progress. Okay for trunk? Ok. Thanks, Richard. > > Ciao, > Michael. > > PR tree-optimization/53185 > * tree-vect-data-refs.c (vect_enhance_data_refs_alignment): Disable > peeling when we see strided loads. > > testsuite/ > * gcc.dg/vect/pr53185.c: New test. > > Index: tree-vect-data-refs.c > === > --- tree-vect-data-refs.c (revision 187287) > +++ tree-vect-data-refs.c (working copy) > @@ -1507,6 +1507,17 @@ vect_enhance_data_refs_alignment (loop_v > && GROUP_FIRST_ELEMENT (stmt_info) != stmt) > continue; > > + /* FORNOW: Any strided load prevents peeling. The induction > + variable analysis will fail when the prologue loop is generated, > + and so we can't generate the new base for the pointer. */ > + if (STMT_VINFO_STRIDE_LOAD_P (stmt_info)) > + { > + if (vect_print_dump_info (REPORT_DETAILS)) > + fprintf (vect_dump, "strided load prevents peeling"); > + do_peeling = false; > + break; > + } > + > /* For invariant accesses there is nothing to enhance. */ > if (integer_zerop (DR_STEP (dr))) > continue; > Index: testsuite/gcc.dg/vect/pr53185.c > === > --- testsuite/gcc.dg/vect/pr53185.c (revision 0) > +++ testsuite/gcc.dg/vect/pr53185.c (revision 0) > @@ -0,0 +1,16 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O3 -ftree-vectorize" } */ > +unsigned short a, e; > +int *b, *d; > +int c; > +extern int fn2(); > +void fn1 () { > + void *f; > + for (;;) { > + fn2 (); > + b = f; > + e = 0; > + for (; e < a; ++e) > + b[e] = d[e * c]; > + } > +}
Re: RFA: Fix detecting of in-tree MPFR 3.1.0 sources
On Wed, May 9, 2012 at 2:52 PM, Nick Clifton wrote: > Hi Guys, > > http://www.mpfr.org/mpfr-current/#changes > > The current release of the MPFR library (v3.1.0) has reorganized its > sources such the mpfr.h header file is now in a sub-directory called > 'src', rather than being at the top level. This has broken GCC's use > of in-tree MPFR sources. > > I am asking for permission to apply the patch below to fix the > problem. I tested it by building an i686-pc-linux-gnu toolchain on a > machine with no MPFR libraries installed, but with a copy of the mpfr > 3.1.0 sources installed in-tree. I also built a second toolchain with > an in-tree copy of the mpfr 2.4.2 sources, just to make sure that the > old paths still worked. Both builds worked. > > OK to apply ? I think we only support dropping in exactly the versions we provide in infrastructure/ - which matches the version we require in install.texi. Or did that change? > Cheers > Nick > > gcc/ChangeLog > 2012-05-09 Nick Clifton > > * configure.ac (mpfr-dir): When using in-tree MPFR sources > allow for the fact that from release v3.1.0 of MPFR the source > files were moved into a src sub-directory. > * configure: Regenerate. > > Index: configure.ac > === > --- configure.ac (revision 187320) > +++ configure.ac (working copy) > @@ -1289,9 +1289,16 @@ > gmplibs="-L$with_mpfr_lib $gmplibs" > fi > if test "x$with_mpfr$with_mpfr_include$with_mpfr_lib" = x && test -d > ${srcdir}/mpfr; then > - gmplibs='-L$$r/$(HOST_SUBDIR)/mpfr/'"$lt_cv_objdir $gmplibs" > - gmpinc='-I$$r/$(HOST_SUBDIR)/mpfr -I$$s/mpfr '"$gmpinc" > - extra_mpc_mpfr_configure_flags='--with-mpfr-include=$$s/mpfr > --with-mpfr-lib=$$r/$(HOST_SUBDIR)/mpfr/'"$lt_cv_objdir" > + # MPFR v3.1.0 moved the sources into a src sub-directory. > + if test -d ${srcdir}/mpfr/src; then > + gmplibs='-L$$r/$(HOST_SUBDIR)/mpfr/src/'"$lt_cv_objdir $gmplibs" > + gmpinc='-I$$r/$(HOST_SUBDIR)/mpfr/src -I$$s/mpfr/src '"$gmpinc" > + extra_mpc_mpfr_configure_flags='--with-mpfr-include=$$s/mpfr/src > --with-mpfr-lib=$$r/$(HOST_SUBDIR)/mpfr/src/'"$lt_cv_objdir" > + else > + gmplibs='-L$$r/$(HOST_SUBDIR)/mpfr/'"$lt_cv_objdir $gmplibs" > + gmpinc='-I$$r/$(HOST_SUBDIR)/mpfr -I$$s/mpfr '"$gmpinc" > + extra_mpc_mpfr_configure_flags='--with-mpfr-include=$$s/mpfr > --with-mpfr-lib=$$r/$(HOST_SUBDIR)/mpfr/'"$lt_cv_objdir" > + fi > # Do not test the mpfr version. Assume that it is sufficient, since > # it is in the source tree, and the library has not been built yet > # but it would be included on the link line in the version check below
Re: [PATCH] gcc/config/freebsd-spec.h: Fix building PIE executables. Link them with crt{begin,end}S.o and Scrt1.o which are PIC instead of crt{begin,end}.o and crt1.o which are not. Spec synced from g
Hmm, sorry, it seems I forgot to look at MAINTAINERS and CC him... On Tue, 8 May 2012 09:53:43 -0400 Alexis Ballier wrote: > gcc/config/i386/freebsd.h: Likewise. > --- > gcc/config/freebsd-spec.h |9 +++-- > gcc/config/i386/freebsd.h |9 +++-- > 2 files changed, 6 insertions(+), 12 deletions(-) > > diff --git a/gcc/config/freebsd-spec.h b/gcc/config/freebsd-spec.h > index 770a3d1..2808582 100644 > --- a/gcc/config/freebsd-spec.h > +++ b/gcc/config/freebsd-spec.h > @@ -64,11 +64,8 @@ see the files COPYING3 and COPYING.RUNTIME > respectively. If not, see before entering `main'. */ > > #define FBSD_STARTFILE_SPEC \ > - "%{!shared: \ > - %{pg:gcrt1.o%s} %{!pg:%{p:gcrt1.o%s} \ > -%{!p:%{profile:gcrt1.o%s} \ > - %{!profile:crt1.o%s \ > - crti.o%s %{!shared:crtbegin.o%s} %{shared:crtbeginS.o%s}" > + "%{!shared: %{pg|p|profile:gcrt1.o%s;pie:Scrt1.o%s;:crt1.o%s}} \ > + crti.o%s %{shared|pie:crtbeginS.o%s;:crtbegin.o%s}" > > /* Provide a ENDFILE_SPEC appropriate for FreeBSD. Here we tack on > the magical crtend.o file (see crtstuff.c) which provides part of > @@ -77,7 +74,7 @@ see the files COPYING3 and COPYING.RUNTIME > respectively. If not, see `crtn.o'. */ > > #define FBSD_ENDFILE_SPEC \ > - "%{!shared:crtend.o%s} %{shared:crtendS.o%s} crtn.o%s" > + "%{shared|pie:crtendS.o%s;:crtend.o%s} crtn.o%s" > > /* Provide a LIB_SPEC appropriate for FreeBSD as configured and as > required by the user-land thread model. Before __FreeBSD_version > diff --git a/gcc/config/i386/freebsd.h b/gcc/config/i386/freebsd.h > index 649274d..dd69e43 100644 > --- a/gcc/config/i386/freebsd.h > +++ b/gcc/config/i386/freebsd.h > @@ -67,11 +67,8 @@ along with GCC; see the file COPYING3. If not see > > #undef STARTFILE_SPEC > #define STARTFILE_SPEC \ > - "%{!shared: \ > - %{pg:gcrt1.o%s} %{!pg:%{p:gcrt1.o%s} \ > -%{!p:%{profile:gcrt1.o%s} \ > - %{!profile:crt1.o%s \ > - crti.o%s %{!shared:crtbegin.o%s} %{shared:crtbeginS.o%s}" > + "%{!shared: %{pg|p|profile:gcrt1.o%s;pie:Scrt1.o%s;:crt1.o%s}} \ > + crti.o%s %{shared|pie:crtbeginS.o%s;:crtbegin.o%s}" > > /* Provide a ENDFILE_SPEC appropriate for FreeBSD. Here we tack on > the magical crtend.o file (see crtstuff.c) which provides part of > @@ -81,7 +78,7 @@ along with GCC; see the file COPYING3. If not see > > #undef ENDFILE_SPEC > #define ENDFILE_SPEC \ > - "%{!shared:crtend.o%s} %{shared:crtendS.o%s} crtn.o%s" > + "%{shared|pie:crtendS.o%s;:crtend.o%s} crtn.o%s" > > /* Provide a LINK_SPEC appropriate for FreeBSD. Here we provide > support for the special GCC options -static and -shared, which allow > us to
Re: RFA: Fix detecting of in-tree MPFR 3.1.0 sources
On Wed, May 9, 2012 at 3:26 PM, Richard Guenther wrote: > On Wed, May 9, 2012 at 2:52 PM, Nick Clifton wrote: >> Hi Guys, >> >> http://www.mpfr.org/mpfr-current/#changes >> >> The current release of the MPFR library (v3.1.0) has reorganized its >> sources such the mpfr.h header file is now in a sub-directory called >> 'src', rather than being at the top level. This has broken GCC's use >> of in-tree MPFR sources. >> >> I am asking for permission to apply the patch below to fix the >> problem. I tested it by building an i686-pc-linux-gnu toolchain on a >> machine with no MPFR libraries installed, but with a copy of the mpfr >> 3.1.0 sources installed in-tree. I also built a second toolchain with >> an in-tree copy of the mpfr 2.4.2 sources, just to make sure that the >> old paths still worked. Both builds worked. >> >> OK to apply ? > > I think we only support dropping in exactly the versions we provide > in infrastructure/ - which matches the version we require in install.texi. > > Or did that change? Btw, it would probably be better to make the drop-in compiles doing a staged install during build instead of using the build tree for use. >> Cheers >> Nick >> >> gcc/ChangeLog >> 2012-05-09 Nick Clifton >> >> * configure.ac (mpfr-dir): When using in-tree MPFR sources >> allow for the fact that from release v3.1.0 of MPFR the source >> files were moved into a src sub-directory. >> * configure: Regenerate. >> >> Index: configure.ac >> === >> --- configure.ac (revision 187320) >> +++ configure.ac (working copy) >> @@ -1289,9 +1289,16 @@ >> gmplibs="-L$with_mpfr_lib $gmplibs" >> fi >> if test "x$with_mpfr$with_mpfr_include$with_mpfr_lib" = x && test -d >> ${srcdir}/mpfr; then >> - gmplibs='-L$$r/$(HOST_SUBDIR)/mpfr/'"$lt_cv_objdir $gmplibs" >> - gmpinc='-I$$r/$(HOST_SUBDIR)/mpfr -I$$s/mpfr '"$gmpinc" >> - extra_mpc_mpfr_configure_flags='--with-mpfr-include=$$s/mpfr >> --with-mpfr-lib=$$r/$(HOST_SUBDIR)/mpfr/'"$lt_cv_objdir" >> + # MPFR v3.1.0 moved the sources into a src sub-directory. >> + if test -d ${srcdir}/mpfr/src; then >> + gmplibs='-L$$r/$(HOST_SUBDIR)/mpfr/src/'"$lt_cv_objdir $gmplibs" >> + gmpinc='-I$$r/$(HOST_SUBDIR)/mpfr/src -I$$s/mpfr/src '"$gmpinc" >> + extra_mpc_mpfr_configure_flags='--with-mpfr-include=$$s/mpfr/src >> --with-mpfr-lib=$$r/$(HOST_SUBDIR)/mpfr/src/'"$lt_cv_objdir" >> + else >> + gmplibs='-L$$r/$(HOST_SUBDIR)/mpfr/'"$lt_cv_objdir $gmplibs" >> + gmpinc='-I$$r/$(HOST_SUBDIR)/mpfr -I$$s/mpfr '"$gmpinc" >> + extra_mpc_mpfr_configure_flags='--with-mpfr-include=$$s/mpfr >> --with-mpfr-lib=$$r/$(HOST_SUBDIR)/mpfr/'"$lt_cv_objdir" >> + fi >> # Do not test the mpfr version. Assume that it is sufficient, since >> # it is in the source tree, and the library has not been built yet >> # but it would be included on the link line in the version check below
Re: PATCH: Update longlong.h from GLIBC
On Wed, May 9, 2012 at 3:20 PM, H.J. Lu wrote: > On Tue, May 8, 2012 at 4:05 AM, Andreas Jaeger wrote: >> On Tuesday, May 08, 2012 11:59:34 Richard Earnshaw wrote: >>> On 08/05/12 10:04, Andreas Jaeger wrote: >>> > On Tuesday, May 08, 2012 10:43:14 Richard Guenther wrote: >>> >> On Mon, May 7, 2012 at 11:11 PM, H.J. Lu wrote: >>> >>> Hi, >>> >>> >>> >>> I am preparing to update GLIBC longlong.h from GCC. This patch >>> >>> updates GCC longlong.h to use a URL instead of an FSF postal address >>> >>> and replace spaces with tab. OK to install? >>> >>> >>> >>> Since I'd like to simply copy longlong.h from GCC release branch to >>> >>> GLIBC, Is this also OK for 4.7 branch? >>> >> >>> >> Why? Does it fix anything there? >>> > >>> > It makes sharing the file between gcc and glibc easier, >>> > >>> > Andreas >>> >>> Why should glibc be depending on the GCC release branch? Sounds like >>> the tail wagging the dog. >> >> Ah, you discuss the release branch ;) Let HJ defend that one. >> >> >>> Changing this file has quite a high potential for introducing >>> regressions. I don't think we should risk that on the release branch. >> >> It's only whitespace IMO. I'm arguing for the trunk to take the change, >> > > I will take whatever I can get. > > Ian, is this OK for trunk? Yes. Thanks, Richard. > Thanks. > > > -- > H.J.
Re: PR 53249: Multiple address modes for same address space
On Tue, May 8, 2012 at 4:05 PM, Richard Henderson wrote: > On 05/06/2012 11:41 AM, Richard Sandiford wrote: >> >> PR middle-end/53249 >> * dwarf2out.h (get_address_mode): Move declaration to... >> * rtl.h: ...here. >> * dwarf2out.c (get_address_mode): Move definition to... >> * rtlanal.c: ...here. >> * var-tracking.c (get_address_mode): Delete. >> * combine.c (find_split_point): Use get_address_mode instead of >> targetm.addr_space.address_mode. >> * cselib.c (cselib_record_sets): Likewise. >> * dse.c (canon_address, record_store): Likewise. >> * emit-rtl.c (adjust_address_1, offset_address): Likewise. >> * expr.c (move_by_pieces, emit_block_move_via_loop, >> store_by_pieces) >> (store_by_pieces_1, expand_assignment, store_expr, >> store_constructor) >> (expand_expr_real_1): Likewise. >> * ifcvt.c (noce_try_cmove_arith): Likewise. >> * optabs.c (maybe_legitimize_operand_same_code): Likewise. >> * reload.c (find_reloads): Likewise. >> * sched-deps.c (sched_analyze_1, sched_analyze_2): Likewise. >> * sel-sched-dump.c (debug_mem_addr_value): Likewise. > > > ok. > > > r~ I checked in this testcase. Thanks. -- H.J. -- 2012-05-09 H.J. Lu PR middle-end/53249 * gcc.target/i386/pr53249.c: New. diff --git a/gcc/testsuite/gcc.target/i386/pr53249.c b/gcc/testsuite/gcc.target/i386/pr53249.c new file mode 100644 index 000..9eab8bc --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr53249.c @@ -0,0 +1,25 @@ +/* { dg-do compile { target { ! { ia32 } } } } */ +/* { dg-options "-O2 -mx32 -ftls-model=initial-exec -maddress-mode=short" } */ + +struct gomp_task +{ + struct gomp_task *parent; +}; + +struct gomp_thread +{ + int foo1; + struct gomp_task *task; +}; + +extern __thread struct gomp_thread gomp_tls_data; + +void +__attribute__ ((noinline)) +gomp_end_task (void) +{ + struct gomp_thread *thr = &gomp_tls_data; + struct gomp_task *task = thr->task; + + thr->task = task->parent; +}
Re: [patch] support for multiarch systems
Il 09/05/2012 02:38, Matthias Klose ha scritto: > Index: gcc/doc/invoke.texi > === > --- gcc/doc/invoke.texi (revision 187271) > +++ gcc/doc/invoke.texi (working copy) > @@ -6110,6 +6110,11 @@ > @file{../lib32}, or if OS libraries are present in @file{lib/@var{subdir}} > subdirectories it prints e.g.@: @file{amd64}, @file{sparcv9} or @file{ev6}. > > +@item -print-multiarch > +@opindex print-multiarch > +Print the path to OS libraries for the selected multiarch, > +relative to some @file{lib} subdirectory. > + > @item -print-prog-name=@var{program} > @opindex print-prog-name > Like @option{-print-file-name}, but searches for a program such as > @samp{cpp}. So -print-multiarch is like --print-multi-os-directory? What is the difference, and where is it docuemnted? Should it fail if multiarch support is not compiled in? Paolo
Re: RFA: Fix detecting of in-tree MPFR 3.1.0 sources
On Wed, 9 May 2012, Nick Clifton wrote: Hi Guys, http://www.mpfr.org/mpfr-current/#changes The current release of the MPFR library (v3.1.0) has reorganized its sources such the mpfr.h header file is now in a sub-directory called 'src', rather than being at the top level. This has broken GCC's use of in-tree MPFR sources. I am asking for permission to apply the patch below to fix the problem. I tested it by building an i686-pc-linux-gnu toolchain on a machine with no MPFR libraries installed, but with a copy of the mpfr 3.1.0 sources installed in-tree. I also built a second toolchain with an in-tree copy of the mpfr 2.4.2 sources, just to make sure that the old paths still worked. Both builds worked. Note that this is PR50461, aka PR51935, which includes a patch which was submitted: http://gcc.gnu.org/ml/gcc-patches/2012-03/msg01505.html (your patch is fine as well, from my non-reviewer point of view) On Wed, 9 May 2012, Richard Guenther wrote: I think we only support dropping in exactly the versions we provide in infrastructure/ - which matches the version we require in install.texi. Or did that change? The documentation just says 2.4.2 (or later). Even if it wasn't supported, why not take the patch? And it is a first step to changing the version in infrastructure, which will eventually have to happen. -- Marc Glisse
Re: RFA: Fix detecting of in-tree MPFR 3.1.0 sources
Il 09/05/2012 14:52, Nick Clifton ha scritto: > Hi Guys, > > http://www.mpfr.org/mpfr-current/#changes > > The current release of the MPFR library (v3.1.0) has reorganized its > sources such the mpfr.h header file is now in a sub-directory called > 'src', rather than being at the top level. This has broken GCC's use > of in-tree MPFR sources. > > I am asking for permission to apply the patch below to fix the > problem. I tested it by building an i686-pc-linux-gnu toolchain on a > machine with no MPFR libraries installed, but with a copy of the mpfr > 3.1.0 sources installed in-tree. I also built a second toolchain with > an in-tree copy of the mpfr 2.4.2 sources, just to make sure that the > old paths still worked. Both builds worked. > > OK to apply ? Ok. I think it's nicer for the users if we enable both builds to work. Paolo
Re: PR 40752: -Wconversion generates false warnings for operands not larger than target type
On 9 May 2012 15:21, Paolo Carlini wrote: > Good, I even have a patch essentially ready for that, but can you point me > to that detailed feedback? I couldn't find it when I looked for it. This is how I understood the following paragraph: > I still think the warnings for these cases are mostly correct (there > are > cases where you may be able to make deductions about the range of > possible > values of the expression being converted) and appropriate, and if > disabled > should be disabled under some separate -Wno-conversion-whatever option. That is, cases where you not are able to make a deduction about ranges => the warnings are appropriate and should be disabled only under a separate option. cases where you are able to deduce the ranges => warning is not appropriate But I may have misinterpreted it. Cheers, Manuel. > > Thanks > Paolo.
Re: RFA: Fix detecting of in-tree MPFR 3.1.0 sources
Il 09/05/2012 15:31, Richard Guenther ha scritto: > Btw, it would probably be better to make the drop-in compiles doing > a staged install during build instead of using the build tree for use. Same for GCC while building target library, but not really easy to do... GMP/MPFR are compiled without shared libraries which makes it simpler, but it would also be a bit ad hoc. Paolo
Re: PR 40752: -Wconversion generates false warnings for operands not larger than target type
On Wed, 9 May 2012, Manuel López-Ibáñez wrote: > That is, > > cases where you not are able to make a deduction about ranges => the > warnings are appropriate and should be disabled only under a separate > option. > > cases where you are able to deduce the ranges => warning is not appropriate Yes, that's correct. -- Joseph S. Myers jos...@codesourcery.com
Re: RFA: Fix detecting of in-tree MPFR 3.1.0 sources
On Wed, May 9, 2012 at 3:40 PM, Paolo Bonzini wrote: > Il 09/05/2012 15:31, Richard Guenther ha scritto: >> Btw, it would probably be better to make the drop-in compiles doing >> a staged install during build instead of using the build tree for use. > > Same for GCC while building target library, but not really easy to do... > > GMP/MPFR are compiled without shared libraries which makes it simpler, > but it would also be a bit ad hoc. Sure - but support for dropping in support libraries into the bootstrap tree is ad-hoc, too ... > Paolo
Re: PR 40752: -Wconversion generates false warnings for operands not larger than target type
On 05/09/2012 03:54 PM, Joseph S. Myers wrote: On Wed, 9 May 2012, Manuel López-Ibáñez wrote: That is, cases where you not are able to make a deduction about ranges => the warnings are appropriate and should be disabled only under a separate option. cases where you are able to deduce the ranges => warning is not appropriate Yes, that's correct. Ah, great, thanks. Then I should have a patchlet almost ready, simply tweaking the conditional expression case with get_unwidened (op1, 0), likewise op2, and recursion. Paolo.
Re: RFA: Fix detecting of in-tree MPFR 3.1.0 sources
Il 09/05/2012 15:57, Richard Guenther ha scritto: >>> >> Btw, it would probably be better to make the drop-in compiles doing >>> >> a staged install during build instead of using the build tree for use. >> > >> > Same for GCC while building target library, but not really easy to do... >> > >> > GMP/MPFR are compiled without shared libraries which makes it simpler, >> > but it would also be a bit ad hoc. > > Sure - but support for dropping in support libraries into the bootstrap tree > is ad-hoc, too ... Yes, but pretty much confined in configure.ac. What I'm wary of is adding ad-hoc stuff to the Makefile, which is already enough of a mess... Paolo
Symbol table 20/many: cleanup of cgraph_remove_unreachable_nodes
Hi, this patch reworks cgraph_remove_unreachable_nodes that has grown from somewhat hackish but simple mark&sweep routine into quite unmaintainable mess. It is because it needs to handle several special cases - extern inline functions, virtual functions, clones and inline clone tree reshaping and these features was not added in particularly nice way and their handling changed several times. Things bacome even more messy when varpool walking was added with ipa-ref infrastructure in 4.6 timeframe. This patch rewrites the function to be similar to what lto-cgraph does. I.e. compute reachable symbols and the boundary. I also made some fixes and made the function to release function bodies that are no longer neccesary more aggressively. This, for about 10th time, made me to run into problems with OMP lowering. OMP create declarations of the artificial functions early, but fills them with bodies only much later. It keeps the functions incomplette for a while, with STRUCT_FUNCTION initialized but left empty. This patch simply makes it to initialize those structures when they are needed. Bootstrapped/regtested x86_64-linux, will commit it shortly. Honza * cgraph.h (cgraph_remove_unreachable_nodes): Rename to ... (symtab_remove_unreachable_nodes): ... this one. * ipa-cp.c (ipcp_driver): Do not remove unreachable nodes. * omp-low.c (create_omp_child_function): Do not initialize DECL_INITIAL and STRUCT_FUNCTION. (remove_exit_barrier): Do not try to walk not created yet function bodies. (expand_omp_taskreg, lower_omp_taskreg): Initialize the STRUCT_FUNCTION and DECL_INITIAL here. * cgraphunit.c (ipa_passes): Update. * cgraphclones.c (cgraph_materialize_all_clones): Update. * ipa-inline.c (ipa_inline): Update. * ipa.c: Include ipa-inline.h (enqueue_cgraph_node, enqueue_varpool_node): Remove. (enqueue_node): New function. (process_references): Update. (symtab_remove_unreachable_nodes): Cleanup. * passes.c (execute_todo, execute_one_pass): Update. Index: cgraph.h === --- cgraph.h(revision 187335) +++ cgraph.h(working copy) @@ -637,7 +637,7 @@ int compute_call_stmt_bb_frequency (tree void record_references_in_initializer (tree, bool); /* In ipa.c */ -bool cgraph_remove_unreachable_nodes (bool, FILE *); +bool symtab_remove_unreachable_nodes (bool, FILE *); cgraph_node_set cgraph_node_set_new (void); cgraph_node_set_iterator cgraph_node_set_find (cgraph_node_set, struct cgraph_node *); Index: ipa-cp.c === --- ipa-cp.c(revision 187335) +++ ipa-cp.c(working copy) @@ -2445,7 +2445,6 @@ ipcp_driver (void) struct cgraph_2edge_hook_list *edge_duplication_hook_holder; struct topo_info topo; - cgraph_remove_unreachable_nodes (true,dump_file); ipa_check_create_node_params (); ipa_check_create_edge_args (); grow_next_edge_clone_vector (); Index: omp-low.c === --- omp-low.c (revision 187335) +++ omp-low.c (working copy) @@ -1572,7 +1572,6 @@ create_omp_child_function (omp_context * DECL_UNINLINABLE (decl) = 1; DECL_EXTERNAL (decl) = 0; DECL_CONTEXT (decl) = NULL_TREE; - DECL_INITIAL (decl) = make_node (BLOCK); t = build_decl (DECL_SOURCE_LOCATION (decl), RESULT_DECL, NULL_TREE, void_type_node); @@ -1605,13 +1604,6 @@ create_omp_child_function (omp_context * DECL_CHAIN (t) = DECL_ARGUMENTS (decl); DECL_ARGUMENTS (decl) = t; } - - /* Allocate memory for the function structure. The call to - allocate_struct_function clobbers CFUN, so we need to restore - it afterward. */ - push_struct_function (decl); - cfun->function_end_locus = gimple_location (ctx->stmt); - pop_cfun (); } @@ -3241,28 +3233,31 @@ remove_exit_barrier (struct omp_region * unsigned ix; any_addressable_vars = 0; - FOR_EACH_LOCAL_DECL (DECL_STRUCT_FUNCTION (child_fun), ix, decl) - if (TREE_ADDRESSABLE (decl)) - { - any_addressable_vars = 1; - break; - } - for (block = gimple_block (stmt); - !any_addressable_vars - && block - && TREE_CODE (block) == BLOCK; - block = BLOCK_SUPERCONTEXT (block)) + if (DECL_STRUCT_FUNCTION (child_fun)) { - for (local_decls = BLOCK_VARS (block); - local_decls; - local_decls = DECL_CHAIN (local_decls)) - if (TREE_ADDRESSABLE (local_decls)) + FOR_EACH_LOCAL_DECL (DECL_STRUCT_FUNCTION (child_fun), ix, decl) + i
[google] Hide all uses of __float128 from Clang (issue6195066)
Hide all uses of __float128 from Clang. Brackets _GLIBCXX_USE_FLOAT128 with #ifndef __clang__. Clang does not currently support the __float128 builtin, and so will fail to process libstdc++ headers that use it. Tested for full bootstrap and dejagnu testsuite. Okay for google/integration and google/gcc-4_7-integration branches? Thanks. 2012-05-09 Simon Baldwin * libstdc++-v3/acinclude.m4: Bracket _GLIBCXX_USE_FLOAT128 definition with ifndef __clang__. * libstdc++-v3/config.h.in: Rebuild. Index: libstdc++-v3/config.h.in === --- libstdc++-v3/config.h.in(revision 187148) +++ libstdc++-v3/config.h.in(working copy) @@ -799,8 +799,11 @@ this host. */ #undef _GLIBCXX_USE_DECIMAL_FLOAT -/* Define if __float128 is supported on this host. */ +/* Define if __float128 is supported on this host. + Hide all uses of __float128 from Clang. Google ref b/6422845 */ +#ifndef __clang__ #undef _GLIBCXX_USE_FLOAT128 +#endif /* Defined if gettimeofday is available. */ #undef _GLIBCXX_USE_GETTIMEOFDAY Index: libstdc++-v3/acinclude.m4 === --- libstdc++-v3/acinclude.m4 (revision 187148) +++ libstdc++-v3/acinclude.m4 (working copy) @@ -2529,10 +2529,16 @@ int main() } EOF +AH_VERBATIM([_GLIBCXX_USE_FLOAT128,], +[/* Define if __float128 is supported on this host. + Hide all uses of __float128 from Clang. Google ref b/6422845 */ +#ifndef __clang__ +#undef _GLIBCXX_USE_FLOAT128 +#endif]) + AC_MSG_CHECKING([for __float128]) if AC_TRY_EVAL(ac_compile); then - AC_DEFINE(_GLIBCXX_USE_FLOAT128, 1, - [Define if __float128 is supported on this host.]) + AC_DEFINE(_GLIBCXX_USE_FLOAT128, 1) enable_float128=yes else enable_float128=no -- This patch is available for review at http://codereview.appspot.com/6195066
Re: [google] Hide all uses of __float128 from Clang (issue6195066)
On 12-05-09 08:19 , Simon Baldwin wrote: 2012-05-09 Simon Baldwin * libstdc++-v3/acinclude.m4: Bracket _GLIBCXX_USE_FLOAT128 definition with ifndef __clang__. * libstdc++-v3/config.h.in: Rebuild. OK. Diego.
Re: [google] Hide all uses of __float128 from Clang (issue6195066)
On Wed, May 9, 2012 at 8:19 AM, Simon Baldwin wrote: > > Hide all uses of __float128 from Clang. > > Brackets _GLIBCXX_USE_FLOAT128 with #ifndef __clang__. Clang does not > currently support the __float128 builtin, and so will fail to process > libstdc++ headers that use it. > > Tested for full bootstrap and dejagnu testsuite. > > Okay for google/integration and google/gcc-4_7-integration branches? Please check this into google/gcc-4_7 as well. Ollie
[PATCH] gcc_update explicit use of "svn"
contrib/gcc_update currently uses "svn" explicitly to determine the Revision and Branch instead of the $GCC_SVN shell variable used in other instances in the script. This patch uses the shell variable in all instances. Thanks, David * gcc_update: Use $GCC_SVN to retrieve branch and revision. Index: gcc_update === --- gcc_update (revision 187329) +++ gcc_update (working copy) @@ -372,8 +372,8 @@ exit 1 fi - revision=`svn info | awk '/Revision:/ { print $2 }'` - branch=`svn info | sed -ne "/URL:/ { + revision=`$GCC_SVN info | awk '/Revision:/ { print $2 }'` + branch=`$GCC_SVN info | sed -ne "/URL:/ { s,.*/trunk,trunk, s,.*/branches/,, s,.*/tags/,,
Re: [patch] support for multiarch systems
On 09.05.2012 15:37, Paolo Bonzini wrote: > Il 09/05/2012 02:38, Matthias Klose ha scritto: >> Index: gcc/doc/invoke.texi >> === >> --- gcc/doc/invoke.texi (revision 187271) >> +++ gcc/doc/invoke.texi (working copy) >> @@ -6110,6 +6110,11 @@ >> @file{../lib32}, or if OS libraries are present in @file{lib/@var{subdir}} >> subdirectories it prints e.g.@: @file{amd64}, @file{sparcv9} or @file{ev6}. >> >> +@item -print-multiarch >> +@opindex print-multiarch >> +Print the path to OS libraries for the selected multiarch, >> +relative to some @file{lib} subdirectory. >> + >> @item -print-prog-name=@var{program} >> @opindex print-prog-name >> Like @option{-print-file-name}, but searches for a program such as >> @samp{cpp}. > > So -print-multiarch is like --print-multi-os-directory? the former prints the part before the `:' in the MULTILIB_OSDIRNAMES, the latter the part after the `':', e.g. ../lib32 and i386-linux-gnu. > What is the difference, and where is it documented? Not sure how it should be further documented. > Should it fail if multiarch support is not compiled in? All the -print options always succeed. I would prefer if it just prints the empty string if it is not configured (as it does now). Matthias
Re: [Patch,AVR,trunk,4.7]: Implement PR53256
2012/5/7 Georg-Johann Lay : > AVR-LibC switched from using either signal /or/ interrupt function > attribute to using both at the same time. > > This was never documented or implemented but worked accidentally for > some time, but results in wrong code for 4.7+ > > This patch adds better documentation of these attributes and makes > 'interrupt' silently override 'signal'. > > Besides that, some more sanity checking is done for function attributes. > > ASM_DECLARE_FUNCTION_NAME just served to check isr names. > All the checking is done in the new hook TARGET_SET_CURRENT_FUNCTION > now so that ASM_DECLARE_FUNCTION_NAME from defaults.h can be used, > thus some clean-up in elf.h > > Ok for trunk and 4.7? > > Johann > > PR target/53256 > * config/avr/elf.h (ASM_DECLARE_FUNCTION_NAME): Remove. > * config/avr/avr-protos.h (avr_asm_declare_function_name): Remove. > * config/avr/avr.h (struct machine_function): Add attributes_checked_p. > * config/avr/avr.c (avr_asm_declare_function_name): Remove. > (expand_prologue): Move initialization of cfun->machine->is_naked, > is_interrupt, is_signal, is_OS_task, is_OS_main from here to... > (avr_regs_to_save): Ditto. > (avr_set_current_function): ...this new static function. > (TARGET_SET_CURRENT_FUNCTION): New define. > (avr_function_ok_for_sibcall): Use cfun->machine->is_* instead of > checking attributes of current_function_decl. > (signal_function_p): Rename to avr_signal_function_p. > (interrupt_function_p): Rename to avr_interrupt_function_p. > > * doc/extend.texi (Function Attributes): Better explanation of > 'interrupt' and 'signal for AVR. Move 'ifunc' down for > alphabetical order. > Approved. Denis.
Re: Optimize calls to functions that return one of their arguments
On 04/28/2012 11:31 AM, Bernd Schmidt wrote: This patch allows us to recognize that even if the argument to memcpy lives across the call, we can allocate it to a call-used register by reusing the return value of the function. First, the patch sets the existing "fn spec" attribute for memcpy/memmove. This is translated to a new form of CALL_INSN_FUNCTION_USAGE, a (set (returnreg) (argreg)). This is recognized by IRA to adjust costs, and for communicating to caller-save that the register can be restored cheaply. The optimization only triggers if the argument is passed in a register, which should be the case in the majority of sane ABIs. The effect on the new testcase: pushq%rbx |subq$8, %rsp movslq%edx, %rdxmovslq%edx, %rdx movq%rdi, %rbx < callmemcpycallmemcpy movq%rbx, %rax |addq$8, %rsp popq%rbx < retret Bernd, sorry for some delay. I needed to think about the patch. It is pretty interesting and original idea. My only major objection to the patch is about treatment of ALLOCNO_CHEAP_CALLS_CROSSED_NUM. I think it should be accumulated as ALLOCNO_CALLS_CROSSED_NUM. Otherwise, I am afraid you will have a degradation in many cases instead of improvement. IRA is a regional allocator. The first it tries to do coloring in whole function (seeing a whole picture), then it tries to improve allocation in subregions. When you treat ALLOCNO_CHEAP_CALLS_CROSSED_NUM not accumulated (it means not taking subregions into account) you mislead allocation in the region containing subregions. For example, if the single call is in a subregion, ALLOCNO_CHEAP_CALLS_CROSSED_NUM for the subregion allocno will be 1 but in whole program allocno will be 0. RA in whole function will tend to allocate callee-saved hard register and RA in the subregion will tend to allocate caller-saved hard register. That will increase a possibility to create additional shuffle insns on the subregion borders and as consequence will degrade the code. I don't expect that this micro-optimization improves SPEC2000, but it will improve some tests. So it is good to have it. It would be really interesting to see the optimization impact on SPEC2000. I think I'll do it for myself in a week. So IRA part of the patch is ok for me if you modify treatment of ALLOCNO_CHEAP_CALLS_CROSSED_NUM as it is done for ALLOCNO_CALLS_CROSSED_NUM (when upper region allocnos accumulate the values from the corresponding allocnos from its sub-regions).
Re: [google] Hide all uses of __float128 from Clang (issue6195066)
Hi Simon, Hide all uses of __float128 from Clang. Brackets _GLIBCXX_USE_FLOAT128 with #ifndef __clang__. Clang does not currently support the __float128 builtin, and so will fail to process libstdc++ headers that use it. if one day clang gets support for this type, won't this still turn everything off? Is it possible to test the compiler on some small program using __float128, and turn off use of __float128 if the compiler barfs? Ciao, Duncan.
Re: [google] Hide all uses of __float128 from Clang (issue6195066)
On Wed, May 9, 2012 at 6:08 PM, Duncan Sands wrote: > Hi Simon, > >> Hide all uses of __float128 from Clang. >> >> Brackets _GLIBCXX_USE_FLOAT128 with #ifndef __clang__. Clang does not >> currently support the __float128 builtin, and so will fail to process >> libstdc++ headers that use it. > > > if one day clang gets support for this type, won't this still turn > everything > off? Is it possible to test the compiler on some small program using > __float128, and turn off use of __float128 if the compiler barfs? Does it matter? This is for Google-internal branches only. Ciao! Steven
Re: RFA: Fix detecting of in-tree MPFR 3.1.0 sources
Hi Guys, Ok. I think it's nicer for the users if we enable both builds to work. Thanks. I have applied the patch, along with a reference to PR 50461 and a credit to Paul Smith in the ChangeLog entry since he came up with basically the same patch as mine. Cheers Nick
Re: [patch] support for multiarch systems
Il 09/05/2012 17:34, Matthias Klose ha scritto: >> > So -print-multiarch is like --print-multi-os-directory? > the former prints the part before the `:' in the MULTILIB_OSDIRNAMES, the > latter > the part after the `':', e.g. ../lib32 and i386-linux-gnu. Yes, of course. >> > What is the difference, and where is it documented? > Not sure how it should be further documented. No idea, it is a new concept and people need to understand how it relates to multilibbing for example, what shortcomings are addressed etc. I went through the Debian pages (only cursorily, I admit) and I found nothing of this. Another question I have is related to usage of the option. Are you supposed to look for libraries in the multilib directories too if the compiler is multiarch-enabled? Or only in /lib/i386-linux-gnu? Which one takes priority, multiarch or multiosdir? >From the patch I can guess the intended search path is /lib/MULTIARCH:/lib/MULTIOSDIR, but I'm not entirely sure about that and it needs documentation. >> > Should it fail if multiarch support is not compiled in? > All the -print options always succeed. I would prefer if it just prints the > empty string if it is not configured (as it does now). Will the empty string be a valid output for a multiarch-enabled compiler? For example "gcc --print-multi-os-directory" and "gcc --print-multi-directory" on a bi-arch x86-64 compiler will never print the empty string. Again, I guess the answer is no but I'm not sure. If the answer is no, returning the empty string is fine. If the answer is yes, and assuming the search path is /lib/MULTIARCH:/lib/MULTIOSDIR, then programs need to know whether they need to omit the /lib/MULTIARCH component of the search path. A failure status code is then required. Paolo
Re: [PATCH] Add -feliminate-malloc to enable/disable elimination of redundant malloc/free pairs
On Wed, May 9, 2012 at 1:12 AM, Richard Guenther wrote: > On Tue, May 8, 2012 at 6:18 PM, Xinliang David Li wrote: >> To be clear, this flag is for malloc implementation (such as tcmalloc) >> with side effect unknown to the compiler. Using -fno-builtin-xxx is >> too conservative for that purpose. > > I don't think that flys. Btw, the patch also guards alloca - alloca is purely > GCC internal. > > What's the "unknown side-effects" that are also important to preserve > for free(malloc(4))? > The side effect is the user registered malloc hooks. The pattern 'free(malloc(4)', or loops like for (..) { malloc(..); // result not used } itself is not interesting. They only appear in the test code and the problem can be worked around by using -fno-builtin-malloc. The option is to avoid disable this and all similar malloc optimizations (in the future) for malloc implementation with hooks. Thanks, David > Richard. > >> David >> >> On Tue, May 8, 2012 at 7:43 AM, Dehao Chen wrote: >>> Hello, >>> >>> This patch adds a flag to guard the optimization that optimize the >>> following code away: >>> >>> free (malloc (4)); >>> >>> In some cases, we'd like this type of malloc/free pairs to remain in >>> the optimized code. >>> >>> Tested with bootstrap, and no regression in the gcc testsuite. >>> >>> Is it ok for mainline? >>> >>> Thanks, >>> Dehao >>> >>> gcc/ChangeLog >>> 2012-05-08 Dehao Chen >>> >>> * common.opt (feliminate-malloc): New. >>> * doc/invoke.texi: Document it. >>> * tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Honor it. >>> >>> gcc/testsuite/ChangeLog >>> 2012-05-08 Dehao Chen >>> >>> * gcc.dg/free-malloc.c: Check if -fno-eliminate-malloc is working >>> as expected. >>> >>> Index: gcc/doc/invoke.texi >>> === >>> --- gcc/doc/invoke.texi (revision 187277) >>> +++ gcc/doc/invoke.texi (working copy) >>> @@ -360,7 +360,8 @@ >>> -fcx-limited-range @gol >>> -fdata-sections -fdce -fdelayed-branch @gol >>> -fdelete-null-pointer-checks -fdevirtualize -fdse @gol >>> --fearly-inlining -fipa-sra -fexpensive-optimizations -ffat-lto-objects @gol >>> +-fearly-inlining -feliminate-malloc -fipa-sra -fexpensive-optimizations >>> @gol >>> +-ffat-lto-objects @gol >>> -ffast-math -ffinite-math-only -ffloat-store >>> -fexcess-precision=@var{style} @gol >>> -fforward-propagate -ffp-contract=@var{style} -ffunction-sections @gol >>> -fgcse -fgcse-after-reload -fgcse-las -fgcse-lm -fgraphite-identity @gol >>> @@ -6238,6 +6239,7 @@ >>> -fdefer-pop @gol >>> -fdelayed-branch @gol >>> -fdse @gol >>> +-feliminate-malloc @gol >>> -fguess-branch-probability @gol >>> -fif-conversion2 @gol >>> -fif-conversion @gol >>> @@ -6762,6 +6764,11 @@ >>> Perform dead store elimination (DSE) on RTL@. >>> Enabled by default at @option{-O} and higher. >>> >>> +@item -feliminate-malloc >>> +@opindex feliminate-malloc >>> +Eliminate unnecessary malloc/free pairs. >>> +Enabled by default at @option{-O} and higher. >>> + >>> @item -fif-conversion >>> @opindex fif-conversion >>> Attempt to transform conditional jumps into branch-less equivalents. This >>> Index: gcc/testsuite/gcc.dg/free-malloc.c >>> === >>> --- gcc/testsuite/gcc.dg/free-malloc.c (revision 0) >>> +++ gcc/testsuite/gcc.dg/free-malloc.c (revision 0) >>> @@ -0,0 +1,12 @@ >>> +/* { dg-do compile } */ >>> +/* { dg-options "-O2 -fno-eliminate-malloc" } */ >>> +/* { dg-final { scan-assembler-times "malloc" 2} } */ >>> +/* { dg-final { scan-assembler-times "free" 2} } */ >>> + >>> +extern void * malloc (unsigned long); >>> +extern void free (void *); >>> + >>> +void test () >>> +{ >>> + free (malloc (10)); >>> +} >>> Index: gcc/common.opt >>> === >>> --- gcc/common.opt (revision 187277) >>> +++ gcc/common.opt (working copy) >>> @@ -1474,6 +1474,10 @@ >>> Common Var(flag_dce) Init(1) Optimization >>> Use the RTL dead code elimination pass >>> >>> +feliminate-malloc >>> +Common Var(flag_eliminate_malloc) Init(1) Optimization >>> +Eliminate unnecessary malloc/free pairs >>> + >>> fdse >>> Common Var(flag_dse) Init(1) Optimization >>> Use the RTL dead store elimination pass >>> Index: gcc/tree-ssa-dce.c >>> === >>> --- gcc/tree-ssa-dce.c (revision 187277) >>> +++ gcc/tree-ssa-dce.c (working copy) >>> @@ -309,6 +309,8 @@ >>> case BUILT_IN_CALLOC: >>> case BUILT_IN_ALLOCA: >>> case BUILT_IN_ALLOCA_WITH_ALIGN: >>> + if (!flag_eliminate_malloc) >>> + mark_stmt_necessary (stmt, true); >>> return; >>> >>> default:;
Re: [PATCH] Add -feliminate-malloc to enable/disable elimination of redundant malloc/free pairs
On Wed, May 9, 2012 at 9:32 AM, Xinliang David Li wrote: > On Wed, May 9, 2012 at 1:12 AM, Richard Guenther > wrote: >> On Tue, May 8, 2012 at 6:18 PM, Xinliang David Li wrote: >>> To be clear, this flag is for malloc implementation (such as tcmalloc) >>> with side effect unknown to the compiler. Using -fno-builtin-xxx is >>> too conservative for that purpose. >> >> I don't think that flys. Btw, the patch also guards alloca - alloca is >> purely >> GCC internal. >> >> What's the "unknown side-effects" that are also important to preserve >> for free(malloc(4))? >> > > The side effect is the user registered malloc hooks. IIRC future versions of glibc will have those malloc hooks removed. Because of this problem. Thanks, Andrew Pinski > > The pattern 'free(malloc(4)', or loops like > > for (..) > { > malloc(..); // result not used > } > > itself is not interesting. They only appear in the test code and the > problem can be worked around by using -fno-builtin-malloc. The option > is to avoid disable this and all similar malloc optimizations (in the > future) for malloc implementation with hooks. > > Thanks, > > David > >> Richard. >> >>> David >>> >>> On Tue, May 8, 2012 at 7:43 AM, Dehao Chen wrote: Hello, This patch adds a flag to guard the optimization that optimize the following code away: free (malloc (4)); In some cases, we'd like this type of malloc/free pairs to remain in the optimized code. Tested with bootstrap, and no regression in the gcc testsuite. Is it ok for mainline? Thanks, Dehao gcc/ChangeLog 2012-05-08 Dehao Chen * common.opt (feliminate-malloc): New. * doc/invoke.texi: Document it. * tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Honor it. gcc/testsuite/ChangeLog 2012-05-08 Dehao Chen * gcc.dg/free-malloc.c: Check if -fno-eliminate-malloc is working as expected. Index: gcc/doc/invoke.texi === --- gcc/doc/invoke.texi (revision 187277) +++ gcc/doc/invoke.texi (working copy) @@ -360,7 +360,8 @@ -fcx-limited-range @gol -fdata-sections -fdce -fdelayed-branch @gol -fdelete-null-pointer-checks -fdevirtualize -fdse @gol --fearly-inlining -fipa-sra -fexpensive-optimizations -ffat-lto-objects @gol +-fearly-inlining -feliminate-malloc -fipa-sra -fexpensive-optimizations @gol +-ffat-lto-objects @gol -ffast-math -ffinite-math-only -ffloat-store -fexcess-precision=@var{style} @gol -fforward-propagate -ffp-contract=@var{style} -ffunction-sections @gol -fgcse -fgcse-after-reload -fgcse-las -fgcse-lm -fgraphite-identity @gol @@ -6238,6 +6239,7 @@ -fdefer-pop @gol -fdelayed-branch @gol -fdse @gol +-feliminate-malloc @gol -fguess-branch-probability @gol -fif-conversion2 @gol -fif-conversion @gol @@ -6762,6 +6764,11 @@ Perform dead store elimination (DSE) on RTL@. Enabled by default at @option{-O} and higher. +@item -feliminate-malloc +@opindex feliminate-malloc +Eliminate unnecessary malloc/free pairs. +Enabled by default at @option{-O} and higher. + @item -fif-conversion @opindex fif-conversion Attempt to transform conditional jumps into branch-less equivalents. This Index: gcc/testsuite/gcc.dg/free-malloc.c === --- gcc/testsuite/gcc.dg/free-malloc.c (revision 0) +++ gcc/testsuite/gcc.dg/free-malloc.c (revision 0) @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fno-eliminate-malloc" } */ +/* { dg-final { scan-assembler-times "malloc" 2} } */ +/* { dg-final { scan-assembler-times "free" 2} } */ + +extern void * malloc (unsigned long); +extern void free (void *); + +void test () +{ + free (malloc (10)); +} Index: gcc/common.opt === --- gcc/common.opt (revision 187277) +++ gcc/common.opt (working copy) @@ -1474,6 +1474,10 @@ Common Var(flag_dce) Init(1) Optimization Use the RTL dead code elimination pass +feliminate-malloc +Common Var(flag_eliminate_malloc) Init(1) Optimization +Eliminate unnecessary malloc/free pairs + fdse Common Var(flag_dse) Init(1) Optimization Use the RTL dead store elimination pass Index: gcc/tree-ssa-dce.c === --- gcc/tree-ssa-dce.c (revision 187277) +++ gcc/tree-ssa-dce.c (working copy) @@ -309,6 +309,8 @@ case BUILT_IN_CALLOC: case BUILT_IN_ALLOCA: case BUILT_
Re: Fix gcc.dg/lower-subreg-1.c failure
> From: Richard Sandiford > Date: Wed, 9 May 2012 11:14:49 +0200 > Hans-Peter Nilsson writes: > >> From: Richard Sandiford > >> Date: Tue, 1 May 2012 16:46:38 +0200 > > > >> To repeat: as things stand, very few targets define proper rtx costs > >> for SET. > > > > IMHO it's wrong to start blaming targets when rtx_cost doesn't > > take the mode in account in the first place, for the default > > cost. (Well, except for the modes-tieable subreg special-case.) > > The targets where an operation in N * word_mode costs no more > > than one in word_mode, if there even is one, is a minority, > > let's adjust the defaults to that. > > I'll pass on approving or disapproving this patch, but for the record: > a factor of word_mode seems a bit too simplistic. I'd say it's the right level: simplistic enough for the default, not to mention now linear, without being plainly ignorant as now. > It's OK for moves > and logic ops, but addition of multiword modes is a bit more complicated. How about (same factor) factor*COSTS_N_INSNS (1)*3/2 to account for carry? Or is 2*factor a better default? Further improvements are welcome, but I see the patch as a strict improvement and I hope it will not be shot down by requests for perfection - at least not without detailing said perfection. > Multiplication and division by multiword modes is even more > so, of course. Suggestions are welcome, but in the absence of that, I'd say any factor larger than one is is a good start, like in the patch. > > I think there should be a gating check whether the target > > implements that kind of shift in that mode at all, before > > checking the cost. Not sure whether it's generally best to put > > that test here, or to make the rtx_cost function return the cost > > of a libcall for that mode when that happens. Similar for the > > other insns. > > This has come up a few times in past discussions about rtx_cost > (as I'm sure you remember :-)). On the one hand, I agree it might > be nice to shield the backend from invalid insns. That would > effectively mean calling expand on each insn though, which would be > rather expensive. No, nothing that complicated. I'm thinking of just basically checking that there's an operation in that mode, like: if (direct_optab_handler (code_to_optab [GET_CODE (x)], mode) == CODE_FOR_nothing) { ... return tabled default cost; for libcall or open-code ... } Restricting the validity-gating to checking that the mode is valid for the operation wouldn't interfere with fancy pipeline speculative use. > So I think this patch is using rtx_cost according to its current > interface. The "interface" use previously ignored the mode for most uses (QED), so that's not completely correct. ;) > If someone wants to change or restrict that interface, > than that's a separate change IMO. And it should be done consistently > rather than in this one place. > > In this case it doesn't matter anyway. If we never see a shift > in mode M by amount X, we'll never need to make a decision about > whether to split it. If it's never used, then I don't mind it being wrong if that simplifies the computation. :) > > Isn't the below better than doing virtually the same in each > > target's rtx_costs? > > FWIW, MIPS, SH and x86 all used something slightly different (and more > complicated). I imagine PowerPC and SPARC will too. So "each" seems > a bit strong. > > That's not an objection to the patch though. I realise some ports do > have very regular architectures where every register is the same width > and has the same move cost. Hence the default should follow a very regular model... brgds, H-P
Re: RFA: PR target/53120, constraint modifier "+" on operand tied by matching-constraint, "0".
>OK to apply ? Ok. Thanks!
Re: [patch] support for multiarch systems
On 09.05.2012 18:29, Paolo Bonzini wrote: > Il 09/05/2012 17:34, Matthias Klose ha scritto: So -print-multiarch is like --print-multi-os-directory? >> the former prints the part before the `:' in the MULTILIB_OSDIRNAMES, the >> latter >> the part after the `':', e.g. ../lib32 and i386-linux-gnu. > > Yes, of course. > What is the difference, and where is it documented? >> Not sure how it should be further documented. > > No idea, it is a new concept and people need to understand how it > relates to multilibbing for example, what shortcomings are addressed etc. > > I went through the Debian pages (only cursorily, I admit) and I found > nothing of this. these are referenced from the http://wiki.debian.org/Multiarch/Tuples https://wiki.ubuntu.com/MultiarchSpec#Filesystem_layout http://err.no/debian/amd64-multiarch-3 http://wiki.debian.org/Multiarch/TheCaseForMultiarch describes use cases for multiarch, and why Debian thinks that the existing approaches are not sufficient (having name collisions for different architectures or ad hoc names for new architectures like libx32). That may be contentious within the Linux community, but I would like to avoid this kind of discussion here. > Another question I have is related to usage of the option. Are you > supposed to look for libraries in the multilib directories too if the > compiler is multiarch-enabled? Debian/Ubuntu always use /lib for the default MULTIOSDIR, and this needs to be looked up in the future too. The use of locations like ../lib32 will be faded out over time, but I don't see any harm not to have looked up as well. > Or only in /lib/i386-linux-gnu? Which > one takes priority, multiarch or multiosdir? > > From the patch I can guess the intended search path is > /lib/MULTIARCH:/lib/MULTIOSDIR, but I'm not entirely sure about that and > it needs documentation. yes, this is the intended search path. I assume such kind of documentation shouldn't go into invoke.texi. Should it fail if multiarch support is not compiled in? >> All the -print options always succeed. I would prefer if it just prints the >> empty string if it is not configured (as it does now). > > Will the empty string be a valid output for a multiarch-enabled > compiler? For example "gcc --print-multi-os-directory" and "gcc > --print-multi-directory" on a bi-arch x86-64 compiler will never print > the empty string. Again, I guess the answer is no but I'm not sure. > > If the answer is no, returning the empty string is fine. The answer is no. If multiarch is enabled, then the string is always non-empty and should return a multiarch tuple as defined in http://wiki.debian.org/Multiarch/Tuples. Anything else should be considered an error. > If the answer is yes, and assuming the search path is > /lib/MULTIARCH:/lib/MULTIOSDIR, then programs need to know whether they > need to omit the /lib/MULTIARCH component of the search path. Unrelated, but why would programs hard code this path and not use something like this? gcc -v -E - &1 | grep ^LIBRARY_PATH Matthias
[doc] Fix xref in extend.texi
It is strange that a paragraph talking about GCC pragmas refers to the section in CPP talking about #indent and other such directives (no #pragma there). Wouldn't it better if it pointed to the section about #pragmas? OK? 2012-05-09 Manuel López-Ibáñez * doc/extend.texi (Function Attributes): Point xref to section about Pragmas. Index: doc/extend.texi === --- doc/extend.texi (revision 187299) +++ doc/extend.texi (working copy) @@ -4102,8 +4102,7 @@ found convenient to use @code{__attribute__} to achieve a natural attachment of attributes to their corresponding declarations, whereas @code{#pragma GCC} is of use for constructs that do not naturally form -part of the grammar. @xref{Other Directives,,Miscellaneous -Preprocessing Directives, cpp, The GNU C Preprocessor}. +part of the grammar. @xref{Pragmas,,Pragmas Accepted by GCC}. @node Attribute Syntax @section Attribute Syntax
[PATCH, i386]: Fix PR 44141, Redundant loads and stores generated for AMD bdver1 target
Hello! Attached patch avoids a deficiency in reload, where reload gives up on handling subregs of pseudos (please see the PR [1] for explanation by Ulrich). The patch simply avoids generating V4SF moves with V4SF subregs of V2DF values unless really necessary (i.e. moving SSE2 modes without SSE2 enabled, which shouldn't happen anyway). With patched gcc, expand pass emits (unaligned) moves in their original mode, and this mode is kept until asm is generated. The asm instruction is chosen according to the mode of insn pattern, and the mode is calculated using various influencing conditions. 2012-05-09 Uros Bizjak PR target/44141 * config/i386/i386.c (ix86_expand_vector_move_misalign): Do not handle 128 bit vectors specially for TARGET_AVX. Emit sse2_movupd and sse_movupd RTXes for TARGET_AVX, TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL or when optimizing for size. * config/i386/sse.md (*mov_internal): Remove TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL handling from asm output code. Calculate "mode" attribute according to optimize_function_for_size_p and TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL flag. (*_movu): Choose asm template depending on the mode of the instruction. Calculate "mode" attribute according to optimize_function_for_size_p, TARGET_SSE_TYPELESS_STORES and TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL flags. (*_movdqu): Ditto. Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32}. The patch also fixes the testcase from the PR. Patch will be committed to mainline SVN. [1] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44141#c16 Uros. Index: config/i386/sse.md === --- config/i386/sse.md (revision 187286) +++ config/i386/sse.md (working copy) @@ -449,8 +449,6 @@ && (misaligned_operand (operands[0], mode) || misaligned_operand (operands[1], mode))) return "vmovupd\t{%1, %0|%0, %1}"; - else if (TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL) - return "%vmovaps\t{%1, %0|%0, %1}"; else return "%vmovapd\t{%1, %0|%0, %1}"; @@ -460,8 +458,6 @@ && (misaligned_operand (operands[0], mode) || misaligned_operand (operands[1], mode))) return "vmovdqu\t{%1, %0|%0, %1}"; - else if (TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL) - return "%vmovaps\t{%1, %0|%0, %1}"; else return "%vmovdqa\t{%1, %0|%0, %1}"; @@ -475,19 +471,21 @@ [(set_attr "type" "sselog1,ssemov,ssemov") (set_attr "prefix" "maybe_vex") (set (attr "mode") - (cond [(match_test "TARGET_AVX") + (cond [(and (eq_attr "alternative" "1,2") + (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL")) +(if_then_else + (match_test "GET_MODE_SIZE (mode) > 16") + (const_string "V8SF") + (const_string "V4SF")) + (match_test "TARGET_AVX") (const_string "") - (ior (ior (match_test "optimize_function_for_size_p (cfun)") -(not (match_test "TARGET_SSE2"))) + (ior (and (eq_attr "alternative" "1,2") +(match_test "optimize_function_for_size_p (cfun)")) (and (eq_attr "alternative" "2") (match_test "TARGET_SSE_TYPELESS_STORES"))) (const_string "V4SF") - (eq (const_string "mode") (const_string "V4SFmode")) -(const_string "V4SF") - (eq (const_string "mode") (const_string "V2DFmode")) -(const_string "V2DF") ] - (const_string "TI")))]) + (const_string "")))]) (define_insn "sse2_movq128" [(set (match_operand:V2DI 0 "register_operand" "=x") @@ -597,11 +595,33 @@ [(match_operand:VF 1 "nonimmediate_operand" "xm,x")] UNSPEC_MOVU))] "TARGET_SSE && !(MEM_P (operands[0]) && MEM_P (operands[1]))" - "%vmovu\t{%1, %0|%0, %1}" +{ + switch (get_attr_mode (insn)) +{ +case MODE_V8SF: +case MODE_V4SF: + return "%vmovups\t{%1, %0|%0, %1}"; +default: + return "%vmovu\t{%1, %0|%0, %1}"; +} +} [(set_attr "type" "ssemov") (set_attr "movu" "1") (set_attr "prefix" "maybe_vex") - (set_attr "mode" "")]) + (set (attr "mode") + (cond [(match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL") +(if_then_else + (match_test "GET_MODE_SIZE (mode) > 16") + (const_string "V8SF") + (const_string "V4SF")) + (match_test "TARGET_AVX") +(const_string "") + (ior (match_test "optimize_function_for_size_p (cfun)") + (and (eq_attr "alternative" "1") +(match_test "TARGET_SSE_TYPELESS_STORES"))) +
Re: [doc] Fix xref in extend.texi
On Wed, May 9, 2012 at 12:50 PM, Manuel López-Ibáñez wrote: > It is strange that a paragraph talking about GCC pragmas refers to the > section in CPP talking about #indent and other such directives (no > #pragma there). Wouldn't it better if it pointed to the section about > #pragmas? yes. patch ok. > > OK? > > 2012-05-09 Manuel López-Ibáñez > > * doc/extend.texi (Function Attributes): Point xref to section > about Pragmas. > > > Index: doc/extend.texi > === > --- doc/extend.texi (revision 187299) > +++ doc/extend.texi (working copy) > @@ -4102,8 +4102,7 @@ > found convenient to use @code{__attribute__} to achieve a natural > attachment of attributes to their corresponding declarations, whereas > @code{#pragma GCC} is of use for constructs that do not naturally form > -part of the grammar. @xref{Other Directives,,Miscellaneous > -Preprocessing Directives, cpp, The GNU C Preprocessor}. > +part of the grammar. @xref{Pragmas,,Pragmas Accepted by GCC}. > > @node Attribute Syntax > @section Attribute Syntax
Re: [PATCH] Add -feliminate-malloc to enable/disable elimination of redundant malloc/free pairs
On Wed, May 9, 2012 at 6:32 PM, Xinliang David Li wrote: > On Wed, May 9, 2012 at 1:12 AM, Richard Guenther > wrote: >> On Tue, May 8, 2012 at 6:18 PM, Xinliang David Li wrote: >>> To be clear, this flag is for malloc implementation (such as tcmalloc) >>> with side effect unknown to the compiler. Using -fno-builtin-xxx is >>> too conservative for that purpose. >> >> I don't think that flys. Btw, the patch also guards alloca - alloca is >> purely >> GCC internal. >> >> What's the "unknown side-effects" that are also important to preserve >> for free(malloc(4))? >> > > The side effect is the user registered malloc hooks. > > The pattern 'free(malloc(4)', or loops like > > for (..) > { > malloc(..); // result not used > } > > itself is not interesting. They only appear in the test code and the > problem can be worked around by using -fno-builtin-malloc. The option > is to avoid disable this and all similar malloc optimizations (in the > future) for malloc implementation with hooks. What is then interesting? The above are all the same as if optimization figured out the storage is not used, no? Richard. > Thanks, > > David > >> Richard. >> >>> David >>> >>> On Tue, May 8, 2012 at 7:43 AM, Dehao Chen wrote: Hello, This patch adds a flag to guard the optimization that optimize the following code away: free (malloc (4)); In some cases, we'd like this type of malloc/free pairs to remain in the optimized code. Tested with bootstrap, and no regression in the gcc testsuite. Is it ok for mainline? Thanks, Dehao gcc/ChangeLog 2012-05-08 Dehao Chen * common.opt (feliminate-malloc): New. * doc/invoke.texi: Document it. * tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Honor it. gcc/testsuite/ChangeLog 2012-05-08 Dehao Chen * gcc.dg/free-malloc.c: Check if -fno-eliminate-malloc is working as expected. Index: gcc/doc/invoke.texi === --- gcc/doc/invoke.texi (revision 187277) +++ gcc/doc/invoke.texi (working copy) @@ -360,7 +360,8 @@ -fcx-limited-range @gol -fdata-sections -fdce -fdelayed-branch @gol -fdelete-null-pointer-checks -fdevirtualize -fdse @gol --fearly-inlining -fipa-sra -fexpensive-optimizations -ffat-lto-objects @gol +-fearly-inlining -feliminate-malloc -fipa-sra -fexpensive-optimizations @gol +-ffat-lto-objects @gol -ffast-math -ffinite-math-only -ffloat-store -fexcess-precision=@var{style} @gol -fforward-propagate -ffp-contract=@var{style} -ffunction-sections @gol -fgcse -fgcse-after-reload -fgcse-las -fgcse-lm -fgraphite-identity @gol @@ -6238,6 +6239,7 @@ -fdefer-pop @gol -fdelayed-branch @gol -fdse @gol +-feliminate-malloc @gol -fguess-branch-probability @gol -fif-conversion2 @gol -fif-conversion @gol @@ -6762,6 +6764,11 @@ Perform dead store elimination (DSE) on RTL@. Enabled by default at @option{-O} and higher. +@item -feliminate-malloc +@opindex feliminate-malloc +Eliminate unnecessary malloc/free pairs. +Enabled by default at @option{-O} and higher. + @item -fif-conversion @opindex fif-conversion Attempt to transform conditional jumps into branch-less equivalents. This Index: gcc/testsuite/gcc.dg/free-malloc.c === --- gcc/testsuite/gcc.dg/free-malloc.c (revision 0) +++ gcc/testsuite/gcc.dg/free-malloc.c (revision 0) @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fno-eliminate-malloc" } */ +/* { dg-final { scan-assembler-times "malloc" 2} } */ +/* { dg-final { scan-assembler-times "free" 2} } */ + +extern void * malloc (unsigned long); +extern void free (void *); + +void test () +{ + free (malloc (10)); +} Index: gcc/common.opt === --- gcc/common.opt (revision 187277) +++ gcc/common.opt (working copy) @@ -1474,6 +1474,10 @@ Common Var(flag_dce) Init(1) Optimization Use the RTL dead code elimination pass +feliminate-malloc +Common Var(flag_eliminate_malloc) Init(1) Optimization +Eliminate unnecessary malloc/free pairs + fdse Common Var(flag_dse) Init(1) Optimization Use the RTL dead store elimination pass Index: gcc/tree-ssa-dce.c === --- gcc/tree-ssa-dce.c (revision 187277) +++ gcc/tree-ssa-dce.c (working copy) @@ -309,6 +309,8 @@ case BUILT_IN_CALLOC: case BUILT_IN_ALLOCA: case BU
[PATCH, i386]: Some further TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL cleanups
Hello! Practically no functional change. 2012-05-09 Uros Bizjak * config/i386/i386.c (*movdf_internal_rex64): Remove TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL handling from asm output code. Calculate "mode" attribute according to TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL flag. (*movdf_internal): Ditto. Tested on x86_64-pc-linux-gnu {,-m32}, committed to mainline SVN. Uros. Index: i386.md === --- i386.md (revision 187347) +++ i386.md (working copy) @@ -2953,8 +2953,7 @@ switch (get_attr_mode (insn)) { case MODE_V2DF: - if (!TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL) - return "%vmovapd\t{%1, %0|%0, %1}"; + return "%vmovapd\t{%1, %0|%0, %1}"; case MODE_V4SF: return "%vmovaps\t{%1, %0|%0, %1}"; @@ -3032,7 +3031,8 @@ movaps encodes one byte shorter. */ (eq_attr "alternative" "8") (cond - [(match_test "optimize_function_for_size_p (cfun)") + [(ior (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL") +(match_test "optimize_function_for_size_p (cfun)")) (const_string "V4SF") (match_test "TARGET_SSE_PARTIAL_REG_DEPENDENCY") (const_string "V2DF") @@ -3094,8 +3094,7 @@ switch (get_attr_mode (insn)) { case MODE_V2DF: - if (!TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL) - return "%vmovapd\t{%1, %0|%0, %1}"; + return "%vmovapd\t{%1, %0|%0, %1}"; case MODE_V4SF: return "%vmovaps\t{%1, %0|%0, %1}"; @@ -3167,7 +3166,8 @@ movaps encodes one byte shorter. */ (eq_attr "alternative" "6,10") (cond - [(match_test "optimize_function_for_size_p (cfun)") + [(ior (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL") +(match_test "optimize_function_for_size_p (cfun)")) (const_string "V4SF") (match_test "TARGET_SSE_PARTIAL_REG_DEPENDENCY") (const_string "V2DF")
[C++ Patch] PR 53158 (EXPR_LOC_OR_HERE version)
Hi again, thus the below is what I booted and tested. As you can see I had to tweak a bit an existing testcase, for which the location is now a bit "smaller", 3 chars, in fact is still not perfect, but arguably a tad better. Otherwise no issues. Ah, I'm touching also a c-common.c function, in fact all the other warning* functions in the same file already use the EXPR_LOC_OR_HERE on the expr argument idea. Tested x86_64-linux. Thanks, Paolo. // /cp 2012-05-09 Paolo Carlini PR c++/53158 * cvt.c (ocp_convert): Error out early for void -> bool conversions. * typeck.c (decay_conversion): Use error_at. * call.c (build_integral_nontype_arg_conv, convert_like_real, convert_arg_to_ellipsis, perform_implicit_conversion_flags, initialize_reference): Likewise. * cvt.c (warn_ref_binding): Add location_t parameter. (cp_convert_to_pointer, convert_to_reference, ocp_convert, convert_to_void, ): Use error_at and warning_at. /c-family 2012-05-09 Paolo Carlini PR c++/53158 * c-common.c (warnings_for_convert_and_check): Use warning_at. /testsuite 2012-05-09 Paolo Carlini PR c++/53158 * g++.dg/cpp0x/lambda/lambda-err2.C: New. * g++.dg/parse/error26.C: Tweak dg-error column number. Index: c-family/c-common.c === --- c-family/c-common.c (revision 187335) +++ c-family/c-common.c (working copy) @@ -2329,6 +2329,8 @@ conversion_warning (tree type, tree expr) void warnings_for_convert_and_check (tree type, tree expr, tree result) { + location_t loc = EXPR_LOC_OR_HERE (expr); + if (TREE_CODE (expr) == INTEGER_CST && (TREE_CODE (type) == INTEGER_TYPE || TREE_CODE (type) == ENUMERAL_TYPE) @@ -2344,8 +2346,8 @@ warnings_for_convert_and_check (tree type, tree ex /* This detects cases like converting -129 or 256 to unsigned char. */ if (!int_fits_type_p (expr, c_common_signed_type (type))) -warning (OPT_Woverflow, - "large integer implicitly truncated to unsigned type"); +warning_at (loc, OPT_Woverflow, + "large integer implicitly truncated to unsigned type"); else conversion_warning (type, expr); } @@ -2357,16 +2359,16 @@ warnings_for_convert_and_check (tree type, tree ex && (TREE_CODE (TREE_TYPE (expr)) != INTEGER_TYPE || TYPE_PRECISION (TREE_TYPE (expr)) != TYPE_PRECISION (type))) - warning (OPT_Woverflow, -"overflow in implicit constant conversion"); + warning_at (loc, OPT_Woverflow, + "overflow in implicit constant conversion"); else conversion_warning (type, expr); } else if ((TREE_CODE (result) == INTEGER_CST || TREE_CODE (result) == FIXED_CST) && TREE_OVERFLOW (result)) -warning (OPT_Woverflow, - "overflow in implicit constant conversion"); +warning_at (loc, OPT_Woverflow, + "overflow in implicit constant conversion"); else conversion_warning (type, expr); } Index: testsuite/g++.dg/parse/error26.C === --- testsuite/g++.dg/parse/error26.C(revision 187335) +++ testsuite/g++.dg/parse/error26.C(working copy) @@ -4,7 +4,7 @@ void foo() { if (({int c[2];})) ; // { dg-error "7:ISO C.. forbids" "7" } - // { dg-error "20:could not convert" "20" { target *-*-* } 6 } + // { dg-error "17:could not convert" "17" { target *-*-* } 6 } } void bar() Index: testsuite/g++.dg/cpp0x/lambda/lambda-err2.C === --- testsuite/g++.dg/cpp0x/lambda/lambda-err2.C (revision 0) +++ testsuite/g++.dg/cpp0x/lambda/lambda-err2.C (revision 0) @@ -0,0 +1,12 @@ +// PR c++/53158 +// { dg-do compile { target c++11 } } + +int main() +{ + auto a = []() { return true; }; + auto b = []() { return a(); }; // { dg-error "'a' is not captured" } + int c, d; + while (b() && c < d) // { dg-error "could not convert" } +{ +} +} Index: cp/typeck.c === --- cp/typeck.c (revision 187335) +++ cp/typeck.c (working copy) @@ -1822,6 +1822,7 @@ decay_conversion (tree exp, tsubst_flags_t complai { tree type; enum tree_code code; + location_t loc = EXPR_LOC_OR_HERE (exp); type = TREE_TYPE (exp); if (type == error_mark_node) @@ -1853,7 +1854,7 @@ decay_conversion (tree exp, tsubst_flags_t complai if (code == VOID_TYPE) { if (complain & tf_error) - error ("void value not ignored as it ought to be"); + error_at (loc, "void value not ignored as it ought to be"); return error_mark_node; } if (invalid_nonstatic_memfn_p (exp, complain)) @@ -1882,7 +1883,7 @@ decay_conversion (
Re: Symbol table 19/many: cleanup varpool/front-end/varasm interactions
Jan, This patch is causing a bootstrap failure on AIX. It generates linker errors that TOC symbols are not emitted: ld: 0711-317 ERROR: Undefined symbol: LC..0 ld: 0711-317 ERROR: Undefined symbol: LC..1 ld: 0711-317 ERROR: Undefined symbol: LC..2 ld: 0711-317 ERROR: Undefined symbol: LC..3 ld: 0711-317 ERROR: Undefined symbol: LC..4 Although pLinux uses the same code, GCC is able to bootstrap on Linux. The function in question is gcc/config/rs6000/rs6000.c:output_toc() Do you have any suggestions on what is missing in the label allocation that causes your cgraph change to believe the symbols are unreferenced? I have opened PR bootstrap/53300. Thanks, David
Re: [PATCH] Optimize byte_from_pos, pos_from_bit
> This optimizes byte_from_pos and pos_from_bit by noting that we operate > on sizes whose computations have no intermediate (or final) overflow. > This is the single patch necessary to get Ada to bootstrap and test > with TYPE_IS_SIZETYPE removed. Rather than amending size_binop > (my original plan) I chose to optimize the above two commonly used > accessors. > > Conveniently normalize_offset can be re-written to use pos_from_bit > instead of inlinig it. I also took the liberty to document the > functions (sic). Nice, thanks. Could you add a blurb, in the head comment of the first function in which you operate under the no-overflow assumption, stating this fact and why this is necessary (an explicit mention of Ada isn't forbidden ;-), as well as a cross-reference to it in the head comment of the other function(s). > Any comments? Would you like different factoring of the optimization > (I considered adding a byte_from_bitpos)? Any idea why > byte_from_pos is using TRUNC_DIV_EXPR (only positive offsets?) and > pos_from_bit FLOOR_DIV_EXPR (also negative offsets?) - that seems > inconsistent, and we fold FLOOR_DIV_EXPR of unsigned types (sizetype) > to TRUNC_DIV_EXPR anyways. I think that a bit position is treated as non-negative, so bitpos can use TRUNC_DIV_EXPR in byte_from_pos and you need to use FLOOR_MOD_EXPR to get a non-negative *pbitpos in pos_from_bit. Very likely obsolete now indeed. -- Eric Botcazou
Re: [PATCH] Remove TYPE_IS_SIZETYPE
> This removes the TYPE_IS_SIZETYPE macro and all its uses (by > assuming it returns zero and applying trivial folding). Sizes > and bitsizes can still be treat specially by means of knowing > what the values represent and by means of using helper functions > that assume you are dealing with "sizes" (in particular size_binop > and friends and bit_from_pos, byte_from_pos or pos_from_bit). Fine with me, if you add the blurb I talked about in the other reply. > Bootstrapped and tested on x86_64-unknown-linux-gnu for all languages > including Ada with the patch optimizing bute_from_pos and pos_from_bit Results on our internal testsuite are clean on x86-64 and almost clean on x86, an exception being: package t is type x (m : natural) is record s : string (1 .. m); r : natural; b : boolean; end record; for x'alignment use 4; pragma Pack (x); end t; Without the patches, compiling the package with -gnatR3 yields: Representation information for unit t (spec) for x'Object_Size use 17179869248; for x'Value_Size use ((#1 + 8) * 8) ; for x'Alignment use 4; for x use record m at 0 range 0 .. 30; s at 4 range 0 .. ((#1 * 8)) - 1; r at bit offset (((#1 + 4) * 8)) size in bits = 31 b at bit offset #1 + 7) * 8) + 7)) size in bits = 1 end record; With the patches, this yields: Representation information for unit t (spec) for x'Object_Size use 17179869248; for x'Value_Size use (((#1 + 7) + 1) * 8) ; for x'Alignment use 4; for x use record m at 0 range 0 .. 30; s at 4 range 0 .. ((#1 * 8)) - 1; r at bit offset (((#1 + 4) * 8)) size in bits = 31 b at bit offset #1 + 7) * 8) + 7)) size in bits = 1 end record; so we have lost a simple folding for x'Value_Size (TYPE_ADA_SIZE field). > 2012-05-08 Richard Guenther > > ada/ > * gcc-interface/cuintp.c (UI_From_gnu): Remove TYPE_IS_SIZETYPE use. OK, modulo the formatting: > Index: trunk/gcc/ada/gcc-interface/cuintp.c > === > *** trunk.orig/gcc/ada/gcc-interface/cuintp.c 2011-04-11 17:01:30.0 > +0200 --- trunk/gcc/ada/gcc-interface/cuintp.c2012-05-07 > 16:43:43.497218058 +0200 *** UI_From_gnu (tree Input) > *** 178,186 > if (host_integerp (Input, 0)) > return UI_From_Int (TREE_INT_CST_LOW (Input)); > else if (TREE_INT_CST_HIGH (Input) < 0 > !&& TYPE_UNSIGNED (gnu_type) > !&& !(TREE_CODE (gnu_type) == INTEGER_TYPE > ! && TYPE_IS_SIZETYPE (gnu_type))) > return No_Uint; > #endif > > --- 178,184 > if (host_integerp (Input, 0)) > return UI_From_Int (TREE_INT_CST_LOW (Input)); > else if (TREE_INT_CST_HIGH (Input) < 0 > !&& TYPE_UNSIGNED (gnu_type)) > return No_Uint; > #endif && TYPE_UNSIGNED (gnu_type)) on the same line. -- Eric Botcazou
[PATCH, i386]: Fix PR 52908 - xop-mul-1:f9 miscompiled on bulldozer (-mxop)
Hello! Attached patch fixes PR 52908. There is no need to generate "fake" multiply instructions after reload, we can expand directly to MAC instructions. This approach even produced better assembly for a couple of testcases in gcc.target/i386 testsuite. 2012-05-09 Uros Bizjak PR target/52908 * config/i386/sse.md (vec_widen_smult_hi_v4si): Expand using xop_pmacsdqh insn pattern instead of xop_mulv2div2di3_high. (vec_widen_smult_lo_v4si): Expand using xop_pmacsdql insn pattern instead of xop_mulv2div2di3_low. (xop_pdql): Fix vec_select selector. (xop_pdqh): Ditto. (xop_mulv2div2di3_low): Remove insn_and_split pattern. (xop_mulv2div2di3_high): Ditto. testsuite/ChangeLog: PR target/52908 * gcc.target/i386/xop-imul32widen-vector.c: Update scan-assembler directive to Scan for vpmuldq, not vpmacsdql. Bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32}, and tested on XOP target by Venkataramanan. Patch was committed to mainline SVN. The version, attached to the PR should be backported to other release branches, but another volunteer should do the backport, since I don't have access to XOP target. Also, please note that XOP horizontal add/subtract instructions (and possibly others) have vec_select parallel RTX in wrong endiannes. Uros. Index: config/i386/sse.md === --- config/i386/sse.md (revision 187347) +++ config/i386/sse.md (working copy) @@ -5748,11 +5748,15 @@ if (TARGET_XOP) { + rtx t3 = gen_reg_rtx (V2DImode); + emit_insn (gen_sse2_pshufd_1 (t1, op1, GEN_INT (0), GEN_INT (2), GEN_INT (1), GEN_INT (3))); emit_insn (gen_sse2_pshufd_1 (t2, op2, GEN_INT (0), GEN_INT (2), GEN_INT (1), GEN_INT (3))); - emit_insn (gen_xop_mulv2div2di3_high (operands[0], t1, t2)); + emit_move_insn (t3, CONST0_RTX (V2DImode)); + + emit_insn (gen_xop_pmacsdqh (operands[0], t1, t2, t3)); DONE; } @@ -5777,11 +5781,15 @@ if (TARGET_XOP) { + rtx t3 = gen_reg_rtx (V2DImode); + emit_insn (gen_sse2_pshufd_1 (t1, op1, GEN_INT (0), GEN_INT (2), GEN_INT (1), GEN_INT (3))); emit_insn (gen_sse2_pshufd_1 (t2, op2, GEN_INT (0), GEN_INT (2), GEN_INT (1), GEN_INT (3))); - emit_insn (gen_xop_mulv2div2di3_low (operands[0], t1, t2)); + emit_move_insn (t3, CONST0_RTX (V2DImode)); + + emit_insn (gen_xop_pmacsdql (operands[0], t1, t2, t3)); DONE; } @@ -9792,11 +9800,11 @@ (sign_extend:V2DI (vec_select:V2SI (match_operand:V4SI 1 "nonimmediate_operand" "%x") - (parallel [(const_int 1) (const_int 3)]))) + (parallel [(const_int 0) (const_int 2)]))) (sign_extend:V2DI (vec_select:V2SI (match_operand:V4SI 2 "nonimmediate_operand" "xm") - (parallel [(const_int 1) (const_int 3)] + (parallel [(const_int 0) (const_int 2)] (match_operand:V2DI 3 "nonimmediate_operand" "x")))] "TARGET_XOP" "vpdql\t{%3, %2, %1, %0|%0, %1, %2, %3}" @@ -9810,93 +9818,17 @@ (sign_extend:V2DI (vec_select:V2SI (match_operand:V4SI 1 "nonimmediate_operand" "%x") - (parallel [(const_int 0) (const_int 2)]))) + (parallel [(const_int 1) (const_int 3)]))) (sign_extend:V2DI (vec_select:V2SI (match_operand:V4SI 2 "nonimmediate_operand" "xm") - (parallel [(const_int 0) (const_int 2)] + (parallel [(const_int 1) (const_int 3)] (match_operand:V2DI 3 "nonimmediate_operand" "x")))] "TARGET_XOP" "vpdqh\t{%3, %2, %1, %0|%0, %1, %2, %3}" [(set_attr "type" "ssemuladd") (set_attr "mode" "TI")]) -;; We don't have a straight 32-bit parallel multiply and extend on XOP, so -;; fake it with a multiply/add. In general, we expect the define_split to -;; occur before register allocation, so we have to handle the corner case where -;; the target is the same as operands 1/2 -(define_insn_and_split "xop_mulv2div2di3_low" - [(set (match_operand:V2DI 0 "register_operand" "=&x") - (mult:V2DI - (sign_extend:V2DI - (vec_select:V2SI - (match_operand:V4SI 1 "register_operand" "%x") - (parallel [(const_int 1) (const_int 3)]))) - (sign_extend:V2DI - (vec_select:V2SI - (match_operand:V4SI 2 "nonimmediate_operand" "xm") - (parallel [(const_int 1) (const_int 3)])] - "TARGET_XOP" - "#" - "&& reload_completed" - [(set (match_dup 0) - (match_dup 3)) - (set (match_dup 0) - (plus:V2DI -(mult:V2DI - (sign_extend:V2DI - (vec_select:V2SI - (match_dup 1) - (parallel [(const_int 1) (con
[patch] Fix LTO regression in Ada
Hi, this is a regression present on mainline and 4.7 branch. On the attached testcase, the compiler aborts in LTO mode with: eric@atlantis:~/build/gcc/native32> gcc/xgcc -Bgcc -S lto11.adb -O -flto +===GNAT BUG DETECTED==+ | 4.8.0 20120506 (experimental) [trunk revision 187216] (i586-suse-linux) | tree code 'call_expr' is not supported in LTO streams The problem is that the Ada compiler started to use DECL_ORIGINAL_TYPE in 4.7.x and the type in this field can have arbitrary expressions as TYPE_SIZE, for example expressions with CALL_EXPRs. Now the type is both not gimplified and streamed in LTO mode, so the CALL_EXPRs are sent to the streamer as-is. The immediate solution would be not to stream DECL_ORIGINAL_TYPE (and clear it in free_lang_data_in_decl), but this yields a regression in C++ with -flto -g (ICE in splice_child_die). Therefore, the patch implements the alternate solution of gimplifying DECL_ORIGINAL_TYPE. Bootstrapped/regtested on x86_64-suse-linux, OK for mainline and 4.7 branch? 2012-05-09 Eric Botcazou * gimplify.c (gimplify_decl_expr): For a TYPE_DECL, gimplify the DECL_ORIGINAL_TYPE if it is present. 2012-05-09 Eric Botcazou * gnat.dg/lto11.ad[sb]: New test. -- Eric Botcazou -- { dg-do compile } -- { dg-options "-flto" { target lto } } with Ada.Streams; use Ada.Streams; package body Lto11 is procedure Write (S : not null access Root_Stream_Type'Class; V : Vector) is subtype M_SEA is Stream_Element_Array (1 .. V'Size / Stream_Element'Size); Bytes : M_SEA; for Bytes'Address use V'Address; pragma Import (Ada, Bytes); begin Ada.Streams.Write (S.all, Bytes); end; end Lto11; with Ada.Streams; use Ada.Streams; package Lto11 is type Vector is array (Positive range <>) of Float; procedure Write (S : not null access Root_Stream_Type'Class; V : Vector); end Lto11; diff --git a/gcc/gimplify.c b/gcc/gimplify.c index c4792e6..f69e773 100644 --- a/gcc/gimplify.c +++ b/gcc/gimplify.c @@ -1448,6 +1448,11 @@ gimplify_decl_expr (tree *stmt_p, gimple_seq *seq_p) && !TYPE_SIZES_GIMPLIFIED (TREE_TYPE (decl))) gimplify_type_sizes (TREE_TYPE (decl), seq_p); + if (TREE_CODE (decl) == TYPE_DECL + && DECL_ORIGINAL_TYPE (decl) + && !TYPE_SIZES_GIMPLIFIED (DECL_ORIGINAL_TYPE (decl))) +gimplify_type_sizes (DECL_ORIGINAL_TYPE (decl), seq_p); + if (TREE_CODE (decl) == VAR_DECL && !DECL_EXTERNAL (decl)) { tree init = DECL_INITIAL (decl);
Re: [RFC] PR 53063 encode group options in .opt files
On 9 May 2012 00:38, Joseph S. Myers wrote: > On Wed, 9 May 2012, Manuel López-Ibáñez wrote: > >> which looks correct to me. However, the build fails because now >> options.h requires input.h which requires line-map.h, which is not >> included when building for example libgcc. options.h is included by >> tm.h, so it basically appears everywhere. >> >> Any suggestions how to fix this? > > options.h already has some #if !defined(IN_LIBGCC2) && > !defined(IN_TARGET_LIBS) && !defined(IN_RTS) conditionals, so you could > arrange for some more such conditionals to be generated. This works great for libgcc, however, ada's runtime library is compiled with IN_GCC and includes options.h through tm.h, so it breaks: ../../xgcc -B../../ -c -DIN_GCC -g -O2 -W -Wall \ -iquote /home/manuel/test2/src/gcc \ -iquote . -iquote .. -iquote ../.. -iquote /home/manuel/test2/src/gcc/ada -iquote /home/manuel/test2/src/gcc -I/home/manuel/test2/src/gcc/../include \ ../rts/targext.c -o targext.o In file included from ../../options.h:3716:0, from ../../tm.h:19, from ../rts/targext.c:50: /home/manuel/test2/src/gcc/input.h:25:22: fatal error: line-map.h: No such file or directory #include "line-map.h" ^ That tm.h includes the whole options.h is bad, but I don't want to mess with it. So I am instead only including input.h if options.h was not included from tm.h. This requires surprisingly few changes. The other major issue is that if -Wunused is enabled by -Wall, it has to be enabled by handle_generated_option, so I need to modify every front-end that does that . In the future, we should also generate per-FE _auto function, but I didn't want to make this patch even larger. Bootstrapped and regression tested. OK? 2012-05-09 Manuel López-Ibáñez PR 53063 gcc/ * doc/options.texi (EnabledBy): Document * opts.c: Include opts.h and options.h before tm.h. (finish_options): Do not handle some sub-options here... (common_handle_option): ... instead call common_handle_option_auto here. * optc-gen.awk: Handle EnabledBy. * opth-gen.awk: Declare common_handle_option_auto. * common.opt (Wuninitialized): Use EnabledBy. Delete Init. (Wmaybe-uninitialized): Likewise. (Wunused-but-set-variable): Likewise. (Wunused-function): Likewise. (Wunused-label): Likewise. (Wunused-value): Likewise. (Wunused-variable): Likewise. * opt-read.awk: Create opt_numbers array. ada/ * gcc-interface/misc.c (gnat_parse_file): Move before ... (gnat_handle_option): ... this. Use handle_generated_option. c-family/ * c-opts.c (c_common_handle_option): Use handle_generated_option to enable sub-options. fortran/ * options.c: Include diagnostics.h instead of diagnostics-core.h. (set_Wall): Do not see warn_unused here. (gfc_handle_option): Set it here using handle_generated_option. group-options-2.diff Description: Binary data
Bug 53289 - unnecessary repetition of caret diagnostics
Simple enough. Bootstrapped and regression tested. The output for the example in the PR is now: /home/manuel/caret-overload.C:6:6: error: no matching function for call to ‘g(int)’ g(1); ^ /home/manuel/caret-overload.C:6:6: note: candidate is: /home/manuel/caret-overload.C:2:18: note: template typename T::type g(T) typename T::type g(T); ^ /home/manuel/caret-overload.C:2:18: note: template argument deduction/substitution failed: /home/manuel/caret-overload.C: In substitution of ‘template typename T::type g(T) [with T = int]’: /home/manuel/caret-overload.C:6:6: required from here /home/manuel/caret-overload.C:2:18: error: ‘int’ is not a class, struct, or union type OK? 2012-05-09 Manuel López-Ibáñez PR c++/53289 gcc/ * diagnostic.h (diagnostic_context): Add last_location. * diagnostic.c (diagnostic_initialize): Initialize it. (diagnostic_show_locus): Use it. caret-rep.diff Description: Binary data
Re: Bug 53289 - unnecessary repetition of caret diagnostics
On Wed, May 9, 2012 at 4:02 PM, Manuel López-Ibáñez wrote: > Simple enough. Bootstrapped and regression tested. > > The output for the example in the PR is now: > > /home/manuel/caret-overload.C:6:6: error: no matching function for > call to ‘g(int)’ > g(1); > ^ > /home/manuel/caret-overload.C:6:6: note: candidate is: > /home/manuel/caret-overload.C:2:18: note: template typename > T::type g(T) > typename T::type g(T); > ^ > /home/manuel/caret-overload.C:2:18: note: template argument > deduction/substitution failed: > /home/manuel/caret-overload.C: In substitution of ‘template > typename T::type g(T) [with T = int]’: > /home/manuel/caret-overload.C:6:6: required from here > /home/manuel/caret-overload.C:2:18: error: ‘int’ is not a class, > struct, or union type > > OK? OK. Thanks, -- Gaby
Re: [PATCH] Add -feliminate-malloc to enable/disable elimination of redundant malloc/free pairs
On Wed, May 9, 2012 at 11:21 AM, Richard Guenther wrote: > On Wed, May 9, 2012 at 6:32 PM, Xinliang David Li wrote: >> On Wed, May 9, 2012 at 1:12 AM, Richard Guenther >> wrote: >>> On Tue, May 8, 2012 at 6:18 PM, Xinliang David Li >>> wrote: To be clear, this flag is for malloc implementation (such as tcmalloc) with side effect unknown to the compiler. Using -fno-builtin-xxx is too conservative for that purpose. >>> >>> I don't think that flys. Btw, the patch also guards alloca - alloca is >>> purely >>> GCC internal. >>> >>> What's the "unknown side-effects" that are also important to preserve >>> for free(malloc(4))? >>> >> >> The side effect is the user registered malloc hooks. >> >> The pattern 'free(malloc(4)', or loops like >> >> for (..) >> { >> malloc(..); // result not used >> } >> >> itself is not interesting. They only appear in the test code and the >> problem can be worked around by using -fno-builtin-malloc. The option >> is to avoid disable this and all similar malloc optimizations (in the >> future) for malloc implementation with hooks. > > What is then interesting? The above are all the same as if optimization > figured out the storage is not used, no? Compiler may choose to convert malloc call into alloca or coalesce multiple malloc calls -- but this may not always be the best choice given that more advanced locality based heap allocation is possible. Nothing like this exist yet, but the point is that it is convenient to have a way to keep malloc call while keeping its aliasing property. David > > Richard. > >> Thanks, >> >> David >> >>> Richard. >>> David On Tue, May 8, 2012 at 7:43 AM, Dehao Chen wrote: > Hello, > > This patch adds a flag to guard the optimization that optimize the > following code away: > > free (malloc (4)); > > In some cases, we'd like this type of malloc/free pairs to remain in > the optimized code. > > Tested with bootstrap, and no regression in the gcc testsuite. > > Is it ok for mainline? > > Thanks, > Dehao > > gcc/ChangeLog > 2012-05-08 Dehao Chen > > * common.opt (feliminate-malloc): New. > * doc/invoke.texi: Document it. > * tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Honor it. > > gcc/testsuite/ChangeLog > 2012-05-08 Dehao Chen > > * gcc.dg/free-malloc.c: Check if -fno-eliminate-malloc is working > as expected. > > Index: gcc/doc/invoke.texi > === > --- gcc/doc/invoke.texi (revision 187277) > +++ gcc/doc/invoke.texi (working copy) > @@ -360,7 +360,8 @@ > -fcx-limited-range @gol > -fdata-sections -fdce -fdelayed-branch @gol > -fdelete-null-pointer-checks -fdevirtualize -fdse @gol > --fearly-inlining -fipa-sra -fexpensive-optimizations -ffat-lto-objects > @gol > +-fearly-inlining -feliminate-malloc -fipa-sra -fexpensive-optimizations > @gol > +-ffat-lto-objects @gol > -ffast-math -ffinite-math-only -ffloat-store > -fexcess-precision=@var{style} @gol > -fforward-propagate -ffp-contract=@var{style} -ffunction-sections @gol > -fgcse -fgcse-after-reload -fgcse-las -fgcse-lm -fgraphite-identity @gol > @@ -6238,6 +6239,7 @@ > -fdefer-pop @gol > -fdelayed-branch @gol > -fdse @gol > +-feliminate-malloc @gol > -fguess-branch-probability @gol > -fif-conversion2 @gol > -fif-conversion @gol > @@ -6762,6 +6764,11 @@ > Perform dead store elimination (DSE) on RTL@. > Enabled by default at @option{-O} and higher. > > +@item -feliminate-malloc > +@opindex feliminate-malloc > +Eliminate unnecessary malloc/free pairs. > +Enabled by default at @option{-O} and higher. > + > @item -fif-conversion > @opindex fif-conversion > Attempt to transform conditional jumps into branch-less equivalents. > This > Index: gcc/testsuite/gcc.dg/free-malloc.c > === > --- gcc/testsuite/gcc.dg/free-malloc.c (revision 0) > +++ gcc/testsuite/gcc.dg/free-malloc.c (revision 0) > @@ -0,0 +1,12 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fno-eliminate-malloc" } */ > +/* { dg-final { scan-assembler-times "malloc" 2} } */ > +/* { dg-final { scan-assembler-times "free" 2} } */ > + > +extern void * malloc (unsigned long); > +extern void free (void *); > + > +void test () > +{ > + free (malloc (10)); > +} > Index: gcc/common.opt > === > --- gcc/common.opt (revision 187277) > +++ gcc/common.opt (working copy) > @@ -1474,6 +1474,10 @@ > Common Var(flag_dce) Init(1) Optimization > Use the RTL dead code elimination pass >
[C++ Patch] PR 53301
Hi, shame on me. I think the patch almost qualifies as obvious. Tested x86_64-linux. Thanks, Paolo. // /cp 2012-05-10 Paolo Carlini PR c++/53301 * decl.c (check_default_argument): Fix typo (POINTER_TYPE_P instead of TYPE_PTR_P) in zero-as-null-pointer-constant warning. /testsuite 2012-05-10 Paolo Carlini PR c++/53301 * g++.dg/warn/Wzero-as-null-pointer-constant-6.C: New. Index: testsuite/g++.dg/warn/Wzero-as-null-pointer-constant-6.C === --- testsuite/g++.dg/warn/Wzero-as-null-pointer-constant-6.C(revision 0) +++ testsuite/g++.dg/warn/Wzero-as-null-pointer-constant-6.C(revision 0) @@ -0,0 +1,6 @@ +// PR c++/53301 +// { dg-options "-Wzero-as-null-pointer-constant" } + +class x { public: x(int v) {} }; + +void foo(const x& = 0); Index: cp/decl.c === --- cp/decl.c (revision 187335) +++ cp/decl.c (working copy) @@ -10619,7 +10619,7 @@ check_default_argument (tree decl, tree arg) if (warn_zero_as_null_pointer_constant && c_inhibit_evaluation_warnings == 0 - && (POINTER_TYPE_P (decl_type) || TYPE_PTR_TO_MEMBER_P (decl_type)) + && (TYPE_PTR_P (decl_type) || TYPE_PTR_TO_MEMBER_P (decl_type)) && null_ptr_cst_p (arg) && !NULLPTR_TYPE_P (TREE_TYPE (arg))) {
Go patch committed: Add -fgo-pkgpath option
This patch to the Go frontend adds a new command line option: -fgo-pkgpath. This permits setting the string that is returned by the PkgPath() method of reflect.Type for types defined in the package being compiled. It also sets the prefix used for symbol names. This largely replaces the existing -fgo-prefix option, although I retained -fgo-prefix for backward compatibility. This will permit libgo to have PkgPath values that are the same as the ones returned by the other Go compiler. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline and 4.7 branch. Ian 2012-05-09 Ian Lance Taylor * lang.opt: Add -fgo-pkgpath. * go-lang.c (go_pkgpath): New static variable. (go_prefix): New static variable. (go_langhook_init): Pass go_pkgpath and go_prefix to go_create_gogo. (go_langhook_handle_option): Handle -fgo-pkgpath. Change -fgo-prefix handling to just set go_prefix. * go-c.h (go_set_prefix): Don't declare. (go_create_gogo): Add pkgpath and prefix to declaration. * go-gcc.cc (Gcc_backend::global_variable): Change unique_prefix to pkgpath. Don't include the package name in the asm name. * gccgo.texi (Invoking gccgo): Document -fgo-pkgpath. Update the docs for -fgo-prefix. Index: gcc/go/lang.opt === --- gcc/go/lang.opt (revision 187020) +++ gcc/go/lang.opt (working copy) @@ -53,6 +53,10 @@ fgo-optimize- Go Joined RejectNegative -fgo-optimize- Turn on optimization passes in the frontend +fgo-pkgpath= +Go Joined RejectNegative +-fgo-pkgpath= Set Go package path + fgo-prefix= Go Joined RejectNegative -fgo-prefix= Set package-specific prefix for exported Go names Index: gcc/go/go-gcc.cc === --- gcc/go/go-gcc.cc (revision 187020) +++ gcc/go/go-gcc.cc (working copy) @@ -271,7 +271,7 @@ class Gcc_backend : public Backend Bvariable* global_variable(const std::string& package_name, - const std::string& unique_prefix, + const std::string& pkgpath, const std::string& name, Btype* btype, bool is_external, @@ -1281,7 +1281,7 @@ Gcc_backend::non_zero_size_type(tree typ Bvariable* Gcc_backend::global_variable(const std::string& package_name, - const std::string& unique_prefix, + const std::string& pkgpath, const std::string& name, Btype* btype, bool is_external, @@ -1310,9 +1310,9 @@ Gcc_backend::global_variable(const std:: { TREE_PUBLIC(decl) = 1; - std::string asm_name(unique_prefix); + std::string asm_name(pkgpath); asm_name.push_back('.'); - asm_name.append(var_name); + asm_name.append(name); SET_DECL_ASSEMBLER_NAME(decl, get_identifier_from_string(asm_name)); } TREE_USED(decl) = 1; Index: gcc/go/gccgo.texi === --- gcc/go/gccgo.texi (revision 187020) +++ gcc/go/gccgo.texi (working copy) @@ -157,14 +157,32 @@ compile time. When linking, specify a library search directory, as with @command{gcc}. +@item -fgo-pkgpath=@var{string} +@cindex @option{-fgo-pkgpath} +Set the package path to use. This sets the value returned by the +PkgPath method of reflect.Type objects. It is also used for the names +of globally visible symbols. The argument to this option should +normally be the string that will be used to import this package after +it has been installed; in other words, a pathname within the +directories specified by the @option{-I} option. + @item -fgo-prefix=@var{string} @cindex @option{-fgo-prefix} +An alternative to @option{-fgo-pkgpath}. The argument will be +combined with the package name from the source file to produce the +package path. If @option{-fgo-pkgpath} is used, @option{-fgo-prefix} +will be ignored. + Go permits a single program to include more than one package with the -same name. This option is required to make this work with -@command{gccgo}. The argument to this option may be any string. Each -package with the same name must use a distinct @option{-fgo-prefix} -option. The argument is typically the full path under which the -package will be installed, as that must obviously be unique. +same name in the @code{package} clause in the source file, though +obviously the two packages must be imported using different pathnames. +In order for this to work with @command{gccgo}, either +@option{-fgo-pkgpath} or @option{-fgo-prefix} must be specified when +compiling a package. + +Using either @option{-fgo-pkgpath} or @option{-fgo-prefix} disables +the special treatment of the @code{main} package and permits that +package to be imported like any other. @item -frequire-return-statement @itemx -fno-require-return-statement Index: gcc/go/gofrontend/gogo.cc === --- gcc/go/gof
Re: [PATCH] Add option for dumping to stderr (issue6190057)
Thanks for your suggestions/comments. I have updated the patch and documentation. It supports the following usage: gcc -fdump-tree-all=tree.dump -fdump-tree-pre=stdout -fdump-rtl-ira=ira.dump Here all tree dumps except the PRE are output into tree.dump, PRE dump goes to stdout and the IRA dump goes to ira.dump. Thanks, Sharad 2012-05-09 Sharad Singhai * doc/invoke.texi: Add documentation for the new option. * tree-dump.c (dump_get_standard_stream): New function. (dump_files): Update for new field. (dump_switch_p_1): Handle dump filenames. (dump_begin): Likewise. (get_dump_file_name): Likewise. (dump_end): Remove attribute. (dump_enable_all): Add new parameter FILENAME. All callers updated. (enable_rtl_dump_file): * tree-pass.h (enum tree_dump_index): Add new constant. (struct dump_file_info): Add new field FILENAME. * testsuite/g++.dg/other/dump-filename-1.C: New test. Index: doc/invoke.texi === --- doc/invoke.texi (revision 187265) +++ doc/invoke.texi (working copy) @@ -5322,20 +5322,23 @@ Here are some examples showing uses of these optio @item -d@var{letters} @itemx -fdump-rtl-@var{pass} +@itemx -fdump-rtl-@var{pass}=@var{filename} @opindex d Says to make debugging dumps during compilation at times specified by @var{letters}. This is used for debugging the RTL-based passes of the compiler. The file names for most of the dumps are made by appending a pass number and a word to the @var{dumpname}, and the files are -created in the directory of the output file. Note that the pass -number is computed statically as passes get registered into the pass -manager. Thus the numbering is not related to the dynamic order of -execution of passes. In particular, a pass installed by a plugin -could have a number over 200 even if it executed quite early. -@var{dumpname} is generated from the name of the output file, if -explicitly specified and it is not an executable, otherwise it is the -basename of the source file. These switches may have different effects -when @option{-E} is used for preprocessing. +created in the directory of the output file. If the +@option{=@var{filename}} is appended to the longer form of the dump +option then the dump is done on that file instead of numbered +files. Note that the pass number is computed statically as passes get +registered into the pass manager. Thus the numbering is not related +to the dynamic order of execution of passes. In particular, a pass +installed by a plugin could have a number over 200 even if it executed +quite early. @var{dumpname} is generated from the name of the output +file, if explicitly specified and it is not an executable, otherwise +it is the basename of the source file. These switches may have +different effects when @option{-E} is used for preprocessing. Debug dumps can be enabled with a @option{-fdump-rtl} switch or some @option{-d} option @var{letters}. Here are the possible @@ -5719,15 +5722,18 @@ counters for each function compiled. @item -fdump-tree-@var{switch} @itemx -fdump-tree-@var{switch}-@var{options} +@itemx -fdump-tree-@var{switch}-@var{options}=@var{filename} @opindex fdump-tree Control the dumping at various stages of processing the intermediate language tree to a file. The file name is generated by appending a switch specific suffix to the source file name, and the file is -created in the same directory as the output file. If the -@samp{-@var{options}} form is used, @var{options} is a list of -@samp{-} separated options which control the details of the dump. Not -all options are applicable to all dumps; those that are not -meaningful are ignored. The following options are available +created in the same directory as the output file. In case of +@option{=@var{filename}} option, the dump is output on the given file +name instead. If the @samp{-@var{options}} form is used, +@var{options} is a list of @samp{-} separated options which control +the details or location of the dump. Not all options are applicable +to all dumps; those that are not meaningful are ignored. The +following options are available @table @samp @item address @@ -5765,9 +5771,46 @@ Enable showing the tree dump for each statement. Enable showing the EH region number holding each statement. @item scev Enable showing scalar evolution analysis details. +@item slim +Inhibit dumping of members of a scope or body of a function merely +because that scope has been reached. Only dump such items when they +are directly reachable by some other path. When dumping pretty-printed +trees, this option inhibits dumping the bodies of control structures. +@item =@var{filename} +Instead of using an auto generated dump file name, use the given file +name. The file names @file{stdout} and @file{stderr} are treated +specially and are considered already open standard s
[h8300] increase dwarf address size
H8/300 cpus have a larger-than-64k address space, despite 16-bit pointers. OK to apply? Ok for 4.7 branch? See also http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48231 * config/h8300/h8300.h (DWARF2_ADDR_SIZE): Define as 4 bytes. Index: h8300.h === --- h8300.h (revision 187362) +++ h8300.h (working copy) @@ -126,12 +126,13 @@ extern const char * const *h8_reg_names; /* The return address is pushed on the stack. */ #define INCOMING_RETURN_ADDR_RTX gen_rtx_MEM (Pmode, gen_rtx_REG (Pmode, STACK_POINTER_REGNUM)) #define INCOMING_FRAME_SP_OFFSET (POINTER_SIZE / 8) #define DWARF_CIE_DATA_ALIGNMENT 2 +#define DWARF2_ADDR_SIZE 4 /* Define this if addresses of constant functions shouldn't be put through pseudo regs where they can be cse'd. Desirable on machines where ordinary constants are expensive but a CALL with constant address is cheap.
Re: [PATCH] Add option for dumping to stderr (issue6190057)
Bummer. I was thinking to reserve '=' for selective dumping: -fdump-tree-pre= I guess this can be achieved via @ -fdump-tree-pre@ -fdump-tree-pre=@ Another issue -- I don't think the current precedence rule is correct. Consider that -fopt-info=2 will be mapped to -fdump-tree-all-transform-verbose2=stderr -fdump-rtl-all-transform-verbose2=stderr then the current precedence rule will cause surprise when the following is used -fopt-info -fdump-tree-pre The PRE dump will be emitted to stderr which is not what user wants. In short, special streams should be treated as 'weak' the same way as your previous implementation. thanks, David On Wed, May 9, 2012 at 4:56 PM, Sharad Singhai wrote: > Thanks for your suggestions/comments. I have updated the patch and > documentation. It supports the following usage: > > gcc -fdump-tree-all=tree.dump -fdump-tree-pre=stdout > -fdump-rtl-ira=ira.dump > > Here all tree dumps except the PRE are output into tree.dump, PRE dump > goes to stdout and the IRA dump goes to ira.dump. > > Thanks, > Sharad > > 2012-05-09 Sharad Singhai > > * doc/invoke.texi: Add documentation for the new option. > * tree-dump.c (dump_get_standard_stream): New function. > (dump_files): Update for new field. > (dump_switch_p_1): Handle dump filenames. > (dump_begin): Likewise. > (get_dump_file_name): Likewise. > (dump_end): Remove attribute. > (dump_enable_all): Add new parameter FILENAME. > All callers updated. > (enable_rtl_dump_file): > * tree-pass.h (enum tree_dump_index): Add new constant. > (struct dump_file_info): Add new field FILENAME. > * testsuite/g++.dg/other/dump-filename-1.C: New test. > > Index: doc/invoke.texi > === > --- doc/invoke.texi (revision 187265) > +++ doc/invoke.texi (working copy) > @@ -5322,20 +5322,23 @@ Here are some examples showing uses of these optio > > @item -d@var{letters} > @itemx -fdump-rtl-@var{pass} > +@itemx -fdump-rtl-@var{pass}=@var{filename} > @opindex d > Says to make debugging dumps during compilation at times specified by > @var{letters}. This is used for debugging the RTL-based passes of the > compiler. The file names for most of the dumps are made by appending > a pass number and a word to the @var{dumpname}, and the files are > -created in the directory of the output file. Note that the pass > -number is computed statically as passes get registered into the pass > -manager. Thus the numbering is not related to the dynamic order of > -execution of passes. In particular, a pass installed by a plugin > -could have a number over 200 even if it executed quite early. > -@var{dumpname} is generated from the name of the output file, if > -explicitly specified and it is not an executable, otherwise it is the > -basename of the source file. These switches may have different effects > -when @option{-E} is used for preprocessing. > +created in the directory of the output file. If the > +@option{=@var{filename}} is appended to the longer form of the dump > +option then the dump is done on that file instead of numbered > +files. Note that the pass number is computed statically as passes get > +registered into the pass manager. Thus the numbering is not related > +to the dynamic order of execution of passes. In particular, a pass > +installed by a plugin could have a number over 200 even if it executed > +quite early. @var{dumpname} is generated from the name of the output > +file, if explicitly specified and it is not an executable, otherwise > +it is the basename of the source file. These switches may have > +different effects when @option{-E} is used for preprocessing. > > Debug dumps can be enabled with a @option{-fdump-rtl} switch or some > @option{-d} option @var{letters}. Here are the possible > @@ -5719,15 +5722,18 @@ counters for each function compiled. > > @item -fdump-tree-@var{switch} > @itemx -fdump-tree-@var{switch}-@var{options} > +@itemx -fdump-tree-@var{switch}-@var{options}=@var{filename} > @opindex fdump-tree > Control the dumping at various stages of processing the intermediate > language tree to a file. The file name is generated by appending a > switch specific suffix to the source file name, and the file is > -created in the same directory as the output file. If the > -@samp{-@var{options}} form is used, @var{options} is a list of > -@samp{-} separated options which control the details of the dump. Not > -all options are applicable to all dumps; those that are not > -meaningful are ignored. The following options are available > +created in the same directory as the output file. In case of > +@option{=@var{filename}} option, the dump is output on the given file > +name instead. If the @samp{-@var{options}} form is used, > +@var{options} is a list of @samp{-} separated options which control > +the details or location of the dump. Not
[Patch / RFC] Improving more locations for binary expressions
Hi, I'm looking into making more progress on those locations, for another class of cases. Consider: struct T { }; T foo(); void bar(int a, int b) { if (foo() && a < b) ; } thus, in this case, we have a class type T, instead of void. The error message is: a.cc: In function ‘void bar(int, int)’: a.cc:7:20: error: no match for ‘operator&&’ (operand types are ‘T’ and ‘bool’) if (foo() && a < b) ^ a.cc:7:20: note: candidate is: a.cc:7:20: note: operator&&(bool, bool) if (foo() && a < b) ^ a.cc:7:20: note: no known conversion for argument 1 from ‘T’ to ‘bool’ in case my message ends up garbled, the carets do not point to && (column 13), two times point to b (column 20), which is obviously wrong. In other terms, all the columns are 20, all wrong. Now, a first step toward fixing this is improving on the work I did some days ago on cp_parser_binary_expression: make sure that the location we are eventually passing to build_x_binary_op is always, provably, the location of the operator. It seems to me that In order to achieve that, the right thing to do is always handling tree_type and loc (the location of the operator) together, per the attached patch_parser (the initialization of loc to input_location is just to prevent uninitialized warnings) Then, with patch_parser applied (it passes testing), we end up with a correct first caret for the testcase above, that for "no match", but the second one remains wrong. What happens is that the location is correct up to the print_z_candidates call. It gets a correct location as argument, and prints the "candidate is" message with the right column, 13. Then it calls print_z_candidate without passing a location, which, in turn: - Prints the message with input_location as location, which is 20, wrong. - Then calls print_conversion_rejection passing location_of (candidate->fn) which is again 20, again wrong. Now, if I change print_z_candidate to get the location_t from print_z_candidates and use it for both the last two outputs, I finally get: a.cc: In function ‘void bar(int, int)’: a.cc:7:13: error: no match for ‘operator&&’ (operand types are ‘T’ and ‘bool’) if (foo() && a < b) ^ a.cc:7:13: note: candidate is: a.cc:7:13: note: operator&&(bool, bool) a.cc:7:13: note: no known conversion for argument 1 from ‘T’ to ‘bool’ again, in case the text is garbled, all the locations seems finally right to me, and, interestingly there is no caret at the end, which seems Ok. Now, what I suspect we are doing wrong here, is that we are not handling separately, in terms of locations, the *builtin* overloads, which should simply use always the location passed from upstream (as I did in my quick experiment) from all the other overloads, for which the current code seems fine. Seems right? The patch for this second half of the issue seems now also rather simple to me. Thanks, Paolo. /// Index: parser.c === --- parser.c(revision 187362) +++ parser.c(working copy) @@ -1621,6 +1621,8 @@ typedef struct cp_parser_expression_stack_entry enum tree_code tree_type; /* Precedence of the binary operation we are parsing. */ enum cp_parser_prec prec; + /* Location of the binary operation we are parsing. */ + location_t loc; } cp_parser_expression_stack_entry; /* The stack for storing partial expressions. We only need NUM_PREC_VALUES @@ -7277,7 +7279,7 @@ cp_parser_binary_expression (cp_parser* parser, bo cp_parser_expression_stack_entry *sp = &stack[0]; tree lhs, rhs; cp_token *token; - location_t loc; + location_t loc = input_location; enum tree_code tree_type, lhs_type, rhs_type; enum cp_parser_prec new_prec, lookahead_prec; tree overload; @@ -7290,7 +7292,6 @@ cp_parser_binary_expression (cp_parser* parser, bo { /* Get an operator token. */ token = cp_lexer_peek_token (parser->lexer); - loc = token->location; if (warn_cxx0x_compat && token->type == CPP_RSHIFT @@ -7320,6 +7321,7 @@ cp_parser_binary_expression (cp_parser* parser, bo get_rhs: tree_type = binops_by_token[token->type].tree_type; + loc = token->location; /* We used the operator token. */ cp_lexer_consume_token (parser->lexer); @@ -7349,6 +7351,7 @@ cp_parser_binary_expression (cp_parser* parser, bo stack overflows. */ sp->prec = prec; sp->tree_type = tree_type; + sp->loc = loc; sp->lhs = lhs; sp->lhs_type = lhs_type; sp++; @@ -7370,6 +7373,7 @@ cp_parser_binary_expression (cp_parser* parser, bo --sp; prec = sp->prec; tree_type = sp->tree_type; + loc = sp->loc; rhs = lhs; rhs_type = lhs_type; lhs = sp->lhs;
Re: [Patch / RFC] Improving more locations for binary expressions
Paolo Carlini writes: > in case my message ends up garbled, the carets do not point to && > (column 13), two times point to b (column 20), which is obviously > wrong. In other terms, all the columns are 20, all wrong. The new caret support does seem to have revealed a bunch of places where the column info in error messages was pretty screwy... I guess nobody paid much attention to it before... :] Should these get reported as bugzilla bugs...? -miles -- The trouble with most people is that they think with their hopes or fears or wishes rather than with their minds. -- Will Durant
Re: [patch] support for multiarch systems
Il 09/05/2012 19:19, Matthias Klose ha scritto: > these are referenced from the http://wiki.debian.org/Multiarch/Tuples > https://wiki.ubuntu.com/MultiarchSpec#Filesystem_layout > http://err.no/debian/amd64-multiarch-3 > > http://wiki.debian.org/Multiarch/TheCaseForMultiarch describes use cases for > multiarch, and why Debian thinks that the existing approaches are not > sufficient > (having name collisions for different architectures or ad hoc names for new > architectures like libx32). That may be contentious within the Linux > community, > but I would like to avoid this kind of discussion here. I don't care about contentiousness, I just would like this to be documented somewhere (for example in the internals manual where MULTILIB_* is documented too). Paolo
Re: [C Patch]: pr52543
Il 30/03/2012 12:08, Richard Sandiford ha scritto: >> > + There are two useful preprocessor defines for use by maintainers: >> > + >> > + #define LOG_COSTS >> > + >> > + if you wish to see the actual cost estimates that are being used >> > + for each mode wider than word mode and the cost estimates for zero >> > + extension and the shifts. This can be useful when port maintainers >> > + are tuning insn rtx costs. >> > + >> > + #define FORCE_LOWERING >> > + >> > + if you wish to test the pass with all the transformation forced on. >> > + This can be useful for finding bugs in the transformations. > Must admit I'm not keen on these kinds of macro, but it's Ian's call. Indeed, LOG_COSTS should be (dump_flags & TDF_DETAILS) != 0, and perhaps FORCE_LOWERING should be a -f flag (like -flower-all-subregs) or a --param. Paolo
Re: [C Patch]: pr52543
Il 10/05/2012 08:45, Paolo Bonzini ha scritto: > Il 30/03/2012 12:08, Richard Sandiford ha scritto: + There are two useful preprocessor defines for use by maintainers: + + #define LOG_COSTS + + if you wish to see the actual cost estimates that are being used + for each mode wider than word mode and the cost estimates for zero + extension and the shifts. This can be useful when port maintainers + are tuning insn rtx costs. + + #define FORCE_LOWERING + + if you wish to test the pass with all the transformation forced on. + This can be useful for finding bugs in the transformations. >> Must admit I'm not keen on these kinds of macro, but it's Ian's call. > > Indeed, LOG_COSTS should be (dump_flags & TDF_DETAILS) != 0, and perhaps > FORCE_LOWERING should be a -f flag (like -flower-all-subregs) or a --param. Not sure how this got sent a month after I wrote it (and decided not to send it). :) Paolo