Re: [PATCH][RFA] [PR rtl-optimization/64317] Enhance postreload-gcse.c to eliminate more redundant loads
On 03/16/2015 01:27 PM, Jakub Jelinek wrote:
> What effect does the patch have on compile time on say x86_64 or ppc64?

I bootstrapped x86_64 trunk then timed compiling 640 .i/.ii files with
-O2. That was repeated 10 times (to get a sense of variability). Each
run took a little over 40 minutes with variability of less than a second
(slowest run compared to fastest run).

I then also gathered data for a smaller set of runs for -O2
-funroll-loops (given the consistency between runs, 10 iterations seemed
like major overkill). Again variability was less than a second.

I then applied the patch and repeated the -O2 and -O2 -funroll-loops
test. The patched compiler was within a second of the unpatched compiler
for both the -O2 and -O2 -funroll-loops tests. Given the difference was
within the noise level of those tests, my conclusion is the new code
makes no measurable difference in compile time performance for x86_64.

Similar tests are in progress on powerpc64-linux-gnu.

Jeff
Re: [PATCH][RFA] [PR rtl-optimization/64317] Enhance postreload-gcse.c to eliminate more redundant loads
On Sat, Mar 21, 2015 at 01:47:10AM -0600, Jeff Law wrote:
> On 03/16/2015 01:27 PM, Jakub Jelinek wrote:
> >
> >What effect does the patch have on compile time on say x86_64 or ppc64?
> I bootstrapped x86_64 trunk then timed compiling 640 .i/.ii files with -O2.
> That was repeated 10 times (to get a sense of variability). Each run took a
> little over 40 minutes with variability of less than a second (slowest run
> compared to fastest run).
>
> I then also gathered data for a smaller set of runs for -O2 -funroll-loops
> (given the consistency between runs, 10 iterations seemed like major
> overkill). Again variability was less than a second.
>
> I then applied the patch and repeated the -O2 and -O2 -funroll-loops test.
> The patched compiler was within a second of the unpatched compiler for both
> the -O2 and -O2 -funroll-loops tests. Given the difference was within the
> noise level of those tests, my conclusion is the new code makes no
> measurable difference in compile time performance for x86_64.
>
> Similar tests are in progress on powerpc64-linux-gnu.

Thanks for the testing. I think we want the patch in now.

Jakub
Re: [Patch, Fortran] Extend (lib)coarray API/ABI documentation
Dear Tobias,

Revision r221550 breaks bootstrap on platforms with recent makeinfo
(mine is 5.2): see https://gcc.gnu.org/ml/gcc-regression/2015-03/.
The error is

../../work/gcc/fortran/gfortran.texi:3850: @code missing close brace
../../work/gcc/fortran/gfortran.texi:3851: misplaced }
../../work/gcc/fortran/Make-lang.in:185: recipe for target 'doc/gfortran.info' failed

and is fixed with the following patch

--- ../_clean/gcc/fortran/gfortran.texi 2015-03-21 10:40:46.0 +0100
+++ gcc/fortran/gfortran.texi 2015-03-21 11:27:20.0 +0100
@@ -3847,8 +3847,8 @@ an error message; may be NULL
 @item @var{errmsg_len} @tab the buffer size of errmsg.
 @end multitable
 
-@item @emph{NOTE} A simple implementation could be a simple @code{__asm__
-__volatile__ ("":::"memory)} to prevent code movements.
+@item @emph{NOTE} A simple implementation could be a simple
+@code{__asm__ __volatile__ ("":::"memory)} to prevent code movements.
 @end table

TIA

Dominiq
Re: [Patch, Fortran, pr55901, v1] [OOP] type is (character(len=*)) misinterpreted as array
Dear Andre,

I have applied the three preliminary patches but have not yet applied
the attached one for PR55901. As advertised the composite patch
bootstraps and regtests on FC21, x86_64.

I went through gfc_trans_allocate and cleaned up the formatting and some
of the text in the comments. You did a heroic job to tidy up this
function and so I thought that I should do my bit - one of the features,
previously, was that the line length often went well in excess of the
gcc style guide limit of 72 and this tended to make it somewhat
unreadable. I have not been rigorous about this, especially when
readability would be impaired thereby, but it does look a lot better
now. The composite diff is attached.

Not only does the Metcalf example run correctly but also the PGI Insider
linked list example. I have attached a version of this modified to
function as a gfortran.dg testcase. With the attributions in there, I do
not think that there are any copyright issues. The article itself has no
copyright notice.

I would very much like to say that this is OK for trunk but we are hard
up against the end of stage 4 and so it should really wait for
backporting to 5.2.

Thanks for the patches

Paul

On 19 March 2015 at 16:13, Andre Vehreschild wrote:
> Hi all,
>
> please find attached the parts missing to stop valgrind's complaining about the
> use of uninitialized memory. The issue was, that when constructing a temporary
> class-object to call a routine with unlimited polymorphic arguments, the _len
> component was never set. This is fixed by this patch now.
>
> Note, the patch is based on all these preliminary patches:
>
> https://gcc.gnu.org/ml/fortran/2015-03/msg00074.html
> https://gcc.gnu.org/ml/fortran/2015-03/msg00075.html
> https://gcc.gnu.org/ml/fortran/2015-03/msg00085.html
>
> Bootstraps and regtests ok on x86_64-linux-gnu/F20.
>
> Please review!
>
> - Andre
> --
> Andre Vehreschild * Email: vehre ad gmx dot de

--
Outside of a dog, a book is a man's best friend.
Inside of a dog it's too dark to read.

Groucho Marx

Index: gcc/fortran/class.c
===================================================================
*** gcc/fortran/class.c (revision 221500)
--- gcc/fortran/class.c (working copy)
*************** gfc_add_component_ref (gfc_expr *e, cons
*** 234,239 ****
--- 234,242 ----
      }
    if (*tail != NULL && strcmp (name, "_data") == 0)
      next = *tail;
+   else
+     /* Avoid losing memory. */
+     gfc_free_ref_list (*tail);
    (*tail) = gfc_get_ref();
    (*tail)->next = next;
    (*tail)->type = REF_COMPONENT;
*************** find_intrinsic_vtab (gfc_typespec *ts)
*** 2562,2574 ****
            c->attr.access = ACCESS_PRIVATE;
  
            /* Build a minimal expression to make use of
!              target-memory.c/gfc_element_size for 'size'. */
            e = gfc_get_expr ();
            e->ts = *ts;
            e->expr_type = EXPR_VARIABLE;
            c->initializer = gfc_get_int_expr (gfc_default_integer_kind,
                                               NULL,
!                                              (int)gfc_element_size (e));
            gfc_free_expr (e);
  
            /* Add component _extends. */
--- 2565,2583 ----
            c->attr.access = ACCESS_PRIVATE;
  
            /* Build a minimal expression to make use of
!              target-memory.c/gfc_element_size for 'size'. Special handling
!              for character arrays, that are not constant sized: to support
!              len(str)*kind, only the kind information is stored in the
!              vtab. */
            e = gfc_get_expr ();
            e->ts = *ts;
            e->expr_type = EXPR_VARIABLE;
            c->initializer = gfc_get_int_expr (gfc_default_integer_kind,
                                               NULL,
!                                              ts->type == BT_CHARACTER
!                                              && charlen == 0 ?
!                                              ts->kind :
!                                              (int)gfc_element_size (e));
            gfc_free_expr (e);
  
            /* Add component _extends. */
Index: gcc/fortran/gfortran.h
===================================================================
*** gcc/fortran/gfortran.h (revision 221500)
--- gcc/fortran/gfortran.h (working copy)
*************** void gfc_add_component_ref (gfc_expr *,
*** 3168,3173 ****
--- 3168,3174 ----
  void gfc_add_class_array_ref (gfc_expr *);
  #define gfc_add_data_component(e) gfc_add_component_ref(e,"_data")
  #define gfc_add_vptr_component(e) gfc_add_component_ref(e,"_vptr")
+ #define gfc_add_len_component(e) gfc_add_component_ref(e,"_len")
  #define gfc_add_hash_component(e) gfc_add_component_ref(e,"_hash")
  #define gfc_add_size_component(e) gfc_add_component_ref(e,"_size")
  #define gfc_add_def_init_component(e) gfc_add_component_ref(e,"_def_init")
Index: gcc/fortran/trans-array
Re: [Patch, Fortran] Extend (lib)coarray API/ABI documentation
On Sat, Mar 21, 2015 at 6:19 AM, Dominique Dhumieres wrote:
> Dear Tobias,
>
> Revision r221550 breaks bootstrap on platforms with recent makeinfo (mine is
> 5.2):
> see https://gcc.gnu.org/ml/gcc-regression/2015-03/. The error is
>
> ../../work/gcc/fortran/gfortran.texi:3850: @code missing close brace
> ../../work/gcc/fortran/gfortran.texi:3851: misplaced }
> ../../work/gcc/fortran/Make-lang.in:185: recipe for target
> 'doc/gfortran.info' failed
>
> and is fixed with the following patch
>
> --- ../_clean/gcc/fortran/gfortran.texi 2015-03-21 10:40:46.0 +0100
> +++ gcc/fortran/gfortran.texi 2015-03-21 11:27:20.0 +0100
> @@ -3847,8 +3847,8 @@ an error message; may be NULL
> @item @var{errmsg_len} @tab the buffer size of errmsg.
> @end multitable
>
> -@item @emph{NOTE} A simple implementation could be a simple @code{__asm__
> -__volatile__ ("":::"memory)} to prevent code movements.
> +@item @emph{NOTE} A simple implementation could be a simple
> +@code{__asm__ __volatile__ ("":::"memory)} to prevent code movements.
> @end table
>

I checked this in to restore the gcc build.

--
H.J.
---
Index: ChangeLog
===================================================================
--- ChangeLog (revision 221551)
+++ ChangeLog (working copy)
@@ -1,3 +1,8 @@
+2015-03-21 H.J. Lu
+
+        * gfortran.texi (_gfortran_caf_sync_memory): Put @{xxx} in one
+        line.
+
 2015-03-21 Tobias Burnus
 
         * gfortran.texi (_gfortran_caf_sync_all, _gfortran_caf_sync_images,
Index: gfortran.texi
===================================================================
--- gfortran.texi (revision 221551)
+++ gfortran.texi (working copy)
@@ -3847,8 +3847,8 @@
 @item @var{errmsg_len} @tab the buffer size of errmsg.
 @end multitable
 
-@item @emph{NOTE} A simple implementation could be a simple @code{__asm__
-__volatile__ ("":::"memory)} to prevent code movements.
+@item @emph{NOTE} A simple implementation could be a simple
+@code{__asm__ __volatile__ ("":::"memory)} to prevent code movements.
 @end table
Re: [Patch, Fortran, pr55901, v1] [OOP] type is (character(len=*)) misinterpreted as array
On 03/21/2015 07:11 AM, Paul Richard Thomas wrote:
--- snip ---
> I would very much like to say that this is OK for trunk but we are hard
> up against the end of stage 4 and so it should really wait for
> backporting to 5.2.

IMHO, since gfortran is not release critical, we should consider, in the
interest of progress, committing this to trunk now. It will give much
needed exposure to OOP features and allow users to exercise the code.

(Subject to release manager approval)

Regards,

Jerry
[committed] Don't run testsuite/libgomp.oacc-c-c++-common/reduction-4.c on hppa*-*-hpux*
This test fails on targets that don't have complex.h. The standard
c99_runtime check doesn't work in the libgomp testsuite, so I explicitly
disabled the test on hppa*-*-hpux*.

Tested on hppa64-hp-hpux11.11 and hppa2.0w-hp-hpux11.11. Committed to
trunk.

Dave
--
John David Anglin dave.ang...@bell.net

2015-03-21 John David Anglin

        * testsuite/libgomp.oacc-c-c++-common/reduction-4.c: Don't run on
        hppa*-*-hpux*.

Index: testsuite/libgomp.oacc-c-c++-common/reduction-4.c
===================================================================
--- testsuite/libgomp.oacc-c-c++-common/reduction-4.c (revision 221555)
+++ testsuite/libgomp.oacc-c-c++-common/reduction-4.c (working copy)
@@ -1,4 +1,4 @@
-/* { dg-do run } */
+/* { dg-do run { target { ! { hppa*-*-hpux* } } } } */
 
 /* complex reductions. */
Re: [patch] Fix stack allocation oddity
On Thu, Nov 14, 2013 at 3:52 AM, Eric Botcazou wrote:
> Hi,
>
> we have a test in the gnat.dg testsuite (stack_usage1.adb) which checks that
> the allocation of big temporaries created in non-overlapping blocks on the
> stack is optimal, i.e. that they share a stack slot. It is run at -O0 and
> passes. If you run it at -O2, it also passes. Now, if you run it at -O1, it
> fails and that's a regression from the pre-TREE_CLOBBER_P era.
>
> The problem is that, when optimization is enabled, DECL_IGNORED_P variables
> are removed from blocks by remove_unused_scope_block_p and moved to the
> toplevel. Now defer_stack_allocation has:
>
>   /* Variables in the outermost scope automatically conflict with
>      every other variable. The only reason to want to defer them
>      at all is that, after sorting, we can more efficiently pack
>      small variables in the stack frame. Continue to defer at -O2. */
>   if (toplevel && optimize < 2)
>     return false;
>
> The comment is slightly obsolete in the TREE_CLOBBER_P era, since toplevel
> variables don't necessarily conflict with each other, for example the above
> variables moved to toplevel by remove_unused_scope_block_p.
>
> We don't think that we need to tweak again remove_unused_scope_block_p in the
> TREE_CLOBBER_P era; instead we can defer the allocation of big DECL_IGNORED_P
> variables at toplevel from defer_stack_allocation.
>
> Tested on x86_64-suse-linux, OK for the mainline?
>
>
> 2013-11-14 Olivier Hainque
>
>         * cfgexpand.c (defer_stack_allocation): When optimization is enabled,
>         defer allocation of DECL_IGNORED_P variables at toplevel unless really
>         small. Factorize size threshold computation from the existing one.
>         (expand_used_vars): Refine comment.
>

This caused:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65504

--
H.J.
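
For readers who want to reproduce the kind of situation Eric describes without the Ada test, here is a rough C sketch of two big temporaries living in non-overlapping blocks. It is only an illustration: the names BIG, sum_filled and two_scopes are made up for this example, it is not the stack_usage1.adb test, and whether the two arrays end up sharing a stack slot (or are optimized away altogether) depends on the compiler version and optimization level.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { BIG = 4096 };   /* hypothetical size of a "big" temporary */

/* Hypothetical helper: fill the buffer with V and return the sum of its
   bytes, so that each temporary is really written and read.  */
unsigned sum_filled(unsigned char *buf, size_t n, unsigned char v)
{
    unsigned s = 0;

    assert(buf != NULL && n > 0);
    memset(buf, v, n);
    for (size_t i = 0; i < n; i++)
        s += buf[i];
    return s;
}

/* Two big temporaries whose lifetimes do not overlap.  A compiler is
   free to give A and B the same stack slot, which is the property the
   quoted test checks for.  */
unsigned two_scopes(void)
{
    unsigned total = 0;

    {
        unsigned char a[BIG];
        total += sum_filled(a, sizeof a, 1);
    }
    {
        unsigned char b[BIG];
        total += sum_filled(b, sizeof b, 2);
    }
    return total;
}
```

One way to look at the effect is to compile this with GCC's -fstack-usage option at -O0, -O1 and -O2 and compare the frame size reported for two_scopes in the generated .su file. No particular numbers are claimed here.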
[patch, doc] tidy Pointer Bounds Checker documentation
Ilya, thanks for getting the -fcheck-pointer-bounds and related
documentation checked in a while back. I've committed this further
patch to clean it up a little bit -- some grammar and markup
corrections, plus I added some more index entries and cross-references.

As I've discussed previously in another thread, as a longer-term project
I'd really like to reorganize the whole manual so we can treat program
annotation features in their own sections instead of mixing them up with
debug options (etc), but this will have to do for the upcoming 5.0
release.

-Sandra

2015-03-21 Sandra Loosemore

        gcc/
        * doc/invoke.texi (-fcheck-pointer-bounds): Copy-edit, add
        additional index entries and cross-references.
        (-fchkp-check-incomplete-type): Likewise.
        (-fchkp-first-field-has-own-bounds): Likewise.
        (-fchkp-narrow-to-innermost-array): Likewise.
        (-fchkp-use-fast-string-functions): Likewise.
        (-fchkp-use-nochk-string-functions): Likewise.
        (-fchkp-use-static-const-bounds): Likewise.
        (-fchkp-treat-zero-dynamic-size-as-infinite): Likewise.
        (-fchkp-instrument-marked-only): Likewise.
        (-fchkp-use-wrappers): Likewise.
        (-static-libmpx): Likewise.
        (-static-libmpxwrappers): Likewise.
        * doc/extend.texi (bnd_legacy): Likewise.
        (bnd_instrument): Likewise.
        (bnd_variable_size): Likewise.
        (Pointer Bounds Checker builtins): Likewise.

Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi (revision 221548)
+++ gcc/doc/invoke.texi (working copy)
@@ -5843,31 +5843,42 @@ is usable even in freestanding environme
 @item -fcheck-pointer-bounds
 @opindex fcheck-pointer-bounds
 @opindex fno-check-pointer-bounds
+@cindex Pointer Bounds Checker options
 Enable Pointer Bounds Checker instrumentation. Each memory reference
-is instrumented with checks of pointer used for memory access against
-bounds associated with that pointer. Generated instrumentation may
-be controlled by various @option{-fchkp-*} options. Currently there
-is only Intel MPX based implementation available, thus i386 target
-and @option{-mmpx} are required. MPX based instrumentation requires
-a runtime library to enable MPX in a hardware and handle bounds
+is instrumented with checks of the pointer used for memory access against
+bounds associated with that pointer.
+
+Currently there
+is only an implementation for Intel MPX available, thus x86 target
+and @option{-mmpx} are required to enable this feature.
+MPX-based instrumentation requires
+a runtime library to enable MPX in hardware and handle bounds
 violation signals. By default when @option{-fcheck-pointer-bounds}
 and @option{-mmpx} options are used to link a program, the GCC driver
-links against @option{libmpx} runtime library. MPX based instrumentation
-may be used for a debugging and also it may be included into a release
-version to increase program security. Depending on usage you may
-put different requirements to runtime library. Current version
- of MPX runtime library is more oriented to be used as a debugging
+links against the @file{libmpx} runtime library. MPX-based instrumentation
+may be used for debugging and also may be included in production code
+to increase program security. Depending on usage, you may
+have different requirements for the runtime library. The current version
+of the MPX runtime library is more oriented for use as a debugging
 tool. MPX runtime library usage implies @option{-lpthread}. See
 also @option{-static-libmpx}. The runtime library behavior can be
 influenced using various @env{CHKP_RT_*} environment variables. See
 @uref{https://gcc.gnu.org/wiki/Intel%20MPX%20support%20in%20the%20GCC%20compiler}
 for more details.
 
+Generated instrumentation may be controlled by various
+@option{-fchkp-*} options and by the @code{bnd_variable_size}
+structure field attribute (@pxref{Type Attributes}) and
+@code{bnd_legacy}, and @code{bnd_instrument} function attributes
+(@pxref{Function Attributes}). GCC also provides a number of built-in
+functions for controlling the Pointer Bounds Checker. @xref{Pointer
+Bounds Checker builtins}, for more information.
+
 @item -fchkp-check-incomplete-type
 @opindex fchkp-check-incomplete-type
 @opindex fno-chkp-check-incomplete-type
 Generate pointer bounds checks for variables with incomplete type.
-Enabled by default
+Enabled by default.
 
 @item -fchkp-narrow-bounds
 @opindex fchkp-narrow-bounds
@@ -5880,15 +5891,15 @@ and @option{-fchkp-first-field-has-own-b
 @item -fchkp-first-field-has-own-bounds
 @opindex fchkp-first-field-has-own-bounds
 @opindex fno-chkp-first-field-has-own-bounds
-Forces Pointer Bounds Checker to use narrowed bounds for address of the
-first field in the structure. By default pointer to the first field has
-the same bounds as pointer to the whole structure.
+Forces Pointer Bounds Checker to use narrowed bounds for the address of the
+first field in the structure. By default a pointer to the first field has
+the sam
Re: [PATCH] pr 63354 - gcc -pg -mprofile-kernel creates unused stack frames on leaf functions on ppc64le
Hi Martin,

I've applied your latest patch to top of trunk and looked at the code
gen on powerpc-darwin9 (and a cross from x86-64-darwin12 =>
powerpc64-linux-gnu).

On 15 Mar 2015, at 23:39, Martin Sebor wrote:

> On 03/14/2015 08:34 AM, Segher Boessenkool wrote:
>> On Fri, Mar 13, 2015 at 03:54:57PM -0600, Martin Sebor wrote:
>>> Attached is a patch that eliminates the unused stack frame
>>> allocated by gcc 5 with -pg -mprofile-kernel on powerpc64le
>>> and brings the code into parity with previous gcc versions.
>>>
>>> The patch doesn't do anything to change the emitted code
>>> when -mprofile-kernel is used without -pg. Since the former
>>> option isn't fully documented (as noted in pr 65372) it's
>>> unclear what effect it should be expected to have without
>>> -pg.
>>
>> -mprofile-kernel does nothing without profiling enabled. Maybe it
>> should just have been called -pk or something horrid like that.
>>
>> The effect it should have is to do what the only user of the option
>> (the 64-bit PowerPC Linux kernel) wants. The effect it does have
>> is to make the 64-bit ABI more like the 32-bit ABI for mcount.
>
> Thanks for the review and the clarification. FWIW, I mentioned
> -pg because the reporter had noted that in prior versions of
> GCC specifying -pg in addition to -mprofile-kernel wasn't
> necessary to get the expected effect.
>
>>
>>
>>> 2015-03-13 Anton Blanchard
>>>
>>> PR target/63354
>>> * gcc/config/rs6000/linux64.h (ARGET_KEEP_LEAF_WHEN_PROFILED): Define.
>>                                 ^ typo

This ^ will cause a bootstrap fail for every rs6000 target that doesn't
include linux64.h (because rs6000_keep_leaf_when_profiled will be
"defined but unused").

Since ISTM you intend this to apply to all rs6000 sub-targets, you might
as well move it to rs6000.h?

>>
>>> * cc/config/rs6000/rs6000.c (rs6000_keep_leaf_when_profiled). New
>>    ^ typo                                                    ^ typo
>>
>> It shouldn't have "gcc/" in the path names at all, actually.
>
> Sorry, I must have mangled the ChangeLog somehow while copying
> it from one terminal to another. I fixed it in the new patch
> (attached) along with the other issues you pointed out.
>
> I tested the changes in powerpc64*-linux-* native builds and on
> an x86_64 host in a build for the powerpc-unknown-linux-gnu and
> powerpc64-apple-darwin targets. Of these, the -mprofile-kernel
> option is only accepted for powerpc64*-linux-* (which was also
> confirmed by inspecting the sources) so I adjusted the test
> target accordingly and kept the body of
> rs6000_keep_leaf_when_profiled you suggested.
>
> Martin
>
>>
>>> +/* -mprofile-kernel code calls mcount before the function prolog,
>>
>> "prologue".
>>
>>> +   so a profiled leaf function should stay a leaf function. */
>>> +
>>> +static bool
>>> +rs6000_keep_leaf_when_profiled (void)
>>> +{
>>> +  return TARGET_PROFILE_KERNEL;
>>> +}
>>
>> Something like
>>
>>   switch (DEFAULT_ABI)
>>     {
>>     case ABI_AIX:
>>     case ABI_ELFv2:
>>       return TARGET_PROFILE_KERNEL;
>>
>>     default:
>>       return true;
>>     }
>>
>> although I'm not sure about Darwin here. More conservative is to
>> return false for anything untested, of course.

The change is a no-op on Darwin: since we pass a parameter to mcount, a
stack frame is always forced.

>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/powerpc/pr63354.c
>>> @@ -0,0 +1,10 @@
>>> +/* { dg-do compile { target { powerpc*-*-* } } } */
>>> +/* { dg-options "-O2 -pg -mprofile-kernel" } */
>>> +
>>> +int foo (void)
>>> +{
>>> +  return 1;
>>> +}
>>> +
>>> +/* { dg-final { scan-assembler "bl _mcount" } } */
>>> +/* { dg-final { scan-assembler-not "\(addi|stdu\) 1," } } */
>>
>> Either you should run this only on AIX/ELFv2 ABIs, or you want to
>> test for "stwu" as well. Bare "1" does not work for all assemblers
>> (only Darwin again?)

A bare register # will, indeed, fail for Darwin's native assembler
(which expects r#).

cheers
Iain

>>
>>
>> Segher
>>
>
>
[patch,doc] add code markup on Cilk Plus builtins
While I was working on something else, I noticed that the list of Cilk
Plus built-in functions was missing @code markup. I've checked in this
quick patch to fix that.

-Sandra

2015-03-21 Sandra Loosemore

        gcc/
        * doc/extend.texi (Cilk Plus Builtins): Add markup.

Index: gcc/doc/extend.texi
===================================================================
--- gcc/doc/extend.texi (revision 221558)
+++ gcc/doc/extend.texi (working copy)
@@ -8916,19 +8916,19 @@ GCC provides support for the following b
 is enabled. Cilk Plus can be enabled using the @option{-fcilkplus} flag.
 
 @itemize @bullet
-@item __sec_implicit_index
-@item __sec_reduce
-@item __sec_reduce_add
-@item __sec_reduce_all_nonzero
-@item __sec_reduce_all_zero
-@item __sec_reduce_any_nonzero
-@item __sec_reduce_any_zero
-@item __sec_reduce_max
-@item __sec_reduce_min
-@item __sec_reduce_max_ind
-@item __sec_reduce_min_ind
-@item __sec_reduce_mul
-@item __sec_reduce_mutating
+@item @code{__sec_implicit_index}
+@item @code{__sec_reduce}
+@item @code{__sec_reduce_add}
+@item @code{__sec_reduce_all_nonzero}
+@item @code{__sec_reduce_all_zero}
+@item @code{__sec_reduce_any_nonzero}
+@item @code{__sec_reduce_any_zero}
+@item @code{__sec_reduce_max}
+@item @code{__sec_reduce_min}
+@item @code{__sec_reduce_max_ind}
+@item @code{__sec_reduce_min_ind}
+@item @code{__sec_reduce_mul}
+@item @code{__sec_reduce_mutating}
 @end itemize
 
 Further details and examples about these built-in functions are described
Re: [Patch, Fortran] Extend (lib)coarray API/ABI documentation
Dear Tobias,

On 21 Mar 2015, at 14:28, H.J. Lu wrote:

> On Sat, Mar 21, 2015 at 6:19 AM, Dominique Dhumieres wrote:

A couple of minor nits that Dominique and I spotted while discussing
this:

> -@item @emph{NOTE} A simple implementation could be a simple @code{__asm__

maybe "A simple implementation could be " ... would read more smoothly?

> -__volatile__ ("":::"memory)} to prevent code movements.
> +@item @emph{NOTE} A simple implementation could be a simple
> +@code{__asm__ __volatile__ ("":::"memory)} to prevent code movements.
> @end table

also:

@code{__asm__ __volatile__ ("":::"memory)} <= seems to be missing a
quotation mark.

i.e. __asm__ __volatile__ ("" ::: "memory")

I wonder if the latter was somehow confusing the newer edition of
texinfo?

cheers,
Iain
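
For readers unfamiliar with the idiom being quoted, here is a minimal sketch of an empty asm statement with a "memory" clobber used as a compiler-level barrier. The names compiler_barrier, publish, consume, shared_data and shared_flag are invented for this illustration and are not part of libcaf or gfortran. The sketch assumes GCC-style extended asm, and it only constrains the compiler: it is not a hardware memory fence and says nothing about what a real _gfortran_caf_sync_memory implementation needs.

```c
#include <assert.h>

/* Empty asm with a "memory" clobber: emits no instructions, but tells
   the compiler that memory may have changed, so it must not move or
   cache memory accesses across this point.  */
static inline void compiler_barrier(void)
{
    __asm__ __volatile__ ("" ::: "memory");
}

/* Hypothetical shared state, for illustration only.  */
static int shared_data;
static int shared_flag;

/* Store the payload, then set the flag; the barrier keeps the compiler
   from reordering the two stores.  Negative values are reserved for
   the "not ready" result of consume().  */
void publish(int value)
{
    assert(value >= 0);
    shared_data = value;
    compiler_barrier();
    shared_flag = 1;
}

/* Return the payload if the flag is set, or -1 if nothing has been
   published yet.  */
int consume(void)
{
    if (!shared_flag)
        return -1;
    compiler_barrier();
    return shared_data;
}
```

Note the closing quotation mark after memory, which is the character Iain points out is missing from the texinfo source.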
Re: [PATCH][3/3][PR65460] Mark offloaded functions as parallelized
On 20-03-15 12:38, Tom de Vries wrote:
> On 19-03-15 12:05, Tom de Vries wrote:
>> On 18-03-15 18:22, Tom de Vries wrote:
>>> Hi,
>>>
>>> this patch fixes PR65460.
>>>
>>> The patch marks offloaded functions as parallelized, which means the
>>> parloops pass no longer attempts to modify that function.
>>
>> Updated patch to postpone mark_parallelized_function until the
>> corresponding cgraph_node is available, to ensure it works with the
>> updated mark_parallelized_function from patch 2/3.
>
> Updated to eliminate mark_parallelized_function.
>
> Bootstrapped and reg-tested on x86_64.
>
> OK for stage4?

Thomas,

as requested, applied to gomp-4_0-branch.

Thanks,
- Tom
[patch, doc] fix "the @option{...}" usage
When I was looking at something else I noticed a description like

"The @option{-foo} does blah"

instead of either

"@option{-foo} does blah" (-foo is a name and doesn't need an article)

or

"The @option{-foo} option does blah" (-foo is an adjective modifying
"the option")

I made a quick check and found several other instances like this, too.
I've checked in the attached patch to fix them.

-Sandra

2015-03-21 Sandra Loosemore

        gcc/
        * doc/invoke.texi (-fno-diagnostics-show-caret): Fix usage of
        "the @option{...}".
        (-Wopenmp-simd): Likewise.
        (-fsanitize-recover): Likewise.
        (-fsanitize-undefined-trap-on-error): Likewise.
        (-flto): Likewise.
        (tracer-dynamic-coverage-feedback): Likewise.
        (reorder-block-duplicate-feedback): Likewise.
        (loop-unroll-jam-size): Likewise.
        (-B): Likewise.
        (-I-): Likewise.
        (-mabs=legacy): Likewise.
        (-mupper-regs-df): Likewise.
        (-mupper-regs-sf): Likewise.
        (-mpointers-to-nested-functions): Likewise.

Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi (revision 221558)
+++ gcc/doc/invoke.texi (working copy)
@@ -3253,7 +3253,7 @@ option is known to the diagnostic machin
 By default, each diagnostic emitted includes the original source line
 and a caret '^' indicating the column. This option suppresses this
 information. The source line is truncated to @var{n} characters, if
-the @option{-fmessage-length=n} is given. When the output is done
+the @option{-fmessage-length=n} option is given. When the output is done
 to the terminal, the width is limited to the width given by the
 @env{COLUMNS} environment variable or, if not set, to the terminal width.
 
@@ -5157,8 +5157,8 @@ Requires @option{-flto-odr-type-merging}
 @item -Wopenmp-simd
 @opindex Wopenm-simd
 Warn if the vectorizer cost model overrides the OpenMP or the Cilk Plus
-simd directive set by user. The @option{-fsimd-cost-model=unlimited} can
-be used to relax the cost model.
+simd directive set by user. The @option{-fsimd-cost-model=unlimited}
+option can be used to relax the cost model.
 
 @item -Woverride-init @r{(C and Objective-C only)}
 @opindex Woverride-init
@@ -5810,7 +5810,8 @@ for a sanitizer component causes it to a
 running the program as if no error happened. This means multiple
 runtime errors can be reported in a single program run, and the exit
 code of the program may indicate success even when errors
-have been reported. The @option{-fno-sanitize-recover=} can be used to alter
+have been reported. The @option{-fno-sanitize-recover=} option
+can be used to alter
 this behavior: only the first detected error is reported
 and program then exits with a non-zero exit code.
 
@@ -5834,7 +5835,7 @@ Similarly @option{-fno-sanitize-recover}
 
 @item -fsanitize-undefined-trap-on-error
 @opindex fsanitize-undefined-trap-on-error
-The @option{-fsanitize-undefined-trap-on-error} instructs the compiler to
+The @option{-fsanitize-undefined-trap-on-error} option instructs the compiler to
 report undefined behavior using @code{__builtin_trap} rather than
 a @code{libubsan} library routine. The advantage of this is that the
 @code{libubsan} library is not needed and is not linked in, so this
@@ -9259,7 +9260,8 @@ them as usual to produce @file{myprog}.
 The only important thing to keep in mind is that to enable link-time
 optimizations you need to use the GCC driver to perform the link-step.
 GCC then automatically performs link-time optimization if any of the
-objects involved were compiled with the @option{-flto}. You generally
+objects involved were compiled with the @option{-flto} command-line option.
+You generally
 should specify the optimization options to be used for link-time
 optimization though GCC tries to be clever at guessing an
 optimization level to use from the options used at compile-time
@@ -10446,7 +10448,8 @@ This value is used to limit superblock f
 executed instructions is covered. This limits unnecessary code size
 expansion.
 
-The @option{tracer-dynamic-coverage-feedback} is used only when profile
+The @option{tracer-dynamic-coverage-feedback} parameter
+is used only when profile
 feedback is available. The real profiles (as opposed to statically estimated
 ones) are much less balanced allowing the threshold to be larger value.
 
@@ -10534,7 +10537,8 @@ branch or duplicate the code on its dest
 estimated size is smaller than this value multiplied by the estimated size of
 unconditional jump in the hot spots of the program.
 
-The @option{reorder-block-duplicate-feedback} is used only when profile
+The @option{reorder-block-duplicate-feedback} parameter
+is used only when profile
 feedback is available. It may be set to higher values than
 @option{reorder-block-duplicate} since information about the hot spots is more
 accurate.
@@ -10811,7 +10815,7 @@ length can be changed using the @option{
 parameter. The default value is 51 iterations.
 
 @item loop-unroll-jam-size
-Specify the unroll
[patch, nios2] implement TARGET_ASM_OUTPUT_MI_THUNK
Per Richard Biener's encouragement to go after low-hanging fruit in
cleaning up test results for non-primary/secondary ports, I've checked
in this patch which fixes several FAILs in the g++ testsuite for nios2.
Chung-Lin already wrote the patch for our local tree some time ago and
I've just re-tested it on mainline head before committing it.

The nios2 back end didn't previously implement
TARGET_ASM_OUTPUT_MI_THUNK. The approach here is similar to what other
backends do, but it got a little messy because we had to modify the PIC
helpers to allow the use of a specific temporary register RTX passed in
as a parameter instead of always allocating a new pseudo. We use R2 for
this purpose in the thunk (call-clobbered, but not used for parameter
passing).

-Sandra

2015-03-21 Chung-Lin Tang
           Sandra Loosemore

        gcc/
        * config/nios2/nios2-protos.h (nios2_adjust_call_address):
        Adjust function parameter declaration.
        * config/nios2/nios2.md (call,call_value,sibcall,sibcall_value):
        Update arguments to nios2_adjust_call_address().
        (sibcall_internal): Rename from *sibcall.
        (sibcall_value_internal): Rename from *sibcall_value.
        * config/nios2/nios2.c (nios2_emit_add_constant): New function.
        (nios2_large_got_address): Add target temp reg parameter.
        (nios2_got_address): Adjust call to nios2_large_got_address, add
        force_reg around it.
        (nios2_load_pic_address): Add target temp reg parameter, replace
        call to nios2_got_address with corresponding code.
        (nios2_legitimize_constant_address): Update call to
        nios2_load_pic_address.
        (nios2_adjust_call_address): Add temp reg parameter, update PIC
        case to use temp reg for PIC loading purposes.
        (nios2_asm_output_mi_thunk): Implement
        TARGET_ASM_OUTPUT_MI_THUNK.
        (TARGET_ASM_CAN_OUTPUT_MI_THUNK): Define.
        (TARGET_ASM_OUTPUT_MI_THUNK): Likewise.
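
As background for readers who have not met multiple-inheritance thunks, the sketch below shows in plain C the job a this-adjusting thunk performs: add a constant delta to the incoming object pointer, then forward to the real function. It is not the nios2 implementation and does not use any GCC internals. All struct and function names are hypothetical, a real thunk is emitted as a few assembly instructions ending in a jump rather than as a C function, and the optional vcall offset adjustment is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Two hypothetical "base classes" and a "derived class" laid out the
   way a C++ compiler might lay out multiple inheritance.  */
struct base_a { int a; };
struct base_b { int b; };

struct derived {
    struct base_a a_part;   /* first base, at offset 0 */
    struct base_b b_part;   /* second base, at a nonzero offset */
    int d;
};

/* The "real" method: expects a pointer to the complete object.  */
int derived_get(struct derived *self)
{
    return self->b_part.b + self->d;
}

/* The thunk: callers holding only a base_b pointer land here.  It
   adjusts the pointer by a constant delta (the negated offset of the
   base_b subobject) and forwards to the real method.  */
int derived_get_thunk(struct base_b *this_b)
{
    assert(this_b != NULL);
    ptrdiff_t delta = -(ptrdiff_t) offsetof(struct derived, b_part);
    struct derived *self = (struct derived *) ((char *) this_b + delta);
    return derived_get(self);
}
```

In the patch described above the same adjustment has to be done with a scratch register, which is where the choice of R2 comes in.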
Index: gcc/config/nios2/nios2-protos.h
===================================================================
--- gcc/config/nios2/nios2-protos.h (revision 221369)
+++ gcc/config/nios2/nios2-protos.h (working copy)
@@ -31,7 +31,7 @@ extern void nios2_function_profiler (FIL
 #ifdef RTX_CODE
 extern int nios2_emit_move_sequence (rtx *, machine_mode);
 extern void nios2_emit_expensive_div (rtx *, machine_mode);
-extern void nios2_adjust_call_address (rtx *);
+extern void nios2_adjust_call_address (rtx *, rtx);
 extern rtx nios2_get_return_address (int);
 extern void nios2_set_return_address (rtx, rtx);
 
Index: gcc/config/nios2/nios2.md
===================================================================
--- gcc/config/nios2/nios2.md (revision 221369)
+++ gcc/config/nios2/nios2.md (working copy)
@@ -726,7 +726,7 @@
                     (match_operand 1 "" ""))
               (clobber (reg:SI RA_REGNO))])]
   ""
-  "nios2_adjust_call_address (&operands[0]);")
+  "nios2_adjust_call_address (&operands[0], NULL_RTX);")
 
 (define_expand "call_value"
   [(parallel [(set (match_operand 0 "" "")
@@ -734,7 +734,7 @@
                          (match_operand 2 "" "")))
               (clobber (reg:SI RA_REGNO))])]
   ""
-  "nios2_adjust_call_address (&operands[1]);")
+  "nios2_adjust_call_address (&operands[1], NULL_RTX);")
 
 (define_insn "*call"
   [(call (mem:QI (match_operand:SI 0 "call_operand" "i,r"))
@@ -762,7 +762,7 @@
                     (match_operand 1 "" ""))
               (return)])]
   ""
-  "nios2_adjust_call_address (&operands[0]);")
+  "nios2_adjust_call_address (&operands[0], NULL_RTX);")
 
 (define_expand "sibcall_value"
   [(parallel [(set (match_operand 0 "" "")
@@ -770,9 +770,9 @@
                          (match_operand 2 "" "")))
               (return)])]
   ""
-  "nios2_adjust_call_address (&operands[1]);")
+  "nios2_adjust_call_address (&operands[1], NULL_RTX);")
 
-(define_insn "*sibcall"
+(define_insn "sibcall_internal"
   [(call (mem:QI (match_operand:SI 0 "call_operand" "i,j"))
          (match_operand 1 "" ""))
    (return)]
@@ -782,7 +782,7 @@
    jmp\\t%0"
   [(set_attr "type" "control")])
 
-(define_insn "*sibcall_value"
+(define_insn "sibcall_value_internal"
   [(set (match_operand 0 "register_operand" "")
         (call (mem:QI (match_operand:SI 1 "call_operand" "i,j"))
               (match_operand 2 "" "")))
Index: gcc/config/nios2/nios2.c
===================================================================
--- gcc/config/nios2/nios2.c (revision 221369)
+++ gcc/config/nios2/nios2.c (working copy)
@@ -489,6 +489,21 @@ nios2_emit_stack_limit_check (void)
 /* Temp regno used inside prologue/epilogue. */
 #define TEMP_REG_NUM 8
 
+static rtx
+nios2_emit_add_constant (rtx reg, HOST_WIDE_INT immed)
+{
+  rtx insn;
+  if (SMALL_INT (immed))
+    insn = emit_insn (gen_add2_insn (reg, gen_int_mode (immed, Pmode)));
+  else
+    {
+      rtx tmp = gen_rtx_REG (Pmode, TEMP_REG_NUM);
+      emit_move_insn (tmp, gen_int_mode (immed, Pmode));
+      insn = emit_insn (gen_add2_insn (reg, tmp));
+    }
+  return insn;
+}
+
 void
 nios2_expand_prologue (void)
 {
@@ -1229,12 +1244,12 @@ nios2_unspec_offset (rtx