[Ada] Add assertion for context of subprograms
This adds an early assertion that the context of subprograms is in keeping
with their scope, to avoid obscure segfaults later if this isn't the case.

Tested on x86_64-suse-linux, applied on the mainline.


2015-02-08  Eric Botcazou

        * gcc-interface/utils.c (begin_subprog_body): Assert that the body is
        present in the same context as the declaration.


-- 
Eric Botcazou

Index: gcc-interface/utils.c
===
--- gcc-interface/utils.c (revision 220502)
+++ gcc-interface/utils.c (working copy)
@@ -3105,6 +3105,11 @@ begin_subprog_body (tree subprog_decl)
   /* This function is being defined.  */
   TREE_STATIC (subprog_decl) = 1;
 
+  /* The failure of this assertion will likely come from a wrong context for
+     the subprogram body, e.g. another procedure for a procedure declared at
+     library level.  */
+  gcc_assert (current_function_decl == decl_function_context (subprog_decl));
+
   current_function_decl = subprog_decl;
 
   /* Enter a new binding level and show that all the parameters belong to
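For readers unfamiliar with the invariant being asserted, here is a minimal sketch outside of GCC. The `struct decl` type and the `begin_body`, `end_body` and `context_matches` helpers are hypothetical stand-ins invented for this illustration; they are not gigi's data structures, and only the shape of the check mirrors the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a function declaration node.  */
struct decl
{
  const char *name;
  struct decl *context;   /* enclosing function, or NULL at library level */
};

/* Function whose body is currently being translated; NULL at top level.  */
static struct decl *current_function;

/* Return nonzero if the body of D may be started in the current context,
   i.e. if we are inside exactly the function that encloses D.  */
static int
context_matches (const struct decl *d)
{
  return current_function == d->context;
}

/* Enter the body of D, failing early on a context mismatch instead of
   letting a later pass trip over an inconsistent nesting.  */
static void
begin_body (struct decl *d)
{
  assert (context_matches (d));
  current_function = d;
}

/* Leave the body of D and return to its enclosing context.  */
static void
end_body (struct decl *d)
{
  assert (current_function == d);
  current_function = d->context;
}
```

With this model, starting the body of a library-level procedure while another procedure is still open makes `context_matches` return zero, which is the situation the comment in the patch describes.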
[Ada] Skip derived types treatment for subprogram types
This makes it so that the special code dealing with alias sets for derived
types is skipped for subprogram types.

Tested on x86_64-suse-linux, applied on the mainline.


2015-02-08  Eric Botcazou

        * gcc-interface/decl.c (gnat_to_gnu_entity): Do not bother about alias
        sets in presence of derivation for subprogram types.


-- 
Eric Botcazou

Index: gcc-interface/decl.c
===
--- gcc-interface/decl.c (revision 220509)
+++ gcc-interface/decl.c (working copy)
@@ -5138,7 +5138,9 @@ gnat_to_gnu_entity (Entity_Id gnat_entit
          to conflict with Comp2 and an alias set copy is required.
 
          The language rules ensure the parent type is already frozen here.  */
-      if (Is_Derived_Type (gnat_entity) && !type_annotate_only)
+      if (kind != E_Subprogram_Type
+          && Is_Derived_Type (gnat_entity)
+          && !type_annotate_only)
         {
           Entity_Id gnat_parent_type = Underlying_Type (Etype (gnat_entity));
           /* For constrained packed array subtypes, the implementation type is
[Ada] Fix interfacing with C++ methods on 32-bit Windows
Gigi fails to set the "thiscall" calling convention on a method imported from
C++ if the class has no virtual member functions on the C++ side.

Tested on x86-windows, applied on the mainline.


2015-02-08  Eric Botcazou

        * gcc-interface/decl.c (is_cplusplus_method): Use Is_Primitive flag to
        detect primitive operations of tagged and untagged types.


-- 
Eric Botcazou

Index: gcc-interface/decl.c
===
--- gcc-interface/decl.c (revision 220511)
+++ gcc-interface/decl.c (working copy)
@@ -5448,16 +5448,17 @@ is_cplusplus_method (Entity_Id gnat_enti
   if (Convention (gnat_entity) != Convention_CPP)
     return false;
 
-  /* This is the main case: C++ method imported as a primitive operation.  */
-  if (Is_Dispatching_Operation (gnat_entity))
+  /* This is the main case: C++ method imported as a primitive operation.
+     Note that a C++ class with no virtual functions can be imported as a
+     limited record type so the operation is not necessarily dispatching.  */
+  if (Is_Primitive (gnat_entity))
     return true;
 
   /* A thunk needs to be handled like its associated primitive operation.  */
   if (Is_Subprogram (gnat_entity) && Is_Thunk (gnat_entity))
     return true;
 
-  /* C++ classes with no virtual functions can be imported as limited
-     record types, but we need to return true for the constructors.  */
+  /* A constructor is a method on the C++ side.  */
   if (Is_Constructor (gnat_entity))
     return true;
 
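The effect of switching from the dispatching test to the primitive test can be sketched with a toy predicate. Everything below (`struct entity`, its fields, `cplusplus_method_p`) is a hypothetical model written for this note. It covers only the checks visible in the hunk above and assumes that every dispatching operation is also primitive.

```c
#include <assert.h>
#include <stdbool.h>

enum convention { CONV_ADA, CONV_C, CONV_CPP };

/* Hypothetical flattened view of the entity flags used by the hunk.  */
struct entity
{
  enum convention convention;
  bool is_subprogram;
  bool is_primitive;     /* primitive operation of a tagged or untagged type */
  bool is_dispatching;   /* assumed to imply is_primitive */
  bool is_thunk;
  bool is_constructor;
};

/* Model of the patched checks: any primitive operation imported with the
   C++ convention counts as a method, whether or not it is dispatching.  */
static bool
cplusplus_method_p (const struct entity *e)
{
  assert (!e->is_dispatching || e->is_primitive);

  if (e->convention != CONV_CPP)
    return false;

  if (e->is_primitive)
    return true;

  if (e->is_subprogram && e->is_thunk)
    return true;

  if (e->is_constructor)
    return true;

  return false;
}
```

A method of a class without virtual functions is primitive but not dispatching, so the old test would have missed it while this model returns true for it.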
[Ada] Fix alignment warning on aliased formal parameter
This eliminates the alignment warning wrongly issued on an address clause
applied to a formal parameter, even though the type of the parameter has the
appropriate alignment clause.

Tested on x86_64-suse-linux, applied on the mainline.


2015-02-08  Eric Botcazou

        * gcc-interface/decl.c (gnat_to_gnu_param): Do not strip the padding
        if the parameter either is passed by reference or if the alignment
        would be lowered.


2015-02-08  Eric Botcazou

        * gnat.dg/addr7.ad[sb]: New test.
        * gnat.dg/addr8.ad[sb]: Likewise.


-- 
Eric Botcazou

Index: gcc-interface/decl.c
===
--- gcc-interface/decl.c (revision 220512)
+++ gcc-interface/decl.c (working copy)
@@ -5659,15 +5659,17 @@ gnat_to_gnu_param (Entity_Id gnat_param,
     }
 
   /* If this is either a foreign function or if the underlying type won't
-     be passed by reference, strip off possible padding type.  */
+     be passed by reference and is as aligned as the original type, strip
+     off possible padding type.  */
   if (TYPE_IS_PADDING_P (gnu_param_type))
     {
       tree unpadded_type = TREE_TYPE (TYPE_FIELDS (gnu_param_type));
 
-      if (mech == By_Reference
-          || foreign
+      if (foreign
           || (!must_pass_by_ref (unpadded_type)
-              && (mech == By_Copy || !default_pass_by_ref (unpadded_type))))
+              && mech != By_Reference
+              && (mech == By_Copy || !default_pass_by_ref (unpadded_type))
+              && TYPE_ALIGN (unpadded_type) >= TYPE_ALIGN (gnu_param_type)))
         gnu_param_type = unpadded_type;
     }
 

-- { dg-do compile }

package body Addr7 is

   procedure Proc (B: aliased Bytes) is
      O: Integer;
      for O'Address use B'Address;
   begin
      null;
   end;

end Addr7;

package Addr7 is

   type Bytes is array (1 .. 4) of Character;
   for Bytes'Alignment use 4;

   procedure Proc (B: aliased Bytes);

end Addr7;

-- { dg-do compile }

package body Addr8 is

   procedure Proc (B: Bytes) is
      O: Integer;
      for O'Address use B'Address;
   begin
      null;
   end;

end Addr8;

package Addr8 is

   type Bytes is array (1 .. 4) of Character;
   for Bytes'Alignment use 4;

   procedure Proc (B: Bytes);

end Addr8;
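The new condition is easier to read once pulled out of gigi. The sketch below restates it as a standalone predicate; `struct param_info`, `enum mechanism` and `strip_padding_p` are hypothetical names introduced here, and the alignment values used with it are illustrative rather than taken from a real target.

```c
#include <assert.h>
#include <stdbool.h>

enum mechanism { MECH_DEFAULT, MECH_BY_COPY, MECH_BY_REFERENCE };

/* Hypothetical summary of the facts the patched condition looks at.  */
struct param_info
{
  bool foreign;               /* parameter of a foreign subprogram */
  bool must_pass_by_ref;      /* unpadded type must be passed by reference */
  bool default_pass_by_ref;   /* unpadded type is by reference by default */
  enum mechanism mech;        /* requested passing mechanism */
  unsigned unpadded_align;    /* alignment of the unpadded type */
  unsigned padded_align;      /* alignment of the padded type */
};

/* Model of the patched test: apart from the foreign case, the padding is
   stripped only if the parameter is not passed by reference and stripping
   would not lower the alignment.  */
static bool
strip_padding_p (const struct param_info *p)
{
  assert (p->unpadded_align > 0 && p->padded_align > 0);

  return p->foreign
         || (!p->must_pass_by_ref
             && p->mech != MECH_BY_REFERENCE
             && (p->mech == MECH_BY_COPY || !p->default_pass_by_ref)
             && p->unpadded_align >= p->padded_align);
}
```

Under the old condition a by-reference parameter always lost its padding, and with it the stricter alignment that the address clause in the new tests relies on.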
[Patch, fortran] PR64952 - Missing temporary in assignment from elemental function
Dear All,

This came up at
https://groups.google.com/forum/#!topic/comp.lang.fortran/TvVY5j3GPmg

gfortran produces wrong result from:

  PROGRAM Main
    INTEGER :: i, index(5) = (/ (i, i = 1,5) /)
    REAL :: array(5) = (/ (i+0.0, i = 1,5) /)
    array = Fred(index,array)
    PRINT *, array
  CONTAINS
    ELEMENTAL FUNCTION Fred (n, x)
      REAL :: Fred
      INTEGER, INTENT(IN) :: n
      REAL, INTENT(IN) :: x
      ! In general, this would be in an external procedure
      Fred = x+SUM(array(:n-1))+SUM(array(n+1:))
    END FUNCTION Fred
  END PROGRAM Main

outputs
   15.000 29.000 56.000 109.00 214.00
when result should be
   5*15.0

A temporary should be produced for array = Fred(index, array). See the
clf thread for the reasoning.

In a nutshell, the reason is:
    The execution of the assignment shall have the same effect as
    if the evaluation of expr and the evaluation of all expressions
    in variable occurred before any portion of the variable is
    defined by the assignment. The evaluation of expressions within
    variable shall neither affect nor be affected by the evaluation
    of expr.

Clearly, the above code violates this requirement because of the
references to 'array' in 'Fred'. I think that we will have to provide
an attribute that marks up array valued elemental functions that have
any external array references and provide a temporary for assignment
from one of these. Clearly something less brutal could be done, such
as attaching a list of external arrays (to the elemental function,
that is) to the symbol of the elemental function and comparing them
with the lhs of an assignment. However, this works and has no
perceivable effect on Polyhedron timings.

I will change the name of the flags to potentially_aliasing.

Bootstrapped and regtested on FC21/x86_64 - OK for trunk?

Paul

2015-02-08  Paul Thomas

        PR fortran/64952
        * gfortran.h : Add 'potentially_aliased' field to symbol_attr.
        * trans.h : Add 'potentially_aliased' field to gfc_ss_info.
* resolve.c (resolve_variable): Mark elemental function symbol as 'potentially_aliased' if it has an array reference from outside its own namespace. * trans-array.c (gfc_conv_resolve_dependencies): If any ss is marked as 'potentially_aliased' generate a temporary. (gfc_walk_function_expr): If the function is marked as 'potentially_aliased', likewise mark the head gfc_ss. 2015-02-08 Paul Thomas PR fortran/64952 * gfortran.dg/finalize_28.f90: New test Index: gcc/fortran/trans-stmt.c === *** gcc/fortran/trans-stmt.c(revision 220481) --- gcc/fortran/trans-stmt.c(working copy) *** gfc_trans_deallocate (gfc_code *code) *** 5575,5585 if (expr->rank || gfc_is_coarray (expr)) { if (expr->ts.type == BT_DERIVED && expr->ts.u.derived->attr.alloc_comp && !gfc_is_finalizable (expr->ts.u.derived, NULL)) { - gfc_ref *ref; gfc_ref *last = NULL; for (ref = expr->ref; ref; ref = ref->next) if (ref->type == REF_COMPONENT) last = ref; --- 5575,5587 if (expr->rank || gfc_is_coarray (expr)) { + gfc_ref *ref; + if (expr->ts.type == BT_DERIVED && expr->ts.u.derived->attr.alloc_comp && !gfc_is_finalizable (expr->ts.u.derived, NULL)) { gfc_ref *last = NULL; + for (ref = expr->ref; ref; ref = ref->next) if (ref->type == REF_COMPONENT) last = ref; *** gfc_trans_deallocate (gfc_code *code) *** 5590,5602 && !(!last && expr->symtree->n.sym->attr.pointer)) { tmp = gfc_deallocate_alloc_comp (expr->ts.u.derived, se.expr, ! expr->rank); gfc_add_expr_to_block (&se.pre, tmp); } } ! tmp = gfc_array_deallocate (se.expr, pstat, errmsg, errlen, ! label_finish, expr); ! gfc_add_expr_to_block (&se.pre, tmp); if (al->expr->ts.type == BT_CLASS) gfc_reset_vptr (&se.pre, al->expr); } --- 5592,5636 && !(!last && expr->symtree->n.sym->attr.pointer)) { tmp = gfc_deallocate_alloc_comp (expr->ts.u.derived, se.expr, ! expr->rank); gfc_add_expr_to_block (&se.pre, tmp); } } ! ! if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (se.expr))) ! { ! tmp = gfc_array_deallocate (se.expr, pstat, errmsg, errlen, ! label_finish, expr); ! 
gfc_add_expr_to_block (&se.pre, tmp); ! } ! else if (TREE_C
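The wrong result reported for PR64952 in the message above can be reproduced by hand in C. This is only a model of the scalarized loop, not what gfortran emits: `fred`, `assign_in_place` and `assign_via_temporary` are names made up for this sketch, and indices are 0-based where the Fortran ones are 1-based.

```c
#include <assert.h>
#include <string.h>

#define N 5

/* C counterpart of Fred: X plus every element of ARRAY except the one at
   0-based index N_IDX.  ARRAY plays the role of the host-associated array.  */
static float
fred (const float *array, int n_idx, float x)
{
  assert (n_idx >= 0 && n_idx < N);

  float sum = x;
  for (int i = 0; i < N; i++)
    if (i != n_idx)
      sum += array[i];
  return sum;
}

/* Store each result straight into ARRAY, so that later calls already see
   the elements defined by earlier ones.  This is the behaviour reported.  */
static void
assign_in_place (float *array)
{
  for (int i = 0; i < N; i++)
    array[i] = fred (array, i, array[i]);
}

/* Evaluate the whole right-hand side into a temporary first, then copy,
   which is what the quoted rule on assignment requires.  */
static void
assign_via_temporary (float *array)
{
  float tmp[N];

  for (int i = 0; i < N; i++)
    tmp[i] = fred (array, i, array[i]);
  memcpy (array, tmp, sizeof tmp);
}
```

Starting from 1, 2, 3, 4, 5, the in-place loop yields 15, 29, 56, 109, 214, matching the reported output, while the version with a temporary yields 15 for every element.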
Re: Stepping down as global maintainer
On Friday 2015-02-06 16:42, Diego Novillo wrote:
> As such, I propose to become a write-after-approval maintainer
> and relinquish all the other maintainer roles I had.

Thanks for your contributions over the years, Diego!

I had a look at gcc/doc/contrib.texi and am not sure this properly
reflects all of those.  If you'd like to update this, let me know.

Gerald
Re: [PATCH] check_GNU_style.sh "80 characters exceeded" error fix
On Monday 2014-12-08 15:15, Jeff Law wrote:
>> * contrib/check_GNU_style.sh (col): Got rid of cut operation
>> from the pipe chain and instead added cut inside awk command.
> Yes.  Please install on the trunk.

I was going to apply this for Mantas (who does not have write access),
alas the patch does not apply any more and the context has changed a
bit.

Mantas, would you like to propose an updated change?

Gerald
Re: [PATCH][ARM][PING] __ARM_FP & __ARM_NEON_FP defined when -march=armv7-m
On 3 February 2015 at 17:29, Richard Earnshaw wrote: > On 06/01/15 09:40, Mantas Mikaitis wrote: >> >> Ping and changelog spaces removed. >> >> Thank you, >> Mantas M. >> >> On 18/11/14 11:58, Richard Earnshaw wrote: >>> On 18/11/14 11:30, Mantas Mikaitis wrote: Incorrect predefinitions for certain target architectures. E.g. arm7-m does not contain NEON but the defintion __ARM_NEON_FP was switched on. Similarly with armv6 and even armv2. This patch fixes the predefines for each of the different chips containing certain types of the FPU implementations. Tests: Tested on arm-none-linux-gnueabi and arm-none-linux-gnueabihf without any new regression. Manually compiled for various targets and all correct definitions were present. Is this patch ok for trunk? Mantas gcc/Changelog: * config/arm/arm.h (TARGET_NEON_FP): Removed conditional definition, define to zero if !TARGET_NEON. (TARGET_CPU_CPP_BUILTINS): Added second condition before defining __ARM_FP macro. ARM_DEFS.patch diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h index ff4ddac..325fea9 100644 --- a/gcc/config/arm/arm.h +++ b/gcc/config/arm/arm.h @@ -118,7 +118,7 @@ extern char arm_arch_name[]; if (TARGET_VFP) \ builtin_define ("__VFP_FP__");\ \ - if (TARGET_ARM_FP) \ + if (TARGET_ARM_FP && !TARGET_SOFT_FLOAT)\ >>> Wouldn't it be better to factor this into TARGET_ARM_FP? It seems odd >>> that that macro returns a set of values based on something completely >>> unavailable for the current compilation. That would also then mirror >>> the behaviour of TARGET_NEON_FP (see below) and make the internal macros >>> more consistent. >>> >>> R. >> >> Thank you. Patch updated. >> >> Ok for trunk? >> >> Mantas M. >> >> gcc/Changelog >> >> 2014-12-03 Mantas Mikaits >> >>* config/arm/arm.h (TARGET_NEON_FP): Removed conditional >> definition, define to zero if !TARGET_NEON. >> (TARGET_ARM_FP): Added !TARGET_SOFT_FLOAT into the conditional >> definition. 
>> >> gcc/testsuite/ChangeLog: >> >>* gcc.target/arm/macro_defs0.c: New test. >>* gcc.target/arm/macro_defs1.c: New test. >>* gcc.target/arm/macro_defs2.c: New test. >> >> > > OK. However, watch your ChangeLog line length (80 char limit). Also, > entries (even continuation lines) should be indented with exactly one > (hard) tab. > > R. > >> >> >> >> >> >> >> >> >> >> >> >> >> mypatch.patch >> >> >> diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h >> index ff4ddac..7d4cc39 100644 >> --- a/gcc/config/arm/arm.h >> +++ b/gcc/config/arm/arm.h >> @@ -2343,17 +2343,17 @@ extern int making_const_table; >> point types. Where bit 1 indicates 16-bit support, bit 2 indicates >> 32-bit support, bit 3 indicates 64-bit support. */ >> #define TARGET_ARM_FP\ >> - (TARGET_VFP_SINGLE ? 4 \ >> - : (TARGET_VFP_DOUBLE ? (TARGET_FP16 ? 14 : 12) : 0)) >> + (!TARGET_SOFT_FLOAT ? (TARGET_VFP_SINGLE ? 4 \ >> + : (TARGET_VFP_DOUBLE ? (TARGET_FP16 ? 14 : 12) : 0)) \ >> + : 0) >> >> >> /* Set as a bit mask indicating the available widths of floating point >> types for hardware NEON floating point. This is the same as >> TARGET_ARM_FP without the 64-bit bit set. */ >> -#ifdef TARGET_NEON >> -#define TARGET_NEON_FP \ >> - (TARGET_ARM_FP & (0xff ^ 0x08)) >> -#endif >> +#define TARGET_NEON_FP\ >> + (TARGET_NEON ? (TARGET_ARM_FP & (0xff ^ 0x08)) \ >> +: 0) >> >> /* The maximum number of parallel loads or stores we support in an ldm/stm >> instruction. 
*/ >> diff --git a/gcc/testsuite/gcc.target/arm/macro_defs0.c >> b/gcc/testsuite/gcc.target/arm/macro_defs0.c >> new file mode 100644 >> index 000..198243e >> --- /dev/null >> +++ b/gcc/testsuite/gcc.target/arm/macro_defs0.c >> @@ -0,0 +1,14 @@ >> +/* { dg-do compile } */ >> +/* { dg-skip-if "avoid conflicting multilib options" >> + { *-*-* } { "-march=*" } {"-march=armv7-m"} } */ >> +/* { dg-skip-if "avoid conflicting multilib options" >> + { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=soft" } } */ >> +/* { dg-options "-march=armv7-m -mcpu=cortex-m3 -mfloat-abi=soft -mthumb" } >> */ >> + >> +#ifdef __ARM_FP >> +#error __ARM_FP should not be defined >> +#endif >> + >> +#ifdef __ARM_NEON_FP >> +#error __ARM_NEON_FP should not be defined >> +#endif >> diff --git a/gcc/testsuite/gcc.target/arm/macro_defs1.c >> b/gcc/testsuite/gcc.target/arm/macro_defs1.c >> new file mode 100644 >> index 000..075b71b >> --- /dev/null >> +++ b/gcc/testsuite/gcc.target/arm/mac
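The bit masks computed by the reworked macros in the quoted patch can be checked with a small standalone model. The `struct fp_config` type and the `arm_fp_mask` and `neon_fp_mask` functions are hypothetical; the boolean fields stand in for the `TARGET_*` conditions and only the arithmetic is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the TARGET_* conditions used by the macros.  */
struct fp_config
{
  bool soft_float;
  bool vfp_single;
  bool vfp_double;
  bool fp16;
  bool neon;
};

/* Model of TARGET_ARM_FP after the patch: value 2 means 16-bit support,
   4 means 32-bit, 8 means 64-bit, and soft float forces the mask to 0.  */
static int
arm_fp_mask (const struct fp_config *c)
{
  if (c->soft_float)
    return 0;
  if (c->vfp_single)
    return 4;
  if (c->vfp_double)
    return c->fp16 ? 14 : 12;
  return 0;
}

/* Model of TARGET_NEON_FP after the patch: the same mask without the
   64-bit bit, and 0 when NEON is not available.  */
static int
neon_fp_mask (const struct fp_config *c)
{
  int mask = c->neon ? (arm_fp_mask (c) & (0xff ^ 0x08)) : 0;

  assert ((mask & 0x08) == 0);
  return mask;
}
```

In this model a soft-float configuration gets 0 for both masks even when the VFP fields are set, which corresponds to what macro_defs0.c expects for -march=armv7-m -mfloat-abi=soft.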
Re: [Patch, fortran] PR64952 - Missing temporary in assignment from elemental function
Dear All, Dominique has just flagged up a slight technical problem with the patch... it's not for this PR :-( Please find the correct patch attached. Paul On 8 February 2015 at 12:42, Paul Richard Thomas wrote: > Dear All, > > This came up at > https://groups.google.com/forum/#!topic/comp.lang.fortran/TvVY5j3GPmg > > gfortran produces wrong result from: > > PROGRAM Main > INTEGER :: i, index(5) = (/ (i, i = 1,5) /) > REAL :: array(5) = (/ (i+0.0, i = 1,5) /) > array = Fred(index,array) > PRINT *, array > CONTAINS > ELEMENTAL FUNCTION Fred (n, x) > REAL :: Fred > INTEGER, INTENT(IN) :: n > REAL, INTENT(IN) :: x > ! In general, this would be in an external procedure > Fred = x+SUM(array(:n-1))+SUM(array(n+1:)) > END FUNCTION Fred > END PROGRAM Main > > outputs > 15.000 29.000 56.000 109.00 > 214.00 > when result should be > 5*15.0 > > A temporary should be produced for array = Fred(index, array). See the > clf thread for the reasoning. > > In a nutshell, the reason is: > The execution of the assignment shall have the same effect as > if the evaluation of expr and the evaluation of all expressions > in variable occurred before any portion of the variable is > defined by the assignment. The evaluation of expressions within > variable shall neither affect nor be affected by the evaluation > of expr. > > Clearly, the above code violates this requirement because of the > references to 'array' in 'Fred'. I think that we will have to provide > an attribute that marks up array valued elemental functions that have > any external array references and provide a temporary for assignment > from one of these. Clearly something less brutal could be done, such > as attaching a list of external arrays (to the elemental function, > that is) to the symbol of the elemental function and comparing them > with the lhs of an assignment. However, this works and has no > perceivable effect on Polyhedron timings. > > I will change the name of the flags to potentially_aliasing. 
> > Bootstrapped and regtested on FC21/x86_64 - OK for trunk? > > Paul > > 2015-02-08 Paul Thomas > > PR fortran/64952 > * gfortran.h : Add 'potentially_aliased' field to symbol_attr. > * trans.h : Add 'potentially_aliased' field to gfc_ss_info. > * resolve.c (resolve_variable): Mark elemental function symbol > as 'potentially_aliased' if it has an array reference from > outside its own namespace. > * trans-array.c (gfc_conv_resolve_dependencies): If any ss is > marked as 'potentially_aliased' generate a temporary. > (gfc_walk_function_expr): If the function is marked as > 'potentially_aliased', likewise mark the head gfc_ss. > > 2015-02-08 Paul Thomas > > PR fortran/64952 > * gfortran.dg/finalize_28.f90: New test -- Outside of a dog, a book is a man's best friend. Inside of a dog it's too dark to read. Groucho Marx Index: gcc/fortran/gfortran.h === *** gcc/fortran/gfortran.h (revision 220482) --- gcc/fortran/gfortran.h (working copy) *** typedef struct *** 789,794 --- 789,798 cannot alias. Note that this is zero for PURE procedures. */ unsigned implicit_pure:1; + /* This set for an elemental function that contains expressions for + arrays coming from outside its namespace. */ + unsigned potentially_aliased:1; + /* This is set if the subroutine doesn't return. Currently, this is only possible for intrinsic subroutines. */ unsigned noreturn:1; Index: gcc/fortran/trans.h === *** gcc/fortran/trans.h (revision 220481) --- gcc/fortran/trans.h (working copy) *** typedef struct gfc_ss_info *** 226,231 --- 226,235 /* Suppresses precalculation of scalars in WHERE assignments. */ unsigned where:1; + /* Signals that an array argument of an elemental function might be aliased, + thereby generating a temporary in assignments. */ + unsigned potentially_aliased:1; + /* Tells whether the SS is for an actual argument which can be a NULL reference. In other words, the associated dummy argument is OPTIONAL. Used to handle elemental procedures. 
*/ Index: gcc/fortran/resolve.c === *** gcc/fortran/resolve.c (revision 220481) --- gcc/fortran/resolve.c (working copy) *** resolve_variable (gfc_expr *e) *** 5054,5059 --- 5054,5067 && gfc_current_ns->parent->parent == sym->ns))) sym->attr.host_assoc = 1; + if (sym->attr.dimension + && (sym->ns != gfc_current_ns + || sym->attr.use_assoc + || sym->attr.in_common) + && gfc_elemental (NULL) + && gfc_current_ns->proc_name->attr.funct
Re: [Patch, fortran] PR63744 accept duplicate use-rename
On 07/02/2015 14:40, Dominique Dhumieres wrote:
> Dear Mikael,
>
> Even if
>
>>> use m, only: A => X
>>> use m, only: A => X
>
> is valid, it does not make sense to use it and it is probably a typo.
> Should not gfortran emit a warning, at least with -Wall?
>
Yeah, why not?
I think we have to defer it to stage1 though.
I'll open a PR.

Mikael
Re: Stepping down as global maintainer
On Sun, Feb 8, 2015 at 6:58 AM, Gerald Pfeifer wrote:
> On Friday 2015-02-06 16:42, Diego Novillo wrote:
>> As such, I propose to become a write-after-approval maintainer
>> and relinquish all the other maintainer roles I had.
>
> Thanks for your contributions over the years, Diego!

Thanks!

> I had a look at gcc/doc/contrib.texi and am not sure this properly
> reflects all of those.  If you'd like to update this, let me know.

Oh, right.  I'll send a patch.


Diego.
Re: [PATCH][AArch64] Remove crypto extension from default for cortex-a53, cortex-a57
On 17/11/14 17:03, pins...@gmail.com wrote:
>
>> On Nov 17, 2014, at 8:59 AM, Ramana Radhakrishnan
>> wrote:
>>
>>> On Mon, Nov 17, 2014 at 2:48 PM, Kyrill Tkachov
>>> wrote:
>>> Hi all,
>>>
>>> Some configurations of Cortex-A53 and Cortex-A57 don't ship with crypto,
>>> so enabling it by default for -mcpu=cortex-a53 and cortex-a57 is
>>> inappropriate.
>>>
>>> Tested aarch64-none-elf. Reminder that at the moment all the crypto
>>> extension does is enable the use of the ACLE crypto intrinsics in arm_neon.h
>>>
>>> Ok for trunk?
>>
>> I can't ok this but ...
>>
>> Since we've changed behaviour from 4.9 I think it warrants an entry in
>> changes.html for 5.0
>
> ThunderX should also disable crypto too by default.  I will submit a patch for
> that soon too.
>

You appear to have not yet done so...

BTW, you can consider such a change pre-approved.

R.

> Thanks,
> Andrew
>
>>
>> Ramana
>>
>>>
>>> Thanks,
>>> Kyrill
>>>
>>> 2014-11-17 Kyrylo Tkachov
>>>
>>>        * config/aarch64/aarch64-cores.def (cortex-a53): Remove
>>>        AARCH64_FL_CRYPTO from feature flags.
>>>        (cortex-a57): Likewise.
>>>        (cortex-a57.cortex-a53): Likewise.
>
Re: [Patch, fortran] PR64952 - Missing temporary in assignment from elemental function
Hello Paul,

comments below

On 08/02/2015 16:24, Paul Richard Thomas wrote:
>
> Index: gcc/fortran/gfortran.h
> ===
> *** gcc/fortran/gfortran.h (revision 220482)
> --- gcc/fortran/gfortran.h (working copy)
> *** typedef struct
> *** 789,794
> --- 789,798
>        cannot alias.  Note that this is zero for PURE procedures.  */
>     unsigned implicit_pure:1;
>
> +   /* This set for an elemental function that contains expressions for
> +      arrays coming from outside its namespace.  */
> +   unsigned potentially_aliased:1;
> +
aliased is more something about pointers, so how about naming it
something like array_outer_dependency?  Anyway, that's minor.

I wonder whether we should negate the meaning, that is set the flag if
there is no external dependency.
If we can get the conditions to set it exhaustively right, both are
equivalent.  Otherwise... maybe not.

>     /* This is set if the subroutine doesn't return.  Currently, this
>        is only possible for intrinsic subroutines.  */
>     unsigned noreturn:1;
> Index: gcc/fortran/trans.h
> ===
> *** gcc/fortran/trans.h (revision 220481)
> --- gcc/fortran/trans.h (working copy)
> *** typedef struct gfc_ss_info
> *** 226,231
> --- 226,235
>     /* Suppresses precalculation of scalars in WHERE assignments.  */
>     unsigned where:1;
>
> +   /* Signals that an array argument of an elemental function might be aliased,
> +      thereby generating a temporary in assignments.  */
> +   unsigned potentially_aliased:1;
> +
>     /* Tells whether the SS is for an actual argument which can be a NULL
>        reference.  In other words, the associated dummy argument is OPTIONAL.
>        Used to handle elemental procedures.  */
> Index: gcc/fortran/resolve.c
> ===
> *** gcc/fortran/resolve.c (revision 220481)
> --- gcc/fortran/resolve.c (working copy)
> *** resolve_variable (gfc_expr *e)
> *** 5054,5059
> --- 5054,5067
>                 && gfc_current_ns->parent->parent == sym->ns)))
>       sym->attr.host_assoc = 1;
>
> +   if (sym->attr.dimension
> +       && (sym->ns != gfc_current_ns
> +           || sym->attr.use_assoc
> +           || sym->attr.in_common)
> +       && gfc_elemental (NULL)
> +       && gfc_current_ns->proc_name->attr.function)
> +     gfc_current_ns->proc_name->attr.potentially_aliased = 1;
I would expect the flag to also be copied between procedures in some
cases; namely if A calls B, and B has the flag, then A has the flag.
There is also the case of external procedures (for which the flag is not
known -> assume the worst)

> +
>   resolve_procedure:
>     if (t && !resolve_procedure_expression (e))
>       t = false;
> Index: gcc/fortran/trans-array.c
> ===
> *** gcc/fortran/trans-array.c (revision 220482)
> --- gcc/fortran/trans-array.c (working copy)
> *** gfc_conv_resolve_dependencies (gfc_loopi
> *** 4391,4396
> --- 4391,4402
>       {
>         ss_expr = ss->info->expr;
>
> +       if (ss->info->potentially_aliased)
> +         {
> +           nDepend = 1;
> +           break;
> +         }
> +
>         if (ss->info->type != GFC_SS_SECTION)
>           {
>             if (flag_realloc_lhs
> *** gfc_walk_function_expr (gfc_ss * ss, gfc
> *** 9096,9104
>     /* Walk the parameters of an elemental function.  For now we always pass
>        by reference.  */
>     if (sym->attr.elemental || (comp && comp->attr.elemental))
> !     return gfc_walk_elemental_function_args (ss, expr->value.function.actual,
>                                              gfc_get_proc_ifc_for_expr (expr),
>                                              GFC_SS_REFERENCE);
>
>     /* Scalar functions are OK as these are evaluated outside the scalarization
>        loop.  Pass back and let the caller deal with it.  */
> --- 9102,9114
>     /* Walk the parameters of an elemental function.  For now we always pass
>        by reference.  */
>     if (sym->attr.elemental || (comp && comp->attr.elemental))
> !     {
> !       ss = gfc_walk_elemental_function_args (ss, expr->value.function.actual,
>                                              gfc_get_proc_ifc_for_expr (expr),
>                                              GFC_SS_REFERENCE);
> +       if (sym->attr.potentially_aliased)
> +         ss->info->potentially_aliased = 1;
> +     }
This is somewhat hackish, potentially_aliased is a global thing, not
specific to SS, and this may end up marking gfc_ss_terminator as
potentially_aliased for example, but I don't see any other obvious way
to do it, so it's OK I guess.
Anyway, the comp && comp->attr.elemental part of the if should be
handled too (always set the flag in that case?).

I actu
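The propagation suggested in the review (a caller inherits the flag from its callees, and external procedures are assumed to have it) amounts to a small fixed-point computation. The sketch below is one possible reading of that suggestion and not code from the patch; `struct proc`, its fields and `propagate_outer_dependency` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_CALLEES 4

/* Hypothetical summary of a procedure for this illustration.  */
struct proc
{
  bool external;      /* body not visible, nothing is known about it */
  bool outer_dep;     /* references arrays from outside its own namespace */
  int n_callees;
  struct proc *callees[MAX_CALLEES];
};

/* Mark every external procedure pessimistically, then copy the flag from
   callees to callers until nothing changes any more.  */
static void
propagate_outer_dependency (struct proc **procs, int n)
{
  bool changed;

  for (int i = 0; i < n; i++)
    {
      assert (procs[i]->n_callees <= MAX_CALLEES);
      if (procs[i]->external)
        procs[i]->outer_dep = true;
    }

  do
    {
      changed = false;
      for (int i = 0; i < n; i++)
        {
          struct proc *p = procs[i];

          if (p->outer_dep)
            continue;
          for (int j = 0; j < p->n_callees; j++)
            if (p->callees[j]->outer_dep)
              {
                p->outer_dep = true;
                changed = true;
                break;
              }
        }
    }
  while (changed);
}
```

The loop terminates because the flag only ever goes from false to true, so at most one pass per procedure can change anything.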
Re: [Patch, fortran] PR64952 - Missing temporary in assignment from elemental function
Dear Mikael, Thank you very much for the review. You raise some points that I had thought about and others that I hadn't. I also realised that such things as blocks, within the elemental function would through the fix as well. I'll defer doing anything with it until tomorrow night. I reason that there is always going to be an 'ss', although I should check that it is not gfc_ss_terminator, and that it does not matter which one is flagged. I should add a comment to that effect; it's not quite as hackish as it looks, methinks. I will be back! Paul On 8 February 2015 at 18:27, Mikael Morin wrote: > Hello Paul, > > comments below > > Le 08/02/2015 16:24, Paul Richard Thomas a écrit : >> >> Index: gcc/fortran/gfortran.h >> === >> *** gcc/fortran/gfortran.h(revision 220482) >> --- gcc/fortran/gfortran.h(working copy) >> *** typedef struct >> *** 789,794 >> --- 789,798 >>cannot alias. Note that this is zero for PURE procedures. */ >> unsigned implicit_pure:1; >> >> + /* This set for an elemental function that contains expressions for >> + arrays coming from outside its namespace. */ >> + unsigned potentially_aliased:1; >> + > aliased is more something about pointers, so how about naming it > something like array_outer_dependency? Anyway, that's minor. > > I wonder whether we should negate the meaning, that is set the flag if > there is no external dependency. > If we can get the conditions to set it exhaustively right, both are > equivalent. Otherwise... maybe not. > >> /* This is set if the subroutine doesn't return. Currently, this >>is only possible for intrinsic subroutines. */ >> unsigned noreturn:1; >> Index: gcc/fortran/trans.h >> === >> *** gcc/fortran/trans.h (revision 220481) >> --- gcc/fortran/trans.h (working copy) >> *** typedef struct gfc_ss_info >> *** 226,231 >> --- 226,235 >> /* Suppresses precalculation of scalars in WHERE assignments. 
*/ >> unsigned where:1; >> >> + /* Signals that an array argument of an elemental function might be >> aliased, >> + thereby generating a temporary in assignments. */ >> + unsigned potentially_aliased:1; >> + >> /* Tells whether the SS is for an actual argument which can be a NULL >>reference. In other words, the associated dummy argument is OPTIONAL. >>Used to handle elemental procedures. */ >> Index: gcc/fortran/resolve.c >> === >> *** gcc/fortran/resolve.c (revision 220481) >> --- gcc/fortran/resolve.c (working copy) >> *** resolve_variable (gfc_expr *e) >> *** 5054,5059 >> --- 5054,5067 >> && gfc_current_ns->parent->parent == sym->ns))) >> sym->attr.host_assoc = 1; >> >> + if (sym->attr.dimension >> + && (sym->ns != gfc_current_ns >> + || sym->attr.use_assoc >> + || sym->attr.in_common) >> + && gfc_elemental (NULL) >> + && gfc_current_ns->proc_name->attr.function) >> + gfc_current_ns->proc_name->attr.potentially_aliased = 1; > I would expect the flag to also be copied between procedures in some > cases; namely if A calls B, and B has the flag, then A has the flag. > There is also the case of external procedures (for which the flag is not > known -> assume the worst) > >> + >> resolve_procedure: >> if (t && !resolve_procedure_expression (e)) >> t = false; >> Index: gcc/fortran/trans-array.c >> === >> *** gcc/fortran/trans-array.c (revision 220482) >> --- gcc/fortran/trans-array.c (working copy) >> *** gfc_conv_resolve_dependencies (gfc_loopi >> *** 4391,4396 >> --- 4391,4402 >> { >> ss_expr = ss->info->expr; >> >> + if (ss->info->potentially_aliased) >> + { >> + nDepend = 1; >> + break; >> + } >> + >> if (ss->info->type != GFC_SS_SECTION) >> { >> if (flag_realloc_lhs >> *** gfc_walk_function_expr (gfc_ss * ss, gfc >> *** 9096,9104 >> /* Walk the parameters of an elemental function. For now we always pass >>by reference. */ >> if (sym->attr.elemental || (comp && comp->attr.elemental)) >> ! 
return gfc_walk_elemental_function_args (ss, >> expr->value.function.actual, >>gfc_get_proc_ifc_for_expr (expr), >>GFC_SS_REFERENCE); >> >> /* Scalar functions are OK as these are evaluated outside the >> scalarization >>loop. Pass back and let the caller deal with it. */ >> --- 9102,9114 >> /* Walk the parameters of an elemental function. For now we always pass >>by reference. */ >> if (sym->attr.elemental || (comp && comp->att
Re: [PATCH] PR rtl-optimization/32219: optimizer causes wrong code in pic/hidden/weak symbol checking
H.J., This last version of the patch bootstraps and passes the test suite without regressions on x86_64-apple-darwin14. https://gcc.gnu.org/ml/gcc-testresults/2015-02/msg00912.html Thanks for fixing this. Jack On Sat, Feb 7, 2015 at 11:45 AM, H.J. Lu wrote: > On Sat, Feb 07, 2015 at 07:56:06AM -0800, H.J. Lu wrote: >> On Sat, Feb 07, 2015 at 10:11:01AM -0500, Jack Howarth wrote: >> > H.J,, >> > Unfortunately, the answer is yes. This patch still introduces >> > regressions in the g++ test suite.l These are all some form of... >> > >> >> It looks like Darwin depends on the old behavior of >> default_binds_local_p_1. Please try this patch. >> > > Here is an updated patch. > > > H.J. > From d010cd1ddf866f8e10fe7ad2cf483b5a872bc6ea Mon Sep 17 00:00:00 2001 > From: "H.J. Lu" > Date: Thu, 5 Feb 2015 14:28:58 -0800 > Subject: [PATCH] Handle symbol visibility/locality for PIE/PIC > > If a hidden weak symbol isn't defined in the TU, we can't assume it will > be defined in another TU at link time. It makes a difference in code > generation when compiling for PIC. If we assume that a hidden weak > undefined symbol is local, the address checking may be optimized out and > leads to the wrong code. This means that a symbol with user specified > visibility is local only if it is locally resolved or defined, not weak > or not compiling for PIC. When symbol visibility is specified in the > source, we should always output symbol visibility even if symbol isn't > local to the TU. > > If a global data symbol is defined in the TU, it is always local to the > executable, regardless if it is a common symbol or not. If we aren't > compiling for shared library, locally defined global data symbol binds > locally. > > Since some targets call default_binds_local_p_1 directly and depend on > the old behavior of default_binds_local_p_1. This patch renames > default_binds_local_p_1 to default_binds_local_p_2 and implements the > new behavior in default_binds_local_p_2 controlled by a variable. 
> The old behavior remains with default_binds_local_p_1. > > gcc/ > > PR rtl-optimization/32219 > * cgraphunit.c (varpool_node::finalize_decl): Set definition > first before calling notice_global_symbol so that it is > available to notice_global_symbol. > * varasm.c (default_binds_local_p_1): Renamed to ... > (default_binds_local_p_2): This. Resolve defined data symbol > locally if not building shared library. Resolve symbol with > user specified visibility locally only if it is locally resolved > or defined, not weak or not compiling for PIC. > (default_binds_local_p): Replace default_binds_local_p_1 with > default_binds_local_p_2. > (default_binds_local_p_1): Call default_binds_local_p_2. > (default_elf_asm_output_external): Always output visibility > specified in the source. > > gcc/testsuite/ > > PR rtl-optimization/32219 > * gcc.dg/visibility-22.c: New test. > * gcc.dg/visibility-23.c: Likewise. > * gcc.target/i386/pr32219-1.c: Likewise. > * gcc.target/i386/pr32219-2.c: Likewise. > * gcc.target/i386/pr32219-3.c: Likewise. > * gcc.target/i386/pr32219-4.c: Likewise. > * gcc.target/i386/pr32219-5.c: Likewise. > * gcc.target/i386/pr32219-6.c: Likewise. > * gcc.target/i386/pr32219-7.c: Likewise. > * gcc.target/i386/pr32219-8.c: Likewise. > * gcc.target/i386/pr64317.c: Expect GOTOFF relocation instead > of GOT relocation. 
> --- > gcc/cgraphunit.c | 4 +- > gcc/testsuite/gcc.dg/visibility-22.c | 17 + > gcc/testsuite/gcc.dg/visibility-23.c | 15 > gcc/testsuite/gcc.target/i386/pr32219-1.c | 16 > gcc/testsuite/gcc.target/i386/pr32219-2.c | 16 > gcc/testsuite/gcc.target/i386/pr32219-3.c | 17 + > gcc/testsuite/gcc.target/i386/pr32219-4.c | 17 + > gcc/testsuite/gcc.target/i386/pr32219-5.c | 16 > gcc/testsuite/gcc.target/i386/pr32219-6.c | 16 > gcc/testsuite/gcc.target/i386/pr32219-7.c | 17 + > gcc/testsuite/gcc.target/i386/pr32219-8.c | 17 + > gcc/testsuite/gcc.target/i386/pr64317.c | 2 +- > gcc/varasm.c | 61 > +-- > 13 files changed, 209 insertions(+), 22 deletions(-) > create mode 100644 gcc/testsuite/gcc.dg/visibility-22.c > create mode 100644 gcc/testsuite/gcc.dg/visibility-23.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr32219-1.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr32219-2.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr32219-3.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr32219-4.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr32219-5.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr32219-6.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr32219-7.
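The first paragraph of the commit message above (a hidden weak symbol that is not defined in the TU cannot be assumed local) can be illustrated with a short C sketch. It assumes a GCC-compatible compiler on an ELF target; `optional_hook` and `run_hook_if_present` are hypothetical names chosen for illustration and are not taken from the patch or its new tests.

```c
#include <assert.h>

/* Hypothetical hook: declared weak and hidden, and deliberately never
   defined anywhere in the program.  */
extern void optional_hook (void) __attribute__ ((weak, visibility ("hidden")));

/* Returns 1 if the hook was called, 0 if it is absent.  If the compiler
   assumed the symbol binds locally, it could fold the address test to
   "always true" and emit an unconditional call through a null address.  */
int
run_hook_if_present (void)
{
  if (optional_hook)
    {
      optional_hook ();
      return 1;
    }
  return 0;
}

/* Usage: with no definition linked in, the hook must be skipped.  */
void
run_hook_demo (void)
{
  assert (run_hook_if_present () == 0);
}
```

The address check is the part that matters: it is only meaningful as long as the compiler does not treat the undefined weak symbol as locally bound.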
Fix PR 63566 part 1
Hi, once the underlying wrong-code issue is fixed, PR63566 is about functions with aliases not being marked local. This patch prepares cgraphunit for some changes needed to make i386.c alias-safe. Bootstrapped/regtested x86_64-linux, committed. Honza PR ipa/63566 * cgraphunit.c (cgraph_node::analyze): Be sure target of thunk is aliases before trying to expand it. (cgraph_node::expand_thunk): Fix formatting. Index: cgraphunit.c === --- cgraphunit.c(revision 220509) +++ cgraphunit.c(working copy) @@ -580,8 +580,19 @@ cgraph_node::analyze (void) if (thunk.thunk_p) { - create_edge (cgraph_node::get (thunk.alias), - NULL, 0, CGRAPH_FREQ_BASE); + cgraph_node *t = cgraph_node::get (thunk.alias); + + create_edge (t, NULL, 0, CGRAPH_FREQ_BASE); + /* Target code in expand_thunk may need the thunk's target +to be analyzed, so recurse here. */ + if (!t->analyzed) + t->analyze (); + if (t->alias) + { + t = t->get_alias_target (); + if (!t->analyzed) + t->analyze (); + } if (!expand_thunk (false, false)) { thunk.alias = NULL; @@ -1515,7 +1526,8 @@ cgraph_node::expand_thunk (bool output_a current_function_decl = thunk_fndecl; /* Ensure thunks are emitted in their correct sections. */ - resolve_unique_section (thunk_fndecl, 0, flag_function_sections); + resolve_unique_section (thunk_fndecl, 0, + flag_function_sections); DECL_RESULT (thunk_fndecl) = build_decl (DECL_SOURCE_LOCATION (thunk_fndecl), @@ -1568,7 +1580,8 @@ cgraph_node::expand_thunk (bool output_a current_function_decl = thunk_fndecl; /* Ensure thunks are emitted in their correct sections. */ - resolve_unique_section (thunk_fndecl, 0, flag_function_sections); + resolve_unique_section (thunk_fndecl, 0, + flag_function_sections); DECL_IGNORED_P (thunk_fndecl) = 1; bitmap_obstack_initialize (NULL);
Fix PR 63566 part 2
Hi, this patch fixes the heuristic in ipa-split that disables splitting for functions called once (which may need revisiting anyway). When a function has an alias, other uses may come through that alias. Bootstrapped/regtested x86_64. PR ipa/63566 * ipa-split.c (execute_split_functions): Split if function has aliases. Index: ipa-split.c === --- ipa-split.c (revision 220509) +++ ipa-split.c (working copy) @@ -1736,6 +1736,7 @@ execute_split_functions (void) /* Local functions called once will be completely inlined most of time. */ || (!node->callers->next_caller && node->local.local)) && !node->address_taken + && !node->has_aliases_p () && (!flag_lto || !node->externally_visible)) { if (dump_file)
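To make the point about aliases concrete, here is a small C sketch of a function body that is reached once under its own name and once more through an alias, so counting callers of the primary name alone undercounts its uses. It assumes the GNU `alias` attribute on an ELF target; all names (`do_work`, `do_work_alias`, `runs_added_by_both_names`) are hypothetical.

```c
#include <assert.h>

static int body_runs;   /* how many times the shared body has executed */

/* The single definition.  */
int
do_work (void)
{
  return ++body_runs;
}

/* A second name bound to the same body (GNU extension).  */
int do_work_alias (void) __attribute__ ((alias ("do_work")));

/* Calls the body once per name and reports how many runs that added.  */
int
runs_added_by_both_names (void)
{
  int before = body_runs;
  do_work ();
  do_work_alias ();
  return body_runs - before;
}

/* Usage: one direct call plus one call through the alias.  */
void
alias_demo (void)
{
  assert (runs_added_by_both_names () == 2);
}
```

A "called once" test that looks only at the callers of `do_work` would miss the second use, which is the situation the added `has_aliases_p ()` check guards against.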
Fix PR 63566 part 3
Hi, this patch makes i386.c ready for the presence of aliases of local functions, and also fixes a wrong-code issue where ix86_function_sseregparm uses the caller's TARGET_SSE_MATH flags instead of the callee's. If the callee disagrees on this flag, it may get called with the wrong calling convention. I added an error message for calling a local function with the SSE ABI from a function compiled with SSE disabled. We could work harder and simply disable the SSE calling convention, but that requires whole-compilation-unit analysis, which is hard to do at ltrans. I hope it does not matter in practice. Otherwise we may want to introduce some hook and disqualify such functions from being local at WPA time. Bootstrapped/regtested x86_64-linux (including 32-bit testing), will commit it shortly. Honza PR ipa/63566 * i386.c (ix86_function_regparm): Look through aliases to see if callee is local and optimized. (ix86_function_sseregparm): Likewise; also use target's SSE math settings; error out instead of silently generating wrong code on mismatches. (init_cumulative_args): Look through aliases. Index: config/i386/i386.c === --- config/i386/i386.c (revision 220509) +++ config/i386/i386.c (working copy) @@ -5767,49 +5767,55 @@ ix86_function_regparm (const_tree type, /* Use register calling convention for local functions when possible. */ if (decl - && TREE_CODE (decl) == FUNCTION_DECL + && TREE_CODE (decl) == FUNCTION_DECL) +{ + cgraph_node *target = cgraph_node::get (decl); + if (target) + target = target->function_symbol (); + /* Caller and callee must agree on the calling convention, so checking here just optimize means that with __attribute__((optimize (...))) caller could use regparm convention and callee not, or vice versa. Instead look at whether the callee is optimized or not. */ - && opt_for_fn (decl, optimize) - && !(profile_flag && !flag_fentry)) -{ - /* FIXME: remove this CONST_CAST when cgraph.[ch] is constified. 
*/ - cgraph_local_info *i = cgraph_node::local_info (CONST_CAST_TREE (decl)); - if (i && i->local && i->can_change_signature) + if (target && opt_for_fn (target->decl, optimize) + && !(profile_flag && !flag_fentry)) { - int local_regparm, globals = 0, regno; + cgraph_local_info *i = &target->local; + if (i && i->local && i->can_change_signature) + { + int local_regparm, globals = 0, regno; - /* Make sure no regparm register is taken by a -fixed register variable. */ - for (local_regparm = 0; local_regparm < REGPARM_MAX; local_regparm++) - if (fixed_regs[local_regparm]) - break; - - /* We don't want to use regparm(3) for nested functions as -these use a static chain pointer in the third argument. */ - if (local_regparm == 3 && DECL_STATIC_CHAIN (decl)) - local_regparm = 2; - - /* In 32-bit mode save a register for the split stack. */ - if (!TARGET_64BIT && local_regparm == 3 && flag_split_stack) - local_regparm = 2; - - /* Each fixed register usage increases register pressure, -so less registers should be used for argument passing. -This functionality can be overriden by an explicit -regparm value. */ - for (regno = AX_REG; regno <= DI_REG; regno++) - if (fixed_regs[regno]) - globals++; + /* Make sure no regparm register is taken by a +fixed register variable. */ + for (local_regparm = 0; local_regparm < REGPARM_MAX; + local_regparm++) + if (fixed_regs[local_regparm]) + break; + + /* We don't want to use regparm(3) for nested functions as +these use a static chain pointer in the third argument. */ + if (local_regparm == 3 && DECL_STATIC_CHAIN (target->decl)) + local_regparm = 2; + + /* Save a register for the split stack. */ + if (local_regparm == 3 && flag_split_stack) + local_regparm = 2; + + /* Each fixed register usage increases register pressure, +so less registers should be used for argument passing. +This functionality can be overriden by an explicit +regparm value. 
*/ + for (regno = AX_REG; regno <= DI_REG; regno++) + if (fixed_regs[regno]) + globals++; - local_regparm - = globals < local_regparm ? local_regparm - globals : 0; + local_regparm + = globals < local_regparm ? local_regparm - globals : 0; - if (local_regparm > regparm) - regparm = local_regparm; + if (local_regparm > re
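The local_regparm computation in the hunk above can be modelled outside the compiler. The following C sketch is a toy model only: the constants, the six-entry register array and the function name are assumptions made for illustration, and the boolean parameters stand in for DECL_STATIC_CHAIN and flag_split_stack.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_REGPARM_MAX 3   /* assumed limit on register arguments */
#define TOY_NUM_REGS 6      /* toy stand-in for the AX_REG..DI_REG range */

/* Toy model of the computation: returns how many registers a local
   function could use for argument passing.  */
int
toy_local_regparm (const bool fixed_regs[TOY_NUM_REGS],
                   bool has_static_chain, bool split_stack)
{
  int local_regparm, globals = 0, regno;

  /* Stop at the first argument register taken by a fixed register.  */
  for (local_regparm = 0; local_regparm < TOY_REGPARM_MAX; local_regparm++)
    if (fixed_regs[local_regparm])
      break;

  /* Nested functions need the third register for the static chain.  */
  if (local_regparm == 3 && has_static_chain)
    local_regparm = 2;

  /* Keep a register free for the split stack.  */
  if (local_regparm == 3 && split_stack)
    local_regparm = 2;

  /* Every fixed register raises register pressure, so pass fewer
     arguments in registers.  */
  for (regno = 0; regno < TOY_NUM_REGS; regno++)
    if (fixed_regs[regno])
      globals++;

  return globals < local_regparm ? local_regparm - globals : 0;
}

/* Usage: a few representative inputs.  */
void
toy_local_regparm_demo (void)
{
  bool none[TOY_NUM_REGS] = { false };
  bool one_high[TOY_NUM_REGS] = { false, false, false, false, true, false };

  assert (toy_local_regparm (none, false, false) == 3);
  assert (toy_local_regparm (none, true, false) == 2);
  assert (toy_local_regparm (one_high, false, false) == 2);
}
```

In the patch itself the result is only used to raise regparm when it is larger than the value already chosen.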
[PATCH/AARCH64] Xfail ssa-dom-cse-2.c
Like https://gcc.gnu.org/ml/gcc-patches/2015-01/msg02646.html, we should xfail this testcase for aarch64 too. OK? Bootstrapped and tested on aarch64-linux-gnu. Thanks, Andrew Pinski ChangeLog: * gcc.dg/tree-ssa/ssa-dom-cse-2.c: xfail for AARCH64 also. commit 5ddfd811b7b7a0f6bcbaf78b597ae473bb36bff4 Author: Andrew Pinski Date: Sun Feb 8 12:39:01 2015 + Disable testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c for AARCH64 * gcc.dg/tree-ssa/ssa-dom-cse-2.c: Disable for AARCH64. diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c index f767a31..1b7369c 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c @@ -20,5 +20,5 @@ foo () /* See PR63679 and PR64159, if the target forces the initializer to memory then DOM is not able to perform this optimization. */ -/* { dg-final { scan-tree-dump "return 28;" "optimized" { xfail hppa*-*-* powerpc*-*-* sparc*-*-*} } } */ +/* { dg-final { scan-tree-dump "return 28;" "optimized" { xfail hppa*-*-* powerpc*-*-* sparc*-*-* aarch64*-*-* } } } */ /* { dg-final { cleanup-tree-dump "optimized" } } */
Fix PR 63566 part 4
Hi, this patch finally enables local functions for functions with aliases and thunks. Honza PR ipa/63566 * ipa-visibility.c (cgraph_node::non_local_p): Accept aliases. (cgraph_node::local_p): Remove thunk related FIXME. Index: ipa-visibility.c === --- ipa-visibility.c(revision 220509) +++ ipa-visibility.c(working copy) @@ -104,14 +104,15 @@ along with GCC; see the file COPYING3. bool cgraph_node::non_local_p (struct cgraph_node *node, void *data ATTRIBUTE_UNUSED) { - /* FIXME: Aliases can be local, but i386 gets thunks wrong then. */ - return !(node->only_called_directly_or_aliased_p () - && !node->has_aliases_p () - && node->definition - && !DECL_EXTERNAL (node->decl) - && !node->externally_visible - && !node->used_from_other_partition - && !node->in_other_partition); + return !(node->only_called_directly_or_aliased_p () + /* i386 would need update to output thunk with locak calling + ocnvetions. */ + && !node->thunk.thunk_p + && node->definition + && !DECL_EXTERNAL (node->decl) + && !node->externally_visible + && !node->used_from_other_partition + && !node->in_other_partition); } /* Return true when function can be marked local. */ @@ -121,12 +122,10 @@ cgraph_node::local_p (void) { cgraph_node *n = ultimate_alias_target (); - /* FIXME: thunks can be considered local, but we need prevent i386 - from attempting to change calling convention of them. */ if (n->thunk.thunk_p) - return false; + return n->callees->callee->local_p (); return !n->call_for_symbol_thunks_and_aliases (cgraph_node::non_local_p, - NULL, true); + NULL, true); }
Re: [PATCH/AARCH64] Xfail ssa-dom-cse-2.c
On February 8, 2015 9:39:20 PM CET, Andrew Pinski wrote: >Like https://gcc.gnu.org/ml/gcc-patches/2015-01/msg02646.html, we >should xfail this testcase for aarch64 too. > >OK? Bootstrapped and tested on aarch64-linux-gnu. OK. Thanks, Richard. >Thanks, >Andrew Pinski > >ChangeLog: >* gcc.dg/tree-ssa/ssa-dom-cse-2.c: xfail for AARCH64 also.
[Committed/AARCH64] Remove dummy GTY variable
The aarch64 back-end has a dummy GTY variable which is no longer needed, as there are now other GTY-marked variables in aarch64.c. This should speed up (very slightly) PCH generation and reading. Committed as obvious after a build and test for aarch64-elf. Thanks, Andrew Pinski ChangeLog: * config/aarch64/aarch64.c (gty_dummy): Delete. Index: gcc/config/aarch64/aarch64.c === --- gcc/config/aarch64/aarch64.c(revision 220521) +++ gcc/config/aarch64/aarch64.c(working copy) @@ -488,9 +488,6 @@ static const struct aarch64_option_exten increment address. */ static machine_mode aarch64_memory_reference_mode; -/* Used to force GTY into this file. */ -static GTY(()) int gty_dummy; - /* A table of valid AArch64 "bitmask immediate" values for logical instructions. */
Re: [wwwdocs] IPA/LTO/FDO updates for gcc-5/changes.html
On Wednesday 2014-09-24 17:25, Jan Hubicka wrote: > this patch adds list of changes to IPA/LTO/FDO before I forget about > them ;) Good work, lots of! :-) In preparation of the GCC 5.0 release I did go through this (and other changes) and made a number of editorial changes which you can find below. I went ahead and committed those. If there are further ones, or you'd like to see things differently, let me know. Gerald Index: changes.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v retrieving revision 1.76 diff -u -r1.76 changes.html --- changes.html31 Jan 2015 15:48:16 - 1.76 +++ changes.html8 Feb 2015 22:58:43 - @@ -44,29 +44,29 @@ Virtual tables are now optimized. Local aliases are used to reduce dynamic linking time of C++ virtual tables on ELF targets and data alignment has been reduced to limit data segment bloat. - New -fno-semantic-interposition flag can be used + A new -fno-semantic-interposition flag can be used to improve code quality of shared libraries where interposition of exported symbols is not allowed. Write-only variables are now detected and optimized out. With profile feedback the function inliner can now bypass --param inline-insns-auto and --param inline-insns-single limits for hot calls. - IPA reference pass was significantly sped up making it feasible + The IPA reference pass was significantly sped up making it feasible to enable -fipa-reference with -fprofile-generate. This also solves a bottleneck -seen when building Chromium with link time optimization. - Symbol table and call-graph API was reworked to C++ and +seen when building Chromium with link-time optimization. + The symbol table and call-graph API was reworked to C++ and simplified. Link-time optimization improvements: - New One Definition Rule based merging of C++ types implemented. + One Definition Rule based merging of C++ types has been implemented. Type merging enables better devirtualization and alias analysis. 
Streaming extra information needed to merge types adds about 2-6% of memory size and object size increase. This can be controlled by -flto-odr-type-merging. - GCC bootstrap now use slim LTO object files. - Memory usage and link times was improved. Tree merging was sped up, + GCC bootstrap now uses slim LTO object files. + Memory usage and link times were improved. Tree merging was sped up, memory usage of GIMPLE declarations and types was reduced, and, support for on-demand streaming of variable constructors was added. @@ -74,8 +74,9 @@ Profile precision was improved in presence of C++ inline and extern inline functions. - New gcov-tool to manipulate profiles. - Profile is now more tolerant to source file changes (this can be + The new gcov-tool utility allows manipulating + profiles. + Profiles are now more tolerant to source file changes (this can be controlled by --param profile-func-internal-id). UndefinedBehaviorSanitizer gained a few new sanitization options: @@ -155,10 +156,11 @@ Full support for https://www.cilkplus.org/";>Cilk Plus has been added to the GCC compiler. Cilk Plus is an extension to the C and C++ languages to support data and task parallelism. -New attribute no_reorder prevents reordering of selected symbols - against other such symbols or inline assembler. - This enables to link-time optimize Linux kernel without need to use - -fno-toplevel-reorder that disable several optimizations. +A new attribute no_reorder prevents reordering of +selected symbols against other such symbols or inline assembler. +This enables to link-time optimize the Linux kernel without having +to resort to -fno-toplevel-reorder that disables +several optimizations. New preprocessor constructs, __has_include and __has_include_next, to test the availability of headers have been added. 
@@ -278,13 +280,13 @@ void operator delete (void *, std::size_t) noexcept; void operator delete[] (void *, std::size_t) noexcept; - New One Definition Rule violation warning (controlled by -Wodr) + A new One Definition Rule violation warning (controlled by -Wodr) detects mismatches in type definitions and virtual table contents during link-time optimization. New warnings -Wsuggest-final-types and - -Wsuggest-final-methods helps developers - to annotate programs by final specifiers (or anonymous - namespaces) in the cases where code generation improves. + -Wsuggest-final-methods help developers + to annotate programs with final specifiers (or anony
[patch, libgfortran] Bug 57822 - I/O: "(g0)" wrongly prints "E+0000"
The attached patch fixes this by checking for the case when we are doing g0 editing and the exponent is 0. Regression tested on x86-64. For the larger kinds, we are on a different code path out of necessity, so we need to address this corner case. I will commit in a day or two as simple/obvious, with a ChangeLog for the testsuite as well. Regards, Jerry 2015-02-09 Jerry DeLisle PR libgfortran/57822 * io/write_float.def (output_float): If doing g0 editing and exponent is zero, do not emit exponent. Index: write_float.def === --- write_float.def (revision 220505) +++ write_float.def (working copy) @@ -724,7 +724,7 @@ } /* Output the exponent. */ - if (expchar) + if (expchar && !(dtp->u.p.g0_no_blanks && e == 0)) { if (expchar != ' ') { ! { dg-do run } ! PR57822 program testit character(50) :: astring write(astring, '(g0)') 0.1_4 if (test(astring)) call abort write(astring, '(g0)') 0.1_8 if (test(astring)) call abort write(astring, '(g0)') 0.1_10 if (test(astring)) call abort write(astring, '(g0)') 0.1_16 if (test(astring)) call abort contains function test (string1) result(res) character(len=*) :: string1 logical :: res res = .true. do i = 1, len(string1) if (string1(i:i) == 'E') return end do res = .false. end function end program
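The one-line change to the exponent condition can be read as a small predicate. A minimal C sketch, assuming a standalone helper (`should_emit_exponent` is a hypothetical name; in libgfortran the flag lives in `dtp->u.p.g0_no_blanks`):

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors the patched condition: an exponent is written only if there
   is an exponent character, and not when g0 editing is in effect with
   an exponent of zero.  */
bool
should_emit_exponent (char expchar, bool g0_no_blanks, int e)
{
  return expchar != '\0' && !(g0_no_blanks && e == 0);
}

/* Usage: the g0, exponent-zero case is the only one that changes.  */
void
should_emit_exponent_demo (void)
{
  assert (!should_emit_exponent ('E', true, 0));   /* newly suppressed */
  assert (should_emit_exponent ('E', true, 3));    /* unchanged */
  assert (should_emit_exponent ('E', false, 0));   /* unchanged */
}
```

Only the combination of g0 editing and a zero exponent behaves differently from the old `if (expchar)` test.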
Ping : [PATCH] [gcc, combine] PR46164: Don't combine the insns if a volatile register is contained.
Ping? > -Original Message- > From: Hale Wang [mailto:hale.w...@arm.com] > Sent: Thursday, January 29, 2015 9:58 AM > To: Hale Wang; 'Segher Boessenkool' > Cc: GCC Patches > Subject: RE: [PATCH] [gcc, combine] PR46164: Don't combine the insns if a > volatile register is contained. > > Hi Segher, > > I have updated the patch as you suggested. Both the patch and the > changelog are attached. > > By the way, the test case provided by Tim Pambor in PR46164 was a different > bug with PR46164. So I resubmitted the bug in > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64818. > And this patch is just used to fix this bug. Is it OK for you? > > Thanks, > Hale > > gcc/ChangeLog: > 2015-01-27 Segher Boessenkool > Hale Wang > > PR rtl-optimization/64818 > * combine.c (can_combine_p): Don't combine the insn if > the dest of insn is a user specified register. > > gcc/testsuit/ChangeLog: > 2015-01-27 Segher Boessenkool > Hale Wang > > PR rtl-optimization/64818 > * gcc.target/arm/pr64818.c: New test. > > > diff --git a/gcc/combine.c b/gcc/combine.c index 5c763b4..6901ac2 100644 > --- a/gcc/combine.c > +++ b/gcc/combine.c > @@ -1904,6 +1904,12 @@ can_combine_p (rtx_insn *insn, rtx_insn *i3, > rtx_insn *pred ATTRIBUTE_UNUSED, >set = expand_field_assignment (set); >src = SET_SRC (set), dest = SET_DEST (set); > > + /* Don't combine if dest contains a user specified register, because the > + user specified register (same with dest) in i3 would be replaced by the > + src of insn which might be different with the user's expectation. > + */ if (REG_P (dest) && REG_USERVAR_P (dest) && HARD_REGISTER_P > (dest)) > +return 0; > + >/* Don't eliminate a store in the stack pointer. 
*/ >if (dest == stack_pointer_rtx >/* Don't combine with an insn that sets a register to itself if it has diff --git > a/gcc/testsuite/gcc.target/arm/pr64818.c > b/gcc/testsuite/gcc.target/arm/pr64818.c > new file mode 100644 > index 000..bddd846 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/arm/pr64818.c > @@ -0,0 +1,30 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O1" } */ > + > +char temp[16]; > +extern int foo1 (void); > + > +void foo (void) > +{ > + int i; > + int len; > + > + while (1) > + { > +len = foo1 (); > +register int a asm ("r0") = 5; > +register char *b asm ("r1") = temp; > +register int c asm ("r2") = len; > +asm volatile ("mov %[r0], %[r0]\n mov %[r1], %[r1]\n > mov %[r2], %[r2]\n" > +: "+m"(*b) > +: [r0]"r"(a), [r1]"r"(b), [r2]"r"(c)); > + > +for (i = 0; i < len; i++) > +{ > + if (temp[i] == 10) > + return; > +} > + } > +} > + > +/* { dg-final { scan-assembler "\[\\t \]+mov\ r1,\ r1" } } */ > > > > > On Tue, Jan 27, 2015 at 11:49:55AM +0800, Hale Wang wrote: > > > > > > Hi Hale, > > > > You are correct. Just "-O1" reproduces this problem. > > > > However it's a combine bug which is related to the combing user > > > > specified register into inline-asm. > > > > > > Yes, it is. But the registers the testcase uses exist on any ARM > > > version > > there > > > is as far as I know, so not specifying specific model and ABI should > > > give > > wider > > > test coverage (if anyone actually builds and/or tests more than the > > default, > > > of course :-) ) > > > > > > > > Could you try this patch please? > > > > > > > > Your patch rejected the combine 98+43, that's correct. > > > > > > Excellent, thanks for testing. > > > > > > > However, Jakub > > > > pointed out that preventing that to be combined would be a serious > > > > regression on code quality. > > > > > > I know; I needed to think of some good way to detect register > > > variables > > (they > > > aren't marked specially in RTL). 
I think I found one, for combine > > > that > > is; if we > > > need to detect it in other passes too, we probably need to put > > > another > > flag > > > on it, or something. > > > > > > > Andrew Pinski suggested: can_combine_p would reject combing into > > > > an inline-asm to prevent this issue. And I have updated the patch. > > > > What do you think about this change? > > > > > > That will regress combining anything else into an asm. It will > > > disallow combining asms _at all_, if we really wanted that we should > > > simply not > > build > > > LOG_LINKS for them. But it hurts optimisation (for simple "r" > > > constraints > > it is > > > not a real problem, RA should take care of it, but for anything else > > > it > > is). > > > > > > Updated patch below. A user variable that is also a hard register > > > can > > only > > > happen in a few cases: 1) a register variable, the case we are > > > after; > > > 2) > > an > > > argument for the current function that was propagated into a user > > > variable (something combine should not do at all, it hinders good > > > register > > allocation, > > > but it does anyway on most targets). > > > > >
RE: [Ping^2] [PATCH, ARM, libgcc] New aeabi_idiv function for armv6-m
Ping https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01059.html. > -Original Message- > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > ow...@gcc.gnu.org] On Behalf Of Hale Wang > Sent: Friday, December 12, 2014 9:36 AM > To: gcc-patches > Subject: RE: [Ping] [PATCH, ARM, libgcc] New aeabi_idiv function for armv6- > m > > Ping? Already applied to arm/embedded-4_9-branch, is it OK for trunk? > > -Hale > > > -Original Message- > > From: Joey Ye [mailto:joey.ye...@gmail.com] > > Sent: Thursday, November 27, 2014 10:01 AM > > To: Hale Wang > > Cc: gcc-patches > > Subject: Re: [PATCH, ARM, libgcc] New aeabi_idiv function for armv6-m > > > > OK applying to arm/embedded-4_9-branch, though you still need > > maintainer approval into trunk. > > > > - Joey > > > > On Wed, Nov 26, 2014 at 11:43 AM, Hale Wang > wrote: > > > Hi, > > > > > > This patch ports the aeabi_idiv routine from Linaro Cortex-Strings > > > (https://git.linaro.org/toolchain/cortex-strings.git), which was > > > contributed by ARM under Free BSD license. > > > > > > The new aeabi_idiv routine is used to replace the one in > > > libgcc/config/arm/lib1funcs.S. This replacement happens within the > > > Thumb1 wrapper. The new routine is under LGPLv3 license. > > > > > > The main advantage of this version is that it can improve the > > > performance of the aeabi_idiv function for Thumb1. This solution > > > will also increase the code size. So it will only be used if > > > __OPTIMIZE_SIZE__ is > > not defined. > > > > > > Make check passed for armv6-m. > > > > > > OK for trunk? > > > > > > Thanks, > > > Hale Wang > > > > > > libgcc/ChangeLog: > > > > > > 2014-11-26 Hale Wang > > > > > > * config/arm/lib1funcs.S: Add new wrapper. 
> > > > > > === > > > diff --git a/libgcc/config/arm/lib1funcs.S > > > b/libgcc/config/arm/lib1funcs.S index b617137..de66c81 100644 > > > --- a/libgcc/config/arm/lib1funcs.S > > > +++ b/libgcc/config/arm/lib1funcs.S > > > @@ -306,34 +306,12 @@ LSYM(Lend_fde): > > > #ifdef __ARM_EABI__ > > > .macro THUMB_LDIV0 name signed > > > #if defined(__ARM_ARCH_6M__) > > > - .ifc \signed, unsigned > > > - cmp r0, #0 > > > - beq 1f > > > - mov r0, #0 > > > - mvn r0, r0 @ 0x > > > -1: > > > - .else > > > - cmp r0, #0 > > > - beq 2f > > > - blt 3f > > > + > > > + push{r0, lr} > > > mov r0, #0 > > > - mvn r0, r0 > > > - lsr r0, r0, #1 @ 0x7fff > > > - b 2f > > > -3: mov r0, #0x80 > > > - lsl r0, r0, #24 @ 0x8000 > > > -2: > > > - .endif > > > - push{r0, r1, r2} > > > - ldr r0, 4f > > > - adr r1, 4f > > > - add r0, r1 > > > - str r0, [sp, #8] > > > - @ We know we are not on armv4t, so pop pc is safe. > > > - pop {r0, r1, pc} > > > - .align 2 > > > -4: > > > - .word __aeabi_idiv0 - 4b > > > + bl SYM(__aeabi_idiv0) > > > + pop {r1, pc} > > > + > > > #elif defined(__thumb2__) > > > .syntax unified > > > .ifc \signed, unsigned > > > @@ -927,7 +905,158 @@ LSYM(Lover7): > > > add dividend, work > > >.endif > > > LSYM(Lgot_result): > > > -.endm > > > +.endm > > > + > > > +#if defined(__prefer_thumb__) > > && !defined(__OPTIMIZE_SIZE__) .macro > > > +BranchToDiv n, label > > > + lsr curbit, dividend, \n > > > + cmp curbit, divisor > > > + blo \label > > > +.endm > > > + > > > +.macro DoDiv n > > > + lsr curbit, dividend, \n > > > + cmp curbit, divisor > > > + bcc 1f > > > + lsl curbit, divisor, \n > > > + sub dividend, dividend, curbit > > > + > > > +1: adc result, result > > > +.endm > > > + > > > +.macro THUMB1_Div_Positive > > > + mov result, #0 > > > + BranchToDiv #1, LSYM(Lthumb1_div1) > > > + BranchToDiv #4, LSYM(Lthumb1_div4) > > > + BranchToDiv #8, LSYM(Lthumb1_div8) > > > + BranchToDiv #12, LSYM(Lthumb1_div12) > > > + BranchToDiv #16, LSYM(Lthumb1_div16) > > > 
+LSYM(Lthumb1_div_large_positive): > > > + mov result, #0xff > > > + lsl divisor, divisor, #8 > > > + rev result, result > > > + lsr curbit, dividend, #16 > > > + cmp curbit, divisor > > > + blo 1f > > > + asr result, #8 > > > + lsl divisor, divisor, #8 > > > + beq LSYM(Ldivbyzero_waypoint) > > > + > > > +1: lsr curbit, dividend, #12 > > > + cmp curbit, divisor > > > + blo LSYM(Lthumb1_div12) > > > + b LSYM(Lthumb1_div16) > > > +LSYM(Lthumb1_div_loop): > > > + lsr divisor, divisor, #8 > > > +LSYM(Lthumb1_div16): > > > + Dodiv #15 > > > + Dodiv #14 > > > + Dodiv #13 > > > + Dodiv #12
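The DoDiv macro quoted above compares the dividend shifted right by n against the divisor, conditionally subtracts the divisor shifted left by n, and shifts one quotient bit into the result. A minimal C sketch of that shift-and-subtract step, written as a plain loop over all 32 bit positions: this is an illustration of the idea, not the structure of the quoted routine, which is unrolled, dispatches on operand size and handles signs separately. The function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Unsigned shift-and-subtract division.  Returns the quotient and, if
   REMAINDER is non-null, stores the remainder through it.  */
uint32_t
udiv_shift_subtract (uint32_t dividend, uint32_t divisor, uint32_t *remainder)
{
  uint32_t result = 0;
  int n;

  assert (divisor != 0);

  for (n = 31; n >= 0; n--)
    {
      result <<= 1;
      /* Comparing the shifted dividend avoids overflowing divisor << n:
         the shift below is only done when it is known to fit.  */
      if ((dividend >> n) >= divisor)
        {
          dividend -= divisor << n;
          result |= 1u;
        }
    }

  if (remainder != NULL)
    *remainder = dividend;
  return result;
}

/* Usage: quotient and remainder of 100 / 7.  */
void
udiv_shift_subtract_demo (void)
{
  uint32_t rem;
  assert (udiv_shift_subtract (100u, 7u, &rem) == 14u && rem == 2u);
}
```

Unrolling this loop and entering it at a bit position chosen from the operand sizes is what trades code size for speed, which matches the note that the new routine is not used when __OPTIMIZE_SIZE__ is defined.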
[PATCH/AARCH64] Fix gcc.c-torture/compile/pr37433.c for AARCH64:ILP32.
The problem here is that we get a symbol_ref which is SImode but for the sibcall patterns we only match symbol_refs which use DImode. I added a new testcase that tests the non-value sibcall pattern too. OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions. Thanks, Andrew Pinski ChangeLog: * config/aarch64/aarch64.md (sibcall): Force call address to be DImode for ILP32. (sibcall_value): Likewise. testsuite/ChangeLog: * gcc.c-torture/compile/pr37433-1.c: New testcase. commit e72320f54d1c6ed6f2324a3faaad02175c83887b Author: Andrew Pinski Date: Sun Feb 8 23:05:01 2015 + Fix gcc.c-torture/compile/pr37433.c for AARCH64:ILP32. The problem here is that we get a symbol_ref which is SImode but for the sibcall patterns we only match symbol_refs which use DImode. OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions. Thanks, Andrew Pinski * config/aarch64/aarch64.md (sibcall): Force operands[0]'s address to be DImode for ILP32. (sibcall_value): Likewise. * gcc.c-torture/compile/pr37433-1.c: New testcase. 
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 1f4169e..05240ba 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -687,6 +687,11 @@ && (GET_CODE (XEXP (operands[0], 0)) != SYMBOL_REF)) XEXP (operands[0], 0) = force_reg (Pmode, XEXP (operands[0], 0)); +if (TARGET_ILP32 + && GET_CODE (XEXP (operands[0], 0)) == SYMBOL_REF + && GET_MODE (XEXP (operands[0], 0)) == SImode) + XEXP (operands[0], 0) = convert_memory_address (DImode, + XEXP (operands[0], 0)); if (operands[2] == NULL_RTX) operands[2] = const0_rtx; @@ -717,6 +722,12 @@ && (GET_CODE (XEXP (operands[1], 0)) != SYMBOL_REF)) XEXP (operands[1], 0) = force_reg (Pmode, XEXP (operands[1], 0)); +if (TARGET_ILP32 + && GET_CODE (XEXP (operands[1], 0)) == SYMBOL_REF + && GET_MODE (XEXP (operands[1], 0)) == SImode) + XEXP (operands[1], 0) = convert_memory_address (DImode, + XEXP (operands[1], 0)); + if (operands[3] == NULL_RTX) operands[3] = const0_rtx; diff --git a/gcc/testsuite/gcc.c-torture/compile/pr37433-1.c b/gcc/testsuite/gcc.c-torture/compile/pr37433-1.c new file mode 100644 index 000..322c167 --- /dev/null +++ b/gcc/testsuite/gcc.c-torture/compile/pr37433-1.c @@ -0,0 +1,11 @@ +void regex_subst(void) +{ + const void *subst = ""; + (*(void (*)(int))subst) (0); +} + +void foobar (void) +{ + int x; + (*(void (*)(void))&x) (); +}
[PATCH PR43378]Add test case for the issue
Hi, I came across PR43378 and found it had already been fixed on trunk long ago. I am adding a test case and will close the PR after this patch. The test case has been tested. Is it OK? 2015-02-09 Bin Cheng PR tree-optimization/43378 * gcc.dg/tree-ssa/pr43378.c: New test.
RE: [PATCH PR43378]Add test case for the issue
And here is the patch I missed. Thanks, bin > -Original Message- > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > ow...@gcc.gnu.org] On Behalf Of Bin Cheng > Sent: Monday, February 09, 2015 2:07 PM > To: gcc-patches@gcc.gnu.org > Subject: [PATCH PR43378]Add test case for the issue > > Hi, > I crossed to PR43378 and found it had already been fixed on trunk long > before. I am adding a test case and going to close it after this patch. > The case is tested, Is it OK? > > 2015-02-09 Bin Cheng > > PR tree-optimization/43378 > * gcc.dg/tree-ssa/pr43378.c: New test. > > > Index: gcc/testsuite/gcc.dg/tree-ssa/pr43378.c === --- gcc/testsuite/gcc.dg/tree-ssa/pr43378.c (revision 0) +++ gcc/testsuite/gcc.dg/tree-ssa/pr43378.c (revision 0) @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-ivopts" } */ + +void bar (int, int, int); +void foo (int left, int rite, int element) +{ + while (left <= rite) +{ + rite -= element; + bar (left, rite, element); + left += element; +} +} + +/* { dg-final { scan-tree-dump-times "rite_\[0-9\]* = rite_\[0-9\]* - element" 1 "ivopts"} } */ +/* { dg-final { scan-tree-dump-times "left_\[0-9\]* = left_\[0-9\]* \\+ element|left_\[0-9\]* = element_\[0-9\]*\\(D\\) \\+ left" 1 "ivopts"} } */ +/* { dg-final { cleanup-tree-dump "ivopts" } } */
Re: [RFC][PR target/39726 P4 regression] match.pd pattern to do type narrowing
On 02/03/15 04:39, Richard Biener wrote:
>> I found that explicit types were ignored in some cases. It was
>> frustrating to say the least.
>
> Huh, that would be a bug. Do you have a pattern where that happens?

I'll have to recreate them. In the meantime, consider something else I'm playing with that causes an odd error from genmatch...

/* If we have a narrowing conversion of an arithmetic or logical
   operation where both operands are widening conversions from the
   same type as the outer narrowing conversion.  Then convert the
   innermost operands to a suitable unsigned type (to avoid introducing
   undefined behaviour), perform the operation and convert the result
   to the desired type.  */
(simplify
  (convert (plus (convert@2 @0) (convert @1)))
  (if (TREE_TYPE (@0) == TREE_TYPE (@1)
       && TREE_TYPE (@0) == type
       && INTEGRAL_TYPE_P (type)
       && TYPE_PRECISION (TREE_TYPE (@2)) >= TYPE_PRECISION (TREE_TYPE (@0)))
    (with { tree utype = unsigned_type_for (TREE_TYPE (@0));}
      (convert (plus (convert:utype @0) (convert:utype @1))))))

So: given two narrow operands that get widened, added, and the final result narrowed back down to the original operand type, convert the operands to an unsigned type (of the same size as the operand), operate on them, and convert the result to the final desired type.

This happens to fix 47477 (P2 regression). Works perfectly for the testcase.

Of course we'd like to extend that to other operators... So, adding the obvious for iterator...

(for op (plus minus)
  (simplify
    (convert (op (convert@2 @0) (convert @1)))
    (if (TREE_TYPE (@0) == TREE_TYPE (@1)
         && TREE_TYPE (@0) == type
         && INTEGRAL_TYPE_P (type)
         && TYPE_PRECISION (TREE_TYPE (@2)) >= TYPE_PRECISION (TREE_TYPE (@0)))
      (with { tree utype = unsigned_type_for (TREE_TYPE (@0));}
        (convert (op (convert:utype @0) (convert:utype @1)))))))

Which causes genmatch to barf:

build/genmatch --gimple /home/gcc/GIT-2/gcc/gcc/match.pd \
    > tmp-gimple-match.c
genmatch: two conversions in a row

Not only does genmatch barf, it doesn't give any indication what part of the .pd file it found objectionable.
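
For readers following along, the effect of the narrowing pattern can be sketched in plain C. This is an illustration only and not part of the original mail: the function names are hypothetical, the operand type is assumed to be an 8-bit signed char, and C's integer promotions mean the source-level addition is still carried out in int. The sketch only shows that adding in the same-width unsigned type and then converting gives the same value as widening, adding, and narrowing. Converting an out-of-range value back to signed char is implementation-defined; GCC documents it as reduction modulo 2^N.

```c
#include <assert.h>
#include <limits.h>

/* Original form: widen both operands to int, add, then narrow the
   result back to the operand type.  */
signed char add_via_widening (signed char a, signed char b)
{
  return (signed char) ((int) a + (int) b);
}

/* Narrowed form: convert both operands to the unsigned type of the same
   width, add there (wrapping modulo 256 is well defined for unsigned
   types), then convert to the desired result type.  */
signed char add_via_unsigned (signed char a, signed char b)
{
  unsigned char ua = (unsigned char) a;
  unsigned char ub = (unsigned char) b;
  unsigned char sum = (unsigned char) (ua + ub);
  return (signed char) sum;
}

/* Exhaustively compare the two forms over every pair of signed char
   values.  Returns 1 if they always agree, 0 otherwise.  */
int forms_agree_everywhere (void)
{
  for (int a = SCHAR_MIN; a <= SCHAR_MAX; a++)
    for (int b = SCHAR_MIN; b <= SCHAR_MAX; b++)
      if (add_via_widening ((signed char) a, (signed char) b)
          != add_via_unsigned ((signed char) a, (signed char) b))
        return 0;
  return 1;
}

/* Small self-check that can be called from a test driver.  */
void self_check (void)
{
  assert (add_via_widening (1, 2) == 3);
  assert (add_via_unsigned (1, 2) == 3);
  assert (forms_agree_everywhere ());
}
```

Calling self_check () from a driver exercises both forms. The exhaustive loop is feasible here only because the assumed operand type has 256 values.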
[PATCH] gcc/ira-color.c: save a init statement
In function setup_left_conflict_sizes_p, the initial assignment conflict_size = 0 can be eliminated for tidiness.

I have no write access to the gcc repository, and I can't provide a testcase because the improvement has no effect on compiler output.

Bootstrapped and regtested on x86_64 Linux

Signed-off-by: Zhouyi Zhou
---
 gcc/ChangeLog   |    3 +++
 gcc/ira-color.c |    3 +--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index bd74326..577b57e 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,6 @@
+2015-02-09  Zhouyi Zhou
+	* ira-color.c (setup_left_conflict_sizes_p): save a init statement.
+
 2015-02-08  Andrew Pinski

 	* config/aarch64/aarch64.c (gty_dummy): Delete.
diff --git a/gcc/ira-color.c b/gcc/ira-color.c
index d04be29..5591637 100644
--- a/gcc/ira-color.c
+++ b/gcc/ira-color.c
@@ -858,7 +858,6 @@ setup_left_conflict_sizes_p (ira_allocno_t a)
   HARD_REG_SET node_set;

   nobj = ALLOCNO_NUM_OBJECTS (a);
-  conflict_size = 0;
   data = ALLOCNO_COLOR_DATA (a);
   subnodes = allocno_hard_regs_subnodes + data->hard_regs_subnodes_start;
   COPY_HARD_REG_SET (profitable_hard_regs, data->profitable_hard_regs);
@@ -959,7 +958,7 @@ setup_left_conflict_sizes_p (ira_allocno_t a)
     }
   left_conflict_subnodes_size = subnodes[0].left_conflict_subnodes_size;
   conflict_size
-    += (left_conflict_subnodes_size
+    = (left_conflict_subnodes_size
        + MIN (subnodes[0].max_node_impact - left_conflict_subnodes_size,
 	      subnodes[0].left_conflict_size));
   conflict_size += ira_reg_class_max_nregs[ALLOCNO_CLASS (a)][ALLOCNO_MODE (a)];
--
1.7.10.4