Re: [SH] Add simple_return pattern
On 09/11/2012 03:05 AM, Kaz Kojima wrote:
> Christian Bruel wrote:
>> This patch implements the simple_return pattern to enable -fshrink-wrap
>> on SH. It also clean up some redundancies for expand_epilogue (called
>> twice from the "return" and "epilogue" patterns and the
>> sh_expand_prologue parameter type.
>>
>> No regressions with sh-superh-elf and sh4-linux gcc testsuites.
>
> With the patch + revision 191106, I've got a new failure:
>
> FAIL: gcc.dg/tree-prof/bb-reorg.c compilation, -fprofile-use -D_PROFILE_USE
> (internal compiler error)
>
> for sh4-unknown-linux-gnu. My testsuite/gcc/gcc.log says
>
> /exp/ldroot/dodes/xsh-gcc/gcc/xgcc -B/exp/ldroot/dodes/xsh-gcc/gcc/
> /exp/ldroot/dodes/LOCAL/trunk/gcc/testsuite/gcc.dg/tree-prof/bb-reorg.c
> -fno-diagnostics-show-caret -O2 -freorder-blocks-and-partition -fprofile-use
> -D_PROFILE_USE -lm -o /exp/ldroot/dodes/xsh-gcc/gcc/testsuite/gcc/bb-reorg.x02
> /exp/ldroot/dodes/LOCAL/trunk/gcc/testsuite/gcc.dg/tree-prof/bb-reorg.c: In
> function 'main':
> /exp/ldroot/dodes/LOCAL/trunk/gcc/testsuite/gcc.dg/tree-prof/bb-reorg.c:38:1:
> error: EDGE_CROSSING missing across section boundary
> /exp/ldroot/dodes/LOCAL/trunk/gcc/testsuite/gcc.dg/tree-prof/bb-reorg.c:38:1:
> internal compiler error: verify_flow_info failed
> Please submit a full bug report,
>
> Regards,

Ugh, indeed, I forgot a SPEC file that set the release mode on my SH-Linux distri, so verify_flow_info was not called :-(. I need to test again.

thanks !

Christian

> kaz
>
Bootstrap fails (was: Remove unnecessary VEC function overloads.)
On 09/11/2012 01:52 AM, Diego Novillo wrote:
> Remove unnecessary VEC function overloads.
>
> Several VEC member functions that accept an element 'T' used to have
> two overloads: one taking 'T', the second taking 'T *'.

They might be unnecessary, but with your patch bootstrapping fails here with the following failure.

Did you test with or without Graphite?

Tobias

/home/tob/projects/gcc-git/gcc/gcc/graphite-scop-detection.c: In function ‘void move_sd_regions(vec_t**, vec_t**)’:
/home/tob/projects/gcc-git/gcc/gcc/vec.h:408:63: error: no matching function for call to ‘vec_t::safe_push(vec_t**, sd_region*&, const char [61], int, const char [16])’
  (vec_t::safe_push (&(V), O VEC_CHECK_INFO MEM_STAT_INFO))
  ^
/home/tob/projects/gcc-git/gcc/gcc/graphite-scop-detection.c:146:5: note: in expansion of macro 'VEC_safe_push'
  VEC_safe_push (sd_region, heap, *target, s);
  ^
/home/tob/projects/gcc-git/gcc/gcc/vec.h:408:63: note: candidate is:
  (vec_t::safe_push (&(V), O VEC_CHECK_INFO MEM_STAT_INFO))
Ping [SH] Define NO_IMPLICIT_EXTERN_C for newlib targets
Hi Kaz,

Any news on my sh-superh-elf --with-newlib patch?

http://gcc.gnu.org/ml/gcc-patches/2012-09/msg00137.html

Thanks,
Christian
Re: [PATCH] Fix PR54492
On Mon, 10 Sep 2012, William J. Schmidt wrote: > Here's the revised patch with a param. Bootstrapped and tested in the > same manner. Ok for trunk? Ok. Thanks, Richard. > Thanks, > Bill > > > 2012-08-10 Bill Schmidt > > * doc/invoke.texi (max-slsr-cand-scan): New description. > * gimple-ssa-strength-reduction.c (find_basis_for_candidate): Limit > the time spent searching for a basis. > * params.def (PARAM_MAX_SLSR_CANDIDATE_SCAN): New param. > > > Index: gcc/doc/invoke.texi > === > --- gcc/doc/invoke.texi (revision 191135) > +++ gcc/doc/invoke.texi (working copy) > @@ -9407,6 +9407,11 @@ having a regular register file and accurate regist > See @file{haifa-sched.c} in the GCC sources for more details. > > The default choice depends on the target. > + > +@item max-slsr-cand-scan > +Set the maximum number of existing candidates that will be considered when > +seeking a basis for a new straight-line strength reduction candidate. > + > @end table > @end table > > Index: gcc/gimple-ssa-strength-reduction.c > === > --- gcc/gimple-ssa-strength-reduction.c (revision 191135) > +++ gcc/gimple-ssa-strength-reduction.c (working copy) > @@ -54,6 +54,7 @@ along with GCC; see the file COPYING3. If not see > #include "domwalk.h" > #include "pointer-set.h" > #include "expmed.h" > +#include "params.h" > > /* Information about a strength reduction candidate. Each statement > in the candidate table represents an expression of one of the > @@ -353,10 +354,14 @@ find_basis_for_candidate (slsr_cand_t c) >cand_chain_t chain; >slsr_cand_t basis = NULL; > > + // Limit potential of N^2 behavior for long candidate chains. 
> + int iters = 0; > + int max_iters = PARAM_VALUE (PARAM_MAX_SLSR_CANDIDATE_SCAN); > + >mapping_key.base_expr = c->base_expr; >chain = (cand_chain_t) htab_find (base_cand_map, &mapping_key); > > - for (; chain; chain = chain->next) > + for (; chain && iters < max_iters; chain = chain->next, ++iters) > { >slsr_cand_t one_basis = chain->cand; > > Index: gcc/params.def > === > --- gcc/params.def(revision 191135) > +++ gcc/params.def(working copy) > @@ -973,6 +973,13 @@ DEFPARAM (PARAM_SCHED_PRESSURE_ALGORITHM, > "Which -fsched-pressure algorithm to apply", > 1, 1, 2) > > +/* Maximum length of candidate scans in straight-line strength reduction. */ > +DEFPARAM (PARAM_MAX_SLSR_CANDIDATE_SCAN, > + "max-slsr-cand-scan", > + "Maximum length of candidate scans for straight-line " > + "strength reduction", > + 50, 1, 99) > + > /* > Local variables: > mode:c > > > -- Richard Biener SUSE / SUSE Labs SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 GF: Jeff Hawn, Jennifer Guild, Felix Imend
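The effect of the new limit can be sketched outside GCC. The following is a minimal, self-contained illustration and not the SLSR code itself: `struct cand_chain`, its `stride` field and `find_basis` are hypothetical stand-ins, the match test is a toy, and `max_iters` plays the role that the `max-slsr-cand-scan` param plays in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a candidate chain; not GCC's real type.  */
struct cand_chain {
    int stride;                 /* toy property used to pick a basis */
    struct cand_chain *next;
};

/* Scan at most MAX_ITERS chain entries for one whose stride matches.
   Returns NULL if nothing is found within the budget, which bounds
   the per-candidate cost on very long chains.  */
static const struct cand_chain *
find_basis (const struct cand_chain *chain, int stride, int max_iters)
{
    int iters = 0;

    assert (max_iters >= 1);    /* the patch gives the param a minimum of 1 */
    for (; chain && iters < max_iters; chain = chain->next, ++iters)
        if (chain->stride == stride)
            return chain;
    return NULL;
}
```

The trade-off is the one the patch accepts: a usable basis that sits beyond the scan budget is simply not found, so the optimization may be missed in exchange for avoiding quadratic time.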
[PATCH] Fix PR54515
This is the trunk variant of the 54515 fix - we shouldn't really return NULL_TREE from get_base_address apart from for invalid inputs (and then it's just GIGO). This makes us go half-way to fix the PR, I'll followup with a patch to look through WITH_SIZE_EXPR (after thinking about effects on alias analysis). Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk. Richard. 2012-09-11 Richard Guenther PR middle-end/54515 * gimple.c (get_base_address): Do not return NULL_TREE apart from for WITH_SIZE_EXPR. * gimple-fold.c (canonicalize_constructor_val): Do not call get_base_address when not necessary. * g++.dg/tree-ssa/pr54515.C: New testcase. Index: gcc/gimple.c === --- gcc/gimple.c(revision 191143) +++ gcc/gimple.c(working copy) @@ -2878,16 +2878,12 @@ get_base_address (tree t) && TREE_CODE (TREE_OPERAND (t, 0)) == ADDR_EXPR) t = TREE_OPERAND (TREE_OPERAND (t, 0), 0); - if (TREE_CODE (t) == SSA_NAME - || DECL_P (t) - || TREE_CODE (t) == STRING_CST - || TREE_CODE (t) == CONSTRUCTOR - || INDIRECT_REF_P (t) - || TREE_CODE (t) == MEM_REF - || TREE_CODE (t) == TARGET_MEM_REF) -return t; - else + /* ??? Either the alias oracle or all callers need to properly deal + with WITH_SIZE_EXPRs before we can look through those. 
*/ + if (TREE_CODE (t) == WITH_SIZE_EXPR) return NULL_TREE; + + return t; } void Index: gcc/gimple-fold.c === --- gcc/gimple-fold.c (revision 191143) +++ gcc/gimple-fold.c (working copy) @@ -154,13 +154,15 @@ canonicalize_constructor_val (tree cval, } if (TREE_CODE (cval) == ADDR_EXPR) { - tree base = get_base_address (TREE_OPERAND (cval, 0)); - if (!base && TREE_CODE (TREE_OPERAND (cval, 0)) == COMPOUND_LITERAL_EXPR) + tree base = NULL_TREE; + if (TREE_CODE (TREE_OPERAND (cval, 0)) == COMPOUND_LITERAL_EXPR) { base = COMPOUND_LITERAL_EXPR_DECL (TREE_OPERAND (cval, 0)); if (base) TREE_OPERAND (cval, 0) = base; } + else + base = get_base_address (TREE_OPERAND (cval, 0)); if (!base) return NULL_TREE; Index: gcc/testsuite/g++.dg/tree-ssa/pr54515.C === --- gcc/testsuite/g++.dg/tree-ssa/pr54515.C (revision 0) +++ gcc/testsuite/g++.dg/tree-ssa/pr54515.C (working copy) @@ -0,0 +1,19 @@ +// { dg-do compile } +// { dg-options "-O2" } + +template < typename T > T h2le (T) +{ +T a; +unsigned short &b = a; +short c = 0; +unsigned char (&d)[2] = reinterpret_cast < unsigned char (&)[2] > (c); +unsigned char (&e)[2] = reinterpret_cast < unsigned char (&)[2] > (b); +e[0] = d[0]; +return a; +} + +void +bar () +{ +h2le ((unsigned short) 0); +}
Re: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)
On Mon, Sep 10, 2012 at 6:30 PM, Richard Henderson wrote: > On 09/10/2012 09:11 AM, Iyer, Balaji V wrote: >> Can you please help me get a start on how to get can be done? From >> what I understand (please correct me if I am wrong), this requires >> rearranging and duplicating a lot of passes and can potentially open >> up to a lot of bugs. > > Certainly not duplicating passes. And probably not even rearranging them. > > The Important parts are: > > (1) Having a bit in "struct loop" that indicates the special semantics > you have for #pragma simd. I don't know if maybe all loops inside an > elemental function are so automatically marked? > > (2) Have bits in "struct function" that summarize the contents of the > bit from "struct loop", for all loops in the function. Note that > this bit would need to be updated during inlining. > > (3) Change the "gate" predicates for the relevant function to also check > the bit from "struct function". In some cases the pass might need > to run globally (perhaps if-conversion?) and in some cases the pass > might be able to restrict work to specific loops (e.g. the vectorizer), > skipping loops for which the optimization is not enabled. Note that we do not preserve the loop tree before the gimple loop optimizer passes. Nor do we have a convenient way (currently) to transfer per-loop information from GENERIC to the point where we can first create the loop tree (after the CFG is built). The former is because I didn't want to think about the inlining case (I'm still chasing bugs for preserving the loop tree from the start of gimple loop optimizer passes ...), the latter could be done in a similar way we handle predications or OMP annotations - have special instructions in the IL. Richard. > > r~ >
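The three points in the quoted proposal can be sketched in miniature. This is only an illustration under stated assumptions: `struct toy_loop`, `struct toy_function` and every function name below are hypothetical and bear no relation to GCC's real `struct loop` or `struct function`. The sketch also ignores the caveat raised in the reply, namely that the loop tree is not available before the gimple loop optimizer passes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature types, for illustration only.  */
struct toy_loop {
    bool force_vect;            /* (1) set for loops under #pragma simd */
};

struct toy_function {
    struct toy_loop *loops;
    size_t n_loops;
    bool has_force_vect_loops;  /* (2) summary of the per-loop bits */
};

/* (2) Recompute the summary bit.  Something like this would have to
   run again after inlining copies loops into the function.  */
static void
update_summary (struct toy_function *fn)
{
    assert (fn != NULL);
    fn->has_force_vect_loops = false;
    for (size_t i = 0; i < fn->n_loops; i++)
        if (fn->loops[i].force_vect)
            fn->has_force_vect_loops = true;
}

/* (3) Gate for a pass that must run over the whole function.  */
static bool
gate_pass (const struct toy_function *fn, bool flag_enabled)
{
    return flag_enabled || fn->has_force_vect_loops;
}

/* (3) Filter for a pass that can restrict its work to single loops.  */
static bool
should_process_loop (const struct toy_loop *loop, bool flag_enabled)
{
    return flag_enabled || loop->force_vect;
}
```

With the optimization flag off, `gate_pass` still lets the pass run when any loop carries the bit, and `should_process_loop` then skips the loops that do not.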
Re: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)
On Mon, Sep 10, 2012 at 6:37 PM, Richard Henderson wrote: > On 09/10/2012 09:09 AM, Iyer, Balaji V wrote: >>> >If that's the case, what's the point in defining an external ABI and >>> >defining what >>> >__attribute__((vector)) placed on a function declaration means? > >> When you have __attribute__((vector)) you are asking the compiler to >> create a vector AND a scalar version of the function. The advantage >> is that if the function is used, for example, in 2 loops where 1 can >> be vectorized and another cannot, the vectorizable loop won't suffer >> (i.e. suffer from being not-vectorized). > > You've totally mis-understood my point. > > Whether or not the compiler creates a clone COULD BE totally up to the > compiler, based on whether or not vectorization is enabled, whether the > loop has been analyzed such that vectorization may proceed, or indeed > the phase of the moon. > > But in order for that to happen, the clone must be totally private to > the module for which we are generating code (in the LTO sense, this is > the entire program or dll; without LTO, this is just the object file). > It means that we never attempt to generate clones for functions for > which the body of the function is not visible. > > On the other hand, if you insist on assuming a clone exists merely > because a declaration bears an attribute, then you must address ALL > of the problems with respect to defining a stable ABI in the face of > different cpu revisions, different ISAs, and different vector lengths. > > I've not seen you address ANY of these problems, despite having the > problem pointed out multiple times. Indeed, if the definition of an elemental function is always visible to the vectorizer the vectorizer itself can instruct the creation of the clone if it does not already exist (just make those clones managed by the callgraph). 
Then the clones are visible to the current TU only and no ABI issues exist (though you could say that the vectorizer or the inliner could as well force inlining of elemental functions into places it wants to vectorize - one complication even with local clones is that the x86 ABI has no callee-saved XMM registers which makes function calls inside loops especially expensive). Richard. > > r~
Re: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)
On Tue, Sep 11, 2012 at 10:41 AM, Richard Guenther wrote: > On Mon, Sep 10, 2012 at 6:37 PM, Richard Henderson wrote: >> On 09/10/2012 09:09 AM, Iyer, Balaji V wrote: >If that's the case, what's the point in defining an external ABI and >defining what >__attribute__((vector)) placed on a function declaration means? >> >>> When you have __attribute__((vector)) you are asking the compiler to >>> create a vector AND a scalar version of the function. The advantage >>> is that if the function is used, for example, in 2 loops where 1 can >>> be vectorized and another cannot, the vectorizable loop won't suffer >>> (i.e. suffer from being not-vectorized). >> >> You've totally mis-understood my point. >> >> Whether or not the compiler creates a clone COULD BE totally up to the >> compiler, based on whether or not vectorization is enabled, whether the >> loop has been analyzed such that vectorization may proceed, or indeed >> the phase of the moon. >> >> But in order for that to happen, the clone must be totally private to >> the module for which we are generating code (in the LTO sense, this is >> the entire program or dll; without LTO, this is just the object file). >> It means that we never attempt to generate clones for functions for >> which the body of the function is not visible. >> >> On the other hand, if you insist on assuming a clone exists merely >> because a declaration bears an attribute, then you must address ALL >> of the problems with respect to defining a stable ABI in the face of >> different cpu revisions, different ISAs, and different vector lengths. >> >> I've not seen you address ANY of these problems, despite having the >> problem pointed out multiple times. > > Indeed, if the definition of an elemental function is always visible to the > vectorizer the vectorizer itself can instruct the creation of the clone > if it does not already exist (just make those clones managed by the > callgraph). 
Then the clones are visible to the current TU only and no > ABI issues exist (though you could say that the vectorizer or the inliner > could as well force inlining of elemental functions into places it wants to > vectorize - one complication even with local clones is that the x86 ABI > has no callee-saved XMM registers which makes function calls inside > loops especially expensive). Btw, this then happily fits into my suggestion that the "elementalness" can be autodetected by the compiler simply by means of a proper IPA pass and thus be fully LTO / whole-program aware. No need for an attribute (where you'd need to handle the case that the attribute was placed there by error). Richard. > Richard. > >> >> r~
Re: Ping [SH] Define NO_IMPLICIT_EXTERN_C for newlib targets
Christian Bruel wrote: > Any news for my sh-superh-elf --with-newlib patch ? > > http://gcc.gnu.org/ml/gcc-patches/2012-09/msg00137.html The patch is OK for both 4.7 and 4.8. Sorry for the delay. Regards, kaz
Re: [PATCH] Combine location with block using block_locations
On Mon, Sep 10, 2012 at 5:27 PM, Dehao Chen wrote: > On Mon, Sep 10, 2012 at 3:01 AM, Richard Guenther > wrote: >> On Sun, Sep 9, 2012 at 12:26 AM, Dehao Chen wrote: >>> Hi, Diego, >>> >>> Thanks a lot for the review. I've updated the patch. >>> >>> This patch is large and may easily break builds because it reserves >>> more complete information for TREE_BLOCK as well as gimple_block (may >>> trigger bugs that was hided when these info are unavailable). I've >>> done more rigorous testing to ensure that most bugs are caught before >>> checking in. >>> >>> * Sync to the head and retest all gcc testsuite. >>> * Port the patch to google-4_7 branch to retest all gcc testsuite, as >>> well as build many large applications. >>> >>> Through these tests, I've found two additional bugs that was omitted >>> in the original implementation. A new patch is attached (patch.txt) to >>> fix these problems. After this fix, all gcc testsuites pass for both >>> trunk and google-4_7 branch. I've also copy pasted the new fixes >>> (lto.c and tree-cfg.c) below. Now I'd say this patch is in good shape. >>> But it may not be perfect. I'll look into build failures as soon as it >>> arises. >>> >>> Richard and Diego, could you help me take a look at the following two fixes? >>> >>> Thanks, >>> Dehao >>> >>> New fixes: >>> --- gcc/lto/lto.c (revision 191083) >>> +++ gcc/lto/lto.c (working copy) >>> @@ -1559,8 +1559,6 @@ lto_fixup_prevailing_decls (tree t) >>> { >>>enum tree_code code = TREE_CODE (t); >>>LTO_NO_PREVAIL (TREE_TYPE (t)); >>> - if (CODE_CONTAINS_STRUCT (code, TS_COMMON)) >>> -LTO_NO_PREVAIL (TREE_CHAIN (t)); >> >> That change is odd. Can you show us how it breaks? > > This will break LTO build of gcc.c-torture/execute/pr38051.c > > There is data structure like: > > union { long int l; char c[sizeof (long int)]; } u; > > Once the block info is reserved for this, it'll reserve this data > structure. And inside this data structure, there is VAR_DECL. 
Thus > LTO_NO_PREVAIL assertion does not satisfy here for TREE_CHAIN (t). I see - the issue here is that this data structure is not reached at the time we call free_lang_data (via find_decls_types_r). But maybe I do not understand "once the block info is reserved for this". So the patch papers over an issue elsewhere I believe. Maybe Micha can add some clarification here though, how BLOCK_VARS should be visible here Richard. >> >>>if (DECL_P (t)) >>> { >>>LTO_NO_PREVAIL (DECL_NAME (t)); >>> >>> Index: gcc/tree-cfg.c >>> === >>> --- gcc/tree-cfg.c (revision 191083) >>> +++ gcc/tree-cfg.c (working copy) >>> @@ -5980,9 +5974,21 @@ move_stmt_op (tree *tp, int *walk_subtrees, void * >>>tree t = *tp; >>> >>>if (EXPR_P (t)) >>> -/* We should never have TREE_BLOCK set on non-statements. */ >>> -gcc_assert (!TREE_BLOCK (t)); >>> - >>> +{ >>> + tree block = TREE_BLOCK (t); >>> + if (p->orig_block == NULL_TREE >>> + || block == p->orig_block >>> + || block == NULL_TREE) >>> + TREE_SET_BLOCK (t, p->new_block); >>> +#ifdef ENABLE_CHECKING >>> + else if (block != p->new_block) >>> + { >>> + while (block && block != p->orig_block) >>> + block = BLOCK_SUPERCONTEXT (block); >>> + gcc_assert (block); >>> + } >>> +#endif >> >> I think what this means is that TREE_BLOCK on non-stmts are meaningless >> (thus only gimple_block is interesting on GIMPLE, not BLOCKs on trees). >> >> So instead of setting a BLOCK in some cases you should clear BLOCK >> if it happens to be set, or alternatively, only re-set it if there was >> a block associated >> with it. > > Yeah, makes sense. New change: > > @@ -5980,9 +5974,10 @@ >tree t = *tp; > >if (EXPR_P (t)) > -/* We should never have TREE_BLOCK set on non-statements. */ > -gcc_assert (!TREE_BLOCK (t)); > - > +{ > + if (TREE_BLOCK (t)) > + TREE_SET_BLOCK (t, p->new_block); > +} >else if (DECL_P (t) || TREE_CODE (t) == SSA_NAME) > { >if (TREE_CODE (t) == SSA_NAME) > > Thanks, > Dehao > >> >> Richard. 
>> >>> +} >>>else if (DECL_P (t) || TREE_CODE (t) == SSA_NAME) >>> { >>>if (TREE_CODE (t) == SSA_NAME) >>> >>> Whole patch: >>> gcc/ChangeLog: >>> 2012-09-08 Dehao Chen >>> >>> * toplev.c (general_init): Init block_locations. >>> * tree.c (tree_set_block): New. >>> (tree_block): Change to use LOCATION_BLOCK. >>> * tree.h (TREE_SET_BLOCK): New. >>> * final.c (reemit_insn_block_notes): Change to use LOCATION_BLOCK. >>> (final_start_function): Likewise. >>> * input.c (expand_location_1): Likewise. >>> * input.h (LOCATION_LOCUS): New. >>> (LOCATION_BLOCK): New. >>> (IS_UNKNOWN_LOCATION): New. >>> * fold-const.c (expr_location_or): Change to use new location. >>> * reorg.c (emit_d
Re: [patch] PR54149: fix data race in LIM pass
On Tue, Sep 11, 2012 at 1:15 AM, Aldy Hernandez wrote:
> In this failing testcase the LIM pass writes to g_13 regardless of the
> initial value of g_13, which is the test protecting the write. This causes
> an incorrect store data race wrt both the C++ memory model and transactional
> memory (the latter if the store occurs inside of a transaction).
>
> The problem here is that the ``lsm_flag'' temporary should only be set to
> true on the code paths where we actually set the original global. As it
> stands, we are setting lsm_flag to true for reads or writes.
>
> Fixed by only setting lsm_flag=1 when the original code path has a write.
>
> Tested on x86-64 Linux.
>
> OK for trunk?

+      /* Only set the flag for writes. */
+      if (is_gimple_assign (loc->stmt)
+          && gimple_assign_lhs (loc->stmt) == *loc->ref)

ok with

  && gimple_assign_lhs_ptr (loc->stmt) == loc->ref

instead. Let's hope we conservatively catch all writes to ref this way (which is what we need, right)?

Thanks,
Richard.
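The race described in the quoted message can be modelled by hand in plain C. This is a sketch under assumptions, not the PR54149 testcase and not what the LIM pass literally emits: `g`, `stores_to_g`, `store_g`, `loop_original` and `loop_after_lim` are hypothetical names, and the `flag_on_reads` switch exists only to contrast the buggy and the fixed placement of the flag.

```c
#include <assert.h>
#include <stdbool.h>

static int g;                   /* the shared global */
static int stores_to_g;         /* counts real stores, for illustration */

static void store_g (int v) { g = v; stores_to_g++; }

/* Source-level loop: g is read every iteration, written only if set.  */
static void loop_original (int n)
{
    assert (n >= 0);
    for (int i = 0; i < n; i++)
        if (g)
            store_g (0);
}

/* Hand-written model of the loop after store motion with a flag.  */
static void loop_after_lim (int n, bool flag_on_reads)
{
    assert (n >= 0);
    int tmp = g;                /* load hoisted out of the loop */
    bool lsm_flag = false;

    for (int i = 0; i < n; i++) {
        if (flag_on_reads)
            lsm_flag = true;    /* buggy: a mere read sets the flag */
        if (tmp) {
            tmp = 0;
            lsm_flag = true;    /* fixed: only the write path sets it */
        }
    }
    if (lsm_flag)
        store_g (tmp);          /* the single sunk store */
}
```

Starting from `g == 0`, `loop_original` never stores to `g`, and neither does the fixed variant. The buggy variant stores once, and that store is exactly the write another thread could observe as a data race.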
Re: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)
On Tue, Sep 11, 2012 at 3:42 AM, Richard Guenther wrote: > Btw, this then happily fits into my suggestion that the "elementalness" > can be autodetected by the compiler simply by means of a proper IPA > pass and thus be fully LTO / whole-program aware. No need for an > attribute (where you'd need to handle the case that the attribute was placed > there by error). We are in violent agreement. -- Gaby
[PATCH,i386] Enable prefetchw in processor alias table for AMD targets
Hi Maintainers, This patch enables "prefetchw" ISA in the processor alias table for targets amdfam10,barcelona and bdver1,2 and btver1,2. GCC regression test passes with the patch. Ok for trunk? Change log: 2012-09-11 Venkataramanan Kumar * config/i386/i386.c (processor_alias_table): Enable PTA_PRFCHW for targets amdfam10, barcelona, bdver1, bdver2, btver1 and btver2. Index: gcc/config/i386/i386.c === --- gcc/config/i386/i386.c (revision 190345) +++ gcc/config/i386/i386.c (working copy) @@ -3151,31 +3151,33 @@ | PTA_SSE2 | PTA_NO_SAHF}, {"amdfam10", PROCESSOR_AMDFAM10, CPU_AMDFAM10, PTA_64BIT | PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE - | PTA_SSE2 | PTA_SSE3 | PTA_SSE4A | PTA_CX16 | PTA_ABM}, + | PTA_SSE2 | PTA_SSE3 | PTA_SSE4A | PTA_CX16 | PTA_ABM + | PTA_PRFCHW}, {"barcelona", PROCESSOR_AMDFAM10, CPU_AMDFAM10, PTA_64BIT | PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE - | PTA_SSE2 | PTA_SSE3 | PTA_SSE4A | PTA_CX16 | PTA_ABM}, + | PTA_SSE2 | PTA_SSE3 | PTA_SSE4A | PTA_CX16 | PTA_ABM + | PTA_PRFCHW}, {"bdver1", PROCESSOR_BDVER1, CPU_BDVER1, PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_SSE4A | PTA_CX16 | PTA_ABM | PTA_SSSE3 | PTA_SSE4_1 | PTA_SSE4_2 | PTA_AES | PTA_PCLMUL | PTA_AVX | PTA_FMA4 - | PTA_XOP | PTA_LWP}, + | PTA_XOP | PTA_LWP | PTA_PRFCHW}, {"bdver2", PROCESSOR_BDVER2, CPU_BDVER2, PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_SSE4A | PTA_CX16 | PTA_ABM | PTA_SSSE3 | PTA_SSE4_1 | PTA_SSE4_2 | PTA_AES | PTA_PCLMUL | PTA_AVX | PTA_XOP | PTA_LWP | PTA_BMI | PTA_TBM | PTA_F16C - | PTA_FMA}, + | PTA_FMA | PTA_PRFCHW}, {"btver1", PROCESSOR_BTVER1, CPU_GENERIC64, PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 -| PTA_SSSE3 | PTA_SSE4A |PTA_ABM | PTA_CX16}, +| PTA_SSSE3 | PTA_SSE4A |PTA_ABM | PTA_CX16 | PTA_PRFCHW}, {"generic32", PROCESSOR_GENERIC32, CPU_PENTIUMPRO, PTA_HLE /* flags are only used for -march switch. 
*/ }, {"btver2", PROCESSOR_BTVER2, CPU_GENERIC64, PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_SSSE3 | PTA_SSE4A |PTA_ABM | PTA_CX16 | PTA_SSE4_1 | PTA_SSE4_2 | PTA_AES | PTA_PCLMUL | PTA_AVX - | PTA_BMI | PTA_F16C | PTA_MOVBE}, + | PTA_BMI | PTA_F16C | PTA_MOVBE | PTA_PRFCHW}, {"generic64", PROCESSOR_GENERIC64, CPU_GENERIC64, PTA_64BIT | PTA_HLE /* flags are only used for -march switch. */ },
Re: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)
On Tue, Sep 11, 2012 at 03:57:44AM -0500, Gabriel Dos Reis wrote: > On Tue, Sep 11, 2012 at 3:42 AM, Richard Guenther > wrote: > > > Btw, this then happily fits into my suggestion that the "elementalness" > > can be autodetected by the compiler simply by means of a proper IPA > > pass and thus be fully LTO / whole-program aware. No need for an > > attribute (where you'd need to handle the case that the attribute was placed > > there by error). > > We are in violent agreement. For locally defined functions sure, the question is if we want the attribute to be something for external functions. Something that would have ABI implications (the external symbol would need to be provided in two forms (or more?), one scalar with normal mangling, one vector with some other kind of mangling/suffix/whatever), when compiling the definition of function with such an attribute the compiler could verify its properties (i.e. autodetect and if it is not autodetected elemental, complain?), and when using extern function just rely on it being provided twice. Even with LTO, the function can be defined in some other shared library etc. Nothing says the implementation of the vector version of the elemental function necessary has to be vectorized, just that the arguments would need to be passed in the expected vector registers, similarly for return value. Say if the elemental function is compiled with -O0, then there could just be a loop executing the scalar body several times and creating vectors. Jakub
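The last point above, that the vector entry point need not itself be vectorized, can be illustrated as follows. This is a rough sketch under assumptions: `LANES`, `struct vec4f`, `scale_add` and `scale_add_v4` are hypothetical, and passing a struct by value only stands in for whatever vector-register convention a real ABI would define.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 4                 /* assumed vector length, illustrative */

/* Hypothetical stand-in for a vector argument.  */
struct vec4f { float lane[LANES]; };

static_assert (sizeof (struct vec4f) == LANES * sizeof (float),
               "no padding expected between lanes");

/* Scalar version of the elemental function.  */
static float scale_add (float x, float y)
{
    return 2.0f * x + y;
}

/* Vector entry point as an unoptimizing compiler might provide it:
   a plain loop that runs the scalar body once per lane.  */
static struct vec4f scale_add_v4 (struct vec4f x, struct vec4f y)
{
    struct vec4f r;

    for (size_t i = 0; i < LANES; i++)
        r.lane[i] = scale_add (x.lane[i], y.lane[i]);
    return r;
}
```

A caller that was vectorized against the declaration only relies on the second symbol existing with the agreed calling convention; how its body is implemented stays private to the defining translation unit.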
RE: [PATCH] Enable bbro for -Os
Thank you for the detail comments. The updated patched is attached. Is it OK? Thanks! -Zhenqiang > -Original Message- > From: Eric Botcazou [mailto:ebotca...@adacore.com] > Sent: Tuesday, September 11, 2012 1:01 AM > To: Zhenqiang Chen > Cc: gcc-patches@gcc.gnu.org > Subject: Re: [PATCH] Enable bbro for -Os > > > All other comments are accepted. > > > > The updated patch is attached. Is it OK? > > As you probably gathered, I had missed that Steven and Richard had already > commented on your patch before posting my message. Sorry about that... > > I think that the patch is interesting because, even if it doesn't exactly > implement what the comment in gate_handle_reorder_blocks was talking > about, it fixes code layout regressions without increasing the code size (and > even decreasing it). So, assuming that Steven and Richard don't strongly > oppose, I think the patch is OK modulo the following nits: > > + The above description is for the full algorithm, which is used when the > + function is optimized for speed. When the function is optimized for size, > + in order to reduce long jump and connect more fall through edges, > + the > > long jumps... bb-reorder.c uses "fallthru edges" consistently. > > + algorithm is modified as follows: > + (1) Break long trace to short ones. The trace is broken at a block, which > + has multi-predecessors/successors during finding traces. > > long traces... A trace is broken at a block that has multiple predecessors/ > successors during trace discovery. > > + (2) Ignore the edge probability and frequency for fall through edges. > > fallthru > > + (3) Keep its original order when there is no chance to fall through. > + bbro > > Keep the original order of blocks... We rely on the results of cfg_cleanup > > + bases on the result of cfg_cleanup, which does lots of optimizations > + on > cfg. > + So the order is expected to be kept if no fall through. 
> + > + To implement the change for code size optimization, block's index is > + selected as the key and all traces are found in one round. > > > + /* If the best destination has multiple successors or predecessors, > + don't allow it to be added when optimizing for size. This makes > + sure predecessors with smaller index handled before the best > + destination. It breaks long trace and reduces long jumps. > > missing "are" before "handled" > > > + After removing the best edge, the final result will be ABCD/ACBD. > + It does not add jump compared with the previous order. But it > + reduce the possibility of long jump. */ > > Double space before "But". > > > + if (optimize_function_for_size_p (cfun)) > +{ > + e_index = src_index_p ? e->src->index : e->dest->index; > + b_index = src_index_p ? cur_best_edge->src->index > + : cur_best_edge->dest->index; > + /* The smaller one is better to keep the original order. */ > + return b_index > e_index; > +} > > Trailing space after the last parenthesis. > > > + /* If dest has multiple predecessors, skip it. We expect > + that one predecessor with smaller index connect with it > + later. */ > > connects > > > + /* Only connect Trace n with Trace n + 1. It is conservative > + to keep the order as close as possible to the original order. > + It also helps to reduce long jump. */ > > long jumps > > > Thanks for working on this. > > -- > Eric Botcazou Enable-bbro-for-size-updated3.patch Description: Binary data
Re: Bootstrap fails (was: Remove unnecessary VEC function overloads.)
On Tue, Sep 11, 2012 at 9:58 AM, Tobias Burnus wrote:
> On 09/11/2012 01:52 AM, Diego Novillo wrote:
>>
>> Remove unnecessary VEC function overloads.
>>
>> Several VEC member functions that accept an element 'T' used to have
>> two overloads: one taking 'T', the second taking 'T *'.
>
>
> They might be unnecessary, but with your patch bootstrapping fails here
> with the following failure.
>
> Did you test with or without Graphite?

Fixed with the attached.

Richard.

> Tobias
>
>
> /home/tob/projects/gcc-git/gcc/gcc/graphite-scop-detection.c: In function
> ‘void move_sd_regions(vec_t**, vec_t**)’:
> /home/tob/projects/gcc-git/gcc/gcc/vec.h:408:63: error: no matching function
> for call to ‘vec_t::safe_push(vec_t**,
> sd_region*&, const char [61], int, const char [16])’
> (vec_t::safe_push (&(V), O VEC_CHECK_INFO MEM_STAT_INFO))
>^
> /home/tob/projects/gcc-git/gcc/gcc/graphite-scop-detection.c:146:5: note: in
> expansion of macro 'VEC_safe_push'
> VEC_safe_push (sd_region, heap, *target, s);
> ^
> /home/tob/projects/gcc-git/gcc/gcc/vec.h:408:63: note: candidate is:
> (vec_t::safe_push (&(V), O VEC_CHECK_INFO MEM_STAT_INFO))
>
Re: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)
On Tue, Sep 11, 2012 at 11:06 AM, Jakub Jelinek wrote: > On Tue, Sep 11, 2012 at 03:57:44AM -0500, Gabriel Dos Reis wrote: >> On Tue, Sep 11, 2012 at 3:42 AM, Richard Guenther >> wrote: >> >> > Btw, this then happily fits into my suggestion that the "elementalness" >> > can be autodetected by the compiler simply by means of a proper IPA >> > pass and thus be fully LTO / whole-program aware. No need for an >> > attribute (where you'd need to handle the case that the attribute was >> > placed >> > there by error). >> >> We are in violent agreement. > > For locally defined functions sure, the question is if we want the attribute > to be something for external functions. Something that would have ABI > implications (the external symbol would need to be provided in two forms (or > more?), one scalar with normal mangling, one vector with some other kind > of mangling/suffix/whatever), when compiling the definition of function with > such an attribute the compiler could verify its properties (i.e. autodetect > and if it is not autodetected elemental, complain?), and when using extern > function just rely on it being provided twice. Even with LTO, the function > can be defined in some other shared library etc. > > Nothing says the implementation of the vector version of the elemental > function necessary has to be vectorized, just that the arguments would need > to be passed in the expected vector registers, similarly for return value. > Say if the elemental function is compiled with -O0, then there could just be > a loop executing the scalar body several times and creating vectors. Sure. And the "versioning" can happen from the C frontend then. Of course this one has the requirement of documenting the ABI. Richard. > Jakub
Re: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)
On Tue, Sep 11, 2012 at 4:06 AM, Jakub Jelinek wrote: > On Tue, Sep 11, 2012 at 03:57:44AM -0500, Gabriel Dos Reis wrote: >> On Tue, Sep 11, 2012 at 3:42 AM, Richard Guenther >> wrote: >> >> > Btw, this then happily fits into my suggestion that the "elementalness" >> > can be autodetected by the compiler simply by means of a proper IPA >> > pass and thus be fully LTO / whole-program aware. No need for an >> > attribute (where you'd need to handle the case that the attribute was >> > placed >> > there by error). >> >> We are in violent agreement. > > For locally defined functions sure, the question is if we want the attribute > to be something for external functions. Something that would have ABI > implications (the external symbol would need to be provided in two forms (or > more?), one scalar with normal mangling, one vector with some other kind > of mangling/suffix/whatever), when compiling the definition of function with > such an attribute the compiler could verify its properties (i.e. autodetect > and if it is not autodetected elemental, complain?), and when using extern > function just rely on it being provided twice. Even with LTO, the function > can be defined in some other shared library etc. > > Nothing says the implementation of the vector version of the elemental > function necessary has to be vectorized, just that the arguments would need > to be passed in the expected vector registers, similarly for return value. > Say if the elemental function is compiled with -O0, then there could just be > a loop executing the scalar body several times and creating vectors. >
As was pointed out earlier (by Marc?), there is also an issue of overload resolution if these automatically synthesized functions have to be visible, which of course entails all the ABI issues. This is really a language design issue, not just a compiler implementation one.
If the synthesized functions do not need to have the same status as real functions (hence no need for attributes), then these issues evaporate. -- Gaby
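To make the quoted remark about an unvectorized vector entry point concrete, here is a minimal sketch in C using GCC's generic vector extension. It is only an illustration under stated assumptions: the names scale_and_shift and scale_and_shift_v2df are made up, a two-element vector is used to keep the example portable, and nothing here reflects the mangling or ABI that the Cilk Plus patch actually uses.

```c
/* Two doubles in one 16-byte vector (GCC generic vector extension).  */
typedef double v2df __attribute__ ((vector_size (16)));

/* Scalar body of the would-be elemental function (hypothetical).  */
double
scale_and_shift (double x)
{
  return 2.0 * x + 1.0;
}

/* Hypothetical vector entry point.  Arguments and return value are
   vectors, but the body is just a loop over the scalar function,
   which is what an -O0 build could fall back to.  */
v2df
scale_and_shift_v2df (v2df x)
{
  v2df result;
  for (int i = 0; i < 2; i++)
    result[i] = scale_and_shift (x[i]);
  return result;
}
```

A caller only sees the vector signature; whether the body is really vectorized is an implementation detail of the definition.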
Re: [PATCH] PowerPC VLE port
2012-09-10 Maciej W. Rozycki gcc/ * config/rs6000/rs6000.c (print_operand) <'c'>: Remove. * config/rs6000/spe.md: Remove a leftover comment. Okay. This patch wasn't sent to gcc-patches -- can we see it please? Segher
Re: Bootstrap fails (was: Remove unnecessary VEC function overloads.)
> Fixed with the attached. Followed by the same failure on darwin. Fixed with --- ../_clean/gcc/config/darwin.c 2012-07-09 22:06:21.0 +0200 +++ ../p_work/gcc/config/darwin.c 2012-09-11 11:53:02.0 +0200 @@ -1878,7 +1878,7 @@ darwin_asm_named_section (const char *na the assumption of how this is done. */ if (lto_section_names == NULL) lto_section_names = VEC_alloc (darwin_lto_section_e, gc, 16); - VEC_safe_push (darwin_lto_section_e, gc, lto_section_names, &e); + VEC_safe_push (darwin_lto_section_e, gc, lto_section_names, e); } else if (strncmp (name, "__DWARF,", 8) == 0) darwin_asm_dwarf_section (name, flags, decl); @@ -2698,7 +2698,7 @@ darwin_asm_dwarf_section (const char *na fprintf (asm_out_file, "Lsection%.*s:\n", namelen, sname); e.count = 1; e.name = xstrdup (sname); - VEC_safe_push (dwarf_sect_used_entry, gc, dwarf_sect_names_table, &e); + VEC_safe_push (dwarf_sect_used_entry, gc, dwarf_sect_names_table, e); } } (now at stage 2). TIA Dominique
Re: [Patch ARM] implement bswap16
On 10 September 2012 19:30, Richard Earnshaw wrote: > On 10/09/12 16:40, Christophe Lyon wrote: >> Why do we have to keep room for the predicate here? (%?) Doesn't this >> pattern match only in unconditional cases? >> > > Because the ARM back-end has a very late conditionalizer pass that can > also generate conditional execution. It very rarely kicks in these > days, but if the predication rules are in there you could end up with an > instruction that the compiler thought was conditionally executed being > always run. That would be bad^TM. > Thanks for the clarification. >> BTW, I didn't manage to have GCC generate conditional revsh. I merely >> added an "if (y)" guard before calling builtin_bswap16, but this >> didn't turn into a conditional revsh. >> On this topic, could you suggest a way to generate conditional revsh? I would like to augment the testsuite for this, and I tried: int y; short swaps16(short x) { if (y) return __builtin_bswap16(x); } but it generates: swaps16: @ args = 0, pretend = 0, frame = 0 @ frame_needed = 0, uses_anonymous_args = 0 @ link register save eliminated. movwr3, #:lower16:y @ 50*arm_movsi_vfp/4[length = 4] movtr3, #:upper16:y @ 51*arm_movt [length = 4] ldr r3, [r3]@ 7 *arm_movsi_vfp/5[length = 4] cmp r3, #0 @ 8 *arm_cmpsi_insn/3 [length = 4] beq .L3 @ 9 arm_cond_branch [length = 4] revsh r0, r0 @ 13*arm_revsh/3[length = 4] bx lr @ 56*arm_return [length = 12] .L3: bx lr @ 58*arm_return [length = 12] ie unconditional revsh. Another question regarding the *arm_revsh pattern you wrote: why is the "arch" set to "t1,t2,32" ? Shouldn't it be "t1,t2,a" ? (IIUC, "32" matches both "a" and "t2" as per the definition of TARGET_32BIT) Thanks Christophe.
Re: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)
On Tue, 11 Sep 2012, Richard Guenther wrote: On Tue, Sep 11, 2012 at 10:41 AM, Richard Guenther wrote: On Mon, Sep 10, 2012 at 6:37 PM, Richard Henderson wrote: Whether or not the compiler creates a clone COULD BE totally up to the compiler, based on whether or not vectorization is enabled, whether the loop has been analyzed such that vectorization may proceed, or indeed the phase of the moon. But in order for that to happen, the clone must be totally private to the module for which we are generating code (in the LTO sense, this is the entire program or dll; without LTO, this is just the object file). It means that we never attempt to generate clones for functions for which the body of the function is not visible. On the other hand, if you insist on assuming a clone exists merely because a declaration bears an attribute, then you must address ALL of the problems with respect to defining a stable ABI in the face of different cpu revisions, different ISAs, and different vector lengths. I've not seen you address ANY of these problems, despite having the problem pointed out multiple times. Indeed, if the definition of an elemental function is always visible to the vectorizer the vectorizer itself can instruct the creation of the clone if it does not already exist (just make those clones managed by the callgraph). Then the clones are visible to the current TU only and no ABI issues exist (though you could say that the vectorizer or the inliner could as well force inlining of elemental functions into places it wants to vectorize - one complication even with local clones is that the x86 ABI has no callee-saved XMM registers which makes function calls inside loops especially expensive). I thought gcc wouldn't use the x86 ABI for those private calls. I guess what I remember were vague discussions and not a description of the current status... 
Btw, this then happily fits into my suggestion that the "elementalness" can be autodetected by the compiler simply by means of a proper IPA pass and thus be fully LTO / whole-program aware. No need for an attribute (where you'd need to handle the case that the attribute was placed there by error). Note that, apart from preventing external calls, it removes this use case: __attribute__((vector(4))) double mysqrt(double x){return sqrt(x);} __m256d var; mysqrt(var); I am not sure it is the best way to achieve this, but it is one way. I am also planning a patch to turn {sqrt(a),sqrt(b)} into sqrt({a,b}) when the target likes it. And there is a PR asking for a __builtin_math_sqrt. -- Marc Glisse
[PATCH] Fix PR54534
The backport of the patch for PR53572 caused us to remove unused decls at -O0, a regression on the branch - fixed by the following. Bootstrapped and tested on x86_64-unknown-linux-gnu, applied. Richard. 2012-09-11 Richard Guenther PR debug/54534 * cgraph.h (varpool_can_remove_if_no_refs): Restore dependence on flag_toplevel_reorder. Index: gcc/cgraph.h === --- gcc/cgraph.h(revision 191174) +++ gcc/cgraph.h(working copy) @@ -951,7 +951,7 @@ varpool_can_remove_if_no_refs (struct va return (!node->force_output && !node->used_from_other_partition && ((DECL_COMDAT (node->decl) && !varpool_used_from_object_file_p (node)) - || !node->externally_visible + || (flag_toplevel_reorder && !node->externally_visible) || DECL_HAS_VALUE_EXPR_P (node->decl))); }
Re: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)
On Tue, Sep 11, 2012 at 12:29:10PM +0200, Marc Glisse wrote: > >Btw, this then happily fits into my suggestion that the "elementalness" > >can be autodetected by the compiler simply by means of a proper IPA > >pass and thus be fully LTO / whole-program aware. No need for an > >attribute (where you'd need to handle the case that the attribute was placed > >there by error). > > Note that, apart from preventing external calls, it removes this use case: > > __attribute__((vector(4))) double mysqrt(double x){return sqrt(x);} > > __m256d var; > mysqrt(var); I don't think those functions should be available for C++ overloading. For one, it would be only for C++, not for C, and how would you handle the case where the user already provides __m256d mysqrt(__m256d); overload in addition to the one with vector attribute? I'd say the compiler should when beneficial synthetize calls to those in SLP or normal vectorizer instead, so you'd write: (__m256d){mysqrt(var[0]),mysqrt(var[1]),mysqrt(var[2]),mysqrt(var[3])}; instead of mysqrt(var); and the compiler would turn that into mysqrt.elem.V4DF(var) (or whatever the mangling of the elemental functions would be). Jakub
Re: [Patch ARM] implement bswap16
On 11/09/12 11:25, Christophe Lyon wrote: > On 10 September 2012 19:30, Richard Earnshaw wrote: >> On 10/09/12 16:40, Christophe Lyon wrote: >>> Why do we have to keep room for the predicate here? (%?) Doesn't this >>> pattern match only in unconditional cases? >>> >> >> Because the ARM back-end has a very late conditionalizer pass that can >> also generate conditional execution. It very rarely kicks in these >> days, but if the predication rules are in there you could end up with an >> instruction that the compiler thought was conditionally executed being >> always run. That would be bad^TM. >> > > Thanks for the clarification. > >>> BTW, I didn't manage to have GCC generate conditional revsh. I merely >>> added an "if (y)" guard before calling builtin_bswap16, but this >>> didn't turn into a conditional revsh. >>> > On this topic, could you suggest a way to generate conditional revsh? > > I would like to augment the testsuite for this, and I tried: > > int y; > short swaps16(short x) { > if (y) > return __builtin_bswap16(x); > } > but it generates: > swaps16: > @ args = 0, pretend = 0, frame = 0 > @ frame_needed = 0, uses_anonymous_args = 0 > @ link register save eliminated. > movwr3, #:lower16:y @ 50*arm_movsi_vfp/4[length = 4] > movtr3, #:upper16:y @ 51*arm_movt [length = 4] > ldr r3, [r3]@ 7 *arm_movsi_vfp/5[length = 4] > cmp r3, #0 @ 8 *arm_cmpsi_insn/3 [length = 4] > beq .L3 @ 9 arm_cond_branch [length = 4] > revsh r0, r0 @ 13*arm_revsh/3[length = 4] > bx lr @ 56*arm_return [length = 12] > .L3: > bx lr @ 58*arm_return [length = 12] > > ie unconditional revsh. > > > Another question regarding the *arm_revsh pattern you wrote: why is > the "arch" set to "t1,t2,32" ? Shouldn't it be "t1,t2,a" ? > (IIUC, "32" matches both "a" and "t2" as per the definition of TARGET_32BIT) > > Thanks > > Christophe. 
>

Try something like:

short foo(int);

short swaps (short x, int y)
{
  int z = x;
  if (y)
    z = __builtin_bswap16(x);
  return foo (z);
}

If that's not enough, try adding 1 to z before calling foo.

R.
Re: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)
On Tue, 11 Sep 2012, Jakub Jelinek wrote: On Tue, Sep 11, 2012 at 12:29:10PM +0200, Marc Glisse wrote: Note that, apart from preventing external calls, it removes this use case: __attribute__((vector(4))) double mysqrt(double x){return sqrt(x);} __m256d var; mysqrt(var); I don't think those functions should be available for C++ overloading. The current patch does make them available, according to their author. For one, it would be only for C++, not for C, and how would you handle the case where the user already provides __m256d mysqrt(__m256d); overload in addition to the one with vector attribute? The same way you handle it when the user provides 2 identical overloads. I'd say the compiler should when beneficial synthetize calls to those in SLP or normal vectorizer instead, so you'd write: (__m256d){mysqrt(var[0]),mysqrt(var[1]),mysqrt(var[2]),mysqrt(var[3])}; instead of mysqrt(var); and the compiler would turn that into mysqrt.elem.V4DF(var) (or whatever the mangling of the elemental functions would be). Ok. -- Marc Glisse
Recognize vec_perm_expr in a constructor of bit_field_ref
Hello, here is a patch that turns {v[1],v[0]} into vec_perm_expr(v,v,{1,0}) if the target is ok with it. I am attaching 2 versions of the patch. p-good is the one that passes testing. p-bad, where I rely on fold_stmt to detect identity permutations, ICEs towards the end of the pass while checking a bogus gimple stmt (one that gimple_debug_stmt crashes on if I call it in gdb). From a performance point of view, p-good makes sense, but I liked the simplicity of p-bad and I am confused as to why it fails. 2012-09-11 Marc Glisse gcc/ * tree-ssa-forwprop.c (simplify_vector_constructor): New function. (ssa_forward_propagate_and_combine): Call it. gcc/testsuite/ * gcc.dg/tree-ssa/forwprop-22.c: New testcase. -- Marc GlisseIndex: Makefile.in === --- Makefile.in (revision 191173) +++ Makefile.in (working copy) @@ -2237,21 +2237,22 @@ tree-outof-ssa.o : tree-outof-ssa.c $(TR $(TREE_H) $(DIAGNOSTIC_H) $(TM_H) coretypes.h dumpfile.h \ $(TREE_SSA_LIVE_H) $(BASIC_BLOCK_H) $(BITMAP_H) $(GGC_H) \ $(EXPR_H) $(SSAEXPAND_H) $(GIMPLE_PRETTY_PRINT_H) tree-ssa-dse.o : tree-ssa-dse.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \ $(TM_H) $(GGC_H) $(TREE_H) $(TM_P_H) $(BASIC_BLOCK_H) \ $(TREE_FLOW_H) $(TREE_PASS_H) domwalk.h $(FLAGS_H) \ $(GIMPLE_PRETTY_PRINT_H) langhooks.h tree-ssa-forwprop.o : tree-ssa-forwprop.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \ $(TM_H) $(TREE_H) $(TM_P_H) $(BASIC_BLOCK_H) $(CFGLOOP_H) \ $(TREE_FLOW_H) $(TREE_PASS_H) $(DIAGNOSTIC_H) \ - langhooks.h $(FLAGS_H) $(GIMPLE_H) $(GIMPLE_PRETTY_PRINT_H) $(EXPR_H) + langhooks.h $(FLAGS_H) $(GIMPLE_H) $(GIMPLE_PRETTY_PRINT_H) $(EXPR_H) \ + $(TREE_VECTORIZER_H) tree-ssa-phiprop.o : tree-ssa-phiprop.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \ $(TM_H) $(TREE_H) $(TM_P_H) $(BASIC_BLOCK_H) \ $(TREE_FLOW_H) $(TREE_PASS_H) $(DIAGNOSTIC_H) \ langhooks.h $(FLAGS_H) $(GIMPLE_PRETTY_PRINT_H) tree-ssa-ifcombine.o : tree-ssa-ifcombine.c $(CONFIG_H) $(SYSTEM_H) \ coretypes.h $(TM_H) $(TREE_H) $(BASIC_BLOCK_H) \ $(TREE_FLOW_H) $(TREE_PASS_H) 
$(DIAGNOSTIC_H) \ $(TREE_PRETTY_PRINT_H) tree-ssa-phiopt.o : tree-ssa-phiopt.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \ $(TM_H) $(GGC_H) $(TREE_H) $(TM_P_H) $(BASIC_BLOCK_H) \ Index: testsuite/gcc.dg/tree-ssa/forwprop-22.c === --- testsuite/gcc.dg/tree-ssa/forwprop-22.c (revision 0) +++ testsuite/gcc.dg/tree-ssa/forwprop-22.c (revision 0) @@ -0,0 +1,18 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target vect_double } */ +/* { dg-require-effective-target vect_perm } */ +/* { dg-options "-O -fdump-tree-optimized" } */ + +typedef double vec __attribute__((vector_size (2 * sizeof (double; +void f (vec *px, vec *y, vec *z) +{ + vec x = *px; + vec t1 = { x[1], x[0] }; + vec t2 = { x[0], x[1] }; + *y = t1; + *z = t2; +} + +/* { dg-final { scan-tree-dump-times "VEC_PERM_EXPR" 1 "optimized" } } */ +/* { dg-final { scan-tree-dump-not "BIT_FIELD_REF" "optimized" } } */ +/* { dg-final { cleanup-tree-dump "optimized" } } */ Property changes on: testsuite/gcc.dg/tree-ssa/forwprop-22.c ___ Added: svn:keywords + Author Date Id Revision URL Added: svn:eol-style + native Index: tree-ssa-forwprop.c === --- tree-ssa-forwprop.c (revision 191173) +++ tree-ssa-forwprop.c (working copy) @@ -26,20 +26,21 @@ along with GCC; see the file COPYING3. #include "tm_p.h" #include "basic-block.h" #include "gimple-pretty-print.h" #include "tree-flow.h" #include "tree-pass.h" #include "langhooks.h" #include "flags.h" #include "gimple.h" #include "expr.h" #include "cfgloop.h" +#include "tree-vectorizer.h" /* This pass propagates the RHS of assignment statements into use sites of the LHS of the assignment. It's basically a specialized form of tree combination. It is hoped all of this can disappear when we have a generalized tree combiner. One class of common cases we handle is forward propagating a single use variable into a COND_EXPR. 
bb0: @@ -2787,20 +2788,105 @@ simplify_permutation (gimple_stmt_iterat if (TREE_CODE (op0) == SSA_NAME) ret = remove_prop_source_from_use (op0); if (op0 != op1 && TREE_CODE (op1) == SSA_NAME) ret |= remove_prop_source_from_use (op1); return ret ? 2 : 1; } return 0; } +/* Recognize a VEC_PERM_EXPR. Returns true if there were any changes. */ + +static bool +simplify_vector_constructor (gimple_stmt_iterator *gsi) +{ + gimple stmt = gsi_stmt (*gsi); + gimple def_stmt; + tree op, op2, orig, type, elem_type; + unsigned elem_size, nelts, i; + enum tree_code code; + constructor_elt *elt; + unsigned char *sel; + bool maybe_ident; + + gcc_checking_assert (gimple_assign_rhs_c
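
For readers following the description at the top of this message, the equivalence the patch relies on can be sketched at the source level. This is only an illustration, not part of the patch: the function names are made up, and the explicit form uses __builtin_shuffle on GCC and __builtin_shufflevector on clang, while the patch itself works on GIMPLE (VEC_PERM_EXPR), not on these builtins.

```c
/* Two-element double vector and a same-width integer mask type.  */
typedef double v2df __attribute__ ((vector_size (16)));
typedef long long v2di __attribute__ ((vector_size (16)));

/* Constructor built from element reads: { v[1], v[0] }.  */
v2df
swap_by_constructor (v2df v)
{
  v2df r = { v[1], v[0] };
  return r;
}

/* The same permutation written as an explicit shuffle.  */
v2df
swap_by_shuffle (v2df v)
{
#if defined (__clang__)
  return __builtin_shufflevector (v, v, 1, 0);
#else
  v2di mask = { 1, 0 };
  return __builtin_shuffle (v, mask);
#endif
}
```

Both functions return the input with its two elements exchanged; the identity case { v[0], v[1] } is the one the message says should fold back to v itself.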
Re: [PATCH] PowerPC VLE port
On Tue, 11 Sep 2012, Segher Boessenkool wrote: > > > 2012-09-10 Maciej W. Rozycki > > > > > > gcc/ > > > * config/rs6000/rs6000.c (print_operand) <'c'>: Remove. > > > * config/rs6000/spe.md: Remove a leftover comment. > > > > Okay. > > This patch wasn't sent to gcc-patches -- can we see it please? Umm, I didn't notice a cc to gcc-patches was removed in the course of discussion, sorry about that. Here's the change concerned. Maciej gcc-powerpc-print-operand-c.patch Index: gcc/config/rs6000/spe.md === --- gcc/config/rs6000/spe.md(revision 191161) +++ gcc/config/rs6000/spe.md(working copy) @@ -2945,8 +2945,6 @@ "mfspefscr %0" [(set_attr "type" "vecsimple")]) -;; FP comparison stuff. - ;; Flip the GT bit. (define_insn "e500_flip_gt_bit" [(set (match_operand:CCFP 0 "cc_reg_operand" "=y") Index: gcc/config/rs6000/rs6000.c === --- gcc/config/rs6000/rs6000.c (revision 191161) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -14659,14 +14659,6 @@ print_operand (FILE *file, rtx x, int co /* %c is output_addr_const if a CONSTANT_ADDRESS_P, otherwise output_operand. */ -case 'c': - /* X is a CR register. Print the number of the GT bit of the CR. */ - if (GET_CODE (x) != REG || ! CR_REGNO_P (REGNO (x))) - output_operand_lossage ("invalid %%c value"); - else - fprintf (file, "%d", 4 * (REGNO (x) - CR0_REGNO) + 1); - return; - case 'D': /* Like 'J' but get to the GT bit only. */ gcc_assert (REG_P (x));
Re: vector comparisons in C++
Any comment? http://gcc.gnu.org/ml/gcc-patches/2012-08/msg02098.html Maybe separately on the technical and political aspects? On Sat, 1 Sep 2012, Marc Glisse wrote: With the patch... On Sat, 1 Sep 2012, Marc Glisse wrote: Hello, this patch copies some more vector extensions from the C front-end to the C++ front-end. There seemed to be some reluctance to add those, but I guess a patch is the best way to ask. Note that I only added the vector x vector operations, not the vector x scalar ones. I have some issues with the vector-compare-2.c torture test. It passes a vector by value (argument and return type), which is likely to warn (although for some reason it doesn't for me, with today's compiler). And it takes -Wno-psabi through a .x file, but those are not read in c-c++-common, so I put it in dg-options. I would have changed the function to use pointers, but I don't know if it specifically wants to test passing by value... 2012-08-31 Marc Glisse PR c++/54427 cp/ChangeLog * typeck.c (cp_build_binary_op) [LSHIFT_EXPR, RSHIFT_EXPR, EQ_EXPR, NE_EXPR, LE_EXPR, GE_EXPR, LT_EXPR, GT_EXPR]: Handle VECTOR_TYPE. testsuite/ChangeLog * gcc.dg/vector-shift.c: Move ... * c-c++-common/vector-shift.c: ... here. * gcc.dg/vector-shift1.c: Move ... * c-c++-common/vector-shift1.c: ... here. * gcc.dg/vector-shift3.c: Move ... * c-c++-common/vector-shift3.c: ... here. * gcc.dg/vector-compare-1.c: Move ... * c-c++-common/vector-compare-1.c: ... here. * gcc.dg/vector-compare-2.c: Move ... * c-c++-common/vector-compare-2.c: ... here. * gcc.c-torture/execute/vector-compare-1.c: Move ... * c-c++-common/torture/vector-compare-1.c: ... here. * gcc.c-torture/execute/vector-compare-2.x: Delete. * gcc.c-torture/execute/vector-compare-2.c: Move ... * c-c++-common/torture/vector-compare-2.c: ... here. * gcc.c-torture/execute/vector-shift.c: Move ... * c-c++-common/torture/vector-shift.c: ... here. * gcc.c-torture/execute/vector-shift2.c: Move ... * c-c++-common/torture/vector-shift2.c: ... here. 
* gcc.c-torture/execute/vector-subscript-1.c: Move ... * c-c++-common/torture/vector-subscript-1.c: ... here. * gcc.c-torture/execute/vector-subscript-2.c: Move ... * c-c++-common/torture/vector-subscript-2.c: ... here. * gcc.c-torture/execute/vector-subscript-3.c: Move ... * c-c++-common/torture/vector-subscript-3.c: ... here. -- Marc Glisse
Re: [i386] recognize haddpd
Hello, any advice? http://gcc.gnu.org/ml/gcc-patches/2012-09/msg00044.html On Sun, 2 Sep 2012, Marc Glisse wrote: Hello, this patch passes bootstrap+testsuite. It is probably wrong in many ways, but I don't know enough to do more without some advice. The goal is to recognize that v[0]+v[1] can be computed with haddpd. With the patch, v[0]-v[1] becomes hsubpd and v[1]+v[0] becomes haddpd. Also, thanks to it, {v[0]-v[1], w[0]-w[1]} is now recognized as a single hsubpd. 1) Is a define_insn the right tool? 2) {v[0]-v[1], v[0]-v[1]} is not recognized as a hsubpd because vec_duplicate doesn't match vec_concat. Do we really need to duplicate (no pun intended) the pattern? 3) v[0]+v[1] is not recognized. Some pass changed their order, and nothing tries the reverse order. I can see 3 ways: canonicalize the order at some point, let combine try both orders for commutative operators or make the patterns more flexible (I don't know how many would need changing). 4) I don't understand the set_attr part. I copied it from the haddpd define_insn, and removed (set_attr "type" "sseadd") because it crashed the compiler. isa and prefix make sense and they match the alternatives, but I am not sure about "mode" (removing it still works IIRC). 2012-09-02 Marc Glisse gcc/ * config/i386/sse.md (*sse3_hv2df3_low): New. gcc/testsuite/ * gcc.target/i386/pr54400.c: New testcase. -- Marc Glisse
Re: Bootstrap fails
On 2012-09-11 03:58 , Tobias Burnus wrote: Did you test with or without Graphite? I tested with and without release checking, all languages and all targets that use VEC. So many combinations... how is graphite enabled? Diego.
Re: Bootstrap fails
On 2012-09-11 05:35 , Richard Guenther wrote: On Tue, Sep 11, 2012 at 9:58 AM, Tobias Burnus wrote: On 09/11/2012 01:52 AM, Diego Novillo wrote: Remove unnecessary VEC function overloads. Several VEC member functions that accept an element 'T' used to have two overloads: one taking 'T', the second taking 'T *'. They might be unnecessary, but with your patch bootstrapping fails here with the following failure. Did you test with or without Graphite? Fixed with the attached. Thanks! Diego.
Re: Bootstrap fails
On Tue, Sep 11, 2012 at 1:41 PM, Diego Novillo wrote: > On 2012-09-11 03:58 , Tobias Burnus wrote: > >> Did you test with or without Graphite? > > > I tested with and without release checking, all languages and all targets > that use VEC. So many combinations... how is graphite enabled? By having its prerequisites available (cloog and isl). Richard. > > Diego.
Re: Bootstrap fails (was: Remove unnecessary VEC function overloads.)
On 2012-09-11 06:12 , Dominique Dhumieres wrote: Fixed with the attached. Followed by the same failure on darwin. Fixed with --- ../_clean/gcc/config/darwin.c 2012-07-09 22:06:21.0 +0200 +++ ../p_work/gcc/config/darwin.c 2012-09-11 11:53:02.0 +0200 @@ -1878,7 +1878,7 @@ darwin_asm_named_section (const char *na the assumption of how this is done. */ if (lto_section_names == NULL) lto_section_names = VEC_alloc (darwin_lto_section_e, gc, 16); - VEC_safe_push (darwin_lto_section_e, gc, lto_section_names, &e); + VEC_safe_push (darwin_lto_section_e, gc, lto_section_names, e); } else if (strncmp (name, "__DWARF,", 8) == 0) darwin_asm_dwarf_section (name, flags, decl); @@ -2698,7 +2698,7 @@ darwin_asm_dwarf_section (const char *na fprintf (asm_out_file, "Lsection%.*s:\n", namelen, sname); e.count = 1; e.name = xstrdup (sname); - VEC_safe_push (dwarf_sect_used_entry, gc, dwarf_sect_names_table, &e); + VEC_safe_push (dwarf_sect_used_entry, gc, dwarf_sect_names_table, e); } } Gah, my grep did not include config/*.c. This is ok, of course. Diego.
Re: Bootstrap fails
On 09/11/2012 01:41 PM, Diego Novillo wrote: On 2012-09-11 03:58 , Tobias Burnus wrote: Did you test with or without Graphite? I tested with and without release checking, all languages and all targets that use VEC. So many combinations... There is unfortunately always an N+1 configuration which one hasn't tested ... how is graphite enabled? I think it is automatically enabled when the libraries are found; at least I didn't specify anything special and just see the following configure output: checking for version 0.10 of ISL... yes checking for version 0.17.0 of CLooG... yes Consequently (cf. toplevel configure): # Treat either --without-cloog or --without-isl as a request to disable # GRAPHITE support and skip all following checks. If you don't have them in the default tree: See http://gcc.gnu.org/install/prerequisites.html and http://gcc.gnu.org/install/configure.html Both also build in tree. Tobias PS: Thanks for the clean up patch.
Re: Remove unnecessary VEC function overloads.
On 2012-09-11 01:01 , Ian Lance Taylor wrote: On Mon, Sep 10, 2012 at 4:52 PM, Diego Novillo wrote: Ian, could you commit the changes in go/gofrontend? Done. Actually, it looks like you already committed them, but I brought the master repo up to date. Yes, sorry. I'm not quite sure how to deal with Go patches, in general. Had I not committed the patch, then Go would've been broken. Is it OK if these patches get committed to GCC trunk? I have at least 2 or 3 more of this kind in the queue. Or do you prefer to have the master repo update first? (in which case, trunk will be broken for a little while). Diego.
Re: [patch] PR54149: fix data race in LIM pass
ok with && gimple_assign_lhs_ptr (loc->stmt) == loc->ref instead. Let's hope we conservatively catch all writes to ref this way (which is what we need, right)? Yes. Thanks. Committing the attached patch. PR middle-end/54149 * tree-ssa-loop-im.c (execute_sm_if_changed_flag_set): Only set flag for writes. diff --git a/gcc/testsuite/gcc.dg/simulate-thread/speculative-store-4.c b/gcc/testsuite/gcc.dg/simulate-thread/speculative-store-4.c new file mode 100644 index 000..59f81b7 --- /dev/null +++ b/gcc/testsuite/gcc.dg/simulate-thread/speculative-store-4.c @@ -0,0 +1,54 @@ +/* { dg-do link } */ +/* { dg-options "--param allow-store-data-races=0" } */ +/* { dg-final { simulate-thread } } */ + +#include +#include + +#include "simulate-thread.h" + +/* PR 54139 */ +/* Test that speculative stores do not happen for --param + allow-store-data-races=0. */ + +int g_13=1, insns=1; + +__attribute__((noinline)) +void simulate_thread_main() +{ + int l_245; + + /* Since g_13 is unilaterally set positive above, there should be + no store to g_13 below. */ + for (l_245 = 0; l_245 <= 1; l_245 += 1) +for (; g_13 <= 0; g_13 = 1) + ; +} + +int main() +{ + simulate_thread_main (); + simulate_thread_done (); + return 0; +} + +void simulate_thread_other_threads () +{ + ++g_13; + ++insns; +} + +int simulate_thread_step_verify () +{ + return 0; +} + +int simulate_thread_final_verify () +{ + if (g_13 != insns) +{ + printf("FAIL: g_13 was incorrectly cached\n"); + return 1; +} + return 0; +} diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c index 0f61631..67cab3a 100644 --- a/gcc/tree-ssa-loop-im.c +++ b/gcc/tree-ssa-loop-im.c @@ -2113,9 +2113,14 @@ execute_sm_if_changed_flag_set (struct loop *loop, mem_ref_p ref) gimple_stmt_iterator gsi; gimple stmt; - gsi = gsi_for_stmt (loc->stmt); - stmt = gimple_build_assign (flag, boolean_true_node); - gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING); + /* Only set the flag for writes. 
*/ + if (is_gimple_assign (loc->stmt) + && gimple_assign_lhs_ptr (loc->stmt) == loc->ref) + { + gsi = gsi_for_stmt (loc->stmt); + stmt = gimple_build_assign (flag, boolean_true_node); + gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING); + } } VEC_free (mem_ref_loc_p, heap, locs); return flag;
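
As a rough source-level picture of what the fix is about (a hand-written approximation, not output of the LIM pass; the function and variable names are invented for illustration), store motion keeps the global in a temporary during the loop and writes it back afterwards only if a flag says the loop really stored to it. The point of the change is that the flag is set at writes only, never at reads:

```c
#include <stdbool.h>

int shared = 1;

/* Loop as written: reads shared every iteration, writes it only
   for negative inputs.  */
int
loop_as_written (const int *vals, int n)
{
  int sum = 0;
  for (int i = 0; i < n; i++)
    {
      sum += shared;
      if (vals[i] < 0)
        shared = vals[i];
    }
  return sum;
}

/* Hand-written approximation of store motion with a "changed" flag.  */
int
loop_after_store_motion (const int *vals, int n)
{
  int tmp = shared;       /* load hoisted out of the loop */
  bool stored = false;    /* becomes true only at a real write */
  int sum = 0;
  for (int i = 0; i < n; i++)
    {
      sum += tmp;         /* a read: must not touch the flag */
      if (vals[i] < 0)
        {
          tmp = vals[i];
          stored = true;
        }
    }
  if (stored)
    shared = tmp;         /* no store at all if the loop never wrote */
  return sum;
}
```

If the flag were also set at the read, the final store would happen even when the loop never wrote to shared, which is the kind of speculative store that --param allow-store-data-races=0 is meant to rule out.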
shrink-wrapping duplicates BBs across partitions.
Hello, While testing the patch to enable shrink-wrapping on SH [PR54546], we hit the error "EDGE_CROSSING missing across section boundary". Indeed, shrink-wrap duplicates a bb with successors (containing the return sequence) into an unlikely section. I first thought about setting the EDGE_CROSSING flag on those edges, but I feel that this block duplication doesn't go in the direction of this optimization. Not duplicating BBs between partitions solves the problem. Does this restriction look right to you? (regression tests are still running on x86 and sh) Thanks a lot for any comments. Christian Index: function.c === --- function.c (revision 191177) +++ function.c (working copy) @@ -6063,6 +6063,7 @@ FOR_EACH_EDGE (e, ei, tmp_bb->preds) if (single_succ_p (e->src) && !bitmap_bit_p (&bb_on_list, e->src->index) + && (BB_PARTITION (e->src) == BB_PARTITION (e->dest)) && can_duplicate_block_p (e->src)) { edge pe;
Re: Bootstrap fails (was: Remove unnecessary VEC function overloads.)
> This is ok, of course. Then could you please commit it (I don't have write access)? TIA Dominique
Re: Remove unnecessary VEC function overloads.
On Tue, Sep 11, 2012 at 5:03 AM, Diego Novillo wrote: > On 2012-09-11 01:01 , Ian Lance Taylor wrote: >> >> On Mon, Sep 10, 2012 at 4:52 PM, Diego Novillo >> wrote: >>> >>> >>> Ian, could you commit the changes in go/gofrontend? >> >> >> Done. Actually, it looks like you already committed them, but I >> brought the master repo up to date. > > > Yes, sorry. I'm not quite sure how to deal with Go patches, in general. > Had I not committed the patch, then Go would've been broken. > > Is it OK if these patches get committed to GCC trunk? I have at least 2 or > 3 more of this kind in the queue. Or do you prefer to have the master repo > update first? (in which case, trunk will be broken for a little while). I think the right thing to do is to let Go break for a little while, so that the code in the GCC repository is always a copy of the gofrontend repository. I hope to get back to moving the remaining GCC-specific code out of gofrontend soon. Ian
Re: shrink-wrapping duplicates BBs across partitions.
> Does this restriction look right to you ? (regression tests are still > running on x86 and sh) Please generate your patches with diff -up (or svn diff -x -up). > + && (BB_PARTITION (e->src) == BB_PARTITION (e->dest)) No need for parentheses around this check. The shrink wrapping code appears to be dealing with partitioning, or at least there are BB_COPY_PARTITIONs further down. So I can't tell whether this fix is correct. Can you show in more detail what happens? (A dotty graph is always helpful ;-) Ciao! Steven
[PATCH, TESTSUITE] Add -fno-short-enums to pr51712
Add -fno-short-enums flag to test c-c++-common/pr51712.c as discussed in http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51712. This removes the excess warning that caused the test to fail. Tested in arm-none-eabi configuration. The test now passes. Comment? Ok for trunk? Thanks, Kyrill gcc/testsuite 2012-09-11 Kyrylo Tkachov * c-c++-common/pr51712.c: Add -fno-short-enums flag to test.
--- a/gcc/testsuite/c-c++-common/pr51712.c +++ b/gcc/testsuite/c-c++-common/pr51712.c @@ -1,6 +1,6 @@ /* PR c/51712 */ /* { dg-do compile } */ -/* { dg-options "-Wtype-limits" } */ +/* { dg-options "-Wtype-limits -fno-short-enums" {target short_enums} } */ enum test_enum { FOO,
Re: [PATCH, TESTSUITE] Add -fno-short-enums to pr51712
On Tue, Sep 11, 2012 at 01:46:37PM +0100, Kyrylo Tkachov wrote: > Add -fno-short-enums flag to test c-c++-common/pr51712.c as discussed in > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51712. > This removes the excess warning that caused the test to fail. > Tested in arm-none-eabi configuration. The test now passes. > Comment? Ok for trunk? > > Thanks, > Kyrill > > gcc/testsuite > > 2012-09-11 Kyrylo Tkachov > > * c-c++-common/pr51712.c: Add -fno-short-enums flag to test. > --- a/gcc/testsuite/c-c++-common/pr51712.c > +++ b/gcc/testsuite/c-c++-common/pr51712.c > @@ -1,6 +1,6 @@ > /* PR c/51712 */ > /* { dg-do compile } */ > -/* { dg-options "-Wtype-limits" } */ > +/* { dg-options "-Wtype-limits -fno-short-enums" {target short_enums} } */ That is wrong, it means that on non-short_enums targets suddenly no dg-options would be passed. Instead you should keep the dg-options line as is and add /* { dg-additional-options "-fno-short-enums" { target short_enums } } */ or just add the new dg-options line but keep the old one as well (though, dg-additional-options is the new preferred way). Jakub
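
Applied to the header of the test, the first alternative suggested above would look roughly like this. Only the directive lines are shown; the rest of the file stays as it is, and this is a sketch of the suggestion rather than the patch that was eventually committed.

```c
/* PR c/51712 */
/* { dg-do compile } */
/* { dg-options "-Wtype-limits" } */
/* { dg-additional-options "-fno-short-enums" { target short_enums } } */
```

With this layout every target still gets -Wtype-limits, and only short_enums targets get the extra flag.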
[Patch ARM] Allow auto-vectorizer to use vfma.
Hi, This allows the auto-vectorizer to use vfma under Ofast or ffast-math. I have a follow-up patch which will add support for these from arm_neon.h as well before someone asks. It's being regression tested as we speak and that'll follow shortly. Tested on A15 silicon native with no regressions. Committed. regards, Ramana 2012-09-11 Ramana Radhakrishnan Matthew Gretton-Dann * config/arm/neon.md (fma4): New pattern. (*fmsub4): Likewise. * doc/sourcebuild.texi (arm_neon_v2_ok, arm_neon_v2_hw): Document it. 2012-09-11 Ramana Radhakrishnan Matthew Gretton-Dann * gcc.target/arm/neon-vfma-1.c: New testcase. * gcc.target/arm/neon-vfms-1.c: Likewise. * gcc.target/arm/neon-vmla-1.c: Update test to use int instead of float. * gcc.target/arm/neon-vmls-1.c: Likewise. * lib/target-supports.exp (add_options_for_arm_neonv2): New function. (check_effective_target_arm_neonv2_ok_nocache): Likewise. (check_effective_target_arm_neonv2_ok): Likewise. (check_effective_target_arm_neonv2_hw): Likewise. (check_effective_target_arm_neonv2): Likewise.diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md index a929546..4821bb7 100644 --- a/gcc/config/arm/neon.md +++ b/gcc/config/arm/neon.md @@ -707,6 +707,33 @@ (const_string "neon_mla_qqq_32_qqd_32_scalar")] ) +;; Fused multiply-accumulate +(define_insn "fma4" + [(set (match_operand:VCVTF 0 "register_operand" "=w") +(fma:VCVTF (match_operand:VCVTF 1 "register_operand" "w") + (match_operand:VCVTF 2 "register_operand" "w") + (match_operand:VCVTF 3 "register_operand" "0")))] + "TARGET_NEON && TARGET_FMA && flag_unsafe_math_optimizations" + "vfma%?.\\t%0, %1, %2" + [(set (attr "neon_type") + (if_then_else (match_test "") + (const_string "neon_fp_vmla_ddd") + (const_string "neon_fp_vmla_qqq")))] +) + +(define_insn "*fmsub4" + [(set (match_operand:VCVTF 0 "register_operand" "=w") +(fma:VCVTF (neg:VCVTF (match_operand:VCVTF 1 "register_operand" "w")) + (match_operand:VCVTF 2 "register_operand" "w") + (match_operand:VCVTF 3 "register_operand" 
"0")))] + "TARGET_NEON && TARGET_FMA && flag_unsafe_math_optimizations" + "vfms%?.\\t%0, %1, %2" + [(set (attr "neon_type") + (if_then_else (match_test "") + (const_string "neon_fp_vmla_ddd") + (const_string "neon_fp_vmla_qqq")))] +) + (define_insn "ior3" [(set (match_operand:VDQ 0 "s_register_operand" "=w,w") (ior:VDQ (match_operand:VDQ 1 "s_register_operand" "w,0") diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi index 7e9dbe3..3fe52ad 100644 --- a/gcc/doc/sourcebuild.texi +++ b/gcc/doc/sourcebuild.texi @@ -1525,11 +1525,19 @@ ARM target supports generating NEON instructions. @item arm_neon_hw Test system supports executing NEON instructions. +@item arm_neonv2_hw +Test system supports executing NEON v2 instructions. + @item arm_neon_ok @anchor{arm_neon_ok} ARM Target supports @code{-mfpu=neon -mfloat-abi=softfp} or compatible options. Some multilibs may be incompatible with these options. +@item arm_neonv2_ok +@anchor{arm_neon_ok} +ARM Target supports @code{-mfpu=neon -mfloat-abi=softfp} or compatible +options. Some multilibs may be incompatible with these options. + @item arm_neon_fp16_ok @anchor{arm_neon_fp16_ok} ARM Target supports @code{-mfpu=neon-fp16 -mfloat-abi=softfp} or compatible diff --git a/gcc/testsuite/gcc.target/arm/neon-vfma-1.c b/gcc/testsuite/gcc.target/arm/neon-vfma-1.c new file mode 100644 index 000..a003a82 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/neon-vfma-1.c @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_neonv2_ok } */ +/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */ +/* { dg-add-options arm_neonv2 } */ +/* { dg-final { scan-assembler "vfma\\.f32\[ \]+\[dDqQ]" } } */ + +/* Verify that VFMA is used. 
*/ +void f1(int n, float a, float x[], float y[]) { + int i; + for (i = 0; i < n; ++i) +y[i] = a * x[i] + y[i]; +} diff --git a/gcc/testsuite/gcc.target/arm/neon-vfms-1.c b/gcc/testsuite/gcc.target/arm/neon-vfms-1.c new file mode 100644 index 000..8cefd8a --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/neon-vfms-1.c @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_neonv2_ok } */ +/* { dg-options "-O2 -ftree-vectorize -ffast-math" } */ +/* { dg-add-options arm_neonv2 } */ +/* { dg-final { scan-assembler "vfms\\.f32\[ \]+\[dDqQ]" } } */ + +/* Verify that VFMS is used. */ +void f1(int n, float a, float x[], float y[]) { + int i; + for (i = 0; i < n; ++i) +y[i] = a * -x[i] + y[i]; +} diff --git a/gcc/testsuite/gcc.target/arm/neon-vmla-1.c b/gcc/testsuite/gcc.target/arm/neon-vmla-1.c index 9d239ed..c60c014 100644 --- a/gcc/testsuite/gcc.target/arm/neon-vmla-1.c +++ b/gcc/testsuite/gcc.target/arm/neon-vmla-1.c @@ -1,10 +1,10 @@ /
Re: [Patch ARM] Allow auto-vectorizer to use vfma.
Hi,

your patch broke bootstrapping here:

/home/tob/projects/gcc-git/gcc/gcc/doc//sourcebuild.texi:1537: Node `arm_neon_ok' previously defined at line 1532.

(Sorry for only complaining about those issues today.)

Tobias

On 09/11/2012 02:54 PM, Ramana Radhakrishnan wrote:
> Hi,
>
> This allows the auto-vectorizer to use vfma under Ofast or ffast-math.
> I have a follow-up patch which will add support for these from
> arm_neon.h as well before someone asks. It's being regression tested
> as we speak and that'll follow shortly.
>
> Tested on A15 silicon native with no regressions. Committed.
>
> regards,
> Ramana
>
> 2012-09-11  Ramana Radhakrishnan
>             Matthew Gretton-Dann
>
> 	* config/arm/neon.md (fma4): New pattern.
> 	(*fmsub4): Likewise.
> 	* doc/sourcebuild.texi (arm_neon_v2_ok, arm_neon_v2_hw): Document it.
>
> 2012-09-11  Ramana Radhakrishnan
>             Matthew Gretton-Dann
>
> 	* gcc.target/arm/neon-vfma-1.c: New testcase.
> 	* gcc.target/arm/neon-vfms-1.c: Likewise.
> 	* gcc.target/arm/neon-vmla-1.c: Update test to use int instead of float.
> 	* gcc.target/arm/neon-vmls-1.c: Likewise.
> 	* lib/target-supports.exp (add_options_for_arm_neonv2): New function.
> 	(check_effective_target_arm_neonv2_ok_nocache): Likewise.
> 	(check_effective_target_arm_neonv2_ok): Likewise.
> 	(check_effective_target_arm_neonv2_hw): Likewise.
> 	(check_effective_target_arm_neonv2): Likewise.
Re: [Patch ARM] Allow auto-vectorizer to use vfma.
> your patch broke bootstrapping here:
>
> /home/tob/projects/gcc-git/gcc/gcc/doc//sourcebuild.texi:1537: Node
> `arm_neon_ok' previously defined at line 1532.
>
> (Sorry for only complaining about those issues today.)

No need to feel sorry about that. It is Really Bad that people apparently don't test their patches properly.

Ciao!
Steven
RE: [PATCH, TESTSUITE] Add -fno-short-enums to pr51712
Fixed the format of the test options, as per Jakub's comment.

Add -fno-short-enums flag to test c-c++-common/pr51712.c as discussed in http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51712.
This removes the excess warning that caused the test to fail.
Tested in arm-none-eabi configuration. The test now passes.
Comment? Ok for trunk?

Thanks,
Kyrill

gcc/testsuite

2012-09-11  Kyrylo Tkachov

	* c-c++-common/pr51712.c: Add -fno-short-enums flag to test.

-----Original Message-----
From: Jakub Jelinek [mailto:ja...@redhat.com]
Sent: 11 September 2012 13:50
To: Kyrylo Tkachov
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH, TESTSUITE] Add -fno-short-enums to pr51712

On Tue, Sep 11, 2012 at 01:46:37PM +0100, Kyrylo Tkachov wrote:
> Add -fno-short-enums flag to test c-c++-common/pr51712.c as discussed in
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51712.
> This removes the excess warning that caused the test to fail.
> Tested in arm-none-eabi configuration. The test now passes.
> Comment? Ok for trunk?
>
> Thanks,
> Kyrill
>
> gcc/testsuite
>
> 2012-09-11  Kyrylo Tkachov
>
> 	* c-c++-common/pr51712.c: Add -fno-short-enums flag to test.

> --- a/gcc/testsuite/c-c++-common/pr51712.c
> +++ b/gcc/testsuite/c-c++-common/pr51712.c
> @@ -1,6 +1,6 @@
>  /* PR c/51712 */
>  /* { dg-do compile } */
> -/* { dg-options "-Wtype-limits" } */
> +/* { dg-options "-Wtype-limits -fno-short-enums" {target short_enums} } */

That is wrong, it means that on non-short_enums targets suddenly no dg-options would be passed. Instead you should keep the dg-options line as is and add
/* { dg-additional-options "-fno-short-enums" { target short_enums } } */
or just add the new dg-options line but keep the old one as well (though, dg-additional-options is the new preferred way).

	Jakub

--- a/gcc/testsuite/c-c++-common/pr51712.c
+++ b/gcc/testsuite/c-c++-common/pr51712.c
@@ -1,6 +1,7 @@
 /* PR c/51712 */
 /* { dg-do compile } */
 /* { dg-options "-Wtype-limits" } */
+/* { dg-additional-options "-fno-short-enums" { target short_enums } } */
 
 enum test_enum {
   FOO,
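To make Jakub's point concrete, here is a minimal sketch of how a test header might combine the unconditional options with the selector-scoped extra option. Only the three dg- directives come from the thread; the enum and the `in_range` function are hypothetical stand-ins and are not the actual body of pr51712.c.

```c
/* PR c/51712 */
/* { dg-do compile } */
/* Unconditional options: every target still gets -Wtype-limits.  */
/* { dg-options "-Wtype-limits" } */
/* Extra option, appended only where the short_enums selector matches.  */
/* { dg-additional-options "-fno-short-enums" { target short_enums } } */

/* Hypothetical stand-in for the test body (not the real testcase).  */
enum test_enum { FOO, BAR };

/* Return 1 when ARG names one of the enumerators, 0 otherwise.  */
int
in_range (enum test_enum arg)
{
  return arg >= FOO && arg <= BAR;
}
```

Had the target selector been attached to dg-options itself, targets that do not match the selector would be compiled with no options at all, which is the problem described in the reply above.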
Re: [Patch ARM] Allow auto-vectorizer to use vfma.
On 09/11/2012 03:08 PM, Tobias Burnus wrote:
> your patch broke bootstrapping here:
> /home/tob/projects/gcc-git/gcc/gcc/doc//sourcebuild.texi:1537: Node
> `arm_neon_ok' previously defined at line 1532.

I fixed it (Rev. 191181) with the attached patch. arm_neon_ok should have been arm_neon2_ok. (I also changed spaces into tabs in the ChangeLog.)

Tobias

PS: Fortunately, documentation changes do not require an all-language bootstrap.

On 09/11/2012 02:54 PM, Ramana Radhakrishnan wrote:
> Hi,
>
> This allows the auto-vectorizer to use vfma under Ofast or ffast-math.
> I have a follow-up patch which will add support for these from
> arm_neon.h as well before someone asks. It's being regression tested
> as we speak and that'll follow shortly.
>
> Tested on A15 silicon native with no regressions. Committed.
>
> regards,
> Ramana
>
> 2012-09-11  Ramana Radhakrishnan
>             Matthew Gretton-Dann
>
> 	* config/arm/neon.md (fma4): New pattern.
> 	(*fmsub4): Likewise.
> 	* doc/sourcebuild.texi (arm_neon_v2_ok, arm_neon_v2_hw): Document it.
>
> 2012-09-11  Ramana Radhakrishnan
>             Matthew Gretton-Dann
>
> 	* gcc.target/arm/neon-vfma-1.c: New testcase.
> 	* gcc.target/arm/neon-vfms-1.c: Likewise.
> 	* gcc.target/arm/neon-vmla-1.c: Update test to use int instead of float.
> 	* gcc.target/arm/neon-vmls-1.c: Likewise.
> 	* lib/target-supports.exp (add_options_for_arm_neonv2): New function.
> 	(check_effective_target_arm_neonv2_ok_nocache): Likewise.
> 	(check_effective_target_arm_neonv2_ok): Likewise.
> 	(check_effective_target_arm_neonv2_hw): Likewise.
> 	(check_effective_target_arm_neonv2): Likewise.

Index: gcc/ChangeLog
===================================================================
--- gcc/ChangeLog	(revision 191180)
+++ gcc/ChangeLog	(working copy)
@@ -1,9 +1,13 @@
+2012-09-11  Tobias Burnus
+
+	* doc/sourcebuild.texi (arm_neon_v2_ok): Fix @anchor.
+
 2012-09-11  Ramana Radhakrishnan
-Matthew Gretton-Dann
+	    Matthew Gretton-Dann
 
-        * config/arm/neon.md (fma4): New pattern.
-        (*fmsub4): Likewise.
-        * doc/sourcebuild.texi (arm_neon_v2_ok, arm_neon_v2_hw): Document it.
+	* config/arm/neon.md (fma4): New pattern.
+	(*fmsub4): Likewise.
+	* doc/sourcebuild.texi (arm_neon_v2_ok, arm_neon_v2_hw): Document it.
 
 2012-09-11  Aldy Hernandez
 
Index: gcc/doc/sourcebuild.texi
===================================================================
--- gcc/doc/sourcebuild.texi	(revision 191180)
+++ gcc/doc/sourcebuild.texi	(working copy)
@@ -1534,7 +1534,7 @@ ARM Target supports @code{-mfpu=neon -mfloat-abi=s
 options. Some multilibs may be incompatible with these options.
 
 @item arm_neonv2_ok
-@anchor{arm_neon_ok}
+@anchor{arm_neon2_ok}
 ARM Target supports @code{-mfpu=neon -mfloat-abi=softfp} or compatible
 options. Some multilibs may be incompatible with these options.
Re: [PATCH] Combine location with block using block_locations
Hi,

On Tue, 11 Sep 2012, Richard Guenther wrote:

> >>> +++ gcc/lto/lto.c (working copy)
> >>> @@ -1559,8 +1559,6 @@ lto_fixup_prevailing_decls (tree t)
> >>> {
> >>>    enum tree_code code = TREE_CODE (t);
> >>>    LTO_NO_PREVAIL (TREE_TYPE (t));
> >>> -  if (CODE_CONTAINS_STRUCT (code, TS_COMMON))
> >>> -    LTO_NO_PREVAIL (TREE_CHAIN (t));
> >>
> >> That change is odd. Can you show us how it breaks?
> >
> > This will break LTO build of gcc.c-torture/execute/pr38051.c
> >
> > There is data structure like:
> >
> > union { long int l; char c[sizeof (long int)]; } u;
> >
> > Once the block info is reserved for this, it'll reserve this data
> > structure. And inside this data structure, there is VAR_DECL. Thus
> > LTO_NO_PREVAIL assertion does not satisfy here for TREE_CHAIN (t).
>
> I see - the issue here is that this data structure is not reached at the
> time we call free_lang_data (via find_decls_types_r).

It should be reached just fine. The problem is that TREE_CHAIN of that union type contains random garbage (in this case the var_decl 'u'). This is not supposed to happen. It's set as part of reading back a BLOCK_VARS chain, so the type_decl itself is in such a chain (and 'u' is part of it via the TREE_CHAIN pointer).

I have no idea why this is no problem without the patch. Possibly because of the hunk in remove_unused_scope_block_p that makes more blocks stay.

> But maybe I do not understand "once the block info is reserved for
> this".
>
> So the patch papers over an issue elsewhere I believe. Maybe Micha can
> add some clarification here though, how BLOCK_VARS should be visible
> here

Hmm. Without the half-hearted tries to support debug info with LTO the block_vars list was no problem, it simply wouldn't be streamed. Now I think it is a problem, and we need to fix it up with the prevailing decls if there are multiple ones. I.e. instead of removing the two lines, replace LTO_NO_PREVAIL (TREE_CHAIN (t)) with LTO_SET_PREVAIL.

This is quite unfortunate as we really rather want to make sure that TREE_CHAIN isn't randomly set to something. But as long as block_vars are implemented via TREE_CHAIN, and we want to preserve block_vars we don't have much choice :-(

Ciao,
Michael.
[PATCH][1/n] Improve LTO type merging
This removes the unused gtc_mode param and moves lifetime management of the various tables to a central place, avoiding repeated checks. LTO bootstrapped on x86_64-unknown-linux-gnu, applied to trunk. Richard. 2012-09-11 Richard Guenther * lto.c (enum gtc_mode): Remove. (struct type_pair_d): Adjust. (lookup_type_pair): Likewise. (gimple_type_leader): Do not mark as deletable. (gimple_lookup_type_leader): Adjust. (gtc_visit): Likewise. (gimple_types_compatible_p_1): Likewise. (gimple_types_compatible_p): Likewise. (gimple_type_hash): Likewise. (gimple_register_type): Likewise. (read_cgraph_and_symbols): Manage lifetime of tables here. Index: gcc/lto/lto.c === --- gcc/lto/lto.c (revision 191177) +++ gcc/lto/lto.c (working copy) @@ -276,6 +276,8 @@ lto_read_in_decl_state (struct data_in * return data; } + + /* Global type table. FIXME, it should be possible to re-use some of the type hashing routines in tree.c (type_hash_canon, type_hash_lookup, etc), but those assume that types were built with the various @@ -285,8 +287,6 @@ static GTY((if_marked ("ggc_marked_p"), static GTY((if_marked ("tree_int_map_marked_p"), param_is (struct tree_int_map))) htab_t type_hash_cache; -enum gtc_mode { GTC_MERGE = 0, GTC_DIAG = 1 }; - static hashval_t gimple_type_hash (const void *); /* Structure used to maintain a cache of some type pairs compared by @@ -295,16 +295,13 @@ static hashval_t gimple_type_hash (const -2: The pair (T1, T2) has just been inserted in the table. 0: T1 and T2 are different types. -1: T1 and T2 are the same type. - - The two elements in the SAME_P array are indexed by the comparison - mode gtc_mode. */ +1: T1 and T2 are the same type. 
*/ struct type_pair_d { unsigned int uid1; unsigned int uid2; - signed char same_p[2]; + signed char same_p; }; typedef struct type_pair_d *type_pair_t; DEF_VEC_P(type_pair_t); @@ -323,9 +320,6 @@ lookup_type_pair (tree t1, tree t2) unsigned int index; unsigned int uid1, uid2; - if (type_pair_cache == NULL) -type_pair_cache = XCNEWVEC (struct type_pair_d, GIMPLE_TYPE_PAIR_SIZE); - if (TYPE_UID (t1) < TYPE_UID (t2)) { uid1 = TYPE_UID (t1); @@ -348,8 +342,7 @@ lookup_type_pair (tree t1, tree t2) type_pair_cache [index].uid1 = uid1; type_pair_cache [index].uid2 = uid2; - type_pair_cache [index].same_p[0] = -2; - type_pair_cache [index].same_p[1] = -2; + type_pair_cache [index].same_p = -2; return &type_pair_cache[index]; } @@ -381,7 +374,7 @@ typedef struct GTY(()) gimple_type_leade } gimple_type_leader_entry; #define GIMPLE_TYPE_LEADER_SIZE 16381 -static GTY((deletable, length("GIMPLE_TYPE_LEADER_SIZE"))) +static GTY((length("GIMPLE_TYPE_LEADER_SIZE"))) gimple_type_leader_entry *gimple_type_leader; /* Lookup an existing leader for T and return it or NULL_TREE, if @@ -392,9 +385,6 @@ gimple_lookup_type_leader (tree t) { gimple_type_leader_entry *leader; - if (!gimple_type_leader) -return NULL_TREE; - leader = &gimple_type_leader[TYPE_UID (t) % GIMPLE_TYPE_LEADER_SIZE]; if (leader->type != t) return NULL_TREE; @@ -403,7 +393,6 @@ gimple_lookup_type_leader (tree t) } - /* Return true if T1 and T2 have the same name. If FOR_COMPLETION_P is true then if any type has no name return false, otherwise return true if both types have no names. */ @@ -535,11 +524,11 @@ gtc_visit (tree t1, tree t2, /* Allocate a new cache entry for this comparison. */ p = lookup_type_pair (t1, t2); - if (p->same_p[GTC_MERGE] == 0 || p->same_p[GTC_MERGE] == 1) + if (p->same_p == 0 || p->same_p == 1) { /* We have already decided whether T1 and T2 are the same, return the cached result. 
*/ - return p->same_p[GTC_MERGE] == 1; + return p->same_p == 1; } if ((slot = pointer_map_contains (sccstate, p)) != NULL) @@ -574,7 +563,7 @@ gimple_types_compatible_p_1 (tree t1, tr { struct sccs *state; - gcc_assert (p->same_p[GTC_MERGE] == -2); + gcc_assert (p->same_p == -2); state = XOBNEW (sccstate_obstack, struct sccs); *pointer_map_insert (sccstate, p) = state; @@ -861,7 +850,7 @@ pop: x = VEC_pop (type_pair_t, *sccstack); cstate = (struct sccs *)*pointer_map_contains (sccstate, x); cstate->on_sccstack = false; - x->same_p[GTC_MERGE] = state->u.same_p; + x->same_p = state->u.same_p; } while (x != p); } @@ -958,11 +947,11 @@ gimple_types_compatible_p (tree t1, tree /* If we've visited this type pair before (in the case of aggregates with self-referential types), and we made a decision, return it. */ p = lookup_type_pair (t1, t2); - if (p->same_p[GTC_MERGE] == 0 || p->same_p[GTC_MERGE] == 1) + if (p->same_p == 0 || p->sa
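As a rough illustration of what the patch above does to the type-pair cache, the following self-contained sketch keeps a single `same_p` state per pair and moves allocation and release into explicit functions, so that the lookup itself no longer needs a NULL check. All names here (`pair_cache_init`, `lookup_pair`, the table size and the hash) are hypothetical and chosen for the example; they are not the identifiers or values used in lto.c.

```c
#include <stdlib.h>

/* Hypothetical table size; the real constant is not shown in the thread.  */
#define PAIR_CACHE_SIZE 16381u

/* same_p: -2 = just inserted, 0 = different types, 1 = same type.  */
struct type_pair
{
  unsigned int uid1;
  unsigned int uid2;
  signed char same_p;
};

static struct type_pair *pair_cache;

/* Allocate the zeroed table once, before any lookup takes place.  */
void
pair_cache_init (void)
{
  pair_cache = calloc (PAIR_CACHE_SIZE, sizeof *pair_cache);
  if (pair_cache == NULL)
    abort ();
}

/* Release the table once all comparisons are done.  */
void
pair_cache_release (void)
{
  free (pair_cache);
  pair_cache = NULL;
}

/* Return the slot for the unordered pair (A, B).  The caller owns the
   table lifetime, so there is no lazy allocation here.  UIDs are assumed
   to be nonzero, so a zeroed slot never matches.  A colliding pair simply
   evicts the previous occupant of the slot.  */
struct type_pair *
lookup_pair (unsigned int a, unsigned int b)
{
  unsigned int uid1 = a < b ? a : b;
  unsigned int uid2 = a < b ? b : a;
  /* Hypothetical hash, for illustration only.  */
  unsigned int index = (uid1 * 31u + uid2) % PAIR_CACHE_SIZE;
  struct type_pair *p = &pair_cache[index];

  if (p->uid1 == uid1 && p->uid2 == uid2)
    return p;

  p->uid1 = uid1;
  p->uid2 = uid2;
  p->same_p = -2;
  return p;
}
```

In the patch, the analogous setup and teardown is attributed to read_cgraph_and_symbols by the ChangeLog entry ("Manage lifetime of tables here"); the sketch only mirrors that division of responsibility.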
Re: [Patch ARM] Allow auto-vectorizer to use vfma.
On 09/11/12 14:17, Tobias Burnus wrote:
> On 09/11/2012 03:08 PM, Tobias Burnus wrote:
>> your patch broke bootstrapping here:
>> /home/tob/projects/gcc-git/gcc/gcc/doc//sourcebuild.texi:1537: Node
>> `arm_neon_ok' previously defined at line 1532.
>
> I fixed it (Rev. 191181) with the attached patch. arm_neon_ok should
> have been arm_neon2_ok. (I also changed spaces into tabs in the ChangeLog.)

Nearly: it should be arm_neonv2_ok rather than arm_neon2_ok. I also noticed another issue with the command line, and have committed this as obvious after checking that the documentation built fine.

Thanks, and apologies for the slip-up. I've changed machines recently and something's not OK in this new setup.

regards,
Ramana

2012-09-11  Ramana Radhakrishnan

	* doc/sourcebuild.texi (arm_neon_v2_ok): Adjust command line.

Index: gcc/doc/sourcebuild.texi
===================================================================
--- gcc/doc/sourcebuild.texi	(revision 191181)
+++ gcc/doc/sourcebuild.texi	(revision 191182)
@@ -1534,8 +1534,8 @@ ARM Target supports @code{-mfpu=neon -mf
 options. Some multilibs may be incompatible with these options.
 
 @item arm_neonv2_ok
-@anchor{arm_neon2_ok}
-ARM Target supports @code{-mfpu=neon -mfloat-abi=softfp} or compatible
+@anchor{arm_neonv2_ok}
+ARM Target supports @code{-mfpu=neon-vfpv4 -mfloat-abi=softfp} or compatible
 options. Some multilibs may be incompatible with these options.
 
 @item arm_neon_fp16_ok
[PATCH, AARCH64] Added predefines for AArch64 code models
This patch adds predefines for AArch64 code models. These code models are added as an effective target for the AArch64 platform.

Tests for these predefines have been added to `gcc.target/aarch64/'.

Thanks,
Chris

ChangeLog:

[AArch64] Added predefines for AArch64 code models.

gcc/
	* config/aarch64/aarch64.h (TARGET_CPU_CPP_BUILTINS): Add predefine for AArch64 code models.

gcc/testsuite/
	* gcc.target/aarch64/predefine_large.c: New test for large code model predefine.
	* gcc.target/aarch64/predefine_small.c: Likewise for small code model.
	* gcc.target/aarch64/predefine_tiny.c: Likewise for tiny code model.
	* lib/target-supports.exp (check_effective_target_aarch64_tiny): Check effective target for tiny code model.
	(check_effective_target_aarch64_small): Likewise for small code model.
	(check_effective_target_aarch64_large): Likewise for large code model.

From c130393b5d8b888550e548b36dd34a71b8d94f88 Mon Sep 17 00:00:00 2001
From: Chris Schlumberger-Socha
Date: Wed, 22 Aug 2012 18:22:26 +0100
Subject: [PATCH] Added predefines for AArch64 code models.

Added DejaGnu tests for new predefines.
---
 gcc/config/aarch64/aarch64.h                       | 34
 gcc/testsuite/gcc.target/aarch64/predefine_large.c |  7 +++
 gcc/testsuite/gcc.target/aarch64/predefine_small.c |  7 +++
 gcc/testsuite/gcc.target/aarch64/predefine_tiny.c  |  7 +++
 gcc/testsuite/lib/target-supports.exp              | 42
 5 files changed, 89 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/predefine_large.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/predefine_small.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/predefine_tiny.c

diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 5d121fa..593c01a 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -23,14 +23,32 @@
 #define GCC_AARCH64_H

 /* Target CPU builtins.
*/ -#define TARGET_CPU_CPP_BUILTINS() \ - do \ -{ \ - builtin_define ("__aarch64__"); \ - if (TARGET_BIG_END) \ - builtin_define ("__AARCH64EB__"); \ - else \ - builtin_define ("__AARCH64EL__"); \ +#define TARGET_CPU_CPP_BUILTINS() \ + do \ +{ \ + builtin_define ("__aarch64__"); \ + if (TARGET_BIG_END)\ + builtin_define ("__AARCH64EB__"); \ + else \ + builtin_define ("__AARCH64EL__"); \ + \ + switch (aarch64_cmodel)\ + { \ + case AARCH64_CMODEL_TINY: \ + case AARCH64_CMODEL_TINY_PIC: \ + builtin_define ("__AARCH64_CMODEL_TINY__"); \ + break; \ + case AARCH64_CMODEL_SMALL: \ + case AARCH64_CMODEL_SMALL_PIC: \ + builtin_define ("__AARCH64_CMODEL_SMALL__");\ + break; \ + case AARCH64_CMODEL_LARGE: \ + builtin_define ("__AARCH64_CMODEL_LARGE__"); \ + break; \ + default: \ + break; \ + } \ + \ } while (0) diff --git a/gcc/testsuite/gcc.target/aarch64/predefine_large.c b/gcc/testsuite/gcc.target/aarch64/predefine_large.c new file mode 100644 index 000..0d7d4da --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/predefine_large.c @@ -0,0 +1,7 @@ +/* { dg-skip-if "Code model already defined" { aarch64_tiny || aarch64_small } } */ + +#ifdef __AARCH64_CMODEL_LARGE__ + int dummy; +#else + #error +#endif diff --git a/gcc/testsuite/gcc.target/aarch64/predefine_small.c b/gcc/testsuite/gcc.target/aarch64/predefine_small.c new file mode 100644 index 000..b136284 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/predefine_small.c @@ -0,0 +1,7 @@ +/* { dg-skip-if "Code model already defined" { aarch64_tiny || aarch64_large } } */ + +#ifdef __AARCH64_CMODEL_SMALL__ + int dummy; +#else + #error +#endif diff --git a/gcc/testsuite/gcc.target/aarch64/predefine_tiny.c b/gcc/testsuite/gcc.target/aarch64/predefine_tiny.c new file mode 100644 index 000..d2c844b --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/predefine_tiny.c @@ -0,0 +1,7 @@ +/* { dg-skip-if "Code model already defined" { aarch64_small || aarch64_large } } */ + +#ifdef __AARCH64_CMODEL_TINY__ + int dummy; +#else + 
#error +#endif diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp index 51805ed..2252c83 100644 --- a/gcc/testsuite/lib/target-supports.exp +++ b/gcc/testsuite/lib/target-supports.exp @@ -4654,3 +4654,45 @@ proc check_effective_target_ucontext_h { } { #include }] } + +proc check_effective_target_aarch64_tiny { } { +if { [istarget aarch64*-*-*] } { + return [check_no_compiler_messages aarch64_tiny object { + #ifdef __AARCH64_CMODEL_TINY__ + int dummy; + #else + #error target not AArch64 tiny code model + #endif + }] +} else { + return 0 +} +} + +proc check_effective_target_aarch64_small { } { +if { [istarget aarch64
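For readers wondering how user code could consume these predefines, here is a small sketch. The macro names are the ones proposed in the patch; the function name `aarch64_code_model_name` and the "unknown" fallback are assumptions made for the example.

```c
/* Report the code model this translation unit was compiled for, based on
   the predefines proposed in the patch.  Returns "unknown" on compilers
   or targets that define none of them.  */
const char *
aarch64_code_model_name (void)
{
#if defined (__AARCH64_CMODEL_TINY__)
  return "tiny";
#elif defined (__AARCH64_CMODEL_SMALL__)
  return "small";
#elif defined (__AARCH64_CMODEL_LARGE__)
  return "large";
#else
  return "unknown";
#endif
}
```

Because the switch in TARGET_CPU_CPP_BUILTINS defines at most one of the three macros, the #elif chain should never see two of them at once.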
[PATCH] Improve debug info for partial inlining (PR debug/54519)
Hi!

As discussed in the PR, right now we do a very bad job for debug info of partially inlined functions (both when they are kept only partially inlined, or when partial inlining is performed, but doesn't seem to be useful and foo.part.N is inlined back, either into the original function, or into a function into which foo has been inlined first).

This patch improves that by doing something similar to what ipa-prop.c does, in particular for arguments that aren't actually passed to foo.part.N we add debug args and corresponding debug bind and debug source bind stmts to provide better debug info (if foo.part.N isn't inlined, then DW_OP_GNU_parameter_ref is going to be used together with corresponding call site arguments).

Bootstrapped/regtested on x86_64-linux and i686-linux, some of the tests still fail with some option combinations, am going to file a DF VTA PR for that momentarily. Ok for trunk?

2012-09-11  Jakub Jelinek

	PR debug/54519
	* ipa-split.c (split_function): Add debug args and debug source and normal stmts for args_to_skip which are gimple regs.
	* tree-inline.c (copy_debug_stmt): When inlining, adjust source debug bind stmts to debug binds of corresponding DEBUG_EXPR_DECL.

	* gcc.dg/guality/pr54519-1.c: New test.
	* gcc.dg/guality/pr54519-2.c: New test.
	* gcc.dg/guality/pr54519-3.c: New test.
	* gcc.dg/guality/pr54519-4.c: New test.
	* gcc.dg/guality/pr54519-5.c: New test.

--- gcc/ipa-split.c.jj 2012-08-20 11:09:45.0 +0200 +++ gcc/ipa-split.c 2012-09-10 16:04:39.499558177 +0200 @@ -1059,6 +1059,7 @@ split_function (struct split_point *spli gimple last_stmt = NULL; unsigned int i; tree arg, ddef; + VEC(tree, gc) **debug_args = NULL; if (dump_file) { @@ -1232,6 +1233,65 @@ split_function (struct split_point *spli gimple_set_block (call, DECL_INITIAL (current_function_decl)); VEC_free (tree, heap, args_to_pass); + if (args_to_skip) +for (parm = DECL_ARGUMENTS (current_function_decl), num = 0; +parm; parm = DECL_CHAIN (parm), num++) + if (bitmap_bit_p (args_to_skip, num) + && is_gimple_reg (parm)) + { + tree ddecl; + gimple def_temp; + + arg = get_or_create_ssa_default_def (cfun, parm); + if (!MAY_HAVE_DEBUG_STMTS) + continue; + if (debug_args == NULL) + debug_args = decl_debug_args_insert (node->symbol.decl); + ddecl = make_node (DEBUG_EXPR_DECL); + DECL_ARTIFICIAL (ddecl) = 1; + TREE_TYPE (ddecl) = TREE_TYPE (parm); + DECL_MODE (ddecl) = DECL_MODE (parm); + VEC_safe_push (tree, gc, *debug_args, parm); + VEC_safe_push (tree, gc, *debug_args, ddecl); + def_temp = gimple_build_debug_bind (ddecl, unshare_expr (arg), + call); + gsi_insert_after (&gsi, def_temp, GSI_NEW_STMT); + } + if (debug_args != NULL) +{ + unsigned int i; + tree var, vexpr; + gimple_stmt_iterator cgsi; + gimple def_temp; + + push_cfun (DECL_STRUCT_FUNCTION (node->symbol.decl)); + var = BLOCK_VARS (DECL_INITIAL (node->symbol.decl)); + i = VEC_length (tree, *debug_args); + cgsi = gsi_after_labels (single_succ (ENTRY_BLOCK_PTR)); + do + { + i -= 2; + while (var != NULL_TREE +&& DECL_ABSTRACT_ORIGIN (var) + != VEC_index (tree, *debug_args, i)) + var = TREE_CHAIN (var); + if (var == NULL_TREE) + break; + vexpr = make_node (DEBUG_EXPR_DECL); + parm = VEC_index (tree, *debug_args, i); + DECL_ARTIFICIAL (vexpr) = 1; + TREE_TYPE (vexpr) = TREE_TYPE (parm); + DECL_MODE (vexpr) = DECL_MODE (parm); + def_temp = gimple_build_debug_source_bind (vexpr, parm, +NULL); + 
gsi_insert_before (&cgsi, def_temp, GSI_SAME_STMT); + def_temp = gimple_build_debug_bind (var, vexpr, NULL); + gsi_insert_before (&cgsi, def_temp, GSI_SAME_STMT); + } + while (i); + pop_cfun (); +} + /* We avoid address being taken on any variable used by split part, so return slot optimization is always possible. Moreover this is required to make DECL_BY_REFERENCE work. */ --- gcc/tree-inline.c.jj2012-08-22 11:18:56.0 +0200 +++ gcc/tree-inline.c 2012-09-11 09:13:49.509205799 +0200 @@ -2355,6 +2355,31 @@ copy_debug_stmt (gimple stmt, copy_body_ gimple_debug_source_bind_set_var (stmt, t); walk_tree (gimple_debug_source_bind_get_value_ptr (stmt), remap_gimple_op_r, &wi, NULL); + /* When inlining and source bind refers to one of the optimized +away parameters, change the source bind into normal debug bind +referring to the corresponding DEBUG_EXPR_DECL that should have +been boun
Re: [PATCH] Combine location with block using block_locations
On Tue, Sep 11, 2012 at 3:30 PM, Michael Matz wrote:
> Hi,
>
> On Tue, 11 Sep 2012, Richard Guenther wrote:
>
>> >>> +++ gcc/lto/lto.c (working copy)
>> >>> @@ -1559,8 +1559,6 @@ lto_fixup_prevailing_decls (tree t)
>> >>> {
>> >>>    enum tree_code code = TREE_CODE (t);
>> >>>    LTO_NO_PREVAIL (TREE_TYPE (t));
>> >>> -  if (CODE_CONTAINS_STRUCT (code, TS_COMMON))
>> >>> -    LTO_NO_PREVAIL (TREE_CHAIN (t));
>> >>
>> >> That change is odd. Can you show us how it breaks?
>> >
>> > This will break LTO build of gcc.c-torture/execute/pr38051.c
>> >
>> > There is data structure like:
>> >
>> > union { long int l; char c[sizeof (long int)]; } u;
>> >
>> > Once the block info is reserved for this, it'll reserve this data
>> > structure. And inside this data structure, there is VAR_DECL. Thus
>> > LTO_NO_PREVAIL assertion does not satisfy here for TREE_CHAIN (t).
>>
>> I see - the issue here is that this data structure is not reached at the
>> time we call free_lang_data (via find_decls_types_r).
>
> It should be reached just fine. The problem is that TREE_CHAIN of that
> union type contains random garbage (in this case the var_decl 'u'). This
> is not supposed to happen. It's set as part of reading back a BLOCK_VARS
> chain, so the type_decl itself is in such a chain (and 'u' is part of it
> via the TREE_CHAIN pointer).
>
> I have no idea why this is no problem without the patch. Possibly because
> of the hunk in remove_unused_scope_block_p that makes more blocks stay.
>
>> But maybe I do not understand "once the block info is reserved for
>> this".
>>
>> So the patch papers over an issue elsewhere I believe. Maybe Micha can
>> add some clarification here though, how BLOCK_VARS should be visible
>> here
>
> Hmm. Without the half-hearted tries to support debug info with LTO the
> block_vars list was no problem, it simply wouldn't be streamed. Now I
> think it is a problem, and we need to fix it up with the prevailing decls
> if there are multiple ones. I.e. instead of removing the two lines,
> replace LTO_NO_PREVAIL (TREE_CHAIN (t)) with LTO_SET_PREVAIL.
>
> This is quite unfortunate as we really rather want to make sure that
> TREE_CHAIN isn't randomly set to something. But as long as block_vars are
> implemented via TREE_CHAIN, and we want to preserve block_vars we don't
> have much choice :-(

I don't think we can fixup TREE_CHAIN - the things cannot be in multiple lists after all. Unifying/fixing up would need to happen at a BLOCK level. But as you say - only TYPE_DECLs should be in BLOCK_VARS, but never global ones, so there would be nothing to replace. Which means we shouldn't even try to merge those. Hmm.

Richard.

>
> Ciao,
> Michael.
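The remark that a node "cannot be in multiple lists" follows from the list being intrusive: the link lives inside the node itself. The sketch below illustrates this with a hypothetical `struct node` whose `chain` field stands in for TREE_CHAIN; none of these names are GCC's.

```c
#include <stddef.h>

/* Hypothetical node with a single intrusive link, standing in for a tree
   node and its TREE_CHAIN field.  */
struct node
{
  int id;
  struct node *chain;
};

/* Push N onto the front of *HEAD, overwriting N's only link.  */
void
chain_push (struct node **head, struct node *n)
{
  n->chain = *head;
  *head = n;
}

/* Count the nodes reachable from HEAD.  */
size_t
chain_length (const struct node *head)
{
  size_t len = 0;
  for (; head != NULL; head = head->chain)
    len++;
  return len;
}
```

Pushing a node that already sits on one list onto a second list rewrites its only link, so the first list silently changes shape as well. That is why replacing a chained decl with a prevailing one from another chain is not a simple pointer swap.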
Re: Recognize vec_perm_expr in a constructor of bit_field_ref
On Tue, Sep 11, 2012 at 1:07 PM, Marc Glisse wrote: > Hello, > > here is a patch that turns {v[1],v[0]} into vec_perm_expr(v,v,{1,0}) if the > target is ok with it. > > I am attaching 2 versions of the patch. p-good is the one that passes > testing. p-bad, where I rely on fold_stmt to detect identity permutations, > ICEs towards the end of the pass while checking a bogus gimple stmt (one > that gimple_debug_stmt crashes on if I call it in gdb). From a performance > point of view, p-good makes sense, but I liked the simplicity of p-bad and I > am confused as to why it fails. Probably because you cannot simply increase num_ops ... > 2012-09-11 Marc Glisse > > gcc/ > * tree-ssa-forwprop.c (simplify_vector_constructor): New function. > (ssa_forward_propagate_and_combine): Call it. > > gcc/testsuite/ > * gcc.dg/tree-ssa/forwprop-22.c: New testcase. > > -- > Marc Glisse > Index: Makefile.in > === > --- Makefile.in (revision 191173) > +++ Makefile.in (working copy) > @@ -2237,21 +2237,22 @@ tree-outof-ssa.o : tree-outof-ssa.c $(TR > $(TREE_H) $(DIAGNOSTIC_H) $(TM_H) coretypes.h dumpfile.h \ > $(TREE_SSA_LIVE_H) $(BASIC_BLOCK_H) $(BITMAP_H) $(GGC_H) \ > $(EXPR_H) $(SSAEXPAND_H) $(GIMPLE_PRETTY_PRINT_H) > tree-ssa-dse.o : tree-ssa-dse.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \ > $(TM_H) $(GGC_H) $(TREE_H) $(TM_P_H) $(BASIC_BLOCK_H) \ > $(TREE_FLOW_H) $(TREE_PASS_H) domwalk.h $(FLAGS_H) \ > $(GIMPLE_PRETTY_PRINT_H) langhooks.h > tree-ssa-forwprop.o : tree-ssa-forwprop.c $(CONFIG_H) $(SYSTEM_H) > coretypes.h \ > $(TM_H) $(TREE_H) $(TM_P_H) $(BASIC_BLOCK_H) $(CFGLOOP_H) \ > $(TREE_FLOW_H) $(TREE_PASS_H) $(DIAGNOSTIC_H) \ > - langhooks.h $(FLAGS_H) $(GIMPLE_H) $(GIMPLE_PRETTY_PRINT_H) $(EXPR_H) > + langhooks.h $(FLAGS_H) $(GIMPLE_H) $(GIMPLE_PRETTY_PRINT_H) $(EXPR_H) \ > + $(TREE_VECTORIZER_H) > tree-ssa-phiprop.o : tree-ssa-phiprop.c $(CONFIG_H) $(SYSTEM_H) coretypes.h > \ > $(TM_H) $(TREE_H) $(TM_P_H) $(BASIC_BLOCK_H) \ > $(TREE_FLOW_H) $(TREE_PASS_H) $(DIAGNOSTIC_H) \ > 
langhooks.h $(FLAGS_H) $(GIMPLE_PRETTY_PRINT_H) > tree-ssa-ifcombine.o : tree-ssa-ifcombine.c $(CONFIG_H) $(SYSTEM_H) \ > coretypes.h $(TM_H) $(TREE_H) $(BASIC_BLOCK_H) \ > $(TREE_FLOW_H) $(TREE_PASS_H) $(DIAGNOSTIC_H) \ > $(TREE_PRETTY_PRINT_H) > tree-ssa-phiopt.o : tree-ssa-phiopt.c $(CONFIG_H) $(SYSTEM_H) coretypes.h \ > $(TM_H) $(GGC_H) $(TREE_H) $(TM_P_H) $(BASIC_BLOCK_H) \ > Index: testsuite/gcc.dg/tree-ssa/forwprop-22.c > === > --- testsuite/gcc.dg/tree-ssa/forwprop-22.c (revision 0) > +++ testsuite/gcc.dg/tree-ssa/forwprop-22.c (revision 0) > @@ -0,0 +1,18 @@ > +/* { dg-do compile } */ > +/* { dg-require-effective-target vect_double } */ > +/* { dg-require-effective-target vect_perm } */ > +/* { dg-options "-O -fdump-tree-optimized" } */ > + > +typedef double vec __attribute__((vector_size (2 * sizeof (double; > +void f (vec *px, vec *y, vec *z) > +{ > + vec x = *px; > + vec t1 = { x[1], x[0] }; > + vec t2 = { x[0], x[1] }; > + *y = t1; > + *z = t2; > +} > + > +/* { dg-final { scan-tree-dump-times "VEC_PERM_EXPR" 1 "optimized" } } */ > +/* { dg-final { scan-tree-dump-not "BIT_FIELD_REF" "optimized" } } */ > +/* { dg-final { cleanup-tree-dump "optimized" } } */ > > Property changes on: testsuite/gcc.dg/tree-ssa/forwprop-22.c > ___ > Added: svn:keywords >+ Author Date Id Revision URL > Added: svn:eol-style >+ native > > Index: tree-ssa-forwprop.c > === > --- tree-ssa-forwprop.c (revision 191173) > +++ tree-ssa-forwprop.c (working copy) > @@ -26,20 +26,21 @@ along with GCC; see the file COPYING3. > #include "tm_p.h" > #include "basic-block.h" > #include "gimple-pretty-print.h" > #include "tree-flow.h" > #include "tree-pass.h" > #include "langhooks.h" > #include "flags.h" > #include "gimple.h" > #include "expr.h" > #include "cfgloop.h" > +#include "tree-vectorizer.h" > > /* This pass propagates the RHS of assignment statements into use > sites of the LHS of the assignment. It's basically a specialized > form of tree combination. 
It is hoped all of this can disappear > when we have a generalized tree combiner. > > One class of common cases we handle is forward propagating a single use > variable into a COND_EXPR. > > bb0: > @@ -2787,20 +2788,105 @@ simplify_permutation (gimple_stmt_iterat >if (TREE_CODE (op0) == SSA_NAME) > ret = remove_prop_source_from_use (op0); >if (op0 != op1 && TREE_CODE (op1) == SSA_NAME) > ret |= remove_prop_source_from_use (op1); >return ret ? 2 : 1; > } > >return 0; > } > > +/* Recognize a VEC_PERM_EXPR. Returns true if there were any changes. */ > + > +static
Re: Recognize vec_perm_expr in a constructor of bit_field_ref
On Tue, 11 Sep 2012, Richard Guenther wrote: On Tue, Sep 11, 2012 at 1:07 PM, Marc Glisse wrote: Hello, here is a patch that turns {v[1],v[0]} into vec_perm_expr(v,v,{1,0}) if the target is ok with it. I am attaching 2 versions of the patch. p-good is the one that passes testing. p-bad, where I rely on fold_stmt to detect identity permutations, ICEs towards the end of the pass while checking a bogus gimple stmt (one that gimple_debug_stmt crashes on if I call it in gdb). From a performance point of view, p-good makes sense, but I liked the simplicity of p-bad and I am confused as to why it fails. Probably because you cannot simply increase num_ops ... Ah... thanks, it makes sense now... For some reason I thought it was a fixed size structure and num_ops just told it how many of the fields were in use. [...] Ok with that change. Just to be sure, that means you prefer the version where I manually detect identity and don't call fold, right? Thank you for all the quick reviews, -- Marc Glisse
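For readers following the identity-versus-permutation question, here is a minimal standalone sketch of the decision being discussed. It is not GCC code: `SelectorKind` and `classify_selector` are hypothetical names, and the selector is assumed to have been extracted already, so that `sel[i]` is the lane of the single source vector that constructor element `i` reads. The bounds check on each lane index is an addition of this sketch.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical result type for this sketch only.
enum class SelectorKind { Invalid, Identity, Permutation };

// Classify a constructor whose element i reads lane sel[i] of one source
// vector with nelts lanes.
SelectorKind classify_selector(const std::vector<unsigned>& sel,
                               std::size_t nelts)
{
  // The constructor must supply exactly one element per lane.
  if (sel.size() != nelts)
    return SelectorKind::Invalid;

  bool maybe_ident = true;
  for (std::size_t i = 0; i < nelts; ++i)
    {
      // Assumption of this sketch: reject lanes outside the source vector.
      if (sel[i] >= nelts)
        return SelectorKind::Invalid;
      // Any element that does not read its own lane rules out identity.
      if (sel[i] != i)
        maybe_ident = false;
    }
  return maybe_ident ? SelectorKind::Identity : SelectorKind::Permutation;
}
```

With this model, `{v[1], v[0]}` gives the selector `{1, 0}` and is a permutation, while `{v[0], v[1]}` gives `{0, 1}` and can simply be replaced by `v`, which is the case the patch detects by hand instead of relying on fold.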
Re: [PATCH] Improve debug info for partial inlining (PR debug/54519)
> + if (args_to_skip) > +for (parm = DECL_ARGUMENTS (current_function_decl), num = 0; > +parm; parm = DECL_CHAIN (parm), num++) > + if (bitmap_bit_p (args_to_skip, num) > + && is_gimple_reg (parm)) > + { > + tree ddecl; > + gimple def_temp; > + > + arg = get_or_create_ssa_default_def (cfun, parm); > + if (!MAY_HAVE_DEBUG_STMTS) > + continue; You can do this MAY_HAVE_DEBUG_STMTS check before the loop, e.g. > + if (args_to_skip && MAY_HAVE_DEBUG_STMTS) Ciao! Steven
Re: [PATCH] Improve debug info for partial inlining (PR debug/54519)
On Tue, Sep 11, 2012 at 04:41:24PM +0200, Steven Bosscher wrote: > > + if (args_to_skip) > > +for (parm = DECL_ARGUMENTS (current_function_decl), num = 0; > > +parm; parm = DECL_CHAIN (parm), num++) > > + if (bitmap_bit_p (args_to_skip, num) > > + && is_gimple_reg (parm)) > > + { > > + tree ddecl; > > + gimple def_temp; > > + > > + arg = get_or_create_ssa_default_def (cfun, parm); > > + if (!MAY_HAVE_DEBUG_STMTS) > > + continue; > > You can do this MAY_HAVE_DEBUG_STMTS check before the loop, e.g. > > > + if (args_to_skip && MAY_HAVE_DEBUG_STMTS) No, that would result in -fcompare-debug failures if parm doesn't have a default def yet. Jakub
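One way to read Jakub's objection, sketched outside GCC: the "get or create" call has a side effect that must happen whether or not debug statements are enabled, so the debug check cannot be hoisted above it. Everything below is a hypothetical stand-in (`FunctionSketch`, `check_inside_loop`, `check_hoisted`); it only models the control flow, not the real SSA machinery.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for per-function state.
struct FunctionSketch {
  std::set<std::string> default_defs;  // parameters that have a default def
  int debug_binds = 0;                 // debug-only statements emitted
};

// Stand-in for a "get or create" helper: the lookup creates the entry
// when it is missing, which is an observable side effect.
void get_or_create_default_def(FunctionSketch& fn, const std::string& parm)
{
  fn.default_defs.insert(parm);
}

// Check inside the loop: the default def is always created, only the
// debug-only work is skipped.
void check_inside_loop(FunctionSketch& fn,
                       const std::vector<std::string>& skipped,
                       bool may_have_debug_stmts)
{
  for (const std::string& parm : skipped)
    {
      get_or_create_default_def(fn, parm);
      if (!may_have_debug_stmts)
        continue;
      ++fn.debug_binds;
    }
}

// Check hoisted out of the loop: without debug statements the default
// defs are never created, so the non-debug state now depends on the flag.
void check_hoisted(FunctionSketch& fn,
                   const std::vector<std::string>& skipped,
                   bool may_have_debug_stmts)
{
  if (!may_have_debug_stmts)
    return;
  for (const std::string& parm : skipped)
    {
      get_or_create_default_def(fn, parm);
      ++fn.debug_binds;
    }
}
```

In the first variant the set of default defs is the same with and without debug statements; in the hoisted variant it differs, which is the kind of divergence -fcompare-debug is meant to catch.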
Remove def operands cache
Hi, the operands cache is ugly. This patch removes it at least for the def operands, saving three pointers for roughly each normal statement (the pointer in gsbase, and two pointers from def_optype_d). This is relatively easy to do, because all statements except ASMs have at most one def (and one vdef), which themself aren't pointed to by something else, unlike the use operands which have more structure for the SSA web. Performance wise the patch is a slight improvement (1% for some C++ testcases, but relatively noisy, but at least not slower), bootstrap time is unaffected. As the iterator is a bit larger code size increases by 1 promille. The patch is regstrapped on x86_64-linux. If it's approved I'll adjust the WORD count markers in gimple.h, I left it out in this submission as it's just verbose noise in comments. Okay for trunk? Ciao, Michael. * tree-ssa-operands.h (struct def_optype_d, def_optype_p): Remove. (ssa_operands.free_defs): Remove. (DEF_OP_PTR, DEF_OP): Remove. (struct ssa_operand_iterator_d): Remove 'defs', add 'flags', 'def_i' members, rename 'phi_stmt' to 'stmt'. * gimple.h (gimple_statement_with_ops.def_ops): Remove. (gimple_def_ops, gimple_set_def_ops): Remove. (gimple_vdef_op): Don't take const gimple, adjust. * tree-ssa-operands.c (build_defs): Remove. (init_ssa_operands): Don't initialize it. (fini_ssa_operands): Don't free it. (cleanup_build_arrays): Don't truncate it. (finalize_ssa_stmt_operands): Don't assert on it. (alloc_def, add_def_op, append_def): Remove. (finalize_ssa_defs): Remove building of def_ops list. (finalize_ssa_uses): Don't mark for SSA renaming here, ... (add_stmt_operand): ... but here, don't call append_def. (get_indirect_ref_operands): Remove recurse_on_base argument. (get_expr_operands): Adjust call to get_indirect_ref_operands. (verify_ssa_operands): Don't check def operands. (free_stmt_operands): Don't free def operands. * gimple.c (gimple_copy): Don't clear def operands. 
* tree-flow-inline.h (op_iter_next_use): Adjust to explicitely handle def operand. (op_iter_next_tree): Ditto. (clear_and_done_ssa_iter): Clear new fields. (op_iter_init): Adjust to setup new iterator structure. (op_iter_init_phiuse): Adjust. Index: tree-ssa-operands.h === --- tree-ssa-operands.h.orig2012-09-06 16:14:30.0 +0200 +++ tree-ssa-operands.h 2012-09-06 16:18:33.0 +0200 @@ -34,14 +34,6 @@ typedef ssa_use_operand_t *use_operand_p #define NULL_USE_OPERAND_P ((use_operand_p)NULL) #define NULL_DEF_OPERAND_P ((def_operand_p)NULL) -/* This represents the DEF operands of a stmt. */ -struct def_optype_d -{ - struct def_optype_d *next; - tree *def_ptr; -}; -typedef struct def_optype_d *def_optype_p; - /* This represents the USE operands of a stmt. */ struct use_optype_d { @@ -68,7 +60,6 @@ struct GTY(()) ssa_operands { bool ops_active; - struct def_optype_d * GTY ((skip (""))) free_defs; struct use_optype_d * GTY ((skip (""))) free_uses; }; @@ -82,9 +73,6 @@ struct GTY(()) ssa_operands { #define USE_OP_PTR(OP) (&((OP)->use_ptr)) #define USE_OP(OP) (USE_FROM_PTR (USE_OP_PTR (OP))) -#define DEF_OP_PTR(OP) ((OP)->def_ptr) -#define DEF_OP(OP) (DEF_FROM_PTR (DEF_OP_PTR (OP))) - #define PHI_RESULT_PTR(PHI)gimple_phi_result_ptr (PHI) #define PHI_RESULT(PHI)DEF_FROM_PTR (PHI_RESULT_PTR (PHI)) #define SET_PHI_RESULT(PHI, V) SET_DEF (PHI_RESULT_PTR (PHI), (V)) @@ -135,11 +123,12 @@ typedef struct ssa_operand_iterator_d { bool done; enum ssa_op_iter_type iter_type; - def_optype_p defs; + int flags; + unsigned def_i; use_optype_p uses; int phi_i; int num_phi; - gimple phi_stmt; + gimple stmt; } ssa_op_iter; /* These flags are used to determine which operands are returned during Index: gimple.h === --- gimple.h.orig 2012-09-06 16:14:30.0 +0200 +++ gimple.h2012-09-07 16:01:27.0 +0200 @@ -224,12 +226,12 @@ struct GTY(()) gimple_statement_with_ops /* [ WORD 1-6 ] */ struct gimple_statement_base gsbase; + /* XXX adjust word count */ /* [ WORD 7-8 ] SSA operand vectors. 
NOTE: It should be possible to amalgamate these vectors with the operand vector OP. However, the SSA operand vectors are organized differently and contain more information (like immediate use chaining). */ - struct def_optype_d GTY((skip (""))) *def_ops; struct use_optype_d GTY((skip (""))) *use_ops; }; @@ -1374,27 +1376,6 @@ gimple_has_mem_ops (const_gimple g) } -/* Return the set of DEF operands for statement G. */ - -static inline struc
[PATCH] fix bootstrap on darwin to adapt to VEC changes
The attached patch fixes the bootstrap on darwin to cope with the VEC changes to remove unnecessary VEC function overloads. Tested on x86_64-apple-darwin12. Okay for gcc trunk. Jack 2012-09-11 Dominique d'Humieres Jack Howarth * config/darwin.c (darwin_asm_named_section): Adjust for VEC changes. (darwin_asm_dwarf_section): Likewise. Index: gcc/config/darwin.c === --- gcc/config/darwin.c (revision 191179) +++ gcc/config/darwin.c (working copy) @@ -1878,7 +1878,7 @@ darwin_asm_named_section (const char *na the assumption of how this is done. */ if (lto_section_names == NULL) lto_section_names = VEC_alloc (darwin_lto_section_e, gc, 16); - VEC_safe_push (darwin_lto_section_e, gc, lto_section_names, &e); + VEC_safe_push (darwin_lto_section_e, gc, lto_section_names, e); } else if (strncmp (name, "__DWARF,", 8) == 0) darwin_asm_dwarf_section (name, flags, decl); @@ -2698,7 +2698,7 @@ darwin_asm_dwarf_section (const char *na fprintf (asm_out_file, "Lsection%.*s:\n", namelen, sname); e.count = 1; e.name = xstrdup (sname); - VEC_safe_push (dwarf_sect_used_entry, gc, dwarf_sect_names_table, &e); + VEC_safe_push (dwarf_sect_used_entry, gc, dwarf_sect_names_table, e); } }
Re: [patch] Expand SJLJ exceptions as tablejump/casesi
On 09/10/2012 04:26 PM, Steven Bosscher wrote: > + rtx index = force_reg (index_mode, dispatch_index); You can't modify the result of force_reg. Use copy_to_{mode_,}reg instead. > + rtx tmp = expand_simple_binop (index_mode, MINUS, > + index, CONST1_RTX (index_mode), > + index, 0, OPTAB_DIRECT); > + gcc_assert (REG_P (tmp)); > + if (tmp != index) > + emit_move_insn (index, tmp); This pattern is force_expand_binop. Of course, you don't really need to force index be the same all the way down the chain. You could just as well use index = expand_simple_binop (index_mode, MINUS, index, one, index, 0, OPTAB_DIRECT); and use any new pseudo in the next iteration. Otherwise this looks good. r~
Re: [Patch ARM] implement bswap16
On 11 September 2012 12:52, Richard Earnshaw wrote:
> Try something like:
>
> short foo(int);
>
> short swaps (short x, int y)
> {
>   int z = x;
>   if (y)
>     z = __builtin_bswap16(x);
>   return foo (z);
> }
>
> If that's not enough, try adding 1 to z before calling foo.
>

Thanks, it works. It's surprising, however, that 'return z' isn't enough.

Here is a new version of the patch, which also transforms the 32-bit arm_rev/thumb1_rev into arm_rev/arm_rev_cond. I have enhanced the testcase too.

Christophe.

Attachment: bswap16.patch
Re: [PATCH, libstdc++] Improve slightly __cxa_guard_acquire
On Thu, Sep 06, 2012 at 11:10:37PM +0200, Jakub Jelinek wrote: > > + int expected(0); > > if (__atomic_compare_exchange_n(gi, &expected, pending_bit, false, > > __ATOMIC_ACQ_REL, > > __ATOMIC_RELAXED)) > > Shouldn't this __ATOMIC_RELAXED be also __ATOMIC_ACQUIRE? If expected ends > up being guard_bit, then the code will return 0; right away. Here is a patch for that. Ok for trunk/4.7? 2012-09-11 Jakub Jelinek PR libstdc++/54172 * libsupc++/guard.cc (__cxa_guard_acquire): Fix up the last argument of the first __atomic_compare_exchange_n. --- libstdc++-v3/libsupc++/guard.cc.jj 2012-09-11 16:55:16.0 +0200 +++ libstdc++-v3/libsupc++/guard.cc 2012-09-11 16:56:38.035848876 +0200 @@ -253,7 +253,7 @@ namespace __cxxabiv1 int expected(0); if (__atomic_compare_exchange_n(gi, &expected, pending_bit, false, __ATOMIC_ACQ_REL, - __ATOMIC_RELAXED)) + __ATOMIC_ACQUIRE)) { // This thread should do the initialization. return 1; Jakub
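The reasoning behind the fix can be sketched with std::atomic. This is an illustration, not the libsupc++ implementation: the function names are hypothetical and the two bit values are placeholders. The point it shows is that when the compare-exchange fails because another thread has already finished the initialization, the caller returns immediately and goes on to read the initialized object, so the failing load must be an acquire to pair with the initializer's releasing store.

```cpp
#include <atomic>

// Placeholder values for this sketch; the real constants live in guard.cc.
constexpr int sketch_guard_bit = 1;    // initialization finished
constexpr int sketch_pending_bit = 2;  // initialization in progress

// Returns 1 if the caller must run the initializer, 0 if the object is
// already initialized, and -1 if another thread is still initializing
// (real code would wait in that case).
int sketch_guard_try_acquire(std::atomic<int>& guard)
{
  int expected = 0;
  if (guard.compare_exchange_strong(expected, sketch_pending_bit,
                                    std::memory_order_acq_rel,
                                    // Failure order: acquire, because the
                                    // "already done" path below returns
                                    // straight to code that reads the object.
                                    std::memory_order_acquire))
    return 1;
  if (expected == sketch_guard_bit)
    return 0;
  return -1;
}

// Publish the finished initialization with a releasing store.
void sketch_guard_release(std::atomic<int>& guard)
{
  guard.store(sketch_guard_bit, std::memory_order_release);
}
```

With a relaxed failure order, the early return on the "already done" path would not be ordered after the initializing thread's writes.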
Re: [Patch ARM testsuite] fix 3 tests for big-endian
Ping? http://gcc.gnu.org/ml/gcc-patches/2012-09/msg00068.html Thanks Christophe. On 3 September 2012 11:01, Christophe Lyon wrote: > On 31 August 2012 18:14, Janis Johnson wrote: >> >> do something like >> >> /* { dg-final { scan-assembler-times "fmrrd\[\\t \]+r0,\[\\t \]*r1,\[\\t >> \]*d0" 2 } { target arm_little_endian } } */ >> /* { dg-final { scan-assembler-times "fmrrd\[\\t \]+r1,\[\\t \]*r0,\[\\t >> \]*d0" 2 } {target { ! arm_little_endian } } } */ >> >> That's untested, but you get the idea. >> >> Janis >> >> > > Thanks for your review. Here is an updated patch. > > Christophe. > > 2012-09-03 Christophe Lyon > > gcc/testsuite/ > * gcc.target/arm/neon-vset_lanes8.c, gcc.target/arm/pr51835.c, > gcc.target/arm/pr48252.c: Fix for big-endian support.
Re: [PATCH, libstdc++] Improve slightly __cxa_guard_acquire
On 09/11/2012 08:02 AM, Jakub Jelinek wrote: > 2012-09-11 Jakub Jelinek > > PR libstdc++/54172 > * libsupc++/guard.cc (__cxa_guard_acquire): Fix up the last > argument of the first __atomic_compare_exchange_n. Looks good. r~
Re: Recognize vec_perm_expr in a constructor of bit_field_ref
On Tue, 11 Sep 2012, Richard Guenther wrote: On Tue, Sep 11, 2012 at 1:07 PM, Marc Glisse wrote: Hello, here is a patch that turns {v[1],v[0]} into vec_perm_expr(v,v,{1,0}) if the target is ok with it. I am attaching 2 versions of the patch. p-good is the one that passes testing. p-bad, where I rely on fold_stmt to detect identity permutations, ICEs towards the end of the pass while checking a bogus gimple stmt (one that gimple_debug_stmt crashes on if I call it in gdb). From a performance point of view, p-good makes sense, but I liked the simplicity of p-bad and I am confused as to why it fails. Probably because you cannot simply increase num_ops ... 2012-09-11 Marc Glisse gcc/ * tree-ssa-forwprop.c (simplify_vector_constructor): New function. (ssa_forward_propagate_and_combine): Call it. gcc/testsuite/ * gcc.dg/tree-ssa/forwprop-22.c: New testcase. [...] Ok with that change. Attached is what I am testing and will commit if it passes. -- Marc GlisseIndex: gcc/tree-ssa-forwprop.c === --- gcc/tree-ssa-forwprop.c (revision 191187) +++ gcc/tree-ssa-forwprop.c (working copy) @@ -26,20 +26,21 @@ along with GCC; see the file COPYING3. #include "tm_p.h" #include "basic-block.h" #include "gimple-pretty-print.h" #include "tree-flow.h" #include "tree-pass.h" #include "langhooks.h" #include "flags.h" #include "gimple.h" #include "expr.h" #include "cfgloop.h" +#include "tree-vectorizer.h" /* This pass propagates the RHS of assignment statements into use sites of the LHS of the assignment. It's basically a specialized form of tree combination. It is hoped all of this can disappear when we have a generalized tree combiner. One class of common cases we handle is forward propagating a single use variable into a COND_EXPR. bb0: @@ -2787,20 +2788,98 @@ simplify_permutation (gimple_stmt_iterat if (TREE_CODE (op0) == SSA_NAME) ret = remove_prop_source_from_use (op0); if (op0 != op1 && TREE_CODE (op1) == SSA_NAME) ret |= remove_prop_source_from_use (op1); return ret ? 
2 : 1; } return 0; } +/* Recognize a VEC_PERM_EXPR. Returns true if there were any changes. */ + +static bool +simplify_vector_constructor (gimple_stmt_iterator *gsi) +{ + gimple stmt = gsi_stmt (*gsi); + gimple def_stmt; + tree op, op2, orig, type, elem_type; + unsigned elem_size, nelts, i; + enum tree_code code; + constructor_elt *elt; + unsigned char *sel; + bool maybe_ident; + + gcc_checking_assert (gimple_assign_rhs_code (stmt) == CONSTRUCTOR); + + op = gimple_assign_rhs1 (stmt); + type = TREE_TYPE (op); + gcc_checking_assert (TREE_CODE (type) == VECTOR_TYPE); + + nelts = TYPE_VECTOR_SUBPARTS (type); + elem_type = TREE_TYPE (type); + elem_size = TREE_INT_CST_LOW (TYPE_SIZE (elem_type)); + + sel = XALLOCAVEC (unsigned char, nelts); + orig = NULL; + maybe_ident = true; + FOR_EACH_VEC_ELT (constructor_elt, CONSTRUCTOR_ELTS (op), i, elt) +{ + tree ref, op1; + + if (i >= nelts) + return false; + + if (TREE_CODE (elt->value) != SSA_NAME) + return false; + def_stmt = SSA_NAME_DEF_STMT (elt->value); + if (!def_stmt || !is_gimple_assign (def_stmt)) + return false; + code = gimple_assign_rhs_code (def_stmt); + if (code != BIT_FIELD_REF) + return false; + op1 = gimple_assign_rhs1 (def_stmt); + ref = TREE_OPERAND (op1, 0); + if (orig) + { + if (ref != orig) + return false; + } + else + { + if (TREE_CODE (ref) != SSA_NAME) + return false; + orig = ref; + } + if (TREE_INT_CST_LOW (TREE_OPERAND (op1, 1)) != elem_size) + return false; + sel[i] = TREE_INT_CST_LOW (TREE_OPERAND (op1, 2)) / elem_size; + if (sel[i] != i) maybe_ident = false; +} + if (i < nelts) +return false; + + if (maybe_ident) +{ + gimple_assign_set_rhs_from_tree (gsi, orig); +} + else +{ + op2 = vect_gen_perm_mask (type, sel); + if (!op2) + return false; + gimple_assign_set_rhs_with_ops_1 (gsi, VEC_PERM_EXPR, orig, orig, op2); +} + update_stmt (gsi_stmt (*gsi)); + return true; +} + /* Main entry point for the forward propagation and statement combine optimizer. 
*/ static unsigned int ssa_forward_propagate_and_combine (void) { basic_block bb; unsigned int todoflags = 0; cfg_changed = false; @@ -2958,20 +3037,23 @@ ssa_forward_propagate_and_combine (void) } else if (code == VEC_PERM_EXPR) { int did_something = simplify_permutation (&gsi); if (did_something == 2) cfg_changed = true; changed = did_something != 0; } else if (code == BIT_FIELD_REF) changed
Obsolete picochip-* in 4.7.2+
Hi!

As discussed on IRC, the picochip-* port doesn't have an active maintainer anymore. This patch adds it to the deprecated ports for 4.7.2+ so that it can be removed in GCC 4.8 unless somebody steps up to maintain it.

Ok for trunk/4.7?

2012-09-11 Jakub Jelinek

* config.gcc: Obsolete picochip-*.

--- gcc/config.gcc 2012-09-05 14:52:14.428548941 +0200
+++ gcc/config.gcc 2012-09-11 17:05:15.147522191 +0200
@@ -245,7 +245,8 @@ md_file=
 # Obsolete configurations.
 case ${target} in
- score-* \
+ picochip-* \
+ | score-* \
 )
 if test "x$enable_obsolete" != xyes; then
 echo "*** Configuration ${target} is obsolete." >&2
--- gcc-4.7/changes.html 10 Aug 2012 16:25:46 - 1.124
+++ gcc-4.7/changes.html 11 Sep 2012 15:15:38 -
@@ -29,7 +29,14 @@
 next release of GCC will have their sources permanently removed.
-The following ports for individual systems on
+All GCC ports for the following processor
+architectures have been declared obsolete:
+
+
+ picoChip (picochip-*)
+
+
+The following ports for individual systems on
 particular architectures have been obsoleted:

Jakub
[C++ Patch] Remove uses of ATTRIBUTE_UNUSED in the function parameters
Hi, since we are now using C++, I think we can remove the attributes and just use unnamed parameters. For now I kept the names in comments for documentation purposes, but would be glad to remove those too, if you like. Booted and tested x86_64-linux. Thanks, Paolo. PS: slightly interesting, in a couple of cases - write_unnamed_type_name, wrap_cleanups_r - the parameters were actually used. // 2012-09-11 Paolo Carlini * typeck.c (build_indirect_ref, build_function_call, build_function_call_vec, build_binary_op, build_unary_op, build_compound_expr, build_c_cast, build_modify_expr): Remove uses of ATTRIBUTE_UNUSED on the parameters. * class.c (set_linkage_according_to_type, resort_type_method_vec, dfs_find_final_overrider_post, empty_base_at_nonzero_offset_p): Likewise. * decl.c (local_variable_p_walkfn): Likewise. * except.c (wrap_cleanups_r, check_noexcept_r): Likewise. * error.c (find_typenames_r): Likewise. * tree.c (verify_stmt_tree_r, bot_replace, handle_java_interface_attribute, handle_com_interface_attribute, handle_init_priority_attribute, c_register_addr_space): Likewise. * cp-gimplify.c (cxx_omp_clause_default_ctor): Likewise. * cp-lang.c (objcp_tsubst_copy_and_build): Likewise. * pt.c (unify_success, unify_invalid, instantiation_dependent_r): Likewise. * semantics.c (dfs_calculate_bases_pre): Likewise. * decl2.c (fix_temporary_vars_context_r, clear_decl_external): Likewise. * parser.c (cp_lexer_token_at, cp_parser_omp_clause_mergeable, cp_parser_omp_clause_nowait, cp_parser_omp_clause_ordered, cp_parser_omp_clause_untied): Likewise. * mangle.c (write_unnamed_type_name, discriminator_for_string_literal): Likewise. * search.c (dfs_accessible_post, dfs_debug_mark): Likewise. * lex.c (handle_pragma_vtable, handle_pragma_unit, handle_pragma_interface, handle_pragma_implementation, handle_pragma_java_exceptions): Likewise. 
Index: typeck.c === --- typeck.c(revision 191177) +++ typeck.c(working copy) @@ -2772,7 +2772,7 @@ build_x_indirect_ref (location_t loc, tree expr, r /* Helper function called from c-common. */ tree -build_indirect_ref (location_t loc ATTRIBUTE_UNUSED, +build_indirect_ref (location_t /*loc*/, tree ptr, ref_operator errorstring) { return cp_build_indirect_ref (ptr, errorstring, tf_warning_or_error); @@ -3207,7 +3207,7 @@ get_member_function_from_ptrfunc (tree *instance_p /* Used by the C-common bits. */ tree -build_function_call (location_t loc ATTRIBUTE_UNUSED, +build_function_call (location_t /*loc*/, tree function, tree params) { return cp_build_function_call (function, params, tf_warning_or_error); @@ -3215,9 +3215,9 @@ tree /* Used by the C-common bits. */ tree -build_function_call_vec (location_t loc ATTRIBUTE_UNUSED, +build_function_call_vec (location_t /*loc*/, tree function, VEC(tree,gc) *params, -VEC(tree,gc) *origtypes ATTRIBUTE_UNUSED) +VEC(tree,gc) * /*origtypes*/) { VEC(tree,gc) *orig_params = params; tree ret = cp_build_function_call_vec (function, ¶ms, @@ -3693,7 +3693,7 @@ enum_cast_to_int (tree op) /* For the c-common bits. */ tree build_binary_op (location_t location, enum tree_code code, tree op0, tree op1, -int convert_p ATTRIBUTE_UNUSED) +int /*convert_p*/) { return cp_build_binary_op (location, code, op0, op1, tf_warning_or_error); } @@ -5448,7 +5448,7 @@ cp_build_unary_op (enum tree_code code, tree xarg, /* Hook for the c-common bits that build a unary op. */ tree -build_unary_op (location_t location ATTRIBUTE_UNUSED, +build_unary_op (location_t /*location*/, enum tree_code code, tree xarg, int noconvert) { return cp_build_unary_op (code, xarg, noconvert, tf_warning_or_error); @@ -5784,7 +5784,7 @@ build_x_compound_expr (location_t loc, tree op1, t /* Like cp_build_compound_expr, but for the c-common bits. 
*/ tree -build_compound_expr (location_t loc ATTRIBUTE_UNUSED, tree lhs, tree rhs) +build_compound_expr (location_t /*loc*/, tree lhs, tree rhs) { return cp_build_compound_expr (lhs, rhs, tf_warning_or_error); } @@ -6652,7 +6652,7 @@ build_const_cast (tree type, tree expr, tsubst_fla /* Like cp_build_c_cast, but for the c-common bits. */ tree -build_c_cast (location_t loc ATTRIBUTE_UNUSED, tree type, tree expr) +build_c_cast (location_t /*loc*/, tree type, tree expr) { return cp_build_c_cast (type, expr, tf_warning_or_error); } @@ -6782,11 +6782,11 @@ cp_build_c_cast (tree type, tree expr, tsubst_flag /* For use from the C common bits. */ tree -build_modify_expr (location_t location ATTRIBUTE
Re: shrink-wrapping duplicates BBs across partitions.
Actually, the edge is fairly simple. I have

BB5 (BB_COLD_PARTITION) -> BB10 (BB_HOT_PARTITION) -> EXIT

and BB10 has no other incoming edges, and we are duplicating it.

My hypothesis is that with a gcov-based profile we should never have such partitioning on the edges; BB10 should be COLD as well.

My suggestion was to avoid shrink-wrapping failing on the block duplication for this case, but that would hide the real cause. I now prefer to understand why BB10 is HOT in the first place, if it is a correct assumption that it should not be.

Thanks

Christian

On 09/11/2012 02:46 PM, Steven Bosscher wrote:
>> Does this restriction look right to you ? (regression tests are still
>> running on x86 and sh)
>
> Please generate your patches with diff -up (or svn diff -x -up).
>
>> +&& (BB_PARTITION (e->src) == BB_PARTITION (e->dest))
>
> No need for parentheses around this check.
>
> The shrink wrapping code appears to be dealing with partitioning, or
> at least there are BB_COPY_PARTITIONs further down. So I can't tell
> whether this fix is correct. Can you show in more detail what happens?
> (A dotty graph is always helpful ;-)
>
> Ciao!
> Steven
>
Re: [PATCH] Combine location with block using block_locations
Hi,

On Tue, 11 Sep 2012, Dehao Chen wrote:

> Looks like we have two choices:
>
> 1. Stream out block info, and use LTO_SET_PREVAIL for TREE_CHAIN(t)

This will actually not work correctly in some cases. The problem is that if the prevailing decl is already part of another chain (say, in another block_var list), you would break the current chain. Hence block vars need special handling in the LTO streamer (another reason why tree_chain is not the cleverest thing to use for this chain). This problem area needs to be solved somehow if block info is to be preserved correctly.

> 2. Don't stream out block info for LTO, and still call LTO_NO_PREVAIL
> (TREE_CHAIN (t)).

That's also a large hammer, as it basically will mean no debug info after LTO :-/ Sigh, at this point I have no good solution that doesn't involve quite some work. Perhaps your hack is good enough for the time being, though I hate it :)

Ciao,
Michael.
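The chain problem can be illustrated with a toy intrusive list. This is one reading of the objection, not the LTO streamer: `DeclSketch` and `substitute_pointer` are hypothetical, and the only property modeled is that each node has a single `chain` link, so a node can sit on one list at a time. Substituting a pointer to a node that already lives on another list splices that other list's tail into the current one.

```cpp
#include <cstddef>

// Hypothetical node with one intrusive link, standing in for a decl
// threaded through a single chain field.
struct DeclSketch {
  int id;
  DeclSketch* chain;
};

std::size_t chain_length(const DeclSketch* head)
{
  std::size_t n = 0;
  for (; head; head = head->chain)
    ++n;
  return n;
}

bool chain_contains(const DeclSketch* head, const DeclSketch* wanted)
{
  for (; head; head = head->chain)
    if (head == wanted)
      return true;
  return false;
}

// Re-point the link that refers to old_decl at prevailing, leaving
// prevailing->chain untouched (a plain pointer substitution).
void substitute_pointer(DeclSketch*& head, DeclSketch* old_decl,
                        DeclSketch* prevailing)
{
  DeclSketch** link = &head;
  while (*link && *link != old_decl)
    link = &(*link)->chain;
  if (*link)
    *link = prevailing;
}
```

Starting from a1 -> a2 -> a3 and b1 -> p -> b3, substituting p for a2 yields a1 -> p -> b3: a3 is no longer reachable from the first list, and the tail of the second list has leaked into it.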
Re: Change double_int calls to new interface.
> Index: gcc/ChangeLog > > 2012-09-04 Lawrence Crowl > > * double-int.h (double_int::operator &=): New. > (double_int::operator ^=): New. > (double_int::operator |=): New. > (double_int::mul_with_sign): Modify overflow parameter to bool*. > (double_int::add_with_sign): New. > (double_int::ule): New. > (double_int::sle): New. > (binary double_int::operator *): Remove parameter name. > (binary double_int::operator +): Likewise. > (binary double_int::operator -): Likewise. > (binary double_int::operator &): Likewise. > (double_int::operator |): Likewise. > (double_int::operator ^): Likewise. > (double_int::and_not): Likewise. > (double_int::from_shwi): Tidy formatting. > (double_int::from_uhwi): Likewise. > (double_int::from_uhwi): Likewise. > * double-int.c (double_int::mul_with_sign): Modify overflow > parameter > to bool*. > (double_int::add_with_sign): New. > (double_int::ule): New. > (double_int::sle): New. > * builtins.c: Modify to use the new double_int interface. > * cgraph.c: Likewise. > * combine.c: Likewise. > * dwarf2out.c: Likewise. > * emit-rtl.c: Likewise. > * expmed.c: Likewise. > * expr.c: Likewise. > * fixed-value.c: Likewise. > * fold-const.c: Likewise. > * gimple-fold.c: Likewise. > * gimple-ssa-strength-reduction.c: Likewise. > * gimplify-rtx.c: Likewise. > * ipa-prop.c: Likewise. > * loop-iv.c: Likewise. > * optabs.c: Likewise. > * stor-layout.c: Likewise. > * tree-affine.c: Likewise. > * tree-cfg.c: Likewise. > * tree-dfa.c: Likewise. > * tree-flow-inline.h: Likewise. > * tree-object-size.c: Likewise. > * tree-predcom.c: Likewise. > * tree-pretty-print.c: Likewise. > * tree-sra.c: Likewise. > * tree-ssa-address.c: Likewise. > * tree-ssa-alias.c: Likewise. > * tree-ssa-ccp.c: Likewise. > * tree-ssa-forwprop.c: Likewise. > * tree-ssa-loop-ivopts.c: Likewise. > * tree-ssa-loop-niter.c: Likewise. > * tree-ssa-phiopt.c: Likewise. > * tree-ssa-pre.c: Likewise. > * tree-ssa-sccvn: Likewise. > * tree-ssa-structalias.c: Likewise. 
> * tree-ssa.c: Likewise. > * tree-switch-conversion.c: Likewise. > * tree-vect-loop-manip.c: Likewise. > * tree-vrp.c: Likewise. > * tree.h: Likewise. > * tree.c: Likewise. > * varasm.c: Likewise. I fear this has broken hppa. Bootstrap on OpenBSD/hppa now fails with: In file included from ../../../src/gcc/gcc/mcf.c:47:0: ../../../src/gcc/gcc/mcf.c: In function 'void dump_fixup_edge(FILE*, fixup_graph_type*, fixup_edge_p)': ../../../src/gcc/gcc/system.h:288:78: error: integer overflow in expression [-Werror=overflow] ? ~ (t) 0 << (sizeof(t) * CHAR_BIT - 1) : (t) 0)) ^ ../../../src/gcc/gcc/system.h:289:44: note: in expansion of macro 'INTTYPE_MINIMUM' #define INTTYPE_MAXIMUM(t) ((t) (~ (t) 0 - INTTYPE_MINIMUM (t))) ^ ../../../src/gcc/gcc/mcf.c:55:22: note: in expansion of macro 'INTTYPE_MAXIMUM' #define CAP_INFINITY INTTYPE_MAXIMUM (HOST_WIDEST_INT) ^ ../../../src/gcc/gcc/mcf.c:211:34: note: in expansion of macro 'CAP_INFINITY' if (fedge->max_capacity == CAP_INFINITY) ^ Something must be wrong with the overflow detection logic in the new double_int interfaces. I suspect this is because for hppa HOST_WIDE_INT is 32 bits wide, since on i386 and x86_64 I don't hit this.
Re: Bootstrap fails (was: Remove unnecessary VEC function overloads.)
On 2012-09-11 08:42 , Dominique Dhumieres wrote: This is ok, of course. Then could you please commit it (I don't have write access)? Done. Rev 191192. 2012-09-11 Dominique Dhumieres * config/darwin.c (darwin_asm_named_section): Adjust for VEC changes. (darwin_asm_dwarf_section): Likewise. diff --git a/gcc/config/darwin.c b/gcc/config/darwin.c index 33a831f..54c92d1 100644 --- a/gcc/config/darwin.c +++ b/gcc/config/darwin.c @@ -1878,7 +1878,7 @@ darwin_asm_named_section (const char *name, the assumption of how this is done. */ if (lto_section_names == NULL) lto_section_names = VEC_alloc (darwin_lto_section_e, gc, 16); - VEC_safe_push (darwin_lto_section_e, gc, lto_section_names, &e); + VEC_safe_push (darwin_lto_section_e, gc, lto_section_names, e); } else if (strncmp (name, "__DWARF,", 8) == 0) darwin_asm_dwarf_section (name, flags, decl); @@ -2698,7 +2698,7 @@ darwin_asm_dwarf_section (const char *name, unsigned int flags, fprintf (asm_out_file, "Lsection%.*s:\n", namelen, sname); e.count = 1; e.name = xstrdup (sname); - VEC_safe_push (dwarf_sect_used_entry, gc, dwarf_sect_names_table, &e); + VEC_safe_push (dwarf_sect_used_entry, gc, dwarf_sect_names_table, e); } }
Re: [C++ Patch] Remove uses of ATTRIBUTE_UNUSED in the function parameters
On Tue, Sep 11, 2012 at 05:29:12PM +0200, Paolo Carlini wrote:
> PS: slightly interesting, in a couple of cases -
> write_unnamed_type_name, wrap_cleanups_r - the parameters were
> actually used.

Just a general comment: often an argument is only conditionally used, e.g. depending on some preprocessor macro (e.g. a target hook). In that case an unnamed parameter is not an option, but dropping ATTRIBUTE_UNUSED is not desirable either.

Jakub
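A small sketch of the two situations, with made-up names throughout (`SKETCH_ATTRIBUTE_UNUSED`, `SKETCH_SCALE` and both functions are illustrative, not GCC identifiers). The first function never touches its first parameter, so leaving it unnamed is enough. The second uses its parameter only when a configuration macro is defined, so it must keep a name and still needs the attribute to stay warning-free in the other configuration.

```cpp
// Assumption: on GCC-compatible compilers the attribute is spelled this way;
// elsewhere the macro expands to nothing.
#if defined(__GNUC__)
#define SKETCH_ATTRIBUTE_UNUSED __attribute__((unused))
#else
#define SKETCH_ATTRIBUTE_UNUSED
#endif

// Never used: the parameter can simply be left unnamed, with the old name
// kept in a comment for documentation.
int sketch_ignores_loc(int /*loc*/, int value)
{
  return value * 2;
}

// Conditionally used: whether 'scale' is referenced depends on a macro,
// much as a target hook might or might not use its argument.
#ifdef SKETCH_USE_SCALE
#define SKETCH_SCALE(x, s) ((x) * (s))
#else
#define SKETCH_SCALE(x, s) (x)
#endif

int sketch_maybe_uses(int value, int scale SKETCH_ATTRIBUTE_UNUSED)
{
  // With SKETCH_USE_SCALE undefined the macro drops 's', so 'scale' would
  // trigger an unused-parameter warning without the attribute.
  return SKETCH_SCALE(value, scale);
}
```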
Re: shrink-wrapping duplicates BBs across partitions.
On Tue, Sep 11, 2012 at 5:31 PM, Christian Bruel wrote: > Actually, the edge is fairly simple. I have > > BB5 (BB_COLD_PARTITION) -> BB10 (BB_HOT_PARTITION) -> EXIT > > and BB10 has no other incoming edges. and we are duplicating it. That is wrong, should never happen. Is there a test case to play with? It'd be good to have a PR for this. Ciao! Steven
Re: Change double_int calls to new interface.
Mark Kettenis writes: > In file included from ../../../src/gcc/gcc/mcf.c:47:0: > ../../../src/gcc/gcc/mcf.c: In function 'void dump_fixup_edge(FILE*, > fixup_graph_type*, fixup_edge_p)': > ../../../src/gcc/gcc/system.h:288:78: error: integer overflow in expression > [-Werror=overflow] This is PR54528. Andreas. -- Andreas Schwab, sch...@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 "And now for something completely different."
Fix var-tracking for window register targets
Caught on a sparc build. Testing on sparc. Will commit once it finishes. Diego. * var-tracking.c (vt_add_function_parameter): Adjust for VEC changes. diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c index 8c9ec48..9f5bc12 100644 --- a/gcc/var-tracking.c +++ b/gcc/var-tracking.c @@ -9356,13 +9356,13 @@ vt_add_function_parameter (tree parm) && HARD_REGISTER_P (incoming) && OUTGOING_REGNO (REGNO (incoming)) != REGNO (incoming)) { - parm_reg_t *p - = VEC_safe_push (parm_reg_t, gc, windowed_parm_regs, NULL); - p->incoming = incoming; + parm_reg_t p; + p.incoming = incoming; incoming = gen_rtx_REG_offset (incoming, GET_MODE (incoming), OUTGOING_REGNO (REGNO (incoming)), 0); - p->outgoing = incoming; + p.outgoing = incoming; + VEC_safe_push (parm_reg_t, gc, windowed_parm_regs, p); } else if (MEM_P (incoming) && REG_P (XEXP (incoming, 0)) @@ -9371,11 +9371,11 @@ vt_add_function_parameter (tree parm) rtx reg = XEXP (incoming, 0); if (OUTGOING_REGNO (REGNO (reg)) != REGNO (reg)) { - parm_reg_t *p - = VEC_safe_push (parm_reg_t, gc, windowed_parm_regs, NULL); - p->incoming = reg; + parm_reg_t p; + p.incoming = reg; reg = gen_raw_REG (GET_MODE (reg), OUTGOING_REGNO (REGNO (reg))); - p->outgoing = reg; + p.outgoing = reg; + VEC_safe_push (parm_reg_t, gc, windowed_parm_regs, p); incoming = replace_equiv_address_nv (incoming, reg); } }
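The shape of this change, reduced to standard C++ as an analogy: std::vector stands in for VEC, and `ParmRegSketch` and both helpers are hypothetical. The old idiom pushed an empty slot and filled it through the returned pointer; the new one fills a local object and pushes it by value, so every field has to be assigned before the push.

```cpp
#include <vector>

// Hypothetical stand-in for the element type used in the patch.
struct ParmRegSketch {
  int incoming;
  int outgoing;
};

// Old idiom: append an empty slot and hand back a pointer for the caller
// to fill in afterwards.
ParmRegSketch* push_empty_slot(std::vector<ParmRegSketch>& v)
{
  v.push_back(ParmRegSketch());
  return &v.back();
}

// New idiom: build the element completely, then push it by value.
void push_filled(std::vector<ParmRegSketch>& v, int incoming, int outgoing)
{
  ParmRegSketch p;
  p.incoming = incoming;
  p.outgoing = outgoing;
  v.push_back(p);
}
```

This is why the patch moves the VEC_safe_push call below the assignment to `outgoing` rather than leaving it where the old call stood.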
Re: shrink-wrapping duplicates BBs across partitions.
On 09/11/2012 05:40 PM, Steven Bosscher wrote:
> On Tue, Sep 11, 2012 at 5:31 PM, Christian Bruel wrote:
>> Actually, the edge is fairly simple. I have
>>
>> BB5 (BB_COLD_PARTITION) -> BB10 (BB_HOT_PARTITION) -> EXIT
>>
>> and BB10 has no other incoming edges. and we are duplicating it.
>
> That is wrong, should never happen. Is there a test case to play with?

Thanks for the confirmation. The case happens on SH only when applying the simple_return patch [PR target/54546] to the bb-reorg.c test from the testsuite.

> It'd be good to have a PR for this.

I'll update the PR above with what I find; let's see if this turns out to be target independent.

Thanks,
Christian

> Ciao!
> Steven
Re: Obsolete picochip-* in 4.7.2+
Hi! As discussed on IRC, the picochip-* port doesn't have an active maintainer anymore, this patch adds it to deprecated ports for 4.7.2+ so that it can be removed in GCC 4.8 unless somebody steps up to maintain it. Ok for trunk/4.7? 2012-09-11 Jakub Jelinek * config.gcc: Obsolete picochip-*. --- gcc/config.gcc 2012-09-05 14:52:14.428548941 +0200 +++ gcc/config.gcc 2012-09-11 17:05:15.147522191 +0200 @@ -245,7 +245,8 @@ md_file= # Obsolete configurations. case ${target} in - score-* \ + picochip-* \ + | score-* \ ) if test "x$enable_obsolete" != xyes; then echo "*** Configuration ${target} is obsolete.">&2 --- gcc-4.7/changes.html10 Aug 2012 16:25:46 - 1.124 +++ gcc-4.7/changes.html11 Sep 2012 15:15:38 - @@ -29,7 +29,14 @@ next release of GCC will have their sources permanently removed. -The following ports for individual systems on +All GCC ports for the following processor +architectures have been declared obsolete: + + + picoChip (picochip-*) + + +The following ports for individual systems on particular architectures have been obsoleted: As some of you will be aware, picoChip was acquired earlier this year by Mindspeed Technologies. Although the picoChip specific tool chain, which includes the port of GCC, is still being actively used by customers in existing products, further development of picoChip products is ceasing and customers are migrating to the equivalent Mindspeed products. No further development work will be undertaken for picoGcc, and no one within Mindspeed will be able to continue to support the port, so it is right that the picochip port should be obsoleted. Thank you to everyone who has helped myself and the other maintainers of the picochip port over the years. regards, dan. -- -- Daniel Towner, Mindspeed Technologies Inc. 
Upper Borough Court, Upper Borough Walls, Bath BA1 1RG, UK daniel.tow...@mindspeed.com +44 7786 702589
Re: [rtl] combine a vec_concat of 2 vec_selects from the same vector
On Sun, 9 Sep 2012, Marc Glisse wrote: Hello, this patch lets the compiler try to rewrite: (vec_concat (vec_select x [a]) (vec_select x [b])) as: vec_select x [a b] or even just "x" if appropriate. In a first iteration I was restricting it to b-a==1, but it seemed better not to: it helps for {v[1],v[0]} and doesn't change anything for unknown patterns. Note that I am planning to do a similar optimization at tree level, but it shouldn't make this one useless because such patterns can be created during rtl passes. The testcase may need an additional -fno-tree-xxx to still be useful at that point though. Since the tree-ssa patch was reviewed faster, assume there is a -fno-tree-forwprop in dg-options for the testcase. bootstrap+testsuite on x86_64-linux-gnu. 2012-09-09 Marc Glisse gcc/ * simplify-rtx.c (simplify_binary_operation_1): Handle vec_concat of vec_selects from the same vector. gcc/testsuite/ * gcc.target/i386/vect-rebuild.c: New testcase. http://gcc.gnu.org/ml/gcc-patches/2012-09/msg00540.html -- Marc Glisse
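
To make the rewrite concrete, here is a small model of the two RTL operations on plain integer lanes. It is only an illustration of the identity the patch relies on, not the code added to simplify-rtx.c; the function names mirror the RTL codes, and selects_whole_vector_p is a hypothetical helper for the 'or even just "x"' case:

```cpp
#include <cassert>
#include <vector>

typedef std::vector<int> vec;

// (vec_select x [idx...]): pick the listed lanes of X, in order.
static vec
vec_select (const vec &x, const std::vector<unsigned> &idx)
{
  vec r;
  for (unsigned i = 0; i < idx.size (); i++)
    {
      assert (idx[i] < x.size ());
      r.push_back (x[idx[i]]);
    }
  return r;
}

// (vec_concat a b): lanes of A followed by lanes of B.
static vec
vec_concat (const vec &a, const vec &b)
{
  vec r (a);
  r.insert (r.end (), b.begin (), b.end ());
  return r;
}

// True when selecting IDX from an N-lane vector is the identity,
// i.e. the whole expression can be replaced by plain "x".
static bool
selects_whole_vector_p (const std::vector<unsigned> &idx, unsigned n)
{
  if (idx.size () != n)
    return false;
  for (unsigned i = 0; i < n; i++)
    if (idx[i] != i)
      return false;
  return true;
}
```

With a two-lane x, concatenating the selects of lanes [1] and [0] gives the same lanes as a single select of [1 0], which is the {v[1],v[0]} case mentioned above; only the index list [0 1] collapses to x itself.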
Re: [PATCH] Combine location with block using block_locations
I saw comments in tree-streamer-out.c: /* Do not stream BLOCK_SOURCE_LOCATION. We cannot handle debug information for early inlining so drop it on the floor instead of ICEing in dwarf2out.c. */ streamer_write_chain (ob, BLOCK_VARS (expr), ref_p); However, what the code is doing seemed contradictory with the comment. Or am I missing something? On Tue, Sep 11, 2012 at 8:32 AM, Michael Matz wrote: > Hi, > > On Tue, 11 Sep 2012, Dehao Chen wrote: > >> Looks like we have two choices: >> >> 1. Stream out block info, and use LTO_SET_PREVAIL for TREE_CHAIN(t) > > This will actually not work correctly in some cases. The problem is, if > the prevailing decl is already part of another chain (say in another > block_var list) you would break the current chain. Hence block vars need > special handling in the lto streamer (another reason why tree_chain is not > the most clever think to use for this chain). This problem area needs to > be solved somehow if block info is to be preserved correctly. > >> 2. Don't stream out block info for LTO, and still call LTO_NO_PREVAIL >> (TREE_CHAIN (t)). > > That's also a large hammer as it basically will mean no debug info after > LTO :-/ Sigh, at this point I have no good solution that doesn't involve > quite some work, perhaps your hack is good enough for the time being, > though I hate it :) I got it. Then I'll keep the patch as it is (remove the LTO_NO_PREVAIL), and work with Honza to resolve the issue he had, and then we should be good to check in? Thanks, Dehao > > > Ciao, > Michael.
Re: [C++ Patch] Remove uses of ATTRIBUTE_UNUSED in the function parameters
On 09/11/2012 05:37 PM, Jakub Jelinek wrote: On Tue, Sep 11, 2012 at 05:29:12PM +0200, Paolo Carlini wrote: PS: slightly interesting, in a couple of cases - write_unnamed_type_name, wrap_cleanups_r - the parameters were actually used. Just a general comment, often an argument is only conditionally used, e.g. depending on some preprocessor macro (e.g. target hook). In that case unnamed parameter is not an option, but dropping ATTRIBUTE_UNUSED is not desirable either. Of course. As far as I can see, that isn't the case for the C++ front-end uses, but hey, if you spot something which *may* be less than straightforward in my patch, please let me know asap! Paolo.
Re: shrink-wrapping duplicates BBs across partitions.
When running a cfg dump, I get many messages like:

Invalid sum of incoming frequencies 1667, should be 3334

So it looks like profile information was not correctly propagated somewhere, which could lead to such a partitioning incoherency. I have no idea for the moment whether this is a local problem or not; I just want to share it in case someone has input on this.

Cheers
Christian

On 09/11/2012 05:40 PM, Steven Bosscher wrote:
> On Tue, Sep 11, 2012 at 5:31 PM, Christian Bruel wrote:
>> Actually, the edge is fairly simple. I have
>>
>> BB5 (BB_COLD_PARTITION) -> BB10 (BB_HOT_PARTITION) -> EXIT
>>
>> and BB10 has no other incoming edges. and we are duplicating it.
>
> That is wrong, should never happen. Is there a test case to play with?
> It'd be good to have a PR for this.
>
> Ciao!
> Steven
Re: shrink-wrapping duplicates BBs across partitions.
On Tue, Sep 11, 2012 at 05:40:30PM +0200, Steven Bosscher wrote: > On Tue, Sep 11, 2012 at 5:31 PM, Christian Bruel > wrote: > > Actually, the edge is fairly simple. I have > > > > BB5 (BB_COLD_PARTITION) -> BB10 (BB_HOT_PARTITION) -> EXIT > > > > and BB10 has no other incoming edges. and we are duplicating it. > > That is wrong, should never happen. Is there a test case to play with? > It'd be good to have a PR for this. Isn't that the standard case when !HAVE_return ? Then you can have only a single return through epilogue, and when the epilogue is in the hot partition, even if cold code is returning, it needs to jump to the epilogue. Jakub
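
For anyone joining the thread here, the invariant behind the verify_flow_info failure can be modelled in a few lines. This is a simplified sketch with hypothetical types, not the checker in GCC; in particular it requires the flag to match in both directions and makes no special case for edges to EXIT, either of which may differ from what the real verifier does:

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for GCC's CFG types.
enum partition_sketch { HOT_PARTITION, COLD_PARTITION };

struct block_sketch
{
  partition_sketch partition;
};

struct edge_sketch
{
  const block_sketch *src;
  const block_sketch *dest;
  bool crossing;   // plays the role of the EDGE_CROSSING flag
};

// An edge is consistent when its crossing flag is set exactly when it
// connects blocks in different partitions.
static bool
edge_crossing_consistent_p (const edge_sketch &e)
{
  assert (e.src && e.dest);
  bool crosses = e.src->partition != e.dest->partition;
  return e.crossing == crosses;
}
```

In these terms, the BB5 (cold) -> BB10 (hot) edge is fine as long as it carries the flag; a copy of BB10 made after partitioning has to get its partition and the flags on its edges set up again, or the check fails.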
Re: Scheduler: Allow breaking dependencies by modifying patterns
On 08/03/2012 08:05 AM, Bernd Schmidt wrote: This patch allows us to change rn++ rm=[rn] into rm=[rn + 4] rn++ That is an interesting optimization. I think an analogous optimization could be done for INC/DEC addressing (it might be beneficial for ppc, which has such addressing as well as displacement addressing). It would complicate the haifa scheduler quite a lot, though, since a new insn is generated, and the real benefits may not be worth it (an additional insn has to be generated, which in many cases could even result in worse code). Opportunities to do this are discovered by a mini-pass over the instructions after generating dependencies and before scheduling a block. At that point we have all the information required to ensure that a candidate dep between two instructions is only used to show the register dependence, and to ensure that every insn with a memory reference is only subject to at most one dep causing a pattern change. The dep_t structure is extended to hold an optional pointer to a "replacement description", which holds information about what to change when a dependency is broken. The time when this replacement is applied differs depending on whether the changed insn is the DEP_CON (in which case the pattern is changed whenever the broken dependency becomes the last one), or the DEP_PRO, in which case we make the change when the corresponding DEP_CON has been scheduled. This ensures that the ready list always contains insns with the correct pattern. A few additional bits are needed in the dep structure: one to hold information about whether a dependency occurs multiple times, and one to distinguish dependencies that are purely for register values from those with other meanings (e.g. memory references). Also, sched-rgn was changed to use a new bit, DEP_POSTPONED, rather than HARD_DEP to indicate that we don't want to schedule an insn in the current block. 
A possible future extension would be to also allow autoinc addressing modes as the increment insn. Bootstrapped and tested on x86_64-linux, and also tested on c6x-elf (quite a number of changes were necessary to make it work there). It was originally written for a mips target and tested there in the context of a 4.6 tree. I've also run spec2000 on x86_64, with no change that looked like anything other than noise. Ok? Ok, thanks. The changes are pretty straightforward. Just a few comments. The first is a missing ChangeLog entry for haifa_note_dep. The second one is for + /* Cached cost of the dependency. Make sure to update UNKNOWN_DEP_COST + when changing the size of this field. */ + int cost:20; }; +#define UNKNOWN_DEP_COST (-1<<19) + You could use a macro to define the bit width and UNKNOWN_DEP_COST. But that is probably a matter of taste. The third one is success_in_block in find_modifiable_mems. It is calculated but never used. Probably it was used for debugging. You should do something about this. Thanks for the patch, Bernd. Sorry for the delay with the review. I thought that Maxim would write his comments first.
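
The second comment could look roughly like the following. This is only a sketch of the idea, assuming a hypothetical macro name DEP_COST_BITS and a cut-down struct; it is not the layout of the real dep structure:

```cpp
#include <cassert>

// Hypothetical name for the width of the cached-cost bit-field.
#define DEP_COST_BITS 20

// Most negative value representable in a signed DEP_COST_BITS-wide field.
// Written as a negated shift of a positive value to avoid shifting a
// negative number.
#define UNKNOWN_DEP_COST (-(1 << (DEP_COST_BITS - 1)))

// Cut-down stand-in for the dep structure; only the cost field is shown.
struct dep_sketch
{
  signed int cost : DEP_COST_BITS;
};

// True if the cost of DEP has not been computed yet.
static bool
dep_cost_unknown_p (const dep_sketch &dep)
{
  return dep.cost == UNKNOWN_DEP_COST;
}

// Cache COST in DEP; the sentinel itself is not a valid cost.
static void
dep_set_cost (dep_sketch &dep, int cost)
{
  assert (cost > UNKNOWN_DEP_COST && cost < (1 << (DEP_COST_BITS - 1)));
  dep.cost = cost;
}
```

With the width in one macro, the sentinel follows automatically when the field is resized, so the "make sure to update" comment is no longer needed.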
Re: [C++ Patch] Remove uses of ATTRIBUTE_UNUSED in the function parameters
On Tue, Sep 11, 2012 at 10:37 AM, Jakub Jelinek wrote: > On Tue, Sep 11, 2012 at 05:29:12PM +0200, Paolo Carlini wrote: >> PS: slightly interesting, in a couple of cases - >> write_unnamed_type_name, wrap_cleanups_r - the parameters were >> actually used. > > Just a general comment, often an argument is only conditionally used, > e.g. depending on some preprocessor macro (e.g. target hook). In that > case unnamed parameter is not an option, but dropping ATTRIBUTE_UNUSED is > not desirable either. That a parameter is unused in a function body should be clear from the context. And in those cases, it is desirable that the parameter be unnamed, and the attribute be dropped. That is what Paolo's patch is doing. That should not be controversial. -- Gaby
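
The two situations being contrasted in this subthread, side by side. This is an illustrative sketch only: ATTRIBUTE_UNUSED_SKETCH is a local stand-in for GCC's ATTRIBUTE_UNUSED, SKETCH_TARGET_SCALES is a hypothetical configuration macro standing in for a target hook, and neither function comes from the C++ front end:

```cpp
#include <cassert>

// Stand-in for GCC's ATTRIBUTE_UNUSED macro, defined here so the sketch
// is self-contained.
#if defined (__GNUC__)
# define ATTRIBUTE_UNUSED_SKETCH __attribute__ ((__unused__))
#else
# define ATTRIBUTE_UNUSED_SKETCH
#endif

// Hypothetical configuration macro standing in for a target hook.
// #define SKETCH_TARGET_SCALES 1

// Never used in the body: in C++ the parameter can simply be left unnamed.
static int
always_zero (int /* flags */)
{
  return 0;
}

// Used only under some configurations: the name has to stay, so the
// attribute is what keeps -Wunused-parameter quiet when the macro is off.
static int
maybe_scaled (int value, int scale ATTRIBUTE_UNUSED_SKETCH)
{
#ifdef SKETCH_TARGET_SCALES
  return value * scale;
#else
  return value;
#endif
}
```

The first function is the case Paolo's patch addresses; the second is the case Jakub is pointing out, where leaving the parameter unnamed is not possible.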
RE: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)
Please see my answers below >-Original Message- >From: Richard Henderson [mailto:r...@redhat.com] >Sent: Monday, September 10, 2012 12:38 PM >To: Iyer, Balaji V >Cc: Richard Guenther; gcc-patches@gcc.gnu.org; Gabriel Dos Reis; Aldy >Hernandez (al...@redhat.com); Jeff Law >Subject: Re: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22) > >On 09/10/2012 09:09 AM, Iyer, Balaji V wrote: >>> >If that's the case, what's the point in defining an external ABI and >>> >defining what >>> >__attribute__((vector)) placed on a function declaration means? > >> When you have __attribute__((vector)) you are asking the compiler to >> create a vector AND a scalar version of the function. The advantage is >> that if the function is used, for example, in 2 loops where 1 can be >> vectorized and another cannot, the vectorizable loop won't suffer >> (i.e. suffer from being not-vectorized). > > >On the other hand, if you insist on assuming a clone exists merely because a >declaration bears an attribute, then you must address ALL of the problems with >respect to defining a stable ABI in the face of different cpu revisions, >different >ISAs, and different vector lengths. The function mangling handles several of the version inconsistencies you have mentioned. If the CPU revisions, vector lengths are not the same between the function declaration and the function, then the name of the function will be different and the linker should complain. > >I've not seen you address ANY of these problems, despite having the problem >pointed out multiple times. > > >r~
Re: [PATCH, TESTSUITE] Add -fno-short-enums to pr51712
On Sep 11, 2012, at 6:12 AM, Kyrylo Tkachov wrote: > Fixed the format of the test options, as per Jakub's comment. > Ok for trunk? Ok.
Re: [Patch ARM testsuite] fix 3 tests for big-endian
On Sep 11, 2012, at 8:06 AM, Christophe Lyon wrote: > Ping? > http://gcc.gnu.org/ml/gcc-patches/2012-09/msg00068.html Since the arm people haven't rejected it… Ok.
Re: [patch] Expand SJLJ exceptions as tablejump/casesi
Hello, Thanks for the quick review! On Tue, Sep 11, 2012 at 5:03 PM, Richard Henderson wrote: > On 09/10/2012 04:26 PM, Steven Bosscher wrote: >> + rtx index = force_reg (index_mode, dispatch_index); > > You can't modify the result of force_reg. Use copy_to_{mode_,}reg instead. Done. >> + rtx tmp = expand_simple_binop (index_mode, MINUS, >> + index, CONST1_RTX (index_mode), >> + index, 0, OPTAB_DIRECT); >> + gcc_assert (REG_P (tmp)); >> + if (tmp != index) >> + emit_move_insn (index, tmp); > > This pattern is force_expand_binop. Didn't know about this one :-) > Of course, you don't really need to force index be the same all > the way down the chain. You could just as well use > > index = expand_simple_binop (index_mode, MINUS, index, one, >index, 0, OPTAB_DIRECT); > > and use any new pseudo in the next iteration. Right, I've made the changes to do so. > Otherwise this looks good. I made the following changes: $ interdiff sjlj_tablejump.diff.20120910 sjlj_tablejump.diff diff -u stmt.c stmt.c --- stmt.c (working copy) +++ stmt.c (working copy) @@ -2129,19 +2129,16 @@ This is more efficient than a dispatch table on most machines. The last "index--" is redundant but the code is trivially dead and will be cleaned up by later passes. */ - rtx index = force_reg (index_mode, dispatch_index); + rtx index = copy_to_mode_reg (index_mode, dispatch_index); rtx zero = CONST0_RTX (index_mode); for (int i = 0; i < ncases; i++) { tree elt = VEC_index (tree, dispatch_table, i); rtx lab = label_rtx (CASE_LABEL (elt)); do_jump_if_equal (index_mode, index, zero, lab, 0); - rtx tmp = expand_simple_binop (index_mode, MINUS, -index, CONST1_RTX (index_mode), -index, 0, OPTAB_DIRECT); - gcc_assert (REG_P (tmp)); - if (tmp != index) - emit_move_insn (index, tmp); + force_expand_binop (index_mode, code_to_optab (MINUS), + index, CONST1_RTX (index_mode), + index, 0, OPTAB_DIRECT); } } else and I'm re-testing the updated patch. OK for trunk if it passes? Ciao! Steven
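
For reference, the behaviour of the compare-and-decrement chain discussed above can be modelled in plain C++. This is a sketch of what the emitted sequence computes, not of the RTL expansion itself; dispatch_by_decrement is a hypothetical name:

```cpp
#include <cassert>

// Emulates the emitted sequence
//   if (index == 0) goto case_0;  index--;
//   if (index == 0) goto case_1;  index--;
//   ...
// and returns the number of the case that would be jumped to, or -1 if
// control falls off the end of the chain.
static int
dispatch_by_decrement (int dispatch_index, int ncases)
{
  assert (ncases >= 0);
  int index = dispatch_index;   // private copy, so the input is untouched
  for (int i = 0; i < ncases; i++)
    {
      if (index == 0)
        return i;
      index--;                  // the final decrement is dead code
    }
  return -1;
}
```

Working on a private copy is the point of the review comment about force_reg: the chain overwrites its index on every step, so it must not share a register with the original dispatch value.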
Re: [patch] Expand SJLJ exceptions as tablejump/casesi
On 09/11/2012 10:53 AM, Steven Bosscher wrote: > + force_expand_binop (index_mode, code_to_optab (MINUS), Use sub_optab directly, rather than code_to_optab. Otherwise ok. r~
Re: [PATCH] Merging Cilk Plus into Trunk (Patch 1 of approximately 22)
On 09/11/2012 10:14 AM, Iyer, Balaji V wrote: > The function mangling handles several of the version inconsistencies > you have mentioned. If the CPU revisions, vector lengths are not the > same between the function declaration and the function, then the name > of the function will be different and the linker should complain. Sure. I get that. And that works for code within a single project. But that means that if you build a shared library containing one of these elemental functions, its external ABI changes depending on what compiler flags you build it with. Can you not understand how totally unacceptable this is? r~
Re: [PATCH] Add option for dumping to stderr (issue6190057)
Can you resend your patch in text form (also need to resolve the latest conflicts) so that it can be commented inline? Please also provide as summary a more up-to-date description of 1) Command line option syntax and semantics 2) New dumping APIs and semantics 3) Conversion changes Looking at the patch briefly, I am confused with the opt-info syntax. I thought the following is desired: -fopt-info=pass-flags where pass is the pass name, and flags is one of [optimized, notes, missed]. Both pass and flags can be omitted. Is it implemented this way in your patch? David On Mon, Sep 10, 2012 at 11:20 AM, Sharad Singhai wrote: > Ping. > > Thanks, > Sharad > Sharad > > > On Wed, Sep 5, 2012 at 10:34 AM, Sharad Singhai wrote: >> Ping. >> >> Thanks, >> Sharad >> >> Sharad >> >> >> >> >> On Fri, Aug 24, 2012 at 1:06 AM, Sharad Singhai wrote: >>> >>> Sorry about the delay. Please see comments inline. >>> >>> On Wed, Jul 4, 2012 at 6:33 AM, Richard Guenther >>> wrote: >>> > On Tue, Jul 3, 2012 at 11:07 PM, Sharad Singhai >>> > wrote: >>> >> Apologies for the spam. Attempting to resend the patch after shrinking >>> >> it. >>> >> >>> >> I have updated the attached patch to use a new dump message >>> >> classification system for the vectorizer. It currently uses four >>> >> classes, viz, MSG_OPTIMIZED_LOCATIONS, MSG_UNOPTIMIZED_LOCATION, >>> >> MSG_MISSING_OPTIMIZATION, and MSG_NOTE. I have gone through the >>> >> vectorizer passes and have converted each call to fprintf (dump_file, >>> >> ) to a message classification matching in spirit. Most often, it >>> >> is MSG_OPTIMIZED_LOCATIONS, but occasionally others as well. 
>>> >> >>> >> For example, the following >>> >> >>> >> if (vect_print_dump_info (REPORT_DETAILS)) >>> >> { >>> >> fprintf (vect_dump, "niters for prolog loop: "); >>> >> print_generic_expr (vect_dump, iters, TDF_SLIM); >>> >> } >>> >> >>> >> gets converted to >>> >> >>> >> if (dump_kind (MSG_OPTIMIZED_LOCATIONS)) >>> >> { >>> >> dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, vect_location, >>> >> "niters for prolog loop: "); >>> >> dump_generic_expr (MSG_OPTIMIZED_LOCATIONS, TDF_SLIM, iters); >>> >> } >>> >> >>> >> The asymmetry between the first printf and the second is due to the >>> >> fact that 'vect_print_dump_info (xxx)' prints the location as a >>> >> "side-effect". To preserve the original intent somewhat, I have >>> >> converted the first call within a dump sequence to a dump_printf_loc >>> >> (xxx) which prints the location while the subsequence calls within the >>> >> same conditional get converted to the corresponding plain variants. >>> > >>> > Ok, that looks reasonable. >>> > >>> >> I considered removing the support for alternate dump file, but ended >>> >> up preserving it instead since it is needed for setting the alternate >>> >> dump file to stderr for the case when -fopt-info is given but no dump >>> >> file is available. >>> >> >>> >> The following invocation >>> >> g++ ... -ftree-vectorize -fopt-info=4 >>> >> >>> >> dumps *all* available information to stderr. Currently, the opt-info >>> >> level is common to all passes, i.e., a pass can't specify if wants a >>> >> different level of diagnostic info. This can be added as an >>> >> enhancement with a suitable syntax for selecting passes. >>> >> >>> >> I haven't fixed up the documentation/tests but wanted to get some >>> >> feedback about the current state of patch before doing that. >>> > >>> > Some comments / questions. 
>>> > >>> > + if (dump_file && (dump_kind & opt_info_flags)) >>> > +{ >>> > + dump_loc (dump_kind, dump_file, loc); >>> > + print_generic_expr (dump_file, t, dump_flags | extra_dump_flags); >>> > +} >>> > + >>> > + if (alt_dump_file && (dump_kind & opt_info_flags)) >>> > +{ >>> > >>> > you always test dump_kind against the same opt_info_flags variable. >>> > I would have thought that the alternate dump file has a different >>> > opt_info_flags >>> > setting so I can have -fdump-tree-vect-details -fopt-info=1. Am I >>> > missing >>> > something? >>> >>> It was an oversight on my part. I have since fixed this. There are two >>> separate flags corresponding to the two types of dump files, >>> >>> pflags ==> pass private dump file >>> alt_flags ==> opt-info dump file >>> >>> > If I do >>> > >>> >> gcc file1.c file2.c -O3 -fdump-tree-vectorize=foo >>> > >>> > what will foo contain afterwards? I think you need to document the >>> > behavior >>> > when such redirection is used with the compiler-driver feature of >>> > handling >>> > multiple translation units. Especially the difference (or not >>> > difference) to >>> > >>> >> gcc file1.c -O3 -fdump-tree-vectorize=foo >>> >> gcc file2.c -O3 -fdump-tree-vectorize=foo >>> >>> Yes, the dump file gets overwritten during each invocation. I have >>> noted this in the documentation. >>> >>> > I suppose we do not want to append to foo (but eventually support that >>> > with some alternate syntax?
Re: [PATCH] limited C++ parsing support for gengtype
On 2012-08-29 20:31 , Aaron Gray wrote: 2012-08-30 Aaron Gray * gengtype-lex.l: Support for FILE Support for C++ single line Comments Support for classes Support for enums ignore 'static' ignore 'inline' ignore 'public:' ignore 'protected:' ignore 'private:' ignore 'friend' support for 'operator' token support for 'new' support for 'delete' added support for '+' as a token for summations in enum bodies * gengtype.h: added 'TYPE_ENUM' to 'enum typekind' added enum TYPE_ENUM to 'struct type' union Write entries like these as: * gengtype.h (enum type_kind): Add TYPE_ENUM. (struct type): Add TYPE_ENUM. added OPERATOR_KEYWORD and OPERATOR keywords to Token Code enum Likewise. * gengtype-parser.c: updated 'token_names[]' (direct_declarator): support for parsing limited operators support for parsing constructors with no parameters support for parsing enums * gengtype.c: added 'type_p enums' to maintain list of enums (resolve_typedef): added support for stucture types and enums added 'new_enum()' diff --git a/gcc/gengtype-lex.l b/gcc/gengtype-lex.l index 5788a6a..af9696a 100644 --- a/gcc/gengtype-lex.l +++ b/gcc/gengtype-lex.l @@ -53,11 +53,11 @@ update_lineno (const char *l, size_t len) ID[[:alpha:]_][[:alnum:]_]* WS[[:space:]]+ HWS [ \t\r\v\f]* -IWORD short|long|(un)?signed|char|int|HOST_WIDE_INT|HOST_WIDEST_INT|bool|size_t|BOOL_BITFIELD|CPPCHAR_SIGNED_T|ino_t|dev_t|HARD_REG_SET +IWORD short|long|(un)?signed|char|int|HOST_WIDE_INT|HOST_WIDEST_INT|bool|size_t|BOOL_BITFIELD|CPPCHAR_SIGNED_T|ino_t|dev_t|HARD_REG_SET|FILE ITYPE {IWORD}({WS}{IWORD})* EOID [^[:alnum:]_] -%x in_struct in_struct_comment in_comment +%x in_struct in_struct_comment in_comment in_line_comment in_line_struct_comment %option warn noyywrap nounput nodefault perf-report %option 8bit never-interactive %% @@ -83,6 +83,14 @@ EOID [^[:alnum:]_] BEGIN(in_struct); return UNION; } +^{HWS}class/{EOID} { + BEGIN(in_struct); + return STRUCT; +} +^{HWS}enum/{EOID} { + BEGIN(in_struct); + return ENUM; +} 
^{HWS}extern/{EOID} { BEGIN(in_struct); return EXTERN; @@ -101,10 +109,20 @@ EOID [^[:alnum:]_] \\\n { lexer_line.line++; } "const"/{EOID} /* don't care */ +"static"/{EOID} /* don't care */ +"inline"/{EOID} /* don't care */ +"public:"/* don't care */ +"private:" /* don't care */ +"protected:" /* don't care */ +"operator"/{EOID} { return OPERATOR_KEYWORD; } +"new"/{EOID}{ *yylval = XDUPVAR (const char, yytext+1, yyleng-2, yyleng-1); return OPERATOR; } +"delete"/{EOID} { *yylval = XDUPVAR (const char, yytext+1, yyleng-2, yyleng-1); return OPERATOR; } +"friend"/{EOID} "GTY"/{EOID}{ return GTY_TOKEN; } "VEC"/{EOID}{ return VEC_TOKEN; } "union"/{EOID} { return UNION; } "struct"/{EOID} { return STRUCT; } +"class"/{EOID} { return CLASS; } Why not just return STRUCT here? @@ -3,7 +3,7 @@ This file is part of GCC. - GCC is free software; you can redistribute it and/or modify it under + /GCC is free software; you can redistribute it and/or modify it under This seems out of place. @@ -778,6 +791,7 @@ type (options_p *optsp, bool nested) return resolve_typedef (s, &lexer_line); case STRUCT: +case CLASS: I think that as far as gengtype is concerned, 'struct' and 'class' should be exactly the same thing. So, all the handling for 'CLASS' you added should not be needed. +/* enum definition: type() does all the work. */ +static void +parse_enum (void) +{ + options_p dummy; + type (&dummy, false); + /* There may be junk after the type: notably, we cannot currently + distinguish 'struct foo *function(prototype);' from 'struct foo;' + ... we could call declarator(), but it's a waste of time at + present. Instead, just eat whatever token is currently lookahead + and go back to lexical skipping mode. */ + advance (); +} + I'm not quite sure what is this trying to do. 
@@ -601,16 +602,93 @@ type_p resolve_typedef (const char *s, struct fileloc *pos) { pair_p p; + type_p t; + type_p e; + for (p = typedefs; p != NULL; p = p->next) if (strcmp (p->name, s) == 0) return p->type; + for (t = structures; t != NULL; t = t->next) +{ + switch ( t->kind) +{ + case TYPE_NONE: + if (do_debug) + fprintf(stderr, "TYPE_NONE:\n"); +break; + case TYPE_SCALAR: +if (do_debug) + fprintf(s
Re: Change double_int calls to new interface.
On 9/11/12, Andreas Schwab wrote: > Mark Kettenis writes: >> In file included from ../../../src/gcc/gcc/mcf.c:47:0: >> ../../../src/gcc/gcc/mcf.c: In function 'void dump_fixup_edge(FILE*, >> fixup_graph_type*, fixup_edge_p)': >> ../../../src/gcc/gcc/system.h:288:78: error: integer overflow in >> expression [-Werror=overflow] > > This is PR54528. The expression itself looks correct. I have not been able to duplicate the problem on x86. I am now waiting on access to the compile farm for access to a hppa system. Does anyone have more specific information on the condition that generates the error? -- Lawrence Crowl
Re: [PATCH] limited C++ parsing support for gengtype
On Tue, Sep 11, 2012 at 3:41 PM, Diego Novillo wrote: >> @@ -778,6 +791,7 @@ type (options_p *optsp, bool nested) >> return resolve_typedef (s, &lexer_line); >> >> case STRUCT: >> +case CLASS: > > > I think that as far as gengtype is concerned, 'struct' and 'class' should be > exactly the same thing. So, all the handling for 'CLASS' you added should > not be needed. 100% agreed. -- Gaby
Backtrace library [1/3]
I have finished the initial implementation of the backtrace library I proposed at http://gcc.gnu.org/ml/gcc/2012-08/msg00317.html . I've separated the work into three patches. These patches only implement the backtrace library itself; actual use of the library will follow in separate patches. This initial implementation only supports ELF and DWARF. The library is designed to work correctly for other cases, in the sense that it will report that it can not find any backtrace information. The library is designed to make it straightforward to add support for other object file formats and debugging formats. My intent is to commit the library with ELF/DWARF support and then support other people in extending it. In particular, adding support for Mach-O and PE with DWARF should be simple. This patch is the interface to and configury of libbacktrace. I've separated these out as the parts of libbacktrace that require the most review. The interface to libbacktrace is in the file backtrace.h. This is what callers will use. The file backtrace-supported.h is also available so that programs can see whether calling the backtrace library will work at all. The configury is fairly standard. Note that libbacktrace is built as both a host library (to link into the compilers) and as a target library (to link into libgo and possibly other libraries). Bootstrapped on x86_64-unknown-linux-gnu in conjunction with the other two patches. OK for mainline? Ian 2012-09-11 Ian Lance Taylor * Initial implementation. Index: libbacktrace/README === --- libbacktrace/README (revision 0) +++ libbacktrace/README (revision 0) @@ -0,0 +1,23 @@ +The libbacktrace library +Initially written by Ian Lance Taylor + +The libbacktrace library may be linked into a program or library and +used to produce symbolic backtraces. Sample uses would be to print a +detailed backtrace when an error occurs or to gather detailed +profiling information. + +The libbacktrace library is provided under a BSD license. 
See the +source files for the exact license text. + +The public functions are declared and documented in the header file +backtrace.h, which should be #include'd by a user of the library. + +Building libbacktrace will generate a file backtrace-supported.h, +which a user of the library may use to determine whether backtraces +will work. See the source file backtrace-supported.h.in for the +macros that it defines. + +As of September 2012, libbacktrace only supports ELF executables with +DWARF debugging information. The library is written to make it +straightforward to add support for other object file and debugging +formats. Index: libbacktrace/backtrace.h === --- libbacktrace/backtrace.h (revision 0) +++ libbacktrace/backtrace.h (revision 0) @@ -0,0 +1,165 @@ +/* backtrace.h -- Public header file for stack backtrace library. + Copyright (C) 2012 Free Software Foundation, Inc. + Written by Ian Lance Taylor, Google. + +Redistribution and use in source and binary forms, with or without +modification, are permitted provided that the following conditions are +met: + +(1) Redistributions of source code must retain the above copyright +notice, this list of conditions and the following disclaimer. + +(2) Redistributions in binary form must reproduce the above copyright +notice, this list of conditions and the following disclaimer in +the documentation and/or other materials provided with the +distribution. + +(3) The name of the author may not be used to +endorse or promote products derived from this software without +specific prior written permission. + +THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR +IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE +DISCLAIMED. 
IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, +INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES +(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR +SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) +HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, +STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING +IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE +POSSIBILITY OF SUCH DAMAGE. */ + +#ifndef BACKTRACE_H +#define BACKTRACE_H + +#include +#include +#include + +#ifdef __cplusplus +extern "C" { +#endif + +/* The backtrace code needs to open the executable file in order to + find the debug info. On systems that do not support + /proc/self/exe, the program using the backtrace library needs to + tell the backtrace library the name of the executable to open. It + does so by calling backtrace_set_executable_name. The FILENAME + argument must point to a permanent buffer. */
Backtrace library [2/3]
I have finished the initial implementation of the backtrace library I proposed at http://gcc.gnu.org/ml/gcc/2012-08/msg00317.html . I've separated the work into three patches. These patches only implement the backtrace library itself; actual use of the library will follow in separate patches.

This patch is the changes to the top-level directories for the backtrace library. This is straightforward. Note that libbacktrace is built as both a host library (to link into the compilers) and as a target library (to link into libgo and possibly other libraries).

Bootstrapped on x86_64-unknown-linux-gnu in conjunction with the other two patches. OK for mainline?

Ian

2012-09-11  Ian Lance Taylor

	* MAINTAINERS (Various Maintainers): Add libbacktrace.
	* configure.ac (host_libs): Add libbacktrace.
	(target_libraries): Add libbacktrace.
	* Makefile.def (host_modules): Add libbacktrace.
	(target_modules): Likewise.
	* configure, Makefile.in: Rebuild.

Index: configure.ac
===
--- configure.ac (revision 191171)
+++ configure.ac (working copy)
@@ -133,7 +133,7 @@ build_tools="build-texinfo build-flex bu
 
 # these libraries are used by various programs built for the host environment
 #
-host_libs="intl libiberty opcodes bfd readline tcl tk itcl libgui zlib libcpp libdecnumber gmp mpfr mpc isl cloog libelf libiconv"
+host_libs="intl libiberty opcodes bfd readline tcl tk itcl libgui zlib libbacktrace libcpp libdecnumber gmp mpfr mpc isl cloog libelf libiconv"
 
 # these tools are built for the host environment
 # Note, the powerpc-eabi build depends on sim occurring before gdb in order to
@@ -152,6 +152,7 @@ libgcj="target-libffi \
 # the host libraries and the host tools (which may be a cross compiler)
 # Note that libiberty is not a target library.
 target_libraries="target-libgcc \
+		target-libbacktrace \
 		target-libgloss \
 		target-newlib \
 		target-libgomp \
Index: MAINTAINERS
===
--- MAINTAINERS (revision 191171)
+++ MAINTAINERS (working copy)
@@ -155,6 +155,7 @@ objective-c/c++ Stan Shebs stanshebs@e
 
 			Various Maintainers
 
+libbacktrace		Ian Lance Taylor	i...@airs.com
 libcpp			Per Bothner		p...@bothner.com
 libcpp			All C and C++ front end maintainers
 fp-bit			Ian Lance Taylor	i...@airs.com
Index: Makefile.def
===
--- Makefile.def (revision 191171)
+++ Makefile.def (working copy)
@@ -80,6 +80,7 @@ host_modules= { module= tcl;
                 missing=mostlyclean; };
 host_modules= { module= itcl; };
 host_modules= { module= ld; bootstrap=true; };
+host_modules= { module= libbacktrace; bootstrap=true; };
 host_modules= { module= libcpp; bootstrap=true; };
 host_modules= { module= libdecnumber; bootstrap=true; };
 host_modules= { module= libgui; };
@@ -121,6 +122,7 @@ target_modules = { module= libmudflap; l
 target_modules = { module= libssp; lib_path=.libs; };
 target_modules = { module= newlib; };
 target_modules = { module= libgcc; bootstrap=true; no_check=true; };
+target_modules = { module= libbacktrace; };
 target_modules = { module= libquadmath; };
 target_modules = { module= libgfortran; };
 target_modules = { module= libobjc; };
Re: Backtrace library [1/3]
On Tue, Sep 11, 2012 at 5:53 PM, Ian Lance Taylor wrote:

> This patch is the interface to and configury of libbacktrace. I've
> separated these out as the parts of libbacktrace that require the most
> review. The interface to libbacktrace is in the file backtrace.h. This
> is what callers will use. The file backtrace-supported.h is also
> available so that programs can see whether calling the backtrace library
> will work at all.

So, you've settled on a C interface? A C++ interface would have been native for other open source projects that are C++ oriented...

-- Gaby
Re: Backtrace library [1/3]
On Sep 11, 2012, at 3:53 PM, Ian Lance Taylor wrote:

> I have finished the initial implementation of the backtrace library I
> proposed at http://gcc.gnu.org/ml/gcc/2012-08/msg00317.html . I've
> separated the work into three patches. These patches only implement the
> backtrace library itself; actual use of the library will follow in
> separate patches.

Hi Ian,

I have no specific comment on the implementation of this library, but:

>
> +/* Get a full stack backtrace.  SKIP is the number of frames to skip;
> +   passing 0 will start the trace with the function calling backtrace.
> +   DATA is passed to the callback routine.  If any call to CALLBACK
> +   returns a non-zero value, the stack backtrace stops, and backtrace
> +   returns that value; this may be used to limit the number of stack
> +   frames desired.  If all calls to CALLBACK return 0, backtrace
> +   returns 0.  The backtrace function will make at least one call to
> +   either CALLBACK or ERROR_CALLBACK.  This function requires debug
> +   info for the executable.  */
> +
> +extern int backtrace (int skip, backtrace_callback callback,
> +		      backtrace_error_callback error_callback, void *data);

FYI, "backtrace" is a well-known function provided by glibc (and other libc's). It might be best to pick another name.

-Chris
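The stop-on-non-zero contract in the quoted comment can be illustrated with a small stand-alone sketch. This is not libbacktrace code: the callback typedef is not quoted in this thread, so `sketch_callback`, `sketch_walk`, `struct limit` and `limit_frames` are all hypothetical names, the "stack" is just an array of addresses, and the error callback is left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback type.  The real typedef lives in backtrace.h
   and is not shown in this thread.  */
typedef int (*sketch_callback) (void *data, uintptr_t pc);

/* Stand-in for the stack walker.  Visits PCS[SKIP] .. PCS[COUNT - 1]
   in order, passing DATA through to CALLBACK.  Stops at the first
   non-zero callback result and returns it; returns 0 if every call
   returned 0.  */
static int
sketch_walk (const uintptr_t *pcs, size_t count, size_t skip,
	     sketch_callback callback, void *data)
{
  size_t i;

  assert (callback != NULL);
  for (i = skip; i < count; ++i)
    {
      int ret = callback (data, pcs[i]);
      if (ret != 0)
	return ret;
    }
  return 0;
}

/* Caller state used to limit the number of frames reported.  */
struct limit
{
  int seen;
  int max;
};

/* Callback that asks the walker to stop once MAX frames were seen.  */
static int
limit_frames (void *data, uintptr_t pc)
{
  struct limit *l = (struct limit *) data;

  (void) pc;
  ++l->seen;
  return l->seen >= l->max ? 1 : 0;
}
```

Calling `sketch_walk` with `limit_frames` and a `struct limit` whose `max` is 3 stops after the third frame and returns the callback's non-zero value, which is the "limit the number of stack frames desired" use the comment mentions.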
Re: Backtrace library [1/3]
On Tue, Sep 11, 2012 at 4:01 PM, Gabriel Dos Reis wrote:
> On Tue, Sep 11, 2012 at 5:53 PM, Ian Lance Taylor wrote:
>
>> This patch is the interface to and configury of libbacktrace. I've
>> separated these out as the parts of libbacktrace that require the most
>> review. The interface to libbacktrace is in the file backtrace.h. This
>> is what callers will use. The file backtrace-supported.h is also
>> available so that programs can see whether calling the backtrace library
>> will work at all.
>
> So, you've settled on a C interface? A C++ interface would have been
> native for other open source projects that are C++ oriented...

Yes, a C interface is convenient for libgo, and of course is generally usable. We can certainly layer a C++ interface on top if it seems useful.

The interface is somewhat constrained in that, on systems that support anonymous mmap, it does not call malloc. That makes it possible to do a symbolic backtrace from a signal handler.

Ian