Re: [patch] 19/n: trans-mem: compiler tree/gimple stuff
On 11/04/11 08:26, Michael Matz wrote:
> Hi,
>
> On Thu, 3 Nov 2011, Aldy Hernandez wrote:
>
>> +/* GIMPLE_EH_ELSE must be the sole contents of
>> +   a GIMPLE_TRY_FINALLY node. For all normal exits from the try block,
>> +   we N_BODY is run; for all exception exits from the try block,
>
> s/we //

Fixed

>> +++ gcc/calls.c (.../branches/transactional-memory) (revision 180773)
>> @@ -496,7 +496,60 @@ emit_call_1 (rtx funexp, tree fntree ATT
>>  static int
>>  special_function_p (const_tree fndecl, int flags)
>>  {
>> +    case BUILT_IN_TM_IRREVOCABLE:
>> +    case BUILT_IN_TM_GETTMCLONE_IRR:
>> +    case BUILT_IN_TM_MEMCPY:
>> +    case BUILT_IN_TM_MEMMOVE:
>> +case BUILT_IN_TM_MEMSET:
>
> Whitespace.

Fixed

>> @@ -1751,6 +1787,8 @@ walk_gimple_stmt (gimple_stmt_iterator *
>>        gcc_assert (tree_ret == NULL);
>>
>>        /* Re-read stmt in case the callback changed it. */
>> +      if (wi && wi->removed_stmt)
>> +        return NULL;
>>        stmt = gsi_stmt (*gsi);
>
> Comment belongs to the stmt assignment, not to the new if/return.

Fixed

>> @@ -3085,6 +3153,8 @@ get_call_expr_in (tree t)
>>      t = TREE_OPERAND (t, 1);
>>    if (TREE_CODE (t) == WITH_SIZE_EXPR)
>>      t = TREE_OPERAND (t, 0);
>> +  if (TREE_CODE (t) == VIEW_CONVERT_EXPR)
>> +    t = TREE_OPERAND (t, 0);
>>    if (TREE_CODE (t) == CALL_EXPR)
>>      return t;
>
> The function get_call_expr_in is unused in our compiler (and you don't
> introduce a new use), so instead of amending it, just remove it.

Fixed in previous patch.

>> +  GF_CALL_NOINLINE = 1 << 8,
>
> This flag is only used by the new accessors gimple_call_noinline_p and
> gimple_call_set_noinline_p. The latter is used in
> trans-mem.c:ipa_tm_insert_gettmclone_call, but marked as hack. The flag
> isn't tested anywhere (i.e. no calls to gimple_call_noinline_p). Hence
> this whole thing is unused, presumably the hack was transformed into a
> real solution :) So, don't add the flag or the accessors, and remove the
> call from trans-mem.c.

Excellent catch! Thanks so much. Fixed.

The attached patch has been bootstrapped and regtested on x86-64 Linux.

Committing to branch.

    * trans-mem.c (ipa_tm_insert_gettmclone_call): Remove call to
    gimple_call_set_noinline_p.
    * gimple.h (enum gf_mask): Remove GF_CALL_NOINLINE.
    (gimple_call_noinline_p): Remove.
    (gimple_call_set_noinline_p): Remove.
    * gimple.c (walk_gimple_stmt): Move comment down.
    * calls.c (is_tm_builtin): Fix whitespace problem.
    * gimple.def (GIMPLE_EH_ELSE): Fix typo in comment.

Index: gimple.def
===
--- gimple.def (revision 180974)
+++ gimple.def (working copy)
@@ -161,7 +161,7 @@ DEFGSCODE(GIMPLE_EH_MUST_NOT_THROW, "gim
 /* GIMPLE_EH_ELSE must be the sole contents of
    a GIMPLE_TRY_FINALLY node. For all normal exits from the try block,
-   we N_BODY is run; for all exception exits from the try block,
+   N_BODY is run; for all exception exits from the try block,
    E_BODY is run. */
 DEFGSCODE(GIMPLE_EH_ELSE, "gimple_eh_else", GSS_EH_ELSE)
Index: trans-mem.c
===
--- trans-mem.c (revision 180974)
+++ trans-mem.c (working copy)
@@ -4367,12 +4367,6 @@ ipa_tm_insert_gettmclone_call (struct cg
   if (gimple_call_nothrow_p (stmt))
     gimple_call_set_nothrow (stmt, true);

-  /* ??? This is a hack to prevent tree-eh.c inlineable_call_p from
-     deciding that the indirect call we have after this transformation
-     might be inlinable, and thus changing the value of can_throw_internal,
-     and thus requiring extra EH edges. */
-  gimple_call_set_noinline_p (stmt);
-
   gimple_call_set_fn (stmt, callfn);

   /* Discarding OBJ_TYPE_REF above may produce incompatible LHS and RHS
Index: calls.c
===
--- calls.c (revision 181004)
+++ calls.c (working copy)
@@ -634,7 +634,7 @@ is_tm_builtin (const_tree fndecl)
     case BUILT_IN_TM_GETTMCLONE_IRR:
     case BUILT_IN_TM_MEMCPY:
     case BUILT_IN_TM_MEMMOVE:
-case BUILT_IN_TM_MEMSET:
+    case BUILT_IN_TM_MEMSET:
     CASE_BUILT_IN_TM_STORE (1):
     CASE_BUILT_IN_TM_STORE (2):
     CASE_BUILT_IN_TM_STORE (4):
Index: gimple.c
===
--- gimple.c (revision 181004)
+++ gimple.c (working copy)
@@ -1788,9 +1788,10 @@ walk_gimple_stmt (gimple_stmt_iterator *
      a value to return. */
   gcc_assert (tree_ret == NULL);

-  /* Re-read stmt in case the callback changed it. */
   if (wi && wi->removed_stmt)
     return NULL;
+
+  /* Re-read stmt in case the callback changed it. */
   stmt = gsi_stmt (*gsi);
 }

Index: gimple.h
===
--- gimple.h (revision 181004)
+++ gimple.h (working copy)
@@ -105,7 +105,6 @@ enum gf_mask {
 GF_CALL_NOTHROW
Re: [patch] 19/n: trans-mem: compiler tree/gimple stuff
On Sat, Nov 5, 2011 at 3:24 AM, Richard Henderson wrote:
> On 11/04/2011 04:53 PM, Aldy Hernandez wrote:
>>> Why is it necessary to know whether a clone is a tm clone?
>>
>> How do you mean? First, there are a few pretty printing places where we
>> dump that a function is a clone. It is easy to debug dumps when you know
>> which function is the clone and which is the original function, since we
>> will dump both variants at code generation time.
>>
>> Second, there is code in the TM lowering bits where we assert that we are
>> not trying to lower TM clones ahead of time. And there is a check in
>> gate_tm_init() where we specify that the entire function is a TM region if
>> it is a clone.
>>
>> etc, etc.
>>
>> Does this answer your question?
>
> Richi, if it's the use of the bit in the tree node that you're worried about,
> we could probably put it in cgraph_node.local instead. But we do need the
> knowledge.

Yeah, I was worried about /* 1 bit left */ ;) Putting it in the cgraph
node sounds more appealing indeed.

Thanks,
Richard.

> r~
Re: [patch] 19/n: trans-mem: compiler tree/gimple stuff
On Sat, Nov 5, 2011 at 3:54 AM, Richard Henderson wrote:
> On 11/04/2011 07:36 PM, Richard Henderson wrote:
>> On 11/04/2011 03:36 AM, Richard Guenther wrote:
>>>> + case GIMPLE_TRANSACTION:
>>>> +   return (weights->tm_cost
>>>> +           + estimate_num_insns_seq (gimple_transaction_body (stmt),
>>>> +                                     weights));
>>>> +
>>> Huh, so we now have non-lowered gimple sub-sequence throughout all
>>> optimizations (inlining especially)? :(
>>
>> No. I'm not sure why we're still looking at gimple_transaction_body
>> here -- that should be NULL after lowering.
>
> ... of course, I'm not sure why we're looking at all those other
> nested statements there inside the inliner either. At least we're
> doing the same thing as everyone else here.

It might be because of nested function lowering which works on gimple
like it falls out of the gimplifier. So it might all be correct
after all ...

Sorry for the noise.

Richard.

> r~
[doc] Fix a cross reference
Hello,

This small patch fixes a cross reference in the GCC documentation.

2011-11-05  Mingjie Xing

    * doc/invoke.texi (Wunused-result): Change @pxref{Variable Attributes}
    to @pxref{Function Attributes}.

Is it OK?

Thanks
Mingjie

Index: doc/invoke.texi
===
--- doc/invoke.texi (revision 181008)
+++ doc/invoke.texi (working copy)
@@ -3535,7 +3535,7 @@ To suppress this warning use the @samp{u
 @opindex Wunused-result
 @opindex Wno-unused-result
 Do not warn if a caller of a function marked with attribute
-@code{warn_unused_result} (@pxref{Variable Attributes}) does not use
+@code{warn_unused_result} (@pxref{Function Attributes}) does not use
 its return value. The default is @option{-Wunused-result}.

 @item -Wunused-variable
Re: [PING #2] Pass address space to REGNO_MODE_CODE_OK_FOR_BASE_P
Ulrich Weigand wrote:
> The following patch still needs maintainer review:
> http://gcc.gnu.org/ml/gcc-patches/2011-10/msg01874.html
>
> Thanks,
> Ulrich

...me too looking forward to this being reviewed.

Thanks,
Johann
Re: [PATCH] PR target/50038 fix: redundant zero extensions removal
> Here is a patch which fixes redundant zero extensions problem. Issue
> is resolved by expanding implicit_zee pass functionality to cover zero
> and sign extends of different modes. Could please someone review it?

Could you explain the underlying idea? The current strategy of implicit-zee.c
is exposed at length at the beginning of the file, but here's a summary:

1. On some architectures (typically x86-64), implicit zero-extensions are
   applied when instructions operate in selected sub-word modes (SImode here):

     addl %edi,%eax

   has an implicit zero-extension for %rax.

2. Because of 1, the second instruction in sequences like:

     (set (reg:SI x) (plus:SI (reg:SI z1) (reg:SI z2)))
     (set (reg:DI x) (zero_extend:DI (reg:SI x)))

   is redundant.

3. The pass recognizes this and transforms the above sequence into:

     (set (reg:DI x) (zero_extend:DI (plus:SI (reg:SI z1) (reg:SI z2))))

   and the machine description knows how to translate this into an 'addl'.

You're proposing extending this to other modes and other architectures, for
example QImode on x86. But does

  addb %dl, %al

modify the entire %eax register on x86? In other words, are you really after
implicit (zero-)extensions or after something else, like global elimination of
redundant extensions?

What's the effect of the patch on the testcase in the PR in terms of insns at
the RTL level? Why doesn't the combiner already optimize it?

Enhancing implicit-zee.c to address missed optimizations like the one reported
in target/50038 might well be the best approach, but the strategy shift must be
clearly exposed and discussed. The reported numbers are certainly impressive.

--
Eric Botcazou
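
For readers following along, here is a minimal C sketch of the kind of source that gives rise to the two-instruction sequence in point 2. The function names are hypothetical, and nothing here claims which instructions a particular GCC version emits; it only pins down the source-level semantics (a 32-bit add followed by a widening, and the 8-bit analogue raised in the QImode question).

```c
#include <stdint.h>

/* 32-bit add whose result is then widened to 64 bits.  At the RTL level
   this corresponds roughly to an SImode plus followed by a zero_extend
   to DImode.  Hypothetical name, for illustration only.  */
uint64_t
add_then_widen (uint32_t z1, uint32_t z2)
{
  uint32_t x = z1 + z2;   /* wraps modulo 2^32 */
  return (uint64_t) x;    /* upper 32 bits of the result are zero */
}

/* The 8-bit analogue: the sum is truncated to 8 bits and then widened.
   Whether a byte-sized add leaves the rest of the register in a known
   state is exactly the target-specific question asked above.  */
uint32_t
add_bytes_then_widen (uint8_t a, uint8_t b)
{
  uint8_t x = (uint8_t) (a + b);   /* wraps modulo 2^8 */
  return (uint32_t) x;
}
```

In both functions the widening is semantically required; the pass under discussion is only about whether a separate extension instruction is needed to implement it.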
Re: [PATCH] strlenopt improvements
> The man page is outdated, stpcpy is a standard POSIX2008 function.

Sorry for being so 20th century-ish. :-)

> Anyway, in the other gcc.dg/strlenopt-* testcases for USE_GNU I was using
> the convention that the name ended with g (i.e. strlenopt-22g.c) and
> the test would start with:
> /* This test needs runtime that provides stpcpy function. */
> /* { dg-do run { target *-*-linux* } } */
> instead of just
> /* { dg-do run } */

Thanks for the tip. Tested on Solaris 8 and Linux, applied on the mainline.

2011-11-05  Eric Botcazou

    * gcc.dg/strlenopt-22g.c: New wrapper around...
    * gcc.dg/strlenopt-22.c: ...this. Do not define USE_GNU and adjust.

--
Eric Botcazou

Index: gcc.dg/strlenopt-22g.c
===
--- gcc.dg/strlenopt-22g.c (revision 0)
+++ gcc.dg/strlenopt-22g.c (revision 0)
@@ -0,0 +1,14 @@
+/* This test needs runtime that provides stpcpy function. */
+/* { dg-do run { target *-*-linux* } } */
+/* { dg-options "-O2 -fdump-tree-strlen" } */
+
+#define USE_GNU
+#include "strlenopt-22.c"
+
+/* { dg-final { scan-tree-dump-times "strlen \\(" 0 "strlen" } } */
+/* { dg-final { scan-tree-dump-times "memcpy \\(" 1 "strlen" } } */
+/* { dg-final { scan-tree-dump-times "strcpy \\(" 0 "strlen" } } */
+/* { dg-final { scan-tree-dump-times "strcat \\(" 0 "strlen" } } */
+/* { dg-final { scan-tree-dump-times "strchr \\(" 1 "strlen" } } */
+/* { dg-final { scan-tree-dump-times "stpcpy \\(" 1 "strlen" } } */
+/* { dg-final { cleanup-tree-dump "strlen" } } */
Index: gcc.dg/strlenopt-22.c
===
--- gcc.dg/strlenopt-22.c (revision 181007)
+++ gcc.dg/strlenopt-22.c (working copy)
@@ -1,7 +1,6 @@
 /* { dg-do run } */
 /* { dg-options "-O2 -fdump-tree-strlen" } */

-#define USE_GNU
 #include "strlenopt.h"

 __attribute__((noinline, noclone)) size_t
@@ -32,10 +31,9 @@ main ()
   return 0;
 }

-/* { dg-final { scan-tree-dump-times "strlen \\(" 0 "strlen" } } */
+/* { dg-final { scan-tree-dump-times "strlen \\(" 3 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "memcpy \\(" 1 "strlen" } } */
-/* { dg-final { scan-tree-dump-times "strcpy \\(" 0 "strlen" } } */
+/* { dg-final { scan-tree-dump-times "strcpy \\(" 1 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "strcat \\(" 0 "strlen" } } */
 /* { dg-final { scan-tree-dump-times "strchr \\(" 1 "strlen" } } */
-/* { dg-final { scan-tree-dump-times "stpcpy \\(" 1 "strlen" } } */
 /* { dg-final { cleanup-tree-dump "strlen" } } */
Re: [PATCH] Fix early inliner inlining uninlinable functions
On 28 Oct 2011, at 13:57, Richard Guenther wrote:
> We fail to keep the cannot-inline flag up-to-date when turning
> indirect to direct calls. The following patch arranges to do
> this during statement folding (which should always be called
> when that happens). It also makes sure to copy the updated flag
> to the edge when iterating early inlining.

This:
http://gcc.gnu.org/ml/gcc-cvs/2011-11/msg00046.html

regresses:
acats/c740203a (x86-64-darwin10)
gnat/aliasing3.adb (m64 i486-darwin9 and x86-64-darwin10)

... don't know about other platforms at present.

with messages like:

FAIL: gnat.dg/aliasing3.adb (test for excess errors)
Excess errors:
+===GNAT BUG DETECTED==+
| 4.7.0 2002 (experimental) [trunk revision 180763] (i686-apple-darwin9) GCC error:|
| in estimate_function_body_sizes, at ipa-inline-analysis.c:1977 |
| Error detected around /GCC/gcc-live-trunk/gcc/testsuite/gnat.dg/aliasing3_pkg.adb:8:6|
| Please submit a bug report; see http://gcc.gnu.org/bugs.html.|
| Use a subject line meaningful to you and us to track the bug.|
| Include the entire contents of this bug box in the report. |
| Also include sources listed below in gnatchop format |
| (concatenated together with no headers between files). |
+===+

Please include these source files with error report
Note that list may not be accurate in some cases,
so please double check that the problem can still
be reproduced with the set of files listed.
Consider also -gnatd.n switch (see debug.adb).

/Volumes/ScratchCS/gcc-4-7-trunk-build/i686-apple-darwin9/x86_64/libada/adainclude/system.ads
/GCC/gcc-live-trunk/gcc/testsuite/gnat.dg/aliasing3.adb
/GCC/gcc-live-trunk/gcc/testsuite/gnat.dg/aliasing3_pkg.ads
/GCC/gcc-live-trunk/gcc/testsuite/gnat.dg/aliasing3_pkg.adb

raised TYPES.UNRECOVERABLE_ERROR : comperr.adb:432

> Bootstrap and regtest running on x86_64-unknown-linux-gnu, ok?
>
> Thanks,
> Richard.
>
> 2010-10-28  Richard Guenther
>
>     PR tree-optimization/50890
>     * gimple.h (gimple_fold_call): Remove.
>     * gimple-fold.c (fold_stmt_1): Move all call related code to ...
>     (gimple_fold_call): ... here. Make static. Update the
>     cannot-inline flag on direct calls.
>     * ipa-inline.c (early_inliner): Copy the cannot-inline flag
>     from the statements to the edges.
>
>     * gcc.dg/torture/pr50890.c: New testcase.
>
> Index: gcc/gimple.h
> ===
> *** gcc/gimple.h (revision 180608)
> --- gcc/gimple.h (working copy)
> *** unsigned get_gimple_rhs_num_ops (enum tr
> *** 909,915
>   #define gimple_alloc(c, n) gimple_alloc_stat (c, n MEM_STAT_INFO)
>   gimple gimple_alloc_stat (enum gimple_code, unsigned MEM_STAT_DECL);
>   const char *gimple_decl_printable_name (tree, int);
> - bool gimple_fold_call (gimple_stmt_iterator *gsi, bool inplace);
>   tree gimple_get_virt_method_for_binfo (HOST_WIDE_INT, tree);
>   void gimple_adjust_this_by_delta (gimple_stmt_iterator *, tree);
>   tree gimple_extract_devirt_binfo_from_cst (tree);
> --- 909,914
> Index: gcc/gimple-fold.c
> ===
> *** gcc/gimple-fold.c (revision 180608)
> --- gcc/gimple-fold.c (working copy)
> *** gimple_extract_devirt_binfo_from_cst (tr
> *** 1057,1109
>      simplifies to a constant value. Return true if any changes were made.
>      It is assumed that the operands have been previously folded. */
>
> ! bool
>   gimple_fold_call (gimple_stmt_iterator *gsi, bool inplace)
>   {
>     gimple stmt = gsi_stmt (*gsi);
>     tree callee;
>
> !   /* Check for builtins that CCP can handle using information not
> !      available in the generic fold routines. */
> !   callee = gimple_call_fndecl (stmt);
> !   if (!inplace && callee && DECL_BUILT_IN (callee))
> !     {
> !       tree result = gimple_fold_builtin (stmt);
> !
> !       if (result)
> !         {
> !           if (!update_call_from_tree (gsi, result))
> !             gimplify_and_update_call_from_tree (gsi, result);
> !           return true;
> !         }
> !     }
>
>     /* Check for virtual calls that became direct calls. */
>     callee = gimple_call_fn (stmt);
>     if (callee && TREE_CODE (callee) == OBJ_TYPE_REF)
>       {
> -       tree binfo, fndecl, obj;
> -       HOST_WIDE_INT token;
> -
>         if (gimple_call_addr_fndecl (OBJ_TYPE_REF_EXPR (callee)) != NULL_TREE)
>           {
>             gimple_call_set_fn (stmt, OBJ_TYPE_REF_EXPR (callee));
> !           return true;
>           }
>
> !       obj = OBJ_TYPE_REF_OBJECT (callee);
> !       binfo = gimple_extract_devirt_binfo_from_cst (obj);
> !       if (!binfo)
> !         return false;
> !       token = TREE_INT_CST_LOW (OBJ_TYPE_REF_TOKEN (callee));
> !       fndecl = gimple_get_virt_method_for_binfo (token, binfo);
> !       if (!fndecl)
> !         return false;
> !       gimple
Re: [PATCH] strlenopt improvements
On Sat, Nov 05, 2011 at 11:44:27AM +0100, Eric Botcazou wrote:
> Thanks for the tip. Tested on Solaris 8 and Linux, applied on the mainline.

Thanks.

> @@ -32,10 +31,9 @@ main ()
>    return 0;
>  }
>
> -/* { dg-final { scan-tree-dump-times "strlen \\(" 0 "strlen" } } */
> +/* { dg-final { scan-tree-dump-times "strlen \\(" 3 "strlen" } } */
>  /* { dg-final { scan-tree-dump-times "memcpy \\(" 1 "strlen" } } */
> -/* { dg-final { scan-tree-dump-times "strcpy \\(" 0 "strlen" } } */
> +/* { dg-final { scan-tree-dump-times "strcpy \\(" 1 "strlen" } } */
>  /* { dg-final { scan-tree-dump-times "strcat \\(" 0 "strlen" } } */
>  /* { dg-final { scan-tree-dump-times "strchr \\(" 1 "strlen" } } */
> -/* { dg-final { scan-tree-dump-times "stpcpy \\(" 1 "strlen" } } */

This line perhaps could have been kept, just with s/1/0/ to also test
that when stpcpy prototype isn't provided we don't emit any stpcpy calls
in code that didn't originally have any of them.

    Jakub
Re: [Patch]: Fix PR rtl-optimization/50448
Paolo Bonzini wrote:
> On 11/04/2011 09:50 AM, Eric Botcazou wrote:
>> + /* If above failed and this is a single set, try to simplify the source of
>> +    the set given our substitution. We could perhaps try this for multiple
>> +    SETs, but it probably won't buy us anything. */
>> + rtx addr = simplify_replace_rtx (SET_DEST (set), from, to);
>>
>> What does "If above failed" refer to? Again "source" instead of
>> "destination".
>
> What about
>
>    /* Registers can also appear as uses in SET_DEST if it is a MEM. We
>       could perhaps try this for multiple SETs, but it probably won't
>       buy us anything. */
>
> ?
>
> Georg, can you put it all together into a v2?
>
> Paolo

Like so?

    PR rtl-optimization/50448
    * cprop.c (try_replace_reg): Also try to replace uses of FROM that
    appear in SET_DEST.

IMO the head comment is still misleading because it might give rise to the
assumption that SET_DESTs are replaced, too, which is not the case.

Johann

Index: cprop.c
===
--- cprop.c (revision 180962)
+++ cprop.c (working copy)
@@ -712,8 +712,8 @@ find_used_regs (rtx *xptr, void *data AT
     }
 }

-/* Try to replace all non-SET_DEST occurrences of FROM in INSN with TO.
-   Returns nonzero is successful. */
+/* Try to replace all uses of FROM in INSN with TO.
+   Return nonzero if successful. */

 static int
 try_replace_reg (rtx from, rtx to, rtx insn)
@@ -764,6 +764,18 @@ try_replace_reg (rtx from, rtx to, rtx i
     note = set_unique_reg_note (insn, REG_EQUAL, copy_rtx (src));
     }

+  if (set && MEM_P (SET_DEST (set)) && reg_mentioned_p (from, SET_DEST (set)))
+    {
+      /* Registers can also appear as uses in SET_DEST if it is a MEM.
+         We could perhaps try this for multiple SETs, but it probably
+         won't buy us anything. */
+      rtx addr = simplify_replace_rtx (SET_DEST (set), from, to);
+
+      if (!rtx_equal_p (addr, SET_DEST (set))
+          && validate_change (insn, &SET_DEST (set), addr, 0))
+        success = 1;
+    }
+
   /* REG_EQUAL may get simplified into register.
      We don't allow that. Remove that note.
      This code ought not to happen, because previous code ought to synthesize
Re: [Patch]: Fix PR rtl-optimization/50448
> PR rtl-optimization/50448
> * cprop.c (try_replace_reg): Also try to replace uses of FROM that
> appear in SET_DEST.

OK if it passes testing, with s/addr/dest/ as addr isn't an address at all.

> IMO the head comment is still misleading because it might give rise to the
> assumption that SET_DESTs are replaced, too, which is not the case.

"use" is to be understood as opposed to "set" so SET_DEST itself is excluded.

--
Eric Botcazou
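
The use/set distinction in this subthread can be illustrated with a toy model in plain C. This is only a sketch under stated assumptions: `struct toy_insn` and `toy_replace_uses` are hypothetical names invented here, and the model is far simpler than real RTL or the actual `try_replace_reg` in cprop.c. It shows just one point: a register written by an instruction is a set and is left alone, while a register that forms the address of a memory destination is only read, so it counts as a use.

```c
#include <stdbool.h>

/* Toy instruction (hypothetical, not GCC's RTL): either
     reg[dest] = reg[src]         when dest_is_mem is false, or
     mem[reg[dest]] = reg[src]    when dest_is_mem is true.  */
struct toy_insn
{
  bool dest_is_mem;   /* true: DEST names an address register */
  int dest;
  int src;
};

/* Replace uses of register FROM with TO in INSN.
   Return the number of replacements made.  */
int
toy_replace_uses (struct toy_insn *insn, int from, int to)
{
  int replaced = 0;

  /* The source operand is always a use.  */
  if (insn->src == from)
    {
      insn->src = to;
      replaced++;
    }

  /* A register destination is a set, so it is never replaced.  An
     address register inside a memory destination is only read, so it
     is a use and may be replaced.  */
  if (insn->dest_is_mem && insn->dest == from)
    {
      insn->dest = to;
      replaced++;
    }

  return replaced;
}
```

For a store through register 5 of register 5, both occurrences are uses and both are replaced; for a plain copy into register 5, only the source is touched.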
Re: PATCH: Move f16c intrinsics into f16cintrin.h
> Reposting, okay to commit after testing on x86_64 if no regressions?

The patch wasn't correctly installed and the ChangeLog is incomplete.

    * config/i386/f16cintrin.h: Contents moved from immintrin.h.
    * config/i386/immintrin.h: Include f16cintrin.h.
    * config.gcc (i[34567]86-*-*, x86_64-*-*): Add f16cintrin.h.

--
Eric Botcazou
Re: [v3] use NSDMI in C++11 mutex types
Also use NSDMI for std::once_flag

    PR libstdc++/49894
    PR bootstrap/50982
    * include/std/mutex (once_flag): Use NSDMI.

tested x86_64-linux, committed to trunk.

Index: include/std/mutex
===
--- include/std/mutex (revision 180749)
+++ include/std/mutex (working copy)
@@ -760,11 +760,11 @@
   {
   private:
     typedef __gthread_once_t __native_type;
-    __native_type _M_once;
+    __native_type _M_once = __GTHREAD_ONCE_INIT;

   public:
     /// Constructor
-    constexpr once_flag() noexcept : _M_once(__GTHREAD_ONCE_INIT) { }
+    constexpr once_flag() noexcept = default;

     /// Deleted copy constructor
     once_flag(const once_flag&) = delete;
Re: [Patch]: Fix PR rtl-optimization/50448
Eric Botcazou wrote:
>> PR rtl-optimization/50448
>> * cprop.c (try_replace_reg): Also try to replace uses of FROM that
>> appear in SET_DEST.
>
> OK if it passes testing, with s/addr/dest/ as addr isn't an address at all.

Ok, it's here:
http://gcc.gnu.org/viewcvs?view=revision&revision=181011

Johann
VZEROUPPER for simple_return?
Hi!

On Sat, Nov 05, 2011 at 10:50:44AM +0100, Jakub Jelinek wrote:
> On the following testcase with -m64 -O3 -mavx2 (but it is just an example,
> you can replace the loop there with any code that doesn't touch the
> stack or frame pointer at all), only f3 is shrink wrapped and in that case
> it on the other side doesn't add vzeroupper before leaving the AVX using
> code that it IMNSHO should. But I wonder why we can't shrink-wrap also

Here is a quick hack that deals with the missing vzeroupper issue.
Probably it would be nicer to create a helper in i386.c for that though,
because call_no_avx256 is an enum private to i386.c.

--- gcc/config/i386/i386.md.jj 2011-11-04 07:49:41.0 +0100
+++ gcc/config/i386/i386.md 2011-11-05 14:00:32.0 +0100
@@ -11725,6 +11725,12 @@ (define_expand "return"
   [(simple_return)]
   "ix86_can_use_return_insn_p ()"
 {
+  /* Emit vzeroupper if needed. */
+  if (TARGET_VZEROUPPER
+      && !TREE_THIS_VOLATILE (cfun->decl)
+      && !cfun->machine->caller_return_avx256_p)
+    emit_insn (gen_avx_vzeroupper (const2_rtx));
+
   if (crtl->args.pops_args)
     {
       rtx popc = GEN_INT (crtl->args.pops_args);
@@ -11741,6 +11747,12 @@ (define_expand "simple_return"
   [(simple_return)]
   "!TARGET_SEH"
 {
+  /* Emit vzeroupper if needed. */
+  if (TARGET_VZEROUPPER
+      && !TREE_THIS_VOLATILE (cfun->decl)
+      && !cfun->machine->caller_return_avx256_p)
+    emit_insn (gen_avx_vzeroupper (const2_rtx));
+
   if (crtl->args.pops_args)
     {
       rtx popc = GEN_INT (crtl->args.pops_args);

    Jakub
Re: C++ PATCH for c++/26714 (lifetime of temps in mem-initializers for reference members)
On 5 Nov 2011, at 03:24, Jason Merrill wrote:
> After my previous patch for 48370 which adds extend_ref_init_temps, it
> is straightforward to fix this issue as well by extending ref init
> mem-initializers to match the lifetime of 'this'.
>
> Tested x86_64-pc-linux-gnu, applying to trunk.
>
> commit 30ed5835a92df18afef71802b5fa95899ceca227
> Author: Jason Merrill
> Date: Fri Nov 4 14:59:20 2011 -0400
>
>     PR c++/26714
>     * init.c (perform_member_init): Strip TARGET_EXPR around NSDMI.
>     Do temporary lifetime extension.

either this or the previous patch has broken (or exposed a problem which
has broken) bootstrap on i686-darwin9 with:

libtool: compile: /GCC/gcc-4-7-trunk-build/./gcc/xgcc -shared-libgcc
  -B/GCC/gcc-4-7-trunk-build/./gcc -nostdinc++
  -L/GCC/gcc-4-7-trunk-build/i686-apple-darwin9/x86_64/libstdc++-v3/src
  -L/GCC/gcc-4-7-trunk-build/i686-apple-darwin9/x86_64/libstdc++-v3/src/.libs
  -B/GCC/gcc-4-7-install/i686-apple-darwin9/bin/
  -B/GCC/gcc-4-7-install/i686-apple-darwin9/lib/
  -isystem /GCC/gcc-4-7-install/i686-apple-darwin9/include
  -isystem /GCC/gcc-4-7-install/i686-apple-darwin9/sys-include
  -m64
  -I/GCC/gcc-4-7-trunk-build/i686-apple-darwin9/x86_64/libstdc++-v3/include/i686-apple-darwin9
  -I/GCC/gcc-4-7-trunk-build/i686-apple-darwin9/x86_64/libstdc++-v3/include
  -I/GCC/gcc-live-trunk/libstdc++-v3/libsupc++
  -fno-implicit-templates -Wall -Wextra -Wwrite-strings -Wcast-qual
  -fdiagnostics-show-location=once -fvisibility-inlines-hidden
  -ffunction-sections -fdata-sections -frandom-seed=functexcept.lo
  -g -O2 -m64 -std=gnu++0x
  -c /GCC/gcc-live-trunk/libstdc++-v3/src/functexcept.cc
  -fno-common -DPIC -o .libs/functexcept.o
In file included from /GCC/gcc-4-7-trunk-build/i686-apple-darwin9/x86_64/libstdc++-v3/include/future:41:0,
                 from /GCC/gcc-live-trunk/libstdc++-v3/src/functexcept.cc:32:
/GCC/gcc-4-7-trunk-build/i686-apple-darwin9/x86_64/libstdc++-v3/include/thread:195:46: internal compiler error: Segmentation fault
Please submit a full bug report,

gdb:

GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
Compiler executable checksum: d5643d919cac033598d9277e144abbda

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_INVALID_ADDRESS at address: 0xa5a5a5a5
0x00443fca in build_stmt (loc=6644384, code=CLEANUP_STMT) at /GCC/gcc-live-trunk/gcc/c-family/c-semantics.c:124
124       if (t && !TYPE_P (t))
(gdb) bt
#0  0x00443fca in build_stmt (loc=6644384, code=CLEANUP_STMT) at /GCC/gcc-live-trunk/gcc/c-family/c-semantics.c:124
#1  0x002da1f3 in push_cleanup (decl=0x2d36280, cleanup=0xa5a5a5a5, eh_only=0 '\0') at /GCC/gcc-live-trunk/gcc/cp/semantics.c:488
#2  0x00069fad in cp_finish_decl (decl=0x2d36280, init=0x428167b8, init_const_expr_p=0 '\0', asmspec_tree=0x0, flags=11) at /GCC/gcc-live-trunk/gcc/cp/decl.c:6320
#3  0x00223127 in cp_parser_init_declarator (parser=0x42827188, decl_specifiers=0xbfffe81c, checks=0x0, function_definition_allowed_p=1 '\001', member_p=0 '\0', declares_class_or_enum=0, function_definition_p=0xbfffe817 "", maybe_range_for_decl=0x0) at /GCC/gcc-live-trunk/gcc/cp/parser.c:15462
#4  0x00219e28 in cp_parser_simple_declaration (parser=0x42827188, function_definition_allowed_p=1 '\001', maybe_range_for_decl=0x0) at /GCC/gcc-live-trunk/gcc/cp/parser.c:10301
#5  0x00219c70 in cp_parser_block_declaration (parser=0x42827188, statement_p=0 '\0') at /GCC/gcc-live-trunk/gcc/cp/parser.c:10187
#6  0x00219a68 in cp_parser_declaration (parser=0x42827188) at /GCC/gcc-live-trunk/gcc/cp/parser.c:10092
#7  0x002196dd in cp_parser_declaration_seq_opt (parser=0x42827188) at /GCC/gcc-live-trunk/gcc/cp/parser.c:9978
#8  0x00221c85 in cp_parser_namespace_body (parser=0x42827188) at /GCC/gcc-live-trunk/gcc/cp/parser.c:14633
#9  0x00221c3b in cp_parser_namespace_definition (parser=0x42827188) at /GCC/gcc-live-trunk/gcc/cp/parser.c:14614
#10 0x002199b7 in cp_parser_declaration (parser=0x42827188) at /GCC/gcc-live-trunk/gcc/cp/parser.c:10076
#11 0x002196dd in cp_parser_declaration_seq_opt (parser=0x42827188) at /GCC/gcc-live-trunk/gcc/cp/parser.c:9978
#12 0x0020c287 in cp_parser_translation_unit (parser=0x42827188) at /GCC/gcc-live-trunk/gcc/cp/parser.c:3739
#13 0x002416dc in c_parse_file () at /GCC/gcc-live-trunk/gcc/cp/parser.c:26714
#14 0x0042edb3 in c_common_parse_file () at /GCC/gcc-live-trunk/gcc/c-family/c-opts.c:1122
#15 0x00e2a954 in compile_file () at /GCC/gcc-live-trunk/gcc/toplev.c:565
#16 0x00e2d634 in do_compile () at /GCC/gcc-live-trunk/gcc/toplev.c:1930
#17 0x00e2d81d in toplev_main (argc=57, argv=0xbfffebe4) at /GCC/gcc-live-trunk/gcc/toplev.c:2006
#18 0x00458618 in main (argc=57, argv=0xbfffebe4) at /GCC/gcc-live-trunk/gcc/main.c:36

cheers
Iain
Re: [Patch, libfortran] PR 45723 Revert previous fix
On 10/31/2011 11:51 AM, Janne Blomqvist wrote:
> Hi,
>
> I'd like to revert the fix for PR 45723 that was committed previously
> for the following reasons:
>
> - Using stat("/path/to/file", ...) to infer something about an open
>   file descriptor is racy.
>
> - As I argued in http://gcc.gnu.org/ml/fortran/2011-10/msg00133.html ,
>   I think the best approach is to do what the calling program asks us
>   to do, and then return failure if that doesn't work rather than
>   trying to fix stuff up in the library.

I agree, we just can not accommodate every OS in every situation.

OK to revert.

Jerry
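
The race mentioned in the first bullet can be shown with a small POSIX sketch. It assumes a POSIX system; `path_matches_fd` is a hypothetical helper written for this illustration and is not libgfortran code. The point is that a path and an open descriptor can stop referring to the same file at any time, so only `fstat` on the descriptor describes the file that is actually open.

```c
#include <sys/types.h>
#include <sys/stat.h>

/* Compare what the path names now with what the descriptor has open.
   Returns 1 if both refer to the same file, 0 if they differ (or the
   path no longer resolves), and -1 if the descriptor itself cannot be
   examined.  Hypothetical helper, for illustration only.  */
int
path_matches_fd (const char *path, int fd)
{
  struct stat by_path, by_fd;

  /* fstat describes the file that is actually open.  */
  if (fstat (fd, &by_fd) != 0)
    return -1;

  /* stat describes whatever the name refers to right now, which may
     have been renamed, replaced or removed since the open.  */
  if (stat (path, &by_path) != 0)
    return 0;

  return by_path.st_dev == by_fd.st_dev
         && by_path.st_ino == by_fd.st_ino;
}
```

Even a positive answer from such a check is only valid at the instant it is taken, which is one way to read the argument for simply attempting the requested operation and reporting failure.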
Re: [patch] 19/n: trans-mem: compiler tree/gimple stuff
>> Richi, if it's the use of the bit in the tree node that you're worried about,
>> we could probably put it in cgraph_node.local instead. But we do need the
>> knowledge.
>
> Yeah, I was worried about /* 1 bit left */ ;) Putting it in the cgraph
> node sounds more appealing indeed.

Richi, is this a blocker, or merely a suggestion? If this is a requirement
for merging, I can do so. Just want to make sure where best to spend my time.

If this is a suggestion, I can put it on my laundry list of future things
to do (after merge, 4.8?, etc).
SPU build broken (Re: CFT: [build] Move libgcc2 to toplevel libgcc)
Rainer Orth wrote:

> diff --git a/gcc/config/spu/t-spu-elf b/gcc/config/spu/t-spu-elf
> -# We exclude those because the libgcc2.c default versions do not support
> -# the SPU single-precision format (round towards zero). We provide our
> -# own versions below and/or via direct expansion.
> -LIB2FUNCS_EXCLUDE = _floatdisf _floatundisf _floattisf _floatunstisf

> diff --git a/libgcc/config/spu/t-elf b/libgcc/config/spu/t-elf
> +# We exclude those because the libgcc2.c default versions do not support
> +# the SPU single-precision format (round towards zero). We provide our
> +# own versions below and/or via direct expansion.
> +LIB2ADD = _floatdisf _floatundisf _floattisf _floatunstisf

This seems to have caused:

make[2]: Entering directory `/home/kwerner/dailybuild/spu-tc-2011-11-05/gcc-build/spu/libgcc'
Makefile:792: *** Unsupported files in LIB2ADD or LIB2ADD_ST.. Stop.

Shouldn't the variable still be called LIB2FUNCS_EXCLUDE after the move
to libgcc? LIB2ADD seems to expect full file names ...

Bye,
Ulrich

--
Dr. Ulrich Weigand
GNU Toolchain for Linux on System z and Cell BE
ulrich.weig...@de.ibm.com
Re: [Patch, Fortran] Cleanup of gfc_extend_expr
Sounds like what everything needs is a differently named enum: say three_way_logic.

On Nov 3, 2011, at 3:56 PM, Janus Weil wrote:

>> At least add a comment about the re-use (abuse?) of the
>> enum.
>
> Updated patch attached, which adds a short comment on the usage of 'match'.
>
>
>> This should reduce confusion months from when
>> someone wonders why gfc_extend_expr returns a "match"
>> for a non-matching function.
>
> Well, I think my approach is not as far-fetched as you seem to imply:
> There are already a good number of procedures which use the 'match'
> enum, although they're not related to matching at all. Listing only
> those that occur in gfortran.h (I'm sure there are more):
>
> * match gfc_mod_pointee_as (gfc_array_spec *);
> * match gfc_intrinsic_func_interface (gfc_expr *, int);
> * match gfc_intrinsic_sub_interface (gfc_code *, int);
> * match gfc_iso_c_sub_interface(gfc_code *, gfc_symbol *);
>
> The reason for this is of course that the YES/NO/ERROR triple is not
> only useful in matching, but also in many other situations.
>
> Cheers,
> Janus
>
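
To make the suggestion concrete, here is a minimal C sketch of a generically named YES/NO/ERROR type and one non-matching use of it. Everything in it is hypothetical (the enumerator names, `struct override` and `find_override`); it is not the gfortran `match` enum or any existing gfortran interface, only an illustration of why the triple is useful outside of parsing.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical generic three-valued result, as suggested above.  */
typedef enum
{
  TWL_NO,      /* nothing applicable was found; not an error */
  TWL_YES,     /* something applicable was found and applied */
  TWL_ERROR    /* the request itself was malformed */
} three_way_logic;

/* Hypothetical table entry: a user-defined operator and its handler.  */
struct override
{
  const char *op;
  int handler_id;
};

/* Look up OP in TABLE.  On TWL_YES, *HANDLER_ID receives the handler.  */
three_way_logic
find_override (const struct override *table, size_t n,
               const char *op, int *handler_id)
{
  size_t i;

  if (table == NULL || op == NULL || handler_id == NULL)
    return TWL_ERROR;

  for (i = 0; i < n; i++)
    {
      if (table[i].op == NULL)
        return TWL_ERROR;   /* corrupt entry */
      if (strcmp (table[i].op, op) == 0)
        {
          *handler_id = table[i].handler_id;
          return TWL_YES;
        }
    }

  return TWL_NO;
}
```

A caller can then fall back to the default behaviour on TWL_NO and stop with a diagnostic on TWL_ERROR, which is the same distinction the existing callers make with the 'match' values.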
Re: [commited, rtems] Sync gcc/config/rs6000/*rtems* with RTEMS-gcc
I'd like to remind you that specs such as

  %{!Dppc*: %{!Dmpc*: -Dppc8540}}

(which you were adding to match existing specs) have been broken for some
time, since the canonical form of -D options now has separate arguments
(and they would never have worked with separate-argument -D options).

* Defining new macros like that in the user namespace is in any case
  frowned upon, and certainly they should not be defined if -ansi, -std=i*
  or -std=c* (PR 545) (I haven't actually checked the circumstances in
  which this spec is used).

* Similarly, in conformance modes the compiler shouldn't care about whether
  the user has defined such user-namespace macros on the command line.

* If, nevertheless, you want such specs to work (possibly for macros in
  the implementation namespace), see PR 48524; there are some use cases for
  which specs matching option arguments like this would be useful.

--
Joseph S. Myers
jos...@codesourcery.com
Re: [doc] Fix a cross reference
On Sat, 5 Nov 2011, Mingjie Xing wrote:

> Hello,
>
> This small patch fixes a cross reference in the GCC documentation.
>
> 2011-11-05  Mingjie Xing
>
>     * doc/invoke.texi (Wunused-result): Change @pxref{Variable Attributes}
>     to @pxref{Function Attributes}.
>
> Is it OK?

OK.

--
Joseph S. Myers
jos...@codesourcery.com
put __stl_prime_list in comdat section
Hi, the following patch is a follow up to the one posted here
http://gcc.gnu.org/ml/gcc-patches/2009-05/msg01293.html.
The new patch is a header only change and can greatly reduce
rodata section size for some programs.

Ok for trunk after testing?

thanks,
David

cl
Description: Binary data
stl_prime.p
Description: Binary data
Re: put __stl_prime_list in comdat section
On 11/05/2011 07:32 PM, Xinliang David Li wrote: Hi, the following patch is a follow up to the one posted here http://gcc.gnu.org/ml/gcc-patches/2009-05/msg01293.html. The new patch is a header only change and can greatly reduce rodata section size for some programs. Ok for trunk after testing? As usual for backward/ stuff, I would say largely Ian's call, but please don't use get_prime_list unuglified, prefer __get_prime_list, or something with _S_* as prefix, being a static member. Thanks, Paolo. PS: also, make sure to post library patches to libstdc++ too, and, possibly, add [v3] to the Subject, otherwise you can easily fail to get the attention of the library people.
Re: New port^2: Renesas RL78
On 11/04/2011 10:09 PM, DJ Delorie wrote: > The problem I'm trying to solve with that is that there's only one > segment register (ES) so you only need to force an operand non-far if > *both* operands are far. Not sure if the function is implemented that > way, but I coded the expanders that way. Ah, I missed that. No, the generic code only looks at a single predicate at a time. If you're looking at two, then you do have to take care of it yourself. >>> if (CONST_INT_P (operand1) && ! IN_RANGE (INTVAL (operand1), (-1 << 8) >>> + 1, (1 << 8) - 1)) >>> FAIL; >> >> Huh? This would be an assert, not a FAIL. But why have it? >> >> It sounds like something that should have been caught way up >> the chain... > > "should" he says ;-) Indeed. I'll go so far as to say that it's a generic bug that needs fixing. I don't suppose this triggers during a normal build? If so, please file a bug and cc me. >>> (define_expand "addsi3" >>> [(set (match_operand:SI 0 "register_operand" "=&v") >>> (plus:SI (match_operand:SI 1 "nonmemory_operand" "vi") >>> (match_operand2 "nonmemory_operand" "vi"))) >>>] >>> "" >>> "if (!nonmemory_operand (operands[1], SImode)) >>> operands[1] = force_reg (SImode, operands[1]); >>>if (!nonmemory_operand (operands[1], SImode)) >>> operands[2] = force_reg (SImode, operands[2]);" >>> ) >> >> Drop the register constrains in the expander. Use register_operand >> in the expander and drop the force_reg bits. > > The pattern *does* accept immediates as-is. Not sure why that extra > check is in there, though... might be that parts of gcc call > gen_addsi3() without checking the predicates first. I've seen it do > that for moves. Yes, I suppose that's possible. Especially given add's use inside reload. > Yes, it's just like alloca() - it detects a stack shrink in the next > call to the library function and frees up any stubs that are "off the > stack". 
The call in the epilogue is to increase the odds of properly > detecting such a case, since we know at that point it's out of scope. Excellent. r~
[v3] Re: put __stl_prime_list in comdat section
thanks. The attached is the revised patch. David On Sat, Nov 5, 2011 at 11:52 AM, Paolo Carlini wrote: > On 11/05/2011 07:32 PM, Xinliang David Li wrote: >> >> Hi, the following patch is a follow up to the one posted here >> http://gcc.gnu.org/ml/gcc-patches/2009-05/msg01293.html. >> >> The new patch is a header only change and can greatly reduce rodata >> section size for some programs. >> >> Ok for trunk after testing? > > As usual for backward/ stuff, I would say largely Ian's call, but please > don't use get_prime_list unuglified, prefer __get_prime_list, or something > with _S_* as prefix, being a static member. > > Thanks, > Paolo. > > PS: also, make sure to post library patches to libstdc++ too, and, possibly, > add [v3] to the Subject, otherwise you can easily fail to get the attention > of the library people. > stl_prime.p Description: Binary data
Re: PATCH: Check HARD_FRAME_POINTER_REGNUM instead of hard_frame_pointer_rtx in dwarf2out_frame_debug_expr
On 11/04/2011 01:29 PM, H.J. Lu wrote: > 2011-11-04 H.J. Lu > > * dwarf2cfi.c (dwarf2out_frame_debug_expr): Check > HARD_FRAME_POINTER_REGNUM instead of hard_frame_pointer_rtx > in Rule 18. Ok. r~
Re: RFA: Add Epiphany port
> #define __builtin_epiphany_fmsub(a, b, c) __builtin_fmaf (-b, c, a) Needs -(b), or conversion to __always_inline__ functions, as with the intrinsics used by the i386 target. Otherwise, you've taken care of all of my concerns. r~
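A small sketch of why the review asks for -(b): with the unparenthesized macro, an argument such as x - y expands to -x - y rather than -(x - y). The macro names FMSUB_UNSAFE and FMSUB_SAFE, and the helper fma_stub (a plain multiply-add standing in for __builtin_fmaf so the sketch needs no math library), are assumptions made for this illustration and are not part of the Epiphany port.

```c
/* Stand-in for __builtin_fmaf (x, y, z): computes x * y + z.
   Hypothetical helper; a real fused multiply-add rounds only once.  */
static float
fma_stub (float x, float y, float z)
{
  return x * y + z;
}

/* Shape of the macro as quoted in the review: the negation binds only
   to the first token of the argument.  */
#define FMSUB_UNSAFE(a, b, c) fma_stub (-b, c, a)

/* Shape the review asks for: the whole argument is negated.  */
#define FMSUB_SAFE(a, b, c) fma_stub (-(b), c, a)

/* The other option named in the review: an inline function, which
   evaluates its arguments before negating.  */
static inline float
fmsub_inline (float a, float b, float c)
{
  return fma_stub (-b, c, a);
}

float
demo_unsafe (float x, float y)
{
  /* Expands to fma_stub (-x - y, 2.0f, 10.0f).  */
  return FMSUB_UNSAFE (10.0f, x - y, 2.0f);
}

float
demo_safe (float x, float y)
{
  /* Expands to fma_stub (-(x - y), 2.0f, 10.0f).  */
  return FMSUB_SAFE (10.0f, x - y, 2.0f);
}

float
demo_inline (float x, float y)
{
  return fmsub_inline (10.0f, x - y, 2.0f);
}
```

With x = 3 and y = 1 the unsafe form computes 10 - (3 + 1) * 2, while the safe and inline forms compute 10 - (3 - 1) * 2.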
Re: [patch] 3/n: trans-mem: runtime
On Thu, 3 Nov 2011, Aldy Hernandez wrote: > --- libitm/method-wbetl.cc(.../trunk) (revision 0) > +++ libitm/method-wbetl.cc(.../branches/transactional-memory) > (revision 180773) > @@ -0,0 +1,628 @@ > +/* Copyright (C) 2009 Free Software Foundation, Inc. > + Contributed by Richard Henderson . I believe this should become "2009, 2011" or "2009, 2010, 2011" when it's applied to trunk. Gerald
cxx-mem-model merge ... status
So the comments seem to have quieted down since yesterday afternoon. There was nothing too major; mostly it was documentation issues. Is it OK to apply the patches to mainline? If it's OK, I'd like to try applying them Sunday morning... Any further issues/comments can be dealt with after merging. Should I also post the exact patches I check in, or is what I've posted already sufficient? I don't need a slush. I'll get a mainline checkout and build it, apply the patch, rebuild & test, and check in. The patches aren't hugely intrusive, so I'm sure I can manage around the normal check-in procedure. Andrew
[Patch, Fortran, committed] Add libquadmath testcase gfortran.dg/quad_2.f90
Motivated by the report at http://groups.google.com/group/comp.lang.fortran/browse_thread/thread/6373a2dfe64f0b83 There is currently no run-test check that libquadmath actually works. The attached and committed (Rev. 181015) adds one which tests for libquadmath that I/O read/write works. Additionally, it checks that the result for sqrt(2.0) is OK. The test uses the largest available floating-point number - be it 8, 10 or 16 - and tests for that. The checks should be thus OK for any system. Regarding the issue mentioned in the linked report: Kai could reproduce the issue - he also gets "0.0" under MinGW64 with the current trunk. (The report was for MinGW32 4.7.0 and allegedly it worked for 4.6.2.) Tobias Index: gcc/testsuite/gfortran.dg/quad_2.f90 === --- gcc/testsuite/gfortran.dg/quad_2.f90(Revision 0) +++ gcc/testsuite/gfortran.dg/quad_2.f90(Revision 0) @@ -0,0 +1,63 @@ +! { dg-do run } +! +! This test checks whether the largest possible +! floating-point number works. +! +! This is a run-time check. Depending on the architecture, +! this tests REAL(8), REAL(10) or REAL(16) and REAL(16) +! might be a hardware or libquadmath 128bit number. +! +program test_qp + use iso_fortran_env, only: real_kinds + implicit none + integer, parameter :: QP = real_kinds(ubound(real_kinds,dim=1)) + real(qp) :: fp1, fp2, fp3, fp4 + character(len=80) :: str1, str2, str3, str4 + fp1 = 1 + fp2 = sqrt (2.0_qp) + write (str1,*) fp1 + write (str2,'(g0)') fp1 + write (str3,*) fp2 + write (str4,'(g0)') fp2 + +! print '(3a)', '>',trim(str1),'<' +! print '(3a)', '>',trim(str2),'<' +! print '(3a)', '>',trim(str3),'<' +! 
print '(3a)', '>',trim(str4),'<' + + read (str1, *) fp3 + if (fp1 /= fp3) call abort() + read (str2, *) fp3 + if (fp1 /= fp3) call abort() + read (str3, *) fp4 + if (fp2 /= fp4) call abort() + read (str4, *) fp4 + if (fp2 /= fp4) call abort() + + select case (qp) + case (8) + if (str1 /= " 1.") call abort() + if (str2 /= "1.") call abort() + if (str3 /= " 1.4142135623730951") call abort() + if (str4 /= "1.4142135623730951") call abort() + case (10) + if (str1 /= " 1.") call abort() + if (str2 /= "1.") call abort() + if (str3 /= " 1.41421356237309504876") call abort() + if (str4 /= "1.41421356237309504876") call abort() + case (16) + if (str1 /= " 1.000") call abort() + if (str2 /= "1.000") call abort() + if (str3 /= " 1.41421356237309504880168872420969798") call abort() + if (str4 /= "1.41421356237309504880168872420969798") call abort() + block + real(qp), volatile :: fp2a + fp2a = 2.0_qp + fp2a = sqrt (fp2a) + if (abs (fp2a - fp2) > sqrt(2.0_qp)-nearest(sqrt(2.0_qp),-1.0_qp)) call abort() + end block + case default + call abort() + end select + +end program test_qp Index: gcc/testsuite/ChangeLog === --- gcc/testsuite/ChangeLog (Revision 181014) +++ gcc/testsuite/ChangeLog (Arbeitskopie) @@ -1,3 +1,7 @@ +2011-11-05 Tobias Burnus + + * gfortran.dg/quad_2.f90: New. + 2011-11-05 Eric Botcazou * gcc.dg/strlenopt-22g.c: New wrapper around... @@ -26,10 +30,10 @@ 2011-10-09 Magnus Fromreide -* g++.dg/cpp0x/enum21a.C: Test that enum x { y, } does -generate a pedwarn in c++98-mode. -* g++.dg/cpp0x/enum21b.C: Test that enum x { y, } -don't generate a pedwarn in c++0x-mode. + * g++.dg/cpp0x/enum21a.C: Test that enum x { y, } does + generate a pedwarn in c++98-mode. + * g++.dg/cpp0x/enum21b.C: Test that enum x { y, } + don't generate a pedwarn in c++0x-mode. 2011-11-04 Olivier Goffart
Re: [trans-mem] document -fgnu-tm
On Thu, 3 Nov 2011, Aldy Hernandez wrote: > OK for branch? * doc/invoke.texi (C Dialect Options): Document -fgnu-tm. Index: doc/invoke.texi === +When the option @option{-fgnu-tm} is specified, the compiler will +generate code for the Linux variant of Intel's current Transactional I assume this ought to be "GNU/Linux variant"? +Memory ABI specification document (Revision 1.1, May 6 2009). This is +an experimental feature whose interface may change in future versions +of GCC, as the official specification changes. Please note that not +all architectures are supported for this feature. Where can the user find which platforms are supported? +For more information on GCC's support for transactional memory, see +the accompanying documentation for @file{libitm}. Ah, here it is. How does one find that documentation? Gerald
Re: [patch] 19/n: trans-mem: compiler tree/gimple stuff
[rth, see below] local_define_builtin ("__builtin_eh_pointer", ftype, BUILT_IN_EH_POINTER, "__builtin_eh_pointer", ECF_PURE | ECF_NOTHROW | ECF_LEAF); + if (flag_tm) +apply_tm_attr (builtin_decl_explicit (BUILT_IN_EH_POINTER), + get_identifier ("transaction_pure")); I think this should use a new ECF_TM_PURE flag, unconditionally set with handling in the functions that handle/return ECF flags so that transitioning this to a tree node flag instead of an attribute is easier. I could add a ECF_TM_PURE flag and attach it to the BUILT_IN_EH_POINTER in the local_define_builtin above, but we still need the attribute for function decl's as in: __attribute__((transaction_pure)) void foo(); Attributes seem like a clean way to approach this. I don't see what the flag buys us. Or am I misunderstanding something? +/* Nonzero if this call performs a transactional memory operation. */ +#define ECF_TM_OPS (1<< 11) What's this flag useful for? Isn't it the case that you want to conservatively know whether a call might perform a tm operation? Thus, the flag should be inverted? Is this the same as "TM pure"? Richard? +case GIMPLE_TRANSACTION: + return (weights->tm_cost + + estimate_num_insns_seq (gimple_transaction_body (stmt), + weights)); + Huh, so we now have non-lowered gimple sub-sequence throughout all optimizations (inlining especially)? :( Richard addressed this elsewhere. I think I miss tree-cfg.c parts that do any verification of the new gimple kinds. Yes, they're there. I see you commented on them in the middle/end patch. I will fix the issues you brought up on that thread. ? Why not use GF_CALL_CANNOT_INLINE? As per Michael Matz's suggestion, I have removed all reference to this unused flag. +static inline void +gimple_call_set_noinline_p (gimple s) +{ + GIMPLE_CHECK (s, GIMPLE_CALL); + s->gsbase.subcode |= GF_CALL_NOINLINE; +} See above. We have *_cannot_inline already. Similarly here. 
Richi, I have fixed or addressed all the issues in this thread, with the exception of your ECF_TM_PURE and ECF_TM_OPS questions, which I am deferring to rth and will then fix if required. I will now go through the middle-end thread (which was erroneously also prefixed with [patch] 19/n...). Aldy
Re: [trans-mem] document -fgnu-tm
Torvald, is this documentation somewhere public whose link we can add to the committed patch below? I know the latest draft is not available, but at least Revision 1.1 which we reference below? Aldy +Memory ABI specification document (Revision 1.1, May 6 2009). This is +an experimental feature whose interface may change in future versions +of GCC, as the official specification changes. Please note that not +all architectures are supported for this feature. Where can the user find which platforms are supported? +For more information on GCC's support for transactional memory, see +the accompanying documentation for @file{libitm}. Ah, here it is. How does one find that documentation?
Re: [trans-mem] document -fgnu-tm
On Sat, 2011-11-05 at 14:08 -0700, Aldy Hernandez wrote: > > Torvald, is this documentation somewhere public whose link we can add to > the committed patch below? I know the latest draft is not available, but > at least Revision 1.1 which we reference below? I don't know of any stable URL. The ABI PDF is linked to from http://software.intel.com/en-us/articles/intel-c-stm-compiler-prototype-edition/ but I don't think the final URL is supposed to be used for deep linking, or guaranteed to be stable. > > +Memory ABI specification document (Revision 1.1, May 6 2009). This should be Revision 1.0.1, Nov 12 2008. The rev you had there is for the API / language spec I think. > This is > > +an experimental feature whose interface may change in future versions > > +of GCC, as the official specification changes. Please note that not > > +all architectures are supported for this feature. > > > > Where can the user find which platforms are supported? This should be added to libitm too I guess. > > > > +For more information on GCC's support for transactional memory, see > > +the accompanying documentation for @file{libitm}. > > > > Ah, here it is. How does one find that documentation? Can you put an xref there (or whatever you used for the other link) instead of the @file{libitm}? Thanks!
Re: PATCH: Add capability to contrib/compare_tests to handle directories
On Nov 4, 2011, at 8:23 PM, Quentin Neill wrote: > This patch concatenates the common .sum files before comparing. > > Okay to commit? Ok, thanks for the contribution.
Re: C++ PATCH for c++/26714 (lifetime of temps in mem-initializers for reference members)
On 11/05/2011 10:32 AM, Iain Sandoe wrote: either this or the previous patch has broken (or exposed a problem which has broken) bootstrap on i686-darwin9 with: I've mostly reverted the previous patch, does that fix bootstrap for you? I don't understand yet what the problem is. Jason
Re: C++ PATCH for c++/26714 (lifetime of temps in mem-initializers for reference members)
On 5 Nov 2011, at 21:30, Jason Merrill wrote: On 11/05/2011 10:32 AM, Iain Sandoe wrote: either this or the previous patch has broken (or exposed a problem which has broken) bootstrap on i686-darwin9 with: I've mostly reverted the previous patch, does that fix bootstrap for you? I don't understand yet what the problem is. I'll set it running now... try to report back today... looks like a GTY kinda issue from the pattern showing in the gdb output/ thanks, Iain
Re: [trans-mem] document -fgnu-tm
On 11/05/2011 02:25 PM, Torvald Riegel wrote: > I don't know of any stable URL. The ABI PDF is linked to from > http://software.intel.com/en-us/articles/intel-c-stm-compiler-prototype-edition/ > but I don't think the final URL is supposed to be used for deep linking, > or guaranteed to be stable. We have the GCC wiki. We can put a copy there and know that it's going to be stable. r~
Re: [PATCH 1/1] sparc leon: Use -Aleon assembler switch for -mcpu=leon arch
From: Konrad Eisele Date: Tue, 1 Nov 2011 10:22:13 +0100 > Use -Aleon to enable binutils sparc-leon architecture. The leon-arch > binutils GAS has umul/smul and casa enabled. > > Signed-off-by: Konrad Eisele You can't just add new assembler options that are only right now being added to binutils. Instead, you have to add a test to make sure that the binutils being used alongside gcc actually supports the new option, and only pass it in if so. And subsequently, you cannot emit instructions that require the -Aleon option unless this support is present too. This is exactly what we had to do for VIS 3.0, FMAF, and HPC instructions on Niagara3. Look at the "FMAF, HPC, and VIS 3.0 instructions" gcc_GAS_CHECK_FEATURE code in gcc/configure.ac for how to do such a test. Then once you have that, you can conditionalize instruction emission as well based upon whether the feature CPP define is set or not.
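As a rough sketch of the last step described above, the fragment below gates both the assembler option and instruction emission on a configure-provided define. HAVE_AS_LEON is a hypothetical macro name chosen for this illustration (the message does not name one), and the two functions are made-up stand-ins, not code from the sparc backend.

```c
/* HAVE_AS_LEON is a hypothetical feature macro; the assumption is that
   a configure-time assembler check would define it only when the
   assembler accepts -Aleon.  */

/* Option to pass to the assembler for -mcpu=leon; empty when the
   assembler in use does not know -Aleon.  */
const char *
leon_asm_cpu_option (void)
{
#ifdef HAVE_AS_LEON
  return "-Aleon";
#else
  return "";
#endif
}

/* Whether instructions that need -Aleon may be emitted at all.  */
int
can_emit_leon_insns (int cpu_is_leon)
{
#ifdef HAVE_AS_LEON
  return cpu_is_leon;
#else
  (void) cpu_is_leon;
  return 0;
#endif
}
```

Built without the define, the option string is empty and no leon-only instructions are allowed, which is the behaviour wanted with an older binutils.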
Re: [trans-mem] document -fgnu-tm
On Sat, 2011-11-05 at 14:38 -0700, Richard Henderson wrote: > On 11/05/2011 02:25 PM, Torvald Riegel wrote: > > I don't know of any stable URL. The ABI PDF is linked to from > > http://software.intel.com/en-us/articles/intel-c-stm-compiler-prototype-edition/ > > but I don't think the final URL is supposed to be used for deep linking, > > or guaranteed to be stable. > > We have the GCC wiki. We can put a copy there and know that it's > going to be stable. IANAL, but I don't see anything in the document's disclaimer/legal section that would allow us to redistribute this document. IMHO, citing it by using title, publisher (Intel), and date/revision should be okay in absence of any better source. Torvald
Re: C++ PATCH for c++/26714 (lifetime of temps in mem-initializers for reference members)
On 5 Nov 2011, at 21:32, Iain Sandoe wrote: On 5 Nov 2011, at 21:30, Jason Merrill wrote: On 11/05/2011 10:32 AM, Iain Sandoe wrote: either this or the previous patch has broken (or exposed a problem which has broken) bootstrap on i686-darwin9 with: I've mostly reverted the previous patch, does that fix bootstrap for you? I don't understand yet what the problem is. I'll set it running now... try to report back today... we're on stage2 - so looking good. looks like a GTY kinda issue from the pattern showing in the gdb output/ two more data points - 1) doesn't fail on either powerpc-darwin9 or x86-64-darwin10. 2) doesn't fail if compiled "-save-temps" .. makes one think that it's sensitive to the headers used cheers Iain
Re: cxx-mem-model merge ... status
On Sat, Nov 5, 2011 at 9:47 PM, Andrew MacLeod wrote: > So the comments seem have quieted down yesterday afternoon and today. There > was nothing too major, mostly it was documentation issues. > > Is it OK to apply the patches to mainline? If its OK, I'd like to try > applying them sunday morning... Any further issues/comment can be dealt > with after merging. Should I also post the exact patches I check-in?, or > is what I've posted already sufficient? > > I don't need a slush. I'll get a mainline checkout and build it, apply the > patch rebuild & test and checkin. The patches arent hugely intrusive so Im > sure I can manage around normal checkin procedure. Sounds good. Posting the exact patches checked in would be nice. Thanks, Richard. > Andrew > > >
Re: [patch] 19/n: trans-mem: compiler tree/gimple stuff
On Sat, Nov 5, 2011 at 4:09 PM, Aldy Hernandez wrote: > >>> Richi, if it's the use of the bit in the tree node that you're worried >>> about, >>> we could probably put it in cgraph_node.local instead. But we do need >>> the >>> knowledge. >> >> Yeah, I was worried about /* 1 bit left */ ;) Putting it in the >> cgraph node sounds more appealing >> indeed. > > > Richi, is this a blocker, or merely a suggestion? If this is a requirement > for merging, I can do so. Just want to make sure where best to spend my > time. Well - we usually don't grab bits off the tree nodes lightly. Especially if the cgraph seems to be more fit. > If this is a suggestion, I can put it on my laundry list of future things > todo (after merge, 4.8?, etc). There are not many consumers of the flag, so fixing it shouldn't be hard. For 4.7 definitely. Thanks, Richard.
Re: [trans-mem] document -fgnu-tm
Richard Henderson wrote: We have the GCC wiki. We can put a copy there and know that it's going to be stable. Jumping on that topic: Is there actually a possibility to "undo" attachment deletes in the wiki? I remember someone using a recursive "wget" on the wiki, which deleted Graphite PDFs on the way. The problem is that the Attachment page has, e.g., [delete | move | load | show] (2010-01-31 06:10:04, 337.6 KB) [[attachment:graphite2y-slides.pdf]] And the script hit the delete link. I think that, contrary to "upload", there is no complicated check; I think the file is even deleted right away without asking "are you sure?" - but it might be that it does ask and the crawler went over it. In any case, I failed to find an "undelete" as a user - and Sebastian was able to upload the files again. Tobias
Re: [patch] 19/n: trans-mem: compiler tree/gimple stuff
On Sat, Nov 5, 2011 at 10:05 PM, Aldy Hernandez wrote: > [rth, see below] > >>> local_define_builtin ("__builtin_eh_pointer", ftype, >>> BUILT_IN_EH_POINTER, >>> "__builtin_eh_pointer", ECF_PURE | ECF_NOTHROW | >>> ECF_LEAF); >>> + if (flag_tm) >>> + apply_tm_attr (builtin_decl_explicit (BUILT_IN_EH_POINTER), >>> + get_identifier ("transaction_pure")); >> >> I think this should use a new ECF_TM_PURE flag, unconditionally set >> with handling in the functions that handle/return ECF flags so that >> transitioning this to a tree node flag instead of an attribute is easier. > > I could add a ECF_TM_PURE flag and attach it to the BUILT_IN_EH_POINTER in > the local_define_builtin above, but we still need the attribute for function > decl's as in: > > __attribute__((transaction_pure)) void foo(); > > Attributes seem like a clean way to approach this. The middle-end interfacing is supposed to be via ECF_ flags, the user interface via attributes. What's the semantic of transaction-pure vs. ... > I don't see what the flag buys us. Or am I misunderstanding something? > >>> +/* Nonzero if this call performs a transactional memory operation. */ >>> +#define ECF_TM_OPS (1<< 11) >> >> What's this flag useful for? Isn't it the case that you want to >> conservatively >> know whether a call might perform a tm operation? Thus, the flag >> should be inverted? Is this the same as "TM pure"? ... this? > Richard? > Richi, I have fixed or addressed all the issues in this thread, with the > exception of your EFC_TM_PURE and ECF_TM_OPS questions, which I am deferring > to rth and then fixing if required. Yeah, seems to be still an open question. Thanks, Richard.
Re: [PATCH] Straight-line strength reduction, stage 1
On Fri, 2011-11-04 at 14:55 +0100, Richard Guenther wrote: > On Sun, Oct 30, 2011 at 5:10 PM, William J. Schmidt > wrote: > > > > You do not seem to transform stmts on-the-fly, so for > > a1 = c + d; > a2 = c + 2*d; > a3 = c + 3*d; > > are you able to generate > > a1 = c + d; > a2 = a1 + d; > a3 = a2 + d; > > ? On-the-fly operation would also handle this if the candidate info > for a2 is kept as c + 2*d. Though it's probably worth lookign at > > a1 = c + d; > a2 = a1 + d; > a3 = c + 3*d; > > and see if that still figures out that a3 = a2 + d (thus, do you, > when you find the candidate a1 + 1 * d, fold in candidate information > for a1? thus, try to reduce it to an ultimate base?) > > Thanks, > Richard. Just a couple of quick thoughts here. As I mentioned, this patch is only for the cases where the stride is a constant. The only interesting patterns I could think of for that case is what I'm currently handling, where an add-immediate feeds a multiply, e.g., y = (x + c) * d where c and d are constants. Once the stride is a variable, we have not only those cases, but also cases like you show here where the multiply feeds into an add. Those can be handled with the existing infrastructure in a slightly different way. The main differences are: - The cand_stmt will be the add in this case. We always want the candidate to be the statement that we hope to replace. - The base_name will be the "ultimate base," so that all the original candidates in your example will have c for the base. This may involve looking back through casts. - The index will be the multiplier applied to the stride. The logic for finding the nearest dominating basis will be pretty much identical. The candidate table again contains enough information that we don't need to do on-the-fly replacement, but can examine all the related candidates at once. 
This will be important for the add-feeding-multiply case with an SSA name stride, since we sometimes need to introduce multiplies by a constant in order to remove general multiplies of two registers. But again, that's all for a follow-on patch. My thought was to get this one set of easy candidates handled in a first patch so you could get a look at the general infrastructure without having to review a huge chunk of code at once. Once that patch is in place, the next stages would be: 2. SSA-name strides, both multiply-add and add-multiply forms. 3. Cases involving conditional increments (looking through PHIs). 4. Cases where the multiplies are hidden in addressing expressions. I have a pretty good idea where I'm going with stages 2 and 3. Stage 4 is where things are likely to get a bit bloodier, and I will be glad for any advice about the best way to handle those as we get to that point. Thanks again, Bill
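To make the two shapes discussed in this message concrete, here is a small sketch of the before and after forms. It shows only the arithmetic equivalence the transformation relies on, not how the patch finds candidates; the function names and the constants 2, 5 and 4 are invented for the example, and unsigned arithmetic is used so that wraparound is well defined.

```c
/* Add chain from the quoted review: every value is C plus a multiple
   of the stride D.  */
void
chain_original (unsigned c, unsigned d, unsigned out[3])
{
  out[0] = c + d;
  out[1] = c + 2 * d;
  out[2] = c + 3 * d;
}

/* Same values, each derived from the previous one by a single add.  */
void
chain_reduced (unsigned c, unsigned d, unsigned out[3])
{
  out[0] = c + d;
  out[1] = out[0] + d;
  out[2] = out[1] + d;
}

/* Constant-stride form y = (x + c) * d, with c and d constants.  */
void
const_stride_original (unsigned x, unsigned out[2])
{
  out[0] = (x + 2) * 4;
  out[1] = (x + 5) * 4;
}

/* The second multiply becomes an add of (5 - 2) * 4 = 12.  */
void
const_stride_reduced (unsigned x, unsigned out[2])
{
  out[0] = (x + 2) * 4;
  out[1] = out[0] + 12;
}

/* Returns 1 when both versions of the add chain agree.  */
int
chains_agree (unsigned c, unsigned d)
{
  unsigned a[3], b[3];
  chain_original (c, d, a);
  chain_reduced (c, d, b);
  return a[0] == b[0] && a[1] == b[1] && a[2] == b[2];
}

/* Returns 1 when both versions of the constant-stride form agree.  */
int
const_strides_agree (unsigned x)
{
  unsigned a[2], b[2];
  const_stride_original (x, a);
  const_stride_reduced (x, b);
  return a[0] == b[0] && a[1] == b[1];
}
```

In the constant-stride case the replacement needs only the difference of the two addends times the constant multiplier, which is known at compile time.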
Re: [patch] 19/n: trans-mem: compiler tree/gimple stuff
[rth, see below] Index: gcc/attribs.c === --- gcc/attribs.c (.../trunk) (revision 180744) +++ gcc/attribs.c (.../branches/transactional-memory) (revision 180773) @@ -166,7 +166,8 @@ init_attributes (void) gcc_assert (strcmp (attribute_tables[i][j].name, attribute_tables[i][k].name)); } - /* Check that no name occurs in more than one table. */ + /* Check that no name occurs in more than one table. Names that + begin with '*' are exempt, and may be overridden. */ for (i = 0; i< ARRAY_SIZE (attribute_tables); i++) { size_t j, k, l; @@ -174,8 +175,9 @@ init_attributes (void) for (j = i + 1; j< ARRAY_SIZE (attribute_tables); j++) for (k = 0; attribute_tables[i][k].name != NULL; k++) for (l = 0; attribute_tables[j][l].name != NULL; l++) - gcc_assert (strcmp (attribute_tables[i][k].name, - attribute_tables[j][l].name)); + gcc_assert (attribute_tables[i][k].name[0] == '*' + || strcmp (attribute_tables[i][k].name, + attribute_tables[j][l].name)); } #endif @@ -207,7 +209,7 @@ register_attribute (const struct attribu slot = htab_find_slot_with_hash (attribute_hash,&str, substring_hash (str.str, str.length), INSERT); - gcc_assert (!*slot); + gcc_assert (!*slot || attr->name[0] == '*'); *slot = (void *) CONST_CAST (struct attribute_spec *, attr); } The above changes seem to belong to a different changeset and look strange. Why would attributes ever appear in two different tables? I couldn't find a corresponding gcc-patches message for this patch, but I was able to hunt down full the patch, which I am attaching for discussion. This seems to be a change required for allowing '*' to override builtins, so it is indeed part of the branch. Perhaps with the full context it is easier to review. I will defer to rth to answer any questions on the original motivation. Richi, do you have any particular issue with the attribs.c change? Does this context resolve any questions you may have had? 
Aldy Index: ChangeLog.tm === --- ChangeLog.tm(revision 149303) +++ ChangeLog.tm(revision 149304) @@ -1,3 +1,17 @@ +2009-07-06 Richard Henderson + + * attribs.c (init_attributes): Allow '*' prefix for overrides. + (register_attribute): Likewise. + * builtin-attrs.def (ATTR_TM_REGPARM): New. + (ATTR_TM_NOTHROW_LIST, ATTR_TM_NORETURN_NOTHROW_LIST, + ATTR_TM_NOTHROW_NONNULL, ATTR_TM_CONST_NOTHROW_LIST, + ATTR_TM_PURE_NOTHROW_LIST): New. + * c-common.c (ignore_attribute): New. + (c_common_attribute_table): Add "*tm regparm". + + * config/i386/i386.c (ix86_handle_tm_regparm_attribute): New. + (ix86_attribute_table): Add "*tm regparm". + 2009-07-02 Richard Henderson * c-typeck.c (c_finish_tm_atomic): Use build_stmt. Index: attribs.c === --- attribs.c (revision 149303) +++ attribs.c (revision 149304) @@ -166,7 +166,8 @@ init_attributes (void) gcc_assert (strcmp (attribute_tables[i][j].name, attribute_tables[i][k].name)); } - /* Check that no name occurs in more than one table. */ + /* Check that no name occurs in more than one table. Names that + begin with '*' are exempt, and may be overridden. 
*/ for (i = 0; i < ARRAY_SIZE (attribute_tables); i++) { size_t j, k, l; @@ -174,8 +175,9 @@ init_attributes (void) for (j = i + 1; j < ARRAY_SIZE (attribute_tables); j++) for (k = 0; attribute_tables[i][k].name != NULL; k++) for (l = 0; attribute_tables[j][l].name != NULL; l++) - gcc_assert (strcmp (attribute_tables[i][k].name, - attribute_tables[j][l].name)); + gcc_assert (attribute_tables[i][k].name[0] == '*' + || strcmp (attribute_tables[i][k].name, + attribute_tables[j][l].name)); } #endif @@ -202,7 +204,7 @@ register_attribute (const struct attribu slot = htab_find_slot_with_hash (attribute_hash, &str, substring_hash (str.str, str.length), INSERT); - gcc_assert (!*slot); + gcc_assert (!*slot || attr->name[0] == '*'); *slot = (void *) CONST_CAST (struct attribute_spec *, attr); } Index: builtin-attrs.def === --- builtin-attrs.def (revision 149303) +++ builtin-attrs.def (revision 149304) @@ -94,6 +94,7 @@ DEF_ATTR_IDENT (ATTR_SENTINEL, "sentinel DEF_ATTR_IDENT (ATTR_STRFMON, "strfmon") DEF_ATTR_IDE
Re: [C++ PATCH] PR c++/45114 - Support alias templates
Jason Merrill writes: > On 10/27/2011 03:10 PM, Dodji Seketeli wrote: > > +/* Setter for the TYPE_DECL_ALIAS_P proprety above. */ > > +#define SET_TYPE_DECL_ALIAS_P(NODE, VAL) \ > > + (DECL_LANG_FLAG_6 (TYPE_DECL_CHECK (NODE)) = (VAL)) > > This seems unnecessary. Removed. > > > +#define TYPE_DECL_NAMES_ALIAS_TEMPLATE_P(NODE) \ > > + (TYPE_DECL_ALIAS_P (NODE)\ > > + && DECL_LANG_SPECIFIC (NODE) > > \ > > + && DECL_TI_TEMPLATE (NODE) \ > > + && same_type_p (TREE_TYPE (NODE), TREE_TYPE (DECL_TI_TEMPLATE (NODE > > I don't think same_type_p is the test you want here, as it ignores > typedefs. How about > > DECL_TEMPLATE_RESULT (DECL_TI_TEMPLATE (NODE)) == (NODE) Right. Changed. > ? > > > +#define TYPE_ALIAS_P(NODE) \ > > + (TYPE_P (NODE) \ > > + && DECL_LANG_SPECIFIC (TYPE_NAME (NODE))\ > > + && TYPE_DECL_ALIAS_P (TYPE_NAME (NODE))) > > Why check DECL_LANG_SPECIFIC? I removed the check. > > > + /*If T is a specialization of an alias template, then we don't > > + want to take this 'if' branch; we want to print it as if it > > + was a specialization of class template. */ > > I think we want to handle them specially within this if. Done. > > > - else if (same_type_p (t, TREE_TYPE (decl))) > > + else if (same_type_p (t, TREE_TYPE (decl)) > > + && /* If T is the type of an alias template then we > > +want to let dump_decl print it like an alias > > +template. */ > > + TYPE_DECL_NAMES_ALIAS_TEMPLATE_P (decl)) > > This change restricts the existing test to only apply to alias > templates. Removed. > > Also, I would think we would want to handle the uninstantiated alias > the same as instantiations. In the updated patch below, uninstantiated aliase types follow the same path as typedefs and are handled specifically by dump_decl, whereas alias instantiations are handled by the new dump_alias_template_specialization that knows how to handle class and non-class alias template instantiations. Is that bad? > > You need some tests for printing of aliases in error messages. 
Only > one of the current tests prints an alias: > > > /home/jason/gt/gcc/testsuite/g++.dg/cpp0x/alias-decl-1.C:5:26: error: > > partial specialization of alias template 'using AA0 = struct A0' > > This should have the template header. So here: > > > + if (DECL_ALIAS_TEMPLATE_P (TI_TEMPLATE (get_template_info > (type > > + { > > + error ("partial specialization of alias template %qD", > > +TYPE_NAME (type)); > > + return error_mark_node; > > + } > > We should pass the template to error, rather than the > instantiation. But when I try that I see that it prints > > template struct AA0 > > instead, so more fixing is needed. Right. I tried to add more tests for that, and fixed many little things here and there to get better printing. > > > + else if (DECL_ALIAS_TEMPLATE_P (t)) > > +{ > > + tree tmpl; > > + result = get_aliased_type (DECL_TEMPLATE_RESULT (t)); > > + tmpl = TI_TEMPLATE (get_template_info (result)); > > + /* If RESULT is just the naming of TMPL, return TMPL. */ > > + if (same_type_p (result, > > + TREE_TYPE (DECL_TEMPLATE_RESULT (tmpl > > + result = tmpl; > > +} > > What is this trying to achieve? When we pass in a template, sometimes > it returns a type and sometimes a template? That seems odd. This is gone now, as it was for stripping aliases and I removed it now, see below. > > > + else > > + /* Strip template aliases from TEMPLATE_DECL nodes, > > + similarly to what is done by > > + canonicalize_type_argument for types above. */ > > + val = strip_alias (val); > > I don't think this is right. Alias templates are never deduced, but > that doesn't seem to mean that they can't be used as template template > arguments. 
Both clang and EDG accept this testcase: > > template struct same; > template struct same {}; > > template using Ptr = T*; > template class T> struct A { > template using X = T; > }; > same::X,int*> s; I got confused by the fact that in the initial n2258 paper, the first test case of gcc/testsuite/g++.dg/cpp0x/alias-decl-0.C was meant to pass. I didn't realize that it changed in the final draft. And you are right that the example above ought to pass. Also that example made me realize that I needed to do a bit more to support non-class alias template instantiations as well. I have added more test cases. > > > + if (ctx == DECL_CONTEXT (t) > > + && (TREE_CODE (t) != TYPE_DECL > > + /* ... unless T is an alias declaration; in > > + which case our caller
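The testcase quoted above lost its angle brackets in the archive. What follows is a plausible reconstruction, not the verbatim original: the template parameter lists and the parameter names T and U are assumptions, and the static_assert is added here only to spell out what the declaration of s relies on.

```cpp
#include <type_traits>

// Primary template is left undefined; only same<T, T> is a complete type.
template <class T, class U> struct same;
template <class T> struct same<T, T> {};

// Alias template that names a non-class type.
template <class T> using Ptr = T*;

// The alias template Ptr is passed as a template template argument.
template <template <class> class T> struct A {
  template <class U> using X = T<U>;
};

// A<Ptr>::X<int> is Ptr<int>, i.e. int*, so same<...> is complete here.
same<A<Ptr>::X<int>, int*> s;

// Added for illustration: the alias chain resolves to int*.
static_assert(std::is_same<A<Ptr>::X<int>, int*>::value,
              "alias template used as a template template argument");
```

If alias templates were stripped from template template arguments, as the earlier version of the patch did, the declaration of s would presumably be rejected, which is the point of the review comment.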
[v3] update c++0x status table w.r.t emplace member of associate containers
PR libstdc++/44436 * doc/xml/manual/status_cxx200x.xml: Document emplace members are missing. committed to trunk Index: doc/xml/manual/status_cxx200x.xml === --- doc/xml/manual/status_cxx200x.xml (revision 180447) +++ doc/xml/manual/status_cxx200x.xml (working copy) @@ -1387,16 +1388,18 @@ particular release. + 23.2.4 Associative containers - Y - + Partial + Missing emplace members + 23.2.5 Unordered associative containers - Y - + Partial + Missing emplace members 23.3
[v3] fix dg-warning examples in docs
	* doc/xml/manual/test.xml: Fix dg-warning examples.

Committed to trunk

Index: doc/xml/manual/test.xml
===
--- doc/xml/manual/test.xml (revision 181010)
+++ doc/xml/manual/test.xml (working copy)
@@ -609,10 +609,10 @@
 // { dg-do compile }

 Example 2: Testing for expected warnings on line 36, which all targets fail
-// { dg-warning "string literals" "" { xfail *-*-* } 36
+// { dg-warning "string literals" "" { xfail *-*-* } 36 }

 Example 3: Testing for expected warnings on line 36
-// { dg-warning "string literals" "" { target *-*-* } 36
+// { dg-warning "string literals" "" { target *-*-* } 36 }

 Example 4: Testing for compilation errors on line 41
 // { dg-do compile }
[PATCH] More improvements to sparc VIS vec_init code generation.
Eric, the testsuite target tests for vis2 and vis3 capable hardware work
well in my own testing, but if you find some problem with how it's done
just let me know and I'll try to fix it up.

I'm almost 100% satisfied with the code generation for vec_init now.

The one remaining case where I think we can do better is initializing a
V8QImode vector using bshuffle with more than 4 unique inputs. I've been
trying to come up with a trick to use fpmerge to get the last few bytes
into place for the bshuffle, but it's a bit of a challenge because the
bytes don't propagate into the destination in a convenient way. In
particular it can't be done without fighting the natural move coalescing
from the RTL optimizers that cleans up these RTL expansions.

The vector_init_bshuffle() code tries to rely upon moves as much as
possible to put the bshuffle inputs into place, because such moves are
typically completely optimized away by the compiler. So something like:

    __v4hi foo(short a, short b, short c, short d)
    {
      __v4hi x = { a, b, c, d };
      return x;
    }

generates:

    foo:
        movwtos  %o0, %f2
        movwtos  %o1, %f3
        movwtos  %o2, %f4
        movwtos  %o3, %f5
        sethi    %hi(bmask_val), %g1
        or       %g1, %lo(bmask_val), %g1
        bmask    %g1, %g0, %g1
        retl
        bshuffle %f2, %f4, %f0

for VIS3.

For cases where we only load part of a register input to the bshuffle
instruction, and the rest of the register is "don't care", we have a
preceding clobber emitted so that the compiler doesn't try to zero
initialize the uninitialized bits.

Support for the short floating point loads starts to show up here as
well, and I intend to flesh these out, support the short float stores,
and add VIS intrinsic access to them.

Richard, is there a better way to represent this in RTL? These
instructions basically load a single byte or half-word into the bottom
of a 64-bit float register, and clear the rest of that register with
zeros.
So the v4hi one is essentially loading the vector:

    [(const_int 0) (const_int 0) (const_int 0) (mem:HI (register:P ...))]

into the destination 64-bit float reg. For now I'm simply using an
unspec.

Committed to trunk.

gcc/

	* config/sparc/sparc.md (UNSPEC_SHORT_LOAD): New unspec.
	(zero_extend_v8qi_vis, zero_extend_v4hi_vis): New expanders.
	(*zero_extend_v8qi__insn, *zero_extend_v4hi__insn): New insns.
	* config/sparc/sparc.c (vector_init_move_words)
	(vector_init_prepare_elts, sparc_expand_vector_init_vis2,
	sparc_expand_vector_init_vis1): New functions.
	(vector_init_bshuffle): Rewrite to handle more cases and make use
	of locs[] array prepared by vector_init_prepare_elts.
	(vector_init_fpmerge, vector_init_faligndata): Delete.
	(sparc_expand_vector_init): Rewrite using new infrastructure.

gcc/testsuite/

	* lib/target-supports.exp
	(check_effective_target_ultrasparc_vis2_hw): New proc.
	(check_effective_target_ultrasparc_vis3_hw): New proc.
	* gcc.target/sparc/vec-init-1.inc: New vector init common code.
	* gcc.target/sparc/vec-init-2.inc: Likewise.
	* gcc.target/sparc/vec-init-3.inc: Likewise.
	* gcc.target/sparc/vec-init-1-vis1.c: New test.
	* gcc.target/sparc/vec-init-1-vis2.c: New test.
	* gcc.target/sparc/vec-init-1-vis3.c: New test.
	* gcc.target/sparc/vec-init-2-vis1.c: New test.
	* gcc.target/sparc/vec-init-2-vis2.c: New test.
	* gcc.target/sparc/vec-init-2-vis3.c: New test.
	* gcc.target/sparc/vec-init-3-vis1.c: New test.
	* gcc.target/sparc/vec-init-3-vis2.c: New test.
	* gcc.target/sparc/vec-init-3-vis3.c: New test.
--- gcc/ChangeLog| 16 +- gcc/config/sparc/sparc.c | 419 +- gcc/config/sparc/sparc.md| 43 +++ gcc/testsuite/ChangeLog | 18 + gcc/testsuite/gcc.target/sparc/vec-init-1-vis1.c |5 + gcc/testsuite/gcc.target/sparc/vec-init-1-vis2.c |5 + gcc/testsuite/gcc.target/sparc/vec-init-1-vis3.c |5 + gcc/testsuite/gcc.target/sparc/vec-init-1.inc| 85 + gcc/testsuite/gcc.target/sparc/vec-init-2-vis1.c |5 + gcc/testsuite/gcc.target/sparc/vec-init-2-vis2.c |5 + gcc/testsuite/gcc.target/sparc/vec-init-2-vis3.c |5 + gcc/testsuite/gcc.target/sparc/vec-init-2.inc| 94 + gcc/testsuite/gcc.target/sparc/vec-init-3-vis1.c |5 + gcc/testsuite/gcc.target/sparc/vec-init-3-vis2.c |5 + gcc/testsuite/gcc.target/sparc/vec-init-3-vis3.c |5 + gcc/testsuite/gcc.target/sparc/vec-init-3.inc| 105 ++ gcc/testsuite/lib/target-supports.exp| 18 + 17 files changed, 743 insertions(+), 100 deletions(-) create mode 100644 gcc/testsuite/gcc.target/sparc/vec-init-1-vis1.c create mode 100644 gcc/testsuite/gcc.target/sparc/vec-init-1-vis2.c create mode 100644 gcc/testsuit
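For readers who want to try the vec_init case from the post on their own target, here is a minimal, target-independent sketch using the GCC/Clang vector extension. The typedef name v4hi and the helper names make_v4hi and lane are assumptions made for this illustration (the post uses __v4hi without showing its definition); nothing here is taken from the new vec-init-*.inc tests.

```cpp
// Four 16-bit lanes in one 64-bit vector, via the GCC/Clang
// vector_size attribute. The name v4hi is chosen for this sketch.
typedef short v4hi __attribute__((vector_size(8)));

// Building a vector from four scalars is what the backend's vec_init
// expander has to turn into moves plus a shuffle (bshuffle on VIS2/VIS3).
v4hi make_v4hi(short a, short b, short c, short d) {
  v4hi x = { a, b, c, d };
  return x;
}

// Read one lane back; vector subscripting is part of the same extension.
short lane(v4hi v, int i) {
  return v[i];
}
```

Compiling make_v4hi with -O2 -S on a SPARC target with VIS enabled should show the kind of sequence quoted in the post, although the exact instructions depend on the compiler version and the VIS level selected.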
Re: [patch] 19/n: trans-mem: compiler tree/gimple stuff
Well - we usually don't grab bits off the tree nodes lightly. Especially if the cgraph seems to be more fit. If this is a suggestion, I can put it on my laundry list of future things todo (after merge, 4.8?, etc). There are not many consumers of the flag, so fixing it shouldn't be hard. For 4.7 definitely. Fair enough. The following patch puts the bit in the cgraph structure. There was a comment originally that we may be able to calculate this bit from the CFG, but I'm not sure whether this applies any more, or how much work it would be. I left the comment in. Tested on x86-64 Linux. OK for branch? * cgraph.c (dump_cgraph_node): Handle tm_clone. * cgraph.h (struct cgraph_node): Add tm_clone field. (decl_is_tm_clone): New. * tree.h (DECL_IS_TM_CLONE): Remove. * trans-mem.c (execute_lower_tm): Rename DECL_IS_TM_CLONE to decl_is_tm_clone. (gate_tm_init): Same. (ipa_tm_create_version_alias): Set tm_clone. (ipa_tm_create_version): Same. (ipa_tm_transform_calls_redirect): Rename DECL_IS_TM_CLONE to decl_is_tm_clone. * calls.c (is_tm_builtin): Same. * tree-cfg.c (dump_function_to_file): Same. * print-tree.c (print_node): Same. * gimple-pretty-print.c (dump_gimple_call): Same. Index: cgraph.c === --- cgraph.c(revision 181017) +++ cgraph.c(working copy) @@ -1840,6 +1840,8 @@ dump_cgraph_node (FILE *f, struct cgraph fprintf (f, " only_called_at_exit"); else if (node->alias) fprintf (f, " alias"); + if (node->tm_clone) +fprintf (f, " tm_clone"); fprintf (f, "\n"); Index: cgraph.h === --- cgraph.h(revision 181017) +++ cgraph.h(working copy) @@ -248,6 +248,11 @@ struct GTY((chain_next ("%h.next"), chai unsigned only_called_at_startup : 1; /* True when function can only be called at startup (from static dtor). */ unsigned only_called_at_exit : 1; + /* True when function is the transactional clone of a function which + is called only from inside transactions. */ + /* ?? We should be able to remove this. We have enough bits in + cgraph to calculate it. 
*/ + unsigned tm_clone : 1; }; typedef struct cgraph_node *cgraph_node_ptr; @@ -1087,4 +1092,14 @@ cgraph_edge_recursive_p (struct cgraph_e else return e->caller->decl == callee->decl; } + +/* Return true if the TM_CLONE bit is set for a given FNDECL. */ +static inline bool +decl_is_tm_clone (const_tree fndecl) +{ + struct cgraph_node *n = cgraph_get_node (fndecl); + if (n) +return n->tm_clone; + return false; +} #endif /* GCC_CGRAPH_H */ Index: tree.h === --- tree.h (revision 181017) +++ tree.h (working copy) @@ -3466,11 +3466,6 @@ struct GTY(()) #define DECL_NO_INLINE_WARNING_P(NODE) \ (FUNCTION_DECL_CHECK (NODE)->function_decl.no_inline_warning_flag) -/* Nonzero in a FUNCTION_DECL means this function is the transactional - clone of a function - called only from inside transactions. */ -#define DECL_IS_TM_CLONE(NODE) \ - (FUNCTION_DECL_CHECK (NODE)->function_decl.tm_clone_flag) - /* Nonzero if a FUNCTION_CODE is a TM load/store. */ #define BUILTIN_TM_LOAD_STORE_P(FN) \ ((FN) >= BUILT_IN_TM_STORE_1 && (FN) <= BUILT_IN_TM_LOAD_RFW_LDOUBLE) Index: gimple-pretty-print.c === --- gimple-pretty-print.c (revision 181017) +++ gimple-pretty-print.c (working copy) @@ -701,7 +701,7 @@ dump_gimple_call (pretty_printer *buffer /* Dump the arguments of _ITM_beginTransaction sanely. */ if (TREE_CODE (fn) == ADDR_EXPR) fn = TREE_OPERAND (fn, 0); - if (TREE_CODE (fn) == FUNCTION_DECL && DECL_IS_TM_CLONE (fn)) + if (TREE_CODE (fn) == FUNCTION_DECL && decl_is_tm_clone (fn)) pp_string (buffer, " [tm-clone]"); if (TREE_CODE (fn) == FUNCTION_DECL && DECL_BUILT_IN_CLASS (fn) == BUILT_IN_NORMAL Index: trans-mem.c === --- trans-mem.c (revision 181017) +++ trans-mem.c (working copy) @@ -1683,7 +1683,7 @@ execute_lower_tm (void) struct walk_stmt_info wi; /* Transactional clones aren't created until a later pass. 
*/ - gcc_assert (!DECL_IS_TM_CLONE (current_function_decl)); + gcc_assert (!decl_is_tm_clone (current_function_decl)); memset (&wi, 0, sizeof (wi)); walk_gimple_seq (gimple_body (current_function_decl), @@ -1901,7 +1901,7 @@ gate_tm_init (void) bitmap_obstack_initialize (&tm_obstack); /* If the function is a TM_CLONE, then the entire function is the region. */ - if (DECL_IS_TM_CLONE (current_function_decl)) + if (decl_is_tm_clone (current_function_decl)) { struct tm_region *region = (struct tm_region *) obstack_alloc (&tm_obstack.obstack,
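The patch above moves the flag from the decl to the call-graph node, so the query now has to tolerate decls that have no node at all. A self-contained sketch of that pattern follows. Decl, Node, node_table, get_node and mark_tm_clone are hypothetical stand-ins invented for this illustration; they are not GCC's FUNCTION_DECL, cgraph_node or cgraph_get_node, and only the shape of decl_is_tm_clone follows the patch.

```cpp
#include <unordered_map>

struct Decl { const char *name; };       // stand-in for a FUNCTION_DECL
struct Node { unsigned tm_clone : 1; };  // stand-in for cgraph_node

// Hypothetical side table playing the role of the call graph.
std::unordered_map<const Decl *, Node> node_table;

// Stand-in for cgraph_get_node: null when the decl has no node.
Node *get_node(const Decl *d) {
  auto it = node_table.find(d);
  return it == node_table.end() ? nullptr : &it->second;
}

// Same shape as decl_is_tm_clone in the patch: no node means "not a clone".
bool decl_is_tm_clone(const Decl *d) {
  Node *n = get_node(d);
  return n ? n->tm_clone : false;
}

// Creates the node on demand and sets the bit.
void mark_tm_clone(const Decl *d) {
  node_table[d].tm_clone = 1;
}
```

The trade-off is the one discussed in the thread: the tree node keeps its bit free, at the cost of a lookup on every query and a default answer of false for decls that were never entered into the call graph.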
Re: [patch] 3/n: trans-mem: runtime
On 11/05/11 13:40, Gerald Pfeifer wrote: On Thu, 3 Nov 2011, Aldy Hernandez wrote: --- libitm/method-wbetl.cc (.../trunk) (revision 0) +++ libitm/method-wbetl.cc (.../branches/transactional-memory) (revision 180773) @@ -0,0 +1,628 @@ +/* Copyright (C) 2009 Free Software Foundation, Inc. + Contributed by Richard Henderson. I believe this should become "2009, 2011" or "2009, 2010, 2011" when it's applied to trunk. Gerald I assume the same thing goes for the rest of similar files. Committed to branch. * method-wbetl.cc: Update copyright notice. * aatree.cc: Same. * util.cc: Same. * libitm.h: Same. * memset.cc: Same. * eh_cpp.cc: Same. * barrier.tpl: Same. * useraction.cc: Same. * stmlock.h: Same. * memcpy.cc: Same. * common.h: Same. * config/generic/tls.cc: Same. * config/generic/cacheline.h: Same. * config/generic/cachepage.h: Same. * config/generic/cacheline.cc: Same. * config/generic/unaligned.h: Same. * config/x86/cacheline.h: Same. * config/x86/cacheline.cc: Same. * config/x86/unaligned.h: Same. * config/alpha/cacheline.h: Same. * config/alpha/unaligned.h: Same. * config/alpha/sjlj.S: Same. * config/posix/cachepage.cc: Same. * config/linux/futex.h: Same. * config/linux/alpha/futex_bits.h: Same. Index: method-wbetl.cc === --- method-wbetl.cc (revision 181013) +++ method-wbetl.cc (working copy) @@ -1,4 +1,4 @@ -/* Copyright (C) 2009 Free Software Foundation, Inc. +/* Copyright (C) 2009, 2011 Free Software Foundation, Inc. Contributed by Richard Henderson . This file is part of the GNU Transactional Memory Library (libitm). Index: aatree.cc === --- aatree.cc (revision 181013) +++ aatree.cc (working copy) @@ -1,4 +1,4 @@ -/* Copyright (C) 2009 Free Software Foundation, Inc. +/* Copyright (C) 2009, 2011 Free Software Foundation, Inc. Contributed by Richard Henderson . This file is part of the GNU Transactional Memory Library (libitm). 
Index: util.cc === --- util.cc (revision 181013) +++ util.cc (working copy) @@ -1,4 +1,4 @@ -/* Copyright (C) 2009 Free Software Foundation, Inc. +/* Copyright (C) 2009, 2011 Free Software Foundation, Inc. Contributed by Richard Henderson . This file is part of the GNU Transactional Memory Library (libitm). Index: libitm.h === --- libitm.h(revision 181013) +++ libitm.h(working copy) @@ -1,4 +1,4 @@ -/* Copyright (C) 2008, 2009 Free Software Foundation, Inc. +/* Copyright (C) 2008, 2009, 2011 Free Software Foundation, Inc. Contributed by Richard Henderson . This file is part of the GNU Transactional Memory Library (libitm). Index: memset.cc === --- memset.cc (revision 181013) +++ memset.cc (working copy) @@ -1,4 +1,4 @@ -/* Copyright (C) 2008, 2009 Free Software Foundation, Inc. +/* Copyright (C) 2008, 2009, 2011 Free Software Foundation, Inc. Contributed by Richard Henderson . This file is part of the GNU Transactional Memory Library (libitm). Index: eh_cpp.cc === --- eh_cpp.cc (revision 181013) +++ eh_cpp.cc (working copy) @@ -1,4 +1,4 @@ -/* Copyright (C) 2009 Free Software Foundation, Inc. +/* Copyright (C) 2009, 2011 Free Software Foundation, Inc. Contributed by Richard Henderson . This file is part of the GNU Transactional Memory Library (libitm). Index: barrier.tpl === --- barrier.tpl (revision 181013) +++ barrier.tpl (working copy) @@ -1,5 +1,5 @@ /* -*- c++ -*- */ -/* Copyright (C) 2008, 2009 Free Software Foundation, Inc. +/* Copyright (C) 2008, 2009, 2011 Free Software Foundation, Inc. Contributed by Richard Henderson . This file is part of the GNU Transactional Memory Library (libitm). Index: useraction.cc === --- useraction.cc (revision 181013) +++ useraction.cc (working copy) @@ -1,4 +1,4 @@ -/* Copyright (C) 2008, 2009 Free Software Foundation, Inc. +/* Copyright (C) 2008, 2009, 2011 Free Software Foundation, Inc. Contributed by Richard Henderson . This file is part of the GNU Transactional Memory Library (libitm). 
Index: stmlock.h === --- stmlock.h (revision 181013) +++ stmlock.h (working copy) @@ -1,4 +1,4 @@ -/* Copyright (C) 2009 Free Software Foundation, Inc. +/* Copyright (C) 2009, 2011 Free Software Foundation, Inc. Contributed by Richar
Re: [C++ PATCH] PR c++/45114 - Support alias templates
On 11/05/2011 07:36 PM, Dodji Seketeli wrote:

> +#define TYPE_DECL_NAMES_ALIAS_TEMPLATE_P(NODE) \

This doesn't seem to be needed anymore.

> +dump_alias_template_specialization (tree t, int flags)
> +{
> +  gcc_assert (alias_template_specialization_p (t));
> +
> +  if (CLASS_TYPE_P (t))
> +    dump_aggr_type (t, flags);
> +  else
> +    {
> +      tree name;
> +      name = TYPE_IDENTIFIER (t);
> +      pp_cxx_tree_identifier (cxx_pp, name);
> +      dump_template_parms (TYPE_TEMPLATE_INFO (t),
> +                           /*primary=*/false,
> +                           flags & ~TFF_TEMPLATE_HEADER);
> +    }

Why do you treat class and non-class aliases differently? In both cases
I think we want alias specializations to be printed as scope::name. We
don't want to print 'class' since such a specialization cannot be used
in an elaborated-type-specifier.

7.1.6.3/2: "If the identifier resolves to a typedef-name or the
simple-template-id resolves to an alias template specialization, the
elaborated-type-specifier is ill-formed."

> +  if (alias_template_specialization_p (t))
> +    {
> +      dump_alias_template_specialization (t, flags);
> +      return;
> +    }
> +  else if ((flags & TFF_CHASE_TYPEDEF)
> +           || DECL_SELF_REFERENCE_P (decl)
> +           || (!flag_pretty_templates
> +               && DECL_LANG_SPECIFIC (decl) && DECL_TEMPLATE_INFO (decl)))
>      t = strip_typedefs (t);

The order of these two should be reversed. We want TFF_CHASE_TYPEDEF and
-fno-pretty-templates to strip alias-templates as well as non-template
typedefs.

> -  /* If the next keyword is `namespace', we have a
> +  /* If the next keyword is `namespace', we have either a
>       namespace-alias-definition. */

This change seems unintended.

> +  if (!(type_decl != NULL_TREE
> +        && TREE_CODE (type_decl) == TYPE_DECL
> +        && TYPE_DECL_ALIAS_P (type_decl)
> +        && DECL_TEMPLATE_INSTANTIATION (type_decl)))
> +    cp_parser_simulate_error (parser);

I think the TYPE_DECL_ALIAS_P and DECL_TEMPLATE_INSTANTIATION checks
should be an assert instead; at this point any TYPE_DECL we get should
satisfy those.
> -      || (TYPE_P (t) && TYPE_DECL_ALIAS_P (TYPE_NAME (t)))
> +      || (TYPE_P (t)
> +          && TYPE_NAME (t)
> +          && TREE_CODE (TYPE_NAME (t)) == TYPE_DECL
> +          && TYPE_DECL_ALIAS_P (TYPE_NAME (t)))

In C++ I think a non-null TYPE_NAME is always a TYPE_DECL.

> > Why not set r here, as for the other cases?
>
> Because I'd like to handle alias declarations even for cases handled
> by the other cases where r ends up being NULL.

Hmm. With this code we end up substituting into a non-template alias
declaration at file scope. I think if you check
alias_template_specialization_p before checking for class/function scope
that should handle the cases you need.

Jason
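The rule from 7.1.6.3/2 quoted in the review is why printing 'class' or 'struct' in front of an alias template specialization would be misleading. A small sketch, with the names S and Alias invented for this illustration:

```cpp
#include <type_traits>

template <class T> struct S {};
template <class T> using Alias = S<T>;

S<int> a;          // OK: simple-template-id naming the class template
struct S<int> b;   // OK: elaborated-type-specifier naming the class template
Alias<int> c;      // OK: alias template specialization used as a type name

// Ill-formed per the quoted paragraph, so it is left commented out:
// struct Alias<int> d;

// Alias<int> and S<int> are the same type; only the spelling differs.
static_assert(std::is_same<Alias<int>, S<int>>::value,
              "an alias specialization names the aliased type");
```

A diagnostic that printed the alias as 'struct Alias<int>' would therefore show a spelling the user could not write back into the program.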