Re: Fix ipa-devirt ICE
On Thu, 3 Apr 2014, Jan Hubicka wrote:

+ /* Use OTR_TOKEN = INT_MAX as a marker of probably type inconsistent
+ /* Use OTR_TOKEN = INT_MAX as a marker of probably type inconsistent
+ OTR_TOKEN == INT_MAX is used to mark calls that are provably

Did you mean "provably" instead of "probably" in the first two?

-- 
Marc Glisse
[Ping][Patch]Simplify SUBREG with operand whose target bits are cleared by AND operation
Hello Eric, Would you please review my patch at http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01582.html? Thanks. BR, Terry
Re: RFA: PATCH to add -fno-gnu-unique for c++/60731
On Wed, Apr 2, 2014 at 9:24 PM, Jason Merrill wrote:
> Use of STB_GNU_UNIQUE to avoid problems with variable symbols shared between
> two RTLD_LOCAL plugins and a common library dependency causes problems with
> libraries that depend on dlclose/dlopen to reinitialize state. This patch
> adds a -fno-gnu-unique flag that such libraries can use.
>
> Tested x86_64-pc-linux-gnu. OK for trunk?

Ok. Can you add a testcase as well please?

Thanks,
Richard.
Re: [PATCH][1/3] Fix PR54733 Optimize endian independent load/store
"Thomas Preud'homme" writes: > +# Return 1 if the target supports byte swap instructions. > + > +proc check_effective_target_bswap { } { > +global et_bswap_saved > + > +if [info exists et_bswap_saved] { > +verbose "check_effective_target_bswap: using cached result" 2 > +} else { > + set et_bswap_saved 0 > + if { [istarget aarch64-*-*] > + || [istarget alpha*-*-*] > + || [istarget arm*-*-*] > + || [istarget i?86-*-*] > + || [istarget powerpc*-*-*] > + || [istarget rs6000-*-*] > + || [istarget s390*-*-*] > + || [istarget x86_64-*-*] } { Please add m68k-*-*. Andreas. -- Andreas Schwab, SUSE Labs, sch...@suse.de GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7 "And now for something completely different."
[PATCH][LTO] Reduce WPA memory usage
This reduces WPA memory usage at stream-out time by not allocating
the streamer cache node array and by freeing the global
out-decl-states hash tables (we already do that for the
fn-decl-states).

LTO bootstrapped and bootstrapped on x86_64-unknown-linux-gnu, testing
in progress.

Ok?

Not sure if it will make a notable difference. The pointer-map
overhead is at least 4 times that of the vector; if we make
pointer-map behave like hash_table (3/4 full) or htab_t (half full)
then that would improve (we could even make that configurable at
pointer-set/map construction time). Like with

Index: gcc/pointer-set.c
===================================================================
--- gcc/pointer-set.c (revision 209018)
+++ gcc/pointer-set.c (working copy)
@@ -125,7 +125,7 @@ pointer_set_insert (struct pointer_set_t
   /* For simplicity, expand the set even if P is already there. This can be
      superfluous but can happen at most once. */
-  if (pset->n_elements > pset->n_slots / 4)
+  if (pset->n_elements * 4 > pset->n_slots * 3)
     {
       size_t old_n_slots = pset->n_slots;
       const void **old_slots = pset->slots;
Index: gcc/pointer-set.h
===================================================================
--- gcc/pointer-set.h (revision 209018)
+++ gcc/pointer-set.h (working copy)
@@ -109,7 +109,7 @@ pointer_map::insert (const void *p, b
   /* For simplicity, expand the map even if P is already there. This can be
      superfluous but can happen at most once. */
   /* ??? Fugly that we have to inline that here. */
-  if (n_elements > n_slots / 4)
+  if (n_elements * 4 > n_slots * 3)
     {
       size_t old_n_slots = n_slots;
       const void **old_keys = slots;

might be worth checking how much memory we save from the above.

Thanks,
Richard.

2014-04-03 Richard Biener

 * tree-streamer.h (struct streamer_tree_cache_d): Add next_idx
 member.
 (streamer_tree_cache_create): Adjust.
 * tree-streamer.c (streamer_tree_cache_add_to_node_array): Adjust
 to allow optional nodes array.
 (streamer_tree_cache_insert_1): Use next_idx to assign idx.
 (streamer_tree_cache_append): Likewise.
(streamer_tree_cache_create): Create nodes array optionally as specified by parameter. * lto-streamer-out.c (create_output_block): Avoid maintaining the node array in the writer cache. (DFS_write_tree): Remove assertion. (produce_asm_for_decls): Free the out decl state hash table early. * lto-streamer-in.c (lto_data_in_create): Adjust for streamer_tree_cache_create prototype change. Index: gcc/tree-streamer.c === *** gcc/tree-streamer.c (revision 209018) --- gcc/tree-streamer.c (working copy) *** static void *** 101,120 streamer_tree_cache_add_to_node_array (struct streamer_tree_cache_d *cache, unsigned ix, tree t, hashval_t hash) { ! /* Make sure we're either replacing an old element or ! appending consecutively. */ ! gcc_assert (ix <= cache->nodes.length ()); ! ! if (ix == cache->nodes.length ()) { ! cache->nodes.safe_push (t); ! if (cache->hashes.exists ()) ! cache->hashes.safe_push (hash); } ! else { ! cache->nodes[ix] = t; ! if (cache->hashes.exists ()) cache->hashes[ix] = hash; } } --- 101,119 streamer_tree_cache_add_to_node_array (struct streamer_tree_cache_d *cache, unsigned ix, tree t, hashval_t hash) { ! /* We're either replacing an old element or appending consecutively. */ ! if (cache->nodes.exists ()) { ! if (cache->nodes.length () == ix) ! cache->nodes.safe_push (t); ! else ! cache->nodes[ix] = t; } ! if (cache->hashes.exists ()) { ! if (cache->hashes.length () == ix) ! cache->hashes.safe_push (hash); ! else cache->hashes[ix] = hash; } } *** streamer_tree_cache_insert_1 (struct str *** 146,152 { /* Determine the next slot to use in the cache. */ if (insert_at_next_slot_p) ! ix = cache->nodes.length (); else ix = *ix_p; *slot = ix; --- 145,151 { /* Determine the next slot to use in the cache. */ if (insert_at_next_slot_p) ! ix = cache->next_idx++; else ix = *ix_p; *slot = ix; *** void *** 211,217 streamer_tree_cache_append (struct streamer_tree_cache_d *cache, tree t, hashval_t hash) { ! 
unsigned ix = cache->nodes.length (); if (!cache->node_map) streamer_tree_cache_add_to_node_array (cache, ix, t, hash); else --- 210,216 streamer_tree_cache_append (struct streamer_tree_cache_d *cache, tree t, hashval_t hash)
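As a rough illustration of the pointer-set/pointer-map growth test discussed in this message, the sketch below counts how many slots a doubling table ends up with under the current test (more than 1/4 full) and the suggested one (more than 3/4 full). It is a toy model, not GCC code: the function name slots_after_inserts, the starting size of 256 slots used in the note below, and the assumption that every insert adds a distinct key are all made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a doubling table (hypothetical helper, not from GCC).
   Before each insert the table doubles when it is more than NUM/DEN
   full, i.e. when n_elements * den > n_slots * num.  With a
   power-of-two slot count, NUM/DEN = 1/4 matches the existing test
   "n_elements > n_slots / 4" and 3/4 matches the proposed
   "n_elements * 4 > n_slots * 3".  */
size_t
slots_after_inserts (size_t n_inserts, size_t initial_slots,
                     size_t num, size_t den)
{
  size_t n_slots = initial_slots;
  size_t n_elements = 0;

  assert (initial_slots > 0 && den > 0);

  for (size_t i = 0; i < n_inserts; i++)
    {
      /* Same order as the quoted code: grow first, then insert.  */
      if (n_elements * den > n_slots * num)
        n_slots *= 2;
      n_elements++;
    }
  return n_slots;
}
```

In this model, 1000 distinct keys starting from 256 slots end at 4096 slots with the 1/4 test and at 2048 slots with the 3/4 test. The real saving would still need to be measured, as the message says.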
[PATCH] Fix PR60740
The following fixes the graphite ICE that results from
stmt_simple_for_scop_p not walking all GIMPLE_COND operands but
only SSA name ones.

Bootstrap and regtest in progress on x86_64-unknown-linux-gnu.

Richard.

2014-04-03 Richard Biener

 PR tree-optimization/60740
 * graphite-scop-detection.c (stmt_simple_for_scop_p): Iterate
 over all GIMPLE_COND operands.

 * gcc.dg/graphite/pr60740.c: New testcase.

Index: gcc/graphite-scop-detection.c
===================================================================
*** gcc/graphite-scop-detection.c (revision 209018)
--- gcc/graphite-scop-detection.c (working copy)
*************** stmt_simple_for_scop_p (basic_block scop
*** 346,358 ****
      case GIMPLE_COND:
        {
-         tree op;
-         ssa_op_iter op_iter;
-         enum tree_code code = gimple_cond_code (stmt);
- 
          /* We can handle all binary comparisons. Inequalities are
             also supported as they can be represented with union of
             polyhedra. */
          if (!(code == LT_EXPR
                || code == GT_EXPR
                || code == LE_EXPR
--- 346,355 ----
      case GIMPLE_COND:
        {
          /* We can handle all binary comparisons. Inequalities are
             also supported as they can be represented with union of
             polyhedra. */
+         enum tree_code code = gimple_cond_code (stmt);
          if (!(code == LT_EXPR
                || code == GT_EXPR
                || code == LE_EXPR
*************** stmt_simple_for_scop_p (basic_block scop
*** 361,371 ****
                || code == NE_EXPR))
            return false;
  
!         FOR_EACH_SSA_TREE_OPERAND (op, stmt, op_iter, SSA_OP_ALL_USES)
!           if (!graphite_can_represent_expr (scop_entry, loop, op)
!               /* We can not handle REAL_TYPE. Failed for pr39260. */
!               || TREE_CODE (TREE_TYPE (op)) == REAL_TYPE)
!             return false;
  
          return true;
        }
--- 358,371 ----
                || code == NE_EXPR))
            return false;
  
!         for (unsigned i = 0; i < 2; ++i)
!           {
!             tree op = gimple_op (stmt, i);
!             if (!graphite_can_represent_expr (scop_entry, loop, op)
!                 /* We can not handle REAL_TYPE. Failed for pr39260. */
!                 || TREE_CODE (TREE_TYPE (op)) == REAL_TYPE)
!               return false;
!           }
  
          return true;
        }
Index: gcc/testsuite/gcc.dg/graphite/pr60740.c
===================================================================
*** gcc/testsuite/gcc.dg/graphite/pr60740.c (revision 0)
--- gcc/testsuite/gcc.dg/graphite/pr60740.c (working copy)
***************
*** 0 ****
--- 1,16 ----
+ /* { dg-options "-O2 -floop-interchange" } */
+ 
+ int **db6 = 0;
+ 
+ void
+ k26(void)
+ {
+   static int geb = 0;
+   int *a22 = &geb;
+   int **l30 = &a22;
+   int *c4b;
+   int ndf;
+   for (ndf = 0; ndf <= 1; ++ndf)
+     *c4b = (db6 == l30) && (*a22)--;
+ }
+ 
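To show why walking only the SSA name operands can let an unsupported condition through, here is a small toy model. It does not use GCC's data structures: struct toy_operand and both checker functions are hypothetical, and the "representable" flag merely stands in for whatever a graphite_can_represent_expr-style test would answer for that operand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one operand of a two-operand condition.  */
struct toy_operand
{
  bool is_ssa_name;    /* false for e.g. an address or a constant  */
  bool representable;  /* verdict of the representability test  */
};

/* Toy version of the old check: only SSA-name operands are examined,
   so any other kind of operand is accepted without being tested.  */
bool
cond_ok_ssa_only (const struct toy_operand *ops)
{
  assert (ops != NULL);
  for (size_t i = 0; i < 2; i++)
    if (ops[i].is_ssa_name && !ops[i].representable)
      return false;
  return true;
}

/* Toy version of the new check: both operands are examined,
   whatever their kind.  */
bool
cond_ok_all_operands (const struct toy_operand *ops)
{
  assert (ops != NULL);
  for (size_t i = 0; i < 2; i++)
    if (!ops[i].representable)
      return false;
  return true;
}
```

With one representable SSA name and one non-representable non-SSA operand, the first function accepts the condition and the second rejects it. As far as this model goes, that is the difference the patch makes.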
[PATCH, ARM] Fix PR60609 (Error: value of 256 too large for field of 1 bytes)
Hi This bug causes the compiler to create a Thumb-2 TBB instruction with a jump table containing an out of range value in a .byte field: whatever.s:148: Error: value of 256 too large for field of 1 bytes at 100 This occurs because the jump table is followed with a ".align 1" due to ASM_OUTPUT_CASE_END, but the 'shorten' phase does not account for the space taken by this align directive. This patch addresses the issue by removing ASM_OUTPUT_CASE_END from arm.h, and ensuring that the alignment after an ADDR_DIFF_VEC is instead inserted by aligning the label following the barrier which follows it. This is achieved by defining LABEL_ALIGN_AFTER_BARRIER appropriately. Bootstrapped/checked on arm-unknown-linux-gnueabihf. OK for trunk, and backporting to 4.8? 2014-04-02 Charles Baylis PR target/60609 * config/arm/arm.h (ASM_OUTPUT_CASE_END) Remove. (LABEL_ALIGN_AFTER_BARRIER) Align barriers which occur after ADDR_DIFF_VEC. 2014-04-02 Charles Baylis PR target/60609 * g++.dg/torture/pr60609.C: New test. From 9b0c1ada23e2b210b02ebaee2f599bb5205a91d6 Mon Sep 17 00:00:00 2001 From: Charles Baylis Date: Thu, 3 Apr 2014 10:57:33 +0100 Subject: [PATCH] fix for PR target/60609 2014-04-02 Charles Baylis PR target/60609 * config/arm/arm.h (ASM_OUTPUT_CASE_END) Remove. (LABEL_ALIGN_AFTER_BARRIER) Align barriers which occur after ADDR_DIFF_VEC. 2014-04-02 Charles Baylis PR target/60609 * g++.dg/torture/pr60609.C: New test. --- gcc/config/arm/arm.h | 11 +- gcc/testsuite/g++.dg/torture/pr60609.C | 252 + 2 files changed, 255 insertions(+), 8 deletions(-) create mode 100644 gcc/testsuite/g++.dg/torture/pr60609.C diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h index 7ca47a7..a4bbd12 100644 --- a/gcc/config/arm/arm.h +++ b/gcc/config/arm/arm.h @@ -2194,14 +2194,9 @@ extern int making_const_table; #undef ASM_OUTPUT_BEFORE_CASE_LABEL #define ASM_OUTPUT_BEFORE_CASE_LABEL(FILE, PREFIX, NUM, TABLE) /* Empty. */ -/* Make sure subsequent insns are aligned after a TBB. 
*/ -#define ASM_OUTPUT_CASE_END(FILE, NUM, JUMPTABLE) \ - do \ -{ \ - if (GET_MODE (PATTERN (JUMPTABLE)) == QImode) \ - ASM_OUTPUT_ALIGN (FILE, 1); \ -} \ - while (0) +#define LABEL_ALIGN_AFTER_BARRIER(LABEL)\ + (GET_CODE (PATTERN (prev_active_insn (LABEL))) == ADDR_DIFF_VEC \ + ? 1 : 0) #define ARM_DECLARE_FUNCTION_NAME(STREAM, NAME, DECL) \ do \ diff --git a/gcc/testsuite/g++.dg/torture/pr60609.C b/gcc/testsuite/g++.dg/torture/pr60609.C new file mode 100644 index 000..9ddec0b --- /dev/null +++ b/gcc/testsuite/g++.dg/torture/pr60609.C @@ -0,0 +1,252 @@ +/* { dg-do assemble } */ + +class exception +{ +}; +class bad_alloc:exception +{ +}; +class logic_error:exception +{ +}; +class domain_error:logic_error +{ +}; +class invalid_argument:logic_error +{ +}; +class length_error:logic_error +{ +}; +class overflow_error:exception +{ +}; +typedef int mpz_t[]; +template < class > class __gmp_expr; +template <> class __gmp_expr < mpz_t > +{ +~__gmp_expr (); +}; + +class PIP_Solution_Node; +class internal_exception +{ +~internal_exception (); +}; +class not_an_integer:internal_exception +{ +}; +class not_a_variable:internal_exception +{ +}; +class not_an_optimization_mode:internal_exception +{ +}; +class not_a_bounded_integer_type_width:internal_exception +{ +}; +class not_a_bounded_integer_type_representation:internal_exception +{ +}; +class not_a_bounded_integer_type_overflow:internal_exception +{ +}; +class not_a_complexity_class:internal_exception +{ +}; +class not_a_control_parameter_name:internal_exception +{ +}; +class not_a_control_parameter_value:internal_exception +{ +}; +class not_a_pip_problem_control_parameter_name:internal_exception +{ +}; +class not_a_pip_problem_control_parameter_value:internal_exception +{ +}; +class not_a_relation:internal_exception +{ +}; +class ppl_handle_mismatch:internal_exception +{ +}; +class timeout_exception +{ +~timeout_exception (); +}; +class deterministic_timeout_exception:timeout_exception +{ +}; +void __assert_fail (const char 
*, const char *, int, int *) +__attribute__ ((__noreturn__)); +void PL_get_pointer (void *); +int Prolog_is_address (); +inline int +Prolog_get_address (void **p1) +{ +Prolog_is_address ()? static_cast < +void >(0) : __assert_fail ("Prolog_is_address", "./swi_cfli.hh", 0, 0); +PL_get_pointer (p1); +return 0; +} + +class non_linear:internal_exception +{ +}; +class not_unsigned_integer:internal_exception +{ +}; +class not_universe_or_empty:internal_exception +{ +}; +class not_a_nil_terminated_list:internal_exception +{ +}; +class PPL_integer_out_of_range +{ +__gmp_expr < mpz_t > n; +}; +void handle_exception (); +template < typename T > T * term_to_handle (int, const char *) +{ +if (Prolog_is_address ()) +{ +void *p; +Prolog_get_address (&
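For readers unfamiliar with TBB, the assembler error in this thread can be modelled as follows. A TBB table entry is a single byte holding the forward distance to the branch target in halfwords, so the byte distance is halved before it is stored. The helper names and the 510/512 figures below are illustrative assumptions, not taken from the PR.

```c
#include <assert.h>
#include <stdbool.h>

/* Largest value a one-byte table entry can hold.  */
#define TBB_ENTRY_MAX 255u

/* Hypothetical helper: the value that would be emitted in the .byte
   field for a target BYTE_DISTANCE bytes ahead of the table base.  */
unsigned
tbb_entry_for_distance (unsigned byte_distance)
{
  /* Thumb instructions are halfword aligned.  */
  assert (byte_distance % 2 == 0);
  return byte_distance / 2;
}

/* True if the entry fits into the one-byte field.  */
bool
tbb_entry_fits (unsigned byte_distance)
{
  return tbb_entry_for_distance (byte_distance) <= TBB_ENTRY_MAX;
}
```

A target 510 bytes away still fits (entry 255), while 512 bytes gives entry 256, the value in the quoted error. So if the length computation does not see the padding emitted after the table, a target that it considered just in range can end up out of range.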
[PATCH][LTO] Fix(?) parallel WPA memory unsharing
The following fixes(?) parallel WPA memory unsharing caused by
streamer_write_chain writing to TREE_CHAIN (for no good reason).
The patch removes this historical code.

LTO bootstrap and testing running on x86_64-unknown-linux-gnu.

Richard.

2014-04-03 Richard Biener

 * tree-streamer-out.c (streamer_write_chain): Do not temporarily
 set TREE_CHAIN to NULL_TREE.

Index: gcc/tree-streamer-out.c
===================================================================
--- gcc/tree-streamer-out.c (revision 209054)
+++ gcc/tree-streamer-out.c (working copy)
@@ -523,13 +523,6 @@ streamer_write_chain (struct output_bloc
 {
   while (t)
     {
-      tree saved_chain;
-
-      /* Clear TREE_CHAIN to avoid blindly recursing into the rest
-         of the list. */
-      saved_chain = TREE_CHAIN (t);
-      TREE_CHAIN (t) = NULL_TREE;
-
       /* We avoid outputting external vars or functions by reference
          to the global decls section as we do not want to have them
          enter decl merging. This is, of course, only for the call
@@ -541,7 +534,6 @@ streamer_write_chain (struct output_bloc
       else
         stream_write_tree (ob, t, ref_p);
 
-      TREE_CHAIN (t) = saved_chain;
       t = TREE_CHAIN (t);
     }
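The idea behind the patch can be sketched with a plain linked list. My reading, which is an assumption rather than something the message states, is that the streaming processes share pages copy-on-write, so even a store that is undone a moment later forces a private copy of the page. The types and function names below are hypothetical and only model a chain walk that never stores to a node.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node standing in for a tree linked through TREE_CHAIN.  */
struct toy_node
{
  int payload;
  struct toy_node *chain;
};

/* Visit every node of the chain without storing to any node.  The
   visitor only receives a pointer to const, so the walk is read-only.
   Returns the number of nodes visited.  */
size_t
walk_chain_readonly (const struct toy_node *t,
                     void (*visit) (const struct toy_node *, void *),
                     void *data)
{
  size_t n = 0;

  assert (visit != NULL);

  while (t)
    {
      visit (t, data);
      t = t->chain;
      n++;
    }
  return n;
}

/* Example visitor: add the payload to the long pointed to by DATA.  */
void
sum_payload (const struct toy_node *t, void *data)
{
  *(long *) data += t->payload;
}
```

The removed code corresponds to saving t->chain, clearing it, visiting, and restoring it. The list looks the same afterwards, but every node has been written to.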
Re: [Patch]Simplify SUBREG with operand whose target bits are cleared by AND operation
> I find the GCC function simplify_subreg fails to simplify rtx (subreg:SI
> (and:DI (reg/v:DI 115 [ a ]) (const_int 4294967295 [0xffffffff])) 4) to zero
> during the fwprop1 pass, considering the fact that the high 32-bit part of
> (a & 0xffffffff) is zero. This leads to some unnecessary multiplications
> for high 32-bit part of the result of AND operation. The attached patch is
> trying to improve simplify_rtx to handle such case. Other target like x86
> seems hasn't such issue because it generates different RTX to handle 64bit
> multiplication on a 32bit machine.

See http://gcc.gnu.org/ml/gcc-patches/2013-05/msg00073.html for another try,
which led to the simplification in combine.c:combine_simplify_rtx line 5448.

Your variant is both more general, because it isn't restricted to the lowpart,
and less general, because it is artificially restricted to AND.

Some remarks:
 - this needs to be restricted to non-paradoxical subregs,
 - you need to test HWI_COMPUTABLE_MODE_P (innermode),
 - you need to test !side_effects_p (op).

I think we need to find a common ground between Jakub's patch and yours and
put a single transformation in simplify_subreg.

-- 
Eric Botcazou
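The arithmetic behind the requested simplification can be checked with ordinary integers. The sketch below is a toy model only: it assumes little-endian word order, so that byte offset 4 of a 64-bit value selects the high 32 bits, and it ignores the conditions listed above (paradoxical subregs, HWI_COMPUTABLE_MODE_P, side effects). The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of (subreg:SI (value:DI) byte_offset) under the assumed
   little-endian word order: offset 0 is the low word, offset 4 the
   high word.  */
uint32_t
toy_subreg_si (uint64_t value, unsigned byte_offset)
{
  assert (byte_offset == 0 || byte_offset == 4);
  return (uint32_t) (value >> (byte_offset * 8));
}

/* True if the selected word of (x & mask) is zero for every x, which
   holds exactly when the mask has no bits set in that word.  */
bool
toy_subreg_of_and_is_zero (uint64_t mask, unsigned byte_offset)
{
  return toy_subreg_si (mask, byte_offset) == 0;
}
```

For the mask 4294967295 from the quoted RTL, the high word is known to be zero and the low word is not. This is why the quoted subreg at offset 4 could fold to zero, provided offset 4 really is the high word on the target.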
Re: [4.8, PATCH 0/26] Backport Power8 and LE support
On Wed, Mar 19, 2014 at 3:23 PM, Bill Schmidt wrote: > Hi, > > Support for Power8 features and the new powerpc64le-linux-gnu target, > including the ELFv2 ABI, has been developed up till now on the > ibm/gcc-4_8-branch. It was appropriate to use this separate branch > while the support was unstable, but this branch will not represent a > particularly good support mechanism for distributions going forward. > Most distros are set up to pull from the major release branches, and > having a separate branch for one target is quite inconvenient. Also, > the ibm/gcc-4_8-branch's original purpose is to serve as the code base > for IBM's Advance Toolchain 7.0. Over time the two purposes that the > branch currently serves will diverge and make things even more > complicated. > > The code is now tested and stable enough that we are ready to backport > this support to the FSF 4.8 branch. This patch series constitutes that > backport. > > Almost all of the changes are specific to PowerPC portions of the code, > and for those patches I am only CCing David. However, some of the > patches require changes to common code, and for these I will CC Richard > and Jakub. Three of these are slightly unrelated but necessary patches, > one to enable decimal float ABS builtins, and two others to fix PR54537 > and PR56843. In addition there are patches that update configuration > files throughout for the new target, and some small changes in common > call support (call.c, expr.h, function.c) to support how the new ABI > handles calls. > > I realize it is unusual to backport such a large amount of code, but we > have been asked by distribution partners to do this, and we feel it > makes good sense for long-term support. 
> > I have tested the patch series by applying it to a clean FSF 4.8 branch > and comparing the test results against those from the IBM 4.8 branch on > three systems: > * Power8, little endian (--mcpu=power8) > * Power8, big endian (--mcpu=power8) > * Power7, big endian (--mcpu=power7) > > I also checked a recursive diff against the two source directories to > ensure that no patches were missed. > > Thanks, > Bill > > [ 1/26] diff-p8 > [ 2/26] diff-p8-htm > [ 3/26] diff-le-config > [ 4/26] diff-le-libtool > [ 5/26] diff-le-tests > [ 6/26] diff-le-dfp > [ 7/26] diff-le-vector > [ 8/26] diff-abi-compat > [ 9/26] diff-abi-calls > [10/26] diff-abi-elfv2 > [11/26] diff-abi-gotest > [12/26] diff-le-align > [13/26] diff-abi-libffi > [14/26] diff-dfp-abs > [15/26] diff-pr54537 > [16/26] diff-pr56843 > [17/26] diff-direct-move > [18/26] diff-le-config-2 > [19/26] diff-quad-memory > [20/26] diff-lra > [21/26] diff-le-vector-api > [22/26] diff-mcall > [23/26] diff-pr60137-pr60203 > [24/26] diff-reload > [25/26] diff-v1ti > [26/26] diff-trunk-missing With the positive feedback from Darwin and RTEMS, the additional backports for AIX and the bug fix for SPE, I am going to approve this patch series. There is a remaining issue with e600, but IBM LTC cannot reproduce it. If IBM can get more information, it can be addressed in a later patch to trunk and 4.8 branch. Thanks, David
Re: [4.8, PATCH 2/26] Backport Power8 and LE support: HTM support
On Wed, Mar 19, 2014 at 3:25 PM, Bill Schmidt wrote: > Hi, > > This patch (diff-p8-htm) backports hardware transactional memory > support. Copying Jakub and Richard for the libitm support. > > Thanks, > Bill > > > [gcc] > > 2014-03-29 Bill Schmidt > > Backport from mainline > 2013-12-03 Peter Bergner > > * config/rs6000/htmintrin.h (_TEXASR_INSTRUCTION_FETCH_CONFLICT): Fix > typo in macro name. > (_TEXASRU_INSTRUCTION_FETCH_CONFLICT): Likewise. > > Backport from mainline r205233. > 2013-11-21 Peter Bergner > > * doc/extend.texi: Document htm builtins. > > Backport from mainline > 2013-07-17 Iain Sandoe > > * config/rs6000/darwin.h (REGISTER_NAMES): Add HTM registers. > > Backport from mainline > 2013-07-16 Peter Bergner > > * config/rs6000/rs6000.c (rs6000_option_override_internal): Do not > enable extra ISA flags with TARGET_HTM. > > 2013-07-16 Jakub Jelinek > Peter Bergner > > * config/rs6000/rs6000.h (FIRST_PSEUDO_REGISTERS): Mention HTM > registers in the comment. > (DWARF_FRAME_REGISTERS): Subtract also the 3 HTM registers. > (DWARF_REG_TO_UNWIND_COLUMN): Use DWARF_FRAME_REGISTERS > rather than FIRST_PSEUDO_REGISTERS. > > * config.gcc (powerpc*-*-*): Install htmintrin.h and htmxlintrin.h. > * config/rs6000/t-rs6000 (MD_INCLUDES): Add htm.md. > * config/rs6000/rs6000.opt: Add -mhtm option. > * config/rs6000/rs6000-cpus.def (POWERPC_MASKS): Add OPTION_MASK_HTM. > (ISA_2_7_MASKS_SERVER): Add OPTION_MASK_HTM. > * config/rs6000/rs6000-c.c (rs6000_target_modify_macros): Define > __HTM__ if the HTM instructions are available. > * config/rs6000/predicates.md (u3bit_cint_operand, > u10bit_cint_operand, > htm_spr_reg_operand): New define_predicates. > * config/rs6000/rs6000.md (define_attr "type"): Add htm. > (TFHAR_REGNO, TFIAR_REGNO, TEXASR_REGNO): New define_constants. > Include htm.md. > * config/rs6000/rs6000-builtin.def (BU_HTM_0, BU_HTM_1, BU_HTM_2, > BU_HTM_3, BU_HTM_SPR0, BU_HTM_SPR1): Add support macros for defining > HTM builtin functions. 
> * config/rs6000/rs6000.c (RS6000_BUILTIN_H): New macro. > (rs6000_reg_names, alt_reg_names): Add HTM SPR register names. > (rs6000_init_hard_regno_mode_ok): Add support for HTM instructions. > (rs6000_builtin_mask_calculate): Likewise. > (rs6000_option_override_internal): Likewise. > (bdesc_htm): Add new HTM builtin support. > (htm_spr_num): New function. > (htm_spr_regno): Likewise. > (rs6000_htm_spr_icode): Likewise. > (htm_expand_builtin): Likewise. > (htm_init_builtins): Likewise. > (rs6000_expand_builtin): Add support for HTM builtin functions. > (rs6000_init_builtins): Likewise. > (rs6000_invalid_builtin, rs6000_opt_mask): Add support for -mhtm > option. > * config/rs6000/rs6000.h (ASM_CPU_SPEC): Add support for -mhtm. > (TARGET_HTM, MASK_HTM): Define macros. > (FIRST_PSEUDO_REGISTER): Adjust for new HTM SPR registers. > (FIXED_REGISTERS): Likewise. > (CALL_USED_REGISTERS): Likewise. > (CALL_REALLY_USED_REGISTERS): Likewise. > (REG_ALLOC_ORDER): Likewise. > (enum reg_class): Likewise. > (REG_CLASS_NAMES): Likewise. > (REG_CLASS_CONTENTS): Likewise. > (REGISTER_NAMES): Likewise. > (ADDITIONAL_REGISTER_NAMES): Likewise. > (RS6000_BTC_SPR, RS6000_BTC_VOID, RS6000_BTC_32BIT, RS6000_BTC_64BIT, > RS6000_BTC_MISC_MASK, RS6000_BTM_HTM): New macros. > (RS6000_BTM_COMMON): Add RS6000_BTM_HTM. > * config/rs6000/htm.md: New file. > * config/rs6000/htmintrin.h: New file. > * config/rs6000/htmxlintrin.h: New file. > > [libitm] > > 2014-03-29 Bill Schmidt > > Backport from mainline > * acinclude.m4 (LIBITM_CHECK_AS_HTM): New. > * configure: Rebuild. > * configure.tgt (target_cpu): Add -mhtm to XCFLAGS. > * config/powerpc/target.h: Include sys/auxv.h and htmintrin.h. > (USE_HTM_FASTPATH): Define. > (_TBEGIN_STARTED, _TBEGIN_INDETERMINATE, _TBEGIN_PERSISTENT, > _HTM_RETRIES) New macros. > (htm_abort, htm_abort_should_retry, htm_available, htm_begin, > htm_init, > htm_begin_success, htm_commit, htm_transaction_active): New functions. 
> > [gcc/testsuite] > > 2014-03-29 Bill Schmidt > > Backport from mainline > * lib/target-supports.exp (check_effective_target_powerpc_htm_ok): New > function to test if HTM is available. > * gcc.target/powerpc/htm-xl-intrin-1.c: New test. > * gcc.target/powerpc/htm-builtin-1.c: New test. The PowerPC bits are okay. Thanks, David
Re: [4.8, PATCH 5/26] Backport Power8 and LE support: Test adjustments
On Wed, Mar 19, 2014 at 11:25 AM, Bill Schmidt wrote: > Hi, > > This patch (diff-le-tests) backports adjustments to a few tests for > powerpc64le and the ELFv2 ABI. > > Thanks, > Bill > > > 2014-03-29 Bill Schmidt > > Backport from mainline > 2013-11-27 Bill Schmidt > > * gfortran.dg/nan_7.f90: Disable for little endian PowerPC. > > Backport from mainline r205106: > > 2013-11-20 Ulrich Weigand > > * gcc.target/powerpc/darwin-longlong.c (msw): Make endian-safe. > > Backport from mainline r205046: > > 2013-11-19 Ulrich Weigand > > * gcc.target/powerpc/ppc64-abi-2.c (MAKE_SLOT): New macro to > construct parameter slot value in endian-independent way. > (fcevv, fciievv, fcvevv): Use it. Okay. Thanks, David
Re: [4.8, PATCH 6/26] Backport Power8 and LE support: TDmode for LE
On Wed, Mar 19, 2014 at 3:29 PM, Bill Schmidt wrote: > Hi, > > This patch (diff-le-dfp) backports fixes for TDmode on a little endian > target. > > Thanks, > Bill > > > 2014-03-19 Bill Schmidt > > Backport from mainline r205123: > > 2013-11-20 Ulrich Weigand > > * config/rs6000/rs6000.c (rs6000_cannot_change_mode_class): Do not > allow subregs of TDmode in FPRs of smaller size in little-endian. > (rs6000_split_multireg_move): When splitting an access to TDmode > in FPRs, do not use simplify_gen_subreg. > > Backport from mainline r204927: > > 2013-11-17 Ulrich Weigand > > * config/rs6000/rs6000.c (rs6000_emit_move): Use low word of > sdmode_stack_slot also in little-endian mode. Okay. Thanks, David
Re: [4.8, PATCH 7/26] Backport Power8 and LE support: Vector LE
On Wed, Mar 19, 2014 at 3:30 PM, Bill Schmidt wrote: > Hi, > > This patch (diff-le-vector) backports the changes to support vector > infrastructure on powerpc64le. Copying Richard and Jakub for the libcpp > bits. > > Thanks, > Bill > > > [gcc] > > 2014-03-29 Bill Schmidt > > Backport from mainline r205333 > 2013-11-24 Bill Schmidt > > * config/rs6000/rs6000.c (rs6000_expand_vec_perm_const_1): Correct > for little endian. > > Backport from mainline r205241 > 2013-11-21 Bill Schmidt > > * config/rs6000/vector.md (vec_pack_trunc_v2df): Revert previous > little endian change. > (vec_pack_sfix_trunc_v2df): Likewise. > (vec_pack_ufix_trunc_v2df): Likewise. > * config/rs6000/rs6000.c (rs6000_expand_interleave): Correct > double checking of endianness. > > Backport from mainline r205146 > 2013-11-20 Bill Schmidt > > * config/rs6000/vsx.md (vsx_set_): Adjust for little endian. > (vsx_extract_): Likewise. > (*vsx_extract__one_le): New LE variant on > *vsx_extract__zero. > (vsx_extract_v4sf): Adjust for little endian. > > Backport from mainline r205080 > 2013-11-19 Bill Schmidt > > * config/rs6000/rs6000.c (altivec_expand_vec_perm_const): Adjust > V16QI vector splat case for little endian. > > Backport from mainline r205045: > > 2013-11-19 Ulrich Weigand > > * config/rs6000/vector.md ("mov"): Do not call > rs6000_emit_le_vsx_move to move into or out of GPRs. > * config/rs6000/rs6000.c (rs6000_emit_le_vsx_move): Assert > source and destination are not GPR hard regs. > > Backport from mainline r204920 > 2011-11-17 Bill Schmidt > > * config/rs6000/rs6000.c (rs6000_frame_related): Add split_reg > parameter and use it in REG_FRAME_RELATED_EXPR note. > (emit_frame_save): Call rs6000_frame_related with extra NULL_RTX > parameter. > (rs6000_emit_prologue): Likewise, but for little endian VSX > stores, pass the source register of the store instead. 
> > Backport from mainline r204862 > 2013-11-15 Bill Schmidt > > * config/rs6000/altivec.md (UNSPEC_VPERM_X, UNSPEC_VPERM_UNS_X): > Remove. > (altivec_vperm_): Revert earlier little endian change. > (*altivec_vperm__internal): Remove. > (altivec_vperm__uns): Revert earlier little endian change. > (*altivec_vperm__uns_internal): Remove. > * config/rs6000/vector.md (vec_realign_load_): Revise > commentary. > > Backport from mainline r204441 > 2013-11-05 Bill Schmidt > > * config/rs6000/rs6000.c (rs6000_option_override_internal): > Remove restriction against use of VSX instructions when generating > code for little endian mode. > > Backport from mainline r204440 > 2013-11-05 Bill Schmidt > > * config/rs6000/altivec.md (mulv4si3): Ensure we generate vmulouh > for both big and little endian. > (mulv8hi3): Swap input operands for merge high and merge low > instructions for little endian. > > Backport from mainline r204439 > 2013-11-05 Bill Schmidt > > * config/rs6000/altivec.md (vec_widen_umult_even_v16qi): Change > define_insn to define_expand that uses even patterns for big > endian and odd patterns for little endian. > (vec_widen_smult_even_v16qi): Likewise. > (vec_widen_umult_even_v8hi): Likewise. > (vec_widen_smult_even_v8hi): Likewise. > (vec_widen_umult_odd_v16qi): Likewise. > (vec_widen_smult_odd_v16qi): Likewise. > (vec_widen_umult_odd_v8hi): Likewise. > (vec_widen_smult_odd_v8hi): Likewise. > (altivec_vmuleub): New define_insn. > (altivec_vmuloub): Likewise. > (altivec_vmulesb): Likewise. > (altivec_vmulosb): Likewise. > (altivec_vmuleuh): Likewise. > (altivec_vmulouh): Likewise. > (altivec_vmulesh): Likewise. > (altivec_vmulosh): Likewise. > > Backport from mainline r204395 > 2013-11-05 Bill Schmidt > > * config/rs6000/vector.md (vec_pack_sfix_trunc_v2df): Adjust for > little endian. > (vec_pack_ufix_trunc_v2df): Likewise. 
> > Backport from mainline r204363 > 2013-11-04 Bill Schmidt > > * config/rs6000/altivec.md (vec_widen_umult_hi_v16qi): Swap > arguments to merge instruction for little endian. > (vec_widen_umult_lo_v16qi): Likewise. > (vec_widen_smult_hi_v16qi): Likewise. > (vec_widen_smult_lo_v16qi): Likewise. > (vec_widen_umult_hi_v8hi): Likewise. > (vec_widen_umult_lo_v8hi): Likewise. > (vec_widen_smult_hi_v8hi): Likewise. > (vec_widen_smult_lo_v8hi): Likewise. > > Backport from mainli
Re: [4.8, PATCH 8/26] Backport Power8 and LE support: PR57949
On Wed, Mar 19, 2014 at 3:30 PM, Bill Schmidt wrote: > Hi, > > This patch (diff-abi-compat) backports the ABI compatibility fix for > PR57949. > > Thanks, > Bill > > > [gcc] > > 2014-03-29 Bill Schmidt > > Backport from mainline r201750. > 2013-11-15 Ulrich Weigand > Note: Default setting of -mcompat-align-parm inverted! > > 2013-08-14 Bill Schmidt > > PR target/57949 > * doc/invoke.texi: Add documentation of mcompat-align-parm > option. > * config/rs6000/rs6000.opt: Add mcompat-align-parm option. > * config/rs6000/rs6000.c (rs6000_function_arg_boundary): For AIX > and Linux, correct BLKmode alignment when 128-bit alignment is > required and compatibility flag is not set. > (rs6000_gimplify_va_arg): For AIX and Linux, honor specified > alignment for zero-size arguments when compatibility flag is not > set. > > [gcc/testsuite] > > 2014-03-29 Bill Schmidt > > Backport from mainline r201750. > 2013-11-15 Ulrich Weigand > Note: Default setting of -mcompat-align-parm inverted! > > 2013-08-14 Bill Schmidt > > PR target/57949 > * gcc.target/powerpc/pr57949-1.c: New. > * gcc.target/powerpc/pr57949-2.c: New. Okay. Thanks, David
Re: [4.8, PATCH 10/26] Backport Power8 and LE support: ELFv2 ABI
On Wed, Mar 19, 2014 at 3:31 PM, Bill Schmidt wrote: > Hi, > > This patch (diff-abi-elfv2) backports the fundamental changes for the > ELFv2 ABI for powerpc64le. Copying Richard and Jakub for the libgcc, > libitm, and libstdc++ bits. > > Thanks, > Bill > > > [gcc] > > 2014-03-29 Bill Schmidt > > Backport from mainline r204842: > > 2013-11-15 Ulrich Weigand > > * doc/invoke.texi (-mabi=elfv1, -mabi=elfv2): Document. > > Backport from mainline r204809: > > 2013-11-14 Ulrich Weigand > > * config/rs6000/sysv4le.h (LINUX64_DEFAULT_ABI_ELFv2): Define. > > Backport from mainline r204808: > > 2013-11-14 Ulrich Weigand > Alan Modra > > * config/rs6000/rs6000.h (RS6000_SAVE_AREA): Handle ABI_ELFv2. > (RS6000_SAVE_TOC): Remove. > (RS6000_TOC_SAVE_SLOT): New macro. > * config/rs6000/rs6000.c (rs6000_parm_offset): New function. > (rs6000_parm_start): Use it. > (rs6000_function_arg_advance_1): Likewise. > (rs6000_emit_prologue): Use RS6000_TOC_SAVE_SLOT. > (rs6000_emit_epilogue): Likewise. > (rs6000_call_aix): Likewise. > (rs6000_output_function_prologue): Do not save/restore r11 > around calling _mcount for ABI_ELFv2. > > 2013-11-14 Ulrich Weigand > Alan Modra > > * config/rs6000/rs6000-protos.h (rs6000_reg_parm_stack_space): > Add prototype. > * config/rs6000/rs6000.h (RS6000_REG_SAVE): Remove. > (REG_PARM_STACK_SPACE): Call rs6000_reg_parm_stack_space. > * config/rs6000/rs6000.c (rs6000_parm_needs_stack): New function. > (rs6000_function_parms_need_stack): Likewise. > (rs6000_reg_parm_stack_space): Likewise. > (rs6000_function_arg): Do not replace BLKmode by Pmode when > returning a register argument. > > 2013-11-14 Ulrich Weigand > Michael Gschwind > > * config/rs6000/rs6000.h (FP_ARG_MAX_RETURN): New macro. > (ALTIVEC_ARG_MAX_RETURN): Likewise. > (FUNCTION_VALUE_REGNO_P): Use them. > * config/rs6000/rs6000.c (TARGET_RETURN_IN_MSB): Define. > (rs6000_return_in_msb): New function. > (rs6000_return_in_memory): Handle ELFv2 homogeneous aggregates. 
> Handle aggregates of up to 16 bytes for ELFv2. > (rs6000_function_value): Handle ELFv2 homogeneous aggregates. > > 2013-11-14 Ulrich Weigand > Michael Gschwind > > * config/rs6000/rs6000.h (AGGR_ARG_NUM_REG): Define. > * config/rs6000/rs6000.c (rs6000_aggregate_candidate): New function. > (rs6000_discover_homogeneous_aggregate): Likewise. > (rs6000_function_arg_boundary): Handle homogeneous aggregates. > (rs6000_function_arg_advance_1): Likewise. > (rs6000_function_arg): Likewise. > (rs6000_arg_partial_bytes): Likewise. > (rs6000_psave_function_arg): Handle BLKmode arguments. > > 2013-11-14 Ulrich Weigand > Michael Gschwind > > * config/rs6000/rs6000.h (AGGR_ARG_NUM_REG): Define. > * config/rs6000/rs6000.c (rs6000_aggregate_candidate): New function. > (rs6000_discover_homogeneous_aggregate): Likewise. > (rs6000_function_arg_boundary): Handle homogeneous aggregates. > (rs6000_function_arg_advance_1): Likewise. > (rs6000_function_arg): Likewise. > (rs6000_arg_partial_bytes): Likewise. > (rs6000_psave_function_arg): Handle BLKmode arguments. > > 2013-11-14 Ulrich Weigand > > * config/rs6000/rs6000.c (machine_function): New member > r2_setup_needed. > (rs6000_emit_prologue): Set r2_setup_needed if necessary. > (rs6000_output_mi_thunk): Set r2_setup_needed. > (rs6000_output_function_prologue): Output global entry point > prologue and local entry point marker if needed for ABI_ELFv2. > Output -mprofile-kernel code here. > (output_function_profiler): Do not output -mprofile-kernel > code here; moved to rs6000_output_function_prologue. > (rs6000_file_start): Output ".abiversion 2" for ABI_ELFv2. > > (rs6000_emit_move): Do not handle dot symbols for ABI_ELFv2. > (rs6000_output_function_entry): Likewise. > (rs6000_assemble_integer): Likewise. > (rs6000_elf_encode_section_info): Likewise. > (rs6000_elf_declare_function_name): Do not create dot symbols > or .opd section for ABI_ELFv2. > > (rs6000_trampoline_size): Update for ABI_ELFv2 trampolines. 
> (rs6000_trampoline_init): Likewise. > (rs6000_elf_file_end): Call file_end_indicate_exec_stack > for ABI_ELFv2. > > (rs6000_call_aix): Handle ELFv2 indirect calls. Do not check > for function descriptors in ABI_ELFv2. > > * config/rs6000/rs6000.md ("*call_indirect_aix"): Supp
Re: [4.8, PATCH 11/26] Backport Power8 and LE support: gotest
On Wed, Mar 19, 2014 at 3:31 PM, Bill Schmidt wrote: > Hi, > > This patch (diff-abi-gotest) backports enablement of the Go testsuite > for powerpc64le. > > Thanks, > Bill > > > 2014-03-29 Bill Schmidt > > Backport from mainline r205000. > 2013-11-19 Ulrich Weigand > > gotest: Recognize PPC ELF v2 function pointers in text section. Okay. Thanks, David
Re: [4.8, PATCH 12/26] Backport Power8 and LE support: Defaults
On Wed, Mar 19, 2014 at 3:32 PM, Bill Schmidt wrote: > Hi, > > This patch (diff-le-align) sets some miscellaneous defaults for little > endian support. > > Thanks, > Bill > > > 2014-03-29 Bill Schmidt > > Apply mainline r205060. > 2013-11-20 Alan Modra > * config/rs6000/sysv4.h (CC1_ENDIAN_LITTLE_SPEC): Define as empty. > * config/rs6000/rs6000.c (rs6000_option_override_internal): Default > to strict alignment on older processors when little-endian. > * config/rs6000/linux64.h (PROCESSOR_DEFAULT64): Default to power8 > for ELFv2. Okay. Thanks, David
Re: [4.8, PATCH 14/26] Backport Power8 and LE support: DFP absolute value
On Wed, Mar 19, 2014 at 3:32 PM, Bill Schmidt wrote: > Hi, > > This patch (diff-dfp-abs) backports some unrelated but necessary work to > enable the DFP absolute value builtins. Copying Jakub who was involved > with the original patch. > > Thanks, > Bill > > > 2014-03-29 Bill Schmidt > > Backport from mainline > 2013-08-19 Peter Bergner > Jakub Jelinek > > * builtins.def (BUILT_IN_FABSD32): New DFP ABS builtin. > (BUILT_IN_FABSD64): Likewise. > (BUILT_IN_FABSD128): Likewise. > * builtins.c (expand_builtin): Add support for > new DFP ABS builtins. > (fold_builtin_1): Likewise. > * config/rs6000/dfp.md > (*abstd2_fpr): Handle non-overlapping destination > and source operands. > (*nabstd2_fpr): Likewise. > > 2014-03-29 Bill Schmidt > > Backport from mainline > 2013-08-19 Peter Bergner > > * gcc.target/powerpc/dfp-dd-2.c: New test. > * gcc.target/powerpc/dfp-td-2.c: Likewise. > * gcc.target/powerpc/dfp-td-3.c: Likewise. Okay. Thanks, David
Re: [4.8, PATCH 17/26] Backport Power8 and LE support: Direct moves
On Wed, Mar 19, 2014 at 3:33 PM, Bill Schmidt wrote: > Hi, > > This patch (diff-direct-move) backports support for the Power8 direct > move instructions for little endian. > > Thanks, > Bill > > > 2014-03-19 Bill Schmidt > > Backport from mainline > 2013-10-23 Pat Haugen > > * gcc.target/powerpc/direct-move.h: Fix header for executable tests. > > Back port from mainline > 2014-01-16 Michael Meissner > > PR target/59844 > * config/rs6000/rs6000.md (reload_vsx_from_gprsf): Add little > endian support, remove tests for WORDS_BIG_ENDIAN. > (p8_mfvsrd_3_): Likewise. > (reload_gpr_from_vsx): Likewise. > (reload_gpr_from_vsxsf): Likewise. > (p8_mfvsrd_4_disf): Likewise. Okay. Thanks, David
Re: [4.8, PATCH 16/26] Backport Power8 and LE support: PR56843
On Wed, Mar 19, 2014 at 3:33 PM, Bill Schmidt wrote: > Hi, > > This patch (diff-pr56843) backports the fix for PR56843. > > Thanks, > Bill > > > [gcc] > > 2014-03-19 Bill Schmidt > > Backport from mainline > 2013-04-05 Bill Schmidt > > PR target/56843 > * config/rs6000/rs6000.c (rs6000_emit_swdiv_high_precision): Remove. > (rs6000_emit_swdiv_low_precision): Remove. > (rs6000_emit_swdiv): Rewrite to handle between one and four > iterations of Newton-Raphson generally; modify required number of > iterations for some cases. > * config/rs6000/rs6000.h (RS6000_RECIP_HIGH_PRECISION_P): Remove. > > [gcc/testsuite] > > 2014-03-19 Bill Schmidt > > Backport from mainline > 2013-04-05 Bill Schmidt > > PR target/56843 > * gcc.target/powerpc/recip-1.c: Modify expected output. > * gcc.target/powerpc/recip-3.c: Likewise. > * gcc.target/powerpc/recip-4.c: Likewise. > * gcc.target/powerpc/recip-5.c: Add expected output for iterations. Okay. Thanks, David
Re: [4.8, PATCH 18/26] Backport Power8 and LE support: Configure bits 2
On Wed, Mar 19, 2014 at 3:33 PM, Bill Schmidt wrote: > Hi, > > This patch (diff-le-config-2) backports more configure changes, > particularly for multilib/multiarch targeting powerpc64le. > > Thanks, > Bill > > > 2014-03-19 Bill Schmidt > > Apply mainline r202190, powerpc64le multilibs and multiarch dir > 2013-09-03 Alan Modra > > * config.gcc (powerpc*-*-linux*): Add support for little-endian > multilibs to big-endian target and vice versa. > * config/rs6000/t-linux64: Use := assignment on all vars. > (MULTILIB_EXTRA_OPTS): Remove fPIC. > (MULTILIB_OSDIRNAMES): Specify using mapping from multilib_options. > * config/rs6000/t-linux64le: New file. > * config/rs6000/t-linux64bele: New file. > * config/rs6000/t-linux64lebe: New file. Okay. Thanks, David
Re: [4.8, PATCH 19/26] Backport Power8 and LE support: Quad memory atomic
On Wed, Mar 19, 2014 at 3:33 PM, Bill Schmidt wrote: > Hi, > > This patch (diff-quad-memory) backports support for quad-memory atomic > operations. > > Thanks, > Bill > > > [gcc/testsuite] > > 2014-03-19 Bill Schmidt > > Back port from mainline > 2014-01-23 Michael Meissner > > PR target/59909 > * gcc.target/powerpc/quad-atomic.c: New file to test power8 quad > word atomic functions at runtime. > > [gcc] > > 2014-03-19 Bill Schmidt > > Back port from mainline > 2014-01-23 Michael Meissner > > PR target/59909 > * doc/invoke.texi (RS/6000 and PowerPC Options): Document > -mquad-memory-atomic. Update -mquad-memory documentation to say > it is only used for non-atomic loads/stores. > > * config/rs6000/predicates.md (quad_int_reg_operand): Allow either > -mquad-memory or -mquad-memory-atomic switches. > > * config/rs6000/rs6000-cpus.def (ISA_2_7_MASKS_SERVER): Add > -mquad-memory-atomic to ISA 2.07 support. > > * config/rs6000/rs6000.opt (-mquad-memory-atomic): Add new switch > to separate support of normal quad word memory operations (ldq, > stq) from the atomic quad word memory operations. > > * config/rs6000/rs6000.c (rs6000_option_override_internal): Add > support to separate non-atomic quad word operations from atomic > quad word operations. Disable non-atomic quad word operations in > little endian mode so that we don't have to swap words after the > load and before the store. > (quad_load_store_p): Add comment about atomic quad word support. > (rs6000_opt_masks): Add -mquad-memory-atomic to the list of > options printed with -mdebug=reg. > > * config/rs6000/rs6000.h (TARGET_SYNC_TI): Use > -mquad-memory-atomic as the test for whether we have quad word > atomic instructions. > (TARGET_SYNC_HI_QI): If either -mquad-memory-atomic, > -mquad-memory, or -mp8-vector are used, allow byte/half-word > atomic operations. > > * config/rs6000/sync.md (load_lockedti): Insure that the address > is a proper indexed or indirect address for the lqarx instruction. 
> On little endian systems, swap the hi/lo registers after the lqarx > instruction. > (load_lockedpti): Use indexed_or_indirect_operand predicate to > insure the address is valid for the lqarx instruction. > (store_conditionalti): Insure that the address is a proper indexed > or indirect address for the stqcrx. instruction. On little endian > systems, swap the hi/lo registers before doing the stqcrx. > instruction. > (store_conditionalpti): Use indexed_or_indirect_operand predicate to > insure the address is valid for the stqcrx. instruction. > > * gcc/config/rs6000/rs6000-c.c (rs6000_target_modify_macros): > Define __QUAD_MEMORY__ and __QUAD_MEMORY_ATOMIC__ based on what > type of quad memory support is available. Okay. Thanks, David
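The last ChangeLog entry above says that rs6000-c.c defines __QUAD_MEMORY__ and __QUAD_MEMORY_ATOMIC__ according to the quad memory support that is available. A minimal sketch of how user code might test for them follows. The function name and the bit encoding are assumptions, and the mapping of each macro to a particular switch is inferred from the macro names rather than stated in the message.

```cpp
// Hypothetical helper: report which quad-memory macros the compiler defined.
// Bit 0 is set for __QUAD_MEMORY__, bit 1 for __QUAD_MEMORY_ATOMIC__.
// On targets that define neither macro the result is 0.
int quad_memory_macros() {
    int mask = 0;
#ifdef __QUAD_MEMORY__
    mask |= 1;  // presumably the -mquad-memory (non-atomic) case
#endif
#ifdef __QUAD_MEMORY_ATOMIC__
    mask |= 2;  // presumably the -mquad-memory-atomic case
#endif
    return mask;
}
```

Returning a bit mask rather than a single label keeps the two macros independent, since the ChangeLog treats the atomic and non-atomic switches as separate options.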
Re: [4.8, PATCH 20/26] Backport Power8 and LE support: LRA
On Wed, Mar 19, 2014 at 3:33 PM, Bill Schmidt wrote: > Hi, > > This patch (diff-lra) backports the changes to enable -mlra for the > PowerPC back end. > > Thanks, > Bill > > > 2014-03-19 Bill Schmidt > > Backport from mainline > 2014-02-04 Michael Meissner > > * config/rs6000/rs6000.opt (-mlra): Add switch to enable the LRA > register allocator. > > * config/rs6000/rs6000.c (TARGET_LRA_P): Add support for -mlra to > enable the LRA register allocator. Back port the changes from the > trunk to enable LRA. > (rs6000_legitimate_offset_address_p): Likewise. > (legitimate_lo_sum_address_p): Likewise. > (use_toc_relative_ref): Likewise. > (rs6000_legitimate_address_p): Likewise. > (rs6000_emit_move): Likewise. > (rs6000_secondary_memory_needed_mode): Likewise. > (rs6000_alloc_sdmode_stack_slot): Likewise. > (rs6000_lra_p): Likewise. > > * config/rs6000/sync.md (load_lockedti): Copy TI/PTI variables by > 64-bit parts to force the register allocator to allocate even/odd > register pairs for the quad word atomic instructions. > (store_conditionalti): Likewise. Okay. Thanks, David
Re: [4.8, PATCH 22/26] Backport Power8 and LE support: -mcall-* endianness
On Wed, Mar 19, 2014 at 3:34 PM, Bill Schmidt wrote: > Hi, > > This patch (diff-mcall) fixes big-endian assumptions for -mcall-aixdesc > and various others. > > Thanks, > Bill > > > 2014-03-19 Bill Schmidt > > Backport from mainline r207658 > 2014-02-06 Ulrich Weigand > > * config/rs6000/sysv4.h (ENDIAN_SELECT): Do not attempt to enforce > big-endian mode for -mcall-aixdesc, -mcall-freebsd, -mcall-netbsd, > -mcall-openbsd, or -mcall-linux. > (CC1_ENDIAN_BIG_SPEC): Remove. > (CC1_ENDIAN_LITTLE_SPEC): Remove. > (CC1_ENDIAN_DEFAULT_SPEC): Remove. > (CC1_SPEC): Remove (always empty) %cc1_endian_... spec. > (SUBTARGET_EXTRA_SPECS): Remove %cc1_endian_big, %cc1_endian_little, > and %cc1_endian_default. > * config/rs6000/sysv4le.h (CC1_ENDIAN_DEFAULT_SPEC): Remove. Okay. Thanks, David
Re: [4.8, PATCH 21/26] Backport Power8 and LE support: Vector APIs
On Wed, Mar 19, 2014 at 3:34 PM, Bill Schmidt wrote: > Hi, > > This patch (diff-le-vector-api) backports enablement of LE support for > the Altivec APIs, including support for -maltivec=be. > > Thanks, > Bill > > > [gcc] > > 2014-03-19 Bill Schmidt > > Backport from mainline r206443 > 2014-01-08 Bill Schmidt > > * config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Remove > two duplicate entries. > > Backport from mainline r206494 > 2014-01-09 Bill Schmidt > > * doc/invoke.texi: Add -maltivec={be,le} options, and document > default element-order behavior for -maltivec. > * config/rs6000/rs6000.opt: Add -maltivec={be,le} options. > * config/rs6000/rs6000.c (rs6000_option_override_internal): Ensure > that -maltivec={le,be} implies -maltivec; disallow -maltivec=le > when targeting big endian, at least for now. > * config/rs6000/rs6000.h: Add #define of VECTOR_ELT_ORDER_BIG. > > Backport from mainline r206541 > 2014-01-10 Bill Schmidt > > * config/rs6000/rs6000-builtin.def: Fix pasto for VPKSDUS. > > Backport from mainline r206590 > 2014-01-13 Bill Schmidt > > * config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin): > Implement -maltivec=be for vec_insert and vec_extract. > > Backport from mainline r206641 > 2014-01-15 Bill Schmidt > > * config/rs6000/altivec.md (mulv8hi3): Explicitly generate vmulesh > and vmulosh rather than call gen_vec_widen_smult_*. > (vec_widen_umult_even_v16qi): Test VECTOR_ELT_ORDER_BIG rather > than BYTES_BIG_ENDIAN to determine use of even or odd instruction. > (vec_widen_smult_even_v16qi): Likewise. > (vec_widen_umult_even_v8hi): Likewise. > (vec_widen_smult_even_v8hi): Likewise. > (vec_widen_umult_odd_v16qi): Likewise. > (vec_widen_smult_odd_v16qi): Likewise. > (vec_widen_umult_odd_v8hi): Likewise. > (vec_widen_smult_odd_v8hi): Likewise. > (vec_widen_umult_hi_v16qi): Explicitly generate vmuleub and > vmuloub rather than call gen_vec_widen_umult_*. > (vec_widen_umult_lo_v16qi): Likewise. 
> (vec_widen_smult_hi_v16qi): Explicitly generate vmulesb and > vmulosb rather than call gen_vec_widen_smult_*. > (vec_widen_smult_lo_v16qi): Likewise. > (vec_widen_umult_hi_v8hi): Explicitly generate vmuleuh and vmulouh > rather than call gen_vec_widen_umult_*. > (vec_widen_umult_lo_v8hi): Likewise. > (vec_widen_smult_hi_v8hi): Explicitly generate vmulesh and vmulosh > rather than call gen_vec_widen_smult_*. > (vec_widen_smult_lo_v8hi): Likewise. > > Backport from mainline r207062 > 2014-01-24 Bill Schmidt > > * config/rs6000/rs6000.c (rs6000_expand_vec_perm_const_1): Remove > correction for little endian... > * config/rs6000/vsx.md (vsx_xxpermdi2__1): ...and move it to > here. > > Backport from mainline r207262 > 2014-01-29 Bill Schmidt > > * config/rs6000/rs6000.c (altivec_expand_vec_perm_const): Use > CODE_FOR_altivec_vmrg*_direct rather than CODE_FOR_altivec_vmrg*. > * config/rs6000/vsx.md (vsx_mergel_): Adjust for > -maltivec=be with LE targets. > (vsx_mergeh_): Likewise. > * config/rs6000/altivec.md (UNSPEC_VMRG[HL]_DIRECT): New > unspecs. > (mulv8hi3): Use gen_altivec_vmrg[hl]w_direct. > (altivec_vmrghb): Replace with define_expand and new > *altivec_vmrghb_internal insn; adjust for -maltivec=be with LE > targets. > (altivec_vmrghb_direct): New define_insn. > (altivec_vmrghh): Replace with define_expand and new > *altivec_vmrghh_internal insn; adjust for -maltivec=be with LE > targets. > (altivec_vmrghh_direct): New define_insn. > (altivec_vmrghw): Replace with define_expand and new > *altivec_vmrghw_internal insn; adjust for -maltivec=be with LE > targets. > (altivec_vmrghw_direct): New define_insn. > (*altivec_vmrghsf): Adjust for endianness. > (altivec_vmrglb): Replace with define_expand and new > *altivec_vmrglb_internal insn; adjust for -maltivec=be with LE > targets. > (altivec_vmrglb_direct): New define_insn. > (altivec_vmrglh): Replace with define_expand and new > *altivec_vmrglh_internal insn; adjust for -maltivec=be with LE > targets.
> (altivec_vmrglh_direct): New define_insn. > (altivec_vmrglw): Replace with define_expand and new > *altivec_vmrglw_internal insn; adjust for -maltivec=be with LE > targets. > (altivec_vmrglw_direct): New define_insn. > (*altivec_vmrglsf): Adjust for endianness. > (vec_widen_umult_hi_v16qi): Use gen_altivec_vmrghh_direct. >
Re: [4.8, PATCH 23/26] Backport Power8 and LE support: PR60137, PR60203
On Wed, Mar 19, 2014 at 3:34 PM, Bill Schmidt wrote: > Hi, > > This patch (diff-pr60137-pr60203) backports fixes for two little-endian > vector mode problems. > > Thanks, > Bill > > > [gcc] > > 2014-03-19 Bill Schmidt > > Backport from mainline r207699. > 2014-02-11 Michael Meissner > > PR target/60137 > * config/rs6000/rs6000.md (128-bit GPR splitter): Add a splitter > for VSX/Altivec vectors that land in GPR registers. > > Backport from mainline r207808. > 2014-02-15 Michael Meissner > > PR target/60203 > * config/rs6000/rs6000.md (rreg): Add TFmode, TDmode constraints. > (mov_internal, TFmode/TDmode): Split TFmode/TDmode moves > into 64-bit and 32-bit moves. On 64-bit moves, add support for > using direct move instructions on ISA 2.07. Also adjust > instruction length for 64-bit. > (mov_64bit, TFmode/TDmode): Likewise. > (mov_32bit, TFmode/TDmode): Likewise. > > Backport from mainline r207868. > 2014-02-18 Michael Meissner > > PR target/60203 > * config/rs6000/rs6000.md (mov_64bit, TF/TDmode moves): > Split 64-bit moves into 2 patterns. Do not allow the use of > direct move for TDmode in little endian, since the decimal value > has little endian bytes within a word, but the 64-bit pieces are > ordered in a big endian fashion, and normal subreg's of TDmode are > not allowed. > (mov_64bit_dm): Likewise. > (movtd_64bit_nodm): Likewise. > > [gcc/testsuite] > > 2014-03-19 Bill Schmidt > > Backport from mainline r207699. > 2014-02-11 Michael Meissner > > PR target/60137 > * gcc.target/powerpc/pr60137.c: New file. > > Backport from mainline r207808. > 2014-02-15 Michael Meissner > > PR target/60203 > * gcc.target/powerpc/pr60203.c: New testsuite. Okay. Thanks, David
Re: [4.8, PATCH 24/26] Backport Power8 and LE support: Reload issues
On Wed, Mar 19, 2014 at 3:34 PM, Bill Schmidt wrote: > Hi, > > This patch (diff-reload) backports fixes for a couple of problems in > PowerPC reload handling. > > Thanks, > Bill > > > 2014-03-19 Bill Schmidt > > Apply mainline r207798 > 2014-02-26 Alan Modra > PR target/58675 > PR target/57935 > * config/rs6000/rs6000.c (rs6000_secondary_reload_inner): Use > find_replacement on parts of insn rtl that might be reloaded. > > Backport from mainline r208287 > 2014-03-03 Bill Schmidt > > * config/rs6000/rs6000.c (rs6000_preferred_reload_class): Disallow > reload of PLUS rtx's outside of GENERAL_REGS or BASE_REGS; relax > constraint on constants to permit them being loaded into > GENERAL_REGS or BASE_REGS. Okay. Thanks, David
Re: [4.8, PATCH 25/26] Backport Power8 and LE support: V1TI support
On Wed, Mar 19, 2014 at 3:34 PM, Bill Schmidt wrote: > Hi, > > This patch (diff-v1ti) backports the V1TI support. > > Thanks, > Bill > > > [gcc] > > 2014-03-19 Bill Schmidt > > Back port from trunk > 2014-03-12 Michael Meissner > > * config/rs6000/vector.md (VEC_L): Add V1TI mode to vector types. > (VEC_M): Likewise. > (VEC_N): Likewise. > (VEC_R): Likewise. > (VEC_base): Likewise. > (mov, VEC_M modes): If we are loading TImode into VSX > registers, we need to swap double words in little endian mode. > > * config/rs6000/rs6000-modes.def (V1TImode): Add new vector mode > to be a container mode for 128-bit integer operations added in ISA > 2.07. Unlike TImode and PTImode, the preferred register set is > the Altivec/VMX registers for the 128-bit operations. > > * config/rs6000/rs6000-protos.h (rs6000_move_128bit_ok_p): Add > declarations. > (rs6000_split_128bit_ok_p): Likewise. > > * config/rs6000/rs6000-builtin.def (BU_P8V_AV_3): Add new support > macros for creating ISA 2.07 normal and overloaded builtin > functions with 3 arguments. > (BU_P8V_OVERLOAD_3): Likewise. > (VPERM_1T): Add support for V1TImode in 128-bit vector operations > for use as overloaded functions. > (VPERM_1TI_UNS): Likewise. > (VSEL_1TI): Likewise. > (VSEL_1TI_UNS): Likewise. > (ST_INTERNAL_1ti): Likewise. > (LD_INTERNAL_1ti): Likewise. > (XXSEL_1TI): Likewise. > (XXSEL_1TI_UNS): Likewise. > (VPERM_1TI): Likewise. > (VPERM_1TI_UNS): Likewise. > (XXPERMDI_1TI): Likewise. > (SET_1TI): Likewise. > (LXVD2X_V1TI): Likewise. > (STXVD2X_V1TI): Likewise. > (VEC_INIT_V1TI): Likewise. > (VEC_SET_V1TI): Likewise. > (VEC_EXT_V1TI): Likewise. > (EQV_V1TI): Likewise. > (NAND_V1TI): Likewise. > (ORC_V1TI): Likewise. > (VADDCUQ): Add support for 128-bit integer arithmetic instructions > added in ISA 2.07. Add both normal 'altivec' builtins, and the > overloaded builtin. > (VADDUQM): Likewise. > (VSUBCUQ): Likewise. > (VADDEUQM): Likewise. > (VADDECUQ): Likewise. > (VSUBEUQM): Likewise. > (VSUBECUQ): Likewise. 
> > * config/rs6000/rs6000-c.c (__int128_type): New static to hold > __int128_t and __uint128_t types. > (__uint128_type): Likewise. > (altivec_categorize_keyword): Add support for vector __int128_t, > vector __uint128_t, vector __int128, and vector unsigned __int128 > as a container type for TImode operations that need to be done in > VSX/Altivec registers. > (rs6000_macro_to_expand): Likewise. > (altivec_overloaded_builtins): Add ISA 2.07 overloaded functions > to support 128-bit integer instructions vaddcuq, vadduqm, > vaddecuq, vaddeuqm, vsubcuq, vsubuqm, vsubecuq, vsubeuqm. > (altivec_resolve_overloaded_builtin): Add support for V1TImode. > > * config/rs6000/rs6000.c (rs6000_hard_regno_mode_ok): Add support > for V1TImode, and set up preferences to use VSX/Altivec > registers. Setup VSX reload handlers. > (rs6000_debug_reg_global): Likewise. > (rs6000_init_hard_regno_mode_ok): Likewise. > (rs6000_preferred_simd_mode): Likewise. > (vspltis_constant): Do not allow V1TImode as easy altivec > constants. > (easy_altivec_constant): Likewise. > (output_vec_const_move): Likewise. > (rs6000_expand_vector_set): Convert V1TImode set and extract to > simple move. > (rs6000_expand_vector_extract): Likewise. > (reg_offset_addressing_ok_p): Setup V1TImode to use VSX reg+reg > addressing. > (rs6000_const_vec): Add support for V1TImode. > (rs6000_emit_le_vsx_load): Swap double words when loading or > storing TImode/V1TImode. > (rs6000_emit_le_vsx_store): Likewise. > (rs6000_emit_le_vsx_move): Likewise. > (rs6000_emit_move): Add support for V1TImode. > (altivec_expand_ld_builtin): Likewise. > (altivec_expand_st_builtin): Likewise. > (altivec_expand_vec_init_builtin): Likewise. > (altivec_expand_builtin): Likewise. > (rs6000_init_builtins): Add support for V1TImode type. Add > support for ISA 2.07 128-bit integer builtins. Define type names > for the VSX/Altivec vector types. > (altivec_init_builtins): Add support for overloaded vector > functions with V1TImode type. 
> (rs6000_preferred_reload_class): Prefer Altivec registers for > V1TImode. > (rs6000_move_128bit_ok_p): Move 128-bit move/split validation to > external function. > (rs6000_split_128bit_ok_p): Likewise. > (rs6000_han
Re: [4.8, PATCH 26/26] Backport Power8 and LE support: Missing support
On Wed, Mar 19, 2014 at 3:35 PM, Bill Schmidt wrote: > Hi, > > This patch (diff-trunk-missing) backports some LE pieces that were found > not to have been backported from trunk to the IBM 4.8 branch until > relatively recently. > > Thanks, > Bill > > > 2014-03-19 Bill Schmidt > > Back port from trunk > 2013-04-25 Alan Modra > > PR target/57052 > * config/rs6000/rs6000.md (rotlsi3_internal7): Rename to > rotlsi3_internal7le and condition on !BYTES_BIG_ENDIAN. > (rotlsi3_internal8be): New BYTES_BIG_ENDIAN insn. > Repeat for many other rotate/shift and mask patterns using subregs. > Name lshiftrt insns. > (ashrdisi3_noppc64): Rename to ashrdisi3_noppc64be and condition > on WORDS_BIG_ENDIAN. > > 2013-06-07 Alan Modra > > * config/rs6000/rs6000.c (rs6000_option_override_internal): Don't > override user -mfp-in-toc. > (offsettable_ok_by_alignment): Consider just the current access > rather than the whole object, unless BLKmode. Handle > CONSTANT_POOL_ADDRESS_P constants that lack a decl too. > (use_toc_relative_ref): Allow CONSTANT_POOL_ADDRESS_P constants > for -mcmodel=medium. > * config/rs6000/linux64.h (SUBSUBTARGET_OVERRIDE_OPTIONS): Don't > override user -mfp-in-toc or -msum-in-toc. Default to > -mno-fp-in-toc for -mcmodel=medium. > > 2013-06-18 Alan Modra > > * config/rs6000/rs6000.h (enum data_align): New. > (LOCAL_ALIGNMENT, DATA_ALIGNMENT): Use rs6000_data_alignment. > (DATA_ABI_ALIGNMENT): Define. > (CONSTANT_ALIGNMENT): Correct comment. > * config/rs6000/rs6000-protos.h (rs6000_data_alignment): Declare. > * config/rs6000/rs6000.c (rs6000_data_alignment): New function. > > 2013-07-11 Ulrich Weigand > > * config/rs6000/rs6000.md ("*tls_gd_low"): > Require GOT register as additional operand in UNSPEC. > ("*tls_ld_low"): Likewise. > ("*tls_got_dtprel_low"): Likewise. > ("*tls_got_tprel_low"): Likewise. > ("*tls_gd"): Update splitter. > ("*tls_ld"): Likewise. > ("tls_got_dtprel_"): Likewise. > ("tls_got_tprel_"): Likewise.
> > 2014-01-23 Pat Haugen > > * config/rs6000/rs6000.c (rs6000_option_override_internal): Don't > force flag_ira_loop_pressure if set via command line. > > 2014-02-06 Alan Modra > > PR target/60032 > * config/rs6000/rs6000.c (rs6000_secondary_memory_needed_mode): Only > change SDmode to DDmode when lra_in_progress. Okay. Thanks, David
Re: [4.8, PATCH 27/26] Backport Power8 and LE support: Fixes for AIX test failures
On Wed, Apr 2, 2014 at 11:18 AM, Bill Schmidt wrote: > Hi, > > This patch (diff-aix) adds to the 4.8 PowerPC backport patch series with > a few backported fixes from trunk that repair test failures on AIX. > > Thanks, > Bill > > > [gcc] > > 2014-04-02 Bill Schmidt > > Backport from mainline r205308 > 2013-11-23 David Edelsohn > > * config/rs6000/rs6000.c (IN_NAMED_SECTION): New macro. > (rs6000_xcoff_select_section): Place decls with stricter alignment > into named sections. > (rs6000_xcoff_unique_section): Allow unique sections for > uninitialized data with strict alignment. > > [gcc/testsuite] > > 2014-04-02 Bill Schmidt > > Backport from mainline > 2013-04-05 David Edelsohn > > * gcc.target/powerpc/sd-vsx.c: Skip on AIX. > * gcc.target/powerpc/sd-pwr6.c: Same. Okay. Thanks, David
[PATCH] Fix PR c++/21113
Hi,

This patch fixes c++/21113 which reports that the C++ frontend does not
forbid jumps into the scope of identifiers with variably-modified types.
The patch simply augments decl_jump_unsafe() to disallow jumping into
blocks that initialize variably-modified decls.

I bootstrapped and regtested this change on x86_64-unknown-linux-gnu.

2014-04-03 Patrick Palka

    PR c++/21113
    * decl.c (decl_jump_unsafe): Consider variably-modified decls.
---
 gcc/cp/decl.c                    |  5 ++---
 gcc/testsuite/g++.dg/ext/vla14.C | 23 +++
 2 files changed, 25 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/ext/vla14.C

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 5bd33c5..6571af5 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -2785,9 +2785,8 @@ decl_jump_unsafe (tree decl)
       || type == error_mark_node)
     return 0;
 
-  type = strip_array_types (type);
-
-  if (DECL_NONTRIVIALLY_INITIALIZED_P (decl))
+  if (DECL_NONTRIVIALLY_INITIALIZED_P (decl)
+      || variably_modified_type_p (type, NULL_TREE))
     return 2;
 
   if (TYPE_HAS_NONTRIVIAL_DESTRUCTOR (TREE_TYPE (decl)))
diff --git a/gcc/testsuite/g++.dg/ext/vla14.C b/gcc/testsuite/g++.dg/ext/vla14.C
new file mode 100644
index 000..278cb63
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/vla14.C
@@ -0,0 +1,23 @@
+// PR c++/21113
+// { dg-options "" }
+
+void
+f (int n)
+{
+  goto label; // { dg-error "from here" }
+  int a[n]; // { dg-error "crosses initialization" }
+label: // { dg-error "jump to label" }
+  ;
+}
+
+void
+g (int n)
+{
+  switch (1)
+  {
+  case 1:
+    int (*a)[n]; // { dg-error "crosses initialization" }
+  default: // { dg-error "jump to case label" }
+    ;
+  }
+}
--
1.9.1
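The vla14.C test above only shows code that is meant to be rejected. As a complement, here is a minimal sketch of the usual way to make such a forward jump legal: confine the declaration to a nested block so that the jump target lies outside its scope. The sketch uses std::vector as a stand-in for the VLA, because VLAs are a GNU extension in C++; the function name and parameters are hypothetical and do not come from the patch.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical function, not part of the patch. The vector stands in for
// `int a[n]`; like the VLA case the patch adds, a goto may not jump forward
// into its scope past the initialization.
int sum_unless_skipped(int n, bool skip) {
    int total = 0;
    if (skip)
        goto done;  // legal: `values` is not in scope at `done`
    {
        std::vector<int> values(static_cast<std::size_t>(n), 1);
        for (int v : values)
            total += v;
    }               // `values` goes out of scope before the label
done:
    return total;
}
```

Removing the inner braces would put the label inside the scope of `values`, and the compiler would then reject the goto as crossing an initialization, which is the diagnostic the patch extends to variably-modified declarations.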
[PATCH] Fix PR c++/44613
Hi,

This patch fixes a wrong code issue in the code generated for VLAs in
the C++ frontend. This exact issue was fixed in the C frontend with
r85849, and this patch is essentially a port of r85849 for the C++
frontend.

The issue is that this C++ code:

    {
      foo:
      int x[n];
      f ();
    }

gets gimplified into this:

    {
      int x[n];
      void *saved_stack;

      saved_stack = __builtin_stack_save ();
      try
        {
          foo: // <-- jump to foo will bypass initialization of saved_stack
          x = alloca (...);
          f ();
        }
      finally
        {
          __builtin_stack_restore (saved_stack);
        }
    }

In order to ensure that labels such as "foo" that occur before the
initialization of a VLA are emitted in the right place by the
gimplifier, the C++ frontend is changed to handle the above C++ code as
if it looked like this:

    {
      foo:
      {
        int x[n];
        f ();
      }
    }

thereby forcing the label "foo" to be placed before the initialization
of saved_stack during gimplification. This is the same approach that
the C frontend uses (see r85849).

I bootstrapped and regtested this patch on x86_64-unknown-linux-gnu.

2014-04-03 Patrick Palka

    PR c++/44613
    * semantics.c (add_stmt): Set STATEMENT_LIST_HAS_LABEL.
    * decl.c (cp_finish_decl): Create a new BIND_EXPR before
    instantiating a variable-sized type.
---
 gcc/cp/decl.c                    | 19 ++-
 gcc/cp/semantics.c               |  3 +++
 gcc/testsuite/g++.dg/ext/vla15.C | 20
 3 files changed, 41 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/ext/vla15.C

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index f3a081b..5bd33c5 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -6441,7 +6441,24 @@ cp_finish_decl (tree decl, tree init, bool init_const_expr_p,
          after the call to check_initializer so that the DECL_EXPR for a
          reference temp is added before the DECL_EXPR for the reference itself. */
       if (DECL_FUNCTION_SCOPE_P (decl))
-        add_decl_expr (decl);
+        {
+          /* If we're building a variable sized type, and we might be
+             reachable other than via the top of the current binding
+             level, then create a new BIND_EXPR so that we deallocate
+             the object at the right time. */
+          if (VAR_P (decl)
+              && DECL_SIZE (decl)
+              && !TREE_CONSTANT (DECL_SIZE (decl))
+              && STATEMENT_LIST_HAS_LABEL (cur_stmt_list))
+            {
+              tree bind;
+              bind = build3 (BIND_EXPR, void_type_node, NULL, NULL, NULL);
+              TREE_SIDE_EFFECTS (bind) = 1;
+              add_stmt (bind);
+              BIND_EXPR_BODY (bind) = push_stmt_list ();
+            }
+          add_decl_expr (decl);
+        }
 
       /* Let the middle end know about variables and functions -- but not
          static data members in uninstantiated class templates. */
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index fb1e404..b00294e 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -386,6 +386,9 @@ add_stmt (tree t)
       STMT_IS_FULL_EXPR_P (t) = stmts_are_full_exprs_p ();
     }
 
+  if (code == LABEL_EXPR || code == CASE_LABEL_EXPR)
+    STATEMENT_LIST_HAS_LABEL (cur_stmt_list) = 1;
+
   /* Add T to the statement-tree. Non-side-effect statements need to be
      recorded during statement expressions. */
   gcc_checking_assert (!stmt_list_stack->is_empty ());
diff --git a/gcc/testsuite/g++.dg/ext/vla15.C b/gcc/testsuite/g++.dg/ext/vla15.C
new file mode 100644
index 000..feeb49f
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/vla15.C
@@ -0,0 +1,20 @@
+// PR c++/44613
+// { dg-do run }
+// { dg-options "" }
+
+void *volatile p;
+
+int
+main (void)
+{
+  int n = 0;
+  lab:;
+  int x[n % 1000 + 1];
+  x[0] = 1;
+  x[n % 1000] = 2;
+  p = x;
+  n++;
+  if (n < 100)
+    goto lab;
+  return 0;
+}
--
1.9.1
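The vla15.C test above re-enters a VLA declaration through a backward goto and expects the storage to be released each time around. The sketch below shows the analogous guarantee in standard C++, where jumping back past an initialized automatic variable destroys it before it is constructed again. It does not exercise the VLA stack save/restore path itself; the Tracked type and the function name are hypothetical stand-ins introduced only for illustration.

```cpp
// Hypothetical helper type, not from the patch: counts live instances so
// that the moment of release can be observed.
struct Tracked {
    static int live;
    static int max_live;
    Tracked() { ++live; if (live > max_live) max_live = live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;
int Tracked::max_live = 0;

// Same shape as the vla15.C test: a label, a declaration after it, and a
// backward goto that re-enters the declaration.
int peak_live_objects(int iterations) {
    int n = 0;
lab:;
    Tracked t;      // stands in for the VLA `int x[n % 1000 + 1]`
    ++n;
    if (n < iterations)
        goto lab;   // jumping back past `t` destroys it first
    return Tracked::max_live;
}
```

If the object were not released on each backward jump, the peak count would grow with the number of iterations. That is the behaviour the patch aims to guarantee for VLA storage by opening a new BIND_EXPR after the label.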
[PING][C++ Patch, 4.8] Backport fix for c++/54537 to FSF 4.8
I'd like to ping the following backport patch for the fix for PR54537. This did bootstrap and regtest with no regressions on powerpc64-linux. http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01148.html Peter
Re: [PING^8][PATCH] Add a couple of dialect and warning options regarding Objective-C instance variable scope
Still pinging.

On 03/28/2014 11:58 AM, Dimitris Papavasiliou wrote: Ping!

On 03/23/2014 03:20 AM, Dimitris Papavasiliou wrote: Ping!

On 03/13/2014 11:54 AM, Dimitris Papavasiliou wrote: Ping!

On 03/06/2014 07:44 PM, Dimitris Papavasiliou wrote: Ping!

On 02/27/2014 11:44 AM, Dimitris Papavasiliou wrote: Ping!

On 02/20/2014 12:11 PM, Dimitris Papavasiliou wrote: Hello all, Pinging this patch review request again. See previous messages quoted below for details. Regards, Dimitris

On 02/13/2014 04:22 PM, Dimitris Papavasiliou wrote: Hello, Pinging this patch review request. Can someone involved in the Objective-C language frontend have a quick look at the description of the proposed features and tell me if it'd be ok to have them in the trunk so I can go ahead and create proper patches? Thanks, Dimitris

On 02/06/2014 11:25 AM, Dimitris Papavasiliou wrote: Hello, This is a patch regarding a couple of Objective-C related dialect options and warning switches. I have already submitted it a while ago but gave up after pinging a couple of times. I am now informed that I should have kept pinging until I got someone's attention so I'm resending it. The patch is now against an old revision and as I stated originally it's probably not in a state that can be adopted as is. I'm sending it as is so that the implemented features can be assessed in terms of their usefulness and if they're welcome I'd be happy to make any necessary changes to bring it up-to-date, split it into smaller patches, add test-cases and anything else that is deemed necessary.

Here's the relevant text from my initial message: Two of these switches are related to a feature request I submitted a while ago, Bug 56044 (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56044). I won't reproduce the entire argument here since it is available in the feature request.
The relevant functionality in the patch comes in the form of two switches: -Wshadow-ivars which controls the "local declaration of ‘somevar’ hides instance variable" warning which curiously is enabled by default instead of being controlled at least by -Wshadow. The patch changes it so that this warning can be enabled and disabled specifically through -Wshadow-ivars as well as with all other shadowing-related warnings through -Wshadow. The reason for the extra switch is that, while searching through the Internet for a solution to this problem I have found out that other people are inconvenienced by this particular warning as well so it might be useful to be able to turn it off while keeping all the other shadowing-related warnings enabled. -flocal-ivars which when true, as it is by default, treats instance variables as having local scope. If false (-fno-local-ivars) instance variables must always be referred to as self->ivarname and references of ivarname resolve to the local or global scope as usual. I've also taken the opportunity of adding another switch unrelated to the above but related to instance variables: -fivar-visibility which can be set to either private, protected (the default), public and package. This sets the default instance variable visibility which normally is implicitly protected. My use-case for it is basically to be able to set it to public and thus effectively disable this visibility mechanism altogether which I find no use for and therefore have to circumvent. I'm not sure if anyone else feels the same way towards this but I figured it was worth a try. I'm attaching a preliminary patch against the current revision in case anyone wants to have a look. The changes are very small and any blatant mistakes should be immediately obvious. 
I have to admit to having virtually no knowledge of the internals of GCC but I have tried to keep in line with formatting guidelines and general style as well as looking up the particulars of the way options are handled in the available documentation to avoid blind copy-pasting. I have also tried to test the functionality both in my own (relatively large, or at least not too small) project and with small test programs and everything works as expected. Finally, I tried running the tests too but these fail to complete both in the patched and unpatched version, possibly due to the way I've configured GCC. Dimitris
Re: [committed, libjava] XFAIL sourcelocation (PR libgcj/55637) backported to 4.8.3
Thanks for the tip. What should I do now? Should I fix the ChangeLog entry and add a new one or do nothing? Dominique On 2 Apr 2014, at 12:47, Rainer Orth wrote: > domi...@lps.ens.fr (Dominique Dhumieres) writes: > >> r...@cebitec.uni-bielefeld.de (Rainer Orth) wrote: >>> Sure, patch preapproved. >> >> Committed as r208983: >> >> 2014-04-01 Dominique d'Humieres >>Rainer Orth >> >>PR libgcj/55637 >>* testsuite/libjava.lang/sourcelocation.xfail: New file. > > Btw, the customary format for such a ChangeLog entry is > > 2014-04-01 Dominique d'Humieres > > Backport from mainline > 2014-02-20 Rainer Orth > >PR libgcj/55637 >* testsuite/libjava.lang/sourcelocation.xfail: New file. > > This way, you can easily see when the original went in. > > Rainer > > -- > - > Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [committed, libjava] XFAIL sourcelocation (PR libgcj/55637) backported to 4.8.3
Hi Dominique, > Thanks for the tip. What should I do now? Should I fix the ChangeLog entry > and add a new one or do nothing? if you want, you could fix the ChangeLog entry in place, but don't add a new one for that change. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [PING][C++ Patch, 4.8] Backport fix for c++/54537 to FSF 4.8
On 03/04/14 10:25 -0500, Peter Bergner wrote: I'd like to ping the following backport patch for the fix for PR54537. This did bootstrap and regtest with no regressions on powerpc64-linux. http://gcc.gnu.org/ml/gcc-patches/2014-03/msg01148.html I don't know how risky the front-end change is, but if it gets approved then the library part is obviously fine. That said, my kneejerk reaction is if it's only really needed to allow inclusion of then my solution would be to not use that TR1 header!
Re: [gomp4] Add tables generation
On 04/02/2014 10:36 AM, Thomas Schwinge wrote: I see regressions in the libgomp testsuite for configurations where offloading is not enabled: spawn [...]/build/gcc/xgcc -B[...]/build/gcc/ [...]/source/libgomp/testsuite/libgomp.c/for-3.c -B[...]/build/x86_64-unknown-linux-gnu/./libgomp/ -B[...]/build/x86_64-unknown-linux-gnu/./libgomp/.libs -I[...]/build/x86_64-unknown-linux-gnu/./libgomp -I[...]/source/libgomp/testsuite/.. -fmessage-length=0 -fno-diagnostics-show-caret -fdiagnostics-color=never -fopenmp -std=gnu99 -fopenmp -L[...]/build/x86_64-unknown-linux-gnu/./libgomp/.libs -lm -o ./for-3.exe /tmp/ccGnT0ei.o: In function `main': for-3.c:(.text+0x21032): undefined reference to `__OPENMP_TARGET__' collect2: error: ld returned 1 exit status I suppose that's because [...] Workaround committed in r209015: libgcc/ * crtstuff.c [!ENABLE_OFFLOADING] (__OPENMP_TARGET__): Define to NULL. The patch below should be a better fix, making the references to __OPENMP_TARGET__ weak. Does this work for you? Bernd Index: gcc/omp-low.c === --- gcc/omp-low.c (revision 429741) +++ gcc/omp-low.c (working copy) @@ -221,6 +221,28 @@ static tree scan_omp_1_op (tree *, int * *handled_ops_p = false; \ break; +static GTY(()) tree offload_symbol_decl; + +/* Get the __OPENMP_TARGET__ symbol. */ +static tree +get_offload_symbol_decl (void) +{ + if (!offload_symbol_decl) +{ + tree decl = build_decl (UNKNOWN_LOCATION, VAR_DECL, + get_identifier ("__OPENMP_TARGET__"), + ptr_type_node); + TREE_PUBLIC (decl) = 1; + DECL_EXTERNAL (decl) = 1; + DECL_WEAK (decl) = 1; + DECL_ATTRIBUTES (decl) + = tree_cons (get_identifier ("weak"), + NULL_TREE, DECL_ATTRIBUTES (decl)); + offload_symbol_decl = decl; +} + return offload_symbol_decl; +} + /* Convenience function for calling scan_omp_1_op on tree operands. 
*/ static inline tree @@ -5148,11 +5170,7 @@ expand_oacc_offload (struct omp_region * } gimple g; - tree openmp_target -= build_decl (UNKNOWN_LOCATION, VAR_DECL, - get_identifier ("__OPENMP_TARGET__"), ptr_type_node); - TREE_PUBLIC (openmp_target) = 1; - DECL_EXTERNAL (openmp_target) = 1; + tree openmp_target = get_offload_symbol_decl (); tree fnaddr = build_fold_addr_expr (child_fn); g = gimple_build_call (builtin_decl_explicit (start_ix), 10, device, fnaddr, build_fold_addr_expr (openmp_target), @@ -8686,11 +8704,7 @@ expand_omp_target (struct omp_region *re } gimple g; - tree openmp_target -= build_decl (UNKNOWN_LOCATION, VAR_DECL, - get_identifier ("__OPENMP_TARGET__"), ptr_type_node); - TREE_PUBLIC (openmp_target) = 1; - DECL_EXTERNAL (openmp_target) = 1; + tree openmp_target = get_offload_symbol_decl (); if (kind == GF_OMP_TARGET_KIND_REGION) { tree fnaddr = build_fold_addr_expr (child_fn);
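For readers less familiar with weak references, here is a minimal sketch of the effect the patch aims for, written as ordinary C rather than as GCC tree construction. The symbol name example_offload_table is hypothetical (it stands in for __OPENMP_TARGET__), and the sketch assumes an ELF target built with GCC or Clang, where the address of an undefined weak symbol resolves to NULL instead of producing a link error.

```c
#include <stddef.h>

/* Hypothetical stand-in for __OPENMP_TARGET__.  Because the declaration
   is weak, the link still succeeds when no object file defines the
   symbol (assumption: ELF target, GCC or Clang).  */
extern void *example_offload_table __attribute__ ((weak));

/* Return the address of the table, or NULL when nothing in the link
   defined it.  Callers are expected to check for NULL before use.  */
void *
offload_table_or_null (void)
{
  return &example_offload_table;
}
```

In a program that never defines example_offload_table, offload_table_or_null () is expected to return NULL, which is how a runtime could tell that offloading tables are absent.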
Re: [gomp4] Add tables generation
2014-04-03 20:13 GMT+04:00 Bernd Schmidt : > The patch below should be a better fix, making the references to > > __OPENMP_TARGET__ weak. Does this work for you? Shouldn't we just remove __OPENMP_TARGET__ argument from GOMP_target, since we decided to pass it to GOMP_offload_register? -- Ilya
[PATCH] Initialize sanitizer builtins (PR sanitizer/60745)
Under certain circumstances the sanitizer builtins are not initialized properly and ubsan_instrument_return must make sure they are initialized. Otherwise builtin_decl_explicit returns NULL and we'll ICE in build_call_expr_loc_array. I'm not sure which other ubsan routines need a similar fix. No testcase attached since it's not trivial to reproduce this. Bootstrapped/ran ubsan testsuite on x86_64-linux, ok for trunk? 2014-04-03 Marek Polacek PR sanitizer/60745 * c-ubsan.c: Include asan.h. (ubsan_instrument_return): Call initialize_sanitizer_builtins. diff --git gcc/c-family/c-ubsan.c gcc/c-family/c-ubsan.c index dc4d981..9d2403c 100644 --- gcc/c-family/c-ubsan.c +++ gcc/c-family/c-ubsan.c @@ -29,6 +29,7 @@ along with GCC; see the file COPYING3. If not see #include "ubsan.h" #include "c-family/c-common.h" #include "c-family/c-ubsan.h" +#include "asan.h" /* Instrument division by zero and INT_MIN / -1. If not instrumenting, return NULL_TREE. */ @@ -185,6 +186,8 @@ ubsan_instrument_vla (location_t loc, tree size) tree ubsan_instrument_return (location_t loc) { + initialize_sanitizer_builtins (); + tree data = ubsan_create_data ("__ubsan_missing_return_data", &loc, NULL, NULL_TREE); tree t = builtin_decl_explicit (BUILT_IN_UBSAN_HANDLE_MISSING_RETURN); Marek
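The fix above follows an "initialize on first use" pattern. The following is only a sketch of that pattern in plain C, assuming the initializer is safe to call repeatedly; builtin_table, initialize_builtins_once, lookup_builtin and lookup_builtin_unchecked are hypothetical names and are not GCC's API.

```c
#include <stdbool.h>
#include <stddef.h>

#define N_BUILTINS 2

/* Hypothetical table of builtins; entries stay NULL until initialized.  */
static const char *builtin_table[N_BUILTINS];
static bool builtins_initialized;

/* Idempotent initializer: safe to call from every entry point.  */
static void
initialize_builtins_once (void)
{
  if (builtins_initialized)
    return;
  builtin_table[0] = "handle_missing_return";
  builtin_table[1] = "handle_vla_bound";
  builtins_initialized = true;
}

/* Lookup that trusts someone else to have initialized the table;
   returns NULL if nobody did (the situation behind the ICE).  */
const char *
lookup_builtin_unchecked (int idx)
{
  return builtin_table[idx];
}

/* Lookup that makes sure the table is populated first.  */
const char *
lookup_builtin (int idx)
{
  initialize_builtins_once ();
  return builtin_table[idx];
}
```

The unchecked lookup returns NULL until some caller has gone through lookup_builtin, which mirrors why the instrumentation routine calls the initializer itself instead of assuming an earlier pass did.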
Re: [PATCH, ARM] Fix PR60609 (Error: value of 256 too large for field of 1 bytes)
On Thu, Apr 3, 2014 at 2:27 PM, Charles Baylis wrote: > Hi > > This bug causes the compiler to create a Thumb-2 TBB instruction with > a jump table containing an out of range value in a .byte field: > > whatever.s:148: Error: value of 256 too large for field of 1 bytes at 100 > > This occurs because the jump table is followed with a ".align 1" due > to ASM_OUTPUT_CASE_END, but the 'shorten' phase does not account for > the space taken by this align directive. My first reaction is to wonder why this is not a bug in the "shorten" phase. > > This patch addresses the issue by removing ASM_OUTPUT_CASE_END from > arm.h, and ensuring that the alignment after an ADDR_DIFF_VEC is > instead inserted by aligning the label following the barrier which > follows it. This is achieved by defining LABEL_ALIGN_AFTER_BARRIER > appropriately. At first glance this feels like a blunt hammer: what's the code size bloat from putting out such an alignment after each barrier that the compiler emits, rather than tracking this in ASM_OUTPUT_CASE_END? I'll try and have a look at this again tomorrow morning. regards Ramana > > Bootstrapped/checked on arm-unknown-linux-gnueabihf. > > OK for trunk, and backporting to 4.8? > > > > 2014-04-02 Charles Baylis > > PR target/60609 > * config/arm/arm.h (ASM_OUTPUT_CASE_END) Remove. > (LABEL_ALIGN_AFTER_BARRIER) Align barriers which occur after > ADDR_DIFF_VEC. > > > 2014-04-02 Charles Baylis > > PR target/60609 > * g++.dg/torture/pr60609.C: New test.
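To make the assembler error concrete, here is a small sketch of the range check involved. It assumes the usual Thumb-2 TBB encoding, in which each table entry is one unsigned byte holding half of the forward branch distance; fits_tbb_entry is a hypothetical helper, not a GCC function.

```c
#include <stdbool.h>

/* Assumed TBB encoding: a table entry is an unsigned byte that stores
   (forward distance in bytes) / 2, so the largest encodable distance
   is 2 * 255 = 510 bytes.  */
bool
fits_tbb_entry (unsigned distance_in_bytes)
{
  return distance_in_bytes % 2 == 0 && distance_in_bytes / 2 <= 255;
}
```

Under that assumption a distance of 510 bytes still fits, while padding that pushes a target out to 512 bytes would need the entry value 256, which matches the "value of 256 too large for field of 1 bytes" message quoted above.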
Re: [gomp4] Add tables generation
On 04/03/2014 06:53 PM, Ilya Verbin wrote: 2014-04-03 20:13 GMT+04:00 Bernd Schmidt : The patch below should be a better fix, making the references to > __OPENMP_TARGET__ weak. Does this work for you? Shouldn't we just remove __OPENMP_TARGET__ argument from GOMP_target, since we decided to pass it to GOMP_offload_register? I thought it was used to look up the right function? With shared libraries you'd get multiple __OPENMP_TARGET__ tables. Bernd
Re: [PATCH] PowerPC, PR60735: _Decimal64 moves broken on -mspe
On Tue, Apr 1, 2014 at 7:55 PM, Michael Meissner wrote: > In backporting the power8 changes to the 4.8 branch, one of the testers of > these patches noticed that libgcc cannot be built on a linux SPE target. The > reason was the _Decimal64 type did not have a proper move insn in the SPE > environment. This patch fixes that issue. In looking at the patch, I > discovered two other thinkos that are fixed in this patch. > > The first problem is the movdf/movdd insns for 32-bit without hardware > floating > point, checked whether we had hardware single precision support, when it > should > have been checking that we had hardware double precision support. > > The second problem was that some of the types believed they could use the > floating point registers in a SPE or software emulation environment. So I > added additional code to turn off the use of the FPRs in this case. > > I have done bootstraps and make check on 64-bit PowerPC linux systems with no > regression. In addition, I tested the code generated using cross compilers to > the Linux SPE system. Is this patch acceptable to be checked in the trunk > (and > to the 4.8 branch when the other patches are approved)? Mike, Can you work with Edmar and Rohit to create a testcase for the GCC testsuite as well? Thanks, David
Re: [gomp4] Add tables generation
2014-04-03 21:06 GMT+04:00 Bernd Schmidt : > On 04/03/2014 06:53 PM, Ilya Verbin wrote: >> >> 2014-04-03 20:13 GMT+04:00 Bernd Schmidt : >>> >>> The patch below should be a better fix, making the references to > >>> __OPENMP_TARGET__ weak. Does this work for you? >> >> >> Shouldn't we just remove __OPENMP_TARGET__ argument from GOMP_target, >> since we decided to pass it to GOMP_offload_register? > > > I thought it was used to look up the right function? With shared libraries > you'd get multiple __OPENMP_TARGET__ tables. > > > Bernd > Yes, initially the idea was to use it to look up the right function. But now each DSO will call GOMP_offload_register and pass a unique pointer to __OPENMP_TARGET__ (host_table) for this DSO. Then gomp_register_images_for_device registers all these host tables in the plugin. And when libgomp calls device_get_table_func, the plugin returns the joint table for all DSOs. -- Ilya
Re: [gomp4] Add tables generation
On 04/03/2014 07:25 PM, Ilya Verbin wrote: Yes, initially the idea was to use it to look up the right function. But now each DSO will call GOMP_offload_register and pass a unique pointer to __OPENMP_TARGET__ (host_table) for this DSO. Then gomp_register_images_for_device registers all these host tables in the plugin. And when libgomp calls device_get_table_func, the plugin returns the joint table for all DSOs. Why make a joint table? It seems better to use the __OPENMP_TARGET__ symbol to restrict lookups to the subset of symbols that could actually be found. BTW, I still expect that the lookup by ordering will turn out to be fundamentally unreliable and we'll need to use the unique id patch I posted a while ago. In that case using __OPENMP_TARGET__ as a first order key for the lookups eliminates any problem with duplicate names across multiple libraries. Bernd
Re: [gomp4] Add tables generation
2014-04-03 21:28 GMT+04:00 Bernd Schmidt : > On 04/03/2014 07:25 PM, Ilya Verbin wrote: >> >> Yes, initially the idea was to use it to look up the right function. >> But now each DSO will call GOMP_offload_register and pass a unique >> pointer to __OPENMP_TARGET__ (host_table) for this DSO. Then >> gomp_register_images_for_device registers all these host tables in the >> plugin. And when libgomp calls device_get_table_func, the plugin >> returns the joint table for all DSOs. > > > Why make a joint table? It seems better to use the __OPENMP_TARGET__ symbol > to restrict lookups to the subset of symbols that could actually be found. > BTW, I still expect that the lookup by ordering will turn out to be > fundamentally unreliable and we'll need to use the unique id patch I posted > a while ago. In that case using __OPENMP_TARGET__ as a first order key for > the lookups eliminates any problem with duplicate names across multiple > libraries. > > > Bernd > In the current implementation each gomp_device_descr contains one dev_splay_tree, and all addresses are inserted into this splay tree. There is no need to restrict the lookup, because the addresses from multiple DSOs can't overlap. -- Ilya
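The argument that one joint table suffices rests on host address ranges from different DSOs never overlapping, so an address alone identifies its entry. The sketch below illustrates that idea with a sorted array and binary search instead of the splay tree the message mentions; struct host_range, its dso_id field and find_host_range are hypothetical and not part of libgomp.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry of a joint table covering all DSOs.  */
struct host_range
{
  uintptr_t start;  /* first host address of the object */
  uintptr_t end;    /* one past the last host address */
  int dso_id;       /* illustrative tag: which DSO registered it */
};

/* Binary search over entries sorted by start address.  This relies on
   the ranges not overlapping, which is the property claimed for
   addresses coming from different DSOs.  */
const struct host_range *
find_host_range (const struct host_range *tab, size_t n, uintptr_t addr)
{
  size_t lo = 0, hi = n;

  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;

      if (addr < tab[mid].start)
        hi = mid;
      else if (addr >= tab[mid].end)
        lo = mid + 1;
      else
        return &tab[mid];
    }
  return NULL;
}
```

Note that this only covers lookup by address; it says nothing about the lookup-by-ordering concern raised earlier in the thread.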
Re: [PATCH] PowerPC, PR60735: _Decimal64 moves broken on -mspe
On Thu, Apr 03, 2014 at 01:24:25PM -0400, David Edelsohn wrote: > On Tue, Apr 1, 2014 at 7:55 PM, Michael Meissner > wrote: > > In backporting the power8 changes to the 4.8 branch, one of the testers of > > these patches noticed that libgcc cannot be built on a linux SPE target. > > The > > reason was the _Decimal64 type did not have a proper move insn in the SPE > > environment. This patch fixes that issue. In looking at the patch, I > > discovered two other thinkos that are fixed in this patch. > > > > The first problem is the movdf/movdd insns for 32-bit without hardware > > floating > > point, checked whether we had hardware single precision support, when it > > should > > have been checking that we had hardware double precision support. > > > > The second problem was that some of the types believed they could use the > > floating point registers in a SPE or software emulation environment. So I > > added additional code to turn off the use of the FPRs in this case. > > > > I have done bootstraps and make check on 64-bit PowerPC linux systems with > > no > > regression. In addition, I tested the code generated using cross compilers > > to > > the Linux SPE system. Is this patch acceptable to be checked in the trunk > > (and > > to the 4.8 branch when the other patches are approved)? > > Mike, > > Can you work with Edmar and Rohit to create a testcase for the GCC > testsuite as well? Sure, but I won't be able to run it under the test suite. -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
[4.8, PATCH 28/26] Backport Power8 and LE support: Fix for SPE (PR60735)
Hi, This patch (diff-pr60735) adds to the 4.8 PowerPC backport patch series with a backported fix for PR60735, an unrecognized insn problem for SPE. Thanks, Bill [gcc] 2014-04-03 Bill Schmidt Back port mainline subversion id 209025. 2014-04-02 Michael Meissner PR target/60735 * config/rs6000/rs6000.c (rs6000_hard_regno_mode_ok): If we have software floating point or no floating point registers, do not allow any type in the FPRs. Eliminate a test for SPE SIMD types in GPRs that occurs after we tested for GPRs that would never be true. * config/rs6000/rs6000.md (mov_softfloat32, FMOVE64): Rewrite tests to use TARGET_DOUBLE_FLOAT and TARGET_E500_DOUBLE, since the FMOVE64 type is DFmode/DDmode. If TARGET_E500_DOUBLE, specifically allow DDmode, since that does not use the SPE SIMD instructions. Index: gcc-4_8-test2/gcc/config/rs6000/rs6000.c === --- gcc-4_8-test2.orig/gcc/config/rs6000/rs6000.c +++ gcc-4_8-test2/gcc/config/rs6000/rs6000.c @@ -1733,6 +1733,9 @@ rs6000_hard_regno_mode_ok (int regno, en modes and DImode. */ if (FP_REGNO_P (regno)) { + if (TARGET_SOFT_FLOAT || !TARGET_FPRS) + return 0; + if (SCALAR_FLOAT_MODE_P (mode) && (mode != TDmode || (regno % 2) == 0) && FP_REGNO_P (last_regno)) @@ -1761,10 +1764,6 @@ rs6000_hard_regno_mode_ok (int regno, en return (VECTOR_MEM_ALTIVEC_OR_VSX_P (mode) || mode == V1TImode); - /* ...but GPRs can hold SIMD data on the SPE in one register. */ - if (SPE_SIMD_REGNO_P (regno) && TARGET_SPE && SPE_VECTOR_MODE (mode)) -return 1; - /* We cannot put non-VSX TImode or PTImode anywhere except general register and it must be able to fit within the register set. */ Index: gcc-4_8-test2/gcc/config/rs6000/rs6000.md === --- gcc-4_8-test2.orig/gcc/config/rs6000/rs6000.md +++ gcc-4_8-test2/gcc/config/rs6000/rs6000.md @@ -9428,8 +9428,9 @@ [(set (match_operand:FMOVE64 0 "nonimmediate_operand" "=Y,r,r,r,r,r") (match_operand:FMOVE64 1 "input_operand" "r,Y,r,G,H,F"))] "! 
TARGET_POWERPC64 - && ((TARGET_FPRS && TARGET_SINGLE_FLOAT) - || TARGET_SOFT_FLOAT || TARGET_E500_SINGLE) + && ((TARGET_FPRS && TARGET_DOUBLE_FLOAT) + || TARGET_SOFT_FLOAT + || (mode == DDmode && TARGET_E500_DOUBLE)) && (gpc_reg_operand (operands[0], mode) || gpc_reg_operand (operands[1], mode))" "#"
[4.8, PATCH 29/26] Backport Power8 and LE support: Document vec_vgbbd
Hi, This patch (diff-vecdoc) is the last addition to the 4.8 PowerPC backport patch series. It simply adds some missing documentation that should have been part of one of the previous patches. I'm currently doing one more quick round of testing with the three late-addition patches, and will then be ready to commit the series. Thanks, Bill [gcc] 2014-04-03 Bill Schmidt Back port from main line: 2014-04-01 Michael Meissner * doc/extend.texi (PowerPC AltiVec/VSX Built-in Functions): Document vec_vgbbd. Index: gcc-4_8-test3/gcc/doc/extend.texi === --- gcc-4_8-test3.orig/gcc/doc/extend.texi +++ gcc-4_8-test3/gcc/doc/extend.texi @@ -14132,6 +14132,9 @@ vector unsigned short vec_vclzh (vector vector int vec_vclzw (vector int); vector unsigned int vec_vclzw (vector int); +vector signed char vec_vgbbd (vector signed char); +vector unsigned char vec_vgbbd (vector unsigned char); + vector long long vec_vmaxsd (vector long long, vector long long); vector unsigned long long vec_vmaxud (vector unsigned long long,
Re: [GOOGLE] Updates SSA after VPT transformations in AFDO pass
looks fine. David On Thu, Apr 3, 2014 at 10:56 AM, Dehao Chen wrote: > This patch updates SSA after VPT transformation. This is needed > because compute_inline_parameters will ICE without updated SSA. > > Testing on-going. > > OK for google-4_8? > > Thanks, > Dehao > > Index: gcc/auto-profile.c > === > --- gcc/auto-profile.c (revision 209059) > +++ gcc/auto-profile.c (working copy) > @@ -1448,6 +1448,7 @@ afdo_vpt_for_early_inline (stmt_set *promoted_stmt >free_dominance_info (CDI_POST_DOMINATORS); >calculate_dominance_info (CDI_POST_DOMINATORS); >calculate_dominance_info (CDI_DOMINATORS); > + update_ssa (TODO_update_ssa); >rebuild_cgraph_edges (); >return true; > }
[GOOGLE] Updates SSA after VPT transformations in AFDO pass
This patch updates SSA after VPT transformation. This is needed because compute_inline_parameters will ICE without updated SSA. Testing on-going. OK for google-4_8? Thanks, Dehao Index: gcc/auto-profile.c === --- gcc/auto-profile.c (revision 209059) +++ gcc/auto-profile.c (working copy) @@ -1448,6 +1448,7 @@ afdo_vpt_for_early_inline (stmt_set *promoted_stmt free_dominance_info (CDI_POST_DOMINATORS); calculate_dominance_info (CDI_POST_DOMINATORS); calculate_dominance_info (CDI_DOMINATORS); + update_ssa (TODO_update_ssa); rebuild_cgraph_edges (); return true; }
[Fortran-CAF, committed] Add array sending support for coarrays
This patch handles assigning to coarray arrays (and array sections) from local arrays, for both array RHS and scalar RHS. I have lightly tested it with libcaf_single. On the library side, I added a minimal implementation for libcaf_single, which handles only rank==1 arrays, but which otherwise seems to work. With that patch, the most common cases for sending should be handled. Still missing for sending to remote images: character strings, type conversion (i.e. assigning a real to an integer or similar), allocatable/pointer components of coarrays, and array vector sections are not yet handled. And, of course, reading from remote coarrays ("get", "pull") is not supported. Built on x86-64-gnu-linux and committed to the branch as Rev. 209060. Tobias PS: Minimal test case to be run with "gfortran -fdump-tree-original -fcoarray=single -lcaf_single": integer :: foo(5)[*] integer :: bar(5) bar = [1,2,3,4,5] foo(:)[1] = bar print *, foo foo(:)[1] = 45 print *, foo end gcc/fortran/ChangeLog.fortran-caf |9 + gcc/fortran/trans-decl.c | 15 +++- gcc/fortran/trans-intrinsic.c | 34 +-- gcc/fortran/trans.h |2 + libgfortran/ChangeLog.fortran-caf | 13 +++ libgfortran/caf/libcaf.h | 34 +++ libgfortran/caf/single.c | 67 ++ 7 files changed, 163 insertions(+), 11 deletions(-) Index: libgfortran/ChangeLog.fortran-caf === --- libgfortran/ChangeLog.fortran-caf (Revision 208931) +++ libgfortran/ChangeLog.fortran-caf (Arbeitskopie) @@ -1,3 +1,16 @@ +2014-04-03 Tobias Burnus + + * caf/libcaf.h (descriptor_dimension, gfc_descriptor_t): New + structs. + (GFC_MAX_DIMENSIONS, GFC_DTYPE_RANK_MASK, GFC_DTYPE_TYPE_SHIFT, + GFC_DTYPE_TYPE_MASK, GFC_DTYPE_SIZE_SHIFT, GFC_DESCRIPTOR_RANK, + GFC_DESCRIPTOR_TYPE, GFC_DESCRIPTOR_SIZE): New defines. + (_gfortran_caf_send_desc, _gfortran_caf_send_desc_scalar): New + prototypes. + * caf/single.c (_gfortran_caf_send_desc, + _gfortran_caf_send_desc_scalar): New functions, supporting + rank == 1 only. 
+ 2014-03-14 Tobias Burnus * caf/libcaf.h (caf_token_t): New typedef. Index: libgfortran/caf/libcaf.h === --- libgfortran/caf/libcaf.h (Revision 208931) +++ libgfortran/caf/libcaf.h (Arbeitskopie) @@ -58,6 +58,38 @@ caf_register_t; typedef void* caf_token_t; + +/* GNU Fortran's array descriptor. Keep in sync with libgfortran.h. */ + +typedef struct descriptor_dimension +{ + ptrdiff_t _stride; + ptrdiff_t lower_bound; + ptrdiff_t _ubound; +} +descriptor_dimension; + +typedef struct gfc_descriptor_t { + void *base_addr; + size_t offset; + ptrdiff_t dtype; + descriptor_dimension dim[]; +} gfc_descriptor_t; + + +#define GFC_MAX_DIMENSIONS 7 + +#define GFC_DTYPE_RANK_MASK 0x07 +#define GFC_DTYPE_TYPE_SHIFT 3 +#define GFC_DTYPE_TYPE_MASK 0x38 +#define GFC_DTYPE_SIZE_SHIFT 6 +#define GFC_DESCRIPTOR_RANK(desc) ((desc)->dtype & GFC_DTYPE_RANK_MASK) +#define GFC_DESCRIPTOR_TYPE(desc) (((desc)->dtype & GFC_DTYPE_TYPE_MASK) \ + >> GFC_DTYPE_TYPE_SHIFT) +#define GFC_DESCRIPTOR_SIZE(desc) ((desc)->dtype >> GFC_DTYPE_SIZE_SHIFT) + + + /* Linked list of static coarrays registered. */ typedef struct caf_static_t { caf_token_t token; @@ -77,6 +109,8 @@ void *_gfortran_caf_register (size_t, caf_register void _gfortran_caf_deregister (caf_token_t *, int *, char *, int); void _gfortran_send (caf_token_t, size_t, int, void *, size_t, bool); +void _gfortran_send_desc (caf_token_t, size_t, int, gfc_descriptor_t*, gfc_descriptor_t*, bool); +void _gfortran_send_desc_scalar (caf_token_t, size_t, int, gfc_descriptor_t*, void*, bool); void _gfortran_caf_sync_all (int *, char *, int); void _gfortran_caf_sync_images (int, int[], int *, char *, int); Index: libgfortran/caf/single.c === --- libgfortran/caf/single.c (Revision 208931) +++ libgfortran/caf/single.c (Arbeitskopie) @@ -149,6 +149,7 @@ _gfortran_caf_deregister (caf_token_t *token, int *stat = 0; } +/* Send scalar (or contiguous) data from buffer to a remote image. 
*/ void _gfortran_caf_send (caf_token_t token, size_t offset, @@ -161,7 +162,73 @@ _gfortran_caf_send (caf_token_t token, size_t offs } +/* Send array data from src to dest on a remote image. */ + void +_gfortran_caf_send_desc (caf_token_t token, size_t offset, + int image_id __attribute__ ((unused)), + gfc_descriptor_t *dest, gfc_descriptor_t *src, + bool asyn __attribute__ ((unused))) +{ + fprintf (stderr, "COARRAY ERROR: Array communication " + "[_gfortran_caf_send_desc] not yet implemented for rank /= 0"); + exit (EXIT_FAILURE); + size_t i, j; + size_t size = GFC_DESCRIPTOR_SIZE (dest); + int rank = GFC_DESCRIPTOR_RANK (dest); + + if (ran
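The descriptor macros added to libcaf.h in this patch pack rank, type and element size into the single dtype field. The sketch below re-states the mask and shift values from the patch and shows how the three parts are recovered; pack_dtype and the dtype_* accessors are hypothetical helpers written for illustration, not functions from libgfortran.

```c
#include <stddef.h>

/* Mask and shift values as they appear in the patch to libcaf.h.  */
#define GFC_DTYPE_RANK_MASK 0x07
#define GFC_DTYPE_TYPE_SHIFT 3
#define GFC_DTYPE_TYPE_MASK 0x38
#define GFC_DTYPE_SIZE_SHIFT 6

/* Hypothetical helper: build a dtype from its three parts.  */
ptrdiff_t
pack_dtype (int rank, int type, ptrdiff_t elem_size)
{
  return (elem_size << GFC_DTYPE_SIZE_SHIFT)
	 | ((ptrdiff_t) type << GFC_DTYPE_TYPE_SHIFT)
	 | (ptrdiff_t) rank;
}

/* Low three bits: array rank.  */
int
dtype_rank (ptrdiff_t dtype)
{
  return (int) (dtype & GFC_DTYPE_RANK_MASK);
}

/* Next three bits: type code.  */
int
dtype_type (ptrdiff_t dtype)
{
  return (int) ((dtype & GFC_DTYPE_TYPE_MASK) >> GFC_DTYPE_TYPE_SHIFT);
}

/* Remaining high bits: element size in bytes.  */
ptrdiff_t
dtype_size (ptrdiff_t dtype)
{
  return dtype >> GFC_DTYPE_SIZE_SHIFT;
}
```

For example, a rank-1 array with type code 1 and 4-byte elements packs to (4 << 6) | (1 << 3) | 1 = 265, and the accessors recover 1, 1 and 4 from that value.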
[Fortran-caf] Merge from the trunk to the branch
Committed to the fortran-caf branch as Rev. 209062 Tobias
[PR target/60657] [P1 regression] Fix operand predicates for a few ARM insns
As noted in the PR, there are a few insns in the ARM backend which use const_int_operand as a predicate, but which have constraints like "I" or "M". With the predicate accepting all constants, it's possible for a pass such as combine to create an insn where the constant operand matches the loose predicate, but will not match the tighter constraint. With no other alternatives to choose from, lra/reload won't be able to fix up the insn. The right way (IMHO) is to tighten the predicate in these cases. This patch introduces const_int_I_operand and const_int_M_operand. Bootstrapped on arm7l-unknown-linux-gnu (without java which fails for unrelated reasons) and regression tested. One system didn't have GDB installed, so the atomic and guality tests were noisy and due to time constraints, I haven't re-run them. OK for the trunk? diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 8d0c021..6c170d3 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,15 @@ +2014-04-03 Jeff Law + +PR target/60657 + * arm/predicates.md (const_int_I_operand): New predicate. + (const_int_M_operand): Similarly. + * arm/arm.md (insv_zero): Use const_int_M_operand instead of + const_int_operand. + (insv_t2, extv_reg, extzv_t2): Likewise. + (load_multiple_with_writeback): Similarly for const_int_I_operand. + (pop_multiple_with_writeback_and_return): Likewise. 
+ (vfp_pop_multiple_with_writeback): Likewise + 2014-04-03 Richard Biener * tree-streamer.h (struct streamer_tree_cache_d): Add next_idx diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index 4df24a2..4b81ee2 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -2784,8 +2784,8 @@ (define_insn "insv_zero" [(set (zero_extract:SI (match_operand:SI 0 "s_register_operand" "+r") - (match_operand:SI 1 "const_int_operand" "M") - (match_operand:SI 2 "const_int_operand" "M")) + (match_operand:SI 1 "const_int_M_operand" "M") + (match_operand:SI 2 "const_int_M_operand" "M")) (const_int 0))] "arm_arch_thumb2" "bfc%?\t%0, %2, %1" @@ -2797,8 +2797,8 @@ (define_insn "insv_t2" [(set (zero_extract:SI (match_operand:SI 0 "s_register_operand" "+r") - (match_operand:SI 1 "const_int_operand" "M") - (match_operand:SI 2 "const_int_operand" "M")) + (match_operand:SI 1 "const_int_M_operand" "M") + (match_operand:SI 2 "const_int_M_operand" "M")) (match_operand:SI 3 "s_register_operand" "r"))] "arm_arch_thumb2" "bfi%?\t%0, %3, %2, %1" @@ -4480,8 +4480,8 @@ (define_insn "*extv_reg" [(set (match_operand:SI 0 "s_register_operand" "=r") (sign_extract:SI (match_operand:SI 1 "s_register_operand" "r") - (match_operand:SI 2 "const_int_operand" "M") - (match_operand:SI 3 "const_int_operand" "M")))] + (match_operand:SI 2 "const_int_M_operand" "M") + (match_operand:SI 3 "const_int_M_operand" "M")))] "arm_arch_thumb2" "sbfx%?\t%0, %1, %3, %2" [(set_attr "length" "4") @@ -4493,8 +4493,8 @@ (define_insn "extzv_t2" [(set (match_operand:SI 0 "s_register_operand" "=r") (zero_extract:SI (match_operand:SI 1 "s_register_operand" "r") - (match_operand:SI 2 "const_int_operand" "M") - (match_operand:SI 3 "const_int_operand" "M")))] + (match_operand:SI 2 "const_int_M_operand" "M") + (match_operand:SI 3 "const_int_M_operand" "M")))] "arm_arch_thumb2" "ubfx%?\t%0, %1, %3, %2" [(set_attr "length" "4") @@ -12073,7 +12073,7 @@ [(match_parallel 0 "load_multiple_operation" [(set (match_operand:SI 1 
"s_register_operand" "+rk") (plus:SI (match_dup 1) - (match_operand:SI 2 "const_int_operand" "I"))) + (match_operand:SI 2 "const_int_I_operand" "I"))) (set (match_operand:SI 3 "s_register_operand" "=rk") (mem:SI (match_dup 1))) ])] @@ -12102,7 +12102,7 @@ [(return) (set (match_operand:SI 1 "s_register_operand" "+rk") (plus:SI (match_dup 1) - (match_operand:SI 2 "const_int_operand" "I"))) + (match_operand:SI 2 "const_int_I_operand" "I"))) (set (match_operand:SI 3 "s_register_operand" "=rk") (mem:SI (match_dup 1))) ])] @@ -12155,7 +12155,7 @@ [(match_parallel 0 "pop_multiple_fp" [(set (match_operand:SI 1 "s_register_operand" "+rk") (plus:SI (match_dup 1) - (match_operand:SI 2 "const_int_operand" "I"))) + (match_operand:SI 2 "const_int_I_operand" "I"))) (set (match_operand:DF 3 "vfp_hard_register_operand" "") (mem:DF (match_dup 1)))])] "
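The predicate/constraint mismatch described in this message can be modelled in a few lines of C. This is only an illustration: it assumes the ARM "M" constraint means an integer in the range 0 to 32, and loose_predicate, tight_predicate_M, constraint_M and insn_is_stuck are hypothetical names that do not exist in GCC.

```c
#include <stdbool.h>

/* Loose predicate: accepts every integer constant, in the way
   const_int_operand does.  */
bool
loose_predicate (long value)
{
  (void) value;
  return true;
}

/* Assumed meaning of the ARM "M" constraint: an integer in 0..32.  */
bool
constraint_M (long value)
{
  return value >= 0 && value <= 32;
}

/* Tight predicate in the spirit of const_int_M_operand: it accepts
   exactly what the constraint accepts.  */
bool
tight_predicate_M (long value)
{
  return constraint_M (value);
}

/* An insn is "stuck" when the predicate lets a constant through but the
   only available constraint rejects it; there is then nothing left for
   reload to pick.  */
bool
insn_is_stuck (bool (*predicate) (long), long value)
{
  return predicate (value) && !constraint_M (value);
}
```

With the loose predicate a constant such as 40 gets stuck, whereas the tight predicate rejects it up front, so a pass like combine would never form the insn in the first place.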
[RFA jit] clear timevar_enable in timevar_print
The timevar module doesn't properly re-initialize timevar_print between invocations of the compiler. In particular, if the compiler is put into verbose mode, and subsequently put back into quiet mode, then timevar_enable is never set to false -- leading to unwanted timevar display. This patch fixes the problem by clearing timevar_enable in timevar_print. --- gcc/ChangeLog.jit | 4 gcc/timevar.c | 2 ++ 2 files changed, 6 insertions(+) diff --git a/gcc/ChangeLog.jit b/gcc/ChangeLog.jit index 5145cf9..6ef9794 100644 --- a/gcc/ChangeLog.jit +++ b/gcc/ChangeLog.jit @@ -1,5 +1,9 @@ 2014-03-24 Tom Tromey + * timevar.c (timevar_print): Clear timevar_enable. + +2014-03-24 Tom Tromey + * toplev.c (general_init): Initialize input_location. * input.c (input_location): Initialize to UNKNOWN_LOCATION. diff --git a/gcc/timevar.c b/gcc/timevar.c index 2ceee51..5e4c4c49 100644 --- a/gcc/timevar.c +++ b/gcc/timevar.c @@ -491,6 +491,8 @@ timevar_print (FILE *fp) if (!timevar_enable) return; + // Clean up for a possible next run. + timevar_enable = false; /* Update timing information in case we're calling this from GDB. */ -- 1.9.0
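The underlying issue is global state that survives from one in-process compilation to the next. A minimal sketch of the "clear the flag when the report is printed" approach follows; report_enabled, enable_report and print_report are hypothetical stand-ins for timevar_enable and timevar_print, and the real function does considerably more work.

```c
#include <stdbool.h>
#include <stdio.h>

/* Stands in for timevar_enable: set when a run asks for a report.  */
static bool report_enabled;

void
enable_report (void)
{
  report_enabled = true;
}

/* Print the report if one was requested, then clear the flag so that a
   later run in the same process starts out quiet again.  Returns true
   when something was printed.  */
bool
print_report (FILE *fp)
{
  if (!report_enabled)
    return false;

  /* Clean up for a possible next run.  */
  report_enabled = false;

  fprintf (fp, "timing report\n");
  return true;
}
```

A verbose run followed by a quiet run then behaves as expected: the first call after enable_report () prints, and subsequent calls do nothing until reporting is enabled again.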
Re: [PATCH, PR 60640] When creating virtual clones, clone thunks too
> > > +/* If E does not lead to a thunk, simply redirect it to N. Otherwise > > > create > > > + one or more equivalent thunks for N and redirect E to the first in the > > > + chain. */ > > > + > > > +void > > > +redirect_edge_duplicating_thunks (struct cgraph_edge *e, struct > > > cgraph_node *n, > > > + bitmap args_to_skip) > > > +{ > > > + cgraph_node *orig_to = cgraph_function_or_thunk_node (e->callee); > > > + if (orig_to->thunk.thunk_p) > > > +n = duplicate_thunk_for_node (orig_to, n, args_to_skip); > > > > Is there anything that would prevent us from creating a new thunk for > > each call? > > No, given how late we have discovered it, it probably only happens > very rarely. Moreover, since you have plans to always inline only > directly called thunks for the next release, which should be the > ultimate solution, I did not think it was necessary or even > appropriate at this stage. A lot of code iterates over thunks/aliases and expects this to be a cheap operation. We thus need to be sure we won't create very many thunks or aliases of a given function internally. In order to trigger quadratic behaviour here, we only need a single function call used very often in a big project, like mozilla, to create uncontrolled numbers of thunks. I would suggest just walking the existing thunks before creating a new one, checking whether one already matches our needs. The same approach is used when making local aliases. This change is pre-approved. > > > > > Also I think you need to avoid this logic when THIS parameter is being > > optimized out > > (i.e. it is part of skip_args) > > You are of course right. However, skipping the creation of a new > thunk when we are also removing parameter this leads to verification > errors again, so I had to also teach the verifier that this case is > actually OK. Moreover, although it seems that currently all That is fine with me. > non-this_adjusting thunks are expanded before IPA-CP runs, I made sure > the skipping logic checked that flag. 
Yes, we only keep the simple thunks in non-lowered form, but I do not see how it makes a difference for you. > > Accidentally, the two original testcases are removing parameter this so > I added a new one, which also shows how current trunk miscompiles > stuff. Unfortunately, at the moment it relies on speculative edges > and so when IPA-CP correctly redirects calls to a thunk, inlining > gives up and removes the edge, so the IPA-CP transformation is not > run-time checked. However, the cgraph verifier does see the edge > before that happens and is OK with it. You can probably play with anonymous namespaces and final flags to get it devirtualized unconditionally. > > I have also taken the liberty of removing an extra call to > cgraph_function_or_thunk_node (clone_of_p calls it too) and a clearly > obsolete comment from verify_edge_corresponds_to_fndecl. > > Bootstrapped and tested on x86_64-linux. OK for trunk? > > Thanks, > > Martin > > > 2014-03-31 Martin Jambor > > * cgraph.h (cgraph_clone_node): New parameter added to declaration. > Adjust all callers. > * cgraph.c (clone_of_p): Also return true if thunks match. > (verify_edge_corresponds_to_fndecl): Removed extraneous call to > cgraph_function_or_thunk_node and an obsolete comment. > * cgraphclones.c (build_function_type_skip_args): Moved upwards in the > file. > (build_function_decl_skip_args): Likewise. > (set_new_clone_decl_and_node_flags): New function. > (duplicate_thunk_for_node): Likewise. > (redirect_edge_duplicating_thunks): Likewise. > (cgraph_clone_node): New parameter args_to_skip, pass it to > redirect_edge_duplicating_thunks which is called instead of > cgraph_redirect_edge_callee. > (cgraph_create_virtual_clone): Pass args_to_skip to cgraph_clone_node, > moved setting of a lot of flags to set_new_clone_decl_and_node_flags. > > testsuite/ > * g++.dg/ipa/pr60640-1.C: New test. > * g++.dg/ipa/pr60640-2.C: Likewise. > * g++.dg/ipa/pr60640-3.C: Likewise. OK, with the change above. Honza
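The pre-approved change asks for existing thunks to be searched before a new one is created, so that a frequently called function does not accumulate one thunk per call site. The sketch below shows that find-or-create shape on a plain linked list; struct thunk, struct function_node and find_or_create_thunk are hypothetical, and matching on a single fixed_offset field is a simplification of whatever GCC would actually need to compare.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical thunk record, keyed only by its this-adjustment.  */
struct thunk
{
  long fixed_offset;
  struct thunk *next;
};

/* Hypothetical function node owning a list of its thunks.  */
struct function_node
{
  struct thunk *thunks;
  size_t n_thunks;
};

/* Return an existing thunk with the requested adjustment, or create
   one if none matches.  Returns NULL only if allocation fails.  */
struct thunk *
find_or_create_thunk (struct function_node *fn, long fixed_offset)
{
  struct thunk *t;

  /* Walk the existing thunks first.  */
  for (t = fn->thunks; t != NULL; t = t->next)
    if (t->fixed_offset == fixed_offset)
      return t;

  t = malloc (sizeof *t);
  if (t == NULL)
    return NULL;
  t->fixed_offset = fixed_offset;
  t->next = fn->thunks;
  fn->thunks = t;
  fn->n_thunks++;
  return t;
}
```

With this shape the number of thunks per function is bounded by the number of distinct adjustments rather than by the number of redirected edges, which is the property the review is asking for.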
Re: [4.8, PATCH 29/26] Backport Power8 and LE support: Document vec_vgbbd
On Thu, 2014-04-03 at 13:01 -0500, Bill Schmidt wrote: > I'm currently doing one more quick round of testing with the three > late-addition patches, and will then be ready to commit the series. > Final tests have all passed (BE Linux, LE Linux, BE AIX). Thanks, Bill
Re: [PATCH] PR debug/57519 - Emit DW_TAG_imported_declaration under the right class for 'using' statements in a class
> ChangeLog: > 2014-03-25 Siva Chandra Reddy > > Fix PR debug/57519 > > /cp > > PR debug/57519 > * class.c (handle_using_decl): Pass the correct scope to > cp_emit_debug_info_for_using. > > testsuite/ > > PR debug/57519 > * g++.dg/debug/dwarf2/imported-decl-2.C: New testcase. This looks right to me, but you'll need approval from a C++ front end maintainer. Thanks! -cary
RE: [PATCH][1/3] Fix PR54733 Optimize endian independent load/store
> From: Andreas Schwab [mailto:sch...@suse.de] > > Please add m68k-*-*. > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > ow...@gcc.gnu.org] On Behalf Of Rainer Orth > > Just omit the { target *-*-* } completely, also a few more times. Please find attached an updated patch. gcc32rm-84.3.2.part1.diff Description: Binary data
RE: [PATCH][2/3] Fix PR54733 Optimize endian independent load/store
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches- > ow...@gcc.gnu.org] On Behalf Of Rainer Orth > > Just omit the { target *-*-* } completely, also a few more times. Please find attached an updated patch. Best regards, Thomas gcc32rm-84.3.2.part2.diff Description: Binary data
Re: [gomp4] Add tables generation
Hi! On Thu, 3 Apr 2014 18:13:08 +0200, Bernd Schmidt wrote: > On 04/02/2014 10:36 AM, Thomas Schwinge wrote: > >> I see regressions in the libgomp testsuite for configurations where > >> offloading is not enabled: > >> > >> spawn [...]/build/gcc/xgcc -B[...]/build/gcc/ > >> [...]/source/libgomp/testsuite/libgomp.c/for-3.c > >> -B[...]/build/x86_64-unknown-linux-gnu/./libgomp/ > >> -B[...]/build/x86_64-unknown-linux-gnu/./libgomp/.libs > >> -I[...]/build/x86_64-unknown-linux-gnu/./libgomp > >> -I[...]/source/libgomp/testsuite/.. -fmessage-length=0 > >> -fno-diagnostics-show-caret -fdiagnostics-color=never -fopenmp -std=gnu99 > >> -fopenmp -L[...]/build/x86_64-unknown-linux-gnu/./libgomp/.libs -lm -o > >> ./for-3.exe > >> /tmp/ccGnT0ei.o: In function `main': > >> for-3.c:(.text+0x21032): undefined reference to `__OPENMP_TARGET__' > >> collect2: error: ld returned 1 exit status > >> > >> I suppose that's because [...] > > > > Workaround committed in r209015: > > > libgcc/ > > * crtstuff.c [!ENABLE_OFFLOADING] (__OPENMP_TARGET__): Define to > > NULL. > > The patch below should be a better fix, making the references to > __OPENMP_TARGET__ weak. Does this work for you? Yes, it does, thanks! Please revert my patch when committing yours. Oh, and please use ChangeLog.gomp files on gomp-4_0-branch; also please move the entries for your recent commits from the ChangeLog file(s) to the respective ChangeLog.gomp one(s). Grüße, Thomas
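As a rough illustration of why a weak reference avoids the "undefined reference" link error quoted above, here is a minimal sketch. It is not the patch under discussion: `example_offload_table` is a hypothetical symbol standing in for `__OPENMP_TARGET__`, and the behaviour shown assumes GCC or Clang on an ELF target such as x86_64 GNU/Linux, where an undefined weak symbol resolves to address zero.

```c
#include <stddef.h>

/* Hypothetical symbol standing in for __OPENMP_TARGET__.  Nothing in
   this program defines it; the weak attribute lets the link succeed
   anyway, with the symbol's address resolving to NULL.  */
extern const char example_offload_table __attribute__ ((weak));

/* Return nonzero only if some object file actually defined the table.  */
int
offload_table_present (void)
{
  return &example_offload_table != NULL;
}
```

A caller can then test for the table at run time instead of requiring every configuration to provide a definition, which is what the NULL-definition workaround in crtstuff.c did.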
Re: [PATCH] Fix PR59626, _FORTIFY_SOURCE wrappers and LTO
Hi, here is an updated version of my earlier ipa.c change. It turns out that the problem was that I did not drop always_inline. In this version I just drop the always_inline attribute on all functions whose body is removed. The patch will affect non-LTO compilation too, but IMO only by making us neither inline nor diagnose the cases where an indirect call to an always_inline function is turned into a direct one between early optimizations and inlining. Does that seem acceptable? (I personally would prefer this over inventing yet another way to special-case always_inline for LTO only; we never made any strong promises about always_inline and indirect calls.) Honza * lto-cgraph.c (input_overwrite_node): Check that partitioning flags are set only during streaming. * ipa.c (process_references, walk_polymorphic_call_targets, symtab_remove_unreachable_nodes): Drop bodies of always inline after early inlining. (symtab_remove_unreachable_nodes): Remove always_inline attribute. * gcc.dg/lto/pr59626_0.c: New testcase. * gcc.dg/lto/pr59626_1.c: New testcase. Index: lto-cgraph.c === --- lto-cgraph.c(revision 209062) +++ lto-cgraph.c(working copy) @@ -1001,6 +1001,9 @@ input_overwrite_node (struct lto_file_de node->thunk.thunk_p = bp_unpack_value (bp, 1); node->resolution = bp_unpack_enum (bp, ld_plugin_symbol_resolution, LDPR_NUM_KNOWN); + gcc_assert (flag_ltrans + || (!node->in_other_partition + && !node->used_from_other_partition)); } /* Return string alias is alias of.
*/ @@ -1169,6 +1172,9 @@ input_varpool_node (struct lto_file_decl node->same_comdat_group = (symtab_node *) (intptr_t) ref; node->resolution = streamer_read_enum (ib, ld_plugin_symbol_resolution, LDPR_NUM_KNOWN); + gcc_assert (flag_ltrans + || (!node->in_other_partition + && !node->used_from_other_partition)); return node; } Index: ipa.c === --- ipa.c (revision 209062) +++ ipa.c (working copy) @@ -139,7 +139,10 @@ process_references (struct ipa_ref_list if (node->definition && !node->in_other_partition && ((!DECL_EXTERNAL (node->decl) || node->alias) - || (before_inlining_p + || (((before_inlining_p + && (cgraph_state < CGRAPH_STATE_IPA_SSA + || !lookup_attribute ("always_inline", + DECL_ATTRIBUTES (node->decl) /* We use variable constructors during late complation for constant folding. Keep references alive so partitioning knows about potential references. */ @@ -191,7 +194,10 @@ walk_polymorphic_call_targets (pointer_s /* Prior inlining, keep alive bodies of possible targets for devirtualization. */ if (n->definition - && before_inlining_p) + && (before_inlining_p + && (cgraph_state < CGRAPH_STATE_IPA_SSA + || !lookup_attribute ("always_inline", +DECL_ATTRIBUTES (n->decl) pointer_set_insert (reachable, n); /* Even after inlining we want to keep the possible targets in the @@ -491,6 +497,12 @@ symtab_remove_unreachable_nodes (bool be node->alias = false; node->thunk.thunk_p = false; node->weakref = false; + /* After early inlining we drop always_inline attributes on +bodies of functions that are still referenced (have their +address taken). 
*/ + DECL_ATTRIBUTES (node->decl) + = remove_attribute ("always_inline", + DECL_ATTRIBUTES (node->decl)); if (!node->in_other_partition) node->local.local = false; cgraph_node_remove_callees (node); Index: testsuite/gcc.dg/lto/pr59626_1.c === --- testsuite/gcc.dg/lto/pr59626_1.c(revision 0) +++ testsuite/gcc.dg/lto/pr59626_1.c(revision 0) @@ -0,0 +1,4 @@ +int bar (int (*fn)(const char *)) +{ + return fn ("0"); +} Index: testsuite/gcc.dg/lto/pr59626_0.c === --- testsuite/gcc.dg/lto/pr59626_0.c(revision 0) +++ testsuite/gcc.dg/lto/pr59626_0.c(revision 0) @@ -0,0 +1,15 @@ +/* { dg-lto-do run } */ + +int __atoi (const char *) __asm__("atoi"); +extern inline __attribute__((always_inline,gnu_inline)) +int atoi (const char *x) +{ + return __atoi (x); +} + +int bar (int (*)(const char *)); + +int main() +{ + return bar (atoi); +}
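The condition the ipa.c hunks add to walk_polymorphic_call_targets can be read as a small predicate. The sketch below only restates that boolean logic under stated assumptions: `enum sketch_state` and `keep_possible_target_body` are hypothetical names, the enum is a stand-in for GCC's cgraph_state where only the ordering relative to the IPA-SSA state matters, and the `always_inline` flag stands in for the lookup_attribute call.

```c
#include <stdbool.h>

/* Hypothetical stand-in for GCC's cgraph_state; only the ordering
   relative to SKETCH_STATE_IPA_SSA matters in this sketch.  */
enum sketch_state
{
  SKETCH_STATE_CONSTRUCTION,
  SKETCH_STATE_IPA_SSA,
  SKETCH_STATE_IPA
};

/* Restates the patched condition: before inlining, the body of a
   possible devirtualization target is kept alive, except that once
   the IPA-SSA state has been reached an always_inline body is no
   longer kept for that reason alone.  */
bool
keep_possible_target_body (bool definition, bool before_inlining_p,
                           enum sketch_state state, bool always_inline)
{
  return definition
         && before_inlining_p
         && (state < SKETCH_STATE_IPA_SSA || !always_inline);
}
```

Read this way, the pr59626 testcase is the situation where the predicate turns false: atoi is always_inline and only has its address taken, so after early inlining its body is dropped and, per the third ipa.c hunk, the attribute is removed from the remaining declaration.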
Re: [PATCH, FORTRAN] Fix PR fortran/60191
Bernd Edlinger wrote: Boot-strapped and Regression-tested on arm-linux-gnueabihf and x86_64-linux-gnu. OK for trunk? The patch looks good to me. Thanks for the patch. [Hopefully, we do not miss some odd corner case where it causes some problems.] Cheers, Tobias