[PATCH,AIX] go: disable mvsx and maltivec for aix/ppc
Description:
* This patch removes -mvsx and -maltivec for go aix/ppc. These options
  don't seem compatible with Go stack layout.

Tests:
* AIX 7.2: Configure/Build: SUCCESS

Changelog:
* go/gospec.c (lang_specific_driver): disable -mvsx and -maltivec
  for aix/ppc.

Clément Chigot
ATOS Bull SAS
1 rue de Provence - 38432 Échirolles - France

Index: gospec.c
===
--- gospec.c	(revision 269987)
+++ gospec.c	(working copy)
@@ -427,6 +427,14 @@ lang_specific_driver (struct cl_decoded_option **i
       j++;
 #endif
 
+#if defined(TARGET_AIX) && defined(TARGET_32BIT)
+  /* aix/ppc doesn't support -mvsx and -maltivec with Go */
+  generate_option (OPT_mvsx, NULL, 0, CL_DRIVER, &new_decoded_options[j]);
+  j++;
+  generate_option (OPT_maltivec, NULL, 0, CL_DRIVER, &new_decoded_options[j]);
+  j++;
+#endif
+
   *in_decoded_options_count = j;
   *in_decoded_options = new_decoded_options;
   *in_added_libraries = added_libraries;
[PATCH,AIX] disables -flto on Go tests for AIX
Description:
* This patch disables Go tests with -flto on AIX.

Tests:
* AIX 7.2: Configure/Build: SUCCESS

Changelog:
* lib/go-torture.exp: add check for lto option

Clément Chigot
ATOS Bull SAS
1 rue de Provence - 38432 Échirolles - France

Index: go-torture.exp
===
--- go-torture.exp	(revision 269987)
+++ go-torture.exp	(working copy)
@@ -34,10 +34,13 @@ if ![info exists TORTURE_OPTIONS] {
 	{ -O2 -fomit-frame-pointer -finline-functions -funroll-loops } \
 	{ -O2 -fbounds-check } \
 	{ -O3 -g } \
-	{ -Os } \
-	{ -flto }]
+	{ -Os }]
 }
 
+if [check_effective_target_lto] {
+    set TORTURE_OPTIONS \
+	[concat $TORTURE_OPTIONS [list {-flto}]]
+}
 
 #
 # go-torture-compile -- compile a go.go-torture testcase.
Re: C++ PATCH for c++/89214 - ICE when initializing aggregates with bases
On Mar 28 2019, Marek Polacek wrote:

> Andreas, could you please find out why we're not hitting this code in
> digest_init_r:
>
> 1210	      tree elt = CONSTRUCTOR_ELT (stripped_init, 0)->value;
> 1211	      if (reference_related_p (type, TREE_TYPE (elt)))

This is never executed if flag_checking is false, of course.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."
Re: [v3 PATCH] Use single-visitation in variant assignment and swap.
Hi

On 30/03/19 19:00, Ville Voutilainen wrote:

-      template
+      template
+	decltype(auto)
+	__visitor_result_type(_Visitor&& __visitor, _Variants&&... __variants)
+	{
+	  if constexpr(__use_index)
+	    return __detail::__variant::__variant_idx_cookie{};
+	  else
+	    return std::forward<_Visitor>(__visitor)(
+	      std::get<0>(std::forward<_Variants>(__variants))...);
+	}

If I'm not misreading something, the new function will usually be
compiled/optimized to something very small and isn't constexpr, thus
normally it should be explicitly marked inline. Or is the problem the
missing constexpr? (sorry, I didn't really study the new code)

Paolo.
Re: [v3 PATCH] Use single-visitation in variant assignment and swap.
On Mon, 1 Apr 2019 at 11:30, Paolo Carlini wrote: > > Hi > > On 30/03/19 19:00, Ville Voutilainen wrote: > > - template > > + template > > +decltype(auto) > > +__visitor_result_type(_Visitor&& __visitor, _Variants&&... __variants) > > +{ > > + if constexpr(__use_index) > > +return __detail::__variant::__variant_idx_cookie{}; > > + else > > + return std::forward<_Visitor>(__visitor)( > > + std::get<0>(std::forward<_Variants>(__variants))...); > > +} > > If I'm not misreading something, the new function will be usually > compiled/optimized to something very small and isn't constexpr thus > normally should be explicitly marked inline. Or the problem is the > missing constexpr? (sorry, I didn't really study the new code) The only use of this function is to compute its return type. It's never called. I should probably comment on that...
Re: [v3 PATCH] Use single-visitation in variant assignment and swap.
Hi On 01/04/19 10:43, Ville Voutilainen wrote: On Mon, 1 Apr 2019 at 11:30, Paolo Carlini wrote: Hi On 30/03/19 19:00, Ville Voutilainen wrote: - template + template +decltype(auto) +__visitor_result_type(_Visitor&& __visitor, _Variants&&... __variants) +{ + if constexpr(__use_index) +return __detail::__variant::__variant_idx_cookie{}; + else + return std::forward<_Visitor>(__visitor)( + std::get<0>(std::forward<_Variants>(__variants))...); +} If I'm not misreading something, the new function will be usually compiled/optimized to something very small and isn't constexpr thus normally should be explicitly marked inline. Or the problem is the missing constexpr? (sorry, I didn't really study the new code) The only use of this function is to compute its return type. It's never called. I should probably comment on that... Oh, yes, now I see, a few lines below. That's clear enough, I think. Paolo.
Re: [v3 PATCH] Use single-visitation in variant assignment and swap.
On Sat, 30 Mar 2019 at 20:00, Ville Voutilainen wrote:
>
> This patch makes assignments correct, because they need to
> match indices instead of types. In addition, we cut down the
> codegen size. The symbols are longer than before, but the amount
> of them is roughly the same, so there's no longer an explosion
> in their amount.
>
> Relops are the last bit in these fixes, I'll fix them during the weekend.

Here. This does produce 17 symbols instead of 9 for a test with a
single operator== comparison. But it's not 47 like it is with the code
before this patch.

2019-04-01  Ville Voutilainen

	Use single-visitation in variant assignment and swap.
	Also use indices instead of types when checking whether
	variants hold the same thing.
	* include/std/variant (__do_visit): Add a template parameter
	for index visitation, invoke with indices if index visitation
	is used.
	(__variant_idx_cookie): New.
	(__visit_with_index): Likewise.
	(_Copy_assign_base::operator=): Do single-visitation with
	an index visitor.
	(_Move_assign_base::operator=): Likewise.
	(_Extra_visit_slot_needed): Adjust.
	(__visit_invoke): Call with indices if it's an index visitor.
	(relops): Do single-visitation with an index visitor.
	(swap): Likewise.
	(__visitor_result_type): New.

diff --git a/libstdc++-v3/include/std/variant b/libstdc++-v3/include/std/variant
index 7fd6947..3a11e1c 100644
--- a/libstdc++-v3/include/std/variant
+++ b/libstdc++-v3/include/std/variant
@@ -138,7 +138,7 @@ namespace __variant
     constexpr variant_alternative_t<_Np, variant<_Types...>> const&&
     get(const variant<_Types...>&&);
 
-  template
+  template
     constexpr decltype(auto)
     __do_visit(_Visitor&& __visitor, _Variants&&...
__variants); @@ -175,6 +175,10 @@ namespace __variant // used for raw visitation struct __variant_cookie {}; + // used for raw visitation with indices passed in + struct __variant_idx_cookie {}; + // a more explanatory name than 'true' + inline constexpr auto __visit_with_index = bool_constant{}; // _Uninitialized is guaranteed to be a literal type, even if T is not. // We have to do this, because [basic.types]p10.5.3 (n4606) is not implemented @@ -570,45 +574,44 @@ namespace __variant operator=(const _Copy_assign_base& __rhs) noexcept(_Traits<_Types...>::_S_nothrow_copy_assign) { - __do_visit([this, &__rhs](auto&& __this_mem, auto&& __rhs_mem) mutable - -> __detail::__variant::__variant_cookie + __do_visit<__visit_with_index>([this](auto&& __rhs_mem, + auto __rhs_index) mutable + -> __detail::__variant::__variant_idx_cookie { - if constexpr (is_same_v< - remove_reference_t, - remove_reference_t>) + if constexpr (__rhs_index != variant_npos) { - if constexpr (!is_same_v< - remove_reference_t, - __variant_cookie>) - __this_mem = __rhs_mem; - } - else - { - if constexpr (!is_same_v< - remove_reference_t, - __variant_cookie>) + if (this->_M_index == __rhs_index) { - using __rhs_type = - remove_reference_t; - if constexpr (is_nothrow_copy_constructible_v<__rhs_type> - || !is_nothrow_move_constructible_v<__rhs_type>) + if constexpr (__rhs_index != variant_npos) { - this->_M_destructive_copy(__rhs._M_index, __rhs_mem); + auto& __this_mem = + __get<__rhs_index>(*this); + if constexpr (is_same_v< + remove_reference_t, + remove_reference_t>) + __this_mem = __rhs_mem; } + } + else + { + using __rhs_type = + remove_reference_t; + if constexpr + (is_nothrow_copy_constructible_v<__rhs_type> + || !is_nothrow_move_constructible_v<__rhs_type>) + this->_M_destructive_copy(__rhs_index, __rhs_mem); else { auto __tmp(__rhs_mem); - this->_M_destructive_move(__rhs._M_index, + this->_M_destructive_move(__rhs_index, std::move(__tmp)); } } - else - { - this->_M_reset(); - } } - return {}; 
- }, __variant_cast<_Types...>(*this), __variant_cast<_Types...>(__rhs)); + else + this->_M_reset(); + return {}; + }, __variant_cast<_Types...>(__rhs)); __glibcxx_assert(this->_M_index == __rhs._M_index); return *this; } @@ -641,25 +644,32 @@ namespace __variant operator=(_Move_assign_base&& __rhs) noexcept(_Traits<_Types...>::_S_nothrow_move_assign) { - __do_visit([this, &__rhs](auto&& __this_mem, auto&& __rhs_mem) mutable - -> __detail::__variant::__variant_cookie + __do_visit<__visit_with_index>([this](auto&& __rhs_mem, + auto __rhs_index) mutable + -> __detail::__variant::__variant_idx_cookie { - if constexpr (is_same_v< - remove_reference_t, - remove_reference_t>) + if constexpr (__rhs_index != variant_npos) { - if constexpr (!is_same_v< - remove_reference_t, -
Re: [v3 PATCH] Use single-visitation in variant assignment and swap.
On Mon, 1 Apr 2019 at 11:45, Paolo Carlini wrote: > >>> +__visitor_result_type(_Visitor&& __visitor, _Variants&&... > >>> __variants) > >>> +{ > >>> + if constexpr(__use_index) > >>> +return __detail::__variant::__variant_idx_cookie{}; > >>> + else > >>> + return std::forward<_Visitor>(__visitor)( > >> If I'm not misreading something, the new function will be usually > >> compiled/optimized to something very small and isn't constexpr thus > >> normally should be explicitly marked inline. Or the problem is the > >> missing constexpr? (sorry, I didn't really study the new code) > > The only use of this function is to compute its return type. It's > > never called. I should probably comment on that... > > Oh, yes, now I see, a few lines below. That's clear enough, I think. It's a bit novel, but happens to be much easier to write than something using std::conditional or some such. Traits like that may end up evaluating their template arguments much sooner than I'd like, but shoving it into a helper function with an if-constexpr in it makes it trivial and I don't need to worry about the validity of the branch taken, since it's not instantiated. This is a new use case for if-constexpr even for me, but it's a very nice bonus ability in that already shining-fabulous language facility. :)
Re: [v3 PATCH] Use single-visitation in variant assignment and swap.
On Mon, 1 Apr 2019 at 11:50, Ville Voutilainen wrote: > in it makes it trivial and I don't > need to worry about the validity of the branch taken, since it's not > instantiated. This is a new I mean the branch *not* taken. :)
[PATCH] Improve PR46590 compile-time, fix RPO VN regression
This fixes quadratic behavior in region RPO VN which happens because
dom_walker computes whole-function RPO order in its constructor even
though in the end we do not use it. That's an artifact of C++ (we
derive from dom_walker). The fix is to lazily do costly initialization
in the ::walk method.

I'm not sure we can re-use previous edge flag state for multiple
::walk invocations when the user said REACHABLE_BLOCKS instead of
REACHABLE_BLOCKS_PRESERVING_FLAGS so I've gone for the safe side (but
I'm quite sure we do not have a consumer walking multiple times using
the same class instance anyways).

This cuts down GIMPLE copy-header time from 11% to 0% on the second
testcase from PR46590, leaving RTL invariant motion as the worst
offender at 50% of the compile-time (through its use of DF).

Bootstrap / regtest running on x86_64-unknown-linux-gnu.

Richard.

2019-04-01  Richard Biener

	PR tree-optimization/46590
	* domwalk.h (dom_walker::dom_walker): Consolidate constructors.
	(dom_walker::m_reachability): Add in place of...
	(dom_walker::m_skip_unreachable_blocks): ...this.
	* domwalk.c (dom_walker::dom_walker): Consolidate constructors.
	Move complex initialization ...
	(dom_walker::walk): Here.  Especially compute m_bb_to_rpo
	lazily and initialize edge flags on each invocation.
	(dom_walker::bb_reachable): Use m_reachability.

Index: gcc/domwalk.c
===
--- gcc/domwalk.c	(revision 270053)
+++ gcc/domwalk.c	(working copy)
@@ -190,69 +190,11 @@ dom_walker::dom_walker (cdi_direction di
 			enum reachability reachability,
 			int *bb_index_to_rpo)
   : m_dom_direction (direction),
-    m_skip_unreachable_blocks (reachability != ALL_BLOCKS),
-    m_user_bb_to_rpo (true),
+    m_reachability (reachability),
+    m_user_bb_to_rpo (bb_index_to_rpo != NULL),
     m_unreachable_dom (NULL),
     m_bb_to_rpo (bb_index_to_rpo)
 {
-  /* Set up edge flags if need be.  */
-  switch (reachability)
-    {
-    default:
-      gcc_unreachable ();
-    case ALL_BLOCKS:
-      /* No need to touch edge flags.
*/ - break; - -case REACHABLE_BLOCKS: - set_all_edges_as_executable (cfun); - break; - -case REACHABLE_BLOCKS_PRESERVING_FLAGS: - /* Preserve the edge flags. */ - break; -} -} - -/* Constructor for a dom walker. */ - -dom_walker::dom_walker (cdi_direction direction, - enum reachability reachability) - : m_dom_direction (direction), -m_skip_unreachable_blocks (reachability != ALL_BLOCKS), -m_user_bb_to_rpo (false), -m_unreachable_dom (NULL), -m_bb_to_rpo (NULL) -{ - /* Compute the basic-block index to RPO mapping. */ - if (direction == CDI_DOMINATORS) -{ - int *postorder = XNEWVEC (int, n_basic_blocks_for_fn (cfun)); - int postorder_num = pre_and_rev_post_order_compute (NULL, postorder, - true); - m_bb_to_rpo = XNEWVEC (int, last_basic_block_for_fn (cfun)); - for (int i = 0; i < postorder_num; ++i) - m_bb_to_rpo[postorder[i]] = i; - free (postorder); -} - - /* Set up edge flags if need be. */ - switch (reachability) -{ -default: - gcc_unreachable (); -case ALL_BLOCKS: - /* No need to touch edge flags. */ - break; - -case REACHABLE_BLOCKS: - set_all_edges_as_executable (cfun); - break; - -case REACHABLE_BLOCKS_PRESERVING_FLAGS: - /* Preserve the edge flags. */ - break; -} } /* Destructor. */ @@ -270,7 +212,7 @@ dom_walker::bb_reachable (struct functio { /* If we're not skipping unreachable blocks, then assume everything is reachable. */ - if (!m_skip_unreachable_blocks) + if (m_reachability == ALL_BLOCKS) return true; /* If any of the predecessor edges that do not come from blocks dominated @@ -331,6 +273,23 @@ const edge dom_walker::STOP = (edge)-1; void dom_walker::walk (basic_block bb) { + /* Compute the basic-block index to RPO mapping lazily. 
*/ + if (!m_bb_to_rpo + && m_dom_direction == CDI_DOMINATORS) +{ + int *postorder = XNEWVEC (int, n_basic_blocks_for_fn (cfun)); + int postorder_num = pre_and_rev_post_order_compute (NULL, postorder, + true); + m_bb_to_rpo = XNEWVEC (int, last_basic_block_for_fn (cfun)); + for (int i = 0; i < postorder_num; ++i) + m_bb_to_rpo[postorder[i]] = i; + free (postorder); +} + + /* Set up edge flags if need be. */ + if (m_reachability == REACHABLE_BLOCKS) +set_all_edges_as_executable (cfun); + basic_block dest; basic_block *worklist = XNEWVEC (basic_block, n_basic_blocks_for_fn (cfun) * 2); Index: gcc/domwalk.h === --- gcc/domwalk.h
[C++ Patch] PR 62207 ("[7/8/9 Regression] ICE: tree check: expected tree that contains 'decl minimal' structure, have 'overload' in tsubst_copy, at cp/pt.c")
Hi,

A rather long-standing issue. The error-recovery problem is that, in
various circumstances, we flag as erroneous a local variable
declaration and then, when we need to come back to it later on,
lookup_name doesn't find it, keeps looking and may instead find a
completely unrelated non-variable declaration with the same name. In
principle, we could probably set up things in such a way that during
error-recovery the erroneous entity is found anyway, marked as
erroneous of course, but I don't think we want to attempt that now...
Thus, I'm proposing to just check for VAR_P in the relevant case of
tsubst_copy and bail out. Setting instead r = NULL_TREE and trying to
continue works rather well but leads to duplicate diagnostics in some
cases (eg, for template/crash131.C).

By the way, something else I actually tried is returning immediately
from tsubst_expr when the tsubst call in case DECL_EXPR returns
error_mark_node and then bailing out immediately from the for loop
over the STATEMENT_LIST. That also passes the testsuite, but, if you
want, you can certainly notice that the loop over the statements ends
early; for example we would not provide any diagnostics for an
additional Y y2; in template/crash131.C. Which interestingly is true
for current clang too ;)

Tested x86_64-linux.

Thanks, Paolo.

/cp
2019-04-01  Paolo Carlini

	PR c++/62207
	* pt.c (tsubst_copy): Deal with lookup_name not returning a variable.

/testsuite
2019-04-01  Paolo Carlini

	PR c++/62207
	* g++.dg/template/crash130.C: New.
	* g++.dg/template/crash131.C: Likewise.

Index: cp/pt.c
===
--- cp/pt.c	(revision 270012)
+++ cp/pt.c	(working copy)
@@ -15579,12 +15579,23 @@ tsubst_copy (tree t, tree args, tsubst_flags_t com
 	{
 	  /* First try name lookup to find the instantiation.  */
 	  r = lookup_name (DECL_NAME (t));
-	  if (r && !is_capture_proxy (r))
+	  if (r)
 	    {
-	      /* Make sure that the one we found is the one we want.
*/ - tree ctx = enclosing_instantiation_of (DECL_CONTEXT (t)); - if (ctx != DECL_CONTEXT (r)) - r = NULL_TREE; + if (!VAR_P (r)) + { + /* During error-recovery we may find a non-variable, +even an OVERLOAD: just bail out and avoid ICEs and +duplicate diagnostics (c++/62207). */ + gcc_assert (seen_error ()); + return error_mark_node; + } + if (!is_capture_proxy (r)) + { + /* Make sure the one we found is the one we want. */ + tree ctx = enclosing_instantiation_of (DECL_CONTEXT (t)); + if (ctx != DECL_CONTEXT (r)) + r = NULL_TREE; + } } if (r) @@ -15620,7 +15631,7 @@ tsubst_copy (tree t, tree args, tsubst_flags_t com } gcc_assert (cp_unevaluated_operand || TREE_STATIC (r) || decl_constant_var_p (r) - || errorcount || sorrycount); + || seen_error ()); if (!processing_template_decl && !TREE_STATIC (r)) r = process_outer_var_ref (r, complain); Index: testsuite/g++.dg/template/crash130.C === --- testsuite/g++.dg/template/crash130.C(nonexistent) +++ testsuite/g++.dg/template/crash130.C(working copy) @@ -0,0 +1,15 @@ +// PR c++/62207 + +template void foo(T) +{ + X; // { dg-error "not declared" } + X; +} + +void X(); +void X(int); + +void bar() +{ + foo(0); +} Index: testsuite/g++.dg/template/crash131.C === --- testsuite/g++.dg/template/crash131.C(nonexistent) +++ testsuite/g++.dg/template/crash131.C(working copy) @@ -0,0 +1,16 @@ +// PR c++/62207 + +template +class X { +public: + template class Y {}; + template void y() {} + X(F f) + { +Y y; // { dg-error "not a constant" } + +y.value(); + } +}; + +int main() { X x(1); }
Re: [v3 PATCH] Use single-visitation in variant assignment and swap.
On Mon, 1 Apr 2019 at 11:45, Ville Voutilainen wrote:
>
> On Sat, 30 Mar 2019 at 20:00, Ville Voutilainen wrote:
> >
> > This patch makes assignments correct, because they need to
> > match indices instead of types. In addition, we cut down the
> > codegen size. The symbols are longer than before, but the amount
> > of them is roughly the same, so there's no longer an explosion
> > in their amount.
> >
> > Relops are the last bit in these fixes, I'll fix them during the weekend.
>
> Here. This does produce 17 symbols instead of 9 for a test with a
> single operator== comparison.
> But it's not 47 like it is with the code before this patch.

Jonathan: I consider this good enough to ship; it doesn't explode the
symbols too much and nicely gets rid of half of the runtime ifs for
valueless state and index. That's really the perf bonus that
visitation buys, and for swap and relops there is less need to do
other inquiries, although there is that check for same-index, which is
now a run-time check between a completely-runtime value and a
compile-time constant, as opposed to a run-time check between two
completely-runtime values.

Later, as a possible optimization, we can see whether swap and relops
can be made to use a different generated jump table with just maybe
twelve entries, and whether we can dispatch into it without other
additional branches, which I'm not entirely sure we can.
[PATCH] Improve DF use of loop-invariant (PR46590)
After the fix to RPO VN in loop header copying, DF via RTL invariant
motion takes 50% of the compile-time for the second testcase in
PR46590. This is caused by the DF live problem iterating over all
dirty blocks for the local problem, which is all blocks of the
function, because while loop-invariant uses df_analyze_loop it always
marks all blocks of the function as dirty via df_live_set_all_dirty.

One of the possible fixes is to add a df_live_set_loop_dirty ()
function which causes the problem to only iterate over one loop's
blocks. It turns out the LIVE problem is always present so we cannot
remove it, which in turn means we have to mark blocks possibly not
visited by invariant motion as dirty (thus the new
df_live_set_all_dirty at the end of move_loop_invariants).

Anyhow - I'm not really into how DF should ideally work here but
thought this issue was "easy" enough to fix...

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

This brings down DF live time from 50% to 0% for the testcase (IIRC
I've seen loop-invariant as a time-hog several times, surprising me).

Richard.

2019-04-01  Richard Biener

	PR rtl-optimization/46590
	* df.h (df_live_set_loop_dirty): Declare.
	* df-problems.c: Include cfgloop.h.
	(df_live_set_loop_dirty): New function.
	* loop-invariant.c (find_defs): Use df_live_set_loop_dirty
	instead of df_live_set_all_dirty.
	(move_loop_invariants): Mark all of live dirty at the end.
Index: gcc/df.h === --- gcc/df.h(revision 270054) +++ gcc/df.h(working copy) @@ -1023,6 +1023,7 @@ extern void df_lr_verify_transfer_functi extern void df_live_verify_transfer_functions (void); extern void df_live_add_problem (void); extern void df_live_set_all_dirty (void); +extern void df_live_set_loop_dirty (struct loop *); extern void df_chain_add_problem (unsigned int); extern void df_word_lr_add_problem (void); extern bool df_word_lr_mark_ref (df_ref, bool, bitmap); Index: gcc/df-problems.c === --- gcc/df-problems.c (revision 270054) +++ gcc/df-problems.c (working copy) @@ -36,6 +36,7 @@ along with GCC; see the file COPYING3. #include "valtrack.h" #include "dumpfile.h" #include "rtl-iter.h" +#include "cfgloop.h" /* Note that turning REG_DEAD_DEBUGGING on will cause gcc.c-torture/unsorted/dump-noaddr.c to fail because it prints @@ -1774,6 +1775,19 @@ df_live_set_all_dirty (void) } +/* Set all of the blocks of LOOP as dirty. This needs to be done if this + problem is added after all of the insns have been scanned. */ + +void +df_live_set_loop_dirty (struct loop *loop) +{ + basic_block *bbs = get_loop_body (loop); + for (unsigned i = 0; i < loop->num_nodes; ++i) +bitmap_set_bit (df_live->out_of_date_transfer_functions, bbs[i]->index); + free (bbs); +} + + /* Verify that all of the lr related info is consistent and correct. */ Index: gcc/loop-invariant.c === --- gcc/loop-invariant.c(revision 270054) +++ gcc/loop-invariant.c(working copy) @@ -685,7 +685,7 @@ find_defs (struct loop *loop) df_process_deferred_rescans (); df_chain_add_problem (DF_UD_CHAIN); df_live_add_problem (); - df_live_set_all_dirty (); + df_live_set_loop_dirty (loop); df_set_flags (DF_RD_PRUNE_DEAD_DEFS); df_analyze_loop (loop); check_invariant_table_size (); @@ -2286,5 +2286,9 @@ move_loop_invariants (void) invariant_table = NULL; invariant_table_size = 0; + /* Mark all of DF_LIVE dirty, we've likely not covered all blocks + sofar. 
*/ + df_live_set_all_dirty (); + checking_verify_flow_info (); }
Re: [v3 PATCH] Use single-visitation in variant assignment and swap.
On 01/04/19 11:45 +0300, Ville Voutilainen wrote: @@ -570,45 +574,44 @@ namespace __variant operator=(const _Copy_assign_base& __rhs) noexcept(_Traits<_Types...>::_S_nothrow_copy_assign) { - __do_visit([this, &__rhs](auto&& __this_mem, auto&& __rhs_mem) mutable - -> __detail::__variant::__variant_cookie + __do_visit<__visit_with_index>([this](auto&& __rhs_mem, + auto __rhs_index) mutable + -> __detail::__variant::__variant_idx_cookie { - if constexpr (is_same_v< - remove_reference_t, - remove_reference_t>) + if constexpr (__rhs_index != variant_npos) { - if constexpr (!is_same_v< - remove_reference_t, - __variant_cookie>) - __this_mem = __rhs_mem; - } - else - { - if constexpr (!is_same_v< - remove_reference_t, - __variant_cookie>) + if (this->_M_index == __rhs_index) { - using __rhs_type = - remove_reference_t; - if constexpr (is_nothrow_copy_constructible_v<__rhs_type> - || !is_nothrow_move_constructible_v<__rhs_type>) + if constexpr (__rhs_index != variant_npos) { - this->_M_destructive_copy(__rhs._M_index, __rhs_mem); + auto& __this_mem = + __get<__rhs_index>(*this); Qualify this as __variant::__get please. 
+ if constexpr (is_same_v< + remove_reference_t, + remove_reference_t>) + __this_mem = __rhs_mem; } + } + else + { + using __rhs_type = + remove_reference_t; + if constexpr + (is_nothrow_copy_constructible_v<__rhs_type> + || !is_nothrow_move_constructible_v<__rhs_type>) + this->_M_destructive_copy(__rhs_index, __rhs_mem); else { auto __tmp(__rhs_mem); - this->_M_destructive_move(__rhs._M_index, + this->_M_destructive_move(__rhs_index, std::move(__tmp)); } } - else - { - this->_M_reset(); - } } - return {}; - }, __variant_cast<_Types...>(*this), __variant_cast<_Types...>(__rhs)); + else + this->_M_reset(); + return {}; + }, __variant_cast<_Types...>(__rhs)); __glibcxx_assert(this->_M_index == __rhs._M_index); return *this; } @@ -641,25 +644,32 @@ namespace __variant operator=(_Move_assign_base&& __rhs) noexcept(_Traits<_Types...>::_S_nothrow_move_assign) { - __do_visit([this, &__rhs](auto&& __this_mem, auto&& __rhs_mem) mutable - -> __detail::__variant::__variant_cookie + __do_visit<__visit_with_index>([this](auto&& __rhs_mem, + auto __rhs_index) mutable + -> __detail::__variant::__variant_idx_cookie { - if constexpr (is_same_v< - remove_reference_t, - remove_reference_t>) + if constexpr (__rhs_index != variant_npos) { - if constexpr (!is_same_v< - remove_reference_t, - __variant_cookie>) - __this_mem = std::move(__rhs_mem); + if (this->_M_index == __rhs_index) + { + if constexpr (__rhs_index != variant_npos) + { + auto& __this_mem = + __get<__rhs_index>(*this); And here. + if constexpr (is_same_v< + remove_reference_t, + remove_reference_t>) + __this_mem = std::move(__rhs_mem); + } + } + else + this->_M_destructive_move(__rhs_index, + std::move(__rhs_mem)); } else - { - auto __tmp(std::move(__rhs_mem)); - this->_M_destructive_move(__rhs._M_index, std::move(__tmp)); - } - return {}; - }, __variant_cast<_Types...>(*this), __variant_c
Re: [v3 PATCH] Use single-visitation in variant assignment and swap.
On 01/04/19 12:52 +0300, Ville Voutilainen wrote: On Mon, 1 Apr 2019 at 11:45, Ville Voutilainen wrote: On Sat, 30 Mar 2019 at 20:00, Ville Voutilainen wrote: > > This patch makes assignments correct, because they need to > match indices instead of types. In addition, we cut down the > codegen size. The symbols are longer than before, the the amount > of them is roughly the same, so there's no longer an explosion > in their amount. > > Relops are the last bit in these fixes, I'll fix them during the weekend. Here. This does produce 17 symbols instead of 9 for a test with a single operator== comparison. But it's not 47 like it is with the code before this patch. Jonathan: I consider this good enough to ship; it doesn't explode the symbols too much and nicely gets rid of half of the runtime ifs for valueless state and index. That's really the perf bonus that visitation buys, and for swap and relops there is less need to do other inquiries, although there is that check for same-index, which is now a run-time check between a completely-runtime value and a compile-time constant as opposed to a run-time check between two completely-runtime values. Later, as a possible optimization, we can see whether swap and relops can be made to use a different generated jump table with just maybe twelve entries, and if we can dispatch into it without other additional branches, which I'm not entirely sure we can. Yes, this is a very satisfactory improvement in codegen.
[RFC][PATCH] Postpone print of --help=* option.
Hi. Last week I was curious which warnings are disabled by default on top of -Wall and -Wextra. Thus I used --help=warning and noticed some discrepancy in between documentation and output of the --help option. I created PR89885 where I explained that OPT__help_ option handling happens early. That's why LangEnabledBy are not reflected and similarly target overrides don't take place. I'm attaching diff for --help=warning for C++ and -Ofast. Thoughts? gcc/ChangeLog: 2019-04-01 Martin Liska * gcc.c (process_command): Add dummy file only if n_infiles == 0. * opts-global.c (decode_options): Pass lang_mask. * opts.c (print_help): New function. (finish_options): Print --help if help_option_argument is set. (common_handle_option): Factor out content of OPT__help_ into print_help. * opts.h (finish_options): Add new argument. --- gcc/gcc.c | 3 +- gcc/opts-global.c | 2 +- gcc/opts.c| 267 -- gcc/opts.h| 3 +- 4 files changed, 146 insertions(+), 129 deletions(-) diff --git a/gcc/gcc.c b/gcc/gcc.c index 4f57765b012..7ce1cae28a7 100644 --- a/gcc/gcc.c +++ b/gcc/gcc.c @@ -4751,7 +4751,8 @@ process_command (unsigned int decoded_options_count, } /* Ensure we only invoke each subprocess once. */ - if (print_subprocess_help || print_help_list || print_version) + if (n_infiles == 0 + && (print_subprocess_help || print_help_list || print_version)) { n_infiles = 0; diff --git a/gcc/opts-global.c b/gcc/opts-global.c index a5e9ef0237a..f110fe1026f 100644 --- a/gcc/opts-global.c +++ b/gcc/opts-global.c @@ -314,7 +314,7 @@ decode_options (struct gcc_options *opts, struct gcc_options *opts_set, loc, lang_mask, &handlers, dc); - finish_options (opts, opts_set, loc); + finish_options (opts, opts_set, loc, lang_mask); } /* Hold command-line options associated with stack limitation. 
*/ diff --git a/gcc/opts.c b/gcc/opts.c index 02f6b4656e1..707e6754294 100644 --- a/gcc/opts.c +++ b/gcc/opts.c @@ -854,12 +854,18 @@ control_options_for_live_patching (struct gcc_options *opts, } } +/* --help option argument if set. */ +const char *help_option_argument = NULL; + +static void print_help (struct gcc_options *opts, unsigned int lang_mask); + + /* After all options at LOC have been read into OPTS and OPTS_SET, finalize settings of those options and diagnose incompatible combinations. */ void finish_options (struct gcc_options *opts, struct gcc_options *opts_set, - location_t loc) + location_t loc, unsigned int lang_mask) { enum unwind_info_type ui_except; @@ -1223,6 +1229,10 @@ finish_options (struct gcc_options *opts, struct gcc_options *opts_set, opts->x_flag_live_patching, loc); } + + /* Print --help=* if used. */ + if (help_option_argument != NULL) +print_help (opts, lang_mask); } #define LEFT_COLUMN 27 @@ -2052,6 +2062,135 @@ check_alignment_argument (location_t loc, const char *flag, const char *name) parse_and_check_align_values (flag, name, align_result, true, loc); } +/* Print help when OPT__help_ is set. */ + +static void +print_help (struct gcc_options *opts, unsigned int lang_mask) +{ + const char *a = help_option_argument; + unsigned int include_flags = 0; + /* Note - by default we include undocumented options when listing + specific classes. If you only want to see documented options + then add ",^undocumented" to the --help= option. E.g.: + + --help=target,^undocumented */ + unsigned int exclude_flags = 0; + + if (lang_mask == CL_DRIVER) +return; + + /* Walk along the argument string, parsing each word in turn. 
+ The format is: + arg = [^]{word}[,{arg}] + word = {optimizers|target|warnings|undocumented| + params|common|} */ + while (*a != 0) +{ + static const struct + { + const char *string; + unsigned int flag; + } + specifics[] = + { + { "optimizers", CL_OPTIMIZATION }, + { "target", CL_TARGET }, + { "warnings", CL_WARNING }, + { "undocumented", CL_UNDOCUMENTED }, + { "params", CL_PARAMS }, + { "joined", CL_JOINED }, + { "separate", CL_SEPARATE }, + { "common", CL_COMMON }, + { NULL, 0 } + }; + unsigned int *pflags; + const char *comma; + unsigned int lang_flag, specific_flag; + unsigned int len; + unsigned int i; + + if (*a == '^') + { + ++a; + if (*a == '\0') + { + error ("missing argument to %qs", "--help=^"); + break; + } + pflags = &exclude_flags; + } + else + pflags = &include_flags; + + comma = strchr (a, ','); + if (comma == NULL) + len = strlen (a); + else + len = comma - a; + if (len == 0) + { + a = comma + 1; + continue; + } + + /* Check to see if the string matches an option class name. */ + for (i = 0, specific_flag = 0; specifics[i].string != NULL; i++) +
[PATCH] Fix PR89896, bogus Makefile generated by lto-wrapper
The Makefile generated by lto-wrapper is currently confused by a file
named all or all.c in the local directory, since make's implicit rules
can then treat the all target as buildable from that file.

Bootstrap / regtest running on x86_64-unknown-linux-gnu.

Richard.

2019-04-01  Richard Biener

	PR lto/89896
	* lto-wrapper.c (run_gcc): Avoid implicit rules making
	the all target phony.

Index: gcc/lto-wrapper.c
===
--- gcc/lto-wrapper.c	(revision 270053)
+++ gcc/lto-wrapper.c	(working copy)
@@ -1665,7 +1665,9 @@ cont:
       struct pex_obj *pex;
       char jobs[32];
 
-      fprintf (mstream, "all:");
+      fprintf (mstream,
+	       ".PHONY: all\n"
+	       "all:");
       for (i = 0; i < nr; ++i)
 	{
 	  int j = ltrans_priorities[i*2 + 1];
Re: [PATCH] Add PSTL internal namespace qualifications
On 29/03/19 12:12 -0700, Thomas Rodgers wrote:

Prevent ADL related weirdness.

	* include/pstl/algorithm_impl.h: Add namespace qualification.
	* include/pstl/execution_defs.h: Add namespace qualification.
	* include/pstl/execution_impl.h: Add namespace qualification.
	* include/pstl/numeric_impl.h: Add namespace qualification.
	* include/pstl/parallel_backend_tbb.h: Add namespace qualification.
	* include/pstl/unseq_backend_simd.h: Add namespace qualification.
	* include/pstl/parallel_backend_utils.h: Include <cassert>.

We shouldn't include <cassert> in the std::lib; the uses of assert
should be changed to __glibcxx_assert instead (which is enabled by
defining _GLIBCXX_ASSERTIONS).

@@ -285,7 +286,7 @@
 _ForwardIterator2
 __pattern_walk2_n(_ExecutionPolicy&&, _ForwardIterator1 __first1, _Size n,
                   _ForwardIterator2 __first2, _Function f,
                   _IsVector is_vector, /*parallel=*/std::false_type) noexcept

I missed these before, but we have non-uglified 'n' and 'f' and
'is_vector' here.  Almost all uses of "is_vector" are inside a comment
like /*is_vector=*/ but here it's a real parameter name.

In practice if users do something stupid like

  #define n 123
  #define f 456

then they won't be able to include these headers anyway, because TBB
uses those as parameter names (so users that are using these headers
with TBB can't define such dumb macros).  But when we get a non-TBB
backend that won't be the case, and we should be uglifying all names
declared in our own headers on principle.

 {
-    return __brick_walk2_n(__first1, n, __first2, f, is_vector);
+    return __internal::__brick_walk2_n(__first1, n, __first2, f, is_vector);
 }

 template

And 'n' and 'f' here.
 {
-    return __pattern_walk2(std::forward<_ExecutionPolicy>(__exec), __first1, __first1 + n, __first2, f, is_vector,
+    return __internal::__pattern_walk2(std::forward<_ExecutionPolicy>(__exec), __first1, __first1 + n, __first2, f, is_vector,
                            std::true_type());
 }

We can fix the 'n' and 'f' and 'is_vector' names in another patch; for
this one please just remove <cassert> again and change assert(...) to
__glibcxx_assert(...).
Re: [PATCH,AIX] go: disable mvsx and maltivec for aix/ppc
On Mon, Apr 1, 2019 at 3:43 AM CHIGOT, CLEMENT wrote: > > Description: > * This patch removes -mvsx and -maltivec for go aix/ppc. > These options don't seem compatible with Go stack layout. > > Tests: > * AIX 7.2: Configure/Build: SUCCESS > > Changelog: > * go/gospec.c (lang_specific_driver): disable -mvsx and -maltivec for > aix/ppc. Is this really the correct place to disable VSX and Altivec for GCC Go? It seems that this should be disabled in the target-specific code based on language. Thanks, David
Re: [PATCH PR89725]Handle DR's access functions of loops not in DDR's loop_nest
On Mon, Apr 1, 2019 at 5:10 AM bin.cheng wrote:
>
> Hi,
>
> As described in the comments of PR89725, this patch fixes an
> out-of-bound access during computation of the classic dist/dir vector
> for a DDR.  Basically it does two things: A) Handle the relevant chrec
> of an outer loop in a multivariate access function as an invariant
> symbol during DDR analysis; B) Bypass relevant univariate access
> functions.
>
> Bootstrap and test on x86_64, any comments?

LGTM.  As Jakub said elsewhere, index_in_loop_nest is better written as

static inline int
index_in_loop_nest (int var, vec<loop_p> loop_nest)
{
  struct loop *loopi;
  int var_index;

  for (var_index = 0; loop_nest.iterate (var_index, &loopi); var_index++)
    if (loopi->num == var)
      return var_index;

  gcc_unreachable ();
}

OK with that change.

Richard.

> Thanks,
> bin
>
> 2019-04-01  Bin Cheng
>
>	PR tree-optimization/89725
>	* tree-chrec.c (chrec_contains_symbols): New parameter.  Handle outer
>	loop's chrec as invariant symbol.
>	* tree-chrec.h (chrec_contains_symbols): New parameter.
>	* tree-data-ref.c (analyze_miv_subscript): Pass new argument.
>	(build_classic_dist_vector_1, add_other_self_distances): Bypass access
>	function of loops not in DDR's loop_nest.
>	* tree-data-ref.h (index_in_loop_nest): New assertion.
>
> 2019-04-01  Bin Cheng
>
>	PR tree-optimization/89725
>	* gcc/testsuite/gcc.dg/tree-ssa/pr89725.c: New test.
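The suggested shape of the loop can be exercised standalone. In this sketch, `struct loop` and the plain pointer array are simplified stand-ins for GCC's `struct loop` and `vec<loop_p>`, and `abort ()` plays the role of `gcc_unreachable ()`:

```c
#include <assert.h>
#include <stdlib.h>

struct loop { int num; };

/* Return the index of the loop whose number is VAR in the nest of N
   loops; not finding it is a hard error, like the gcc_unreachable ()
   in the rewrite above.  */
static int
index_in_loop_nest (int var, const struct loop **loop_nest, int n)
{
  int var_index;

  for (var_index = 0; var_index < n; var_index++)
    if (loop_nest[var_index]->num == var)
      return var_index;
  abort ();
}
```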
Re: [C++ Patch] PR 62207 ("[7/8/9 Regression] ICE: tree check: expected tree that contains 'decl minimal' structure, have 'overload' in tsubst_copy, at cp/pt.c")
OK. Jason
Re: [PATCH, GCC, AARCH64] Add GNU note section with BTI and PAC.
Hi Richard Thanks for the comments and pointing out the much cleaner existing asm output functions! On 29/03/2019 17:51, Richard Henderson wrote: >> +#define ASM_LONG "\t.long\t" > > Do not replicate targetm.asm_out.aligned_op.si, or integer_asm_op, really. > >> +aarch64_file_end_indicate_exec_stack () >> +{ >> + file_end_indicate_exec_stack (); >> + >> + if (!aarch64_bti_enabled () >> + && aarch64_ra_sign_scope == AARCH64_FUNCTION_NONE) >> +{ >> + return; >> +} > > This is redundant with... > >> + >> + unsigned feature_1_and = 0; >> + if (aarch64_bti_enabled ()) >> +feature_1_and |= GNU_PROPERTY_AARCH64_FEATURE_1_BTI; >> + >> + if (aarch64_ra_sign_scope != AARCH64_FUNCTION_NONE) >> +feature_1_and |= GNU_PROPERTY_AARCH64_FEATURE_1_PAC; >> + >> + if (feature_1_and) > > ... this. I prefer the second, as it's obvious. > >> + ASM_OUTPUT_ALIGN (asm_out_file, p2align); >> + /* name length. */ >> + fprintf (asm_out_file, ASM_LONG " 1f - 0f\n"); >> + /* data length. */ >> + fprintf (asm_out_file, ASM_LONG " 4f - 1f\n"); >> + /* note type: NT_GNU_PROPERTY_TYPE_0. */ >> + fprintf (asm_out_file, ASM_LONG " 5\n"); >> + fprintf (asm_out_file, "0:\n"); >> + /* vendor name: "GNU". */ >> + fprintf (asm_out_file, STRING_ASM_OP " \"GNU\"\n"); >> + fprintf (asm_out_file, "1:\n"); >> + ASM_OUTPUT_ALIGN (asm_out_file, p2align); >> + /* pr_type: GNU_PROPERTY_AARCH64_FEATURE_1_AND. */ >> + fprintf (asm_out_file, ASM_LONG " 0x%x\n", >> + GNU_PROPERTY_AARCH64_FEATURE_1_AND); >> + /* pr_datasz. */\ >> + fprintf (asm_out_file, ASM_LONG " 3f - 2f\n"); >> + fprintf (asm_out_file, "2:\n"); >> + /* GNU_PROPERTY_AARCH64_FEATURE_1_XXX. */ >> + fprintf (asm_out_file, ASM_LONG " 0x%x\n", feature_1_and); >> + fprintf (asm_out_file, "3:\n"); >> + ASM_OUTPUT_ALIGN (asm_out_file, p2align); >> + fprintf (asm_out_file, "4:\n"); > > This could stand to use a comment, a moment's thinking about the sizes, and to > use the existing asm output functions. > > /* PT_NOTE header: namesz, descsz, type. 
> namesz = 4 ("GNU\0")
> descsz = 12 (see below)

I was trying out these changes but a descsz of 12 gets rejected by
readelf.  It hits the following check:

  unsigned int size = is_32bit_elf ? 4 : 8;

  printf (_("      Properties: "));

  if (pnote->descsz < 8 || (pnote->descsz % size) != 0)
    {
      printf (_("<corrupt descsz: %#lx>\n"), pnote->descsz);
      return;
    }

Thanks
Sudi

> type = 5 (NT_GNU_PROPERTY_TYPE_0).  */
> assemble_align (POINTER_SIZE);
> assemble_integer (GEN_INT (4), 4, 32, 1);
> assemble_integer (GEN_INT (12), 4, 32, 1);
> assemble_integer (GEN_INT (5), 4, 32, 1);
>
> /* PT_NOTE name */
> assemble_string ("GNU", 4);
>
> /* PT_NOTE contents for NT_GNU_PROPERTY_TYPE_0:
>    type = 0xc0000000 (GNU_PROPERTY_AARCH64_FEATURE_1_AND),
>    datasz = 4
>    data = feature_1_and
>    Note that the current section offset is 16,
>    and there has been no padding so far.  */
> assemble_integer (GEN_INT (0xc0000000), 4, 32, 1);
> assemble_integer (GEN_INT (4), 4, 32, 1);
> assemble_integer (GEN_INT (feature_1_and), 4, 32, 1);
>
> /* Pad the size of the note to the required alignment.  */
> assemble_align (POINTER_SIZE);
>
>
> r~
>
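The size arithmetic behind the disagreement is easy to check in isolation. The two helpers below are illustrative only (neither readelf's nor GCC's code): a property entry is pr_type (4 bytes) plus pr_datasz (4 bytes) plus the data padded to the ELF word size, and readelf's quoted check requires descsz to be at least 8 and a multiple of that word size, which is 8 on 64-bit ELF. So the unpadded 12 is rejected while the padded 16 passes.

```c
#include <assert.h>

/* descsz of one NT_GNU_PROPERTY_TYPE_0 entry: pr_type + pr_datasz,
   plus the data rounded up to the ELF word size (4 or 8 bytes).  */
static unsigned
property_descsz (unsigned pr_datasz, unsigned elf_word)
{
  unsigned padded = (pr_datasz + elf_word - 1) & ~(elf_word - 1);
  return 4 + 4 + padded;
}

/* The acceptance condition from the readelf excerpt above.  */
static int
readelf_accepts (unsigned descsz, unsigned elf_word)
{
  return descsz >= 8 && descsz % elf_word == 0;
}
```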
Re: [PATCH] Improve DF use of loop-invariant (PR46590)
Richard Biener writes: > After the fix to RPO VN in loop header copying DF via RTL invariant > motion takes 50% of the compile-time for the second testcase in > PR46590. This is caused by the DF live problem iterating over > all dirty blocks for the local problem which is all blocks of > the function because while loop-invariant uses df_analyze_loop > it always marks all blocks of the function as dirty via > df_live_set_all_dirty. One of the possible fixes is to add > a df_live_set_loop_dirty () function which causes the problem > to only iterate over one loop blocks. > > It turns out the LIVE problem is always present so we cannot > remove it which in turn means we have to mark blocks possibly > not visited by invaraint motion as dirty (thus the new > df_live_set_all_dirty at the end of move_loop_invariants). > > Anyhow - I'm not really into how DF should ideally work here Me neither, but as discussed on IRC, here's alternative that seems to give a similar speed-up. - df_live is already present at -O2, so we only need to add it and mark all blocks dirty for -O - df_process_deferred_rescans should be enough to force a rescan of blocks affected by moving invariants, but calling it in find_defs means that we don't do any rescans for the final loop Still testing, but posting for comments. Thanks, Richard 2019-04-01 Richard Sandiford gcc/ PR rtl-optimization/46950 * loop-invariant.c (find_defs): Move df_remove_problem and df_process_deferred_rescans to move_invariants. Move df_live_add_problem and df_live_set_all_dirty calls to move_invariants. (move_invariants): Likewise. (move_loop_invariants): Likewise, making the df_live calls conditional on -O. Remove the problem again if we added it locally. 
Index: gcc/loop-invariant.c === --- gcc/loop-invariant.c2019-03-08 18:15:33.704751739 + +++ gcc/loop-invariant.c2019-04-01 14:09:13.553206345 + @@ -681,11 +681,7 @@ find_defs (struct loop *loop) loop->num); } - df_remove_problem (df_chain); - df_process_deferred_rescans (); df_chain_add_problem (DF_UD_CHAIN); - df_live_add_problem (); - df_live_set_all_dirty (); df_set_flags (DF_RD_PRUNE_DEAD_DEFS); df_analyze_loop (loop); check_invariant_table_size (); @@ -1891,6 +1887,10 @@ move_invariants (struct loop *loop) GENERAL_REGS, NO_REGS, GENERAL_REGS); } } + /* Remove the DF_UD_CHAIN problem added in find_defs before rescanning, + to save a bit of compile time. */ + df_remove_problem (df_chain); + df_process_deferred_rescans (); } /* Initializes invariant motion data. */ @@ -2254,6 +2254,11 @@ move_loop_invariants (void) { struct loop *loop; + if (optimize == 1) +{ + df_live_add_problem (); + df_live_set_all_dirty (); +} if (flag_ira_loop_pressure) { df_analyze (); @@ -2286,5 +2291,8 @@ move_loop_invariants (void) invariant_table = NULL; invariant_table_size = 0; + if (optimize == 1) +df_remove_problem (df_live); + checking_verify_flow_info (); }
Re: [PATCH] Improve DF use of loop-invariant (PR46590)
On April 1, 2019 4:13:28 PM GMT+02:00, Richard Sandiford wrote: >Richard Biener writes: >> After the fix to RPO VN in loop header copying DF via RTL invariant >> motion takes 50% of the compile-time for the second testcase in >> PR46590. This is caused by the DF live problem iterating over >> all dirty blocks for the local problem which is all blocks of >> the function because while loop-invariant uses df_analyze_loop >> it always marks all blocks of the function as dirty via >> df_live_set_all_dirty. One of the possible fixes is to add >> a df_live_set_loop_dirty () function which causes the problem >> to only iterate over one loop blocks. >> >> It turns out the LIVE problem is always present so we cannot >> remove it which in turn means we have to mark blocks possibly >> not visited by invaraint motion as dirty (thus the new >> df_live_set_all_dirty at the end of move_loop_invariants). >> >> Anyhow - I'm not really into how DF should ideally work here > >Me neither, but as discussed on IRC, here's alternative that seems >to give a similar speed-up. > >- df_live is already present at -O2, so we only need to add it and > mark all blocks dirty for -O > >- df_process_deferred_rescans should be enough to force a rescan of > blocks affected by moving invariants, but calling it in find_defs > means that we don't do any rescans for the final loop > >Still testing, but posting for comments. Looks like a good cleanup. It will still process all blocks at the first df_analyze_loop call but at least not all over again for each loop. So if it works out OK since it's conceptually simpler. Richard. >Thanks, >Richard > > >2019-04-01 Richard Sandiford > >gcc/ > PR rtl-optimization/46950 > * loop-invariant.c (find_defs): Move df_remove_problem and > df_process_deferred_rescans to move_invariants. > Move df_live_add_problem and df_live_set_all_dirty calls > to move_invariants. > (move_invariants): Likewise. 
> (move_loop_invariants): Likewise, making the df_live calls > conditional on -O. Remove the problem again if we added it > locally. > >Index: gcc/loop-invariant.c >=== >--- gcc/loop-invariant.c 2019-03-08 18:15:33.704751739 + >+++ gcc/loop-invariant.c 2019-04-01 14:09:13.553206345 + >@@ -681,11 +681,7 @@ find_defs (struct loop *loop) > loop->num); > } > >- df_remove_problem (df_chain); >- df_process_deferred_rescans (); > df_chain_add_problem (DF_UD_CHAIN); >- df_live_add_problem (); >- df_live_set_all_dirty (); > df_set_flags (DF_RD_PRUNE_DEAD_DEFS); > df_analyze_loop (loop); > check_invariant_table_size (); >@@ -1891,6 +1887,10 @@ move_invariants (struct loop *loop) >GENERAL_REGS, NO_REGS, GENERAL_REGS); > } > } >+ /* Remove the DF_UD_CHAIN problem added in find_defs before >rescanning, >+ to save a bit of compile time. */ >+ df_remove_problem (df_chain); >+ df_process_deferred_rescans (); > } > > /* Initializes invariant motion data. */ >@@ -2254,6 +2254,11 @@ move_loop_invariants (void) > { > struct loop *loop; > >+ if (optimize == 1) >+{ >+ df_live_add_problem (); >+ df_live_set_all_dirty (); >+} > if (flag_ira_loop_pressure) > { > df_analyze (); >@@ -2286,5 +2291,8 @@ move_loop_invariants (void) > invariant_table = NULL; > invariant_table_size = 0; > >+ if (optimize == 1) >+df_remove_problem (df_live); >+ > checking_verify_flow_info (); > }
[PATCH PR d/88462] Committed fix for abort in pthread_mutex_init on Solaris
Hi, This patch merges the libphobos druntime library with druntime upstream d57fa1ff. Fixes alignment of internal core.thread locks that were found to be the cause of abort() being called inside pthread_mutex_init(), fixing the second part of PR d/88462. Bootstrapped and regression tested on x86_64-linux-gnu. Committed to trunk as r270057. -- Iain --- diff --git a/libphobos/libdruntime/MERGE b/libphobos/libdruntime/MERGE index ed756fa6c18..15a55ab612a 100644 --- a/libphobos/libdruntime/MERGE +++ b/libphobos/libdruntime/MERGE @@ -1,4 +1,4 @@ -b9564bef1147c797842e6c1a804f2c3565c64ac1 +d57fa1ffaecc858229ed7a730e8486b59197dee5 The first line of this file holds the git revision number of the last merge done from the dlang/druntime repository. diff --git a/libphobos/libdruntime/core/internal/traits.d b/libphobos/libdruntime/core/internal/traits.d index d5786808054..e56f016c355 100644 --- a/libphobos/libdruntime/core/internal/traits.d +++ b/libphobos/libdruntime/core/internal/traits.d @@ -170,6 +170,29 @@ template anySatisfy(alias F, T...) } } +// simplified from std.traits.maxAlignment +template maxAlignment(U...) +{ +static if (U.length == 0) +static assert(0); +else static if (U.length == 1) +enum maxAlignment = U[0].alignof; +else static if (U.length == 2) +enum maxAlignment = U[0].alignof > U[1].alignof ? U[0].alignof : U[1].alignof; +else +{ +enum a = maxAlignment!(U[0 .. ($+1)/2]); +enum b = maxAlignment!(U[($+1)/2 .. $]); +enum maxAlignment = a > b ? a : b; +} +} + +template classInstanceAlignment(T) +if (is(T == class)) +{ +alias classInstanceAlignment = maxAlignment!(void*, typeof(T.tupleof)); +} + // Somehow fails for non-static nested structs without support for aliases template hasElaborateDestructor(T...) 
{ diff --git a/libphobos/libdruntime/core/thread.d b/libphobos/libdruntime/core/thread.d index e502072be7a..1cf26641e05 100644 --- a/libphobos/libdruntime/core/thread.d +++ b/libphobos/libdruntime/core/thread.d @@ -114,6 +114,13 @@ private { import core.atomic, core.memory, core.sync.mutex; +// Handling unaligned mutexes are not supported on all platforms, so we must +// ensure that the address of all shared data are appropriately aligned. +import core.internal.traits : classInstanceAlignment; + +enum mutexAlign = classInstanceAlignment!Mutex; +enum mutexClassInstanceSize = __traits(classInstanceSize, Mutex); + // // exposed by compiler runtime // @@ -1708,29 +1715,30 @@ private: // lock order inversion. @property static Mutex slock() nothrow @nogc { -return cast(Mutex)_locks[0].ptr; +return cast(Mutex)_slock.ptr; } @property static Mutex criticalRegionLock() nothrow @nogc { -return cast(Mutex)_locks[1].ptr; +return cast(Mutex)_criticalRegionLock.ptr; } -__gshared align(Mutex.alignof) void[__traits(classInstanceSize, Mutex)][2] _locks; +__gshared align(mutexAlign) void[mutexClassInstanceSize] _slock; +__gshared align(mutexAlign) void[mutexClassInstanceSize] _criticalRegionLock; static void initLocks() { -foreach (ref lock; _locks) -{ -lock[] = typeid(Mutex).initializer[]; -(cast(Mutex)lock.ptr).__ctor(); -} +_slock[] = typeid(Mutex).initializer[]; +(cast(Mutex)_slock.ptr).__ctor(); + +_criticalRegionLock[] = typeid(Mutex).initializer[]; +(cast(Mutex)_criticalRegionLock.ptr).__ctor(); } static void termLocks() { -foreach (ref lock; _locks) -(cast(Mutex)lock.ptr).__dtor(); +(cast(Mutex)_slock.ptr).__dtor(); +(cast(Mutex)_criticalRegionLock.ptr).__dtor(); } __gshared Context* sm_cbeg;
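The idea behind the new maxAlignment/classInstanceAlignment templates — a raw buffer standing in for a class instance must use the strictest alignment among the members — can be illustrated in C. The member set below is a hypothetical example for illustration, not druntime's actual Mutex layout.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

static size_t
max_align2 (size_t a, size_t b)
{
  return a > b ? a : b;
}

/* Fold the alignments of the assumed members, exactly as the D
   maxAlignment template folds typeof(T.tupleof): the buffer alignment
   is the maximum over all member alignments.  */
static size_t
mutex_buffer_align (void)
{
  size_t a = alignof (void *);             /* vtable/monitor pointers */
  a = max_align2 (a, alignof (long long)); /* e.g. a 64-bit lock word */
  return a;
}
```

An under-aligned buffer is exactly what made pthread_mutex_init abort on Solaris, hence the explicit align(mutexAlign) on the two lock buffers in the patch.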
Re: [PATCH, wwwdocs] Mention -march=armv8.5-a and other new command line options for AArch64 and Arm for GCC 9
Hi James On 29/03/2019 13:41, Sudakshina Das wrote: > Hi James > > On 22/03/2019 16:25, James Greenhalgh wrote: >> On Wed, Mar 20, 2019 at 10:17:41AM +, Sudakshina Das wrote: >>> Hi Kyrill >>> >>> On 12/03/2019 12:03, Kyrill Tkachov wrote: Hi Sudi, On 2/22/19 10:45 AM, Sudakshina Das wrote: > Hi > > This patch documents the addition of the new Armv8.5-A and > corresponding > extensions in the gcc-9/changes.html. > As per https://gcc.gnu.org/about.html, I have used W3 Validator. > Is this ok for cvs? > > Thanks > Sudi Index: htdocs/gcc-9/changes.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-9/changes.html,v retrieving revision 1.43 diff -u -r1.43 changes.html --- htdocs/gcc-9/changes.html 21 Feb 2019 10:32:55 - 1.43 +++ htdocs/gcc-9/changes.html 21 Feb 2019 18:25:09 - @@ -283,6 +283,19 @@ The intrinsics are defined by the ACLE specification. + + The Armv8.5-A architecture is now supported. This can be used by specifying the + -march=armv8.5-a option. I tend to prefer the wording "... is now supported through the -march=armv8.5-a option". Otherwise it reads as the compiler "using" the architecture, whereas we usually talk about "targeting" an architecture. + + The Armv8.5-A architecture also adds some security features that are optional to all older + architecture versions. These are also supported now and only effect the assembler. + + Speculation Barrier instruction using -march=armv8-a+sb. + Execution and Data Prediction Restriction instructions using -march=armv8-a+predres. + Speculative Store Bypass Safe instruction using -march=armv8-a+ssbs. This does not + require a compiler option for Arm and thus -march=armv8-a+ssbs is a AArch64 specific option. "AArch64-specific" LGTM otherwise. Thanks, Kyrill >>> >>> Thanks for the review and sorry for the delay in response. I had edited >>> the language for adding new options in a few other places as well. 
>>> >>> + The Armv8.5-A architecture also adds some security features >>> that are >>> + optional to all older architecture versions. These are also >>> supported now >> >> s/also supported now/now supported/ >> >>> + and only effect the assembler. >> >> s/effect/affect/ >> >>> + >>> + Speculation Barrier instruction through the >>> + -march=armv8-a+sb option. >>> + Execution and Data Prediction Restriction instructions through >>> + the -march=armv8-a+predres option. >>> + Speculative Store Bypass Safe instruction through the >>> + -march=armv8-a+ssbs option. This does not >>> require a >>> + compiler option for Arm and thus >>> -march=armv8-a+ssbs >>> + is an AArch64-specific option. >>> + >>> + >>> >>> AArch64 specific >>> @@ -362,6 +380,23 @@ >>> The default value is 16 (64Kb) and can be changed at configure >>> time using the flag >>> --with-stack-clash-protection-guard-size=12|16. >>> >>> + >>> + The option -msign-return-address= has been >>> deprecated. This >>> + has been replaced by the new -mbranch-protection= >>> option. This >>> + new option can now be used to enable the return address signing >>> as well as >>> + the new Branch Target Identification feature of Armv8.5-A >>> architecture. For >>> + more information on the arguments accepted by this option, >>> please refer to >>> + >> href="https://gcc.gnu.org/onlinedocs/gcc/AArch64-Options.html#AArch64-Options";>AArch64-Options. >>> >>> >>> + >>> + The following optional extensions to Armv8.5-A architecture >>> are also >>> + supported now and only effect the assembler. >> >> s/effect/affect/ >> >>> + >>> + Random Number Generation instructions through the >>> + -march=armv8.5-a+rng option. >>> + Memory Tagging Extension through the >>> + -march=armv8.5-a+memtag option. >>> + >>> + >>> >>> Arm specific >> >> Otherwise, OK by me but feel free to wait for people with gooder >> grammar than me to have their say. >> > > Thanks for spotting those. 
So far no one else with gooder grammar has > pointed out anything else. I will commit the patch with the changes you > suggested on Monday if no one else has any other objections. > Committed as 1.56 Thanks Sudi > Thanks > Sudi > >> Thanks, >> James >> >
[committed] sel-sched: remove assert in merge_fences (PR 87273)
Hi, I'm applying this patch on Andrey's behalf to resolve PR 87273 where an overly strict assert fires; Andrey explanation in PR comment #5 provides more detail. Alexander 2019-04-01 Andrey Belevantsev PR rtl-optimization/87273 * sel-sched-ir.c (merge_fences): Remove assert. * gcc.dg/pr87273.c: New test. diff --git a/gcc/sel-sched-ir.c b/gcc/sel-sched-ir.c index 2f76e56ca8e..6dec1beaa04 100644 --- a/gcc/sel-sched-ir.c +++ b/gcc/sel-sched-ir.c @@ -703,11 +703,6 @@ merge_fences (fence_t f, insn_t insn, else if (candidate->src == BLOCK_FOR_INSN (last_scheduled_insn)) { -/* Would be weird if same insn is successor of several fallthrough - edges. */ -gcc_assert (BLOCK_FOR_INSN (insn)->prev_bb -!= BLOCK_FOR_INSN (last_scheduled_insn_old)); - state_free (FENCE_STATE (f)); FENCE_STATE (f) = state; diff --git a/gcc/testsuite/gcc.dg/pr87273.c b/gcc/testsuite/gcc.dg/pr87273.c new file mode 100644 index 000..43662f0241c --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr87273.c @@ -0,0 +1,20 @@ +/* { dg-do compile { target powerpc*-*-* ia64-*-* i?86-*-* x86_64-*-* } } */ +/* { dg-options "-Os -fschedule-insns -fsel-sched-pipelining -fselective-scheduling -fno-ssa-phiopt -fno-tree-loop-im" } */ +/* { dg-additional-options "-march=core2" { target i?86-*-* x86_64-*-* } } */ + +int sd; + +void +w5 (int n4) +{ + long int *vq = (long int *) &n4; + + while (n4 < 1) +{ + int ks; + + ks = !!(n4 + 1) ? ((++sd) == *vq) : 0; + if (ks == 1 / *vq) +*vq *= sd; +} +}
Re: C++ PATCH for c++/89214 - ICE when initializing aggregates with bases
On Mon, Apr 01, 2019 at 10:15:11AM +0200, Andreas Schwab wrote: > On Mär 28 2019, Marek Polacek wrote: > > > Andreas, could you please find out why we're not hitting this code in > > digest_init_r: > > > > 1210 tree elt = CONSTRUCTOR_ELT (stripped_init, 0)->value; > > 1211 if (reference_related_p (type, TREE_TYPE (elt))) > > This is never executed if flag_checking is false, of course. It's certainly wrong for a warning to depend on flag_checking, so this patch corrects it, and I hope will fix the ia64 problem as well. Bootstrapped/regtested on x86_64-linux, ok for trunk? 2019-04-01 Marek Polacek * typeck2.c (digest_init_r): Don't condition the object slicing warning on flag_checking. diff --git gcc/cp/typeck2.c gcc/cp/typeck2.c index fa98b1cb8b5..55b84f043f4 100644 --- gcc/cp/typeck2.c +++ gcc/cp/typeck2.c @@ -1200,8 +1200,7 @@ digest_init_r (tree type, tree init, int nested, int flags, /* "If T is a class type and the initializer list has a single element of type cv U, where U is T or a class derived from T, the object is initialized from that element." */ - if (flag_checking - && cxx_dialect >= cxx11 + if (cxx_dialect >= cxx11 && BRACE_ENCLOSED_INITIALIZER_P (stripped_init) && CONSTRUCTOR_NELTS (stripped_init) == 1 && ((CLASS_TYPE_P (type) && !CLASSTYPE_NON_AGGREGATE (type)) @@ -1228,7 +1227,7 @@ digest_init_r (tree type, tree init, int nested, int flags, "results in object slicing", TREE_TYPE (field))) inform (loc, "remove %<{ }%> around initializer"); } - else + else if (flag_checking) /* We should have fixed this in reshape_init. */ gcc_unreachable (); }
[PATCH, OpenACC, og8, committed] Handle Compute Capability 7.0 in libgomp
I've committed this single-liner (and added a comment) to OG8. This allows detection of Compute Capability 7.0 (Volta) and lets libgomp allocate a more reasonable default gang number for Volta GPUs. Tested without regressions on a powerpc64le-linux system. Note that mainline has different code for doing this task, so this patch doesn't apply there (not needed there). Chung-Lin [og] Handle Compute Capability 7.0 (Volta) libgomp/ * plugin/plugin-nvptx.c (GOMP_OFFLOAD_load_image): Handle up to Compute Capability 7.0. diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c index c2d3b71..706a36f 100644 --- a/libgomp/plugin/plugin-nvptx.c +++ b/libgomp/plugin/plugin-nvptx.c @@ -1273,8 +1273,9 @@ GOMP_OFFLOAD_load_image (int ord, unsigned version, const void *target_data, dev->register_allocation_unit_size = 256; dev->register_allocation_granularity = 2; } - else if (dev->binary_version <= 62) + else if (dev->binary_version <= 70) { + /* Compute Capability 6.1, 6.2, 7.0 share same parameters. */ dev->register_allocation_unit_size = 256; dev->register_allocation_granularity = 4; }
Re: [PATCH] elf.c: initialize st_mode member
On 4/1/19 12:03 AM, Yu, Mingli wrote:
> Ping.
>
> Thanks,
>
> On 2019年03月19日 16:37, mingli...@windriver.com wrote:
>> From: Mingli Yu
>>
>> Initialize the st_mode member to fix the below
>> build failure when -Og is included in the compiler flags.
>> | ./../../../../../../../../work-shared/gcc-8.3.0-r0/gcc-8.3.0/libsanitizer/libbacktrace/../../libbacktrace/elf.c: In function 'elf_is_symlink':
>> | ../../../../../../../../../work-shared/gcc-8.3.0-r0/gcc-8.3.0/libsanitizer/libbacktrace/../../libbacktrace/elf.c:772:21: error: 'st.st_mode' may be used uninitialized in this function [-Werror=maybe-uninitialized]
>> return S_ISLNK (st.st_mode);

We already went through this with someone else.  Use -O2 rather than
-Og or -O1 when compiling that file.  The warning is a false positive
and the higher optimization levels

jeff
[PATCH] Fix a missed case in predicate analysis of the late uninit pass
Hi! This is a fairly trivial change fixing a false negative in -Wmaybe-uninitialized. I am pretty sure this is simply an overlooked case (is_value_included_in() is not meant to deal with the case where both compare codes are NE_EXPRs, neither does it need to, since their handling is trivial). In a nutshell, what happens is values of v restricted by (v != 2) are incorrectly considered a subset of values of v restricted by (v != 1). As if "v != 2, therefore v != 1". This is by no means a gcc-9 regression; I'll ping the patch once stage4 is over, if needed. This came up when I was experimenting with moving the uninit passes around. On mainline, the late uninit pass runs very late, so reliably triggering the affected path is a bit tricky. So I created a GIMPLE test (it reproduces the behavior precisely, but might be fragile w.r.t. future versions of the textual representation) and then with a hint from Alexander managed to produce a simple C test. [By the way, the first take was to insert an asm with a lot of newlines to prevent the dom pass from rewriting the CFG due to high cost of duplicating instructions. This didn't work out; I think the dom pass does not respect sizes of inline asms. I plan to create a testcase and file a bug later.] I ran regression testing on x86_64-pc-linux-gnu and saw no new regressions modulo a handful of flaky UNRESOLVED tests under gcc.dg/tree-prof. `BOOT_CFLAGS="-O -Wno-error=maybe-uninitialized -Wmaybe-uninitialized" bootstrap` also succeeded producing no new warnings. OK for stage1? gcc/ChangeLog: * tree-ssa-uninit.c (is_pred_expr_subset_of): Correctly handle a case where the operand of both simple predicates is the same, both compare codes are NE_EXPR, and the integer values are different (e.g. neither of the predicate expressions 'v != 1' and 'v != 2' should be considered a subset of the other). gcc/testsuite/ChangeLog: * gcc.dg/uninit-25-gimple.c: New test. * gcc.dg/uninit-25.c: New test. 
--- gcc/testsuite/gcc.dg/uninit-25-gimple.c | 41 + gcc/testsuite/gcc.dg/uninit-25.c| 23 ++ gcc/tree-ssa-uninit.c | 3 ++ 3 files changed, 67 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/uninit-25-gimple.c create mode 100644 gcc/testsuite/gcc.dg/uninit-25.c diff --git a/gcc/testsuite/gcc.dg/uninit-25-gimple.c b/gcc/testsuite/gcc.dg/uninit-25-gimple.c new file mode 100644 index 000..0c0acd6b83e --- /dev/null +++ b/gcc/testsuite/gcc.dg/uninit-25-gimple.c @@ -0,0 +1,41 @@ +/* { dg-do compile } */ +/* { dg-options "-fgimple -O -Wmaybe-uninitialized" } */ + +unsigned int __GIMPLE (ssa,startwith("uninit1")) +foo (unsigned int v) +{ + unsigned int undef;/* { dg-warning "may be used uninitialized" } */ + unsigned int _2; + unsigned int _9; + unsigned int _10; + + __BB(2): + if (v_4(D) != 1u) +goto __BB3; + else +goto __BB4; + + __BB(3): + undef_8 = 8u; // 'undef' is defined conditionally (under 'v != 1' predicate) + goto __BB4; + + __BB(4): + // An undef value flows into a phi. + undef_1 = __PHI (__BB2: undef_5(D), __BB3: undef_8); + if (v_4(D) != 2u) // This condition is neither a superset nor a subset of 'v != 1'. +goto __BB5; + else +goto __BB6; + + __BB(5): + _9 = undef_1; // The phi value is used here (under 'v != 2' predicate). + goto __BB7; + + __BB(6): + _10 = v_4(D); + goto __BB7; + + __BB(7): + _2 = __PHI (__BB5: _9, __BB6: _10); + return _2; +} diff --git a/gcc/testsuite/gcc.dg/uninit-25.c b/gcc/testsuite/gcc.dg/uninit-25.c new file mode 100644 index 000..ce3f91c --- /dev/null +++ b/gcc/testsuite/gcc.dg/uninit-25.c @@ -0,0 +1,23 @@ +/* { dg-do compile } */ +/* { dg-options "-O -Wmaybe-uninitialized" } */ + +extern unsigned bar (void); +extern void quux (void); + +unsigned foo (unsigned v) +{ + unsigned u; + if (v != 1) +u = bar (); + + // Prevent the "dom" pass from changing the CFG layout based on the inference + // 'if (v != 1) is false then (v != 2) is true'. (Now it would have to + // duplicate the loop in order to do so, which is deemed expensive.) 
+ for (int i = 0; i < 10; i++) +quux (); + + if (v != 2) +return u; /* { dg-warning "may be used uninitialized" } */ + + return 0; +} diff --git a/gcc/tree-ssa-uninit.c b/gcc/tree-ssa-uninit.c index 55a55a05c96..1de7d21d2eb 100644 --- a/gcc/tree-ssa-uninit.c +++ b/gcc/tree-ssa-uninit.c @@ -1488,6 +1488,9 @@ is_pred_expr_subset_of (pred_info expr1, pred_info expr2) if (expr2.invert) code2 = invert_tree_comparison (code2, false); + if (code1 == NE_EXPR && code2 == NE_EXPR) +return false; + if ((code1 == EQ_EXPR || code1 == BIT_AND_EXPR) && code2 == BIT_AND_EXPR) return (wi::to_wide (expr1.pred_rhs) == (wi::to_wide (expr1.pred_rhs) & wi::to_wide (expr2.pred_rhs))); -- 2.20.1 -- Vlad
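The set relation the patch encodes can be double-checked by brute force over a small domain. The helper below is a throwaway illustration, not GCC's is_value_included_in: the values allowed by `v != a` form a subset of those allowed by `v != b` only when a == b, since for distinct constants (say a = 2, b = 1) the value v = 1 satisfies the first predicate but not the second.

```c
#include <assert.h>
#include <stdbool.h>

/* Is every v in [-8, 8] satisfying (v != a) also satisfying (v != b)?
   For NE predicates over distinct constants this is always false,
   which is exactly the case the patch makes is_pred_expr_subset_of
   reject.  */
static bool
ne_pred_subset_p (int a, int b)
{
  for (int v = -8; v <= 8; v++)
    if (v != a && !(v != b))
      return false;
  return true;
}
```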
Re: [PATCH] Add PSTL internal namespace qualifications
Jonathan Wakely writes: > On 29/03/19 12:12 -0700, Thomas Rodgers wrote: >>Prevent ADL related weirdness. >> >> * include/pstl/algorithm_impl.h: Add namespace qualification. >> * include/pstl/execution_defs.h: Add namespace qualification. >> * include/pstl/execution_impl.h: Add namespace qualification. >> * include/pstl/numeric_impl.h: Add namespace qualification. >> * include/pstl/parallel_backend_tbb.h: Add namespace qualification. >> * include/pstl/unseq_backend_simd.h: Add namespace qualification. >> * include/pstl/parallel_backend_utils.h: Include . > > We shouldn't include in the std::lib, the uses of assert > should be changed to __glibcxx_assert instead (which is enabled by > defining _GLIBCXX_ASSERTIONS). > This has to come through one of the PSTL library configuration macros because the "right assert" upstream won't be __glibcxx_assert. >>@@ -285,7 +286,7 @@ _ForwardIterator2 >> __pattern_walk2_n(_ExecutionPolicy&&, _ForwardIterator1 __first1, _Size n, >> _ForwardIterator2 __first2, _Function f, >> _IsVector is_vector, /*parallel=*/std::false_type) noexcept > > I missed these before, but we have non-uglified 'n' and 'f' and > 'is_vector' here. Almost all uses of "is_vector" are inside a comment > like /*is_vector=*/ but here it's a real parameter name. > > In practice if users do something stupid like > #define n 123 > #define f 456 > then they won't be able to include these headers anyway, because > TBB uses those as parameter names (so users that are using these > headers with TBB can't defines such dumb macros). But when we get a > non-TBB backend that won't be the case, and we should be uglifying all > names declared in our own headers on principle. 
> >> { >>-return __brick_walk2_n(__first1, n, __first2, f, is_vector); >>+return __internal::__brick_walk2_n(__first1, n, __first2, f, is_vector); >> } >> >> template > class _RandomAccessIterator2, >>@@ -294,7 +295,7 @@ _RandomAccessIterator2 >> __pattern_walk2_n(_ExecutionPolicy&& __exec, _RandomAccessIterator1 >> __first1, _Size n, _RandomAccessIterator2 __first2, >> _Function f, _IsVector is_vector, >> /*parallel=*/std::true_type) > > And 'n' and 'f' here. > >> { >>-return __pattern_walk2(std::forward<_ExecutionPolicy>(__exec), __first1, >>__first1 + n, __first2, f, is_vector, >>+return >>__internal::__pattern_walk2(std::forward<_ExecutionPolicy>(__exec), __first1, >>__first1 + n, __first2, f, is_vector, >>std::true_type()); >> } >> > > We can fix the 'n' and 'f' and 'is_vector' names in another patch, for > this one please just remove again and change assert(...) to > __glibcxx_assert(...).
patch for PR89865
The following patch is for https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89865 The patch was successfully tested on x86-64, aarch64, and ppc64. The patch decreases scan matches from 82 to 8 on x86-64. I don't know what test pr49095.c should expect. Therefore I don't change the test itself. Committed as rev. 270060 Index: ChangeLog === --- ChangeLog (revision 270059) +++ ChangeLog (working copy) @@ -1,3 +1,9 @@ +2019-04-01 Vladimir Makarov + + PR rtl-optimization/89865 + * ira-costs.c (process_bb_node_for_hard_reg_moves): Skip hard + register if it is a part of small class. + 2019-04-01 Andrey Belevantsev PR rtl-optimization/87273 Index: ira-costs.c === --- ira-costs.c (revision 270004) +++ ira-costs.c (working copy) @@ -2107,6 +2107,13 @@ process_bb_node_for_hard_reg_moves (ira_ } else continue; + if (reg_class_size[(int) REGNO_REG_CLASS (hard_regno)] + == (ira_reg_class_max_nregs + [REGNO_REG_CLASS (hard_regno)][(int) ALLOCNO_MODE(a)])) + /* If the class can provide only one hard reg to the allocno, + we processed the insn record_operand_costs already and we + actually updated the hard reg cost there. */ + continue; rclass = ALLOCNO_CLASS (a); if (! TEST_HARD_REG_BIT (reg_class_contents[rclass], hard_regno)) continue;
[committed] sel-sched: update liveness in redirect_edge_and_branch hooks (PR 86928)
Hi, This patch by Andrey fixes PR 86928 by avoiding leaving basic blocks without computed liveness info. 2019-04-01 Andrey Belevantsev PR rtl-optimization/86928 * sel-sched-ir.c (sel_redirect_edge_and_branch_force): Invoke compute_live if necessary. (sel_redirect_edge_and_branch): Likewise. * gcc.dg/pr86928.c: New test. diff --git a/gcc/sel-sched-ir.c b/gcc/sel-sched-ir.c index e8e508ef692..2f76e56ca8e 100644 --- a/gcc/sel-sched-ir.c +++ b/gcc/sel-sched-ir.c @@ -5642,6 +5642,8 @@ sel_redirect_edge_and_branch_force (edge e, basic_block to) recompute_dominator (CDI_DOMINATORS, to)); set_immediate_dominator (CDI_DOMINATORS, orig_dest, recompute_dominator (CDI_DOMINATORS, orig_dest)); + if (jump && sel_bb_head_p (jump)) +compute_live (jump); } /* A wrapper for redirect_edge_and_branch. Return TRUE if blocks connected by @@ -5702,6 +5704,8 @@ sel_redirect_edge_and_branch (edge e, basic_block to) set_immediate_dominator (CDI_DOMINATORS, orig_dest, recompute_dominator (CDI_DOMINATORS, orig_dest)); } + if (jump && sel_bb_head_p (jump)) +compute_live (jump); return recompute_toporder_p; } diff --git a/gcc/testsuite/gcc.dg/pr86928.c b/gcc/testsuite/gcc.dg/pr86928.c new file mode 100644 index 000..1586a36f945 --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr86928.c @@ -0,0 +1,25 @@ +/* { dg-do compile { target powerpc*-*-* ia64-*-* i?86-*-* x86_64-*-* } } */ +/* { dg-options "-O2 -fnon-call-exceptions -fsel-sched-pipelining -fsel-sched-pipelining-outer-loops -fselective-scheduling -fno-if-conversion -fno-tree-dce -w" } */ +/* { dg-additional-options "-m32" { target powerpc*-*-* } } */ + +int kn; + +void +gd (short int sk) +{ + char *as; + + while (sk < 1) +{ + char *ci; + + if (*ci == 0) +*as += ci; + + for (kn = 0; kn < 18; ++kn) +{ +} + + ++sk; +} +}
Re: [PATCH] Updated patch for PR84101
Richard Biener writes: > This is an update from last years attempt to tame down vectorization > when it runs in to ABI inefficiencies at function return. I've > updated the patch to look for multi-reg returns instead of > !vector ones (due to word_mode vectorization) and handle a bit > more returns, esp. the common pattern of > > ret = vector; > D.1234 = ret; > ret = {CLOBBER}; > return D.1234; > > Bootstrapped and tested on x86_64-unknown-linux-gnu, OK? > > I know this isn't the ultimate solution but we keep getting > duplicates of the PR. > > Richard. > > 2019-03-28 Richard Biener > > PR tree-optimization/84101 > * tree-vect-stmts.c: Include explow.h for hard_function_value, > regs.h for hard_regno_nregs. > (cfun_returns): New helper. > (vect_model_store_cost): When vectorizing a store to a decl > we return and the function ABI returns in a multi-reg location > account for the possible spilling that will happen. > > * gcc.target/i386/pr84101.c: New testcase. > > Index: gcc/tree-vect-stmts.c > === > --- gcc/tree-vect-stmts.c (revision 269987) > +++ gcc/tree-vect-stmts.c (working copy) > @@ -43,6 +43,7 @@ along with GCC; see the file COPYING3. > #include "tree-cfg.h" > #include "tree-ssa-loop-manip.h" > #include "cfgloop.h" > +#include "explow.h" > #include "tree-ssa-loop.h" > #include "tree-scalar-evolution.h" > #include "tree-vectorizer.h" > @@ -52,6 +53,7 @@ along with GCC; see the file COPYING3. > #include "vec-perm-indices.h" > #include "tree-ssa-loop-niter.h" > #include "gimple-fold.h" > +#include "regs.h" > > /* For lang_hooks.types.type_for_mode. */ > #include "langhooks.h" > @@ -948,6 +950,37 @@ vect_model_promotion_demotion_cost (stmt > "prologue_cost = %d .\n", inside_cost, prologue_cost); > } > > +/* Returns true if the current function returns DECL. 
*/ > + > +static bool > +cfun_returns (tree decl) > +{ > + edge_iterator ei; > + edge e; > + FOR_EACH_EDGE (e, ei, EXIT_BLOCK_PTR_FOR_FN (cfun)->preds) > +{ > + greturn *ret = safe_dyn_cast (last_stmt (e->src)); > + if (!ret) > + continue; > + if (gimple_return_retval (ret) == decl) > + return true; > + /* We often end up with an aggregate copy to the result decl, > + handle that case as well. First skip intermediate clobbers > + though. */ > + gimple *def = ret; > + do > + { > + def = SSA_NAME_DEF_STMT (gimple_vuse (def)); > + } > + while (gimple_clobber_p (def)); > + if (is_a (def) > + && gimple_assign_lhs (def) == gimple_return_retval (ret) > + && gimple_assign_rhs1 (def) == decl) > + return true; > +} > + return false; > +} > + > /* Function vect_model_store_cost > > Models cost for stores. In the case of grouped accesses, one access > @@ -1032,6 +1065,37 @@ vect_model_store_cost (stmt_vec_info stm > vec_to_scalar, stmt_info, 0, vect_body); > } > > + /* When vectorizing a store into the function result assign > + a penalty if the function returns in a multi-register location. > + In this case we assume we'll end up with having to spill the > + vector result and do piecewise loads. */ > + tree base = get_base_address (STMT_VINFO_DATA_REF (stmt_info)->ref); > + if (base > + && (TREE_CODE (base) == RESULT_DECL > + || (DECL_P (base) && cfun_returns (base))) > + && !aggregate_value_p (base, cfun->decl)) > +{ > + rtx reg = hard_function_value (TREE_TYPE (base), cfun->decl, 0, 1); > + /* ??? Handle PARALLEL in some way. */ > + if (REG_P (reg)) > + { > + int nregs = hard_regno_nregs (REGNO (reg), GET_MODE (reg)); > + /* Assume that a reg-reg move is possible and cheap, > + do not account for vector to gp register move cost. 
*/ > + if (nregs > 1) Looks OK to me FWIW, but maybe it would be worth having a check like: targetm.secondary_memory_needed (TYPE_MODE (vectype), ALL_REGS, REGNO_REG_CLASS (REGNO (reg))) as well as the above, so that we don't accidentally penalise values that are returned in multiple vector registers. Looks like the i386.c definition will return true in the cases that matter. Thanks, Richard > + { > + /* Spill. */ > + prologue_cost += record_stmt_cost (cost_vec, ncopies, > + vector_store, > + stmt_info, 0, vect_epilogue); > + /* Loads. */ > + prologue_cost += record_stmt_cost (cost_vec, ncopies * nregs, > + scalar_load, > + stmt_info, 0, vect_epilogue); > + } > + } > +} > + >if (dump_enabled_p ()) > dump_printf_loc (MSG_NOTE, v
Re: [Patch] PR rtl-optimization/87763 - generate more bfi instructions on aarch64
This is a ping**3 for a patch to fix one of the test failures in PR 87763.
It fixes the gcc.target/aarch64/combine_bfi_1.c failure, but not the other
ones.  Could one of the AArch64 maintainers take a look at it?

This version of the patch was originally submitted on February 11 after
incorporating some changes suggested by Wilco Dijkstra, but I have not
gotten any comments since then despite pinging it.  I updated it to apply
cleanly on ToT but did not make any other changes since the February 11th
version.

If we want to encourage people to fix bugs in the run up to a release it
would help to get feedback on bugfix patches.

Steve Ellcey
sell...@marvell.com

2018-04-01  Steve Ellcey

	PR rtl-optimization/87763
	* config/aarch64/aarch64-protos.h (aarch64_masks_and_shift_for_bfi_p):
	New prototype.
	* config/aarch64/aarch64.c (aarch64_masks_and_shift_for_bfi_p):
	New function.
	* config/aarch64/aarch64.md (*aarch64_bfi5_shift): New instruction.
	(*aarch64_bfi4_noand): Ditto.
	(*aarch64_bfi4_noshift): Ditto.
	(*aarch64_bfi4_noshift_alt): Ditto.

2018-04-01  Steve Ellcey

	PR rtl-optimization/87763
	* gcc.target/aarch64/combine_bfxil.c: Change some bfxil checks
	to bfi.
	* gcc.target/aarch64/combine_bfi_2.c: New test.
diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h index b035e35..b6c0d0a 100644 --- a/gcc/config/aarch64/aarch64-protos.h +++ b/gcc/config/aarch64/aarch64-protos.h @@ -429,6 +429,9 @@ bool aarch64_label_mentioned_p (rtx); void aarch64_declare_function_name (FILE *, const char*, tree); bool aarch64_legitimate_pic_operand_p (rtx); bool aarch64_mask_and_shift_for_ubfiz_p (scalar_int_mode, rtx, rtx); +bool aarch64_masks_and_shift_for_bfi_p (scalar_int_mode, unsigned HOST_WIDE_INT, + unsigned HOST_WIDE_INT, + unsigned HOST_WIDE_INT); bool aarch64_zero_extend_const_eq (machine_mode, rtx, machine_mode, rtx); bool aarch64_move_imm (HOST_WIDE_INT, machine_mode); opt_machine_mode aarch64_sve_pred_mode (unsigned int); diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index b38505b..3017e99 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -9336,6 +9336,35 @@ aarch64_mask_and_shift_for_ubfiz_p (scalar_int_mode mode, rtx mask, & ((HOST_WIDE_INT_1U << INTVAL (shft_amnt)) - 1)) == 0; } +/* Return true if the masks and a shift amount from an RTX of the form + ((x & MASK1) | ((y << SHIFT_AMNT) & MASK2)) are valid to combine into + a BFI instruction of mode MODE. See *arch64_bfi patterns. */ + +bool +aarch64_masks_and_shift_for_bfi_p (scalar_int_mode mode, + unsigned HOST_WIDE_INT mask1, + unsigned HOST_WIDE_INT shft_amnt, + unsigned HOST_WIDE_INT mask2) +{ + unsigned HOST_WIDE_INT t; + + /* Verify that there is no overlap in what bits are set in the two masks. */ + if (mask1 != ~mask2) +return false; + + /* Verify that mask2 is not all zeros or ones. */ + if (mask2 == 0 || mask2 == HOST_WIDE_INT_M1U) +return false; + + /* The shift amount should always be less than the mode size. */ + gcc_assert (shft_amnt < GET_MODE_BITSIZE (mode)); + + /* Verify that the mask being shifted is contiguous and would be in the + least significant bits after shifting by shft_amnt. 
*/ + t = mask2 + (HOST_WIDE_INT_1U << shft_amnt); + return (t == (t & -t)); +} + /* Calculate the cost of calculating X, storing it in *COST. Result is true if the total cost of the operation has now been calculated. */ static bool diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 70f0418..baa8983 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -5490,6 +5490,76 @@ [(set_attr "type" "bfm")] ) +;; Match a bfi instruction where the shift of OP3 means that we are +;; actually copying the least significant bits of OP3 into OP0 by way +;; of the AND masks and the IOR instruction. A similar instruction +;; with the two parts of the IOR swapped around was never triggered +;; in a bootstrap build and test of GCC so it was not included. + +(define_insn "*aarch64_bfi5_shift" + [(set (match_operand:GPI 0 "register_operand" "=r") +(ior:GPI (and:GPI (match_operand:GPI 1 "register_operand" "0") + (match_operand:GPI 2 "const_int_operand" "n")) + (and:GPI (ashift:GPI + (match_operand:GPI 3 "register_operand" "r") + (match_operand:GPI 4 "aarch64_simd_shift_imm_" "n")) + (match_operand:GPI 5 "const_int_operand" "n"] + "aarch64_masks_and_shift_for_bfi_p (mode, UINTVAL (operands[2]), + UINTVAL (operands[4]), + UINTVAL(operands[5]))" + "bfi\t%0, %3, %4, %P5" + [(set_attr "type" "bfm")] +) + +;; Like *aarch64_bfi5_shift but with no and of the ashift because +;; the shift is large
Re: [PATCH PR81740]Enforce dependence check for outer loop vectorization
Richard Biener writes: > On Tue, Mar 26, 2019 at 1:56 AM Richard Sandiford > wrote: >> Based on the "complete unrolling" view, if we number statements as >> (i, n), where i is the outer loop iteration and n is a statement number >> in the (completely unrolled) loop body, then the original scalar code >> executes in lexicographical order while for the vector loop: >> >> (1) (i,n) executes before (i+ix,n+nx) for all ix>=0, nx>=1, regardless of VF >> (2) (i,n) executes before (i+ix,n-nx) for all ix>=VF, nx>=0 >> (well, nx unrestricted, but only nx>=0 is useful given (1)) >> >> So for any kind of dependence between (i,n) and (i+ix,n-nx), ix>=1, nx>=0 >> we need to restrict VF to ix so that (2) ensures the right order. >> This means that the unnormalised distances of interest are: >> >> - (ix, -nx), ix>=1, nx>=0 >> - (-ix, nx), ix>=1, nx>=0 >> >> But the second gets normalised to the first, which is actually useful >> in this case :-). >> >> In terms of the existing code, I think that means we want to change >> the handling of nested statements (only) to: >> >> - ignore DDR_REVERSED_P (ddr) >> - restrict the main dist > 0 case to when the inner distance is <= 0. >> >> This should have the side effect of allowing outer-loop vectorisation for: >> >> void __attribute__ ((noipa)) >> f (int a[][N], int b[restrict]) >> { >> for (int i = N - 1; i-- > 0; ) >> for (int j = 0; j < N - 1; ++j) >> a[j + 1][i] = a[j][i + 1] + b[i]; >> } >> >> At the moment we reject this, but AFAICT it should be OK. >> (We do allow it for s/i + 1/i/, since then the outer distance is 0.) > > Can you file an enhancement request so we don't forget? OK, for the record it's https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89908
Re: [PATCH] handle strings as template arguments (PR 47488, 89833, 89876)
On 3/31/19 10:17 PM, Martin Sebor wrote: To fix PR 89833, a P1 regression, the attached patch tries to handle string literals as C++ 2a non-type template arguments by treating them the same as brace enclosed initializer lists (where the latter are handled correctly). The solution makes sure equivalent forms of array initializers such as for char[5]: "\0\1\2" "\0\1\2\0" { 0, 1, 2 } { 0, 1, 2, 0 } { 0, 1, 2, 0, 0 } are treated as the same, both for overloading and mangling. Ditto for the following equivalent forms: "" "\0" "\0\0\0\0" { } { 0 } { 0, 0, 0, 0, 0 } and for these of struct { char a[5], b[5], c[5]; }: { {0,0,0,0,0}, {0,0,0,0}, {0,0,0} } { { 0 }, { } } { "" } Since this is not handled correctly by the current code (see PR 89876 for a test case) the patch also fixes that. I'm not at all confident the handling of classes with user-defined constexpr ctors is 100% correct. (I use triviality to constrain the solution for strings but that was more of an off-the-cuff guess than a carefully considered decision). You could use TYPE_HAS_TRIVIAL_DFLT since we don't care about the triviality of other operations. I wouldn't worry about trying to omit user-defined constexpr ctors. The g++.dg/abi/mangle71.C test is all I've got in terms of verifying it works correctly. I'm quite sure the C++ 2a testing could stand to be beefed up. The patch passes x86_64-linux bootstrap and regression tests. There are a few failures in check-c++-all tests that don't look related to the changes (I see them with an unpatched GCC as well): g++.dg/spellcheck-macro-ordering-2.C g++.dg/cpp0x/auto52.C g++.dg/cpp1y/auto-neg1.C g++.dg/other/crash-9.C You probably need to check zero_init_p to properly handle pointers to data members, where a null value is integer -1; given struct A { int i; }; constexpr A::* pm = &A::i; int A::* pma[] = { pm, pm }; we don't want to discard the initializers because they look like zeros, as then digest_init will add back -1s. 
+ unsigned n = TREE_STRING_LENGTH (value); + const char *str = TREE_STRING_POINTER (value); + /* Count the number of trailing nuls and subtract them from +STRSIZE because they don't need to be mangled. */ + tree strsizenode = TYPE_SIZE_UNIT (TREE_TYPE (value)); + unsigned strsize = tree_to_uhwi (strsizenode); + if (strsize > n) + strsize = n; Why not just use TREE_STRING_LENGTH? Jason
[PATCH] Check avx2_available in check_avx2_available
check_avx2_available should check avx2_available, instead of
avx_available.  Otherwise, check_avx2_available may use the result
cached from check_avx_available.

	PR testsuite/89907
	* lib/target-supports.exp (check_avx2_available): Replace
	avx_available with avx2_available.
---
 gcc/testsuite/lib/target-supports.exp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index ac60c1e1567..90efaea804d 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -7403,7 +7403,7 @@ proc check_avx_available { } {
 # Return true if we are compiling for AVX2 target.

 proc check_avx2_available { } {
-  if { [check_no_compiler_messages avx_available assembly {
+  if { [check_no_compiler_messages avx2_available assembly {
	#ifndef __AVX2__
	#error unsupported
	#endif
--
2.20.1
[committed] sel-sched: correct reset of reset_sched_cycles_p (PR 85412)
Hi, this patch by Andrey moves an assignment to reset_sched_cycles_p after rev. 259228 placed it incorrectly. 2019-04-01 Andrey Belevantsev PR rtl-optimization/85412 * sel-sched.c (sel_sched_region): Assign reset_sched_cycles_p before sel_sched_region_1, not after. * gcc.dg/pr85412.c: New test. diff --git a/gcc/sel-sched.c b/gcc/sel-sched.c index 338d7c097df..552dd0b9263 100644 --- a/gcc/sel-sched.c +++ b/gcc/sel-sched.c @@ -7650,11 +7650,11 @@ sel_sched_region (int rgn) /* Schedule always selecting the next insn to make the correct data for bundling or other later passes. */ pipelining_p = false; + reset_sched_cycles_p = false; force_next_insn = 1; sel_sched_region_1 (); force_next_insn = 0; } - reset_sched_cycles_p = pipelining_p; sel_region_finish (reset_sched_cycles_p); } diff --git a/gcc/testsuite/gcc.dg/pr85412.c b/gcc/testsuite/gcc.dg/pr85412.c new file mode 100644 index 000..11b8ceccd1e --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr85412.c @@ -0,0 +1,21 @@ +/* { dg-do compile { target powerpc*-*-* ia64-*-* x86_64-*-* } } */ +/* { dg-require-effective-target int128 } */ +/* { dg-options "-O1 -fpeephole2 -fschedule-insns2 -fsel-sched-pipelining -fselective-scheduling2 -ftree-loop-if-convert -fno-if-conversion -fno-move-loop-invariants -fno-split-wide-types -fno-tree-dominator-opts" } */ +/* { dg-additional-options "-march=bonnell" { target x86_64-*-* } } */ + +__int128 jv; + +void +zm (__int128 g9, unsigned short int sm, short int hk) +{ + while (hk < 1) +{ + if (jv == 0) +sm *= g9; + + if (sm < jv) +hk = sm; + + g9 |= sm == hk; +} +}
Re: [PATCH] Check avx2_available in check_avx2_available
On Mon, Apr 1, 2019 at 7:39 PM H.J. Lu wrote:
>
> check_avx2_available should check avx2_available, instead of avx_available.
> Otherwise, check_avx2_available may use result from check_avx_available.
>
> 	PR testsuite/89907
> 	* lib/target-supports.exp (check_avx2_available): Replace
> 	avx_available with avx2_available.

OK.

Thanks,
Uros.

> ---
>  gcc/testsuite/lib/target-supports.exp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
> index ac60c1e1567..90efaea804d 100644
> --- a/gcc/testsuite/lib/target-supports.exp
> +++ b/gcc/testsuite/lib/target-supports.exp
> @@ -7403,7 +7403,7 @@ proc check_avx_available { } {
>  # Return true if we are compiling for AVX2 target.
>
>  proc check_avx2_available { } {
> -  if { [check_no_compiler_messages avx_available assembly {
> +  if { [check_no_compiler_messages avx2_available assembly {
> 	#ifndef __AVX2__
> 	#error unsupported
> 	#endif
> --
> 2.20.1
>
[C++ PATCH] PR c++/86946 - ICE with function call in template argument.
DR 1321 clarified that two dependent names are equivalent if the names are the same, even if the result of name lookup is different. So template argument hashing should treat a lookup set like a plain identifier. Mangling already does. Tested x86_64-pc-linux-gnu, applying to trunk. * pt.c (iterative_hash_template_arg): Hash all overload sets like an identifier. --- gcc/cp/pt.c | 10 +- gcc/testsuite/g++.dg/cpp0x/fntmp-equiv1.C | 23 +++ gcc/cp/ChangeLog | 7 +++ 3 files changed, 35 insertions(+), 5 deletions(-) create mode 100644 gcc/testsuite/g++.dg/cpp0x/fntmp-equiv1.C diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c index f3faa89f671..ea1976bd0f5 100644 --- a/gcc/cp/pt.c +++ b/gcc/cp/pt.c @@ -1757,6 +1757,11 @@ iterative_hash_template_arg (tree arg, hashval_t val) code = TREE_CODE (arg); tclass = TREE_CODE_CLASS (code); + if (code == FUNCTION_DECL || code == OVERLOAD) +/* DR 1321: Hash a dependent name as the name, not the lookup result. */ +return iterative_hash_template_arg (DECL_NAME (get_first_fn (arg)), + val); + val = iterative_hash_object (code, val); switch (code) @@ -1789,11 +1794,6 @@ iterative_hash_template_arg (tree arg, hashval_t val) val = iterative_hash_template_arg (TREE_VALUE (arg), val); return val; -case OVERLOAD: - for (lkp_iterator iter (arg); iter; ++iter) - val = iterative_hash_template_arg (*iter, val); - return val; - case CONSTRUCTOR: { tree field, value; diff --git a/gcc/testsuite/g++.dg/cpp0x/fntmp-equiv1.C b/gcc/testsuite/g++.dg/cpp0x/fntmp-equiv1.C new file mode 100644 index 000..833ae6fc85c --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp0x/fntmp-equiv1.C @@ -0,0 +1,23 @@ +// PR c++/86946, DR 1321 +// { dg-do compile { target c++11 } } + +int d(int, int); +template class e {}; +template e d(); +template e d(); + +template constexpr T d2(T, U) { return 42; } +template e d2(); +template e d2(); + +template a d3(a, c); +template e d3(); +template e d3(); + + +int main() +{ + d<1,2,int>(); + d2<1,2,int>(); + d3<1,2,int>(); +} diff --git 
a/gcc/cp/ChangeLog b/gcc/cp/ChangeLog index 8c6e9931db1..805237cb17b 100644 --- a/gcc/cp/ChangeLog +++ b/gcc/cp/ChangeLog @@ -1,3 +1,10 @@ +2019-04-01 Jason Merrill + + PR c++/86946 - ICE with function call in template argument. + DR 1321 + * pt.c (iterative_hash_template_arg): Hash all overload sets like an + identifier. + 2019-03-31 Marek Polacek PR c++/89852 - ICE with C++11 functional cast with { }. base-commit: 9c2fddaf0c9fded1e3788fdf4bc15f2435b84df5 -- 2.20.1
Re: [C++ PATCH] PR c++/86946 - ICE with function call in template argument.
On Mon, Apr 1, 2019 at 3:04 PM Jason Merrill wrote:
>
> DR 1321 clarified that two dependent names are equivalent if the names are
> the same, even if the result of name lookup is different.  So template
> argument hashing should treat a lookup set like a plain identifier.
> Mangling already does.

Actually, let's handle this more like cp_tree_equal does, i.e. only in
the context of a CALL_EXPR.

Attachment: 86946-patch2 (Description: Binary data)
Re: [PATCH, PR d/89823] Update EXCLUDES for updated/removed dmd frontend sources
On Sat, 30 Mar 2019, Iain Buclaw wrote:

> Patch updates the EXCLUDES list for message strings that are now
> picked up from d/dmd after additions/removals that have happened since
> the sources were added.  Regenerated gcc.pot as per the documentation
> at https://gcc.gnu.org/translation.html.
>
> OK for trunk?

OK.
Re: [PATCH] Add PSTL internal namespace qualifications
>> We shouldn't include in the std::lib, the uses of assert >> should be changed to __glibcxx_assert instead (which is enabled by >> defining _GLIBCXX_ASSERTIONS). >> > > This has to come through one of the PSTL library configuration > macros because the "right assert" upstream won't be __glibcxx_assert. There are a number of locations in the upstream PSTL sources where is included and assert() is used. Additionally, the TBB backend uses __TBB_ASSERT(). I'm going to follow up with a separate patch that introduces __PSTL_ASSERT() and makes everything consistent, with __PSTL_ASSERT expanding to __glibcxx_assert in libstdc++. Tom. Thomas Rodgers writes: > Jonathan Wakely writes: > >> On 29/03/19 12:12 -0700, Thomas Rodgers wrote: >>>Prevent ADL related weirdness. >>> >>> * include/pstl/algorithm_impl.h: Add namespace qualification. >>> * include/pstl/execution_defs.h: Add namespace qualification. >>> * include/pstl/execution_impl.h: Add namespace qualification. >>> * include/pstl/numeric_impl.h: Add namespace qualification. >>> * include/pstl/parallel_backend_tbb.h: Add namespace qualification. >>> * include/pstl/unseq_backend_simd.h: Add namespace qualification. >>> * include/pstl/parallel_backend_utils.h: Include . >> >> We shouldn't include in the std::lib, the uses of assert >> should be changed to __glibcxx_assert instead (which is enabled by >> defining _GLIBCXX_ASSERTIONS). >> > > This has to come through one of the PSTL library configuration > macros because the "right assert" upstream won't be __glibcxx_assert. > >>>@@ -285,7 +286,7 @@ _ForwardIterator2 >>> __pattern_walk2_n(_ExecutionPolicy&&, _ForwardIterator1 __first1, _Size n, >>> _ForwardIterator2 __first2, _Function f, >>> _IsVector is_vector, /*parallel=*/std::false_type) >>> noexcept >> >> I missed these before, but we have non-uglified 'n' and 'f' and >> 'is_vector' here. Almost all uses of "is_vector" are inside a comment >> like /*is_vector=*/ but here it's a real parameter name. 
>> >> In practice if users do something stupid like >> #define n 123 >> #define f 456 >> then they won't be able to include these headers anyway, because >> TBB uses those as parameter names (so users that are using these >> headers with TBB can't defines such dumb macros). But when we get a >> non-TBB backend that won't be the case, and we should be uglifying all >> names declared in our own headers on principle. >> >>> { >>>-return __brick_walk2_n(__first1, n, __first2, f, is_vector); >>>+return __internal::__brick_walk2_n(__first1, n, __first2, f, is_vector); >>> } >>> >>> template >> _Size, class _RandomAccessIterator2, >>>@@ -294,7 +295,7 @@ _RandomAccessIterator2 >>> __pattern_walk2_n(_ExecutionPolicy&& __exec, _RandomAccessIterator1 >>> __first1, _Size n, _RandomAccessIterator2 __first2, >>> _Function f, _IsVector is_vector, >>> /*parallel=*/std::true_type) >> >> And 'n' and 'f' here. >> >>> { >>>-return __pattern_walk2(std::forward<_ExecutionPolicy>(__exec), >>>__first1, __first1 + n, __first2, f, is_vector, >>>+return >>>__internal::__pattern_walk2(std::forward<_ExecutionPolicy>(__exec), >>>__first1, __first1 + n, __first2, f, is_vector, >>>std::true_type()); >>> } >>> >> >> We can fix the 'n' and 'f' and 'is_vector' names in another patch, for >> this one please just remove again and change assert(...) to >> __glibcxx_assert(...).
[PATCH] Fix PR 81721: ICE with PCH and Pragma warning and C++ operator
From: Andrew Pinski Hi, The problem here is the token->val.node is not saved over a precompiled header for C++ operator. This can cause an internal compiler error as we tried to print out the spelling of the token as we assumed it was valid. The fix is to have cpp_token_val_index return CPP_TOKEN_FLD_NODE for operator tokens that have NAMED_OP set. OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions. Thanks, Andrew Pinski libcpp/ChangeLog: * lex.c (cpp_token_val_index ): If tok->flags has NAMED_OP set, then return CPP_TOKEN_FLD_NODE. gcc/testsuite/ChangeLog: * g++.dg/pch/operator-1.C: New testcase. * g++.dg/pch/operator-1.Hs: New file. --- gcc/testsuite/g++.dg/pch/operator-1.C | 2 ++ gcc/testsuite/g++.dg/pch/operator-1.Hs | 9 + libcpp/lex.c | 6 +- 3 files changed, 16 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/g++.dg/pch/operator-1.C create mode 100644 gcc/testsuite/g++.dg/pch/operator-1.Hs diff --git a/gcc/testsuite/g++.dg/pch/operator-1.C b/gcc/testsuite/g++.dg/pch/operator-1.C new file mode 100644 index 000..290b5f7 --- /dev/null +++ b/gcc/testsuite/g++.dg/pch/operator-1.C @@ -0,0 +1,2 @@ +#include "operator-1.H" +int main(void){ major(0);} /* { dg-warning "Did not Work" } */ diff --git a/gcc/testsuite/g++.dg/pch/operator-1.Hs b/gcc/testsuite/g++.dg/pch/operator-1.Hs new file mode 100644 index 000..657dae1 --- /dev/null +++ b/gcc/testsuite/g++.dg/pch/operator-1.Hs @@ -0,0 +1,9 @@ +# define __glibc_macro_warning1(message) _Pragma (#message) +# define __glibc_macro_warning(message) \ + __glibc_macro_warning1 (GCC warning message) +#define __SYSMACROS_DM1(...) 
__glibc_macro_warning (#__VA_ARGS__) + +#define __SYSMACROS_DM(symbol) __SYSMACROS_DM1 \ + (Did not Work) + +# define major(dev) __SYSMACROS_DM (major) (dev+0) diff --git a/libcpp/lex.c b/libcpp/lex.c index eedfcbb..16ded6e 100644 --- a/libcpp/lex.c +++ b/libcpp/lex.c @@ -3756,7 +3756,11 @@ cpp_token_val_index (const cpp_token *tok) case SPELL_LITERAL: return CPP_TOKEN_FLD_STR; case SPELL_OPERATOR: - if (tok->type == CPP_PASTE) + /* Operands which were originally spelled as ident keep around + the node for the exact spelling. */ + if (tok->flags & NAMED_OP) + return CPP_TOKEN_FLD_NODE; + else if (tok->type == CPP_PASTE) return CPP_TOKEN_FLD_TOKEN_NO; else return CPP_TOKEN_FLD_NONE; -- 1.8.3.1
Re: [PATCH,AIX] go: disable mvsx and maltivec for aix/ppc
On Mon, Apr 1, 2019 at 12:43 AM CHIGOT, CLEMENT wrote:
>
> Description:
>  * This patch removes -mvsx and -maltivec for go aix/ppc.
>    These options don't seem compatible with Go stack layout.

Can you give some more details on the problem?  gccgo should just be
using the regular stack layout; I can't think of any reason why these
options would not be compatible.

Ian
Re: [PATCH, GCC, AARCH64] Add GNU note section with BTI and PAC.
On 4/1/19 8:53 PM, Sudakshina Das wrote:
>> This could stand to use a comment, a moment's thinking about the sizes,
>> and to use the existing asm output functions.
>>
>> /* PT_NOTE header: namesz, descsz, type.
>>    namesz = 4 ("GNU\0")
>>    descsz = 12 (see below)
> I was trying out these changes but the descsz of 12 gets rejected by
> readelf.  It hits the following
>
>   unsigned int size = is_32bit_elf ? 4 : 8;
>
>   printf (_(" Properties: "));
>
>   if (pnote->descsz < 8 || (pnote->descsz % size) != 0)
>     {
>       printf (_("\n"), pnote->descsz);
>       return;
>     }

Hmm, interesting.  The docs say that padding is not to be included in
descsz (gabi4.1, page 82).  To my eye this is a bug in binutils, but
perhaps we will have to live with it.

Nick, thoughts?

r~
Re: [PATCH,AIX] go: disable mvsx and maltivec for aix/ppc
On Mon, Apr 1, 2019 at 9:50 PM Ian Lance Taylor wrote:
>
> On Mon, Apr 1, 2019 at 12:43 AM CHIGOT, CLEMENT wrote:
> >
> > Description:
> > * This patch removes -mvsx and -maltivec for go aix/ppc.
> > These options don't seem compatible with Go stack layout.
>
> Can you give some more details on the problem?  gccgo should just be
> using the regular stack layout, I can't think of any reason why these
> options would not be compatible.

Some Altivec instructions require stricter alignment and 32 bit AIX
does not impose sufficient alignment in the stack, so GCC Go silently
references the wrong address.

GCC Go should be able to align the stack properly on AIX.  In the
interim, we need to disable generation of Altivec/VSX.

Thanks, David
Re: [PATCH,AIX] go: disable mvsx and maltivec for aix/ppc
On Mon, Apr 1, 2019 at 7:06 PM David Edelsohn wrote:
>
> On Mon, Apr 1, 2019 at 9:50 PM Ian Lance Taylor wrote:
> >
> > On Mon, Apr 1, 2019 at 12:43 AM CHIGOT, CLEMENT wrote:
> > >
> > > Description:
> > > * This patch removes -mvsx and -maltivec for go aix/ppc.
> > > These options don't seem compatible with Go stack layout.
> >
> > Can you give some more details on the problem?  gccgo should just be
> > using the regular stack layout, I can't think of any reason why these
> > options would not be compatible.
>
> Some Altivec instructions require stricter alignment and 32 bit AIX
> does not impose sufficient alignment in the stack, so GCC Go silently
> references the wrong address.
>
> GCC Go should be able to align the stack properly on AIX.  In the
> interim, we need to disable generation of Altivec/VSX.

Thanks.  That makes it seem like more of a general GCC problem than a
gccgo problem, though.  Or does GCC arrange to align the stack in the
main function?

Ian
Re: [PATCH, GCC, AARCH64] Add GNU note section with BTI and PAC.
On Tue, Apr 2, 2019 at 10:05 AM Richard Henderson wrote:
>
> On 4/1/19 8:53 PM, Sudakshina Das wrote:
> >> This could stand to use a comment, a moment's thinking about the sizes,
> >> and to use the existing asm output functions.
> >>
> >> /* PT_NOTE header: namesz, descsz, type.
> >>    namesz = 4 ("GNU\0")
> >>    descsz = 12 (see below)
> > I was trying out these changes but the descsz of 12 gets rejected by
> > readelf.  It hits the following
> >
> >   unsigned int size = is_32bit_elf ? 4 : 8;
> >
> >   printf (_(" Properties: "));
> >
> >   if (pnote->descsz < 8 || (pnote->descsz % size) != 0)
> >     {
> >       printf (_("\n"), pnote->descsz);
> >       return;
> >     }
>
> Hmm, interesting.  The docs say that padding is not to be included in
> descsz (gabi4.1, page 82).  To my eye this is a bug in binutils, but
> perhaps we will have to live with it.
>
> Nick, thoughts?

descsz is wrong.  From:

https://github.com/hjl-tools/linux-abi/wiki/Linux-Extensions-to-gABI

n_desc  The note descriptor.  The first n_descsz bytes in n_desc is
        the program property array.

The program property array: each array element represents one program
property with type, data size and data.  In 64-bit objects, each
element is an array of 8-byte integers in the format of the target
processor.  In 32-bit objects, each element is an array of 4-byte
integers in the format of the target processor.

-- H.J.
Re: [PATCH,AIX] go: disable mvsx and maltivec for aix/ppc
On Mon, Apr 1, 2019 at 10:12 PM Ian Lance Taylor wrote:
>
> On Mon, Apr 1, 2019 at 7:06 PM David Edelsohn wrote:
> >
> > On Mon, Apr 1, 2019 at 9:50 PM Ian Lance Taylor wrote:
> > >
> > > On Mon, Apr 1, 2019 at 12:43 AM CHIGOT, CLEMENT wrote:
> > > >
> > > > Description:
> > > > * This patch removes -mvsx and -maltivec for go aix/ppc.
> > > > These options don't seem compatible with Go stack layout.
> > >
> > > Can you give some more details on the problem?  gccgo should just be
> > > using the regular stack layout, I can't think of any reason why these
> > > options would not be compatible.
> >
> > Some Altivec instructions require stricter alignment and 32 bit AIX
> > does not impose sufficient alignment in the stack, so GCC Go silently
> > references the wrong address.
> >
> > GCC Go should be able to align the stack properly on AIX.  In the
> > interim, we need to disable generation of Altivec/VSX.
>
> Thanks.  That makes it seem like more of a general GCC problem than a
> gccgo problem, though.  Or does GCC arrange to align the stack in the
> main function?

32 bit AIX doesn't require stack alignment as strict as Altivec
assumes.  GCC believes that the stack alignment is stricter than it
is.

- David
[wwwdocs] gcc-9/changes.html - Mention new OpenRISC backend
Hello,

I was reading through some things and found this was missing.  As
before, I don't seem to have CVS access; if it's OK and someone can
commit it, that would be helpful.

-Stafford

Index: htdocs/gcc-9/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-9/changes.html,v
retrieving revision 1.56
diff -u -r1.56 changes.html
--- htdocs/gcc-9/changes.html	1 Apr 2019 14:55:53 -	1.56
+++ htdocs/gcc-9/changes.html	2 Apr 2019 03:59:42 -
@@ -827,6 +827,13 @@
 
+OpenRISC
+
+
+A new back end targeting OpenRISC processors has been contributed to GCC.
+
+
+
Re: [PATCH,AIX] go: disable mvsx and maltivec for aix/ppc
On Mon, Apr 1, 2019 at 7:28 PM David Edelsohn wrote:
>
> On Mon, Apr 1, 2019 at 10:12 PM Ian Lance Taylor wrote:
> >
> > On Mon, Apr 1, 2019 at 7:06 PM David Edelsohn wrote:
> > >
> > > On Mon, Apr 1, 2019 at 9:50 PM Ian Lance Taylor wrote:
> > > >
> > > > On Mon, Apr 1, 2019 at 12:43 AM CHIGOT, CLEMENT wrote:
> > > > >
> > > > > Description:
> > > > > * This patch removes -mvsx and -maltivec for go aix/ppc.
> > > > > These options don't seem compatible with Go stack layout.
> > > >
> > > > Can you give some more details on the problem?  gccgo should just be
> > > > using the regular stack layout, I can't think of any reason why these
> > > > options would not be compatible.
> > >
> > > Some Altivec instructions require stricter alignment and 32 bit AIX
> > > does not impose sufficient alignment in the stack, so GCC Go silently
> > > references the wrong address.
> > >
> > > GCC Go should be able to align the stack properly on AIX.  In the
> > > interim, we need to disable generation of Altivec/VSX.
> >
> > Thanks.  That makes it seem like more of a general GCC problem than a
> > gccgo problem, though.  Or does GCC arrange to align the stack in the
> > main function?
>
> 32 bit AIX doesn't require stack alignment as strict as Altivec
> assumes.  GCC believes that the stack alignment is stricter than it
> is.

OK, so that seems like something to address in
gcc/config/rs6000/rs6000.c, not in gcc/go/gospec.c.

Ian