Re: Patch ping (Re: [PATCH] Fortran include line fixes and -fdec-include support)
On Wed, Nov 21, 2018 at 08:31:17AM +0100, Thomas Koenig wrote:
> > I'd like to ping this patch, ok for trunk?
>
> OK.  Thanks for the patch!

Thanks.

> Before 9.0 is released, we should also document the flag
> (and the extension it supports) in the manual, and note it
> in changes.html and on the Wiki.  Would you also do that?

Like this?  Ok for trunk/wwwdocs?

2018-11-21  Jakub Jelinek

	* invoke.texi (-fdec-include): Document.

--- gcc/fortran/invoke.texi.jj	2018-08-26 22:42:19.907823618 +0200
+++ gcc/fortran/invoke.texi	2018-11-21 09:14:21.449174232 +0100
@@ -119,7 +119,7 @@ by type.  Explanations are in the follow
 @gccoptlist{-fall-intrinsics -fbackslash -fcray-pointer -fd-lines-as-code @gol
 -fd-lines-as-comments @gol
 -fdec -fdec-structure -fdec-intrinsic-ints -fdec-static -fdec-math @gol
--fdefault-double-8 -fdefault-integer-8 -fdefault-real-8 @gol
+-fdec-include -fdefault-double-8 -fdefault-integer-8 -fdefault-real-8 @gol
 -fdefault-real-10 -fdefault-real-16 -fdollar-ok -ffixed-line-length-@var{n} @gol
 -ffixed-line-length-none -ffree-form -ffree-line-length-@var{n} @gol
 -ffree-line-length-none -fimplicit-none -finteger-4-integer-8 @gol
@@ -277,6 +277,12 @@ functions (e.g. TAND, ATAND, etc...) for
 Enable DEC-style STATIC and AUTOMATIC attributes to explicitly specify
 the storage of variables and other objects.
 
+@item -fdec-include
+@opindex @code{fdec-include}
+Enable parsing of INCLUDE as a statement in addition to parsing it as
+INCLUDE line.  When parsed as INCLUDE statement, INCLUDE does not have to
+be on a single line and can use line continuations.
+
 @item -fdollar-ok
 @opindex @code{fdollar-ok}
 @cindex @code{$}

	Jakub

--- gcc-9/changes.html.jj	2018-11-14 17:46:10.747799079 +0100
+++ gcc-9/changes.html	2018-11-21 09:23:48.974896385 +0100
@@ -118,6 +118,14 @@ a work-in-progress.
 the IEEE_IS_NAN function from the intrinsic module IEEE_ARITHMETIC.
+
+A new command line option -fdec-include, set also
+by -fdec option, has been added for an extension
+for compatibility with legacy code.  With this option,
+INCLUDE directive is parsed also as a statement,
+which allows the directive to be written on multiple source lines
+with line continuations.
+
[RFC, RFT PATCH, mingw]: Do not cancel vzeroupper when XMM registers live across call
Hello!

Before vzeroupper gets emitted before a function call, the compiler
checks if there are live call-saved SSE registers at the insertion
point.  This functionality is intended to handle the Windows ABI, so we
don't clear upper parts of the XMM registers that live across the call.
However, the called function saves only the lower 128-bit part of the XMM
register, so it seems that wider modes have to be saved and restored by
the caller function anyway.  If this is the case, we don't have to cancel
vzeroupper insertion before the call.

Attached patch removes this cancellation, since all other ABIs clobber
all XMM registers.

2018-11-21  Uros Bizjak

	* config/i386/i386.c (ix86_avx_emit_vzeroupper): Remove.
	(ix86_emit_mode_set) : Emit vzeroupper here.

The patch is untested, since I have no Windows target here.  Daniel, can
you please review the above assumptions and test the patch on a Windows
target?

Uros.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index c18c60a1d191..598165103716 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -19167,37 +19167,11 @@ emit_i387_cw_initialization (int mode)
   emit_move_insn (new_mode, reg);
 }
 
-/* Emit vzeroupper.  */
-
-void
-ix86_avx_emit_vzeroupper (HARD_REG_SET regs_live)
-{
-  int i;
-
-  /* Cancel automatic vzeroupper insertion if there are
-     live call-saved SSE registers at the insertion point.  */
-
-  for (i = FIRST_SSE_REG; i <= LAST_SSE_REG; i++)
-    if (TEST_HARD_REG_BIT (regs_live, i) && !call_used_regs[i])
-      return;
-
-  if (TARGET_64BIT)
-    for (i = FIRST_REX_SSE_REG; i <= LAST_REX_SSE_REG; i++)
-      if (TEST_HARD_REG_BIT (regs_live, i) && !call_used_regs[i])
-        return;
-
-  emit_insn (gen_avx_vzeroupper ());
-}
-
 /* Generate one or more insns to set ENTITY to MODE.  */
 
-/* Generate one or more insns to set ENTITY to MODE.  HARD_REG_LIVE
-   is the set of hard registers live at the point where the insn(s)
-   are to be inserted.  */
-
 static void
 ix86_emit_mode_set (int entity, int mode, int prev_mode ATTRIBUTE_UNUSED,
-                    HARD_REG_SET regs_live)
+                    HARD_REG_SET regs_live ATTRIBUTE_UNUSED)
 {
   switch (entity)
     {
@@ -19207,7 +19181,7 @@ ix86_emit_mode_set (int entity, int mode, int prev_mode ATTRIBUTE_UNUSED,
       break;
     case AVX_U128:
       if (mode == AVX_U128_CLEAN)
-        ix86_avx_emit_vzeroupper (regs_live);
+        emit_insn (gen_avx_vzeroupper ());
       break;
     case I387_TRUNC:
     case I387_FLOOR:
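For readers who want the decision in isolation, here is a minimal sketch of what the removed check did versus what the patch proposes. It is a toy model and not GCC code: the 16-bit masks, the register count and both function names are assumptions made for illustration only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model (hypothetical): 16 SSE registers, one bit per register.  */
#define TOY_NUM_SSE_REGS 16

/* Old behaviour as described in the mail: cancel vzeroupper when any
   live SSE register is call-saved, i.e. live but not call-used.  */
static bool
old_should_emit_vzeroupper (uint16_t regs_live, uint16_t call_used)
{
  for (int i = 0; i < TOY_NUM_SSE_REGS; i++)
    {
      bool live = (regs_live >> i) & 1;
      bool used = (call_used >> i) & 1;
      if (live && !used)
        return false;
    }
  return true;
}

/* Proposed behaviour: emit unconditionally; the liveness information
   is no longer consulted.  */
static bool
new_should_emit_vzeroupper (uint16_t regs_live, uint16_t call_used)
{
  (void) regs_live;
  (void) call_used;
  return true;
}
```

With a mask where only the low six registers are call-used (an assumed stand-in for a Windows-like ABI), a live value in register 6 makes the old check return false, while the new one still emits.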
Re: Fix PR rtl-optimization/85925
> This is saying that *every* op except those very few works on the full
> register.  And that for every architecture that has W_R_O.

That's still progress over the previous situation.

> It also only looks at the top code in the RTL, so it will say for example
> a rotate-and-mask is just fine, while that isn't true.

Not clear whether this needs to be recursive because nonzero_bits1 and
num_sign_bit_copies1 already recurse on RTXes.

-- 
Eric Botcazou
Re: Stream TREE_TYPE of TYPE_DECLs again
On Wed, 21 Nov 2018, Jan Hubicka wrote:

> Hi,
> this patch recovers location information in the ODR warnings.
> Because location info is not attached to types but corresponding
> TYPE_DECLs, we need to prevent TYPE_DECLs from being merged when
> corresponding types are not merged.
>
> To achieve this I no longer clear TREE_TYPE of TYPE_DECLs which
> puts them back to the same SCC as the type itself.  While making
> incomplete type variant we need to produce copy of TYPE_DECL.  Because
> it is possible that TYPE_DECL was not processed by free lang data
> we can not do copy_node but build it from scratch (because
> the toplevel loops possibly processed all decls).  This is not hard
> to do, but made me notice a few extra flags that are streamed for
> TYPE_DECLs and free_lang_data is not seeing them.
>
> I have also extended ipa-devirt to get rid of the duplicated decls
> once ODR warnings are done to save ltrans streaming (it actually
> added about 10% of ltrans data w/o this change)
>
> I have checked that the patch does not increase number of type
> duplicates for cc1 (24), I will also re-do testing for Firefox
> which may uncover some extra flags/attributes to care about.
>
> lto-bootstrapped/regtested x86_64-linux OK?

OK if you put a comment ...

> Honza
>
> 	PR lto/87957
> 	* tree.c (fld_decl_context): Break out from ...
> 	(free_lang_data_in_decl): ... here; free TREE_PUBLIC, TREE_PRIVATE
> 	DECL_ARTIFICIAL of TYPE_DECL; do not free TREE_TYPE of TYPE_DECL.
> 	(fld_incomplete_type_of): Build copy of TYPE_DECL.
> 	* ipa-devirt.c (free_enum_values): Rename to ...
> 	(free_odr_warning_data): ... this one; free also duplicated TYPE_DECLs
> 	and TREE_TYPEs of TYPE_DECLs.
> > Index: tree.c > === > --- tree.c(revision 266325) > +++ tree.c(working copy) > @@ -5206,6 +5206,24 @@ fld_process_array_type (tree t, tree t2, >return array; > } > > +/* Return CTX after removal of contexts that are not relevant */ > + > +static tree > +fld_decl_context (tree ctx) > +{ > + /* Variably modified types are needed for tree_is_indexable to decide > + whether the type needs to go to local or global section. > + This code is semi-broken but for now it is easiest to keep contexts > + as expected. */ > + if (ctx && TYPE_P (ctx) > + && !variably_modified_type_p (ctx, NULL_TREE)) > + { > + while (ctx && TYPE_P (ctx)) > + ctx = TYPE_CONTEXT (ctx); > + } > + return ctx; > +} > + > /* For T being aggregate type try to turn it into a incomplete variant. > Return T if no simplification is possible. */ > > @@ -5267,6 +5285,27 @@ fld_incomplete_type_of (tree t, struct f > } > else > TYPE_VALUES (copy) = NULL; > + > + /* Build copy of TYPE_DECL in TYPE_NAME if necessary. > + This is needed for ODR violation warnings to come out right (we > + want duplicate TYPE_DECLs whenever the type is duplicated because > + of ODR violation. Because lang data in the TYPE_DECL may not > + have been freed yet, rebuild it from scratch and copy relevant > + fields. 
*/ > + TYPE_NAME (copy) = fld_simplified_type_name (copy); > + tree name = TYPE_NAME (copy); > + > + if (name && TREE_CODE (name) == TYPE_DECL) > + { > + gcc_checking_assert (TREE_TYPE (name) == t); > + tree name2 = build_decl (DECL_SOURCE_LOCATION (name), TYPE_DECL, > +DECL_NAME (name), copy); > + SET_DECL_ASSEMBLER_NAME (name2, DECL_ASSEMBLER_NAME (name)); > + SET_DECL_ALIGN (name2, 0); > + DECL_CONTEXT (name2) = fld_decl_context > + (DECL_CONTEXT (name)); > + TYPE_NAME (copy) = name2; > + } > } >return copy; > } > @@ -5649,12 +5688,13 @@ free_lang_data_in_decl (tree decl, struc > { >DECL_VISIBILITY (decl) = VISIBILITY_DEFAULT; >DECL_VISIBILITY_SPECIFIED (decl) = 0; > - /* TREE_PUBLIC is used to tell if type is anonymous. */ > + TREE_PUBLIC (decl) = 0; > + TREE_PRIVATE (decl) = 0; > + DECL_ARTIFICIAL (decl) = 0; >TYPE_DECL_SUPPRESS_DEBUG (decl) = 0; >DECL_INITIAL (decl) = NULL_TREE; >DECL_ORIGINAL_TYPE (decl) = NULL_TREE; >DECL_MODE (decl) = VOIDmode; > - TREE_TYPE (decl) = void_type_node; >SET_DECL_ALIGN (decl, 0); > } >else if (TREE_CODE (decl) == FIELD_DECL) > @@ -5688,20 +5728,7 @@ free_lang_data_in_decl (tree decl, struc >if (TREE_CODE (decl) != FIELD_DECL >&& ((TREE_CODE (decl) != VAR_DECL && TREE_CODE (decl) != FUNCTION_DECL) >|| !DECL_VIRTUAL_P (decl))) > -{ > - tree ctx = DECL_CONTEXT (decl); > - /* Variably modified types are needed for tree_is_indexable to decide > - whether the type needs to go to local or global se
Re: [PATCH] apply_subst_iterator: Handle define_split/define_insn_and_split
On Fri, Oct 26, 2018 at 9:44 AM H.J. Lu wrote:
>
> On 10/25/18, Uros Bizjak wrote:
> > On Fri, Oct 26, 2018 at 8:48 AM H.J. Lu wrote:
> >>
> >> On 10/25/18, Uros Bizjak wrote:
> >> > On Fri, Oct 26, 2018 at 8:07 AM H.J. Lu wrote:
> >> >>
> >> >> * read-rtl.c (apply_subst_iterator): Handle
> >> >> define_insn_and_split.
> >> >> ---
> >> >> gcc/read-rtl.c | 6 --
> >> >> 1 file changed, 4 insertions(+), 2 deletions(-)
> >> >>
> >> >> diff --git a/gcc/read-rtl.c b/gcc/read-rtl.c
> >> >> index d698dd4af4d..5957c29671a 100644
> >> >> --- a/gcc/read-rtl.c
> >> >> +++ b/gcc/read-rtl.c
> >> >> @@ -275,9 +275,11 @@ apply_subst_iterator (rtx rt, unsigned int, int value)
> >> >>    if (value == 1)
> >> >>      return;
> >> >>    gcc_assert (GET_CODE (rt) == DEFINE_INSN
> >> >> +              || GET_CODE (rt) == DEFINE_INSN_AND_SPLIT
> >> >>                || GET_CODE (rt) == DEFINE_EXPAND);
> >> >
> >> > Can we also handle DEFINE_SPLIT here?
> >> >
> >>
> >> Yes, we could if there were a usage for it.  I am reluctant to add
> >> something I have no use nor test for.
> >
> > Just split one define_insn_and_split to define_insn and corresponding
> > define_split.
> >
> > define_insn_and_split is a contraction for the define_insn and
> > corresponding define_split, so it looks weird to only handle
> > define_insn_and_split without handling define_split.
> >
>
> Here is the updated patch to handle define_split.  Tested with

OK.

> (define_insn "*sse4_1_v8qiv8hi2_2" > [(set (match_operand:V8HI 0 "register_operand") > (any_extend:V8HI > (vec_select:V8QI > (subreg:V16QI > (vec_concat:V2DI > (match_operand:DI 1 "memory_operand") > (const_int 0)) 0) > (parallel [(const_int 0) (const_int 1) >(const_int 2) (const_int 3) >(const_int 4) (const_int 5) >(const_int 6) (const_int 7)]] > "TARGET_SSE4_1 && && " > "#") > > (define_split > [(set (match_operand:V8HI 0 "register_operand") > (any_extend:V8HI > (vec_select:V8QI > (subreg:V16QI > (vec_concat:V2DI > (match_operand:DI 1 "memory_operand") > (const_int 0)) 0) > (parallel [(const_int 0) (const_int 1) >(const_int 2) (const_int 3) >(const_int 4) (const_int 5) >(const_int 6) (const_int 7)]] > "TARGET_SSE4_1 && && >&& can_create_pseudo_p ()" > [(set (match_dup 0) > (any_extend:V8HI (match_dup 1)))] > { > operands[1] = adjust_address_nv (operands[1], V8QImode, 0); > }) > > -- > H.J.
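The outcome of this exchange is a wider set of definition kinds accepted by the assertion in apply_subst_iterator. The following is only a sketch of that predicate with stand-in enumerators; the `toy_` names are hypothetical and the real code tests GET_CODE (rt) inside gcc_assert.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the RTL definition codes named in the
   thread, plus one code that is not in the accepted set.  */
enum toy_rtx_code
{
  TOY_DEFINE_INSN,
  TOY_DEFINE_INSN_AND_SPLIT,
  TOY_DEFINE_SPLIT,
  TOY_DEFINE_EXPAND,
  TOY_DEFINE_PEEPHOLE2
};

/* Return true for the definition kinds the updated patch is said to
   accept: the original two plus define_insn_and_split and
   define_split.  */
static bool
toy_subst_iterator_accepts (enum toy_rtx_code code)
{
  switch (code)
    {
    case TOY_DEFINE_INSN:
    case TOY_DEFINE_INSN_AND_SPLIT:
    case TOY_DEFINE_SPLIT:
    case TOY_DEFINE_EXPAND:
      return true;
    default:
      return false;
    }
}
```

Accepting both the combined form and its two halves keeps the check symmetric, which is the consistency argument made above.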
Re: [PATCH, middle-end]: Fix PR 88070, ICE in create_pre_exit, at mode-switching.c:438
On Wed, Nov 21, 2018 at 12:46 AM Jeff Law wrote: > > On 11/19/18 12:58 PM, Uros Bizjak wrote: > > Hello! > > > > The assert in create_pre_exit at mode-switching.c expects return copy > > pair with nothing in between. However, the compiler starts mode > > switching pass with the following sequence: > > > > (insn 19 18 16 2 (set (reg:V2SF 21 xmm0) > > (mem/c:V2SF (plus:DI (reg/f:DI 7 sp) > > (const_int -72 [0xffb8])) [0 S8 A64])) > > "pr88070.c":8 1157 {*movv2sf_internal} > > (nil)) > > (insn 16 19 20 2 (set (reg:V2SF 0 ax [orig:91 ] [91]) > > (reg:V2SF 0 ax [89])) "pr88070.c":8 1157 {*movv2sf_internal} > > (nil)) > > (insn 20 16 21 2 (unspec_volatile [ > > (const_int 0 [0]) > > ] UNSPECV_BLOCKAGE) "pr88070.c":8 710 {blockage} > > (nil)) > > (insn 21 20 23 2 (use (reg:V2SF 21 xmm0)) "pr88070.c":8 -1 > > (nil)) > So I know there's an updated patch. But I thought it might be worth > mentioning that insn 16 here appears to be a nop-move. Removing it > might address this instance of the problem, but I doubt it's general > enough to address any larger issues. > > You still might want to investigate why it's still in the IL. Oh yes, I remember this. These nop-moves were removed in Vlad's patch [1],[2]: 2013-10-25 Vladimir Makarov ... * lra-spills.c (lra_final_code_change): Remove useless move insns. Which regressed vzeroupper insertion pass [3] that was reported in [4]. The functionality was later reverted in [5]: 2013-10-26 Vladimir Makarov Revert: 2013-10-25 Vladimir Makarov * lra-spills.c (lra_final_code_change): Remove useless move insns. Which IMO can be reintroduced back, now that vzeroupper pass works in a different way. We actually have a couple of tests in place for PR58679 [6]. 
[1] https://gcc.gnu.org/ml/gcc-patches/2013-10/msg02208.html
[2] https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=204079
[3] https://gcc.gnu.org/ml/gcc-patches/2013-10/msg02225.html
[4] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58679
[5] https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=204094
[6] https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=204109

Uros.
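To make the "nop-move" remark concrete: insn 16 in the dump copies a register onto itself in the same mode. Below is a small sketch of such a test on a simplified register-to-register move. The struct and names are hypothetical; this is not the check GCC itself performs on a SET.

```c
#include <stdbool.h>

/* Hypothetical machine modes, just enough for the example.  */
enum toy_mode { TOY_V2SF, TOY_DI };

/* Simplified register-to-register move: (set (reg:M d) (reg:M s)).  */
struct toy_reg_move
{
  int dest_regno;
  int src_regno;
  enum toy_mode dest_mode;
  enum toy_mode src_mode;
};

/* A move is a no-op when source and destination are the same hard
   register in the same mode.  */
static bool
toy_noop_move_p (const struct toy_reg_move *m)
{
  return m->dest_regno == m->src_regno && m->dest_mode == m->src_mode;
}
```

Under this model the shape of insn 16 (register 0 to register 0, both V2SF) is a no-op, while a copy from register 0 into register 21 is not.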
Re: [PATCH, middle-end]: Fix PR 88070, ICE in create_pre_exit, at mode-switching.c:438
On Wed, Nov 21, 2018 at 10:48 AM Uros Bizjak wrote: > > On Wed, Nov 21, 2018 at 12:46 AM Jeff Law wrote: > > > > On 11/19/18 12:58 PM, Uros Bizjak wrote: > > > Hello! > > > > > > The assert in create_pre_exit at mode-switching.c expects return copy > > > pair with nothing in between. However, the compiler starts mode > > > switching pass with the following sequence: > > > > > > (insn 19 18 16 2 (set (reg:V2SF 21 xmm0) > > > (mem/c:V2SF (plus:DI (reg/f:DI 7 sp) > > > (const_int -72 [0xffb8])) [0 S8 A64])) > > > "pr88070.c":8 1157 {*movv2sf_internal} > > > (nil)) > > > (insn 16 19 20 2 (set (reg:V2SF 0 ax [orig:91 ] [91]) > > > (reg:V2SF 0 ax [89])) "pr88070.c":8 1157 {*movv2sf_internal} > > > (nil)) > > > (insn 20 16 21 2 (unspec_volatile [ > > > (const_int 0 [0]) > > > ] UNSPECV_BLOCKAGE) "pr88070.c":8 710 {blockage} > > > (nil)) > > > (insn 21 20 23 2 (use (reg:V2SF 21 xmm0)) "pr88070.c":8 -1 > > > (nil)) > > So I know there's an updated patch. But I thought it might be worth > > mentioning that insn 16 here appears to be a nop-move. Removing it > > might address this instance of the problem, but I doubt it's general > > enough to address any larger issues. > > > > You still might want to investigate why it's still in the IL. > > Oh yes, I remember this. > > These nop-moves were removed in Vlad's patch [1],[2]: > > 2013-10-25 Vladimir Makarov > > ... > * lra-spills.c (lra_final_code_change): Remove useless move insns. > > Which regressed vzeroupper insertion pass [3] that was reported in [4]. > > The functionality was later reverted in [5]: > > 2013-10-26 Vladimir Makarov > > Revert: > 2013-10-25 Vladimir Makarov > * lra-spills.c (lra_final_code_change): Remove useless move insns. > > Which IMO can be reintroduced back, now that vzeroupper pass works in > a different way. We actually have a couple of tests in place for > PR58679 [6]. The revert of the revert works OK for PR58679 tests with the latest compiler. Uros.
Re: [PATCH] avoid error_mark_node in -Wsizeof-pointer-memaccess (PR 88065)
On Tue, Nov 20, 2018 at 12:39:44AM +0100, Jakub Jelinek wrote:
> On Mon, Nov 19, 2018 at 04:10:09PM -0700, Jeff Law wrote:
> > > PR c/88065 - ICE in -Wsizeof-pointer-memaccess on an invalid strncpy
> > >
> > > gcc/c-family/ChangeLog:
> > >
> > > 	PR c/88065

Please also add
	PR c/87297

> > > 	* c-warn.c (sizeof_pointer_memaccess_warning): Bail if source
> > > 	or destination is an error.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > 	PR c/88065
> > > 	* gcc.dg/Wsizeof-pointer-memaccess2.c: New test.
> > This is probably OK.  But before final ACK, is there a point earlier
> > where we could/should have bailed out?
>
> IMHO it is a good point, but it should use error_operand_p predicate instead
> of == error_mark_node checks to also catch the case where the argument is
> not error_mark_node, but has error_mark_node type.  And, the testcase
> shouldn't be in gcc.dg, but in c-c++-common and cover also C++ testing.

Testcase proving that error_operand_p is really necessary:

/* PR c/87297 */
/* { dg-do compile } */
/* { dg-options "-Wsizeof-pointer-memaccess" } */

struct S { char a[4]; };

void
foo (struct S *p, const char *s)
{
  struct T x;  /* { dg-error "storage size|incomplete type" } */
  __builtin_strncpy (x, s, sizeof p->a);
}

Works in C, still ICEs in C++ even with the patch you've posted.

819	  tree dstsz = TYPE_SIZE_UNIT (TREE_TYPE (d));
debug_tree (d)
 used decl_5 VOID huvaa.c:9:12 align:8 warn_if_not_align:0 context chain >

And, I think it is important to have these tests in c-c++-common, as the
above test shows, it behaves differently between C and C++ (C will present
error_mark_node itself rather than VAR_DECL with error_mark_node type) and
the code in question is just a helper for the FEs.

	Jakub
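The difference between the two checks can be shown with a toy node type. This is a sketch under assumptions: `struct toy_tree` and the `toy_` functions are invented for illustration, and the predicate only mirrors the behaviour described above (the node is the error marker, or its type is).

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a tree node: only the type link matters here.  */
struct toy_tree
{
  struct toy_tree *type;
};

/* Toy error marker; modelled as its own type for this sketch.  */
static struct toy_tree toy_error_mark = { &toy_error_mark };

/* The narrow check: is the node the error marker itself?  */
static bool
toy_is_error_mark (const struct toy_tree *t)
{
  return t == &toy_error_mark;
}

/* The wider check: the node is the error marker, or it is some other
   node (for example a declaration) whose type is the error marker.  */
static bool
toy_error_operand_p (const struct toy_tree *t)
{
  return t == &toy_error_mark
         || (t != NULL && t->type == &toy_error_mark);
}
```

A declaration whose type is the error marker passes the narrow check unnoticed but is caught by the wider one, which corresponds to the C++ case in the test above.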
Patch ping (Re: [PATCH] Fix x86 bzhi/bextr iff zero_extract with zero size is undefined (PR rtl-optimization/87817))
Hi!

On Wed, Nov 14, 2018 at 12:37:02AM +0100, Jakub Jelinek wrote:
> 2018-11-13  Jakub Jelinek
>
> 	PR rtl-optimization/87817
> 	* config/i386/i386.md (nmi2_bzhi_3, *bmi2_bzhi_3,
> 	*bmi2_bzhi_3_1, *bmi2_bzhi_3_1_ccz): Use IF_THEN_ELSE
> 	in the pattern to avoid triggering UB when operands[2] is zero.
> 	(tbm_bextri_): New expander.  Renamed the old define_insn to ...
> 	(*tbm_bextri_): ... this.

I'd like to ping this patch.  While the folding committed for the PR
often triggers, and so the RTL passes see a literal zero propagated there
less often, e.g. the testcase with:
-O2 -mbmi2 -fno-tree-ccp -fno-tree-forwprop -fno-tree-fre -fno-tree-pre -fno-tree-vrp -fno-tree-dominator-opts -fno-code-hoisting
is still miscompiled, and there could be other reasons why a zero appears
only after expansion.

From what I understood, the agreement was that zero_extract with 0 size
(either literal or just at runtime) is incorrect in the middle-end.

	Jakub
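As an illustration of why the zero case needs its own branch, here is a C model of a 32-bit BZHI-style operation. It is a sketch and not the i386.md pattern: `toy_bzhi32` is a hypothetical name, and the explicit `n == 0` test plays the role that IF_THEN_ELSE plays in the changelog entry. Without it, the mask computation would shift by 32, which is undefined in C, much as a zero-size extract is described as invalid above.

```c
#include <stdint.h>

/* Model of BZHI on 32-bit operands: keep the low N bits of SRC,
   where N is taken from the low 8 bits of INDEX.  */
static uint32_t
toy_bzhi32 (uint32_t src, uint32_t index)
{
  unsigned n = index & 0xff;

  /* Guarded case: extracting zero bits yields zero.  Computing the
     mask below with n == 0 would shift by 32.  */
  if (n == 0)
    return 0;

  /* An index at or beyond the operand size leaves SRC unchanged.  */
  if (n >= 32)
    return src;

  return src & (UINT32_MAX >> (32 - n));
}
```

The guard is taken at run time, so it covers both a literal zero and a zero that only shows up after expansion.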
[PATCH] x86: Add -march=cascadelake
Hi,

The attached patch adds -march=cascadelake for x86.
Tested with bootstrap and regression tests on x86_64.  No regressions.
Is it ok for trunk?

Wei

gcc/
	* common/config/i386/i386-common.c (processor_names): Add cascadelake.
	(processor_alias_table): Add cascadelake.
	* config.gcc: Add -march=cascadelake.
	* config/i386/driver-i386.c (host_detect_local_cpu): Detect cascadelake.
	* config/i386/i386-c.c (ix86_target_macros_internal): Handle cascadelake.
	* config/i386/i386.c (ix86_cost): Add m_CASCADELAKE.
	(processor_cost_table): Add cascadelake.
	(get_builtin_code_for_version): Handle cascadelake.
	(fold_builtin_cpu): Ditto.
	* config/i386/i386.h (TARGET_CASCADELAKE, PROCESSOR_CASCADELAKE): New.
	(PTA_CASCADELAKE): Ditto.
	* doc/invoke.texi: Add -march=cascadelake.

gcc/testsuite/
	* g++.target/i386/mv16.C: Handle new march.
	* gcc.target/i386/funcspec-56.inc: Ditto.

libgcc/
	* config/i386/cpuinfo.h: Add INTEL_COREI7_CASCADELAKE.

Attachment: cascadelake.diff
Re: [PATCH v3] [aarch64] Add CPU support for Ampere Computing's eMAG.
Hi Christoph, On 20/11/18 18:00, Christoph Muellner wrote: Tested with "make check" and no regressions found. This patch depends on the struct xgene1_prefetch_tune, which has been acknowledged already: https://gcc.gnu.org/ml/gcc-patches/2018-11/msg00985.html *** gcc/ChangeLog *** 2018-xx-xx Christoph Muellner * config/aarch64/aarch64-cores.def: Define emag. * config/aarch64/aarch64-tune.md: Regenerated with emag. * config/aarch64/aarch64.c (emag_tunings): New struct. * doc/invoke.texi: Document mtune value. This looks ok to me but you'll need a maintainer to approve. You mentioned this depends on your previously approved patches. Do you have write access or do you need someone to commit them for you? Thanks, Kyrill Signed-off-by: Christoph Muellner --- gcc/config/aarch64/aarch64-cores.def | 3 +++ gcc/config/aarch64/aarch64-tune.md | 2 +- gcc/config/aarch64/aarch64.c | 25 + gcc/doc/invoke.texi | 2 +- 4 files changed, 30 insertions(+), 2 deletions(-) diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aarch64-cores.def index 1f3ac56..68cca00 100644 --- a/gcc/config/aarch64/aarch64-cores.def +++ b/gcc/config/aarch64/aarch64-cores.def @@ -61,6 +61,9 @@ AARCH64_CORE("thunderxt88", thunderxt88, thunderx, 8A, AARCH64_FL_FOR_ARCH AARCH64_CORE("thunderxt81", thunderxt81, thunderx, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx, 0x43, 0x0a2, -1) AARCH64_CORE("thunderxt83", thunderxt83, thunderx, 8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx, 0x43, 0x0a3, -1) +/* Ampere Computing cores. */ +AARCH64_CORE("emag",emag, xgene1,8A, AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, emag, 0x50, 0x000, 3) + /* APM ('P') cores. 
*/ AARCH64_CORE("xgene1", xgene1,xgene1,8A, AARCH64_FL_FOR_ARCH8, xgene1, 0x50, 0x000, -1) diff --git a/gcc/config/aarch64/aarch64-tune.md b/gcc/config/aarch64/aarch64-tune.md index fade1d4..2fc7f03 100644 --- a/gcc/config/aarch64/aarch64-tune.md +++ b/gcc/config/aarch64/aarch64-tune.md @@ -1,5 +1,5 @@ ;; -*- buffer-read-only: t -*- ;; Generated automatically by gentune.sh from aarch64-cores.def (define_attr "tune" - "cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,ares,tsv110,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55" + "cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,emag,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,ares,tsv110,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55" (const (symbol_ref "((enum attr_tune) aarch64_tune)"))) diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index f7f88a9..995aafe 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -957,6 +957,31 @@ static const struct tune_params xgene1_tunings = &xgene1_prefetch_tune }; +static const struct tune_params emag_tunings = +{ + &xgene1_extra_costs, + &xgene1_addrcost_table, + &xgene1_regmove_cost, + &xgene1_vector_cost, + &generic_branch_cost, + &xgene1_approx_modes, + 6, /* memmov_cost */ + 4, /* issue_rate */ + AARCH64_FUSE_NOTHING, /* fusible_ops */ + "16", /* function_align. */ + "16", /* jump_align. */ + "16", /* loop_align. */ + 2, /* int_reassoc_width. */ + 4, /* fp_reassoc_width. */ + 1, /* vec_reassoc_width. */ + 2, /* min_div_recip_mul_sf. */ + 2, /* min_div_recip_mul_df. */ + 17, /* max_case_values. 
*/ + tune_params::AUTOPREFETCHER_OFF, /* autoprefetcher_model. */ + (AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS), /* tune_flags. */ + &xgene1_prefetch_tune +}; + static const struct tune_params qdf24xx_tunings = { &qdf24xx_extra_costs, diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index e016dce..ac81fb2 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -15288,7 +15288,7 @@ Specify the name of the target processor for which GCC should tune the performance of the code. Permissible values for this option are: @samp{generic}, @samp{cortex-a35}, @samp{cortex-a53}, @samp{cortex-a55}, @samp{cortex-a57}, @samp{cortex-a72}, @samp{cortex-a73}, @samp{cortex-a75}, -@samp{cortex-a76}, @samp{ares}, @samp{exynos-m1}, @samp{falkor}, +@samp{cortex-a76}, @samp{ares}, @samp{exynos-m1}, @samp{emag}, @samp{falkor}, @samp{qdf24xx}, @samp{saphira}, @samp{phecda}, @samp{xgene1}, @samp{vulcan}, @samp{thunderx}, @samp{thunderxt88}, @samp{thunderxt88p1}, @samp{thunderxt81},
Re: Patch ping (Re: [PATCH] Fix x86 bzhi/bextr iff zero_extract with zero size is undefined (PR rtl-optimization/87817))
On Wed, Nov 21, 2018 at 11:20 AM Jakub Jelinek wrote: > > Hi! > > On Wed, Nov 14, 2018 at 12:37:02AM +0100, Jakub Jelinek wrote: > > 2018-11-13 Jakub Jelinek > > > > PR rtl-optimization/87817 > > * config/i386/i386.md (nmi2_bzhi_3, *bmi2_bzhi_3, > > *bmi2_bzhi_3_1, *bmi2_bzhi_3_1_ccz): Use IF_THEN_ELSE > > in the pattern to avoid triggering UB when operands[2] is zero. > > (tbm_bextri_): New expander. Renamed the old define_insn to ... > > (*tbm_bextri_): ... this. OK. I thought that I already approved the patch. Oh well... Thanks, Uros. > I'd like to ping this patch, while the folding committed for the PR > often triggers and so the RTL passes see literal zero propagated there less > often, e.g. the testcase with: > -O2 -mbmi2 -fno-tree-ccp -fno-tree-forwprop -fno-tree-fre -fno-tree-pre > -fno-tree-vrp -fno-tree-dominator-opts -fno-code-hoisting > is still miscompiled and there could be other reasons why a zero appears > only after expansion. > > From what I understood, the agreement was that zero_extract with 0 size > (either literal or just at runtime is incorrect in the middle-end). > > Jakub
Re: [PATCH, ARM, ping3] PR85434: Prevent spilling of stack protector guard's address on ARM
Yes you did indeed, which is why I didn't include you in the To list.  I've
reworked the Arm part significantly since it was last approved, the ping is
meant for the Arm maintainers.  Thanks for enquiring about it.

Best regards,

Thomas

On Wed, 21 Nov 2018 at 00:32, Jeff Law wrote:
>
> On 11/16/18 7:56 AM, Thomas Preudhomme wrote:
> > Ping?
> I thought I acked the target independent stuff a while back.  What's
> still waiting on review here?
>
> jeff
Re: [PATCH] x86: Add -march=cascadelake
On Wed, Nov 21, 2018 at 06:23:41PM +0800, Wei Xiao wrote:
> The attached patch added -march=cascadelake for x86.
> Tested with bootstrap and regression tests on x86_64.  No regressions.
> Is it ok for trunk?

Not a real review, just nits:

index bff4dfb..f7c1c98 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,18 @@
+2018-11-21 Wei Xiao

Two spaces after date, two spaces before <.

--- a/gcc/config/i386/driver-i386.c
+++ b/gcc/config/i386/driver-i386.c
@@ -857,6 +857,9 @@ const char *host_detect_local_cpu (int argc, const char **argv)
               /* Assume Ice Lake.  */
               else if (has_gfni)
                 cpu = "icelake-client";
+              /* Assume Cascade Lake.  */
+              if (has_avx512vnni)
+                cpu = "cascadelake";
               /* Assume Cannon Lake.  */
               else if (has_avx512vbmi)
                 cpu = "cannonlake";

Doesn't this break handling of all the other CPUs?  I mean, it is a large
  if (cond) ... else if (cond) ... else if (cond) ... else ...
but you've added if without else before it into the middle.

	Jakub
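The effect of the stray `if` can be reproduced with a much shorter chain. This is a sketch only: the three feature flags, the function names and the final fallback string are chosen for illustration, and the real chain in host_detect_local_cpu is much longer.

```c
/* Broken variant: the bare "if" ends the first chain and starts a new
   one, so an earlier match can be overwritten by a later branch.  */
static const char *
toy_detect_broken (int has_gfni, int has_avx512vnni, int has_avx512vbmi)
{
  const char *cpu = "generic";

  if (has_gfni)
    cpu = "icelake-client";
  if (has_avx512vnni)           /* missing "else" */
    cpu = "cascadelake";
  else if (has_avx512vbmi)
    cpu = "cannonlake";
  else
    cpu = "skylake-avx512";     /* illustrative fallback */
  return cpu;
}

/* Fixed variant: a single chain, so the first matching test wins.  */
static const char *
toy_detect_fixed (int has_gfni, int has_avx512vnni, int has_avx512vbmi)
{
  const char *cpu = "generic";

  if (has_gfni)
    cpu = "icelake-client";
  else if (has_avx512vnni)
    cpu = "cascadelake";
  else if (has_avx512vbmi)
    cpu = "cannonlake";
  else
    cpu = "skylake-avx512";     /* illustrative fallback */
  return cpu;
}
```

In the broken variant a CPU that matched the first test is re-evaluated by the second chain, so its result is always replaced, either by a later branch or by the fallback.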
Re: [PATCH v3] [aarch64] Add CPU support for Ampere Computing's eMAG.
> On 21.11.2018, at 11:26, Kyrill Tkachov wrote: > > Hi Christoph, > > On 20/11/18 18:00, Christoph Muellner wrote: >> Tested with "make check" and no regressions found. >> >> This patch depends on the struct xgene1_prefetch_tune, >> which has been acknowledged already: >> https://gcc.gnu.org/ml/gcc-patches/2018-11/msg00985.html >> >> *** gcc/ChangeLog *** >> >> 2018-xx-xx Christoph Muellner >> >> * config/aarch64/aarch64-cores.def: Define emag. >> * config/aarch64/aarch64-tune.md: Regenerated with emag. >> * config/aarch64/aarch64.c (emag_tunings): New struct. >> * doc/invoke.texi: Document mtune value. > > This looks ok to me but you'll need a maintainer to approve. > You mentioned this depends on your previously approved patches. > Do you have write access or do you need someone to commit them for you? I'd don't have write access. But I have already contacted somebody with write access to get my ACK'ed changes in. Thanks, Christoph > > Thanks, > Kyrill > >> Signed-off-by: Christoph Muellner >> --- >> gcc/config/aarch64/aarch64-cores.def | 3 +++ >> gcc/config/aarch64/aarch64-tune.md | 2 +- >> gcc/config/aarch64/aarch64.c | 25 + >> gcc/doc/invoke.texi | 2 +- >> 4 files changed, 30 insertions(+), 2 deletions(-) >> >> diff --git a/gcc/config/aarch64/aarch64-cores.def >> b/gcc/config/aarch64/aarch64-cores.def >> index 1f3ac56..68cca00 100644 >> --- a/gcc/config/aarch64/aarch64-cores.def >> +++ b/gcc/config/aarch64/aarch64-cores.def >> @@ -61,6 +61,9 @@ AARCH64_CORE("thunderxt88", thunderxt88, thunderx, >> 8A, AARCH64_FL_FOR_ARCH >> AARCH64_CORE("thunderxt81", thunderxt81, thunderx, 8A, >> AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx, 0x43, >> 0x0a2, -1) >> AARCH64_CORE("thunderxt83", thunderxt83, thunderx, 8A, >> AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx, 0x43, >> 0x0a3, -1) >> +/* Ampere Computing cores. 
*/ >> +AARCH64_CORE("emag",emag, xgene1,8A, AARCH64_FL_FOR_ARCH8 >> | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, emag, 0x50, 0x000, 3) >> + >> /* APM ('P') cores. */ >> AARCH64_CORE("xgene1", xgene1,xgene1,8A, >> AARCH64_FL_FOR_ARCH8, xgene1, 0x50, 0x000, -1) >> diff --git a/gcc/config/aarch64/aarch64-tune.md >> b/gcc/config/aarch64/aarch64-tune.md >> index fade1d4..2fc7f03 100644 >> --- a/gcc/config/aarch64/aarch64-tune.md >> +++ b/gcc/config/aarch64/aarch64-tune.md >> @@ -1,5 +1,5 @@ >> ;; -*- buffer-read-only: t -*- >> ;; Generated automatically by gentune.sh from aarch64-cores.def >> (define_attr "tune" >> - >> "cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,ares,tsv110,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55" >> + >> "cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,emag,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,ares,tsv110,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55" >> (const (symbol_ref "((enum attr_tune) aarch64_tune)"))) >> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c >> index f7f88a9..995aafe 100644 >> --- a/gcc/config/aarch64/aarch64.c >> +++ b/gcc/config/aarch64/aarch64.c >> @@ -957,6 +957,31 @@ static const struct tune_params xgene1_tunings = >>&xgene1_prefetch_tune >> }; >> +static const struct tune_params emag_tunings = >> +{ >> + &xgene1_extra_costs, >> + &xgene1_addrcost_table, >> + &xgene1_regmove_cost, >> + &xgene1_vector_cost, >> + &generic_branch_cost, >> + &xgene1_approx_modes, >> + 6, /* memmov_cost */ >> + 4, /* issue_rate */ >> + AARCH64_FUSE_NOTHING, /* fusible_ops */ >> + "16", 
/* function_align. */ >> + "16", /* jump_align. */ >> + "16", /* loop_align. */ >> + 2,/* int_reassoc_width. */ >> + 4,/* fp_reassoc_width. */ >> + 1,/* vec_reassoc_width. */ >> + 2,/* min_div_recip_mul_sf. */ >> + 2,/* min_div_recip_mul_df. */ >> + 17, /* max_case_values. */ >> + tune_params::AUTOPREFETCHER_OFF, /* autoprefetcher_model. */ >> + (AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS),/* tune_flags. */ >> + &xgene1_prefetch_tune >> +}; >> + >> static const struct tune_params qdf24xx_tunings = >> { >>&qdf24xx_extra_costs, >> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi >> index e016dce..ac81fb2 100644 >> --- a/gcc/doc/invoke.texi >> +++ b/gcc/doc/invoke.texi >> @@ -15288,7 +15288,7 @@ Specify the name of the target processor for which >> GCC should tune the >> performance of the code. Permissible values
Re: [PATCH 01/10] Fix IRA ICE.
On 21/11/2018 00:47, Jeff Law wrote: This seems like a really gross hack and sets an expectation that generating registers in the target after IRA has started is OK. It is not OK. The fact that this works is, IMHO, likely an accident. What's the proper test for this? Neither lra_in_progress nor reload_in_progress is set here, and can_create_pseudos returns true. The patterns have the ability to not generate registers, but they don't know not to. Richard Sandiford has stated that it should be OK, but perhaps the other architectures also work by accident? In fact, since we're using LRA (not reload), my understanding is that I ought to be able to create new pseudos right up until reload_completed. (Although, my experience is that it's easy to get into an infinite loop doing that.) I think this comes back to the fundamental representational issue with the EXEC handling that still needs to be addressed. Undoubtedly, this makes it worse, but even without that I'd still want to expand vector memory moves long before split1, so at least some cases have to generate additional registers. (Perhaps IRA doesn't create memory moves though? I'm not sure.) I'm going to investigate how easy it is to fix the EXEC representation issues. I've been resisting because I had a deadline to make, and it's bound to be an invasive and destabilizing alteration (albeit largely mechanical), but if it's going to be a barrier to commit then probably it's become time. :-( Andrew
Re: Fix PR rtl-optimization/85925
Hi Eric, On Wed, Nov 21, 2018 at 09:35:03AM +0100, Eric Botcazou wrote: > > This is saying that *every* op except those very few works on the full > > register. And that for every architecture that has W_R_O. > > That's still a progress over the previous situation. Yes. But it feels more than a bit wobbly. Segher
PR C++/88114 - patch for destructor not generated for "virtual ~destructor() = default"
Hello all, if a class contains any 'virtual ... = 0', it's an abstract class, and for an abstract class the destructor is not added to the vtable. For a normal virtual ~class() { } that's not a problem, as the class::~class() destructor will be generated during the parsing of the function. But for virtual ~class() = default; the destructor will only be generated via mark_used, triggered through the vtable. If one now declares a derived class and uses it, the class::~class() is generated in that translation unit, unless #pragma interface/implementation is used. In that case, the 'default' destructor will never be generated. The following code seems to work both for the big code and for the example; for the example, the destructor is generated only with '#pragma implementation', not without it. The patch survived bootstrapping GCC with default languages on x86-64-gnu-linux and "make check-g++".* [One probably could get rid of some of the conditions for generating the code, e.g. TREE_USED and DECL_DEFAULTED_FN are probably not both needed; one might want to set some additional DECL to the fn decl.] Do the patch and the test case make sense? Or is something else/in addition needed? 
Tobias *I do get the following failures on this CentOS6 system: FAIL: g++.dg/pr83239.C -std=gnu++98 (test for excess errors) Excess errors: cc1plus: warning: 'void* __builtin_memset(void*, int, long unsigned int)' specified size 18446744073709551608 exceeds maximum object size 9223372036854775807 [-Wstringop-overflow=] cc1plus: warning: 'void* __builtin_memset(void*, int, long unsigned int)' specified size 18446744073709551600 exceeds maximum object size 9223372036854775807 [-Wstringop-overflow=] FAIL: g++.dg/tls/thread_local-order2.C -std=c++14 execution test FAIL: g++.dg/tls/thread_local-order2.C -std=c++17 execution test plus each 32 times: FAIL: guality/guality.h: 0 PASS, 1 FAIL, 0 UNRESOLVED FAIL: guality/guality.h: varl is -1, not 6 PR C++/88114 * decl2.c (c_parse_final_cleanups): If needed, generate code for the destructor of an abstract class. (mark_used): Update comment for older function-name change. PR C++/88114 * g++.dg/cpp0x/defaulted61.C: New. diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c index ffc0d0d6ec4..056e49ad88a 100644 --- a/gcc/cp/decl2.c +++ b/gcc/cp/decl2.c @@ -4782,6 +4782,18 @@ c_parse_final_cleanups (void) { reconsider = true; keyed_classes->unordered_remove (i); + + /* For abstract classes, the destructor has been removed from the + vtable (in class.c's build_vtbl_initializer). For a compiler- + generated destructor, it hence might not have been generated in + this translation unit - and with '#pragma interface' it might + never get generated. */ + if (CLASSTYPE_PURE_VIRTUALS (t) + && TYPE_HAS_NONTRIVIAL_DESTRUCTOR (t)) + for (tree x = TYPE_FIELDS (t); x; x = DECL_CHAIN (x)) + if (DECL_DECLARES_FUNCTION_P (x) && DECL_DESTRUCTOR_P (x) + && !TREE_USED (x) && DECL_DEFAULTED_FN (x)) + note_vague_linkage_fn (x); } /* The input_location may have been changed during marking of vtable entries. 
*/ @@ -5465,7 +5477,7 @@ mark_used (tree decl, tsubst_flags_t complain) within the body of a function so as to avoid collecting live data on the stack (such as overload resolution candidates). - We could just let cp_write_global_declarations handle synthesizing + We could just let c_parse_final_cleanups handle synthesizing this function by adding it to deferred_fns, but doing it at the use site produces better error messages. */ ++function_depth; diff --git a/gcc/testsuite/g++.dg/cpp0x/defaulted61.C b/gcc/testsuite/g++.dg/cpp0x/defaulted61.C new file mode 100644 index 000..e7e0a486292 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp0x/defaulted61.C @@ -0,0 +1,22 @@ +// { dg-do compile { target c++11 } } +// { dg-final { scan-assembler "_ZN3OneD0Ev" } } + +// PR C++/88114 +// Destructor of an abstract class was never generated +// when compiling the class - nor later due to the +// '#pragma interface' + +#pragma implementation +#pragma interface + +class One +{ + public: + virtual ~One() = default; + void some_fn(); + virtual void later() = 0; + private: + int m_int; +}; + +void One::some_fn() { }
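For readers who want to see the ordinary (non-pragma) case next to the test case above, here is a minimal sketch that is not part of the patch. The class Two, the counter destroyed and the function destroy_through_base are made up for illustration; One follows the test case minus the #pragma lines and the data member. It only shows the single-translation-unit situation, where the defaulted destructor of the abstract class is emitted because the derived class's destructor calls it; it does not reproduce the '#pragma interface' failure itself.

```cpp
#include <cassert>
#include <memory>

// Hypothetical counter so the sketch can observe destructor calls.
int destroyed = 0;

// Abstract class with a defaulted virtual destructor, as in the test case.
class One
{
public:
  virtual ~One () = default;
  virtual void later () = 0;
};

// Hypothetical derived class.  Its destructor implicitly calls One::~One (),
// which is what makes the compiler emit the defaulted destructor in this
// translation unit.
class Two : public One
{
public:
  ~Two () override { ++destroyed; }
  void later () override { }
};

// Destroy a Two through a pointer to One and report how many Two
// destructors ran as a result.
int destroy_through_base ()
{
  int before = destroyed;
  {
    std::unique_ptr<One> p (new Two);
    p->later ();
  } // The delete goes through the virtual destructor of One.
  assert (destroyed - before == 1);
  return destroyed - before;
}
```

Calling destroy_through_base () from any function is enough to exercise it. With '#pragma interface' in effect for One, the derived-class translation unit would instead expect the destructor to come from the implementation file, which is the situation the patch addresses.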
Re: [PATCH v3] [aarch64] Add CPU support for Ampere Computing's eMAG.
This is currently slowed down by the speed of subversion (as my subversion tree was outdated). So it should only be a matter of days ... ;-) > On 21.11.2018, at 12:15, Christoph Müllner > wrote: > >> >> On 21.11.2018, at 11:26, Kyrill Tkachov wrote: >> >> Hi Christoph, >> >> On 20/11/18 18:00, Christoph Muellner wrote: >>> Tested with "make check" and no regressions found. >>> >>> This patch depends on the struct xgene1_prefetch_tune, >>> which has been acknowledged already: >>> https://gcc.gnu.org/ml/gcc-patches/2018-11/msg00985.html >>> >>> *** gcc/ChangeLog *** >>> >>> 2018-xx-xx Christoph Muellner >>> >>> * config/aarch64/aarch64-cores.def: Define emag. >>> * config/aarch64/aarch64-tune.md: Regenerated with emag. >>> * config/aarch64/aarch64.c (emag_tunings): New struct. >>> * doc/invoke.texi: Document mtune value. >> >> This looks ok to me but you'll need a maintainer to approve. >> You mentioned this depends on your previously approved patches. >> Do you have write access or do you need someone to commit them for you? > > I don't have write access. > But I have already contacted somebody with write access to get my ACK'ed > changes in. 
> > Thanks, > Christoph > >> >> Thanks, >> Kyrill >> >>> Signed-off-by: Christoph Muellner >>> --- >>> gcc/config/aarch64/aarch64-cores.def | 3 +++ >>> gcc/config/aarch64/aarch64-tune.md | 2 +- >>> gcc/config/aarch64/aarch64.c | 25 + >>> gcc/doc/invoke.texi | 2 +- >>> 4 files changed, 30 insertions(+), 2 deletions(-) >>> >>> diff --git a/gcc/config/aarch64/aarch64-cores.def >>> b/gcc/config/aarch64/aarch64-cores.def >>> index 1f3ac56..68cca00 100644 >>> --- a/gcc/config/aarch64/aarch64-cores.def >>> +++ b/gcc/config/aarch64/aarch64-cores.def >>> @@ -61,6 +61,9 @@ AARCH64_CORE("thunderxt88", thunderxt88, thunderx, >>> 8A, AARCH64_FL_FOR_ARCH >>> AARCH64_CORE("thunderxt81", thunderxt81, thunderx, 8A, >>> AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx, 0x43, >>> 0x0a2, -1) >>> AARCH64_CORE("thunderxt83", thunderxt83, thunderx, 8A, >>> AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx, 0x43, >>> 0x0a3, -1) >>> +/* Ampere Computing cores. */ >>> +AARCH64_CORE("emag",emag, xgene1,8A, >>> AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, emag, 0x50, >>> 0x000, 3) >>> + >>> /* APM ('P') cores. 
*/ >>> AARCH64_CORE("xgene1", xgene1,xgene1,8A, >>> AARCH64_FL_FOR_ARCH8, xgene1, 0x50, 0x000, -1) >>> diff --git a/gcc/config/aarch64/aarch64-tune.md >>> b/gcc/config/aarch64/aarch64-tune.md >>> index fade1d4..2fc7f03 100644 >>> --- a/gcc/config/aarch64/aarch64-tune.md >>> +++ b/gcc/config/aarch64/aarch64-tune.md >>> @@ -1,5 +1,5 @@ >>> ;; -*- buffer-read-only: t -*- >>> ;; Generated automatically by gentune.sh from aarch64-cores.def >>> (define_attr "tune" >>> - >>> "cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,ares,tsv110,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55" >>> + >>> "cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,emag,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,ares,tsv110,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55" >>> (const (symbol_ref "((enum attr_tune) aarch64_tune)"))) >>> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c >>> index f7f88a9..995aafe 100644 >>> --- a/gcc/config/aarch64/aarch64.c >>> +++ b/gcc/config/aarch64/aarch64.c >>> @@ -957,6 +957,31 @@ static const struct tune_params xgene1_tunings = >>> &xgene1_prefetch_tune >>> }; >>> +static const struct tune_params emag_tunings = >>> +{ >>> + &xgene1_extra_costs, >>> + &xgene1_addrcost_table, >>> + &xgene1_regmove_cost, >>> + &xgene1_vector_cost, >>> + &generic_branch_cost, >>> + &xgene1_approx_modes, >>> + 6, /* memmov_cost */ >>> + 4, /* issue_rate */ >>> + AARCH64_FUSE_NOTHING, /* fusible_ops */ >>> + "16",/* function_align. */ >>> + "16",/* jump_align. */ >>> + "16",/* loop_align. */ >>> + 2, /* int_reassoc_width. 
*/ >>> + 4, /* fp_reassoc_width. */ >>> + 1, /* vec_reassoc_width. */ >>> + 2, /* min_div_recip_mul_sf. */ >>> + 2, /* min_div_recip_mul_df. */ >>> + 17, /* max_case_values. */ >>> + tune_params::AUTOPREFETCHER_OFF, /* autoprefetcher_model. */ >>> + (AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS), /* tune_flags. */ >>> + &xgene1_prefetch_tune >>> +}; >>> + >>> static const struct tune_params qdf24xx_tunings = >>> { >>> &qdf24xx_extra_costs, >>> diff
Re: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87626
Thank you for the inputs and please find the attachment for the update patch. Do please let us know your comments on the same ~Umesh On Tue, Nov 20, 2018 at 3:03 PM Jakub Jelinek wrote: > > On Mon, Nov 19, 2018 at 04:08:29PM +0530, Lokesh Janghel wrote: > diff --git a/gcc/ChangeLog b/gcc/ChangeLog > index 8ca2e73..b55dfa9 100644 > --- a/gcc/ChangeLog > +++ b/gcc/ChangeLog > @@ -1,3 +1,8 @@ > +2018-11-19 Lokesh Janghel > > Two spaces between date and name and name and <, i.e. > 2018-11-20 Lokesh Janghel > in both ChangeLog files. > > --- /dev/null > +++ b/gcc/testsuite/gcc.target/i386/pr85667-2.c > @@ -0,0 +1,15 @@ > +/* { dg-do assemble } */ > +/* { dg-options "-O2 -masm=intel" } */ > +/* { dg-require-effective-target lp64 } */ > +/* { dg-require-effective-target masm_intel } */ > +/* { dg-final { scan-assembler-times "movl\[^\n\r]*, %eax" 1} } */ > +typedef struct > +{ > + float x; > +} Float; > +Float __attribute__((ms_abi)) fn1 () > +{ > + Float v; > + v.x = 3.145; > + return v; > +} > > This test wasn't properly tested: > > /usr/src/gcc/obj/gcc/xgcc -B/usr/src/gcc/obj/gcc/ -m64 -fno-diagnostics-show-caret -fno-diagnostics-show-line-numbers -fdiagnostics-color=never -O2 -masm=intel -ffat-lto-objects -fno-ident -c -o pr85667-2.o /usr/src/gcc/gcc/testsuite/gcc.target/i386/pr85667-2.c > PASS: gcc.target/i386/pr85667-2.c (test for excess errors) > gcc.target/i386/pr85667-2.c: output file does not exist > UNRESOLVED: gcc.target/i386/pr85667-2.c scan-assembler-times movl[^\n\r]*, %eax 1 > testcase /usr/src/gcc/gcc/testsuite/gcc.target/i386/i386.exp completed in 1 seconds > > 1) you do not want to use dg-do assemble, but dg-do compile, because only >in that case (or when using -save-temps) assembly is produced > 2) you do not want to use -masm=intel and then expect AT&T syntax in the >regexp > > Thus, I'd replace all the dg- directive lines with: > /* { dg-do compile { target lp64 } } */ > /* { dg-options "-O2" } */ > /* { dg-final { scan-assembler-times 
"movl\[^\n\r]*, %eax|mov\[ \t]*eax," 1 } } */ > > That way, it will work both with -masm=att (explicit or implicit) or > -masm=intel. > > One can use > > make check-gcc RUNTESTFLAGS='--target_board=unix\{-m32,-m64,-m64/-masm=intel\} i386.exp=pr85667*' > > to verify and then look at the log file. > > Furthermore, I'd copy pr85667-1.c test to pr85667-3.c and the modified > pr85667-2.c to pr85667-4.c, change Float to Double, float to double, remove > f suffixes and adjust all the eax in the regexp to rax, so that you also > test the struct with DFmode case. > > Jakub 85667.patch Description: Binary data
Re: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87626
On Wed, Nov 21, 2018 at 06:06:41PM +0530, Umesh Kalappa wrote: > Thank you for the inputs and please find the attachment for the update patch. LGTM. Jakub
Re: [PATCH] handle unusual targets in -Wbuiltin-declaration-mismatch (PR 88098)
Hi Martin, > By calling builtin_decl_explicit rather than builtin_decl_implicit > the updated patch in the attachment avoids test failures due to > missing warnings on targets with support for long double but whose > libc doesn't support C99 functions like fabsl (such as apparently > aarch64-linux). [...] > gcc/testsuite/ChangeLog: > > PR testsuite/88098 > * gcc.dg/Wbuiltin-declaration-mismatch-4.c: Adjust. > * gcc.dg/Wbuiltin-declaration-mismatch-5.c: New test. is the Wbuiltin-declaration-mismatch-5.c testcase still supposed to be part of the patch? It's in the ChangeLog, but missing from the revised patch. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
[PATCH] Fix up 20181120-1.c testcase on big-endian (PR rtl-optimization/85925)
Hi! As mentioned in the PR, the testcase fails on big-endian targets. The following patch tweaks it so that it does not fail there and still checks for the original bug. Tested on x86_64-linux and i686-linux, ok for trunk and release branches? 2018-11-21 Jakub Jelinek PR rtl-optimization/85925 * gcc.c-torture/execute/20181120-1.c (u): New variable. (main): Compare d against u.f1 rather than 0x101. --- gcc/testsuite/gcc.c-torture/execute/20181120-1.c.jj 2018-11-20 21:39:05.230507352 +0100 +++ gcc/testsuite/gcc.c-torture/execute/20181120-1.c2018-11-21 11:49:29.919488909 +0100 @@ -9,6 +9,7 @@ union U1 { unsigned f0; unsigned f1 : 15; }; +volatile union U1 u = { 0x10101 }; int main (void) { @@ -19,7 +20,7 @@ int main (void) *e = f.f1; } - if (d != 0x101) + if (d != u.f1) __builtin_abort (); return 0; Jakub
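As a side note on why the constant 0x101 is endian-dependent, the following is a rough sketch and not part of the patch. The names Bits, read_f1 and little_endian_p are made up for illustration, and the sketch copies bytes with memcpy instead of using the union from the test case. It assumes a GCC-compatible compiler where a 15-bit unsigned bit-field occupies one unsigned-sized storage unit; which 15 bits the field covers is implementation-defined.

```cpp
#include <cassert>
#include <cstring>

// Mirrors the 15-bit field f1 of union U1 from the test case.
struct Bits
{
  unsigned f1 : 15;
};

// Reinterpret the bytes of VALUE as a Bits object and read the bit-field.
// Which bits of VALUE end up in f1 depends on the target's layout rules.
unsigned read_f1 (unsigned value)
{
  static_assert (sizeof (Bits) == sizeof (unsigned), "layout assumption");
  Bits b;
  std::memcpy (&b, &value, sizeof b);
  return b.f1;
}

// True when the compiler reports a little-endian target via the
// GCC/Clang predefined macros; false if that cannot be determined.
bool little_endian_p ()
{
#if defined (__BYTE_ORDER__) && defined (__ORDER_LITTLE_ENDIAN__)
  return __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__;
#else
  return false;
#endif
}

// On little-endian GCC targets the field takes the low 15 bits of 0x10101,
// i.e. 0x101; elsewhere only the 15-bit range is checked.
void check_f1 ()
{
  unsigned f1 = read_f1 (0x10101);
  assert (f1 <= 0x7fff);
  if (little_endian_p ())
    assert (f1 == 0x101);
}
```

Comparing d against u.f1, as the patch does, sidesteps the question entirely because both sides are read through the same layout.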
Re: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87626
Hi Jakub and All, We don't have commit access; can someone please commit the patch for us? ~Umesh On Wed, Nov 21, 2018, 18:37 Jakub Jelinek On Wed, Nov 21, 2018 at 06:06:41PM +0530, Umesh Kalappa wrote: > > Thank you for the inputs and please find the attachment for the update > patch. > > LGTM. > > Jakub >
[PATCH] C++: show namespaces for enum values (PR c++/88121)
Consider this test case: namespace json { enum { JSON_OBJECT }; } void test () { JSON_OBJECT; } which erroneously accesses an enum value in another namespace without qualifying the access. GCC 6 through 8 issue a suggestion that doesn't mention the namespace: : In function 'void test()': :8:3: error: 'JSON_OBJECT' was not declared in this scope JSON_OBJECT; ^~~ :8:3: note: suggested alternative: :3:10: note: 'JSON_OBJECT' enum { JSON_OBJECT }; ^~~ which is suboptimal. I made the problem worse with r265610, as gcc 9 now consolidates the single suggestion into the error, and emits: : In function 'void test()': :8:3: error: 'JSON_OBJECT' was not declared in this scope; did you mean 'JSON_OBJECT'? 8 | JSON_OBJECT; | ^~~ | JSON_OBJECT :3:10: note: 'JSON_OBJECT' declared here 3 | enum { JSON_OBJECT }; | ^~~ where the message: 'JSON_OBJECT' was not declared in this scope; did you mean 'JSON_OBJECT'? is nonsensical. The root cause is that dump_scope doesn't print anything when called for CONST_DECL in a namespace: the scope is an ENUMERAL_TYPE, rather than a namespace. This patch tweaks dump_scope to detect ENUMERAL_TYPE, and to use the enclosing namespace, so that the CONST_DECL is dumped as "json::JSON_OBJECT". This changes the output for the above so that it refers to the namespace, fixing the issue: :8:3: error: 'JSON_OBJECT' was not declared in this scope; did you mean 'json::JSON_OBJECT'? 9 | JSON_OBJECT; | ^~~ | json::JSON_OBJECT 3:10: note: 'json::JSON_OBJECT' declared here 3 | enum { JSON_OBJECT }; | ^~~ Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. OK for trunk? gcc/cp/ChangeLog: PR c++/88121 * error.c (dump_scope): Handle ENUMERAL_TYPE by using the CP_TYPE_CONTEXT of the type. gcc/testsuite/ChangeLog: PR c++/88121 * g++.dg/lookup/suggestions-scoped-enums.C: New test. * g++.dg/lookup/suggestions-unscoped-enums.C: New test. 
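For reference, the lookup rule behind the suggestion can be sketched as follows. This is an illustration only, not part of the patch; object_kind and check_lookup are made-up names. The enumerators of an unscoped enum are declared in the scope that encloses the enum, here namespace json, so the qualified name is what the fixed diagnostic should point the user at.

```cpp
#include <cassert>

namespace json
{
  // Unnamed, unscoped enum: JSON_OBJECT is a member of namespace json.
  enum { JSON_OBJECT };
}

// An unqualified 'JSON_OBJECT' would not be found from here; the
// qualified name is the spelling the improved suggestion prints.
int object_kind ()
{
  return json::JSON_OBJECT;
}

// The first enumerator has the value 0.
void check_lookup ()
{
  assert (object_kind () == 0);
}
```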
--- gcc/cp/error.c | 6 ++ .../g++.dg/lookup/suggestions-scoped-enums.C | 13 .../g++.dg/lookup/suggestions-unscoped-enums.C | 91 ++ 3 files changed, 110 insertions(+) create mode 100644 gcc/testsuite/g++.dg/lookup/suggestions-scoped-enums.C create mode 100644 gcc/testsuite/g++.dg/lookup/suggestions-unscoped-enums.C diff --git a/gcc/cp/error.c b/gcc/cp/error.c index 72b42bd..6fee62d 100644 --- a/gcc/cp/error.c +++ b/gcc/cp/error.c @@ -182,6 +182,12 @@ dump_scope (cxx_pretty_printer *pp, tree scope, int flags) if (scope == NULL_TREE) return; + /* Enum values will be CONST_DECL with an ENUMERAL_TYPE as their + "scope". Use CP_TYPE_CONTEXT of the ENUMERAL_TYPE, so as to + print the enclosing namespace. */ + if (TREE_CODE (scope) == ENUMERAL_TYPE) +scope = CP_TYPE_CONTEXT (scope); + if (TREE_CODE (scope) == NAMESPACE_DECL) { if (scope != global_namespace) diff --git a/gcc/testsuite/g++.dg/lookup/suggestions-scoped-enums.C b/gcc/testsuite/g++.dg/lookup/suggestions-scoped-enums.C new file mode 100644 index 000..2bf3ed6 --- /dev/null +++ b/gcc/testsuite/g++.dg/lookup/suggestions-scoped-enums.C @@ -0,0 +1,13 @@ +// { dg-do compile { target c++11 } } +// { dg-options "-fdiagnostics-show-caret" } + +enum class vegetable { CARROT, TURNIP }; + +void misspelled_value_in_scoped_enum () +{ + vegetable::TURNUP; // { dg-error "'TURNUP' is not a member of 'vegetable'" } + /* { dg-begin-multiline-output "" } + vegetable::TURNUP; + ^~ + { dg-end-multiline-output "" } */ +} diff --git a/gcc/testsuite/g++.dg/lookup/suggestions-unscoped-enums.C b/gcc/testsuite/g++.dg/lookup/suggestions-unscoped-enums.C new file mode 100644 index 000..bc610d0 --- /dev/null +++ b/gcc/testsuite/g++.dg/lookup/suggestions-unscoped-enums.C @@ -0,0 +1,91 @@ +// { dg-options "-fdiagnostics-show-caret" } + +enum { LASAGNA, SPAGHETTI }; +namespace outer_ns_a +{ + enum enum_in_outer_ns_a { STRAWBERRY, BANANA }; + namespace inner_ns + { +enum enum_in_inner_ns { ELEPHANT, LION }; + } +} +namespace outer_ns_2 +{ + enum 
enum_in_outer_ns_2 { NIGHT, DAY }; +} + +void misspelled_enum_in_global_ns () +{ + SPOOGHETTI; // { dg-error "'SPOOGHETTI' was not declared in this scope; did you mean 'SPAGHETTI'" } + /* { dg-begin-multiline-output "" } + SPOOGHETTI; + ^~ + SPAGHETTI + { dg-end-multiline-output "" } */ +} + +void unqualified_enum_in_outer_ns () +{ + BANANA; // { dg-error "'BANANA' was not declared in this scope; did you mean 'outer_ns_a::BANANA'" } + /* { dg-begin-multiline-output "" } + BANANA; + ^~ + outer_ns_a::BANANA + { dg-end-multiline-output "" } */ + /* { dg-begin-multiline-output "" } + enum enum_in_outer_ns_a { STRAWBERRY, BANANA }; + ^~ +
Re: [PATCH] Fix missing dump_impl_location_t values, using a new dump_metadata_t
On Tue, Nov 20, 2018 at 8:37 PM David Malcolm wrote: > > The dump_* API attempts to capture emission location metadata for the > various dump messages, but looking in -fsave-optimization-record shows > that many dump messages are lacking useful impl_location values, instead > having this location within dumpfile.c: > > "impl_location": { > "file": "../../src/gcc/dumpfile.c", > "function": "ensure_pending_optinfo", > "line": 1169 > }, > > The problem is that the auto-capturing of dump_impl_location_t is tied to > dump_location_t, and this is tied to the dump_*_loc calls. If a message > comes from a dump_* call without a "_loc" suffix (e.g. dump_printf), the > current code synthesizes the dump_location_t within > dump_context::ensure_pending_optinfo, and thus saves the useless > impl_location seen above. > > This patch fixes things by changing the dump_* API so that, rather than > taking a dump_flags_t, they take a new class dump_metadata_t, which is > constructed from a dump_flags_t, but captures the emission location. > > Hence e.g.: > > dump_printf (MSG_NOTE, "some message\n"); > > implicitly builds a dump_metadata_t wrapping the MSG_NOTE and the > emission location. If there are several dump_printf calls without > a dump_*_loc call, the emission location within the optinfo is that > of the first dump call within it. > > The patch updates selftest::test_capture_of_dump_calls to verify > that the impl location of various dump_* calls is captured. I also > manually verified that the references to dumpfile.c in the saved > optimization records were fixed. > > Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. > > OK for trunk? OK. Richard. > gcc/ChangeLog: > * dump-context.h (dump_context::dump_loc): Convert 1st param from > dump_flags_t to const dump_metadata_t &. Convert 2nd param from > const dump_location_t & to const dump_user_location_t &. 
> (dump_context::dump_loc_immediate): Convert 2nd param from > const dump_location_t & to const dump_user_location_t &. > (dump_context::dump_gimple_stmt): Convert 1st param from > dump_flags_t to const dump_metadata_t &. > (dump_context::void dump_gimple_stmt_loc): Likewise; convert > 2nd param from const dump_location_t & to > const dump_user_location_t &. > (dump_context::dump_gimple_expr): Convert 1st param from > dump_flags_t to const dump_metadata_t &. > (dump_context::dump_gimple_expr_loc): Likewise; convert > 2nd param from const dump_location_t & to > const dump_user_location_t &. > (dump_context::dump_generic_expr): Convert 1st param from > dump_flags_t to const dump_metadata_t &. > (dump_context::dump_generic_expr_loc): Likewise; convert > 2nd param from const dump_location_t & to > const dump_user_location_t &. > (dump_context::dump_printf_va): Convert 1st param from > dump_flags_t to const dump_metadata_t &. > (dump_context::dump_printf_loc_va): Likewise; convert > 2nd param from const dump_location_t & to > const dump_user_location_t &. > (dump_context::dump_dec): Convert 1st param from > dump_flags_t to const dump_metadata_t &. > (dump_context::dump_symtab_node): Likewise. > (dump_context::begin_scope): Split out 2nd param into > user and impl locations. > (dump_context::ensure_pending_optinfo): Add metadata param. > (dump_context::begin_next_optinfo): Replace dump_location_t param > with metadata and user location. > * dumpfile.c (dump_context::dump_loc): Convert 1st param from > dump_flags_t to const dump_metadata_t &. Convert 2nd param from > const dump_location_t & to const dump_user_location_t &. > (dump_context::dump_loc_immediate): Convert 2nd param from > const dump_location_t & to const dump_user_location_t &. > (dump_context::dump_gimple_stmt): Convert 1st param from > dump_flags_t to const dump_metadata_t &. 
> (dump_context::void dump_gimple_stmt_loc): Likewise; convert > 2nd param from const dump_location_t & to > const dump_user_location_t &. > (dump_context::dump_gimple_expr): Convert 1st param from > dump_flags_t to const dump_metadata_t &. > (dump_context::dump_gimple_expr_loc): Likewise; convert > 2nd param from const dump_location_t & to > const dump_user_location_t &. > (dump_context::dump_generic_expr): Convert 1st param from > dump_flags_t to const dump_metadata_t &. > (dump_context::dump_generic_expr_loc): Likewise; convert > 2nd param from const dump_location_t & to > const dump_user_location_t &. > (dump_context::dump_printf_va): Convert 1st param from > dump_flags_t to const dump_metadata_t &. >
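The mechanism described in the quoted mail (a metadata object built implicitly from the flags that records where the dump call was written) can be sketched roughly as below. This is a simplified illustration and not the code from the patch: impl_location, metadata, note, MSG_NOTE_SKETCH and check_capture are hypothetical stand-ins. It assumes a compiler that provides __builtin_FILE, __builtin_LINE and __builtin_FUNCTION and evaluates them at the call site when they appear in default arguments, as GCC does.

```cpp
#include <cassert>

// Hypothetical stand-in for the emission location in the compiler source.
struct impl_location
{
  const char *file;
  int line;
  const char *function;
};

// Hypothetical stand-in for the metadata class: constructed implicitly
// from the flags, capturing the caller's location via default arguments.
struct metadata
{
  int flags;
  impl_location where;

  metadata (int f,
            const char *file = __builtin_FILE (),
            int line = __builtin_LINE (),
            const char *function = __builtin_FUNCTION ())
    : flags (f), where {file, line, function}
  {
  }
};

// Made-up flag value standing in for MSG_NOTE.
const int MSG_NOTE_SKETCH = 1;

// Stand-in for a dump call without the "_loc" suffix.  It returns the
// captured location instead of printing, so the capture can be inspected.
impl_location note (const metadata &m, const char *)
{
  return m.where;
}

// The call and __LINE__ sit on one line so the comparison is exact.
void check_capture ()
{
  int expected = __LINE__; impl_location got = note (MSG_NOTE_SKETCH, "some message\n");
  assert (got.line == expected);
}
```

The caller keeps writing note (MSG_NOTE_SKETCH, "...") exactly as before; the implicit conversion from the flag value is what picks up the location.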
Re: Fix PR37916 (unnecessary spilling)
On Wed, Nov 21, 2018 at 1:12 AM Jeff Law wrote: > > On 11/20/18 6:42 AM, Michael Matz wrote: > > Hi, > > > > this bug report is about cris generating worse code since tree-ssa. The > > effect is also visible on x86-64. The symptom is that the work horse of > > adler32.c (from zlib) needs spills in the inner loop, while gcc 3 did not, > > and those spills go away with -fno-tree-reassoc. > > > > The underlying reason for the spills is register pressure, which could > > either be rectified by the pressure aware scheduling (which cris doesn't > > have), or by simply not generating high pressure code to begin with. In > > this case it's TER which ultimately causes the register pressure to > > increase, and there are many plans in people minds how to fix this (make > > TER regpressure aware, do some regpressure scheduling on gimple, or even > > more ambitious things), but this patch doesn't tackle this. Instead it > > makes reassoc not generate the situation which causes TER to run wild. > > > > TER increasing register pressure is a long standing problem and so it has > > some heuristics to avoid that. One wobbly heuristic is that it doesn't > > TER expressions together that have the same base variable as their LHSs. > > But reassoc generates only anonymous SSA names for its temporary > > subexpressions, so that TER heuristic doesn't apply. In this testcase > > it's even the case that reassoc doesn't really change much code (one > > addition moves from the end to the beginning of the inner loop), so that > > whole rewriting is even pointless. > > > > In any case, let's use copy_ssa_name instead of make_ssa_name, when we > > have an obvious LHS; that leads to TER making the same decisions with and > > without -fno-tree-reassoc, leading to the same register pressure and no > > spills. I don't think this is OK. 
Take one example, in rewrite_expr_tree for the final recursion case we replace

  a_1 = _2 + _3;

with something else, like

  _4 = _5 + 1;

so we compute a new value that may not have been computed before and certainly not into the user variable a. If you change this to instead create

  a_4 = _5 + 1;

then this leads to wrong debug info showing a value for 'a' that never existed. You can observe this with

unsigned int __attribute__((noinline,noipa))
foo (unsigned int a, unsigned int b, unsigned int c, unsigned int d)
{
  a = a + 1;
  a = a + b;
  a = a + c;
  a = a + d;
  a = a + 3;
  return a;
}

int main()
{
  volatile unsigned x = foo (1, 3, 5, 7);
  return 0;
}

when you build this with -O -g -fno-var-tracking-assignments (VTA seems to hide the issue, probably due to not enough instructions in the end) you observe

(gdb) s
foo (a=1, b=3, c=5, d=7) at t.c:6
6         a = a + c;
(gdb) disassemble
Dump of assembler code for function foo:
=> 0x004004b2 <+0>:     lea    0x4(%rdx,%rcx,1),%eax
   0x004004b6 <+4>:     add    %esi,%eax
   0x004004b8 <+6>:     add    %edi,%eax
   0x004004ba <+8>:     retq
End of assembler dump.
(gdb) p a
$1 = 1
(gdb) si
7         a = a + d;
(gdb) p a
$2 = 16
(gdb) si
9         return a;
(gdb) p a
$3 = 19
(gdb) si
0x004004ba      9         return a;
(gdb) p a
$4 =
(gdb) p $eax
$5 = 20
(gdb) quit

printing values for a that never should occur. The sequence of "allowed" values is 1, 2, 5, 10, 17, 20. Both 16 and 19 should never be printed. It's quite obvious if you look at the reassoc result which is

  a_11 = d_7(D) + 4;
  a_12 = a_11 + c_5(D);
  a_13 = a_12 + b_3(D);
  a_9 = a_13 + a_1(D);
  return a_9;

and the fact that var-tracking looks at REG_DECL for debug info location generation.
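The arithmetic in the example above can be checked mechanically. The sketch below is only an illustration; source_order, reassoc_order and check_values are made-up names. It evaluates foo (1, 3, 5, 7) once in source order and once in the order of the reassociated statements, and records every intermediate value.

```cpp
#include <cassert>
#include <vector>

// Values the user variable 'a' takes when foo runs in source order,
// starting with the incoming argument.
std::vector<unsigned> source_order (unsigned a, unsigned b, unsigned c,
                                    unsigned d)
{
  std::vector<unsigned> seen {a};
  a = a + 1; seen.push_back (a);
  a = a + b; seen.push_back (a);
  a = a + c; seen.push_back (a);
  a = a + d; seen.push_back (a);
  a = a + 3; seen.push_back (a);
  return seen;
}

// Values computed by the reassociated statement sequence
// (a_11, a_12, a_13, a_9 in the dump above).
std::vector<unsigned> reassoc_order (unsigned a, unsigned b, unsigned c,
                                     unsigned d)
{
  unsigned a_11 = d + 4;
  unsigned a_12 = a_11 + c;
  unsigned a_13 = a_12 + b;
  unsigned a_9 = a_13 + a;
  return {a_11, a_12, a_13, a_9};
}

// Only the final result agrees; the intermediates 11, 16 and 19 are not
// values 'a' ever holds in source order.
void check_values ()
{
  std::vector<unsigned> src = source_order (1, 3, 5, 7);
  std::vector<unsigned> re = reassoc_order (1, 3, 5, 7);
  assert ((src == std::vector<unsigned> {1, 2, 5, 10, 17, 20}));
  assert ((re == std::vector<unsigned> {11, 16, 19, 20}));
  assert (src.back () == re.back ());
}
```

Attaching those intermediates to 'a' in the debug info is what makes gdb print 16 and 19.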
With VTA we get # DEBUG BEGIN_STMT # DEBUG D#3 => a_1(D) + 1 # DEBUG a => D#3 # DEBUG BEGIN_STMT a_11 = d_7(D) + 4; # DEBUG D#2 => D#3 + b_3(D) # DEBUG a => D#2 # DEBUG BEGIN_STMT a_12 = a_11 + c_5(D); # DEBUG D#1 => D#2 + c_5(D) # DEBUG a => D#1 # DEBUG BEGIN_STMT a_13 = a_12 + b_3(D); # DEBUG a => D#1 + d_7(D) # DEBUG BEGIN_STMT a_9 = a_13 + a_1(D); # DEBUG a => a_9 # DEBUG BEGIN_STMT return a_9; but DEBUG stmts in itself do not make the code valid from a debug perspective. Note that the reassociated stmts can become DEBUG stmts itself if they are later DCEd. So your patch has to be much more careful to never change the LHS of stmts that are adjusted (which I think reassoc already does). Richard. > > On x86-64 the effect is: > > before patch: 48 bytes stackframe, 24 stack > > accesses (most of them in the loops), 576 bytes codesize; > > after patch: no stack frame, no stack accesses, 438 bytes codesize > > > > On cris: > > before patch: 64 bytes stack frame, 27 stack access in loops, size of .s > > 145 lines > > after patch: 20 bytes stack frame (as it uses callee saved regs, which > > is also complained about in the bug report), but no stack accesses > > in loops, size of .s: 125 lines > > > > I'm wondering about testcase: should I add an x86-64 specific that tests > > for no
Re: compute discriminator info for overrides
On Wed, Nov 21, 2018 at 2:06 AM Alexandre Oliva wrote:
>
> In some cases of overriding or resetting locations, we might retain
> discriminator info from earlier locations, when we should take
> discriminator information from the overriding location or reset it.
>
> Regstrapped on x86_64-linux-gnu.  Ok to install?

OK.

Richard.

> for  gcc/ChangeLog
>
>         * final.c (compute_discriminator): Declare.  Renamed from...
>         (maybe_set_discriminator): ... this.  Set and return a local.
>         (override_discriminator): New.
>         (final_scan_insn_1): Set it.
>         (notice_source_line): Adjust.  Always set discriminator.
> ---
>  gcc/final.c |   19 +++
>  1 file changed, 15 insertions(+), 4 deletions(-)
>
> diff --git a/gcc/final.c b/gcc/final.c
> index 0c1ac625f37a..f707d2fc0bcd 100644
> --- a/gcc/final.c
> +++ b/gcc/final.c
> @@ -128,6 +128,7 @@ static int last_discriminator;
>  /* Discriminator to be written to assembly for current instruction.
>     Note: actual usage depends on loc_discriminator_kind setting.  */
>  static int discriminator;
> +static inline int compute_discriminator (location_t loc);
>
>  /* Discriminator identifying current basic block among others sharing
>     the same locus.  */
> @@ -149,6 +150,7 @@ static const char *last_filename;
>  static const char *override_filename;
>  static int override_linenum;
>  static int override_columnnum;
> +static int override_discriminator;
>
>  /* Whether to force emission of a line note before the next insn.  */
>  static bool force_source_line = false;
> @@ -2342,6 +2344,7 @@ final_scan_insn_1 (rtx_insn *insn, FILE *file, int optimize_p ATTRIBUTE_UNUSED,
>               override_filename = LOCATION_FILE (*locus_ptr);
>               override_linenum = LOCATION_LINE (*locus_ptr);
>               override_columnnum = LOCATION_COLUMN (*locus_ptr);
> +             override_discriminator = compute_discriminator (*locus_ptr);
>             }
>         }
>       break;
> @@ -2379,12 +2382,14 @@ final_scan_insn_1 (rtx_insn *insn, FILE *file, int optimize_p ATTRIBUTE_UNUSED,
>               override_filename = LOCATION_FILE (*locus_ptr);
>               override_linenum = LOCATION_LINE (*locus_ptr);
>               override_columnnum = LOCATION_COLUMN (*locus_ptr);
> +             override_discriminator = compute_discriminator (*locus_ptr);
>             }
>           else
>             {
>               override_filename = NULL;
>               override_linenum = 0;
>               override_columnnum = 0;
> +             override_discriminator = 0;
>             }
>         }
>       break;
> @@ -3185,9 +3190,11 @@ map_decl_to_instance (const_tree decl)
>
>  /* Set DISCRIMINATOR to the appropriate value, possibly derived from LOC.  */
>
> -static inline void
> -maybe_set_discriminator (location_t loc)
> +static inline int
> +compute_discriminator (location_t loc)
>  {
> +  int discriminator;
> +
>    if (!decl_to_instance_map)
>      discriminator = bb_discriminator;
>    else
> @@ -3209,6 +3216,8 @@ maybe_set_discriminator (location_t loc)
>
>        discriminator = map_decl_to_instance (decl);
>      }
> +
> +  return discriminator;
>  }
>
>  /* Return whether a source line note needs to be emitted before INSN.
> @@ -3234,7 +3243,7 @@ notice_source_line (rtx_insn *insn, bool *is_stmt)
>        filename = xloc.file;
>        linenum = xloc.line;
>        columnnum = xloc.column;
> -      maybe_set_discriminator (loc);
> +      discriminator = compute_discriminator (loc);
>        force_source_line = true;
>      }
>    else if (override_filename)
> @@ -3242,6 +3251,7 @@ notice_source_line (rtx_insn *insn, bool *is_stmt)
>        filename = override_filename;
>        linenum = override_linenum;
>        columnnum = override_columnnum;
> +      discriminator = override_discriminator;
>      }
>    else if (INSN_HAS_LOCATION (insn))
>      {
> @@ -3249,13 +3259,14 @@ notice_source_line (rtx_insn *insn, bool *is_stmt)
>        filename = xloc.file;
>        linenum = xloc.line;
>        columnnum = xloc.column;
> -      maybe_set_discriminator (INSN_LOCATION (insn));
> +      discriminator = compute_discriminator (INSN_LOCATION (insn));
>      }
>    else
>      {
>        filename = NULL;
>        linenum = 0;
>        columnnum = 0;
> +      discriminator = 0;
>      }
>
>    if (filename == NULL)
>
> --
> Alexandre Oliva, freedom fighter   https://FSFLA.org/blogs/lxo
> Be the change, be Free!         FSF Latin America board member
> GNU Toolchain Engineer                Free Software Evangelist
> Hay que enGNUrecerse, pero sin perder la terGNUra jamás-GNUChe
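For readers who want the shape of the fix without the rest of final.c, here is a minimal sketch, assuming a much simplified location record. `struct loc`, `set_override` and the `cur_*` variables are hypothetical names invented for illustration; only the idea (compute the discriminator as a return value, cache it alongside the override, and reset it on every path) is taken from the patch above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a source location.  */
struct loc
{
  const char *file;
  int line;
  int discriminator;
};

/* Override state, analogous in spirit to override_filename etc.  */
static const char *override_file;
static int override_line;
static int override_discriminator;

/* State that would be emitted for the current instruction.  */
static const char *cur_file;
static int cur_line;
static int cur_discriminator;

/* Return the discriminator for L instead of storing it in a global,
   so that callers decide where the value goes.  */
static int
compute_discriminator (const struct loc *l)
{
  return l->discriminator;
}

/* Install L as the overriding location, or clear the override when
   L is NULL.  The discriminator is kept in sync in both branches.  */
static void
set_override (const struct loc *l)
{
  if (l)
    {
      override_file = l->file;
      override_line = l->line;
      override_discriminator = compute_discriminator (l);
    }
  else
    {
      override_file = NULL;
      override_line = 0;
      override_discriminator = 0;
    }
}

/* Pick the location for the next instruction.  Every branch assigns
   cur_discriminator, so nothing stale survives from an earlier call.  */
static void
notice_source_line (const struct loc *insn_loc)
{
  if (override_file)
    {
      cur_file = override_file;
      cur_line = override_line;
      cur_discriminator = override_discriminator;
    }
  else if (insn_loc)
    {
      cur_file = insn_loc->file;
      cur_line = insn_loc->line;
      cur_discriminator = compute_discriminator (insn_loc);
    }
  else
    {
      cur_file = NULL;
      cur_line = 0;
      cur_discriminator = 0;
    }
}
```

If the override branch did not assign `cur_discriminator`, a discriminator left over from an earlier instruction would be paired with the overriding file and line, which is the kind of retention the patch description mentions.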
Re: Fix PR37916 (unnecessary spilling)
On 11/21/18 7:13 AM, Richard Biener wrote: > On Wed, Nov 21, 2018 at 1:12 AM Jeff Law wrote: >> >> On 11/20/18 6:42 AM, Michael Matz wrote: >>> Hi, >>> >>> this bug report is about cris generating worse code since tree-ssa. The >>> effect is also visible on x86-64. The symptom is that the work horse of >>> adler32.c (from zlib) needs spills in the inner loop, while gcc 3 did not, >>> and those spills go away with -fno-tree-reassoc. >>> >>> The underlying reason for the spills is register pressure, which could >>> either be rectified by the pressure aware scheduling (which cris doesn't >>> have), or by simply not generating high pressure code to begin with. In >>> this case it's TER which ultimately causes the register pressure to >>> increase, and there are many plans in people minds how to fix this (make >>> TER regpressure aware, do some regpressure scheduling on gimple, or even >>> more ambitious things), but this patch doesn't tackle this. Instead it >>> makes reassoc not generate the situation which causes TER to run wild. >>> >>> TER increasing register pressure is a long standing problem and so it has >>> some heuristics to avoid that. One wobbly heuristic is that it doesn't >>> TER expressions together that have the same base variable as their LHSs. >>> But reassoc generates only anonymous SSA names for its temporary >>> subexpressions, so that TER heuristic doesn't apply. In this testcase >>> it's even the case that reassoc doesn't really change much code (one >>> addition moves from the end to the beginning of the inner loop), so that >>> whole rewriting is even pointless. >>> >>> In any case, let's use copy_ssa_name instead of make_ssa_name, when we >>> have an obvious LHS; that leads to TER making the same decisions with and >>> without -fno-tree-reassoc, leading to the same register pressure and no >>> spills. > > I don't think this is OK. 
> Take one example, in rewrite_expr_tree for the final
> recursion case we replace
>
>   a_1 = _2 + _3;
>
> with sth else, like
>
>   _4 = _5 + 1;
>
> so we compute a new value that may not have been computed before and
> certainly not into the user variable a.  If you change this to instead create
>
>   a_4 = _5 + 1;
>
> then this leads to wrong debug info showing a value for 'a' that never
> existed.
> You can observe this with

But isn't the point to use an underlying SSA_NAME_VAR when we have one
that should be appropriate?  Are we just being too aggressive with using
copy_ssa_name?

jeff
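For context, the loop being discussed has roughly the following shape. This is a plain, unoptimized sketch of the Adler-32 recurrence, not zlib's actual adler32.c (which unrolls the loop and defers the modulo); the function name `adler32_simple` is made up here. The two running sums that depend on each other are what reassociation and TER end up rearranging.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ADLER_MOD 65521u  /* largest prime below 2^16 */

/* Straightforward Adler-32: s1 accumulates the bytes, s2 accumulates
   the successive values of s1.  */
static uint32_t
adler32_simple (const unsigned char *buf, size_t len)
{
  uint32_t s1 = 1;
  uint32_t s2 = 0;

  for (size_t i = 0; i < len; i++)
    {
      s1 = (s1 + buf[i]) % ADLER_MOD;
      s2 = (s2 + s1) % ADLER_MOD;
    }
  return (s2 << 16) | s1;
}
```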
[PATCH] Fix PR88133
This fixes a bogus warning in bitmap.c by avoiding the problematic
transform of cunrolli, thereby eliding the elt->bits[0] test for
--disable-checking.

Bootstrapped (with and without --disable-checking) and tested on
x86_64-unknown-linux-gnu, applied.

Richard.

2018-11-21  Richard Biener

        PR bootstrap/88133
        * bitmap.c (bitmap_last_set_bit): Refactor to avoid warning.
        * Makefile.in (bitmap.o-warn): Remove again.

Index: gcc/bitmap.c
===================================================================
--- gcc/bitmap.c        (revision 266340)
+++ gcc/bitmap.c        (working copy)
@@ -1186,13 +1186,13 @@ bitmap_last_set_bit (const_bitmap a)
     elt = elt->next;
 
   bit_no = elt->indx * BITMAP_ELEMENT_ALL_BITS;
-  for (ix = BITMAP_ELEMENT_WORDS - 1; ix >= 0; ix--)
+  for (ix = BITMAP_ELEMENT_WORDS - 1; ix >= 1; ix--)
     {
       word = elt->bits[ix];
       if (word)
         goto found_bit;
     }
-  gcc_unreachable ();
+  gcc_assert (elt->bits[ix] != 0);
  found_bit:
   bit_no += ix * BITMAP_WORD_BITS;
 #if GCC_VERSION >= 3004
Index: gcc/Makefile.in
===================================================================
--- gcc/Makefile.in     (revision 266340)
+++ gcc/Makefile.in     (working copy)
@@ -221,7 +221,6 @@ libgcov-merge-tool.o-warn = -Wno-error
 gimple-match.o-warn = -Wno-unused
 generic-match.o-warn = -Wno-unused
 dfp.o-warn = -Wno-strict-aliasing
-bitmap.o-warn = -Wno-error=array-bounds # PR 87926
 
 # All warnings have to be shut off in stage1 if the compiler used then
 # isn't gcc; configure determines that.  WARN_CFLAGS will be either
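The loop restructuring can be tried outside GCC with a small stand-alone sketch. `ELT_WORDS`, `WORD_BITS` and `last_set_bit` below are hypothetical stand-ins, not the bitmap.c definitions. The loop stops at index 1, so index 0 is handled by the fall-through path and the loop can never leave with a negative index. Note that the sketch reloads `word` explicitly on that path; the quoted hunk does not show such a reload, so treat that line as an assumption of this example.

```c
#include <assert.h>
#include <limits.h>

#define ELT_WORDS 2  /* hypothetical stand-in for BITMAP_ELEMENT_WORDS */
#define WORD_BITS ((int) (sizeof (unsigned long) * CHAR_BIT))

/* Return the index of the highest set bit in BITS, which must have
   at least one bit set.  */
static int
last_set_bit (const unsigned long bits[ELT_WORDS])
{
  int ix;
  unsigned long word;

  /* Scan the upper words only; word 0 is the fall-through case.  */
  for (ix = ELT_WORDS - 1; ix >= 1; ix--)
    {
      word = bits[ix];
      if (word)
        goto found_bit;
    }
  /* Here ix == 0.  A non-empty element must have a bit in word 0.  */
  word = bits[ix];
  assert (word != 0);
 found_bit:
  {
    int bit = 0;
    /* Portable replacement for a count-leading-zeros builtin.  */
    while (word >>= 1)
      bit++;
    return ix * WORD_BITS + bit;
  }
}
```

With the original `ix >= 0` bound, the code after the loop is only reachable with `ix == -1`, which is presumably what the array-bounds warning latched onto once the loop was unrolled.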
Re: Fix regression introduced by 88069
On Wed, Nov 21, 2018 at 3:28 AM Jeff Law wrote:
>
> Richi's recent change to fix 88069 is causing various targets to fail
> tree-ssa/20030711-2.c.  That test is verifying a variety of
> optimizations occur during the first DOM pass.
>
> Prior to Richi's change FRE1 would do some significant cleanups of the
> IL and as a result DOM was fully able to optimize the resultant code.

Hum... I obviously missed the FAIL during testing somehow.  FRE1
behavior shouldn't change so I'll fixup.

Richard.

> After Richi's change we've got a redundant load in the IL.  After
> analyzing the CFG and IL it was clear that DOM *should* be able to
> remove the redundant load, but simply wasn't.
>
> DOM would discover that it could statically determine the result of a
> branch condition.  This resulted in one arm of the branch becoming
> unreachable.  That in turn caused some PHI nodes to become degenerates.
>
> Normally when a PHI node becomes a degenerate we record it as a copy in
> the const_and_copies table and *most* of the time we'll propagate the
> src value into uses of the dest.  But propagation is not guaranteed
> (there's a BZ around that issue you can find if you dig into the history
> of some of this code).
>
> Anyway, exposing the degenerate PHI *should* have exposed the redundant
> load, but we didn't record anything into the const/copies table for the
> virtual phi.  That's a conscious decision to avoid issues with
> overlapping lifetimes of virtual SSA_NAMEs.
>
> While investigating the history here I noticed Richi's little trick
> which allows propagation of virtuals if we propagate to all the uses.
> Twiddling DOM to use that same trick results in the virtual operand
> propagating.  That in turn allows DOM to see and remove the redundant load.
>
> Bootstrapped and regression tested on x86_64 where it fixes
> 20030711-2.c.  Also verified that it fixed various other targets where
> that test had started failing.
>
> Installing on the trunk.
>
> jeff
Re: Fix regression introduced by 88069
On 11/21/18 7:21 AM, Richard Biener wrote:
> On Wed, Nov 21, 2018 at 3:28 AM Jeff Law wrote:
>>
>> Richi's recent change to fix 88069 is causing various targets to fail
>> tree-ssa/20030711-2.c.  That test is verifying a variety of
>> optimizations occur during the first DOM pass.
>>
>> Prior to Richi's change FRE1 would do some significant cleanups of the
>> IL and as a result DOM was fully able to optimize the resultant code.
>
> Hum... I obviously missed the FAIL during testing somehow.  FRE1
> behavior shouldn't change so I'll fixup.

NP.  The change to DOM is probably a good thing independently.  I may go
back and factor that little hunk into a reusable function as we've got a
nearly identical copy elsewhere.

jeff
Re: Fix PR37916 (unnecessary spilling)
On Wed, Nov 21, 2018 at 3:16 PM Jeff Law wrote: > > On 11/21/18 7:13 AM, Richard Biener wrote: > > On Wed, Nov 21, 2018 at 1:12 AM Jeff Law wrote: > >> > >> On 11/20/18 6:42 AM, Michael Matz wrote: > >>> Hi, > >>> > >>> this bug report is about cris generating worse code since tree-ssa. The > >>> effect is also visible on x86-64. The symptom is that the work horse of > >>> adler32.c (from zlib) needs spills in the inner loop, while gcc 3 did not, > >>> and those spills go away with -fno-tree-reassoc. > >>> > >>> The underlying reason for the spills is register pressure, which could > >>> either be rectified by the pressure aware scheduling (which cris doesn't > >>> have), or by simply not generating high pressure code to begin with. In > >>> this case it's TER which ultimately causes the register pressure to > >>> increase, and there are many plans in people minds how to fix this (make > >>> TER regpressure aware, do some regpressure scheduling on gimple, or even > >>> more ambitious things), but this patch doesn't tackle this. Instead it > >>> makes reassoc not generate the situation which causes TER to run wild. > >>> > >>> TER increasing register pressure is a long standing problem and so it has > >>> some heuristics to avoid that. One wobbly heuristic is that it doesn't > >>> TER expressions together that have the same base variable as their LHSs. > >>> But reassoc generates only anonymous SSA names for its temporary > >>> subexpressions, so that TER heuristic doesn't apply. In this testcase > >>> it's even the case that reassoc doesn't really change much code (one > >>> addition moves from the end to the beginning of the inner loop), so that > >>> whole rewriting is even pointless. > >>> > >>> In any case, let's use copy_ssa_name instead of make_ssa_name, when we > >>> have an obvious LHS; that leads to TER making the same decisions with and > >>> without -fno-tree-reassoc, leading to the same register pressure and no > >>> spills. 
> > > > I don't think this is OK. Take one example, in rewrite_expr_tree for the > > final > > recursion case we replace > > > >a_1 = _2 + _3; > > > > with sth else, like > > > > _4 = _5 + 1; > > > > so we compute a new value that may not have been computed before and > > certainly not into the user variable a. If you change this to instead > > create > > > > a_4 = _5 + 1; > > > > then this leads to wrong debug info showing a value for 'a' that never > > existed. > > You can observe this with > But isn't the point to use an underlying SSA_NAME_VAR when we have one > that should be appropriate? Are we just being too aggressive with using > copy_ssa_name? Well, sure - there _might_ be places that could use copy_ssa_name in reassoc but I doubt so. Usually code-generation should not use copy_ssa_name given the aforementioned issues. Richard. > > jeff >
[PATCH] Alternate fix for PR87229, fix PR88112
My previous fix for PR87229 was too aggressive. The following simply teaches the LTO streamer to deal with CALL_EXPRs, support for which was already in place. I've amended it with two missing pieces, streaming of CALL_EXPR_BY_DESCRIPTOR and CALL_EXPR_IFN. LTO bootstrapped and tested on x86_64-unknown-linux-gnu with Ada enabled. Any objections? As said elsewhere I'm looking for sth that is reasonably safe for GCC 8 as well given the PR is a regression there. Richard. 2018-11-21 Richard Biener PR lto/87229 PR lto/88112 * lto-streamer-out.c (lto_is_streamable): Allow CALL_EXPRs which can appear in size expressions. * tree-streamer-in.c (unpack_ts_base_value_fields): Stream CALL_EXPR_BY_DESCRIPTOR. (streamer_read_tree_bitfields): Stream CALL_EXPR_IFN. * tree-streamer-out.c (pack_ts_base_value_fields): Stream CALL_EXPR_BY_DESCRIPTOR. (streamer_write_tree_bitfields): Stream CALL_EXPR_IFN. Revert PR lto/87229 * tree.c (free_lang_data_in_one_sizepos): Free non-gimple-val sizepos values. Index: gcc/lto-streamer-out.c === --- gcc/lto-streamer-out.c (revision 266308) +++ gcc/lto-streamer-out.c (working copy) @@ -306,7 +306,6 @@ lto_is_streamable (tree expr) name version in lto_output_tree_ref (see output_ssa_names). 
*/ return !is_lang_specific (expr) && code != SSA_NAME -&& code != CALL_EXPR && code != LANG_TYPE && code != MODIFY_EXPR && code != INIT_EXPR Index: gcc/tree-streamer-in.c === --- gcc/tree-streamer-in.c (revision 266308) +++ gcc/tree-streamer-in.c (working copy) @@ -158,6 +158,11 @@ unpack_ts_base_value_fields (struct bitp SSA_NAME_IS_DEFAULT_DEF (expr) = (unsigned) bp_unpack_value (bp, 1); bp_unpack_value (bp, 8); } + else if (TREE_CODE (expr) == CALL_EXPR) +{ + CALL_EXPR_BY_DESCRIPTOR (expr) = (unsigned) bp_unpack_value (bp, 1); + bp_unpack_value (bp, 8); +} else bp_unpack_value (bp, 9); } @@ -521,6 +526,8 @@ streamer_read_tree_bitfields (struct lto MR_DEPENDENCE_BASE (expr) = (unsigned)bp_unpack_value (&bp, sizeof (short) * 8); } + else if (code == CALL_EXPR) + CALL_EXPR_IFN (expr) = bp_unpack_enum (&bp, internal_fn, IFN_LAST); } if (CODE_CONTAINS_STRUCT (code, TS_BLOCK)) Index: gcc/tree-streamer-out.c === --- gcc/tree-streamer-out.c (revision 266308) +++ gcc/tree-streamer-out.c (working copy) @@ -129,6 +129,11 @@ pack_ts_base_value_fields (struct bitpac bp_pack_value (bp, SSA_NAME_IS_DEFAULT_DEF (expr), 1); bp_pack_value (bp, 0, 8); } + else if (TREE_CODE (expr) == CALL_EXPR) +{ + bp_pack_value (bp, CALL_EXPR_BY_DESCRIPTOR (expr), 1); + bp_pack_value (bp, 0, 8); +} else bp_pack_value (bp, 0, 9); } @@ -457,6 +462,8 @@ streamer_write_tree_bitfields (struct ou if (MR_DEPENDENCE_CLIQUE (expr) != 0) bp_pack_value (&bp, MR_DEPENDENCE_BASE (expr), sizeof (short) * 8); } + else if (code == CALL_EXPR) + bp_pack_enum (&bp, internal_fn, IFN_LAST, CALL_EXPR_IFN (expr)); } if (CODE_CONTAINS_STRUCT (code, TS_BLOCK)) Index: gcc/tree.c === --- gcc/tree.c (revision 266308) +++ gcc/tree.c (working copy) @@ -5254,13 +5254,6 @@ free_lang_data_in_one_sizepos (tree *exp tree expr = *expr_p; if (CONTAINS_PLACEHOLDER_P (expr)) *expr_p = build0 (PLACEHOLDER_EXPR, TREE_TYPE (expr)); - /* ??? 
We have to reset all non-GIMPLE sizepos because those eventually - refer to trees we cannot stream. See for example PR87229 which - shows an example with non-gimplified abstract origins in C++. - Note this should only happen for abstract copies so setting sizes - to NULL is OK (but we cannot easily assert this). */ - else if (expr && !is_gimple_val (expr)) -*expr_p = NULL_TREE; }
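The streamer hunks above depend on the writer and the reader agreeing bit for bit: whatever one side packs (one flag bit plus eight padding bits for a CALL_EXPR), the other must unpack in the same order and with the same widths. A minimal sketch of that symmetry follows, assuming a toy 64-bit packer. `struct bitpack`, `bp_pack`, `bp_unpack`, `write_call_flags` and `read_call_flags` are hypothetical names and are not GCC's bitpack API.

```c
#include <assert.h>
#include <stdint.h>

/* Toy bit packer: one 64-bit word and a cursor.  */
struct bitpack
{
  uint64_t word;
  unsigned pos;
};

/* Append the low NBITS bits of VAL at the cursor.  */
static void
bp_pack (struct bitpack *bp, uint64_t val, unsigned nbits)
{
  assert (nbits > 0 && nbits < 64 && bp->pos + nbits <= 64);
  uint64_t mask = (UINT64_C (1) << nbits) - 1;
  bp->word |= (val & mask) << bp->pos;
  bp->pos += nbits;
}

/* Read NBITS bits at the cursor and advance it.  */
static uint64_t
bp_unpack (struct bitpack *bp, unsigned nbits)
{
  assert (nbits > 0 && nbits < 64 && bp->pos + nbits <= 64);
  uint64_t mask = (UINT64_C (1) << nbits) - 1;
  uint64_t val = (bp->word >> bp->pos) & mask;
  bp->pos += nbits;
  return val;
}

/* Writer side: one flag bit, then 8 padding bits, so that this case
   occupies the same 9 bits as every other tree code.  */
static uint64_t
write_call_flags (int by_descriptor)
{
  struct bitpack bp = { 0, 0 };
  bp_pack (&bp, by_descriptor != 0, 1);
  bp_pack (&bp, 0, 8);
  return bp.word;
}

/* Reader side: mirror the writer exactly, discarding the padding.
   *CONSUMED receives the number of bits read.  */
static int
read_call_flags (uint64_t word, unsigned *consumed)
{
  struct bitpack bp = { word, 0 };
  int by_descriptor = (int) bp_unpack (&bp, 1);
  (void) bp_unpack (&bp, 8);
  *consumed = bp.pos;
  return by_descriptor;
}
```

If only one side were taught about the new flag, every field after it would be read at the wrong offset, which is why the patch touches tree-streamer-in.c and tree-streamer-out.c in matching pairs.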
Re: Fix regression introduced by 88069
On Wed, Nov 21, 2018 at 3:23 PM Jeff Law wrote:
>
> On 11/21/18 7:21 AM, Richard Biener wrote:
> > On Wed, Nov 21, 2018 at 3:28 AM Jeff Law wrote:
> >>
> >> Richi's recent change to fix 88069 is causing various targets to fail
> >> tree-ssa/20030711-2.c.  That test is verifying a variety of
> >> optimizations occur during the first DOM pass.
> >>
> >> Prior to Richi's change FRE1 would do some significant cleanups of the
> >> IL and as a result DOM was fully able to optimize the resultant code.
> >
> > Hum... I obviously missed the FAIL during testing somehow.  FRE1
> > behavior shouldn't change so I'll fixup.
> NP.  The change to DOM is probably a good thing independently.  I may go
> back and factor that little hunk into a reusable function as we've got a
> nearly identical copy elsewhere.

I am testing the following.

Richard.

2018-11-21  Richard Biener

        PR tree-optimization/88069
        * tree-ssa-sccvn.c (visit_phi): Tweak previous fix to not apply
        to default defs.

Index: gcc/tree-ssa-sccvn.c
===================================================================
--- gcc/tree-ssa-sccvn.c        (revision 266345)
+++ gcc/tree-ssa-sccvn.c        (working copy)
@@ -4205,6 +4205,7 @@ visit_phi (gimple *phi, bool *inserted,
          given that allows us to escape a region in alias walking.  */
       || (sameval
           && TREE_CODE (sameval) == SSA_NAME
+          && !SSA_NAME_IS_DEFAULT_DEF (sameval)
           && SSA_NAME_IS_VIRTUAL_OPERAND (sameval)
           && (SSA_VAL (sameval, &visited_p), !visited_p)))
     /* Note this just drops to VARYING without inserting the PHI into

> jeff
Re: [PATCH v2 1/3] Allow memory operands for PTWRITE
On Tue, Nov 20, 2018 at 7:36 PM Andi Kleen wrote: > > On Tue, Nov 20, 2018 at 11:53:15AM +0100, Richard Biener wrote: > > On Fri, Nov 16, 2018 at 8:07 AM Uros Bizjak wrote: > > > > > > On Fri, Nov 16, 2018 at 4:57 AM Andi Kleen wrote: > > > > > > > > From: Andi Kleen > > > > > > > > The earlier PTWRITE builtin definition was unnecessarily restrictive, > > > > only allowing register input to PTWRITE. The instruction actually > > > > supports memory operands too, so allow that too. > > > > > > > > gcc/: > > > > > > > > 2018-11-15 Andi Kleen > > > > > > > > * config/i386/i386.md: Allow memory operands to ptwrite. > > > > > > OK. > > > > Btw, I wonder why the ptwrite builtin is in SPECIAL_ARGS2 > > commented as /* Add all special builtins with variable number of operands. > > */? > > i think i put it in the same place as a similar builtin. AFAIK > those others don't have variable arguments either, so the comment > may be wrong? No idea... > > > > On the GIMPLE level this builtin also has quite some (bad) effects on > > alias analysis and any related optimization (vectorization, etc.). I'll > > have > > to see where the instrumenting pass now resides. > > It's fairly late now. OK, saw that. > Any suggestions for improvements? At some point I removed the edges > like the old MPX builtins to minimize memory usage, but that was > removed during an earlier review cycle. I guess it's fine now - it will have an effect on TER, limiting its ability a bit, but otherwise the builtin only lives up to RTL expansion where it becomes the UNSPEC_VOLATILE. As said, instrumenting on RTL would be an improvement, I think HJ might be able to help with that. Richard. > -Andi
Re: [PATCH v2 1/3] Allow memory operands for PTWRITE
On Wed, Nov 21, 2018 at 6:48 AM Richard Biener wrote: > > On Tue, Nov 20, 2018 at 7:36 PM Andi Kleen wrote: > > > > On Tue, Nov 20, 2018 at 11:53:15AM +0100, Richard Biener wrote: > > > On Fri, Nov 16, 2018 at 8:07 AM Uros Bizjak wrote: > > > > > > > > On Fri, Nov 16, 2018 at 4:57 AM Andi Kleen wrote: > > > > > > > > > > From: Andi Kleen > > > > > > > > > > The earlier PTWRITE builtin definition was unnecessarily restrictive, > > > > > only allowing register input to PTWRITE. The instruction actually > > > > > supports memory operands too, so allow that too. > > > > > > > > > > gcc/: > > > > > > > > > > 2018-11-15 Andi Kleen > > > > > > > > > > * config/i386/i386.md: Allow memory operands to ptwrite. > > > > > > > > OK. > > > > > > Btw, I wonder why the ptwrite builtin is in SPECIAL_ARGS2 > > > commented as /* Add all special builtins with variable number of > > > operands. */? > > > > i think i put it in the same place as a similar builtin. AFAIK > > those others don't have variable arguments either, so the comment > > may be wrong? > > No idea... > > > > > > > On the GIMPLE level this builtin also has quite some (bad) effects on > > > alias analysis and any related optimization (vectorization, etc.). I'll > > > have > > > to see where the instrumenting pass now resides. > > > > It's fairly late now. > > OK, saw that. > > > Any suggestions for improvements? At some point I removed the edges > > like the old MPX builtins to minimize memory usage, but that was > > removed during an earlier review cycle. > > I guess it's fine now - it will have an effect on TER, limiting its ability > a bit, but otherwise the builtin only lives up to RTL expansion where > it becomes the UNSPEC_VOLATILE. As said, instrumenting on > RTL would be an improvement, I think HJ might be able to help with that. > What are the issues? -- H.J.
[C++ PATCH] Improve locations of id-expressions and operator "" (PR c++/87386, take 2)
Hi!

On Tue, Nov 20, 2018 at 04:32:26PM -0500, David Malcolm wrote:
> This makes the fix-it hint wrong: after the fix-it is applied, it will
> become
>   return color;
> (which won't compile), rather than
>   return O::color;
> which will.

Here is an updated version of the patch, which still uses the whole range
of the id-expression when it is parsed as primary expression, but does so
not in cp_parser_id_expression, but in cp_parser_primary_expression after
all the diagnostics.  Thus all the spell-checking etc. tests behave as
previously, they underline only the part after the last ::, and just what
uses the expression later on uses whole range.

The remaining needed tweaks in the testcases are minor and look correct to
me, e.g. for D::Bar the column is not at D but at B, similarly for
operator"" _F the column is under _ rather than first o.
The libstdc++ changes are because there are several large expressions like:
something::value
and we used to diagnose on the something line (column of s) but now we warn
on value line (column of v).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2018-11-21  Jakub Jelinek

        PR c++/87386
        * parser.c (cp_parser_primary_expression): Use
        id_expression.get_location () instead of id_expr_token->location.
        Adjust the range from id_expr_token->location to
        id_expression.get_finish ().
        (cp_parser_operator): For operator "" make a range from "" to the
        end of the suffix with caret at the start of the id.
gcc/testsuite/
        * g++.dg/diagnostic/pr87386.C: New test.
        * g++.dg/parse/error17.C: Adjust expected diagnostics.
        * g++.dg/cpp0x/pr51420.C: Likewise.
        * g++.dg/cpp0x/udlit-declare-neg.C: Likewise.
        * g++.dg/cpp0x/udlit-member-neg.C: Likewise.
libstdc++-v3/
        * testsuite/20_util/scoped_allocator/69293_neg.cc: Adjust expected
        line.
        * testsuite/20_util/uses_allocator/cons_neg.cc: Likewise.
        * testsuite/20_util/uses_allocator/69293_neg.cc: Likewise.
        * testsuite/experimental/propagate_const/requirements2.cc: Likewise.
* testsuite/experimental/propagate_const/requirements3.cc: Likewise. * testsuite/experimental/propagate_const/requirements4.cc: Likewise. * testsuite/experimental/propagate_const/requirements5.cc: Likewise. --- gcc/cp/parser.c.jj 2018-11-21 11:35:43.698053550 +0100 +++ gcc/cp/parser.c 2018-11-21 12:23:20.701047164 +0100 @@ -5604,7 +5604,7 @@ cp_parser_primary_expression (cp_parser /*is_namespace=*/false, /*check_dependency=*/true, &ambiguous_decls, - id_expr_token->location); + id_expression.get_location ()); /* If the lookup was ambiguous, an error will already have been issued. */ if (ambiguous_decls) @@ -5675,7 +5675,7 @@ cp_parser_primary_expression (cp_parser if (parser->local_variables_forbidden_p && local_variable_p (decl)) { - error_at (id_expr_token->location, + error_at (id_expression.get_location (), "local variable %qD may not appear in this context", decl.get_value ()); return error_mark_node; @@ -5694,7 +5694,8 @@ cp_parser_primary_expression (cp_parser id_expression.get_location ())); if (error_msg) cp_parser_error (parser, error_msg); - decl.set_location (id_expr_token->location); + decl.set_location (id_expression.get_location ()); + decl.set_range (id_expr_token->location, id_expression.get_finish ()); return decl; } @@ -15051,7 +15052,7 @@ cp_literal_operator_id (const char* name static cp_expr cp_parser_operator (cp_parser* parser) { - tree id = NULL_TREE; + cp_expr id = NULL_TREE; cp_token *token; bool utf8 = false; @@ -15339,8 +15340,9 @@ cp_parser_operator (cp_parser* parser) if (id != error_mark_node) { const char *name = IDENTIFIER_POINTER (id); - id = cp_literal_operator_id (name); + *id = cp_literal_operator_id (name); } + id.set_range (start_loc, id.get_finish ()); return id; } @@ -15364,7 +15366,8 @@ cp_parser_operator (cp_parser* parser) id = error_mark_node; } - return cp_expr (id, start_loc); + id.set_location (start_loc); + return id; } /* Parse a template-declaration. 
--- gcc/testsuite/g++.dg/diagnostic/pr87386.C.jj2018-11-21 14:40:58.377769686 +0100 +++ gcc/testsuite/g++.dg/diagnostic/pr87386.C 2018-11-21 14:40:19.064410070 +0100 @@ -0,0 +1,18 @@ +// PR c++/87386 +// { dg-do compile { target c++11 } } +// { dg-options "-fdiagnostics-show-caret" } + +namespace foo { + template struct test
[C++ PATCH] Remove useless tokens from cp_parser_linkage_specification (PR c++/87393)
Hi!

David's r251026 change added a weird trailing ->location.
It doesn't seem to be useful for anything, matching_braces has its own code
to track locations, so no need to do anything in the caller (and no other
spot does something like that).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2018-11-21  Jakub Jelinek

        PR c++/87393
        * parser.c (cp_parser_linkage_specification): Remove useless
        dereference of the consume_open method result.

--- gcc/cp/parser.c.jj  2018-11-21 08:58:56.190250827 +0100
+++ gcc/cp/parser.c     2018-11-21 10:02:40.690687576 +0100
@@ -14223,7 +14223,7 @@ cp_parser_linkage_specification (cp_pars
 
   /* Consume the `{' token.  */
   matching_braces braces;
-  braces.consume_open (parser)->location;
+  braces.consume_open (parser);
   /* Parse the declarations.  */
   cp_parser_declaration_seq_opt (parser);
   /* Look for the closing `}'.  */

        Jakub
Re: [PATCH, ARM, ping3] PR85434: Prevent spilling of stack protector guard's address on ARM
Hi Thomas, Sorry for the delay. On 16/11/18 14:56, Thomas Preudhomme wrote: Ping? Best regards, Thomas On Sat, 10 Nov 2018 at 15:07, Thomas Preudhomme wrote: Thanks Kyrill. Updated patch in attachment. Best regards, Thomas On Thu, 8 Nov 2018 at 15:53, Kyrill Tkachov wrote: Hi Thomas, On 08/11/18 09:52, Thomas Preudhomme wrote: Ping? Best regards, Thomas On Thu, 1 Nov 2018 at 16:03, Thomas Preudhomme wrote: Ping? Best regards, Thomas On Fri, 26 Oct 2018 at 22:41, Thomas Preudhomme wrote: Hi, Please find updated patch to fix PR85434: spilling of stack protector guard's address on ARM. Quite a few changes have been made to the ARM part since last round of review so I think it makes more sense to review it anew. Ran bootstrap + regression testsuite + glibc build + glibc regression testsuite for Arm and Thumb-2 and bootstrap + regression testsuite for Thumb-1. GCC's regression testsuite was run in 3 configurations in all those cases: - default configuration (no RUNTESTFLAGS) - with -fstack-protector-all - with -fPIC -fstack-protector-all (to exercise both codepath in stack protector's split code) None of this show any regression beyond some new scan fail with -fstack-protector-all or -fPIC due to unexpected code sequence for the testcases concerned and some guality swing due to less optimization with new stack protector on. Patch description and ChangeLog below. In case of high register pressure in PIC mode, address of the stack protector's guard can be spilled on ARM targets as shown in PR85434, thus allowing an attacker to control what the canary would be compared against. ARM does lack stack_protect_set and stack_protect_test insn patterns, defining them does not help as the address is expanded regularly and the patterns only deal with the copy and test of the guard with the canary. This problem does not occur for x86 targets because the PIC access and the test can be done in the same instruction. 
Aarch64 is exempt too because PIC access insn pattern are mov of UNSPEC which prevents it from the second access in the epilogue being CSEd in cse_local pass with the first access in the prologue. The approach followed here is to create new "combined" set and test standard pattern names that take the unexpanded guard and do the set or test. This allows the target to use an opaque pattern (eg. using UNSPEC) to hide the individual instructions being generated to the compiler and split the pattern into generic load, compare and branch instruction after register allocator, therefore avoiding any spilling. This is here implemented for the ARM targets. For targets not implementing these new standard pattern names, the existing stack_protect_set and stack_protect_test pattern names are used. To be able to split PIC access after register allocation, the functions had to be augmented to force a new PIC register load and to control which register it loads into. This is because sharing the PIC register between prologue and epilogue could lead to spilling due to CSE again which an attacker could use to control what the canary gets compared against. ChangeLog entries are as follows: *** gcc/ChangeLog *** 2018-10-26 Thomas Preud'homme * target-insns.def (stack_protect_combined_set): Define new standard pattern name. (stack_protect_combined_test): Likewise. * cfgexpand.c (stack_protect_prologue): Try new stack_protect_combined_set pattern first. * function.c (stack_protect_epilogue): Try new stack_protect_combined_test pattern first. * config/arm/arm.c (require_pic_register): Add pic_reg and compute_now parameters to control which register to use as PIC register and force reloading PIC register respectively. Insert in the stream of insns if possible. (legitimize_pic_address): Expose above new parameters in prototype and adapt recursive calls accordingly. Use pic_reg if non null instead of cached one. (arm_load_pic_register): Add pic_reg parameter and use it if non null. 
(arm_legitimize_address): Adapt to new legitimize_pic_address prototype. (thumb_legitimize_address): Likewise. (arm_emit_call_insn): Adapt to require_pic_register prototype change. (arm_expand_prologue): Adapt to arm_load_pic_register prototype change. (thumb1_expand_prologue): Likewise. * config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype change. (arm_load_pic_register): Likewise. * config/arm/predicated.md (guard_addr_operand): New predicate. (guard_operand): New predicate. * config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address prototype change. (builtin_setjmp_receiver expander): Adapt to thumb1_expand_prologue prototype change. (stack_protect_combined_set): New expander.. (stack_protect_combined_set_insn): New insn_and_split pattern. (stack_protect_set_insn): New insn pattern. (stack_protect_combined_test): New expander. (stack_protect_combined_test_insn): New insn_and_split pattern. (arm_stack_protect_test_insn): New insn pattern. * config/arm/thumb1.md (thumb1_stack_protect_test_insn):
Re: Patch ping (was Re: [PATCH] Fix aarch64_compare_and_swap* constraints (PR target/87839))
On Tue, Nov 20, 2018 at 11:04:46AM -0600, Jakub Jelinek wrote:
> Hi!
>
> On Tue, Nov 13, 2018 at 10:28:16AM +0100, Jakub Jelinek wrote:
> > 2018-11-13  Jakub Jelinek
> >
> >     PR target/87839
> >     * config/aarch64/atomics.md (@aarch64_compare_and_swap): Use
> >     rIJ constraint for aarch64_plus_operand rather than rn.
> >
> >     * gcc.target/aarch64/pr87839.c: New test.
>
> I'd like to ping this patch, Kyrill had kindly tested it, ok for trunk?

OK.

Thanks,
James
Re: [C++ PATCH] Remove useless tokens from cp_parser_linkage_specification (PR c++/87393)
On Wed, 2018-11-21 at 16:59 +0100, Jakub Jelinek wrote:
> Hi!
>
> David's r251026 change added a weird trailing ->location.
> It doesn't seem to be useful for anything, matching_braces has its own code
> to track locations, so no need to do anything in the caller (and no other
> spot does something like that).
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2018-11-21  Jakub Jelinek
>
>         PR c++/87393
>         * parser.c (cp_parser_linkage_specification): Remove useless
>         dereference of the consume_open method result.
>
> --- gcc/cp/parser.c.jj  2018-11-21 08:58:56.190250827 +0100
> +++ gcc/cp/parser.c     2018-11-21 10:02:40.690687576 +0100
> @@ -14223,7 +14223,7 @@ cp_parser_linkage_specification (cp_pars
>
>    /* Consume the `{' token.  */
>    matching_braces braces;
> -  braces.consume_open (parser)->location;
> +  braces.consume_open (parser);
>    /* Parse the declarations.  */
>    cp_parser_declaration_seq_opt (parser);
>    /* Look for the closing `}'.  */

Oops; looks like a stray edit by me.  Thanks for catching this; OK to
remove it.

Dave
Re: [PATCH 1/3][GCC] Add new target hook asm_post_cfi_startproc
On 11/2/18 6:07 PM, Sam Tebbs wrote:
> On 11/02/2018 05:28 PM, Sam Tebbs wrote:
>
>> Hi all,
>>
>> This patch adds a new target hook called "asm_post_cfi_startproc".  This
>> hook is intended to be used by the aarch64 backend to emit a directive
>> that enables support for unwinding frames signed with the pointer
>> authentication B-key.  This hook is triggered after the ".cfi_startproc"
>> directive is emitted in gcc/dwarf2out.c.
>>
>> Bootstrapped on aarch64-none-linux-gnu and tested on aarch64-none-elf with
>> no regressions.
>>
>> Ok for trunk?
>>
>> gcc/
>> 2018-11-02  Sam Tebbs
>>
>>         * doc/tm.texi (TARGET_ASM_POST_CFI_STARTPROC): Define.
>>         * doc/tm.texi.in (TARGET_ASM_POST_CFI_STARTPROC): Define.
>>         * dwarf2out.c (dwarf2out_do_cfi_startproc): Trigger the hook.
>>         * hooks.c (hook_void_FILEptr_tree): Define.
>>         * hooks.h (hook_void_FILEptr_tree): Define.
>>         * target.def (post_cfi_startproc): Define.
> CCing global reviewers and dwarf maintainers.
>

ping
Re: [PATCH v3] [aarch64] Add CPU support for Ampere Computing's eMAG.
On 20/11/2018 18:00, Christoph Muellner wrote: > Tested with "make check" and no regressions found. > > This patch depends on the struct xgene1_prefetch_tune, > which has been acknowledged already: > https://gcc.gnu.org/ml/gcc-patches/2018-11/msg00985.html > > *** gcc/ChangeLog *** > > 2018-xx-xx Christoph Muellner > > * config/aarch64/aarch64-cores.def: Define emag. > * config/aarch64/aarch64-tune.md: Regenerated with emag. > * config/aarch64/aarch64.c (emag_tunings): New struct. > * doc/invoke.texi: Document mtune value. OK. R. > > Signed-off-by: Christoph Muellner > --- > gcc/config/aarch64/aarch64-cores.def | 3 +++ > gcc/config/aarch64/aarch64-tune.md | 2 +- > gcc/config/aarch64/aarch64.c | 25 + > gcc/doc/invoke.texi | 2 +- > 4 files changed, 30 insertions(+), 2 deletions(-) > > diff --git a/gcc/config/aarch64/aarch64-cores.def > b/gcc/config/aarch64/aarch64-cores.def > index 1f3ac56..68cca00 100644 > --- a/gcc/config/aarch64/aarch64-cores.def > +++ b/gcc/config/aarch64/aarch64-cores.def > @@ -61,6 +61,9 @@ AARCH64_CORE("thunderxt88", thunderxt88, thunderx, 8A, > AARCH64_FL_FOR_ARCH > AARCH64_CORE("thunderxt81", thunderxt81, thunderx, 8A, > AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx, 0x43, > 0x0a2, -1) > AARCH64_CORE("thunderxt83", thunderxt83, thunderx, 8A, > AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx, 0x43, > 0x0a3, -1) > > +/* Ampere Computing cores. */ > +AARCH64_CORE("emag",emag, xgene1,8A, AARCH64_FL_FOR_ARCH8 > | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, emag, 0x50, 0x000, 3) > + > /* APM ('P') cores. 
*/ > AARCH64_CORE("xgene1", xgene1,xgene1,8A, AARCH64_FL_FOR_ARCH8, > xgene1, 0x50, 0x000, -1) > > diff --git a/gcc/config/aarch64/aarch64-tune.md > b/gcc/config/aarch64/aarch64-tune.md > index fade1d4..2fc7f03 100644 > --- a/gcc/config/aarch64/aarch64-tune.md > +++ b/gcc/config/aarch64/aarch64-tune.md > @@ -1,5 +1,5 @@ > ;; -*- buffer-read-only: t -*- > ;; Generated automatically by gentune.sh from aarch64-cores.def > (define_attr "tune" > - > "cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,ares,tsv110,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55" > + > "cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,emag,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,ares,tsv110,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55" > (const (symbol_ref "((enum attr_tune) aarch64_tune)"))) > diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c > index f7f88a9..995aafe 100644 > --- a/gcc/config/aarch64/aarch64.c > +++ b/gcc/config/aarch64/aarch64.c > @@ -957,6 +957,31 @@ static const struct tune_params xgene1_tunings = >&xgene1_prefetch_tune > }; > > +static const struct tune_params emag_tunings = > +{ > + &xgene1_extra_costs, > + &xgene1_addrcost_table, > + &xgene1_regmove_cost, > + &xgene1_vector_cost, > + &generic_branch_cost, > + &xgene1_approx_modes, > + 6, /* memmov_cost */ > + 4, /* issue_rate */ > + AARCH64_FUSE_NOTHING, /* fusible_ops */ > + "16", /* function_align. */ > + "16", /* jump_align. */ > + "16", /* loop_align. */ > + 2, /* int_reassoc_width. */ > + 4, /* fp_reassoc_width. */ > + 1, /* vec_reassoc_width. 
*/ > + 2, /* min_div_recip_mul_sf. */ > + 2, /* min_div_recip_mul_df. */ > + 17,/* max_case_values. */ > + tune_params::AUTOPREFETCHER_OFF, /* autoprefetcher_model. */ > + (AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS), /* tune_flags. */ > + &xgene1_prefetch_tune > +}; > + > static const struct tune_params qdf24xx_tunings = > { >&qdf24xx_extra_costs, > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi > index e016dce..ac81fb2 100644 > --- a/gcc/doc/invoke.texi > +++ b/gcc/doc/invoke.texi > @@ -15288,7 +15288,7 @@ Specify the name of the target processor for which > GCC should tune the > performance of the code. Permissible values for this option are: > @samp{generic}, @samp{cortex-a35}, @samp{cortex-a53}, @samp{cortex-a55}, > @samp{cortex-a57}, @samp{cortex-a72}, @samp{cortex-a73}, @samp{cortex-a75}, > -@samp{cortex-a76}, @samp{ares}, @samp{exynos-m1}, @samp{falkor}, > +@samp{cortex-a76}, @samp{ares}, @samp{exynos-m1}, @samp{emag}, @samp{falkor}, > @samp{qdf24xx}, @samp{saphira}, @samp{phecda}, @samp{xgene1}, @samp{vulcan}, > @samp{thunderx}, @samp{thunderxt88}, @samp{thunderxt88p1}, > @samp{thunderxt81}, > @samp{tsv110}, @samp{thunderxt83}, @samp{thund
Re: [PATCH] handle unusual targets in -Wbuiltin-declaration-mismatch (PR 88098)
On 11/21/18 6:08 AM, Rainer Orth wrote: Hi Martin, By calling builtin_decl_explicit rather than builtin_decl_implicit the updated patch in the attachment avoids test failures due to missing warnings on targets with support for long double but whose libc doesn't support C99 functions like fabsl (such as apparently aarch64-linux). [...] gcc/testsuite/ChangeLog: PR testsuite/88098 * gcc.dg/Wbuiltin-declaration-mismatch-4.c: Adjust. * gcc.dg/Wbuiltin-declaration-mismatch-5.c: New test. is the Wbuiltin-declaration-mismatch-5.c testcase still supposed to be part of the patch? It's in the ChangeLog, but missing from the revised patch. It should still be there. I must have excluded it by accident. I will make sure to include it in the commit. Thanks for pointing it out! Martin
Re: [PATCH] Fix up 20181120-1.c testcase on big-endian (PR rtl-optimization/85925)
Hi, On Wed, Nov 21, 2018 at 02:13:55PM +0100, Jakub Jelinek wrote: > As mentioned in the PR, the testcase fails on big-endian targets. > The following patch tweaks it so that it does not fail there and still > checks for the original bug. It relies on a certain bitfield layout, not just on LE. I think the testcase should run only on those specific targets where it works. I don't see how this patch would fix the problem for BE, btw. Segher
Re: [PATCH] Fix up 20181120-1.c testcase on big-endian (PR rtl-optimization/85925)
On Wed, Nov 21, 2018 at 11:23:38AM -0600, Segher Boessenkool wrote: > Hi, > > On Wed, Nov 21, 2018 at 02:13:55PM +0100, Jakub Jelinek wrote: > > As mentioned in the PR, the testcase fails on big-endian targets. > > The following patch tweaks it so that it does not fail there and still > > checks for the original bug. > > It relies on a certain bitfield layout, not just on LE. I think the > testcase should run only on those specific targets where it works. I don't > see how this patch would fix the problem for BE, btw. With the patch, it doesn't rely on anything, it compares if what you get at runtime from the code combiner would optimize is equal to what is read from a volatile union. Admittedly, it might be better if the initializer was 0x1010101 or say 0x4030201 because on big endian in particular 0x10101 has the top 15 bits all zero and thus that is what is in u.f1, so if the bug can be reproduced with the combine.c + rtlanal.c fix reverted with 0x4030201, it would be better to use that value (in both spots). Jakub
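The point about the initializer can be checked with a little arithmetic. The sketch below is a hypothetical model, not the testcase: it assumes a 32-bit unsigned and assumes the 15-bit field is allocated either from the least significant bit (as on typical little-endian ABIs) or from the most significant bit (as on typical big-endian ABIs). Bit-field layout is implementation-defined, so a real target may differ; the function names are invented for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of "unsigned f1 : 15" overlaid on a 32-bit unsigned.
// Assumption: the field occupies the low 15 bits of the word.
constexpr std::uint32_t f1_low_bits_first (std::uint32_t v)
{
  return v & 0x7fffu;
}

// Assumption: the field occupies the top 15 bits of the word.
constexpr std::uint32_t f1_high_bits_first (std::uint32_t v)
{
  return v >> (32 - 15);
}

// With 0x10101 the high-bits-first field is zero, so a comparison against
// it cannot distinguish a correct result from a wrongly zeroed one.
static_assert (f1_low_bits_first (0x10101u) == 0x101u, "low 15 bits");
static_assert (f1_high_bits_first (0x10101u) == 0u, "top 15 bits are zero");

// Both proposed replacements leave a non-zero field in either model.
static_assert (f1_high_bits_first (0x1010101u) != 0u, "0x1010101");
static_assert (f1_high_bits_first (0x4030201u) != 0u, "0x4030201");

// Run-time self-check of the same facts.
inline void check_bitfield_model ()
{
  assert (f1_high_bits_first (0x10101u) == 0u);
  assert (f1_low_bits_first (0x4030201u) != 0u);
}
```

Under these assumptions a value such as 0x4030201 gives the test something non-trivial to compare on both kinds of target.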
Re: [PATCH, ARM, ping3] PR85434: Prevent spilling of stack protector guard's address on ARM
On Fri, Nov 16, 2018 at 02:56:46PM +, Thomas Preudhomme wrote: > In case of high register pressure in PIC mode, address of the stack > protector's guard can be spilled on ARM targets as shown in PR85434, > thus allowing an attacker to control what the canary would be compared > against. ARM does lack stack_protect_set and stack_protect_test insn > patterns, defining them does not help as the address is expanded > regularly and the patterns only deal with the copy and test of the > guard with the canary. > > This problem does not occur for x86 targets because the PIC access and > the test can be done in the same instruction. Aarch64 is exempt too > because PIC access insn pattern are mov of UNSPEC which prevents it from > the second access in the epilogue being CSEd in cse_local pass with the > first access in the prologue. The unspecs are not CSEd because they are *different* unspecs (UNSPEC_SP_SET vs. UNSPEC_SP_TEST; they have different args too, different number of args even). Two the same unspecs can be CSEd just fine. Segher
Re: Stream TREE_TYPE of TYPE_DECLs again
> > OK if you put a comment ... I have added comments to both free_lang_data noting that some fields are freed late, and a comment to the new freeing pass. While testing I noticed a stupid bug in need_assembler_name_p: when a TYPE_DECL does not satisfy the elaborate conditional for its type to be ODR, it falls through to "return true" rather than returning false. Fixing that uncovered a bug in the -fno-odr-type-merging path of ipa-devirt, where the vtable hash was no longer initialized. Fixed thus. lto-bootstrapped/regtested x86_64-linux, committed. PR lto/87957 * tree.c (fld_decl_context): Break out from ... (free_lang_data_in_decl): ... here; free TREE_PUBLIC, TREE_PRIVATE DECL_ARTIFICIAL of TYPE_DECL; do not free TREE_TYPE of TYPE_DECL. (fld_incomplete_type_of): Build copy of TYPE_DECL. * ipa-devirt.c (free_enum_values): Rename to ... (free_odr_warning_data): ... this one; free also duplicated TYPE_DECLs and TREE_TYPEs of TYPE_DECLs. (get_odr_type): Initialize odr_vtable_hash if needed. Index: ipa-devirt.c === --- ipa-devirt.c(revision 266334) +++ ipa-devirt.c(working copy) @@ -2025,6 +2025,8 @@ get_odr_type (tree type, bool insert) if ((!slot || !*slot) && in_lto_p && can_be_vtable_hashed_p (type)) { hash = hash_odr_vtable (type); + if (!odr_vtable_hash) +odr_vtable_hash = new odr_vtable_hash_type (23); vtable_slot = odr_vtable_hash->find_slot_with_hash (type, hash, insert ? INSERT : NO_INSERT); } @@ -2289,27 +2291,43 @@ dump_type_inheritance_graph (FILE *f) "%i duplicates overall\n", num_all_types, num_types, num_duplicates); } -/* Save some WPA->ltrans streaming by freeing enum values. */ +/* Save some WPA->ltrans streaming by freeing stuff needed only for good + ODR warnings. + We free TYPE_VALUES of enums and also make TYPE_DECLs to not point back + to the type (which is needed to keep them in the same SCC and preserve + location information to output warnings) and subsequently we make all + TYPE_DECLS of same assembler name equivalent. 
*/ static void -free_enum_values () +free_odr_warning_data () { - static bool enum_values_freed = false; - if (enum_values_freed || !flag_wpa || !odr_types_ptr) + static bool odr_data_freed = false; + + if (odr_data_freed || !flag_wpa || !odr_types_ptr) return; - enum_values_freed = true; - unsigned int i; - for (i = 0; i < odr_types.length (); i++) + + odr_data_freed = true; + + for (unsigned int i = 0; i < odr_types.length (); i++) if (odr_types[i]) { - if (TREE_CODE (odr_types[i]->type) == ENUMERAL_TYPE) - TYPE_VALUES (odr_types[i]->type) = NULL; + tree t = odr_types[i]->type; + + if (TREE_CODE (t) == ENUMERAL_TYPE) + TYPE_VALUES (t) = NULL; + TREE_TYPE (TYPE_NAME (t)) = void_type_node; + if (odr_types[i]->types) for (unsigned int j = 0; j < odr_types[i]->types->length (); j++) - if (TREE_CODE ((*odr_types[i]->types)[j]) == ENUMERAL_TYPE) - TYPE_VALUES ((*odr_types[i]->types)[j]) = NULL; + { + tree td = (*odr_types[i]->types)[j]; + + if (TREE_CODE (td) == ENUMERAL_TYPE) + TYPE_VALUES (td) = NULL; + TYPE_NAME (td) = TYPE_NAME (t); + } } - enum_values_freed = true; + odr_data_freed = true; } /* Initialize IPA devirt and build inheritance tree graph. */ @@ -2323,7 +2341,7 @@ build_type_inheritance_graph (void) if (odr_hash) { - free_enum_values (); + free_odr_warning_data (); return; } timevar_push (TV_IPA_INHERITANCE); @@ -2370,7 +2388,7 @@ build_type_inheritance_graph (void) dump_type_inheritance_graph (inheritance_dump_file); dump_end (TDI_inheritance, inheritance_dump_file); } - free_enum_values (); + free_odr_warning_data (); timevar_pop (TV_IPA_INHERITANCE); } Index: tree.c === --- tree.c (revision 266325) +++ tree.c (working copy) @@ -5206,6 +5206,24 @@ fld_process_array_type (tree t, tree t2, return array; } +/* Return CTX after removal of contexts that are not relevant */ + +static tree +fld_decl_context (tree ctx) +{ + /* Variably modified types are needed for tree_is_indexable to decide + whether the type needs to go to local or global section. 
+ This code is semi-broken but for now it is easiest to keep contexts + as expected. */ + if (ctx && TYPE_P (ctx) + && !variably_modified_type_p (ctx, NULL_TREE)) + { + while (ctx && TYPE_P (ctx)) +ctx = TYPE_CONTEXT (ctx); + } + return ctx; +} + /* For T being aggregate type try to turn it into a incomplete variant. Return T if no simplification is possible. */ @@ -5267,6 +5285,28 @@ f
Re: [PATCH] Fix up 20181120-1.c testcase on big-endian (PR rtl-optimization/85925)
On Wed, Nov 21, 2018 at 06:31:43PM +0100, Jakub Jelinek wrote: > > > As mentioned in the PR, the testcase fails on big-endian targets. > > > The following patch tweaks it so that it does not fail there and still > > > checks for the original bug. > > > > It relies on a certain bitfield layout, not just on LE. I think the > > testcase should run only on those specific targets where it works. I don't > > see how this patch would fix the problem for BE, btw. > > With the patch, it doesn't rely on anything, it compares if what you get at > runtime from the code combiner would optimize is equal to what is read from > a volatile union. Oh, I think I misread it, sorry :-) > Admittedly, it might be better if the initializer was 0x1010101 or say > 0x4030201 because on big endian in particular 0x10101 has the top 15 bits > all zero and thus that is what is in u.f1, so if the bug can be reproduced > with the combine.c + rtlanal.c fix reverted with 0x4030201, it would be > better to use that value (in both spots). Yeah good point. Segher
Re: [C++ PATCH] Improve locations of id-expressions and operator "" (PR c++/87386, take 2)
On 11/21/18 10:55 AM, Jakub Jelinek wrote: Hi! On Tue, Nov 20, 2018 at 04:32:26PM -0500, David Malcolm wrote: This makes the fix-it hint wrong: after the fix-it is applied, it will become return color; (which won't compile), rather than return O::color; which will. Here is an updated version of the patch, which still uses the whole range of the id-expression when it is parsed as primary expression, but does so not in cp_parser_id_expression, but in cp_parser_primary_expression after all the diagnostics. Thus all the spell-checking etc. tests behave as previously, they underline only the part after the last ::, and just what uses the expression later on uses whole range. The remaining needed tweaks in the testcases are minor and look correct to me, e.g. for D::Bar the column is not at D but at B, Sounds good. similarly for operator"" _F the column is under _ rather than first o. I disagree with this one: the name of the declaration is operator""_F, so I think the caret should go at the first o. The libstdc++ changes are because there are several large expressions like: something::value and we used to diagnose on the something line (column of s) but now we warn on value line (column of v). Makes sense. Jason
Re: [C++ PATCH] Remove useless tokens from cp_parser_linkage_specification (PR c++/87393)
On 11/21/18 10:59 AM, Jakub Jelinek wrote: Hi! David's r251026 change added a weird trailing ->location. It doesn't seem to be useful for anything, matching_braces has its own code to track locations, so no need to do anything in the caller (and no other spot does something like that). Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? OK. Jason
[PATCH] libstdc++/88111 and libstdc++/88113 fix src/c++17/memory_resource.cc for 16-bit targets
Two patches to fix the build on msp430-elf which has 16-bit or 20-bit pointers. The patch for 88111 also affects other targets, by changing the default values that are used when pool_options members are zero. The new default values depend on the number of bits in size_t. Bootstrapped on msp430-elf, tested on powerpc64le-linux. commit b5ba0a7b875c3524d447452531416eabf218e6e9 Author: Jonathan Wakely Date: Wed Nov 21 18:16:45 2018 + PR libstdc++/88111 Make maximum block size depend on size_t width PR libstdc++/88111 * include/std/memory_resource (pool_options): Add Doxygen comments. * src/c++17/memory_resource.cc (pool_sizes): Only use suitable values on targets with 16-bit or 20-bit size_t type. (munge_options): Make default values depend on width of size_t type. diff --git a/libstdc++-v3/include/std/memory_resource b/libstdc++-v3/include/std/memory_resource index 87ad25d60f3..e9a46a3b455 100644 --- a/libstdc++-v3/include/std/memory_resource +++ b/libstdc++-v3/include/std/memory_resource @@ -299,13 +299,25 @@ namespace pmr { return !(__a == __b); } + /// Parameters for tuning a pool resource's behaviour. struct pool_options { +/** @brief Upper limit on number of blocks in a chunk. + * + * A lower value prevents allocating huge chunks that could remain mostly + * unused, but means pools will need to replenished more frequently. + */ size_t max_blocks_per_chunk = 0; + +/* @brief Largest block size (in bytes) that should be served from pools. + * + * Larger allocations will be served directly by the upstream resource, + * not from one of the pools managed by the pool resource. + */ size_t largest_required_pool_block = 0; }; - // Common implementation details for unsynchronized/synchronized pool resources. + // Common implementation details for un-/synchronized pool resources. 
class __pool_resource { friend class synchronized_pool_resource; diff --git a/libstdc++-v3/src/c++17/memory_resource.cc b/libstdc++-v3/src/c++17/memory_resource.cc index 6198e6b68ca..929df93233c 100644 --- a/libstdc++-v3/src/c++17/memory_resource.cc +++ b/libstdc++-v3/src/c++17/memory_resource.cc @@ -825,10 +825,15 @@ namespace pmr 128, 192, 256, 320, 384, 448, 512, 768, +#if __SIZE_WIDTH__ > 16 1024, 1536, 2048, 3072, - 1<<12, 1<<13, 1<<14, 1<<15, 1<<16, 1<<17, +#if __SIZE_WIDTH__ > 20 + 1<<12, 1<<13, 1<<14, + 1<<15, 1<<16, 1<<17, 1<<20, 1<<21, 1<<22 // 4MB should be enough for anybody +#endif +#endif }; pool_options @@ -839,10 +844,13 @@ namespace pmr // replaced with implementation-defined defaults, and sizes may be // rounded to unspecified granularity. -// Absolute maximum. Each pool might have a smaller maximum. +// max_blocks_per_chunk sets the absolute maximum for the pool resource. +// Each pool might have a smaller maximum, because pools for very large +// objects might impose smaller limit. if (opts.max_blocks_per_chunk == 0) { - opts.max_blocks_per_chunk = 1024 * 10; // TODO a good default? + // Pick a default that depends on the number of bits in size_t. + opts.max_blocks_per_chunk = __SIZE_WIDTH__ << 8; } else { @@ -854,10 +862,15 @@ namespace pmr opts.max_blocks_per_chunk = chunk::max_blocks_per_chunk(); } -// Absolute minimum. Likely to be much larger in practice. +// largest_required_pool_block specifies the largest block size that will +// be allocated from a pool. Larger allocations will come directly from +// the upstream resource and so will not be pooled. if (opts.largest_required_pool_block == 0) { - opts.largest_required_pool_block = 4096; // TODO a good default? + // Pick a sensible default that depends on the number of bits in size_t + // (pools with larger block sizes must be explicitly requested by + // using a non-zero value for largest_required_pool_block). 
+ opts.largest_required_pool_block = __SIZE_WIDTH__ << 6; } else { commit 14974318adc5e9d56e827cdfa39207e7c7be9e6d Author: Jonathan Wakely Date: Wed Nov 21 17:39:51 2018 + PR libstdc++/88113 use size_type consistently instead of size_t On 16-bit msp430-elf size_t is either 16 bits or 20 bits, and so can't represent all values of the uint32_t type used for bitset::size_type. Using the smaller of size_t and uint32_t for size_type ensures it fits in size_t. PR libstdc++/88113 * src/c++17/memory_resource.cc (bitset::size_type): Use the smaller of uint32_t and size_t. (bitset::size(), bitset::free(), bitset::update_next_word()) (bitset::max_blocks_per_chunk(), bitset::max_word_index()): Use size_type consistently instead of size_t. (chunk): Adjust static_assert checking sizeof(chunk). diff --git a/
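To see what the new defaults in the 88111 patch work out to, here is a minimal sketch that evaluates the two shift expressions from the diff for several size_t widths. The parameter `size_width` stands in for `__SIZE_WIDTH__`, and the function names are invented for this example; they are not part of libstdc++.

```cpp
#include <cassert>

// Default for pool_options::max_blocks_per_chunk when the member is zero,
// following "__SIZE_WIDTH__ << 8" in the patch.
constexpr unsigned long default_max_blocks_per_chunk (unsigned size_width)
{
  return static_cast<unsigned long> (size_width) << 8;
}

// Default for pool_options::largest_required_pool_block when the member is
// zero, following "__SIZE_WIDTH__ << 6" in the patch.
constexpr unsigned long default_largest_required_pool_block (unsigned size_width)
{
  return static_cast<unsigned long> (size_width) << 6;
}

// 16-bit and 20-bit size_t (the msp430-elf cases from the mail).
static_assert (default_max_blocks_per_chunk (16) == 4096, "16-bit");
static_assert (default_largest_required_pool_block (16) == 1024, "16-bit");
static_assert (default_largest_required_pool_block (20) == 1280, "20-bit");

// 64-bit size_t: the block-size default stays at the previous 4096, while
// the chunk limit rises from the previous 1024 * 10 to 16384.
static_assert (default_max_blocks_per_chunk (64) == 16384, "64-bit");
static_assert (default_largest_required_pool_block (64) == 4096, "64-bit");

// Run-time self-check for a 32-bit size_t.
inline void check_pool_defaults ()
{
  assert (default_max_blocks_per_chunk (32) == 8192);
  assert (default_largest_required_pool_block (32) == 2048);
}
```

This only reproduces the arithmetic; the real munge_options also clamps and rounds user-supplied non-zero values, which the sketch does not model.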
Re: [PATCH] C++: show namespaces for enum values (PR c++/88121)
On 11/21/18 8:35 AM, David Malcolm wrote: Consider this test case: namespace json { enum { JSON_OBJECT }; } void test () { JSON_OBJECT; } which erroneously accesses an enum value in another namespace without qualifying the access. GCC 6 through 8 issue a suggestion that doesn't mention the namespace: : In function 'void test()': :8:3: error: 'JSON_OBJECT' was not declared in this scope JSON_OBJECT; ^~~ :8:3: note: suggested alternative: :3:10: note: 'JSON_OBJECT' enum { JSON_OBJECT }; ^~~ which is suboptimal. I made the problem worse with r265610, as gcc 9 now consolidates the single suggestion into the error, and emits: : In function 'void test()': :8:3: error: 'JSON_OBJECT' was not declared in this scope; did you mean 'JSON_OBJECT'? 8 | JSON_OBJECT; | ^~~ | JSON_OBJECT :3:10: note: 'JSON_OBJECT' declared here 3 | enum { JSON_OBJECT }; | ^~~ where the message: 'JSON_OBJECT' was not declared in this scope; did you mean 'JSON_OBJECT'? is nonsensical. The root cause is that dump_scope doesn't print anything when called for CONST_DECL in a namespace: the scope is an ENUMERAL_TYPE, rather than a namespace. Although that's only true for unscoped enums. This patch tweaks dump_scope to detect ENUMERAL_TYPE, and to use the enclosing namespace, so that the CONST_DECL is dumped as "json::JSON_OBJECT". @@ -182,6 +182,12 @@ dump_scope (cxx_pretty_printer *pp, tree scope, int flags) if (scope == NULL_TREE) return; + /* Enum values will be CONST_DECL with an ENUMERAL_TYPE as their + "scope". Use CP_TYPE_CONTEXT of the ENUMERAL_TYPE, so as to + print the enclosing namespace. */ + if (TREE_CODE (scope) == ENUMERAL_TYPE) +scope = CP_TYPE_CONTEXT (scope); This needs to handle scoped enums differently. 
diff --git a/gcc/testsuite/g++.dg/lookup/suggestions-scoped-enums.C b/gcc/testsuite/g++.dg/lookup/suggestions-scoped-enums.C new file mode 100644 index 000..2bf3ed6 --- /dev/null +++ b/gcc/testsuite/g++.dg/lookup/suggestions-scoped-enums.C @@ -0,0 +1,13 @@ +// { dg-do compile { target c++11 } } +// { dg-options "-fdiagnostics-show-caret" } + +enum class vegetable { CARROT, TURNIP }; + +void misspelled_value_in_scoped_enum () +{ + vegetable::TURNUP; // { dg-error "'TURNUP' is not a member of 'vegetable'" } + /* { dg-begin-multiline-output "" } + vegetable::TURNUP; + ^~ + { dg-end-multiline-output "" } */ +} I don't see any suggestion in the expected output, and would hope for it to suggest vegetable::TURNIP. Jason
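The distinction raised in the review can be illustrated with plain C++, independent of the compiler internals. A minimal sketch, using the two declarations from the mail plus two invented accessor functions: an enumerator of an unscoped enum is named through the enclosing namespace, while an enumerator of a scoped enum is named through the enum itself, so skipping the enum when printing the scope would only be right for the first kind.

```cpp
#include <cassert>

namespace json
{
  // Unscoped (and here unnamed) enum: JSON_OBJECT is visible as a member
  // of the namespace, so the qualified name is json::JSON_OBJECT.
  enum { JSON_OBJECT };
}

// Scoped enum: the enumerators live inside the enum, so the qualified name
// is vegetable::TURNIP and the enum name must not be dropped.
enum class vegetable { CARROT, TURNIP };

// Hypothetical helpers, only here so the values can be checked.
inline int unscoped_value ()
{
  return json::JSON_OBJECT;
}

inline int scoped_value ()
{
  return static_cast<int> (vegetable::TURNIP);
}

// Run-time self-check of the enumerator values.
inline void check_enum_scopes ()
{
  assert (unscoped_value () == 0);
  assert (scoped_value () == 1);
}
```

Writing a bare `JSON_OBJECT` or `TURNIP` at global scope fails to compile in both cases, which is the situation the suggestion in the diagnostic is meant to help with.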
Re: [C++ PATCH] Improve locations of id-expressions and operator "" (PR c++/87386, take 2)
On Wed, Nov 21, 2018 at 01:29:15PM -0500, Jason Merrill wrote: > > similarly for operator"" _F the column is under _ rather than first o. > > I disagree with this one: the name of the declaration is operator""_F, so I > think the caret should go at the first o. Right now when cp_parser_operator_function_id is called, it returns locus like: operator new ^~~ operator delete [] ^ operator == ^ operator "" _foo UNKNOWN_LOCATION The last one is because for others we do return cp_expr (id, start_loc); but for operator "" just return id; So, do you suggest we should instead return operator new ^~~~ operator delete [] ^~ operator == ^~~ operator "" _foo ^~~~ ? That would mean cp_parser_operator_function_id would need to pass location_t start_loc (the start of the operator token) to cp_parser_operator and let that create a range in all cases rather than just for operator new/delete. Jakub
[PATCH 1/7][v2][MSP430][TESTSUITE] Tweak dg-directives for msp430-elf
On Wed, 14 Nov 2018 15:41:00 + Jozef Lawrynowicz wrote: > Patch 1 tweaks dg directives in tests specifically for msp430. Many of > these are extensions to existing target selectors in dg directives. Made some modifications to patch 1 based on suggestions. Added int_eq_float and ptr_eq_long effective target procedures. Re-tested on avr, x86_64-pc-linux-gnu and msp430-elf. Ok for trunk? >From 1f31a27ab7cf5b7de0c1cfc7e33a39a66cd61146 Mon Sep 17 00:00:00 2001 From: Jozef Lawrynowicz Date: Thu, 8 Nov 2018 18:55:57 + Subject: [PATCH] [TESTSUITE][MSP430] Tweak dg-directives for msp430-elf 2018-11-21 Jozef Lawrynowicz gcc/testsuite/ChangeLog: * lib/target-supports.exp (check_effective_target_logical_op_short_circuit): Add msp430. (check_effective_target_int_eq_float): New. (check_effective_target_ptr_eq_long): New. * c-c++-common/pr41779.c: Require int_eq_float for dg-warning tests. * c-c++-common/pr57371-2.c: XFAIL optimized dump scan when sizeof (float) != sizeof (int). * gcc.dg/pr84670-4.c: Require ptr_eq_long. * gcc.dg/pr85859.c: Likewise. * gcc.dg/Wno-frame-address.c: Skip for msp430-elf. * gcc.dg/torture/stackalign/builtin-apply-2.c: Likewise. * gcc.dg/ifcvt-4.c: Likewise. * gcc.dg/pr34856.c: Likewise. * gcc.dg/builtin-apply2.c: Likewise. * gcc.dg/tree-ssa/ssa-dse-26.c: Likewise. * gcc.dg/attr-alloc_size-11.c: Remove dg-warning XFAIL for msp430. * gcc.dg/tree-ssa/20040204-1.c: Likewise. * gcc.dg/compat/struct-by-value-16a_x.c: Build at -O1 for msp430 so it fits. * gcc.dg/lto/20091013-1_0.c: Require ptr_eq_long. * gcc.dg/lto/20091013-1_1.c: Remove xfail-if for when sizeof(void *) != sizeof(long). * gcc.dg/lto/20091013-1_2.c: Likewise. * gcc.dg/tree-ssa/loop-1.c: Fix expected dg-final behaviour for msp430. * gcc.dg/tree-ssa/gen-vect-25.c: Likewise. * gcc.dg/tree-ssa/gen-vect-11.c: Likewise. * gcc.dg/tree-ssa/loop-35.c: Likewise. * gcc.dg/tree-ssa/pr23455.c: Likewise. * gcc.dg/weak/typeof-2.c: Likewise. * gcc.target/msp430/interrupt_fn_placement.c: Skip for 430 ISA. 
* gcc.target/msp430/pr78818-data-region.c: Fix scan-assembler text. * gcc.target/msp430/pr79242.c: Don't skip for -msmall. * gcc.target/msp430/special-regs.c: Use "__asm__" instead of "asm". --- gcc/testsuite/c-c++-common/pr41779.c | 6 +++--- gcc/testsuite/c-c++-common/pr57371-2.c | 2 +- gcc/testsuite/gcc.dg/Wno-frame-address.c | 2 +- gcc/testsuite/gcc.dg/attr-alloc_size-11.c | 4 ++-- gcc/testsuite/gcc.dg/builtin-apply2.c | 2 +- .../gcc.dg/compat/struct-by-value-16a_x.c | 2 ++ gcc/testsuite/gcc.dg/ifcvt-4.c | 2 +- gcc/testsuite/gcc.dg/lto/20091013-1_0.c| 1 + gcc/testsuite/gcc.dg/lto/20091013-1_1.c| 1 - gcc/testsuite/gcc.dg/lto/20091013-1_2.c| 1 - gcc/testsuite/gcc.dg/pr34856.c | 1 + gcc/testsuite/gcc.dg/pr84670-4.c | 1 + gcc/testsuite/gcc.dg/pr85859.c | 1 + .../gcc.dg/torture/stackalign/builtin-apply-2.c| 2 +- gcc/testsuite/gcc.dg/tree-ssa/20040204-1.c | 2 +- gcc/testsuite/gcc.dg/tree-ssa/gen-vect-11.c| 2 +- gcc/testsuite/gcc.dg/tree-ssa/gen-vect-25.c| 4 ++-- gcc/testsuite/gcc.dg/tree-ssa/loop-1.c | 4 ++-- gcc/testsuite/gcc.dg/tree-ssa/loop-35.c| 4 ++-- gcc/testsuite/gcc.dg/tree-ssa/pr23455.c| 4 ++-- gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c | 1 + gcc/testsuite/gcc.dg/weak/typeof-2.c | 2 ++ .../gcc.target/msp430/interrupt_fn_placement.c | 1 + .../gcc.target/msp430/pr78818-data-region.c| 3 ++- gcc/testsuite/gcc.target/msp430/pr79242.c | 2 +- gcc/testsuite/gcc.target/msp430/special-regs.c | 8 gcc/testsuite/lib/target-supports.exp | 24 ++ 27 files changed, 61 insertions(+), 28 deletions(-) diff --git a/gcc/testsuite/c-c++-common/pr41779.c b/gcc/testsuite/c-c++-common/pr41779.c index c42a0f5..a80bf78 100644 --- a/gcc/testsuite/c-c++-common/pr41779.c +++ b/gcc/testsuite/c-c++-common/pr41779.c @@ -1,6 +1,6 @@ /* PR41779: Wconversion cannot see through real*integer promotions. 
*/ /* { dg-do compile } */ -/* { dg-skip-if "doubles are floats" { "avr-*-*" } } */ +/* { dg-skip-if "doubles are floats" { avr-*-* } } */ /* { dg-options "-std=c99 -Wconversion" { target c } } */ /* { dg-options "-Wconversion" { target c++ } } */ /* { dg-require-effective-target large_double } */ @@ -27,7 +27,7 @@ float f4(float x, unsigned char y) float f5(float x, int y) { - return x * y; /* { dg-warning "conversion" } */ + return x * y; /* { dg-warning "conversion" "" { target int_eq_float } } */ } double c1(float x, unsigned short y, int z) @@ -52,5 +52,5 @@ double c4(float x, unsigned char y, int z) double c5(float x, int y, int z) { - return z ? x + x : y; /* { dg-warning "conversion" } */ + return z ? x + x : y; /* {
[PATCH] Fix up 20181120-1.c testcase on big-endian (PR rtl-optimization/85925, take 2)
Hi! On Wed, Nov 21, 2018 at 12:07:51PM -0600, Segher Boessenkool wrote: > > Admittedly, it might be better if the initializer was 0x1010101 or say > > 0x4030201 because on big endian in particular 0x10101 has the top 15 bits > > all zero and thus that is what is in u.f1, so if the bug can be reproduced > > with the combine.c + rtlanal.c fix reverted with 0x4030201, it would be > > better to use that value (in both spots). > > Yeah good point. I've now managed to test this with a cross to armv7hl (scped to an arm box) with and without the rtlanal.c + combine.c change reverted and on powerpc64-linux as example of big-endian, on armv7hl it still fails with the changes reverted, otherwise it succeeds on both. The test also needs 32-bit int target (previously just 17-bit or more, so I've added effective target). Ok for trunk and release branches? 2018-11-21 Jakub Jelinek PR rtl-optimization/85925 * gcc.c-torture/execute/20181120-1.c: Require effective target int32plus. (u): New variable. (main): Compare d against u.f1 rather than 0x101. Use 0x4030201 instead of 0x10101. --- gcc/testsuite/gcc.c-torture/execute/20181120-1.c.jj 2018-11-21 17:39:47.963671708 +0100 +++ gcc/testsuite/gcc.c-torture/execute/20181120-1.c2018-11-21 20:07:45.804556443 +0100 @@ -1,4 +1,5 @@ /* PR rtl-optimization/85925 */ +/* { dg-require-effective-target int32plus } */ /* Testcase by */ int a, c, d; @@ -9,17 +10,18 @@ union U1 { unsigned f0; unsigned f1 : 15; }; +volatile union U1 u = { 0x4030201 }; int main (void) { for (c = 0; c <= 1; c++) { -union U1 f = {0x10101}; +union U1 f = {0x4030201}; if (c == 1) b; *e = f.f1; } - if (d != 0x101) + if (d != u.f1) __builtin_abort (); return 0; Jakub
Re: [PATCH 1/7][v2][MSP430][TESTSUITE] Tweak dg-directives for msp430-elf
Hi Jozef, > On Wed, 14 Nov 2018 15:41:00 + > Jozef Lawrynowicz wrote: > >> Patch 1 tweaks dg directives in tests specifically for msp430. Many of >> these are extensions to existing target selectors in dg directives. > > Made some modifications to patch 1 based on suggestions. > Added int_eq_float and ptr_eq_long effective target procedures. > > Re-tested on avr, x86_64-pc-linux-gnu and msp430-elf. > > Ok for trunk? new effective-target keywords always need documenting in gcc/doc/sourcebuild.texi. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [PATCH] Replace sync builtins with atomic builtins
Hi Janne, PING! OK. Thanks for the patch! Regards Thomas
Re: Patch ping (Re: [PATCH] Fortran include line fixes and -fdec-include support)
Hi Jakub, Before 9.0 is released, we should also document the flag (and the extension it supports) in the manual, and note it in changes.html and on the Wiki. Would you also do that? Like this? Ok for trunk/wwwdocs? OK for trunk (and I don't think you need my OK for wwwdocs, but you have it anyway :-) Regards Thomas
[PATCH, LRA]: Revert the revert of removal of useless move insns.
Hello! Before the recent patch to post-reload mode switching, vzeroupper insertion depended on the existence of the return copy instruction pair in functions that return a value. The first instruction in the pair represents a move to a function return hard register, and the second is a USE of the function return hard register. Sometimes a nop move was generated (e.g. %eax->%eax) for the first instruction of the return copy instruction pair, and the patch [1] taught LRA to remove these useless instructions on the fly. The removal caused optimize mode switching to trigger the assert, since the first instruction of a return pair was not found. The relevant part of the patch was later reverted. With the recent optimize mode switching patch, the pair is no longer necessary for the vzeroupper insertion pass, so the attached patch reverts the revert. 2018-11-21 Uros Bizjak Revert the revert: 2013-10-26 Vladimir Makarov Revert: 2013-10-25 Vladimir Makarov * lra-spills.c (lra_final_code_change): Remove useless move insns. Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}. OK for mainline? [1] https://gcc.gnu.org/ml/gcc-patches/2013-10/msg02208.html Uros. diff --git a/gcc/lra-spills.c b/gcc/lra-spills.c index 33caf9f45649..008d7399687d 100644 --- a/gcc/lra-spills.c +++ b/gcc/lra-spills.c @@ -740,6 +740,7 @@ lra_final_code_change (void) int i, hard_regno; basic_block bb; rtx_insn *insn, *curr; + rtx set; int max_regno = max_reg_num (); for (i = FIRST_PSEUDO_REGISTER; i < max_regno; i++) @@ -818,5 +819,19 @@ lra_final_code_change (void) } if (insn_change_p) lra_update_operator_dups (id); + + if ((set = single_set (insn)) != NULL + && REG_P (SET_SRC (set)) && REG_P (SET_DEST (set)) + && REGNO (SET_SRC (set)) == REGNO (SET_DEST (set))) + { + /* Remove an useless move insn. IRA can generate move +insns involving pseudos. It is better remove them +earlier to speed up compiler a bit. 
It is also +better to do it here as they might not pass final RTL +check in LRA, (e.g. insn moving a control register +into itself). */ + lra_invalidate_insn_data (insn); + delete_insn (insn); + } } }
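
For readers who want to see the shape of the check without the RTL machinery, here is a minimal sketch in plain C. It is a toy model and not LRA code: struct toy_insn, toy_noop_move_p and toy_remove_noop_moves are names made up for this illustration, and the "register move" flag stands in for what single_set, REG_P and REGNO establish in the real patch.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an insn (hypothetical, not GCC's rtx_insn): either a
   plain register-to-register move or something else entirely.  */
struct toy_insn
{
  int is_reg_move;   /* nonzero if the insn is "dest_regno = src_regno" */
  int src_regno;
  int dest_regno;
};

/* Nonzero if INSN copies a register into itself.  */
static int
toy_noop_move_p (const struct toy_insn *insn)
{
  return insn->is_reg_move && insn->src_regno == insn->dest_regno;
}

/* Compact INSNS in place, dropping no-op moves.  Returns the number of
   insns that were kept.  */
static size_t
toy_remove_noop_moves (struct toy_insn *insns, size_t n)
{
  size_t kept = 0;

  assert (insns != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    if (!toy_noop_move_p (&insns[i]))
      insns[kept++] = insns[i];
  return kept;
}
```

Calling toy_remove_noop_moves on an array holding a %eax->%eax style move next to real moves leaves only the real ones, which is the effect the restored hunk has on the insn stream.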
Re: Improve relocation
ping?

On Fri, 26 Oct 2018, Marc Glisse wrote:

Hello,

here are some tweaks so that I can usefully mark deque as trivially relocatable. It includes more noexcept(auto) madness. For __relocate_a_1, I should also test if copying, ++ and != are noexcept, but I wanted to ask first because there might be restrictions on what iterators are allowed to do, even if I didn't see them. Also, the current code already ignores those, so it may as well be fixed in another patch.

Allocators are complicated. I specialized only for the default allocator, because that's by far the one that is used the most, and I have much less risk of getting it wrong. Some allocator expert is welcome to make a better test.

I do not know in detail how deque is implemented. A quick look seemed to show that trivial relocation should be fine, but I would appreciate a confirmation.

The extra parameter for __is_trivially_relocatable is not used, but I expect it will be as soon as the specializations of __is_trivially_relocatable become more advanced.

If I use or specialize __is_trivially_relocatable in many places, this forces me to #include bits/stl_uninitialized.h in many places. I wonder if I should move some of that stuff. Since I may use it in std::swap, bits/move.h looks like a sensible place for the core pieces (__is_trivially_relocatable, and __relocate_object if I ever create that). That or type_traits.

Regtested on gcc112. I manually checked that there was a speed-up for operations on vector>, although doing any kind of benchmarking on gcc112 is hard, I'll test locally next time.

2018-10-26  Marc Glisse

    PR libstdc++/87106
    * include/bits/stl_algobase.h: Include .
    (__niter_base): Add noexcept specification.
    * include/bits/stl_deque.h: Include .
    (__is_trivially_relocatable): Specialize for deque.
    * include/bits/stl_iterator.h: Include .
    (__niter_base): Add noexcept specification.
    * include/bits/stl_uninitialized.h (__is_trivially_relocatable):
    Add parameter for meta-programming.
    (__relocate_a_1, __relocate_a): Add noexcept specification.
    * include/bits/stl_vector.h (__use_relocate): Test __relocate_a.

-- 
Marc Glisse
[PATCH, middle-end]: Fix PR88129, Two blockage insns are emitted in the function epilogue
Hello!

Attached patch removes extra blockage insn generation. For the software archaeology, please see the PR [1], where it was determined, that the removed part is probably a dataflow branch to trunk merge oversight.

2018-11-21  Uros Bizjak

    PR middle-end/88129
    * function.c (expand_function_end): Do not emit extra blockage insn.

Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32} for all default languages, obj-c++ and go.

OK for mainline?

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88129

Uros.

diff --git a/gcc/function.c b/gcc/function.c
index 302438323c87..44ad57840440 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -5296,14 +5296,6 @@ expand_function_end (void)
       if (flag_exceptions)
         sjlj_emit_function_exit_after (get_last_insn ());
     }
-  else
-    {
-      /* We want to ensure that instructions that may trap are not
-         moved into the epilogue by scheduling, because we don't
-         always emit unwind information for the epilogue.  */
-      if (cfun->can_throw_non_call_exceptions)
-        emit_insn (gen_blockage ());
-    }
 
   /* If this is an implementation of throw, do what's necessary to
      communicate between __builtin_eh_return and the epilogue.  */
Re: [PATCH, middle-end]: Fix PR88129, Two blockage insns are emitted in the function epilogue
On November 21, 2018 8:44:46 PM GMT+01:00, Uros Bizjak wrote: >Hello! > >Attached patch removes extra blockage insn generation. For the >software archaeology, please see the PR [1], where it was determined, >that the removed part is probably a dataflow branch to trunk merge >oversight. > >2018-11-21 Uros Bizjak > >PR middle-end/88129 > * function.c (expand_function_end): Do not emit extra blockage insn. > >Patch was bootstrapped and regression tested on x86_64-linux-gnu >{,-m32} for all default languages, obj-c++ and go. > >OK for mainline? OK. Richard. >[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88129 > >Uros.
[PATCH, i386]: Fix PR85667, ms_abi rules aren't followed when returning short structs with float values
> We don't have the commit access ,can someone please commit for us ? > > ~Umesh > > On Wed, Nov 21, 2018, 18:37 Jakub Jelinek > > On Wed, Nov 21, 2018 at 06:06:41PM +0530, Umesh Kalappa wrote: > > > Thank you for the inputs and please find the attachment for the update > > patch. > > > > LGTM. Committed to mainline SVN. Thanks, Uros.
Re: [PATCH] Fix up 20181120-1.c testcase on big-endian (PR rtl-optimization/85925, take 2)
On Wed, Nov 21, 2018 at 08:12:44PM +0100, Jakub Jelinek wrote: > On Wed, Nov 21, 2018 at 12:07:51PM -0600, Segher Boessenkool wrote: > > > Admittedly, it might be better if the initializer was 0x1010101 or say > > > 0x4030201 because on big endian in particular 0x10101 has the top 15 bits > > > all zero and thus that is what is in u.f1, so if the bug can be reproduced > > > with the combine.c + rtlanal.c fix reverted with 0x4030201, it would be > > > better to use that value (in both spots). > > > > Yeah good point. > > I've now managed to test this with a cross to armv7hl (scped to an arm box) > with and without the rtlanal.c + combine.c change reverted and on > powerpc64-linux as example of big-endian, on armv7hl it still fails with > the changes reverted, otherwise it succeeds on both. The test also needs > 32-bit int target (previously just 17-bit or more, so I've added effective > target). It fixes the problem on powerpc64-linux {-m32,-m64}. Thanks :-) Segher
Re: [PATCH 1/7][v2][MSP430][TESTSUITE] Tweak dg-directives for msp430-elf
On Wed, 21 Nov 2018 20:19:29 +0100 Rainer Orth wrote: > new effective-target keywords always need documenting in > gcc/doc/sourcebuild.texi. > > Rainer > Whoops, thanks for the heads up, fixed in attached. I'll add documentation for the keywords added in the other patches as well. Jozef >From be96391838c65b297589ac47ad6347f55ea713c0 Mon Sep 17 00:00:00 2001 From: Jozef Lawrynowicz Date: Thu, 8 Nov 2018 18:55:57 + Subject: [PATCH] [TESTSUITE][MSP430] Tweak dg-directives for msp430-elf 2018-11-21 Jozef Lawrynowicz gcc/ChangeLog: * doc/sourcebuild.texi: Document check_effective_target_int_eq_float and check_effective_target_ptr_eq_long. gcc/testsuite/ChangeLog: * lib/target-supports.exp (check_effective_target_logical_op_short_circuit): Add msp430. (check_effective_target_int_eq_float): New. (check_effective_target_ptr_eq_long): New. * c-c++-common/pr41779.c: Require int_eq_float for dg-warning tests. * c-c++-common/pr57371-2.c: XFAIL optimized dump scan when sizeof (float) != sizeof (int). * gcc.dg/pr84670-4.c: Require ptr_eq_long. * gcc.dg/pr85859.c: Likewise. * gcc.dg/Wno-frame-address.c: Skip for msp430-elf. * gcc.dg/torture/stackalign/builtin-apply-2.c: Likewise. * gcc.dg/ifcvt-4.c: Likewise. * gcc.dg/pr34856.c: Likewise. * gcc.dg/builtin-apply2.c: Likewise. * gcc.dg/tree-ssa/ssa-dse-26.c: Likewise. * gcc.dg/attr-alloc_size-11.c: Remove dg-warning XFAIL for msp430. * gcc.dg/tree-ssa/20040204-1.c: Likewise. * gcc.dg/compat/struct-by-value-16a_x.c: Build at -O1 for msp430 so it fits. * gcc.dg/lto/20091013-1_0.c: Require ptr_eq_long. * gcc.dg/lto/20091013-1_1.c: Remove xfail-if for when sizeof(void *) != sizeof(long). * gcc.dg/lto/20091013-1_2.c: Likewise. * gcc.dg/tree-ssa/loop-1.c: Fix expected dg-final behaviour for msp430. * gcc.dg/tree-ssa/gen-vect-25.c: Likewise. * gcc.dg/tree-ssa/gen-vect-11.c: Likewise. * gcc.dg/tree-ssa/loop-35.c: Likewise. * gcc.dg/tree-ssa/pr23455.c: Likewise. * gcc.dg/weak/typeof-2.c: Likewise. 
* gcc.target/msp430/interrupt_fn_placement.c: Skip for 430 ISA. * gcc.target/msp430/pr78818-data-region.c: Fix scan-assembler text. * gcc.target/msp430/pr79242.c: Don't skip for -msmall. * gcc.target/msp430/special-regs.c: Use "__asm__" instead of "asm". --- gcc/doc/sourcebuild.texi | 6 ++ gcc/testsuite/c-c++-common/pr41779.c | 6 +++--- gcc/testsuite/c-c++-common/pr57371-2.c | 2 +- gcc/testsuite/gcc.dg/Wno-frame-address.c | 2 +- gcc/testsuite/gcc.dg/attr-alloc_size-11.c | 4 ++-- gcc/testsuite/gcc.dg/builtin-apply2.c | 2 +- .../gcc.dg/compat/struct-by-value-16a_x.c | 2 ++ gcc/testsuite/gcc.dg/ifcvt-4.c | 2 +- gcc/testsuite/gcc.dg/lto/20091013-1_0.c| 1 + gcc/testsuite/gcc.dg/lto/20091013-1_1.c| 1 - gcc/testsuite/gcc.dg/lto/20091013-1_2.c| 1 - gcc/testsuite/gcc.dg/pr34856.c | 1 + gcc/testsuite/gcc.dg/pr84670-4.c | 1 + gcc/testsuite/gcc.dg/pr85859.c | 1 + .../gcc.dg/torture/stackalign/builtin-apply-2.c| 2 +- gcc/testsuite/gcc.dg/tree-ssa/20040204-1.c | 2 +- gcc/testsuite/gcc.dg/tree-ssa/gen-vect-11.c| 2 +- gcc/testsuite/gcc.dg/tree-ssa/gen-vect-25.c| 4 ++-- gcc/testsuite/gcc.dg/tree-ssa/loop-1.c | 4 ++-- gcc/testsuite/gcc.dg/tree-ssa/loop-35.c| 4 ++-- gcc/testsuite/gcc.dg/tree-ssa/pr23455.c| 4 ++-- gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c | 1 + gcc/testsuite/gcc.dg/weak/typeof-2.c | 2 ++ .../gcc.target/msp430/interrupt_fn_placement.c | 1 + .../gcc.target/msp430/pr78818-data-region.c| 3 ++- gcc/testsuite/gcc.target/msp430/pr79242.c | 2 +- gcc/testsuite/gcc.target/msp430/special-regs.c | 8 gcc/testsuite/lib/target-supports.exp | 24 ++ 28 files changed, 67 insertions(+), 28 deletions(-) diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi index 7487977..bca5db3 100644 --- a/gcc/doc/sourcebuild.texi +++ b/gcc/doc/sourcebuild.texi @@ -1360,6 +1360,12 @@ Target has @code{int} that is 16 bits or shorter. @item long_neq_int Target has @code{int} and @code{long} with different sizes. +@item int_eq_float +Target has @code{int} and @code{float} with the same size. 
+ +@item ptr_eq_long +Target has pointers (@code{void *}) and @code{long} with the same size. + @item large_double Target supports @code{double} that is longer than @code{float}. diff --git a/gcc/testsuite/c-c++-common/pr41779.c b/gcc/testsuite/c-c++-common/pr41779.c index c42a0f5..a80bf78 100644 --- a/gcc/testsuite/c-c++-common/pr41779.c +++ b/gcc/testsuite/c-c++-common/pr41779.c @@ -1,6 +1,6 @@ /* PR41779: Wconversion cannot see through real*integer promotions. */ /* { dg-do compile } */ -/* { dg-skip-if "doubles are floats" { "avr-*-*" } }
[PING^3] Re: [PATCH 1/3] Support instrumenting returns of instrumented functions
Andi Kleen writes: Ping^3! > Andi Kleen writes: > > Ping!^2 > >> Andi Kleen writes: >> >> Ping! >> >>> From: Andi Kleen >>> >>> When instrumenting programs using __fentry__ it is often useful >>> to instrument the function return too. Traditionally this >>> has been done by patching the return address on the stack >>> frame on entry. However this is fairly complicated (trace >>> function has to emulate a stack) and also slow because >>> it causes a branch misprediction on every return. >>> >>> Add an option to generate call or nop instrumentation for >>> every return instead, including patch sections. >>> >>> This will increase the program size slightly, but can be a >>> lot faster and simpler. >>> >>> This version only instruments true returns, not sibling >>> calls or tail recursion. This matches the semantics of the >>> original stack. >>> >>> gcc/: >>> >>> 2018-11-04 Andi Kleen >>> >>> * config/i386/i386-opts.h (enum instrument_return): Add. >>> * config/i386/i386.c (output_return_instrumentation): Add. >>> (ix86_output_function_return): Call output_return_instrumentation. >>> (ix86_output_call_insn): Call output_return_instrumentation. >>> * config/i386/i386.opt: Add -minstrument-return=. >>> * doc/invoke.texi (-minstrument-return): Document. >>> >>> gcc/testsuite/: >>> >>> 2018-11-04 Andi Kleen >>> >>> * gcc.target/i386/returninst1.c: New test. >>> * gcc.target/i386/returninst2.c: New test. >>> * gcc.target/i386/returninst3.c: New test. 
>>> --- >>> gcc/config/i386/i386-opts.h | 6 >>> gcc/config/i386/i386.c | 36 + >>> gcc/config/i386/i386.opt| 21 >>> gcc/doc/invoke.texi | 14 >>> gcc/testsuite/gcc.target/i386/returninst1.c | 14 >>> gcc/testsuite/gcc.target/i386/returninst2.c | 21 >>> gcc/testsuite/gcc.target/i386/returninst3.c | 9 ++ >>> 7 files changed, 121 insertions(+) >>> create mode 100644 gcc/testsuite/gcc.target/i386/returninst1.c >>> create mode 100644 gcc/testsuite/gcc.target/i386/returninst2.c >>> create mode 100644 gcc/testsuite/gcc.target/i386/returninst3.c >>> >>> diff --git a/gcc/config/i386/i386-opts.h b/gcc/config/i386/i386-opts.h >>> index 46366cbfa72..35e9413100e 100644 >>> --- a/gcc/config/i386/i386-opts.h >>> +++ b/gcc/config/i386/i386-opts.h >>> @@ -119,4 +119,10 @@ enum indirect_branch { >>>indirect_branch_thunk_extern >>> }; >>> >>> +enum instrument_return { >>> + instrument_return_none = 0, >>> + instrument_return_call, >>> + instrument_return_nop5 >>> +}; >>> + >>> #endif >>> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c >>> index f9ef0b4445b..f7cd94a8139 100644 >>> --- a/gcc/config/i386/i386.c >>> +++ b/gcc/config/i386/i386.c >>> @@ -28336,12 +28336,47 @@ ix86_output_indirect_jmp (rtx call_op) >>> return "%!jmp\t%A0"; >>> } >>> >>> +/* Output return instrumentation for current function if needed. 
*/ >>> + >>> +static void >>> +output_return_instrumentation (void) >>> +{ >>> + if (ix86_instrument_return != instrument_return_none >>> + && flag_fentry >>> + && !DECL_NO_INSTRUMENT_FUNCTION_ENTRY_EXIT (cfun->decl)) >>> +{ >>> + if (ix86_flag_record_return) >>> + fprintf (asm_out_file, "1:\n"); >>> + switch (ix86_instrument_return) >>> + { >>> + case instrument_return_call: >>> + fprintf (asm_out_file, "\tcall\t__return__\n"); >>> + break; >>> + case instrument_return_nop5: >>> + /* 5 byte nop: nopl 0(%[re]ax,%[re]ax,1) */ >>> + fprintf (asm_out_file, ASM_BYTE "0x0f, 0x1f, 0x44, 0x00, 0x00\n"); >>> + break; >>> + case instrument_return_none: >>> + break; >>> + } >>> + >>> + if (ix86_flag_record_return) >>> + { >>> + fprintf (asm_out_file, "\t.section __return_loc, \"a\",@progbits\n"); >>> + fprintf (asm_out_file, "\t.%s 1b\n", TARGET_64BIT ? "quad" : "long"); >>> + fprintf (asm_out_file, "\t.previous\n"); >>> + } >>> +} >>> +} >>> + >>> /* Output function return. CALL_OP is the jump target. Add a REP >>> prefix to RET if LONG_P is true and function return is kept. */ >>> >>> const char * >>> ix86_output_function_return (bool long_p) >>> { >>> + output_return_instrumentation (); >>> + >>>if (cfun->machine->function_return_type != indirect_branch_keep) >>> { >>>char thunk_name[32]; >>> @@ -28454,6 +28489,7 @@ ix86_output_call_insn (rtx_insn *insn, rtx call_op) >>> >>>if (SIBLING_CALL_P (insn)) >>> { >>> + output_return_instrumentation (); >>>if (direct_p) >>> { >>> if (ix86_nopic_noplt_attribute_p (call_op)) >>> diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt >>> index e7fbf9b6f99..5925b75244f 100644 >>> --- a/gcc/config/i386/i386.opt >>> +++ b/gcc/config/i386/i386.opt >>> @@ -1063,3 +1063,24 @@ Support WAITPKG built-in functions and code >>> generation. >>> mcldemote >>>
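
To make the emitted sequence easier to picture, the following is a minimal sketch in C that builds the same text into a caller-supplied buffer instead of asm_out_file. It is an illustration only, under a few assumptions: the toy_ names are invented here, ASM_BYTE is assumed to expand to "\t.byte\t", and the flag_fentry and DECL_NO_INSTRUMENT_FUNCTION_ENTRY_EXIT tests from the quoted function are left out.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical mirror of the instrument_return enum from the patch.  */
enum toy_instrument_return
{
  toy_return_none = 0,
  toy_return_call,
  toy_return_nop5
};

/* Append TEXT to BUF (capacity LEN, *USED characters already in it).
   Returns 0 on success, -1 if TEXT plus the NUL does not fit.  */
static int
toy_append (char *buf, size_t len, size_t *used, const char *text)
{
  size_t n = strlen (text);

  if (*used + n + 1 > len)
    return -1;
  memcpy (buf + *used, text, n + 1);
  *used += n;
  return 0;
}

/* Build the text placed in front of a return.  KIND selects call or
   5-byte nop, RECORD adds the local label and the __return_loc entry,
   IS_64BIT picks .quad over .long.  Returns the length written, or -1
   if BUF is too small.  */
static int
toy_emit_return_instrumentation (char *buf, size_t len,
                                 enum toy_instrument_return kind,
                                 int record, int is_64bit)
{
  size_t used = 0;
  int rc = 0;

  assert (buf != NULL && len > 0);
  buf[0] = '\0';
  if (kind == toy_return_none)
    return 0;

  if (record)
    rc |= toy_append (buf, len, &used, "1:\n");
  if (kind == toy_return_call)
    rc |= toy_append (buf, len, &used, "\tcall\t__return__\n");
  else
    /* 5 byte nop: nopl 0(%[re]ax,%[re]ax,1); ".byte" is an assumption.  */
    rc |= toy_append (buf, len, &used,
                      "\t.byte\t0x0f, 0x1f, 0x44, 0x00, 0x00\n");
  if (record)
    {
      rc |= toy_append (buf, len, &used,
                        "\t.section __return_loc, \"a\",@progbits\n");
      rc |= toy_append (buf, len, &used,
                        is_64bit ? "\t.quad 1b\n" : "\t.long 1b\n");
      rc |= toy_append (buf, len, &used, "\t.previous\n");
    }
  return rc ? -1 : (int) used;
}
```

With recording enabled, each return site gets a "1:" label whose address lands in the __return_loc section, which is what lets a tracer find and patch the call or nop later.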
Re: [PATCH, LRA]: Revert the revert of removal of useless move insns.
On 11/21/2018 02:33 PM, Uros Bizjak wrote:
> Hello!
>
> Before the recent patch to post-reload mode switching, vzeroupper insertion depended on the existence of the return copy instructions pair in functions that return a value. The first instruction in the pair represents a move to a function return hard register, and the second was a USE of the function return hard register. Sometimes a nop move was generated (e.g. %eax->%eax) for the first instruction of the return copy instructions pair and the patch [1] taught LRA to remove these useless instructions on the fly.
>
> The removal caused optimize mode switching to trigger the assert, since the first instruction of a return pair was not found. The relevant part of the patch was later reverted. With the recent optimize mode switching patch, this is no longer necessary for vzeroupper insertion pass, so attached patch reverts the revert.
>
> 2018-11-21  Uros Bizjak
>
>     Revert the revert:
>     2013-10-26  Vladimir Makarov
>
>     Revert:
>     2013-10-25  Vladimir Makarov
>
>     * lra-spills.c (lra_final_code_change): Remove useless move insns.
>
> Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
>
> OK for mainline?

Sure. Thank you, Uros.

> [1] https://gcc.gnu.org/ml/gcc-patches/2013-10/msg02208.html
>
> Uros.
[C++ PATCH] Improve locations of id-expressions and operator "" (PR c++/87386, take 3)
On Wed, Nov 21, 2018 at 07:49:48PM +0100, Jakub Jelinek wrote:
> So, do you suggest we should instead return
> operator new ^~~~ operator delete [] ^~ operator == ^~~ operator "" _foo ^~~~ ?
> That would mean cp_parser_operator_function_id would need to pass
> location_t start_loc (the start of the operator token) to cp_parser_operator
> and let that create a range in all cases rather than just for operator
> new/delete.

This version of the patch implements that.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2018-11-21  Jakub Jelinek

    PR c++/87386
    * parser.c (cp_parser_primary_expression): Use
    id_expression.get_location () instead of id_expr_token->location.
    Adjust the range from id_expr_token->location to
    id_expression.get_finish ().
    (cp_parser_operator_function_id): Pass location of the operator
    token down to cp_parser_operator.
    (cp_parser_operator): Add start_loc argument, always construct a
    location with caret at start_loc and range from start_loc to the
    finish of the last token.
gcc/testsuite/
    * g++.dg/diagnostic/pr87386.C: New test.
    * g++.dg/parse/error17.C: Adjust expected diagnostics.
libstdc++-v3/
    * testsuite/20_util/scoped_allocator/69293_neg.cc: Adjust expected
    line.
    * testsuite/20_util/uses_allocator/cons_neg.cc: Likewise.
    * testsuite/20_util/uses_allocator/69293_neg.cc: Likewise.
    * testsuite/experimental/propagate_const/requirements2.cc: Likewise.
    * testsuite/experimental/propagate_const/requirements3.cc: Likewise.
    * testsuite/experimental/propagate_const/requirements4.cc: Likewise.
    * testsuite/experimental/propagate_const/requirements5.cc: Likewise.

--- gcc/cp/parser.c.jj 2018-11-21 17:42:18.003216049 +0100 +++ gcc/cp/parser.c 2018-11-21 20:56:43.694344258 +0100 @@ -2312,7 +2312,7 @@ static tree cp_parser_mem_initializer_id static cp_expr cp_parser_operator_function_id (cp_parser *); static cp_expr cp_parser_operator - (cp_parser *); + (cp_parser *, location_t); /* Templates [gram.temp] */ @@ -5604,7 +5604,7 @@ cp_parser_primary_expression (cp_parser /*is_namespace=*/false, /*check_dependency=*/true, &ambiguous_decls, - id_expr_token->location); + id_expression.get_location ()); /* If the lookup was ambiguous, an error will already have been issued. */ if (ambiguous_decls) @@ -5675,7 +5675,7 @@ cp_parser_primary_expression (cp_parser if (parser->local_variables_forbidden_p && local_variable_p (decl)) { - error_at (id_expr_token->location, + error_at (id_expression.get_location (), "local variable %qD may not appear in this context", decl.get_value ()); return error_mark_node; @@ -5694,7 +5694,8 @@ cp_parser_primary_expression (cp_parser id_expression.get_location ())); if (error_msg) cp_parser_error (parser, error_msg); - decl.set_location (id_expr_token->location); + decl.set_location (id_expression.get_location ()); + decl.set_range (id_expr_token->location, id_expression.get_finish ()); return decl; } @@ -15011,11 +15012,12 @@ cp_parser_mem_initializer_id (cp_parser* static cp_expr cp_parser_operator_function_id (cp_parser* parser) { + location_t start_loc = cp_lexer_peek_token (parser->lexer)->location; /* Look for the `operator' keyword. */ if (!cp_parser_require_keyword (parser, RID_OPERATOR, RT_OPERATOR)) return error_mark_node; /* And then the name of the operator itself. */ - return cp_parser_operator (parser); + return cp_parser_operator (parser, start_loc); } /* Return an identifier node for a user-defined literal operator. @@ -15049,7 +15051,7 @@ cp_literal_operator_id (const char* name human-readable spelling of the identifier, e.g., `operator +'. 
*/ static cp_expr -cp_parser_operator (cp_parser* parser) +cp_parser_operator (cp_parser* parser, location_t start_loc) { tree id = NULL_TREE; cp_token *token; @@ -15058,7 +15060,7 @@ cp_parser_operator (cp_parser* parser) /* Peek at the next token. */ token = cp_lexer_peek_token (parser->lexer); - location_t start_loc = token->location; + location_t end_loc = token->location; /* Figure out which operator we have. */ enum tree_code op = ERROR_MARK; @@ -15077,7 +15079,7 @@ cp_parser_operator (cp_parser* parser) break; /* Consume the `new' or `delete' token. */ - location_t end_loc = cp_lexer_consume_token (parser->lexer)->location; + end
[C++ PATCH] Fix ICE in maybe_explain_implicit_delete (PR c++/88122)
Hi!

On the following testcase we ICE in maybe_explain_implicit_delete, because FUNCTION_FIRST_USER_PARMTYPE (decl) is NULL - there are no user parameters and ...

From what I understood, const_p is used only in certain cases like const vs. non-const copy constructor or assignment operator, if the sfk has no user parameters, usually parm_type is just the void_type terminating the argument list and also not really interesting for const_p computation.

So, this patch just arranges to pass false as const_p in this case.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2018-11-21  Jakub Jelinek

    PR c++/88122
    * method.c (maybe_explain_implicit_delete): If
    FUNCTION_FIRST_USER_PARMTYPE (decl) is NULL, set const_p to false
    instead of ICEing.

    * g++.dg/cpp0x/implicit15.C: New test.

--- gcc/cp/method.c.jj 2018-11-16 10:22:18.668258171 +0100
+++ gcc/cp/method.c 2018-11-21 15:42:08.441785625 +0100
@@ -1821,8 +1821,12 @@ maybe_explain_implicit_delete (tree decl
       if (!informed)
         {
           tree parms = FUNCTION_FIRST_USER_PARMTYPE (decl);
-          tree parm_type = TREE_VALUE (parms);
-          bool const_p = CP_TYPE_CONST_P (non_reference (parm_type));
+          bool const_p = false;
+          if (parms)
+            {
+              tree parm_type = TREE_VALUE (parms);
+              const_p = CP_TYPE_CONST_P (non_reference (parm_type));
+            }
           tree raises = NULL_TREE;
           bool deleted_p = false;
           tree scope = push_scope (ctype);
--- gcc/testsuite/g++.dg/cpp0x/implicit15.C.jj 2018-11-21 15:59:29.849741499 +0100
+++ gcc/testsuite/g++.dg/cpp0x/implicit15.C 2018-11-21 15:58:00.912197089 +0100
@@ -0,0 +1,11 @@
+// PR c++/88122
+// { dg-do compile { target c++11 } }
+
+struct A {
+  A (...);      // { dg-message "candidate" }
+  A ();         // { dg-message "candidate" }
+};
+struct B : A {
+  using A::A;   // { dg-error "is ambiguous" }
+                // { dg-message "is implicitly deleted because the default definition would be ill-formed" "" { target *-*-* } .-1 }
+} b{3};         // { dg-error "use of deleted function" }

    Jakub
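
The pattern of the fix, stripped of tree nodes, might look like the sketch below. Everything in it is hypothetical: struct toy_parm plays the role of the parameter type list and is_const the role of CP_TYPE_CONST_P (non_reference (parm_type)); the point is only that an empty list yields false instead of a null dereference.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry of a parameter type list.  */
struct toy_parm
{
  int is_const;            /* nonzero if the parameter type is const */
  struct toy_parm *next;
};

/* Nonzero if the first user parameter is const.  With no user
   parameters (PARMS == NULL) the answer is 0, mirroring the patch's
   choice of passing false rather than looking at a missing node.  */
static int
toy_first_parm_const_p (const struct toy_parm *parms)
{
  int const_p = 0;

  if (parms)
    const_p = parms->is_const;
  return const_p;
}
```

Calling toy_first_parm_const_p (NULL) is the analogue of the A (...) case from the testcase: nothing to inspect, so the const-ness question is simply answered with false.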
[PATCH] Fix -fstack-protector* on darwin/mingw etc. (PR target/85644)
Hi!

As I wrote in the PR, before PR81708 commits, while i386 defaulted to SSP_TLS rather than SSP_GLOBAL on everything but Android, the -mstack-protector-guard= switch controlled pretty much whether the i386.md special stack protector patterns are used (if tls) or whether generic code is used (global). These special stack protector patterns did one thing if TARGET_THREAD_SSP_OFFSET macro was defined (only defined on glibc targets) - code like:

        movq    %fs:40, %rax
        movq    %rax, -8(%rbp)
        xorl    %eax, %eax

in the prologue and

        movq    -8(%rbp), %rdx
        xorq    %fs:40, %rdx
        je      .L4

in the epilogue. If TARGET_THREAD_SSP_OFFSET macro wasn't defined, it would do instead:

        movq    .refptr.__stack_chk_guard(%rip), %rax
        movq    (%rax), %rcx
        movq    %rcx, -8(%rbp)
        xorl    %ecx, %ecx

and

        movq    .refptr.__stack_chk_guard(%rip), %rdx
        movq    -8(%rbp), %rcx
        xorq    (%rdx), %rcx
        je      .L4

(this is taken from 7.x cross to mingw). Finally, for Android or when -mstack-protector-guard=global was used, it emitted:

        movq    __stack_chk_guard(%rip), %rax
        movq    %rax, -8(%rbp)

and

        movq    __stack_chk_guard(%rip), %rdx
        cmpq    %rdx, %rcx
        je      .L4

Note, apart from OS specific details, those =global sequences are similar to the =tls ones when TARGET_THREAD_SSP_OFFSET is not defined, the main difference is that the =tls ones are more secure as they clear registers containing the guard as quickly as possible.

The PR81708 changes dropped the non-tls special stack_protector_* patterns from i386.md and now =tls implies really tls, but the default remained, so mingw32 or darwin still default to tls and just use 0 offset by default.

So, this patch changes the default for mingw32, darwin and everything else except gnu-user*.h to be =global, and just forces those special i386.md more secure patterns unconditionally (slightly changing the generated code on Android, but it is one extra insn in prologue and one fewer in the epilogue).

With this patch -mstack-protector-guard=tls is really for tls and =global for pure var access and user can override the defaults on non-glibc targets, but they should get a default that works there.

Bootstrapped/regtested on x86_64-linux and i686-linux, plus tested with a cross to mingw, ok for trunk?

2018-11-21  Jakub Jelinek

    PR target/85644
    PR target/86832
    * config/i386/i386.c (ix86_option_override_internal): Default
    ix86_stack_protector_guard to SSP_TLS only if TARGET_THREAD_SSP_OFFSET
    is defined.
    * config/i386/i386.md (stack_protect_set, stack_protect_set_,
    stack_protect_test, stack_protect_test_): Use empty condition instead
    of TARGET_SSP_TLS_GUARD.

--- gcc/config/i386/i386.c.jj 2018-11-20 21:39:00.905577452 +0100
+++ gcc/config/i386/i386.c 2018-11-21 18:02:49.448049161 +0100
@@ -4557,8 +4557,13 @@ ix86_option_override_internal (bool main
 
   /* Handle stack protector */
   if (!opts_set->x_ix86_stack_protector_guard)
-    opts->x_ix86_stack_protector_guard
-      = TARGET_HAS_BIONIC ? SSP_GLOBAL : SSP_TLS;
+    {
+      opts->x_ix86_stack_protector_guard = SSP_GLOBAL;
+#ifdef TARGET_THREAD_SSP_OFFSET
+      if (!TARGET_HAS_BIONIC)
+        opts->x_ix86_stack_protector_guard = SSP_TLS;
+#endif
+    }
 
 #ifdef TARGET_THREAD_SSP_OFFSET
   ix86_stack_protector_guard_offset = TARGET_THREAD_SSP_OFFSET;
--- gcc/config/i386/i386.md.jj 2018-11-21 11:45:12.090721862 +0100
+++ gcc/config/i386/i386.md 2018-11-21 18:03:46.166119350 +0100
@@ -19010,7 +19010,7 @@ (define_insn "*prefetch_prefetchwt1"
 (define_expand "stack_protect_set"
   [(match_operand 0 "memory_operand")
    (match_operand 1 "memory_operand")]
-  "TARGET_SSP_TLS_GUARD"
+  ""
 {
   rtx (*insn)(rtx, rtx);
 
@@ -19028,7 +19028,7 @@ (define_insn "stack_protect_set_"
                     UNSPEC_SP_SET))
    (set (match_scratch:PTR 2 "=&r") (const_int 0))
    (clobber (reg:CC FLAGS_REG))]
-  "TARGET_SSP_TLS_GUARD"
+  ""
   "mov{}\t{%1, %2|%2, %1}\;mov{}\t{%2, %0|%0, %2}\;xor{l}\t%k2, %k2"
   [(set_attr "type" "multi")])
 
@@ -19036,7 +19036,7 @@ (define_expand "stack_protect_test"
   [(match_operand 0 "memory_operand")
    (match_operand 1 "memory_operand")
    (match_operand 2)]
-  "TARGET_SSP_TLS_GUARD"
+  ""
 {
   rtx flags = gen_rtx_REG (CCZmode, FLAGS_REG);
 
@@ -19059,7 +19059,7 @@ (define_insn "stack_protect_test_"
                      (match_operand:PTR 2 "memory_operand" "m")]
                     UNSPEC_SP_TEST))
    (clobber (match_scratch:PTR 3 "=&r"))]
-  "TARGET_SSP_TLS_GUARD"
+  ""
   "mov{}\t{%1, %3|%3, %1}\;xor{}\t{%2, %3|%3, %2}"
   [(set_attr "type" "multi")])
 
    Jakub
Re: [PATCH] Fix -fstack-protector* on darwin/mingw etc. (PR target/85644)
On Wed, Nov 21, 2018 at 11:21:18PM +0100, Jakub Jelinek wrote: > As I wrote in the PR, before PR81708 commits, Note, e.g. in 4.8, the stack_protector_* patterns weren't guarded with something like TARGET_SSP_TLS_GUARD but with !TARGET_HAS_BIONIC, which just means it was incorrectly implemented for Android initially (should have been done by forcing there the non-*tls* insns for !TARGET_HAS_BIONIC rather than failing the optab). Jakub
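
The default selection discussed in this thread can be summarised as a small decision function. The sketch below is a plain C model under stated assumptions, not the i386.c code: the toy_ names are invented, and has_thread_ssp_offset is a run-time flag standing in for "TARGET_THREAD_SSP_OFFSET is defined", which in GCC is a compile-time #ifdef.

```c
#include <assert.h>

/* Hypothetical mirror of the two guard kinds discussed above.  */
enum toy_ssp_guard { TOY_SSP_TLS, TOY_SSP_GLOBAL };

/* Pick the guard kind.  USER_CHOICE is nonzero when
   -mstack-protector-guard= was given, in which case USER_VALUE wins.
   Otherwise default to global, and upgrade to TLS only when a thread
   guard offset exists and the target is not Bionic.  */
static enum toy_ssp_guard
toy_default_guard (int user_choice, enum toy_ssp_guard user_value,
                   int has_thread_ssp_offset, int has_bionic)
{
  if (user_choice)
    return user_value;
  if (has_thread_ssp_offset && !has_bionic)
    return TOY_SSP_TLS;
  return TOY_SSP_GLOBAL;
}
```

Under this model a glibc target gets TLS, Android and targets without a thread guard offset (such as the mingw and darwin cases from the PR) get global, and an explicit option overrides either default.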
Re: [C++ PATCH] Improve locations of id-expressions and operator "" (PR c++/87386, take 2)
On 11/21/18 1:49 PM, Jakub Jelinek wrote: On Wed, Nov 21, 2018 at 01:29:15PM -0500, Jason Merrill wrote: similarly for operator"" _F the column is under _ rather than first o. I disagree with this one: the name of the declaration is operator""_F, so I think the caret should go at the first o. Right now when cp_parser_operator_function_id is called, it returns locus like: operator new ^~~ operator delete [] ^ operator == ^ operator "" _foo UNKNOWN_LOCATION The last one is because for others we do return cp_expr (id, start_loc); but for operator "" just return id; So, do you suggest we should instead return operator new ^~~~ operator delete [] ^~ operator == ^~~ operator "" _foo ^~~~ ? Yes. That would mean cp_parser_operator_function_id would need to pass location_t start_loc (the start of the operator token) to cp_parser_operator and let that create a range in all cases rather than just for operator new/delete. Sure. Jason
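
As an aside on what "caret at the first o with a range over the whole name" means in terms of columns, here is a small C sketch that renders such an underline. It is only an illustration: toy_underline is a made-up helper working on 0-based columns, and it has nothing to do with how GCC's location_t or its diagnostic printer are actually implemented.

```c
#include <assert.h>
#include <string.h>

/* Render an underline for columns START..FINISH (0-based, inclusive)
   into OUT (capacity LEN): spaces before START, '^' at CARET, '~' for
   the rest of the range.  Returns 0 on success, -1 if the columns are
   inconsistent or OUT is too small.  */
static int
toy_underline (char *out, size_t len,
               size_t start, size_t caret, size_t finish)
{
  assert (out != NULL);
  if (start > caret || caret > finish || finish + 2 > len)
    return -1;
  memset (out, ' ', start);
  memset (out + start, '~', finish - start + 1);
  out[caret] = '^';
  out[finish + 1] = '\0';
  return 0;
}
```

For "operator ==", which is 11 columns wide, toy_underline (buf, sizeof buf, 0, 0, 10) gives a caret under the first o followed by tildes to the end of the name, while start and caret at column 9 would underline only the token after the operator keyword.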
Re: [C++ PATCH] Fix ICE in maybe_explain_implicit_delete (PR c++/88122)
On 11/21/18 5:16 PM, Jakub Jelinek wrote: Hi! On the following testcase we ICE in maybe_explain_implicit_delete, because FUNCTION_FIRST_USER_PARMTYPE (decl) is NULL - there are no user parameters and ... From what I understood, const_p is used only in certain cases like const vs. non-const copy constructor or assignment operator, if the sfk has no user parameters, usually parm_type is just the void_type terminating the argument list and also not really interesting for const_p computation. So, this patch just arranges to pass false as const_p in this case. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2018-11-21 Jakub Jelinek PR c++/88122 * method.c (maybe_explain_implicit_delete): If FUNCTION_FIRST_USER_PARMTYPE (decl) is NULL, set const_p to false instead of ICEing. * g++.dg/cpp0x/implicit15.C: New test. OK. Jason
Re: [C++ PATCH] Improve locations of id-expressions and operator "" (PR c++/87386, take 3)
On 11/21/18 5:10 PM, Jakub Jelinek wrote: On Wed, Nov 21, 2018 at 07:49:48PM +0100, Jakub Jelinek wrote: So, do you suggest we should instead return operator new ^~~~ operator delete [] ^~ operator == ^~~ operator "" _foo ^~~~ ? That would mean cp_parser_operator_function_id would need to pass location_t start_loc (the start of the operator token) to cp_parser_operator and let that create a range in all cases rather than just for operator new/delete. This version of the patch implements that. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? OK. Jason
[PATCH 3/7][v2][MSP430][TESTSUITE] Dynamically check if size_t is large enough for tests containing large structs/arrays
On Wed, 14 Nov 2018 15:41:00 + Jozef Lawrynowicz wrote: > Patch 3 sets up require-effective-target directives for tests which > require the compilation of large arrays. > Targets which have 16-bit or 20-bit size_t fail to compile tests with large > arrays designed to test 32-bit or 64-bit behaviour. Rather than enumerating > another target to skip, I've replaced the target selector in some tests with > a size checking procedure: > - size20plus (new) > - size32plus > size20plus checks to see if a 16-bit structure/array size is supported, > similarly to how the existing size32plus checks to see if a 24-bit > structure/array size is supported, Added missing documentation for new check_effective target procs in attached patch. >From 1573a8392605a17e58c74be19ee5eb28950dc32d Mon Sep 17 00:00:00 2001 From: Jozef Lawrynowicz Date: Thu, 8 Nov 2018 22:39:12 + Subject: [PATCH] [TESTSUITE] Dynamically check if size_t is large enough for tests containing large structs/arrays 2018-11-21 Jozef Lawrynowicz gcc/ChangeLog: * doc/sourcebuild.texi: Document check_effective_target_size20plus. Clarify documentation for check_effective_target_size32plus. gcc/testsuite/ChangeLog: * gcc.c-torture/compile/20151204.c: Add dg-require-effective-target size20plus. * gcc.dg/pr34225.c: Likewise. * gcc.dg/pr40971.c: Likewise. * gcc.dg/pr69071.c: Likewise. * gcc.dg/tree-ssa/loop-interchange-10.c: Likewise. * gcc.dg/tree-ssa/loop-interchange-2.c: Likewise. * gcc.dg/tree-ssa/loop-interchange-3.c: Likewise. * gcc.dg/tree-ssa/loop-interchange-5.c: Likewise. * gcc.dg/tree-ssa/loop-interchange-6.c: Likewise. * gcc.dg/tree-ssa/loop-interchange-7.c: Likewise. * gcc.dg/tree-ssa/loop-interchange-8.c: Likewise. * gcc.dg/tree-ssa/loop-interchange-9.c: Likewise. * gcc.dg/tree-ssa/loop-interchange-11.c: Add dg-require-effective-target size32plus. * gcc.dg/Walloc-size-larger-than-4.c: Likewise. * gcc.dg/Walloc-size-larger-than-5.c: Likewise. * gcc.dg/Walloc-size-larger-than-6.c: Likewise. 
* gcc.dg/Walloc-size-larger-than-7.c: Likewise. * gcc.dg/tree-ssa/loop-interchange-1.c: Likewise. * gcc.dg/tree-ssa/loop-interchange-1b.c: Likewise. * lib/target-supports.exp (check_effective_target_size20plus): New. (check_effective_target_size32plus): Update comment. --- gcc/doc/sourcebuild.texi| 7 ++- gcc/testsuite/gcc.c-torture/compile/20151204.c | 2 +- gcc/testsuite/gcc.dg/Walloc-size-larger-than-4.c| 2 +- gcc/testsuite/gcc.dg/Walloc-size-larger-than-5.c| 2 +- gcc/testsuite/gcc.dg/Walloc-size-larger-than-6.c| 2 +- gcc/testsuite/gcc.dg/Walloc-size-larger-than-7.c| 2 +- gcc/testsuite/gcc.dg/pr34225.c | 1 + gcc/testsuite/gcc.dg/pr40971.c | 1 + gcc/testsuite/gcc.dg/pr69071.c | 2 +- gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-1.c | 3 ++- gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-10.c | 3 ++- gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-11.c | 3 ++- gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-1b.c | 3 ++- gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-2.c | 3 ++- gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-3.c | 3 ++- gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-5.c | 3 ++- gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-6.c | 3 ++- gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-7.c | 3 ++- gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-8.c | 3 ++- gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-9.c | 3 ++- gcc/testsuite/lib/target-supports.exp | 18 +++--- 21 files changed, 51 insertions(+), 21 deletions(-) diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi index bca5db3..9c57226 100644 --- a/gcc/doc/sourcebuild.texi +++ b/gcc/doc/sourcebuild.texi @@ -1375,8 +1375,13 @@ Target supports @code{long double} that is longer than @code{double}. @item ptr32plus Target has pointers that are 32 bits or longer. +@item size20plus +Target has a 20-bit or larger address space, so at least supports +16-bit array and structure sizes. + @item size32plus -Target supports array and structure sizes that are 32 bits or longer. 
+Target has a 32-bit or larger address space, so at least supports +24-bit array and structure sizes. @item 4byte_wchar_t Target has @code{wchar_t} that is at least 4 bytes. diff --git a/gcc/testsuite/gcc.c-torture/compile/20151204.c b/gcc/testsuite/gcc.c-torture/compile/20151204.c index 6a46abf..e41f6c1 100644 --- a/gcc/testsuite/gcc.c-torture/compile/20151204.c +++ b/gcc/testsuite/gcc.c-torture/compile/20151204.c @@ -1,4 +1,4 @@ -/* { dg-skip-if "Array too big" { "avr-*-*" "pdp11-*-*" } } */ +/* { dg-require-effective-target size20plus } */ typedef __SIZE_TYPE__ size_t; diff --git a/gcc/testsuite/gcc.dg/Walloc-size-larger-than-4.c b/gcc/testsuite/gcc.dg/Walloc-size-larger-than-4.c index 4b3a64b..54e43cd 100644 --- a/gcc/testsuite/gcc.dg/Walloc-size-larger-than-4.c +++ b/gcc/testsuite
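Editorial sketch of the idea behind size20plus and size32plus: a test that declares a large object can only compile if the size type has enough bits to express that object's size. The helper name size_fits and the two probe sizes (2^16 + 1 and 2^24 + 1 bytes) are assumptions made for illustration; the target-supports.exp hunk is cut off in this excerpt, so the patch's real Tcl procedure may differ. The model is also simplified, since it ignores any further limit a compiler places on object sizes.

```c
#include <assert.h>

/* Hypothetical probe sizes: one byte more than fits in 16 and 24 bits.  */
#define PROBE_SIZE20PLUS 65537ULL
#define PROBE_SIZE32PLUS 16777217ULL

/* Return 1 if an object of BYTES bytes can be described by a size type
   that is SIZE_BITS bits wide, 0 otherwise.  Simplified model.  */
static int
size_fits (unsigned size_bits, unsigned long long bytes)
{
  assert (size_bits > 0);
  if (size_bits >= 64)
    return 1;
  return bytes <= (1ULL << size_bits) - 1;
}
```

Under this model a 16-bit size type rejects both probes, a 20-bit one accepts only the smaller probe, and a 32-bit one accepts both, which is the distinction the two effective-target keywords draw.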
[PATCH 6/7][v2][MSP430][TESTSUITE] Fix tests requiring float printf support when GCC was configured with --enable-newlib-nano-formatted-io
On Wed, 14 Nov 2018 15:41:00 + Jozef Lawrynowicz wrote: > Patch 6 fixes tests expecting printf float support for targets which have been > configured with "newlib-nano-formatted-io". When newlib is configured in this > way, float printf is enabled at build time by registering _printf_float as an > undefined symbol. Added missing documentation for new check_effective target procs in attached patch. >From ad5c2e3684904f961938cfc0b50445013300c6e0 Mon Sep 17 00:00:00 2001 From: Jozef Lawrynowicz Date: Sat, 10 Nov 2018 16:02:25 + Subject: [PATCH] [TESTSUITE] Fix tests requiring float printf support when GCC was configured with --enable-newlib-nano-formatted-io 2018-11-21 Jozef Lawrynowicz gcc/ChangeLog: * doc/sourcebuild.texi: Document check_effective_target_newlib_nano_io. gcc/testsuite/ChangeLog: * lib/target-supports.exp (check_effective_target_newlib_nano_io): New. * gcc.c-torture/execute/920501-8.c: Register undefined linker symbol _printf_float for newlib_nano_io target. * gcc.c-torture/execute/930513-1.c: Likewise. * gcc.dg/torture/builtin-sprintf.c: Likewise. * gcc.c-torture/execute/ieee/920810-1.x: New. --- gcc/doc/sourcebuild.texi| 4 gcc/testsuite/gcc.c-torture/execute/920501-8.c | 2 ++ gcc/testsuite/gcc.c-torture/execute/930513-1.c | 2 ++ gcc/testsuite/gcc.c-torture/execute/ieee/920810-1.x | 4 gcc/testsuite/gcc.dg/torture/builtin-sprintf.c | 3 ++- gcc/testsuite/lib/target-supports.exp | 4 6 files changed, 18 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gcc.c-torture/execute/ieee/920810-1.x diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi index 9c57226..bfaa0fd 100644 --- a/gcc/doc/sourcebuild.texi +++ b/gcc/doc/sourcebuild.texi @@ -2152,6 +2152,10 @@ Target supports @code{mmap}. @item newlib Target supports Newlib. +@item newlib_nano_io +GCC was configured with @code{--enable-newlib-nano-formatted-io}, which reduces +the code size of Newlib formatted I/O functions. + @item pow10 Target provides @code{pow10} function. 
diff --git a/gcc/testsuite/gcc.c-torture/execute/920501-8.c b/gcc/testsuite/gcc.c-torture/execute/920501-8.c index 62780a0..7e4fa17 100644 --- a/gcc/testsuite/gcc.c-torture/execute/920501-8.c +++ b/gcc/testsuite/gcc.c-torture/execute/920501-8.c @@ -1,3 +1,5 @@ +/* { dg-additional-options "-Wl,-u,_printf_float" { target newlib_nano_io } } */ + #include #include diff --git a/gcc/testsuite/gcc.c-torture/execute/930513-1.c b/gcc/testsuite/gcc.c-torture/execute/930513-1.c index 4544471..f163007 100644 --- a/gcc/testsuite/gcc.c-torture/execute/930513-1.c +++ b/gcc/testsuite/gcc.c-torture/execute/930513-1.c @@ -1,3 +1,5 @@ +/* { dg-additional-options "-Wl,-u,_printf_float" { target newlib_nano_io } } */ + #include char buf[2]; diff --git a/gcc/testsuite/gcc.c-torture/execute/ieee/920810-1.x b/gcc/testsuite/gcc.c-torture/execute/ieee/920810-1.x new file mode 100644 index 000..8edec730 --- /dev/null +++ b/gcc/testsuite/gcc.c-torture/execute/ieee/920810-1.x @@ -0,0 +1,4 @@ +if { [check_effective_target_newlib_nano_io] } { +lappend additional_flags "-Wl,-u,_printf_float" +} +return 0 diff --git a/gcc/testsuite/gcc.dg/torture/builtin-sprintf.c b/gcc/testsuite/gcc.dg/torture/builtin-sprintf.c index 6f8b7a9..5684fd7 100644 --- a/gcc/testsuite/gcc.dg/torture/builtin-sprintf.c +++ b/gcc/testsuite/gcc.dg/torture/builtin-sprintf.c @@ -1,6 +1,7 @@ /* PR tree-optimization/86274 - SEGFAULT when logging std::to_string(NAN) { dg-do run } - { dg-options "-O2 -Wall" } */ + { dg-options "-O2 -Wall" } + { dg-additional-options "-Wl,-u,_printf_float" { target newlib_nano_io } } */ #define X"0xdeadbeef" #define nan(x) __builtin_nan (x) diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp index 7488653..d696fc6 100644 --- a/gcc/testsuite/lib/target-supports.exp +++ b/gcc/testsuite/lib/target-supports.exp @@ -6691,6 +6691,10 @@ proc check_effective_target_newlib {} { #include }] } +# Return true if GCC was configured with --enable-newlib-nano-formatted-io 
+proc check_effective_target_newlib_nano_io { } { +return [check_configured_with "--enable-newlib-nano-formatted-io"] +} # Some newlib versions don't provide a frexpl and instead depend # on frexp to implement long double conversions in their printf-like -- 2.7.4
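Editorial sketch of what the affected tests rely on: they format a floating-point value through the printf family and compare the resulting text. The helper name formats_float is hypothetical. On a toolchain configured with --enable-newlib-nano-formatted-io, a program like this would presumably need the extra -Wl,-u,_printf_float link option that the patch adds, because float conversion is only linked in when that symbol is referenced; on a hosted C library no extra option is needed.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if the C library formats 1.5 with "%.1f" as the three
   characters "1.5", 0 otherwise.  BUF must hold at least 4 bytes.  */
static int
formats_float (char *buf, size_t size)
{
  assert (buf != NULL && size >= 4);
  return snprintf (buf, size, "%.1f", 1.5) == 3
	 && strcmp (buf, "1.5") == 0;
}
```

A library built without float support in its formatted I/O would be expected to return 0 here, which is the kind of failure the added linker option avoids.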
[PATCH 7/7][v2][MSP430][TESTSUITE] Fix tests for msp430-elf large memory model
On Wed, 14 Nov 2018 15:41:00 + Jozef Lawrynowicz wrote: > Patch 7 fixes tests for msp430-elf in the large memory model. Added missing documentation for new check_effective target procs in attached patch. >From 4cfb2ecd0e0580f69790fadd68b77e8a82992ef4 Mon Sep 17 00:00:00 2001 From: Jozef Lawrynowicz Date: Sat, 10 Nov 2018 16:08:44 + Subject: [PATCH] [TESTSUITE] Fix tests for msp430-elf large memory model 2018-11-21 Jozef Lawrynowicz gcc/ChangeLog: * doc/sourcebuild.texi: Document check_effective_target_msp430_large_mem. gcc/testsuite/ChangeLog: * gcc.c-torture/execute/991014-1.c: Fix bufsize definition for msp430 large memory model. * gcc.dg/Walloca-1.c: Don't expect warning for msp430 large memory model. * gcc.dg/Walloca-2.c: Likewise. * gcc.dg/c99-const-expr-2.c: Define ZERO macro for msp430 large memory model. * gcc.dg/format/format.h: Prefix typedefs using __SIZE_TYPE__ and __PTRDIFF_TYPE__ with __extension__. * gcc.dg/lto/20081210-1_0.c: Always typedef uintptr_t as __UINTPTR_TYPE__. * gcc.dg/pr36227.c: Likewise. * gcc.dg/pr42611.c: Use __INTPTR_MAX__ as the maximum object size if size_t and ptr_t are the same size. * gcc.dg/pr78973.c: dg-warning XFAIL for int16 but not msp430 large memory model. * gcc.dg/tree-ssa/builtin-sprintf-warn-3.c: Update dg-warning directives for msp430 large memory model. * gcc.dg/tree-ssa/pr66449.c: Always use __INTPTR_TYPE__ when integer type equal in size to ptr_t is required. * gcc.dg/tree-ssa/ssa-dom-thread-8.c: Extend pointer size checking macro for msp430. * lib/target-supports.exp (check_effective_target_msp430_large_mem): New. 
--- gcc/doc/sourcebuild.texi | 8 ++ gcc/testsuite/gcc.c-torture/execute/991014-1.c | 7 - gcc/testsuite/gcc.dg/Walloca-1.c | 4 +-- gcc/testsuite/gcc.dg/Walloca-2.c | 8 +++--- gcc/testsuite/gcc.dg/c99-const-expr-2.c| 2 ++ gcc/testsuite/gcc.dg/format/format.h | 6 ++-- gcc/testsuite/gcc.dg/lto/20081210-1_0.c| 8 +- gcc/testsuite/gcc.dg/pr36227.c | 10 +-- gcc/testsuite/gcc.dg/pr42611.c | 3 +- gcc/testsuite/gcc.dg/pr78973.c | 2 +- .../gcc.dg/tree-ssa/builtin-sprintf-warn-3.c | 32 +++--- gcc/testsuite/gcc.dg/tree-ssa/pr66449.c| 8 ++ gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-8.c | 8 +++--- gcc/testsuite/lib/target-supports.exp | 13 + 14 files changed, 66 insertions(+), 53 deletions(-) diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi index bfaa0fd..b5fac4e 100644 --- a/gcc/doc/sourcebuild.texi +++ b/gcc/doc/sourcebuild.texi @@ -1941,6 +1941,14 @@ when using the new ABI. MIPS target supports @code{-mpaired-single}. @end table +@subsubsection MSP430-specific attributes + +@table @code +@item msp430_large_mem +The MSP430 large memory model (enabled with @code{-mlarge} compiler flag) +is in use. +@end table + @subsubsection PowerPC-specific attributes @table @code diff --git a/gcc/testsuite/gcc.c-torture/execute/991014-1.c b/gcc/testsuite/gcc.c-torture/execute/991014-1.c index e0bcd6d..95e38ce 100644 --- a/gcc/testsuite/gcc.c-torture/execute/991014-1.c +++ b/gcc/testsuite/gcc.c-torture/execute/991014-1.c @@ -1,11 +1,16 @@ - typedef __SIZE_TYPE__ Size_t; +#ifdef __MSP430X_LARGE__ +/* size_t is __int20, so 20 bits, for __MSP430X_LARGE__, but __SIZEOF_POINTER__ + returns the bytesize which is 4. 
*/ +#define bufsize ((1L << (20 - 2))-256) +#else /* !__MSP430X_LARGE__ */ #if __SIZEOF_LONG__ < __SIZEOF_POINTER__ #define bufsize ((1LL << (8 * sizeof(Size_t) - 2))-256) #else #define bufsize ((1L << (8 * sizeof(Size_t) - 2))-256) #endif +#endif struct huge_struct { diff --git a/gcc/testsuite/gcc.dg/Walloca-1.c b/gcc/testsuite/gcc.dg/Walloca-1.c index 85e9160..c9a6c57 100644 --- a/gcc/testsuite/gcc.dg/Walloca-1.c +++ b/gcc/testsuite/gcc.dg/Walloca-1.c @@ -24,8 +24,8 @@ void foo1 (size_t len, size_t len2, size_t len3) char *s = alloca (123); useit (s); // OK, constant argument to alloca - s = alloca (num); // { dg-warning "large due to conversion" "" { target lp64 } } - // { dg-warning "unbounded use of 'alloca'" "" { target { ! lp64 } } .-1 } + s = alloca (num); // { dg-warning "large due to conversion" "" { target { { lp64 } || { msp430_large_mem } } } } + // { dg-warning "unbounded use of 'alloca'" "" { target { { ! lp64 } && { ! msp430_large_mem } } } .-1 } useit (s); s = alloca (3); /* { dg-warning "is too large" } */ diff --git a/gcc/testsuite/gcc.dg/Walloca-2.c b/gcc/testsuite/gcc.dg/Walloca-2.c index 766ff8d..446c811 100644 --- a/gcc/testsuite/gcc.dg/Walloca-2.c +++ b/gcc/testsuite/gcc.dg/Walloca-2.c @@ -13,7 +13,7 @@ g1 (int n) // 32-bit targets because VRP is not giving us any range info for // the argument to __builtin_alloca. This should be fixed by the // up
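Editorial sketch of the arithmetic behind the bufsize change in 991014-1.c: the original macro derives the width of size_t from 8 * sizeof(Size_t), but in the MSP430 large memory model size_t is a 20-bit type stored in 4 bytes, so that expression yields 32 rather than 20. The helper name bufsize_for_width is hypothetical and only reproduces the formula from the test.

```c
#include <assert.h>

/* Compute (1 << (VALUE_BITS - 2)) - 256, the test's buffer size for a
   size type with VALUE_BITS value bits.  */
static long long
bufsize_for_width (unsigned value_bits)
{
  assert (value_bits > 2 && value_bits <= 64);
  return (1LL << (value_bits - 2)) - 256;
}
```

With 20 value bits the result is 261888 bytes, which fits in a 20-bit size type; with the 32 bits implied by sizeof the result is 1073741568 bytes, far beyond what a 20-bit size type can express.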
[PATCH][libbacktrace] Factor out read_initial_length
Hi, this patch factors out a new function read_initial_length in dwarf.c. Bootstrapped and reg-tested on x86_64. OK for trunk? Thanks, - Tom [libbacktrace] Factor out read_initial_length 2018-11-22 Tom de Vries * dwarf.c (read_initial_length): Factor out of ... (build_address_map, read_line_info): ... here. --- libbacktrace/dwarf.c | 36 +--- 1 file changed, 21 insertions(+), 15 deletions(-) diff --git a/libbacktrace/dwarf.c b/libbacktrace/dwarf.c index c4f8732c7eb..4e93f120820 100644 --- a/libbacktrace/dwarf.c +++ b/libbacktrace/dwarf.c @@ -651,6 +651,25 @@ leb128_len (const unsigned char *p) return ret; } +/* Read initial_length from BUF and advance the appropriate number of bytes. */ + +static uint64_t +read_initial_length (struct dwarf_buf *buf, int *is_dwarf64) +{ + uint64_t len; + + len = read_uint32 (buf); + if (len == 0xffffffff) +{ + len = read_uint64 (buf); + *is_dwarf64 = 1; +} + else +*is_dwarf64 = 0; + + return len; +} + /* Free an abbreviations structure. */ static void @@ -1463,14 +1482,7 @@ build_address_map (struct backtrace_state *state, uintptr_t base_address, unit_data_start = info.buf; - is_dwarf64 = 0; - len = read_uint32 (&info); - if (len == 0xffffffff) - { - len = read_uint64 (&info); - is_dwarf64 = 1; - } - + len = read_initial_length (&info, &is_dwarf64); unit_buf = info; unit_buf.left = len; @@ -2002,13 +2014,7 @@ read_line_info (struct backtrace_state *state, struct dwarf_data *ddata, line_buf.data = data; line_buf.reported_underflow = 0; - is_dwarf64 = 0; - len = read_uint32 (&line_buf); - if (len == 0xffffffff) -{ - len = read_uint64 (&line_buf); - is_dwarf64 = 1; -} + len = read_initial_length (&line_buf, &is_dwarf64); line_buf.left = len; if (!read_line_header (state, u, is_dwarf64, &line_buf, hdr))
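Editorial sketch of the initial-length decoding that the factored-out function performs, written against a minimal stand-in buffer so it can be compiled on its own. The names byte_cursor, read_le and sketch_read_initial_length are hypothetical and are not libbacktrace's API. The sketch assumes little-endian input and uses the DWARF escape value 0xffffffff, which signals that a 64-bit length follows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for libbacktrace's struct dwarf_buf.  */
struct byte_cursor
{
  const unsigned char *p;	/* Next unread byte.  */
  size_t left;			/* Bytes remaining.  */
};

/* Read an N-byte little-endian value and advance the cursor.  */
static uint64_t
read_le (struct byte_cursor *c, size_t n)
{
  uint64_t v = 0;
  size_t i;

  assert (n <= 8 && c->left >= n);
  for (i = 0; i < n; i++)
    v |= (uint64_t) c->p[i] << (8 * i);
  c->p += n;
  c->left -= n;
  return v;
}

/* Read a DWARF initial length: a 32-bit value, or the escape
   0xffffffff followed by a 64-bit value.  Sets *IS_DWARF64.  */
static uint64_t
sketch_read_initial_length (struct byte_cursor *c, int *is_dwarf64)
{
  uint64_t len = read_le (c, 4);

  if (len == 0xffffffff)
    {
      len = read_le (c, 8);
      *is_dwarf64 = 1;
    }
  else
    *is_dwarf64 = 0;
  return len;
}
```

Unlike the callers removed by the patch, which cleared is_dwarf64 before reading, a helper of this shape always stores a value through the pointer, so callers no longer need to initialize the flag themselves.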
Re: [PATCH][libbacktrace] Factor out read_initial_length
Tom de Vries writes: > [libbacktrace] Factor out read_initial_length > > 2018-11-22 Tom de Vries > > * dwarf.c (read_initial_length): Factor out of ... > (build_address_map, read_line_info): ... here. This is OK. Thanks. Ian
Re: [C++ Patch] PR 84636 ("internal compiler error: Segmentation fault (identifier_p()/grokdeclarator())")
... in fact I'm thinking that the below - which directly checks for unqualified_id to be non-null in both places - may be a better variant: among other things it means that for related testcases like: typedef void a(); struct A { a a1: 1; }; we get the location of a1 right (we could also change the diagnostics in grokbitfield to use DECL_SOURCE_LOCATION and exploit it), and the testsuite doesn't need adjustments. Tested x86_64-linux. Thanks, Paolo. Index: cp/decl.c === --- cp/decl.c (revision 266339) +++ cp/decl.c (working copy) @@ -12165,7 +12165,8 @@ grokdeclarator (const cp_declarator *declarator, } if (ctype && TREE_CODE (type) == FUNCTION_TYPE && staticp < 2 - && !(identifier_p (unqualified_id) + && !(unqualified_id + && identifier_p (unqualified_id) && IDENTIFIER_NEWDEL_OP_P (unqualified_id))) { cp_cv_quals real_quals = memfn_quals; @@ -12245,8 +12246,7 @@ grokdeclarator (const cp_declarator *declarator, error ("invalid use of %<::%>"); return error_mark_node; } - else if (TREE_CODE (type) == FUNCTION_TYPE -|| TREE_CODE (type) == METHOD_TYPE) + else if (FUNC_OR_METHOD_TYPE_P (type) && unqualified_id) { int publicp = 0; tree function_context; Index: testsuite/g++.dg/parse/bitfield6.C === --- testsuite/g++.dg/parse/bitfield6.C (nonexistent) +++ testsuite/g++.dg/parse/bitfield6.C (working copy) @@ -0,0 +1,6 @@ +// PR c++/84636 + +typedef void a(); +struct A { +a: 1; // { dg-error "bit-field .\\. with non-integral type" } +};
Re: [PATCH][libbacktrace] Handle DW_FORM_GNU_strp_alt
On 21-11-18 02:03, Ian Lance Taylor wrote: > On Wed, Nov 14, 2018 at 6:45 AM, Tom de Vries wrote: >> On 14-11-18 14:25, Jakub Jelinek wrote: >>> On Wed, Nov 14, 2018 at 02:08:05PM +0100, Tom de Vries wrote: > +btest_dwz_CFLAGS = $(AM_CFLAGS) -g -O0 Hmm, I already discovered that specifying the -O0 doesn't work, since it's overridden by $(CFLAGS). With a hack like this: ... diff --git a/libbacktrace/Makefile.am b/libbacktrace/Makefile.am index 2fec9bbb4b6..8bdf13b3546 100644 --- a/libbacktrace/Makefile.am +++ b/libbacktrace/Makefile.am @@ -99,11 +99,14 @@ check_PROGRAMS += btest if HAVE_DWZ btest_dwz_SOURCES = btest_dwz.c testlib.c -btest_dwz_CFLAGS = $(AM_CFLAGS) -g -O0 +btest_dwz_CFLAGS = $(AM_CFLAGS) -g btest_dwz_LDADD = libbacktrace.la check_PROGRAMS += btest_dwz +btest_dwz-btest_dwz.o: btest_dwz.c + $(AM_V_CC)$(CC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) $(AM_CPPFLAGS) $(CPPFLAGS) $(btest_dwz_CFLAGS) $(CFLAGS) -O0 -c -o btest_dwz-btest_dwz.o `test -f 'btest_dwz.c' || echo '$(srcdir)/'`btest_dwz.c >>> >>> Can't you instead do something like: >>> btest_dwz.o: CFLAGS += -g -O0 >>> or something similar >> >> Hi, >> >> yes, that works, thanks. >> >>> (whatever the corresponding goal is)? >> >> The goal is to run the testcase with a setting lower than -O2, such that >> we can successfully run a substantial portion of the test without >> needing support for DW_FORM_GNU_ref_alt. >> >> [ At O2 we get constprop versions of some functions, which have an >> abstract origin, which tends to be moved to the common debug file by dwz >> -m, after which we need support for DW_FORM_GNU_ref_alt to get to the >> name of the function. ] >> >>> Otherwise, the patch looks generally ok to me, >> >> Great. >> >>> but yes, I've been wondering >>> how you can get away with DW_FORM_GNU_ref_alt not implemented properly. >>> >> >> Indeed, DW_FORM_GNU_ref_alt support is required to make this work in >> general. 
>> >> But I observed that implementing just DW_FORM_GNU_strp_alt improves on >> the current situation, so I thought it was worthwhile submitting this as >> a separate patch. >> >> Updated patch attached (which also rewrites btest_dwz.c to an include of >> btest.c, while disabling the inline tests that require DW_FORM_GNU_ref_alt). > > Unfortunately the tests don't pass for me. > > rm -f btest_dwz.debug > cp btest_dwz btest_dwz_2 > cp btest_dwz btest_dwz_3 > dwz -m btest_dwz.debug btest_dwz_2 btest_dwz_3 > FAIL: btest_dwz_2 > FAIL: btest_dwz_3 > >> libbacktrace/btest_dwz_2 > test1: [0]: missing file name or function name > FAIL: backtrace_full noinline > test3: [0]: missing file name or function name > FAIL: backtrace_simple noinline > PASS: backtrace_syminfo variable > >> libbacktrace/btest_dwz_3 > test1: [0]: missing file name or function name > FAIL: backtrace_full noinline > test3: [0]: missing file name or function name > FAIL: backtrace_simple noinline > PASS: backtrace_syminfo variable > Hmm, I can't reproduce that. I'm reworking this patch into a patch series that includes also support for DW_FORM_GNU_ref_alt, so I'm hoping that that will fix the failures you're seeing. >> +#define INLINE_TESTS 0 >> +#include "btest.c" > > Please avoid this kind of #include game. If you need to skip some > tests (why?) use a command line option. If you need to compile with > different options, use automake features. > The patch series with DW_FORM_GNU_ref_alt support added no longer requires this. >> +elf_open_debugfile_by_debugaltlink (struct backtrace_state *state, > > Do we need this function? It seems to be the same as > elf_find_debugfile_by_debuglink. Hmm, that's right. I've now updated the patch in my patch series. I'll resubmit once the fix for PR88063 is in trunk (I need the keeping track of units that that patch adds, for DW_FORM_GNU_ref_alt support). Thanks for the review, - Tom
Re: C++ PATCH to implement P1094R2, Nested inline namespaces
On Tue, Nov 20, 2018 at 04:59:46PM -0500, Jason Merrill wrote: > On 11/19/18 5:12 PM, Marek Polacek wrote: > > On Mon, Nov 19, 2018 at 10:33:17PM +0100, Jakub Jelinek wrote: > > > On Mon, Nov 19, 2018 at 04:21:19PM -0500, Marek Polacek wrote: > > > > 2018-11-19 Marek Polacek > > > > > > > > Implement P1094R2, Nested inline namespaces. > > > > * g++.dg/cpp2a/nested-inline-ns1.C: New test. > > > > * g++.dg/cpp2a/nested-inline-ns2.C: New test. > > > > * g++.dg/cpp2a/nested-inline-ns3.C: New test. > > > > > > Just a small testsuite comment. > > > > > > > --- /dev/null > > > > +++ gcc/testsuite/g++.dg/cpp2a/nested-inline-ns1.C > > > > @@ -0,0 +1,26 @@ > > > > +// P1094R2 > > > > +// { dg-do compile { target c++2a } } > > > > > > Especially because 2a testing isn't included by default, but also > > > to make sure it works right even with -std=c++17, wouldn't it be better to > > > drop the nested-inline-ns3.C test, make this test c++17 or > > > even better always enabled, add dg-options "-Wpedantic" and > > > just add dg-warning with c++17_down and c++14_down what should be > > > warned on the 3 lines (with .-1 for c++14_down)? > > > > > > Or if you want add some further testcases that will test how > > > c++17 etc. will dg-error on those with -pedantic-errors etc. > > > > Sure, I've made it { target c++11 } and dropped the third test: > > > > Bootstrapped/regtested on x86_64-linux, ok for trunk? > > > > 2018-11-19 Marek Polacek > > > > Implement P1094R2, Nested inline namespaces. > > * parser.c (cp_parser_namespace_definition): Parse the optional inline > > keyword in a nested-namespace-definition. Adjust push_namespace call. > > Formatting fix. > > > > * g++.dg/cpp2a/nested-inline-ns1.C: New test. > > * g++.dg/cpp2a/nested-inline-ns2.C: New test. 
> > > > diff --git gcc/cp/parser.c gcc/cp/parser.c > > index 292cce15676..f39e9d753d2 100644 > > --- gcc/cp/parser.c > > +++ gcc/cp/parser.c > > @@ -18872,6 +18872,7 @@ cp_parser_namespace_definition (cp_parser* parser) > > cp_ensure_no_oacc_routine (parser); > > bool is_inline = cp_lexer_next_token_is_keyword (parser->lexer, > > RID_INLINE); > > + const bool topmost_inline_p = is_inline; > > if (is_inline) > > { > > @@ -18890,6 +18891,17 @@ cp_parser_namespace_definition (cp_parser* parser) > > { > > identifier = NULL_TREE; > > + bool nested_inline_p = cp_lexer_next_token_is_keyword (parser->lexer, > > +RID_INLINE); > > + if (nested_inline_p && nested_definition_count != 0) > > + { > > + if (cxx_dialect < cxx2a) > > + pedwarn (cp_lexer_peek_token (parser->lexer)->location, > > +OPT_Wpedantic, "nested inline namespace definitions only " > > +"available with -std=c++2a or -std=gnu++2a"); > > + cp_lexer_consume_token (parser->lexer); > > + } > > This looks like we won't get any diagnostic in lower conformance modes if > there are multiple namespace scopes before the inline keyword. If you mean something like namespace A::B:C::inline D { } then we do get a diagnostic. nested-inline-ns1.C tests that. Or do you mean something else? > > if (cp_lexer_next_token_is (parser->lexer, CPP_NAME)) > > { > > identifier = cp_parser_identifier (parser); > > @@ -18904,7 +18916,12 @@ cp_parser_namespace_definition (cp_parser* parser) > > } > > if (cp_lexer_next_token_is_not (parser->lexer, CPP_SCOPE)) > > - break; > > + { > > + /* Don't forget that the innermost namespace might have been > > +marked as inline. */ > > + is_inline |= nested_inline_p; > > This looks wrong: an inline namespace does not make its nested namespaces > inline as well. A nested namespace definition cannot be inline. This is supposed to handle cases such as namespace A::B::inline C { ... } because after 'C' we don't see :: so it breaks and we call push_namespace outside the for loop. 
So I still don't see a bug; do you have a test that I got wrong? Marek
Re: [PATCH, libphobos] Fix libphobos.shared testsuite for multilib tests
On Sat, 17 Nov 2018 at 16:07, Johannes Pfau wrote: > > Hi, > > the loadDR test in the libphobos.shared testsuite tries to dynamically load > the phobos library. The path for the library currently points to the main > multilib variant phobos library, causing other multilib variants to fail the > test. The attached patch uses $blddir instead of $objdir to fix this issue. > > --- > libphobos/ChangeLog: > > 2018-11-17 Johannes Pfau > > PR d/87824 > * testsuite/libphobos.shared/shared.exp: Set proper path to phobos > library for multilib builds. > > diff --git a/libphobos/testsuite/libphobos.shared/shared.exp > b/libphobos/testsuite/libphobos.shared/shared.exp > index b3bdd..623e06259 100644 > --- a/libphobos/testsuite/libphobos.shared/shared.exp > +++ b/libphobos/testsuite/libphobos.shared/shared.exp > @@ -94,7 +94,7 @@ if { [is-effective-target dlopen] && [is-effective-target > pthread] } { > dg-test "$srcdir/$subdir/host.c" "-ldl -pthread" "$DEFAULT_CFLAGS" > > # Test requires a command line argument to be passed to the program. > -set libphobos_run_args "$objdir/../src/.libs/libgphobos.so" > +set libphobos_run_args "${blddir}/src/.libs/libgphobos.${shlib_ext}" > dg-test "$srcdir/$subdir/loadDR.c" "-ldl -pthread -g" "$DEFAULT_CFLAGS" > set libphobos_run_args "" > } OK. I've checked and committed this; however, perhaps we should get you set up with write-after-approval access. -- Iain
Re: [PATCH] x86: Add -march=cascadelake
Jakub, Thanks for the comments! I have addressed them as attached. Wei gcc/ * common/config/i386/i386-common.c (processor_names): Add cascadelake. (processor_alias_table): Add cascadelake. * config.gcc: Add -march=cascadelake. * config/i386/driver-i386.c (host_detect_local_cpu): Detect cascadelake. * config/i386/i386-c.c (ix86_target_macros_internal): Handle cascadelake. * config/i386/i386.c (ix86_cost): Add m_CASCADELAKE. (processor_cost_table): Add cascadelake. (get_builtin_code_for_version): Handle cascadelake. (fold_builtin_cpu): Ditto. * config/i386/i386.h (TARGET_CASCADELAKE, PROCESSOR_CASCADELAKE): New. (PTA_CASCADELAKE): Ditto. * doc/invoke.texi: Add -march=cascadelake. gcc/testsuite/ * g++.target/i386/mv16.C: Handle new march. * gcc.target/i386/funcspec-56.inc: Ditto. libgcc/ * config/i386/cpuinfo.h: Add INTEL_COREI7_CASCADELAKE. On Wed, Nov 21, 2018 at 7:09 PM, Jakub Jelinek wrote: > > On Wed, Nov 21, 2018 at 06:23:41PM +0800, Wei Xiao wrote: > > The attached patch added -march=cascadelake for x86. > > Tested with bootstrap and regression tests on x86_64. No regressions. > > Is it ok for trunk? > > Not a real review, just nits: > > index bff4dfb..f7c1c98 100644 > --- a/gcc/ChangeLog > +++ b/gcc/ChangeLog > @@ -1,3 +1,18 @@ > +2018-11-21 Wei Xiao > > Two spaces after date, two spaces before <. > > --- a/gcc/config/i386/driver-i386.c > +++ b/gcc/config/i386/driver-i386.c > @@ -857,6 +857,9 @@ const char *host_detect_local_cpu (int argc, const char > **argv) > /* Assume Ice Lake. */ > else if (has_gfni) > cpu = "icelake-client"; > + /* Assume Cascade Lake. */ > + if (has_avx512vnni) > + cpu = "cascadelake"; > /* Assume Cannon Lake. */ > else if (has_avx512vbmi) > cpu = "cannonlake"; > > Doesn't this break handling of all the other CPUs? I mean, it is a large > if (cond) ... else if (cond) ... else if (cond) ... else ... > but you've added if without else before it into the middle. > > Jakub cascadelake-v2.diff Description: Binary data
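Editorial sketch of the problem Jakub points out in the quoted driver-i386.c hunk: a bare if placed in the middle of an if / else if chain ends the first chain and starts a second one, so a result chosen earlier can be overwritten. The three-flag model, the enum and both function names are hypothetical; this is not the real host_detect_local_cpu logic, only the control-flow shape.

```c
#include <assert.h>

/* Hypothetical CPU kinds and feature flags, for illustration only.  */
enum cpu_kind
{
  CPU_DEFAULT,
  CPU_ICELAKE_CLIENT,
  CPU_CASCADELAKE,
  CPU_CANNONLAKE
};

struct cpu_flags
{
  int has_gfni;
  int has_avx512vnni;
  int has_avx512vbmi;
};

/* Shape of the reviewed hunk: the bare `if' splits the chain in two.  */
static enum cpu_kind
detect_split_chain (const struct cpu_flags *f)
{
  enum cpu_kind cpu = CPU_DEFAULT;

  assert (f);
  if (f->has_gfni)
    cpu = CPU_ICELAKE_CLIENT;
  if (f->has_avx512vnni)	/* Starts a second, independent chain.  */
    cpu = CPU_CASCADELAKE;
  else if (f->has_avx512vbmi)
    cpu = CPU_CANNONLAKE;
  else
    cpu = CPU_DEFAULT;		/* Overwrites the result chosen above.  */
  return cpu;
}

/* One continuous chain: exactly one branch decides the result.  */
static enum cpu_kind
detect_single_chain (const struct cpu_flags *f)
{
  assert (f);
  if (f->has_gfni)
    return CPU_ICELAKE_CLIENT;
  else if (f->has_avx512vnni)
    return CPU_CASCADELAKE;
  else if (f->has_avx512vbmi)
    return CPU_CANNONLAKE;
  else
    return CPU_DEFAULT;
}
```

For a flag set with only has_gfni set, the split version falls through to the final else of the second chain and loses the earlier choice, while the single chain keeps it. Writing the new test as else if keeps the chain intact.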
Re: [PATCH v3] [aarch64] Add CPU support for Ampere Computing's eMAG.
One small comment. On Tue, Nov 20, 2018 at 10:01 AM Christoph Muellner wrote: > > Tested with "make check" and no regressions found. > > This patch depends on the struct xgene1_prefetch_tune, > which has been acknowledged already: > https://gcc.gnu.org/ml/gcc-patches/2018-11/msg00985.html > > *** gcc/ChangeLog *** > > 2018-xx-xx Christoph Muellner > > * config/aarch64/aarch64-cores.def: Define emag. > * config/aarch64/aarch64-tune.md: Regenerated with emag. > * config/aarch64/aarch64.c (emag_tunings): New struct. > * doc/invoke.texi: Document mtune value. > > Signed-off-by: Christoph Muellner > --- > gcc/config/aarch64/aarch64-cores.def | 3 +++ > gcc/config/aarch64/aarch64-tune.md | 2 +- > gcc/config/aarch64/aarch64.c | 25 + > gcc/doc/invoke.texi | 2 +- > 4 files changed, 30 insertions(+), 2 deletions(-) > > diff --git a/gcc/config/aarch64/aarch64-cores.def > b/gcc/config/aarch64/aarch64-cores.def > index 1f3ac56..68cca00 100644 > --- a/gcc/config/aarch64/aarch64-cores.def > +++ b/gcc/config/aarch64/aarch64-cores.def > @@ -61,6 +61,9 @@ AARCH64_CORE("thunderxt88", thunderxt88, thunderx, 8A, > AARCH64_FL_FOR_ARCH > AARCH64_CORE("thunderxt81", thunderxt81, thunderx, 8A, > AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx, 0x43, > 0x0a2, -1) > AARCH64_CORE("thunderxt83", thunderxt83, thunderx, 8A, > AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx, 0x43, > 0x0a3, -1) > > +/* Ampere Computing cores. */ > +AARCH64_CORE("emag",emag, xgene1,8A, AARCH64_FL_FOR_ARCH8 > | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, emag, 0x50, 0x000, 3) I think you should add a comment to say why this order is required like above for thunderxt88p1. Thanks, Andrew Pinski > + > /* APM ('P') cores. 
*/ > AARCH64_CORE("xgene1", xgene1,xgene1,8A, AARCH64_FL_FOR_ARCH8, > xgene1, 0x50, 0x000, -1) > > diff --git a/gcc/config/aarch64/aarch64-tune.md > b/gcc/config/aarch64/aarch64-tune.md > index fade1d4..2fc7f03 100644 > --- a/gcc/config/aarch64/aarch64-tune.md > +++ b/gcc/config/aarch64/aarch64-tune.md > @@ -1,5 +1,5 @@ > ;; -*- buffer-read-only: t -*- > ;; Generated automatically by gentune.sh from aarch64-cores.def > (define_attr "tune" > - > "cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,ares,tsv110,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55" > + > "cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,emag,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,ares,tsv110,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55" > (const (symbol_ref "((enum attr_tune) aarch64_tune)"))) > diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c > index f7f88a9..995aafe 100644 > --- a/gcc/config/aarch64/aarch64.c > +++ b/gcc/config/aarch64/aarch64.c > @@ -957,6 +957,31 @@ static const struct tune_params xgene1_tunings = >&xgene1_prefetch_tune > }; > > +static const struct tune_params emag_tunings = > +{ > + &xgene1_extra_costs, > + &xgene1_addrcost_table, > + &xgene1_regmove_cost, > + &xgene1_vector_cost, > + &generic_branch_cost, > + &xgene1_approx_modes, > + 6, /* memmov_cost */ > + 4, /* issue_rate */ > + AARCH64_FUSE_NOTHING, /* fusible_ops */ > + "16",/* function_align. */ > + "16",/* jump_align. */ > + "16",/* loop_align. */ > + 2, /* int_reassoc_width. */ > + 4, /* fp_reassoc_width. */ > + 1, /* vec_reassoc_width. 
*/ > + 2, /* min_div_recip_mul_sf. */ > + 2, /* min_div_recip_mul_df. */ > + 17, /* max_case_values. */ > + tune_params::AUTOPREFETCHER_OFF, /* autoprefetcher_model. */ > + (AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS), /* tune_flags. */ > + &xgene1_prefetch_tune > +}; > + > static const struct tune_params qdf24xx_tunings = > { >&qdf24xx_extra_costs, > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi > index e016dce..ac81fb2 100644 > --- a/gcc/doc/invoke.texi > +++ b/gcc/doc/invoke.texi > @@ -15288,7 +15288,7 @@ Specify the name of the target processor for which > GCC should tune the > performance of the code. Permissible values for this option are: > @samp{generic}, @samp{cortex-a35}, @samp{cortex-a53}, @samp{cortex-a55}, > @samp{cortex-a57}, @samp{cortex-a72}, @samp{cortex-a73}, @samp{cortex-a75}, > -@samp{cortex-a76}, @samp{ares}, @samp{exynos-m1}, @samp{falkor}, > +@samp{cortex-a76}, @samp{ares}, @samp{exynos-m1}, @samp{emag}, @samp{falkor}, > @samp{qdf24xx}, @samp{saphira}, @samp{phecda}, @