Re: Patch ping (Re: [PATCH] Fortran include line fixes and -fdec-include support)

2018-11-21 Thread Jakub Jelinek
On Wed, Nov 21, 2018 at 08:31:17AM +0100, Thomas Koenig wrote:
> > I'd like to ping this patch, ok for trunk?
> 
> OK. Thanks for the patch!

Thanks.

> Before 9.0 is released, we should also document the flag
> (and the extension it supports) in the manual, and note it
> in changes.html and on the Wiki.  Would you also do that?

Like this?  Ok for trunk/wwwdocs?

2018-11-21  Jakub Jelinek  

* invoke.texi (-fdec-include): Document.

--- gcc/fortran/invoke.texi.jj  2018-08-26 22:42:19.907823618 +0200
+++ gcc/fortran/invoke.texi 2018-11-21 09:14:21.449174232 +0100
@@ -119,7 +119,7 @@ by type.  Explanations are in the follow
 @gccoptlist{-fall-intrinsics -fbackslash -fcray-pointer -fd-lines-as-code @gol
 -fd-lines-as-comments @gol
 -fdec -fdec-structure -fdec-intrinsic-ints -fdec-static -fdec-math @gol
--fdefault-double-8 -fdefault-integer-8 -fdefault-real-8 @gol
+-fdec-include -fdefault-double-8 -fdefault-integer-8 -fdefault-real-8 @gol
 -fdefault-real-10 -fdefault-real-16 -fdollar-ok -ffixed-line-length-@var{n} @gol
 -ffixed-line-length-none -ffree-form -ffree-line-length-@var{n} @gol
 -ffree-line-length-none -fimplicit-none -finteger-4-integer-8 @gol
@@ -277,6 +277,12 @@ functions (e.g. TAND, ATAND, etc...) for
 Enable DEC-style STATIC and AUTOMATIC attributes to explicitly specify
 the storage of variables and other objects.
 
+@item -fdec-include
+@opindex @code{fdec-include}
+Enable parsing of INCLUDE as a statement in addition to parsing it as
+INCLUDE line.  When parsed as INCLUDE statement, INCLUDE does not have to
+be on a single line and can use line continuations.
+
 @item -fdollar-ok
 @opindex @code{fdollar-ok}
 @cindex @code{$}


Jakub
--- gcc-9/changes.html.jj   2018-11-14 17:46:10.747799079 +0100
+++ gcc-9/changes.html  2018-11-21 09:23:48.974896385 +0100
@@ -118,6 +118,14 @@ a work-in-progress.
   the IEEE_IS_NAN function from the intrinsic
   module IEEE_ARITHMETIC.
   
+  
+A new command line option -fdec-include, set also
+by -fdec option, has been added for an extension
+for compatibility with legacy code.  With this option,
+INCLUDE directive is parsed also as a statement,
+which allows the directive to be written on multiple source lines
+with line continuations.
+  
 
 
 


[RFC, RFT PATCH, mingw]: Do not cancel vzeroupper when XMM registers live across call

2018-11-21 Thread Uros Bizjak
Hello!

Before vzeroupper gets emitted before function call, the compiler
checks if there are live call-saved SSE registers at the insertion
point. This functionality is intended to handle Windows ABI, so we
don't clear upper parts of the XMM registers that live across the
call.

However, the called function saves only lower 128bit part of the XMM
register, so it seems that wider modes have to be saved and restored
by the caller function anyway. If this is the case, we don't have to
cancel vzeroupper insertion before the call.

Attached patch removes this cancellation, since all other ABIs clobber
all XMM registers.
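
To make the proposed change concrete, here is a minimal, self-contained sketch of the decision logic before and after the patch. It is a toy model and not GCC code: the 16-bit masks stand in for HARD_REG_SET and call_used_regs, and the function names are hypothetical. The Windows-like mask in the usage note assumes xmm0 to xmm5 are call-clobbered and xmm6 to xmm15 are call-saved.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_SSE_REGS 16

/* Old behaviour (toy model): cancel vzeroupper when any live SSE
   register is call-saved, i.e. not in the call_used mask.  */
static bool
old_emit_vzeroupper (uint32_t regs_live, uint32_t call_used)
{
  for (int i = 0; i < NUM_SSE_REGS; i++)
    if (((regs_live >> i) & 1) && !((call_used >> i) & 1))
      return false;
  return true;
}

/* Proposed behaviour (toy model): always emit vzeroupper, on the
   assumption that the caller has to save and restore the wide part
   of such registers anyway.  */
static bool
new_emit_vzeroupper (uint32_t regs_live, uint32_t call_used)
{
  (void) regs_live;
  (void) call_used;
  return true;
}
```

With a Windows-like mask of 0x003F and xmm6 live, old_emit_vzeroupper returns false while new_emit_vzeroupper returns true; with all registers call-clobbered (mask 0xFFFF) both return true, which is why only the Windows ABI is affected.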

2018-11-21  Uros Bizjak

* config/i386/i386.c (ix86_avx_emit_vzeroupper): Remove.
(ix86_emit_mode_set) : Emit vzeroupper here.

The patch is untested, since I have no Windows target here. Daniel,
can you please review the above assumptions and test the patch on
Windows target?

Uros.
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index c18c60a1d191..598165103716 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -19167,37 +19167,11 @@ emit_i387_cw_initialization (int mode)
   emit_move_insn (new_mode, reg);
 }
 
-/* Emit vzeroupper.  */
-
-void
-ix86_avx_emit_vzeroupper (HARD_REG_SET regs_live)
-{
-  int i;
-
-  /* Cancel automatic vzeroupper insertion if there are
- live call-saved SSE registers at the insertion point.  */
-
-  for (i = FIRST_SSE_REG; i <= LAST_SSE_REG; i++)
-if (TEST_HARD_REG_BIT (regs_live, i) && !call_used_regs[i])
-  return;
-
-  if (TARGET_64BIT)
-for (i = FIRST_REX_SSE_REG; i <= LAST_REX_SSE_REG; i++)
-  if (TEST_HARD_REG_BIT (regs_live, i) && !call_used_regs[i])
-   return;
-
-  emit_insn (gen_avx_vzeroupper ());
-}
-
 /* Generate one or more insns to set ENTITY to MODE.  */
 
-/* Generate one or more insns to set ENTITY to MODE.  HARD_REG_LIVE
-   is the set of hard registers live at the point where the insn(s)
-   are to be inserted.  */
-
 static void
 ix86_emit_mode_set (int entity, int mode, int prev_mode ATTRIBUTE_UNUSED,
-   HARD_REG_SET regs_live)
+   HARD_REG_SET regs_live ATTRIBUTE_UNUSED)
 {
   switch (entity)
 {
@@ -19207,7 +19181,7 @@ ix86_emit_mode_set (int entity, int mode, int prev_mode ATTRIBUTE_UNUSED,
   break;
 case AVX_U128:
   if (mode == AVX_U128_CLEAN)
-   ix86_avx_emit_vzeroupper (regs_live);
+   emit_insn (gen_avx_vzeroupper ());
   break;
 case I387_TRUNC:
 case I387_FLOOR:


Re: Fix PR rtl-optimization/85925

2018-11-21 Thread Eric Botcazou
> This is saying that *every* op except those very few works on the full
> register.  And that for every architecture that has W_R_O.

That's still progress over the previous situation.

> It also only looks at the top code in the RTL, so it will say for example
> a rotate-and-mask is just fine, while that isn't true.

Not clear whether this needs to be recursive because nonzero_bits1 and 
num_sign_bit_copies1 already recurse on RTXes.

-- 
Eric Botcazou


Re: Stream TREE_TYPE of TYPE_DECLs again

2018-11-21 Thread Richard Biener
On Wed, 21 Nov 2018, Jan Hubicka wrote:

> Hi,
> this patch recovers location information in the ODR warnings.
> Because location info is not attached to types but corresponding
> TYPE_DECLs, we need to prevent TYPE_DECLs to be merged when
> corresponding types are not merged.
> 
> To achieve this I no longer clear TREE_TYPE of TYPE_DECLs which
> puts them back to the same SCC as the type itself.  While making
> incomplete type variant we need to produce copy of TYPE_DECL. Because
> it is possible that TYPE_DECL was not processed by free lang data
> we can not do copy_node but build it from scratch (because 
> the toplevel loops possibly processed all decls). This is not hard
> to do, but made me notice few extra flags that are streamed for
> TYPE_DECLs and free_lang_data is not seeing them.
> 
> I have also extended ipa-devirt to get rid of the duplicated decls
> once ODR warnings are done to save ltrans streaming (it actually
> added about 10% of ltrans data w/o this change)
> 
> I have checked that the patch does not increase number of type
> duplicates for cc1 (24), I will also re-do testing for Firefox
> which may uncover some extra flags/attributes to care about.
> 
> lto-bootstrapped/regtested x86_64-linux OK?

OK if you put a comment ...

> Honza
>   
>   PR lto/87957
>   * tree.c (fld_decl_context): Break out from ...
>   (free_lang_data_in_decl): ... here; free TREE_PUBLIC, TREE_PRIVATE
>   DECL_ARTIFICIAL of TYPE_DECL; do not free TREE_TYPE of TYPE_DECL.
>   (fld_incomplete_type_of): Build copy of TYPE_DECL.
>   * ipa-devirt.c (free_enum_values): Rename to ...
>   (free_odr_warning_data): ... this one; free also duplicated TYPE_DECLs
>   and TREE_TYPEs of TYPE_DECLs.
> 
> Index: tree.c
> ===
> --- tree.c(revision 266325)
> +++ tree.c(working copy)
> @@ -5206,6 +5206,24 @@ fld_process_array_type (tree t, tree t2,
>return array;
>  }
>  
> +/* Return CTX after removal of contexts that are not relevant  */
> +
> +static tree
> +fld_decl_context (tree ctx)
> +{
> +  /* Variably modified types are needed for tree_is_indexable to decide
> + whether the type needs to go to local or global section.
> + This code is semi-broken but for now it is easiest to keep contexts
> + as expected.  */
> +  if (ctx && TYPE_P (ctx)
> +  && !variably_modified_type_p (ctx, NULL_TREE))
> + {
> +   while (ctx && TYPE_P (ctx))
> +  ctx = TYPE_CONTEXT (ctx);
> + }
> +  return ctx;
> +}
> +
>  /* For T being aggregate type try to turn it into a incomplete variant.
> Return T if no simplification is possible.  */
>  
> @@ -5267,6 +5285,27 @@ fld_incomplete_type_of (tree t, struct f
>   }
> else
>   TYPE_VALUES (copy) = NULL;
> +
> +   /* Build copy of TYPE_DECL in TYPE_NAME if necessary.
> +  This is needed for ODR violation warnings to come out right (we
> +  want duplicate TYPE_DECLs whenever the type is duplicated because
> +  of ODR violation.  Because lang data in the TYPE_DECL may not
> +  have been freed yet, rebuild it from scratch and copy relevant
> +  fields.  */
> +   TYPE_NAME (copy) = fld_simplified_type_name (copy);
> +   tree name = TYPE_NAME (copy);
> +
> +   if (name && TREE_CODE (name) == TYPE_DECL)
> + {
> +   gcc_checking_assert (TREE_TYPE (name) == t);
> +   tree name2 = build_decl (DECL_SOURCE_LOCATION (name), TYPE_DECL,
> +DECL_NAME (name), copy);
> +   SET_DECL_ASSEMBLER_NAME (name2, DECL_ASSEMBLER_NAME (name));
> +   SET_DECL_ALIGN (name2, 0);
> +   DECL_CONTEXT (name2) = fld_decl_context
> +  (DECL_CONTEXT (name));
> +   TYPE_NAME (copy) = name2;
> + }
>   }
>return copy;
> }
> @@ -5649,12 +5688,13 @@ free_lang_data_in_decl (tree decl, struc
>  {
>DECL_VISIBILITY (decl) = VISIBILITY_DEFAULT;
>DECL_VISIBILITY_SPECIFIED (decl) = 0;
> -  /* TREE_PUBLIC is used to tell if type is anonymous.  */
> +  TREE_PUBLIC (decl) = 0;
> +  TREE_PRIVATE (decl) = 0;
> +  DECL_ARTIFICIAL (decl) = 0;
>TYPE_DECL_SUPPRESS_DEBUG (decl) = 0;
>DECL_INITIAL (decl) = NULL_TREE;
>DECL_ORIGINAL_TYPE (decl) = NULL_TREE;
>DECL_MODE (decl) = VOIDmode;
> -  TREE_TYPE (decl) = void_type_node;
>SET_DECL_ALIGN (decl, 0);
>  }
>else if (TREE_CODE (decl) == FIELD_DECL)
> @@ -5688,20 +5728,7 @@ free_lang_data_in_decl (tree decl, struc
>if (TREE_CODE (decl) != FIELD_DECL
>&& ((TREE_CODE (decl) != VAR_DECL && TREE_CODE (decl) != FUNCTION_DECL)
>|| !DECL_VIRTUAL_P (decl)))
> -{
> -  tree ctx = DECL_CONTEXT (decl);
> -  /* Variably modified types are needed for tree_is_indexable to decide
> -  whether the type needs to go to local or global se

Re: [PATCH] apply_subst_iterator: Handle define_split/define_insn_and_split

2018-11-21 Thread Richard Biener
On Fri, Oct 26, 2018 at 9:44 AM H.J. Lu  wrote:
>
> On 10/25/18, Uros Bizjak  wrote:
> > On Fri, Oct 26, 2018 at 8:48 AM H.J. Lu  wrote:
> >>
> >> On 10/25/18, Uros Bizjak  wrote:
> >> > On Fri, Oct 26, 2018 at 8:07 AM H.J. Lu  wrote:
> >> >>
> >> >> * read-rtl.c (apply_subst_iterator): Handle
> >> >> define_insn_and_split.
> >> >> ---
> >> >>  gcc/read-rtl.c | 6 --
> >> >>  1 file changed, 4 insertions(+), 2 deletions(-)
> >> >>
> >> >> diff --git a/gcc/read-rtl.c b/gcc/read-rtl.c
> >> >> index d698dd4af4d..5957c29671a 100644
> >> >> --- a/gcc/read-rtl.c
> >> >> +++ b/gcc/read-rtl.c
> >> >> @@ -275,9 +275,11 @@ apply_subst_iterator (rtx rt, unsigned int, int
> >> >> value)
> >> >>if (value == 1)
> >> >>  return;
> >> >>gcc_assert (GET_CODE (rt) == DEFINE_INSN
> >> >> + || GET_CODE (rt) == DEFINE_INSN_AND_SPLIT
> >> >>   || GET_CODE (rt) == DEFINE_EXPAND);
> >> >
> >> > Can we also handle DEFINE_SPLIT here?
> >> >
> >>
> >> Yes, we could if there were a usage for it.  I am reluctant to add
> >> something
> >> I have no use nor test for.
> >
> > Just split one define_insn_and_split to define_insn and corresponding
> > define_split.
> >
> > define_insn_and_split is a contraction for the define_insn and
> > corresponding define_split, so it looks weird to only handle
> > define_insn_and_split without handling define_split.
> >
>
> Here is the updated patch to handle define_split.  Tested with

OK.

> (define_insn "*sse4_1_v8qiv8hi2_2"
>   [(set (match_operand:V8HI 0 "register_operand")
> (any_extend:V8HI
>   (vec_select:V8QI
> (subreg:V16QI
>   (vec_concat:V2DI
> (match_operand:DI 1 "memory_operand")
> (const_int 0)) 0)
> (parallel [(const_int 0) (const_int 1)
>(const_int 2) (const_int 3)
>(const_int 4) (const_int 5)
>(const_int 6) (const_int 7)]]
>   "TARGET_SSE4_1 &&  && "
>   "#")
>
> (define_split
>   [(set (match_operand:V8HI 0 "register_operand")
> (any_extend:V8HI
>   (vec_select:V8QI
> (subreg:V16QI
>   (vec_concat:V2DI
> (match_operand:DI 1 "memory_operand")
> (const_int 0)) 0)
> (parallel [(const_int 0) (const_int 1)
>(const_int 2) (const_int 3)
>(const_int 4) (const_int 5)
>(const_int 6) (const_int 7)]]
>   "TARGET_SSE4_1 &&  && 
>&& can_create_pseudo_p ()"
>   [(set (match_dup 0)
> (any_extend:V8HI (match_dup 1)))]
> {
>   operands[1] = adjust_address_nv (operands[1], V8QImode, 0);
> })
>
> --
> H.J.


Re: [PATCH, middle-end]: Fix PR 88070, ICE in create_pre_exit, at mode-switching.c:438

2018-11-21 Thread Uros Bizjak
On Wed, Nov 21, 2018 at 12:46 AM Jeff Law  wrote:
>
> On 11/19/18 12:58 PM, Uros Bizjak wrote:
> > Hello!
> >
> > The assert in create_pre_exit at mode-switching.c expects return copy
> > pair with nothing in between. However, the compiler starts mode
> > switching pass with the following sequence:
> >
> > (insn 19 18 16 2 (set (reg:V2SF 21 xmm0)
> > (mem/c:V2SF (plus:DI (reg/f:DI 7 sp)
> > (const_int -72 [0xffb8])) [0  S8 A64]))
> > "pr88070.c":8 1157 {*movv2sf_internal}
> >  (nil))
> > (insn 16 19 20 2 (set (reg:V2SF 0 ax [orig:91  ] [91])
> > (reg:V2SF 0 ax [89])) "pr88070.c":8 1157 {*movv2sf_internal}
> >  (nil))
> > (insn 20 16 21 2 (unspec_volatile [
> > (const_int 0 [0])
> > ] UNSPECV_BLOCKAGE) "pr88070.c":8 710 {blockage}
> >  (nil))
> > (insn 21 20 23 2 (use (reg:V2SF 21 xmm0)) "pr88070.c":8 -1
> >  (nil))
> So I know there's an updated patch.  But I thought it might be worth
> mentioning that insn 16 here appears to be a nop-move.   Removing it
> might address this instance of the problem, but I doubt it's general
> enough to address any larger issues.
>
> You still might want to investigate why it's still in the IL.

Oh yes, I remember this.

These nop-moves were removed in Vlad's patch [1],[2]:

2013-10-25  Vladimir Makarov 

...
* lra-spills.c (lra_final_code_change): Remove useless move insns.

Which regressed vzeroupper insertion pass [3] that was reported in [4].

The functionality was later reverted in [5]:

2013-10-26  Vladimir Makarov  

Revert:
2013-10-25  Vladimir Makarov  
* lra-spills.c (lra_final_code_change): Remove useless move insns.

Which IMO can be reintroduced back, now that vzeroupper pass works in
a different way. We actually have a couple of tests in place for
PR58679 [6].

[1] https://gcc.gnu.org/ml/gcc-patches/2013-10/msg02208.html
[2] https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=204079
[3] https://gcc.gnu.org/ml/gcc-patches/2013-10/msg02225.html
[4] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58679
[5] https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=204094
[6] https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=204109

Uros.


Re: [PATCH, middle-end]: Fix PR 88070, ICE in create_pre_exit, at mode-switching.c:438

2018-11-21 Thread Uros Bizjak
On Wed, Nov 21, 2018 at 10:48 AM Uros Bizjak  wrote:
>
> On Wed, Nov 21, 2018 at 12:46 AM Jeff Law  wrote:
> >
> > On 11/19/18 12:58 PM, Uros Bizjak wrote:
> > > Hello!
> > >
> > > The assert in create_pre_exit at mode-switching.c expects return copy
> > > pair with nothing in between. However, the compiler starts mode
> > > switching pass with the following sequence:
> > >
> > > (insn 19 18 16 2 (set (reg:V2SF 21 xmm0)
> > > (mem/c:V2SF (plus:DI (reg/f:DI 7 sp)
> > > (const_int -72 [0xffb8])) [0  S8 A64]))
> > > "pr88070.c":8 1157 {*movv2sf_internal}
> > >  (nil))
> > > (insn 16 19 20 2 (set (reg:V2SF 0 ax [orig:91  ] [91])
> > > (reg:V2SF 0 ax [89])) "pr88070.c":8 1157 {*movv2sf_internal}
> > >  (nil))
> > > (insn 20 16 21 2 (unspec_volatile [
> > > (const_int 0 [0])
> > > ] UNSPECV_BLOCKAGE) "pr88070.c":8 710 {blockage}
> > >  (nil))
> > > (insn 21 20 23 2 (use (reg:V2SF 21 xmm0)) "pr88070.c":8 -1
> > >  (nil))
> > So I know there's an updated patch.  But I thought it might be worth
> > mentioning that insn 16 here appears to be a nop-move.   Removing it
> > might address this instance of the problem, but I doubt it's general
> > enough to address any larger issues.
> >
> > You still might want to investigate why it's still in the IL.
>
> Oh yes, I remember this.
>
> These nop-moves were removed in Vlad's patch [1],[2]:
>
> 2013-10-25  Vladimir Makarov 
>
> ...
> * lra-spills.c (lra_final_code_change): Remove useless move insns.
>
> Which regressed vzeroupper insertion pass [3] that was reported in [4].
>
> The functionality was later reverted in [5]:
>
> 2013-10-26  Vladimir Makarov  
>
> Revert:
> 2013-10-25  Vladimir Makarov  
> * lra-spills.c (lra_final_code_change): Remove useless move insns.
>
> Which IMO can be reintroduced back, now that vzeroupper pass works in
> a different way. We actually have a couple of tests in place for
> PR58679 [6].

The revert of the revert works OK for PR58679 tests with the latest compiler.

Uros.


Re: [PATCH] avoid error_mark_node in -Wsizeof-pointer-memaccess (PR 88065)

2018-11-21 Thread Jakub Jelinek
On Tue, Nov 20, 2018 at 12:39:44AM +0100, Jakub Jelinek wrote:
> On Mon, Nov 19, 2018 at 04:10:09PM -0700, Jeff Law wrote:
> > > PR c/88065 - ICE in -Wsizeof-pointer-memaccess on an invalid strncpy
> > > 
> > > gcc/c-family/ChangeLog:
> > > 
> > >   PR c/88065

Please also add
PR c/87297

> > >   * c-warn.c (sizeof_pointer_memaccess_warning): Bail if source
> > >   or destination is an error.
> > > 
> > > gcc/testsuite/ChangeLog:
> > > 
> > >   PR c/88065
> > >   * gcc.dg/Wsizeof-pointer-memaccess2.c: New test.
> > This is probably OK.  But before final ACK, is there a point earlier
> > where we could/should have bailed out?
> 
> IMHO it is a good point, but it should use error_operand_p predicate instead
> of == error_mark_node checks to also catch the case where the argument is
> not error_mark_node, but has error_mark_node type.  And, the testcase
> shouldn't be in gcc.dg, but in c-c++-common and cover also C++ testing.

Testcase proving that error_operand_p is really necessary:

/* PR c/87297 */
/* { dg-do compile } */
/* { dg-options "-Wsizeof-pointer-memaccess" } */
struct S { char a[4]; };

void
foo (struct S *p, const char *s)
{
  struct T x;   /* { dg-error "storage size|incomplete type" } */
  __builtin_strncpy (x, s, sizeof p->a);
}

Works in C, still ICEs in C++ even with the patch you've posted.
819   tree dstsz = TYPE_SIZE_UNIT (TREE_TYPE (d));
debug_tree (d)
 
used decl_5 VOID huvaa.c:9:12
align:8 warn_if_not_align:0 context 
chain >

And, I think it is important to have these tests in c-c++-common, as the
above test shows, it behaves differently between C and C++ (C will present
error_mark_node itself rather than VAR_DECL with error_mark_node type) and
the code in question is just a helper for the FEs.
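
A toy model of the distinction, for illustration only: the struct and the helper names below are hypothetical stand-ins and not the real tree API. It shows why a plain comparison against error_mark_node misses a declaration whose type is the error marker, while a predicate in the spirit of error_operand_p catches both cases.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a tree node: only the type link matters.  */
struct node
{
  struct node *type;
};

/* Stands in for error_mark_node.  */
static struct node error_mark;

/* A valid type and a declaration of that type.  */
static struct node int_type;
static struct node good_decl = { &int_type };

/* What the C++ front end hands over in the testcase above: a
   declaration that is not the error marker but has it as its type.  */
static struct node decl_with_error_type = { &error_mark };

/* The check from the posted patch, in toy form.  */
static bool
is_error_mark (const struct node *t)
{
  return t == &error_mark;
}

/* The suggested check, in toy form: also look at the node's type.  */
static bool
is_error_operand (const struct node *t)
{
  return t == &error_mark || (t != NULL && t->type == &error_mark);
}
```

Here is_error_mark (&decl_with_error_type) is false, so the warning code would go on to inspect the bogus type, whereas is_error_operand (&decl_with_error_type) is true and lets the caller bail out early.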

Jakub


Patch ping (Re: [PATCH] Fix x86 bzhi/bextr iff zero_extract with zero size is undefined (PR rtl-optimization/87817))

2018-11-21 Thread Jakub Jelinek
Hi!

On Wed, Nov 14, 2018 at 12:37:02AM +0100, Jakub Jelinek wrote:
> 2018-11-13  Jakub Jelinek  
> 
>   PR rtl-optimization/87817
>   * config/i386/i386.md (bmi2_bzhi_3, *bmi2_bzhi_3,
>   *bmi2_bzhi_3_1, *bmi2_bzhi_3_1_ccz): Use IF_THEN_ELSE
>   in the pattern to avoid triggering UB when operands[2] is zero.
>   (tbm_bextri_): New expander.  Renamed the old define_insn to ...
>   (*tbm_bextri_): ... this.

I'd like to ping this patch, while the folding committed for the PR
often triggers and so the RTL passes see literal zero propagated there less
often, e.g. the testcase with:
-O2 -mbmi2 -fno-tree-ccp -fno-tree-forwprop -fno-tree-fre -fno-tree-pre 
-fno-tree-vrp -fno-tree-dominator-opts -fno-code-hoisting
is still miscompiled and there could be other reasons why a zero appears
only after expansion.

From what I understood, the agreement was that zero_extract with 0 size
(either literal or just at runtime) is incorrect in the middle-end.
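
The underlying issue can be sketched in plain C. This is an analogy under stated assumptions, not the i386.md pattern: extract_low_bits is a hypothetical shift-based extraction that is only defined for lengths 1 to 64, and bzhi_like guards the zero-length case explicitly, much as the patch wraps the extraction in IF_THEN_ELSE.

```c
#include <stdint.h>

/* Extract the low LEN bits of X using shifts.  Only valid for
   1 <= LEN <= 64: with LEN == 0 the shift count would be 64, which
   is undefined in C, just as a zero-size extraction is treated as
   invalid in the discussion above.  */
static uint64_t
extract_low_bits (uint64_t x, unsigned len)
{
  return (x << (64 - len)) >> (64 - len);
}

/* Guarded form: handle the zero-length case before extracting, and
   pass the value through unchanged once LEN covers the whole word.  */
static uint64_t
bzhi_like (uint64_t x, unsigned len)
{
  if (len == 0)
    return 0;
  if (len >= 64)
    return x;
  return extract_low_bits (x, len);
}
```

For example, bzhi_like (0xFFFF, 4) yields 0xF and bzhi_like (0xFFFF, 0) yields 0 without ever evaluating the undefined extraction.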

Jakub


[PATCH] x86: Add -march=cascadelake

2018-11-21 Thread Wei Xiao
Hi,

The attached patch added -march=cascadelake for x86.
Tested with bootstrap and regression tests on x86_64. No regressions.
Is it ok for trunk?

Wei

gcc/
* common/config/i386/i386-common.c (processor_names): Add cascadelake.
(processor_alias_table): Add cascadelake.
* config.gcc: Add -march=cascadelake.
* config/i386/driver-i386.c
(host_detect_local_cpu): Detect cascadelake.
* config/i386/i386-c.c (ix86_target_macros_internal): Handle
cascadelake.
* config/i386/i386.c (ix86_cost): Add m_CASCADELAKE.
(processor_cost_table): Add cascadelake.
(get_builtin_code_for_version): Handle cascadelake.
(fold_builtin_cpu): Ditto.
* config/i386/i386.h (TARGET_CASCADELAKE, PROCESSOR_CASCADELAKE): New.
(PTA_CASCADELAKE): Ditto.
* doc/invoke.texi: Add -march=cascadelake.
gcc/testsuite/
* g++.target/i386/mv16.C: Handle new march.
* gcc.target/i386/funcspec-56.inc: Ditto.
libgcc/
* config/i386/cpuinfo.h: Add INTEL_COREI7_CASCADELAKE.


cascadelake.diff
Description: Binary data


Re: [PATCH v3] [aarch64] Add CPU support for Ampere Computing's eMAG.

2018-11-21 Thread Kyrill Tkachov

Hi Christoph,

On 20/11/18 18:00, Christoph Muellner wrote:

Tested with "make check" and no regressions found.

This patch depends on the struct xgene1_prefetch_tune,
which has been acknowledged already:
https://gcc.gnu.org/ml/gcc-patches/2018-11/msg00985.html

*** gcc/ChangeLog ***

2018-xx-xx  Christoph Muellner 

* config/aarch64/aarch64-cores.def: Define emag.
* config/aarch64/aarch64-tune.md: Regenerated with emag.
* config/aarch64/aarch64.c (emag_tunings): New struct.
* doc/invoke.texi: Document mtune value.


This looks ok to me but you'll need a maintainer to approve.
You mentioned this depends on your previously approved patches.
Do you have write access or do you need someone to commit them for you?

Thanks,
Kyrill


Signed-off-by: Christoph Muellner 
---
  gcc/config/aarch64/aarch64-cores.def |  3 +++
  gcc/config/aarch64/aarch64-tune.md   |  2 +-
  gcc/config/aarch64/aarch64.c | 25 +
  gcc/doc/invoke.texi  |  2 +-
  4 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-cores.def 
b/gcc/config/aarch64/aarch64-cores.def
index 1f3ac56..68cca00 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -61,6 +61,9 @@ AARCH64_CORE("thunderxt88",   thunderxt88,   thunderx,  8A,  
AARCH64_FL_FOR_ARCH
  AARCH64_CORE("thunderxt81",   thunderxt81,   thunderx,  8A,  
AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  0x43, 0x0a2, -1)
  AARCH64_CORE("thunderxt83",   thunderxt83,   thunderx,  8A,  
AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  0x43, 0x0a3, -1)
  
+/* Ampere Computing cores. */

+AARCH64_CORE("emag",emag,  xgene1,8A,  AARCH64_FL_FOR_ARCH8 | 
AARCH64_FL_CRC | AARCH64_FL_CRYPTO, emag, 0x50, 0x000, 3)
+
  /* APM ('P') cores. */
  AARCH64_CORE("xgene1",  xgene1,xgene1,8A,  AARCH64_FL_FOR_ARCH8, 
xgene1, 0x50, 0x000, -1)
  
diff --git a/gcc/config/aarch64/aarch64-tune.md b/gcc/config/aarch64/aarch64-tune.md

index fade1d4..2fc7f03 100644
--- a/gcc/config/aarch64/aarch64-tune.md
+++ b/gcc/config/aarch64/aarch64-tune.md
@@ -1,5 +1,5 @@
  ;; -*- buffer-read-only: t -*-
  ;; Generated automatically by gentune.sh from aarch64-cores.def
  (define_attr "tune"
-   
"cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,ares,tsv110,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55"
+   
"cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,emag,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,ares,tsv110,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55"
(const (symbol_ref "((enum attr_tune) aarch64_tune)")))
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index f7f88a9..995aafe 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -957,6 +957,31 @@ static const struct tune_params xgene1_tunings =
&xgene1_prefetch_tune
  };
  
+static const struct tune_params emag_tunings =

+{
+  &xgene1_extra_costs,
+  &xgene1_addrcost_table,
+  &xgene1_regmove_cost,
+  &xgene1_vector_cost,
+  &generic_branch_cost,
+  &xgene1_approx_modes,
+  6, /* memmov_cost  */
+  4, /* issue_rate  */
+  AARCH64_FUSE_NOTHING, /* fusible_ops  */
+  "16",  /* function_align.  */
+  "16",  /* jump_align.  */
+  "16",  /* loop_align.  */
+  2,   /* int_reassoc_width.  */
+  4,   /* fp_reassoc_width.  */
+  1,   /* vec_reassoc_width.  */
+  2,   /* min_div_recip_mul_sf.  */
+  2,   /* min_div_recip_mul_df.  */
+  17,  /* max_case_values.  */
+  tune_params::AUTOPREFETCHER_OFF, /* autoprefetcher_model.  */
+  (AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS),   /* tune_flags.  */
+  &xgene1_prefetch_tune
+};
+
  static const struct tune_params qdf24xx_tunings =
  {
&qdf24xx_extra_costs,
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index e016dce..ac81fb2 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -15288,7 +15288,7 @@ Specify the name of the target processor for which GCC 
should tune the
  performance of the code.  Permissible values for this option are:
  @samp{generic}, @samp{cortex-a35}, @samp{cortex-a53}, @samp{cortex-a55},
  @samp{cortex-a57}, @samp{cortex-a72}, @samp{cortex-a73}, @samp{cortex-a75},
-@samp{cortex-a76}, @samp{ares}, @samp{exynos-m1}, @samp{falkor},
+@samp{cortex-a76}, @samp{ares}, @samp{exynos-m1}, @samp{emag}, @samp{falkor},
  @samp{qdf24xx}, @samp{saphira}, @samp{phecda}, @samp{xgene1}, @samp{vulcan},
  @samp{thunderx}, @samp{thunderxt88}, @samp{thunderxt88p1}, @samp{thunderxt81},

Re: Patch ping (Re: [PATCH] Fix x86 bzhi/bextr iff zero_extract with zero size is undefined (PR rtl-optimization/87817))

2018-11-21 Thread Uros Bizjak
On Wed, Nov 21, 2018 at 11:20 AM Jakub Jelinek  wrote:
>
> Hi!
>
> On Wed, Nov 14, 2018 at 12:37:02AM +0100, Jakub Jelinek wrote:
> > 2018-11-13  Jakub Jelinek  
> >
> >   PR rtl-optimization/87817
> >   * config/i386/i386.md (bmi2_bzhi_3, *bmi2_bzhi_3,
> >   *bmi2_bzhi_3_1, *bmi2_bzhi_3_1_ccz): Use IF_THEN_ELSE
> >   in the pattern to avoid triggering UB when operands[2] is zero.
> >   (tbm_bextri_): New expander.  Renamed the old define_insn to ...
> >   (*tbm_bextri_): ... this.

OK.

I thought that I already approved the patch. Oh well...

Thanks,
Uros.

> I'd like to ping this patch, while the folding committed for the PR
> often triggers and so the RTL passes see literal zero propagated there less
> often, e.g. the testcase with:
> -O2 -mbmi2 -fno-tree-ccp -fno-tree-forwprop -fno-tree-fre -fno-tree-pre 
> -fno-tree-vrp -fno-tree-dominator-opts -fno-code-hoisting
> is still miscompiled and there could be other reasons why a zero appears
> only after expansion.
>
> From what I understood, the agreement was that zero_extract with 0 size
> (either literal or just at runtime) is incorrect in the middle-end.
>
> Jakub


Re: [PATCH, ARM, ping3] PR85434: Prevent spilling of stack protector guard's address on ARM

2018-11-21 Thread Thomas Preudhomme
Yes, you did indeed, which is why I didn't include you in the To list.
I've reworked the Arm part significantly since it was last approved,
the ping is meant for the Arm maintainers.

Thanks for enquiring about it. Best regards,

Thomas
On Wed, 21 Nov 2018 at 00:32, Jeff Law  wrote:
>
> On 11/16/18 7:56 AM, Thomas Preudhomme wrote:
> > Ping?
> I thought I acked the target independent stuff a while back.  What's
> still waiting on review here?
>
> jeff


Re: [PATCH] x86: Add -march=cascadelake

2018-11-21 Thread Jakub Jelinek
On Wed, Nov 21, 2018 at 06:23:41PM +0800, Wei Xiao wrote:
> The attached patch added -march=cascadelake for x86.
> Tested with bootstrap and regression tests on x86_64. No regressions.
> Is it ok for trunk?

Not a real review, just nits:

index bff4dfb..f7c1c98 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,18 @@
+2018-11-21 Wei Xiao 

Two spaces after date, two spaces before <.

--- a/gcc/config/i386/driver-i386.c
+++ b/gcc/config/i386/driver-i386.c
@@ -857,6 +857,9 @@ const char *host_detect_local_cpu (int argc, const char 
**argv)
  /* Assume Ice Lake.  */
  else if (has_gfni)
cpu = "icelake-client";
+ /* Assume Cascade Lake.  */
+ if (has_avx512vnni)
+   cpu = "cascadelake";
  /* Assume Cannon Lake.  */
  else if (has_avx512vbmi)
cpu = "cannonlake";

Doesn't this break handling of all the other CPUs?  I mean, it is a large
  if (cond) ... else if (cond) ... else if (cond) ... else ...
but you've added if without else before it into the middle.
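
A minimal sketch of the problem, reduced to three feature flags. The struct and function names are hypothetical and the real host_detect_local_cpu chain is much longer; only the shape of the control flow is taken from the quoted hunk.

```c
#include <stdbool.h>

/* Hypothetical feature flags, named after the variables in the hunk.  */
struct features
{
  bool gfni;
  bool avx512vnni;
  bool avx512vbmi;
};

/* Shape of the posted hunk: the bare `if' ends the previous chain and
   starts a new one, so an earlier match can be overwritten.  */
static const char *
detect_posted (struct features f)
{
  const char *cpu = "generic";
  if (f.gfni)
    cpu = "icelake-client";
  if (f.avx512vnni)		/* missing `else' */
    cpu = "cascadelake";
  else if (f.avx512vbmi)
    cpu = "cannonlake";
  return cpu;
}

/* Same tests kept in a single if / else if chain.  */
static const char *
detect_chained (struct features f)
{
  const char *cpu = "generic";
  if (f.gfni)
    cpu = "icelake-client";
  else if (f.avx512vnni)
    cpu = "cascadelake";
  else if (f.avx512vbmi)
    cpu = "cannonlake";
  return cpu;
}
```

For a CPU that reports all three features, detect_posted returns "cascadelake" although the GFNI test already matched, while detect_chained returns "icelake-client". With GFNI and AVX512VBMI but no AVX512VNNI, detect_posted falls through to "cannonlake".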

Jakub


Re: [PATCH v3] [aarch64] Add CPU support for Ampere Computing's eMAG.

2018-11-21 Thread Christoph Müllner


> On 21.11.2018, at 11:26, Kyrill Tkachov  wrote:
> 
> Hi Christoph,
> 
> On 20/11/18 18:00, Christoph Muellner wrote:
>> Tested with "make check" and no regressions found.
>> 
>> This patch depends on the struct xgene1_prefetch_tune,
>> which has been acknowledged already:
>> https://gcc.gnu.org/ml/gcc-patches/2018-11/msg00985.html
>> 
>> *** gcc/ChangeLog ***
>> 
>> 2018-xx-xx  Christoph Muellner 
>> 
>>  * config/aarch64/aarch64-cores.def: Define emag.
>>  * config/aarch64/aarch64-tune.md: Regenerated with emag.
>>  * config/aarch64/aarch64.c (emag_tunings): New struct.
>>  * doc/invoke.texi: Document mtune value.
> 
> This looks ok to me but you'll need a maintainer to approve.
> You mentioned this depends on your previously approved patches.
> Do you have write access or do you need someone to commit them for you?

I don't have write access.
But I have already contacted somebody with write access to get my ACK'ed 
changes in.

Thanks,
Christoph

> 
> Thanks,
> Kyrill
> 
>> Signed-off-by: Christoph Muellner 
>> ---
>>  gcc/config/aarch64/aarch64-cores.def |  3 +++
>>  gcc/config/aarch64/aarch64-tune.md   |  2 +-
>>  gcc/config/aarch64/aarch64.c | 25 +
>>  gcc/doc/invoke.texi  |  2 +-
>>  4 files changed, 30 insertions(+), 2 deletions(-)
>> 
>> diff --git a/gcc/config/aarch64/aarch64-cores.def 
>> b/gcc/config/aarch64/aarch64-cores.def
>> index 1f3ac56..68cca00 100644
>> --- a/gcc/config/aarch64/aarch64-cores.def
>> +++ b/gcc/config/aarch64/aarch64-cores.def
>> @@ -61,6 +61,9 @@ AARCH64_CORE("thunderxt88",   thunderxt88,   thunderx,  
>> 8A,  AARCH64_FL_FOR_ARCH
>>  AARCH64_CORE("thunderxt81",   thunderxt81,   thunderx,  8A,  
>> AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  0x43, 
>> 0x0a2, -1)
>>  AARCH64_CORE("thunderxt83",   thunderxt83,   thunderx,  8A,  
>> AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  0x43, 
>> 0x0a3, -1)
>>  +/* Ampere Computing cores. */
>> +AARCH64_CORE("emag",emag,  xgene1,8A,  AARCH64_FL_FOR_ARCH8 
>> | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, emag, 0x50, 0x000, 3)
>> +
>>  /* APM ('P') cores. */
>>  AARCH64_CORE("xgene1",  xgene1,xgene1,8A,  
>> AARCH64_FL_FOR_ARCH8, xgene1, 0x50, 0x000, -1)
>>  diff --git a/gcc/config/aarch64/aarch64-tune.md 
>> b/gcc/config/aarch64/aarch64-tune.md
>> index fade1d4..2fc7f03 100644
>> --- a/gcc/config/aarch64/aarch64-tune.md
>> +++ b/gcc/config/aarch64/aarch64-tune.md
>> @@ -1,5 +1,5 @@
>>  ;; -*- buffer-read-only: t -*-
>>  ;; Generated automatically by gentune.sh from aarch64-cores.def
>>  (define_attr "tune"
>> -
>> "cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,ares,tsv110,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55"
>> +
>> "cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,emag,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,ares,tsv110,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55"
>>  (const (symbol_ref "((enum attr_tune) aarch64_tune)")))
>> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
>> index f7f88a9..995aafe 100644
>> --- a/gcc/config/aarch64/aarch64.c
>> +++ b/gcc/config/aarch64/aarch64.c
>> @@ -957,6 +957,31 @@ static const struct tune_params xgene1_tunings =
>>    &xgene1_prefetch_tune
>>  };
>>  
>> +static const struct tune_params emag_tunings =
>> +{
>> +  &xgene1_extra_costs,
>> +  &xgene1_addrcost_table,
>> +  &xgene1_regmove_cost,
>> +  &xgene1_vector_cost,
>> +  &generic_branch_cost,
>> +  &xgene1_approx_modes,
>> +  6, /* memmov_cost  */
>> +  4, /* issue_rate  */
>> +  AARCH64_FUSE_NOTHING, /* fusible_ops  */
>> +  "16", /* function_align.  */
>> +  "16", /* jump_align.  */
>> +  "16", /* loop_align.  */
>> +  2,/* int_reassoc_width.  */
>> +  4,/* fp_reassoc_width.  */
>> +  1,/* vec_reassoc_width.  */
>> +  2,/* min_div_recip_mul_sf.  */
>> +  2,/* min_div_recip_mul_df.  */
>> +  17,   /* max_case_values.  */
>> +  tune_params::AUTOPREFETCHER_OFF,  /* autoprefetcher_model.  */
>> +  (AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS),/* tune_flags.  */
>> +  &xgene1_prefetch_tune
>> +};
>> +
>>  static const struct tune_params qdf24xx_tunings =
>>  {
>>    &qdf24xx_extra_costs,
>> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
>> index e016dce..ac81fb2 100644
>> --- a/gcc/doc/invoke.texi
>> +++ b/gcc/doc/invoke.texi
>> @@ -15288,7 +15288,7 @@ Specify the name of the target processor for which GCC should tune the
>>  performance of the code.  Permissible values 

Re: [PATCH 01/10] Fix IRA ICE.

2018-11-21 Thread Andrew Stubbs

On 21/11/2018 00:47, Jeff Law wrote:

This seems like a really gross hack and sets an expectation that
generating registers in the target after IRA has started is OK.  It is
not OK.  The fact that this works is, IMHO, likely an accident.


What's the proper test for this? Neither lra_in_progress nor 
reload_in_progress is set here, and can_create_pseudos returns true.


The patterns have the ability to not generate registers, but they don't 
know not to.


Richard Sandiford has stated that it should be OK, but perhaps the other 
architectures also work by accident?


In fact, since we're using LRA (not reload), my understanding is that I 
ought to be able to create new pseudos right up until reload_completed. 
(Although, my experience is that it's easy to get into an infinite loop 
doing that.)



I think this comes back to the fundamental representational issue with
the EXEC handling that still needs to be addressed.


Undoubtedly, this makes it worse, but even without that I'd still want 
to expand vector memory moves long before split1, so at least some cases 
have to generate additional registers. (Perhaps IRA doesn't create 
memory moves though? I'm not sure.)


I'm going to investigate how easy it is to fix the EXEC representation 
issues. I've been resisting because I had a deadline to make, and it's 
bound to be an invasive and destabilizing alteration (albeit largely 
mechanical), but if it's going to be a barrier to commit then probably 
it's become time. :-(


Andrew


Re: Fix PR rtl-optimization/85925

2018-11-21 Thread Segher Boessenkool
Hi Eric,

On Wed, Nov 21, 2018 at 09:35:03AM +0100, Eric Botcazou wrote:
> > This is saying that *every* op except those very few works on the full
> > register.  And that for every architecture that has W_R_O.
> 
> That's still a progress over the previous situation.

Yes.  But it feels more than a bit wobbly.


Segher


PR C++/88114 - patch for destructor not generated for "virtual ~destructor() = default"

2018-11-21 Thread Tobias Burnus
Hello all,

if a class contains any 'virtual ... = 0', it's an abstract class, and for an
abstract class the destructor is not added to the vtable.

For a normal
  virtual ~class() { }
that's not a problem as the class::~class() destructor will be generated during
the parsing of the function.

But for
  virtual ~class() = default;
the destructor will be generated via mark_used via the vtable.


If one now declares a derived class and uses it, class::~class() is generated
in that translation unit, unless #pragma interface/implementation is used.

In that case, the 'default' destructor will never be generated.
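
For readers who want the ordinary case in front of them, here is a minimal
sketch of it without the pragmas. All names (Base, Derived,
destroy_through_base) are made up for illustration and are not taken from the
PR; the point is only that destroying a derived object is what makes the
defaulted base destructor "used" in a translation unit.

```cpp
#include <memory>

// Hypothetical names, for illustration only (not from PR C++/88114).
static int derived_destroyed = 0;

class Base
{
 public:
  virtual ~Base () = default;   // defaulted: only synthesized once it is used
  virtual void later () = 0;    // pure virtual, so Base is abstract
};

class Derived : public Base
{
 public:
  ~Derived () override { ++derived_destroyed; }
  void later () override { }
};

// Destroying a Derived through a Base pointer runs ~Derived and then
// ~Base, so this translation unit needs a definition of Base::~Base.
int destroy_through_base ()
{
  derived_destroyed = 0;
  std::unique_ptr<Base> p (new Derived);
  p.reset ();
  return derived_destroyed;
}
```

Calling destroy_through_base () is enough to exercise both destructors. With
'#pragma interface' in effect for Base, the assumption that "some user will
emit it" is exactly what breaks down.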


The following code seems to work both for the big code and for the example;
the destructor for the example is generated only with '#pragma implementation',
not without it.

The patch survived bootstrapping GCC with default languages on x86-64-gnu-linux
and "make check-g++".*

[One probably could get rid of some of the conditions for generating the code,
e.g. TREE_USED and DECL_DEFAULTED_FN are probably not both needed; one might
want to set some additional DECL to the fn decl.]

Do the patch and the test case make sense? Or is something else/in addition
needed?

Tobias


*I do get the following failures on this CentOS6 system:

FAIL: g++.dg/pr83239.C  -std=gnu++98 (test for excess errors)
Excess errors:
cc1plus: warning: 'void* __builtin_memset(void*, int, long unsigned int)' specified size 18446744073709551608 exceeds maximum object size 9223372036854775807 [-Wstringop-overflow=]
cc1plus: warning: 'void* __builtin_memset(void*, int, long unsigned int)' specified size 18446744073709551600 exceeds maximum object size 9223372036854775807 [-Wstringop-overflow=]

FAIL: g++.dg/tls/thread_local-order2.C  -std=c++14 execution test
FAIL: g++.dg/tls/thread_local-order2.C  -std=c++17 execution test

plus each 32 times:
FAIL: guality/guality.h: 0 PASS, 1 FAIL, 0 UNRESOLVED
FAIL: guality/guality.h: varl is -1, not 6
	PR C++/88114
	* decl2.c (c_parse_final_cleanups): If needed, generate code for 
	the destructor of an abstract class.
	(mark_used): Update comment for older function-name change.

	PR C++/88114
	* g++.dg/cpp0x/defaulted61.C: New.

diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index ffc0d0d6ec4..056e49ad88a 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -4782,6 +4782,18 @@ c_parse_final_cleanups (void)
 	  {
 	reconsider = true;
 	keyed_classes->unordered_remove (i);
+
+	/* For abstract classes, the destructor has been removed from the
+	   vtable (in class.c's build_vtbl_initializer).  For a compiler-
+	   generated destructor, it hence might not have been generated in
+	   this translation unit - and with '#pragma interface' it might
+	   never get generated.  */
+	if (CLASSTYPE_PURE_VIRTUALS (t)
+		&& TYPE_HAS_NONTRIVIAL_DESTRUCTOR (t))
+	  for (tree x = TYPE_FIELDS (t); x; x = DECL_CHAIN (x))
+		if (DECL_DECLARES_FUNCTION_P (x) && DECL_DESTRUCTOR_P (x)
+		&& !TREE_USED (x) && DECL_DEFAULTED_FN (x))
+		  note_vague_linkage_fn (x);
 	  }
   /* The input_location may have been changed during marking of
 	 vtable entries.  */
@@ -5465,7 +5477,7 @@ mark_used (tree decl, tsubst_flags_t complain)
 	 within the body of a function so as to avoid collecting live data
 	 on the stack (such as overload resolution candidates).
 
- We could just let cp_write_global_declarations handle synthesizing
+ We could just let c_parse_final_cleanups handle synthesizing
  this function by adding it to deferred_fns, but doing
  it at the use site produces better error messages.  */
   ++function_depth;
diff --git a/gcc/testsuite/g++.dg/cpp0x/defaulted61.C b/gcc/testsuite/g++.dg/cpp0x/defaulted61.C
new file mode 100644
index 00000000000..e7e0a486292
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/defaulted61.C
@@ -0,0 +1,22 @@
+// { dg-do compile { target c++11 } }
+// { dg-final { scan-assembler "_ZN3OneD0Ev" } }
+
+// PR C++/88114
+// Destructor of an abstract class was never generated
+// when compiling the class - nor later due to the
+// '#pragma interface'
+
+#pragma implementation
+#pragma interface
+
+class One
+{
+ public:
+  virtual ~One() = default;
+  void some_fn();
+  virtual void later() = 0;
+ private:
+  int m_int;
+};
+
+void One::some_fn() { }


Re: [PATCH v3] [aarch64] Add CPU support for Ampere Computing's eMAG.

2018-11-21 Thread Philipp Tomsich
This is currently slowed down by the speed of subversion (as my subversion tree
was outdated).  So it should only be a matter of days ... ;-)

> On 21.11.2018, at 12:15, Christoph Müllner wrote:
> 
>> 
>> On 21.11.2018, at 11:26, Kyrill Tkachov  wrote:
>> 
>> Hi Christoph,
>> 
>> On 20/11/18 18:00, Christoph Muellner wrote:
>>> Tested with "make check" and no regressions found.
>>> 
>>> This patch depends on the struct xgene1_prefetch_tune,
>>> which has been acknowledged already:
>>> https://gcc.gnu.org/ml/gcc-patches/2018-11/msg00985.html
>>> 
>>> *** gcc/ChangeLog ***
>>> 
>>> 2018-xx-xx  Christoph Muellner 
>>> 
>>> * config/aarch64/aarch64-cores.def: Define emag.
>>> * config/aarch64/aarch64-tune.md: Regenerated with emag.
>>> * config/aarch64/aarch64.c (emag_tunings): New struct.
>>> * doc/invoke.texi: Document mtune value.
>> 
>> This looks ok to me but you'll need a maintainer to approve.
>> You mentioned this depends on your previously approved patches.
>> Do you have write access or do you need someone to commit them for you?
> 
> I don't have write access.
> But I have already contacted somebody with write access to get my ACK'ed 
> changes in.
> 
> Thanks,
> Christoph
> 
>> 
>> Thanks,
>> Kyrill
>> 
>>> Signed-off-by: Christoph Muellner 
>>> ---
>>> gcc/config/aarch64/aarch64-cores.def |  3 +++
>>> gcc/config/aarch64/aarch64-tune.md   |  2 +-
>>> gcc/config/aarch64/aarch64.c | 25 +
>>> gcc/doc/invoke.texi  |  2 +-
>>> 4 files changed, 30 insertions(+), 2 deletions(-)
>>> 
>>> diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aarch64-cores.def
>>> index 1f3ac56..68cca00 100644
>>> --- a/gcc/config/aarch64/aarch64-cores.def
>>> +++ b/gcc/config/aarch64/aarch64-cores.def
>>> @@ -61,6 +61,9 @@ AARCH64_CORE("thunderxt88",   thunderxt88,   thunderx,  8A,  AARCH64_FL_FOR_ARCH
>>> AARCH64_CORE("thunderxt81",   thunderxt81,   thunderx,  8A,  AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  0x43, 0x0a2, -1)
>>> AARCH64_CORE("thunderxt83",   thunderxt83,   thunderx,  8A,  AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  0x43, 0x0a3, -1)
>>> +/* Ampere Computing cores. */
>>> +AARCH64_CORE("emag",emag,  xgene1,8A,  AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, emag, 0x50, 0x000, 3)
>>> +
>>> /* APM ('P') cores. */
>>> AARCH64_CORE("xgene1",  xgene1,xgene1,8A,  
>>> AARCH64_FL_FOR_ARCH8, xgene1, 0x50, 0x000, -1)
>>> diff --git a/gcc/config/aarch64/aarch64-tune.md b/gcc/config/aarch64/aarch64-tune.md
>>> index fade1d4..2fc7f03 100644
>>> --- a/gcc/config/aarch64/aarch64-tune.md
>>> +++ b/gcc/config/aarch64/aarch64-tune.md
>>> @@ -1,5 +1,5 @@
>>> ;; -*- buffer-read-only: t -*-
>>> ;; Generated automatically by gentune.sh from aarch64-cores.def
>>> (define_attr "tune"
>>> -   "cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,ares,tsv110,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55"
>>> +   "cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,emag,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,ares,tsv110,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55"
>>> (const (symbol_ref "((enum attr_tune) aarch64_tune)")))
>>> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
>>> index f7f88a9..995aafe 100644
>>> --- a/gcc/config/aarch64/aarch64.c
>>> +++ b/gcc/config/aarch64/aarch64.c
>>> @@ -957,6 +957,31 @@ static const struct tune_params xgene1_tunings =
>>>   &xgene1_prefetch_tune
>>> };
>>> +static const struct tune_params emag_tunings =
>>> +{
>>> +  &xgene1_extra_costs,
>>> +  &xgene1_addrcost_table,
>>> +  &xgene1_regmove_cost,
>>> +  &xgene1_vector_cost,
>>> +  &generic_branch_cost,
>>> +  &xgene1_approx_modes,
>>> +  6, /* memmov_cost  */
>>> +  4, /* issue_rate  */
>>> +  AARCH64_FUSE_NOTHING, /* fusible_ops  */
>>> +  "16",/* function_align.  */
>>> +  "16",/* jump_align.  */
>>> +  "16",/* loop_align.  */
>>> +  2,   /* int_reassoc_width.  */
>>> +  4,   /* fp_reassoc_width.  */
>>> +  1,   /* vec_reassoc_width.  */
>>> +  2,   /* min_div_recip_mul_sf.  */
>>> +  2,   /* min_div_recip_mul_df.  */
>>> +  17,  /* max_case_values.  */
>>> +  tune_params::AUTOPREFETCHER_OFF, /* autoprefetcher_model.  */
>>> +  (AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS),   /* tune_flags.  */
>>> +  &xgene1_prefetch_tune
>>> +};
>>> +
>>> static const struct tune_params qdf24xx_tunings =
>>> {
>>>   &qdf24xx_extra_costs,
>>> diff 

Re: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87626

2018-11-21 Thread Umesh Kalappa
Thank you for the inputs; please find the updated patch attached.

Please let us know your comments on it.

~Umesh
On Tue, Nov 20, 2018 at 3:03 PM Jakub Jelinek  wrote:
>
> On Mon, Nov 19, 2018 at 04:08:29PM +0530, Lokesh Janghel wrote:
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 8ca2e73..b55dfa9 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,8 @@
> +2018-11-19 Lokesh Janghel 
>
> Two spaces between date and name and name and <, i.e.
> 2018-11-20  Lokesh Janghel  
> in both ChangeLog files.
>
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr85667-2.c
> @@ -0,0 +1,15 @@
> +/* { dg-do assemble } */
> +/* { dg-options "-O2 -masm=intel" } */
> +/* { dg-require-effective-target lp64 } */
> +/* { dg-require-effective-target masm_intel } */
> +/* { dg-final { scan-assembler-times "movl\[^\n\r]*, %eax" 1} } */
> +typedef struct
> +{
> +  float x;
> +} Float;
> +Float __attribute__((ms_abi)) fn1 ()
> +{
> +  Float v;
> +  v.x = 3.145;
> +  return v;
> +}
>
> This test wasn't properly tested:
>
> /usr/src/gcc/obj/gcc/xgcc -B/usr/src/gcc/obj/gcc/ -m64 -fno-diagnostics-show-caret -fno-diagnostics-show-line-numbers -fdiagnostics-color=never -O2 -masm=intel -ffat-lto-objects -fno-ident -c -o pr85667-2.o /usr/src/gcc/gcc/testsuite/gcc.target/i386/pr85667-2.c
> PASS: gcc.target/i386/pr85667-2.c (test for excess errors)
> gcc.target/i386/pr85667-2.c: output file does not exist
> UNRESOLVED: gcc.target/i386/pr85667-2.c scan-assembler-times movl[^\n\r]*, %eax 1
> testcase /usr/src/gcc/gcc/testsuite/gcc.target/i386/i386.exp completed in 1 seconds
>
> 1) you do not want to use dg-do assemble, but dg-do compile, because only
>    in that case (or when using -save-temps) assembly is produced
> 2) you do not want to use -masm=intel and then expect AT&T syntax in the
>    regexp
>
> Thus, I'd replace all the dg- directive lines with:
> /* { dg-do compile { target lp64 } } */
> /* { dg-options "-O2" } */
> /* { dg-final { scan-assembler-times "movl\[^\n\r]*, %eax|mov\[ \t]*eax," 1 } } */
>
> That way, it will work both with -masm=att (explicit or implicit) or
> -masm=intel.
>
> One can use
>
> make check-gcc RUNTESTFLAGS='--target_board=unix\{-m32,-m64,-m64/-masm=intel\} i386.exp=pr85667*'
>
> to verify and then look at the log file.
>
> Furthermore, I'd copy pr85667-1.c test to pr85667-3.c and the modified
> pr85667-2.c to pr85667-4.c, change Float to Double, float to double, remove
> f suffixes and adjust all the eax in the regexp to rax, so that you also
> test the struct with DFmode case.
>
> Jakub


85667.patch
Description: Binary data


Re: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87626

2018-11-21 Thread Jakub Jelinek
On Wed, Nov 21, 2018 at 06:06:41PM +0530, Umesh Kalappa wrote:
> Thank you for the inputs and please find the attachment for the update patch.

LGTM.

Jakub


Re: [PATCH] handle unusual targets in -Wbuiltin-declaration-mismatch (PR 88098)

2018-11-21 Thread Rainer Orth
Hi Martin,

> By calling builtin_decl_explicit rather than builtin_decl_implicit
> the updated patch in the attachment avoids test failures due to
> missing warnings on targets with support for long double but whose
> libc doesn't support C99 functions like fabsl (such as apparently
> aarch64-linux).
[...]
> gcc/testsuite/ChangeLog:
>
>   PR testsuite/88098
>   * gcc.dg/Wbuiltin-declaration-mismatch-4.c: Adjust.
>   * gcc.dg/Wbuiltin-declaration-mismatch-5.c: New test.

is the Wbuiltin-declaration-mismatch-5.c testcase still supposed to be
part of the patch?  It's in the ChangeLog, but missing from the revised
patch.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


[PATCH] Fix up 20181120-1.c testcase on big-endian (PR rtl-optimization/85925)

2018-11-21 Thread Jakub Jelinek
Hi!

As mentioned in the PR, the testcase fails on big-endian targets.
The following patch tweaks it so that it does not fail there and still
checks for the original bug.
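
Why the constant 0x101 is endian-dependent can be seen with a small sketch.
It is not the testcase itself: the struct name Bits and the helper
low_bitfield_of are hypothetical, and bit-field layout is
implementation-defined, so treat the values as what common ABIs do rather
than a guarantee.

```cpp
#include <cstring>

// Hypothetical stand-in for the f1 member of union U1 in the testcase.
struct Bits
{
  unsigned f1 : 15;
};

// Copy the object representation of VALUE into a Bits object and read
// the 15-bit field back.  Which 15 bits of VALUE the field covers
// depends on the target's bit-field allocation order.
unsigned low_bitfield_of (unsigned value)
{
  static_assert (sizeof (Bits) == sizeof (unsigned),
                 "sketch assumes the bit-field is stored in one unsigned");
  Bits b;
  std::memcpy (&b, &value, sizeof b);
  return b.f1;
}
```

On a typical little-endian target low_bitfield_of (0x10101) gives the low 15
bits, i.e. 0x101. On big-endian targets the field usually sits in the most
significant bits, so a different value comes back, which is why the patch
compares against u.f1 instead of a literal.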

Tested on x86_64-linux and i686-linux, ok for trunk and release branches?

2018-11-21  Jakub Jelinek  

PR rtl-optimization/85925
* gcc.c-torture/execute/20181120-1.c (u): New variable.
(main): Compare d against u.f1 rather than 0x101.

--- gcc/testsuite/gcc.c-torture/execute/20181120-1.c.jj 2018-11-20 21:39:05.230507352 +0100
+++ gcc/testsuite/gcc.c-torture/execute/20181120-1.c    2018-11-21 11:49:29.919488909 +0100
@@ -9,6 +9,7 @@ union U1 {
   unsigned f0;
   unsigned f1 : 15;
 };
+volatile union U1 u = { 0x10101 };
 
 int main (void)
 {
@@ -19,7 +20,7 @@ int main (void)
 *e = f.f1;
   }
 
-  if (d != 0x101)
+  if (d != u.f1)
 __builtin_abort ();
 
   return 0;

Jakub


Re: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87626

2018-11-21 Thread Umesh Kalappa
Hi Jakub and All,

We don't have commit access; can someone please commit it for us?

~Umesh

On Wed, Nov 21, 2018, 18:37 Jakub Jelinek wrote:
> On Wed, Nov 21, 2018 at 06:06:41PM +0530, Umesh Kalappa wrote:
> > Thank you for the inputs and please find the attachment for the update
> patch.
>
> LGTM.
>
> Jakub
>


[PATCH] C++: show namespaces for enum values (PR c++/88121)

2018-11-21 Thread David Malcolm
Consider this test case:

namespace json
{
  enum { JSON_OBJECT };
}

void test ()
{
  JSON_OBJECT;
}

which erroneously accesses an enum value in another namespace without
qualifying the access.

GCC 6 through 8 issue a suggestion that doesn't mention the namespace:

: In function 'void test()':
:8:3: error: 'JSON_OBJECT' was not declared in this scope
   JSON_OBJECT;
   ^~~
:8:3: note: suggested alternative:
:3:10: note:   'JSON_OBJECT'
   enum { JSON_OBJECT };
  ^~~

which is suboptimal.

I made the problem worse with r265610, as gcc 9 now consolidates
the single suggestion into the error, and emits:

: In function 'void test()':
:8:3: error: 'JSON_OBJECT' was not declared in this scope; did
   you mean 'JSON_OBJECT'?
8 |   JSON_OBJECT;
  |   ^~~
  |   JSON_OBJECT
:3:10: note: 'JSON_OBJECT' declared here
3 |   enum { JSON_OBJECT };
  |  ^~~

where the message:
  'JSON_OBJECT' was not declared in this scope; did you mean 'JSON_OBJECT'?
is nonsensical.

The root cause is that dump_scope doesn't print anything when called for
CONST_DECL in a namespace: the scope is an ENUMERAL_TYPE, rather than
a namespace.

This patch tweaks dump_scope to detect ENUMERAL_TYPE, and to use the
enclosing namespace, so that the CONST_DECL is dumped as
"json::JSON_OBJECT".
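
As a rough illustration of the idea (not the actual error.c code), here is a
toy model. Scope, ScopeKind and scope_prefix are hypothetical names that have
nothing to do with GCC's tree nodes; the sketch only shows the "step out of
the enumeration before printing namespaces" logic.

```cpp
#include <string>

// Toy stand-ins for a compiler's scope tree; every name here is
// hypothetical and unrelated to GCC's real data structures.
enum class ScopeKind { Global, Namespace, Enumeration };

struct Scope
{
  ScopeKind kind;
  std::string name;
  const Scope *parent;
};

// Return the "a::b::" prefix to print in front of a declaration whose
// immediate scope is SCOPE.
std::string scope_prefix (const Scope *scope)
{
  // The step the patch adds, in miniature: an enumerator's immediate
  // scope is its enumeration, so step out to the enclosing scope first.
  if (scope && scope->kind == ScopeKind::Enumeration)
    scope = scope->parent;

  std::string prefix;
  for (; scope && scope->kind == ScopeKind::Namespace; scope = scope->parent)
    prefix = scope->name + "::" + prefix;
  return prefix;
}

// Models the enumerator from the test case: an unnamed enum in namespace json.
std::string qualified_name_demo ()
{
  const Scope global = { ScopeKind::Global, "", nullptr };
  const Scope json = { ScopeKind::Namespace, "json", &global };
  const Scope anon_enum = { ScopeKind::Enumeration, "", &json };
  return scope_prefix (&anon_enum) + "JSON_OBJECT";
}
```

Without the first "if", the loop never sees a namespace and the prefix stays
empty, which corresponds to the unqualified 'JSON_OBJECT' in the bad
suggestion.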

This changes the output for the above so that it refers to the
namespace, fixing the issue:

:8:3: error: 'JSON_OBJECT' was not declared in this scope; did
   you mean 'json::JSON_OBJECT'?
8 |   JSON_OBJECT;
  |   ^~~
  |   json::JSON_OBJECT
:3:10: note: 'json::JSON_OBJECT' declared here
3 |   enum { JSON_OBJECT };
  |  ^~~

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.

OK for trunk?

gcc/cp/ChangeLog:
PR c++/88121
* error.c (dump_scope): Handle ENUMERAL_TYPE by using the
CP_TYPE_CONTEXT of the type.

gcc/testsuite/ChangeLog:
PR c++/88121
* g++.dg/lookup/suggestions-scoped-enums.C: New test.
* g++.dg/lookup/suggestions-unscoped-enums.C: New test.
---
 gcc/cp/error.c |  6 ++
 .../g++.dg/lookup/suggestions-scoped-enums.C   | 13 
 .../g++.dg/lookup/suggestions-unscoped-enums.C | 91 ++
 3 files changed, 110 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/lookup/suggestions-scoped-enums.C
 create mode 100644 gcc/testsuite/g++.dg/lookup/suggestions-unscoped-enums.C

diff --git a/gcc/cp/error.c b/gcc/cp/error.c
index 72b42bd..6fee62d 100644
--- a/gcc/cp/error.c
+++ b/gcc/cp/error.c
@@ -182,6 +182,12 @@ dump_scope (cxx_pretty_printer *pp, tree scope, int flags)
   if (scope == NULL_TREE)
 return;
 
+  /* Enum values will be CONST_DECL with an ENUMERAL_TYPE as their
+ "scope".  Use CP_TYPE_CONTEXT of the ENUMERAL_TYPE, so as to
+ print the enclosing namespace.  */
+  if (TREE_CODE (scope) == ENUMERAL_TYPE)
+scope = CP_TYPE_CONTEXT (scope);
+
   if (TREE_CODE (scope) == NAMESPACE_DECL)
 {
   if (scope != global_namespace)
diff --git a/gcc/testsuite/g++.dg/lookup/suggestions-scoped-enums.C b/gcc/testsuite/g++.dg/lookup/suggestions-scoped-enums.C
new file mode 100644
index 0000000..2bf3ed6
--- /dev/null
+++ b/gcc/testsuite/g++.dg/lookup/suggestions-scoped-enums.C
@@ -0,0 +1,13 @@
+// { dg-do compile { target c++11 } }
+// { dg-options "-fdiagnostics-show-caret" }
+
+enum class vegetable { CARROT, TURNIP };
+
+void misspelled_value_in_scoped_enum ()
+{
+  vegetable::TURNUP; // { dg-error "'TURNUP' is not a member of 'vegetable'" }
+  /* { dg-begin-multiline-output "" }
+   vegetable::TURNUP;
+  ^~
+ { dg-end-multiline-output "" } */
+}
diff --git a/gcc/testsuite/g++.dg/lookup/suggestions-unscoped-enums.C b/gcc/testsuite/g++.dg/lookup/suggestions-unscoped-enums.C
new file mode 100644
index 0000000..bc610d0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/lookup/suggestions-unscoped-enums.C
@@ -0,0 +1,91 @@
+// { dg-options "-fdiagnostics-show-caret" }
+
+enum { LASAGNA, SPAGHETTI };
+namespace outer_ns_a
+{
+  enum enum_in_outer_ns_a { STRAWBERRY, BANANA };
+  namespace inner_ns
+  {
+enum enum_in_inner_ns { ELEPHANT, LION };
+  }
+}
+namespace outer_ns_2
+{
+  enum enum_in_outer_ns_2 { NIGHT, DAY };
+}
+
+void misspelled_enum_in_global_ns ()
+{
+  SPOOGHETTI; // { dg-error "'SPOOGHETTI' was not declared in this scope; did you mean 'SPAGHETTI'" }
+  /* { dg-begin-multiline-output "" }
+   SPOOGHETTI;
+   ^~
+   SPAGHETTI
+ { dg-end-multiline-output "" } */
+}
+
+void unqualified_enum_in_outer_ns ()
+{
+  BANANA; // { dg-error "'BANANA' was not declared in this scope; did you mean 'outer_ns_a::BANANA'" }
+  /* { dg-begin-multiline-output "" }
+   BANANA;
+   ^~
+   outer_ns_a::BANANA
+ { dg-end-multiline-output "" } */
+  /* { dg-begin-multiline-output "" }
+   enum enum_in_outer_ns_a { STRAWBERRY, BANANA };
+ ^~
+

Re: [PATCH] Fix missing dump_impl_location_t values, using a new dump_metadata_t

2018-11-21 Thread Richard Biener
On Tue, Nov 20, 2018 at 8:37 PM David Malcolm  wrote:
>
> The dump_* API attempts to capture emission location metadata for the
> various dump messages, but looking in -fsave-optimization-record shows
> that many dump messages are lacking useful impl_location values, instead
> having this location within dumpfile.c:
>
> "impl_location": {
> "file": "../../src/gcc/dumpfile.c",
> "function": "ensure_pending_optinfo",
> "line": 1169
> },
>
> The problem is that the auto-capturing of dump_impl_location_t is tied to
> dump_location_t, and this is tied to the dump_*_loc calls.  If a message
> comes from a dump_* call without a "_loc" suffix (e.g. dump_printf), the
> current code synthesizes the dump_location_t within
> dump_context::ensure_pending_optinfo, and thus saves the useless
> impl_location seen above.
>
> This patch fixes things by changing the dump_* API so that, rather than
> taking a dump_flags_t, they take a new class dump_metadata_t, which is
> constructed from a dump_flags_t, but captures the emission location.
>
> Hence e.g.:
>
>   dump_printf (MSG_NOTE, "some message\n");
>
> implicitly builds a dump_metadata_t wrapping the MSG_NOTE and the
> emission location.  If there are several dump_printf calls without
> a dump_*_loc call, the emission location within the optinfo is that
> of the first dump call within it.
>
> The patch updates selftest::test_capture_of_dump_calls to verify
> that the impl location of various dump_* calls is captured.  I also
> manually verified that the references to dumpfile.c in the saved
> optimization records were fixed.
>
> Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
>
> OK for trunk?

OK.

Richard.

> gcc/ChangeLog:
> * dump-context.h (dump_context::dump_loc): Convert 1st param from
> dump_flags_t to const dump_metadata_t &.  Convert 2nd param from
> const dump_location_t & to const dump_user_location_t &.
> (dump_context::dump_loc_immediate): Convert 2nd param from
> const dump_location_t & to const dump_user_location_t &.
> (dump_context::dump_gimple_stmt): Convert 1st param from
> dump_flags_t to const dump_metadata_t &.
> (dump_context::void dump_gimple_stmt_loc): Likewise; convert
> 2nd param from const dump_location_t & to
> const dump_user_location_t &.
> (dump_context::dump_gimple_expr): Convert 1st param from
> dump_flags_t to const dump_metadata_t &.
> (dump_context::dump_gimple_expr_loc): Likewise; convert
> 2nd param from const dump_location_t & to
> const dump_user_location_t &.
> (dump_context::dump_generic_expr): Convert 1st param from
> dump_flags_t to const dump_metadata_t &.
> (dump_context::dump_generic_expr_loc): Likewise; convert
> 2nd param from const dump_location_t & to
> const dump_user_location_t &.
> (dump_context::dump_printf_va): Convert 1st param from
> dump_flags_t to const dump_metadata_t &.
> (dump_context::dump_printf_loc_va): Likewise; convert
> 2nd param from const dump_location_t & to
> const dump_user_location_t &.
> (dump_context::dump_dec): Convert 1st param from
> dump_flags_t to const dump_metadata_t &.
> (dump_context::dump_symtab_node): Likewise.
> (dump_context::begin_scope): Split out 2nd param into
> user and impl locations.
> (dump_context::ensure_pending_optinfo): Add metadata param.
> (dump_context::begin_next_optinfo): Replace dump_location_t param
> with metadata and user location.
> * dumpfile.c (dump_context::dump_loc): Convert 1st param from
> dump_flags_t to const dump_metadata_t &.  Convert 2nd param from
> const dump_location_t & to const dump_user_location_t &.
> (dump_context::dump_loc_immediate): Convert 2nd param from
> const dump_location_t & to const dump_user_location_t &.
> (dump_context::dump_gimple_stmt): Convert 1st param from
> dump_flags_t to const dump_metadata_t &.
> (dump_context::void dump_gimple_stmt_loc): Likewise; convert
> 2nd param from const dump_location_t & to
> const dump_user_location_t &.
> (dump_context::dump_gimple_expr): Convert 1st param from
> dump_flags_t to const dump_metadata_t &.
> (dump_context::dump_gimple_expr_loc): Likewise; convert
> 2nd param from const dump_location_t & to
> const dump_user_location_t &.
> (dump_context::dump_generic_expr): Convert 1st param from
> dump_flags_t to const dump_metadata_t &.
> (dump_context::dump_generic_expr_loc): Likewise; convert
> 2nd param from const dump_location_t & to
> const dump_user_location_t &.
> (dump_context::dump_printf_va): Convert 1st param from
> dump_flags_t to const dump_metadata_t &.
>

Re: Fix PR37916 (unnecessary spilling)

2018-11-21 Thread Richard Biener
On Wed, Nov 21, 2018 at 1:12 AM Jeff Law  wrote:
>
> On 11/20/18 6:42 AM, Michael Matz wrote:
> > Hi,
> >
> > this bug report is about cris generating worse code since tree-ssa.  The
> > effect is also visible on x86-64.  The symptom is that the work horse of
> > adler32.c (from zlib) needs spills in the inner loop, while gcc 3 did not,
> > and those spills go away with -fno-tree-reassoc.
> >
> > The underlying reason for the spills is register pressure, which could
> > either be rectified by the pressure aware scheduling (which cris doesn't
> > have), or by simply not generating high pressure code to begin with.  In
> > this case it's TER which ultimately causes the register pressure to
> > increase, and there are many plans in people's minds how to fix this (make
> > TER regpressure aware, do some regpressure scheduling on gimple, or even
> > more ambitious things), but this patch doesn't tackle this.  Instead it
> > makes reassoc not generate the situation which causes TER to run wild.
> >
> > TER increasing register pressure is a long standing problem and so it has
> > some heuristics to avoid that.  One wobbly heuristic is that it doesn't
> > TER expressions together that have the same base variable as their LHSs.
> > But reassoc generates only anonymous SSA names for its temporary
> > subexpressions, so that TER heuristic doesn't apply.  In this testcase
> > it's even the case that reassoc doesn't really change much code (one
> > addition moves from the end to the beginning of the inner loop), so that
> > whole rewriting is even pointless.
> >
> > In any case, let's use copy_ssa_name instead of make_ssa_name, when we
> > have an obvious LHS; that leads to TER making the same decisions with and
> > without -fno-tree-reassoc, leading to the same register pressure and no
> > spills.

I don't think this is OK.  Take one example, in rewrite_expr_tree for the final
recursion case we replace

   a_1 = _2 + _3;

with something else, like

_4 = _5 + 1;

so we compute a new value that may not have been computed before and
certainly not into the user variable a.  If you change this to instead create

  a_4 = _5 + 1;

then this leads to wrong debug info showing a value for 'a' that never existed.
You can observe this with

unsigned int __attribute__((noinline,noipa))
foo (unsigned int a, unsigned int b, unsigned int c, unsigned int d)
{
  a = a + 1;
  a = a + b;
  a = a + c;
  a = a + d;
  a = a + 3;
  return a;
}
int main()
{
  volatile unsigned x = foo (1, 3, 5, 7);
  return 0;
}

when you build this with -O -g -fno-var-tracking-assignments (VTA seems
to hide the issue, probably due to not enough instructions in the end)
you observe

(gdb) s
foo (a=1, b=3, c=5, d=7) at t.c:6
6 a = a + c;
(gdb) disassemble
Dump of assembler code for function foo:
=> 0x004004b2 <+0>: lea    0x4(%rdx,%rcx,1),%eax
   0x004004b6 <+4>: add    %esi,%eax
   0x004004b8 <+6>: add    %edi,%eax
   0x004004ba <+8>: retq
End of assembler dump.
(gdb) p a
$1 = 1
(gdb) si
7 a = a + d;
(gdb) p a
$2 = 16
(gdb) si
9 return a;
(gdb) p a
$3 = 19
(gdb) si
0x004004ba  9 return a;
(gdb) p a
$4 = 
(gdb) p $eax
$5 = 20
(gdb) quit

printing values for a that never should occur.  The sequence
of "allowed" values is 1, 2, 5, 10, 17, 20.  Both 16 and 19
should never be printed.  It's quite obvious if you look at the
reassoc result which is

  a_11 = d_7(D) + 4;
  a_12 = a_11 + c_5(D);
  a_13 = a_12 + b_3(D);
  a_9 = a_13 + a_1(D);
  return a_9;

and the fact that var-tracking looks at REG_DECL for
debug info location generation.  With VTA we get

  # DEBUG BEGIN_STMT
  # DEBUG D#3 => a_1(D) + 1
  # DEBUG a => D#3
  # DEBUG BEGIN_STMT
  a_11 = d_7(D) + 4;
  # DEBUG D#2 => D#3 + b_3(D)
  # DEBUG a => D#2
  # DEBUG BEGIN_STMT
  a_12 = a_11 + c_5(D);
  # DEBUG D#1 => D#2 + c_5(D)
  # DEBUG a => D#1
  # DEBUG BEGIN_STMT
  a_13 = a_12 + b_3(D);
  # DEBUG a => D#1 + d_7(D)
  # DEBUG BEGIN_STMT
  a_9 = a_13 + a_1(D);
  # DEBUG a => a_9
  # DEBUG BEGIN_STMT
  return a_9;

but DEBUG stmts by themselves do not make the code valid from
a debug perspective.  Note that the reassociated stmts can
become DEBUG stmts themselves if they are later DCEd.

So your patch has to be much more careful to never change
the LHS of stmts that are adjusted (which I think reassoc already does).
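
To make the arithmetic concrete, here is a small sketch that computes both
value sequences for foo (1, 3, 5, 7). The helper names are made up for
illustration; the statements simply mirror the source of foo and the
reassociated chain shown above.

```cpp
#include <array>

// Values the user variable 'a' takes after each statement of foo as
// written in the source (the entry value comes first).
std::array<unsigned, 6>
source_order_values (unsigned a, unsigned b, unsigned c, unsigned d)
{
  std::array<unsigned, 6> v;
  v[0] = a;
  v[1] = v[0] + 1;
  v[2] = v[1] + b;
  v[3] = v[2] + c;
  v[4] = v[3] + d;
  v[5] = v[4] + 3;
  return v;
}

// Values of a_11, a_12, a_13 and a_9 in the reassociated chain.
std::array<unsigned, 4>
reassociated_values (unsigned a, unsigned b, unsigned c, unsigned d)
{
  std::array<unsigned, 4> v;
  v[0] = d + 4;          // a_11
  v[1] = v[0] + c;       // a_12
  v[2] = v[1] + b;       // a_13
  v[3] = v[2] + a;       // a_9
  return v;
}
```

For the arguments above the first function yields 1, 2, 5, 10, 17, 20 and the
second 11, 16, 19, 20. Only the final 20 is shared, so any intermediate of the
second chain that gets attributed to 'a' is a value the source never computed.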

Richard.

> > On x86-64 the effect is:
> >   before patch: 48 bytes stackframe, 24 stack
> > accesses (most of them in the loops), 576 bytes codesize;
> >   after patch: no stack frame, no stack accesses, 438 bytes codesize
> >
> > On cris:
> >   before patch: 64 bytes stack frame, 27 stack access in loops, size of .s
> > 145 lines
> >   after patch: 20 bytes stack frame (as it uses callee saved regs, which
> > is also complained about in the bug report), but no stack accesses
> > in loops, size of .s: 125 lines
> >
> > I'm wondering about testcase: should I add an x86-64 specific that tests
> > for no

Re: compute discriminator info for overrides

2018-11-21 Thread Richard Biener
On Wed, Nov 21, 2018 at 2:06 AM Alexandre Oliva  wrote:
>
> In some cases of overriding or resetting locations, we might retain
> discriminator info from earlier locations, when we should take
> discriminator information from the overriding location or reset it.
>
> Regstrapped on x86_64-linux-gnu.  Ok to install?

OK.

Richard.
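
For readers following along, the problem described in the patch can be
modelled with a small sketch.  This is not the code in final.c; struct
line_state and notice_line are hypothetical stand-ins for the filename,
linenum, columnnum and discriminator statics and for the selection logic in
notice_source_line.  The point is only that the discriminator has to travel
with the rest of the location, both when an override is active and when the
location is reset.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the per-insn line note state.  */
struct line_state
{
  const char *filename;
  int linenum;
  int columnnum;
  int discriminator;
};

/* Pick the location for the next line note.  OVERRIDE is considered
   active when its filename is non-NULL; INSN may be NULL when the insn
   has no location.  All four fields are always taken from the same
   source, so no stale discriminator survives.  */
static void
notice_line (struct line_state *cur, const struct line_state *override,
             const struct line_state *insn)
{
  assert (cur != NULL && override != NULL);
  if (override->filename != NULL)
    *cur = *override;
  else if (insn != NULL)
    *cur = *insn;
  else
    {
      cur->filename = NULL;
      cur->linenum = 0;
      cur->columnnum = 0;
      cur->discriminator = 0;
    }
}
```

If only the first three fields were copied in the override and reset arms,
cur->discriminator would keep whatever value an earlier insn left behind.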

> for  gcc/ChangeLog
>
> * final.c (compute_discriminator): Declare.  Renamed from...
> (maybe_set_discriminator): ... this.  Set and return a local.
> (override_discriminator): New.
> (final_scan_insn_1): Set it.
> (notice_source_line): Adjust.  Always set discriminator.
> ---
>  gcc/final.c |   19 +++
>  1 file changed, 15 insertions(+), 4 deletions(-)
>
> diff --git a/gcc/final.c b/gcc/final.c
> index 0c1ac625f37a..f707d2fc0bcd 100644
> --- a/gcc/final.c
> +++ b/gcc/final.c
> @@ -128,6 +128,7 @@ static int last_discriminator;
>  /* Discriminator to be written to assembly for current instruction.
> Note: actual usage depends on loc_discriminator_kind setting.  */
>  static int discriminator;
> +static inline int compute_discriminator (location_t loc);
>
>  /* Discriminator identifying current basic block among others sharing
> the same locus.  */
> @@ -149,6 +150,7 @@ static const char *last_filename;
>  static const char *override_filename;
>  static int override_linenum;
>  static int override_columnnum;
> +static int override_discriminator;
>
>  /* Whether to force emission of a line note before the next insn.  */
>  static bool force_source_line = false;
> @@ -2342,6 +2344,7 @@ final_scan_insn_1 (rtx_insn *insn, FILE *file, int 
> optimize_p ATTRIBUTE_UNUSED,
>   override_filename = LOCATION_FILE (*locus_ptr);
>   override_linenum = LOCATION_LINE (*locus_ptr);
>   override_columnnum = LOCATION_COLUMN (*locus_ptr);
> + override_discriminator = compute_discriminator (*locus_ptr);
> }
> }
>   break;
> @@ -2379,12 +2382,14 @@ final_scan_insn_1 (rtx_insn *insn, FILE *file, int 
> optimize_p ATTRIBUTE_UNUSED,
>   override_filename = LOCATION_FILE (*locus_ptr);
>   override_linenum = LOCATION_LINE (*locus_ptr);
>   override_columnnum = LOCATION_COLUMN (*locus_ptr);
> + override_discriminator = compute_discriminator (*locus_ptr);
> }
>   else
> {
>   override_filename = NULL;
>   override_linenum = 0;
>   override_columnnum = 0;
> + override_discriminator = 0;
> }
> }
>   break;
> @@ -3185,9 +3190,11 @@ map_decl_to_instance (const_tree decl)
>
>  /* Set DISCRIMINATOR to the appropriate value, possibly derived from LOC.  */
>
> -static inline void
> -maybe_set_discriminator (location_t loc)
> +static inline int
> +compute_discriminator (location_t loc)
>  {
> +  int discriminator;
> +
>if (!decl_to_instance_map)
>  discriminator = bb_discriminator;
>else
> @@ -3209,6 +3216,8 @@ maybe_set_discriminator (location_t loc)
>
>discriminator = map_decl_to_instance (decl);
>  }
> +
> +  return discriminator;
>  }
>
>  /* Return whether a source line note needs to be emitted before INSN.
> @@ -3234,7 +3243,7 @@ notice_source_line (rtx_insn *insn, bool *is_stmt)
>filename = xloc.file;
>linenum = xloc.line;
>columnnum = xloc.column;
> -  maybe_set_discriminator (loc);
> +  discriminator = compute_discriminator (loc);
>force_source_line = true;
>  }
>else if (override_filename)
> @@ -3242,6 +3251,7 @@ notice_source_line (rtx_insn *insn, bool *is_stmt)
>filename = override_filename;
>linenum = override_linenum;
>columnnum = override_columnnum;
> +  discriminator = override_discriminator;
>  }
>else if (INSN_HAS_LOCATION (insn))
>  {
> @@ -3249,13 +3259,14 @@ notice_source_line (rtx_insn *insn, bool *is_stmt)
>filename = xloc.file;
>linenum = xloc.line;
>columnnum = xloc.column;
> -  maybe_set_discriminator (INSN_LOCATION (insn));
> +  discriminator = compute_discriminator (INSN_LOCATION (insn));
>  }
>else
>  {
>filename = NULL;
>linenum = 0;
>columnnum = 0;
> +  discriminator = 0;
>  }
>
>if (filename == NULL)
>
> --
> Alexandre Oliva, freedom fighter   https://FSFLA.org/blogs/lxo
> Be the change, be Free! FSF Latin America board member
> GNU Toolchain EngineerFree Software Evangelist
> Hay que enGNUrecerse, pero sin perder la terGNUra jamás-GNUChe


Re: Fix PR37916 (unnecessary spilling)

2018-11-21 Thread Jeff Law
On 11/21/18 7:13 AM, Richard Biener wrote:
> On Wed, Nov 21, 2018 at 1:12 AM Jeff Law  wrote:
>>
>> On 11/20/18 6:42 AM, Michael Matz wrote:
>>> Hi,
>>>
>>> this bug report is about cris generating worse code since tree-ssa.  The
>>> effect is also visible on x86-64.  The symptom is that the work horse of
>>> adler32.c (from zlib) needs spills in the inner loop, while gcc 3 did not,
>>> and those spills go away with -fno-tree-reassoc.
>>>
>>> The underlying reason for the spills is register pressure, which could
>>> either be rectified by the pressure aware scheduling (which cris doesn't
>>> have), or by simply not generating high pressure code to begin with.  In
>>> this case it's TER which ultimately causes the register pressure to
>>> increase, and there are many plans in people minds how to fix this (make
>>> TER regpressure aware, do some regpressure scheduling on gimple, or even
>>> more ambitious things), but this patch doesn't tackle this.  Instead it
>>> makes reassoc not generate the situation which causes TER to run wild.
>>>
>>> TER increasing register pressure is a long standing problem and so it has
>>> some heuristics to avoid that.  One wobbly heuristic is that it doesn't
>>> TER expressions together that have the same base variable as their LHSs.
>>> But reassoc generates only anonymous SSA names for its temporary
>>> subexpressions, so that TER heuristic doesn't apply.  In this testcase
>>> it's even the case that reassoc doesn't really change much code (one
>>> addition moves from the end to the beginning of the inner loop), so that
>>> whole rewriting is even pointless.
>>>
>>> In any case, let's use copy_ssa_name instead of make_ssa_name, when we
>>> have an obvious LHS; that leads to TER making the same decisions with and
>>> without -fno-tree-reassoc, leading to the same register pressure and no
>>> spills.
> 
> I don't think this is OK.  Take one example, in rewrite_expr_tree for the 
> final
> recursion case we replace
> 
>a_1 = _2 + _3;
> 
> with sth else, like
> 
> _4 = _5 + 1;
> 
> so we compute a new value that may not have been computed before and
> certainly not into the user variable a.  If you change this to instead create
> 
>   a_4 = _5 + 1;
> 
> then this leads to wrong debug info showing a value for 'a' that never 
> existed.
> You can observe this with
But isn't the point to use an underlying SSA_NAME_VAR when we have one
that should be appropriate?  Are we just being too aggressive with using
copy_ssa_name?

jeff



[PATCH] Fix PR88133

2018-11-21 Thread Richard Biener


This fixes a bogus warning in bitmap.c by avoiding the problematic
transform of cunrolli, thereby eliding the elt->bits[0] test for
--disable-checking.

Bootstrapped (with and without --disable-checking) and tested on
x86_64-unknown-linux-gnu, applied.

Richard.

2018-11-21  Richard Biener  

PR bootstrap/88133
* bitmap.c (bitmap_last_set_bit): Refactor to avoid warning.

* Makefile.in (bitmap.o-warn): Remove again.

Index: gcc/bitmap.c
===
--- gcc/bitmap.c(revision 266340)
+++ gcc/bitmap.c(working copy)
@@ -1186,13 +1186,13 @@ bitmap_last_set_bit (const_bitmap a)
 elt = elt->next;
 
   bit_no = elt->indx * BITMAP_ELEMENT_ALL_BITS;
-  for (ix = BITMAP_ELEMENT_WORDS - 1; ix >= 0; ix--)
+  for (ix = BITMAP_ELEMENT_WORDS - 1; ix >= 1; ix--)
 {
   word = elt->bits[ix];
   if (word)
goto found_bit;
 }
-  gcc_unreachable ();
+  gcc_assert (elt->bits[ix] != 0);
  found_bit:
   bit_no += ix * BITMAP_WORD_BITS;
 #if GCC_VERSION >= 3004
Index: gcc/Makefile.in
===
--- gcc/Makefile.in (revision 266340)
+++ gcc/Makefile.in (working copy)
@@ -221,7 +221,6 @@ libgcov-merge-tool.o-warn = -Wno-error
 gimple-match.o-warn = -Wno-unused
 generic-match.o-warn = -Wno-unused
 dfp.o-warn = -Wno-strict-aliasing
-bitmap.o-warn = -Wno-error=array-bounds # PR 87926
 
 # All warnings have to be shut off in stage1 if the compiler used then
 # isn't gcc; configure determines that.  WARN_CFLAGS will be either
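
The shape of the refactored loop can be sketched in isolation as follows.
This is a simplified model, not the bitmap.c code: ELEMENT_WORDS and
last_nonzero_word are hypothetical names, and plain assert stands in for
gcc_assert.  The loop only tests the words above index 0; if none of them
is set, the index is left at 0 and word 0 is only checked by the assertion,
which disappears when checking is disabled.

```c
#include <assert.h>

/* Hypothetical stand-in for BITMAP_ELEMENT_WORDS.  */
#define ELEMENT_WORDS 2

/* Return the index of the highest nonzero word in WORDS.  The caller
   guarantees that at least one word is nonzero.  */
static int
last_nonzero_word (const unsigned long words[ELEMENT_WORDS])
{
  int ix;

  /* Stop at index 1: if no higher word is set, word 0 has to be.  */
  for (ix = ELEMENT_WORDS - 1; ix >= 1; ix--)
    if (words[ix])
      return ix;

  /* Here ix is 0, a valid index, so no out-of-bounds access is visible
     to the compiler even after unrolling.  */
  assert (words[ix] != 0);
  return ix;
}
```

With the old "ix >= 0" bound the loop exits with ix equal to -1, which is
the value that any code following the loop would then see.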


Re: Fix regression introduced by 88069

2018-11-21 Thread Richard Biener
On Wed, Nov 21, 2018 at 3:28 AM Jeff Law  wrote:
>
> Richi's recent change to fix 88069 is causing various targets to fail
> tree-ssa/20030711-2.c.  That test is verifying a variety of
> optimizations occur during the first DOM pass.
>
> Prior to Richi's change FRE1 would do some significant cleanups of the
> IL and as a result DOM was fully able to optimize the resultant code.

Hum...  I obviously missed the FAIL during testing somehow.  FRE1
behavior shouldn't change so I'll fixup.

Richard.

> After Richi's change we've got a redundant load in the IL.  After
> analyzing the CFG and IL it was clear that DOM *should* be able to
> remove the redundant load, but simply wasn't.
>
> DOM would discover that it could statically determine the result of a
> branch condition.  This resulted in one arm of the branch becoming
> unreachable.  That in turn caused some PHI nodes to become degenerates.
>
> Normally when a PHI node becomes a degenerate we record it as a copy in
> the const_and_copies table and *most* of the time we'll propagate the
> src value into uses of the dest.  But propagation is not guaranteed
> (there's a BZ around that issue you can find if you dig into the history
> of some of this code).
>
> Anyway, exposing the degenerate PHI *should* have exposed the redundant
> load, but we didn't record anything into the const/copies table for the
> virtual phi.  That's a conscious decision to avoid issues with
> overlapping lifetimes of virtual SSA_NAMEs.
>
> While investigating the history here I noticed Richi's little trick
> which allows propagation of virtuals if we propagate to all the uses.
> Twiddling DOM to use that same trick results in the virtual operand
> propagating.  That in turn allows DOM to see and remove the redundant load.
>
> Bootstrapped and regression tested on x86_64 where it fixes
> 20030711-2.c.  Also verified that it fixed various other targets where
> that test had started failing.
>
> Installing on the trunk.
>
> jeff


Re: Fix regression introduced by 88069

2018-11-21 Thread Jeff Law
On 11/21/18 7:21 AM, Richard Biener wrote:
> On Wed, Nov 21, 2018 at 3:28 AM Jeff Law  wrote:
>>
>> Richi's recent change to fix 88069 is causing various targets to fail
>> tree-ssa/20030711-2.c.  That test is verifying a variety of
>> optimizations occur during the first DOM pass.
>>
>> Prior to Richi's change FRE1 would do some significant cleanups of the
>> IL and as a result DOM was fully able to optimize the resultant code.
> 
> Hum...  I obviously missed the FAIL during testing somehow.  FRE1
> behavior shouldn't change so I'll fixup.
NP.  The change to DOM is probably a good thing independently.  I may go
back and factor that little hunk into a reusable function as we've got a
nearly identical copy elsewhere.

jeff


Re: Fix PR37916 (unnecessary spilling)

2018-11-21 Thread Richard Biener
On Wed, Nov 21, 2018 at 3:16 PM Jeff Law  wrote:
>
> On 11/21/18 7:13 AM, Richard Biener wrote:
> > On Wed, Nov 21, 2018 at 1:12 AM Jeff Law  wrote:
> >>
> >> On 11/20/18 6:42 AM, Michael Matz wrote:
> >>> Hi,
> >>>
> >>> this bug report is about cris generating worse code since tree-ssa.  The
> >>> effect is also visible on x86-64.  The symptom is that the work horse of
> >>> adler32.c (from zlib) needs spills in the inner loop, while gcc 3 did not,
> >>> and those spills go away with -fno-tree-reassoc.
> >>>
> >>> The underlying reason for the spills is register pressure, which could
> >>> either be rectified by the pressure aware scheduling (which cris doesn't
> >>> have), or by simply not generating high pressure code to begin with.  In
> >>> this case it's TER which ultimately causes the register pressure to
> >>> increase, and there are many plans in people minds how to fix this (make
> >>> TER regpressure aware, do some regpressure scheduling on gimple, or even
> >>> more ambitious things), but this patch doesn't tackle this.  Instead it
> >>> makes reassoc not generate the situation which causes TER to run wild.
> >>>
> >>> TER increasing register pressure is a long standing problem and so it has
> >>> some heuristics to avoid that.  One wobbly heuristic is that it doesn't
> >>> TER expressions together that have the same base variable as their LHSs.
> >>> But reassoc generates only anonymous SSA names for its temporary
> >>> subexpressions, so that TER heuristic doesn't apply.  In this testcase
> >>> it's even the case that reassoc doesn't really change much code (one
> >>> addition moves from the end to the beginning of the inner loop), so that
> >>> whole rewriting is even pointless.
> >>>
> >>> In any case, let's use copy_ssa_name instead of make_ssa_name, when we
> >>> have an obvious LHS; that leads to TER making the same decisions with and
> >>> without -fno-tree-reassoc, leading to the same register pressure and no
> >>> spills.
> >
> > I don't think this is OK.  Take one example, in rewrite_expr_tree for the 
> > final
> > recursion case we replace
> >
> >a_1 = _2 + _3;
> >
> > with sth else, like
> >
> > _4 = _5 + 1;
> >
> > so we compute a new value that may not have been computed before and
> > certainly not into the user variable a.  If you change this to instead 
> > create
> >
> >   a_4 = _5 + 1;
> >
> > then this leads to wrong debug info showing a value for 'a' that never 
> > existed.
> > You can observe this with
> But isn't the point to use an underlying SSA_NAME_VAR when we have one
> that should be appropriate?  Are we just being too aggressive with using
> copy_ssa_name?

Well, sure - there _might_ be places that could use copy_ssa_name in reassoc
but I doubt it.  Usually code-generation should not use copy_ssa_name given
the aforementioned issues.

Richard.

>
> jeff
>


[PATCH] Alternate fix for PR87229, fix PR88112

2018-11-21 Thread Richard Biener


My previous fix for PR87229 was too aggressive.  The following
simply teaches the LTO streamer to deal with CALL_EXPRs, support
for which was already in place.  I've amended it with two
missing pieces, streaming of CALL_EXPR_BY_DESCRIPTOR and CALL_EXPR_IFN.

LTO bootstrapped and tested on x86_64-unknown-linux-gnu with Ada enabled.

Any objections?  As said elsewhere I'm looking for sth that is
reasonably safe for GCC 8 as well given the PR is a regression there.

Richard.

2018-11-21  Richard Biener  

PR lto/87229
PR lto/88112
* lto-streamer-out.c (lto_is_streamable): Allow CALL_EXPRs
which can appear in size expressions.
* tree-streamer-in.c (unpack_ts_base_value_fields): Stream
CALL_EXPR_BY_DESCRIPTOR.
(streamer_read_tree_bitfields): Stream CALL_EXPR_IFN.
* tree-streamer-out.c (pack_ts_base_value_fields): Stream
CALL_EXPR_BY_DESCRIPTOR.
(streamer_write_tree_bitfields): Stream CALL_EXPR_IFN.

Revert
PR lto/87229
* tree.c (free_lang_data_in_one_sizepos): Free non-gimple-val
sizepos values.


Index: gcc/lto-streamer-out.c
===
--- gcc/lto-streamer-out.c  (revision 266308)
+++ gcc/lto-streamer-out.c  (working copy)
@@ -306,7 +306,6 @@ lto_is_streamable (tree expr)
  name version in lto_output_tree_ref (see output_ssa_names).  */
   return !is_lang_specific (expr)
 && code != SSA_NAME
-&& code != CALL_EXPR
 && code != LANG_TYPE
 && code != MODIFY_EXPR
 && code != INIT_EXPR
Index: gcc/tree-streamer-in.c
===
--- gcc/tree-streamer-in.c  (revision 266308)
+++ gcc/tree-streamer-in.c  (working copy)
@@ -158,6 +158,11 @@ unpack_ts_base_value_fields (struct bitp
   SSA_NAME_IS_DEFAULT_DEF (expr) = (unsigned) bp_unpack_value (bp, 1);
   bp_unpack_value (bp, 8);
 }
+  else if (TREE_CODE (expr) == CALL_EXPR)
+{
+  CALL_EXPR_BY_DESCRIPTOR (expr) = (unsigned) bp_unpack_value (bp, 1);
+  bp_unpack_value (bp, 8);
+}
   else
 bp_unpack_value (bp, 9);
 }
@@ -521,6 +526,8 @@ streamer_read_tree_bitfields (struct lto
MR_DEPENDENCE_BASE (expr)
  = (unsigned)bp_unpack_value (&bp, sizeof (short) * 8);
}
+  else if (code == CALL_EXPR)
+   CALL_EXPR_IFN (expr) = bp_unpack_enum (&bp, internal_fn, IFN_LAST);
 }
 
   if (CODE_CONTAINS_STRUCT (code, TS_BLOCK))
Index: gcc/tree-streamer-out.c
===
--- gcc/tree-streamer-out.c (revision 266308)
+++ gcc/tree-streamer-out.c (working copy)
@@ -129,6 +129,11 @@ pack_ts_base_value_fields (struct bitpac
   bp_pack_value (bp, SSA_NAME_IS_DEFAULT_DEF (expr), 1);
   bp_pack_value (bp, 0, 8);
 }
+  else if (TREE_CODE (expr) == CALL_EXPR)
+{
+  bp_pack_value (bp, CALL_EXPR_BY_DESCRIPTOR (expr), 1);
+  bp_pack_value (bp, 0, 8);
+}
   else
 bp_pack_value (bp, 0, 9);
 }
@@ -457,6 +462,8 @@ streamer_write_tree_bitfields (struct ou
  if (MR_DEPENDENCE_CLIQUE (expr) != 0)
bp_pack_value (&bp, MR_DEPENDENCE_BASE (expr), sizeof (short) * 8);
}
+  else if (code == CALL_EXPR)
+   bp_pack_enum (&bp, internal_fn, IFN_LAST, CALL_EXPR_IFN (expr));
 }
 
   if (CODE_CONTAINS_STRUCT (code, TS_BLOCK))
Index: gcc/tree.c
===
--- gcc/tree.c  (revision 266308)
+++ gcc/tree.c  (working copy)
@@ -5254,13 +5254,6 @@ free_lang_data_in_one_sizepos (tree *exp
   tree expr = *expr_p;
   if (CONTAINS_PLACEHOLDER_P (expr))
 *expr_p = build0 (PLACEHOLDER_EXPR, TREE_TYPE (expr));
-  /* ???  We have to reset all non-GIMPLE sizepos because those eventually
- refer to trees we cannot stream.  See for example PR87229 which
- shows an example with non-gimplified abstract origins in C++.
- Note this should only happen for abstract copies so setting sizes
- to NULL is OK (but we cannot easily assert this).  */
-  else if (expr && !is_gimple_val (expr))
-*expr_p = NULL_TREE;
 }
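
The reason both tree-streamer-in.c and tree-streamer-out.c change together
is that the bit layout has to stay symmetric: whatever the writer packs,
the reader must unpack with the same widths in the same order.  A toy
sketch of that invariant, assuming a single 64-bit word as the stream
(struct bitpack, pack_value, unpack_value and the two *_call_flags helpers
are illustrative names, not the GCC bitpack API):

```c
#include <assert.h>
#include <stdint.h>

/* Toy bit stream: one 64-bit word plus a cursor.  */
struct bitpack
{
  uint64_t word;
  unsigned pos;
};

/* Append the low NBITS bits of VAL to BP.  */
static void
pack_value (struct bitpack *bp, uint64_t val, unsigned nbits)
{
  assert (nbits > 0 && nbits < 64 && bp->pos + nbits <= 64);
  uint64_t mask = (UINT64_C (1) << nbits) - 1;
  bp->word |= (val & mask) << bp->pos;
  bp->pos += nbits;
}

/* Read the next NBITS bits from BP.  */
static uint64_t
unpack_value (struct bitpack *bp, unsigned nbits)
{
  assert (nbits > 0 && nbits < 64 && bp->pos + nbits <= 64);
  uint64_t mask = (UINT64_C (1) << nbits) - 1;
  uint64_t val = (bp->word >> bp->pos) & mask;
  bp->pos += nbits;
  return val;
}

/* Writer side: one flag bit followed by 8 padding bits, 9 bits in
   total, the same width the other arms of the if chain consume.  */
static void
write_call_flags (struct bitpack *bp, int by_descriptor)
{
  pack_value (bp, by_descriptor != 0, 1);
  pack_value (bp, 0, 8);
}

/* Reader side: must mirror the writer exactly.  */
static int
read_call_flags (struct bitpack *bp)
{
  int by_descriptor = (int) unpack_value (bp, 1);
  (void) unpack_value (bp, 8);
  return by_descriptor;
}
```

If the reader consumed 9 padding bits without extracting the flag, the
cursor would still line up but the flag would be lost; if it consumed a
different total width, every field after it would be misread.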
 
 


Re: Fix regression introduced by 88069

2018-11-21 Thread Richard Biener
On Wed, Nov 21, 2018 at 3:23 PM Jeff Law  wrote:
>
> On 11/21/18 7:21 AM, Richard Biener wrote:
> > On Wed, Nov 21, 2018 at 3:28 AM Jeff Law  wrote:
> >>
> >> Richi's recent change to fix 88069 is causing various targets to fail
> >> tree-ssa/20030711-2.c.  That test is verifying a variety of
> >> optimizations occur during the first DOM pass.
> >>
> >> Prior to Richi's change FRE1 would do some significant cleanups of the
> >> IL and as a result DOM was fully able to optimize the resultant code.
> >
> > Hum...  I obviously missed the FAIL during testing somehow.  FRE1
> > behavior shouldn't change so I'll fixup.
> NP.  The change to DOM is probably a good thing independently.  I may go
> back and factor that little hunk into a reusable function as we've got a
> nearly identical copy elsewhere.

I am testing the following.

Richard.

2018-11-21  Richard Biener  

PR tree-optimization/88069
* tree-ssa-sccvn.c (visit_phi): Tweak previous fix to not
apply to default defs.

Index: gcc/tree-ssa-sccvn.c
===
--- gcc/tree-ssa-sccvn.c(revision 266345)
+++ gcc/tree-ssa-sccvn.c(working copy)
@@ -4205,6 +4205,7 @@ visit_phi (gimple *phi, bool *inserted,
 given that allows us to escape a region in alias walking.  */
   || (sameval
  && TREE_CODE (sameval) == SSA_NAME
+ && !SSA_NAME_IS_DEFAULT_DEF (sameval)
  && SSA_NAME_IS_VIRTUAL_OPERAND (sameval)
  && (SSA_VAL (sameval, &visited_p), !visited_p)))
 /* Note this just drops to VARYING without inserting the PHI into


> jeff


Re: [PATCH v2 1/3] Allow memory operands for PTWRITE

2018-11-21 Thread Richard Biener
On Tue, Nov 20, 2018 at 7:36 PM Andi Kleen  wrote:
>
> On Tue, Nov 20, 2018 at 11:53:15AM +0100, Richard Biener wrote:
> > On Fri, Nov 16, 2018 at 8:07 AM Uros Bizjak  wrote:
> > >
> > > On Fri, Nov 16, 2018 at 4:57 AM Andi Kleen  wrote:
> > > >
> > > > From: Andi Kleen 
> > > >
> > > > The earlier PTWRITE builtin definition was unnecessarily restrictive,
> > > > only allowing register input to PTWRITE. The instruction actually
> > > > supports memory operands too, so allow that too.
> > > >
> > > > gcc/:
> > > >
> > > > 2018-11-15  Andi Kleen  
> > > >
> > > > * config/i386/i386.md: Allow memory operands to ptwrite.
> > >
> > > OK.
> >
> > Btw, I wonder why the ptwrite builtin is in SPECIAL_ARGS2
> > commented as /* Add all special builtins with variable number of operands. 
> > */?
>
> i think i put it in the same place as a similar builtin. AFAIK
> those others don't have variable arguments either, so the comment
> may be wrong?

No idea...

> >
> > On the GIMPLE level this builtin also has quite some (bad) effects on
> > alias analysis and any related optimization (vectorization, etc.).  I'll 
> > have
> > to see where the instrumenting pass now resides.
>
> It's fairly late now.

OK, saw that.

> Any suggestions for improvements? At some point I removed the edges
> like the old MPX builtins to minimize memory usage, but that was
> removed during an earlier review cycle.

I guess it's fine now - it will have an effect on TER, limiting its ability
a bit, but otherwise the builtin only lives up to RTL expansion where
it becomes the UNSPEC_VOLATILE.  As said, instrumenting on
RTL would be an improvement, I think HJ might be able to help with that.

Richard.

> -Andi


Re: [PATCH v2 1/3] Allow memory operands for PTWRITE

2018-11-21 Thread H.J. Lu
On Wed, Nov 21, 2018 at 6:48 AM Richard Biener
 wrote:
>
> On Tue, Nov 20, 2018 at 7:36 PM Andi Kleen  wrote:
> >
> > On Tue, Nov 20, 2018 at 11:53:15AM +0100, Richard Biener wrote:
> > > On Fri, Nov 16, 2018 at 8:07 AM Uros Bizjak  wrote:
> > > >
> > > > On Fri, Nov 16, 2018 at 4:57 AM Andi Kleen  wrote:
> > > > >
> > > > > From: Andi Kleen 
> > > > >
> > > > > The earlier PTWRITE builtin definition was unnecessarily restrictive,
> > > > > only allowing register input to PTWRITE. The instruction actually
> > > > > supports memory operands too, so allow that too.
> > > > >
> > > > > gcc/:
> > > > >
> > > > > 2018-11-15  Andi Kleen  
> > > > >
> > > > > * config/i386/i386.md: Allow memory operands to ptwrite.
> > > >
> > > > OK.
> > >
> > > Btw, I wonder why the ptwrite builtin is in SPECIAL_ARGS2
> > > commented as /* Add all special builtins with variable number of 
> > > operands. */?
> >
> > i think i put it in the same place as a similar builtin. AFAIK
> > those others don't have variable arguments either, so the comment
> > may be wrong?
>
> No idea...
>
> > >
> > > On the GIMPLE level this builtin also has quite some (bad) effects on
> > > alias analysis and any related optimization (vectorization, etc.).  I'll 
> > > have
> > > to see where the instrumenting pass now resides.
> >
> > It's fairly late now.
>
> OK, saw that.
>
> > Any suggestions for improvements? At some point I removed the edges
> > like the old MPX builtins to minimize memory usage, but that was
> > removed during an earlier review cycle.
>
> I guess it's fine now - it will have an effect on TER, limiting its ability
> a bit, but otherwise the builtin only lives up to RTL expansion where
> it becomes the UNSPEC_VOLATILE.  As said, instrumenting on
> RTL would be an improvement, I think HJ might be able to help with that.
>

What are the issues?

-- 
H.J.


[C++ PATCH] Improve locations of id-expressions and operator "" (PR c++/87386, take 2)

2018-11-21 Thread Jakub Jelinek
Hi!

On Tue, Nov 20, 2018 at 04:32:26PM -0500, David Malcolm wrote:
> This makes the fix-it hint wrong: after the fix-it is applied, it will
> become
>   return color;
> (which won't compile), rather than
>   return O::color;
> which will.

Here is an updated version of the patch, which still uses the whole
range of the id-expression when it is parsed as a primary expression, but does
so not in cp_parser_id_expression, but in cp_parser_primary_expression after
all the diagnostics.  Thus all the spell-checking etc. tests behave as
before: they underline only the part after the last ::, and only
what uses the expression later on uses the whole range.

The remaining needed tweaks in the testcases are minor and look correct to
me, e.g. for D::Bar the column is not at D but at B,
similarly for operator"" _F the column is under _ rather than the first o.
The libstdc++ changes are because there are several large expressions like:
  something::value
and we used to diagnose on the something line (column of s)
but now we warn on value line (column of v).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2018-11-21  Jakub Jelinek  

PR c++/87386
* parser.c (cp_parser_primary_expression): Use
id_expression.get_location () instead of id_expr_token->location.
Adjust the range from id_expr_token->location to
id_expression.get_finish ().
(cp_parser_operator): For operator "" make a range from "" to
the end of the suffix with caret at the start of the id.
gcc/testsuite/
* g++.dg/diagnostic/pr87386.C: New test.
* g++.dg/parse/error17.C: Adjust expected diagnostics.
* g++.dg/cpp0x/pr51420.C: Likewise.
* g++.dg/cpp0x/udlit-declare-neg.C: Likewise.
* g++.dg/cpp0x/udlit-member-neg.C: Likewise.
libstdc++-v3/
* testsuite/20_util/scoped_allocator/69293_neg.cc: Adjust expected
line.
* testsuite/20_util/uses_allocator/cons_neg.cc: Likewise.
* testsuite/20_util/uses_allocator/69293_neg.cc: Likewise.
* testsuite/experimental/propagate_const/requirements2.cc: Likewise.
* testsuite/experimental/propagate_const/requirements3.cc: Likewise.
* testsuite/experimental/propagate_const/requirements4.cc: Likewise.
* testsuite/experimental/propagate_const/requirements5.cc: Likewise.

--- gcc/cp/parser.c.jj  2018-11-21 11:35:43.698053550 +0100
+++ gcc/cp/parser.c 2018-11-21 12:23:20.701047164 +0100
@@ -5604,7 +5604,7 @@ cp_parser_primary_expression (cp_parser
  /*is_namespace=*/false,
  /*check_dependency=*/true,
  &ambiguous_decls,
- id_expr_token->location);
+ id_expression.get_location ());
/* If the lookup was ambiguous, an error will already have
   been issued.  */
if (ambiguous_decls)
@@ -5675,7 +5675,7 @@ cp_parser_primary_expression (cp_parser
if (parser->local_variables_forbidden_p
&& local_variable_p (decl))
  {
-   error_at (id_expr_token->location,
+   error_at (id_expression.get_location (),
  "local variable %qD may not appear in this context",
  decl.get_value ());
return error_mark_node;
@@ -5694,7 +5694,8 @@ cp_parser_primary_expression (cp_parser
 id_expression.get_location ()));
if (error_msg)
  cp_parser_error (parser, error_msg);
-   decl.set_location (id_expr_token->location);
+   decl.set_location (id_expression.get_location ());
+   decl.set_range (id_expr_token->location, id_expression.get_finish ());
return decl;
   }
 
@@ -15051,7 +15052,7 @@ cp_literal_operator_id (const char* name
 static cp_expr
 cp_parser_operator (cp_parser* parser)
 {
-  tree id = NULL_TREE;
+  cp_expr id = NULL_TREE;
   cp_token *token;
   bool utf8 = false;
 
@@ -15339,8 +15340,9 @@ cp_parser_operator (cp_parser* parser)
if (id != error_mark_node)
  {
const char *name = IDENTIFIER_POINTER (id);
-   id = cp_literal_operator_id (name);
+   *id = cp_literal_operator_id (name);
  }
+   id.set_range (start_loc, id.get_finish ());
return id;
   }
 
@@ -15364,7 +15366,8 @@ cp_parser_operator (cp_parser* parser)
   id = error_mark_node;
 }
 
-  return cp_expr (id, start_loc);
+  id.set_location (start_loc);
+  return id;
 }
 
 /* Parse a template-declaration.
--- gcc/testsuite/g++.dg/diagnostic/pr87386.C.jj    2018-11-21 14:40:58.377769686 +0100
+++ gcc/testsuite/g++.dg/diagnostic/pr87386.C   2018-11-21 14:40:19.064410070 +0100
@@ -0,0 +1,18 @@
+// PR c++/87386
+// { dg-do compile { target c++11 } }
+// { dg-options "-fdiagnostics-show-caret" }
+
+namespace foo {
+  template struct test

[C++ PATCH] Remove useless tokens from cp_parser_linkage_specification (PR c++/87393)

2018-11-21 Thread Jakub Jelinek
Hi!

David's r251026 change added a weird trailing ->location.
It doesn't seem to be useful for anything; matching_braces has its own code
to track locations, so no need to do anything in the caller (and no other
spot does something like that).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2018-11-21  Jakub Jelinek  

PR c++/87393
* parser.c (cp_parser_linkage_specification): Remove useless
dereference of the consume_open method result.

--- gcc/cp/parser.c.jj  2018-11-21 08:58:56.190250827 +0100
+++ gcc/cp/parser.c 2018-11-21 10:02:40.690687576 +0100
@@ -14223,7 +14223,7 @@ cp_parser_linkage_specification (cp_pars
 
   /* Consume the `{' token.  */
   matching_braces braces;
-  braces.consume_open (parser)->location;
+  braces.consume_open (parser);
   /* Parse the declarations.  */
   cp_parser_declaration_seq_opt (parser);
   /* Look for the closing `}'.  */

Jakub


Re: [PATCH, ARM, ping3] PR85434: Prevent spilling of stack protector guard's address on ARM

2018-11-21 Thread Kyrill Tkachov

Hi Thomas,

Sorry for the delay.

On 16/11/18 14:56, Thomas Preudhomme wrote:

Ping?

Best regards,

Thomas

On Sat, 10 Nov 2018 at 15:07, Thomas Preudhomme
 wrote:

Thanks Kyrill.

Updated patch in attachment. Best regards,

Thomas
On Thu, 8 Nov 2018 at 15:53, Kyrill Tkachov  wrote:

Hi Thomas,

On 08/11/18 09:52, Thomas Preudhomme wrote:

Ping?

Best regards,

Thomas

On Thu, 1 Nov 2018 at 16:03, Thomas Preudhomme
 wrote:

Ping?

Best regards,

Thomas
On Fri, 26 Oct 2018 at 22:41, Thomas Preudhomme
 wrote:

Hi,

Please find updated patch to fix PR85434: spilling of stack protector
guard's address on ARM. Quite a few changes have been made to the ARM
part since last round of review so I think it makes more sense to
review it anew. Ran bootstrap + regression testsuite + glibc build +
glibc regression testsuite for Arm and Thumb-2 and bootstrap +
regression testsuite for Thumb-1. GCC's regression testsuite was run
in 3 configurations in all those cases:

- default configuration (no RUNTESTFLAGS)
- with -fstack-protector-all
- with -fPIC -fstack-protector-all (to exercise both codepath in stack
protector's split code)

None of these shows any regression beyond some new scan failures with
-fstack-protector-all or -fPIC due to unexpected code sequences for the
testcases concerned and some guality swing due to less optimization
with the new stack protector on.

Patch description and ChangeLog below.

In case of high register pressure in PIC mode, address of the stack
protector's guard can be spilled on ARM targets as shown in PR85434,
thus allowing an attacker to control what the canary would be compared
against. Although ARM lacks stack_protect_set and stack_protect_test insn
patterns, defining them does not help, as the address is expanded
regularly and the patterns only deal with the copy and test of the
guard with the canary.

This problem does not occur for x86 targets because the PIC access and
the test can be done in the same instruction. AArch64 is exempt too
because its PIC access insn patterns are movs of UNSPEC, which prevents
the second access in the epilogue from being CSEd in the cse_local pass
with the first access in the prologue.

The approach followed here is to create new "combined" set and test
standard pattern names that take the unexpanded guard and do the set or
test. This allows the target to use an opaque pattern (eg. using UNSPEC)
to hide the individual instructions being generated to the compiler and
split the pattern into generic load, compare and branch instruction
after register allocator, therefore avoiding any spilling. This is here
implemented for the ARM targets. For targets not implementing these new
standard pattern names, the existing stack_protect_set and
stack_protect_test pattern names are used.

To be able to split PIC access after register allocation, the functions
had to be augmented to force a new PIC register load and to control
which register it loads into. This is because sharing the PIC register
between prologue and epilogue could lead to spilling due to CSE again
which an attacker could use to control what the canary gets compared
against.
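
As a rough illustration of why a spilled guard address matters, consider
the following toy model.  It is only a sketch: struct frame and the helper
functions are hypothetical, the "frame" is an ordinary struct rather than a
real stack frame, and the overflow is simulated by plain assignments.

```c
#include <assert.h>
#include <stdint.h>

/* Toy frame layout, for illustration only.  */
struct frame
{
  uintptr_t canary;                    /* copy of the guard made in the prologue */
  const uintptr_t *spilled_guard_addr; /* guard address spilled under pressure */
};

/* Prologue: store the canary and, in the problematic case, spill the
   address of the guard into the frame as well.  */
static void
prologue (struct frame *f, const uintptr_t *guard)
{
  f->canary = *guard;
  f->spilled_guard_addr = guard;
}

/* Vulnerable epilogue check: reloads the guard address from the frame.  */
static int
check_with_spilled_address (const struct frame *f)
{
  return f->canary == *f->spilled_guard_addr;
}

/* Robust epilogue check: the guard address is recomputed and never
   lives in attacker-writable memory.  */
static int
check_with_fresh_address (const struct frame *f, const uintptr_t *guard)
{
  return f->canary == *guard;
}

/* Simulated overflow that overwrites both the canary and the spilled
   address with attacker-chosen data.  */
static void
overflow (struct frame *f, const uintptr_t *attacker_value)
{
  f->canary = *attacker_value;
  f->spilled_guard_addr = attacker_value;
}
```

After overflow, check_with_spilled_address still reports a match because
both sides of the comparison are attacker-controlled, while
check_with_fresh_address detects the corruption.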

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2018-10-26  Thomas Preud'homme  

* target-insns.def (stack_protect_combined_set): Define new standard
pattern name.
(stack_protect_combined_test): Likewise.
* cfgexpand.c (stack_protect_prologue): Try new
stack_protect_combined_set pattern first.
* function.c (stack_protect_epilogue): Try new
stack_protect_combined_test pattern first.
* config/arm/arm.c (require_pic_register): Add pic_reg and compute_now
parameters to control which register to use as PIC register and force
reloading PIC register respectively.  Insert in the stream of insns if
possible.
(legitimize_pic_address): Expose above new parameters in prototype and
adapt recursive calls accordingly.  Use pic_reg if non null instead of
cached one.
(arm_load_pic_register): Add pic_reg parameter and use it if non null.
(arm_legitimize_address): Adapt to new legitimize_pic_address
prototype.
(thumb_legitimize_address): Likewise.
(arm_emit_call_insn): Adapt to require_pic_register prototype change.
(arm_expand_prologue): Adapt to arm_load_pic_register prototype change.
(thumb1_expand_prologue): Likewise.
* config/arm/arm-protos.h (legitimize_pic_address): Adapt to prototype
change.
(arm_load_pic_register): Likewise.
* config/arm/predicates.md (guard_addr_operand): New predicate.
(guard_operand): New predicate.
* config/arm/arm.md (movsi expander): Adapt to legitimize_pic_address
prototype change.
(builtin_setjmp_receiver expander): Adapt to thumb1_expand_prologue
prototype change.
(stack_protect_combined_set): New expander.
(stack_protect_combined_set_insn): New insn_and_split pattern.
(stack_protect_set_insn): New insn pattern.
(stack_protect_combined_test): New expander.
(stack_protect_combined_test_insn): New insn_and_split pattern.
(arm_stack_protect_test_insn): New insn pattern.
* config/arm/thumb1.md (thumb1_stack_protect_test_insn):

Re: Patch ping (was Re: [PATCH] Fix aarch64_compare_and_swap* constraints (PR target/87839))

2018-11-21 Thread James Greenhalgh
On Tue, Nov 20, 2018 at 11:04:46AM -0600, Jakub Jelinek wrote:
> Hi!
> 
> On Tue, Nov 13, 2018 at 10:28:16AM +0100, Jakub Jelinek wrote:
> > 2018-11-13  Jakub Jelinek  
> > 
> > PR target/87839
> > * config/aarch64/atomics.md (@aarch64_compare_and_swap): Use
> > rIJ constraint for aarch64_plus_operand rather than rn.
> > 
> > * gcc.target/aarch64/pr87839.c: New test.
> 
> I'd like to ping this patch, Kyrill had kindly tested it, ok for trunk?

OK.

Thanks,
James


Re: [C++ PATCH] Remove useless tokens from cp_parser_linkage_specification (PR c++/87393)

2018-11-21 Thread David Malcolm
On Wed, 2018-11-21 at 16:59 +0100, Jakub Jelinek wrote:
> Hi!
> 
> David's r251026 change added a weird trailing ->location.
> It doesn't seem to be useful for anything, matching_braces has its
> own code
> to track locations, so no need to do anything in the caller (and no
> other
> spot does something like that).
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2018-11-21  Jakub Jelinek  
> 
>   PR c++/87393
>   * parser.c (cp_parser_linkage_specification): Remove useless
>   dereference of the consume_open method result.
> 
> --- gcc/cp/parser.c.jj  2018-11-21 08:58:56.190250827 +0100
> +++ gcc/cp/parser.c     2018-11-21 10:02:40.690687576 +0100
> @@ -14223,7 +14223,7 @@ cp_parser_linkage_specification (cp_pars
>  
>    /* Consume the `{' token.  */
>    matching_braces braces;
> -  braces.consume_open (parser)->location;
> +  braces.consume_open (parser);
>    /* Parse the declarations.  */
>    cp_parser_declaration_seq_opt (parser);
>    /* Look for the closing `}'.  */

Oops; looks like a stray edit by me.

Thanks for catching this; OK to remove it.

Dave


Re: [PATCH 1/3][GCC] Add new target hook asm_post_cfi_startproc

2018-11-21 Thread Sam Tebbs

On 11/2/18 6:07 PM, Sam Tebbs wrote:
> On 11/02/2018 05:28 PM, Sam Tebbs wrote:
>
>> Hi all,
>>
>> This patch adds a new target hook called "asm_post_cfi_startproc". This hook is
>> intended to be used by the aarch64 backend to emit a directive that enables
>> support for unwinding frames signed with the pointer authentication B-key. This
>> hook is triggered after the ".cfi_startproc" directive is emitted in
>> gcc/dwarf2out.c.
>>
>> Bootstrapped on aarch64-none-linux-gnu and tested on aarch64-none-elf with no
>> regressions.
>>
>> Ok for trunk?
>>
>> gcc/
>> 2018-11-02  Sam Tebbs
>>
>>  * doc/tm.texi (TARGET_ASM_POST_CFI_STARTPROC): Define.
>>  * doc/tm.texi.in (TARGET_ASM_POST_CFI_STARTPROC): Define.
>>  * dwarf2out.c (dwarf2out_do_cfi_startproc): Trigger the hook.
>>  * hooks.c (hook_void_FILEptr_tree): Define.
>>  * hooks.h (hook_void_FILEptr_tree): Define.
>>  * target.def (post_cfi_startproc): Define.
> CCing global reviewers and dwarf maintainers.
>
ping


Re: [PATCH v3] [aarch64] Add CPU support for Ampere Computing's eMAG.

2018-11-21 Thread Richard Earnshaw (lists)
On 20/11/2018 18:00, Christoph Muellner wrote:
> Tested with "make check" and no regressions found.
> 
> This patch depends on the struct xgene1_prefetch_tune,
> which has been acknowledged already:
> https://gcc.gnu.org/ml/gcc-patches/2018-11/msg00985.html
> 
> *** gcc/ChangeLog ***
> 
> 2018-xx-xx  Christoph Muellner 
> 
>   * config/aarch64/aarch64-cores.def: Define emag.
>   * config/aarch64/aarch64-tune.md: Regenerated with emag.
>   * config/aarch64/aarch64.c (emag_tunings): New struct.
>   * doc/invoke.texi: Document mtune value.

OK.

R.

> 
> Signed-off-by: Christoph Muellner 
> ---
>  gcc/config/aarch64/aarch64-cores.def |  3 +++
>  gcc/config/aarch64/aarch64-tune.md   |  2 +-
>  gcc/config/aarch64/aarch64.c | 25 +
>  gcc/doc/invoke.texi  |  2 +-
>  4 files changed, 30 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/config/aarch64/aarch64-cores.def 
> b/gcc/config/aarch64/aarch64-cores.def
> index 1f3ac56..68cca00 100644
> --- a/gcc/config/aarch64/aarch64-cores.def
> +++ b/gcc/config/aarch64/aarch64-cores.def
> @@ -61,6 +61,9 @@ AARCH64_CORE("thunderxt88",   thunderxt88,   thunderx,  8A, 
>  AARCH64_FL_FOR_ARCH
>  AARCH64_CORE("thunderxt81",   thunderxt81,   thunderx,  8A,  
> AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  0x43, 
> 0x0a2, -1)
>  AARCH64_CORE("thunderxt83",   thunderxt83,   thunderx,  8A,  
> AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  0x43, 
> 0x0a3, -1)
>  
> +/* Ampere Computing cores. */
> +AARCH64_CORE("emag",emag,  xgene1,8A,  AARCH64_FL_FOR_ARCH8 
> | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, emag, 0x50, 0x000, 3)
> +
>  /* APM ('P') cores. */
>  AARCH64_CORE("xgene1",  xgene1,xgene1,8A,  AARCH64_FL_FOR_ARCH8, 
> xgene1, 0x50, 0x000, -1)
>  
> diff --git a/gcc/config/aarch64/aarch64-tune.md 
> b/gcc/config/aarch64/aarch64-tune.md
> index fade1d4..2fc7f03 100644
> --- a/gcc/config/aarch64/aarch64-tune.md
> +++ b/gcc/config/aarch64/aarch64-tune.md
> @@ -1,5 +1,5 @@
>  ;; -*- buffer-read-only: t -*-
>  ;; Generated automatically by gentune.sh from aarch64-cores.def
>  (define_attr "tune"
> - 
> "cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,ares,tsv110,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55"
> + 
> "cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,emag,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,ares,tsv110,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55"
>   (const (symbol_ref "((enum attr_tune) aarch64_tune)")))
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index f7f88a9..995aafe 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -957,6 +957,31 @@ static const struct tune_params xgene1_tunings =
>&xgene1_prefetch_tune
>  };
>  
> +static const struct tune_params emag_tunings =
> +{
> +  &xgene1_extra_costs,
> +  &xgene1_addrcost_table,
> +  &xgene1_regmove_cost,
> +  &xgene1_vector_cost,
> +  &generic_branch_cost,
> +  &xgene1_approx_modes,
> +  6, /* memmov_cost  */
> +  4, /* issue_rate  */
> +  AARCH64_FUSE_NOTHING, /* fusible_ops  */
> +  "16",  /* function_align.  */
> +  "16",  /* jump_align.  */
> +  "16",  /* loop_align.  */
> +  2, /* int_reassoc_width.  */
> +  4, /* fp_reassoc_width.  */
> +  1, /* vec_reassoc_width.  */
> +  2, /* min_div_recip_mul_sf.  */
> +  2, /* min_div_recip_mul_df.  */
> +  17,/* max_case_values.  */
> +  tune_params::AUTOPREFETCHER_OFF,   /* autoprefetcher_model.  */
> +  (AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS), /* tune_flags.  */
> +  &xgene1_prefetch_tune
> +};
> +
>  static const struct tune_params qdf24xx_tunings =
>  {
>&qdf24xx_extra_costs,
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index e016dce..ac81fb2 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -15288,7 +15288,7 @@ Specify the name of the target processor for which 
> GCC should tune the
>  performance of the code.  Permissible values for this option are:
>  @samp{generic}, @samp{cortex-a35}, @samp{cortex-a53}, @samp{cortex-a55},
>  @samp{cortex-a57}, @samp{cortex-a72}, @samp{cortex-a73}, @samp{cortex-a75},
> -@samp{cortex-a76}, @samp{ares}, @samp{exynos-m1}, @samp{falkor},
> +@samp{cortex-a76}, @samp{ares}, @samp{exynos-m1}, @samp{emag}, @samp{falkor},
>  @samp{qdf24xx}, @samp{saphira}, @samp{phecda}, @samp{xgene1}, @samp{vulcan},
>  @samp{thunderx}, @samp{thunderxt88}, @samp{thunderxt88p1}, 
> @samp{thunderxt81},
>  @samp{tsv110}, @samp{thunderxt83}, @samp{thund

Re: [PATCH] handle unusual targets in -Wbuiltin-declaration-mismatch (PR 88098)

2018-11-21 Thread Martin Sebor

On 11/21/18 6:08 AM, Rainer Orth wrote:
> Hi Martin,
>
>> By calling builtin_decl_explicit rather than builtin_decl_implicit
>> the updated patch in the attachment avoids test failures due to
>> missing warnings on targets with support for long double but whose
>> libc doesn't support C99 functions like fabsl (such as apparently
>> aarch64-linux).
>>
>> [...]
>>
>> gcc/testsuite/ChangeLog:
>>
>> PR testsuite/88098
>> * gcc.dg/Wbuiltin-declaration-mismatch-4.c: Adjust.
>> * gcc.dg/Wbuiltin-declaration-mismatch-5.c: New test.
>
> is the Wbuiltin-declaration-mismatch-5.c testcase still supposed to be
> part of the patch?  It's in the ChangeLog, but missing from the revised
> patch.

It should still be there.  I must have excluded it by accident.
I will make sure to include it in the commit.

Thanks for pointing it out!
Martin


Re: [PATCH] Fix up 20181120-1.c testcase on big-endian (PR rtl-optimization/85925)

2018-11-21 Thread Segher Boessenkool
Hi,

On Wed, Nov 21, 2018 at 02:13:55PM +0100, Jakub Jelinek wrote:
> As mentioned in the PR, the testcase fails on big-endian targets.
> The following patch tweaks it so that it does not fail there and still
> checks for the original bug.

It relies on a certain bitfield layout, not just on LE.  I think the
testcase should run only on those specific targets where it works.  I don't
see how this patch would fix the problem for BE, btw.


Segher


Re: [PATCH] Fix up 20181120-1.c testcase on big-endian (PR rtl-optimization/85925)

2018-11-21 Thread Jakub Jelinek
On Wed, Nov 21, 2018 at 11:23:38AM -0600, Segher Boessenkool wrote:
> Hi,
> 
> On Wed, Nov 21, 2018 at 02:13:55PM +0100, Jakub Jelinek wrote:
> > As mentioned in the PR, the testcase fails on big-endian targets.
> > The following patch tweaks it so that it does not fail there and still
> > checks for the original bug.
> 
> It relies on a certain bitfield layout, not just on LE.  I think the
> testcase should run only on those specific targets where it works.  I don't
> see how this patch would fix the problem for BE, btw.

With the patch, it doesn't rely on anything; it checks whether what you get
at runtime from the code that combine would optimize is equal to what is
read from a volatile union.

Admittedly, it might be better if the initializer was 0x1010101 or say
0x4030201 because on big endian in particular 0x10101 has the top 15 bits
all zero and thus that is what is in u.f1, so if the bug can be reproduced
with the combine.c + rtlanal.c fix reverted with 0x4030201, it would be
better to use that value (in both spots).
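
To make the point about the initializer concrete, here is a small C++ sketch.
It is not part of the testcase: the helper name top_bits is made up, and it
assumes the value is viewed as a 32-bit word whose 15 most significant bits
are the ones that end up in the field, as described above for big endian.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: return the `n` most significant bits of a 32-bit
// value.  Assumes 0 < n < 32.
constexpr std::uint32_t top_bits(std::uint32_t value, unsigned n)
{
  return value >> (32u - n);
}

// 0x10101 has nothing set above bit 16, so its top 15 bits are all zero.
static_assert(top_bits(0x10101u, 15) == 0, "top 15 bits of 0x10101 are zero");

// The suggested alternatives do have bits set in that range.
static_assert(top_bits(0x1010101u, 15) != 0, "0x1010101 has high bits set");
static_assert(top_bits(0x4030201u, 15) != 0, "0x4030201 has high bits set");
```

With a non-zero value in those bits, a wrong result from the optimized code
is more likely to differ from the value read back through the volatile union.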

Jakub


Re: [PATCH, ARM, ping3] PR85434: Prevent spilling of stack protector guard's address on ARM

2018-11-21 Thread Segher Boessenkool
On Fri, Nov 16, 2018 at 02:56:46PM +, Thomas Preudhomme wrote:
> In case of high register pressure in PIC mode, address of the stack
> protector's guard can be spilled on ARM targets as shown in PR85434,
> thus allowing an attacker to control what the canary would be compared
> against. ARM does lack stack_protect_set and stack_protect_test insn
> patterns, but defining them does not help because the address is expanded
> regularly and the patterns only deal with the copy and test of the
> guard with the canary.
> 
> This problem does not occur for x86 targets because the PIC access and
> the test can be done in the same instruction. AArch64 is exempt too
> because its PIC access insn patterns are movs of UNSPEC, which prevents
> the second access in the epilogue from being CSEd in the cse_local pass
> with the first access in the prologue.

The unspecs are not CSEd because they are *different* unspecs (UNSPEC_SP_SET
vs. UNSPEC_SP_TEST; they have different args too, different number of args
even).  Two identical unspecs can be CSEd just fine.


Segher


Re: Stream TREE_TYPE of TYPE_DECLs again

2018-11-21 Thread Jan Hubicka
> 
> OK if you put a comment ...

I have added comments both to free_lang_data, noting that some fields
are freed late, and to the new freeing pass.
While testing I noticed a stupid bug in need_assembler_name_p: when a
TYPE_DECL does not satisfy the elaborate conditional for the type to be
ODR, it falls through to "return true" rather than false.  Fixing that
uncovered a bug in the -fno-odr-type-merging path of ipa-devirt, where
the vtable hash was no longer initialized.  Fixed thus.

lto-bootstrapped/regtested x86_64-linux, committed.

PR lto/87957
* tree.c (fld_decl_context): Break out from ...
(free_lang_data_in_decl): ... here; free TREE_PUBLIC, TREE_PRIVATE,
DECL_ARTIFICIAL of TYPE_DECL; do not free TREE_TYPE of TYPE_DECL.
(fld_incomplete_type_of): Build copy of TYPE_DECL.
* ipa-devirt.c (free_enum_values): Rename to ...
(free_odr_warning_data): ... this one; free also duplicated TYPE_DECLs
and TREE_TYPEs of TYPE_DECLs.
(get_odr_type): Initialize odr_vtable_hash if needed.

Index: ipa-devirt.c
===
--- ipa-devirt.c  (revision 266334)
+++ ipa-devirt.c  (working copy)
@@ -2025,6 +2025,8 @@ get_odr_type (tree type, bool insert)
   if ((!slot || !*slot) && in_lto_p && can_be_vtable_hashed_p (type))
 {
   hash = hash_odr_vtable (type);
+  if (!odr_vtable_hash)
+odr_vtable_hash = new odr_vtable_hash_type (23);
   vtable_slot = odr_vtable_hash->find_slot_with_hash (type, hash,
   insert ? INSERT : NO_INSERT);
 }
@@ -2289,27 +2291,43 @@ dump_type_inheritance_graph (FILE *f)
   "%i duplicates overall\n", num_all_types, num_types, num_duplicates);
 }
 
-/* Save some WPA->ltrans streaming by freeing enum values.  */
+/* Save some WPA->ltrans streaming by freeing stuff needed only for good
+   ODR warnings.
+   We free TYPE_VALUES of enums and also make TYPE_DECLs to not point back
+   to the type (which is needed to keep them in the same SCC and preserve
+   location information to output warnings) and subsequently we make all
+   TYPE_DECLS of same assembler name equivalent.  */
 
 static void
-free_enum_values ()
+free_odr_warning_data ()
 {
-  static bool enum_values_freed = false;
-  if (enum_values_freed || !flag_wpa || !odr_types_ptr)
+  static bool odr_data_freed = false;
+
+  if (odr_data_freed || !flag_wpa || !odr_types_ptr)
 return;
-  enum_values_freed = true;
-  unsigned int i;
-  for (i = 0; i < odr_types.length (); i++)
+
+  odr_data_freed = true;
+
+  for (unsigned int i = 0; i < odr_types.length (); i++)
 if (odr_types[i])
   {
-   if (TREE_CODE (odr_types[i]->type) == ENUMERAL_TYPE)
- TYPE_VALUES (odr_types[i]->type) = NULL;
+   tree t = odr_types[i]->type;
+
+   if (TREE_CODE (t) == ENUMERAL_TYPE)
+ TYPE_VALUES (t) = NULL;
+   TREE_TYPE (TYPE_NAME (t)) = void_type_node;
+
if (odr_types[i]->types)
   for (unsigned int j = 0; j < odr_types[i]->types->length (); j++)
-   if (TREE_CODE ((*odr_types[i]->types)[j]) == ENUMERAL_TYPE)
- TYPE_VALUES ((*odr_types[i]->types)[j]) = NULL;
+   {
+ tree td = (*odr_types[i]->types)[j];
+
+ if (TREE_CODE (td) == ENUMERAL_TYPE)
+   TYPE_VALUES (td) = NULL;
+ TYPE_NAME (td) = TYPE_NAME (t);
+   }
   }
-  enum_values_freed = true;
+  odr_data_freed = true;
 }
 
 /* Initialize IPA devirt and build inheritance tree graph.  */
@@ -2323,7 +2341,7 @@ build_type_inheritance_graph (void)
 
   if (odr_hash)
 {
-  free_enum_values ();
+  free_odr_warning_data ();
   return;
 }
   timevar_push (TV_IPA_INHERITANCE);
@@ -2370,7 +2388,7 @@ build_type_inheritance_graph (void)
   dump_type_inheritance_graph (inheritance_dump_file);
   dump_end (TDI_inheritance, inheritance_dump_file);
 }
-  free_enum_values ();
+  free_odr_warning_data ();
   timevar_pop (TV_IPA_INHERITANCE);
 }
 
Index: tree.c
===
--- tree.c  (revision 266325)
+++ tree.c  (working copy)
@@ -5206,6 +5206,24 @@ fld_process_array_type (tree t, tree t2,
   return array;
 }
 
+/* Return CTX after removal of contexts that are not relevant  */
+
+static tree
+fld_decl_context (tree ctx)
+{
+  /* Variably modified types are needed for tree_is_indexable to decide
+ whether the type needs to go to local or global section.
+ This code is semi-broken but for now it is easiest to keep contexts
+ as expected.  */
+  if (ctx && TYPE_P (ctx)
+  && !variably_modified_type_p (ctx, NULL_TREE))
+ {
+   while (ctx && TYPE_P (ctx))
+ctx = TYPE_CONTEXT (ctx);
+ }
+  return ctx;
+}
+
 /* For T being aggregate type try to turn it into a incomplete variant.
Return T if no simplification is possible.  */
 
@@ -5267,6 +5285,28 @@ f

Re: [PATCH] Fix up 20181120-1.c testcase on big-endian (PR rtl-optimization/85925)

2018-11-21 Thread Segher Boessenkool
On Wed, Nov 21, 2018 at 06:31:43PM +0100, Jakub Jelinek wrote:
> > > As mentioned in the PR, the testcase fails on big-endian targets.
> > > The following patch tweaks it so that it does not fail there and still
> > > checks for the original bug.
> > 
> > It relies on a certain bitfield layout, not just on LE.  I think the
> > testcase should run only on those specific targets where it works.  I don't
> > see how this patch would fix the problem for BE, btw.
> 
> With the patch, it doesn't rely on anything, it compares if what you get at
> runtime from the code combiner would optimize is equal to what is read from
> a volatile union.

Oh, I think I misread it, sorry :-)

> Admittedly, it might be better if the initializer was 0x1010101 or say
> 0x4030201 because on big endian in particular 0x10101 has the top 15 bits
> all zero and thus that is what is in u.f1, so if the bug can be reproduced
> with the combine.c + rtlanal.c fix reverted with 0x4030201, it would be
> better to use that value (in both spots).

Yeah good point.


Segher


Re: [C++ PATCH] Improve locations of id-expressions and operator "" (PR c++/87386, take 2)

2018-11-21 Thread Jason Merrill

On 11/21/18 10:55 AM, Jakub Jelinek wrote:
> Hi!
>
> On Tue, Nov 20, 2018 at 04:32:26PM -0500, David Malcolm wrote:
>> This makes the fix-it hint wrong: after the fix-it is applied, it will
>> become
>>    return color;
>> (which won't compile), rather than
>>    return O::color;
>> which will.
>
> Here is an updated version of the patch, which still uses the whole
> range of the id-expression when it is parsed as primary expression, but does
> so not in cp_parser_id_expression, but in cp_parser_primary_expression after
> all the diagnostics.  Thus all the spell-checking etc. tests behave as
> previously, they underline only the part after the last ::, and just
> what uses the expression later on uses whole range.
>
> The remaining needed tweaks in the testcases are minor and look correct to
> me, e.g. for D::Bar the column is not at D but at B,

Sounds good.

> similarly for operator"" _F the column is under _ rather than first o.

I disagree with this one: the name of the declaration is operator""_F,
so I think the caret should go at the first o.

> The libstdc++ changes are because there are several large expressions like:
>    something::value
> and we used to diagnose on the something line (column of s)
> but now we warn on value line (column of v).

Makes sense.

Jason



Re: [C++ PATCH] Remove useless tokens from cp_parser_linkage_specification (PR c++/87393)

2018-11-21 Thread Jason Merrill

On 11/21/18 10:59 AM, Jakub Jelinek wrote:
> Hi!
>
> David's r251026 change added a weird trailing ->location.
> It doesn't seem to be useful for anything, matching_braces has its own code
> to track locations, so no need to do anything in the caller (and no other
> spot does something like that).
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Jason



[PATCH] libstdc++/88111 and libstdc++/88113 fix src/c++17/memory_resource.cc for 16-bit targets

2018-11-21 Thread Jonathan Wakely

Two patches to fix the build on msp430-elf which has 16-bit or 20-bit
pointers.

The patch for 88111 also affects other targets, by changing the
default values that are used when pool_options members are zero. The
new default values depend on the number of bits in size_t.
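
As a rough illustration of what the new defaults work out to, the following
C++ sketch evaluates the two expressions used in the patch
(__SIZE_WIDTH__ << 8 and __SIZE_WIDTH__ << 6) for a few widths.  The helper
names are made up for this example, the width is passed as a parameter
instead of reading __SIZE_WIDTH__, and any later clamping done by
munge_options is not modelled.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helpers mirroring the two expressions in the patch;
// `size_width` stands in for __SIZE_WIDTH__ (the number of bits in size_t).
constexpr std::size_t default_max_blocks_per_chunk(unsigned size_width)
{
  return static_cast<std::size_t>(size_width) << 8;
}

constexpr std::size_t default_largest_required_pool_block(unsigned size_width)
{
  return static_cast<std::size_t>(size_width) << 6;
}

// 64-bit size_t: 16384 blocks per chunk, 4096-byte largest pooled block.
static_assert(default_max_blocks_per_chunk(64) == 16384, "");
static_assert(default_largest_required_pool_block(64) == 4096, "");

// 16-bit size_t: 4096 blocks per chunk, 1024-byte largest pooled block.
static_assert(default_max_blocks_per_chunk(16) == 4096, "");
static_assert(default_largest_required_pool_block(16) == 1024, "");
```

So on a 64-bit target the default largest pooled block stays at the old 4096,
while the default maximum number of blocks per chunk grows from 10240 to
16384; on narrower targets both scale down.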

Bootstrapped on msp430-elf, tested on powerpc64le-linux.

commit b5ba0a7b875c3524d447452531416eabf218e6e9
Author: Jonathan Wakely 
Date:   Wed Nov 21 18:16:45 2018 +

PR libstdc++/88111 Make maximum block size depend on size_t width

PR libstdc++/88111
* include/std/memory_resource (pool_options): Add Doxygen comments.
* src/c++17/memory_resource.cc (pool_sizes): Only use suitable values
on targets with 16-bit or 20-bit size_t type.
(munge_options): Make default values depend on width of size_t type.

diff --git a/libstdc++-v3/include/std/memory_resource b/libstdc++-v3/include/std/memory_resource
index 87ad25d60f3..e9a46a3b455 100644
--- a/libstdc++-v3/include/std/memory_resource
+++ b/libstdc++-v3/include/std/memory_resource
@@ -299,13 +299,25 @@ namespace pmr
 { return !(__a == __b); }
 
 
+  /// Parameters for tuning a pool resource's behaviour.
   struct pool_options
   {
+/** @brief Upper limit on number of blocks in a chunk.
+ *
+ * A lower value prevents allocating huge chunks that could remain mostly
+     * unused, but means pools will need to be replenished more frequently.
+ */
 size_t max_blocks_per_chunk = 0;
+
+    /** @brief Largest block size (in bytes) that should be served from pools.
+ *
+ * Larger allocations will be served directly by the upstream resource,
+ * not from one of the pools managed by the pool resource.
+ */
 size_t largest_required_pool_block = 0;
   };
 
-  // Common implementation details for unsynchronized/synchronized pool resources.
+  // Common implementation details for un-/synchronized pool resources.
   class __pool_resource
   {
 friend class synchronized_pool_resource;
diff --git a/libstdc++-v3/src/c++17/memory_resource.cc b/libstdc++-v3/src/c++17/memory_resource.cc
index 6198e6b68ca..929df93233c 100644
--- a/libstdc++-v3/src/c++17/memory_resource.cc
+++ b/libstdc++-v3/src/c++17/memory_resource.cc
@@ -825,10 +825,15 @@ namespace pmr
   128, 192,
   256, 320, 384, 448,
   512, 768,
+#if __SIZE_WIDTH__ > 16
   1024, 1536,
   2048, 3072,
-  1<<12, 1<<13, 1<<14, 1<<15, 1<<16, 1<<17,
+#if __SIZE_WIDTH__ > 20
+  1<<12, 1<<13, 1<<14,
+  1<<15, 1<<16, 1<<17,
   1<<20, 1<<21, 1<<22 // 4MB should be enough for anybody
+#endif
+#endif
   };
 
   pool_options
@@ -839,10 +844,13 @@ namespace pmr
 // replaced with implementation-defined defaults, and sizes may be
 // rounded to unspecified granularity.
 
-// Absolute maximum. Each pool might have a smaller maximum.
+// max_blocks_per_chunk sets the absolute maximum for the pool resource.
+// Each pool might have a smaller maximum, because pools for very large
+// objects might impose a smaller limit.
 if (opts.max_blocks_per_chunk == 0)
   {
-	opts.max_blocks_per_chunk = 1024 * 10; // TODO a good default?
+	// Pick a default that depends on the number of bits in size_t.
+	opts.max_blocks_per_chunk = __SIZE_WIDTH__ << 8;
   }
 else
   {
@@ -854,10 +862,15 @@ namespace pmr
 	opts.max_blocks_per_chunk = chunk::max_blocks_per_chunk();
   }
 
-// Absolute minimum. Likely to be much larger in practice.
+// largest_required_pool_block specifies the largest block size that will
+// be allocated from a pool. Larger allocations will come directly from
+// the upstream resource and so will not be pooled.
 if (opts.largest_required_pool_block == 0)
   {
-	opts.largest_required_pool_block = 4096; // TODO a good default?
+	// Pick a sensible default that depends on the number of bits in size_t
+	// (pools with larger block sizes must be explicitly requested by
+	// using a non-zero value for largest_required_pool_block).
+	opts.largest_required_pool_block = __SIZE_WIDTH__ << 6;
   }
 else
   {

commit 14974318adc5e9d56e827cdfa39207e7c7be9e6d
Author: Jonathan Wakely 
Date:   Wed Nov 21 17:39:51 2018 +

PR libstdc++/88113 use size_type consistently instead of size_t

On 16-bit msp430-elf size_t is either 16 bits or 20 bits, and so can't
represent all values of the uint32_t type used for bitset::size_type.
Using the smaller of size_t and uint32_t for size_type ensures it fits
in size_t.
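
One way to express "the smaller of size_t and uint32_t" is sketched below.
This is only an illustration of the idea, not necessarily how the patch
spells it: the alias name narrower_t is made up, and comparing
numeric_limits digits (rather than sizeof) is an assumption chosen so that a
20-bit size_t would be handled by its value range.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical alias: pick whichever of A and B has fewer value bits
// (A wins ties).
template<typename A, typename B>
using narrower_t = typename std::conditional<
  (std::numeric_limits<A>::digits <= std::numeric_limits<B>::digits),
  A, B>::type;

// Stand-in for bitset::size_type: never wider than size_t or uint32_t.
using size_type = narrower_t<std::size_t, std::uint32_t>;

static_assert(std::numeric_limits<size_type>::digits
              <= std::numeric_limits<std::size_t>::digits, "fits in size_t");
static_assert(std::numeric_limits<size_type>::digits <= 32, "fits in 32 bits");
static_assert(std::is_same<narrower_t<std::uint16_t, std::uint32_t>,
                           std::uint16_t>::value, "16-bit type wins");
```

Every value of such a size_type is representable in size_t, which is the
property the commit message asks for.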

PR libstdc++/88113
* src/c++17/memory_resource.cc (bitset::size_type): Use the smaller
of uint32_t and size_t.
(bitset::size(), bitset::free(), bitset::update_next_word())
(bitset::max_blocks_per_chunk(), bitset::max_word_index()): Use
size_type consistently instead of size_t.
(chunk): Adjust static_assert checking sizeof(chunk).

diff --git a/

Re: [PATCH] C++: show namespaces for enum values (PR c++/88121)

2018-11-21 Thread Jason Merrill

On 11/21/18 8:35 AM, David Malcolm wrote:
> Consider this test case:
>
> namespace json
> {
>    enum { JSON_OBJECT };
> }
>
> void test ()
> {
>    JSON_OBJECT;
> }
>
> which erroneously accesses an enum value in another namespace without
> qualifying the access.
>
> GCC 6 through 8 issue a suggestion that doesn't mention the namespace:
>
> : In function 'void test()':
> :8:3: error: 'JSON_OBJECT' was not declared in this scope
> JSON_OBJECT;
> ^~~
> :8:3: note: suggested alternative:
> :3:10: note:   'JSON_OBJECT'
> enum { JSON_OBJECT };
>    ^~~
>
> which is suboptimal.
>
> I made the problem worse with r265610, as gcc 9 now consolidates
> the single suggestion into the error, and emits:
>
> : In function 'void test()':
> :8:3: error: 'JSON_OBJECT' was not declared in this scope; did
> you mean 'JSON_OBJECT'?
>  8 |   JSON_OBJECT;
>    |   ^~~
>    |   JSON_OBJECT
> :3:10: note: 'JSON_OBJECT' declared here
>  3 |   enum { JSON_OBJECT };
>    |  ^~~
>
> where the message:
>    'JSON_OBJECT' was not declared in this scope; did you mean 'JSON_OBJECT'?
> is nonsensical.
>
> The root cause is that dump_scope doesn't print anything when called for
> CONST_DECL in a namespace: the scope is an ENUMERAL_TYPE, rather than
> a namespace.

Although that's only true for unscoped enums.
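
For readers less familiar with the distinction, here is a minimal C++ sketch
of the two cases, reusing the names from the test cases in this thread.  The
two wrapper functions are made up for illustration.

```cpp
#include <cassert>

namespace json
{
  // Unscoped enum: JSON_OBJECT is also a member of namespace json.
  enum { JSON_OBJECT };
}

// Scoped enum: CARROT and TURNIP are only reachable through vegetable::.
enum class vegetable { CARROT, TURNIP };

int unscoped_value()
{
  return json::JSON_OBJECT;   // qualified by the namespace, not by the enum
}

int scoped_value()
{
  return static_cast<int>(vegetable::TURNIP);   // qualified by the enum itself
}
```

For the unscoped enumerator, the useful qualifier in a diagnostic is the
enclosing namespace; for the scoped one, the enum name itself has to appear.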


> This patch tweaks dump_scope to detect ENUMERAL_TYPE, and to use the
> enclosing namespace, so that the CONST_DECL is dumped as
> "json::JSON_OBJECT".
>
> @@ -182,6 +182,12 @@ dump_scope (cxx_pretty_printer *pp, tree scope, int flags)
>     if (scope == NULL_TREE)
>       return;
>   
> +  /* Enum values will be CONST_DECL with an ENUMERAL_TYPE as their
> + "scope".  Use CP_TYPE_CONTEXT of the ENUMERAL_TYPE, so as to
> + print the enclosing namespace.  */
> +  if (TREE_CODE (scope) == ENUMERAL_TYPE)
> +scope = CP_TYPE_CONTEXT (scope);

This needs to handle scoped enums differently.


> diff --git a/gcc/testsuite/g++.dg/lookup/suggestions-scoped-enums.C b/gcc/testsuite/g++.dg/lookup/suggestions-scoped-enums.C
> new file mode 100644
> index 000..2bf3ed6
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/lookup/suggestions-scoped-enums.C
> @@ -0,0 +1,13 @@
> +// { dg-do compile { target c++11 } }
> +// { dg-options "-fdiagnostics-show-caret" }
> +
> +enum class vegetable { CARROT, TURNIP };
> +
> +void misspelled_value_in_scoped_enum ()
> +{
> +  vegetable::TURNUP; // { dg-error "'TURNUP' is not a member of 'vegetable'" }
> +  /* { dg-begin-multiline-output "" }
> +   vegetable::TURNUP;
> +  ^~
> + { dg-end-multiline-output "" } */
> +}

I don't see any suggestion in the expected output, and would hope for it 
to suggest vegetable::TURNIP.


Jason


Re: [C++ PATCH] Improve locations of id-expressions and operator "" (PR c++/87386, take 2)

2018-11-21 Thread Jakub Jelinek
On Wed, Nov 21, 2018 at 01:29:15PM -0500, Jason Merrill wrote:
> > similarly for operator"" _F the column is under _ rather than first o.
> 
> I disagree with this one: the name of the declaration is operator""_F, so I
> think the caret should go at the first o.

Right now when cp_parser_operator_function_id is called, it returns locus like:
operator new
 ^~~
operator delete []
 ^
operator ==
 ^
operator "" _foo
UNKNOWN_LOCATION
The last one is because for others we do return cp_expr (id, start_loc);
but for operator "" just return id;

So, do you suggest we should instead return
operator new
^~~~
operator delete []
^~
operator ==
^~~
operator "" _foo
^~~~
?
That would mean cp_parser_operator_function_id would need to pass
location_t start_loc (the start of the operator token) to cp_parser_operator and
let that create a range in all cases rather than just for operator
new/delete.

Jakub


[PATCH 1/7][v2][MSP430][TESTSUITE] Tweak dg-directives for msp430-elf

2018-11-21 Thread Jozef Lawrynowicz
On Wed, 14 Nov 2018 15:41:00 +
Jozef Lawrynowicz  wrote:

> Patch 1 tweaks dg directives in tests specifically for msp430. Many of
> these are extensions to existing target selectors in dg directives.

Made some modifications to patch 1 based on suggestions.
Added int_eq_float and ptr_eq_long effective target procedures.

Re-tested on avr, x86_64-pc-linux-gnu and msp430-elf.

Ok for trunk?
From 1f31a27ab7cf5b7de0c1cfc7e33a39a66cd61146 Mon Sep 17 00:00:00 2001
From: Jozef Lawrynowicz 
Date: Thu, 8 Nov 2018 18:55:57 +
Subject: [PATCH] [TESTSUITE][MSP430] Tweak dg-directives for msp430-elf

2018-11-21  Jozef Lawrynowicz  

	gcc/testsuite/ChangeLog:

	* lib/target-supports.exp
	(check_effective_target_logical_op_short_circuit): Add msp430.
	(check_effective_target_int_eq_float): New. 
	(check_effective_target_ptr_eq_long): New. 
	* c-c++-common/pr41779.c: Require int_eq_float for dg-warning tests.
	* c-c++-common/pr57371-2.c: XFAIL optimized dump scan when
	sizeof (float) != sizeof (int).
	* gcc.dg/pr84670-4.c: Require ptr_eq_long.
	* gcc.dg/pr85859.c: Likewise.
	* gcc.dg/Wno-frame-address.c: Skip for msp430-elf.
	* gcc.dg/torture/stackalign/builtin-apply-2.c: Likewise.
	* gcc.dg/ifcvt-4.c: Likewise.
	* gcc.dg/pr34856.c: Likewise.
	* gcc.dg/builtin-apply2.c: Likewise.
	* gcc.dg/tree-ssa/ssa-dse-26.c: Likewise.
	* gcc.dg/attr-alloc_size-11.c: Remove dg-warning XFAIL for msp430.
	* gcc.dg/tree-ssa/20040204-1.c: Likewise.
	* gcc.dg/compat/struct-by-value-16a_x.c: Build at -O1 for msp430
	so it fits.
	* gcc.dg/lto/20091013-1_0.c: Require ptr_eq_long.
	* gcc.dg/lto/20091013-1_1.c: Remove xfail-if for when
	sizeof(void *) != sizeof(long).
	* gcc.dg/lto/20091013-1_2.c: Likewise.
	* gcc.dg/tree-ssa/loop-1.c: Fix expected dg-final behaviour for msp430.
	* gcc.dg/tree-ssa/gen-vect-25.c: Likewise.
	* gcc.dg/tree-ssa/gen-vect-11.c: Likewise.
	* gcc.dg/tree-ssa/loop-35.c: Likewise.
	* gcc.dg/tree-ssa/pr23455.c: Likewise.
	* gcc.dg/weak/typeof-2.c: Likewise.
	* gcc.target/msp430/interrupt_fn_placement.c: Skip for 430 ISA.
	* gcc.target/msp430/pr78818-data-region.c: Fix scan-assembler text.
	* gcc.target/msp430/pr79242.c: Don't skip for -msmall.
	* gcc.target/msp430/special-regs.c: Use "__asm__" instead of "asm".
---
 gcc/testsuite/c-c++-common/pr41779.c   |  6 +++---
 gcc/testsuite/c-c++-common/pr57371-2.c |  2 +-
 gcc/testsuite/gcc.dg/Wno-frame-address.c   |  2 +-
 gcc/testsuite/gcc.dg/attr-alloc_size-11.c  |  4 ++--
 gcc/testsuite/gcc.dg/builtin-apply2.c  |  2 +-
 .../gcc.dg/compat/struct-by-value-16a_x.c  |  2 ++
 gcc/testsuite/gcc.dg/ifcvt-4.c |  2 +-
 gcc/testsuite/gcc.dg/lto/20091013-1_0.c|  1 +
 gcc/testsuite/gcc.dg/lto/20091013-1_1.c|  1 -
 gcc/testsuite/gcc.dg/lto/20091013-1_2.c|  1 -
 gcc/testsuite/gcc.dg/pr34856.c |  1 +
 gcc/testsuite/gcc.dg/pr84670-4.c   |  1 +
 gcc/testsuite/gcc.dg/pr85859.c |  1 +
 .../gcc.dg/torture/stackalign/builtin-apply-2.c|  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/20040204-1.c |  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/gen-vect-11.c|  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/gen-vect-25.c|  4 ++--
 gcc/testsuite/gcc.dg/tree-ssa/loop-1.c |  4 ++--
 gcc/testsuite/gcc.dg/tree-ssa/loop-35.c|  4 ++--
 gcc/testsuite/gcc.dg/tree-ssa/pr23455.c|  4 ++--
 gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c |  1 +
 gcc/testsuite/gcc.dg/weak/typeof-2.c   |  2 ++
 .../gcc.target/msp430/interrupt_fn_placement.c |  1 +
 .../gcc.target/msp430/pr78818-data-region.c|  3 ++-
 gcc/testsuite/gcc.target/msp430/pr79242.c  |  2 +-
 gcc/testsuite/gcc.target/msp430/special-regs.c |  8 
 gcc/testsuite/lib/target-supports.exp  | 24 ++
 27 files changed, 61 insertions(+), 28 deletions(-)

diff --git a/gcc/testsuite/c-c++-common/pr41779.c b/gcc/testsuite/c-c++-common/pr41779.c
index c42a0f5..a80bf78 100644
--- a/gcc/testsuite/c-c++-common/pr41779.c
+++ b/gcc/testsuite/c-c++-common/pr41779.c
@@ -1,6 +1,6 @@
 /* PR41779: Wconversion cannot see through real*integer promotions. */
 /* { dg-do compile } */
-/* { dg-skip-if "doubles are floats" { "avr-*-*" } } */
+/* { dg-skip-if "doubles are floats" { avr-*-* } } */
 /* { dg-options "-std=c99 -Wconversion" { target c } } */
 /* { dg-options "-Wconversion" { target c++ } } */
 /* { dg-require-effective-target large_double } */
@@ -27,7 +27,7 @@ float f4(float x, unsigned char y)
 
 float f5(float x, int y)
 {
-  return x * y; /* { dg-warning "conversion" } */
+  return x * y; /* { dg-warning "conversion" "" { target int_eq_float } } */
 }
 
 double c1(float x, unsigned short y, int z)
@@ -52,5 +52,5 @@ double c4(float x, unsigned char y, int z)
 
 double c5(float x, int y, int z)
 {
-  return z ? x + x : y; /* { dg-warning "conversion" } */
+  return z ? x + x : y; /* { 

[PATCH] Fix up 20181120-1.c testcase on big-endian (PR rtl-optimization/85925, take 2)

2018-11-21 Thread Jakub Jelinek
Hi!

On Wed, Nov 21, 2018 at 12:07:51PM -0600, Segher Boessenkool wrote:
> > Admittedly, it might be better if the initializer was 0x1010101 or say
> > 0x4030201 because on big endian in particular 0x10101 has the top 15 bits
> > all zero and thus that is what is in u.f1, so if the bug can be reproduced
> > with the combine.c + rtlanal.c fix reverted with 0x4030201, it would be
> > better to use that value (in both spots).
> 
> Yeah good point.

I've now managed to test this with a cross to armv7hl (scp'ed to an arm box)
with and without the rtlanal.c + combine.c change reverted, and on
powerpc64-linux as an example of big-endian.  On armv7hl it still fails with
the changes reverted; otherwise it succeeds on both.  The test also needs
a target with 32-bit int (previously just 17-bit or more), so I've added an
effective target requirement.
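
To illustrate why the initializer matters, here is a minimal sketch, not
part of the patch.  It assumes a 32-bit unsigned and the two common layouts
for a 15-bit bit-field (low 15 bits on little-endian, top 15 bits on
big-endian); actual bit-field layout is implementation-defined.  The
toy_* names are hypothetical helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Value of a 15-bit field that occupies the LOW 15 bits of a 32-bit
   word (the usual little-endian layout; an assumption here).  */
uint32_t
toy_low15 (uint32_t word)
{
  return word & 0x7fffu;
}

/* Value of a 15-bit field that occupies the TOP 15 bits of a 32-bit
   word (the usual big-endian layout; an assumption here).  */
uint32_t
toy_top15 (uint32_t word)
{
  return word >> 17;
}

/* Compare the old and the new initializer under both layouts.  */
void
toy_check_initializers (void)
{
  /* Old value: the big-endian view is all zeros, which makes a weak test.  */
  assert (toy_low15 (0x10101u) == 0x101u);
  assert (toy_top15 (0x10101u) == 0u);
  /* New value: non-zero under both layouts.  */
  assert (toy_low15 (0x4030201u) == 0x201u);
  assert (toy_top15 (0x4030201u) == 0x201u);
}
```

Under these assumptions 0x4030201 gives a non-zero f1 either way, while
0x10101 gives zero on big-endian.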

Ok for trunk and release branches?

2018-11-21  Jakub Jelinek  

PR rtl-optimization/85925
* gcc.c-torture/execute/20181120-1.c: Require effective target
int32plus.
(u): New variable.
(main): Compare d against u.f1 rather than 0x101.  Use 0x4030201
instead of 0x10101.

--- gcc/testsuite/gcc.c-torture/execute/20181120-1.c.jj	2018-11-21 17:39:47.963671708 +0100
+++ gcc/testsuite/gcc.c-torture/execute/20181120-1.c	2018-11-21 20:07:45.804556443 +0100
@@ -1,4 +1,5 @@
 /* PR rtl-optimization/85925 */
+/* { dg-require-effective-target int32plus } */
 /* Testcase by  */
 
 int a, c, d;
@@ -9,17 +10,18 @@ union U1 {
   unsigned f0;
   unsigned f1 : 15;
 };
+volatile union U1 u = { 0x4030201 };
 
 int main (void)
 {
   for (c = 0; c <= 1; c++) {
-union U1 f = {0x10101};
+union U1 f = {0x4030201};
 if (c == 1)
   b;
 *e = f.f1;
   }
 
-  if (d != 0x101)
+  if (d != u.f1)
 __builtin_abort ();
 
   return 0;


Jakub


Re: [PATCH 1/7][v2][MSP430][TESTSUITE] Tweak dg-directives for msp430-elf

2018-11-21 Thread Rainer Orth
Hi Jozef,

> On Wed, 14 Nov 2018 15:41:00 +
> Jozef Lawrynowicz  wrote:
>
>> Patch 1 tweaks dg directives in tests specifically for msp430. Many of
>> these are extensions to existing target selectors in dg directives.
>
> Made some modifications to patch 1 based on suggestions.
> Added int_eq_float and ptr_eq_long effective target procedures.
>
> Re-tested on avr, x86_64-pc-linux-gnu and msp430-elf.
>
> Ok for trunk?

new effective-target keywords always need documenting in
gcc/doc/sourcebuild.texi.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH] Replace sync builtins with atomic builtins

2018-11-21 Thread Thomas Koenig

Hi Janne,


PING!


OK.

Thanks for the patch!

Regards

Thomas


Re: Patch ping (Re: [PATCH] Fortran include line fixes and -fdec-include support)

2018-11-21 Thread Thomas Koenig

Hi Jakub,


Before 9.0 is released, we should also document the flag
(and the extension it supports) in the manual, and note it
in changes.html and on the Wiki.  Would you also do that?

Like this?  Ok for trunk/wwwdocs?


OK for trunk (and I don't think you need my OK for wwwdocs, but
you have it anyway :-)

Regards

Thomas


[PATCH, LRA]: Revert the revert of removal of useless move insns.

2018-11-21 Thread Uros Bizjak
Hello!

Before the recent patch to post-reload mode switching, vzeroupper
insertion depended on the existence of the return copy instructions
pair in functions that return a value. The first instruction in the
pair represents a move to a function return hard register, and the
second was a USE of the function return hard register. Sometimes a nop
move was generated (e.g. %eax->%eax) for the first instruction of the
return copy instructions pair and the patch [1] taught LRA to remove
these useless instructions on the fly.

The removal caused optimize mode switching to trigger an assert,
since the first instruction of the return pair was no longer found, so
the relevant part of the patch was later reverted. With the recent
optimize mode switching patch, the vzeroupper insertion pass no longer
needs that instruction, so the attached patch reverts the revert.
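
As a rough illustration of the condition being restored, here is a toy
model in plain C.  It does not use GCC's RTL API; struct toy_insn and the
toy_* functions are hypothetical stand-ins for the single_set / REG_P /
REGNO checks in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an insn: a register-to-register move is
   described by its source and destination register numbers.  */
struct toy_insn
{
  bool is_reg_move;
  unsigned src_regno;
  unsigned dest_regno;
};

/* True if INSN copies a register into itself (e.g. %eax -> %eax).  */
bool
toy_nop_move_p (const struct toy_insn *insn)
{
  assert (insn != NULL);
  return insn->is_reg_move && insn->src_regno == insn->dest_regno;
}

/* Drop all nop moves from INSNS[0..N-1] in place, keeping the order of
   the remaining insns.  Returns the number of insns kept.  */
size_t
toy_remove_nop_moves (struct toy_insn *insns, size_t n)
{
  size_t kept = 0;
  for (size_t i = 0; i < n; i++)
    if (!toy_nop_move_p (&insns[i]))
      insns[kept++] = insns[i];
  return kept;
}
```

In the real patch the deletion happens insn by insn inside
lra_final_code_change rather than by compacting an array.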

2018-11-21  Uros Bizjak  

Revert the revert:
2013-10-26  Vladimir Makarov  

Revert:
2013-10-25  Vladimir Makarov  

* lra-spills.c (lra_final_code_change): Remove useless move insns.

Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

OK for mainline?

[1] https://gcc.gnu.org/ml/gcc-patches/2013-10/msg02208.html

Uros.
diff --git a/gcc/lra-spills.c b/gcc/lra-spills.c
index 33caf9f45649..008d7399687d 100644
--- a/gcc/lra-spills.c
+++ b/gcc/lra-spills.c
@@ -740,6 +740,7 @@ lra_final_code_change (void)
   int i, hard_regno;
   basic_block bb;
   rtx_insn *insn, *curr;
+  rtx set;
   int max_regno = max_reg_num ();
 
   for (i = FIRST_PSEUDO_REGISTER; i < max_regno; i++)
@@ -818,5 +819,19 @@ lra_final_code_change (void)
  }
  if (insn_change_p)
lra_update_operator_dups (id);
+
+ if ((set = single_set (insn)) != NULL
+ && REG_P (SET_SRC (set)) && REG_P (SET_DEST (set))
+ && REGNO (SET_SRC (set)) == REGNO (SET_DEST (set)))
+   {
+ /* Remove an useless move insn.  IRA can generate move
+insns involving pseudos.  It is better remove them
+earlier to speed up compiler a bit.  It is also
+better to do it here as they might not pass final RTL
+check in LRA, (e.g. insn moving a control register
+into itself).  */
+ lra_invalidate_insn_data (insn);
+ delete_insn (insn);
+   }
}
 }


Re: Improve relocation

2018-11-21 Thread Marc Glisse

ping?

On Fri, 26 Oct 2018, Marc Glisse wrote:


Hello,

here are some tweaks so that I can usefully mark deque as trivially 
relocatable. It includes more noexcept(auto) madness. For __relocate_a_1, I 
should also test if copying, ++ and != are noexcept, but I wanted to ask 
first because there might be restrictions on what iterators are allowed to 
do, even if I didn't see them. Also, the current code already ignores those, 
so it may as well be fixed in another patch.


Allocators are complicated. I specialized only for the default allocator, 
because that's by far the one that is used the most, and I have much less 
risk of getting it wrong. Some allocator expert is welcome to make a better 
test. I do not know in details how deque is implemented. A quick look seemed 
to show that trivial relocation should be fine, but I would appreciate a 
confirmation.


The extra parameter for __is_trivially_relocatable is not used, but I expect 
it will be as soon as the specializations of __is_trivially_relocatable 
become more advanced.


If I use or specialize __is_trivially_relocatable in many places, this forces 
me to #include bits/stl_uninitialized.h in many places. I wonder if I should 
move some of that stuff. Since I may use it in std::swap, bits/move.h looks 
like a sensible place for the core pieces (__is_trivially_relocatable, and 
__relocate_object if I ever create that). That or type_traits.


Regtested on gcc112. I manually checked that there was a speed-up for 
operations on vector>, although doing any kind of benchmarking on 
gcc112 is hard, I'll test locally next time.


2018-10-26  Marc Glisse  

PR libstdc++/87106
* include/bits/stl_algobase.h: Include .
(__niter_base): Add noexcept specification.
* include/bits/stl_deque.h: Include .
(__is_trivially_relocatable): Specialize for deque.
* include/bits/stl_iterator.h: Include .
(__niter_base): Add noexcept specification.
* include/bits/stl_uninitialized.h (__is_trivially_relocatable):
Add parameter for meta-programming.
(__relocate_a_1, __relocate_a): Add noexcept specification.
* include/bits/stl_vector.h (__use_relocate): Test __relocate_a.


--
Marc Glisse


[PATCH, middle-end]: Fix PR88129, Two blockage insns are emitted in the function epilogue

2018-11-21 Thread Uros Bizjak
Hello!

The attached patch removes the extra blockage insn generation. For the
software archaeology, please see the PR [1], where it was determined
that the removed part is probably an oversight from the dataflow branch
to trunk merge.

2018-11-21  Uros Bizjak  

PR middle-end/88129
* function.c (expand_function_end): Do not emit extra blockage insn.

Patch was bootstrapped and regression tested on x86_64-linux-gnu
{,-m32} for all default languages, obj-c++ and go.

OK for mainline?

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88129

Uros.
diff --git a/gcc/function.c b/gcc/function.c
index 302438323c87..44ad57840440 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -5296,14 +5296,6 @@ expand_function_end (void)
   if (flag_exceptions)
sjlj_emit_function_exit_after (get_last_insn ());
 }
-  else
-{
-  /* We want to ensure that instructions that may trap are not
-moved into the epilogue by scheduling, because we don't
-always emit unwind information for the epilogue.  */
-  if (cfun->can_throw_non_call_exceptions)
-   emit_insn (gen_blockage ());
-}
 
   /* If this is an implementation of throw, do what's necessary to
  communicate between __builtin_eh_return and the epilogue.  */


Re: [PATCH, middle-end]: Fix PR88129, Two blockage insns are emitted in the function epilogue

2018-11-21 Thread Richard Biener
On November 21, 2018 8:44:46 PM GMT+01:00, Uros Bizjak wrote:
>Hello!
>
>Attached patch removes extra blockage insn generation. For the
>software archaeology, please see the PR [1], where it was determined,
>that the removed part is probably a dataflow branch to trunk merge
>oversight.
>
>2018-11-21  Uros Bizjak  
>
>PR middle-end/88129
>   * function.c (expand_function_end): Do not emit extra blockage insn.
>
>Patch was bootstrapped and regression tested on x86_64-linux-gnu
>{,-m32} for all default languages, obj-c++ and go.
>
>OK for mainline?

OK. 

Richard. 

>[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88129
>
>Uros.



[PATCH, i386]: Fix PR85667, ms_abi rules aren't followed when returning short structs with float values

2018-11-21 Thread Uros Bizjak
> We don't have commit access, can someone please commit for us?
>
> ~Umesh
>
> On Wed, Nov 21, 2018, 18:37 Jakub Jelinek 
> > On Wed, Nov 21, 2018 at 06:06:41PM +0530, Umesh Kalappa wrote:
> > > Thank you for the inputs and please find the attachment for the update
> > patch.
> >
> > LGTM.

Committed to mainline SVN.

Thanks,
Uros.


Re: [PATCH] Fix up 20181120-1.c testcase on big-endian (PR rtl-optimization/85925, take 2)

2018-11-21 Thread Segher Boessenkool
On Wed, Nov 21, 2018 at 08:12:44PM +0100, Jakub Jelinek wrote:
> On Wed, Nov 21, 2018 at 12:07:51PM -0600, Segher Boessenkool wrote:
> > > Admittedly, it might be better if the initializer was 0x1010101 or say
> > > 0x4030201 because on big endian in particular 0x10101 has the top 15 bits
> > > all zero and thus that is what is in u.f1, so if the bug can be reproduced
> > > with the combine.c + rtlanal.c fix reverted with 0x4030201, it would be
> > > better to use that value (in both spots).
> > 
> > Yeah good point.
> 
> I've now managed to test this with a cross to armv7hl (scped to an arm box)
> with and without the rtlanal.c + combine.c change reverted and on
> powerpc64-linux as example of big-endian, on armv7hl it still fails with
> the changes reverted, otherwise it succeeds on both.  The test also needs
> 32-bit int target (previously just 17-bit or more, so I've added effective
> target).

It fixes the problem on powerpc64-linux {-m32,-m64}.  Thanks :-)


Segher


Re: [PATCH 1/7][v2][MSP430][TESTSUITE] Tweak dg-directives for msp430-elf

2018-11-21 Thread Jozef Lawrynowicz
On Wed, 21 Nov 2018 20:19:29 +0100
Rainer Orth  wrote:

> new effective-target keywords always need documenting in
> gcc/doc/sourcebuild.texi.
> 
>   Rainer
> 

Whoops, thanks for the heads up, fixed in attached.

I'll add documentation for the keywords added in the other patches as well.
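
For readers not familiar with the two new keywords, a minimal sketch of
the properties they describe.  The toy_* functions are hypothetical; the
real checks live in lib/target-supports.exp and are evaluated at compile
time for the target, not at run time as here.

```c
#include <stdbool.h>

/* int_eq_float: int and float have the same size.  */
bool
toy_int_eq_float (void)
{
  return sizeof (int) == sizeof (float);
}

/* ptr_eq_long: pointers (void *) and long have the same size.  */
bool
toy_ptr_eq_long (void)
{
  return sizeof (void *) == sizeof (long);
}
```

On msp430-elf int is 16 bits while float is 32 bits, so the first
property would not hold there.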

Jozef
From be96391838c65b297589ac47ad6347f55ea713c0 Mon Sep 17 00:00:00 2001
From: Jozef Lawrynowicz 
Date: Thu, 8 Nov 2018 18:55:57 +
Subject: [PATCH] [TESTSUITE][MSP430] Tweak dg-directives for msp430-elf

2018-11-21  Jozef Lawrynowicz  

	gcc/ChangeLog:

	* doc/sourcebuild.texi: Document check_effective_target_int_eq_float
	and check_effective_target_ptr_eq_long.

	gcc/testsuite/ChangeLog:

	* lib/target-supports.exp
	(check_effective_target_logical_op_short_circuit): Add msp430.
	(check_effective_target_int_eq_float): New. 
	(check_effective_target_ptr_eq_long): New. 
	* c-c++-common/pr41779.c: Require int_eq_float for dg-warning tests.
	* c-c++-common/pr57371-2.c: XFAIL optimized dump scan when
	sizeof (float) != sizeof (int).
	* gcc.dg/pr84670-4.c: Require ptr_eq_long.
	* gcc.dg/pr85859.c: Likewise.
	* gcc.dg/Wno-frame-address.c: Skip for msp430-elf.
	* gcc.dg/torture/stackalign/builtin-apply-2.c: Likewise.
	* gcc.dg/ifcvt-4.c: Likewise.
	* gcc.dg/pr34856.c: Likewise.
	* gcc.dg/builtin-apply2.c: Likewise.
	* gcc.dg/tree-ssa/ssa-dse-26.c: Likewise.
	* gcc.dg/attr-alloc_size-11.c: Remove dg-warning XFAIL for msp430.
	* gcc.dg/tree-ssa/20040204-1.c: Likewise.
	* gcc.dg/compat/struct-by-value-16a_x.c: Build at -O1 for msp430
	so it fits.
	* gcc.dg/lto/20091013-1_0.c: Require ptr_eq_long.
	* gcc.dg/lto/20091013-1_1.c: Remove xfail-if for when
	sizeof(void *) != sizeof(long).
	* gcc.dg/lto/20091013-1_2.c: Likewise.
	* gcc.dg/tree-ssa/loop-1.c: Fix expected dg-final behaviour for msp430.
	* gcc.dg/tree-ssa/gen-vect-25.c: Likewise.
	* gcc.dg/tree-ssa/gen-vect-11.c: Likewise.
	* gcc.dg/tree-ssa/loop-35.c: Likewise.
	* gcc.dg/tree-ssa/pr23455.c: Likewise.
	* gcc.dg/weak/typeof-2.c: Likewise.
	* gcc.target/msp430/interrupt_fn_placement.c: Skip for 430 ISA.
	* gcc.target/msp430/pr78818-data-region.c: Fix scan-assembler text.
	* gcc.target/msp430/pr79242.c: Don't skip for -msmall.
	* gcc.target/msp430/special-regs.c: Use "__asm__" instead of "asm".
---
 gcc/doc/sourcebuild.texi   |  6 ++
 gcc/testsuite/c-c++-common/pr41779.c   |  6 +++---
 gcc/testsuite/c-c++-common/pr57371-2.c |  2 +-
 gcc/testsuite/gcc.dg/Wno-frame-address.c   |  2 +-
 gcc/testsuite/gcc.dg/attr-alloc_size-11.c  |  4 ++--
 gcc/testsuite/gcc.dg/builtin-apply2.c  |  2 +-
 .../gcc.dg/compat/struct-by-value-16a_x.c  |  2 ++
 gcc/testsuite/gcc.dg/ifcvt-4.c |  2 +-
 gcc/testsuite/gcc.dg/lto/20091013-1_0.c|  1 +
 gcc/testsuite/gcc.dg/lto/20091013-1_1.c|  1 -
 gcc/testsuite/gcc.dg/lto/20091013-1_2.c|  1 -
 gcc/testsuite/gcc.dg/pr34856.c |  1 +
 gcc/testsuite/gcc.dg/pr84670-4.c   |  1 +
 gcc/testsuite/gcc.dg/pr85859.c |  1 +
 .../gcc.dg/torture/stackalign/builtin-apply-2.c|  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/20040204-1.c |  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/gen-vect-11.c|  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/gen-vect-25.c|  4 ++--
 gcc/testsuite/gcc.dg/tree-ssa/loop-1.c |  4 ++--
 gcc/testsuite/gcc.dg/tree-ssa/loop-35.c|  4 ++--
 gcc/testsuite/gcc.dg/tree-ssa/pr23455.c|  4 ++--
 gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c |  1 +
 gcc/testsuite/gcc.dg/weak/typeof-2.c   |  2 ++
 .../gcc.target/msp430/interrupt_fn_placement.c |  1 +
 .../gcc.target/msp430/pr78818-data-region.c|  3 ++-
 gcc/testsuite/gcc.target/msp430/pr79242.c  |  2 +-
 gcc/testsuite/gcc.target/msp430/special-regs.c |  8 
 gcc/testsuite/lib/target-supports.exp  | 24 ++
 28 files changed, 67 insertions(+), 28 deletions(-)

diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index 7487977..bca5db3 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -1360,6 +1360,12 @@ Target has @code{int} that is 16 bits or shorter.
 @item long_neq_int
 Target has @code{int} and @code{long} with different sizes.
 
+@item int_eq_float
+Target has @code{int} and @code{float} with the same size.
+
+@item ptr_eq_long
+Target has pointers (@code{void *}) and @code{long} with the same size.
+
 @item large_double
 Target supports @code{double} that is longer than @code{float}.
 
diff --git a/gcc/testsuite/c-c++-common/pr41779.c b/gcc/testsuite/c-c++-common/pr41779.c
index c42a0f5..a80bf78 100644
--- a/gcc/testsuite/c-c++-common/pr41779.c
+++ b/gcc/testsuite/c-c++-common/pr41779.c
@@ -1,6 +1,6 @@
 /* PR41779: Wconversion cannot see through real*integer promotions. */
 /* { dg-do compile } */
-/* { dg-skip-if "doubles are floats" { "avr-*-*" } } 

[PING^3] Re: [PATCH 1/3] Support instrumenting returns of instrumented functions

2018-11-21 Thread Andi Kleen
Andi Kleen  writes:

Ping^3!

> Andi Kleen  writes:
>
> Ping!^2
>
>> Andi Kleen  writes:
>>
>> Ping!
>>
>>> From: Andi Kleen 
>>>
>>> When instrumenting programs using __fentry__ it is often useful
>>> to instrument the function return too. Traditionally this
>>> has been done by patching the return address on the stack
>>> frame on entry. However this is fairly complicated (trace
>>> function has to emulate a stack) and also slow because
>>> it causes a branch misprediction on every return.
>>>
>>> Add an option to generate call or nop instrumentation for
>>> every return instead, including patch sections.
>>>
>>> This will increase the program size slightly, but can be a
>>> lot faster and simpler.
>>>
>>> This version only instruments true returns, not sibling
>>> calls or tail recursion. This matches the semantics of the
>>> original stack.
>>>
>>> gcc/:
>>>
>>> 2018-11-04  Andi Kleen  
>>>
>>> * config/i386/i386-opts.h (enum instrument_return): Add.
>>> * config/i386/i386.c (output_return_instrumentation): Add.
>>> (ix86_output_function_return): Call output_return_instrumentation.
>>> (ix86_output_call_insn): Call output_return_instrumentation.
>>> * config/i386/i386.opt: Add -minstrument-return=.
>>> * doc/invoke.texi (-minstrument-return): Document.
>>>
>>> gcc/testsuite/:
>>>
>>> 2018-11-04  Andi Kleen  
>>>
>>> * gcc.target/i386/returninst1.c: New test.
>>> * gcc.target/i386/returninst2.c: New test.
>>> * gcc.target/i386/returninst3.c: New test.
>>> ---
>>>  gcc/config/i386/i386-opts.h |  6 
>>>  gcc/config/i386/i386.c  | 36 +
>>>  gcc/config/i386/i386.opt| 21 
>>>  gcc/doc/invoke.texi | 14 
>>>  gcc/testsuite/gcc.target/i386/returninst1.c | 14 
>>>  gcc/testsuite/gcc.target/i386/returninst2.c | 21 
>>>  gcc/testsuite/gcc.target/i386/returninst3.c |  9 ++
>>>  7 files changed, 121 insertions(+)
>>>  create mode 100644 gcc/testsuite/gcc.target/i386/returninst1.c
>>>  create mode 100644 gcc/testsuite/gcc.target/i386/returninst2.c
>>>  create mode 100644 gcc/testsuite/gcc.target/i386/returninst3.c
>>>
>>> diff --git a/gcc/config/i386/i386-opts.h b/gcc/config/i386/i386-opts.h
>>> index 46366cbfa72..35e9413100e 100644
>>> --- a/gcc/config/i386/i386-opts.h
>>> +++ b/gcc/config/i386/i386-opts.h
>>> @@ -119,4 +119,10 @@ enum indirect_branch {
>>>indirect_branch_thunk_extern
>>>  };
>>>  
>>> +enum instrument_return {
>>> +  instrument_return_none = 0,
>>> +  instrument_return_call,
>>> +  instrument_return_nop5
>>> +};
>>> +
>>>  #endif
>>> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
>>> index f9ef0b4445b..f7cd94a8139 100644
>>> --- a/gcc/config/i386/i386.c
>>> +++ b/gcc/config/i386/i386.c
>>> @@ -28336,12 +28336,47 @@ ix86_output_indirect_jmp (rtx call_op)
>>>  return "%!jmp\t%A0";
>>>  }
>>>  
>>> +/* Output return instrumentation for current function if needed.  */
>>> +
>>> +static void
>>> +output_return_instrumentation (void)
>>> +{
>>> +  if (ix86_instrument_return != instrument_return_none
>>> +  && flag_fentry
>>> +  && !DECL_NO_INSTRUMENT_FUNCTION_ENTRY_EXIT (cfun->decl))
>>> +{
>>> +  if (ix86_flag_record_return)
>>> +   fprintf (asm_out_file, "1:\n");
>>> +  switch (ix86_instrument_return)
>>> +   {
>>> +   case instrument_return_call:
>>> + fprintf (asm_out_file, "\tcall\t__return__\n");
>>> + break;
>>> +   case instrument_return_nop5:
>>> + /* 5 byte nop: nopl 0(%[re]ax,%[re]ax,1)  */
>>> + fprintf (asm_out_file, ASM_BYTE "0x0f, 0x1f, 0x44, 0x00, 0x00\n");
>>> + break;
>>> +   case instrument_return_none:
>>> + break;
>>> +   }
>>> +
>>> +  if (ix86_flag_record_return)
>>> +   {
>>> + fprintf (asm_out_file, "\t.section __return_loc, \"a\",@progbits\n");
>>> + fprintf (asm_out_file, "\t.%s 1b\n", TARGET_64BIT ? "quad" : "long");
>>> + fprintf (asm_out_file, "\t.previous\n");
>>> +   }
>>> +}
>>> +}
>>> +
>>>  /* Output function return.  CALL_OP is the jump target.  Add a REP
>>> prefix to RET if LONG_P is true and function return is kept.  */
>>>  
>>>  const char *
>>>  ix86_output_function_return (bool long_p)
>>>  {
>>> +  output_return_instrumentation ();
>>> +
>>>if (cfun->machine->function_return_type != indirect_branch_keep)
>>>  {
>>>char thunk_name[32];
>>> @@ -28454,6 +28489,7 @@ ix86_output_call_insn (rtx_insn *insn, rtx call_op)
>>>  
>>>if (SIBLING_CALL_P (insn))
>>>  {
>>> +  output_return_instrumentation ();
>>>if (direct_p)
>>> {
>>>   if (ix86_nopic_noplt_attribute_p (call_op))
>>> diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
>>> index e7fbf9b6f99..5925b75244f 100644
>>> --- a/gcc/config/i386/i386.opt
>>> +++ b/gcc/config/i386/i386.opt
>>> @@ -1063,3 +1063,24 @@ Support WAITPKG built-in functions and code generation.
>>>  mcldemote
>>>

Re: [PATCH, LRA]: Revert the revert of removal of useless move insns.

2018-11-21 Thread Vladimir Makarov




On 11/21/2018 02:33 PM, Uros Bizjak wrote:

Hello!

Before the recent patch to post-reload mode switching, vzeroupper
insertion depended on the existence of the return copy instructions
pair in functions that return a value. The first instruction in the
pair represents a move to a function return hard register, and the
second was a USE of the function return hard register. Sometimes a nop
move was generated (e.g. %eax->%eax) for the first instruction of the
return copy instructions pair and the patch [1] taught LRA to remove
these useless instructions on the fly.

The removal caused optimize mode switching to trigger the assert,
since the first instruction of a return pair was not found. The
relevant part of the patch was later reverted. With the recent
optimize mode switching patch, this is no longer necessary for
vzeroupper insertion pass, so attached patch reverts the revert.

2018-11-21  Uros Bizjak  

 Revert the revert:
 2013-10-26  Vladimir Makarov  

 Revert:
 2013-10-25  Vladimir Makarov  

 * lra-spills.c (lra_final_code_change): Remove useless move insns.

Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

OK for mainline?

Sure. Thank you, Uros.

[1] https://gcc.gnu.org/ml/gcc-patches/2013-10/msg02208.html

Uros.




[C++ PATCH] Improve locations of id-expressions and operator "" (PR c++/87386, take 3)

2018-11-21 Thread Jakub Jelinek
On Wed, Nov 21, 2018 at 07:49:48PM +0100, Jakub Jelinek wrote:
> So, do you suggest we should instead return
> operator new
> ^~~~
> operator delete []
> ^~
> operator ==
> ^~~
> operator "" _foo
> ^~~~
> ?
> That would mean cp_parser_operator_function_id would need to pass
> location_t start_loc (the start of the operator token) to cp_parser_operator and
> let that create a range in all cases rather than just for operator
> new/delete.

This version of the patch implements that.
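
A toy sketch of the intended shape of the diagnostic range: caret at the
start of the operator token, tildes through the finish of the last
token.  This is only an illustration in plain C; toy_render_range is a
hypothetical helper and has nothing to do with how GCC's diagnostic
machinery actually prints ranges.

```c
#include <stddef.h>

/* Write an underline for columns [START, FINISH] (0-based, inclusive)
   into BUF: spaces before START, '^' at START, '~' up to FINISH.
   Returns 0 on success, -1 if the range is invalid or BUF is too small.  */
int
toy_render_range (char *buf, size_t bufsize, size_t start, size_t finish)
{
  if (finish < start || bufsize < 2 || finish > bufsize - 2)
    return -1;

  size_t i = 0;
  for (; i < start; i++)
    buf[i] = ' ';
  buf[i++] = '^';
  for (; i <= finish; i++)
    buf[i] = '~';
  buf[i] = '\0';
  return 0;
}
```

For "operator new" starting at column 0, the range would be columns 0
through 11, so the underline covers all twelve characters.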

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2018-11-21  Jakub Jelinek  

PR c++/87386
* parser.c (cp_parser_primary_expression): Use
id_expression.get_location () instead of id_expr_token->location.
Adjust the range from id_expr_token->location to
id_expression.get_finish ().
(cp_parser_operator_function_id): Pass location of the operator
token down to cp_parser_operator.
(cp_parser_operator): Add start_loc argument, always construct a
location with caret at start_loc and range from start_loc to the
finish of the last token.
gcc/testsuite/
* g++.dg/diagnostic/pr87386.C: New test.
* g++.dg/parse/error17.C: Adjust expected diagnostics.
libstdc++-v3/
* testsuite/20_util/scoped_allocator/69293_neg.cc: Adjust expected
line.
* testsuite/20_util/uses_allocator/cons_neg.cc: Likewise.
* testsuite/20_util/uses_allocator/69293_neg.cc: Likewise.
* testsuite/experimental/propagate_const/requirements2.cc: Likewise.
* testsuite/experimental/propagate_const/requirements3.cc: Likewise.
* testsuite/experimental/propagate_const/requirements4.cc: Likewise.
* testsuite/experimental/propagate_const/requirements5.cc: Likewise.

--- gcc/cp/parser.c.jj  2018-11-21 17:42:18.003216049 +0100
+++ gcc/cp/parser.c 2018-11-21 20:56:43.694344258 +0100
@@ -2312,7 +2312,7 @@ static tree cp_parser_mem_initializer_id
 static cp_expr cp_parser_operator_function_id
   (cp_parser *);
 static cp_expr cp_parser_operator
-  (cp_parser *);
+  (cp_parser *, location_t);
 
 /* Templates [gram.temp] */
 
@@ -5604,7 +5604,7 @@ cp_parser_primary_expression (cp_parser
  /*is_namespace=*/false,
  /*check_dependency=*/true,
  &ambiguous_decls,
- id_expr_token->location);
+ id_expression.get_location ());
/* If the lookup was ambiguous, an error will already have
   been issued.  */
if (ambiguous_decls)
@@ -5675,7 +5675,7 @@ cp_parser_primary_expression (cp_parser
if (parser->local_variables_forbidden_p
&& local_variable_p (decl))
  {
-   error_at (id_expr_token->location,
+   error_at (id_expression.get_location (),
  "local variable %qD may not appear in this context",
  decl.get_value ());
return error_mark_node;
@@ -5694,7 +5694,8 @@ cp_parser_primary_expression (cp_parser
 id_expression.get_location ()));
if (error_msg)
  cp_parser_error (parser, error_msg);
-   decl.set_location (id_expr_token->location);
+   decl.set_location (id_expression.get_location ());
+   decl.set_range (id_expr_token->location, id_expression.get_finish ());
return decl;
   }
 
@@ -15011,11 +15012,12 @@ cp_parser_mem_initializer_id (cp_parser*
 static cp_expr
 cp_parser_operator_function_id (cp_parser* parser)
 {
+  location_t start_loc = cp_lexer_peek_token (parser->lexer)->location;
   /* Look for the `operator' keyword.  */
   if (!cp_parser_require_keyword (parser, RID_OPERATOR, RT_OPERATOR))
 return error_mark_node;
   /* And then the name of the operator itself.  */
-  return cp_parser_operator (parser);
+  return cp_parser_operator (parser, start_loc);
 }
 
 /* Return an identifier node for a user-defined literal operator.
@@ -15049,7 +15051,7 @@ cp_literal_operator_id (const char* name
human-readable spelling of the identifier, e.g., `operator +'.  */
 
 static cp_expr
-cp_parser_operator (cp_parser* parser)
+cp_parser_operator (cp_parser* parser, location_t start_loc)
 {
   tree id = NULL_TREE;
   cp_token *token;
@@ -15058,7 +15060,7 @@ cp_parser_operator (cp_parser* parser)
   /* Peek at the next token.  */
   token = cp_lexer_peek_token (parser->lexer);
 
-  location_t start_loc = token->location;
+  location_t end_loc = token->location;
 
   /* Figure out which operator we have.  */
   enum tree_code op = ERROR_MARK;
@@ -15077,7 +15079,7 @@ cp_parser_operator (cp_parser* parser)
  break;
 
/* Consume the `new' or `delete' token.  */
-   location_t end_loc = cp_lexer_consume_token (parser->lexer)->location;
+   end

[C++ PATCH] Fix ICE in maybe_explain_implicit_delete (PR c++/88122)

2018-11-21 Thread Jakub Jelinek
Hi!

On the following testcase we ICE in maybe_explain_implicit_delete, because
FUNCTION_FIRST_USER_PARMTYPE (decl) is NULL - there are no user parameters
and ...
From what I understood, const_p is used only in certain cases like const vs.
non-const copy constructor or assignment operator, if the sfk has no user
parameters, usually parm_type is just the void_type terminating the argument
list and also not really interesting for const_p computation.
So, this patch just arranges to pass false as const_p in this case.
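
The shape of the fix, as a toy model in plain C.  struct toy_parm and
toy_first_parm_const_p are hypothetical and stand in for the
TREE_VALUE / CP_TYPE_CONST_P accesses on the parameter type list.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one node of a parameter type list.  */
struct toy_parm
{
  bool type_is_const;
  const struct toy_parm *next;
};

/* Return whether the first user parameter has a const type.  If there
   are no user parameters (PARMS is NULL), answer false instead of
   dereferencing a null pointer.  */
bool
toy_first_parm_const_p (const struct toy_parm *parms)
{
  bool const_p = false;
  if (parms)
    const_p = parms->type_is_const;
  return const_p;
}
```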

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2018-11-21  Jakub Jelinek  

PR c++/88122
* method.c (maybe_explain_implicit_delete): If
FUNCTION_FIRST_USER_PARMTYPE (decl) is NULL, set const_p to false
instead of ICEing.

* g++.dg/cpp0x/implicit15.C: New test.

--- gcc/cp/method.c.jj  2018-11-16 10:22:18.668258171 +0100
+++ gcc/cp/method.c 2018-11-21 15:42:08.441785625 +0100
@@ -1821,8 +1821,12 @@ maybe_explain_implicit_delete (tree decl
   if (!informed)
{
  tree parms = FUNCTION_FIRST_USER_PARMTYPE (decl);
- tree parm_type = TREE_VALUE (parms);
- bool const_p = CP_TYPE_CONST_P (non_reference (parm_type));
+ bool const_p = false;
+ if (parms)
+   {
+ tree parm_type = TREE_VALUE (parms);
+ const_p = CP_TYPE_CONST_P (non_reference (parm_type));
+   }
  tree raises = NULL_TREE;
  bool deleted_p = false;
  tree scope = push_scope (ctype);
--- gcc/testsuite/g++.dg/cpp0x/implicit15.C.jj	2018-11-21 15:59:29.849741499 +0100
+++ gcc/testsuite/g++.dg/cpp0x/implicit15.C	2018-11-21 15:58:00.912197089 +0100
@@ -0,0 +1,11 @@
+// PR c++/88122
+// { dg-do compile { target c++11 } }
+
+struct A {
+  A (...); // { dg-message "candidate" }
+  A ();// { dg-message "candidate" }
+};
+struct B : A {
+  using A::A;  // { dg-error "is ambiguous" }
+   // { dg-message "is implicitly deleted because the default definition would be ill-formed" "" { target *-*-* } .-1 }
+} b{3};// { dg-error "use of deleted function" }

Jakub


[PATCH] Fix -fstack-protector* on darwin/mingw etc. (PR target/85644)

2018-11-21 Thread Jakub Jelinek
Hi!

As I wrote in the PR, before PR81708 commits,
while i386 defaulted to SSP_TLS rather than SSP_GLOBAL on everything but
Android, the -mstack-protector-guard= switch controlled pretty much
whether the i386.md special stack protector patterns are used (if tls)
or whether generic code is used (global).  These special stack protector
patterns did one thing if TARGET_THREAD_SSP_OFFSET macro was defined
(only defined on glibc targets) - code like:
movq%fs:40, %rax
movq%rax, -8(%rbp)
xorl%eax, %eax
in the prologue and
movq-8(%rbp), %rdx
xorq%fs:40, %rdx
je  .L4
in the epilogue.  If TARGET_THREAD_SSP_OFFSET macro wasn't defined, it would
do instead:
movq.refptr.__stack_chk_guard(%rip), %rax
movq(%rax), %rcx
movq%rcx, -8(%rbp)
xorl%ecx, %ecx
and
movq.refptr.__stack_chk_guard(%rip), %rdx
movq-8(%rbp), %rcx
xorq(%rdx), %rcx
je  .L4
(this is taken from 7.x cross to mingw).
Finally, for Android or when -mstack-protector-guard=global was used, it
emitted:
        movq    __stack_chk_guard(%rip), %rax
        movq    %rax, -8(%rbp)
and
        movq    __stack_chk_guard(%rip), %rdx
        cmpq    %rdx, %rcx
        je      .L4
Note, apart from OS specific details, those =global sequences are similar
to the =tls ones when TARGET_THREAD_SSP_OFFSET is not defined, the main
difference is that the =tls ones are more secure as they clear registers
containing the guard as quickly as possible.  The PR81708 changes dropped
the non-tls special stack_protector_* patterns from i386.md and now =tls
implies really tls, but the default remained, so mingw32 or darwin still
default to tls and just use 0 offset by default.

So, this patch changes the default for mingw32, darwin and everything else
except gnu-user*.h to be =global, and just forces those special i386.md
more secure patterns unconditionally (slightly changing the generated code
on Android, but it is one extra insn in prologue and one fewer in the
epilogue).

With this patch -mstack-protector-guard=tls is really for tls and =global
for pure var access and user can override the defaults on non-glibc targets,
but they should get a default that works there.
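
The default selection described above can be modelled in a few lines of C++. This is only a sketch: SspGuard and default_guard are hypothetical names, and the two booleans stand in for "TARGET_THREAD_SSP_OFFSET is defined" and TARGET_HAS_BIONIC; an explicit -mstack-protector-guard= option still overrides whatever this returns.

```cpp
#include <cassert>

// Hypothetical stand-ins for the SSP_TLS / SSP_GLOBAL enumerators in GCC.
enum class SspGuard { Tls, Global };

// Mirrors the patched default: start from Global and switch to Tls only
// when the target provides a TLS guard offset and is not Bionic.
SspGuard default_guard(bool has_thread_ssp_offset, bool has_bionic)
{
    SspGuard guard = SspGuard::Global;
    if (has_thread_ssp_offset && !has_bionic)
        guard = SspGuard::Tls;
    return guard;
}

// Small self-check covering the three target classes from the mail.
void check_defaults()
{
    assert(default_guard(true, false) == SspGuard::Tls);      // glibc-like
    assert(default_guard(true, true) == SspGuard::Global);    // Android
    assert(default_guard(false, false) == SspGuard::Global);  // mingw, darwin
}
```

Before the patch the same question was answered by TARGET_HAS_BIONIC alone, which is why mingw32 and darwin ended up with =tls and a zero offset.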

Bootstrapped/regtested on x86_64-linux and i686-linux, plus tested with a
cross to mingw, ok for trunk?

2018-11-21  Jakub Jelinek  

PR target/85644
PR target/86832
* config/i386/i386.c (ix86_option_override_internal): Default
ix86_stack_protector_guard to SSP_TLS only if TARGET_THREAD_SSP_OFFSET
is defined.
* config/i386/i386.md (stack_protect_set, stack_protect_set_<mode>,
stack_protect_test, stack_protect_test_<mode>): Use empty condition
instead of TARGET_SSP_TLS_GUARD.

--- gcc/config/i386/i386.c.jj   2018-11-20 21:39:00.905577452 +0100
+++ gcc/config/i386/i386.c  2018-11-21 18:02:49.448049161 +0100
@@ -4557,8 +4557,13 @@ ix86_option_override_internal (bool main
 
   /* Handle stack protector */
   if (!opts_set->x_ix86_stack_protector_guard)
-opts->x_ix86_stack_protector_guard
-  = TARGET_HAS_BIONIC ? SSP_GLOBAL : SSP_TLS;
+{
+  opts->x_ix86_stack_protector_guard = SSP_GLOBAL;
+#ifdef TARGET_THREAD_SSP_OFFSET
+  if (!TARGET_HAS_BIONIC)
+   opts->x_ix86_stack_protector_guard = SSP_TLS;
+#endif
+}
 
 #ifdef TARGET_THREAD_SSP_OFFSET
   ix86_stack_protector_guard_offset = TARGET_THREAD_SSP_OFFSET;
--- gcc/config/i386/i386.md.jj  2018-11-21 11:45:12.090721862 +0100
+++ gcc/config/i386/i386.md 2018-11-21 18:03:46.166119350 +0100
@@ -19010,7 +19010,7 @@ (define_insn "*prefetch_prefetchwt1"
 (define_expand "stack_protect_set"
   [(match_operand 0 "memory_operand")
(match_operand 1 "memory_operand")]
-  "TARGET_SSP_TLS_GUARD"
+  ""
 {
   rtx (*insn)(rtx, rtx);
 
@@ -19028,7 +19028,7 @@ (define_insn "stack_protect_set_<mode>"
UNSPEC_SP_SET))
(set (match_scratch:PTR 2 "=&r") (const_int 0))
(clobber (reg:CC FLAGS_REG))]
-  "TARGET_SSP_TLS_GUARD"
+  ""
   "mov{<imodesuffix>}\t{%1, %2|%2, %1}\;mov{<imodesuffix>}\t{%2, %0|%0, %2}\;xor{l}\t%k2, %k2"
   [(set_attr "type" "multi")])
 
@@ -19036,7 +19036,7 @@ (define_expand "stack_protect_test"
   [(match_operand 0 "memory_operand")
(match_operand 1 "memory_operand")
(match_operand 2)]
-  "TARGET_SSP_TLS_GUARD"
+  ""
 {
   rtx flags = gen_rtx_REG (CCZmode, FLAGS_REG);
 
@@ -19059,7 +19059,7 @@ (define_insn "stack_protect_test_<mode>"
 (match_operand:PTR 2 "memory_operand" "m")]
UNSPEC_SP_TEST))
(clobber (match_scratch:PTR 3 "=&r"))]
-  "TARGET_SSP_TLS_GUARD"
+  ""
   "mov{<imodesuffix>}\t{%1, %3|%3, %1}\;xor{<imodesuffix>}\t{%2, %3|%3, %2}"
   [(set_attr "type" "multi")])
 

Jakub


Re: [PATCH] Fix -fstack-protector* on darwin/mingw etc. (PR target/85644)

2018-11-21 Thread Jakub Jelinek
On Wed, Nov 21, 2018 at 11:21:18PM +0100, Jakub Jelinek wrote:
> As I wrote in the PR, before PR81708 commits,

Note, e.g. in 4.8, the stack_protector_* patterns weren't guarded with
something like TARGET_SSP_TLS_GUARD but with !TARGET_HAS_BIONIC,
which just means it was incorrectly implemented for Android initially
(should have been done by forcing there the non-*tls* insns for
!TARGET_HAS_BIONIC rather than failing the optab).

Jakub


Re: [C++ PATCH] Improve locations of id-expressions and operator "" (PR c++/87386, take 2)

2018-11-21 Thread Jason Merrill

On 11/21/18 1:49 PM, Jakub Jelinek wrote:

On Wed, Nov 21, 2018 at 01:29:15PM -0500, Jason Merrill wrote:

similarly for operator"" _F the column is under _ rather than first o.


I disagree with this one: the name of the declaration is operator""_F, so I
think the caret should go at the first o.


Right now when cp_parser_operator_function_id is called, it returns locus like:
operator new
  ^~~
operator delete []
  ^
operator ==
  ^
operator "" _foo
UNKNOWN_LOCATION
The last one is because for others we do return cp_expr (id, start_loc);
but for operator "" just return id;

So, do you suggest we should instead return
operator new
^~~~
operator delete []
^~
operator ==
^~~
operator "" _foo
^~~~
?


Yes.


That would mean cp_parser_operator_function_id would need to pass
location_t start_loc (the start of the operator token) to cp_parser_operator and
let that create a range in all cases rather than just for operator
new/delete.


Sure.

Jason



Re: [C++ PATCH] Fix ICE in maybe_explain_implicit_delete (PR c++/88122)

2018-11-21 Thread Jason Merrill

On 11/21/18 5:16 PM, Jakub Jelinek wrote:

Hi!

On the following testcase we ICE in maybe_explain_implicit_delete, because
FUNCTION_FIRST_USER_PARMTYPE (decl) is NULL - there are no user parameters
and ...
From what I understood, const_p is used only in certain cases like const vs.
non-const copy constructor or assignment operator, if the sfk has no user
parameters, usually parm_type is just the void_type terminating the argument
list and also not really interesting for const_p computation.
So, this patch just arranges to pass false as const_p in this case.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2018-11-21  Jakub Jelinek  

PR c++/88122
* method.c (maybe_explain_implicit_delete): If
FUNCTION_FIRST_USER_PARMTYPE (decl) is NULL, set const_p to false
instead of ICEing.

* g++.dg/cpp0x/implicit15.C: New test.


OK.

Jason



Re: [C++ PATCH] Improve locations of id-expressions and operator "" (PR c++/87386, take 3)

2018-11-21 Thread Jason Merrill

On 11/21/18 5:10 PM, Jakub Jelinek wrote:

On Wed, Nov 21, 2018 at 07:49:48PM +0100, Jakub Jelinek wrote:

So, do you suggest we should instead return
operator new
^~~~
operator delete []
^~
operator ==
^~~
operator "" _foo
^~~~
?
That would mean cp_parser_operator_function_id would need to pass
location_t start_loc (the start of the operator token) to cp_parser_operator and
let that create a range in all cases rather than just for operator
new/delete.


This version of the patch implements that.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?


OK.

Jason



[PATCH 3/7][v2][MSP430][TESTSUITE] Dynamically check if size_t is large enough for tests containing large structs/arrays

2018-11-21 Thread Jozef Lawrynowicz
On Wed, 14 Nov 2018 15:41:00 +
Jozef Lawrynowicz  wrote:

> Patch 3 sets up require-effective-target directives for tests which
> require the compilation of large arrays.
> Targets which have 16-bit or 20-bit size_t fail to compile tests with large
> arrays designed to test 32-bit or 64-bit behaviour. Rather than enumerating
> another target to skip, I've replaced the target selector in some tests with
> a size checking procedure:
> - size20plus (new)
> - size32plus
> size20plus checks to see if a 16-bit structure/array size is supported,
> similarly to how the existing size32plus checks to see if a 24-bit
> structure/array size is supported,

Added missing documentation for the new check_effective_target procs in the
attached patch.
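
The proc bodies are not visible in this excerpt, so the following is only a guess at the kind of probe size20plus performs: compile a declaration whose size no longer fits in 16 bits and see whether the compiler accepts it. The array name and the exact size are assumptions.

```cpp
#include <cassert>
#include <cstddef>

// An object one byte larger than 65536: representable only if array and
// structure sizes can exceed 16 bits.  A target with a 16-bit size_t is
// expected to reject this declaration at compile time.
char dummy[65537L];

constexpr std::size_t dummy_size = sizeof dummy;
static_assert(dummy_size == 65537, "array size must survive unchanged");
```

A test that carries dg-require-effective-target size20plus would then be skipped, rather than fail, on targets where such a declaration does not compile.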

From 1573a8392605a17e58c74be19ee5eb28950dc32d Mon Sep 17 00:00:00 2001
From: Jozef Lawrynowicz 
Date: Thu, 8 Nov 2018 22:39:12 +
Subject: [PATCH] [TESTSUITE] Dynamically check if size_t is large enough for
 tests containing large structs/arrays

2018-11-21  Jozef Lawrynowicz  

	gcc/ChangeLog:

	* doc/sourcebuild.texi: Document check_effective_target_size20plus.
	Clarify documentation for check_effective_target_size32plus.

	gcc/testsuite/ChangeLog:

	* gcc.c-torture/compile/20151204.c: Add dg-require-effective-target
	size20plus.
	* gcc.dg/pr34225.c: Likewise.
	* gcc.dg/pr40971.c: Likewise.
	* gcc.dg/pr69071.c: Likewise.
	* gcc.dg/tree-ssa/loop-interchange-10.c: Likewise.
	* gcc.dg/tree-ssa/loop-interchange-2.c: Likewise.
	* gcc.dg/tree-ssa/loop-interchange-3.c: Likewise.
	* gcc.dg/tree-ssa/loop-interchange-5.c: Likewise.
	* gcc.dg/tree-ssa/loop-interchange-6.c: Likewise.
	* gcc.dg/tree-ssa/loop-interchange-7.c: Likewise.
	* gcc.dg/tree-ssa/loop-interchange-8.c: Likewise.
	* gcc.dg/tree-ssa/loop-interchange-9.c: Likewise.
	* gcc.dg/tree-ssa/loop-interchange-11.c: Add dg-require-effective-target
	size32plus.
	* gcc.dg/Walloc-size-larger-than-4.c: Likewise.
	* gcc.dg/Walloc-size-larger-than-5.c: Likewise.
	* gcc.dg/Walloc-size-larger-than-6.c: Likewise.
	* gcc.dg/Walloc-size-larger-than-7.c: Likewise.
	* gcc.dg/tree-ssa/loop-interchange-1.c: Likewise.
	* gcc.dg/tree-ssa/loop-interchange-1b.c: Likewise.
	* lib/target-supports.exp (check_effective_target_size20plus): New.
	(check_effective_target_size32plus): Update comment. 

---
 gcc/doc/sourcebuild.texi|  7 ++-
 gcc/testsuite/gcc.c-torture/compile/20151204.c  |  2 +-
 gcc/testsuite/gcc.dg/Walloc-size-larger-than-4.c|  2 +-
 gcc/testsuite/gcc.dg/Walloc-size-larger-than-5.c|  2 +-
 gcc/testsuite/gcc.dg/Walloc-size-larger-than-6.c|  2 +-
 gcc/testsuite/gcc.dg/Walloc-size-larger-than-7.c|  2 +-
 gcc/testsuite/gcc.dg/pr34225.c  |  1 +
 gcc/testsuite/gcc.dg/pr40971.c  |  1 +
 gcc/testsuite/gcc.dg/pr69071.c  |  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-1.c  |  3 ++-
 gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-10.c |  3 ++-
 gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-11.c |  3 ++-
 gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-1b.c |  3 ++-
 gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-2.c  |  3 ++-
 gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-3.c  |  3 ++-
 gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-5.c  |  3 ++-
 gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-6.c  |  3 ++-
 gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-7.c  |  3 ++-
 gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-8.c  |  3 ++-
 gcc/testsuite/gcc.dg/tree-ssa/loop-interchange-9.c  |  3 ++-
 gcc/testsuite/lib/target-supports.exp   | 18 +++---
 21 files changed, 51 insertions(+), 21 deletions(-)

diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index bca5db3..9c57226 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -1375,8 +1375,13 @@ Target supports @code{long double} that is longer than @code{double}.
 @item ptr32plus
 Target has pointers that are 32 bits or longer.
 
+@item size20plus
+Target has a 20-bit or larger address space, so at least supports
+16-bit array and structure sizes.
+
 @item size32plus
-Target supports array and structure sizes that are 32 bits or longer.
+Target has a 32-bit or larger address space, so at least supports
+24-bit array and structure sizes.
 
 @item 4byte_wchar_t
 Target has @code{wchar_t} that is at least 4 bytes.
diff --git a/gcc/testsuite/gcc.c-torture/compile/20151204.c b/gcc/testsuite/gcc.c-torture/compile/20151204.c
index 6a46abf..e41f6c1 100644
--- a/gcc/testsuite/gcc.c-torture/compile/20151204.c
+++ b/gcc/testsuite/gcc.c-torture/compile/20151204.c
@@ -1,4 +1,4 @@
-/* { dg-skip-if "Array too big" { "avr-*-*" "pdp11-*-*" } } */
+/* { dg-require-effective-target size20plus } */
 
 typedef __SIZE_TYPE__ size_t;
 
diff --git a/gcc/testsuite/gcc.dg/Walloc-size-larger-than-4.c b/gcc/testsuite/gcc.dg/Walloc-size-larger-than-4.c
index 4b3a64b..54e43cd 100644
--- a/gcc/testsuite/gcc.dg/Walloc-size-larger-than-4.c
+++ b/gcc/testsuite

[PATCH 6/7][v2][MSP430][TESTSUITE] Fix tests requiring float printf support when GCC was configured with --enable-newlib-nano-formatted-io

2018-11-21 Thread Jozef Lawrynowicz
On Wed, 14 Nov 2018 15:41:00 +
Jozef Lawrynowicz  wrote:

> Patch 6 fixes tests expecting printf float support for targets which have been
> configured with "newlib-nano-formatted-io". When newlib is configured in this
> way, float printf is enabled at build time by registering _printf_float as an
> undefined symbol.

Added missing documentation for the new check_effective_target procs in the
attached patch.

From ad5c2e3684904f961938cfc0b50445013300c6e0 Mon Sep 17 00:00:00 2001
From: Jozef Lawrynowicz 
Date: Sat, 10 Nov 2018 16:02:25 +
Subject: [PATCH] [TESTSUITE] Fix tests requiring float printf support when GCC
 was configured with --enable-newlib-nano-formatted-io

2018-11-21  Jozef Lawrynowicz  

	gcc/ChangeLog:

	* doc/sourcebuild.texi: Document check_effective_target_newlib_nano_io.

	gcc/testsuite/ChangeLog:

	* lib/target-supports.exp (check_effective_target_newlib_nano_io): New. 
	* gcc.c-torture/execute/920501-8.c: Register undefined linker symbol
	_printf_float for newlib_nano_io target.
	* gcc.c-torture/execute/930513-1.c: Likewise.
	* gcc.dg/torture/builtin-sprintf.c: Likewise.
	* gcc.c-torture/execute/ieee/920810-1.x: New.
---
 gcc/doc/sourcebuild.texi| 4 
 gcc/testsuite/gcc.c-torture/execute/920501-8.c  | 2 ++
 gcc/testsuite/gcc.c-torture/execute/930513-1.c  | 2 ++
 gcc/testsuite/gcc.c-torture/execute/ieee/920810-1.x | 4 
 gcc/testsuite/gcc.dg/torture/builtin-sprintf.c  | 3 ++-
 gcc/testsuite/lib/target-supports.exp   | 4 
 6 files changed, 18 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.c-torture/execute/ieee/920810-1.x

diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index 9c57226..bfaa0fd 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -2152,6 +2152,10 @@ Target supports @code{mmap}.
 @item newlib
 Target supports Newlib.
 
+@item newlib_nano_io
+GCC was configured with @code{--enable-newlib-nano-formatted-io}, which reduces
+the code size of Newlib formatted I/O functions.
+
 @item pow10
 Target provides @code{pow10} function.
 
diff --git a/gcc/testsuite/gcc.c-torture/execute/920501-8.c b/gcc/testsuite/gcc.c-torture/execute/920501-8.c
index 62780a0..7e4fa17 100644
--- a/gcc/testsuite/gcc.c-torture/execute/920501-8.c
+++ b/gcc/testsuite/gcc.c-torture/execute/920501-8.c
@@ -1,3 +1,5 @@
+/* { dg-additional-options "-Wl,-u,_printf_float" { target newlib_nano_io } } */
+
 #include 
 #include 
 
diff --git a/gcc/testsuite/gcc.c-torture/execute/930513-1.c b/gcc/testsuite/gcc.c-torture/execute/930513-1.c
index 4544471..f163007 100644
--- a/gcc/testsuite/gcc.c-torture/execute/930513-1.c
+++ b/gcc/testsuite/gcc.c-torture/execute/930513-1.c
@@ -1,3 +1,5 @@
+/* { dg-additional-options "-Wl,-u,_printf_float" { target newlib_nano_io } } */
+
 #include 
 char buf[2];
 
diff --git a/gcc/testsuite/gcc.c-torture/execute/ieee/920810-1.x b/gcc/testsuite/gcc.c-torture/execute/ieee/920810-1.x
new file mode 100644
index 000..8edec730
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/ieee/920810-1.x
@@ -0,0 +1,4 @@
+if { [check_effective_target_newlib_nano_io] } {
+lappend additional_flags "-Wl,-u,_printf_float"
+}
+return 0
diff --git a/gcc/testsuite/gcc.dg/torture/builtin-sprintf.c b/gcc/testsuite/gcc.dg/torture/builtin-sprintf.c
index 6f8b7a9..5684fd7 100644
--- a/gcc/testsuite/gcc.dg/torture/builtin-sprintf.c
+++ b/gcc/testsuite/gcc.dg/torture/builtin-sprintf.c
@@ -1,6 +1,7 @@
 /* PR tree-optimization/86274 - SEGFAULT when logging std::to_string(NAN)
{ dg-do run }
-   { dg-options "-O2 -Wall" } */
+   { dg-options "-O2 -Wall" }
+   { dg-additional-options "-Wl,-u,_printf_float" { target newlib_nano_io } } */
 
 #define X"0xdeadbeef"
 #define nan(x)   __builtin_nan (x)
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 7488653..d696fc6 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -6691,6 +6691,10 @@ proc check_effective_target_newlib {} {
 	#include 
 }]
 }
+# Return true if GCC was configured with --enable-newlib-nano-formatted-io
+proc check_effective_target_newlib_nano_io { } {
+return [check_configured_with "--enable-newlib-nano-formatted-io"]
+}
 
 # Some newlib versions don't provide a frexpl and instead depend
 # on frexp to implement long double conversions in their printf-like
-- 
2.7.4



[PATCH 7/7][v2][MSP430][TESTSUITE] Fix tests for msp430-elf large memory model

2018-11-21 Thread Jozef Lawrynowicz
On Wed, 14 Nov 2018 15:41:00 +
Jozef Lawrynowicz  wrote:

> Patch 7 fixes tests for msp430-elf in the large memory model.

Added missing documentation for the new check_effective_target procs in the
attached patch.


From 4cfb2ecd0e0580f69790fadd68b77e8a82992ef4 Mon Sep 17 00:00:00 2001
From: Jozef Lawrynowicz 
Date: Sat, 10 Nov 2018 16:08:44 +
Subject: [PATCH] [TESTSUITE] Fix tests for msp430-elf large memory model

2018-11-21  Jozef Lawrynowicz  

	gcc/ChangeLog:

	* doc/sourcebuild.texi: Document
	check_effective_target_msp430_large_mem.

	gcc/testsuite/ChangeLog:

	* gcc.c-torture/execute/991014-1.c: Fix bufsize definition for
	msp430 large memory model.
	* gcc.dg/Walloca-1.c: Don't expect warning for msp430 large memory
	model.
	* gcc.dg/Walloca-2.c: Likewise.
	* gcc.dg/c99-const-expr-2.c: Define ZERO macro for msp430 large memory
	model.
	* gcc.dg/format/format.h: Prefix typedefs using __SIZE_TYPE__ and
	__PTRDIFF_TYPE__ with __extension__.
	* gcc.dg/lto/20081210-1_0.c: Always typedef uintptr_t as
	__UINTPTR_TYPE__.
	* gcc.dg/pr36227.c: Likewise.
	* gcc.dg/pr42611.c: Use __INTPTR_MAX__ as the maximum object size if
	size_t and ptr_t are the same size.
	* gcc.dg/pr78973.c: dg-warning XFAIL for int16 but not msp430 large
	memory model.
	* gcc.dg/tree-ssa/builtin-sprintf-warn-3.c: Update dg-warning
	directives for msp430 large memory model.
	* gcc.dg/tree-ssa/pr66449.c: Always use __INTPTR_TYPE__ when integer
	type equal in size to ptr_t is required.
	* gcc.dg/tree-ssa/ssa-dom-thread-8.c: Extend pointer size checking
	macro for msp430.
	* lib/target-supports.exp (check_effective_target_msp430_large_mem):
	New. 
---
 gcc/doc/sourcebuild.texi   |  8 ++
 gcc/testsuite/gcc.c-torture/execute/991014-1.c |  7 -
 gcc/testsuite/gcc.dg/Walloca-1.c   |  4 +--
 gcc/testsuite/gcc.dg/Walloca-2.c   |  8 +++---
 gcc/testsuite/gcc.dg/c99-const-expr-2.c|  2 ++
 gcc/testsuite/gcc.dg/format/format.h   |  6 ++--
 gcc/testsuite/gcc.dg/lto/20081210-1_0.c|  8 +-
 gcc/testsuite/gcc.dg/pr36227.c | 10 +--
 gcc/testsuite/gcc.dg/pr42611.c |  3 +-
 gcc/testsuite/gcc.dg/pr78973.c |  2 +-
 .../gcc.dg/tree-ssa/builtin-sprintf-warn-3.c   | 32 +++---
 gcc/testsuite/gcc.dg/tree-ssa/pr66449.c|  8 ++
 gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-8.c   |  8 +++---
 gcc/testsuite/lib/target-supports.exp  | 13 +
 14 files changed, 66 insertions(+), 53 deletions(-)

diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index bfaa0fd..b5fac4e 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -1941,6 +1941,14 @@ when using the new ABI.
 MIPS target supports @code{-mpaired-single}.
 @end table
 
+@subsubsection MSP430-specific attributes
+
+@table @code
+@item msp430_large_mem
+The MSP430 large memory model (enabled with @code{-mlarge} compiler flag)
+is in use.
+@end table
+
 @subsubsection PowerPC-specific attributes
 
 @table @code
diff --git a/gcc/testsuite/gcc.c-torture/execute/991014-1.c b/gcc/testsuite/gcc.c-torture/execute/991014-1.c
index e0bcd6d..95e38ce 100644
--- a/gcc/testsuite/gcc.c-torture/execute/991014-1.c
+++ b/gcc/testsuite/gcc.c-torture/execute/991014-1.c
@@ -1,11 +1,16 @@
-
 typedef __SIZE_TYPE__ Size_t;
 
+#ifdef __MSP430X_LARGE__
+/* size_t is __int20, so 20 bits, for __MSP430X_LARGE__, but __SIZEOF_POINTER__
+   returns the bytesize which is 4.  */
+#define bufsize ((1L << (20 - 2))-256)
+#else  /* !__MSP430X_LARGE__ */
 #if __SIZEOF_LONG__ < __SIZEOF_POINTER__
 #define bufsize ((1LL << (8 * sizeof(Size_t) - 2))-256)
 #else
 #define bufsize ((1L << (8 * sizeof(Size_t) - 2))-256)
 #endif
+#endif
 
 struct huge_struct
 {
diff --git a/gcc/testsuite/gcc.dg/Walloca-1.c b/gcc/testsuite/gcc.dg/Walloca-1.c
index 85e9160..c9a6c57 100644
--- a/gcc/testsuite/gcc.dg/Walloca-1.c
+++ b/gcc/testsuite/gcc.dg/Walloca-1.c
@@ -24,8 +24,8 @@ void foo1 (size_t len, size_t len2, size_t len3)
   char *s = alloca (123);
   useit (s);			// OK, constant argument to alloca
 
-  s = alloca (num);		// { dg-warning "large due to conversion" "" { target lp64 } }
-  // { dg-warning "unbounded use of 'alloca'" "" { target { ! lp64 } } .-1 }
+  s = alloca (num);		// { dg-warning "large due to conversion" "" { target { { lp64 } || { msp430_large_mem } } } }
+  // { dg-warning "unbounded use of 'alloca'" "" { target { { ! lp64 } && { ! msp430_large_mem } } } .-1 }
   useit (s);
 
   s = alloca (3);		/* { dg-warning "is too large" } */
diff --git a/gcc/testsuite/gcc.dg/Walloca-2.c b/gcc/testsuite/gcc.dg/Walloca-2.c
index 766ff8d..446c811 100644
--- a/gcc/testsuite/gcc.dg/Walloca-2.c
+++ b/gcc/testsuite/gcc.dg/Walloca-2.c
@@ -13,7 +13,7 @@ g1 (int n)
 // 32-bit targets because VRP is not giving us any range info for
 // the argument to __builtin_alloca.  This should be fixed by the
 // up
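
The bufsize change for 991014-1.c earlier in this patch comes down to simple arithmetic, which can be checked with a short C++ sketch. The helper names are hypothetical; the only inputs taken from the patch are that size_t has 20 value bits under __MSP430X_LARGE__ while its reported byte size is 4.

```cpp
#include <cassert>
#include <cstdint>

// Largest value representable with `bits` value bits.
constexpr std::uint64_t max_for_bits(unsigned bits)
{
    return (std::uint64_t{1} << bits) - 1;
}

// bufsize as the test computes it, parameterised on the width used.
constexpr std::uint64_t bufsize_for_bits(unsigned bits)
{
    return (std::uint64_t{1} << (bits - 2)) - 256;
}

// Derived from the byte size (8 * 4 bits): 2^30 - 256.
constexpr std::uint64_t from_sizeof = bufsize_for_bits(8 * 4);
// Derived from the real width of __int20: 2^18 - 256.
constexpr std::uint64_t from_width = bufsize_for_bits(20);

static_assert(from_sizeof > max_for_bits(20), "does not fit in 20 bits");
static_assert(from_width == 261888, "2^18 - 256");
static_assert(from_width <= max_for_bits(20), "fits in 20 bits");
```

So the generic formula would ask for an object far larger than a 20-bit size_t can describe, which is why the large memory model needs its own definition.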

[PATCH][libbacktrace] Factor out read_initial_length

2018-11-21 Thread Tom de Vries
Hi,

this patch factors out new function read_initial_length in dwarf.c.

Bootstrapped and reg-tested on x86_64.

OK for trunk?

Thanks,
- Tom

[libbacktrace] Factor out read_initial_length

2018-11-22  Tom de Vries  

* dwarf.c (read_initial_length): Factor out of ...
(build_address_map, read_line_info): ... here.

---
 libbacktrace/dwarf.c | 36 +---
 1 file changed, 21 insertions(+), 15 deletions(-)

diff --git a/libbacktrace/dwarf.c b/libbacktrace/dwarf.c
index c4f8732c7eb..4e93f120820 100644
--- a/libbacktrace/dwarf.c
+++ b/libbacktrace/dwarf.c
@@ -651,6 +651,25 @@ leb128_len (const unsigned char *p)
   return ret;
 }
 
+/* Read initial_length from BUF and advance the appropriate number of bytes.  */
+
+static uint64_t
+read_initial_length (struct dwarf_buf *buf, int *is_dwarf64)
+{
+  uint64_t len;
+
+  len = read_uint32 (buf);
+  if (len == 0xffffffff)
+{
+  len = read_uint64 (buf);
+  *is_dwarf64 = 1;
+}
+  else
+*is_dwarf64 = 0;
+
+  return len;
+}
+
 /* Free an abbreviations structure.  */
 
 static void
@@ -1463,14 +1482,7 @@ build_address_map (struct backtrace_state *state, uintptr_t base_address,
 
   unit_data_start = info.buf;
 
-  is_dwarf64 = 0;
-  len = read_uint32 (&info);
-  if (len == 0xffffffff)
-   {
- len = read_uint64 (&info);
- is_dwarf64 = 1;
-   }
-
+  len = read_initial_length (&info, &is_dwarf64);
   unit_buf = info;
   unit_buf.left = len;
 
@@ -2002,13 +2014,7 @@ read_line_info (struct backtrace_state *state, struct dwarf_data *ddata,
   line_buf.data = data;
   line_buf.reported_underflow = 0;
 
-  is_dwarf64 = 0;
-  len = read_uint32 (&line_buf);
-  if (len == 0xffffffff)
-{
-  len = read_uint64 (&line_buf);
-  is_dwarf64 = 1;
-}
+  len = read_initial_length (&line_buf, &is_dwarf64);
   line_buf.left = len;
 
   if (!read_line_header (state, u, is_dwarf64, &line_buf, hdr))
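
For readers without the libbacktrace sources at hand, the factored-out helper can be sketched standalone in C++. This is not the libbacktrace code: ByteCursor and read_le are hypothetical simplifications of struct dwarf_buf and the read_uint32/read_uint64 helpers, and the sketch assumes little-endian input. The one fixed fact is the DWARF rule that a 32-bit initial length of 0xffffffff means a 64-bit length follows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified stand-in for libbacktrace's struct dwarf_buf.
struct ByteCursor {
    const unsigned char *p;  // next unread byte
    std::size_t left;        // bytes remaining
};

// Read an n-byte little-endian integer and advance the cursor.
// On underflow, consume everything and return 0.
static std::uint64_t read_le(ByteCursor &c, std::size_t n)
{
    if (c.left < n) {
        c.left = 0;
        return 0;
    }
    std::uint64_t v = 0;
    for (std::size_t i = 0; i < n; ++i)
        v |= std::uint64_t{c.p[i]} << (8 * i);
    c.p += n;
    c.left -= n;
    return v;
}

// DWARF initial length: a 32-bit value, unless it is the escape
// 0xffffffff, in which case the real length is the following 64 bits.
static std::uint64_t read_initial_length(ByteCursor &c, bool &is_dwarf64)
{
    std::uint64_t len = read_le(c, 4);
    is_dwarf64 = (len == 0xffffffffu);
    if (is_dwarf64)
        len = read_le(c, 8);
    return len;
}
```

Callers use the returned length to bound the unit, much as build_address_map does with unit_buf.left, and keep is_dwarf64 to decide whether later offsets are 4 or 8 bytes wide.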


Re: [PATCH][libbacktrace] Factor out read_initial_length

2018-11-21 Thread Ian Lance Taylor
Tom de Vries  writes:

> [libbacktrace] Factor out read_initial_length
>
> 2018-11-22  Tom de Vries  
>
>   * dwarf.c (read_initial_length): Factor out of ...
>   (build_address_map, read_line_info): ... here.

This is OK.

Thanks.

Ian


Re: [C++ Patch] PR 84636 ("internal compiler error: Segmentation fault (identifier_p()/grokdeclarator())")

2018-11-21 Thread Paolo Carlini
... in fact I'm thinking that the below - which directly checks for 
unqualified_id to be non-null in both places - may be a better variant: 
among other things it means that for related testcases like:

typedef void a();
struct A
{ a a1: 1; };
we get the location of a1 right (we could also change the diagnostics in 
grokbitfield to use DECL_SOURCE_LOCATION and exploit it), and the 
testsuite doesn't need adjustments. Tested x86_64-linux.


Thanks, Paolo.


Index: cp/decl.c
===
--- cp/decl.c   (revision 266339)
+++ cp/decl.c   (working copy)
@@ -12165,7 +12165,8 @@ grokdeclarator (const cp_declarator *declarator,
 }
 
   if (ctype && TREE_CODE (type) == FUNCTION_TYPE && staticp < 2
-  && !(identifier_p (unqualified_id)
+  && !(unqualified_id
+  && identifier_p (unqualified_id)
   && IDENTIFIER_NEWDEL_OP_P (unqualified_id)))
 {
   cp_cv_quals real_quals = memfn_quals;
@@ -12245,8 +12246,7 @@ grokdeclarator (const cp_declarator *declarator,
error ("invalid use of %<::%>");
return error_mark_node;
  }
-   else if (TREE_CODE (type) == FUNCTION_TYPE
-|| TREE_CODE (type) == METHOD_TYPE)
+   else if (FUNC_OR_METHOD_TYPE_P (type) && unqualified_id)
  {
int publicp = 0;
tree function_context;
Index: testsuite/g++.dg/parse/bitfield6.C
===
--- testsuite/g++.dg/parse/bitfield6.C  (nonexistent)
+++ testsuite/g++.dg/parse/bitfield6.C  (working copy)
@@ -0,0 +1,6 @@
+// PR c++/84636
+
+typedef void a();
+struct A {
+a: 1;  // { dg-error "bit-field .\\. with non-integral type" }
+};


Re: [PATCH][libbacktrace] Handle DW_FORM_GNU_strp_alt

2018-11-21 Thread Tom de Vries
On 21-11-18 02:03, Ian Lance Taylor wrote:
> On Wed, Nov 14, 2018 at 6:45 AM, Tom de Vries  wrote:
>> On 14-11-18 14:25, Jakub Jelinek wrote:
>>> On Wed, Nov 14, 2018 at 02:08:05PM +0100, Tom de Vries wrote:
> +btest_dwz_CFLAGS = $(AM_CFLAGS) -g -O0

 Hmm, I already discovered that specifying the -O0 doesn't work, since
 it's overridden by $(CFLAGS).

 With a hack like this:
 ...
 diff --git a/libbacktrace/Makefile.am b/libbacktrace/Makefile.am
 index 2fec9bbb4b6..8bdf13b3546 100644
 --- a/libbacktrace/Makefile.am
 +++ b/libbacktrace/Makefile.am
 @@ -99,11 +99,14 @@ check_PROGRAMS += btest
  if HAVE_DWZ

  btest_dwz_SOURCES = btest_dwz.c testlib.c
 -btest_dwz_CFLAGS = $(AM_CFLAGS) -g -O0
 +btest_dwz_CFLAGS = $(AM_CFLAGS) -g
  btest_dwz_LDADD = libbacktrace.la

  check_PROGRAMS += btest_dwz

 +btest_dwz-btest_dwz.o: btest_dwz.c
 +   $(AM_V_CC)$(CC) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES)
 $(AM_CPPFLAGS) $(CPPFLAGS) $(btest_dwz_CFLAGS) $(CFLAGS) -O0 -c -o
 btest_dwz-btest_dwz.o `test -f 'btest_dwz.c' || echo
 '$(srcdir)/'`btest_dwz.c
>>>
>>> Can't you instead do something like:
>>> btest_dwz.o: CFLAGS += -g -O0
>>> or something similar
>>
>> Hi,
>>
>> yes, that works, thanks.
>>
>>> (whatever the corresponding goal is)?
>>
>> The goal is to run the testcase with a setting lower than -O2, such that
>> we can successfully run a substantial portion of the test without
>> needing support for DW_FORM_GNU_ref_alt.
>>
>> [ At O2 we get constprop versions of some functions, which have an
>> abstract origin, which tends to be moved to the common debug file by dwz
>> -m, after which we need support for DW_FORM_GNU_ref_alt to get to the
>> name of the function. ]
>>
>>> Otherwise, the patch looks generally ok to me,
>>
>> Great.
>>
>>> but yes, I've been wondering
>>> how you can get away with DW_FORM_GNU_ref_alt not implemented properly.
>>>
>>
>> Indeed, DW_FORM_GNU_ref_alt support is required to make this work in
>> general.
>>
>> But I observed that implementing just DW_FORM_GNU_strp_alt improves on
>> the current situation, so I thought it was worthwhile submitting this as
>> a separate patch.
>>
>> Updated patch attached (which also rewrites btest_dwz.c to an include of
>> btest.c, while disabling the inline tests that require DW_FORM_GNU_ref_alt).
> 
> Unfortunately the tests don't pass for me.
> 
> rm -f btest_dwz.debug
> cp btest_dwz btest_dwz_2
> cp btest_dwz btest_dwz_3
> dwz -m btest_dwz.debug btest_dwz_2 btest_dwz_3
> FAIL: btest_dwz_2
> FAIL: btest_dwz_3
> 
>> libbacktrace/btest_dwz_2
> test1: [0]: missing file name or function name
> FAIL: backtrace_full noinline
> test3: [0]: missing file name or function name
> FAIL: backtrace_simple noinline
> PASS: backtrace_syminfo variable
> 
>> libbacktrace/btest_dwz_3
> test1: [0]: missing file name or function name
> FAIL: backtrace_full noinline
> test3: [0]: missing file name or function name
> FAIL: backtrace_simple noinline
> PASS: backtrace_syminfo variable
> 

Hmm, I can't reproduce that. I'm reworking this patch into a patch
series that includes also support for DW_FORM_GNU_ref_alt, so I'm hoping
that that will fix the failures you're seeing.

>> +#define INLINE_TESTS 0
>> +#include "btest.c"
> 
> Please avoid this kind of #include game.  If you need to skip some
> tests (why?) use a command line option.  If you need to compile with
> different options, use automake features.
> 

The patch series with DW_FORM_GNU_ref_alt support added no longer
requires this.

>> +elf_open_debugfile_by_debugaltlink (struct backtrace_state *state,
> 
> Do we need this function?  It seems to be the same as
> elf_find_debugfile_by_debuglink.

Hmm, that's right. I've now updated the patch in my patch series.

I'll resubmit once the fix for PR88063 is in trunk (I need the keeping
track of units that that patch adds, for DW_FORM_GNU_ref_alt support).

Thanks for the review,
- Tom


Re: C++ PATCH to implement P1094R2, Nested inline namespaces

2018-11-21 Thread Marek Polacek
On Tue, Nov 20, 2018 at 04:59:46PM -0500, Jason Merrill wrote:
> On 11/19/18 5:12 PM, Marek Polacek wrote:
> > On Mon, Nov 19, 2018 at 10:33:17PM +0100, Jakub Jelinek wrote:
> > > On Mon, Nov 19, 2018 at 04:21:19PM -0500, Marek Polacek wrote:
> > > > 2018-11-19  Marek Polacek  
> > > > 
> > > > Implement P1094R2, Nested inline namespaces.
> > > > * g++.dg/cpp2a/nested-inline-ns1.C: New test.
> > > > * g++.dg/cpp2a/nested-inline-ns2.C: New test.
> > > > * g++.dg/cpp2a/nested-inline-ns3.C: New test.
> > > 
> > > Just a small testsuite comment.
> > > 
> > > > --- /dev/null
> > > > +++ gcc/testsuite/g++.dg/cpp2a/nested-inline-ns1.C
> > > > @@ -0,0 +1,26 @@
> > > > +// P1094R2
> > > > +// { dg-do compile { target c++2a } }
> > > 
> > > Especially because 2a testing isn't included by default, but also
> > > to make sure it works right even with -std=c++17, wouldn't it be better to
> > > drop the nested-inline-ns3.C test, make this test c++17 or
> > > even better always enabled, add dg-options "-Wpedantic" and
> > > just add dg-warning with c++17_down and c++14_down what should be
> > > warned on the 3 lines (with .-1 for c++14_down)?
> > > 
> > > Or if you want add some further testcases that will test how
> > > c++17 etc. will dg-error on those with -pedantic-errors etc.
> > 
> > Sure, I've made it { target c++11 } and dropped the third test:
> > 
> > Bootstrapped/regtested on x86_64-linux, ok for trunk?
> > 
> > 2018-11-19  Marek Polacek  
> > 
> > Implement P1094R2, Nested inline namespaces.
> > * parser.c (cp_parser_namespace_definition): Parse the optional inline
> > keyword in a nested-namespace-definition.  Adjust push_namespace call.
> > Formatting fix.
> > 
> > * g++.dg/cpp2a/nested-inline-ns1.C: New test.
> > * g++.dg/cpp2a/nested-inline-ns2.C: New test.
> > 
> > diff --git gcc/cp/parser.c gcc/cp/parser.c
> > index 292cce15676..f39e9d753d2 100644
> > --- gcc/cp/parser.c
> > +++ gcc/cp/parser.c
> > @@ -18872,6 +18872,7 @@ cp_parser_namespace_definition (cp_parser* parser)
> > cp_ensure_no_oacc_routine (parser);
> > > bool is_inline = cp_lexer_next_token_is_keyword (parser->lexer, RID_INLINE);
> > +  const bool topmost_inline_p = is_inline;
> > if (is_inline)
> >   {
> > @@ -18890,6 +18891,17 @@ cp_parser_namespace_definition (cp_parser* parser)
> >   {
> > identifier = NULL_TREE;
> > +  bool nested_inline_p = cp_lexer_next_token_is_keyword (parser->lexer,
> > +RID_INLINE);
> > +  if (nested_inline_p && nested_definition_count != 0)
> > +   {
> > + if (cxx_dialect < cxx2a)
> > +   pedwarn (cp_lexer_peek_token (parser->lexer)->location,
> > +OPT_Wpedantic, "nested inline namespace definitions only "
> > +"available with -std=c++2a or -std=gnu++2a");
> > + cp_lexer_consume_token (parser->lexer);
> > +   }
> 
> This looks like we won't get any diagnostic in lower conformance modes if
> there are multiple namespace scopes before the inline keyword.

If you mean something like
namespace A::B::C::inline D { }
then we do get a diagnostic.  nested-inline-ns1.C tests that.  Or do you
mean something else?
 
> > if (cp_lexer_next_token_is (parser->lexer, CPP_NAME))
> > {
> >   identifier = cp_parser_identifier (parser);
> > @@ -18904,7 +18916,12 @@ cp_parser_namespace_definition (cp_parser* parser)
> > }
> > if (cp_lexer_next_token_is_not (parser->lexer, CPP_SCOPE))
> > -   break;
> > +   {
> > + /* Don't forget that the innermost namespace might have been
> > +marked as inline.  */
> > + is_inline |= nested_inline_p;
> 
> This looks wrong: an inline namespace does not make its nested namespaces
> inline as well.

A nested namespace definition cannot be inline.  This is supposed to handle
cases such as
namespace A::B::inline C { ... }
because after 'C' we don't see :: so it breaks and we call push_namespace
outside the for loop.  So I still don't see a bug; do you have a test that
I got wrong?
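
To make the case above concrete, here is a minimal sketch (not taken from the
patch or its testsuite; the function names answer and check_nested_inline are
made up for illustration).  It assumes the P1094R2 reading that in
namespace A::B::inline C only C is inline, and it falls back to the expanded
pre-C++2a spelling when the compiler is not in C++2a mode:

```cpp
#include <cassert>

#if __cplusplus > 201703L
// C++2a spelling (P1094R2): only C is inline; A and B are ordinary namespaces.
namespace A::B::inline C {
  int answer () { return 42; }
}
#else
// Equivalent spelling for earlier standards.
namespace A { namespace B { inline namespace C {
  int answer () { return 42; }
} } }
#endif

// Hypothetical helper, only here to exercise the lookups.
void
check_nested_inline ()
{
  // Members of the inline namespace C are found through A::B ...
  assert (A::B::answer () == 42);
  // ... and stay reachable under their fully qualified name.
  assert (A::B::C::answer () == 42);
}
```

Calling check_nested_inline () from main exercises both lookups; A and B
themselves are not inline, so ::answer and A::answer would not resolve.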

Marek


Re: [PATCH, libphobos] Fix libphobos.shared testsuite for multilib tests

2018-11-21 Thread Iain Buclaw
On Sat, 17 Nov 2018 at 16:07, Johannes Pfau  wrote:
>
> Hi,
>
> the loadDR test in the libphobos.shared testsuite tries to dynamically load 
> the phobos library. The path for the library currently points to the main 
> multilib variant phobos library, causing other multilib variants to fail the 
> test. The attached patch uses $blddir instead of $objdir to fix this issue.
>
> ---
> libphobos/ChangeLog:
>
> 2018-11-17  Johannes Pfau  
>
> PR d/87824
> * testsuite/libphobos.shared/shared.exp: Set proper path to phobos 
> library for multilib builds.
>
> diff --git a/libphobos/testsuite/libphobos.shared/shared.exp 
> b/libphobos/testsuite/libphobos.shared/shared.exp
> index b3bdd..623e06259 100644
> --- a/libphobos/testsuite/libphobos.shared/shared.exp
> +++ b/libphobos/testsuite/libphobos.shared/shared.exp
> @@ -94,7 +94,7 @@ if { [is-effective-target dlopen] && [is-effective-target 
> pthread] } {
>  dg-test "$srcdir/$subdir/host.c" "-ldl -pthread" "$DEFAULT_CFLAGS"
>
>  # Test requires a command line argument to be passed to the program.
> -set libphobos_run_args "$objdir/../src/.libs/libgphobos.so"
> +set libphobos_run_args "${blddir}/src/.libs/libgphobos.${shlib_ext}"
>  dg-test "$srcdir/$subdir/loadDR.c" "-ldl -pthread -g" "$DEFAULT_CFLAGS"
>  set libphobos_run_args ""
>  }

OK.

I've checked and committed this; however, perhaps we should get you
set up with write-after-approval access.

-- 
Iain


Re: [PATCH] x86: Add -march=cascadelake

2018-11-21 Thread Wei Xiao
Jakub,

Thanks for the comments!
I have addressed them as attached.

Wei

gcc/
* common/config/i386/i386-common.c (processor_names): Add cascadelake.
(processor_alias_table): Add cascadelake.
* config.gcc: Add -march=cascadelake.
* config/i386/driver-i386.c
(host_detect_local_cpu): Detect cascadelake.
* config/i386/i386-c.c (ix86_target_macros_internal): Handle
cascadelake.
* config/i386/i386.c (ix86_cost): Add m_CASCADELAKE.
(processor_cost_table): Add cascadelake.
(get_builtin_code_for_version): Handle cascadelake.
(fold_builtin_cpu): Ditto.
* config/i386/i386.h (TARGET_CASCADELAKE, PROCESSOR_CASCADELAKE): New.
(PTA_CASCADELAKE): Ditto.
* doc/invoke.texi: Add -march=cascadelake.
gcc/testsuite/
* g++.target/i386/mv16.C: Handle new march.
* gcc.target/i386/funcspec-56.inc: Ditto.
libgcc/
* config/i386/cpuinfo.h: Add INTEL_COREI7_CASCADELAKE.

On Wed, Nov 21, 2018 at 7:09 PM, Jakub Jelinek wrote:
>
> On Wed, Nov 21, 2018 at 06:23:41PM +0800, Wei Xiao wrote:
> > The attached patch added -march=cascadelake for x86.
> > Tested with bootstrap and regression tests on x86_64. No regressions.
> > Is it ok for trunk?
>
> Not a real review, just nits:
>
> index bff4dfb..f7c1c98 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,18 @@
> +2018-11-21 Wei Xiao 
>
> Two spaces after date, two spaces before <.
>
> --- a/gcc/config/i386/driver-i386.c
> +++ b/gcc/config/i386/driver-i386.c
> @@ -857,6 +857,9 @@ const char *host_detect_local_cpu (int argc, const char 
> **argv)
>   /* Assume Ice Lake.  */
>   else if (has_gfni)
> cpu = "icelake-client";
> + /* Assume Cascade Lake.  */
> + if (has_avx512vnni)
> +   cpu = "cascadelake";
>   /* Assume Cannon Lake.  */
>   else if (has_avx512vbmi)
> cpu = "cannonlake";
>
> Doesn't this break handling of all the other CPUs?  I mean, it is a large
>   if (cond) ... else if (cond) ... else if (cond) ... else ...
> but you've added if without else before it into the middle.
>
> Jakub
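
The chain problem pointed out above can be reproduced in isolation.  The
following is only a sketch: cpu_features, detect_v1_shape,
detect_single_chain and the "skylake-avx512" fallback are hypothetical
stand-ins, not the code in driver-i386.c, and the single-chain variant is one
possible ordering rather than necessarily what the v2 patch does.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified stand-in for the has_* flags in
// host_detect_local_cpu.
struct cpu_features
{
  bool gfni;
  bool avx512vnni;
  bool avx512vbmi;
};

// Shape of the quoted hunk: the bare `if` ends the chain before it.
std::string
detect_v1_shape (const cpu_features &f)
{
  std::string cpu = "skylake-avx512";  // assumed fallback for this sketch
  if (f.gfni)
    cpu = "icelake-client";
  if (f.avx512vnni)  // new statement: runs even if the test above matched
    cpu = "cascadelake";
  else if (f.avx512vbmi)
    cpu = "cannonlake";
  return cpu;
}

// Same tests kept in a single chain: the first match wins.
std::string
detect_single_chain (const cpu_features &f)
{
  std::string cpu = "skylake-avx512";
  if (f.gfni)
    cpu = "icelake-client";
  else if (f.avx512vnni)
    cpu = "cascadelake";
  else if (f.avx512vbmi)
    cpu = "cannonlake";
  return cpu;
}

// Hypothetical helper comparing the two shapes on one input.
void
check_chain_shapes ()
{
  const cpu_features f = { true, true, true };
  // The earlier "icelake-client" match is overwritten.
  assert (detect_v1_shape (f) == "cascadelake");
  // With one chain, the earlier match is kept.
  assert (detect_single_chain (f) == "icelake-client");
}
```

With all three flags set, the first function returns "cascadelake" although
the GFNI test had already matched, which is the breakage described above.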


cascadelake-v2.diff
Description: Binary data


Re: [PATCH v3] [aarch64] Add CPU support for Ampere Computing's eMAG.

2018-11-21 Thread Andrew Pinski
One small comment.

On Tue, Nov 20, 2018 at 10:01 AM Christoph Muellner
 wrote:
>
> Tested with "make check" and no regressions found.
>
> This patch depends on the struct xgene1_prefetch_tune,
> which has been acknowledged already:
> https://gcc.gnu.org/ml/gcc-patches/2018-11/msg00985.html
>
> *** gcc/ChangeLog ***
>
> 2018-xx-xx  Christoph Muellner 
>
> * config/aarch64/aarch64-cores.def: Define emag.
> * config/aarch64/aarch64-tune.md: Regenerated with emag.
> * config/aarch64/aarch64.c (emag_tunings): New struct.
> * doc/invoke.texi: Document mtune value.
>
> Signed-off-by: Christoph Muellner 
> ---
>  gcc/config/aarch64/aarch64-cores.def |  3 +++
>  gcc/config/aarch64/aarch64-tune.md   |  2 +-
>  gcc/config/aarch64/aarch64.c | 25 +
>  gcc/doc/invoke.texi  |  2 +-
>  4 files changed, 30 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/config/aarch64/aarch64-cores.def 
> b/gcc/config/aarch64/aarch64-cores.def
> index 1f3ac56..68cca00 100644
> --- a/gcc/config/aarch64/aarch64-cores.def
> +++ b/gcc/config/aarch64/aarch64-cores.def
> @@ -61,6 +61,9 @@ AARCH64_CORE("thunderxt88",   thunderxt88,   thunderx,  8A, 
>  AARCH64_FL_FOR_ARCH
>  AARCH64_CORE("thunderxt81",   thunderxt81,   thunderx,  8A,  
> AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  0x43, 
> 0x0a2, -1)
>  AARCH64_CORE("thunderxt83",   thunderxt83,   thunderx,  8A,  
> AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  0x43, 
> 0x0a3, -1)
>
> +/* Ampere Computing cores. */
> +AARCH64_CORE("emag",emag,  xgene1,8A,  AARCH64_FL_FOR_ARCH8 
> | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, emag, 0x50, 0x000, 3)

I think you should add a comment explaining why this order is required,
like the one above for thunderxt88p1.
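
A rough sketch of why the order matters, assuming the host-CPU lookup takes
the first table entry whose implementer, part and variant match, with -1
meaning "any variant".  The types and names below (core_entry, find_core,
ANY_VARIANT, check_core_order) are made up for illustration; only the
0x50 / 0x000 / 3 / -1 values come from the entries in the patch.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical, cut-down version of one AARCH64_CORE entry.
struct core_entry
{
  const char *name;
  unsigned implementer;
  unsigned part;
  int variant;
};

const int ANY_VARIANT = -1;

// First match wins; ANY_VARIANT matches every variant number.
const char *
find_core (const core_entry *table, unsigned n,
           unsigned implementer, unsigned part, int variant)
{
  for (unsigned i = 0; i < n; i++)
    if (table[i].implementer == implementer
        && table[i].part == part
        && (table[i].variant == ANY_VARIANT || table[i].variant == variant))
      return table[i].name;
  return "generic";
}

// Hypothetical helper comparing the two possible orders.
void
check_core_order ()
{
  // Order used in the patch: the variant-specific entry comes first.
  const core_entry as_posted[] = {
    { "emag",   0x50, 0x000, 3 },
    { "xgene1", 0x50, 0x000, ANY_VARIANT },
  };
  // Swapped order: the wildcard entry shadows emag.
  const core_entry swapped[] = {
    { "xgene1", 0x50, 0x000, ANY_VARIANT },
    { "emag",   0x50, 0x000, 3 },
  };

  assert (std::strcmp (find_core (as_posted, 2, 0x50, 0x000, 3), "emag") == 0);
  assert (std::strcmp (find_core (as_posted, 2, 0x50, 0x000, 1), "xgene1") == 0);
  assert (std::strcmp (find_core (swapped, 2, 0x50, 0x000, 3), "xgene1") == 0);
}
```

Under that assumption, putting xgene1 first would make the emag entry
unreachable, which is what the requested comment should spell out.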

Thanks,
Andrew Pinski

> +
>  /* APM ('P') cores. */
>  AARCH64_CORE("xgene1",  xgene1,xgene1,8A,  AARCH64_FL_FOR_ARCH8, 
> xgene1, 0x50, 0x000, -1)
>
> diff --git a/gcc/config/aarch64/aarch64-tune.md 
> b/gcc/config/aarch64/aarch64-tune.md
> index fade1d4..2fc7f03 100644
> --- a/gcc/config/aarch64/aarch64-tune.md
> +++ b/gcc/config/aarch64/aarch64-tune.md
> @@ -1,5 +1,5 @@
>  ;; -*- buffer-read-only: t -*-
>  ;; Generated automatically by gentune.sh from aarch64-cores.def
>  (define_attr "tune"
> -   
> "cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,ares,tsv110,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55"
> +   
> "cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,emag,xgene1,falkor,qdf24xx,exynosm1,phecda,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,cortexa76,ares,tsv110,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55,cortexa76cortexa55"
> (const (symbol_ref "((enum attr_tune) aarch64_tune)")))
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index f7f88a9..995aafe 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -957,6 +957,31 @@ static const struct tune_params xgene1_tunings =
>&xgene1_prefetch_tune
>  };
>
> +static const struct tune_params emag_tunings =
> +{
> +  &xgene1_extra_costs,
> +  &xgene1_addrcost_table,
> +  &xgene1_regmove_cost,
> +  &xgene1_vector_cost,
> +  &generic_branch_cost,
> +  &xgene1_approx_modes,
> +  6, /* memmov_cost  */
> +  4, /* issue_rate  */
> +  AARCH64_FUSE_NOTHING, /* fusible_ops  */
> +  "16",/* function_align.  */
> +  "16",/* jump_align.  */
> +  "16",/* loop_align.  */
> +  2,   /* int_reassoc_width.  */
> +  4,   /* fp_reassoc_width.  */
> +  1,   /* vec_reassoc_width.  */
> +  2,   /* min_div_recip_mul_sf.  */
> +  2,   /* min_div_recip_mul_df.  */
> +  17,  /* max_case_values.  */
> +  tune_params::AUTOPREFETCHER_OFF, /* autoprefetcher_model.  */
> +  (AARCH64_EXTRA_TUNE_NO_LDP_STP_QREGS),   /* tune_flags.  */
> +  &xgene1_prefetch_tune
> +};
> +
>  static const struct tune_params qdf24xx_tunings =
>  {
>&qdf24xx_extra_costs,
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index e016dce..ac81fb2 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -15288,7 +15288,7 @@ Specify the name of the target processor for which 
> GCC should tune the
>  performance of the code.  Permissible values for this option are:
>  @samp{generic}, @samp{cortex-a35}, @samp{cortex-a53}, @samp{cortex-a55},
>  @samp{cortex-a57}, @samp{cortex-a72}, @samp{cortex-a73}, @samp{cortex-a75},
> -@samp{cortex-a76}, @samp{ares}, @samp{exynos-m1}, @samp{falkor},
> +@samp{cortex-a76}, @samp{ares}, @samp{exynos-m1}, @samp{emag}, @samp{falkor},
>  @samp{qdf24xx}, @samp{saphira}, @samp{phecda}, @