date:20180621

[PATCH] Avoid changing DR for scatter/gather refs

2018-06-21 Thread Richard Biener



This avoids changing DRs for scatters/gathers in vect_analyze_data_refs
so we can share them across different VF analyses.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.  I've
also built SPEC 2k6 on a haswell machine.

simd-lane to go!

Richard.

From 37c0f2f61d6aa0d3b9b7ae984e223cf2c7aa386b Mon Sep 17 00:00:00 2001
From: Richard Guenther 
Date: Wed, 20 Jun 2018 16:01:22 +0200
Subject: [PATCH] do-not-change-dr-for-scattergather

2018-06-21  Richard Biener  

* tree-data-ref.c (dr_step_indicator): Handle NULL DR_STEP.
* tree-vect-data-refs.c (vect_analyze_possibly_independent_ddr):
Avoid calling vect_mark_for_runtime_alias_test with gathers or scatters.
(vect_analyze_data_ref_dependence): Re-order checks to deal with
NULL DR_STEP.
(vect_record_base_alignments): Do not record base alignment
for gathers or scatters.
(vect_compute_data_ref_alignment): Drop return value that is always
true.  Bail out early for gathers or scatters.
(vect_enhance_data_refs_alignment): Bail out early for gathers
or scatters.
(vect_find_same_alignment_drs): Likewise.
(vect_analyze_data_refs_alignment): Remove dead code.
(vect_slp_analyze_and_verify_node_alignment): Likewise.
(vect_analyze_data_refs): For possible gathers or scatters do
not create an alternate DR, just check their possible validity
and mark them.  Adjust DECL_NONALIASED handling to not rely
on DR_BASE_ADDRESS.
* tree-vect-loop-manip.c (vect_update_inits_of_drs): Do not
update inits of gathers or scatters.
* tree-vect-patterns.c (vect_recog_mask_conversion_pattern):
Also copy gather/scatter flag to pattern vinfo.

diff --git a/gcc/tree-data-ref.c b/gcc/tree-data-ref.c
index 0917d6dff01..b163eaf841d 100644
--- a/gcc/tree-data-ref.c
+++ b/gcc/tree-data-ref.c
@@ -5454,6 +5454,8 @@ static tree
 dr_step_indicator (struct data_reference *dr, int useful_min)
 {
   tree step = DR_STEP (dr);
+  if (!step)
+    return NULL_TREE;
   STRIP_NOPS (step);
   /* Look for cases where the step is scaled by a positive constant
  integer, which will often be the access size.  If the multiplication
diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index 9c53fe86b87..38a058bd6b6 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -267,7 +267,11 @@ vect_analyze_possibly_independent_ddr (data_dependence_relation *ddr,
 
 Note that the alias checks will be removed if the VF ends up
 being small enough.  */
- return vect_mark_for_runtime_alias_test (ddr, loop_vinfo);
+ return (!STMT_VINFO_GATHER_SCATTER_P
+             (vinfo_for_stmt (DR_STMT (DDR_A (ddr))))
+         && !STMT_VINFO_GATHER_SCATTER_P
+               (vinfo_for_stmt (DR_STMT (DDR_B (ddr))))
+         && vect_mark_for_runtime_alias_test (ddr, loop_vinfo));
}
 }
   return true;
@@ -479,15 +483,15 @@ vect_analyze_data_ref_dependence (struct data_dependence_relation *ddr,
  if (loop->safelen < 2)
{
  tree indicator = dr_zero_step_indicator (dra);
- if (TREE_CODE (indicator) != INTEGER_CST)
-   vect_check_nonzero_value (loop_vinfo, indicator);
- else if (integer_zerop (indicator))
+ if (!indicator || integer_zerop (indicator))
{
  if (dump_enabled_p ())
dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
 "access also has a zero step\n");
  return true;
}
+ else if (TREE_CODE (indicator) != INTEGER_CST)
+   vect_check_nonzero_value (loop_vinfo, indicator);
}
  continue;
}
@@ -837,19 +841,18 @@ vect_record_base_alignments (vec_info *vinfo)
   FOR_EACH_VEC_ELT (vinfo->shared->datarefs, i, dr)
 {
   gimple *stmt = vect_dr_stmt (dr);
+  stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
   if (!DR_IS_CONDITIONAL_IN_STMT (dr)
- && STMT_VINFO_VECTORIZABLE (vinfo_for_stmt (stmt)))
+ && STMT_VINFO_VECTORIZABLE (stmt_info)
+ && !STMT_VINFO_GATHER_SCATTER_P (stmt_info))
{
  vect_record_base_alignment (vinfo, stmt, &DR_INNERMOST (dr));
 
  /* If DR is nested in the loop that is being vectorized, we can also
 record the alignment of the base wrt the outer loop.  */
  if (loop && nested_in_vect_loop_p (loop, stmt))
-   {
- stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
- vect_record_base_alignment
-(vinfo, stmt, &STMT_VINFO_DR_WRT_VEC_LOOP (stmt_info));
-   }
+   vect_record_base_alignment
+   (vinfo, stmt, &STMT_VINFO_DR_WRT_VEC_LOOP (stmt_info));
}
 }
 }
@@ -870,14 +873,12 @@ vect_calculate_target_alignment (

Re: [PATCH, PR85859][tail-merge] Factor out gimple_may_have_side_effects_p and use in stmt_local_def

2018-06-21 Thread Richard Biener

On Wed, Jun 20, 2018 at 11:32 PM Martin Sebor  wrote:
>
> On 06/20/2018 03:14 PM, Tom de Vries wrote:
> > Hi,
> >
> > Consider the test-case from the patch.  When compiled with "-O2 -fno-dce
> > -fno-isolate-erroneous-paths-dereference -fno-tree-dce -fno-tree-vrp" and
> > run, we get:
> > ...
> > $ ./a.out
> > Floating point exception
> > ...
> >
> > The problem is introduced by -ftree-tail-merge (enabled by -O2), so it
> > executes fine when compiled with -fno-tree-tail-merge.
> >
> > Tail-merge merges two blocks it considers equal:
> > ...
> > find_duplicates:  duplicate of 
> > Removing basic block 4
> >
> >   
> >   _6 = foo (0);
> >   iftmp.2_10 = (long int) _6;
> >   goto ; [100.00%]
> >
> >   
> >   iftmp.2_11 = (long int) &c;
> > ...
> > while the blocks in fact aren't equal from the point of view of side effects.
> > Executing bb3 causes the 'Floating point exception', while executing bb4
> > doesn't.
> >
> > This patch fixes the problem by factoring out a new function
> > gimple_may_have_side_effects_p from bb_no_side_effects_p, and reusing that
> > function in the side-effect test in stmt_local_def in tail-merge.

I think tail-merging and ifcombine need two different kinds of
no-side-effectness, so please do not factor it out.  In fact the name is
too general and confusing given we already have gimple_has_side_effects
(which also only means it _could_ have side-effects, not that it must
have them).

So simply add the call handling (and comment) to stmt_local_def (the
gimple_vdef check is redundant with the gimple_vuse one, btw).

OK with that change.

Thanks,
Richard.

> Would gimple.h be a better place to declare the function, to
> make it easier to use (i.e., without searching for the header
> to include when using it for the first time)?
>
> Martin
>
> PS With one exception, AFAICS, the functions called by
> gimple_may_have_side_effects_p are all defined in gimple.c, but
> gimple_uses_undefined_value_p is declared in gimple-ssa.h and
> defined in tree-ssa.c.  I don't know enough about how functions
> are divvied up among these files but it seems to me that the
> easiest scheme to understand and follow would be to declare all
> extern gimple_xxx functions in gimple.h, no matter where they
> are defined.
>
> >
> > The patch inhibits tail-merge in pr81192.c, because
> > gimple_may_have_side_effects_p tests for gimple_uses_undefined_value_p, which
> > triggers for this particular test-case.
> >
> > Bootstrapped and reg-tested on x86_64.
> >
> > OK for trunk?
> >
> > Thanks,
> > - Tom
> >
> > [tail-merge] Factor out gimple_may_have_side_effects_p and use in 
> > stmt_local_def
> >
> > 2018-06-20  Tom de Vries  
> >
> >   PR tree-optimization/85859
> >   * tree-ssa-ifcombine.c (gimple_may_have_side_effects_p): Factor out of
> >   ...
> >   (bb_no_side_effects_p): ... here.
> >   * tree-ssa-ifcombine.h: New file.
> >   (gimple_may_have_side_effects_p): Declare.
> >   * tree-ssa-tail-merge.c (stmt_local_def): Use
> >   gimple_may_have_side_effects_p.
> >
> >   * gcc.dg/pr85859.c: New test.
> >   * gcc.dg/pr81192.c: Update.
> >
> > ---
> >  gcc/testsuite/gcc.dg/pr81192.c |  2 +-
> >  gcc/testsuite/gcc.dg/pr85859.c | 19 +++
> >  gcc/tree-ssa-ifcombine.c   | 34 +++---
> >  gcc/tree-ssa-ifcombine.h   | 24 
> >  gcc/tree-ssa-tail-merge.c  |  5 ++---
> >  5 files changed, 69 insertions(+), 15 deletions(-)
> >
> > diff --git a/gcc/testsuite/gcc.dg/pr81192.c b/gcc/testsuite/gcc.dg/pr81192.c
> > index 0049f371b3d..9db641ba91d 100644
> > --- a/gcc/testsuite/gcc.dg/pr81192.c
> > +++ b/gcc/testsuite/gcc.dg/pr81192.c
> > @@ -24,4 +24,4 @@ fn2 (void)
> >;
> >  }
> >
> > -/* { dg-final { scan-tree-dump-times "(?n)find_duplicates:  
> > duplicate of " 1 "pre" } } */
> > +/* { dg-final { scan-tree-dump-not "(?n)find_duplicates:  duplicate 
> > of " "pre" } } */
> > diff --git a/gcc/testsuite/gcc.dg/pr85859.c b/gcc/testsuite/gcc.dg/pr85859.c
> > new file mode 100644
> > index 000..96eb9671137
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/pr85859.c
> > @@ -0,0 +1,19 @@
> > +/* { dg-do run } */
> > +/* { dg-options "-ftree-tail-merge -Wno-div-by-zero -O2 -fno-dce 
> > -fno-isolate-erroneous-paths-dereference -fno-tree-dce -fno-tree-vrp" } */
> > +
> > +int b, c, d, e;
> > +
> > +__attribute__ ((noinline, noclone))
> > +int foo (short f)
> > +{
> > +  f %= 0;
> > +  return f;
> > +}
> > +
> > +int
> > +main (void)
> > +{
> > +  b = (unsigned char) __builtin_parity (d);
> > +  e ? foo (0) : (long) &c;
> > +  return 0;
> > +}
> > diff --git a/gcc/tree-ssa-ifcombine.c b/gcc/tree-ssa-ifcombine.c
> > index b63c600c47b..8ea51a793f9 100644
> > --- a/gcc/tree-ssa-ifcombine.c
> > +++ b/gcc/tree-ssa-ifcombine.c
> > @@ -40,6 +40,7 @@ along with GCC; see the file COPYING3.  If not see
> >  #include "gimplify-me.h"
> >  #include "tree-cfg.h"
> >  #include "tree-ssa.h"
> > +#include "tree-ssa-ifco

Re: [PATCH] Fix IPA crash in libgccjit

2018-06-21 Thread Richard Biener

On Thu, Jun 21, 2018 at 1:07 AM David Malcolm  wrote:
>
> All/most of the jit.dg testcases are segfaulting on cleanup of
> the 2nd in-process iteration:
>
> PATH=.:$PATH LD_LIBRARY_PATH=. LIBRARY_PATH=. \
>  gdb --args \
>testsuite/jit/test-factorial.c.exe
>
> Starting program: 
> /home/david/coding-3/gcc-git-static-analysis/build/gcc/testsuite/jit/test-factorial.c.exe
> PASSED: test-factorial.c.exe iteration 1 of 5: set_up_logging: 
> logfile is non-null
> NOTE: test-factorial.c.exe iteration 1 of 5: writing reproducer to 
> /home/david/coding-3/gcc-git-static-analysis/build/gcc/testsuite/jit/test-factorial.c.exe.reproducer.c
> Detaching after fork from child process 35787.
> Detaching after fork from child process 35789.
> PASSED: test-factorial.c.exe iteration 1 of 5: verify_code: result is 
> non-null
> PASSED: test-factorial.c.exe iteration 1 of 5: verify_code: 
> my_factorial is non-null
> NOTE: my_factorial returned: 3628800
> PASSED: test-factorial.c.exe iteration 1 of 5: verify_code: actual: 
> val == expected: 3628800
> PASSED: test-factorial.c.exe iteration 2 of 5: set_up_logging: 
> logfile is non-null
> NOTE: test-factorial.c.exe iteration 2 of 5: writing reproducer to 
> /home/david/coding-3/gcc-git-static-analysis/build/gcc/testsuite/jit/test-factorial.c.exe.reproducer.c
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x771abc75 in ipcp_driver () at ../../src/gcc/ipa-cp.c:5091
> 5091  delete edge_clone_summaries;
>
> This appears to be due to recent(?) IPA changes that appear to assume
> that the IPA code is only initialized and cleaned up once.
>
> This patch fixes the crashes:
>
> Changes to jit.sum
> --
>
>   FAIL: 65->0 (-65)
>   PASS: 3186->10290 (+7104)
>   UNRESOLVED: 1->0 (-1)
>
> Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
>
> OK for trunk?

OK.

Richard.

> gcc/ChangeLog:
> * ipa-cp.c (ipcp_driver): Set edge_clone_summaries to NULL after
> deleting it.
> * ipa-reference.c (ipa_reference_c_finalize): Delete
> ipa_ref_opt_sum_summaries and set it to NULL.
> ---
>  gcc/ipa-cp.c| 1 +
>  gcc/ipa-reference.c | 6 ++
>  2 files changed, 7 insertions(+)
>
> diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
> index c192e84..42dd4cc 100644
> --- a/gcc/ipa-cp.c
> +++ b/gcc/ipa-cp.c
> @@ -5089,6 +5089,7 @@ ipcp_driver (void)
>/* Free all IPCP structures.  */
>free_toporder_info (&topo);
>delete edge_clone_summaries;
> +  edge_clone_summaries = NULL;
>ipa_free_all_structures_after_ipa_cp ();
>if (dump_file)
>  fprintf (dump_file, "\nIPA constant propagation end\n");
> diff --git a/gcc/ipa-reference.c b/gcc/ipa-reference.c
> index 9a9e94c..43bbdae 100644
> --- a/gcc/ipa-reference.c
> +++ b/gcc/ipa-reference.c
> @@ -1230,6 +1230,12 @@ make_pass_ipa_reference (gcc::context *ctxt)
>  void
>  ipa_reference_c_finalize (void)
>  {
> +  if (ipa_ref_opt_sum_summaries != NULL)
> +{
> +  delete ipa_ref_opt_sum_summaries;
> +  ipa_ref_opt_sum_summaries = NULL;
> +}
> +
>if (ipa_init_p)
>  {
>bitmap_obstack_release (&optimization_summary_obstack);
> --
> 1.8.5.3
>
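
The pattern in the quoted patch (delete a global summary, then reset the
pointer to NULL) can be sketched in plain C.  This is only an illustration
under stated assumptions: struct summaries, global_summaries, pass_init,
pass_finalize and run_iterations are hypothetical names, not GCC's, and the
real objects are C++ classes released with delete rather than free.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a pass-global table such as
   edge_clone_summaries; not the real GCC type.  */
struct summaries
{
  int n_entries;
};

struct summaries *global_summaries = NULL;

/* Set up the global state for one in-process compilation.  The
   assertion only holds if the previous run reset the pointer.  */
void
pass_init (void)
{
  assert (global_summaries == NULL);
  global_summaries = calloc (1, sizeof *global_summaries);
  if (global_summaries == NULL)
    abort ();
}

/* Release the state and clear the pointer, so that a later run in the
   same process never sees a dangling value.  */
void
pass_finalize (void)
{
  free (global_summaries);
  global_summaries = NULL;
}

/* Mimic the jit testsuite running several compilations in one process.
   Returns the number of iterations that completed.  */
int
run_iterations (int n)
{
  int completed = 0;
  for (int i = 0; i < n; i++)
    {
      pass_init ();
      pass_finalize ();
      completed++;
    }
  return completed;
}
```

Calling run_iterations (5) corresponds to the "iteration N of 5" runs in the
log above; without the reset in pass_finalize the assertion in pass_init
would fire on the second iteration.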

Re: [PATCH][ARM] Use __ARM_ARCH instead of __ARM_ARCH__

2018-06-21 Thread Kyrill Tkachov




On 21/06/18 07:59, Christophe Lyon wrote:

On Tue, 19 Jun 2018 at 10:50, Kyrill Tkachov
 wrote:

Hi Christophe,

On 17/06/18 21:23, Christophe Lyon wrote:

On Fri, 15 Jun 2018 at 17:22, Richard Earnshaw (lists)
 wrote:

On 15/06/18 15:30, Christophe Lyon wrote:

Hello,

As suggested in [1], the attached patch removes all definitions and
uses of __ARM_ARCH__ and uses __ARM_ARCH instead. The latter is indeed
defined by the preprocessor to the appropriate value.

I ran make check on arm-none-eabi (with A-profile multilib),
arm-none-linux-gnueabi, arm-none-linux-gnueabihf (with cortex-a9, a15,
a5, a57 and armtdmi as --with-cpu), armeb-none-linux-gnueabihf and
armv8l-linux-gnueabihf, and noticed no regression.

OK for trunk?

Thanks,

Christophe

[1] https://gcc.gnu.org/ml/gcc-patches/2018-06/msg00445.html


ARM_ARCH.chlog.txt


libatomic/ChangeLog:

2018-06-15  Christophe Lyon 

   * config/arm/arm-config.h (__ARM_ARCH__): Remove definitions, use
   __ARM_ARCH instead.

libgcc/ChangeLog:

2018-06-15  Christophe Lyon 

   * config/arm/lib1funcs.S (__ARM_ARCH__): Remove definitions, use
   __ARM_ARCH instead.
   * config/arm/ieee754-df.S: Use __ARM_ARCH instead of __ARM_ARCH__.
   * config/arm/ieee754-sf.S: Likewise.
   * config/arm/libunwind.S: Likewise.


ARM_ARCH.patch.txt


Thanks, this is a useful start.  We can, however, go further.  ACLE
defines a number of 'feature' pre-defines and we can use those to avoid
testing the architecture version directly.  For example,
__ARM_FEATURE_LDREX could directly replace having to calculate
HAVE_STREX and HAVE_STREXBHD.
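
As a rough C sketch of the idea (the libgcc sources under discussion are
assembly, so this is an analogy rather than the patch itself), a feature
pre-define such as __ARM_FEATURE_CLZ can select the fast path directly
instead of comparing __ARM_ARCH against a version number.  The function name
count_leading_zeros is made up for the example, and __builtin_clz is the
GCC/Clang builtin.

```c
#include <assert.h>
#include <stdint.h>

/* Count the leading zero bits of a nonzero 32-bit value.  */
unsigned
count_leading_zeros (uint32_t x)
{
  assert (x != 0);
#if defined (__ARM_FEATURE_CLZ)
  /* The ACLE feature macro says a CLZ instruction is available, so no
     architecture-version comparison is needed here.  */
  return (unsigned) __builtin_clz (x);
#else
  /* Fallback for targets without CLZ (and for non-Arm hosts): shift
     until the top bit is set.  */
  unsigned n = 0;
  while (!(x & 0x80000000u))
    {
      x <<= 1;
      n++;
    }
  return n;
#endif
}
```

Both branches return the same value, so the choice only affects which
instructions get used.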


Hi,

Here is an updated patch using __ARM_FEATURE_LDREX.
I didn't find other opportunities to use ACLE pre-defines, did I miss any?


Thanks for doing this. I think we can catch a few more...


OK, I didn't grep accurately enough it seems.

Here is a new version hopefully addressing your comments.


Yes, that looks good now.


However, I'm not sure whether replacing uses of __ARM_ARCH__ and
removing support for arches < 4 should be in the same patch: this goes
beyond my original intent, and I've noticed probable dead code in
include/longlong.h (support for umul_ppmm on arm v2 and v3)


I see your point. It could indeed be cleaner if the code removal hunk was
put in a separate patch. A bugzilla entry about the dead code to be removed
would be appreciated, I can take care of that then.


Similarly there is code to define __ARM_ARCH in libffi/src/arm/sysv.S.


I believe libffi is its own separate project that we import in GCC, so it may
want to support compiling with older GCC versions. I'd need to double-check that.


So it seems further cleanup would be needed.


Indeed. This patch is ok with the __ARM_ARCH < 4 path removals separated into
their own patch.

Thanks,
Kyrill



Christophe



diff --git a/libgcc/config/arm/ieee754-df.S b/libgcc/config/arm/ieee754-df.S
index 570e5f6..7c5260e 100644
--- a/libgcc/config/arm/ieee754-df.S
+++ b/libgcc/config/arm/ieee754-df.S
@@ -245,7 +245,7 @@ LSYM(Lad_a):
 @ No rounding necessary since ip will always be 0 at this point.
   LSYM(Lad_l):

-#if __ARM_ARCH__ < 5
+#if __ARM_ARCH < 5

This path exists to handle the case when the CLZ instruction is not available 
(the #else path uses CLZ).
So we can change this to #ifndef __ARM_FEATURE_CLZ


 teq xh, #0
 movne   r3, #20
@@ -656,7 +656,7 @@ ARM_FUNC_ALIAS aeabi_dmul muldf3
 orr yh, yh, #0x0010
 beq LSYM(Lml_1)

-#if __ARM_ARCH__ < 4
+#if __ARM_ARCH < 4

We can delete this whole path as we no longer support anything older than 4

diff --git a/libgcc/config/arm/ieee754-sf.S b/libgcc/config/arm/ieee754-sf.S
index dac3e2e..00a8d9c 100644
--- a/libgcc/config/arm/ieee754-sf.S
+++ b/libgcc/config/arm/ieee754-sf.S
@@ -175,7 +175,7 @@ LSYM(Lad_a):
 @ No rounding necessary since r1 will always be 0 at this point.
   LSYM(Lad_l):

-#if __ARM_ARCH__ < 5
+#if __ARM_ARCH < 5

 movs    ip, r0, lsr #12
 moveq   r0, r0, lsl #12
@@ -370,7 +370,7 @@ ARM_FUNC_ALIAS aeabi_l2f floatdisf
 subeq   r3, r3, #(32 << 23)
   2:sub r3, r3, #(1 << 23)

-#if __ARM_ARCH__ < 5
+#if __ARM_ARCH < 5

 mov r2, #23
 cmp ip, #(1 << 16)

Similar comment on checking __ARM_FEATURE_CLZ in the above two checks.


   @@ -460,7 +460,7 @@ LSYM(Lml_x):
 orr r0, r3, r0, lsr #5
 orr r1, r3, r1, lsr #5

-#if __ARM_ARCH__ < 4
+#if __ARM_ARCH < 4

 @ Put sign bit in r3, which will be restored into r0 later.
 and r3, ip, #0x8000


Likewise on deleting the < 4 path.

diff --git a/libgcc/config/arm/lib1funcs.S b/libgcc/config/arm/lib1funcs.S
index 04c1b77..264d54a 100644
--- a/libgcc/config/arm/lib1funcs.S
+++ b/libgcc/config/arm/lib1funcs.S
@@ -74,49 +74,6 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If 
not, see



-#if __ARM_ARCH__ >= 5 && ! defined (__OPTIMIZE_SIZE__)
+#if __ARM_ARCH >= 5 && ! defined

[Patch, fortran] PR83118 - [7/8/9 Regression] Bad intrinsic assignment of class(*) array component of derived type

2018-06-21 Thread Paul Richard Thomas

The original problem was fixed by the patch for PR84546. This patch
fixes a variant that appears in comment #6.

The fix is completely straightforward and described by the comments
and ChangeLogs.

Bootstrapped and regtested on FC28/x86_64 - OK for trunk?

I am not sure that this problem is a regression on 7-branch and have
not yet checked if the patch is even compatible with it. However, I
can certainly fix 8-branch and will have a go at 7-branch.

Cheers

Paul

2018-06-21  Paul Thomas  

PR fortran/83118
* resolve.c (resolve_ordinary_assign): Force the creation of a
vtable for assignment of non-polymorphic expressions to an
unlimited polymorphic object.
* trans-array.c (gfc_alloc_allocatable_for_assignment): Use the
size of the rhs type for such assignments. Set the dtype, _len
and vptrs appropriately.
* trans-expr.c (gfc_trans_assignment): Force the use of the
_copy function for these assignments.

2018-06-21  Paul Thomas  

PR fortran/83118
* gfortran.dg/unlimited_polymorphic_30.f03: New test.
Index: gcc/fortran/resolve.c
===
*** gcc/fortran/resolve.c	(revision 261126)
--- gcc/fortran/resolve.c	(working copy)
*** resolve_ordinary_assign (gfc_code *code,
*** 10374,10379 
--- 10387,10397 
&& rhs->expr_type != EXPR_ARRAY)
  gfc_add_data_component (rhs);

+   /* Make sure there is a vtable and, in particular, a _copy for the
+  rhs type.  */
+   if (UNLIMITED_POLY (lhs) && lhs->rank && rhs->ts.type != BT_CLASS)
+ gfc_find_vtab (&rhs->ts);
+
bool caf_convert_to_send = flag_coarray == GFC_FCOARRAY_LIB
&& (lhs_coindexed
  	  || (code->expr2->expr_type == EXPR_FUNCTION
Index: gcc/fortran/trans-array.c
===
*** gcc/fortran/trans-array.c	(revision 261126)
--- gcc/fortran/trans-array.c	(working copy)
*** gfc_alloc_allocatable_for_assignment (gf
*** 9948,9953 
--- 9948,9955 
  			 gfc_array_index_type, tmp,
  			 expr1->ts.u.cl->backend_decl);
  }
+   else if (UNLIMITED_POLY (expr1) && expr2->ts.type != BT_CLASS)
+ tmp = TYPE_SIZE_UNIT (gfc_typenode_for_spec (&expr2->ts));
else
  tmp = TYPE_SIZE_UNIT (gfc_typenode_for_spec (&expr1->ts));
tmp = fold_convert (gfc_array_index_type, tmp);
*** gfc_alloc_allocatable_for_assignment (gf
*** 9974,9979 
--- 9976,10003 
gfc_add_modify (&fblock, tmp,
  		  gfc_get_dtype_rank_type (expr1->rank,type));
  }
+   else if (UNLIMITED_POLY (expr1) && expr2->ts.type != BT_CLASS)
+ {
+   tree type;
+   tmp = gfc_conv_descriptor_dtype (desc);
+   type = gfc_typenode_for_spec (&expr2->ts);
+   gfc_add_modify (&fblock, tmp,
+ 		  gfc_get_dtype_rank_type (expr2->rank,type));
+   /* Set the _len field as well...  */
+   tmp = gfc_class_len_get (TREE_OPERAND (desc, 0));
+   if (expr2->ts.type == BT_CHARACTER)
+ 	gfc_add_modify (&fblock, tmp,
+ 			fold_convert (TREE_TYPE (tmp),
+   TYPE_SIZE_UNIT (type)));
+   else
+ 	gfc_add_modify (&fblock, tmp,
+ 			build_int_cst (TREE_TYPE (tmp), 0));
+   /* ...and the vptr.  */
+   tmp = gfc_class_vptr_get (TREE_OPERAND (desc, 0));
+   tmp2 = gfc_get_symbol_decl (gfc_find_vtab (&expr2->ts));
+   tmp2 = gfc_build_addr_expr (TREE_TYPE (tmp), tmp2);
+   gfc_add_modify (&fblock, tmp, tmp2);
+ }
else if (coarray && GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (desc)))
  {
gfc_add_modify (&fblock, gfc_conv_descriptor_dtype (desc),
*** gfc_alloc_allocatable_for_assignment (gf
*** 10079,10088 


/* We already set the dtype in the case of deferred character
!  length arrays.  */
if (!(GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (desc))
  	&& ((expr1->ts.type == BT_CHARACTER && expr1->ts.deferred)
! 	|| coarray)))
  {
tmp = gfc_conv_descriptor_dtype (desc);
gfc_add_modify (&alloc_block, tmp, gfc_get_dtype (TREE_TYPE (desc)));
--- 10103,10113 


/* We already set the dtype in the case of deferred character
!  length arrays and unlimited polymorphic arrays.  */
if (!(GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (desc))
  	&& ((expr1->ts.type == BT_CHARACTER && expr1->ts.deferred)
! 	|| coarray))
!   && !UNLIMITED_POLY (expr1))
  {
tmp = gfc_conv_descriptor_dtype (desc);
gfc_add_modify (&alloc_block, tmp, gfc_get_dtype (TREE_TYPE (desc)));
Index: gcc/fortran/trans-expr.c
===
*** gcc/fortran/trans-expr.c	(revision 261126)
--- gcc/fortran/trans-expr.c	(working copy)
*** gfc_trans_assignment (gfc_expr * expr1,
*** 10431,10436 
--- 10431,10440 
  	return tmp;
  }

+   if (UNLIMITED_POLY (expr1) && expr1->rank
+   && expr2->ts.type != BT_CLASS)
+ use_vptr_copy = true;
+
/* Fallback to the scalarizer to generate explicit loops.

Re: [patch] Do not leak location information during inlining

2018-06-21 Thread Richard Biener

On Wed, Jun 20, 2018 at 4:37 PM Eric Botcazou  wrote:
>
> > There are fixes in this patch together with the new functionality -
> > can you split
> > those out?  I'm thinking of the copy_edges_for_bb hunks as well as
> > the expand_call_inline ones.
>
> Like this?

OK.

Similar factoring of remap_location and copying/remapping edges goto_locus
is a fix worth splitting out (and backporting eventually if a need arises).  It
interferes somewhat with the DECL_IGNORED parts but IMHO is separate from
those.

Thanks,
Richard.

>
> * tree-inline.c (copy_edges_for_bb): Minor tweak.
> (maybe_move_debug_stmts_to_successors): Also reset the locus of the
> debug statement when resetting its value.
> (expand_call_inline): Copy the locus of the call onto the assignment of
> the return value, if any.  Use local variable in more cases.
>
> --
> Eric Botcazou

Re: [Patch, fortran] PR49630 - [OOP] ICE on obsolescent deferred-length type bound character function

2018-06-21 Thread Paul Richard Thomas

Ping!

On 19 June 2018 at 10:16, Paul Richard Thomas
 wrote:
> I got caught up in a wild goose chase with this one. I tried to get it
> to work before seeing the standard reference in trans-expr.c. In fact,
> it would be impossible to fix because there is no way to resolve
> different instances of the abstract interface with different character
> lengths.
>
> Bootstrapped and regtested on FC28/x86_64 - OK for trunk.
>
> I do not intend to backport it unless there is any enthusiasm for me to do so.
>
> Regards
>
> Paul
>
> 2018-06-19  Paul Thomas  
>
> PR fortran/49630
> * resolve.c (resolve_contained_fntype): Change standard ref.
> from F95 to F2003: C418. Correct a spelling error in a comment.
> It is an error for an abstract interface to have an assumed
> character length result.
> * trans-expr.c (gfc_conv_procedure_call): Likewise change the
> standard reference.
>
> 2018-06-19  Paul Thomas  
>
> PR fortran/49630
> * gfortran.dg/assumed_charlen_function_7.f90: New test.



-- 
"If you can't explain it simply, you don't understand it well enough"
- Albert Einstein

Re: [AArch64][PATCH 1/2] Make AES unspecs commutative

2018-06-21 Thread Andre Simoes Dias Vieira

Hey Kyrill,

I think it should be possible, I'll have a quick look.

Cheers,
Andre


From: Kyrill Tkachov 
Sent: Wednesday, June 20, 2018 9:32:50 AM
To: Andre Simoes Dias Vieira; gcc-patches@gcc.gnu.org
Cc: nd; James Greenhalgh; Richard Earnshaw; Marcus Shawcroft
Subject: Re: [AArch64][PATCH 1/2] Make AES unspecs commutative

Hi Andre,

On 18/06/18 10:38, Andre Simoes Dias Vieira wrote:
> Hi,
>
> This patch teaches the AArch64 backend that the AESE and AESD unspecs are 
> commutative (which correspond to the vaeseq_u8 and vaesdq_u8 intrinsics). 
> This improves register allocation around their corresponding instructions 
> avoiding unnecessary moves.
>
> For instance, with the old patterns code such as:
>
> uint8x16_t
> test0 (uint8x16_t a, uint8x16_t b)
> {
>   uint8x16_t result;
>   result = vaeseq_u8 (a, b);
>   result = vaeseq_u8 (result, a);
>   return result;
> }
>
> would lead to suboptimal register allocation such as:
> test0:
>         mov     v2.16b, v0.16b
>         aese    v2.16b, v1.16b
>         mov     v1.16b, v2.16b
>         aese    v1.16b, v0.16b
>         mov     v0.16b, v1.16b
>         ret
>
> whereas with the new patterns we see:
>         aese    v1.16b, v0.16b
>         aese    v0.16b, v1.16b
>         ret
>
>
> Bootstrapped and tested on aarch64-none-linux-gnu.
>
> Is this OK for trunk?
>

Nice one!
Do you think we can get an equivalent patch for arm?

Thanks,
Kyrill

> Cheers,
> Andre
>
>
> gcc
> 2018-06-18  Andre Vieira 
>
>
> * config/aarch64/aarch64-simd.md (aarch64_crypto_aesv16qi):
> Make operands of the unspec commutative.
>
> gcc/testsuite
> 2018-06-18 Andre Vieira 
>
> * gcc/target/aarch64/aes_2.c: New test.
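
The commutativity claim in the quoted description follows from AESE first
XORing its two operands and only then applying the byte substitution and
row shift.  The sketch below is a toy model of that shape, not real AES:
toy_sub and the byte rotation are placeholders chosen for the example, and
every name in it is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BLOCK 16

/* Placeholder for SubBytes.  NOT the AES S-box; any fixed per-byte
   function is enough to show the point.  */
static uint8_t
toy_sub (uint8_t x)
{
  return (uint8_t) (x * 7u + 3u);
}

/* Toy model of AESE: combine the operands with XOR, then apply a fixed
   transformation to the combined state.  */
void
toy_aese (uint8_t out[BLOCK], const uint8_t a[BLOCK], const uint8_t b[BLOCK])
{
  uint8_t tmp[BLOCK];
  for (int i = 0; i < BLOCK; i++)
    tmp[i] = toy_sub ((uint8_t) (a[i] ^ b[i]));
  /* Byte rotation as a stand-in for ShiftRows.  */
  for (int i = 0; i < BLOCK; i++)
    out[i] = tmp[(i + 5) % BLOCK];
}

/* Return 1 if swapping the operands gives the same result.  */
int
toy_aese_commutes (void)
{
  uint8_t a[BLOCK], b[BLOCK], ab[BLOCK], ba[BLOCK];
  for (int i = 0; i < BLOCK; i++)
    {
      a[i] = (uint8_t) (i * 17 + 1);
      b[i] = (uint8_t) (255 - i * 3);
    }
  toy_aese (ab, a, b);
  toy_aese (ba, b, a);
  return memcmp (ab, ba, BLOCK) == 0;
}

/* Small self-check that can be called from a test driver.  */
void
toy_aese_self_check (void)
{
  assert (toy_aese_commutes ());
}
```

Because the operands only meet in the XOR, the register allocator is free to
pick whichever operand order avoids a move, which is what marking the unspec
commutative allows.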

Re: Fix safe iterator inconsistent assertion

2018-06-21 Thread Jonathan Wakely


On 21/06/18 07:36 +0200, François Dumont wrote:
Working on iterator == operator I noticed that a comparison in 
_Safe_iterator was inconsistent.


    * include/debug/debug.h


Wrong filename in the ChangeLog here.


    (_Safe_iterator<>(const _Safe_iterator<_MutableIterator,>& __x)):
    Compare __x base iterator with a default initialized iterator of the
    same type.


Please say value-initialized not default initialized (that's what
you're actually doing, and [forward.iterators] p2 only makes it
well-defined for value-initialized iterators).

OK with those changes to the ChangeLog, thanks.

Re: [PATCH] Add HXT Phecda core support

2018-06-21 Thread Kyrill Tkachov


Hi Hongbo,

On 20/06/18 03:54, Hongbo Zhang wrote:

HXT semiconductor's CPU core Phecda, as a variant of Qualcomm qdf24xx,
reuses the same tuning structure and pipeline with it.



This looks ok to me but you'll need approval from the maintainers.
Some comments on the ChangeLog below.


2018-06-19  Hongbo Zhang  

* config/aarch64/aarch64-cores.def (AARCH64_CORE): add phecda core
* config/aarch64/aarch64-tune.md: re-generated by gentune.sh
* doc/invoke.texi: add phecda core entry


Please use a capital first letter and end each sentence with a full stop.
That is, "Add phecda core."  The same goes for the other entries.
For aarch64-tune.md you can just say "Regenerate."

Thanks,
Kyrill


---
 gcc/config/aarch64/aarch64-cores.def | 3 +++
 gcc/config/aarch64/aarch64-tune.md   | 2 +-
 gcc/doc/invoke.texi  | 2 +-
 3 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-cores.def 
b/gcc/config/aarch64/aarch64-cores.def
index e64d831..0e3c0a0 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -61,6 +61,9 @@ AARCH64_CORE("thunderxt88", thunderxt88,   thunderx,  8A,  
AARCH64_FL_FOR_ARCH
 AARCH64_CORE("thunderxt81",   thunderxt81,   thunderx, 8A,  
AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  0x43, 0x0a2, -1)
 AARCH64_CORE("thunderxt83",   thunderxt83,   thunderx, 8A,  
AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  0x43, 0x0a3, -1)

+/* HXT ('H') cores. */
+AARCH64_CORE("phecda",  phecda,falkor,8A, AARCH64_FL_FOR_ARCH8 | 
AARCH64_FL_CRC | AARCH64_FL_CRYPTO, qdf24xx,   0x68, 0x000, -1)
+
 /* APM ('P') cores. */
 AARCH64_CORE("xgene1",  xgene1,xgene1,8A, AARCH64_FL_FOR_ARCH8, 
xgene1, 0x50, 0x000, -1)

diff --git a/gcc/config/aarch64/aarch64-tune.md 
b/gcc/config/aarch64/aarch64-tune.md
index 7b3a746..19b44d7 100644
--- a/gcc/config/aarch64/aarch64-tune.md
+++ b/gcc/config/aarch64/aarch64-tune.md
@@ -1,5 +1,5 @@
 ;; -*- buffer-read-only: t -*-
 ;; Generated automatically by gentune.sh from aarch64-cores.def
 (define_attr "tune"
- 
"cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,xgene1,falkor,qdf24xx,exynosm1,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55"
+ 
"cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,phecda,xgene1,falkor,qdf24xx,exynosm1,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55"
 (const (symbol_ref "((enum attr_tune) aarch64_tune)")))
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 940b846..43ef9ac 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -14667,7 +14667,7 @@ performance of the code. Permissible values for this 
option are:
 @samp{generic}, @samp{cortex-a35}, @samp{cortex-a53}, @samp{cortex-a55},
 @samp{cortex-a57}, @samp{cortex-a72}, @samp{cortex-a73}, @samp{cortex-a75},
 @samp{exynos-m1}, @samp{falkor}, @samp{qdf24xx}, @samp{saphira},
-@samp{xgene1}, @samp{vulcan}, @samp{thunderx},
+@samp{phecda}, @samp{xgene1}, @samp{vulcan}, @samp{thunderx},
 @samp{thunderxt88}, @samp{thunderxt88p1}, @samp{thunderxt81},
 @samp{thunderxt83}, @samp{thunderx2t99}, @samp{cortex-a57.cortex-a53},
 @samp{cortex-a72.cortex-a53}, @samp{cortex-a73.cortex-a35},
--
2.7.4

[PATCH] Consistently gimplify all-zero CTORs to = {};

2018-06-21 Thread Richard Biener



PR86223 points out that we currently gimplify the testcase inconsistently.
For the incomplete CTORs we use block-clearing while for the complete
one we emit initializations of the individual elements (up to the
limits imposed in following checks).

So the following makes us always use the = {}; form for all-zero CTORs,
which is the most compact for the GIMPLE IL and should only result in
better code (fingers crossed...), not least because SRA got the
ability to handle a = .LC0; style inits as well.
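
At the source level the complete and the incomplete all-zero initializer
mean the same thing, which is why one GIMPLE form can serve both.  A small
sketch, assuming nothing beyond standard C (the function names are made up
for the example):

```c
#include <assert.h>
#include <string.h>

/* Return 1 if both spellings of an all-zero initializer produce the
   same bytes as an explicit block clear.  */
int
all_zero_forms_agree (void)
{
  int complete[3] = { 0, 0, 0 };   /* every element written out */
  int incomplete[3] = { 0 };       /* remaining elements implicitly zero */
  int cleared[3];

  memset (cleared, 0, sizeof cleared);   /* the block-clear equivalent */

  return memcmp (complete, cleared, sizeof cleared) == 0
	 && memcmp (incomplete, cleared, sizeof cleared) == 0;
}

/* Small self-check that can be called from a test driver.  */
void
all_zero_self_check (void)
{
  assert (all_zero_forms_agree ());
}
```

This mirrors the g and h functions of the new testcase, where both
initializers are expected to gimplify to a = {};.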

Bootstrap & regtest running on x86_64-unknown-linux-gnu.

Richard.

2018-06-21  Richard Biener  

PR middle-end/86223
* gimplify.c (gimplify_init_constructor): For an all-zero
constructor emit a block-clear.

* gcc.dg/pr86223-1.c: New testcase.

Index: gcc/gimplify.c
===
--- gcc/gimplify.c  (revision 261839)
+++ gcc/gimplify.c  (working copy)
@@ -4805,6 +4805,10 @@ gimplify_init_constructor (tree *expr_p,
 requires trickery to avoid quadratic compile-time behavior in
 large cases or excessive memory use in small cases.  */
  cleared = !CONSTRUCTOR_NO_CLEARING (ctor);
+   else if (num_nonzero_elements == 0)
+ /* If all elements are zero it is most efficient to block-clear
+things.  */
+ cleared = true;
else if (num_ctor_elements - num_nonzero_elements
 > CLEAR_RATIO (optimize_function_for_speed_p (cfun))
 && num_nonzero_elements < num_ctor_elements / 4)
Index: gcc/testsuite/gcc.dg/pr86223-1.c
===
--- gcc/testsuite/gcc.dg/pr86223-1.c(revision 0)
+++ gcc/testsuite/gcc.dg/pr86223-1.c(working copy)
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-fdump-tree-gimple" } */
+
+void f (int *);
+void g ()
+{
+  int a[3] = { 0, 0, 0 };
+  f (a);
+}
+void h ()
+{
+  int a[3] = { 0 };
+  f (a);
+}
+
+/* We should use block-clearing for the initializer of a in both cases.  */
+/* { dg-final { scan-tree-dump-times "a = {};" 2 "gimple" } } */

[PATCH] Fix PR86232

2018-06-21 Thread Richard Biener



Bootstrap & regtest running on x86_64-unknown-linux-gnu.

Richard.

2018-06-21  Richard Biener  

PR tree-optimization/86232
* tree-ssa-loop-niter.c (number_of_iterations_popcount): Adjust
max for constant niter.

* gcc.dg/torture/pr86232.c: New testcase.

diff --git a/gcc/testsuite/gcc.dg/torture/pr86232.c 
b/gcc/testsuite/gcc.dg/torture/pr86232.c
new file mode 100644
index 000..f5b61d3dfb7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr86232.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+
+enum { a = 1 } b;
+int c()
+{
+  int d = a;
+  for (; d;)
+d &= d - 1;
+  return b;
+}
diff --git a/gcc/tree-ssa-loop-niter.c b/gcc/tree-ssa-loop-niter.c
index 936591502d0..f5ffc0f19ad 100644
--- a/gcc/tree-ssa-loop-niter.c
+++ b/gcc/tree-ssa-loop-niter.c
@@ -2575,9 +2575,6 @@ number_of_iterations_popcount (loop_p loop, edge exit,
 return false;
 
   /* Update NITER params accordingly  */
-  max = TYPE_PRECISION (TREE_TYPE (src));
-  if (adjust)
-max = max - 1;
   tree utype = unsigned_type_for (TREE_TYPE (src));
   src = fold_convert (utype, src);
   tree call = fold_convert (utype, build_call_expr (fn, 1, src));
@@ -2588,6 +2585,15 @@ number_of_iterations_popcount (loop_p loop, edge exit,
   else
 iter = call;
 
+  if (TREE_CODE (call) == INTEGER_CST)
+max = tree_to_uhwi (call);
+  else
+{
+  max = TYPE_PRECISION (TREE_TYPE (src));
+  if (adjust)
+   max = max - 1;
+}
+
   niter->niter = iter;
   niter->assumptions = boolean_true_node;
   if (adjust)

Re: [PATCH v2] [aarch64] Add HiSilicon tsv110 CPU support

2018-06-21 Thread Zhangshaokun

Hi Kyrill,

It was the Dragon Boat Festival, a short holiday in China; sorry for the
late reply.

On 2018/6/14 15:58, Kyrill Tkachov wrote:
> Hi Shaokun,
> 
> On 14/06/18 02:09, Shaokun Zhang wrote:
>> This patch adds HiSilicon's an mcpu: tsv110, which supports v8_4A.
>>
>> ---
>>   gcc/ChangeLog|   8 +++
>>   gcc/config/aarch64/aarch64-cores.def |   3 +
>>   gcc/config/aarch64/aarch64-cost-tables.h | 103 
>> +++
>>   gcc/config/aarch64/aarch64-tune.md   |   2 +-
>>   gcc/config/aarch64/aarch64.c |  80 +++-
>>   gcc/doc/invoke.texi  |   2 +-
>>   6 files changed, 195 insertions(+), 3 deletions(-)
>>
>> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
>> index 9c90875..e376714 100644
>> --- a/gcc/ChangeLog
>> +++ b/gcc/ChangeLog
>> @@ -1,3 +1,11 @@
>> +2018-06-12  Shaokun Zhang  
>> +Bo Zhou  
>> +* config/aarch64/aarch64-cores.def (tsv110): New CPU.
>> +* config/aarch64/aarch64-tune.md: Regenerated.
>> +* doc/invoke.texi (AArch64 Options/-mtune): Add "tsv110".
>> +* config/aarch64/aarch64.c (tsv110_tunings): New tuning table.
>> +* config/aarch64/aarch64-cost-tables.h: Add "tsv110" extra costs.
>> +
> 
> Can you confirm that you've run a bootstrap and test run with this patch
> to check there are no regressions?
> 

I have tested this patch (with some typos fixed) on aarch64 and didn't get any
regressions.

However, there is an issue on the master branch:
../.././gcc/bitmap.c: In function ‘unsigned int 
bitmap_last_set_bit(const_bitmap)’:
../.././gcc/bitmap.c:841:26: error: array subscript -1 is below array bounds of 
‘const BITMAP_WORD [2]’ {aka ‘const long unsigned int [2]’} 
[-Werror=array-bounds]
   word = elt->bits[ix];
  ^
cc1plus: all warnings being treated as errors
Makefile:1110: recipe for target 'bitmap.o' failed
make[3]: *** [bitmap.o] Error 1

My gcc version is: gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609.
Are you happy to fix it? I fixed it locally, but I am not sure it is ok.

> This version looks good to me but you'll need final approval from the 
> maintainers.
> 

I will update the patch based on the latest branch code today.
Hopefully you and the maintainers are happy with v3.

Thanks,
Shaokun.

> Thanks,
> Kyrill
> 
>>   2018-06-12  Eric Botcazou  
>> * gcc.c: Document new %@{...} sequence.
>> diff --git a/gcc/config/aarch64/aarch64-cores.def 
>> b/gcc/config/aarch64/aarch64-cores.def
>> index e64d831..e6ebf02 100644
>> --- a/gcc/config/aarch64/aarch64-cores.def
>> +++ b/gcc/config/aarch64/aarch64-cores.def
>> @@ -88,6 +88,9 @@ AARCH64_CORE("cortex-a75",  cortexa75, cortexa57, 8_2A,  
>> AARCH64_FL_FOR_ARCH8_2
>> /* ARMv8.4-A Architecture Processors.  */
>>   +/* HiSilicon ('H') cores. */
>> +AARCH64_CORE("tsv110", tsv110,cortexa57,8_4A, 
>> AARCH64_FL_FOR_ARCH8_4 | AARCH64_FL_CRYPTO | AARCH64_FL_F16 | AARCH64_FL_AES 
>> | AARCH64_FL_SHA2, tsv110,   0x48, 0xd01, -1)
>> +
>>   /* Qualcomm ('Q') cores. */
>>   AARCH64_CORE("saphira", saphira,falkor,8_4A,  
>> AARCH64_FL_FOR_ARCH8_4 | AARCH64_FL_CRYPTO | AARCH64_FL_RCPC, saphira,   
>> 0x51, 0xC01, -1)
>>   diff --git a/gcc/config/aarch64/aarch64-cost-tables.h 
>> b/gcc/config/aarch64/aarch64-cost-tables.h
>> index a455c62..b6890d6 100644
>> --- a/gcc/config/aarch64/aarch64-cost-tables.h
>> +++ b/gcc/config/aarch64/aarch64-cost-tables.h
>> @@ -334,4 +334,107 @@ const struct cpu_cost_table thunderx2t99_extra_costs =
>> }
>>   };
>>   +const struct cpu_cost_table tsv110_extra_costs =
>> +{
>> +  /* ALU */
>> +  {
>> +0, /* arith.  */
>> +0, /* logical.  */
>> +0, /* shift.  */
>> +0, /* shift_reg.  */
>> +COSTS_N_INSNS (1), /* arith_shift.  */
>> +COSTS_N_INSNS (1), /* arith_shift_reg.  */
>> +COSTS_N_INSNS (1), /* log_shift.  */
>> +COSTS_N_INSNS (1), /* log_shift_reg.  */
>> +0, /* extend.  */
>> +COSTS_N_INSNS (1), /* extend_arith.  */
>> +0, /* bfi.  */
>> +0, /* bfx.  */
>> +0, /* clz.  */
>> +0,   /* rev.  */
>> +0, /* non_exec.  */
>> +true   /* non_exec_costs_exec.  */
>> +  },
>> +  {
>> +/* MULT SImode */
>> +{
>> +  COSTS_N_INSNS (2),   /* simple.  */
>> +  COSTS_N_INSNS (2),   /* flag_setting.  */
>> +  COSTS_N_INSNS (2),   /* extend.  */
>> +  COSTS_N_INSNS (2),   /* add.  */
>> +  COSTS_N_INSNS (2),   /* extend_add.  */
>> +  COSTS_N_INSNS (11)   /* idiv.  */
>> +},
>> +/* MULT DImode */
>> +{
>> +  COSTS_N_INSNS (3),   /* simple.  */
>> +  0,   /* flag_setting (N/A).  */
>> +  COSTS_N_INSNS (3),   /* extend.  */
>> +  COSTS_N_INSNS (3),   /* add.  */
>> +  COSTS_N_INSNS (3

[PATCH v3] [aarch64] Add HiSilicon tsv110 CPU support

2018-06-21 Thread Shaokun Zhang

This patch adds a HiSilicon mcpu: tsv110, which supports v8_4A.
It has been tested on aarch64 with no regressions.

---
 gcc/ChangeLog|   8 +++
 gcc/config/aarch64/aarch64-cores.def |   3 +
 gcc/config/aarch64/aarch64-cost-tables.h | 103 +++
 gcc/config/aarch64/aarch64-tune.md   |   2 +-
 gcc/config/aarch64/aarch64.c |  82 
 gcc/doc/invoke.texi  |   2 +-
 6 files changed, 198 insertions(+), 2 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index d9fbc0c..f5538f7 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,11 @@
+2018-06-21  Shaokun Zhang  
+Bo Zhou  
+   * config/aarch64/aarch64-cores.def (tsv110): New CPU.
+   * config/aarch64/aarch64-tune.md: Regenerated.
+   * doc/invoke.texi (AArch64 Options/-mtune): Add "tsv110".
+   * config/aarch64/aarch64.c (tsv110_tunings): New tuning table.
+   * config/aarch64/aarch64-cost-tables.h: Add "tsv110" extra costs.
+
 2018-06-21  Richard Biener  
 
* tree-data-ref.c (dr_step_indicator): Handle NULL DR_STEP.
diff --git a/gcc/config/aarch64/aarch64-cores.def 
b/gcc/config/aarch64/aarch64-cores.def
index e64d831..e6ebf02 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -88,6 +88,9 @@ AARCH64_CORE("cortex-a75",  cortexa75, cortexa57, 8_2A,  
AARCH64_FL_FOR_ARCH8_2
 
 /* ARMv8.4-A Architecture Processors.  */
 
+/* HiSilicon ('H') cores. */
+AARCH64_CORE("tsv110", tsv110,cortexa57,8_4A, 
AARCH64_FL_FOR_ARCH8_4 | AARCH64_FL_CRYPTO | AARCH64_FL_F16 | AARCH64_FL_AES | 
AARCH64_FL_SHA2, tsv110,   0x48, 0xd01, -1)
+
 /* Qualcomm ('Q') cores. */
 AARCH64_CORE("saphira", saphira,falkor,8_4A,  
AARCH64_FL_FOR_ARCH8_4 | AARCH64_FL_CRYPTO | AARCH64_FL_RCPC, saphira,   0x51, 
0xC01, -1)
 
diff --git a/gcc/config/aarch64/aarch64-cost-tables.h 
b/gcc/config/aarch64/aarch64-cost-tables.h
index a455c62..44095ce 100644
--- a/gcc/config/aarch64/aarch64-cost-tables.h
+++ b/gcc/config/aarch64/aarch64-cost-tables.h
@@ -334,4 +334,107 @@ const struct cpu_cost_table thunderx2t99_extra_costs =
   }
 };
 
+const struct cpu_cost_table tsv110_extra_costs =
+{
+  /* ALU */
+  {
+0, /* arith.  */
+0, /* logical.  */
+0, /* shift.  */
+0, /* shift_reg.  */
+COSTS_N_INSNS (1), /* arith_shift.  */
+COSTS_N_INSNS (1), /* arith_shift_reg.  */
+COSTS_N_INSNS (1), /* log_shift.  */
+COSTS_N_INSNS (1), /* log_shift_reg.  */
+0, /* extend.  */
+COSTS_N_INSNS (1), /* extend_arith.  */
+0, /* bfi.  */
+0, /* bfx.  */
+0, /* clz.  */
+0,/* rev.  */
+0, /* non_exec.  */
+true   /* non_exec_costs_exec.  */
+  },
+  {
+/* MULT SImode */
+{
+  COSTS_N_INSNS (2),   /* simple.  */
+  COSTS_N_INSNS (2),   /* flag_setting.  */
+  COSTS_N_INSNS (2),   /* extend.  */
+  COSTS_N_INSNS (2),   /* add.  */
+  COSTS_N_INSNS (2),   /* extend_add.  */
+  COSTS_N_INSNS (11)   /* idiv.  */
+},
+/* MULT DImode */
+{
+  COSTS_N_INSNS (3),   /* simple.  */
+  0,   /* flag_setting (N/A).  */
+  COSTS_N_INSNS (3),   /* extend.  */
+  COSTS_N_INSNS (3),   /* add.  */
+  COSTS_N_INSNS (3),   /* extend_add.  */
+  COSTS_N_INSNS (19)   /* idiv.  */
+}
+  },
+  /* LD/ST */
+  {
+COSTS_N_INSNS (3), /* load.  */
+COSTS_N_INSNS (4), /* load_sign_extend.  */
+COSTS_N_INSNS (3), /* ldrd.  */
+COSTS_N_INSNS (3), /* ldm_1st.  */
+1, /* ldm_regs_per_insn_1st.  */
+2, /* ldm_regs_per_insn_subsequent.  */
+COSTS_N_INSNS (4), /* loadf.  */
+COSTS_N_INSNS (4), /* loadd.  */
+COSTS_N_INSNS (4), /* load_unaligned.  */
+0, /* store.  */
+0, /* strd.  */
+0, /* stm_1st.  */
+1, /* stm_regs_per_insn_1st.  */
+2, /* stm_regs_per_insn_subsequent.  */
+0, /* storef.  */
+0, /* stored.  */
+COSTS_N_INSNS (1), /* store_unaligned.  */
+COSTS_N_INSNS (4), /* loadv.  */
+COSTS_N_INSNS (4)  /* storev.  */
+  },
+  {
+/* FP SFmode */
+{
+  COSTS_N_INSNS (10),  /* div.  */
+  COSTS_N_INSNS (4),   /* mult.  */
+  COSTS_N_INSNS (4),   /* mult_addsub.  */
+  COSTS_N_INSNS (4),   /* fma.  */
+  COSTS_N_INSNS (4),   /* addsub.  */
+  COSTS_N_INSNS (1),   /* fpconst.  */
+  COSTS_N_INSNS (1),   /* neg.  */
+  COSTS_N_INSNS (1),

Free TYPE_VFIELD in free_lang_data

2018-06-21 Thread Jan Hubicka

Hi,
this patch frees TYPE_VFIELD which is used only for debug info generation.
Bootstrapped/regtested x86_64-linux, OK?

Honza

* tree.c (free_lang_data_in_type): Free all TYPE_VFIELDs.
Index: tree.c
===
--- tree.c  (revision 261841)
+++ tree.c  (working copy)
@@ -5134,10 +5134,7 @@ free_lang_data_in_type (tree type)
else
  *prev = DECL_CHAIN (member);
 
-  /* FIXME: C FE uses TYPE_VFIELD to record C_TYPE_INCOMPLETE_VARS
-and danagle the pointer from time to time.  */
-  if (TYPE_VFIELD (type) && TREE_CODE (TYPE_VFIELD (type)) != FIELD_DECL)
-TYPE_VFIELD (type) = NULL_TREE;
+  TYPE_VFIELD (type) = NULL_TREE;
 
   if (TYPE_BINFO (type))
{

Re: [PATCH][ARM] Use __ARM_ARCH instead of __ARM_ARCH__

2018-06-21 Thread Christophe Lyon

On Thu, 21 Jun 2018 at 10:00, Kyrill Tkachov
 wrote:
>
>
> On 21/06/18 07:59, Christophe Lyon wrote:
> > On Tue, 19 Jun 2018 at 10:50, Kyrill Tkachov
> >  wrote:
> >> Hi Christophe,
> >>
> >> On 17/06/18 21:23, Christophe Lyon wrote:
> >>> On Fri, 15 Jun 2018 at 17:22, Richard Earnshaw (lists)
> >>>  wrote:
>  On 15/06/18 15:30, Christophe Lyon wrote:
> > Hello,
> >
> > As suggested in [1], the attached patch removes all definitions and
> > uses of __ARM_ARCH__ and uses __ARM_ARCH instead. The latter is indeed
> > defined by the preprocessor to the appropriate value.
> >
> > I ran make check on arm-none-eabi (with A-profile multilib),
> > arm-none-linux-gnueabi, arm-none-linux-gnueabihf (with cortex-a9, a15,
> > a5, a57 and armtdmi as --with-cpu), armeb-none-linux-gnueabihf and
> > armv8l-linux-gnueabihf, and noticed no regression.
> >
> > OK for trunk?
> >
> > Thanks,
> >
> > Christophe
> >
> > [1] https://gcc.gnu.org/ml/gcc-patches/2018-06/msg00445.html
> >
> >
> > ARM_ARCH.chlog.txt
> >
> >
> > libatomic/ChangeLog:
> >
> > 2018-06-15  Christophe Lyon 
> >
> >* config/arm/arm-config.h (__ARM_ARCH__): Remove definitions, use
> >__ARM_ARCH instead.
> >
> > libgcc/ChangeLog:
> >
> > 2018-06-15  Christophe Lyon 
> >
> >* config/arm/lib1funcs.S (__ARM_ARCH__): Remove definitions, use
> >__ARM_ARCH instead.
> >* config/arm/ieee754-df.S: Use __ARM_ARCH instead of 
> > __ARM_ARCH__.
> >* config/arm/ieee754-sf.S: Likewise.
> >* config/arm/libunwind.S: Likewise.
> >
> >
> > ARM_ARCH.patch.txt
> >
>  Thanks, this is a useful start.  We can, however, go further.  ACLE
>  defines a number of 'feature' pre-defines and we can use those to avoid
>  direct tests of the architecture version.  For example,
>  __ARM_FEATURE_LDREX could directly replace having to calculate
>  HAVE_STREX and HAVE_STREXBHD.
> 
> >>> Hi,
> >>>
> >>> Here is an updated patch using __ARM_FEATURE_LDREX.
> >>> I didn't find other opportunities to use ACLE pre-defines, did I miss any?
> >>>
> >> Thanks for doing this. I think we can catch a few more...
> >>
> > OK, I didn't grep accurately enough it seems.
> >
> > Here is a new version hopefully addressing your comments.
>
> yes, that looks good now.
>
> > However, I'm not sure whether replacing uses of __ARM_ARCH__ and
> > removing support for arches < 4 should be in the same patch: this goes
> > beyond my original intent, and I've noticed probable dead code in
> > include/longlong.h (support for umul_ppmm on arm v2 and v3)
>
> I see your point. It could indeed be cleaner if the code removal hunk was
> put in a separate patch. A bugzilla entry about the dead code to be removed 
> would
> be appreciated, I can take care of that then.
>
OK, I've filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86264

> > Similarly there is code to define __ARM_ARCH in libffi/src/arm/sysv.S.
>
> I believe libffi is its own separate project that we import in GCC, so it may 
> want
> to support compiling with older GCC versions. I'd need to double-check that.
>
Indeed, that worried me too.

> > So it seems further cleanup would be needed.
>
> Indeed. This patch is ok with the __ARM_ARCH < 4 path removals separated into
> their own patch.
>

Thanks, committed as r261840 and r261841.

> Thanks,
> Kyrill
>
> >
> > Christophe
> >
> >
> >> diff --git a/libgcc/config/arm/ieee754-df.S 
> >> b/libgcc/config/arm/ieee754-df.S
> >> index 570e5f6..7c5260e 100644
> >> --- a/libgcc/config/arm/ieee754-df.S
> >> +++ b/libgcc/config/arm/ieee754-df.S
> >> @@ -245,7 +245,7 @@ LSYM(Lad_a):
> >>  @ No rounding necessary since ip will always be 0 at this point.
> >>LSYM(Lad_l):
> >>
> >> -#if __ARM_ARCH__ < 5
> >> +#if __ARM_ARCH < 5
> >>
> >> This path exists to handle the case when the CLZ instruction is not 
> >> available (the #else path uses CLZ).
> >> So we can change this to #ifndef __ARM_FEATURE_CLZ
> >>
> >>
> >>  teq xh, #0
> >>  movne   r3, #20
> >> @@ -656,7 +656,7 @@ ARM_FUNC_ALIAS aeabi_dmul muldf3
> >>  orr yh, yh, #0x0010
> >>  beq LSYM(Lml_1)
> >>
> >> -#if __ARM_ARCH__ < 4
> >> +#if __ARM_ARCH < 4
> >>
> >> We can delete this whole path as we no longer support anything older than 4
> >>
> >> diff --git a/libgcc/config/arm/ieee754-sf.S 
> >> b/libgcc/config/arm/ieee754-sf.S
> >> index dac3e2e..00a8d9c 100644
> >> --- a/libgcc/config/arm/ieee754-sf.S
> >> +++ b/libgcc/config/arm/ieee754-sf.S
> >> @@ -175,7 +175,7 @@ LSYM(Lad_a):
> >>  @ No rounding necessary since r1 will always be 0 at this point.
> >>LSYM(Lad_l):
> >>
> >> -#if __ARM_ARCH__ < 5
> >> +#if __ARM_ARCH < 5
> >>
> >>  movsip, r0, lsr #12
> >>  moveq   r0, r0, lsl #12
> >> @@ -370,7 +370,7 @@ ARM_FUNC_ALIAS a

Re: Free TYPE_VFIELD in free_lang_data

2018-06-21 Thread Richard Biener

On Thu, 21 Jun 2018, Jan Hubicka wrote:

> Hi,
> this patch frees TYPE_VFIELD which is used only for debug info generation.
> Bootstrapped/regtested x86_64-linux, OK?

OK.

Richard.

> Honza
> 
>   * tree.c (free_lang_data_in_type): Free all TYPE_VFIELDs.
> Index: tree.c
> ===
> --- tree.c(revision 261841)
> +++ tree.c(working copy)
> @@ -5134,10 +5134,7 @@ free_lang_data_in_type (tree type)
>   else
> *prev = DECL_CHAIN (member);
>  
> -  /* FIXME: C FE uses TYPE_VFIELD to record C_TYPE_INCOMPLETE_VARS
> -  and danagle the pointer from time to time.  */
> -  if (TYPE_VFIELD (type) && TREE_CODE (TYPE_VFIELD (type)) != FIELD_DECL)
> -TYPE_VFIELD (type) = NULL_TREE;
> +  TYPE_VFIELD (type) = NULL_TREE;
>  
>if (TYPE_BINFO (type))
>   {

[testsuite] Fix guality/pr45882.c for flto

2018-06-21 Thread Tom de Vries

Hi,

Atm this test in pr45882.c:
...
  int d = a[i];  /* { dg-final { gdb-test 16 "d" "112" } } */
...
fails as follows with -flto:
...
FAIL: gcc.dg/guality/pr45882.c   -O2 -flto -fuse-linker-plugin \
  -fno-fat-lto-objects  line 16 d == 112
...

In more detail, gdb fails to print the value of d:
...
Breakpoint 1, foo (i=i@entry=7, j=j@entry=7) at pr45882.c:16
16++v;
$1 = <optimized out>
$2 = 112
<optimized out> != 112
...

Variable d is a local variable in function foo, initialized from global array a.
When compiling, first cddce1 removes the initialization of d in foo, given
that d is not used afterwards.  Then ipa marks array a as write-only, and
removes the stores to array a in main.  This invalidates the location
expression for d, which points to a[i], so it is removed, which is why gdb
ends up printing <optimized out> for d.

This patch fixes the failure by adding the used attribute to array a, preventing
array a from being marked as write-only.

Tested on x86_64.

OK for trunk?

Thanks,
- Tom

[testsuite] Fix guality/pr45882.c for flto

2018-06-21  Tom de Vries  

* gcc.dg/guality/pr45882.c (a): Add used attribute.

---
 gcc/testsuite/gcc.dg/guality/pr45882.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/guality/pr45882.c 
b/gcc/testsuite/gcc.dg/guality/pr45882.c
index 5ca22d4f4ad..ece35238a30 100644
--- a/gcc/testsuite/gcc.dg/guality/pr45882.c
+++ b/gcc/testsuite/gcc.dg/guality/pr45882.c
@@ -3,7 +3,7 @@
 /* { dg-options "-g" } */
 
 extern void abort (void);
-int a[1024];
+int a[1024] __attribute__((used));
 volatile short int v;
 
 __attribute__((noinline,noclone,used)) int

Re: [testsuite] Fix guality/pr45882.c for flto

2018-06-21 Thread Richard Biener

On Thu, 21 Jun 2018, Tom de Vries wrote:

> Hi,
> 
> Atm this test in pr45882.c:
> ...
>   int d = a[i];  /* { dg-final { gdb-test 16 "d" "112" } } */
> ...
> fails as follows with -flto:
> ...
> FAIL: gcc.dg/guality/pr45882.c   -O2 -flto -fuse-linker-plugin \
>   -fno-fat-lto-objects  line 16 d == 112
> ...
> 
> In more detail, gdb fails to print the value of d:
> ...
> Breakpoint 1, foo (i=i@entry=7, j=j@entry=7) at pr45882.c:16
> 16++v;
> $1 = <optimized out>
> $2 = 112
> <optimized out> != 112
> ...
> 
> Variable d is a local variable in function foo, initialized from global array 
> a.
> When compiling, first cddce1 removes the initialization of d in foo, given
> that d is not used afterwards.  Then ipa marks array a as write-only, and
> removes the stores to array a in main.  This invalidates the location
> expression for d, which points to a[i], so it is removed, which is why gdb
> ends up printing <optimized out> for d.
> 
> This patch fixes the failure by adding the used attribute to array a, preventing
> array a from being marked as write-only.
> 
> Tested on x86_64.
> 
> OK for trunk?

OK.

Richard.

> Thanks,
> - Tom
> 
> [testsuite] Fix guality/pr45882.c for flto
> 
> 2018-06-21  Tom de Vries  
> 
>   * gcc.dg/guality/pr45882.c (a): Add used attribute.
> 
> ---
>  gcc/testsuite/gcc.dg/guality/pr45882.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/guality/pr45882.c 
> b/gcc/testsuite/gcc.dg/guality/pr45882.c
> index 5ca22d4f4ad..ece35238a30 100644
> --- a/gcc/testsuite/gcc.dg/guality/pr45882.c
> +++ b/gcc/testsuite/gcc.dg/guality/pr45882.c
> @@ -3,7 +3,7 @@
>  /* { dg-options "-g" } */
>  
>  extern void abort (void);
> -int a[1024];
> +int a[1024] __attribute__((used));
>  volatile short int v;
>  
>  __attribute__((noinline,noclone,used)) int
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

Re: [PATCH v2] [aarch64] Add HiSilicon tsv110 CPU support

2018-06-21 Thread Kyrill Tkachov


Hi Shaokun,

On 21/06/18 12:07, Zhangshaokun wrote:

Hi Kyrill,

It was the Dragon Boat Festival for a short holiday in China, sorry to
reply later.

On 2018/6/14 15:58, Kyrill Tkachov wrote:

Hi Shaokun,

On 14/06/18 02:09, Shaokun Zhang wrote:

This patch adds HiSilicon's an mcpu: tsv110, which supports v8_4A.

---
   gcc/ChangeLog|   8 +++
   gcc/config/aarch64/aarch64-cores.def |   3 +
   gcc/config/aarch64/aarch64-cost-tables.h | 103 
+++
   gcc/config/aarch64/aarch64-tune.md   |   2 +-
   gcc/config/aarch64/aarch64.c |  80 +++-
   gcc/doc/invoke.texi  |   2 +-
   6 files changed, 195 insertions(+), 3 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 9c90875..e376714 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,11 @@
+2018-06-12  Shaokun Zhang  
+Bo Zhou  
+* config/aarch64/aarch64-cores.def (tsv110): New CPU.
+* config/aarch64/aarch64-tune.md: Regenerated.
+* doc/invoke.texi (AArch64 Options/-mtune): Add "tsv110".
+* config/aarch64/aarch64.c (tsv110_tunings): New tuning table.
+* config/aarch64/aarch64-cost-tables.h: Add "tsv110" extra costs.
+

Can you confirm that you've run a bootstrap and test run with this patch
to check there are no regressions?


I have tested this patch (fix some typo) on aarch64 and didn't get any 
regressions.

While, there is issue that is on the master branch:
../.././gcc/bitmap.c: In function ‘unsigned int 
bitmap_last_set_bit(const_bitmap)’:
../.././gcc/bitmap.c:841:26: error: array subscript -1 is below array bounds of 
‘const BITMAP_WORD [2]’ {aka ‘const long unsigned int [2]’} 
[-Werror=array-bounds]
word = elt->bits[ix];
   ^
cc1plus: all warnings being treated as errors
Makefile:1110: recipe for target 'bitmap.o' failed
make[3]: *** [bitmap.o] Error 1


I don't see that error with the current trunk based off r261832 (today).
Can you make sure the bootstrap passes with your patch on top of the recent 
trunk?

Thanks,
Kyrill


My gcc version is: gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609.
Are you happy to fix it? I fixed it in my local, but I am not sure it is ok.


This version looks good to me but you'll need final approval from the 
maintainers.


I will update patch based on latest branch code today.
Hopefully you and maintainers are happy on v3.

Thanks,
Shaokun.


Thanks,
Kyrill


   2018-06-12  Eric Botcazou  
 * gcc.c: Document new %@{...} sequence.
diff --git a/gcc/config/aarch64/aarch64-cores.def 
b/gcc/config/aarch64/aarch64-cores.def
index e64d831..e6ebf02 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -88,6 +88,9 @@ AARCH64_CORE("cortex-a75",  cortexa75, cortexa57, 8_2A,  
AARCH64_FL_FOR_ARCH8_2
 /* ARMv8.4-A Architecture Processors.  */
   +/* HiSilicon ('H') cores. */
+AARCH64_CORE("tsv110", tsv110,cortexa57,8_4A, 
AARCH64_FL_FOR_ARCH8_4 | AARCH64_FL_CRYPTO | AARCH64_FL_F16 | AARCH64_FL_AES | 
AARCH64_FL_SHA2, tsv110,   0x48, 0xd01, -1)
+
   /* Qualcomm ('Q') cores. */
   AARCH64_CORE("saphira", saphira,falkor,8_4A,  
AARCH64_FL_FOR_ARCH8_4 | AARCH64_FL_CRYPTO | AARCH64_FL_RCPC, saphira,   0x51, 0xC01, -1)
   diff --git a/gcc/config/aarch64/aarch64-cost-tables.h 
b/gcc/config/aarch64/aarch64-cost-tables.h
index a455c62..b6890d6 100644
--- a/gcc/config/aarch64/aarch64-cost-tables.h
+++ b/gcc/config/aarch64/aarch64-cost-tables.h
@@ -334,4 +334,107 @@ const struct cpu_cost_table thunderx2t99_extra_costs =
 }
   };
   +const struct cpu_cost_table tsv110_extra_costs =
+{
+  /* ALU */
+  {
+0, /* arith.  */
+0, /* logical.  */
+0, /* shift.  */
+0, /* shift_reg.  */
+COSTS_N_INSNS (1), /* arith_shift.  */
+COSTS_N_INSNS (1), /* arith_shift_reg.  */
+COSTS_N_INSNS (1), /* log_shift.  */
+COSTS_N_INSNS (1), /* log_shift_reg.  */
+0, /* extend.  */
+COSTS_N_INSNS (1), /* extend_arith.  */
+0, /* bfi.  */
+0, /* bfx.  */
+0, /* clz.  */
+0,   /* rev.  */
+0, /* non_exec.  */
+true   /* non_exec_costs_exec.  */
+  },
+  {
+/* MULT SImode */
+{
+  COSTS_N_INSNS (2),   /* simple.  */
+  COSTS_N_INSNS (2),   /* flag_setting.  */
+  COSTS_N_INSNS (2),   /* extend.  */
+  COSTS_N_INSNS (2),   /* add.  */
+  COSTS_N_INSNS (2),   /* extend_add.  */
+  COSTS_N_INSNS (11)   /* idiv.  */
+},
+/* MULT DImode */
+{
+  COSTS_N_INSNS (3),   /* simple.  */
+  0,   /* flag_setting (N/A).  */
+  COSTS_N_INSNS (3),   /* extend.  */
+  COSTS_N_INSNS (3),   /* add.  */
+  COSTS_N_INSNS (3),   /* extend_add.  */
+  COSTS_

Re: [PATCH v2] [aarch64] Add HiSilicon tsv110 CPU support

2018-06-21 Thread Zhangshaokun

Hi Kyrill,

On 2018/6/21 20:56, Kyrill Tkachov wrote:
> Hi Shaokun,
> 
> On 21/06/18 12:07, Zhangshaokun wrote:
>> Hi Kyrill,
>>
>> It was the Dragon Boat Festival for a short holiday in China, sorry to
>> reply later.
>>
>> On 2018/6/14 15:58, Kyrill Tkachov wrote:
>>> Hi Shaokun,
>>>
>>> On 14/06/18 02:09, Shaokun Zhang wrote:
 This patch adds HiSilicon's an mcpu: tsv110, which supports v8_4A.

 ---
gcc/ChangeLog|   8 +++
gcc/config/aarch64/aarch64-cores.def |   3 +
gcc/config/aarch64/aarch64-cost-tables.h | 103 
 +++
gcc/config/aarch64/aarch64-tune.md   |   2 +-
gcc/config/aarch64/aarch64.c |  80 +++-
gcc/doc/invoke.texi  |   2 +-
6 files changed, 195 insertions(+), 3 deletions(-)

 diff --git a/gcc/ChangeLog b/gcc/ChangeLog
 index 9c90875..e376714 100644
 --- a/gcc/ChangeLog
 +++ b/gcc/ChangeLog
 @@ -1,3 +1,11 @@
 +2018-06-12  Shaokun Zhang  
 +Bo Zhou  
 +* config/aarch64/aarch64-cores.def (tsv110): New CPU.
 +* config/aarch64/aarch64-tune.md: Regenerated.
 +* doc/invoke.texi (AArch64 Options/-mtune): Add "tsv110".
 +* config/aarch64/aarch64.c (tsv110_tunings): New tuning table.
 +* config/aarch64/aarch64-cost-tables.h: Add "tsv110" extra costs.
 +
>>> Can you confirm that you've run a bootstrap and test run with this patch
>>> to check there are no regressions?
>>>
>> I have tested this patch (fix some typo) on aarch64 and didn't get any 
>> regressions.
>>
>> While, there is issue that is on the master branch:
>> ../.././gcc/bitmap.c: In function ‘unsigned int 
>> bitmap_last_set_bit(const_bitmap)’:
>> ../.././gcc/bitmap.c:841:26: error: array subscript -1 is below array bounds 
>> of ‘const BITMAP_WORD [2]’ {aka ‘const long unsigned int [2]’} 
>> [-Werror=array-bounds]
>> word = elt->bits[ix];
>>^
>> cc1plus: all warnings being treated as errors
>> Makefile:1110: recipe for target 'bitmap.o' failed
>> make[3]: *** [bitmap.o] Error 1
> 
> I don't see that error with the current trunk based off r261832 (today).

I got it based on fa681b4 (also today).

> Can you make sure the bootstrap passes with your patch on top of the recent 
> trunk?

Regarding this patch: my mistake, there were some typos. I have fixed them and
sent patch v3; please review.

Thanks,
Shaokun

> 
> Thanks,
> Kyrill
> 
>> My gcc version is: gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609.
>> Are you happy to fix it? I fixed it in my local, but I am not sure it is ok.
>>
>>> This version looks good to me but you'll need final approval from the 
>>> maintainers.
>>>
>> I will update patch based on latest branch code today.
>> Hopefully you and maintainers are happy on v3.
>>
>> Thanks,
>> Shaokun.
>>
>>> Thanks,
>>> Kyrill
>>>
2018-06-12  Eric Botcazou  
  * gcc.c: Document new %@{...} sequence.
 diff --git a/gcc/config/aarch64/aarch64-cores.def 
 b/gcc/config/aarch64/aarch64-cores.def
 index e64d831..e6ebf02 100644
 --- a/gcc/config/aarch64/aarch64-cores.def
 +++ b/gcc/config/aarch64/aarch64-cores.def
 @@ -88,6 +88,9 @@ AARCH64_CORE("cortex-a75",  cortexa75, cortexa57, 8_2A,  
 AARCH64_FL_FOR_ARCH8_2
  /* ARMv8.4-A Architecture Processors.  */
+/* HiSilicon ('H') cores. */
 +AARCH64_CORE("tsv110", tsv110,cortexa57,8_4A, 
 AARCH64_FL_FOR_ARCH8_4 | AARCH64_FL_CRYPTO | AARCH64_FL_F16 | 
 AARCH64_FL_AES | AARCH64_FL_SHA2, tsv110,   0x48, 0xd01, -1)
 +
/* Qualcomm ('Q') cores. */
AARCH64_CORE("saphira", saphira,falkor,8_4A,  
 AARCH64_FL_FOR_ARCH8_4 | AARCH64_FL_CRYPTO | AARCH64_FL_RCPC, saphira,   
 0x51, 0xC01, -1)
diff --git a/gcc/config/aarch64/aarch64-cost-tables.h 
 b/gcc/config/aarch64/aarch64-cost-tables.h
 index a455c62..b6890d6 100644
 --- a/gcc/config/aarch64/aarch64-cost-tables.h
 +++ b/gcc/config/aarch64/aarch64-cost-tables.h
 @@ -334,4 +334,107 @@ const struct cpu_cost_table thunderx2t99_extra_costs 
 =
  }
};
+const struct cpu_cost_table tsv110_extra_costs =
 +{
 +  /* ALU */
 +  {
 +0, /* arith.  */
 +0, /* logical.  */
 +0, /* shift.  */
 +0, /* shift_reg.  */
 +COSTS_N_INSNS (1), /* arith_shift.  */
 +COSTS_N_INSNS (1), /* arith_shift_reg.  */
 +COSTS_N_INSNS (1), /* log_shift.  */
 +COSTS_N_INSNS (1), /* log_shift_reg.  */
 +0, /* extend.  */
 +COSTS_N_INSNS (1), /* extend_arith.  */
 +0, /* bfi.  */
 +0, /* bfx.  */
 +0, /* clz.  */
 +0,   /* rev.  */
 +

Do not stream unnecessary BINFO fields

2018-06-21 Thread Jan Hubicka

Hi,
this patch drops streaming of binfo bits we do not need.  We only care
about BINFO_TYPE, BINFO_VTABLE, BASES and BINFO_OFFSET.

Bootstrapped/regtested x86_64-linux, OK?

Honza

* lto-streamer-out.c (DFS::DFS_write_tree_body): Do not stream
BINFO_BASE_ACCESSES and BINFO_VPTR_FIELD.
* tree-streamer-in.c (streamer_read_tree_bitfields): Likewise.
(lto_input_ts_binfo_tree_pointers): Likewise.
* tree-streamer-out.c (streamer_write_tree_bitfields,
write_ts_binfo_tree_pointers): Likewise.
* tree.c (free_lang_data_in_binfo): Clear BINFO_VPTR_FIELD.
Index: lto-streamer-out.c
===
--- lto-streamer-out.c  (revision 261841)
+++ lto-streamer-out.c  (working copy)
@@ -954,15 +947,10 @@ DFS::DFS_write_tree_body (struct output_
DFS_follow_tree_edge (t);
   DFS_follow_tree_edge (BINFO_OFFSET (expr));
   DFS_follow_tree_edge (BINFO_VTABLE (expr));
-  DFS_follow_tree_edge (BINFO_VPTR_FIELD (expr));
-
-  /* The number of BINFO_BASE_ACCESSES has already been emitted in
-EXPR's bitfield section.  */
-  FOR_EACH_VEC_SAFE_ELT (BINFO_BASE_ACCESSES (expr), i, t)
-   DFS_follow_tree_edge (t);
 
-  /* Do not walk BINFO_INHERITANCE_CHAIN, BINFO_SUBVTT_INDEX
-and BINFO_VPTR_INDEX; these are used by C++ FE only.  */
+  /* Do not walk BINFO_INHERITANCE_CHAIN, BINFO_SUBVTT_INDEX,
+BINFO_BASE_ACCESSES and BINFO_VPTR_INDEX; these are used
+by C++ FE only.  */
 }
 
   if (CODE_CONTAINS_STRUCT (code, TS_CONSTRUCTOR))
@@ -1347,11 +1329,9 @@ hash_tree (struct streamer_tree_cache_d
visit (b);
   visit (BINFO_OFFSET (t));
   visit (BINFO_VTABLE (t));
-  visit (BINFO_VPTR_FIELD (t));
-  FOR_EACH_VEC_SAFE_ELT (BINFO_BASE_ACCESSES (t), i, b)
-   visit (b);
   /* Do not walk BINFO_INHERITANCE_CHAIN, BINFO_SUBVTT_INDEX
-and BINFO_VPTR_INDEX; these are used by C++ FE only.  */
+BINFO_BASE_ACCESSES and BINFO_VPTR_INDEX; these are used
+by C++ FE only.  */
 }
 
   if (CODE_CONTAINS_STRUCT (code, TS_CONSTRUCTOR))
Index: tree-streamer-in.c
===
--- tree-streamer-in.c  (revision 261841)
+++ tree-streamer-in.c  (working copy)
@@ -532,13 +532,6 @@ streamer_read_tree_bitfields (struct lto
   if (CODE_CONTAINS_STRUCT (code, TS_OPTIMIZATION))
 cl_optimization_stream_in (&bp, TREE_OPTIMIZATION (expr));
 
-  if (CODE_CONTAINS_STRUCT (code, TS_BINFO))
-{
-  unsigned HOST_WIDE_INT length = bp_unpack_var_len_unsigned (&bp);
-  if (length > 0)
-   vec_safe_grow (BINFO_BASE_ACCESSES (expr), length);
-}
-
   if (CODE_CONTAINS_STRUCT (code, TS_CONSTRUCTOR))
 {
   unsigned HOST_WIDE_INT length = bp_unpack_var_len_unsigned (&bp);
@@ -969,7 +960,6 @@ static void
 lto_input_ts_binfo_tree_pointers (struct lto_input_block *ib,
  struct data_in *data_in, tree expr)
 {
-  unsigned i;
   tree t;
 
   /* Note that the number of slots in EXPR was read in
@@ -987,17 +977,10 @@ lto_input_ts_binfo_tree_pointers (struct
 
   BINFO_OFFSET (expr) = stream_read_tree (ib, data_in);
   BINFO_VTABLE (expr) = stream_read_tree (ib, data_in);
-  BINFO_VPTR_FIELD (expr) = stream_read_tree (ib, data_in);
 
-  /* The vector of BINFO_BASE_ACCESSES is pre-allocated during
- unpacking the bitfield section.  */
-  for (i = 0; i < vec_safe_length (BINFO_BASE_ACCESSES (expr)); i++)
-{
-  tree a = stream_read_tree (ib, data_in);
-  (*BINFO_BASE_ACCESSES (expr))[i] = a;
-}
-  /* Do not walk BINFO_INHERITANCE_CHAIN, BINFO_SUBVTT_INDEX
- and BINFO_VPTR_INDEX; these are used by C++ FE only.  */
+  /* Do not walk BINFO_INHERITANCE_CHAIN, BINFO_SUBVTT_INDEX,
+ BINFO_BASE_ACCESSES and BINFO_VPTR_INDEX; these are used by C++ FE
+ only.  */
 }
 
 
Index: tree-streamer-out.c
===
--- tree-streamer-out.c (revision 261841)
+++ tree-streamer-out.c (working copy)
@@ -468,9 +468,6 @@ streamer_write_tree_bitfields (struct ou
   if (CODE_CONTAINS_STRUCT (code, TS_OPTIMIZATION))
 cl_optimization_stream_out (&bp, TREE_OPTIMIZATION (expr));
 
-  if (CODE_CONTAINS_STRUCT (code, TS_BINFO))
-bp_pack_var_len_unsigned (&bp, vec_safe_length (BINFO_BASE_ACCESSES (expr)));
-
   if (CODE_CONTAINS_STRUCT (code, TS_CONSTRUCTOR))
 bp_pack_var_len_unsigned (&bp, CONSTRUCTOR_NELTS (expr));
 
@@ -824,15 +814,9 @@ write_ts_binfo_tree_pointers (struct out
 
   stream_write_tree (ob, BINFO_OFFSET (expr), ref_p);
   stream_write_tree (ob, BINFO_VTABLE (expr), ref_p);
-  stream_write_tree (ob, BINFO_VPTR_FIELD (expr), ref_p);
-
-  /* The number of BINFO_BASE_ACCESSES has already been emitted in
- EXPR's bitfield section.  */
-  FOR_EACH_VEC_SAFE_ELT (BINFO_BASE_ACCESSES (expr), i, t)
-stream_write_tree (ob, t, ref_p);
 
-  /* Do not walk BINFO_INHERITA

Cleanup DECL streaming

2018-06-21 Thread Jan Hubicka

Hi,
this patch drops DECL_ORIGINAL_TYPE streaming and also logic handling
external decls in blocks since we no longer stream them at all.

Bootstrapped/regtested x86_64-linux, OK?

Honza

* lto-streamer-out.c (DFS::DFS_write_tree_body): Do not
stream DECL_ORIGINAL_TYPE.
(DFS::DFS_write_tree_body): Drop hack handling local external decls.
(hash_tree): Do not walk DECL_ORIGINAL_TYPE.
* tree-streamer-in.c (lto_input_ts_decl_non_common_tree_pointers):
Do not walk original type.
* tree-streamer-out.c (streamer_write_chain): Drop hack handling
external decls.
(write_ts_decl_non_common_tree_pointers): Do not stream
DECL_ORIGINAL_TYPE.
* tree.c (free_lang_data_in_decl): Clear DECL_ORIGINAL_TYPE.
(find_decls_types_r): Do not walk DECL_ORIGINAL_TYPE.
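
To illustrate the second part of the change (replacing the special case for external decls with an assertion), here is a minimal, self-contained sketch.  It is not GCC code: struct toy_decl and the toy_* functions are hypothetical stand-ins for trees, free_lang_data and streamer_write_chain, and only model the invariant the patch relies on, namely that externals are already unlinked from block chains before streaming.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a decl in a BLOCK_VARS chain.  */
struct toy_decl
{
  bool is_var_or_function;   /* models VAR_OR_FUNCTION_DECL_P */
  bool is_external;          /* models DECL_EXTERNAL */
  struct toy_decl *chain;    /* models TREE_CHAIN */
};

/* Models the earlier cleanup: unlink external vars/functions from the
   chain and return the (possibly new) head.  */
struct toy_decl *
toy_drop_externals (struct toy_decl *head)
{
  struct toy_decl **p = &head;
  while (*p)
    {
      if ((*p)->is_var_or_function && (*p)->is_external)
        *p = (*p)->chain;
      else
        p = &(*p)->chain;
    }
  return head;
}

/* Models the simplified writer: no special case, just an assertion
   that no external decl is left.  Returns how many decls it visited.  */
int
toy_stream_chain (const struct toy_decl *t)
{
  int n = 0;
  for (; t; t = t->chain)
    {
      assert (!t->is_var_or_function || !t->is_external);
      n++;
    }
  return n;
}

/* Builds a three-element chain with one external decl in the middle,
   cleans it up and streams it.  */
int
toy_stream_demo (void)
{
  struct toy_decl c = { true, false, NULL };
  struct toy_decl b = { true, true, &c };
  struct toy_decl a = { false, false, &b };
  return toy_stream_chain (toy_drop_externals (&a));
}
```

In this model the assertion would fire if toy_stream_chain were called on the chain before toy_drop_externals, which is the situation the gcc_assert in the patch is meant to catch.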

Index: lto-streamer-out.c
===
--- lto-streamer-out.c  (revision 261841)
+++ lto-streamer-out.c  (working copy)
@@ -819,12 +819,6 @@ DFS::DFS_write_tree_body (struct output_
DFS_follow_tree_edge (DECL_DEBUG_EXPR (expr));
 }
 
-  if (CODE_CONTAINS_STRUCT (code, TS_DECL_NON_COMMON))
-{
-  if (TREE_CODE (expr) == TYPE_DECL)
-   DFS_follow_tree_edge (DECL_ORIGINAL_TYPE (expr));
-}
-
   if (CODE_CONTAINS_STRUCT (code, TS_DECL_WITH_VIS))
 {
   /* Make sure we don't inadvertently set the assembler name.  */
@@ -907,14 +901,13 @@ DFS::DFS_write_tree_body (struct output_
   if (CODE_CONTAINS_STRUCT (code, TS_BLOCK))
 {
   for (tree t = BLOCK_VARS (expr); t; t = TREE_CHAIN (t))
-   if (VAR_OR_FUNCTION_DECL_P (t)
-   && DECL_EXTERNAL (t))
- /* We have to stream externals in the block chain as
-non-references.  See also
-tree-streamer-out.c:streamer_write_chain.  */
- DFS_write_tree (ob, expr_state, t, ref_p, false);
-   else
+   {
+ /* We would have to stream externals in the block chain as
+non-references but we should have dropped them in
+free-lang-data.  */
+ gcc_assert (!VAR_OR_FUNCTION_DECL_P (t) || !DECL_EXTERNAL (t));
  DFS_follow_tree_edge (t);
+   }
 
   DFS_follow_tree_edge (BLOCK_SUPERCONTEXT (expr));
 
@@ -1261,12 +1249,6 @@ hash_tree (struct streamer_tree_cache_d
  be able to call get_symbol_initial_value.  */
 }
 
-  if (CODE_CONTAINS_STRUCT (code, TS_DECL_NON_COMMON))
-{
-  if (code == TYPE_DECL)
-   visit (DECL_ORIGINAL_TYPE (t));
-}
-
   if (CODE_CONTAINS_STRUCT (code, TS_DECL_WITH_VIS))
 {
   if (DECL_ASSEMBLER_NAME_SET_P (t))
Index: tree-streamer-in.c
===
--- tree-streamer-in.c  (revision 261841)
+++ tree-streamer-in.c  (working copy)
@@ -728,11 +721,9 @@ lto_input_ts_decl_common_tree_pointers (
file being read.  */
 
 static void
-lto_input_ts_decl_non_common_tree_pointers (struct lto_input_block *ib,
-   struct data_in *data_in, tree expr)
+lto_input_ts_decl_non_common_tree_pointers (struct lto_input_block *,
+   struct data_in *, tree)
 {
-  if (TREE_CODE (expr) == TYPE_DECL)
-DECL_ORIGINAL_TYPE (expr) = stream_read_tree (ib, data_in);
 }
 
 
Index: tree-streamer-out.c
===
--- tree-streamer-out.c (revision 261841)
+++ tree-streamer-out.c (working copy)
@@ -497,14 +494,10 @@ streamer_write_chain (struct output_bloc
 {
   /* We avoid outputting external vars or functions by reference
 to the global decls section as we do not want to have them
-enter decl merging.  This is, of course, only for the call
-for streaming BLOCK_VARS, but other callers are safe.
-See also lto-streamer-out.c:DFS_write_tree_body.  */
-  if (VAR_OR_FUNCTION_DECL_P (t)
- && DECL_EXTERNAL (t))
-   stream_write_tree_shallow_non_ref (ob, t, ref_p);
-  else
-   stream_write_tree (ob, t, ref_p);
+enter decl merging.  We should not need to do this anymore because
+free_lang_data removes them from block scopes.  */
+  gcc_assert (!VAR_OR_FUNCTION_DECL_P (t) || !DECL_EXTERNAL (t));
+  stream_write_tree (ob, t, ref_p);
 
   t = TREE_CHAIN (t);
 }
@@ -620,11 +613,8 @@ write_ts_decl_common_tree_pointers (stru
pointer fields.  */
 
 static void
-write_ts_decl_non_common_tree_pointers (struct output_block *ob, tree expr,
-   bool ref_p)
+write_ts_decl_non_common_tree_pointers (struct output_block *, tree, bool)
 {
-  if (TREE_CODE (expr) == TYPE_DECL)
-stream_write_tree (ob, DECL_ORIGINAL_TYPE (expr), ref_p);
 }
 
 
Index: tree.c
===
--- tree.c  (revision 261841)
+++ tree.c  (working copy)
@@ -5359,6 +5357,7 @@ free_lang_data_in_decl

Re: Cleanup DECL streaming

2018-06-21 Thread Jan Hubicka

> Hi,
> this patch drops DECL_ORIGINAL_TYPE streaming and also logic handling
> external decls in blocks since we no longer stream them at all.
> 
> Bootstrapped/regtested x86_64-linux, OK?
Actually the patch dies with lto-bootstrap :(
0x9c44fc gen_member_die
../../gcc/dwarf2out.c:24947
0x9c4bb1 gen_struct_or_union_type_die
../../gcc/dwarf2out.c:25143
0x9c52dd gen_tagged_type_die
../../gcc/dwarf2out.c:25353
0x9c4ffe gen_typedef_die
../../gcc/dwarf2out.c:25259
0x9c6e5f gen_decl_die
../../gcc/dwarf2out.c:26248
0x9c5557 gen_type_die_with_usage
../../gcc/dwarf2out.c:25424
0x9c5bc2 gen_type_die
../../gcc/dwarf2out.c:25608
0x9a7756 modified_type_die
../../gcc/dwarf2out.c:13396
0x9bd011 add_type_attribute
../../gcc/dwarf2out.c:21523
0x9bf854 gen_subprogram_die
../../gcc/dwarf2out.c:22835
0x9c6d8c gen_decl_die
../../gcc/dwarf2out.c:26222
0x9c7db9 dwarf2out_decl
../../gcc/dwarf2out.c:26787
0x9c7e15 dwarf2out_function_decl
../../gcc/dwarf2out.c:26802
0xa65bd9 rest_of_handle_final
../../gcc/final.c:4704
0xa65d46 execute
../../gcc/final.c:4746

I am going to respawn it because it seems odd that stage2 built but stage3
failed. Clearly typedefs are determined by the existence of the original type:

/* Returns true if X is a typedef decl.  */

bool
is_typedef_decl (const_tree x)
{
  return (x && TREE_CODE (x) == TYPE_DECL
  && DECL_ORIGINAL_TYPE (x) != NULL_TREE);
}

/* Returns true iff TYPE is a type variant created for a typedef. */

bool
typedef_variant_p (const_tree type)
{
  return is_typedef_decl (TYPE_NAME (type));
}

However, aren't we supposed to not touch these in late builds? We drop most
TYPE_DECLs in favour of IDENTIFIER_TYPE, thus also throwing away
DECL_ORIGINAL_TYPEs.
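
A minimal sketch of why clearing the field matters for the predicates above.  This is a toy model, not GCC code: struct toy_tree and the toy_* names are hypothetical, with original_type standing in for DECL_ORIGINAL_TYPE and toy_clear_original_type for what the patch does in free_lang_data_in_decl.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tree codes for the toy model.  */
enum toy_code { TOY_TYPE_DECL, TOY_VAR_DECL };

/* Hypothetical stand-in for a decl node.  */
struct toy_tree
{
  enum toy_code code;
  const void *original_type;   /* models DECL_ORIGINAL_TYPE */
};

/* Same shape as is_typedef_decl quoted above.  */
bool
toy_is_typedef_decl (const struct toy_tree *x)
{
  return (x && x->code == TOY_TYPE_DECL
          && x->original_type != NULL);
}

/* Models clearing DECL_ORIGINAL_TYPE on a TYPE_DECL.  */
void
toy_clear_original_type (struct toy_tree *x)
{
  if (x->code == TOY_TYPE_DECL)
    x->original_type = NULL;
}

/* Returns true if a decl is seen as a typedef before clearing
   and no longer seen as one afterwards.  */
bool
toy_typedef_lost_after_clearing (void)
{
  int some_type = 0;   /* any non-null address will do */
  struct toy_tree td = { TOY_TYPE_DECL, &some_type };
  bool before = toy_is_typedef_decl (&td);
  toy_clear_original_type (&td);
  bool after = toy_is_typedef_decl (&td);
  return before && !after;
}
```

So any late consumer that asks the typedef question would, in this model, get a different answer once the field is cleared.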

Honza
> 
> Honza
> 
>   * lto-streamer-out.c (DFS::DFS_write_tree_body): Do not
>   stream DECL_ORIGINAL_TYPE.
>   (DFS::DFS_write_tree_body): Drop hack handling local external decls.
>   (hash_tree): Do not walk DECL_ORIGINAL_TYPE.
>   * tree-streamer-in.c (lto_input_ts_decl_non_common_tree_pointers):
>   Do not walk original type.
>   * tree-streamer-out.c (streamer_write_chain): Drop hack handling
>   external decls.
>   (write_ts_decl_non_common_tree_pointers): Do not stream
>   DECL_ORIGINAL_TYPE.
>   * tree.c (free_lang_data_in_decl): Clear DECL_ORIGINAL_TYPE.
>   (find_decls_types_r): Do not walk DECL_ORIGINAL_TYPE.
> 
> Index: lto-streamer-out.c
> ===
> --- lto-streamer-out.c(revision 261841)
> +++ lto-streamer-out.c(working copy)
> @@ -819,12 +819,6 @@ DFS::DFS_write_tree_body (struct output_
>   DFS_follow_tree_edge (DECL_DEBUG_EXPR (expr));
>  }
>  
> -  if (CODE_CONTAINS_STRUCT (code, TS_DECL_NON_COMMON))
> -{
> -  if (TREE_CODE (expr) == TYPE_DECL)
> - DFS_follow_tree_edge (DECL_ORIGINAL_TYPE (expr));
> -}
> -
>if (CODE_CONTAINS_STRUCT (code, TS_DECL_WITH_VIS))
>  {
>/* Make sure we don't inadvertently set the assembler name.  */
> @@ -907,14 +901,13 @@ DFS::DFS_write_tree_body (struct output_
>if (CODE_CONTAINS_STRUCT (code, TS_BLOCK))
>  {
>for (tree t = BLOCK_VARS (expr); t; t = TREE_CHAIN (t))
> - if (VAR_OR_FUNCTION_DECL_P (t)
> - && DECL_EXTERNAL (t))
> -   /* We have to stream externals in the block chain as
> -  non-references.  See also
> -  tree-streamer-out.c:streamer_write_chain.  */
> -   DFS_write_tree (ob, expr_state, t, ref_p, false);
> - else
> + {
> +   /* We would have to stream externals in the block chain as
> +  non-references but we should have dropped them in
> +  free-lang-data.  */
> +   gcc_assert (!VAR_OR_FUNCTION_DECL_P (t) || !DECL_EXTERNAL (t));
> DFS_follow_tree_edge (t);
> + }
>  
>DFS_follow_tree_edge (BLOCK_SUPERCONTEXT (expr));
>  
> @@ -1261,12 +1249,6 @@ hash_tree (struct streamer_tree_cache_d
>   be able to call get_symbol_initial_value.  */
>  }
>  
> -  if (CODE_CONTAINS_STRUCT (code, TS_DECL_NON_COMMON))
> -{
> -  if (code == TYPE_DECL)
> - visit (DECL_ORIGINAL_TYPE (t));
> -}
> -
>if (CODE_CONTAINS_STRUCT (code, TS_DECL_WITH_VIS))
>  {
>if (DECL_ASSEMBLER_NAME_SET_P (t))
> Index: tree-streamer-in.c
> ===
> --- tree-streamer-in.c(revision 261841)
> +++ tree-streamer-in.c(working copy)
> @@ -728,11 +721,9 @@ lto_input_ts_decl_common_tree_pointers (
> file being read.  */
>  
>  static void
> -lto_input_ts_decl_non_common_tree_pointers (struct lto_input_block *ib,
> - struct data_in *data_in, tree expr)
> +lto_input_ts_decl_non_common_tree_pointers (struct lto_input_block *,
> + struct data_in *,

Re: Do not stream unnecessary BINFO fields

2018-06-21 Thread Richard Biener

On Thu, 21 Jun 2018, Jan Hubicka wrote:

> Hi,
> this patch drops streaming of binfo bits we do not need.  We only care
> about BINFO_TYPE, BINFO_VTABLE, BASES and BINFO_OFFSET.
> 
> Bootstrapped/regtested x86_64-linux, OK?

OK.

Richard.

> Honza
> 
>   * lto-streamer-out.c (DFS::DFS_write_tree_body): Do not stream
>   BINFO_BASE_ACCESSES and BINFO_VPTR_FIELD.
>   * tree-streamer-in.c (streamer_read_tree_bitfields): Likewise.
>   (lto_input_ts_binfo_tree_pointers): Likewise.
>   * tree-streamer-out.c (streamer_write_tree_bitfields,
>   write_ts_binfo_tree_pointers): Likewise.
>   * tree.c (free_lang_data_in_binfo): Clear BINFO_VPTR_FIELD.
> Index: lto-streamer-out.c
> ===
> --- lto-streamer-out.c(revision 261841)
> +++ lto-streamer-out.c(working copy)
> @@ -954,15 +947,10 @@ DFS::DFS_write_tree_body (struct output_
>   DFS_follow_tree_edge (t);
>DFS_follow_tree_edge (BINFO_OFFSET (expr));
>DFS_follow_tree_edge (BINFO_VTABLE (expr));
> -  DFS_follow_tree_edge (BINFO_VPTR_FIELD (expr));
> -
> -  /* The number of BINFO_BASE_ACCESSES has already been emitted in
> -  EXPR's bitfield section.  */
> -  FOR_EACH_VEC_SAFE_ELT (BINFO_BASE_ACCESSES (expr), i, t)
> - DFS_follow_tree_edge (t);
>  
> -  /* Do not walk BINFO_INHERITANCE_CHAIN, BINFO_SUBVTT_INDEX
> -  and BINFO_VPTR_INDEX; these are used by C++ FE only.  */
> +  /* Do not walk BINFO_INHERITANCE_CHAIN, BINFO_SUBVTT_INDEX,
> +  BINFO_BASE_ACCESSES and BINFO_VPTR_INDEX; these are used
> +  by C++ FE only.  */
>  }
>  
>if (CODE_CONTAINS_STRUCT (code, TS_CONSTRUCTOR))
> @@ -1347,11 +1329,9 @@ hash_tree (struct streamer_tree_cache_d
>   visit (b);
>visit (BINFO_OFFSET (t));
>visit (BINFO_VTABLE (t));
> -  visit (BINFO_VPTR_FIELD (t));
> -  FOR_EACH_VEC_SAFE_ELT (BINFO_BASE_ACCESSES (t), i, b)
> - visit (b);
>/* Do not walk BINFO_INHERITANCE_CHAIN, BINFO_SUBVTT_INDEX
> -  and BINFO_VPTR_INDEX; these are used by C++ FE only.  */
> +  BINFO_BASE_ACCESSES and BINFO_VPTR_INDEX; these are used
> +  by C++ FE only.  */
>  }
>  
>if (CODE_CONTAINS_STRUCT (code, TS_CONSTRUCTOR))
> Index: tree-streamer-in.c
> ===
> --- tree-streamer-in.c(revision 261841)
> +++ tree-streamer-in.c(working copy)
> @@ -532,13 +532,6 @@ streamer_read_tree_bitfields (struct lto
>if (CODE_CONTAINS_STRUCT (code, TS_OPTIMIZATION))
>  cl_optimization_stream_in (&bp, TREE_OPTIMIZATION (expr));
>  
> -  if (CODE_CONTAINS_STRUCT (code, TS_BINFO))
> -{
> -  unsigned HOST_WIDE_INT length = bp_unpack_var_len_unsigned (&bp);
> -  if (length > 0)
> - vec_safe_grow (BINFO_BASE_ACCESSES (expr), length);
> -}
> -
>if (CODE_CONTAINS_STRUCT (code, TS_CONSTRUCTOR))
>  {
>unsigned HOST_WIDE_INT length = bp_unpack_var_len_unsigned (&bp);
> @@ -969,7 +960,6 @@ static void
>  lto_input_ts_binfo_tree_pointers (struct lto_input_block *ib,
> struct data_in *data_in, tree expr)
>  {
> -  unsigned i;
>tree t;
>  
>/* Note that the number of slots in EXPR was read in
> @@ -987,17 +977,10 @@ lto_input_ts_binfo_tree_pointers (struct
>  
>BINFO_OFFSET (expr) = stream_read_tree (ib, data_in);
>BINFO_VTABLE (expr) = stream_read_tree (ib, data_in);
> -  BINFO_VPTR_FIELD (expr) = stream_read_tree (ib, data_in);
>  
> -  /* The vector of BINFO_BASE_ACCESSES is pre-allocated during
> - unpacking the bitfield section.  */
> -  for (i = 0; i < vec_safe_length (BINFO_BASE_ACCESSES (expr)); i++)
> -{
> -  tree a = stream_read_tree (ib, data_in);
> -  (*BINFO_BASE_ACCESSES (expr))[i] = a;
> -}
> -  /* Do not walk BINFO_INHERITANCE_CHAIN, BINFO_SUBVTT_INDEX
> - and BINFO_VPTR_INDEX; these are used by C++ FE only.  */
> +  /* Do not walk BINFO_INHERITANCE_CHAIN, BINFO_SUBVTT_INDEX,
> + BINFO_BASE_ACCESSES and BINFO_VPTR_INDEX; these are used by C++ FE
> + only.  */
>  }
>  
>  
> Index: tree-streamer-out.c
> ===
> --- tree-streamer-out.c   (revision 261841)
> +++ tree-streamer-out.c   (working copy)
> @@ -468,9 +468,6 @@ streamer_write_tree_bitfields (struct ou
>if (CODE_CONTAINS_STRUCT (code, TS_OPTIMIZATION))
>  cl_optimization_stream_out (&bp, TREE_OPTIMIZATION (expr));
>  
> -  if (CODE_CONTAINS_STRUCT (code, TS_BINFO))
> -bp_pack_var_len_unsigned (&bp, vec_safe_length (BINFO_BASE_ACCESSES (expr)));
> -
>if (CODE_CONTAINS_STRUCT (code, TS_CONSTRUCTOR))
>  bp_pack_var_len_unsigned (&bp, CONSTRUCTOR_NELTS (expr));
>  
> @@ -824,15 +814,9 @@ write_ts_binfo_tree_pointers (struct out
>  
>stream_write_tree (ob, BINFO_OFFSET (expr), ref_p);
>stream_write_tree (ob, BINFO_VTABLE (expr), ref_p);
> -  stream

Re: Cleanup DECL streaming

2018-06-21 Thread Richard Biener

On Thu, 21 Jun 2018, Jan Hubicka wrote:

> Hi,
> this patch drops DECL_ORIGINAL_TYPE streaming and also logic handling
> external decls in blocks since we no longer stream them at all.
> 
> Bootstrapped/regtested x86_64-linux, OK?

OK.

Thanks,
Richard.

> Honza
> 
>   * lto-streamer-out.c (DFS::DFS_write_tree_body): Do not
>   stream DECL_ORIGINAL_TYPE.
>   (DFS::DFS_write_tree_body): Drop hack handling local external decls.
>   (hash_tree): Do not walk DECL_ORIGINAL_TYPE.
>   * tree-streamer-in.c (lto_input_ts_decl_non_common_tree_pointers):
>   Do not walk original type.
>   * tree-streamer-out.c (streamer_write_chain): Drop hack handling
>   external decls.
>   (write_ts_decl_non_common_tree_pointers): Do not stream
>   DECL_ORIGINAL_TYPE.
>   * tree.c (free_lang_data_in_decl): Clear DECL_ORIGINAL_TYPE.
>   (find_decls_types_r): Do not walk DECL_ORIGINAL_TYPE.
> 
> Index: lto-streamer-out.c
> ===
> --- lto-streamer-out.c(revision 261841)
> +++ lto-streamer-out.c(working copy)
> @@ -819,12 +819,6 @@ DFS::DFS_write_tree_body (struct output_
>   DFS_follow_tree_edge (DECL_DEBUG_EXPR (expr));
>  }
>  
> -  if (CODE_CONTAINS_STRUCT (code, TS_DECL_NON_COMMON))
> -{
> -  if (TREE_CODE (expr) == TYPE_DECL)
> - DFS_follow_tree_edge (DECL_ORIGINAL_TYPE (expr));
> -}
> -
>if (CODE_CONTAINS_STRUCT (code, TS_DECL_WITH_VIS))
>  {
>/* Make sure we don't inadvertently set the assembler name.  */
> @@ -907,14 +901,13 @@ DFS::DFS_write_tree_body (struct output_
>if (CODE_CONTAINS_STRUCT (code, TS_BLOCK))
>  {
>for (tree t = BLOCK_VARS (expr); t; t = TREE_CHAIN (t))
> - if (VAR_OR_FUNCTION_DECL_P (t)
> - && DECL_EXTERNAL (t))
> -   /* We have to stream externals in the block chain as
> -  non-references.  See also
> -  tree-streamer-out.c:streamer_write_chain.  */
> -   DFS_write_tree (ob, expr_state, t, ref_p, false);
> - else
> + {
> +   /* We would have to stream externals in the block chain as
> +  non-references but we should have dropped them in
> +  free-lang-data.  */
> +   gcc_assert (!VAR_OR_FUNCTION_DECL_P (t) || !DECL_EXTERNAL (t));
> DFS_follow_tree_edge (t);
> + }
>  
>DFS_follow_tree_edge (BLOCK_SUPERCONTEXT (expr));
>  
> @@ -1261,12 +1249,6 @@ hash_tree (struct streamer_tree_cache_d
>   be able to call get_symbol_initial_value.  */
>  }
>  
> -  if (CODE_CONTAINS_STRUCT (code, TS_DECL_NON_COMMON))
> -{
> -  if (code == TYPE_DECL)
> - visit (DECL_ORIGINAL_TYPE (t));
> -}
> -
>if (CODE_CONTAINS_STRUCT (code, TS_DECL_WITH_VIS))
>  {
>if (DECL_ASSEMBLER_NAME_SET_P (t))
> Index: tree-streamer-in.c
> ===
> --- tree-streamer-in.c(revision 261841)
> +++ tree-streamer-in.c(working copy)
> @@ -728,11 +721,9 @@ lto_input_ts_decl_common_tree_pointers (
> file being read.  */
>  
>  static void
> -lto_input_ts_decl_non_common_tree_pointers (struct lto_input_block *ib,
> - struct data_in *data_in, tree expr)
> +lto_input_ts_decl_non_common_tree_pointers (struct lto_input_block *,
> + struct data_in *, tree)
>  {
> -  if (TREE_CODE (expr) == TYPE_DECL)
> -DECL_ORIGINAL_TYPE (expr) = stream_read_tree (ib, data_in);
>  }
>  
>  
> Index: tree-streamer-out.c
> ===
> --- tree-streamer-out.c   (revision 261841)
> +++ tree-streamer-out.c   (working copy)
> @@ -497,14 +494,10 @@ streamer_write_chain (struct output_bloc
>  {
>/* We avoid outputting external vars or functions by reference
>to the global decls section as we do not want to have them
> -  enter decl merging.  This is, of course, only for the call
> -  for streaming BLOCK_VARS, but other callers are safe.
> -  See also lto-streamer-out.c:DFS_write_tree_body.  */
> -  if (VAR_OR_FUNCTION_DECL_P (t)
> -   && DECL_EXTERNAL (t))
> - stream_write_tree_shallow_non_ref (ob, t, ref_p);
> -  else
> - stream_write_tree (ob, t, ref_p);
> +  enter decl merging.  We should not need to do this anymore because
> +  free_lang_data removes them from block scopes.  */
> +  gcc_assert (!VAR_OR_FUNCTION_DECL_P (t) || !DECL_EXTERNAL (t));
> +  stream_write_tree (ob, t, ref_p);
>  
>t = TREE_CHAIN (t);
>  }
> @@ -620,11 +613,8 @@ write_ts_decl_common_tree_pointers (stru
> pointer fields.  */
>  
>  static void
> -write_ts_decl_non_common_tree_pointers (struct output_block *ob, tree expr,
> - bool ref_p)
> +write_ts_decl_non_common_tree_pointers (struct output_block *, tree, bool)
>  {
> -  if (TREE_CODE (expr) == TYPE_DECL)
>

Re: Cleanup DECL streaming

2018-06-21 Thread Richard Biener

On Thu, 21 Jun 2018, Jan Hubicka wrote:

> > Hi,
> > this patch drops DECL_ORIGINAL_TYPE streaming and also logic handling
> > external decls in blocks since we no longer stream them at all.
> > 
> > Bootstrapped/regtested x86_64-linux, OK?
> Actually the patch dies with lto-bootstrap :(
> 0x9c44fc gen_member_die
> ../../gcc/dwarf2out.c:24947
> 0x9c4bb1 gen_struct_or_union_type_die
> ../../gcc/dwarf2out.c:25143
> 0x9c52dd gen_tagged_type_die
> ../../gcc/dwarf2out.c:25353
> 0x9c4ffe gen_typedef_die
> ../../gcc/dwarf2out.c:25259
> 0x9c6e5f gen_decl_die
> ../../gcc/dwarf2out.c:26248
> 0x9c5557 gen_type_die_with_usage
> ../../gcc/dwarf2out.c:25424
> 0x9c5bc2 gen_type_die
> ../../gcc/dwarf2out.c:25608
> 0x9a7756 modified_type_die
> ../../gcc/dwarf2out.c:13396
> 0x9bd011 add_type_attribute
> ../../gcc/dwarf2out.c:21523
> 0x9bf854 gen_subprogram_die
> ../../gcc/dwarf2out.c:22835
> 0x9c6d8c gen_decl_die
> ../../gcc/dwarf2out.c:26222
> 0x9c7db9 dwarf2out_decl
> ../../gcc/dwarf2out.c:26787
> 0x9c7e15 dwarf2out_function_decl
> ../../gcc/dwarf2out.c:26802
> 0xa65bd9 rest_of_handle_final
> ../../gcc/final.c:4704
> 0xa65d46 execute
> ../../gcc/final.c:4746
> 
> I am going to respawn it because it seems odd that stage2 built but stage3
> failed. Clearly typedefs are determined by the existence of the original type:
> 
> /* Returns true if X is a typedef decl.  */
> 
> bool
> is_typedef_decl (const_tree x)
> {
>   return (x && TREE_CODE (x) == TYPE_DECL
>   && DECL_ORIGINAL_TYPE (x) != NULL_TREE);
> }
> 
> /* Returns true iff TYPE is a type variant created for a typedef. */
> 
> bool
> typedef_variant_p (const_tree type)
> {
>   return is_typedef_decl (TYPE_NAME (type));
> }
> 
> However, aren't we supposed to not touch these in late builds? We drop most
> TYPE_DECLs in favour of IDENTIFIER_TYPE, thus also throwing away
> DECL_ORIGINAL_TYPEs.

We keep quite a bit of TYPE_DECLs around for devirt.

We shouldn't (very often...) end up trying to emit type DIEs late.
Here we're running into

  /* If the prototype had an 'auto' or 'decltype(auto)' return type,
 emit the real type on the definition die.  */
  if (is_cxx () && debug_info_level > DINFO_LEVEL_TERSE)
{
  dw_die_ref die = get_AT_ref (old_die, DW_AT_type);

but that's odd since get_AT_ref shouldn't be able to look up DW_AT_type
in lto1.

So a testcase would be nice to have...

Richard.


> Honza
> > 
> > Honza
> > 
> > * lto-streamer-out.c (DFS::DFS_write_tree_body): Do not
> > stream DECL_ORIGINAL_TYPE.
> > (DFS::DFS_write_tree_body): Drop hack handling local external decls.
> > (hash_tree): Do not walk DECL_ORIGINAL_TYPE.
> > * tree-streamer-in.c (lto_input_ts_decl_non_common_tree_pointers):
> > Do not walk original type.
> > * tree-streamer-out.c (streamer_write_chain): Drop hack handling
> > external decls.
> > (write_ts_decl_non_common_tree_pointers): Do not stream
> > DECL_ORIGINAL_TYPE.
> > * tree.c (free_lang_data_in_decl): Clear DECL_ORIGINAL_TYPE.
> > (find_decls_types_r): Do not walk DECL_ORIGINAL_TYPE.
> > 
> > Index: lto-streamer-out.c
> > ===
> > --- lto-streamer-out.c  (revision 261841)
> > +++ lto-streamer-out.c  (working copy)
> > @@ -819,12 +819,6 @@ DFS::DFS_write_tree_body (struct output_
> > DFS_follow_tree_edge (DECL_DEBUG_EXPR (expr));
> >  }
> >  
> > -  if (CODE_CONTAINS_STRUCT (code, TS_DECL_NON_COMMON))
> > -{
> > -  if (TREE_CODE (expr) == TYPE_DECL)
> > -   DFS_follow_tree_edge (DECL_ORIGINAL_TYPE (expr));
> > -}
> > -
> >if (CODE_CONTAINS_STRUCT (code, TS_DECL_WITH_VIS))
> >  {
> >/* Make sure we don't inadvertently set the assembler name.  */
> > @@ -907,14 +901,13 @@ DFS::DFS_write_tree_body (struct output_
> >if (CODE_CONTAINS_STRUCT (code, TS_BLOCK))
> >  {
> >for (tree t = BLOCK_VARS (expr); t; t = TREE_CHAIN (t))
> > -   if (VAR_OR_FUNCTION_DECL_P (t)
> > -   && DECL_EXTERNAL (t))
> > - /* We have to stream externals in the block chain as
> > -non-references.  See also
> > -tree-streamer-out.c:streamer_write_chain.  */
> > - DFS_write_tree (ob, expr_state, t, ref_p, false);
> > -   else
> > +   {
> > + /* We would have to stream externals in the block chain as
> > +non-references but we should have dropped them in
> > +free-lang-data.  */
> > + gcc_assert (!VAR_OR_FUNCTION_DECL_P (t) || !DECL_EXTERNAL (t));
> >   DFS_follow_tree_edge (t);
> > +   }
> >  
> >DFS_follow_tree_edge (BLOCK_SUPERCONTEXT (expr));
> >  
> > @@ -1261,12 +1249,6 @@ hash_tree (struct streamer_tree_cache_d
> >   be able to call get_symbol_initial_value.  */
> >  }
> >  
> > -  if (CODE_CONTAINS_STRUCT (code, TS_DECL_NON_COMMON))
> > -{
> > -

[PATCH] Cleanup *_ABSTRACT_ORIGIN streaming

2018-06-21 Thread Richard Biener



This simply streams it all.

LTO bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2018-06-21  Richard Biener  

* lto-streamer-out.c (DFS::DFS_write_tree_body): Update outdated
comment.  Follow BLOCK_ABSTRACT_ORIGIN unconditionally.
* tree-streamer-in.c (lto_input_ts_block_tree_pointers): Update
comment.
* tree-streamer-out.c (write_ts_block_tree_pointers): Stream
BLOCK_ABSTRACT_ORIGIN unconditionally.
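
For readers comparing the two behaviours, here is a minimal sketch of what the removed and the new writer code put on the stream for a block's abstract origin.  It is a toy model under stated assumptions: struct toy_block and all toy_* functions are hypothetical, the inlined_outer_scope flag merely stands in for inlined_function_outer_scope_p, and toy_ultimate_origin is a simplified stand-in for block_ultimate_origin.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a BLOCK node.  */
struct toy_block
{
  struct toy_block *abstract_origin;  /* models BLOCK_ABSTRACT_ORIGIN */
  bool inlined_outer_scope;           /* stands in for
                                         inlined_function_outer_scope_p */
};

/* Simplified stand-in for block_ultimate_origin: follow the origin
   links until they end.  Returns NULL if B has no origin.  */
struct toy_block *
toy_ultimate_origin (struct toy_block *b)
{
  struct toy_block *o = b->abstract_origin;
  if (!o)
    return NULL;
  while (o->abstract_origin && o->abstract_origin != o)
    o = o->abstract_origin;
  return o;
}

/* Models the removed code: the ultimate origin for inlined function
   scopes, otherwise the block itself as a marker, or NULL.  */
struct toy_block *
toy_old_streamed_origin (struct toy_block *b)
{
  if (b->inlined_outer_scope)
    return toy_ultimate_origin (b);
  return b->abstract_origin ? b : NULL;
}

/* Models the patched code: the field is streamed as is.  */
struct toy_block *
toy_new_streamed_origin (struct toy_block *b)
{
  return b->abstract_origin;
}

/* Compares both variants on a three-level origin chain.  Returns
   nonzero if the results match the expectations spelled out below.  */
int
toy_origin_demo (void)
{
  struct toy_block root = { NULL, false };
  struct toy_block mid = { &root, false };
  struct toy_block leaf = { &mid, false };

  /* Not an inlined function scope: old code pointed back at the
     block itself, new code keeps the direct origin.  */
  int ok = (toy_old_streamed_origin (&leaf) == &leaf
            && toy_new_streamed_origin (&leaf) == &mid);

  /* Inlined function scope: old code collapsed to the ultimate
     origin, new code still keeps the direct origin.  */
  leaf.inlined_outer_scope = true;
  ok = ok && (toy_old_streamed_origin (&leaf) == &root
              && toy_new_streamed_origin (&leaf) == &mid);
  return ok;
}
```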

Index: gcc/lto-streamer-out.c
===
--- gcc/lto-streamer-out.c  (revision 261845)
+++ gcc/lto-streamer-out.c  (working copy)
@@ -801,10 +801,7 @@ DFS::DFS_write_tree_body (struct output_
 
   DFS_follow_tree_edge (DECL_ATTRIBUTES (expr));
 
-  /* Do not follow DECL_ABSTRACT_ORIGIN.  We cannot handle debug information
-for early inlining so drop it on the floor instead of ICEing in
-dwarf2out.c.
-We however use DECL_ABSTRACT_ORIGIN == error_mark_node to mark
+  /* We use DECL_ABSTRACT_ORIGIN == error_mark_node to mark
 declarations which should be eliminated by decl merging. Be sure none
 leaks to this point.  */
   gcc_assert (DECL_ABSTRACT_ORIGIN (expr) != error_mark_node);
@@ -917,20 +914,8 @@ DFS::DFS_write_tree_body (struct output_
  DFS_follow_tree_edge (t);
 
   DFS_follow_tree_edge (BLOCK_SUPERCONTEXT (expr));
+  DFS_follow_tree_edge (BLOCK_ABSTRACT_ORIGIN (expr));
 
-  /* Follow BLOCK_ABSTRACT_ORIGIN for the limited cases we can
-handle - those that represent inlined function scopes.
-For the drop rest them on the floor instead of ICEing
-in dwarf2out.c, but keep the notion of whether the block
-is an inlined block by refering to itself for the sake of
-tree_nonartificial_location.  */
-  if (inlined_function_outer_scope_p (expr))
-   {
- tree ultimate_origin = block_ultimate_origin (expr);
- DFS_follow_tree_edge (ultimate_origin);
-   }
-  else if (BLOCK_ABSTRACT_ORIGIN (expr))
-   DFS_follow_tree_edge (expr);
   /* Do not follow BLOCK_NONLOCALIZED_VARS.  We cannot handle debug
 information for early inlined BLOCKs so drop it on the floor instead
 of ICEing in dwarf2out.c.  */
Index: gcc/tree-streamer-in.c
===
--- gcc/tree-streamer-in.c  (revision 261845)
+++ gcc/tree-streamer-in.c  (working copy)
@@ -927,11 +927,6 @@ lto_input_ts_block_tree_pointers (struct
   BLOCK_VARS (expr) = streamer_read_chain (ib, data_in);
 
   BLOCK_SUPERCONTEXT (expr) = stream_read_tree (ib, data_in);
-
-  /* Stream BLOCK_ABSTRACT_ORIGIN and BLOCK_SOURCE_LOCATION for
- the limited cases we can handle - those that represent inlined
- function scopes.  For the rest them on the floor instead of ICEing in
- dwarf2out.c.  */
   BLOCK_ABSTRACT_ORIGIN (expr) = stream_read_tree (ib, data_in);
   /* Do not stream BLOCK_NONLOCALIZED_VARS.  We cannot handle debug information
  for early inlined BLOCKs so drop it on the floor instead of ICEing in
Index: gcc/tree-streamer-out.c
===
--- gcc/tree-streamer-out.c (revision 261845)
+++ gcc/tree-streamer-out.c (working copy)
@@ -779,20 +779,8 @@ write_ts_block_tree_pointers (struct out
   streamer_write_chain (ob, BLOCK_VARS (expr), ref_p);
 
   stream_write_tree (ob, BLOCK_SUPERCONTEXT (expr), ref_p);
+  stream_write_tree (ob, BLOCK_ABSTRACT_ORIGIN (expr), ref_p);
 
-  /* Stream BLOCK_ABSTRACT_ORIGIN for the limited cases we can handle - those
- that represent inlined function scopes.
- For the rest them on the floor instead of ICEing in dwarf2out.c, but
- keep the notion of whether the block is an inlined block by refering
- to itself for the sake of tree_nonartificial_location.  */
-  if (inlined_function_outer_scope_p (expr))
-{
-  tree ultimate_origin = block_ultimate_origin (expr);
-  stream_write_tree (ob, ultimate_origin, ref_p);
-}
-  else
-stream_write_tree (ob, (BLOCK_ABSTRACT_ORIGIN (expr)
-   ? expr : NULL_TREE), ref_p);
   /* Do not stream BLOCK_NONLOCALIZED_VARS.  We cannot handle debug information
  for early inlined BLOCKs so drop it on the floor instead of ICEing in
  dwarf2out.c.  */

Re: Cleanup DECL streaming

2018-06-21 Thread Jan Hubicka

> > However, aren't we supposed to not touch these in late builds? We drop most
> > TYPE_DECLs in favour of IDENTIFIER_TYPE, thus also throwing away
> > DECL_ORIGINAL_TYPEs.
> 
> We keep quite a bit of TYPE_DECLs around for devirt.

I know, but we do not keep them systematically enough to make them useful for
dwarf2out.  So I would say dwarf2out touching them is a bug.

> 
> We shouldn't (very often...) end up trying to emit type DIEs late.
> Here we're running into
> 
>   /* If the prototype had an 'auto' or 'decltype(auto)' return type,
>  emit the real type on the definition die.  */
>   if (is_cxx () && debug_info_level > DINFO_LEVEL_TERSE)
> {
>   dw_die_ref die = get_AT_ref (old_die, DW_AT_type);
> 
> but that's odd since get_AT_ref shouldn't be able to look up DW_AT_type
> in lto1.
> 
> So a testcase would be nice to have...

OK, I will try to debug into it.  Testcase is of course easy - apply patch and
run lto bootstrap :)

Honza

> 
> Richard.
> 
> 
> > Honza
> > > 
> > > Honza
> > > 
> > >   * lto-streamer-out.c (DFS::DFS_write_tree_body): Do not
> > >   stream DECL_ORIGINAL_TYPE.
> > >   (DFS::DFS_write_tree_body): Drop hack handling local external decls.
> > >   (hash_tree): Do not walk DECL_ORIGINAL_TYPE.
> > >   * tree-streamer-in.c (lto_input_ts_decl_non_common_tree_pointers):
> > >   Do not walk original type.
> > >   * tree-streamer-out.c (streamer_write_chain): Drop hack handling
> > >   external decls.
> > >   (write_ts_decl_non_common_tree_pointers): Do not stream
> > >   DECL_ORIGINAL_TYPE.
> > >   * tree.c (free_lang_data_in_decl): Clear DECL_ORIGINAL_TYPE.
> > >   (find_decls_types_r): Do not walk DECL_ORIGINAL_TYPE.
> > > 
> > > Index: lto-streamer-out.c
> > > ===
> > > --- lto-streamer-out.c(revision 261841)
> > > +++ lto-streamer-out.c(working copy)
> > > @@ -819,12 +819,6 @@ DFS::DFS_write_tree_body (struct output_
> > >   DFS_follow_tree_edge (DECL_DEBUG_EXPR (expr));
> > >  }
> > >  
> > > -  if (CODE_CONTAINS_STRUCT (code, TS_DECL_NON_COMMON))
> > > -{
> > > -  if (TREE_CODE (expr) == TYPE_DECL)
> > > - DFS_follow_tree_edge (DECL_ORIGINAL_TYPE (expr));
> > > -}
> > > -
> > >if (CODE_CONTAINS_STRUCT (code, TS_DECL_WITH_VIS))
> > >  {
> > >/* Make sure we don't inadvertently set the assembler name.  */
> > > @@ -907,14 +901,13 @@ DFS::DFS_write_tree_body (struct output_
> > >if (CODE_CONTAINS_STRUCT (code, TS_BLOCK))
> > >  {
> > >for (tree t = BLOCK_VARS (expr); t; t = TREE_CHAIN (t))
> > > - if (VAR_OR_FUNCTION_DECL_P (t)
> > > - && DECL_EXTERNAL (t))
> > > -   /* We have to stream externals in the block chain as
> > > -  non-references.  See also
> > > -  tree-streamer-out.c:streamer_write_chain.  */
> > > -   DFS_write_tree (ob, expr_state, t, ref_p, false);
> > > - else
> > > + {
> > > +   /* We would have to stream externals in the block chain as
> > > +  non-references but we should have dropped them in
> > > +  free-lang-data.  */
> > > +   gcc_assert (!VAR_OR_FUNCTION_DECL_P (t) || !DECL_EXTERNAL (t));
> > > DFS_follow_tree_edge (t);
> > > + }
> > >  
> > >DFS_follow_tree_edge (BLOCK_SUPERCONTEXT (expr));
> > >  
> > > @@ -1261,12 +1249,6 @@ hash_tree (struct streamer_tree_cache_d
> > >   be able to call get_symbol_initial_value.  */
> > >  }
> > >  
> > > -  if (CODE_CONTAINS_STRUCT (code, TS_DECL_NON_COMMON))
> > > -{
> > > -  if (code == TYPE_DECL)
> > > - visit (DECL_ORIGINAL_TYPE (t));
> > > -}
> > > -
> > >if (CODE_CONTAINS_STRUCT (code, TS_DECL_WITH_VIS))
> > >  {
> > >if (DECL_ASSEMBLER_NAME_SET_P (t))
> > > Index: tree-streamer-in.c
> > > ===
> > > --- tree-streamer-in.c(revision 261841)
> > > +++ tree-streamer-in.c(working copy)
> > > @@ -728,11 +721,9 @@ lto_input_ts_decl_common_tree_pointers (
> > > file being read.  */
> > >  
> > >  static void
> > > -lto_input_ts_decl_non_common_tree_pointers (struct lto_input_block *ib,
> > > - struct data_in *data_in, tree expr)
> > > +lto_input_ts_decl_non_common_tree_pointers (struct lto_input_block *,
> > > + struct data_in *, tree)
> > >  {
> > > -  if (TREE_CODE (expr) == TYPE_DECL)
> > > -DECL_ORIGINAL_TYPE (expr) = stream_read_tree (ib, data_in);
> > >  }
> > >  
> > >  
> > > Index: tree-streamer-out.c
> > > ===
> > > --- tree-streamer-out.c   (revision 261841)
> > > +++ tree-streamer-out.c   (working copy)
> > > @@ -497,14 +494,10 @@ streamer_write_chain (struct output_bloc
> > >  {
> > >/* We avoid outputting external vars or functions by reference
> > >to the global decls section as we do not want to have them
> > > -  enter decl merging

Re: [PATCH] Consistently gimplify all-zero CTORs to = {};

2018-06-21 Thread Richard Biener

On Thu, 21 Jun 2018, Richard Biener wrote:

> 
> PR86223 points out that we currently gimplify the testcase inconsistently.
> For the incomplete CTORs we use block-clearing while for the complete
> one we emit initializations of the individual elements (up to the
> limits imposed in following checks).
> 
> So the following makes us always use the = {}; form for all-zero CTORs,
> which is the most compact for GIMPLE IL and should only result in better
> code (fingers crossed...), not least because SRA gained the ability to
> handle a = .LC0; style inits as well.
> 
> Bootstrap & regtest running on x86_64-unknown-linux-gnu.

Fine apart from

FAIL: g++.dg/tm/pr45940-3.C  -std=gnu++11 (internal compiler error)
FAIL: g++.dg/tm/pr45940-3.C  -std=gnu++11 (test for excess errors)
FAIL: g++.dg/tm/pr45940-3.C  -std=gnu++14 (internal compiler error)
FAIL: g++.dg/tm/pr45940-3.C  -std=gnu++14 (test for excess errors)
FAIL: g++.dg/tm/pr45940-3.C  -std=gnu++98 (internal compiler error)
FAIL: g++.dg/tm/pr45940-3.C  -std=gnu++98 (test for excess errors)
FAIL: g++.dg/tm/pr45940-4.C  -std=gnu++11 (internal compiler error)
FAIL: g++.dg/tm/pr45940-4.C  -std=gnu++11 (test for excess errors)
FAIL: g++.dg/tm/pr45940-4.C  -std=gnu++14 (internal compiler error)
FAIL: g++.dg/tm/pr45940-4.C  -std=gnu++14 (test for excess errors)
FAIL: g++.dg/tm/pr45940-4.C  -std=gnu++98 (internal compiler error)
FAIL: g++.dg/tm/pr45940-4.C  -std=gnu++98 (test for excess errors)
during GIMPLE pass: tmmark
In constructor 'shared_count::shared_count() transaction_safe':
cc1plus: internal compiler error: in create_tmp_var, at gimple-expr.c:479
0x6c8dc7 create_tmp_var(tree_node*, char const*)
/space/rguenther/src/svn/trunk/gcc/gimple-expr.c:479
0xc5375b create_tmp_from_val
/space/rguenther/src/svn/trunk/gcc/gimplify.c:516
0xc5375b lookup_tmp_var
/space/rguenther/src/svn/trunk/gcc/gimplify.c:537
0xc5375b internal_get_tmp_var
/space/rguenther/src/svn/trunk/gcc/gimplify.c:590
0xc4d481 gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*), int)
/space/rguenther/src/svn/trunk/gcc/gimplify.c:12386
0xc572d4 gimplify_addr_expr
/space/rguenther/src/svn/trunk/gcc/gimplify.c:5994
0xc4f82d gimplify_expr(tree_node**, gimple**, gimple**, bool (*)(tree_node*), int)
/space/rguenther/src/svn/trunk/gcc/gimplify.c:11485
0xc5f4cc force_gimple_operand_1(tree_node*, gimple**, bool (*)(tree_node*), tree_node*)
/space/rguenther/src/svn/trunk/gcc/gimplify-me.c:78
0xc5f57f force_gimple_operand_gsi_1(gimple_stmt_iterator*, tree_node*, bool (*)(tree_node*), tree_node*, bool, gsi_iterator_update)
/space/rguenther/src/svn/trunk/gcc/gimplify-me.c:115
0xe9ac6a expand_assign_tm
/space/rguenther/src/svn/trunk/gcc/trans-mem.c:2446
0xe9df02 expand_block_tm
/space/rguenther/src/svn/trunk/gcc/trans-mem.c:2646
0xe9df02 execute_tm_mark
/space/rguenther/src/svn/trunk/gcc/trans-mem.c:3130
0xe9df02 execute
/space/rguenther/src/svn/trunk/gcc/trans-mem.c:3175

we're doing gimplify_addr (gsi, rhs); on a {} RHS.  Looks like TM
doesn't handle stores from {} at all...  is there a TM-safe
memset()?

and

FAIL: gnat.dg/opt34.adb scan-tree-dump esra "Created a replacement for result"

no time to investigate right now, so I'm putting this on hold.
Eric, can you see if the opt34.adb FAIL is "harmless"?

Richard.
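
For illustration only (not part of the patch): the two initializer
spellings from the pr86223-1.c testcase denote the same all-zero object,
which is why a single block-clear can stand in for both.  A minimal C++
sketch, with a hypothetical helper name, comparing each form against an
explicit memset:

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: returns true when both initializer spellings
// produce the same bytes as an explicit block clear.
bool zero_ctor_forms_agree()
{
  int complete[3] = { 0, 0, 0 };  // "complete" all-zero CTOR, as in g ()
  int incomplete[3] = { 0 };      // "incomplete" CTOR, as in h ()
  int cleared[3];
  assert(sizeof complete == sizeof cleared);
  std::memset(cleared, 0, sizeof cleared);  // the block-clear equivalent

  return std::memcmp(complete, cleared, sizeof cleared) == 0
         && std::memcmp(incomplete, cleared, sizeof cleared) == 0;
}
```

This only demonstrates the source-level equivalence; it says nothing
about what the gimplifier or later passes emit.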

> Richard.
> 
> 2018-06-21  Richard Biener  
> 
>   PR middle-end/86223
>   * gimplify.c (gimplify_init_constructor): For an all-zero
>   constructor emit a block-clear.
> 
>   * gcc.dg/pr86223-1.c: New testcase.
> 
> Index: gcc/gimplify.c
> ===
> --- gcc/gimplify.c(revision 261839)
> +++ gcc/gimplify.c(working copy)
> @@ -4805,6 +4805,10 @@ gimplify_init_constructor (tree *expr_p,
>requires trickery to avoid quadratic compile-time behavior in
>large cases or excessive memory use in small cases.  */
> cleared = !CONSTRUCTOR_NO_CLEARING (ctor);
> + else if (num_nonzero_elements == 0)
> +   /* If all elements are zero it is most efficient to block-clear
> +  things.  */
> +   cleared = true;
>   else if (num_ctor_elements - num_nonzero_elements
>> CLEAR_RATIO (optimize_function_for_speed_p (cfun))
>&& num_nonzero_elements < num_ctor_elements / 4)
> Index: gcc/testsuite/gcc.dg/pr86223-1.c
> ===
> --- gcc/testsuite/gcc.dg/pr86223-1.c  (revision 0)
> +++ gcc/testsuite/gcc.dg/pr86223-1.c  (working copy)
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-fdump-tree-gimple" } */
> +
> +void f (int *);
> +void g ()
> +{
> +  int a[3] = { 0, 0, 0 };
> +  f (a);
> +}
> +void h ()
> +{
> +  int a[3] = { 0 };
> +  f (a);
> +}
> +
> +/* We should use block-clearing for the initializer

Re: [patch] adjust default nvptx launch geometry for OpenACC offloaded regions

2018-06-21 Thread Cesar Philippidis

On 06/20/2018 03:15 PM, Tom de Vries wrote:
> On 06/20/2018 11:59 PM, Cesar Philippidis wrote:
>> Now it follows the formula contained in
>> the "CUDA Occupancy Calculator" spreadsheet that's distributed with CUDA.
> 
> Any reason we're not using the cuda runtime functions to get the
> occupancy (see PR85590 - [nvptx, libgomp, openacc] Use cuda runtime fns
> to determine launch configuration in nvptx ) ?

There are two reasons:

  1) cuda_occupancy.h depends on the CUDA runtime to extract the device
 properties instead of the CUDA driver API. However, we can always
 teach libgomp how to populate the cudaDeviceProp struct using the
 driver API.

  2) CUDA is not always present on the build host, and that's why
 libgomp maintains its own cuda.h. So at the very least, this
 functionality would be good to have in libgomp as a fallback
 implementation; it's not good to have a program fail due to
 insufficient hardware resources errors when that is avoidable.

Cesar
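
To make the "fallback implementation" idea concrete, here is a rough
sketch of the kind of resource check involved.  It is not the libgomp
code and not the Occupancy Calculator formula: the struct, the function
names and all limits are hypothetical, and shared memory and register
allocation granularity are ignored.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical per-multiprocessor limits; real values would be queried
// from the device through the driver API.
struct sm_limits
{
  int max_threads;  // resident threads per multiprocessor
  int max_blocks;   // resident blocks per multiprocessor
  int registers;    // 32-bit registers per multiprocessor
};

// Number of blocks that can be resident on one multiprocessor at once.
int resident_blocks(const sm_limits& sm, int threads_per_block,
                    int regs_per_thread)
{
  assert(threads_per_block > 0 && regs_per_thread > 0);
  int by_threads = sm.max_threads / threads_per_block;
  int by_regs = sm.registers / (regs_per_thread * threads_per_block);
  return std::min({by_threads, by_regs, sm.max_blocks});
}

// Largest block size (a multiple of `warp` threads, at most `max_block`)
// for which at least one block fits; 0 if even a single warp does not fit.
int largest_launchable_block(const sm_limits& sm, int regs_per_thread,
                             int max_block = 1024, int warp = 32)
{
  for (int threads = max_block; threads >= warp; threads -= warp)
    if (resident_blocks(sm, threads, regs_per_thread) > 0)
      return threads;
  return 0;
}
```

Shrinking the block size until the kernel's register demand fits is what
avoids the "insufficient hardware resources" launch failure mentioned
above.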

[PATCH] PR libstdc++/70940 make pmr::resource_adaptor return aligned memory

2018-06-21 Thread Jonathan Wakely


PR libstdc++/70940
* include/experimental/memory_resource (__resource_adaptor_common):
New base class.
(__resource_adaptor_common::_AlignMgr): Helper for obtaining aligned
pointer from unaligned, and vice versa.
(__resource_adaptor_imp::do_allocate): Use _AlignMgr to adjust
allocated pointer to meet alignment request.
(__resource_adaptor_imp::do_deallocate): Use _AlignMgr to retrieve
original pointer for deallocation.
(__resource_adaptor_imp::do_is_equal): Reformat.
(__resource_adaptor_imp::_S_aligned_size): Remove.
(__resource_adaptor_imp::_S_supported): Remove.
(new_delete_resource): Use __gnu_cxx::new_allocator.
* testsuite/experimental/memory_resource/resource_adaptor.cc: Test
extended alignments and use debug_allocator to check for matching
allocate/deallocate pairs.

Tested x86_64-linux, committed to trunk. This is more experimental TS
material, so I'll probably backport it.
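
The approach described in the ChangeLog (over-allocate, align within the
buffer, remember how to get back to the original pointer) can be
sketched roughly as follows.  This is a simplified illustration, not the
libstdc++ code: the function names are made up, it sits on top of malloc
rather than an allocator, and it always stores a size_t offset instead
of choosing the smallest token that fits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <memory>

// Allocate nbytes aligned to align (a power of two) on top of malloc.
void* aligned_alloc_with_token(std::size_t nbytes, std::size_t align)
{
  assert(align != 0 && (align & (align - 1)) == 0);
  std::size_t space = nbytes + align - 1;  // room to slide forward
  char* orig = static_cast<char*>(std::malloc(space + sizeof(std::size_t)));
  if (!orig)
    return nullptr;
  void* p = orig;
  // Cannot fail: space is large enough for any starting address.
  std::align(align, nbytes, p, space);
  char* aligned = static_cast<char*>(p);
  // Token stored immediately after the aligned block.
  std::size_t offset = static_cast<std::size_t>(aligned - orig);
  std::memcpy(aligned + nbytes, &offset, sizeof offset);
  return aligned;
}

// Release a block obtained from aligned_alloc_with_token with the same
// nbytes, using the stored token to recover the original pointer.
void aligned_free_with_token(void* ptr, std::size_t nbytes)
{
  char* aligned = static_cast<char*>(ptr);
  std::size_t offset;
  std::memcpy(&offset, aligned + nbytes, sizeof offset);
  std::free(aligned - offset);
}
```

As with do_deallocate, the caller has to pass the same size back on
release, because the token's position depends on it.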


commit 12844fcc6551b7667feb8a3c81280fa5ae90304f
Author: Jonathan Wakely 
Date:   Thu Jun 21 14:24:00 2018 +0100

PR libstdc++/70940 make pmr::resource_adaptor return aligned memory

PR libstdc++/70940
* include/experimental/memory_resource (__resource_adaptor_common):
New base class.
(__resource_adaptor_common::_AlignMgr): Helper for obtaining aligned
pointer from unaligned, and vice versa.
(__resource_adaptor_imp::do_allocate): Use _AlignMgr to adjust
allocated pointer to meet alignment request.
(__resource_adaptor_imp::do_deallocate): Use _AlignMgr to retrieve
original pointer for deallocation.
(__resource_adaptor_imp::do_is_equal): Reformat.
(__resource_adaptor_imp::_S_aligned_size): Remove.
(__resource_adaptor_imp::_S_supported): Remove.
(new_delete_resource): Use __gnu_cxx::new_allocator.
* testsuite/experimental/memory_resource/resource_adaptor.cc: Test
extended alignments and use debug_allocator to check for matching
allocate/deallocate pairs.

diff --git a/libstdc++-v3/include/experimental/memory_resource 
b/libstdc++-v3/include/experimental/memory_resource
index 670a2210804..3d2dce19868 100644
--- a/libstdc++-v3/include/experimental/memory_resource
+++ b/libstdc++-v3/include/experimental/memory_resource
@@ -33,6 +33,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 namespace std {
@@ -253,9 +254,103 @@ namespace pmr {
const polymorphic_allocator<_Tp2>& __b) noexcept
 { return !(__a == __b); }
 
+  class __resource_adaptor_common
+  {
+template friend class __resource_adaptor_imp;
+
+struct _AlignMgr
+{
+  _AlignMgr(size_t __nbytes, size_t __align)
+  : _M_nbytes(__nbytes), _M_align(__align)
+  { }
+
+  // Total size that needs to be allocated.
+  size_t
+  _M_alloc_size() const { return _M_buf_size() + _M_token_size(); }
+
+  void*
+  _M_adjust(void* __ptr) const
+  {
+   const auto __orig_ptr = static_cast<char*>(__ptr);
+   size_t __space = _M_buf_size();
+   // Align the pointer within the buffer:
+   std::align(_M_align, _M_nbytes, __ptr, __space);
+   const auto __aligned_ptr = static_cast<char*>(__ptr);
+   const auto __token_size = _M_token_size();
+   // Store token immediately after the aligned block:
+   char* const __end = __aligned_ptr + _M_nbytes;
+   if (__token_size == 1)
+ _S_write(__end, __aligned_ptr - __orig_ptr);
+   else if (__token_size == sizeof(short))
+ _S_write(__end, __aligned_ptr - __orig_ptr);
+   else if (__token_size == sizeof(int) && sizeof(int) < sizeof(char*))
+ _S_write(__end, __aligned_ptr - __orig_ptr);
+   else // (__token_size == sizeof(char*))
+ // Just store the original pointer:
+ _S_write(__end, __orig_ptr);
+   return __aligned_ptr;
+  }
+
+  char*
+  _M_unadjust(char* __ptr) const
+  {
+   const char* const __end = __ptr + _M_nbytes;
+   char* __orig_ptr;
+   const auto __token_size = _M_token_size();
+   // Read the token and restore the original pointer:
+   if (__token_size == 1)
+ __orig_ptr = __ptr - _S_read(__end);
+   else if (__token_size == sizeof(short))
+ __orig_ptr = __ptr - _S_read(__end);
+   else if (__token_size == sizeof(int)
+   && sizeof(int) < sizeof(char*))
+ __orig_ptr = __ptr - _S_read(__end);
+   else // (__token_size == sizeof(char*))
+ __orig_ptr = _S_read(__end);
+   return __orig_ptr;
+  }
+
+private:
+  size_t _M_nbytes;
+  size_t _M_align;
+
+  // Number of bytes needed to fit block of given size and alignment.
+  size_t
+  _M_buf_size() const { return _M_nbytes + _M_align - 1; }
+
+  // Number of additional bytes needed to write the token.

Re: Cleanup DECL streaming

2018-06-21 Thread Richard Biener

On Thu, 21 Jun 2018, Jan Hubicka wrote:

> > > however aren't we supposed to not touch these at late builds? We drop 
> > > most of TYPE_DECLs in favour
> > > of IDENTIFIER_TYPE and thus also throwing away DECL_ORIGINAL_TYPEs.
> > 
> > We keep quite a bit of TYPE_DECLs around for devirt.
> 
> I know, but we do not keep them systematically enough to make them useful for 
> dwarf2out.
> So I would say dwarf2out touching them is a bug.
> 
> > 
> > We shouldn't (very often...) end up trying to emit type DIEs late.
> > Here we're running into
> > 
> >   /* If the prototype had an 'auto' or 'decltype(auto)' return 
> > type,
> >  emit the real type on the definition die.  */
> >   if (is_cxx () && debug_info_level > DINFO_LEVEL_TERSE)
> > {
> >   dw_die_ref die = get_AT_ref (old_die, DW_AT_type);
> > 
> > but that's odd since get_AT_ref shouldn't be able to look up DW_AT_type
> > in lto1.
> > 
> > So a testcase would be nice to have...
> 
> OK, I will try to debug into it.  Testcase is of course easy - apply patch 
> and run
> lto bootstrap :)

:)

Sometimes running the testsuite with -flto -g also pops up such issues.

Richard.

Have g++ define _FILE_OFFSET_BITS=64 on Solaris

2018-06-21 Thread Rainer Orth

I recently found two libstdc++ testcases failing on some Solaris hosts
for 32-bit only:

FAIL: 27_io/filesystem/operations/space.cc execution test
FAIL: experimental/filesystem/operations/space.cc execution test

Both fail in the same way:

terminate called after throwing an instance of 
'std::filesystem::__cxx11::filesystem_error'
  what():  filesystem error: cannot get free space: Value too large for defined 
data type [.]

However, the test PASSes just fine on other systems.

It turns out that the tests FAIL with

statvfs(".", 0xFEFFDB64)                        Err#79 EOVERFLOW

On the failing system, the build filesystem is 3.4 TB, thus the
EOVERFLOW.

It seems g++ on Solaris doesn't fully enable largefile support: it has
-D_LARGEFILE_SOURCE=1 in gcc/config/sol2.h (TARGET_OS_CPP_BUILTINS), but
lacks -D_FILE_OFFSET_BITS=64 which is required to get the
largefile-aware functions (statvfs64 in this case).

The following patch adds that, fixing the two failures.

Bootstrapped without regressions on i386-pc-solaris2.1[01] and
sparc-sun-solaris2.1[01].

Unless someone has an idea why this might cause problems, I'll install
the patch on mainline and backport to the gcc-7 and gcc-8 branches.
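
For reference, this is roughly what user code has to do today to opt in
by hand, and what the patch makes the default for C++ on Solaris.  A
small sketch; the helper name is hypothetical, and on 64-bit targets the
define makes no difference:

```cpp
// Request the largefile-aware interfaces; must precede every system header.
#ifndef _FILE_OFFSET_BITS
#define _FILE_OFFSET_BITS 64
#endif

#include <cassert>
#include <cstdint>
#include <sys/statvfs.h>

// Hypothetical helper: total size in bytes of the filesystem containing
// path.  Returns false on failure, with errno holding the reason
// (e.g. EOVERFLOW when the counts do not fit the narrow types).
bool filesystem_capacity(const char* path, std::uint64_t& bytes)
{
  assert(path != nullptr);
  struct statvfs buf;
  if (statvfs(path, &buf) != 0)
    return false;
  bytes = static_cast<std::uint64_t>(buf.f_frsize) * buf.f_blocks;
  return true;
}
```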

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2018-06-18  Rainer Orth  

* config/sol2.h (TARGET_OS_CPP_BUILTINS): Define
_FILE_OFFSET_BITS=64 for C++.

# HG changeset patch
# Parent  48a63094f075d53e7bbbe0f2de0513c267ef9e96
Have g++ define _FILE_OFFSET_BITS=64 on 32-bit Solaris

diff --git a/gcc/config/sol2.h b/gcc/config/sol2.h
--- a/gcc/config/sol2.h
+++ b/gcc/config/sol2.h
@@ -113,6 +113,7 @@ along with GCC; see the file COPYING3.  
 	builtin_define ("_XOPEN_SOURCE=600");		\
 	builtin_define ("_LARGEFILE_SOURCE=1");		\
 	builtin_define ("_LARGEFILE64_SOURCE=1");	\
+	builtin_define ("_FILE_OFFSET_BITS=64");	\
 	builtin_define ("__EXTENSIONS__");		\
   }			\
 TARGET_SUB_OS_CPP_BUILTINS();			\

[PATCH 3/N] Make symbol_summary::get and call_summary::get pure.

2018-06-21 Thread Martin Liška

Hi.

This is the last part of the planned clean-up, in which I declare ::get as PURE.
The patch bootstraps on ppc64le-redhat-linux and survives regression tests.

Ready to be installed?
Martin
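
The split this patch relies on can be sketched outside of GCC as
follows.  This is only an illustration with a hypothetical class built
on std::unordered_map, not the symbol-summary.h code: the lookup-only
getter never inserts, so it has no side effects, while the creating
getter modifies the map and therefore could not be marked pure.

```cpp
#include <cassert>
#include <memory>
#include <unordered_map>

// Hypothetical stand-in for a summary container keyed by uid.
template <typename T>
class summary_map
{
public:
  // Lookup only: never inserts, so the result depends solely on the
  // argument and on the current contents of the map.
  T* get(int uid) const
  {
    auto it = m_map.find(uid);
    return it == m_map.end() ? nullptr : it->second.get();
  }

  // Lookup that creates the summary on demand; it modifies the map.
  T* get_create(int uid)
  {
    assert(uid > 0);
    std::unique_ptr<T>& slot = m_map[uid];
    if (!slot)
      slot.reset(new T());
    return slot.get();
  }

private:
  std::unordered_map<int, std::unique_ptr<T>> m_map;
};
```

With the old get (uid, lazy_insert) helper both behaviours shared one
function, which is what prevented marking the getter pure.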
>From 540efe3374d649cc8745445a3e6dc1c720fb79ad Mon Sep 17 00:00:00 2001
From: marxin 
Date: Wed, 20 Jun 2018 14:26:48 +0200
Subject: [PATCH] Make symbol_summary::get and call_summary::get pure.

gcc/ChangeLog:

2018-06-20  Martin Liska  

	* symbol-summary.h (get): Make it pure and inline move
functionality from ::get function.
(get): Remove and inline into ::get and ::get_create.
(get_create): Move code from ::get function.
---
 gcc/symbol-summary.h | 74 +++-
 1 file changed, 18 insertions(+), 56 deletions(-)

diff --git a/gcc/symbol-summary.h b/gcc/symbol-summary.h
index bf32810abd7..26e9773d3c0 100644
--- a/gcc/symbol-summary.h
+++ b/gcc/symbol-summary.h
@@ -90,13 +90,19 @@ public:
  does not exist it will be created.  */
   T* get_create (cgraph_node *node)
   {
-return get (node->get_uid (), true);
+bool existed;
+T **v = &m_map.get_or_insert (node->get_uid (), &existed);
+if (!existed)
+  *v = allocate_new ();
+
+return *v;
   }
 
   /* Getter for summary callgraph node pointer.  */
-  T* get (cgraph_node *node)
+  T* get (cgraph_node *node) ATTRIBUTE_PURE
   {
-return get (node->get_uid (), false);
+T **v = m_map.get (node->get_uid ());
+return v == NULL ? NULL : *v;
   }
 
   /* Remove node from summary.  */
@@ -152,9 +158,6 @@ protected:
 private:
   typedef int_hash  map_hash;
 
-  /* Getter for summary callgraph ID.  */
-  T *get (int uid, bool lazy_insert);
-
   /* Indicates if insertion hook is enabled.  */
   bool m_insertion_enabled;
   /* Indicates if the summary is released.  */
@@ -273,28 +276,6 @@ function_summary::symtab_duplication (cgraph_node *node,
 }
 }
 
-template 
-T*
-function_summary::get (int uid, bool lazy_insert)
-{
-  gcc_checking_assert (uid > 0);
-
-  if (lazy_insert)
-{
-  bool existed;
-  T **v = &m_map.get_or_insert (uid, &existed);
-  if (!existed)
-	*v = allocate_new ();
-
-  return *v;
-}
-  else
-{
-  T **v = m_map.get (uid);
-  return v == NULL ? NULL : *v;
-}
-}
-
 template 
 void
 gt_ggc_mx(function_summary* const &summary)
@@ -387,13 +368,19 @@ public:
  If a summary for an edge does not exist, it will be created.  */
   T* get_create (cgraph_edge *edge)
   {
-return get (edge->get_uid (), true);
+bool existed;
+T **v = &m_map.get_or_insert (edge->get_uid (), &existed);
+if (!existed)
+  *v = allocate_new ();
+
+return *v;
   }
 
   /* Getter for summary callgraph edge pointer.  */
-  T* get (cgraph_edge *edge)
+  T* get (cgraph_edge *edge) ATTRIBUTE_PURE
   {
-return get (edge->get_uid (), false);
+T **v = m_map.get (edge->get_uid ());
+return v == NULL ? NULL : *v;
   }
 
   /* Remove edge from summary.  */
@@ -437,9 +424,6 @@ protected:
 private:
   typedef int_hash  map_hash;
 
-  /* Getter for summary callgraph ID.  */
-  T *get (int uid, bool lazy_insert);
-
   /* Main summary store, where summary ID is used as key.  */
   hash_map  m_map;
   /* Internal summary removal hook pointer.  */
@@ -457,28 +441,6 @@ private:
   gt_pointer_operator, void *);
 };
 
-template 
-T*
-call_summary::get (int uid, bool lazy_insert)
-{
-  gcc_checking_assert (uid > 0);
-
-  if (lazy_insert)
-{
-  bool existed;
-  T **v = &m_map.get_or_insert (uid, &existed);
-  if (!existed)
-	*v = allocate_new ();
-
-  return *v;
-}
-  else
-{
-  T **v = m_map.get (uid);
-  return v == NULL ? NULL : *v;
-}
-}
-
 template 
 void
 call_summary::release ()
-- 
2.17.1

Re: Have g++ define _FILE_OFFSET_BITS=64 on Solaris

2018-06-21 Thread Jonathan Wakely


On 21/06/18 16:17 +0200, Rainer Orth wrote:

> I recently found two libstdc++ testcases failing on some Solaris hosts
> for 32-bit only:
> 
> FAIL: 27_io/filesystem/operations/space.cc execution test
> FAIL: experimental/filesystem/operations/space.cc execution test
> 
> Both fail in the same way:
> 
> terminate called after throwing an instance of 
> 'std::filesystem::__cxx11::filesystem_error'
>  what():  filesystem error: cannot get free space: Value too large for defined 
> data type [.]
> 
> However, the test PASSes just fine on other systems.
> 
> It turns out that the tests FAIL with
> 
> statvfs(".", 0xFEFFDB64)                        Err#79 EOVERFLOW
> 
> On the failing system, the build filesystem is 3.4 TB, thus the
> EOVERFLOW.
> 
> It seems g++ on Solaris doesn't fully enable largefile support: it has
> -D_LARGEFILE_SOURCE=1 in gcc/config/sol2.h (TARGET_OS_CPP_BUILTINS), but
> lacks -D_FILE_OFFSET_BITS=64 which is required to get the
> largefile-aware functions (statvfs64 in this case).
> 
> The following patch adds that, fixing the two failures.
> 
> Bootstrapped without regressions on i386-pc-solaris2.1[01] and
> sparc-sun-solaris2.1[01].
> 
> Unless someone has an idea why this might cause problems, I'll install
> the patch on mainline and backport to the gcc-7 and gcc-8 branches.


No objection to this patch, but I'll just note that we have
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81091 suggesting we
should use LFS for libstdc++ unconditionally.

Don't preprocess .S files with -P on Solaris/x86 (PR target/85994)

2018-06-21 Thread Rainer Orth

When bootstrapping gcc on Solaris/x86 with gas, there are a couple of
comparison failures:

i386-pc-solaris2.11/amd64/libgcc/avx_savms64f.o differs
i386-pc-solaris2.11/amd64/libgcc/sse_resms64.o differs
i386-pc-solaris2.11/amd64/libgcc/sse_resms64f_s.o differs

and several more.

The differences occur in the .debug_line sections, where tmp file names
are embedded.

The resulting object files have (in readelf --debug-dump output):

 The Directory Table (offset 0x1b):
  1 /var/tmp/

 The File Name Table (offset 0x26):
  Entry Dir     Time    Size    Name
  1     1       0       0       cc3DRYva.s

while on Linux I see

 The Directory Table (offset 0x1b):
  1 /vol/gcc/src/hg/trunk/local/libgcc/config/i386

 The File Name Table (offset 0x4b):
  Entry Dir     Time    Size    Name
  1     1       0       0       resms64.h

On Linux, linemarkers (#  ...) are passed to gas, while on Solaris
they are suppressed by passing -P to cpp for -x assembler-with-cpp.
This was necessary once because old versions of the native assembler
didn't grok them.

However, on closer investigation it turned out that neither the original
Solaris 10/x86 FCS assembler nor the first one to support line markers
(from patch 119961-03) are able to bootstrap gcc any longer for
unrelated reasons.  Obviously, nobody has cared to report that, so I
just drop support for them.  This makes the need for the -P go away,
thus fixing the comparison failures.

Bootstrapped without regressions on i386-pc-solaris2.1[01], will apply
to mainline and gcc-8 branch shortly.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2018-05-31  Rainer Orth  

PR target/85994
* config/i386/sol2.h (CPP_SPEC): Don't pass -P for
-x assembler-with-cpp.

# HG changeset patch
# Parent  74e1ffd73c4cc935c6435b276f4c0d682b4a06d6
Avoid passing -P to cpp on Solaris/x86 (PR target/85994)

diff --git a/gcc/config/i386/sol2.h b/gcc/config/i386/sol2.h
--- a/gcc/config/i386/sol2.h
+++ b/gcc/config/i386/sol2.h
@@ -51,9 +51,8 @@ along with GCC; see the file COPYING3.  
 #undef TARGET_SUN_TLS
 #define TARGET_SUN_TLS 1
 
-/* Solaris 2/Intel as chokes on #line directives before Solaris 10.  */
 #undef CPP_SPEC
-#define CPP_SPEC "%{,assembler-with-cpp:-P} %(cpp_subtarget)"
+#define CPP_SPEC "%(cpp_subtarget)"
 
 /* GNU as understands --32 and --64, but the native Solaris
assembler requires -xarch=generic or -xarch=generic64 instead.  */

Re: [PATCH] Consistently gimplify all-zero CTORs to = {};

2018-06-21 Thread Eric Botcazou

> and
> 
> FAIL: gnat.dg/opt34.adb scan-tree-dump esra "Created a replacement for
> result"
> 
> no time to investigate right now, so I'm putting this on hold.
> Eric, can you see if the opt34.adb FAIL is "harmless"?

A bit busy too, and the failure is at most a pessimization in any case, so no 
objection to the change by me.

-- 
Eric Botcazou

Re: Have g++ define _FILE_OFFSET_BITS=64 on Solaris

2018-06-21 Thread Rainer Orth

Hi Jonathan,

> No objection to this patch, but I'll just note that we have
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81091 suggesting we
> should use LFS for libstdc++ unconditionally.

seems like a wise move to me.  The libstdc++.so ABI didn't change on
Solaris either (that possibility had caused concern for me initially);
didn't check libstdc++fs.a though.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: Have g++ define _FILE_OFFSET_BITS=64 on Solaris

2018-06-21 Thread Jonathan Wakely


On 21/06/18 16:49 +0200, Rainer Orth wrote:

> Hi Jonathan,
> 
> > No objection to this patch, but I'll just note that we have
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81091 suggesting we
> > should use LFS for libstdc++ unconditionally.
> 
> seems like a wise move to me.  The libstdc++.so ABI didn't change on
> Solaris either (that possibility had caused concern for me initially);
> didn't check libstdc++fs.a though.


Well the main reason that's only a static library for now is to allow
us to make ABI incompatible changes before we declare it stable and
add those symbols to libstdc++.so forever.

Re: Have g++ define _FILE_OFFSET_BITS=64 on Solaris

2018-06-21 Thread Franz Sirl


Am 2018-06-21 um 16:17 schrieb Rainer Orth:

> I recently found two libstdc++ testcases failing on some Solaris hosts
> for 32-bit only:
> 
> FAIL: 27_io/filesystem/operations/space.cc execution test
> FAIL: experimental/filesystem/operations/space.cc execution test
> 
> Both fail in the same way:
> 
> terminate called after throwing an instance of 
> 'std::filesystem::__cxx11::filesystem_error'
>    what():  filesystem error: cannot get free space: Value too large for 
> defined data type [.]
> 
> However, the test PASSes just fine on other systems.
> 
> It turns out that the tests FAIL with
> 
> statvfs(".", 0xFEFFDB64)                        Err#79 EOVERFLOW
> 
> On the failing system, the build filesystem is 3.4 TB, thus the
> EOVERFLOW.
> 
> It seems g++ on Solaris doesn't fully enable largefile support: it has
> -D_LARGEFILE_SOURCE=1 in gcc/config/sol2.h (TARGET_OS_CPP_BUILTINS), but
> lacks -D_FILE_OFFSET_BITS=64 which is required to get the
> largefile-aware functions (statvfs64 in this case).
> 
> The following patch adds that, fixing the two failures.
> 
> Bootstrapped without regressions on i386-pc-solaris2.1[01] and
> sparc-sun-solaris2.1[01].
> 
> Unless someone has an idea why this might cause problems, I'll install
> the patch on mainline and backport to the gcc-7 and gcc-8 branches.


No idea about possible problems, but isn't it usually recommended to use 
either _FILE_OFFSET_BITS=64 or _LARGEFILE{64}_SOURCE=1, not both at the 
same time?


Franz

Re: [PATCH] PR libstdc++/70940 make pmr::resource_adaptor return aligned memory

2018-06-21 Thread Jonathan Wakely


On 21/06/18 15:00 +0100, Jonathan Wakely wrote:

>   virtual void
> -  do_deallocate(void* __p, size_t __bytes, size_t __alignment)
> +  do_deallocate(void* __p, size_t __bytes, size_t __alignment) noexcept
> +  override
>   {
> -   using _Aligned_alloc = std::__alloc_rebind<_Alloc, char>;
> -   size_t __new_size = _S_aligned_size(__bytes,
> -   _S_supported(__alignment) ?
> -   __alignment : _S_max_align);
> -   using _Ptr = typename allocator_traits<_Aligned_alloc>::pointer;
> -   _Aligned_alloc(_M_alloc).deallocate(static_cast<_Ptr>(__p),
> -   __new_size);
> +   auto __ptr = static_cast<char*>(__p);
> +   if (__alignment == 1)
> + _M_alloc.deallocate(__ptr, __bytes);


Oops, this is missing a return!

Patch incoming ...
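
To spell out why the missing return matters: without it, the
alignment == 1 fast path falls through into the general path, so the
same block would be released a second time.  A stripped-down sketch with
hypothetical names, where counters stand in for the two deallocation
paths:

```cpp
#include <cassert>
#include <cstddef>

// Counts how often each release path runs (stand-ins for real frees).
struct dealloc_counts
{
  int fast = 0;      // direct deallocate for alignment == 1
  int adjusted = 0;  // deallocate via the stored token
};

// Shape of the code as first posted: no return after the fast path.
void release_buggy(std::size_t alignment, dealloc_counts& c)
{
  assert(alignment != 0);
  if (alignment == 1)
    ++c.fast;
  ++c.adjusted;  // also runs when alignment == 1
}

// Shape of the code with the return added.
void release_fixed(std::size_t alignment, dealloc_counts& c)
{
  assert(alignment != 0);
  if (alignment == 1)
    {
      ++c.fast;
      return;
    }
  ++c.adjusted;
}
```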

Re: [Patch, fortran] PR83118 - [7/8/9 Regression] Bad intrinsic assignment of class(*) array component of derived type

2018-06-21 Thread Steve Kargl

On Thu, Jun 21, 2018 at 09:02:53AM +0100, Paul Richard Thomas wrote:
> The original problem was fixed by the patch for PR84546. This patch
> fixes a variant that appears in comment #6.
> 
> The fix is completely straightforward and described by the comments
> and ChangeLogs.
> 
> Bootstrapped and regtested on FC28/x86_64 - OK for trunk?
> 

OK.

-- 
Steve

Re: [Patch, fortran] PR49630 - [OOP] ICE on obsolescent deferred-length type bound character function

2018-06-21 Thread Steve Kargl

On Thu, Jun 21, 2018 at 09:03:47AM +0100, Paul Richard Thomas wrote:
> Ping!
> 
> > 2018-06-19  Paul Thomas  
> >
> > PR fortran/49630
> > * resolve.c (resolve_contained_fntype): Change standard ref.
> > from F95 to F2003: C418. Correct a spelling error in a comment.
> > It is an error for an abstract interface to have an assumed
> > character length result.
> > * trans-expr.c (gfc_conv_procedure_call): Likewise change the
> > standard reference.
> >
> > 2018-06-19  Paul Thomas  
> >
> > PR fortran/49630
> > * gfortran.dg/assumed_charlen_function_7.f90: New test.

OK.

-- 
Steve

Re: Cleanup DECL streaming

2018-06-21 Thread Jan Hubicka

Hi
with -flto -g -O2 -r -nostdlib -flinker-output=nolto-rel I get an ICE on:
class a;
namespace b {
template  struct c;
struct C {
  typedef a d;
};
void e();
}
template  class g : f {
public:
  template  g(i);
};
class a {
  long k;
};
namespace b {
template <> struct c { template  static a l(j, h); };
}
template  b::C::d b::e(j m, h n) { c::l(m, n); }
void o() {
  g p = 0;
  a r(b::e(r, p));
}

Re: [PATCH 3/N] Make symbol_summary::get and call_summary::get pure.

2018-06-21 Thread Jan Hubicka

> Hi.
> 
> Last part of planned clean-up where I declare ::get as PURE.
> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
> 
> Ready to be installed?
OK,
thanks!
Honza
> Martin

> From 540efe3374d649cc8745445a3e6dc1c720fb79ad Mon Sep 17 00:00:00 2001
> From: marxin 
> Date: Wed, 20 Jun 2018 14:26:48 +0200
> Subject: [PATCH] Make symbol_summary::get and call_summary::get pure.
> 
> gcc/ChangeLog:
> 
> 2018-06-20  Martin Liska  
> 
>   * symbol-summary.h (get): Make it pure and inline move
> functionality from ::get function.
> (get): Remove and inline into ::get and ::get_create.
> (get_create): Move code from ::get function.
> ---
>  gcc/symbol-summary.h | 74 +++-
>  1 file changed, 18 insertions(+), 56 deletions(-)
> 
> diff --git a/gcc/symbol-summary.h b/gcc/symbol-summary.h
> index bf32810abd7..26e9773d3c0 100644
> --- a/gcc/symbol-summary.h
> +++ b/gcc/symbol-summary.h
> @@ -90,13 +90,19 @@ public:
>   does not exist it will be created.  */
>T* get_create (cgraph_node *node)
>{
> -return get (node->get_uid (), true);
> +bool existed;
> +T **v = &m_map.get_or_insert (node->get_uid (), &existed);
> +if (!existed)
> +  *v = allocate_new ();
> +
> +return *v;
>}
>  
>/* Getter for summary callgraph node pointer.  */
> -  T* get (cgraph_node *node)
> +  T* get (cgraph_node *node) ATTRIBUTE_PURE
>{
> -return get (node->get_uid (), false);
> +T **v = m_map.get (node->get_uid ());
> +return v == NULL ? NULL : *v;
>}
>  
>/* Remove node from summary.  */
> @@ -152,9 +158,6 @@ protected:
>  private:
>typedef int_hash  map_hash;
>  
> -  /* Getter for summary callgraph ID.  */
> -  T *get (int uid, bool lazy_insert);
> -
>/* Indicates if insertion hook is enabled.  */
>bool m_insertion_enabled;
>/* Indicates if the summary is released.  */
> @@ -273,28 +276,6 @@ function_summary::symtab_duplication (cgraph_node 
> *node,
>  }
>  }
>  
> -template 
> -T*
> -function_summary::get (int uid, bool lazy_insert)
> -{
> -  gcc_checking_assert (uid > 0);
> -
> -  if (lazy_insert)
> -{
> -  bool existed;
> -  T **v = &m_map.get_or_insert (uid, &existed);
> -  if (!existed)
> - *v = allocate_new ();
> -
> -  return *v;
> -}
> -  else
> -{
> -  T **v = m_map.get (uid);
> -  return v == NULL ? NULL : *v;
> -}
> -}
> -
>  template 
>  void
>  gt_ggc_mx(function_summary* const &summary)
> @@ -387,13 +368,19 @@ public:
>   If a summary for an edge does not exist, it will be created.  */
>T* get_create (cgraph_edge *edge)
>{
> -return get (edge->get_uid (), true);
> +bool existed;
> +T **v = &m_map.get_or_insert (edge->get_uid (), &existed);
> +if (!existed)
> +  *v = allocate_new ();
> +
> +return *v;
>}
>  
>/* Getter for summary callgraph edge pointer.  */
> -  T* get (cgraph_edge *edge)
> +  T* get (cgraph_edge *edge) ATTRIBUTE_PURE
>{
> -return get (edge->get_uid (), false);
> +T **v = m_map.get (edge->get_uid ());
> +return v == NULL ? NULL : *v;
>}
>  
>/* Remove edge from summary.  */
> @@ -437,9 +424,6 @@ protected:
>  private:
>typedef int_hash  map_hash;
>  
> -  /* Getter for summary callgraph ID.  */
> -  T *get (int uid, bool lazy_insert);
> -
>/* Main summary store, where summary ID is used as key.  */
>hash_map  m_map;
>/* Internal summary removal hook pointer.  */
> @@ -457,28 +441,6 @@ private:
>gt_pointer_operator, void *);
>  };
>  
> -template 
> -T*
> -call_summary::get (int uid, bool lazy_insert)
> -{
> -  gcc_checking_assert (uid > 0);
> -
> -  if (lazy_insert)
> -{
> -  bool existed;
> -  T **v = &m_map.get_or_insert (uid, &existed);
> -  if (!existed)
> - *v = allocate_new ();
> -
> -  return *v;
> -}
> -  else
> -{
> -  T **v = m_map.get (uid);
> -  return v == NULL ? NULL : *v;
> -}
> -}
> -
>  template 
>  void
>  call_summary::release ()
> -- 
> 2.17.1
>

Re: Cleanup DECL streaming

2018-06-21 Thread Jan Hubicka

Hi,
the problem here seems to be that is_cxx returns true while auto_die
and decltype_auto_die are not initialized.  Here die == NULL, which
compares equal to the NULL auto_die, so we end up calling
add_type_attribute by accident.

I am testing the following, but I am not sure it is a proper fix.  Are
we supposed to handle auto dies here somehow?

Index: dwarf2out.c
===
--- dwarf2out.c (revision 261841)
+++ dwarf2out.c (working copy)
@@ -22830,7 +22830,8 @@ gen_subprogram_die (tree decl, dw_die_re
  if (is_cxx () && debug_info_level > DINFO_LEVEL_TERSE)
{
  dw_die_ref die = get_AT_ref (old_die, DW_AT_type);
- if (die == auto_die || die == decltype_auto_die)
+ /* In LTO auto_die and decltype_auto_die may be NULL.  */
+ if (die && (die == auto_die || die == decltype_auto_die))
add_type_attribute (subr_die, TREE_TYPE (TREE_TYPE (decl)),
TYPE_UNQUALIFIED, false, context_die);
}
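
The effect of the added NULL check can be shown in isolation.  The
sketch below uses a made-up die_node type and free functions rather than
the dwarf2out.c data structures; it only illustrates that a NULL die
"matches" an uninitialized (NULL) auto_die unless it is guarded.

```cpp
#include <cassert>

struct die_node {};  // hypothetical stand-in for a DIE

// Mirrors the original test: true when all three pointers are NULL.
bool matches_auto_unguarded(const die_node* die, const die_node* auto_die,
                            const die_node* decltype_auto_die)
{
  return die == auto_die || die == decltype_auto_die;
}

// Mirrors the patched test: a NULL die never matches.
bool matches_auto_guarded(const die_node* die, const die_node* auto_die,
                          const die_node* decltype_auto_die)
{
  return die && (die == auto_die || die == decltype_auto_die);
}
```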

Re: [PATCH] PR libstdc++/70940 make pmr::resource_adaptor return aligned memory

2018-06-21 Thread Jonathan Wakely


On 21/06/18 16:22 +0100, Jonathan Wakely wrote:

> On 21/06/18 15:00 +0100, Jonathan Wakely wrote:
> 
> >  virtual void
> > -  do_deallocate(void* __p, size_t __bytes, size_t __alignment)
> > +  do_deallocate(void* __p, size_t __bytes, size_t __alignment) noexcept
> > +  override
> >  {
> > -   using _Aligned_alloc = std::__alloc_rebind<_Alloc, char>;
> > -   size_t __new_size = _S_aligned_size(__bytes,
> > -   _S_supported(__alignment) ?
> > -   __alignment : _S_max_align);
> > -   using _Ptr = typename allocator_traits<_Aligned_alloc>::pointer;
> > -   _Aligned_alloc(_M_alloc).deallocate(static_cast<_Ptr>(__p),
> > -   __new_size);
> > +   auto __ptr = static_cast<char*>(__p);
> > +   if (__alignment == 1)
> > + _M_alloc.deallocate(__ptr, __bytes);
> 
> Oops, this is missing a return!
> 
> Patch incoming ...


Fixed like so, with improved testing too.

Tested x86_64-linux, committed to trunk.
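
The effect of the missing return can be shown with a minimal sketch.  The
counters and function names below are hypothetical and only model the control
flow; they are not the libstdc++ implementation.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical counters standing in for the two deallocation paths.
struct Counts
{
  int direct = 0;   // plain _M_alloc.deallocate path
  int aligned = 0;  // path that recovers the original pointer first
};

// Without the early return, alignment == 1 falls through and the
// aligned path runs as well, i.e. the memory would be released twice.
void deallocate_buggy(Counts &c, std::size_t alignment)
{
  if (alignment == 1)
    ++c.direct;
  ++c.aligned;
}

// With the early return only one path runs for any alignment.
void deallocate_fixed(Counts &c, std::size_t alignment)
{
  if (alignment == 1)
    {
      ++c.direct;
      return;
    }
  ++c.aligned;
}

// Small self-check: the fixed version takes exactly one path.
void self_check()
{
  Counts c;
  deallocate_fixed(c, 1);
  assert(c.direct + c.aligned == 1);
}
```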


commit 3786ccc11448d536e95cbdb4ff02bb446a1d1fca
Author: redi 
Date:   Thu Jun 21 14:01:11 2018 +

PR libstdc++/70940 make pmr::resource_adaptor return aligned memory

PR libstdc++/70940
* include/experimental/memory_resource
(__resource_adaptor_imp::do_deallocate): Add missing return.
* testsuite/experimental/memory_resource/new_delete_resource.cc: New.
* testsuite/experimental/memory_resource/resource_adaptor.cc: Test
resource_adaptor with std::allocator, __gnu_cxx::new_allocator and
__gnu_cxx::malloc_allocator.

diff --git a/libstdc++-v3/include/experimental/memory_resource b/libstdc++-v3/include/experimental/memory_resource
index 3d2dce19868..8f5a8df14c9 100644
--- a/libstdc++-v3/include/experimental/memory_resource
+++ b/libstdc++-v3/include/experimental/memory_resource
@@ -408,7 +408,10 @@ namespace pmr {
   {
 	auto __ptr = static_cast(__p);
 	if (__alignment == 1)
-	  _M_alloc.deallocate(__ptr, __bytes);
+	  {
+	_M_alloc.deallocate(__ptr, __bytes);
+	return;
+	  }
 
 	const _AlignMgr __mgr(__bytes, __alignment);
 	// Use the stored token to retrieve the original pointer to deallocate.
diff --git a/libstdc++-v3/testsuite/experimental/memory_resource/new_delete_resource.cc b/libstdc++-v3/testsuite/experimental/memory_resource/new_delete_resource.cc
new file mode 100644
index 000..692e520bf9a
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/memory_resource/new_delete_resource.cc
@@ -0,0 +1,132 @@
+// Copyright (C) 2018 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+// { dg-do run { target c++14 } }
+
+#include 
+#include 
+
+bool new_called = false;
+bool delete_called = false;
+
+void* operator new(std::size_t n)
+{
+  new_called = true;
+  if (void* p = malloc(n))
+return p;
+  throw std::bad_alloc();
+}
+
+void operator delete(void* p)
+{
+  delete_called = true;
+  std::free(p);
+}
+
+void operator delete(void* p, std::size_t)
+{
+  ::operator delete(p);
+}
+
+template
+  bool aligned(void* p)
+  {
+return (reinterpret_cast(p) % A) == 0;
+  }
+
+template
+  bool aligned(void* p)
+  { return aligned(p); }
+
+// Extended alignments:
+constexpr std::size_t al6 = (1ul << 6), al12 = (1ul << 12), al18 = (1ul << 18);
+
+using std::experimental::pmr::memory_resource;
+using std::experimental::pmr::new_delete_resource;
+
+memory_resource* global = nullptr;
+
+void
+test01()
+{
+  memory_resource* r1 = new_delete_resource();
+  VERIFY( *r1 == *r1 );
+  memory_resource* r2 = new_delete_resource();
+  VERIFY( r1 == r2 );
+  VERIFY( *r1 == *r2 );
+  global = r1;
+}
+
+void
+test02()
+{
+  memory_resource* r1 = new_delete_resource();
+  VERIFY( r1 == global );
+  VERIFY( *r1 == *global );
+
+  new_called = false;
+  delete_called = false;
+  void* p = r1->allocate(1);
+  VERIFY( new_called );
+  VERIFY( ! delete_called );
+
+  new_called = false;
+  r1->deallocate(p, 1);
+  VERIFY( ! new_called );
+  VERIFY( delete_called );
+}
+
+void
+test03()
+{
+  using std::max_align_t;
+  using std::size_t;
+  void* p = nullptr;
+
+  memory_resource* r1 = new_delete_resource();
+  p = r1->allocate(1);
+  VERIFY( aligned(p) );
+  r1->deallocate(p, 1);
+  p = r1->allocate(1, alignof(short));
+  VERIFY( aligned(p) );
+

Re: [PATCH], PowerPC long double transition patches, v2, Patch #1 (disable long double multilib)

2018-06-21 Thread Michael Meissner

On Wed, Jun 20, 2018 at 07:31:31PM -0500, Segher Boessenkool wrote:
> On Wed, Jun 20, 2018 at 10:25:36AM -0400, Michael Meissner wrote:
> > This code disables the automatic multilib creation unless you use the
> > --with-advance-toolchain= option and the Advance Toolchain directory has
> > been modified to have the lib64/ieee128 and/or lib64/ibm128 directories for
> > multilib support.  This allows the multilib to still be created, but it is
> > not enabled by default.
> > 
> > Alternatively, I have a patch that disables the IEEE/IBM long double
> > multilib support completely.
> 
> So what are the advantages and the disadvantages of both approaches?
> And, why do you think this one is preferable?

The main advantage is that it makes it a little easier for me to test things,
since I only have to build one compiler instead of two.  But that is fairly
minor.

Here is the alternate patch to eliminate the multilib support for IEEE/IBM long
double.

[gcc]
2018-06-21  Michael Meissner  

* config.gcc (powerpc64le*): Remove multilib support for IEEE and
IBM long double.
* config/rs6000/rs6000.c (rs6000_option_override_internal):
Likewise.
* config/rs6000/rs6000.h (TARGET_IEEEQUAD_MULTILIB): Likewise.
* config/rs6000/t-ldouble-linux64le-ibm: Delete, IEEE/IBM long
double multilib no longer supported.
* config/rs6000/t-ldouble-linux64le-ieee: Likewise.
* doc/install.texi (PowerPC options): Delete information about
IEEE/IBM long double multilibs.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config.gcc
===
--- gcc/config.gcc  (revision 261755)
+++ gcc/config.gcc  (working copy)
@@ -4566,16 +4566,6 @@ case "${target}" in
elif test x$with_long_double_format = xibm; then
tm_defines="${tm_defines} TARGET_IEEEQUAD_DEFAULT=0"
fi
-
-   case "${target}:${enable_multilib}:${with_long_double_format}" in
-   powerpc64le*:yes:ieee | powerpc64le*:yes:ibm)
-   tm_defines="${tm_defines} TARGET_IEEEQUAD_MULTILIB=1"
-   tmake_file="${tmake_file} rs6000/t-ldouble-linux64le-${with_long_double_format}"
-   ;;
-   *)
-   :
-   ;;
-   esac
;;
 
s390*-*-*)
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 261755)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -4578,19 +4578,16 @@ rs6000_option_override_internal (bool gl
   else if (rs6000_long_double_type_size == 128)
 rs6000_long_double_type_size = FLOAT_PRECISION_TFmode;
 
-  /* Set -mabi=ieeelongdouble on some old targets.  In the future, power server
- systems will also set long double to be IEEE 128-bit.  AIX and Darwin
- explicitly redefine TARGET_IEEEQUAD and TARGET_IEEEQUAD_DEFAULT to 0, so
- those systems will not pick up this default.  Warn if the user changes the
- default unless either the user used the -Wno-psabi option, or the compiler
- was built to enable multilibs to switch between the two long double
- types.  */
+  /* Set -mabi=ieeelongdouble on some old targets.  Power server systems may
+ also set long double to be IEEE 128-bit using the configuration option
+ --with-long-double-format=ieee.  AIX and Darwin explicitly redefine
+ TARGET_IEEEQUAD and TARGET_IEEEQUAD_DEFAULT to 0, so those systems will
+ not pick up this default.  Warn if the user changes the default unless
+ either the user used the -Wno-psabi option.  */
   if (!global_options_set.x_rs6000_ieeequad)
 rs6000_ieeequad = TARGET_IEEEQUAD_DEFAULT;
 
-  else if (!TARGET_IEEEQUAD_MULTILIB
-  && rs6000_ieeequad != TARGET_IEEEQUAD_DEFAULT
-  && TARGET_LONG_DOUBLE_128)
+  else if (rs6000_ieeequad != TARGET_IEEEQUAD_DEFAULT && TARGET_LONG_DOUBLE_128)
 {
   static bool warned_change_long_double;
   if (!warned_change_long_double)
Index: gcc/config/rs6000/rs6000.h
===
--- gcc/config/rs6000/rs6000.h  (revision 261755)
+++ gcc/config/rs6000/rs6000.h  (working copy)
@@ -551,12 +551,6 @@ extern int rs6000_vector_align[];
 #define TARGET_ALTIVEC_ABI rs6000_altivec_abi
 #define TARGET_LDBRX (TARGET_POPCNTD || rs6000_cpu == PROCESSOR_CELL)
 
-/* Define as 1 if we support multilibs for switching long double between IEEE
-   128-bit floating point and IBM extended double.  */
-#ifndef TARGET_IEEEQUAD_MULTILIB
-#define TARGET_IEEEQUAD_MULTILIB 0
-#endif
-
 /* ISA 2.01 allowed FCFID to be done in 32-bit, previously it was 64-bit only.
Enable 32-bit fcfid's on any of the switches for newer

Re: [PATCH] Add checking that during RTL bbs don't mix EH and non-complex predecessor edges

2018-06-21 Thread Eric Botcazou

> OK, thanks.  I'll give it a try on x86/Windows once this is in.

The build of the Ada runtime miserably fails because finish_eh_generation 
calls commit_edge_insertions before redirecting the EH edges from the post-
landing pads to the landing pads...

Fixed thusly, applied on the mainline as obvious. I think that's good enough
because the only edge onto which instructions can be inserted is the entry
edge (including by the Alpha kludge).


* except.c (finish_eh_generation): Commit edge insertions only after the
EH edges have been redirected from post-landing to landing pads.

-- 
Eric Botcazou

Index: except.c
===
--- except.c	(revision 261832)
+++ except.c	(working copy)
@@ -1510,12 +1510,8 @@ finish_eh_generation (void)
 sjlj_build_landing_pads ();
   else
 dw2_build_landing_pads ();
-  break_superblocks ();
 
-  if (targetm_common.except_unwind_info (&global_options) == UI_SJLJ
-  /* Kludge for Alpha (see alpha_gp_save_rtx).  */
-  || single_succ_edge (ENTRY_BLOCK_PTR_FOR_FN (cfun))->insns.r)
-commit_edge_insertions ();
+  break_superblocks ();
 
   /* Redirect all EH edges from the post_landing_pad to the landing pad.  */
   FOR_EACH_BB_FN (bb, cfun)
@@ -1546,6 +1542,11 @@ finish_eh_generation (void)
 		   : EDGE_ABNORMAL);
 	}
 }
+
+  if (targetm_common.except_unwind_info (&global_options) == UI_SJLJ
+  /* Kludge for Alpha (see alpha_gp_save_rtx).  */
+  || single_succ_edge (ENTRY_BLOCK_PTR_FOR_FN (cfun))->insns.r)
+commit_edge_insertions ();
 }
 
 /* This section handles removing dead code for flow.  */

Do not emit unnecessary NOPs at -O0

2018-06-21 Thread Eric Botcazou

When code is compiled at -O0, the RTL middle-end makes sure that location 
information is preserved as much as possible by generating NOPs with the 
location information (goto_locus) present on edges in the CFG, if it thinks 
that these edges are the only place where a particular location is mentioned.

The attached patch prevents this from happening in a couple of cases:
 1. if the function has the DECL_IGNORED_P flag set,
 2. if the NOP is emitted by merge_blocks and the 2nd block is a forwarder 
block whose outgoing edge has no location, because in this case the location 
of the to-be-elided edge is copied onto the aforementioned outgoing edge.

Tested (GCC and GDB) on x86-64/Linux, applied on the mainline.
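
The two conditions above can be summarized as a small predicate.  This is only
a sketch: MergeContext and its fields are hypothetical names that model the
checks in the patch, and the real code still leaves the final decision to
emit_nop_for_unique_locus_between.

```cpp
#include <cassert>

// Hypothetical summary of the state the merge code looks at.
struct MergeContext
{
  bool optimize;              // true when compiling with optimization
  bool decl_ignored;          // DECL_IGNORED_P on the current function
  bool b_is_forwarder;        // second block is a forwarder block
  bool b_out_edge_has_locus;  // its outgoing edge already has a location
};

// Case 2: the locus of the elided edge can be copied onto B's outgoing edge.
bool forward_edge_locus(const MergeContext &c)
{
  return c.b_is_forwarder && !c.b_out_edge_has_locus;
}

// A NOP is only considered when not optimizing, when the locus cannot be
// forwarded, and when the function is not ignored for debug purposes.
bool may_emit_nop(const MergeContext &c)
{
  return !c.optimize && !forward_edge_locus(c) && !c.decl_ignored;
}

// Small self-check: optimizing builds never ask for a NOP.
void self_check()
{
  assert(!may_emit_nop({true, false, false, false}));
}
```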


2018-06-21  Eric Botcazou  

* cfgrtl.c (fixup_reorder_chain): Do not emit NOPs in DECL_IGNORED_P
functions.
(rtl_merge_blocks): Likewise.  Do not emit a NOP if the location of the
edge can be forwarded.
(cfg_layout_merge_blocks): Likewise.

-- 
Eric Botcazou

Index: cfgrtl.c
===
--- cfgrtl.c	(revision 261832)
+++ cfgrtl.c	(working copy)
@@ -813,10 +813,14 @@ emit_nop_for_unique_locus_between (basic
 static void
 rtl_merge_blocks (basic_block a, basic_block b)
 {
+  /* If B is a forwarder block whose outgoing edge has no location, we'll
+ propagate the locus of the edge between A and B onto it.  */
+  const bool forward_edge_locus
+= (b->flags & BB_FORWARDER_BLOCK) != 0
+  && LOCATION_LOCUS (EDGE_SUCC (b, 0)->goto_locus) == UNKNOWN_LOCATION;
   rtx_insn *b_head = BB_HEAD (b), *b_end = BB_END (b), *a_end = BB_END (a);
   rtx_insn *del_first = NULL, *del_last = NULL;
   rtx_insn *b_debug_start = b_end, *b_debug_end = b_end;
-  bool forwarder_p = (b->flags & BB_FORWARDER_BLOCK) != 0;
   int b_empty = 0;
 
   if (dump_file)
@@ -887,9 +891,11 @@ rtl_merge_blocks (basic_block a, basic_b
   BB_HEAD (b) = b_empty ? NULL : b_head;
   delete_insn_chain (del_first, del_last, true);
 
-  /* When not optimizing and the edge is the only place in RTL which holds
- some unique locus, emit a nop with that locus in between.  */
-  if (!optimize)
+  /* If not optimizing, preserve the locus of the single edge between
+ blocks A and B if necessary by emitting a nop.  */
+  if (!optimize
+  && !forward_edge_locus
+  && !DECL_IGNORED_P (current_function_decl))
 {
   emit_nop_for_unique_locus_between (a, b);
   a_end = BB_END (a);
@@ -918,9 +924,7 @@ rtl_merge_blocks (basic_block a, basic_b
 
   df_bb_delete (b->index);
 
-  /* If B was a forwarder block, propagate the locus on the edge.  */
-  if (forwarder_p
-  && LOCATION_LOCUS (EDGE_SUCC (b, 0)->goto_locus) == UNKNOWN_LOCATION)
+  if (forward_edge_locus)
 EDGE_SUCC (b, 0)->goto_locus = EDGE_SUCC (a, 0)->goto_locus;
 
   if (dump_file)
@@ -3916,9 +3920,9 @@ fixup_reorder_chain (void)
 	force_nonfallthru (e);
 }
 
-  /* Ensure goto_locus from edges has some instructions with that locus
- in RTL.  */
-  if (!optimize)
+  /* Ensure goto_locus from edges has some instructions with that locus in RTL
+ when not optimizing.  */
+  if (!optimize && !DECL_IGNORED_P (current_function_decl))
 FOR_EACH_BB_FN (bb, cfun)
   {
 edge e;
@@ -4605,7 +4609,11 @@ cfg_layout_can_merge_blocks_p (basic_blo
 static void
 cfg_layout_merge_blocks (basic_block a, basic_block b)
 {
-  bool forwarder_p = (b->flags & BB_FORWARDER_BLOCK) != 0;
+  /* If B is a forwarder block whose outgoing edge has no location, we'll
+ propagate the locus of the edge between A and B onto it.  */
+  const bool forward_edge_locus
+= (b->flags & BB_FORWARDER_BLOCK) != 0
+  && LOCATION_LOCUS (EDGE_SUCC (b, 0)->goto_locus) == UNKNOWN_LOCATION;
   rtx_insn *insn;
 
   gcc_checking_assert (cfg_layout_can_merge_blocks_p (a, b));
@@ -4626,9 +4634,11 @@ cfg_layout_merge_blocks (basic_block a,
 try_redirect_by_replacing_jump (EDGE_SUCC (a, 0), b, true);
   gcc_assert (!JUMP_P (BB_END (a)));
 
-  /* When not optimizing and the edge is the only place in RTL which holds
- some unique locus, emit a nop with that locus in between.  */
-  if (!optimize)
+  /* If not optimizing, preserve the locus of the single edge between
+ blocks A and B if necessary by emitting a nop.  */
+  if (!optimize
+  && !forward_edge_locus
+  && !DECL_IGNORED_P (current_function_decl))
 emit_nop_for_unique_locus_between (a, b);
 
   /* Move things from b->footer after a->footer.  */
@@ -4695,9 +4705,7 @@ cfg_layout_merge_blocks (basic_block a,
 
   df_bb_delete (b->index);
 
-  /* If B was a forwarder block, propagate the locus on the edge.  */
-  if (forwarder_p
-  && LOCATION_LOCUS (EDGE_SUCC (b, 0)->goto_locus) == UNKNOWN_LOCATION)
+  if (forward_edge_locus)
 EDGE_SUCC (b, 0)->goto_locus = EDGE_SUCC (a, 0)->goto_locus;
 
   if (dump_file)

Re: [patch] add -nolibc option

2018-06-21 Thread Olivier Hainque

Hello Joseph,

Thanks for getting back to me on this!

> On 19 Jun 2018, at 17:50, Joseph Myers  wrote:
> 
> On Thu, 7 Jun 2018, Olivier Hainque wrote:
> 
>> An updated version of the patch is attached, accounting for
>> your two comments and expanding on the .texi documentation a
>> bit. 
> 
> I see you're not changing LINK_GCC_C_SEQUENCE_SPEC in arc/elf.h.  That's a 
> slightly odd case in that it isn't actually using %L, but is using -lc 
> directly, whereas there's an empty definition of LIB_SPEC.

Indeed. I hadn't changed it because it was only added very recently
(like last week or so) and so wasn't there when I shaped the patch.

> I'd expect the documentation to say something about libraries added only 
> for particular languages by the front-end drivers (-lstdc++ -lm, 
> -lgfortran -lm, etc.).  It may be that the option isn't particularly 
> meaningful for code using such front-end drivers that add those libraries, 
> because those libraries depend on libc (and code in those languages will 
> generally depend on their libraries), but it should still say what the 
> effects are.

Agreed.

Attached is an updated version with the doc reworded to this
effect, referring to "the system C library or system libraries tightly
coupled with it", as opposed to "toolchain provided language support
libraries".

-lm is a bit annoying. There are very few explicit occurrences as
this is usually to be added by users. Still, among the few that are there,
some are in LIB_SPEC and some are not. I have qualified this with an
"in some configurations" which has the advantage of conveying honestly
that the effect isn't very precisely defined.

It has been working well for the uses we had, indeed on bareboard
configurations which is the stated intent.

Rebootstrapping on x86_64-linux now. If there's meaningful extra
testing you think I could make, I'll be happy to comply.

Thanks again for your input.

With Kind Regards,

Olivier

nolibc2.diff
Description: Binary data

Re: [PATCH], PowerPC long double transition patches, v2, Patch #1 (disable long double multilib)

2018-06-21 Thread Segher Boessenkool

On Thu, Jun 21, 2018 at 12:58:06PM -0400, Michael Meissner wrote:
> On Wed, Jun 20, 2018 at 07:31:31PM -0500, Segher Boessenkool wrote:
> > On Wed, Jun 20, 2018 at 10:25:36AM -0400, Michael Meissner wrote:
> > > This code disables the automatic multilib creation unless you use the
> > > --with-advance-toolchain= option and the Advance Toolchain directory
> > > has been modified to have the lib64/ieee128 and/or lib64/ibm128
> > > directories for multilib support.  This allows the multilib to still be
> > > created, but it is not enabled by default.
> > > 
> > > Alternatively, I have a patch that disables the IEEE/IBM long double
> > > multilib support completely.
> > 
> > So what are the advantages and the disadvantages of both approaches?
> > And, why do you think this one is preferable?
> 
> The main advantage is that it makes it a little easier for me to test things,
> since I only have to build one compiler instead of two.  But that is fairly
> minor.
> 
> Here is the alternate patch to eliminate the multilib support for IEEE/IBM
> long double.
> 
> [gcc]
> 2018-06-21  Michael Meissner  
> 
>   * config.gcc (powerpc64le*): Remove multilib support for IEEE and
>   IBM long double.
>   * config/rs6000/rs6000.c (rs6000_option_override_internal):
>   Likewise.
>   * config/rs6000/rs6000.h (TARGET_IEEEQUAD_MULTILIB): Likewise.
>   * config/rs6000/t-ldouble-linux64le-ibm: Delete, IEEE/IBM long
>   double multilib no longer supported.
>   * config/rs6000/t-ldouble-linux64le-ieee: Likewise.
>   * doc/install.texi (PowerPC options): Delete information about
>   IEEE/IBM long double multilibs.

This reverts r256775 (the commit message should say), except the linux64.h
parts, and it seems you just missed those?


Segher

Re: [PATCH], PowerPC long double transition patches, v2, Patch #1 (disable long double multilib)

2018-06-21 Thread Michael Meissner

On Thu, Jun 21, 2018 at 12:09:12PM -0500, Segher Boessenkool wrote:
> On Thu, Jun 21, 2018 at 12:58:06PM -0400, Michael Meissner wrote:
> > On Wed, Jun 20, 2018 at 07:31:31PM -0500, Segher Boessenkool wrote:
> > > On Wed, Jun 20, 2018 at 10:25:36AM -0400, Michael Meissner wrote:
> > > > This code disables the automatic multilib creation unless you use the
> > > > --with-advance-toolchain= option and the Advance Toolchain directory
> > > > has been modified to have the lib64/ieee128 and/or lib64/ibm128
> > > > directories for multilib support.  This allows the multilib to still
> > > > be created, but it is not enabled by default.
> > > > 
> > > > Alternatively, I have a patch that disables the IEEE/IBM long double
> > > > multilib support completely.
> > > 
> > > So what are the advantages and the disadvantages of both approaches?
> > > And, why do you think this one is preferable?
> > 
> > The main advantage is that it makes it a little easier for me to test things,
> > since I only have to build one compiler instead of two.  But that is fairly
> > minor.
> > 
> > Here is the alternate patch to eliminate the multilib support for IEEE/IBM
> > long double.
> > 
> > [gcc]
> > 2018-06-21  Michael Meissner  
> > 
> > * config.gcc (powerpc64le*): Remove multilib support for IEEE and
> > IBM long double.
> > * config/rs6000/rs6000.c (rs6000_option_override_internal):
> > Likewise.
> > * config/rs6000/rs6000.h (TARGET_IEEEQUAD_MULTILIB): Likewise.
> > * config/rs6000/t-ldouble-linux64le-ibm: Delete, IEEE/IBM long
> > double multilib no longer supported.
> > * config/rs6000/t-ldouble-linux64le-ieee: Likewise.
> > * doc/install.texi (PowerPC options): Delete information about
> > IEEE/IBM long double multilibs.
> 
> This reverts r256775 (the commit message should say), except the linux64.h
> parts, and it seems you just missed those?

No, it only reverts part of r256775.  The main part of the r256775 change is to
add support for --with-long-double-format={ieee,ibm}, which we need to keep.

Yeah, I probably missed them.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797

Re: [Patch, fortran] PR83118 - [7/8/9 Regression] Bad intrinsic assignment of class(*) array component of derived type

2018-06-21 Thread Paul Richard Thomas

Thanks, Steve. Committed to trunk as revision 261857.

8-branch will be patched in a few days. Any opinions about 7-branch?

Cheers

Paul


On 21 June 2018 at 16:23, Steve Kargl  wrote:
> On Thu, Jun 21, 2018 at 09:02:53AM +0100, Paul Richard Thomas wrote:
>> The original problem was fixed by the patch for PR84546. This patch
>> fixes a variant that appears in comment #6.
>>
>> The fix is completely straightforward and described by the comments
>> and ChangeLogs.
>>
>> Bootstrapped and regtested on FC28/x86_64 - OK for trunk?
>>
>
> OK.
>
> --
> Steve



-- 
"If you can't explain it simply, you don't understand it well enough"
- Albert Einstein

Re: [PATCH 10/11] Fix LRA to handle multi-word eliminable registers

2018-06-21 Thread Vladimir Makarov





On 06/13/2018 02:58 PM, Dimitar Dimitrov wrote:

From: Dimitar Dimitrov 

For some targets, Pmode != UNITS_PER_WORD. Take this into account
when marking hard registers as being used.

I tested C and C++ testsuites for x86_64 with and without this
patch. There was no regression, i.e. gcc.sum and g++.sum matched
exactly.

gcc/ChangeLog:

2018-06-13  Dimitar Dimitrov  

* lra-eliminations.c (set_ptr_hard_reg_bits): New function.
(update_reg_eliminate): Mark all spanning hw registers.

gcc/testsuite/ChangeLog:

2018-06-13  Dimitar Dimitrov  

* gcc.target/pru/lra-framepointer-fragmentation-1.c: New test.
* gcc.target/pru/lra-framepointer-fragmentation-2.c: New test.

Cc: Vladimir Makarov 
Cc: Peter Bergner 
Cc: Kenneth Zadeck 
Cc: Seongbae Park 
Signed-off-by: Dimitar Dimitrov 
---
  gcc/lra-eliminations.c | 14 -
  .../pru/lra-framepointer-fragmentation-1.c | 33 
  .../pru/lra-framepointer-fragmentation-2.c | 61 ++
  3 files changed, 106 insertions(+), 2 deletions(-)
  create mode 100644 gcc/testsuite/gcc.target/pru/lra-framepointer-fragmentation-1.c
  create mode 100644 gcc/testsuite/gcc.target/pru/lra-framepointer-fragmentation-2.c

diff --git a/gcc/lra-eliminations.c b/gcc/lra-eliminations.c
index 21d8d5f8018..566cc2c8248 100644
--- a/gcc/lra-eliminations.c
+++ b/gcc/lra-eliminations.c
@@ -1180,6 +1180,16 @@ spill_pseudos (HARD_REG_SET set)
bitmap_clear (&to_process);
  }
  
+static void set_ptr_hard_reg_bits (HARD_REG_SET *hard_reg_set, int r)

+{
+  int w;
+
+  for (w = 0; w < GET_MODE_SIZE (Pmode); w += UNITS_PER_WORD, r++)
+{
+ SET_HARD_REG_BIT (*hard_reg_set, r);
+}
+}
+

The patch itself is ok but for uniformity I'd use

    for (int i = hard_regno_nregs (r, Pmode) - 1; i >= 0; i--)
  SET_HARD_REG_BIT (*hard_reg_set, r + i);


Approved with the above change.
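
As a sanity check of the suggestion, here is a rough standalone sketch showing
that both loop shapes mark the same registers.  It rests on an assumption:
nregs_for models hard_regno_nregs as ceil(mode size / word size), which is a
simplification of the real target hook, and RegSet is a hypothetical stand-in
for HARD_REG_SET.

```cpp
#include <bitset>
#include <cassert>

// Hypothetical stand-in for HARD_REG_SET.
using RegSet = std::bitset<64>;

// Assumed model of hard_regno_nregs: registers needed for a value of
// mode_size bytes when each register holds units_per_word bytes.
int nregs_for(int mode_size, int units_per_word)
{
  return (mode_size + units_per_word - 1) / units_per_word;
}

// Loop shape from the submitted patch: step through the pointer width.
void set_bits_by_width(RegSet &set, int r, int mode_size, int units_per_word)
{
  for (int w = 0; w < mode_size; w += units_per_word, r++)
    set.set(r);
}

// Loop shape suggested in the review: iterate over the register count.
void set_bits_by_nregs(RegSet &set, int r, int mode_size, int units_per_word)
{
  for (int i = nregs_for(mode_size, units_per_word) - 1; i >= 0; i--)
    set.set(r + i);
}

// Small self-check: a 4-byte pointer in 1-byte registers spans 4 registers.
void self_check()
{
  RegSet a, b;
  set_bits_by_width(a, 16, 4, 1);
  set_bits_by_nregs(b, 16, 4, 1);
  assert(a == b);
}
```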

C++ PATCHes for improved template memory use

2018-06-21 Thread Jason Merrill

I've been looking at -fmem-report for a testcase from someone on the
committee, and found a couple of low-hanging fruits.

The first patch doesn't change any allocation, just makes it so
-fmem-report can see who's calling cxx_make_type.

The second patch avoids creating a vector (which we then never freed)
every time we push_to_top_level for doing template instantiation.
This alone improved peak memory use by 30%.

The third patch avoids redundantly creating level-lowered versions of
TEMPLATE_TYPE_PARM, as we had already been doing for non-type
parameters.  This improved peak memory use by another 20%.

Unfortunately, these don't affect the 80290 testcase significantly.

Tested x86_64-pc-linux-gnu, applying to trunk.
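
The idea behind the second patch can be sketched outside of GCC as follows.
This is an illustration under assumptions, not the name-lookup.c code: Scope,
enter_scope_eager, enter_scope_lazy and push_lang are hypothetical names, and
std::vector stands in for GCC's vec.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical saved-scope record; lang_base stays null until needed.
struct Scope
{
  std::unique_ptr<std::vector<int>> lang_base;
};

// Eager variant: every scope entry pays for a vector, used or not.
Scope enter_scope_eager()
{
  Scope s;
  s.lang_base.reset(new std::vector<int>());
  s.lang_base->reserve(10);
  return s;
}

// Lazy variant: entering a scope allocates nothing.
Scope enter_scope_lazy()
{
  return Scope{};
}

// Allocation happens on the first push only.
void push_lang(Scope &s, int lang)
{
  if (!s.lang_base)
    s.lang_base.reset(new std::vector<int>());
  s.lang_base->push_back(lang);
}

// Small self-check: a lazily entered scope owns no vector.
void self_check()
{
  Scope s = enter_scope_lazy();
  assert(!s.lang_base);
}
```

If most scope entries never push a language, the lazy variant avoids one
allocation per entry.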
commit 472c58fd2b3620b8461c59af58af056e53793d85
Author: Jason Merrill 
Date:   Wed Jun 20 15:01:26 2018 -0400

Let -fmem-report see callers of cxx_make_type.

* lex.c (cxx_make_type): Add MEM_STAT_DECL.
(make_class_type): Likewise.
(cxx_make_type_hook): New.
* cp-objcp-common.h (LANG_HOOKS_MAKE_TYPE): Use cxx_make_type_hook.

diff --git a/gcc/cp/cp-objcp-common.h b/gcc/cp/cp-objcp-common.h
index 18ccc5bb6cf..6f08253f70f 100644
--- a/gcc/cp/cp-objcp-common.h
+++ b/gcc/cp/cp-objcp-common.h
@@ -34,6 +34,7 @@ extern tree cp_unit_size_without_reusable_padding (tree);
 extern tree cp_get_global_decls ();
 extern tree cp_pushdecl (tree);
 extern void cp_register_dumps (gcc::dump_manager *);
+extern tree cxx_make_type_hook			(tree_code);
 
 /* Lang hooks that are shared between C++ and ObjC++ are defined here.  Hooks
specific to C++ or ObjC++ go in cp/cp-lang.c and objcp/objcp-lang.c,
@@ -126,7 +127,7 @@ extern void cp_register_dumps (gcc::dump_manager *);
 #define LANG_HOOKS_TREE_DUMP_TYPE_QUALS_FN cp_type_quals
 
 #undef LANG_HOOKS_MAKE_TYPE
-#define LANG_HOOKS_MAKE_TYPE cxx_make_type
+#define LANG_HOOKS_MAKE_TYPE cxx_make_type_hook
 #undef LANG_HOOKS_TYPE_FOR_MODE
 #define LANG_HOOKS_TYPE_FOR_MODE c_common_type_for_mode
 #undef LANG_HOOKS_TYPE_FOR_SIZE
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index ee9242f9313..0994377e5d7 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -6495,8 +6495,8 @@ extern void retrofit_lang_decl			(tree);
 extern void fit_decomposition_lang_decl		(tree, tree);
 extern tree copy_decl(tree CXX_MEM_STAT_INFO);
 extern tree copy_type(tree CXX_MEM_STAT_INFO);
-extern tree cxx_make_type			(enum tree_code);
-extern tree make_class_type			(enum tree_code);
+extern tree cxx_make_type			(enum tree_code CXX_MEM_STAT_INFO);
+extern tree make_class_type			(enum tree_code CXX_MEM_STAT_INFO);
 extern const char *get_identifier_kind_name	(tree);
 extern void set_identifier_kind			(tree, cp_identifier_kind);
 extern bool cxx_init(void);
diff --git a/gcc/cp/lex.c b/gcc/cp/lex.c
index c47ae1dd5a1..bd5d507e97b 100644
--- a/gcc/cp/lex.c
+++ b/gcc/cp/lex.c
@@ -852,9 +852,9 @@ maybe_add_lang_type_raw (tree t)
 }
 
 tree
-cxx_make_type (enum tree_code code)
+cxx_make_type (enum tree_code code MEM_STAT_DECL)
 {
-  tree t = make_node (code);
+  tree t = make_node (code PASS_MEM_STAT);
 
   if (maybe_add_lang_type_raw (t))
 {
@@ -868,10 +868,18 @@ cxx_make_type (enum tree_code code)
   return t;
 }
 
+/* A wrapper without the memory stats for LANG_HOOKS_MAKE_TYPE.  */
+
 tree
-make_class_type (enum tree_code code)
+cxx_make_type_hook (enum tree_code code)
 {
-  tree t = cxx_make_type (code);
+  return cxx_make_type (code);
+}
+
+tree
+make_class_type (enum tree_code code MEM_STAT_DECL)
+{
+  tree t = cxx_make_type (code PASS_MEM_STAT);
   SET_CLASS_TYPE_P (t, 1);
   return t;
 }
commit 0fb3fe86c4e32f8f72976d78f8af5b5229868129
Author: Jason Merrill 
Date:   Wed Jun 20 12:05:34 2018 -0400

Reduce garbage from push_to_top_level.

* name-lookup.c (do_push_to_top_level): Don't allocate
current_lang_base.
(do_pop_from_top_level): Release current_lang_base.

diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
index ec001016d3e..a30c37428ad 100644
--- a/gcc/cp/name-lookup.c
+++ b/gcc/cp/name-lookup.c
@@ -6852,7 +6852,7 @@ do_push_to_top_level (void)
 
   scope_chain = s;
   current_function_decl = NULL_TREE;
-  vec_alloc (current_lang_base, 10);
+  current_lang_base = NULL;
   current_lang_name = lang_name_cplusplus;
   current_namespace = global_namespace;
   push_class_stack ();
@@ -6872,7 +6872,7 @@ do_pop_from_top_level (void)
 invalidate_class_lookup_cache ();
   pop_class_stack ();
 
-  current_lang_base = 0;
+  release_tree_vector (current_lang_base);
 
   scope_chain = s->prev;
   FOR_EACH_VEC_SAFE_ELT (s->old_bindings, i, saved)
commit 93ca12c981968b5d545f1428d7b6d47557f207f7
Author: Jason Merrill 
Date:   Wed Jun 20 12:36:29 2018 -0400

 * pt.c (tsubst) [TEMPLATE_TYPE_PARM]: Use TEMPLATE_PARM_DESCENDANTS.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index c5433dc46ae..69e9479302e 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -14472,6 +14472,15 @@

C++ PATCH for c++/86184, rejects-valid with ?: and omitted operand

2018-06-21 Thread Marek Polacek

The following testcase is rejected because, for this line:

  bool b = X() ?: false;

arg2 is missing and arg1 is a TARGET_EXPR.  A TARGET_EXPR is a class
prvalue so we wrap it in a SAVE_EXPR.  Later when building 'this' we
call build_this (SAVE_EXPR >) which triggers lvalue_error:
 5856   cp_lvalue_kind kind = lvalue_kind (arg);
 5857   if (kind == clk_none)
 5858 {
 5859   if (complain & tf_error)
 5860 lvalue_error (input_location, lv_addressof);
because all SAVE_EXPRs are non-lvalue.

Since
a) cp_build_addr_expr_1 can process xvalues and class prvalues,
b) TARGET_EXPRs are only evaluated once (gimplify_target_expr),
I thought we could do the following.  The testcase ensures that
with the omitted operand we only construct X once.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2018-06-21  Marek Polacek  

PR c++/86184
* call.c (build_conditional_expr_1): Don't wrap TARGET_EXPRs
in a SAVE_EXPR.

* g++.dg/ext/cond3.C: New test.

--- gcc/cp/call.c
+++ gcc/cp/call.c
@@ -4806,6 +4806,10 @@ build_conditional_expr_1 (location_t loc, tree arg1, tree arg2, tree arg3,
   /* Make sure that lvalues remain lvalues.  See g++.oliva/ext1.C.  */
   if (lvalue_p (arg1))
arg2 = arg1 = cp_stabilize_reference (arg1);
+  else if (TREE_CODE (arg1) == TARGET_EXPR)
+   /* TARGET_EXPRs are only expanded once, don't wrap it in a SAVE_EXPR,
+  rendering it clk_none of clk_class.  */
+   arg2 = arg1;
   else
arg2 = arg1 = cp_save_expr (arg1);
 }
--- gcc/testsuite/g++.dg/ext/cond3.C
+++ gcc/testsuite/g++.dg/ext/cond3.C
@@ -0,0 +1,20 @@
+// PR c++/86184
+// { dg-do run }
+// { dg-options "" }
+
+int j;
+struct X {
+  X() { j++; }
+  operator bool() { return true; }
+};
+
+/* Only create X once.  */
+bool b = X() ?: false;
+bool b2 = X() ? X() : false;
+
+int
+main ()
+{
+  if (j != 3)
+__builtin_abort ();
+}

Re: [Patch, fortran] PR83118 - [7/8/9 Regression] Bad intrinsic assignment of class(*) array component of derived type

2018-06-21 Thread Steve Kargl

On Thu, Jun 21, 2018 at 06:36:28PM +0100, Paul Richard Thomas wrote:
> Thanks, Steve. Committed to trunk as revision 261857.
> 
> 8-branch will be patched in a few days. Any opinions about 7-branch?
> 

Well, from a selfish standpoint, I use 7 as my day-to-day
Fortran compiler at work, so a backport would be nice.
OTOH, I don't use the language feature affected by this bug,
so it won't affect me.  Thus, I would rather have you
spend your free time on other bug reports. 

-- 
Steve

Re: [PATCH], PowerPC long double transition patches, v2, Patch #1 (disable long double multilib)

2018-06-21 Thread Segher Boessenkool

On Thu, Jun 21, 2018 at 01:26:19PM -0400, Michael Meissner wrote:
> On Thu, Jun 21, 2018 at 12:09:12PM -0500, Segher Boessenkool wrote:
> > >   * config.gcc (powerpc64le*): Remove multilib support for IEEE and
> > >   IBM long double.
> > >   * config/rs6000/rs6000.c (rs6000_option_override_internal):
> > >   Likewise.
> > >   * config/rs6000/rs6000.h (TARGET_IEEEQUAD_MULTILIB): Likewise.
> > >   * config/rs6000/t-ldouble-linux64le-ibm: Delete, IEEE/IBM long
> > >   double multilib no longer supported.
> > >   * config/rs6000/t-ldouble-linux64le-ieee: Likewise.
> > >   * doc/install.texi (PowerPC options): Delete information about
> > >   IEEE/IBM long double multilibs.
> > 
> > This reverts r256775 (the commit message should say), except the linux64.h
> > parts, and it seems you just missed those?
> 
> No it only reverts part of r256775.  The main part of the r256775 change is to
> add support for --with-long-double-format={ieee,ibm} which we need to keep.

That is r256558 as far as I see?


Segher

Re: Do not emit unnecessary NOPs at -O0

2018-06-21 Thread Jeff Law

On 06/21/2018 11:04 AM, Eric Botcazou wrote:
> When code is compiled at -O0, the RTL middle-end makes sure that location 
> information is preserved as much as possible by generating NOPs with the 
> location information (goto_locus) present on edges in the CFG, if it thinks 
> that these edges are the only place where a particular location is mentioned.
> 
> The attached patch prevents this from happening in a couple of cases:
>  1. if the function has the DECL_IGNORED_P flag set,
>  2. if the NOP is emitted by merge_blocks and the 2nd block is a forwarder 
> block whose outgoing edge has no location, because in this case the location 
> of the to-be-elided edge is copied onto the aforementioned outgoing edge.
> 
> Tested (GCC and GDB) on x86-64/Linux, applied on the mainline.
> 
> 
> 2018-06-21  Eric Botcazou  
> 
>   * cfgrtl.c (fixup_reorder_chain): Do not emit NOPs in DECL_IGNORED_P
>   functions.
>   (rtl_merge_blocks): Likewise.  Do not emit a NOP if the location of the
>   edge can be forwarded.
>   (cfg_layout_merge_blocks): Likewise.
Funny Alex and I were just talking about the need to stuff away debug
info on edges for gimple.   Attaching it to gimple nops sounds like a
fairly straightforward approach :-)

jeff

[PATCH, rs6000] PR target/86222 fix truncation issue with constants when compiling -m32

2018-06-21 Thread Aaron Sawdey

expand_strn_compare was not using gen_int_mode or trunc_int_for_mode to
properly truncate to Pmode when creating constants in the generated rtx.
This led to an improper constant and the ICE in PR/86222.
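
To illustrate the kind of truncation meant here, the following is a minimal standalone sketch, not GCC's implementation: it sign-extends the low bits of a host-wide integer to a given width, which is roughly what a constant has to look like to be valid in a narrower mode. The helper name truncate_to_width and its bit-width parameter are hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper (not a GCC function): keep only the low `bits` bits
// of `value` and sign-extend them back to 64 bits, so that the result is
// the canonical representation of the value in a `bits`-wide integer.
std::int64_t truncate_to_width(std::int64_t value, unsigned bits) {
  if (bits == 0 || bits >= 64)
    return value;  // nothing to truncate for a full-width (or degenerate) request
  const std::uint64_t mask = (std::uint64_t{1} << bits) - 1;
  const std::uint64_t sign = std::uint64_t{1} << (bits - 1);
  const std::uint64_t low = static_cast<std::uint64_t>(value) & mask;
  // (low ^ sign) - sign propagates the top kept bit into the upper bits.
  return static_cast<std::int64_t>((low ^ sign) - sign);
}
```

With a 32-bit width, a length such as 0xFFFFFFFF becomes -1 in this sketch, whereas passing the untruncated 64-bit value through would leave a constant that does not fit the narrower mode.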

Testing on ppc64 with -m32, -m32 -mpowerpc64 and -m64. If regstrap
passes, ok for trunk and backport to 8?

Thanks, 
   Aaron


2018-06-19  Aaron Sawdey  

* config/rs6000/rs6000-string.c (expand_strn_compare): Handle -m32
correctly.

-- 
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain

Index: gcc/config/rs6000/rs6000-string.c
===
--- gcc/config/rs6000/rs6000-string.c	(revision 261850)
+++ gcc/config/rs6000/rs6000-string.c	(working copy)
@@ -1925,20 +1925,15 @@
 	  /* -m32 -mpowerpc64 results in word_mode being DImode even
 	 though otherwise it is 32-bit. The length arg to strncmp
 	 is a size_t which will be the same size as pointers.  */
-	  rtx len_rtx;
-	  if (TARGET_64BIT)
-	len_rtx = gen_reg_rtx (DImode);
-	  else
-	len_rtx = gen_reg_rtx (SImode);
+	  rtx len_rtx = gen_reg_rtx (Pmode);
+	  emit_move_insn (len_rtx, gen_int_mode (bytes, Pmode));
 
-	  emit_move_insn (len_rtx, bytes_rtx);
-
 	  tree fun = builtin_decl_explicit (BUILT_IN_STRNCMP);
 	  emit_library_call_value (XEXP (DECL_RTL (fun), 0),
    target, LCT_NORMAL, GET_MODE (target),
    force_reg (Pmode, src1_addr), Pmode,
    force_reg (Pmode, src2_addr), Pmode,
-   len_rtx, GET_MODE (len_rtx));
+   len_rtx, Pmode);
 	}
 
   rtx fin_ref = gen_rtx_LABEL_REF (VOIDmode, final_label);
@@ -2126,18 +2121,12 @@
 	}
   else
 	{
-	  rtx len_rtx;
-	  if (TARGET_64BIT)
-	len_rtx = gen_reg_rtx (DImode);
-	  else
-	len_rtx = gen_reg_rtx (SImode);
-
-	  emit_move_insn (len_rtx, GEN_INT (bytes - compare_length));
+	  rtx len_rtx = gen_reg_rtx (Pmode);
+	  emit_move_insn (len_rtx, gen_int_mode (bytes-compare_length, Pmode));
 	  tree fun = builtin_decl_explicit (BUILT_IN_STRNCMP);
 	  emit_library_call_value (XEXP (DECL_RTL (fun), 0),
    target, LCT_NORMAL, GET_MODE (target),
-   src1, Pmode, src2, Pmode,
-   len_rtx, GET_MODE (len_rtx));
+   src1, Pmode, src2, Pmode, len_rtx, Pmode);
 	}
 
   rtx fin_ref = gen_rtx_LABEL_REF (VOIDmode, final_label);

[PATCH] config/abi/post/x86_64-linux-gnu/baseline_symbols.txt: Update.

2018-06-21 Thread Jonathan Wakely


Apparently we never updated the x86_64-linux-gnu baseline for gcc-8
(so now that I'm trying to add a new symbol version on trunk, I'm
seeing errors for the get_entropy symbol added in the previous symbol
version).

Tested x86_64-linux, committed to gcc-8-branch.


commit d62e12c7a1586c397594ef9671b92ee265787b8a
Author: Jonathan Wakely 
Date:   Thu Jun 21 21:37:47 2018 +0100

* config/abi/post/x86_64-linux-gnu/baseline_symbols.txt: Update.

diff --git a/libstdc++-v3/config/abi/post/x86_64-linux-gnu/baseline_symbols.txt b/libstdc++-v3/config/abi/post/x86_64-linux-gnu/baseline_symbols.txt
index a31597e906f..7027df688e0 100644
--- a/libstdc++-v3/config/abi/post/x86_64-linux-gnu/baseline_symbols.txt
+++ b/libstdc++-v3/config/abi/post/x86_64-linux-gnu/baseline_symbols.txt
@@ -444,6 +444,7 @@ FUNC:_ZNKSt13basic_fstreamIwSt11char_traitsIwEE7is_openEv@GLIBCXX_3.4
 FUNC:_ZNKSt13basic_istreamIwSt11char_traitsIwEE6gcountEv@@GLIBCXX_3.4
 FUNC:_ZNKSt13basic_istreamIwSt11char_traitsIwEE6sentrycvbEv@@GLIBCXX_3.4
 FUNC:_ZNKSt13basic_ostreamIwSt11char_traitsIwEE6sentrycvbEv@@GLIBCXX_3.4
+FUNC:_ZNKSt13random_device13_M_getentropyEv@@GLIBCXX_3.4.25
 FUNC:_ZNKSt13runtime_error4whatEv@@GLIBCXX_3.4
 FUNC:_ZNKSt14basic_ifstreamIcSt11char_traitsIcEE5rdbufEv@@GLIBCXX_3.4
 FUNC:_ZNKSt14basic_ifstreamIcSt11char_traitsIcEE7is_openEv@@GLIBCXX_3.4.5
@@ -4004,6 +4005,8 @@ OBJECT:0:GLIBCXX_3.4.20
 OBJECT:0:GLIBCXX_3.4.21
 OBJECT:0:GLIBCXX_3.4.22
 OBJECT:0:GLIBCXX_3.4.23
+OBJECT:0:GLIBCXX_3.4.24
+OBJECT:0:GLIBCXX_3.4.25
 OBJECT:0:GLIBCXX_3.4.3
 OBJECT:0:GLIBCXX_3.4.4
 OBJECT:0:GLIBCXX_3.4.5

Re: [PATCH, rs6000] PR target/86222 fix truncation issue with constants when compiling -m32

2018-06-21 Thread Segher Boessenkool

Hi!

On Thu, Jun 21, 2018 at 03:32:25PM -0500, Aaron Sawdey wrote:
> expand_strn_compare was not using gen_int_mode or trunc_int_for_mode to
> properly truncate to Pmode when creating constants in the generated rtx.
> This led to an improper constant and the ICE in PR/86222.
> 
> Testing on ppc64 with -m32, -m32 -mpowerpc64 and -m64. If regstrap
> passes, ok for trunk and backport to 8?

Okay for trunk and also for 8.  Nit:

> +   rtx len_rtx = gen_reg_rtx (Pmode);
> +   emit_move_insn (len_rtx, gen_int_mode (bytes-compare_length, Pmode));

Spaces around the - please.

Thanks,


Segher

Re: [PATCH] config/abi/post/x86_64-linux-gnu/baseline_symbols.txt: Update.

2018-06-21 Thread Jonathan Wakely


On 21/06/18 21:39 +0100, Jonathan Wakely wrote:

Apparently we never updated the x86_64-linux-gnu baseline for gcc-8
(so now that I'm trying to add a new symbol version on trunk, I'm
seeing errors for the get_entropy symbol added in the previous symbol
version).


The baseline should have been updated after I fixed PR 81092, which
added a new symbol version and exported some new symbols for 32-bit
targets with that version. Because there were no new exports for
x86_64-linux-gnu the baseline file didn't need to be updated. But it
should have been anyway just to add the new GLIBCXX_3.4.24 version
(even though for x86_64-linux-gnu there are no symbols with that
version).


Tested x86_64-linux, committed to gcc-8-branch.


I'll commit the same patch to trunk (which currently has identical
exports to gcc-8-branch).

Here's the equivalent patch for gcc-7-branch.


commit eb9563208102a79d5e8f3f26c5356187a941ea53
Author: Jonathan Wakely 
Date:   Thu Jun 21 21:49:13 2018 +0100

* config/abi/post/x86_64-linux-gnu/baseline_symbols.txt: Update.

diff --git a/libstdc++-v3/config/abi/post/x86_64-linux-gnu/baseline_symbols.txt b/libstdc++-v3/config/abi/post/x86_64-linux-gnu/baseline_symbols.txt
index a31597e906f..06c61236f34 100644
--- a/libstdc++-v3/config/abi/post/x86_64-linux-gnu/baseline_symbols.txt
+++ b/libstdc++-v3/config/abi/post/x86_64-linux-gnu/baseline_symbols.txt
@@ -4004,6 +4004,7 @@ OBJECT:0:GLIBCXX_3.4.20
 OBJECT:0:GLIBCXX_3.4.21
 OBJECT:0:GLIBCXX_3.4.22
 OBJECT:0:GLIBCXX_3.4.23
+OBJECT:0:GLIBCXX_3.4.24
 OBJECT:0:GLIBCXX_3.4.3
 OBJECT:0:GLIBCXX_3.4.4
 OBJECT:0:GLIBCXX_3.4.5

Re: [PATCH], PowerPC long double transition patches, v2, Patch #2 (add missing conversion insn)

2018-06-21 Thread Segher Boessenkool

Hi!

On Wed, Jun 20, 2018 at 10:32:06AM -0400, Michael Meissner wrote:
> In reworking the ordering of the 128-bit floating point modes (June 18th, 2018
> patch), I missed one conversion insn.  This meant the compiler would generate a
> conversion to using the IF name.

Is there some existing test this fixes?  Which?

Okay for trunk.


Segher

Re: [PATCH], PowerPC long double transition patches, v2, Patch #3 (use correct way to get the IEEE 128-bit complex type)

2018-06-21 Thread Segher Boessenkool

On Wed, Jun 20, 2018 at 10:38:10AM -0400, Michael Meissner wrote:
> This patch fixes the tests that use IEEE 128-bit float complex to use long
> double _Complex on systems where the default is IEEE 128-bit.  Due to needing
> to use the same internal type for long double and __float128 (to get the
> mangling correct and make templates work), you can't really use KF or KC
> attributes to get the float128 type when the long double type is IEEE 128-bit.

Which is a problem that needs fixing.  Make a separate testcase for that
then, please.

> +#ifndef __LONG_DOUBLE_IEEE128__
> +/* If long double is IBM, we have to use __attribute__ to get to the long
> +   double complex type.  If long double is IEEE, we can use the standard
> +   _Complex type.  */
> +typedef _Complex float __attribute__((mode(__KC__))) __cfloat128;
> +#else
> +typedef _Complex long double __cfloat128;
> +#endif

The comment doesn't make much sense.  If long double is IBM, the long
double complex mode is ICmode, not KCmode.  "To get the IEEE128 complex
type", perhaps?  And can't you do _Complex __ieee128 or such?


Segher

[PATCH] PR libstdc++/83328 add correct basic_string::insert for initializer_list

2018-06-21 Thread Jonathan Wakely


The SSO basic_string has a non-standard insert(iterator, initializer_list)
overload, from a C++0x draft. This adds the correct overload, while also
preserving the old one so that the old symbol is still exported from the
library.
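
As a small usage sketch of the standard overload (assuming a standard library that provides the C++11 signature, i.e. insert taking a const_iterator and an initializer_list and returning an iterator), the struct and function names below are made up for illustration:

```cpp
#include <string>

// Hypothetical result type: the modified string plus the offset of the
// iterator returned by insert().
struct InsertResult {
  std::string text;
  std::string::difference_type offset;
};

// Inserts 'b' and 'c' after the first character of "ad" through the
// insert(const_iterator, initializer_list<char>) overload.
InsertResult insert_with_initializer_list() {
  std::string s = "ad";
  // The C++11 overload returns an iterator to the first inserted character.
  std::string::iterator it = s.insert(s.cbegin() + 1, {'b', 'c'});
  std::string::difference_type off = it - s.begin();
  return {s, off};
}
```

With the non-standard overload alone, the call above would not compile as written, because that overload takes a non-const iterator and returns void.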

The COW basic_string doesn't have any of the C++11 changes to the insert
overloads (they all still have non-const iterator parameters and the
ones that should return an iterator still return void). This doesn't
make any change to the COW string.

PR libstdc++/83328
* acinclude.m4 (libtool_VERSION): Bump to 6:26:0.
* config/abi/pre/gnu.ver: Add GLIBCXX_3.4.26 and export new symbol.
* configure: Regenerate.
* include/bits/basic_string.h [_GLIBCXX_USE_CXX11_ABI]
(basic_string::insert(const_iterator, initializer_list)): Add.
[_GLIBCXX_USE_CXX11_ABI && !_GLIBCXX_DEFINING_STRING_INSTANTIATIONS]
(basic_string::insert(iterator, initializer_list)): Suppress
definition.
* include/debug/string (basic_string::insert(iterator, C)): Change
first parameter to const_iterator.
(basic_string::insert(iterator, size_type, C)): Likewise. Change
return type to iterator.
(basic_string::insert(iterator, InputIterator, InputIterator)):
Likewise.
(basic_string::insert(iterator, initializer_list)): Change first
parameter to const_iterator and return type to iterator.
* src/c++11/string-inst.cc: Extend comment.
* testsuite/21_strings/basic_string/modifiers/insert/char/83328.cc:
New.
* testsuite/21_strings/basic_string/modifiers/insert/wchar_t/83328.cc:
New.
* testsuite/util/testsuite_abi.cc: Add new symbol version.

Tested x86_64-linux, committed to trunk.


commit 2e78faf91176299db541340d24ca3fae71ecf1cf
Author: Jonathan Wakely 
Date:   Thu Dec 14 11:29:23 2017 +

PR libstdc++/83328 add correct basic_string::insert for initializer_list

The SSO basic_string has a non-standard insert(iterator, initializer_list)
overload, from a C++0x draft. This adds the correct overload, while also
preserving the old one so that the old symbol is still exported from the
library.

The COW basic_string doesn't have any of the C++11 changes to the insert
overloads (they all still have non-const iterator parameters and the
ones that should return an iterator still return void). This doesn't
make any change to the COW string.

PR libstdc++/83328
* acinclude.m4 (libtool_VERSION): Bump to 6:26:0.
* config/abi/pre/gnu.ver: Add GLIBCXX_3.4.26 and export new symbol.
* configure: Regenerate.
* include/bits/basic_string.h [_GLIBCXX_USE_CXX11_ABI]
(basic_string::insert(const_iterator, initializer_list)): Add.
[_GLIBCXX_USE_CXX11_ABI && !_GLIBCXX_DEFINING_STRING_INSTANTIATIONS]
(basic_string::insert(iterator, initializer_list)): Suppress
definition.
* include/debug/string (basic_string::insert(iterator, C)): Change
first parameter to const_iterator.
(basic_string::insert(iterator, size_type, C)): Likewise. Change
return type to iterator.
(basic_string::insert(iterator, InputIterator, InputIterator)):
Likewise.
(basic_string::insert(iterator, initializer_list)): Change first
parameter to const_iterator and return type to iterator.
* src/c++11/string-inst.cc: Extend comment.
* testsuite/21_strings/basic_string/modifiers/insert/char/83328.cc:
New.
* testsuite/21_strings/basic_string/modifiers/insert/wchar_t/83328.cc:
New.
* testsuite/util/testsuite_abi.cc: Add new symbol version.

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index 6c855b6c7e5..cf5add167e6 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -3749,7 +3749,7 @@ changequote([,])dnl
 fi
 
 # For libtool versioning info, format is CURRENT:REVISION:AGE
-libtool_VERSION=6:25:0
+libtool_VERSION=6:26:0
 
 # Everything parsed; figure out what files and settings to use.
 case $enable_symvers in
diff --git a/libstdc++-v3/config/abi/pre/gnu.ver b/libstdc++-v3/config/abi/pre/gnu.ver
index 5e66dc5cc3f..b59b9a0ff1f 100644
--- a/libstdc++-v3/config/abi/pre/gnu.ver
+++ b/libstdc++-v3/config/abi/pre/gnu.ver
@@ -1703,7 +1703,34 @@ GLIBCXX_3.4.21 {
 _ZNSt7__cxx1112basic_stringI[cw]St11char_traitsI[cw]ESaI[cw]EE13*;
 _ZNSt7__cxx1112basic_stringI[cw]St11char_traitsI[cw]ESaI[cw]EE14_M_replace_aux*;
 _ZNSt7__cxx1112basic_stringI[cw]St11char_traitsI[cw]ESaI[cw]EE1[5-9]*;
-_ZNSt7__cxx1112basic_stringI[cw]St11char_traitsI[cw]ESaI[cw]EE[2-9]*;
+_ZNSt7__cxx1112basic_stringI[cw]St11char_traitsI[cw]ESaI[cw]EE2at*;
+_ZNSt7__cxx1112basic_stringI[cw]St11char_traitsI[cw]ESaI[cw]EE3end*;
+_ZNSt7__cxx1112basic_stringI[cw]St11char_trai

Re: [PATCH], PowerPC long double transition patches, v2, Patch #2 (add missing conversion insn)

2018-06-21 Thread Michael Meissner

On Thu, Jun 21, 2018 at 04:25:29PM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Jun 20, 2018 at 10:32:06AM -0400, Michael Meissner wrote:
> > In reworking the ordering of the 128-bit floating point modes (June 18th, 2018
> > patch), I missed one conversion insn.  This meant the compiler would generate a
> > conversion to using the IF name.
> 
> Is there some existing test this fixes?  Which?

pr85657-3.c.
 
> Okay for trunk.
> 
> 
> Segher
> 

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797

Re: [PATCH], PowerPC long double transition patches, v2, Patch #4 (fix bug in clone/target attributes on long double == IEEE 128-bit systems)

2018-06-21 Thread Segher Boessenkool

On Wed, Jun 20, 2018 at 10:42:38AM -0400, Michael Meissner wrote:
> This patch prevents the special overriding of the complex float128
> multiply/divide functions from being run twice if there are clone or target
> attributes.  I wasn't aware that the hook used to initialize the built-in
> functions is run each time you change the target options.  The built-in
> function handling aborts if the name had already been set.
> 
> I have tested this on a little endian power8 system using two builds with long
> double set to IEEE and IBM 128-bit.  This patch fixes testsuite errors from
> using the clone or target attributes.  Can I install it in the trunk and GCC
> 8.x branches?

Next time, please mention what tests are fixed.

> --- gcc/config/rs6000/rs6000.c(revision 261574)
> +++ gcc/config/rs6000/rs6000.c(working copy)
> @@ -17892,9 +17892,14 @@ init_float128_ieee (machine_mode mode)
>  {
>if (FLOAT128_VECTOR_P (mode))
>  {
> -  /* Set up to call __mulkc3 and __divkc3 under -mabi=ieeelongdouble.  */
> - if (mode == TFmode && TARGET_IEEEQUAD)
> +  static bool complex_muldiv_init_p = false;
> +
> +  /* Set up to call __mulkc3 and __divkc3 under -mabi=ieeelongdouble.  If
> +  we have clone or target attributes, this will be called a second
> +  time.  We want to create the built-in function only once.  */
> + if (mode == TFmode && TARGET_IEEEQUAD && !complex_muldiv_init_p)
> {
> +  complex_muldiv_init_p = true;
>built_in_function fncode_mul =
>  (built_in_function) (BUILT_IN_COMPLEX_MUL_MIN + TCmode
>   - MIN_MODE_COMPLEX_FLOAT);

The indentation here is wrong (was before, too).

Is there no more elegant, less error-prone way to see things are already
initialised?  Maybe use maybe_get_identifier?
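
For readers unfamiliar with the pattern in the quoted hunk, here is a minimal standalone sketch of a run-once guard. It is not the rs6000 code: BuiltinRegistry and init_complex_muldiv are hypothetical stand-ins for a table that rejects a name being set twice and for a hook that may be invoked once per target-option change.

```cpp
#include <cstddef>
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a built-in table: setting the same name twice
// is treated as an error, mirroring the abort described in the patch.
class BuiltinRegistry {
 public:
  void set_name(const std::string& name) {
    if (!names_.insert(name).second)
      throw std::logic_error("builtin name already set: " + name);
  }
  std::size_t size() const { return names_.size(); }

 private:
  std::set<std::string> names_;
};

// Hypothetical initialisation hook that can be called repeatedly.
// The function-local static flag makes the registration happen only once.
void init_complex_muldiv(BuiltinRegistry& registry) {
  static bool initialized = false;
  if (initialized)
    return;
  initialized = true;
  registry.set_name("__mulkc3");
  registry.set_name("__divkc3");
}
```

The flag is process-wide while the registry is passed in, so the two can drift apart; asking the table itself whether the name is already present avoids keeping that separate piece of state.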

But, okay for trunk.


Segher

Re: [PATCH], PowerPC long double transition patches, v2, Patch #5 (fix negif3)

2018-06-21 Thread Segher Boessenkool

On Wed, Jun 20, 2018 at 10:46:04AM -0400, Michael Meissner wrote:
> This patch fixes a thinko that I had that prevented negation of __ibm128 values
> if long double is IEEE 128-bit binary floating point.
> 
> I have checked this on a little endian power8 system with builds where the long
> double is set to IEEE and IBM 128-bit binary floating point, and it fixes some
> tests in the testsuite when long double is IEEE.  Can I install this on the
> trunk and back port it to GCC 8.x?

Okay, thanks.


Segher


>   * config/rs6000/rs6000.md (neg2_internal): Use the correct
>   mode to check whether the mode is IBM extended.

Re: Do not emit unnecessary NOPs at -O0

2018-06-21 Thread Jeff Law

On 06/21/2018 11:04 AM, Eric Botcazou wrote:
> When code is compiled at -O0, the RTL middle-end makes sure that location 
> information is preserved as much as possible by generating NOPs with the 
> location information (goto_locus) present on edges in the CFG, if it thinks 
> that these edges are the only place where a particular location is mentioned.
> 
> The attached patch prevents this from happening in a couple of cases:
>  1. if the function has the DECL_IGNORED_P flag set,
>  2. if the NOP is emitted by merge_blocks and the 2nd block is a forwarder 
> block whose outgoing edge has no location, because in this case the location 
> of the to-be-elided edge is copied onto the aforementioned outgoing edge.
> 
> Tested (GCC and GDB) on x86-64/Linux, applied on the mainline.
> 
> 
> 2018-06-21  Eric Botcazou  
> 
>   * cfgrtl.c (fixup_reorder_chain): Do not emit NOPs in DECL_IGNORED_P
>   functions.
>   (rtl_merge_blocks): Likewise.  Do not emit a NOP if the location of the
>   edge can be forwarded.
>   (cfg_layout_merge_blocks): Likewise.
> 
OK
Jeff

Re: [PATCH], PowerPC long double transition patches, v2, Patch #3 (use correct way to get the IEEE 128-bit complex type)

2018-06-21 Thread Joseph Myers

On Thu, 21 Jun 2018, Segher Boessenkool wrote:

> The comment doesn't make much sense.  If long double is IBM, the long
> double complex mode is ICmode, not KCmode.  "To get the IEEE128 complex
> type", perhaps?  And can't you do _Complex __ieee128 or such?

You can do _Complex _Float128, in C only, not C++ (which doesn't have the 
_Float128 keyword).  _Complex can be used with keywords for type names, 
but not with a typedef name (built-in or otherwise, see discussion in bug 
32187).

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH v3] Change default to -fno-math-errno

2018-06-21 Thread Jeff Law

On 06/19/2018 02:29 AM, Richard Biener wrote:
> On Mon, Jun 18, 2018 at 4:01 PM Wilco Dijkstra  wrote:
>>
>> GCC currently defaults to -fmath-errno.  This generates code assuming math
>> functions set errno and the application checks errno.  Few applications
>> test errno and various systems and math libraries no longer set errno since it
>> is optional.  GCC generates much faster code for simple math functions with
>> -fno-math-errno such as sqrt and lround (avoiding a call and PLT redirection).
>> Therefore it is reasonable to change the default to -fno-math-errno.  This is
>> already the case for non-C languages.  Only change the default for C99 and
>> later.
>>
>> long f(float x) { return lroundf(x) + 1; }
>>
>> by default:
>>
>> f:
>> str x30, [sp, -16]!
>> bl  lroundf
>> add x0, x0, 1
>> ldr x30, [sp], 16
>> ret
>>
>> With -fno-math-errno:
>>
>> f:
>> fcvtas  x0, s0
>> add x0, x0, 1
>> ret
>>
>> Passes regress on AArch64. OK for commit?
> 
> There are a number of regression tests that check for errno handling
> (I added some to avoid aliasing for example).  Please make sure to
> add explicit -fmath-errno to those that do not already have it set
> (I guess such patch would be obvious and independent of this one).
> 
> A grep -r errno testsuite/ is only 159 lines but it might not
> catch all cases - the one I'm referring to above matches because
> of a comment only:
> 
> testsuite/gcc.dg/tree-ssa/ssa-dse-15.c:  /* We should be able to DSE this store
> (p may point to errno).  */
We're concerned about errno as potentially set by malloc here.  It's
unrelated to the math-errno work.


Jeff

Re: [PATCH v3] Change default to -fno-math-errno

2018-06-21 Thread Jeff Law

On 06/18/2018 10:46 AM, Joseph Myers wrote:
> On Mon, 18 Jun 2018, Jeff Law wrote:
> 
>> So do we need to set or check math_errhandling & MATH_ERRNO at all?  Or
> 
> That's a matter for libc (glibc currently sets it based on __FAST_MATH__ / 
> __NO_MATH_ERRNO__, but fails to avoid including MATH_ERREXCEPT in the 
> definition for configurations not supporting exceptions).
> 
>> WRT library behavior, I thought it was only complex arithmetic that had
>> the option of setting errno.  I thought float/double math functions had
>> a requirement to set errno?
> 
> For complex.h functions it's always optional.  For math.h functions at 
> least one of errno and exceptions must be set, as indicated in the 
> definition of math_errhandling (and if math_errhandling is defined to just 
> one of MATH_ERRNO and MATH_ERREXCEPT, functions then may or may not also 
> indicate errors in the other way).  Hence the question of whether a 
> -fno-math-errno default should be different for (mainly soft-float) 
> configurations not supporting floating-point exceptions, for which errno 
> is the only way they have available to indicate errors.
> 
Ah.  Thanks for explaining things.

I think all this implies that the setting of -fno-math-errno by default
really depends on the math library in use since it's the library that
has to arrange for either errno to get set or for an exception to be raised.
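
A rough way to see what a given math library does is to probe it. The sketch below is only an illustration under assumptions: the names ErrorReport and probe_sqrt_domain_error are made up, and the result depends on the library, the target and the compiler flags in use (including -fmath-errno versus -fno-math-errno).

```cpp
#include <cerrno>
#include <cfenv>
#include <cmath>

// Hypothetical report of how a domain error was signalled.
struct ErrorReport {
  bool via_errno;      // errno was set to EDOM
  bool via_exception;  // the FE_INVALID floating-point exception flag was raised
};

// Calls sqrt on a negative value and records which reporting mechanism
// fired.  The volatile objects keep the compiler from folding the call.
ErrorReport probe_sqrt_domain_error() {
  volatile double input = -1.0;
  errno = 0;
  std::feclearexcept(FE_ALL_EXCEPT);
  volatile double result = std::sqrt(input);
  (void) result;
  return {errno == EDOM, std::fetestexcept(FE_INVALID) != 0};
}
```

On a configuration without floating-point exception support, errno would be the only mechanism left, which is the case Joseph raises above.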

I don't know enough about the various math runtimes to even hazard a
guess about the behavior of each.  glibc, musl, uClibc, newlib... Then
there's the alternate libraries from Intel & AMD and the proprietary Unix
math runtimes.  Ugh.

With that in mind ISTM that we may be best off keeping the default
as-is, but overriding it for libraries which we know are going to throw
exceptions rather than set errno.

Thoughts?

jeff

Re: [PATCH], PowerPC long double transition patches, v2, Patch #4 (fix bug in clone/target attributes on long double == IEEE 128-bit systems)

2018-06-21 Thread Michael Meissner

On Thu, Jun 21, 2018 at 05:18:19PM -0500, Segher Boessenkool wrote:
> On Wed, Jun 20, 2018 at 10:42:38AM -0400, Michael Meissner wrote:
> > This patch prevents the special overriding of the complex float128
> > multiply/divide functions from being run twice if there are clone or target
> > attributes.  I wasn't aware that the hook used to initialize the built-in
> > functions is run each time you change the target options.  The built-in
> > function handling aborts if the name had already been set.
> > 
> > I have tested this on a little endian power8 system using two builds with long
> > double set to IEEE and IBM 128-bit.  This patch fixes testsuite errors from
> > using the clone or target attributes.  Can I install it in the trunk and GCC
> > 8.x branches?
> 
> Next time, please mention what tests are fixed.

It was at least clone1.c and clone2.c.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797

libgo patch committed: Re-enable a couple of tests

2018-06-21 Thread Ian Lance Taylor

This libgo patch re-enables a couple of tests that are specific to gccgo.
This is a port of https://golang.org/cl/120375 so that it gets more
reliable testing.  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 261819)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-c3ef5bbf4e4271216b6f22621269d458599e8087
+d3eb93c1b8990dbfd4bb660c5c8454916b62655c
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/go/cmd/go/go_test.go
===
--- libgo/go/cmd/go/go_test.go  (revision 261819)
+++ libgo/go/cmd/go/go_test.go  (working copy)
@@ -791,7 +791,6 @@ func TestBuildComplex(t *testing.T) {
tg.run("build", "-x", "-o", os.DevNull, "complex")
 
if _, err := exec.LookPath("gccgo"); err == nil {
-   t.Skip("golang.org/issue/22472")
tg.run("build", "-x", "-o", os.DevNull, "-compiler=gccgo", "complex")
}
 }
@@ -2944,7 +2943,6 @@ func TestIssue7573(t *testing.T) {
if _, err := exec.LookPath("gccgo"); err != nil {
t.Skip("skipping because no gccgo compiler found")
}
-   t.Skip("golang.org/issue/22472")
 
tg := testgo(t)
defer tg.cleanup()

Re: [PATCH 10/11] Fix LRA to handle multi-word eliminable registers

2018-06-21 Thread Jeff Law

On 06/21/2018 11:44 AM, Vladimir Makarov wrote:
> 
> 
> On 06/13/2018 02:58 PM, Dimitar Dimitrov wrote:
>> From: Dimitar Dimitrov 
>>
>> For some targets, Pmode != UNITS_PER_WORD. Take this into account
>> when marking hard registers as being used.
>>
>> I tested C and C++ testsuites for x86_64 with and without this
>> patch. There was no regression, i.e. gcc.sum and g++.sum matched
>> exactly.
>>
>> gcc/ChangeLog:
>>
>> 2018-06-13  Dimitar Dimitrov  
>>
>> * lra-eliminations.c (set_ptr_hard_reg_bits): New function.
>> (update_reg_eliminate): Mark all spanning hw registers.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2018-06-13  Dimitar Dimitrov  
>>
>> * gcc.target/pru/lra-framepointer-fragmentation-1.c: New test.
>> * gcc.target/pru/lra-framepointer-fragmentation-2.c: New test.
>>
>> Cc: Vladimir Makarov 
>> Cc: Peter Bergner 
>> Cc: Kenneth Zadeck 
>> Cc: Seongbae Park 
>> Signed-off-by: Dimitar Dimitrov 
>> ---
>>   gcc/lra-eliminations.c | 14 -
>>   .../pru/lra-framepointer-fragmentation-1.c | 33 
>>   .../pru/lra-framepointer-fragmentation-2.c | 61 ++
>>   3 files changed, 106 insertions(+), 2 deletions(-)
>>   create mode 100644 gcc/testsuite/gcc.target/pru/lra-framepointer-fragmentation-1.c
>>   create mode 100644 gcc/testsuite/gcc.target/pru/lra-framepointer-fragmentation-2.c
>>
>> diff --git a/gcc/lra-eliminations.c b/gcc/lra-eliminations.c
>> index 21d8d5f8018..566cc2c8248 100644
>> --- a/gcc/lra-eliminations.c
>> +++ b/gcc/lra-eliminations.c
>> @@ -1180,6 +1180,16 @@ spill_pseudos (HARD_REG_SET set)
>>     bitmap_clear (&to_process);
>>   }
>>   +static void set_ptr_hard_reg_bits (HARD_REG_SET *hard_reg_set, int r)
>> +{
>> +  int w;
>> +
>> +  for (w = 0; w < GET_MODE_SIZE (Pmode); w += UNITS_PER_WORD, r++)
>> +    {
>> +  SET_HARD_REG_BIT (*hard_reg_set, r);
>> +    }
>> +}
>> +
> The patch itself is ok but for uniformity I'd use
> 
>     for (int i = hard_regno_nregs (r, Pmode) - 1; i >= 0; i--)
>   SET_HARD_REG_BIT (*hard_reg_set, r + i);
I'm a bit surprised we don't already have a utility function to do this.
Hmmm

add_to_hard_reg_set (hard_reg_set, Pmode, r)

So instead LRA ought to be using that function in the places where calls
to set_ptr_hard_reg_bits were introduced.
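
The idea behind all three variants can be sketched outside of GCC as follows. This is an illustration only: HardRegSet, regs_needed and mark_spanned_regs are hypothetical names, and the register count is modelled as a simple ceiling division, whereas in GCC the number of registers a mode occupies is a target property.

```cpp
#include <bitset>
#include <cstddef>

// Hypothetical register file size and register set type.
constexpr std::size_t kNumHardRegs = 64;
using HardRegSet = std::bitset<kNumHardRegs>;

// Number of word-sized registers needed to hold a value of mode_size bytes
// (assumption: plain ceiling division).
constexpr unsigned regs_needed(unsigned mode_size, unsigned units_per_word) {
  return (mode_size + units_per_word - 1) / units_per_word;
}

// Marks every register spanned by a value that starts in first_reg.
void mark_spanned_regs(HardRegSet& set, unsigned first_reg,
                       unsigned mode_size, unsigned units_per_word) {
  const unsigned count = regs_needed(mode_size, units_per_word);
  for (unsigned i = 0; i < count; ++i)
    set.set(first_reg + i);  // throws std::out_of_range past the last register
}
```

For a target with 4-byte pointers and 1-byte registers, a pointer starting in register 4 marks registers 4 through 7 in this sketch, rather than only register 4.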

Dimitar, can you verify that change works?


Jeff

Re: [PATCH], PowerPC long double transition patches, v2, Patch #7 (fix IBM extended double tests to use __ibm128 as needed)

2018-06-21 Thread Segher Boessenkool

On Wed, Jun 20, 2018 at 10:54:18AM -0400, Michael Meissner wrote:
> This patch fixes the tests in the testsuite that implicitly were expecting long
> double to be IBM extended double to use __ibm128 if long double is configured
> to be IEEE 128-bit floating point.

And just always using __ibm128 does not work?  That needs fixing then.

(Perhaps test long double in a separate testcase, too, then, and only run
that if the appropriate long double type is active).

Patch is okay for now.

Segher

Re: [PATCH], PowerPC long double transition patches, v2, Patch #7 (fix IBM extended double tests to use __ibm128 as needed)

2018-06-21 Thread Michael Meissner

On Thu, Jun 21, 2018 at 06:07:36PM -0500, Segher Boessenkool wrote:
> On Wed, Jun 20, 2018 at 10:54:18AM -0400, Michael Meissner wrote:
> > This patch fixes the tests in the testsuite that implicitly were expecting long
> > double to be IBM extended double to use __ibm128 if long double is configured
> > to be IEEE 128-bit floating point.
> 
> And just always using __ibm128 does not work?  That needs fixing then.

As we've discussed before, __ibm128 is only created when the float128 support
is enabled (VSX and Linux).  If I changed the tests without adding a condition
to only run it on the appropriate systems, it would give errors on AIX, and
32-bit embedded systems.

> (Perhaps test long double in a separate testcase, too, then, and only run
> that if the appropriate long double type is active).
> 
> Patch is okay for now.
> 
> 
> Segher
> 

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797

Re: [PATCH], PowerPC long double transition patches, v2, Patch #6 (fix long double tests for -mno-float128)

2018-06-21 Thread Segher Boessenkool

On Wed, Jun 20, 2018 at 10:49:41AM -0400, Michael Meissner wrote:
> These patches fix the tests in the testsuite that check whether -mno-float128
> works properly.  In these cases, I explicitly run them with long double being
> set to IBM extended double.

So what happened without this patch?


Segher

Re: [PATCH/RFC] enable -Wstrict-prototypes (PR 82922)

2018-06-21 Thread Jeff Law

On 06/12/2018 11:21 AM, Joseph Myers wrote:
> On Tue, 12 Jun 2018, Martin Sebor wrote:
> 
>> The proposal to enable -Wstrict-prototypes discussed below
>> was considered too late for GCC 8.  I'd like to revive it
>> now for GCC 9.
>>
>>   https://gcc.gnu.org/ml/gcc-patches/2018-01/msg00935.html
> 
> My point from that discussion stands that () for no arguments should be 
> considered separately from warning for all the other cases.
There's a lot of legacy code out there...  What's the proposal for
handling the no argument () case?  Are we thinking multiple levels?  And
if so what's the default?

This has the potential to cause a lot of headaches for the distros...
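
To make the two cases concrete, here is a small sketch (the function names are made up for illustration).  The first declaration uses empty parentheses for a function that takes no arguments, which is the common legacy form under discussion; the second is a real prototype.

```c
/* Declared with empty parentheses.  Before C23 this says nothing about
   the parameters, so calls cannot be checked against it; this is the
   form -Wstrict-prototypes warns about.  */
int legacy_answer ();

/* A real prototype: explicitly takes no arguments.  */
int strict_answer (void);

int
legacy_answer (void)
{
  return 42;
}

int
strict_answer (void)
{
  return 42;
}
```

Both functions behave identically when called with no arguments; the only difference is how much the declaration tells the compiler.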

jeff

[PATCH] PR libstdc++/86138 prevent implicit instantiation of COW empty rep

2018-06-21 Thread Jonathan Wakely


The explicit instantiation declarations for std::basic_string are
disabled for C++17 (and later) so that basic_string symbols get
implicitly instantiated in every translation unit that needs them.  On
targets that don't support STB_GNU_UNIQUE this leads to multiple copies
of the empty rep symbol for COW strings. In order to detect whether a
COW string needs to deallocate its storage it compares the address with
the empty rep.  When there are multiple copies of the empty rep object
the address is not unique, and so string destructors try to delete the
empty rep, which crashes.

In order to guarantee uniqueness of the _S_empty_rep_storage symbol this
patch adds an explicit instantiation declaration for just that symbol.
This means the other symbols are still implicitly instantiated in C++17
code, but for the empty rep the definition in the library gets used.
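
The failure mode can be modelled in a few lines of C.  This is a deliberately simplified sketch, not the libstdc++ code: struct rep, the two empty_rep objects and would_free are hypothetical stand-ins for the COW representation, two copies of _S_empty_rep_storage, and the address check made on destruction.

```c
#include <stddef.h>

/* Simplified stand-in for the COW string representation header.  */
struct rep
{
  size_t length;
  char data[1];
};

/* Models _S_empty_rep_storage: a static object shared by all empty
   strings and never heap-allocated.  */
static struct rep empty_rep_a;

/* A second copy of the same symbol, as can happen when another module
   instantiates it and the target cannot merge the definitions.  */
static struct rep empty_rep_b;

/* Would destroying a string that uses R try to free it?  The decision
   is made purely by address, against the copy of the empty rep that
   the current module sees.  */
static int
would_free (const struct rep *r, const struct rep *my_empty_rep)
{
  return r != my_empty_rep;
}
```

With a single copy, would_free (&empty_rep_a, &empty_rep_a) is 0 and nothing is freed.  With two copies, would_free (&empty_rep_a, &empty_rep_b) is 1, so the module holding the second copy would try to free static storage.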

Separately, there is no need for C++17 code to implicitly instantiate
the I/O functions for strings, so this also restores the explicit
instantiation declarations for those functions.

PR libstdc++/86138
* include/bits/basic_string.tcc:
[__cplusplus > 201402 && !_GLIBCXX_USE_CXX11_ABI]
(basic_string<char>::_Rep::_S_empty_rep_storage)
(basic_string<wchar_t>::_Rep::_S_empty_rep_storage): Add explicit
instantiation declarations.
[__cplusplus > 201402] (operator>>, operator<<, getline): Re-enable
explicit instantiation declarations.
* testsuite/21_strings/basic_string/cons/char/86138.cc: New.
* testsuite/21_strings/basic_string/cons/wchar_t/86138.cc: New.

Tested x86_64-linux, committed to trunk.

If this passes testing on Cygwin I'll also backport it to gcc-7 and
gcc-8, as the explicit instantiation declarations are disabled for
C++17 on those branches.


commit 798a049013ecffc0f428c0b936d3f0770471e8c6
Author: Jonathan Wakely 
Date:   Fri Jun 22 00:10:40 2018 +0100

PR libstdc++/86138 prevent implicit instantiation of COW empty rep

diff --git a/libstdc++-v3/include/bits/basic_string.tcc b/libstdc++-v3/include/bits/basic_string.tcc
index be8815c711b..9fbea84c4af 100644
--- a/libstdc++-v3/include/bits/basic_string.tcc
+++ b/libstdc++-v3/include/bits/basic_string.tcc
@@ -1597,8 +1597,21 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   // Inhibit implicit instantiations for required instantiations,
   // which are defined via explicit instantiations elsewhere.
-#if _GLIBCXX_EXTERN_TEMPLATE > 0 && __cplusplus <= 201402L
+#if _GLIBCXX_EXTERN_TEMPLATE > 0
+  // The explicit instantiations definitions in src/c++11/string-inst.cc
+  // are compiled as C++14, so the new C++17 members aren't instantiated.
+  // Until those definitions are compiled as C++17 suppress the declaration,
+  // so C++17 code will implicitly instantiate std::string and std::wstring
+  // as needed.
+# if __cplusplus <= 201402L
   extern template class basic_string<char>;
+# elif ! _GLIBCXX_USE_CXX11_ABI
+  // Still need to prevent implicit instantiation of the COW empty rep,
+  // to ensure the definition in libstdc++.so is unique (PR 86138).
+  extern template basic_string<char>::size_type
+    basic_string<char>::_Rep::_S_empty_rep_storage[];
+# endif
+
   extern template
     basic_istream<char>&
     operator>>(basic_istream<char>&, string&);
@@ -1613,7 +1626,13 @@ _GLIBC

Re: [PATCH, rs6000] Fix AIX expected builtin instruction counts

2018-06-21 Thread Segher Boessenkool

Hi Carl,

On Wed, Jun 20, 2018 at 05:09:00PM -0700, Carl Love wrote:
> I believe I have addressed all of your concerns with the patch.
> 
> I have retested it and it looks good.

It looks good indeed.  Please commit, thanks!

I noticed one more thing (follow-up patch?)

>  /* { dg-final { scan-assembler-times "divd" 8  { target lp64 } } } */
>  /* { dg-final { scan-assembler-times "divdu" 2  { target lp64 } } } */
>  /* { dg-final { scan-assembler-times "mulld" 4  { target lp64 } } } */
> -/* { dg-final { scan-assembler-times "bl __divdi3" 3  { target ilp32 } } } */
> -/* { dg-final { scan-assembler-times "bl __udivdi3" 3  { target ilp32 } } } */
> +/* { dg-final { scan-assembler-times {\mbl __divdi3\M} 2  { target { ilp32 } } } } */
> +/* { dg-final { scan-assembler-times {\mbl __udivdi3\M} 2  { target {ilp32 } } } } */

The test for "divd" will count those __divdi3, __udivdi3 as well.  It also
counts divdu.

Putting \m\M around most mnemonics helps.
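
The effect of the word-boundary anchors can be sketched in plain C.  This is only an approximation of the regexp behaviour, under the assumption that a word character is a letter, digit or underscore; count_matches and is_word_char are hypothetical helpers, not part of the testsuite.

```c
#include <ctype.h>
#include <string.h>

/* Word characters as assumed here: letters, digits and underscore.  */
static int
is_word_char (char c)
{
  return isalnum ((unsigned char) c) || c == '_';
}

/* Count occurrences of NEEDLE in TEXT.  If WHOLE_WORD is nonzero, only
   count matches with no word character directly before or after them,
   which approximates wrapping the pattern in \m ... \M.  */
static int
count_matches (const char *text, const char *needle, int whole_word)
{
  size_t len = strlen (needle);
  int count = 0;

  if (len == 0)
    return 0;

  for (const char *p = text; (p = strstr (p, needle)) != NULL; p++)
    {
      int starts_word = (p == text) || !is_word_char (p[-1]);
      int ends_word = !is_word_char (p[len]);
      if (!whole_word || (starts_word && ends_word))
        count++;
    }
  return count;
}
```

On the sample text "divd 3,4,5\nbl __divdi3\ndivdu 3,4,5\nbl __udivdi3\n", a plain search for "divd" counts 4 matches (divd, __divdi3, divdu and __udivdi3), while the whole-word search counts only the single real divd instruction.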

Cheers,


Segher

Re: [PATCH/RFC] enable -Wstrict-prototypes (PR 82922)

2018-06-21 Thread Eric Gallager

On 6/21/18, Jeff Law  wrote:
> On 06/12/2018 11:21 AM, Joseph Myers wrote:
>> On Tue, 12 Jun 2018, Martin Sebor wrote:
>>
>>> The proposal to enable -Wstrict-prototypes discussed below
>>> was considered too late for GCC 8.  I'd like to revive it
>>> now for GCC 9.
>>>
>>>   https://gcc.gnu.org/ml/gcc-patches/2018-01/msg00935.html
>>
>> My point from that discussion stands that () for no arguments should be
>> considered separately from warning for all the other cases.
> There's a lot of legacy code out there...  What's the proposal for
> handling the no argument () case?  Are we thinking multiple levels?  And
> if so what's the default?

-Wstrict-prototypes and -Wstricter-prototypes for the prototypes case?
And then split -Wold-style-definition into -Wold-style-definition and
-Wc++-style-definition for the equivalent use of () in function
definitions?

>
> This has the potential to cause a lot of headaches for the distros...
>
> jeff
>
>

Re: [PATCH/RFC] enable -Wstrict-prototypes (PR 82922)

2018-06-21 Thread Martin Sebor


On 06/21/2018 05:14 PM, Jeff Law wrote:
> On 06/12/2018 11:21 AM, Joseph Myers wrote:
>> On Tue, 12 Jun 2018, Martin Sebor wrote:
>>
>>> The proposal to enable -Wstrict-prototypes discussed below
>>> was considered too late for GCC 8.  I'd like to revive it
>>> now for GCC 9.
>>>
>>>   https://gcc.gnu.org/ml/gcc-patches/2018-01/msg00935.html
>>
>> My point from that discussion stands that () for no arguments should be
>> considered separately from warning for all the other cases.
> There's a lot of legacy code out there...  What's the proposal for
> handling the no argument () case?  Are we thinking multiple levels?  And
> if so what's the default?
>
> This has the potential to cause a lot of headaches for the distros...

As an alternative, I've been pondering diagnosing calls that pass
one or more arguments to such functions instead.  That would miss
the problem when the function is defined to take arguments.
I haven't prototyped it yet.

Martin

Re: [PATCH 10/11] Fix LRA to handle multi-word eliminable registers

2018-06-21 Thread Dimitar Dimitrov

On Thursday, 21 June 2018 17:03:55 EEST Jeff Law wrote:
> On 06/21/2018 11:44 AM, Vladimir Makarov wrote:
> > On 06/13/2018 02:58 PM, Dimitar Dimitrov wrote:
> >> From: Dimitar Dimitrov 
> >> 
> >> For some targets, Pmode != UNITS_PER_WORD. Take this into account
> >> when marking hard registers as being used.
> >> 
> >> I tested C and C++ testsuites for x86_64 with and without this
> >> patch. There was no regression, i.e. gcc.sum and g++.sum matched
> >> exactly.
> >> 
> >> gcc/ChangeLog:
> >> 
> >> 2018-06-13  Dimitar Dimitrov  
> >> 
> >> * lra-eliminations.c (set_ptr_hard_reg_bits): New function.
> >> (update_reg_eliminate): Mark all spanning hw registers.
> >> 
> >> gcc/testsuite/ChangeLog:
> >> 
> >> 2018-06-13  Dimitar Dimitrov  
> >> 
> >> * gcc.target/pru/lra-framepointer-fragmentation-1.c: New test.
> >> * gcc.target/pru/lra-framepointer-fragmentation-2.c: New test.
> >> 
> >> Cc: Vladimir Makarov 
> >> Cc: Peter Bergner 
> >> Cc: Kenneth Zadeck 
> >> Cc: Seongbae Park 
> >> Signed-off-by: Dimitar Dimitrov 
> >> ---
> >>   gcc/lra-eliminations.c | 14 -
> >>   .../pru/lra-framepointer-fragmentation-1.c | 33 
> >>   .../pru/lra-framepointer-fragmentation-2.c | 61
> >> ++
> >>   3 files changed, 106 insertions(+), 2 deletions(-)
> >>   create mode 100644
> >> gcc/testsuite/gcc.target/pru/lra-framepointer-fragmentation-1.c
> >>   create mode 100644
> >> gcc/testsuite/gcc.target/pru/lra-framepointer-fragmentation-2.c
> >> 
> >> diff --git a/gcc/lra-eliminations.c b/gcc/lra-eliminations.c
> >> index 21d8d5f8018..566cc2c8248 100644
> >> --- a/gcc/lra-eliminations.c
> >> +++ b/gcc/lra-eliminations.c
> >> @@ -1180,6 +1180,16 @@ spill_pseudos (HARD_REG_SET set)
> >> bitmap_clear (&to_process);
> >>   }
> >>   +static void set_ptr_hard_reg_bits (HARD_REG_SET *hard_reg_set, int r)
> >> +{
> >> +  int w;
> >> +
> >> +  for (w = 0; w < GET_MODE_SIZE (Pmode); w += UNITS_PER_WORD, r++)
> >> +{
> >> +  SET_HARD_REG_BIT (*hard_reg_set, r);
> >> +}
> >> +}
> >> +
> > 
> > The patch itself is ok but for uniformity I'd use
> > 
> > for (int i = hard_regno_nregs (r, Pmode) - 1; i >= 0; i--)
> >   SET_HARD_REG_BIT (*hard_reg_set, r + i);
> 
> I'm a bit surprised we don't already have a utility function to do this.
> Hmmm
> 
> add_to_hard_reg_set (hard_reg_set, Pmode, r)
> 
> So instead LRA ought to be using that function in the places where calls
> to set_ptr_hard_reg_bits were introduced.
> 
> Dimitar, can you verify that change works?

Thank you. I'll test it and will update the patch.


The SET_HARD_REG_BIT call in check_pseudos_live_through_calls also seems
suspicious to me. I'll try to come up with a regression test case to justify
its upgrade to add_to_hard_reg_set().
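
For readers following along, the difference between marking one register and marking every register a pointer occupies can be sketched with a toy bitset.  None of this is GCC code: toy_reg_set, toy_set_bit, toy_add_span and toy_nregs are hypothetical stand-ins for HARD_REG_SET, SET_HARD_REG_BIT, add_to_hard_reg_set and hard_regno_nregs, and the 4-byte pointer with 1-byte registers is an assumed example target.

```c
#include <stdint.h>

/* Toy stand-in for HARD_REG_SET: one bit per hard register, at most
   64 registers.  */
typedef uint64_t toy_reg_set;

/* Mark a single register, as a lone SET_HARD_REG_BIT call would.  */
static toy_reg_set
toy_set_bit (toy_reg_set set, unsigned regno)
{
  return set | ((toy_reg_set) 1 << regno);
}

/* Number of word-sized registers needed to hold MODE_SIZE bytes,
   rounding up.  */
static unsigned
toy_nregs (unsigned mode_size, unsigned units_per_word)
{
  return (mode_size + units_per_word - 1) / units_per_word;
}

/* Mark all NREGS consecutive registers starting at REGNO.  */
static toy_reg_set
toy_add_span (toy_reg_set set, unsigned regno, unsigned nregs)
{
  for (unsigned i = 0; i < nregs; i++)
    set = toy_set_bit (set, regno + i);
  return set;
}
```

With a 4-byte pointer held in 1-byte registers starting at register 4, toy_add_span (0, 4, toy_nregs (4, 1)) yields 0xF0, whereas toy_set_bit (0, 4) yields only 0x10 and leaves three of the four registers unmarked.  When the pointer fits in one word the two agree.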

Regards,
Dimitar

Re: [PATCH] Add HXT Phecda core support

2018-06-21 Thread Hongbo Zhang

On 21 June 2018 at 17:30, Kyrill  Tkachov  wrote:
> Hi Hongbo,
>
> On 20/06/18 03:54, Hongbo Zhang wrote:
>>
>> HXT semiconductor's CPU core Phecda, as a variant of Qualcomm qdf24xx,
>> reuses the same tuning structure and pipeline with it.
>>
>
> This looks ok to me but you'll need approval from the maintainers.
> Some comments on the ChangeLog below.
>
>> 2018-06-19  Hongbo Zhang  
>>
>> * config/aarch64/aarch64-cores.def (AARCH64_CORE): add phecda core
>> * config/aarch64/aarch64-tune.md: re-generated by gentune.sh
>> * doc/invoke.texi: add phecda core entry
>
>
> Please use capital first letter and end the sentence with a full stop.
> That is, "Add phecda core." Same for the other entries
> For aarch64-tune.md you can just say "Regenerate."
>
Got it, thanks Kyrill.

> Thanks,
> Kyrill
>
>
>> ---
>>  gcc/config/aarch64/aarch64-cores.def | 3 +++
>>  gcc/config/aarch64/aarch64-tune.md   | 2 +-
>>  gcc/doc/invoke.texi  | 2 +-
>>  3 files changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/gcc/config/aarch64/aarch64-cores.def
>> b/gcc/config/aarch64/aarch64-cores.def
>> index e64d831..0e3c0a0 100644
>> --- a/gcc/config/aarch64/aarch64-cores.def
>> +++ b/gcc/config/aarch64/aarch64-cores.def
>> @@ -61,6 +61,9 @@ AARCH64_CORE("thunderxt88", thunderxt88,   thunderx,
>> 8A,  AARCH64_FL_FOR_ARCH
>>  AARCH64_CORE("thunderxt81",   thunderxt81,   thunderx, 8A,
>> AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  0x43,
>> 0x0a2, -1)
>>  AARCH64_CORE("thunderxt83",   thunderxt83,   thunderx, 8A,
>> AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx,  0x43,
>> 0x0a3, -1)
>>
>> +/* HXT ('H') cores. */
>> +AARCH64_CORE("phecda",  phecda,falkor,8A,
>> AARCH64_FL_FOR_ARCH8 | AARCH64_FL_CRC | AARCH64_FL_CRYPTO, qdf24xx,   0x68,
>> 0x000, -1)
>> +
>>  /* APM ('P') cores. */
>>  AARCH64_CORE("xgene1",  xgene1,xgene1,8A,
>> AARCH64_FL_FOR_ARCH8, xgene1, 0x50, 0x000, -1)
>>
>> diff --git a/gcc/config/aarch64/aarch64-tune.md
>> b/gcc/config/aarch64/aarch64-tune.md
>> index 7b3a746..19b44d7 100644
>> --- a/gcc/config/aarch64/aarch64-tune.md
>> +++ b/gcc/config/aarch64/aarch64-tune.md
>> @@ -1,5 +1,5 @@
>>  ;; -*- buffer-read-only: t -*-
>>  ;; Generated automatically by gentune.sh from aarch64-cores.def
>>  (define_attr "tune"
>> -
>> "cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,xgene1,falkor,qdf24xx,exynosm1,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55"
>> +
>> "cortexa35,cortexa53,cortexa57,cortexa72,cortexa73,thunderx,thunderxt88p1,thunderxt88,thunderxt81,thunderxt83,phecda,xgene1,falkor,qdf24xx,exynosm1,thunderx2t99p1,vulcan,thunderx2t99,cortexa55,cortexa75,saphira,cortexa57cortexa53,cortexa72cortexa53,cortexa73cortexa35,cortexa73cortexa53,cortexa75cortexa55"
>>  (const (symbol_ref "((enum attr_tune) aarch64_tune)")))
>> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
>> index 940b846..43ef9ac 100644
>> --- a/gcc/doc/invoke.texi
>> +++ b/gcc/doc/invoke.texi
>> @@ -14667,7 +14667,7 @@ performance of the code. Permissible values for
>> this option are:
>>  @samp{generic}, @samp{cortex-a35}, @samp{cortex-a53}, @samp{cortex-a55},
>>  @samp{cortex-a57}, @samp{cortex-a72}, @samp{cortex-a73},
>> @samp{cortex-a75},
>>  @samp{exynos-m1}, @samp{falkor}, @samp{qdf24xx}, @samp{saphira},
>> -@samp{xgene1}, @samp{vulcan}, @samp{thunderx},
>> +@samp{phecda}, @samp{xgene1}, @samp{vulcan}, @samp{thunderx},
>>  @samp{thunderxt88}, @samp{thunderxt88p1}, @samp{thunderxt81},
>>  @samp{thunderxt83}, @samp{thunderx2t99}, @samp{cortex-a57.cortex-a53},
>>  @samp{cortex-a72.cortex-a53}, @samp{cortex-a73.cortex-a35},
>> --
>> 2.7.4
>>
>

Re: [PATCH 10/11] Fix LRA to handle multi-word eliminable registers

2018-06-21 Thread Jeff Law

On 06/21/2018 10:01 PM, Dimitar Dimitrov wrote:
> On Thursday, 21 June 2018 17:03:55 EEST Jeff Law wrote:
>> On 06/21/2018 11:44 AM, Vladimir Makarov wrote:
>>> On 06/13/2018 02:58 PM, Dimitar Dimitrov wrote:
>>>> From: Dimitar Dimitrov 
>>>>
>>>> For some targets, Pmode != UNITS_PER_WORD. Take this into account
>>>> when marking hard registers as being used.
>>>>
>>>> I tested C and C++ testsuites for x86_64 with and without this
>>>> patch. There was no regression, i.e. gcc.sum and g++.sum matched
>>>> exactly.
>>>>
>>>> gcc/ChangeLog:
>>>>
>>>> 2018-06-13  Dimitar Dimitrov  
>>>>
>>>> * lra-eliminations.c (set_ptr_hard_reg_bits): New function.
>>>> (update_reg_eliminate): Mark all spanning hw registers.
>>>>
>>>> gcc/testsuite/ChangeLog:
>>>>
>>>> 2018-06-13  Dimitar Dimitrov  
>>>>
>>>> * gcc.target/pru/lra-framepointer-fragmentation-1.c: New test.
>>>> * gcc.target/pru/lra-framepointer-fragmentation-2.c: New test.
>>>>
>>>> Cc: Vladimir Makarov 
>>>> Cc: Peter Bergner 
>>>> Cc: Kenneth Zadeck 
>>>> Cc: Seongbae Park 
>>>> Signed-off-by: Dimitar Dimitrov 
>>>> ---
>>>>   gcc/lra-eliminations.c | 14 -
>>>>   .../pru/lra-framepointer-fragmentation-1.c | 33 
>>>>   .../pru/lra-framepointer-fragmentation-2.c | 61
>>>> ++
>>>>   3 files changed, 106 insertions(+), 2 deletions(-)
>>>>   create mode 100644
>>>> gcc/testsuite/gcc.target/pru/lra-framepointer-fragmentation-1.c
>>>>   create mode 100644
>>>> gcc/testsuite/gcc.target/pru/lra-framepointer-fragmentation-2.c
>>>>
>>>> diff --git a/gcc/lra-eliminations.c b/gcc/lra-eliminations.c
>>>> index 21d8d5f8018..566cc2c8248 100644
>>>> --- a/gcc/lra-eliminations.c
>>>> +++ b/gcc/lra-eliminations.c
>>>> @@ -1180,6 +1180,16 @@ spill_pseudos (HARD_REG_SET set)
>>>> bitmap_clear (&to_process);
>>>>   }
>>>>   +static void set_ptr_hard_reg_bits (HARD_REG_SET *hard_reg_set, int r)
>>>> +{
>>>> +  int w;
>>>> +
>>>> +  for (w = 0; w < GET_MODE_SIZE (Pmode); w += UNITS_PER_WORD, r++)
>>>> +{
>>>> +  SET_HARD_REG_BIT (*hard_reg_set, r);
>>>> +}
>>>> +}
>>>> +
>>>
>>> The patch itself is ok but for uniformity I'd use
>>>
>>> for (int i = hard_regno_nregs (r, Pmode) - 1; i >= 0; i--)
>>>   SET_HARD_REG_BIT (*hard_reg_set, r + i);
>>
>> I'm a bit surprised we don't already have a utility function to do this.
>> Hmmm
>>
>> add_to_hard_reg_set (hard_reg_set, Pmode, r)
>>
>> So instead LRA ought to be using that function in the places where calls
>> to set_ptr_hard_reg_bits were introduced.
>>
>> Dimitar, can you verify that change works?
> 
> Thank you. I'll test it and will update the patch.
> 
> 
> The SET_HARD_REG_BIT call in check_pseudos_live_through_calls also seems 
> suspicious to me. I'll try to come up with a regression test case to justify 
> its upgrade to add_to_hard_reg_set().
I wouldn't be surprised if there are others, particularly WRT Pmode.
Targets where Pmode != word_mode are relatively rare and those that
exist aren't extensively tested.

Jeff
> 
> Regards,
> Dimitar
>
