Re: [PATCH,RFC 0/3] Support for CTF in GCC

2019-05-29 Thread Richard Biener
On Mon, May 27, 2019 at 8:12 PM Indu Bhagat  wrote:
>
> Hi Michael,
>
> On 05/24/2019 06:04 AM, Michael Matz wrote:
> > Hello,
> >
> > On Thu, 23 May 2019, Indu Bhagat wrote:
> >
> >>> OK.  So I wonder how difficult it is to emit CTF by walking dwarf2out's own
> >>> data structures?  That is, in my view CTF should be emitted by
> >>> dwarf2out_early_finish ()  (which is also the point LTO type/decl debug
> >>> is generated from).  It would be nice to avoid extra bookkeeping data
> >>> structures
> >>> for CTF since those of DWARF should include all necessary information
> >>> already.
> >> CTF format has some characteristics which make it necessary to "pre-process"
> >> the generated CTF data before asm'ing out into a section. E.g. a few cases of
> >> why "pre-processing" CTF is required before asm'ing out:
> >>   1. CTF types do need to be emitted in "some" order:
> >>      CTF types can have references to other CTF types. This consequently
> >>      implies that the referenced type must appear BEFORE the referring type.
> >>   2. CTF preamble holds offsets to the various subsections - function info,
> >>      variables, data types and CTF string table. To calculate the offsets,
> >>      the compiler needs to know the size in bytes of these sub-sections.  CTF
> >>      representation for some types like structures, functions, enums have
> >>      variable length of bytes trailing them (depending on the definition of
> >>      the type).
> >>   3. CTF variable entries need to be placed in the order of the names.
> >>
> >> Because of some of these "features" of the CTF format, the compiler does
> >> need to do a transition from runtime CTF generated data --> CTF binary
> >> data format for a clean and readable code.
> > Sure, but this whole process could still be triggered from within
> > dwarf2out_early_finish, by walking the DWARF tree (instead of getting fed
> > types and decls via hooks) and generating the appropriate CTF data
> > structures.  (It's just a possibility, it might end up uglier than using
> > GCC trees)
>
> I think not only is the code messier, but it's also wasted effort if the user only
> wants to generate CTF.
>
> > Imagine a world where debug hooks wouldn't exist (which is where we would
> > like to end up in a far away future), how would you then add CTF debug
> > info to the compiler (assuming it already emits DWARF)?  You would hook
> > yourself either into the DWARF routines that currently are fed the
> > entities or you would hook yourself into somewhere late in the pipeline
> > where the DWARF debug info is complete and you would generate CTF from
> > that.
> >
> >> So, I think the needs are different enough to vouch for an implementation
> >> segregated from dwarf* codebase.
> > Of course.  We are merely discussing where the triggering of processing
> > starts: debug hooks, or something like:
> >
> > dwarf2out_early_finish() {
> >   ...
> >   if (ctf)
> >     ctf_emit();
> > }
> >
> > (and then in addition if the current DWARF info would be the source of CTF
> > info, or if it'd be whatever the compiler gives you as trees)
> >
> > The thing is, with debug hooks you'd have to invent a scheme of stacking
> > hooks on top of each other (because we want to generate DWARF and CTF from
> > the same compilation).  That seems like a wasted effort when our wish is
> > for the hooks to go away altogether.
> >
> When the debug hooks go away, the functionality can be folded in. Much like
> above, the ctf proposed implementation will do :
>
> ctf_early_global_decl (tree decl)
> {
>   ctf_decl (decl);
>
>   real_debug_hooks->early_global_decl (decl);
> }
>
> These ctf_* debug hooks wrappers are as lean as shown above.
>
> I do understand now that if debug hooks are destined to go away, all the
> implementation which wraps debug hooks (go dump hooks, vms debug hooks,
> and now the proposed ctf debug hooks) will need some merging. But to generate
> CTF, I think working on type or decl instead of DWARF DIEs is a better
> implementation because if the user wants only CTF, no DWARF trees need to be
> created.
>
> This way we keep DWARF and CTF generation independent of each other (as the
> user may want either one of these or both).

The user currently can't have both DWARF and STABS either.  That things like
godump use debug hooks is just (convenient?) abuse.

In the end frontends will not call something like dwarf2out_decl but maybe
gen_subroutine_die () or gen_template_die ().  So how do you expect
the "wrapping" to work there?

I understand you want CTF for "actually emitted" decls so I propose you
instead hook into the symtab code which would end up calling the
early_global_decl debug hook.  But please don't add new debug hook
users.

Richard.

> > Ciao,
> > Michael.
>
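
For readers following the hook-stacking argument, here is a minimal sketch of the wrapping scheme Indu describes, written as plain C outside of GCC.  Everything in it is a hypothetical stand-in: decl_t replaces tree, struct debug_hooks_sketch is a one-entry reduction of the real hook table, and the counters only exist so the call order can be observed.  It is not the proposed implementation, just an illustration of why a wrapper has to remember and forward to the "real" hooks.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for GCC's tree type.  */
typedef struct decl { const char *name; } decl_t;

/* Hypothetical, heavily reduced hook table with a single entry.  */
struct debug_hooks_sketch
{
  void (*early_global_decl) (decl_t *);
};

/* Counters so the effect of each consumer is observable.  */
static int dwarf_seen, ctf_seen;

/* Stand-in for the DWARF consumer of a declaration.  */
static void
dwarf_early_global_decl (decl_t *d)
{
  (void) d;
  dwarf_seen++;
}

static const struct debug_hooks_sketch dwarf_hooks
  = { dwarf_early_global_decl };

/* The hooks being wrapped; the wrapper forwards to these.  */
static const struct debug_hooks_sketch *real_hooks = &dwarf_hooks;

/* Stand-in for the CTF consumer of a declaration.  */
static void
ctf_decl (decl_t *d)
{
  (void) d;
  ctf_seen++;
}

/* The wrapper: record CTF data first, then forward to the real hook.  */
static void
ctf_early_global_decl (decl_t *d)
{
  assert (real_hooks != NULL && real_hooks->early_global_decl != NULL);
  ctf_decl (d);
  real_hooks->early_global_decl (d);
}

static const struct debug_hooks_sketch ctf_hooks
  = { ctf_early_global_decl };

/* Feed one declaration through the stacked hooks and return the total
   number of consumer invocations so far (DWARF plus CTF).  */
int
emit_one (const char *name)
{
  decl_t d = { name };
  ctf_hooks.early_global_decl (&d);
  return dwarf_seen + ctf_seen;
}
```

Each call to emit_one reaches both consumers exactly once.  Richard's objection is visible in the sketch as well: the scheme only works for entry points that go through the hook table, not for frontends that call DIE-generating routines directly.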


Re: Simplify more EXACT_DIV_EXPR comparisons

2019-05-29 Thread Richard Biener
On Tue, May 28, 2019 at 5:34 PM Martin Sebor  wrote:
>
> On 5/21/19 3:53 AM, Richard Biener wrote:
> > On Tue, May 21, 2019 at 4:13 AM Martin Sebor  wrote:
> >>
> >> On 5/20/19 3:16 AM, Richard Biener wrote:
> >>> On Mon, May 20, 2019 at 10:16 AM Marc Glisse  wrote:
> 
>  On Mon, 20 May 2019, Richard Biener wrote:
> 
> > On Sun, May 19, 2019 at 6:16 PM Marc Glisse  wrote:
> >>
> >> Hello,
> >>
> >> 2 pieces:
> >>
> >> - the first one handles the case where the denominator is negative. It
> >> doesn't happen often with exact_div, so I don't handle it everywhere, but
> >> this one looked trivial
> >>
> >> - handle the case where a pointer difference is cast to an unsigned type
> >> before being compared to a constant (I hit this in std::vector). With some
> >> range info we could probably handle some non-constant cases as well...
> >>
> >> The second piece breaks Walloca-13.c (-Walloca-larger-than=100 -O2)
> >>
> >> void f (void*);
> >> void g (int *p, int *q)
> >> {
> >>  __SIZE_TYPE__ n = (__SIZE_TYPE__)(p - q);
> >>  if (n < 100)
> >>f (__builtin_alloca (n));
> >> }
> >>
> >> At the time of walloca2, we have
> >>
> >>  _1 = p_5(D) - q_6(D);
> >>  # RANGE [-2305843009213693952, 2305843009213693951]
> >>  _2 = _1 /[ex] 4;
> >>  # RANGE ~[2305843009213693952, 16140901064495857663]
> >>  n_7 = (long unsigned intD.10) _2;
> >>  _11 = (long unsigned intD.10) _1;
> >>  if (_11 <= 396)
> >> [...]
> >>  _3 = allocaD.1059 (n_7);
> >>
> >> and warn.
> >
> > That's indeed too complicated a relation of _11 to n_7 for
> > VRP predicate discovery.
> >
> >> However, DOM3 later produces
> >>
> >>  _1 = p_5(D) - q_6(D);
> >>  _11 = (long unsigned intD.10) _1;
> >>  if (_11 <= 396)
> >
> > while _11 vs. _1 works fine.
> >
> >> [...]
> >>  # RANGE [0, 99] NONZERO 127
> >>  _2 = _1 /[ex] 4;
> >>  # RANGE [0, 99] NONZERO 127
> >>  n_7 = (long unsigned intD.10) _2;
> >>  _3 = allocaD.1059 (n_7);
> >>
> >> so I am tempted to say that the walloca2 pass is too early, xfail the
> >> testcase and file an issue...
> >
> > Hmm, there's a DOM pass before walloca2 already and moving
> > walloca2 after loop opts doesn't look like the best thing to do?
> > I suppose it's not DOM but sinking that does the important transform
> > here?  That is,
> >
> > Index: gcc/passes.def
> > ===
> > --- gcc/passes.def  (revision 271395)
> > +++ gcc/passes.def  (working copy)
> > @@ -241,9 +241,9 @@ along with GCC; see the file COPYING3.
> > NEXT_PASS (pass_optimize_bswap);
> > NEXT_PASS (pass_laddress);
> > NEXT_PASS (pass_lim);
> > -  NEXT_PASS (pass_walloca, false);
> > NEXT_PASS (pass_pre);
> > NEXT_PASS (pass_sink_code);
> > +  NEXT_PASS (pass_walloca, false);
> > NEXT_PASS (pass_sancov);
> > NEXT_PASS (pass_asan);
> > NEXT_PASS (pass_tsan);
> >
> > fixes it?
> 
>  I will check, but I don't think walloca uses any kind of on-demand VRP, so
>  we still need some pass to update the ranges after sinking, which doesn't
>  seem to happen until the next DOM pass.
> >>>
> >>> Oh, ok...  Aldy, why's this a separate pass anyways?  I think similar
> >>> other warnings are emitted from RTL expansion?  So maybe we can
> >>> indeed move the pass towards warn_restrict or late_warn_uninit.
> >>
> >> I thought there was a preference to add new middle-end warnings
> >> into passes of their own rather than into existing passes.  Is
> >> that not so (either in general or in this specific case)?
> >
> > The preference was to add them not into optimization passes.  But
> > of course having 10+ warning passes, each going over the whole IL
> > is excessive.  Also each of them computes ranges locally or so.
> >
> > Given the simplicity of Walloca I wonder why it's not part of another
> > warning pass - since it's about tracking "sizes" again there are plenty
> > that fit ;)
>
> -Walloca doesn't need to track object sizes in the same sense
> as objsize and strlen do.  It just examines calls to allocation
> functions, same as -Walloc-larger-than.  It would make sense to
> merge the implementation of the two warnings.  They don't need to
> run as a pass of their own.
>
> >>   From my POV, the main (only?) benefit of putting warnings in their
> >> own passes is modularity.  Are there any others?
> >>
> >> The biggest drawback I see is that it makes it hard to then share
> >> data across multiple passes.  The sharing can help not just
> >> warnings (reduce both false positiv
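
The second piece of Marc's patch can be illustrated outside the compiler.  The sketch below assumes 4-byte elements, as in the Walloca-13.c dump, and models the pointer difference in bytes as a long that is an exact multiple of 4.  It compares the original form (exact division, cast to unsigned, compare with 100) against the folded form (cast the byte difference, compare with 396).  The function names are made up for this note, and the code checks the equivalence by brute force rather than reproducing the match.pd pattern.

```c
#include <assert.h>
#include <stdbool.h>

/* Original form: divide the byte difference exactly by 4, cast to
   unsigned, then compare against the element count.  */
bool
cmp_after_div (long diff_bytes)
{
  assert (diff_bytes % 4 == 0);	/* EXACT_DIV_EXPR precondition.  */
  return (unsigned long) (diff_bytes / 4) < 100;
}

/* Folded form: compare the byte difference directly, as in the
   "_11 <= 396" test from the dump.  */
bool
cmp_folded (long diff_bytes)
{
  return (unsigned long) diff_bytes <= 396;
}

/* Return true if both forms agree on every multiple of 4 in [lo, hi].
   LO must itself be a multiple of 4.  */
bool
forms_agree (long lo, long hi)
{
  for (long d = lo; d <= hi; d += 4)
    if (cmp_after_div (d) != cmp_folded (d))
      return false;
  return true;
}
```

Negative differences become very large unsigned values in both forms, so both comparisons reject them; that is why the fold is valid without extra range information.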

Patch ping (was Re: [PATCH] Assorted optc-save-gen.awk fixes (PR bootstrap/90543))

2019-05-29 Thread Jakub Jelinek
On Wed, May 22, 2019 at 01:41:56PM +0200, Jakub Jelinek wrote:
> 2019-05-22  Jakub Jelinek  
> 
>   PR bootstrap/90543
>   * optc-save-gen.awk: In cl_optimization_print, use correct condition
>   for var_opt_string printing.  In cl_optimization_print_diff, print
>   (null) instead of invoking undefined behavior if one of the
>   var_opt_string pointers is NULL and use && instead of first || in the
>   guarding condition.  For var_target_other options, handle const char *
>   target variables similarly to const char * optimize node variables.

I'd like to ping this patch.

Thanks.

Jakub
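
As a rough illustration of the cl_optimization_print_diff part of the ChangeLog, here is a hand-written sketch of a NULL-safe string option comparison.  The helper name and the output format are assumptions made for this note and have not been checked against the code that optc-save-gen.awk actually generates; the point is only the shape of the guard (pointer inequality && (either NULL || strcmp differs)) and the "(null)" fallback that avoids passing NULL to %s.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: print "name (a/b)" when the two string options
   differ, substituting "(null)" for NULL pointers.  Returns 1 if a
   difference was printed, 0 otherwise.  */
int
print_string_option_diff (FILE *file, const char *name,
			  const char *a, const char *b)
{
  assert (file != NULL && name != NULL);

  /* Pointers differ AND (one side is NULL OR the contents differ).
     With "||" in place of "&&", the right-hand side would also be
     evaluated when both pointers are equal non-NULL strings, and two
     NULL pointers would be reported as a difference.  */
  if (a != b && (!a || !b || strcmp (a, b) != 0))
    {
      fprintf (file, "%s (%s/%s)\n", name,
	       a ? a : "(null)", b ? b : "(null)");
      return 1;
    }
  return 0;
}
```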


[committed] Add support for lastprivate conditional on combined parallel {for,sections} constructs

2019-05-29 Thread Jakub Jelinek
Hi!

All that is needed is to move the lastprivate(conditional:) clauses to the
worksharing construct, firstprivate too if the same decl has them, and
add a shared clause on the parallel if needed.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2019-05-29  Jakub Jelinek  

* gimplify.c (struct gimplify_omp_ctx): Add clauses member.
(gimplify_scan_omp_clauses): Initialize ctx->clauses.
(gimplify_adjust_omp_clauses_1): Transform lastprivate conditional
explicit clause on combined parallel into implicit shared clause.
(gimplify_adjust_omp_clauses): Move lastprivate conditional clause
and firstprivate if the decl has one too from combined parallel to
the worksharing construct.
gcc/testsuite/
* c-c++-common/gomp/lastprivate-conditional-2.c (foo): Don't expect
sorry on lastprivate conditional on parallel for.
* c-c++-common/gomp/lastprivate-conditional-3.c (foo): Add tests for
lastprivate conditional warnings on parallel for constructs.
* c-c++-common/gomp/lastprivate-conditional-4.c: New test.
libgomp/
* testsuite/libgomp.c-c++-common/lastprivate_conditional_4.c: Rename
to ...
* testsuite/libgomp.c-c++-common/lastprivate-conditional-4.c: ... this.
* testsuite/libgomp.c-c++-common/lastprivate-conditional-5.c: New test.
* testsuite/libgomp.c-c++-common/lastprivate-conditional-6.c: New test.

--- gcc/gimplify.c.jj   2019-05-27 23:32:56.0 +0200
+++ gcc/gimplify.c  2019-05-28 15:19:30.862437710 +0200
@@ -205,6 +205,7 @@ struct gimplify_omp_ctx
   struct gimplify_omp_ctx *outer_context;
   splay_tree variables;
   hash_set *privatized_types;
+  tree clauses;
   /* Iteration variables in an OMP_FOR.  */
   vec loop_iter_var;
   location_t location;
@@ -8054,7 +8055,7 @@ gimplify_scan_omp_clauses (tree *list_p,
   struct gimplify_omp_ctx *ctx, *outer_ctx;
   tree c;
   hash_map *struct_map_to_clause = NULL;
-  tree *prev_list_p = NULL;
+  tree *prev_list_p = NULL, *orig_list_p = list_p;
   int handled_depend_iterators = -1;
   int nowait = -1;
 
@@ -8143,7 +8144,9 @@ gimplify_scan_omp_clauses (tree *list_p,
}
  if (OMP_CLAUSE_LASTPRIVATE_CONDITIONAL (c))
{
- if (code == OMP_FOR || code == OMP_SECTIONS)
+ if (code == OMP_FOR
+ || code == OMP_SECTIONS
+ || region_type == ORT_COMBINED_PARALLEL)
flags |= GOVD_LASTPRIVATE_CONDITIONAL;
  else
{
@@ -9312,6 +9315,7 @@ gimplify_scan_omp_clauses (tree *list_p,
list_p = &OMP_CLAUSE_CHAIN (c);
 }
 
+  ctx->clauses = *orig_list_p;
   gimplify_omp_ctxp = ctx;
   if (struct_map_to_clause)
 delete struct_map_to_clause;
@@ -9455,6 +9459,9 @@ gimplify_adjust_omp_clauses_1 (splay_tre
   tree clause;
   bool private_debug;
 
+  if (gimplify_omp_ctxp->region_type == ORT_COMBINED_PARALLEL
+  && (flags & GOVD_LASTPRIVATE_CONDITIONAL) != 0)
+flags = GOVD_SHARED | GOVD_SEEN | GOVD_WRITTEN;
   if (flags & (GOVD_EXPLICIT | GOVD_LOCAL))
 return 0;
   if ((flags & GOVD_SEEN) == 0)
@@ -9697,6 +9704,34 @@ gimplify_adjust_omp_clauses (gimple_seq
   omp_find_stores_op, &wi);
}
 }
+
+  if (ctx->region_type == ORT_WORKSHARE
+  && ctx->outer_context
+  && ctx->outer_context->region_type == ORT_COMBINED_PARALLEL)
+{
+  for (c = ctx->outer_context->clauses; c; c = OMP_CLAUSE_CHAIN (c))
+   if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_LASTPRIVATE
+   && OMP_CLAUSE_LASTPRIVATE_CONDITIONAL (c))
+ {
+   decl = OMP_CLAUSE_DECL (c);
+   splay_tree_node n
+ = splay_tree_lookup (ctx->outer_context->variables,
+  (splay_tree_key) decl);
+   gcc_checking_assert (!splay_tree_lookup (ctx->variables,
+(splay_tree_key) decl));
+   omp_add_variable (ctx, decl, n->value);
+   tree c2 = copy_node (c);
+   OMP_CLAUSE_CHAIN (c2) = *list_p;
+   *list_p = c2;
+   if ((n->value & GOVD_FIRSTPRIVATE) == 0)
+ continue;
+   c2 = build_omp_clause (OMP_CLAUSE_LOCATION (c),
+  OMP_CLAUSE_FIRSTPRIVATE);
+   OMP_CLAUSE_DECL (c2) = decl;
+   OMP_CLAUSE_CHAIN (c2) = *list_p;
+   *list_p = c2;
+ }
+}
   while ((c = *list_p) != NULL)
 {
   splay_tree_node n;
@@ -9723,6 +9758,10 @@ gimplify_adjust_omp_clauses (gimple_seq
  decl = OMP_CLAUSE_DECL (c);
  n = splay_tree_lookup (ctx->variables, (splay_tree_key) decl);
  remove = !(n->value & GOVD_SEEN);
+ if ((n->value & GOVD_LASTPRIVATE_CONDITIONAL) != 0
+ && code == OMP_PARALLEL
+ && OMP_CLAUSE_CODE (c) == OMP_CLAUSE_FIRSTPRIVATE)
+   remove = true;
  if (! remove)
{

[PATCH] Further C lapack workaround tweaks

2019-05-29 Thread Jakub Jelinek
Hi!

As I said earlier in the PR, I don't like -fbroken-callers option much,
as the option name doesn't hint what it is doing at all.

The following patch renames the option and makes it into a 3 state option,
with the default being a middle-ground, where it avoids the tail calls in
functions that have the hidden character length arguments only if it makes
any implicit interface calls.  The rationale for that is that there were no
previously reported issues with older GCC versions and the change that
affected the broken C/C++ wrappers was just giving prototypes to the
implicit interface procedure calls, so presumably in functions that don't
have any such calls nothing should have changed.

Bootstrapped/regtested on x86_64-linux and i686-linux, additionally tested
on the dtrtri.f testcase and on dtrtri.f testcase patched to include
explicit interfaces for all called procedures (and for those two verified
all the 6 ways of using these options, default, positive/negative option
without = and 0/1/2 values of the = option, checking in which case there is
a tail call), ok for trunk?
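
To make the three states concrete, the condition the patch adds in create_function_arglist can be restated as a standalone predicate.  This is only a sketch: the function and parameter names below are invented for illustration, with workaround_level standing in for flag_tail_call_workaround and the two booleans standing in for the constant character length test and for sym->ns->implicit_interface_calls.

```c
#include <assert.h>
#include <stdbool.h>

/* Return true if the hidden string length argument should be marked
   (which in turn disables tail calls in that function).
     level 0: never apply the workaround;
     level 1: apply it only if the function makes implicit interface
	      calls (the default);
     level 2: apply it whenever a constant character length is present.  */
bool
mark_hidden_string_length (int workaround_level,
			   bool has_constant_char_length,
			   bool has_implicit_interface_calls)
{
  assert (workaround_level >= 0 && workaround_level <= 2);
  return workaround_level != 0
	 && has_constant_char_length
	 && (workaround_level == 2 || has_implicit_interface_calls);
}
```

With the default level 1, a function that has hidden character length arguments but makes no implicit interface calls keeps its tail calls, which is the middle ground described above.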

2019-05-29  Jakub Jelinek  

PR fortran/90329
* lang.opt (fbroken-callers): Remove.
(ftail-call-workaround, ftail-call-workaround=): New options.
* gfortran.h (struct gfc_namespace): Add implicit_interface_calls.
* interface.c (gfc_procedure_use): Set implicit_interface_calls
for calls to implicit interface procedures.
* trans-decl.c (create_function_arglist): Use flag_tail_call_workaround
instead of flag_broken_callers.  If it is not 2, also require
sym->ns->implicit_interface_calls.
* invoke.texi (fbroken-callers): Remove documentation.
(ftail-call-workaround, ftail-call-workaround=): Document.

--- gcc/fortran/lang.opt.jj 2019-05-23 12:57:17.759475698 +0200
+++ gcc/fortran/lang.opt 2019-05-28 19:28:39.111359365 +0200
@@ -397,10 +397,6 @@ fblas-matmul-limit=
 Fortran RejectNegative Joined UInteger Var(flag_blas_matmul_limit) Init(30)
 -fblas-matmul-limit=Size of the smallest matrix for which matmul will use BLAS.
 
-fbroken-callers
-Fortran Var(flag_broken_callers) Init(1)
-Disallow tail call optimization when a calling routine may have omitted character lengths.
-
 fcheck-array-temporaries
 Fortran
 Produce a warning at runtime if a array temporary has been created for a procedure argument.
@@ -766,6 +762,13 @@ fsign-zero
 Fortran Var(flag_sign_zero) Init(1)
 Apply negative sign to zero values.
 
+ftail-call-workaround
+Fortran Alias(ftail-call-workaround=,1,0)
+
+ftail-call-workaround=
+Fortran RejectNegative Joined UInteger IntegerRange(0, 2) Var(flag_tail_call_workaround) Init(1)
+Disallow tail call optimization when a calling routine may have omitted character lengths.
+
 funderscoring
 Fortran Var(flag_underscoring) Init(1)
 Append underscores to externally visible names.
--- gcc/fortran/gfortran.h.jj   2019-05-10 09:31:29.548146009 +0200
+++ gcc/fortran/gfortran.h  2019-05-28 19:18:17.776456590 +0200
@@ -1866,6 +1866,9 @@ typedef struct gfc_namespace
 
   /* Set to 1 for !$ACC ROUTINE namespaces.  */
   unsigned oacc_routine:1;
+
+  /* Set to 1 if there are any calls to procedures with implicit interface.  */
+  unsigned implicit_interface_calls:1;
 }
 gfc_namespace;
 
--- gcc/fortran/interface.c.jj  2019-05-10 23:20:36.072731478 +0200
+++ gcc/fortran/interface.c 2019-05-28 19:23:05.551779996 +0200
@@ -3686,6 +3686,7 @@ gfc_procedure_use (gfc_symbol *sym, gfc_
gfc_warning (OPT_Wimplicit_procedure,
 "Procedure %qs called at %L is not explicitly declared",
 sym->name, where);
+  gfc_find_proc_namespace (sym->ns)->implicit_interface_calls = 1;
 }
 
   if (sym->attr.if_source == IFSRC_UNKNOWN)
--- gcc/fortran/trans-decl.c.jj 2019-05-20 11:39:33.392827300 +0200
+++ gcc/fortran/trans-decl.c 2019-05-28 19:33:25.847699650 +0200
@@ -2520,9 +2520,12 @@ create_function_arglist (gfc_symbol * sy
  /* Marking the length DECL_HIDDEN_STRING_LENGTH will lead
 to tail calls being disabled.  Only do that if we
 potentially have broken callers.  */
- if (flag_broken_callers && f->sym->ts.u.cl
+ if (flag_tail_call_workaround
+ && f->sym->ts.u.cl
  && f->sym->ts.u.cl->length
- && f->sym->ts.u.cl->length->expr_type == EXPR_CONSTANT)
+ && f->sym->ts.u.cl->length->expr_type == EXPR_CONSTANT
+ && (flag_tail_call_workaround == 2
+ || f->sym->ns->implicit_interface_calls))
DECL_HIDDEN_STRING_LENGTH (length) = 1;
 
  /* Remember the passed value.  */
--- gcc/fortran/invoke.texi.jj  2019-05-23 12:57:17.761475665 +0200
+++ gcc/fortran/invoke.texi 2019-05-28 19:45:59.752653839 +0200
@@ -181,7 +181,8 @@ and warnings}.
 @item Code Generation Options
 @xref{Code Gen Options,,Options for code generation conventions}.
 @gccoptlist{-faggres

Re: [PATCH 3/3][GCC][AARCH64] Add support for pointer authentication B key

2019-05-29 Thread Sam Tebbs
The libgcc changes have been acknowledged off-list. Committed as r271735.

On 01/03/2019 14:12, Sam Tebbs wrote:
> On 31/01/2019 14:54, Sam Tebbs wrote:
>> 
>>> ping 3. The preceding two patches were committed a while ago but require
>>> the minor libgcc changes in this patch, which are the only parts left to
>>> be reviewed.
>> ping 4
> Attached is a rebased patch made to work on top of Sudi Das' BTI patch
> (by renaming UNSPEC_PACISP to UNSPEC_PACIASP and UNSPEC_PACIBSP in
> aarch64-bti-insert.c). The updated changelog is below.
>
> Are the libgcc changes OK for trunk?
>
> gcc/
> 2019-03-01  Sam Tebbs
>
>   * config/aarch64/aarch64-builtins.c (aarch64_builtins): Add
>   AARCH64_PAUTH_BUILTIN_AUTIB1716 and AARCH64_PAUTH_BUILTIN_PACIB1716.
>   * config/aarch64/aarch64-builtins.c (aarch64_init_pauth_hint_builtins):
>   Add autib1716 and pacib1716 initialisation.
>   * config/aarch64/aarch64-builtins.c (aarch64_expand_builtin): Add checks
>   for autib1716 and pacib1716.
>   * config/aarch64/aarch64-protos.h (aarch64_key_type,
>   aarch64_post_cfi_startproc): Define.
>   * config/aarch64/aarch64-protos.h (aarch64_ra_sign_key): Define extern.
>   * config/aarch64/aarch64.c (aarch64_handle_standard_branch_protection,
>   aarch64_handle_pac_ret_protection): Set default sign key to A.
>   * config/aarch64/aarch64.c (aarch64_expand_epilogue,
>   aarch64_expand_prologue): Add check for b-key.
>   * config/aarch64/aarch64.c (aarch64_ra_sign_key,
>   aarch64_post_cfi_startproc, aarch64_handle_pac_ret_b_key): Define.
>   * config/aarch64/aarch64.h (TARGET_ASM_POST_CFI_STARTPROC): Define.
>   * config/aarch64/aarch64.c (aarch64_pac_ret_subtypes): Add "b-key".
>   * config/aarch64/aarch64.md (unspec): Add UNSPEC_AUTIA1716,
>   UNSPEC_AUTIB1716, UNSPEC_AUTIASP, UNSPEC_AUTIBSP, UNSPEC_PACIA1716,
>   UNSPEC_PACIB1716, UNSPEC_PACIASP, UNSPEC_PACIBSP.
>   * config/aarch64/aarch64.md (do_return): Add check for b-key.
>   * config/aarch64/aarch64.md (sp): Replace
>   pauth_hint_num_a with pauth_hint_num.
>   * config/aarch64/aarch64.md (1716): Replace
>   pauth_hint_num_a with pauth_hint_num.
>   * config/aarch64/aarch64.opt (msign-return-address=): Deprecate.
>   * config/aarch64/iterators.md (PAUTH_LR_SP): Add UNSPEC_AUTIASP,
>   UNSPEC_AUTIBSP, UNSPEC_PACIASP, UNSPEC_PACIBSP.
>   * config/aarch64/iterators.md (PAUTH_17_16): Add UNSPEC_AUTIA1716,
>   UNSPEC_AUTIB1716, UNSPEC_PACIA1716, UNSPEC_PACIB1716.
>   * config/aarch64/iterators.md (pauth_mnem_prefix): Add UNSPEC_AUTIA1716,
>   UNSPEC_AUTIB1716, UNSPEC_PACIA1716, UNSPEC_PACIB1716, UNSPEC_AUTIASP,
>   UNSPEC_AUTIBSP, UNSPEC_PACIASP, UNSPEC_PACIBSP.
>   * config/aarch64/iterators.md (pauth_hint_num_a): Replace
>   UNSPEC_PACI1716 and UNSPEC_AUTI1716 with UNSPEC_PACIA1716 and
>   UNSPEC_AUTIA1716 respectively.
>   * config/aarch64/iterators.md (pauth_hint_num_a): Rename to pauth_hint_num
>   and add UNSPEC_PACIBSP, UNSPEC_AUTIBSP, UNSPEC_PACIB1716, UNSPEC_AUTIB1716.
>   * doc/invoke.texi (-mbranch-protection): Add b-key type.
>   * config/aarch64/aarch64-bti-insert.c (aarch64_pac_insn_p): Rename
>   UNSPEC_PACISP to UNSPEC_PACIASP and UNSPEC_PACIBSP.
>
> gcc/testsuite
> 2019-03-01  Sam Tebbs
>
>   * gcc.target/aarch64/return_address_sign_b_1.c: New file.
>   * gcc.target/aarch64/return_address_sign_b_2.c: New file.
>   * gcc.target/aarch64/return_address_sign_b_3.c: New file.
>   * gcc.target/aarch64/return_address_sign_b_exception.c: New file.
>   * gcc.target/aarch64/return_address_sign_ab_exception.c: New file.
>   * gcc.target/aarch64/return_address_sign_builtin.c: New file.
>
> libgcc/
> 2019-03-01  Sam Tebbs
>
>   * config/aarch64/aarch64-unwind.h (aarch64_cie_signed_with_b_key): New
>   function.
>   * config/aarch64/aarch64-unwind.h (aarch64_post_extract_frame_addr,
>   aarch64_post_frob_eh_handler_addr): Add check for b-key.
>   * config/aarch64/aarch64-unwind.h (aarch64_post_extract_frame_addr,
>   aarch64_post_frob_eh_handler_addr, aarch64_post_frob_update_context):
>   Rename RA_A_SIGNED_BIT to RA_SIGNED_BIT.
>   * unwind-dw2-fde.c (get_cie_encoding): Add check for 'B' in augmentation
>   string.
>   * unwind-dw2.c (extract_cie_info): Add check for 'B' in augmentation
>   string.
>   (RA_A_SIGNED_BIT): Rename to RA_SIGNED_BIT.
>


Re: Patch ping (was Re: [PATCH] Assorted optc-save-gen.awk fixes (PR bootstrap/90543))

2019-05-29 Thread Richard Biener
On Wed, 29 May 2019, Jakub Jelinek wrote:

> On Wed, May 22, 2019 at 01:41:56PM +0200, Jakub Jelinek wrote:
> > 2019-05-22  Jakub Jelinek  
> > 
> > PR bootstrap/90543
> > * optc-save-gen.awk: In cl_optimization_print, use correct condition
> > for var_opt_string printing.  In cl_optimization_print_diff, print
> > (null) instead of invoking undefined behavior if one of the
> > var_opt_string pointers is NULL and use && instead of first || in the
> > guarding condition.  For var_target_other options, handle const char *
> > target variables similarly to const char * optimize node variables.
> 
> I'd like to ping this patch.

OK.

Richard.


[PATCH] gdbinit: add a new command and fix one

2019-05-29 Thread Martin Liška
Hi.

The patch makes a small change to the .gdbinit file.

Ready for trunk?
Martin

gcc/ChangeLog:

2019-05-29  Martin Liska  

* gdbinit.in: Fix 'ptc' command.  Add tt
that prints TREE_TYPE($).
---
 gcc/gdbinit.in | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)


diff --git a/gcc/gdbinit.in b/gcc/gdbinit.in
index e16c3c8ef87..81c65d2e7b4 100644
--- a/gcc/gdbinit.in
+++ b/gcc/gdbinit.in
@@ -113,7 +113,7 @@ Works only when an inferior is executing.
 end
 
 define ptc
-output (enum tree_code) $.common.code
+output (enum tree_code) $.base.code
 echo \n
 end
 
@@ -201,6 +201,14 @@ document pcfun
 Print current function.
 end
 
+define tt
+print ($.typed.type)
+end
+
+document tt
+Print TREE_TYPE of the tree node that is $
+end
+
 define break-on-diagnostic
 break diagnostic_show_locus
 end



Re: Implement vector average patterns for SVE2

2019-05-29 Thread Richard Sandiford
Alejandro Martinez Vicente  writes:
> Hi,
>
> This patch implements the [u]avgM3_floor and [u]avgM3_ceil optabs for SVE2.
>
> Alejandro
>
> gcc/ChangeLog:
>
> 2019-05-28  Alejandro Martinez  
>
>   * config/aarch64/aarch64-sve2.md: New file.
>   (avg3_floor): New pattern.
>   (avg3_ceil): Likewise.
>   (*h): Likewise.
>   * config/aarch64/aarch64.md: Include aarch64-sve2.md.
>
>
> 2019-05-28  Alejandro Martinez  
>
> gcc/testsuite/
>   * gcc.target/aarch64/sve2/average_1.c: New test.
>   * lib/target-supports.exp (check_effective_target_aarch64_sve1_only):
>   New helper.
>   (check_effective_target_vect_avg_qi): Check for SVE1 only.

OK, thanks, but...

> diff --git gcc/testsuite/lib/target-supports.exp gcc/testsuite/lib/target-supports.exp
> index f69106d..41431e6 100644
> --- gcc/testsuite/lib/target-supports.exp
> +++ gcc/testsuite/lib/target-supports.exp
> @@ -3308,6 +3308,12 @@ proc check_effective_target_aarch64_sve2 { } {
>  }]
>  }
>  
> +# Return 1 if this is an AArch64 target only supporting SVE (not SVE2).
> +proc check_effective_target_aarch64_sve1_only { } {
> +return [expr { [check_effective_target_aarch64_sve]
> +&& ![check_effective_target_aarch64_sve2] }]
> +}

...it needs check_effective_target_aarch64_sve2 to go in first.

Richard
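
For reference, the scalar meaning of the two unsigned averaging optabs can be sketched in C for 8-bit elements: the "floor" variant computes (a + b) >> 1 and the "ceil" variant computes (a + b + 1) >> 1, in both cases as if the sum were formed in a wider type so that it cannot overflow.  The function names are made up for this note, and the narrow forms below are standard bit-twiddling identities included for comparison, not a description of what the SVE2 patterns expand to.

```c
#include <assert.h>
#include <stdint.h>

/* Reference semantics: form the sum in a wider type, then halve.  */
uint8_t
uavg_floor_u8 (uint8_t a, uint8_t b)
{
  return (uint8_t) (((unsigned) a + b) >> 1);
}

uint8_t
uavg_ceil_u8 (uint8_t a, uint8_t b)
{
  return (uint8_t) (((unsigned) a + b + 1) >> 1);
}

/* Equivalent forms that never leave the 8-bit value range.  */
uint8_t
uavg_floor_u8_narrow (uint8_t a, uint8_t b)
{
  return (uint8_t) ((a & b) + ((a ^ b) >> 1));
}

uint8_t
uavg_ceil_u8_narrow (uint8_t a, uint8_t b)
{
  return (uint8_t) ((a | b) - ((a ^ b) >> 1));
}

/* Exhaustively check that the narrow forms match the reference for
   every pair of 8-bit inputs.  Returns the number of pairs checked.  */
long
check_avg_identities (void)
{
  long checked = 0;
  for (unsigned a = 0; a < 256; a++)
    for (unsigned b = 0; b < 256; b++)
      {
	assert (uavg_floor_u8 ((uint8_t) a, (uint8_t) b)
		== uavg_floor_u8_narrow ((uint8_t) a, (uint8_t) b));
	assert (uavg_ceil_u8 ((uint8_t) a, (uint8_t) b)
		== uavg_ceil_u8_narrow ((uint8_t) a, (uint8_t) b));
	checked++;
      }
  return checked;
}
```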


Re: [PATCH v2] aarch64: emit .variant_pcs for aarch64_vector_pcs symbol references

2019-05-29 Thread Richard Sandiford
Szabolcs Nagy  writes:
> v2:
> - use aarch64_simd_decl_p to check for aarch64_vector_pcs.
> - emit the .variant_pcs directive even for local functions.
> - don't require .variant_pcs asm support in compile only tests.
> - add weakref tests.
>
> A dynamic linker with lazy binding support may need to handle vector PCS
> function symbols specially, so an ELF symbol table marking was
> introduced for such symbols.
>
> Function symbol references and definitions that follow the vector PCS
> are marked in the generated assembly with .variant_pcs and then the
> STO_AARCH64_VARIANT_PCS st_other flag is set on the symbol in the object
> file.  The marking is propagated to the dynamic symbol table by the
> static linker so a dynamic linker can handle such symbols specially.
>
> For this to work, the assembler, the static linker and the dynamic
> linker have to be updated on a system.  An old assembler does not support
> the new .variant_pcs directive, so a toolchain with old binutils won't
> be able to compile code that references vector PCS symbols.
>
> gcc/ChangeLog:
>
> 2019-05-28  Szabolcs Nagy  
>
>   * config/aarch64/aarch64-protos.h (aarch64_asm_output_alias): Declare.
>   (aarch64_asm_output_external): Declare.
>   * config/aarch64/aarch64.c (aarch64_asm_output_variant_pcs): New.
>   (aarch64_declare_function_name): Call aarch64_asm_output_variant_pcs.
>   (aarch64_asm_output_alias): New.
>   (aarch64_asm_output_external): New.
>   * config/aarch64/aarch64.h (ASM_OUTPUT_DEF_FROM_DECLS): Define.
>   (ASM_OUTPUT_EXTERNAL): Define.
>
> gcc/testsuite/ChangeLog:
>
> 2019-05-28  Szabolcs Nagy  
>
>   * gcc.target/aarch64/pcs_attribute-2.c: New test.
>   * gcc.target/aarch64/torture/simd-abi-4.c: Check .variant_pcs support.
>   * lib/target-supports.exp (check_effective_target_aarch64_variant_pcs):
>   New.

LGTM, but an AArch64 maintainer will need to approve.

> diff --git a/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-4.c b/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-4.c
> index e399690f364..80ebd955e10 100644
> --- a/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-4.c
> +++ b/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-4.c
> @@ -1,4 +1,5 @@
>  /* dg-do run */
> +/* { dg-require-effective-target aarch64_variant_pcs } */
>  /* { dg-additional-options "-std=c99" }  */

Not your problem of course, but mind fixing the dg-do markup while
you're there?  It should be

/* { dg-do run } */

instead.  As things stand, the test only gets compiled, not run.

Thanks,
Richard


Re: [PATCHv2] debug: make -feliminate-unused-debug-symbols the default [PR debug/86964]

2019-05-29 Thread Thomas De Schampheleire
Hi Richard,

On Tue, 21 May 2019 at 16:57, Thomas De Schampheleire
() wrote:
>
> From: Thomas De Schampheleire 
>
> In addition to making -feliminate-unused-debug-symbols work for the DWARF
> format (see [1]), make this option the default. This behavior was the case
> before, e.g. under gcc 4.9.x.
> [1] https://gcc.gnu.org/viewcvs/gcc?view=revision&revision=269925
>
> This change requires some updates to test cases, which expected the previous
> default of not eliminating unused debug symbols.
>
> gcc/ChangeLog:
>
> 2019-05-21  Thomas De Schampheleire  
>
> PR debug/86964
> * common.opt (feliminate-unused-debug-symbols): Enable by default.
> * doc/invoke.texi (Debugging Options): Document new default of
> -feliminate-unused-debug-symbols and remove restriction to 'stabs'.
>
> gcc/testsuite/ChangeLog:
>
> 2019-05-21  Thomas De Schampheleire  
>
> PR debug/86964
> * g++.dg/debug/dwarf2/fesd-any.C: Use
> -fno-eliminate-unused-debug-symbols.
> * g++.dg/debug/dwarf2/fesd-baseonly.C: Likewise.
> * g++.dg/debug/dwarf2/fesd-none.C: Likewise.
> * g++.dg/debug/dwarf2/fesd-reduced.C: Likewise.
> * g++.dg/debug/dwarf2/fesd-sys.C: Likewise.
> * g++.dg/debug/dwarf2/inline-var-1.C: Likewise.
> * g++.dg/debug/enum-2.C: Likewise.
> * gcc.dg/debug/dwarf2/fesd-any.c: Likewise.
> * gcc.dg/debug/dwarf2/fesd-baseonly.c: Likewise.
> * gcc.dg/debug/dwarf2/fesd-none.c: Likewise.
> * gcc.dg/debug/dwarf2/fesd-reduced.c: Likewise.
> * gcc.dg/debug/dwarf2/fesd-sys.c: Likewise.
> ---
>  gcc/common.opt| 2 +-
>  gcc/doc/invoke.texi   | 9 +
>  gcc/testsuite/g++.dg/debug/dwarf2/fesd-any.C  | 2 +-
>  gcc/testsuite/g++.dg/debug/dwarf2/fesd-baseonly.C | 2 +-
>  gcc/testsuite/g++.dg/debug/dwarf2/fesd-none.C | 2 +-
>  gcc/testsuite/g++.dg/debug/dwarf2/fesd-reduced.C  | 2 +-
>  gcc/testsuite/g++.dg/debug/dwarf2/fesd-sys.C  | 2 +-
>  gcc/testsuite/g++.dg/debug/dwarf2/inline-var-1.C  | 2 +-
>  gcc/testsuite/g++.dg/debug/enum-2.C   | 1 +
>  gcc/testsuite/gcc.dg/debug/dwarf2/fesd-any.c  | 2 +-
>  gcc/testsuite/gcc.dg/debug/dwarf2/fesd-baseonly.c | 2 +-
>  gcc/testsuite/gcc.dg/debug/dwarf2/fesd-none.c | 2 +-
>  gcc/testsuite/gcc.dg/debug/dwarf2/fesd-reduced.c  | 2 +-
>  gcc/testsuite/gcc.dg/debug/dwarf2/fesd-sys.c  | 2 +-
>  14 files changed, 18 insertions(+), 16 deletions(-)
>
> diff --git a/gcc/common.opt b/gcc/common.opt
> index d342c4f3749..0e72fd08ec4 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -1379,7 +1379,7 @@ Common Report Var(flag_ipa_sra) Init(0) Optimization
>  Perform interprocedural reduction of aggregates.
>
>  feliminate-unused-debug-symbols
> -Common Report Var(flag_debug_only_used_symbols)
> +Common Report Var(flag_debug_only_used_symbols) Init(1)
>  Perform unused symbol elimination in debug info.
>
>  feliminate-unused-debug-types
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 5e3e8873d35..06c8c60f19e 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -388,7 +388,7 @@ Objective-C and Objective-C++ Dialects}.
>  -fno-eliminate-unused-debug-types @gol
>  -femit-struct-debug-baseonly  -femit-struct-debug-reduced @gol
>  -femit-struct-debug-detailed@r{[}=@var{spec-list}@r{]} @gol
> --feliminate-unused-debug-symbols  -femit-class-debug-always @gol
> +-fno-eliminate-unused-debug-symbols  -femit-class-debug-always @gol
>  -fno-merge-debug-strings  -fno-dwarf2-cfi-asm @gol
>  -fvar-tracking  -fvar-tracking-assignments}
>
> @@ -7827,10 +7827,11 @@ confusion with @option{-gdwarf-@var{level}}.
>  Instead use an additional @option{-g@var{level}} option to change the
>  debug level for DWARF.
>
> -@item -feliminate-unused-debug-symbols
> +@item -fno-eliminate-unused-debug-symbols
>  @opindex feliminate-unused-debug-symbols
> -Produce debugging information in stabs format (if that is supported),
> -for only symbols that are actually used.
> +@opindex fno-eliminate-unused-debug-symbols
> +By default, no debug information is produced for symbols that are not actually
> +used. Use this option if you want debug information for all symbols.
>
>  @item -femit-class-debug-always
>  @opindex femit-class-debug-always
> diff --git a/gcc/testsuite/g++.dg/debug/dwarf2/fesd-any.C b/gcc/testsuite/g++.dg/debug/dwarf2/fesd-any.C
> index a4a0b50ee50..5868ebc9c85 100644
> --- a/gcc/testsuite/g++.dg/debug/dwarf2/fesd-any.C
> +++ b/gcc/testsuite/g++.dg/debug/dwarf2/fesd-any.C
> @@ -1,5 +1,5 @@
>  // { dg-do compile }
> -// { dg-options "-gdwarf-2 -dA -femit-struct-debug-detailed=any" }
> +// { dg-options "-gdwarf-2 -dA -femit-struct-debug-detailed=any -fno-eliminate-unused-debug-symbols" }
>  // { dg-final { scan-assembler "timespec.*DW_AT_name" } }
>  // { dg-final { scan-assembler "tv_sec.*DW_AT_name" } }
>  // { dg-final { scan-assembler "tv_nsec.*D

Re: [PATCH] Implement LWG 2686, hash

2019-05-29 Thread Szabolcs Nagy
On 09/05/2019 16:16, Jonathan Wakely wrote:
> On Thu, 9 May 2019 at 15:43, Szabolcs Nagy  wrote:
>> On 07/05/2019 13:21, Christophe Lyon wrote:
>>> On Tue, 7 May 2019 at 12:07, Jonathan Wakely  wrote:
 On 07/05/19 10:37 +0100, Jonathan Wakely wrote:
> On 07/05/19 11:05 +0200, Christophe Lyon wrote:
>> I'm seeing link failures on arm-eabi (using newlib):
>> Excess errors:
>> /libstdc++-v3/src/c++17/fs_ops.cc:806: undefined reference to `chdir'
>> /libstdc++-v3/src/c++17/fs_ops.cc:583: undefined reference to `mkdir'
>> /libstdc++-v3/src/c++17/fs_ops.cc:1134: undefined reference to `chmod'
>> /libstdc++-v3/src/c++17/../filesystem/ops-common.h:439: undefined
>> reference to `chmod'
>> /libstdc++-v3/src/c++17/fs_ops.cc:750: undefined reference to `pathconf'
>> /libstdc++-v3/src/c++17/fs_ops.cc:769: undefined reference to `getcwd'
>>
>> Christophe

 Is it definitely the new 19_diagnostics/error_condition/hash.cc test
 that's giving this error?
>>
>> i looked at this and ld -M reports
>>
>> /B/aarch64-none-elf/libstdc++-v3/src/.libs/libstdc++.a(system_error.o)
>> hash.o (std::_V2::error_category::default_error_condition(int) const)
>> /B/aarch64-none-elf/libstdc++-v3/src/.libs/libstdc++.a(fs_ops.o)
>> 
>> /B/aarch64-none-elf/libstdc++-v3/src/.libs/libstdc++.a(system_error.o) (void
>> std::__cxx11::basic_string<char, std::char_traits<char>,
>> std::allocator<char> >::_M_construct<char const*>(char const*, char const*,
>> std::forward_iterator_tag))
>> ...
>>
>> i.e. hash.o pulls system_error.o in because of
>>
>>   std::_V2::error_category::default_error_condition(int) const
>>
>> and system_error.o pulls fs_ops.o in because of
>>
>>   std::__cxx11::basic_string...
>>
>> symbol reference.
>>
>> it seems fs_ops.o is the first object in libstdc++.a
>> that provides a (weak) definition for
>>
>> _ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE12_M_constructIPKcEEvT_S8_St20forward_iterator_tag
> 
> Ah, so maybe we need an explicit instantiation elsewhere.
> Or completely disable all the stuff using chdir, mkdir etc for these
> newlib targets, which is probably a good idea anyway.

disabling fs things for baremetal makes sense.

but i would not expect system_error.o to depend
on fs_ops.o even on non-baremetal targets.

so whatever causes the dependency should be fixed
as well i think.

in this case the basic_string _M_construct instantiation
should have a definition either in system_error.o or in a .o
with minimal deps that is guaranteed to come before
fs_ops.o in libstdc++.a
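
One way to pin a template definition to a chosen object file is an explicit instantiation. The following is only a sketch of that technique: tiny_string, construct and measure are made-up names for illustration and are not libstdc++ code, and the split into "header" and "one source file" is marked in comments because the example is kept in a single file.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a library class template (not libstdc++ code).
template <typename CharT>
struct tiny_string {
  const CharT* first = nullptr;
  std::size_t len = 0;

  template <typename Iter>
  void construct(Iter b, Iter e);
};

// Defined out of class so that it is not implicitly inline.
template <typename CharT>
template <typename Iter>
void tiny_string<CharT>::construct(Iter b, Iter e) {
  first = b;
  len = static_cast<std::size_t>(e - b);
}

// Would live in the header: users must not instantiate this implicitly.
extern template void tiny_string<char>::construct<const char*>(const char*,
                                                               const char*);

// Would live in exactly one small source file: the only definition emitted.
template void tiny_string<char>::construct<const char*>(const char*,
                                                        const char*);

// A user of the template; it refers to the single definition above.
std::size_t measure(const char* text, std::size_t n) {
  tiny_string<char> s;
  s.construct(text, text + n);
  return s.len;
}
```

With this layout, the object that holds the explicit instantiation definition is the one the linker has to pull in, so its dependencies decide what else gets dragged along.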


Re: [PATCH V2] Remove empty loop with assumed finiteness (PR tree-optimization/89713)

2019-05-29 Thread Richard Biener
On Fri, May 24, 2019 at 11:15 AM Feng Xue OS
 wrote:
>
> This version is based on the proposal of Richard. And fix a bug on OpenACC 
> loop when this opt is turned on.
> Also add some test cases

Comments below.

> Feng
> -
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 9f0f889..d1c1e3a 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,16 @@
> +2019-05-23  Feng Xue  
> +
> + PR tree-optimization/89713
> + * doc/invoke.texi (-ffinite-loop): Document new option.
> + * common.opt (-ffinite-loop): New option.
> + * tree-ssa-dce.c (loop_has_true_exits): New function.
> + (mark_stmt_if_obviously_necessary): Mark IFN_GOACC_LOOP
> + calls as necessary.
> + (find_obviously_necessary_stmts): Use flag to control
> + removal of a loop with assumed finiteness.
> + (eliminate_unnecessary_stmts): Do not delete dead result
> + of IFN_GOACC_LOOP calls.
> +
>  2019-05-22  David Malcolm  
>
>   PR c++/90462
> diff --git a/gcc/common.opt b/gcc/common.opt
> index d342c4f..e98a34d 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -1437,6 +1437,10 @@ ffinite-math-only
>  Common Report Var(flag_finite_math_only) Optimization SetByCombined
>  Assume no NaNs or infinities are generated.
>
> +ffinite-loop

I think it should be -ffinite-loops (plural)

> +Common Report Var(flag_finite_loop) Optimization
> +Assume loops are finite if can not be analytically determined.

This description is too vague.  I'd rather write
'Assume that loops with an exit will terminate and not loop indefinitely.'

> +
>  ffixed-
>  Common Joined RejectNegative Var(common_deferred_options) Defer
>  -ffixed- Mark  as being unavailable to the compiler.
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 6c89843..caa0852 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -412,6 +412,7 @@ Objective-C and Objective-C++ Dialects}.
>  -fdevirtualize-at-ltrans  -fdse @gol
>  -fearly-inlining  -fipa-sra  -fexpensive-optimizations  -ffat-lto-objects 
> @gol
>  -ffast-math  -ffinite-math-only  -ffloat-store  
> -fexcess-precision=@var{style} @gol
> +-ffinite-loop @gol
>  -fforward-propagate  -ffp-contract=@var{style}  -ffunction-sections @gol
>  -fgcse  -fgcse-after-reload  -fgcse-las  -fgcse-lm  -fgraphite-identity @gol
>  -fgcse-sm  -fhoist-adjacent-loads  -fif-conversion @gol
> @@ -9501,6 +9502,20 @@ that may set @code{errno} but are otherwise free of 
> side effects.  This flag is
>  enabled by default at @option{-O2} and higher if @option{-Os} is not also
>  specified.
>
> +@item -ffinite-loop
> +@opindex ffinite-loop
> +@opindex fno-finite-loop
> +Allow the compiler to assume that if finiteness of a loop can not be
> +analytically determined, the loop must be finite. With the assumption, some
> +aggressive transformation could be possible, such as removal of this kind
> +of empty loop by dead code elimination (DCE).

"Assume that a loop with an exit will eventually take the exit and not loop
indefinitely.  This allows the compiler to remove loops that otherwise have
no side-effects, not considering eventual endless looping as such."

> +This option is not turned on by any @option{-O} option since it might result
> +in incorrect behaviour for programs that contain seemly finite, but actually
> +infinite loop.

I think we should turn this option on by default, document that and note
that some languages (C++) say loops terminate.
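
To illustrate the kind of loop under discussion, here is a small sketch. The node type and the function name are made up for this example. The loop has an exit and no side effects, so under the assumption above a compiler may delete it, even though it would spin forever on a cyclic list.

```cpp
#include <cassert>

// Hypothetical singly linked list node, for illustration only.
struct node {
  node* next;
};

// Walks the list and ignores everything it sees.  The loop has an exit
// (p == nullptr) and no side effects; whether it terminates depends on
// the list being acyclic, which the compiler cannot prove here.
int walk_and_ignore(const node* head) {
  for (const node* p = head; p != nullptr; p = p->next) {
  }
  return 0;
}
```

For an acyclic list the result is the same with or without the loop; only a program that relied on the loop hanging would observe a difference.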

> +
> +The default is @option{-fno-finite-loop}.
> +
>  @item -ftree-dominator-opts
>  @opindex ftree-dominator-opts
>  Perform a variety of simple scalar cleanups (constant/copy
> diff --git a/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C 
> b/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C
> new file mode 100644
> index 000..e374155
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C
> @@ -0,0 +1,33 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-cddce2 -ffinite-loop" } */
> +
> +#include <string>
> +#include <vector>
> +#include <list>
> +#include <set>
> +#include <map>
> +
> +using namespace std;
> +
> +int foo (vector<string> &v, list<string> &l, set<string> &s, map<int, string> &m)
> +{
> +  for (vector<string>::iterator it = v.begin (); it != v.end (); ++it)
> +    it->length();
> +
> +  for (list<string>::iterator it = l.begin (); it != l.end (); ++it)
> +    it->length();
> +
> +  for (map<int, string>::iterator it = m.begin (); it != m.end (); ++it)
> +    it->first + it->second.length();
> +
> +  for (set<string>::iterator it0 = s.begin (); it0 != s.end(); ++it0)
> +    for (vector<string>::reverse_iterator it1 = v.rbegin(); it1 != v.rend(); ++it1)
> +      {
> +        it0->length();
> +        it1->length();
> +      }
> +
> +  return 0;
> +}
> +/* { dg-final { scan-tree-dump-not "if" "cddce2"} } */
> +
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/dce-2.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/dce-2.c
> new file mode 100644
> index 000..ffca49c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/dce-2.c
> @@ -0,0 +1,37 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-cddce1 -ffinite-loop" } */
> +
> +typedef stru

[patch, fortran] Fix wrong-code regression with netcdf and SPEC due to argument repacking

2019-05-29 Thread Thomas Koenig

Hello world,

the attached patch fixes the wrong-code regression due to the
inline argument repacking patch, r271377.

What had gone wrong?  gfortran used to unconditionally pack and unpack
arrays passed to old-style assumed size or explicit shape arrays.  For code like

module t2
  implicit none
contains
  subroutine foo(a)
    real, dimension(*) :: a
  end subroutine foo
end module t2

module t1
  use t2
  implicit none
contains
  subroutine bar(a)
    real, dimension(:) :: a
    call foo(a)
  end subroutine bar
end module t1

program main
  use t1
  call bar([1.0, 2.0])
end program main

this meant that an (always contiguous) array constructor was
passed down to an assumed shape array, which then passed it
on to an assumed size, explicit shape or adjustable array.
Packing was not problematic (apart from performance), but
unpacking tried to write into the array constructor.

So, this patch inserts a run-time check for contiguous arrays
and does not do packing/unpacking in that case.
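
The idea of the run-time check can be sketched outside the compiler. The following C++ model is an illustration only: array_view, is_contiguous, call_with_repack and double_all are hypothetical names, the descriptor is reduced to one dimension, and none of this is the code gfortran generates.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical one-dimensional array descriptor (not gfortran's).
struct array_view {
  float* base;
  std::ptrdiff_t stride;  // distance between elements, in elements
  std::size_t extent;     // number of elements
};

using callee_fn = void (*)(float* data, std::size_t n);

bool is_contiguous(const array_view& a) {
  return a.stride == 1 || a.extent <= 1;
}

// Pass the data to a callee that expects contiguous storage.
void call_with_repack(const array_view& a, callee_fn callee) {
  if (is_contiguous(a)) {
    callee(a.base, a.extent);  // no temporary, nothing is written back
    return;
  }
  std::vector<float> tmp(a.extent);
  for (std::size_t i = 0; i < a.extent; ++i)  // pack
    tmp[i] = a.base[static_cast<std::ptrdiff_t>(i) * a.stride];
  callee(tmp.data(), a.extent);
  for (std::size_t i = 0; i < a.extent; ++i)  // unpack
    a.base[static_cast<std::ptrdiff_t>(i) * a.stride] = tmp[i];
}

// Example callee standing in for an assumed-size dummy argument.
void double_all(float* data, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    data[i] *= 2.0f;
}
```

In the contiguous branch there is no unpack step at all, which is the property that matters when the actual argument is something that must not be written to, such as an array constructor.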

Thanks to Toon and Martin for finding an open test case which
actually failed, and for help with debugging.

(Always repacking likely also hurt performance even when it did not
lead to wrong code; we will have to see how performance is with
this version.)

OK for trunk?

Regards

Thomas

2019-05-29  Thomas Koenig  

PR fortran/90539
* gfortran.h (gfc_has_dimen_vector_ref): Add prototype.
* trans.h (gfc_conv_subref_array_arg): Add argument check_contiguous.
(gfc_conv_is_contiguous_expr): Add prototype.
* frontend-passes.c (has_dimen_vector_ref): Remove prototype,
rename to
(gfc_has_dimen_vector_ref): New function name.
(matmul_temp_args): Use gfc_has_dimen_vector_ref.
(inline_matmul_assign): Likewise.
* trans-array.c (gfc_conv_array_parameter): Also check for absence
of a vector subscript before calling gfc_conv_subref_array_arg.
Pass additional argument to gfc_conv_subref_array_arg.
* trans-expr.c (gfc_conv_subref_array_arg): Add argument
check_contiguous. If that is true, check if the argument
is contiguous and do not repack in that case.
* trans-intrinsic.c (gfc_conv_intrinsic_is_contiguous): Split
away most of the work into, and call
(gfc_conv_is_contiguous_expr): New function.

2019-05-29  Thomas Koenig  

PR fortran/90539
* gfortran.dg/internal_pack_21.f90: Adjust scan patterns.
* gfortran.dg/internal_pack_22.f90: New test.
* gfortran.dg/internal_pack_23.f90: New test.
Index: fortran/gfortran.h
===
--- fortran/gfortran.h	(revision 271629)
+++ fortran/gfortran.h	(working copy)
@@ -3532,6 +3532,7 @@ typedef int (*walk_expr_fn_t) (gfc_expr **, int *,
 int gfc_dummy_code_callback (gfc_code **, int *, void *);
 int gfc_expr_walker (gfc_expr **, walk_expr_fn_t, void *);
 int gfc_code_walker (gfc_code **, walk_code_fn_t, walk_expr_fn_t, void *);
+bool gfc_has_dimen_vector_ref (gfc_expr *e);
 
 /* simplify.c */
 
Index: fortran/trans.h
===
--- fortran/trans.h	(revision 271629)
+++ fortran/trans.h	(working copy)
@@ -535,8 +535,11 @@ int gfc_conv_procedure_call (gfc_se *, gfc_symbol
 void gfc_conv_subref_array_arg (gfc_se *, gfc_expr *, int, sym_intent, bool,
 const gfc_symbol *fsym = NULL,
 const char *proc_name = NULL,
-gfc_symbol *sym = NULL);
+gfc_symbol *sym = NULL,
+bool check_contiguous = false);
 
+void gfc_conv_is_contiguous_expr (gfc_se *, gfc_expr *);
+
 /* Generate code for a scalar assignment.  */
 tree gfc_trans_scalar_assign (gfc_se *, gfc_se *, gfc_typespec, bool, bool,
 			  bool c = false);
Index: fortran/frontend-passes.c
===
--- fortran/frontend-passes.c	(revision 271629)
+++ fortran/frontend-passes.c	(working copy)
@@ -54,7 +54,6 @@ static gfc_code * create_do_loop (gfc_expr *, gfc_
 static gfc_expr* check_conjg_transpose_variable (gfc_expr *, bool *,
 		 bool *);
 static int call_external_blas (gfc_code **, int *, void *);
-static bool has_dimen_vector_ref (gfc_expr *);
 static int matmul_temp_args (gfc_code **, int *,void *data);
 static int index_interchange (gfc_code **, int*, void *);
 
@@ -2868,7 +2867,7 @@ matmul_temp_args (gfc_code **c, int *walk_subtrees
 {
   if (matrix_a->expr_type == EXPR_VARIABLE
 	  && (gfc_check_dependency (matrix_a, expr1, true)
-	  || has_dimen_vector_ref (matrix_a)))
+	  || gfc_has_dimen_vector_ref (matrix_a)))
 	a_tmp = true;
 }
   else
@@ -2881,7 +2880,7 @@ matmul_temp_args (gfc_code **c, int *walk_subtrees
 {
   if (matrix_b->expr_type == EXPR_VARIABLE
 	  && (gfc_check_dependency (matrix_b, expr1, true)
-	  || has_dimen_vector_ref (matrix_b)))
+	  || gfc_has_dimen_vector_ref (matrix_b)))
 	b_tmp = true;
 }
   else
@@ -3681,8 +3680,8 @@ scalarized_

Re: [PATCH 3/3][GCC][AARCH64] Add support for pointer authentication B key

2019-05-29 Thread Christophe Lyon
On Wed, 29 May 2019 at 11:23, Sam Tebbs  wrote:
>
> The libgcc changes have been acknowledged off-list. Committed as r271735.
>

After this commit, I'm seeing errors while building libstdc++:
0x11c29f3 aarch64_return_address_signing_enabled()
        /tmp/8467855_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/config/aarch64/aarch64.c:4865
0x11c2a08 aarch64_post_cfi_startproc(_IO_FILE*, tree_node*)
        /tmp/8467855_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/config/aarch64/aarch64.c:15373
0xa27098 dwarf2out_do_cfi_startproc
        /tmp/8467855_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/dwarf2out.c:972
0xa43d6e dwarf2out_begin_prologue(unsigned int, unsigned int, char const*)
        /tmp/8467855_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/dwarf2out.c:1106
0xae05d5 final_start_function_1
        /tmp/8467855_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/final.c:1735
0xae0c2f final_start_function(rtx_insn*, _IO_FILE*, int)
        /tmp/8467855_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/final.c:1818
0x11c4442 aarch64_output_mi_thunk
        /tmp/8467855_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/config/aarch64/aarch64.c:6085
0x9cfa4f cgraph_node::expand_thunk(bool, bool)
        /tmp/8467855_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cgraphunit.c:1831
0x9d0dba cgraph_node::assemble_thunks_and_aliases()
        /tmp/8467855_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cgraphunit.c:2122
0x9d0d89 cgraph_node::assemble_thunks_and_aliases()
        /tmp/8467855_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cgraphunit.c:2140
0x9d1068 cgraph_node::expand()
        /tmp/8467855_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cgraphunit.c:2259
0x9d23ec expand_all_functions
        /tmp/8467855_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cgraphunit.c:2332
0x9d23ec symbol_table::compile()
        /tmp/8467855_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cgraphunit.c:2683
0x9d5020 symbol_table::compile()
        /tmp/8467855_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cgraphunit.c:2595
0x9d5020 symbol_table::finalize_compilation_unit()
        /tmp/8467855_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cgraphunit.c:2861
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.
make[5]: *** [Makefile:900: strstream.lo] Error 1

in aarch64-none-linux-gnu/libstdc++-v3/src/c++98

(same for aarch64[_be]-elf)

Christophe

> On 01/03/2019 14:12, Sam Tebbs wrote:
> > On 31/01/2019 14:54, Sam Tebbs wrote:
> >> 
> >>> ping 3. The preceding two patches were committed a while ago but require
> >>> the minor libgcc changes in this patch, which are the only parts left to
> >>> be reviewed.
> >> ping 4
> > Attached is a rebased patch made to work on top of Sudi Das' BTI patch
> > (by renaming UNSPEC_PACISP to UNSPEC_PACIASP and UNSPEC_PACIBSP in
> > aarch64-bti-insert.c). The updated changelog is below.
> >
> > Are the libgcc changes OK for trunk?
> >
> > gcc/
> > 2019-03-01  Sam Tebbs
> >
> >   * config/aarch64/aarch64-builtins.c (aarch64_builtins): Add
> >   AARCH64_PAUTH_BUILTIN_AUTIB1716 and AARCH64_PAUTH_BUILTIN_PACIB1716.
> >   * config/aarch64/aarch64-builtins.c 
> > (aarch64_init_pauth_hint_builtins):
> >   Add autib1716 and pacib1716 initialisation.
> >   * config/aarch64/aarch64-builtins.c (aarch64_expand_builtin): Add 
> > checks
> >   for autib1716 and pacib1716.
> >   * config/aarch64/aarch64-protos.h (aarch64_key_type,
> >   aarch64_post_cfi_startproc): Define.
> >   * config/aarch64/aarch64-protos.h (aarch64_ra_sign_key): Define 
> > extern.
> >   * config/aarch64/aarch64.c (aarch64_handle_standard_branch_protection,
> >   aarch64_handle_pac_ret_protection): Set default sign key to A.
> >   * config/aarch64/aarch64.c (aarch64_expand_epilogue,
> >   aarch64_expand_prologue): Add check for b-key.
> >   * config/aarch64/aarch64.c (aarch64_ra_sign_key,
> >   aarch64_post_cfi_startproc, aarch64_handle_pac_ret_b_key): Define.
> >   * config/aarch64/aarch64.h (TARGET_ASM_POST_CFI_STARTPROC): Define.
> >   * config/aarch64/aarch64.c (aarch64_pac_ret_subtypes): Add "b-key".
> >   * config/aarch64/aarch64.md (unspec): Add UNSPEC_AUTIA1716,
> >   UNSPEC_AUTIB1716, UNSPEC_AUTIASP, UNSPEC_AUTIBSP, UNSPEC_PACIA1716,
> >   UNSPEC_PACIB1716, UNSPEC_PACIASP, UNSPEC_PACIBSP.
> >   * config/aarch64/aarch64.md (do_return): Add check for b-key.
> >   * config/aarch64/aarch64.md (sp): Replace
> >   pauth_hint_num_a with pauth_hint_num.
> >   * config/aarch64/aarch64.md (1716): Replace
> >   pauth_hint_num_a with pauth_hint_num.
> >   * config/aarch64/aarch64.opt (msign-return-address=): Deprecate.
> >   * config/aarch64/iterators.md (PAUTH_LR_SP): Add UNSPEC_AUTIASP,
> >   UNSPEC_AUTIBSP, UNSPEC_PACIASP, UNSPEC_PACIBSP.
> >   * config/aarch

Re: [PATCH] Fix few build warnings with LLVM toolchain

2019-05-29 Thread David CARLIER
Here is a little progress, but maybe it's better to do this in small
"batches" rather than fixing everything in one shot?

Kind regards.
Fixing a few build warnings with clang/clang++ of this type:
../.././gcc/coretypes.h:76:1: warning: class 'rtx_def' was previously declared as a struct; this is valid, but may result in linker errors under the Microsoft C++ ABI [-Wmismatched-tags]
or
../.././gcc/machmode.h:320:1: warning: 'pod_mode' defined as a struct template here but previously declared as a class template; this is valid, but may result in linker errors under the
  Microsoft C++ ABI [-Wmismatched-tags]

The struct/class mismatch is mostly harmless; it might matter for the Microsoft toolchain, as mentioned above, but it is worth fixing for correctness in general.

Index: gcc/ChangeLog
Index: gcc/cgraph.h
===
--- gcc/cgraph.h	(revision 271734)
+++ gcc/cgraph.h	(working copy)
@@ -100,11 +100,10 @@ enum symbol_partitioning_class
 
 /* Base of all entries in the symbol table.
The symtab_node is inherited by cgraph and varpol nodes.  */
-class GTY((desc ("%h.type"), tag ("SYMTAB_SYMBOL"),
+struct GTY((desc ("%h.type"), tag ("SYMTAB_SYMBOL"),
 	   chain_next ("%h.next"), chain_prev ("%h.previous")))
   symtab_node
 {
-public:
   friend class symbol_table;
 
   /* Return name.  */
@@ -598,7 +597,6 @@ public:
   /* Section name. Again can be private, if allowed.  */
   section_hash_entry *x_section;
 
-protected:
   /* Dump base fields of symtab nodes to F.  Not to be used directly.  */
   void dump_base (FILE *);
 
@@ -618,7 +616,6 @@ protected:
   bool call_for_symbol_and_aliases_1 (bool (*callback) (symtab_node *, void *),
   void *data,
   bool include_overwrite);
-private:
   /* Worker for set_section.  */
   static bool set_section (symtab_node *n, void *s);
 
@@ -1505,7 +1502,7 @@ struct cgraph_node_set_def
 typedef cgraph_node_set_def *cgraph_node_set;
 typedef struct varpool_node_set_def *varpool_node_set;
 
-class varpool_node;
+struct varpool_node;
 
 /* A varpool node set is a collection of varpool nodes.  A varpool node
can appear in multiple sets.  */
@@ -1675,7 +1672,7 @@ struct GTY(()) cgraph_indirect_call_info
 
 struct GTY((chain_next ("%h.next_caller"), chain_prev ("%h.prev_caller"),
 	for_user)) cgraph_edge {
-  friend class cgraph_node;
+  friend struct cgraph_node;
   friend class symbol_table;
 
   /* Remove the edge in the cgraph.  */
@@ -1856,8 +1853,7 @@ private:
 /* The varpool data structure.
Each static variable decl has assigned varpool_node.  */
 
-class GTY((tag ("SYMTAB_VARIABLE"))) varpool_node : public symtab_node {
-public:
+struct GTY((tag ("SYMTAB_VARIABLE"))) varpool_node : public symtab_node {
   /* Dump given varpool node to F.  */
   void dump (FILE *f);
 
@@ -1975,7 +1971,6 @@ public:
  if we did not do any inter-procedural code movement.  */
   unsigned used_by_single_function : 1;
 
-private:
   /* Assemble thunks and aliases associated to varpool node.  */
   void assemble_aliases (void);
 
@@ -2074,9 +2069,9 @@ struct asmname_hasher : ggc_ptr_hash  class opt_mode;
 typedef opt_mode opt_scalar_mode;
 typedef opt_mode opt_scalar_int_mode;
 typedef opt_mode opt_scalar_float_mode;
-template class pod_mode;
+template struct pod_mode;
 typedef pod_mode scalar_mode_pod;
 typedef pod_mode scalar_int_mode_pod;
 typedef pod_mode fixed_size_mode_pod;
@@ -73,7 +73,7 @@ typedef pod_mode fixed_
 /* Subclasses of rtx_def, using indentation to show the class
hierarchy, along with the relevant invariant.
Where possible, keep this list in the same order as in rtl.def.  */
-class rtx_def;
+struct rtx_def;
   class rtx_expr_list;   /* GET_CODE (X) == EXPR_LIST */
   class rtx_insn_list;   /* GET_CODE (X) == INSN_LIST */
   class rtx_sequence;/* GET_CODE (X) == SEQUENCE */
@@ -138,9 +138,9 @@ struct gomp_teams;
 /* Subclasses of symtab_node, using indentation to show the class
hierarchy.  */
 
-class symtab_node;
+struct symtab_node;
   struct cgraph_node;
-  class varpool_node;
+  struct varpool_node;
 
 union section;
 typedef union section section;
Index: gcc/dumpfile.h
===
--- gcc/dumpfile.h	(revision 271734)
+++ gcc/dumpfile.h	(working copy)
@@ -647,7 +647,7 @@ extern void dump_combine_total_stats (FI
 /* In cfghooks.c  */
 extern void dump_bb (FILE *, basic_block, int, dump_flags_t);
 
-struct opt_pass;
+class opt_pass;
 
 namespace gcc {
 
Index: gcc/hash-table.h
===
--- gcc/hash-table.h	(revision 271734)
+++ gcc/hash-table.h	(working copy)
@@ -347,7 +347,7 @@ hash_table_mod2 (hashval_t hash, unsigne
   return 1 + mul_mod (hash, p->prime - 2, p->inv_m2, p->shift);
 }
 
-class mem_usage;
+struct mem_usage;
 
 /* User-facing hash table type.
 
Index: gcc/ipa-prop.h
===
--- gcc/ipa-prop.

Re: [PATCH] Fix few build warnings with LLVM toolchain

2019-05-29 Thread Jakub Jelinek
On Wed, May 29, 2019 at 12:53:44PM +0100, David CARLIER wrote:
> Here a little progress but maybe it s better doing this in small
> "batches" rather than fixing everything in one shot ?

IMHO if we want to do anything about this, we should just in
system.h add
#ifdef __clang__
#pragma clang diagnostic ignored "-Wmismatched-tags"
#endif
perhaps with another guard for some minimum clang version number if the
warning isn't in all clang versions or not all versions support
that pragma.

The warning is really flawed and we shouldn't adjust our source because
of that.
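
A sketch of the guarded pragma follows. Instead of a version number it uses clang's __has_warning check, which is my assumption for how the extra guard could look, not something proposed in this thread. The widget type and widget_id function are made up to give the pragma something to act on.

```cpp
#include <cassert>

// Only clang knows this pragma; __has_warning is a clang extension that
// reports whether the named warning flag exists in this compiler.
#if defined(__clang__) && defined(__has_warning)
# if __has_warning("-Wmismatched-tags")
#  pragma clang diagnostic ignored "-Wmismatched-tags"
# endif
#endif

// Hypothetical type: forward-declared as a class, defined as a struct.
// Both keywords name the same type; only the default access differs.
class widget;

struct widget {
  int id;
};

int widget_id(const widget& w) { return w.id; }
```

Other compilers skip the whole guarded group, so the pragma needs no counterpart for GCC.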

Jakub


undefined behavior in value_range::equiv_add()?

2019-05-29 Thread Aldy Hernandez
As per the API, and the original documentation to value_range, 
VR_UNDEFINED and VR_VARYING should never have equivalences.  However, 
equiv_add is tacking on equivalences blindly, and there are various 
regressions that happen if I fix this oversight.


void
value_range::equiv_add (const_tree var,
                        const value_range *var_vr,
                        bitmap_obstack *obstack)
{
  if (!m_equiv)
    m_equiv = BITMAP_ALLOC (obstack);
  unsigned ver = SSA_NAME_VERSION (var);
  bitmap_set_bit (m_equiv, ver);
  if (var_vr && var_vr->m_equiv)
    bitmap_ior_into (m_equiv, var_vr->m_equiv);
}

Is this a bug in the documentation / API, or is equiv_add incorrect and 
we should fix the fall-out elsewhere?
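
For discussion, here is a toy model of what enforcing the documented invariant inside equiv_add could look like. It is not GCC code: toy_kind and toy_range are invented for this sketch, a std::set stands in for the bitmap, and the early return is just one possible guard.

```cpp
#include <cassert>
#include <set>

// Toy stand-in for the range kinds named above (not GCC's enum).
enum class toy_kind { undefined, range, varying };

struct toy_range {
  toy_kind kind = toy_kind::undefined;
  std::set<unsigned> equiv;  // plays the role of the m_equiv bitmap

  // Assumed guard: kinds that must not carry equivalences ignore the call.
  void equiv_add(unsigned ver, const toy_range* other) {
    if (kind == toy_kind::undefined || kind == toy_kind::varying)
      return;
    equiv.insert(ver);
    if (other)
      equiv.insert(other->equiv.begin(), other->equiv.end());
  }
};
```

Whether silently ignoring the request is acceptable, or whether callers should be changed instead, is exactly the open question.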


Thanks.
Aldy


Re: Teach same_types_for_tbaa to structurally compare arrays, pointers and vectors

2019-05-29 Thread Richard Biener
On Mon, 27 May 2019, Jan Hubicka wrote:

> > The way you do it above seeing struct X p will end up comparing
> > 'struct X' but that doesn't really have any say on whether we
> > can apply TBAA to the outermost pointer type which, if used as a base,
> > cannot be subsetted by components anyway.
> 
> We remove pointers in pairs so seeing
> struct X p
> struct Y q
> we will end up saying that these pointers are same if struct X and Y are
> same (which we will do by pointer type) or if we can not decide (one of
> them is void).
> 
> Non-LTO code returns 0 in the second case, but I think that could be
> considered unsafe when mixing K&R and ansi-C code.
> > 
> > So - why's anything besides treating all structurally equivalent
> > non-composite types as the same sensible here (and not waste of time)?
> 
> I think you are right that with current implementation it should not
> make difference.  I want eventually to disambiguate
> 
> struct foo {struct bar *a;} ptr1;
> struct foobar **ptr2;
> 
> ptr1->a and *ptr2;
> 
> Here we will currently punt on comparing different pointer types.
> With structural compare we will end up to base&offset because we would
> consider struct foobar * and struct bar * as same types (they are both
> incomplete in LTO now).

But *ptr2 has no access 'path' so we shouldn't even consider it here.

That is, when the innermost reference (for *ptr2 that is a reference
to type foobar *) is of non-aggregate type there are no paths to
disambiguate.  That is, same_types_for_tbaa shouldn't even be asked
for non-aggregate types...
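
To make the two accesses concrete, here is the example from the quoted text as compilable C++. The struct members x and y and the two function names are added for illustration only.

```cpp
#include <cassert>

struct bar { int x; };
struct foobar { int y; };
struct foo { bar* a; };

// ptr1->a: a field access on top of a dereference, so there is a
// component (the field 'a' of struct foo) to reason about.
bar* load_via_path(foo* ptr1) { return ptr1->a; }

// *ptr2: a bare dereference of a pointer to pointer; no component
// reference sits on top of it, so there is no access path to walk.
foobar* load_direct(foobar** ptr2) { return *ptr2; }
```

Only the first function contains a reference for which handled_component_p would be true.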

> Same_types_for_tbaa does not need to form equivalences and if we unwind
> pointers&arrays to types matching odr_comparable_p (i.e. we have two
> accesses originating from C++ code), we can disambiguate incomplete
> pointers based on mangled name of types they point to.  I have
> incremental patch for that (which further improves disambiguations to
> about 8000).
> > 
> > I realize this is mostly my mess but if we try to "enhance" things
> > we need to make it clearer what we want...
> > 
> > Maybe even commonize more code paths (the path TBAA stuff is really
> > replicated in at least two places IIRC).
> 
> Yep, I am trying to understand what we need to do here and clean things
> up.
> 
> I guess we could handle few independent issues
> 1) if it is safe to return 1 for types that compare as equivalent as
>pointers (currently we return -1 if canonical types are not defined).
>This seems to handle a most common case
> 2) decide what to do about pointers
> 3) decide what to do about arrays
> 4) decide what to do about ODR 
> 5) see how much we can merge with alias set & canonical type
> computation.
> 
> I would propose first just add
>  if (type1 == type2)
> return 1;

That works for me.

> and I will do bit deeper look at the pointers next and produce also some
> testcases.

Please also check whether any testcases that do something meaningful
FAIL when, instead of

  /* Do access-path based disambiguation.  */
  if (ref1 && ref2
      && (handled_component_p (ref1) || handled_component_p (ref2)))

we do

  /* Do access-path based disambiguation.  */
  if (ref1 && ref2
      && (handled_component_p (ref1) && handled_component_p (ref2)))

Thanks.
Richard.

> 
> Honza
> > 
> > Richard.
> > 
> > > Honza
> > > > 
> > > > > +  else
> > > > > + {
> > > > > +   if (POINTER_TYPE_P (type1) != POINTER_TYPE_P (type2))
> > > > > + return 0;
> > > > > +   return in_ptr ? 1 : -1;
> > > > > + }
> > > > > +
> > > > > +  if (type1 == type2)
> > > > > +return in_array ? -1 : 1;
> > > > > +}
> > > > >  
> > > > >/* Compare the canonical types.  */
> > > > >if (TYPE_CANONICAL (type1) == TYPE_CANONICAL (type2))
> > > > > -return 1;
> > > > > +return in_array ? -1 : 1;
> > > > >  
> > > > >/* ??? Array types are not properly unified in all cases as we have
> > > > >   spurious changes in the index types for example.  Removing this
> > > > >   causes all sorts of problems with the Fortran frontend.  */
> > > > >if (TREE_CODE (type1) == ARRAY_TYPE
> > > > >&& TREE_CODE (type2) == ARRAY_TYPE)
> > > > > -return -1;
> > > > > +return in_ptr ? 1 : -1;
> > > > >  
> > > > >/* ??? In Ada, an lvalue of an unconstrained type can be used to 
> > > > > access an
> > > > >   object of one of its constrained subtypes, e.g. when a function 
> > > > > with an
> > > > > @@ -770,7 +858,7 @@ same_type_for_tbaa (tree type1, tree typ
> > > > >   not (e.g. type and subtypes can have different modes).  So, in 
> > > > > the end,
> > > > >   they are only guaranteed to have the same alias set.  */
> > > > >if (get_alias_set (type1) == get_alias_set (type2))
> > > > > -return -1;
> > > > > +return in_ptr ? 1 : -1;
> > > > >  
> > > > >/* The types are known to be not equal.  */
> > > > >return 0;
> > > > > 
> > > > 
> > > > -- 
> > > > Richard Biener 
> > > > SUSE Linux 

Re: Teach same_types_for_tbaa to structurally compare arrays, pointers and vectors

2019-05-29 Thread Richard Biener
On Mon, 27 May 2019, Jan Hubicka wrote:

> Hi,
> this is minimal version of patch adding just the pointer compare.
> Bootstrapped/regtested x86_64-linux, makes sense? :)

Yes, obviously.

Richard.


Re: Teach same_types_for_tbaa to structurally compare arrays, pointers and vectors

2019-05-29 Thread Jan Hubicka
> On Mon, 27 May 2019, Jan Hubicka wrote:
> 
> > Hi,
> > this is minimal version of patch adding just the pointer compare.
> > Bootstrapped/regtested x86_64-linux, makes sense? :)
> 
> Yes, obviously.

Thanks, I will go ahead with installing it.
Note that I have also tested removal of:

  /* ??? Array types are not properly unified in all cases as we have
 spurious changes in the index types for example.  Removing this
 causes all sorts of problems with the Fortran frontend.  */
  if (TREE_CODE (type1) == ARRAY_TYPE
  && TREE_CODE (type2) == ARRAY_TYPE)
return -1;

And it causes no regressions.  I looked at the history and see you
added it in 2009 because Fortran mixes up arrays of chars with char
itself.  I am not sure whether that has been fixed since then or whether
we are just missing a testcase.

It does not seem to be that important, but it looks odd
and makes me worried about other changes :)

Honza
> 
> Richard.


Re: [PATCH V2] A jump threading opportunity for condition branch

2019-05-29 Thread Richard Biener
On Tue, 28 May 2019, Jiufu Guo wrote:

> Hi,
> 
> This patch implements a new opportunity of jump threading for PR77820.
> In this optimization, conditional jumps are merged with unconditional
> jump. And then moving CMP result to GPR is eliminated.
> 
> This version is based on the proposal of Richard, Jeff and Andrew, and
> refined to incorporate comments.  Thanks for the reviews!
> 
> Bootstrapped and tested on powerpc64le and powerpc64be with no
> regressions (one case is improved) and new testcases are added. Is this
> ok for trunk?
> 
> Example of this opportunity looks like below:
> 
>   
>   p0 = a CMP b
>   goto ;
> 
>   
>   p1 = c CMP d
>   goto ;
> 
>   
>   # phi = PHI 
>   if (phi != 0) goto ; else goto ;
> 
> Could be transformed to:
> 
>   
>   p0 = a CMP b
>   if (p0 != 0) goto ; else goto ;
> 
>   
>   p1 = c CMP d
>   if (p1 != 0) goto ; else goto ;
> 
> 
> This optimization eliminates:
> 1. saving CMP result: p0 = a CMP b.
> 2. additional CMP on branch: if (phi != 0).
> 3. converting the CMP result, if there is a phi = (INT_CONV) p0.
> 
> Thanks!
> Jiufu Guo
> 
> 
> [gcc]
> 2019-05-28  Jiufu Guo  
>   Lijia He  
> 
>   PR tree-optimization/77820
>   * tree-ssa-threadedge.c
>   (edge_forwards_cmp_to_conditional_jump_through_empty_bb_p): New
>   function.
>   (thread_across_edge): Add call to
>   edge_forwards_cmp_to_conditional_jump_through_empty_bb_p.
> 
> [gcc/testsuite]
> 2019-05-28  Jiufu Guo  
>   Lijia He  
> 
>   PR tree-optimization/77820
>   * gcc.dg/tree-ssa/phi_on_compare-1.c: New testcase.
>   * gcc.dg/tree-ssa/phi_on_compare-2.c: New testcase.
>   * gcc.dg/tree-ssa/phi_on_compare-3.c: New testcase.
>   * gcc.dg/tree-ssa/phi_on_compare-4.c: New testcase.
>   * gcc.dg/tree-ssa/split-path-6.c: Update testcase.
> 
> ---
>  gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-1.c | 30 ++
>  gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-2.c | 23 +++
>  gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-3.c | 25 
>  gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-4.c | 40 +
>  gcc/testsuite/gcc.dg/tree-ssa/split-path-6.c |  2 +-
>  gcc/tree-ssa-threadedge.c| 76 
> +++-
>  6 files changed, 192 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-3.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-4.c
> 
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-1.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-1.c
> new file mode 100644
> index 000..5227c87
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-1.c
> @@ -0,0 +1,30 @@
> +/* { dg-do compile } */
> +/* { dg-options "-Ofast -fdump-tree-vrp1" } */
> +
> +void g (int);
> +void g1 (int);
> +
> +void
> +f (long a, long b, long c, long d, long x)
> +{
> +  _Bool t;
> +  if (x)
> +{
> +  g (a + 1);
> +  t = a < b;
> +  c = d + x;
> +}
> +  else
> +{
> +  g (b + 1);
> +  a = c + d;
> +  t = c > d;
> +}
> +
> +  if (t)
> +g1 (c);
> +
> +  g (a);
> +}
> +
> +/* { dg-final { scan-tree-dump-times "Removing basic block" 1 "vrp1" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-2.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-2.c
> new file mode 100644
> index 000..eaf89bb
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-2.c
> @@ -0,0 +1,23 @@
> +/* { dg-do compile } */
> +/* { dg-options "-Ofast -fdump-tree-vrp1" } */
> +
> +void g (void);
> +void g1 (void);
> +
> +void
> +f (long a, long b, long c, long d, int x)
> +{
> +  _Bool t;
> +  if (x)
> +t = c < d;
> +  else
> +t = a < b;
> +
> +  if (t)
> +{
> +  g1 ();
> +  g ();
> +}
> +}
> +
> +/* { dg-final { scan-tree-dump-times "Removing basic block" 1 "vrp1" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-3.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-3.c
> new file mode 100644
> index 000..d5a1e0b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-3.c
> @@ -0,0 +1,25 @@
> +/* { dg-do compile } */
> +/* { dg-options "-Ofast -fdump-tree-vrp1" } */
> +
> +void g (void);
> +void g1 (void);
> +
> +void
> +f (long a, long b, long c, long d, int x)
> +{
> +  int t;
> +  if (x)
> +t = a < b;
> +  else if (d == x)
> +t = c < b;
> +  else
> +t = d > c;
> +
> +  if (t)
> +{
> +  g1 ();
> +  g ();
> +}
> +}
> +
> +/* { dg-final { scan-tree-dump-times "Removing basic block" 1 "vrp1" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-4.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-4.c
> new file mode 100644
> index 000..53acabc
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-4.c
> @@ -0,0 +1,40 @@
> +/* { dg

Re: [PATCH] rs6000: Call flow implementation for PC-relative addressing

2019-05-29 Thread Segher Boessenkool
Hi Bill,

On Thu, May 23, 2019 at 09:11:44PM -0500, Bill Schmidt wrote:
> (1) When a function uses PC-relative code generation, all direct calls (other 
> than 
> sibcalls) that the function makes to local or external callees should appear 
> as
> "bl sym@notoc" and should not be followed by a nop instruction.  @notoc 
> indicates
> that the assembler should annotate the call with R_PPC64_REL24_NOTOC, meaning
> that the caller does not guarantee that r2 contains a valid TOC pointer.  Thus
> the linker should not try to replace a subsequent "nop" with a TOC restore
> instruction.

All necessary linker (and binutils and GAS) support is upstream already, right?

> In creating the new sibcall patterns, I did not duplicate the "c" alternatives
> that allow for bctr or blr sibcalls.  I don't think there's a way to generate
> those currently.  The bctr would be legitimate for PC-relative sibcalls if you
> can prove that the target function is in the same binary, but we don't appear
> to detect that possibility today.

But you could see that the target is in the same translation unit, for example?
That should be a simple test to make, too.

>pld 12,0(0),1
>.reloc .-8,R_PPC64_PLT_PCREL34_NOTOC,foo

Are we guaranteed the assembler always writes a pld like this as 8 bytes?

>   * gcc.target/powerpc/notoc-direct-1.c: New.
>   * gcc.target/powerpc/pcrel-sibcall-1.c: New.

A few more testcases would be useful.  Well we'll gain a lot of-em soon
enough, I suppose.

>static char str[32];  /* 2 spare */
> -  if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
> +  if (rs6000_pcrel_p (cfun))
> +sprintf (str, "b%s %s@notoc%s", sibcall ? "" : "l", z, arg);
> +  else if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
>  sprintf (str, "b%s %s%s%s", sibcall ? "" : "l", z, arg,
>sibcall ? "" : "\n\tnop");

Two spare, and you add one char (@notoc vs. ..nop), so at a minimum you
need to correct the comment?
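
The character count behind that remark can be checked with a small standalone sketch.  It is not part of the patch: the names below are made up for illustration, and it only compares the fixed text that each of the two quoted format strings appends to "b"/"bl" plus the symbol.

```c
#include <assert.h>
#include <string.h>

/* Fixed text appended by the two quoted format strings.
   Illustration only, not rs6000 code.  */
static const char notoc_suffix[] = "@notoc";  /* new PC-relative form */
static const char nop_suffix[] = "\n\tnop";   /* existing AIX/ELFv2 form */

/* Number of extra characters the @notoc form needs compared with the
   longest existing form (a non-sibcall followed by a nop).  */
static int
extra_chars_for_notoc (void)
{
  return (int) strlen (notoc_suffix) - (int) strlen (nop_suffix);
}
```

Under these assumptions the new form is one character longer, which matches the "two spare" comment going down to one.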

> +  if (DEFAULT_ABI == ABI_V4
> +  && (!TARGET_SECURE_PLT
> +   || !flag_pic
> +   || (decl
> +   && (*targetm.binds_local_p) (decl
> +return true;
> +
> +  return false;

Please invert this (put the "return false" condition in the if, like the
preceding comment says).

>if (TARGET_PLTSEQ)
>  {
>rtx base = const0_rtx;
> -  int regno;
> -  if (DEFAULT_ABI == ABI_ELFv2)
> +  int regno = 12;
> +  if (rs6000_pcrel_p (cfun))
>   {
> -   base = gen_rtx_REG (Pmode, TOC_REGISTER);
> -   regno = 12;
> +   rtx reg = gen_rtx_REG (Pmode, regno);
> +   rtx u = gen_rtx_UNSPEC (Pmode, gen_rtvec (3, base, call_ref, arg),
> +   UNSPEC_PLT_PCREL);
> +   emit_insn (gen_rtx_SET (reg, u));
> +   return reg;
>   }

You don't need a regno variable here, so don't use it, only set it later
where it _is_ used?

> +(define_insn "*pltseq_plt_pcrel"
> +  [(set (match_operand:P 0 "gpc_reg_operand" "=r")
> + (unspec:P [(match_operand:P 1 "" "")
> +(match_operand:P 2 "symbol_ref_operand" "s")
> +(match_operand:P 3 "" "")]
> +   UNSPEC_PLT_PCREL))]
> +  "HAVE_AS_PLTSEQ && TARGET_TLS_MARKERS
> +   && rs6000_pcrel_p (cfun)"
> +{
> +  return rs6000_pltseq_template (operands, 4);

Maybe those "4" magic constants should be an enum?

> +int zz0 ()
> +{
> +  asm ("");
> +  return 16;
> +};

You might want to put in a comment what this asm is for.


Please consider those things.  Okay for trunk with that.  Thanks!


Segher


Re: [AArch64] [SVE] PR88837 - Poor vector construction code in VL-specific mode

2019-05-29 Thread Richard Sandiford
Prathamesh Kulkarni  writes:
> Hi,
> The attached patch tries to improve initialization for fixed-length
> SVE vector and its algorithm is described in comments for
> aarch64_sve_expand_vector_init() in the patch, with help from Richard
> Sandiford. I verified tests added in the patch pass with qemu and am
> trying to run bootstrap+test on patch in qemu.
> Does the patch look OK ?
>
> Thanks,
> Prathamesh
>
> 2019-05-27  Prathamesh Kulkarni  
>   Richard Sandiford  

Although we iterated on ideas for the patch a bit, I didn't write
any of it, so the changelog should just have your name.

> [...]
> @@ -3207,3 +3207,15 @@
>  DONE;
>}
>  )
> +
> +;; Standard pattern name vec_init.
> +
> +(define_expand "vec_init"

The rest of the file doesn't have blank lines after the comment.

> +/* Subroutine of aarch64_sve_expand_vector_init for handling
> +   trailing constants.
> +   This function works as follows:
> +   (a) Create a new vector consisting of trailing constants.
> +   (b) Initialize TARGET with the constant vector using emit_move_insn.
> +   (c) Insert remaining elements in TARGET using insr.
> +   NELTS is the total number of elements in original vector while
> +

truncated sentence, guess the rest would have been:

   NELTS_REQD is the number of elements that are actually significant.

or something.

> +   ??? The heuristic used is to do above only if number of constants
> +   is at least half the total number of elements.  May need fine tuning.  */
> +
> +static bool
> +aarch64_sve_expand_vector_init_handle_trailing_constants
> + (rtx target, const rtx_vector_builder &builder, int nelts, int nelts_reqd)
> +{
> +  machine_mode mode = GET_MODE (target);
> +  scalar_mode elem_mode = GET_MODE_INNER (mode);
> +  int n_trailing_constants = 0;
> +
> +  for (int i = nelts_reqd - 1;
> +   i >= 0 && aarch64_legitimate_constant_p (elem_mode, builder.elt (i));
> +   i--)
> +n_trailing_constants++;
> +
> +  if (n_trailing_constants >= nelts_reqd / 2)
> +{
> +  rtx_vector_builder v (mode, 1, nelts);
> +  for (int i = 0; i < nelts; i++)
> + v.quick_push (builder.elt (i + nelts_reqd - n_trailing_constants));
> +  rtx const_vec = v.build ();
> +  emit_move_insn (target, const_vec);
> +
> +  for (int i = nelts_reqd - n_trailing_constants - 1; i >= 0; i--)
> + emit_insr (target, builder.elt (i));
> +
> +  return true;
> +}
> +
> +  return false;
> +}
> +
> +/* Subroutine of aarch64_sve_expand_vector_init.
> +   Works as follows:
> +   (a) Initialize TARGET by broadcasting element NELTS_REQD - 1 of BUILDER.
> +   (b) Skip trailing elements from BUILDER, which are same as

s/are same/are the same/

> +   element NELTS_REQD - 1.
> +   (c) Insert earlier elements in reverse order in TARGET using insr.  */
> +
> +static void
> +aarch64_sve_expand_vector_init_insert_elems (rtx target,
> +  const rtx_vector_builder &builder,
> +  int nelts_reqd)
> +{
> +  machine_mode mode = GET_MODE (target);
> +  scalar_mode elem_mode = GET_MODE_INNER (mode);
> +
> +  struct expand_operand ops[2];
> +  enum insn_code icode = optab_handler (vec_duplicate_optab, mode);
> +  gcc_assert (icode != CODE_FOR_nothing);
> +
> +  create_output_operand (&ops[0], target, mode);
> +  create_input_operand (&ops[1], builder.elt (nelts_reqd - 1), elem_mode);
> +  expand_insn (icode, 2, ops);
> +
> +  int ndups = builder.count_dups (nelts_reqd - 1, -1, -1);
> +  for (int i = nelts_reqd - ndups - 1; i >= 0; i--)
> +emit_insr (target, builder.elt (i));
> +}
> +
> +/* Subroutine of aarch64_sve_expand_vector_init to handle case
> +   when all trailing elements of builder are same.
> +   This works as follows:
> +   (a) Using expand_insn interface to broadcast last vector element in 
> TARGET.

s/Using/Use/

> +   (b) Insert remaining elements in TARGET using insr.
> +
> +   ??? The heuristic used is to do above if number of same trailing elements
> +   is at least 3/4 of total number of elements, loosely based on
> +   heuristic from mostly_zeros_p. May need fine-tuning.  */

Should be two spaces before "May".
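
For readers following the two heuristics in the quoted comments, here is a rough standalone sketch in plain C.  The `toy_elt` type and all function names are hypothetical stand-ins for the rtx-based code, the "half" test mirrors the quoted `n_trailing_constants >= nelts_reqd / 2` comparison, and the 3/4 test is only my reading of the comment since that part of the patch is not quoted here.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy element: IS_CONST marks a compile-time constant, VALUE is what
   gets compared.  Hypothetical stand-in for the rtx elements.  */
struct toy_elt { bool is_const; int value; };

/* Count the constants at the end of ELTS[0..N-1].  */
static int
count_trailing_constants (const struct toy_elt *elts, int n)
{
  int count = 0;
  for (int i = n - 1; i >= 0 && elts[i].is_const; i--)
    count++;
  return count;
}

/* Count the trailing elements equal to the last one, including it.  */
static int
count_trailing_dups (const struct toy_elt *elts, int n)
{
  if (n == 0)
    return 0;
  int count = 1;
  for (int i = n - 2;
       i >= 0
       && elts[i].is_const == elts[n - 1].is_const
       && elts[i].value == elts[n - 1].value;
       i--)
    count++;
  return count;
}

/* "At least half" heuristic, written as in the quoted patch.  */
static bool
use_constant_vector_p (const struct toy_elt *elts, int n)
{
  return count_trailing_constants (elts, n) >= n / 2;
}

/* "At least 3/4" heuristic; the exact form is an assumption.  */
static bool
use_broadcast_p (const struct toy_elt *elts, int n)
{
  return 4 * count_trailing_dups (elts, n) >= 3 * n;
}
```

With four elements, three trailing constants pass the first test, and three identical trailing elements pass the second.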

> [...]
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/init_1.c 
> b/gcc/testsuite/gcc.target/aarch64/sve/init_1.c
> new file mode 100644
> index 000..c51876947fb
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/init_1.c
> @@ -0,0 +1,27 @@
> +/* { dg-do compile { target aarch64_asm_sve_ok } } */
> +/* { dg-options "-O2 -ftree-vectorize -fno-schedule-insns 
> -msve-vector-bits=256 --save-temps" } */

These tests shouldn't require -ftree-vectorize.

> [...]
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/init_10_run.c 
> b/gcc/testsuite/gcc.target/aarch64/sve/init_10_run.c
> new file mode 100644
> index 000..d9640e42ddd
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/init_10_run.c
> @@ -0,0 +1,21 @@
> +/* { dg-do run { target aarch64_sve256_hw } } */
> +

[PATCH] rs6000: Add undocumented switch -mprefixed-addr

2019-05-29 Thread Bill Schmidt
Hi,

This patch adds the undocumented switch -mprefixed-addr and a little of the 
basic
infrastructure around it.  It also includes a couple of lines in the same code
regions to disable P8 fusion for -mcpu=future.

Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no regressions.
Is this okay for trunk?

Thanks,
Bill


2019-05-28  Bill Schmidt  
Michael Meissner  

* rs6000-cpus.def (OTHER_FUSION_MASKS): New #define.
(ISA_FUTURE_MASKS_SERVER): Add OPTION_MASK_PREFIXED_ADDR. Mask off
OTHER_FUSION_MASKS.
(OTHER_FUTURE_MASKS): Add OPTION_MASK_PREFIXED_ADDR.
(POWERPC_MASKS): Likewise.
* rs6000.c (rs6000_option_override_internal): Error if
-mprefixed-addr is specified without -mcpu=future. Error if
-mpcrel is specified without -mprefixed-addr.
(rs6000_opt_masks): Add entry for prefixed-addr.
* rs6000.opt (mprefixed-addr): New option.


diff --git a/gcc/config/rs6000/rs6000-cpus.def 
b/gcc/config/rs6000/rs6000-cpus.def
index 5337382bdcf..2fb612a8401 100644
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -72,13 +72,20 @@
 | OPTION_MASK_P9_VECTOR\
 | OPTION_MASK_DIRECT_MOVE)
 
+/* ISA masks setting fusion options.  */
+#define OTHER_FUSION_MASKS (OPTION_MASK_P8_FUSION  \
+| OPTION_MASK_P8_FUSION_SIGN)
+
 /* Support for a future processor's features.  */
-#define ISA_FUTURE_MASKS_SERVER(ISA_3_0_MASKS_SERVER   
\
-| OPTION_MASK_FUTURE   \
-| OPTION_MASK_PCREL)
+#define ISA_FUTURE_MASKS_SERVER((ISA_3_0_MASKS_SERVER  
\
+ | OPTION_MASK_FUTURE  \
+ | OPTION_MASK_PCREL   \
+ | OPTION_MASK_PREFIXED_ADDR)  \
+& ~OTHER_FUSION_MASKS)
 
 /* Flags that need to be turned off if -mno-future.  */
-#define OTHER_FUTURE_MASKS (OPTION_MASK_PCREL)
+#define OTHER_FUTURE_MASKS (OPTION_MASK_PCREL  \
+| OPTION_MASK_PREFIXED_ADDR)
 
 /* Flags that need to be turned off if -mno-power9-vector.  */
 #define OTHER_P9_VECTOR_MASKS  (OPTION_MASK_FLOAT128_HW\
@@ -139,6 +146,7 @@
 | OPTION_MASK_POWERPC64\
 | OPTION_MASK_PPC_GFXOPT   \
 | OPTION_MASK_PPC_GPOPT\
+| OPTION_MASK_PREFIXED_ADDR\
 | OPTION_MASK_QUAD_MEMORY  \
 | OPTION_MASK_QUAD_MEMORY_ATOMIC   \
 | OPTION_MASK_RECIP_PRECISION  \
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 9229bad6acc..1860b344c3a 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -4296,12 +4296,24 @@ rs6000_option_override_internal (bool global_init_p)
   rs6000_isa_flags &= ~OPTION_MASK_FLOAT128_HW;
 }
 
-  /* -mpcrel requires the prefixed load/store support on FUTURE systems.  */
-  if (!TARGET_FUTURE && TARGET_PCREL)
+  /* -mprefixed-addr and -mpcrel require the prefixed load/store support on
+ FUTURE systems.  */
+  if (!TARGET_FUTURE && (TARGET_PCREL || TARGET_PREFIXED_ADDR))
 {
   if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
error ("%qs requires %qs", "-mpcrel", "-mcpu=future");
 
+  else if ((rs6000_isa_flags_explicit & OPTION_MASK_PREFIXED_ADDR) != 0)
+   error ("%qs requires %qs", "-mprefixed-addr", "-mcpu=future");
+
+  rs6000_isa_flags &= ~(OPTION_MASK_PCREL | OPTION_MASK_PREFIXED_ADDR);
+}
+
+  if (TARGET_PCREL && !TARGET_PREFIXED_ADDR)
+{
+  if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
+   error ("%qs requires %qs", "-mpcrel", "-mprefixed-addr");
+
   rs6000_isa_flags &= ~OPTION_MASK_PCREL;
 }
 
@@ -36379,6 +36391,7 @@ static struct rs6000_opt_mask const rs6000_opt_masks[] =
   { "power9-vector",   OPTION_MASK_P9_VECTOR,  false, true  },
   { "powerpc-gfxopt",  OPTION_MASK_PPC_GFXOPT, false, true  },
   { "powerpc-gpopt",   OPTION_MASK_PPC_GPOPT,  false, true  },
+  { "prefixed-addr",   OPTION_MASK_PREFIXED_ADDR,  false, true  },
   { "quad-memory", OPTION_MASK_QUAD_MEMORY,false, true  },
   { "quad-memory-atomic",  OPTION_MASK_QUAD_MEMORY_ATOMIC, false, true  },
   { "recip-precision", OPTION_MASK_RECIP_PRECISION,false, true  },
diff --git a/gcc/config/rs6000/rs6000.opt b/gcc/config/rs6000/rs6000.opt
index 43b04834746..3a4353674b8 100644

Re: Teach same_types_for_tbaa to structurally compare arrays, pointers and vectors

2019-05-29 Thread Richard Biener
On Wed, 29 May 2019, Jan Hubicka wrote:

> > On Mon, 27 May 2019, Jan Hubicka wrote:
> > 
> > > Hi,
> > > this is minimal version of patch adding just the pointer compare.
> > > Bootstrapped/regtested x86_64-linux, makes sense? :)
> > 
> > Yes, obviously.
> 
> Thanks, I will go ahead with installing it.
> Note that I have also tested removal of:
> 
>   /* ??? Array types are not properly unified in all cases as we have
>  spurious changes in the index types for example.  Removing this
>  causes all sorts of problems with the Fortran frontend.  */
>   if (TREE_CODE (type1) == ARRAY_TYPE
>   && TREE_CODE (type2) == ARRAY_TYPE)
> return -1;
> 
> And it causes no regressions.  I looked for the history and see you
> added it in 2009 because Fortran mixes up array of chars with char
> itself.  I am not sure if that was fixed since then or it is just
> about missing testcase?

I think we had a testcase back then and I'm not aware of any fixes
here.  The introducing mail says we miscompile protein (part of
polyhedron).  Another thing about arrays is that unification
doesn't work for VLAs even in C (consider nested fns being
inlined and sharing an array type with the caller), so we cannot
really say ARRAY_TYPEs with non-constant bounds are ever
"not equal".  So simply dropping this check looks wrong.
I'm not sure about char[] vs char but the FE definitely can
end up with char vs. char[1] and we need not consider those
different.

The Fortran FE is similarly sloppy in other areas, see

  /* ??? We cannot simply use the type of operand #0 of the refs here
 as the Fortran compiler smuggles type punning into COMPONENT_REFs
 for common blocks instead of using unions like everyone else.  */
  tree type1 = DECL_CONTEXT (field1);
  tree type2 = DECL_CONTEXT (field2);



> It does not seem to be that important, but it looks odd
> and makes me worried about other changes :)
> 
> Honza
> > 
> > Richard.
> 

-- 
Richard Biener 
SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany;
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah; HRB 21284 (AG Nürnberg)

Re: [Contrib PATCH] Add scripts to convert GCC repo from SVN to Git

2019-05-29 Thread Segher Boessenkool
On Wed, May 29, 2019 at 12:53:30AM +, Joseph Myers wrote:
> On Fri, 24 May 2019, Segher Boessenkool wrote:
> 
> > IMO the best we can do is use what we already have: what CVS or SVN used
> > as the committer identity.  *That* info is *correct* at least.
> 
> CVS and SVN have a local identity.  git has a global identity.  I consider 

Git has an identity (well, two) _per commit_, and there is no way you can
reconstruct people's prefered name and email address (at any point in time,
for every commit separately) correctly.  IMO it is much better to not even
try.  We already *have* enough info for anyone to trivially look up who wrote
what, and what might be that person's email address at the time.  But
pretending that is more than a guess is just wrong.

> it simply *incorrect* to put a local identity in a git committer or author 

On the contrary, the identity on the CVS or SVN server is 100% correct, and
it is the best we can do as far as I can see.


Segher


Re: Patch: don't cap TYPE_PRECISION of bitsizetype at MAX_FIXED_MODE_SIZE

2019-05-29 Thread Richard Biener
On Tue, May 28, 2019 at 5:43 PM Hans-Peter Nilsson
 wrote:
>
> TL;DR: instead of capping TYPE_PRECISION of bitsizetype at
> MAX_FIXED_MODE_SIZE, search for the largest fitting size from
> scalar_int_mode modes supported by the target using
> targetm.scalar_mode_supported_p.
>
> -
> In initialize_sizetypes, MAX_FIXED_MODE_SIZE is used as an upper
> limit to the *precision* of the bit size of the size-type
> (typically address length) of the target, which is wrong.
>
> The effect is that if a 32-bit target says "please don't cook up
> pieces larger than a register size", then we don't get more
> precision in address-related calculations than that, while the
> bit-precision needs to be at least (precision +
> LOG2_BITS_PER_UNIT + 1) with precision being the size of the
> address, to diagnose overflows.  There are gcc_asserts that
> guard this, causing ICE when broken.
>
> This MAX_FIXED_MODE_SIZE usage comes from r118977 (referencing
> PR27885 and PR28176) and was introduced as if
> MAX_FIXED_MODE_SIZE is the size of the largest supported type
> for the target (where "supported" is in the most trivial sense
> as in can move and add).  But it's not.
>
> MAX_FIXED_MODE_SIZE is arguably a bit vague, but documented as
> "the size in bits of the largest integer machine mode that
> should actually be used.  All integer machine modes of this size
> or smaller can be used for structures and unions with the
> appropriate sizes."

I read it as the machine may not have ways to do basic
things like add two numbers in modes bigger than this
but you can use larger modes as simple bit "containers".

>  While in general the documentation
> sometimes differs from reality, that's mostly right, with
> "should actually be" meaning "is preferably": it's the largest
> size that the target indicates as beneficial of use besides that
> directly mapped from types used in the source code; sort-of a
> performance knob.  (I did a static reality code check looking
> for direct and indirect uses before imposing this my own
> interpretation and recollection.)  Typical use is when gcc finds
> that some operations can be combined and synthesized to
> optionally use a wider mode than seen in the source (but mostly
> copying).  Then this macro sets an upper limit to those
> operations, whether to be done at all or the chunk-size.
> Unfortunately some of the effects are unintuitive and I wouldn't
> be surprised if this de-facto affects ABI corners.  It's not
> something you tweak more than once for a target.
>
> Two tests pass with this fixed for cris-elf (MAX_FIXED_MODE_SIZE
> 32): gcc.dg/attr-vector_size.c and gcc.dg/pr69973.c, where the
> lack of precision (32 bits instead of 64 bits for bitsizetype)
> caused a consistency check to ICE, from where I tracked this.

So why does cris-elf have 32 as MAX_FIXED_MODE_SIZE when it
can apparently do DImode arithmetic just fine?  On x86_64
we end up with TImode which is MAX_FIXED_MODE_SIZE, on
32bit x86 it is DImode.

So - fix cris instead?
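
For reference, the arithmetic at issue can be sketched in standalone C.  This is a simplification, not the code in stor-layout.c: the function names are invented, modes are assumed to be powers of two starting at 8 bits, and the "uncapped" variant ignores the target-support search that the patch actually performs.

```c
#include <assert.h>

/* Bits needed to count bits and still diagnose overflow:
   address precision + log2 (bits per unit) + 1.  */
static unsigned
required_bit_precision (unsigned precision, unsigned log2_bits_per_unit)
{
  return precision + log2_bits_per_unit + 1;
}

/* Toy stand-in for smallest_int_mode_for_size: the smallest
   power-of-two size, at least 8, that holds BITS bits.  */
static unsigned
smallest_mode_size_for (unsigned bits)
{
  unsigned size = 8;
  while (size < bits)
    size *= 2;
  return size;
}

/* Current computation: cap at MAX_FIXED_MODE_SIZE first.  */
static unsigned
capped_bprecision (unsigned precision, unsigned log2_bits_per_unit,
                   unsigned max_fixed_mode_size)
{
  unsigned need = required_bit_precision (precision, log2_bits_per_unit);
  if (need > max_fixed_mode_size)
    need = max_fixed_mode_size;
  return smallest_mode_size_for (need);
}

/* Computation without the cap (simplified).  */
static unsigned
uncapped_bprecision (unsigned precision, unsigned log2_bits_per_unit)
{
  return smallest_mode_size_for
    (required_bit_precision (precision, log2_bits_per_unit));
}
```

With a 32-bit address and 8-bit units, 36 bits are needed; a cap of 32 yields a 32-bit bitsizetype, while the uncapped form yields 64.  With a 64-bit address and a cap of 128, both give 128.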

> Regarding the change, MAX_FIXED_MODE_SIZE is still mentioned but
> just to initialize the fall-back largest-supported precision.
> Sometimes the target supports no larger mode than that of the
> address, like for a 64-bit target lacking support for larger
> sizes (e.g. TImode), as in the motivating PR27885.  I guess they
> can still get ICE for overflowing address-calculation checks,
> but that's a separate problem, fixable by the target.
>
> I considered making a separate convenience function as well as
> amending smallest_int_mode_for_size (e.g., an optional argument
> to have smallest_mode_for_size neither abort nor cap at
> MAX_FIXED_MODE_SIZE) but I think the utility is rather specific
> to this use.  We rarely want to both settle for a smaller type
> than the one requested, and that possibly being larger than
> MAX_FIXED_MODE_SIZE.
> -
>
> Regtested cris-elf and x86_64-linux-gnu, and a separate
> cross-build for hppa64-hp-hpux11.11 to spot-check that I didn't
> re-introduce PR27885.  (Not a full cross-build, just building
> f951 and following initialize_sizetypes in gdb to see TRT
> happening.)
>
> Ok to commit?
>
> gcc:
> * stor-layout.c (initialize_sizetypes): Set the precision
> of bitsizetype to the size of largest integer mode
> supported by target, not necessarily MAX_FIXED_MODE_SIZE.
>
> --- gcc/stor-layout.c.orig  Sat May 25 07:12:49 2019
> +++ gcc/stor-layout.c   Tue May 28 04:29:10 2019
> @@ -2728,9 +2728,36 @@ initialize_sizetypes (void)
> gcc_unreachable ();
>  }
>
> -  bprecision
> -= MIN (precision + LOG2_BITS_PER_UNIT + 1, MAX_FIXED_MODE_SIZE);
> -  bprecision = GET_MODE_PRECISION (smallest_int_mode_for_size (bprecision));
> +  bprecision = precision + LOG2_BITS_PER_UNIT + 1;
> +
> +  /* Find the precision of the largest supported mode equal to or larger
> + than needed for the bitsize precision.  This may be larger than
> + MAX_FIXED_MODE_SIZE, which is just the largest 

Re: [PATCH] rs6000: Add undocumented switch -mprefixed-addr

2019-05-29 Thread Segher Boessenkool
Hi!

On Wed, May 29, 2019 at 07:42:38AM -0500, Bill Schmidt wrote:
>   * rs6000-cpus.def (OTHER_FUSION_MASKS): New #define.
>   (ISA_FUTURE_MASKS_SERVER): Add OPTION_MASK_PREFIXED_ADDR. Mask off
>   OTHER_FUSION_MASKS.

Two spaces after a full stop (here and later again).

> +/* ISA masks setting fusion options.  */
> +#define OTHER_FUSION_MASKS   (OPTION_MASK_P8_FUSION  \
> +  | OPTION_MASK_P8_FUSION_SIGN)

Or merge the two masks into one?

>  /* Support for a future processor's features.  */
> -#define ISA_FUTURE_MASKS_SERVER  (ISA_3_0_MASKS_SERVER   
> \
> -  | OPTION_MASK_FUTURE   \
> -  | OPTION_MASK_PCREL)
> +#define ISA_FUTURE_MASKS_SERVER  ((ISA_3_0_MASKS_SERVER  
> \
> +   | OPTION_MASK_FUTURE  \
> +   | OPTION_MASK_PCREL   \
> +   | OPTION_MASK_PREFIXED_ADDR)  \
> +  & ~OTHER_FUSION_MASKS)

OTHER_FUSION_MASKS shouldn't be part of ISA_3_0_MASKS_SERVER.  Fix that
instead?  Fusion is a property of specific CPUs, not of ISA versions.
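
To make the difference concrete, here is a toy model of the two ways of building the mask.  The bit values and all TOY_ names are invented for illustration and do not correspond to the real OPTION_MASK_* definitions; the point is only that both forms give the same FUTURE mask, while the second keeps fusion out of the ISA mask to begin with.

```c
#include <assert.h>

/* Invented bit assignments, for illustration only.  */
enum
{
  TOY_MASK_ISA_3_0        = 1 << 0,
  TOY_MASK_P8_FUSION      = 1 << 1,
  TOY_MASK_P8_FUSION_SIGN = 1 << 2,
  TOY_MASK_FUTURE         = 1 << 3,
  TOY_MASK_PCREL          = 1 << 4,
  TOY_MASK_PREFIXED_ADDR  = 1 << 5
};

#define TOY_OTHER_FUSION_MASKS \
  (TOY_MASK_P8_FUSION | TOY_MASK_P8_FUSION_SIGN)

/* As in the patch: the ISA 3.0 set carries fusion bits, which are
   masked off again when building the FUTURE set.  */
static unsigned
toy_future_masks_as_patched (void)
{
  unsigned isa_3_0_server = TOY_MASK_ISA_3_0 | TOY_OTHER_FUSION_MASKS;
  return (isa_3_0_server | TOY_MASK_FUTURE | TOY_MASK_PCREL
          | TOY_MASK_PREFIXED_ADDR) & ~(unsigned) TOY_OTHER_FUSION_MASKS;
}

/* As suggested in review: fusion is never part of the ISA set, so
   nothing has to be masked off.  */
static unsigned
toy_future_masks_as_suggested (void)
{
  unsigned isa_3_0_server = TOY_MASK_ISA_3_0;
  return (isa_3_0_server | TOY_MASK_FUTURE | TOY_MASK_PCREL
          | TOY_MASK_PREFIXED_ADDR);
}
```

In this model the fusion bits would then be set per CPU rather than per ISA level, which is the design choice the review comment argues for.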

> -  /* -mpcrel requires the prefixed load/store support on FUTURE systems.  */
> -  if (!TARGET_FUTURE && TARGET_PCREL)
> +  /* -mprefixed-addr and -mpcrel require the prefixed load/store support on
> + FUTURE systems.  */
> +  if (!TARGET_FUTURE && (TARGET_PCREL || TARGET_PREFIXED_ADDR))
>  {
>if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
>   error ("%qs requires %qs", "-mpcrel", "-mcpu=future");

PCREL requires PREFIXED_ADDR, please simplify.

> +  if (TARGET_PCREL && !TARGET_PREFIXED_ADDR)
> +{
> +  if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
> + error ("%qs requires %qs", "-mpcrel", "-mprefixed-addr");
> +
>rs6000_isa_flags &= ~OPTION_MASK_PCREL;
>  }

Maybe put this test first, if that makes things easier or more logical?

> @@ -36379,6 +36391,7 @@ static struct rs6000_opt_mask const 
> rs6000_opt_masks[] =
>{ "power9-vector", OPTION_MASK_P9_VECTOR,  false, true  },
>{ "powerpc-gfxopt",OPTION_MASK_PPC_GFXOPT, false, 
> true  },
>{ "powerpc-gpopt", OPTION_MASK_PPC_GPOPT,  false, true  },
> +  { "prefixed-addr", OPTION_MASK_PREFIXED_ADDR,  false, true  },

Do we want this?  Why?


Segher


Re: Teach same_types_for_tbaa to structurally compare arrays, pointers and vectors

2019-05-29 Thread Jan Hubicka
> 
> Please also see if there are testcases that do anything meaningful
> and FAIL after instead of
> 
>   /* Do access-path based disambiguation.  */
>   if (ref1 && ref2
>   && (handled_component_p (ref1) || handled_component_p (ref2)))
> 
> doing
> 
>   /* Do access-path based disambiguation.  */
>   if (ref1 && ref2
>   && (handled_component_p (ref1) && handled_component_p (ref2)))
> 
On tramp3d we get quite a few matches, which are attached. If ref1 is
MEM_REF and ref2 has non-trivial access path then it seems we need:
 1) ref1 and ref2 to conflict (ref1 is a record or alias set 0)
 2) basetype2 to contain ref1 (so it conflicts too)
 3) if ref1 is a record then the access path may go into a type
contained as field of ref1 but via path not containing ref1 itself.

I tried to construct testcase:

struct foo {int val;} *fooptr;
struct bar {struct foo foo; int val2;} *barptr;
int test()
{ 
  struct foo foo={0};
  barptr->val2 = 1;
  *fooptr=foo;
  return barptr->val2;
}

but we do not optimize it. I.e. optimized dump has:

test ()
{
  struct bar * barptr.0_1;
  struct foo * fooptr.1_2;
  int _6;

   [local count: 1073741824]:
  barptr.0_1 = barptr;
  barptr.0_1->val2 = 1;
  fooptr.1_2 = fooptr;
  MEM[(struct foo *)fooptr.1_2] = 0;
  _6 = barptr.0_1->val2;
  return _6;
}

I see no reason why we should not constant propagate the return value.
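
A self-contained variant of the testcase, with a hypothetical driver added, is sketched below.  The driver only exercises one configuration, where fooptr points at the foo member of *barptr; it illustrates that the store through fooptr leaves val2 alone in that case and is not meant as a proof that the transformation is valid in general.

```c
#include <assert.h>

struct foo { int val; } *fooptr;
struct bar { struct foo foo; int val2; } *barptr;

/* The testcase from above, unchanged apart from the (void).  */
int
test (void)
{
  struct foo foo = { 0 };
  barptr->val2 = 1;
  *fooptr = foo;
  return barptr->val2;
}

/* Hypothetical driver: let fooptr point into *barptr and run the
   test.  The store through fooptr covers only the foo member.  */
int
run_with_overlap (void)
{
  static struct bar b;
  b.foo.val = 42;
  barptr = &b;
  fooptr = &b.foo;
  return test ();
}
```

Here run_with_overlap () returns 1 and b.foo.val ends up 0, so the value loaded from val2 is the one stored just before.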

Honza


rep5-sametest2-fits6.gz
Description: application/gzip


Re: undefined behavior in value_range::equiv_add()?

2019-05-29 Thread Richard Biener
On Wed, May 29, 2019 at 2:18 PM Aldy Hernandez  wrote:
>
> As per the API, and the original documentation to value_range,
> VR_UNDEFINED and VR_VARYING should never have equivalences.  However,
> equiv_add is tacking on equivalences blindly, and there are various
> regressions that happen if I fix this oversight.
>
> void
> value_range::equiv_add (const_tree var,
> const value_range *var_vr,
> bitmap_obstack *obstack)
> {
>if (!m_equiv)
>  m_equiv = BITMAP_ALLOC (obstack);
>unsigned ver = SSA_NAME_VERSION (var);
>bitmap_set_bit (m_equiv, ver);
>if (var_vr && var_vr->m_equiv)
>  bitmap_ior_into (m_equiv, var_vr->m_equiv);
> }
>
> Is this a bug in the documentation / API, or is equiv_add incorrect and
> we should fix the fall-out elsewhere?

I think this must have crept in during the classification.  If you
go back to say GCC 7 you shouldn't see value-ranges with
UNDEFINED/VARYING state in the lattice that have equivalences.

It may not be easy to avoid with the new classy interface but we're
certainly not tacking on them "blindly".  At least we're not supposed
to.  As usual the intermediate state might be "broken" but
intermediateness is not sth the new class "likes".
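
As a toy illustration of the invariant being discussed (not the value_range class and not a proposal for the actual fix), the rule "UNDEFINED and VARYING never carry equivalences" could be modelled like this.  All names are invented, and the equivalence set is a plain bitmask instead of a bitmap.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_kind { TOY_UNDEFINED, TOY_RANGE, TOY_ANTI_RANGE, TOY_VARYING };

/* EQUIV is a bitmask of SSA versions (toy limit: versions 0..31).  */
struct toy_range { enum toy_kind kind; unsigned long equiv; };

/* Only ranges and anti-ranges may carry equivalences.  */
static bool
toy_may_have_equiv_p (const struct toy_range *vr)
{
  return vr->kind == TOY_RANGE || vr->kind == TOY_ANTI_RANGE;
}

/* Add VER, plus the equivalences of OTHER if given.  Returns false
   and leaves VR untouched if that would break the invariant.  */
static bool
toy_equiv_add (struct toy_range *vr, unsigned ver,
               const struct toy_range *other)
{
  assert (ver < 32);
  if (!toy_may_have_equiv_p (vr))
    return false;
  vr->equiv |= 1UL << ver;
  if (other != NULL)
    vr->equiv |= other->equiv;
  return true;
}
```

A caller that hits the false return is exactly the kind of intermediate state that would need fixing elsewhere.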

Richard.

>
> Thanks.
> Aldy


Re: Teach same_types_for_tbaa to structurally compare arrays, pointers and vectors

2019-05-29 Thread Richard Biener
On Wed, May 29, 2019 at 3:21 PM Jan Hubicka  wrote:
>
> >
> > Please also see if there are testcases that do anything meaningful
> > and FAIL after instead of
> >
> >   /* Do access-path based disambiguation.  */
> >   if (ref1 && ref2
> >   && (handled_component_p (ref1) || handled_component_p (ref2)))
> >
> > doing
> >
> >   /* Do access-path based disambiguation.  */
> >   if (ref1 && ref2
> >   && (handled_component_p (ref1) && handled_component_p (ref2)))
> >
> On tramp3d we get quite a few matches, which are attached. If ref1 is
> MEM_REF and ref2 has non-trivial access path then it seems we need:
>  1) ref1 and ref2 to conflict (ref1 is a record or alias set 0)
>  2) basetype2 to contain ref1 (so it conflicts too)
>  3) if ref1 is a record then the access path may go into a type
> contained as field of ref1 but via path not containing ref1 itself.
>
> I tried to construct testcase:
>
> struct foo {int val;} *fooptr;
> struct bar {struct foo foo; int val2;} *barptr;
> int test()
> {
>   struct foo foo={0};
>   barptr->val2 = 1;
>   *fooptr=foo;
>   return barptr->val2;
> }
>
> but we do not optimize it. I.e. optimized dump has:
>
> test ()
> {
>   struct bar * barptr.0_1;
>   struct foo * fooptr.1_2;
>   int _6;
>
>[local count: 1073741824]:
>   barptr.0_1 = barptr;
>   barptr.0_1->val2 = 1;
>   fooptr.1_2 = fooptr;
>   MEM[(struct foo *)fooptr.1_2] = 0;
>   _6 = barptr.0_1->val2;
>   return _6;
> }
>
> I see no reason why we should not constant propagate the return value.

Indeed a good example.  Make it work and add it to the testsuite ;)
I would have said get_alias_set () on the ref type should already have
disambiguated 'int' (barptr->val2) from *fooptr (struct foo) but of course
they conflict because foo contains 'int'.

I guess it doesn't work because 'struct foo' isn't part of the other
path.  Here nonoverlapping_component_refs_of_decl_p would be
the vehicle to use (but IIRC that would also require a common
type in one of both paths).
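
For reference, the testcase above can be written as a self-contained C
file along the following lines.  This is only a sketch: the
run_with_distinct_objects driver is a hypothetical addition that is not
in the original mail, and it only checks that the function returns 1
when the two pointers refer to separate objects.  Whether the load is
actually folded still has to be checked in the optimized dump.

```c
#include <assert.h>

struct foo { int val; } *fooptr;
struct bar { struct foo foo; int val2; } *barptr;

/* The function from the testcase: ideally the return value folds to 1.  */
int test(void)
{
  struct foo foo = {0};
  barptr->val2 = 1;
  *fooptr = foo;
  return barptr->val2;
}

/* Hypothetical driver: point the globals at two separate objects.  */
int run_with_distinct_objects(void)
{
  static struct foo f = {5};
  static struct bar b = {{7}, 9};
  fooptr = &f;
  barptr = &b;
  int r = test();
  assert(f.val == 0);   /* the store through fooptr did happen */
  return r;
}
```

Calling run_with_distinct_objects () should give 1 at any optimization
level; the interesting part is whether the final load disappears.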

Richard.

>
> Honza


Re: Teach same_types_for_tbaa to structurally compare arrays, pointers and vectors

2019-05-29 Thread Jan Hubicka
> >   /* ??? Array types are not properly unified in all cases as we have
> >  spurious changes in the index types for example.  Removing this
> >  causes all sorts of problems with the Fortran frontend.  */
> >   if (TREE_CODE (type1) == ARRAY_TYPE
> >   && TREE_CODE (type2) == ARRAY_TYPE)
> > return -1;
> > 
> > And it causes no regressions.  I looked for the history and see you
> > added it in 2009 because Fortran mixes up array of chars with char
> > itself.  I am not sure if that was fixed since then or it is just
> > about missing testcase?
> 
> I think we had a testcase back then and I'm not aware of any fixes
> here.  The introducing mail says we miscompile protein (part of
> polyhedron).  Another thing about arrays is that unification
> doesn't work for VLAs even in C (consider nested fns being
> inlined and sharing an array type with the caller), so we cannot
> really say ARRAY_TYPEs with non-constant bounds are ever
> "not equal".  So simply dropping this check looks wrong.
> I'm not sure about char[] vs char but the FE definitely can
> end up with char vs. char[1] and we need not consider those
> different.
> 
> The fortran FE is similarly sloppy in other areas, see
> 
>   /* ??? We cannot simply use the type of operand #0 of the refs here
>  as the Fortran compiler smuggles type punning into COMPONENT_REFs
>  for common blocks instead of using unions like everyone else.  */
>   tree type1 = DECL_CONTEXT (field1);
>   tree type2 = DECL_CONTEXT (field2);

I think the reason why things work now is the following test:

  /* ??? In Ada, an lvalue of an unconstrained type can be used to access an
 object of one of its constrained subtypes, e.g. when a function with an
 unconstrained parameter passed by reference is called on an object and
 inlined.  But, even in the case of a fixed size, type and subtypes are
 not equivalent enough as to share the same TYPE_CANONICAL, since this
 would mean that conversions between them are useless, whereas they are
 not (e.g. type and subtypes can have different modes).  So, in the end,
 they are only guaranteed to have the same alias set.  */
  if (get_alias_set (type1) == get_alias_set (type2))
return -1;

If you have arrays of compatible basetype, say
 int a[10] 
 int b[var] 
or array and its basetype, say
 int a[10] 
 int
we will end up returning -1 because array alias set is basetype alias
set unless it is TYPE_NONALIASED_COMPONENT (and I think those we should
be able to skip inside the access path oracles).

With the array check removed we however will disambiguate
 int a[10];
 short a[10];

This has a similar effect to the logic I implemented in the original patch
(i.e. we can prove arrays to be incompatible, but without extra care
about VLA bounds we can't prove them to be the same).
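
The order of the checks can be illustrated with a toy model.  This is
only a sketch under stated assumptions: struct toy_type, its fields and
the keep_array_check flag are made up for illustration and are not GCC
data structures; only the three-way result (1 same, 0 different,
-1 cannot decide) and the order of the tests follow the code quoted
above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a type; all fields are hypothetical.  */
struct toy_type {
  int canonical;   /* models TYPE_CANONICAL identity */
  int alias_set;   /* models the result of get_alias_set () */
  bool is_array;   /* models TREE_CODE (type) == ARRAY_TYPE */
};

/* Return 1 (same), 0 (different) or -1 (cannot decide).  */
int toy_same_type_for_tbaa(struct toy_type t1, struct toy_type t2,
                           bool keep_array_check)
{
  if (t1.canonical == t2.canonical)
    return 1;
  if (keep_array_check && t1.is_array && t2.is_array)
    return -1;
  if (t1.alias_set == t2.alias_set)
    return -1;
  return 0;
}

/* The cases discussed in the mail, with the array check removed.  */
void toy_check_examples(void)
{
  struct toy_type int_t     = {1, 10, false};  /* int */
  struct toy_type int_a10   = {2, 10, true};   /* int a[10] */
  struct toy_type int_avar  = {3, 10, true};   /* int b[var] */
  struct toy_type short_a10 = {4, 20, true};   /* short a[10] */

  assert(toy_same_type_for_tbaa(int_a10, int_avar, false) == -1);
  assert(toy_same_type_for_tbaa(int_a10, int_t, false) == -1);
  assert(toy_same_type_for_tbaa(int_a10, short_a10, false) == 0);
}
```

In this model the array check only changes the int a[10] vs.
short a[10] case, which goes from -1 to 0 once the check is dropped.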

Honza
> 
> 
> 
> > It does not seem to be that important, but looks odd 
> > > and makes me worried about other changes :)
> > 
> > Honza
> > > 
> > > Richard.
> > 
> 
> -- 
> Richard Biener 
> SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany;
> GF: Felix Imendörffer, Mary Higgins, Sri Rasiah; HRB 21284 (AG Nürnberg)



Re: Fwd: [PATCH] Fix few build warnings with LLVM toolchain

2019-05-29 Thread Segher Boessenkool
On Tue, May 28, 2019 at 10:28:07AM +, David CARLIER wrote:
> -- Forwarded message -
> From: David CARLIER 
> Date: Tue, 28 May 2019 at 10:16
> Subject: Re: [PATCH] Fix few build warnings with LLVM toolchain
> To: Segher Boessenkool 
> 
> 
> All right, here an updated version, hope it looks better.

Please don't top-post, and don't use base64 either?  Making it hard for
people to reply to your patch submission does not help you.


Segher


[PATCH][RFC] final-value replacement from DCE

2019-05-29 Thread Richard Biener


The following tries to address PR90648 by performing final
value replacement from DCE when DCE knows the final value
computation is not used during loop iteration.  This fits
neatly enough into existing tricks performed by DCE like
removing unused malloc/free pairs.
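
As a reminder of what final value replacement means at the source
level, here is a minimal sketch.  The function names are made up for
illustration and the example is not taken from PR90648; it only shows a
loop whose result is needed after the loop and the closed form that
could replace it.

```c
#include <assert.h>

/* Loop form: x is only needed once the loop has finished.  */
unsigned final_by_loop(unsigned n)
{
  unsigned x = 0;
  for (unsigned i = 0; i < n; i++)
    x += 3;
  return x;
}

/* The closed form a final value replacement would compute instead.  */
unsigned final_closed_form(unsigned n)
{
  return 3u * n;
}

/* Both forms agree; unsigned arithmetic wraps the same way in each.  */
void check_final_value(unsigned n)
{
  assert(final_by_loop(n) == final_closed_form(n));
}
```

Once the exit value is expressed in closed form, nothing in the loop
uses x any more and the loop itself becomes dead code.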

There are a few complications: one is that it fails to bootstrap
because it exposes a few uninit warning false positives,
another is that -fno-tree-sccp is no longer effective.
As written this turns gcc.dg/pr34027-1.c into a division
again (I did not copy the expression_expensive checking).
It also seems to need -ftrapv adjustments (gcc.dg/pr81661.c).

The goal of this patch is to remove the SCCP pass, or rather to stop
unconditionally replacing loop-closed PHIs with final value
computations.  We've had complaints in the past that this
duplicates computation that is readily available.  I've not
yet figured out the testsuite fallout from that change.

For -fno-tree-sccp I'm considering simply honoring that
flag in the DCE path; for gcc.dg/pr34027-1.c I'll
re-install the expression_expensive checking.  I'll
also fix the -ftrapv issue.

Does this otherwise look like a sensible way forward?

Thanks,
Richard.

FAIL: gcc.dg/builtin-object-size-1.c execution test
FAIL: gcc.dg/builtin-object-size-5.c scan-assembler-not abort
FAIL: gcc.dg/pr34027-1.c scan-tree-dump-times optimized " / " 0
FAIL: gcc.dg/pr81661.c (internal compiler error)
FAIL: gcc.dg/pr81661.c (test for excess errors)
XPASS: gcc.dg/tree-ssa/loop-15.c scan-tree-dump-times optimized " + " 0
FAIL: gcc.dg/tree-ssa/loop-15.c scan-tree-dump-times optimized "if " 1
FAIL: gcc.dg/tree-ssa/loop-26.c scan-tree-dump-times optimized "if" 2
FAIL: gcc.dg/tree-ssa/pr32044.c scan-tree-dump-times optimized " / " 0
FAIL: gcc.dg/tree-ssa/pr32044.c scan-tree-dump-times optimized "if" 6
FAIL: gcc.dg/tree-ssa/pr64183.c scan-tree-dump cunroll "Loop 2 iterates at most 3 times"
FAIL: gcc.dg/tree-ssa/ssa-pre-3.c scan-tree-dump-times pre "Eliminated: 2" 1
FAIL: gcc.dg/vect/no-scevccp-noreassoc-outer-3.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED." 1
FAIL: gcc.dg/vect/no-scevccp-noreassoc-outer-4.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED." 1
FAIL: gcc.dg/vect/no-scevccp-noreassoc-outer-5.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED." 1
FAIL: gcc.dg/vect/no-scevccp-outer-11.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED." 1
FAIL: gcc.dg/vect/no-scevccp-outer-13.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED." 1
FAIL: gcc.dg/vect/no-scevccp-outer-14.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED." 1
FAIL: gcc.dg/vect/no-scevccp-outer-15.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED." 1
FAIL: gcc.dg/vect/no-scevccp-outer-16.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED." 1
FAIL: gcc.dg/vect/no-scevccp-outer-17.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED." 1
FAIL: gcc.dg/vect/no-scevccp-outer-18.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED." 1
FAIL: gcc.dg/vect/no-scevccp-outer-19.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED." 1
FAIL: gcc.dg/vect/no-scevccp-outer-2.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED" 1
FAIL: gcc.dg/vect/no-scevccp-outer-20.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED." 1
FAIL: gcc.dg/vect/no-scevccp-outer-21.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED." 1
FAIL: gcc.dg/vect/no-scevccp-outer-3.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED." 1
FAIL: gcc.dg/vect/no-scevccp-outer-5.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED." 1
FAIL: gcc.dg/vect/no-scevccp-outer-6-global.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED." 1
FAIL: gcc.dg/vect/no-scevccp-outer-6.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED." 1
FAIL: gcc.dg/vect/no-scevccp-outer-7.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED." 1
FAIL: gcc.dg/vect/no-scevccp-outer-8.c scan-tree-dump-times vect "OUTER LOOP VECTORIZED." 1
FAIL: gcc.dg/vect/no-scevccp-vect-iv-1.c scan-tree-dump-times vect "vectorized 1 loops" 1
FAIL: gcc.dg/vect/no-scevccp-vect-iv-3.c scan-tree-dump-times vect "vect_recog_widen_sum_pattern: detected" 1
FAIL: gcc.dg/vect/no-scevccp-vect-iv-3.c scan-tree-dump-times vect "vectorized 1 loops" 1

Running target unix//-m32
FAIL: gcc.dg/builtin-object-size-1.c execution test
FAIL: gcc.dg/builtin-object-size-5.c scan-assembler-not abort
FAIL: gcc.dg/pr34027-1.c scan-tree-dump-times optimized " / " 0
FAIL: gcc.dg/pr81661.c (internal compiler error)
FAIL: gcc.dg/pr81661.c (test for excess errors)
XPASS: gcc.dg/tree-ssa/loop-15.c scan-tree-dump-times optimized " + " 0
FAIL: gcc.dg/tree-ssa/loop-15.c scan-tree-dump-times optimized "if " 1
FAIL: gcc.dg/tree-ssa/loop-26.c scan-tree-dump-times optimized "if" 2
FAIL: gcc.dg/tree-ssa/pr32044.c scan-tree-dump-times optimized " / " 0
FAIL: gcc.dg/tree-ssa/pr32044.c scan-tree-dump-times optimized "if" 6
FAIL: gcc.dg/tree-ssa/pr64183.c scan-tree-dump cunroll "Loop 2 iterates at most 3 times"
FAIL: gcc.dg/tree-ssa/ssa-pre-3.c scan-tree-dump-times pre "Eliminated: 

RFA: Synchronize top level files with binutils

2019-05-29 Thread Nick Clifton
Hi Guys,

  I would like to bring over a few additions that have recently been
  made to the binutils versions of the Makefile.def and configure.ac
  files.  Any objections ?

  Note - I did run a toolchain bootstrap after applying this patch
  locally and that went OK...

Cheers
  Nick

./ChangeLog
2019-05-29  Nick Clifton  

Import from binutils:
2019-05-29  Nick Clifton  

* configure.ac (noconfigdirs): Add libctf if the target does not use
the ELF file format.
* configure: Regenerate.

2019-05-28  Nick Alcock  

* Makefile.def (dependencies): configure-libctf depends on all-bfd
and all its deps.
* Makefile.in: Regenerated.

2019-05-28  Nick Alcock  

* Makefile.def (host_modules): Add libctf.
* Makefile.def (dependencies): Likewise.
libctf depends on zlib, libiberty, and bfd.
* Makefile.in: Regenerated.
* configure.ac (host_libs): Add libctf.
* configure: Regenerated.

Index: Makefile.def
===
--- Makefile.def(revision 271737)
+++ Makefile.def(working copy)
@@ -4,7 +4,7 @@
 // Makefile.in is generated from Makefile.tpl by 'autogen Makefile.def'.
 // This file was originally written by Nathanael Nerode.
 //
-//   Copyright 2002-2013 Free Software Foundation
+//   Copyright 2002-2019 Free Software Foundation
 //
 // This file is free software; you can redistribute it and/or modify
 // it under the terms of the GNU General Public License as published by
@@ -128,6 +128,8 @@
extra_make_flags='@extra_linker_plugin_flags@'; };
 host_modules= { module= libcc1; extra_configure_flags=--enable-shared; };
 host_modules= { module= gotools; };
+host_modules= { module= libctf; no_install=true; no_check=true;
+   bootstrap=true; };
 
 target_modules = { module= libstdc++-v3;
   bootstrap=true;
@@ -137,6 +139,9 @@
   bootstrap=true;
   lib_path=.libs;
   raw_cxx=true; };
+target_modules = { module= libmpx;
+  bootstrap=true;
+  lib_path=.libs; };
 target_modules = { module= libvtv;
   bootstrap=true;
   lib_path=.libs;
@@ -428,6 +433,7 @@
 dependencies = { module=all-binutils; on=all-build-bison; };
 dependencies = { module=all-binutils; on=all-intl; };
 dependencies = { module=all-binutils; on=all-gas; };
+dependencies = { module=all-binutils; on=all-libctf; };
 
 // We put install-opcodes before install-binutils because the installed
 // binutils might be on PATH, and they might need the shared opcodes
@@ -518,6 +524,14 @@
 dependencies = { module=all-fastjar; on=all-zlib; };
 dependencies = { module=all-fastjar; on=all-build-texinfo; };
 dependencies = { module=all-fastjar; on=all-libiberty; };
+dependencies = { module=all-libctf; on=all-libiberty; hard=true; };
+dependencies = { module=all-libctf; on=all-bfd; };
+dependencies = { module=all-libctf; on=all-zlib; };
+// So that checking for ELF support in BFD from libctf configure is possible.
+dependencies = { module=configure-libctf; on=all-bfd; };
+dependencies = { module=configure-libctf; on=all-intl; };
+dependencies = { module=configure-libctf; on=all-zlib; };
+dependencies = { module=configure-libctf; on=all-libiconv; };
 
 // Warning, these are not well tested.
 dependencies = { module=all-bison; on=all-intl; };
Index: configure.ac
===
--- configure.ac(revision 271737)
+++ configure.ac(working copy)
@@ -131,7 +131,7 @@
 
 # these libraries are used by various programs built for the host environment
 #f
-host_libs="intl libiberty opcodes bfd readline tcl tk itcl libgui zlib libbacktrace libcpp libdecnumber gmp mpfr mpc isl libelf libiconv"
+host_libs="intl libiberty opcodes bfd readline tcl tk itcl libgui zlib libbacktrace libcpp libdecnumber gmp mpfr mpc isl libelf libiconv libctf"
 
 # these tools are built for the host environment
 # Note, the powerpc-eabi build depends on sim occurring before gdb in order to
@@ -928,7 +934,23 @@
 ;;
 esac
 
+# Targets that do not use the ELF file format cannot support libctf.
 case "${target}" in
+  *-*-pe | *-*-*vms* | *-*-darwin | *-*-*coff* | *-*-wince | *-*-mingw*)
+noconfigdirs="$noconfigdirs libctf"
+;;
+  *-*-aout | *-*-osf* | *-*-go32 | *-*-macos* | *-*-rhapsody*)
+noconfigdirs="$noconfigdirs libctf"
+;;
+  *-*-netbsdpe | *-*-cygwin* | *-*-pep | *-*-msdos | *-*-winnt)
+noconfigdirs="$noconfigdirs libctf"
+;;
+  ns32k-*-* | pdp11-*-* | *-*-aix* | *-*-netbsdaout)
+noconfigdirs="$noconfigdirs libctf"
+;;
+esac
+
+case "${target}" in
   *-*-chorusos)
 ;;
   aarch64-*-darwin*)


RE: Implement vector average patterns for SVE2

2019-05-29 Thread Alejandro Martinez Vicente
Turns out I was missing a few bits and pieces. Here is the updated patch and 
changelog.

Alejandro


2019-05-29  Alejandro Martinez  

* config/aarch64/aarch64-c.c: Added TARGET_SVE2.
* config/aarch64/aarch64-sve2.md: New file.
(avg3_floor): New pattern.
(avg3_ceil): Likewise.
(*h): Likewise.
* config/aarch64/aarch64.h: Added AARCH64_ISA_SVE2 and TARGET_SVE2.
* config/aarch64/aarch64.md: Include aarch64-sve2.md.


2019-05-29  Alejandro Martinez  

gcc/testsuite/
* gcc.target/aarch64/sve2/aarch64-sve2.exp: New file, regression driver
for AArch64 SVE2.
* gcc.target/aarch64/sve2/average_1.c: New test.
* lib/target-supports.exp (check_effective_target_aarch64_sve2): New
helper.
(check_effective_target_aarch64_sve1_only): Likewise.
(check_effective_target_aarch64_sve2_hw): Likewise.
(check_effective_target_vect_avg_qi): Check for SVE1 only.
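
For readers unfamiliar with the averaging optabs, the scalar loops they
correspond to look roughly as follows.  This is a sketch for unsigned
bytes only; the function names are made up for illustration and this is
not the content of average_1.c.  The sum is formed in a wider type so
that it cannot overflow before the shift.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rounding-down average: (a + b) >> 1 computed in a wider type.  */
void avg_floor_u8(uint8_t *dst, const uint8_t *a, const uint8_t *b, size_t n)
{
  assert(dst && a && b);
  for (size_t i = 0; i < n; i++)
    dst[i] = (uint8_t)(((unsigned)a[i] + b[i]) >> 1);
}

/* Rounding-up average: (a + b + 1) >> 1 computed in a wider type.  */
void avg_ceil_u8(uint8_t *dst, const uint8_t *a, const uint8_t *b, size_t n)
{
  assert(dst && a && b);
  for (size_t i = 0; i < n; i++)
    dst[i] = (uint8_t)(((unsigned)a[i] + b[i] + 1) >> 1);
}
```

For inputs 1 and 2 the floor variant gives 1 and the ceil variant
gives 2; for 255 and 255 both give 255.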

> -Original Message-
> From: Richard Sandiford 
> Sent: 29 May 2019 10:54
> To: Alejandro Martinez Vicente 
> Cc: GCC Patches ; nd 
> Subject: Re: Implement vector average patterns for SVE2
> 
> Alejandro Martinez Vicente  writes:
> > Hi,
> >
> > This patch implements the [u]avgM3_floor and [u]avgM3_ceil optabs for
> SVE2.
> >
> > Alejandro
> >
> > gcc/ChangeLog:
> >
> > 2019-05-28  Alejandro Martinez  
> >
> > * config/aarch64/aarch64-sve2.md: New file.
> > (avg3_floor): New pattern.
> > (avg3_ceil): Likewise.
> > (*h): Likewise.
> > * config/aarch64/aarch64.md: Include aarch64-sve2.md.
> >
> >
> > 2019-05-28  Alejandro Martinez  
> >
> > gcc/testsuite/
> > * gcc.target/aarch64/sve2/average_1.c: New test.
> > * lib/target-supports.exp
> (check_effective_target_aarch64_sve1_only):
> > New helper.
> > (check_effective_target_vect_avg_qi): Check for SVE1 only.
> 
> OK, thanks, but...
> 
> > diff --git gcc/testsuite/lib/target-supports.exp
> > gcc/testsuite/lib/target-supports.exp
> > index f69106d..41431e6 100644
> > --- gcc/testsuite/lib/target-supports.exp
> > +++ gcc/testsuite/lib/target-supports.exp
> > @@ -3308,6 +3308,12 @@ proc check_effective_target_aarch64_sve2 { } {
> >  }]
> >  }
> >
> > +# Return 1 if this is an AArch64 target only supporting SVE (not SVE2).
> > +proc check_effective_target_aarch64_sve1_only { } {
> > +return [expr { [check_effective_target_aarch64_sve]
> > +  && ![check_effective_target_aarch64_sve2] }] }
> 
> ...it needs check_effective_target_aarch64_sve2 to go in first.
> 
> Richard


vavg_sve2_v2.patch
Description: vavg_sve2_v2.patch


Re: RFA: Synchronize top level files with binutils

2019-05-29 Thread Richard Biener
On Wed, May 29, 2019 at 3:40 PM Nick Clifton  wrote:
>
> Hi Guys,
>
>   I would like to bring over a few additions that have recently been
>   made to the binutils versions of the Makefile.def and configure.ac
>   files.  Any objections ?
>
>   Note - I did run a toolchain bootstrap after applying this patch
>   locally and that went OK...
>
> Cheers
>   Nick
>
> ./ChangeLog
> 2019-05-29  Nick Clifton  
>
> Import from binutils:
> 2019-05-29  Nick Clifton  
>
> * configure.ac (noconfigdirs): Add libctf if the target does not use
> the ELF file format.
> * configure: Regenerate.
>
> 2019-05-28  Nick Alcock  
>
> * Makefile.def (dependencies): configure-libctf depends on all-bfd
> and all its deps.
> * Makefile.in: Regenerated.
>
> 2019-05-28  Nick Alcock  
>
> * Makefile.def (host_modules): Add libctf.
> * Makefile.def (dependencies): Likewise.
> libctf depends on zlib, libiberty, and bfd.
> * Makefile.in: Regenerated.
> * configure.ac (host_libs): Add libctf.
> * configure: Regenerated.
>
> Index: Makefile.def
> ===
> --- Makefile.def(revision 271737)
> +++ Makefile.def(working copy)
> @@ -4,7 +4,7 @@
>  // Makefile.in is generated from Makefile.tpl by 'autogen Makefile.def'.
>  // This file was originally written by Nathanael Nerode.
>  //
> -//   Copyright 2002-2013 Free Software Foundation
> +//   Copyright 2002-2019 Free Software Foundation
>  //
>  // This file is free software; you can redistribute it and/or modify
>  // it under the terms of the GNU General Public License as published by
> @@ -128,6 +128,8 @@
> extra_make_flags='@extra_linker_plugin_flags@'; };
>  host_modules= { module= libcc1; extra_configure_flags=--enable-shared; };
>  host_modules= { module= gotools; };
> +host_modules= { module= libctf; no_install=true; no_check=true;
> +   bootstrap=true; };
>
>  target_modules = { module= libstdc++-v3;
>bootstrap=true;
> @@ -137,6 +139,9 @@
>bootstrap=true;
>lib_path=.libs;
>raw_cxx=true; };
> +target_modules = { module= libmpx;
> +  bootstrap=true;
> +  lib_path=.libs; };

It seems to re-introduce things that have been removed on the
GCC side.

Please double-check and re-post. (just cherry-pick actual
changes from the binutils side?)

Richard.

>  target_modules = { module= libvtv;
>bootstrap=true;
>lib_path=.libs;
> @@ -428,6 +433,7 @@
>  dependencies = { module=all-binutils; on=all-build-bison; };
>  dependencies = { module=all-binutils; on=all-intl; };
>  dependencies = { module=all-binutils; on=all-gas; };
> +dependencies = { module=all-binutils; on=all-libctf; };
>
>  // We put install-opcodes before install-binutils because the installed
>  // binutils might be on PATH, and they might need the shared opcodes
> @@ -518,6 +524,14 @@
>  dependencies = { module=all-fastjar; on=all-zlib; };
>  dependencies = { module=all-fastjar; on=all-build-texinfo; };
>  dependencies = { module=all-fastjar; on=all-libiberty; };
> +dependencies = { module=all-libctf; on=all-libiberty; hard=true; };
> +dependencies = { module=all-libctf; on=all-bfd; };
> +dependencies = { module=all-libctf; on=all-zlib; };
> +// So that checking for ELF support in BFD from libctf configure is possible.
> +dependencies = { module=configure-libctf; on=all-bfd; };
> +dependencies = { module=configure-libctf; on=all-intl; };
> +dependencies = { module=configure-libctf; on=all-zlib; };
> +dependencies = { module=configure-libctf; on=all-libiconv; };
>
>  // Warning, these are not well tested.
>  dependencies = { module=all-bison; on=all-intl; };
> Index: configure.ac
> ===
> --- configure.ac(revision 271737)
> +++ configure.ac(working copy)
> @@ -131,7 +131,7 @@
>
>  # these libraries are used by various programs built for the host environment
>  #f
> -host_libs="intl libiberty opcodes bfd readline tcl tk itcl libgui zlib libbacktrace libcpp libdecnumber gmp mpfr mpc isl libelf libiconv"
> +host_libs="intl libiberty opcodes bfd readline tcl tk itcl libgui zlib libbacktrace libcpp libdecnumber gmp mpfr mpc isl libelf libiconv libctf"
>
>  # these tools are built for the host environment
>  # Note, the powerpc-eabi build depends on sim occurring before gdb in order to
> @@ -928,7 +934,23 @@
>  ;;
>  esac
>
> +# Targets that do not use the ELF file format cannot support libctf.
>  case "${target}" in
> +  *-*-pe | *-*-*vms* | *-*-darwin | *-*-*coff* | *-*-wince | *-*-mingw*)
> +noconfigdirs="$noconfigdirs libctf"
> +;;
> +  *-*-aout | *-*-osf* | *-*-go32 | *-*-macos* | *-*-rhapsody*)
> +noconfigdirs="$noconfigdirs libctf"
> +;;
> +  *-*-netbs

Re: [PATCH][AArch64] PR tree-optimization/90332: Implement vec_init where N is a vector mode

2019-05-29 Thread Kyrill Tkachov

Ping.

https://gcc.gnu.org/ml/gcc-patches/2019-05/msg00477.html

Thanks,

Kyrill

On 5/10/19 10:32 AM, Kyrill Tkachov wrote:

Hi all,

This patch fixes the failing gcc.dg/vect/slp-reduc-sad-2.c testcase on
aarch64 by implementing a vec_init optab that can handle two half-width
vectors producing a full-width one by concatenating them.

In the gcc.dg/vect/slp-reduc-sad-2.c case it's a V8QI reg concatenated
with a V8QI const_vector of zeroes.
This can be implemented efficiently using the aarch64_combinez pattern
that just loads a D-register to make use of the implicit zero-extending
semantics of that load.
Otherwise it concatenates the two vectors using aarch64_simd_combine.
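
In plain C terms the two cases behave roughly like the sketch below.
The struct types and function names are made up for illustration; they
model the register contents with byte arrays and ignore lane ordering
on big-endian targets, so this is not how the patterns themselves are
written.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct { uint8_t lane[8]; } v8qi;    /* models a 64-bit D register */
typedef struct { uint8_t lane[16]; } v16qi;  /* models a 128-bit Q register */

/* General case: concatenate two half-width vectors.  */
v16qi combine(v8qi lo, v8qi hi)
{
  v16qi r;
  memcpy(r.lane, lo.lane, 8);
  memcpy(r.lane + 8, hi.lane, 8);
  return r;
}

/* Special case: the upper half is all zeroes, so filling the low half
   and clearing the rest is enough.  */
v16qi combinez(v8qi lo)
{
  v16qi r;
  memset(r.lane, 0, sizeof r.lane);
  memcpy(r.lane, lo.lane, 8);
  return r;
}

/* The special case must agree with the general one for a zero upper half.  */
void check_combinez(v8qi lo)
{
  v8qi zero = {{0}};
  v16qi x = combine(lo, zero);
  v16qi y = combinez(lo);
  assert(memcmp(x.lane, y.lane, sizeof x.lane) == 0);
}
```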

With this patch I'm seeing the effect from richi's original patch that
added gcc.dg/vect/slp-reduc-sad-2.c on aarch64, and 525.x264_r improves
by about 1.5%.

Bootstrapped and tested on aarch64-none-linux-gnu. Also tested on
aarch64_be-none-elf.


Ok for trunk?
Thanks,
Kyrill

2019-05-10  Kyrylo Tkachov

    PR tree-optimization/90332
    * config/aarch64/aarch64.c (aarch64_expand_vector_init):
    Handle VALS containing two vectors.
    * config/aarch64/aarch64-simd.md (*aarch64_combinez): Rename
    to...
    (@aarch64_combinez): ... This.
    (*aarch64_combinez_be): Rename to...
    (@aarch64_combinez_be): ... This.
    (vec_init): New define_expand.
    * config/aarch64/iterators.md (Vhalf): Handle V8HF.



Re: Implement vector average patterns for SVE2

2019-05-29 Thread Richard Sandiford
Alejandro Martinez Vicente  writes:
> Turns out I was missing a few bits and pieces. Here is the updated patch and 
> changelog.
>
> Alejandro
>
>
> 2019-05-29  Alejandro Martinez  
>
>   * config/aarch64/aarch64-c.c: Added TARGET_SVE2.
>   * config/aarch64/aarch64-sve2.md: New file.
>   (avg3_floor): New pattern.
>   (avg3_ceil): Likewise.
>   (*h): Likewise.
>   * config/aarch64/aarch64.h: Added AARCH64_ISA_SVE2 and TARGET_SVE2.
>   * config/aarch64/aarch64.md: Include aarch64-sve2.md.
>
>
> 2019-05-29  Alejandro Martinez  
>
> gcc/testsuite/
>   * gcc.target/aarch64/sve2/aarch64-sve2.exp: New file, regression driver
>   for AArch64 SVE2.
>   * gcc.target/aarch64/sve2/average_1.c: New test.
>   * lib/target-supports.exp (check_effective_target_aarch64_sve2): New
>   helper.
>   (check_effective_target_aarch64_sve1_only): Likewise.
>   (check_effective_target_aarch64_sve2_hw): Likewise.
>   (check_effective_target_vect_avg_qi): Check for SVE1 only.

OK, thanks.

We don't really need sve2_hw for this patch, but we will soon,
so might as well add it now.

Richard


Re: [PATCH] Further C lapack workaround tweaks

2019-05-29 Thread Thomas Koenig

Hi Jakub,


As I said earlier in the PR, I don't like -fbroken-callers option much,
as the option name doesn't hint what it is doing at all.

The following patch renames the option and makes it into a 3 state option,
with the default being a middle-ground, where it avoids the tail calls in
functions that have the hidden character length arguments only if it makes
any implicit interface calls.  The rationale for that is that there were no
previously reported issues with older GCC versions and the change that
affected the broken C/C++ wrappers was just giving prototypes to the
implicit interface procedure calls, so presumably in functions that don't
have any such calls nothing should have changed.
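
The three states can be pictured with a small decision function.  This
is a toy model, not the gfortran implementation: the function and
parameter names are made up, and the meanings assumed for value 0
(never avoid tail calls) and value 2 (avoid them in every function with
hidden character length arguments) are an interpretation; only the
middle setting is spelled out in the description above.

```c
#include <assert.h>
#include <stdbool.h>

/* level: 0, 1 (the default middle ground) or 2.  */
bool avoid_tail_calls(int level, bool has_hidden_charlen_args,
                      bool makes_implicit_interface_calls)
{
  assert(level >= 0 && level <= 2);
  if (level == 0 || !has_hidden_charlen_args)
    return false;
  if (level == 1)
    return makes_implicit_interface_calls;
  return true;
}
```

With the default level, a function that has hidden length arguments
but only calls procedures with explicit interfaces keeps its tail
calls.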

Bootstrapped/regtested on x86_64-linux and i686-linux, additionally tested
on the dtrtri.f testcase and on dtrtri.f testcase patched to include
explicit interfaces for all called procedures (and for those two verified
all the 6 ways of using these options, default, positive/negative option
without = and 0/1/2 values of the = option, checking in which case there is
a tail call), ok for trunk?


Yep, this is a much better scheme.  OK.

This problem is also present on all release branches, so I think that
this should also be backported to them, so that 7.5, 8.4 and 9.2 can
compile these LAPACK bindings again.

Regards

Thomas


Re: Teach same_types_for_tbaa to structurally compare arrays, pointers and vectors

2019-05-29 Thread Jan Hubicka
> > but we do not optimize it. I.e. optimized dump has:
> >
> > test ()
> > {
> >   struct bar * barptr.0_1;
> >   struct foo * fooptr.1_2;
> >   int _6;
> >
> >[local count: 1073741824]:
> >   barptr.0_1 = barptr;
> >   barptr.0_1->val2 = 1;
> >   fooptr.1_2 = fooptr;
> >   MEM[(struct foo *)fooptr.1_2] = 0;
> >   _6 = barptr.0_1->val2;
> >   return _6;
> > }
> >
> > I see no reason why we should not constant propagate the return value.
> 
> Indeed a good example.  Make it work and add it to the testsuite ;)

I think Martin Jambor is working on it.  One needs -fno-tree-sra to
get this optimized :)
Otherwise we punt on the check that both types in MEM_REF are the same.

Honza


[libgomp, testsuite] Generalize getconf _NPROCESSORS_ONLN

2019-05-29 Thread Rainer Orth
Prompted by extremely long libgomp make check times due to missing
parallelism, I noticed that the current support to restrict testcase
parallelism only works on targets which have getconf _NPROCESSORS_ONLN.
Instead of adding special cases for all sorts of tools to determine the
number of cores, I found a macro in the autoconf-archive that
encapsulates just that.
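
For comparison, the C-level counterpart of getconf _NPROCESSORS_ONLN is
a sysconf call.  The sketch below is only an illustration and not part
of the patch: the cpu_count name is made up, and _SC_NPROCESSORS_ONLN
is a common extension rather than something every system provides,
hence the #ifdef and the fallback to 1.

```c
#include <assert.h>
#include <unistd.h>

/* Return the number of online processors, or 1 if it cannot be
   determined (mirroring a fallback of CPU_COUNT=1).  */
long cpu_count(void)
{
  long n = -1;
#ifdef _SC_NPROCESSORS_ONLN
  n = sysconf(_SC_NPROCESSORS_ONLN);
#endif
  long result = n > 0 ? n : 1;
  assert(result >= 1);
  return result;
}
```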

The following patch makes use of that, bootstrapped without regressions
on i386-pc-solaris2.11 and sparc-sun-solaris2.11 and checked that
CPU_COUNT is determined correctly and the OMP_NUM_THREADS warning
is emitted.  Also rebuilt libgomp on x86_64-pc-linux-gnu and performed
the same check.

Ok for mainline?

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2019-05-28  Rainer Orth  

libgomp:
* configure.ac: Call AX_COUNT_CPUS.
Substitute CPU_COUNT.
* testsuite/Makefile.am (check-am): Use CPU_COUNT as processor count.
* aclocal.m4: Regenerate.
* configure: Regenerate.
* Makefile.in, testsuite/Makefile.in: Regenerate.

config:
* ax_count_cpus.m4: New file.

# HG changeset patch
# Parent  26461208f9155e138ea93c5cb2979247de995b40
Generalize getconf _NPROCESSORS_ONLN

	libgomp:
	* configure.ac: Call AX_COUNT_CPUS.
	Substitute CPU_COUNT.
	* testsuite/Makefile.am (check-am): Use CPU_COUNT as processor count.
	* aclocal.m4: Regenerate.
	* configure: Regenerate.
	* Makefile.in, testsuite/Makefile.in: Regenerate.

	config:
	* ax_count_cpus.m4: New file.

diff --git a/config/ax_count_cpus.m4 b/config/ax_count_cpus.m4
new file mode 100644
--- /dev/null
+++ b/config/ax_count_cpus.m4
@@ -0,0 +1,101 @@
+# ===
+#  https://www.gnu.org/software/autoconf-archive/ax_count_cpus.html
+# ===
+#
+# SYNOPSIS
+#
+#   AX_COUNT_CPUS([ACTION-IF-DETECTED],[ACTION-IF-NOT-DETECTED])
+#
+# DESCRIPTION
+#
+#   Attempt to count the number of logical processor cores (including
+#   virtual and HT cores) currently available to use on the machine and
+#   place detected value in CPU_COUNT variable.
+#
+#   On successful detection, ACTION-IF-DETECTED is executed if present. If
+#   the detection fails, then ACTION-IF-NOT-DETECTED is triggered. The
+#   default ACTION-IF-NOT-DETECTED is to set CPU_COUNT to 1.
+#
+# LICENSE
+#
+#   Copyright (c) 2014,2016 Karlson2k (Evgeny Grin) 
+#   Copyright (c) 2012 Brian Aker 
+#   Copyright (c) 2008 Michael Paul Bailey 
+#   Copyright (c) 2008 Christophe Tournayre 
+#
+#   Copying and distribution of this file, with or without modification, are
+#   permitted in any medium without royalty provided the copyright notice
+#   and this notice are preserved. This file is offered as-is, without any
+#   warranty.
+
+#serial 22
+
+  AC_DEFUN([AX_COUNT_CPUS],[dnl
+  AC_REQUIRE([AC_CANONICAL_HOST])dnl
+  AC_REQUIRE([AC_PROG_EGREP])dnl
+  AC_MSG_CHECKING([the number of available CPUs])
+  CPU_COUNT="0"
+
+  # Try generic methods
+
+  # 'getconf' is POSIX utility, but '_NPROCESSORS_ONLN' and
+  # 'NPROCESSORS_ONLN' are platform-specific
+  command -v getconf >/dev/null 2>&1 && \
+CPU_COUNT=`getconf _NPROCESSORS_ONLN 2>/dev/null || getconf NPROCESSORS_ONLN 2>/dev/null` || CPU_COUNT="0"
+  AS_IF([[test "$CPU_COUNT" -gt "0" 2>/dev/null || ! command -v nproc >/dev/null 2>&1]],[[: # empty]],[dnl
+# 'nproc' is part of GNU Coreutils and is widely available
+CPU_COUNT=`OMP_NUM_THREADS='' nproc 2>/dev/null` || CPU_COUNT=`nproc 2>/dev/null` || CPU_COUNT="0"
+  ])dnl
+
+  AS_IF([[test "$CPU_COUNT" -gt "0" 2>/dev/null]],[[: # empty]],[dnl
+# Try platform-specific preferred methods
+AS_CASE([[$host_os]],dnl
+  [[*linux*]],[[CPU_COUNT=`lscpu -p 2>/dev/null | $EGREP -e '^@<:@0-9@:>@+,' -c` || CPU_COUNT="0"]],dnl
+  [[*darwin*]],[[CPU_COUNT=`sysctl -n hw.logicalcpu 2>/dev/null` || CPU_COUNT="0"]],dnl
+  [[freebsd*]],[[command -v sysctl >/dev/null 2>&1 && CPU_COUNT=`sysctl -n kern.smp.cpus 2>/dev/null` || CPU_COUNT="0"]],dnl
+  [[netbsd*]], [[command -v sysctl >/dev/null 2>&1 && CPU_COUNT=`sysctl -n hw.ncpuonline 2>/dev/null` || CPU_COUNT="0"]],dnl
+  [[solaris*]],[[command -v psrinfo >/dev/null 2>&1 && CPU_COUNT=`psrinfo 2>/dev/null | $EGREP -e '^@<:@0-9@:>@.*on-line' -c 2>/dev/null` || CPU_COUNT="0"]],dnl
+  [[mingw*]],[[CPU_COUNT=`ls -qpU1 /proc/registry/HKEY_LOCAL_MACHINE/HARDWARE/DESCRIPTION/System/CentralProcessor/ 2>/dev/null | $EGREP -e '^@<:@0-9@:>@+/' -c` || CPU_COUNT="0"]],dnl
+  [[msys*]],[[CPU_COUNT=`ls -qpU1 /proc/registry/HKEY_LOCAL_MACHINE/HARDWARE/DESCRIPTION/System/CentralProcessor/ 2>/dev/null | $EGREP -e '^@<:@0-9@:>@+/' -c` || CPU_COUNT="0"]],dnl
+  [[cygwin*]],[[CPU_COUNT=`ls -qpU1 /pr

Re: [PATCH 3/3][GCC][AARCH64] Add support for pointer authentication B key

2019-05-29 Thread Sam Tebbs
Thanks for finding this Christophe, I had this failure a while ago but it
stopped happening so I thought all was good. I have a fix ready.

Sam

On 29/05/2019 12:22, Christophe Lyon wrote:
> On Wed, 29 May 2019 at 11:23, Sam Tebbs  wrote:
>> The libgcc changes have been acknowledged off-list. Committed as r271735.
>>
> After this commit, I'm seeing errors while building libstdc++:
> 0x11c29f3 aarch64_return_address_signing_enabled()
>  
> /tmp/8467855_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/config/aarch64/aarch64.c:4865
> 0x11c2a08 aarch64_post_cfi_startproc(_IO_FILE*, tree_node*)
>  
> /tmp/8467855_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/config/aarch64/aarch64.c:15373
> 0xa27098 dwarf2out_do_cfi_startproc
>  
> /tmp/8467855_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/dwarf2out.c:972
> 0xa43d6e dwarf2out_begin_prologue(unsigned int, unsigned int, char const*)
>  
> /tmp/8467855_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/dwarf2out.c:1106
> 0xae05d5 final_start_function_1
>  
> /tmp/8467855_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/final.c:1735
> 0xae0c2f final_start_function(rtx_insn*, _IO_FILE*, int)
>  
> /tmp/8467855_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/final.c:1818
> 0x11c4442 aarch64_output_mi_thunk
>  
> /tmp/8467855_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/config/aarch64/aarch64.c:6085
> 0x9cfa4f cgraph_node::expand_thunk(bool, bool)
>  
> /tmp/8467855_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cgraphunit.c:1831
> 0x9d0dba cgraph_node::assemble_thunks_and_aliases()
>  
> /tmp/8467855_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cgraphunit.c:2122
> 0x9d0d89 cgraph_node::assemble_thunks_and_aliases()
>  
> /tmp/8467855_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cgraphunit.c:2140
> 0x9d1068 cgraph_node::expand()
>  
> /tmp/8467855_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cgraphunit.c:2259
> 0x9d23ec expand_all_functions
>  
> /tmp/8467855_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cgraphunit.c:2332
> 0x9d23ec symbol_table::compile()
>  
> /tmp/8467855_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cgraphunit.c:2683
> 0x9d5020 symbol_table::compile()
>  
> /tmp/8467855_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cgraphunit.c:2595
> 0x9d5020 symbol_table::finalize_compilation_unit()
>  
> /tmp/8467855_6.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cgraphunit.c:2861
> Please submit a full bug report,
> with preprocessed source if appropriate.
> Please include the complete backtrace with any bug report.
> See  for instructions.
> make[5]: *** [Makefile:900: strstream.lo] Error 1
>
> in aarch64-none-linux-gnu/libstdc++-v3/src/c++98
>
> (same for aarch64[_be]-elf)
>
> Christophe
>
>> On 01/03/2019 14:12, Sam Tebbs wrote:
>>> On 31/01/2019 14:54, Sam Tebbs wrote:
 
> ping 3. The preceding two patches were committed a while ago but require
> the minor libgcc changes in this patch, which are the only parts left to
> be reviewed.
 ping 4
>>> Attached is a rebased patch made to work on top of Sudi Das' BTI patch
>>> (by renaming UNSPEC_PACISP to UNSPEC_PACIASP and UNSPEC_PACIBSP in
>>> aarch64-bti-insert.c). The updated changelog is below.
>>>
>>> Are the libgcc changes OK for trunk?
>>>
>>> gcc/
>>> 2019-03-01  Sam Tebbs
>>>
>>>* config/aarch64/aarch64-builtins.c (aarch64_builtins): Add
>>>AARCH64_PAUTH_BUILTIN_AUTIB1716 and AARCH64_PAUTH_BUILTIN_PACIB1716.
>>>* config/aarch64/aarch64-builtins.c 
>>> (aarch64_init_pauth_hint_builtins):
>>>Add autib1716 and pacib1716 initialisation.
>>>* config/aarch64/aarch64-builtins.c (aarch64_expand_builtin): Add 
>>> checks
>>>for autib1716 and pacib1716.
>>>* config/aarch64/aarch64-protos.h (aarch64_key_type,
>>>aarch64_post_cfi_startproc): Define.
>>>* config/aarch64/aarch64-protos.h (aarch64_ra_sign_key): Define 
>>> extern.
>>>* config/aarch64/aarch64.c 
>>> (aarch64_handle_standard_branch_protection,
>>>aarch64_handle_pac_ret_protection): Set default sign key to A.
>>>* config/aarch64/aarch64.c (aarch64_expand_epilogue,
>>>aarch64_expand_prologue): Add check for b-key.
>>>* config/aarch64/aarch64.c (aarch64_ra_sign_key,
>>>aarch64_post_cfi_startproc, aarch64_handle_pac_ret_b_key): Define.
>>>* config/aarch64/aarch64.h (TARGET_ASM_POST_CFI_STARTPROC): Define.
>>>* config/aarch64/aarch64.c (aarch64_pac_ret_subtypes): Add "b-key".
>>>* config/aarch64/aarch64.md (unspec): Add UNSPEC_AUTIA1716,
>>>UNSPEC_AUTIB1716, UNSPEC_AUTIASP, UNSPEC_AUTIBSP, UNSPEC_PACIA1716,
>>>UNSPEC_PACIB1716, UNSPEC_PACIASP, UNSPEC_PACIBSP.
>>>* config/aarch64/aarch64.md (do_return): Add check for b-key.
>>>* config/aarch64/aarch64.md (sp): Replace
>>>pauth_hint_num_a with paut

Re: [libgomp, testsuite] Generalize getconf _NPROCESSORS_ONLN

2019-05-29 Thread Jakub Jelinek
On Wed, May 29, 2019 at 04:13:41PM +0200, Rainer Orth wrote:
> Prompted by extremely long libgomp make check times due to missing
> parallelism, I noticed that the current support to restrict testcase
> parallelism only works on targets which have getconf _NPROCESSORS_ONLN.
> Instead of adding special cases for all sorts of tools to determine the
> number of cores, I found a macro in the autoconf-archive that
> encapsulates just that.
> 
> The following patch makes use of that, bootstrapped without regressions
> on i386-pc-solaris2.11 and sparc-sun-solaris2.11 and checked that
> CPU_COUNT is determined correctly and the OMP_NUM_THREADS warning
> emitted.  Also rebuilt libgomp on x86_64-pc-linux-gnu and performed the
> same check.
> 
> Ok for mainline?

I'd prefer not to determine the CPU count at configure time: the build
directory can be on some network filesystem and tested sometimes on one
machine, sometimes on another, hardware can be upgraded, etc.
So, something similar to what your patch does, but substituting a command
to determine the number of CPUs instead of the actual CPU count?

Jakub


Re: [OpenACC] Update OpenACC data clause semantics to the 2.5 behavior - runtime

2019-05-29 Thread Thomas Schwinge
Hi Jakub!

Any comments on my questions, please?

On Thu, 02 May 2019 16:03:09 +0200, I wrote:
> I'm currently working on other pending OpenACC 'deviceptr' clause patches
> from our backlog, and I noticed the following, which I don't understand.
> You reviewed and approved this patch, could you please help?
> 
> On Tue, 19 Jun 2018 10:01:20 -0700, Cesar Philippidis 
>  wrote:
> > --- a/libgomp/oacc-parallel.c
> > +++ b/libgomp/oacc-parallel.c
> 
> > +/* Handle the mapping pair that are presented when a
> > +   deviceptr clause is used with Fortran.  */
> > +
> > +static void
> > +handle_ftn_pointers (size_t mapnum, void **hostaddrs, size_t *sizes,
> > +unsigned short *kinds)
> > +{
> > +  int i;
> > +
> > +  for (i = 0; i < mapnum; i++)
> > +{
> > +  unsigned short kind1 = kinds[i] & 0xff;
> > +
> > +  /* Handle Fortran deviceptr clause.  */
> > +  if (kind1 == GOMP_MAP_FORCE_DEVICEPTR)
> > +   {
> > + unsigned short kind2;
> > +
> > + if (i < (signed)mapnum - 1)
> > +   kind2 = kinds[i + 1] & 0xff;
> > + else
> > +   kind2 = 0x;
> > +
> > + if (sizes[i] == sizeof (void *))
> > +   continue;
> > +
> > + /* At this point, we're dealing with a Fortran deviceptr.
> > +If the next element is not what we're expecting, then
> > +this is an instance of where the deviceptr variable was
> > +not used within the region and the pointer was removed
> > +by the gimplifier.  */
> > + if (kind2 == GOMP_MAP_POINTER
> > + && sizes[i + 1] == 0
> > + && hostaddrs[i] == *(void **)hostaddrs[i + 1])
> > +   {
> > + kinds[i+1] = kinds[i];
> > + sizes[i+1] = sizeof (void *);
> > +   }
> > +
> > + /* Invalidate the entry.  */
> > + hostaddrs[i] = NULL;
> > +   }
> > +}
> >  }
> 
> This is used for rewriting the mappings for OpenACC 'parallel'
> etc. constructs:
> 
> > @@ -88,6 +141,8 @@ GOACC_parallel_keyed (int device, void (*fn) (void *),
> >thr = goacc_thread ();
> >acc_dev = thr->dev;
> >  
> > +  handle_ftn_pointers (mapnum, hostaddrs, sizes, kinds);
> > +
> >/* Host fallback if "if" clause is false or if the current device is set 
> > to
> >   the host.  */
> >if (host_fallback)
> 
> ..., and on our OpenACC development branch likewise for OpenACC 'data'
> constructs ('GOACC_data_start').
> 
> What this function seems to be doing, as I understand it, is that when
> there is a 'GOMP_MAP_FORCE_DEVICEPTR' with a size not equal to pointer
> size (which should never happen, as per the information given in
> 'include/gomp-constants.h'?), which is followed by a 'GOMP_MAP_POINTER',
> then preserve the 'GOMP_MAP_FORCE_DEVICEPTR' (by storing it into the slot
> of the 'GOMP_MAP_POINTER'), and unconditionally remove the
> 'GOMP_MAP_POINTER'.  This seems like a strange choice of a GCC/libgomp
> ABI to me -- or am I not understanding this correctly?
> 
> Instead of rewriting the mappings at run time, why isn't (presumably) the
> gimplifier changed to just emit the correct mappings?
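
To make the run-time rewriting described above easier to follow, here is a
minimal standalone sketch of the same control flow.  It is not the libgomp
code: the kind values are placeholders invented for this sketch (the real
ones live in 'include/gomp-constants.h'), MAP_NONE stands in for the
"no next entry" value, and 'rewrite_mappings' is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder kind values for this sketch only; the real values are
   defined in include/gomp-constants.h.  */
enum { MAP_FORCE_DEVICEPTR = 1, MAP_POINTER = 2, MAP_NONE = 0xff };

/* Hypothetical stand-in that follows the control flow of the quoted
   handle_ftn_pointers.  */
static void
rewrite_mappings (size_t mapnum, void **hostaddrs, size_t *sizes,
                  unsigned short *kinds)
{
  assert (hostaddrs != NULL && sizes != NULL && kinds != NULL);

  for (size_t i = 0; i < mapnum; i++)
    {
      if ((kinds[i] & 0xff) != MAP_FORCE_DEVICEPTR)
        continue;

      /* A pointer-sized deviceptr entry is left untouched.  */
      if (sizes[i] == sizeof (void *))
        continue;

      unsigned short kind2
        = i + 1 < mapnum ? (kinds[i + 1] & 0xff) : MAP_NONE;

      /* If a zero-sized pointer mapping follows and refers to the same
         address, move the deviceptr kind into that slot.  */
      if (kind2 == MAP_POINTER
          && sizes[i + 1] == 0
          && hostaddrs[i] == *(void **) hostaddrs[i + 1])
        {
          kinds[i + 1] = kinds[i];
          sizes[i + 1] = sizeof (void *);
        }

      /* Invalidate the original entry in either case.  */
      hostaddrs[i] = NULL;
    }
}
```

Called on a two-entry list where entry 0 is a non-pointer-sized deviceptr
mapping and entry 1 is a zero-sized pointer mapping whose pointee address
matches entry 0, the sketch leaves entry 0 with a NULL host address and turns
entry 1 into a pointer-sized deviceptr mapping.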


Grüße
 Thomas




Re: [libgomp, testsuite] Generalize getconf _NPROCESSORS_ONLN

2019-05-29 Thread Rainer Orth
Hi Jakub,

> On Wed, May 29, 2019 at 04:13:41PM +0200, Rainer Orth wrote:
>> Prompted by extremely long libgomp make check times due to missing
>> parallelism, I noticed that the current support to restrict testcase
>> parallelism only works on targets which have getconf _NPROCESSORS_ONLN.
>> Instead of adding special cases for all sorts of tools to determine the
>> number of cores, I found a macro in the autoconf-archive that
>> encapsulates just that.
>> 
>> The following patch makes use of that, bootstrapped without regressions
>> on i386-pc-solaris2.11 and sparc-sun-solaris2.11 and checked that
>> CPU_COUNT is determined correctly and the OMP_NUM_THREADS warning
>> emitted.  Also rebuilt libgomp on x86_64-pc-linux-gnu and performed the
>> same check.
>> 
>> Ok for mainline?
>
> I'd prefer not to determine the CPU count at configure time, the build
> directory can be on some network filesystem and tested sometimes on one
> machine, sometimes on another one, hw can be upgraded etc.

somewhat agreed for testing on one or the other system.  However, build
dirs tend to be short-lived in my experience (and usually local because
at least I have found builds on NFS so incredibly slow as to avoid them
like the plague) and hw upgrades are relatively rare.

> So, something similar to what your patch does, but don't substitute the actual
> CPU count, but a command to determine the number of CPUs?

I'm not convinced that would work reliably: you can easily have one set
of commands on one machine, but a slightly different one on another.  If
we really want to go this route, that means extracting the autoconf
macro's logic into a separate script and use that at make check time.
Not sure if that's worth the effort, especially since we're not really
interested in the exact number of cores, just small (<= 8) vs. large (> 8).

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: PR90030 "Fortran OpenACC subarray data alignment" (was: [PATCH] Fortran OpenMP 4.0 target support)

2019-05-29 Thread Thomas Schwinge
Hi Jakub!

Any comments on this, please?

On Wed, 10 Apr 2019 15:00:06 +0200, I wrote:
> In context of PR90030 "Fortran OpenACC subarray data alignment" (which
> actually is reproducible for OpenMP with nvptx offloading in the very
> same way, see below), can you please explain the reason for the seven
> "[var] = fold_convert (build_pointer_type (char_type_node), [var])"
> instances that you've added as part of your 2014 trunk r211768 "Fortran
> OpenMP 4.0 target support" commit?
> 
> Replacing all these with "gcc_assert (POINTER_TYPE_P (TREE_TYPE (ptr)))"
> (see the attached WIP patch, which also includes an OpenMP test case), I
> don't see any ill effects for 'check-gcc-fortran', and
> 'check-target-libgomp' with nvptx offloading, and the errors 'libgomp:
> cuStreamSynchronize error: misaligned address' are gone.  I added these
> 'gcc_assert's just for checking; Cesar in
> , and Julian in
>  propose to
> simply drop (a subset of) these casts.  Do we need (a) all, (b) some, (c)
> none of these casts?  And do we want to replace them with 'gcc_assert's,
> or not do that?
> 
> If approving such a patch (for all release branches), please respond with
> "Reviewed-by: NAME " so that your effort will be recorded in the
> commit log, see .
> 
> For reference, see the seven 'char_type_node' instances:
> 
> On Tue, 17 Jun 2014 23:03:47 +0200, Jakub Jelinek  wrote:
> > --- gcc/fortran/trans-openmp.c.jj   2014-06-16 10:06:39.164099047 +0200
> > +++ gcc/fortran/trans-openmp.c  2014-06-17 19:32:58.939176877 +0200
> > @@ -873,6 +873,110 @@ gfc_omp_clause_dtor (tree clause, tree d
> >  }
> >  
> >  
> > +void
> > +gfc_omp_finish_clause (tree c, gimple_seq *pre_p)
> > +{
> > +  if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_MAP)
> > +return;
> > +
> > +  tree decl = OMP_CLAUSE_DECL (c);
> > +  tree c2 = NULL_TREE, c3 = NULL_TREE, c4 = NULL_TREE;
> > +  if (POINTER_TYPE_P (TREE_TYPE (decl)))
> > +{
> > +  if (!gfc_omp_privatize_by_reference (decl)
> > + && !GFC_DECL_GET_SCALAR_POINTER (decl)
> > + && !GFC_DECL_GET_SCALAR_ALLOCATABLE (decl)
> > + && !GFC_DECL_CRAY_POINTEE (decl)
> > + && !GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (TREE_TYPE (decl
> > +   return;
> > +  c4 = build_omp_clause (OMP_CLAUSE_LOCATION (c), OMP_CLAUSE_MAP);
> > +  OMP_CLAUSE_MAP_KIND (c4) = OMP_CLAUSE_MAP_POINTER;
> > +  OMP_CLAUSE_DECL (c4) = decl;
> > +  OMP_CLAUSE_SIZE (c4) = size_int (0);
> > +  decl = build_fold_indirect_ref (decl);
> > +  OMP_CLAUSE_DECL (c) = decl;
> > +}
> > +  if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (decl)))
> > +{
> > +  stmtblock_t block;
> > +  gfc_start_block (&block);
> > +  tree type = TREE_TYPE (decl);
> > +  tree ptr = gfc_conv_descriptor_data_get (decl);
> > +  ptr = fold_convert (build_pointer_type (char_type_node), ptr);
> > +  ptr = build_fold_indirect_ref (ptr);
> > +  OMP_CLAUSE_DECL (c) = ptr;
> > +  c2 = build_omp_clause (input_location, OMP_CLAUSE_MAP);
> > +  OMP_CLAUSE_MAP_KIND (c2) = OMP_CLAUSE_MAP_TO_PSET;
> > +  OMP_CLAUSE_DECL (c2) = decl;
> > +  OMP_CLAUSE_SIZE (c2) = TYPE_SIZE_UNIT (type);
> > +  c3 = build_omp_clause (OMP_CLAUSE_LOCATION (c), OMP_CLAUSE_MAP);
> > +  OMP_CLAUSE_MAP_KIND (c3) = OMP_CLAUSE_MAP_POINTER;
> > +  OMP_CLAUSE_DECL (c3) = gfc_conv_descriptor_data_get (decl);
> > +  OMP_CLAUSE_SIZE (c3) = size_int (0);
> > +  tree size = create_tmp_var (gfc_array_index_type, NULL);
> > +  tree elemsz = TYPE_SIZE_UNIT (gfc_get_element_type (type));
> > +  elemsz = fold_convert (gfc_array_index_type, elemsz);
> > +  if (GFC_TYPE_ARRAY_AKIND (type) == GFC_ARRAY_POINTER
> > + || GFC_TYPE_ARRAY_AKIND (type) == GFC_ARRAY_POINTER_CONT)
> > +   {
> > + stmtblock_t cond_block;
> > + tree tem, then_b, else_b, zero, cond;
> > +
> > + gfc_init_block (&cond_block);
> > + tem = gfc_full_array_size (&cond_block, decl,
> > +GFC_TYPE_ARRAY_RANK (type));
> > + gfc_add_modify (&cond_block, size, tem);
> > + gfc_add_modify (&cond_block, size,
> > + fold_build2 (MULT_EXPR, gfc_array_index_type,
> > +  size, elemsz));
> > + then_b = gfc_finish_block (&cond_block);
> > + gfc_init_block (&cond_block);
> > + zero = build_int_cst (gfc_array_index_type, 0);
> > + gfc_add_modify (&cond_block, size, zero);
> > + else_b = gfc_finish_block (&cond_block);
> > + tem = gfc_conv_descriptor_data_get (decl);
> > + tem = fold_convert (pvoid_type_node, tem);
> > + cond = fold_build2_loc (input_location, NE_EXPR,
> > + boolean_type_node, tem, null_pointer_node);
> > + gfc_add_expr_to_block (&block, build3_loc (input_location, COND_EXPR,
> > + 

Re: Negative arguments in OpenMP 'num_threads' clause etc.

2019-05-29 Thread Thomas Schwinge
Hi Jakub!

Any comments on my question, please?

On Tue, 09 Apr 2019 17:51:46 +0200, I wrote:
> On Tue, 29 Nov 2016 17:47:08 -0800, Cesar Philippidis 
>  wrote:
> > One notable difference between the trunk and gomp4 implementation of the
> > tile clause is that gomp4 errors on negative value tile arguments,
> > whereas trunk issues warnings.
> 
> I'm picking up these changes, which have been posted a few times, and
> have been rejected (at least in their current incarnation) a few times,
> too.  ;-\
> 
> > Is there a reason why the fortran FE
> > generally emits a warning, on say num_threads(-5), instead of an error?
> 
> Same for the C/C++ front ends, which I'm looking into first.
> 
> Jakub, is the reason that even if the user is clearly doing something
> "strange" there, the compiler has no problem continuing
> compilation for 'num_threads(-5)', so it just emits a warning, but for
> example for 'collapse(-5)' it has to stop with an error, because it can't
> continue compilation in that case?  Or, is there a different reason for
> the many 'warning_at ([...], "[...] must be positive"' (C front end, for
> example), instead of using 'error_at' for these?


Grüße
 Thomas




[PATCH] PR libstdc++/85494 use rdseed and rand_s in std::random_device

2019-05-29 Thread Jonathan Wakely

Add support for additional sources of randomness to std::random_device,
to allow using RDSEED for Intel CPUs and rand_s for Windows. When
supported these can be selected using the tokens "rdseed" and "rand_s".
For *-w64-mingw32 targets the "default" token will now use rand_s, and
for other i?86-*-* and x86_64-*-* targets it will try to use "rdseed"
first, then "rdrand", and finally "/dev/urandom".

To simplify the declaration of std::random_device in  the
constructors now unconditionally call _M_init instead of _M_init_pretr1,
and the function call operator now unconditionally calls _M_getval. The
library code now decides whether _M_init and _M_getval should use a real
source of randomness or the mt19937 engine.

Existing code compiled against old libstdc++ headers will still call
_M_init_pretr1 and _M_getval_pretr1, but those functions now forward to
_M_init and _M_getval if a real source of randomness is available. This
means existing code compiled for mingw-w64 will start to use rand_s just
by linking to a new libstdc++.dll.

* acinclude.m4 (GLIBCXX_CHECK_X86_RDSEED): Define macro to check if
the assembler supports rdseed.
* config.h.in: Regenerate.
* configure: Regenerate.
* configure.ac: Use GLIBCXX_CHECK_X86_RDSEED.
* config/os/mingw32-w64/os_defines.h (_GLIBCXX_USE_CRT_RAND_S): Define.
* doc/html/*: Regenerate.
* doc/xml/manual/status_cxx2011.xml: Document new tokens.
* include/bits/random.h (random_device::random_device()): Always call
_M_init rather than _M_init_pretr1.
(random_device::random_device(const string&)): Likewise.
(random_device::operator()()): Always call _M_getval().
(random_device::_M_file): Replace first member of union with an
anonymous struct, with _M_file as its first member.
* src/c++11/random.cc [_GLIBCXX_X86_RDRAND] (USE_RDRAND): Define.
[_GLIBCXX_X86_RDSEED] (USE_RDSEED): Define.
(USE_MT19937): Define if none of the above are defined.
(USE_POSIX_FILE_IO): Define.
(_M_strtoul): Remove.
[USE_RDSEED] (__x86_rdseed): Define new function.
[_GLIBCXX_USE_CRT_RAND_S] (__winxp_rand_s): Define new function.
(random_device::_M_init(const string&)): Initialize new union members.
Add support for "rdseed" and "rand_s" tokens. Decide what the
"default" token does according to which USE_* macros are defined.
[USE_POSIX_FILE_IO]: Store a file descriptor.
[USE_MT19937]: Forward to _M_init_pretr1 instead.
(random_device::_M_init_pretr1(const string&)) [USE_MT19937]: Inline
code from _M_strtoul.
[!USE_MT19937]: Call _M_init, transforming the old default token or
numeric tokens to "default".
(random_device::_M_fini()) [USE_POSIX_FILE_IO]: Use close not fclose.
(random_device::_M_getval()): Use new union members to obtain a
random number from the stored function pointer or file descriptor.
[USE_MT19937]: Obtain a value from the mt19937 engine.
(random_device::_M_getval_pretr1()): Call _M_getval().
(random_device::_M_getentropy()) [USE_POSIX_FILE_IO]: Use _M_fd
instead of fileno.
[!USE_MT19937] (mersenne_twister): Do not instantiate when not needed.
* testsuite/26_numerics/random/random_device/85494.cc: New test.

Tested x86_64-linux, powerpc64le-linux and x86_64-w64-mingw32.

Committed to trunk.

commit dc337c84a7db52035f6b6efb87338e950bb7490e
Author: Jonathan Wakely 
Date:   Tue May 28 15:01:08 2019 +0100

PR libstdc++/85494 use rdseed and rand_s in std::random_device

Add support for additional sources of randomness to std::random_device,
to allow using RDSEED for Intel CPUs and rand_s for Windows. When
supported these can be selected using the tokens "rdseed" and "rand_s".
For *-w64-mingw32 targets the "default" token will now use rand_s, and
for other i?86-*-* and x86_64-*-* targets it will try to use "rdseed"
first, then "rdrand", and finally "/dev/urandom".

To simplify the declaration of std::random_device in  the
constructors now unconditionally call _M_init instead of _M_init_pretr1,
and the function call operator now unconditionally calls _M_getval. The
library code now decides whether _M_init and _M_getval should use a real
source of randomness or the mt19937 engine.

Existing code compiled against old libstdc++ headers will still call
_M_init_pretr1 and _M_getval_pretr1, but those functions now forward to
_M_init and _M_getval if a real source of randomness is available. This
means existing code compiled for mingw-w64 will start to use rand_s just
by linking to a new libstdc++.dll.

* acinclude.m4 (GLIBCXX_CHECK_X86_RDSEED): Define macro to check if
the assembler supports rdseed.
* config.h.in: Regenerate.
* configure: Regenerate.
* conf

[PATCH] Avoid -Wunused-parameter warnings from testsuite utility

2019-05-29 Thread Jonathan Wakely

* testsuite/util/testsuite_api.h: Remove names of unused parameters.

Tested x86_64-linux, committed to trunk.


commit 0ea1c74e5df4214d4ab03ef264f53e4ef4aa1d87
Author: Jonathan Wakely 
Date:   Wed May 29 15:19:14 2019 +0100

Avoid -Wunused-parameter warnings from testsuite utility

* testsuite/util/testsuite_api.h: Remove names of unused parameters.

diff --git a/libstdc++-v3/testsuite/util/testsuite_api.h 
b/libstdc++-v3/testsuite/util/testsuite_api.h
index 4c5388ac91f..793aa40f449 100644
--- a/libstdc++-v3/testsuite/util/testsuite_api.h
+++ b/libstdc++-v3/testsuite/util/testsuite_api.h
@@ -113,18 +113,15 @@ namespace __gnu_test
   // For 26 numeric algorithms requirements, need addable,
   // subtractable, multiplicable.
   inline NonDefaultConstructible
-  operator+(const NonDefaultConstructible& lhs,
-   const NonDefaultConstructible& rhs)
+  operator+(const NonDefaultConstructible&, const NonDefaultConstructible&)
   { return NonDefaultConstructible(1); }
 
   inline NonDefaultConstructible
-  operator-(const NonDefaultConstructible& lhs,
-   const NonDefaultConstructible& rhs)
+  operator-(const NonDefaultConstructible&, const NonDefaultConstructible&)
   { return NonDefaultConstructible(1); }
 
   inline NonDefaultConstructible
-  operator*(const NonDefaultConstructible& lhs,
-   const NonDefaultConstructible& rhs)
+  operator*(const NonDefaultConstructible&, const NonDefaultConstructible&)
   { return NonDefaultConstructible(1); }
 
   // Like unary_function, but takes no argument. (ie, void).


Re: [libgomp, testsuite] Generalize getconf _NPROCESSORS_ONLN

2019-05-29 Thread Jakub Jelinek
On Wed, May 29, 2019 at 04:32:41PM +0200, Rainer Orth wrote:
> > I'd prefer not to determine the CPU count at configure time, the build
> > directory can be on some network filesystem and tested sometimes on one
> > machine, sometimes on another one, hw can be upgraded etc.
> 
> somewhat agreed for testing on one or the other system.  However, build
> dirs tend to be short-lived in my experience (and usually local because
> at least I have found builds on NFS so incredibly slow as to avoid them
> like the plague) and hw upgrades are relatively rare.
> 
> > So, something similar to what your patch does, but don't substitute the 
> > actual
> > CPU count, but a command to determine the number of CPUs?
> 
> I'm not convinced that would work reliably: you can easily have one set
> of commands on one machine, but a slightly different one on another.  If
> we really want to go this route, that means extracting the autoconf
> macro's logic into a separate script and use that at make check time.
> Not sure if that's worth the effort, especially since we're not really
> interested in the exact number of cores, just small (<= 8) vs. large (> 8).

OK, so can we take a middle ground: instead of the current change to
Makefile.am, just replace the two spots that do num_cpus=1 with
num_cpus=@CPU_COUNT@
and keep the getconf invocation in there?  On Linux it will do what it used
to do, and on targets that don't support getconf _NPROCESSORS_ONLN it will
use a value determined at configure time.
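
The middle ground proposed here (a configure-time default that a run-time
query overrides when available) can be sketched in C.  This is only an
illustration of the idea, not the Makefile.am change: CONFIGURED_CPU_COUNT
is a hypothetical stand-in for the substituted @CPU_COUNT@ value, and the
function names are made up for this sketch.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical stand-in for the value configure would substitute for
   @CPU_COUNT@; the default of 1 is an assumption of this sketch.  */
#ifndef CONFIGURED_CPU_COUNT
#define CONFIGURED_CPU_COUNT 1
#endif

/* Prefer the run-time answer; fall back to the configure-time value on
   targets that cannot report the number of online processors.  */
static long
num_cpus (void)
{
#ifdef _SC_NPROCESSORS_ONLN
  long n = sysconf (_SC_NPROCESSORS_ONLN);
  if (n > 0)
    return n;
#endif
  return CONFIGURED_CPU_COUNT;
}

/* As noted earlier in the thread, only small (<= 8) vs. large (> 8)
   matters for the testsuite.  */
static int
many_cpus_p (long n)
{
  return n > 8;
}
```

With this shape, a build directory shared between machines still gets an
up-to-date count wherever the run-time query works, and the configure-time
value only matters on targets without it.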

Jakub


[ARM][PATCH 1/2] Support HFmode for standard names implemented with VRINT instructions.

2019-05-29 Thread Srinath Parvathaneni
Hello,

The initial implementation of the FP16 extension added HFmode support to
a limited number of the standard names.  Following
https://gcc.gnu.org/ml/gcc-patches/2016-08/msg00168.html , this patch
extends the HFmode support to the names implemented by the ARM
 and l expanders: btrunc, ceil, round,
floor, nearbyint and rint. This patch also changes the patterns
supporting the neon_vrnd* and neon_vcvt* Adv.SIMD intrinsics to use the
standard names, where appropriate.

No new tests are added. The ARM tests for the SF and DF mode variants of
these names are based on GCC builtin functions and there doesn't seem to
be an obvious alternative to trigger the new patterns through the
standard names. The pattern definitions though are tested through the
Adv.SIMD intrinsics.

The following patch reworks the implementation for HFmode VRINT to
remove a number of redundant constructs that were added in the initial
implementation.

The two patches have been tested for arm-none-linux-gnueabihf with
native bootstrap and make check and for arm-none-eabi with
cross-compiled check-gcc on an ARMv8.4-A emulator.

OK for trunk? If so, could someone please commit the patch on my behalf?
I don't have commit rights.

2019-05-29 Srinath Parvathaneni 
   Matthew Wahab  

* config/arm/iterators.md (fp16_rnd_str): Replace UNSPEC_VRND
values with equivalent UNSPEC_VRINT values.  Add UNSPEC_NVRINTZ,
UNSPEC_NVRINTA, UNSPEC_NVRINTM, UNSPEC_NVRINTN, UNSPEC_NVRINTP,
UNSPEC_NVRINTX.
(vrint_variant): Fix some white-space.
(vrint_predicable): Fix some white-space.
* config/arm/neon.md (neon_v): Replace
FP16_RND iterator with NEON_VRINT and update the rest of the
pattern accordingly.
(neon_vcvt): Replace with
neon_vcvt.
(neon_vcvt): New.
(neon_vcvtn): New.
* config/arm/unspecs.md: Add UNSPEC_VRINTN.
* config/arm/vfp.md (neon_vhf): Convert to an
expander invoking hf2.
(neon_vrndihf): Remove.
(neon_vrndnhf): New.
(neon_vcvthsi): Remove.
(hf2): New.
(lhfsi2): New.
(neon_vcvthsi): New.
(neon_vcvtnhsi): New.



rb10543.patch
Description: rb10543.patch


[ARM][PATCH 2/2] Remove redundant constructs added for FP16 support.

2019-05-29 Thread Srinath Parvathaneni
Hello,

The patch reworks some of the VRND and VCVT code added for the FP16
extension support to remove the redundant UNSPECS and related
constructs.

Tested for arm-none-linux-gnueabihf with native bootstrap and make check
and for arm-none-eabi with cross-compiled check-gcc on an
ARMv8.4-A emulator.

OK for trunk? If so, could someone please commit the patch on my behalf?
I don't have commit rights.

2019-05-29 Srinath Parvathaneni 
   Matthew Wahab  

* config/arm/iterators.md (VCVT_HF_US_N): Remove.
(VCVT_SI_US_N): Remove.
(VCVT_HF_US): Remove.
(VCVTH_US): Remove.
(FP16_RND): Remove.
(sup): Remove UNSPEC_VCVTA_S, UNSPEC_VCVTA_U, UNSPEC_VCVTM_S,
UNSPEC_VCVTM_U, UNSPEC_VCVTN_S, UNSPEC_VCVTN_U, UNSPEC_VCVTP_S,
UNSPEC_VCVTP_U, UNSPEC_VCVT_HF_S_N, UNSPEC_VCVT_HF_U_N,
UNSPEC_VCVT_SI_S_N, UNSPEC_VCVT_SI_U_N, UNSPEC_VCVTH_S,
UNSPEC_VCVTH_U.
(vcvth_op): Remove.
(fp16_rnd_insn): Remove.
* config/arm/unspecs.md: Remove UNSPEC_VCVT_HF_S_N,
UNSPEC_VCVT_HF_U_N, UNSPEC_VCVT_SI_S_N, UNSPEC_VCVT_SI_U_N,
UNSPEC_VCVTH_S, UNSPEC_VCVTH_U, UNSPEC_VCVTA_S, UNSPEC_VCVTA_U,
UNSPEC_VCVTM_S, UNSPEC_VCVTM_U, UNSPEC_VCVTN_S, UNSPEC_VCVTN_U,
UNSPEC_VCVTP_S, UNSPEC_VCVTP_U, UNSPEC_VRND, UNSPEC_VRNDA,
UNSPEC_VRNDI, UNSPEC_VRNDM, UNSPEC_VRNDN, UNSPEC_VRNDP,
UNSPEC_VRNDX.
* config/arm/vfp.md (neon_vcvthhf): Replace VCVTH_US with
VCVT_US.
(neon_vcvthsi): Likewise.
(neon_vcvth_nhf_unspec): Replace VCVTH_US_N with VCVT_US_N.
(neon_vcvth_nhf): Likewise.
(neon_vcvth_nsi_unspec): Replace VCVTH_SI_US_N with
VCVT_US_N.
(neon_vcvth_nsi): Likewise.



rb10544.patch
Description: rb10544.patch


[PATCH][GCC][ARM] Add support for hint intrinsics: __yield, __wfe, __wfi, __sev and __sevl.

2019-05-29 Thread Srinath Parvathaneni
Hi All,

This patch implements the __yield(), __wfe(), __wfi(), __sev() and 
__sevl() ACLE (hint) intrinsics for all ARM targets.

The specification for these intrinsics is published on the Arm website [1].

[1] 
https://developer.arm.com/docs/ihi0053/latest/arm-c-language-extensions-21-architecture-specification

Bootstrapped on arm-none-linux-gnueabihf, regression tested on 
arm-none-eabi and found no regressions.

The added tests were run using RUNTESTFLAGS as below:
RUNTESTFLAGS="--target_board=arm-eabi-aem/-march=armv8-a acle.exp=hint-1.c"
RUNTESTFLAGS="--target_board=arm-eabi-aem/-march=armv4t acle.exp=hint-2.c"
RUNTESTFLAGS="--target_board=arm-eabi-aem/-march=armv6t2 acle.exp=hint-3.c"

OK for trunk? If so, could someone please commit the patch on my behalf?
I don't have commit rights.

Thanks,
Srinath

gcc/ChangeLog:

2019-05-29  Srinath Parvathaneni  

* config/arm/arm-builtins.c (NOP_QUALIFIERS): New qualifier.
(arm_expand_builtin_args): New case.
* config/arm/arm.md (yield): New pattern name.
(wfe): Likewise.
(wfi): Likewise.
(sev): Likewise.
(sevl): Likewise.
* config/arm/arm_acle.h (__yield ): New inline function.
(__sev): Likewise.
(__sevl): Likewise.
(__wfi): Likewise.
(__wfe): Likewise.
* config/arm/arm_acle_builtins.def (VAR1):
(yield): New acle builtin.
(sev): Likewise.
(sevl): Likewise.
(wfi): Likewise.
(wfe): Likewise.
* config/arm/unspecs.md (unspecv):
(VUNSPEC_YIELD): New volatile unspec.
(VUNSPEC_SEV): Likewise.
(VUNSPEC_SEVL): Likewise.
(VUNSPEC_WFI): Likewise.

gcc/testsuite/ChangeLog:

2019-05-29  Srinath Parvathaneni  

* gcc.target/arm/acle/hint-1.c: New test.
* gcc.target/arm/acle/hint-2.c: Likewise.
* gcc.target/arm/acle/hint-3.c: Likewise.



rb10373.patch
Description: rb10373.patch


Re: libgomp/target.c magic constants self-documentation

2019-05-29 Thread Thomas Schwinge
Hi Jakub!

Ping.

On Fri, 21 Dec 2018 11:41:07 +0100, I wrote:
> On Sat, 10 Nov 2018 09:11:18 -0800, Julian Brown  
> wrote:
> > This patch [...] replaces usage
> > of several magic constants in target.c with named macros
> 
> > --- a/libgomp/libgomp.h
> > +++ b/libgomp/libgomp.h
> > @@ -902,6 +902,11 @@ struct target_mem_desc {
> > artificial pointer to "omp declare target link" object.  */
> >  #define REFCOUNT_LINK (~(uintptr_t) 1)
> >  
> > +/* Special offset values.  */
> > +#define OFFSET_INLINED (~(uintptr_t) 0)
> > +#define OFFSET_POINTER (~(uintptr_t) 1)
> > +#define OFFSET_STRUCT (~(uintptr_t) 2)
> > +
> >  struct splay_tree_key_s {
> >/* Address of the host object.  */
> >uintptr_t host_start;
> 
> I'd move these close to the struct they apply to.
> 
> 
> > --- a/libgomp/target.c
> > +++ b/libgomp/target.c
> > @@ -45,6 +45,8 @@
> >  #include "plugin-suffix.h"
> >  #endif
> >  
> > +#define FIELD_TGT_EMPTY (~(size_t) 0)
> > +
> >  static void gomp_target_init (void);
> >  
> >  /* The whole initialization code for offloading plugins is only run one.  
> > */
> 
> As it's only used there, I'd actually move that one into "gomp_map_vars",
> as a "const size_t field_tgt_empty".  And you missed using it in the
> initialization of "field_tgt_clear".  ;-)
> 
> 
> > --- a/libgomp/target.c
> > +++ b/libgomp/target.c
> > @@ -876,6 +892,8 @@ gomp_map_vars_async (struct gomp_device_descr *devicep,
> > else
> >   k->host_end = k->host_start + sizeof (void *);
> > splay_tree_key n = splay_tree_lookup (mem_map, k);
> > +   /* Need to account for the case where a struct field hasn't been
> > +  mapped onto the accelerator yet.  */
> > if (n && n->refcount != REFCOUNT_LINK)
> >   gomp_map_vars_existing (devicep, aq, n, k, &tgt->list[i],
> >   kind & typemask, cbufp);
> 
> We usually talk about "device", not "accelerator".
> 
> 
> All that I'm changing with the incremental patch attached.
> 
> 
> I'm also again attaching the complete patch that we'd like to commit to
> trunk; Jakub, OK?  If approving this patch, please respond with
> "Reviewed-by: NAME " so that your effort will be recorded in the
> commit log, see .
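
For readers unfamiliar with the '~(uintptr_t) N' idiom used for the special
offset values quoted above, the following sketch shows what the three macros
evaluate to.  The macro definitions are copied from the quoted patch;
'offset_is_special' is a hypothetical helper that is not part of libgomp.

```c
#include <assert.h>
#include <stdint.h>

/* Copied from the quoted patch.  */
#define OFFSET_INLINED (~(uintptr_t) 0)
#define OFFSET_POINTER (~(uintptr_t) 1)
#define OFFSET_STRUCT (~(uintptr_t) 2)

/* Hypothetical helper, not part of libgomp: nonzero if OFFSET is one of
   the three special values above rather than an ordinary offset.  The
   special values occupy the top three values of uintptr_t.  */
static int
offset_is_special (uintptr_t offset)
{
  return offset >= OFFSET_STRUCT;
}
```

Because the complement is taken in an unsigned type, the macros are the
three largest uintptr_t values (UINTPTR_MAX, UINTPTR_MAX - 1 and
UINTPTR_MAX - 2), which keeps them out of the way of small real offsets.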


Grüße
 Thomas


From 8f36a7d620b3e1d0130b352dc02d58c066c7ba92 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Fri, 21 Dec 2018 11:28:49 +0100
Subject: [PATCH] [WIP] libgomp/target.c magic constants self-documentation

---
 libgomp/libgomp.h | 10 +-
 libgomp/target.c  | 11 +--
 2 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
index 19e5fbb24e26..eef380d7b0fc 100644
--- a/libgomp/libgomp.h
+++ b/libgomp/libgomp.h
@@ -873,6 +873,11 @@ struct target_var_desc {
   uintptr_t length;
 };
 
+/* Special values for struct target_var_desc's offset.  */
+#define OFFSET_INLINED (~(uintptr_t) 0)
+#define OFFSET_POINTER (~(uintptr_t) 1)
+#define OFFSET_STRUCT (~(uintptr_t) 2)
+
 struct target_mem_desc {
   /* Reference count.  */
   uintptr_t refcount;
@@ -903,11 +908,6 @@ struct target_mem_desc {
artificial pointer to "omp declare target link" object.  */
 #define REFCOUNT_LINK (~(uintptr_t) 1)
 
-/* Special offset values.  */
-#define OFFSET_INLINED (~(uintptr_t) 0)
-#define OFFSET_POINTER (~(uintptr_t) 1)
-#define OFFSET_STRUCT (~(uintptr_t) 2)
-
 struct splay_tree_key_s {
   /* Address of the host object.  */
   uintptr_t host_start;
diff --git a/libgomp/target.c b/libgomp/target.c
index d7acdd9b784b..201da567d73a 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -45,8 +45,6 @@
 #include "plugin-suffix.h"
 #endif
 
-#define FIELD_TGT_EMPTY (~(size_t) 0)
-
 static void gomp_target_init (void);
 
 /* The whole initialization code for offloading plugins is only run one.  */
@@ -748,7 +746,8 @@ gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum,
   if (not_found_cnt)
 	tgt->array = gomp_malloc (not_found_cnt * sizeof (*tgt->array));
   splay_tree_node array = tgt->array;
-  size_t j, field_tgt_offset = 0, field_tgt_clear = ~(size_t) 0;
+  const size_t field_tgt_empty = ~(size_t) 0;
+  size_t j, field_tgt_offset = 0, field_tgt_clear = field_tgt_empty;
   uintptr_t field_tgt_base = 0;
 
   for (i = 0; i < mapnum; i++)
@@ -841,7 +840,7 @@ gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum,
 	  k->host_end = k->host_start + sizeof (void *);
 	splay_tree_key n = splay_tree_lookup (mem_map, k);
 	/* Need to account for the case where a struct field hasn't been
-	   mapped onto the accelerator yet.  */
+	   mapped onto the device yet.  */
 	if (n && n->refcount != REFCOUNT_LINK)
 	  gomp_map_vars_existing (devicep, n, k, &tgt->list[i],
   kind & typemask, cbufp);
@@ -858,12 +857,12 @@ gomp_map_vars (struct gomp_device_descr *devicep, size_t mapnum,
 		size_t align = (size_t) 1 << (kind >> rshift);
 		tgt->list[i]

[PATCH][GCC][AArch64] Add support for hint intrinsics: __yield, __wfe, __wfi, __sev and __sevl.

2019-05-29 Thread Srinath Parvathaneni
Hi All,

This patch implements the __yield(), __wfe(), __wfi(), __sev() and 
__sevl() ACLE (hint) intrinsics for AArch64 as yield, wfe, wfi, sev and 
sevl (hint) instructions respectively.

The instructions are documented in the ArmARM[1] and the intrinsics 
specification is published on the Arm website [2].

[1] 
https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile
[2] 
https://developer.arm.com/docs/ihi0053/latest/arm-c-language-extensions-21-architecture-specification

Bootstrapped on aarch64-none-linux-gnu and regression tested on 
aarch64-none-elf with no regressions.

Ok for trunk? If ok, could someone please commit the patch on my behalf? 
I don't have commit rights.

Thanks,
Srinath

gcc/ChangeLog:

2019-05-29  Srinath Parvathaneni  

* config/aarch64/aarch64.md (UNSPECV_YIELD): New volatile unspec.
(UNSPECV_WFE): Likewise.
(UNSPECV_WFI): Likewise.
(UNSPECV_SEV): Likewise.
(UNSPECV_SEVL): Likewise.
(yield): New pattern name.
(wfe): Likewise.
(wfi): Likewise.
(sev): Likewise.
(sevl): Likewise.
* config/aarch64/aarch64-builtins.c (aarch64_builtins):
AARCH64_BUILTIN_YIELD: New builtin.
AARCH64_BUILTIN_WFE: Likewise.
AARCH64_BUILTIN_WFI: Likewise.
AARCH64_BUILTIN_SEV: Likewise.
AARCH64_BUILTIN_SEVL: Likewise.
(aarch64_init_syshintop_builtins): New function.
(aarch64_init_builtins): New call statement.
(aarch64_expand_builtin): New case.
* config/aarch64/arm_acle.h (__yield): New inline function.
(__sev): Likewise.
(__sevl): Likewise.
(__wfi): Likewise.
(__wfe): Likewise.

gcc/testsuite/ChangeLog:

2019-05-29  Srinath Parvathaneni  

* gcc.target/aarch64/acle/hint-1.c: New test.



rb10372.patch
Description: rb10372.patch


Re: [libgomp, testsuite] Generalize getconf _NPROCESSORS_ONLN

2019-05-29 Thread Rainer Orth
Hi Jakub,

>> > So, something similar to what your patch does, but don't substitute the
>> > actual
>> > CPU count, but a command to determine the number of CPUs?
>> 
>> I'm not convinced that would work reliably: you can easily have one set
>> of commands on one machine, but a slightly different one on another.  If
>> we really want to go this route, that means extracting the autoconf
>> macro's logic into a separate script and use that at make check time.
>> Not sure if that's worth the effort, especially since we're not really
>> interested in the exact number of cores, just small (<= 8) vs. large (> 8).
>
> Ok, so can we do a middle-ground, instead of the current change to
> Makefile.am just replace the two spots that do num_cpus=1 with
> num_cpus=@CPU_COUNT@
> and keep the getconf invocation in there?  On Linux it will do what it used
> to do, and on targets that don't support getconf _NPROCESSORS_ONLN it will
> use a configure time determined value?

Fine with me.  I'll send an updated patch after testing.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University
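
For readers who want the "small (<= 8) vs. large (> 8)" distinction spelled out, here is a rough C sketch of the same fallback idea. It is only an illustration: the testsuite logic under discussion lives in Makefile.am and shell, not in C. FALLBACK_CPU_COUNT is a made-up stand-in for a configure-time value such as @CPU_COUNT@, and the function names are hypothetical.

```c
#include <unistd.h>

/* Assumption: stands in for a value determined at configure time.  */
#ifndef FALLBACK_CPU_COUNT
#define FALLBACK_CPU_COUNT 1
#endif

/* Return the number of online CPUs if the system can report it,
   otherwise fall back to the configure-time value.  */
long
online_cpu_count (void)
{
#ifdef _SC_NPROCESSORS_ONLN
  long n = sysconf (_SC_NPROCESSORS_ONLN);
  if (n > 0)
    return n;
#endif
  return FALLBACK_CPU_COUNT;
}

/* The tests only care about small vs. large machines, with 8 as the
   cut-off mentioned in the thread.  */
int
is_large_machine (long cpus)
{
  return cpus > 8;
}
```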


Re: Negative arguments in OpenMP 'num_threads' clause etc.

2019-05-29 Thread Jakub Jelinek
On Wed, May 29, 2019 at 04:42:14PM +0200, Thomas Schwinge wrote:
> Any comments on my question, please?
> 
> On Tue, 09 Apr 2019 17:51:46 +0200, I wrote:
> > On Tue, 29 Nov 2016 17:47:08 -0800, Cesar Philippidis 
> >  wrote:
> > > One notable difference between the trunk and gomp4 implementation of the
> > > tile clause is that gomp4 errors on negative value tile arguments,
> > > whereas trunk issues warnings.
> > 
> > I'm picking up these changes, which have been posted a few times, and
> > have been rejected (at least in their current incarnation) a few times,
> > too.  ;-\
> > 
> > > Is there a reason why the fortran FE
> > > generally emits a warning, on say num_threads(-5), instead of an error?
> > 
> > Same for the C/C++ front ends, which I'm looking into first.
> > 
> > Jakub, is the reason that even if the user is clearly doing something
> > "strange" there, the compiler has no problem continuing
> > compilation for 'num_threads(-5)', so it just emits a warning, but for
> > example for 'collapse(-5)' it has to stop with an error, because it can't
> > continue compilation in that case?  Or, is there a different reason for
> > the many 'warning_at ([...], "[...] must be positive"' (C front end, for
> > example), instead of using 'error_at' for these?

collapse has a constant expression argument and if the value is negative (or
0), then parsing doesn't make sense, so that case is clearly something where
an error is in order.  num_threads is an example of where the standard is
not completely clear if it is or is not ok to reject compilation as opposed
to just UB at runtime if that happens and no problem if that construct is
never encountered at runtime.

Kind of like the C++ difference between:
void
foo ()
{
  constexpr int a = __INT_MAX__ + 1;
  int b = __INT_MAX__ + 1;
}

where for a we need to error, but for b we just warn (by default, of course
one can use -Werror).

Jakub


Re: [PATCH] rs6000: Call flow implementation for PC-relative addressing

2019-05-29 Thread Alan Modra
On Wed, May 29, 2019 at 07:40:46AM -0500, Segher Boessenkool wrote:
> All necessary linker (and binutils and GAS) support is upstream already, 
> right?

I believe so, except gold support is lacking right now.

> >pld 12,0(0),1
> >.reloc .-8,R_PPC64_PLT_PCREL34_NOTOC,foo
> 
> Are we guaranteed the assembler always writes a pld like this as 8 bytes?

Strictly speaking the assembler might nop pad *before* the pld making
a total of 12 bytes, and that's the reason to put the .reloc *after*
the prefix instruction.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH] rs6000: Call flow implementation for PC-relative addressing

2019-05-29 Thread Bill Schmidt
On 5/29/19 7:40 AM, Segher Boessenkool wrote:
> Hi Bill,
>
> On Thu, May 23, 2019 at 09:11:44PM -0500, Bill Schmidt wrote:
>> (1) When a function uses PC-relative code generation, all direct calls 
>> (other than 
>> sibcalls) that the function makes to local or external callees should appear 
>> as
>> "bl sym@notoc" and should not be followed by a nop instruction.  @notoc 
>> indicates
>> that the assembler should annotate the call with R_PPC64_REL24_NOTOC, meaning
>> that the caller does not guarantee that r2 contains a valid TOC pointer.  
>> Thus
>> the linker should not try to replace a subsequent "nop" with a TOC restore
>> instruction.
> All necessary linker (and binutils and GAS) support is upstream already, 
> right?
>
>> In creating the new sibcall patterns, I did not duplicate the "c" 
>> alternatives
>> that allow for bctr or blr sibcalls.  I don't think there's a way to generate
>> those currently.  The bctr would be legitimate for PC-relative sibcalls if 
>> you
>> can prove that the target function is in the same binary, but we don't appear
>> to detect that possibility today.
> But you could see that the target is in the same translation unit, for 
> example?
> That should be a simple test to make, too.
>
>>pld 12,0(0),1
>>.reloc .-8,R_PPC64_PLT_PCREL34_NOTOC,foo
> Are we guaranteed the assembler always writes a pld like this as 8 bytes?
>
>>  * gcc.target/powerpc/notoc-direct-1.c: New.
>>  * gcc.target/powerpc/pcrel-sibcall-1.c: New.
> A few more testcases would be useful.  Well we'll gain a lot of-em soon
> enough, I suppose.
>
>>static char str[32];  /* 2 spare */
>> -  if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
>> +  if (rs6000_pcrel_p (cfun))
>> +sprintf (str, "b%s %s@notoc%s", sibcall ? "" : "l", z, arg);
>> +  else if (DEFAULT_ABI == ABI_AIX || DEFAULT_ABI == ABI_ELFv2)
>>  sprintf (str, "b%s %s%s%s", sibcall ? "" : "l", z, arg,
>>   sibcall ? "" : "\n\tnop");
> Two spare, and you add one char (@notoc vs. ..nop), so at a minimum you
> need to correct the comment?
>
>> +  if (DEFAULT_ABI == ABI_V4
>> +  && (!TARGET_SECURE_PLT
>> +  || !flag_pic
>> +  || (decl
>> +  && (*targetm.binds_local_p) (decl
>> +return true;
>> +
>> +  return false;
> Please invert this (put the "return false" condition in the if, like the
> preceding comment says).
>
>>if (TARGET_PLTSEQ)
>>  {
>>rtx base = const0_rtx;
>> -  int regno;
>> -  if (DEFAULT_ABI == ABI_ELFv2)
>> +  int regno = 12;
>> +  if (rs6000_pcrel_p (cfun))
>>  {
>> -  base = gen_rtx_REG (Pmode, TOC_REGISTER);
>> -  regno = 12;
>> +  rtx reg = gen_rtx_REG (Pmode, regno);
>> +  rtx u = gen_rtx_UNSPEC (Pmode, gen_rtvec (3, base, call_ref, arg),
>> +  UNSPEC_PLT_PCREL);
>> +  emit_insn (gen_rtx_SET (reg, u));
>> +  return reg;
>>  }
> You don't need a regno variable here, so don't use it, only set it later
> where it _is_ used?
>
>> +(define_insn "*pltseq_plt_pcrel"
>> +  [(set (match_operand:P 0 "gpc_reg_operand" "=r")
>> +(unspec:P [(match_operand:P 1 "" "")
>> +   (match_operand:P 2 "symbol_ref_operand" "s")
>> +   (match_operand:P 3 "" "")]
>> +  UNSPEC_PLT_PCREL))]
>> +  "HAVE_AS_PLTSEQ && TARGET_TLS_MARKERS
>> +   && rs6000_pcrel_p (cfun)"
>> +{
>> +  return rs6000_pltseq_template (operands, 4);
> Maybe those "4" magic constants should be an enum?
>
>> +int zz0 ()
>> +{
>> +  asm ("");
>> +  return 16;
>> +};
> You might want to put in a comment what this asm is for.
>
>
> Please consider those things.  Okay for trunk with that.  Thanks!

Thanks!  Will make appropriate changes and commit.  Much obliged for the
review!

Bill
>
>
> Segher
>



Re: [PATCH][RFC] final-value replacement from DCE

2019-05-29 Thread Jeff Law
On 5/29/19 7:36 AM, Richard Biener wrote:
> 
> The following tries to address PR90648 by performing final
> value replacement from DCE when DCE knows the final value
> computation is not used during loop iteration.  This fits
> neatly enough into existing tricks performed by DCE like
> removing unused malloc/free pairs.
Do you have the right BZ #?  90648 is an ICE in tree checking and doesn't
have a loop :-)



> 
> There are a few complications: one is that it fails to bootstrap
> because it exposes a few uninit warning false positives,
> another is that -fno-tree-sccp is no longer effective.
> As written this turns gcc.dg/pr34027-1.c into a division
> again (I did not copy the expression_expensive checking).
> It also seems to need -ftrapv adjustments (gcc.dg/pr81661.c).
> 
> The goal of this patch is to remove the SCCP pass, or rather
> our unconditional replacement of loop-closed PHIs with final
> value computations, about which we've already had complaints
> in the past that it duplicates computation that is readily
> available.  I've not yet figured out the testsuite fallout from
> that change.
> 
> For -fno-tree-sccp I'm considering simply honoring that
> flag in the DCE path; for gcc.dg/pr34027-1.c I'll
> re-install the expression_expensive checking.  I'll
> also fix the -ftrapv issue.
> 
> Does this otherwise look like a sensible way forward?

> 
> Thanks,
> Richard.
> 
> FAIL: gcc.dg/builtin-object-size-1.c execution test
> FAIL: gcc.dg/builtin-object-size-5.c scan-assembler-not abort
> FAIL: gcc.dg/pr34027-1.c scan-tree-dump-times optimized " / " 0
> FAIL: gcc.dg/pr81661.c (internal compiler error)
> FAIL: gcc.dg/pr81661.c (test for excess errors)
> XPASS: gcc.dg/tree-ssa/loop-15.c scan-tree-dump-times optimized " + " 0
> FAIL: gcc.dg/tree-ssa/loop-15.c scan-tree-dump-times optimized "if " 1
> FAIL: gcc.dg/tree-ssa/loop-26.c scan-tree-dump-times optimized "if" 2
> FAIL: gcc.dg/tree-ssa/pr32044.c scan-tree-dump-times optimized " / " 0
> FAIL: gcc.dg/tree-ssa/pr32044.c scan-tree-dump-times optimized "if" 6
> FAIL: gcc.dg/tree-ssa/pr64183.c scan-tree-dump cunroll "Loop 2 iterates at 
> most 3 times"
> FAIL: gcc.dg/tree-ssa/ssa-pre-3.c scan-tree-dump-times pre "Eliminated: 2" 1
> FAIL: gcc.dg/vect/no-scevccp-noreassoc-outer-3.c scan-tree-dump-times vect 
> "OUTER LOOP VECTORIZED." 1
> FAIL: gcc.dg/vect/no-scevccp-noreassoc-outer-4.c scan-tree-dump-times vect 
> "OUTER LOOP VECTORIZED." 1
> FAIL: gcc.dg/vect/no-scevccp-noreassoc-outer-5.c scan-tree-dump-times vect 
> "OUTER LOOP VECTORIZED." 1
> FAIL: gcc.dg/vect/no-scevccp-outer-11.c scan-tree-dump-times vect "OUTER LOOP 
> VECTORIZED." 1
> FAIL: gcc.dg/vect/no-scevccp-outer-13.c scan-tree-dump-times vect "OUTER LOOP 
> VECTORIZED." 1
> FAIL: gcc.dg/vect/no-scevccp-outer-14.c scan-tree-dump-times vect "OUTER LOOP 
> VECTORIZED." 1
> FAIL: gcc.dg/vect/no-scevccp-outer-15.c scan-tree-dump-times vect "OUTER LOOP 
> VECTORIZED." 1
> FAIL: gcc.dg/vect/no-scevccp-outer-16.c scan-tree-dump-times vect "OUTER LOOP 
> VECTORIZED." 1
> FAIL: gcc.dg/vect/no-scevccp-outer-17.c scan-tree-dump-times vect "OUTER LOOP 
> VECTORIZED." 1
> FAIL: gcc.dg/vect/no-scevccp-outer-18.c scan-tree-dump-times vect "OUTER LOOP 
> VECTORIZED." 1
> FAIL: gcc.dg/vect/no-scevccp-outer-19.c scan-tree-dump-times vect "OUTER LOOP 
> VECTORIZED." 1
> FAIL: gcc.dg/vect/no-scevccp-outer-2.c scan-tree-dump-times vect "OUTER LOOP 
> VECTORIZED" 1
> FAIL: gcc.dg/vect/no-scevccp-outer-20.c scan-tree-dump-times vect "OUTER LOOP 
> VECTORIZED." 1
> FAIL: gcc.dg/vect/no-scevccp-outer-21.c scan-tree-dump-times vect "OUTER LOOP 
> VECTORIZED." 1
> FAIL: gcc.dg/vect/no-scevccp-outer-3.c scan-tree-dump-times vect "OUTER LOOP 
> VECTORIZED." 1
> FAIL: gcc.dg/vect/no-scevccp-outer-5.c scan-tree-dump-times vect "OUTER LOOP 
> VECTORIZED." 1
> FAIL: gcc.dg/vect/no-scevccp-outer-6-global.c scan-tree-dump-times vect 
> "OUTER LOOP VECTORIZED." 1
> FAIL: gcc.dg/vect/no-scevccp-outer-6.c scan-tree-dump-times vect "OUTER LOOP 
> VECTORIZED." 1
> FAIL: gcc.dg/vect/no-scevccp-outer-7.c scan-tree-dump-times vect "OUTER LOOP 
> VECTORIZED." 1
> FAIL: gcc.dg/vect/no-scevccp-outer-8.c scan-tree-dump-times vect "OUTER LOOP 
> VECTORIZED." 1
> FAIL: gcc.dg/vect/no-scevccp-vect-iv-1.c scan-tree-dump-times vect 
> "vectorized 1 loops" 1
> FAIL: gcc.dg/vect/no-scevccp-vect-iv-3.c scan-tree-dump-times vect 
> "vect_recog_widen_sum_pattern: detected" 1
> FAIL: gcc.dg/vect/no-scevccp-vect-iv-3.c scan-tree-dump-times vect 
> "vectorized 1 loops" 1
> 
> Running target unix//-m32
> FAIL: gcc.dg/builtin-object-size-1.c execution test
> FAIL: gcc.dg/builtin-object-size-5.c scan-assembler-not abort
> FAIL: gcc.dg/pr34027-1.c scan-tree-dump-times optimized " / " 0
> FAIL: gcc.dg/pr81661.c (internal compiler error)
> FAIL: gcc.dg/pr81661.c (test for excess errors)
> XPASS: gcc.dg/tree-ssa/loop-15.c scan-tree-dump-times optimized " + " 0
> FAIL: gcc.dg/tree-ssa/loop-15.c scan-tree-dump-times optimized "if " 1
> FAIL: gcc.dg/tree-ssa/loop-26.c scan-tr

Re: undefined behavior in value_range::equiv_add()?

2019-05-29 Thread Aldy Hernandez

On 5/29/19 9:24 AM, Richard Biener wrote:
> On Wed, May 29, 2019 at 2:18 PM Aldy Hernandez  wrote:
>>
>> As per the API, and the original documentation to value_range,
>> VR_UNDEFINED and VR_VARYING should never have equivalences.  However,
>> equiv_add is tacking on equivalences blindly, and there are various
>> regressions that happen if I fix this oversight.
>>
>> void
>> value_range::equiv_add (const_tree var,
>>                         const value_range *var_vr,
>>                         bitmap_obstack *obstack)
>> {
>>   if (!m_equiv)
>>     m_equiv = BITMAP_ALLOC (obstack);
>>   unsigned ver = SSA_NAME_VERSION (var);
>>   bitmap_set_bit (m_equiv, ver);
>>   if (var_vr && var_vr->m_equiv)
>>     bitmap_ior_into (m_equiv, var_vr->m_equiv);
>> }
>>
>> Is this a bug in the documentation / API, or is equiv_add incorrect and
>> we should fix the fall-out elsewhere?
>
> I think this must have crept in during the classification.  If you
> go back to say GCC 7 you shouldn't see value-ranges with
> UNDEFINED/VARYING state in the lattice that have equivalences.
>
> It may not be easy to avoid with the new classy interface but we're
> certainly not tacking on them "blindly".  At least we're not supposed
> to.  As usual the intermediate state might be "broken" but
> intermediateness is not sth the new class "likes".


It looks like extract_range_from_stmt (by virtue of 
vrp_visit_assignment_or_call and then extract_range_from_ssa_name) 
returns one of these intermediate ranges.  It would seem to me that an 
outward looking API method like vr_values::extract_range_from_stmt 
shouldn't be returning inconsistent ranges.  Or are there no guarantees 
for value_ranges from within all of vr_values?


Perhaps I should give a little background.  As part of your 
value_range_base re-factoring last year, you mentioned that you didn't 
split out intersect like you did union because of time or oversight.  I 
have code to implement intersect (attached), for which I've noticed that 
I must leave equivalences intact, even when transitioning to VR_UNDEFINED:


[from the attached patch]
+  /* If THIS is varying we want to pick up equivalences from OTHER.
+ Just special-case this here rather than trying to fixup after the
+ fact.  */
+  if (this->varying_p ())
+this->deep_copy (other);
+  else if (this->undefined_p ())
+/* ?? Leave any equivalences already present in an undefined.
+   This is technically not allowed, but we may get an in-flight
+   value_range in an intermediate state.  */
+;

What is happening is that in evrp's record_ranges_from_stmt, we call 
extract_range_from_stmt which returns an inconsistent VR_UNDEFINED with 
an equivalence, which is then fed to update_value_range() and finally 
value_range::intersect.  The VR_UNDEFINED equivalence must not be 
removed in the intersect, because update_value_range() will get confused 
as to whether this is a new VR or not (because VR_UNDEFINED with no 
equivalences is not the same as VR_UNDEFINED with equivalences-- see 
"is_new" in update_value_range).


I'd rather not special case VR_UNDEFINED in the intersect code as above, 
but if you think extract_range_from_stmt() can return an "intermediate" 
range, and thus intersect must handle them too, then I suppose we could 
leave this in.


What do you suggest?

Oh yeah, is the attached patch OK for trunk?  I can post in a separate 
thread if you prefer, but thought it relevant here :).


Aldy

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 9f194540327..9494520ba33 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,19 @@
+2019-05-29  Aldy Hernandez  
+
+	* tree-vrp.h (value_range_base::intersect): New.
+	(value_range::intersect_helper): Move from here...
+	(value_range_base::intersect_helper): ...to here.
+	* tree-vrp.c (value_range::intersect_helper): Rename to...
+	(value_range_base::intersect_helper): ...this, and rewrite to
+	return a value instead of modifying THIS in place.
+	Also, move equivalence handling...
+	(value_range::intersect): ...here, while calling intersect_helper.
+	* gimple-fold.c (size_must_be_zero_p): Use value_range_base when
+	calling intersect.
+	* gimple-ssa-evrp-analyze.c (record_ranges_from_incoming_edge):
+	Same.
+	* vr-values.c (vrp_evaluate_conditional_warnv_with_ops): Same.
+
 2019-05-29  Aldy Hernandez  
 	* tree-vrp.h (value_range_base::non_zero_p): New.
 	* tree-vrp.c (range_is_null): Remove.
diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index b3e931744f8..8b8331eb555 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -684,10 +684,10 @@ size_must_be_zero_p (tree size)
   /* Compute the value of SSIZE_MAX, the largest positive value that
  can be stored in ssize_t, the signed counterpart of size_t.  */
   wide_int ssize_max = wi::lshift (wi::one (prec), prec - 1) - 1;
-  value_range valid_range (VR_RANGE,
-			   build_int_cst (type, 0),
-			   wide_int_to_tree (type, ssize_max));
-  value_range vr;
+  value_range_base valid_range (VR_RANGE,
+build_int_cst (type, 0),
+wide_int_to_tree (t

Re: [PATCH][RFC] final-value replacement from DCE

2019-05-29 Thread Jakub Jelinek
On Wed, May 29, 2019 at 09:57:50AM -0600, Jeff Law wrote:
> > FAIL: gcc.dg/builtin-object-size-1.c execution test
> > FAIL: gcc.dg/builtin-object-size-5.c scan-assembler-not abort

I admit I haven't looked at the details here, but wonder if the optimization
couldn't be done only in the DCE passes post IPA, otherwise we risk
behavior changes for __builtin_object_size.

Jakub


New Finnish PO file for 'gcc' (version 9.1.0)

2019-05-29 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the Finnish team of translators.  The file is available at:

https://translationproject.org/latest/gcc/fi.po

(This file, 'gcc-9.1.0.fi.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

https://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

https://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




Re: undefined behavior in value_range::equiv_add()?

2019-05-29 Thread Jeff Law
On 5/29/19 7:24 AM, Richard Biener wrote:
> On Wed, May 29, 2019 at 2:18 PM Aldy Hernandez  wrote:
>>
>> As per the API, and the original documentation to value_range,
>> VR_UNDEFINED and VR_VARYING should never have equivalences.  However,
>> equiv_add is tacking on equivalences blindly, and there are various
>> regressions that happen if I fix this oversight.
>>
>> void
>> value_range::equiv_add (const_tree var,
>> const value_range *var_vr,
>> bitmap_obstack *obstack)
>> {
>>   if (!m_equiv)
>>     m_equiv = BITMAP_ALLOC (obstack);
>>   unsigned ver = SSA_NAME_VERSION (var);
>>   bitmap_set_bit (m_equiv, ver);
>>   if (var_vr && var_vr->m_equiv)
>>     bitmap_ior_into (m_equiv, var_vr->m_equiv);
>> }
>>
>> Is this a bug in the documentation / API, or is equiv_add incorrect and
>> we should fix the fall-out elsewhere?
> 
> I think this must have crept in during the classification.  If you
> go back to say GCC 7 you shouldn't see value-ranges with
> UNDEFINED/VARYING state in the lattice that have equivalences.
> 
> It may not be easy to avoid with the new classy interface but we're
> certainly not tacking on them "blindly".  At least we're not supposed
> to.  As usual the intermediate state might be "broken" but
> intermediateness is not sth the new class "likes".
I don't remember changing anything (behavior-wise) in this space.  If I
did it certainly wasn't intentional.

Given the code in gcc-7 looks like this:


> static void
> add_equivalence (bitmap *equiv, const_tree var)
> {
>   unsigned ver = SSA_NAME_VERSION (var);
>   value_range *vr = get_value_range (var);
> 
>   if (*equiv == NULL)
> *equiv = BITMAP_ALLOC (&vrp_equiv_obstack);
>   bitmap_set_bit (*equiv, ver);
>   if (vr && vr->equiv)
> bitmap_ior_into (*equiv, vr->equiv);
> }

I suspect we need to look up the call chain.


Jeff


Re: [PATCH] Further C lapack workaround tweaks

2019-05-29 Thread Jakub Jelinek
On Wed, May 29, 2019 at 04:02:55PM +0200, Thomas Koenig wrote:
> > As I said earlier in the PR, I don't like -fbroken-callers option much,
> > as the option name doesn't hint what it is doing at all.
> > 
> > The following patch renames the option and makes it into a 3 state option,
> > with the default being a middle-ground, where it avoids the tail calls in
> > functions that have the hidden character length arguments only if it makes
> > any implicit interface calls.  The rationale for that is that there were no
> > previously reported issues with older GCC versions and the change that
> > affected the broken C/C++ wrappers was just giving prototypes to the
> > implicit interface procedure calls, so presumably in functions that don't
> > have any such calls nothing should have changed.
> > 
> > Bootstrapped/regtested on x86_64-linux and i686-linux, additionally tested
> > on the dtrtri.f testcase and on dtrtri.f testcase patched to include
> > explicit interfaces for all called procedures (and for those two verified
> > all the 6 ways of using these options, default, positive/negative option
> > without = and 0/1/2 values of the = option, checking in which case there is
> > a tail call), ok for trunk?
> 
> Yep, this is a much better scheme.  OK.
> 
> This problem is also present on all release branches, so I think that
> this should also be backported to them, so that 7.5, 8.4 and 9.2
> can also compile these LAPACK bindings again.

I've committed the following backport to the 9 and 8 branches for now.
The 8 one is like the 9 one, except LTO_minor_version has been bumped there
from 1 to 2 instead of from 0 to 1.
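
To make the three-state behaviour easier to follow, here is a small C sketch of the decision as described in this thread. It is not the trans-decl.c code: the function and parameter names are made up, treating level 0 as "workaround off" is an assumption based on the 0/1/2 values mentioned above, and other conditions in the real code (such as the string length being constant) are left out.

```c
#include <stdbool.h>

/* Hypothetical model of the option: LEVEL is 0 (assumed: workaround off),
   1 (the default middle ground) or 2 (always apply the workaround).
   Returns true if hidden string length arguments should be marked so
   that tail calls are avoided in the function.  */
bool
mark_hidden_string_length_p (int level, bool makes_implicit_interface_calls)
{
  if (level == 0)
    return false;
  if (level == 2)
    return true;
  /* Middle ground: only functions that make implicit interface calls
     are affected.  */
  return makes_implicit_interface_calls;
}
```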

2019-05-29  Jakub Jelinek  

PR fortran/90329
* lto-streamer.h (LTO_minor_version): Bump to 1.

Backported from mainline
2019-05-29  Jakub Jelinek  

PR fortran/90329
* lang.opt (fbroken-callers): Remove.
(ftail-call-workaround, ftail-call-workaround=): New options.
* gfortran.h (struct gfc_namespace): Add implicit_interface_calls.
* interface.c (gfc_procedure_use): Set implicit_interface_calls
for calls to implicit interface procedures.
* trans-decl.c (create_function_arglist): Use flag_tail_call_workaround
instead of flag_broken_callers.  If it is not 2, also require
sym->ns->implicit_interface_calls.
* invoke.texi (fbroken-callers): Remove documentation.
(ftail-call-workaround, ftail-call-workaround=): Document.

2019-05-19  Thomas Koenig  

PR fortran/90329
* invoke.texi: Document -fbroken-callers.
* lang.opt: Add -fbroken-callers.
* trans-decl.c (create_function_arglist): Only set
DECL_HIDDEN_STRING_LENGTH if flag_broken_callers is set.

2019-05-16  Jakub Jelinek  

PR fortran/90329
* tree-core.h (struct tree_decl_common): Document
decl_nonshareable_flag for PARM_DECLs.
* tree.h (DECL_HIDDEN_STRING_LENGTH): Define.
* calls.c (expand_call): Don't try tail call if caller
has any DECL_HIDDEN_STRING_LENGTH PARM_DECLs that are or might be
passed on the stack and callee needs to pass any arguments on the
stack.
* tree-streamer-in.c (unpack_ts_decl_common_value_fields): Use
else if instead of series of mutually exclusive ifs.  Handle
DECL_HIDDEN_STRING_LENGTH for PARM_DECLs.
* tree-streamer-out.c (pack_ts_decl_common_value_fields): Likewise.

* trans-decl.c (create_function_arglist): Set
DECL_HIDDEN_STRING_LENGTH on hidden string length PARM_DECLs if
len is constant.

--- gcc/tree-core.h (revision 271284)
+++ gcc/tree-core.h (revision 271285)
@@ -1683,6 +1683,7 @@ struct GTY(()) tree_decl_common {
   /* In a VAR_DECL and PARM_DECL, this is DECL_READ_P.  */
   unsigned decl_read_flag : 1;
   /* In a VAR_DECL or RESULT_DECL, this is DECL_NONSHAREABLE.  */
+  /* In a PARM_DECL, this is DECL_HIDDEN_STRING_LENGTH.  */
   unsigned decl_nonshareable_flag : 1;
 
   /* DECL_OFFSET_ALIGN, used only for FIELD_DECLs.  */
--- gcc/tree.h  (revision 271284)
+++ gcc/tree.h  (revision 271285)
@@ -900,6 +900,11 @@ extern void omp_clause_range_check_faile
   (TREE_CHECK2 (NODE, VAR_DECL, \
RESULT_DECL)->decl_common.decl_nonshareable_flag)
 
+/* In a PARM_DECL, set for Fortran hidden string length arguments that some
+   buggy callers don't pass to the callee.  */
+#define DECL_HIDDEN_STRING_LENGTH(NODE) \
+  (TREE_CHECK (NODE, PARM_DECL)->decl_common.decl_nonshareable_flag)
+
 /* In a CALL_EXPR, means that the call is the jump from a thunk to the
thunked-to function.  */
 #define CALL_FROM_THUNK_P(NODE) (CALL_EXPR_CHECK (NODE)->base.protected_flag)
--- gcc/calls.c (revision 271284)
+++ gcc/calls.c (revision 271285)
@@ -3628,6 +3628,28 @@ expand_call (tree exp, rtx target, int i
   || dbg_cnt (tail_call) == false)
 try_tail_call = 0;
 
+  /* Workaround buggy C/C++ wrappers

Re: undefined behavior in value_range::equiv_add()?

2019-05-29 Thread Jeff Law
On 5/29/19 9:58 AM, Aldy Hernandez wrote:
> On 5/29/19 9:24 AM, Richard Biener wrote:
>> On Wed, May 29, 2019 at 2:18 PM Aldy Hernandez  wrote:
>>>
>>> As per the API, and the original documentation to value_range,
>>> VR_UNDEFINED and VR_VARYING should never have equivalences.  However,
>>> equiv_add is tacking on equivalences blindly, and there are various
>>> regressions that happen if I fix this oversight.
>>>
>>> void
>>> value_range::equiv_add (const_tree var,
>>>  const value_range *var_vr,
>>>  bitmap_obstack *obstack)
>>> {
>>>   if (!m_equiv)
>>>     m_equiv = BITMAP_ALLOC (obstack);
>>>   unsigned ver = SSA_NAME_VERSION (var);
>>>   bitmap_set_bit (m_equiv, ver);
>>>   if (var_vr && var_vr->m_equiv)
>>>     bitmap_ior_into (m_equiv, var_vr->m_equiv);
>>> }
>>>
>>> Is this a bug in the documentation / API, or is equiv_add incorrect and
>>> we should fix the fall-out elsewhere?
>>
>> I think this must have crept in during the classification.  If you
>> go back to say GCC 7 you shouldn't see value-ranges with
>> UNDEFINED/VARYING state in the lattice that have equivalences.
>>
>> It may not be easy to avoid with the new classy interface but we're
>> certainly not tacking on them "blindly".  At least we're not supposed
>> to.  As usual the intermediate state might be "broken" but
>> intermediateness is not sth the new class "likes".
> 
> It looks like extract_range_from_stmt (by virtue of
> vrp_visit_assignment_or_call and then extract_range_from_ssa_name)
> returns one of these intermediate ranges.  It would seem to me that an
> outward looking API method like vr_values::extract_range_from_stmt
> shouldn't be returning inconsistent ranges.  Or are there no guarantees
> for value_ranges from within all of vr_values?
ISTM that if we have an implementation constraint that says a VR_VARYING
or VR_UNDEFINED range can't have equivalences, then we need to honor
that at the minimum for anything returned by an external API.  Returning
an inconsistent state is bad.  I'd even state that we should try damn
hard to avoid it in internal APIs as well.

> 
> Perhaps I should give a little background.  As part of your
> value_range_base re-factoring last year, you mentioned that you didn't
> split out intersect like you did union because of time or oversight.  I
> have code to implement intersect (attached), for which I've noticed that
> I must leave equivalences intact, even when transitioning to VR_UNDEFINED:
> 
> [from the attached patch]
> +  /* If THIS is varying we want to pick up equivalences from OTHER.
> + Just special-case this here rather than trying to fixup after the
> + fact.  */
> +  if (this->varying_p ())
> +    this->deep_copy (other);
> +  else if (this->undefined_p ())
> +    /* ?? Leave any equivalences already present in an undefined.
> +   This is technically not allowed, but we may get an in-flight
> +   value_range in an intermediate state.  */
Where/when does this happen?

> +    ;
> 
> What is happening is that in evrp's record_ranges_from_stmt, we call
> extract_range_from_stmt which returns an inconsistent VR_UNDEFINED with
> an equivalence, which is then fed to update_value_range() and finally
> value_range::intersect.  The VR_UNDEFINED equivalence must not be
> removed in the intersect, because update_value_range() will get confused
> as to whether this is a new VR or not (because VR_UNDEFINED with no
> equivalences is not the same as VR_UNDEFINED with equivalences-- see
> "is_new" in update_value_range).
Ugh.  I hate some of the gyrations we have to do for update_value_range.
 Regardless I tend to think the problem is in the inconsistent state we
get back from extract_range_from_stmt.

Jeff


Re: [PATCH] gdbinit: add a new command and fix one

2019-05-29 Thread Jeff Law
On 5/29/19 3:46 AM, Martin Liška wrote:
> Hi.
> 
> The patch is about a small change in .gdbinit file.
> 
> Ready for trunk?
> Martin
> 
> gcc/ChangeLog:
> 
> 2019-05-29  Martin Liska  
> 
>   * gdbinit.in: Fix 'ptc' command.  Add tt
>   that prints TREE_TYPE($).
> ---
>  gcc/gdbinit.in | 10 +-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> 
OK
jeff


Re: undefined behavior in value_range::equiv_add()?

2019-05-29 Thread Aldy Hernandez

On 5/29/19 12:12 PM, Jeff Law wrote:

On 5/29/19 9:58 AM, Aldy Hernandez wrote:

On 5/29/19 9:24 AM, Richard Biener wrote:

On Wed, May 29, 2019 at 2:18 PM Aldy Hernandez  wrote:


As per the API, and the original documentation to value_range,
VR_UNDEFINED and VR_VARYING should never have equivalences.  However,
equiv_add is tacking on equivalences blindly, and there are various
regressions that happen if I fix this oversight.

void
value_range::equiv_add (const_tree var,
  const value_range *var_vr,
  bitmap_obstack *obstack)
{
     if (!m_equiv)
   m_equiv = BITMAP_ALLOC (obstack);
     unsigned ver = SSA_NAME_VERSION (var);
     bitmap_set_bit (m_equiv, ver);
     if (var_vr && var_vr->m_equiv)
   bitmap_ior_into (m_equiv, var_vr->m_equiv);
}

Is this a bug in the documentation / API, or is equiv_add incorrect and
we should fix the fall-out elsewhere?


I think this must have crept in during the classification.  If you
go back to say GCC 7 you shouldn't see value-ranges with
UNDEFINED/VARYING state in the lattice that have equivalences.

It may not be easy to avoid with the new classy interface but we're
certainly not tacking on them "blindly".  At least we're not supposed
to.  As usual the intermediate state might be "broken" but
intermediateness is not sth the new class "likes".


It looks like extract_range_from_stmt (by virtue of
vrp_visit_assignment_or_call and then extract_range_from_ssa_name)
returns one of these intermediate ranges.  It would seem to me that an
outward looking API method like vr_values::extract_range_from_stmt
shouldn't be returning inconsistent ranges.  Or are there no guarantees
for value_ranges from within all of vr_values?

ISTM that if we have an implementation constraint that says a VR_VARYING
or VR_UNDEFINED range can't have equivalences, then we need to honor
that at the minimum for anything returned by an external API.  Returning
an inconsistent state is bad.  I'd even state that we should try damn
hard to avoid it in internal APIs as well.


Agreed * 2.





Perhaps I should give a little background.  As part of your
value_range_base re-factoring last year, you mentioned that you didn't
split out intersect like you did union because of time or oversight.  I
have code to implement intersect (attached), for which I've noticed that
I must leave equivalences intact, even when transitioning to VR_UNDEFINED:

[from the attached patch]
+  /* If THIS is varying we want to pick up equivalences from OTHER.
+ Just special-case this here rather than trying to fixup after the
+ fact.  */
+  if (this->varying_p ())
+    this->deep_copy (other);
+  else if (this->undefined_p ())
+    /* ?? Leave any equivalences already present in an undefined.
+   This is technically not allowed, but we may get an in-flight
+   value_range in an intermediate state.  */

Where/when does this happen?


The above snippet is not currently in mainline.  It's in the patch I'm 
proposing to clean up intersect.  It's just that while cleaning up 
intersect I noticed that if we keep to the value_range API, we end up 
clobbering an equivalence to a VR_UNDEFINED that we depend on further up 
the call chain.


The reason it doesn't happen in mainline is because intersect_helper 
bails early on an undefined, thus leaving the problematic equivalence 
intact.


You can see it in mainline though, with the following testcase:

int f(int x)
{
  if (x != 0 && x != 1)
    return -2;

  return !x;
}

Break in evrp_range_analyzer::record_ranges_from_stmt() and see that the 
call to extract_range_from_stmt() returns a VR of undefined *WITH* 
equivalences:


  vr_values->extract_range_from_stmt (stmt, &taken_edge, &output, &vr);

This VR is later fed to update_value_range() and ultimately intersect(), 
which in mainline, leaves the equivalences alone, but IMO should respect 
that API and nuke them.


For my proposed overhaul of intersect, I have to special-case the 
undefined to make sure we don't clobber its inconsistent use of 
equivalences.





+    ;

What is happening is that in evrp's record_ranges_from_stmt, we call
extract_range_from_stmt which returns an inconsistent VR_UNDEFINED with
an equivalence, which is then fed to update_value_range() and finally
value_range::intersect.  The VR_UNDEFINED equivalence must not be
removed in the intersect, because update_value_range() will get confused
as to whether this is a new VR or not (because VR_UNDEFINED with no
equivalences is not the same as VR_UNDEFINED with equivalences-- see
"is_new" in update_value_range).

Ugh.  I hate some of the gyrations we have to do for update_value_range.
  Regardless I tend to think the problem is in the inconsistent state we
get back from extract_range_from_stmt.


Agreed.

Aldy
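
For readers following along, the invariant under discussion can be sketched with a toy model.  This is not GCC's value_range; the type, its members and the enum below are all hypothetical, and a std::set stands in for the equivalence bitmap.  It only illustrates what an equiv_add that honors "UNDEFINED and VARYING never carry equivalences" would look like:

```cpp
#include <set>

// Toy stand-in for GCC's value_range; every name here is hypothetical.
enum class vr_kind { undefined, range, anti_range, varying };

struct toy_value_range
{
  vr_kind kind = vr_kind::undefined;
  std::set<unsigned> equiv;   // SSA versions known to be equivalent

  // Only ranges and anti-ranges may carry equivalences.
  bool may_have_equivs () const
  {
    return kind == vr_kind::range || kind == vr_kind::anti_range;
  }

  // The documented invariant: no equivalences on UNDEFINED or VARYING.
  bool consistent_p () const
  {
    return may_have_equivs () || equiv.empty ();
  }

  // Dropping to UNDEFINED also drops the equivalences.
  void set_undefined ()
  {
    kind = vr_kind::undefined;
    equiv.clear ();
  }

  // Unlike the equiv_add quoted above, refuse to attach equivalences
  // to an UNDEFINED or VARYING range.
  void equiv_add (unsigned ver, const toy_value_range *other)
  {
    if (!may_have_equivs ())
      return;
    equiv.insert (ver);
    if (other)
      equiv.insert (other->equiv.begin (), other->equiv.end ());
  }
};
```

In this model a caller such as update_value_range could never observe an UNDEFINED range with equivalences, which is the guarantee being asked of extract_range_from_stmt.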


Re: [C++ Patch] PR 89875 ("[7/8/9/10 Regression] invalid typeof reference to a member of an incomplete struct accepted at function scope")

2019-05-29 Thread Jason Merrill

On 5/28/19 4:44 PM, Paolo Carlini wrote:

Hi,

On 28/05/19 16:47, Jason Merrill wrote:

On 5/10/19 10:29 AM, Paolo Carlini wrote:

Hi,

a while ago Martin noticed that an unintended consequence of an old 
tweak of mine - which avoided redundant error messages emitted from 
cp_parser_init_declarator - is that, in some cases, we started 
accepting ill-formed typeofs. Luckily, decltype isn't affected and 
that points to the real issue: by the time that place in 
cp_parser_init_declarator is reached, for a decltype version we 
already emitted a correct error message. Thus I think the right way 
to fix the problem is simply committing to tentative parse when, 
toward the end of cp_parser_sizeof_operand we know that we must be 
looking at a (possibly ill-formed) expression. Tested x86_64-linux.


The problem with calling cp_parser_commit_to_tentative_parse here is 
that the tentative parse you're committing to is for the enclosing 
scope, which is trying to decide whether e.g. we're parsing a 
declaration or expression.  If the operand of typeof is a well-formed 
expression, and the larger context is an expression, this will break.


Better, I think, to commit and re-parse only if you actually encounter 
an error.


Alternately, cp_parser_decltype_expr deals with this by using a 
tentative firewall and CPP_DECLTYPE; cp_parser_sizeof_operand could do 
the same, but that seems like a bigger hammer.


Today I spent quite a bit of time on this and eventually decided to 
follow the example of decltype as closely as possible. Then I started 
tweaking those drafts which already passed the testsuite and after a 
while ended up with the below, rather close to the current code, in 
fact. Testing !cp_parser_error_occurred and in case calling 
cp_parser_abort_tentative_parse by hand (closer to the decltype example) 
also works. What do you think? Thanks, Paolo.


OK.

Jason



Re: [PATCH] rs6000: Call flow implementation for PC-relative addressing

2019-05-29 Thread Segher Boessenkool
On Thu, May 30, 2019 at 12:44:35AM +0930, Alan Modra wrote:
> On Wed, May 29, 2019 at 07:40:46AM -0500, Segher Boessenkool wrote:
> > All necessary linker (and binutils and GAS) support is upstream already, 
> > right?
> 
> I believe so, except gold support is lacking right now.

Excellent :-)

> > >pld 12,0(0),1
> > >.reloc .-8,R_PPC64_PLT_PCREL34_NOTOC,foo
> > 
> > Are we guaranteed the assembler always writes a pld like this as 8 bytes?
> 
> Strictly speaking the assembler might nop pad *before* the pld making
> a total of 12 bytes, and that's the reason to put the .reloc *after*
> the prefix instruction.

Ah, okay.  That probably warrants a comment...

Thanks,


Segher


Re: Simplify more EXACT_DIV_EXPR comparisons

2019-05-29 Thread Jeff Law
On 5/29/19 1:21 AM, Richard Biener wrote:
> On Tue, May 28, 2019 at 5:34 PM Martin Sebor  wrote:
>>
>> On 5/21/19 3:53 AM, Richard Biener wrote:
>>> On Tue, May 21, 2019 at 4:13 AM Martin Sebor  wrote:

 On 5/20/19 3:16 AM, Richard Biener wrote:
> On Mon, May 20, 2019 at 10:16 AM Marc Glisse  wrote:
>>
>> On Mon, 20 May 2019, Richard Biener wrote:
>>
>>> On Sun, May 19, 2019 at 6:16 PM Marc Glisse  
>>> wrote:

 Hello,

 2 pieces:

 - the first one handles the case where the denominator is negative. It
 doesn't happen often with exact_div, so I don't handle it everywhere, 
 but
 this one looked trivial

 - handle the case where a pointer difference is cast to an unsigned 
 type
 before being compared to a constant (I hit this in std::vector). With 
 some
 range info we could probably handle some non-constant cases as well...
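
The second piece can be checked on the Walloca-13.c numbers with a small sketch.  The helper names are hypothetical, and the byte difference is assumed to be a multiple of 4 (which is what makes the division exact); C++14 is assumed for the constexpr loop.  It compares the original test, (unsigned)(d /[ex] 4) < 100, with the folded one, (unsigned) d <= 396:

```cpp
#include <cstdint>

// Hypothetical helpers: D is the byte difference between two int
// pointers, so it is always a multiple of 4.

// Original form: divide exactly, cast to unsigned, compare.
constexpr bool before_fold (std::int64_t d)
{
  return static_cast<std::uint64_t> (d / 4) < 100;
}

// Folded form: cast the undivided difference, compare with (100 - 1) * 4.
constexpr bool after_fold (std::int64_t d)
{
  return static_cast<std::uint64_t> (d) <= 396;
}

// Compare the two forms on every multiple of 4 in [-4000, 4000].
constexpr bool forms_agree ()
{
  for (std::int64_t d = -4000; d <= 4000; d += 4)
    if (before_fold (d) != after_fold (d))
      return false;
  return true;
}

static_assert (forms_agree (), "folded comparison must match the original");
```

Negative differences become huge unsigned values in both forms, so both tests reject them; that is why the fold is safe even though the cast comes before the comparison.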

 The second piece breaks Walloca-13.c (-Walloca-larger-than=100 -O2)

 void f (void*);
 void g (int *p, int *q)
 {
  __SIZE_TYPE__ n = (__SIZE_TYPE__)(p - q);
  if (n < 100)
f (__builtin_alloca (n));
 }

 At the time of walloca2, we have

  _1 = p_5(D) - q_6(D);
  # RANGE [-2305843009213693952, 2305843009213693951]
  _2 = _1 /[ex] 4;
  # RANGE ~[2305843009213693952, 16140901064495857663]
  n_7 = (long unsigned intD.10) _2;
  _11 = (long unsigned intD.10) _1;
  if (_11 <= 396)
 [...]
  _3 = allocaD.1059 (n_7);

 and warn.
>>>
>>> That's indeed too complicated a relation of _11 to n_7 for
>>> VRP predicate discovery.
>>>
 However, DOM3 later produces

  _1 = p_5(D) - q_6(D);
  _11 = (long unsigned intD.10) _1;
  if (_11 <= 396)
>>>
>>> while _11 vs. _1 works fine.
>>>
 [...]
  # RANGE [0, 99] NONZERO 127
  _2 = _1 /[ex] 4;
  # RANGE [0, 99] NONZERO 127
  n_7 = (long unsigned intD.10) _2;
  _3 = allocaD.1059 (n_7);

 so I am tempted to say that the walloca2 pass is too early, xfail the
 testcase and file an issue...
>>>
>>> Hmm, there's a DOM pass before walloca2 already and moving
>>> walloca2 after loop opts doesn't look like the best thing to do?
>>> I suppose it's not DOM but sinking that does the important transform
>>> here?  That is,
>>>
>>> Index: gcc/passes.def
>>> ===
>>> --- gcc/passes.def  (revision 271395)
>>> +++ gcc/passes.def  (working copy)
>>> @@ -241,9 +241,9 @@ along with GCC; see the file COPYING3.
>>> NEXT_PASS (pass_optimize_bswap);
>>> NEXT_PASS (pass_laddress);
>>> NEXT_PASS (pass_lim);
>>> -  NEXT_PASS (pass_walloca, false);
>>> NEXT_PASS (pass_pre);
>>> NEXT_PASS (pass_sink_code);
>>> +  NEXT_PASS (pass_walloca, false);
>>> NEXT_PASS (pass_sancov);
>>> NEXT_PASS (pass_asan);
>>> NEXT_PASS (pass_tsan);
>>>
>>> fixes it?
>>
>> I will check, but I don't think walloca uses any kind of on-demand VRP, 
>> so
>> we still need some pass to update the ranges after sinking, which doesn't
>> seem to happen until the next DOM pass.
>
> Oh, ok...  Aldy, why's this a separate pass anyways?  I think similar
> other warnings are emitted from RTL expansion?  So maybe we can
> indeed move the pass towards warn_restrict or late_warn_uninit.

 I thought there was a preference to add new middle-end warnings
 into passes of their own rather than into existing passes.  Is
 that not so (either in general or in this specific case)?
>>>
>>> The preference was to add them not into optimization passes.  But
>>> of course having 10+ warning passes, each going over the whole IL
>>> is excessive.  Also each of the locally computing ranges or so.
>>>
>>> Given the simplicity of Walloca I wonder why it's not part of another
>>> warning pass - since it's about tracking "sizes" again there are plenty
>>> that fit ;)
>>
>> -Walloca doesn't need to track object sizes in the same sense
>> as objsize and strlen do.  It just examines calls to allocation
>> functions, same as -Walloc-larger-than.  It would make sense to
>> merge the implementation of two warnings.  They don't need to
>> run as a pass of their own.
>>
   From my POV, the main (only?) benefit of putting warnings in their
 own passes is modularity.  Are there any others?

 The biggest drawback I see is that it makes it hard to then share
 data across multiple passes.  The sharing can h

Re: [C++ PATCH] Fix decltype on a trivial dtor with -flifetime-dse (PR c++/90598)

2019-05-29 Thread Jason Merrill

On 5/24/19 4:21 AM, Jakub Jelinek wrote:

The second patch fixes that by special casing void type MODIFY_EXPR, I
believe if we have void type MODIFY_EXPR, then it can't be an lvalue.


Any expression with void type is a prvalue, so let's not limit this to 
MODIFY_EXPR.


Jason


Re: [C++ Patch] A few more grokdeclarator locations fixes

2019-05-29 Thread Jason Merrill

On 5/23/19 3:23 PM, Paolo Carlini wrote:

Hi,

one more, rather straightforward, simply use the location stored in the 
declarator. Tested x86_64-linux.


OK.

Jason



Re: [gomp] Add langhook, so that Fortran can privatize variables by reference

2019-05-29 Thread Thomas Schwinge
Hi Jakub!

On Mon, 27 May 2019 18:49:20 +0200, Jakub Jelinek  wrote:
> On Sun, May 26, 2019 at 07:43:04PM +0200, Thomas Schwinge wrote:
> > On Tue, 18 Oct 2005 03:01:40 -0400, Jakub Jelinek  wrote:
> > > --- gcc/omp-low.c.jj  2005-10-15 12:00:06.0 +0200
> > > +++ gcc/omp-low.c 2005-10-18 08:46:23.0 +0200
> > > @@ -126,7 +126,7 @@ is_variable_sized (tree expr)
> > >  static inline bool
> > >  is_reference (tree decl)
> > >  {
> > > -  return TREE_CODE (TREE_TYPE (decl)) == REFERENCE_TYPE;
> > > +  return lang_hooks.decls.omp_privatize_by_reference (decl);
> > >  }
> > 
> > With the same implementation, this function nowadays is known as
> > 'omp_is_reference' ('gcc/omp-general.c'), and is used in 'omp-*' files
> > only.  The gimplifier directly calls
> > 'lang_hooks.decls.omp_privatize_by_reference'.
> > 
> > Will it be OK to commit the obvious patch to get rid of the
> > 'omp_is_reference' function?  Whenever I see it used in 'omp-*' files, I
> 
> No, omp_is_reference (something) is certainly more readable than
> lang_hooks.decls.omp_privatize_by_reference (something)

Yes, more readable because it's shorter, but you have to look up its
meaning, whereas with 'lang_hooks.decls.omp_privatize_by_reference' you
directly see what it's about.

> which is quite
> long and would cause major issues in formatting etc.

Well, we have rules about how to deal with the formatting issues.

> What advantage do you see in removing that?

For me, it's confusing, when looking at, say, 'OMP_CLAUSE_FIRSTPRIVATE'
code, that in 'gcc/gimplify.c' we call
'lang_hooks.decls.omp_privatize_by_reference', whereas in 'gcc/omp-*.c'
files we call 'omp_is_reference' -- but both actually mean the same
thing.

> > wonder and have to look up what special things it might be doing -- but
> > it actually isn't.
> > 
> > gcc/
> > * omp-general.c (omp_is_reference): Don't define.  Adjust all users.

Or, of course, the other way round:

gcc/
* gimplify.c: Use omp_is_reference.

Or, even more preferably:

gcc/
* omp-general.c (omp_is_reference): Rename to...
(omp_privatize_by_reference): ... this.  Adjust all users.
* gimplify.c: Use it.


Grüße
 Thomas




[C++ PATCH] Fix decltype on a trivial dtor with -flifetime-dse (PR c++/90598, take 2)

2019-05-29 Thread Jakub Jelinek
On Wed, May 29, 2019 at 12:31:52PM -0400, Jason Merrill wrote:
> On 5/24/19 4:21 AM, Jakub Jelinek wrote:
> > The second patch fixes that by special casing void type MODIFY_EXPR, I
> > believe if we have void type MODIFY_EXPR, then it can't be an lvalue.
> 
> Any expression with void type is a prvalue, so let's not limit this to
> MODIFY_EXPR.

So like this?  So far no regressions in make check-c++-all, ok if it
passes full bootstrap/regtest?

The fact that cv void type expressions are prvalues has been clarified
only recently in
https://github.com/cplusplus/draft/commit/27d19661fbb0a5424f72330724d9809618efbb8b
it seems, is it true that they have been prvalues already in C++11 and
non-lvalues in C++98?

2019-05-29  Jakub Jelinek  

PR c++/90598
* tree.c (lvalue_kind): Return clk_none for expressions
with VOID_TYPE_P.

* g++.dg/cpp0x/pr90598.C: New test.

--- gcc/cp/tree.c.jj   2019-05-20 23:33:13.819084157 +0200
+++ gcc/cp/tree.c   2019-05-29 18:44:35.619408978 +0200
@@ -83,6 +83,10 @@ lvalue_kind (const_tree ref)
   if (ref == current_class_ptr)
     return clk_none;
 
+  /* Expressions with cv void type are prvalues.  */
+  if (TREE_TYPE (ref) && VOID_TYPE_P (TREE_TYPE (ref)))
+    return clk_none;
+
   switch (TREE_CODE (ref))
     {
     case SAVE_EXPR:
--- gcc/testsuite/g++.dg/cpp0x/pr90598.C.jj   2019-05-29 18:41:43.882194503 +0200
+++ gcc/testsuite/g++.dg/cpp0x/pr90598.C   2019-05-29 18:41:43.882194503 +0200
@@ -0,0 +1,8 @@
+// PR c++/90598
+// { dg-do compile { target c++11 } }
+
+struct A {};
+using B = decltype(A ().~A ());
+template <typename T> struct C;
+template <> struct C<void> {};
+C<B> t;


Jakub


Re: [PATCH] warn on returning alloca and VLA (PR 71924, 90549)

2019-05-29 Thread Jeff Law
On 5/22/19 3:34 PM, Martin Sebor wrote:
> -Wreturn-local-addr detects a subset of instances of returning
> the address of a local object from a function but the warning
> doesn't try to handle alloca or VLAs, or some non-trivial cases
> of ordinary automatic variables[1].
> 
> The attached patch extends the implementation of the warning to
> detect those.  It still doesn't detect instances where the address
> is the result of a built-in such as strcpy[2].
> 
> Tested on x86_64-linux.
> 
> Martin
> 
> [1] For example, this is only diagnosed with the patch:
> 
>   void* f (int i)
>   {
> struct S { int a[2]; } s[2];
> return &s->a[i];
>   }
> 
> [2] The following is not diagnosed even with the patch:
> 
>   void sink (void*);
> 
>   void* f (int i)
>   {
> char a[6];
> char *p = __builtin_strcpy (a, "123");
> sink (p);
> return p;
>   }
> 
> I would expect detecting it to be possible and useful.  Maybe as
> a follow-up.
> 
> gcc-71924.diff
> 
> PR middle-end/71924 - missing -Wreturn-local-addr returning alloca result
> PR middle-end/90549 - missing -Wreturn-local-addr maybe returning an address 
> of a local array plus offset
> 
> gcc/ChangeLog:
> 
>   PR c/71924
>   * gimple-ssa-isolate-paths.c (is_addr_local): New function.
>   (warn_return_addr_local_phi_arg, warn_return_addr_local): Same.
>   (find_implicit_erroneous_behavior): Call warn_return_addr_local_phi_arg.
>   (find_explicit_erroneous_behavior): Call warn_return_addr_local.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR c/71924
>   * gcc.dg/Wreturn-local-addr-2.c: New test.
>   * gcc.dg/Walloca-4.c: Prune expected warnings.
>   * gcc.dg/pr41551.c: Same.
>   * gcc.dg/pr59523.c: Same.
>   * gcc.dg/tree-ssa/pr88775-2.c: Same.
>   * gcc.dg/winline-7.c: Same.
> 
> diff --git a/gcc/gimple-ssa-isolate-paths.c b/gcc/gimple-ssa-isolate-paths.c
> index 33fe352bb23..2933ecf502e 100644
> --- a/gcc/gimple-ssa-isolate-paths.c
> +++ b/gcc/gimple-ssa-isolate-paths.c
> @@ -341,6 +341,135 @@ stmt_uses_0_or_null_in_undefined_way (gimple *stmt)
>return false;
>  }
>  
> +/* Return true if EXPR is an expression of pointer type that refers
> +   to the address of a variable with automatic storage duration.
> +   If so, set *PLOC to the location of the object or the call that
> +   allocated it (for alloca and VLAs).  When PMAYBE is non-null,
> +   also consider PHI statements and set *PMAYBE when some but not
> +   all arguments of such statements refer to local variables, and
> +   to clear it otherwise.  */
> +
> +static bool
> +is_addr_local (tree exp, location_t *ploc, bool *pmaybe = NULL,
> +hash_set<gphi *> *visited = NULL)
> +{
> +  if (TREE_CODE (exp) == ADDR_EXPR)
> +{
> +  tree baseaddr = get_base_address (TREE_OPERAND (exp, 0));
> +  if (TREE_CODE (baseaddr) == MEM_REF)
> + return is_addr_local (TREE_OPERAND (baseaddr, 0), ploc, pmaybe, 
> visited);
> +
> +  if ((!VAR_P (baseaddr)
> +|| is_global_var (baseaddr))
> +   && TREE_CODE (baseaddr) != PARM_DECL)
> + return false;
> +
> +  *ploc = DECL_SOURCE_LOCATION (baseaddr);
> +  return true;
> +}
> +
> +  if (TREE_CODE (exp) == SSA_NAME)
> +{
> +  gimple *def_stmt = SSA_NAME_DEF_STMT (exp);
> +  enum gimple_code code = gimple_code (def_stmt);
> +
> +  if (is_gimple_assign (def_stmt))
> + {
> +   tree type = TREE_TYPE (gimple_assign_lhs (def_stmt));
> +   if (POINTER_TYPE_P (type))
> + {
> +   tree ptr = gimple_assign_rhs1 (def_stmt);
> +   return is_addr_local (ptr, ploc, pmaybe, visited);
> + }
> +   return false;
> + }
> +
> +  if (code == GIMPLE_CALL
> +   && gimple_call_builtin_p (def_stmt))
> + {
> +   tree fn = gimple_call_fndecl (def_stmt);
> +   int code = DECL_FUNCTION_CODE (fn);
> +   if (code != BUILT_IN_ALLOCA
> +   && code != BUILT_IN_ALLOCA_WITH_ALIGN)
> + return false;
> +
> +   *ploc = gimple_location (def_stmt);
> +   return true;
> + }
> +
> +  if (code == GIMPLE_PHI && pmaybe)
> + {
> +   unsigned count = 0;
> +   gphi *phi_stmt = as_a <gphi *> (def_stmt);
> +
> +   unsigned nargs = gimple_phi_num_args (phi_stmt);
> +   for (unsigned i = 0; i < nargs; ++i)
> + {
> +   if (!visited->add (phi_stmt))
> + {
> +   tree arg = gimple_phi_arg_def (phi_stmt, i);
> +   if (is_addr_local (arg, ploc, pmaybe, visited))
> + ++count;
> + }
> + }
> +
> +   *pmaybe = count && count < nargs;
> +   return count != 0;
> + }
> +}
> +
> +  return false;
> +}
Is there some reason you didn't query the alias oracle here?  It would
seem a fairly natural fit.  Ultimately given a pointer (which will be an
SSA_NAME) you want to ask whether or not it conclusively points into the
stack.

That would seem to dramatically simplify is_addr_local.

The rest looks pretty reasonab

Re: libbacktrace integration for _GLIBCXX_DEBUG mode

2019-05-29 Thread François Dumont

On 5/29/19 12:06 AM, Jonathan Wakely wrote:

On 23/05/19 07:39 +0200, François Dumont wrote:

Hi

    So here is what I came up with.

    _GLIBCXX_DEBUG_BACKTRACE controls the feature. If the user defines 


Thanks for making this opt-in.

it and there is a detectable issue with libbacktrace then I generate 
a compilation error. I want to avoid users defining it but having no 
backtrace in the end in the debug assertion.


Why do you want to avoid that?

This means users can't just define the macro in their makefiles
unconditionally, they have to check if libbacktrace is installed and
supported. That might mean having platform-specific conditionals in
the makefile to only enable it sometimes.

What harm does it do to just ignore the _GLIBCXX_DEBUG_BACKTRACE macro
if backtraces can't be enabled? Or just use #warning instead of
#error?

What I want to avoid is PRs saying that, despite _GLIBCXX_DEBUG_BACKTRACE 
being defined, no backtrace is displayed in the assertion message.


Maybe I can still fail to compile if I can't include 
backtrace-supported.h to make clear that the -I... option is missing but 
ignore if !BACKTRACE_SUPPORTED. Would that be fine?
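
A rough sketch of that compromise follows, as a reading aid rather than the actual patch.  DEMO_DEBUG_BACKTRACE and DEMO_USE_BACKTRACE are hypothetical stand-ins for the real macros, and the sketch assumes a compiler that provides __has_include.  BACKTRACE_SUPPORTED is the macro that libbacktrace's backtrace-supported.h defines.

```cpp
// DEMO_DEBUG_BACKTRACE is a hypothetical stand-in for the opt-in macro.
// It is left undefined here so the sketch builds without libbacktrace.
#ifdef DEMO_DEBUG_BACKTRACE
# if !__has_include(<backtrace-supported.h>)
   // Header not reachable: most likely a missing -I option, so fail loudly.
#  error "backtrace-supported.h not found; is the -I option missing?"
# else
#  include <backtrace-supported.h>
#  if BACKTRACE_SUPPORTED
#   define DEMO_USE_BACKTRACE 1
#  endif
   // If !BACKTRACE_SUPPORTED, the request is silently ignored.
# endif
#endif

// True only when backtraces were requested and libbacktrace is usable.
constexpr bool demo_backtrace_enabled ()
{
#ifdef DEMO_USE_BACKTRACE
  return true;
#else
  return false;
#endif
}
```

With this shape a makefile can define the macro unconditionally on platforms where the header is installed, and only a missing include path is a hard error.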





    With this new setup I managed to run the testsuite with it like this:

export LD_LIBRARY_PATH=/home/fdt/dev/gcc/install/lib/
make CXXFLAGS='-D_GLIBCXX_DEBUG_BACKTRACE -I/home/fdt/dev/gcc/install/include -lbacktrace' check-debug


    An example of result:

/home/fdt/dev/gcc/build/x86_64-pc-linux-gnu/libstdc++-v3/include/debug/vector:606: 


In function:
    std::__debug::vector<_Tp, _Allocator>::iterator
    std::__debug::vector<_Tp, 
_Allocator>::insert(std::__debug::vector<_Tp,

    _Allocator>::const_iterator, _InputIterator, _InputIterator) [with
    _InputIterator = int*;  = void; _Tp = int;
    _Allocator = std::allocator; std::__debug::vector<_Tp,
    _Allocator>::iterator =
__gnu_debug::_Safe_iterator<__gnu_cxx::__normal_iterator >, std::__debug::vector,
    std::random_access_iterator_tag>; typename 
std::iterator_traits
    std::vector<_Tp, _Alloc>::iterator>::iterator_category =
    std::random_access_iterator_tag; typename std::vector<_Tp,
    _Alloc>::iterator = __gnu_cxx::__normal_iteratorstd::vector

    >; std::__debug::vector<_Tp, _Allocator>::const_iterator =
__gnu_debug::_Safe_iterator<__gnu_cxx::__normal_iterator >, std::__debug::vector,
    std::random_access_iterator_tag>; typename 
std::iterator_traits
    std::vector<_Tp, _Alloc>::const_iterator>::iterator_category =
    std::random_access_iterator_tag; typename std::vector<_Tp,
    _Alloc>::const_iterator = __gnu_cxx::__normal_iteratorint*, std::

    vector >]

Backtrace:
    0x402718 
__gnu_debug::_Safe_iterator<__gnu_cxx::__normal_iteratorstd::vector >> std::__debug::vector::insertvoid>(__gnu_debug::_Safe_iterator<__gnu_cxx::__normal_iteratorconst*, std::vector >>, int*, int*)
/home/fdt/dev/gcc/build/x86_64-pc-linux-gnu/libstdc++-v3/include/debug/vector:606 


    0x402718 test01()
/home/fdt/dev/gcc/git/libstdc++-v3/testsuite/23_containers/vector/debug/57779_neg.cc:29 


    0x401428 main
/home/fdt/dev/gcc/git/libstdc++-v3/testsuite/23_containers/vector/debug/57779_neg.cc:34 



Error: attempt to insert with an iterator range [__first, __last) from this
container.

Objects involved in the operation:
    iterator "__first" @ 0x0x7fff730b96b0 {
  type = int* (mutable iterator);
    }
    iterator "__last" @ 0x0x7fff730b96b8 {
  type = int* (mutable iterator);
    }
    sequence "this" @ 0x0x7fff730b9720 {
  type = std::__debug::vector;
    }


This is nice.


Yes, I forgot to say that I made an effort to clean up the demangled 
names a little, but I think you saw that already.


I was surprised to see that the content of the __PRETTY_FUNCTION__ macro 
is quite different from the demangling of the same symbol, too bad.




diff --git a/libstdc++-v3/doc/xml/manual/debug_mode.xml 
b/libstdc++-v3/doc/xml/manual/debug_mode.xml

index 570c17ba28a..27873151dae 100644
--- a/libstdc++-v3/doc/xml/manual/debug_mode.xml
+++ b/libstdc++-v3/doc/xml/manual/debug_mode.xml
@@ -104,9 +104,11 @@
The following library components provide extra debugging
  capabilities in debug mode:

+ std::array (no safe 
iterators)
std::basic_string (no safe iterators and 
see note below)

std::bitset
std::deque
+ std::forward_list
std::list
std::map
std::multimap
@@ -160,6 +162,13 @@ which always works correctly.
  GLIBCXX_DEBUG_MESSAGE_LENGTH can be used to request a
  different length.

+Starting with GCC 10 libstdc++ has integrated


I think "integrated" is the wrong word, because you haven't made
libbacktrace and libstdc++ into a single thing. You've just added the
ability for libstdc++ to use libbacktrace.

I suggest "libstdc++ is able to use"


+  http://www.w3.org/1999/xlink";
+ 
xlink:href="https://github.com/ianlancetaylor/libbacktrace";>libbacktrace
+  to produce backtrace on error. Use 
-D_GLIBCXX_DEBUG_BACKTRACE to


Should be "produce backtraces".

+  activate it. Note that if not proper

[PATCH] rs6000: Add eI constraint for 34-bit constants

2019-05-29 Thread Bill Schmidt
Hi,

This short patch introduces the eI constraint.  It also adds the
SIGNED_16BIT_OFFSET_P convenience macro that will be used in subsequent
patches.

Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no regressions.
Is this okay for trunk?

Thanks,
Bill


2019-05-29  Bill Schmidt  
Michael Meissner  

* config/rs6000/constraints.md (eI): New constraint.
* config/rs6000/predicates.md (cint34_operand): New predicate.
* config/rs6000/rs6000.h (SIGNED_16BIT_OFFSET_P): New #define.
(SIGNED_34BIT_OFFSET_P): Likewise.
* doc/md.texi (eI): Document constraint.


diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md
index fd8be343f09..8004a92fd40 100644
--- a/gcc/config/rs6000/constraints.md
+++ b/gcc/config/rs6000/constraints.md
@@ -210,6 +210,11 @@
   (and (match_code "const_int")
(match_test "((- (unsigned HOST_WIDE_INT) ival) + 0x8000) < 0x10000")))
 
+;; 34-bit signed integer constant
+(define_constraint "eI"
+  "34-bit constant integer that can be loaded with PADDI"
+  (match_operand 0 "cint34_operand"))
+
 ;; Floating-point constraints.  These two are defined so that insn
 ;; length attributes can be calculated exactly.
 
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index 2643f1abd2e..a578e0f27f7 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -302,6 +302,16 @@
   (and (match_code "const_int")
(match_test "IN_RANGE (INTVAL (op), 0, 15)")))
 
+;; Return 1 if op is a 34-bit constant integer.
+(define_predicate "cint34_operand"
+  (match_code "const_int")
+{
+  if (!TARGET_PREFIXED_ADDR)
+return 0;
+
+  return SIGNED_34BIT_OFFSET_P (INTVAL (op), 0);
+})
+
 ;; Return 1 if op is a register that is not special.
 ;; Disallow (SUBREG:SF (REG:SI)) and (SUBREG:SI (REG:SF)) on VSX systems where
 ;; you need to be careful in moving a SFmode to SImode and vice versa due to
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index 335d75ae85f..fc92ff20d11 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -2493,3 +2493,17 @@ extern GTY(()) tree rs6000_builtin_decls[RS6000_BUILTIN_COUNT];
 #if (GCC_VERSION >= 3000)
 #pragma GCC poison TARGET_FLOAT128 OPTION_MASK_FLOAT128 MASK_FLOAT128
 #endif
+
+/* Whether a given VALUE is a valid 16- or 34-bit signed offset.  EXTRA is the
+   amount that we can't touch at the high end of the range (typically if the
+   address is split into smaller addresses, the extra covers the addresses
+   which might be generated when the insn is split).  */
+#define SIGNED_16BIT_OFFSET_P(VALUE, EXTRA)\
+  IN_RANGE (VALUE, \
+   ~(HOST_WIDE_INT_1 << 15),   \
+   (HOST_WIDE_INT_1 << 15) - 1 - (EXTRA))
+
+#define SIGNED_34BIT_OFFSET_P(VALUE, EXTRA)\
+  IN_RANGE (VALUE, \
+   ~(HOST_WIDE_INT_1 << 33),   \
+   (HOST_WIDE_INT_1 << 33) - 1 - (EXTRA))
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index db9c210edb8..775b8f5b715 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -3367,6 +3367,9 @@ Zero
 @item P
 Constant whose negation is a signed 16-bit constant
 
+@item eI
+Signed 34-bit integer constant if prefixed instructions are supported.
+
 @item G
 Floating point constant that can be loaded into a register with one
 instruction per word
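
The range tests in the two new macros can be modelled in a few lines.  This is only a sketch: in_range and signed_offset_p are hypothetical names, and GCC's own IN_RANGE macro is not reproduced.  The sketch uses -(1 << (N-1)) as the lower bound, the two's-complement minimum of an N-bit field.  The macros as posted write the lower bound with ~, which evaluates to one less (-32769 for 16 bits), so that bound may be worth a second look.

```cpp
#include <cstdint>

// Hypothetical inclusive range test; not GCC's IN_RANGE.
constexpr bool in_range (std::int64_t v, std::int64_t lo, std::int64_t hi)
{
  return lo <= v && v <= hi;
}

// True if VALUE fits in a signed BITS-bit field, with EXTRA reserved at
// the high end of the range (as the comment in the patch describes).
constexpr bool signed_offset_p (std::int64_t value, int bits,
                                std::int64_t extra)
{
  return in_range (value,
                   -(std::int64_t{1} << (bits - 1)),
                   (std::int64_t{1} << (bits - 1)) - 1 - extra);
}

// 16-bit limits.
static_assert (signed_offset_p (-32768, 16, 0), "");
static_assert (!signed_offset_p (-32769, 16, 0), "");
static_assert (signed_offset_p (32767, 16, 0), "");
static_assert (!signed_offset_p (32768, 16, 0), "");

// 34-bit limits.
static_assert (signed_offset_p ((std::int64_t{1} << 33) - 1, 34, 0), "");
static_assert (!signed_offset_p (std::int64_t{1} << 33, 34, 0), "");

// The bitwise complement is one below the negation.
static_assert (~(std::int64_t{1} << 15) == -32769, "");
```

A nonzero EXTRA shrinks only the upper end, so signed_offset_p (32767, 16, 1) is false while signed_offset_p (32766, 16, 1) is true.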



Re: [C++ PATCH] Fix decltype on a trivial dtor with -flifetime-dse (PR c++/90598, take 2)

2019-05-29 Thread Jason Merrill

On 5/29/19 1:12 PM, Jakub Jelinek wrote:

On Wed, May 29, 2019 at 12:31:52PM -0400, Jason Merrill wrote:

On 5/24/19 4:21 AM, Jakub Jelinek wrote:

The second patch fixes that by special casing void type MODIFY_EXPR, I
believe if we have void type MODIFY_EXPR, then it can't be an lvalue.


Any expression with void type is a prvalue, so let's not limit this to
MODIFY_EXPR.


So like this?  So far no regressions in make check-c++-all, ok if it
passes full bootstrap/regtest?


OK.


The fact that cv void type expressions are prvalues has been clarified
only recently in
https://github.com/cplusplus/draft/commit/27d19661fbb0a5424f72330724d9809618efbb8b
it seems, is it true that they have been prvalues already in C++11 and
non-lvalues in C++98?


Even if it wasn't stated clearly, it's never been possible to form an 
lvalue of void type.


Jason
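
Editorial illustration of the point above, as a minimal sketch rather than
the testcase from the PR: decltype applied to a parenthesized expression
yields T& for an lvalue, T&& for an xvalue and plain T for a prvalue, so a
void-typed expression reports plain void.  The names Trivial and do_nothing
are hypothetical.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical class with a trivial destructor, not taken from the PR.
struct Trivial {};

// Hypothetical function returning void.
void do_nothing() {}

// An explicit destructor call is an expression of type void.
using DtorCall = decltype(std::declval<Trivial&>().~Trivial());
static_assert(std::is_void<DtorCall>::value, "destructor call has type void");

// decltype((e)) would add a reference for an lvalue or xvalue.  For a void
// expression the result is plain void, i.e. the expression is a prvalue.
using ParenCall = decltype((do_nothing()));
static_assert(std::is_same<ParenCall, void>::value, "void call is a prvalue");

// Run-time view of the same two compile-time facts.
bool void_expressions_are_prvalues()
{
  return std::is_void<DtorCall>::value && !std::is_reference<ParenCall>::value;
}
```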


Re: [PATCH] gdbinit: add a new command and fix one

2019-05-29 Thread Segher Boessenkool
On Wed, May 29, 2019 at 10:14:58AM -0600, Jeff Law wrote:
> On 5/29/19 3:46 AM, Martin Liška wrote:
> > Hi.
> > 
> > The patch is about a small change in .gdbinit file.
> > 
> > Ready for trunk?
> > Martin
> > 
> > gcc/ChangeLog:
> > 
> > 2019-05-29  Martin Liska  
> > 
> > * gdbinit.in: Fix 'ptc' command.  Add tt
> > that prints TREE_TYPE($).
> > ---
> >  gcc/gdbinit.in | 10 +-
> >  1 file changed, 9 insertions(+), 1 deletion(-)

There already *is* a "tt" command.  Not that that one is likely very
useful for debugging GCC, but still...  Please check before overriding
commands.


Segher


Re: [PATCH] RX: Add rx-*-linux target

2019-05-29 Thread Jeff Law
On 5/23/19 6:05 AM, Yoshinori Sato wrote:
> I ported the Linux kernel to Renesas RX.
> 
> The rx-*-elf target outputs a binary that differs from standard ELF.
> It has the same format as the Renesas compiler's output.
> 
> But the Linux kernel requires the standard ELF format.
> I want to define an rx-*-linux target so that I can generate
> a standard ELF binary.
Presumably you're resubmitting after your assignment got recorded (I
think I saw that fly by recently).

I'll construct a ChangeLog and install this on the trunk.

Thanks,
Jeff


Re: [patch, fortran] Fix wrong-code regression with netcdf and SPEC due to argument repacking

2019-05-29 Thread Steve Kargl
On Wed, May 29, 2019 at 01:15:52PM +0200, Thomas Koenig wrote:
> 
> the attached patch fixes the wrong-code regression due to the
> inline argument repacking patch, r271377.
> 
> What had gone wrong?  gfortran used to pack and  unpack arrays
> unconditionally passed to old-style assumed size or .  For code like
> 
> module t2
>implicit none
> contains
>subroutine foo(a)
>  real, dimension(*) :: a
>end subroutine foo
> end module t2
> 
> module t1
>use t2
>implicit none
> contains
>subroutine bar(a)
>  real, dimension(:) :: a
>  call foo(a)
>end subroutine bar
> end module t1
> 
> program main
>use t1
>call bar([1.0, 2.0])
> end program main
> 
> this meant that an (always contiguous) array constructor was
> passed down to an assumed shape array, which then passed it
> on to an assumed size, explicit shape or adjustable array.
> Packing was not problematic (apart from performance), but
> unpacking tried to write into the array constructor.
> 
> So, this patch inserts a run-time check for contiguous arrays
> and does not do packing/unpacking in that case.
> 
> Thanks to Toon and Martin for finding an open test case which
> actually failed, and for help with debugging.
> 
> (Always repacking also likely impacted performance when it didn't
> lead to wrong code, we will have to see how performance is with
> this version).
> 
> OK for trunk?
> 

Yes.

Thomas and Martin, thanks for the effort required to debug
the SPEC benchmark codes.

-- 
Steve
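
Editorial sketch of the idea described in the quoted mail (check contiguity
at run time and skip packing/unpacking when the data is already contiguous).
This is a hand-written C++ model, not the code gfortran generates; the
function names and the one-dimensional stride-in-elements layout are
assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Call CALLEE on N elements that start at BASE and are STRIDE elements apart.
// Returns true if a temporary copy (pack + unpack) was needed.
template <typename Callee>
bool call_with_repack_if_needed(double* base, std::ptrdiff_t n,
                                std::ptrdiff_t stride, Callee callee)
{
  // Run-time contiguity check: nothing to repack, pass the data directly.
  // In particular, nothing is ever written back to the caller's storage
  // unless the callee itself writes to it.
  if (n <= 1 || stride == 1)
    {
      callee(base, n);
      return false;
    }

  // Pack the strided section into a contiguous temporary.
  std::vector<double> tmp(static_cast<std::size_t>(n));
  for (std::ptrdiff_t i = 0; i < n; ++i)
    tmp[static_cast<std::size_t>(i)] = base[i * stride];

  callee(tmp.data(), n);

  // Unpack: copy the (possibly modified) values back.
  for (std::ptrdiff_t i = 0; i < n; ++i)
    base[i * stride] = tmp[static_cast<std::size_t>(i)];
  return true;
}

// Hypothetical callee standing in for an assumed-size dummy argument.
inline void scale_by_two(double* a, std::ptrdiff_t n)
{
  for (std::ptrdiff_t i = 0; i < n; ++i)
    a[i] *= 2.0;
}
```

In this model the unconditional write-back only happens on the non-contiguous
path, which is the property the patch description relies on for array
constructors.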


[PATCH, i386]: Un-publish save/restore_multiple patterns

2019-05-29 Thread Uros Bizjak
These do not need to be public.

2019-05-29  Uroš Bizjak  

* config/i386/sse.md (*save_multiple): Rename from
save_multiple.
(*restore_multiple): Rename from restore_multiple.
(*restore_multiple_and_return): Rename from
restore_multiple_and_return.
(*restore_multiple_leave_return): Rename from
restore_multiple_leave_return.

Bootstrapped and regression tested on x86_64-linux-gnu{,-m32}.

Committed to mainline SVN.

Uros.
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index f053414a0c3b..f8e6f4c5be02 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -21889,21 +21889,21 @@
   "vpopcnt\t{%1, %0|%0, %1}")
 
 ;; Save multiple registers out-of-line.
-(define_insn "save_multiple"
+(define_insn "*save_multiple"
   [(match_parallel 0 "save_multiple"
 [(use (match_operand:P 1 "symbol_operand"))])]
   "TARGET_SSE && TARGET_64BIT"
   "call\t%P1")
 
 ;; Restore multiple registers out-of-line.
-(define_insn "restore_multiple"
+(define_insn "*restore_multiple"
   [(match_parallel 0 "restore_multiple"
 [(use (match_operand:P 1 "symbol_operand"))])]
   "TARGET_SSE && TARGET_64BIT"
   "call\t%P1")
 
 ;; Restore multiple registers out-of-line and return.
-(define_insn "restore_multiple_and_return"
+(define_insn "*restore_multiple_and_return"
   [(match_parallel 0 "restore_multiple"
 [(return)
  (use (match_operand:P 1 "symbol_operand"))
@@ -21914,7 +21914,7 @@
 
 ;; Restore multiple registers out-of-line when hard frame pointer is used,
 ;; perform the leave operation prior to returning (from the function).
-(define_insn "restore_multiple_leave_return"
+(define_insn "*restore_multiple_leave_return"
   [(match_parallel 0 "restore_multiple"
 [(return)
  (use (match_operand:P 1 "symbol_operand"))


Re: [PATCH] rs6000: Add eI constraint for 34-bit constants

2019-05-29 Thread Segher Boessenkool
Hi!

On Wed, May 29, 2019 at 01:08:28PM -0500, Bill Schmidt wrote:
> +/* Whether a given VALUE is a valid 16- or 34-bit signed offset.  EXTRA is the
> +   amount that we can't touch at the high end of the range (typically if the
> +   address is split into smaller addresses, the extra covers the addresses
> +   which might be generated when the insn is split).  */
> +#define SIGNED_16BIT_OFFSET_P(VALUE, EXTRA)  \
> +  IN_RANGE (VALUE,   \
> + ~(HOST_WIDE_INT_1 << 15),   \
> + (HOST_WIDE_INT_1 << 15) - 1 - (EXTRA))
> +
> +#define SIGNED_34BIT_OFFSET_P(VALUE, EXTRA)  \
> +  IN_RANGE (VALUE,   \
> + ~(HOST_WIDE_INT_1 << 33),   \
> + (HOST_WIDE_INT_1 << 33) - 1 - (EXTRA))

The ~ should be - I think?

Okay for trunk with that change.  Thanks,


Segher
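
Editorial note on the review comment above, as a small sketch: in two's
complement ~x equals -x - 1, so ~(1 << 15) is -32769, one below the smallest
signed 16-bit value, while -(1 << 15) is exactly -32768.  kOne is a stand-in
for HOST_WIDE_INT_1 (assumed here to be a 64-bit signed 1), and
signed_offset_p is a hypothetical function mirroring the macro with the
suggested lower bound.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for GCC's HOST_WIDE_INT_1 (assumption: 64-bit signed).
constexpr std::int64_t kOne = 1;

// Lower bound as written in the posted patch.
constexpr std::int64_t lower_bound_with_not(int bits)
{
  return ~(kOne << (bits - 1));
}

// Lower bound with the change requested in review.
constexpr std::int64_t lower_bound_with_neg(int bits)
{
  return -(kOne << (bits - 1));
}

static_assert(lower_bound_with_neg(16) == -32768, "smallest 16-bit value");
static_assert(lower_bound_with_not(16) == -32769, "one too low");
static_assert(lower_bound_with_not(34) == lower_bound_with_neg(34) - 1,
              "same off-by-one for the 34-bit case");

// Inclusive range check in the style of IN_RANGE (VALUE, LOWER, UPPER).
constexpr bool signed_offset_p(std::int64_t value, int bits,
                               std::int64_t extra)
{
  return value >= lower_bound_with_neg(bits)
         && value <= (kOne << (bits - 1)) - 1 - extra;
}
```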


Re: [Patch] Fix ix86_expand_sse_comi_round (PR Target/89750, PR Target/86444)

2019-05-29 Thread Jeff Law
On 5/9/19 10:54 PM, Hongtao Liu wrote:
> On Fri, May 10, 2019 at 3:55 AM Jeff Law  wrote:
>>
>> On 5/6/19 11:38 PM, Hongtao Liu wrote:
>>> Hi Uros and GCC:
>>>   This patch is to fix ix86_expand_sse_comi_round whose implementation
>>> was not correct.
>>>   New implementation aligns with _mm_cmp_round_s[sd]_mask.
>>>
>>> Bootstrap and regression tests for x86 are fine.
>>> Ok for trunk?
>>>
>>>
>>> ChangeLog:
>>> gcc/
>>>* config/i386/i386-expand.c (ix86_expand_sse_comi_round):
>>>Modified, original implementation isn't correct.
>>>
>>> gcc/testsuite
>>>* gcc.target/i386/avx512f-vcomisd-2.c: New runtime tests.
>>>* gcc.target/i386/avx512f-vcomiss-2.c: Likewise.
>> So you'll have to bear with me, I'm not really familiar with this code,
>> but in the absence of a maintainer I'll try to work through it.
>>
>>
>>>
>>> -- BR, Hongtao
>>>
>>>
>>> 0001-Fix-ix86_expand_sse_comi_round.patch
>>>
>>> Index: gcc/ChangeLog
>>> ===
>>> --- gcc/ChangeLog (revision 270933)
>>> +++ gcc/ChangeLog (working copy)
>>> @@ -1,3 +1,11 @@
>>> +2019-05-06  H.J. Lu  
>>> + Hongtao Liu  
>>> +
>>> + PR Target/89750
>>> + PR Target/86444
>>> + * config/i386/i386-expand.c (ix86_expand_sse_comi_round):
>>> + Modified, original implementation isn't correct.
>>> +
>>>  2019-05-06  Segher Boessenkool  
>>>
>>>   * config/rs6000/rs6000.md (FIRST_ALTIVEC_REGNO, LAST_ALTIVEC_REGNO)
>>> Index: gcc/config/i386/i386-expand.c
>>> ===
>>> --- gcc/config/i386/i386-expand.c (revision 270933)
>>> +++ gcc/config/i386/i386-expand.c (working copy)
>>> @@ -9853,18 +9853,24 @@
>>>const struct insn_data_d *insn_p = &insn_data[icode];
>>>machine_mode mode0 = insn_p->operand[0].mode;
>>>machine_mode mode1 = insn_p->operand[1].mode;
>>> -  enum rtx_code comparison = UNEQ;
>>> -  bool need_ucomi = false;
>>>
>>>/* See avxintrin.h for values.  */
>>> -  enum rtx_code comi_comparisons[32] =
>>> +  static const enum rtx_code comparisons[32] =
>> So I assume the comment refers to the _CMP_* #defines in avxintrin.h?
>>
>   Yes.
>>
>>>  {
>>> -  UNEQ, GT, GE, UNORDERED, LTGT, UNLE, UNLT, ORDERED, UNEQ, UNLT,
>>> -  UNLE, LT, LTGT, GE, GT, LT, UNEQ, GT, GE, UNORDERED, LTGT, UNLE,
>>> -  UNLT, ORDERED, UNEQ, UNLT, UNLE, LT, LTGT, GE, GT, LT
>>> +  EQ, LT, LE, UNORDERED, NE, UNGE, UNGT, ORDERED,
>>> +  EQ, UNLT, UNLE, UNORDERED, LTGT, GE, GT, ORDERED,
>>> +  EQ, LT, LE, UNORDERED, NE, UNGE, UNGT, ORDERED,
>>> +  EQ, UNLT, UNLE, UNORDERED, LTGT, GE, GT, ORDERED
>>>  };
>>
>> For CMP_EQ_UQ aren't we looking for an unordered comparison, so UNEQ
>> seems right, but you're using EQ.  Can you double-check this?  If it's
>> wrong, then please make sure we cover this case with a test.
>>
> avx512f-vcomis[sd]-2.c covers all 32 compare predicates.
> UNEQ and EQ behave differently only when either operand is NaN; otherwise
> they're the same.
> Since NaN operands are handled separately, EQ/UNEQ makes no
> difference.  That's why this passes the coverage tests.
> I'll correct it.
Ah.  Thanks.  FWIW my approach was to walk through the _CMP_* defines in
avxintrin.h and map that to what I thought the comparison ought to be.
Then I reviewed my result against your patch.  I got a couple wrong, but
could easily see my mistake.  The only one I couldn't reconcile was the
CMP_EQ_UQ.  Knowing the NaNs are handled separately makes it clear.
Thanks again.
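
Editorial sketch of the EQ/UNEQ distinction discussed above, modelled on
plain doubles rather than RTL: EQ is "ordered and equal", UNEQ is "unordered
or equal", so the two only disagree when at least one operand is a NaN.  The
function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Model of the EQ comparison: false whenever an operand is a NaN.
inline bool model_eq(double a, double b)
{
  return a == b;
}

// Model of the UNEQ comparison: true if the operands are unordered
// (at least one NaN) or compare equal.
inline bool model_uneq(double a, double b)
{
  return std::isunordered(a, b) || a == b;
}

// True when EQ and UNEQ give the same answer for this pair of operands.
inline bool eq_and_uneq_agree(double a, double b)
{
  return model_eq(a, b) == model_uneq(a, b);
}
```

If NaN operands are filtered out before the table lookup, as the reply says,
the two codes are interchangeable for the remaining inputs.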



>>
>>
>>
>>> @@ -9932,11 +10021,37 @@
>>>  }
>>>
>>>emit_insn (pat);
>>> +
>>> +  /* XXX: Set CCFPmode and check a different CCmode.  Does it work
>>> + correctly?  */
>>> +  if (GET_MODE (set_dst) != mode)
>>> +set_dst = gen_rtx_REG (mode, REGNO (set_dst));
>> This looks worrisome, even without the cryptic comment.  I don't think
>> you can just blindly change the mode like that.  Unless you happen to
>> know that the only things you test in the new mode were set in precisely
>> the same way as the old mode.
>>
> Modified as:
> +  /* NB: Set CCFPmode and check a different CCmode.  */
> +  if (GET_MODE (set_dst) != mode)
> +set_dst = gen_rtx_REG (mode, FLAGS_REG);
That might actually be worse.  The mode carries semantic information
about where to find the various condition codes within the flags
register and which condition codes are valid.  The register number
determines which (of possibly many) flags registers we are querying.

Thus if the mode of SET_DEST is not the same as MODE, then there is a
mismatch between the point where the condition codes were set and where
we want to use them.

That can only be safe if the condition codes set in the new mode are a
strict subset of the condition codes set in the old mode and they're
found in the same place within the same flags register.
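
Editorial sketch of the subset condition just stated, using a deliberately
simplified model in which a condition-code mode is nothing more than the set
of flags it makes valid.  The flag names, the CCMode struct and
can_reuse_flags are hypothetical; the model ignores the second requirement
(same location within the same flags register), which would have to be
checked separately.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag bits.
enum Flag : std::uint8_t
{
  kZero = 1,
  kNegative = 2,
  kOverflow = 4,
  kCarry = 8
};

// Hypothetical model of a CC mode: only the set of valid flags.
struct CCMode
{
  std::uint8_t valid_flags;
};

// Flags set under SET_MODE may be read under USE_MODE only if every flag
// USE_MODE relies on was actually made valid by SET_MODE.
constexpr bool can_reuse_flags(CCMode set_mode, CCMode use_mode)
{
  return (use_mode.valid_flags & ~set_mode.valid_flags) == 0;
}

// Two illustrative modes: one validating Z, N and V, one validating Z and N.
constexpr CCMode kModeZNV = { kZero | kNegative | kOverflow };
constexpr CCMode kModeZN = { kZero | kNegative };
```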

Maybe a simple example would help.  Consider a fairly standard target
with arithmetic insns that set ZNV and logicals th

Re: libbacktrace integration for _GLIBCXX_DEBUG mode

2019-05-29 Thread Jonathan Wakely

On 29/05/19 19:45 +0200, François Dumont wrote:

On 5/29/19 12:06 AM, Jonathan Wakely wrote:

On 23/05/19 07:39 +0200, François Dumont wrote:

Hi

    So here is what I came up with.

    _GLIBCXX_DEBUG_BACKTRACE controls the feature. If the user 
defines


Thanks for making this opt-in.

it and there is a detectable issue with libbacktrace then I 
generate a compilation error. I want to avoid users defining it 
but having no backtrace in the end in the debug assertion.


Why do you want to avoid that?

This means users can't just define the macro in their makefiles
unconditionally, they have to check if libbacktrace is installed and
supported. That might mean having platform-specific conditionals in
the makefile to only enable it sometimes.

What harm does it do to just ignore the _GLIBCXX_DEBUG_BACKTRACE macro
if backtraces can't be enabled? Or just use #warning instead of
#error?

What I want to avoid is PR  saying that despite 
_GLIBCXX_DEBUG_BACKTRACE being defined there isn't any backtrace 
displayed in the assertion message.


I think optimizing for "we don't want to get PRs" instead of what's
useful to users is the wrong goal.

Maybe I can still fail to compile if I can't include 
backtrace-supported.h to make clear that the -I... option is missing 
but ignore if !BACKTRACE_SUPPORTED. Would it be fine ?


How about just warnings?

#if defined(_GLIBCXX_DEBUG_BACKTRACE)
# if !defined(BACKTRACE_SUPPORTED)
#  if defined(__has_include) && !__has_include()
#   warning "_GLIBCXX_DEBUG_BACKTRACE is defined but  header from libbacktrace was not found"
#  endif
#  include 
# endif
# if !BACKTRACE_SUPPORTED
#   warning "_GLIBCXX_DEBUG_BACKTRACE is defined but libbacktrace is not supported"
# endif
# include 
#else

It might be even better to give a way to suppress those warnings:

#if defined(_GLIBCXX_DEBUG_BACKTRACE)
# if !defined(BACKTRACE_SUPPORTED)
#  if defined(__has_include) && !__has_include() \
 && !defined _GLIBCXX_DEBUG_BACKTRACE_FAIL_QUIETLY
#   warning "_GLIBCXX_DEBUG_BACKTRACE is defined but  header from libbacktrace was not found"
#  endif
#  include 
# endif
# if !BACKTRACE_SUPPORTED && !defined _GLIBCXX_DEBUG_BACKTRACE_FAIL_QUIETLY
#   warning "_GLIBCXX_DEBUG_BACKTRACE is defined but libbacktrace is not supported"
# endif
# include 
#else






    With this new setup I manage to run testsuite with it like that:

export LD_LIBRARY_PATH=/home/fdt/dev/gcc/install/lib/
make CXXFLAGS='-D_GLIBCXX_DEBUG_BACKTRACE -I/home/fdt/dev/gcc/install/include -lbacktrace' check-debug


    An example of result:

/home/fdt/dev/gcc/build/x86_64-pc-linux-gnu/libstdc++-v3/include/debug/vector:606:

In function:
    std::__debug::vector<_Tp, _Allocator>::iterator
    std::__debug::vector<_Tp, 
_Allocator>::insert(std::__debug::vector<_Tp,

    _Allocator>::const_iterator, _InputIterator, _InputIterator) [with
    _InputIterator = int*;  = void; _Tp = int;
    _Allocator = std::allocator; std::__debug::vector<_Tp,
    _Allocator>::iterator =
__gnu_debug::_Safe_iterator<__gnu_cxx::__normal_iterator >, std::__debug::vector,
    std::random_access_iterator_tag>; typename 
std::iterator_traits
    std::vector<_Tp, _Alloc>::iterator>::iterator_category =
    std::random_access_iterator_tag; typename std::vector<_Tp,
    _Alloc>::iterator = __gnu_cxx::__normal_iteratorstd::vector

    >; std::__debug::vector<_Tp, _Allocator>::const_iterator =
__gnu_debug::_Safe_iterator<__gnu_cxx::__normal_iterator >, std::__debug::vector,
    std::random_access_iterator_tag>; typename 
std::iterator_traits
    std::vector<_Tp, _Alloc>::const_iterator>::iterator_category =
    std::random_access_iterator_tag; typename std::vector<_Tp,
    _Alloc>::const_iterator = __gnu_cxx::__normal_iteratorint*, std::

    vector >]

Backtrace:
    0x402718 
__gnu_debug::_Safe_iterator<__gnu_cxx::__normal_iteratorstd::vector >> std::__debug::vector::insertvoid>(__gnu_debug::_Safe_iterator<__gnu_cxx::__normal_iteratorconst*, std::vector >>, int*, int*)

/home/fdt/dev/gcc/build/x86_64-pc-linux-gnu/libstdc++-v3/include/debug/vector:606

    0x402718 test01()
/home/fdt/dev/gcc/git/libstdc++-v3/testsuite/23_containers/vector/debug/57779_neg.cc:29

    0x401428 main
/home/fdt/dev/gcc/git/libstdc++-v3/testsuite/23_containers/vector/debug/57779_neg.cc:34


Error: attempt to insert with an iterator range [__first, __last) from this
container.

Objects involved in the operation:
    iterator "__first" @ 0x0x7fff730b96b0 {
  type = int* (mutable iterator);
    }
    iterator "__last" @ 0x0x7fff730b96b8 {
  type = int* (mutable iterator);
    }
    sequence "this" @ 0x0x7fff730b9720 {
  type = std::__debug::vector;
    }


This is nice.


Yes, I forgot to say that I made an effort to clean up the 
demangled names a little, but I think you saw it already.


Yes, all that code concerns me. I'll reply to the patch email again
...





Re: [PATCH V2] A jump threading opportunity for condition branch

2019-05-29 Thread Jeff Law
On 5/29/19 6:36 AM, Richard Biener wrote:
> On Tue, 28 May 2019, Jiufu Guo wrote:
> 
>> Hi,
>>
>> This patch implements a new jump threading opportunity for PR77820.
>> In this optimization, conditional jumps are merged with unconditional
>> jumps, and moving the CMP result to a GPR is then eliminated.
>>
>> This version is based on the proposal of Richard, Jeff and Andrew, and
>> refined to incorporate comments.  Thanks for the reviews!
>>
>> Bootstrapped and tested on powerpc64le and powerpc64be with no
>> regressions (one case is improved) and new testcases are added. Is this
>> ok for trunk?
>>
>> Example of this opportunity looks like below:
>>
>>   
>>   p0 = a CMP b
>>   goto ;
>>
>>   
>>   p1 = c CMP d
>>   goto ;
>>
>>   
>>   # phi = PHI 
>>   if (phi != 0) goto ; else goto ;
>>
>> Could be transformed to:
>>
>>   
>>   p0 = a CMP b
>>   if (p0 != 0) goto ; else goto ;
>>
>>   
>>   p1 = c CMP d
>>   if (p1 != 0) goto ; else goto ;
>>
>>
>> This optimization eliminates:
>> 1. saving CMP result: p0 = a CMP b.
>> 2. additional CMP on branch: if (phi != 0).
>> 3. converting the CMP result, if there is a phi = (INT_CONV) p0.
>>
>> Thanks!
>> Jiufu Guo
>>
>>
>> [gcc]
>> 2019-05-28  Jiufu Guo  
>>  Lijia He  
>>
>>  PR tree-optimization/77820
>>  * tree-ssa-threadedge.c
>>  (edge_forwards_cmp_to_conditional_jump_through_empty_bb_p): New
>>  function.
>>  (thread_across_edge): Add call to
>>  edge_forwards_cmp_to_conditional_jump_through_empty_bb_p.
>>
>> [gcc/testsuite]
>> 2019-05-28  Jiufu Guo  
>>  Lijia He  
>>
>>  PR tree-optimization/77820
>>  * gcc.dg/tree-ssa/phi_on_compare-1.c: New testcase.
>>  * gcc.dg/tree-ssa/phi_on_compare-2.c: New testcase.
>>  * gcc.dg/tree-ssa/phi_on_compare-3.c: New testcase.
>>  * gcc.dg/tree-ssa/phi_on_compare-4.c: New testcase.
>>  * gcc.dg/tree-ssa/split-path-6.c: Update testcase.
>>
>> ---
>>  gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-1.c | 30 ++
>>  gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-2.c | 23 +++
>>  gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-3.c | 25 
>>  gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-4.c | 40 +
>>  gcc/testsuite/gcc.dg/tree-ssa/split-path-6.c |  2 +-
>>  gcc/tree-ssa-threadedge.c| 76 +++-
>>  6 files changed, 192 insertions(+), 4 deletions(-)
>>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-1.c
>>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-2.c
>>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-3.c
>>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-4.c
>>
>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-1.c b/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-1.c
>> new file mode 100644
>> index 000..5227c87
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-1.c
>> @@ -0,0 +1,30 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-Ofast -fdump-tree-vrp1" } */
>> +
>> +void g (int);
>> +void g1 (int);
>> +
>> +void
>> +f (long a, long b, long c, long d, long x)
>> +{
>> +  _Bool t;
>> +  if (x)
>> +{
>> +  g (a + 1);
>> +  t = a < b;
>> +  c = d + x;
>> +}
>> +  else
>> +{
>> +  g (b + 1);
>> +  a = c + d;
>> +  t = c > d;
>> +}
>> +
>> +  if (t)
>> +g1 (c);
>> +
>> +  g (a);
>> +}
>> +
>> +/* { dg-final { scan-tree-dump-times "Removing basic block" 1 "vrp1" } } */
>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-2.c b/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-2.c
>> new file mode 100644
>> index 000..eaf89bb
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-2.c
>> @@ -0,0 +1,23 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-Ofast -fdump-tree-vrp1" } */
>> +
>> +void g (void);
>> +void g1 (void);
>> +
>> +void
>> +f (long a, long b, long c, long d, int x)
>> +{
>> +  _Bool t;
>> +  if (x)
>> +t = c < d;
>> +  else
>> +t = a < b;
>> +
>> +  if (t)
>> +{
>> +  g1 ();
>> +  g ();
>> +}
>> +}
>> +
>> +/* { dg-final { scan-tree-dump-times "Removing basic block" 1 "vrp1" } } */
>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-3.c b/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-3.c
>> new file mode 100644
>> index 000..d5a1e0b
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-3.c
>> @@ -0,0 +1,25 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-Ofast -fdump-tree-vrp1" } */
>> +
>> +void g (void);
>> +void g1 (void);
>> +
>> +void
>> +f (long a, long b, long c, long d, int x)
>> +{
>> +  int t;
>> +  if (x)
>> +t = a < b;
>> +  else if (d == x)
>> +t = c < b;
>> +  else
>> +t = d > c;
>> +
>> +  if (t)
>> +{
>> +  g1 ();
>> +  g ();
>> +}
>> +}
>> +
>> +/* { dg-final { scan-tree-dump-times "Removing basic block" 1 "vrp1" } } */
>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phi_on_compare-4.c 
>> b/gcc

Re: libbacktrace integration for _GLIBCXX_DEBUG mode

2019-05-29 Thread Jonathan Wakely

On 23/05/19 07:39 +0200, François Dumont wrote:

Hi

    So here is what I came up with.

    _GLIBCXX_DEBUG_BACKTRACE controls the feature. If the user defines 
it and there is a detectable issue with libbacktrace then I generate a 
compilation error. I want to avoid users defining it but having no 
backtrace in the end in the debug assertion.


    With this new setup I manage to run testsuite with it like that:

export LD_LIBRARY_PATH=/home/fdt/dev/gcc/install/lib/
make CXXFLAGS='-D_GLIBCXX_DEBUG_BACKTRACE -I/home/fdt/dev/gcc/install/include -lbacktrace' check-debug


    An example of result:

/home/fdt/dev/gcc/build/x86_64-pc-linux-gnu/libstdc++-v3/include/debug/vector:606:
In function:
    std::__debug::vector<_Tp, _Allocator>::iterator
    std::__debug::vector<_Tp, 
_Allocator>::insert(std::__debug::vector<_Tp,

    _Allocator>::const_iterator, _InputIterator, _InputIterator) [with
    _InputIterator = int*;  = void; _Tp = int;
    _Allocator = std::allocator; std::__debug::vector<_Tp,
    _Allocator>::iterator =
__gnu_debug::_Safe_iterator<__gnu_cxx::__normal_iterator >, std::__debug::vector,
    std::random_access_iterator_tag>; typename 
std::iterator_traits
    std::vector<_Tp, _Alloc>::iterator>::iterator_category =
    std::random_access_iterator_tag; typename std::vector<_Tp,
    _Alloc>::iterator = __gnu_cxx::__normal_iteratorstd::vector

    >; std::__debug::vector<_Tp, _Allocator>::const_iterator =
__gnu_debug::_Safe_iterator<__gnu_cxx::__normal_iterator >, std::__debug::vector,
    std::random_access_iterator_tag>; typename 
std::iterator_traits
    std::vector<_Tp, _Alloc>::const_iterator>::iterator_category =
    std::random_access_iterator_tag; typename std::vector<_Tp,
    _Alloc>::const_iterator = __gnu_cxx::__normal_iteratorstd::

    vector >]

Backtrace:
    0x402718 
__gnu_debug::_Safe_iterator<__gnu_cxx::__normal_iteratorstd::vector >> std::__debug::vector::insertvoid>(__gnu_debug::_Safe_iterator<__gnu_cxx::__normal_iteratorconst*, std::vector >>, int*, int*)

/home/fdt/dev/gcc/build/x86_64-pc-linux-gnu/libstdc++-v3/include/debug/vector:606
    0x402718 test01()
/home/fdt/dev/gcc/git/libstdc++-v3/testsuite/23_containers/vector/debug/57779_neg.cc:29
    0x401428 main
/home/fdt/dev/gcc/git/libstdc++-v3/testsuite/23_containers/vector/debug/57779_neg.cc:34

Error: attempt to insert with an iterator range [__first, __last) from this
container.

Objects involved in the operation:
    iterator "__first" @ 0x0x7fff730b96b0 {
  type = int* (mutable iterator);
    }
    iterator "__last" @ 0x0x7fff730b96b8 {
  type = int* (mutable iterator);
    }
    sequence "this" @ 0x0x7fff730b9720 {
  type = std::__debug::vector;
    }
XFAIL: 23_containers/vector/debug/57779_neg.cc execution test


    * include/debug/formatter.h [_GLIBCXX_DEBUG_BACKTRACE]: Include
     and .
    [!_GLIBCXX_DEBUG_BACKTRACE]: Include .
    [!_GLIBCXX_DEBUG_BACKTRACE](backtrace_error_callback): New.
    [!_GLIBCXX_DEBUG_BACKTRACE](backtrace_full_callback): New.
    [!_GLIBCXX_DEBUG_BACKTRACE](struct backtrace_state): New declaration.
    (_Error_formatter::_Bt_full_t): New function pointer type.
    (_Error_formatter::_M_print_backtrace): New.
    (_Error_formatter::_M_backtrace_state): New.
    (_Error_formatter::_M_backtrace_full_func): New.
    * src/c++11/debug.cc: Include  and .
    (PrintContext::_M_demangle_name): New.
    (_Print_func_t): New.
    (print_word(PrintContext&, const char*)): New.
    (print_raw(PrintContext&, const char*)): New.
    (print_function(PrintContext&, const char*, _Print_func_t)): New.
    (print_type): Use latter.
    (print_string(PrintContext&, const char*)): New.
    (print_backtrace(void*, uintptr_t, const char*, int, const char*)):
    New.
    (_Error_formatter::_M_error()): Adapt.
    * doc/xml/manual/debug_mode.xml: Document _GLIBCXX_DEBUG_BACKTRACE.

Tested under Linux x86_64.

Ok to commit ?

François


On 12/21/18 10:03 PM, Jonathan Wakely wrote:

On 21/12/18 22:47 +0200, Ville Voutilainen wrote:
On Fri, 21 Dec 2018 at 22:35, Jonathan Wakely  wrote:

    I also explicitly define BACKTRACE_SUPPORTED to 0 to make sure
libstdc++ has no libbacktrace dependency after usual build.



I'm concerned about the requirement to link to libbacktrace
explicitly (which will break existing makefiles and build systems that
currently use debug mode in testing).


But see what Francois wrote, "I also explicitly define
BACKTRACE_SUPPORTED to 0 to make sure
libstdc++ has no libbacktrace dependency after usual build."


Yes, but if you happen to install libbacktrace headers, the behaviour
for users building their own code changes. I agree that if you install
those headers, it's probably for a reason, but it might be a different
reason to "so that libstdc++ prints better backtraces".


Also, some of the glibc team pointed out to me that running *any*
extra code after undefined behaviour has been detected is a potential
risk. The less that you do between detecting UB and calling abort()

Re: Teach same_types_for_tbaa to structurally compare arrays, pointers and vectors

2019-05-29 Thread Jan Hubicka
Hi,
this is a variant of the testcase I have committed. Once Martin implements the SRA
part, we could add a next variant that drops -fno-tree-sra.

It seems odd that constant propagation only happens in fre3.
I would expect fre1 to discover this already.
The IL before fre1 and 3 differs only by:

 test ()
 {
   struct foo foo;
   struct bar * barptr.0_1;
   struct foo * fooptr.1_2;
-  struct bar * barptr.2_3;
-  int _8;
+  int _7;
 
-   :
+   [local count: 1073741824]:
   foo.val = 0;
   barptr.0_1 = barptr;
   barptr.0_1->val2 = 123;
   fooptr.1_2 = fooptr;
   *fooptr.1_2 = foo;
-  barptr.2_3 = barptr;
-  _8 = barptr.2_3->val2;
+  _7 = barptr.0_1->val2;
   foo ={v} {CLOBBER};
-  return _8;
+  return _7;
 
 }

Why is VN not able to optimize the barptr access and look up through
it at once?  It looks like that could potentially save some need to re-run
GVN, since it is common to store pointers to memory and use them multiple
times to access other pointers.

* tree-ssa/alias-access-path-1.c: New testcase.

Index: gcc.dg/tree-ssa/alias-access-path-1.c
===
--- gcc.dg/tree-ssa/alias-access-path-1.c   (nonexistent)
+++ gcc.dg/tree-ssa/alias-access-path-1.c   (working copy)
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-fre3 -fno-tree-sra" } */
+struct foo
+{
+  int val;
+} *fooptr;
+struct bar
+{
+  struct foo foo;
+  int val2;
+} *barptr;
+int
+test ()
+{
+  struct foo foo = { 0 };
+  barptr->val2 = 123;
+  *fooptr = foo;
+  return barptr->val2;
+}
+
+/* { dg-final { scan-tree-dump-times "return 123" 1 "fre3"} } */


RFC: [PATCH] Remove using-declarations that add std names to __gnu_cxx

2019-05-29 Thread Jonathan Wakely

These using-declarations appear to have been added for simplicity when
moving the non-standard extensions from namespace std to namespace
__gnu_cxx. Dumping all these names into namespace __gnu_cxx allows
uses like __gnu_cxx::size_t and __gnu_cxx::pair, which serve no useful
purpose but allow creating unnecessarily unportable code.

This patch removes most of the using-declarations from namespace scope,
then either qualifies names as needed or adds using-declarations at
block scope or typedefs at class scope.

* include/backward/hashtable.h (size_t, ptrdiff_t)
(forward_iterator_tag, input_iterator_tag, _Construct, _Destroy)
(distance, vector, pair, __iterator_category): Remove
using-declarations that add these names to namespace __gnu_cxx.
* include/ext/bitmap_allocator.h (size_t, ptrdiff_t): Likewise.
* include/ext/debug_allocator.h (size_t): Likewise.
* include/ext/functional (size_t, unary_function, binary_function)
(mem_fun1_t, const_mem_fun1_t, mem_fun1_ref_t, const_mem_fun1_ref_t):
Likewise.
* include/ext/malloc_allocator.h (size_t, ptrdiff_t): Likewise.
* include/ext/memory (ptrdiff_t, pair, __iterator_category): Likewise.
* include/ext/mt_allocator.h (size_t, ptrdiff_t): Likewise.
* include/ext/new_allocator.h (size_t, ptrdiff_t): Likewise.
* include/ext/numeric (iota): Fix outdated comment.
* include/ext/pool_allocator.h (size_t, ptrdiff_t): Likewise.
* include/ext/rb_tree (_Rb_tree, allocator): Likewise.
* include/ext/rope (size_t, ptrdiff_t, allocator, _Destroy): Likewise.
* include/ext/ropeimpl.h (size_t, printf, basic_ostream)
(__throw_length_error, _Destroy, std::__uninitialized_fill_n_a):
Likewise.
* include/ext/slist (size_t, ptrdiff_t, _Construct, _Destroy)
(allocator, __true_type, __false_type): Likewise.

Does anybody think we should keep __gnu_cxx::size_t,
__gnu_cxx::input_iterator_tag, __gnu_cxx::vector, __gnu_cxx::pair etc.
or should I go ahead and commit this?


commit ab8f45c41ec4ee31e55080af79b728dcb556ef2a
Author: Jonathan Wakely 
Date:   Wed May 29 15:56:59 2019 +0100

Remove using-declarations that add std names to __gnu_cxx

These using-declarations appear to have been added for simplicity when
moving the non-standard extensions from namespace std to namespace
__gnu_cxx. Dumping all these names into namespace __gnu_cxx allows
unportable uses like __gnu_cxx::size_t and __gnu_cxx::pair, which serve
no useful purpose.

This patch removes most of the using-declarations from namespace scope,
then either qualifies names as needed or adds using-declarations at
block scope or typedefs at class scope.

* include/backward/hashtable.h (size_t, ptrdiff_t)
(forward_iterator_tag, input_iterator_tag, _Construct, _Destroy)
(distance, vector, pair, __iterator_category): Remove
using-declarations that add these names to namespace __gnu_cxx.
* include/ext/bitmap_allocator.h (size_t, ptrdiff_t): Likewise.
* include/ext/debug_allocator.h (size_t): Likewise.
* include/ext/functional (size_t, unary_function, binary_function)
(mem_fun1_t, const_mem_fun1_t, mem_fun1_ref_t, const_mem_fun1_ref_t):
Likewise.
* include/ext/malloc_allocator.h (size_t, ptrdiff_t): Likewise.
* include/ext/memory (ptrdiff_t, pair, __iterator_category): Likewise.
* include/ext/mt_allocator.h (size_t, ptrdiff_t): Likewise.
* include/ext/new_allocator.h (size_t, ptrdiff_t): Likewise.
* include/ext/numeric (iota): Fix outdated comment.
* include/ext/pool_allocator.h (size_t, ptrdiff_t): Likewise.
* include/ext/rb_tree (_Rb_tree, allocator): Likewise.
* include/ext/rope (size_t, ptrdiff_t, allocator, _Destroy): Likewise.
* include/ext/ropeimpl.h (size_t, printf, basic_ostream)
(__throw_length_error, _Destroy, std::__uninitialized_fill_n_a):
Likewise.
* include/ext/slist (size_t, ptrdiff_t, _Construct, _Destroy)
(allocator, __true_type, __false_type): Likewise.

diff --git a/libstdc++-v3/include/backward/hashtable.h b/libstdc++-v3/include/backward/hashtable.h
index 28c487a71c2..df6ad85191c 100644
--- a/libstdc++-v3/include/backward/hashtable.h
+++ b/libstdc++-v3/include/backward/hashtable.h
@@ -69,17 +69,6 @@ namespace __gnu_cxx _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
-  using std::size_t;
-  using std::ptrdiff_t;
-  using std::forward_iterator_tag;
-  using std::input_iterator_tag;
-  using std::_Construct;
-  using std::_Destroy;
-  using std::distance;
-  using std::vector;
-  using std::pair;
-  using std::__iterator_category;
-
   template
 struct _Hashtable_node
 {
@@ -112,10 +101,10 @@ _GLIBCXX_BEGIN_NAME

Re: RFC: [PATCH] Remove using-declarations that add std names to __gnu_cxx

2019-05-29 Thread Ville Voutilainen
On Wed, 29 May 2019 at 23:00, Jonathan Wakely  wrote:
> Does anybody think we should keep __gnu_cxx::size_t,
> __gnu_cxx::input_iterator_tag, __gnu_cxx::vector, __gnu_cxx::pair etc.
> or should I go ahead and commit this?

+1 go ahead.


Re: [PATCH] A jump threading opportunity for condition branch

2019-05-29 Thread Jeff Law
On 5/21/19 7:44 AM, Jiufu Guo wrote:
> Hi,
> 
> This patch implements a new opportunity of jump threading for PR77820.
> In this optimization, conditional jumps are merged with unconditional jump.
> And then moving CMP result to GPR is eliminated.
> 
> It looks like below:
> 
>   
>   p0 = a CMP b
>   goto ;
> 
>   
>   p1 = c CMP d
>   goto ;
> 
>   
>   # phi = PHI 
>   if (phi != 0) goto ; else goto ;
> 
> Could be transformed to:
> 
>   
>   p0 = a CMP b
>   if (p0 != 0) goto ; else goto ;
> 
>   
>   p1 = c CMP d
>   if (p1 != 0) goto ; else goto ;
A few high level notes.

I think LLVM does this in their jump threading pass as well, mostly
because it enables discovering additional jump threading opportunities
IIRC.   But it appears to me to be inherently good on its own as well as
it eliminates a dynamic unconditional jump.

It's also the case that after this transformation we may be able to
combine the assignment and test resulting in something like this:

>   
>   if (a CMP b) goto ; else goto ;
>
>   
>   if (c CMP d) goto ; else goto ;
Which is inherently good *and* the blocks no longer have side effects
which can have secondary positive effects in the jump threader.

I wouldn't be surprised if this was particularly useful for chained
boolean logical tests where some of the arms collapse down to single tests.
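
To make the shape concrete, here is a minimal C sketch of the transformation at source level.  It is not taken from the patch: the names before_threading, after_threading and threading_self_check are hypothetical, and "<" merely stands in for an arbitrary CMP.

```c
#include <assert.h>
#include <stdbool.h>

/* Before: each arm materializes its comparison result and jumps
   unconditionally to a join block, which tests the merged value
   (the PHI in GIMPLE terms).  */
static int
before_threading (bool sel, int a, int b, int c, int d)
{
  bool phi;
  if (sel)
    phi = a < b;		/* p0 = a CMP b; goto join  */
  else
    phi = c < d;		/* p1 = c CMP d; goto join  */
  if (phi != 0)			/* join: if (phi != 0) ...  */
    return 1;
  return 0;
}

/* After: the conditional is copied into each arm, so every arm
   branches directly on its own comparison and the join block with
   its unconditional incoming jumps disappears.  */
static int
after_threading (bool sel, int a, int b, int c, int d)
{
  if (sel)
    {
      if (a < b)
	return 1;
      return 0;
    }
  if (c < d)
    return 1;
  return 0;
}

/* Exhaustively compare both forms over a small input range.  */
static void
threading_self_check (void)
{
  for (int sel = 0; sel <= 1; sel++)
    for (int a = -1; a <= 1; a++)
      for (int b = -1; b <= 1; b++)
	for (int c = -1; c <= 1; c++)
	  for (int d = -1; d <= 1; d++)
	    assert (before_threading (sel, a, b, c, d)
		    == after_threading (sel, a, b, c, d));
}
```

Calling threading_self_check () confirms the two forms agree on every input in the tested range; the second form is the one where assignment and test can be folded into a single compare-and-branch.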

Jeff


Re: [PATCH] A jump threading opportunity for condition branch

2019-05-29 Thread Jeff Law
On 5/23/19 6:05 AM, Jiufu Guo wrote:
> Hi,
> 
> Richard Biener  writes:
> 
>> On Tue, 21 May 2019, Jiufu Guo wrote:
>>

>>>  
>>> +/* Return true if PHI's INDEX-th incoming value is a CMP, and the CMP is
>>> +   defined in the incoming basic block. Otherwise return false.  */
>>> +static bool
>>> +cmp_from_unconditional_block (gphi *phi, int index)
>>> +{
>>> +  tree value = gimple_phi_arg_def (phi, index);
>>> +  if (!(TREE_CODE (value) == SSA_NAME && has_single_use (value)))
>>> +return false;
>> Not sure why we should reject a constant here but I guess we
>> expect it to find a simplified condition anyways ;)
>>
> Const could be accepted here, like "# t_9 = PHI <5(3), t_17(4)>". I
> found this case is already handled by other jump-threading code, like
> 'ethread' pass.
Right.  There's no need to handle constants here.  They'll result in
trivially discoverable jump threading opportunities.

>>> +  /* Check if phi's incoming value is defined in the incoming basic_block.  */
>>> +  edge e = gimple_phi_arg_edge (phi, index);
>>> +  if (def->bb != e->src)
>>> +return false;
>> why does this matter?
>>
> Through preparing pathes and duplicating block, this transform can also
> help to combine a cmp in previous block and a gcond in current block.
> "if (def->bb != e->src)" make sure the cmp is define in the incoming
> block of the current; and then combining "cmp with gcond" is safe.  If
> the cmp is defined far from the incoming block, it would be hard to
> achieve the combining, and the transform may not needed.
I don't think it's strictly needed in the long term and could be
addressed in a follow-up if we can find cases where it helps.  I think
we'd just need to double check insertion of the new conditional branch
to relax this if we cared.

However, I would expect sinking to have done its job here and would be
surprised if trying to handle this actually improved any real world code.
> 
>>> +
>>> +  if (!single_succ_p (def->bb))
>>> +return false;
>> Or this?  The actual threading will ensure this will hold true.
>>
> Yes, other thread code check this and ensure it to be true, like
> function thread_through_normal_block. Since this new function is invoked
> outside thread_through_normal_block, so, checking single_succ_p is also
> needed for this case.
Agreed that it's needed.  Consider if the source block has multiple
successors.  Where do we insert the copy of the conditional branch?


>>> +{
>>> +  gimple *gs = last_and_only_stmt (bb);
>>> +  if (gs == NULL)
>>> +return false;
>>> +
>>> +  if (gimple_code (gs) != GIMPLE_COND)
>>> +return false;
>>> +
>>> +  tree cond = gimple_cond_lhs (gs);
>>> +
>>> +  if (TREE_CODE (cond) != SSA_NAME)
>>> +return false;
>> space after if( too much vertical space in this function
>> for my taste btw.
> Will update this.
>> For the forwarding to work we want a NE_EXPR or EQ_EXPR
>> as gimple_cond_code and integer_one_p or integer_zero_p
>> gimple_cond_rhs.
> Right, checking those would be more safe.  Since no issue found, during
> bootstrap and regression tests, so I did not add these checking.  I will
> add this checking.
Definitely want to verify that we're dealing with an equality test
against 0/1.
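
As a rough illustration of the check being asked for, here is a toy model in plain C.  It deliberately does not use GCC's internal API: struct toy_cond, enum toy_code and toy_cond_forwardable_p are hypothetical stand-ins for the GIMPLE_COND accessors, and the sketch only encodes the rule "SSA name on the left, EQ or NE as the code, constant 0 or 1 on the right".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the comparison code of a conditional.  */
enum toy_code { TOY_EQ, TOY_NE, TOY_LT, TOY_GT };

/* Hypothetical stand-in for a conditional statement "lhs CODE rhs".  */
struct toy_cond
{
  enum toy_code code;
  bool lhs_is_ssa_name;	/* Left operand is a single SSA name.  */
  bool rhs_is_constant;	/* Right operand is an integer constant.  */
  long rhs_value;	/* Value of that constant, if any.  */
};

/* Return true if COND is an equality or inequality test of an SSA
   name against the constant 0 or 1, which is the only form the
   forwarding described above can handle.  */
static bool
toy_cond_forwardable_p (const struct toy_cond *cond)
{
  if (!cond->lhs_is_ssa_name)
    return false;
  if (cond->code != TOY_EQ && cond->code != TOY_NE)
    return false;
  if (!cond->rhs_is_constant)
    return false;
  return cond->rhs_value == 0 || cond->rhs_value == 1;
}

/* Small self-check covering one accepted and two rejected forms.  */
static void
toy_cond_self_check (void)
{
  struct toy_cond ne_zero = { TOY_NE, true, true, 0 };
  struct toy_cond lt_zero = { TOY_LT, true, true, 0 };
  struct toy_cond eq_five = { TOY_EQ, true, true, 5 };
  assert (toy_cond_forwardable_p (&ne_zero));
  assert (!toy_cond_forwardable_p (&lt_zero));
  assert (!toy_cond_forwardable_p (&eq_five));
}
```

In the real pass the equivalent tests would be made on the GIMPLE_COND itself; the model only shows which combinations should be accepted.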

Jeff


Re: [PATCH] A jump threading opportunity for condition branch

2019-05-29 Thread Jeff Law
On 5/23/19 6:11 AM, Richard Biener wrote:
> On Thu, 23 May 2019, Jiufu Guo wrote:
> 
>> Hi,
>>
>> Richard Biener  writes:
>>
>>> On Tue, 21 May 2019, Jiufu Guo wrote:

 +}
 +
 +  if (TREE_CODE_CLASS (gimple_assign_rhs_code (def)) != tcc_comparison)
 +return false;
 +
 +  /* Check if phi's incoming value is defined in the incoming basic_block.  */
 +  edge e = gimple_phi_arg_edge (phi, index);
 +  if (def->bb != e->src)
 +return false;
>>> why does this matter?
>>>
>> Through preparing pathes and duplicating block, this transform can also
>> help to combine a cmp in previous block and a gcond in current block.
>> "if (def->bb != e->src)" make sure the cmp is define in the incoming
>> block of the current; and then combining "cmp with gcond" is safe.  If
>> the cmp is defined far from the incoming block, it would be hard to
>> achieve the combining, and the transform may not needed.
> We're in SSA form so the "combining" doesn't really care where the
> definition comes from.
Combining doesn't care, but we need to make sure the copy of the
conditional ends up in the right block since it wouldn't necessarily be
associated with def->bb anymore.  But I'd expect the sinking pass to
make this a non-issue in practice anyway.

> 
 +
 +  if (!single_succ_p (def->bb))
 +return false;
>>> Or this?  The actual threading will ensure this will hold true.
>>>
>> Yes, other thread code check this and ensure it to be true, like
>> function thread_through_normal_block. Since this new function is invoked
>> outside thread_through_normal_block, so, checking single_succ_p is also
>> needed for this case.
> I mean threading will isolate the path making this trivially true.
> It's also no requirement for combining, in fact due to the single-use
> check the definition can be sinked across the edge already (if
> the edges dest didn't have multiple predecessors which this threading
> will fix as well).
I don't think so.  The CMP source block could end with a call and have
an abnormal edge (for example).  We can't put the copied conditional
before the call and putting it after the call essentially means creating
a new block.

The CMP source block could also end with a conditional.  Where do we put
the one we want to copy into the CMP source block in that case? :-)

This is something else we'd want to check if we ever allowed the CMP
defining block to not be the immediate predecessor of the conditional
jump block.  If we did that we'd need to validate that the block where
we're going to insert the copy of the jump has a single successor.


Jeff


Re: [PATCH] A jump threading opportunity for condition branch

2019-05-29 Thread Jeff Law
On 5/24/19 6:45 AM, Richard Biener wrote:
[ Aggressive snipping ]

> As said in my first review I'd just check whether for the
> edge we want to thread through the definition comes from a CMP.
> Suppose you have
> 
>  # val_1 = PHI 
>  if (val_1 != 0)
> 
> and only one edge has a b_3 = d_5 != 0 condition it's still
> worth tail-duplicating the if block.
Agreed.  The cost of tail duplicating here is so small we should be
doing it highly aggressively.  About the only case where we might not
want to would be if we're optimizing for size rather than speed.  That
case isn't clearly a win either way.

jeff


Re: [PATCH] rs6000: Add undocumented switch -mprefixed-addr

2019-05-29 Thread Bill Schmidt
On 5/29/19 8:16 AM, Segher Boessenkool wrote:
> Hi!
>
> On Wed, May 29, 2019 at 07:42:38AM -0500, Bill Schmidt wrote:
>>  * rs6000-cpus.def (OTHER_FUSION_MASKS): New #define.
>>  (ISA_FUTURE_MASKS_SERVER): Add OPTION_MASK_PREFIXED_ADDR. Mask off
>>  OTHER_FUSION_MASKS.
> Two spaces after a full stop (here and later again).

Oops, yep.
>
>> +/* ISA masks setting fusion options.  */
>> +#define OTHER_FUSION_MASKS  (OPTION_MASK_P8_FUSION  \
>> + | OPTION_MASK_P8_FUSION_SIGN)
> Or merge the two masks into one?

I'll ask Mike to explain this, as I don't know why there are two masks.
>
>>  /* Support for a future processor's features.  */
>> -#define ISA_FUTURE_MASKS_SERVER (ISA_3_0_MASKS_SERVER   \
>> - | OPTION_MASK_FUTURE   \
>> - | OPTION_MASK_PCREL)
>> +#define ISA_FUTURE_MASKS_SERVER ((ISA_3_0_MASKS_SERVER  \
>> +  | OPTION_MASK_FUTURE  \
>> +  | OPTION_MASK_PCREL   \
>> +  | OPTION_MASK_PREFIXED_ADDR)  \
>> + & ~OTHER_FUSION_MASKS)
> OTHER_FUSION_MASKS shouldn't be part of ISA_3_0_MASKS_SERVER.  Fix that
> instead?  Fusion is a property of specific CPUs, not of ISA versions.

Agreed, I think this should be masked out of ISA_3_0_MASKS_SERVER, but
would like Mike to agree before I change it in case I'm missing
something obvious.
>
>> -  /* -mpcrel requires the prefixed load/store support on FUTURE systems.  */
>> -  if (!TARGET_FUTURE && TARGET_PCREL)
>> +  /* -mprefixed-addr and -mpcrel require the prefixed load/store support on
>> + FUTURE systems.  */
>> +  if (!TARGET_FUTURE && (TARGET_PCREL || TARGET_PREFIXED_ADDR))
>>  {
>>if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
>>  error ("%qs requires %qs", "-mpcrel", "-mcpu=future");
> PCREL requires PREFIXED_ADDR, please simplify.
>
>> +  if (TARGET_PCREL && !TARGET_PREFIXED_ADDR)
>> +{
>> +  if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
>> +error ("%qs requires %qs", "-mpcrel", "-mprefixed-addr");
>> +
>>rs6000_isa_flags &= ~OPTION_MASK_PCREL;
>>  }
> Maybe put this test first, if that makes things easier or more logical?
ok
>
>> @@ -36379,6 +36391,7 @@ static struct rs6000_opt_mask const rs6000_opt_masks[] =
>>    { "power9-vector",OPTION_MASK_P9_VECTOR,  false, true  },
>>    { "powerpc-gfxopt",   OPTION_MASK_PPC_GFXOPT, false, true  },
>>    { "powerpc-gpopt",OPTION_MASK_PPC_GPOPT,  false, true  },
>> +  { "prefixed-addr",OPTION_MASK_PREFIXED_ADDR,  false, true  },
> Do we want this?  Why?

Performance folks are using it for testing purposes.  Eventually this
will probably drop out, but for now I think it's best to have the
undocumented switch.

Thanks,
Bill
>
>
> Segher
>



Re: RFC: [PATCH] Remove using-declarations that add std names to __gnu_cxx

2019-05-29 Thread Thomas Rodgers
Concur

Ville Voutilainen writes:

> On Wed, 29 May 2019 at 23:00, Jonathan Wakely  wrote:
>> Does anybody think we should keep __gnu_cxx::size_t,
>> __gnu_cxx::input_iterator_tag, __gnu_cxx::vector, __gnu_cxx::pair etc.
>> or should I go ahead and commit this?
>
> +1 go ahead.



Re: Simplify more EXACT_DIV_EXPR comparisons

2019-05-29 Thread Marc Glisse

On Mon, 20 May 2019, Richard Biener wrote:


On Mon, May 20, 2019 at 10:16 AM Marc Glisse  wrote:


On Mon, 20 May 2019, Richard Biener wrote:


On Sun, May 19, 2019 at 6:16 PM Marc Glisse  wrote:


Hello,

2 pieces:

- the first one handles the case where the denominator is negative. It
doesn't happen often with exact_div, so I don't handle it everywhere, but
this one looked trivial

- handle the case where a pointer difference is cast to an unsigned type
before being compared to a constant (I hit this in std::vector). With some
range info we could probably handle some non-constant cases as well...

The second piece breaks Walloca-13.c (-Walloca-larger-than=100 -O2)

void f (void*);
void g (int *p, int *q)
{
   __SIZE_TYPE__ n = (__SIZE_TYPE__)(p - q);
   if (n < 100)
 f (__builtin_alloca (n));
}

At the time of walloca2, we have

   _1 = p_5(D) - q_6(D);
   # RANGE [-2305843009213693952, 2305843009213693951]
   _2 = _1 /[ex] 4;
   # RANGE ~[2305843009213693952, 16140901064495857663]
   n_7 = (long unsigned intD.10) _2;
   _11 = (long unsigned intD.10) _1;
   if (_11 <= 396)
[...]
   _3 = allocaD.1059 (n_7);

and warn.
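
The comparison against 396 in the dump follows from the element size: with 4-byte ints, an element count below 100 corresponds to a byte difference of at most 99 * 4 = 396.  The sketch below checks that equivalence in plain C under stated assumptions: a 4-byte element size, byte differences that are exact multiples of 4 (which is what the exact division guarantees), and hypothetical helper names count_form, byte_form and exact_div_self_check.

```c
#include <assert.h>
#include <stdint.h>

/* Original form: divide the byte difference by the element size,
   convert the element count to unsigned, compare with 100.  */
static int
count_form (int64_t byte_diff)
{
  return (uint64_t) (byte_diff / 4) < 100;
}

/* Simplified form: convert the byte difference itself to unsigned
   and compare with (100 - 1) * 4 = 396.  */
static int
byte_form (int64_t byte_diff)
{
  return (uint64_t) byte_diff <= 396;
}

/* Check that both forms agree for every multiple of 4 in a range
   that covers negative, in-bounds and out-of-bounds differences.  */
static void
exact_div_self_check (void)
{
  for (int64_t d = -4000; d <= 4000; d += 4)
    assert (count_form (d) == byte_form (d));
}
```

Negative differences become very large unsigned values in both forms, so both reject them; the equivalence relies on the difference being an exact multiple of the element size.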


That's indeed too complicated a relation of _11 to n_7 for
VRP predicate discovery.


However, DOM3 later produces

   _1 = p_5(D) - q_6(D);
   _11 = (long unsigned intD.10) _1;
   if (_11 <= 396)


while _11 vs. _1 works fine.


[...]
   # RANGE [0, 99] NONZERO 127
   _2 = _1 /[ex] 4;
   # RANGE [0, 99] NONZERO 127
   n_7 = (long unsigned intD.10) _2;
   _3 = allocaD.1059 (n_7);

so I am tempted to say that the walloca2 pass is too early, xfail the
testcase and file an issue...


Hmm, there's a DOM pass before walloca2 already and moving
walloca2 after loop opts doesn't look like the best thing to do?
I suppose it's not DOM but sinking that does the important transform
here?  That is,

Index: gcc/passes.def
===
--- gcc/passes.def  (revision 271395)
+++ gcc/passes.def  (working copy)
@@ -241,9 +241,9 @@ along with GCC; see the file COPYING3.
  NEXT_PASS (pass_optimize_bswap);
  NEXT_PASS (pass_laddress);
  NEXT_PASS (pass_lim);
-  NEXT_PASS (pass_walloca, false);
  NEXT_PASS (pass_pre);
  NEXT_PASS (pass_sink_code);
+  NEXT_PASS (pass_walloca, false);
  NEXT_PASS (pass_sancov);
  NEXT_PASS (pass_asan);
  NEXT_PASS (pass_tsan);

fixes it?


I will check, but I don't think walloca uses any kind of on-demand VRP, so
we still need some pass to update the ranges after sinking, which doesn't
seem to happen until the next DOM pass.


Oh, ok...  Aldy, why's this a separate pass anyways?  I think similar
other warnings are emitted from RTL expansion?  So maybe we can
indeed move the pass towards warn_restrict or late_warn_uninit.


I tried moving it after 'sink' and that didn't help. Moving it next to 
warn_restrict works for this test but breaks 2 others that currently 
"work" by accident (+ one where the message changes between "unbounded" 
and "too large", it isn't clear what the difference is between those 
messages).


My suggestion, in addition to the original patch, is

* gcc.dg/Walloca-13.c: Xfail.

--- Walloca-13.c(revision 271742)
+++ Walloca-13.c(working copy)
@@ -1,12 +1,12 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target alloca } */
 /* { dg-options "-Walloca-larger-than=100 -O2" } */

 void f (void*);

 void g (int *p, int *q)
 {
   __SIZE_TYPE__ n = (__SIZE_TYPE__)(p - q);
   if (n < 100)
-f (__builtin_alloca (n));
+f (__builtin_alloca (n)); // { dg-bogus "may be too large due to conversion" "" { xfail { *-*-* } } }
 }

Is that ok?


I also see that the Og pass pipeline misses the second walloca pass
completely (and also the warn_restrict pass).

Given code sinking's obvious effects on SSA value-range representation
it may make sense to add another instance of that pass earlier.


--
Marc Glisse

