Re: Implement intraprocedural dataflow for ipa-modref EAF analyser

2021-11-07 Thread Jan Hubicka via Gcc-patches
Hi,
I have commited the patch now.  On the current tree the patch causes new
failure
./gcc/testsuite/gfortran/gfortran.sum:FAIL: gfortran.dg/vector_subscript_1.f90  
 -O1  execution test
./gcc/testsuite/gfortran/gfortran.sum:FAIL: gfortran.dg/vector_subscript_1.f90  
 -O2  execution test
./gcc/testsuite/gfortran/gfortran.sum:FAIL: gfortran.dg/vector_subscript_1.f90  
 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer 
-finline-functions  execution test
./gcc/testsuite/gfortran/gfortran.sum:FAIL: gfortran.dg/vector_subscript_1.f90  
 -O3 -g  execution test

This is however just semi-latent bug that got triggered and same problem
reproduces with -fipa-pta and is tracked in PR86389
The problem is that Fortran builds a trailling array that is not
discovered as trailling by the middle-end and dom pass does invalid
propagation.

Same problem is triggered by modref because in order to make this happen
we need to discover that nested function test is not writting into
arrays a and b.

Honza


Fix inter-procedural EAF flags propagation with respect to !binds_to_current_def_p

2021-11-07 Thread Jan Hubicka via Gcc-patches
Hi,
while proofreading the code for handling EAF flags of
!binds_to_current_def_p I noticed that the interprocedural dataflow
actually ignores the flag possibly introducing wrong code on nterposable
functions in non-trivial recursion cycles or at ltrans partition
boundary.

This patch unifies the flags changes to single place (remove_useless_eaf_flags)
and does extend modref_merge_call_site_flags to do the right thing.

lto-bootstrapped/regtested x86_64-linux.  Plan to commit it today after bit
more testing (firefox/clang build).

gcc/ChangeLog:

* gimple.c (gimple_call_arg_flags): Use interposable_eaf_flags.
(gimple_call_retslot_flags): Likewise.
(gimple_call_static_chain_flags): Likewise.
* ipa-modref.c (remove_useless_eaf_flags): Do not remove everything for
NOVOPS.
(modref_summary::useful_p): Likewise.
(modref_summary_lto::useful_p): Likewise.
(analyze_parms): Do not kive up on NOVOPS.
(modref_merge_call_site_flags): Compute implicit eaf flags
based on callee ecf_flags and fnspec; if the function does not
bind to current defs use interposable_eaf_flags.
(modref_propagate_flags_in_scc): Update.
* ipa-modref.h (interposable_eaf_flags): New function.

diff --git a/gcc/gimple.c b/gcc/gimple.c
index 7a578f5113e..3d1d3a15b2c 100644
--- a/gcc/gimple.c
+++ b/gcc/gimple.c
@@ -1595,17 +1595,7 @@ gimple_call_arg_flags (const gcall *stmt, unsigned arg)
 
  /* We have possibly optimized out load.  Be conservative here.  */
  if (!node->binds_to_current_def_p ())
-   {
- if ((modref_flags & EAF_UNUSED) && !(flags & EAF_UNUSED))
-   {
- modref_flags &= ~EAF_UNUSED;
- modref_flags |= EAF_NOESCAPE;
-   }
- if ((modref_flags & EAF_NOREAD) && !(flags & EAF_NOREAD))
-   modref_flags &= ~EAF_NOREAD;
- if ((modref_flags & EAF_DIRECT) && !(flags & EAF_DIRECT))
-   modref_flags &= ~EAF_DIRECT;
-   }
+   modref_flags = interposable_eaf_flags (modref_flags, flags);
  if (dbg_cnt (ipa_mod_ref_pta))
flags |= modref_flags;
}
@@ -1633,13 +1623,7 @@ gimple_call_retslot_flags (const gcall *stmt)
 
  /* We have possibly optimized out load.  Be conservative here.  */
  if (!node->binds_to_current_def_p ())
-   {
- if ((modref_flags & EAF_UNUSED) && !(flags & EAF_UNUSED))
-   {
- modref_flags &= ~EAF_UNUSED;
- modref_flags |= EAF_NOESCAPE;
-   }
-   }
+   modref_flags = interposable_eaf_flags (modref_flags, flags);
  if (dbg_cnt (ipa_mod_ref_pta))
flags |= modref_flags;
}
@@ -1665,19 +1649,9 @@ gimple_call_static_chain_flags (const gcall *stmt)
{
  int modref_flags = summary->static_chain_flags;
 
- /* We have possibly optimized out load.  Be conservative here.  */
+ /* ??? Nested functions should always bind to current def.  */
  if (!node->binds_to_current_def_p ())
-   {
- if ((modref_flags & EAF_UNUSED) && !(flags & EAF_UNUSED))
-   {
- modref_flags &= ~EAF_UNUSED;
- modref_flags |= EAF_NOESCAPE;
-   }
- if ((modref_flags & EAF_NOREAD) && !(flags & EAF_NOREAD))
-   modref_flags &= ~EAF_NOREAD;
- if ((modref_flags & EAF_DIRECT) && !(flags & EAF_DIRECT))
-   modref_flags &= ~EAF_DIRECT;
-   }
+   modref_flags = interposable_eaf_flags (modref_flags, flags);
  if (dbg_cnt (ipa_mod_ref_pta))
flags |= modref_flags;
}
diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
index 5209fbdfbf4..26a719ff40c 100644
--- a/gcc/ipa-modref.c
+++ b/gcc/ipa-modref.c
@@ -291,9 +291,7 @@ modref_summary::~modref_summary ()
 static int
 remove_useless_eaf_flags (int eaf_flags, int ecf_flags, bool returns_void)
 {
-  if (ecf_flags & ECF_NOVOPS)
-return 0;
-  if (ecf_flags & ECF_CONST)
+  if (ecf_flags & (ECF_CONST | ECF_NOVOPS))
 eaf_flags &= ~implicit_const_eaf_flags;
   else if (ecf_flags & ECF_PURE)
 eaf_flags &= ~implicit_pure_eaf_flags;
@@ -319,8 +317,6 @@ eaf_flags_useful_p (vec  &flags, int ecf_flags)
 bool
 modref_summary::useful_p (int ecf_flags, bool check_flags)
 {
-  if (ecf_flags & ECF_NOVOPS)
-return false;
   if (arg_flags.length () && !check_flags)
 return true;
   if (check_flags && eaf_flags_useful_p (arg_flags, ecf_flags))
@@ -331,7 +327,7 @@ modref_summary::useful_p (int ecf_flags, bool check_flags)
   if (check_flags
   && remove_useless_eaf_flags (static_chain_flags, ecf_flags, false))
 return true;
-  if (ecf_flags & ECF_CONST)
+  if (ecf_flags & (ECF_CONST | ECF_NOVOPS))
 return false;
   if (loads && !loads->every_base)
 return true;
@@ -405,8 +401,6 @@ modref_summary_lto::~modref_s

Re: [PATCH,FORTRAN] Fix memory leak in finalization wrappers

2021-11-07 Thread Mikael Morin

Le 07/11/2021 à 00:56, Bernhard Reutner-Fischer a écrit :

On Sat, 6 Nov 2021 13:04:07 +0100
Mikael Morin  wrote:


Le 05/11/2021 à 23:08, Bernhard Reutner-Fischer a écrit :

On Fri, 5 Nov 2021 19:46:16 +0100
Mikael Morin  wrote:
   

I’m a bit concerned by the loss of the null_expr’s type interface.
I can’t convince myself that it’s either absolutely necessary or
completely useless.


It's a delicate spot, yes, but i do think they are completely useless.
If we do NOT need a finalization, the initializer can (and has to be
AFAIU) be a null_expr and AFAICS then does not need an interface.
   

Well, the null pointer itself doesn’t need a type, but I think it’s
better if the pointer it’s assigned to has a type different from void*.
It will (hopefully) help the middle-end optimizers downstream.


I would not expect this to help all that much or at all TBH.

So i compiled
for i in $(grep -li final $(grep -L dg-error 
/scratch/src/gcc-12.mine/gcc/testsuite/gfortran.dg/*.f*)); do gfortran -O2 
-fcoarray=single -c $i -g -g3 -ggdb3 -fdump-tree-original 
-fdump-tree-optimized;done
and diffed all .original and .optimized dumps against pristine trunk
and they are identical.

I inspected and ran the binary from finalize_14 and there is no change
in the leaks compared to pristine trunk. The 3 shape_w in p leak as
they used to. I do remember that finalize_14 was a good testcase, in
sum i glared at it for quite some time ;)


In fact, the interface is not used.
the type is built in gfc_get_ppc_type which has the following.

  /* Explicit interface.  */
  if (c->attr.if_source != IFSRC_UNKNOWN && c->ts.interface)
return build_pointer_type (gfc_get_function_type (c->ts.interface));

As components have no if_source attribute set, the type is not built 
here and a default function type is built further down without interface 
information.
This is probably unintended as the components’ initializers carefully 
set an if_source attribute.


The problem has been identified before ; see vaguely related posts from 
FX in september 2020.


Anyway, I don’t think your changes will have negative impact then, and 
it makes things more readable, so I’m fine with it after all; OK.




I will see if I can manage to create a testcase where it makes a
difference (don’t hold your breath, I don’t even have a bootstrapped
compiler ready yet).


That'd be great, TIA!
[]


I’ve given up eventually.


btw.. Just because it's vagely related.
I think f8add009ce300f24b75e9c2e2cc5dd944a020c28 for
PR fortran/88009 (ICE in find_intrinsic_vtab, at fortran/class.c:2761)
is incomplete in that i think all the internal class helpers should be
flagged artificial. All these symbols built in gfc_build_class_symbol,
generate_finalization_wrapper, gfc_find_derived_vtab etc.
Looking at the history it seems the artificial bit often was forgotten.


I guess so, yes...


And most importantly i think it is not correct to ignore artificial in
gfc_check_conflict!

Well, it’s not correct to throw errors at users for things they haven’t 
written and that they don’t control.




RE: Some PINGs

2021-11-07 Thread Roger Sayle


>On 11/6/2021 4:20 PM, Roger Sayle wrote:
>> Simplify paradoxical subreg extensions of TRUNCATE 
>> https://gcc.gnu.org/pipermail/gcc-patches/2021-September/578848.html
> So the discussion seemed to end with a recommendation to try and address this 
> earlier in the call chain rather than in simplify_rtx.

I take full responsibility that my poor choice of wording in describing this 
patch
may be confusing to those not familiar with this style of describing middle-end
transformations in GCC.  Take the opening sentence "This patch simplifies the
RTX (subreg:HI (truncate:QI (reg:SI))) as (truncate:HI (reg:SI))", taken 
literally
someone may look-up the RTL documentation and observe that SUBREG
may never be applied to truncate hence the above is invalid, and therefore
this transformation could never fire.  The wording is perhaps unfortunate; the
abstract RTL above is never actually created/allocated, instead we're 
describing the
transformation/context, i.e. the behaviour of the call to gen_lowpart when 
passed
a TRUCATE rtx as an operand.  Hence, this patch is already "earlier in the call 
chain"
and where it should be.  This is no different to how GCC's match describes tree
expressions to be folded, but that never physically exist in GIMPLE, where the
single-static assignment form guarantees no operator may have another as an
operand.

If someone could review the code change itself, and not misinterpret the patch 
description that would be much appreciated.

Thanks in advance,
Roger
--




Re: [PATCH 11/18] rs6000: Builtin expansion, part 6

2021-11-07 Thread Bill Schmidt via Gcc-patches
Hi Segher,

Thank you for all of the reviews!  I appreciate your hard work and thorough 
study of the patches.

I've updated these 6 patches and combined them into 1, pushed today.  There 
are still a couple of cleanups I haven't done, but I made note in the code
where these are needed.

Thanks again!
Bill

On 11/3/21 8:24 PM, Segher Boessenkool wrote:
> Hi!
>
> On Wed, Sep 01, 2021 at 11:13:47AM -0500, Bill Schmidt wrote:
>> Provide replacements for htm_spr_num and htm_expand_builtin.  No logic
>> changes are intended here, as usual.  Much code was factored out into
>> rs6000_expand_new_builtin, so the new version of htm_expand_builtin is
>> a little tidier.
> Nice.
>
>> Also implement the support for the "endian" and "32bit" attributes,
>> which is straightforward.  These just do icode substitution.
> Don't call this "attributes" please?  I don't know what would be a
> better name, of course.  "bif attribute" maybe?
>
>> +  rtx op[MAX_HTM_OPERANDS], pat;
> Don't declare arrays and scalars in the same statement, in general.  It
> is important that the arrays stand out.
>
> Also, don't declare things before they are used please.
>
>> +  FOR_EACH_CALL_EXPR_ARG (arg, iter, exp)
>> +{
>> +  if (arg == error_mark_node || nopnds >= MAX_HTM_OPERANDS)
>> +return const0_rtx;
>> +
>> +  insn_op = &insn_data[icode].operand[nopnds];
>> +  op[nopnds] = expand_normal (arg);
>> +
>> +  if (!insn_op->predicate (op[nopnds], insn_op->mode))
>> +{
>> +  if (!strcmp (insn_op->constraint, "n"))
>> +{
>> +  int arg_num = (nonvoid) ? nopnds : nopnds + 1;
> Please don't parenthesise random expressions like "nonvoid".  I wonder
> if that can be simpler handled by just unshifting a void_node into the
> operands, btw :-)
>
> And the same "n" thing as before of course.  Since it is the same: some
> factoring would be helpful probably.
>
>> +  machine_mode mode = (TARGET_POWERPC64) ? DImode : SImode;
> Superfluous parens.  This is just "word_mode", anyway?
>
>> +  /* If this builtin accesses a CR, then pass in a scratch
>> + CR as the last operand.  */
>> +  else if (bif_is_htmcr (*bifaddr))
>> +{
>> +  cr = gen_reg_rtx (CCmode);
>> +  op[nopnds++] = cr;
>> +}
> There is only one CR ("condition register").  You can say CRF here
> ("condition register field", a 4-bit thing), or just cc or CC maybe
> ("condition code").  A pet peeve, I know.
>
>> +  if (bif_is_endian (*bifaddr) && BYTES_BIG_ENDIAN)
> "is_endian" should maybe be "is_bigendian" or something like that?
>
> Okay for trunk with the changes you see fit at this time.  Thanks!
>
>
> Segher


[PATCH v3 1/5] fortran: Tiny sort_actual internal refactoring

2021-11-07 Thread Mikael Morin via Gcc-patches

Preliminary refactoring to make further changes more obvious.
No functional change.

gcc/fortran/ChangeLog:
* intrinsic.c (sort_actual): initialise variable and use it earlier.
---
 gcc/fortran/intrinsic.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/gcc/fortran/intrinsic.c b/gcc/fortran/intrinsic.c
index a6a18a471e3..49ef3b2a3d2 100644
--- a/gcc/fortran/intrinsic.c
+++ b/gcc/fortran/intrinsic.c
@@ -4378,19 +4378,18 @@ do_sort:
 
   for (f = formal; f; f = f->next)
 {
-  if (f->actual && f->actual->label != NULL && f->ts.type)
+  a = f->actual;
+  if (a && a->label != NULL && f->ts.type)
 	{
 	  gfc_error ("ALTERNATE RETURN not permitted at %L", where);
 	  return false;
 	}
 
-  if (f->actual == NULL)
+  if (a == NULL)
 	{
 	  a = gfc_get_actual_arglist ();
 	  a->missing_arg_type = f->ts.type;
 	}
-  else
-	a = f->actual;
 
   if (actual == NULL)
 	*ap = a;


[PATCH v3 0/5] fortran: Ignore unused arguments for scalarisation [PR97896]

2021-11-07 Thread Mikael Morin via Gcc-patches


Hello,

This is the third submit of this patch series.
After submitting the v2 [2] for master, and a somewhat different variant for
backport [3], I thought it was defeating the purpose of the backporting
process.  So I have decided to rebase the master patches on the backport
patches, so that the backport patchs can get some testing on master first.

The problematic case is intrinsic procedures where an argument is actually
not used in the code generated (KIND argument of INDEX in the testcase),
which confuses the scalariser.

Thomas König comitted a change to workaround the problem, but it regressed
in PR97896.  These patches put the workaround where I think it is more
appropriate, namely at the beginning of the scalarisation procedure.
That’s what is done by the series [3] initially intended for backport only,
and now for master too.  This series is a followup to them.

What are left in this series are a couple of refactoring for the master
branch only.  They aim at being able to identify the KIND argument of the INDEX
intrinsic by its name, rather than counting the right number of next->next->next
indirections starting with the first argument.  It may seem overkill for just
this use case, but I think it’s worth having that facility in the long term.

Regression-tested on x86_64-linux-gnu.  Ok for master? 


Changes from v2 [2]:

  Rebase on the backport variant of the series.

Changes from v1 [1]:

  Use C structs and enums instead of C++ classes.


[1] https://gcc.gnu.org/pipermail/fortran/2021-August/056303.html
[2] https://gcc.gnu.org/pipermail/fortran/2021-August/056317.html
[3] https://gcc.gnu.org/pipermail/fortran/2021-August/056329.html


Mikael Morin (5):
  fortran: Tiny sort_actual internal refactoring
  fortran: Reverse actual vs dummy argument mapping
  fortran: simplify elemental arguments walking
  fortran: Delete redundant missing_arg_type field
  fortran: Identify arguments by their names

 gcc/fortran/gfortran.h| 41 +++
 gcc/fortran/interface.c   | 77 +++
 gcc/fortran/intrinsic.c   | 53 
 gcc/fortran/trans-array.c | 35 +---
 gcc/fortran/trans-array.h |  2 +-
 gcc/fortran/trans-expr.c  |  9 +++-
 gcc/fortran/trans-intrinsic.c |  2 +-
 gcc/fortran/trans-stmt.c  | 22 --
 gcc/fortran/trans.h   |  4 +-
 9 files changed, 161 insertions(+), 84 deletions(-)

-- 
2.33.0



[PATCH v3 2/5] fortran: Reverse actual vs dummy argument mapping

2021-11-07 Thread Mikael Morin via Gcc-patches

There was originally no way from an actual argument to get
to the corresponding dummy argument, even if the job of sorting
and matching actual with dummy arguments was done.
The closest was a field named actual in gfc_intrinsic_arg that was
used as scratch data when sorting arguments of one specific call.
However that value was overwritten later on as arguments of another
call to the same procedure were sorted and matched.

This change removes that field from gfc_intrinsic_arg and adds instead
a new field associated_dummy in gfc_actual_arglist.

The new field has as type a new wrapper struct gfc_dummy_arg that provides
a common interface to both dummy arguments of user-defined procedures
(which have type gfc_formal_arglist) and dummy arguments of intrinsic procedures
(which have type gfc_intrinsic_arg).

As the removed field was used in the code sorting and matching arguments,
that code has to be updated.  Two local vectors with matching indices
are introduced for respectively dummy and actual arguments, and the
loops are modified to use indices and update those argument vectors.

gcc/fortran/ChangeLog:
* gfortran.h (gfc_dummy_arg_kind, gfc_dummy_arg): New.
(gfc_actual_arglist): New field associated_dummy.
(gfc_intrinsic_arg): Remove field actual.
* interface.c (get_nonintrinsic_dummy_arg): New.
(gfc_compare_actual): Initialize associated_dummy.
* intrinsic.c (get_intrinsic_dummy_arg): New.
(sort_actual):  Add argument vectors.
Use loops with indices on argument vectors.
Initialize associated_dummy.
---
 gcc/fortran/gfortran.h  | 31 +++--
 gcc/fortran/interface.c | 21 ++--
 gcc/fortran/intrinsic.c | 43 ++---
 3 files changed, 80 insertions(+), 15 deletions(-)

diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index 8c11cf6d18d..d678c6b56dc 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -1199,6 +1199,9 @@ gfc_formal_arglist;
 #define gfc_get_formal_arglist() XCNEW (gfc_formal_arglist)
 
 
+struct gfc_dummy_arg;
+
+
 /* The gfc_actual_arglist structure is for actual arguments and
for type parameter specification lists.  */
 typedef struct gfc_actual_arglist
@@ -1215,6 +1218,11 @@ typedef struct gfc_actual_arglist
   gfc_param_spec_type spec_type;
 
   struct gfc_expr *expr;
+
+  /*  The dummy arg this actual arg is associated with, if the interface
+  is explicit.  NULL otherwise.  */
+  gfc_dummy_arg *associated_dummy;
+
   struct gfc_actual_arglist *next;
 }
 gfc_actual_arglist;
@@ -2298,14 +2306,33 @@ typedef struct gfc_intrinsic_arg
   gfc_typespec ts;
   unsigned optional:1, value:1;
   ENUM_BITFIELD (sym_intent) intent:2;
-  gfc_actual_arglist *actual;
 
   struct gfc_intrinsic_arg *next;
-
 }
 gfc_intrinsic_arg;
 
 
+typedef enum {
+  GFC_UNDEFINED_DUMMY_ARG = 0,
+  GFC_INTRINSIC_DUMMY_ARG,
+  GFC_NON_INTRINSIC_DUMMY_ARG
+}
+gfc_dummy_arg_intrinsicness;
+
+/* dummy arg of either an intrinsic or a user-defined procedure.  */
+struct gfc_dummy_arg
+{
+  gfc_dummy_arg_intrinsicness intrinsicness;
+
+  union {
+gfc_intrinsic_arg *intrinsic;
+gfc_formal_arglist *non_intrinsic;
+  } u;
+};
+
+#define gfc_get_dummy_arg() XCNEW (gfc_dummy_arg)
+
+
 /* Specifies the various kinds of check functions used to verify the
argument lists of intrinsic functions. fX with X an integer refer
to check functions of intrinsics with X arguments. f1m is used for
diff --git a/gcc/fortran/interface.c b/gcc/fortran/interface.c
index 24698be8364..c4ec0d89a58 100644
--- a/gcc/fortran/interface.c
+++ b/gcc/fortran/interface.c
@@ -3043,6 +3043,18 @@ lookup_arg_fuzzy (const char *arg, gfc_formal_arglist *arguments)
 }
 
 
+static gfc_dummy_arg *
+get_nonintrinsic_dummy_arg (gfc_formal_arglist *formal)
+{
+  gfc_dummy_arg * const dummy_arg = gfc_get_dummy_arg ();
+
+  dummy_arg->intrinsicness = GFC_NON_INTRINSIC_DUMMY_ARG;
+  dummy_arg->u.non_intrinsic = formal;
+
+  return dummy_arg;
+}
+
+
 /* Given formal and actual argument lists, see if they are compatible.
If they are compatible, the actual argument list is sorted to
correspond with the formal list, and elements for missing optional
@@ -3150,6 +3162,8 @@ gfc_compare_actual_formal (gfc_actual_arglist **ap, gfc_formal_arglist *formal,
 			   "call at %L", where);
 	  return false;
 	}
+  else
+	a->associated_dummy = get_nonintrinsic_dummy_arg (f);
 
   if (a->expr == NULL)
 	{
@@ -3646,9 +3660,12 @@ gfc_compare_actual_formal (gfc_actual_arglist **ap, gfc_formal_arglist *formal,
   /* The argument lists are compatible.  We now relink a new actual
  argument list with null arguments in the right places.  The head
  of the list remains the head.  */
-  for (i = 0; i < n; i++)
+  for (f = formal, i = 0; f; f = f->next, i++)
 if (new_arg[i] == NULL)
-  new_arg[i] = gfc_get_actual_arglist ();
+  {
+	new_arg[i] = gfc_get_actual_arglist ();
+	new_arg[

[PATCH v3 4/5] fortran: Delete redundant missing_arg_type field

2021-11-07 Thread Mikael Morin via Gcc-patches

Now that we can get information about an actual arg's associated
dummy using the associated_dummy attribute, the field missing_arg_type
contains redundant information.
This removes it.

gcc/fortran/ChangeLog:
* gfortran.h (gfc_actual_arglist::missing_arg_type): Remove.
* interface.c (gfc_compare_actual_formal): Remove
missing_arg_type initialization.
* intrinsic.c (sort_actual): Ditto.
* trans-expr.c (gfc_conv_procedure_call): Use associated_dummy
and gfc_dummy_arg_get_typespec to get the dummy argument type.
---
 gcc/fortran/gfortran.h   | 5 -
 gcc/fortran/interface.c  | 5 -
 gcc/fortran/intrinsic.c  | 5 +
 gcc/fortran/trans-expr.c | 9 +++--
 4 files changed, 8 insertions(+), 16 deletions(-)

diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index 7e76e482b98..4879805ff0b 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -1210,11 +1210,6 @@ typedef struct gfc_actual_arglist
   /* Alternate return label when the expr member is null.  */
   struct gfc_st_label *label;
 
-  /* This is set to the type of an eventual omitted optional
- argument. This is used to determine if a hidden string length
- argument has to be added to a function call.  */
-  bt missing_arg_type;
-
   gfc_param_spec_type spec_type;
 
   struct gfc_expr *expr;
diff --git a/gcc/fortran/interface.c b/gcc/fortran/interface.c
index db0b3b01b8c..36b7a852066 100644
--- a/gcc/fortran/interface.c
+++ b/gcc/fortran/interface.c
@@ -3681,11 +3681,6 @@ gfc_compare_actual_formal (gfc_actual_arglist **ap, gfc_formal_arglist *formal,
   if (*ap == NULL && n > 0)
 *ap = new_arg[0];
 
-  /* Note the types of omitted optional arguments.  */
-  for (a = *ap, f = formal; a; a = a->next, f = f->next)
-if (a->expr == NULL && a->label == NULL)
-  a->missing_arg_type = f->sym->ts.type;
-
   return true;
 }
 
diff --git a/gcc/fortran/intrinsic.c b/gcc/fortran/intrinsic.c
index f6d061a847c..3018315ed78 100644
--- a/gcc/fortran/intrinsic.c
+++ b/gcc/fortran/intrinsic.c
@@ -4405,10 +4405,7 @@ do_sort:
 	}
 
   if (a == NULL)
-	{
-	  a = gfc_get_actual_arglist ();
-	  a->missing_arg_type = f->ts.type;
-	}
+	a = gfc_get_actual_arglist ();
 
   a->associated_dummy = get_intrinsic_dummy_arg (f);
 
diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
index e7aec3845d3..bc502c0f43c 100644
--- a/gcc/fortran/trans-expr.c
+++ b/gcc/fortran/trans-expr.c
@@ -6157,7 +6157,10 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
 		{
 		  /* Pass a NULL pointer for an absent arg.  */
 		  parmse.expr = null_pointer_node;
-		  if (arg->missing_arg_type == BT_CHARACTER)
+		  gfc_dummy_arg * const dummy_arg = arg->associated_dummy;
+		  if (dummy_arg
+		  && gfc_dummy_arg_get_typespec (*dummy_arg).type
+			 == BT_CHARACTER)
 		parmse.string_length = build_int_cst (gfc_charlen_type_node,
 			  0);
 		}
@@ -6174,7 +6177,9 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
 			  || !CLASS_DATA (fsym)->attr.allocatable));
 	  gfc_init_se (&parmse, NULL);
 	  parmse.expr = null_pointer_node;
-	  if (arg->missing_arg_type == BT_CHARACTER)
+	  if (arg->associated_dummy
+	  && gfc_dummy_arg_get_typespec (*arg->associated_dummy).type
+		 == BT_CHARACTER)
 	parmse.string_length = build_int_cst (gfc_charlen_type_node, 0);
 	}
   else if (fsym && fsym->ts.type == BT_CLASS


[PATCH v3 3/5] fortran: simplify elemental arguments walking

2021-11-07 Thread Mikael Morin via Gcc-patches

This adds two functions working with the wrapper struct gfc_dummy_arg
and makes usage of them to simplify a bit the walking of elemental
procedure arguments for scalarization.  As information about dummy arguments
can be obtained from the actual argument through the just-introduced
associated_dummy field, there is no need to carry around the procedure
interface and walk dummy arguments manually together with actual arguments.

gcc/fortran/ChangeLog:
* interface.c (gfc_dummy_arg_get_typespec,
gfc_dummy_arg_is_optional): New functions.
* gfortran.h (gfc_dummy_arg_get_typespec,
gfc_dummy_arg_is_optional): Declare them.
* trans.h (gfc_ss_info::dummy_arg): Use the wrapper type
as declaration type.
* trans-array.c (gfc_scalar_elemental_arg_saved_as_reference):
use gfc_dummy_arg_get_typespec function to get the type.
(gfc_walk_elemental_function_args): Remove proc_ifc argument.
Get info about the dummy arg using the associated_dummy field.
* trans-array.h (gfc_walk_elemental_function_args): Update declaration.
* trans-intrinsic.c (gfc_walk_intrinsic_function):
Update call to gfc_walk_elemental_function_args.
* trans-stmt.c (gfc_trans_call): Ditto.
(get_proc_ifc_for_call): Remove.
---
 gcc/fortran/gfortran.h|  4 
 gcc/fortran/interface.c   | 34 ++
 gcc/fortran/trans-array.c | 19 ++-
 gcc/fortran/trans-array.h |  2 +-
 gcc/fortran/trans-intrinsic.c |  2 +-
 gcc/fortran/trans-stmt.c  | 22 --
 gcc/fortran/trans.h   |  4 ++--
 7 files changed, 48 insertions(+), 39 deletions(-)

diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index d678c6b56dc..7e76e482b98 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -2333,6 +2333,10 @@ struct gfc_dummy_arg
 #define gfc_get_dummy_arg() XCNEW (gfc_dummy_arg)
 
 
+const gfc_typespec & gfc_dummy_arg_get_typespec (gfc_dummy_arg &);
+bool gfc_dummy_arg_is_optional (gfc_dummy_arg &);
+
+
 /* Specifies the various kinds of check functions used to verify the
argument lists of intrinsic functions. fX with X an integer refer
to check functions of intrinsics with X arguments. f1m is used for
diff --git a/gcc/fortran/interface.c b/gcc/fortran/interface.c
index c4ec0d89a58..db0b3b01b8c 100644
--- a/gcc/fortran/interface.c
+++ b/gcc/fortran/interface.c
@@ -5503,3 +5503,37 @@ gfc_get_formal_from_actual_arglist (gfc_symbol *sym,
   f = &((*f)->next);
 }
 }
+
+
+const gfc_typespec &
+gfc_dummy_arg_get_typespec (gfc_dummy_arg & dummy_arg)
+{
+  switch (dummy_arg.intrinsicness)
+{
+case GFC_INTRINSIC_DUMMY_ARG:
+  return dummy_arg.u.intrinsic->ts;
+
+case GFC_NON_INTRINSIC_DUMMY_ARG:
+  return dummy_arg.u.non_intrinsic->sym->ts;
+
+default:
+  gcc_unreachable ();
+}
+}
+
+
+bool
+gfc_dummy_arg_is_optional (gfc_dummy_arg & dummy_arg)
+{
+  switch (dummy_arg.intrinsicness)
+{
+case GFC_INTRINSIC_DUMMY_ARG:
+  return dummy_arg.u.intrinsic->optional;
+
+case GFC_NON_INTRINSIC_DUMMY_ARG:
+  return dummy_arg.u.non_intrinsic->sym->attr.optional;
+
+default:
+  gcc_unreachable ();
+}
+}
diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c
index 79321854498..d37c1e7ad7f 100644
--- a/gcc/fortran/trans-array.c
+++ b/gcc/fortran/trans-array.c
@@ -3010,7 +3010,8 @@ gfc_scalar_elemental_arg_saved_as_reference (gfc_ss_info * ss_info)
   /* If the expression is of polymorphic type, it's actual size is not known,
  so we avoid copying it anywhere.  */
   if (ss_info->data.scalar.dummy_arg
-  && ss_info->data.scalar.dummy_arg->ts.type == BT_CLASS
+  && gfc_dummy_arg_get_typespec (*ss_info->data.scalar.dummy_arg).type
+	 == BT_CLASS
   && ss_info->expr->ts.type == BT_CLASS)
 return true;
 
@@ -11521,9 +11522,8 @@ arg_evaluated_for_scalarization (gfc_intrinsic_sym *function,
 gfc_ss *
 gfc_walk_elemental_function_args (gfc_ss * ss, gfc_actual_arglist *arg,
   gfc_intrinsic_sym *intrinsic_sym,
-  gfc_symbol *proc_ifc, gfc_ss_type type)
+  gfc_ss_type type)
 {
-  gfc_formal_arglist *dummy_arg;
   int scalar;
   gfc_ss *head;
   gfc_ss *tail;
@@ -11532,15 +11532,11 @@ gfc_walk_elemental_function_args (gfc_ss * ss, gfc_actual_arglist *arg,
   head = gfc_ss_terminator;
   tail = NULL;
 
-  if (proc_ifc)
-dummy_arg = gfc_sym_get_dummy_args (proc_ifc);
-  else
-dummy_arg = NULL;
-
   int arg_num = 0;
   scalar = 1;
   for (; arg; arg = arg->next)
 {
+  gfc_dummy_arg * const dummy_arg = arg->associated_dummy;
   if (!arg->expr
 	  || arg->expr->expr_type == EXPR_NULL
 	  || !arg_evaluated_for_scalarization (intrinsic_sym, *arg, arg_num))
@@ -11554,13 +11550,13 @@ gfc_walk_elemental_function_args (gfc_ss * ss, gfc_actual_arglist *arg,
 	  newss = gfc_get_scalar_ss (head, arg->expr);
 	  newss->info->type = type;
 	  if (dummy_arg)
-	

[PATCH 0/2] fortran: Ignore unused arguments for scalarisation [PR97896]

2021-11-07 Thread Mikael Morin via Gcc-patches
Hello,

I repost this patch series initially targetted at the 11 branch only [1],
and that I now would like to commit to master as well before.

The problematic case is intrinsic procedures where an argument is actually
not used in the code generated (KIND argument of INDEX in the testcase),
which confuses the scalariser.

Thomas König comitted a change to workaround the problem, but it regressed
in PR97896.  These patch put the workaround where I think it is more
appropriate, namely at the beginning of the scalarisation procedure.
This is the patch 2 of the series, preceded with the revert in patch 1.
I intend to commit both of them squashed together.

Regression-tested on x86_64-linux-gnu.  Ok for master and 11 branch? 


Changes from v1:

  Rebase on master.


[1] https://gcc.gnu.org/pipermail/fortran/2021-August/056329.html


Mikael Morin (2):
  Revert "Remove KIND argument from INDEX so it does not mess up
scalarization."
  fortran: Ignore unused args in scalarization [PR97896]

 gcc/fortran/intrinsic.c   | 48 +++--
 gcc/fortran/intrinsic.h   |  3 +-
 gcc/fortran/iresolve.c| 21 ++---
 gcc/fortran/trans-array.c | 61 ++-
 gcc/fortran/trans-array.h |  3 ++
 gcc/fortran/trans-decl.c  | 24 +--
 gcc/fortran/trans-intrinsic.c |  1 +
 gcc/fortran/trans-stmt.c  | 20 +
 gcc/testsuite/gfortran.dg/index_5.f90 | 23 ++
 9 files changed, 121 insertions(+), 83 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/index_5.f90

-- 
2.33.0



[PATCH 1/2] Revert "Remove KIND argument from INDEX so it does not mess up scalarization."

2021-11-07 Thread Mikael Morin via Gcc-patches

This reverts commit d09847357b965a2c2cda063827ce362d4c9c86f2 except for
its testcase.

gcc/fortran/ChangeLog:
* intrinsic.c (add_sym_4ind): Remove.
(add_functions): Use add_sym4 instead of add_sym4ind.
Don’t special case the index intrinsic.
* iresolve.c (gfc_resolve_index_func): Use the individual arguments
directly instead of the full argument list.
* intrinsic.h (gfc_resolve_index_func): Update the declaration
accordingly.
* trans-decl.c (gfc_get_extern_function_decl): Don’t modify the
list of arguments in the case of the index intrinsic.
---
 gcc/fortran/intrinsic.c  | 48 ++--
 gcc/fortran/intrinsic.h  |  3 ++-
 gcc/fortran/iresolve.c   | 21 --
 gcc/fortran/trans-decl.c | 24 +---
 4 files changed, 14 insertions(+), 82 deletions(-)

diff --git a/gcc/fortran/intrinsic.c b/gcc/fortran/intrinsic.c
index f5c88d98cc9..a6a18a471e3 100644
--- a/gcc/fortran/intrinsic.c
+++ b/gcc/fortran/intrinsic.c
@@ -888,39 +888,6 @@ add_sym_4 (const char *name, gfc_isym_id id, enum klass cl, int actual_ok, bt ty
 	   (void *) 0);
 }
 
-/* Add a symbol to the function list where the function takes 4
-   arguments and resolution may need to change the number or
-   arrangement of arguments. This is the case for INDEX, which needs
-   its KIND argument removed.  */
-
-static void
-add_sym_4ind (const char *name, gfc_isym_id id, enum klass cl, int actual_ok,
-	  bt type, int kind, int standard,
-	  bool (*check) (gfc_expr *, gfc_expr *, gfc_expr *, gfc_expr *),
-	  gfc_expr *(*simplify) (gfc_expr *, gfc_expr *, gfc_expr *,
- gfc_expr *),
-	  void (*resolve) (gfc_expr *, gfc_actual_arglist *),
-	  const char *a1, bt type1, int kind1, int optional1,
-	  const char *a2, bt type2, int kind2, int optional2,
-	  const char *a3, bt type3, int kind3, int optional3,
-	  const char *a4, bt type4, int kind4, int optional4 )
-{
-  gfc_check_f cf;
-  gfc_simplify_f sf;
-  gfc_resolve_f rf;
-
-  cf.f4 = check;
-  sf.f4 = simplify;
-  rf.f1m = resolve;
-
-  add_sym (name, id, cl, actual_ok, type, kind, standard, cf, sf, rf,
-	   a1, type1, kind1, optional1, INTENT_IN,
-	   a2, type2, kind2, optional2, INTENT_IN,
-	   a3, type3, kind3, optional3, INTENT_IN,
-	   a4, type4, kind4, optional4, INTENT_IN,
-	   (void *) 0);
-}
-
 
 /* Add a symbol to the subroutine list where the subroutine takes
4 arguments.  */
@@ -2223,11 +2190,11 @@ add_functions (void)
 
   /* The resolution function for INDEX is called gfc_resolve_index_func
  because the name gfc_resolve_index is already used in resolve.c.  */
-  add_sym_4ind ("index", GFC_ISYM_INDEX, CLASS_ELEMENTAL, ACTUAL_YES,
-		BT_INTEGER, di, GFC_STD_F77,
-		gfc_check_index, gfc_simplify_index, gfc_resolve_index_func,
-		stg, BT_CHARACTER, dc, REQUIRED, ssg, BT_CHARACTER, dc, REQUIRED,
-		bck, BT_LOGICAL, dl, OPTIONAL, kind, BT_INTEGER, di, OPTIONAL);
+  add_sym_4 ("index", GFC_ISYM_INDEX, CLASS_ELEMENTAL, ACTUAL_YES,
+	 BT_INTEGER, di, GFC_STD_F77,
+	 gfc_check_index, gfc_simplify_index, gfc_resolve_index_func,
+	 stg, BT_CHARACTER, dc, REQUIRED, ssg, BT_CHARACTER, dc, REQUIRED,
+	 bck, BT_LOGICAL, dl, OPTIONAL, kind, BT_INTEGER, di, OPTIONAL);
 
   make_generic ("index", GFC_ISYM_INDEX, GFC_STD_F77);
 
@@ -4530,10 +4497,9 @@ resolve_intrinsic (gfc_intrinsic_sym *specific, gfc_expr *e)
 
   arg = e->value.function.actual;
 
-  /* Special case hacks for MIN, MAX and INDEX.  */
+  /* Special case hacks for MIN and MAX.  */
   if (specific->resolve.f1m == gfc_resolve_max
-  || specific->resolve.f1m == gfc_resolve_min
-  || specific->resolve.f1m == gfc_resolve_index_func)
+  || specific->resolve.f1m == gfc_resolve_min)
 {
   (*specific->resolve.f1m) (e, arg);
   return;
diff --git a/gcc/fortran/intrinsic.h b/gcc/fortran/intrinsic.h
index 7511df1..fb655fb078a 100644
--- a/gcc/fortran/intrinsic.h
+++ b/gcc/fortran/intrinsic.h
@@ -519,7 +519,8 @@ void gfc_resolve_ibits (gfc_expr *, gfc_expr *, gfc_expr *, gfc_expr *);
 void gfc_resolve_ibset (gfc_expr *, gfc_expr *, gfc_expr *);
 void gfc_resolve_image_index (gfc_expr *, gfc_expr *, gfc_expr *);
 void gfc_resolve_image_status (gfc_expr *, gfc_expr *, gfc_expr *);
-void gfc_resolve_index_func (gfc_expr *, gfc_actual_arglist *);
+void gfc_resolve_index_func (gfc_expr *, gfc_expr *, gfc_expr *, gfc_expr *,
+			 gfc_expr *);
 void gfc_resolve_ierrno (gfc_expr *);
 void gfc_resolve_ieor (gfc_expr *, gfc_expr *, gfc_expr *);
 void gfc_resolve_ichar (gfc_expr *, gfc_expr *, gfc_expr *);
diff --git a/gcc/fortran/iresolve.c b/gcc/fortran/iresolve.c
index e17fe45f080..598c0409b66 100644
--- a/gcc/fortran/iresolve.c
+++ b/gcc/fortran/iresolve.c
@@ -1276,27 +1276,16 @@ gfc_resolve_ior (gfc_expr *f, gfc_expr *i, gfc_expr *j)
 
 
 void
-gfc_resolve_index_func (gfc_expr *f, gfc_actual_arglist *a)
+gfc_resolve_index_func (gfc_expr *f, gfc_exp

[PATCH 2/2] fortran: Ignore unused args in scalarization [PR97896]

2021-11-07 Thread Mikael Morin via Gcc-patches

The KIND argument of the INDEX intrinsic is a compile time constant
that is used at compile time only to resolve to a kind-specific library
function.  That argument is otherwise completely ignored at runtime, and there 
is
no code generated for it as the library procedure has no kind argument.
This confuses the scalarizer which expects to see every argument
of elemental functions used when calling a procedure.
This change removes the argument from the scalarization lists
at the beginning of the scalarization process, so that the argument
is completely ignored.

PR fortran/97896

gcc/fortran/ChangeLog:
* trans-array.h (gfc_get_intrinsic_for_expr,
gfc_get_proc_ifc_for_expr): New.
* trans-array.c (gfc_get_intrinsic_for_expr,
arg_evaluated_for_scalarization): New.
(gfc_walk_elemental_function_args): Add intrinsic procedure
as argument.  Count arguments.  Check arg_evaluated_for_scalarization.
* trans-intrinsic.c (gfc_walk_intrinsic_function): Update call.
* trans-stmt.c (get_intrinsic_for_code): New.
(gfc_trans_call): Update call.

gcc/testsuite/ChangeLog:
* gfortran.dg/index_5.f90: New.
---
 gcc/fortran/trans-array.c | 61 ++-
 gcc/fortran/trans-array.h |  3 ++
 gcc/fortran/trans-intrinsic.c |  1 +
 gcc/fortran/trans-stmt.c  | 20 +
 gcc/testsuite/gfortran.dg/index_5.f90 | 23 ++
 5 files changed, 107 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gfortran.dg/index_5.f90

diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c
index 5ceb261b698..79321854498 100644
--- a/gcc/fortran/trans-array.c
+++ b/gcc/fortran/trans-array.c
@@ -11460,6 +11460,59 @@ gfc_get_proc_ifc_for_expr (gfc_expr *procedure_ref)
 }
 
 
+/* Given an expression referring to an intrinsic function call,
+   return the intrinsic symbol.  */
+
+gfc_intrinsic_sym *
+gfc_get_intrinsic_for_expr (gfc_expr *call)
+{
+  if (call == NULL)
+return NULL;
+
+  /* Normal procedure case.  */
+  if (call->expr_type == EXPR_FUNCTION)
+return call->value.function.isym;
+  else
+return NULL;
+}
+
+
+/* Indicates whether an argument to an intrinsic function should be used in
+   scalarization.  It is usually the case, except for some intrinsics
+   requiring the value to be constant, and using the value at compile time only.
+   As the value is not used at runtime in those cases, we don’t produce code
+   for it, and it should not be visible to the scalarizer.
+   FUNCTION is the intrinsic function being called, ACTUAL_ARG is the actual
+   argument being examined in that call, and ARG_NUM the index number
+   of ACTUAL_ARG in the list of arguments.
+   The intrinsic procedure’s dummy argument associated with ACTUAL_ARG is
+   identified using the name in ACTUAL_ARG if it is present (that is: if it’s
+   a keyword argument), otherwise using ARG_NUM.  */
+
+static bool
+arg_evaluated_for_scalarization (gfc_intrinsic_sym *function,
+ gfc_actual_arglist &actual_arg, int arg_num)
+{
+  if (function != NULL)
+{
+  switch (function->id)
+	{
+	  case GFC_ISYM_INDEX:
+	if ((actual_arg.name == NULL && arg_num == 3)
+		|| (actual_arg.name != NULL
+		&& strcmp ("kind", actual_arg.name) == 0))
+	  return false;
+	  /* Fallthrough.  */
+
+	  default:
+	break;
+	}
+}
+
+  return true;
+}
+
+
 /* Walk the arguments of an elemental function.
PROC_EXPR is used to check whether an argument is permitted to be absent.  If
it is NULL, we don't do the check and the argument is assumed to be present.
@@ -11467,6 +11520,7 @@ gfc_get_proc_ifc_for_expr (gfc_expr *procedure_ref)
 
 gfc_ss *
 gfc_walk_elemental_function_args (gfc_ss * ss, gfc_actual_arglist *arg,
+  gfc_intrinsic_sym *intrinsic_sym,
   gfc_symbol *proc_ifc, gfc_ss_type type)
 {
   gfc_formal_arglist *dummy_arg;
@@ -11483,10 +11537,13 @@ gfc_walk_elemental_function_args (gfc_ss * ss, gfc_actual_arglist *arg,
   else
 dummy_arg = NULL;
 
+  int arg_num = 0;
   scalar = 1;
   for (; arg; arg = arg->next)
 {
-  if (!arg->expr || arg->expr->expr_type == EXPR_NULL)
+  if (!arg->expr
+	  || arg->expr->expr_type == EXPR_NULL
+	  || !arg_evaluated_for_scalarization (intrinsic_sym, *arg, arg_num))
 	goto loop_continue;
 
   newss = gfc_walk_subexpr (head, arg->expr);
@@ -11519,6 +11576,7 @@ gfc_walk_elemental_function_args (gfc_ss * ss, gfc_actual_arglist *arg,
 }
 
 loop_continue:
+  arg_num++;
   if (dummy_arg != NULL)
 	dummy_arg = dummy_arg->next;
 }
@@ -11579,6 +11637,7 @@ gfc_walk_function_expr (gfc_ss * ss, gfc_expr * expr)
 
   ss = gfc_walk_elemental_function_args (old_ss,
 	 expr->value.function.actual,
+	 gfc_get_intrinsic_for_expr (expr),
 	 gfc_get_proc_ifc_for_expr (expr),
 	 GFC_SS_REFERENCE);
   if (ss != old_ss
diff --git a/gcc/fortran/trans-array.h b/gcc/fortran/trans-array.

[PATCH v3 5/5] fortran: Identify arguments by their names

2021-11-07 Thread Mikael Morin via Gcc-patches

This provides a new function to get the name of a dummy argument,
so that identifying an argument can be made using just its name
instead of a mix of name matching (for keyword actual arguments)
and argument counting (for other actual arguments).

gcc/fortran/ChangeLog:
* interface.c (gfc_dummy_arg_get_name): New function.
* gfortran.h (gfc_dummy_arg_get_name): Declare it.
* trans-array.c (arg_evaluated_for_scalarization): Pass a dummy
argument wrapper as argument instead of an actual argument
and an index number.  Check it’s non-NULL.  Use its name
to identify it.
(gfc_walk_elemental_function_args): Update call to
arg_evaluated for scalarization.  Remove argument counting.
---
 gcc/fortran/gfortran.h|  1 +
 gcc/fortran/interface.c   | 17 +
 gcc/fortran/trans-array.c | 16 +---
 3 files changed, 23 insertions(+), 11 deletions(-)

diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index 4879805ff0b..ac4b3a8b6d4 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -2328,6 +2328,7 @@ struct gfc_dummy_arg
 #define gfc_get_dummy_arg() XCNEW (gfc_dummy_arg)
 
 
+const char * gfc_dummy_arg_get_name (gfc_dummy_arg &);
 const gfc_typespec & gfc_dummy_arg_get_typespec (gfc_dummy_arg &);
 bool gfc_dummy_arg_is_optional (gfc_dummy_arg &);
 
diff --git a/gcc/fortran/interface.c b/gcc/fortran/interface.c
index 36b7a852066..d87088f988d 100644
--- a/gcc/fortran/interface.c
+++ b/gcc/fortran/interface.c
@@ -5500,6 +5500,23 @@ gfc_get_formal_from_actual_arglist (gfc_symbol *sym,
 }
 
 
+const char *
+gfc_dummy_arg_get_name (gfc_dummy_arg & dummy_arg)
+{
+  switch (dummy_arg.intrinsicness)
+{
+case GFC_INTRINSIC_DUMMY_ARG:
+  return dummy_arg.u.intrinsic->name;
+
+case GFC_NON_INTRINSIC_DUMMY_ARG:
+  return dummy_arg.u.non_intrinsic->sym->name;
+
+default:
+  gcc_unreachable ();
+}
+}
+
+
 const gfc_typespec &
 gfc_dummy_arg_get_typespec (gfc_dummy_arg & dummy_arg)
 {
diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c
index d37c1e7ad7f..2090adf01e7 100644
--- a/gcc/fortran/trans-array.c
+++ b/gcc/fortran/trans-array.c
@@ -11492,16 +11492,14 @@ gfc_get_intrinsic_for_expr (gfc_expr *call)
 
 static bool
 arg_evaluated_for_scalarization (gfc_intrinsic_sym *function,
- gfc_actual_arglist &actual_arg, int arg_num)
+ gfc_dummy_arg *dummy_arg)
 {
-  if (function != NULL)
+  if (function != NULL && dummy_arg != NULL)
 {
   switch (function->id)
 	{
 	  case GFC_ISYM_INDEX:
-	if ((actual_arg.name == NULL && arg_num == 3)
-		|| (actual_arg.name != NULL
-		&& strcmp ("kind", actual_arg.name) == 0))
+	if (strcmp ("kind", gfc_dummy_arg_get_name (*dummy_arg)) == 0)
 	  return false;
 	  /* Fallthrough.  */
 
@@ -11532,15 +11530,14 @@ gfc_walk_elemental_function_args (gfc_ss * ss, gfc_actual_arglist *arg,
   head = gfc_ss_terminator;
   tail = NULL;
 
-  int arg_num = 0;
   scalar = 1;
   for (; arg; arg = arg->next)
 {
   gfc_dummy_arg * const dummy_arg = arg->associated_dummy;
   if (!arg->expr
 	  || arg->expr->expr_type == EXPR_NULL
-	  || !arg_evaluated_for_scalarization (intrinsic_sym, *arg, arg_num))
-	goto loop_continue;
+	  || !arg_evaluated_for_scalarization (intrinsic_sym, dummy_arg))
+	continue;
 
   newss = gfc_walk_subexpr (head, arg->expr);
   if (newss == head)
@@ -11570,9 +11567,6 @@ gfc_walk_elemental_function_args (gfc_ss * ss, gfc_actual_arglist *arg,
   while (tail->next != gfc_ss_terminator)
 tail = tail->next;
 }
-
-loop_continue:
-  arg_num++;
 }
 
   if (scalar)


[r12-4976 Regression] FAIL: gfortran.dg/vector_subscript_1.f90 -O3 -g execution test on Linux/x86_64

2021-11-07 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

4898e958a92d45dbf23c0f28bc7552689ba16ecc is the first bad commit
commit 4898e958a92d45dbf23c0f28bc7552689ba16ecc
Author: Jan Hubicka 
Date:   Sun Nov 7 09:35:16 2021 +0100

Implement intra-procedural dataflow in ipa-modref flags propagation.

caused

FAIL: gfortran.dg/vector_subscript_1.f90   -O1  execution test
FAIL: gfortran.dg/vector_subscript_1.f90   -O2  execution test
FAIL: gfortran.dg/vector_subscript_1.f90   -O3 -fomit-frame-pointer 
-funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
FAIL: gfortran.dg/vector_subscript_1.f90   -O3 -g  execution test

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-4976/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg.exp=gfortran.dg/vector_subscript_1.f90 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg.exp=gfortran.dg/vector_subscript_1.f90 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg.exp=gfortran.dg/vector_subscript_1.f90 
--target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg.exp=gfortran.dg/vector_subscript_1.f90 
--target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email, for question about this report, contact me 
at skpgkp2 at gmail dot com)


[COMMITTED] Remove VRP threader.

2021-11-07 Thread Aldy Hernandez via Gcc-patches
Now that things have stabilized, we can remove the old code.

I have left the hybrid threader in tree-ssa-threadedge, even though the
VRP threader was the only user, because we may need it as an interim
step for DOM threading removal.

Tested on x86-64 Linux.

p.s. I tried testing on ppc64le Linux, but it was broken at the time of
testing with:

/tmp/ccSw6Ajh.s: Assembler messages:
/tmp/ccSw6Ajh.s:399: Error: unrecognized opcode: `mtvsrws'
make[3]: *** [printf/quadmath-printf.lo] Error 1

gcc/ChangeLog:

* tree-pass.h (make_pass_vrp_threader): Remove.
* tree-ssa-threadbackward.c
(back_threader_profitability::profitable_path_p): Remove
ASSERT_EXPR references.
* tree-ssa-threadedge.c (jt_state::register_equivs_stmt): Same.
* tree-vrp.c (vrp_folder::simplify_casted_conds): Same.
(execute_vrp): Same.
(class hybrid_threader): Remove.
(hybrid_threader::hybrid_threader): Remove.
(hybrid_threader::~hybrid_threader): Remove.
(hybrid_threader::before_dom_children): Remove.
(hybrid_threader::after_dom_children): Remove.
(execute_vrp_threader): Remove.
(class pass_vrp_threader): Remove.
(make_pass_vrp_threader): Remove.
---
 gcc/tree-pass.h   |   1 -
 gcc/tree-ssa-threadbackward.c |   2 -
 gcc/tree-ssa-threadedge.c |  12 +--
 gcc/tree-vrp.c| 137 +-
 4 files changed, 5 insertions(+), 147 deletions(-)

diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index e807ad855ef..d494aff1c4c 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -463,7 +463,6 @@ extern gimple_opt_pass *make_pass_copy_prop (gcc::context 
*ctxt);
 extern gimple_opt_pass *make_pass_isolate_erroneous_paths (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_early_vrp (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_vrp (gcc::context *ctxt);
-extern gimple_opt_pass *make_pass_vrp_threader (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_uncprop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_return_slot (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_reassoc (gcc::context *ctxt);
diff --git a/gcc/tree-ssa-threadbackward.c b/gcc/tree-ssa-threadbackward.c
index 0085ad01cdc..f9485bf9046 100644
--- a/gcc/tree-ssa-threadbackward.c
+++ b/gcc/tree-ssa-threadbackward.c
@@ -710,8 +710,6 @@ back_threader_profitability::profitable_path_p (const 
vec &m_path,
return false;
  /* Do not count empty statements and labels.  */
  if (gimple_code (stmt) != GIMPLE_NOP
- && !(gimple_code (stmt) == GIMPLE_ASSIGN
-  && gimple_assign_rhs_code (stmt) == ASSERT_EXPR)
  && !is_gimple_debug (stmt))
n_insns += estimate_num_insns (stmt, &eni_size_weights);
}
diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
index a63a9764ff8..f693db17064 100644
--- a/gcc/tree-ssa-threadedge.c
+++ b/gcc/tree-ssa-threadedge.c
@@ -1334,25 +1334,19 @@ jt_state::register_equivs_stmt (gimple *stmt, 
basic_block bb,
  to expose more context sensitive equivalences which in turn may
  allow us to simplify the condition at the end of the loop.
 
- Handle simple copy operations as well as implied copies from
- ASSERT_EXPRs.  */
+ Handle simple copy operations.  */
   tree cached_lhs = NULL;
   if (gimple_assign_single_p (stmt)
   && TREE_CODE (gimple_assign_rhs1 (stmt)) == SSA_NAME)
 cached_lhs = gimple_assign_rhs1 (stmt);
-  else if (gimple_assign_single_p (stmt)
-  && TREE_CODE (gimple_assign_rhs1 (stmt)) == ASSERT_EXPR)
-cached_lhs = TREE_OPERAND (gimple_assign_rhs1 (stmt), 0);
   else
 {
-  /* A statement that is not a trivial copy or ASSERT_EXPR.
+  /* A statement that is not a trivial copy.
 Try to fold the new expression.  Inserting the
 expression into the hash table is unlikely to help.  */
   /* ???  The DOM callback below can be changed to setting
 the mprts_hook around the call to thread_across_edge,
-avoiding the use substitution.  The VRP hook should be
-changed to properly valueize operands itself using
-SSA_NAME_VALUE in addition to its own lattice.  */
+avoiding the use substitution.  */
   cached_lhs = gimple_fold_stmt_to_constant_1 (stmt,
   threadedge_valueize);
   if (NUM_SSA_OPERANDS (stmt, SSA_OP_ALL_USES) != 0
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index 5380508a9ec..dd7723629ba 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -43,7 +43,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "cfgloop.h"
 #include "tree-scalar-evolution.h"
 #include "tree-ssa-propagate.h"
-#include "tree-ssa-threadedge.h"
 #include "domwalk.h"
 #include "vr-values.h"
 #include "gimple-array-bounds.h"
@@ -3684,9 +3683,7 @@ vrp_asserts::all_imm_us

[patch, fortran, wwwdocs] Fix name of argument to CO_REDUCE

2021-11-07 Thread Thomas Koenig via Gcc-patches

Hello world,

the attached patches fix the name of the function argument to CO_REDUCE
to conform to Fortran 2018 instead of the TR.

This is a user-visible change, so I have put this both into changes.html
and porting_to.html.

Regression-tested.  OK for trunk?

Best regards

Thomas

Author: Thomas Koenig 
Date:   2021-11-07 15:38:35 +0100

Fix keyword name for co_reduce.

gcc/fortran/ChangeLog:

* intrinsic.c (add_subroutines): Change keyword "operator"
to the correct one, "operation".
* check.c (gfc_check_co_reduce): Change OPERATOR to
OPERATION in error messages.

gcc/testsuite/ChangeLog:

* gfortran.dg/co_reduce_2.f90: New test.
* gfortran.dg/coarray_collective_16.f90: Change OPERATOR
to OPERATION.
* gfortran.dg/coarray_collective_9.f90: Likewise.

Co-authored by: Steve Kargl 

diff --git a/gcc/fortran/check.c b/gcc/fortran/check.c
index 6ea6e136d4f..15772009af4 100644
--- a/gcc/fortran/check.c
+++ b/gcc/fortran/check.c
@@ -2265,7 +2265,7 @@ gfc_check_co_reduce (gfc_expr *a, gfc_expr *op, gfc_expr *result_image,
   attr = gfc_expr_attr (op);
   if (!attr.pure || !attr.function)
 {
-  gfc_error ("OPERATOR argument at %L must be a PURE function",
+  gfc_error ("OPERATION argument at %L must be a PURE function",
 		 &op->where);
   return false;
 }
@@ -2292,7 +2292,7 @@ gfc_check_co_reduce (gfc_expr *a, gfc_expr *op, gfc_expr *result_image,
 
   if (!formal || !formal->next || formal->next->next)
 {
-  gfc_error ("The function passed as OPERATOR at %L shall have two "
+  gfc_error ("The function passed as OPERATION at %L shall have two "
 		 "arguments", &op->where);
   return false;
 }
@@ -2303,7 +2303,7 @@ gfc_check_co_reduce (gfc_expr *a, gfc_expr *op, gfc_expr *result_image,
   if (!gfc_compare_types (&a->ts, &sym->result->ts))
 {
   gfc_error ("The A argument at %L has type %s but the function passed as "
-		 "OPERATOR at %L returns %s",
+		 "OPERATION at %L returns %s",
 		 &a->where, gfc_typename (a), &op->where,
 		 gfc_typename (&sym->result->ts));
   return false;
@@ -2311,7 +2311,7 @@ gfc_check_co_reduce (gfc_expr *a, gfc_expr *op, gfc_expr *result_image,
   if (!gfc_compare_types (&a->ts, &formal->sym->ts)
   || !gfc_compare_types (&a->ts, &formal->next->sym->ts))
 {
-  gfc_error ("The function passed as OPERATOR at %L has arguments of type "
+  gfc_error ("The function passed as OPERATION at %L has arguments of type "
 		 "%s and %s but shall have type %s", &op->where,
 		 gfc_typename (&formal->sym->ts),
 		 gfc_typename (&formal->next->sym->ts), gfc_typename (a));
@@ -2322,7 +2322,7 @@ gfc_check_co_reduce (gfc_expr *a, gfc_expr *op, gfc_expr *result_image,
   || formal->next->sym->attr.allocatable || formal->sym->attr.pointer
   || formal->next->sym->attr.pointer)
 {
-  gfc_error ("The function passed as OPERATOR at %L shall have scalar "
+  gfc_error ("The function passed as OPERATION at %L shall have scalar "
 		 "nonallocatable nonpointer arguments and return a "
 		 "nonallocatable nonpointer scalar", &op->where);
   return false;
@@ -2330,21 +2330,21 @@ gfc_check_co_reduce (gfc_expr *a, gfc_expr *op, gfc_expr *result_image,
 
   if (formal->sym->attr.value != formal->next->sym->attr.value)
 {
-  gfc_error ("The function passed as OPERATOR at %L shall have the VALUE "
+  gfc_error ("The function passed as OPERATION at %L shall have the VALUE "
 		 "attribute either for none or both arguments", &op->where);
   return false;
 }
 
   if (formal->sym->attr.target != formal->next->sym->attr.target)
 {
-  gfc_error ("The function passed as OPERATOR at %L shall have the TARGET "
+  gfc_error ("The function passed as OPERATION at %L shall have the TARGET "
 		 "attribute either for none or both arguments", &op->where);
   return false;
 }
 
   if (formal->sym->attr.asynchronous != formal->next->sym->attr.asynchronous)
 {
-  gfc_error ("The function passed as OPERATOR at %L shall have the "
+  gfc_error ("The function passed as OPERATION at %L shall have the "
 		 "ASYNCHRONOUS attribute either for none or both arguments",
 		 &op->where);
   return false;
@@ -2352,7 +2352,7 @@ gfc_check_co_reduce (gfc_expr *a, gfc_expr *op, gfc_expr *result_image,
 
   if (formal->sym->attr.optional || formal->next->sym->attr.optional)
 {
-  gfc_error ("The function passed as OPERATOR at %L shall not have the "
+  gfc_error ("The function passed as OPERATION at %L shall not have the "
 		 "OPTIONAL attribute for either of the arguments", &op->where);
   return false;
 }
@@ -2383,14 +2383,14 @@ gfc_check_co_reduce (gfc_expr *a, gfc_expr *op, gfc_expr *result_image,
 	   || (formal_size2 && actual_size != formal_size2)))
 	{
 	  gfc_error ("The character length of the A argument at %L and of the "
-		 "arguments of the OP

Re: [patch, fortran, wwwdocs] Fix name of argument to CO_REDUCE

2021-11-07 Thread Harald Anlauf via Gcc-patches

Hi Thomas,

Am 07.11.21 um 19:18 schrieb Thomas Koenig via Fortran:

Hello world,

the attached patches fix the name of the function argument to CO_REDUCE
to conform to Fortran 2018 instead of the TR.

This is a user-visible change, so I have put this both into changes.html
and porting_to.html.

Regression-tested.  OK for trunk?

Best regards

 Thomas

Author: Thomas Koenig 
Date:   2021-11-07 15:38:35 +0100

     Fix keyword name for co_reduce.

     gcc/fortran/ChangeLog:

     * intrinsic.c (add_subroutines): Change keyword "operator"
     to the correct one, "operation".
     * check.c (gfc_check_co_reduce): Change OPERATOR to
     OPERATION in error messages.

     gcc/testsuite/ChangeLog:

     * gfortran.dg/co_reduce_2.f90: New test.
     * gfortran.dg/coarray_collective_16.f90: Change OPERATOR
     to OPERATION.
     * gfortran.dg/coarray_collective_9.f90: Likewise.

     Co-authored by: Steve Kargl 



Steve added an update to gcc/fortran/intrinsic.texi in the PR which is
missing here.

OK if it is committed with the above.

Thanks for the patch!

Harald


Re: [PATCH 11/18] rs6000: Builtin expansion, part 6

2021-11-07 Thread Segher Boessenkool
Hi!

On Sun, Nov 07, 2021 at 09:28:09AM -0600, Bill Schmidt wrote:
> Thank you for all of the reviews!  I appreciate your hard work and thorough 
> study of the patches.
> 
> I've updated these 6 patches and combined them into 1, pushed today.  There 
> are still a couple of cleanups I haven't done, but I made note in the code
> where these are needed.

I did not approve the testsuite one, it needs more work?


Segher


Re: [PATCH] PR middle-end/103059: reload: Also accept ASHIFT with indexed addressing

2021-11-07 Thread Maciej W. Rozycki
On Fri, 5 Nov 2021, Hans-Peter Nilsson wrote:

> >  I was trying to chase another target I could use to regression-test this
> > with that does do scaled indexed addressing while still using old reload.
> > The i386 port would be a good candidate, but it has switched to LRA long
> > ago with no option to use old reload, and I think there would be little
> > point in adding one just for the sake of such verification.  Do we have
> > any other port actually that could be affected by this change?
> 
> That'd be cris-elf.

 Good to know, thanks!

 How do I run regression-testing with this target however?  I can see QEMU
support upstream, even for user-mode Linux, which would be the easiest to 
run (sadly toolchain support for CRIS/Linux was removed a while ago as was 
the Linux kernel port; at one point I even considered getting myself a 
CRIS development board as an alternative RISC platform that would Linux, 
but concluded that it was too expensive for the features it offered), but 
for a bare metal environment both a C library (newlib?) and then a 
specific board support package is required.

 Or may I ask you to put this patch through testing with your environment?

> Your proposed patch reminded me of 6cb68940dcf9; giving reload a
> reload-specific insn_and_split pattern to play with, matching
> "mult" outside of a mem.  I *guess* that's the CRIS-specific
> replacement to c605a8bf9270.

 Possibly, except for the missing reload bits making it incomplete.

  Maciej


Re: [PATCH] PR middle-end/103059: reload: Also accept ASHIFT with indexed addressing

2021-11-07 Thread Maciej W. Rozycki
On Wed, 3 Nov 2021, Maciej W. Rozycki wrote:

>  I have not yet tracked down which change after commit c605a8bf9270 made 
> regressions appear in the test suites, however clearly the commit wasn't 
> as complete a change as it should have been.  I'll see if I can find it 
> and will mention it in the final commit description if there is anything 
> useful in that.

 So regrettably it's my original commit c605a8bf9270 that caused these 
regressions.  I don't know how it happened that it slipped through in 
testing; I may have had the change left in the tree that enabled LRA and 
did not notice it.  Sorry about that, umm.

>  As noted in the commit description this has been regression-tested with 
> commit 4a960d548b7d^.  I'm running regression-testing with GCC 11 right 
> now as well and expect results by the end of week.

 For the record the change does fix numerous regressions in GCC 11 too, 
e.g.:

=== gcc Summary ===

# of expected passes122244
# of unexpected failures1852
# of expected passes122428
# of unexpected failures1484
# of unexpected successes   5
# of expected failures  721
# of unresolved testcases   10

So either this change, if deemed safe enough, will have to go into GCC 11, 
or, if we're going to keep old reload yet for GCC 12, then the best course 
of action for GCC 11 might be reverting commit c605a8bf9270, although at 
the cost of regressing code quality a bit where it does build.

 Suggestions?

  Maciej


Re: [PATCH 0/2] fortran: Ignore unused arguments for scalarisation [PR97896]

2021-11-07 Thread Harald Anlauf via Gcc-patches

Hi Mikael,

thanks for working on this!

Am 07.11.21 um 17:17 schrieb Mikael Morin via Gcc-patches:

Hello,

I repost this patch series initially targetted at the 11 branch only [1],
and that I now would like to commit to master as well before.

The problematic case is intrinsic procedures where an argument is actually
not used in the code generated (KIND argument of INDEX in the testcase),
which confuses the scalariser.

Thomas König comitted a change to workaround the problem, but it regressed
in PR97896.  These patch put the workaround where I think it is more
appropriate, namely at the beginning of the scalarisation procedure.
This is the patch 2 of the series, preceded with the revert in patch 1.
I intend to commit both of them squashed together.

Regression-tested on x86_64-linux-gnu.  Ok for master and 11 branch?


Changes from v1:

   Rebase on master.


[1] https://gcc.gnu.org/pipermail/fortran/2021-August/056329.html


Mikael Morin (2):
   Revert "Remove KIND argument from INDEX so it does not mess up
 scalarization."
   fortran: Ignore unused args in scalarization [PR97896]

  gcc/fortran/intrinsic.c   | 48 +++--
  gcc/fortran/intrinsic.h   |  3 +-
  gcc/fortran/iresolve.c| 21 ++---
  gcc/fortran/trans-array.c | 61 ++-
  gcc/fortran/trans-array.h |  3 ++
  gcc/fortran/trans-decl.c  | 24 +--
  gcc/fortran/trans-intrinsic.c |  1 +
  gcc/fortran/trans-stmt.c  | 20 +
  gcc/testsuite/gfortran.dg/index_5.f90 | 23 ++
  9 files changed, 121 insertions(+), 83 deletions(-)
  create mode 100644 gcc/testsuite/gfortran.dg/index_5.f90



LGTM at first sight.  But you may want to wait for Tobias or Thomas to
take a second look.

Thanks for the patch!

Harald


Re: [PATCH] Objective-C: fix protocol list count type (pertinent to non-LP64)

2021-11-07 Thread Matt Jacobson via Gcc-patches


> On Oct 25, 2021, at 5:43 AM, Iain Sandoe  wrote:
> 
> Did you test objective-c++ on Darwin?
> 
> I see a lot of fails of the form:
> Excess errors:
> : error: initialization of a flexible array member [-Wpedantic]

Looked into this.  It’s happening because obj-c++.dg/dg.exp has:

set DEFAULT_OBJCXXFLAGS " -ansi -pedantic-errors -Wno-long-long"

Specifically, the `-pedantic-errors` argument prohibits initialization of a 
flexible array member.  Notably, this flag does *not* appear in objc/dg.exp.

Admittedly I didn’t know that initialization of a FAM was prohibited by the 
standard.  It’s allowed by GCC, though, as documented here:



Is it OK to use a GCC extension this way in the Objective-C frontend?

> For a patch that changes code-gen we should have a test that it produces 
> what’s
> expected (in general, a ‘torture' test would be preferrable so that we can be 
> sure the
> output is as expected for different optimisation levels). 

The output is different only for targets where 
sizeof (long) != sizeof (void *).  Do we have the ability to run “cross” 
torture tests?  Could such a test verify the emitted assembly (like LLVM’s 
FileCheck tests do)?  Or would it need to execute something?

Thanks for your help!

Matt

PING [PATCH] restore ancient -Waddress for weak symbols [PR33925]

2021-11-07 Thread Martin Sebor via Gcc-patches

Ping:
https://gcc.gnu.org/pipermail/gcc-patches/2021-October/582415.html

On 10/23/21 5:06 PM, Martin Sebor wrote:

On 10/4/21 3:37 PM, Jason Merrill wrote:

On 10/4/21 14:42, Martin Sebor wrote:

While resolving the recent -Waddress enhancement request (PR
PR102103) I came across a 2007 problem report about GCC 4 having
stopped warning for using the address of inline functions in
equality comparisons with null.  With inline functions being
commonplace in C++ this seems like an important use case for
the warning.

The change that resulted in suppressing the warning in these
cases was introduced inadvertently in a fix for PR 22252.

To restore the warning, the attached patch enhances
the decl_with_nonnull_addr_p() function to return true also for
weak symbols for which a definition has been provided.


I think you probably want to merge this function with 
fold-const.c:maybe_nonzero_address, which already handles more cases.


maybe_nonzero_address() doesn't behave quite like
decl_with_nonnull_addr_p() expects and I'm reluctant to muck
around with the former too much since it's used for codegen,
while the latter just for warnings.  (There is even a case
where the functions don't behave the same, and would result
in different warnings between C and C++ without some extra
help.)

So in the attached revision I just have maybe_nonzero_address()
call decl_with_nonnull_addr_p() and then refine the failing
(or uncertain) cases separately, with some overlap between
them.

Since I worked on this someone complained that some instances
of the warning newly enhanced under PR102103 aren't suppresed
in code resulting from macro expansion.  Since it's trivial,
I include the fix for that report in this patch as well.

Tested on x86_64-linux.

Martin




Re: [PATCH] c++: Implement -Wuninitialized for mem-initializers (redux) [PR19808]

2021-11-07 Thread Martin Sebor via Gcc-patches

On 11/5/21 5:22 PM, Marek Polacek via Gcc-patches wrote:

2021 update: Last year I posted a version of this patch:

but it didn't make it in.  The main objection seemed to be that the
patch tried to do too much, and overlapped with the ME uninitialized
warnings.  Since the patch used walk_tree without any data flow info,
it issued false positives for things like a(0 ? b : 42) and similar.

I'll admit I've been dreading resurrecting this because of the lack
of clarity about where we should warn about what.  On the other hand,
I think we really should do something about this.  So I've simplified
the original patch as much as it seemed reasonable.  For instance, it
doesn't even attempt to handle cases like "a((b = 42)), c(b)" -- for
these I simply give up for the whole mem-initializer (but who writes
code like that, anyway?).  I also give up when a member is initialized
with a function call, because we don't know what the call could do.


I (still) believe this is a great improvement :)

Thinking about your last sentence above and experimenting with
the patch just a little, we do know that function calls cannot
"initialize" (by assigning a value) const or reference members,
so those can be assumed to be uninitiazed when used before
their initialized has been evaluated.  As in:

int f (int);

struct A
{
  int x;
  const int y;
  A ():
x (f (y)),   << y used before initialized
y (1) { }
};

It should also be reasonable for the prurposes of warnings to
assume that a call to a const member function (or any call where
an object is passed by a const reference) doesn't change any
non-mutable members.  The middle end issues -Wmaybe-uninitialized
when it sees an uninitualized object passed to a const-qualified
pointer or reference.

At the same time, a call to a non-const member function can assign
a value to a not-yet-initialized member, so I wonder if its uses
subsequent to it should be accepted without warning:

struct A
{
  int x, y, z;
  int f () { y = 1; return 2; }
  A ():
x (f ()),
z (y)   << avoid warning here?
  { }
};


See Wuninitialized-17.C, for which clang emits a false positive but
we don't.  I remember having a hard time dealing with initializer lists
in my previous patch, so now I only handle simple a{b} cases, but no
more.  It turned out that this abridged version still warns about 90%
cases where users would expect a warning.

More complicated cases are left for the ME, which, for unused inline
functions, will only warn with -fkeep-inline-functions, but so be it.

This patch implements the long-desired -Wuninitialized warning for
member initializer lists, so that the front end can detect bugs like

   struct A {
 int a;
 int b;
 A() : b(1), a(b) { }
   };

where the field 'b' is used uninitialized because the order of member
initializers in the member initializer list is irrelevant; what matters
is the order of declarations in the class definition.

I've implemented this by keeping a hash set holding fields that are not
initialized yet, so at first it will be {a, b}, and after initializing
'a' it will be {b} and so on.  Then I use walk_tree to walk the
initializer and if we see that an uninitialized object is used, we warn.
Of course, when we use the address of the object, we may not warn:

   struct B {
 int &r;
 int *p;
 int a;
 B() : r(a), p(&a), a(1) { } // ok
   };

Likewise, don't warn in unevaluated contexts such as sizeof.  Classes
without an explicit initializer may still be initialized by their
default constructors; whether or not something is considered initialized
is handled in perform_member_init, see member_initialized_p.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?


I don't really know my way around the C++ front end very well
so I didn't look at the code carefully, but I have a couple of
suggestions below on the choice of wording and the tests.



PR c++/19808
PR c++/96121

gcc/cp/ChangeLog:

* init.c (perform_member_init): Remove a forward declaration.
Walk the initializer using find_uninit_fields_r.  New parameter
to track uninitialized fields.  If a member is initialized,
remove it from the hash set.
(perform_target_ctor): Return the initializer.
(struct find_uninit_data): New class.
(find_uninit_fields_r): New function.
(emit_mem_initializers): Keep and initialize a set holding fields
that are not initialized.  When handling delegating constructors,
walk the constructor tree using find_uninit_fields_r.  Also when
initializing base clases.  Pass uninitialized down to
perform_member_init.

gcc/ChangeLog:

* doc/invoke.texi: Update documentation for -Wuninitialized.
* tree.c (stabilize_reference): Set location.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wuninitialized-14.C: New test.
* g++.dg/warn/Wuninitialized-

Re: [PATCH 1/2] [Gimple] Simplify (trunc)fmax/fmin((extend)a, (extend)b) to MAX/MIN(a,b)

2021-11-07 Thread Hongtao Liu via Gcc-patches
On Fri, Nov 5, 2021 at 5:52 PM Richard Biener
 wrote:
>
> On Fri, Nov 5, 2021 at 6:38 AM liuhongt  wrote:
> >
> > a and b are same type as trunc type and has less precision than
> > extend type, the transformation is guarded by flag_finite_math_only.
> >
> > Bootstrapped and regtested under x86_64-pc-linux-gnu{-m32,}
> > Ok for trunk?
> >
> > gcc/ChangeLog:
> >
> > PR target/102464
> > * match.pd: Simplify (trunc)fmax/fmin((extend)a, (extend)b) to
> > MAX/MIN(a,b)
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/i386/pr102464-maxmin.c: New test.
> > ---
> >  gcc/match.pd  | 14 ++
> >  .../gcc.target/i386/pr102464-maxmin.c | 44 +++
> >  2 files changed, 58 insertions(+)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr102464-maxmin.c
> >
> > diff --git a/gcc/match.pd b/gcc/match.pd
> > index f63079023d0..857ce7f712a 100644
> > --- a/gcc/match.pd
> > +++ b/gcc/match.pd
> > @@ -6182,6 +6182,20 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> > && direct_internal_fn_supported_p (IFN_COPYSIGN,
> >   type, OPTIMIZE_FOR_BOTH))
> >  (IFN_COPYSIGN @0 @1
> > +
> > +(for maxmin (max min)
> > + (simplify
> > +  (convert (maxmin (convert@2 @0) (convert @1)))
> > +   (if (flag_finite_math_only
>
> I suppose you are concerned about infinities, not about NaNs.
> Please use !HONOR_INFINITIES (@2) then (in general testing
> flag_* is frowned upon).  You may want to do the FLOAT_TYPE_P
> tests first.
I'm concerned about NANs since MAX/MIN_EXPR are different from IEEE
minimum and maximum operations at NAN operations.
So i think i'd use MODE_HAS_NANS(@2)?
>
> > +   && optimize
> > +   && FLOAT_TYPE_P (type)
> > +   && FLOAT_TYPE_P (TREE_TYPE (@2))
> > +   && types_match (type, TREE_TYPE (@0))
> > +   && types_match (type, TREE_TYPE (@1))
> > +   && TYPE_PRECISION (type) < TYPE_PRECISION (TREE_TYPE (@2))
> > +   && optab_handler (maxmin == MAX_EXPR ? smax_optab : smin_optab,
> > +   TYPE_MODE (type)) != CODE_FOR_nothing)
> > +(maxmin @0 @1
> >  #endif
> >
> >  (for froms (XFLOORL XCEILL XROUNDL XRINTL)
> > diff --git a/gcc/testsuite/gcc.target/i386/pr102464-maxmin.c 
> > b/gcc/testsuite/gcc.target/i386/pr102464-maxmin.c
> > new file mode 100644
> > index 000..37867235a6c
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/i386/pr102464-maxmin.c
> > @@ -0,0 +1,44 @@
> > +/* PR target/102464.  */
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2 -mavx512fp16 -mavx512vl -ffast-math -ftree-vectorize 
> > -mtune=generic -mfpmath=sse" } */
> > +/* { dg-final { scan-assembler-times "vmaxph" 3 } }  */
> > +/* { dg-final { scan-assembler-times "vminph" 3 } }  */
> > +/* { dg-final { scan-assembler-times "vmaxsh" 3 } }  */
> > +/* { dg-final { scan-assembler-times "vminsh" 3 } }  */
> > +/* { dg-final { scan-assembler-times "vmaxps" 2 } }  */
> > +/* { dg-final { scan-assembler-times "vminps" 2 } }  */
> > +/* { dg-final { scan-assembler-times "vmaxss" 2 } }  */
> > +/* { dg-final { scan-assembler-times "vminss" 2 } }  */
> > +/* { dg-final { scan-assembler-times "vmaxpd" 1 } }  */
> > +/* { dg-final { scan-assembler-times "vminpd" 1 } }  */
> > +/* { dg-final { scan-assembler-times "vmaxsd" 1 } }  */
> > +/* { dg-final { scan-assembler-times "vminsd" 1 } }  */
> > +
> > +#include
> > +#define FOO(CODE,TYPE,SUFFIX)  \
> > +  void \
> > +  foo_vect_##CODE##TYPE##SUFFIX (TYPE* __restrict a, TYPE* b, TYPE* c) \
> > +  {\
> > +for (int i = 0; i != 8; i++)   \
> > +  a[i] = CODE##SUFFIX (b[i], c[i]);
> > \
> > +  }\
> > +  TYPE \
> > +  foo_##CODE##TYPE##SUFFIX (TYPE b, TYPE c)\
> > +  {\
> > +return CODE##l (b, c); \
> > +  }
> > +
> > +FOO (fmax, _Float16, f);
> > +FOO (fmax, _Float16,);
> > +FOO (fmax, _Float16, l);
> > +FOO (fmin, _Float16, f);
> > +FOO (fmin, _Float16,);
> > +FOO (fmin, _Float16, l);
> > +
> > +FOO (fmax, float,);
> > +FOO (fmax, float, l);
> > +FOO (fmin, float,);
> > +FOO (fmin, float, l);
> > +
> > +FOO (fmax, double, l);
> > +FOO (fmin, double, l);
> > --
> > 2.18.1
> >



--
BR,
Hongtao


Re: [PATCH] i386: Support complex fma/conj_fma for _Float16.

2021-11-07 Thread Hongtao Liu via Gcc-patches
On Fri, Nov 5, 2021 at 3:09 PM Kong, Lingling via Gcc-patches
 wrote:
>
> Hi,
>
> This patch is to support cmla_optab, cmul_optab, cmla_conj_optab, 
> cmul_conj_optab for vector _Float16.
> Ok for master?
LGTM.
> gcc/ChangeLog:
>
> * config/i386/sse.md (cmul3): add new define_expand.
> (cmla4): Likewise
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/avx512fp16-vector-complex-float.c: New test.
> ---
>  gcc/config/i386/sse.md| 23 +++
>  .../i386/avx512fp16-vector-complex-float.c| 40 +++
>  2 files changed, 63 insertions(+)
>  create mode 100644 
> gcc/testsuite/gcc.target/i386/avx512fp16-vector-complex-float.c
>
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 
> 0a7f5b178f9..8d3fef0a31a 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -5922,6 +5922,12 @@
>  (UNSPEC_COMPLEX_FMUL "fmulc")
>  (UNSPEC_COMPLEX_FCMUL "fcmulc")])
>
> +(define_int_attr conj_op
> +   [(UNSPEC_COMPLEX_FMA "")
> +(UNSPEC_COMPLEX_FCMA "_conj")
> +(UNSPEC_COMPLEX_FMUL "")
> +(UNSPEC_COMPLEX_FCMUL "_conj")])
> +
>  (define_mode_attr complexmove
>[(V32HF "avx512f_loadv16sf")
> (V16HF "avx512vl_loadv8sf")
> @@ -6003,6 +6009,15 @@
>DONE;
>  })
>
> +(define_expand "cmla4"
> +  [(set (match_operand:VF_AVX512FP16VL 0 "register_operand")
> +   (unspec:VF_AVX512FP16VL
> +   [(match_operand:VF_AVX512FP16VL 1 "vector_operand")
> +(match_operand:VF_AVX512FP16VL 2 "vector_operand")
> +(match_operand:VF_AVX512FP16VL 3 "vector_operand")]
> +UNSPEC_COMPLEX_F_C_MA))]
> +  "TARGET_AVX512FP16")
> +
>  (define_insn "fma__"
>[(set (match_operand:VF_AVX512FP16VL 0 "register_operand" "=&v")
> (unspec:VF_AVX512FP16VL
> @@ -6084,6 +6099,14 @@
>[(set_attr "type" "ssemuladd")
> (set_attr "mode" "")])
>
> +(define_expand "cmul3"
> +  [(set (match_operand:VF_AVX512FP16VL 0 "register_operand")
> +   (unspec:VF_AVX512FP16VL
> + [(match_operand:VF_AVX512FP16VL 1 "vector_operand")
> +  (match_operand:VF_AVX512FP16VL 2 "vector_operand")]
> +  UNSPEC_COMPLEX_F_C_MUL))]
> +  "TARGET_AVX512FP16")
> +
>  (define_insn "__"
>[(set (match_operand:VF_AVX512FP16VL 0 "register_operand" "=&v")
>   (unspec:VF_AVX512FP16VL
> diff --git a/gcc/testsuite/gcc.target/i386/avx512fp16-vector-complex-float.c 
> b/gcc/testsuite/gcc.target/i386/avx512fp16-vector-complex-float.c
> new file mode 100644
> index 000..bcb957f0de0
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/avx512fp16-vector-complex-float.c
> @@ -0,0 +1,40 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mavx512fp16 -mavx512vl" } */
> +/* { dg-final { scan-assembler-times "vfmaddcph\[ \\t\]" 1 } } */
> +/* { dg-final { scan-assembler-not "vfmadd\[123]*ph\[ \\t\]"} } */
> +/* { dg-final { scan-assembler-not "vfmadd\[123]*sh\[ \\t\]"} } */
> +/* { dg-final { scan-assembler-times "vfcmaddcph\[ \\t\]" 1 } } */
> +/* { dg-final { scan-assembler-times "vfmulcph\[ \\t\]" 1 } } */
> +/* { dg-final { scan-assembler-times "vfcmulcph\[ \\t\]" 1 } } */
> +
> +#include
> +#define TYPE _Float16
> +#define N 16
> +
> +void fma0 (_Complex TYPE *a, _Complex TYPE *b,
> +   _Complex TYPE *c)
> +{
> +  for (int i = 0; i < N; i++)
> +c[i] += a[i] * b[i];
> +}
> +
> +void fmaconj (_Complex TYPE a[restrict N], _Complex TYPE b[restrict N],
> + _Complex TYPE c[restrict N])
> +{
> +  for (int i = 0; i < N; i++)
> +c[i] += a[i] * ~b[i];
> +}
> +
> +void fmul (_Complex TYPE a[restrict N], _Complex TYPE b[restrict N],
> +  _Complex TYPE c[restrict N])
> +{
> +  for (int i = 0; i < N; i++)
> +c[i] = a[i] * b[i];
> +}
> +
> +void fmulconj (_Complex TYPE a[restrict N], _Complex TYPE b[restrict N],
> +  _Complex TYPE c[restrict N])
> +{
> +  for (int i = 0; i < N; i++)
> +c[i] = a[i] * ~b[i];
> +}
> --
> 2.18.1
>


-- 
BR,
Hongtao


Re: [PATCH] i386: Optimization for mm512_set1_pch.

2021-11-07 Thread Hongtao Liu via Gcc-patches
On Fri, Nov 5, 2021 at 3:20 PM Kong, Lingling via Gcc-patches
 wrote:
>
> Hi,
>
> This patch is to support fold _mm512_fmadd_pch (a, _mm512_set1_pch(*(b)), c) 
> to 1 instruction vfmaddcph (%rsp){1to16}, %zmm1, %zmm2.
> OK for master?
>
LGTM.
> gcc/ChangeLog:
>
> * config/i386/sse.md (fma___pair):
> Add new define_insn.
> (fma__fmaddc_bcst): Add new define_insn_and_split.
> (fma__fcmaddc_bcst): Likewise
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/avx512fp16vl-complex-broadcast-1.c: New test.
> ---
>  gcc/config/i386/sse.md| 62 +++
>  .../i386/avx512fp16vl-complex-broadcast-1.c   | 25 
>  2 files changed, 87 insertions(+)
>  create mode 100644 
> gcc/testsuite/gcc.target/i386/avx512fp16vl-complex-broadcast-1.c
>
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 
> 0a7f5b178f9..eba8e77515f 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -193,7 +193,9 @@
>
>;; For AVX512FP16 suppport
>UNSPEC_COMPLEX_FMA
> +  UNSPEC_COMPLEX_FMA_PAIR
>UNSPEC_COMPLEX_FCMA
> +  UNSPEC_COMPLEX_FCMA_PAIR
>UNSPEC_COMPLEX_FMUL
>UNSPEC_COMPLEX_FCMUL
>UNSPEC_COMPLEX_MASK
> @@ -5913,6 +5915,9 @@
>  (define_int_iterator UNSPEC_COMPLEX_F_C_MA
> [UNSPEC_COMPLEX_FMA UNSPEC_COMPLEX_FCMA])
>
> +(define_int_iterator UNSPEC_COMPLEX_F_C_MA_PAIR
> +   [UNSPEC_COMPLEX_FMA_PAIR UNSPEC_COMPLEX_FCMA_PAIR])
> +
>  (define_int_iterator UNSPEC_COMPLEX_F_C_MUL
> [UNSPEC_COMPLEX_FMUL UNSPEC_COMPLEX_FCMUL])
>
> @@ -5922,6 +5927,10 @@
>  (UNSPEC_COMPLEX_FMUL "fmulc")
>  (UNSPEC_COMPLEX_FCMUL "fcmulc")])
>
> +(define_int_attr complexpairopname
> +   [(UNSPEC_COMPLEX_FMA_PAIR "fmaddc")
> +(UNSPEC_COMPLEX_FCMA_PAIR "fcmaddc")])
> +
>  (define_mode_attr complexmove
>[(V32HF "avx512f_loadv16sf")
> (V16HF "avx512vl_loadv8sf")
> @@ -6067,6 +6076,59 @@
>   [(match_dup 1) (match_dup 2) (match_dup 4)]
>UNSPEC_COMPLEX_F_C_MA))])
>
> +(define_insn "fma___pair"
> + [(set (match_operand:VF1_AVX512VL 0 "register_operand" "=&v")
> +   (unspec:VF1_AVX512VL
> +[(match_operand:VF1_AVX512VL 1 "vector_operand" "%v")
> + (match_operand:VF1_AVX512VL 2 "bcst_vector_operand" "vmBr")
> + (match_operand:VF1_AVX512VL 3 "vector_operand" "0")]
> + UNSPEC_COMPLEX_F_C_MA_PAIR))]
> + "TARGET_AVX512FP16"
> + "vph\t{%2, %1, %0|%0, %1, %2}"
> + [(set_attr "type" "ssemuladd")])
> +
> +(define_insn_and_split "fma__fmaddc_bcst"
> +  [(set (match_operand:VF_AVX512FP16VL 0 "register_operand")
> +   (unspec:VF_AVX512FP16VL
> + [(match_operand:VF_AVX512FP16VL 1 "vector_operand")
> +  (subreg:VF_AVX512FP16VL
> +(match_operand: 2 "bcst_vector_operand") 0)
> +  (match_operand:VF_AVX512FP16VL 3 "vector_operand")]
> +  UNSPEC_COMPLEX_FMA))]
> +  "TARGET_AVX512FP16"
> +  "#"
> +  "&& 1"
> +  [(set (match_dup 0)
> +   (unspec:
> + [(match_dup 1) (match_dup 2) (match_dup 3)]
> +  UNSPEC_COMPLEX_FMA_PAIR))]
> +  {
> +operands[0] = lowpart_subreg (mode, operands[0], mode);
> +operands[1] = lowpart_subreg (mode, operands[1], mode);
> +operands[3] = lowpart_subreg (mode, operands[3],
> +mode);
> +  })
> +
> +(define_insn_and_split "fma__fcmaddc_bcst"
> +  [(set (match_operand:VF_AVX512FP16VL 0 "register_operand")
> +   (unspec:VF_AVX512FP16VL
> + [(match_operand:VF_AVX512FP16VL 1 "vector_operand")
> +  (subreg:VF_AVX512FP16VL
> +(match_operand: 2 "bcst_vector_operand") 0)
> +  (match_operand:VF_AVX512FP16VL 3 "vector_operand")]
> +  UNSPEC_COMPLEX_FCMA))]
> +  "TARGET_AVX512FP16"
> +  "#"
> +  "&& 1"
> +  [(set (match_dup 0)
> +   (unspec:
> + [(match_dup 1) (match_dup 2) (match_dup 3)]
> +  UNSPEC_COMPLEX_FCMA_PAIR))]
> +  {
> +operands[0] = lowpart_subreg (mode, operands[0], mode);
> +operands[1] = lowpart_subreg (mode, operands[1], mode);
> +operands[3] = lowpart_subreg (mode, operands[3],
> +mode);
> +  })
> +
>  (define_insn "___mask"
>[(set (match_operand:VF_AVX512FP16VL 0 "register_operand" "=&v")
> (vec_merge:VF_AVX512FP16VL
> diff --git a/gcc/testsuite/gcc.target/i386/avx512fp16vl-complex-broadcast-1.c 
> b/gcc/testsuite/gcc.target/i386/avx512fp16vl-complex-broadcast-1.c
> new file mode 100644
> index 000..3c8e84230f3
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/avx512fp16vl-complex-broadcast-1.c
> @@ -0,0 +1,25 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mavx512fp16 -mavx512vl" } */
> +/* { dg-final { scan-assembler-times "\[^\n\]*\\\{1to4\\\}" 2 } }  */
> +/* { dg-final { scan-assembler-times "\[^\n\]*\\\{1to8\\\}" 2 } }  */
> +/* { dg-final { scan-assembler-times "\[^\n\]*\\\{1to16\\\}" 2 } }  */
> +
> +#include 
> +
> +volatile __m512h res0, a0, c0;
> +volatile __m256h res1, a1, c1;
> +volatile __m128h res2, a2, c2;
> +volatile _Fl

Re: [PATCH v1 1/7] LoongArch Port: gcc

2021-11-07 Thread Chenghua Xu
This patch does not arrive at  mail list. Send as an attachment in a 
compressed format.


v1-0001-LoongArch-Port-gcc.patch.tar.xz
Description: application/xz


[PING] [PATCH v2] tree-optimization/101186 - extend FRE with "equivalence map" for condition prediction

2021-11-07 Thread Di Zhao OS via Gcc-patches
Hi,

Gentle ping on this.

Di Zhao

-Original Message-
From: Di Zhao OS 
Sent: Monday, October 25, 2021 3:03 AM
To: Richard Biener 
Cc: gcc-patches@gcc.gnu.org
Subject: RE: [PATCH v2] tree-optimization/101186 - extend FRE with "equivalence 
map" for condition prediction

Hi,

Attached is a new version of the patch, mainly for improving performance
and simplifying the code.

First, regarding the comments:

> -Original Message-
> From: Richard Biener 
> Sent: Friday, October 1, 2021 9:00 PM
> To: Di Zhao OS 
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH v2] tree-optimization/101186 - extend FRE with
> "equivalence map" for condition prediction
> 
> On Thu, Sep 16, 2021 at 8:13 PM Di Zhao OS
>  wrote:
> >
> > Sorry about updating on this after so long. It took me much time to work 
> > out a
> > new plan and pass the tests.
> >
> > The new idea is to use one variable to represent a set of equal variables at
> > some basic-block. This variable is called a "equivalence head" or 
> > "equiv-head"
> > in the code. (There's no-longer a "equivalence map".)
> >
> > - Initially an SSA_NAME's "equivalence head" is its value number. Temporary
> >   equivalence heads are recorded as unary NOP_EXPR results in the 
> > vn_nary_op_t
> >   map. Besides, when inserting into vn_nary_op_t map, make the new result at
> >   front of the vn_pval list, so that when searching for a variable's
> >   equivalence head, the first result represents the largest equivalence set 
> > at
> >   current location.
> > - In vn_ssa_aux_t, maintain a list of references to valid_info->nary entry.
> >   For recorded equivalences, the reference is result->entry; for normal 
> > N-ary
> >   operations, the reference is operand->entry.
> > - When recording equivalences, if one side A is constant or has more refs, 
> > make
> >   it the new equivalence head of the other side B. Traverse B's ref-list, 
> > if a
> >   variable C's previous equiv-head is B, update to A. And re-insert B's 
> > n-ary
> >   operations by replacing B with A.
> > - When inserting and looking for the results of n-ary operations, insert and
> >   lookup by the operands' equiv-heads.
> > ...
> >
> > Thanks,
> > Di Zhao
> >
> > 
> > Extend FRE with temporary equivalences.
> 
> Comments on the patch:
> 
> +  /* nary_ref count.  */
> +  unsigned num_nary_ref;
> +
> 
> I think a unsigned short should be enough and that would nicely
> pack after value_id together with the bitfield (maybe change that
> to unsigned short :1 then).

Changed num_nary_ref to unsigned short and moved after value_id.

> @@ -7307,17 +7839,23 @@ process_bb (rpo_elim &avail, basic_block bb,
> tree val = gimple_simplify (gimple_cond_code (last),
> boolean_type_node, lhs, rhs,
> NULL, vn_valueize);
> +   vn_nary_op_t vnresult = NULL;
> /* If the condition didn't simplfy see if we have recorded
>an expression from sofar taken edges.  */
> if (! val || TREE_CODE (val) != INTEGER_CST)
>   {
> -   vn_nary_op_t vnresult;
> 
> looks like you don't need vnresult outside of the if()?

vnresult is reused later to record equivalences generated by PHI nodes.

> +/* Find predicated value of vn_nary_op by the operands' equivalences.  Return
> + * NULL_TREE if no known result is found.  */
> +
> +static tree
> +find_predicated_value_by_equivs (vn_nary_op_t vno, basic_block bb,
> +vn_nary_op_t *vnresult)
> +{
> +  lookup_equiv_heads (vno->length, vno->op, vno->op, bb);
> +  tree result
> += simplify_nary_op (vno->length, vno->opcode, vno->op, vno->type);
> 
> why is it necessary to simplify here?  It looks like the caller
> already does this.

In the new patch, changed the code a little to remove redundant calculation.

> I wonder whether it's valid to always perform find_predicated_value_by_equivs
> from inside vn_nary_op_get_predicated_value instead of bolting it to only
> a single place?

Removed function find_predicated_value_by_equivs and inlined the code.

Because lookup_equiv_head uses vn_nary_op_get_predicated_value, so I left
vn_nary_op_get_predicated_value unchanged. Instead, operands are set to
equiv-heads in init_vn_nary_op_from_stmt. So altogether, predicates are always
inserted and searched by equiv-heads.

> +
> +static vn_nary_op_t
> +val_equiv_insert (tree op1, tree op2, edge e)
> +{
> 
> +  if (is_gimple_min_invariant (lhs))
> +std::swap (lhs, rhs);
> +  if (is_gimple_min_invariant (lhs) || TREE_CODE (lhs) != SSA_NAME)
> +/* Possible if invoked from record_equiv_from_previous_cond.  */
> +return NULL;
> 
> Better formulate all of the above in terms of only SSA_NAME checks since...
> 
> +  /* Make the hand-side with more recorded n-ary expressions new
> +   * equivalence-head, to make fewer re-insertions.  */
> +  if (TREE_CODE (rhs) == SSA_NAME
> +  && VN_INFO (rhs)->n

Re: [PATCH] PR middle-end/103059: reload: Also accept ASHIFT with indexed addressing

2021-11-07 Thread Hans-Peter Nilsson
On Sun, 7 Nov 2021, Maciej W. Rozycki wrote:
> On Fri, 5 Nov 2021, Hans-Peter Nilsson wrote:
>
> > >  I was trying to chase another target I could use to regression-test this
> > > with that does do scaled indexed addressing while still using old reload.
> > > The i386 port would be a good candidate, but it has switched to LRA long
> > > ago with no option to use old reload, and I think there would be little
> > > point in adding one just for the sake of such verification.  Do we have
> > > any other port actually that could be affected by this change?
> >
> > That'd be cris-elf.
>
>  Good to know, thanks!
>
>  How do I run regression-testing with this target however?  I can see QEMU
> support upstream, even for user-mode Linux, which would be the easiest to
> run (sadly toolchain support for CRIS/Linux was removed a while ago as was
> the Linux kernel port; at one point I even considered getting myself a
> CRIS development board as an alternative RISC platform that would Linux,
> but concluded that it was too expensive for the features it offered), but
> for a bare metal environment both a C library (newlib?) and then a
> specific board support package is required.

Classic "bare-metal" whatever-elf testing should not be a
stranger: sim and binutils support are in place in the official
binutils+gdb git, as is newlib in that git and since many
dejagnu releases a cris-sim.exp baseboard file.  Just build and
install binutils and sim for cris-elf (can probable be done at
the same time/same builds from a binutils-and-gdb checkout, but
separate builds are sometimes necessary) then build and test gcc
from a combined-source-tree containing newlib and gcc.
(Instructions for combining trees may be salvaged from the
rottening https://gcc.gnu.org/simtest-howto.html but actually I
roll tarballs and untar gcc over an (untarred) newlib tree.)

I don't have a baseboard file for QEMU, sorry.

>  Or may I ask you to put this patch through testing with your environment?

Where's the fun in that? :)
(I thought you'd use 6cb68940dcf9 and do the same for VAX.)

> > Your proposed patch reminded me of 6cb68940dcf9; giving reload a
> > reload-specific insn_and_split pattern to play with, matching
> > "mult" outside of a mem.  I *guess* that's the CRIS-specific
> > replacement to c605a8bf9270.
>
>  Possibly, except for the missing reload bits making it incomplete.

No, my thinking was that it wouldn't be needed.  But, I didn't
have a close look and maybe the problem isn't exactly the same
or VAX has additional caveats.  Also, that reload-pacifying
pattern *is* a target-specific workaround for a reload bug, but
a risk-free one for other targets.

brgds, H-P
PS. I'll fire up a round with that patch "tomorrow".  Film at 11.


Re: [PATCH v3 1/4] Fix loop split incorrect count and probability

2021-11-07 Thread Xionghu Luo via Gcc-patches



On 2021/10/27 15:44, Jan Hubicka wrote:
>> On Wed, 27 Oct 2021, Jan Hubicka wrote:
>>

 gcc/ChangeLog:

* tree-ssa-loop-split.c (split_loop): Fix incorrect probability.
(do_split_loop_on_cond): Likewise.
 ---
  gcc/tree-ssa-loop-split.c | 25 -
  1 file changed, 16 insertions(+), 9 deletions(-)

 diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c
 index 3f6ad046623..d30782888f3 100644
 --- a/gcc/tree-ssa-loop-split.c
 +++ b/gcc/tree-ssa-loop-split.c
 @@ -575,7 +575,11 @@ split_loop (class loop *loop1)
stmts2);
tree cond = build2 (guard_code, boolean_type_node, guard_init, border);
if (!initial_true)
 -cond = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond); 
 +cond = fold_build1 (TRUTH_NOT_EXPR, boolean_type_node, cond);
 +
 +  edge true_edge = EDGE_SUCC (bbs[i], 0)->flags & EDGE_TRUE_VALUE
 + ? EDGE_SUCC (bbs[i], 0)
 + : EDGE_SUCC (bbs[i], 1);
  
/* Now version the loop, placing loop2 after loop1 connecting
   them, and fix up SSA form for that.  */
 @@ -583,10 +587,10 @@ split_loop (class loop *loop1)
basic_block cond_bb;
  
class loop *loop2 = loop_version (loop1, cond, &cond_bb,
 - profile_probability::always (),
 - profile_probability::always (),
 - profile_probability::always (),
 - profile_probability::always (),
 + true_edge->probability,
 + true_edge->probability.invert (),
 + true_edge->probability,
 + true_edge->probability.invert (),
   true);
>>>
>>> As discussed yesterday, for loop of form
>>>
>>> for (...)
>>>   if (cond)
>>> cond = something();
>>>   else
>>> something2
>>>
>>> Split as
>>
>> Note that you are missing to conditionalize loop1 execution
>> on 'cond' (not sure if that makes a difference).
> You are right - forgot to mention that.
> 
> Entry conditional makes no difference on scaling stmts inside loop but
> affects its header and expected trip count. We however need to set up
> probability of this conditional (and preheader count if it exists)
> There is no general way to read the probability of this initial
> conditional from cfg profile.  So I guess we are stuck with guessing
> some arbitrary value. I guess common case is that cond is true first
> iteration tough and often we can easily see that fromo PHI node
> initializing the test variable.
> 
> Other thing that changes is expected number of iterations of the split
> loops, so we may want to update the exit conditinal probability
> accordingly...
> 
Sorry for the late reply.  The below updated patch mainly solves the issues
you pointed out:
  - profile count proportion for both original loop and copied loop
without dropping down the true branch's count;
  - probability update in the two loops and between the two loops;
  - number of iterations update/check for split_loop.


[PATCH v3] Fix loop split incorrect count and probability

In tree-ssa-loop-split.c, split_loop and split_loop_on_cond does two
kind of split. split_loop only works for single loop and insert edge at
exit when split, while split_loop_on_cond is not limited to single loop
and insert edge at latch when split.  Both split behavior should consider
loop count and probability update.  For split_loop, loop split condition
is moved in front of loop1 and loop2; But split_loop_on_cond moves the
condition between loop1 and loop2, this patch does:
1) profile count proportion for both original loop and copied loop
without dropping down the true branch's count;
2) probability update in and between the two loops;
3) number of iterations update for split_loop.

Regression tested pass, OK for master?

Changes diff for split_loop and split_loop_on_cond cases:

1) diff base/loop-split.c.151t.lsplit patched/loop-split.c.152t.lsplit
...
[local count: 118111600]:
   if (beg_5(D) < end_8(D))
 goto ; [89.00%]
   else
 goto ; [11.00%]

[local count: 105119324]:
   if (beg2_6(D) < c_9(D))
-goto ; [100.00%]
+goto ; [33.00%]
   else
-goto ; [100.00%]
+goto ; [67.00%]

-   [local count: 105119324]:
+   [local count: 34689377]:
   _25 = beg_5(D) + 1;
   _26 = end_8(D) - beg_5(D);
   _27 = beg2_6(D) + _26;
   _28 = MIN_EXPR ;

-   [local count: 955630225]:
+   [local count: 315357973]:
   # i_16 = PHI 
   # j_17 = PHI 
   printf ("a: %d %d\n", i_16, j_17);
   i_11 = i_16 + 1;
   j_12 = j_17 + 1;
   if (j_12 < _28)
-goto ; [89.00%]
+goto ; [29.37%]
   else
-goto ; [11.00%]
+goto ; [70.63%]

-   

Improve optimization of some builtins

2021-11-07 Thread Jan Hubicka via Gcc-patches
Hi,
for nested functions we output call to builtin_dwarf_cfa which
initializes frame entry used only for debugging.  This however
prevents us from detecting functions containing nested functions
as const/pure or analyze side effects in modref.

builtin_dwarf_cfa is not documented and I wonder if it should be turned to
internal function. But I think we could consider functions using it const even
if in theory one can do things like test the return address and see the
difference between different frame addreses.

While doing so I also noticed that special_buitin_state handles quite few
builtins that are not special cased by ipa-modref.  They do not make
user visible loads/stores and thus I think they shoul dbe annotated by
".c" to make this explicit for both modref and PTA.

Finally I aded dwarf_cfa and similar return_address to list of simple
bulitins since it compiles to simple stack frame load and we consider
several other builtins doing so simple. 

lto-bootstrapped/regtested all languages on x86_64-linux, seems sane?

Honza

* builtins.c (is_simple_builtin): Add builitin_dwarf_cfa
and builtin_return_address.
(builtin_fnspec): Annotate builtin_return,
bulitin_eh_pointer, builtin_eh_filter, builtin_unwind_resume,
builtin_cxa_end_cleanup, builtin_eh_copy_values,
builtin_frame_address, builtin_apply_args,
builtin_asan_before_dynamic_init, builtin_asan_after_dynamic_init,
builtin_prefetch, builtin_dwarf_cfa, builtin_return_addrss
as ".c"
* ipa-pure-const.c (special_builtin_state): Add builtin_dwarf_cfa
and builtin_return_address.
diff --git a/gcc/builtins.c b/gcc/builtins.c
index 7d0f61fc98b..43433e8d6ce 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -10711,6 +10711,8 @@ is_simple_builtin (tree decl)
   case BUILT_IN_VA_END:
   case BUILT_IN_STACK_SAVE:
   case BUILT_IN_STACK_RESTORE:
+  case BUILT_IN_DWARF_CFA:
+  case BUILT_IN_RETURN_ADDRESS:
/* Exception state returns or moves registers around.  */
   case BUILT_IN_EH_FILTER:
   case BUILT_IN_EH_POINTER:
@@ -11099,6 +11099,19 @@ builtin_fnspec (tree callee)
   CASE_BUILT_IN_TM_STORE (M256):
return ".cO ";
   case BUILT_IN_STACK_SAVE:
+  case BUILT_IN_RETURN:
+  case BUILT_IN_EH_POINTER:
+  case BUILT_IN_EH_FILTER:
+  case BUILT_IN_UNWIND_RESUME:
+  case BUILT_IN_CXA_END_CLEANUP:
+  case BUILT_IN_EH_COPY_VALUES:
+  case BUILT_IN_FRAME_ADDRESS:
+  case BUILT_IN_APPLY_ARGS:
+  case BUILT_IN_ASAN_BEFORE_DYNAMIC_INIT:
+  case BUILT_IN_ASAN_AFTER_DYNAMIC_INIT:
+  case BUILT_IN_PREFETCH:
+  case BUILT_IN_DWARF_CFA:
+  case BUILT_IN_RETURN_ADDRESS:
return ".c";
   case BUILT_IN_ASSUME_ALIGNED:
return "1cX ";
diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c
index a84a4eb7ac0..e5048092939 100644
--- a/gcc/ipa-pure-const.c
+++ b/gcc/ipa-pure-const.c
@@ -529,6 +529,8 @@ special_builtin_state (enum pure_const_state_e *state, bool 
*looping,
   case BUILT_IN_APPLY_ARGS:
   case BUILT_IN_ASAN_BEFORE_DYNAMIC_INIT:
   case BUILT_IN_ASAN_AFTER_DYNAMIC_INIT:
+  case BUILT_IN_DWARF_CFA:
+  case BUILT_IN_RETURN_ADDRESS:
*looping = false;
*state = IPA_CONST;
return true;


Re: [PATCH 0/4] config: Allow a host to opt out of PCH.

2021-11-07 Thread Richard Biener via Gcc-patches
On Fri, Nov 5, 2021 at 5:37 PM Iain Sandoe  wrote:
>
>
>
> > On 5 Nov 2021, at 15:25, Jakub Jelinek  wrote:
> >
> > On Fri, Nov 05, 2021 at 11:31:58AM +0100, Richard Biener wrote:
> >> On Fri, Nov 5, 2021 at 10:54 AM Jakub Jelinek  wrote:
> >>>
> >>> On Fri, Nov 05, 2021 at 10:42:05AM +0100, Richard Biener via Gcc-patches 
> >>> wrote:
>  I had the impression we have support for PCH file relocation to deal 
>  with ASLR
>  at least on some platforms.
> >>>
> >>> Unfortunately we do not, e.g. if you build cc1/cc1plus as PIE on
> >>> x86_64-linux, PCH will stop working unless one always invokes it with
> >>> disabled ASLR through personality.
> >>>
> >>> I think this is related to function pointers and pointers to .rodata/.data
> >>> etc. variables in GC memory, we currently do not relocate that.
> >>>
> >>> What we perhaps could do is (at least assuming all the ELF PT_LOAD 
> >>> segments
> >>> are adjacent with a single load base for them - I think at least ia64
> >>> non-PIE binaries were violating this by having .text and .data PT_LOAD
> >>> segments many terrabytes appart with a whole in between not protected in 
> >>> any
> >>> way, but dunno if that is for PIEs too), perhaps try in a host
> >>> specific way remember the address range in which the function pointers and
> >>> .rodata/.data can exist, remember the extent start and end from PCH 
> >>> generation
> >>> and on PCH load query those addresses for the current compiler and 
> >>> relocate
> >>> everything in that extent by the load bias from the last run.
> >>> But, the assumption for this is that those function and data/rodata 
> >>> pointers
> >>> in GC memory are actually marked at least as pointers...
> >>
> >> If any such pointers exist they must be marked GTY((skip)) since they do 
> >> not
> >> point to GC memory...  So we'd need to invent special-handling for those.
> >>
> >>> Do we e.g. have objects with virtual classes in GC memory and if so, do we
> >>> catch their virtual table pointers?
> >>
> >> Who knows, but then I don't remember adding stuff that should end in a PCH.
> >
> > So, I've investigated a little bit.
> > Apparently all the relocation we currently do for PCH is done at PCH write
> > time, we choose some address range in the address space we think will be 
> > likely
> > mmappable each time successfully, relocate all pointers pointing to GC
> > memory to point in there and then write that to file, together with the
> > scalar GTY global vars values and GTY pointers in global vars.
> > On PCH load, we just try to mmap memory in the right range, fail PCH load if
> > unsuccessful, and read the GC memory into that range and update scalar and
> > pointer GTY global vars from what we've recorded.
> > Patch that made PCH load to fail for PIEs etc. was
> > https://gcc.gnu.org/legacy-ml/gcc-patches/2003-10/msg01994.html
> > If we wanted to relocate pointers to functions and .data/.rodata etc.,
> > ideally we'd create a relocation list of addresses that should be
> > incremented by the bias and quickly relocate those.
>
> It is hard to judge the relative effort in the two immediately visible 
> solutions:
>
> 1. relocatable PCH
> 2. taking the tree streamer from the modules implementation, moving its home
> to c-family and adding hooks so that each FE can stream its own special 
> trees.
>
> ISTM, that part of the reason people dislike PCH is because the 
> implementation is
> mixed up with the GC solution - the rendering is non-transparent etc.

Yes.  In particular it stands in the way of even thinking of doing sth
different than
GC for trees.

> So, in some ways, (2) above would be a better investment - the process of PCH 
> is:
> generate:
> “get to the end of parsing a TU” .. stream the AST
> consume:
> .. see a header .. stream the PCH AST in if there is one available for the 
> header.
>
> There is no reason for this to be mixed into the GC solution - the read in 
> (currently)
> happens to an empty TU and there should be nothing in the AST that carries any
> reference to the compiler’s executable.

It makes the PCH read-in "cheap" - IIRC Google invested quite some work in
evaluating different PC* approaches but none in the end made a big enough
difference.  Given we now have a standards backed PCH-like thing for C++
with modules and given that for C (besides on Darwin...) PCH never made much
sense I doubt investing into generalizing the C++ module support or making
PCH relocatable is worth the trouble.

Richard.

>
> just 0.02 GBP.
> Iain
>
>
> >
> > I wrote following ugly hack:
> >
> > --- ggc-common.c.jj   2021-08-19 11:42:27.365422400 +0200
> > +++ ggc-common.c  2021-11-05 15:37:51.447222544 +0100
> > @@ -404,6 +404,9 @@ struct mmap_info
> >
> > /* Write out the state of the compiler to F.  */
> >
> > +char *exestart = (char *) 2;
> > +char *exeend = (char *) 2;
> > +
> > void
> > gt_pch_save (FILE *f)
> > {
> > @@ -458,6 +461,14 @@ gt_pch_save (FILE *f)
> > for (rti = *rt; rti->base != NULL; rti

Re: [PATCH 0/4] config: Allow a host to opt out of PCH.

2021-11-07 Thread Iain Sandoe



> On 8 Nov 2021, at 07:16, Richard Biener  wrote:
> 
> On Fri, Nov 5, 2021 at 5:37 PM Iain Sandoe  wrote:
>> 
>> 
>> 
>>> On 5 Nov 2021, at 15:25, Jakub Jelinek  wrote:
>>> 
>>> On Fri, Nov 05, 2021 at 11:31:58AM +0100, Richard Biener wrote:
 On Fri, Nov 5, 2021 at 10:54 AM Jakub Jelinek  wrote:
> 
> On Fri, Nov 05, 2021 at 10:42:05AM +0100, Richard Biener via Gcc-patches 
> wrote:
>> I had the impression we have support for PCH file relocation to deal 
>> with ASLR
>> at least on some platforms.
> 
> Unfortunately we do not, e.g. if you build cc1/cc1plus as PIE on
> x86_64-linux, PCH will stop working unless one always invokes it with
> disabled ASLR through personality.
> 
> I think this is related to function pointers and pointers to .rodata/.data
> etc. variables in GC memory, we currently do not relocate that.
> 
> What we perhaps could do is (at least assuming all the ELF PT_LOAD 
> segments
> are adjacent with a single load base for them - I think at least ia64
> non-PIE binaries were violating this by having .text and .data PT_LOAD
> segments many terrabytes appart with a whole in between not protected in 
> any
> way, but dunno if that is for PIEs too), perhaps try in a host
> specific way remember the address range in which the function pointers and
> .rodata/.data can exist, remember the extent start and end from PCH 
> generation
> and on PCH load query those addresses for the current compiler and 
> relocate
> everything in that extent by the load bias from the last run.
> But, the assumption for this is that those function and data/rodata 
> pointers
> in GC memory are actually marked at least as pointers...
 
 If any such pointers exist they must be marked GTY((skip)) since they do 
 not
 point to GC memory...  So we'd need to invent special-handling for those.
 
> Do we e.g. have objects with virtual classes in GC memory and if so, do we
> catch their virtual table pointers?
 
 Who knows, but then I don't remember adding stuff that should end in a PCH.
>>> 
>>> So, I've investigated a little bit.
>>> Apparently all the relocation we currently do for PCH is done at PCH write
>>> time, we choose some address range in the address space we think will be 
>>> likely
>>> mmappable each time successfully, relocate all pointers pointing to GC
>>> memory to point in there and then write that to file, together with the
>>> scalar GTY global vars values and GTY pointers in global vars.
>>> On PCH load, we just try to mmap memory in the right range, fail PCH load if
>>> unsuccessful, and read the GC memory into that range and update scalar and
>>> pointer GTY global vars from what we've recorded.
>>> Patch that made PCH load to fail for PIEs etc. was
>>> https://gcc.gnu.org/legacy-ml/gcc-patches/2003-10/msg01994.html
>>> If we wanted to relocate pointers to functions and .data/.rodata etc.,
>>> ideally we'd create a relocation list of addresses that should be
>>> incremented by the bias and quickly relocate those.
>> 
>> It is hard to judge the relative effort in the two immediately visible 
>> solutions:
>> 
>> 1. relocatable PCH
>> 2. taking the tree streamer from the modules implementation, moving its home
>>to c-family and adding hooks so that each FE can stream its own special 
>> trees.
>> 
>> ISTM, that part of the reason people dislike PCH is because the 
>> implementation is
>> mixed up with the GC solution - the rendering is non-transparent etc.
> 
> Yes.  In particular it stands in the way of even thinking of doing sth
> different than
> GC for trees.
> 
>> So, in some ways, (2) above would be a better investment - the process of 
>> PCH is:
>> generate:
>> “get to the end of parsing a TU” .. stream the AST
>> consume:
>> .. see a header .. stream the PCH AST in if there is one available for the 
>> header.
>> 
>> There is no reason for this to be mixed into the GC solution - the read in 
>> (currently)
>> happens to an empty TU and there should be nothing in the AST that carries 
>> any
>> reference to the compiler’s executable.
> 
> It makes the PCH read-in "cheap" - IIRC Google invested quite some work in
> evaluating different PC* approaches but none in the end made a big enough
> difference.

that was presumably looking for a more efficient streamer - where we care more 
now
about a less invasive streamer.

>  Given we now have a standards backed PCH-like thing for C++
> with modules and given that for C (besides on Darwin…)

For the record, it’s not actually Darwin that is calling for it - it’s 
Objective-C (which has
large header uses in much the same way as C++) - so it would/will affect GNUStep
in the same way.

> PCH never made much
> sense I doubt investing into generalizing the C++ module support or making
> PCH relocatable is worth the trouble.

There are certainly things higher on my TODO …
Iain

> 
> 
> Richard.

Re: [PATCH] libsanitizer: Disable libbacktrace on sanitizer_platform_limits_freebsd.cpp

2021-11-07 Thread Gerald Pfeifer
On Thu, 4 Nov 2021, H.J. Lu wrote:
>> Ok.  But please after committing mention the revision in
>> libsanitizer/LOCAL_PATCHES.
> include and libsanitizer should use 2 separate patches.  The 
> libsanitizer patch should be in libsanitizer/LOCAL_PATCHES.

Okay, thanks.

This is the first part I committed on Friday, the second will 
follow today.

Gerald


commit 44d9d55c6d0e3a1e26427662d30f350a80282634
Author: Gerald Pfeifer 
Date:   Fri Nov 5 12:56:07 2021 +0100

include: Allow for our md5.h to defer to the system header

This came up in the context of libsanitizer, where platform-specific
support for FreeBSD relies on aspects provided by FreeBSD's own md5.h.

Address this by allowing GCC's md5.h to pull in the system header
instead, controlled by a new macro USE_SYSTEM_MD5.

2021-11-05  Gerald Pfeifer  
Jakub Jelinek  

include/
* md5.h (USE_SYSTEM_MD5): Introduce.

diff --git a/include/md5.h b/include/md5.h
index 03f7d29afc7..c5bb6076969 100644
--- a/include/md5.h
+++ b/include/md5.h
@@ -21,6 +21,10 @@
 #ifndef _MD5_H
 #define _MD5_H 1
 
+#ifdef USE_SYSTEM_MD5
+#include_next 
+#else
+
 #include 
 
 #if defined HAVE_LIMITS_H || _LIBC
@@ -151,4 +155,6 @@ extern void *md5_buffer (const char *buffer, size_t len, 
void *resblock);
 }
 #endif
 
+#endif // USE_SYSTEM_MD5
+
 #endif


[ping] Small adjustments to Aarch64 and i386 back-ends

2021-11-07 Thread Eric Botcazou via Gcc-patches
For the Aarch64 back-end:
  https://gcc.gnu.org/pipermail/gcc-patches/2021-October/580952.html

For the i386 back-end:
  https://gcc.gnu.org/pipermail/gcc-patches/2021-October/580992.html

Thanks in advance.

-- 
Eric Botcazou