Re: [PATCH 0/4] IVOPTs consider step cost for different forms when unrolling

2020-06-03 Thread Richard Biener
On Wed, 3 Jun 2020, Kewen.Lin wrote:

> Hi Richi,
> 
> on 2020/6/2 下午7:38, Richard Biener wrote:
> > On Thu, 28 May 2020, Kewen.Lin wrote:
> > 
> >> Hi,
> >>
> >> This is one repost and you can refer to the original series 
> >> via https://gcc.gnu.org/pipermail/gcc-patches/2020-January/538360.html.
> >>
> >> As we discussed in the thread
> >> https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00196.html
> >> Original: https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00104.html,
> >> I'm working to teach IVOPTs to consider D-form group access during 
> >> unrolling.
> >> The difference on D-form and other forms during unrolling is we can put the
> >> stride into displacement field to avoid additional step increment. eg:
> >>
> >> With X-form (uf step increment):
> >>   ...
> >>   LD A = baseA, X
> >>   LD B = baseB, X
> >>   ST C = baseC, X
> >>   X = X + stride
> >>   LD A = baseA, X
> >>   LD B = baseB, X
> >>   ST C = baseC, X
> >>   X = X + stride
> >>   LD A = baseA, X
> >>   LD B = baseB, X
> >>   ST C = baseC, X
> >>   X = X + stride
> >>   ...
> >>
> >> With D-form (one step increment for each base):
> >>   ...
> >>   LD A = baseA, OFF
> >>   LD B = baseB, OFF
> >>   ST C = baseC, OFF
> >>   LD A = baseA, OFF+stride
> >>   LD B = baseB, OFF+stride
> >>   ST C = baseC, OFF+stride
> >>   LD A = baseA, OFF+2*stride
> >>   LD B = baseB, OFF+2*stride
> >>   ST C = baseC, OFF+2*stride
> >>   ...
> >>   baseA += stride * uf
> >>   baseB += stride * uf
> >>   baseC += stride * uf
> >>
> >> Imagining that if the loop get unrolled by 8 times, then 3 step updates 
> >> with
> >> D-form vs. 8 step updates with X-form. Here we only need to check stride
> >> meet D-form field requirement, since if OFF doesn't meet, we can construct
> >> baseA' with baseA + OFF.
> > 
> > I'd just mention there are other targets that have the choice between
> > the above forms.  Since IVOPTs itself does not perform the unrolling
> > the IL it produces is the same, correct?
> > 
> Yes.  Before this patch, IVOPTs doesn't consider the unrolling impacts,
> it only models things based on what it sees.  We can assume it thinks
> later RTL unrolling won't perform.
> 
> With this patch, since the IV choice probably changes, the IL can probably
> change.  The typical difference with this patch is:
> 
>   vect__1.7_15 = MEM[symbol: x, index: ivtmp.19_22, offset: 0B];
> vs.
>   vect__1.7_15 = MEM[base: _29, offset: 0B];

So we're asking IVOPTS "if we were unrolling this loop would you make
a different IV choice?" thus I wonder why we need so much complexity
here?  That is, if we can classify the loop as being possibly unrolled
we could evaluate IVOPTs IV choice (and overall cost) on the original
loop and in a second run on the original loop with fake IV uses
added with extra offset.  If the overall IV cost is similar we'll
take the unroll friendly choice if the costs are way different
(I wouldn't expect this to be the case ever?) I'd side with the
IV choice when not unrolling (and mark the loop as to be not unrolled).

Thus I'd err on the side of not unrolling but leave the ultimate choice
of whether to unroll to RTL unless IV cost makes that prohibitive.

Even without X- or D- form addressing modes the IV choice may differ
and I think we don't need extra knobs for the unroller but instead
can decide to set the existing n_unroll to zero (force not unroll)
when costs say it would be bad?

Richard.

> BR,
> Kewen
> 
> > Richard.
> > 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


Re: [PATCH] FAT library support for libatomic, libgfortran, libgomp, libstdc++

2020-06-03 Thread Thomas Koenig via Gcc-patches

Hi David,


After trying various options, the best solution seems to be the
inclusion of target-specific Makefile fragments. Directly adding the
rules to the Makefiles conflicts with Automake conditionals syntax.
And Makefile fragments are easily extendible to other targets, such as
Darwin. It avoids regenerating Makefile.in or configure for each
change.


So, at the moment, this is a no-op for everything but AIX, but some
other architectures might want to make use of it.

Looks like a good approach to me, with just one request: Could you add a
comment to Makefile.am what the line

include $(srcdir)/empty.mk $(tmake_file)

does?

libfortran changes are OK.

Regards

Thomas


Re: [stage1][PATCH] Lower VEC_COND_EXPR into internal functions.

2020-06-03 Thread Richard Biener via Gcc-patches
On Tue, Jun 2, 2020 at 5:00 PM Martin Liška  wrote:
>
> On 6/2/20 1:09 PM, Richard Biener wrote:
> > So please be constructive.  Like, provide a testcase that ICEs
> > with the FAILs replaced by gcc_unreachable ().  Martin, may I suggest
> > to do this replacement and bootstrap/test?  I think it would be nice
> > to have testsuite coverage for the FAILs, and maybe we have that
> > already.
>
> Hello.
>
> There's the suggested patch that survives bootstrap on ppc64le-linux-gnu
> and passes test-suite.

OK, so can you please re-post the version of the VEC_COND_EXPR
patch that uses a regular IFN (without the static non-FAIL checking)
in a new thread?  If there's no OK from rs6000 maintainers to remove
the FAILs then we'll go ahead with that version, unless Richard objects
here.

Thanks,
Richard.

> Martin


Re: [patch] Make memory copy functions scalar storage order barriers

2020-06-03 Thread Richard Biener via Gcc-patches
On Tue, Jun 2, 2020 at 7:11 PM Eric Botcazou  wrote:
>
> [Starting with tne VN issue]
>
> > For the VN case the issue is interpreting a read via memcpy (non
> > reverse-storage) as a reverse-storage one when matching up with a hashtable
> > entry from a regular reverse-storage order store, correct?
>
> It's a read from a scalar (hence native order) combined with a store to an
> aggregate type with reverse order:
>
>u = 305419896;
>   __builtin_memcpy (&tempb, &u, 4);
>   _1 = tempb.val;
>
> > But likewise whether the pointer destination had the attribute doesn't
> > correctly qualify this - instead I think we are already properly matching up
> > the reverse storage property?
>
> No, we do not do direct matching for the storage order because you may not
> access the same memory location with different scalar storage orders, it's the
> fundamental invariant, so it's redundant with e.g. the offset for matching.
>
> You need to byte swap if you want to propagate 305419896 to _1 above.  But
> it's an easy case: suppose that tempb is made up of 2 scalars with reverse
> storage order, then it's more complicated.  So memcpy where one of the types
> is aggregate with reverse storage order needs to be considered a storage order
> barrier, exactly like VIEW_CONVERT_EXPR:
>
> case VIEW_CONVERT_EXPR:
>   temp.off = 0;
>   temp.reverse = storage_order_barrier_p (ref);
>   break;
>
> /* Return true if T is a storage order barrier, i.e. a VIEW_CONVERT_EXPR
>that can modify the storage order of objects.  Note that, even if the
>TYPE_REVERSE_STORAGE_ORDER flag is set on both the inner type and the
>outer type, a VIEW_CONVERT_EXPR can modify the storage order because
>it can change the partition of the aggregate object into scalars.  */
>
> static inline bool
> storage_order_barrier_p (const_tree t)
> {
>   if (TREE_CODE (t) != VIEW_CONVERT_EXPR)
> return false;
>
>   if (AGGREGATE_TYPE_P (TREE_TYPE (t))
>   && TYPE_REVERSE_STORAGE_ORDER (TREE_TYPE (t)))
> return true;
>
>   tree op = TREE_OPERAND (t, 0);
>
>   if (AGGREGATE_TYPE_P (TREE_TYPE (op))
>   && TYPE_REVERSE_STORAGE_ORDER (TREE_TYPE (op)))
> return true;
>
>   return false;
> }
>
> hence the similar conditions, but I can add some more commentary.

OK, but all I was saying is that looking at the pointed-to type of
the address argument to the memcpy looks fragile to me.  For the
case cited above it would be better to look at the actual
reference we are looking up and not allowing memcpy handling
when that reference contains a storage order barrier?  Like
a && !contains_storage_order_barrier_p (vr->operands) on the whole block
like we do for the assign from constant/SSA value case?

> > Hum.  How does the pointer type matter at all for memcpy()?
> > Isn't the
> > issue just that we may not use the TYPE_REVERSE_STORAGE_ORDER
> > qualified type in the following case:
> >
> >   gimple *new_stmt;
> >   if (is_gimple_reg_type (TREE_TYPE (srcvar)))
> > {
> >   tree tem = fold_const_aggregate_ref (srcvar);
> >   if (tem)
> > srcvar = tem;
> >   if (! is_gimple_min_invariant (srcvar))
> > {
> >   new_stmt = gimple_build_assign (NULL_TREE, srcvar);
> >   srcvar = create_tmp_reg_or_ssa_name (TREE_TYPE (srcvar),
> >new_stmt);
> >
> > ?
>
> Same rationale as above: if you want to rewrite the copy in the presence of an
> aggregate type with reverse storage order, you may need to split the operation
> into the appropriate pieces; one large access may be incorrect.

But the above cited case is the _only_ case we're possibly building an
assignment of a variable - so it's enough to assert that in the above case
destvar is not TYPE_REVERSE_STORAGE_ORDER?

The other cases all build an aggregate copy assignment with
a non-reverse-storage-order char[size] type which should be OK, no?
Note even for the above we probably would be fine if we'd made sure
to use a !TYPE_REVERSE_STORAGE_ORDER "qualified" type
for the access.  So the memcpy happens via a native order
load plus a native order store, exactly what memcpy does (in fact
the actual order does not matter unless the source and destination
order "agree")?  Thus maybe amend

  /* Make sure we are not copying using a floating-point mode or
 a type whose size possibly does not match its precision.  */
  if (FLOAT_MODE_P (TYPE_MODE (desttype))
  || TREE_CODE (desttype) == BOOLEAN_TYPE
  || TREE_CODE (desttype) == ENUMERAL_TYPE)
desttype = bitwise_type_for_mode (TYPE_MODE (desttype));
  if (FLOAT_MODE_P (TYPE_MODE (srctype))
  || TREE_CODE (srctype) == BOOLEAN_TYPE
  || TREE_CODE (srctype) == ENUMERAL_TYPE)
srctype = bitwise_type_for_mode (TYPE_MODE (srctype));

with || TYPE_REVERSE_STORAGE_ORDER (srctype) (same for desttype)?

Thanks,
Richard.

>
> --
> Eric Botcazou

Re: [PATCH 0/4] IVOPTs consider step cost for different forms when unrolling

2020-06-03 Thread Kewen.Lin via Gcc-patches
on 2020/6/3 下午3:07, Richard Biener wrote:
> On Wed, 3 Jun 2020, Kewen.Lin wrote:
> 
>> Hi Richi,
>>
>> on 2020/6/2 下午7:38, Richard Biener wrote:
>>> On Thu, 28 May 2020, Kewen.Lin wrote:
>>>
 Hi,

 This is one repost and you can refer to the original series 
 via https://gcc.gnu.org/pipermail/gcc-patches/2020-January/538360.html.

 As we discussed in the thread
 https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00196.html
 Original: https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00104.html,
 I'm working to teach IVOPTs to consider D-form group access during 
 unrolling.
 The difference on D-form and other forms during unrolling is we can put the
 stride into displacement field to avoid additional step increment. eg:

 With X-form (uf step increment):
   ...
   LD A = baseA, X
   LD B = baseB, X
   ST C = baseC, X
   X = X + stride
   LD A = baseA, X
   LD B = baseB, X
   ST C = baseC, X
   X = X + stride
   LD A = baseA, X
   LD B = baseB, X
   ST C = baseC, X
   X = X + stride
   ...

 With D-form (one step increment for each base):
   ...
   LD A = baseA, OFF
   LD B = baseB, OFF
   ST C = baseC, OFF
   LD A = baseA, OFF+stride
   LD B = baseB, OFF+stride
   ST C = baseC, OFF+stride
   LD A = baseA, OFF+2*stride
   LD B = baseB, OFF+2*stride
   ST C = baseC, OFF+2*stride
   ...
   baseA += stride * uf
   baseB += stride * uf
   baseC += stride * uf

 Imagining that if the loop get unrolled by 8 times, then 3 step updates 
 with
 D-form vs. 8 step updates with X-form. Here we only need to check stride
 meet D-form field requirement, since if OFF doesn't meet, we can construct
 baseA' with baseA + OFF.
>>>
>>> I'd just mention there are other targets that have the choice between
>>> the above forms.  Since IVOPTs itself does not perform the unrolling
>>> the IL it produces is the same, correct?
>>>
>> Yes.  Before this patch, IVOPTs doesn't consider the unrolling impacts,
>> it only models things based on what it sees.  We can assume it thinks
>> later RTL unrolling won't perform.
>>
>> With this patch, since the IV choice probably changes, the IL can probably
>> change.  The typical difference with this patch is:
>>
>>   vect__1.7_15 = MEM[symbol: x, index: ivtmp.19_22, offset: 0B];
>> vs.
>>   vect__1.7_15 = MEM[base: _29, offset: 0B];
> 
> So we're asking IVOPTS "if we were unrolling this loop would you make
> a different IV choice?" thus I wonder why we need so much complexity
> here?  

I would describe it more like "we are going to unroll this loop with
unroll factor uf in RTL, would you consider this variable when modeling?"

In most cases, one single iteration is representative for the unrolled
body, so it doesn't matter considering unrolling or not.  But for the
case here, it's not true, expected reg_offset iv cand can make iv cand
step cost reduced, it leads the difference.

> That is, if we can classify the loop as being possibly unrolled
> we could evaluate IVOPTs IV choice (and overall cost) on the original
> loop and in a second run on the original loop with fake IV uses
> added with extra offset.  If the overall IV cost is similar we'll
> take the unroll friendly choice if the costs are way different
> (I wouldn't expect this to be the case ever?) I'd side with the
> IV choice when not unrolling (and mark the loop as to be not unrolled).
> 

Could you elaborate it a bit?  I guess it won't estimate the unroll
factor here, just guess it's to be unrolled or not?  The second run
with fake IV uses added with extra offset sounds like scaling up the 
iv group cost by uf.

> Thus I'd err on the side of not unrolling but leave the ultimate choice
> of whether to unroll to RTL unless IV cost makes that prohibitive.
> 
> Even without X- or D- form addressing modes the IV choice may differ
> and I think we don't need extra knobs for the unroller but instead
> can decide to set the existing n_unroll to zero (force not unroll)
> when costs say it would be bad?

Yes, even without x- or d- form addressing, the difference probably comes 
from compare type IV use for loop ending, maybe more cases which I am not
aware of.  But I don't see people care about it, probably the impact is
small.

IIUC what you stated here looks like to use ivopts information for unrolling
factor decision, I think this is a separate direction, do we have this
kind of case where ivopts costs can foresee the unrolling?

Now the unroll factor estimation can be used for other optimization passes
if they are wondering future unrolling factor decision, as discussed it
sounds a good idea to override the n_unroll with some benchmarking.

BR,
Kewen

> 
> Richard.
> 
>> BR,
>> Kewen
>>
>>> Richard.
>>>
>>
> 


[PATCH] V2: Lower VEC_COND_EXPR into internal functions.

2020-06-03 Thread Martin Liška

gcc/ChangeLog:

2020-03-30  Martin Liska  

* expr.c (expand_expr_real_2): Put gcc_unreachable, we should reach
this path.
(do_store_flag): Likewise here.
* internal-fn.c (expand_vect_cond_optab_fn): New.
(expand_VCOND): Likewise.
(expand_VCONDU): Likewise.
(expand_VCONDEQ): Likewise.
(expand_vect_cond_mask_optab_fn): Likewise.
(expand_VCOND_MASK): Likewise.
* internal-fn.def (VCOND): New.
(VCONDU): Likewise.
(VCONDEQ): Likewise.
(VCOND_MASK): Likewise.
* optabs.c (expand_vec_cond_mask_expr): Removed.
(expand_vec_cond_expr): Likewise.
* optabs.h (expand_vec_cond_expr): Likewise.
(vector_compare_rtx): Likewise.
* passes.def: Add pass_gimple_isel.
* tree-cfg.c (verify_gimple_assign_ternary): Add new
GIMPLE check.
* tree-pass.h (make_pass_gimple_isel): New.
* tree-ssa-forwprop.c (pass_forwprop::execute): Do not forward
to already lowered VEC_COND_EXPR.
* tree-vect-generic.c (expand_vector_divmod): Expand to SSA_NAME.
(expand_vector_condition): Expand tcc_comparison of a VEC_COND_EXPR
into a SSA_NAME.
(gimple_expand_vec_cond_expr): New.
(gimple_expand_vec_cond_exprs): New.
(class pass_gimple_isel): New.
(make_pass_gimple_isel): New.
---
 gcc/expr.c  |  25 +
 gcc/internal-fn.c   |  98 +
 gcc/internal-fn.def |   5 +
 gcc/optabs.c| 124 +
 gcc/optabs.h|   7 +-
 gcc/passes.def  |   1 +
 gcc/tree-cfg.c  |   8 ++
 gcc/tree-pass.h |   1 +
 gcc/tree-ssa-forwprop.c |   6 +
 gcc/tree-vect-generic.c | 237 +++-
 10 files changed, 358 insertions(+), 154 deletions(-)

diff --git a/gcc/expr.c b/gcc/expr.c
index 6b75028e7f1..b284f723220 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9242,17 +9242,8 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode 
tmode,
   if (temp != 0)
return temp;
 
-  /* For vector MIN , expand it a VEC_COND_EXPR 

-and similarly for MAX .  */
   if (VECTOR_TYPE_P (type))
-   {
- tree t0 = make_tree (type, op0);
- tree t1 = make_tree (type, op1);
- tree comparison = build2 (code == MIN_EXPR ? LE_EXPR : GE_EXPR,
-   type, t0, t1);
- return expand_vec_cond_expr (type, comparison, t0, t1,
-  original_target);
-   }
+   gcc_unreachable ();
 
   /* At this point, a MEM target is no longer useful; we will get better

 code without it.  */
@@ -9841,10 +9832,6 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode 
tmode,
return temp;
   }
 
-case VEC_COND_EXPR:

-  target = expand_vec_cond_expr (type, treeop0, treeop1, treeop2, target);
-  return target;
-
 case VEC_DUPLICATE_EXPR:
   op0 = expand_expr (treeop0, NULL_RTX, VOIDmode, modifier);
   target = expand_vector_broadcast (mode, op0);
@@ -12175,8 +12162,7 @@ do_store_flag (sepops ops, rtx target, machine_mode 
mode)
   STRIP_NOPS (arg1);
 
   /* For vector typed comparisons emit code to generate the desired

- all-ones or all-zeros mask.  Conveniently use the VEC_COND_EXPR
- expander for this.  */
+ all-ones or all-zeros mask.  */
   if (TREE_CODE (ops->type) == VECTOR_TYPE)
 {
   tree ifexp = build2 (ops->code, ops->type, arg0, arg1);
@@ -12184,12 +12170,7 @@ do_store_flag (sepops ops, rtx target, machine_mode 
mode)
  && expand_vec_cmp_expr_p (TREE_TYPE (arg0), ops->type, ops->code))
return expand_vec_cmp_expr (ops->type, ifexp, target);
   else
-   {
- tree if_true = constant_boolean_node (true, ops->type);
- tree if_false = constant_boolean_node (false, ops->type);
- return expand_vec_cond_expr (ops->type, ifexp, if_true,
-  if_false, target);
-   }
+   gcc_unreachable ();
 }
 
   /* Optimize (x % C1) == C2 or (x % C1) != C2 if it is beneficial

diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index 5e9aa60721e..aa41b4f6870 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -49,6 +49,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple-ssa.h"
 #include "tree-phinodes.h"
 #include "ssa-iterators.h"
+#include "explow.h"
 
 /* The names of each internal function, indexed by function number.  */

 const char *const internal_fn_name_array[] = {
@@ -2548,6 +2549,103 @@ expand_mask_store_optab_fn (internal_fn, gcall *stmt, 
convert_optab optab)
 
 #define expand_mask_store_lanes_optab_fn expand_mask_store_optab_fn
 
+/* Expand VCOND, VCONDU and VCONDEQ internal functions.

+   The expansion of STMT happens based on OPTAB table associated.  */
+
+static void
+expand_vect_cond_optab_fn (internal_fn ifn, gcall *stmt)
+{
+  class expand_operand

Re: zstd not found if installed in non-system prefix

2020-06-03 Thread Matthias Klose
On 5/27/20 3:36 PM, Martin Liška wrote:
> On 5/20/20 9:32 PM, Michael Kuhn wrote:
>> Hi,
>>
>> when specifying a non-system prefix with --with-zstd, the build fails
>> because the header and library cannot be found (see
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95005).
>>
>> The attached patch fixes the problem and is what we use in Spack to
>> make GCC build with zstd support.
> 
> Hello.
> 
> I support the patch, but we need to wait for an approval of a maintainer.

but please don't allow configuration via environment variables.



Re: [PATCH] Add pattern for pointer-diff on addresses with same base/offset (PR 94234)

2020-06-03 Thread Feng Xue OS via Gcc-patches
>> This patch is meant to add match rules to simplify patterns as:
>>
>> o. (pointer + offset_a) - (pointer + offset_b)   ->   (ptrdiff_t) (offset_a 
>> - offset_b)
>> o. (pointer_a + offset) - (pointer_b + offset)   ->   (pointer_a - pointer_b)

> You are also changing the existing pattern which IIRC tries to
> preserve the undefinedness of overflow in offset_a - offset_b.
> Without an explanation it's hard to guess why you think eliding
> this conversion is correct.
The old rule adds signed cast to both offset_b and offset_b, and does
minus on them, as

   (stype)offset_a - (stype)offset_b

Suppose that offset_a = (signed_int_max)UL, offset_b = 1UL, the trans
will generate overflow result, while original computation will not.

Alternative way is to add type cast after minus are done on offset_a and offset 
b.
And to avoid unsigned overflow warning, we should add a view_convert 
instead of convert, which was missed in my patch.

> Adding the TYPE_OVERFLOW_UNDEFINED guard also looks
> odd - AFAICS overflow of the pointer type does not matter
> but overflow of the generated minus?  Thus at least
> a || !TYPE_OVERFLOW_UNDEFINED (type) would be
> in order?
Yes, it is. will remove TYPE_OVERFLOW_UNDEFINED on pointer.

Thanks,
Feng

Re: [Patch][OpenMP] Fix mapping of artificial variables (PR94874)

2020-06-03 Thread Tobias Burnus

Hi Jakub,

Revised patch – changes are:
* cp-gimplify.c's cxx_omp_predetermined_mapping (twice 3 lines removed as 
proposed)
* gimplify.c's omp_notice_variable: changes how returned 'kind' is used.

Namely:

Pre-commit follow-up version to the 'LGTM patch':
As instructed, I have removed the two-times three lines
from cp-gimplify.c's cxx_omp_predetermined_mapping.

However, I made a thinko regarding 'nflags |= kind;' which just
happened to work for a test case – the nflags value differs
from the enum value.
[See gimplify.c's omp_notice_variable; see also line 9610+
for the full set of mappings].

For two items + unreachable, I think 'if' is simpler than
'switch', hence, I used it. Alternatively, one could split-off
the case block of ll. 9610+ and use a function call for both.
Thoughts?

OK?

Tobias

On 5/26/20 5:43 PM, Jakub Jelinek via Gcc-patches wrote:


On Tue, May 26, 2020 at 04:06:04PM +0200, Jakub Jelinek wrote:

On Mon, May 25, 2020 at 07:10:31PM +0200, Tobias Burnus wrote:

+enum omp_clause_defaultmap_kind
+cxx_omp_predetermined_mapping (tree decl)
+{
+  enum omp_clause_default_kind ret = cxx_omp_predetermined_sharing_1 (decl);
+  if (ret != OMP_CLAUSE_DEFAULT_UNSPECIFIED)
+return OMP_CLAUSE_DEFAULTMAP_TO;

I don't like the above 3 lines, do you need it for anything?
Static data members certainly aren't covered and it is unlikely we want to
predetermine them mapped.

And forgot, for this the behavior is already specified in 5.0 that any
reference to this should result in map(tofrom:this[:1]).
It is not implemented yet, but we'll need to implement it some other way
than through this hook.

  Jakub


-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter
[OpenMP] Fix mapping of artificial variables (PR94874)

gcc/c-family/ChangeLog:

	* c-common.h (c_omp_predetermined_mapping): Declare.
	* c-omp.c (c_omp_predetermined_mapping): New.

gcc/c/ChangeLog:

	* c-objc-common.h (LANG_HOOKS_OMP_PREDETERMINED_MAPPING): Redefine.

gcc/cp/ChangeLog:

	* cp-gimplify.c (cxx_omp_predetermined_mapping): New.
	* cp-objcp-common.h (LANG_HOOKS_OMP_PREDETERMINED_MAPPING): Redfine.
	* cp-tree.h (cxx_omp_predetermined_mapping): Declare.

gcc/fortran/ChangeLog:

	* f95-lang.c (LANG_HOOKS_OMP_PREDETERMINED_MAPPING): Redefine.
	* trans-openmp.c (gfc_omp_predetermined_mapping): New.
	* trans.h (gfc_omp_predetermined_mapping): Declare.

gcc/ChangeLog:

	* gimplify.c (omp_notice_variable): Use new hook.
	* langhooks-def.h (lhd_omp_predetermined_mapping): Declare.
	(LANG_HOOKS_OMP_PREDETERMINED_MAPPING): Define
	(LANG_HOOKS_DECLS): Add it.
	* langhooks.c (lhd_omp_predetermined_sharing): Remove bogus unused attr.
	(lhd_omp_predetermined_mapping): New.
	* langhooks.h (struct lang_hooks_for_decls): Add new hook.

gcc/testsuite/ChangeLog
2020-06-03  Thomas Schwinge  
Tobias Burnus  

PR middle-end/94874
* c-c++-common/gomp/pr94874.c: New.

 gcc/c-family/c-common.h   |  1 +
 gcc/c-family/c-omp.c  | 24 +++-
 gcc/c/c-objc-common.h |  3 +++
 gcc/cp/cp-gimplify.c  | 19 ++-
 gcc/cp/cp-objcp-common.h  |  2 ++
 gcc/cp/cp-tree.h  |  1 +
 gcc/fortran/f95-lang.c|  2 ++
 gcc/fortran/trans-openmp.c| 25 -
 gcc/fortran/trans.h   |  1 +
 gcc/gimplify.c| 13 -
 gcc/langhooks-def.h   |  3 +++
 gcc/langhooks.c   | 13 -
 gcc/langhooks.h   |  4 
 gcc/testsuite/c-c++-common/gomp/pr94874.c | 25 +
 14 files changed, 131 insertions(+), 5 deletions(-)

diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 7c1a6370aae..c74b23db05c 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1206,6 +1206,7 @@ extern tree c_omp_declare_simd_clauses_to_numbers (tree, tree);
 extern void c_omp_declare_simd_clauses_to_decls (tree, tree);
 extern bool c_omp_predefined_variable (tree);
 extern enum omp_clause_default_kind c_omp_predetermined_sharing (tree);
+extern enum omp_clause_defaultmap_kind c_omp_predetermined_mapping (tree);
 extern tree c_omp_check_context_selector (location_t, tree);
 extern void c_omp_mark_declare_variant (location_t, tree, tree);
 extern const char *c_omp_map_clause_name (tree, bool);
diff --git a/gcc/c-family/c-omp.c b/gcc/c-family/c-omp.c
index 51c18a4ba08..6f8fba350ed 100644
--- a/gcc/c-family/c-omp.c
+++ b/gcc/c-family/c-omp.c
@@ -2104,7 +2104,8 @@ c_omp_predefined_variable (tree decl)
   return false;
 }
 
-/* True if OpenMP sharing attribute of DECL is predetermined.  */
+/* OMP_CLAUSE_DEFAULT_UNSPECIFIED unless OpenMP sharing attribute of DECL
+   is predetermined.  */
 
 e

Re: [Patch][OpenMP] Fix mapping of artificial variables (PR94874)

2020-06-03 Thread Jakub Jelinek via Gcc-patches
On Wed, Jun 03, 2020 at 11:03:50AM +0200, Tobias Burnus wrote:
> @@ -2123,6 +2124,27 @@ c_omp_predetermined_sharing (tree decl)
>return OMP_CLAUSE_DEFAULT_UNSPECIFIED;
>  }
>  
> +/* OMP_CLAUSE_DEFAULTMAP_CATEGORY_UNSPECIFIED unless OpenMP mapping attribute
> +   of DECL is predetermined.  */
> +
> +enum omp_clause_defaultmap_kind
> +c_omp_predetermined_mapping (tree decl)
> +{
> +  /* Predetermine artificial variables holding integral values, those
> + are usually result of gimplify_one_sizepos or SAVE_EXPR
> + gimplification.  */
> +  if (VAR_P (decl)
> +  && DECL_ARTIFICIAL (decl)
> +  && INTEGRAL_TYPE_P (TREE_TYPE (decl)))
> +return OMP_CLAUSE_DEFAULTMAP_FIRSTPRIVATE;
> +
> +  if (c_omp_predefined_variable (decl))
> +return OMP_CLAUSE_DEFAULTMAP_TO;
> +
> +  return OMP_CLAUSE_DEFAULTMAP_CATEGORY_UNSPECIFIED;
> +}
...
> @@ -2321,6 +2322,22 @@ cxx_omp_predetermined_sharing (tree decl)
>return OMP_CLAUSE_DEFAULT_UNSPECIFIED;
>  }
>  
> +enum omp_clause_defaultmap_kind
> +cxx_omp_predetermined_mapping (tree decl)
> +{
> +  /* Predetermine artificial variables holding integral values, those
> + are usually result of gimplify_one_sizepos or SAVE_EXPR
> + gimplification.  */
> +  if (VAR_P (decl)
> +  && DECL_ARTIFICIAL (decl)
> +  && INTEGRAL_TYPE_P (TREE_TYPE (decl))
> +  && !(DECL_LANG_SPECIFIC (decl)
> +&& DECL_OMP_PRIVATIZED_MEMBER (decl)))
> +return OMP_CLAUSE_DEFAULTMAP_FIRSTPRIVATE;

I'd prefer consistency between C and C++, so I'd add
+  if (c_omp_predefined_variable (decl))
+return OMP_CLAUSE_DEFAULTMAP_TO;

here too for now (and I'll create a new spec issue for __FUNCTION__ etc.
mapping because the old one happened to be closed without actually solving
what is needed).

> +
> +  return OMP_CLAUSE_DEFAULTMAP_CATEGORY_UNSPECIFIED;
> +}

Ok with that change.  Thanks.

Jakub



Re: [PATCH] Add pattern for pointer-diff on addresses with same base/offset (PR 94234)

2020-06-03 Thread Feng Xue OS via Gcc-patches
>>   * match.pd ((PTR + A) - (PTR + B)) -> (ptrdiff_t)(A - B): New
>>   simplification.

> Not new, modified.
OK.

>>   * ((PTR_A + O) - (PTR_B + O)) -> (PTR_A - PTR_B): New simplification.

> O might not be the best choice because of how close it looks to 0.
OK.

> What don't you like about the existing transformation? You are replacing a
> transformation that always folds by one that folds only in some cases, and
> looses the information that some overflows cannot happen. That looks like
> it is making things worse from an optimization point of view. Do you
> consider the transformation as unsafe with -fsanitize=pointer-overflow
> (does that correspond to the case where TYPE_OVERFLOW_UNDEFINED is true
> for a pointer type?)?
Yes. We should use !TYPE_OVERFLOW_SANITIZED, not TYPE_OVERFLOW_UNDEFINED.
But even for !TYPE_OVERFLOW_SANITIZED, some ptr_diff rules have the check, and 
some
do not. Here we could also remove it?

> Ah, looking at the PR, you decided to perform the operation as unsigned
> because that has fewer NOP conversions, which, in that particular testcase
> where the offsets are originally unsigned, means we simplify better. But I
> would expect it to regress other testcases (in particular if the offsets
> were originally signed). Also, changing the second argument of
> pointer_plus to be signed, as is supposed to eventually happen, would
> break your testcase again.
The old rule might produce overflow result (offset_a = (signed_int_max)UL, 
offset_b = 1UL). 

Additionally, (stype)(offset_a - offset_b) is more compact, there might be
further simplification opportunities on offset_a - offset_b, even it is not
in form of (A * C - B * C), for example (~A - 1 -> -A). But for old rule, we 
have
to introduce another rule as (T)A - (T)(B) -> (T)(A - B), which seems to
be too generic to benefit performance in all situations.

If the 2nd argument is signed, we can add a specific rule as your suggestion
(T)(A * C) - (T)(B * C) -> (T) (A - B) * C.

> At the very least we want to keep a comment next to the transformation
> explaining the situation.

> If there are platforms where the second argument of pointer_plus is a
> smaller type than the result of pointer_diff (can this happen? I keep
> forgetting all the weird things some platforms do), this version may do an
> unsafe zero-extension.
If the 2nd argument is a smaller type, this might bring confuse semantic to
pointer_plus operator. Suppose the type is a (unsigned) char, the expression
"ptr + ((char) -1)" represents ptr + 255 or ptr - 1?

Regards,
Feng

Re: [PATCH 2/2] Tune memcpy and memset for Zen cores.

2020-06-03 Thread Martin Liška

On 6/1/20 1:40 PM, Martin Liška wrote:

Adding Honza as Uros recommended him for a review.


@Honza asked me personally to remind him this patch ;)

Martin


Re: [PATCH 0/4] IVOPTs consider step cost for different forms when unrolling

2020-06-03 Thread Richard Biener
On Wed, 3 Jun 2020, Kewen.Lin wrote:

> on 2020/6/3 下午3:07, Richard Biener wrote:
> > On Wed, 3 Jun 2020, Kewen.Lin wrote:
> > 
> >> Hi Richi,
> >>
> >> on 2020/6/2 下午7:38, Richard Biener wrote:
> >>> On Thu, 28 May 2020, Kewen.Lin wrote:
> >>>
>  Hi,
> 
>  This is one repost and you can refer to the original series 
>  via https://gcc.gnu.org/pipermail/gcc-patches/2020-January/538360.html.
> 
>  As we discussed in the thread
>  https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00196.html
>  Original: https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00104.html,
>  I'm working to teach IVOPTs to consider D-form group access during 
>  unrolling.
>  The difference on D-form and other forms during unrolling is we can put 
>  the
>  stride into displacement field to avoid additional step increment. eg:
> 
>  With X-form (uf step increment):
>    ...
>    LD A = baseA, X
>    LD B = baseB, X
>    ST C = baseC, X
>    X = X + stride
>    LD A = baseA, X
>    LD B = baseB, X
>    ST C = baseC, X
>    X = X + stride
>    LD A = baseA, X
>    LD B = baseB, X
>    ST C = baseC, X
>    X = X + stride
>    ...
> 
>  With D-form (one step increment for each base):
>    ...
>    LD A = baseA, OFF
>    LD B = baseB, OFF
>    ST C = baseC, OFF
>    LD A = baseA, OFF+stride
>    LD B = baseB, OFF+stride
>    ST C = baseC, OFF+stride
>    LD A = baseA, OFF+2*stride
>    LD B = baseB, OFF+2*stride
>    ST C = baseC, OFF+2*stride
>    ...
>    baseA += stride * uf
>    baseB += stride * uf
>    baseC += stride * uf
> 
>  Imagining that if the loop get unrolled by 8 times, then 3 step updates 
>  with
>  D-form vs. 8 step updates with X-form. Here we only need to check stride
>  meet D-form field requirement, since if OFF doesn't meet, we can 
>  construct
>  baseA' with baseA + OFF.
> >>>
> >>> I'd just mention there are other targets that have the choice between
> >>> the above forms.  Since IVOPTs itself does not perform the unrolling
> >>> the IL it produces is the same, correct?
> >>>
> >> Yes.  Before this patch, IVOPTs doesn't consider the unrolling impacts,
> >> it only models things based on what it sees.  We can assume it thinks
> >> later RTL unrolling won't perform.
> >>
> >> With this patch, since the IV choice probably changes, the IL can probably
> >> change.  The typical difference with this patch is:
> >>
> >>   vect__1.7_15 = MEM[symbol: x, index: ivtmp.19_22, offset: 0B];
> >> vs.
> >>   vect__1.7_15 = MEM[base: _29, offset: 0B];
> > 
> > So we're asking IVOPTS "if we were unrolling this loop would you make
> > a different IV choice?" thus I wonder why we need so much complexity
> > here?  
> 
> I would describe it more like "we are going to unroll this loop with
> unroll factor uf in RTL, would you consider this variable when modeling?"
> 
> In most cases, one single iteration is representative for the unrolled
> body, so it doesn't matter considering unrolling or not.  But for the
> case here, it's not true, expected reg_offset iv cand can make iv cand
> step cost reduced, it leads the difference.
> 
> > That is, if we can classify the loop as being possibly unrolled
> > we could evaluate IVOPTs IV choice (and overall cost) on the original
> > loop and in a second run on the original loop with fake IV uses
> > added with extra offset.  If the overall IV cost is similar we'll
> > take the unroll friendly choice if the costs are way different
> > (I wouldn't expect this to be the case ever?) I'd side with the
> > IV choice when not unrolling (and mark the loop as to be not unrolled).
> > 
> 
> Could you elaborate it a bit?  I guess it won't estimate the unroll
> factor here, just guess it's to be unrolled or not?  The second run
> with fake IV uses added with extra offset sounds like scaling up the 
> iv group cost by uf.

>From your example above the D-form (MEM[symbol: x, index: ivtmp.19_22, 
offset: 0B]) is preferable since in the unrolled variant we have
the same addres but with a different constant offset for the unroll
copies while the second form would have to update the 'base' IV.

Thus I think the difference in IV cost and decision should already
show up if we, for each USE add a USE with an added constant offset.
This might be what your patch does with that extra flag on the USEs,
I was suggesting to model the USEs more explicitely, simulating a
2-way unroll.  I think in the end I'll defer to Bin here who knows
the code best.

> > Thus I'd err on the side of not unrolling but leave the ultimate choice
> > of whether to unroll to RTL unless IV cost makes that prohibitive.
> > 
> > Even without X- or D- form addressing modes the IV choice may differ
> > and I think we don't need extra knobs for the unroller but instead
> > can decide to set the existing n_unroll to zero (force not unroll)
> > when costs

[Ada] Remove -gnatt switch processing

2020-06-03 Thread Pierre-Marie de Rodat
ASIS is now maintained on a separate branch, so newer compilers no
longer need to generate ASIS trees. First remove processing of -gnatt,
then in a second stage, we'll remove all associated code.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-03  Arnaud Charlet  

gcc/ada/

* switch-c.adb (Scan_Front_End_Switches): Remove processing of
-gnatt.
* usage.adb (Usage): Remove mention of -gnatt.--- gcc/ada/switch-c.adb
+++ gcc/ada/switch-c.adb
@@ -1250,13 +1250,6 @@ package body Switch.C is
Print_Standard := True;
Ptr := Ptr + 1;
 
---  -gnatt (output tree)
-
-when 't' =>
-   Ptr := Ptr + 1;
-   Tree_Output := True;
-   Back_Annotate_Rep_Info := True;
-
 --  -gnatT (change start of internal table sizes)
 
 when 'T' =>

--- gcc/ada/usage.adb
+++ gcc/ada/usage.adb
@@ -418,11 +418,6 @@ begin
Write_Switch_Char ("S");
Write_Line ("Print listing of package Standard");
 
-   --  Line for -gnatt switch
-
-   Write_Switch_Char ("t");
-   Write_Line ("Tree output file to be generated");
-
--  Line for -gnatTnn switch
 
Write_Switch_Char ("Tnn");



[Ada] Remove ASIS tree generation

2020-06-03 Thread Pierre-Marie de Rodat
This patch removes completely the processing associated with -gnatt.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-03  Arnaud Charlet  

gcc/ada/

* aspects.adb, aspects.ads, atree.adb, atree.ads, elists.adb,
elists.ads, fname.adb, fname.ads, gnat1drv.adb, lib.adb,
lib.ads, namet.adb, namet.ads, nlists.adb, nlists.ads, opt.adb,
opt.ads, osint-c.adb, osint-c.ads, repinfo.adb, repinfo.ads,
sem_aux.adb, sem_aux.ads, sinput.adb, sinput.ads, stand.ads,
stringt.adb, stringt.ads, switch-c.adb, table.adb, table.ads,
uintp.adb, uintp.ads, urealp.adb, urealp.ads (Tree_Read,
Tree_Write): Remove generation of ASIS trees.
* doc/gnat_ugn/building_executable_programs_with_gnat.rst:
Remove -gnatt and -gnatct documentation.
* gnat_ugn.texi: Regenerate.
* tree_in.ads, tree_in.adb, tree_io.ads, tree_io.adb,
tree_gen.ads, tree_gen.adb, stand.adb: Remove.
* gcc-interface/Makefile.in, gcc-interface/Make-lang.in: Remove
references to tree_gen.o tree_in.o tree_io.o.

patch.diff.gz
Description: application/gzip


[Ada] More efficient System.Random_Numbers.Random_Discrete for 32-bit types

2020-06-03 Thread Pierre-Marie de Rodat
The current implementation is suboptimal for 32-bit or smaller types
because it does the computation in Universal_Integer and can generate
checks.  That's unnecessary in the common case, i.e. for integer types
or enumeration types with contiguous representation.

No functional changes.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-03  Eric Botcazou  

gcc/ada/

* libgnat/s-rannum.adb (Random_Discrete): In the 32-bit case,
use the same linear implementation as in the 64-bit case when
the type has a contiguous representation.--- gcc/ada/libgnat/s-rannum.adb
+++ gcc/ada/libgnat/s-rannum.adb
@@ -404,10 +404,9 @@ is
 
   elsif Result_Subtype'Base'Size > 32 then
  declare
---  In the 64-bit case, we have to be careful, since not all 64-bit
---  unsigned values are representable in GNAT's root_integer type.
---  Ignore different-size warnings here since GNAT's handling
---  is correct.
+--  In the 64-bit case, we have to be careful since not all 64-bit
+--  unsigned values are representable in GNAT's universal integer.
+--  Ignore unequal-size warnings since GNAT's handling is correct.
 
 pragma Warnings ("Z");
 function Conv_To_Unsigned is
@@ -423,7 +422,8 @@ is
 
  begin
 if N = 0 then
-   return Conv_To_Result (Conv_To_Unsigned (Min) + Random (Gen));
+   X := Random (Gen);
+   return Conv_To_Result (Conv_To_Unsigned (Min) + X);
 
 else
Slop := Unsigned_64'Last rem N + 1;
@@ -437,28 +437,73 @@ is
 end if;
  end;
 
-  elsif Result_Subtype'Pos (Max) - Result_Subtype'Pos (Min) =
- 2 ** 32 - 1
-  then
- return Result_Subtype'Val
-   (Result_Subtype'Pos (Min) + Unsigned_32'Pos (Random (Gen)));
   else
  declare
-N: constant Unsigned_32 :=
- Unsigned_32 (Result_Subtype'Pos (Max) -
-Result_Subtype'Pos (Min) + 1);
-Slop : constant Unsigned_32 := Unsigned_32'Last rem N + 1;
-X: Unsigned_32;
+--  In the 32-bit case, unlike the above case, we need to handle
+--  both integer and enumeration types. If the values of the result
+--  subtype are contiguous, then we can still use the above trick.
+--  Otherwise we need to rely on 'Pos and 'Val in the computation,
+--  which is more costly since it will thus be done in universal
+--  integer. And ignore unequal-size warnings in this case too.
+
+pragma Warnings ("Z");
+function Conv_To_Unsigned is
+   new Unchecked_Conversion (Result_Subtype'Base, Unsigned_32);
+function Conv_To_Result is
+   new Unchecked_Conversion (Unsigned_32, Result_Subtype'Base);
+pragma Warnings ("z");
+
+Contiguous : constant Boolean :=
+  Result_Subtype'Pos (Result_Subtype'Last) -
+Result_Subtype'Pos (Result_Subtype'First)
+=
+  Result_Subtype'Enum_Rep (Result_Subtype'Last) -
+Result_Subtype'Enum_Rep (Result_Subtype'First);
+
+N, X, Slop : Unsigned_32;
 
  begin
-loop
-   X := Random (Gen);
-   exit when Slop = N or else X <= Unsigned_32'Last - Slop;
-end loop;
+if Contiguous then
+   N := Conv_To_Unsigned (Max) - Conv_To_Unsigned (Min) + 1;
+
+   if N = 0 then
+  X := Random (Gen);
+  return Conv_To_Result (Conv_To_Unsigned (Min) + X);
+
+   else
+  Slop := Unsigned_32'Last rem N + 1;
 
-return
-  Result_Subtype'Val
-(Result_Subtype'Pos (Min) + Unsigned_32'Pos (X rem N));
+  loop
+ X := Random (Gen);
+ exit when Slop = N or else X <= Unsigned_32'Last - Slop;
+  end loop;
+
+  return Conv_To_Result (Conv_To_Unsigned (Min) + X rem N);
+   end if;
+
+else
+   N := Unsigned_32 (Result_Subtype'Pos (Max) -
+   Result_Subtype'Pos (Min) + 1);
+
+   if N = 0 then
+  X := Random (Gen);
+  return
+Result_Subtype'Val
+  (Result_Subtype'Pos (Min) + Unsigned_32'Pos (X));
+
+   else
+  Slop := Unsigned_32'Last rem N + 1;
+
+  loop
+ X := Random (Gen);
+ exit when Slop = N or else X <= Unsigned_32'Last - Slop;
+  end loop;
+
+  

[Ada] Get rid of more references to Universal_Integer in expanded code

2020-06-03 Thread Pierre-Marie de Rodat
This fixes a couple of places to using the standard idiom for choosing
an appropriately-sized integer type.  No practical change since the old
and the new type are effectively identical.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-03  Eric Botcazou  

gcc/ada/

* exp_attr.adb (Attribute_Valid): Use Standard_Long_Long_Integer
in lieu of Universal_Integer as large integer type.
* exp_ch3.adb (Expand_Freeze_Enumeration_Type): Likewise.--- gcc/ada/exp_attr.adb
+++ gcc/ada/exp_attr.adb
@@ -6924,7 +6924,7 @@ package body Exp_Attr is
 if Esize (Ptyp) <= Esize (Standard_Integer) then
PBtyp := Standard_Integer;
 else
-   PBtyp := Universal_Integer;
+   PBtyp := Standard_Long_Long_Integer;
 end if;
 
 Rewrite (N, Make_Range_Test);

--- gcc/ada/exp_ch3.adb
+++ gcc/ada/exp_ch3.adb
@@ -4806,7 +4806,7 @@ package body Exp_Ch3 is
  if Esize (Typ) <= Standard_Integer_Size then
 Ityp := Standard_Integer;
  else
-Ityp := Universal_Integer;
+Ityp := Standard_Long_Long_Integer;
  end if;
 
   --  Representations are unsigned



[Ada] Debug info not available on subp having a class-wide precondition

2020-06-03 Thread Pierre-Marie de Rodat
The debugger doesn't allow stepping into a subprogram when the
subprogram has a class-wide precondition (or postcondition). The
compiler generates wrappers for such subprograms and creates a new
"class-wide clone" of the subprogram, and the Needs_Debug_Info flag
isn't set on the clone subprogram's entity.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-03  Gary Dismukes  

gcc/ada/

* sem_util.adb (Build_Class_Wide_Clone_Decl): Call
Set_Debug_Info_Needed to set the Needs_Debug_Info flag on
Clone_Id if the flag is set on Spec_Id.--- gcc/ada/sem_util.adb
+++ gcc/ada/sem_util.adb
@@ -1618,6 +1618,13 @@ package body Sem_Util is
   --  wrapper call to inherited operation.
 
   Set_Class_Wide_Clone (Spec_Id, Clone_Id);
+
+  --  Inherit debug info flag from Spec_Id to Clone_Id to allow debugging
+  --  of the class-wide clone subprogram.
+
+  if Needs_Debug_Info (Spec_Id) then
+ Set_Debug_Info_Needed (Clone_Id);
+  end if;
end Build_Class_Wide_Clone_Decl;
 
-



[Ada] Unnesting problems with expansion of Loop_Entry attribute

2020-06-03 Thread Pierre-Marie de Rodat
For expansion of a Loop_Entry attribute, a function body is generated in
association with the condition of a generated while loop, and the
entities within the function need to have their scopes reset to the
function Entity_Id, for proper handling of up-level-reference processing
in the GNAT-LLVM compiler. The existing handling only did this for
itypes associated with the loop, but there can be other kinds of
entities that also need to be reset, so the condition of the loop was
relaxed to allow resetting for more entity cases. However, note that the
fix is likely still incomplete (potentially allowing either too many or
too few entity cases).

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-03  Gary Dismukes  

gcc/ada/

* exp_attr.adb (Expand_Loop_Entry_Attribute): Revise loop that
resets the scopes of entities associated with Loop_Id to the
scope of the new function, so the resetting is not restricted to
itypes, but excludes loop parameters and the function entity
itself. However, this fix is believed to be incomplete and a ???
comment is added to indicate that.--- gcc/ada/exp_attr.adb
+++ gcc/ada/exp_attr.adb
@@ -1434,22 +1434,36 @@ package body Exp_Attr is
Insert_Action (Loop_Stmt, Func_Decl);
Pop_Scope;
 
-   --  The analysis of the condition may have generated itypes
-   --  that are now used within the function: Adjust their
-   --  scopes accordingly so that their use appears in their
-   --  scope of definition.
+   --  The analysis of the condition may have generated entities
+   --  (such as itypes) that are now used within the function.
+   --  Adjust their scopes accordingly so that their use appears
+   --  in their scope of definition.
 
declare
-  Ityp : Entity_Id;
+  Ent : Entity_Id;
 
begin
-  Ityp := First_Entity (Loop_Id);
-
-  while Present (Ityp) loop
- if Is_Itype (Ityp) then
-Set_Scope (Ityp, Func_Id);
+  Ent := First_Entity (Loop_Id);
+
+  while Present (Ent) loop
+ --  Various entities that now occur within the function
+ --  need to have their scope reset, but not all entities
+ --  associated with Loop_Id are now inside the function.
+ --  The function entity itself and loop parameters can
+ --  be outside the function, and there may be others.
+ --  It's not clear how the determination of what entity
+ --  scopes need to be adjusted can be made accurately.
+ --  Perhaps it will be necessary to traverse the function
+ --  body to find the exact entities whose scopes need to
+ --  be reset to the function's Entity_Id. ???
+
+ if Ekind (Ent) /= E_Loop_Parameter
+   and then Ent /= Func_Id
+ then
+Set_Scope (Ent, Func_Id);
  end if;
- Next_Entity (Ityp);
+
+ Next_Entity (Ent);
   end loop;
end;
 



[Ada] Minor change to comment in System.Regexp spec

2020-06-03 Thread Pierre-Marie de Rodat
Fix the comment about the form of a 'term'. Allow a sequence of elmts,
in {...}, but not nested {...}.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-03  Bob Duff  

gcc/ada/

* libgnat/s-regexp.ads: Fix comment--- gcc/ada/libgnat/s-regexp.ads
+++ gcc/ada/libgnat/s-regexp.ads
@@ -83,8 +83,10 @@ package System.Regexp is
-- regexp ::= term
 
-- term   ::= elmt
-   -- term   ::= elmt elmt ... -- concatenation (elmt then elmt)
-   -- term   ::= {elmt, elmt, ...} -- alternation (matches any of elmt)
+   -- term   ::= seq
+   -- term   ::= {seq, seq, ...}   -- alternation (matches any of seq)
+
+   -- seq::= elmt elmt ... -- concatenation (sequence of elmts)
 
-- elmt   ::= * -- any string of 0 or more characters
-- elmt   ::= ? -- matches any character



[Ada] Avoid creating temporaries in Universal_Integer for range checks

2020-06-03 Thread Pierre-Marie de Rodat
This extends the exception made for attribute references in the code
generating range checks to the simple expressions containing a single
attribute reference, thus avoiding to create a temporary whose type is
Universal_Integer when the attribute returns Universal_Integer, which is
the common case.  This also prevents Remove_Side_Effects from creating a
similar temporary for an attribute reference whose prefix is not a name.

The reason is that Universal_Integer must be a type as large as the
largest supported integer type and, therefore, can be much larger than
what is really needed here.

No functional changes.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-03  Eric Botcazou  

gcc/ada/

* checks.adb (Is_Single_Attribute_Reference): New predicate.
(Generate_Range_Check): Do not force the evaluation if the
node is a single attribute reference.
* exp_util.adb (Side_Effect_Free_Attribute): New predicate.
(Side_Effect_Free) : Call it.
(Remove_Side_Effects): Remove the side effects of the prefix
for an attribute reference whose prefix is not a name.--- gcc/ada/checks.adb
+++ gcc/ada/checks.adb
@@ -6875,6 +6875,10 @@ package body Checks is
   --  given Suppress argument. Then check the converted value against the
   --  range of the target subtype.
 
+  function Is_Single_Attribute_Reference (N : Node_Id) return Boolean;
+  --  Return True if N is an expression that contains a single attribute
+  --  reference, possibly as operand among only integer literal operands.
+
   -
   -- Convert_And_Check_Range --
   -
@@ -6934,6 +6938,31 @@ package body Checks is
  Set_Etype (N, Target_Base_Type);
   end Convert_And_Check_Range;
 
+  -
+  --  Is_Single_Attribute_Reference  --
+  -
+
+  function Is_Single_Attribute_Reference (N : Node_Id) return Boolean is
+  begin
+ if Nkind (N) = N_Attribute_Reference then
+return True;
+
+ elsif Nkind (N) in N_Binary_Op then
+if Nkind (Right_Opnd (N)) = N_Integer_Literal then
+   return Is_Single_Attribute_Reference (Left_Opnd (N));
+
+elsif Nkind (Left_Opnd (N)) = N_Integer_Literal then
+   return Is_Single_Attribute_Reference (Right_Opnd (N));
+
+else
+   return False;
+end if;
+
+ else
+return False;
+ end if;
+  end Is_Single_Attribute_Reference;
+
--  Start of processing for Generate_Range_Check
 
begin
@@ -6982,9 +7011,10 @@ package body Checks is
   --  We skip the evaluation of attribute references because, after these
   --  runtime checks are generated, the expander may need to rewrite this
   --  node (for example, see Attribute_Max_Size_In_Storage_Elements in
-  --  Expand_N_Attribute_Reference).
+  --  Expand_N_Attribute_Reference) and, in many cases, their return type
+  --  is universal integer, which is a very large type for a temporary.
 
-  if Nkind (N) /= N_Attribute_Reference
+  if not Is_Single_Attribute_Reference (N)
 and then (not Is_Entity_Name (N)
or else Treat_As_Volatile (Entity (N)))
   then

--- gcc/ada/exp_util.adb
+++ gcc/ada/exp_util.adb
@@ -224,6 +224,10 @@ package body Exp_Util is
--  level, and False otherwise. Nested_Constructs is True when any nested
--  packages declared in L must be processed, and False otherwise.
 
+   function Side_Effect_Free_Attribute (Name : Name_Id) return Boolean;
+   --  Return True if the evaluation of the given attribute is considered
+   --  side-effect free, independently of its prefix and expressions.
+
-
-- Activate_Atomic_Synchronization --
-
@@ -11306,6 +11310,21 @@ package body Exp_Util is
 
   Scope_Suppress.Suppress := (others => True);
 
+  --  If this is a side-effect free attribute reference whose expressions
+  --  are also side-effect free and whose prefix is not a name, remove the
+  --  side effects of the prefix. A copy of the prefix is required in this
+  --  case and it is better not to make an additional one for the attribute
+  --  itself, because the return type of many of them is universal integer,
+  --  which is a very large type for a temporary.
+
+  if Nkind (Exp) = N_Attribute_Reference
+and then Side_Effect_Free_Attribute (Attribute_Name (Exp))
+and then Side_Effect_Free (Expressions (Exp), Name_Req, Variable_Ref)
+and then not Is_Name_Reference (Prefix (Exp))
+  then
+ Remove_Side_Effects (Prefix (Exp), Name_Req, Variable_Ref);
+ goto Leave;
+
   --  If this is an elementary or a small not-by-reference record type, and
   --  we need to capture th

[Ada] Incorrect accessibility checking on aliased formals

2020-06-03 Thread Pierre-Marie de Rodat
This patch fixes an issue whereby the compiler incorrectly omits static
and dynamic accessibility checks on explicitly aliased parameters within
functions.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-03  Justin Squirek  

gcc/ada/

* libgnat/a-cborse.adb, libgnat/a-cihase.adb,
libgnat/a-ciorse.adb, libgnat/a-coorse.adb: Modified to use
'Unrestricted_Access in certain cases where static accessibility
errors were triggered.
* exp_ch6.adb (Expand_Simple_Return_Statement): Add generation
of dynamic accessibility checks as determined by
Is_Special_Aliased_Formal_Access.
* sem_attr.adb (Resolve_Attribute): Add call to
Is_Special_Aliased_Formal_Access to avoid performing static
checks where dynamic ones are required.
* sem_ch6.adb (Check_Return_Obj_Accessibility): Handle renamed
objects within component associations requiring special
accessibility checks.
* sem_util.adb, sem_util.ads (Is_Special_Aliased_Formal_Access):
Created to detect the special case where an aliased formal is
being compared against the level of an anonymous access return
object.
(Object_Access_Level): Remove incorrect condition leading to
overly permissive accessibility levels being returned on
explicitly aliased parameters.--- gcc/ada/exp_ch6.adb
+++ gcc/ada/exp_ch6.adb
@@ -6769,6 +6769,28 @@ package body Exp_Ch6 is
   --  of the return object to the specific type on assignments to the
   --  individual components.
 
+  procedure Check_Against_Result_Level (Level : Node_Id);
+  --  Check the given accessibility level against the level
+  --  determined by the point of call. (AI05-0234).
+
+  
+  -- Check_Against_Result_Level --
+  
+
+  procedure Check_Against_Result_Level (Level : Node_Id) is
+  begin
+ Insert_Action (N,
+   Make_Raise_Program_Error (Loc,
+ Condition =>
+   Make_Op_Gt (Loc,
+ Left_Opnd  => Level,
+ Right_Opnd =>
+   New_Occurrence_Of
+ (Extra_Accessibility_Of_Result (Scope_Id), Loc)),
+ Reason => PE_Accessibility_Check_Failed));
+  end Check_Against_Result_Level;
+
+   --  Start of processing for Expand_Simple_Function_Return
begin
   if Is_Class_Wide_Type (R_Type)
 and then not Is_Class_Wide_Type (Exptyp)
@@ -7315,6 +7337,16 @@ package body Exp_Ch6 is
  Suppress  => All_Checks);
   end if;
 
+  --  Determine if the special rules within RM 3.10.2 for explicitly
+  --  aliased formals apply to Exp - in which case we require a dynamic
+  --  check to be generated.
+
+  if Is_Special_Aliased_Formal_Access (Exp, Scope_Id) then
+ Check_Against_Result_Level
+   (Make_Integer_Literal (Loc,
+ Object_Access_Level (Entity (Ultimate_Prefix (Prefix (Exp));
+  end if;
+
   --  AI05-0234: RM 6.5(21/3). Check access discriminants to
   --  ensure that the function result does not outlive an
   --  object designated by one of it discriminants.
@@ -7324,28 +7356,6 @@ package body Exp_Ch6 is
   then
  declare
 Discrim_Source : Node_Id;
-
-procedure Check_Against_Result_Level (Level : Node_Id);
---  Check the given accessibility level against the level
---  determined by the point of call. (AI05-0234).
-
-
--- Check_Against_Result_Level --
-
-
-procedure Check_Against_Result_Level (Level : Node_Id) is
-begin
-   Insert_Action (N,
- Make_Raise_Program_Error (Loc,
-   Condition =>
- Make_Op_Gt (Loc,
-   Left_Opnd  => Level,
-   Right_Opnd =>
- New_Occurrence_Of
-   (Extra_Accessibility_Of_Result (Scope_Id), Loc)),
-   Reason => PE_Accessibility_Check_Failed));
-end Check_Against_Result_Level;
-
  begin
 Discrim_Source := Exp;
 while Nkind (Discrim_Source) = N_Qualified_Expression loop

--- gcc/ada/libgnat/a-cborse.adb
+++ gcc/ada/libgnat/a-cborse.adb
@@ -933,7 +933,7 @@ package body Ada.Containers.Bounded_Ordered_Sets is
   Control =>
 (Controlled with
   Container.TC'Unrestricted_Access,
-  Container => Container'Access,
+  Container => Container'Unchecked_Access,
   Pos   => Position,
   Old_Key   => new Key_Type'(Key (Position
 do
@@ -961,7 +961,7

[Ada] Improve handling of SPARK_Mode in generic instances

2020-06-03 Thread Pierre-Marie de Rodat
SPARK_Mode aspect/pragma used to signal parts of the code in SPARK is
now allowed to be Off inside generic instantiations for parts of the
code that should not be considered in SPARK.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-03  Yannick Moy  

gcc/ada/

* rtsfind.adb (Load_RTU): Correctly set/reset global variable to
ignore SPARK_Mode in instances around loading.
* sem_ch6.adb (Analyze_Subprogram_Body_Helper): Accept Off
without prior On.
* sem_ch7.adb (Analyze_Package_Body_Helper): Likewise.
* sem_prag.adb (Analyze_Pragma): Always take into account
SPARK_Mode Off.--- gcc/ada/rtsfind.adb
+++ gcc/ada/rtsfind.adb
@@ -931,6 +931,8 @@ package body Rtsfind is
 
   Saved_GM  : constant Ghost_Mode_Type := Ghost_Mode;
   Saved_IGR : constant Node_Id := Ignored_Ghost_Region;
+  Saved_ISMP : constant Boolean:=
+ Ignore_SPARK_Mode_Pragmas_In_Instance;
   Saved_SM  : constant SPARK_Mode_Type := SPARK_Mode;
   Saved_SMP : constant Node_Id := SPARK_Mode_Pragma;
   --  Save Ghost and SPARK mode-related data to restore on exit
@@ -946,6 +948,7 @@ package body Rtsfind is
 
   --  Provide a clean environment for the unit
 
+  Ignore_SPARK_Mode_Pragmas_In_Instance := False;
   Install_Ghost_Region (None, Empty);
   Install_SPARK_Mode   (None, Empty);
 
@@ -1044,6 +1047,7 @@ package body Rtsfind is
  Set_Is_Potentially_Use_Visible (U.Entity, True);
   end if;
 
+  Ignore_SPARK_Mode_Pragmas_In_Instance := Saved_ISMP;
   Restore_Ghost_Region (Saved_GM, Saved_IGR);
   Restore_SPARK_Mode   (Saved_SM, Saved_SMP);
end Load_RTU;

--- gcc/ada/sem_ch6.adb
+++ gcc/ada/sem_ch6.adb
@@ -4592,6 +4592,15 @@ package body Sem_Ch6 is
  elsif Nkind (Parent (Parent (Spec_Id))) = N_Subprogram_Body_Stub then
 null;
 
+ --  SPARK_Mode Off could complete no SPARK_Mode in a generic, either
+ --  as specified in source code, or because SPARK_Mode On is ignored
+ --  in an instance where the context is SPARK_Mode Off/Auto.
+
+ elsif Get_SPARK_Mode_From_Annotation (SPARK_Pragma (Body_Id)) = Off
+   and then (Is_Generic_Unit (Spec_Id) or else In_Instance)
+ then
+null;
+
  else
 Error_Msg_Sloc := Sloc (SPARK_Pragma (Body_Id));
 Error_Msg_N ("incorrect application of SPARK_Mode #", N);

--- gcc/ada/sem_ch7.adb
+++ gcc/ada/sem_ch7.adb
@@ -956,6 +956,15 @@ package body Sem_Ch7 is
  ("\value Off was set for SPARK_Mode on & #", N, Spec_Id);
 end if;
 
+ --  SPARK_Mode Off could complete no SPARK_Mode in a generic, either
+ --  as specified in source code, or because SPARK_Mode On is ignored
+ --  in an instance where the context is SPARK_Mode Off/Auto.
+
+ elsif Get_SPARK_Mode_From_Annotation (SPARK_Pragma (Body_Id)) = Off
+   and then (Is_Generic_Unit (Spec_Id) or else In_Instance)
+ then
+null;
+
  else
 Error_Msg_Sloc := Sloc (SPARK_Pragma (Body_Id));
 Error_Msg_N ("incorrect application of SPARK_Mode#", N);

--- gcc/ada/sem_prag.adb
+++ gcc/ada/sem_prag.adb
@@ -23448,6 +23448,11 @@ package body Sem_Prag is
   --  pragma in which case the current pragma is illegal as
   --  it cannot "complete".
 
+  elsif Get_SPARK_Mode_From_Annotation (N) = Off
+and then (Is_Generic_Unit (Entity) or else In_Instance)
+  then
+ null;
+
   else
  Error_Msg_N ("incorrect use of SPARK_Mode", Err_N);
  Error_Msg_Sloc := Sloc (Err_Id);
@@ -23773,16 +23778,6 @@ package body Sem_Prag is
  --  Start of processing for Do_SPARK_Mode
 
  begin
---  When a SPARK_Mode pragma appears inside an instantiation whose
---  enclosing context has SPARK_Mode set to "off", the pragma has
---  no semantic effect.
-
-if Ignore_SPARK_Mode_Pragmas_In_Instance then
-   Rewrite (N, Make_Null_Statement (Loc));
-   Analyze (N);
-   return;
-end if;
-
 GNAT_Pragma;
 Check_No_Identifiers;
 Check_At_Most_N_Arguments (1);
@@ -23799,6 +23794,18 @@ package body Sem_Prag is
 Mode_Id := Get_SPARK_Mode_Type (Mode);
 Context := Parent (N);
 
+--  When a SPARK_Mode pragma appears inside an instantiation whose
+--  enclosing context has SPARK_Mode set to "off", the pragma has
+--  no semantic effect.
+
+if Ignore_SPARK_Mode_Pragmas_In_Instance
+  and then Mode_Id /= Off
+then
+   Rewrite (N, Make_Null_Statement (Loc));
+   Analyze (N);
+   return;
+end if;
+
 

[Ada] Small improvement to Expand_N_Unchecked_Type_Conversion

2020-06-03 Thread Pierre-Marie de Rodat
This extends the constant folding done in the procedure to literals of
enumeration types.  This is just a matter of calling Expr_Rep_Value in
lieu of Expr_Value on the operand, since the result is the bit pattern.

No functional changes.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-03  Eric Botcazou  

gcc/ada/

* exp_ch4.adb (Expand_N_Unchecked_Type_Conversion): Also fold the
conversion for literals of enumeration types.--- gcc/ada/exp_ch4.adb
+++ gcc/ada/exp_ch4.adb
@@ -12460,7 +12460,7 @@ package body Exp_Ch4 is
   --  If we have a conversion of a compile time known value to a target
   --  type and the value is in range of the target type, then we can simply
   --  replace the construct by an integer literal of the correct type. We
-  --  only apply this to integer types being converted. Possibly it may
+  --  only apply this to discrete types being converted. Possibly it may
   --  apply in other cases, but it is too much trouble to worry about.
 
   --  Note that we do not do this transformation if the Kill_Range_Check
@@ -12473,13 +12473,13 @@ package body Exp_Ch4 is
 
   if Is_Integer_Type (Target_Type)
 and then not Has_Biased_Representation (Target_Type)
-and then Is_Integer_Type (Operand_Type)
+and then Is_Discrete_Type (Operand_Type)
 and then not Has_Biased_Representation (Operand_Type)
 and then Compile_Time_Known_Value (Operand)
 and then not Kill_Range_Check (N)
   then
  declare
-Val : constant Uint := Expr_Value (Operand);
+Val : constant Uint := Expr_Rep_Value (Operand);
 
  begin
 if Compile_Time_Known_Value (Type_Low_Bound (Target_Type))



[Ada] Small housekeeping work in Check_Private_View

2020-06-03 Thread Pierre-Marie de Rodat
This reverts a series of 3 old changes made without real explanation to
the second main case of Check_Private_View, which appear to be largely
obsolete by now and have made it quite hard to follow, and also changes
it to using In_Open_Scopes as the other cases.

No functional changes.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-03  Eric Botcazou  

gcc/ada/

* sem_ch12.adb (Check_Private_View): Clean up implementation of
second main case, when the generic sees the private declaration.--- gcc/ada/sem_ch12.adb
+++ gcc/ada/sem_ch12.adb
@@ -7431,35 +7431,25 @@ package body Sem_Ch12 is
and then Present (Full_View (T))
and then not In_Open_Scopes (Scope (T))
  then
---  In the generic, the full type was visible. Save the private
---  entity, for subsequent exchange.
+--  In the generic, the full declaration was visible
 
 Switch_View (T);
 
  elsif Has_Private_View (N)
and then not Is_Private_Type (T)
and then not Has_Been_Exchanged (T)
-   and then Etype (Get_Associated_Node (N)) /= T
+   and then (not In_Open_Scopes (Scope (T))
+  or else Nkind (Parent (N)) = N_Subtype_Declaration)
  then
---  Only the private declaration was visible in the generic. If
---  the type appears in a subtype declaration, the subtype in the
+--  In the generic, only the private declaration was visible
+
+--  If the type appears in a subtype declaration, the subtype in
 --  instance must have a view compatible with that of its parent,
 --  which must be exchanged (see corresponding code in Restore_
---  Private_Views). Otherwise, if the type is defined in a parent
---  unit, leave full visibility within instance, which is safe.
-
-if In_Open_Scopes (Scope (Base_Type (T)))
-  and then not Is_Private_Type (Base_Type (T))
-  and then Comes_From_Source (Base_Type (T))
-then
-   null;
+--  Private_Views) so we make an exception to the open scope rule.
 
-elsif Nkind (Parent (N)) = N_Subtype_Declaration
-  or else not In_Private_Part (Scope (Base_Type (T)))
-then
-   Prepend_Elmt (T, Exchanged_Views);
-   Exchange_Declarations (Etype (Get_Associated_Node (N)));
-end if;
+Prepend_Elmt (T, Exchanged_Views);
+Exchange_Declarations (Etype (Get_Associated_Node (N)));
 
  --  For composite types with inconsistent representation exchange
  --  component types accordingly.



[Ada] Iterate with procedural versions of Next_... routines where possible

2020-06-03 Thread Pierre-Marie de Rodat
Routines like Next_Index and Next_Formal are implemented both as
procedures and functions. The procedure variant seems meant to be used
when iterating, e.g.:

   Next_Formal (Formal);

because it is more readable than the corresponding functions:

   Formal := Next_Formal (Formal);

(and it is inlined anyway, so there is no performance penalty). This
patch is to use the procedure variants where possible, where functional
ones were detected with:

  $ grep " \([[:alpha:]_]\+\) := Next[[:alpha:]_]\+ (\1)" *.adb

Semantics is unaffected.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-03  Piotr Trojanek  

gcc/ada/

* checks.adb, einfo.adb, exp_aggr.adb, exp_ch4.adb, exp_ch6.adb,
exp_unst.adb, exp_util.adb, freeze.adb, inline.adb, repinfo.adb,
sem_aggr.adb, sem_attr.adb, sem_aux.adb, sem_ch13.adb,
sem_ch3.adb, sem_ch4.adb, sem_ch8.adb, sem_elab.adb,
sem_eval.adb, sem_prag.adb, sem_res.adb, sem_smem.adb,
sem_util.adb, treepr.adb: Replace uses of Next_ functions with
corresponding procedures.--- gcc/ada/checks.adb
+++ gcc/ada/checks.adb
@@ -4889,7 +4889,7 @@ package body Checks is
 
  Indx := First_Index (Atyp);
  for J in 2 .. Inum loop
-Indx := Next_Index (Indx);
+Next_Index (Indx);
  end loop;
 
  --  If the index type is a formal type or derived from
@@ -6847,7 +6847,7 @@ package body Checks is
   Reason => CE_Index_Check_Failed));
end if;
 
-   A_Idx := Next_Index (A_Idx);
+   Next_Index (A_Idx);
Ind := Ind + 1;
Next (Sub);
 end loop;

--- gcc/ada/einfo.adb
+++ gcc/ada/einfo.adb
@@ -7405,7 +7405,7 @@ package body Einfo is
   Comp_Id := First_Entity (Id);
   while Present (Comp_Id) loop
  exit when Ekind (Comp_Id) = E_Component;
- Comp_Id := Next_Entity (Comp_Id);
+ Next_Entity (Comp_Id);
   end loop;
 
   return Comp_Id;
@@ -7428,7 +7428,7 @@ package body Einfo is
   Comp_Id := First_Entity (Id);
   while Present (Comp_Id) loop
  exit when Ekind_In (Comp_Id, E_Component, E_Discriminant);
- Comp_Id := Next_Entity (Comp_Id);
+ Next_Entity (Comp_Id);
   end loop;
 
   return Comp_Id;
@@ -7755,7 +7755,7 @@ package body Einfo is
 return True;
  end if;
 
- Ent := Next_Entity (Ent);
+ Next_Entity (Ent);
   end loop;
 
   return False;
@@ -8540,7 +8540,7 @@ package body Einfo is
   Comp_Id := Next_Entity (Id);
   while Present (Comp_Id) loop
  exit when Ekind (Comp_Id) = E_Component;
- Comp_Id := Next_Entity (Comp_Id);
+ Next_Entity (Comp_Id);
   end loop;
 
   return Comp_Id;
@@ -8557,7 +8557,7 @@ package body Einfo is
   Comp_Id := Next_Entity (Id);
   while Present (Comp_Id) loop
  exit when Ekind_In (Comp_Id, E_Component, E_Discriminant);
- Comp_Id := Next_Entity (Comp_Id);
+ Next_Entity (Comp_Id);
   end loop;
 
   return Comp_Id;
@@ -8590,7 +8590,7 @@ package body Einfo is
   pragma Assert (Ekind (Id) = E_Discriminant);
 
   loop
- D := Next_Entity (D);
+ Next_Entity (D);
  if No (D)
or else (Ekind (D) /= E_Discriminant
   and then not Is_Itype (D))
@@ -8715,7 +8715,7 @@ package body Einfo is
 N := N + 1;
  end if;
 
- Ent := Next_Entity (Ent);
+ Next_Entity (Ent);
   end loop;
 
   return N;
@@ -8734,7 +8734,7 @@ package body Einfo is
   Formal := First_Formal (Id);
   while Present (Formal) loop
  N := N + 1;
- Formal := Next_Formal (Formal);
+ Next_Formal (Formal);
   end loop;
 
   return N;

--- gcc/ada/exp_aggr.adb
+++ gcc/ada/exp_aggr.adb
@@ -7227,7 +7227,7 @@ package body Exp_Aggr is
 
   Comp := First_Component (Typ);
   while Chars (Comp) /= Name_uParent loop
- Comp := Next_Component (Comp);
+ Next_Component (Comp);
   end loop;
 
   Parent_Name := New_Occurrence_Of (Comp, Loc);
@@ -7364,7 +7364,7 @@ package body Exp_Aggr is
 return False;
  end if;
 
- Indx := Next_Index (Indx);
+ Next_Index (Indx);
   end loop;
end if;
 end;
@@ -,12 +,12 @@ package body Exp_Aggr is
   Comp_Val := Comp_Val mod Uint_2 ** Comp_Size;
   Val := UI_To_Int (Val + Comp_Val * Uint_2 ** Shift);
   Shift := Shift + Incr;
-  One_Comp := Next (One_Comp);
+  Next (One_Comp);
   Packed_Num := Packed_Num + 1;
end if;
 end loop;
 
-

[Ada] Remove more cases of empty loops

2020-06-03 Thread Pierre-Marie de Rodat
We introduce new functions to detect a list of statements with no side
effects as well as a loop with no side effect so that we can then mark
them for removal.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-03  Arnaud Charlet  

gcc/ada/

* sem_util.ads, sem_util.adb (Side_Effect_Free_Statements,
Side_Effect_Free_Loop): New functions.
(Has_Non_Null_Statements): Consider N_Call_Marker as a null
statement.
* sem_ch5.adb (Analyze_Loop_Parameter_Specification): Call
Set_Is_Null_Loop even inside a generic instantiation.
(Analyze_Loop_Statement): Mark for removal loops with no side
effects.--- gcc/ada/sem_ch5.adb
+++ gcc/ada/sem_ch5.adb
@@ -3210,8 +3210,9 @@ package body Sem_Ch5 is
 and then Is_Discrete_Type (Etype (DS))
   then
  declare
-L : Node_Id;
-H : Node_Id;
+L  : Node_Id;
+H  : Node_Id;
+Null_Range : Boolean := False;
 
  begin
 if Nkind (DS) = N_Range then
@@ -3231,6 +3232,14 @@ package body Sem_Ch5 is
 --  null range may be detected statically.
 
 if Compile_Time_Compare (L, H, Assume_Valid => True) = GT then
+   if Compile_Time_Compare (L, H, Assume_Valid => False) = GT then
+  --  Since we know the range of the loop is always null,
+  --  set the appropriate flag to remove the loop entirely
+  --  during expansion.
+
+  Set_Is_Null_Loop (Loop_Nod);
+  Null_Range := True;
+   end if;
 
--  Suppress the warning if inside a generic template or
--  instance, since in practice they tend to be dubious in these
@@ -3241,24 +3250,14 @@ package body Sem_Ch5 is
   --  Specialize msg if invalid values could make the loop
   --  non-null after all.
 
-  if Compile_Time_Compare
-   (L, H, Assume_Valid => False) = GT
-  then
- --  Since we know the range of the loop is null, set the
- --  appropriate flag to remove the loop entirely during
- --  expansion.
-
- Set_Is_Null_Loop (Loop_Nod);
-
+  if Null_Range then
  if Comes_From_Source (N) then
 Error_Msg_N
   ("??loop range is null, loop will not execute", DS);
  end if;
 
- --  Here is where the loop could execute because of
- --  invalid values, so issue appropriate message and in
- --  this case we do not set the Is_Null_Loop flag since
- --  the loop may execute.
+  --  Here is where the loop could execute because of
+  --  invalid values, so issue appropriate message.
 
   elsif Comes_From_Source (N) then
  Error_Msg_N
@@ -3994,6 +3993,12 @@ package body Sem_Ch5 is
  Analyze_Statements (Statements (N));
   end if;
 
+  --  If the loop has no side effects, mark it for removal.
+
+  if Side_Effect_Free_Loop (N) then
+ Set_Is_Null_Loop (N);
+  end if;
+
   --  When the iteration scheme of a loop contains attribute 'Loop_Entry,
   --  the loop is transformed into a conditional block. Retrieve the loop.
 

--- gcc/ada/sem_util.adb
+++ gcc/ada/sem_util.adb
@@ -11536,7 +11536,7 @@ package body Sem_Util is
  Node := First (L);
 
  loop
-if Nkind (Node) /= N_Null_Statement then
+if not Nkind_In (Node, N_Null_Statement, N_Call_Marker) then
return True;
 end if;
 
@@ -11548,6 +11548,91 @@ package body Sem_Util is
   return False;
end Has_Non_Null_Statements;
 
+   -
+   -- Side_Effect_Free_Statements --
+   -
+
+   function Side_Effect_Free_Statements (L : List_Id) return Boolean is
+  Node : Node_Id;
+
+   begin
+  if Is_Non_Empty_List (L) then
+ Node := First (L);
+
+ loop
+case Nkind (Node) is
+   when N_Null_Statement | N_Call_Marker | N_Raise_xxx_Error =>
+  null;
+   when N_Object_Declaration =>
+  if Present (Expression (Node))
+and then not Side_Effect_Free (Expression (Node))
+  then
+ return False;
+  end if;
+
+   when others =>
+  return False;
+end case;
+
+Next (Node);
+exit when Node = Empty;
+ end loop;
+  end if;
+
+  return True;
+   end Side_Effect_Free_Statements;
+
+   ---
+   -- Side_Effect_Free_Loop --
+   ---
+
+   function Side

[Ada] Initial infrastructure for adding a tree checker

2020-06-03 Thread Pierre-Marie de Rodat
This is part of a larger project to add a tree checker in GNAT to check
that the GNAT tree out of expansion is properly formed before passing it
to the code generator.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-03  Arnaud Charlet  

gcc/ada/

* frontend.adb (Frontend): Call (dummy for now) tree checker.
* debug.adb: Reserve -gnatd_V for the tree checker.
* vast.ads, vast.adb: New files.
* gcc-interface/Make-lang.in: Add vast.o.--- gcc/ada/debug.adb
+++ gcc/ada/debug.adb
@@ -193,7 +193,7 @@ package body Debug is
--  d_S
--  d_T  Output trace information on invocation path recording
--  d_U
-   --  d_V
+   --  d_V  Enable verifications on the expanded tree
--  d_W
--  d_X
--  d_Y
@@ -1012,9 +1012,12 @@ package body Debug is
--   it is checked, and the progress of the recursive trace through
--   elaboration calls at compile time.
 
-   --  d_T  The compiler outputs trance information to standard output whenever
+   --  d_T  The compiler outputs trace information to standard output whenever
--   an invocation path is recorded.
 
+   --  d_V  Enable verification of the expanded code before calling the backend
+   --   and generate error messages on each inconsistency found.
+
--  d1   Error messages have node numbers where possible. Normally error
--   messages have only source locations. This option is useful when
--   debugging errors caused by expanded code, where the source location

--- gcc/ada/frontend.adb
+++ gcc/ada/frontend.adb
@@ -66,6 +66,7 @@ with Sinput.L; use Sinput.L;
 with SCIL_LL;
 with Tbuild;   use Tbuild;
 with Types;use Types;
+with VAST;
 
 procedure Frontend is
 begin
@@ -505,6 +506,12 @@ begin
   null;
end if;
 
+   --  Verify the validity of the tree
+
+   if Debug_Flag_Underscore_VV then
+  VAST.Check_Tree (Cunit (Main_Unit));
+   end if;
+
--  Dump the source now. Note that we do this as soon as the analysis
--  of the tree is complete, because it is not just a dump in the case
--  of -gnatD, where it rewrites all source locations in the tree.

--- gcc/ada/gcc-interface/Make-lang.in
+++ gcc/ada/gcc-interface/Make-lang.in
@@ -481,6 +481,7 @@ GNAT_ADA_OBJS =	\
  ada/urealp.o	\
  ada/usage.o	\
  ada/validsw.o	\
+ ada/vast.o	\
  ada/warnsw.o	\
  ada/widechar.o
 

--- /dev/null
new file mode 100644
+++ gcc/ada/vast.adb
@@ -0,0 +1,46 @@
+--
+--  --
+-- GNAT COMPILER COMPONENTS --
+--  --
+-- V A S T  --
+--  --
+-- B o d y  --
+--  --
+-- Copyright (C) 2020, Free Software Foundation, Inc.   --
+--  --
+-- GNAT is free software;  you can  redistribute it  and/or modify it under --
+-- terms of the  GNU General Public License as published  by the Free Soft- --
+-- ware  Foundation;  either version 3,  or (at your option) any later ver- --
+-- sion.  GNAT is distributed in the hope that it will be useful, but WITH- --
+-- OUT ANY WARRANTY;  without even the  implied warranty of MERCHANTABILITY --
+-- or FITNESS FOR A PARTICULAR PURPOSE. --
+--  --
+-- As a special exception under Section 7 of GPL version 3, you are granted --
+-- additional permissions described in the GCC Runtime Library Exception,   --
+-- version 3.1, as published by the Free Software Foundation.   --
+--  --
+-- You should have received a copy of the GNU General Public License and--
+-- a copy of the GCC Runtime Library Exception along with this program; --
+-- see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see--
+-- .  --
+--  --
+-- GNAT was originally developed  by the GNAT team at  New York University. --
+-- Extensive contributions were provided by Ada Core Technologies Inc.  --
+--  --
+--
+
+--  Dummy implementation
+
+package body VAST is
+
+   
+   -- Check_Tree --
+   
+
+   procedure Check_Tree (GNAT_Root :

[Ada] Get rid of more references to Universal_Integer in expanded code

2020-06-03 Thread Pierre-Marie de Rodat
This changes a few places in the System.Atomic_Operations packages to
using static expressions, which guarantees that contant folding is done
in the front-end instead of the code generator.

No functional changes.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-03  Eric Botcazou  

gcc/ada/

* libgnat/s-atopar.adb (Atomic_Fetch_And_Add): Make use of an
equivalent static expression.
(Atomic_Fetch_And_Subtract): Likewise.
(Is_Lock_Free): Likewise.
* libgnat/s-atopex.adb (Atomic_Exchange): Likewise.
(Atomic_Compare_And_Exchange): Likewise.
(Is_Lock_Free): Likewise.--- gcc/ada/libgnat/s-atopar.adb
+++ gcc/ada/libgnat/s-atopar.adb
@@ -88,7 +88,7 @@ package body System.Atomic_Operations.Arithmetic is
   pragma Warnings (On);
 
begin
-  case Item'Size is
+  case Atomic_Type'Size is
  when 8  => return Atomic_Fetch_Add_1 (Item'Address, Value);
  when 16 => return Atomic_Fetch_Add_2 (Item'Address, Value);
  when 32 => return Atomic_Fetch_Add_4 (Item'Address, Value);
@@ -125,7 +125,7 @@ package body System.Atomic_Operations.Arithmetic is
   pragma Warnings (On);
 
begin
-  case Item'Size is
+  case Atomic_Type'Size is
  when 8  => return Atomic_Fetch_Sub_1 (Item'Address, Value);
  when 16 => return Atomic_Fetch_Sub_2 (Item'Address, Value);
  when 32 => return Atomic_Fetch_Sub_4 (Item'Address, Value);
@@ -139,9 +139,10 @@ package body System.Atomic_Operations.Arithmetic is
--
 
function Is_Lock_Free (Item : aliased Atomic_Type) return Boolean is
+  pragma Unreferenced (Item);
   use type Interfaces.C.size_t;
begin
-  return Boolean (Atomic_Always_Lock_Free (Item'Size / 8));
+  return Boolean (Atomic_Always_Lock_Free (Atomic_Type'Size / 8));
end Is_Lock_Free;
 
 end System.Atomic_Operations.Arithmetic;

--- gcc/ada/libgnat/s-atopex.adb
+++ gcc/ada/libgnat/s-atopex.adb
@@ -66,7 +66,7 @@ package body System.Atomic_Operations.Exchange is
   pragma Warnings (On);
 
begin
-  case Item'Size is
+  case Atomic_Type'Size is
  when 8  => return Atomic_Exchange_1 (Item'Address, Value);
  when 16 => return Atomic_Exchange_2 (Item'Address, Value);
  when 32 => return Atomic_Exchange_4 (Item'Address, Value);
@@ -124,7 +124,7 @@ package body System.Atomic_Operations.Exchange is
   pragma Warnings (On);
 
begin
-  case Item'Size is
+  case Atomic_Type'Size is
  when 8 =>
 return Boolean
   (Atomic_Compare_Exchange_1
@@ -151,9 +151,10 @@ package body System.Atomic_Operations.Exchange is
--
 
function Is_Lock_Free (Item : aliased Atomic_Type) return Boolean is
+  pragma Unreferenced (Item);
   use type Interfaces.C.size_t;
begin
-  return Boolean (Atomic_Always_Lock_Free (Item'Size / 8));
+  return Boolean (Atomic_Always_Lock_Free (Atomic_Type'Size / 8));
end Is_Lock_Free;
 
 end System.Atomic_Operations.Exchange;



[Ada] Avoid creating temporaries in Universal_Integer, continued

2020-06-03 Thread Pierre-Marie de Rodat
This gets rid of one more case where the front-end would create a
temporary in Universal_Integer type, which is unnecessary.

The reason is that Universal_Integer must be a type as large as the
largest supported integer type and, therefore, can be much larger than
what is really needed here.

No functional changes.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-03  Eric Botcazou  

gcc/ada/

* exp_util.adb (Remove_Side_Effects): For a type conversion, do
not remove the side effects of the expression only if it is of
universal integer type.--- gcc/ada/exp_util.adb
+++ gcc/ada/exp_util.adb
@@ -11428,12 +11428,15 @@ package body Exp_Util is
  goto Leave;
 
   --  If this is a type conversion, leave the type conversion and remove
-  --  the side effects in the expression. This is important in several
-  --  circumstances: for change of representations, and also when this is a
-  --  view conversion to a smaller object, where gigi can end up creating
-  --  its own temporary of the wrong size.
-
-  elsif Nkind (Exp) = N_Type_Conversion then
+  --  side effects in the expression, unless it is of universal integer,
+  --  which is a very large type for a temporary. This is important in
+  --  several circumstances: for change of representations and also when
+  --  this is a view conversion to a smaller object, where gigi can end
+  --  up creating its own temporary of the wrong size.
+
+  elsif Nkind (Exp) = N_Type_Conversion
+and then Etype (Expression (Exp)) /= Universal_Integer
+  then
  Remove_Side_Effects (Expression (Exp), Name_Req, Variable_Ref);
 
  --  Generating C code the type conversion of an access to constrained



[Ada] Get rid of more references to Universal_Integer in expanded code

2020-06-03 Thread Pierre-Marie de Rodat
This further tweaks the expanded code generated by the front-end
for attributes returning Universal_Integer, in particular removes
hardcoded references to it and fixes a couple of type mismatches
in the process.

The only observable change is that 'Val is now expanded by the
front-end for integer types, for the sake of consistency with 'Pos.

No functional changes.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-03  Eric Botcazou  

gcc/ada/

* exp_attr.adb (Expand_N_Attribute_Reference) :
In the case of an enumeration type, do an intermediate
conversion to a small integer type.  Remove useless stuff.
: Do not hardcode Universal_Integer and
fix a type mismatch in the assignment to the variable.
: Likewise.
: Do not redefine the Ptyp local variable.
: Likewise.
: Likewise.
: Small tweaks.
: For an enumeration type with standard representation,
apply the range check to the expression of a convertion to
Universal_Integer, if any.  For an integer type, expand to
a mere conversion.--- gcc/ada/exp_attr.adb
+++ gcc/ada/exp_attr.adb
@@ -2817,7 +2817,7 @@ package body Exp_Attr is
  --  If the prefix is an access to object, the attribute applies to
  --  the designated object, so rewrite with an explicit dereference.
 
- elsif Is_Access_Type (Etype (Pref))
+ elsif Is_Access_Type (Ptyp)
and then
  (not Is_Entity_Name (Pref) or else Is_Object (Entity (Pref)))
  then
@@ -3133,6 +3133,8 @@ package body Exp_Attr is
 
   when Attribute_Enum_Rep => Enum_Rep : declare
  Expr : Node_Id;
+ Ityp : Entity_Id;
+ Psiz : Uint;
 
   begin
  --  Get the expression, which is X for Enum_Type'Enum_Rep (X) or
@@ -3180,11 +3182,34 @@ package body Exp_Attr is
  --  make sure that the analyzer does not complain about what otherwise
  --  might be an illegal conversion.
 
+ --  However the target type is universal integer in most cases, which
+ --  is a very large type, so in the case of an enumeration type, we
+ --  first convert to a small signed integer type in order not to lose
+ --  the size information.
+
+ elsif Is_Enumeration_Type (Ptyp) then
+Psiz := RM_Size (Base_Type (Ptyp));
+
+if Psiz < 8 then
+   Ityp := Standard_Integer_8;
+
+elsif Psiz < 16 then
+   Ityp := Standard_Integer_16;
+
+elsif Psiz < 32 then
+   Ityp := Standard_Integer_32;
+
+else
+   Ityp := Standard_Integer_64;
+end if;
+
+Rewrite (N, OK_Convert_To (Ityp, Expr));
+Convert_To_And_Rewrite (Typ, N);
+
  else
-Rewrite (N, OK_Convert_To (Typ, Relocate_Node (Expr)));
+Rewrite (N, OK_Convert_To (Typ, Expr));
  end if;
 
- Set_Etype (N, Typ);
  Analyze_And_Resolve (N, Typ);
   end Enum_Rep;
 
@@ -3275,11 +3300,10 @@ package body Exp_Attr is
  function Calculate_Header_Size return Node_Id is
  begin
 --  Generate:
---Universal_Integer
---  (Header_Size_With_Padding (Pref'Alignment))
+--Typ (Header_Size_With_Padding (Pref'Alignment))
 
 return
-  Convert_To (Universal_Integer,
+  Convert_To (Typ,
 Make_Function_Call (Loc,
   Name   =>
 New_Occurrence_Of (RTE (RE_Header_Size_With_Padding), Loc),
@@ -3307,9 +3331,7 @@ package body Exp_Attr is
  --Size : Integer := 0;
  --
  --if Needs_Finalization (Pref'Tag) then
- --   Size :=
- -- Universal_Integer
- --   (Header_Size_With_Padding (Pref'Alignment));
+ --   Size := Integer (Header_Size_With_Padding (Pref'Alignment));
  --end if;
  --
  --  and the attribute reference is replaced with a reference to Size.
@@ -3331,8 +3353,7 @@ package body Exp_Attr is
   --  Generate:
   --if Needs_Finalization (Pref'Tag) then
   --   Size :=
-  -- Universal_Integer
-  --   (Header_Size_With_Padding (Pref'Alignment));
+  -- Integer (Header_Size_With_Padding (Pref'Alignment));
   --end if;
 
   Make_If_Statement (Loc,
@@ -3349,7 +3370,9 @@ package body Exp_Attr is
 Then_Statements=> New_List (
Make_Assignment_Statement (Loc,
  Name   => New_Occurrence_Of (Size, Loc),
- Expression => Calculate_Header_Size);
+ Expression =>
+   Convert_To
+ (Standard_Integer, Calculate_Header_Size));
 
  

[Ada] Improve 'Val implementation for some enumeration types

2020-06-03 Thread Pierre-Marie de Rodat
This improves the expanded code generated by the front-end for the Val,
Pred and Succ attributes when they are applied to an enumeration type
with a non-standard but contiguous representation.  There is no need to
call the Rep_To_Pos routine to generate the required constraint checks
in this case, it is straightforward to do it manually.

For the following procedure:

procedure P (N : Natural) is

  type Color is (Blue, White, Red);
  for Color use (Blue => -1, White => 0, Red => 1);

  C1, C2, C3 : Color;

begin
  C1 := Color'Val (N);
  C2 := Color'Pred (C1);
  C3 := Color'Succ (C1);
end;

the expanded code for the assignment statements must now be:

   [constraint_error when
 not (n in 0 .. 2)
 "range check failed"]
   c1 := p__color!(-1 + integer(n));
   [constraint_error when
 c1 = blue
 "overflow check failed"]
   c2 := p__color!(integer!(c1) - 1);
   [constraint_error when
 c1 = red
 "overflow check failed"]
   c3 := p__color!(integer!(c1) + 1);

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-03  Eric Botcazou  

gcc/ada/

* einfo.ads (Enum_Pos_To_Rep): Adjust description.
* exp_attr.adb (Expand_N_Attribute_Reference) :
Reimplement in the case of an enumeration type with non-standard
but contiguous representation.
: Likewise.
: Likewise.
* exp_ch3.adb (Expand_Freeze_Enumeration_Type): Count the
literals in the first loop.  If the representation is
contiguous, just build the index type of the array type and set
Enum_Pos_To_Rep to it.--- gcc/ada/einfo.ads
+++ gcc/ada/einfo.ads
@@ -1217,14 +1217,16 @@ package Einfo is
 --   for the same literal.
 
 --Enum_Pos_To_Rep (Node23)
---   Defined in enumeration types (but not enumeration subtypes). Set to
---   Empty unless the enumeration type has a non-standard representation
---   (i.e. at least one literal has a representation value different from
---   its pos value). In this case, Enum_Pos_To_Rep is the entity for an
---   array constructed when the type is frozen that maps Pos values to
---   corresponding Rep values. The index type of this array is Natural,
---   and the component type is a suitable integer type that holds the
---   full range of representation values.
+--   Defined in enumeration types, but not enumeration subtypes. Set to
+--   Empty unless the enumeration type has a non-standard representation,
+--   i.e. at least one literal has a representation value different from
+--   its position value. In this case, the alternative is the following:
+--   if the representation is not contiguous, then Enum_Pos_To_Rep is the
+--   entity for an array constant built when the type is frozen that maps
+--   Pos values to corresponding Rep values, whose index type is Natural
+--   and whose component type is the enumeration type itself; or else, if
+--   the representation is contiguous, then Enum_Pos_To_Rep is the entity
+--   of the index type defined above.
 
 --Equivalent_Type (Node18)
 --   Defined in class wide types and subtypes, access to protected

--- gcc/ada/exp_attr.adb
+++ gcc/ada/exp_attr.adb
@@ -5246,46 +5246,48 @@ package body Exp_Attr is
 
   when Attribute_Pred => Pred : declare
  Etyp : constant Entity_Id := Base_Type (Ptyp);
+ Ityp : Entity_Id;
 
   begin
-
  --  For enumeration types with non-standard representations, we
- --  expand typ'Pred (x) into
+ --  expand typ'Pred (x) into:
 
  --Pos_To_Rep (Rep_To_Pos (x) - 1)
 
- --If the representation is contiguous, we compute instead
- --Lit1 + Rep_to_Pos (x -1), to catch invalid representations.
- --The conversion function Enum_Pos_To_Rep is defined on the
- --base type, not the subtype, so we have to use the base type
- --explicitly for this and other enumeration attributes.
+ --  if the representation is non-contiguous, and just x - 1 if it is
+ --  after having dealt with constraint checking.
 
- if Is_Enumeration_Type (Ptyp)
+ if Is_Enumeration_Type (Etyp)
and then Present (Enum_Pos_To_Rep (Etyp))
  then
 if Has_Contiguous_Rep (Etyp) then
-   Rewrite (N,
-  Unchecked_Convert_To (Ptyp,
- Make_Op_Add (Loc,
-Left_Opnd  =>
- Make_Integer_Literal (Loc,
-   Enumeration_Rep (First_Literal (Ptyp))),
-Right_Opnd =>
-  Make_Function_Call (Loc,
-Name =>
-  New_Occurrence_Of
-   (TSS (Etyp, TSS_Rep_To_Pos), Loc),
+   if not Range_Checks_Suppressed (Ptyp) then
+  Set_Do_Range_Check (First (Exprs), False);
+  Expand_Pre

[Ada] Support pragma Allow_Integer_Address on 64-bit targets

2020-06-03 Thread Pierre-Marie de Rodat
This implements the documented semantics of the pragma for operators,
that is to say, introduces an implicit unchecked conversion from the
integer value to type System.Address, thus making the pragma work
on 64-bit targets as well.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-03  Eric Botcazou  

gcc/ada/

* opt.ads (Allow_Integer_Address): Fix typo in comment.
* stand.ads (Standard_Address): New entity.
* cstand.adb (Create_Standard): Create it.
* sem_ch4.adb (Operator_Check): Convert the operands of an
operation with addresses and integers to Standard_Address
if pragma Allow_Integer_Address is in effect.--- gcc/ada/cstand.adb
+++ gcc/ada/cstand.adb
@@ -1372,11 +1372,18 @@ package body CStand is
  "long_long_unsigned");
 
   --  Standard_Unsigned_64 is not user visible, but is used internally. It
-  --  is an unsigned type mod 2**64, 64-bits unsigned, size is 64.
+  --  is an unsigned type mod 2**64 with 64 bits size.
 
   Standard_Unsigned_64 := New_Standard_Entity;
   Build_Unsigned_Integer_Type (Standard_Unsigned_64, 64, "unsigned_64");
 
+  --  Standard_Address is not user visible, but is used internally. It is
+  --  an unsigned type mod 2**System_Address_Size with System.Address size.
+
+  Standard_Address := New_Standard_Entity;
+  Build_Unsigned_Integer_Type
+(Standard_Address, System_Address_Size, "standard_address");
+
   --  Note: universal integer and universal real are constructed as fully
   --  formed signed numeric types, with parameters corresponding to the
   --  longest runtime types (Long_Long_Integer and Long_Long_Float). This

--- gcc/ada/opt.ads
+++ gcc/ada/opt.ads
@@ -210,7 +210,7 @@ package Opt is
Allow_Integer_Address : Boolean := False;
--  GNAT
--  Allow use of integer expression in a context requiring System.Address.
-   --  Set by the use of configuration pragma Allow_Integer_Address Also set
+   --  Set by the use of configuration pragma Allow_Integer_Address. Also set
--  in relaxed semantics mode for use by CodePeer or when -gnatd.M is used.
 
All_Sources : Boolean := False;

--- gcc/ada/sem_ch4.adb
+++ gcc/ada/sem_ch4.adb
@@ -7168,9 +7168,8 @@ package body Sem_Ch4 is
N_Op_Divide,
N_Op_Ge,
N_Op_Gt,
-   N_Op_Le)
-  or else
-  Nkind_In (N, N_Op_Lt,
+   N_Op_Le,
+   N_Op_Lt,
N_Op_Mod,
N_Op_Multiply,
N_Op_Rem,
@@ -7183,8 +7182,12 @@ package body Sem_Ch4 is
  and then not Is_Numeric_Type (Etype (R))
then
   if Address_Integer_Convert_OK (Etype (R), Etype (L)) then
+ Rewrite (L,
+   Unchecked_Convert_To (
+ Standard_Address, Relocate_Node (L)));
  Rewrite (R,
-   Unchecked_Convert_To (Etype (L), Relocate_Node (R)));
+   Unchecked_Convert_To (
+ Standard_Address, Relocate_Node (R)));
 
  if Nkind_In (N, N_Op_Ge, N_Op_Gt, N_Op_Le, N_Op_Lt) then
 Analyze_Comparison_Op (N);
@@ -7202,7 +7205,11 @@ package body Sem_Ch4 is
then
   if Address_Integer_Convert_OK (Etype (L), Etype (R)) then
  Rewrite (L,
-   Unchecked_Convert_To (Etype (R), Relocate_Node (L)));
+   Unchecked_Convert_To (
+ Standard_Address, Relocate_Node (L)));
+ Rewrite (R,
+   Unchecked_Convert_To (
+ Standard_Address, Relocate_Node (R)));
 
  if Nkind_In (N, N_Op_Ge, N_Op_Gt, N_Op_Le, N_Op_Lt) then
 Analyze_Comparison_Op (N);
@@ -7229,10 +7236,10 @@ package body Sem_Ch4 is
   begin
  Rewrite (L,
Unchecked_Convert_To (
- Standard_Integer, Relocate_Node (L)));
+ Standard_Address, Relocate_Node (L)));
  Rewrite (R,
Unchecked_Convert_To (
- Standard_Integer, Relocate_Node (R)));
+ Standard_Address, Relocate_Node (R)));
 
  if Nkind_In (N, N_Op_Ge, N_Op_Gt, N_Op_Le, N_Op_Lt) then
 Analyze_Comparison_Op (N);
@@ -7330,8 +7337,12 @@ package body Sem_Ch4 is
 
 elsif Nkind_In (N, N_Op_Eq, N_Op_Ne) then
if Address_Integer_Convert_OK (Etype (R), Etype (L)) then
+  Rewrite (L,
+Unchecked_Convert_To (
+  Standard_

[Ada] Spurious ineffective with clause warnings on use type clause

2020-06-03 Thread Pierre-Marie de Rodat
This patch fixes an issue whereby the compiler generates spurious
ineffective with_clause warnings under -gnatwr when the only use of the
with'ed package occurs in a use_type clause in the context area of the
body of the target package and such use_type clause has multiple
prefixes.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-03  Justin Squirek  

gcc/ada/

* sem_ch10.adb (Process_Body_Clauses): Add loop to interate
through all prefixes in a use_type clause so that all packages
in the expanded name get examined for effectiveness.--- gcc/ada/sem_ch10.adb
+++ gcc/ada/sem_ch10.adb
@@ -320,7 +320,6 @@ package body Sem_Ch10 is
 Nam_Ent   : constant Entity_Id := Entity (Name (Clause));
 Cont_Item : Node_Id;
 Prag_Unit : Node_Id;
-Subt_Mark : Node_Id;
 Use_Item  : Node_Id;
 
 function Same_Unit (N : Node_Id; P : Entity_Id) return Boolean;
@@ -390,12 +389,24 @@ package body Sem_Ch10 is
elsif Nkind (Cont_Item) = N_Use_Type_Clause
  and then not Used_Type_Or_Elab
then
-  Subt_Mark := Subtype_Mark (Cont_Item);
-  if not Used_Type_Or_Elab
-and then Same_Unit (Prefix (Subt_Mark), Nam_Ent)
-  then
- Used_Type_Or_Elab := True;
-  end if;
+  declare
+ UE : Node_Id;
+
+  begin
+ --  Loop through prefixes looking for a match
+
+ UE := Prefix (Subtype_Mark (Cont_Item));
+ loop
+if not Used_Type_Or_Elab
+  and then Same_Unit (UE, Nam_Ent)
+then
+   Used_Type_Or_Elab := True;
+end if;
+
+exit when Nkind (UE) /= N_Expanded_Name;
+UE := Prefix (UE);
+ end loop;
+  end;
 
--  Pragma Elaborate or Elaborate_All
 



[Ada] Fix for missing calls to Adjust primitive with nested generics

2020-06-03 Thread Pierre-Marie de Rodat
This changes the strategy for dealing with the limitations of the
Has_Private_View mechanism, which is responsible for updating the views
in instantiations of a type first declared as private; this is necessary
because the views of a type may be different at the declaration point of
a generic and at the instantiation point, and the semantic analysis of
the instance must be done with the view at the declaration point, which
may thus require swapping views.

The Has_Private_View mechanism cannot directly handle the cases where
the type is only referenced implicitly in the generic tree, i.e. not
from a node of the tree but only from another type, for example as
Component_Type of an array type.  So the flag was "overloaded" for types
that can reference other types, like array types, which turns out to be
problematic when the array type and its component type can have
independent views.

The patch removes the overloading mechanism and replaces it with an
ad-hoc mechanism in Copy_Generic_Node to deal with the aforementioned
problematic cases, which may be quite rare in practice because it is
apparently only needed for an old unofficial ACATS test (cc5008a).

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-03  Eric Botcazou  

gcc/ada/

* sem_ch12.adb (Denotes_Previous_Actual): Delete.
(Check_Generic_Actuals): Do not special case array types whose
component type denotes a previous actual.  Do not special case
access types whose base type is private.
(Check_Private_View): Remove code dealing with secondary types.
Do not switch the views of an array because of its component.
(Copy_Generic_Node): Add special handling for a comparison
operator on array types.
(Instantiate_Type): Do not special case access types whose
designated type is private.
(Set_Global_Type): Do not special case array types whose
component type is private.--- gcc/ada/sem_ch12.adb
+++ gcc/ada/sem_ch12.adb
@@ -6794,48 +6794,6 @@ package body Sem_Ch12 is
   E  : Entity_Id;
   Astype : Entity_Id;
 
-  function Denotes_Previous_Actual (Typ : Entity_Id) return Boolean;
-  --  For a formal that is an array type, the component type is often a
-  --  previous formal in the same unit. The privacy status of the component
-  --  type will have been examined earlier in the traversal of the
-  --  corresponding actuals, and this status should not be modified for
-  --  the array (sub)type itself. However, if the base type of the array
-  --  (sub)type is private, its full view must be restored in the body to
-  --  be consistent with subsequent index subtypes, etc.
-  --
-  --  To detect this case we have to rescan the list of formals, which is
-  --  usually short enough to ignore the resulting inefficiency.
-
-  -
-  -- Denotes_Previous_Actual --
-  -
-
-  function Denotes_Previous_Actual (Typ : Entity_Id) return Boolean is
- Prev : Entity_Id;
-
-  begin
- Prev := First_Entity (Instance);
- while Present (Prev) loop
-if Is_Type (Prev)
-  and then Nkind (Parent (Prev)) = N_Subtype_Declaration
-  and then Is_Entity_Name (Subtype_Indication (Parent (Prev)))
-  and then Entity (Subtype_Indication (Parent (Prev))) = Typ
-then
-   return True;
-
-elsif Prev = E then
-   return False;
-
-else
-   Next_Entity (Prev);
-end if;
- end loop;
-
- return False;
-  end Denotes_Previous_Actual;
-
-   --  Start of processing for Check_Generic_Actuals
-
begin
   E := First_Entity (Instance);
   while Present (E) loop
@@ -6844,14 +6802,7 @@ package body Sem_Ch12 is
and then Scope (Etype (E)) /= Instance
and then Is_Entity_Name (Subtype_Indication (Parent (E)))
  then
-if Is_Array_Type (E)
-  and then not Is_Private_Type (Etype (E))
-  and then Denotes_Previous_Actual (Component_Type (E))
-then
-   null;
-else
-   Check_Private_View (Subtype_Indication (Parent (E)));
-end if;
+Check_Private_View (Subtype_Indication (Parent (E)));
 
 Set_Is_Generic_Actual_Type (E);
 
@@ -6886,15 +6837,6 @@ package body Sem_Ch12 is
 
 if Is_Discrete_Or_Fixed_Point_Type (E) then
Set_RM_Size (E, RM_Size (Astype));
-
---  In nested instances, the base type of an access actual may
---  itself be private, and need to be exchanged.
-
-elsif Is_Access_Type (E)
-  and then Is_Private_Type (Etype (E))
-then
-   Check_Private_View
- (New_Occurrence_Of (Etype (E), Sloc (Instance)));
 end if;
 
  elsif Ekin

[Ada] Add detection of uninitialized big reals

2020-06-03 Thread Pierre-Marie de Rodat
This was there for big integers after the previous code changes but not
(yet) for big reals, now done.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-03  Arnaud Charlet  

gcc/ada/

* libgnat/a-nbnbin.ads: Minor reformatting.
* libgnat/a-nbnbre.ads, libgnat/a-nbnbre.adb (Is_Valid): Add
convention Intrinsic. Add detection of uninitialized big reals.--- gcc/ada/libgnat/a-nbnbin.ads
+++ gcc/ada/libgnat/a-nbnbin.ads
@@ -13,9 +13,9 @@
 --  --
 --
 
-with Ada.Finalization;
 with Ada.Streams;
 
+private with Ada.Finalization;
 private with System;
 
 --  Note that some Ada 2020 aspects are commented out since they are not
@@ -89,7 +89,7 @@ is
 
end Unsigned_Conversions;
 
-   function To_String (Arg : Big_Integer;
+   function To_String (Arg   : Big_Integer;
Width : Field := 0;
Base  : Number_Base := 10) return String
  with Post => To_String'Result'First = 1;

--- gcc/ada/libgnat/a-nbnbre.adb
+++ gcc/ada/libgnat/a-nbnbre.adb
@@ -46,7 +46,7 @@ package body Ada.Numerics.Big_Numbers.Big_Reals is
--
 
function Is_Valid (Arg : Big_Real) return Boolean is
- (Is_Valid (Arg.Num) and then Is_Valid (Arg.Den));
+ (Is_Valid (Arg.Num) and Is_Valid (Arg.Den));
 
-
-- "/" --
@@ -69,13 +69,17 @@ package body Ada.Numerics.Big_Numbers.Big_Reals is
-- Numerator --
---
 
-   function Numerator (Arg : Big_Real) return Big_Integer is (Arg.Num);
+   function Numerator (Arg : Big_Real) return Big_Integer is
+ (if Is_Valid (Arg.Num) then Arg.Num
+  else raise Constraint_Error with "invalid big real");
 
-
-- Denominator --
-
 
-   function Denominator (Arg : Big_Real) return Big_Positive is (Arg.Den);
+   function Denominator (Arg : Big_Real) return Big_Positive is
+ (if Is_Valid (Arg.Den) then Arg.Den
+  else raise Constraint_Error with "invalid big real");
 
-
-- "=" --
@@ -409,6 +413,10 @@ package body Ada.Numerics.Big_Numbers.Big_Reals is
function "+" (L : Big_Real) return Big_Real is
   Result : Big_Real;
begin
+  if not Is_Valid (L) then
+ raise Constraint_Error with "invalid big real";
+  end if;
+
   Result.Num := L.Num;
   Result.Den := L.Den;
   return Result;
@@ -419,14 +427,16 @@ package body Ada.Numerics.Big_Numbers.Big_Reals is
-
 
function "-" (L : Big_Real) return Big_Real is
- (Num => -L.Num, Den => L.Den);
+ (if Is_Valid (L) then (Num => -L.Num, Den => L.Den)
+  else raise Constraint_Error with "invalid big real");
 
---
-- "abs" --
---
 
function "abs" (L : Big_Real) return Big_Real is
- (Num => abs L.Num, Den => L.Den);
+ (if Is_Valid (L) then (Num => abs L.Num, Den => L.Den)
+  else raise Constraint_Error with "invalid big real");
 
-
-- "+" --
@@ -435,6 +445,10 @@ package body Ada.Numerics.Big_Numbers.Big_Reals is
function "+" (L, R : Big_Real) return Big_Real is
   Result : Big_Real;
begin
+  if not Is_Valid (L) or not Is_Valid (R) then
+ raise Constraint_Error with "invalid big real";
+  end if;
+
   Result.Num := L.Num * R.Den + R.Num * L.Den;
   Result.Den := L.Den * R.Den;
   Normalize (Result);
@@ -448,6 +462,10 @@ package body Ada.Numerics.Big_Numbers.Big_Reals is
function "-" (L, R : Big_Real) return Big_Real is
   Result : Big_Real;
begin
+  if not Is_Valid (L) or not Is_Valid (R) then
+ raise Constraint_Error with "invalid big real";
+  end if;
+
   Result.Num := L.Num * R.Den - R.Num * L.Den;
   Result.Den := L.Den * R.Den;
   Normalize (Result);
@@ -461,6 +479,10 @@ package body Ada.Numerics.Big_Numbers.Big_Reals is
function "*" (L, R : Big_Real) return Big_Real is
   Result : Big_Real;
begin
+  if not Is_Valid (L) or not Is_Valid (R) then
+ raise Constraint_Error with "invalid big real";
+  end if;
+
   Result.Num := L.Num * R.Num;
   Result.Den := L.Den * R.Den;
   Normalize (Result);
@@ -474,6 +496,10 @@ package body Ada.Numerics.Big_Numbers.Big_Reals is
function "/" (L, R : Big_Real) return Big_Real is
   Result : Big_Real;
begin
+  if not Is_Valid (L) or not Is_Valid (R) then
+ raise Constraint_Error with "invalid big real";
+  end if;
+
   Result.Num := L.Num * R.Den;
   Result.Den := L.Den * R.Num;
   Normalize (Result);
@@ -487,6 +513,10 @@ package body Ada.Numerics.Big_Numbers.Big_Reals is
function "**" (L : Big_Real; R : Integer) return Big_Real is
   Result : Big_Real;
begin
+  if not Is_Valid (L) then
+ raise Constraint_Error with "invalid big real";
+  end if;
+
   if R = 0 then
  Result.Num := 

Re: [PATCH 1/2] [aarch64] Rework fpcr fpsr getter/setter builtins

2020-06-03 Thread Andrea Corallo
Andrea Corallo  writes:

> Andrea Corallo  writes:
>
>> Hi all,
>>
>> I'd like to submit this patch introducing the following 64bit builtins
>> variants as FPCR and FPSR registers getter/setter:
>>
>> unsigned long long __builtin_aarch64_get_fpcr64 ()
>> void __builtin_aarch64_set_fpcr64 (unsigned long long)
>> unsigned long long __builtin_aarch64_get_fpsr64 ()
>> void __builtin_aarch64_set_fpsr64 (unsigned long long)
>>
>> Regards
>>   Andrea
>>
>> gcc/ChangeLog:
>>
>> 2020-??-??  Andrea Corallo  
>>
>>  * config/aarch64/aarch64-builtins.c (aarch64_builtins): Add enums
>>  for 64bits fpsr/fpcr getter setters builtin variants.
>>  (aarch64_init_fpsr_fpcr_builtins): New function.
>>  (aarch64_expand_fcr_fpsr_builtin): New function.
>>  (aarch64_general_expand_builtin): Modify to make use of the later.
>>  * config/aarch64/aarch64.md (UNSPECV_GET_FPCR64)
>>  (UNSPECV_SET_FPCR64, UNSPECV_GET_FPSR64, UNSPECV_SET_FPSR64): Add
>>  4 new unpecv.
>>  (set_fpcr64, get_fpcr64,set_fpsr64, get_fpsr64): New patterns.
>>  * doc/extend.texi (__builtin_aarch64_get_fpcr64)
>>  (__builtin_aarch64_set_fpcr64, __builtin_aarch64_get_fpsr64)
>>  (__builtin_aarch64_set_fpsr64): Add into AArch64 Built-in
>>  Functions.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2020-??-??  Andrea Corallo  
>>
>>  * gcc.target/aarch64/get_fpcr64.c: New test.
>>  * gcc.target/aarch64/set_fpcr64.c: New test.
>>  * gcc.target/aarch64/get_fpsr64.c: New test.
>>  * gcc.target/aarch64/set_fpsr64.c: New test.
>
> Hi all,
>
> Leaving aside 2/2, I've retested this one (1/2) on top of current
> master.
>
> Regtested and bootstrapped on aarch64-linux-gnu.
>
> Is it okay for trunk?
>
> Regards
>
>   Andrea

Ping

  Andrea


[PATCH] libgccjit: Add new gcc_jit_context_new_blob entry point

2020-06-03 Thread Andrea Corallo
Hi all,

I'd like to submit this patch to extend libgccjit to allow for storing
binary blobs (equivalent to initialized char arrays).

The use-case is when libgccjit is used as a back-end for dynamic
programming languages.  In this case is often necessary to store
serialized objects into the compilation unit as support to the
generated code.

The proposed entry point is the following:

gcc_jit_lvalue *
gcc_jit_context_new_blob (gcc_jit_context *ctxt,
  gcc_jit_location *loc,
  enum gcc_jit_global_kind kind,
  const void *ptr,
  size_t size,
  const char *name);

This create a special kind of libgccjit global that will be initialized
with the memory content pointed by 'ptr'.

I've added a testcase and the regression is clean.  I've also tested
it with the libgccjit based Emacs disabling the current workaround we have
for this.

Feedback is very welcome.

  Andrea

gcc/jit/ChangeLog

2020-06-02  Andrea Corallo  

* docs/topics/compatibility.rst (LIBGCCJIT_ABI_14): New ABI tag.

* docs/topics/expressions.rst (Binary Blobs): New section
documenting gcc_jit_context_new_blob.

* jit-common.h: Document class blob.

* jit-playback.c (playback::context::global_new_decl)
(playback::context::global_finalize_lvalue): New methods.
(playback::context::new_global): Make use of global_new_decl,
global_finalize_lvalue.
(playback::context::new_blob): New method.

* jit-playback.h (new_blob, global_new_decl): New method
declarations.

* jit-recording.c (recording::context::new_blob)
(recording::blob::replay_into)
(recording::global::write_qualifier_to_dump): New methods.
(recording::global::write_to_dump): Use write_qualifier_to_dump.
(recording::blob::write_to_dump): New method.

* jit-recording.h (class blob): New class.
(class global): Have m_kind and m_name as protected.
(class global): Remove FINAL qualifier to replay_into and
write_to_dump.
(class global): New method write_qualifier_to_dump decl.
(class context): New method new_blob decl.

* libgccjit++.h (context::new_blob): New method.

* libgccjit.c (gcc_jit_lvalue_as_rvalue): New function.

* libgccjit.h (gcc_jit_context_new_blob): New function
declaration.
(LIBGCCJIT_HAVE_gcc_jit_context_new_blob): New macro.

* libgccjit.map (LIBGCCJIT_ABI_14): New ABI tag.

gcc/testsuite/ChangeLog

2020-06-02  Andrea Corallo  

* jit.dg/all-non-failing-tests.h: Add test-blob.c.

* jit.dg/test-blob.c: New test.

>From b861fbbfbcbf18cc1b69dc4b7f91c596af40c8e4 Mon Sep 17 00:00:00 2001
From: Andrea Corallo 
Date: Sat, 30 May 2020 10:33:08 +0100
Subject: [PATCH] Add new gcc_jit_context_new_blob entry point

gcc/jit/ChangeLog

2020-06-02  Andrea Corallo  

	* docs/topics/compatibility.rst (LIBGCCJIT_ABI_14): New ABI tag.

	* docs/topics/expressions.rst (Binary Blobs): New section
	documenting gcc_jit_context_new_blob.

	* jit-common.h: Document class blob.

	* jit-playback.c (playback::context::global_new_decl)
	(playback::context::global_finalize_lvalue): New methods.
	(playback::context::new_global): Make use of global_new_decl,
	global_finalize_lvalue.
	(playback::context::new_blob): New method.

	* jit-playback.h (new_blob, global_new_decl): New method
	declarations.

	* jit-recording.c (recording::context::new_blob)
	(recording::blob::replay_into)
	(recording::global::write_qualifier_to_dump): New methods.
	(recording::global::write_to_dump): Use write_qualifier_to_dump.
	(recording::blob::write_to_dump): New method.

	* jit-recording.h (class blob): New class.
	(class global): Have m_kind and m_name as protected.
	(class global): Remove FINAL qualifier to replay_into and
	write_to_dump.
	(class global): New method write_qualifier_to_dump decl.
	(class context): New method new_blob decl.

	* libgccjit++.h (context::new_blob): New method.

	* libgccjit.c (gcc_jit_lvalue_as_rvalue): New function.

	* libgccjit.h (gcc_jit_context_new_blob): New function
	declaration.
	(LIBGCCJIT_HAVE_gcc_jit_context_new_blob): New macro.

	* libgccjit.map (LIBGCCJIT_ABI_14): New ABI tag.

gcc/testsuite/ChangeLog

2020-06-02  Andrea Corallo  

	* jit.dg/all-non-failing-tests.h: Add test-blob.c.

	* jit.dg/test-blob.c: New test.
---
 gcc/jit/docs/topics/compatibility.rst|  5 ++
 gcc/jit/docs/topics/expressions.rst  | 49 ++
 gcc/jit/jit-common.h |  1 +
 gcc/jit/jit-playback.c   | 67 --
 gcc/jit/jit-playback.h   | 16 
 gcc/jit/jit-recording.c  | 95 +---
 gcc/jit/jit-recording.h  | 48 +-
 gcc/jit/libgccjit++.h| 21 +
 gcc/jit/libgccjit.c   

RE: [EXT] Re: [PATCH] Optimize and+or+sub into xor+not (PR94882)

2020-06-03 Thread Naveen Hurugalawadi via Gcc-patches
Hi Richard,

Thanks for reviewing the patch and sharing your comments.

>> pattern should be guarded with !TYPE_OVERFLOW_SANITIZED
>> and/or !TYPE_OVERFLOW_TRAPS
Done.

>> can you add some cases with unsigned types and types
>> smaller than int (thus with integral promotion applied)?
Done.

Please find attached the modified patch as per your suggestions.
Bootstrapped and regression tested on x86_64-pc-linux-gnu.

Thanks,
Naveen


pr94882.patch
Description: pr94882.patch


Re: [PATCH 0/4] IVOPTs consider step cost for different forms when unrolling

2020-06-03 Thread Kewen.Lin via Gcc-patches
on 2020/6/3 下午5:27, Richard Biener wrote:
> On Wed, 3 Jun 2020, Kewen.Lin wrote:
> 
>> on 2020/6/3 下午3:07, Richard Biener wrote:
>>> On Wed, 3 Jun 2020, Kewen.Lin wrote:
>>>
 Hi Richi,


snip ...

>
> I'd just mention there are other targets that have the choice between
> the above forms.  Since IVOPTs itself does not perform the unrolling
> the IL it produces is the same, correct?
>
 Yes.  Before this patch, IVOPTs doesn't consider the unrolling impacts,
 it only models things based on what it sees.  We can assume it thinks
 later RTL unrolling won't perform.

 With this patch, since the IV choice probably changes, the IL can probably
 change.  The typical difference with this patch is:

   vect__1.7_15 = MEM[symbol: x, index: ivtmp.19_22, offset: 0B];
 vs.
   vect__1.7_15 = MEM[base: _29, offset: 0B];
>>>
>>> So we're asking IVOPTS "if we were unrolling this loop would you make
>>> a different IV choice?" thus I wonder why we need so much complexity
>>> here?  
>>
>> I would describe it more like "we are going to unroll this loop with
>> unroll factor uf in RTL, would you consider this variable when modeling?"
>>
>> In most cases, one single iteration is representative for the unrolled
>> body, so it doesn't matter considering unrolling or not.  But for the
>> case here, it's not true, expected reg_offset iv cand can make iv cand
>> step cost reduced, it leads the difference.
>>
>>> That is, if we can classify the loop as being possibly unrolled
>>> we could evaluate IVOPTs IV choice (and overall cost) on the original
>>> loop and in a second run on the original loop with fake IV uses
>>> added with extra offset.  If the overall IV cost is similar we'll
>>> take the unroll friendly choice if the costs are way different
>>> (I wouldn't expect this to be the case ever?) I'd side with the
>>> IV choice when not unrolling (and mark the loop as to be not unrolled).
>>>
>>
>> Could you elaborate it a bit?  I guess it won't estimate the unroll
>> factor here, just guess it's to be unrolled or not?  The second run
>> with fake IV uses added with extra offset sounds like scaling up the 
>> iv group cost by uf.
> 
> From your example above the D-form (MEM[symbol: x, index: ivtmp.19_22, 
> offset: 0B]) is preferable since in the unrolled variant we have
> the same addres but with a different constant offset for the unroll
> copies while the second form would have to update the 'base' IV.
> 
> Thus I think the difference in IV cost and decision should already
> show up if we, for each USE add a USE with an added constant offset.
> This might be what your patch does with that extra flag on the USEs,
> I was suggesting to model the USEs more explicitely, simulating a
> 2-way unroll.  I think in the end I'll defer to Bin here who knows
> the code best.
> 

Thanks for your further explanation!  As your proposal we introduce more
iv use groups with step added.  Take the example here
https://gcc.gnu.org/pipermail/gcc-patches/2020-June/547128.html
Imagining initially the cand iv 4 leading to x-form wins, it's the
original iv, has the iv-group cost 1 against the address group.
Although we introduce one more group (2-way unrolling), the iv still
wins since pulling the address iv in takes 5 (15 for three).  Probably
we can introduce more groups according to uf here.

OK.  Looking forward to Bin's comments.

>>> Thus I'd err on the side of not unrolling but leave the ultimate choice
>>> of whether to unroll to RTL unless IV cost makes that prohibitive.
>>>
>>> Even without X- or D- form addressing modes the IV choice may differ
>>> and I think we don't need extra knobs for the unroller but instead
>>> can decide to set the existing n_unroll to zero (force not unroll)
>>> when costs say it would be bad?
>>
>> Yes, even without x- or d- form addressing, the difference probably comes 
>> from compare type IV use for loop ending, maybe more cases which I am not
>> aware of.  But I don't see people care about it, probably the impact is
>> small.
>>
>> IIUC what you stated here looks like to use ivopts information for unrolling
>> factor decision, I think this is a separate direction, do we have this
>> kind of case where ivopts costs can foresee the unrolling?
>>
>> Now the unroll factor estimation can be used for other optimization passes
>> if they are wondering future unrolling factor decision, as discussed it
>> sounds a good idea to override the n_unroll with some benchmarking.
> 
> I didnt' suggest to use IVOPTs to determine the unroll factor.  In
> fact your patch looks like it does this?  Instead I wanted to make
> IVOPTs choose a set of IVs that is best for a blend of both worlds - use
> D-form when it doesn't hurt the not unrolled code [much], and X-form
> when the D-form is way worse (for whatever reason) and signal that
> to the unroller (but we could chose to not do that).
> 

Sorry for my weak comprehension!  Nice, we are on the same direction.  :)

>

Re: [PATCH 0/4] IVOPTs consider step cost for different forms when unrolling

2020-06-03 Thread Richard Sandiford
"Kewen.Lin"  writes:
> on 2020/6/3 下午5:27, Richard Biener wrote:
>> On Wed, 3 Jun 2020, Kewen.Lin wrote:
>> 
>>> on 2020/6/3 下午3:07, Richard Biener wrote:
 On Wed, 3 Jun 2020, Kewen.Lin wrote:

> Hi Richi,
>
>
> snip ...
>
>>
>> I'd just mention there are other targets that have the choice between
>> the above forms.  Since IVOPTs itself does not perform the unrolling
>> the IL it produces is the same, correct?
>>
> Yes.  Before this patch, IVOPTs doesn't consider the unrolling impacts,
> it only models things based on what it sees.  We can assume it thinks
> later RTL unrolling won't perform.
>
> With this patch, since the IV choice probably changes, the IL can probably
> change.  The typical difference with this patch is:
>
>   vect__1.7_15 = MEM[symbol: x, index: ivtmp.19_22, offset: 0B];
> vs.
>   vect__1.7_15 = MEM[base: _29, offset: 0B];

 So we're asking IVOPTS "if we were unrolling this loop would you make
 a different IV choice?" thus I wonder why we need so much complexity
 here?  
>>>
>>> I would describe it more like "we are going to unroll this loop with
>>> unroll factor uf in RTL, would you consider this variable when modeling?"
>>>
>>> In most cases, one single iteration is representative for the unrolled
>>> body, so it doesn't matter considering unrolling or not.  But for the
>>> case here, it's not true, expected reg_offset iv cand can make iv cand
>>> step cost reduced, it leads the difference.
>>>
 That is, if we can classify the loop as being possibly unrolled
 we could evaluate IVOPTs IV choice (and overall cost) on the original
 loop and in a second run on the original loop with fake IV uses
 added with extra offset.  If the overall IV cost is similar we'll
 take the unroll friendly choice if the costs are way different
 (I wouldn't expect this to be the case ever?) I'd side with the
 IV choice when not unrolling (and mark the loop as to be not unrolled).

>>>
>>> Could you elaborate it a bit?  I guess it won't estimate the unroll
>>> factor here, just guess it's to be unrolled or not?  The second run
>>> with fake IV uses added with extra offset sounds like scaling up the 
>>> iv group cost by uf.
>> 
>> From your example above the D-form (MEM[symbol: x, index: ivtmp.19_22, 
>> offset: 0B]) is preferable since in the unrolled variant we have
>> the same addres but with a different constant offset for the unroll
>> copies while the second form would have to update the 'base' IV.
>> 
>> Thus I think the difference in IV cost and decision should already
>> show up if we, for each USE add a USE with an added constant offset.
>> This might be what your patch does with that extra flag on the USEs,
>> I was suggesting to model the USEs more explicitely, simulating a
>> 2-way unroll.  I think in the end I'll defer to Bin here who knows
>> the code best.
>> 
>
> Thanks for your further explanation!  As your proposal we introduce more
> iv use groups with step added.  Take the example here
> https://gcc.gnu.org/pipermail/gcc-patches/2020-June/547128.html
> Imagining initially the cand iv 4 leading to x-form wins, it's the
> original iv, has the iv-group cost 1 against the address group.
> Although we introduce one more group (2-way unrolling), the iv still
> wins since pulling the address iv in takes 5 (15 for three).  Probably
> we can introduce more groups according to uf here.

Yeah, to summarise that thread: the idea there was that we would
continue to cost each use once, but base the cost on the kind of address
seen in the unrolled iterations.  I guess this tends to over-estimate the
cost of index IVs to some extent, but I too was aiming for something
simple that doesn't depend on a specific unroll factor.

Kewen's point there was that that approach works for high unroll factors,
but not for small unroll factors like 2.  For:

  LD A = baseA, X
  LD B = baseB, X
  ST C = baseC, X
  X = X + stride
  LD A = baseA, X
  LD B = baseB, X
  ST C = baseC, X
  X = X + stride

using X as an IV is still preferred.  It's only once the unroll
factor exceeds the number of pointer IVs that using pointer IVs
becomes better.

So like Kewen says, using 2 USEs (the original one and an unrolled one)
would have the opposite problem: it would still prefer index IVs and not
consider the benefit of pointer IVs at higher unroll factors.

But I agree that trying to guess what a much later pass will do doesn't
feel very clean either...

Thanks,
Richard


[PATCH] simplify visited flag unsetting in SLP scalar cost computation

2020-06-03 Thread Richard Biener
This changes the loop over all stmts to a loop over all costed stmts.

Bootstrapped / tested on x86_64-unknown-linux-gnu, applied.

2020-06-03  Richard Biener  

* tree-vect-slp.c (vect_bb_vectorization_profitable_p): Loop over
the cost vector to unset the visited flag on stmts.
---
 gcc/tree-vect-slp.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 836defce990..f1d61944ae0 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -3078,6 +3078,11 @@ vect_bb_vectorization_profitable_p (bb_vec_info bb_vinfo)
   SLP_INSTANCE_TREE (instance),
   &life, &scalar_costs, visited);
 }
+  /* Unset visited flag.  */
+  stmt_info_for_cost *si;
+  FOR_EACH_VEC_ELT (scalar_costs, i, si)
+gimple_set_visited  (si->stmt_info->stmt, false);
+
   void *target_cost_data = init_cost (NULL);
   add_stmt_costs (bb_vinfo, target_cost_data, &scalar_costs);
   scalar_costs.release ();
@@ -3085,11 +3090,6 @@ vect_bb_vectorization_profitable_p (bb_vec_info bb_vinfo)
   finish_cost (target_cost_data, &dummy, &scalar_cost, &dummy);
   destroy_cost_data (target_cost_data);
 
-  /* Unset visited flag.  */
-  for (gimple_stmt_iterator gsi = bb_vinfo->region_begin;
-   gsi_stmt (gsi) != gsi_stmt (bb_vinfo->region_end); gsi_next (&gsi))
-gimple_set_visited  (gsi_stmt (gsi), false);
-
   /* Complete the target-specific cost calculation.  */
   finish_cost (BB_VINFO_TARGET_COST_DATA (bb_vinfo), &vec_prologue_cost,
   &vec_inside_cost, &vec_epilogue_cost);
-- 
2.25.1


[PATCH] middle-end/95493 - bogus MEM_ATTRs from variable ARRAY_REFs

2020-06-03 Thread Richard Biener


The following patch avoids keeping the inherited MEM_ATTRs when
set_mem_attributes_minus_bitpos is called with a variable ARRAY_REF.
The inherited ones may not reflect the correct offset.

We try to preserve the MEM_EXPR from the base object expansion
but we have to make sure to also preserve the corresponding MEM_ALIAS_SET
then since otherwise path-based disambiguation gets confused.
The other choice we'd have would be to drop the MEM_EXPR but keep
the alias-set from the full reference.  Or, of course "improve"
MEM_EXPR generation to not "fail" in case the ARRAY_REF offset is
variable (we keep variable ARRAY_REFs in the MEM_EXPR if there's
a COMPONENT_REF ontop of the ref for example).  I 
understand the "inheriting" be exactly useful in the
case we otherwise give up (I've added an assert that we're not
inheriting from sth when creating a stack slot).

The expand_assignment fix could be treated separately (in different
code paths we are already preserving MEM_ALIAS_SET this way).

As said the alternative fix for set_mem_attrs_minus_bitpos would
be to remove all the ARRAY_REF special-casing and do the same
as we do for MEM_REF.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Any comments?

Thanks,
Richard.

2020-06-03  Richard Biener  

PR middle-end/95493
* emit-rtl.c (set_mem_attrs_minus_bitpos): Clear
offset-known when the ARRAY_REF is variable.  If we end up
inherting the MEM_EXPR from the original MEM_ATTRS then also
inherit the corresponding MEM_ALIAS_SET.  Avoid altering
MEM_KEEP_ALIAS_SET_P in that case.
* expr.c (expand_assignment): Preserve MEM_ALIAS_SET when emitting
the store.

* g++.dg/torture/pr95493.C: New testcase.
---
 gcc/emit-rtl.c | 59 ++--
 gcc/expr.c |  6 ++-
 gcc/testsuite/g++.dg/torture/pr95493.C | 62 ++
 3 files changed, 103 insertions(+), 24 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/torture/pr95493.C

diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
index 2b790636366..09521b6f85e 100644
--- a/gcc/emit-rtl.c
+++ b/gcc/emit-rtl.c
@@ -1964,8 +1964,9 @@ set_mem_attributes_minus_bitpos (rtx ref, tree t, int 
objectp,
   refattrs = MEM_ATTRS (ref);
   if (refattrs)
 {
-  /* ??? Can this ever happen?  Calling this routine on a MEM that
-already carries memory attributes should probably be invalid.  */
+  gcc_assert (! TYPE_P (t));
+  /* We are calling this function with attributes set up from expanding
+the base of a memory reference as returned by get_inner_referece.  */
   attrs.expr = refattrs->expr;
   attrs.offset_known_p = refattrs->offset_known_p;
   attrs.offset = refattrs->offset;
@@ -2052,11 +2053,6 @@ set_mem_attributes_minus_bitpos (rtx ref, tree t, int 
objectp,
as = TYPE_ADDR_SPACE (TREE_TYPE (base));
}
 
-  /* If this expression uses it's parent's alias set, mark it such
-that we won't change it.  */
-  if (component_uses_parent_alias_set_from (t) != NULL_TREE)
-   MEM_KEEP_ALIAS_SET_P (ref) = 1;
-
   /* If this is a decl, set the attributes of the MEM from it.  */
   if (DECL_P (t))
{
@@ -2114,6 +2110,9 @@ set_mem_attributes_minus_bitpos (rtx ref, tree t, int 
objectp,
}
  while (TREE_CODE (t2) == ARRAY_REF);
 
+ attrs.offset_known_p = false;
+ attrs.offset = 0;
+ apply_bitpos = 0;
  if (DECL_P (t2)
  || (TREE_CODE (t2) == COMPONENT_REF
  /* For trailing arrays t2 doesn't have a size that
@@ -2141,24 +2140,38 @@ set_mem_attributes_minus_bitpos (rtx ref, tree t, int 
objectp,
  apply_bitpos = bitpos;
}
 
-  /* If this is a reference based on a partitioned decl replace the
-base with a MEM_REF of the pointer representative we created
-during stack slot partitioning.  */
-  if (attrs.expr
- && VAR_P (base)
- && ! is_global_var (base)
- && cfun->gimple_df->decls_to_pointers != NULL)
+  /* If we chose to keep the MEM_EXPR from the inner reference
+expansion make sure to also keep the corresponding alias-set
+since otherwise path-based disambiguation is confused.  */
+  if (refattrs
+ && attrs.expr == refattrs->expr)
+   attrs.alias = refattrs->alias;
+  else if (attrs.expr)
{
- tree *namep = cfun->gimple_df->decls_to_pointers->get (base);
- if (namep)
+ /* If this expression uses it's parent's alias set, mark it such
+that we won't change it.  */
+ if (component_uses_parent_alias_set_from (t) != NULL_TREE)
+   MEM_KEEP_ALIAS_SET_P (ref) = 1;
+
+ /* If this is a reference based on a partitioned decl replace the
+base with a MEM_REF of the pointer representative we created
+during stack slot partitioning.  */
+ if 

[Patch, fortran] PR fortran/94022 - Array slices of assumed-size arrays

2020-06-03 Thread José Rui Faustino de Sousa via Gcc-patches

Hi All!

Proposed patch to Bug 94022 - Array slices of assumed-size arrays.

Patch tested only on x86_64-pc-linux-gnu.

Make sure that when passing array sections of assumed-size arrays to 
procedures expecting an assumed-rank array the upper bound of the last 
dimension of the array section does not get improperly reset to -1 to 
mark it has an assumed size array.


Best regards,
José Rui


2020-6-3  José Rui Faustino de Sousa  

 PR fortran/94022
 * trans-expr.c (gfc_conv_procedure_call): In the case of assumed-size
 arrays ensure that the reference is to a full array.

2020-6-3  José Rui Faustino de Sousa  

 PR fortran/94022
 * PR94022.f90: New test.

diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
index 33fc061..2e221b5 100644
--- a/gcc/fortran/trans-expr.c
+++ b/gcc/fortran/trans-expr.c
@@ -6243,6 +6243,8 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
  || gfc_expr_attr (e).allocatable)
set_dtype_for_unallocated (&parmse, e);
  else if (e->expr_type == EXPR_VARIABLE
+  && e->ref
+  && e->ref->u.ar.type == AR_FULL
   && e->symtree->n.sym->attr.dummy
   && e->symtree->n.sym->as
   && e->symtree->n.sym->as->type == AS_ASSUMED_SIZE)
diff --git a/gcc/testsuite/gfortran.dg/PR94022.f90 
b/gcc/testsuite/gfortran.dg/PR94022.f90
new file mode 100644
index 000..63b7d90
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/PR94022.f90
@@ -0,0 +1,132 @@
+! { dg-do run }
+!
+! Test the fix for PR94022
+!
+
+function isasa_f(a) result(s)
+  implicit none
+
+  integer, intent(in) :: a(..)
+  
+  logical :: s
+  
+  select rank(a)
+  rank(*)
+s = .true.
+  rank default
+s = .false.
+  end select
+  return
+end function isasa_f
+
+function isasa_c(a) result(s) bind(c)
+  use, intrinsic :: iso_c_binding, only: c_int, c_bool
+
+  implicit none
+
+  integer(kind=c_int), intent(in) :: a(..)
+  
+  logical(kind=c_bool) :: s
+  
+  select rank(a)
+  rank(*)
+s = .true.
+  rank default
+s = .false.
+  end select
+  return
+end function isasa_c
+
+program isasa_p
+
+  implicit none
+
+  interface
+function isasa_f(a) result(s)
+  implicit none
+  integer, intent(in) :: a(..)
+  logical :: s
+end function isasa_f
+function isasa_c(a) result(s) bind(c)
+  use, intrinsic :: iso_c_binding, only: c_int, c_bool
+  implicit none
+  integer(kind=c_int), intent(in) :: a(..)
+  logical(kind=c_bool):: s
+end function isasa_c
+  end interface
+
+  integer, parameter :: sz = 7
+  integer, parameter :: lb = 3
+  integer, parameter :: ub = 9
+  integer, parameter :: ex = ub-lb+1
+
+  integer :: arr(sz,lb:ub)
+
+  arr = 1
+  if (asaf_a(arr, lb+1, ub-1)) stop 1
+  if (asaf_p(arr, lb+1, ub-1)) stop 2
+  if (asaf_a(arr, 2, ex-1))stop 3
+  if (asaf_p(arr, 2, ex-1))stop 4
+  if (asac_a(arr, lb+1, ub-1)) stop 5
+  if (asac_p(arr, lb+1, ub-1)) stop 6
+  if (asac_a(arr, 2, ex-1))stop 7
+  if (asac_p(arr, 2, ex-1))stop 8
+  
+  stop
+
+contains
+
+  function asaf_a(a, lb, ub) result(s)
+integer, intent(in) :: lb
+integer, target, intent(in) :: a(sz,lb:*)
+integer, intent(in) :: ub
+
+logical :: s
+
+s = isasa_f(a(:,lb:ub))
+return
+  end function asaf_a
+
+  function asaf_p(a, lb, ub) result(s)
+integer, intent(in) :: lb
+integer, target, intent(in) :: a(sz,lb:*)
+integer, intent(in) :: ub
+
+logical :: s
+
+integer, pointer :: p(:,:)
+
+p => a(:,lb:ub)
+s = isasa_f(p)
+return
+  end function asaf_p
+
+  function asac_a(a, lb, ub) result(s)
+integer, intent(in) :: lb
+integer, target, intent(in) :: a(sz,lb:*)
+integer, intent(in) :: ub
+
+logical :: s
+
+s = logical(isasa_c(a(:,lb:ub)))
+return
+  end function asac_a
+
+  function asac_p(a, lb, ub) result(s)
+integer, intent(in) :: lb
+integer, target, intent(in) :: a(sz,lb:*)
+integer, intent(in) :: ub
+
+logical :: s
+
+integer, pointer :: p(:,:)
+
+p => a(:,lb:ub)
+s = logical(isasa_c(p))
+return
+  end function asac_p
+
+end program isasa_p
+
+
+  


Re: [PATCH 2/7] [OpenACC] Adjust dynamic reference count semantics

2020-06-03 Thread Thomas Schwinge
Hi Julian!

On 2020-05-22T15:16:05-0700, Julian Brown  wrote:
> This patch adjusts the semantics of dynamic reference counts, as described
> in the parent email.

Thanks!

A few questions, but no need to send an updated patch.

> --- a/libgomp/oacc-mem.c
> +++ b/libgomp/oacc-mem.c

> @@ -1018,13 +1036,102 @@ goacc_enter_data_internal (struct gomp_device_descr 
> *acc_dev, size_t mapnum,
>  {
>for (size_t i = 0; i < mapnum; i++)
>  {
> -  int group_last = find_group_last (i, mapnum, sizes, kinds);
> +  splay_tree_key n;
> +  size_t group_last = find_group_last (i, mapnum, sizes, kinds);
> +  bool struct_p = false;
> +  size_t size, groupnum = (group_last - i) + 1;
>
> -  gomp_map_vars_async (acc_dev, aq,
> -(group_last - i) + 1,
> -&hostaddrs[i], NULL,
> -&sizes[i], &kinds[i], true,
> -GOMP_MAP_VARS_OPENACC_ENTER_DATA);
> +  switch (kinds[i] & 0xff)
> + {
> + case GOMP_MAP_STRUCT:
> +   {
> + int last = i + sizes[i];

The 'last' calculated here must always equal the 'group_last' calculated
above.  ;-) (... so we might just use 'group_last' instead of 'last' in
the following.)

> + size = (uintptr_t) hostaddrs[last] + sizes[last]
> +- (uintptr_t) hostaddrs[i];
> + struct_p = true;
> +   }
> +   break;
> +
> + case GOMP_MAP_ATTACH:
> +   size = sizeof (void *);
> +   break;
> +
> + default:
> +   size = sizes[i];
> + }
> +
> +  n = lookup_host (acc_dev, hostaddrs[i], size);
> +

> +  if (n && struct_p)
> + {
> +   if (n->refcount != REFCOUNT_INFINITY)
> + n->refcount += groupnum - 1;
> +   n->dynamic_refcount += groupnum - 1;
> +   gomp_mutex_unlock (&acc_dev->lock);
> + }

Is the 'GOMP_MAP_STRUCT' handling here specifically necessary, or is that
just an optimization of the 'n && groupnum > 1' case below?

> +  else if (n && groupnum == 1)
> + {
> +   void *h = hostaddrs[i];
> +   size_t s = sizes[i];
> +
> +   /* A standalone attach clause.  */
> +   if ((kinds[i] & 0xff) == GOMP_MAP_ATTACH)
> + gomp_attach_pointer (acc_dev, aq, &acc_dev->mem_map, n,
> +  (uintptr_t) h, s, NULL);
> +   else if (h + s > (void *) n->host_end)
> + {
> +   gomp_mutex_unlock (&acc_dev->lock);
> +   gomp_fatal ("[%p,+%d] not mapped", (void *)h, (int)s);
> + }
> +
> +   assert (n->refcount != REFCOUNT_LINK);
> +   if (n->refcount != REFCOUNT_INFINITY)
> + n->refcount++;
> +   n->dynamic_refcount++;
> +
> +   gomp_mutex_unlock (&acc_dev->lock);
> + }

> +  else if (n && groupnum > 1)
> + {
> +   assert (n->refcount != REFCOUNT_INFINITY
> +   && n->refcount != REFCOUNT_LINK);
> +
> +   bool processed = false;
> +
> +   struct target_mem_desc *tgt = n->tgt;
> +   for (size_t j = 0; j < tgt->list_count; j++)
> + if (tgt->list[j].key == n)
> +   {
> + for (size_t k = 0; k < groupnum; k++)
> +   if (j + k < tgt->list_count && tgt->list[j + k].key)
> + {
> +   tgt->list[j + k].key->refcount++;
> +   tgt->list[j + k].key->dynamic_refcount++;
> + }
> + processed = true;
> +   }
> +
> +   gomp_mutex_unlock (&acc_dev->lock);
> +   if (!processed)
> + gomp_fatal ("dynamic refcount incrementing failed for "
> + "pointer/pset");
> + }

Please add some text to explain the nested 'j', 'k' loops and their 'if'
conditionals, and the 'groupnum' usage in the 'k' loop boundary.  Should
the 'k' loop maybe run 'for (size_t k = j; k < tgt->list_count; ++k)'
(..., or is 'groupnum' relevant?), and in the loop body then use 'k'
instead of 'j + k'?  (Maybe I've now confused myself, staring at this for
a while...)

> +  else if (hostaddrs[i])
> + {
> +   gomp_mutex_unlock (&acc_dev->lock);
> +
> +   struct target_mem_desc *tgt
> + = gomp_map_vars_async (acc_dev, aq, groupnum, &hostaddrs[i], NULL,
> +&sizes[i], &kinds[i], true,
> +GOMP_MAP_VARS_ENTER_DATA);
> +   assert (tgt);
> +   for (size_t j = 0; j < tgt->list_count; j++)
> + {
> +   n = tgt->list[j].key;
> +   if (n)
> + n->dynamic_refcount++;
> + }
> + }

... else nothing.  This latter "nothing" case (not present, and no
'hostaddrs[i]') is exercised by
'libgomp.oacc-fortran/optional-data-enter-exit.f90' (only).  Is that
alright?

>
>i = group_last;
>  }


> @@ -1137,45 +1241,40 @@ goacc_exit_data_internal (struct gomp_device_descr 
> *acc_dev, size_t mapnum,

(Diff slightly edited.)

>   if (n->refcount == 0)
> -   gomp_remove_var_async (acc_dev, n, aq);

> +   {

Re: [PATCH v2] arm: Warn if IRQ handler is not compiled with -mgeneral-regs-only [PR target/94743]

2020-06-03 Thread Christophe Lyon via Gcc-patches
Ping?
https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545747.html

On Wed, 27 May 2020 at 13:52, Christophe Lyon
 wrote:
>
> Ping?
>
> On Thu, 14 May 2020 at 16:57, Christophe Lyon
>  wrote:
> >
> > The interrupt attribute does not guarantee that the FP registers are
> > saved, which can result in problems difficult to debug.
> >
> > Saving the FP registers and status registers can be a large penalty,
> > so it's probably not desirable to do that all the time.
> >
> > If the handler calls other functions, we'd likely need to save all of
> > them, for lack of knowledge of which registers they actually clobber.
> >
> > This is even more obscure for the end-user when the compiler inserts
> > calls to helper functions such as memcpy (some multilibs do use FP
> > registers to speed it up).
> >
> > In the PR, we discussed adding routines in libgcc to save the FP
> > context and saving only locally-clobbered FP registers, but this seems
> > to be too much work for the purpose, given that in general such
> > handlers try to avoid this kind of penalty.
> > I suspect we would also want new attributes to instruct the compiler
> > that saving the FP context is not needed.
> >
> > In the mean time, emit a warning to suggest re-compiling with
> > -mgeneral-regs-only. Note that this can lead to errors if the code
> > uses floating-point and -mfloat-abi=hard, eg:
> > argument of type 'double' not permitted with -mgeneral-regs-only
> >
> > This can be troublesome for the user, but at least this would make
> > him aware of the latent issue.
> >
> > The patch adds several testcases:
> >
> > - pr94734-1-hard.c checks that a warning is emitted when using
> >   -mfloat-abi=hard. Function IRQ_HDLR_Test can make implicit calls to
> >   runtime floating-point routines (or direct use of FP instructions),
> >   IRQ_HDLR_Test2 doesn't. We emit a warning in both cases, though.
> >
> > - pr94734-1-softfp.c: same as above wih -mfloat-abi=softfp.
> >
> > - pr94734-1-soft.c checks that no warning is emitted when using
> >   -mfloat-abi=soft when the same code as above.
> >
> > - pr94734-2.c checks that no warning is emitted when using
> >   -mgeneral-regs-only.
> >
> > - pr94734-3.c checks that no warning is emitted when using
> >   -mgeneral-regs-only even using float-point data.
> >
> > 2020-05-14  Christophe Lyon  
> >
> > PR target/94743
> > gcc/
> > * config/arm/arm.c (arm_handle_isr_attribute): Warn if
> > -mgeneral-regs-only is not used.
> >
> > gcc/testsuite/
> > * gcc.misc-tests/arm-isr.c: Add -mgeneral-regs-only.
> > * gcc.target/arm/empty_fiq_handler.c: Add -mgeneral-regs-only.
> > * gcc.target/arm/interrupt-1.c: Add -mgeneral-regs-only.
> > * gcc.target/arm/interrupt-2.c: Add -mgeneral-regs-only.
> > * gcc.target/arm/pr70830.c: Add -mgeneral-regs-only.
> > * gcc.target/arm/pr94743-1-hard.c: New test.
> > * gcc.target/arm/pr94743-1-soft.c: New test.
> > * gcc.target/arm/pr94743-1-softfp.c: New test.
> > * gcc.target/arm/pr94743-2.c: New test.
> > * gcc.target/arm/pr94743-3.c: New test.
> > ---
> >  gcc/config/arm/arm.c |  5 
> >  gcc/testsuite/gcc.misc-tests/arm-isr.c   |  2 ++
> >  gcc/testsuite/gcc.target/arm/empty_fiq_handler.c |  1 +
> >  gcc/testsuite/gcc.target/arm/interrupt-1.c   |  2 +-
> >  gcc/testsuite/gcc.target/arm/interrupt-2.c   |  2 +-
> >  gcc/testsuite/gcc.target/arm/pr70830.c   |  2 +-
> >  gcc/testsuite/gcc.target/arm/pr94743-1-hard.c| 29 
> > 
> >  gcc/testsuite/gcc.target/arm/pr94743-1-soft.c| 27 
> > ++
> >  gcc/testsuite/gcc.target/arm/pr94743-1-softfp.c  | 29 
> > 
> >  gcc/testsuite/gcc.target/arm/pr94743-2.c | 22 ++
> >  gcc/testsuite/gcc.target/arm/pr94743-3.c | 23 +++
> >  11 files changed, 141 insertions(+), 3 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/arm/pr94743-1-hard.c
> >  create mode 100644 gcc/testsuite/gcc.target/arm/pr94743-1-soft.c
> >  create mode 100644 gcc/testsuite/gcc.target/arm/pr94743-1-softfp.c
> >  create mode 100644 gcc/testsuite/gcc.target/arm/pr94743-2.c
> >  create mode 100644 gcc/testsuite/gcc.target/arm/pr94743-3.c
> >
> > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> > index dda8771..c88de3e 100644
> > --- a/gcc/config/arm/arm.c
> > +++ b/gcc/config/arm/arm.c
> > @@ -7232,6 +7232,11 @@ arm_handle_isr_attribute (tree *node, tree name, 
> > tree args, int flags,
> >name);
> >   *no_add_attrs = true;
> > }
> > +  else if (TARGET_VFP_BASE)
> > +   {
> > + warning (OPT_Wattributes, "FP registers might be clobbered 
> > despite %qE attribute: compile with -mgeneral-regs-only",
> > +  name);
> > +   }
> >/* FIXME: the argument if any is checked for type attributes;
> >

Re: [PATCH] middle-end/95493 - bogus MEM_ATTRs from variable ARRAY_REFs

2020-06-03 Thread Richard Biener
On Wed, 3 Jun 2020, Richard Biener wrote:

> 
> The following patch avoids keeping the inherited MEM_ATTRs when
> set_mem_attributes_minus_bitpos is called with a variable ARRAY_REF.
> The inherited ones may not reflect the correct offset.
> 
> We try to preserve the MEM_EXPR from the base object expansion
> but we have to make sure to also preserve the corresponding MEM_ALIAS_SET
> then since otherwise path-based disambiguation gets confused.
> The other choice we'd have would be to drop the MEM_EXPR but keep
> the alias-set from the full reference.  Or, of course "improve"
> MEM_EXPR generation to not "fail" in case the ARRAY_REF offset is
> variable (we keep variable ARRAY_REFs in the MEM_EXPR if there's
> a COMPONENT_REF ontop of the ref for example).  I 
> understand the "inheriting" be exactly useful in the
> case we otherwise give up (I've added an assert that we're not
> inheriting from sth when creating a stack slot).
> 
> The expand_assignment fix could be treated separately (in different
> code paths we are already preserving MEM_ALIAS_SET this way).
> 
> As said the alternative fix for set_mem_attrs_minus_bitpos would
> be to remove all the ARRAY_REF special-casing and do the same
> as we do for MEM_REF.
> 
> Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Testing passed but only with the assert removed - at least when
expanding SSA partitions we run into this case (but with a NULL
previous MEM_EXPR).

> Any comments?
> 
> Thanks,
> Richard.
> 
> 2020-06-03  Richard Biener  
> 
>   PR middle-end/95493
>   * emit-rtl.c (set_mem_attrs_minus_bitpos): Clear
>   offset-known when the ARRAY_REF is variable.  If we end up
>   inherting the MEM_EXPR from the original MEM_ATTRS then also
>   inherit the corresponding MEM_ALIAS_SET.  Avoid altering
>   MEM_KEEP_ALIAS_SET_P in that case.
>   * expr.c (expand_assignment): Preserve MEM_ALIAS_SET when emitting
>   the store.
> 
>   * g++.dg/torture/pr95493.C: New testcase.
> ---
>  gcc/emit-rtl.c | 59 ++--
>  gcc/expr.c |  6 ++-
>  gcc/testsuite/g++.dg/torture/pr95493.C | 62 ++
>  3 files changed, 103 insertions(+), 24 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/torture/pr95493.C
> 
> diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
> index 2b790636366..09521b6f85e 100644
> --- a/gcc/emit-rtl.c
> +++ b/gcc/emit-rtl.c
> @@ -1964,8 +1964,9 @@ set_mem_attributes_minus_bitpos (rtx ref, tree t, int 
> objectp,
>refattrs = MEM_ATTRS (ref);
>if (refattrs)
>  {
> -  /* ??? Can this ever happen?  Calling this routine on a MEM that
> -  already carries memory attributes should probably be invalid.  */
> +  gcc_assert (! TYPE_P (t));
> +  /* We are calling this function with attributes set up from expanding
> +  the base of a memory reference as returned by get_inner_referece.  */
>attrs.expr = refattrs->expr;
>attrs.offset_known_p = refattrs->offset_known_p;
>attrs.offset = refattrs->offset;
> @@ -2052,11 +2053,6 @@ set_mem_attributes_minus_bitpos (rtx ref, tree t, int 
> objectp,
>   as = TYPE_ADDR_SPACE (TREE_TYPE (base));
>   }
>  
> -  /* If this expression uses it's parent's alias set, mark it such
> -  that we won't change it.  */
> -  if (component_uses_parent_alias_set_from (t) != NULL_TREE)
> - MEM_KEEP_ALIAS_SET_P (ref) = 1;
> -
>/* If this is a decl, set the attributes of the MEM from it.  */
>if (DECL_P (t))
>   {
> @@ -2114,6 +2110,9 @@ set_mem_attributes_minus_bitpos (rtx ref, tree t, int 
> objectp,
>   }
> while (TREE_CODE (t2) == ARRAY_REF);
>  
> +   attrs.offset_known_p = false;
> +   attrs.offset = 0;
> +   apply_bitpos = 0;
> if (DECL_P (t2)
> || (TREE_CODE (t2) == COMPONENT_REF
> /* For trailing arrays t2 doesn't have a size that
> @@ -2141,24 +2140,38 @@ set_mem_attributes_minus_bitpos (rtx ref, tree t, int 
> objectp,
> apply_bitpos = bitpos;
>   }
>  
> -  /* If this is a reference based on a partitioned decl replace the
> -  base with a MEM_REF of the pointer representative we created
> -  during stack slot partitioning.  */
> -  if (attrs.expr
> -   && VAR_P (base)
> -   && ! is_global_var (base)
> -   && cfun->gimple_df->decls_to_pointers != NULL)
> +  /* If we chose to keep the MEM_EXPR from the inner reference
> +  expansion make sure to also keep the corresponding alias-set
> +  since otherwise path-based disambiguation is confused.  */
> +  if (refattrs
> +   && attrs.expr == refattrs->expr)
> + attrs.alias = refattrs->alias;
> +  else if (attrs.expr)
>   {
> -   tree *namep = cfun->gimple_df->decls_to_pointers->get (base);
> -   if (namep)
> +   /* If this expression uses it's parent's alias set, mark it such
> +  that we won't cha

Re: arm: Fix vfp_operand_register for VFP HI regs

2020-06-03 Thread Christophe Lyon via Gcc-patches
Hi,


On Thu, 14 May 2020 at 17:08, Christophe Lyon
 wrote:
>
> On Fri, 1 May 2020 at 12:57, Kyrylo Tkachov  wrote:
> >
> >
> >
> > > -Original Message-
> > > From: Christophe Lyon 
> > > Sent: 30 April 2020 09:51
> > > To: Kyrylo Tkachov 
> > > Cc: gcc-patches@gcc.gnu.org
> > > Subject: Re: arm: Fix vfp_operand_register for VFP HI regs
> > >
> > > On Wed, 29 Apr 2020 at 18:40, Kyrylo Tkachov 
> > > wrote:
> > > >
> > > > Hi Christophe,
> > > >
> > > > > -Original Message-
> > > > > From: Gcc-patches  On Behalf Of
> > > > > Christophe Lyon via Gcc-patches
> > > > > Sent: 29 April 2020 16:53
> > > > > To: gcc Patches 
> > > > > Subject: arm: Fix vfp_operand_register for VFP HI regs
> > > > >
> > > > > Hi,
> > > > >
> > > > > While looking at PR target/94743 I noticed an ICE when I tried to save
> > > > > all the FP registers: this was because all HI registers wouldn't match
> > > > > vfp_register_operand.
> > > >
> > > > Hmm, I see that arm_regno_class indeed never returns VFP_REGS and
> > > would return VFP_HI_REGS here.
> > > > So the patch looks correct to me.
> > > > Do you have a testcase for the ICE to add to the testsuite?
> > > >
> > >
> > > No C source code: I found that while extending the list of registers
> > > pushed in the prologue of an IRQ handler, more-or-less modifying
> > > arm_save_coproc_regs so that more registers are handled by
> > > vfp_emit_fstmd.
> > > The problem occurs when trying to push d16-d31.
> >
> > I'd be comfortable taking this now for trunk (GCC 11) so it has time to 
> > bake.
> > Once you're ready to post the IRQ handler work we can see about backporting 
> > this fix it to the branch, if we deem it necessary.
> > Thanks,
> > Kyrill
> >
>
> Hi,
>
> I've just sent several patches for PR 94743:
> https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545747.html
> is v2 of my previous patch: it only emits a warning, and might be
> sufficient to close the PR, if we decide that
> we don't want to save the FP registers without explicit user request...
>
> Then,
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545748.html
> [2] https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545749.html
> are refactorization patches that would make implementing the patch
> attached here easier.
>
> If you apply the attached on top of [1] and [2], you'll notice an ICE
> fixed by the original patch of this thread.
>
> So hopefully applying [1], [2] and the attached should help convince you that
> the vfp_operand_register is OK.
>

I've committed [1] and [2] several days ago, so it's now easier to apply
the patch from
https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545754.html
which will trigger an ICE in the new testcases it adds,
which is fixed by the patch that started this thread:
https://gcc.gnu.org/pipermail/gcc-patches/2020-April/544877.html

Is that lalter patch OK?

Thanks,

Christophe

> Thanks
>
> > >
> > >
> > > > Thanks,
> > > > Kyrill
> > > >
> > > > >
> > > > > Regression-tested and bootstrapped OK.
> > > > >
> > > > > 2020-04-29  Christophe Lyon  
> > > > >
> > > > > gcc/
> > > > > * config/arm/predicates.md (vfp_register_operand): Use
> > > VFP_HI_REGS
> > > > > instead of VFP_REGS.
> > > > >
> > > > > OK?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Christophe


Re: [stage1][PATCH] Lower VEC_COND_EXPR into internal functions.

2020-06-03 Thread Richard Sandiford
Richard Biener  writes:
> On Tue, Jun 2, 2020 at 5:00 PM Martin Liška  wrote:
>>
>> On 6/2/20 1:09 PM, Richard Biener wrote:
>> > So please be constructive.  Like, provide a testcase that ICEs
>> > with the FAILs replaced by gcc_unreachable ().  Martin, may I suggest
>> > to do this replacement and bootstrap/test?  I think it would be nice
>> > to have testsuite coverage for the FAILs, and maybe we have that
>> > already.
>>
>> Hello.
>>
>> There's the suggested patch that survives bootstrap on ppc64le-linux-gnu
>> and passes test-suite.
>
> OK, so can you please re-post the version of the VEC_COND_EXPR
> patch that uses a regular IFN (without the static non-FAIL checking)
> in a new thread?  If there's no OK from rs6000 maintainers to remove
> the FAILs then we'll go ahead with that version, unless Richard objects
> here.

Well, it seems unfortunate to have to do that.

I think Martin's powerpc patch is the correct one.  But assuming that
the powerpc maintainers still object, I guess the options are:

- Find enough global reviewers who are prepared to approve that patch,
  to override the powerpc maintainers.

- Avoid conflict by going with the regular IFN patch.  To be clear,
  this will ICE in exactly the same cases that Martin's powerpc patch
  does (and current master does), so there's no real benefit to the
  powerpc port from doing this.  It just makes the code more complicated
  and means that other ports don't benefit from the static checking.

In the circumstances, I agree the second is probably the most practical
way forward.

I can't help but think this is a process failure though.  I don't think
using regular IFNs has any technical merits, and it doesn't give Segher
what he wants either (i.e. code that copes with failing vconds).

Thanks,
Richard


[PATCH] tree-optimization/95487 - use a truth type for scatter masks

2020-06-03 Thread Richard Biener


I am testing the following patch on x86_64-unknown-linux-gnu.  Test
coverage for autovectorization of scatters seems rare.

Richard.


This makes sure to get a truth type for scatter masks even when they
are invariant.

2020-06-03  Richard Biener  

PR tree-optimization/95487
* tree-vect-stmts.c (vectorizable_store): Use a truth type
for the scatter mask.

* g++.dg/vect/pr95487.cc: New testcase.
---
 gcc/testsuite/g++.dg/vect/pr95487.cc | 20 
 gcc/tree-vect-stmts.c|  8 ++--
 2 files changed, 26 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/vect/pr95487.cc

diff --git a/gcc/testsuite/g++.dg/vect/pr95487.cc 
b/gcc/testsuite/g++.dg/vect/pr95487.cc
new file mode 100644
index 000..34023d25572
--- /dev/null
+++ b/gcc/testsuite/g++.dg/vect/pr95487.cc
@@ -0,0 +1,20 @@
+// { dg-do compile }
+// { dg-additional-options "-O3" }
+// { dg-additional-options "-march=skylake-avx512" { target x86_64-*-* 
i?86-*-* } }
+
+int a;
+bool d;
+char e;
+extern short f[];
+extern int g[];
+short j;
+void h() {
+  for (short b = j; b < 0; b += 2) {
+f[b] = 0;
+if (d) {
+  for (char c = 0; c < a; c += 3)
+e = 0;
+  g[b] = 0;
+}
+  }
+}
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 10f2e394b1e..26121dfa79c 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -7668,8 +7668,12 @@ vectorizable_store (vec_info *vinfo,
  gs_info.offset,
  stmt_info);
  if (mask)
-   mask_op = vec_mask = vect_get_vec_def_for_operand (vinfo, mask,
-  stmt_info);
+   {
+ tree mask_vectype = truth_type_for (vectype);
+ mask_op = vec_mask
+   = vect_get_vec_def_for_operand (vinfo, mask,
+   stmt_info, mask_vectype);
+   }
}
  else if (modifier != NONE && (j & 1))
{
-- 
2.26.2


[PATCH] tree-optimization/95495 - use SLP_TREE_REPRESENTATIVE in assertion

2020-06-03 Thread Richard Biener
This fixes a place where I missed to use SLP_TREE_REPRESENTATIVE
after its introduction.

2020-06-03  Richard Biener  

PR tree-optimization/95495
* tree-vect-slp.c (vect_slp_analyze_node_operations): Use
SLP_TREE_REPRESENTATIVE in the shift assertion.

* gcc.dg/vect/pr95495.c: New testcase.
---
 gcc/testsuite/gcc.dg/vect/pr95495.c | 16 
 gcc/tree-vect-slp.c |  2 +-
 2 files changed, 17 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr95495.c

diff --git a/gcc/testsuite/gcc.dg/vect/pr95495.c 
b/gcc/testsuite/gcc.dg/vect/pr95495.c
new file mode 100644
index 000..a961aef59fc
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr95495.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+
+typedef struct {
+  int a;
+  int b;
+} c;
+int d, f, g;
+c e[4];
+void
+h()
+{
+  for (; f; f++) {
+g += e[f].a >> 1 | e[f].a & 1;
+d += e[f].b >> 1 | e[f].b & 1;
+  }
+}
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index f1d61944ae0..cc33b64454c 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -2888,7 +2888,7 @@ vect_slp_analyze_node_operations (vec_info *vinfo, 
slp_tree node,
  /* For shifts with a scalar argument we don't need
 to cost or code-generate anything.
 ???  Represent this more explicitely.  */
- gcc_assert ((STMT_VINFO_TYPE (SLP_TREE_SCALAR_STMTS (node)[0])
+ gcc_assert ((STMT_VINFO_TYPE (SLP_TREE_REPRESENTATIVE (node))
   == shift_vec_info_type)
  && j == 1);
  continue;
-- 
2.25.1


Re: [stage1][PATCH] Lower VEC_COND_EXPR into internal functions.

2020-06-03 Thread David Edelsohn via Gcc-patches
On Wed, Jun 3, 2020 at 9:41 AM Richard Sandiford
 wrote:
>
> Richard Biener  writes:
> > On Tue, Jun 2, 2020 at 5:00 PM Martin Liška  wrote:
> >>
> >> On 6/2/20 1:09 PM, Richard Biener wrote:
> >> > So please be constructive.  Like, provide a testcase that ICEs
> >> > with the FAILs replaced by gcc_unreachable ().  Martin, may I suggest
> >> > to do this replacement and bootstrap/test?  I think it would be nice
> >> > to have testsuite coverage for the FAILs, and maybe we have that
> >> > already.
> >>
> >> Hello.
> >>
> >> There's the suggested patch that survives bootstrap on ppc64le-linux-gnu
> >> and passes test-suite.
> >
> > OK, so can you please re-post the version of the VEC_COND_EXPR
> > patch that uses a regular IFN (without the static non-FAIL checking)
> > in a new thread?  If there's no OK from rs6000 maintainers to remove
> > the FAILs then we'll go ahead with that version, unless Richard objects
> > here.
>
> Well, it seems unfortunate to have to do that.
>
> I think Martin's powerpc patch is the correct one.  But assuming that
> the powerpc maintainers still object, I guess the options are:
>
> - Find enough global reviewers who are prepared to approve that patch,
>   to override the powerpc maintainers.

Luckily GCC Development does not operate this way.

How about (3) help to remove reliance on this incorrect behavior from
the PowerPC port?

I didn't formally check, but if this is 16 years old, then it's from
the original RHES Altivec work.

I don't believe that anyone fundamentally is objecting to "fixing this
correctly".  I don't know the entire history of this discussion, but
my objection is to a fix that breaks a long-time assumption of the
PowerPC port and leaves it as an exercise to the PowerPC maintainers
to fix it.

Thanks, David


Re: [PATCH] Add pattern for pointer-diff on addresses with same base/offset (PR 94234)

2020-06-03 Thread Marc Glisse

On Wed, 3 Jun 2020, Feng Xue OS via Gcc-patches wrote:


Ah, looking at the PR, you decided to perform the operation as unsigned
because that has fewer NOP conversions, which, in that particular testcase
where the offsets are originally unsigned, means we simplify better. But I
would expect it to regress other testcases (in particular if the offsets
were originally signed). Also, changing the second argument of
pointer_plus to be signed, as is supposed to eventually happen, would
break your testcase again.
The old rule might produce overflow result (offset_a = (signed_int_max)UL, 
offset_b = 1UL).


signed_int_max-1 does not overflow. But the point is that pointer_plus / 
pointer_diff are defined in a way that if that subtraction would overflow, 
then one of the pointer_plus or pointed_diff would have been undefined 
already. In particular, you cannot have objects larger than half the 
address space, and pointer_plus/pointer_diff have to remain inside an 
object. Doing the subtraction in a signed type keeps (part of) that 
information.



Additionally, (stype)(offset_a - offset_b) is more compact,


Not if offset_a comes from (utype)a and offset_b from (utype)b with a and 
b signed. Using size_t indices as in the bugzilla testcase is not 
recommended practice. Change it to ssize_t, and we do optimize the 
testcase in CCP1 already.



there might be
further simplification opportunities on offset_a - offset_b, even it is not
in form of (A * C - B * C), for example (~A - 1 -> -A). But for old rule, we 
have
to introduce another rule as (T)A - (T)(B) -> (T)(A - B), which seems to
be too generic to benefit performance in all situations.


Sadly, conversions complicate optimizations and are all over the place, we 
need to handle them in more places. I sometimes dream of getting rid of 
NOP conversions, and having a single PLUS_EXPR with some kind of flag 
saying if it can wrap/saturate/trap when seen as a signed/unsigned 
operation, i.e. push the information on the operations instead of objects.



If the 2nd argument is signed, we can add a specific rule as your suggestion
(T)(A * C) - (T)(B * C) -> (T) (A - B) * C.


At the very least we want to keep a comment next to the transformation
explaining the situation.



If there are platforms where the second argument of pointer_plus is a
smaller type than the result of pointer_diff (can this happen? I keep
forgetting all the weird things some platforms do), this version may do an
unsafe zero-extension.

If the 2nd argument is a smaller type, this might bring confuse semantic to
pointer_plus operator. Suppose the type is a (unsigned) char, the expression
"ptr + ((char) -1)" represents ptr + 255 or ptr - 1?


(pointer_plus ptr 255) would mean ptr - 1 on a platform where the second 
argument of pointer_plus has size 1 byte.



Do note that I am not a reviewer, what I say isn't final.

--
Marc Glisse


Re: [stage1][PATCH] Lower VEC_COND_EXPR into internal functions.

2020-06-03 Thread Richard Biener via Gcc-patches
On Wed, Jun 3, 2020 at 4:17 PM David Edelsohn  wrote:
>
> On Wed, Jun 3, 2020 at 9:41 AM Richard Sandiford
>  wrote:
> >
> > Richard Biener  writes:
> > > On Tue, Jun 2, 2020 at 5:00 PM Martin Liška  wrote:
> > >>
> > >> On 6/2/20 1:09 PM, Richard Biener wrote:
> > >> > So please be constructive.  Like, provide a testcase that ICEs
> > >> > with the FAILs replaced by gcc_unreachable ().  Martin, may I suggest
> > >> > to do this replacement and bootstrap/test?  I think it would be nice
> > >> > to have testsuite coverage for the FAILs, and maybe we have that
> > >> > already.
> > >>
> > >> Hello.
> > >>
> > >> There's the suggested patch that survives bootstrap on ppc64le-linux-gnu
> > >> and passes test-suite.
> > >
> > > OK, so can you please re-post the version of the VEC_COND_EXPR
> > > patch that uses a regular IFN (without the static non-FAIL checking)
> > > in a new thread?  If there's no OK from rs6000 maintainers to remove
> > > the FAILs then we'll go ahead with that version, unless Richard objects
> > > here.
> >
> > Well, it seems unfortunate to have to do that.
> >
> > I think Martin's powerpc patch is the correct one.  But assuming that
> > the powerpc maintainers still object, I guess the options are:
> >
> > - Find enough global reviewers who are prepared to approve that patch,
> >   to override the powerpc maintainers.
>
> Luckily GCC Development does not operate this way.
>
> How about (3) help to remove reliance on this incorrect behavior from
> the PowerPC port?
>
> I didn't formally check, but if this is 16 years old, then it's from
> the original RHES Altivec work.
>
> I don't believe that anyone fundamentally is objecting to "fixing this
> correctly".  I don't know the entire history of this discussion, but
> my objection is to a fix that breaks a long-time assumption of the
> PowerPC port and leaves it as an exercise to the PowerPC maintainers
> to fix it.

I _think_ there's nothing to fix besides removing the FAIL.  And I would
have no idea how to "fix" the powerpc port here since a) we lack a testcase
that actually FAILs, b) I'm not familiar with the ISA.  So we did (3) by
replacing the FAILs with gcc_unreachable () and bootstrap/regtest this
without any regression which I think "proves" the failure modes do not
actually exist.

So I'm not sure how we can help.

There's the option to remove the vcond_* patterns entirely which makes
us fall back to the vec_cmp_* ones which do never FAIL on powerpc.
But I suspect this might regress code generation.

It's suspicious that vec_cmp_* never FAIL when vcond_* do btw. - see
the case I pointed out where it appears to have inconsistencies/wrong-code
issues regarding ordered compare support.  But see above (I do not know
the powerpc ISA).

A vcond can usually be emulated by vec_cmp plus masking.  So if
we ever get a testcase that runs into the gcc_unreachable () I'll promise
to fix it up using this strategy in the vcond expander.  But without a
testcase and powerpc ISA knowledge it's really hard.  Or do you want
us to stick the vec_cmp expansion fallback in place of the FAILs?
I'm sure the powerpc maintainers are better suited to do that even though
I'll probably manage with some cut&paste.  To recap: vcond is
equal to

  mask = vec_cmp of the comparison
  true_masked = true_op & mask;
  false_masked = false_op & ~mask;
  result = true_masked | false_masked;

but I believe this would be dead code never triggered.

Richard.

>
> Thanks, David


Re: [PATCH] libgccjit: Add new gcc_jit_context_new_blob entry point

2020-06-03 Thread David Malcolm via Gcc-patches
On Wed, 2020-06-03 at 12:11 +0200, Andrea Corallo wrote:
> Hi all,
> 
> I'd like to submit this patch to extend libgccjit to allow for
> storing
> binary blobs (equivalent to initialized char arrays).
> 
> The use-case is when libgccjit is used as a back-end for dynamic
> programming languages.  In this case is often necessary to store
> serialized objects into the compilation unit as support to the
> generated code.
> 
> The proposed entry point is the following:
> 
> gcc_jit_lvalue *
> gcc_jit_context_new_blob (gcc_jit_context *ctxt,
> gcc_jit_location *loc,
> enum gcc_jit_global_kind kind,
> const void *ptr,
> size_t size,
> const char *name);
> 
> This create a special kind of libgccjit global that will be
> initialized
> with the memory content pointed by 'ptr'.
> 
> I've added a testcase and the regression is clean.  I've also tested
> it with the libgccjit based Emacs disabling the current workaround we
> have
> for this.
> 
> Feedback is very welcome.

Thanks for the patch.

Is this entrypoint only needed for the ahead-of-time use case?  If the
client code is instead going to do an in-memory compilation, then I
believe it can simply build the buffer of data in memory and expose it
to the jitted code via gcc_jit_context_new_rvalue_from_ptr.

It occurred to me that the entrypoint is combining two things:
- creating a global char[]
- creating an initializer for that global

which got me wondering if we should instead have a way to add
initializers for globals.

My first thought was something like:

gcc_jit_context_new_global_with_initializer

which would be like gcc_jit_context_new_global but would have an
additional gcc_jit_rvalue *init_value param?
The global would have to be of kind GCC_JIT_GLOBAL_EXPORTED or
GCC_JIT_GLOBAL_INTERNAL, not GCC_JIT_GLOBAL_IMPORTED.

Alternatively, maybe it would be better to have 

gcc_jit_global_set_initializer (gcc_jit_lvalue *global,
gcc_jit_rvalue *init_val);

to make the API more flexible.

But even if we had this, we'd still need some way to create the rvalue
for that initial value.  Also, maybe there ought to be a distinction
between rvalues that can vary at runtime vs those that can be computed
at compile-time (and are thus suitable for use in static
initialization).

I suspect you may have gone through the same thought process and come
up with a simpler approach.   (I'm mostly just "thinking out loud"
here, sorry if it isn't very coherent).

Should it be a "const char[]" rather than just a "char[]"?

One specific nit about the patch: in compatibility.rst, there should be
a:

.. _LIBGCCJIT_ABI_14:

label before the heading.

Hope this is constructive
Dave


>   Andrea
> 
> gcc/jit/ChangeLog
> 
> 2020-06-02  Andrea Corallo  
> 
>   * docs/topics/compatibility.rst (LIBGCCJIT_ABI_14): New ABI
> tag.
> 
>   * docs/topics/expressions.rst (Binary Blobs): New section
>   documenting gcc_jit_context_new_blob.
> 
>   * jit-common.h: Document class blob.
> 
>   * jit-playback.c (playback::context::global_new_decl)
>   (playback::context::global_finalize_lvalue): New methods.
>   (playback::context::new_global): Make use of global_new_decl,
>   global_finalize_lvalue.
>   (playback::context::new_blob): New method.
> 
>   * jit-playback.h (new_blob, global_new_decl): New method
>   declarations.
> 
>   * jit-recording.c (recording::context::new_blob)
>   (recording::blob::replay_into)
>   (recording::global::write_qualifier_to_dump): New methods.
>   (recording::global::write_to_dump): Use
> write_qualifier_to_dump.
>   (recording::blob::write_to_dump): New method.
> 
>   * jit-recording.h (class blob): New class.
>   (class global): Have m_kind and m_name as protected.
>   (class global): Remove FINAL qualifier to replay_into and
>   write_to_dump.
>   (class global): New method write_qualifier_to_dump decl.
>   (class context): New method new_blob decl.
> 
>   * libgccjit++.h (context::new_blob): New method.
> 
>   * libgccjit.c (gcc_jit_lvalue_as_rvalue): New function.
> 
>   * libgccjit.h (gcc_jit_context_new_blob): New function
>   declaration.
>   (LIBGCCJIT_HAVE_gcc_jit_context_new_blob): New macro.
> 
>   * libgccjit.map (LIBGCCJIT_ABI_14): New ABI tag.
> 
> gcc/testsuite/ChangeLog
> 
> 2020-06-02  Andrea Corallo  
> 
>   * jit.dg/all-non-failing-tests.h: Add test-blob.c.
> 
>   * jit.dg/test-blob.c: New test.
> 



[committed] jit: implement BT_CONST_VOLATILE_PTR [PR 95306]

2020-06-03 Thread David Malcolm via Gcc-patches
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to master as r11-877-g6d36cc21b69d952967a4df6653954f572a101796.

gcc/jit/ChangeLog:
PR jit/95306
* jit-builtins.c (builtins_manager::make_primitive_type):
Implement BT_CONST_VOLATILE_PTR.

gcc/testsuite/ChangeLog:
PR jit/95306
* jit.dg/test-pr95306-builtin-types.c (create_code): Add
test of getting __atomic_load.
---
 gcc/jit/jit-builtins.c| 4 +++-
 gcc/testsuite/jit.dg/test-pr95306-builtin-types.c | 1 +
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/gcc/jit/jit-builtins.c b/gcc/jit/jit-builtins.c
index 4842ff3ea96..56a79b4e249 100644
--- a/gcc/jit/jit-builtins.c
+++ b/gcc/jit/jit-builtins.c
@@ -513,7 +513,9 @@ builtins_manager::make_primitive_type (enum 
jit_builtin_type type_id)
 case BT_VOLATILE_PTR:
   return (m_ctxt->get_type (GCC_JIT_TYPE_VOID)->get_volatile ()
  ->get_pointer ());
-// case BT_CONST_VOLATILE_PTR:
+case BT_CONST_VOLATILE_PTR:
+  return (m_ctxt->get_type (GCC_JIT_TYPE_VOID)->get_const ()
+ ->get_volatile ()->get_pointer ());
 // case BT_PTRMODE:
 case BT_INT_PTR:
   return m_ctxt->get_type (GCC_JIT_TYPE_INT)->get_pointer ();
diff --git a/gcc/testsuite/jit.dg/test-pr95306-builtin-types.c 
b/gcc/testsuite/jit.dg/test-pr95306-builtin-types.c
index 6c3b087ab98..960163ed7d4 100644
--- a/gcc/testsuite/jit.dg/test-pr95306-builtin-types.c
+++ b/gcc/testsuite/jit.dg/test-pr95306-builtin-types.c
@@ -8,6 +8,7 @@ create_code (gcc_jit_context *ctxt, void *user_data)
 #define CHECK_BUILTIN(NAME) \
   CHECK_NON_NULL (gcc_jit_context_get_builtin_function (ctxt, NAME));
   
+  CHECK_BUILTIN ("__atomic_load");
   CHECK_BUILTIN ("__builtin_memcpy");
   CHECK_BUILTIN ("__builtin_sadd_overflow");
 
-- 
2.26.2



Re: [PATCH 2/7] [OpenACC] Adjust dynamic reference count semantics

2020-06-03 Thread Thomas Schwinge
Hi Julian!

On 2020-06-03T14:36:14+0200, I wrote:
> On 2020-05-22T15:16:05-0700, Julian Brown  wrote:
>> This patch adjusts the semantics of dynamic reference counts, as described
>> in the parent email.
>
> Thanks!
>
> A few questions, but no need to send an updated patch.
>
>> --- a/libgomp/oacc-mem.c
>> +++ b/libgomp/oacc-mem.c
>
>> @@ -1018,13 +1036,102 @@ goacc_enter_data_internal (struct gomp_device_descr 
>> *acc_dev, size_t mapnum,
>>  {
>>for (size_t i = 0; i < mapnum; i++)
>>  {
>> -  int group_last = find_group_last (i, mapnum, sizes, kinds);
>> +  splay_tree_key n;
>> +  size_t group_last = find_group_last (i, mapnum, sizes, kinds);
>> +  bool struct_p = false;
>> +  size_t size, groupnum = (group_last - i) + 1;
>>
>> -  gomp_map_vars_async (acc_dev, aq,
>> -   (group_last - i) + 1,
>> -   &hostaddrs[i], NULL,
>> -   &sizes[i], &kinds[i], true,
>> -   GOMP_MAP_VARS_OPENACC_ENTER_DATA);
>> +  switch (kinds[i] & 0xff)
>> +{
>> +case GOMP_MAP_STRUCT:
>> +  {
>> +int last = i + sizes[i];
>
> The 'last' calculated here must always equal the 'group_last' calculated
> above.  ;-) (... so we might just use 'group_last' instead of 'last' in
> the following.)
>
>> +size = (uintptr_t) hostaddrs[last] + sizes[last]
>> +   - (uintptr_t) hostaddrs[i];
>> +struct_p = true;
>> +  }
>> +  break;
>> +
>> +case GOMP_MAP_ATTACH:
>> +  size = sizeof (void *);
>> +  break;
>> +
>> +default:
>> +  size = sizes[i];
>> +}
>> +
>> +  n = lookup_host (acc_dev, hostaddrs[i], size);
>> +
>
>> +  if (n && struct_p)
>> +{
>> +  if (n->refcount != REFCOUNT_INFINITY)
>> +n->refcount += groupnum - 1;
>> +  n->dynamic_refcount += groupnum - 1;
>> +  gomp_mutex_unlock (&acc_dev->lock);
>> +}
>
> Is the 'GOMP_MAP_STRUCT' handling here specifically necessary, or is that
> just an optimization of the 'n && groupnum > 1' case below?

Eh, OK, I think I see where this is going; the 'n && groupnum > 1' case
below might not necessarily take care of the 'groupnum - 1' refcounts
that we're filing here?

>> +  else if (n && groupnum == 1)
>> +{
>> +  void *h = hostaddrs[i];
>> +  size_t s = sizes[i];
>> +
>> +  /* A standalone attach clause.  */
>> +  if ((kinds[i] & 0xff) == GOMP_MAP_ATTACH)
>> +gomp_attach_pointer (acc_dev, aq, &acc_dev->mem_map, n,
>> + (uintptr_t) h, s, NULL);
>> +  else if (h + s > (void *) n->host_end)
>> +{
>> +  gomp_mutex_unlock (&acc_dev->lock);
>> +  gomp_fatal ("[%p,+%d] not mapped", (void *)h, (int)s);
>> +}
>> +
>> +  assert (n->refcount != REFCOUNT_LINK);
>> +  if (n->refcount != REFCOUNT_INFINITY)
>> +n->refcount++;
>> +  n->dynamic_refcount++;
>> +
>> +  gomp_mutex_unlock (&acc_dev->lock);
>> +}
>
>> +  else if (n && groupnum > 1)
>> +{
>> +  assert (n->refcount != REFCOUNT_INFINITY
>> +  && n->refcount != REFCOUNT_LINK);
>> +
>> +  bool processed = false;
>> +
>> +  struct target_mem_desc *tgt = n->tgt;
>> +  for (size_t j = 0; j < tgt->list_count; j++)
>> +if (tgt->list[j].key == n)
>> +  {
>> +for (size_t k = 0; k < groupnum; k++)
>> +  if (j + k < tgt->list_count && tgt->list[j + k].key)
>> +{
>> +  tgt->list[j + k].key->refcount++;
>> +  tgt->list[j + k].key->dynamic_refcount++;
>> +}
>> +processed = true;
>> +  }
>> +
>> +  gomp_mutex_unlock (&acc_dev->lock);
>> +  if (!processed)
>> +gomp_fatal ("dynamic refcount incrementing failed for "
>> +"pointer/pset");
>> +}
>
> Please add some text to explain the nested 'j', 'k' loops and their 'if'
> conditionals, and the 'groupnum' usage in the 'k' loop boundary.  Should
> the 'k' loop maybe run 'for (size_t k = j; k < tgt->list_count; ++k)'
> (..., or is 'groupnum' relevant?), and in the loop body then use 'k'
> instead of 'j + k'?  (Maybe I've now confused myself, staring at this for
> a while...)

Audacious as I am sometimes, I did put a '__builtin_abort' right after
'tgt->list[j].key == n' -- and it doesn't trigger one single time for the
current libgomp test cases, meaning this is all dead code?  I'm confused.

>> +  else if (hostaddrs[i])
>> +{
>> +  gomp_mutex_unlock (&acc_dev->lock);
>> +
>> +  struct target_mem_desc *tgt
>> += gomp_map_vars_async (acc_dev, aq, groupnum, &hostaddrs[i], NULL,
>> +   &sizes[i], &kinds[i], true,
>> +   GOMP_MAP_VARS_ENTER_DATA);
>> +  assert (tgt);
>> +  for (size_t j = 0; j < tgt->list_count; j++)
>> +{
>> +  n = tgt->list[j].key;
>> +  if (n)
>> +n->

Re: [PATCH] libgccjit: Add new gcc_jit_context_new_blob entry point

2020-06-03 Thread Andrea Corallo
David Malcolm  writes:

> Thanks for the patch.
>
> Is this entrypoint only needed for the ahead-of-time use case?  If the
> client code is instead going to do an in-memory compilation, then I
> believe it can simply build the buffer of data in memory and expose it
> to the jitted code via gcc_jit_context_new_rvalue_from_ptr.

Hi Dave,

correct, if you can leak pointers into the generated code this entry
point is likely to be unnecessary.

> It occurred to me that the entrypoint is combining two things:
> - creating a global char[]
> - creating an initializer for that global
>
> which got me wondering if we should instead have a way to add
> initializers for globals.
>
> My first thought was something like:
>
> gcc_jit_context_new_global_with_initializer
>
> which would be like gcc_jit_context_new_global but would have an
> additional gcc_jit_rvalue *init_value param?
> The global would have to be of kind GCC_JIT_GLOBAL_EXPORTED or
> GCC_JIT_GLOBAL_INTERNAL, not GCC_JIT_GLOBAL_IMPORTED.
>
> Alternatively, maybe it would be better to have
>
> gcc_jit_global_set_initializer (gcc_jit_lvalue *global,
>   gcc_jit_rvalue *init_val);
>
> to make the API more flexible.
>
> But even if we had this, we'd still need some way to create the rvalue
> for that initial value.  Also, maybe there ought to be a distinction
> between rvalues that can vary at runtime vs those that can be computed
> at compile-time (and are thus suitable for use in static
> initialization).
>
> I suspect you may have gone through the same thought process and come
> up with a simpler approach.   (I'm mostly just "thinking out loud"
> here, sorry if it isn't very coherent).

Yes I had kind of similar thoughs.

Ideally would be good to have a generic solution, the complication is
that as you mentioned not every rvalue is suitable for initializing
every global, but rather the opposite.  My fear was that the space to be
covered would be non trivial for a robust solution in this case.

Also I believe we currently have no way to express in libgccjit rvalues
an array with some content, so how to use this as initializer?  Perhaps
also we should have a new type gcc_jit_initializer?

On the other hand I thought that for simple things like integers the
problem is tipically already solved with an assignment in some init code
(infact I think so far nobody complained) while the real standing
limitation is for blobs (perhaps I'm wrong).  And in the end if you can
stuff some data in, you can use it later for any scope.

Another "hybrid" solution would be to have specific entry point for each
type of the subset we want to allow for static initialization.  This way
we still control the creation of the initializer internally so it's
safe.  In this view this blob entry point would be just one of these
(probably with a different name like
'gcc_jit_context_new_glob_init_char_array').

> Should it be a "const char[]" rather than just a "char[]"?

Good question, I believe depends on the usecase so I left out the const.
Perhaps should be a parameter of the entry point.

> One specific nit about the patch: in compatibility.rst, there should be
> a:
>
> .. _LIBGCCJIT_ABI_14:
>
> label before the heading.

Ops

> Hope this is constructive
> Dave

Sure it is.

Thanks

  Andrea


Re: [stage1][PATCH] Lower VEC_COND_EXPR into internal functions.

2020-06-03 Thread Segher Boessenkool
Hi!

On Wed, Jun 03, 2020 at 04:46:12PM +0200, Richard Biener wrote:
> On Wed, Jun 3, 2020 at 4:17 PM David Edelsohn  wrote:
> > On Wed, Jun 3, 2020 at 9:41 AM Richard Sandiford
> >  wrote:
> > > Well, it seems unfortunate to have to do that.
> > >
> > > I think Martin's powerpc patch is the correct one.

It is papering over the issues a little -- the same assumption is made
at lower levels as well, so all *that* needs to be changed as well (not
"fixed", it is not a bug, we have a change in the vcond* interface here;
oh and that should be documented as well).

> > How about (3) help to remove reliance on this incorrect behavior from
> > the PowerPC port?

It is not a reliance on incorrect behaviour.  This is a change.  Which
pretty much everyone seems to want, so fine, but that takes time.

> > I didn't formally check, but if this is 16 years old, then it's from
> > the original RHES Altivec work.

It is, exactly.

> > I don't believe that anyone fundamentally is objecting to "fixing this
> > correctly".  I don't know the entire history of this discussion, but
> > my objection is to a fix that breaks a long-time assumption of the
> > PowerPC port and leaves it as an exercise to the PowerPC maintainers
> > to fix it.

*Exactly*.  This is changing an ancient interface, claiming "it always
was that way" (which very obviously isn't true), and leaving the rs6000
people to deal with all the fallout.  Again.

> I _think_ there's nothing to fix besides removing the FAIL.

All the lower levels need to get asserts as well.  We need a week or so
to put the whole thing through the wringer.  The documentation needs to
be changed by whoever changes the vcond* semantics.  All other ports
should be checked, too.

> And I would
> have no idea how to "fix" the powerpc port here since a) we lack a testcase
> that actually FAILs, b) I'm not familiar with the ISA.  So we did (3) by
> replacing the FAILs with gcc_unreachable () and bootstrap/regtest this
> without any regression which I think "proves" the failure modes do not
> actually exist.

Heh, assuming the testsuite is comprehensive?  Heh.  (Bootstrap doesn't
mean much for vector code).

> So I'm not sure how we can help.

You'll have to document the "vcond* is not allowed to FAIL" change.
We'll deal with the rest.  But testing needs a week or so.  (That is
an extremely short timescale already).

> A vcond can usually be emulated by vec_cmp plus masking.

That would be the generic way to implement this of course, but apparently
such code doesn't yet exist?  If there is a generic implementation it
should be trivial to deal with FAILs.

> So if
> we ever get a testcase that runs into the gcc_unreachable () I'll promise
> to fix it up using this strategy in the vcond expander.  But without a
> testcase and powerpc ISA knowledge it's really hard.  Or do you want
> us to stick the vec_cmp expansion fallback in place of the FAILs?
> I'm sure the powerpc maintainers are better suited to do that even though
> I'll probably manage with some cut&paste.  To recap: vcond is
> equal to
> 
>   mask = vec_cmp of the comparison
>   true_masked = true_op & mask;
>   false_masked = false_op & ~mask;
>   result = true_masked | false_masked;
> 
> but I believe this would be dead code never triggered.

But that would be the generic code as well?  Is that not useful to have
in any case?


Segher


Re: [stage1][PATCH] Lower VEC_COND_EXPR into internal functions.

2020-06-03 Thread Richard Biener via Gcc-patches
On June 3, 2020 7:01:39 PM GMT+02:00, Segher Boessenkool 
 wrote:
>Hi!
>
>On Wed, Jun 03, 2020 at 04:46:12PM +0200, Richard Biener wrote:
>> On Wed, Jun 3, 2020 at 4:17 PM David Edelsohn 
>wrote:
>> > On Wed, Jun 3, 2020 at 9:41 AM Richard Sandiford
>> >  wrote:
>> > > Well, it seems unfortunate to have to do that.
>> > >
>> > > I think Martin's powerpc patch is the correct one.
>
>It is papering over the issues a little -- the same assumption is made
>at lower levels as well, so all *that* needs to be changed as well (not
>"fixed", it is not a bug, we have a change in the vcond* interface
>here;
>oh and that should be documented as well).
>
>> > How about (3) help to remove reliance on this incorrect behavior
>from
>> > the PowerPC port?
>
>It is not a reliance on incorrect behaviour.  This is a change.  Which
>pretty much everyone seems to want, so fine, but that takes time.
>
>> > I didn't formally check, but if this is 16 years old, then it's
>from
>> > the original RHES Altivec work.
>
>It is, exactly.
>
>> > I don't believe that anyone fundamentally is objecting to "fixing
>this
>> > correctly".  I don't know the entire history of this discussion,
>but
>> > my objection is to a fix that breaks a long-time assumption of the
>> > PowerPC port and leaves it as an exercise to the PowerPC
>maintainers
>> > to fix it.
>
>*Exactly*.  This is changing an ancient interface, claiming "it always
>was that way" (which very obviously isn't true), and leaving the rs6000
>people to deal with all the fallout.  Again.
>
>> I _think_ there's nothing to fix besides removing the FAIL.
>
>All the lower levels need to get asserts as well.  We need a week or so
>to put the whole thing through the wringer.  The documentation needs to
>be changed by whoever changes the vcond* semantics.  All other ports
>should be checked, too.
>
>> And I would
>> have no idea how to "fix" the powerpc port here since a) we lack a
>testcase
>> that actually FAILs, b) I'm not familiar with the ISA.  So we did (3)
>by
>> replacing the FAILs with gcc_unreachable () and bootstrap/regtest
>this
>> without any regression which I think "proves" the failure modes do
>not
>> actually exist.
>
>Heh, assuming the testsuite is comprehensive?  Heh.  (Bootstrap doesn't
>mean much for vector code).
>
>> So I'm not sure how we can help.
>
>You'll have to document the "vcond* is not allowed to FAIL" change.
>We'll deal with the rest.  But testing needs a week or so.  (That is
>an extremely short timescale already).
>
>> A vcond can usually be emulated by vec_cmp plus masking.
>
>That would be the generic way to implement this of course, but
>apparently
>such code doesn't yet exist?  If there is a generic implementation it
>should be trivial to deal with FAILs.
>
>> So if
>> we ever get a testcase that runs into the gcc_unreachable () I'll
>promise
>> to fix it up using this strategy in the vcond expander.  But without
>a
>> testcase and powerpc ISA knowledge it's really hard.  Or do you want
>> us to stick the vec_cmp expansion fallback in place of the FAILs?
>> I'm sure the powerpc maintainers are better suited to do that even
>though
>> I'll probably manage with some cut&paste.  To recap: vcond is
>> equal to
>> 
>>   mask = vec_cmp of the comparison
>>   true_masked = true_op & mask;
>>   false_masked = false_op & ~mask;
>>   result = true_masked | false_masked;
>> 
>> but I believe this would be dead code never triggered.
>
>But that would be the generic code as well?  Is that not useful to have
>in any case?

Sure. If you remove the vcond patterns from your port the vectorizer will do 
this transparently for you. So if you do not actually have a more clever way of 
representing this in the ISA there's no point of the vcond patterns. (though I 
think the vec_cmp ones didn't originally exist) 

The point is the vectorizer relies on a optab query for querying backend 
support and power claims vcond support here. If you then FAIL you have lied. 
(not in your interpretation of the pattern docs but in the implementations 
since introduction of vcond named patterns) 

So if you're happy I'll document explicitly that vector named patterns may not 
FAIL. 

Richard. 

>
>Segher



Re: [Patch, fortran] PR fortran/66833,67938,95214 ICE on using assumed rank character array

2020-06-03 Thread Thomas Koenig via Gcc-patches

Hi Jose,

Proposed patch to PRs 66833, 67938 and 95214 ICE(s) on using assumed 
rank character array in different situations.


Reviewed and committed with some trival changes:

It is better not to use STOP codes > 255, so I just
counted them up.

Some changes to the ChangeLog: Mentioned all PRs there and made the
ChangeLog conform to the upload checker (well, it worked the second time
:-)

Here's what I committed as r11-879:

Simple patch only add assumed-rank to the list of possible attributes.

gcc/fortran/ChangeLog:

2020-05-19  José Rui Faustino de Sousa  

PR fortran/95214
PR fortran/66833
PR fortran/67938
* trans-expr.c (gfc_maybe_dereference_var): Add assumed-rank to
character dummy arguments list of possible attributes.

gcc/testsuite/ChangeLog:

2020-05-19  José Rui Faustino de Sousa  

PR fortran/95214
PR fortran/66833
PR fortran/67938
* gfortran.dg/PR95214.f90: New test.

Thanks a lot for the patch!  I notice you still have a couple
of submissions, I'll try to get to them in the next few
days (unless somebody else beats me to a review).

Best regards

Thomas

diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
index 33fc061d89b..435eaeb2c99 100644
--- a/gcc/fortran/trans-expr.c
+++ b/gcc/fortran/trans-expr.c
@@ -2613,7 +2613,8 @@ gfc_maybe_dereference_var (gfc_symbol *sym, tree var, bool descriptor_only_p,
 {
   /* Dereference character pointer dummy arguments
 	 or results.  */
-  if ((sym->attr.pointer || sym->attr.allocatable)
+  if ((sym->attr.pointer || sym->attr.allocatable
+	   || (sym->as && sym->as->type == AS_ASSUMED_RANK))
 	  && (sym->attr.dummy
 	  || sym->attr.function
 	  || sym->attr.result))
diff --git a/gcc/testsuite/gfortran.dg/PR95214.f90 b/gcc/testsuite/gfortran.dg/PR95214.f90
new file mode 100644
index 000..8224767cb67
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/PR95214.f90
@@ -0,0 +1,84 @@
+! { dg-do run }
+!
+! PR fortran/95214
+!
+
+program chr_p
+
+  implicit none
+
+  integer, parameter :: u = 65
+  
+  integer, parameter :: n = 26
+  
+  character :: c(n)
+  integer   :: i
+
+  c = [(achar(i), i=u,u+n-1)]
+  call chr_s(c, c)
+  call gfc_descriptor_c_char(c)
+  call s1(c)
+  call s1s_a(c)
+  call s1s_b(c)
+  call s2(c)
+  stop
+  
+contains
+
+  subroutine chr_s(a, b)
+character, intent(in) :: a(..)
+character, intent(in) :: b(:)
+
+integer :: i
+
+select rank(a)
+rank(1)
+  do i = 1, size(a)
+if(a(i)/=b(i)) stop 1
+  end do
+rank default
+  stop 2
+end select
+return
+  end subroutine chr_s
+
+  ! From Bug 66833
+  ! Contributed by Damian Rouson 
+  subroutine gfc_descriptor_c_char(a)
+character a(..)
+if(rank(a)/=1) stop 3 ! ICE (also for lbound, ubound, and c_loc)
+  end subroutine gfc_descriptor_c_char
+
+
+  ! From Bug 67938
+  ! Contributed by Gerhard Steinmetz 
+  
+  ! example z1.f90
+  subroutine s1(x)
+character(1) :: x(..)
+if(any(lbound(x)/=[1])) stop 4
+if(any(ubound(x)/=[n])) stop 5
+  end subroutine s1
+  
+  ! example z1s.f90
+  subroutine s1s_a(x)
+character :: x(..)
+if(size(x)/=n) stop 6
+  end subroutine s1s_a
+  
+  subroutine s1s_b(x)
+character(77) :: x(..)
+if(size(x)/=n) stop 7
+  end subroutine s1s_b
+  
+  ! example z2.f90
+  subroutine s2(x)
+character(1) :: x(..)
+if(lbound(x, dim=1)/=1) stop 8
+if(ubound(x, dim=1)/=n) stop 9
+if(size(x, dim=1)/=n)   stop 10
+  end subroutine s2
+  
+end program chr_p
+
+


[pushed] c++: Fix VLA in template [PR95232]

2020-06-03 Thread Jason Merrill via Gcc-patches
In a non-template, grokdeclarator notices when we build a pointer to a
variably-modified type, and inserts a TYPE_DECL to make sure that type gets
gimplified.  But in a template we can't always recognize a variably-modified
type, so we need to deal with it at instantiation time.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog:

PR c++/95232
* cp-tree.h (predeclare_vla): Declare.
* cp-gimplify.c (predeclare_vla): Handle getting a decl.
* pt.c (tsubst_expr) [DECL_EXPR]: Use it.

gcc/testsuite/ChangeLog:

PR c++/95232
* g++.dg/ubsan/vla-2.C: New test.
---
 gcc/cp/cp-tree.h   |  1 +
 gcc/cp/cp-gimplify.c   | 19 +++
 gcc/cp/pt.c|  5 +
 gcc/testsuite/g++.dg/ubsan/vla-2.C | 14 ++
 4 files changed, 35 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/ubsan/vla-2.C

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 1775b0a4327..447d1349117 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -7794,6 +7794,7 @@ extern tree cp_fold_maybe_rvalue  (tree, bool);
 extern tree cp_fold_rvalue (tree);
 extern tree cp_fully_fold  (tree);
 extern tree cp_fully_fold_init (tree);
+extern tree predeclare_vla (tree);
 extern void clear_fold_cache   (void);
 extern tree lookup_hotness_attribute   (tree);
 extern tree process_stmt_hotness_attribute (tree, location_t);
diff --git a/gcc/cp/cp-gimplify.c b/gcc/cp/cp-gimplify.c
index 6c29be9d836..e8fbc300fda 100644
--- a/gcc/cp/cp-gimplify.c
+++ b/gcc/cp/cp-gimplify.c
@@ -1185,14 +1185,17 @@ static tree genericize_spaceship (tree expr)
 
 /* If EXPR involves an anonymous VLA type, prepend a DECL_EXPR for that type
to trigger gimplify_type_sizes; otherwise a cast to pointer-to-VLA confuses
-   the middle-end (c++/88256).  */
+   the middle-end (c++/88256).  If EXPR is a DECL, use add_stmt and return
+   NULL_TREE; otherwise return a COMPOUND_STMT of the DECL_EXPR and EXPR.  */
 
-static tree
+tree
 predeclare_vla (tree expr)
 {
   tree type = TREE_TYPE (expr);
   if (type == error_mark_node)
 return expr;
+  if (is_typedef_decl (expr))
+type = DECL_ORIGINAL_TYPE (expr);
 
   /* We need to strip pointers for gimplify_type_sizes.  */
   tree vla = type;
@@ -1209,8 +1212,16 @@ predeclare_vla (tree expr)
   DECL_ARTIFICIAL (decl) = 1;
   TYPE_NAME (vla) = decl;
   tree dexp = build_stmt (input_location, DECL_EXPR, decl);
-  expr = build2 (COMPOUND_EXPR, type, dexp, expr);
-  return expr;
+  if (DECL_P (expr))
+{
+  add_stmt (dexp);
+  return NULL_TREE;
+}
+  else
+{
+  expr = build2 (COMPOUND_EXPR, type, dexp, expr);
+  return expr;
+}
 }
 
 /* Perform any pre-gimplification lowering of C++ front end trees to
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 9c03c5a5bbd..38a771b8e6f 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -17990,6 +17990,11 @@ tsubst_expr (tree t, tree args, tsubst_flags_t 
complain, tree in_decl,
if (ndecl != error_mark_node)
  cp_maybe_mangle_decomp (ndecl, first, cnt);
 
+   /* In a non-template function, VLA type declarations are
+  handled in grokdeclarator; for templates, handle them
+  now.  */
+   predeclare_vla (decl);
+
cp_finish_decl (decl, init, const_init, NULL_TREE,
constinit_p ? LOOKUP_CONSTINIT : 0);
 
diff --git a/gcc/testsuite/g++.dg/ubsan/vla-2.C 
b/gcc/testsuite/g++.dg/ubsan/vla-2.C
new file mode 100644
index 000..342d30bfeac
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ubsan/vla-2.C
@@ -0,0 +1,14 @@
+// PR c++/95232
+// { dg-additional-options "-Wno-vla -ftrapv -fnon-call-exceptions -O 
-fsanitize=undefined" }
+
+template 
+int tmain(T argc) {
+  typedef double (*chunk_t)[argc[0][0]];
+  chunk_t var;
+  (void)var[0][0];
+  return 0;
+}
+
+int main (int argc, char **argv) {
+  return tmain(argv);
+}

base-commit: 8d57c30611b05a89fd265f6c0a74fe829c21cd34
-- 
2.18.1



Re: [stage1][PATCH] Lower VEC_COND_EXPR into internal functions.

2020-06-03 Thread Segher Boessenkool
On Wed, Jun 03, 2020 at 07:23:47PM +0200, Richard Biener wrote:
> >>   mask = vec_cmp of the comparison
> >>   true_masked = true_op & mask;
> >>   false_masked = false_op & ~mask;
> >>   result = true_masked | false_masked;
> >> 
> >> but I believe this would be dead code never triggered.
> >
> >But that would be the generic code as well?  Is that not useful to have
> >in any case?
> 
> Sure. If you remove the vcond patterns from your port the vectorizer will do 
> this transparently for you. So if you do not actually have a more clever way 
> of representing this in the ISA there's no point of the vcond patterns. 
> (though I think the vec_cmp ones didn't originally exist) 

So why can the expander not just do that whenever the patterns FAIL as
well?

> The point is the vectorizer relies on a optab query for querying backend 
> support and power claims vcond support here. If you then FAIL you have lied. 
> (not in your interpretation of the pattern docs but in the implementations 
> since introduction of vcond named patterns) 

Almost all RTL patterns are allowed to FAIL, and that is a very good
thing.  If the vectoriser does not allow that, *it* is buggy.

> So if you're happy I'll document explicitly that vector named patterns may 
> not FAIL. 

That will not work in general at all, no.  Please document it for only
those RTL patterns you need it for (and it is documented per pattern
currently, anyway).


Segher


Re: [stage1][PATCH] Lower VEC_COND_EXPR into internal functions.

2020-06-03 Thread Segher Boessenkool
Hi Martin,

Okay, let's try this out.  Okay for trunk.  Thanks for the work!

On Tue, Jun 02, 2020 at 05:00:56PM +0200, Martin Liška wrote:
> >From 22db04d058c9bbd140041e7aa2caf1613767095a Mon Sep 17 00:00:00 2001
> From: Martin Liska 
> Date: Tue, 2 Jun 2020 15:29:37 +0200
> Subject: [PATCH] rs6000: replace FAIL with gcc_unreachable.

("Replace", and no dot at the end).

>   * config/rs6000/vector.md: Replace FAIL with gcc_unreachable
>   in all vcond* patterns.


Segher


Re: [PATCH] gcc/Makefile.in: move SELFTEST_DEPS before including language makefile fragments

2020-06-03 Thread Jeff Law via Gcc-patches
On Thu, 2020-05-21 at 17:35 +0200, Romain Naour via Gcc-patches wrote:
> As reported by several Buildroot users [1][2][3], the gcc build
> may fail while running selftests makefile target.
> 
> The problem only occurs when ccache is used with gcc 9 and 10,
> probably due to a race condition.
> 
> While debuging with "make -p" we can notice that s-selftest-c target
> contain only "cc1" as dependency instead of cc1 and SELFTEST_DEPS [4].
> 
>   s-selftest-c: cc1
> 
> While the build is failing, the s-selftest-c dependencies recipe is
> still running and reported as a bug by make.
> 
>   "Dependencies recipe running (THIS IS A BUG)."
> 
> A change [5] in gcc 9 seems to introduce the problem since we can't
> reproduce this problem with gcc 8.
> 
> As suggested by Yann E. MORIN [6], move SELFTEST_DEPS before
> including language makefile fragments.
> 
> With the fix applied, the s-seltest-c dependency contains
> SELFTEST_DEPS value.
> 
>   s-selftest-c: cc1 xgcc specs stmp-int-hdrs ../../gcc/testsuite/selftests
> 
> [1] http://lists.busybox.net/pipermail/buildroot/2020-May/282171.html
> [2] http://lists.busybox.net/pipermail/buildroot/2020-May/282766.html
> [3] https://github.com/cirosantilli/linux-kernel-module-cheat/issues/108
> [4] 
> https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/c/Make-lang.in;h=bfae6fd2549c4f728816cd355fa9739dcc08fcde;hb=033eb5671769a4c681a44aad08a454e667e08502#l120
> [5] 
> https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=033eb5671769a4c681a44aad08a454e667e08502
> [6] http://lists.busybox.net/pipermail/buildroot/2020-May/283213.html
> 
> Signed-off-by: Romain Naour 
> Cc: Ben Dakin-Norris 
> Cc: Maxim Kochetkov 
> Cc: Thomas Petazzoni 
> Cc: Yann E. MORIN 
> Cc: Cc: David Malcolm 
> ---
> This patch should be backported to gcc 10 and gcc 9.
> ---
>  gcc/ChangeLog   | 5 +
>  gcc/Makefile.in | 6 --
>  2 files changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 977e7664b62..c3bb18f2afd 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,8 @@
> +2020-05-21  Romain Naour 
> +
> + * Makefile.in: move SELFTEST_DEPS before including language
> + makefile fragments.
THanks.  I've installed this on the trunk.

jeff
> 



Re: [stage1][PATCH] Lower VEC_COND_EXPR into internal functions.

2020-06-03 Thread Richard Biener via Gcc-patches
On June 3, 2020 8:23:14 PM GMT+02:00, Segher Boessenkool 
 wrote:
>On Wed, Jun 03, 2020 at 07:23:47PM +0200, Richard Biener wrote:
>> >>   mask = vec_cmp of the comparison
>> >>   true_masked = true_op & mask;
>> >>   false_masked = false_op & ~mask;
>> >>   result = true_masked | false_masked;
>> >> 
>> >> but I believe this would be dead code never triggered.
>> >
>> >But that would be the generic code as well?  Is that not useful to
>have
>> >in any case?
>> 
>> Sure. If you remove the vcond patterns from your port the vectorizer
>will do this transparently for you. So if you do not actually have a
>more clever way of representing this in the ISA there's no point of the
>vcond patterns. (though I think the vec_cmp ones didn't originally
>exist) 
>
>So why can the expander not just do that whenever the patterns FAIL as
>well?

It could but all the vectorizer costing assumed it goes the 'cheaper' way. So 
this is kind of a sanity check. And what when vec_cmp expansion fails as well? 
Resort to scalar soft FP support as ultimate fallback? That sounds very wrong 
as a auto vectorization result... 

>> The point is the vectorizer relies on a optab query for querying
>backend support and power claims vcond support here. If you then FAIL
>you have lied. (not in your interpretation of the pattern docs but in
>the implementations since introduction of vcond named patterns) 
>
>Almost all RTL patterns are allowed to FAIL, and that is a very good
>thing.  If the vectoriser does not allow that, *it* is buggy.

Your opinion. Please suggest a better way to query target vector capabilities. 

>> So if you're happy I'll document explicitly that vector named
>patterns may not FAIL. 
>
>That will not work in general at all, no.  Please document it for only
>those RTL patterns you need it for (and it is documented per pattern
>currently, anyway).

Sure, will do. But as Richard said, the documented list is the other way 
around. 
At least that was my interpretation. 

All vectorizer queried optabs have this constraint BTW.

Richard. 

>
>Segher



Re: [stage1][PATCH] Lower VEC_COND_EXPR into internal functions.

2020-06-03 Thread David Edelsohn via Gcc-patches
On Wed, Jun 3, 2020 at 2:38 PM Richard Biener
 wrote:
>
> On June 3, 2020 8:23:14 PM GMT+02:00, Segher Boessenkool 
>  wrote:
> >On Wed, Jun 03, 2020 at 07:23:47PM +0200, Richard Biener wrote:
> >> >>   mask = vec_cmp of the comparison
> >> >>   true_masked = true_op & mask;
> >> >>   false_masked = false_op & ~mask;
> >> >>   result = true_masked | false_masked;
> >> >>
> >> >> but I believe this would be dead code never triggered.
> >> >
> >> >But that would be the generic code as well?  Is that not useful to
> >have
> >> >in any case?
> >>
> >> Sure. If you remove the vcond patterns from your port the vectorizer
> >will do this transparently for you. So if you do not actually have a
> >more clever way of representing this in the ISA there's no point of the
> >vcond patterns. (though I think the vec_cmp ones didn't originally
> >exist)
> >
> >So why can the expander not just do that whenever the patterns FAIL as
> >well?
>
> It could but all the vectorizer costing assumed it goes the 'cheaper' way. So 
> this is kind of a sanity check. And what when vec_cmp expansion fails as 
> well? Resort to scalar soft FP support as ultimate fallback? That sounds very 
> wrong as a auto vectorization result...
>
> >> The point is the vectorizer relies on a optab query for querying
> >backend support and power claims vcond support here. If you then FAIL
> >you have lied. (not in your interpretation of the pattern docs but in
> >the implementations since introduction of vcond named patterns)
> >
> >Almost all RTL patterns are allowed to FAIL, and that is a very good
> >thing.  If the vectoriser does not allow that, *it* is buggy.
>
> Your opinion. Please suggest a better way to query target vector capabilities.
>
> >> So if you're happy I'll document explicitly that vector named
> >patterns may not FAIL.
> >
> >That will not work in general at all, no.  Please document it for only
> >those RTL patterns you need it for (and it is documented per pattern
> >currently, anyway).
>
> Sure, will do. But as Richard said, the documented list is the other way 
> around.
> At least that was my interpretation.
>
> All vectorizer queried optabs have this constraint BTW.

Looking at other primary targets, like x86, ARM, AArch64, I agree that
other targets don't FAIL vector patterns in the same manner as scalar
patterns.  The design of Altivec support came with the work from RHES
and no one at the time mentioned that FAILing those named patterns was
incorrect.  As Segher said, allowing named patterns to fail is fairly
standard in GCC and the restriction for vector patterns seems to have
appeared without warning or documentation.  Maybe the vectorizer
started to assume that (because of x86 behavior) but no one announced
this.

Again, I'm not saying that it is an incorrect design or policy, but
that it seems to have been imposed retroactively.  This needs to be
documented. It needs to be applied uniformly throughout the common
parts of the vectorizer and RTL generation. It needs to allow time for
the Power port to adapt. And we need to test this thoroughly to catch
unanticipated fallout.

Thanks, David


Re: [PATCH] c++: Reject some further reinterpret casts in constexpr [PR82304, PR95307]

2020-06-03 Thread Jason Merrill via Gcc-patches

On 6/1/20 6:50 PM, Jakub Jelinek wrote:

On Fri, May 29, 2020 at 01:26:32PM -0400, Jason Merrill via Gcc-patches wrote:

This is a diagnostic quality regression, moving the error message away from
the line where the actual problem is.

Maybe use error_at (loc, ...)?


That works fine, bootstrapped/regtested on x86_64-linux and i686-linux, ok
for trunk?


OK.


2020-06-02  Jakub Jelinek  

PR c++/82304
PR c++/95307
* constexpr.c (cxx_eval_constant_expression): Diagnose CONVERT_EXPR
conversions from pointer types to arithmetic types here...
(cxx_eval_outermost_constant_expr): ... instead of here.

* g++.dg/template/pr79650.C: Expect different diagnostics and expect
it on all lines that do pointer to integer casts.
* g++.dg/cpp1y/constexpr-shift1.C: Expect different diagnostics.
* g++.dg/cpp1y/constexpr-82304.C: New test.
* g++.dg/cpp0x/constexpr-95307.C: New test.

--- gcc/cp/constexpr.c.jj   2020-05-29 23:49:25.479087388 +0200
+++ gcc/cp/constexpr.c  2020-06-01 12:53:30.348337388 +0200
@@ -6210,6 +6210,18 @@ cxx_eval_constant_expression (const cons
if (VOID_TYPE_P (type))
  return void_node;
  
+	if (TREE_CODE (t) == CONVERT_EXPR

+   && ARITHMETIC_TYPE_P (type)
+   && INDIRECT_TYPE_P (TREE_TYPE (op)))
+ {
+   if (!ctx->quiet)
+ error_at (loc,
+   "conversion from pointer type %qT to arithmetic type "
+   "%qT in a constant expression", TREE_TYPE (op), type);
+   *non_constant_p = true;
+   return t;
+ }
+
if (TREE_CODE (op) == PTRMEM_CST && !TYPE_PTRMEM_P (type))
  op = cplus_expand_constant (op);
  
@@ -6811,19 +6823,6 @@ cxx_eval_outermost_constant_expr (tree t

non_constant_p = true;
  }
  
-  /* Technically we should check this for all subexpressions, but that

- runs into problems with our internal representation of pointer
- subtraction and the 5.19 rules are still in flux.  */
-  if (CONVERT_EXPR_CODE_P (TREE_CODE (r))
-  && ARITHMETIC_TYPE_P (TREE_TYPE (r))
-  && TREE_CODE (TREE_OPERAND (r, 0)) == ADDR_EXPR)
-{
-  if (!allow_non_constant)
-   error ("conversion from pointer type %qT "
-  "to arithmetic type %qT in a constant expression",
-  TREE_TYPE (TREE_OPERAND (r, 0)), TREE_TYPE (r));
-  non_constant_p = true;
-}
  
if (!non_constant_p && overflow_p)

  non_constant_p = true;
--- gcc/testsuite/g++.dg/template/pr79650.C.jj  2020-05-29 23:49:19.040183088 
+0200
+++ gcc/testsuite/g++.dg/template/pr79650.C 2020-06-01 12:53:30.348337388 
+0200
@@ -11,10 +11,10 @@ foo ()
static int a, b;
  lab1:
  lab2:
-  A<(intptr_t)&&lab1 - (__INTPTR_TYPE__)&&lab2> c;   // { dg-error "not a 
constant integer" }
-  A<(intptr_t)&&lab1 - (__INTPTR_TYPE__)&&lab1> d;
-  A<(intptr_t)&a - (intptr_t)&b> e;  // { dg-error "is not a 
constant expression" }
-  A<(intptr_t)&a - (intptr_t)&a> f;
-  A<(intptr_t)sizeof(a) + (intptr_t)&a> g;   // { dg-error "not a constant 
integer" }
+  A<(intptr_t)&&lab1 - (__INTPTR_TYPE__)&&lab2> c;   // { dg-error "conversion 
from pointer type" }
+  A<(intptr_t)&&lab1 - (__INTPTR_TYPE__)&&lab1> d;   // { dg-error "conversion 
from pointer type" }
+  A<(intptr_t)&a - (intptr_t)&b> e;  // { dg-error "conversion 
from pointer type" }
+  A<(intptr_t)&a - (intptr_t)&a> f;  // { dg-error "conversion 
from pointer type" }
+  A<(intptr_t)sizeof(a) + (intptr_t)&a> g;   // { dg-error "conversion from 
pointer type" }
A<(intptr_t)&a> h;// { dg-error 
"conversion from pointer type" }
  }
--- gcc/testsuite/g++.dg/cpp1y/constexpr-shift1.C.jj2020-05-29 
23:49:19.036183148 +0200
+++ gcc/testsuite/g++.dg/cpp1y/constexpr-shift1.C   2020-06-01 
13:55:22.607594689 +0200
@@ -3,7 +3,7 @@
  constexpr int p = 1;
  constexpr __PTRDIFF_TYPE__ bar (int a)
  {
-  return ((__PTRDIFF_TYPE__) &p) << a; // { dg-error "is not a constant 
expression" }
+  return ((__PTRDIFF_TYPE__) &p) << a;   // { dg-error "conversion from 
pointer" }
  }
  constexpr __PTRDIFF_TYPE__ r = bar (2); // { dg-message "in .constexpr. expansion 
of" }
-constexpr __PTRDIFF_TYPE__ s = bar (0); // { dg-error "conversion from 
pointer" }
+constexpr __PTRDIFF_TYPE__ s = bar (0);
--- gcc/testsuite/g++.dg/cpp1y/constexpr-82304.C.jj 2020-06-01 
12:53:30.349337373 +0200
+++ gcc/testsuite/g++.dg/cpp1y/constexpr-82304.C2020-06-01 
13:03:40.668227604 +0200
@@ -0,0 +1,14 @@
+// PR c++/82304
+// { dg-do compile { target c++14 } }
+
+typedef __UINTPTR_TYPE__ uintptr_t;
+
+constexpr const char *
+foo (const char *p)
+{
+  auto l = reinterpret_cast(p);   // { dg-error "conversion from 
pointer" }
+  ++l;
+  return reinterpret_cast(l);
+}
+
+constexpr auto s = foo ("Hello");
--- gcc/testsuite/g++.dg/c

Re: [stage1][PATCH] Lower VEC_COND_EXPR into internal functions.

2020-06-03 Thread Segher Boessenkool
On Wed, Jun 03, 2020 at 08:38:04PM +0200, Richard Biener wrote:
> On June 3, 2020 8:23:14 PM GMT+02:00, Segher Boessenkool 
>  wrote:
> >On Wed, Jun 03, 2020 at 07:23:47PM +0200, Richard Biener wrote:
> >> >>   mask = vec_cmp of the comparison
> >> >>   true_masked = true_op & mask;
> >> >>   false_masked = false_op & ~mask;
> >> >>   result = true_masked | false_masked;
> >> >> 
> >> >> but I believe this would be dead code never triggered.
> >> >
> >> >But that would be the generic code as well?  Is that not useful to
> >have
> >> >in any case?
> >> 
> >> Sure. If you remove the vcond patterns from your port the vectorizer
> >will do this transparently for you. So if you do not actually have a
> >more clever way of representing this in the ISA there's no point of the
> >vcond patterns. (though I think the vec_cmp ones didn't originally
> >exist) 
> >
> >So why can the expander not just do that whenever the patterns FAIL as
> >well?
> 
> It could but all the vectorizer costing assumed it goes the 'cheaper' way. So 
> this is kind of a sanity check. And what when vec_cmp expansion fails as 
> well? Resort to scalar soft FP support as ultimate fallback? That sounds very 
> wrong as a auto vectorization result... 

Yeah, but that is specific to the vectoriser, not something that the
RTL expander should have to deal with.  A big impedance mismatch there.

> >> The point is the vectorizer relies on a optab query for querying
> >backend support and power claims vcond support here. If you then FAIL
> >you have lied. (not in your interpretation of the pattern docs but in
> >the implementations since introduction of vcond named patterns) 
> >
> >Almost all RTL patterns are allowed to FAIL, and that is a very good
> >thing.  If the vectoriser does not allow that, *it* is buggy.
> 
> Your opinion. Please suggest a better way to query target vector 
> capabilities. 

Again, that is what RTL does.  If the vectoriser has extra considerations
or requirements, it perhaps should use some more conditions than just
"this pattern exists"?  Maybe this can be done via the optabs, dunno.

I have no good idea how to query things better -- I don't work in the
vectorisers much at all -- but this should not constrain the target
instruction support that much (it makes no good sense to have to have
two separate patterns for things, one for the vectorisers, one for the
builtins and everything else).

Thanks,


Segher


Optimize ODR enum streaming

2020-06-03 Thread Jan Hubicka
Hi,
it turns out that half of the global decl stream of cc1 LTO build consits
TREE_LISTS, identifiers and integer cosntats representing TYPE_VALUES of enums.
Those are streamed only to produce ODR warning and used otherwise, so this
patch moves the info to a separate section that is represented and streamed
more effectively.

This also adds place for more info that may be used for ODR diagnostics
(i.e. at the moment we do not warn when the declarations differs i.e. by the
associated member functions and their types) and the type inheritance graph
rather then poluting the global stream.

I was bit unsure what enums we want to store into the section.  All parsed
enums is probably too expensive, only those enums streamed to represent IL is
bit hard to get, so I went for those seen by free lang data.

As a plus we now get bit more precise warning because also the location of
mismatched enum CONST_DECL is streamed.

It changes:
[WPA] read 4608466 unshared trees
[WPA] read 2942094 mergeable SCCs of average size 1.365328
[WPA] 8625389 tree bodies read in total
[WPA] tree SCC table: size 524287, 247652 elements, collision ratio: 0.383702
[WPA] tree SCC max chain length 2 (size 1)
[WPA] Compared 2694442 SCCs, 228 collisions (0.85)
[WPA] Merged 2694419 SCCs
[WPA] Merged 3731982 tree bodies
[WPA] Merged 65 types
[WPA] 122077 types prevailed (155548 associated trees)
...
[WPA] Compression: 110593119 input bytes, 287696614 uncompressed bytes (ratio: 
2.601397)
[WPA] Size of mmap'd section decls: 85628556 bytes
[WPA] Size of mmap'd section function_body: 13842928 bytes


[WPA] read 1720989 unshared trees
[WPA] read 1252217 mergeable SCCs of average size 1.858507
[WPA] 4048243 tree bodies read in total
[WPA] tree SCC table: size 524287, 226524 elements, collision ratio: 0.491759
[WPA] tree SCC max chain length 2 (size 1)
[WPA] Compared 1025693 SCCs, 196 collisions (0.000191)
[WPA] Merged 1025670 SCCs
[WPA] Merged 2063373 tree bodies
[WPA] Merged 633497 types
[WPA] 122299 types prevailed (155827 associated trees)
...
[WPA] Compression: 103428770 input bytes, 281151423 uncompressed bytes (ratio: 
2.718310)
[WPA] Size of mmap'd section decls: 49390917 bytes
[WPA] Size of mmap'd section function_body: 13858258 bytes
...
[WPA] Size of mmap'd section odr_types: 29054816 bytes

So number of SCCs streamed drops to 38% and the number of unshared trees (that
are bit misnamed since it is mostly integer_cst) to 37%.

Things speeds up correspondingly, but I did not save time report from previous
build.

The enum values are still quite surprisingly large (29MB compressed). 
I may take a look into ways getting it smaller incrementally, but it
streams reasonably fast:

Time variable   usr   sys  wall 
  GGC
 phase opt and generate :  25.20 ( 68%)  10.88 ( 72%)  36.13 ( 69%) 
 868060 kB ( 52%)
 phase stream in:   4.46 ( 12%)   0.90 (  6%)   5.38 ( 10%) 
 790724 kB ( 48%)
 phase stream out   :   6.69 ( 18%)   3.32 ( 22%)  10.03 ( 19%) 
  8 kB (  0%)
 ipa lto gimple in  :   0.79 (  2%)   1.86 ( 12%)   2.39 (  5%) 
 252612 kB ( 15%)
 ipa lto gimple out :   2.48 (  7%)   0.78 (  5%)   3.26 (  6%) 
  0 kB (  0%)
 ipa lto decl in:   1.71 (  5%)   0.46 (  3%)   2.34 (  4%) 
 417883 kB ( 25%)
 ipa lto decl out   :   3.28 (  9%)   0.07 (  0%)   3.27 (  6%) 
  0 kB (  0%)
 whopr wpa I/O  :   0.40 (  1%)   2.24 ( 15%)   2.77 (  5%) 
  8 kB (  0%)
 lto stream decompression   :   1.38 (  4%)   0.31 (  2%)   1.36 (  3%) 
  0 kB (  0%)
 ipa ODR types  :   0.18 (  0%)   0.02 (  0%)   0.25 (  0%) 
  0 kB (  0%)
 ipa inlining heuristics:  11.64 ( 31%)   1.45 ( 10%)  13.12 ( 25%) 
 453160 kB ( 27%)
 ipa pure const :   1.74 (  5%)   0.00 (  0%)   1.76 (  3%) 
  0 kB (  0%)
 ipa icf:   1.72 (  5%)   5.33 ( 35%)   7.06 ( 13%) 
  16593 kB (  1%)
 whopr partitioning :   2.22 (  6%)   0.01 (  0%)   2.23 (  4%) 
   5689 kB (  0%)
 TOTAL  :  37.17 15.20 52.46
1660886 kB

LTO-bootstrapped/regtested x86_64-linux, will comit it shortly.

gcc/ChangeLog:

2020-06-03  Jan Hubicka  

* ipa-devirt.c: Include data-streamer.h, lto-streamer.h and
streamer-hooks.h.
(odr_enums): New static var.
(struct odr_enum_val): New struct.
(class odr_enum): New struct.
(odr_enum_map): New hashtable.
(odr_types_equivalent_p): Drop code testing TYPE_VALUES.
(add_type_duplicate): Likewise.
(free_odr_warning_data): Do not free TYPE_VALUES.
(register_odr_enum): New function.
(ipa_odr_summary_write): New function.
(ipa_odr_read_section): New function.
(ipa_odr_summary_read): New function.
(class pass_ip

Re: [stage1][PATCH] Lower VEC_COND_EXPR into internal functions.

2020-06-03 Thread Jakub Jelinek via Gcc-patches
On Wed, Jun 03, 2020 at 02:09:11PM -0500, Segher Boessenkool wrote:
> Yeah, but that is specific to the vectoriser, not something that the
> RTL expander should have to deal with.  A big impedance mismatch there.

It isn't specific to vectorizer, many named patterns aren't allowed to FAIL
and many others are.
E.g. if one looks at optabs.c, there are many calls to maybe_emit_*
functions which are allowed to FAIL, and many calls to the corresponding
emit_* functions that assert that it doesn't FAIL.
It is true that the documentation documents this in some and not in all
cases.

Jakub



Re: [PATCH] FAT library support for libatomic, libgfortran, libgomp, libstdc++

2020-06-03 Thread Iain Sandoe via Gcc-patches

Hi David,

thanks for working on this!

David Edelsohn  wrote:


[I'll start by repeating what I wrote about a similar libgcc change to
provide background and context.]

When AIX added 64 bit support, it implemented what Apple MacOS Darwin
calls "FAT" libraries for its equivalent functionality -- both 32 bit
and 64 bit objects (or shared objects) are co-located in the same
archive.  GCC on AIX historically has followed the GCC multilib
directory hierarchy approach with separate directories and archives
for each multilib.


End-user builds of GCC on Darwin also currently follow the “GCC multilib”
pattern;  However when and where GCC libraries were installed by Apple
they are already FAT versions (built outside the GCC build system).

… which fortuitously means that a FAT GCC library set is compatible with
the existing system implementations.

Of course that’s an initial reaction - and I/We will need to discuss with the
‘downstreams’ (macports, fink, homebrew) as to whether a switch needs
some transition - or would cause problems.

Given we anticipate a confiuration flag to disable this mode, it seems we
can cover most reasonable concerns.


libstdc++-v3/
   * Makefile.am (tmake_file): Build and install AIX-style FAT libraries.
   * Makefile.in: Regenerate.
   * configure.ac (tmake_file): Substitute.
   * configure: Regenerate.
   * configure.host (aix*): Define tmake_file.
   * config/os/aix/t-aix: New file.
   * empty.mk: New file.


As a “proof of principle, quick hack” I used the following fragment for  
libstdc++

on Darwin; this means that the same fragment would work for X86 and PPC
- and (in principle) would work for as many sub-archs as we elected to build.

My only concern in the short-term is about dependencies; is it defined that
the MULTIBUILDTOP case cannot be active until all the deps are built?

(the libgcc_s case for Darwin which is already FAT is more complex but does
 have rules for the components to be built first).

I haven’t tried to implement the FAT CRTs so far - but I suspect that those  
are

going to use simple linear rules.

thanks
Iain



ifeq ($(MULTIBUILDTOP),)
MAJOR=$(firstword $(subst :, ,$(libtool_VERSION)))
OTHER=$(shell for d in $(MULTIDIRS); do echo  
"../$$d/$(PACKAGE)-v3/src/.libs/$(PACKAGE).$(MAJOR).dylib "; done)

all-local:
lipo -create -output src/.libs/$(PACKAGE).$(MAJOR).f.dylib 
src/.libs/$(PACKAGE).$(MAJOR).dylib $(OTHER)
mv src/.libs/$(PACKAGE).$(MAJOR).f.dylib 
src/.libs/$(PACKAGE).$(MAJOR).dylib
endif



Re: [PATCH] c++: constrained nested partial specialization [PR92103]

2020-06-03 Thread Jason Merrill via Gcc-patches

On 6/2/20 3:43 PM, Patrick Palka wrote:

When determining the most specialized partial specialization of a
primary template that is nested inside a class template, we first
tsubst the outer template arguments into the TEMPLATE_DECL of each
partial specialization, and then check for satisfaction of the new
TEMPLATE_DECL's constraints.

But tsubst_template_decl does not currently guarantee that constraints
from the original DECL_TEMPLATE_RESULT get reattached to the new
DECL_TEMPLATE_RESULT.  In the testcase below, this leads to the
constraints_satisfied_p check in most_specialized_partial_spec to
trivially return true for each of the partial specializations.

I'm not sure if such a guarantee would be desirable, but in any case we
can just check constraints_satisfied_p on the original TEMPLATE_DECL
instead of on the tsubsted TEMPLATE_DECL here, which is what this patch
does (alongside some reorganizing).

Passes 'make check-c++' and also tested by building the testsuites of
cmcstl2 and range-v3.  Does this look OK to commit to mainline and to
the 10 branch after a full bootstrap and regtest?


OK, thanks.


gcc/cp/ChangeLog:

PR c++/92103
* pt.c (most_specialized_partial_spec): Reorganize the loop over
DECL_TEMPLATE_SPECIALIZATIONS.  Check constraints_satisfied_p on
the original template declaration, not on the substituted one.

gcc/testsuite/ChangeLog:

PR c++/92103
* g++.dg/cpp2a/concepts-partial-spec7.C: New test.
---
  gcc/cp/pt.c   | 19 
  .../g++.dg/cpp2a/concepts-partial-spec7.C | 22 +++
  2 files changed, 32 insertions(+), 9 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-partial-spec7.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 907ca879c73..df92f5584cf 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -24482,21 +24482,22 @@ most_specialized_partial_spec (tree target, 
tsubst_flags_t complain)
  
for (t = DECL_TEMPLATE_SPECIALIZATIONS (main_tmpl); t; t = TREE_CHAIN (t))

  {
-  tree spec_args;
-  tree spec_tmpl = TREE_VALUE (t);
+  const tree ospec_tmpl = TREE_VALUE (t);
  
+  tree spec_tmpl;

if (outer_args)
{
  /* Substitute in the template args from the enclosing class.  */
  ++processing_template_decl;
- spec_tmpl = tsubst (spec_tmpl, outer_args, tf_none, NULL_TREE);
+ spec_tmpl = tsubst (ospec_tmpl, outer_args, tf_none, NULL_TREE);
  --processing_template_decl;
+ if (spec_tmpl == error_mark_node)
+   return error_mark_node;
}
+  else
+   spec_tmpl = ospec_tmpl;
  
-  if (spec_tmpl == error_mark_node)

-   return error_mark_node;
-
-  spec_args = get_partial_spec_bindings (tmpl, spec_tmpl, args);
+  tree spec_args = get_partial_spec_bindings (tmpl, spec_tmpl, args);
if (spec_args)
{
  if (outer_args)
@@ -24505,9 +24506,9 @@ most_specialized_partial_spec (tree target, 
tsubst_flags_t complain)
/* Keep the candidate only if the constraints are satisfied,
   or if we're not compiling with concepts.  */
if (!flag_concepts
-  || constraints_satisfied_p (spec_tmpl, spec_args))
+ || constraints_satisfied_p (ospec_tmpl, spec_args))
  {
-  list = tree_cons (spec_args, TREE_VALUE (t), list);
+ list = tree_cons (spec_args, ospec_tmpl, list);
TREE_TYPE (list) = TREE_TYPE (t);
  }
}
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-partial-spec7.C 
b/gcc/testsuite/g++.dg/cpp2a/concepts-partial-spec7.C
new file mode 100644
index 000..5b3afce3bc7
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-partial-spec7.C
@@ -0,0 +1,22 @@
+// PR c++/92103
+// { dg-do compile { target c++20 } }
+
+template
+struct traits
+{
+  template
+struct equal_to
+{ static constexpr bool value = false; };
+
+  template requires (M == N)
+struct equal_to
+{ static constexpr bool value = true; };
+
+  template requires (M < 0) || (N < 0)
+struct equal_to
+{ };
+};
+
+static_assert(traits<0>::equal_to<0>::value);
+static_assert(!traits<0>::equal_to<1>::value);
+static_assert(traits<-1>::equal_to<0>::value); // { dg-error "not a member" }





Re: [PATCH] c++: more constrained nested partial specialization

2020-06-03 Thread Jason Merrill via Gcc-patches

On 6/2/20 5:45 PM, Patrick Palka wrote:

When checking that a constrained partial specialization is more
constrained than the primary template, we pass only the innermost level
of generic template arguments to strictly_subsumes.  This leads to us
doing a nonsensical substitution from normalize_concept_check if the
full set of template arguments has multiple levels, and it causes
strictly_subsumes to sometimes erroneously return false as in the
testcase below.

Passes 'make check-c++' and also tested by building the testsuites of
cmcstl2 and range-v3.  Does this look OK to commit to mainline and to
the 10 branch after a full bootstrap and regtest?


OK.


(We shouldn't need to do any substitution from strictly_subsumes here,
since processing_template_decl would always be non-zero and so
substituting in the complete set of generic template arguments should
always be a no-op I think.  I can propose this as a subsequent patch for
mainline.)


Sounds good.


gcc/cp/ChangeLog:

* pt.c (process_partial_specialization): Pass the full set of
generic template arguments to strictly_subsumes.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-partial-spec8.C: New test.
---
  gcc/cp/pt.c|  2 +-
  .../g++.dg/cpp2a/concepts-partial-spec8.C  | 14 ++
  2 files changed, 15 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-partial-spec8.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index df92f5584cf..d7f61289621 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -5062,7 +5062,7 @@ process_partial_specialization (tree decl)
if (comp_template_args (inner_args, INNERMOST_TEMPLATE_ARGS (main_args))
&& (!flag_concepts
  || !strictly_subsumes (current_template_constraints (),
-inner_args, maintmpl)))
+main_args, maintmpl)))
  {
if (!flag_concepts)
  error ("partial specialization %q+D does not specialize "
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-partial-spec8.C 
b/gcc/testsuite/g++.dg/cpp2a/concepts-partial-spec8.C
new file mode 100644
index 000..873cf44e407
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-partial-spec8.C
@@ -0,0 +1,14 @@
+// { dg-do compile { target c++20 } }
+
+template
+concept equal = M == N;
+
+template
+struct traits
+{
+  template requires equal
+struct foo {};
+
+  template requires equal && (M >= 0) // { dg-bogus "not more 
constrained" }
+struct foo {};
+};





Re: [PATCH] FAT library support for libatomic, libgfortran, libgomp, libstdc++

2020-06-03 Thread David Edelsohn via Gcc-patches
On Wed, Jun 3, 2020 at 3:14 PM Iain Sandoe  wrote:
>
> Hi David,
>
> thanks for working on this!
>
> David Edelsohn  wrote:
>
> > [I'll start by repeating what I wrote about a similar libgcc change to
> > provide background and context.]
> >
> > When AIX added 64 bit support, it implemented what Apple MacOS Darwin
> > calls "FAT" libraries for its equivalent functionality -- both 32 bit
> > and 64 bit objects (or shared objects) are co-located in the same
> > archive.  GCC on AIX historically has followed the GCC multilib
> > directory hierarchy approach with separate directories and archives
> > for each multilib.
>
> End-user builds of GCC on Darwin also currently follow the “GCC multilib”
> pattern;  However when and where GCC libraries were installed by Apple
> they are already FAT versions (built outside the GCC build system).
>
> … which fortuitously means that a FAT GCC library set is compatible with
> the existing system implementations.
>
> Of course that’s an initial reaction - and I/We will need to discuss with the
> ‘downstreams’ (macports, fink, homebrew) as to whether a switch needs
> some transition - or would cause problems.
>
> Given we anticipate a confiuration flag to disable this mode, it seems we
> can cover most reasonable concerns.
>
> > libstdc++-v3/
> >* Makefile.am (tmake_file): Build and install AIX-style FAT 
> > libraries.
> >* Makefile.in: Regenerate.
> >* configure.ac (tmake_file): Substitute.
> >* configure: Regenerate.
> >* configure.host (aix*): Define tmake_file.
> >* config/os/aix/t-aix: New file.
> >* empty.mk: New file.
>
> As a “proof of principle, quick hack” I used the following fragment for
> libstdc++
> on Darwin; this means that the same fragment would work for X86 and PPC
> - and (in principle) would work for as many sub-archs as we elected to build.
>
> My only concern in the short-term is about dependencies; is it defined that
> the MULTIBUILDTOP case cannot be active until all the deps are built?
>
> (the libgcc_s case for Darwin which is already FAT is more complex but does
>   have rules for the components to be built first).

Hi, Ian

Thanks for trying this out and the feedback.

On AIX I can start building the "FAT" libraries without actually
relying upon them for more than the "native" objects.  So I slowly can
create 32/64 bit compatible libraries and then enable the multilib
configuration bits at the end to instruct GCC to look at the same
directory level for both 32 bit and 64 bit objects.  I don't know if
that is viable for Darwin.

And on AIX I cannot transition for some libraries and not others.  In
other words, I can't switch C++ and Fortran, but not Objective-C and
GCCGo.

Thanks, David


Re: [PATCH] FAT library support for libatomic, libgfortran, libgomp, libstdc++

2020-06-03 Thread Iain Sandoe via Gcc-patches

David Edelsohn  wrote:


On Wed, Jun 3, 2020 at 3:14 PM Iain Sandoe  wrote:



libstdc++-v3/
  * Makefile.am (tmake_file): Build and install AIX-style FAT libraries.
  * Makefile.in: Regenerate.
  * configure.ac (tmake_file): Substitute.
  * configure: Regenerate.
  * configure.host (aix*): Define tmake_file.
  * config/os/aix/t-aix: New file.
  * empty.mk: New file.


As a “proof of principle, quick hack” I used the following fragment for
libstdc++
on Darwin; this means that the same fragment would work for X86 and PPC
- and (in principle) would work for as many sub-archs as we elected to  
build.


My only concern in the short-term is about dependencies; is it defined  
that

the MULTIBUILDTOP case cannot be active until all the deps are built?

(the libgcc_s case for Darwin which is already FAT is more complex but  
does

 have rules for the components to be built first).





On AIX I can start building the "FAT" libraries without actually
relying upon them for more than the "native" objects.  So I slowly can
create 32/64 bit compatible libraries and then enable the multilib
configuration bits at the end to instruct GCC to look at the same
directory level for both 32 bit and 64 bit objects.  I don't know if
that is viable for Darwin.


probably, since the linker is happy to consume either - so long as the
search paths are there for the multi-lib style ld64 will use the first lib
it finds with a viable slice.


And on AIX I cannot transition for some libraries and not others.  In
other words, I can't switch C++ and Fortran, but not Objective-C and
GCCGo.


That would seem unwise for Darwin too - even if it would work - it would
create confusion.
Iain




Re: [PATCH] gcc/Makefile.in: move SELFTEST_DEPS before including language makefile fragments

2020-06-03 Thread Romain Naour via Gcc-patches
Hi Jeff,

Le 03/06/2020 à 20:33, Jeff Law a écrit :
> On Thu, 2020-05-21 at 17:35 +0200, Romain Naour via Gcc-patches wrote:
>> As reported by several Buildroot users [1][2][3], the gcc build
>> may fail while running selftests makefile target.
>>
>> The problem only occurs when ccache is used with gcc 9 and 10,
>> probably due to a race condition.
>>
>> While debuging with "make -p" we can notice that s-selftest-c target
>> contain only "cc1" as dependency instead of cc1 and SELFTEST_DEPS [4].
>>
>>   s-selftest-c: cc1
>>
>> While the build is failing, the s-selftest-c dependencies recipe is
>> still running and reported as a bug by make.
>>
>>   "Dependencies recipe running (THIS IS A BUG)."
>>
>> A change [5] in gcc 9 seems to introduce the problem since we can't
>> reproduce this problem with gcc 8.
>>
>> As suggested by Yann E. MORIN [6], move SELFTEST_DEPS before
>> including language makefile fragments.
>>
>> With the fix applied, the s-seltest-c dependency contains
>> SELFTEST_DEPS value.
>>
>>   s-selftest-c: cc1 xgcc specs stmp-int-hdrs ../../gcc/testsuite/selftests
>>
>> [1] http://lists.busybox.net/pipermail/buildroot/2020-May/282171.html
>> [2] http://lists.busybox.net/pipermail/buildroot/2020-May/282766.html
>> [3] https://github.com/cirosantilli/linux-kernel-module-cheat/issues/108
>> [4] 
>> https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/c/Make-lang.in;h=bfae6fd2549c4f728816cd355fa9739dcc08fcde;hb=033eb5671769a4c681a44aad08a454e667e08502#l120
>> [5] 
>> https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=033eb5671769a4c681a44aad08a454e667e08502
>> [6] http://lists.busybox.net/pipermail/buildroot/2020-May/283213.html
>>
>> Signed-off-by: Romain Naour 
>> Cc: Ben Dakin-Norris 
>> Cc: Maxim Kochetkov 
>> Cc: Thomas Petazzoni 
>> Cc: Yann E. MORIN 
>> Cc: Cc: David Malcolm 
>> ---
>> This patch should be backported to gcc 10 and gcc 9.
>> ---
>>  gcc/ChangeLog   | 5 +
>>  gcc/Makefile.in | 6 --
>>  2 files changed, 9 insertions(+), 2 deletions(-)
>>
>> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
>> index 977e7664b62..c3bb18f2afd 100644
>> --- a/gcc/ChangeLog
>> +++ b/gcc/ChangeLog
>> @@ -1,3 +1,8 @@
>> +2020-05-21  Romain Naour 
>> +
>> +* Makefile.in: move SELFTEST_DEPS before including language
>> +makefile fragments.
> THanks.  I've installed this on the trunk.

Many thanks for merging the patch!

But I don't see the commit log I've written to explain the issue.
Was there a reason to drop it?

Best regards,
Romain

> 
> jeff
>>
> 



Re: [PATCH] gcc/Makefile.in: move SELFTEST_DEPS before including language makefile fragments

2020-06-03 Thread Jeff Law via Gcc-patches
On Wed, 2020-06-03 at 21:56 +0200, Romain Naour wrote:
> Hi Jeff,
> 
> Le 03/06/2020 à 20:33, Jeff Law a écrit :
> > On Thu, 2020-05-21 at 17:35 +0200, Romain Naour via Gcc-patches wrote:
> > > As reported by several Buildroot users [1][2][3], the gcc build
> > > may fail while running selftests makefile target.
> > > 
> > > The problem only occurs when ccache is used with gcc 9 and 10,
> > > probably due to a race condition.
> > > 
> > > While debuging with "make -p" we can notice that s-selftest-c target
> > > contain only "cc1" as dependency instead of cc1 and SELFTEST_DEPS [4].
> > > 
> > >   s-selftest-c: cc1
> > > 
> > > While the build is failing, the s-selftest-c dependencies recipe is
> > > still running and reported as a bug by make.
> > > 
> > >   "Dependencies recipe running (THIS IS A BUG)."
> > > 
> > > A change [5] in gcc 9 seems to introduce the problem since we can't
> > > reproduce this problem with gcc 8.
> > > 
> > > As suggested by Yann E. MORIN [6], move SELFTEST_DEPS before
> > > including language makefile fragments.
> > > 
> > > With the fix applied, the s-seltest-c dependency contains
> > > SELFTEST_DEPS value.
> > > 
> > >   s-selftest-c: cc1 xgcc specs stmp-int-hdrs ../../gcc/testsuite/selftests
> > > 
> > > [1] http://lists.busybox.net/pipermail/buildroot/2020-May/282171.html
> > > [2] http://lists.busybox.net/pipermail/buildroot/2020-May/282766.html
> > > [3] https://github.com/cirosantilli/linux-kernel-module-cheat/issues/108
> > > [4] 
> > > https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/c/Make-lang.in;h=bfae6fd2549c4f728816cd355fa9739dcc08fcde;hb=033eb5671769a4c681a44aad08a454e667e08502#l120
> > > [5] 
> > > https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=033eb5671769a4c681a44aad08a454e667e08502
> > > [6] http://lists.busybox.net/pipermail/buildroot/2020-May/283213.html
> > > 
> > > Signed-off-by: Romain Naour 
> > > Cc: Ben Dakin-Norris 
> > > Cc: Maxim Kochetkov 
> > > Cc: Thomas Petazzoni 
> > > Cc: Yann E. MORIN 
> > > Cc: Cc: David Malcolm 
> > > ---
> > > This patch should be backported to gcc 10 and gcc 9.
> > > ---
> > >  gcc/ChangeLog   | 5 +
> > >  gcc/Makefile.in | 6 --
> > >  2 files changed, 9 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> > > index 977e7664b62..c3bb18f2afd 100644
> > > --- a/gcc/ChangeLog
> > > +++ b/gcc/ChangeLog
> > > @@ -1,3 +1,8 @@
> > > +2020-05-21  Romain Naour 
> > > +
> > > + * Makefile.in: move SELFTEST_DEPS before including language
> > > + makefile fragments.
> > THanks.  I've installed this on the trunk.
> 
> Many thanks for merging the patch!
> 
> But I don't see the commit log I've written to explain the issue.
> Was there a reason to drop it?
As a project we're still trying to sort out the right level of verbosity of the
commit log.  I tend to use short ones.

jeff
> 



Re: [PATCH 0/6] Permute Class Operations

2020-06-03 Thread Segher Boessenkool
Hi Carl,

On Mon, Jun 01, 2020 at 09:14:25AM -0700, Carl Love wrote:
> Based on previous IBM internal reviews of the patch set, the desire is
> for all of the vector insert and extract support to be in vsx.md as
> there is a longer term plan to re-work this support for PPC.

Very long time ago, it was easy: there was altivec.md, which contained
the AltiVec instruction patterns.  Life was good.

Then, for Power7, we got vsx.md for the original VSX insns, and vector.md
to mux the things that exist similarly in AltiVec (aka VMX) and VSX.
This was still quite nice.

But since then we got Power8, and Power9, and now Power10, and there is
not any obvious way to decide what goes where anymore.  It may seem (I
thought this for a while) that instructions that only operate on the VRs
(and not the other VSRs, namely the FPRs) can go into altivec.md -- but
then a later version of the ISA makes things more orthogonal, and things
*can* go into all VSRs now, so things have to move again, and the
distinction turns out to have been totally artificial and not useful.

So yes, we need to divide this up some saner way.  Maybe we can start
with splitting of the FP parts (scalar and vector, also classical FP,
but maybe only the computational insns (not the moves))?  Will need some
experimentation to find out something that works well (and that is the
point: it should be easy to *work with*, nothing else).

Sorry for blabbering :-)


Segher


[pushed][PATCH] identify lfs prefixed case PR95347

2020-06-03 Thread Aaron Sawdey via Gcc-patches
This passed regstrap and was approved offline by Segher, posting
the final form (minus my debug code, oops).

The same problem also arises for plfs where prefixed_load_p()
doesn't recognize it so we get just lfs in the asm output
with an @pcrel address.

PR target/95347
* config/rs6000/rs6000.c (is_stfs_insn): Rename to
is_lfs_stfs_insn and make it recognize lfs as well.
(prefixed_store_p): Use is_lfs_stfs_insn().
(prefixed_load_p): Use is_lfs_stfs_insn() to recognize lfs.
---
 gcc/config/rs6000/rs6000.c | 37 +
 1 file changed, 25 insertions(+), 12 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index ba9069ecc3b..42d517c1f65 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -24980,14 +24980,18 @@ address_to_insn_form (rtx addr,
   return INSN_FORM_BAD;
 }
 
-/* Helper function to see if we're potentially looking at stfs.
+/* Helper function to see if we're potentially looking at lfs/stfs.
- PARALLEL containing a SET and a CLOBBER
-   - SET is from UNSPEC_SI_FROM_SF to MEM:SI
-   - CLOBBER is a V4SF
+   - stfs:
+- SET is from UNSPEC_SI_FROM_SF to MEM:SI
+- CLOBBER is a V4SF
+   - lfs:
+- SET is from UNSPEC_SF_FROM_SI to REG:SF
+- CLOBBER is a DI
  */
 
 static bool
-is_stfs_insn (rtx_insn *insn)
+is_lfs_stfs_insn (rtx_insn *insn)
 {
   rtx pattern = PATTERN (insn);
   if (GET_CODE (pattern) != PARALLEL)
@@ -25013,16 +25017,22 @@ is_stfs_insn (rtx_insn *insn)
   rtx src = SET_SRC (set);
   rtx scratch = SET_DEST (clobber);
 
-  if (GET_CODE (src) != UNSPEC || XINT (src, 1) != UNSPEC_SI_FROM_SF)
+  if (GET_CODE (src) != UNSPEC)
 return false;
 
-  if (GET_CODE (dest) != MEM || GET_MODE (dest) != SImode)
-return false;
+  /* stfs case.  */
+  if (XINT (src, 1) == UNSPEC_SI_FROM_SF
+  && GET_CODE (dest) == MEM && GET_MODE (dest) == SImode
+  && GET_CODE (scratch) == SCRATCH && GET_MODE (scratch) == V4SFmode)
+return true;
 
-  if (GET_CODE (scratch) != SCRATCH || GET_MODE (scratch) != V4SFmode)
-return false;
+  /* lfs case.  */
+  if (XINT (src, 1) == UNSPEC_SF_FROM_SI
+  && GET_CODE (dest) == REG && GET_MODE (dest) == SFmode
+  && GET_CODE (scratch) == SCRATCH && GET_MODE (scratch) == DImode)
+return true;
 
-  return true;
+  return false;
 }
 
 /* Helper function to take a REG and a MODE and turn it into the non-prefixed
@@ -25135,7 +25145,10 @@ prefixed_load_p (rtx_insn *insn)
   else
 non_prefixed = reg_to_non_prefixed (reg, mem_mode);
 
-  return address_is_prefixed (XEXP (mem, 0), mem_mode, non_prefixed);
+  if (non_prefixed == NON_PREFIXED_X && is_lfs_stfs_insn (insn))
+return address_is_prefixed (XEXP (mem, 0), mem_mode, NON_PREFIXED_DEFAULT);
+  else
+return address_is_prefixed (XEXP (mem, 0), mem_mode, non_prefixed);
 }
 
 /* Whether a store instruction is a prefixed instruction.  This is called from
@@ -25170,7 +25183,7 @@ prefixed_store_p (rtx_insn *insn)
   /* Need to make sure we aren't looking at a stfs which doesn't look
  like the other things reg_to_non_prefixed/address_is_prefixed
  looks for.  */
-  if (non_prefixed == NON_PREFIXED_X && is_stfs_insn (insn))
+  if (non_prefixed == NON_PREFIXED_X && is_lfs_stfs_insn (insn))
 return address_is_prefixed (addr, mem_mode, NON_PREFIXED_DEFAULT);
   else
 return address_is_prefixed (addr, mem_mode, non_prefixed);
-- 
2.17.1



Re: [PATCH] Fix missing assemble_external in ASM_OUTPUT_FDESC

2020-06-03 Thread Jeff Law via Gcc-patches
On Mon, 2020-05-18 at 12:02 +0200, Andreas Schwab wrote:
> When TARGET_VTABLE_USES_DESCRIPTORS is defined then function pointers in
> the vtable are output by ASM_OUTPUT_FDESC.  The only current user of
> this is ia64, but its implementation of ASM_OUTPUT_FDESC lacks a call to
> assemble_external.  Thus if there is no other reference to the function
> the weak declaration for it will be missing.
> 
>   PR target/95154
>   * config/ia64/ia64.h (ASM_OUTPUT_FDESC): Call assemble_external.
OK.  Sorry about the delays.

jeff
> 



std::optional defaut constructor

2020-06-03 Thread Marc Glisse

Hello,

is there any drawback to the attached patch? It changes the code generated for

std::optional>f(){return{};}

from

movq$0, (%rdi)
movq%rdi, %r8
leaq8(%rdi), %rdi
xorl%eax, %eax
movq$0, 4084(%rdi)
movq%r8, %rcx
andq$-8, %rdi
subq%rdi, %rcx
addl$4100, %ecx
shrl$3, %ecx
rep stosq
movq%r8, %rax

or with different tuning

subq$8, %rsp
movl$4100, %edx
xorl%esi, %esi
callmemset
addq$8, %rsp

to the much shorter

movb$0, 4096(%rdi)
movq%rdi, %rax

i.e. the same as the nullopt constructor.

The constructor was already non-trivial, so we don't lose that. It passes the
testsuite without regression. I thought I remembered there was a reason not to
do this...

2020-06-04  Marc Glisse  

* include/std/optional (optional()): Explicitly define it.

(I don't currently have a setup that would enable me to commit anything. I'll
try to fix it eventually, but likely not so soon)

--
Marc Glissediff --git a/libstdc++-v3/include/std/optional b/libstdc++-v3/include/std/optional
index 37c2ba7a025..e84ba28a806 100644
--- a/libstdc++-v3/include/std/optional
+++ b/libstdc++-v3/include/std/optional
@@ -688,7 +688,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 public:
   using value_type = _Tp;
 
-  constexpr optional() = default;
+  constexpr optional() noexcept { }
 
   constexpr optional(nullopt_t) noexcept { }
 


[PATCH] Treat { 0 } specially for structs with the designated_init attribute.

2020-06-03 Thread Asher Gordon via Gcc-patches
Closes #95379.

gcc/ChangeLog:

* doc/extend.texi: Document { 0 } as a special case for the
designated_init attribute.

gcc/c/ChangeLog:

* c-typeck.c (struct location_list): New type.
(struct initializer_stack): Add positional_init_locs for
-Wdesignated-init warnings.
(start_init): Initialize initializer_stack->positional_init_locs
to NULL.
(finish_init): Free initializer_stack->positional_init_locs.
(pop_init_level): Move -Wdesignated-init warning here from
process_init_element so that we can treat { 0 } specially.
(process_init_element): Instead of warning on -Wdesignated-init
here, remember a list of locations where we should warn and do the
actual warning in pop_init_level.

gcc/testsuite/ChangeLog:

* gcc.dg/Wdesignated-init.c: Add tests for ignoring { 0 }.
---
Note: Please see the discussion here[1] before applying this
patch. Also note that this patch does not implement a
-Wno-designated-init flag, but that shouldn't be too hard to
implement.

[1]  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95379

 gcc/ChangeLog   |  5 +++
 gcc/c/ChangeLog | 14 +++
 gcc/c/c-typeck.c| 55 +++--
 gcc/doc/extend.texi |  4 ++
 gcc/testsuite/ChangeLog |  4 ++
 gcc/testsuite/gcc.dg/Wdesignated-init.c |  3 ++
 6 files changed, 81 insertions(+), 4 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 3fc8a8e55cc..0249f965c4f 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2020-06-03  Asher Gordon  
+
+   * doc/extend.texi: Document { 0 } as a special case for the
+   designated_init attribute.
+
 2020-06-02  Felix Yang  
 
PR target/95459
diff --git a/gcc/c/ChangeLog b/gcc/c/ChangeLog
index 7efb6bfc544..0a6cf36bad7 100644
--- a/gcc/c/ChangeLog
+++ b/gcc/c/ChangeLog
@@ -1,3 +1,17 @@
+2020-06-03  Asher Gordon  
+
+   * c-typeck.c (struct location_list): New type.
+   (struct initializer_stack): Add positional_init_locs for
+   -Wdesignated-init warnings.
+   (start_init): Initialize initializer_stack->positional_init_locs
+   to NULL.
+   (finish_init): Free initializer_stack->positional_init_locs.
+   (pop_init_level): Move -Wdesignated-init warning here from
+   process_init_element so that we can treat { 0 } specially.
+   (process_init_element): Instead of warning on -Wdesignated-init
+   here, remember a list of locations where we should warn and do the
+   actual warning in pop_init_level.
+
 2020-05-28  Nicolas Bértolo  
 
* Make-lang.in: Remove extra slash.
diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index 385bf3a1c7b..6273a7ce2d7 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -8179,6 +8179,13 @@ struct constructor_range_stack
 
 static struct constructor_range_stack *constructor_range_stack;
 
+/* A list of locations.  */
+
+struct location_list {
+  struct location_list *next;
+  location_t loc;
+};
+
 /* This stack records separate initializers that are nested.
Nested initializers can't happen in ANSI C, but GNU C allows them
in cases like { ... (struct foo) { ... } ... }.  */
@@ -8197,6 +8204,7 @@ struct initializer_stack
   char require_constant_value;
   char require_constant_elements;
   rich_location *missing_brace_richloc;
+  struct location_list *positional_init_locs;
 };
 
 static struct initializer_stack *initializer_stack;
@@ -8222,6 +8230,7 @@ start_init (tree decl, tree asmspec_tree 
ATTRIBUTE_UNUSED, int top_level,
   p->top_level = constructor_top_level;
   p->next = initializer_stack;
   p->missing_brace_richloc = richloc;
+  p->positional_init_locs = NULL;
   initializer_stack = p;
 
   constructor_decl = decl;
@@ -8287,6 +8296,13 @@ finish_init (void)
   spelling_size = p->spelling_size;
   constructor_top_level = p->top_level;
   initializer_stack = p->next;
+
+  while (p->positional_init_locs)
+{
+  struct location_list *list = p->positional_init_locs;
+  p->positional_init_locs = list->next;
+  free (list);
+}
   free (p);
 }
 
@@ -8744,6 +8760,22 @@ pop_init_level (location_t loc, int implicit,
  }
 }
 
+  /* Warn when positional initializers are used for a structure with
+ the designated_init attribute, but make an exception for { 0 }.  */
+  while (initializer_stack->positional_init_locs)
+{
+  struct location_list *loc = initializer_stack->positional_init_locs;
+
+  if (!constructor_zeroinit)
+   warning_init (loc->loc,
+ OPT_Wdesignated_init,
+ "positional initialization of field in % "
+ "declared with % attribute");
+
+  initializer_stack->positional_init_locs = loc->next;
+  free(loc);
+}
+
   /* Pad out the end of the structure.  */
   if (p->replacement_value.value)
 /* If this closes a superfluous brace 

Re: [PATCH 1/6] rs6000, Update support for vec_extract

2020-06-03 Thread Segher Boessenkool
Hi carl,

On Mon, Jun 01, 2020 at 09:14:33AM -0700, Carl Love wrote:
> * config/rs6000/altivec.md: Move UNSPEC_EXTRACTL, UNSPEC_EXTRACTR
>   declarations to gcc/config/rs6000/vsx.md.
>   (define_expand): Move vextractl and vextractr to
>   gcc/config/rs6000/vsx.md.
>   (define_insn): Move vextractl_internal and 
> vextractr_internal
>   to gcc/config/rs6000/vsx.md.

* config/rs6000/altivec.md (UNSPEC_EXTRACTL, UNSPEC_EXTRACTR)
(vextractl, vextractr)
(vextractl_internal, vextractr_internal): Move to ...

>   * config/rs6000/vsx.md: Code moved from file config/rs6000/altivec.md.

* config/rs6000/vsx.md (UNSPEC_EXTRACTL, UNSPEC_EXTRACTR)
(vextractl, vextractr)
(vextractl_internal, vextractr_internal): ... here.

You can probably do this some less formal way, but (define_insn) as
name is a bit much ;-)

>   * gcc/doc/extend.texi: Update documentation for vec_extractl.
>   Replace builtin name vec_extractr with vec_extracth.  Update description
>   of vec_extracth.

(For documentation like this I certainly do not care at all.)

> -;; Like VI, defined in vector.md, but add ISA 2.07 integer vector ops
> -(define_mode_iterator VI2 [V4SI V8HI V16QI V2DI])

You should mention this in the changelog as well.

> +;; Like VI, defined in vector.md, but add ISA 2.07 integer vector ops
> +(define_mode_iterator VI2 [V4SI V8HI V16QI V2DI])

It may be easier to just put them together in vector.me then, no matter
where they are used?  But we should restructure this whole thing anyway,
just something to keep in the back of your head.

Okay for trunk with such tweaks.  Thanks!


Segher


Re: [Patch] Inline optimization for tanh(x)/sinh(x) -> 1.0/cosh(x)

2020-06-03 Thread Jeff Law via Gcc-patches
On Fri, 2020-01-31 at 15:50 -0300, Vitor Guidi wrote:
> I took the required steps. The patch is attached to this email, I hope
> I got it right this time. I did not forward the patch to gcc-patches
> the first time, sorry for the inconvenience.
> 
> Thank you for your attention,
> 
> Vitor.
> 
> in gcc/ChangeLog:
> 2020-01-28 Vitor Guidi 
> * match.pd: New substitution rule for tanh(x)/sinh(x) ->
> 1.0/cosh(x).
> 
> in gcc/testsuite/ChangeLog:
> 2020-01-28 Vitor Guidi 
> *gcc.dg/tanhbysinh.c (new): New testcase.
Thanks.  I've installed this on the trunk.  Sorry for the delays.

jeff
> 



Re: [PATCH] Treat { 0 } specially for structs with the designated_init attribute.

2020-06-03 Thread Asher Gordon via Gcc-patches
Asher Gordon  writes:

> Also note that this patch does not implement a -Wno-designated-init
> flag, but that shouldn't be too hard to implement.

Sorry, I meant -Wno-universal-initializer, not -Wno-designated-init.

Asher

-- 
Vimes grunted.  "Where there are policemen there's crime, sergeant,
remember that."

"Yes, I do, sir, although I think it sounds better with a little reordering
of the words."

  [Snuff, by Terry Pratchett]
   
I prefer to send and receive mail encrypted. Please send me your
public key, and if you do not have my public key, please let me
know. Thanks.

GPG fingerprint: 38F3 975C D173 4037 B397  8095 D4C9 C4FC 5460 8E68


signature.asc
Description: PGP signature


Re: std::optional defaut constructor

2020-06-03 Thread Ville Voutilainen via Gcc-patches
On Thu, 4 Jun 2020 at 01:52, Marc Glisse  wrote:
>
> Hello,
>
> is there any drawback to the attached patch? It changes the code generated for

I don't get it. The noexceptness of the defaulted default constructor
should be a computation
of the noexceptness of the subobjects, and that should boil down to
whether optional's storage
is noexcept-default-constructible. And that thing has a noexcept
default constructor. Why
does this patch change anything?


Re: std::optional defaut constructor

2020-06-03 Thread Marc Glisse

On Thu, 4 Jun 2020, Ville Voutilainen wrote:


On Thu, 4 Jun 2020 at 01:52, Marc Glisse  wrote:


Hello,

is there any drawback to the attached patch? It changes the code generated for


I don't get it. The noexceptness of the defaulted default constructor
should be a computation
of the noexceptness of the subobjects, and that should boil down to
whether optional's storage
is noexcept-default-constructible. And that thing has a noexcept
default constructor. Why
does this patch change anything?


"noexcept" is a red herring, what matters is defaulted vs user-provided. 
In one case, we end up zero-initializing the whole buffer, and not in the 
other.


--
Marc Glisse


Re: [PATCH] diagnostics: Add function call parens matching to c_parser.

2020-06-03 Thread Joseph Myers
On Sun, 31 May 2020, Mark Wielaard wrote:

> On Sun, May 24, 2020 at 11:46:34PM +0200, Mark Wielaard wrote:
> > The C++ parser already tracks function call parens matching, but the C
> > parser doesn't. This adds the same functionality to the C parser and adds
> > a testcase showing the C++ and C parser matching function call parens
> > in an error message.
> > 
> > gcc/c/ChangeLog:
> > 
> > * c-parser.c (c_parser_postfix_expression_after_primary): Add
> > scope with matching_parens after CPP_OPEN_PAREN.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * c-c++-common/missing-close-func-paren.c: New test.
> 
> Ping.

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH 2/6] rs6000 Add vector insert builtin support

2020-06-03 Thread Segher Boessenkool
Hi!

On Mon, Jun 01, 2020 at 09:14:40AM -0700, Carl Love wrote:
> * config/rs6000/altivec.h: Add define vec_insertl, vec_inserth.

* config/rs6000/altivec.h (vec_insertl, vec_inserth): New defines.

>   * config/rs6000/rs6000-builtin.def (BU_FUTURE_V_3): Add definition for
>   VINSERTGPRBL, VINSERTGPRHL, VINSERTGPRWL, VINSERTGPRDL, VINSERTVPRBL,
>   VINSERTVPRHL, VINSERTVPRWL, VINSERTGPRBR, VINSERTGPRHR, VINSERTGPRWR,
>   VINSERTGPRDR, VINSERTVPRBR, VINSERTVPRHR, VINSERTVPRWR.

* config/rs6000/rs6000-builtin.def (VINSERTGPRBL, VINSERTGPRHL,
VINSERTGPRWL, VINSERTGPRDL, VINSERTVPRBL, VINSERTVPRHL, VINSERTVPRWL,
VINSERTGPRBR, VINSERTGPRHR, VINSERTGPRWR, VINSERTGPRDR, VINSERTVPRBR,
VINSERTVPRHR, VINSERTVPRWR): New builtins.

>   (BU_FUTURE_OVERLOAD_3): Add definition for INSERTL, INSERTH.

(INSERTL, INSERTH): New builtins.

>   * config/rs6000/rs6000-call.c (FUTURE_BUILTIN_VEC_INSERTL): Add 
> overloaded
>   argument declarations.
>   (FUTURE_BUILTIN_VEC_INSERTH):  Add overloaded   argument declarations.

Maybe just "New overloads." for both?  (That latter one has whitespace
problems).

>   (builtin_function_type): Add case entries for 
> FUTURE_BUILTIN_VINSERTGPRBL,
>   FUTURE_BUILTIN_VINSERTGPRHL, FUTURE_BUILTIN_VINSERTGPRWL,
>   FUTURE_BUILTIN_VINSERTGPRDL, FUTURE_BUILTIN_VINSERTVPRBL,
>   FUTURE_BUILTIN_VINSERTVPRHL, FUTURE_BUILTIN_VINSERTVPRWL.

builtin_function_type is not the name of what you changed (FUTURE* is).

> +(define_expand "vinsertvr_"
> +  [(set (match_operand:VI2 0 "altivec_register_operand")
> +   (unspec:VI2 [(match_operand:VI2 1 "altivec_register_operand")
> +(match_operand:VI2 2 "altivec_register_operand")
> +(match_operand:SI 3 "register_operand" "r")]
> +UNSPEC_INSERTR))]
> +  "TARGET_FUTURE"
> +{
> +  if (BYTES_BIG_ENDIAN)
> + emit_insn (gen_vinsertvr_internal_ (operands[0], operands[3],
> +   operands[1], operands[2]));
> +   else
> + emit_insn (gen_vinsertvr_internal_ (operands[0], operands[3],
> +   operands[1], operands[2]));
> +   DONE;
> +})

The two cases (BE and LE) are now exactly identical?  Same for all
similar cases.

Okay for trunk with those nits taken care of.  Thanks!


Segher


Re: [PATCH 1/2] Provide diagnostic hints for missing C inttypes.h string constants.

2020-06-03 Thread Joseph Myers
On Mon, 1 Jun 2020, Mark Wielaard wrote:

> On Sun, May 24, 2020 at 02:30:13AM +0200, Mark Wielaard wrote:
> > This adds a flag to c_parser so we know when we were trying to
> > construct a string literal. If there is a parse error and we were
> > constructing a string literal, and the next token is an unknown
> > identifier name, and we know there is a standard header that defines
> > that name as a string literal, then add a missing header hint to
> > the error messsage.
> > 
> > The list of macro names are also used when providing a hint for
> > missing identifiers.
> 
> Ping. Note the followup patch that introduces the same functionality
> for the C++ parser was already approved. This patch (as attached) only
> needs review/approval from a C-frontend maintainer for some of the
> gcc/c/c-parser.c bits.

This patch is OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: std::optional defaut constructor

2020-06-03 Thread Ville Voutilainen via Gcc-patches
On Thu, 4 Jun 2020 at 02:13, Marc Glisse  wrote:
>
> On Thu, 4 Jun 2020, Ville Voutilainen wrote:
>
> > On Thu, 4 Jun 2020 at 01:52, Marc Glisse  wrote:
> >>
> >> Hello,
> >>
> >> is there any drawback to the attached patch? It changes the code generated 
> >> for
> >
> > I don't get it. The noexceptness of the defaulted default constructor
> > should be a computation
> > of the noexceptness of the subobjects, and that should boil down to
> > whether optional's storage
> > is noexcept-default-constructible. And that thing has a noexcept
> > default constructor. Why
> > does this patch change anything?
>
> "noexcept" is a red herring, what matters is defaulted vs user-provided.
> In one case, we end up zero-initializing the whole buffer, and not in the
> other.

Yes, I just came to that conclusion. This is value-init, so the
language manages to zero-init the whole-object,
but with the change, it just calls a user-provided constructor.
That'll then merely boil down to value-initializing just the _Empty
part
of the _Storage in _Optional_payload_base. We are in
http://eel.is/c++draft/dcl.init#8.1.2. The change takes us
to http://eel.is/c++draft/dcl.init#8.1.1.


Re: [PATCH 3/6] rs6000, Add vector replace builtin support

2020-06-03 Thread Segher Boessenkool
Hi!

On Mon, Jun 01, 2020 at 09:14:47AM -0700, Carl Love wrote:
> * config/rs6000/altivec.h: Add define for vec_replace_elt and
> vec_replace_unaligned.
* config/rs6000/altivec.h (vec_replace_elt, vec_replace_unaligned):
New defines.

> * config/rs6000/vsx.md: Add unspec UNSPEC_REPLACE_ELT and
> UNSPEC_REPLACE_UN.
> Add mode iterator REPLACE_ELT.
> Add mode attributes REPLACE_ELT_atr, REPLACE_ELT_inst,
> REPLACE_ELT_char, REPLACE_ELT_sh, REPLACE_ELT_max.
> Add define_expand vreplace_elt_, mode REPLACE_ELT.
> Add define_expand vreplace_un_, mode REPLACE_ELT.
> Add define_insn vreplace_elt__inst, mode REPLACE_ELT.
* config/rs6000/vsx.md (UNSPEC_REPLACE_ELT, UNSPEC_REPLACE_UN): New.
(REPLACE_ELT): New mode iterator.
(REPLACE_ELT_atr, REPLACE_ELT_inst, REPLACE_ELT_char, REPLACE_ELT_sh,
REPLACE_ELT_max): New mode attributes.
(vreplace_elt_, vreplace_un_, vreplace_elt__inst
for REPLACE_ELT): New.

(etc.)

> (builtin_function_type): Add case statements for
> FUTURE_BUILTIN_VREPLACE_ELT_UV4SI,
> FUTURE_BUILTIN_VREPLACE_ELT_UV2DI,
> FUTURE_BUILTIN_VREPLACE_UN_UV4SI,
> FUTURE_BUILTIN_VREPLACE_UN_UV2DI.

(builtin_function_type) [FUTURE_BUILTIN_VREPLACE_ELT_UV4SI,
FUTURE_BUILTIN_VREPLACE_ELT_UV2DI, FUTURE_BUILTIN_VREPLACE_UN_UV4SI,
FUTURE_BUILTIN_VREPLACE_UN_UV2DI]: New cases.

> + /* Check whether the 3rd argument is an integer constant in the
> range
> + 0 to 3 inclusive.  */
> + STRIP_NOPS (arg2);
> + if (TREE_CODE (arg2) != INTEGER_CST || TREE_INT_CST_LOW (arg2) >
> 3)

Hrm, does this test that is it non-negative as well?  Maybe it should use
something like

 if (TREE_CODE (arg2) != INTEGER_CST
 || !IN_RANGE (TREE_INT_CST_LOW (arg2), 0, 3))

(same below, for 0..12).

> +(define_mode_attr REPLACE_ELT_atr [(V4SI "SI") (V4SF "SF")
> +(V2DI "DI") (V2DF "DF")])

You can use VS_scalar wherever you use this?

> +(define_mode_attr REPLACE_ELT_inst [(V4SI "v4si") (V4SF "v4sf")
> + (V2DI "v2di") (V2DF "v2df")])

This one is just .

>  
> +(define_expand "vreplace_elt_"
> + [(set (match_operand:REPLACE_ELT 0 "register_operand")
> + (unspec:REPLACE_ELT [(match_operand:REPLACE_ELT 1 "register_operand")

Indent is a bit wrong here (the "unspec" should line up with the
"match_operand").

> +  Endianess if needed.  */

(two n's in endianness).

I think the rest is fine.  Okay for trunk with these things fixed, or
resend it if you prefer?

Thanks!


Segher


Re: [stage1][PATCH] Make TOPN counter dynamically allocated.

2020-06-03 Thread Gerald Pfeifer
On Wed, 3 Jun 2020, Martin Liška wrote:
> Sorry for the breakage. Can you please paste full build output for the 
> problematic .o file?
> 
> I bet it's a C file compilation, where we should use:
> 
> __sync_val_compare_and_swap (counter, 0, (intptr_t)node);
> 
> Can you please test it?

c++ -std=c++11 -fno-PIE -c -g -DIN_GCC -fno-strict-aliasing 
-fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall 
-Wno-narrowing -Wwrite-strings -Wcast-qual -Wno-error=format-diag 
-Wno-format -Wmissing-format-attribute -Wove rloaded-virtual -pedantic 
-Wno-long-long -Wno-variadic-macros -Wno-overlength-st rings -fno-common 
-Wno-error -DHAVE_CONFIG_H -I. -I. -I/scratch/tmp/gerald/GCC-H EAD/gcc 
-I/scratch/tmp/gerald/GCC-HEAD/gcc/. 
-I/scratch/tmp/gerald/GCC-HEAD/gcc/../include -I./../intl 
-I/scratch/tmp/gerald/GCC-HEAD/gcc/../libcpp/include 
-I/home/gerald/11-i386/include 
-I/scratch/tmp/gerald/GCC-HEAD/gcc/../libdecnumber 
-I/scratch/tmp/gerald/GCC-HEAD/gcc/../libdecnumber/dpd -I../libdecnumber 
-I/scrat ch/tmp/gerald/GCC-HEAD/gcc/../libbacktrace -I. -I. 
-I/scratch/tmp/gerald/GCC-H EAD/gcc -I/scratch/tmp/gerald/GCC-HEAD/gcc/. 
-I/scratch/tmp/gerald/GCC-HEAD/gcc/ ../include -I./../intl 
-I/scratch/tmp/gerald/GCC-HEAD/gcc/../libcpp/include 
-I/home/gerald/11-i386/include 
-I/scratch/tmp/gerald/GCC-HEAD/gcc/../libdecnumber 
-I/scratch/tmp/gerald/GCC-HEAD/gcc/../libdecnumber/dpd -I../libdecnumber 
-I/scrat ch/tmp/gerald/GCC-HEAD/gcc/../libbacktrace \ -DIN_GCOV_TOOL=1
-o libgcov-merge-tool.o 
/scratch/tmp/gerald/GCC-HEAD/gcc/../libgcc/libgcov-merge.c

Alas in tonights build (and the one with JOBS=1 I run during the day)
this now did not fail any longer. On the very same system and with the
same invocation of the build.

No I'm wondering what might have fixed this. The git lob did not show
me something obvious.

> I would recommend:
> 
> $ git show --format=fuller 871e5ada6d53d5eb495cc9f323983f347487c1b2

Ah, cool.  Great to know, thank you!

Gerald


Re: std::optional defaut constructor

2020-06-03 Thread Ville Voutilainen via Gcc-patches
On Thu, 4 Jun 2020 at 02:20, Ville Voutilainen
 wrote:
>
> On Thu, 4 Jun 2020 at 02:13, Marc Glisse  wrote:
> >
> > On Thu, 4 Jun 2020, Ville Voutilainen wrote:
> >
> > > On Thu, 4 Jun 2020 at 01:52, Marc Glisse  wrote:
> > >>
> > >> Hello,
> > >>
> > >> is there any drawback to the attached patch? It changes the code 
> > >> generated for
> > >
> > > I don't get it. The noexceptness of the defaulted default constructor
> > > should be a computation
> > > of the noexceptness of the subobjects, and that should boil down to
> > > whether optional's storage
> > > is noexcept-default-constructible. And that thing has a noexcept
> > > default constructor. Why
> > > does this patch change anything?
> >
> > "noexcept" is a red herring, what matters is defaulted vs user-provided.
> > In one case, we end up zero-initializing the whole buffer, and not in the
> > other.
>
> Yes, I just came to that conclusion. This is value-init, so the
> language manages to zero-init the whole-object,
> but with the change, it just calls a user-provided constructor.
> That'll then merely boil down to value-initializing just the _Empty
> part
> of the _Storage in _Optional_payload_base. We are in
> http://eel.is/c++draft/dcl.init#8.1.2. The change takes us
> to http://eel.is/c++draft/dcl.init#8.1.1.

Ha, and optional's default constructor isn't even specified to be defaulted.


Re: std::optional defaut constructor

2020-06-03 Thread Ville Voutilainen via Gcc-patches
On Thu, 4 Jun 2020 at 03:05, Ville Voutilainen
 wrote:

> > > "noexcept" is a red herring, what matters is defaulted vs user-provided.
> > > In one case, we end up zero-initializing the whole buffer, and not in the
> > > other.
> >
> > Yes, I just came to that conclusion. This is value-init, so the
> > language manages to zero-init the whole-object,
> > but with the change, it just calls a user-provided constructor.
> > That'll then merely boil down to value-initializing just the _Empty
> > part
> > of the _Storage in _Optional_payload_base. We are in
> > http://eel.is/c++draft/dcl.init#8.1.2. The change takes us
> > to http://eel.is/c++draft/dcl.init#8.1.1.
>
> Ha, and optional's default constructor isn't even specified to be defaulted.

So the change is correct. Can we test the change somehow?


[PATCH] tree-ssa-threadbackward.c (profitable_jump_thread_path): Do not allow __builtin_constant_p.

2020-06-03 Thread Ilya Leoshkevich via Gcc-patches
Bootstrapped and regtested on x86_64-redhat-linux, ppc64le-redhat-linux
and s390x-redhat-linux.


Linux Kernel (specifically, drivers/leds/trigger/ledtrig-cpu.c) build
with GCC 10 fails on s390 with "impossible constraint".

The problem is that jump threading makes __builtin_constant_p lie when
an expression in question appears to be a constant on a threading path.

Fix by disallowing __builtin_constant_p on threading paths.

gcc/ChangeLog:

2020-06-03  Ilya Leoshkevich  

* tree-ssa-threadbackward.c (thread_jumps::profitable_jump_thread_path):
Do not allow __builtin_constant_p on a threading path.

gcc/testsuite/ChangeLog:

2020-06-03  Ilya Leoshkevich  

* gcc.target/s390/builtin-constant-p-threading.c: New test.
---
 .../s390/builtin-constant-p-threading.c   | 46 +++
 gcc/tree-ssa-threadbackward.c |  7 ++-
 2 files changed, 52 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/builtin-constant-p-threading.c

diff --git a/gcc/testsuite/gcc.target/s390/builtin-constant-p-threading.c 
b/gcc/testsuite/gcc.target/s390/builtin-constant-p-threading.c
new file mode 100644
index 000..5f0acdce0b0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/builtin-constant-p-threading.c
@@ -0,0 +1,46 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=z196 -mzarch" } */
+
+typedef struct
+{
+  int counter;
+} atomic_t;
+
+static inline __attribute__ ((__gnu_inline__)) int
+__atomic_add (int val, int *ptr)
+{
+  int old;
+  asm volatile("laa %[old],%[val],%[ptr]\n"
+  : [old] "=d" (old), [ptr] "+Q"(*ptr)
+  : [val] "d" (val)
+  : "cc", "memory");
+  return old;
+}
+
+static inline __attribute__ ((__gnu_inline__)) void
+__atomic_add_const (int val, int *ptr)
+{
+  asm volatile("asi %[ptr],%[val]\n"
+  : [ptr] "+Q" (*ptr)
+  : [val] "i" (val)
+  : "cc", "memory");
+}
+
+static inline __attribute__ ((__gnu_inline__)) void
+atomic_add (int i, atomic_t *v)
+{
+  if (__builtin_constant_p (i) && (i > -129) && (i < 128))
+{
+  __atomic_add_const (i, &v->counter);
+  return;
+}
+  __atomic_add (i, &v->counter);
+}
+
+static atomic_t num_active_cpus = { (0) };
+
+void
+ledtrig_cpu (_Bool is_active)
+{
+  atomic_add (is_active ? 1 : -1, &num_active_cpus);
+}
diff --git a/gcc/tree-ssa-threadbackward.c b/gcc/tree-ssa-threadbackward.c
index 327628f1662..668932f6d85 100644
--- a/gcc/tree-ssa-threadbackward.c
+++ b/gcc/tree-ssa-threadbackward.c
@@ -259,8 +259,13 @@ thread_jumps::profitable_jump_thread_path (basic_block 
bbi, tree name,
   !gsi_end_p (gsi);
   gsi_next_nondebug (&gsi))
{
+ /* Do not allow OpenACC loop markers and __builtin_constant_p on
+a threading path.  The latter is disallowed, because an
+expression being constant on a threading path does not mean it
+can be considered constant in the entire function.  */
  gimple *stmt = gsi_stmt (gsi);
- if (gimple_call_internal_p (stmt, IFN_UNIQUE))
+ if (gimple_call_internal_p (stmt, IFN_UNIQUE)
+ || gimple_call_builtin_p (stmt, BUILT_IN_CONSTANT_P))
{
  m_path.pop ();
  return NULL;
-- 
2.25.4



Re: [PATCH] Fix zero-masking for vcvtps2ph when dest operand is memory.

2020-06-03 Thread Hongtao Liu via Gcc-patches
Hi Richard:
  Could you help review this patch.
  uros said he wouldn't review patches related to x86 vector ISA anymore.

On Wed, Jun 3, 2020 at 10:26 AM Hongtao Liu  wrote:
>
> Hi:
>   When dest is memory, zero-masking is not valid, only merging-masking
> is available,
>
>   Bootstrap is ok, regression test on i386/x86-64 backend is ok.
>
> gcc/ChangeLog:
> * gcc/config/i386/sse.md (*vcvtps2ph_store):
> Refine from *vcvtps2ph_store.
> (vcvtps2ph256): Refine constraint from vm to v.
> (avx512f_vcvtps2ph512): Ditto.
> (*vcvtps2ph256): New define_insn.
> (*avx512f_vcvtps2ph512): Ditto.
> * gcc/config/i386/subst.md (merge_mask): New define_subst.
> (merge_mask_name): New define_subst_attr.
> (merge_mask_operand3): Ditto.
>
> gcc/testsuite/ChangeLog:
> * gcc.target/i386/avx512f-vcvtps2ph-pr95254.c: New test.
> * gcc.target/i386/avx512vl-vcvtps2ph-pr95254.c: Ditto.
> --
> BR,
> Hongtao



-- 
BR,
Hongtao


Re: [PATCH V2 1/2] Introduce flag_cunroll_grow_size for cunroll

2020-06-03 Thread Jiufu Guo via Gcc-patches
Jiufu Guo  writes:

Hi,

Patch is updated a little according to comments.
Please see if this is ok to commit.

diff --git a/gcc/common.opt b/gcc/common.opt
index 4464049fc1f..570e2aa53c8 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2856,6 +2856,10 @@ funroll-all-loops
 Common Report Var(flag_unroll_all_loops) Optimization
 Perform loop unrolling for all loops.
 
+funroll-completely-grow-size
+Undocumented Var(flag_cunroll_grow_size) Init(2) Optimization
+; Internal undocumented flag, allow size growth during complete unrolling
+
 ; Nonzero means that loop optimizer may assume that the induction variables
 ; that control loops do not overflow and that the loops with nontrivial
 ; exit condition are not infinite
diff --git a/gcc/toplev.c b/gcc/toplev.c
index 96316fbd23b..e1010077ddb 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -1474,6 +1474,11 @@ process_options (void)
   if (flag_unroll_all_loops)
 flag_unroll_loops = 1;
 
+  /* Allow cunroll to grow size accordingly.  */
+  if (flag_cunroll_grow_size == AUTODETECT_VALUE)
+flag_cunroll_grow_size
+  = flag_unroll_loops || flag_peel_loops || optimize >= 3;
+
   /* web and rename-registers help when run after loop unrolling.  */
   if (flag_web == AUTODETECT_VALUE)
 flag_web = flag_unroll_loops;
diff --git a/gcc/tree-ssa-loop-ivcanon.c b/gcc/tree-ssa-loop-ivcanon.c
index 8ab6ab3330c..298ab215530 100644
--- a/gcc/tree-ssa-loop-ivcanon.c
+++ b/gcc/tree-ssa-loop-ivcanon.c
@@ -1603,9 +1603,8 @@ pass_complete_unroll::execute (function *fun)
  re-peeling the same loop multiple times.  */
   if (flag_peel_loops)
 peeled_loops = BITMAP_ALLOC (NULL);
-  unsigned int val = tree_unroll_loops_completely (flag_unroll_loops
-  || flag_peel_loops
-  || optimize >= 3, true);
+  unsigned int val = tree_unroll_loops_completely (flag_cunroll_grow_size, 
+  true);
   if (peeled_loops)
 {
   BITMAP_FREE (peeled_loops);

BR,
Jiufu


> Richard Biener  writes:
>
>> On Tue, Jun 2, 2020 at 4:10 AM Jiufu Guo  wrote:
>>>
>>> Jiufu Guo  writes:
>>>
>>> Hi,
>>>
>>> I updated the patch just a little accordinlgy.  Thanks!
>>>
>>> diff --git a/gcc/common.opt b/gcc/common.opt
>>> index 4464049fc1f..570e2aa53c8 100644
>>> --- a/gcc/common.opt
>>> +++ b/gcc/common.opt
>>> @@ -2856,6 +2856,10 @@ funroll-all-loops
>>>  Common Report Var(flag_unroll_all_loops) Optimization
>>>  Perform loop unrolling for all loops.
>>>
>>> +funroll-completely-grow-size
>>> +Common Undocumented Var(flag_cunroll_grow_size) Init(2) Optimization
>>> +; Internal undocumented flag, allow size growth during complete unrolling
>>> +
>>>  ; Nonzero means that loop optimizer may assume that the induction variables
>>>  ; that control loops do not overflow and that the loops with nontrivial
>>>  ; exit condition are not infinite
>>> diff --git a/gcc/toplev.c b/gcc/toplev.c
>>> index 96316fbd23b..8d52358efdd 100644
>>> --- a/gcc/toplev.c
>>> +++ b/gcc/toplev.c
>>> @@ -1474,6 +1474,10 @@ process_options (void)
>>>if (flag_unroll_all_loops)
>>>  flag_unroll_loops = 1;
>>>
>>> +  /* Allow cunroll to grow size accordingly.  */
>>> +  if (flag_cunroll_grow_size == AUTODETECT_VALUE)
>>> +flag_cunroll_grow_size = flag_unroll_loops || flag_peel_loops;
>>> +
>>
>> (*)
>>
>>>/* web and rename-registers help when run after loop unrolling.  */
>>>if (flag_web == AUTODETECT_VALUE)
>>>  flag_web = flag_unroll_loops;
>>> diff --git a/gcc/tree-ssa-loop-ivcanon.c b/gcc/tree-ssa-loop-ivcanon.c
>>> index 8ab6ab3330c..298ab215530 100644
>>> --- a/gcc/tree-ssa-loop-ivcanon.c
>>> +++ b/gcc/tree-ssa-loop-ivcanon.c
>>> @@ -1603,9 +1603,8 @@ pass_complete_unroll::execute (function *fun)
>>>   re-peeling the same loop multiple times.  */
>>>if (flag_peel_loops)
>>>  peeled_loops = BITMAP_ALLOC (NULL);
>>> -  unsigned int val = tree_unroll_loops_completely (flag_unroll_loops
>>> -  || flag_peel_loops
>>> -  || optimize >= 3, true);
>>> +  unsigned int val = tree_unroll_loops_completely (flag_cunroll_grow_size,
>>
>> With -O3 -fno-peel-loops this does not have the same effect anymore.
>> So above (*) you'd need to check || optimize >= 3 as well.
>
> Oh, yes.  Thanks for your kindly review!
>
> BR,
> Jiufu
>
>>
>>> +  true);
>>>if (peeled_loops)
>>>  {
>>>BITMAP_FREE (peeled_loops);
>>>
>>> BR,
>>> Jiufu
>>>
>>> > Richard Biener  writes:
>>> >
>>>
>>> >>> >> From: Jiufu Guo 
>>> >>> >>
>>> >>> >> Currently GIMPLE complete unroller(cunroll) is checking
>>> >>> >> flag_unroll_loops and flag_peel_loops to see if allow size growth.
>>> >>> >> Beside affects curnoll, flag_unroll_loops also controls RTL unroler.
>>> >>> >> To have more freedom to control cunroll and RTL unroller, this 

Re: [PATCH V2 2/2] rs6000: allow cunroll to grow size according to -funroll-loop or -fpeel-loops

2020-06-03 Thread Jiufu Guo via Gcc-patches
guojiufu  writes:

> From: Jiufu Guo 
>
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -4567,7 +4567,12 @@ rs6000_option_override_internal (bool global_init_p)
>   unroll_only_small_loops = 0;
> if (!global_options_set.x_flag_rename_registers)
>   flag_rename_registers = 1;
> +   if (!global_options_set.x_flag_cunroll_grow_size)
> + flag_cunroll_grow_size = 1;
>   }
> +  else
> + if (!global_options_set.x_flag_cunroll_grow_size)
> +   flag_cunroll_grow_size = flag_peel_loops;

As this patch V2 1/2, also update here accordingy:
 flag_cunroll_grow_size = flag_peel_loops || optimize >= 3;
>  
>/* If using typedef char *va_list, signal that
>__builtin_va_start (&ap, 0) can be optimized to

BR,
Jiufu


  1   2   >