Re: HELP!! How to inhibit optimizations applied to .DEFERRED_INIT argument?

2021-06-30 Thread Richard Biener
On Wed, 30 Jun 2021, Qing Zhao wrote:

> Hi, 
> 
> I am testing the 4th patch of -ftrivial-auto-var-init with CPU2017 today, and 
> found the following issues:
> 
> In the dump file of “*t.i.031t.objsz1”, we have:
> 
>  :
>   __s1_len_217 = .DEFERRED_INIT (__s1_len_176, 2);
>   __s2_len_218 = .DEFERRED_INIT (__s2_len_177, 2);

It looks like this .DEFERRED_INIT initializes an already initialized
variable.  I'd expect to only ever see default definition SSA names
as first argument to .DEFERRED_INIT.

>   __s2_len_219 = 7;
>   if (__s2_len_219 <= 3)
> goto ; [INV]
>   else
> goto ; [INV]
> 
>:
>   _1 = (long unsigned int) i_175;
>  
> 
> However, after “ccp”, in “t.i.032t.ccp1”, we have:
> 
>  :
>   __s1_len_217 = .DEFERRED_INIT (__s1_len_176, 2);
>   __s2_len_218 = .DEFERRED_INIT (7, 2);
>   _36 = (long unsigned int) i_175;
>   _37 = _36 * 8;
>   _38 = argv_220(D) + _37;
> 
> 
> It looks like the “ccp” optimization replaced the first argument of the
> call to .DEFERRED_INIT with the constant 7.
> This should be avoided.
> 
> (NOTE, this issue existed in the previous patches, however, only exposed with 
> this version since I added more verification
> code in tree-cfg.c to verify the call to .DEFERRED_INIT).
> 
> I am wondering what’s the best solution to this problem? 

I think you have to trace where this "bogus" .DEFERRED_INIT comes from
originally.  Alternatively, if this is unavoidable, add "constant
folding" of .DEFERRED_INIT so that deferred init of an initialized
object becomes the object itself, thus retaining only the previous,
possibly partial, initialization.

Richard.
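
As a rough illustration of the folding idea above, here is a toy model in
plain C++.  It is not GCC code: fold_deferred_init, init_kind and
pattern_word are hypothetical names invented for this sketch, and the real
folding would operate on GIMPLE rather than on plain integers.

```cpp
#include <cstdint>

// Toy model only: none of these names exist in GCC.
enum class init_kind { zero, pattern };

// Hypothetical fill value standing in for pattern initialization.
constexpr std::uint32_t pattern_word = 0xFEFEFEFEu;

// "Fold" a deferred initialization.  CURRENT is null when the object has
// no known value yet.  If the object is already initialized, the result
// is simply that value; otherwise the automatic initializer selected by
// KIND is produced.
inline std::uint32_t
fold_deferred_init (const std::uint32_t *current, init_kind kind)
{
  if (current != nullptr)
    return *current;
  return kind == init_kind::zero ? 0u : pattern_word;
}
```

In this model, a deferred init whose first argument is already known to be
7 folds to 7, while one with no known value yields the zero or pattern
initializer.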

> Can we add any attribute to the internal function argument to prevent later
> optimizations that might be applied to it?
> Or just update the “ccp” phase to specially handle calls to .DEFERRED_INIT?
> (Not sure whether other phases have the same issue?)
> 
> Let me know if you have any suggestion.
> 
> Thanks a lot for your help.
> 
> Qing

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


Re: GCC documentation: porting to Sphinx

2021-06-30 Thread Martin Liška

On 6/29/21 5:54 PM, Arnaud Charlet wrote:

> In particular can you explain the motivation behind all the changes in the
> gcc/ada/doc directory?


Sure:
1) All Sphinx manuals live in a directory where index page is called
   index.rst. That's why I moved e.g. this:
   gcc/ada/doc/{gnat_rm.rst => gnat_rm/index.rst}
2) I moved latex_elements.py to ada_latex_elements.py as it clashes with
   Sphinx global variable you modify in Sphinx config files
3) I created a shared Ada config (adabaseconf.py) that extends
   doc/baseconf.py and sets what is shared in between 3 Ada manuals.
4) gnu_free_documentation_license.rst is taken from $root/doc/


> OK, this is really lots of changes, if we could minimize these changes
> that would be best (and sorry but posting a link to a tarball also doesn't
> help reviews, it was actually better with a link to a git repo previously...


All right, ideally please pull my branch:
https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;a=log;h=refs/users/marxin/heads/sphinx-v4

which I force push once I rebase it. One can fetch the branch with:
$ git fetch origin refs/users/marxin/heads/sphinx-v4


> At least the Ada part itself shouldn't be too big in particular once
> simplified so could be posted standalone).


Sorry, but the patch is still 400 kB when using zstd -22. Actually, the
change is very small if you ignore renames of the 3 files:

 gcc/ada/gnat-style.texi|   954
 gcc/ada/gnat_rm.texi   | 29822
 gcc/ada/gnat_ugn.texi  | 29232

The only significant change is refactoring of the conf.py config.

Martin



> Arno





RE: [ARM] PR66791: Replace calls to builtin in vmul_n (a, b) intrinsics with __a * __b

2021-06-30 Thread Kyrylo Tkachov via Gcc-patches
Hi Prathamesh,

> -Original Message-
> From: Prathamesh Kulkarni 
> Sent: 29 June 2021 08:22
> To: gcc Patches 
> Cc: Kyrylo Tkachov 
> Subject: Re: [ARM] PR66791: Replace calls to builtin in vmul_n (a, b) 
> intrinsics
> with __a * __b
> 
> On Mon, 21 Jun 2021 at 14:04, Prathamesh Kulkarni
>  wrote:
> >
> > On Mon, 14 Jun 2021 at 13:27, Prathamesh Kulkarni
> >  wrote:
> > >
> > > On Mon, 7 Jun 2021 at 12:45, Prathamesh Kulkarni
> > >  wrote:
> > > >
> > > > On Mon, 31 May 2021 at 16:01, Prathamesh Kulkarni
> > > >  wrote:
> > > > >
> > > > > On Mon, 31 May 2021 at 15:22, Prathamesh Kulkarni
> > > > >  wrote:
> > > > > >
> > > > > > On Wed, 26 May 2021 at 14:07, Marc Glisse 
> wrote:
> > > > > > >
> > > > > > > On Wed, 26 May 2021, Prathamesh Kulkarni via Gcc-patches
> wrote:
> > > > > > >
> > > > > > > > The attached patch removes calls to builtins in vmul_n* (a, b)
> with __a * __b.
> > > > > > >
> > > > > > > I am not familiar with neon, but are __a and __b unsigned here?
> Otherwise,
> > > > > > > is vmul_n already undefined in case of overflow?
> > > > > > Hi Marc,
> > > > > > Sorry for late reply, for vmul_n_s*, I think they are signed
> > > > > > (intx_t).
> > > > > Oops, I meant intx_t.
> > > > > > I am not sure how should the intrinsic behave in case of signed
> overflow,
> > > > > > but I am assuming it's OK since vmul_s* intrinsics leave it 
> > > > > > undefined
> too.
> > > > > > Kyrill, is it OK to leave vmul_s* and vmul_n_s* undefined in case of
> overflow ?
> > > > The attached version fixes one fallout I missed earlier.
> > > > Is this OK to commit ?
> > > ping https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572037.html
> > ping * 2 https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572037.html
> ping * 3 https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572037.html

I'm a bit wary of leaving them undefined for signed overflow. I see that
aarch64 leaves them undefined too, so maybe it's not such a big deal, but
I'd like to consider that separately.
Can you please split this into the unsigned and floating-point parts,
followed by the signed intrinsics? The unsigned and FP parts are okay, let's
review the signed intrinsics separately.

Thanks,
Kyrill
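
The signed-overflow concern can be seen on a single 32-bit lane.  The
sketch below is a scalar illustration only, not code from arm_neon.h:
mul_wrap_s32 is a hypothetical helper, and whether the intrinsics should
get wrap-around semantics is exactly what the thread leaves open.  Writing
`a * b` directly on int32_t operands is undefined on overflow in C and
C++, whereas the unsigned multiplication below wraps modulo 2^32.

```cpp
#include <cstdint>

// Multiply one signed 32-bit lane with wrap-around semantics.  The
// multiplication is carried out on unsigned operands, where overflow is
// well defined, and the result is converted back to the signed type.
// That final conversion is implementation-defined before C++20; GCC
// reduces the value modulo 2^32.
inline std::int32_t
mul_wrap_s32 (std::int32_t a, std::int32_t b)
{
  std::uint32_t prod
    = static_cast<std::uint32_t> (a) * static_cast<std::uint32_t> (b);
  return static_cast<std::int32_t> (prod);
}
```

An unsigned lane needs no such care, which is one way to read the request
to handle the unsigned and floating-point intrinsics first.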

> 
> Thanks,
> Prathamesh
> >
> > Thanks,
> > Prathamesh
> > >
> > > Thanks,
> > > Prathamesh
> > > >
> > > > Thanks,
> > > > Prathamesh
> > > > > >
> > > > > > Thanks,
> > > > > > Prathamesh
> > > > > > >
> > > > > > > --
> > > > > > > Marc Glisse


Re: [PATCH 0/4] openacc: Async fixes

2021-06-30 Thread Thomas Schwinge
Hi Julian!

On 2021-06-29T16:42:00-0700, Julian Brown  wrote:
> This patch series contains fixes for various problems with async support
> for OpenACC at present:

Thanks, I shall be looking into these in detail, "soonish".

Some quick comments.


>  - Asynchronous host-to-device copies invoked from within libgomp
>(target.c) could copy bad data to the target -- and the workaround
>for that currently used in the AMD GCN target plugin could lead to
>a different problem (a race condition).

As per discussion on Monday, I like that you moved (back?) the "ephemeral"
handling into 'libgomp/target.c:gomp_copy_host2dev'.


>  - The OpenACC profiling-interface implementation did not measure
>asynchronous operations properly.

We'll need to be careful: (possibly, an older version of) that one we
internally had identified to be causing some issues; see the
"acc_prof-parallel-1.c intermittent failure on og10 branch" emails,
2020-07.


>  - Several test cases misuse OpenACC asynchronous support (more race
>conditions).

Mostly ACK, but some more changes may be necessary; please see
<http://mid.mail-archive.com/87sg1s9s9l.fsf@euler.schwinge.homeip.net>
(you were CCed).

>  .../libgomp.oacc-c-c++-common/deep-copy-10.c  |  14 +-

Please provide some detail about that one ("Fix async behaviour").  It's
not obvious to me what's wrong with the current version (but I haven't
really spent time on that yet).


Grüße
 Thomas
-
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München 
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank 
Thürauf


RE: [ARM] PR66791: Gate comparison in vca intrinsics on __FAST_MATH__

2021-06-30 Thread Kyrylo Tkachov via Gcc-patches


> -Original Message-
> From: Prathamesh Kulkarni 
> Sent: 29 June 2021 08:21
> To: gcc Patches ; Kyrylo Tkachov
> 
> Subject: Re: [ARM] PR66791: Gate comparison in vca intrinsics on
> __FAST_MATH__
> 
> On Tue, 22 Jun 2021 at 15:04, Prathamesh Kulkarni
>  wrote:
> >
> > Hi,
> > The attached patch gates abs(__a) cmp abs(__b) for vca intrinsics on
> > __FAST_MATH__. I moved vabs intrinsics before vcage_f32 since vca
> > intrinsics use those.
> > Bootstrapped+tested on arm-linux-gnueabihf.
> > OK to commit ?
> ping https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573384.html

Hmm, does this result in better optimisation? I guess it's expressing the
operation at a higher level, but there are now conceptually three operations
(2xvabs + 1 comparison) that would need to be folded away by the optimisers...

Thanks,
Kyrill
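
For reference, the shape being discussed looks roughly like this on a
single lane.  This is a scalar model under stated assumptions, not the
arm_neon.h implementation: cage_lane and cage_lane_builtin are hypothetical
names, and the "builtin" is only a stand-in so that the sketch is
self-contained.  The open-coded branch shows the three conceptual
operations mentioned above (two absolute values and one comparison).

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical stand-in for the target builtin that performs the
// absolute compare in one step.
inline std::uint32_t
cage_lane_builtin (float a, float b)
{
  return std::fabs (a) >= std::fabs (b) ? 0xFFFFFFFFu : 0u;
}

// One lane of an "absolute compare greater than or equal": all bits set
// when |a| >= |b|, zero otherwise.  The open-coded form is used only
// when __FAST_MATH__ is defined, mirroring the gating described in the
// patch submission.
inline std::uint32_t
cage_lane (float a, float b)
{
#ifdef __FAST_MATH__
  float abs_a = std::fabs (a);   // operation 1
  float abs_b = std::fabs (b);   // operation 2
  return abs_a >= abs_b ? 0xFFFFFFFFu : 0u;   // operation 3
#else
  return cage_lane_builtin (a, b);
#endif
}
```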
> 
> Thanks,
> Prathamesh
> >
> > Thanks,
> > Prathamesh


Re: [PATCH] define auto_vec copy ctor and assignment (PR 90904)

2021-06-30 Thread Richard Biener via Gcc-patches
On Tue, Jun 29, 2021 at 7:18 PM Martin Sebor  wrote:
>
> On 6/29/21 8:43 AM, Jason Merrill wrote:
> > On 6/28/21 2:07 PM, Martin Sebor wrote:
> >> On 6/28/21 2:07 AM, Richard Biener wrote:
> >>> On Sat, Jun 26, 2021 at 12:36 AM Martin Sebor  wrote:
> 
>  On 6/25/21 4:11 PM, Jason Merrill wrote:
> > On 6/25/21 4:51 PM, Martin Sebor wrote:
> >> On 6/1/21 3:38 PM, Jason Merrill wrote:
> >>> On 6/1/21 3:56 PM, Martin Sebor wrote:
>  On 5/27/21 2:53 PM, Jason Merrill wrote:
> > On 4/27/21 11:52 AM, Martin Sebor via Gcc-patches wrote:
> >> On 4/27/21 8:04 AM, Richard Biener wrote:
> >>> On Tue, Apr 27, 2021 at 3:59 PM Martin Sebor 
> >>> wrote:
> 
>  On 4/27/21 1:58 AM, Richard Biener wrote:
> > On Tue, Apr 27, 2021 at 2:46 AM Martin Sebor via Gcc-patches
> >  wrote:
> >>
> >> PR 90904 notes that auto_vec is unsafe to copy and assign
> >> because
> >> the class manages its own memory but doesn't define (or
> >> delete)
> >> either special function.  Since I first ran into the problem,
> >> auto_vec has grown a move ctor and move assignment from
> >> a dynamically-allocated vec but still no copy ctor or copy
> >> assignment operator.
> >>
> >> The attached patch adds the two special functions to auto_vec
> >> along
> >> with a few simple tests.  It makes auto_vec safe to use in
> >> containers
> >> that expect copyable and assignable element types and passes
> >> bootstrap
> >> and regression testing on x86_64-linux.
> >
> > The question is whether we want such uses to appear since
> > those
> > can be quite inefficient?  Thus the option is to delete those
> > operators?
> 
>  I would strongly prefer the generic vector class to have the
>  properties
>  expected of any other generic container: copyable and
>  assignable.  If
>  we also want another vector type with this restriction I
>  suggest
>  to add
>  another "noncopyable" type and make that property explicit in
>  its name.
>  I can submit one in a followup patch if you think we need one.
> >>>
> >>> I'm not sure (and not strictly against the copy and assign).
> >>> Looking around
> >>> I see that vec<> does not do deep copying.  Making auto_vec<>
> >>> do it
> >>> might be surprising (I added the move capability to match how
> >>> vec<>
> >>> is used - as "reference" to a vector)
> >>
> >> The vec base classes are special: they have no ctors at all
> >> (because
> >> of their use in unions).  That's something we might have to
> >> live with
> >> but it's not a model to follow in ordinary containers.
> >
> > I don't think we have to live with it anymore, now that we're
> > writing C++11.
> >
> >> The auto_vec class was introduced to fill the need for a
> >> conventional
> >> sequence container with a ctor and dtor.  The missing copy
> >> ctor and
> >> assignment operators were an oversight, not a deliberate feature.
> >> This change fixes that oversight.
> >>
> >> The revised patch also adds a copy ctor/assignment to the
> >> auto_vec
> >> primary template (that's also missing it).  In addition, it adds
> >> a new class called auto_vec_ncopy that disables copying and
> >> assignment as you prefer.
> >
> > Hmm, adding another class doesn't really help with the confusion
> > richi mentions.  And many uses of auto_vec will pass them as vec,
> > which will still do a shallow copy.  I think it's probably better
> > to disable the copy special members for auto_vec until we fix
> > vec<>.
> 
>  There are at least a couple of problems that get in the way of
>  fixing
>  all of vec to act like a well-behaved C++ container:
> 
>  1) The embedded vec has a trailing "flexible" array member with its
>  instances having different size.  They're initialized by memset and
>  copied by memcpy.  The class can't have copy ctors or assignments
>  but it should disable/delete them instead.
> 
>  2) The heap-based vec is used throughout GCC with the assumption of
>  shallow copy semantics (not just as function arguments but also as
>  members of other such POD classes).  This can be changed by
>  providing
>  copy and move ctors and assignment operators for it, and also for
> >>>

[RFC PATCH] Change the type of predicates to bool.

2021-06-30 Thread Uros Bizjak via Gcc-patches
This RFC patch changes the type of predicates to bool. However, some
of the targets (e.g. x86) use indirect functions to call the
predicates, so without the local change, the build fails. Putting the
patch through CI bots should weed out the problems, but I have no
infrastructure to do it myself.
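
The build failure with indirect calls comes down to function pointer
types: a pointer declared with an int return type cannot be initialized
from a function returning bool.  A minimal sketch follows, using
hypothetical stand-ins (toy_operand, toy_mode and the toy_* predicates)
rather than the real rtx, machine_mode and recog.h declarations.

```cpp
// Hypothetical stand-ins for rtx and machine_mode; not the GCC types.
struct toy_operand { bool is_reg; bool is_mem; };
enum toy_mode { TOY_SCALAR, TOY_VECTOR };

// Two predicates that now return bool.
inline bool
toy_vector_operand (const toy_operand &op, toy_mode)
{
  return op.is_reg;
}

inline bool
toy_nonimmediate_operand (const toy_operand &op, toy_mode)
{
  return op.is_reg || op.is_mem;
}

// The pointer type has to follow the predicates.  Had this typedef kept
// an int return type, select_predicate below would no longer compile
// once the predicates return bool.
typedef bool (*toy_predicate_fn) (const toy_operand &, toy_mode);

inline toy_predicate_fn
select_predicate (toy_mode mode)
{
  return mode == TOY_VECTOR ? toy_vector_operand : toy_nonimmediate_operand;
}
```

Any target that stores predicates in such pointers would presumably need
the same one-line adjustment as the i386 change in this patch.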

2021-06-30  Uroš Bizjak  

gcc/
* genpreds.c (write_predicate_subfunction):
Change the type of written subfunction to bool.
(write_one_predicate_function):
Change the type of written function to bool.
(write_tm_preds_h): Ditto.
* recog.h (*insn_operand_predicate_fn): Change the type to bool.
* recog.c (general_operand): Change the type to bool.
(address_operand): Ditto.
(register_operand): Ditto.
(pmode_register_operand): Ditto.
(scratch_operand): Ditto.
(immediate_operand): Ditto.
(const_int_operand): Ditto.
(const_scalar_int_operand): Ditto.
(const_double_operand): Ditto.
(nonimmediate_operand): Ditto.
(nonmemory_operand): Ditto.
(push_operand): Ditto.
(pop_operand): Ditto.
(memory_operand): Ditto.
(indirect_operand): Ditto.
(ordered_comparison_operator): Ditto.
(comparison_operator): Ditto.

* config/i386/i386-expand.c (ix86_expand_sse_cmp):
Change the type of indirect predicate function to bool.

Patch was bootstrapped on x86_64-linux-gnu.

Comments welcome.

Uros.
diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index e9763eb5b3e..76d6afd6d9d 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -3571,7 +3571,7 @@ ix86_expand_sse_cmp (rtx dest, enum rtx_code code, rtx cmp_op0, rtx cmp_op1,
 
   cmp_op0 = force_reg (cmp_ops_mode, cmp_op0);
 
-  int (*op1_predicate)(rtx, machine_mode)
+  bool (*op1_predicate)(rtx, machine_mode)
 = VECTOR_MODE_P (cmp_ops_mode) ? vector_operand : nonimmediate_operand;
 
   if (!op1_predicate (cmp_op1, cmp_ops_mode))
diff --git a/gcc/genpreds.c b/gcc/genpreds.c
index 63fac0c7d34..9d9715f3d2f 100644
--- a/gcc/genpreds.c
+++ b/gcc/genpreds.c
@@ -110,7 +110,7 @@ process_define_predicate (md_rtx_info *info)
 
becomes
 
-   static inline int basereg_operand_1(rtx op, machine_mode mode)
+   static inline bool basereg_operand_1(rtx op, machine_mode mode)
{
  if (GET_CODE (op) == SUBREG)
op = SUBREG_REG (op);
@@ -151,7 +151,7 @@ write_predicate_subfunction (struct pred_data *p)
 
   p->exp = and_exp;
 
-  printf ("static inline int\n"
+  printf ("static inline bool\n"
  "%s_1 (rtx op ATTRIBUTE_UNUSED, machine_mode mode 
ATTRIBUTE_UNUSED)\n",
  p->name);
   rtx_reader_ptr->print_md_ptr_loc (p->c_block);
@@ -651,7 +651,7 @@ write_one_predicate_function (struct pred_data *p)
 
   /* A normal predicate can legitimately not look at machine_mode
  if it accepts only CONST_INTs and/or CONST_WIDE_INT and/or CONST_DOUBLEs.  */
-  printf ("int\n%s (rtx op, machine_mode mode ATTRIBUTE_UNUSED)\n{\n",
+  printf ("bool\n%s (rtx op, machine_mode mode ATTRIBUTE_UNUSED)\n{\n",
  p->name);
   write_predicate_stmts (p->exp);
   fputs ("}\n\n", stdout);
@@ -1416,7 +1416,7 @@ write_tm_preds_h (void)
 #ifdef HAVE_MACHINE_MODES");
 
   FOR_ALL_PREDICATES (p)
-printf ("extern int %s (rtx, machine_mode);\n", p->name);
+printf ("extern bool %s (rtx, machine_mode);\n", p->name);
 
   puts ("#endif /* HAVE_MACHINE_MODES */\n");
 
diff --git a/gcc/recog.c b/gcc/recog.c
index eb617f11163..9d880433e6f 100644
--- a/gcc/recog.c
+++ b/gcc/recog.c
@@ -1393,7 +1393,7 @@ valid_insn_p (rtx_insn *insn)
   return true;
 }
 
-/* Return 1 if OP is a valid general operand for machine mode MODE.
+/* Return true if OP is a valid general operand for machine mode MODE.
This is either a register reference, a memory reference,
or a constant.  In the case of a memory reference, the address
is checked for general validity for the target machine.
@@ -1407,7 +1407,7 @@ valid_insn_p (rtx_insn *insn)
The main use of this function is as a predicate in match_operand
expressions in the machine description.  */
 
-int
+bool
 general_operand (rtx op, machine_mode mode)
 {
   enum rtx_code code = GET_CODE (op);
@@ -1515,13 +1515,13 @@ general_operand (rtx op, machine_mode mode)
   return 0;
 }
 
-/* Return 1 if OP is a valid memory address for a memory reference
+/* Return true if OP is a valid memory address for a memory reference
of mode MODE.
 
The main use of this function is as a predicate in match_operand
expressions in the machine description.  */
 
-int
+bool
 address_operand (rtx op, machine_mode mode)
 {
   /* Wrong mode for an address expr.  */
@@ -1532,13 +1532,13 @@ address_operand (rtx op, machine_mode mode)
   return memory_address_p (mode, op);
 }
 
-/* Return 1 if OP is a register reference of mode MODE.
+/* Return true if OP is a register reference of mode MODE.
If MODE is VOIDmode, accept a register in any mode.
 
The main use of this function is as a predi

Re: [RFC/PATCH v3] ira: Support more matching constraint forms with param [PR100328]

2021-06-30 Thread Hongtao Liu via Gcc-patches
On Mon, Jun 28, 2021 at 3:27 PM Kewen.Lin  wrote:
>
> on 2021/6/28 下午3:20, Hongtao Liu wrote:
> > On Mon, Jun 28, 2021 at 3:12 PM Hongtao Liu  wrote:
> >>
> >> On Mon, Jun 28, 2021 at 2:50 PM Kewen.Lin  wrote:
> >>>
> >>> Hi!
> >>>
> >>> on 2021/6/9 下午1:18, Kewen.Lin via Gcc-patches wrote:
>  Hi,
> 
>  PR100328 has some details about this issue, I am trying to
>  brief it here.  In the hottest function LBM_performStreamCollideTRT
>  of SPEC2017 bmk 519.lbm_r, there are many FMA style expressions
>  (27 FMA, 19 FMS, 11 FNMA).  On rs6000, this kind of FMA style
>  insn has two flavors: FLOAT_REG and VSX_REG, the VSX_REG reg
>  class have 64 registers whose foregoing 32 ones make up the
>  whole FLOAT_REG.  There are some differences for these two
>  flavors, taking "*fma4_fpr" as example:
> 
>  (define_insn "*fma4_fpr"
>    [(set (match_operand:SFDF 0 "gpc_reg_operand" "=,wa,wa")
>    (fma:SFDF
>  (match_operand:SFDF 1 "gpc_reg_operand" "%,wa,wa")
>  (match_operand:SFDF 2 "gpc_reg_operand" ",wa,0")
>  (match_operand:SFDF 3 "gpc_reg_operand" ",0,wa")))]
> 
>  // wa => A VSX register (VSR), vs0…vs63, aka. VSX_REG.
>  //  (f/d) => A floating point register, aka. FLOAT_REG.
> 
>  So for VSX_REG, we only have the destructive form, when VSX_REG
>  alternative being used, the operand 2 or operand 3 is required
>  to be the same as operand 0.  reload has to take care of this
>  constraint and create some non-free register copies if required.
> 
>  Assuming one fma insn looks like:
>    op0 = FMA (op1, op2, op3)
> 
>  The best regclass of them are VSX_REG, when op1,op2,op3 are all dead,
>  IRA simply creates three shuffle copies for them (here the operand
>  order matters, since with the same freq, the one with smaller number
>  takes preference), but IMO both op2 and op3 should take higher priority
>  in copy queue due to the matching constraint.
> 
>  I noticed that there is one function ira_get_dup_out_num, which meant
>  to create this kind of constraint copy, but the below code looks to
>  refuse to create if there is an alternative which has valid regclass
>  without spilled need.
> 
>    default:
>    {
>  enum constraint_num cn = lookup_constraint (str);
>  enum reg_class cl = reg_class_for_constraint (cn);
>  if (cl != NO_REGS
>  && !targetm.class_likely_spilled_p (cl))
>    goto fail
> 
> ...
> 
>  I cooked one patch attached to make ira respect this kind of matching
>  constraint guarded with one parameter.  As I stated in the PR, I was
>  not sure this is on the right track.  The RFC patch is to check the
>  matching constraint in all alternatives, if there is one alternative
>  with matching constraint and matches the current preferred regclass
>  (or best of allocno?), it will record the output operand number and
>  further create one constraint copy for it.  Normally it can get the
>  priority against shuffle copies and the matching constraint will get
>  satisfied with higher possibility, reload doesn't create extra copies
>  to meet the matching constraint or the desirable register class when
>  it has to.
> 
>  For FMA A,B,C,D, I think ideally copies A/B, A/C, A/D can firstly stay
>  as shuffle copies, and later any of A,B,C,D gets assigned by one
>  hardware register which is a VSX register (VSX_REG) but not a FP
>  register (FLOAT_REG), which means it has to pay costs once we can NOT
>  go with VSX alternatives, so at that time it's important to respect
>  the matching constraint then we can increase the freq for the remaining
>  copies related to this (A/B, A/C, A/D).  This idea requires some side
>  tables to record some information and seems a bit complicated in the
>  current framework, so the proposed patch aggressively emphasizes the
>  matching constraint at the time of creating copies.
> 
> >>>
> >>> Comparing with the original patch (v1), this patch v3 has
> >>> considered: (this should be v2 for this mail list, but bump
> >>> it to be consistent as PR's).
> >>>
> >>>   - Excluding the case where for one preferred register class
> >>> there can be two or more alternatives, one of them has the
> >>> matching constraint, while another doesn't have.  So for
> >>> the given operand, even if it's assigned by a hardware reg
> >>> which doesn't meet the matching constraint, it can simply
> >>> use the alternative which doesn't have matching constraint
> >>> so no register move is needed.  One typical case is
> >>> define_insn *mov_internal2 on rs6000.  So we
> >>> shouldn't create constraint copy for it.
> >>>
> >>>   - The possible free register move in the same register class,
> >>> di

Re: [PATCH] define auto_vec copy ctor and assignment (PR 90904)

2021-06-30 Thread Richard Biener via Gcc-patches
On Wed, Jun 30, 2021 at 3:46 AM Martin Sebor  wrote:
>
> On 6/29/21 4:58 AM, Richard Biener wrote:
> > On Mon, Jun 28, 2021 at 8:07 PM Martin Sebor  wrote:
> >>
> >> On 6/28/21 2:07 AM, Richard Biener wrote:
> >>> On Sat, Jun 26, 2021 at 12:36 AM Martin Sebor  wrote:
> 
>  On 6/25/21 4:11 PM, Jason Merrill wrote:
> > On 6/25/21 4:51 PM, Martin Sebor wrote:
> >> On 6/1/21 3:38 PM, Jason Merrill wrote:
> >>> On 6/1/21 3:56 PM, Martin Sebor wrote:
>  On 5/27/21 2:53 PM, Jason Merrill wrote:
> > On 4/27/21 11:52 AM, Martin Sebor via Gcc-patches wrote:
> >> On 4/27/21 8:04 AM, Richard Biener wrote:
> >>> On Tue, Apr 27, 2021 at 3:59 PM Martin Sebor 
> >>> wrote:
> 
>  On 4/27/21 1:58 AM, Richard Biener wrote:
> > On Tue, Apr 27, 2021 at 2:46 AM Martin Sebor via Gcc-patches
> >  wrote:
> >>
> >> PR 90904 notes that auto_vec is unsafe to copy and assign 
> >> because
> >> the class manages its own memory but doesn't define (or delete)
> >> either special function.  Since I first ran into the problem,
> >> auto_vec has grown a move ctor and move assignment from
> >> a dynamically-allocated vec but still no copy ctor or copy
> >> assignment operator.
> >>
> >> The attached patch adds the two special functions to auto_vec
> >> along
> >> with a few simple tests.  It makes auto_vec safe to use in
> >> containers
> >> that expect copyable and assignable element types and passes
> >> bootstrap
> >> and regression testing on x86_64-linux.
> >
> > The question is whether we want such uses to appear since those
> > can be quite inefficient?  Thus the option is to delete those
> > operators?
> 
>  I would strongly prefer the generic vector class to have the
>  properties
>  expected of any other generic container: copyable and
>  assignable.  If
>  we also want another vector type with this restriction I suggest
>  to add
>  another "noncopyable" type and make that property explicit in
>  its name.
>  I can submit one in a followup patch if you think we need one.
> >>>
> >>> I'm not sure (and not strictly against the copy and assign).
> >>> Looking around
> >>> I see that vec<> does not do deep copying.  Making auto_vec<> do 
> >>> it
> >>> might be surprising (I added the move capability to match how 
> >>> vec<>
> >>> is used - as "reference" to a vector)
> >>
> >> The vec base classes are special: they have no ctors at all 
> >> (because
> >> of their use in unions).  That's something we might have to live 
> >> with
> >> but it's not a model to follow in ordinary containers.
> >
> > I don't think we have to live with it anymore, now that we're
> > writing C++11.
> >
> >> The auto_vec class was introduced to fill the need for a 
> >> conventional
> >> sequence container with a ctor and dtor.  The missing copy ctor and
> >> assignment operators were an oversight, not a deliberate feature.
> >> This change fixes that oversight.
> >>
> >> The revised patch also adds a copy ctor/assignment to the auto_vec
> >> primary template (that's also missing it).  In addition, it adds
> >> a new class called auto_vec_ncopy that disables copying and
> >> assignment as you prefer.
> >
> > Hmm, adding another class doesn't really help with the confusion
> > richi mentions.  And many uses of auto_vec will pass them as vec,
> > which will still do a shallow copy.  I think it's probably better
> > to disable the copy special members for auto_vec until we fix vec<>.
> 
>  There are at least a couple of problems that get in the way of fixing
>  all of vec to act like a well-behaved C++ container:
> 
>  1) The embedded vec has a trailing "flexible" array member with its
>  instances having different size.  They're initialized by memset and
>  copied by memcpy.  The class can't have copy ctors or assignments
>  but it should disable/delete them instead.
> 
>  2) The heap-based vec is used throughout GCC with the assumption of
>  shallow copy semantics (not just as function arguments but also as
>  members of other such POD classes).  This can be changed by providing
>  copy and move ctors and assignment operators for it, and also for
>  some of the classes in which it's a member and that are used with
>  the 

[PATCH] c-family: Add more predefined macros for math flags

2021-06-30 Thread Matthias Kretz

Library code, especially in headers, sometimes needs to know how the
compiler interprets / optimizes floating-point types and operations.
This information can be used for additional optimizations or for
ensuring correctness. This change makes -freciprocal-math,
-fno-signed-zeros, -fno-trapping-math, -fassociative-math, and
-frounding-math report their state via corresponding pre-defined macros.
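
A sketch of how header code might consume these macros is shown below.
The macro names are the ones proposed in this patch; whether a given
compiler defines them depends on the patch being applied.  The helper
names scale_by and signed_zeros_honored are hypothetical.

```cpp
// Divide X by D.  The reciprocal form is chosen only when the compiler
// reports, via the proposed __RECIPROCAL_MATH__ macro, that replacing a
// division with a multiplication by the reciprocal is acceptable.
inline double
scale_by (double x, double d)
{
#ifdef __RECIPROCAL_MATH__
  return x * (1.0 / d);
#else
  return x / d;
#endif
}

// Report whether the sign of zero is significant in this translation
// unit, based on the proposed __NO_SIGNED_ZEROS__ macro.
inline bool
signed_zeros_honored ()
{
#ifdef __NO_SIGNED_ZEROS__
  return false;
#else
  return true;
#endif
}
```

A library could use signed_zeros_honored () to decide whether a fast path
that loses the sign of zero is acceptable.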

Signed-off-by: Matthias Kretz 

gcc/testsuite/ChangeLog:

* gcc.dg/associative-math-1.c: New test.
* gcc.dg/associative-math-2.c: New test.
* gcc.dg/no-signed-zeros-1.c: New test.
* gcc.dg/no-signed-zeros-2.c: New test.
* gcc.dg/no-trapping-math-1.c: New test.
* gcc.dg/no-trapping-math-2.c: New test.
* gcc.dg/reciprocal-math-1.c: New test.
* gcc.dg/reciprocal-math-2.c: New test.
* gcc.dg/rounding-math-1.c: New test.
* gcc.dg/rounding-math-2.c: New test.

gcc/c-family/ChangeLog:

* c-cppbuiltin.c (c_cpp_builtins_optimize_pragma): Define or
undefine __RECIPROCAL_MATH__, __NO_SIGNED_ZEROS__,
__NO_TRAPPING_MATH__, __ASSOCIATIVE_MATH__, and
__ROUNDING_MATH__ according to the new optimization flags.

gcc/ChangeLog:

* cppbuiltin.c (define_builtin_macros_for_compilation_flags):
Define __RECIPROCAL_MATH__, __NO_SIGNED_ZEROS__,
__NO_TRAPPING_MATH__, __ASSOCIATIVE_MATH__, and
__ROUNDING_MATH__ according to their corresponding flags.
* doc/cpp.texi: Document __RECIPROCAL_MATH__,
__NO_SIGNED_ZEROS__, __NO_TRAPPING_MATH__, __ASSOCIATIVE_MATH__,
and __ROUNDING_MATH__.
---
 gcc/c-family/c-cppbuiltin.c   | 25 +++
 gcc/cppbuiltin.c  | 10 +
 gcc/doc/cpp.texi  | 18 
 gcc/testsuite/gcc.dg/associative-math-1.c | 17 +++
 gcc/testsuite/gcc.dg/associative-math-2.c | 17 +++
 gcc/testsuite/gcc.dg/no-signed-zeros-1.c  | 17 +++
 gcc/testsuite/gcc.dg/no-signed-zeros-2.c  | 17 +++
 gcc/testsuite/gcc.dg/no-trapping-math-1.c | 17 +++
 gcc/testsuite/gcc.dg/no-trapping-math-2.c | 17 +++
 gcc/testsuite/gcc.dg/reciprocal-math-1.c  | 17 +++
 gcc/testsuite/gcc.dg/reciprocal-math-2.c  | 17 +++
 gcc/testsuite/gcc.dg/rounding-math-1.c| 17 +++
 gcc/testsuite/gcc.dg/rounding-math-2.c| 17 +++
 13 files changed, 223 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/associative-math-1.c
 create mode 100644 gcc/testsuite/gcc.dg/associative-math-2.c
 create mode 100644 gcc/testsuite/gcc.dg/no-signed-zeros-1.c
 create mode 100644 gcc/testsuite/gcc.dg/no-signed-zeros-2.c
 create mode 100644 gcc/testsuite/gcc.dg/no-trapping-math-1.c
 create mode 100644 gcc/testsuite/gcc.dg/no-trapping-math-2.c
 create mode 100644 gcc/testsuite/gcc.dg/reciprocal-math-1.c
 create mode 100644 gcc/testsuite/gcc.dg/reciprocal-math-2.c
 create mode 100644 gcc/testsuite/gcc.dg/rounding-math-1.c
 create mode 100644 gcc/testsuite/gcc.dg/rounding-math-2.c


--
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──
diff --git a/gcc/c-family/c-cppbuiltin.c b/gcc/c-family/c-cppbuiltin.c
index f79f939bd10..671af04b1f8 100644
--- a/gcc/c-family/c-cppbuiltin.c
+++ b/gcc/c-family/c-cppbuiltin.c
@@ -628,6 +628,31 @@ c_cpp_builtins_optimize_pragma (cpp_reader *pfile, tree prev_tree,
   cpp_undef (pfile, "__FINITE_MATH_ONLY__");
   cpp_define_unused (pfile, "__FINITE_MATH_ONLY__=0");
 }
+
+  if (!prev->x_flag_reciprocal_math && cur->x_flag_reciprocal_math)
+cpp_define_unused (pfile, "__RECIPROCAL_MATH__");
+  else if (prev->x_flag_reciprocal_math && !cur->x_flag_reciprocal_math)
+cpp_undef (pfile, "__RECIPROCAL_MATH__");
+
+  if (!prev->x_flag_signed_zeros && cur->x_flag_signed_zeros)
+cpp_undef (pfile, "__NO_SIGNED_ZEROS__");
+  else if (prev->x_flag_signed_zeros && !cur->x_flag_signed_zeros)
+cpp_define_unused (pfile, "__NO_SIGNED_ZEROS__");
+
+  if (!prev->x_flag_trapping_math && cur->x_flag_trapping_math)
+cpp_undef (pfile, "__NO_TRAPPING_MATH__");
+  else if (prev->x_flag_trapping_math && !cur->x_flag_trapping_math)
+cpp_define_unused (pfile, "__NO_TRAPPING_MATH__");
+
+  if (!prev->x_flag_associative_math && cur->x_flag_associative_math)
+cpp_define_unused (pfile, "__ASSOCIATIVE_MATH__");
+  else if (prev->x_flag_associative_math && !cur->x_flag_associative_math)
+cpp_undef (pfile, "__ASSOCIATIVE_MATH__");
+
+  if (!prev->x_flag_rounding_math && cur->x_flag_rounding_math)
+cpp_define_unused (pfile, "__ROUNDING_MATH__");
+  else if (prev->x_

Re: [PATCH] define auto_vec copy ctor and assignment (PR 90904)

2021-06-30 Thread Richard Sandiford via Gcc-patches
Richard Biener via Gcc-patches  writes:
> Note there's also array_slice<> which could be used to pass non-const
> vec<>s that are never resized but modified - the only "valid" case of
> passing a non-const vec<> by value.

Yeah.  We'd need a new constructor for that (the current one only
takes const vec<>&) but I agree it would be a good thing to do.

I realise you weren't saying otherwise, but: array_slice<> can also be
used for const vec<>s.  E.g. array_slice can't be resized
or modified.

I think array_slice<> is going to be more efficient as well.  E.g.:

void
f1 (vec &foo)
{
  for (unsigned int i = 0; i < foo.length (); ++i)
foo[i] += 1;
}

void
f2 (array_slice foo)
{
  for (unsigned int i = 0; i < foo.size (); ++i)
foo[i] += 1;
}

gives:

d150 &)>:
d150:   48 8b 07mov(%rdi),%rax
d153:   31 d2   xor%edx,%edx
d155:   48 85 c0test   %rax,%rax
d158:   74 26   je d180 &)+0x30>
d15a:   66 0f 1f 44 00 00   nopw   0x0(%rax,%rax,1)
d160:   3b 50 04cmp0x4(%rax),%edx
d163:   73 12   jaed177 &)+0x27>
d165:   89 d1   mov%edx,%ecx
d167:   83 c2 01add$0x1,%edx
d16a:   80 44 08 08 01  addb   $0x1,0x8(%rax,%rcx,1)
d16f:   48 8b 07mov(%rdi),%rax
d172:   48 85 c0test   %rax,%rax
d175:   75 e9   jned160 &)+0x10>
d177:   c3  retq
d178:   0f 1f 84 00 00 00 00nopl   0x0(%rax,%rax,1)
d17f:   00
d180:   c3  retq

d190 )>:
d190:   85 f6   test   %esi,%esi
d192:   74 18   je d1ac )+0x1c>
d194:   8d 46 fflea-0x1(%rsi),%eax
d197:   48 8d 44 07 01  lea0x1(%rdi,%rax,1),%rax
d19c:   0f 1f 40 00 nopl   0x0(%rax)
d1a0:   80 07 01addb   $0x1,(%rdi)
d1a3:   48 83 c7 01 add$0x1,%rdi
d1a7:   48 39 c7cmp%rax,%rdi
d1aa:   75 f4   jned1a0 )+0x10>
d1ac:   c3  retq

where f1 has to reload the length and base each iteration,
but f2 doesn't.
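
One likely reason for the difference is aliasing: f1 reaches its elements through a reference to a handle, so a byte-sized store could in principle modify the handle's own storage, while f2 receives the base and size by value. The following is a minimal sketch of that contrast. toy_vec and toy_slice are hypothetical stand-ins written for this illustration, not GCC's vec<> or array_slice<>.

```cpp
// Hypothetical vec<>-like handle: a single pointer to storage that holds
// both the length and the element pointer.
struct toy_vec
{
  struct storage
  {
    unsigned len;
    unsigned char *elts;
  };

  storage *m_vec;

  unsigned length () const { return m_vec ? m_vec->len : 0; }
  unsigned char &operator[] (unsigned i) { return m_vec->elts[i]; }
};

// Hypothetical array_slice<>-like view: base pointer and size held by value.
struct toy_slice
{
  unsigned char *m_base;
  unsigned m_size;

  unsigned size () const { return m_size; }
  unsigned char &operator[] (unsigned i) { return m_base[i]; }
};

// Passed by reference: each unsigned char store may alias foo.m_vec or the
// stored length, so the compiler may have to reload both every iteration.
void
bump_vec (toy_vec &foo)
{
  for (unsigned int i = 0; i < foo.length (); ++i)
    foo[i] += 1;
}

// Passed by value: base and size are local copies whose address is never
// exposed, so they can stay in registers for the whole loop.
void
bump_slice (toy_slice foo)
{
  for (unsigned int i = 0; i < foo.size (); ++i)
    foo[i] += 1;
}
```

Both functions compute the same result; only the code the compiler is allowed to generate differs.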

> But as noted array_slice<> lacks most of the vec<> API so I'm not sure
> how awkward that option would be.  We of course can amend its API as
> well.

Yeah, that'd be good.  The current class follows the principle
“don't add stuff that isn't needed yet”. :-)

Thanks,
Richard


Re: [PATCH 2/4] allow poisoning input_location in ranges it should not be used

2021-06-30 Thread Richard Biener via Gcc-patches
On Wed, Jun 30, 2021 at 7:37 AM Trevor Saunders  wrote:
>
> This makes it possible to assert if input_location is used during the lifetime
> of a scope.  This will allow us to find places that currently use it within a
> function and its callees, or prevent adding uses within the lifetime of a
> function after all existing uses are removed.
>
> bootstrapped and regtested on x86_64-linux-gnu, ok?

I'm not sure about the general approach but I have comments about
input_location.

IMHO a good first step would be to guard the input_location declaration with sth
like

#ifndef GCC_NEED_INPUT_LOCATION
extern location_t input_location;
#endif

and "white-list" it in the few files that refer to it.  (but not in
coretypes.h or rtl.h
which seem to include input.h).

Richard.

> Trev
>
> gcc/cp/ChangeLog:
>
> * call.c (add_builtin_candidate): Adjust.
> * decl.c (compute_array_index_type_loc): Likewise.
> * decl2.c (get_guard_cond): Likewise.
> (one_static_initialization_or_destruction): Likewise.
> (do_static_initialization_or_destruction): Likewise.
> * init.c (build_new_1): Likewise.
> (build_vec_init): Likewise.
> * module.cc (finish_module_processing): Likewise.
> * parser.c (cp_convert_range_for): Likewise.
> (cp_parser_perform_range_for_lookup): Likewise.
> (cp_parser_omp_for_incr): Likewise.
> (cp_convert_omp_range_for): Likewise.
> * pt.c (fold_expression): Likewise.
> (tsubst_copy_and_build): Likewise.
> * typeck.c (common_pointer_type): Likewise.
> (cp_build_array_ref): Likewise.
> (get_member_function_from_ptrfunc): Likewise.
> (cp_build_unary_op): Likewise.
> (convert_ptrmem): Likewise.
> (cp_build_modify_expr): Likewise.
> (build_ptrmemfunc): Likewise.
>
> gcc/ChangeLog:
>
> * diagnostic.c (internal_error): Remove use of input_location.
> * input.c (input_location): Change type to poisonable.
> * input.h (input_location): Adjust prototype.
>
> gcc/objc/ChangeLog:
>
> * objc-next-runtime-abi-02.c (build_v2_objc_method_fixup_call): 
> Adjust.
> ---
>  gcc/cp/call.c   |  2 +-
>  gcc/cp/decl.c   |  2 +-
>  gcc/cp/decl2.c  | 12 +--
>  gcc/cp/init.c   | 14 ++--
>  gcc/cp/module.cc|  2 +-
>  gcc/cp/parser.c | 11 +-
>  gcc/cp/pt.c |  4 ++--
>  gcc/cp/typeck.c | 33 -
>  gcc/diagnostic.c|  2 +-
>  gcc/input.c |  2 +-
>  gcc/input.h |  3 ++-
>  gcc/objc/objc-next-runtime-abi-02.c |  2 +-
>  12 files changed, 47 insertions(+), 42 deletions(-)
>
> diff --git a/gcc/cp/call.c b/gcc/cp/call.c
> index e4df72ec1a3..c94fe0b3bd2 100644
> --- a/gcc/cp/call.c
> +++ b/gcc/cp/call.c
> @@ -3126,7 +3126,7 @@ add_builtin_candidate (struct z_candidate **candidates, 
> enum tree_code code,
>  {
>if (TYPE_PTR_OR_PTRMEM_P (type1))
> {
> - tree cptype = composite_pointer_type (input_location,
> + tree cptype = composite_pointer_type (op_location_t 
> (input_location),
> type1, type2,
> error_mark_node,
> error_mark_node,
> diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
> index fa6af6fec11..84e2bdae6bf 100644
> --- a/gcc/cp/decl.c
> +++ b/gcc/cp/decl.c
> @@ -10884,7 +10884,7 @@ compute_array_index_type_loc (location_t name_loc, 
> tree name, tree size,
>  cp_build_binary_op will be appropriately folded.  */
>{
> processing_template_decl_sentinel s;
> -   itype = cp_build_binary_op (input_location,
> +   itype = cp_build_binary_op (op_location_t (input_location),
> MINUS_EXPR,
> cp_convert (ssizetype, size, complain),
> cp_convert (ssizetype, integer_one_node,
> diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
> index 090a83bd670..ddb7e248c63 100644
> --- a/gcc/cp/decl2.c
> +++ b/gcc/cp/decl2.c
> @@ -3386,7 +3386,7 @@ get_guard_cond (tree guard, bool thread_safe)
>guard_value = integer_one_node;
>if (!same_type_p (TREE_TYPE (guard_value), TREE_TYPE (guard)))
> guard_value = fold_convert (TREE_TYPE (guard), guard_value);
> -  guard = cp_build_binary_op (input_location,
> +  guard = cp_build_binary_op (location_t (input_location),
>   BIT_AND_EXPR, guard, guard_value,
>   tf_warning_or_error);
>  }
> @@ -3394,7 +3394,7 @@ get_guard_cond (tree guard, bool thread_safe)
>guard_value = integer_zero_node;
>if (!same_type_p (TREE_TYPE (guard_value), TREE_TYPE 

Re: [PATCH 4/4] poison input_location and cfun in one spot

2021-06-30 Thread Richard Biener via Gcc-patches
On Wed, Jun 30, 2021 at 7:37 AM Trevor Saunders  wrote:
>
> This simply confirms we can poison them in a small region.
>
> bootstrapped and regtested on x86_64-linux-gnu, ok?

So this shows the approach doesn't really scale since it's necessarily
at most function-scope granularity rather than file-scope as possible
with the existing #pragma (maybe add the possibility to un-poison
identifiers or a push/pop mechanism).
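
For readers unfamiliar with the idea, here is a minimal sketch of scope-based poisoning with a push/pop depth counter. It is not the implementation from the patch: poisonable and scoped_poison are hypothetical names, and the sketch throws where a compiler would more plausibly assert. It also shows why the check is a run-time one tied to a scope, unlike #pragma GCC poison, which works at compile time on identifiers.

```cpp
#include <stdexcept>

// Hypothetical wrapper around a value that can be made temporarily
// inaccessible.  The depth counter lets poisoned regions nest.
template <typename T>
class poisonable
{
public:
  explicit poisonable (T v) : m_val (v) {}

  // Reading while poisoned is treated as a bug and reported.
  const T &get () const
  {
    if (m_poison_depth > 0)
      throw std::logic_error ("access to poisoned value");
    return m_val;
  }

  void push_poison () { ++m_poison_depth; }
  void pop_poison () { --m_poison_depth; }

private:
  T m_val;
  unsigned m_poison_depth = 0;
};

// Hypothetical RAII guard: poisons in the constructor and restores the
// previous state in the destructor.
template <typename T>
class scoped_poison
{
public:
  explicit scoped_poison (poisonable<T> &p) : m_p (p) { m_p.push_poison (); }
  ~scoped_poison () { m_p.pop_poison (); }

  scoped_poison (const scoped_poison &) = delete;
  scoped_poison &operator= (const scoped_poison &) = delete;

private:
  poisonable<T> &m_p;
};
```

Any get () call made while a scoped_poison object is alive is rejected; once the guard goes out of scope, access works again.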

> Trev
>
> gcc/ChangeLog:
>
> * gimple-range.cc (disable_ranger): Prevent access to cfun and
> input_location.
> ---
>  gcc/gimple-range.cc | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/gcc/gimple-range.cc b/gcc/gimple-range.cc
> index 1851339c528..d4a3a6e46be 100644
> --- a/gcc/gimple-range.cc
> +++ b/gcc/gimple-range.cc
> @@ -22,6 +22,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "config.h"
>  #include "system.h"
>  #include "coretypes.h"
> +#include "poison.h"
>  #include "backend.h"
>  #include "tree.h"
>  #include "gimple.h"
> @@ -509,6 +510,8 @@ enable_ranger (struct function *fun)
>  void
>  disable_ranger (struct function *fun)
>  {
> +  auto_poison pil (input_location);
> +  auto_poison pcfun (cfun_poison);
>delete fun->x_range_query;
>
>fun->x_range_query = &global_ranges;
> --
> 2.20.1
>


Re: [ARM] PR66791: Gate comparison in vca intrinsics on __FAST_MATH__

2021-06-30 Thread Prathamesh Kulkarni via Gcc-patches
On Wed, 30 Jun 2021 at 14:00, Kyrylo Tkachov  wrote:
>
>
>
> > -Original Message-
> > From: Prathamesh Kulkarni 
> > Sent: 29 June 2021 08:21
> > To: gcc Patches ; Kyrylo Tkachov
> > 
> > Subject: Re: [ARM] PR66791: Gate comparison in vca intrinsics on
> > __FAST_MATH__
> >
> > On Tue, 22 Jun 2021 at 15:04, Prathamesh Kulkarni
> >  wrote:
> > >
> > > Hi,
> > > The attached patch gates abs(__a) cmp abs(__b) for vca intrinsics on
> > > __FAST_MATH__. I moved vabs intrinsics before vcage_f32 since vca
> > > intrinsics use those.
> > > Bootstrapped+tested on arm-linux-gnueabihf.
> > > OK to commit ?
> > ping https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573384.html
>
> Hmm, does this result in better optimisation? I guess it's expressing the 
> operation at a higher level, but there's now conceptually three operations 
> (2xvabs + 1 comparison) that would need to be folded away by the optimisers...
Hi Kyrill,
That was my motivation for PR97906 ;-)
With that fix, it now folds c = vabs(a) >= vabs(b) to vacle z, b, a
with __FAST_MATH__ defined.

Thanks,
Prathamesh
>
> Thanks,
> Kyrill
> >
> > Thanks,
> > Prathamesh
> > >
> > > Thanks,
> > > Prathamesh


RE: [ARM] PR66791: Gate comparison in vca intrinsics on __FAST_MATH__

2021-06-30 Thread Kyrylo Tkachov via Gcc-patches


> -Original Message-
> From: Prathamesh Kulkarni 
> Sent: 30 June 2021 10:05
> To: Kyrylo Tkachov 
> Cc: gcc Patches 
> Subject: Re: [ARM] PR66791: Gate comparison in vca intrinsics on
> __FAST_MATH__
> 
> On Wed, 30 Jun 2021 at 14:00, Kyrylo Tkachov 
> wrote:
> >
> >
> >
> > > -Original Message-
> > > From: Prathamesh Kulkarni 
> > > Sent: 29 June 2021 08:21
> > > To: gcc Patches ; Kyrylo Tkachov
> > > 
> > > Subject: Re: [ARM] PR66791: Gate comparison in vca intrinsics on
> > > __FAST_MATH__
> > >
> > > On Tue, 22 Jun 2021 at 15:04, Prathamesh Kulkarni
> > >  wrote:
> > > >
> > > > Hi,
> > > > The attached patch gates abs(__a) cmp abs(__b) for vca intrinsics on
> > > > __FAST_MATH__. I moved vabs intrinsics before vcage_f32 since vca
> > > > intrinsics use those.
> > > > Bootstrapped+tested on arm-linux-gnueabihf.
> > > > OK to commit ?
> > > ping https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573384.html
> >
> > Hmm, does this result in better optimisation? I guess it's expressing the
> operation at a higher level, but there's now conceptually three operations
> (2xvabs + 1 comparison) that would need to be folded away by the
> optimisers...
> Hi Kyrill,
> That was my motivation for PR97906 ;-)
> With that fix, it now folds c = vabs(a) >= vabs(b) to vacle z, b, a
> with __FAST_MATH__ defined.

Ah right. Ok for trunk then.
Kyrill

> 
> Thanks,
> Prathamesh
> >
> > Thanks,
> > Kyrill
> > >
> > > Thanks,
> > > Prathamesh
> > > >
> > > > Thanks,
> > > > Prathamesh


[PATCH] tree-optimization/101264 - rework SLP "any" permute forward prop

2021-06-30 Thread Richard Biener
This integrates the forward propagation of SLP "any" permutes into
the main propagation stage as a separate single-pass propagation
didn't work out.

It does make the propagation iterate more - propagation in both
directions doesn't tend to behave nicely.  I've checked on CPU 2017
fprate and it isn't too bad:

#iters  #count
2 30810
3 7386
4 1250
5 140
6 6
7 3

before and

2 30812
3 7303
4 1218
5 122
6 33
7 56
8 26
9 12
10 4
11 1
12 2
13 2
14 4

after.  So yes, the peak number of required iterations grows
significantly but the majority is still covered with three iterations.
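
To make the "majority" claim concrete, the two histograms above can be summed and compared. The sketch below only re-does that arithmetic on the numbers quoted in this mail; the helper names are made up for the illustration.

```cpp
#include <utility>
#include <vector>

// (number of iterations, number of propagation runs needing that many)
using histogram = std::vector<std::pair<unsigned, unsigned long>>;

// Counts copied from the tables in this mail.
const histogram before = {{2, 30810}, {3, 7386}, {4, 1250},
                          {5, 140},   {6, 6},    {7, 3}};
const histogram after = {{2, 30812}, {3, 7303}, {4, 1218}, {5, 122}, {6, 33},
                         {7, 56},    {8, 26},   {9, 12},   {10, 4},  {11, 1},
                         {12, 2},    {13, 2},   {14, 4}};

// Total number of runs recorded in the histogram.
unsigned long
total_count (const histogram &h)
{
  unsigned long total = 0;
  for (const auto &entry : h)
    total += entry.second;
  return total;
}

// Fraction of runs that finished within max_iters iterations.
double
share_within (const histogram &h, unsigned max_iters)
{
  unsigned long within = 0;
  for (const auto &entry : h)
    if (entry.first <= max_iters)
      within += entry.second;
  unsigned long total = total_count (h);
  return total ? static_cast<double> (within) / total : 0.0;
}
```

Both tables add up to 39595 runs, and share_within (..., 3) is a little over 96% in each case (38196 runs before, 38115 after).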

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2021-06-30  Richard Biener  

PR tree-optimization/101264
* tree-vect-slp.c (vect_optimize_slp): Propagate the
computed perm_in to all "any" permute successors
we cannot de-duplicate immediately.

* gfortran.dg/pr101264.f90: New testcase.
---
 gcc/testsuite/gfortran.dg/pr101264.f90 | 94 ++
 gcc/tree-vect-slp.c| 79 ++
 2 files changed, 115 insertions(+), 58 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/pr101264.f90

diff --git a/gcc/testsuite/gfortran.dg/pr101264.f90 
b/gcc/testsuite/gfortran.dg/pr101264.f90
new file mode 100644
index 000..5602a709a36
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr101264.f90
@@ -0,0 +1,94 @@
+! { dg-do compile }
+! { dg-options "-Ofast" }
+  SUBROUTINE foo (a,b,c,d,trigs,inc1,inc2,inc3,inc4,lot,n,la)
+IMPLICIT NONE (type, external)
+INTEGER, PARAMETER ::   wp = 8
+INTEGER, PARAMETER ::  iwp = 4
+INTEGER(iwp) ::  inc1
+INTEGER(iwp) ::  inc2
+INTEGER(iwp) ::  inc3
+INTEGER(iwp) ::  inc4
+INTEGER(iwp) ::  la
+INTEGER(iwp) ::  lot
+INTEGER(iwp) ::  n
+
+REAL(wp) ::  a(*)
+REAL(wp) ::  b(*)
+REAL(wp) ::  c(*)
+REAL(wp) ::  d(*)
+REAL(wp) ::  trigs(*)
+
+REAL(wp) ::  c1
+REAL(wp) ::  c2
+REAL(wp) ::  s1
+REAL(wp) ::  s2
+REAL(wp) ::  sin60
+
+INTEGER(iwp) ::  i
+INTEGER(iwp) ::  ia
+INTEGER(iwp) ::  ib
+INTEGER(iwp) ::  ibase
+INTEGER(iwp) ::  ic
+INTEGER(iwp) ::  iink
+INTEGER(iwp) ::  ijk
+INTEGER(iwp) ::  j
+INTEGER(iwp) ::  ja
+INTEGER(iwp) ::  jb
+INTEGER(iwp) ::  jbase
+INTEGER(iwp) ::  jc
+INTEGER(iwp) ::  jink
+INTEGER(iwp) ::  jump
+INTEGER(iwp) ::  k
+INTEGER(iwp) ::  kb
+INTEGER(iwp) ::  kc
+INTEGER(iwp) ::  kstop
+INTEGER(iwp) ::  l
+INTEGER(iwp) ::  m
+
+sin60=0.866025403784437_wp
+
+ia = 1
+ib = ia + (2*m-la)*inc1
+ic = ib
+ja = 1
+jb = ja + jink
+jc = jb + jink
+
+DO k = la, kstop, la
+   kb = k + k
+   kc = kb + kb
+   c1 = trigs(kb+1)
+   s1 = trigs(kb+2)
+   c2 = trigs(kc+1)
+   s2 = trigs(kc+2)
+   ibase = 0
+   DO l = 1, la
+  i = ibase
+  j = jbase
+  DO ijk = 1, lot
+ c(ja+j) = a(ia+i) + (a(ib+i)+a(ic+i))
+ d(ja+j) = b(ia+i) + (b(ib+i)-b(ic+i))
+ c(jb+j) = c1*((a(ia+i)-0.5_wp*(a(ib+i)+a(ic+i)))-(sin60*(b(ib+i)+ 
&
+  &b(ic+i  
&
+  &- 
s1*((b(ia+i)-0.5_wp*(b(ib+i)-b(ic+i)))+(sin60*(a(ib+i)- &
+  &a(ic+i
+ d(jb+j) = s1*((a(ia+i)-0.5_wp*(a(ib+i)+a(ic+i)))-(sin60*(b(ib+i)+ 
&
+  &b(ic+i  
&
+  &+ 
c1*((b(ia+i)-0.5_wp*(b(ib+i)-b(ic+i)))+(sin60*(a(ib+i)- &
+  &a(ic+i
+ c(jc+j) = c2*((a(ia+i)-0.5_wp*(a(ib+i)+a(ic+i)))+(sin60*(b(ib+i)+ 
&
+  &b(ic+i  
&
+  &- 
s2*((b(ia+i)-0.5_wp*(b(ib+i)-b(ic+i)))-(sin60*(a(ib+i)- &
+  &a(ic+i
+ i = i + inc3
+ j = j + inc4
+  END DO
+  ibase = ibase + inc1
+  jbase = jbase + inc2
+   END DO
+   ia = ia + iink
+   ib = ib + iink
+   ic = ic - iink
+   jbase = jbase + jump
+END DO
+  END
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 9155af499b3..10195d3629f 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -3729,6 +3729,7 @@ vect_optimize_slp (vec_info *vinfo)
perm = vertices[idx].perm_out;
  else
{
+ bool any_succ_perm_out_m1 = false;
  perm = vertices[idx].get_perm_in ();
  for (graph_edge *succ = slpg->vertices[idx].succ;
   succ; succ = succ->succ_next)
@@ -3742,7 +3743,15 @@ vect_optimize_slp (vec_info *vinfo)
 For example see gcc.dg/vect/bb-slp-14.c for a case
 that would break.  */
  if (succ_perm == -1)
-   continue;
+   {
+ /* 

Re: [PATCH] define auto_vec copy ctor and assignment (PR 90904)

2021-06-30 Thread Martin Jambor
Hi,

On Wed, Jun 30 2021, Richard Biener wrote:
> On Wed, Jun 30, 2021 at 3:46 AM Martin Sebor  wrote:
>> On 6/29/21 4:58 AM, Richard Biener wrote:

[...]

>> >
>> > But I wonder if since you now identified them they could be massaged
>> > prior to doing the change.
>> >
>> > I do hope we end up not needing .to_vec () after all, if no users remain ;)
>>
>> I'd be happy to if none remained.  I see how to eliminate those in
>> calls to functions like c_parser_declaration_or_fndef() (done in
>> the attached revision of the patch), but no easy way to get rid
>> of those that replace other implicit conversions, like all those
>> assignments to the vec members of the ipa_call_arg_values ctor.
>> If it's appropriate to std::move those then that would get rid
>> of the .to_vec () call.  I'm not familiar with the code but I
>> have the impression it might be meant more as a reference to
>> some "remote" object (an instance of ipa_auto_call_arg_values?)
>> If that's right then making the vec members auto_vec references
>> (or pointers) would be one way to "fix" this.
>
> I think ipa_call_arg_values is just a temporary container used to
> pass a collection of vec<>s along API boundaries.  I'm not sure
> whether its default CTOR is ever used but it's definitely an
> optimization avoiding extra indirection (when changing the vec<>
> members to vec<> * or references, in case the default CTOR is
> not used).  It _might_ be that the vecs are all just read and never
> written to in the APIs using this abstract type

No, IPA-CP does add and then remove extra context-specific values in the
auto version of the container, ipa_auto_call_arg_values, but I do not
think that consumers of ipa_call_arg_values do.  The caching mechanism
can make a (partial) copy.

> but then it's
> likely the vectors are always appropriately pre-allocated.

They are, they should never be reallocated.

> Maybe using array_slice instead of vec<> members would work,
> but they'd pack less efficiently (but I guess not an issue for this
> aggregate which should be only used temporarily for argument
> passing).

I need to educate myself more about array_slice to comment on that.
But note that apart from reducing the number of parameters, there is
also an ipa_call_arg_values field in ipa_call_context and especially
ipa_cached_call_context.

Martin


[Ada] Rewrite Validated_View in recursive style

2021-06-30 Thread Pierre-Marie de Rodat
Iteration with an artificial Continue flag in routine Validated_View was
confusing. Also, most of the routines that traverse type hierarchies are
written in recursive style.

Cleanup only; semantics is unaffected.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_util.ads (Validated_View): Fix style in comment.
* sem_util.adb (Validated_View): Rewrite in recursive style.

diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -29471,42 +29471,36 @@ package body Sem_Util is

 
function Validated_View (Typ : Entity_Id) return Entity_Id is
-  Continue : Boolean;
-  Val_Typ  : Entity_Id;
-
begin
-  Continue := True;
-  Val_Typ  := Base_Type (Typ);
-
   --  Obtain the full view of the input type by stripping away concurrency,
   --  derivations, and privacy.
 
-  while Continue loop
- Continue := False;
-
- if Is_Concurrent_Type (Val_Typ) then
-if Present (Corresponding_Record_Type (Val_Typ)) then
-   Continue := True;
-   Val_Typ  := Corresponding_Record_Type (Val_Typ);
+  if Is_Base_Type (Typ) then
+ if Is_Concurrent_Type (Typ) then
+if Present (Corresponding_Record_Type (Typ)) then
+   return Corresponding_Record_Type (Typ);
+else
+   return Typ;
 end if;
 
- elsif Is_Derived_Type (Val_Typ) then
-Continue := True;
-Val_Typ  := Etype (Val_Typ);
+ elsif Is_Derived_Type (Typ) then
+return Validated_View (Etype (Typ));
 
- elsif Is_Private_Type (Val_Typ) then
-if Present (Underlying_Full_View (Val_Typ)) then
-   Continue := True;
-   Val_Typ  := Underlying_Full_View (Val_Typ);
+ elsif Is_Private_Type (Typ) then
+if Present (Underlying_Full_View (Typ)) then
+   return Validated_View (Underlying_Full_View (Typ));
 
-elsif Present (Full_View (Val_Typ)) then
-   Continue := True;
-   Val_Typ  := Full_View (Val_Typ);
+elsif Present (Full_View (Typ)) then
+   return Validated_View (Full_View (Typ));
+else
+   return Typ;
 end if;
  end if;
-  end loop;
 
-  return Val_Typ;
+ return Typ;
+  else
+ return Validated_View (Base_Type (Typ));
+  end if;
end Validated_View;
 
---


diff --git a/gcc/ada/sem_util.ads b/gcc/ada/sem_util.ads
--- a/gcc/ada/sem_util.ads
+++ b/gcc/ada/sem_util.ads
@@ -3290,7 +3290,7 @@ package Sem_Util is
 
function Validated_View (Typ : Entity_Id) return Entity_Id;
--  Obtain the "validated view" of arbitrary type Typ which is suitable for
-   --  verification by attributes 'Valid_Scalars. This view is the type itself
+   --  verification by attribute 'Valid_Scalars. This view is the type itself
--  or its full view while stripping away concurrency, derivations, and
--  privacy.
 




[Ada] Consistently use Validated_View for Valid_Scalars on records

2021-06-30 Thread Pierre-Marie de Rodat
Expansion of attribute Valid_Scalars was meant to use Get_Fullest_View
for arrays and Validated_View for records. However, this was not done
consistently and for records we were mixing Get_Fullest_View with
Validated_View.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_attr.adb (Expand_N_Attribute_Reference): Explicitly use
Validated_View for record objects.

diff --git a/gcc/ada/exp_attr.adb b/gcc/ada/exp_attr.adb
--- a/gcc/ada/exp_attr.adb
+++ b/gcc/ada/exp_attr.adb
@@ -7460,7 +7460,7 @@ package body Exp_Attr is
 (Build_Record_VS_Func
   (Attr   => N,
Formal_Typ => Ptyp,
-   Rec_Typ=> Val_Typ),
+   Rec_Typ=> Validated_View (Ptyp)),
 Loc),
 Parameter_Associations => New_List (Pref));
  end if;




[Ada] Ignore again errors when running gen_il-main

2021-06-30 Thread Pierre-Marie de Rodat
This is needed to allow bootstrap with old compilers, due to
finalization issues.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* Make-generated.in (ada/stamp-gen_il): Ignore errors from
running gen_il-main.

diff --git a/gcc/ada/Make-generated.in b/gcc/ada/Make-generated.in
--- a/gcc/ada/Make-generated.in
+++ b/gcc/ada/Make-generated.in
@@ -18,7 +18,9 @@ GEN_IL_FLAGS = -gnata -gnat2012 -gnatw.g -gnatyg -gnatU $(GEN_IL_INCLUDES)
 ada/seinfo_tables.ads ada/seinfo_tables.adb ada/sinfo.h ada/einfo.h ada/nmake.ads ada/nmake.adb ada/seinfo.ads ada/sinfo-nodes.ads ada/sinfo-nodes.adb ada/einfo-entities.ads ada/einfo-entities.adb: ada/stamp-gen_il ; @true
 ada/stamp-gen_il: $(fsrcdir)/ada/gen_il*
 	$(MKDIR) ada/gen_il
-	cd ada/gen_il ; gnatmake -q -g $(GEN_IL_FLAGS) gen_il-main ; ./gen_il-main
+	cd ada/gen_il; gnatmake -q -g $(GEN_IL_FLAGS) gen_il-main
+	# Ignore errors to work around finalization issues in older compilers
+	- cd ada/gen_il; ./gen_il-main
 	$(fsrcdir)/../move-if-change ada/gen_il/seinfo_tables.ads ada/seinfo_tables.ads
 	$(fsrcdir)/../move-if-change ada/gen_il/seinfo_tables.adb ada/seinfo_tables.adb
 	$(fsrcdir)/../move-if-change ada/gen_il/sinfo.h ada/sinfo.h




[Ada] Fix bug in node/entity kind numbers in sinfo/einfo.h

2021-06-30 Thread Pierre-Marie de Rodat
This patch fixes a bug in the node/entity kinds that are generated by
Gen_IL in sinfo/einfo.h. These numbers should be the same as the 'Pos of
the corresponding enumeration literals in Node_Kind and Entity_Kind.
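
The failure mode can be shown with a small model. This is a sketch in C++ rather than the Ada of Gen_IL, and every name in it is hypothetical: type_desc stands for an entry in the type hierarchy, decl_index for its position in a declaration-order enumeration, and the input vector for the order in which the hierarchy walk visits the types.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical description of one type in the hierarchy.
struct type_desc
{
  std::string name;
  unsigned decl_index; // position in the declaration-order enumeration
  bool concrete;       // only concrete types receive a number
};

using numbering = std::vector<std::pair<std::string, unsigned>>;

// Old approach in this model: derive the number from the declaration
// index.  This is right only if declaration order equals visit order.
numbering
number_by_declaration (const std::vector<type_desc> &visit_order,
                       unsigned first_concrete_index)
{
  numbering out;
  for (const type_desc &t : visit_order)
    if (t.concrete)
      out.emplace_back (t.name, t.decl_index - first_concrete_index);
  return out;
}

// New approach in this model: keep a running counter while visiting, so
// the numbers follow the emitted order by construction.
numbering
number_by_visit (const std::vector<type_desc> &visit_order)
{
  numbering out;
  unsigned cur_pos = 0;
  for (const type_desc &t : visit_order)
    if (t.concrete)
      out.emplace_back (t.name, cur_pos++);
  return out;
}
```

When a type declared later is visited earlier, number_by_declaration hands out numbers that disagree with the emitted order, whereas number_by_visit always yields 0, 1, 2, ... in the order the types are printed.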

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* gen_il-gen.adb (Put_C_Type_And_Subtypes): Put the correct
numbers.
* gen_il-internals.ads, gen_il-internals.adb: (Pos): Remove this
function. It was assuming that the order of the enumeration
literals in Type_Enum is the same as the order of the generated
types Node_Kind and Entity_Kind, which is not true.

diff --git a/gcc/ada/gen_il-gen.adb b/gcc/ada/gen_il-gen.adb
--- a/gcc/ada/gen_il-gen.adb
+++ b/gcc/ada/gen_il-gen.adb
@@ -2930,9 +2930,15 @@ package body Gen_IL.Gen is
   procedure Put_C_Type_And_Subtypes
 (S : in out Sink; Root : Root_Type) is
 
+ Cur_Pos : Root_Nat := 0;
+ --  Current Node_Kind'Pos or Entity_Kind'Pos to be printed
+
  procedure Put_Enum_Lit (T : Node_Or_Entity_Type);
  --  Print out the #define corresponding to the Ada enumeration literal
  --  for T in Node_Kind and Entity_Kind (i.e. concrete types).
+ --  This looks like "#define Some_Kind ", where Some_Kind
+ --  is the Node_Kind or Entity_Kind enumeration literal, and
+ --   is Node_Kind'Pos or Entity_Kind'Pos of that literal.
 
  procedure Put_Kind_Subtype (T : Node_Or_Entity_Type);
  --  Print out the SUBTYPE macro call corresponding to an abstract
@@ -2941,7 +2947,8 @@ package body Gen_IL.Gen is
  procedure Put_Enum_Lit (T : Node_Or_Entity_Type) is
  begin
 if T in Concrete_Type then
-   Put (S, "#define " & Image (T) & " " & Image (Pos (T)) & "" & LF);
+   Put (S, "#define " & Image (T) & " " & Image (Cur_Pos) & LF);
+   Cur_Pos := Cur_Pos + 1;
 end if;
  end Put_Enum_Lit;
 
@@ -2961,7 +2968,7 @@ package body Gen_IL.Gen is
  Iterate_Types (Root, Pre => Put_Enum_Lit'Access);
 
  Put (S, "#define Number_" & Node_Or_Entity (Root) & "_Kinds " &
-  Image (Pos (Last_Concrete (Root)) + 1) & "" & LF & LF);
+  Image (Cur_Pos) & "" & LF & LF);
 
  Iterate_Types (Root, Pre => Put_Kind_Subtype'Access);
 


diff --git a/gcc/ada/gen_il-internals.adb b/gcc/ada/gen_il-internals.adb
--- a/gcc/ada/gen_il-internals.adb
+++ b/gcc/ada/gen_il-internals.adb
@@ -477,16 +477,4 @@ package body Gen_IL.Internals is
   Put (S, "--  End type hierarchy for " & N_Or_E & LF & LF);
end Put_Type_Hierarchy;
 
-   -
-   -- Pos --
-   -
-
-   function Pos (T : Concrete_Type) return Root_Nat is
-  First : constant Concrete_Type :=
-(if T in Concrete_Node then Concrete_Node'First
- else Concrete_Entity'First);
-   begin
-  return Type_Enum'Pos (T) - Type_Enum'Pos (First);
-   end Pos;
-
 end Gen_IL.Internals;


diff --git a/gcc/ada/gen_il-internals.ads b/gcc/ada/gen_il-internals.ads
--- a/gcc/ada/gen_il-internals.ads
+++ b/gcc/ada/gen_il-internals.ads
@@ -202,7 +202,10 @@ package Gen_IL.Internals is
 Nil'Access);
--  Iterate top-down on the type hierarchy. Call Pre and Post before and
--  after walking child types. Note that this ignores union types, because
-   --  they are nonhierarchical.
+   --  they are nonhierarchical. The order in which concrete types are visited
+   --  matches the order of the generated enumeration types Node_Kind and
+   --  Entity_Kind, which is not the same as the order of the Type_Enum
+   --  type in Gen_IL.Types.
 
function Is_Descendant (Ancestor, Descendant : Node_Or_Entity_Type)
  return Boolean;
@@ -212,9 +215,6 @@ package Gen_IL.Internals is
 
procedure Put_Type_Hierarchy (S : in out Sink; Root : Root_Type);
 
-   function Pos (T : Concrete_Type) return Root_Nat;
-   --  Return Node_Kind'Pos (T) or Entity_Kind'Pos (T)
-

 
type Field_Desc is record




[Ada] Factor out many fields in entities

2021-06-30 Thread Pierre-Marie de Rodat
Also minor reformatting nearby.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* gen_il-gen-gen_entities.adb (Record_Field_Kind,
Allocatable_Kind): Add new abstract kinds.
(Constant_Or_Variable_Kind): Likewise.
(E_Constant, E_Variable, E_Loop_Parameter): Use them.
(E_Discriminant, E_Component): Likewise.
* gen_il-types.ads (type Opt_Type_Enum): Add them.

diff --git a/gcc/ada/gen_il-gen-gen_entities.adb b/gcc/ada/gen_il-gen-gen_entities.adb
--- a/gcc/ada/gen_il-gen-gen_entities.adb
+++ b/gcc/ada/gen_il-gen-gen_entities.adb
@@ -242,8 +242,8 @@ begin -- Gen_IL.Gen.Gen_Entities
--  The initial Ekind value for a newly created entity. Also used as the
--  Ekind for Standard_Void_Type, a type entity in Standard used as a
--  dummy type for the return type of a procedure (the reason we create
-   --  this type is to share the circuits for performing overload resolution
-   --  on calls).
+   --  this type is to share the circuits for performing overload
+   --  resolution on calls).
(Sm (Alignment, Uint),
 Sm (Contract, Node_Id),
 Sm (Is_Elaboration_Warnings_OK_Id, Flag),
@@ -254,7 +254,9 @@ begin -- Gen_IL.Gen.Gen_Entities
 Sm (Current_Value, Node_Id), -- setter only
 Sm (Has_Predicates, Flag), -- setter only
 Sm (Initialization_Statements, Node_Id), -- setter only
-Sm (Is_Param_Block_Component_Type, Flag, Base_Type_Only), -- setter only
+Sm (Is_Param_Block_Component_Type, Flag, Base_Type_Only),
+-- setter only
+
 Sm (Package_Instantiation, Node_Id), -- setter only
 Sm (Related_Expression, Node_Id), -- setter only
 
@@ -302,17 +304,10 @@ begin -- Gen_IL.Gen.Gen_Entities
(Sm (Current_Value, Node_Id),
 Sm (Renamed_Or_Alias, Node_Id)));
 
-   Cc (E_Component, Object_Kind,
-   --  Components of a record declaration, private declarations of
-   --  protected objects.
+   Ab (Record_Field_Kind, Object_Kind,
(Sm (Component_Bit_Offset, Uint),
 Sm (Component_Clause, Node_Id),
 Sm (Corresponding_Record_Component, Node_Id),
-Sm (Discriminant_Checking_Func, Node_Id),
-Sm (DT_Entry_Count, Uint,
-Pre => "Is_Tag (N)"),
-Sm (DT_Offset_To_Top_Func, Node_Id,
-Pre => "Is_Tag (N)"),
 Sm (Entry_Formal, Node_Id),
 Sm (Esize, Uint),
 Sm (Interface_Name, Node_Id),
@@ -320,114 +315,80 @@ begin -- Gen_IL.Gen.Gen_Entities
 Sm (Normalized_First_Bit, Uint),
 Sm (Normalized_Position, Uint),
 Sm (Normalized_Position_Max, Uint),
-Sm (Original_Record_Component, Node_Id),
+Sm (Original_Record_Component, Node_Id)));
+
+   Cc (E_Component, Record_Field_Kind,
+   --  Components of a record declaration, private declarations of
+   --  protected objects.
+   (Sm (Discriminant_Checking_Func, Node_Id),
+Sm (DT_Entry_Count, Uint,
+Pre => "Is_Tag (N)"),
+Sm (DT_Offset_To_Top_Func, Node_Id,
+Pre => "Is_Tag (N)"),
 Sm (Prival, Node_Id,
 Pre => "Is_Protected_Component (N)"),
 Sm (Related_Type, Node_Id)));
 
-   Cc (E_Constant, Object_Kind,
-   --  Constants created by an object declaration with a constant keyword
+   Ab (Allocatable_Kind, Object_Kind,
(Sm (Activation_Record_Component, Node_Id),
-Sm (Actual_Subtype, Node_Id),
 Sm (Alignment, Uint),
+Sm (Esize, Uint),
+Sm (Interface_Name, Node_Id),
+Sm (Is_Finalized_Transient, Flag),
+Sm (Is_Ignored_Transient, Flag),
+Sm (Linker_Section_Pragma, Node_Id),
+Sm (Related_Expression, Node_Id),
+Sm (Status_Flag_Or_Transient_Decl, Node_Id)));
+
+   Ab (Constant_Or_Variable_Kind, Allocatable_Kind,
+   (Sm (Actual_Subtype, Node_Id),
 Sm (BIP_Initialization_Call, Node_Id),
 Sm (Contract, Node_Id),
 Sm (Discriminal_Link, Node_Id),
 Sm (Encapsulating_State, Node_Id),
-Sm (Esize, Uint),
 Sm (Extra_Accessibility, Node_Id),
-Sm (Full_View, Node_Id),
 Sm (Initialization_Statements, Node_Id),
-Sm (Interface_Name, Node_Id),
 Sm (Is_Elaboration_Checks_OK_Id, Flag),
 Sm (Is_Elaboration_Warnings_OK_Id, Flag),
-Sm (Is_Finalized_Transient, Flag),
-Sm (Is_Ignored_Transient, Flag),
 Sm (Last_Aggregate_Assignment, Node_Id),
-Sm (Linker_Section_Pragma, Node_Id),
 Sm (Optimize_Alignment_Space, Flag),
 Sm (Optimize_Alignment_Time, Flag),
 Sm (Prival_Link, Node_Id),
-Sm (Related_Expression, Node_Id),
 Sm (Related_Type, Node_Id),
 Sm (Return_Statement, Node_Id),
 Sm (Size_Check_Code, Node_Id),
 Sm (SPARK_Pragma, Node_Id),
-Sm (SPARK_Pragma_Inherited, Flag),
-Sm (Status_Flag_Or_Transient_Decl, Node_Id)));
+Sm (SPARK_Pragma_Inheri

[Ada] Add some OS constants to control keepalive on TCP connections

2021-06-30 Thread Pierre-Marie de Rodat
This adds some OS constants that are needed to control the
keepalive status of TCP connections. The new constants are
TCP_KEEPCNT, TCP_KEEPIDLE and TCP_KEEPINTVL.
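
As an illustration of what these constants are for, here is a minimal C++ sketch that sets them on a socket with setsockopt. It is not part of the patch. It assumes a POSIX system with <netinet/tcp.h>, reuses the patch's convention of defining a missing constant as -1, and treats the idle and interval values as seconds, which is the Linux behaviour; enable_keepalive and set_tcp_option are hypothetical helper names.

```cpp
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

// Same fallback convention as s-oscons-tmplt.c: -1 means "not available".
#ifndef TCP_KEEPCNT
# define TCP_KEEPCNT (-1)
#endif
#ifndef TCP_KEEPIDLE
# define TCP_KEEPIDLE (-1)
#endif
#ifndef TCP_KEEPINTVL
# define TCP_KEEPINTVL (-1)
#endif

// Set one TCP-level option, silently skipping options the platform lacks.
static bool
set_tcp_option (int fd, int opt, int value)
{
  if (opt == -1)
    return true;
  return setsockopt (fd, IPPROTO_TCP, opt, &value, sizeof value) == 0;
}

// Turn keepalive on for FD and tune the three parameters.  Returns false
// if any supported option could not be set.
bool
enable_keepalive (int fd, int idle_secs, int interval_secs, int probe_count)
{
  int on = 1;
  if (setsockopt (fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on) != 0)
    return false;
  return set_tcp_option (fd, TCP_KEEPIDLE, idle_secs)
         && set_tcp_option (fd, TCP_KEEPINTVL, interval_secs)
         && set_tcp_option (fd, TCP_KEEPCNT, probe_count);
}
```

A caller would typically create a TCP socket and call enable_keepalive (fd, 60, 10, 5) before connecting or right after accepting.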

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* s-oscons-tmplt.c: Add some OS constants.

diff --git a/gcc/ada/s-oscons-tmplt.c b/gcc/ada/s-oscons-tmplt.c
--- a/gcc/ada/s-oscons-tmplt.c
+++ b/gcc/ada/s-oscons-tmplt.c
@@ -1501,6 +1501,21 @@ CNS(MSG_Forced_Flags, "")
 #endif
 CND(TCP_NODELAY, "Do not coalesce packets")
 
+#ifndef TCP_KEEPCNT
+# define TCP_KEEPCNT -1
+#endif
+CND(TCP_KEEPCNT, "Maximum number of keepalive probes")
+
+#ifndef TCP_KEEPIDLE
+# define TCP_KEEPIDLE -1
+#endif
+CND(TCP_KEEPIDLE, "Idle time before TCP starts sending keepalive probes")
+
+#ifndef TCP_KEEPINTVL
+# define TCP_KEEPINTVL -1
+#endif
+CND(TCP_KEEPINTVL, "Time between individual keepalive probes")
+
 #ifndef SO_REUSEADDR
 # define SO_REUSEADDR -1
 #endif




[Ada] Accept arrays and scalars as type views that can be validated

2021-06-30 Thread Pierre-Marie de Rodat
Originally the expansion of attribute Valid_Scalars was only using
Validated_View, but it was generating unnecessary unchecked conversions
between array types that prevented validity checks from being optimized
at compilation time.

To prevent those conversions some of the calls to Validated_View were
replaced with calls to Get_Fullest_View, which behaves as an identity
function for non-packed arrays (and unchecked conversions between the
views of a type are trivially eliminated).

This patch restores uses of Validated_View, makes it behave as an
identity function on arrays and explains this behaviour in a comment.

A similar issue occurs for scalar (sub)types, which must be validated
without switching to their base types that would lack range constraints.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_attr.adb (Build_Array_VS_Func): Restore uses of
Validated_View.
(Build_Record_VS_Func): Likewise.
(Expand_N_Attribute_Reference): Likewise.
* sem_util.adb (Validated_View): Behave as an identity function
for arrays and scalars.

diff --git a/gcc/ada/exp_attr.adb b/gcc/ada/exp_attr.adb
--- a/gcc/ada/exp_attr.adb
+++ b/gcc/ada/exp_attr.adb
@@ -248,7 +248,7 @@ package body Exp_Attr is
is
   Loc  : constant Source_Ptr := Sloc (Attr);
   Comp_Typ : constant Entity_Id :=
-Get_Fullest_View (Component_Type (Array_Typ));
+Validated_View (Component_Type (Array_Typ));
 
   function Validate_Component
 (Obj_Id  : Entity_Id;
@@ -535,7 +535,7 @@ package body Exp_Attr is
   is
  Field_Id  : constant Entity_Id := Defining_Entity (Field);
  Field_Nam : constant Name_Id   := Chars (Field_Id);
- Field_Typ : constant Entity_Id := Get_Fullest_View (Etype (Field_Id));
+ Field_Typ : constant Entity_Id := Validated_View (Etype (Field_Id));
  Attr_Nam  : Name_Id;
 
   begin
@@ -7396,7 +7396,7 @@ package body Exp_Attr is
   ---
 
   when Attribute_Valid_Scalars => Valid_Scalars : declare
- Val_Typ : constant Entity_Id := Get_Fullest_View (Ptyp);
+ Val_Typ : constant Entity_Id := Validated_View (Ptyp);
  Expr: Node_Id;
 
   begin
@@ -7460,7 +7460,7 @@ package body Exp_Attr is
 (Build_Record_VS_Func
   (Attr   => N,
Formal_Typ => Ptyp,
-   Rec_Typ    => Validated_View (Ptyp)),
+   Rec_Typ    => Val_Typ),
 Loc),
 Parameter_Associations => New_List (Pref));
  end if;


diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -29473,34 +29473,53 @@ package body Sem_Util is
 
function Validated_View (Typ : Entity_Id) return Entity_Id is
begin
+  --  Scalar types can be always validated. In fact, switching to the base
+  --  type would drop the range constraints and force validation to use a
+  --  larger type than necessary.
+
+  if Is_Scalar_Type (Typ) then
+ return Typ;
+
+  --  Array types can be validated even when they are derived, because
+  --  validation only requires their bounds and component types to be
+  --  accessible. In fact, switching to the parent type would pollute
+  --  expansion of attribute Valid_Scalars with unnecessary conversion
+  --  that might not be eliminated by the frontend.
+
+  elsif Is_Array_Type (Typ) then
+ return Typ;
+
+  --  For other types, in particular for record subtypes, we switch to the
+  --  base type.
+
+  elsif not Is_Base_Type (Typ) then
+ return Validated_View (Base_Type (Typ));
+
   --  Obtain the full view of the input type by stripping away concurrency,
   --  derivations, and privacy.
 
-  if Is_Base_Type (Typ) then
- if Is_Concurrent_Type (Typ) then
-if Present (Corresponding_Record_Type (Typ)) then
-   return Corresponding_Record_Type (Typ);
-else
-   return Typ;
-end if;
+  elsif Is_Concurrent_Type (Typ) then
+ if Present (Corresponding_Record_Type (Typ)) then
+return Corresponding_Record_Type (Typ);
+ else
+return Typ;
+ end if;
 
- elsif Is_Derived_Type (Typ) then
-return Validated_View (Etype (Typ));
+  elsif Is_Derived_Type (Typ) then
+ return Validated_View (Etype (Typ));
 
- elsif Is_Private_Type (Typ) then
-if Present (Underlying_Full_View (Typ)) then
-   return Validated_View (Underlying_Full_View (Typ));
+  elsif Is_Private_Type (Typ) then
+ if Present (Underlying_Full_View (Typ)) then
+return Validated_View (Underlying_Full_View (Typ));
 
-elsif Present (Full_View (Typ)) then
-   return Validated_View (Full_View (Typ));
- 

[Ada] More robust guard against cascaded errors with overlapping actuals

2021-06-30 Thread Pierre-Marie de Rodat
Code cleanup, both to improve efficiency (when no error has been posted
for a subprogram call) and to avoid potential crashes (when an error has
been posted and the subprogram call parameters are likely to be
ill-formed as well).

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_warn.adb (Warn_On_Overlapping_Actuals): Prevent cascaded
errors once for the subprogram call, not for every pair of
actual parameters.

diff --git a/gcc/ada/sem_warn.adb b/gcc/ada/sem_warn.adb
--- a/gcc/ada/sem_warn.adb
+++ b/gcc/ada/sem_warn.adb
@@ -3729,6 +3729,11 @@ package body Sem_Warn is
 
   if Nkind (N) not in N_Subprogram_Call | N_Entry_Call_Statement then
  return;
+
+  --  Guard against previous errors
+
+  elsif Error_Posted (N) then
+ return;
   end if;
 
   --  If a call C has two or more parameters of mode in out or out that are
@@ -3800,10 +3805,9 @@ package body Sem_Warn is
   and then Is_Composite_Type (Etype (Form1)))
then
 
-   --  Guard against previous errors
+  --  Guard against previous errors
 
-  if Error_Posted (N)
-or else No (Etype (Act1))
+  if No (Etype (Act1))
 or else No (Etype (Act2))
   then
  null;




[Ada] Further adjustment and optimization of System.Value_N

2021-06-30 Thread Pierre-Marie de Rodat
This moves the declaration of Value_Enumeration_Pos to the body, renames
Valid_Enumeration_Value into Valid_Value_Enumeration and eliminates the
need for a range check on the return path of Value_Enumeration.
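
The idea of a position lookup that reports failure with a negative value can be sketched in C as follows. This is only an illustration: the real routines take a packed Names string, an Indexes table and a hash function, whereas this sketch assumes a plain array of literals and a linear search, and both function names are hypothetical stand-ins.

```c
#include <string.h>

/* Hypothetical stand-in for Value_Enumeration_Pos: return the position
   of STR among the NUM literals in NAMES, or -1 if STR is not one of
   them.  A linear search replaces the hashed lookup of the runtime.  */
int
value_enumeration_pos (const char *const names[], int num, const char *str)
{
  for (int i = 0; i < num; i++)
    if (strcmp (names[i], str) == 0)
      return i;

  return -1;
}

/* Hypothetical stand-in for Valid_Value_Enumeration: a literal is valid
   exactly when the lookup yields a non-negative position.  */
int
valid_value_enumeration (const char *const names[], int num, const char *str)
{
  return value_enumeration_pos (names, num, str) >= 0;
}
```

Because every valid position is non-negative, a caller only needs a single comparison against 0 to tell success from failure.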

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* rtsfind.ads (RE_Id): Change RE_Valid_Enumeration_Value_NN into
RE_Valid_Value_Enumeration_NN.
(RE_Unit_Table): Adjust to above renaming.
* exp_imgv.adb (Expand_Valid_Value_Attribute): Likewise.
* libgnat/s-valuen.ads (Invalid): Remove.
(Value_Enumeration_Pos): Move to...
* libgnat/s-valuen.adb (Value_Enumeration_Pos): ...here.
Return -1 instead of Invalid.
(Value_Enumeration): Compare against 0 instead of Invalid.
(Valid_Enumeration_Value): Likewise.  Rename to...
(Valid_Value_Enumeration): ...this.
* libgnat/s-vaenu8.ads (Valid_Enumeration_Value_8): Rename into...
(Valid_Value_Enumeration_8): ...this.
* libgnat/s-vaen16.ads (Valid_Enumeration_Value_16): Rename into...
(Valid_Value_Enumeration_16): ...this.
* libgnat/s-vaen32.ads (Valid_Enumeration_Value_32): Rename into...
(Valid_Value_Enumeration_32): ...this.

diff --git a/gcc/ada/exp_imgv.adb b/gcc/ada/exp_imgv.adb
--- a/gcc/ada/exp_imgv.adb
+++ b/gcc/ada/exp_imgv.adb
@@ -1439,17 +1439,17 @@ package body Exp_Imgv is
begin
   --  Generate:
 
-  -- Valid_Enumeration_Value _NN
+  -- Valid_Value_Enumeration_NN
   --   (typS, typN'Address, typH'Unrestricted_Access, Num, X)
 
   Ttyp := Component_Type (Etype (Lit_Indexes (Rtyp)));
 
   if Ttyp = Standard_Integer_8 then
- Func := RE_Valid_Enumeration_Value_8;
+ Func := RE_Valid_Value_Enumeration_8;
   elsif Ttyp = Standard_Integer_16 then
- Func := RE_Valid_Enumeration_Value_16;
+ Func := RE_Valid_Value_Enumeration_16;
   else
- Func := RE_Valid_Enumeration_Value_32;
+ Func := RE_Valid_Value_Enumeration_32;
   end if;
 
   Prepend_To (Args,


diff --git a/gcc/ada/libgnat/s-vaen16.ads b/gcc/ada/libgnat/s-vaen16.ads
--- a/gcc/ada/libgnat/s-vaen16.ads
+++ b/gcc/ada/libgnat/s-vaen16.ads
@@ -49,13 +49,13 @@ package System.Val_Enum_16 is
   return Natural
  renames Impl.Value_Enumeration;
 
-   function Valid_Enumeration_Value_16
+   function Valid_Value_Enumeration_16
  (Names   : String;
   Indexes : System.Address;
   Hash    : Impl.Hash_Function_Ptr;
   Num     : Natural;
   Str     : String)
   return Boolean
- renames Impl.Valid_Enumeration_Value;
+ renames Impl.Valid_Value_Enumeration;
 
 end System.Val_Enum_16;


diff --git a/gcc/ada/libgnat/s-vaen32.ads b/gcc/ada/libgnat/s-vaen32.ads
--- a/gcc/ada/libgnat/s-vaen32.ads
+++ b/gcc/ada/libgnat/s-vaen32.ads
@@ -49,13 +49,13 @@ package System.Val_Enum_32 is
   return Natural
  renames Impl.Value_Enumeration;
 
-   function Valid_Enumeration_Value_32
+   function Valid_Value_Enumeration_32
  (Names   : String;
   Indexes : System.Address;
   Hash    : Impl.Hash_Function_Ptr;
   Num     : Natural;
   Str     : String)
   return Boolean
- renames Impl.Valid_Enumeration_Value;
+ renames Impl.Valid_Value_Enumeration;
 
 end System.Val_Enum_32;


diff --git a/gcc/ada/libgnat/s-vaenu8.ads b/gcc/ada/libgnat/s-vaenu8.ads
--- a/gcc/ada/libgnat/s-vaenu8.ads
+++ b/gcc/ada/libgnat/s-vaenu8.ads
@@ -49,13 +49,13 @@ package System.Val_Enum_8 is
   return Natural
  renames Impl.Value_Enumeration;
 
-   function Valid_Enumeration_Value_8
+   function Valid_Value_Enumeration_8
  (Names   : String;
   Indexes : System.Address;
   Hash    : Impl.Hash_Function_Ptr;
   Num     : Natural;
   Str     : String)
   return Boolean
- renames Impl.Valid_Enumeration_Value;
+ renames Impl.Valid_Value_Enumeration;
 
 end System.Val_Enum_8;


diff --git a/gcc/ada/libgnat/s-valuen.adb b/gcc/ada/libgnat/s-valuen.adb
--- a/gcc/ada/libgnat/s-valuen.adb
+++ b/gcc/ada/libgnat/s-valuen.adb
@@ -35,6 +35,16 @@ with System.Val_Util; use System.Val_Util;
 
 package body System.Value_N is
 
+   function Value_Enumeration_Pos
+ (Names   : String;
+  Indexes : System.Address;
+  Hash    : Hash_Function_Ptr;
+  Num     : Natural;
+  Str     : String)
+  return Integer with Pure_Function;
+   --  Same as Value_Enumeration, except returns negative if Value_Enumeration
+   --  would raise Constraint_Error.
+
---
-- Value_Enumeration_Pos --
---
@@ -98,9 +108,25 @@ package body System.Value_N is
  end if;
   end;
 
-  return Invalid;
+  return -1;
end Value_Enumeration_Pos;
 
+   -
+   -- Valid_Value_Enumeration --
+   -
+
+   function Valid_Value_Enumeration
+ (Names   : String;
+  Indexes : System.Address

[Ada] Simplify detection of local types

2021-06-30 Thread Pierre-Marie de Rodat
Code cleanup related to fixing Valid_Scalars for private records.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_ch3.adb (Is_Local_Type): Simplify by reusing Scope_Within.

diff --git a/gcc/ada/sem_ch3.adb b/gcc/ada/sem_ch3.adb
--- a/gcc/ada/sem_ch3.adb
+++ b/gcc/ada/sem_ch3.adb
@@ -19133,21 +19133,8 @@ package body Sem_Ch3 is
   ---
 
   function Is_Local_Type (Typ : Entity_Id) return Boolean is
- Scop : Entity_Id;
-
   begin
- Scop := Scope (Typ);
- while Present (Scop)
-   and then Scop /= Standard_Standard
- loop
-if Scop = Scope (Current_Scope) then
-   return True;
-end if;
-
-Scop := Scope (Scop);
- end loop;
-
- return False;
+ return Scope_Within (Inner => Typ, Outer => Scope (Current_Scope));
   end Is_Local_Type;
 
--  Start of processing for Is_Visible_Component




[Ada] Reuse Is_Subprogram_Or_Entry where possible

2021-06-30 Thread Pierre-Marie de Rodat
Code cleanup; semantics is unaffected.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* freeze.adb (Freeze_All): Simplify by reusing
Is_Subprogram_Or_Entry.
* sem_ch11.adb (Analyze_Handled_Statement): Likewise.

diff --git a/gcc/ada/freeze.adb b/gcc/ada/freeze.adb
--- a/gcc/ada/freeze.adb
+++ b/gcc/ada/freeze.adb
@@ -2179,7 +2179,7 @@ package body Freeze is
  elsif Is_Concurrent_Type (E) then
 Item := First_Entity (E);
 while Present (Item) loop
-   if (Is_Entry (Item) or else Is_Subprogram (Item))
+   if Is_Subprogram_Or_Entry (Item)
  and then not Default_Expressions_Processed (Item)
then
   Process_Default_Expressions (Item, After);


diff --git a/gcc/ada/sem_ch11.adb b/gcc/ada/sem_ch11.adb
--- a/gcc/ada/sem_ch11.adb
+++ b/gcc/ada/sem_ch11.adb
@@ -435,7 +435,7 @@ package body Sem_Ch11 is
   --  postcondition, since in that case there are no source references, and
   --  we need to preserve deferred references from the enclosing scope.
 
-  if ((Is_Subprogram (Current_Scope) or else Is_Entry (Current_Scope))
+  if (Is_Subprogram_Or_Entry (Current_Scope)
and then Chars (Current_Scope) /= Name_uPostconditions)
  or else Ekind (Current_Scope) in E_Block | E_Task_Type
   then




[Ada] Remove redundant check for empty list

2021-06-30 Thread Pierre-Marie de Rodat
Cleanup only; semantics is unaffected.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_ch3.adb (Analyze_Declarations): Remove explicit check for
missing private declarations, because a subsequent call to
Is_Empty_List will detect them anyway.

diff --git a/gcc/ada/sem_ch3.adb b/gcc/ada/sem_ch3.adb
--- a/gcc/ada/sem_ch3.adb
+++ b/gcc/ada/sem_ch3.adb
@@ -2758,7 +2758,6 @@ package body Sem_Ch3 is
Resolve_Aspects;
 
 elsif L /= Visible_Declarations (Parent (L))
-  or else No (Private_Declarations (Parent (L)))
   or else Is_Empty_List (Private_Declarations (Parent (L)))
 then
Adjust_Decl;




[Ada] Fix style in Get_Fullest_View

2021-06-30 Thread Pierre-Marie de Rodat
Code cleanup related to fixing Valid_Scalars for private records, which
used to involve Get_Fullest_View.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_util.ads (Get_Fullest_View): Refill comment; remove extra
space after period.
* sem_util.adb (Get_Fullest_View): Fix style.

diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -10772,22 +10772,26 @@ package body Sem_Util is
  when E_Class_Wide_Type =>
 return Get_Fullest_View (Root_Type (E), Include_PAT);
 
- when  E_Class_Wide_Subtype =>
+ when E_Class_Wide_Subtype =>
 if Present (Equivalent_Type (E)) then
return Get_Fullest_View (Equivalent_Type (E), Include_PAT);
 elsif Present (Cloned_Subtype (E)) then
return Get_Fullest_View (Cloned_Subtype (E), Include_PAT);
 end if;
 
- when E_Protected_Type | E_Protected_Subtype
-| E_Task_Type |  E_Task_Subtype =>
+ when E_Protected_Subtype
+| E_Protected_Type
+| E_Task_Subtype
+| E_Task_Type
+ =>
 if Present (Corresponding_Record_Type (E)) then
return Get_Fullest_View (Corresponding_Record_Type (E),
 Include_PAT);
 end if;
 
  when E_Access_Protected_Subprogram_Type
-| E_Anonymous_Access_Protected_Subprogram_Type =>
+| E_Anonymous_Access_Protected_Subprogram_Type
+ =>
 if Present (Equivalent_Type (E)) then
return Get_Fullest_View (Equivalent_Type (E), Include_PAT);
 end if;


diff --git a/gcc/ada/sem_util.ads b/gcc/ada/sem_util.ads
--- a/gcc/ada/sem_util.ads
+++ b/gcc/ada/sem_util.ads
@@ -1328,9 +1328,9 @@ package Sem_Util is
 
function Get_Fullest_View
  (E : Entity_Id; Include_PAT : Boolean := True) return Entity_Id;
-   --  Get the fullest possible view of E, looking through private,
-   --  limited, packed array and other implementation types.  If Include_PAT
-   --  is False, don't look inside packed array types.
+   --  Get the fullest possible view of E, looking through private, limited,
+   --  packed array and other implementation types. If Include_PAT is False,
+   --  don't look inside packed array types.
 
function Has_Access_Values (T : Entity_Id) return Boolean;
--  Returns true if the underlying type of T is an access type, or has a




[Ada] tech debt: Parent (Empty) is not allowed

2021-06-30 Thread Pierre-Marie de Rodat
The documentation says that the Parent field is not defined for the
Empty node, but many places were setting and getting the field. This
patch changes the code to obey the documentation.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* atree.adb, atree.ads (Parent, Set_Parent): Assert node is
Present.
(Copy_Parent, Parent_Kind): New helper routines.
* gen_il-gen.adb: Add with clause.
* nlists.adb (Parent): Assert Parent of list is Present.
* aspects.adb, checks.adb, exp_aggr.adb, exp_ch6.adb,
exp_util.adb, lib-xref-spark_specific.adb, osint.ads,
sem_ch12.adb, sem_ch13.adb, sem_ch3.adb, sem_ch6.adb,
sem_dim.adb, sem_prag.adb, sem_res.adb, sem_util.adb,
treepr.adb: Do not call Parent and Set_Parent on the Empty node.
* libgnat/a-stwiun__shared.adb, libgnat/a-stzunb__shared.adb:
Minor: Fix typos in comments.
* einfo.ads: Minor comment update.
* sinfo-utils.ads, sinfo-utils.adb (Parent_Kind, Copy_Parent):
New functions.

diff --git a/gcc/ada/aspects.adb b/gcc/ada/aspects.adb
--- a/gcc/ada/aspects.adb
+++ b/gcc/ada/aspects.adb
@@ -241,6 +241,10 @@ package body Aspects is
   --  find the declaration node where the aspects reside. This is usually
   --  the parent or the parent of the parent.
 
+  if No (Parent (Owner)) then
+ return Empty;
+  end if;
+
   Decl := Parent (Owner);
   if not Permits_Aspect_Specifications (Decl) then
  Decl := Parent (Decl);
@@ -488,6 +492,7 @@ package body Aspects is
 
function Permits_Aspect_Specifications (N : Node_Id) return Boolean is
begin
+  pragma Assert (Present (N));
   return Has_Aspect_Specifications_Flag (Nkind (N));
end Permits_Aspect_Specifications;
 


diff --git a/gcc/ada/atree.adb b/gcc/ada/atree.adb
--- a/gcc/ada/atree.adb
+++ b/gcc/ada/atree.adb
@@ -1232,7 +1232,9 @@ package body Atree is
  if Field in Node_Range then
 New_N := Union_Id (Copy_Separate_Tree (Node_Id (Field)));
 
-if Parent (Node_Id (Field)) = Source then
+if Present (Node_Id (Field))
+  and then Parent (Node_Id (Field)) = Source
+then
Set_Parent (Node_Id (New_N), New_Id);
 end if;
 
@@ -1801,16 +1803,14 @@ package body Atree is
   end if;
end Paren_Count;
 
-   
-   -- Parent --
-   
-
-   function Parent (N : Node_Id) return Node_Id is
+   function Parent (N : Node_Or_Entity_Id) return Node_Or_Entity_Id is
begin
+  pragma Assert (Atree.Present (N));
+
   if Is_List_Member (N) then
  return Parent (List_Containing (N));
   else
- return Node_Id (Link (N));
+ return Node_Or_Entity_Id (Link (N));
   end if;
end Parent;
 
@@ -2126,9 +2126,9 @@ package body Atree is
-- Set_Parent --

 
-   procedure Set_Parent (N : Node_Id; Val : Node_Id) is
+   procedure Set_Parent (N : Node_Or_Entity_Id; Val : Node_Or_Entity_Id) is
begin
-  pragma Assert (not Locked);
+  pragma Assert (Atree.Present (N));
   pragma Assert (not In_List (N));
   Set_Link (N, Union_Id (Val));
end Set_Parent;


diff --git a/gcc/ada/atree.ads b/gcc/ada/atree.ads
--- a/gcc/ada/atree.ads
+++ b/gcc/ada/atree.ads
@@ -414,34 +414,34 @@ package Atree is
--  The following functions return the contents of the indicated field of
--  the node referenced by the argument, which is a Node_Id.
 
-   function No   (N : Node_Id) return Boolean;
+   function No (N : Node_Id) return Boolean;
pragma Inline (No);
--  Tests given Id for equality with the Empty node. This allows notations
--  like "if No (Variant_Part)" as opposed to "if Variant_Part = Empty".
 
-   function Parent   (N : Node_Id) return Node_Id;
+   function Parent (N : Node_Or_Entity_Id) return Node_Or_Entity_Id;
pragma Inline (Parent);
--  Returns the parent of a node if the node is not a list member, or else
--  the parent of the list containing the node if the node is a list member.
 
-   function Paren_Count  (N : Node_Id) return Nat;
+   function Paren_Count (N : Node_Id) return Nat;
pragma Inline (Paren_Count);
--  Number of parentheses that surround an expression
 
-   function Present  (N : Node_Id) return Boolean;
+   function Present (N : Node_Id) return Boolean;
pragma Inline (Present);
--  Tests given Id for inequality with the Empty node. This allows notations
--  like "if Present (Statement)" as opposed to "if Statement /= Empty".
 
-   procedure Set_Original_Node (N : Node_Id; Val : Node_Id);
+   procedure Set_Original_Node (N : Node_Id; Val : Node_Id);
pragma Inline (Set_Original_Node);
--  Note that this routine is used only in very peculiar cases. In normal
--  cases, the Original_Node link is set by calls to Rewri

[Ada] Overriding errors on renamings and instances overriding predefined operators

2021-06-30 Thread Pierre-Marie de Rodat
The compiler improperly flags a renaming-as-body or a function
instantiation declaring an operator that overrides a predefined
operator, when the renaming or instantiation is declared with an
overriding_indicator.  In the renaming case, the Overridden_Operation
field of the subprogram's entity is not set, and a test for the
possibility that the renaming overrides a predefined operator must be
added. In the case of a function instantiation, the function entity is
an N_Defining_Identifier rather than an N_Defining_Operator_Symbol, and
a test for the entity having an operator name is now used.  In both
cases these additional tests serve to prevent issuing a "not overriding"
error.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_ch6.ads (Can_Override_Operator): Function declaration
moved from package body to package spec.
* sem_ch6.adb (Check_Overriding_Indicator): Now use test of
whether the subprogram's Chars is an operator name, to handle
cases of function instances whose entity is
N_Defining_Identifier rather than N_Defining_Operator_Symbol.
(Can_Override_Operator): Function declaration moved to package
spec.  Now use test of whether the subprogram's Chars is an
operator name, to handle cases of function instances whose
entity is N_Defining_Identifier rather than
N_Defining_Operator_Symbol.
* sem_ch8.adb (Analyze_Renamed_Subprogram): Check for
possibility of an overridden predefined operator, and suppress
the "not overriding" message in that case.

diff --git a/gcc/ada/sem_ch6.adb b/gcc/ada/sem_ch6.adb
--- a/gcc/ada/sem_ch6.adb
+++ b/gcc/ada/sem_ch6.adb
@@ -132,9 +132,6 @@ package body Sem_Ch6 is
--  Does all the real work of Analyze_Subprogram_Body. This is split out so
--  that we can use RETURN but not skip the debug output at the end.
 
-   function Can_Override_Operator (Subp : Entity_Id) return Boolean;
-   --  Returns true if Subp can override a predefined operator.
-
procedure Check_Conformance
  (New_Id   : Entity_Id;
   Old_Id   : Entity_Id;
@@ -7321,7 +7318,7 @@ package body Sem_Ch6 is
   --  predefined signature, because we know already that there is no
   --  explicit overridden operation.
 
-  elsif Nkind (Subp) = N_Defining_Operator_Symbol then
+  elsif Chars (Subp) in Any_Operator_Name then
  if Must_Not_Override (Spec) then
 
 --  If this is not a primitive or a protected subprogram, then
@@ -8313,7 +8310,12 @@ package body Sem_Ch6 is
   Typ : Entity_Id;
 
begin
-  if Nkind (Subp) /= N_Defining_Operator_Symbol then
+  --  Return False if not an operator. We test the name rather than testing
+  --  that the Nkind is N_Defining_Operator_Symbol, because there are cases
+  --  where an operator entity can be an N_Defining_Identifier (such as for
+  --  function instantiations).
+
+  if Chars (Subp) not in Any_Operator_Name then
  return False;
 
   else


diff --git a/gcc/ada/sem_ch6.ads b/gcc/ada/sem_ch6.ads
--- a/gcc/ada/sem_ch6.ads
+++ b/gcc/ada/sem_ch6.ads
@@ -51,6 +51,9 @@ package Sem_Ch6 is
--  and body declarations. Returns the defining entity for the
--  specification N.
 
+   function Can_Override_Operator (Subp : Entity_Id) return Boolean;
+   --  Returns true if Subp can override a predefined operator
+
procedure Check_Conventions (Typ : Entity_Id);
--  Ada 2005 (AI-430): Check that the conventions of all inherited and
--  overridden dispatching operations of type Typ are consistent with their


diff --git a/gcc/ada/sem_ch8.adb b/gcc/ada/sem_ch8.adb
--- a/gcc/ada/sem_ch8.adb
+++ b/gcc/ada/sem_ch8.adb
@@ -3299,7 +3299,9 @@ package body Sem_Ch8 is
Style.Missing_Overriding (N, Rename_Spec);
 end if;
 
- elsif Must_Override (Specification (N)) then
+ elsif Must_Override (Specification (N))
+   and then not Can_Override_Operator (Rename_Spec)
+ then
 Error_Msg_NE ("subprogram& is not overriding", N, Rename_Spec);
  end if;
 




[Ada] vx7-shared-libs - x86_64-vx7r2 (gnat runtime)

2021-06-30 Thread Pierre-Marie de Rodat
A standalone ifeq is added for selecting vxworks7r2 targets which
have shared gnatlib enabled.  The powerpc64 ifeq is moved here and
x86_64 is added to the filter list.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* Makefile.rtl: Add a new ifeq for vx7r2 shared gnatlib.

diff --git a/gcc/ada/Makefile.rtl b/gcc/ada/Makefile.rtl
--- a/gcc/ada/Makefile.rtl
+++ b/gcc/ada/Makefile.rtl
@@ -1200,13 +1200,6 @@ ifeq ($(strip $(filter-out powerpc% wrs vxworks vxworksspe vxworks7% vxworks7spe
 GCC_SPEC_FILES+=vxworks-cert-$(ARCH_STR)-link.spec
 GCC_SPEC_FILES+=vxworks-smp-$(ARCH_STR)-link.spec
   endif
-
-  ifeq ($(strip $(filter-out vxworks7r2 powerpc64 rtp rtp-smp, $(target_os) $(target_cpu) $(THREAD_KIND))),)
-# Shared libraries are only supported on PowerPC64, VxWorks7r2
-# ATM.  Also this is disabled for kernel runtimes.
-GNATLIB_SHARED = gnatlib-shared-dual
-LIBRARY_VERSION := $(LIB_VERSION)
-  endif
 endif
 
 # PowerPC and e500v2 VxWorks 653
@@ -2973,6 +2966,13 @@ ifeq ($(strip $(filter-out linux%,$(target_os))),)
 g-sercom.adb

[Ada] Disable Pre/Post in formal containers

2021-06-30 Thread Pierre-Marie de Rodat
Pre and postconditions in the formal containers library are
designed for formal verification. In general, we do not want to
execute them.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/a-cfdlli.ads: Use pragma Assertion_Policy to disable
pre and postconditions.
* libgnat/a-cfhama.ads: Likewise.
* libgnat/a-cfhase.ads: Likewise.
* libgnat/a-cfinve.ads: Likewise.
* libgnat/a-cforma.ads: Likewise.
* libgnat/a-cforse.ads: Likewise.
* libgnat/a-cofove.ads: Likewise.

diff --git a/gcc/ada/libgnat/a-cfdlli.ads b/gcc/ada/libgnat/a-cfdlli.ads
--- a/gcc/ada/libgnat/a-cfdlli.ads
+++ b/gcc/ada/libgnat/a-cfdlli.ads
@@ -39,6 +39,11 @@ generic
 package Ada.Containers.Formal_Doubly_Linked_Lists with
   SPARK_Mode
 is
+   --  Contracts in this unit are meant for analysis only, not for run-time
+   --  checking.
+
+   pragma Assertion_Policy (Pre => Ignore);
+   pragma Assertion_Policy (Post => Ignore);
pragma Annotate (CodePeer, Skip_Analysis);
 
type List (Capacity : Count_Type) is private with


diff --git a/gcc/ada/libgnat/a-cfhama.ads b/gcc/ada/libgnat/a-cfhama.ads
--- a/gcc/ada/libgnat/a-cfhama.ads
+++ b/gcc/ada/libgnat/a-cfhama.ads
@@ -64,6 +64,11 @@ generic
 package Ada.Containers.Formal_Hashed_Maps with
   SPARK_Mode
 is
+   --  Contracts in this unit are meant for analysis only, not for run-time
+   --  checking.
+
+   pragma Assertion_Policy (Pre => Ignore);
+   pragma Assertion_Policy (Post => Ignore);
pragma Annotate (CodePeer, Skip_Analysis);
 
type Map (Capacity : Count_Type; Modulus : Hash_Type) is private with


diff --git a/gcc/ada/libgnat/a-cfhase.ads b/gcc/ada/libgnat/a-cfhase.ads
--- a/gcc/ada/libgnat/a-cfhase.ads
+++ b/gcc/ada/libgnat/a-cfhase.ads
@@ -62,6 +62,11 @@ generic
 package Ada.Containers.Formal_Hashed_Sets with
   SPARK_Mode
 is
+   --  Contracts in this unit are meant for analysis only, not for run-time
+   --  checking.
+
+   pragma Assertion_Policy (Pre => Ignore);
+   pragma Assertion_Policy (Post => Ignore);
pragma Annotate (CodePeer, Skip_Analysis);
 
type Set (Capacity : Count_Type; Modulus : Hash_Type) is private with


diff --git a/gcc/ada/libgnat/a-cfinve.ads b/gcc/ada/libgnat/a-cfinve.ads
--- a/gcc/ada/libgnat/a-cfinve.ads
+++ b/gcc/ada/libgnat/a-cfinve.ads
@@ -55,6 +55,11 @@ generic
 package Ada.Containers.Formal_Indefinite_Vectors with
   SPARK_Mode => On
 is
+   --  Contracts in this unit are meant for analysis only, not for run-time
+   --  checking.
+
+   pragma Assertion_Policy (Pre => Ignore);
+   pragma Assertion_Policy (Post => Ignore);
pragma Annotate (CodePeer, Skip_Analysis);
 
subtype Extended_Index is Index_Type'Base


diff --git a/gcc/ada/libgnat/a-cforma.ads b/gcc/ada/libgnat/a-cforma.ads
--- a/gcc/ada/libgnat/a-cforma.ads
+++ b/gcc/ada/libgnat/a-cforma.ads
@@ -63,6 +63,11 @@ generic
 package Ada.Containers.Formal_Ordered_Maps with
   SPARK_Mode
 is
+   --  Contracts in this unit are meant for analysis only, not for run-time
+   --  checking.
+
+   pragma Assertion_Policy (Pre => Ignore);
+   pragma Assertion_Policy (Post => Ignore);
pragma Annotate (CodePeer, Skip_Analysis);
 
function Equivalent_Keys (Left, Right : Key_Type) return Boolean with


diff --git a/gcc/ada/libgnat/a-cforse.ads b/gcc/ada/libgnat/a-cforse.ads
--- a/gcc/ada/libgnat/a-cforse.ads
+++ b/gcc/ada/libgnat/a-cforse.ads
@@ -59,6 +59,11 @@ generic
 package Ada.Containers.Formal_Ordered_Sets with
   SPARK_Mode
 is
+   --  Contracts in this unit are meant for analysis only, not for run-time
+   --  checking.
+
+   pragma Assertion_Policy (Pre => Ignore);
+   pragma Assertion_Policy (Post => Ignore);
pragma Annotate (CodePeer, Skip_Analysis);
 
function Equivalent_Elements (Left, Right : Element_Type) return Boolean


diff --git a/gcc/ada/libgnat/a-cofove.ads b/gcc/ada/libgnat/a-cofove.ads
--- a/gcc/ada/libgnat/a-cofove.ads
+++ b/gcc/ada/libgnat/a-cofove.ads
@@ -45,6 +45,11 @@ generic
 package Ada.Containers.Formal_Vectors with
   SPARK_Mode
 is
+   --  Contracts in this unit are meant for analysis only, not for run-time
+   --  checking.
+
+   pragma Assertion_Policy (Pre => Ignore);
+   pragma Assertion_Policy (Post => Ignore);
pragma Annotate (CodePeer, Skip_Analysis);
 
subtype Extended_Index is Index_Type'Base




[Ada] vx7-shared-libs: Unused variable __gnat_user_int_connect

2021-06-30 Thread Pierre-Marie de Rodat
Makefile.rtl (x86_64-vx7r2) adds a package i-vxinco.adb, which imports
a variable that is not exported on rtp. This problem only shows up
with a shared library runtime because it's never actually used, and
gets optimized away in a static link.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* Makefile.rtl (x86_64-vx7r2) [EXTRA_GNATRTL_TASKING_OBJS]: Move
i-vxinco.o out of RTP runtime.

diff --git a/gcc/ada/Makefile.rtl b/gcc/ada/Makefile.rtl
--- a/gcc/ada/Makefile.rtl
+++ b/gcc/ada/Makefile.rtl
@@ -1457,12 +1457,12 @@ ifeq ($(strip $(filter-out %86 x86_64 wrs vxworks vxworks7%,$(target_cpu) $(targ
 endif
   endif
 
-  EXTRA_GNATRTL_NONTASKING_OBJS += i-vxwork.o i-vxwoio.o
+  EXTRA_GNATRTL_NONTASKING_OBJS += i-vxinco.o i-vxwork.o i-vxwoio.o
 endif
   endif
 
   EXTRA_GNATRTL_NONTASKING_OBJS += s-stchop.o
-  EXTRA_GNATRTL_TASKING_OBJS += i-vxinco.o s-vxwork.o s-vxwext.o
+  EXTRA_GNATRTL_TASKING_OBJS += s-vxwork.o s-vxwext.o
 
   EXTRA_LIBGNAT_OBJS+=vx_stack_info.o
 




[Ada] Remove a special case for forking-for-expect from ordinary spawn

2021-06-30 Thread Pierre-Marie de Rodat
The __gnat_in_child_after_fork flag was introduced for tracking memory
within a child process created by __gnat_expect_fork. It is not needed
for __gnat_portable_spawn, where fork is immediately followed by execv
(and _exit should execv fail), because there can be no memory
allocations between fork and execv/_exit.

Code cleanup only, related to a discussion about the use of fork, vfork
and posix_spawn in GNATprove.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* adaint.c (__gnat_portable_spawn): Revert change that
introduced setting of __gnat_in_child_after_fork.

diff --git a/gcc/ada/adaint.c b/gcc/ada/adaint.c
--- a/gcc/ada/adaint.c
+++ b/gcc/ada/adaint.c
@@ -2423,7 +2423,6 @@ __gnat_portable_spawn (char *args[] ATTRIBUTE_UNUSED)
   if (pid == 0)
 {
   /* The child. */
-  __gnat_in_child_after_fork = 1;
   if (execv (args[0], MAYBE_TO_PTR32 (args)) != 0)
 	_exit (1);
 }
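
The fork-then-exec pattern that the commit message relies on can be sketched in isolation as below. This is a simplified illustration on a POSIX system, not the code in adaint.c: spawn_and_wait is a hypothetical name, and details such as MAYBE_TO_PTR32 are left out. The point is that the child does nothing between fork and execv/_exit, so it has no opportunity to allocate memory.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the program ARGS[0] with argument vector
   ARGS and return its exit status, or -1 if it could not be started
   or did not exit normally.  */
int
spawn_and_wait (char *args[])
{
  int status;
  pid_t pid = fork ();

  if (pid < 0)
    return -1;

  if (pid == 0)
    {
      /* The child: execv only returns on failure, in which case we
	 leave immediately with _exit.  */
      execv (args[0], args);
      _exit (1);
    }

  /* The parent: wait for the child and report how it ended.  */
  if (waitpid (pid, &status, 0) != pid || !WIFEXITED (status))
    return -1;

  return WEXITSTATUS (status);
}
```

In this sketch a failed execv is indistinguishable from a program that itself exits with status 1, which is the same trade-off the `_exit (1)` in the diff makes.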




[Ada] Ensure System.Tasking.Debug.Known_Tasks component access is atomic

2021-06-30 Thread Pierre-Marie de Rodat
Multiple threads can access the elements of
System.Tasking.Debug.Known_Tasks concurrently. While the compiler will
generally produce code that reads and writes Task_Ids atomically since
Task_Id is word size, it is best to be explicit to prevent data race
issues that may arise from non-atomic component accesses.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnarl/s-tasdeb.ads (Known_Tasks): Add Atomic_Components
aspect.

diff --git a/gcc/ada/libgnarl/s-tasdeb.ads b/gcc/ada/libgnarl/s-tasdeb.ads
--- a/gcc/ada/libgnarl/s-tasdeb.ads
+++ b/gcc/ada/libgnarl/s-tasdeb.ads
@@ -65,9 +65,11 @@ package System.Tasking.Debug is
-- General GDB support --
-
 
-   Known_Tasks : array (0 .. 999) of Task_Id := (others => null);
+   Known_Tasks : array (0 .. 999) of Task_Id := (others => null)
+ with Atomic_Components;
--  Global array of tasks read by gdb, and updated by Create_Task and
-   --  Finalize_TCB
+   --  Finalize_TCB. Ensure access to its components is atomic to allow
+   --  lock-free concurrent access.
 
Debug_Event_Activating   : constant := 1;
Debug_Event_Run  : constant := 2;
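
As a rough analogy for readers who think in C, the effect of Atomic_Components on an array of pointers resembles declaring each element as a C11 atomic. The sketch below is an assumption-laden illustration, not a translation of the runtime: struct task, known_tasks, register_task and lookup_task are hypothetical names, and the C11 memory model is not identical to Ada's rules for atomic objects.

```c
#include <stdatomic.h>
#include <stddef.h>

#define MAX_TASKS 1000		/* mirrors the 0 .. 999 index range */

/* Hypothetical task control block.  */
struct task
{
  int id;
};

/* Each element is individually atomic, so a reader never observes a
   partially written pointer.  Static storage starts out as all null.  */
static _Atomic (struct task *) known_tasks[MAX_TASKS];

/* Publish T in SLOT, or clear the slot when T is NULL.  */
void
register_task (int slot, struct task *t)
{
  atomic_store (&known_tasks[slot], t);
}

/* Read the task currently published in SLOT, without taking a lock.  */
struct task *
lookup_task (int slot)
{
  return atomic_load (&known_tasks[slot]);
}
```

Only the individual element accesses are atomic here; a sequence of several reads or writes would still need its own synchronization.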




[Ada] Small tweak in a couple of comments

2021-06-30 Thread Pierre-Marie de Rodat
This makes the comments use the same syntax as -gnatD/G for freeze nodes.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_ch12.adb (Freeze_Subprogram_Body): Add missing "freeze".
(Install_Body): Likewise.

diff --git a/gcc/ada/sem_ch12.adb b/gcc/ada/sem_ch12.adb
--- a/gcc/ada/sem_ch12.adb
+++ b/gcc/ada/sem_ch12.adb
@@ -9106,7 +9106,7 @@ package body Sem_Ch12 is
  --  Handle the following case:
  --
  --package Parent_Inst is new ...
- --Parent_Inst []
+ --freeze Parent_Inst []
  --
  --procedure P ...  --  this body freezes Parent_Inst
  --
@@ -9942,7 +9942,7 @@ package body Sem_Ch12 is
--  Handle the following case:
 
--package Parent_Inst is new ...
-   --Parent_Inst []
+   --freeze Parent_Inst []
 
--procedure P ...  --  this body freezes Parent_Inst
 




[Ada] Remove an obsolete variant of Adjust_Name_Case used only by SPARK

2021-06-30 Thread Pierre-Marie de Rodat
GNATprove no longer calls an obsolete variant of Adjust_Name_Case that
uses a global buffer.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* errout.ads (Adjust_Name_Case): Remove obsolete and now unused
variant.
* errout.adb (Adjust_Name_Case): Likewise; fix variant that uses
a custom buffer to also use it for names in Standard_Location.

diff --git a/gcc/ada/errout.adb b/gcc/ada/errout.adb
--- a/gcc/ada/errout.adb
+++ b/gcc/ada/errout.adb
@@ -3402,7 +3402,7 @@ package body Errout is
  --  For standard locations, always use mixed case
 
  if Loc <= No_Location then
-Set_Casing (Mixed_Case);
+Set_Casing (Buf, Mixed_Case);
 
  else
 --  Determine if the reference we are dealing with corresponds to
@@ -3440,11 +3440,6 @@ package body Errout is
   end;
end Adjust_Name_Case;
 
-   procedure Adjust_Name_Case (Loc : Source_Ptr) is
-   begin
-  Adjust_Name_Case (Global_Name_Buffer, Loc);
-   end Adjust_Name_Case;
-
---
-- Set_Identifier_Casing --
---


diff --git a/gcc/ada/errout.ads b/gcc/ada/errout.ads
--- a/gcc/ada/errout.ads
+++ b/gcc/ada/errout.ads
@@ -985,10 +985,6 @@ package Errout is
--  the name at that source location, we copy the casing from the source,
--  otherwise we set appropriate default casing.
 
-   procedure Adjust_Name_Case (Loc : Source_Ptr);
-   --  Uses Buf => Global_Name_Buffer. There are no calls to this in the
-   --  compiler, but it is called in SPARK 2014.
-
procedure Set_Identifier_Casing
  (Identifier_Name : System.Address;
   File_Name   : System.Address);




[Ada] Do not catch 'N mod -1' in CodePeer_Mode

2021-06-30 Thread Pierre-Marie de Rodat
The special case used for catching the 'mod -1' operation is not useful
to CodePeer, and in fact may be detrimental to its precision. Remove
it in CodePeer_Mode.
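
As an illustration of why such a guard exists at all, here is a minimal C sketch. It assumes the concern is the most negative integer combined with a divisor of -1, which is undefined behaviour for `%` in C; `guarded_rem` is a hypothetical name, and C's `%` truncates like Ada's `rem`, so this only mirrors the shape of the special case, not Ada's `mod` semantics in general.

```c
#include <assert.h>
#include <limits.h>

/* Remainder with the divisor -1 handled up front.  INT_MIN % -1 is
   undefined in C because the matching quotient is not representable,
   so the result (always 0 for a divisor of -1) is returned directly.  */
int guarded_rem(int n, int d)
{
    assert(d != 0);        /* division by zero is out of scope here */
    if (d == -1)
        return 0;          /* every integer is a multiple of -1 */
    return n % d;
}
```

A static analyzer gains nothing from the extra branch, which fits the stated reason for dropping it in CodePeer_Mode.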

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_ch4.adb (Expand_N_Op_Mod): Remove special case for mod -1
in CodePeer_Mode.

diff --git a/gcc/ada/exp_ch4.adb b/gcc/ada/exp_ch4.adb
--- a/gcc/ada/exp_ch4.adb
+++ b/gcc/ada/exp_ch4.adb
@@ -9626,6 +9626,7 @@ package body Exp_Ch4 is
 
 if ((not ROK) or else (Rlo <= (-1) and then (-1) <= Rhi))
   and then ((not LOK) or else (Llo = LLB))
+  and then not CodePeer_Mode
 then
Rewrite (N,
  Make_If_Expression (Loc,




[Ada] Crash on limited array object with address clause

2021-06-30 Thread Pierre-Marie de Rodat
The compiler aborts on an object declaration for a limited array type
when the declaration includes an aggregate that must be built in place
and carries an aspect specification for the Address of the object.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_aggr.adb (Convert_Aggr_In_Object_Decl): After expansion of
the aggregate, the expression can be removed from the
declaration, except if the object is class-wide, in which case
the aggregate provides the actual type. In other cases the
presence of the expression may lead to spurious freezing issues.
* exp_ch3.adb (Expand_N_Object_Declaration): If the expression
in the declaration is an aggregate with delayed expansion (as is
the case for objects of a limited type, or a subsequent address
specification) the aggregate must be resolved at this point.
This resolution must not include expansion, because the
expansion of the enclosing declaration will construct the
necessary aggregate expansion.

diff --git a/gcc/ada/exp_aggr.adb b/gcc/ada/exp_aggr.adb
--- a/gcc/ada/exp_aggr.adb
+++ b/gcc/ada/exp_aggr.adb
@@ -4437,6 +4437,15 @@ package body Exp_Aggr is
   end;
 
   Set_No_Initialization (N);
+
+  --  After expansion the expression can be removed from the declaration
+  --  except if the object is class-wide, in which case the aggregate
+  --  provides the actual type.
+
+  if not Is_Class_Wide_Type (Etype (Obj)) then
+ Set_Expression (N, Empty);
+  end if;
+
   Initialize_Discriminants (N, Typ);
end Convert_Aggr_In_Object_Decl;
 


diff --git a/gcc/ada/exp_ch3.adb b/gcc/ada/exp_ch3.adb
--- a/gcc/ada/exp_ch3.adb
+++ b/gcc/ada/exp_ch3.adb
@@ -30,6 +30,7 @@ with Einfo;  use Einfo;
 with Einfo.Entities; use Einfo.Entities;
 with Einfo.Utils;use Einfo.Utils;
 with Errout; use Errout;
+with Expander;   use Expander;
 with Exp_Aggr;   use Exp_Aggr;
 with Exp_Atag;   use Exp_Atag;
 with Exp_Ch4;use Exp_Ch4;
@@ -6985,12 +6986,16 @@ package body Exp_Ch3 is
 --  happen when the aggregate is limited and the declared object
 --  has a following address clause; it happens also when generating
 --  C code for an aggregate that has an alignment or address clause
---  (see Analyze_Object_Declaration).
+--  (see Analyze_Object_Declaration). Resolution is done without
+--  expansion because it will take place when the declaration
+--  itself is expanded.
 
 if (Is_Limited_Type (Typ) or else Modify_Tree_For_C)
   and then not Analyzed (Expr)
 then
+   Expander_Mode_Save_And_Set (False);
Resolve (Expr, Typ);
+   Expander_Mode_Restore;
 end if;
 
 Convert_Aggr_In_Object_Decl (N);




[Ada] Fix the -gnatyr switch so it works in record rep clauses

2021-06-30 Thread Pierre-Marie de Rodat
The -gnatyr switch is supposed to generate a style warning if the case
of a usage name does not match that of the defining_identifier it
denotes. The warning was missing for component names appearing in record
representation clauses; this patch fixes that bug.
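
The casing comparison that the style check performs can be sketched in C as follows. This is a simplified illustration, not the GNAT implementation: `is_identifier_char` is a hypothetical stand-in for Identifier_Char, and the function assumes both arguments spell the same identifier up to letter case.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-in for Identifier_Char: letters, digits, underscore.  */
static int is_identifier_char(char c)
{
    return isalnum((unsigned char)c) || c == '_';
}

/* Return the index of the first character whose casing differs between
   the use (REF) and the definition (DEF), or -1 if the casing matches.  */
int first_casing_mismatch(const char *ref, const char *def)
{
    for (size_t i = 0; ; i++) {
        /* Both identifiers have the same length, so they end together.  */
        assert(is_identifier_char(ref[i]) == is_identifier_char(def[i]));

        if (!is_identifier_char(ref[i]))
            return -1;            /* end of identifiers, no mismatch */

        if (ref[i] != def[i])
            return (int)i;        /* case mismatch */
    }
}
```

A caller would emit the style warning whenever the result is not -1.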

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_ch13.adb (Analyze_Record_Representation_Clause): Call
Set_Entity_With_Checks instead of Set_Entity, so we perform the
check for correct casing.
* style.adb (Check_Identifier): Minor comment improvement.
Cleanup overly complicated code.

diff --git a/gcc/ada/sem_ch13.adb b/gcc/ada/sem_ch13.adb
--- a/gcc/ada/sem_ch13.adb
+++ b/gcc/ada/sem_ch13.adb
@@ -8561,7 +8561,7 @@ package body Sem_Ch13 is
 
  Generate_Reference
(Comp, Component_Name (CC), Set_Ref => False);
- Set_Entity (Component_Name (CC), Comp);
+ Set_Entity_With_Checks (Component_Name (CC), Comp);
 
  --  Update Fbit and Lbit to the actual bit number
 


diff --git a/gcc/ada/style.adb b/gcc/ada/style.adb
--- a/gcc/ada/style.adb
+++ b/gcc/ada/style.adb
@@ -136,48 +136,42 @@ package body Style is
 Tref := Source_Text (Get_Source_File_Index (Sref));
 Tdef := Source_Text (Get_Source_File_Index (Sdef));
 
---  Ignore operator name case completely. This also catches the
---  case of where one is an operator and the other is not. This
---  is a phenomenon from rewriting of operators as functions,
---  and is to be ignored.
+--  Ignore case of operator names. This also catches the case
+--  where one is an operator and the other is not. This is a
+--  phenomenon from rewriting of operators as functions, and is
+--  to be ignored.
 
 if Tref (Sref) = '"' or else Tdef (Sdef) = '"' then
return;
 
 else
-   while Tref (Sref) = Tdef (Sdef) loop
+   loop
+  --  If end of identifiers, all done. Note that they are the
+  --  same length.
 
-  --  If end of identifier, all done
+  pragma Assert
+(Identifier_Char (Tref (Sref)) =
+ Identifier_Char (Tdef (Sdef)));
 
   if not Identifier_Char (Tref (Sref)) then
  return;
-
-  --  Otherwise loop continues
-
-  else
- Sref := Sref + 1;
- Sdef := Sdef + 1;
   end if;
-   end loop;
 
-   --  Fall through loop when mismatch between identifiers
-   --  If either identifier is not terminated, error.
+  --  Case mismatch
 
-   if Identifier_Char (Tref (Sref))
-or else
-  Identifier_Char (Tdef (Sdef))
-   then
-  Error_Msg_Node_1 := Def;
-  Error_Msg_Sloc := Sloc (Def);
-  Error_Msg -- CODEFIX
-("(style) bad casing of & declared#", Sref, Ref);
-  return;
+  if Tref (Sref) /= Tdef (Sdef) then
+ Error_Msg_Node_1 := Def;
+ Error_Msg_Sloc := Sloc (Def);
+ Error_Msg -- CODEFIX
+   ("(style) bad casing of & declared#", Sref, Ref);
+ return;
+  end if;
 
-   --  Else end of identifiers, and they match
+  Sref := Sref + 1;
+  Sdef := Sdef + 1;
+   end loop;
 
-   else
-  return;
-   end if;
+   pragma Assert (False);
 end if;
  end if;
 




[Ada] Make copies of entities being declared when copying block

2021-06-30 Thread Pierre-Marie de Rodat
When we make a copy of a tree containing a block, we need to make new
entities for variables declared in the block.  If not, the entity
points to the wrong declaration, which is an invalid tree and can
cause issues when we need static links and that variable is an uplevel
reference.  There may also be latent issues if the variable is
actually in the tree twice.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_util.adb (Visit_Node): Add handling for N_Block_Statement
with declarations.

diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -24344,6 +24344,26 @@ package body Sem_Util is
 EWA_Inner_Scope_Level := EWA_Inner_Scope_Level + 1;
  end if;
 
+ --  If the node is a block, we need to process all declarations
+ --  in the block and make new entities for each.
+
+ if Nkind (N) = N_Block_Statement and then Present (Declarations (N))
+ then
+declare
+   Decl : Node_Id := First (Declarations (N));
+
+begin
+   while Present (Decl) loop
+  if Nkind (Decl) = N_Object_Declaration then
+ Add_New_Entity (Defining_Identifier (Decl),
+ New_Copy (Defining_Identifier (Decl)));
+  end if;
+
+  Next (Decl);
+   end loop;
+end;
+ end if;
+
  declare
 procedure Action (U : Union_Id);
 procedure Action (U : Union_Id) is




[Ada] Expose symmetry between Known_ and Unknown_ query routines

2021-06-30 Thread Pierre-Marie de Rodat
We have two families of routines to query entity properties: Known_XXX
and Unknown_XXX. They now simply negate each other instead of negating
their complex conditions.

Code cleanup only related to handling of Alignment in GNATprove;
semantics is unaffected.
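
The pattern is simple enough to show in a few lines of C. This is a sketch under assumptions: `struct entity` and its single `alignment` field are hypothetical, and treating 0 as "not set" is chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical entity with one queryable property; 0 means "not set".  */
struct entity {
    unsigned alignment;
};

/* The one place that encodes what "known" means.  */
bool known_alignment(const struct entity *e)
{
    return e->alignment != 0;
}

/* Defined purely as the negation, so the two queries cannot drift apart.  */
bool unknown_alignment(const struct entity *e)
{
    return !known_alignment(e);
}
```

Any later change to the condition then needs to be made in one routine only.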

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* einfo-utils.adb
(Unknown_Alignment): Simply negate the Known_ counterpart.
(Unknown_Component_Bit_Offset): Likewise.
(Unknown_Esize): Likewise.
(Unknown_Normalized_First_Bit): Likewise.
(Unknown_Normalized_Position): Likewise.
(Unknown_Normalized_Position_Max): Likewise.
(Unknown_RM_Size): Likewise.

diff --git a/gcc/ada/einfo-utils.adb b/gcc/ada/einfo-utils.adb
--- a/gcc/ada/einfo-utils.adb
+++ b/gcc/ada/einfo-utils.adb
@@ -593,13 +593,12 @@ package body Einfo.Utils is
 
function Unknown_Alignment (E : Entity_Id) return B is
begin
-  return Alignment (E) = Uint_0
-or else Alignment (E) = No_Uint;
+  return not Known_Alignment (E);
end Unknown_Alignment;
 
function Unknown_Component_Bit_Offset  (E : Entity_Id) return B is
begin
-  return Component_Bit_Offset (E) = No_Uint;
+  return not Known_Component_Bit_Offset (E);
end Unknown_Component_Bit_Offset;
 
function Unknown_Component_Size(E : Entity_Id) return B is
@@ -609,32 +608,27 @@ package body Einfo.Utils is
 
function Unknown_Esize (E : Entity_Id) return B is
begin
-  return Esize (E) = No_Uint
-   or else
- Esize (E) = Uint_0;
+  return not Known_Esize (E);
end Unknown_Esize;
 
function Unknown_Normalized_First_Bit  (E : Entity_Id) return B is
begin
-  return Normalized_First_Bit (E) = No_Uint;
+  return not Known_Normalized_First_Bit (E);
end Unknown_Normalized_First_Bit;
 
function Unknown_Normalized_Position   (E : Entity_Id) return B is
begin
-  return Normalized_Position (E) = No_Uint;
+  return not Known_Normalized_Position (E);
end Unknown_Normalized_Position;
 
function Unknown_Normalized_Position_Max   (E : Entity_Id) return B is
begin
-  return Normalized_Position_Max (E) = No_Uint;
+  return not Known_Normalized_Position_Max (E);
end Unknown_Normalized_Position_Max;
 
function Unknown_RM_Size   (E : Entity_Id) return B is
begin
-  return (RM_Size (E) = Uint_0
-and then not Is_Discrete_Type (E)
-and then not Is_Fixed_Point_Type (E))
-or else RM_Size (E) = No_Uint;
+  return not Known_RM_Size (E);
end Unknown_RM_Size;
 





Re: [RFC/PATCH v3] ira: Support more matching constraint forms with param [PR100328]

2021-06-30 Thread Kewen.Lin via Gcc-patches
on 2021/6/30 4:53 PM, Hongtao Liu wrote:
> On Mon, Jun 28, 2021 at 3:27 PM Kewen.Lin  wrote:
>>
>> on 2021/6/28 3:20 PM, Hongtao Liu wrote:
>>> On Mon, Jun 28, 2021 at 3:12 PM Hongtao Liu  wrote:

 On Mon, Jun 28, 2021 at 2:50 PM Kewen.Lin  wrote:
>
> Hi!
>
> on 2021/6/9 1:18 PM, Kewen.Lin via Gcc-patches wrote:
>> Hi,
>>
>> PR100328 has some details about this issue, I am trying to
>> brief it here.  In the hottest function LBM_performStreamCollideTRT
>> of SPEC2017 bmk 519.lbm_r, there are many FMA style expressions
>> (27 FMA, 19 FMS, 11 FNMA).  On rs6000, this kind of FMA style
>> insn has two flavors: FLOAT_REG and VSX_REG, the VSX_REG reg
>> class have 64 registers whose foregoing 32 ones make up the
>> whole FLOAT_REG.  There are some differences for these two
>> flavors, taking "*fma4_fpr" as example:
>>
>> (define_insn "*fma4_fpr"
>>   [(set (match_operand:SFDF 0 "gpc_reg_operand" "=,wa,wa")
>>   (fma:SFDF
>> (match_operand:SFDF 1 "gpc_reg_operand" "%,wa,wa")
>> (match_operand:SFDF 2 "gpc_reg_operand" ",wa,0")
>> (match_operand:SFDF 3 "gpc_reg_operand" ",0,wa")))]
>>
>> // wa => A VSX register (VSR), vs0…vs63, aka. VSX_REG.
>> //  (f/d) => A floating point register, aka. FLOAT_REG.
>>
>> So for VSX_REG, we only have the destructive form, when VSX_REG
>> alternative being used, the operand 2 or operand 3 is required
>> to be the same as operand 0.  reload has to take care of this
>> constraint and create some non-free register copies if required.
>>
>> Assuming one fma insn looks like:
>>   op0 = FMA (op1, op2, op3)
>>
>> The best regclass of them are VSX_REG, when op1,op2,op3 are all dead,
>> IRA simply creates three shuffle copies for them (here the operand
>> order matters, since with the same freq, the one with smaller number
>> takes preference), but IMO both op2 and op3 should take higher priority
>> in copy queue due to the matching constraint.
>>
>> I noticed that there is one function ira_get_dup_out_num, which meant
>> to create this kind of constraint copy, but the below code looks to
>> refuse to create if there is an alternative which has valid regclass
>> without spilled need.
>>
>>   default:
>>   {
>> enum constraint_num cn = lookup_constraint (str);
>> enum reg_class cl = reg_class_for_constraint (cn);
>> if (cl != NO_REGS
>> && !targetm.class_likely_spilled_p (cl))
>>   goto fail
>>
>>...
>>
>> I cooked one patch attached to make ira respect this kind of matching
>> constraint guarded with one parameter.  As I stated in the PR, I was
>> not sure this is on the right track.  The RFC patch is to check the
>> matching constraint in all alternatives, if there is one alternative
>> with matching constraint and matches the current preferred regclass
>> (or best of allocno?), it will record the output operand number and
>> further create one constraint copy for it.  Normally it can get the
>> priority against shuffle copies and the matching constraint will get
>> satisfied with higher possibility, reload doesn't create extra copies
>> to meet the matching constraint or the desirable register class when
>> it has to.
>>
>> For FMA A,B,C,D, I think ideally copies A/B, A/C, A/D can firstly stay
>> as shuffle copies, and later any of A,B,C,D gets assigned by one
>> hardware register which is a VSX register (VSX_REG) but not a FP
>> register (FLOAT_REG), which means it has to pay costs once we can NOT
>> go with VSX alternatives, so at that time it's important to respect
>> the matching constraint then we can increase the freq for the remaining
>> copies related to this (A/B, A/C, A/D).  This idea requires some side
>> tables to record some information and seems a bit complicated in the
>> current framework, so the proposed patch aggressively emphasizes the
>> matching constraint at the time of creating copies.
>>
>
> Comparing with the original patch (v1), this patch v3 has
> considered: (this should be v2 for this mail list, but bump
> it to be consistent as PR's).
>
>   - Excluding the case where for one preferred register class
> there can be two or more alternatives, one of them has the
> matching constraint, while another doesn't have.  So for
> the given operand, even if it's assigned by a hardware reg
> which doesn't meet the matching constraint, it can simply
> use the alternative which doesn't have matching constraint
> so no register move is needed.  One typical case is
> define_insn *mov_internal2 on rs6000.  So we
> shouldn't create constraint copy for it.
>
>   - The possible free register m

Re: [PATCH] Port GCC documentation to Sphinx

2021-06-30 Thread Martin Liška

On 6/29/21 6:57 PM, Eli Zaretskii wrote:

From: Martin Liška 
Date: Tue, 29 Jun 2021 12:09:23 +0200
Cc: GCC Development , gcc-patches@gcc.gnu.org

On 6/28/21 5:33 PM, Joseph Myers wrote:

Are formatted manuals (HTML, PDF, man, info) corresponding to this patch
version also available for review?


I've just uploaded them here:
https://splichal.eu/gccsphinx-final/


Thanks.


Hey!



I'm an Info junkie, so I grabbed gcc.info from there and skimmed
through it.  Please allow me a few unsolicited comments:


I really welcome them!



1. It sounds like Sphinx is heavily biased towards HTML format, and as
a result uglifies the Info format?


Hopefully not :)



For example, many cross-references (AFAIU introduced as part of
migration to Sphinx) make the text illegible in Emacs.  Example:

   This standard, in both its forms, is commonly known as `C89', or
   occasionally as `C90', from the dates of ratification.  To select this
   standard in GCC, use one of the options *note -ansi *note -std
   .‘=c90’ or *note -std.‘=iso9899:1990’; to obtain all the diagnostics
   required by the standard, you should also specify *note -pedantic.
   (or *note -pedantic-errors. if you want them to be errors rather
   than warnings).  See *note Options Controlling C Dialect.
   [...]
   An amendment to the 1990 standard was published in 1995.  This amendment
   added digraphs and ‘__STDC_VERSION__’ to the language, but otherwise
   concerned the library.  This amendment is commonly known as `AMD1'; the
   amended standard is sometimes known as `C94' or `C95'.  To select this
   standard in GCC, use the option *note -std.‘=iso9899:199409’ (with,
   as for other standard versions, *note -pedantic. to receive all
   required diagnostics).

Or how about this:

   `Overall Options'

See Options Controlling the Kind of Output.

*note -c. *note -S. *note -E. *note -o. ‘`file'’
*note -dumpbase. ‘`dumpbase'’ *note -dumpbase-ext.
‘`auxdropsuf'’ *note -dumpdir. ‘`dumppfx'’ ‘-x’ ‘`language'’
*note -v. *note -###. *note –help.‘[=`class'[,...]]’
*note –target-help. *note –version. *note -pass-exit-codes
. *note -pipe. *note -specs.‘=`file'’ *note -wrapper
.‘@`file'’ *note -ffile-prefix-map.‘=`old'=`new'’ *note
-fplugin.‘=`file'’ ‘-fplugin-arg-’‘`name'=`arg'’
‘-fdump-ada-spec’‘[-`slim']’ *note -fada-spec-parent.‘=`unit'’
*note -fdump-go-spec.‘=`file'’

In the first line, the emphasis became quotes, which sounds sub-optimal.
In the second line, the hyperlink was lost.
And the rest is not really readable.

Compare this with the original:

   _Overall Options_
*Note Options Controlling the Kind of Output.
-c  -S  -E  -o FILE  -x LANGUAGE
-v  -###  --help[=CLASS[,...]]  --target-help  --version
-pass-exit-codes  -pipe  -specs=FILE  -wrapper
@FILE  -ffile-prefix-map=OLD=NEW
-fplugin=FILE  -fplugin-arg-NAME=ARG
-fdump-ada-spec[-slim]  -fada-spec-parent=UNIT  -fdump-go-spec=FILE

(Admittedly, Emacs by default hides some of the text of a
cross-reference, but not hiding them in this case produces an even
less legible text.)


If I'm correct, it's exactly what's documented in Sphinx FAQ here:
https://www.sphinx-doc.org/en/master/faq.html#displaying-links

and there's a suggested Emacs code snippet that should help with links.
Does it help?



In general, it is a well-known rule that Texinfo documentation should
NOT use @ref{foo} as if @ref will disappear without a trace, leaving
just the hyperlink to 'foo'.  Looks like the rewritten manual uses
that a lot.

This "see" consistently gets in the way throughout the entire manual.
A few more examples:

-- Option: -flocal-ivars

Default option value for *note -fno-local-ivars.
...
For example *note -std.‘=gnu90 -Wpedantic’ warns about C++ style
‘//’ comments, while *note -std.‘=gnu99 -Wpedantic’ does not.
...
If this option is not provided but *note -Wabi.‘=`n'’ is, that
version is used for compatibility aliases.
...
Below *note -std.‘=c++20’, *note -fconcepts. enables
support for the C++ Extensions for Concepts Technical
Specification, ISO 19217 (2015).
...
   gcov [ *note -v. | *note –version. ] [ ‘-h’ | *note –help. ]


2. The translation of @var produces double-quoting in Info, here's an
example:

   The usual way to run GCC is to run the executable called ‘gcc’, or
   ‘`machine'-gcc’ when cross-compiling, or ‘`machine'-gcc-`version'’ to
   run a specific version of GCC.

vs, the old

The usual way to run GCC is to run the executable called 'gcc', or
   'MACHINE-gcc' when cross-compiling, or 'MACHINE-gcc-VERSION' to run a
   specific version of GCC.

I think the new variant is less readable and more confusing, because
it isn't clear whether the quotes are part of the text.  Here's an
extreme example:

   ‘@`file'’

Read command-line options

Re: [RFC/PATCH v3] ira: Support more matching constraint forms with param [PR100328]

2021-06-30 Thread Hongtao Liu via Gcc-patches
On Wed, Jun 30, 2021 at 5:42 PM Kewen.Lin  wrote:
>
> on 2021/6/30 4:53 PM, Hongtao Liu wrote:
> > On Mon, Jun 28, 2021 at 3:27 PM Kewen.Lin  wrote:
> >>
> >> on 2021/6/28 3:20 PM, Hongtao Liu wrote:
> >>> On Mon, Jun 28, 2021 at 3:12 PM Hongtao Liu  wrote:
> 
>  On Mon, Jun 28, 2021 at 2:50 PM Kewen.Lin  wrote:
> >
> > Hi!
> >
> > on 2021/6/9 1:18 PM, Kewen.Lin via Gcc-patches wrote:
> >> Hi,
> >>
> >> PR100328 has some details about this issue, I am trying to
> >> brief it here.  In the hottest function LBM_performStreamCollideTRT
> >> of SPEC2017 bmk 519.lbm_r, there are many FMA style expressions
> >> (27 FMA, 19 FMS, 11 FNMA).  On rs6000, this kind of FMA style
> >> insn has two flavors: FLOAT_REG and VSX_REG, the VSX_REG reg
> >> class have 64 registers whose foregoing 32 ones make up the
> >> whole FLOAT_REG.  There are some differences for these two
> >> flavors, taking "*fma4_fpr" as example:
> >>
> >> (define_insn "*fma4_fpr"
> >>   [(set (match_operand:SFDF 0 "gpc_reg_operand" "=,wa,wa")
> >>   (fma:SFDF
> >> (match_operand:SFDF 1 "gpc_reg_operand" "%,wa,wa")
> >> (match_operand:SFDF 2 "gpc_reg_operand" ",wa,0")
> >> (match_operand:SFDF 3 "gpc_reg_operand" ",0,wa")))]
> >>
> >> // wa => A VSX register (VSR), vs0…vs63, aka. VSX_REG.
> >> //  (f/d) => A floating point register, aka. FLOAT_REG.
> >>
> >> So for VSX_REG, we only have the destructive form, when VSX_REG
> >> alternative being used, the operand 2 or operand 3 is required
> >> to be the same as operand 0.  reload has to take care of this
> >> constraint and create some non-free register copies if required.
> >>
> >> Assuming one fma insn looks like:
> >>   op0 = FMA (op1, op2, op3)
> >>
> >> The best regclass of them are VSX_REG, when op1,op2,op3 are all dead,
> >> IRA simply creates three shuffle copies for them (here the operand
> >> order matters, since with the same freq, the one with smaller number
> >> takes preference), but IMO both op2 and op3 should take higher priority
> >> in copy queue due to the matching constraint.
> >>
> >> I noticed that there is one function ira_get_dup_out_num, which meant
> >> to create this kind of constraint copy, but the below code looks to
> >> refuse to create if there is an alternative which has valid regclass
> >> without spilled need.
> >>
> >>   default:
> >>   {
> >> enum constraint_num cn = lookup_constraint (str);
> >> enum reg_class cl = reg_class_for_constraint (cn);
> >> if (cl != NO_REGS
> >> && !targetm.class_likely_spilled_p (cl))
> >>   goto fail
> >>
> >>...
> >>
> >> I cooked one patch attached to make ira respect this kind of matching
> >> constraint guarded with one parameter.  As I stated in the PR, I was
> >> not sure this is on the right track.  The RFC patch is to check the
> >> matching constraint in all alternatives, if there is one alternative
> >> with matching constraint and matches the current preferred regclass
> >> (or best of allocno?), it will record the output operand number and
> >> further create one constraint copy for it.  Normally it can get the
> >> priority against shuffle copies and the matching constraint will get
> >> satisfied with higher possibility, reload doesn't create extra copies
> >> to meet the matching constraint or the desirable register class when
> >> it has to.
> >>
> >> For FMA A,B,C,D, I think ideally copies A/B, A/C, A/D can firstly stay
> >> as shuffle copies, and later any of A,B,C,D gets assigned by one
> >> hardware register which is a VSX register (VSX_REG) but not a FP
> >> register (FLOAT_REG), which means it has to pay costs once we can NOT
> >> go with VSX alternatives, so at that time it's important to respect
> >> the matching constraint then we can increase the freq for the remaining
> >> copies related to this (A/B, A/C, A/D).  This idea requires some side
> >> tables to record some information and seems a bit complicated in the
> >> current framework, so the proposed patch aggressively emphasizes the
> >> matching constraint at the time of creating copies.
> >>
> >
> > Comparing with the original patch (v1), this patch v3 has
> > considered: (this should be v2 for this mail list, but bump
> > it to be consistent as PR's).
> >
> >   - Excluding the case where for one preferred register class
> > there can be two or more alternatives, one of them has the
> > matching constraint, while another doesn't have.  So for
> > the given operand, even if it's assigned by a hardware reg
> > which doesn't meet the matching constraint, it can simply
> > use the alternative which doe

Re: [PATCH] Port GCC documentation to Sphinx

2021-06-30 Thread Richard Earnshaw via Gcc-patches




On 30/06/2021 05:47, Martin Liška wrote:

On 6/29/21 12:50 PM, Richard Earnshaw wrote:



On 29/06/2021 11:09, Martin Liška wrote:

On 6/28/21 5:33 PM, Joseph Myers wrote:
Are formatted manuals (HTML, PDF, man, info) corresponding to this patch
version also available for review?


I've just uploaded them here:
https://splichal.eu/gccsphinx-final/

Martin



In the HTML version of the gcc manual the sidebar has an "Option 
index" link but no link to the general index.  When you follow that 
link the page contents is just a link to the "index" where everything 
is all lumped together.


If we can't have separate indexes for options and general entries, I 
think it would make more sense for the Option index link to be removed 
entirely.


Fully agree with you. Thanks for the feedback and I've changed that to 
the standard Sphinx section,
see e.g. 
https://splichal.eu/gccsphinx-final/html/gcc/indices-and-tables.html


Martin



R.





Thanks.  Given that the manual is nominally in American English, it 
might be better to use the term "indexes" rather than "indices".


https://grammarist.com/usage/indexes-indices/

R.


[Patch] gcc.c's check_offload_target_name: Fixes to inform hints

2021-06-30 Thread Tobias Burnus

As discussed at IRC:

* Replace alloca by XALLOCAVEC - and while being there, do it in the whole file.
* Fix splitting OFFLOAD_TARGETS at the ',' for the candidate list
* More helpful inform if no targets have been configured.
* For -foffload-options=, the 'target' argument may be 'nvptx,amdgcn' with
  len = 5. That worked fine, except that it failed to produce a hint. Now
  I (re)introduced target2.
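
The two string fixes can be sketched outside of GCC as follows. This is a minimal illustration, not the patch itself: it uses caller-provided buffers and `malloc` where gcc.c uses XALLOCAVEC, and `split_targets` and `copy_prefix` are hypothetical helper names. Each entry gets its own pointer into the copied buffer, whereas the removed loop pushed the same `cand` pointer for every candidate.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Split the comma-separated LIST into at most MAX entries stored in OUT.
   BUF must hold strlen (list) + 1 bytes; the entries point into BUF.
   Returns the number of entries found.  */
size_t split_targets(const char *list, char *buf, const char **out, size_t max)
{
    size_t n = 0;

    memcpy(buf, list, strlen(list) + 1);   /* strtok needs a writable copy */
    for (char *c = strtok(buf, ","); c != NULL && n < max;
         c = strtok(NULL, ","))
        out[n++] = c;
    return n;
}

/* Copy the first LEN bytes of TARGET into a fresh NUL-terminated string,
   for callers that pass a name which is not '\0' at [len].  The caller
   frees the result.  */
char *copy_prefix(const char *target, size_t len)
{
    char *copy = malloc(len + 1);

    if (copy == NULL)
        return NULL;
    memcpy(copy, target, len);
    copy[len] = '\0';
    return copy;
}
```

With a target argument of 'nvptx,amdgcn' and len = 5, `copy_prefix` yields just the first name, which is the role target2 plays in the patch.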

Tobias

gcc.c's check_offload_target_name: Fixes to inform hints

gcc/ChangeLog:

	* gcc.c (close_at_file, execute): Replace alloca by XALLOCAVEC.
	(check_offload_target_name): Fix splitting OFFLOAD_TARGETS into
	a candidate list; better inform no offload target is configured
	and fix hint extraction when passed target is not '\0' at [len].
	* common.opt (foffload, foffload-options): Add trailing '.'.

 gcc/common.opt |  4 ++--
 gcc/gcc.c  | 42 --
 2 files changed, 22 insertions(+), 24 deletions(-)

diff --git a/gcc/common.opt b/gcc/common.opt
index eaee74c580a..847ff98c992 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2104,11 +2104,11 @@ Support synchronous non-call exceptions.
 ; -foffload== is supported for backward compatibility
 foffload=
 Driver Joined MissingArgError(targets missing after %qs)
--foffload=	Specify offloading targets
+-foffload=	Specify offloading targets.
 
 foffload-options=
 Common Driver Joined MissingArgError(options or targets=options missing after %qs)
--foffload==	Specify options for the offloading targets
+-foffload==	Specify options for the offloading targets.
 
 foffload-abi=
 Common Joined RejectNegative Enum(offload_abi) Var(flag_offload_abi) Init(OFFLOAD_ABI_UNSET)
diff --git a/gcc/gcc.c b/gcc/gcc.c
index 9baa7d67c76..f802148e567 100644
--- a/gcc/gcc.c
+++ b/gcc/gcc.c
@@ -2242,7 +2242,7 @@ close_at_file (void)
   if (n_args == 0)
 return;
 
-  char **argv = (char **) alloca (sizeof (char *) * (n_args + 1));
+  char **argv = XALLOCAVEC (char *, n_args + 1);
   char *temp_file = make_at_file ();
   char *at_argument = concat ("@", temp_file, NULL);
   FILE *f = fopen (temp_file, "w");
@@ -3251,7 +3251,7 @@ execute (void)
   n_commands++;
 
   /* Get storage for each command.  */
-  commands = (struct command *) alloca (n_commands * sizeof (struct command));
+  commands = XALLOCAVEC (struct command, n_commands);
 
   /* Split argbuf into its separate piped processes,
  and record info about each one.
@@ -3430,13 +3430,13 @@ execute (void)
 struct pex_time *times = NULL;
 int ret_code = 0;
 
-statuses = (int *) alloca (n_commands * sizeof (int));
+statuses = XALLOCAVEC (int, n_commands);
 if (!pex_get_status (pex, n_commands, statuses))
   fatal_error (input_location, "failed to get exit status: %m");
 
 if (report_times || report_times_to_file)
   {
-	times = (struct pex_time *) alloca (n_commands * sizeof (struct pex_time));
+	times = XALLOCAVEC (struct pex_time, n_commands);
 	if (!pex_get_times (pex, n_commands, times))
 	  fatal_error (input_location, "failed to get process times: %m");
   }
@@ -3997,24 +3997,22 @@ check_offload_target_name (const char *target, ptrdiff_t len)
 {
   char *s;
   auto_vec candidates;
-  char *cand = (char *) alloca (strlen (OFFLOAD_TARGETS) + 1);
-  c = OFFLOAD_TARGETS;
-  while (c)
-	{
-	  n = strchr (c, ',');
-	  if (n == NULL)
-	n = strchr (c, '\0');
-	  if (n - c == 0)
-	break;
-	  strncpy (cand, c, n - c);
-	  cand[n - c] = '\0';
-	  candidates.safe_push (cand);
-	  c = *n ? n + 1 : NULL;
-	}
-  error ("GCC is not configured to support %q.*s as offload target",
-	 (int) len, target);
-  const char *hint = candidates_list_and_hint (target, s, candidates);
-  if (hint)
+  size_t olen = strlen (OFFLOAD_TARGETS) + 1;
+  char *cand = XALLOCAVEC (char, olen);
+  memcpy (cand, OFFLOAD_TARGETS, olen);
+  for (c = strtok (cand, ","); c; c = strtok (NULL, ","))
+	candidates.safe_push (c);
+
+  char *target2 = XALLOCAVEC (char, len + 1);
+  memcpy (target2, target, len);
+  target2[len] = '\0';
+
+  error ("GCC is not configured to support %qs as offload target", target2);
+
+  const char *hint = candidates_list_and_hint (target2, s, candidates);
+  if (candidates.is_empty ())
+	inform (UNKNOWN_LOCATION, "no offloading targets configured");
+  else if (hint)
 	inform (UNKNOWN_LOCATION,
 		"valid offload targets are: %s; did you mean %qs?", s, hint);
   else


Re: [Patch] gcc.c's check_offload_target_name: Fixes to inform hints

2021-06-30 Thread Jakub Jelinek via Gcc-patches
On Wed, Jun 30, 2021 at 12:19:17PM +0200, Tobias Burnus wrote:
> gcc.c's check_offload_target_name: Fixes to inform hints
> 
> gcc/ChangeLog:
> 
>   * gcc.c (close_at_file, execute): Replace alloca by XALLOCAVEC.
>   (check_offload_target_name): Fix splitting OFFLOAD_TARGETS into
>   a candidate list; better inform no offload target is configured
>   and fix hint extraction when passed target is not '\0' at [len].
>   * common.opt (foffload, foffload-options): Add trailing '.'.

Ok, thanks.

Jakub
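
For readers following the check_offload_target_name change: a minimal standalone sketch of the two ideas in the patch, namely splitting the comma-separated list with strtok on a private copy, and copying the first len bytes of a name that is not NUL-terminated at [len]. The helper names split_targets and bounded_name are hypothetical, and std::vector/std::string stand in for GCC's auto_vec and XALLOCAVEC buffers.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Split a comma-separated list into candidate names.  strtok modifies its
// argument, so work on a private, NUL-terminated copy, as the patch does
// with its memcpy into an alloca'd buffer.
std::vector<std::string>
split_targets (const char *list)
{
  std::vector<char> buf (list, list + std::strlen (list) + 1);
  std::vector<std::string> out;
  for (char *c = std::strtok (buf.data (), ","); c;
       c = std::strtok (nullptr, ","))
    out.push_back (c);
  return out;
}

// TARGET need not be NUL-terminated at [len]; copy exactly LEN bytes so the
// result can be used wherever a plain C string is expected.
std::string
bounded_name (const char *target, std::size_t len)
{
  return std::string (target, len);
}
```

An empty list yields no candidates, which corresponds to the new "no offloading targets configured" note in the patch.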



Re: [PATCH 0/4] openacc: Async fixes

2021-06-30 Thread Julian Brown
On Wed, 30 Jun 2021 10:28:00 +0200
Thomas Schwinge  wrote:

> >  - The OpenACC profiling-interface implementation did not measure
> >asynchronous operations properly.  
> 
> We'll need to be careful: (possibly, an older version of) that one we
> internally had identified to be causing some issues; see the
> "acc_prof-parallel-1.c intermittent failure on og10 branch" emails,
> 2020-07.

Hmm, I'll check those.

> >  - Several test cases misuse OpenACC asynchronous support (more race
> >conditions).  
> 
> Mostly ACK, but some more changes may be necessary; please see
> <http://mid.mail-archive.com/87sg1s9s9l.fsf@euler.schwinge.homeip.net>
> (you were CCed).

Thanks -- these test changes have been floating around uncommitted for
too long already, I guess...

> >  .../libgomp.oacc-c-c++-common/deep-copy-10.c  |  14 +-  
> 
> Please provide some detail about that one ("Fix async behaviour").
> It's not obvious to me what's wrong with the current version (but I
> haven't really spent time on that yet).

Aha, well I didn't see what was wrong with it either when I wrote the
test!

The problem is that we have copyin/modify-on-target/copyout operations
that process the *same data* from different async streams on successive
loop iterations. Those async streams are independent from one another,
so depending on how they are scheduled, we can be copying-in on one
async stream whilst simultaneously copying-out on another async stream
-- so of course, the data gets corrupted.

So the fix makes sure that each async stream only operates on "its own"
data. The increase in number of loop iterations was specifically to
tickle the flaw in the workaround used for GCN wrt. the ephemeral
copies -- i.e. snapshotting all host data immediately.

HTH,

Julian
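
A rough sketch of the "each async stream only operates on its own data" rule described above, not the actual deep-copy-10.c test. The names bump_chunks, n_queues and chunk are hypothetical. It assumes an OpenACC compiler (for GCC, -fopenacc); without one the pragmas are ignored and the loops simply run serially on the host.

```cpp
#include <vector>

constexpr int n_queues = 4;   // hypothetical number of async queues
constexpr int chunk = 256;    // hypothetical elements per queue

// In each round, queue Q copies in, updates and copies out only chunk Q of
// DATA, so no two queues ever move the same host memory at the same time.
// The wait at the end of a round keeps successive rounds from overlapping.
int
bump_chunks (std::vector<int> &data, int rounds)
{
  int *base = data.data ();
  for (int r = 0; r < rounds; ++r)
    {
      for (int q = 0; q < n_queues; ++q)
        {
          int *p = base + q * chunk;
#pragma acc parallel loop copy(p[0:chunk]) async(q)
          for (int i = 0; i < chunk; ++i)
            p[i] += 1;
        }
#pragma acc wait
    }

  int sum = 0;
  for (int v : data)
    sum += v;
  return sum;
}
```

The broken pattern would be the same loop with every queue copying the whole of data: a copy-in on one queue can then run while another queue is still copying out.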


Re: [PATCH] Port GCC documentation to Sphinx

2021-06-30 Thread Martin Liška

On 6/30/21 12:11 PM, Martin Liška wrote:

> Seems correct to me, but it's likely not. Let me investigate that.


It's a real issue in Sphinx. I've just addressed that with:
https://github.com/sphinx-doc/sphinx/pull/9390

Cheers,
Martin


Re: [RFC PATCH] Change the type of predicates to bool.

2021-06-30 Thread Richard Biener via Gcc-patches
On Wed, Jun 30, 2021 at 10:47 AM Uros Bizjak via Gcc-patches
 wrote:
>
> This RFC patch changes the type of predicates to bool. However, some
> of the targets (e.g. x86) use indirect functions to call the
> predicates, so without the local change, the build fails. Putting the
> patch through CI bots should weed out the problems, but I have no
> infrastructure to do it myself.

I'd say thanks for the work - note building some cc1 crosses should
catch 99% of the fallout (just configure $target-linux/elf and make all-gcc)

Richard.

> 2021-06-30  Uroš Bizjak  
>
> gcc/
> * genpreds.c (write_predicate_subfunction):
> Change the type of written subfunction to bool.
> (write_one_predicate_function):
> Change the type of written function to bool.
> (write_tm_preds_h): Ditto.
> * recog.h (*insn_operand_predicate_fn): Change the type to bool.
> * recog.c (general_operand): Change the type to bool.
> (address_operand): Ditto.
> (register_operand): Ditto.
> (pmode_register_operand): Ditto.
> (scratch_operand): Ditto.
> (immediate_operand): Ditto.
> (const_int_operand): Ditto.
> (const_scalar_int_operand): Ditto.
> (const_double_operand): Ditto.
> (nonimmediate_operand): Ditto.
> (nonmemory_operand): Ditto.
> (push_operand): Ditto.
> (pop_operand): Ditto.
> (memory_operand): Ditto.
> (indirect_operand): Ditto.
> (ordered_comparison_operator): Ditto.
> (comparison_operator): Ditto.
>
> * config/i386/i386-expand.c (ix86_expand_sse_cmp):
> Change the type of indirect predicate function to bool.
>
> Patch was bootstrapped on x86_64-linux-gnu.
>
> Comments welcome.
>
> Uros.
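
To illustrate why targets that call predicates indirectly need a matching local change: in C++ a function pointer's type includes the return type, so once predicates return bool, every pointer type used to call them must say bool as well. This is a reduced sketch with stand-in types; rtx_def, the two-value machine_mode enum, nonnull_operand and check_operand are hypothetical and not the real GCC declarations.

```cpp
// Hypothetical stand-ins for GCC's rtx and machine_mode.
struct rtx_def { int code; };
typedef rtx_def *rtx;
enum machine_mode { VOIDmode, SImode };

// After the change a predicate returns bool instead of int ...
bool
nonnull_operand (rtx op, machine_mode)
{
  return op != nullptr;
}

// ... so the pointer type used for indirect calls has to return bool too.
typedef bool (*operand_predicate_fn) (rtx, machine_mode);

bool
check_operand (operand_predicate_fn pred, rtx op, machine_mode mode)
{
  return pred (op, mode);
}

// A pointer still declared with the old return type no longer accepts the
// predicate; uncommenting the next line is a compile-time error:
// int (*old_fn) (rtx, machine_mode) = nonnull_operand;
```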


[PATCH] tree-optimization/101267 - fix SLP vect with masked operations

2021-06-30 Thread Richard Biener
This fixes the missed handling of external/constant mask SLP
operations, for the testcase in particular masked loads.  The
patch adjusts the vect_check_scalar_mask API to reflect the
required vect_is_simple_use SLP compatible API plus adjusts
for the special handling of masked loads in SLP discovery.

The issue is likely latent.

Lightly tested as fixing the 521.wrf_r build and being clean
on vect.exp and i386.exp on x86_64.

Full bootstrap and regtest running on x86_64-unknown-linux-gnu,
I'll push it unless I hear otherwise.

I'm quite sure that SLP masked operations test coverage is weak though.
Maybe somebody can throw it at SVE[2] which should expose more
masking (but eventually not SLP - I don't know about the state of
SLP and masking with respect to SVE)

Thanks,
Richard.

2021-06-30  Richard Biener  

PR tree-optimization/101267
* tree-vect-stmts.c (vect_check_scalar_mask): Adjust
API and use SLP compatible interface of vect_is_simple_use.
Reject not vectorized SLP defs for callers that do not support
that.
(vect_check_store_rhs): Handle masked stores and pass down
the appropriate operator index.
(vectorizable_call): Adjust.
(vectorizable_store): Likewise.
(vectorizable_load): Likewise.  Handle SLP peculiarity of
masked loads.
(vect_is_simple_use): Remove special-casing of masked stores.

* gfortran.dg/pr101267.f90: New testcase.
---
 gcc/testsuite/gfortran.dg/pr101267.f90 | 23 +++
 gcc/tree-vect-stmts.c  | 92 +++---
 2 files changed, 77 insertions(+), 38 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/pr101267.f90

diff --git a/gcc/testsuite/gfortran.dg/pr101267.f90 b/gcc/testsuite/gfortran.dg/pr101267.f90
new file mode 100644
index 000..12723cf9c22
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr101267.f90
@@ -0,0 +1,23 @@
+! { dg-do compile }
+! { dg-options "-Ofast" }
+! { dg-additional-options "-march=znver2" { target x86_64-*-* i?86-*-* } }
+   SUBROUTINE sfddagd( regime, znt,ite ,jte )
+   REAL, DIMENSION( ime, IN) :: regime, znt
+   REAL, DIMENSION( ite, jte) :: wndcor_u 
+   LOGICAL wrf_dm_on_monitor
+   IF( int4 == 1 ) THEN
+ DO j=jts,jtf
+  DO i=itsu,itf
+   reg =   regime(i,  j) 
+   IF( reg > 10.0 ) THEN
+ znt0 = znt(i-1,  j) + znt(i,  j) 
+ IF( znt0 <= 0.2) THEN
+   wndcor_u(i,j) = 0.2
+ ENDIF
+   ENDIF
+  ENDDO
+ ENDDO
+ IF ( wrf_dm_on_monitor()) THEN
+ ENDIF
+   ENDIF
+   END
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 4ee11b2041a..e590f34d75d 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -2439,17 +2439,31 @@ get_load_store_type (vec_info  *vinfo, stmt_vec_info stmt_info,
   return true;
 }
 
-/* Return true if boolean argument MASK is suitable for vectorizing
-   conditional operation STMT_INFO.  When returning true, store the type
-   of the definition in *MASK_DT_OUT and the type of the vectorized mask
-   in *MASK_VECTYPE_OUT.  */
+/* Return true if boolean argument at MASK_INDEX is suitable for vectorizing
+   conditional operation STMT_INFO.  When returning true, store the mask
+   in *MASK, the type of its definition in *MASK_DT_OUT, the type of the
+   vectorized mask in *MASK_VECTYPE_OUT and the SLP node corresponding
+   to the mask in *MASK_NODE if MASK_NODE is not NULL.  */
 
 static bool
-vect_check_scalar_mask (vec_info *vinfo, stmt_vec_info stmt_info, tree mask,
-   vect_def_type *mask_dt_out,
-   tree *mask_vectype_out)
+vect_check_scalar_mask (vec_info *vinfo, stmt_vec_info stmt_info,
+   slp_tree slp_node, unsigned mask_index,
+   tree *mask, slp_tree *mask_node,
+   vect_def_type *mask_dt_out, tree *mask_vectype_out)
 {
-  if (!VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (mask)))
+  enum vect_def_type mask_dt;
+  tree mask_vectype;
+  slp_tree mask_node_1;
+  if (!vect_is_simple_use (vinfo, stmt_info, slp_node, mask_index,
+  mask, &mask_node_1, &mask_dt, &mask_vectype))
+{
+  if (dump_enabled_p ())
+   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+"mask use not simple.\n");
+  return false;
+}
+
+  if (!VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (*mask)))
 {
   if (dump_enabled_p ())
dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
@@ -2457,7 +2471,7 @@ vect_check_scalar_mask (vec_info *vinfo, stmt_vec_info stmt_info, tree mask,
   return false;
 }
 
-  if (TREE_CODE (mask) != SSA_NAME)
+  if (TREE_CODE (*mask) != SSA_NAME)
 {
   if (dump_enabled_p ())
dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
@@ -2465,13 +2479,15 @@ vect_check_scalar_mask (vec_info *vinfo, stmt_vec_info stmt_info, tree mask,
   return false;
 }
 
-  enum vect_def_type mask_dt;
-  tree mask_v

Re: [PATCH] Port GCC documentation to Sphinx

2021-06-30 Thread Andrey Belevantsev via Gcc-patches
Hi Martin,

On 29.06.2021 13:09, Martin Liška wrote:
> On 6/28/21 5:33 PM, Joseph Myers wrote:
>> Are formatted manuals (HTML, PDF, man, info) corresponding to this patch
>> version also available for review?
>
> I've just uploaded them here:
> https://splichal.eu/gccsphinx-final/
>
> Martin

I've randomly looked at the PDF version of the GCC internals manual and the
table of contents there only has an introduction and an index (looks like
all other chapters went under introduction).  Other PDFs have the first
level chapters in the contents.  Maybe it will be a good idea to include
the lower level chapters as well, or at least to fix the gccint one :)

Best,
Andrey


Re: [COMMITTED V10 4/7] CTF/BTF testsuites

2021-06-30 Thread Christophe Lyon via Gcc-patches
Hi,

I have just committed the following small patch as obvious:

Author: Christophe Lyon 
Date:   Wed Jun 30 11:44:00 2021 +

[testsuite]: Add missing dg-add-options float16 to
gcc.dg/debug/ctf/ctf-skip-types-2.c

The test already checks dg-require-effective-target float16, but this
is not sufficient to use the flags needed, if any.
This patch makes the test pass on arm.

2021-06-30  Christophe Lyon  

gcc/testsuite/
* gcc.dg/debug/ctf/ctf-skip-types-2.c: Add dg-add-options
float16.

diff --git a/gcc/testsuite/gcc.dg/debug/ctf/ctf-skip-types-2.c b/gcc/testsuite/gcc.dg/debug/ctf/ctf-skip-types-2.c
index 7c8b17df6db..79d5cb230a1 100644
--- a/gcc/testsuite/gcc.dg/debug/ctf/ctf-skip-types-2.c
+++ b/gcc/testsuite/gcc.dg/debug/ctf/ctf-skip-types-2.c
@@ -13,5 +13,6 @@
 /* { dg-options "-gctf" } */

 /* { dg-require-effective-target float16 } */
+/* { dg-add-options float16 } */

 _Float16 f16;


Christophe


On Mon, Jun 28, 2021 at 7:42 PM Jose E. Marchesi via Gcc-patches <
gcc-patches@gcc.gnu.org> wrote:

> This commit adds a new testsuite for the CTF debug format.
>
> 2021-06-28  Indu Bhagat  
> David Faust  
>
> gcc/testsuite/
>
> * lib/gcc-dg.exp (gcc-dg-frontend-supports-ctf): New procedure.
> (gcc-dg-debug-runtest): Add -gctf support.
> * gcc.dg/debug/btf/btf-1.c: New test.
> * gcc.dg/debug/btf/btf-2.c: Likewise.
> * gcc.dg/debug/btf/btf-anonymous-struct-1.c: Likewise.
> * gcc.dg/debug/btf/btf-anonymous-union-1.c: Likewise.
> * gcc.dg/debug/btf/btf-array-1.c: Likewise.
> * gcc.dg/debug/btf/btf-bitfields-1.c: Likewise.
> * gcc.dg/debug/btf/btf-bitfields-2.c: Likewise.
> * gcc.dg/debug/btf/btf-bitfields-3.c: Likewise.
> * gcc.dg/debug/btf/btf-cvr-quals-1.c: Likewise.
> * gcc.dg/debug/btf/btf-enum-1.c: Likewise.
> * gcc.dg/debug/btf/btf-forward-1.c: Likewise.
> * gcc.dg/debug/btf/btf-function-1.c: Likewise.
> * gcc.dg/debug/btf/btf-function-2.c: Likewise.
> * gcc.dg/debug/btf/btf-int-1.c: Likewise.
> * gcc.dg/debug/btf/btf-pointers-1.c: Likewise.
> * gcc.dg/debug/btf/btf-struct-1.c: Likewise.
> * gcc.dg/debug/btf/btf-typedef-1.c: Likewise.
> * gcc.dg/debug/btf/btf-union-1.c: Likewise.
> * gcc.dg/debug/btf/btf-variables-1.c: Likewise.
> * gcc.dg/debug/btf/btf.exp: Likewise.
> * gcc.dg/debug/ctf/ctf-1.c: Likewise.
> * gcc.dg/debug/ctf/ctf-2.c: Likewise.
> * gcc.dg/debug/ctf/ctf-anonymous-struct-1.c: Likewise.
> * gcc.dg/debug/ctf/ctf-anonymous-union-1.c: Likewise.
> * gcc.dg/debug/ctf/ctf-array-1.c: Likewise.
> * gcc.dg/debug/ctf/ctf-array-2.c: Likewise.
> * gcc.dg/debug/ctf/ctf-array-3.c: Likewise.
> * gcc.dg/debug/ctf/ctf-array-4.c: Likewise.
> * gcc.dg/debug/ctf/ctf-attr-mode-1.c: Likewise.
> * gcc.dg/debug/ctf/ctf-attr-used-1.c: Likewise.
> * gcc.dg/debug/ctf/ctf-bitfields-1.c: Likewise.
> * gcc.dg/debug/ctf/ctf-bitfields-2.c: Likewise.
> * gcc.dg/debug/ctf/ctf-bitfields-3.c: Likewise.
> * gcc.dg/debug/ctf/ctf-bitfields-4.c: Likewise.
> * gcc.dg/debug/ctf/ctf-complex-1.c: Likewise.
> * gcc.dg/debug/ctf/ctf-cvr-quals-1.c: Likewise.
> * gcc.dg/debug/ctf/ctf-cvr-quals-2.c: Likewise.
> * gcc.dg/debug/ctf/ctf-cvr-quals-3.c: Likewise.
> * gcc.dg/debug/ctf/ctf-cvr-quals-4.c: Likewise.
> * gcc.dg/debug/ctf/ctf-enum-1.c: Likewise.
> * gcc.dg/debug/ctf/ctf-enum-2.c: Likewise.
> * gcc.dg/debug/ctf/ctf-file-scope-1.c: Likewise.
> * gcc.dg/debug/ctf/ctf-float-1.c: Likewise.
> * gcc.dg/debug/ctf/ctf-forward-1.c: Likewise.
> * gcc.dg/debug/ctf/ctf-forward-2.c: Likewise.
> * gcc.dg/debug/ctf/ctf-func-index-1.c: Likewise.
> * gcc.dg/debug/ctf/ctf-function-pointers-1.c: Likewise.
> * gcc.dg/debug/ctf/ctf-function-pointers-2.c: Likewise.
> * gcc.dg/debug/ctf/ctf-function-pointers-3.c: Likewise.
> * gcc.dg/debug/ctf/ctf-functions-1.c: Likewise.
> * gcc.dg/debug/ctf/ctf-int-1.c: Likewise.
> * gcc.dg/debug/ctf/ctf-objt-index-1.c: Likewise.
> * gcc.dg/debug/ctf/ctf-pointers-1.c: Likewise.
> * gcc.dg/debug/ctf/ctf-pointers-2.c: Likewise.
> * gcc.dg/debug/ctf/ctf-preamble-1.c: Likewise.
> * gcc.dg/debug/ctf/ctf-skip-types-1.c: Likewise.
> * gcc.dg/debug/ctf/ctf-skip-types-2.c: Likewise.
> * gcc.dg/debug/ctf/ctf-skip-types-3.c: Likewise.
> * gcc.dg/debug/ctf/ctf-skip-types-4.c: Likewise.
> * gcc.dg/debug/ctf/ctf-skip-types-5.c: Likewise.
> * gcc.dg/debug/ctf/ctf-skip-types-6.c: Likewise.
> * gcc.dg/debug/ctf/ctf-str-table-1.c: Likewise.
> * gcc.dg/debug/ctf/ctf-struct-1.c: Likewise.
> * gcc.dg/debug/ctf/ctf-struct-2.c: Likewise.
> * gcc.dg/debug/ct

Re: [PATCH] Port GCC documentation to Sphinx

2021-06-30 Thread Martin Liška

On 6/30/21 12:14 PM, Richard Earnshaw wrote:
> On 30/06/2021 05:47, Martin Liška wrote:
>> On 6/29/21 12:50 PM, Richard Earnshaw wrote:
>>> On 29/06/2021 11:09, Martin Liška wrote:
>>>> On 6/28/21 5:33 PM, Joseph Myers wrote:
>>>>> Are formatted manuals (HTML, PDF, man, info) corresponding to this patch
>>>>> version also available for review?
>>>>
>>>> I've just uploaded them here:
>>>> https://splichal.eu/gccsphinx-final/
>>>>
>>>> Martin
>>>
>>> In the HTML version of the gcc manual the sidebar has an "Option index" link but no link
>>> to the general index.  When you follow that link the page contents is just a link to the
>>> "index" where everything is all lumped together.
>>>
>>> If we can't have separate indexes for options and general entries, I think it
>>> would make more sense for the Option index link to be removed entirely.
>>
>> Fully agree with you. Thanks for the feedback and I've changed that to the
>> standard Sphinx section,
>> see e.g. https://splichal.eu/gccsphinx-final/html/gcc/indices-and-tables.html
>>
>> Martin
>>
>>> R.
>
> Thanks.  Given that the manual is nominally in American English, it might be better to use the term
> "indexes" rather than "indices".

Sure, fixed (while preserving the filename, we have multi copies of it).

Martin

> https://grammarist.com/usage/indexes-indices/
>
> R.


Re: [PATCH] define auto_vec copy ctor and assignment (PR 90904)

2021-06-30 Thread Richard Biener via Gcc-patches
On Wed, Jun 30, 2021 at 11:00 AM Richard Sandiford
 wrote:
>
> Richard Biener via Gcc-patches  writes:
> > Note there's also array_slice<> which could be used to pass non-const
> > vec<>s that are never resized but modified - the only "valid" case of
> > passing a non-const vec<> by value.
>
> Yeah.  We'd need a new constructor for that (the current one only
> takes const vec<>&) but I agree it would be a good thing to do.
>
> I realise you weren't saying otherwise, but: array_slice<> can also be
> used for const vec<>s.  E.g. array_slice can't be resized
> or modified.
>
> I think array_slice<> is going to be more efficient as well.  E.g.:
>
> void
> f1 (vec &foo)
> {
>   for (unsigned int i = 0; i < foo.length (); ++i)
> foo[i] += 1;
> }
>
> void
> f2 (array_slice foo)
> {
>   for (unsigned int i = 0; i < foo.size (); ++i)
> foo[i] += 1;
> }
>
> gives:
>
> d150 &)>:
> d150:   48 8b 07                mov    (%rdi),%rax
> d153:   31 d2                   xor    %edx,%edx
> d155:   48 85 c0                test   %rax,%rax
> d158:   74 26                   je     d180  vl_ptr>&)+0x30>
> d15a:   66 0f 1f 44 00 00       nopw   0x0(%rax,%rax,1)
> d160:   3b 50 04                cmp    0x4(%rax),%edx
> d163:   73 12                   jae    d177  vl_ptr>&)+0x27>
> d165:   89 d1                   mov    %edx,%ecx
> d167:   83 c2 01                add    $0x1,%edx
> d16a:   80 44 08 08 01          addb   $0x1,0x8(%rax,%rcx,1)
> d16f:   48 8b 07                mov    (%rdi),%rax
> d172:   48 85 c0                test   %rax,%rax
> d175:   75 e9                   jne    d160  vl_ptr>&)+0x10>
> d177:   c3                      retq
> d178:   0f 1f 84 00 00 00 00    nopl   0x0(%rax,%rax,1)
> d17f:   00
> d180:   c3                      retq
>
> d190 )>:
> d190:   85 f6                   test   %esi,%esi
> d192:   74 18                   je     d1ac )+0x1c>
> d194:   8d 46 ff                lea    -0x1(%rsi),%eax
> d197:   48 8d 44 07 01          lea    0x1(%rdi,%rax,1),%rax
> d19c:   0f 1f 40 00             nopl   0x0(%rax)
> d1a0:   80 07 01                addb   $0x1,(%rdi)
> d1a3:   48 83 c7 01             add    $0x1,%rdi
> d1a7:   48 39 c7                cmp    %rax,%rdi
> d1aa:   75 f4                   jne    d1a0 )+0x10>
> d1ac:   c3                      retq
>
> where f1 has to reload the length and base each iteration,
> but f2 doesn't.

Of course, but that's unfair - by reference vec<> vs. by value array_slice<>
plus char * pointer stores which destroy all TBAA ... ;)

>
> > But as noted array_slice<> lacks most of the vec<> API so I'm not sure
> > how awkward that option would be.  We of course can amend its API as
> > well.
>
> Yeah, that'd be good.  The current class follows the principle
> “don't add stuff that isn't needed yet”. :-)

Yes, and that's good of course.

Richard.

> Thanks,
> Richard


[Patch] gcc.c: Add -foffload= to display_help

2021-06-30 Thread Tobias Burnus

This is a side effect of removing 'foffload=' from Common
as Driver only does not show up with --help=... but only
as hard-coded list via display_help.

OK?

Tobias

PS: additional remarks in the next email.

-
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München 
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank 
Thürauf
gcc.c: Add -foffload= to display_help

gcc/ChangeLog:

	* common.opt (foffload): Remove help as Driver only.
	* gcc.c (display_help): Add -foffload.

diff --git a/gcc/common.opt b/gcc/common.opt
index 2f5ff9f02e9..5b03bbc6662 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2100,11 +2100,8 @@ fnon-call-exceptions
 Common Var(flag_non_call_exceptions) Optimization
 Support synchronous non-call exceptions.
 
-; -foffload= is documented
-; -foffload== is supported for backward compatibility
 foffload=
 Driver Joined MissingArgError(targets missing after %qs)
--foffload=	Specify offloading targets.
 
 foffload-options=
 Common Driver Joined MissingArgError(options or targets=options missing after %qs)
diff --git a/gcc/gcc.c b/gcc/gcc.c
index f802148e567..c8dbff61307 100644
--- a/gcc/gcc.c
+++ b/gcc/gcc.c
@@ -3752,6 +3752,7 @@ display_help (void)
   fputs (_("  -dumpspecs   Display all of the built in spec strings.\n"), stdout);
   fputs (_("  -dumpversion Display the version of the compiler.\n"), stdout);
   fputs (_("  -dumpmachine Display the compiler's target processor.\n"), stdout);
+  fputs (_("  -foffload=  Specify offloading targets.\n"), stdout);
   fputs (_("  -print-search-dirs   Display the directories in the compiler's search path.\n"), stdout);
   fputs (_("  -print-libgcc-file-name  Display the name of the compiler's companion library.\n"), stdout);
   fputs (_("  -print-file-name=   Display the full path to library .\n"), stdout);


Re: [PATCH] Port GCC documentation to Sphinx

2021-06-30 Thread Martin Liška

On 6/29/21 6:57 PM, Eli Zaretskii wrote:

> 5. There's some strange bug with symbols inside parentheses.  For
> example:
>
>    In GNU C and C++, you can use function attributes to specify certain
>    function properties that may help the compiler optimize calls or check
>    code more carefully for correctness.  For example, you can use
>    attributes to specify that a function never returns ( ‘noreturn’ ),
>    returns a value depending only on the values of its arguments ( ‘const’
>    ), or has ‘printf’ -style arguments ( ‘format’ ).
>
> See the extra blanks inside parens?  The old format was nicer:
>
>    In GNU C and C++, you can use function attributes to specify certain
>    function properties that may help the compiler optimize calls or check
>    code more carefully for correctness.  For example, you can use
>    attributes to specify that a function never returns ('noreturn'),
>    returns a value depending only on the values of its arguments ('const'),
>    or has 'printf'-style arguments ('format').


This issue is resolved now. Good point!

Martin


Re: [Patch] gcc.c: Add -foffload= to display_help

2021-06-30 Thread Jakub Jelinek via Gcc-patches
On Wed, Jun 30, 2021 at 02:24:51PM +0200, Tobias Burnus wrote:
> This is a side effect of removing 'foffload=' from Common
> as Driver only does not show up with --help=... but only
> as hard-coded list via display_help.
> 
> OK?
> 
> Tobias
> 
> PS: additional remarks in the next email.
> 
> -
> Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München 
> Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank 
> Thürauf

> gcc.c: Add -foffload= to display_help
> 
> gcc/ChangeLog:
> 
>   * common.opt (foffload): Remove help as Driver only.
>   * gcc.c (display_help): Add -foffload.

Ok, thanks.

Jakub



Re: [PATCH 2/4] allow poisoning input_location in ranges it should not be used

2021-06-30 Thread Trevor Saunders
On Wed, Jun 30, 2021 at 11:00:37AM +0200, Richard Biener wrote:
> On Wed, Jun 30, 2021 at 7:37 AM Trevor Saunders  wrote:
> >
> > This makes it possible to assert if input_location is used during the lifetime
> > of a scope.  This will allow us to find places that currently use it within a
> > function and its callees, or prevent adding uses within the lifetime of a
> > function after all existing uses are removed.
> >
> > bootstrapped and regtested on x86_64-linux-gnu, ok?
> 
> I'm not sure about the general approach but I have comments about
> input_location.
> 
> IMHO a good first step would be to guard the input_location declaration with
> sth like
> 
> #ifndef GCC_NEED_INPUT_LOCATION
> extern location_t input_location;
> #endif

I think that's another reasonable step, my one concern is that it can be
useful to push the usage of input_location, or any other global from the
bottom of the stack to a caller that can provide a better argument
eventually, but first just use the global.  Doing this sort of
refactoring is harder if you need to add files with callers to the
whitelist, and kind of defeats the point of the whitelist.  Consider the
below commit, that is untested, but perhaps I should have included in
this series as somewhat related.

As for this approach being limited to functions, that's somewhat true,
but since it affects all functions called while it's on the stack, it
would mean that once enough infrastructure is fixed, we could add one to
the execute method of a pass and nothing in the pass could touch
input_location.  The limit also means that it doesn't get in the way of
the above sort of refactoring, but as it proceeds, lower-level functions
that now take explicit arguments can be called from contexts that ban
use of input_location.

Trev
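
The scope-based poisoning discussed here can be modelled in a few lines. This is only a sketch of the idea, not the patch itself: all names (current_location, poison_location_scope, poison_depth) are hypothetical, and it throws an exception where the real patch asserts, purely so the behaviour is easy to observe.

```cpp
#include <stdexcept>

typedef unsigned int location_t;

// Hypothetical model of a global location plus its poison state.
static location_t the_location;
static int poison_depth;

// Every use of the global goes through this accessor, which checks whether
// some enclosing scope has banned its use.
location_t &
current_location ()
{
  if (poison_depth > 0)
    throw std::logic_error ("input_location used in a poisoned scope");
  return the_location;
}

// RAII guard: while an object of this type is alive, any use of
// current_location (), in this function or in its callees, is diagnosed.
class poison_location_scope
{
public:
  poison_location_scope () { ++poison_depth; }
  ~poison_location_scope () { --poison_depth; }

  poison_location_scope (const poison_location_scope &) = delete;
  poison_location_scope &operator= (const poison_location_scope &) = delete;
};
```

Placing such a guard at the top of a pass's execute method would then flag any remaining use in the pass, including uses in callees.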

From efd04d2df4163dd930f489d9fba1455bfb368114 Mon Sep 17 00:00:00 2001
From: Trevor Saunders 
Date: Sun, 27 Jun 2021 02:10:26 -0400
Subject: [PATCH] force decls to be allocated through build_decl to initialize
 them
To: gcc-patches@gcc.gnu.org

Prior to this commit, all calls to build_decl used input_location, even if
only temporarily, until build_decl reset the location to whatever it was
told was the proper location.  To avoid using the global we need the caller
to pass in the location it wants; however, that's not possible with make_node
since it makes other types of nodes.  So we force all callers who wish to
make a decl to go through build_decl, which already takes a location
argument.  To avoid changing behavior this just explicitly passes in
input_location to build_decl for callers of make_node that create a decl;
however, it would seem that in many of these cases the location of the decl
being copied might be a better location.
---
 gcc/cfgexpand.c  |  8 +++---
 gcc/cp/cp-gimplify.c |  5 ++--
 gcc/fortran/trans-decl.c |  5 ++--
 gcc/fortran/trans-types.c|  4 +--
 gcc/ipa-param-manipulation.c |  8 +++---
 gcc/objc/objc-act.c  | 16 +---
 gcc/omp-simd-clone.c |  4 +--
 gcc/stor-layout.c|  2 +-
 gcc/tree-inline.c| 13 +-
 gcc/tree-into-ssa.c  |  4 +--
 gcc/tree-nested.c| 24 --
 gcc/tree-ssa-ccp.c   |  4 +--
 gcc/tree-ssa-loop-ivopts.c   |  4 +--
 gcc/tree-ssa-phiopt.c|  8 +++---
 gcc/tree-ssa-reassoc.c   |  4 +--
 gcc/tree-ssa.c   |  4 +--
 gcc/tree.c   | 49 ++--
 gcc/tree.h   |  9 ++-
 gcc/varasm.c | 12 -
 19 files changed, 93 insertions(+), 94 deletions(-)

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 3edd53c37dc..fea8c837c80 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -4342,10 +4342,10 @@ avoid_deep_ter_for_debug (gimple *stmt, int depth)
  tree &vexpr = deep_ter_debug_map->get_or_insert (use);
  if (vexpr != NULL)
continue;
- vexpr = make_node (DEBUG_EXPR_DECL);
+ vexpr = build_decl (input_location, DEBUG_EXPR_DECL, nullptr,
+ TREE_TYPE (use));
  gimple *def_temp = gimple_build_debug_bind (vexpr, use, g);
  DECL_ARTIFICIAL (vexpr) = 1;
- TREE_TYPE (vexpr) = TREE_TYPE (use);
  SET_DECL_MODE (vexpr, TYPE_MODE (TREE_TYPE (use)));
  gimple_stmt_iterator gsi = gsi_for_stmt (g);
  gsi_insert_after (&gsi, def_temp, GSI_NEW_STMT);
@@ -5899,14 +5899,14 @@ expand_gimple_basic_block (basic_block bb, bool disable_tail_calls)
   temporary.  */
gimple *debugstmt;
tree value = gimple_assign_rhs_to_tree (def);
-   tree vexpr = make_node (DEBUG_EXPR_DECL);
rtx val;
machine_mode mode;
 
set_curr_insn_location (gimple_location (def));
 
+   tree vexpr = build_decl (input_location, DEBUG_EXPR_DECL,
+

Re: [PATCH] Port GCC documentation to Sphinx

2021-06-30 Thread Martin Liška

On 6/30/21 12:55 PM, Andrey Belevantsev wrote:
> Hi Martin,
>
> On 29.06.2021 13:09, Martin Liška wrote:
>> On 6/28/21 5:33 PM, Joseph Myers wrote:
>>> Are formatted manuals (HTML, PDF, man, info) corresponding to this patch
>>> version also available for review?
>>
>> I've just uploaded them here:
>> https://splichal.eu/gccsphinx-final/
>>
>> Martin

Hello.

> I've randomly looked at the PDF version of the GCC internals manual and the
> table of contents there only has an introduction and an index (looks like
> all other chapters went under introduction).  Other PDFs have the first
> level chapters in the contents.  Maybe it will be a good idea to include
> the lower level chapters as well, or at least to fix the gccint one :)

Very good point and I fixed both your comments. It would be much better to
have all chapters listed in the TOC.

Martin

> Best,
> Andrey




RFC: --help for Driver options (was: [Patch] gcc.c: Add -foffload= to display_help)

2021-06-30 Thread Tobias Burnus

RFC for this topic. Comments are welcome, as is someone volunteering to
clean up this mess :-)

On 30.06.21 14:24, Tobias Burnus wrote:

> This is a side effect of removing 'foffload=' from Common
> as Driver only does not show up with --help=... but only
> as hard-coded list via display_help.


While looking at that issue, Jakub and I wondered whether there
should be some warning if a Driver option has a help text,
which never appears.

Jakub mentioned
  -dumpdir
but that's also in Common and, hence, shows up with
  --help=common
but admittedly not with --help

 * * *

However, the following items are Driver and neither LTO, C, Fortran nor
Common and do have a help text (that never shows up).

(Extracted with the first attached patch, as we wondered whether a
warning could make sense.)

* Those appear with --help as they have a puts line in gcc.c's display_help():
common.opt  no-pie
common.opt  pie
common.opt  shared
common.opt  static-pie

* While those don't but they also cannot be added to gcc.c
as they are FE specific driver options:
c-family/c.opt  static-libmpx
c-family/c.opt  static-libmpxwrappers
common.opt  foffload=
d/lang.opt  debuglib=
d/lang.opt  defaultlib=
d/lang.opt  dstartfiles
d/lang.opt  nophoboslib
d/lang.opt  shared-libphobos
d/lang.opt  static-libphobos

 * * *

Additionally, I was wondering whether having --help=driver makes sense.

I tried that with the following patch, but it looks as if I missed some
fine print, as Driver-only flags were still not printed, if I recall
correctly.

BTW: -help='s helptext differs between the hard-coded one in gcc.c
(used by --help) and the one in common.opt (shown, e.g., by --help=common).

Even with that patch applied, --help= does not really have a useful
error message while --help=something-invalid has.

Tobias

-
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München 
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank 
Thürauf
diff --git a/gcc/optc-gen.awk b/gcc/optc-gen.awk
index 880ac776d8a..d63407c5c88 100644
--- a/gcc/optc-gen.awk
+++ b/gcc/optc-gen.awk
@@ -275,6 +275,25 @@ for (i = 0; i < n_opts; i++) {
 		indices[opts[i]] = j;
 	}
 	j++;
+	if (help[i] != "" && flag_set_p("Driver", flags[i]) \
+	&& !flag_set_p("Common", flags[i])) {
+		nflags = split(switch_flags(flags[i]), flag_array, " | ")
+		help_warn = 1
+		for (j = 0; j < nflags; j++) {
+			if (flag_array[j] != "" \
+			&& flag_array[j] != "|" \
+			&& flag_array[j] != "CL_DRIVER" \
+			&& flag_array[j] != "CL_SEPARATE" \
+			&& flag_array[j] != "CL_JOINED" \
+			&& flag_array[j] != "CL_UNDOCUMENTED") {
+help_warn = 0
+			}
+		}
+		if (help_warn != 0)
+			print "#warning Help text provided for Driver only " \
+			"option '" opts[i] "' move it to gcc.c's  " \
+"display_help: " help[i]
+	}
 }
 
 optindex = 0
diff --git a/gcc/opts.c b/gcc/opts.c
index f159bb35130..df5e2462d76 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -1815,7 +1815,9 @@ print_specific_help (unsigned int include_flags,
 {
   if (any_flags == 0)
 	{
-	  if (include_flags & CL_UNDOCUMENTED)
+	  if (include_flags & CL_DRIVER)
+	description = _("The following options control the driver");
+	  else if (include_flags & CL_UNDOCUMENTED)
 	description = _("The following options are not documented");
 	  else if (include_flags & CL_SEPARATE)
 	description = _("The following options take separate arguments");
@@ -2276,30 +2278,38 @@ print_help (struct gcc_options *opts, unsigned int lang_mask,
   if (lang_mask == CL_DRIVER)
 return;
 
+  static const struct
+{
+  const char *string;
+  unsigned int flag;
+}
+  specifics[] =
+{
+  { "common", CL_COMMON },
+  { "driver", CL_DRIVER },
+  { "joined", CL_JOINED },
+  { "optimizers", CL_OPTIMIZATION },
+  { "params", CL_PARAMS },
+  { "separate", CL_SEPARATE },
+  { "target", CL_TARGET },
+  { "undocumented", CL_UNDOCUMENTED },
+  { "warnings", CL_WARNING },
+  { NULL, 0 }
+};
+
+  auto_vec  candidates;
+  for (unsigned int i = 0; specifics[i].string != NULL; i++)
+candidates.safe_push (specifics[i].string);
+  for (unsigned int i = 0; i < cl_lang_count; i++)
+candidates.safe_push (lang_names[i]);
+
   /* Walk along the argument string, parsing each word in turn.
  The format is:
  arg = [^]{word}[,{arg}]
- word = {optimizers|target|warnings|undocumented|
- params|common|<language>}  */
+ word = {common|driver|joined|optimizers|params|separate
+	 |target|undocumented|warnings|<language>}  */
   while (*a != 0)
 {
-  static const struct
-	{
-	  const char *string;
-	  unsigned int flag;
-	}
-  specifics[] =
-	{
-	{ "optimizers", CL_OPTIMIZATION },
-	{ "target", CL_TARGET },
-	{ "warnings", CL_WARNING },
-	{ "undocumented", CL_UNDOCUMENTED },
-	{ "params", CL_PARAMS },
-	{ "joined", CL_JOINED },
-	{ "separate", CL_SEPARATE }
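
[Illustration only, not part of the patch.]  The comment in the hunk above
describes the --help= argument as a comma-separated list of words, each
optionally prefixed with '^'.  A minimal standalone sketch of that grammar
might look like the following.  The TOY_* flag values, toy_specifics,
help_masks and parse_help_arg are all hypothetical names invented for this
sketch; language names are left out, and exact case-sensitive matching is a
simplification that may not match what opts.c really does.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical flag bits for this sketch only; the real CL_* values are
// defined inside GCC and are not reproduced here.
enum toy_flag : unsigned {
  TOY_COMMON       = 1u << 0,
  TOY_DRIVER       = 1u << 1,
  TOY_JOINED       = 1u << 2,
  TOY_OPTIMIZATION = 1u << 3,
  TOY_PARAMS       = 1u << 4,
  TOY_SEPARATE     = 1u << 5,
  TOY_TARGET       = 1u << 6,
  TOY_UNDOCUMENTED = 1u << 7,
  TOY_WARNING      = 1u << 8
};

struct toy_specific { const char *string; unsigned flag; };

// Same words as the table in the patch, kept in alphabetical order.
static const toy_specific toy_specifics[] = {
  { "common", TOY_COMMON },
  { "driver", TOY_DRIVER },
  { "joined", TOY_JOINED },
  { "optimizers", TOY_OPTIMIZATION },
  { "params", TOY_PARAMS },
  { "separate", TOY_SEPARATE },
  { "target", TOY_TARGET },
  { "undocumented", TOY_UNDOCUMENTED },
  { "warnings", TOY_WARNING }
};

struct help_masks {
  unsigned include = 0;               // classes named without '^'
  unsigned exclude = 0;               // classes named with a leading '^'
  std::vector<std::string> unknown;   // words that match no table entry
};

// Parse "arg = [^]{word}[,{arg}]" into include/exclude masks.
help_masks parse_help_arg (const std::string &arg)
{
  help_masks m;
  std::size_t pos = 0;
  while (pos <= arg.size ())
    {
      std::size_t comma = arg.find (',', pos);
      if (comma == std::string::npos)
        comma = arg.size ();
      std::string word = arg.substr (pos, comma - pos);
      pos = comma + 1;

      bool negate = !word.empty () && word[0] == '^';
      if (negate)
        word.erase (0, 1);
      if (word.empty ())
        continue;   // skip empty words such as in "a,,b"

      bool found = false;
      for (const toy_specific &s : toy_specifics)
        if (word == s.string)
          {
            if (negate)
              m.exclude |= s.flag;
            else
              m.include |= s.flag;
            found = true;
            break;
          }
      if (!found)
        m.unknown.push_back (word);
    }
  return m;
}
```

Unknown words are collected rather than rejected here, which is roughly where
a list of candidate names (like the candidates vector above) would come in
handy for a "did you mean" hint.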

Re: [PATCH] Port GCC documentation to Sphinx

2021-06-30 Thread Eli Zaretskii via Gcc-patches
> Cc: jos...@codesourcery.com, g...@gcc.gnu.org, gcc-patches@gcc.gnu.org
> From: Martin Liška 
> Date: Wed, 30 Jun 2021 12:11:03 +0200
> 
> > (Admittedly, Emacs by default hides some of the text of a
> > cross-reference, but not hiding them in this case produces an even
> > less legible text.)
> 
> If I'm correct, it's exactly what's documented in Sphinx FAQ here:
> https://www.sphinx-doc.org/en/master/faq.html#displaying-links
> 
> and there's a suggested Emacs code snippet that should help with links.
> Does it help?

It helps some, but not all of the issues disappear.  For example,
stuff like this is still hard to read:

  To select this standard in GCC, use one of the options -ansi
 -
  -std.‘=c90’ or -std.‘=iso9899:1990’
     

The quotes around the option values don't help.

Also, using the method proposed by Sphinx FAQ would need a change in
Emacs, which will take time to propagate.  So my suggestion is to
minimize the use of such "inline" hyperlinks.

> >‘@`file'’
> > 
> > Read command-line options from ‘`file'’.  The options read are
> > inserted in place of the original ‘@`file'’ option.  If ‘`file'’
> > does not exist, or cannot be read, then the option will be treated
> > literally, and not removed.
> 
> I can confirm that, so e.g.
> Show :samp:`Samp with a {variable}.`
> 
> is transformed into:
> Show @code{Samp with a @emph{variable}.}
> 
> Default info formatting is selected as:
> 
> @definfoenclose strong,`,'
> @definfoenclose emph,`,'
> 
> We can adjust 'emph' formatting to nil, what do you think?

Something like that, yes.  But the problem is: how will you format it
instead?  The known alternatives, _foo_ and *foo* both use punctuation
characters, which will get in the way similarly to the quotes.  Can
you format those in caps, like makeinfo does?

> > 4. Menus lost the short descriptions of the sub-sections.  Example:
> > 
> >* Designated Initializers
> >* Case Ranges
> >* Cast to a Union Type
> >* Mixed Declarations, Labels and Code
> >* Declaring Attributes of Functions
> > 
> > vs
> > 
> >* Designated Inits::Labeling elements of initializers.
> >* Case Ranges:: 'case 1 ... 9' and such.
> >* Cast to Union::   Casting to union type from any member of the 
> > union.
> >* Mixed Declarations::  Mixing declarations and code.
> >* Function Attributes:: Declaring that functions have no side effects,
> >   or that they can never return.
> > 
> > Looks like some bug to me.
> > 
> > Note also that nodes are now called by the same name as the section,
> > which means node names generally got much longer.  Is that really a
> > good idea?
> 
> Well, I intentionally removed these and used simple TOC tree links
> which take display text for a section title.

I would suggest to discuss these decisions first, perhaps on the
Texinfo mailing list?  I'm accustomed to these short descriptions, but
I'm not sure how important they are for others.


PING: [PATCH] mips: check MSA support for vector modes [PR100760,PR100761,PR100762]

2021-06-30 Thread Xi Ruoyao via Gcc-patches
Ping patch:
https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573213.html

Status update: bootstrapped with BOOT_CFLAGS="-O3 -mmsa -mloongson-mmi"
(it failed without the patch), and regtested on mips64el-linux-gnu with
no new regression.

On Sat, 2021-06-19 at 15:34 +0800, Xi Ruoyao wrote:
> Check if the vector mode is really supported by MSA in certain cases,
> instead of testing ISA_HAS_MSA.  Simply testing ISA_HAS_MSA can cause
> ICE when MSA is enabled besides other MIPS SIMD extensions (notably,
> Loongson MMI).
> 
> Bootstrapped and tested on mips64el-linux-gnu.  OK to commit?
> 
> gcc/
> 
> * config/mips/mips.c (mips_const_insns): Use
> MSA_SUPPORTED_MODE_P
> instead of ISA_HAS_MSA.
> (mips_expand_vec_unpack): Likewise.
> (mips_expand_vector_init): Likewise.
> 
> gcc/testsuite/
> 
> * testsuite/gcc.target/mips/pr100760.c: New test.
> * testsuite/gcc.target/mips/pr100761.c: New test.
> * testsuite/gcc.target/mips/pr100762.c: New test.
-- 
Xi Ruoyao 



Re: [PATCH] Add stmt context in simplify_using_ranges.

2021-06-30 Thread Andrew MacLeod via Gcc-patches

On 6/30/21 2:20 AM, Aldy Hernandez wrote:



On 6/29/21 9:09 PM, Andrew MacLeod wrote:
We added context to a lot of simplify_using_ranges, but we didn't 
catch all the places. This provides the originating stmt to the 
missing cases which resolve a few EVRP testcases when running in 
ranger-only mode.


Bootstraps on x86_64-pc-linux-gnu with no regressions.  Pushed.

Andrew




Thanks for doing this.  I've done a half-assed job at passing context 
around; probably only when it yielded a discrepancy with evrp.




 bool
-simplify_using_ranges::op_with_boolean_value_range_p (tree op)
+simplify_using_ranges::op_with_boolean_value_range_p (tree op, 
gimple *s)

 {
   if (TYPE_PRECISION (TREE_TYPE (op)) == 1)


I know you like single letter arguments, but I find them confusing 
when the method is more than a few lines long.  Besides, "stmt" is 
what is used throughout vr-values.c.


And speaking of passing statements around, I wonder if it'd be best to 
have m_stmt and possibly m_gsi as class fields.  After all, we never 
change them, and they're used by most methods.


Aldy


I think there's a revamp of simplify down the pipe anyway.

class simplify_using_ranges
{
public:
  simplify_using_ranges (class range_query *query = NULL);
  ~simplify_using_ranges ();
  void set_range_query (class range_query *q) { query = q; }

  bool simplify (gimple_stmt_iterator *);


This is really the only external API: the call to simplify. Long term, 
I'm not sure that all the switch update management stuff at the bottom 
of the class should be contained in this class. That seems like it 
should be a class that is used by simplification, and simplification 
itself could be stateless, kind of following the model of 
fold_using_ranges. Then the gsi and stmt can be wrapped into a source 
class if needed.


likewise we're eventually going to want to restructure the folding stuff 
that happens..    but most of this can wait until evrp and vrp are gone, 
then we can change the model a bit easier. bigger fish to fry and they 
are already way better than they were before :-)




Re: [PATCH] Port GCC documentation to Sphinx

2021-06-30 Thread Martin Liška

On 6/29/21 6:57 PM, Eli Zaretskii wrote:

2. The translation of @var produces double-quoting in Info, here's an
example:

   The usual way to run GCC is to run the executable called ‘gcc’, or
   ‘`machine'-gcc’ when cross-compiling, or ‘`machine'-gcc-`version'’ to
   run a specific version of GCC.

vs, the old

The usual way to run GCC is to run the executable called 'gcc', or
   'MACHINE-gcc' when cross-compiling, or 'MACHINE-gcc-VERSION' to run a
   specific version of GCC.

I think the new variant is less readable and more confusing, because
it isn't clear whether the quotes are part of the text.  Here's an
extreme example:

   ‘@`file'’

Read command-line options from ‘`file'’.  The options read are
inserted in place of the original ‘@`file'’ option.  If ‘`file'’
does not exist, or cannot be read, then the option will be treated
literally, and not removed.


For this one, I've just created the following pull request:
https://github.com/sphinx-doc/sphinx/pull/9391

Cheers,
Martin


Re: [PATCH] c++: DR2397 - auto specifier for * and & to arrays [PR100975]

2021-06-30 Thread Jason Merrill via Gcc-patches

On 6/29/21 6:01 PM, Marek Polacek wrote:

On Tue, Jun 29, 2021 at 03:50:27PM -0400, Jason Merrill wrote:

On 6/29/21 3:25 PM, Marek Polacek wrote:

--- a/gcc/testsuite/g++.dg/cpp0x/auto3.C
+++ b/gcc/testsuite/g++.dg/cpp0x/auto3.C
@@ -10,7 +10,7 @@ auto x;   // { dg-error "auto" }
   auto i = 42, j = 42.0;   // { dg-error "auto" }
   // New CWG issue


Let's at least update this comment to quote [dcl.type.auto.deduct]/2: "T
shall not be an array type".  I guess "unable to deduce" is a suitable
diagnostic for that error.


Fixed.


diff --git a/gcc/testsuite/g++.dg/diagnostic/auto1.C 
b/gcc/testsuite/g++.dg/diagnostic/auto1.C
index ee2eefd59aa..9d9979e3fdc 100644
--- a/gcc/testsuite/g++.dg/diagnostic/auto1.C
+++ b/gcc/testsuite/g++.dg/diagnostic/auto1.C
@@ -1,4 +1,5 @@
   // PR c++/86915
   // { dg-do compile { target c++17 } }
+// Allowed since DR2397.


Well, not really; any attempt to use this template should hit the same
problem as above of trying to do auto deduction where T is an array type.
Please add to the testcase to get the error.


Hmm, this

template struct S { };
static int arr[1];
S s;

won't give an error: I think it's because we coerce the auto tparm into 'auto*'
before deducing and so don't get the type mismatch error.  That seems to be
in line with how 'template' works, though.


Ah, good point.


So I think we don't need to change this in the patch.  Do you agree?


Yes, the patch is OK with the earlier comment tweak.

Jason



Re: [PATCH] Port GCC documentation to Sphinx

2021-06-30 Thread Eli Zaretskii via Gcc-patches
> Cc: jos...@codesourcery.com, g...@gcc.gnu.org, gcc-patches@gcc.gnu.org
> From: Martin Liška 
> Date: Wed, 30 Jun 2021 15:28:40 +0200
> 
> >‘@`file'’
> > 
> > Read command-line options from ‘`file'’.  The options read are
> > inserted in place of the original ‘@`file'’ option.  If ‘`file'’
> > does not exist, or cannot be read, then the option will be treated
> > literally, and not removed.
> 
> For this one, I've just created the following pull request:
> https://github.com/sphinx-doc/sphinx/pull/9391

Thanks, but does that mean @var will no longer stand out in the
produced Info format?  That'd be sub-optimal, I think, because a clear
reference to a meta-syntactic variable will be lost.


[PATCH] testsuite: Add arm_arch_v7a_ok effective-target to pr57351.c

2021-06-30 Thread Christophe LYON via Gcc-patches

I've noticed that overriding cpu/arch flags when running the testsuite
can cause this test to fail rather than being skipped because of
incompatible flags combination.

Since the test forces -march=armv7-a, make sure it is accepted in
combination with the current runtestflags.

2021-06-30  Christophe Lyon  

    gcc/testsuite/
    * gcc.dg/debug/pr57351.c: Require arm_arch_v7a_ok
    effective-target.



From f49096dc6925a0b28785debdee8fdc323e4f6e82 Mon Sep 17 00:00:00 2001
From: Christophe Lyon 
Date: Wed, 30 Jun 2021 13:47:07 +
Subject: [PATCH] testsuite: Add arm_arch_v7a_ok effective-target to pr57351.c

I've noticed that overriding cpu/arch flags when running the testsuite
can cause this test to fail rather than being skipped because of
incompatible flags combination.

Since the test forces -march=armv7-a, make sure it is accepted in
combination with the current runtestflags.

2021-06-30  Christophe Lyon  

gcc/testsuite/
* gcc.dg/debug/pr57351.c: Require arm_arch_v7a_ok
effective-target.
---
 gcc/testsuite/gcc.dg/debug/pr57351.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/testsuite/gcc.dg/debug/pr57351.c 
b/gcc/testsuite/gcc.dg/debug/pr57351.c
index 972f3e9ebec..236d74ddedb 100644
--- a/gcc/testsuite/gcc.dg/debug/pr57351.c
+++ b/gcc/testsuite/gcc.dg/debug/pr57351.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_neon }  */
+/* { dg-require-effective-target arm_arch_v7a_ok }  */
 /* { dg-options "-std=c99 -Os -g -march=armv7-a" } */
 /* { dg-add-options arm_neon } */
 
-- 
2.25.1



Re: [PATCH] Port GCC documentation to Sphinx

2021-06-30 Thread Martin Liška

On 6/30/21 3:38 PM, Eli Zaretskii wrote:

Cc: jos...@codesourcery.com, g...@gcc.gnu.org, gcc-patches@gcc.gnu.org
From: Martin Liška 
Date: Wed, 30 Jun 2021 15:28:40 +0200


‘@`file'’

 Read command-line options from ‘`file'’.  The options read are
 inserted in place of the original ‘@`file'’ option.  If ‘`file'’
 does not exist, or cannot be read, then the option will be treated
 literally, and not removed.


For this one, I've just created the following pull request:
https://github.com/sphinx-doc/sphinx/pull/9391


Thanks, but does that mean @var will no longer stand out in the
produced Info format?  That'd be sub-optimal, I think, because a clear
reference to a meta-syntactic variable will be lost.


Yes. An alternative approach for:
Show :samp:`Samp with a {variable}.`

is to use @var{variable}, resulting in the following Info output:
Show ‘Samp with a VARIABLE.’

Does it seem reasonable?
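
[Illustration only.]  To make the proposed mapping concrete, here is a toy
standalone function that turns the body of a :samp: role into the
upper-cased form shown above.  The name render_samp is made up for this
sketch, plain ASCII quotes are used instead of the curly ones, and this is
not how Sphinx or makeinfo implement it.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Render the body of a :samp: role: text inside {braces} is upper-cased
// and the braces are dropped; everything else is copied unchanged.
std::string render_samp (const std::string &body)
{
  std::string out = "'";
  bool in_var = false;
  for (char c : body)
    {
      if (c == '{')
        {
          in_var = true;
          continue;
        }
      if (c == '}')
        {
          in_var = false;
          continue;
        }
      if (in_var)
        out += static_cast<char> (std::toupper (static_cast<unsigned char> (c)));
      else
        out += c;
    }
  out += "'";
  return out;
}
```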
Thanks,
Martin





Re: HELP!! How to inhibit optimizations applied to .DEFERRED_INIT argument?

2021-06-30 Thread Qing Zhao via Gcc-patches


On Jun 30, 2021, at 2:46 AM, Richard Biener <rguent...@suse.de> wrote:

On Wed, 30 Jun 2021, Qing Zhao wrote:

Hi,

I am testing the 4th patch of -ftrivial-auto-var-init with CPU2017 today, and 
found the following issues:

In the dump file of “*t.i.031t.objsz1”, we have:

 :
 __s1_len_217 = .DEFERRED_INIT (__s1_len_176, 2);
 __s2_len_218 = .DEFERRED_INIT (__s2_len_177, 2);

It looks like this .DEFERRED_INIT initializes an already initialized
variable.

Yes.

For cases like the following:

int s2_len;
s2_len = 4;

i.e., the initialization is not at the declaration.

We cannot avoid initialization for such cases.

 I'd expect to only ever see default definition SSA names
as first argument to .DEFERRED_INIT.

You mean something like:
__s2_len_218 = .DEFERRED_INIT (__s2_len, 2);

?


 __s2_len_219 = 7;
 if (__s2_len_219 <= 3)
   goto ; [INV]
 else
   goto ; [INV]

  :
 _1 = (long unsigned int) i_175;


However, after “ccp”, in “t.i.032t.ccp1”, we have:

 :
 __s1_len_217 = .DEFERRED_INIT (__s1_len_176, 2);
 __s2_len_218 = .DEFERRED_INIT (7, 2);
 _36 = (long unsigned int) i_175;
 _37 = _36 * 8;
 _38 = argv_220(D) + _37;


It looks like the optimization “ccp” replaced the first argument of the call 
.DEFERRED_INIT with the constant 7.
This should be avoided.

(NOTE: this issue existed in the previous patches; however, it is only exposed 
with this version, since I added more verification
code in tree-cfg.c to verify the call to .DEFERRED_INIT.)

I am wondering what’s the best solution to this problem?

I think you have to trace where this "bogus" .DEFERRED_INIT comes from
originally.  Or alternatively, if this is unavoidable,

This is unavoidable, I believe.

add "constant
folding" of .DEFERRED_INIT so that deferred init of an initialized
object becomes the object itself, thus retaining the previous - eventually
partial - initialization only.

If this additional .DEFERRED_INIT is kept until the RTL expansion phase, then 
it will become a real initialization:

i.e.

s2_len = 0;  // .DEFERRED_INIT expanded
s2_len = 4;  // the original initialization

Then the first initialization will be eliminated by current RTL optimization 
easily, right?
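
[Illustration only, not GCC code.]  The "constant folding" idea quoted above
can be modelled with a toy operand type: if the first argument of
.DEFERRED_INIT is a default definition (printed with "(D)" in the dumps), the
call is kept; otherwise the object already has a value and the call folds to
that value.  The names operand, fold_result and fold_deferred_init are
invented for this sketch and do not correspond to real GCC interfaces.

```cpp
#include <cassert>
#include <string>

// Toy stand-in for a GIMPLE operand; this is not GCC's 'tree' type.
struct operand
{
  std::string name;      // e.g. "__s2_len_217" or "7"
  bool is_default_def;   // true for SSA names printed with "(D)"
};

enum class fold_result
{
  keep_call,       // leave the .DEFERRED_INIT call in place
  use_first_arg    // replace the call by its first argument
};

// Decide what to do with LHS = .DEFERRED_INIT (first_arg, init_type).
fold_result fold_deferred_init (const operand &first_arg)
{
  // An uninitialized object still needs the deferred initialization.
  if (first_arg.is_default_def)
    return fold_result::keep_call;
  // Otherwise the object was already (perhaps partially) initialized,
  // so the previous value is retained.
  return fold_result::use_first_arg;
}
```

Under this model, the ".DEFERRED_INIT (7, 2)" form seen after ccp would
simply fold to 7, instead of tripping a verifier that expects an SSA name.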

Qing


Richard.

Can we add any attribute to the internal function argument to prevent later 
optimizations that might be applied to it?
Or just update the “ccp” phase to specially handle calls to .DEFERRED_INIT? (Not 
sure whether other phases have the
same issue?)

Let me know if you have any suggestion.

Thanks a lot for your help.

Qing

--
Richard Biener <rguent...@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)



[committed] analyzer: eliminate enum binding_key [PR95006]

2021-06-30 Thread David Malcolm via Gcc-patches
I rewrote the way the analyzer's region_model tracks the state of memory
in GCC 11 (in 808f4dfeb3a95f50f15e71148e5c1067f90a126d), which
introduced a store with a binding_map class, mapping binding keys to
symbolic values.

The GCC 11 implementation of binding keys has an enum binding_kind,
which can be "default" vs "direct"; the idea being that direct
bindings take priority over default bindings, where the latter could
be used to represent e.g. a zero-fill of a buffer, and the former
expresses those subregions that have since been touched.

This doesn't work well: it doesn't express the idea of filling
different subregions with different values, or a memset that only
touches part of a buffer, leading to numerous XFAILs in the memset
test cases (and elsewhere).

As preparatory work towards tracking uninitialized values, this patch
eliminates the enum binding_kind, so that all bindings have
equal weight; the order in which they happen is all that matters.
If a write partially overwrites an existing binding, the new code
clobbers only the overlapping part, potentially punching a
hole so that the existing binding is split into two parts.

The patch adds some new classes:
- a new "bits_within_svalue" symbolic value to support extracting
  parts of an existing value when its binding is partially clobbered
- a new "repeated_svalue" symbolic value to better express filling
  a region with repeated copies of a symbolic value (e.g. constant
  zero)
- a new "sized_region" region to express accessing a subregion
  with a symbolic size in bytes
and it rewrites e.g. how memset is implemented, so that we can precisely
track which bits in a region have not been touched.

That said, the patch doesn't actually implement "uninitialized" values;
I'm saving that for a followup.
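
[Illustration only.]  The "punching a hole" behaviour can be sketched with a
toy map of half-open bit ranges.  This is a deliberately simplified model:
toy_binding_map and its members are invented names, values are plain strings
rather than symbolic values, and the real store in the analyzer is
considerably more involved.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// One binding covers the half-open bit range [start, end); the start is
// the map key.
struct binding
{
  unsigned end;
  std::string value;
};

class toy_binding_map
{
public:
  // Bind [start, end) to VALUE.  Any older binding that overlaps the
  // range keeps only the parts outside it, so a write into the middle
  // of an older binding splits that binding in two.
  void write (unsigned start, unsigned end, const std::string &value)
  {
    assert (start < end);
    std::map<unsigned, binding> kept;
    for (const auto &kv : m_map)
      {
        unsigned s = kv.first;
        unsigned e = kv.second.end;
        if (e <= start || s >= end)
          {
            kept[s] = kv.second;   // no overlap: keep as is
            continue;
          }
        if (s < start)
          kept[s] = {start, kv.second.value};   // part left of the hole
        if (e > end)
          kept[end] = {e, kv.second.value};     // part right of the hole
      }
    kept[start] = {end, value};
    m_map.swap (kept);
  }

  // Return the value bound at BIT, or "" if that bit was never written.
  std::string read (unsigned bit) const
  {
    auto it = m_map.upper_bound (bit);
    if (it == m_map.begin ())
      return "";
    --it;
    return bit < it->second.end ? it->second.value : "";
  }

  std::size_t size () const { return m_map.size (); }

private:
  std::map<unsigned, binding> m_map;
};
```

With this model, a zero-fill of bits [0, 64) followed by a write to [16, 32)
leaves three bindings, and every bit outside all of them reads back as
"untouched", which is the property needed to track uninitialized values
later on.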

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Smoketested on powerpc64-unknown-linux-gnu (gcc110 in compile farm)

Pushed to trunk as r12-1931-ge61ffa201403e3814a43b176883e176716b1492f.

gcc/analyzer/ChangeLog:
PR analyzer/95006
* analyzer.h (class repeated_svalue): New forward decl.
(class bits_within_svalue): New forward decl.
(class sized_region): New forward decl.
(get_field_at_bit_offset): New forward decl.
* engine.cc (exploded_graph::get_or_create_node): Validate the
merged state.
(exploded_graph::maybe_process_run_of_before_supernode_enodes):
Validate the states at each stage.
* program-state.cc (program_state::validate): Validate
m_region_model.
* region-model-impl-calls.cc (region_model::impl_call_memset):
Replace special-case logic for handling constant sizes with
a call to fill_region of a sized_region with the given fill value.
* region-model-manager.cc (maybe_undo_optimize_bit_field_compare):
Drop DK_direct.
(region_model_manager::maybe_fold_sub_svalue):  Fold element-based
subregions of an initial value into initial values of an element.
Fold subvalues of repeated svalues.
(region_model_manager::maybe_fold_repeated_svalue): New.
(region_model_manager::get_or_create_repeated_svalue): New.
(get_bit_range_for_field): New.
(get_byte_range_for_field): New.
(get_field_at_byte_range): New.
(region_model_manager::maybe_fold_bits_within_svalue): New.
(region_model_manager::get_or_create_bits_within): New.
(region_model_manager::get_sized_region): New.
(region_model_manager::log_stats): Update for addition of
m_repeated_values_map, m_bits_within_values_map, and
m_sized_regions.
* region-model.cc (region_model::validate): New.
(region_model::on_assignment): Drop enum binding_kind.
(region_model::get_initial_value_for_global): Likewise.
(region_model::get_rvalue_for_bits): Replace body with call to
get_or_create_bits_within.
(region_model::get_capacity): Handle RK_SIZED.
(region_model::set_value): Drop enum binding_kind.
(region_model::fill_region): New.
(region_model::get_representative_path_var_1): Handle RK_SIZED.
* region-model.h (visitor::visit_repeated_svalue): New.
(visitor::visit_bits_within_svalue): New.
(region_model_manager::get_or_create_repeated_svalue): New decl.
(region_model_manager::get_or_create_bits_within): New decl.
(region_model_manager::get_sized_region): New decl.
(region_model_manager::maybe_fold_repeated_svalue): New decl.
(region_model_manager::maybe_fold_bits_within_svalue): New decl.
(region_model_manager::repeated_values_map_t): New typedef.
(region_model_manager::m_repeated_values_map): New field.
(region_model_manager::bits_within_values_map_t): New typedef.
(region_model_manager::m_bits_within_values_map): New field.
(region_model_manager::m_sized_regions): New field.
(region_mode

Re: HELP!! How to inhibit optimizations applied to .DEFERRED_INIT argument?

2021-06-30 Thread Richard Biener
On Wed, 30 Jun 2021, Qing Zhao wrote:

> 
> 
> On Jun 30, 2021, at 2:46 AM, Richard Biener <rguent...@suse.de> wrote:
> 
> On Wed, 30 Jun 2021, Qing Zhao wrote:
> 
> Hi,
> 
> I am testing the 4th patch of -ftrivial-auto-var-init with CPU2017 today, and 
> found the following issues:
> 
> In the dump file of “*t.i.031t.objsz1”, we have:
> 
>  :
>  __s1_len_217 = .DEFERRED_INIT (__s1_len_176, 2);
>  __s2_len_218 = .DEFERRED_INIT (__s2_len_177, 2);
> 
> It looks like this .DEFERRED_INIT initializes an already initialized
> variable.
> 
> Yes.
> 
> For cases like the following:
> 
> int s2_len;
> s2_len = 4;
> 
> i.e., the initialization is not at the declaration.
> 
> We cannot avoid initialization for such cases.

But I'd have expected

  s2_len = .DEFERRED_INIT (s2_len, 0);
  s2_len = 4;

from the above - thus the deferred init _before_ the first
"use" which is the explicit init.  How does the other order
happen to materialize?  As said, I believe it shouldn't.

>  I'd expect to only ever see default definition SSA names
> as first argument to .DEFERRED_INIT.
> 
> You mean something like:
> __s2_len_218 = .DEFERRED_INIT (__s2_len, 2);

No,

__s2_len_218 = .DEFERRED_INIT (__s2_len_217(D), 2);

> ?
> 
> 
>  __s2_len_219 = 7;
>  if (__s2_len_219 <= 3)
>goto ; [INV]
>  else
>goto ; [INV]
> 
>   :
>  _1 = (long unsigned int) i_175;
> 
> 
> However, after “ccp”, in “t.i.032t.ccp1”, we have:
> 
>  :
>  __s1_len_217 = .DEFERRED_INIT (__s1_len_176, 2);
>  __s2_len_218 = .DEFERRED_INIT (7, 2);
>  _36 = (long unsigned int) i_175;
>  _37 = _36 * 8;
>  _38 = argv_220(D) + _37;
> 
> 
> It looks like the optimization “ccp” replaced the first argument of the 
> call .DEFERRED_INIT with the constant 7.
> This should be avoided.
> 
> (NOTE: this issue existed in the previous patches; however, it is only exposed 
> with this version, since I added more verification
> code in tree-cfg.c to verify the call to .DEFERRED_INIT.)
> 
> I am wondering what’s the best solution to this problem?
> 
> I think you have to trace where this "bogus" .DEFERRED_INIT comes from
> originally.  Or alternatively, if this is unavoidable,
> 
> This is unavoidable, I believe.

I see but don't believe it yet ;)

> add "constant
> folding" of .DEFERRED_INIT so that deferred init of an initialized
> object becomes the object itself, thus retaining the previous - eventually
> partial - initialization only.
> 
> If this additional .DEFERRED_INIT is kept until the RTL expansion phase, then 
> it will become a real initialization:
> 
> i.e.
> 
> s2_len = 0;  // .DEFERRED_INIT expanded
> s2_len = 4;  // the original initialization
> 
> Then the first initialization will be eliminated by current RTL optimization 
> easily, right?

Well, in your example above it's effectively eliminated by GIMPLE 
optimization.  IIRC you're using the first argument of .DEFERRED_INIT
for diagnostic purposes only, correct?

Richard.

> Qing
> 
> 
> Richard.
> 
> Can we add any attribute to the internal function argument to prevent later 
> optimizations that might be applied to it?
> Or just update the “ccp” phase to specially handle calls to .DEFERRED_INIT? (Not 
> sure whether other phases have the
> same issue?)
> 
> Let me know if you have any suggestion.
> 
> Thanks a lot for your help.
> 
> Qing
> 
> --
> Richard Biener <rguent...@suse.de>
> SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
> Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


Re: [PATCH 2/2] c++: Extend PR96204 fix to variable templates

2021-06-30 Thread Patrick Palka via Gcc-patches
On Tue, 29 Jun 2021, Jason Merrill wrote:

> On 6/29/21 1:57 PM, Patrick Palka wrote:
> > r12-1829 corrected the access scope during partial specialization
> > matching of class templates, but neglected the variable template case.
> > This patch moves the access scope adjustment to inside
> > most_specialized_partial_spec, so that all callers can benefit.
> > 
> > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> > trunk?
> > 
> > PR c++/96204
> > 
> > gcc/cp/ChangeLog:
> > 
> > * pt.c (instantiate_class_template_1): Remove call to
> > push_nested_class and pop_nested_class added by r12-1829.
> > (most_specialized_partial_spec): Use push_access_scope_guard
> > and deferring_access_check_sentinel.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * g++.dg/template/access40b.C: New test.
> > ---
> >   gcc/cp/pt.c   | 12 +++
> >   gcc/testsuite/g++.dg/template/access40b.C | 26 +++
> >   2 files changed, 34 insertions(+), 4 deletions(-)
> >   create mode 100644 gcc/testsuite/g++.dg/template/access40b.C
> > 
> > diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
> > index bd8b17ca047..1e2e2ba5329 100644
> > --- a/gcc/cp/pt.c
> > +++ b/gcc/cp/pt.c
> > @@ -11776,11 +11776,8 @@ instantiate_class_template_1 (tree type)
> > deferring_access_check_sentinel acs (dk_no_deferred);
> >   /* Determine what specialization of the original template to
> > - instantiate; do this relative to the scope of the class for
> > - sake of access checking.  */
> > -  push_nested_class (type);
> > + instantiate.  */
> > t = most_specialized_partial_spec (type, tf_warning_or_error);
> > -  pop_nested_class ();
> > if (t == error_mark_node)
> >   return error_mark_node;
> > else if (t)
> > @@ -24989,26 +24986,33 @@ most_specialized_partial_spec (tree target,
> > tsubst_flags_t complain)
> > tree outer_args = NULL_TREE;
> > tree tmpl, args;
> >   +  tree decl;
> > if (TYPE_P (target))
> >   {
> > tree tinfo = CLASSTYPE_TEMPLATE_INFO (target);
> > tmpl = TI_TEMPLATE (tinfo);
> > args = TI_ARGS (tinfo);
> > +  decl = TYPE_NAME (target);
> >   }
> > else if (TREE_CODE (target) == TEMPLATE_ID_EXPR)
> >   {
> > tmpl = TREE_OPERAND (target, 0);
> > args = TREE_OPERAND (target, 1);
> > +  decl = DECL_TEMPLATE_RESULT (tmpl);
> 
> Hmm, this won't get us the right scope; we get here for the result of
> finish_template_variable, where tmpl is the most general template and args are
> args for it.  So in the below testcase, tmpl is outer::N:
> 
> template  struct outer {
>   template 
>   static constexpr int f() { return N; };
> 
>   template 
>   static const int N = f();
> };
> 
> template 
> template 
> const int outer::N = 1;
> 
> int i = outer::N;
> 
> Oddly, I notice that we also get here for static data members of class
> templates that are not themselves templates, as in mem-partial1.C that I
> adapted the above from.  Fixed by the attached patch.

Makes sense.  I was wondering if the VAR_P (pattern) test in
instantiate_template_1 should be adjusted as well, but that doesn't seem
to be strictly necessary since a VAR_DECL there will always be a
variable template specialization.

> 
> Since the type of the variable depends on the specialization, we can't
> actually get the decl before doing the resolution, but we should be able to
> push into the right enclosing class.  Perhaps we should pass the partially
> instantiated template and its args to lookup_template_variable instead of the
> most general template and its args.
> 

It seems what works is passing the partially instantiated template and
the full set of args to lookup_template_variable, because the outer
args of the partial specialization may be dependent as in e.g. the
above testcase.  One would hope that 'tmpl' contains the partially
instantiated template, but that's not the case because
finish_template_variable passes only the most general template
to us.  So we need to adjust finish_template_variable to pass the
partially instantiated template instead.

And instantiate_decl needs an adjustment as well, since it too calls
most_specialized_partial_spec.  Here, we could just pass the VAR_DECL
'd' to most_specialized_partial_spec, which'll set up the right
context for us.
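
[Illustration only.]  For readers not following the first patch in this
series: "setting up the right context" relies on an RAII guard that pushes a
scope on construction and pops it on destruction.  The standalone sketch
below mimics that pattern with a mock scope stack.  scope_stack, push_scope,
pop_scope, scope_guard and depth_inside are made-up names for this sketch,
not the front end's real functions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Mock stand-in for the front end's notion of the current scope.
static std::vector<std::string> scope_stack;

void push_scope (const std::string &s)
{
  scope_stack.push_back (s);
}

void pop_scope (const std::string &s)
{
  // Pops must mirror pushes exactly.
  assert (!scope_stack.empty () && scope_stack.back () == s);
  scope_stack.pop_back ();
}

// RAII guard: enters the scope in the constructor and leaves it in the
// destructor.  An empty name means "nothing to push", similar in spirit
// to a guard that ignores declarations it cannot handle.
struct scope_guard
{
  std::string name;

  explicit scope_guard (const std::string &n)
    : name (n)
  {
    if (!name.empty ())
      push_scope (name);
  }

  ~scope_guard ()
  {
    if (!name.empty ())
      pop_scope (name);
  }

  scope_guard (const scope_guard &) = delete;
  scope_guard &operator= (const scope_guard &) = delete;
};

// Report how deep the stack is while the guard is alive.
std::size_t depth_inside (const std::string &n)
{
  scope_guard g (n);
  return scope_stack.size ();
}
```

The point of the pattern is that every return path out of the guarded region
pops the scope, which is what lets a function with several early returns
adopt it without touching each return.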

How does the following look?  Passes make check-c++, full testing in
progress.  The added second testcase below should adequately test all
this IIUC.

-- >8 --

PR c++/96204

gcc/cp/ChangeLog:

* pt.c (finish_template_variable): Pass the partially
instantiated template and args to instantiate_template.
(instantiate_class_template_1): No need to call
push_nested_class and pop_nested_class around the call to
most_specialized_partial_spec.
(instantiate_template_1): Pass the partially instantiated
template to lookup_template_variable.
(most_specialized_pa

Re: [RFC PATCH] Change the type of predicates to bool.

2021-06-30 Thread Jeff Law via Gcc-patches




On 6/30/2021 4:50 AM, Richard Biener via Gcc-patches wrote:

On Wed, Jun 30, 2021 at 10:47 AM Uros Bizjak via Gcc-patches
 wrote:

This RFC patch changes the type of predicates to bool. However, some
of the targets (e.g. x86) use indirect functions to call the
predicates, so without the local change, the build fails. Putting the
patch through CI bots should weed out the problems, but I have no
infrastructure to do it myself.

I'd say thanks for the work - note building some cc1 crosses should
catch 99% of the fallout (just configure $target-linux/elf and make all-gcc)

Note that if he were to commit, my tester will pick up the crosses, so 
we'll know about any cross fallout in ~24hrs.  The "native chroot" 
tests are once a week, so if it breaks something like hppa or m68k we'd 
still know within a week.


jeff



Re: [PATCH 1/2] c++: Fix push_access_scope and introduce RAII wrapper for it

2021-06-30 Thread Patrick Palka via Gcc-patches
On Tue, 29 Jun 2021, Jason Merrill wrote:

> On 6/29/21 1:57 PM, Patrick Palka wrote:
> > When push_access_scope is passed a TYPE_DECL for a class type (which
> > can happen during e.g. satisfaction), we undesirably push only the
> > enclosing context of the class instead of the class itself.  This causes
> > us to mishandle e.g. testcase below due to us not entering the scope of
> > A before checking its constraints.
> > 
> > This patch adjusts push_access_scope accordingly, and introduces an
> > RAII wrapper for it.  We also make use of this wrapper right away by
> > replacing the only use of push_nested_class_guard with this new wrapper,
> > which means we can remove this old wrapper (whose functionality is
> > basically subsumed by the new wrapper).
> > 
> > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> > trunk?
> > 
> > gcc/cp/ChangeLog:
> > 
> > * constraint.cc (get_normalized_constraints_from_decl): Use
> > push_access_scope_guard instead of push_nested_class_guard.
> > * cp-tree.h (struct push_nested_class_guard): Replace with ...
> > (struct push_access_scope_guard): ... this.
> > * pt.c (push_access_scope): When the argument corresponds to
> > a class type, push the class instead of its context.
> > (pop_access_scope): Adjust accordingly.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * g++.dg/cpp2a/concepts-access2.C: New test.
> > ---
> >   gcc/cp/constraint.cc  |  7 +-
> >   gcc/cp/cp-tree.h  | 23 +++
> >   gcc/cp/pt.c   |  9 +++-
> >   gcc/testsuite/g++.dg/cpp2a/concepts-access2.C | 13 +++
> >   4 files changed, 35 insertions(+), 17 deletions(-)
> >   create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-access2.C
> > 
> > diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
> > index 6df3ca6ce32..99d3ccc6998 100644
> > --- a/gcc/cp/constraint.cc
> > +++ b/gcc/cp/constraint.cc
> > @@ -926,12 +926,7 @@ get_normalized_constraints_from_decl (tree d, bool diag
> > = false)
> > tree norm = NULL_TREE;
> > if (tree ci = get_constraints (decl))
> >   {
> > -  push_nested_class_guard pncs (DECL_CONTEXT (d));
> > -
> > -  temp_override ovr (current_function_decl);
> > -  if (TREE_CODE (decl) == FUNCTION_DECL)
> > -   current_function_decl = decl;
> > -
> > +  push_access_scope_guard pas (decl);
> > norm = get_normalized_constraints_from_info (ci, tmpl, diag);
> >   }
> >   diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
> > index 6f713719589..58da7460001 100644
> > --- a/gcc/cp/cp-tree.h
> > +++ b/gcc/cp/cp-tree.h
> > @@ -8463,21 +8463,24 @@ is_constrained_auto (const_tree t)
> > return is_auto (t) && PLACEHOLDER_TYPE_CONSTRAINTS_INFO (t);
> >   }
> > >   -/* RAII class to push/pop class scope T; if T is not a class, do nothing.  */
> > +/* RAII class to push/pop the access scope for T.  */
> >   -struct push_nested_class_guard
> > +struct push_access_scope_guard
> >   {
> > -  bool push;
> > -  push_nested_class_guard (tree t)
> > -: push (t && CLASS_TYPE_P (t))
> > +  tree decl;
> > +  push_access_scope_guard (tree t)
> > +: decl (t)
> > {
> > -if (push)
> > -  push_nested_class (t);
> > +if (VAR_OR_FUNCTION_DECL_P (decl)
> > +   || TREE_CODE (decl) == TYPE_DECL)
> > +  push_access_scope (decl);
> > +else
> > +  decl = NULL_TREE;
> > }
> > -  ~push_nested_class_guard ()
> > +  ~push_access_scope_guard ()
> > {
> > -if (push)
> > -  pop_nested_class ();
> > +if (decl)
> > +  pop_access_scope (decl);
> > }
> >   };
> >   diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
> > index f2039e09cd7..bd8b17ca047 100644
> > --- a/gcc/cp/pt.c
> > +++ b/gcc/cp/pt.c
> > @@ -224,7 +224,7 @@ static void instantiate_body (tree pattern, tree args,
> > tree d, bool nested);
> >   /* Make the current scope suitable for access checking when we are
> >  processing T.  T can be FUNCTION_DECL for instantiated function
> >  template, VAR_DECL for static member variable, or TYPE_DECL for
> > -   alias template (needed by instantiate_decl).  */
> > +   for a class or alias template (needed by instantiate_decl).  */
> > void
> >   push_access_scope (tree t)
> > @@ -234,6 +234,10 @@ push_access_scope (tree t)
> >   if (DECL_FRIEND_CONTEXT (t))
> >   push_nested_class (DECL_FRIEND_CONTEXT (t));
> > +  else if (TREE_CODE (t) == TYPE_DECL
> > +  && CLASS_TYPE_P (TREE_TYPE (t))
> > +  && DECL_ORIGINAL_TYPE (t) == NULL_TREE)
> 
> I suspect DECL_IMPLICIT_TYPEDEF_P is a better test for this case.

That works nicely.  How does the following look?  Bootstrapped and
regtested on x86_64-pc-linux-gnu.
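
For readers who want the gist of the change without reading the diff, here is a minimal Python model of the guard's behaviour. It is only a sketch and not GCC code: Decl, scope_stack, scope_for and access_scope_guard are hypothetical names, and a context manager stands in for the C++ RAII struct. It shows the two points from the patch description: the guard pushes a scope only for function, variable or type declarations, and for a type declaration that names a class it enters the class itself rather than the enclosing context.

```python
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Optional


@dataclass
class Decl:
    """Hypothetical stand-in for a GCC decl node."""
    name: str
    kind: str                      # 'function', 'var', 'type', or anything else
    context: Optional[str] = None  # enclosing class, if any (None = top level)
    names_class: bool = False      # True when a 'type' decl names a class itself


scope_stack = []  # models the stack of scopes used for access checking


def scope_for(decl):
    # A type decl that names a class enters the class itself;
    # everything else enters its enclosing context.
    if decl.kind == 'type' and decl.names_class:
        return decl.name
    return decl.context


@contextmanager
def access_scope_guard(decl):
    # Push on entry, pop on exit; do nothing for other kinds of decl.
    pushed = decl.kind in ('function', 'var', 'type')
    if pushed:
        scope_stack.append(scope_for(decl))
    try:
        yield
    finally:
        if pushed:
            scope_stack.pop()


A = Decl('A', 'type', context='Outer', names_class=True)
with access_scope_guard(A):
    inside = list(scope_stack)   # scope while the guard is live
after = list(scope_stack)        # scope once the guard is gone
```

With this model, the scope seen inside the guard for the class A is A itself rather than Outer, and the stack is empty again once the guard goes out of scope.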

-- >8 --

gcc/cp/ChangeLog:

* constraint.cc (get_normalized_constraints_from_decl): Use
push_access_scope_guard instead of push_nested_class_guard.
* cp-tree.h (struct push_nested_class_guard): Replace with ...
   

[PATCH] gcc-changelog: show correct line when complaining about unclosed paren

2021-06-30 Thread David Malcolm via Gcc-patches
Successfully tested via:
  pytest contrib/gcc-changelog/

contrib/ChangeLog:
* gcc-changelog/git_commit.py (ChangeLogEntry.__init__): Convert
ChangeLogEntry.opened_parentheses from an integer to a stack of
line strings.
(ChangeLogEntry.parse_changelog): Likewise.
(ChangeLogEntry.process_parentheses): Likewise.
(GitCommit.check_for_broken_parentheses): Update for above change.
Use line containing most recently opened parenthesis as line for
error.
* gcc-changelog/test_email.py
(TestGccChangelog.test_multiline_bad_parentheses): Verify that the
error uses the line containing the unclosed parenthesis, rather
than the first line.

Signed-off-by: David Malcolm 
---
 contrib/gcc-changelog/git_commit.py | 14 +++---
 contrib/gcc-changelog/test_email.py |  1 +
 2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/contrib/gcc-changelog/git_commit.py b/contrib/gcc-changelog/git_commit.py
index d1646bdc0cd..4aac4389a0d 100755
--- a/contrib/gcc-changelog/git_commit.py
+++ b/contrib/gcc-changelog/git_commit.py
@@ -217,7 +217,7 @@ class ChangeLogEntry:
 self.lines = []
 self.files = []
 self.file_patterns = []
-self.opened_parentheses = 0
+self.opened_parentheses = []  # stack of lines
 
 def parse_file_names(self):
 # Whether the content currently processed is between a star prefix the
@@ -549,7 +549,7 @@ class GitCommit:
 m = star_prefix_regex.match(line)
 if m:
 if (len(m.group('spaces')) != 1 and
-last_entry.opened_parentheses == 0):
+last_entry.opened_parentheses == []):
 msg = 'one space should follow asterisk'
 self.errors.append(Error(msg, line))
 else:
@@ -574,13 +574,13 @@ class GitCommit:
 def process_parentheses(self, last_entry, line):
 for c in line:
 if c == '(':
-last_entry.opened_parentheses += 1
+last_entry.opened_parentheses.append(line)
 elif c == ')':
-if last_entry.opened_parentheses == 0:
+if last_entry.opened_parentheses == []:
 msg = 'bad wrapping of parenthesis'
 self.errors.append(Error(msg, line))
 else:
-last_entry.opened_parentheses -= 1
+last_entry.opened_parentheses.pop()
 
 def parse_file_names(self):
 for entry in self.changelog_entries:
@@ -606,9 +606,9 @@ class GitCommit:
 
 def check_for_broken_parentheses(self):
 for entry in self.changelog_entries:
-if entry.opened_parentheses != 0:
+if entry.opened_parentheses != []:
 msg = 'bad parentheses wrapping'
-self.errors.append(Error(msg, entry.lines[0]))
+self.errors.append(Error(msg, entry.opened_parentheses[-1]))
 
 def get_file_changelog_location(self, changelog_file):
 for file in self.info.modified_files:
diff --git a/contrib/gcc-changelog/test_email.py b/contrib/gcc-changelog/test_email.py
index 319e065ca55..2f8e69fcdc0 100755
--- a/contrib/gcc-changelog/test_email.py
+++ b/contrib/gcc-changelog/test_email.py
@@ -415,6 +415,7 @@ class TestGccChangelog(unittest.TestCase):
 def test_multiline_bad_parentheses(self):
 email = self.from_patch_glob('0002-Wrong-macro-changelog.patch')
 assert email.errors[0].message == 'bad parentheses wrapping'
+assert email.errors[0].line == '\t* config/i386/i386.md (*fix_trunc_i387_1,'
 
 def test_changelog_removal(self):
 email = self.from_patch_glob('0001-ChangeLog-removal.patch')
-- 
2.26.3
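
The idea in the patch above can be tried outside the checker with a small self-contained sketch. Entry is a hypothetical stand-in for ChangeLogEntry, errors are plain (message, line) tuples rather than Error objects, and the ChangeLog lines are made up for illustration. The point is that a stack of lines records where each unclosed parenthesis was opened, so the complaint can name that line instead of the first line of the entry.

```python
class Entry:
    """Hypothetical stand-in for ChangeLogEntry; tracks open parentheses only."""

    def __init__(self):
        self.opened_parentheses = []  # stack of lines, one per unclosed '('


def process_parentheses(entry, line, errors):
    for c in line:
        if c == '(':
            # Remember the line that opened this parenthesis.
            entry.opened_parentheses.append(line)
        elif c == ')':
            if not entry.opened_parentheses:
                errors.append(('bad wrapping of parenthesis', line))
            else:
                entry.opened_parentheses.pop()


def check_for_broken_parentheses(entry, errors):
    if entry.opened_parentheses:
        # Report the line of the most recently opened, still unclosed '('.
        errors.append(('bad parentheses wrapping',
                       entry.opened_parentheses[-1]))


# Made-up ChangeLog lines: the second one opens a parenthesis
# that is never closed.
lines = [
    '\t* a.c (foo): Fix.',
    '\t* b.md (*pattern_1,',
    '\t*pattern_2: Likewise.',
]

entry = Entry()
errors = []
for changelog_line in lines:
    process_parentheses(entry, changelog_line, errors)
check_for_broken_parentheses(entry, errors)
```

Here the single error points at the b.md line, which is where the unclosed parenthesis was opened.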



Re: HELP!! How to inhibit optimizations applied to .DEFERRED_INIT argument?

2021-06-30 Thread Qing Zhao via Gcc-patches


> On Jun 30, 2021, at 9:39 AM, Richard Biener  wrote:
> 
> On Wed, 30 Jun 2021, Qing Zhao wrote:
> 
>> 
>> 
>> On Jun 30, 2021, at 2:46 AM, Richard Biener 
>> mailto:rguent...@suse.de>> wrote:
>> 
>> On Wed, 30 Jun 2021, Qing Zhao wrote:
>> 
>> Hi,
>> 
>> I am testing the 4th patch of -ftrivial-auto-var-init with CPU2017 today, 
>> and found the following issues:
>> 
>> In the dump file of “*t.i.031t.objsz1”, we have:
>> 
>>  :
>> __s1_len_217 = .DEFERRED_INIT (__s1_len_176, 2);
>> __s2_len_218 = .DEFERRED_INIT (__s2_len_177, 2);
>> 
>> It looks like this .DEFERRED_INIT initializes an already initialized
>> variable.
>> 
>> Yes.
>> 
>> For cases like the following:
>> 
>> int s2_len;
>> s2_len = 4;
>> 
>> i.e., the initialization is not at the declaration.
>> 
>> We cannot avoid initialization for such cases.
> 
> But I'd have expected
> 
>  s2_len = .DEFERRED_INIT (s2_len, 0);
>  s2_len = 4;
> 
> from the above - thus the deferred init _before_ the first
> "use" which is the explicit init.  How does the other order
> happen to materialize?  As said, I believe it shouldn't.

Right, this is strange to me too. … don’t quite understand. 
Maybe I need to debug deeper into “ccp” phase to understand this behavior.

> 
>> I'd expect to only ever see default definition SSA names
>> as first argument to .DEFERRED_INIT.
>> 
>> You mean something like:
>> __s2_len_218 = .DEFERRED_INIT (__s2_len, 2);
> 
> No,
> 
> __s2_len_218 = .DEFERRED_INIT (__s2_len_217(D), 2);

Okay, I see.

Then please see the following:

**In “GIMPLE” dump:

try
  {
C = .DEFERRED_INIT (C, 2);
syshandle = .DEFERRED_INIT (syshandle, 2);
ba = .DEFERRED_INIT (ba, 2);
{
  int i;

  i = .DEFERRED_INIT (i, 2);
  i = 0;
  goto ;
  :
  {
size_t __s1_len;
size_t __s2_len;

__s1_len = .DEFERRED_INIT (__s1_len, 2);
__s2_len = .DEFERRED_INIT (__s2_len, 2);
__s2_len = 7;
if (__s2_len <= 3) goto ; else goto ;

NOTE, in the above, in addition to “__s2_len”, “i” also has an original 
initialization already. 

**Then in “ssa” dump:

   :
  C_200 = .DEFERRED_INIT (C_199(D), 2);
  syshandle = .DEFERRED_INIT (syshandle, 2);
  ba_204 = .DEFERRED_INIT (ba_203(D), 2);
  i_206 = .DEFERRED_INIT (i_205(D), 2);
  i_207 = 0;
  goto ; [INV]

   :
  __s1_len_217 = .DEFERRED_INIT (__s1_len_176, 2);
  __s2_len_218 = .DEFERRED_INIT (__s2_len_177, 2);
  __s2_len_219 = 7;
  if (__s2_len_219 <= 3)

NOTE, in the above, “i” has:

i_206 = .DEFERRED_INIT (i_205(D), 2);

But, for “__s2_len”, it is:

__s2_len_218 = .DEFERRED_INIT (__s2_len_177, 2);

Not sure why there is such a difference.
> 
>> ?
>> 
>> 
>> __s2_len_219 = 7;
>> if (__s2_len_219 <= 3)
>>   goto ; [INV]
>> else
>>   goto ; [INV]
>> 
>>  :
>> _1 = (long unsigned int) i_175;
>> 
>> 
>> However, after “ccp”, in “t.i.032t.ccp1”, we have:
>> 
>>  :
>> __s1_len_217 = .DEFERRED_INIT (__s1_len_176, 2);
>> __s2_len_218 = .DEFERRED_INIT (7, 2);
>> _36 = (long unsigned int) i_175;
>> _37 = _36 * 8;
>> _38 = argv_220(D) + _37;
>> 
>> 
>> It looks like the “ccp” optimization replaced the first argument of the 
>> call to .DEFERRED_INIT with the constant 7.
>> This should be avoided.
>> 
>> (NOTE, this issue existed in the previous patches, however, only exposed 
>> with this version since I added more verification
>> code in tree-cfg.c to verify the call to .DEFERRED_INIT).
>> 
>> I am wondering what’s the best solution to this problem?
>> 
>> I think you have to trace where this "bogus" .DEFERRED_INIT comes from
>> originally.  Or alternatively, if this is unavoidable,
>> 
>> This is unavoidable, I believe.
> 
> I see but don't believe it yet ;)

In order to add initialization for all possibly uninitialized auto variables, 
we only check whether the auto variable is initialized at the
declaration; we don’t follow the data flow to look for a possible later 
initialization. Therefore, we might add more initializations than 
necessary.  However, most of these additional initializations should be 
eliminated easily by later optimizations, especially after 
the call to .DEFERRED_INIT has been expanded to a real initialization. 
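
To make that concrete, here is a toy Python model of the scheme. It is a sketch under strong assumptions and not how GCC implements any of this: statements are plain tuples, the block is straight-line code, and add_deferred_inits and drop_dead_deferred_inits are hypothetical helper names. The first pass looks at the declaration only; the second plays the role of a later optimization that drops a deferred init when the next reference to the variable is a plain store.

```python
def add_deferred_inits(stmts):
    """Add a deferred init after every declaration without an initializer.

    Only the declaration is inspected; later assignments are not followed.
    """
    out = []
    for s in stmts:
        out.append(s)
        if s[0] == 'decl' and s[2] is None:
            out.append(('deferred_init', s[1]))
    return out


def drop_dead_deferred_inits(stmts):
    """Drop a deferred init that is overwritten before the variable is read.

    Straight-line code only; stands in for later dead-store elimination.
    """
    out = []
    for i, s in enumerate(stmts):
        if s[0] == 'deferred_init':
            later = [t for t in stmts[i + 1:]
                     if t[0] in ('assign', 'use') and t[1] == s[1]]
            if later and later[0][0] == 'assign':
                continue  # overwritten before any read: the init is dead
        out.append(s)
    return out


block = [
    ('decl', '__s1_len', None),
    ('decl', '__s2_len', None),
    ('assign', '__s2_len', 7),   # the original initialization
    ('use', '__s2_len'),
    ('use', '__s1_len'),
]

with_inits = add_deferred_inits(block)
cleaned = drop_dead_deferred_inits(with_inits)
```

In this model both variables get a deferred init, and only the one for __s2_len is removed again, because it is stored to before it is ever read.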
> 
>> add "constant
>> folding" of .DEFERRED_INIT so that deferred init of an initialized
>> object becomes the object itself, thus retaining only the previous
>> (possibly partial) initialization.
>> 
>> If this additional .DEFERRED_INIT is kept until the RTL expansion phase, 
>> it will become a real initialization:
>> 
>> i.e.
>> 
>> s2_len = 0;//.DEFERRED_INIT expanded
>> s2_len = 4;// the original initialization
>> 
>> Then the first initialization will be eliminated by current RTL optimization 
>> easily, right?
> 
> Well, in your example above it's effectively eliminated by GIMPLE 
> optimization.  IIRC you're using the first argument of .DEFERRED_INIT
> for diagnostic purposes only, correct?
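
The "constant folding" alternative suggested earlier in the thread can be illustrated with a small Python sketch. This only models the rule as described in the mail; SSAName and fold_deferred_init are hypothetical names, not GCC's API. If the first argument is a default definition (the "(D)" names in the dumps), the call is kept; anything else is already initialized, so the call folds to its argument.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SSAName:
    """Hypothetical model of an SSA name such as i_205(D) or __s2_len_177."""
    var: str
    version: int
    is_default_def: bool = False  # True for the "(D)" names


def fold_deferred_init(arg):
    """Return the value .DEFERRED_INIT (arg, ...) folds to, or None to keep it.

    A default definition is still uninitialized, so the call must stay.
    Any other argument (an SSA name with a real definition, or a constant)
    is already initialized, so the call folds to the argument itself.
    """
    if isinstance(arg, SSAName) and arg.is_default_def:
        return None
    return arg


keep = fold_deferred_init(SSAName('i', 205, is_default_def=True))
fold_name = fold_deferred_init(SSAName('__s2_len', 177))
fold_const = fold_deferred_init(7)
```

Under this rule the call on i_205(D) is kept, while the calls on __s2_len_177 and on the constant 7 simply yield their argument.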

In Previous patches, Yes, the first one only for diagn

Re: [PATCH 2/4] allow poisoning input_location in ranges it should not be used

2021-06-30 Thread David Malcolm via Gcc-patches
On Wed, 2021-06-30 at 01:35 -0400, Trevor Saunders wrote:
> This makes it possible to assert if input_location is used during the
> lifetime
> of a scope.  This will allow us to find places that currently use it
> within a
> function and its callees, or prevent adding uses within the lifetime
> of a
> function after all existing uses are removed.
> 
> bootstrapped and regtested on x86_64-linux-gnu, ok?
> 
> Trev

[...snip...]

> diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
> index d58586f2526..3f68d1d79eb 100644
> --- a/gcc/diagnostic.c
> +++ b/gcc/diagnostic.c
> @@ -1835,7 +1835,7 @@ internal_error (const char *gmsgid, ...)
>    auto_diagnostic_group d;
>    va_list ap;
>    va_start (ap, gmsgid);
> -  rich_location richloc (line_table, input_location);
> +  rich_location richloc (line_table, UNKNOWN_LOCATION);
>    diagnostic_impl (&richloc, NULL, -1, gmsgid, &ap, DK_ICE);
>    va_end (ap);
>  

I actually make use of this in the analyzer: the analyzer sets
input_location to stmt->location when analyzing a given stmt - that
way, if the analyzer ICEs, the ICE is shown at the code construct that
crashed the analyzer.

This behavior is useful to me, and would be lost with the proposed
patch.

Is there a better way of doing what I'm doing?

Is the long-term goal of the patch kit to reduce our reliance on global
variables?  Are we ultimately still going to need a variable for "where
to show the ICE if gcc crashes"?  (perhaps stashing it in the
diagnostic_context???)
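
As a rough illustration of the "stash it in the diagnostic_context" idea, here is a Python sketch. The class, its at method and the ice_location field are hypothetical and only model the behaviour described above; none of this is GCC's actual interface. The location used for crash reports lives in a context object and is overridden for the duration of a scope, instead of being read from a global.

```python
from contextlib import contextmanager


class DiagnosticContext:
    """Hypothetical holder for the 'where to report a crash' location."""

    def __init__(self):
        self.ice_location = None  # None plays the role of UNKNOWN_LOCATION

    @contextmanager
    def at(self, location):
        # Override the crash location for the scope, then restore it.
        saved = self.ice_location
        self.ice_location = location
        try:
            yield
        finally:
            self.ice_location = saved

    def internal_error(self, msg):
        where = self.ice_location if self.ice_location is not None else '<unknown>'
        return f'{where}: internal compiler error: {msg}'


dc = DiagnosticContext()
with dc.at('t.c:12:3'):          # e.g. while analyzing one statement
    inside = dc.internal_error('boom')
outside = dc.internal_error('boom')
```

Inside the scope the report carries the statement's location; outside it falls back to the unknown location, which is what the proposed patch would always do.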

Hope this is constructive
Dave



Re: [ARM] PR98435: Missed optimization in expanding vector constructor

2021-06-30 Thread Christophe LYON via Gcc-patches



On 29/06/2021 12:46, Prathamesh Kulkarni wrote:

On Mon, 28 Jun 2021 at 14:48, Christophe LYON
 wrote:


On 28/06/2021 10:40, Kyrylo Tkachov via Gcc-patches wrote:

-Original Message-
From: Prathamesh Kulkarni 
Sent: 28 June 2021 09:38
To: Kyrylo Tkachov 
Cc: Christophe Lyon ; gcc Patches 
Subject: Re: [ARM] PR98435: Missed optimization in expanding vector
constructor

On Thu, 24 Jun 2021 at 22:01, Kyrylo Tkachov 
wrote:



-Original Message-
From: Prathamesh Kulkarni 
Sent: 14 June 2021 09:02
To: Christophe Lyon 
Cc: gcc Patches ; Kyrylo Tkachov

Subject: Re: [ARM] PR98435: Missed optimization in expanding vector
constructor

On Wed, 9 Jun 2021 at 15:58, Prathamesh Kulkarni
 wrote:

On Fri, 4 Jun 2021 at 13:15, Christophe Lyon



wrote:

On Fri, 4 Jun 2021 at 09:27, Prathamesh Kulkarni via Gcc-patches
 wrote:

Hi,
As mentioned in PR, for the following test-case:

#include <arm_neon.h>

bfloat16x4_t f1 (bfloat16_t a)
{
return vdup_n_bf16 (a);
}

bfloat16x4_t f2 (bfloat16_t a)
{
return (bfloat16x4_t) {a, a, a, a};
}

Compiling with arm-linux-gnueabi -O3 -mfpu=neon -mfloat-abi=softfp
-march=armv8.2-a+bf16+fp16 results in f2 not being vectorized:

f1:
  vdup.16 d16, r0
  vmov    r0, r1, d16  @ v4bf
  bx  lr

f2:
  mov r3, r0  @ __bf16
  adr r1, .L4
  ldrd    r0, [r1]
  mov r2, r3  @ __bf16
  mov ip, r3  @ __bf16
  bfi r1, r2, #0, #16
  bfi r0, ip, #0, #16
  bfi r1, r3, #16, #16
  bfi r0, r2, #16, #16
  bx  lr

This seems to happen because vec_init pattern in neon.md has VDQ mode
iterator, which doesn't include V4BF. In attached patch, I changed
mode to VDQX which seems to work for the test-case, and the compiler
now generates:

f2:
  vdup.16 d16, r0
  vmov    r0, r1, d16  @ v4bf
  bx  lr

However, the pattern is also gated on TARGET_HAVE_MVE and I am not
sure if either VDQ or VDQX are correct modes for MVE since MVE has
only 128-bit vectors ?


I think patterns common to both Neon and MVE should be moved to
vec-common.md, I don't know why such patterns were left in neon.md.

Since we end up calling neon_expand_vector_init for both NEON and MVE,
I am not sure if we should separate the pattern?
Would it make sense to FAIL if the mode size isn't 16 bytes for MVE as
in attached patch so it will call neon_expand_vector_init only for
128-bit vectors?
Although hard-coding 16 in the pattern doesn't seem a good idea to me either.

ping https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572342.html
(attaching patch as text).


--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -459,10 +459,12 @@
   )

   (define_expand "vec_init"
-  [(match_operand:VDQ 0 "s_register_operand")
+  [(match_operand:VDQX 0 "s_register_operand")
  (match_operand 1 "" "")]
 "TARGET_NEON || TARGET_HAVE_MVE"
   {
+  if (TARGET_HAVE_MVE && GET_MODE_SIZE (GET_MODE (operands[0])) != 16)
+FAIL;
 neon_expand_vector_init (operands[0], operands[1]);
 DONE;
   })

I think we should move this to vec-common.md like Christophe said.
Perhaps rather than making it FAIL for non-16 MVE sizes we just disable it in
the expander condition?

"TARGET_NEON || (TARGET_HAVE_MVE && GET_MODE_SIZE (<VDQ>mode) != 16)"
Is it OK to use mode ? Because using mode resulted in lot
of build errors.
Also, I think the comparison should be inverted, ie, GET_MODE_SIZE
(mode) == 16 since
we want to make the pattern pass if target is MVE and vector size is 16 bytes ?
Do these changes in attached patch look OK ?

Yes, you're right.
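
For clarity, the corrected condition (with the comparison inverted to == 16, as agreed above) can be written out as a tiny Python predicate. This is only a sketch of the logic being discussed, not the machine-description code; vec_init_enabled and its parameters are hypothetical names.

```python
def vec_init_enabled(target_neon, target_have_mve, mode_size_bytes):
    """Model of the proposed expander condition.

    Neon accepts every mode of the iterator; MVE only has 128-bit vectors,
    so it accepts a mode only when its size is 16 bytes.
    """
    return bool(target_neon or (target_have_mve and mode_size_bytes == 16))


neon_64bit = vec_init_enabled(True, False, 8)    # e.g. a 64-bit mode on Neon
mve_64bit = vec_init_enabled(False, True, 8)     # 64-bit mode on MVE
mve_128bit = vec_init_enabled(False, True, 16)   # 128-bit mode on MVE
```

With the original "!= 16" comparison the two MVE cases would come out the other way round, which is the mistake pointed out above.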


Can't this be ARM_HAVE_<MODE>_ARITH like in most expanders in vec-common.md?

(maybe with a && !TARGET_REALLY_IWMMXT if needed)

I wonder if this should be ARM_HAVE_<MODE>_LDST instead since we're
initializing the vector ?



Well, it really depends on which modes you want to enable.


Looks like your move VDQ -> VDQX adds V4BF, V8BF and DI.

Are they all OK for Neon?

They are not OK for MVE.

Ideally you could add testcases to cover the supported and
unsupported modes for both Neon and MVE.


Before your patch, the expander is enabled for MVE for 64 bit modes 
(V8QI, V4HI, V2SI): what happens in this case? Does the compiler crash 
or is there something else preventing the match?



Thanks,


Christophe



Thanks,
Prathamesh


Christophe



Ok.
Thanks,
Kyrill



Thanks,
Prathamesh

Thanks,
Kyrill


Thanks,
Prathamesh

Thanks,
Prathamesh

That being said, I suggest you look at other similar patterns in
vec-common.md, most of which are gated on
ARM_HAVE_<MODE>_ARITH
and possibly beware of issues with iwmmxt :-)

Christophe


Thanks,
Prathamesh


Re: [RFC/PATCH v3] ira: Support more matching constraint forms with param [PR100328]

2021-06-30 Thread Vladimir Makarov via Gcc-patches



On 2021-06-28 2:26 a.m., Kewen.Lin wrote:

Hi!

on 2021/6/9 下午1:18, Kewen.Lin via Gcc-patches wrote:

Hi,

PR100328 has some details about this issue; I will try to
summarize it here.  In the hottest function LBM_performStreamCollideTRT
of SPEC2017 bmk 519.lbm_r, there are many FMA style expressions
(27 FMA, 19 FMS, 11 FNMA).  On rs6000, this kind of FMA style
insn has two flavors: FLOAT_REG and VSX_REG.  The VSX_REG register
class has 64 registers, of which the first 32 make up the
whole FLOAT_REG.  There are some differences between these two
flavors, taking "*fma4_fpr" as an example:

(define_insn "*fma4_fpr"
   [(set (match_operand:SFDF 0 "gpc_reg_operand" "=,wa,wa")
(fma:SFDF
  (match_operand:SFDF 1 "gpc_reg_operand" "%,wa,wa")
  (match_operand:SFDF 2 "gpc_reg_operand" ",wa,0")
  (match_operand:SFDF 3 "gpc_reg_operand" ",0,wa")))]

// wa => A VSX register (VSR), vs0…vs63, aka. VSX_REG.
//  (f/d) => A floating point register, aka. FLOAT_REG.

So for VSX_REG we only have the destructive form: when the VSX_REG
alternative is used, operand 2 or operand 3 is required
to be the same as operand 0.  reload has to take care of this
constraint and create some non-free register copies if required.
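
To see which operands are tied in which alternative, the constraint strings can be picked apart with a few lines of Python. This is just an illustrative sketch: matching_inputs is a hypothetical helper, not part of IRA. The strings are copied verbatim from the pattern as quoted above, including the first alternative whose constraint letters did not survive in this mail.

```python
def matching_inputs(constraints, out_op=0):
    """Map each alternative to the input operands tied to operand OUT_OP.

    CONSTRAINTS holds one comma-separated constraint string per operand;
    a digit as a constraint means "must match that operand".
    """
    alts = [c.split(',') for c in constraints]
    n_alts = len(alts[0])
    result = {}
    for alt in range(n_alts):
        tied = [op for op, per_alt in enumerate(alts)
                if op != out_op and per_alt[alt].lstrip('%') == str(out_op)]
        if tied:
            result[alt] = tied
    return result


# Operand constraints of "*fma4_fpr" exactly as quoted in this mail.
fma4_fpr = ["=,wa,wa", "%,wa,wa", ",wa,0", ",0,wa"]
tied = matching_inputs(fma4_fpr)
```

For this pattern the result says that operand 3 must match operand 0 in the second alternative and operand 2 must match operand 0 in the third, i.e. both VSX_REG alternatives are destructive.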

Assuming one fma insn looks like:
   op0 = FMA (op1, op2, op3)

The best regclass for all of them is VSX_REG.  When op1, op2 and op3 are
all dead, IRA simply creates three shuffle copies for them (here the operand
order matters, since with the same freq, the one with the smaller number
takes preference), but IMO both op2 and op3 should take higher priority
in the copy queue due to the matching constraint.

I noticed that there is one function, ira_get_dup_out_num, which is meant
to create this kind of constraint copy, but the code below appears to
refuse to create one if there is an alternative which has a valid regclass
without the need for spilling.

   default:
{
  enum constraint_num cn = lookup_constraint (str);
  enum reg_class cl = reg_class_for_constraint (cn);
  if (cl != NO_REGS
  && !targetm.class_likely_spilled_p (cl))
goto fail;

 ...

I cooked up the attached patch to make IRA respect this kind of matching
constraint, guarded with one parameter.  As I stated in the PR, I was
not sure this is on the right track.  The RFC patch checks the
matching constraint in all alternatives: if there is one alternative
with a matching constraint that matches the current preferred regclass
(or best of allocno?), it records the output operand number and
further creates one constraint copy for it.  Normally this copy gets
priority over shuffle copies, so the matching constraint is more likely
to be satisfied and reload doesn't have to create extra copies
to meet the matching constraint or the desirable register class.

For FMA A,B,C,D, I think ideally copies A/B, A/C, A/D can firstly stay
as shuffle copies, and later any of A,B,C,D gets assigned by one
hardware register which is a VSX register (VSX_REG) but not a FP
register (FLOAT_REG), which means it has to pay costs once we can NOT
go with VSX alternatives, so at that time it's important to respect
the matching constraint then we can increase the freq for the remaining
copies related to this (A/B, A/C, A/D).  This idea requires some side
tables to record some information and seems a bit complicated in the
current framework, so the proposed patch aggressively emphasizes the
matching constraint at the time of creating copies.


Compared with the original patch (v1), this patch v3 considers the
following (this should be v2 for this mailing list, but it is bumped
to be consistent with the PR's):

   - Excluding the case where for one preferred register class
 there can be two or more alternatives, one of them has the
 matching constraint, while another doesn't have.  So for
 the given operand, even if it's assigned by a hardware reg
 which doesn't meet the matching constraint, it can simply
 use the alternative which doesn't have matching constraint
 so no register move is needed.  One typical case is
 define_insn *mov_internal2 on rs6000.  So we
 shouldn't create constraint copy for it.

   - The possible free register move in the same register class,
 disable this if so since the register move to meet the
 constraint is considered as free.

   - Making it on by default, suggested by Segher & Vladimir, we
 hope to get rid of the parameter if the benchmarking result
 looks good on major targets.

   - Tweaking cost when either of matching constraint two sides
 is hardware register.  Before this patch, the constraint
 copy is simply taken as a real move insn for pref and
 conflict cost with one hardware register, after this patch,
 it's allowed that there are several input operands
 respecting the same matching constraint (but in different
 alternatives), so we should take it to be like shuffle copy
 for some cases to avoid over preferring/disparaging

[PING][PATCH 2/4] remove %G and %K from calls in front end and middle end (PR 98512)

2021-06-30 Thread Martin Sebor via Gcc-patches

Ping.  Attached is the same patch rebased on top of the latest trunk.

As discussed in the review of Aldy's recent changes to the backwards
threader, he has run into the same bug the patch fixes.  Getting this
patch set reviewed and approved would be helpful in keeping him from
having to work around the bug.

https://gcc.gnu.org/pipermail/gcc/2021-June/236608.html

On 6/10/21 5:27 PM, Martin Sebor wrote:

This diff removes the uses of %G and %K from all warning_at() calls
throughout GCC front end and middle end.  The inlining context is
included in diagnostic output whenever it's present.


Improve warning suppression for inlined functions.

Resolves:
PR middle-end/98871 - Cannot silence -Wmaybe-uninitialized at declaration site
PR middle-end/98512 - #pragma GCC diagnostic ignored ineffective in conjunction with alias attribute

gcc/ChangeLog:

	* builtins.c (warn_string_no_nul): Remove %G.
	(maybe_warn_for_bound): Same.
	(warn_for_access): Same.
	(check_access): Same.
	(check_strncat_sizes): Same.
	(expand_builtin_strncat): Same.
	(expand_builtin_strncmp): Same.
	(expand_builtin): Same.
	(expand_builtin_object_size): Same.
	(warn_dealloc_offset): Same.
	(maybe_emit_free_warning): Same.
	* calls.c (maybe_warn_alloc_args_overflow): Same.
	(maybe_warn_nonstring_arg): Same.
	(maybe_warn_rdwr_sizes): Same.
	* expr.c (expand_expr_real_1): Remove %K.
	* gimple-fold.c (gimple_fold_builtin_strncpy): Remove %G.
	(gimple_fold_builtin_strncat): Same.
	* gimple-ssa-sprintf.c (format_directive): Same.
	(handle_printf_call): Same.
	* gimple-ssa-warn-alloca.c (pass_walloca::execute): Same.
	* gimple-ssa-warn-restrict.c (maybe_diag_overlap): Same.
	(maybe_diag_access_bounds): Same.  Call gimple_location.
	(check_bounds_or_overlap): Same.
	* trans-mem.c (ipa_tm_scan_irr_block): Remove %K.  Simplify.
	* tree-ssa-ccp.c (pass_post_ipa_warn::execute): Remove %G.
	* tree-ssa-strlen.c (maybe_warn_overflow): Same.
	(maybe_diag_stxncpy_trunc): Same.
	(handle_builtin_stxncpy_strncat): Same.
	(maybe_warn_pointless_strcmp): Same.
	* tree-ssa-uninit.c (maybe_warn_operand): Same.

gcc/testsuite/ChangeLog:

	* gcc.dg/Wfree-nonheap-object-4.c: Tighten up.
	* gcc.dg/Wobjsize-1.c: Prune expected output.
	* gcc.dg/Warray-bounds-71.c: New test.
	* gcc.dg/Warray-bounds-71.h: New test.
	* gcc.dg/Warray-bounds-72.c: New test.
	* gcc.dg/Warray-bounds-73.c: New test.
	* gcc.dg/Warray-bounds-74.c: New test.
	* gcc.dg/Warray-bounds-75.c: New test.
	* gcc.dg/Wfree-nonheap-object-5.c: New test.
	* gcc.dg/pragma-diag-10.c: New test.
	* gcc.dg/pragma-diag-9.c: New test.
	* gcc.dg/uninit-suppress_3.c: New test.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index e5e39386a93..e59fa322729 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -1126,30 +1126,30 @@ warn_string_no_nul (location_t loc, tree expr, const char *fname,
 	{
 	  if (wi::ltu_p (maxsiz, bndrng[0]))
 	warned = warning_at (loc, opt,
- "%K%qD specified bound %s exceeds "
+ "%qD specified bound %s exceeds "
  "maximum object size %E",
- expr, func, bndstr, maxobjsize);
+ func, bndstr, maxobjsize);
 	  else
 	{
 	  bool maybe = wi::to_wide (size) == bndrng[0];
 	  warned = warning_at (loc, opt,
    exact
-   ? G_("%K%qD specified bound %s exceeds "
+   ? G_("%qD specified bound %s exceeds "
 	"the size %E of unterminated array")
    : (maybe
-  ? G_("%K%qD specified bound %s may "
+  ? G_("%qD specified bound %s may "
 	   "exceed the size of at most %E "
 	   "of unterminated array")
-  : G_("%K%qD specified bound %s exceeds "
+  : G_("%qD specified bound %s exceeds "
 	   "the size of at most %E "
 	   "of unterminated array")),
-   expr, func, bndstr, size);
+   func, bndstr, size);
 	}
 	}
   else
 	warned = warning_at (loc, opt,
-			 "%K%qD argument missing terminating nul",
-			 expr, func);
+			 "%qD argument missing terminating nul",
+			 func);
 }
   else
 {
@@ -3969,35 +3969,34 @@ maybe_warn_for_bound (opt_code opt, location_t loc, tree exp, tree func,
 	warned = (func
 		  ? warning_at (loc, opt,
 (maybe
- ? G_("%K%qD specified bound %E may "
+ ? G_("%qD specified bound %E may "
 	  "exceed maximum object size %E")
- : G_("%K%qD specified bound %E "
+ : G_("%qD specified bound %E "
 	  "exceeds maximum object size %E")),
-exp, func, bndrng[0], maxobjsize)
+func, bndrng[0], maxobjsize)
 		  : warning_at (loc, opt,
 (maybe
- ? G_("%Kspecified bound %E may "
+ ? G_("specified bound %E may "
 	  "exceed maximum object size %E")
- : G_("%Kspecified bound %E "
+ : G_("specified bound %E "
 	  "exceeds maximum object size %E")),
-exp, bndrng[0], maxobjsize));
+bndrng[0], maxobjsize));
 	  else
 	warned = (func
 		  ? warning_at (loc, opt,
 (maybe
- ? G_("%K%qD specified bound [%E, %E] ma

Re: [RFC/PATCH v3] ira: Support more matching constraint forms with param [PR100328]

2021-06-30 Thread Richard Sandiford via Gcc-patches
"Kewen.Lin"  writes:
> on 2021/6/28 3:20 PM, Hongtao Liu wrote:
>> On Mon, Jun 28, 2021 at 3:12 PM Hongtao Liu  wrote:
>>>
>>> On Mon, Jun 28, 2021 at 2:50 PM Kewen.Lin  wrote:

 Hi!

 on 2021/6/9 1:18 PM, Kewen.Lin via Gcc-patches wrote:
> Hi,
>
> PR100328 has some details about this issue; I will try to
> summarize it here.  In the hottest function LBM_performStreamCollideTRT
> of SPEC2017 bmk 519.lbm_r, there are many FMA style expressions
> (27 FMA, 19 FMS, 11 FNMA).  On rs6000, this kind of FMA style
> insn has two flavors: FLOAT_REG and VSX_REG.  The VSX_REG reg
> class has 64 registers, the first 32 of which make up the
> whole FLOAT_REG class.  There are some differences between these two
> flavors, taking "*fma4_fpr" as an example:
>
> (define_insn "*fma4_fpr"
>   [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,wa,wa")
>   (fma:SFDF
> (match_operand:SFDF 1 "gpc_reg_operand" "%<Ff>,wa,wa")
> (match_operand:SFDF 2 "gpc_reg_operand" "<Ff>,wa,0")
> (match_operand:SFDF 3 "gpc_reg_operand" "<Ff>,0,wa")))]
>
> // wa => A VSX register (VSR), vs0…vs63, aka. VSX_REG.
> // <Ff> (f/d) => A floating point register, aka. FLOAT_REG.
>
> So for VSX_REG, we only have the destructive form: when a VSX_REG
> alternative is used, operand 2 or operand 3 is required
> to be the same as operand 0.  reload has to take care of this
> constraint and create some non-free register copies if required.
>
> Assuming one fma insn looks like:
>   op0 = FMA (op1, op2, op3)
>
> The best regclass for all of them is VSX_REG.  When op1, op2 and op3 are
> all dead, IRA simply creates three shuffle copies for them (here the
> operand order matters, since with the same freq, the one with the smaller
> number takes preference), but IMO both op2 and op3 should take higher
> priority in the copy queue due to the matching constraint.
>
> I noticed that there is one function, ira_get_dup_out_num, which is meant
> to create this kind of constraint copy, but the code below appears to
> refuse to create one if there is an alternative which has a valid regclass
> that is not likely to be spilled.
>
>   default:
>   {
> enum constraint_num cn = lookup_constraint (str);
> enum reg_class cl = reg_class_for_constraint (cn);
> if (cl != NO_REGS
> && !targetm.class_likely_spilled_p (cl))
>   goto fail;
>
>...
>
> I cooked one patch attached to make ira respect this kind of matching
> constraint guarded with one parameter.  As I stated in the PR, I was
> not sure this is on the right track.  The RFC patch is to check the
> matching constraint in all alternatives, if there is one alternative
> with matching constraint and matches the current preferred regclass
> (or best of allocno?), it will record the output operand number and
> further create one constraint copy for it.  Normally it can get the
> priority against shuffle copies and the matching constraint will get
> satisfied with higher possibility, reload doesn't create extra copies
> to meet the matching constraint or the desirable register class when
> it has to.
>
> For FMA A,B,C,D, I think ideally copies A/B, A/C, A/D can firstly stay
> as shuffle copies, and later any of A,B,C,D gets assigned by one
> hardware register which is a VSX register (VSX_REG) but not a FP
> register (FLOAT_REG), which means it has to pay costs once we can NOT
> go with VSX alternatives, so at that time it's important to respect
> the matching constraint then we can increase the freq for the remaining
> copies related to this (A/B, A/C, A/D).  This idea requires some side
> tables to record some information and seems a bit complicated in the
> current framework, so the proposed patch aggressively emphasizes the
> matching constraint at the time of creating copies.
>

 Compared with the original patch (v1), this patch v3 has
 considered the following (this should be v2 for this mailing list,
 but I bumped it to be consistent with the PR's numbering):

   - Excluding the case where for one preferred register class
 there can be two or more alternatives, one of them has the
 matching constraint, while another doesn't have.  So for
 the given operand, even if it's assigned by a hardware reg
 which doesn't meet the matching constraint, it can simply
 use the alternative which doesn't have matching constraint
 so no register move is needed.  One typical case is
 define_insn *mov_internal2 on rs6000.  So we
 shouldn't create constraint copy for it.

   - The possible free register move in the same register class,
 disable this if so since the register move to meet the
 constraint is considered as free.

   - Making it on by de

Re: [PATCH] Port GCC documentation to Sphinx

2021-06-30 Thread Eli Zaretskii via Gcc-patches
> Cc: jos...@codesourcery.com, g...@gcc.gnu.org, gcc-patches@gcc.gnu.org
> From: Martin Liška 
> Date: Wed, 30 Jun 2021 16:04:32 +0200
> 
> > Thanks, but does that mean @var will no longer stand out in the
> > produced Info format?  That'd be sub-optimal, I think, because a clear
> > reference to a meta-syntactic variable will be lost.
> 
> Yes. An alternative approach for:
> Show :samp:`Samp with a {variable}.`
> 
> can be using @var{variable}, resulting with the following info output:
> Show ‘Samp with a VARIABLE.’
> 
> Does it seem reasonable?

Yes, the latter sounds reasonable.

Thanks.


Re: HELP!! How to inhibit optimizations applied to .DEFERRED_INIT argument?

2021-06-30 Thread Qing Zhao via Gcc-patches
I came up with a very simple test case that reproduces the same issue:

[qinzhao@localhost gcc]$ cat t.c
extern void bar (int);
void foo (int a)
{
  int i;
  for (i = 0; i < a; i++) {
if (__extension__({int size2; 
size2 = 4;
size2 > 5;}))
bar (a);
  }
}

[qinzhao@localhost gcc]$ /home/qinzhao/Work/GCC/gcc_build_debug/gcc/xgcc 
-B/home/qinzhao/Work/GCC/gcc_build_debug/gcc/ -std=c99   -m64  -march=native 
-ftrivial-auto-var-init=zero t.c -fdump-tree-all  -O1
t.c: In function ‘foo’:
t.c:11:1: error: ‘DEFFERED_INIT’ calls should have the same LHS as the first 
argument
   11 | }
  | ^
size2_12 = .DEFERRED_INIT (4, 2);
during GIMPLE pass: ccp
dump file: a-t.c.032t.ccp1
t.c:11:1: internal compiler error: verify_gimple failed
0x143ee47 verify_gimple_in_cfg(function*, bool)
../../latest_gcc/gcc/tree-cfg.c:5501
0x122d799 execute_function_todo
../../latest_gcc/gcc/passes.c:2042
0x122c74b do_per_function
../../latest_gcc/gcc/passes.c:1687
0x122d986 execute_todo
../../latest_gcc/gcc/passes.c:2096
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.

In this test case, both “i” and “size2” are auto vars that are not 
initialized at declaration but are initialized later by assignment.
However, “i” does not trigger the issue, while “size2” does. 

**“ssa” dump:

   :
  i_7 = .DEFERRED_INIT (i_6(D), 2);
  i_8 = 0;
  goto ; [INV]

   :
  size2_12 = .DEFERRED_INIT (size2_3, 2);
  size2_13 = 4;

**”ccp1” dump:

   :
  i_7 = .DEFERRED_INIT (i_6(D), 2);
  goto ; [INV]

   :
  size2_12 = .DEFERRED_INIT (4, 2);

So, I am wondering: is it possible that the “ssa” pass has a bug?

Qing

> On Jun 30, 2021, at 9:39 AM, Richard Biener  wrote:
> 
> On Wed, 30 Jun 2021, Qing Zhao wrote:
> 
>> 
>> 
>> On Jun 30, 2021, at 2:46 AM, Richard Biener wrote:
>> 
>> On Wed, 30 Jun 2021, Qing Zhao wrote:
>> 
>> Hi,
>> 
>> I am testing the 4th patch of -ftrivial-auto-var-init with CPU2017 today, 
>> and found the following issues:
>> 
>> In the dump file of “*t.i.031t.objsz1”, we have:
>> 
>>  :
>> __s1_len_217 = .DEFERRED_INIT (__s1_len_176, 2);
>> __s2_len_218 = .DEFERRED_INIT (__s2_len_177, 2);
>> 
>> It looks like this .DEFERRED_INIT initializes an already initialized
>> variable.
>> 
>> Yes.
>> 
>> For cases like the following:
>> 
>> int s2_len;
>> s2_len = 4;
>> 
>> i.e., the initialization is not at the declaration.
>> 
>> We cannot avoid initialization for such cases.
> 
> But I'd have expected
> 
>  s2_len = .DEFERRED_INIT (s2_len, 0);
>  s2_len = 4;
> 
> from the above - thus the deferred init _before_ the first
> "use" which is the explicit init.  How does the other order
> happen to materialize?  As said, I believe it shouldn't.
> 
>> I'd expect to only ever see default definition SSA names
>> as first argument to .DEFERRED_INIT.
>> 
>> You mean something like:
>> __s2_len_218 = .DEFERRED_INIT (__s2_len, 2);
> 
> No,
> 
> __s2_len_218 = .DEFERRED_INIT (__s2_len_217(D), 2);
> 
>> ?
>> 
>> 
>> __s2_len_219 = 7;
>> if (__s2_len_219 <= 3)
>>   goto ; [INV]
>> else
>>   goto ; [INV]
>> 
>>  :
>> _1 = (long unsigned int) i_175;
>> 
>> 
>> However, after “ccp”, in “t.i.032t.ccp1”, we have:
>> 
>>  :
>> __s1_len_217 = .DEFERRED_INIT (__s1_len_176, 2);
>> __s2_len_218 = .DEFERRED_INIT (7, 2);
>> _36 = (long unsigned int) i_175;
>> _37 = _36 * 8;
>> _38 = argv_220(D) + _37;
>> 
>> 
>> Looks like that the optimization “ccp” replaced the first argument of the 
>> call .DEFERRED_INIT with the constant 7.
>> This should be avoided.
>> 
>> (NOTE, this issue existed in the previous patches, however, only exposed 
>> with this version since I added more verification
>> code in tree-cfg.c to verify the call to .DEFERRED_INIT).
>> 
>> I am wondering what’s the best solution to this problem?
>> 
>> I think you have to trace where this "bogus" .DEFERRED_INIT comes from
>> originally.  Or alternatively, if this is unavoidable,
>> 
>> This is unavoidable, I believe.
> 
> I see but don't believe it yet ;)
> 
>> add "constant
>> folding" of .DEFERRED_INIT so that deferred init of an initialized
>> object becomes the object itself, thus retain the previous - eventually
>> partial - initialization only.
>> 
>> If this additional .DEFERRED_INIT will be kept till RTL expansion phase, 
>> then it will become a real initialization:
>> 
>> i.e.
>> 
>> s2_len = 0;//.DEFERRED_INIT expanded
>> s2_len = 4;// the original initialization
>> 
>> Then the first initialization will be eliminated by current RTL optimization 
>> easily, right?
> 
> Well, in your example above it's effectively eliminated by GIMPLE 
> optimization.  IIRC you're using the first argument of .DEFERRED_INIT
> for diagnostic purposes only, correct?
> 
> Richard.
> 
>> Qing
>> 
>> 
>> Richard.
>> 
>> Can we add any attribute to the internal function argu

Re: [PATCH] c++: CTAD within alias template [PR91911]

2021-06-30 Thread Patrick Palka via Gcc-patches
On Fri, 25 Jun 2021, Jason Merrill wrote:

> On 6/25/21 1:11 PM, Patrick Palka wrote:
> > On Fri, 25 Jun 2021, Jason Merrill wrote:
> > 
> > > On 6/24/21 4:45 PM, Patrick Palka wrote:
> > > > In the first testcase below, during parsing of the alias template
> > > > ConstSpanType, transparency of alias template specializations means we
> > > > replace SpanType with SpanType's substituted definition.  But this
> > > > substitution lowers the level of the CTAD placeholder for span(T()) from
> > > > 2 to 1, and so the later instantiation of ConstSpanType
> > > > erroneously substitutes this CTAD placeholder with the template argument
> > > > at level 1 index 0, i.e. with int, before we get a chance to perform the
> > > > CTAD.
> > > > 
> > > > In light of this, it seems we should avoid level lowering when
> > > > substituting through the type-id of a dependent alias template
> > > > specialization.  To that end this patch makes lookup_template_class_1
> > > > pass tf_partial to tsubst in this situation.
> > > 
> > > This makes sense, but what happens if SpanType is a member template, so
> > > that
> > > the levels of it and ConstSpanType don't match?  Or the other way around?
> > 
> > If SpanType is a member template of say the class template A (and
> > thus its level is greater than ConstSpanType):
> > 
> >template
> >struct A {
> >  template
> >  using SpanType = decltype(span(T()));
> >};
> > 
> >template
> >using ConstSpanType = span > A::SpanType::value_type>;
> > 
> >using type = ConstSpanType;
> > 
> > then this case luckily works even without the patch because
> > instantiate_class_template now reuses the specialization A::SpanType
> > that was formed earlier during instantiation of A, where we
> > substitute only a single level of template arguments, so the level of
> > the CTAD placeholder inside the defining-type-id of this specialization
> > dropped from 3 to 2, so still more than the level of ConstSpanType.
> > 
> > This luck is short-lived though, because if we replace
> > A::SpanType with say A::SpanType then the testcase
> > breaks again (without the patch) because we no longer can reuse that
> > specialization, so we instead form it on the spot by substituting two
> > levels of template arguments (U=int,T=T) into the defining-type-id,
> > causing the level of the placeholder to drop to 1.  I think the patch
> > causes its level to remain 3 (though I guess it should really be 2).
> > 
> > 
> > For the other way around, if ConstSpanType is a member template of
> > say the class template B (and thus its level is greater than
> > SpanType):
> > 
> >template
> >using SpanType = decltype(span(T()));
> > 
> >template
> >struct B {
> >  template
> >  using ConstSpanType = span::value_type>;
> >};
> > 
> >using type = B::ConstSpanType;
> > 
> > then tf_partial doesn't help here at all; we end up substituting 'int'
> > for the CTAD placeholder...  What it seems we need is to _increase_ the
> > level of the CTAD placeholder from 2 to 3 during the dependent
> > substitution..
> > 
> > Hmm, rather than messing with tf_partial, which is apparently only a
> > partial solution, maybe we should just make tsubst never substitute a
> > CTAD placeholder -- they should always be resolved from do_class_deduction,
> > and their level doesn't really matter otherwise.  (But we'd still want
> > to substitute into the CLASS_PLACEHOLDER_TEMPLATE of the placeholder in
> > case it's a template template parm.)  Something like:
> > 
> > diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
> > index 5107bfbf9d1..dead651ed84 100644
> > --- a/gcc/cp/pt.c
> > +++ b/gcc/cp/pt.c
> > @@ -15552,7 +15550,8 @@ tsubst (tree t, tree args, tsubst_flags_t complain,
> > tree in_decl)
> > levels = TMPL_ARGS_DEPTH (args);
> > if (level <= levels
> > -   && TREE_VEC_LENGTH (TMPL_ARGS_LEVEL (args, level)) > 0)
> > +   && TREE_VEC_LENGTH (TMPL_ARGS_LEVEL (args, level)) > 0
> > +   && !template_placeholder_p (t))
> >   {
> > arg = TMPL_ARG (args, level, idx);
> >   
> > seems to work better.
> 
> Makes sense.

Here's a patch that implements that.  I reckon it's good to have both
workarounds in place because the tf_partial workaround is necessary to
accept class-deduction93a.C below, and the tsubst workaround is
necessary to accept class-deduction-92b.C below.

-- >8 --

Subject: [PATCH] c++: CTAD within alias template [PR91911]

In the first testcase below, during parsing of the alias template
ConstSpanType, transparency of alias template specializations means we
replace SpanType with SpanType's substituted definition.  But this
substitution lowers the level of the CTAD placeholder for span{T()} from
2 to 1, and so the later instantiation of ConstSpanType erroneously
substitutes this CTAD placeholder with the template argument at level 1
index 0, i.e. with int, before we get a chance to perform the CTAD.

In light of this, it seems we should avoid 

Re: [PATCH] c++: CTAD within alias template [PR91911]

2021-06-30 Thread Patrick Palka via Gcc-patches
On Wed, 30 Jun 2021, Patrick Palka wrote:

> On Fri, 25 Jun 2021, Jason Merrill wrote:
> 
> > On 6/25/21 1:11 PM, Patrick Palka wrote:
> > > On Fri, 25 Jun 2021, Jason Merrill wrote:
> > > 
> > > > On 6/24/21 4:45 PM, Patrick Palka wrote:
> > > > > In the first testcase below, during parsing of the alias template
> > > > > ConstSpanType, transparency of alias template specializations means we
> > > > > replace SpanType with SpanType's substituted definition.  But this
> > > > > substitution lowers the level of the CTAD placeholder for span(T()) 
> > > > > from
> > > > > 2 to 1, and so the later instantiation of ConstSpanType
> > > > > erroneously substitutes this CTAD placeholder with the template 
> > > > > argument
> > > > > at level 1 index 0, i.e. with int, before we get a chance to perform 
> > > > > the
> > > > > CTAD.
> > > > > 
> > > > > In light of this, it seems we should avoid level lowering when
> > > > > substituting through the type-id of a dependent alias template
> > > > > specialization.  To that end this patch makes lookup_template_class_1
> > > > > pass tf_partial to tsubst in this situation.
> > > > 
> > > > This makes sense, but what happens if SpanType is a member template, so
> > > > that
> > > > the levels of it and ConstSpanType don't match?  Or the other way 
> > > > around?
> > > 
> > > If SpanType is a member template of say the class template A (and
> > > thus its level is greater than ConstSpanType):
> > > 
> > >template
> > >struct A {
> > >  template
> > >  using SpanType = decltype(span(T()));
> > >};
> > > 
> > >template
> > >using ConstSpanType = span > > A::SpanType::value_type>;
> > > 
> > >using type = ConstSpanType;
> > > 
> > > then this case luckily works even without the patch because
> > > instantiate_class_template now reuses the specialization 
> > > A::SpanType
> > > that was formed earlier during instantiation of A, where we
> > > substitute only a single level of template arguments, so the level of
> > > the CTAD placeholder inside the defining-type-id of this specialization
> > > dropped from 3 to 2, so still more than the level of ConstSpanType.
> > > 
> > > This luck is short-lived though, because if we replace
> > > A::SpanType with say A::SpanType then the testcase
> > > breaks again (without the patch) because we no longer can reuse that
> > > specialization, so we instead form it on the spot by substituting two
> > > levels of template arguments (U=int,T=T) into the defining-type-id,
> > > causing the level of the placeholder to drop to 1.  I think the patch
> > > causes its level to remain 3 (though I guess it should really be 2).
> > > 
> > > 
> > > For the other way around, if ConstSpanType is a member template of
> > > say the class template B (and thus its level is greater than
> > > SpanType):
> > > 
> > >template
> > >using SpanType = decltype(span(T()));
> > > 
> > >template
> > >struct B {
> > >  template
> > >  using ConstSpanType = span::value_type>;
> > >};
> > > 
> > >using type = B::ConstSpanType;
> > > 
> > > then tf_partial doesn't help here at all; we end up substituting 'int'
> > > for the CTAD placeholder...  What it seems we need is to _increase_ the
> > > level of the CTAD placeholder from 2 to 3 during the dependent
> > > substitution..
> > > 
> > > Hmm, rather than messing with tf_partial, which is apparently only a
> > > partial solution, maybe we should just make tsubst never substitute a
> > > CTAD placeholder -- they should always be resolved from 
> > > do_class_deduction,
> > > and their level doesn't really matter otherwise.  (But we'd still want
> > > to substitute into the CLASS_PLACEHOLDER_TEMPLATE of the placeholder in
> > > case it's a template template parm.)  Something like:
> > > 
> > > diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
> > > index 5107bfbf9d1..dead651ed84 100644
> > > --- a/gcc/cp/pt.c
> > > +++ b/gcc/cp/pt.c
> > > @@ -15552,7 +15550,8 @@ tsubst (tree t, tree args, tsubst_flags_t 
> > > complain,
> > > tree in_decl)
> > >   levels = TMPL_ARGS_DEPTH (args);
> > >   if (level <= levels
> > > - && TREE_VEC_LENGTH (TMPL_ARGS_LEVEL (args, level)) > 0)
> > > + && TREE_VEC_LENGTH (TMPL_ARGS_LEVEL (args, level)) > 0
> > > + && !template_placeholder_p (t))
> > > {
> > >   arg = TMPL_ARG (args, level, idx);
> > >   
> > > seems to work better.
> > 
> > Makes sense.
> 
> Here's a patch that implements that.  I reckon it's good to have both
> workarounds in place because the tf_partial workaround is necessary to
> accept class-deduction93a.C below, and the tsubst workaround is
> necessary to accept class-deduction-92b.C below.

Whoops, forgot to git-add class-deduction93a.C:

-- >8 --

Subject: [PATCH] c++: CTAD within alias template [PR91911]

In the first testcase below, during parsing of the alias template
ConstSpanType, transparency of alias template specializations means we
replace SpanType wi

RE: [PATCH][RFC]AArch64 SVE: Fix multiple comparison masks on inverted operands

2021-06-30 Thread Tamar Christina via Gcc-patches
> -Original Message-
> From: Richard Sandiford 
> Sent: Monday, June 14, 2021 4:55 PM
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
> ; Marcus Shawcroft
> ; Kyrylo Tkachov 
> Subject: Re: [PATCH][RFC]AArch64 SVE: Fix multiple comparison masks on
> inverted operands
> 
> Tamar Christina  writes:
> > Hi Richard,
> >> -Original Message-
> >> From: Richard Sandiford 
> >> Sent: Monday, June 14, 2021 3:50 PM
> >> To: Tamar Christina 
> >> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
> >> ; Marcus Shawcroft
> >> ; Kyrylo Tkachov
> 
> >> Subject: Re: [PATCH][RFC]AArch64 SVE: Fix multiple comparison masks
> >> on inverted operands
> >>
> >> Tamar Christina  writes:
> >> > Hi All,
> >> >
> >> > This RFC is trying to address the following inefficiency when
> >> > vectorizing conditional statements with SVE.
> >> >
> >> > Consider the case
> >> >
> >> > void f10(double * restrict z, double * restrict w, double * restrict x,
> >> >   double * restrict y, int n)
> >> > {
> >> > for (int i = 0; i < n; i++) {
> >> > z[i] = (w[i] > 0) ? x[i] + w[i] : y[i] - w[i];
> >> > }
> >> > }
> >> >
> >> >
> >> > For which we currently generate at -O3:
> >> >
> >> > f10:
> >> >         cmp     w4, 0
> >> >         ble     .L1
> >> >         mov     x5, 0
> >> >         whilelo p1.d, wzr, w4
> >> >         ptrue   p3.b, all
> >> > .L3:
> >> >         ld1d    z1.d, p1/z, [x1, x5, lsl 3]
> >> >         fcmgt   p2.d, p1/z, z1.d, #0.0
> >> >         fcmgt   p0.d, p3/z, z1.d, #0.0
> >> >         ld1d    z2.d, p2/z, [x2, x5, lsl 3]
> >> >         bic     p0.b, p3/z, p1.b, p0.b
> >> >         ld1d    z0.d, p0/z, [x3, x5, lsl 3]
> >> >         fsub    z0.d, p0/m, z0.d, z1.d
> >> >         movprfx z0.d, p2/m, z1.d
> >> >         fadd    z0.d, p2/m, z0.d, z2.d
> >> >         st1d    z0.d, p1, [x0, x5, lsl 3]
> >> >         incd    x5
> >> >         whilelo p1.d, w5, w4
> >> >         b.any   .L3
> >> > .L1:
> >> >         ret
> >> >
> >> > Notice that the condition for the else branch duplicates the same
> >> > predicate as the then branch and then uses BIC to negate the results.
> >> >
> >> > The reason for this is that during instruction generation in the
> >> > vectorizer we emit
> >> >
> >> >   mask__41.11_66 = vect__4.10_64 > vect_cst__65;
> >> >   vec_mask_and_69 = mask__41.11_66 & loop_mask_63;
> >> >   vec_mask_and_71 = mask__41.11_66 & loop_mask_63;
> >> >   mask__43.16_73 = ~mask__41.11_66;
> >> >   vec_mask_and_76 = mask__43.16_73 & loop_mask_63;
> >> >   vec_mask_and_78 = mask__43.16_73 & loop_mask_63;
> >> >
> >> > which ultimately gets optimized to
> >> >
> >> >   mask__41.11_66 = vect__4.10_64 > { 0.0, ... };
> >> >   vec_mask_and_69 = loop_mask_63 & mask__41.11_66;
> >> >   mask__43.16_73 = ~mask__41.11_66;
> >> >   vec_mask_and_76 = loop_mask_63 & mask__43.16_73;
> >> >
> >> > Notice how the negate is on the operation and not the predicate
> >> > resulting from the operation.  When this is expanded this turns
> >> > into RTL where the negate is on the compare directly.  This means
> >> > the RTL is different from the one without the negate and so CSE is
> >> > unable to
> >> recognize that they are essentially same operation.
> >> >
> >> > To fix this my patch changes it so you negate the mask rather than
> >> > the operation
> >> >
> >> >   mask__41.13_55 = vect__4.12_53 > { 0.0, ... };
> >> >   vec_mask_and_58 = loop_mask_52 & mask__41.13_55;
> >> >   vec_mask_op_67 = ~vec_mask_and_58;
> >> >   vec_mask_and_65 = loop_mask_52 & vec_mask_op_67;
> >>
> >> But to me this looks like a pessimisation in gimple terms.  We've
> >> increased the length of the critical path: vec_mask_and_65 now needs
> >> a chain of
> >> 4 operations instead of 3.
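
The two ways of forming the else-branch mask discussed above select the same lanes; they differ only in the length of the dependency chain (compare, NOT, AND versus compare, AND, NOT, AND). A small C++ sketch, assuming 4-lane predicates modelled as the low bits of an `unsigned`, checks the equivalence at compile time. The names `else_mask_original`, `else_mask_rfc` and `forms_agree` are hypothetical and this is not vectorizer code:

```cpp
// Four predicate lanes, stored in the low four bits of an unsigned.
constexpr unsigned kLanes = 0xFu;

// Current form: invert the comparison result, then AND with the loop mask.
//   mask_43 = ~mask_41;  vec_mask_and_76 = loop_mask & mask_43;
constexpr unsigned else_mask_original(unsigned cmp, unsigned loop) {
  return loop & (~cmp & kLanes);
}

// RFC form: invert the already loop-masked "then" predicate, then AND again.
//   vec_mask_op = ~(loop_mask & mask_41);  vec_mask_and_65 = loop_mask & vec_mask_op;
constexpr unsigned else_mask_rfc(unsigned cmp, unsigned loop) {
  return loop & (~(loop & cmp) & kLanes);
}

// Exhaustively compare both forms over every 4-lane comparison/loop mask pair.
constexpr bool forms_agree() {
  for (unsigned cmp = 0; cmp <= kLanes; ++cmp)
    for (unsigned loop = 0; loop <= kLanes; ++loop)
      if (else_mask_original(cmp, loop) != else_mask_rfc(cmp, loop))
        return false;
  return true;
}

static_assert(forms_agree(), "both forms must select the same lanes");
```

The identity behind this is `loop & ~(loop & cmp) == loop & (~loop | ~cmp) == loop & ~cmp`, so the choice between the forms is about what CSE and instruction selection can match, not about the value computed.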
> >
> > True, but it should reduce the number of RTL patterns.  I would have
> > thought RTL is more expensive to handle than gimple.
> 
> I think this is only a fair gimple optimisation if gimple does the isel
> itself (to a predicated compare and a predicated NOT).
> 
> >> We also need to be careful not to pessimise the case in which the
> >> comparison is an integer one.  At the moment we'll generate opposed
> >> conditions, which is the intended behaviour:
> >
> > This still happens with this patch at `-Ofast` because that flips the
> > conditions, so the different representation doesn't harm it.
> 
> OK, that's good.
> 
> >>
> >> .L3:
> >>         ld1d    z1.d, p0/z, [x1, x5, lsl 3]
> >>         cmpgt   p2.d, p0/z, z1.d, #0
> >>         movprfx z2, z1
> >>         scvtf   z2.d, p3/m, z1.d
> >>         cmple   p1.d, p0/z, z1.d, #0
> >>         ld1d    z0.d, p2/z, [x2, x5, lsl 3]
> >>         ld1d    z1.d, p1/z, [x3, x5, lsl 3]
> >>         fadd    z0.d, p2/m, z0.d, z2.d
> >>         movprfx z0.d, p1/m, z1.d
> >>         fsub    z0.d, p1/m, z0.d, z2.d
> >>         st1d    z0.d, p0, [x0, x5, lsl 3]
> >>         add     x5, x5, x6
> >>         whilelo p0.d, w5, w4
> >>         b.any   .L3
> >>
> >> Could we handle the fcmp

[PING][PATCH 4/4] remove %G and %K support from pretty printer and -Wformat (PR 98512)

2021-06-30 Thread Martin Sebor via Gcc-patches

Ping: https://gcc.gnu.org/pipermail/gcc-patches/2021-June/572519.html

On 6/10/21 5:30 PM, Martin Sebor wrote:

This final diff removes the handlers for %G and %K from the pretty
printer and the support for the directives from c-format.c so that
using them will be diagnosed.




Re: [PATCH 1/2] c++: Fix push_access_scope and introduce RAII wrapper for it

2021-06-30 Thread Jason Merrill via Gcc-patches

On 6/30/21 11:03 AM, Patrick Palka wrote:

On Tue, 29 Jun 2021, Jason Merrill wrote:


On 6/29/21 1:57 PM, Patrick Palka wrote:

When push_access_scope is passed a TYPE_DECL for a class type (which
can happen during e.g. satisfaction), we undesirably push only the
enclosing context of the class instead of the class itself.  This causes
us to mishandle e.g. the testcase below due to us not entering the scope of
A before checking its constraints.

This patch adjusts push_access_scope accordingly, and introduces an
RAII wrapper for it.  We also make use of this wrapper right away by
replacing the only use of push_nested_class_guard with this new wrapper,
which means we can remove this old wrapper (whose functionality is
basically subsumed by the new wrapper).
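
The general shape of such an RAII wrapper can be sketched outside of GCC as follows. This is only an illustration under stated assumptions: `scope_stack`, `ScopeGuard` and `depth_inside` are hypothetical stand-ins, and a string plus a flag replaces the real `tree` argument and the check on the kind of declaration. The point it shows is that the constructor records whether it pushed, so the destructor pops only in that case:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the compiler's stack of entered scopes.
inline std::vector<std::string>& scope_stack() {
  static std::vector<std::string> stack;
  return stack;
}

// RAII guard: enters a scope on construction (if asked to) and leaves it
// on destruction, including on early return or exception.
class ScopeGuard {
 public:
  explicit ScopeGuard(const std::string& scope, bool enter = true)
      : active_(enter) {
    if (active_) scope_stack().push_back(scope);
  }
  ~ScopeGuard() {
    if (active_) {
      assert(!scope_stack().empty());
      scope_stack().pop_back();
    }
  }
  // Copying would pop the same scope twice, so forbid it.
  ScopeGuard(const ScopeGuard&) = delete;
  ScopeGuard& operator=(const ScopeGuard&) = delete;

 private:
  bool active_;  // true only if the constructor actually pushed
};

// Returns the scope depth observed while the guard is alive.
inline std::size_t depth_inside(bool enter) {
  ScopeGuard guard("A", enter);
  return scope_stack().size();
}
```

Calling `depth_inside(true)` observes one entered scope and leaves the stack empty afterwards; `depth_inside(false)` never touches the stack, which corresponds to the guard doing nothing for declarations it does not handle.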

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

gcc/cp/ChangeLog:

* constraint.cc (get_normalized_constraints_from_decl): Use
push_access_scope_guard instead of push_nested_class_guard.
* cp-tree.h (struct push_nested_class_guard): Replace with ...
(struct push_access_scope_guard): ... this.
* pt.c (push_access_scope): When the argument corresponds to
a class type, push the class instead of its context.
(pop_access_scope): Adjust accordingly.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-access2.C: New test.
---
   gcc/cp/constraint.cc  |  7 +-
   gcc/cp/cp-tree.h  | 23 +++
   gcc/cp/pt.c   |  9 +++-
   gcc/testsuite/g++.dg/cpp2a/concepts-access2.C | 13 +++
   4 files changed, 35 insertions(+), 17 deletions(-)
   create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-access2.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 6df3ca6ce32..99d3ccc6998 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -926,12 +926,7 @@ get_normalized_constraints_from_decl (tree d, bool diag
= false)
 tree norm = NULL_TREE;
 if (tree ci = get_constraints (decl))
   {
-  push_nested_class_guard pncs (DECL_CONTEXT (d));
-
-  temp_override ovr (current_function_decl);
-  if (TREE_CODE (decl) == FUNCTION_DECL)
-   current_function_decl = decl;
-
+  push_access_scope_guard pas (decl);
 norm = get_normalized_constraints_from_info (ci, tmpl, diag);
   }
   diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 6f713719589..58da7460001 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -8463,21 +8463,24 @@ is_constrained_auto (const_tree t)
 return is_auto (t) && PLACEHOLDER_TYPE_CONSTRAINTS_INFO (t);
   }
   -/* RAII class to push/pop class scope T; if T is not a class, do nothing.
*/
+/* RAII class to push/pop the access scope for T.  */
   -struct push_nested_class_guard
+struct push_access_scope_guard
   {
-  bool push;
-  push_nested_class_guard (tree t)
-: push (t && CLASS_TYPE_P (t))
+  tree decl;
+  push_access_scope_guard (tree t)
+: decl (t)
 {
-if (push)
-  push_nested_class (t);
+if (VAR_OR_FUNCTION_DECL_P (decl)
+   || TREE_CODE (decl) == TYPE_DECL)
+  push_access_scope (decl);
+else
+  decl = NULL_TREE;
 }
-  ~push_nested_class_guard ()
+  ~push_access_scope_guard ()
 {
-if (push)
-  pop_nested_class ();
+if (decl)
+  pop_access_scope (decl);
 }
   };
   diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index f2039e09cd7..bd8b17ca047 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -224,7 +224,7 @@ static void instantiate_body (tree pattern, tree args,
tree d, bool nested);
   /* Make the current scope suitable for access checking when we are
  processing T.  T can be FUNCTION_DECL for instantiated function
  template, VAR_DECL for static member variable, or TYPE_DECL for
-   alias template (needed by instantiate_decl).  */
+   for a class or alias template (needed by instantiate_decl).  */
 void
   push_access_scope (tree t)
@@ -234,6 +234,10 @@ push_access_scope (tree t)
   if (DECL_FRIEND_CONTEXT (t))
   push_nested_class (DECL_FRIEND_CONTEXT (t));
+  else if (TREE_CODE (t) == TYPE_DECL
+  && CLASS_TYPE_P (TREE_TYPE (t))
+  && DECL_ORIGINAL_TYPE (t) == NULL_TREE)


I suspect DECL_IMPLICIT_TYPEDEF_P is a better test for this case.


That works nicely.  How does the following look?  Bootstrapped and
regtested on x86_64-pc-linux-gnu.


OK.


-- >8 --

gcc/cp/ChangeLog:

* constraint.cc (get_normalized_constraints_from_decl): Use
push_access_scope_guard instead of push_nested_class_guard.
* cp-tree.h (struct push_nested_class_guard): Replace with ...
(struct push_access_scope_guard): ... this.
* pt.c (push_access_scope): When the argument corresponds to
a class type, push the class instead of its context.
(pop_access_scope): Adjust accordingly.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-access2.C: New test.
---
  gcc/

Re: [PATCH 2/3 V2] Fix IEEE 128-bit min/max test.

2021-06-30 Thread Michael Meissner via Gcc-patches
On Tue, Jun 29, 2021 at 07:06:14PM -0500, Segher Boessenkool wrote:
> On Thu, Jun 17, 2021 at 06:56:09PM -0400, Michael Meissner wrote:
> > The 'lp64' test
> > was needed because big endian 32-bit code cannot enable the IEEE 128-bit
> > floating point instructions.
> 
> No, *does not* enable them.  After
> 
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 2c249e186e1e..d4aac4164cfe 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -4281,7 +4281,7 @@ rs6000_option_override_internal (bool global_init_p)
>rs6000_isa_flags &= ~OPTION_MASK_FLOAT128_HW;
>  }
>  
> -  if (TARGET_FLOAT128_HW && !TARGET_64BIT)
> +  if (0&& TARGET_FLOAT128_HW && !TARGET_64BIT)
>  {
>if ((rs6000_isa_flags_explicit & OPTION_MASK_FLOAT128_HW) != 0)
> error ("%qs requires %qs", "%<-mfloat128-hardware%>", "-m64");
> 
> you can compile fine with -m32 if you add -mfloat128-hardware as well
> (it is disabled for BE as well, that should be fixed as well a few lines
> up from there).
> 
> Can you show any code that will not work please?  Not allowing QP float
> with -m32 causes many more problems than just allowing it would.

I will in a bit, but it isn't that simple.  Right now, all of the optimizations
for converting char/short/int -> __float128 (and presumably the other way)
don't work on 32-bit, because they want to split:

(parallel [(set (reg:KF )
(float:KF (reg:SI )))
   (clobber (reg:DI ))])

Into:

(set (reg:DI )
 (sign_extend:DI (reg:SI )))

(set (reg:KF )
 (float:KF (reg:DI )))

And the sign_extend SI->DI doesn't exist in 32-bit.


> > * gcc.target/powerpc/float128-minmax.c: Adjust expected code for
> > power10.
> > * lib/target-supports.exp (check_effective_target_has_arch_pwr10):
> > New target support.
> 
> Just "New." please.
> 
> >  /* { dg-require-effective-target powerpc_p9vector_ok } */
> 
> Please try whether you can lose that line as well.
> 
> Okay for trunk, and for 11 after the usual soak.  Thanks!
> 
> 
> Segher

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


Re: HELP!! How to inhibit optimizations applied to .DEFERRED_INIT argument?

2021-06-30 Thread Andrew Pinski via Gcc-patches
On Wed, Jun 30, 2021 at 8:47 AM Qing Zhao via Gcc-patches
 wrote:
>
> I came up with a very simple testing case that can repeat the same issue:
>
> [qinzhao@localhost gcc]$ cat t.c
> extern void bar (int);
> void foo (int a)
> {
>   int i;
>   for (i = 0; i < a; i++) {
> if (__extension__({int size2;
> size2 = 4;
> size2 > 5;}))
> bar (a);
>   }
> }

You should show the full dump,
What we have is the following:



size2_3 = PHI 
:

size2_12 = .DEFERRED_INIT (size2_3, 2);
size2_13 = 4;

So CCP propagates 4 into the PHI, decides that size2_1(D) is undefined,
therefore treats size2_3 as 4, and then propagates that value into
the .DEFERRED_INIT.
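
The merge step described above can be sketched in C as a small lattice "meet". This is only an illustration under stated assumptions: the three-level lattice and the names lattice_kind, lattice_val, lattice_meet and phi_of_undefined_and_four are hypothetical and simplified, not the actual code in tree-ssa-ccp.c.

```c
/* Hypothetical, simplified constant-propagation lattice.  */
enum lattice_kind { UNDEFINED, CONSTANT, VARYING };

struct lattice_val
{
  enum lattice_kind kind;
  long value;   /* meaningful only when kind == CONSTANT */
};

/* Meet of two PHI arguments: an undefined argument is optimistically
   ignored, two equal constants stay constant, anything else is varying.  */
static struct lattice_val
lattice_meet (struct lattice_val a, struct lattice_val b)
{
  struct lattice_val varying = { VARYING, 0 };

  if (a.kind == UNDEFINED)
    return b;
  if (b.kind == UNDEFINED)
    return a;
  if (a.kind == VARYING || b.kind == VARYING)
    return varying;
  return a.value == b.value ? a : varying;
}

/* Models a PHI whose arguments are an undefined default definition
   and the constant 4, as in the size2_3 case.  */
static struct lattice_val
phi_of_undefined_and_four (void)
{
  struct lattice_val undef = { UNDEFINED, 0 };
  struct lattice_val four = { CONSTANT, 4 };

  return lattice_meet (undef, four);
}
```

Under this model the PHI result is the constant 4, which is why every use of size2_3, including the first argument of .DEFERRED_INIT, becomes a candidate for replacement.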

Thanks,
Andrew


>
> [qinzhao@localhost gcc]$ /home/qinzhao/Work/GCC/gcc_build_debug/gcc/xgcc 
> -B/home/qinzhao/Work/GCC/gcc_build_debug/gcc/ -std=c99   -m64  -march=native 
> -ftrivial-auto-var-init=zero t.c -fdump-tree-all  -O1
> t.c: In function ‘foo’:
> t.c:11:1: error: ‘DEFFERED_INIT’ calls should have the same LHS as the first 
> argument
>11 | }
>   | ^
> size2_12 = .DEFERRED_INIT (4, 2);
> during GIMPLE pass: ccp
> dump file: a-t.c.032t.ccp1
> t.c:11:1: internal compiler error: verify_gimple failed
> 0x143ee47 verify_gimple_in_cfg(function*, bool)
> ../../latest_gcc/gcc/tree-cfg.c:5501
> 0x122d799 execute_function_todo
> ../../latest_gcc/gcc/passes.c:2042
> 0x122c74b do_per_function
> ../../latest_gcc/gcc/passes.c:1687
> 0x122d986 execute_todo
> ../../latest_gcc/gcc/passes.c:2096
> Please submit a full bug report,
> with preprocessed source if appropriate.
> Please include the complete backtrace with any bug report.
> See  for instructions.
>
> In this testing case, both “I” and “size2” are auto vars that are not 
> initialized at declaration but are initialized later by assignment.
> However, “I” doesn’t have any issue, but “size2” has such issue.
>
> **“ssa” dump:
>
>:
>   i_7 = .DEFERRED_INIT (i_6(D), 2);
>   i_8 = 0;
>   goto ; [INV]
>
>:
>   size2_12 = .DEFERRED_INIT (size2_3, 2);
>   size2_13 = 4;
>
> **”ccp1” dump:
>
>:
>   i_7 = .DEFERRED_INIT (i_6(D), 2);
>   goto ; [INV]
>
>:
>   size2_12 = .DEFERRED_INIT (4, 2);
>
> So, I am wondering:  Is it possible that “ssa” phase have a bug ?
>
> Qing
>
> > On Jun 30, 2021, at 9:39 AM, Richard Biener  wrote:
> >
> > On Wed, 30 Jun 2021, Qing Zhao wrote:
> >
> >>
> >>
> >> On Jun 30, 2021, at 2:46 AM, Richard Biener 
> >> mailto:rguent...@suse.de>> wrote:
> >>
> >> On Wed, 30 Jun 2021, Qing Zhao wrote:
> >>
> >> Hi,
> >>
> >> I am testing the 4th patch of -ftrivial-auto-var-init with CPU2017 today, 
> >> and found the following issues:
> >>
> >> In the dump file of “*t.i.031t.objsz1”, we have:
> >>
> >>  :
> >> __s1_len_217 = .DEFERRED_INIT (__s1_len_176, 2);
> >> __s2_len_218 = .DEFERRED_INIT (__s2_len_177, 2);
> >>
> >> I looks like this .DEFERRED_INIT initializes an already initialized
> >> variable.
> >>
> >> Yes.
> >>
> >> For cases like the following:
> >>
> >> int s2_len;
> >> s2_len = 4;
> >>
> >> i.e, the initialization is not at the declaration.
> >>
> >> We cannot avoid initialization for such cases.
> >
> > But I'd have expected
> >
> >  s2_len = .DEFERRED_INIT (s2_len, 0);
> >  s2_len = 4;
> >
> > from the above - thus the deferred init _before_ the first
> > "use" which is the explicit init.  How does the other order
> > happen to materialize?  As said, I believe it shouldn't.
> >
> >> I'd expect to only ever see default definition SSA names
> >> as first argument to .DEFERRED_INIT.
> >>
> >> You mean something like:
> >> __s2_len_218 = .DEFERRED_INIT (__s2_len, 2);
> >
> > No,
> >
> > __s2_len_218 = .DEFERRED_INIT (__s2_len_217(D), 2);
> >
> >> ?
> >>
> >>
> >> __s2_len_219 = 7;
> >> if (__s2_len_219 <= 3)
> >>   goto ; [INV]
> >> else
> >>   goto ; [INV]
> >>
> >>  :
> >> _1 = (long unsigned int) i_175;
> >>
> >>
> >> However, after “ccp”, in “t.i.032t.ccp1”, we have:
> >>
> >>  :
> >> __s1_len_217 = .DEFERRED_INIT (__s1_len_176, 2);
> >> __s2_len_218 = .DEFERRED_INIT (7, 2);
> >> _36 = (long unsigned int) i_175;
> >> _37 = _36 * 8;
> >> _38 = argv_220(D) + _37;
> >>
> >>
> >> Looks like that the optimization “ccp” replaced the first argument of the 
> >> call .DEFERRED_INIT with the constant 7.
> >> This should be avoided.
> >>
> >> (NOTE, this issue existed in the previous patches, however, only exposed 
> >> with this version since I added more verification
> >> code in tree-cfg.c to verify the call to .DEFERRED_INIT).
> >>
> >> I am wondering what’s the best solution to this problem?
> >>
> >> I think you have to trace where this "bogus" .DEFERRED_INIT comes from
> >> originally.  Or alternatively, if this is unavoidable,
> >>
> >> This is unavoidable, I believe.
> >
> > I see but don't believe it yet ;)
> >
> >> add "constant
> >> folding" of .DEFERRED_INIT so that defered init of an initialized
> >> object becomes the object itself, thus retain the previous - event

Re: [PATCH 1/3] Add IEEE 128-bit min/max support on PowerPC.

2021-06-30 Thread Segher Boessenkool
On Mon, Jun 28, 2021 at 03:00:02PM -0400, Michael Meissner wrote:
> On Wed, Jun 23, 2021 at 06:56:37PM -0500, Segher Boessenkool wrote:
> > > The problem area is a power10 running in
> > > big endian mode and running 32-bit code.  Because we don't have TImode, we
> > > can't enable the IEEE 128-bit hardware instructions.
> > 
> > I don't see why not?

And this is still true, and the core of the problem here.  Please show
any argument for this?

> > > > > +/* { dg-require-effective-target ppc_float128_hw } */
> > > > > +/* { dg-require-effective-target power10_ok } */
> > > > > +/* { dg-options "-mdejagnu-cpu=power10 -O2 -ffast-math" } */
> > > > 
> > > > In testcases we can assume that float128_hw is set whenever we have a
> > > > p10; we don't manually disable it to make life hard for ourselves ;-)
> > > 
> > > Again, I put it in case somebody builds a BE power10 compiler.
> > 
> > This should still be fixed.  And yes, people do test BE p10, of course.
> > And BE p10 *should* enable the QP float insns.  Does it not currently?
> 
> GCC does not enable __float128 by default on BE.

But it does enable _Float128, as it should, and it does work.

> The reason is there are no
> plans to enable all of the float128 support in glibc in BE.  Without a library,
> it is kind of useless to enable __float128.

This is fundamentally wrong.  GCC is a compiler.  It is used without
libraries often (some applications do not want the standard libraries
for a reason, some implement them themselves, some *are* the standard
libraries).  And you can have a lot of useful code without using libm.

> If the compiler enabled __float128, it breaks things that check if __float128
> is available.  They think __float128 is available, and then they fail when
> they can't do anything besides basic arithmetic.

So?  That would be the *correct* behaviour.

> Because the compiler is configured not to enable __float128 in a BE context, we
> don't build the __float128 emulator in libgcc.

Yes, another imperfection.

> In addition, BE GCC runs on things that do not have GLIBC (like AIX).  If we
> enabled it by default, it would break those environments.

How so?  Not anymore than you do now, you cannot use *any* QP float
with the status quo.

> A further complication is BE by default is still power4 or power5.  You need
> VSX support to even pass __float128 arguments.  While it is possible to pass
> __float128 in GPRs, you run into compatibility issues if one module is compiled
> with VSX and another is compiled without setting a base cpu, because one module
> will expect things in GPRs and the other in Altivec registers.

Yes, allowing QP float before p8 would solve a lot more problems, as I
have told you very often; but it is independent of this.  You need p9
to have the machine insns, and you can compile code that needs the
libgcc soft-float emulation functions for QP on power7 already (it needs
basic VSX only, and should be okay with only VMX even).

> And as I've said, the issue with 32-bit move is we don't have TImode support.
> Some of the machine independent passes want to use an appropriate integer type
> to move basic types.

So why does it work fine with double-double?

Please show an example.


Segher


Re: HELP!! How to inhibit optimizations applied to .DEFERRED_INIT argument?

2021-06-30 Thread Richard Biener
On June 30, 2021 7:20:18 PM GMT+02:00, Andrew Pinski  wrote:
>On Wed, Jun 30, 2021 at 8:47 AM Qing Zhao via Gcc-patches
> wrote:
>>
>> I came up with a very simple testing case that can repeat the same
>issue:
>>
>> [qinzhao@localhost gcc]$ cat t.c
>> extern void bar (int);
>> void foo (int a)
>> {
>>   int i;
>>   for (i = 0; i < a; i++) {
>> if (__extension__({int size2;
>> size2 = 4;
>> size2 > 5;}))
>> bar (a);
>>   }
>> }
>
>You should show the full dump,
>What we have is the following:
>
>
>
>size2_3 = PHI 
>:
>
>size2_12 = .DEFERRED_INIT (size2_3, 2);
>size2_13 = 4;
>
>So CCP decides to propagate 4 into the PHI and then decides size2_1(D)
>is undefined so size2_3 is then considered 4 and propagates it into
>the .DEFERRED_INIT.

Which means the .DEFERRED_INIT is inserted at the wrong place.

Richard. 
>
>Thanks,
>Andrew
>
>
>>
>> [qinzhao@localhost gcc]$
>/home/qinzhao/Work/GCC/gcc_build_debug/gcc/xgcc
>-B/home/qinzhao/Work/GCC/gcc_build_debug/gcc/ -std=c99   -m64 
>-march=native -ftrivial-auto-var-init=zero t.c -fdump-tree-all  -O1
>> t.c: In function ‘foo’:
>> t.c:11:1: error: ‘DEFFERED_INIT’ calls should have the same LHS as
>the first argument
>>11 | }
>>   | ^
>> size2_12 = .DEFERRED_INIT (4, 2);
>> during GIMPLE pass: ccp
>> dump file: a-t.c.032t.ccp1
>> t.c:11:1: internal compiler error: verify_gimple failed
>> 0x143ee47 verify_gimple_in_cfg(function*, bool)
>> ../../latest_gcc/gcc/tree-cfg.c:5501
>> 0x122d799 execute_function_todo
>> ../../latest_gcc/gcc/passes.c:2042
>> 0x122c74b do_per_function
>> ../../latest_gcc/gcc/passes.c:1687
>> 0x122d986 execute_todo
>> ../../latest_gcc/gcc/passes.c:2096
>> Please submit a full bug report,
>> with preprocessed source if appropriate.
>> Please include the complete backtrace with any bug report.
>> See  for instructions.
>>
>> In this testing case, both “I” and “size2” are auto vars that are not
>initialized at declaration but are initialized later by assignment.
>> However, “I” doesn’t have any issue, but “size2” has such issue.
>>
>> **“ssa” dump:
>>
>>:
>>   i_7 = .DEFERRED_INIT (i_6(D), 2);
>>   i_8 = 0;
>>   goto ; [INV]
>>
>>:
>>   size2_12 = .DEFERRED_INIT (size2_3, 2);
>>   size2_13 = 4;
>>
>> **”ccp1” dump:
>>
>>:
>>   i_7 = .DEFERRED_INIT (i_6(D), 2);
>>   goto ; [INV]
>>
>>:
>>   size2_12 = .DEFERRED_INIT (4, 2);
>>
>> So, I am wondering:  Is it possible that “ssa” phase have a bug ?
>>
>> Qing
>>
>> > On Jun 30, 2021, at 9:39 AM, Richard Biener 
>wrote:
>> >
>> > On Wed, 30 Jun 2021, Qing Zhao wrote:
>> >
>> >>
>> >>
>> >> On Jun 30, 2021, at 2:46 AM, Richard Biener
>mailto:rguent...@suse.de>> wrote:
>> >>
>> >> On Wed, 30 Jun 2021, Qing Zhao wrote:
>> >>
>> >> Hi,
>> >>
>> >> I am testing the 4th patch of -ftrivial-auto-var-init with CPU2017
>today, and found the following issues:
>> >>
>> >> In the dump file of “*t.i.031t.objsz1”, we have:
>> >>
>> >>  :
>> >> __s1_len_217 = .DEFERRED_INIT (__s1_len_176, 2);
>> >> __s2_len_218 = .DEFERRED_INIT (__s2_len_177, 2);
>> >>
>> >> I looks like this .DEFERRED_INIT initializes an already
>initialized
>> >> variable.
>> >>
>> >> Yes.
>> >>
>> >> For cases like the following:
>> >>
>> >> int s2_len;
>> >> s2_len = 4;
>> >>
>> >> i.e, the initialization is not at the declaration.
>> >>
>> >> We cannot avoid initialization for such cases.
>> >
>> > But I'd have expected
>> >
>> >  s2_len = .DEFERRED_INIT (s2_len, 0);
>> >  s2_len = 4;
>> >
>> > from the above - thus the deferred init _before_ the first
>> > "use" which is the explicit init.  How does the other order
>> > happen to materialize?  As said, I believe it shouldn't.
>> >
>> >> I'd expect to only ever see default definition SSA names
>> >> as first argument to .DEFERRED_INIT.
>> >>
>> >> You mean something like:
>> >> __s2_len_218 = .DEFERRED_INIT (__s2_len, 2);
>> >
>> > No,
>> >
>> > __s2_len_218 = .DEFERRED_INIT (__s2_len_217(D), 2);
>> >
>> >> ?
>> >>
>> >>
>> >> __s2_len_219 = 7;
>> >> if (__s2_len_219 <= 3)
>> >>   goto ; [INV]
>> >> else
>> >>   goto ; [INV]
>> >>
>> >>  :
>> >> _1 = (long unsigned int) i_175;
>> >>
>> >>
>> >> However, after “ccp”, in “t.i.032t.ccp1”, we have:
>> >>
>> >>  :
>> >> __s1_len_217 = .DEFERRED_INIT (__s1_len_176, 2);
>> >> __s2_len_218 = .DEFERRED_INIT (7, 2);
>> >> _36 = (long unsigned int) i_175;
>> >> _37 = _36 * 8;
>> >> _38 = argv_220(D) + _37;
>> >>
>> >>
>> >> Looks like that the optimization “ccp” replaced the first argument
>of the call .DEFERRED_INIT with the constant 7.
>> >> This should be avoided.
>> >>
>> >> (NOTE, this issue existed in the previous patches, however, only
>exposed with this version since I added more verification
>> >> code in tree-cfg.c to verify the call to .DEFERRED_INIT).
>> >>
>> >> I am wondering what’s the best solution to this problem?
>> >>
>> >> I think you have to trace where this "bogus" .DEFERRED_INIT comes
>from
>> >> originally.  Or alt

Re: HELP!! How to inhibit optimizations applied to .DEFERRED_INIT argument?

2021-06-30 Thread Qing Zhao via Gcc-patches

Hi, Andrew,

Thanks a lot for your explanation.

On Jun 30, 2021, at 12:20 PM, Andrew Pinski <pins...@gmail.com> wrote:

On Wed, Jun 30, 2021 at 8:47 AM Qing Zhao via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:

I came up with a very simple testing case that can repeat the same issue:

[qinzhao@localhost gcc]$ cat t.c
extern void bar (int);
void foo (int a)
{
 int i;
 for (i = 0; i < a; i++) {
   if (__extension__({int size2;
   size2 = 4;
   size2 > 5;}))
   bar (a);
 }
}

You should show the full dump,

For the above small testing case:

*The full dump of “gimple” phase is:

void foo (int a)
{
  int D.2240;
  int i;

  i = .DEFERRED_INIT (i, 2);
  i = 0;
  goto ;
  :
  {
int size2;

size2 = .DEFERRED_INIT (size2, 2);
size2 = 4;
_1 = size2 > 5;
D.2240 = (int) _1;
  }
  if (D.2240 != 0) goto ; else goto ;
  :
  bar (a);
  :
  i = i + 1;
  :
  if (i < a) goto ; else goto ;
  :
}

*The full dump of “ssa” phase is:


;; Function foo (foo, funcdef_no=0, decl_uid=2236, cgraph_uid=1, symbol_order=0)

void foo (int a)
{
  int size2;
  int i;
  _Bool _1;
  int _14;

   :
  i_7 = .DEFERRED_INIT (i_6(D), 2);
  i_8 = 0;
  goto ; [INV]

   :
  size2_12 = .DEFERRED_INIT (size2_3, 2);
  size2_13 = 4;
  _1 = size2_13 > 5;
  _14 = (int) _1;
  if (_14 != 0)
goto ; [INV]
  else
goto ; [INV]

   :
  bar (a_11(D));

   :
  i_16 = i_2 + 1;

   :
  # i_2 = PHI 
  # size2_3 = PHI 
  if (i_2 < a_11(D))
goto ; [INV]
  else
goto ; [INV]

   :
  return;

}


**The full dump of the “ccp1” phase is:


;; Function foo (foo, funcdef_no=0, decl_uid=2236, cgraph_uid=1, symbol_order=0)

Folding predicate 0 != 0 to 0
Removing basic block 4
Merging blocks 3 and 5



EMERGENCY DUMP:

void foo (int a)
{
  int size2;
  int i;

   :
  i_7 = .DEFERRED_INIT (i_6(D), 2);
  goto ; [INV]

   :
  size2_12 = .DEFERRED_INIT (4, 2);
  i_16 = i_2 + 1;

   :
  # i_2 = PHI <0(2), i_16(3)>
  if (i_2 < a_11(D))
goto ; [INV]
  else
goto ; [INV]

   :
  return;

}



What we have is the following:

size2_3 = PHI 
   :

size2_12 = .DEFERRED_INIT (size2_3, 2);
size2_13 = 4;

So CCP decides to propagate 4 into the PHI and then decides size2_1(D)
is undefined so size2_3 is then considered 4 and propagates it into
the .DEFERRED_INIT.

Okay, now I understand.

So, are both SSA and CCP behaving correctly?

Is there a simple way to prevent CCP from propagating 4 into .DEFERRED_INIT?

thanks.

Qing

Thanks,
Andrew



[qinzhao@localhost gcc]$ /home/qinzhao/Work/GCC/gcc_build_debug/gcc/xgcc 
-B/home/qinzhao/Work/GCC/gcc_build_debug/gcc/ -std=c99   -m64  -march=native 
-ftrivial-auto-var-init=zero t.c -fdump-tree-all  -O1
t.c: In function ‘foo’:
t.c:11:1: error: ‘DEFFERED_INIT’ calls should have the same LHS as the first 
argument
  11 | }
 | ^
size2_12 = .DEFERRED_INIT (4, 2);
during GIMPLE pass: ccp
dump file: a-t.c.032t.ccp1
t.c:11:1: internal compiler error: verify_gimple failed
0x143ee47 verify_gimple_in_cfg(function*, bool)
   ../../latest_gcc/gcc/tree-cfg.c:5501
0x122d799 execute_function_todo
   ../../latest_gcc/gcc/passes.c:2042
0x122c74b do_per_function
   ../../latest_gcc/gcc/passes.c:1687
0x122d986 execute_todo
   ../../latest_gcc/gcc/passes.c:2096
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.

In this test case, both “i” and “size2” are auto vars that are not
initialized at declaration but are initialized later by assignment.
However, “i” doesn’t have any issue, while “size2” does.

**“ssa” dump:

  :
 i_7 = .DEFERRED_INIT (i_6(D), 2);
 i_8 = 0;
 goto ; [INV]

  :
 size2_12 = .DEFERRED_INIT (size2_3, 2);
 size2_13 = 4;

**”ccp1” dump:

  :
 i_7 = .DEFERRED_INIT (i_6(D), 2);
 goto ; [INV]

  :
 size2_12 = .DEFERRED_INIT (4, 2);

So, I am wondering:  Is it possible that “ssa” phase have a bug ?

Qing

On Jun 30, 2021, at 9:39 AM, Richard Biener 
mailto:rguent...@suse.de>> wrote:

On Wed, 30 Jun 2021, Qing Zhao wrote:



On Jun 30, 2021, at 2:46 AM, Richard Biener 
mailto:rguent...@suse.de>> wrote:

On Wed, 30 Jun 2021, Qing Zhao wrote:

Hi,

I am testing the 4th patch of -ftrivial-auto-var-init with CPU2017 today, and 
found the following issues:

In the dump file of “*t.i.031t.objsz1”, we have:

 :
__s1_len_217 = .DEFERRED_INIT (__s1_len_176, 2);
__s2_len_218 = .DEFERRED_INIT (__s2_len_177, 2);

I looks like this .DEFERRED_INIT initializes an already initialized
variable.

Yes.

For cases like the following:

int s2_len;
s2_len = 4;

i.e, the initialization is not at the declaration.

We cannot avoid initialization for such cases.

But I'd have expected

s2_len = .DEFERRED_INIT (s2_len, 0);
s2_len = 4;

from the above - thus the deferred init _before_ the first
"use" which is the explicit init.  How does the other order
happen to materialize?  As said, I believe it shouldn't.

I'd expect

Re: [PATCH][RFC]AArch64 SVE: Fix multiple comparison masks on inverted operands

2021-06-30 Thread Richard Sandiford via Gcc-patches
Tamar Christina  writes:
>> -Original Message-
>> From: Richard Sandiford 
>> Sent: Monday, June 14, 2021 4:55 PM
>> To: Tamar Christina 
>> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
>> ; Marcus Shawcroft
>> ; Kyrylo Tkachov 
>> Subject: Re: [PATCH][RFC]AArch64 SVE: Fix multiple comparison masks on
>> inverted operands
>> 
>> Tamar Christina  writes:
>> > Hi Richard,
>> >> -Original Message-
>> >> From: Richard Sandiford 
>> >> Sent: Monday, June 14, 2021 3:50 PM
>> >> To: Tamar Christina 
>> >> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
>> >> ; Marcus Shawcroft
>> >> ; Kyrylo Tkachov
>> 
>> >> Subject: Re: [PATCH][RFC]AArch64 SVE: Fix multiple comparison masks
>> >> on inverted operands
>> >>
>> >> Tamar Christina  writes:
>> >> > Hi All,
>> >> >
>> >> > This RFC is trying to address the following inefficiency when
>> >> > vectorizing conditional statements with SVE.
>> >> >
>> >> > Consider the case
>> >> >
>> >> > void f10(double * restrict z, double * restrict w, double * restrict x,
>> >> >  double * restrict y, int n)
>> >> > {
>> >> > for (int i = 0; i < n; i++) {
>> >> > z[i] = (w[i] > 0) ? x[i] + w[i] : y[i] - w[i];
>> >> > }
>> >> > }
>> >> >
>> >> >
>> >> > For which we currently generate at -O3:
>> >> >
>> >> > f10:
>> >> > cmp w4, 0
>> >> > ble .L1
>> >> > mov x5, 0
>> >> > whilelo p1.d, wzr, w4
>> >> > ptrue   p3.b, all
>> >> > .L3:
>> >> > ld1dz1.d, p1/z, [x1, x5, lsl 3]
>> >> > fcmgt   p2.d, p1/z, z1.d, #0.0
>> >> > fcmgt   p0.d, p3/z, z1.d, #0.0
>> >> > ld1dz2.d, p2/z, [x2, x5, lsl 3]
>> >> > bic p0.b, p3/z, p1.b, p0.b
>> >> > ld1dz0.d, p0/z, [x3, x5, lsl 3]
>> >> > fsubz0.d, p0/m, z0.d, z1.d
>> >> > movprfx z0.d, p2/m, z1.d
>> >> > faddz0.d, p2/m, z0.d, z2.d
>> >> > st1dz0.d, p1, [x0, x5, lsl 3]
>> >> > incdx5
>> >> > whilelo p1.d, w5, w4
>> >> > b.any   .L3
>> >> > .L1:
>> >> > ret
>> >> >
>> >> > Notice that the condition for the else branch duplicates the same
>> >> > predicate as the then branch and then uses BIC to negate the results.
>> >> >
>> >> > The reason for this is that during instruction generation in the
>> >> > vectorizer we emit
>> >> >
>> >> >   mask__41.11_66 = vect__4.10_64 > vect_cst__65;
>> >> >   vec_mask_and_69 = mask__41.11_66 & loop_mask_63;
>> >> >   vec_mask_and_71 = mask__41.11_66 & loop_mask_63;
>> >> >   mask__43.16_73 = ~mask__41.11_66;
>> >> >   vec_mask_and_76 = mask__43.16_73 & loop_mask_63;
>> >> >   vec_mask_and_78 = mask__43.16_73 & loop_mask_63;
>> >> >
>> >> > which ultimately gets optimized to
>> >> >
>> >> >   mask__41.11_66 = vect__4.10_64 > { 0.0, ... };
>> >> >   vec_mask_and_69 = loop_mask_63 & mask__41.11_66;
>> >> >   mask__43.16_73 = ~mask__41.11_66;
>> >> >   vec_mask_and_76 = loop_mask_63 & mask__43.16_73;
>> >> >
>> >> > Notice how the negate is on the operation and not the predicate
>> >> > resulting from the operation.  When this is expanded this turns
>> >> > into RTL where the negate is on the compare directly.  This means
>> >> > the RTL is different from the one without the negate and so CSE is
>> >> > unable to
>> >> recognize that they are essentially same operation.
>> >> >
>> >> > To fix this my patch changes it so you negate the mask rather than
>> >> > the operation
>> >> >
>> >> >   mask__41.13_55 = vect__4.12_53 > { 0.0, ... };
>> >> >   vec_mask_and_58 = loop_mask_52 & mask__41.13_55;
>> >> >   vec_mask_op_67 = ~vec_mask_and_58;
>> >> >   vec_mask_and_65 = loop_mask_52 & vec_mask_op_67;
>> >>
>> >> But to me this looks like a pessimisation in gimple terms.  We've
>> >> increased the length of the critical path: vec_mask_and_65 now needs
>> >> a chain of
>> >> 4 operations instead of 3.
>> >
>> > True, but it should reduce the number of RTL patterns.  I would have
>> > thought RTL is more expensive to handle than gimple.
>> 
>> I think this is only a fair gimple optimisation if gimple does the isel 
>> itself (to a
>> predicated compare and a predicated NOT).
>> 
>> >> We also need to be careful not to pessimise the case in which the
>> >> comparison is an integer one.  At the moment we'll generate opposed
>> >> conditions, which is the intended behaviour:
>> >
>> > This still happens with this patch at `-Ofast` because that flips the
>> > conditions, So the different representation doesn't harm it.
>> 
>> OK, that's good.
>> 
>> >>
>> >> .L3:
>> >> ld1dz1.d, p0/z, [x1, x5, lsl 3]
>> >> cmpgt   p2.d, p0/z, z1.d, #0
>> >> movprfx z2, z1
>> >> scvtf   z2.d, p3/m, z1.d
>> >> cmple   p1.d, p0/z, z1.d, #0
>> >> ld1dz0.d, p2/z, [x2, x5, lsl 3]
>> >> ld1dz1.d, p1/z, [x3, x5, lsl 3]
>> >> faddz0.d, p2/m, z0.d, z2.d
>> >> movprfx z0.d, p1/m, z1.d
>> >> fsubz0.d, p1/m, z0.d, z2.d
>> >>  

Re: HELP!! How to inhibit optimizations applied to .DEFERRED_INIT argument?

2021-06-30 Thread Qing Zhao via Gcc-patches


> On Jun 30, 2021, at 12:36 PM, Richard Biener  wrote:
> 
> On June 30, 2021 7:20:18 PM GMT+02:00, Andrew Pinski  
> wrote:
>> On Wed, Jun 30, 2021 at 8:47 AM Qing Zhao via Gcc-patches
>>  wrote:
>>> 
>>> I came up with a very simple testing case that can repeat the same
>> issue:
>>> 
>>> [qinzhao@localhost gcc]$ cat t.c
>>> extern void bar (int);
>>> void foo (int a)
>>> {
>>>  int i;
>>>  for (i = 0; i < a; i++) {
>>>if (__extension__({int size2;
>>>size2 = 4;
>>>size2 > 5;}))
>>>bar (a);
>>>  }
>>> }
>> 
>> You should show the full dump,
>> What we have is the following:
>> 
>> 
>> 
>> size2_3 = PHI 
>>   :
>> 
>> size2_12 = .DEFERRED_INIT (size2_3, 2);
>> size2_13 = 4;
>> 
>> So CCP decides to propagate 4 into the PHI and then decides size2_1(D)
>> is undefined so size2_3 is then considered 4 and propagates it into
>> the .DEFERRED_INIT.
> 
> Which means the DEFERED_INIT is inserted at the wrong place. 

Then, where is the correct place for “.DEFERRED_INIT (size2, 2)”?

The variable “size2” is a block scope variable which is declared inside the 
“if” condition:

>>> if (__extension__({int size2;
>>>size2 = 4;
>>>size2 > 5;}))

So, it’s reasonable to insert the initialization inside this block,
immediately after the declaration. This is what the patch currently does:

*The full dump of “gimple” phase is:

void foo (int a)
{
 int D.2240;
 int i;

 i = .DEFERRED_INIT (i, 2);
 i = 0;
 goto ;
 :
 {
   int size2;

   size2 = .DEFERRED_INIT (size2, 2);
   size2 = 4;
   _1 = size2 > 5;
   D.2240 = (int) _1;
 }
 if (D.2240 != 0) goto ; else goto ;
 :
 bar (a);
 :
 i = i + 1;
 :
 if (i < a) goto ; else goto ;
 :
}
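
For readers who want to try this without the GNU statement-expression extension, the test case can be restated with a plain nested block, which is roughly the shape the “gimple” dump above shows. This is only a sketch: the temporary cond (standing in for D.2240), the counter bar_calls and the counting body of bar are assumptions added for illustration and are not part of t.c or of the patch.

```c
/* Hypothetical stand-in for the external bar () of t.c; it only
   counts how often it is called.  */
static int bar_calls;

static void
bar (int a)
{
  (void) a;
  bar_calls++;
}

/* Same control flow as foo () in t.c, with the statement expression
   written as a nested block.  */
static void
foo (int a)
{
  int i;

  for (i = 0; i < a; i++)
    {
      int cond;        /* plays the role of D.2240 in the dump */
      {
        int size2;     /* block scope: a new object on every iteration */
        size2 = 4;     /* assignment after the declaration, no initializer */
        cond = size2 > 5;
      }
      if (cond)
        bar (a);
    }
}
```

Since 4 > 5 is false, bar is never reached for any value of a, which is consistent with the test being folded away in the “ccp1” dump.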

However, I suspect that the SSA phase moved “size2” out of its block scope,
as follows:

**The full “SSA” dump is:

;; Function foo (foo, funcdef_no=0, decl_uid=2236, cgraph_uid=1, symbol_order=0)

void foo (int a)
{
 int size2;
 int i;
 _Bool _1;
 int _14;

  :
 i_7 = .DEFERRED_INIT (i_6(D), 2);
 i_8 = 0;
 goto ; [INV]

  :
 size2_12 = .DEFERRED_INIT (size2_3, 2);
 size2_13 = 4;
 _1 = size2_13 > 5;
 _14 = (int) _1;
 if (_14 != 0)
   goto ; [INV]
 else
   goto ; [INV]

  :
 bar (a_11(D));

  :
 i_16 = i_2 + 1;

  :
 # i_2 = PHI 
 # size2_3 = PHI 
 if (i_2 < a_11(D))
   goto ; [INV]
 else
   goto ; [INV]

  :
 return;

}

In the above, we can see that “# size2_3 = PHI ” is already outside of
its block scope.
“size2” should not be in the same scope as “i”.

This looks like an incorrect SSA transformation to me.

What do you think?

Qing

> 
> Richard. 
>> 
>> Thanks,
>> Andrew
>> 
>> 
>>> 
>>> [qinzhao@localhost gcc]$
>> /home/qinzhao/Work/GCC/gcc_build_debug/gcc/xgcc
>> -B/home/qinzhao/Work/GCC/gcc_build_debug/gcc/ -std=c99   -m64 
>> -march=native -ftrivial-auto-var-init=zero t.c -fdump-tree-all  -O1
>>> t.c: In function ‘foo’:
>>> t.c:11:1: error: ‘DEFFERED_INIT’ calls should have the same LHS as
>> the first argument
>>>   11 | }
>>>  | ^
>>> size2_12 = .DEFERRED_INIT (4, 2);
>>> during GIMPLE pass: ccp
>>> dump file: a-t.c.032t.ccp1
>>> t.c:11:1: internal compiler error: verify_gimple failed
>>> 0x143ee47 verify_gimple_in_cfg(function*, bool)
>>>../../latest_gcc/gcc/tree-cfg.c:5501
>>> 0x122d799 execute_function_todo
>>>../../latest_gcc/gcc/passes.c:2042
>>> 0x122c74b do_per_function
>>>../../latest_gcc/gcc/passes.c:1687
>>> 0x122d986 execute_todo
>>>../../latest_gcc/gcc/passes.c:2096
>>> Please submit a full bug report,
>>> with preprocessed source if appropriate.
>>> Please include the complete backtrace with any bug report.
>>> See  for instructions.
>>> 
>>> In this testing case, both “I” and “size2” are auto vars that are not
>> initialized at declaration but are initialized later by assignment.
>>> However, “I” doesn’t have any issue, but “size2” has such issue.
>>> 
>>> **“ssa” dump:
>>> 
>>>   :
>>>  i_7 = .DEFERRED_INIT (i_6(D), 2);
>>>  i_8 = 0;
>>>  goto ; [INV]
>>> 
>>>   :
>>>  size2_12 = .DEFERRED_INIT (size2_3, 2);
>>>  size2_13 = 4;
>>> 
>>> **”ccp1” dump:
>>> 
>>>   :
>>>  i_7 = .DEFERRED_INIT (i_6(D), 2);
>>>  goto ; [INV]
>>> 
>>>   :
>>>  size2_12 = .DEFERRED_INIT (4, 2);
>>> 
>>> So, I am wondering:  Is it possible that “ssa” phase have a bug ?
>>> 
>>> Qing
>>> 
 On Jun 30, 2021, at 9:39 AM, Richard Biener 
>> wrote:
 
 On Wed, 30 Jun 2021, Qing Zhao wrote:
 
> 
> 
> On Jun 30, 2021, at 2:46 AM, Richard Biener
>> mailto:rguent...@suse.de>> wrote:
> 
> On Wed, 30 Jun 2021, Qing Zhao wrote:
> 
> Hi,
> 
> I am testing the 4th patch of -ftrivial-auto-var-init with CPU2017
>> today, and found the following issues:
> 
> In the dump file of “*t.i.031t.objsz1”, we have:
> 
>  :
> __s1_len_217 = .DEFERRED_INIT (__s1_len_176, 2);
> __s2_len_218 = .DEFERRED_INIT (__s2_len_177, 2);
> 
> I looks like this .DEFERRED_INIT initializes an already
>> initialized
> variable.
> 
