Re: [PATCH v4] DSE: Use the constant store source if possible

2022-06-01 Thread Richard Sandiford via Gcc-patches
"H.J. Lu"  writes:
> On Mon, May 30, 2022 at 09:35:43AM +0100, Richard Sandiford wrote:
>> "H.J. Lu"  writes:
>> > ---
>> > RTL DSE tracks redundant constant stores within a basic block.  When RTL
>> > loop invariant motion hoists a constant initialization out of the loop
>> > into a separate basic block, the constant store value becomes unknown
>> > within the original basic block.  When recording store for RTL DSE, check
>> > if the source register is set only once to a constant by a non-partial
>> > unconditional load.  If yes, record the constant as the constant store
>> > source.  It eliminates unrolled zero stores after memset 0 in a loop
>> > where a vector register is used as the zero store source.
>> >
>> > Extract find_single_def_src from loop-iv.cc and move it to df-core.cc:
>> >
>> > 1. Rename to df_find_single_def_src.
>> > 2. Change the argument to rtx and use rtx_equal_p.
>> > 3. Return null for partial or conditional defs.
>> >
>> > gcc/
>> >
>> >PR rtl-optimization/105638
>> >* df-core.cc (df_find_single_def_src): Moved and renamed from
>> >find_single_def_src in loop-iv.cc.  Change the argument to rtx
>> >and use rtx_equal_p.  Return null for partial or conditional
>> >defs.
>> >* df.h (df_find_single_def_src): New prototype.
>> >* dse.cc (record_store): Use the constant source if the source
>> >register is set only once.
>> >* loop-iv.cc (find_single_def_src): Moved to df-core.cc.
>> >(replace_single_def_regs): Replace find_single_def_src with
>> >df_find_single_def_src.
>> >
>> > gcc/testsuite/
>> >
>> >PR rtl-optimization/105638
>> >* g++.target/i386/pr105638.C: New test.
>> > ---
>> >  gcc/df-core.cc   | 44 +++
>> >  gcc/df.h |  1 +
>> >  gcc/dse.cc   | 14 
>> >  gcc/loop-iv.cc   | 45 +---
>> >  gcc/testsuite/g++.target/i386/pr105638.C | 44 +++
>> >  5 files changed, 104 insertions(+), 44 deletions(-)
>> >  create mode 100644 gcc/testsuite/g++.target/i386/pr105638.C
>> >
>> > diff --git a/gcc/df-core.cc b/gcc/df-core.cc
>> > index a901b84878f..f9b4de8eb7a 100644
>> > --- a/gcc/df-core.cc
>> > +++ b/gcc/df-core.cc
>> > @@ -2009,6 +2009,50 @@ df_reg_used (rtx_insn *insn, rtx reg)
>> >return df_find_use (insn, reg) != NULL;
>> >  }
>> >  
>> > +/* If REG has a single definition, return its known value, otherwise return
>> > +   null.  */
>> > +
>> > +rtx
>> > +df_find_single_def_src (rtx reg)
>> > +{
>> > +  rtx src = NULL_RTX;
>> > +
>> > +  /* Don't look through unbounded number of single definition REG copies,
>> > + there might be loops for sources with uninitialized variables.  */
>> > +  for (int cnt = 0; cnt < 128; cnt++)
>> > +{
>> > +  df_ref adef = DF_REG_DEF_CHAIN (REGNO (reg));
>> > +  if (adef == NULL || DF_REF_NEXT_REG (adef) != NULL
>> > +|| DF_REF_IS_ARTIFICIAL (adef)
>> > +|| (DF_REF_FLAGS (adef)
>> > +& (DF_REF_PARTIAL | DF_REF_CONDITIONAL)))
>> > +  return NULL_RTX;
>> > +
>> > +  rtx set = single_set (DF_REF_INSN (adef));
>> > +  if (set == NULL || !rtx_equal_p (SET_DEST (set), reg))
>> > +  return NULL_RTX;
>> > +
>> > +  rtx note = find_reg_equal_equiv_note (DF_REF_INSN (adef));
>> > +  if (note && function_invariant_p (XEXP (note, 0)))
>> > +  {
>> > +src = XEXP (note, 0);
>> > +break;
>> > +  }
>> 
>> Seems simpler to return this directly, rather than break and then
>> check function_invariant_p again.
>
> Fixed.
>
>> 
>> > +  src = SET_SRC (set);
>> > +
>> > +  if (REG_P (src))
>> > +  {
>> > +reg = src;
>> > +continue;
>> > +  }
>> > +  break;
>> > +}
>> > +  if (!function_invariant_p (src))
>> > +return NULL_RTX;
>> > +
>> > +  return src;
>> > +}
>> > +
>> >
>> >  
>> > /*
>> > Debugging and printing functions.
>> > diff --git a/gcc/df.h b/gcc/df.h
>> > index bd329205d08..71e249ad20a 100644
>> > --- a/gcc/df.h
>> > +++ b/gcc/df.h
>> > @@ -991,6 +991,7 @@ extern df_ref df_find_def (rtx_insn *, rtx);
>> >  extern bool df_reg_defined (rtx_insn *, rtx);
>> >  extern df_ref df_find_use (rtx_insn *, rtx);
>> >  extern bool df_reg_used (rtx_insn *, rtx);
>> > +extern rtx df_find_single_def_src (rtx);
>> >  extern void df_worklist_dataflow (struct dataflow *,bitmap, int *, int);
>> >  extern void df_print_regset (FILE *file, const_bitmap r);
>> >  extern void df_print_word_regset (FILE *file, const_bitmap r);
>> > diff --git a/gcc/dse.cc b/gcc/dse.cc
>> > index 30c11cee034..c915266f025 100644
>> > --- a/gcc/dse.cc
>> > +++ b/gcc/dse.cc
>> > @@ -1508,6 +1508,20 @@ record_store (rtx body, bb_info_t bb_info)
>> >  
>> >  if (tem && CONSTANT_P (tem))
>> >const_rhs = tem;
>> > +else
>> > +  {
>> > +/* If RHS is set only once to a constant, set CONST_RH
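
A minimal sketch of the lookup that df_find_single_def_src performs, written as self-contained C++ rather than against GCC's df API. The Def and DefChains types and the function name are hypothetical stand-ins for the real def chains; only the 128-step bound and the "single, non-partial, unconditional definition" rule are taken from the patch above.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <vector>

// Hypothetical model of one register definition (not GCC's df_ref).
struct Def {
  bool is_copy;                 // true: the source is another register
  int src_reg;                  // meaningful when is_copy is true
  long constant;                // meaningful when is_copy is false
  bool partial_or_conditional;  // models DF_REF_PARTIAL | DF_REF_CONDITIONAL
};

// Maps a register number to every definition of that register.
using DefChains = std::map<int, std::vector<Def>>;

// Return the constant that REG is known to hold, or nullopt if REG has
// no single full, unconditional definition that leads to a constant.
std::optional<long> find_single_def_src(const DefChains &defs, int reg) {
  // Bound the walk so that a cycle of register copies cannot loop forever.
  for (int cnt = 0; cnt < 128; ++cnt) {
    auto it = defs.find(reg);
    if (it == defs.end() || it->second.size() != 1)
      return std::nullopt;
    const Def &def = it->second.front();
    if (def.partial_or_conditional)
      return std::nullopt;
    if (!def.is_copy)
      return def.constant;
    reg = def.src_reg;  // Look through the single-definition copy.
  }
  return std::nullopt;
}
```

In this model a register with two definitions, a partial definition, or a cycle of copies yields no value, which corresponds to DSE continuing to treat the store source as unknown.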

Re: [2/2] PR96463 -- changes to type checking vec_perm_expr in middle end

2022-06-01 Thread Richard Biener via Gcc-patches
On Tue, 31 May 2022, Prathamesh Kulkarni wrote:

> On Mon, 23 May 2022 at 22:57, Prathamesh Kulkarni
>  wrote:
> >
> > On Mon, 9 May 2022 at 21:21, Prathamesh Kulkarni
> >  wrote:
> > >
> > > On Mon, 9 May 2022 at 19:22, Richard Sandiford
> > >  wrote:
> > > >
> > > > Prathamesh Kulkarni  writes:
> > > > > On Tue, 3 May 2022 at 18:25, Richard Sandiford
> > > > >  wrote:
> > > > >>
> > > > >> Prathamesh Kulkarni  writes:
> > > > >> > On Tue, 4 Jan 2022 at 19:12, Richard Sandiford
> > > > >> >  wrote:
> > > > >> >>
> > > > >> >> Richard Biener  writes:
> > > > >> >> > On Tue, 4 Jan 2022, Richard Sandiford wrote:
> > > > >> >> >
> > > > >> >> >> Richard Biener  writes:
> > > > >> >> >> > On Fri, 17 Dec 2021, Richard Sandiford wrote:
> > > > >> >> >> >
> > > > >> >> >> >> Prathamesh Kulkarni  writes:
> > > > >> >> >> >> > Hi,
> > > > >> >> >> >> > The attached patch rearranges order of type-check for 
> > > > >> >> >> >> > vec_perm_expr
> > > > >> >> >> >> > and relaxes type checking for
> > > > >> >> >> >> > lhs = vec_perm_expr
> > > > >> >> >> >> >
> > > > >> >> >> >> > when:
> > > > >> >> >> >> > rhs1 == rhs2,
> > > > >> >> >> >> > lhs is variable length vector,
> > > > >> >> >> >> > rhs1 is fixed length vector,
> > > > >> >> >> >> > TREE_TYPE (lhs) == TREE_TYPE (rhs1)
> > > > >> >> >> >> >
> > > > >> >> >> >> > I am not sure tho if this check is correct ? My intent 
> > > > >> >> >> >> > was to capture
> > > > >> >> >> >> > case when vec_perm_expr is used to "extend" fixed length 
> > > > >> >> >> >> > vector to
> > > > > >> >> >> >> > its VLA equivalent.
> > > > >> >> >> >>
> > > > >> >> >> >> VLAness isn't really the issue.  We want the same thing to 
> > > > >> >> >> >> work for
> > > > >> >> >> >> -msve-vector-bits=256, -msve-vector-bits=512, etc., even 
> > > > >> >> >> >> though the
> > > > >> >> >> >> vectors are fixed-length in that case.
> > > > >> >> >> >>
> > > > >> >> >> >> The principle is that for:
> > > > >> >> >> >>
> > > > >> >> >> >>   A = VEC_PERM_EXPR ;
> > > > >> >> >> >>
> > > > >> >> >> >> the requirements are:
> > > > >> >> >> >>
> > > > >> >> >> >> - A, B, C and D must be vectors
> > > > >> >> >> >> - A, B and C must have the same element type
> > > > >> >> >> >> - D must have an integer element type
> > > > >> >> >> >> - A and D must have the same number of elements (NA)
> > > > >> >> >> >> - B and C must have the same number of elements (NB)
> > > > >> >> >> >>
> > > > >> >> >> >> The semantics are that we create a joined vector BC (all 
> > > > >> >> >> >> elements of B
> > > > > >> >> >> >> followed by all elements of C) and that:
> > > > >> >> >> >>
> > > > >> >> >> >>   A[i] = BC[D[i] % (NB+NB)]
> > > > >> >> >> >>
> > > > > >> >> >> >> for 0 ≤ i < NA.
> > > > >> >> >> >>
> > > > >> >> >> >> This operation makes sense even if NA != NB.
> > > > >> >> >> >
> > > > >> >> >> > But note that we don't currently expect NA != NB and the 
> > > > >> >> >> > optab just
> > > > >> >> >> > has a single mode.
> > > > >> >> >>
> > > > >> >> >> True, but we only need this for constant permutes.  They are 
> > > > >> >> >> already
> > > > >> >> >> special in that they allow the index elements to be wider than 
> > > > >> >> >> the data
> > > > >> >> >> elements.
> > > > >> >> >
> > > > >> >> > OK, then we should reflect this in the stmt verification and 
> > > > >> >> > only relax
> > > > >> >> > the constant permute vector case and also amend the
> > > > >> >> > TARGET_VECTORIZE_VEC_PERM_CONST accordingly.
> > > > >> >>
> > > > >> >> Sounds good.
> > > > >> >>
> > > > >> >> > For non-constant permutes the docs say the mode of vec_perm is
> > > > >> >> > the common mode of operands 1 and 2 whilst the mode of operand 0
> > > > >> >> > is unspecified - even unconstrained by the docs.  I'm not sure
> > > > >> >> > if vec_perm expansion is expected to eventually FAIL.  Updating 
> > > > >> >> > the
> > > > >> >> > docs of vec_perm would be appreciated as well.
> > > > >> >>
> > > > >> >> Yeah, I guess de facto operand 0 has to be the same mode as 
> > > > >> >> operands
> > > > >> >> 1 and 2.  Maybe that was just an oversight, or maybe it seemed 
> > > > >> >> obvious
> > > > >> >> or self-explanatory at the time. :-)
> > > > >> >>
> > > > >> >> > As said I prefer to not mangle the existing stmt checking too 
> > > > >> >> > much
> > > > > >> >> > at this stage so minimal adjustment is preferred there.
> > > > >> >>
> > > > >> >> The PR is only an enhancement request rather than a bug, so I 
> > > > >> >> think the
> > > > >> >> patch would need to wait for GCC 13 whatever happens.
> > > > >> > Hi,
> > > > >> > In attached patch, the type checking is relaxed only if mask is 
> > > > >> > constant.
> > > > >> > Does this look OK ?
> > > > >> >
> > > > >> > Thanks,
> > > > >> > Prathamesh
> > > > >> >>
> > > > >> >> Thanks,
> > > > >> >> Richard
> > > > >> >
> > > > >> > diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
> > > > >> > index e321d929fd0..02b88f67855 100644
> > > > >> > --- a/gcc/tree-cfg.cc
> > 
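
The VEC_PERM_EXPR semantics quoted earlier in this thread (A[i] = BC[D[i] % (NB+NB)] for 0 ≤ i < NA) can be sketched in plain C++. This is an illustrative model only: the vec_perm function and the use of std::vector are assumptions made for the example, not the GIMPLE implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Model of A = VEC_PERM_EXPR <B, C, D>.  B and C must have the same
// number of elements (NB); the result has as many elements as D (NA).
std::vector<int> vec_perm(const std::vector<int> &b,
                          const std::vector<int> &c,
                          const std::vector<std::size_t> &d) {
  assert(b.size() == c.size() && !b.empty());
  const std::size_t nb = b.size();
  std::vector<int> a(d.size());
  for (std::size_t i = 0; i < d.size(); ++i) {
    // Index into the conceptual concatenation BC, wrapping modulo NB+NB.
    const std::size_t idx = d[i] % (nb + nb);
    a[i] = idx < nb ? b[idx] : c[idx - nb];
  }
  return a;
}
```

With B = C and D = {0, 1, 2, 3, 0, 1, 2, 3}, a 4-element input is "extended" to an 8-element result, which is the NA != NB case discussed above.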

Re: [1/2] PR96463 - aarch64 specific changes

2022-06-01 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni  writes:
> On Thu, 12 May 2022 at 16:15, Richard Sandiford
>  wrote:
>>
>> Prathamesh Kulkarni  writes:
>> > On Wed, 11 May 2022 at 12:44, Richard Sandiford
>> >  wrote:
>> >>
>> >> Prathamesh Kulkarni  writes:
>> >> > On Fri, 6 May 2022 at 16:00, Richard Sandiford
>> >> >  wrote:
>> >> >>
>> >> >> Prathamesh Kulkarni  writes:
>> >> >> > diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc 
>> >> >> > b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
>> >> >> > index c24c0548724..1ef4ea2087b 100644
>> >> >> > --- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
>> >> >> > +++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
>> >> >> > @@ -44,6 +44,14 @@
>> >> >> >  #include "aarch64-sve-builtins-shapes.h"
>> >> >> >  #include "aarch64-sve-builtins-base.h"
>> >> >> >  #include "aarch64-sve-builtins-functions.h"
>> >> >> > +#include "aarch64-builtins.h"
>> >> >> > +#include "gimple-ssa.h"
>> >> >> > +#include "tree-phinodes.h"
>> >> >> > +#include "tree-ssa-operands.h"
>> >> >> > +#include "ssa-iterators.h"
>> >> >> > +#include "stringpool.h"
>> >> >> > +#include "value-range.h"
>> >> >> > +#include "tree-ssanames.h"
>> >> >>
>> >> >> Minor, but: I think the preferred approach is to include "ssa.h"
>> >> >> rather than include some of these headers directly.
>> >> >>
>> >> >> >
>> >> >> >  using namespace aarch64_sve;
>> >> >> >
>> >> >> > @@ -1207,6 +1215,56 @@ public:
>> >> >> >  insn_code icode = code_for_aarch64_sve_ld1rq (e.vector_mode 
>> >> >> > (0));
>> >> >> >  return e.use_contiguous_load_insn (icode);
>> >> >> >}
>> >> >> > +
>> >> >> > +  gimple *
>> >> >> > +  fold (gimple_folder &f) const OVERRIDE
>> >> >> > +  {
>> >> >> > +tree arg0 = gimple_call_arg (f.call, 0);
>> >> >> > +tree arg1 = gimple_call_arg (f.call, 1);
>> >> >> > +
>> >> >> > +/* Transform:
>> >> >> > +   lhs = svld1rq ({-1, -1, ... }, arg1)
>> >> >> > +   into:
>> >> >> > +   tmp = mem_ref [(int * {ref-all}) arg1]
>> >> >> > +   lhs = vec_perm_expr.
>> >> >> > +   on little endian target.  */
>> >> >> > +
>> >> >> > +if (!BYTES_BIG_ENDIAN
>> >> >> > + && integer_all_onesp (arg0))
>> >> >> > +  {
>> >> >> > + tree lhs = gimple_call_lhs (f.call);
>> >> >> > + auto simd_type = aarch64_get_simd_info_for_type (Int32x4_t);
>> >> >>
>> >> >> Does this work for other element sizes?  I would have expected it
>> >> >> to be the (128-bit) Advanced SIMD vector associated with the same
>> >> >> element type as the SVE vector.
>> >> >>
>> >> >> The testcase should cover more than just int32x4_t -> svint32_t,
>> >> >> just to be sure.
>> >> > In the attached patch, it obtains corresponding advsimd type with:
>> >> >
>> >> > tree eltype = TREE_TYPE (lhs_type);
>> >> > unsigned nunits = 128 / TREE_INT_CST_LOW (TYPE_SIZE (eltype));
>> >> > tree vectype = build_vector_type (eltype, nunits);
>> >> >
>> >> > While this seems to work with different element sizes, I am not sure if 
>> >> > it's
>> >> > the correct approach ?
>> >>
>> >> Yeah, that looks correct.  Other SVE code uses aarch64_vq_mode
>> >> to get the vector mode associated with a .Q “element”, so an
>> >> alternative would be:
>> >>
>> >> machine_mode vq_mode = aarch64_vq_mode (TYPE_MODE (eltype)).require 
>> >> ();
>> >> tree vectype = build_vector_type_for_mode (eltype, vq_mode);
>> >>
>> >> which is more explicit about wanting an Advanced SIMD vector.
>> >>
>> >> >> > +
>> >> >> > + tree elt_ptr_type
>> >> >> > +   = build_pointer_type_for_mode (simd_type.eltype, VOIDmode, 
>> >> >> > true);
>> >> >> > + tree zero = build_zero_cst (elt_ptr_type);
>> >> >> > +
>> >> >> > + /* Use element type alignment.  */
>> >> >> > + tree access_type
>> >> >> > +   = build_aligned_type (simd_type.itype, TYPE_ALIGN 
>> >> >> > (simd_type.eltype));
>> >> >> > +
>> >> >> > + tree tmp = make_ssa_name_fn (cfun, access_type, 0);
>> >> >> > + gimple *mem_ref_stmt
>> >> >> > +   = gimple_build_assign (tmp, fold_build2 (MEM_REF, 
>> >> >> > access_type, arg1, zero));
>> >> >>
>> >> >> Long line.  Might be easier to format by assigning the fold_build2 
>> >> >> result
>> >> >> to a temporary variable.
>> >> >>
>> >> >> > + gsi_insert_before (f.gsi, mem_ref_stmt, GSI_SAME_STMT);
>> >> >> > +
>> >> >> > + tree mem_ref_lhs = gimple_get_lhs (mem_ref_stmt);
>> >> >> > + tree vectype = TREE_TYPE (mem_ref_lhs);
>> >> >> > + tree lhs_type = TREE_TYPE (lhs);
>> >> >>
>> >> >> Is this necessary?  The code above supplied the types and I wouldn't
>> >> >> have expected them to change during the build process.
>> >> >>
>> >> >> > +
>> >> >> > + int source_nelts = TYPE_VECTOR_SUBPARTS (vectype).to_constant 
>> >> >> > ();
>> >> >> > + vec_perm_builder sel (TYPE_VECTOR_SUBPARTS (lhs_type), 
>> >> >> > source_nelts, 1);
>> >> >> > + for (int i = 0; i < source_nelts; i++)
>> >> >> > +   sel.quick_push (i);
>> >> >> > +
>> >> >> > + vec_perm_indices indic
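
As a rough illustration of the fold discussed above, the sketch below computes the element count of the 128-bit source vector and the selector that replicates it across a wider result. The function names are hypothetical, and the selector assumes that a builder with source_nelts patterns of one element each simply repeats those elements, i.e. sel[i] = i % source_nelts.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Number of elements in a 128-bit quadword for a given element width.
std::size_t quad_nunits(std::size_t element_bits) {
  return 128 / element_bits;
}

// Selector that repeats the source_nelts input lanes across a result
// with result_nelts lanes: 0, 1, ..., source_nelts - 1, 0, 1, ...
std::vector<std::size_t> replicate_selector(std::size_t source_nelts,
                                            std::size_t result_nelts) {
  std::vector<std::size_t> sel;
  sel.reserve(result_nelts);
  for (std::size_t i = 0; i < result_nelts; ++i)
    sel.push_back(i % source_nelts);
  return sel;
}
```

For 32-bit elements this gives four source lanes; a 256-bit result would then use the selector 0, 1, 2, 3, 0, 1, 2, 3.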

[Ada] Suppress warnings on membership test of ranges

2022-06-01 Thread Pierre-Marie de Rodat via Gcc-patches
For a membership test "X in A .. B", the compiler used to warn if it
could prove that X is within one of the bounds.  For example, if we know
at compile time that X >= A, then the above could be replaced by "X <=
B".

This patch suppresses that warning, because there is really
nothing wrong with the membership test, and programmers sometimes
find it annoying.
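
The case analysis behind this can be sketched as a small C++ model (not the Ada front-end code). Cmp stands for the compile-time knowledge about X relative to one bound, and the names Cmp, Folded and fold_membership are hypothetical; treating EQ, GE and GT as "known to be at least the lower bound" is an assumption of the sketch.

```cpp
#include <cassert>

// What is known at compile time about X compared with a bound.
enum class Cmp { Unknown, LT, LE, EQ, GE, GT };

// How the test "X in Lo .. Hi" can be rewritten.
enum class Folded { AlwaysFalse, AlwaysTrue, UpperOnly, LowerOnly, FullTest };

bool known_ge(Cmp c) { return c == Cmp::EQ || c == Cmp::GE || c == Cmp::GT; }
bool known_le(Cmp c) { return c == Cmp::EQ || c == Cmp::LE || c == Cmp::LT; }

// lcheck: X compared with Lo; ucheck: X compared with Hi.
Folded fold_membership(Cmp lcheck, Cmp ucheck) {
  if (lcheck == Cmp::LT || ucheck == Cmp::GT)
    return Folded::AlwaysFalse;  // known to be out of range
  if (known_ge(lcheck) && known_le(ucheck))
    return Folded::AlwaysTrue;   // known to be in range
  if (known_ge(lcheck))
    return Folded::UpperOnly;    // rewrite as X <= Hi
  if (known_le(ucheck))
    return Folded::LowerOnly;    // rewrite as X >= Lo
  return Folded::FullTest;
}
```

The UpperOnly and LowerOnly outcomes are the two rewrites that no longer produce a warning after this patch; the AlwaysFalse and AlwaysTrue outcomes still do.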

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_ch4.adb (Expand_N_In): Do not warn in the above-mentioned
cases.
* fe.h (Assume_No_Invalid_Values): Remove from fe.h, because
this is not used in gigi.
* opt.ads (Assume_No_Invalid_Values): Improve the comment. We
don't need to "clearly prove"; we can just "prove". Remove the
comment about fe.h, which is no longer true.

diff --git a/gcc/ada/exp_ch4.adb b/gcc/ada/exp_ch4.adb
--- a/gcc/ada/exp_ch4.adb
+++ b/gcc/ada/exp_ch4.adb
@@ -6388,7 +6388,7 @@ package body Exp_Ch4 is
 Lcheck : Compare_Result;
 Ucheck : Compare_Result;
 
-Warn1 : constant Boolean :=
+Warn : constant Boolean :=
   Constant_Condition_Warnings
 and then Comes_From_Source (N)
 and then not In_Instance;
@@ -6397,16 +6397,6 @@ package body Exp_Ch4 is
 --  also skip these warnings in an instance since it may be the
 --  case that different instantiations have different ranges.
 
-Warn2 : constant Boolean :=
-  Warn1
-and then Nkind (Original_Node (Rop)) = N_Range
-and then Is_Integer_Type (Etype (Lo));
---  For the case where only one bound warning is elided, we also
---  insist on an explicit range and an integer type. The reason is
---  that the use of enumeration ranges including an end point is
---  common, as is the use of a subtype name, one of whose bounds is
---  the same as the type of the expression.
-
  begin
 --  If test is explicit x'First .. x'Last, replace by valid check
 
@@ -6491,7 +6481,7 @@ package body Exp_Ch4 is
 --  legality checks, because we are constant-folding beyond RM 4.9.
 
 if Lcheck = LT or else Ucheck = GT then
-   if Warn1 then
+   if Warn then
   Error_Msg_N ("?c?range test optimized away", N);
   Error_Msg_N ("\?c?value is known to be out of range", N);
end if;
@@ -6505,7 +6495,7 @@ package body Exp_Ch4 is
 --  since we know we are in range.
 
 elsif Lcheck in Compare_GE and then Ucheck in Compare_LE then
-   if Warn1 then
+   if Warn then
   Error_Msg_N ("?c?range test optimized away", N);
   Error_Msg_N ("\?c?value is known to be in range", N);
end if;
@@ -6520,11 +6510,6 @@ package body Exp_Ch4 is
 --  a comparison against the upper bound.
 
 elsif Lcheck in Compare_GE then
-   if Warn2 and then not In_Instance then
-  Error_Msg_N ("??lower bound test optimized away", Lo);
-  Error_Msg_N ("\??value is known to be in range", Lo);
-   end if;
-
Rewrite (N,
  Make_Op_Le (Loc,
Left_Opnd  => Lop,
@@ -6532,16 +6517,9 @@ package body Exp_Ch4 is
Analyze_And_Resolve (N, Restyp);
goto Leave;
 
---  If upper bound check succeeds and lower bound check is not
---  known to succeed or fail, then replace the range check with
---  a comparison against the lower bound.
+--  Inverse of previous case.
 
 elsif Ucheck in Compare_LE then
-   if Warn2 and then not In_Instance then
-  Error_Msg_N ("??upper bound test optimized away", Hi);
-  Error_Msg_N ("\??value is known to be in range", Hi);
-   end if;
-
Rewrite (N,
  Make_Op_Ge (Loc,
Left_Opnd  => Lop,
@@ -6555,7 +6533,7 @@ package body Exp_Ch4 is
 --  see if we can determine the outcome assuming everything is
 --  valid, and if so give an appropriate warning.
 
-if Warn1 and then not Assume_No_Invalid_Values then
+if Warn and then not Assume_No_Invalid_Values then
Lcheck := Compile_Time_Compare (Lop, Lo, Assume_Valid => True);
Ucheck := Compile_Time_Compare (Lop, Hi, Assume_Valid => True);
 
@@ -6570,18 +6548,6 @@ package body Exp_Ch4 is
elsif Lcheck in Compare_GE and then Ucheck in Compare_LE then
   Error_Msg_N
 ("?c?value can only be out of range if it is invalid", N);
-
-   --  Lower bound check succeeds if value is valid
-
-   elsif

[Ada] Incorrect code for anonymous access-to-function with convention C

2022-06-01 Thread Pierre-Marie de Rodat via Gcc-patches
This patch fixes a bug where the compiler generates incorrect code for a
call via an object with convention C, whose type is an anonymous
access-to-function type.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* einfo-utils.adb (Set_Convention): Call Set_Convention
recursively, so that Set_Can_Use_Internal_Rep is called (if
appropriate) on the anonymous access type of the object, and its
designated subprogram type.
* sem_ch3.adb (Access_Definition): Remove redundant call to
Set_Can_Use_Internal_Rep.

diff --git a/gcc/ada/einfo-utils.adb b/gcc/ada/einfo-utils.adb
--- a/gcc/ada/einfo-utils.adb
+++ b/gcc/ada/einfo-utils.adb
@@ -2659,7 +2659,7 @@ package body Einfo.Utils is
 | E_Anonymous_Access_Subprogram_Type
   and then not Has_Convention_Pragma (Typ)
 then
-   Set_Basic_Convention (Typ, Val);
+   Set_Convention (Typ, Val);
Set_Has_Convention_Pragma (Typ);
 
--  And for the access subprogram type, deal similarly with the
@@ -2669,10 +2669,9 @@ package body Einfo.Utils is
   declare
  Dtype : constant Entity_Id := Designated_Type (Typ);
   begin
- if Ekind (Dtype) = E_Subprogram_Type
-   and then not Has_Convention_Pragma (Dtype)
- then
-Set_Basic_Convention (Dtype, Val);
+ if Ekind (Dtype) = E_Subprogram_Type then
+pragma Assert (not Has_Convention_Pragma (Dtype));
+Set_Convention (Dtype, Val);
 Set_Has_Convention_Pragma (Dtype);
  end if;
   end;


diff --git a/gcc/ada/sem_ch3.adb b/gcc/ada/sem_ch3.adb
--- a/gcc/ada/sem_ch3.adb
+++ b/gcc/ada/sem_ch3.adb
@@ -876,9 +876,6 @@ package body Sem_Ch3 is
 Mutate_Ekind (Anon_Type, E_Anonymous_Access_Subprogram_Type);
  end if;
 
- Set_Can_Use_Internal_Rep
-   (Anon_Type, not Always_Compatible_Rep_On_Target);
-
  --  If the anonymous access is associated with a protected operation,
  --  create a reference to it after the enclosing protected definition
  --  because the itype will be used in the subsequent bodies.




[Ada] Add inline documentation for Is_{Parenthesis,Enum_Array}_Aggregate

2022-06-01 Thread Pierre-Marie de Rodat via Gcc-patches
Both flags were added, without corresponding inline documentation, when
square brackets for array/container aggregates were enabled with
-gnat2022. This change adds the missing documentation.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sinfo.ads: Add inline documentation for Is_Parenthesis_Aggregate
and Is_Enum_Array_Aggregate.

diff --git a/gcc/ada/sinfo.ads b/gcc/ada/sinfo.ads
--- a/gcc/ada/sinfo.ads
+++ b/gcc/ada/sinfo.ads
@@ -1680,6 +1680,10 @@ package Sinfo is
--nodes which emulate the barrier function of a protected entry body.
--The flag is used when checking for incorrect use of Current_Task.
 
+   --  Is_Enum_Array_Aggregate
+   --A flag set on an aggregate created internally while building the
+   --images tables for enumerations.
+
--  Is_Expanded_Build_In_Place_Call
--This flag is set in an N_Function_Call node to indicate that the extra
--actuals to support a build-in-place style of call have been added to
@@ -1794,6 +1798,9 @@ package Sinfo is
--overloading determination. The setting of this flag is not relevant
--once overloading analysis is complete.
 
+   --  Is_Parenthesis_Aggregate
+   --A flag set on an aggregate that uses parentheses as delimiters
+
--  Is_Power_Of_2_For_Shift
--A flag present only in N_Op_Expon nodes. It is set when the
--exponentiation is of the form 2 ** N, where the type of N is an
@@ -4024,7 +4031,9 @@ package Sinfo is
   --  Compile_Time_Known_Aggregate
   --  Expansion_Delayed
   --  Has_Self_Reference
+  --  Is_Enum_Array_Aggregate
   --  Is_Homogeneous_Aggregate
+  --  Is_Parenthesis_Aggregate
   --  plus fields for expression
 
   --  Note: this structure is used for both record and array aggregates




[Ada] Use Actions field of freeze nodes for subprograms (continued)

2022-06-01 Thread Pierre-Marie de Rodat via Gcc-patches
This case was missed in the previous change.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_ch6.adb (Freeze_Subprogram.Register_Predefined_DT_Entry): Put
the actions into the Actions field of the freeze node instead of
inserting them after it.

diff --git a/gcc/ada/exp_ch6.adb b/gcc/ada/exp_ch6.adb
--- a/gcc/ada/exp_ch6.adb
+++ b/gcc/ada/exp_ch6.adb
@@ -7839,6 +7839,7 @@ package body Exp_Ch6 is
 
   procedure Register_Predefined_DT_Entry (Prim : Entity_Id) is
  Iface_DT_Ptr : Elmt_Id;
+ L: List_Id;
  Tagged_Typ   : Entity_Id;
  Thunk_Id : Entity_Id;
  Thunk_Code   : Node_Id;
@@ -7871,7 +7872,7 @@ package body Exp_Ch6 is
   Iface => Related_Type (Node (Iface_DT_Ptr)));
 
 if Present (Thunk_Code) then
-   Insert_Actions_After (N, New_List (
+   L := New_List (
  Thunk_Code,
 
  Build_Set_Predefined_Prim_Op_Address (Loc,
@@ -7894,7 +7895,14 @@ package body Exp_Ch6 is
  Unchecked_Convert_To (RTE (RE_Prim_Ptr),
Make_Attribute_Reference (Loc,
  Prefix => New_Occurrence_Of (Prim, Loc),
- Attribute_Name => Name_Unrestricted_Access);
+ Attribute_Name => Name_Unrestricted_Access;
+
+   if No (Actions (N)) then
+  Set_Actions (N, L);
+
+   else
+  Append_List (L, Actions (N));
+   end if;
 end if;
 
 --  Skip the tag of the predefined primitives dispatch table




[Ada] Issue better error message for out-of-order keywords in record def

2022-06-01 Thread Pierre-Marie de Rodat via Gcc-patches
Various cases of out-of-order keywords in the definition of a record
were already detected. This adds a similar detection after NULL and
RECORD keywords.
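
The detection amounts to a small loop over the token stream. Below is a hedged C++ model of that loop (the parser itself is Ada); the function name, the string-based tokens and the returned message list are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Starting at tokens[pos], skip every ABSTRACT, TAGGED or LIMITED token
// that appears after KEYWORD and collect one error message per token.
// POS is advanced past the misplaced keywords.
std::vector<std::string> catch_out_of_order_keywords(
    const std::vector<std::string> &tokens, std::size_t &pos,
    const std::string &keyword) {
  std::vector<std::string> messages;
  while (pos < tokens.size()) {
    const std::string &tok = tokens[pos];
    if (tok != "ABSTRACT" && tok != "TAGGED" && tok != "LIMITED")
      break;
    messages.push_back(tok + " must come before " + keyword);
    ++pos;  // scan past the misplaced keyword
  }
  return messages;
}
```

Because the loop consumes the misplaced keywords, parsing can resume at the next token instead of cascading into further errors.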

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* par-ch3.adb (P_Known_Discriminant_Part_Opt): Reword error
message to benefit from existing codefix.
(P_Record_Definition): Detect out-of-order keywords in record
definition and issue appropriate messages. Other cases are
already caught at appropriate places.

diff --git a/gcc/ada/par-ch3.adb b/gcc/ada/par-ch3.adb
--- a/gcc/ada/par-ch3.adb
+++ b/gcc/ada/par-ch3.adb
@@ -3180,7 +3180,8 @@ package body Ch3 is
   Scan;
 
   if Token = Tok_Access then
- Error_Msg_SC ("CONSTANT must appear after ACCESS");
+ Error_Msg_SC -- CODEFIX
+   ("ACCESS must come before CONSTANT");
  Set_Discriminant_Type
(Specification_Node,
 P_Access_Definition (Not_Null_Present));
@@ -3462,8 +3463,42 @@ package body Ch3 is
--  Error recovery: can raise Error_Resync
 
function P_Record_Definition return Node_Id is
+
+  procedure Catch_Out_Of_Order_Keywords (Keyword : String);
+  --  Catch out-of-order keywords in a record definition
+
+  ---------------------------------
+  -- Catch_Out_Of_Order_Keywords --
+  ---------------------------------
+
+  procedure Catch_Out_Of_Order_Keywords (Keyword : String) is
+  begin
+ loop
+if Token = Tok_Abstract then
+   Error_Msg_SC -- CODEFIX
+ ("ABSTRACT must come before " & Keyword);
+   Scan; -- past ABSTRACT
+
+elsif Token = Tok_Tagged then
+   Error_Msg_SC -- CODEFIX
+ ("TAGGED must come before " & Keyword);
+   Scan; -- past TAGGED
+
+elsif Token = Tok_Limited then
+   Error_Msg_SC -- CODEFIX
+ ("LIMITED must come before " & Keyword);
+   Scan; -- past LIMITED
+
+else
+   exit;
+end if;
+ end loop;
+  end Catch_Out_Of_Order_Keywords;
+
   Rec_Node : Node_Id;
 
+   --  Start of processing for P_Record_Definition
+
begin
   Inside_Record_Definition := True;
   Rec_Node := New_Node (N_Record_Definition, Token_Ptr);
@@ -3472,8 +3507,11 @@ package body Ch3 is
 
   if Token = Tok_Null then
  Scan; -- past NULL
+
+ Catch_Out_Of_Order_Keywords ("NULL");
  T_Record;
  Set_Null_Present (Rec_Node, True);
+ Catch_Out_Of_Order_Keywords ("RECORD");
 
   --  Catch incomplete declaration to prevent cascaded errors, see
   --  ACATS B393002 for an example.
@@ -3501,6 +3539,7 @@ package body Ch3 is
  Scopes (Scope.Last).Junk := (Token /= Tok_Record);
 
  T_Record;
+ Catch_Out_Of_Order_Keywords ("RECORD");
 
  Set_Component_List (Rec_Node, P_Component_List);
 




[Ada] Issue a warning on entity hidden in use_clause with -gnatwh

2022-06-01 Thread Pierre-Marie de Rodat via Gcc-patches
Augment the warnings issued with switch -gnatwh, so that a warning is
also issued when an entity from the package of a use_clause ends up
hidden due to an existing visible homonym.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_ch8.adb (Use_One_Package): Possibly warn.
* sem_util.adb (Enter_Name): Factor out warning on hidden entity.
(Warn_On_Hiding_Entity): Extract warning logic from Enter_Name and
generalize it to be applied also on use_clause.
* sem_util.ads (Warn_On_Hiding_Entity): Add new procedure.

diff --git a/gcc/ada/sem_ch8.adb b/gcc/ada/sem_ch8.adb
--- a/gcc/ada/sem_ch8.adb
+++ b/gcc/ada/sem_ch8.adb
@@ -10326,6 +10326,11 @@ package body Sem_Ch8 is
 
   --  Potentially use-visible entity remains hidden
 
+  if Warn_On_Hiding then
+ Warn_On_Hiding_Entity (N, Hidden => Id, Visible => Prev,
+On_Use_Clause => True);
+  end if;
+
   goto Next_Usable_Entity;
 
--  A use clause within an instance hides outer global entities,


diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -8825,47 +8825,9 @@ package body Sem_Util is
 
   --  Warn if new entity hides an old one
 
-  if Warn_On_Hiding and then Present (C)
-
---  Don't warn for record components since they always have a well
---  defined scope which does not confuse other uses. Note that in
---  some cases, Ekind has not been set yet.
-
-and then Ekind (C) /= E_Component
-and then Ekind (C) /= E_Discriminant
-and then Nkind (Parent (C)) /= N_Component_Declaration
-and then Ekind (Def_Id) /= E_Component
-and then Ekind (Def_Id) /= E_Discriminant
-and then Nkind (Parent (Def_Id)) /= N_Component_Declaration
-
---  Don't warn for one character variables. It is too common to use
---  such variables as locals and will just cause too many false hits.
-
-and then Length_Of_Name (Chars (C)) /= 1
-
---  Don't warn for non-source entities
-
-and then Comes_From_Source (C)
-and then Comes_From_Source (Def_Id)
-
---  Don't warn within a generic instantiation
-
-and then not In_Instance
-
---  Don't warn unless entity in question is in extended main source
-
-and then In_Extended_Main_Source_Unit (Def_Id)
-
---  Finally, the hidden entity must be either immediately visible or
---  use visible (i.e. from a used package).
-
-and then
-  (Is_Immediately_Visible (C)
- or else
-   Is_Potentially_Use_Visible (C))
-  then
- Error_Msg_Sloc := Sloc (C);
- Error_Msg_N ("declaration hides &#?h?", Def_Id);
+  if Warn_On_Hiding and then Present (C) then
+ Warn_On_Hiding_Entity (Def_Id, Hidden => C, Visible => Def_Id,
+On_Use_Clause => False);
   end if;
end Enter_Name;
 
@@ -30344,6 +30306,69 @@ package body Sem_Util is
   return List_1;
end Visible_Ancestors;
 
+   ---------------------------
+   -- Warn_On_Hiding_Entity --
+   ---------------------------
+
+   procedure Warn_On_Hiding_Entity
+ (N   : Node_Id;
+  Hidden, Visible : Entity_Id;
+  On_Use_Clause   : Boolean)
+   is
+   begin
+  --  Don't warn for record components since they always have a well
+  --  defined scope which does not confuse other uses. Note that in
+  --  some cases, Ekind has not been set yet.
+
+  if Ekind (Hidden) /= E_Component
+and then Ekind (Hidden) /= E_Discriminant
+and then Nkind (Parent (Hidden)) /= N_Component_Declaration
+and then Ekind (Visible) /= E_Component
+and then Ekind (Visible) /= E_Discriminant
+and then Nkind (Parent (Visible)) /= N_Component_Declaration
+
+--  Don't warn for one character variables. It is too common to use
+--  such variables as locals and will just cause too many false hits.
+
+and then Length_Of_Name (Chars (Hidden)) /= 1
+
+--  Don't warn for non-source entities
+
+and then Comes_From_Source (Hidden)
+and then Comes_From_Source (Visible)
+
+--  Don't warn within a generic instantiation
+
+and then not In_Instance
+
+--  Don't warn unless entity in question is in extended main source
+
+and then In_Extended_Main_Source_Unit (Visible)
+
+--  Finally, in the case of a declaration, the hidden entity must
+--  be either immediately visible or use visible (i.e. from a used
+--  package). In the case of a use clause, the visible entity must
+--  be immediately visible.
+
+and then
+  (if On_Use_Clause then
+ Is_Immediately_Visible (Visible)
+   else
+ (Is_Immediately_Visible (Hidde

[Ada] arm-qnx-7.1: unwind goes wrong after regs restore

2022-06-01 Thread Pierre-Marie de Rodat via Gcc-patches
The usual increment of the pc to pc+2 for ARM is needed.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* init.c (QNX): __gnat_adjust_context_for_raise: New
implementation for arm-qnx.

diff --git a/gcc/ada/init.c b/gcc/ada/init.c
--- a/gcc/ada/init.c
+++ b/gcc/ada/init.c
@@ -2560,6 +2560,29 @@ __gnat_install_handler (void)
 #include 
 #include "sigtramp.h"
 
+#if defined (__ARMEL__) && !defined (__aarch64__)
+
+/* ARM-QNX case with arm unwinding exceptions */
+#define HAVE_GNAT_ADJUST_CONTEXT_FOR_RAISE
+
+#include 
+#include 
+#include 
+
+void
+__gnat_adjust_context_for_raise (int signo ATTRIBUTE_UNUSED,
+ void *sc ATTRIBUTE_UNUSED)
+{
+  /* In case of ARM exceptions, the register context has the PC pointing
+ to the instruction that raised the signal.  However the unwinder expects
+ the instruction to be in the range [PC+2,PC+3].  */
+  uintptr_t *pc_addr;
+  mcontext_t *mcontext = &((ucontext_t *) sc)->uc_mcontext;
+  pc_addr = (uintptr_t *)&mcontext->cpu.gpr [ARM_REG_PC];
+  *pc_addr += 2;
+}
+#endif /* ARMEL */
+
 void
 __gnat_map_signal (int sig,
 		   siginfo_t *si ATTRIBUTE_UNUSED,
@@ -2597,6 +2620,13 @@ __gnat_map_signal (int sig,
 static void
 __gnat_error_handler (int sig, siginfo_t *si, void *ucontext)
 {
+#ifdef HAVE_GNAT_ADJUST_CONTEXT_FOR_RAISE
+  /* We sometimes need to adjust the PC in case of signals so that it
+ doesn't reference the exception that actually raised the signal but the
+ instruction before it.  */
+  __gnat_adjust_context_for_raise (sig, ucontext);
+#endif
+
   __gnat_sigtramp (sig, (void *) si, (void *) ucontext,
 		   (__sigtramphandler_t *)&__gnat_map_signal);
 }




[Ada] Add reference counting in functional containers

2022-06-01 Thread Pierre-Marie de Rodat via Gcc-patches
This patch adds reference counting to the dynamically allocated pointers
to arrays and elements used by the functional containers. This is done
by making both the arrays and the elements controlled.
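
As a rough analogy in C, a reference-counted handle could look like the
sketch below. All names (shared_array, shared_array_retain, and so on) are
hypothetical. The retain step mirrors what Adjust does in the patch; the
release step is an assumption about how the count is consumed, since the
part of the patch shown here only includes the Adjust side.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical shared array carrying its own reference count.  */
struct shared_array
{
  int reference_count;
  size_t length;
  int *elements;
};

/* Allocate an array owned by a single handle; NULL on failure.  */
static struct shared_array *
shared_array_new (size_t length)
{
  struct shared_array *a = malloc (sizeof *a);
  if (a == NULL)
    return NULL;
  a->elements = calloc (length ? length : 1, sizeof *a->elements);
  if (a->elements == NULL)
    {
      free (a);
      return NULL;
    }
  a->reference_count = 1;
  a->length = length;
  return a;
}

/* Counterpart of Adjust: copying the handle adds one owner.  */
static struct shared_array *
shared_array_retain (struct shared_array *a)
{
  if (a != NULL)
    a->reference_count++;
  return a;
}

/* Assumed counterpart of finalization: drop one owner and free the
   storage when the last owner goes away.  Returns 1 if freed.  */
static int
shared_array_release (struct shared_array *a)
{
  if (a == NULL)
    return 0;
  assert (a->reference_count > 0);
  if (--a->reference_count > 0)
    return 0;
  free (a->elements);
  free (a);
  return 1;
}
```

In Ada the compiler inserts the calls to Adjust automatically on assignment
of a controlled object, whereas in this sketch every copy has to call
shared_array_retain by hand.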

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/a-cofuba.ads, libgnat/a-cofuba.adb: Add reference
counting.

diff --git a/gcc/ada/libgnat/a-cofuba.adb b/gcc/ada/libgnat/a-cofuba.adb
--- a/gcc/ada/libgnat/a-cofuba.adb
+++ b/gcc/ada/libgnat/a-cofuba.adb
@@ -52,6 +52,24 @@ package body Ada.Containers.Functional_Base with SPARK_Mode => Off is
--  Resize the underlying array if needed so that it can contain one more
--  element.
 
+   function Elements (C : Container) return Element_Array_Access is
+ (C.Controlled_Base.Base.Elements)
+   with
+ Global => null,
+ Pre=>
+   C.Controlled_Base.Base /= null
+   and then C.Controlled_Base.Base.Elements /= null;
+
+   function Get
+ (C_E : Element_Array_Access;
+  I   : Count_Type)
+  return Element_Access
+   is
+ (C_E (I).Ref.E_Access)
+   with
+ Global => null,
+ Pre=> C_E /= null and then C_E (I).Ref /= null;
+
---------
-- "=" --
---------
@@ -61,9 +79,8 @@ package body Ada.Containers.Functional_Base with SPARK_Mode => Off is
   if C1.Length /= C2.Length then
  return False;
   end if;
-
   for I in 1 .. C1.Length loop
- if C1.Base.Elements (I).all /= C2.Base.Elements (I).all then
+ if Get (Elements (C1), I).all /= Get (Elements (C2), I).all then
 return False;
  end if;
   end loop;
@@ -78,7 +95,7 @@ package body Ada.Containers.Functional_Base with SPARK_Mode => Off is
function "<=" (C1 : Container; C2 : Container) return Boolean is
begin
   for I in 1 .. C1.Length loop
- if Find (C2, C1.Base.Elements (I)) = 0 then
+ if Find (C2, Get (Elements (C1), I)) = 0 then
 return False;
  end if;
   end loop;
@@ -95,50 +112,138 @@ package body Ada.Containers.Functional_Base with SPARK_Mode => Off is
   I : Index_Type;
   E : Element_Type) return Container
is
+  C_B : Array_Base_Access renames C.Controlled_Base.Base;
begin
-  if To_Count (I) = C.Length + 1 and then C.Length = C.Base.Max_Length then
- Resize (C.Base);
- C.Base.Max_Length := C.Base.Max_Length + 1;
- C.Base.Elements (C.Base.Max_Length) := new Element_Type'(E);
+  if To_Count (I) = C.Length + 1 and then C.Length = C_B.Max_Length then
+ Resize (C_B);
+ C_B.Max_Length := C_B.Max_Length + 1;
+ C_B.Elements (C_B.Max_Length) := Element_Init (E);
 
- return Container'(Length => C.Base.Max_Length, Base => C.Base);
+ return Container'(Length  => C_B.Max_Length,
+   Controlled_Base => C.Controlled_Base);
   else
  declare
-A : constant Array_Base_Access := Content_Init (C.Length);
+A : constant Array_Base_Controlled_Access :=
+  Content_Init (C.Length);
 P : Count_Type := 0;
  begin
-A.Max_Length := C.Length + 1;
+A.Base.Max_Length := C.Length + 1;
 for J in 1 .. C.Length + 1 loop
if J /= To_Count (I) then
   P := P + 1;
-  A.Elements (J) := C.Base.Elements (P);
+  A.Base.Elements (J) := C_B.Elements (P);
else
-  A.Elements (J) := new Element_Type'(E);
+  A.Base.Elements (J) := Element_Init (E);
end if;
 end loop;
 
-return Container'(Length => A.Max_Length,
-  Base   => A);
+return Container'(Length   => A.Base.Max_Length,
+  Controlled_Base  => A);
  end;
   end if;
end Add;
 
+   ------------
+   -- Adjust --
+   ------------
+
+   procedure Adjust (Controlled_Base : in out Array_Base_Controlled_Access) is
+  C_B : Array_Base_Access renames Controlled_Base.Base;
+   begin
+  if C_B /= null then
+ C_B.Reference_Count := C_B.Reference_Count + 1;
+  end if;
+   end Adjust;
+
+   procedure Adjust (Ctrl_E : in out Controlled_Element_Access) is
+   begin
+  if Ctrl_E.Ref /= null then
+ Ctrl_E.Ref.Reference_Count := Ctrl_E.Ref.Reference_Count + 1;
+  end if;
+   end Adjust;
+
------------------
-- Content_Init --
------------------
 
-   function Content_Init (L : Count_Type := 0) return Array_Base_Access
+   function Content_Init
+ (L : Count_Type := 0) return Array_Base_Controlled_Access
is
   Max_Init : constant Count_Type := 100;
   Size : constant Count_Type :=
 (if L < Count_Type'Last - Max_Init then L + Max_Init
  else Count_Type'Last);
+
+  --  The Access in the array will be initialized to null
+
   Elements : constant Element_Array_Access :=
 n

[Ada] Fix search for "for ... of" loop subprograms

2022-06-01 Thread Pierre-Marie de Rodat via Gcc-patches
This patch makes the search for Get_Element_Access, Step (Next/Prev),
Reference_Control_Type, and Pseudo_Reference (for optimized "for ... of"
loops) more robust.  In particular, we have a new Next procedure in Ada
2022, and we need to pick the right one.

We have not yet added the new Next and other subprograms.
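
The idea of the more robust search (match on name, entity kind and a single
formal called Position, and insist that at most one entity matches) can be
sketched as follows. The entity structure and find_one_position_op are
hypothetical simplifications, not the compiler's actual data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum entity_kind { KIND_FUNCTION, KIND_PROCEDURE, KIND_TYPE };

/* Hypothetical, flattened view of a declared entity.  */
struct entity
{
  const char *name;
  enum entity_kind kind;
  size_t formal_count;
  const char *first_formal;   /* NULL when there are no formals */
};

/* Return the only entity with the given name and kind that has exactly
   one formal called "Position", or NULL when none matches.  A second
   match trips the assertion, mirroring the pragma Assert in the patch.  */
static const struct entity *
find_one_position_op (const struct entity *ents, size_t n,
                      const char *name, enum entity_kind kind)
{
  const struct entity *found = NULL;

  for (size_t i = 0; i < n; i++)
    {
      const struct entity *e = &ents[i];

      if (strcmp (e->name, name) == 0
          && e->kind == kind
          && e->formal_count == 1
          && e->first_formal != NULL
          && strcmp (e->first_formal, "Position") == 0)
        {
          assert (found == NULL);
          found = e;
        }
    }

  return found;
}
```

With overloads that share a name but differ in kind or in number of
formals, only the one-formal procedure is picked, which is the property
the patch relies on once a second Next is declared.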

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_ch5.adb (Expand_Iterator_Loop_Over_Container): For each
subprogram found, assert that the variable is Empty, so we can
detect bugs where we find two or more things with the same name.
Without this patch, that bug would happen when we add the new
Next procedure.  For Step, make sure we pick the right one, by
checking name and number of parameters.  For Get_Element_Access,
check that we're picking a function.  That's not really
necessary, because there is no procedure with that name, but it
seems cleaner this way.
* rtsfind.ads: Minor comment improvement. It seems kind of odd
to say "under no circumstances", and then immediately contradict
that with "The one exception is...".

diff --git a/gcc/ada/exp_ch5.adb b/gcc/ada/exp_ch5.adb
--- a/gcc/ada/exp_ch5.adb
+++ b/gcc/ada/exp_ch5.adb
@@ -5203,22 +5203,36 @@ package body Exp_Ch5 is
 
 Ent := First_Entity (Pack);
 while Present (Ent) loop
+   --  Get_Element_Access function with one parameter called
+   --  Position.
+
if Chars (Ent) = Name_Get_Element_Access
+ and then Ekind (Ent) = E_Function
  and then Present (First_Formal (Ent))
  and then Chars (First_Formal (Ent)) = Name_Position
  and then No (Next_Formal (First_Formal (Ent)))
then
+  pragma Assert (No (Fast_Element_Access_Op));
   Fast_Element_Access_Op := Ent;
 
+   --  Next or Prev procedure with one parameter called
+   --  Position.
+
elsif Chars (Ent) = Name_Step
  and then Ekind (Ent) = E_Procedure
+ and then Present (First_Formal (Ent))
+ and then Chars (First_Formal (Ent)) = Name_Position
+ and then No (Next_Formal (First_Formal (Ent)))
then
+  pragma Assert (No (Fast_Step_Op));
   Fast_Step_Op := Ent;
 
elsif Chars (Ent) = Name_Reference_Control_Type then
+  pragma Assert (No (Reference_Control_Type));
   Reference_Control_Type := Ent;
 
elsif Chars (Ent) = Name_Pseudo_Reference then
+  pragma Assert (No (Pseudo_Reference));
   Pseudo_Reference := Ent;
end if;
 


diff --git a/gcc/ada/rtsfind.ads b/gcc/ada/rtsfind.ads
--- a/gcc/ada/rtsfind.ads
+++ b/gcc/ada/rtsfind.ads
@@ -540,13 +540,11 @@ package Rtsfind is
--  value is required syntactically, but no real entry is required or
--  needed. Use of this value will cause a fatal error in an RTE call.
 
-   --  Note that under no circumstances can any of these entities be defined
-   --  more than once in a given package, i.e. no overloading is allowed for
-   --  any entity that is found using rtsfind. A fatal error is given if this
-   --  rule is violated. The one exception is for Save_Occurrence, where the
-   --  RM mandates the overloading. In this case, the compiler only uses the
-   --  procedure, not the function, and the procedure must come first so that
-   --  the compiler finds it and not the function.
+   --  It is normally not allowed to have more than one of these entities with
+   --  the same name in a given package. The one exception is Save_Occurrence,
+   --  where the RM mandates the overloading. In this case, the compiler uses
+   --  the procedure, not the function, and the procedure must come first so
+   --  that the compiler finds it and not the function.
 
type RE_Id is (
 




[Ada] Minor tweaks to dispatching support code

2022-06-01 Thread Pierre-Marie de Rodat via Gcc-patches
No functional changes.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_disp.ads (Expand_Interface_Thunk): Change type of Prim.
* exp_disp.adb (Expand_Interface_Thunk): Declare Is_Predef_Op
earlier, do not initialize Iface_Formal, use No idiom and tweak
comments.
(Register_Primitive): Declare L earlier and tweak comments.
* sem_disp.adb (Check_Dispatching_Operation): Move tests out of
loop.

diff --git a/gcc/ada/exp_disp.adb b/gcc/ada/exp_disp.adb
--- a/gcc/ada/exp_disp.adb
+++ b/gcc/ada/exp_disp.adb
@@ -1731,26 +1731,26 @@ package body Exp_Disp is

 
procedure Expand_Interface_Thunk
- (Prim   : Node_Id;
+ (Prim   : Entity_Id;
   Thunk_Id   : out Entity_Id;
   Thunk_Code : out Node_Id;
   Iface  : Entity_Id)
is
-  Loc : constant Source_Ptr := Sloc (Prim);
-  Actuals : constant List_Id:= New_List;
-  Decl: constant List_Id:= New_List;
-  Formals : constant List_Id:= New_List;
-  Target  : constant Entity_Id  := Ultimate_Alias (Prim);
+  Actuals  : constant List_Id:= New_List;
+  Decl : constant List_Id:= New_List;
+  Formals  : constant List_Id:= New_List;
+  Loc  : constant Source_Ptr := Sloc (Prim);
+  Target   : constant Entity_Id  := Ultimate_Alias (Prim);
+  Is_Predef_Op : constant Boolean:=
+   Is_Predefined_Dispatching_Operation (Prim)
+ or else Is_Predefined_Dispatching_Operation (Target);
 
   Decl_1: Node_Id;
   Decl_2: Node_Id;
   Expr  : Node_Id;
   Formal: Node_Id;
   Ftyp  : Entity_Id;
-  Iface_Formal  : Node_Id := Empty;  -- initialize to prevent warning
-  Is_Predef_Op  : constant Boolean :=
-Is_Predefined_Dispatching_Operation (Prim)
-  or else Is_Predefined_Dispatching_Operation (Target);
+  Iface_Formal  : Node_Id;
   New_Arg   : Node_Id;
   Offset_To_Top : Node_Id;
   Target_Formal : Entity_Id;
@@ -1764,16 +1764,17 @@ package body Exp_Disp is
   if Is_Eliminated (Target) then
  return;
 
-  --  In case of primitives that are functions without formals and a
-  --  controlling result there is no need to build the thunk.
+  --  No thunk needed if the primitive has no formals. In this case, this
+  --  must be a function with a controlling result.
 
-  elsif not Present (First_Formal (Target)) then
+  elsif No (First_Formal (Target)) then
  pragma Assert (Ekind (Target) = E_Function
and then Has_Controlling_Result (Target));
+
  return;
   end if;
 
-  --  Duplicate the formals of the Target primitive. In the thunk, the type
+  --  Duplicate the formals of the target primitive. In the thunk, the type
   --  of the controlling formal is the covered interface type (instead of
   --  the target tagged type). Done to avoid problems with discriminated
   --  tagged types because, if the controlling type has discriminants with
@@ -1785,14 +1786,14 @@ package body Exp_Disp is
   --  because they don't have available the Interface_Alias attribute (see
   --  Sem_Ch3.Add_Internal_Interface_Entities).
 
-  if not Is_Predef_Op then
+  if Is_Predef_Op then
+ Iface_Formal := Empty;
+  else
  Iface_Formal := First_Formal (Interface_Alias (Prim));
   end if;
 
   Formal := First_Formal (Target);
   while Present (Formal) loop
- Ftyp := Etype (Formal);
-
  --  Use the interface type as the type of the controlling formal (see
  --  comment above).
 
@@ -1814,10 +1815,10 @@ package body Exp_Disp is
 
 --  Sanity check performed to ensure the proper controlling type
 --  when the thunk has exactly one controlling parameter and it
---  comes first. In such case the GCC backend reuses the C++
+--  comes first. In such a case, the GCC back end reuses the C++
 --  thunks machinery which perform a computation equivalent to
 --  the code generated by the expander; for other cases the GCC
---  backend translates the expanded code unmodified. However, as
+--  back end translates the expanded code unmodified. However, as
 --  a generalization, the check is performed for all controlling
 --  types.
 
@@ -7115,12 +7116,13 @@ package body Exp_Disp is
  (Loc : Source_Ptr;
   Prim: Entity_Id) return List_Id
is
+  L : constant List_Id := New_List;
+
   DT_Ptr: Entity_Id;
   Iface_Prim: Entity_Id;
   Iface_Typ : Entity_Id;
   Iface_DT_Ptr  : Entity_Id;
   Iface_DT_Elmt : Elmt_Id;
-  L : constant List_Id := New_List;
   Pos   : Uint;
  

[Ada] Missing discriminant checks when accessing variant field

2022-06-01 Thread Pierre-Marie de Rodat via Gcc-patches
In some cases, the compiler would incorrectly fail to generate
discriminant checks when accessing fields declared in a variant part.
Correct some such cases; detect the remaining cases and flag them as
unsupported. The formerly-problematic cases that are now handled
correctly involve component references occurring in a predicate
expression (e.g., the expression of a Dynamic_Predicate aspect
specification) for a type declaration (not for a subtype declaration).
The cases which are now flagged as unsupported involve expression
functions declared before the discriminated type in question has been
frozen.
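
As a loose analogy in C, a discriminant check corresponds to testing the
tag of a tagged union before reading a member that only exists for some tag
values. The shape type and both functions below are hypothetical; in Ada a
failed check raises Constraint_Error, while this sketch just returns false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum shape_kind { SHAPE_CIRCLE, SHAPE_RECT };

struct shape
{
  enum shape_kind kind;   /* plays the role of the discriminant */
  union
  {
    struct { double radius; } circle;
    struct { double width, height; } rect;
  } u;
};

/* Plays the role of a discriminant-checking function for the radius
   component: true when the component exists for the current tag.  */
static bool
radius_is_present (const struct shape *s)
{
  assert (s != NULL);
  return s->kind == SHAPE_CIRCLE;
}

/* Checked access: store the radius in *out and return true only when
   the component is present; otherwise leave *out untouched.  */
static bool
get_radius (const struct shape *s, double *out)
{
  assert (out != NULL);
  if (!radius_is_present (s))
    return false;
  *out = s->u.circle.radius;
  return true;
}
```

The bug described above corresponds to reading s->u.circle.radius directly,
without going through the checking function first.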

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_ch3.ads: Replace visible Build_Discr_Checking_Funcs (which
did not need to be visible - it was not referenced outside this
package) with Build_Or_Copy_Discr_Checking_Funcs.
* exp_ch3.adb: Refactor existing code into 3 procedures -
Build_Discr_Checking_Funcs, Copy_Discr_Checking_Funcs, and
Build_Or_Copy_Discr_Checking_Funcs. This refactoring is intended
to be semantics-preserving.
* exp_ch4.adb (Expand_N_Selected_Component): Detect case where a
call should be generated to the Discriminant_Checking_Func for
the component in question, but that subprogram does not yet
exist.
* sem_ch13.adb (Freeze_Entity_Checks): Immediately before
calling Build_Predicate_Function, add a call to
Exp_Ch3.Build_Or_Copy_Discr_Checking_Funcs in order to ensure
that Discriminant_Checking_Func attributes are already set when
Build_Predicate_Function is called.
* sem_ch6.adb (Analyze_Expression_Function): If the expression
of a static expression function has been transformed into an
N_Raise_xxx_Error node, then we need to copy the original
expression in order to check the requirement that the expression
must be a potentially static expression. We also want to set
aside a copy of the untransformed expression for later use in
checking calls to the expression function via
Inline_Static_Function_Call.  So introduce a new function,
Make_Expr_Copy, for use in these situations.
* sem_res.adb (Preanalyze_And_Resolve): When analyzing certain
expressions (e.g., a default parameter expression in a
subprogram declaration) we want to suppress checks. However, we
do not want to suppress checks for the expression of an
expression function.

diff --git a/gcc/ada/exp_ch3.adb b/gcc/ada/exp_ch3.adb
--- a/gcc/ada/exp_ch3.adb
+++ b/gcc/ada/exp_ch3.adb
@@ -106,6 +106,13 @@ package body Exp_Ch3 is
--  types with discriminants. Otherwise new identifiers are created,
--  with the source names of the discriminants.
 
+   procedure Build_Discr_Checking_Funcs (N : Node_Id);
+   --  For each variant component, builds a function which checks whether
+   --  the component name is consistent with the current discriminants
+   --  and sets the component's Dcheck_Function attribute to refer to it.
+   --  N is the full type declaration node; the discriminant checking
+   --  functions are inserted after this node.
+
function Build_Equivalent_Array_Aggregate (T : Entity_Id) return Node_Id;
--  This function builds a static aggregate that can serve as the initial
--  value for an array type whose bounds are static, and whose component
@@ -152,6 +159,12 @@ package body Exp_Ch3 is
--  needed after an initialization. Typ is the component type, and Proc_Id
--  the initialization procedure for the enclosing composite type.
 
+   procedure Copy_Discr_Checking_Funcs (N : Node_Id);
+   --  For a derived untagged type, copy the attributes that were set
+   --  for the components of the parent type onto the components of the
+   --  derived type. No new subprograms are constructed.
+   --  N is the full type declaration node, as for Build_Discr_Checking_Funcs.
+
procedure Expand_Freeze_Array_Type (N : Node_Id);
--  Freeze an array type. Deals with building the initialization procedure,
--  creating the packed array type for a packed array and also with the
@@ -1219,6 +1232,25 @@ package body Exp_Ch3 is
   end if;
end Build_Discr_Checking_Funcs;
 
+   ----------------------------------------
+   -- Build_Or_Copy_Discr_Checking_Funcs --
+   ----------------------------------------
+
+   procedure Build_Or_Copy_Discr_Checking_Funcs (N : Node_Id) is
+  Typ : constant Entity_Id := Defining_Identifier (N);
+   begin
+  if Is_Unchecked_Union (Typ) or else not Has_Discriminants (Typ) then
+ null;
+  elsif not Is_Derived_Type (Typ)
+or else Has_New_Non_Standard_Rep (Typ)
+or else Is_Tagged_Type (Typ)
+  then
+ Build_Discr_Checking_Funcs (N);
+  else
+ Copy_Discr_Checking_Funcs (N);
+  end if;
+   end Build_Or_Copy_Discr_Checking_Funcs;
+

-- Bu

[Ada] Adjust warning switches

2022-06-01 Thread Pierre-Marie de Rodat via Gcc-patches
This makes the tagging of warning messages with their warning switches
more accurate.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_warn.adb (Check_References): Adjust conditions under which
warning messages should be emitted and their tags as well.

diff --git a/gcc/ada/sem_warn.adb b/gcc/ada/sem_warn.adb
--- a/gcc/ada/sem_warn.adb
+++ b/gcc/ada/sem_warn.adb
@@ -1272,10 +1272,6 @@ package body Sem_Warn is
 
elsif Never_Set_In_Source_Check_Spec (E1)
 
- --  No warning if warning for this case turned off
-
- and then Warn_On_No_Value_Assigned
-
  --  No warning if address taken somewhere
 
  and then not Address_Taken (E1)
@@ -1381,7 +1377,7 @@ package body Sem_Warn is
  --  force use of IN OUT, even if in this particular case
  --  the formal is not modified.
 
- else
+ elsif Warn_On_No_Value_Assigned then
 --  Suppress the warnings for a junk name
 
 if not Has_Junk_Name (E1) then
@@ -1397,15 +1393,17 @@ package body Sem_Warn is
if not Has_Pragma_Unmodified_Check_Spec (E1)
  and then not Warnings_Off_E1
  and then not Has_Junk_Name (E1)
+ and then Warn_On_No_Value_Assigned
then
   Output_Reference_Error
-("?f?formal parameter& is read but "
+("?v?formal parameter& is read but "
  & "never assigned!");
end if;
 
 elsif not Has_Pragma_Unreferenced_Check_Spec (E1)
   and then not Warnings_Off_E1
   and then not Has_Junk_Name (E1)
+  and then Check_Unreferenced_Formals
 then
Output_Reference_Error
  ("?f?formal parameter& is not referenced!");
@@ -1416,7 +1414,8 @@ package body Sem_Warn is
 
   else
  if Referenced (E1) then
-if not Has_Unmodified (E1)
+if Warn_On_No_Value_Assigned
+  and then not Has_Unmodified (E1)
   and then not Warnings_Off_E1
   and then not Has_Junk_Name (E1)
 then
@@ -1431,12 +1430,13 @@ package body Sem_Warn is
May_Need_Initialized_Actual (E1);
 end if;
 
- elsif not Has_Unreferenced (E1)
+ elsif Check_Unreferenced
+   and then not Has_Unreferenced (E1)
and then not Warnings_Off_E1
and then not Has_Junk_Name (E1)
  then
 Output_Reference_Error -- CODEFIX
-  ("?v?variable& is never read and never assigned!");
+  ("?u?variable& is never read and never assigned!");
  end if;
 
  --  Deal with special case where this variable is hidden
@@ -1445,14 +1445,15 @@ package body Sem_Warn is
  if Ekind (E1) = E_Variable
and then Present (Hiding_Loop_Variable (E1))
and then not Warnings_Off_E1
+   and then Warn_On_Hiding
  then
 Error_Msg_N
-  ("?v?for loop implicitly declares loop variable!",
+  ("?h?for loop implicitly declares loop variable!",
Hiding_Loop_Variable (E1));
 
 Error_Msg_Sloc := Sloc (E1);
 Error_Msg_N
-  ("\?v?declaration hides & declared#!",
+  ("\?h?declaration hides & declared#!",
Hiding_Loop_Variable (E1));
  end if;
   end if;




[Ada] Fix composability of return on the secondary stack

2022-06-01 Thread Pierre-Marie de Rodat via Gcc-patches
Having components that need to be returned on the secondary stack would
not always force a record type to be returned on the secondary stack
itself.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_util.adb
(Returns_On_Secondary_Stack.Caller_Known_Size_Record): Directly
check the dependence on discriminants for the variant part, if
any, instead of calling the Is_Definite_Subtype predicate.

diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -27388,14 +27388,8 @@ package body Sem_Util is
   pragma Assert (if Present (Id) then Ekind (Id) in E_Void | Type_Kind);
 
   function Caller_Known_Size_Record (Typ : Entity_Id) return Boolean;
-  --  This is called for untagged records and protected types, with
-  --  nondefaulted discriminants. Returns True if the size of function
-  --  results is known at the call site, False otherwise. Returns False
-  --  if there is a variant part that depends on the discriminants of
-  --  this type, or if there is an array constrained by the discriminants
-  --  of this type. ???Currently, this is overly conservative (the array
-  --  could be nested inside some other record that is constrained by
-  --  nondiscriminants). That is, the recursive calls are too conservative.
+  --  Called for untagged record and protected types. Return True if the
+  --  size of function results is known in the caller for Typ.
 
   function Large_Max_Size_Mutable (Typ : Entity_Id) return Boolean;
   --  Returns True if Typ is a nonlimited record with defaulted
@@ -27409,22 +27403,61 @@ package body Sem_Util is
   function Caller_Known_Size_Record (Typ : Entity_Id) return Boolean is
  pragma Assert (Typ = Underlying_Type (Typ));
 
+ function Depends_On_Discriminant (Typ : Entity_Id) return Boolean;
+ --  Called for untagged record and protected types. Return True if Typ
+ --  depends on discriminants, either directly when it is unconstrained
+ --  or indirectly when it is constrained by uplevel discriminants.
+
+ -----------------------------
+ -- Depends_On_Discriminant --
+ -----------------------------
+
+ function Depends_On_Discriminant (Typ : Entity_Id) return Boolean is
+Cons : Elmt_Id;
+
+ begin
+if Has_Discriminants (Typ) then
+   if not Is_Constrained (Typ) then
+  return True;
+
+   else
+  Cons := First_Elmt (Discriminant_Constraint (Typ));
+  while Present (Cons) loop
+ if Nkind (Node (Cons)) = N_Identifier
+   and then Ekind (Entity (Node (Cons))) = E_Discriminant
+ then
+return True;
+ end if;
+
+ Next_Elmt (Cons);
+  end loop;
+   end if;
+end if;
+
+return False;
+ end Depends_On_Discriminant;
+
   begin
- if Has_Variant_Part (Typ) and then not Is_Definite_Subtype (Typ) then
+ --  First see if we have a variant part and return False if it depends
+ --  on discriminants.
+
+ if Has_Variant_Part (Typ) and then Depends_On_Discriminant (Typ) then
 return False;
  end if;
 
+ --  Then loop over components and return False if their subtype has a
+ --  caller-unknown size, possibly recursively.
+
+ --  ??? This is overly conservative, an array could be nested inside
+ --  some other record that is constrained by nondiscriminants. That
+ --  is, the recursive calls are too conservative.
+
  declare
 Comp : Entity_Id;
 
  begin
 Comp := First_Component (Typ);
 while Present (Comp) loop
-
-   --  Only look at E_Component entities. No need to look at
-   --  E_Discriminant entities, and we must ignore internal
-   --  subtypes generated for constrained components.
-
declare
   Comp_Type : constant Entity_Id :=
 Underlying_Type (Etype (Comp));




[Ada] QNX shared libraries - arm-qnx build gnatlib .so's

2022-06-01 Thread Pierre-Marie de Rodat via Gcc-patches
Shared libraries are now fully supported on arm-qnx.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* Makefile.rtl (GNATLIB_SHARED): Revert disablement for arm-qnx.

diff --git a/gcc/ada/Makefile.rtl b/gcc/ada/Makefile.rtl
--- a/gcc/ada/Makefile.rtl
+++ b/gcc/ada/Makefile.rtl
@@ -1542,9 +1542,8 @@ ifeq ($(strip $(filter-out arm aarch64 %qnx,$(target_cpu) $(target_os))),)
s-dorepr.adb

[Ada] Adjust reference in comment

2022-06-01 Thread Pierre-Marie de Rodat via Gcc-patches
This is needed after the creation of Returns_On_Secondary_Stack from the
original Requires_Transient_Scope.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_util.adb (Indirect_Temp_Needed): Adjust reference in comment.

diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -32129,7 +32129,7 @@ package body Sem_Util is
 --
 --  See Large_Max_Size_Mutable function elsewhere in this
 --  file (currently declared inside of
---  Requires_Transient_Scope, so it would have to be
+--  Returns_On_Secondary_Stack, so it would have to be
 --  moved if we want it to be callable from here).
 
  end Indirect_Temp_Needed;




[Ada] Another case where freezing incorrectly suppresses checks

2022-06-01 Thread Pierre-Marie de Rodat via Gcc-patches
Avoid improperly suppressing checks for the wrapper subprogram that is
built when a null type extension inherits (and does not override) a
function with a controlling result. This is a follow-up to other changes
already made on this ticket.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_ch3.adb (Make_Controlling_Function_Wrappers): Set the
Corresponding_Spec field of a wrapper subprogram body before
analyzing the subprogram body; the field will be set (again)
during analysis, but we need it to be set earlier.
* exp_ch13.adb (Expand_N_Freeze_Entity): Add wrapper subprogram
bodies to the list of declarations for which we do not want to
suppress checks.

diff --git a/gcc/ada/exp_ch13.adb b/gcc/ada/exp_ch13.adb
--- a/gcc/ada/exp_ch13.adb
+++ b/gcc/ada/exp_ch13.adb
@@ -626,10 +626,10 @@ package body Exp_Ch13 is
   end if;
 
   --  Analyze actions generated by freezing. The init_proc contains source
-  --  expressions that may raise Constraint_Error, and the assignment
+  --  expressions that may raise Constraint_Error, the assignment
   --  procedure for complex types needs checks on individual component
-  --  assignments, but all other freezing actions should be compiled with
-  --  all checks off.
+  --  assignments, and wrappers may need checks. Other freezing actions
+  --  should be compiled with all checks off.
 
   if Present (Actions (N)) then
  Decl := First (Actions (N));
@@ -637,7 +637,11 @@ package body Exp_Ch13 is
 if Nkind (Decl) = N_Subprogram_Body
   and then (Is_Init_Proc (Defining_Entity (Decl))
   or else
-Chars (Defining_Entity (Decl)) = Name_uAssign)
+Chars (Defining_Entity (Decl)) = Name_uAssign
+  or else
+(Present (Corresponding_Spec (Decl))
+   and then Is_Wrapper
+  (Corresponding_Spec (Decl
 then
Analyze (Decl);
 


diff --git a/gcc/ada/exp_ch3.adb b/gcc/ada/exp_ch3.adb
--- a/gcc/ada/exp_ch3.adb
+++ b/gcc/ada/exp_ch3.adb
@@ -10031,6 +10031,13 @@ package body Exp_Ch3 is
 Mutate_Ekind (Func_Id, E_Function);
 Set_Is_Wrapper (Func_Id);
 
+--  Corresponding_Spec will be set again to the same value during
+--  analysis, but we need this information earlier.
+--  Expand_N_Freeze_Entity needs to know whether a subprogram body
+--  is a wrapper's body in order to get check suppression right.
+
+Set_Corresponding_Spec (Func_Body, Func_Id);
+
 Override_Dispatching_Operation (Tag_Typ, Subp, New_Op => Func_Id);
  end if;
 




[Ada] Note that hardening features are experimental

2022-06-01 Thread Pierre-Marie de Rodat via Gcc-patches
Some features have not yet received customer feedback or made it upstream.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* doc/gnat_rm/security_hardening_features.rst: Note that hardening
features are experimental.
* gnat_rm.texi: Regenerate.

diff --git a/gcc/ada/doc/gnat_rm/security_hardening_features.rst b/gcc/ada/doc/gnat_rm/security_hardening_features.rst
--- a/gcc/ada/doc/gnat_rm/security_hardening_features.rst
+++ b/gcc/ada/doc/gnat_rm/security_hardening_features.rst
@@ -7,6 +7,9 @@ Security Hardening Features
 This chapter describes Ada extensions aimed at security hardening that
 are provided by GNAT.
 
+The features in this chapter are currently experimental and subject to
+change.
+
 .. Register Scrubbing:
 
 Register Scrubbing


diff --git a/gcc/ada/gnat_rm.texi b/gcc/ada/gnat_rm.texi
--- a/gcc/ada/gnat_rm.texi
+++ b/gcc/ada/gnat_rm.texi
@@ -28878,6 +28878,9 @@ RM References:  H.04 (8/1)
 This chapter describes Ada extensions aimed at security hardening that
 are provided by GNAT.
 
+The features in this chapter are currently experimental and subject to
+change.
+
 @c Register Scrubbing:
 
 @menu




[Ada] Get rid of secondary stack for controlled components of limited types

2022-06-01 Thread Pierre-Marie de Rodat via Gcc-patches
The initial work didn't change anything for limited types because they use
a specific return mechanism for functions, called build-in-place, in which
there is no anonymous return object; the secondary stack was used only for
the sake of consistency with the nonlimited case.

This change aligns the limited case with the nonlimited case, i.e. either
they both use the primary stack or they both use the secondary stack.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_ch6.adb (Caller_Known_Size): Call Returns_On_Secondary_Stack
instead of Requires_Transient_Scope and tidy up.
(Needs_BIP_Alloc_Form): Likewise.
* exp_util.adb (Initialized_By_Aliased_BIP_Func_Call): Also return
true if the build-in-place function call has no BIPalloc parameter.
(Is_Finalizable_Transient): Remove redundant test.

diff --git a/gcc/ada/exp_ch6.adb b/gcc/ada/exp_ch6.adb
--- a/gcc/ada/exp_ch6.adb
+++ b/gcc/ada/exp_ch6.adb
@@ -1055,11 +1055,12 @@ package body Exp_Ch6 is
  (Func_Call   : Node_Id;
   Result_Subt : Entity_Id) return Boolean
is
+  Ctrl : constant Node_Id   := Controlling_Argument (Func_Call);
+  Utyp : constant Entity_Id := Underlying_Type (Result_Subt);
+
begin
-  return
-  (Is_Definite_Subtype (Underlying_Type (Result_Subt))
-and then No (Controlling_Argument (Func_Call)))
-or else not Requires_Transient_Scope (Underlying_Type (Result_Subt));
+  return (No (Ctrl) and then Is_Definite_Subtype (Utyp))
+or else not Returns_On_Secondary_Stack (Utyp);
end Caller_Known_Size;
 
---
@@ -10218,7 +10219,7 @@ package body Exp_Ch6 is
   pragma Assert (Is_Build_In_Place_Function (Func_Id));
   Func_Typ : constant Entity_Id := Underlying_Type (Etype (Func_Id));
begin
-  return Requires_Transient_Scope (Func_Typ);
+  return Returns_On_Secondary_Stack (Func_Typ);
end Needs_BIP_Alloc_Form;
 
-


diff --git a/gcc/ada/exp_util.adb b/gcc/ada/exp_util.adb
--- a/gcc/ada/exp_util.adb
+++ b/gcc/ada/exp_util.adb
@@ -8368,9 +8368,10 @@ package body Exp_Util is
   function Initialized_By_Aliased_BIP_Func_Call
 (Trans_Id : Entity_Id) return Boolean;
   --  Determine whether transient object Trans_Id is initialized by a
-  --  build-in-place function call where the BIPalloc parameter is of
-  --  value 1 and BIPaccess is not null. This case creates an aliasing
-  --  between the returned value and the value denoted by BIPaccess.
+  --  build-in-place function call where the BIPalloc parameter either
+  --  does not exist or is Caller_Allocation, and BIPaccess is not null.
+  --  This case creates an aliasing between the returned value and the
+  --  value denoted by BIPaccess.
 
   function Is_Aliased
 (Trans_Id   : Entity_Id;
@@ -8427,11 +8428,14 @@ package body Exp_Util is
 
  if Is_Build_In_Place_Function_Call (Call) then
 declare
+   Caller_Allocation_Val : constant Uint :=
+ UI_From_Int (BIP_Allocation_Form'Pos (Caller_Allocation));
+
Access_Nam : Name_Id := No_Name;
Access_OK  : Boolean := False;
Actual : Node_Id;
Alloc_Nam  : Name_Id := No_Name;
-   Alloc_OK   : Boolean := False;
+   Alloc_OK   : Boolean := True;
Formal : Node_Id;
Func_Id: Entity_Id;
Param  : Node_Id;
@@ -8466,7 +8470,7 @@ package body Exp_Util is
 BIP_Formal_Suffix (BIP_Alloc_Form));
  end if;
 
- --  A match for BIPaccess => Temp has been found
+ --  A nonnull BIPaccess has been found
 
  if Chars (Formal) = Access_Nam
and then Nkind (Actual) /= N_Null
@@ -8474,13 +8478,12 @@ package body Exp_Util is
 Access_OK := True;
  end if;
 
- --  A match for BIPalloc => 1 has been found
+ --  A BIPalloc has been found
 
  if Chars (Formal) = Alloc_Nam
and then Nkind (Actual) = N_Integer_Literal
-   and then Intval (Actual) = Uint_1
  then
-Alloc_OK := True;
+Alloc_OK := Intval (Actual) = Caller_Allocation_Val;
  end if;
   end if;
 
@@ -8767,7 +8770,6 @@ package body Exp_Util is
   return
 Ekind (Obj_Id) in E_Constant | E_Variable
   and then Needs_Finalization (Desig)
-  and then Requires_Transient_Scope (Desig)
   and then Nkind (Rel_Node) /= N_Simple_Return_Statement
   and then not Is_Part_Of_BIP_Return_Statement (Rel_Node)
 




[Ada] Propagate null-exclusion to anonymous access types

2022-06-01 Thread Pierre-Marie de Rodat via Gcc-patches
When analyzing an array or record type declaration whose component has a
constrained access type, e.g.:

   type Buffer_Acc is not null access all String;

   type Buffer_Rec is record
  Data : Buffer_Acc (1 .. 10);
   end record;

   type Buffer_Arr is array (Boolean) of Buffer_Acc (1 .. 10);

we propagated various properties of the unconstrained access type (e.g.
the designated type, access-to-constant flag), but forgot to propagate
the null-exclusion.

For GNAT it didn't make a big difference, because the (anonymous)
component type was never subject to legality checks. The "value
tracking" optimisation machinery, which also deals with null values,
only works for entire objects and doesn't care about components.
However, GNATprove uses this flag when an access-to-component object is
dereferenced.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_ch3.adb (Constrain_Access): Propagate null-exclusion flag
from parent type.

diff --git a/gcc/ada/sem_ch3.adb b/gcc/ada/sem_ch3.adb
--- a/gcc/ada/sem_ch3.adb
+++ b/gcc/ada/sem_ch3.adb
@@ -13535,6 +13535,7 @@ package body Sem_Ch3 is
   Set_Directly_Designated_Type (Def_Id, Desig_Subtype);
   Set_Depends_On_Private   (Def_Id, Has_Private_Component (Def_Id));
   Set_Is_Access_Constant   (Def_Id, Is_Access_Constant (T));
+  Set_Can_Never_Be_Null(Def_Id, Can_Never_Be_Null (T));
 
   Conditional_Delay (Def_Id, T);
 




[Ada] Fix bad interaction between Inline_Always and -gnateV + -gnata

2022-06-01 Thread Pierre-Marie de Rodat via Gcc-patches
The combination of pragma/aspect Inline_Always and -gnateV -gnata runs
afoul of the handling of inlining across units by gigi, which does not
inline a subprogram that calls nested subprograms if these subprograms
are not themselves inlined.

This condition does not apply to internally generated subprograms but
the special _postconditions procedure has Debug_Info_Needed set so it
is not considered as such and, as a consequence, triggers an error if
the enclosing subprogram requires inlining by means of Inline_Always.

The _postconditions procedure is already marked inlined when generating
C code so it makes sense to mark it inlined in the general case as well.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* contracts.adb (Build_Postconditions_Procedure): Set Is_Inlined
unconditionally on the procedure entity.

diff --git a/gcc/ada/contracts.adb b/gcc/ada/contracts.adb
--- a/gcc/ada/contracts.adb
+++ b/gcc/ada/contracts.adb
@@ -2365,6 +2365,10 @@ package body Contracts is
  Set_Debug_Info_Needed   (Proc_Id);
  Set_Postconditions_Proc (Subp_Id, Proc_Id);
 
+ --  Mark it inlined to speed up the call
+
+ Set_Is_Inlined (Proc_Id);
+
  --  Force the front-end inlining of _Postconditions when generating C
  --  code, since its body may have references to itypes defined in the
  --  enclosing subprogram, which would cause problems for unnesting
@@ -2373,7 +2377,6 @@ package body Contracts is
  if Modify_Tree_For_C then
 Set_Has_Pragma_Inline(Proc_Id);
 Set_Has_Pragma_Inline_Always (Proc_Id);
-Set_Is_Inlined   (Proc_Id);
  end if;
 
  --  The related subprogram is a function: create the specification of




[Ada] Enable using absolute paths in -fdiagnostics-format=json output

2022-06-01 Thread Pierre-Marie de Rodat via Gcc-patches
This commit makes GNAT use absolute paths in -fdiagnostics-format=json's
output when -gnatef is present on the command line. This makes life
easier for tools that ingest GNAT's output.
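
Absolute paths can contain characters that are special inside a JSON string, such as the backslashes in Windows-style paths, which is presumably why the patch routes the normalized path through Write_JSON_Escaped_String. The following is only a minimal C sketch of that escaping rule (a backslash before each quote and backslash), not GNAT's Ada code; the name json_escape and its buffer interface are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only: copy SRC into DST, preceding
   each double quote and backslash with a backslash.  Returns the length of
   the escaped string, or (size_t)-1 if DST is too small.  */
size_t json_escape(const char *src, char *dst, size_t dst_size)
{
    size_t n = 0;

    if (dst_size == 0)
        return (size_t)-1;

    for (; *src != '\0'; src++) {
        size_t need = (*src == '"' || *src == '\\') ? 2 : 1;

        /* Keep one byte in reserve for the terminating NUL.  */
        if (n + need >= dst_size)
            return (size_t)-1;

        if (need == 2)
            dst[n++] = '\\';
        dst[n++] = *src;
    }

    dst[n] = '\0';
    return n;
}
```

Like the Ada routine it mirrors, this sketch escapes only quotes and backslashes; control characters are left untouched.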

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* doc/gnat_ugn/building_executable_programs_with_gnat.rst:
Document new behavior.
* errout.adb (Write_JSON_Location): Output absolute paths when
needed.
* switch-c.adb (Scan_Front_End_Switches): Update -gnatef
comment.
* usage.adb (Usage): Update description of -gnatef.
* gnat_ugn.texi: Regenerate.

diff --git a/gcc/ada/doc/gnat_ugn/building_executable_programs_with_gnat.rst b/gcc/ada/doc/gnat_ugn/building_executable_programs_with_gnat.rst
--- a/gcc/ada/doc/gnat_ugn/building_executable_programs_with_gnat.rst
+++ b/gcc/ada/doc/gnat_ugn/building_executable_programs_with_gnat.rst
@@ -1238,7 +1238,8 @@ Alphabetical List of All Switches
 :switch:`-fdiagnostics-format=json`
   Makes GNAT emit warning and error messages as JSON. Inhibits printing of
   text warning and errors messages except if :switch:`-gnatv` or
-  :switch:`-gnatl` are present.
+  :switch:`-gnatl` are present. Uses absolute file paths when used along
+  :switch:`-gnatef`.
 
 
 .. index:: -fdump-scos  (gcc)
@@ -1582,7 +1583,8 @@ Alphabetical List of All Switches
 .. index:: -gnatef  (gcc)
 
 :switch:`-gnatef`
-  Display full source path name in brief error messages.
+  Display full source path name in brief error messages and absolute paths in
+  :switch:`-fdiagnostics-format=json`'s output.
 
 
 .. index:: -gnateF  (gcc)


diff --git a/gcc/ada/errout.adb b/gcc/ada/errout.adb
--- a/gcc/ada/errout.adb
+++ b/gcc/ada/errout.adb
@@ -51,6 +51,7 @@ with Sinfo.Utils;use Sinfo.Utils;
 with Snames; use Snames;
 with Stand;  use Stand;
 with Stylesw;use Stylesw;
+with System.OS_Lib;
 with Uname;  use Uname;
 
 package body Errout is
@@ -2082,6 +2083,7 @@ package body Errout is
   --  Return True if E is a continuation message.
 
   procedure Write_JSON_Escaped_String (Str : String_Ptr);
+  procedure Write_JSON_Escaped_String (Str : String);
   --  Write each character of Str, taking care of preceding each quote and
   --  backslash with a backslash. Note that this escaping differs from what
   --  GCC does.
@@ -2114,9 +2116,9 @@ package body Errout is
   -- Write_JSON_Escaped_String --
   ---
 
-  procedure Write_JSON_Escaped_String (Str : String_Ptr) is
+  procedure Write_JSON_Escaped_String (Str : String) is
   begin
- for C of Str.all loop
+ for C of Str loop
 if C = '"' or else C = '\' then
Write_Char ('\');
 end if;
@@ -2125,14 +2127,30 @@ package body Errout is
  end loop;
   end Write_JSON_Escaped_String;
 
+  ---
+  -- Write_JSON_Escaped_String --
+  ---
+
+  procedure Write_JSON_Escaped_String (Str : String_Ptr) is
+  begin
+ Write_JSON_Escaped_String (Str.all);
+  end Write_JSON_Escaped_String;
+
   -
   -- Write_JSON_Location --
   -
 
   procedure Write_JSON_Location (Sptr : Source_Ptr) is
+ Name : constant File_Name_Type :=
+   Full_Ref_Name (Get_Source_File_Index (Sptr));
   begin
  Write_Str ("{""file"":""");
- Write_Name (Full_Ref_Name (Get_Source_File_Index (Sptr)));
+ if Full_Path_Name_For_Brief_Errors then
+Write_JSON_Escaped_String
+  (System.OS_Lib.Normalize_Pathname (Get_Name_String (Name)));
+ else
+Write_Name (Name);
+ end if;
  Write_Str (""",""line"":");
  Write_Int (Pos (Get_Physical_Line_Number (Sptr)));
  Write_Str (", ""column"":");


diff --git a/gcc/ada/gnat_ugn.texi b/gcc/ada/gnat_ugn.texi
--- a/gcc/ada/gnat_ugn.texi
+++ b/gcc/ada/gnat_ugn.texi
@@ -21,7 +21,7 @@
 
 @copying
 @quotation
-GNAT User's Guide for Native Platforms , Apr 22, 2022
+GNAT User's Guide for Native Platforms , May 24, 2022
 
 AdaCore
 
@@ -8552,7 +8552,8 @@ dynamically allocated objects.
 
 Makes GNAT emit warning and error messages as JSON. Inhibits printing of
 text warning and errors messages except if @code{-gnatv} or
-@code{-gnatl} are present.
+@code{-gnatl} are present. Uses absolute file paths when used along
+@code{-gnatef}.
 @end table
 
 @geindex -fdump-scos (gcc)
@@ -9037,7 +9038,8 @@ produced at run time.
 
 @item @code{-gnatef}
 
-Display full source path name in brief error messages.
+Display full source path name in brief error messages and absolute paths in
+@code{-fdiagnostics-format=json}’s output.
 @end table
 
 @geindex -gnateF (gcc)
@@ -29247,8 +29249,8 @@ to permit their use in free software.
 
 @printindex ge
 
-@anchor{cf}@w{  }
 @anchor{gnat_ugn/gnat_utilit

[Ada] arm-qnx-7.1: unwind goes wrong after regs restore

2022-06-01 Thread Pierre-Marie de Rodat via Gcc-patches
Bump the pc by a total of 3 for Thumb mode, the same calculation as is
done for arm-linux.
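
The arithmetic can be sketched in isolation as follows. This is a simplified, hypothetical stand-alone function, not the code in init.c: the real patch updates the saved pc in the QNX mcontext in place, tests the platform's ARM_CPSR_T macro, and guards the extra increment with #ifdef __thumb2__. The constant SKETCH_CPSR_T assumes the architectural position of the Thumb state bit (bit 5 of the CPSR/SPSR).

```c
#include <stdint.h>

/* Assumption for this sketch: the Thumb state bit is bit 5 of the saved
   program status register.  The patch itself uses ARM_CPSR_T.  */
#define SKETCH_CPSR_T (UINT32_C(1) << 5)

/* Hypothetical helper: return the adjusted pc for a given saved pc and
   saved program status register value.  */
uintptr_t adjust_pc_for_raise(uintptr_t pc, uint32_t spsr)
{
    /* Base bump, kept even as in the patch.  */
    pc += 2;

    /* In Thumb state the low-order bit must be set so that the unwinder
       stays in Thumb mode upon return: 2 + 1 = 3 in total.  */
    if (spsr & SKETCH_CPSR_T)
        pc += 1;

    return pc;
}
```

For example, a saved pc of 0x1000 becomes 0x1002 in ARM state and 0x1003 in Thumb state.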

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* init.c (__gnat_adjust_context_for_raise) [QNX][__thumb2__]: Bump
the pc an extra byte.

diff --git a/gcc/ada/init.c b/gcc/ada/init.c
--- a/gcc/ada/init.c
+++ b/gcc/ada/init.c
@@ -2579,7 +2579,17 @@ __gnat_adjust_context_for_raise (int signo ATTRIBUTE_UNUSED,
   uintptr_t *pc_addr;
   mcontext_t *mcontext = &((ucontext_t *) sc)->uc_mcontext;
   pc_addr = (uintptr_t *)&mcontext->cpu.gpr [ARM_REG_PC];
+
+  /* ARM Bump has to be an even number because of odd/even architecture.  */
   *pc_addr += 2;
+#ifdef __thumb2__
+  /* For thumb, the return address must have the low order bit set, otherwise
+ the unwinder will reset to "arm" mode upon return.  As long as the
+ compilation unit containing the landing pad is compiled with the same
+ mode (arm vs thumb) as the signaling compilation unit, this works.  */
+  if (mcontext->cpu.spsr & ARM_CPSR_T)
+*pc_addr += 1;
+#endif
 }
 #endif /* ARMEL */
 




[Ada] Fix predicate check on object declaration

2022-06-01 Thread Pierre-Marie de Rodat via Gcc-patches
When subtype predicate checks are added for object declarations, it
could lead to a compiler crash or to an incorrect check.

When the subtype for the object being declared is built later by
Analyze_Object_Declaration, the predicate check can't be applied on the
object instead of a copy as the call will be incorrect after the subtype
has been built.

When subtypes for LHS and RHS do not statically match, only checking the
predicate on the object after it has been initialized may miss a failing
predicate on the RHS.

In both cases, skip the optimization and check the predicate on a copy.

Rename Should_Build_Subtype to Build_Default_Subtype_OK and move it
out of sem_ch3 to make it available to other parts of the compiler (in
particular to checks.adb).

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* checks.adb (Apply_Predicate_Check): Refine condition for
applying optimization.
* sem_ch3.adb (Analyze_Component_Declaration): Adjust calls to
Should_Build_Subtype.
(Analyze_Object_Declaration): Likewise.
(Should_Build_Subtype): Rename/move to ...
* sem_util.ads (Build_Default_Subtype_OK): ... this.
* sem_util.adb (Build_Default_Subtype_OK): Moved from
sem_ch3.adb.

diff --git a/gcc/ada/checks.adb b/gcc/ada/checks.adb
--- a/gcc/ada/checks.adb
+++ b/gcc/ada/checks.adb
@@ -2944,14 +2944,28 @@ package body Checks is
 
  --  Similarly, if the expression is an aggregate in an object
  --  declaration, apply it to the object after the declaration.
- --  This is only necessary in rare cases of tagged extensions
- --  initialized with an aggregate with an "others => <>" clause.
+
+ --  This is only necessary in cases of tagged extensions
+ --  initialized with an aggregate with an "others => <>" clause,
+ --  when the subtypes of LHS and RHS do not statically match or
+ --  when we know the object's type will be rewritten later.
+ --  The condition for the later is copied from the
+ --  Analyze_Object_Declaration procedure when it actually builds the
+ --  subtype.
 
  elsif Nkind (Par) = N_Object_Declaration then
-Insert_Action_After (Par,
-  Make_Predicate_Check (Typ,
-New_Occurrence_Of (Defining_Identifier (Par), Sloc (N))));
-return;
+if Subtypes_Statically_Match
+ (Etype (Defining_Identifier (Par)), Typ)
+  and then (Nkind (N) = N_Extension_Aggregate
+ or else (Is_Definite_Subtype (Typ)
+   and then Build_Default_Subtype_OK (Typ)))
+then
+   Insert_Action_After (Par,
+  Make_Predicate_Check (Typ,
+New_Occurrence_Of (Defining_Identifier (Par), Sloc (N))));
+   return;
+end if;
+
  end if;
   end if;
 


diff --git a/gcc/ada/sem_ch3.adb b/gcc/ada/sem_ch3.adb
--- a/gcc/ada/sem_ch3.adb
+++ b/gcc/ada/sem_ch3.adb
@@ -725,16 +725,6 @@ package body Sem_Ch3 is
--  sets the flags SSO_Set_Low_By_Default/SSO_Set_High_By_Default according
--  to the setting of Opt.Default_SSO.
 
-   function Should_Build_Subtype (T : Entity_Id) return Boolean;
-   --  When analyzing components or object declarations, it is possible, in
-   --  some cases, to build subtypes for discriminated types. This is
-   --  worthwhile to avoid the backend allocating the maximum possible size for
-   --  objects of the type.
-   --  In particular, when T is limited, the discriminants and therefore the
-   --  size of an object of type T cannot change. Furthermore, if T is definite
-   --  with statically initialized defaulted discriminants, we are able and
-   --  want to build a constrained subtype of the right size.
-
procedure Signed_Integer_Type_Declaration (T : Entity_Id; Def : Node_Id);
--  Create a new signed integer entity, and apply the constraint to obtain
--  the required first named subtype of this type.
@@ -2214,7 +2204,7 @@ package body Sem_Ch3 is
 
   --  When possible, build the default subtype
 
-  if Should_Build_Subtype (T) then
+  if Build_Default_Subtype_OK (T) then
  declare
 Act_T : constant Entity_Id := Build_Default_Subtype (T, N);
 
@@ -4815,7 +4805,7 @@ package body Sem_Ch3 is
 
   --  When possible, build the default subtype
 
-  elsif Should_Build_Subtype (T) then
+  elsif Build_Default_Subtype_OK (T) then
  if No (E) then
 Act_T := Build_Default_Subtype (T, N);
  else
@@ -22963,80 +22953,6 @@ package body Sem_Ch3 is
   end if;
end Set_Stored_Constraint_From_Discriminant_Constraint;
 
-   --
-   -- Should_Build_Subtype --
-   --
-
-   function Should_Build_Subtype (T : Entity_Id) return Boolean is
-
-  function Default_Discriminant_Values_Known_At_Compile_Time

[Ada] Bug fix in "=" function of formal doubly linked list

2022-06-01 Thread Pierre-Marie de Rodat via Gcc-patches
Correction of a typo regarding indexes.
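
To see why the index mix-up matters, here is a minimal C sketch of an array-backed list in the spirit of the formal container (nodes addressed by index, with 0 meaning "no node"). The types and the name list_equal are hypothetical, and the up-front length check is an assumption. Two lists holding the same elements may store them in different slots, so each list has to be walked with its own cursor.

```c
#include <stdbool.h>

#define MAX_NODES 8

/* Hypothetical model: slot 0 is unused, next == 0 marks the end.  */
struct node {
    int element;
    int next;
};

struct list {
    struct node nodes[MAX_NODES + 1];
    int first;
    int length;
};

bool list_equal(const struct list *left, const struct list *right)
{
    if (left->length != right->length)
        return false;

    int li = left->first;
    int ri = right->first;   /* the typo started this cursor at left->first */

    while (li != 0) {
        /* Index each list with its own cursor; indexing right->nodes with
           li would compare unrelated slots.  */
        if (left->nodes[li].element != right->nodes[ri].element)
            return false;

        li = left->nodes[li].next;
        ri = right->nodes[ri].next;
    }

    return true;
}
```

With the pre-patch indexing, two equal lists laid out in different slots could compare as unequal.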

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/a-cfdlli.adb ("="): Make the function properly loop
over the right list.

diff --git a/gcc/ada/libgnat/a-cfdlli.adb b/gcc/ada/libgnat/a-cfdlli.adb
--- a/gcc/ada/libgnat/a-cfdlli.adb
+++ b/gcc/ada/libgnat/a-cfdlli.adb
@@ -68,9 +68,9 @@ is
   end if;
 
   LI := Left.First;
-  RI := Left.First;
+  RI := Right.First;
   while LI /= 0 loop
- if Left.Nodes (LI).Element /= Right.Nodes (LI).Element then
+ if Left.Nodes (LI).Element /= Right.Nodes (RI).Element then
 return False;
  end if;
 




[Ada] Do not freeze subprogram body without spec too early

2022-06-01 Thread Pierre-Marie de Rodat via Gcc-patches
This fixes a small oddity whereby a subprogram body declared without a spec
would be frozen before its entity is fully processed as an overloaded name.
Now the latter step computes useful information, for example whether the
body is a (late) primitive of a tagged type, which can be required during
the freezing process.  The change also adjusts Check_Dispatching_Operation
accordingly.  No functional changes.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_ch6.adb (Analyze_Subprogram_Body_Helper): For the case where
there is no previous declaration, freeze the body entity only after
it has been processed as a new overloaded name.
Use Was_Expression_Function to recognize expression functions.
* sem_disp.adb (Check_Dispatching_Operation): Do not require a body
which is the last primitive to be frozen here.

diff --git a/gcc/ada/sem_ch6.adb b/gcc/ada/sem_ch6.adb
--- a/gcc/ada/sem_ch6.adb
+++ b/gcc/ada/sem_ch6.adb
@@ -4389,25 +4389,15 @@ package body Sem_Ch6 is
 end if;
 
 --  A subprogram body should cause freezing of its own declaration,
---  but if there was no previous explicit declaration, then the
---  subprogram will get frozen too late (there may be code within
---  the body that depends on the subprogram having been frozen,
---  such as uses of extra formals), so we force it to be frozen
---  here. Same holds if the body and spec are compilation units.
---  Finally, if the return type is an anonymous access to protected
---  subprogram, it must be frozen before the body because its
---  expansion has generated an equivalent type that is used when
---  elaborating the body.
-
---  An exception in the case of Ada 2012, AI05-177: The bodies
---  created for expression functions do not freeze.
-
-if No (Spec_Id)
-  and then Nkind (Original_Node (N)) /= N_Expression_Function
+--  so, if the body and spec are compilation units, we must do it
+--  manually here. Moreover, if the return type is anonymous access
+--  to protected subprogram, it must be frozen before the body
+--  because its expansion has generated an equivalent type that is
+--  used when elaborating the body.
+
+if Present (Spec_Id)
+  and then Nkind (Parent (N)) = N_Compilation_Unit
 then
-   Freeze_Before (N, Body_Id);
-
-elsif Nkind (Parent (N)) = N_Compilation_Unit then
Freeze_Before (N, Spec_Id);
 
 elsif Is_Access_Subprogram_Type (Etype (Body_Id)) then
@@ -4775,13 +4765,28 @@ package body Sem_Ch6 is
 
--  No warnings for expression functions
 
-   and then Nkind (Original_Node (N)) /= N_Expression_Function
+   and then (Nkind (N) /= N_Subprogram_Body
+  or else not Was_Expression_Function (N))
  then
 Style.Body_With_No_Spec (N);
  end if;
 
  New_Overloaded_Entity (Body_Id);
 
+ --  A subprogram body should cause freezing of its own declaration,
+ --  but if there was no previous explicit declaration, then the
+ --  subprogram will get frozen too late (there may be code within
+ --  the body that depends on the subprogram having been frozen,
+ --  such as uses of extra formals), so we force it to be frozen here.
+ --  An exception in Ada 2012 is that the body created for expression
+ --  functions does not freeze.
+
+ if Nkind (N) /= N_Subprogram_Body
+   or else not Was_Expression_Function (N)
+ then
+Freeze_Before (N, Body_Id);
+ end if;
+
  if Nkind (N) /= N_Subprogram_Body_Stub then
 Set_Acts_As_Spec (N);
 Generate_Definition (Body_Id);


diff --git a/gcc/ada/sem_disp.adb b/gcc/ada/sem_disp.adb
--- a/gcc/ada/sem_disp.adb
+++ b/gcc/ada/sem_disp.adb
@@ -1516,11 +1516,10 @@ package body Sem_Disp is
("\spec should appear immediately after the type!",
 Subp);
 
-  elsif Is_Frozen (Subp) then
+  else
 
  --  The subprogram body declares a primitive operation.
- --  If the subprogram is already frozen, we must update
- --  its dispatching information explicitly here. The
+ --  We must update its dispatching information here. The
  --  information is taken from the overridden subprogram.
  --  We must also generate a cross-reference entry because
  --  references to other primitives were already created




[Ada] Fix classification of Subprogram_Variant as assertion pragma

2022-06-01 Thread Pierre-Marie de Rodat via Gcc-patches
This pragma was wrongly not recognized as an assertion pragma.  Now
fixed.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_prag.ads (Assertion_Expression_Pragmas): Fix value for
pragma Subprogram_Variant.

diff --git a/gcc/ada/sem_prag.ads b/gcc/ada/sem_prag.ads
--- a/gcc/ada/sem_prag.ads
+++ b/gcc/ada/sem_prag.ads
@@ -149,6 +149,7 @@ package Sem_Prag is
   Pragma_Precondition  => True,
   Pragma_Predicate => True,
   Pragma_Refined_Post  => True,
+  Pragma_Subprogram_Variant=> True,
   Pragma_Test_Case => True,
   Pragma_Type_Invariant=> True,
   Pragma_Type_Invariant_Class  => True,




[Ada] Rename Returns_On_Secondary_Stack into Needs_Secondary_Stack

2022-06-01 Thread Pierre-Marie de Rodat via Gcc-patches
The Returns_On_Secondary_Stack predicate is a misnomer because it must be
invoked on a type and types do not return; as a matter of fact, the other
Returns_XXX predicates apply to functions.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_ch6.adb (Caller_Known_Size): Invoke Needs_Secondary_Stack in
lieu of Returns_On_Secondary_Stack.
(Expand_Call_Helper): Likewise.
(Expand_Simple_Function_Return): Likewise.
(Needs_BIP_Alloc_Form): Likewise.
* exp_ch7.adb (Wrap_Transient_Declaration): Likewise.
* sem_res.adb (Resolve_Call): Likewise.
(Resolve_Entry_Call): Likewise.
* sem_util.ads (Returns_On_Secondary_Stack): Rename into...
(Needs_Secondary_Stack): ...this.
* sem_util.adb (Returns_On_Secondary_Stack): Rename into...
(Needs_Secondary_Stack): ...this.
* fe.h (Returns_On_Secondary_Stack): Delete.
(Needs_Secondary_Stack): New function.
* gcc-interface/decl.cc (gnat_to_gnu_subprog_type): Replace call
to Returns_On_Secondary_Stack with Needs_Secondary_Stack.

diff --git a/gcc/ada/exp_ch6.adb b/gcc/ada/exp_ch6.adb
--- a/gcc/ada/exp_ch6.adb
+++ b/gcc/ada/exp_ch6.adb
@@ -1060,7 +1060,7 @@ package body Exp_Ch6 is
 
begin
   return (No (Ctrl) and then Is_Definite_Subtype (Utyp))
-or else not Returns_On_Secondary_Stack (Utyp);
+or else not Needs_Secondary_Stack (Utyp);
end Caller_Known_Size;
 
---
@@ -4946,7 +4946,7 @@ package body Exp_Ch6 is
  Is_Build_In_Place_Function_Call (Parent (Call_Node)))
  then
 Establish_Transient_Scope
-  (Call_Node, Returns_On_Secondary_Stack (Etype (Subp)));
+  (Call_Node, Needs_Secondary_Stack (Etype (Subp)));
  end if;
   end if;
end Expand_Call_Helper;
@@ -7341,7 +7341,7 @@ package body Exp_Ch6 is
   --  A return statement from an ignored Ghost function does not use the
   --  secondary stack (or any other one).
 
-  elsif not Returns_On_Secondary_Stack (R_Type)
+  elsif not Needs_Secondary_Stack (R_Type)
 or else Is_Ignored_Ghost_Entity (Scope_Id)
   then
  --  Mutable records with variable-length components are not returned
@@ -7455,7 +7455,7 @@ package body Exp_Ch6 is
  --  how to do a copy.)
 
  if Exp_Is_Function_Call
-   and then Returns_On_Secondary_Stack (Exp_Typ)
+   and then Needs_Secondary_Stack (Exp_Typ)
  then
 Set_By_Ref (N);
 
@@ -10219,7 +10219,7 @@ package body Exp_Ch6 is
   pragma Assert (Is_Build_In_Place_Function (Func_Id));
   Func_Typ : constant Entity_Id := Underlying_Type (Etype (Func_Id));
begin
-  return Returns_On_Secondary_Stack (Func_Typ);
+  return Needs_Secondary_Stack (Func_Typ);
end Needs_BIP_Alloc_Form;
 
-


diff --git a/gcc/ada/exp_ch7.adb b/gcc/ada/exp_ch7.adb
--- a/gcc/ada/exp_ch7.adb
+++ b/gcc/ada/exp_ch7.adb
@@ -10312,7 +10312,7 @@ package body Exp_Ch7 is
  --  reclamation is done by the caller.
 
  if Ekind (Curr_S) = E_Function
-   and then Returns_On_Secondary_Stack (Etype (Curr_S))
+   and then Needs_Secondary_Stack (Etype (Curr_S))
  then
 null;
 


diff --git a/gcc/ada/fe.h b/gcc/ada/fe.h
--- a/gcc/ada/fe.h
+++ b/gcc/ada/fe.h
@@ -297,15 +297,15 @@ extern Boolean Compile_Time_Known_Value	(Node_Id);
 #define First_Actual			sem_util__first_actual
 #define Is_Expression_Function		sem_util__is_expression_function
 #define Is_Variable_Size_Record 	sem_util__is_variable_size_record
+#define Needs_Secondary_Stack		sem_util__needs_secondary_stack
 #define Next_Actual			sem_util__next_actual
-#define Returns_On_Secondary_Stack	sem_util__returns_on_secondary_stack
 
 extern Entity_Id Defining_Entity		(Node_Id);
 extern Node_Id First_Actual			(Node_Id);
 extern Boolean Is_Expression_Function		(Entity_Id);
 extern Boolean Is_Variable_Size_Record 		(Entity_Id);
+extern Boolean Needs_Secondary_Stack		(Entity_Id);
 extern Node_Id Next_Actual			(Node_Id);
-extern Boolean Returns_On_Secondary_Stack	(Entity_Id);
 
 /* sinfo: */
 


diff --git a/gcc/ada/gcc-interface/decl.cc b/gcc/ada/gcc-interface/decl.cc
--- a/gcc/ada/gcc-interface/decl.cc
+++ b/gcc/ada/gcc-interface/decl.cc
@@ -5864,7 +5864,7 @@ gnat_to_gnu_subprog_type (Entity_Id gnat_subprog, bool definition,
 	}
 
   /* This is for the other types returned on the secondary stack.  */
-  else if (Returns_On_Secondary_Stack (gnat_return_type))
+  else if (Needs_Secondary_Stack (gnat_return_type))
 	{
 	  gnu_return_type = build_reference_type (gnu_return_type);
 	  return_unconstrained_p = true;


diff --git a/gcc/ada/sem_res.adb b/gcc/ada/sem_res.adb
--- a/gcc/ada/sem_res.adb
+++ b/gcc/ada/sem_res.adb
@@ -6959,8 +6959,7 @@ package body Sem_Res is
 and then Requires_Transient_Scope (Etype (Nam))
 and then not 

[Ada] Fix missing space in error message

2022-06-01 Thread Pierre-Marie de Rodat via Gcc-patches
On illegal code like:

   type T is new Positive in range 1..5;

the compiler was emitting message:

  error: extra "in"ignored
  ^^

which lacked a space character.

A tiny diagnostic improvement; spotted while mistakenly typing an
illegal test.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* par-util.adb (Ignore): Add missing space to message string.

diff --git a/gcc/ada/par-util.adb b/gcc/ada/par-util.adb
--- a/gcc/ada/par-util.adb
+++ b/gcc/ada/par-util.adb
@@ -462,7 +462,7 @@ package body Util is
 declare
Tname : constant String := Token_Type'Image (Token);
 begin
-   Error_Msg_SC ("|extra " & Tname (5 .. Tname'Last) & "ignored");
+   Error_Msg_SC ("|extra " & Tname (5 .. Tname'Last) & " ignored");
 end;
  end if;
 




[Ada] Combine system.ads files - arm and aarch64 qnx

2022-06-01 Thread Pierre-Marie de Rodat via Gcc-patches
Systematize the Word_Size and Memory_Size declarations rather than
hard-coding numerical values or the OS-specific Long_Integer size.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/system-qnx-arm.ads (Memory_Size): Compute based on
Word_Size.

diff --git a/gcc/ada/libgnat/system-qnx-arm.ads b/gcc/ada/libgnat/system-qnx-arm.ads
--- a/gcc/ada/libgnat/system-qnx-arm.ads
+++ b/gcc/ada/libgnat/system-qnx-arm.ads
@@ -70,7 +70,7 @@ package System is
 
Storage_Unit : constant := 8;
Word_Size: constant := Standard'Word_Size;
-   Memory_Size  : constant := 2 ** Long_Integer'Size;
+   Memory_Size  : constant := 2 ** Word_Size;
 
--  Address comparison
 




[Ada] Combine system.ads file - vxworks7 kernel constants.

2022-06-01 Thread Pierre-Marie de Rodat via Gcc-patches
Systematize the Word_Size and Memory_Size declarations rather than
hard-coding numerical values or the OS-specific Long_Integer size.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/system-vxworks7-aarch64.ads (Word_Size): Compute
based on Standard'Word_Size. (Memory_Size): Compute based
on Word_Size.
* libgnat/system-vxworks7-arm.ads: Likewise.
* libgnat/system-vxworks7-e500-kernel.ads: Likewise.
* libgnat/system-vxworks7-ppc-kernel.ads: Likewise.
* libgnat/system-vxworks7-ppc64-kernel.ads: Likewise.
* libgnat/system-vxworks7-x86-kernel.ads: Likewise.
* libgnat/system-vxworks7-x86_64-kernel.ads: Likewise.

diff --git a/gcc/ada/libgnat/system-vxworks7-aarch64.ads b/gcc/ada/libgnat/system-vxworks7-aarch64.ads
--- a/gcc/ada/libgnat/system-vxworks7-aarch64.ads
+++ b/gcc/ada/libgnat/system-vxworks7-aarch64.ads
@@ -71,8 +71,8 @@ package System is
Null_Address : constant Address;
 
Storage_Unit : constant := 8;
-   Word_Size: constant := 64;
-   Memory_Size  : constant := 2 ** 64;
+   Word_Size: constant := Standard'Word_Size;
+   Memory_Size  : constant := 2 ** Word_Size;
 
--  Address comparison
 


diff --git a/gcc/ada/libgnat/system-vxworks7-arm.ads b/gcc/ada/libgnat/system-vxworks7-arm.ads
--- a/gcc/ada/libgnat/system-vxworks7-arm.ads
+++ b/gcc/ada/libgnat/system-vxworks7-arm.ads
@@ -69,8 +69,8 @@ package System is
Null_Address : constant Address;
 
Storage_Unit : constant := 8;
-   Word_Size: constant := 32;
-   Memory_Size  : constant := 2 ** 32;
+   Word_Size: constant := Standard'Word_Size;
+   Memory_Size  : constant := 2 ** Word_Size;
 
--  Address comparison
 


diff --git a/gcc/ada/libgnat/system-vxworks7-e500-kernel.ads b/gcc/ada/libgnat/system-vxworks7-e500-kernel.ads
--- a/gcc/ada/libgnat/system-vxworks7-e500-kernel.ads
+++ b/gcc/ada/libgnat/system-vxworks7-e500-kernel.ads
@@ -69,8 +69,8 @@ package System is
Null_Address : constant Address;
 
Storage_Unit : constant := 8;
-   Word_Size: constant := 32;
-   Memory_Size  : constant := 2 ** 32;
+   Word_Size: constant := Standard'Word_Size;
+   Memory_Size  : constant := 2 ** Word_Size;
 
--  Address comparison
 


diff --git a/gcc/ada/libgnat/system-vxworks7-ppc-kernel.ads b/gcc/ada/libgnat/system-vxworks7-ppc-kernel.ads
--- a/gcc/ada/libgnat/system-vxworks7-ppc-kernel.ads
+++ b/gcc/ada/libgnat/system-vxworks7-ppc-kernel.ads
@@ -69,8 +69,8 @@ package System is
Null_Address : constant Address;
 
Storage_Unit : constant := 8;
-   Word_Size: constant := 32;
-   Memory_Size  : constant := 2 ** 32;
+   Word_Size: constant := Standard'Word_Size;
+   Memory_Size  : constant := 2 ** Word_Size;
 
--  Address comparison
 


diff --git a/gcc/ada/libgnat/system-vxworks7-ppc64-kernel.ads b/gcc/ada/libgnat/system-vxworks7-ppc64-kernel.ads
--- a/gcc/ada/libgnat/system-vxworks7-ppc64-kernel.ads
+++ b/gcc/ada/libgnat/system-vxworks7-ppc64-kernel.ads
@@ -71,8 +71,8 @@ package System is
Null_Address : constant Address;
 
Storage_Unit : constant := 8;
-   Word_Size: constant := 64;
-   Memory_Size  : constant := 2 ** 64;
+   Word_Size: constant := Standard'Word_Size;
+   Memory_Size  : constant := 2 ** Word_Size;
 
--  Address comparison
 


diff --git a/gcc/ada/libgnat/system-vxworks7-x86-kernel.ads b/gcc/ada/libgnat/system-vxworks7-x86-kernel.ads
--- a/gcc/ada/libgnat/system-vxworks7-x86-kernel.ads
+++ b/gcc/ada/libgnat/system-vxworks7-x86-kernel.ads
@@ -69,8 +69,8 @@ package System is
Null_Address : constant Address;
 
Storage_Unit : constant := 8;
-   Word_Size: constant := 32;
-   Memory_Size  : constant := 2 ** 32;
+   Word_Size: constant := Standard'Word_Size;
+   Memory_Size  : constant := 2 ** Word_Size;
 
--  Address comparison
 


diff --git a/gcc/ada/libgnat/system-vxworks7-x86_64-kernel.ads b/gcc/ada/libgnat/system-vxworks7-x86_64-kernel.ads
--- a/gcc/ada/libgnat/system-vxworks7-x86_64-kernel.ads
+++ b/gcc/ada/libgnat/system-vxworks7-x86_64-kernel.ads
@@ -69,8 +69,8 @@ package System is
Null_Address : constant Address;
 
Storage_Unit : constant := 8;
-   Word_Size: constant := 64;
-   Memory_Size  : constant := 2 ** 64;
+   Word_Size: constant := Standard'Word_Size;
+   Memory_Size  : constant := 2 ** Word_Size;
 
--  Address comparison
 




[Ada] Allow confirming volatile properties on No_Caching variables

2022-06-01 Thread Pierre-Marie de Rodat via Gcc-patches
Volatile variables marked with the No_Caching aspect can now have
confirming aspects for other volatile properties, with a value of
False.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* contracts.adb (Check_Type_Or_Object_External_Properties): Check
the validity of combinations only when No_Caching is not used.
* sem_prag.adb (Analyze_External_Property_In_Decl_Part): Check
valid combinations with No_Caching.

diff --git a/gcc/ada/contracts.adb b/gcc/ada/contracts.adb
--- a/gcc/ada/contracts.adb
+++ b/gcc/ada/contracts.adb
@@ -892,9 +892,15 @@ package body Contracts is
  end;
   end if;
 
-  --  Verify the mutual interaction of the various external properties
-
-  if Seen then
+  --  Verify the mutual interaction of the various external properties.
+  --  For variables for which No_Caching is enabled, it has been checked
+  --  already that only False values for other external properties are
+  --  allowed.
+
+  if Seen
+and then (Ekind (Type_Or_Obj_Id) /= E_Variable
+   or else not No_Caching_Enabled (Type_Or_Obj_Id))
+  then
  Check_External_Properties
(Type_Or_Obj_Id, AR_Val, AW_Val, ER_Val, EW_Val);
   end if;


diff --git a/gcc/ada/sem_prag.adb b/gcc/ada/sem_prag.adb
--- a/gcc/ada/sem_prag.adb
+++ b/gcc/ada/sem_prag.adb
@@ -2139,11 +2139,24 @@ package body Sem_Prag is
   Expr : Node_Id;
 
begin
-  --  Do not analyze the pragma multiple times, but set the output
-  --  parameter to the argument specified by the pragma.
+  --  Ensure that the Boolean expression (if present) is static. A missing
+  --  argument defaults the value to True (SPARK RM 7.1.2(5)).
+
+  Expr_Val := True;
+
+  if Present (Arg1) then
+ Expr := Get_Pragma_Arg (Arg1);
+
+ if Is_OK_Static_Expression (Expr) then
+Expr_Val := Is_True (Expr_Value (Expr));
+ end if;
+  end if;
+
+  --  The output parameter was set to the argument specified by the pragma.
+  --  Do not analyze the pragma multiple times.
 
   if Is_Analyzed_Pragma (N) then
- goto Leave;
+ return;
   end if;
 
   Error_Msg_Name_1 := Pragma_Name (N);
@@ -2163,9 +2176,11 @@ package body Sem_Prag is
  if Ekind (Obj_Id) = E_Variable
and then No_Caching_Enabled (Obj_Id)
  then
-SPARK_Msg_N
-  ("illegal combination of external property % and property "
-   & """No_Caching"" (SPARK RM 7.1.2(6))", N);
+if Expr_Val then  --  Confirming value of False is allowed
+   SPARK_Msg_N
+ ("illegal combination of external property % and property "
+  & """No_Caching"" (SPARK RM 7.1.2(6))", N);
+end if;
  else
 SPARK_Msg_N
   ("external property % must apply to a volatile type or object",
@@ -2185,22 +2200,6 @@ package body Sem_Prag is
   end if;
 
   Set_Is_Analyzed_Pragma (N);
-
-  <<Leave>>
-
-  --  Ensure that the Boolean expression (if present) is static. A missing
-  --  argument defaults the value to True (SPARK RM 7.1.2(5)).
-
-  Expr_Val := True;
-
-  if Present (Arg1) then
- Expr := Get_Pragma_Arg (Arg1);
-
- if Is_OK_Static_Expression (Expr) then
-Expr_Val := Is_True (Expr_Value (Expr));
- end if;
-  end if;
-
end Analyze_External_Property_In_Decl_Part;
 
-




Re: [PATCH] expr.cc: Optimize if char array initialization consists of all zeros

2022-06-01 Thread Richard Biener via Gcc-patches
On Tue, May 31, 2022 at 5:37 AM Takayuki 'January June' Suwa via
Gcc-patches  wrote:
>
> Hi all,
>
> In some targets, initialization code for char array may be split into two
> parts even if the initialization consists of all zeros:
>
> /* example */
> extern void foo(char*);
> void test(void) {
>char a[256] = { 0, 0, 0, 0, 0, 0, 0, 0, 0 };
>foo(a);
> }
>
> ;; Xtensa (xtensa-lx106)
> .LC0:
> .string ""
> .string ""
> .string ""
> .string ""
> .string ""
> .string ""
> .string ""
> .string ""
> .string ""
> .string ""
> .zero   246
> test:
> movi    a9, 0x110
> sub sp, sp, a9
> l32r    a3, .LC1
> movi.n  a4, 0xa
> mov.n   a2, sp
> s32i    a0, sp, 268
> call0   memcpy
> movi    a4, 0xf6
> movi.n  a3, 0
> addi.n  a2, sp, 10
> call0   memset
> mov.n   a2, sp
> call0   foo
> l32i    a0, sp, 268
> movi    a9, 0x110
> add.n   sp, sp, a9
> ret.n
>
> ;; H8/300 (-mh -mint32)
> .LC0:
> .string ""
> .string ""
> .string ""
> .string ""
> .string ""
> .string ""
> .string ""
> .string ""
> .string ""
> .string ""
> .zero   246
> _test:
> sub.l   #256,er7
> sub.l   er2,er2
> add.b   #10,r2l
> mov.l   #.LC0,er1
> mov.l   er7,er0
> jsr @_memcpy
> sub.l   er2,er2
> add.b   #246,r2l
> sub.l   er1,er1
> sub.l   er0,er0
> add.b   #10,r0l
> add.l   er7,er0
> jsr @_memset
> mov.l   er7,er0
> jsr @_foo
> add.l   #256,er7
> rts
>
> i386 target (both 32 and 64bit) does not show such behavior.
>
>  gcc/ChangeLog:
>
> * expr.cc (store_expr): Add check if the initialization content
> consists of all zeros.
> ---
>   gcc/expr.cc | 7 +++
>   1 file changed, 7 insertions(+)
>
> diff --git a/gcc/expr.cc b/gcc/expr.cc
> index 7197996cec7..f94310dc7b9 100644
> --- a/gcc/expr.cc
> +++ b/gcc/expr.cc
> @@ -6015,6 +6015,7 @@ store_expr (tree exp, rtx target, int call_param_p,
> rtx dest_mem;
> tree str = TREE_CODE (exp) == STRING_CST
>  ? exp : TREE_OPERAND (TREE_OPERAND (exp, 0), 0);
> +  char ch;
>
> exp_len = int_expr_size (exp);
> if (exp_len <= 0)
> @@ -6032,6 +6033,12 @@ store_expr (tree exp, rtx target, int call_param_p,
> }
>
> str_copy_len = TREE_STRING_LENGTH (str);
> +  /* If str contains only zeroes, no need to store to target.  */
> +  ch = 0;
> +  for (HOST_WIDE_INT i = 0; i < str_copy_len; ++i)
> +   ch |= TREE_STRING_POINTER (str)[i];
> +  if (ch == 0)
> +   str_copy_len = 0;

I'm not sure I've deciphered the current code correctly, but maybe we
want to prune trailing \0 bytes from the end of str_copy_len rather than
just special-casing all-zero initializers?
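A minimal sketch of the pruning idea, written as a standalone C++ helper rather than as a change to store_expr.  The name prune_trailing_zeros is hypothetical and is not a GCC function; it only illustrates that dropping trailing zero bytes covers the all-zero case as well:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not part of GCC: return the length of BUF with
// trailing zero bytes dropped.  An all-zero buffer yields 0, so the
// all-zero special case falls out of the general rule.
std::size_t
prune_trailing_zeros (const char *buf, std::size_t len)
{
  assert (buf != nullptr || len == 0);
  while (len > 0 && buf[len - 1] == 0)
    --len;
  return len;
}
```

Only zeros at the end are pruned; a zero byte followed by a non-zero byte still counts towards the returned length.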

> if ((STORE_MAX_PIECES & (STORE_MAX_PIECES - 1)) == 0)
> {
>   str_copy_len += STORE_MAX_PIECES - 1;
> --
> 2.20.1


Re: [PATCH 1/5] Implement abstract vrange class.

2022-06-01 Thread Aldy Hernandez via Gcc-patches
Final patch committed.

Includes support for class unsupported_range.

Re-tested on x86-64 Linux.

On Mon, May 30, 2022 at 3:28 PM Aldy Hernandez  wrote:
>
> This is a series of patches making ranger type agnostic in preparation
> for contributing support for other types of ranges (pointers and
> floats initially).
>
> The first step in this process is to implement vrange, an abstract
> class that will be exclusively used by ranger, and from which all
> ranges will inherit.  Vrange provides the minimum operations for
> ranger to work.  The current virtual methods are what we've used to
> implement frange (floats) and prange (pointers), but we may restrict
> the virtual methods further as other ranges come about
> (i.e. set_nonnegative() has no meaning for a future string range).
>
> This patchset also provides a mechanism for declaring local type
> agnostic ranges that can transparently hold an irange, frange,
> prange's, etc, and a dispatch mechanism for range-ops to work with
> various range types.  More details in the relevant patches.
>
> FUTURE PLAN
> ===
>
> The plan after this is to contribute a bare bones implementation for
> floats (frange) that will provide relationals, followed by a
> separation of integers and pointers (irange and prange).  Once this is
> in place, we can further enhance both floats and pointers.  For
> example, pointer tracking, pointer plus optimizations, and keeping
> track of NaN's, etc.
>
> Once frange and prange come live, all ranger clients will immediately
> benefit from these enhancements.  For instance, in our local branch,
> the threader is already float aware with regards to relationals.
>
> We expect to wait a few weeks before starting to contribute further
> enhancements to give the tree time to stabilize, and Andrew time to
> rebase his upcoming patches  :-P.
>
> NOTES
> =
>
> In discussions with Andrew, it has become clear that with vrange
> coming about, supports_type_p() is somewhat ambiguous.  Prior to
> vrange it has been used to (a) determine if a type is supported by
> ranger, (b) as a short-cut for checking if a type is pointer or integer,
> as well as (c) to see if a given range can hold a type.  These things
> have had the same meaning in irange, but are slightly different with
> vrange.  I will address this in a follow-up patch.
>
> Speaking of supported types, we now provide an unsupported_range
> for passing around ranges for unsupported types. We've been silently
> doing this for a while, in both vr-values by creating VARYING for
> unsupported types with error_mark_node end points, and in ranger when
> we pass an unsupported range before we realize in range_of_expr that
> it's unsupported.  This class just formalizes what we've already been
> doing in an irange, but making it explicit that you can't do anything
> with these ranges except pass them.  Any other operation traps.
>
> There is no GTY support for vrange yet, as we don't store it long
> term.  When we contribute support for global ranges (think
> SSA_NAME_RANGE_INFO but for generic ranges), we will include it.  There
> was just no need to pollute this patchset with it.
>
> TESTING
> ===
>
> The patchset has been tested on x86-64 Linux as well as ppc64 Linux.
> I have also verified that we fold the same number of conditionals in
> evrp as well as thread the same number of paths.  There should be no
> user visible changes.
>
> We have also benchmarked the work, with the final numbers being an
> *improvement* of 1.92% for evrp, and 0.82% for VRP.  Overall
> compilation has a minuscule improvement.  This is despite the extra
> indirection level.
>
> The improvements are mostly because of small cleanups required for the
> generalization of ranges.  As a sanity check, I stuck kcachegrind on a
> few sample .ii files to see where the time was being gained.  Most of
> the gain came from gimple_range_global() being 19% faster.  This
> function is called a lot, and it was constructing a legacy
> value_range, then returning it by value, which the caller then had to
> convert to an irange.  This is in line with other pending work:
> anytime we get rid of legacy, we gain time.
>
> I will wait a few days before committing to welcome any comments.
>
> gcc/ChangeLog:
>
> * value-range-equiv.cc (value_range_equiv::set): New.
> * value-range-equiv.h (class value_range_equiv): Make set method
> virtual.
> Remove default bitmap argument from set method.
> * value-range.cc (vrange::contains_p): New.
> (vrange::singleton_p): New.
> (vrange::operator=): New.
> (vrange::operator==): New.
> (irange::fits_p): Move to .cc file.
> (irange::set_nonnegative): New.
> (unsupported_range::unsupported_range): New.
> (unsupported_range::set): New.
> (unsupported_range::type): New.
> (unsupported_range::set_undefined): New.
> (unsupported_range::set_varying): New.
> (unsupported_ran

Re: [PATCH 2/5] Implement generic range temporaries.

2022-06-01 Thread Aldy Hernandez via Gcc-patches
Final patch committed.

tmp_range has been renamed to Value_Range.

Value_Range::init() has been renamed to set_type() which is more obvious.

Default constructor for Value_Range now points the vrange to the
unsupported_range object, so we always have an object available.

Re-tested on x86-64 Linux.

On Tue, May 31, 2022 at 8:21 AM Aldy Hernandez  wrote:
>
> On Mon, May 30, 2022 at 4:56 PM Andrew MacLeod  wrote:
> >
> > On 5/30/22 09:27, Aldy Hernandez wrote:
> > > Now that we have generic ranges, we need a way to define generic local
> > > temporaries on the stack for intermediate calculations in the ranger
> > > and elsewhere.  We need temporaries analogous to int_range_max, but
> > > for any of the supported types (currently just integers, but soon
> > > integers, pointers, and floats).
> > >
> > > The tmp_range object is such a temporary.  It is designed to be
> > > transparently used as a vrange.  It shares vrange's abstract API, and
> > > implicitly casts itself to a vrange when passed around.
> > >
> > > The ultimate name will be value_range, but we need to remove legacy
> > > first for that to happen.  Until then, tmp_range will do.
> > >
> > I was going to suggest maybe renaming value_range to legacy_range or
> > something, and then start using value_range for ranges of any type.
> > Then it occurred to me that numerous places which use value_range
> > will/can continue to use value_range going forward.. ie
> >
> > value_range vr;
> >   if (!rvals->range_of_expr (vr, name, stmt))
> > return -1;
> >
> > would be unaffected, so it would be pointless turmoil to rename that and
> > then rename it back to value_range.
> >
> > I also notice there are already a few instances of local variables named
> > tmp_range, which make name renames annoying.   Perhaps we should use
> > Value_Range or something like that in the interim for the multi-type
> > ranges?   Then the rename is trivial down the road, formatting will be
> > unaffected, and then we're kinda sorta using the end_goal name?
>
> OMG that is so ugly!  Although I guess it would be temporary.
>
> Speaking of which, how far away are we from enabling ranger in VRP1?
> Because once we do that, we can start nuking legacy and cleaning all
> this up.
>
> Aldy
From 59c8e96dd02383baec4c15665985da3caadaaa5e Mon Sep 17 00:00:00 2001
From: Aldy Hernandez 
Date: Mon, 14 Mar 2022 13:27:36 +0100
Subject: [PATCH] Implement generic range temporaries.

Now that we have generic ranges, we need a way to define generic local
temporaries on the stack for intermediate calculations in the ranger
and elsewhere.  We need temporaries analogous to int_range_max, but
for any of the supported types (currently just integers, but soon
integers, pointers, and floats).

The Value_Range object is such a temporary.  It is designed to be
transparently used as a vrange.  It shares vrange's abstract API, and
implicitly casts itself to a vrange when passed around.

The ultimate name will be value_range, but we need to remove legacy
first for that to happen.  Until then, Value_Range will do.

Sample usage is as follows.  Instead of:

	extern void foo (vrange &);

	int_range_max t;
	t.set_nonzero (type);
	foo (t);

one does:

	Value_Range t (type);
	t.set_nonzero (type);
	foo (t);

You can also delay initialization, for use in loops for example:

	Value_Range t;
	...
	t.set_type (type);
	t.set_varying (type);

Creating a range of an unsupported type results in an unsupported_range
object, which will trap if anything but set_undefined() and
undefined_p() is called on it.  There's no size penalty for the
unsupported_range, since it's immutable and can be shared across
instances.
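
The arrangement described above can be pictured with a toy model.  This is only a sketch under stated assumptions: every toy_* name is hypothetical, none of it is the GCC implementation, and a thrown exception stands in for the trap:

```cpp
#include <cassert>
#include <stdexcept>

// Toy stand-ins for illustration only; none of these are the GCC classes.
enum class toy_kind { integer, unsupported };

struct toy_vrange
{
  virtual ~toy_vrange () = default;
  virtual void set_varying () = 0;
  virtual void set_undefined () = 0;
  virtual bool varying_p () const = 0;
  virtual bool undefined_p () const = 0;
};

struct toy_irange : toy_vrange
{
  void set_varying () override { m_varying = true; }
  void set_undefined () override { m_varying = false; }
  bool varying_p () const override { return m_varying; }
  bool undefined_p () const override { return !m_varying; }
private:
  bool m_varying = false;
};

// Stateless, so a single instance can be shared by every temporary.
// A thrown exception models the trap.
struct toy_unsupported : toy_vrange
{
  void set_varying () override { throw std::logic_error ("unsupported"); }
  void set_undefined () override {}
  bool varying_p () const override { throw std::logic_error ("unsupported"); }
  bool undefined_p () const override { return true; }
};

class toy_value_range
{
public:
  toy_value_range () : m_range (&shared_unsupported ()) {}
  explicit toy_value_range (toy_kind k) : m_range (nullptr) { set_type (k); }
  toy_value_range (const toy_value_range &) = delete;
  toy_value_range &operator= (const toy_value_range &) = delete;

  // Delayed initialization: pick the concrete range for K.
  void set_type (toy_kind k)
  {
    if (k == toy_kind::integer)
      m_range = &m_int;
    else
      m_range = &shared_unsupported ();
  }

  // Implicit conversion so the temporary can be passed as a toy_vrange.
  operator toy_vrange & ()
  {
    assert (m_range != nullptr);
    return *m_range;
  }

private:
  static toy_vrange &shared_unsupported ()
  {
    static toy_unsupported u;
    return u;
  }

  toy_vrange *m_range;
  toy_irange m_int;
};

// A consumer that only knows about the abstract interface.
inline bool
make_varying (toy_vrange &r)
{
  r.set_varying ();
  return r.varying_p ();
}
```

In this model a default-constructed toy_value_range always points at the shared unsupported object, so callers never see a null range, and only the supported kind pays for per-object storage.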

Since supports_type_p() is called at construction time for each
temporary, I've removed the non-zero check from this function, which
was mostly unneeded.  I fixed the handful of callers that were
passing null types, and in the process sped things up a bit.

As more range types come about, the Value_Range class will be augmented
to support them by adding the relevant bits in the initialization
code, etc.

Tested on x86-64 & ppc64le Linux.

gcc/ChangeLog:

	* gimple-range-fold.h (gimple_range_type): Check type before
	calling supports_type_p.
	* gimple-range-path.cc (path_range_query::range_of_stmt): Same.
	* value-query.cc (range_query::get_tree_range): Same.
	* value-range.cc (Value_Range::lower_bound): New.
	(Value_Range::upper_bound): New.
	(Value_Range::dump): New.
	* value-range.h (class Value_Range): New.
	(irange::supports_type_p): Do not check if type is non-zero.
---
 gcc/gimple-range-fold.h  |   2 +-
 gcc/gimple-range-path.cc |   2 +-
 gcc/value-query.cc   |   3 +-
 gcc/value-range.cc   |  38 +++
 gcc/value-range.h| 135 ++-
 5 files changed, 174 insertions(+), 6 deletions(-)

diff --git a/gcc/gimple-range-fold.h b/gcc/gimple-range-fold.h
index 53a5bf85dd4..20cb73dabb9 100644
--- a/gcc/gimple-range-fold.h
+++ b/gcc/gimple-range-fold.h

Re: [PATCH 3/5] Convert range-op.* to vrange.

2022-06-01 Thread Aldy Hernandez via Gcc-patches
Final patch committed.

Re-tested on x86-64 Linux.

On Mon, May 30, 2022 at 3:28 PM Aldy Hernandez  wrote:
>
> This patch provides the infrastructure to make range-ops type agnostic.
>
> First, the range_op_handler function has been replaced with an object
> of the same name.  It's coded in such a way to minimize changes to the
> code base, and to encapsulate the dispatch code.
>
> Instead of:
>
> range_operator *op = range_op_handler (code, type);
> if (op)
>   op->fold_range (...);
>
> We now do:
> range_op_handler op (code, type);
> if (op)
>   op->fold_range (...);
>
> I've folded gimple_range_handler into the range_op_handler class,
> since it's also a query into the range operators.
>
> Instead of:
>
> range_operator *handler = gimple_range_handler (stmt);
>
> We now do:
>
> range_op_handler handler (stmt);
>
> This all has the added benefit of moving all the dispatch code into an
> independent class and avoid polluting range_operator (which we'll
> further split later when frange and prange come live).
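One way an object can replace the old pointer-returning function while call sites keep their shape is sketched below.  All toy_* names are hypothetical, and the operator-> that returns the object itself is an assumption about how such a drop-in object could be written, not a statement about the committed code:

```cpp
#include <cassert>

// Toy stand-ins; not the GCC classes.
enum toy_code { TOY_PLUS, TOY_UNKNOWN };

struct toy_operator
{
  virtual ~toy_operator () = default;
  virtual int fold (int a, int b) const = 0;
};

struct toy_plus : toy_operator
{
  int fold (int a, int b) const override { return a + b; }
};

// Dispatch lives in the handler object, not in toy_operator.
class toy_op_handler
{
public:
  explicit toy_op_handler (toy_code code) : m_op (lookup (code)) {}

  // Lets callers keep writing "if (op)".
  explicit operator bool () const { return m_op != nullptr; }

  // Lets callers keep writing "op->fold (...)" on an object.
  const toy_op_handler *operator-> () const { return this; }

  int fold (int a, int b) const
  {
    assert (m_op != nullptr);
    return m_op->fold (a, b);
  }

private:
  // Map a code to its operator, or null if there is none.
  static const toy_operator *lookup (toy_code code)
  {
    static const toy_plus plus;
    return code == TOY_PLUS ? &plus : nullptr;
  }

  const toy_operator *m_op;
};
```

With this shape, "toy_op_handler op (code); if (op) op->fold (...);" reads the same as the pointer-based version it replaces.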
>
> There's this annoying "using" keyword that's been added to each
> operator due to hiding rules in C++.  The issue is that we will have
> different virtual versions of fold_range() for each combination of
> operands.  For example:
>
> // Traditional binary op on irange's.
> fold_range (irange &lhs, const irange &op1, const irange &op2);
> // For POINTER_DIFF_EXPR:
> fold_range (irange &lhs, const prange &op1, const prange &op2);
> // Cast from irange to prange.
> fold_range (prange &lhs, const irange &op1, const irange &op2);
>
> Overloading virtuals when there are multiple same named methods causes
> hidden virtuals warnings from -Woverloaded-virtual, thus the using
> keyword.  An alternative would be to have different names:
> fold_range_III, fold_range_IPP, fold_range_PII, but that's uglier
> still.
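The hiding rule in question can be shown with a small standalone example.  The type names here are placeholders rather than the GCC classes; the point is only that overriding one overload in a derived class hides the other base overloads unless a using-declaration brings them back:

```cpp
#include <cassert>

// Placeholder operand types; not the GCC range classes.
struct irange_t {};
struct prange_t {};

struct base_operator
{
  virtual ~base_operator () = default;
  virtual int fold_range (const irange_t &) const { return 1; }
  virtual int fold_range (const prange_t &) const { return 2; }
};

struct derived_operator : base_operator
{
  // Without this line, overriding only one overload hides the other
  // one for calls made through derived_operator.
  using base_operator::fold_range;
  int fold_range (const irange_t &) const override { return 10; }
};

// Both overloads stay reachable through the derived type.
inline void
check_overloads ()
{
  derived_operator d;
  assert (d.fold_range (irange_t ()) == 10);
  assert (d.fold_range (prange_t ()) == 2);
}
```

If the using-declaration is removed, the prange_t call in check_overloads no longer compiles, because name lookup stops at the single overload declared in derived_operator.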
>
> Tested on x86-64 & ppc64le Linux.
>
> gcc/ChangeLog:
>
> * gimple-range-edge.cc (gimple_outgoing_range_stmt_p): Adjust for
> vrange and convert range_op_handler function calls to use the
> identically named object.
> * gimple-range-fold.cc (gimple_range_operand1): Same.
> (gimple_range_operand2): Same.
> (fold_using_range::fold_stmt): Same.
> (fold_using_range::range_of_range_op): Same.
> (fold_using_range::range_of_builtin_ubsan_call): Same.
> (fold_using_range::relation_fold_and_or): Same.
> (fur_source::register_outgoing_edges): Same.
> * gimple-range-fold.h (gimple_range_handler): Remove.
> * gimple-range-gori.cc (gimple_range_calc_op1): Adjust for vrange.
> (gimple_range_calc_op2): Same.
> (range_def_chain::get_def_chain): Same.
> (gori_compute::compute_operand_range): Same.
> (gori_compute::condexpr_adjust): Same.
> * gimple-range.cc (gimple_ranger::prefill_name): Same.
> (gimple_ranger::prefill_stmt_dependencies): Same.
> * range-op.cc (get_bool_state): Same.
> (class operator_equal): Add using clause.
> (class operator_not_equal): Same.
> (class operator_lt): Same.
> (class operator_le): Same.
> (class operator_gt): Same.
> (class operator_ge): Same.
> (class operator_plus): Same.
> (class operator_minus): Same.
> (class operator_mult): Same.
> (class operator_exact_divide): Same.
> (class operator_lshift): Same.
> (class operator_rshift): Same.
> (class operator_cast): Same.
> (class operator_logical_and): Same.
> (class operator_bitwise_and): Same.
> (class operator_logical_or): Same.
> (class operator_bitwise_or): Same.
> (class operator_bitwise_xor): Same.
> (class operator_trunc_mod): Same.
> (class operator_logical_not): Same.
> (class operator_bitwise_not): Same.
> (class operator_cst): Same.
> (class operator_identity): Same.
> (class operator_unknown): Same.
> (class operator_abs): Same.
> (class operator_negate): Same.
> (class operator_addr_expr): Same.
> (class pointer_or_operator): Same.
> (operator_plus::op1_range): Adjust for vrange.
> (operator_minus::op1_range): Same.
> (operator_mult::op1_range): Same.
> (operator_cast::op1_range): Same.
> (operator_bitwise_not::fold_range): Same.
> (operator_negate::fold_range): Same.
> (range_op_handler): Rename to...
> (get_handler): ...this.
> (range_op_handler::range_op_handler): New.
> (range_op_handler::fold_range): New.
> (range_op_handler::op1_range): New.
> (range_op_handler::op2_range): New.
> (range_op_handler::lhs_op1_relation): New.
> (range_op_handler::lhs_op2_relation): New.
> (range_op_handler::op1_op2_relation): New.
> 

Re: [PATCH 4/5] Revamp irange_allocator to handle vranges.

2022-06-01 Thread Aldy Hernandez via Gcc-patches
Final patch committed.

Re-tested on x86-64 Linux.

On Mon, May 30, 2022 at 3:28 PM Aldy Hernandez  wrote:
>
> This patch revamps the range allocator to handle generic vrange's.
> I've cleaned it up somewhat to make obvious the various things you
> can allocate with it.  I've also moved away from overloads into
> distinct names when appropriate.
>
> The various entry points are now:
>
>   // Allocate a range of TYPE.
>   vrange *alloc_vrange (tree type);
>   // Allocate a memory block of BYTES.
>   void *alloc (unsigned bytes);
>   // Return a clone of SRC.
>   template <typename T> T *clone (const T &src);
>
> It is now possible to allocate a clone of an irange, or any future
> range types:
>
>   irange *i = allocator.clone <irange> (some_irange);
>   frange *f = allocator.clone <frange> (some_frange);
>
> You can actually do so without the <>, but I find it clearer to
> specify the vrange type.
>
> So with it you can allocate a specific range type, or vrange, or a
> block of memory.
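A toy arena with the same kind of entry points, to show why the explicit template argument on clone is optional.  This is a sketch only: toy_allocator and toy_range are hypothetical names, and the storage strategy (one heap block per request) is chosen for brevity, not taken from GCC:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <type_traits>
#include <vector>

// Toy arena, not the GCC vrange_allocator: memory is released only
// when the arena itself is destroyed.
class toy_allocator
{
public:
  // Allocate a memory block of BYTES.
  void *alloc (std::size_t bytes)
  {
    assert (bytes > 0);
    m_blocks.push_back (std::make_unique<unsigned char[]> (bytes));
    return m_blocks.back ().get ();
  }

  // Return a clone of SRC that lives in the arena.  T is deduced from
  // the argument, so the explicit <T> at the call site is optional.
  template <typename T> T *clone (const T &src)
  {
    static_assert (std::is_trivially_destructible<T>::value,
                   "the arena never runs destructors");
    static_assert (alignof (T) <= alignof (std::max_align_t),
                   "over-aligned types are not handled");
    return new (alloc (sizeof (T))) T (src);
  }

private:
  std::vector<std::unique_ptr<unsigned char[]>> m_blocks;
};

// Placeholder range type for the example.
struct toy_range
{
  int lo;
  int hi;
};
```

Both "a.clone <toy_range> (r)" and "a.clone (r)" resolve to the same instantiation; spelling out the type is purely a readability choice.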
>
> I have rewritten the C style casts to C++ casts, since casts tend to
> be hints of problematic designs.  With the C++ casts you can at least
> grep for them more easily.  Speaking of which, the next patch, which converts
> ranger to vrange, will further clean this space by removing some
> unnecessary casts.
>
> Tested on x86-64 Linux and ppc64le Linux.
>
> * gimple-range-cache.cc (sbr_vector::sbr_vector): Adjust for
> vrange allocator.
> (sbr_vector::grow): Same.
> (sbr_vector::set_bb_range): Same.
> (sbr_sparse_bitmap::sbr_sparse_bitmap): Same.
> (sbr_sparse_bitmap::set_bb_range): Same.
> (block_range_cache::~block_range_cache): Same.
> (block_range_cache::set_bb_range): Same.
> (ssa_global_cache::ssa_global_cache): Same.
> (ssa_global_cache::~ssa_global_cache): Same.
> (ssa_global_cache::set_global_range): Same.
> * gimple-range-cache.h (block_range_cache): Same.
> (ssa_global_cache): Same.
> * gimple-range-edge.cc
> (gimple_outgoing_range::calc_switch_ranges): Same.
> * gimple-range-edge.h (gimple_outgoing_range): Same.
> * gimple-range-side-effect.cc (side_effect_manager::get_nonzero):
> Same.
> (side_effect_manager::add_range): Same.
> * gimple-range-side-effect.h (class side_effect_manager): Same.
> * value-range.h (class irange_allocator): Rename to...
> (class vrange_allocator): ...this.
> (irange_allocator::irange_allocator): New.
> (vrange_allocator::vrange_allocator): New.
> (irange_allocator::~irange_allocator): New.
> (vrange_allocator::~vrange_allocator): New.
> (irange_allocator::get_memory): Rename to...
> (vrange_allocator::alloc): ...this.
> (vrange_allocator::alloc_vrange): Rename from...
> (irange_allocator::allocate): ...this.
> (vrange_allocator::alloc_irange): New.
> ---
>  gcc/gimple-range-cache.cc   | 55 +++---
>  gcc/gimple-range-cache.h|  4 +-
>  gcc/gimple-range-edge.cc|  4 +-
>  gcc/gimple-range-edge.h |  2 +-
>  gcc/gimple-range-side-effect.cc | 13 --
>  gcc/gimple-range-side-effect.h  |  2 +-
>  gcc/value-range.h   | 82 +
>  7 files changed, 96 insertions(+), 66 deletions(-)
>
> diff --git a/gcc/gimple-range-cache.cc b/gcc/gimple-range-cache.cc
> index c726393b380..9c541993fb6 100644
> --- a/gcc/gimple-range-cache.cc
> +++ b/gcc/gimple-range-cache.cc
> @@ -75,7 +75,7 @@ ssa_block_ranges::dump (FILE *f)
>  class sbr_vector : public ssa_block_ranges
>  {
>  public:
> -  sbr_vector (tree t, irange_allocator *allocator);
> +  sbr_vector (tree t, vrange_allocator *allocator);
>
>virtual bool set_bb_range (const_basic_block bb, const irange &r) override;
>virtual bool get_bb_range (irange &r, const_basic_block bb) override;
> @@ -86,20 +86,21 @@ protected:
>int_range<2> m_varying;
>int_range<2> m_undefined;
>tree m_type;
> -  irange_allocator *m_irange_allocator;
> +  vrange_allocator *m_range_allocator;
>void grow ();
>  };
>
>
>  // Initialize a block cache for an ssa_name of type T.
>
> -sbr_vector::sbr_vector (tree t, irange_allocator *allocator)
> +sbr_vector::sbr_vector (tree t, vrange_allocator *allocator)
>  {
>gcc_checking_assert (TYPE_P (t));
>m_type = t;
> -  m_irange_allocator = allocator;
> +  m_range_allocator = allocator;
>m_tab_size = last_basic_block_for_fn (cfun) + 1;
> -  m_tab = (irange **)allocator->get_memory (m_tab_size * sizeof (irange *));
> +  m_tab = static_cast <irange **>
> +(allocator->alloc (m_tab_size * sizeof (irange *)));
>memset (m_tab, 0, m_tab_size * sizeof (irange *));
>
>// Create the cached type range.
> @@ -121,8 +122,8 @@ sbr_vector::grow ()
>int new_size = inc + curr_bb_size;
>
>// Allocate new memory, copy the old vector and clear the new space.
> -  irange **t = (irange **)m_irange_allocator->get_memo

[PATCH] tree-optimization/101668 - relax SLP of existing vectors

2022-06-01 Thread Richard Biener via Gcc-patches
This relaxes the conditions on SLPing extracts from existing vectors
leveraging the relaxed VEC_PERM conditions on the input vs output
vector type compatibility.  It also handles lowpart extracts
and concats without VEC_PERMs now.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Richard.

2022-05-25  Richard Biener  

PR tree-optimization/101668
* tree-vect-slp.cc (vect_build_slp_tree_1): Allow BIT_FIELD_REFs
for vector types with compatible lane types.
(vect_build_slp_tree_2): Deal with this.
(vect_add_slp_permutation): Adjust.  Emit lowpart/concat
special cases without VEC_PERM.
(vectorizable_slp_permutation): Select the operand vector
type and relax requirements.  Handle identity permutes
with mismatching operand types.
* optabs-query.cc (can_vec_perm_const_p): Only allow variable
permutes for op_mode == mode.

* gcc.target/i386/pr101668.c: New testcase.
* gcc.dg/vect/bb-slp-pr101668.c: Likewise.
---
 gcc/optabs-query.cc |  2 +-
 gcc/testsuite/gcc.dg/vect/bb-slp-pr101668.c | 59 +
 gcc/testsuite/gcc.target/i386/pr101668.c| 27 ++
 gcc/tree-vect-slp.cc| 98 +
 4 files changed, 167 insertions(+), 19 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/bb-slp-pr101668.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr101668.c

diff --git a/gcc/optabs-query.cc b/gcc/optabs-query.cc
index 809482b8092..44ce41e95e3 100644
--- a/gcc/optabs-query.cc
+++ b/gcc/optabs-query.cc
@@ -426,7 +426,7 @@ can_vec_perm_const_p (machine_mode mode, machine_mode op_mode,
 return false;
 
   /* It's probably cheaper to test for the variable case first.  */
-  if (allow_variable_p && selector_fits_mode_p (mode, sel))
+  if (op_mode == mode && allow_variable_p && selector_fits_mode_p (mode, sel))
 {
   if (direct_optab_handler (vec_perm_optab, mode) != CODE_FOR_nothing)
return true;
diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-pr101668.c b/gcc/testsuite/gcc.dg/vect/bb-slp-pr101668.c
new file mode 100644
index 000..eb44ad73657
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-pr101668.c
@@ -0,0 +1,59 @@
+/* { dg-do run } */
+/* { dg-additional-options "-w -Wno-psabi" } */
+
+#include "tree-vect.h"
+
+typedef int v4si __attribute__((vector_size(16)));
+typedef int v8si __attribute__((vector_size(32)));
+
+void __attribute__((noipa)) test_lo (v4si *dst, v8si src)
+{
+  (*dst)[0] = src[0];
+  (*dst)[1] = src[1];
+  (*dst)[2] = src[2];
+  (*dst)[3] = src[3];
+}
+
+void __attribute__((noipa)) test_hi (v4si *dst, v8si src)
+{
+  (*dst)[0] = src[4];
+  (*dst)[1] = src[5];
+  (*dst)[2] = src[6];
+  (*dst)[3] = src[7];
+}
+
+void __attribute__((noipa)) test_even (v4si *dst, v8si src)
+{
+  (*dst)[0] = src[0];
+  (*dst)[1] = src[2];
+  (*dst)[2] = src[4];
+  (*dst)[3] = src[6];
+}
+
+void __attribute__((noipa)) test_odd (v4si *dst, v8si src)
+{
+  (*dst)[0] = src[1];
+  (*dst)[1] = src[3];
+  (*dst)[2] = src[5];
+  (*dst)[3] = src[7];
+}
+
+int main()
+{
+  check_vect ();
+  v8si v = (v8si) { 0, 1, 2, 3, 4, 5, 6, 7 };
+  v4si dst;
+  test_lo (&dst, v);
+  if (dst[0] != 0 || dst[1] != 1 || dst[2] != 2 || dst[3] != 3)
+abort ();
+  test_hi (&dst, v);
+  if (dst[0] != 4 || dst[1] != 5 || dst[2] != 6 || dst[3] != 7)
+abort ();
+  test_even (&dst, v);
+  if (dst[0] != 0 || dst[1] != 2 || dst[2] != 4 || dst[3] != 6)
+abort ();
+  test_odd (&dst, v);
+  if (dst[0] != 1 || dst[1] != 3 || dst[2] != 5 || dst[3] != 7)
+abort ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr101668.c b/gcc/testsuite/gcc.target/i386/pr101668.c
new file mode 100644
index 000..07719ec03d9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr101668.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=skylake-avx512 -mprefer-vector-width=512" } */
+
+typedef int v16si __attribute__((vector_size (64)));
+typedef long long v8di __attribute__((vector_size (64)));
+
+void
+bar_s32_s64 (v8di * dst, v16si src)
+{
+  long long tem[8];
+  tem[0] = src[0];
+  tem[1] = src[1];
+  tem[2] = src[2];
+  tem[3] = src[3];
+  tem[4] = src[4];
+  tem[5] = src[5];
+  tem[6] = src[6];
+  tem[7] = src[7];
+  dst[0] = *(v8di *) tem;
+}
+
+/* We want to generate
+vpmovsxdq   %ymm0, %zmm0
+vmovdqa64   %zmm0, (%rdi)
+ret
+ */
+/* { dg-final { scan-assembler "vpmovsxdq" } } */
diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index fe9361c338e..e46c787d7c9 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -1086,8 +1086,13 @@ vect_build_slp_tree_1 (vec_info *vinfo, unsigned char *swap,
  tree vec = TREE_OPERAND (gimple_assign_rhs1 (stmt), 0);
  if (!is_a  (vinfo)
  || TREE_CODE (vec) != SSA_NAME
- || !operand_equal_p (TYPE_SIZE (vectype),
-  TYPE_SIZE (TREE_T

Re: [PATCH] [PR105665] ivopts: check defs of names in base for undefs

2022-06-01 Thread Richard Biener via Gcc-patches
On Tue, May 31, 2022 at 3:27 PM Alexandre Oliva  wrote:
>
> On May 30, 2022, Richard Biener  wrote:
>
> > I don't think you can rely on TREE_VISITED not set at the start of the
> > pass (and you don't clear it either).
>
> I don't clear it, but I loop over all SSA names and set TREE_VISITED to
> either true or false, so that's covered.
>
> I even had a test patch that checked that TREE_VISITED remains unchanged
> and still matched the expected value, with a recursive verification.
>
> I could switch to an sbitmap if that's preferred, though.
>
> > I also wonder how you decide that tracking PHIs with (one) uninit arg
> > is "enough"?
>
> It's a conservative assumption, granted.  One could imagine cases in
> which an uninit def is never actually used, say because of conditionals
> forced by external circumstances the compiler cannot possibly know
> about.  But then, just as this sort of bug shows, sometimes even when an
> uninit is not actually used, the fact that it is uninit and thus
> undefined may end up percolating onto stuff that is actually used, so I
> figured we'd be better off leaving alone whatever is potentially derived
> from an uninit value.
>
> > Is it important which edge of the PHI the undef appears in?
>
> At some point, I added recursion to find_ssa_undef, at PHI nodes and
> assignments, and pondered whether to recurse at PHI nodes only for defs
> that were "earlier" ones, rather than coming from back edges.  I ended
> up reasoning that back edges would affect step and rule out an IV
> candidate even sooner.  But the forward propagation of maybe-undef
> obviated that reasoning.  Now, almost tautologically, if circumstances
> are such that the compiler could only tell that an ssa name is defined
> with external knowledge, then, since such external knowledge is not
> available to the compiler, it has to assume that the ssa name may be
> undefined.
>
> > I presume the testcase might have it on the loop entry edge?
>
> The original testcase did.  The modified one (the added increment) shows
> it can be an earlier edge that has the maybe-undef name.
>
> > I presume only PHIs in loop headers are to be considered?
>
> As in the modified testcase, earlier PHIs that are entirely outside the
> loop can still trigger the bug.  Adding more increments of g guarded by
> conditionals involving other global variables pushes the undef ssa name
> further and further away from the inner loop, while still rendering g an
> unsuitable IV.

So for example

void foo (int flag, int init, int n, int *p)
{
   int i;
   if (flag)
 i = init;
   for (; i < n; ++i)
 *(p++) = 1;
}

will now not consider 'i - i-on-entry' as an IV to express *(p++) = 1.
But

void foo (int flag, int init, int n, int *p)
{
   int i;
   if (flag)
 i = init;
   i++;
   for (; i < n; ++i)
 *(p++) = 1;
}

still would (the propagation stops at i++).  That said - I wonder if we
want to compute a conservative set of 'must-initialized' names here.  Or,
put differently, do we understand the failure mode well enough to say that
a must-def (even if from an SSA name without a definition) is good
enough to avoid the issues we are seeing?
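
As a rough illustration of the behaviour described in the two examples
above, here is a minimal sketch of forward maybe-undef propagation over a
toy SSA graph.  Everything in it (struct ssa_name, mark_maybe_undef, the
def_kind values) is hypothetical and not GCC code; it only assumes that
propagation goes through PHIs and stops at ordinary statements such as i++.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature of an SSA graph; none of these names exist in GCC.  */
enum def_kind
{
  DEF_NONE,   /* SSA name without a definition (uninitialized variable).  */
  DEF_VALUE,  /* Parameter or constant: always defined.  */
  DEF_PHI,    /* Merge of up to two incoming names.  */
  DEF_ARITH   /* Ordinary statement such as i + 1.  */
};

struct ssa_name
{
  enum def_kind kind;
  int op[2];  /* Operand indices into the array, -1 if unused.  */
};

/* Mark names that may be undefined.  Propagation goes forward through
   PHIs only and stops at ordinary statements.  */
static void
mark_maybe_undef (const struct ssa_name *names, size_t n, bool *maybe_undef)
{
  for (size_t i = 0; i < n; i++)
    maybe_undef[i] = names[i].kind == DEF_NONE;

  /* Iterate to a fixed point so PHIs fed by back edges are handled.  */
  for (bool changed = true; changed;)
    {
      changed = false;
      for (size_t i = 0; i < n; i++)
        {
          if (names[i].kind != DEF_PHI || maybe_undef[i])
            continue;
          for (int k = 0; k < 2; k++)
            if (names[i].op[k] >= 0 && maybe_undef[names[i].op[k]])
              {
                maybe_undef[i] = true;
                changed = true;
              }
        }
    }
}

/* First example above: the merge after 'if (flag)' feeds the loop PHI.  */
static void
example_first_loop (void)
{
  const struct ssa_name names[] = {
    { DEF_NONE,  { -1, -1 } },  /* 0: i, never assigned */
    { DEF_VALUE, { -1, -1 } },  /* 1: init */
    { DEF_PHI,   {  1,  0 } },  /* 2: i after 'if (flag)' */
    { DEF_PHI,   {  2,  4 } },  /* 3: loop header PHI */
    { DEF_ARITH, {  3, -1 } },  /* 4: ++i */
  };
  bool maybe_undef[5];
  mark_maybe_undef (names, 5, maybe_undef);
  assert (maybe_undef[3]);  /* The loop PHI is tainted.  */
}
```

In the second example an i++ sits between the merge and the loop, so under
the same assumption the loop header PHI would not be marked.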

One could argue that my example above invokes undefined behavior
if (!flag), but IIRC the cases in the PRs we are talking about do not;
rather, IVOPTs with its IV choice exposes undefined behavior - originally
by relying on undef - undef being zero.

That said, the contains-undef thing tries to avoid rewriting expressions
with terms that possibly contain undefs, which means that if we want to
strengthen it then we look for must-defined (currently it's must-undefined)?

Richard.

> >> +int a, b, c[1], d[2], *e = c;
> >> +int main() {
> >> +  int f = 0;
> >> +  for (; b < 2; b++) {
> >> +int g;
> >> +if (f)
> >> +  g++, b = 40;
> >> +a = d[b * b];
> >> +for (f = 0; f < 3; f++) {
> >> +  if (e)
> >> +break;
> >> +  g--;
> >> +  if (a)
> >> +a = g;
> >> +}
> >> +  }
> >> +  return 0;
> >> +}
>
> --
> Alexandre Oliva, happy hacker                https://FSFLA.org/blogs/lxo/
>    Free Software Activist                       GNU Toolchain Engineer
> Disinformation flourishes because many people care deeply about injustice
> but very few check the facts.  Ask me about 


RE: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns

2022-06-01 Thread Richard Biener via Gcc-patches
On Tue, 31 May 2022, Joel Hutton wrote:

> > Can you post an updated patch (after the .cc renaming, and code_helper
> > now already moved to tree.h).
> > 
> > Thanks,
> > Richard.
> 
> Patches attached. They already incorporated the .cc rename, now rebased to be 
> after the change to tree.h

@@ -1412,8 +1412,7 @@ vect_recog_widen_op_pattern (vec_info *vinfo,
   2, oprnd, half_type, unprom, vectype);

   tree var = vect_recog_temp_ssa_var (itype, NULL);
-  gimple *pattern_stmt = gimple_build_assign (var, wide_code,
- oprnd[0], oprnd[1]);
+  gimple *pattern_stmt = gimple_build (var, wide_code, oprnd[0], oprnd[1]);


you should be able to do without the new gimple_build overload
by using

   gimple_seq stmts = NULL;
   gimple_build (&stmts, wide_code, itype, oprnd[0], oprnd[1]);
   gimple *pattern_stmt = gimple_seq_last_stmt (stmts);

because 'gimple_build' is an existing API.


-  if (TREE_CODE (gimple_assign_lhs (stmt)) != SSA_NAME)
+  if (gimple_get_lhs (stmt) == NULL_TREE ||
+  TREE_CODE(gimple_get_lhs (stmt)) != SSA_NAME)
 return false;

The || should go on the next line, and there should be a space after TREE_CODE.

+  bool widen_arith = false;
+  gimple_match_op res_op;
+  if (!gimple_extract_op (stmt, &res_op))
+return false;
+  code = res_op.code;
+  op_type = res_op.num_ops;
+
+  if (is_gimple_assign (stmt))
+  {
+  widen_arith = (code == WIDEN_PLUS_EXPR
+|| code == WIDEN_MINUS_EXPR
+|| code == WIDEN_MULT_EXPR
+|| code == WIDEN_LSHIFT_EXPR);
+ }
+  else
+  widen_arith = gimple_call_flags (stmt) & ECF_WIDEN;

there seem to be formatting issues.  Also shouldn't you check
if (res_op.code.is_tree_code ()) instead of is_gimple_assign?
I also don't like the ECF_WIDEN "trick", just do as with the
tree codes and explicitly enumerate widening ifns here.

gimple_extract_op is a bit heavy-weight as well, so maybe
instead simply do

  if (is_gimple_assign (stmt))
{
  code = gimple_assign_rhs_code (stmt);
...
}
  else if (gimple_call_internal_p (stmt))
{
  code = gimple_call_internal_fn (stmt);
...
}
  else
return false;

+  code_helper c1=MAX_TREE_CODES, c2=MAX_TREE_CODES;

spaces before/after '='

@@ -12061,13 +12105,16 @@ supportable_widening_operation (vec_info *vinfo,
   if (BYTES_BIG_ENDIAN && c1 != VEC_WIDEN_MULT_EVEN_EXPR)
 std::swap (c1, c2);

+
   if (code == FIX_TRUNC_EXPR)
 {

unnecessary whitespace change.

diff --git a/gcc/tree.h b/gcc/tree.h
index f84958933d51144bb6ce7cc41eca5f7f06814550..e51e34c051d9b91d1c02a4b2fefdb2b15606a36f 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -92,6 +92,10 @@ public:
   bool is_fn_code () const { return rep < 0; }
   bool is_internal_fn () const;
   bool is_builtin_fn () const;
+  enum tree_code as_tree_code () const { return is_tree_code () ?
+(tree_code)* this : MAX_TREE_CODES; }
+  combined_fn as_fn_code () const { return is_fn_code () ? (combined_fn) *this
+: CFN_LAST;}

hmm, the other as_* functions we have are not member functions.
Also this semantically differs from the tree_code () conversion
operator (that one was supposed to be "cheap").  The existing
as_internal_fn for example is documented as being valid only if
the code is actually an internal fn.  I see you are introducing
the new function as convenience to get a "safe" not-a-X value,
so maybe they should be called safe_as_tree_code () instead?


   int get_rep () const { return rep; }
   bool operator== (const code_helper &other) { return rep == other.rep; }
   bool operator!= (const code_helper &other) { return rep != other.rep; }
@@ -6657,6 +6661,54 @@ extern unsigned fndecl_dealloc_argno (tree);
if nonnull, set the second argument to the referenced enclosing
object or pointer.  Otherwise return null.  */
 extern tree get_attr_nonstring_decl (tree, tree * = NULL);
+/* Helper to transparently allow tree codes and builtin function codes
+   exist in one storage entity.  */
+class code_helper
+{

duplicate add of code_helper.

Sorry to raise these issues so late.

Richard.


Re: [PATCH v3] RISC-V/testsuite: constraint some of tests to hard_float

2022-06-01 Thread Maciej W. Rozycki
On Thu, 26 May 2022, Vineet Gupta wrote:

> Commit 9ddd44b58649d1d ("RISC-V: Provide `fmin'/`fmax' RTL pattern") added
> tests which check for hard float instructions which obviously fails on
> soft-float ABI builds.

 Sorry to miss it and thank you for the fix!

  Maciej


[PATCH 8/12 V2] arm: Introduce multilibs for PACBTI target feature

2022-06-01 Thread Andrea Corallo via Gcc-patches
Hi all,

second iteration of the previous patch adding the following new
multilibs:

thumb/v8.1-m.main+pacbti/mbranch-protection/nofp
thumb/v8.1-m.main+pacbti+dp/mbranch-protection/soft
thumb/v8.1-m.main+pacbti+dp/mbranch-protection/hard
thumb/v8.1-m.main+pacbti+fp/mbranch-protection/soft
thumb/v8.1-m.main+pacbti+fp/mbranch-protection/hard
thumb/v8.1-m.main+pacbti+mve/mbranch-protection/hard

To trigger the following compiler flags:

-mthumb -march=armv8.1-m.main+pacbti -mbranch-protection=standard -mfloat-abi=soft
-mthumb -march=armv8.1-m.main+pacbti+fp -mbranch-protection=standard -mfloat-abi=softfp
-mthumb -march=armv8.1-m.main+pacbti+fp -mbranch-protection=standard -mfloat-abi=hard
-mthumb -march=armv8.1-m.main+pacbti+fp.dp -mbranch-protection=standard -mfloat-abi=softfp
-mthumb -march=armv8.1-m.main+pacbti+fp.dp -mbranch-protection=standard -mfloat-abi=hard
-mthumb -march=armv8.1-m.main+pacbti+mve -mbranch-protection=standard -mfloat-abi=hard

gcc/ChangeLog:

* config/arm/t-rmprofile: Add multilib rules for march +pacbti
  and mbranch-protection.

diff --git a/gcc/config/arm/t-rmprofile b/gcc/config/arm/t-rmprofile
index eb321e832f1..77147dde2ea 100644
--- a/gcc/config/arm/t-rmprofile
+++ b/gcc/config/arm/t-rmprofile
@@ -27,8 +27,11 @@
 
 # Arch and FPU variants to build libraries with
 
-MULTI_ARCH_OPTS_RM = march=armv6s-m/march=armv7-m/march=armv7e-m/march=armv7e-m+fp/march=armv7e-m+fp.dp/march=armv8-m.base/march=armv8-m.main/march=armv8-m.main+fp/march=armv8-m.main+fp.dp/march=armv8.1-m.main+mve
-MULTI_ARCH_DIRS_RM = v6-m v7-m v7e-m v7e-m+fp v7e-m+dp v8-m.base v8-m.main v8-m.main+fp v8-m.main+dp v8.1-m.main+mve
+MULTI_ARCH_OPTS_RM = march=armv6s-m/march=armv7-m/march=armv7e-m/march=armv7e-m+fp/march=armv7e-m+fp.dp/march=armv8-m.base/march=armv8-m.main/march=armv8-m.main+fp/march=armv8-m.main+fp.dp/march=armv8.1-m.main+mve/march=armv8.1-m.main+pacbti/march=armv8.1-m.main+pacbti+fp/march=armv8.1-m.main+pacbti+fp.dp/march=armv8.1-m.main+pacbti+mve
+MULTI_ARCH_DIRS_RM = v6-m v7-m v7e-m v7e-m+fp v7e-m+dp v8-m.base v8-m.main v8-m.main+fp v8-m.main+dp v8.1-m.main+mve v8.1-m.main+pacbti v8.1-m.main+pacbti+fp v8.1-m.main+pacbti+dp v8.1-m.main+pacbti+mve
+
+MULTI_ARCH_OPTS_RM += mbranch-protection=standard
+MULTI_ARCH_DIRS_RM += mbranch-protection
 
 # Base M-profile (no fp)
 MULTILIB_REQUIRED  += mthumb/march=armv6s-m/mfloat-abi=soft
@@ -50,6 +53,14 @@ MULTILIB_REQUIRED  += mthumb/march=armv8-m.main+fp.dp/mfloat-abi=hard
 MULTILIB_REQUIRED  += mthumb/march=armv8-m.main+fp.dp/mfloat-abi=softfp
 MULTILIB_REQUIRED  += mthumb/march=armv8.1-m.main+mve/mfloat-abi=hard
 
+MULTILIB_REQUIRED  += mthumb/march=armv8.1-m.main+pacbti/mbranch-protection=standard/mfloat-abi=soft
+MULTILIB_REQUIRED  += mthumb/march=armv8.1-m.main+pacbti+fp/mbranch-protection=standard/mfloat-abi=softfp
+MULTILIB_REQUIRED  += mthumb/march=armv8.1-m.main+pacbti+fp/mbranch-protection=standard/mfloat-abi=hard
+MULTILIB_REQUIRED  += mthumb/march=armv8.1-m.main+pacbti+fp.dp/mbranch-protection=standard/mfloat-abi=softfp
+MULTILIB_REQUIRED  += mthumb/march=armv8.1-m.main+pacbti+fp.dp/mbranch-protection=standard/mfloat-abi=hard
+MULTILIB_REQUIRED  += mthumb/march=armv8.1-m.main+pacbti+mve/mbranch-protection=standard/mfloat-abi=hard
+
+
 # Arch Matches
 MULTILIB_MATCHES   += march?armv6s-m=march?armv6-m
 
@@ -87,9 +98,19 @@ MULTILIB_MATCHES += $(foreach FP, $(v8_1m_sp_variants), \
 MULTILIB_MATCHES += $(foreach FP, $(v8_1m_dp_variants), \
 			 march?armv8-m.main+fp.dp=mlibarch?armv8.1-m.main$(FP))
 
+# Map all mbranch-protection values other than 'none' to 'standard'.
+MULTILIB_MATCHES   += mbranch-protection?standard=mbranch-protection?bti
+MULTILIB_MATCHES   += mbranch-protection?standard=mbranch-protection?pac-ret
+MULTILIB_MATCHES   += mbranch-protection?standard=mbranch-protection?pac-ret+leaf
+MULTILIB_MATCHES   += mbranch-protection?standard=mbranch-protection?pac-ret+bti
+MULTILIB_MATCHES   += mbranch-protection?standard=mbranch-protection?pac-ret+leaf+bti
+MULTILIB_MATCHES   += mbranch-protection?standard=mbranch-protection?bti+pac-ret
+MULTILIB_MATCHES   += mbranch-protection?standard=mbranch-protection?bti+pac-ret+leaf
+
 # For all the MULTILIB_REQUIRED for v8-m and above, add MULTILIB_MATCHES which
 # maps mlibarch with march for multilib linking.
 MULTILIB_MATCHES   += march?armv8-m.main=mlibarch?armv8-m.main
 MULTILIB_MATCHES   += march?armv8-m.main+fp=mlibarch?armv8-m.main+fp
 MULTILIB_MATCHES   += march?armv8-m.main+fp.dp=mlibarch?armv8-m.main+fp.dp
 MULTILIB_MATCHES   += march?armv8.1-m.main+mve=mlibarch?armv8.1-m.main+mve
+MULTILIB_MATCHES   += march?armv8.1-m.main+pacbti=mlibarch?armv8.1-m.main+pacbti


Re: [PATCH 0/12] arm: Enables return address verification and branch target identification on Cortex-M

2022-06-01 Thread Andrea Corallo via Gcc-patches
Andrea Corallo via Gcc-patches  writes:

> Hi all,
>
> this series enables return address verification and branch target
> identification based on Armv8.1-M Pointer Authentication and Branch
> Target Identification Extension [1] for Arm Cortex-M.
>
> This feature is controlled by the newly introduced '-mbranch-protection'
> option, contextually the Armv8.1-M Mainline target feature '+pacbti' is
> added.
>
> Best Regards
>
>   Andrea
>
> [1] 
> 

Hi all,

this is to ping this series.

I believe 3/12 7/12 8/12 9/12 10/12 12/12 are still pending for review.

Thanks!

  Andrea


[PATCH] tree-optimization/105786 - avoid strlen replacement for pointers

2022-06-01 Thread Richard Biener via Gcc-patches
This avoids matching strlen to a pointer result, avoiding ICEing
because of an integer adjustment using PLUS_EXPR on pointers.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2022-06-01  Richard Biener  

PR tree-optimization/105786
* tree-loop-distribution.cc
(loop_distribution::transform_reduction_loop): Only do strlen
replacement for integer type reductions.

* gcc.dg/torture/pr105786.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr105786.c | 13 +
 gcc/tree-loop-distribution.cc   |  1 +
 2 files changed, 14 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr105786.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr105786.c b/gcc/testsuite/gcc.dg/torture/pr105786.c
new file mode 100644
index 000..64aacf74b0a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr105786.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+
+void sink(const char*);
+static const char *a;
+int main()
+{
+  const char *b = a;
+  for (int i = 0; i < 2; ++i)
+while (*b++)
+  ;
+  sink(b);
+  return 0;
+}
diff --git a/gcc/tree-loop-distribution.cc b/gcc/tree-loop-distribution.cc
index db6e9096a86..086b59ca2be 100644
--- a/gcc/tree-loop-distribution.cc
+++ b/gcc/tree-loop-distribution.cc
@@ -3658,6 +3658,7 @@ loop_distribution::transform_reduction_loop (loop_p loop)
   /* Handle strlen like loops.  */
   if (store_dr == NULL
   && integer_zerop (pattern)
+  && INTEGRAL_TYPE_P (TREE_TYPE (reduction_var))
   && TREE_CODE (reduction_iv.base) == INTEGER_CST
   && TREE_CODE (reduction_iv.step) == INTEGER_CST
   && integer_onep (reduction_iv.step))
-- 
2.35.3


[PATCH] match.pd: Optimize __builtin_mul_overflow_p (x, cst, (utype)0) to x > ~(utype)0 / cst [PR30314]

2022-06-01 Thread Jakub Jelinek via Gcc-patches
Hi!

A comparison with a constant is most likely always faster than
.MUL_OVERFLOW when we only check whether the multiplication overflowed and
do not use its result.  Even if it is not faster, it is a simpler operation
on GIMPLE.  And even if a target exists where such multiplications with
overflow checking are cheaper than comparisons, comparisons are so much more
common than overflow checking multiplications that it would be nice if the
target simply arranged for comparisons to be emitted like those
multiplications on its own...
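
For illustration only, here is a small sketch of the equivalence the
simplification relies on: for unsigned x and a nonzero constant cst,
x * cst overflows exactly when x > umax / cst.  The function names are made
up, and the sketch uses __builtin_mul_overflow with a discarded result
instead of __builtin_mul_overflow_p, purely so it also builds with compilers
that lack the _p variant.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Overflow check through the builtin; the product itself is ignored.  */
static bool
overflows_via_builtin (unsigned int x)
{
  unsigned int ignored;
  return __builtin_mul_overflow (x, 35U, &ignored);
}

/* Equivalent comparison: x * 35 overflows exactly when x > UINT_MAX / 35.  */
static bool
overflows_via_compare (unsigned int x)
{
  return x > UINT_MAX / 35U;
}

/* The two forms agree on both sides of the boundary.  */
static void
check_boundary (void)
{
  unsigned int limit = UINT_MAX / 35U;
  assert (!overflows_via_builtin (limit) && !overflows_via_compare (limit));
  assert (overflows_via_builtin (limit + 1)
          && overflows_via_compare (limit + 1));
}
```

With a 32-bit unsigned int the boundary UINT_MAX / 35 is 122713351, which is
the constant the new test scans for.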

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-06-01  Jakub Jelinek  

PR middle-end/30314
* match.pd (__builtin_mul_overflow_p (x, cst, (utype) 0) ->
x > ~(utype)0 / cst): New simplification.

* gcc.dg/tree-ssa/pr30314.c: New test.

--- gcc/match.pd.jj 2022-06-01 13:54:32.000654151 +0200
+++ gcc/match.pd2022-06-01 15:13:35.473084402 +0200
@@ -5969,6 +5969,17 @@ (define_operator_list SYNC_FETCH_AND_AND
&& (!TYPE_UNSIGNED (TREE_TYPE (@2)) || TYPE_UNSIGNED (TREE_TYPE (@0
(ovf @1 @0
 
+/* Optimize __builtin_mul_overflow_p (x, cst, (utype) 0) if all 3 types
+   are unsigned to x > (umax / cst).  */
+(simplify
+ (imagpart (IFN_MUL_OVERFLOW:cs@2 @0 integer_nonzerop@1))
+  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
+   && TYPE_UNSIGNED (TREE_TYPE (@0))
+   && TYPE_MAX_VALUE (TREE_TYPE (@0))
+   && types_match (TREE_TYPE (@0), TREE_TYPE (TREE_TYPE (@2)))
+   && int_fits_type_p (@1, TREE_TYPE (@0)))
+   (convert (gt @0 (trunc_div! { TYPE_MAX_VALUE (TREE_TYPE (@0)); } @1)))))
+
 /* Simplification of math builtins.  These rules must all be optimizations
as well as IL simplifications.  If there is a possibility that the new
form could be a pessimization, the rule should go in the canonicalization
--- gcc/testsuite/gcc.dg/tree-ssa/pr30314.c.jj  2022-06-01 15:22:53.201271365 +0200
+++ gcc/testsuite/gcc.dg/tree-ssa/pr30314.c 2022-06-01 15:13:24.725196482 +0200
@@ -0,0 +1,18 @@
+/* PR middle-end/30314 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+/* { dg-final { scan-tree-dump-not "\.MUL_OVERFLOW " "optimized" } } */
+/* { dg-final { scan-tree-dump " > 122713351" "optimized" { target int32 } } } */
+/* { dg-final { scan-tree-dump " > 527049830677415760" "optimized" { target lp64 } } } */
+
+int
+foo (unsigned int x)
+{
+  return __builtin_mul_overflow_p (x, 35U, 0U);
+}
+
+int
+bar (unsigned long int x)
+{
+  return __builtin_mul_overflow_p (x, 35UL, 0UL);
+}

Jakub



Re: [PATCH] match.pd: Optimize __builtin_mul_overflow_p (x, cst, (utype)0) to x > ~(utype)0 / cst [PR30314]

2022-06-01 Thread Jeff Law via Gcc-patches




On 6/1/2022 7:55 AM, Jakub Jelinek via Gcc-patches wrote:

Hi!

A comparison with a constant is most likely always faster than
.MUL_OVERFLOW from which we only check whether it overflowed and not the
multiplication result, and even if not, it is simpler operation on GIMPLE
and even if a target exists where such multiplications with overflow checking
are cheaper than comparisons, because comparisons are so much more common
than overflow checking multiplications, it would be nice if it simply
arranged for comparisons to be emitted like those multiplications on its
own...

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-06-01  Jakub Jelinek  

PR middle-end/30314
* match.pd (__builtin_mul_overflow_p (x, cst, (utype) 0) ->
x > ~(utype)0 / cst): New simplification.

* gcc.dg/tree-ssa/pr30314.c: New test.

OK
jeff



c++: Make static init generation more consistent

2022-06-01 Thread Nathan Sidwell

The end-of-compilation static init code generation functions are:

* Inconsistent in argument ordering (swapping 'is-init' and 'priority',
  wrt each other and other arguments).

* Inconsistent in naming.  Mostly calling the is-init argument 'initp',
  but sometimes calling it 'constructor_p' and in the worst case using
  a transcoded 'method_type' character, and naming the priority
  argument 'initp'.

* Inconsistent in typing.  Sometimes the priority is unsigned,
  sometimes signed.  And the initp argument can of course be a bool.

* Several of the function comments have bit-rotted.

This addresses those oddities.  Name is-init 'initp', name priority
'priority'.  Place initp first, make priority unsigned.
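
As a hedged sketch of what the cleaned-up start_objects does when it builds
the function name tag (see the sprintf calls in the diff below), the
following standalone C mirrors that logic.  build_type_tag is a made-up name,
the joiner is fixed to '_' (the default when JOINER is not defined), and the
value used for DEFAULT_INIT_PRIORITY is an assumption here rather than taken
from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Assumed value for illustration; GCC provides its own definition.  */
#define DEFAULT_INIT_PRIORITY 65535u

/* Build a tag such as "sub_I" or "sub_D_00101" into TYPE, which must
   have room for at least 14 characters.  */
static void
build_type_tag (char *type, bool initp, unsigned priority)
{
  /* 'I' indicates initialization and 'D' indicates destruction.  */
  int len = sprintf (type, "sub_%c", initp ? 'I' : 'D');
  if (priority != DEFAULT_INIT_PRIORITY)
    {
      type[len++] = '_';
      /* Bounded write so an out-of-range priority cannot overrun TYPE.  */
      snprintf (type + len, 14 - len, "%.5u", priority);
    }
}

static void
check_tags (void)
{
  char type[14];
  build_type_tag (type, true, DEFAULT_INIT_PRIORITY);
  assert (strcmp (type, "sub_I") == 0);
  build_type_tag (type, false, 101);
  assert (strcmp (type, "sub_D_00101") == 0);
}
```

With a bool 'initp' first and an unsigned 'priority' second, the call sites
read the same way as the prototype, which is the consistency this patch is
after.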

nathan

--
Nathan Sidwell
From f4890de522f3df6edadf2ede072317c908c3 Mon Sep 17 00:00:00 2001
From: Nathan Sidwell 
Date: Tue, 31 May 2022 07:56:53 -0700
Subject: [PATCH] c++: Make static init generation more consistent

The end-of-compilation static init code generation functions are:

* Inconsistent in argument ordering (swapping 'is-init' and 'priority',
  wrt each other and other arguments).

* Inconsistent in naming.  Mostly calling the is-init argument 'initp',
  but sometimes calling it 'constructor_p' and in the worst case using
  a transcoded 'method_type' character, and naming the priority
  argument 'initp'.

* Inconsistent in typing.  Sometimes the priority is unsigned,
  sometimes signed.  And the initp argument can of course be a bool.

* Several of the function comments have bit-rotted.

This addresses those oddities.  Name is-init 'initp', name priority
'priority'.  Place initp first, make priority unsigned.

	gcc/cp/
	* decl2.cc (start_objects): Replace 'method_type' parameter
	with 'initp' boolean, rename and retype 'priority' parameter.
	(finish_objects): Likewise.  Do not expand here.
	(one_static_initialization_or_destruction): Move 'initp'
	parameter first.
	(do_static_initialization_or_destruction): Likewise.
	(generate_ctor_or_dtor_function): Rename 'initp' parameter.
	Adjust start_objects/finish_objects calls and expand here.
	(generate_ctor_and_dtor_functions_for_priority): Adjust calls.
	(c_parse_final_cleanups): Likewise.
	(vtv_start_verification_constructor_init): Adjust.
	(vtv_finish_verification_constructor_init): Use finish_objects.
---
 gcc/cp/decl2.cc | 92 +
 1 file changed, 39 insertions(+), 53 deletions(-)

diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
index e72fdf05382..17e5877ddba 100644
--- a/gcc/cp/decl2.cc
+++ b/gcc/cp/decl2.cc
@@ -66,14 +66,14 @@ typedef struct priority_info_s {
   int destructions_p;
 } *priority_info;
 
-static tree start_objects (int, int);
-static void finish_objects (int, int, tree);
+static tree start_objects (bool, unsigned);
+static tree finish_objects (bool, unsigned, tree);
 static tree start_static_storage_duration_function (unsigned);
 static void finish_static_storage_duration_function (tree);
 static priority_info get_priority_info (int);
-static void do_static_initialization_or_destruction (tree, bool);
-static void one_static_initialization_or_destruction (tree, tree, bool);
-static void generate_ctor_or_dtor_function (bool, int, location_t *);
+static void do_static_initialization_or_destruction (bool, tree);
+static void one_static_initialization_or_destruction (bool, tree, tree);
+static void generate_ctor_or_dtor_function (bool, unsigned, location_t *);
 static int generate_ctor_and_dtor_functions_for_priority (splay_tree_node,
 			  void *);
 static tree prune_vars_needing_no_initialization (tree *);
@@ -3813,17 +3813,14 @@ generate_tls_wrapper (tree fn)
   expand_or_defer_fn (finish_function (/*inline_p=*/false));
 }
 
-/* Start the process of running a particular set of global constructors
-   or destructors.  Subroutine of do_[cd]tors.  Also called from
-   vtv_start_verification_constructor_init_function.  */
+/* Start a global constructor or destructor function.  */
 
 static tree
-start_objects (int method_type, int initp)
+start_objects (bool initp, unsigned priority)
 {
-  /* Make ctor or dtor function.  METHOD_TYPE may be 'I' or 'D'.  */
   int module_init = 0;
 
-  if (initp == DEFAULT_INIT_PRIORITY && method_type == 'I')
+  if (priority == DEFAULT_INIT_PRIORITY && initp)
 module_init = module_initializer_kind ();
 
   tree name = NULL_TREE;
@@ -3833,15 +3830,17 @@ start_objects (int method_type, int initp)
 {
   char type[14];
 
-  unsigned len = sprintf (type, "sub_%c", method_type);
-  if (initp != DEFAULT_INIT_PRIORITY)
+  /* We use `I' to indicate initialization and `D' to indicate
+	 destruction.  */
+  unsigned len = sprintf (type, "sub_%c", initp ? 'I' : 'D');
+  if (priority != DEFAULT_INIT_PRIORITY)
 	{
 	  char joiner = '_';
 #ifdef JOINER
 	  joiner = JOINER;
 #endif
 	  type[len++] = joiner;
-	  sprintf (type + len, "%.5u", initp);
+	  sprintf (type + len, "%.5u", priority);
 	}
   name = get_file_function_name 

c++: Cleanup static init generation

2022-06-01 Thread Nathan Sidwell

The static init/fini generation is showing some bitrot.  This cleans
up several places to use C++, and also take advantage of already
having checked a variable for non-nullness.

nathan

--
Nathan Sidwell
From c4d702fb3c1e2f6e1bc8711da81bff59543b1b19 Mon Sep 17 00:00:00 2001
From: Nathan Sidwell 
Date: Tue, 31 May 2022 13:22:06 -0700
Subject: [PATCH] c++: Cleanup static init generation

The static init/fini generation is showing some bitrot.  This cleans
up several places to use C++, and also take advantage of already
having checked a variable for non-nullness.

	gcc/cp/
	* decl2.cc (ssdf_decl): Delete global.
	(start_static_storage_duration_function): Use some RAII.
	(do_static_initialization_or_destruction): Likewise.
	(c_parse_final_cleanups): Likewise.  Avoid rechecking 'vars'.
---
 gcc/cp/decl2.cc | 71 +
 1 file changed, 25 insertions(+), 46 deletions(-)

diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
index 17e5877ddba..9de3f806e95 100644
--- a/gcc/cp/decl2.cc
+++ b/gcc/cp/decl2.cc
@@ -3943,9 +3943,6 @@ static GTY(()) tree initialize_p_decl;
 /* The declaration for the __PRIORITY argument.  */
 static GTY(()) tree priority_decl;
 
-/* The declaration for the static storage duration function.  */
-static GTY(()) tree ssdf_decl;
-
 /* All the static storage duration functions created in this
translation unit.  */
 static GTY(()) vec<tree, va_gc> *ssdf_decls;
@@ -3970,24 +3967,20 @@ static splay_tree priority_info_map;
 static tree
 start_static_storage_duration_function (unsigned count)
 {
-  tree type;
-  tree body;
   char id[sizeof (SSDF_IDENTIFIER) + 1 /* '\0' */ + 32];
 
   /* Create the identifier for this function.  It will be of the form
      SSDF_IDENTIFIER_<number>.  */
   sprintf (id, "%s_%u", SSDF_IDENTIFIER, count);
 
-  type = build_function_type_list (void_type_node,
-   integer_type_node, integer_type_node,
-   NULL_TREE);
+  tree type = build_function_type_list (void_type_node,
+	integer_type_node, integer_type_node,
+	NULL_TREE);
 
   /* Create the FUNCTION_DECL itself.  */
-  ssdf_decl = build_lang_decl (FUNCTION_DECL,
-			   get_identifier (id),
-			   type);
-  TREE_PUBLIC (ssdf_decl) = 0;
-  DECL_ARTIFICIAL (ssdf_decl) = 1;
+  tree fn = build_lang_decl (FUNCTION_DECL, get_identifier (id), type);
+  TREE_PUBLIC (fn) = 0;
+  DECL_ARTIFICIAL (fn) = 1;
 
   /* Put this function in the list of functions to be called from the
  static constructors and destructors.  */
@@ -4009,21 +4002,21 @@ start_static_storage_duration_function (unsigned count)
   get_priority_info (DEFAULT_INIT_PRIORITY);
 }
 
-  vec_safe_push (ssdf_decls, ssdf_decl);
+  vec_safe_push (ssdf_decls, fn);
 
   /* Create the argument list.  */
   initialize_p_decl = cp_build_parm_decl
-(ssdf_decl, get_identifier (INITIALIZE_P_IDENTIFIER), integer_type_node);
+(fn, get_identifier (INITIALIZE_P_IDENTIFIER), integer_type_node);
   TREE_USED (initialize_p_decl) = 1;
   priority_decl = cp_build_parm_decl
-(ssdf_decl, get_identifier (PRIORITY_IDENTIFIER), integer_type_node);
+(fn, get_identifier (PRIORITY_IDENTIFIER), integer_type_node);
   TREE_USED (priority_decl) = 1;
 
   DECL_CHAIN (initialize_p_decl) = priority_decl;
-  DECL_ARGUMENTS (ssdf_decl) = initialize_p_decl;
+  DECL_ARGUMENTS (fn) = initialize_p_decl;
 
   /* Put the function in the global scope.  */
-  pushdecl (ssdf_decl);
+  pushdecl (fn);
 
   /* Start the function itself.  This is equivalent to declaring the
  function as:
@@ -4032,14 +4025,10 @@ start_static_storage_duration_function (unsigned count)
 
  It is static because we only need to call this function from the
  various constructor and destructor functions for this module.  */
-  start_preparsed_function (ssdf_decl,
-			/*attrs=*/NULL_TREE,
-			SF_PRE_PARSED);
+  start_preparsed_function (fn, /*attrs=*/NULL_TREE, SF_PRE_PARSED);
 
   /* Set up the scope of the outermost block in the function.  */
-  body = begin_compound_stmt (BCS_FN_BODY);
-
-  return body;
+  return begin_compound_stmt (BCS_FN_BODY);
 }
 
 /* Finish the generation of the function which performs initialization
@@ -4234,7 +4223,6 @@ one_static_initialization_or_destruction (bool initp, tree decl, tree init)
   finish_if_stmt_cond (guard_cond, guard_if_stmt);
 }
 
-
   /* If we're using __cxa_atexit, we have not already set the GUARD,
  so we must do so now.  */
   if (guard && initp && flag_use_cxa_atexit)
@@ -4282,11 +4270,9 @@ one_static_initialization_or_destruction (bool initp, tree decl, tree init)
 static void
 do_static_initialization_or_destruction (bool initp, tree vars)
 {
-  tree node, init_if_stmt, cond;
-
   /* Build the outer if-stmt to check for initialization or destruction.  */
-  init_if_stmt = begin_if_stmt ();
-  cond = initp ? integer_one_node : integer_zero_node;
+  tree init_if_stmt = begin_if_stmt ();
+  tree cond = initp ? integer_one_node : integer_zero_node;
   cond = cp_build_binary_op (inpu

Re: [PATCH] Fold truncations of left shifts in match.pd

2022-06-01 Thread Jeff Law via Gcc-patches




On 5/30/2022 6:23 AM, Roger Sayle wrote:

Whilst investigating PR 55278, I noticed that the tree-ssa optimizers
aren't eliminating the promotions of shifts to "int" as inserted by the
c-family front-ends, instead leaving this simplification to
the RTL optimizers.  This patch allows match.pd to do this itself earlier,
narrowing (T)(X << C) to (T)X << C when the constant C is known to be
valid for the (narrower) type T.

Hence for this simple test case:
short foo(short x) { return x << 5; }

the .optimized dump currently looks like:

short int foo (short int x)
{
   int _1;
   int _2;
   short int _4;

   <bb 2> [local count: 1073741824]:
   _1 = (int) x_3(D);
   _2 = _1 << 5;
   _4 = (short int) _2;
   return _4;
}

but with this patch, now becomes:

short int foo (short int x)
{
   short int _2;

   <bb 2> [local count: 1073741824]:
   _2 = x_1(D) << 5;
   return _2;
}

This is always reasonable as RTL expansion knows how to use
widening optabs if it makes sense at the RTL level to perform
this shift in a wider mode.
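
As an illustrative check of why the narrowing is safe for this test case,
the sketch below compares "shift in a 32-bit type, then truncate" against
"shift in 16 bits" for every 16-bit input.  The function names are invented
for the example, and unsigned arithmetic is used so that the C itself has no
undefined behavior for negative inputs; it models the bit patterns, not
GCC's internal representation.

```c
#include <assert.h>
#include <stdint.h>

/* Shift performed in a 32-bit type, then truncated to 16 bits,
   mirroring (T)(X << C) after integer promotion.  */
static uint16_t
shift_wide_then_truncate (int16_t x)
{
  uint32_t wide = (uint32_t) (int32_t) x;
  return (uint16_t) (wide << 5);
}

/* Shift performed on the 16-bit value directly, mirroring (T)X << C.  */
static uint16_t
shift_narrow (int16_t x)
{
  uint16_t narrow = (uint16_t) x;
  return (uint16_t) (narrow << 5);
}

/* Return the number of 16-bit inputs on which the two forms disagree.  */
static unsigned
count_mismatches (void)
{
  unsigned mismatches = 0;
  for (int32_t v = INT16_MIN; v <= INT16_MAX; v++)
    if (shift_wide_then_truncate ((int16_t) v) != shift_narrow ((int16_t) v))
      mismatches++;
  return mismatches;
}

static void
check_equivalence (void)
{
  assert (count_mismatches () == 0);
}
```

The low 16 bits of a left shift depend only on the low 16 bits of the
operand, which is why the constant only has to be valid for the narrower
type.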

Of course, there's often a catch.  The above simplification not only
reduces the number of statements in gimple, but also allows further
optimizations, for example including the perception of rotate idioms
and bswap16.  Alas, optimizing things earlier than anticipated
requires several testsuite changes [though all these tests have
been confirmed to generate identical assembly code on x86_64].
The only significant change is that the vectorization pass previously
wouldn't vectorize rotations if the backend doesn't explicitly provide
an optab for them.  This is curious as if the rotate is expressed as
ior(lshift,rshift) it will vectorize, and likewise RTL expansion will
generate the iorv(lshiftv,rshiftv) sequence if required for a vector
mode rotation.  Hence this patch includes a tweak to the optabs
test in tree-vect-stmts.cc's vectorizable_shifts to better reflect
the functionality supported by RTL expansion.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32},
with no new failures.  Ok for mainline?


2022-05-30  Roger Sayle  

gcc/ChangeLog
 * match.pd (convert (lshift @1 INTEGER_CST@2)): Narrow integer
 left shifts by a constant when the result is truncated, and the
 shift constant is well-defined for the narrower mode.
 * tree-vect-stmts.cc (vectorizable_shift): Rotations by
 constants are vectorizable, if the backend supports logical
 shifts and IOR logical operations in the required vector mode.

gcc/testsuite/ChangeLog
 * gcc.dg/fold-convlshift-4.c: New test case.
 * gcc.dg/optimize-bswaphi-1.c: Update found bswap count.
 * gcc.dg/tree-ssa/pr61839_3.c: Shift is now optimized before VRP.
 * gcc.dg/vect/vect-over-widen-1-big-array.c: Remove obsolete tests.
 * gcc.dg/vect/vect-over-widen-1.c: Likewise.
 * gcc.dg/vect/vect-over-widen-3-big-array.c: Likewise.
 * gcc.dg/vect/vect-over-widen-3.c: Likewise.
 * gcc.dg/vect/vect-over-widen-4-big-array.c: Likewise.
 * gcc.dg/vect/vect-over-widen-4.c: Likewise.

So the worry here would be stuff like narrowing the source operand
leading to partial stalls.  But as you indicated, if the target really 
wants to do the shift in a wider mode, it can.  Furthermore, the place 
to make that decision is at the gimple->rtl border, IMHO.


OK.

jeff

ps.  There may still be another old BZ for the lack of narrowing 
inhibiting vectorization IIRC.  I don't recall the specifics enough to 
hazard a guess if this patch will help or not.


Re: [PATCH] PR rtl-optimization/7061: Complex number arguments on x86_64-like ABIs.

2022-06-01 Thread Jeff Law via Gcc-patches




On 5/30/2022 4:06 AM, Roger Sayle wrote:

This patch addresses the issue in comment #6 of PR rtl-optimization/7061
(a four digit PR number) from 2006 where on x86_64 complex number arguments
are unconditionally spilled to the stack.

For the test cases below:
float re(float _Complex a) { return __real__ a; }
float im(float _Complex a) { return __imag__ a; }

GCC with -O2 currently generates:

re:     movq    %xmm0, -8(%rsp)
        movss   -8(%rsp), %xmm0
        ret
im:     movq    %xmm0, -8(%rsp)
        movss   -4(%rsp), %xmm0
        ret

with this patch we now generate:

re:     ret
im:     movq    %xmm0, %rax
        shrq    $32, %rax
        movd    %eax, %xmm0
        ret

[Technically, this shift can be performed on %xmm0 in a single
instruction, but the backend needs to be taught to do that; the
important bit is that the SCmode argument isn't written to the
stack].
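
To make the new 'im' sequence above easier to follow, here is a small
sketch in plain C of the same idea: treat the complex float as one 64-bit
integer and shift to reach the imaginary part.  The helper names are
hypothetical, and the sketch assumes a little-endian target with 32-bit
float, as on x86_64; it is not how the patch itself is written.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Pack a complex float, laid out as { real, imag }, into one 64-bit
   integer exactly as the value sits in memory.  */
static uint64_t
pack_complex (float re, float im)
{
  float parts[2] = { re, im };
  uint64_t bits;
  memcpy (&bits, parts, sizeof bits);
  return bits;
}

/* Reinterpret the low 32 bits of an integer as a float.  */
static float
float_from_low_bits (uint64_t bits)
{
  uint32_t low = (uint32_t) bits;
  float f;
  memcpy (&f, &low, sizeof f);
  return f;
}

/* Real part: low half (little-endian assumption).  */
static float
real_part (uint64_t bits)
{
  return float_from_low_bits (bits);
}

/* Imaginary part: shift right by 32, like the shrq in the listing.  */
static float
imag_part (uint64_t bits)
{
  return float_from_low_bits (bits >> 32);
}

static void
check_parts (void)
{
  uint64_t bits = pack_complex (1.5f, -2.25f);
  assert (real_part (bits) == 1.5f);
  assert (imag_part (bits) == -2.25f);
}
```

No store to memory is needed in principle, which matches the point made
above that the SCmode argument no longer has to go through the stack.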

The patch itself is to emit_group_store where just before RTL
expansion commits to writing to the stack, we check if the store
group consists of a single scalar integer register that holds
a complex mode value; on x86_64 SCmode arguments are passed in
DImode registers.  If this is the case, we can use a SUBREG to
"view_convert" the integer to the equivalent complex mode.

An interesting corner case that showed up during testing is that
x86_64 also passes HCmode arguments in DImode registers(!), i.e.
using modes of different sizes.  This is easily handled/supported
by first converting to an integer mode of the correct size, and
then generating a complex mode SUBREG of this.  This is similar
in concept to the patch I proposed here:
https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590139.html
which was almost (but not quite) approved here:
https://gcc.gnu.org/pipermail/gcc-patches/2022-March/591139.html

Yea, sorry.  Too much to do at the new job.  Trying to work my way
through queued up stuff now...





This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32},
with no new failures.  Ok for mainline?


2020-05-30  Roger Sayle  

gcc/ChangeLog
 PR rtl-optimization/7061
 * expr.cc (emit_group_store): For groups that consist of a single
 scalar integer register that holds a complex mode value, use
 gen_lowpart to generate a SUBREG to "view_convert" to the complex
 mode.  For modes of different sizes, first convert to an integer
 mode of the appropriate size.

gcc/testsuite/ChangeLog
 PR rtl-optimization/7061
 * gcc.target/i386/pr7061-1.c: New test case.
 * gcc.target/i386/pr7061-2.c: New test case.

OK
jeff



Re: [PATCH] gengtype: do not skip char after escape sequnce

2022-06-01 Thread Jeff Law via Gcc-patches




On 5/4/2022 1:14 PM, Martin Liška wrote:

Right now, when a \$x escape sequence occurs, the
next character after $x is skipped, which is bogus.
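
The class of bug is easy to reproduce in isolation. The following sketch is not the gengtype code; unescape is a hypothetical helper with a made-up set of escapes, shown only to illustrate that the loop must advance past exactly the two characters of an escape and no further.

```cpp
#include <cstddef>
#include <string>

inline std::string unescape(const std::string &in) {
  std::string out;
  for (std::size_t i = 0; i < in.size(); ++i) {
    // Ordinary character, or a lone trailing backslash: copy it.
    if (in[i] != '\\' || i + 1 == in.size()) {
      out += in[i];
      continue;
    }
    char c = in[++i];  // consume the character after the backslash
    switch (c) {
      case 'n': out += '\n'; break;
      case 't': out += '\t'; break;
      default:  out += c;    break;  // covers \\ and \"
    }
    // No further increment here: the loop's own ++i moves to the
    // character right after the escape.  An extra ++i would drop it.
  }
  return out;
}
```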

The code has very low coverage right now.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

* gengtype-state.cc (read_a_state_token): Do not skip extra
character after escaped sequence.
Any way you can add a test for this?  I'm OK with the patch as-is, but 
with a test is obviously better.


jeff



c++: Static init guard generation

2022-06-01 Thread Nathan Sidwell

The guard generation for a static var init was overly verbose.  We can
use a bit of RAII and avoid some rechecking.  Also in the !cxa_atexit
case, the only difference between initialization and destruction becomes
whether to use post-inc or pre-dec.
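
As a rough model of the non-__cxa_atexit scheme (plain C++, not the trees that decl2.cc builds), a guard counter lets several translation units request the same initialization while only the first request performs it, and only the last request for destruction performs that. The names GuardedObject, should_init and should_destroy are hypothetical.

```cpp
struct GuardedObject {
  int guard = 0;

  // Initialization: post-increment, act only if the guard was zero,
  // i.e. we are the first to initialize.
  bool should_init() { return guard++ == 0; }

  // Destruction: pre-decrement, act only if the guard becomes zero,
  // i.e. we are the last to destroy.
  bool should_destroy() { return --guard == 0; }
};
```

Both paths end in the same comparison against zero; only the increment or decrement operator differs.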

nathan

--
Nathan Sidwell
From 289f860fe62423a66e43989688e1d24bcdb25b5e Mon Sep 17 00:00:00 2001
From: Nathan Sidwell 
Date: Wed, 1 Jun 2022 04:52:21 -0700
Subject: [PATCH] c++: Static init guard generation

The guard generation for a static var init was overly verbose.  We can
use a bit of RAII and avoid some rechecking.  Also in the !cxa_atexit
case, the only difference between initialization and destruction becomes
whether to use post-inc or pre-dec.

	gcc/cp/
	* decl2.cc (fix_temporary_vars_context_r): Use data argument
	for new context.
	(one_static_initialization_or_destruction): Adjust tree walk
	call.  Refactor guard generation.
---
 gcc/cp/decl2.cc | 111 ++--
 1 file changed, 42 insertions(+), 69 deletions(-)

diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
index 9de3f806e95..974afe798b6 100644
--- a/gcc/cp/decl2.cc
+++ b/gcc/cp/decl2.cc
@@ -4085,34 +4085,25 @@ get_priority_info (int priority)
 		|| DECL_ONE_ONLY (decl) \
 		|| DECL_WEAK (decl)))
 
-/* Called from one_static_initialization_or_destruction(),
-   via walk_tree.
-   Walks the initializer list of a global variable and looks for
+/* Walks the initializer list of a global variable and looks for
temporary variables (DECL_NAME() == NULL and DECL_ARTIFICIAL != 0)
-   and that have their DECL_CONTEXT() == NULL.
-   For each such temporary variable, set their DECL_CONTEXT() to
-   the current function. This is necessary because otherwise
-   some optimizers (enabled by -O2 -fprofile-arcs) might crash
-   when trying to refer to a temporary variable that does not have
-   it's DECL_CONTECT() properly set.  */
+   and that have their DECL_CONTEXT() == NULL.  For each such
+   temporary variable, set their DECL_CONTEXT() to CTX -- the
+   initializing function. This is necessary because otherwise some
+   optimizers (enabled by -O2 -fprofile-arcs) might crash when trying
+   to refer to a temporary variable that does not have its
+   DECL_CONTEXT() properly set.  */
+
 static tree 
 fix_temporary_vars_context_r (tree *node,
 			  int  * /*unused*/,
-			  void * /*unused1*/)
+			  void *ctx)
 {
-  gcc_assert (current_function_decl);
-
   if (TREE_CODE (*node) == BIND_EXPR)
-{
-  tree var;
-
-  for (var = BIND_EXPR_VARS (*node); var; var = DECL_CHAIN (var))
-	if (VAR_P (var)
-	  && !DECL_NAME (var)
-	  && DECL_ARTIFICIAL (var)
-	  && !DECL_CONTEXT (var))
-	  DECL_CONTEXT (var) = current_function_decl;
-}
+for (tree var = BIND_EXPR_VARS (*node); var; var = DECL_CHAIN (var))
+  if (VAR_P (var) && !DECL_NAME (var)
+	  && DECL_ARTIFICIAL (var) && !DECL_CONTEXT (var))
+	DECL_CONTEXT (var) = tree (ctx);
 
   return NULL_TREE;
 }
@@ -4124,9 +4115,6 @@ fix_temporary_vars_context_r (tree *node,
 static void
 one_static_initialization_or_destruction (bool initp, tree decl, tree init)
 {
-  tree guard_if_stmt = NULL_TREE;
-  tree guard;
-
   /* If we are supposed to destruct and there's a trivial destructor,
  nothing has to be done.  */
   if (!initp
@@ -4150,7 +4138,7 @@ one_static_initialization_or_destruction (bool initp, tree decl, tree init)
  of the temporaries are set to the current function decl.  */
   cp_walk_tree_without_duplicates (&init,
    fix_temporary_vars_context_r,
-   NULL);
+   current_function_decl);
 
   /* Because of:
 
@@ -4171,62 +4159,50 @@ one_static_initialization_or_destruction (bool initp, tree decl, tree init)
 }
 
   /* Assume we don't need a guard.  */
-  guard = NULL_TREE;
+  tree guard_if_stmt = NULL_TREE;
+
   /* We need a guard if this is an object with external linkage that
  might be initialized in more than one place.  (For example, a
  static data member of a template, when the data member requires
  construction.)  */
   if (NEEDS_GUARD_P (decl))
 {
+  tree guard = get_guard (decl);
   tree guard_cond;
 
-  guard = get_guard (decl);
-
-  /* When using __cxa_atexit, we just check the GUARD as we would
-	 for a local static.  */
   if (flag_use_cxa_atexit)
 	{
-	  /* When using __cxa_atexit, we never try to destroy
+	  /* When using __cxa_atexit, we just check the GUARD as we
+	 would for a local static.  We never try to destroy
 	 anything from a static destructor.  */
 	  gcc_assert (initp);
 	  guard_cond = get_guard_cond (guard, false);
 	}
-  /* If we don't have __cxa_atexit, then we will be running
-	 destructors from .fini sections, or their equivalents.  So,
-	 we need to know how many times we've tried to initialize this
-	 object.  We do initializations only if the GUARD is zero,
-	 i.e., if we are the first to initialize the variable.  We do
-	 destructions only if the GUARD is one, i.e., if we are the
-	 last to destroy the variable.  */
-

[RFC PATCH] RISC-V: Add Zawrs ISA extension support

2022-06-01 Thread Christoph Muellner via Gcc-patches
This patch adds support for the Zawrs ISA extension.
The patch depends on the corresponding Binutils patch
to be usable (see [1]).

The specification can be found here:
https://github.com/riscv/riscv-zawrs/blob/main/zawrs.adoc

Note that the Zawrs extension is not frozen or ratified yet.
Therefore this patch is an RFC and not intended to get merged.

[1] https://sourceware.org/pipermail/binutils/2022-April/120559.html

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc: Add zawrs extension.
* config/riscv/riscv-opts.h (MASK_ZAWRS): New.
(TARGET_ZAWRS): New.
* config/riscv/riscv.opt: New.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zawrs.c: New test.

Signed-off-by: Christoph Muellner 
---
 gcc/common/config/riscv/riscv-common.cc |  4 
 gcc/config/riscv/riscv-opts.h   |  3 +++
 gcc/config/riscv/riscv.opt  |  3 +++
 gcc/testsuite/gcc.target/riscv/zawrs.c  | 13 +
 4 files changed, 23 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zawrs.c

diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
index 0e5be2ce105..7dc5a64006a 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -149,6 +149,8 @@ static const struct riscv_ext_version riscv_ext_version_table[] =
   {"zifencei", ISA_SPEC_CLASS_20191213, 2, 0},
   {"zifencei", ISA_SPEC_CLASS_20190608, 2, 0},
 
+  {"zawrs", ISA_SPEC_CLASS_NONE, 1, 0},
+
   {"zba", ISA_SPEC_CLASS_NONE, 1, 0},
   {"zbb", ISA_SPEC_CLASS_NONE, 1, 0},
   {"zbc", ISA_SPEC_CLASS_NONE, 1, 0},
@@ -1098,6 +1100,8 @@ static const riscv_ext_flag_table_t riscv_ext_flag_table[] =
   {"zicsr",&gcc_options::x_riscv_zi_subext, MASK_ZICSR},
   {"zifencei", &gcc_options::x_riscv_zi_subext, MASK_ZIFENCEI},
 
+  {"zawrs", &gcc_options::x_riscv_za_subext, MASK_ZAWRS},
+
   {"zba",&gcc_options::x_riscv_zb_subext, MASK_ZBA},
   {"zbb",&gcc_options::x_riscv_zb_subext, MASK_ZBB},
   {"zbc",&gcc_options::x_riscv_zb_subext, MASK_ZBC},
diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index 1e153b3a6e7..ced086b955d 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -73,6 +73,9 @@ enum stack_protector_guard {
 #define TARGET_ZICSR((riscv_zi_subext & MASK_ZICSR) != 0)
 #define TARGET_ZIFENCEI ((riscv_zi_subext & MASK_ZIFENCEI) != 0)
 
+#define MASK_ZAWRS   (1 << 0)
+#define TARGET_ZAWRS ((riscv_za_subext & MASK_ZAWRS) != 0)
+
 #define MASK_ZBA  (1 << 0)
 #define MASK_ZBB  (1 << 1)
 #define MASK_ZBC  (1 << 2)
diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
index 9e9fe6d8ccd..f01850e3b19 100644
--- a/gcc/config/riscv/riscv.opt
+++ b/gcc/config/riscv/riscv.opt
@@ -197,6 +197,9 @@ long riscv_stack_protector_guard_offset = 0
 TargetVariable
 int riscv_zi_subext
 
+TargetVariable
+int riscv_za_subext
+
 TargetVariable
 int riscv_zb_subext
 
diff --git a/gcc/testsuite/gcc.target/riscv/zawrs.c b/gcc/testsuite/gcc.target/riscv/zawrs.c
new file mode 100644
index 000..0b7e2662343
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zawrs.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zawrs" { target { rv64 } } } */
+/* { dg-options "-march=rv32gc_zawrs" { target { rv32 } } } */
+
+#ifndef __riscv_zawrs
+#error Feature macro not defined
+#endif
+
+int
+foo (int a)
+{
+  return a;
+}
-- 
2.35.3



Re: [PATCH] Update {skylake,icelake,alderlake}_cost to add a bit preference to vector store.

2022-06-01 Thread H.J. Lu via Gcc-patches
On Tue, May 31, 2022 at 10:06 PM Cui,Lili  wrote:
>
> This patch is to update {skylake,icelake,alderlake}_cost to add a bit 
> preference to vector store.
> Since the integer vector construction cost has changed, we need to adjust
> the load and store costs for Intel processors.
>
> With the patch applied
> 538.imagick_r: gets ~6% improvement on ADL for multicopy.
> 525.x264_r: gets ~2% improvement on ADL and ICX for multicopy.
> with no measurable changes for other benchmarks.
>
> Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. Ok for trunk?
>
> Thanks,
> Lili.
>
> gcc/ChangeLog
>
> PR target/105493
> * config/i386/x86-tune-costs.h (skylake_cost): Raise the gpr load cost
> from 4 to 6 and gpr store cost from 6 to 8. Change SSE loads and
> unaligned loads cost from {6, 6, 6, 10, 20} to {8, 8, 8, 8, 16}.
> (icelake_cost): Ditto.
> (alderlake_cost): Raise the gpr store cost from 6 to 8 and SSE loads,
> stores and unaligned stores cost from {6, 6, 6, 10, 15} to
> {8, 8, 8, 10, 15}.
>
> gcc/testsuite/
>
> PR target/105493
> * gcc.target/i386/pr91446.c: Adjust to expect vectorization
> * gcc.target/i386/pr99881.c: XFAIL.
> ---
>  gcc/config/i386/x86-tune-costs.h| 26 -
>  gcc/testsuite/gcc.target/i386/pr91446.c |  2 +-
>  gcc/testsuite/gcc.target/i386/pr99881.c |  2 +-
>  3 files changed, 15 insertions(+), 15 deletions(-)
>
> diff --git a/gcc/config/i386/x86-tune-costs.h 
> b/gcc/config/i386/x86-tune-costs.h
> index ea34a939c68..6c9066c84cc 100644
> --- a/gcc/config/i386/x86-tune-costs.h
> +++ b/gcc/config/i386/x86-tune-costs.h
> @@ -1897,15 +1897,15 @@ struct processor_costs skylake_cost = {
>8,   /* "large" insn */
>17,  /* MOVE_RATIO */
>17,  /* CLEAR_RATIO */
> -  {4, 4, 4},   /* cost of loading integer registers
> +  {6, 6, 6},   /* cost of loading integer registers
>in QImode, HImode and SImode.
>Relative to reg-reg move (2).  */
> -  {6, 6, 6},   /* cost of storing integer registers 
> */
> -  {6, 6, 6, 10, 20},   /* cost of loading SSE register
> +  {8, 8, 8},   /* cost of storing integer registers 
> */
> +  {8, 8, 8, 8, 16},/* cost of loading SSE register
>in 32bit, 64bit, 128bit, 256bit 
> and 512bit */
>{8, 8, 8, 8, 16},/* cost of storing SSE register
>in 32bit, 64bit, 128bit, 256bit 
> and 512bit */
> -  {6, 6, 6, 10, 20},   /* cost of unaligned loads.  */
> +  {8, 8, 8, 8, 16},/* cost of unaligned loads.  */
>{8, 8, 8, 8, 16},/* cost of unaligned stores.  */
>2, 2, 4, /* cost of moving XMM,YMM,ZMM 
> register */
>6,   /* cost of moving SSE register to 
> integer.  */
> @@ -2023,15 +2023,15 @@ struct processor_costs icelake_cost = {
>8,   /* "large" insn */
>17,  /* MOVE_RATIO */
>17,  /* CLEAR_RATIO */
> -  {4, 4, 4},   /* cost of loading integer registers
> +  {6, 6, 6},   /* cost of loading integer registers
>in QImode, HImode and SImode.
>Relative to reg-reg move (2).  */
> -  {6, 6, 6},   /* cost of storing integer registers 
> */
> -  {6, 6, 6, 10, 20},   /* cost of loading SSE register
> +  {8, 8, 8},   /* cost of storing integer registers 
> */
> +  {8, 8, 8, 8, 16},/* cost of loading SSE register
>in 32bit, 64bit, 128bit, 256bit 
> and 512bit */
>{8, 8, 8, 8, 16},/* cost of storing SSE register
>in 32bit, 64bit, 128bit, 256bit 
> and 512bit */
> -  {6, 6, 6, 10, 20},   /* cost of unaligned loads.  */
> +  {8, 8, 8, 8, 16},/* cost of unaligned loads.  */
>{8, 8, 8, 8, 16},/* cost of unaligned stores.  */
>2, 2, 4, /* cost of moving XMM,YMM,ZMM 
> register */
>6,   /* cost of moving SSE register to 
> integer.  */
> @@ -2146,13 +2146,13 @@ struct processor_costs alderlake_cost = {
>{6, 6, 6},   /* cost of loading integer registers
>in QImode, HImode and SImode.
>

[PATCH] c++: find_template_parameters and PARM_DECLs [PR105797]

2022-06-01 Thread Patrick Palka via Gcc-patches
As explained in r11-4959-gde6f64f9556ae3, the atom cache assumes two
equivalent expressions (according to cp_tree_equal) must use the same
template parameters (according to find_template_parameters).  This
assumption turned out to not hold for TARGET_EXPR, which was addressed
by that commit.

But this assumption apparently doesn't hold for PARM_DECL either:
find_template_parameters walks its DECL_CONTEXT but cp_tree_equal by
default doesn't consider DECL_CONTEXT unless comparing_specializations
is set.  Thus in the first testcase below, the atomic constraints of #1
and #2 are equivalent according to cp_tree_equal, but according to
find_template_parameters the former uses T and the latter uses both T
and U.
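
The invariant at stake can be modelled outside the compiler. In this sketch (hypothetical types, not GCC's data structures) equality between atoms ignores the context, much as cp_tree_equal ignores DECL_CONTEXT by default. The set of parameters computed for an atom therefore must not depend on the context either, or two equal atoms would map to different results.

```cpp
#include <set>
#include <string>

struct Atom {
  std::set<std::string> type_params;     // parameters used in the type
  std::set<std::string> context_params;  // parameters of the enclosing template
};

// Equality in the style described above: the context is not compared.
inline bool atoms_equal(const Atom &a, const Atom &b) {
  return a.type_params == b.type_params;
}

// Problematic variant: also walks the context, so equal atoms can disagree.
inline std::set<std::string> params_walking_context(const Atom &a) {
  std::set<std::string> result = a.type_params;
  result.insert(a.context_params.begin(), a.context_params.end());
  return result;
}

// Consistent variant: only the type is walked.
inline std::set<std::string> params_from_type_only(const Atom &a) {
  return a.type_params;
}
```

With one atom whose context has only T and another whose context has T and U, the first parameter walk gives different answers for atoms that compare equal, while the second does not.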

I suppose we can fix this assumption violation by setting
comparing_specializations in the atom_hasher, which would make
cp_tree_equal return false for the two atoms, but that seems overly
pessimistic here.  Ideally the atoms should be considered equivalent
and we should fix find_template_parameters to return just T for #2's
atom.

To that end this patch makes for_each_template_parm_r stop walking the
DECL_CONTEXT of a PARM_DECL.  This should be safe to do because
tsubst_copy / tsubst_decl only cares about the TREE_TYPE of a PARM_DECL
and doesn't bother substituting the DECL_CONTEXT, thus the only relevant
template parameters are those used in its type.  any_template_parm_r is
currently responsible for walking its TREE_TYPE, but I suppose it now makes
sense to make for_each_template_parm_r do so instead.

In passing this patch also makes for_each_template_parm_r stop walking
the DECL_CONTEXT of a VAR_/FUNCTION_DECL since it should be unnecessary
after walking DECL_TI_ARGS.

I experimented with not walking DECL_CONTEXT for CONST_DECL, but the
second testcase below demonstrates it's necessary to walk it.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

PR c++/105797

gcc/cp/ChangeLog:

* pt.cc (for_each_template_parm_r) <case FUNCTION_DECL, VAR_DECL>:
Don't walk DECL_CONTEXT.
<case PARM_DECL>: Likewise.  Walk TREE_TYPE.
<case CONST_DECL>: Simplify accordingly.
(any_template_parm_r) <case PARM_DECL>: Don't walk TREE_TYPE.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-decltype4.C: New test.
---
 gcc/cp/pt.cc| 10 +-
 gcc/testsuite/g++.dg/cpp2a/concepts-decltype4.C | 16 
 gcc/testsuite/g++.dg/cpp2a/concepts-memfun3.C   | 12 
 3 files changed, 33 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-decltype4.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-memfun3.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 4f0ace2644b..e4a473002a0 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -10561,11 +10561,14 @@ for_each_template_parm_r (tree *tp, int *walk_subtrees, void *d)
 case VAR_DECL:
   if (DECL_LANG_SPECIFIC (t) && DECL_TEMPLATE_INFO (t))
WALK_SUBTREE (DECL_TI_ARGS (t));
-  /* Fall through.  */
+  break;
 
 case PARM_DECL:
+  WALK_SUBTREE (TREE_TYPE (t));
+  break;
+
 case CONST_DECL:
-  if (TREE_CODE (t) == CONST_DECL && DECL_TEMPLATE_PARM_P (t))
+  if (DECL_TEMPLATE_PARM_P (t))
WALK_SUBTREE (DECL_INITIAL (t));
   if (DECL_CONTEXT (t)
  && pfd->include_nondeduced_p)
@@ -10824,9 +10827,6 @@ any_template_parm_r (tree t, void *data)
   break;
 
 case TEMPLATE_PARM_INDEX:
-case PARM_DECL:
-  /* A parameter or constraint variable may also depend on a template
-parameter without explicitly naming it.  */
   WALK_SUBTREE (TREE_TYPE (t));
   break;
 
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-decltype4.C b/gcc/testsuite/g++.dg/cpp2a/concepts-decltype4.C
new file mode 100644
index 000..6683d224cf8
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-decltype4.C
@@ -0,0 +1,16 @@
+// PR c++/105797
+// { dg-do compile { target c++20 } }
+
+template
+concept C = requires { T(); };
+
+template
+void f(T v) requires C; // #1
+
+template
+void f(T v) requires C; // #2
+
+int main() {
+  f(0);
+  f(0);
+}
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-memfun3.C b/gcc/testsuite/g++.dg/cpp2a/concepts-memfun3.C
new file mode 100644
index 000..3fa4fb82818
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-memfun3.C
@@ -0,0 +1,12 @@
+// { dg-do compile { target c++20 } }
+
+template
+struct A {
+  enum E { e = I };
+  static void f() requires (e != 0);
+};
+
+int main() {
+  A::f();
+  A::f(); // { dg-error "no match" }
+}
-- 
2.36.1.203.g1bcf4f6271



Re: [PATCH] c++: find_template_parameters and PARM_DECLs [PR105797]

2022-06-01 Thread Patrick Palka via Gcc-patches
On Wed, 1 Jun 2022, Patrick Palka wrote:

> As explained in r11-4959-gde6f64f9556ae3, the atom cache assumes two
> equivalent expressions (according to cp_tree_equal) must use the same
> template parameters (according to find_template_parameters).  This
> assumption turned out to not hold for TARGET_EXPR, which was addressed
> by that commit.
> 
> But this assumption apparently doesn't hold for PARM_DECL either:
> find_template_parameters walks its DECL_CONTEXT but cp_tree_equal by
> default doesn't consider DECL_CONTEXT unless comparing_specializations
> is set.  Thus in the first testcase below, the atomic constraints of #1
> and #2 are equivalent according to cp_tree_equal, but according to
> find_template_parameters the former uses T and the latter uses both T
> and U.
> 
> I suppose we can fix this assumption violation by setting
> comparing_specializations in the atom_hasher, which would make
> cp_tree_equal return false for the two atoms, but that seems overly
> pessimistic here.  Ideally the atoms should be considered equivalent
> and we should fix find_template_parameters to return just T for #2's
> atom.
> 
> To that end this patch makes for_each_template_parm_r stop walking the
> DECL_CONTEXT of a PARM_DECL.  This should be safe to do because
> tsubst_copy / tsubst_decl only cares about the TREE_TYPE of a PARM_DECL
> and doesn't bother substituting the DECL_CONTEXT, thus the only relevant
> template parameters are those used in its type.  any_template_parm_r is
> currently responsible for walking its TREE_TYPE, but I suppose it now makes
> sense to make for_each_template_parm_r do so instead.
> 
> In passing this patch also makes for_each_template_parm_r stop walking
> the DECL_CONTEXT of a VAR_/FUNCTION_DECL since it should be unnecessary
> after walking DECL_TI_ARGS.
> 
> I experimented with not walking DECL_CONTEXT for CONST_DECL, but the
> second testcase below demonstrates it's necessary to walk it.

... and that's ultimately because tsubst_decl substitutes the
DECL_CONTEXT of CONST_DECL (unlike for PARM_DECL)

> 
> Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> trunk?
> 
>   PR c++/105797
> 
> gcc/cp/ChangeLog:
> 
>   * pt.cc (for_each_template_parm_r) <case FUNCTION_DECL, VAR_DECL>:
>   Don't walk DECL_CONTEXT.
>   <case PARM_DECL>: Likewise.  Walk TREE_TYPE.
>   <case CONST_DECL>: Simplify accordingly.
>   (any_template_parm_r) <case PARM_DECL>: Don't walk TREE_TYPE.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/cpp2a/concepts-decltype4.C: New test.
> ---
>  gcc/cp/pt.cc| 10 +-
>  gcc/testsuite/g++.dg/cpp2a/concepts-decltype4.C | 16 
>  gcc/testsuite/g++.dg/cpp2a/concepts-memfun3.C   | 12 
>  3 files changed, 33 insertions(+), 5 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-decltype4.C
>  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-memfun3.C
> 
> diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
> index 4f0ace2644b..e4a473002a0 100644
> --- a/gcc/cp/pt.cc
> +++ b/gcc/cp/pt.cc
> @@ -10561,11 +10561,14 @@ for_each_template_parm_r (tree *tp, int 
> *walk_subtrees, void *d)
>  case VAR_DECL:
>if (DECL_LANG_SPECIFIC (t) && DECL_TEMPLATE_INFO (t))
>   WALK_SUBTREE (DECL_TI_ARGS (t));
> -  /* Fall through.  */
> +  break;
>  
>  case PARM_DECL:
> +  WALK_SUBTREE (TREE_TYPE (t));
> +  break;
> +
>  case CONST_DECL:
> -  if (TREE_CODE (t) == CONST_DECL && DECL_TEMPLATE_PARM_P (t))
> +  if (DECL_TEMPLATE_PARM_P (t))
>   WALK_SUBTREE (DECL_INITIAL (t));
>if (DECL_CONTEXT (t)
> && pfd->include_nondeduced_p)
> @@ -10824,9 +10827,6 @@ any_template_parm_r (tree t, void *data)
>break;
>  
>  case TEMPLATE_PARM_INDEX:
> -case PARM_DECL:
> -  /* A parameter or constraint variable may also depend on a template
> -  parameter without explicitly naming it.  */
>WALK_SUBTREE (TREE_TYPE (t));
>break;
>  
> diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-decltype4.C 
> b/gcc/testsuite/g++.dg/cpp2a/concepts-decltype4.C
> new file mode 100644
> index 000..6683d224cf8
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/cpp2a/concepts-decltype4.C
> @@ -0,0 +1,16 @@
> +// PR c++/105797
> +// { dg-do compile { target c++20 } }
> +
> +template
> +concept C = requires { T(); };
> +
> +template
> +void f(T v) requires C; // #1
> +
> +template
> +void f(T v) requires C; // #2
> +
> +int main() {
> +  f(0);
> +  f(0);
> +}
> diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-memfun3.C 
> b/gcc/testsuite/g++.dg/cpp2a/concepts-memfun3.C
> new file mode 100644
> index 000..3fa4fb82818
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/cpp2a/concepts-memfun3.C
> @@ -0,0 +1,12 @@
> +// { dg-do compile { target c++20 } }
> +
> +template
> +struct A {
> +  enum E { e = I };
> +  static void f() requires (e != 0);
> +};
> +
> +int main() {
> +  A::f();
> +  A::f(); // { dg-error "no match" }
> +}
> -- 
> 2.36.1.203.g1bcf4f6271
> 
> 



[PATCH] c++: value-dep but not type-dep decltype operand [PR105756]

2022-06-01 Thread Patrick Palka via Gcc-patches
r12-7564-gec0f53a3a542e7 made us instantiate non-constant non-dependent
decltype operands by relaxing instantiate_non_dependent_expr to check
instantiation_dependent_uneval_expression_p.  But as the testcase below
demonstrates, this predicate is too permissive here because it allows
value-dependent-only expressions to go through and get instantiated
ahead of time, which causes us to crash during constexpr evaluation of
(5 % N).

This patch strengthens instantiate_non_dependent_expr to use the
non-uneval version of the predicate instead, which does consider value
dependence.  In turn, we need to make finish_decltype_type avoid calling
i_n_d_e on a value-dependent-only expression; I assume we still want to
resolve the decltype ahead of time in this case.  (Doing so seems
unintuitive to me since the expression could be ill-formed at
instantiation time as in the testcase, but it matches the behavior of
Clang and MSVC.)
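
For readers less familiar with the distinction: in a sketch like the following (modelled on the testcase's expression, with a hypothetical function name and an assumed "int N" template parameter), the operand of decltype is value-dependent because it mentions N, but it is not type-dependent, since comparing two ints yields bool whatever N is. The type can therefore be resolved when the template is parsed, while the expression itself cannot be evaluated until N is known.

```cpp
#include <type_traits>

template <int N>
bool comparison_type_is_bool() {
  // Unevaluated operand: 5 % N is never computed, even for N == 0.
  using ty = decltype((5 % N) == 0);
  return std::is_same<ty, bool>::value;
}
```

Instantiating this with N == 0 is expected to be accepted by a compiler that resolves the decltype ahead of time, because nothing ever evaluates 5 % 0.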

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk/12?

PR c++/105756

gcc/cp/ChangeLog:

* pt.cc (instantiate_non_dependent_expr_internal): Adjust
comment.
(instantiate_non_dependent_expr_sfinae): Assert i_d_e_p instead
of i_d_u_e_p.
* semantics.cc (finish_decltype_type): Don't instantiate the
expression when i_d_e_p is true.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/decltype82.C: New test.
---
 gcc/cp/pt.cc|  4 ++--
 gcc/cp/semantics.cc | 13 -
 gcc/testsuite/g++.dg/cpp0x/decltype82.C | 10 ++
 3 files changed, 24 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/decltype82.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index e4a473002a0..1ea2545e115 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -6372,7 +6372,7 @@ redeclare_class_template (tree type, tree parms, tree cons)
 
 /* The actual substitution part of instantiate_non_dependent_expr_sfinae,
to be used when the caller has already checked
-!instantiation_dependent_uneval_expression_p (expr)
+!instantiation_dependent_expression_p (expr)
and cleared processing_template_decl.  */
 
 tree
@@ -6397,7 +6397,7 @@ instantiate_non_dependent_expr_sfinae (tree expr, tsubst_flags_t complain)
   if (processing_template_decl)
 {
   /* The caller should have checked this already.  */
-  gcc_checking_assert (!instantiation_dependent_uneval_expression_p (expr));
+  gcc_checking_assert (!instantiation_dependent_expression_p (expr));
   processing_template_decl_sentinel s;
   expr = instantiate_non_dependent_expr_internal (expr, complain);
 }
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 3600d270ff8..b23848ab94c 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -11302,9 +11302,20 @@ finish_decltype_type (tree expr, bool id_expression_or_member_access_p,
 
   return type;
 }
+  else if (processing_template_decl
+  && potential_constant_expression (expr)
+  && value_dependent_expression_p (expr))
+/* The above test is equivalent to instantiation_dependent_expression_p
+   after instantiation_dependent_uneval_expression_p has been ruled out.
+   In this case the expression is dependent but not type-dependent, so
+   we can resolve the decltype ahead of time but we can't instantiate
+   the expression.  */;
   else if (processing_template_decl)
 {
-  expr = instantiate_non_dependent_expr_sfinae (expr, complain|tf_decltype);
+  /* The expression isn't instantiation dependent, so we can fully
+instantiate it ahead of time.  */
+  expr = instantiate_non_dependent_expr_sfinae (expr,
+   complain|tf_decltype);
   if (expr == error_mark_node)
return error_mark_node;
   /* Keep processing_template_decl cleared for the rest of the function
diff --git a/gcc/testsuite/g++.dg/cpp0x/decltype82.C b/gcc/testsuite/g++.dg/cpp0x/decltype82.C
new file mode 100644
index 000..915e5e37675
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/decltype82.C
@@ -0,0 +1,10 @@
+// PR c++/105756
+// { dg-do compile { target c++11 } }
+
+template<int N>
+void f() {
+  using ty1 = decltype((5 % N) == 0);
+  using ty2 = decltype((5 / N) == 0);
+}
+
+template void f<0>();
-- 
2.36.1.203.g1bcf4f6271



[PATCH 0/6] OpenMP 5.0: Fortran "declare mapper" support

2022-06-01 Thread Julian Brown
This patch series implements "declare mapper" support for Fortran,
following on from the C and C++ support for same in the currently
in-review series posted here:

  https://gcc.gnu.org/pipermail/gcc-patches/2022-March/591973.html

Further commentary on individual patches. Tested with offloading to NVPTX.

OK? (Pending rework of the patch series it depends on?)

Thanks,

Julian

Julian Brown (6):
  Fortran: Typo/unicode-o fixes
  OpenMP: Templatize omp_mapper_list
  OpenMP: Rename strip_components_and_deref to omp_get_root_term
  OpenMP: Tweak NOP handling in omp_get_root_term and
accumulate_sibling_list
  OpenMP: Pointers and member mappings
  OpenMP: Fortran "!$omp declare mapper" support

 gcc/c-family/c-common.h   |   4 +-
 gcc/c-family/c-omp.cc |   2 +-
 gcc/c/c-decl.cc   |   6 +-
 gcc/cp/semantics.cc   |   8 +-
 gcc/fortran/dump-parse-tree.cc|   5 +-
 gcc/fortran/f95-lang.cc   |   7 +
 gcc/fortran/gfortran.h|  55 +-
 gcc/fortran/match.cc  |   8 +-
 gcc/fortran/match.h   |   1 +
 gcc/fortran/module.cc | 252 +-
 gcc/fortran/openmp.cc | 296 ++-
 gcc/fortran/parse.cc  |   9 +-
 gcc/fortran/resolve.cc|   2 +
 gcc/fortran/st.cc |   2 +-
 gcc/fortran/symbol.cc |  16 +
 gcc/fortran/trans-decl.cc |  30 +-
 gcc/fortran/trans-openmp.cc   | 778 +-
 gcc/fortran/trans-stmt.h  |   1 +
 gcc/fortran/trans.h   |   3 +
 gcc/gimplify.cc   | 422 --
 gcc/omp-general.h |  32 +-
 .../gfortran.dg/gomp/declare-mapper-1.f90 |  71 ++
 .../gfortran.dg/gomp/declare-mapper-14.f90|  26 +
 .../gfortran.dg/gomp/declare-mapper-16.f90|  22 +
 .../gfortran.dg/gomp/declare-mapper-5.f90 |  45 +
 gcc/tree-pretty-print.cc  |   3 +
 include/gomp-constants.h  |   5 +-
 .../libgomp.fortran/declare-mapper-10.f90 |  40 +
 .../libgomp.fortran/declare-mapper-11.f90 |  38 +
 .../libgomp.fortran/declare-mapper-12.f90 |  33 +
 .../libgomp.fortran/declare-mapper-13.f90 |  49 ++
 .../libgomp.fortran/declare-mapper-15.f90 |  24 +
 .../libgomp.fortran/declare-mapper-17.f90 |  92 +++
 .../libgomp.fortran/declare-mapper-18.f90 |  46 ++
 .../libgomp.fortran/declare-mapper-19.f90 |  29 +
 .../libgomp.fortran/declare-mapper-2.f90  |  32 +
 .../libgomp.fortran/declare-mapper-20.f90 |  29 +
 .../libgomp.fortran/declare-mapper-3.f90  |  33 +
 .../libgomp.fortran/declare-mapper-4.f90  |  36 +
 .../libgomp.fortran/declare-mapper-6.f90  |  28 +
 .../libgomp.fortran/declare-mapper-7.f90  |  29 +
 .../libgomp.fortran/declare-mapper-8.f90  | 115 +++
 .../libgomp.fortran/declare-mapper-9.f90  |  27 +
 .../libgomp.fortran/map-subarray.f90  |  33 +
 .../libgomp.fortran/map-subcomponents.f90 |  32 +
 .../libgomp.fortran/struct-elem-map-1.f90 |  10 +-
 46 files changed, 2702 insertions(+), 164 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/declare-mapper-1.f90
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/declare-mapper-14.f90
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/declare-mapper-16.f90
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/declare-mapper-5.f90
 create mode 100644 libgomp/testsuite/libgomp.fortran/declare-mapper-10.f90
 create mode 100644 libgomp/testsuite/libgomp.fortran/declare-mapper-11.f90
 create mode 100644 libgomp/testsuite/libgomp.fortran/declare-mapper-12.f90
 create mode 100644 libgomp/testsuite/libgomp.fortran/declare-mapper-13.f90
 create mode 100644 libgomp/testsuite/libgomp.fortran/declare-mapper-15.f90
 create mode 100644 libgomp/testsuite/libgomp.fortran/declare-mapper-17.f90
 create mode 100644 libgomp/testsuite/libgomp.fortran/declare-mapper-18.f90
 create mode 100644 libgomp/testsuite/libgomp.fortran/declare-mapper-19.f90
 create mode 100644 libgomp/testsuite/libgomp.fortran/declare-mapper-2.f90
 create mode 100644 libgomp/testsuite/libgomp.fortran/declare-mapper-20.f90
 create mode 100644 libgomp/testsuite/libgomp.fortran/declare-mapper-3.f90
 create mode 100644 libgomp/testsuite/libgomp.fortran/declare-mapper-4.f90
 create mode 100644 libgomp/testsuite/libgomp.fortran/declare-mapper-6.f90
 create mode 100644 libgomp/testsuite/libgomp.fortran/declare-mapper-7.f90
 create mode 100644 libgomp/testsuite/libgomp.fortran/declare-mapper-8.f90
 create mode 100644 libgomp/testsuite/libgomp.fortran/declare-mapper-9.f90
 create mode 100644 libgomp/testsuite/libgomp.fortran/map-subarray.f90
 create mode 100644 libgomp/testsuite/libgomp.fortran/map-subcomponents.f90

-

[PATCH 1/6] Fortran: Typo/unicode-o fixes

2022-06-01 Thread Julian Brown
This patch fixes a minor typo in dump output and a stray unicode character
in a comment.

This one probably counts as obvious.

2022-06-01  Julian Brown  

gcc/fortran/
* dump-parse-tree.cc (show_attr): Fix OMP-UDR-ARTIFICIAL-VAR typo.
* trans-openmp.cc (gfc_trans_omp_array_section): Replace stray unicode
m-dash character with hyphen.
---
 gcc/fortran/dump-parse-tree.cc | 2 +-
 gcc/fortran/trans-openmp.cc| 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/fortran/dump-parse-tree.cc b/gcc/fortran/dump-parse-tree.cc
index 3112caec053..f7bf91370a5 100644
--- a/gcc/fortran/dump-parse-tree.cc
+++ b/gcc/fortran/dump-parse-tree.cc
@@ -893,7 +893,7 @@ show_attr (symbol_attribute *attr, const char * module)
   if (attr->pdt_string)
 fputs (" PDT-STRING", dumpfile);
   if (attr->omp_udr_artificial_var)
-fputs (" OMP-UDT-ARTIFICIAL-VAR", dumpfile);
+fputs (" OMP-UDR-ARTIFICIAL-VAR", dumpfile);
   if (attr->omp_declare_target)
 fputs (" OMP-DECLARE-TARGET", dumpfile);
   if (attr->omp_declare_target_link)
diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc
index 8c6f6a250de..9ca019b9535 100644
--- a/gcc/fortran/trans-openmp.cc
+++ b/gcc/fortran/trans-openmp.cc
@@ -2440,7 +2440,7 @@ gfc_trans_omp_array_section (stmtblock_t *block, 
gfc_omp_namelist *n,
= gfc_conv_descriptor_data_get (decl);
   /* This purposely does not include GOMP_MAP_ALWAYS_POINTER.  The extra
 cast prevents gimplify.cc from recognising it as being part of the
-struct – and adding an 'alloc: for the 'desc.data' pointer, which
+struct - and adding an 'alloc: for the 'desc.data' pointer, which
 would break as the 'desc' (the descriptor) is also mapped
 (see node4 above).  */
   if (ptr_kind == GOMP_MAP_ATTACH_DETACH)
-- 
2.29.2



[PATCH 2/6] OpenMP: Templatize omp_mapper_list

2022-06-01 Thread Julian Brown
This patch parameterizes the omp_mapper_list class to allow it to use
different representations for the types of mappers -- e.g., to allow
Fortran to gather mappers by "gfc_typespec *" instead of tree type
(in a later patch in the series).

There should be no behavioural changes introduced by this patch.

OK?

Julian

2022-06-01  Julian Brown  

gcc/c-family/
* c-common.h (omp_mapper_list): Add T type parameter.
(c_omp_find_nested_mappers): Update prototype.
* c-omp.cc (c_omp_find_nested_mappers): Use omp_mapper_list.

gcc/c/
* c-decl.cc (c_omp_scan_mapper_bindings): Use omp_name_type and
omp_mapper_list.

gcc/cp/
* semantics.cc (omp_target_walk_data, finish_omp_target_clauses_r):
Likewise.

gcc/
* gimplify.cc (gimplify_omp_ctx, new_omp_context,
omp_instantiate_mapper): Use omp_name_type.
* omp-general.h (omp_name_type): Parameterize by T.
(hash traits template): Use omp_name_type.
(omp_mapper_list): Parameterize by T.
---
 gcc/c-family/c-common.h |  4 ++--
 gcc/c-family/c-omp.cc   |  2 +-
 gcc/c/c-decl.cc |  6 +++---
 gcc/cp/semantics.cc |  8 
 gcc/gimplify.cc |  6 +++---
 gcc/omp-general.h   | 32 +---
 6 files changed, 30 insertions(+), 28 deletions(-)

diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index adebd0a2605..fe493fb3916 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1252,8 +1252,8 @@ extern tree c_omp_check_context_selector (location_t, 
tree);
 extern void c_omp_mark_declare_variant (location_t, tree, tree);
 extern const char *c_omp_map_clause_name (tree, bool);
 extern void c_omp_adjust_map_clauses (tree, bool);
-struct omp_mapper_list;
-extern void c_omp_find_nested_mappers (struct omp_mapper_list *, tree);
+template struct omp_mapper_list;
+extern void c_omp_find_nested_mappers (struct omp_mapper_list *, tree);
 extern tree c_omp_instantiate_mappers (tree);
 
 class c_omp_address_inspector
diff --git a/gcc/c-family/c-omp.cc b/gcc/c-family/c-omp.cc
index 789da097bb0..ee02121f1d5 100644
--- a/gcc/c-family/c-omp.cc
+++ b/gcc/c-family/c-omp.cc
@@ -3401,7 +3401,7 @@ c_omp_address_inspector::get_attachment_point (tree expr)
themselves, add it to MLIST.  */
 
 void
-c_omp_find_nested_mappers (omp_mapper_list *mlist, tree mapper_fn)
+c_omp_find_nested_mappers (omp_mapper_list *mlist, tree mapper_fn)
 {
   tree mapper = lang_hooks.decls.omp_extract_mapper_directive (mapper_fn);
   tree mapper_name = NULL_TREE;
diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index 64e5faf7137..ea920e5c452 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -12562,7 +12562,7 @@ static tree
 c_omp_scan_mapper_bindings_r (tree *tp, int *walk_subtrees, void *ptr)
 {
   tree t = *tp;
-  omp_mapper_list *mlist = (omp_mapper_list *) ptr;
+  omp_mapper_list *mlist = (omp_mapper_list *) ptr;
   tree aggr_type = NULL_TREE;
 
   if (TREE_CODE (t) == SIZEOF_EXPR
@@ -12600,9 +12600,9 @@ c_omp_scan_mapper_bindings_r (tree *tp, int 
*walk_subtrees, void *ptr)
 void
 c_omp_scan_mapper_bindings (location_t loc, tree *clauses_ptr, tree body)
 {
-  hash_set seen_types;
+  hash_set> seen_types;
   auto_vec mappers;
-  omp_mapper_list mlist (&seen_types, &mappers);
+  omp_mapper_list mlist (&seen_types, &mappers);
 
   walk_tree_without_duplicates (&body, c_omp_scan_mapper_bindings_r, &mlist);
 
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 21234be3c31..4e19872c246 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -9455,7 +9455,7 @@ struct omp_target_walk_data
  variables when recording lambda_objects_accessed.  */
   hash_set local_decls;
 
-  omp_mapper_list *mappers;
+  omp_mapper_list *mappers;
 };
 
 /* Helper function of finish_omp_target_clauses, called via
@@ -9469,7 +9469,7 @@ finish_omp_target_clauses_r (tree *tp, int 
*walk_subtrees, void *ptr)
   struct omp_target_walk_data *data = (struct omp_target_walk_data *) ptr;
   tree current_object = data->current_object;
   tree current_closure = data->current_closure;
-  omp_mapper_list *mlist = data->mappers;
+  omp_mapper_list *mlist = data->mappers;
   tree aggr_type = NULL_TREE;
 
   /* References inside of these expression codes shouldn't incur any
@@ -9603,9 +9603,9 @@ finish_omp_target_clauses (location_t loc, tree body, 
tree *clauses_ptr)
   else
 data.current_closure = NULL_TREE;
 
-  hash_set seen_types;
+  hash_set > seen_types;
   auto_vec mapper_fns;
-  omp_mapper_list mlist (&seen_types, &mapper_fns);
+  omp_mapper_list mlist (&seen_types, &mapper_fns);
   data.mappers = &mlist;
 
   cp_walk_tree_without_duplicates (&body, finish_omp_target_clauses_r, &data);
diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
index 861159687a7..cb6877b5009 100644
--- a/gcc/gimplify.cc
+++ b/gcc/gimplify.cc
@@ -219,7 +219,7 @@ struct gimplify_omp_ctx
 {
   struct gimplify_omp_ctx *outer_context;
   splay_tree variables;
-  hash_map *implicit_mappers;
+  ha

[PATCH 3/6] OpenMP: Rename strip_components_and_deref to omp_get_root_term

2022-06-01 Thread Julian Brown
This patch renames the strip_components_and_deref function to better
describe what it does. I'll fold this into the originating patch series
during rework.

2022-06-01  Julian Brown  

gcc/
* gimplify.cc (strip_components_and_deref): Rename to...
(omp_get_root_term): This.
---
 gcc/gimplify.cc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
index cb6877b5009..1646fdaa9b8 100644
--- a/gcc/gimplify.cc
+++ b/gcc/gimplify.cc
@@ -8768,7 +8768,7 @@ omp_get_base_pointer (tree expr)
 /* Remove COMPONENT_REFS and indirections from EXPR.  */
 
 static tree
-strip_components_and_deref (tree expr)
+omp_get_root_term (tree expr)
 {
   while (TREE_CODE (expr) == COMPONENT_REF
 || TREE_CODE (expr) == INDIRECT_REF
@@ -11168,7 +11168,7 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq 
*pre_p,
 
  if (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_STRUCT)
{
- tree base = strip_components_and_deref (decl);
+ tree base = omp_get_root_term (decl);
  if (DECL_P (base))
{
  decl = base;
-- 
2.29.2



[PATCH 4/6] OpenMP: Tweak NOP handling in omp_get_root_term and accumulate_sibling_list

2022-06-01 Thread Julian Brown
This patch strips NOPs in omp_get_root_term and accumulate_sibling_list
to cover cases that came up writing tests for "omp declare mapper"
functionality. I'll fold this into the originating patch series for
those functions during rework.
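
For illustration only (this is not code from the patch): a minimal sketch of what "looking through NOPs" means for a root-term walk. The Expr type, the Kind enum and the function names are hypothetical stand-ins for GCC's tree nodes, and the MEM_REF case handled by the real function is left out.

```cpp
#include <cassert>

// Toy expression kinds, loosely named after the tree codes in the patch.
enum class Kind { Decl, ComponentRef, IndirectRef, PointerPlus, Compound, Nop };

// Hypothetical stand-in for a GCC tree node: a kind plus up to two operands.
struct Expr {
  Kind kind;
  const Expr *op0;
  const Expr *op1;
};

// Peel wrappers until something that is not a wrapper is reached.  A compound
// expression yields its second operand, every other wrapper its first.  NOPs
// are only looked through when LOOK_THROUGH_NOPS is set.
const Expr *root_term(const Expr *e, bool look_through_nops) {
  for (;;) {
    switch (e->kind) {
      case Kind::Compound:
        e = e->op1;
        break;
      case Kind::ComponentRef:
      case Kind::IndirectRef:
      case Kind::PointerPlus:
        e = e->op0;
        break;
      case Kind::Nop:
        if (!look_through_nops)
          return e;  // The walk stops at the conversion.
        e = e->op0;
        break;
      case Kind::Decl:
        return e;
    }
  }
}

// Builds COMPONENT_REF (NOP (decl)) and reports whether the walk reaches the
// declaration.
bool reaches_decl(bool look_through_nops) {
  const Expr decl = {Kind::Decl, nullptr, nullptr};
  const Expr nop = {Kind::Nop, &decl, nullptr};
  const Expr member = {Kind::ComponentRef, &nop, nullptr};
  return root_term(&member, look_through_nops) == &decl;
}
```

In this toy model reaches_decl (false) stops at the conversion, while reaches_decl (true) arrives at the declaration, which is the situation the DECL_P check on the result cares about.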

2022-06-01  Julian Brown  

gcc/
* gimplify.cc (omp_get_root_term): Look through NOP_EXPRs.
(accumulate_sibling_list): Strip NOPs on struct base pointers.
---
 gcc/gimplify.cc | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
index 1646fdaa9b8..742fd5e4a8d 100644
--- a/gcc/gimplify.cc
+++ b/gcc/gimplify.cc
@@ -8775,7 +8775,8 @@ omp_get_root_term (tree expr)
 || (TREE_CODE (expr) == MEM_REF
 && integer_zerop (TREE_OPERAND (expr, 1)))
 || TREE_CODE (expr) == POINTER_PLUS_EXPR
-|| TREE_CODE (expr) == COMPOUND_EXPR)
+|| TREE_CODE (expr) == COMPOUND_EXPR
+|| TREE_CODE (expr) == NOP_EXPR)
   if (TREE_CODE (expr) == COMPOUND_EXPR)
expr = TREE_OPERAND (expr, 1);
   else
@@ -9932,6 +9933,8 @@ accumulate_sibling_list (enum omp_region_type 
region_type, enum tree_code code,
sdecl = TREE_OPERAND (sdecl, 0);
}
 
+  STRIP_NOPS (sdecl);
+
   while (TREE_CODE (sdecl) == POINTER_PLUS_EXPR)
sdecl = TREE_OPERAND (sdecl, 0);
 
-- 
2.29.2



[PATCH 5/6] OpenMP: Pointers and member mappings

2022-06-01 Thread Julian Brown
Implementing the "omp declare mapper" functionality, I noticed some
cases where handling of derived type members that are pointers doesn't
seem to be quite right. At present, a type such as this:

  type T
  integer, pointer, dimension(:) :: arrptr
  end type T

  type(T) :: tvar
  [...]
  !$omp target map(tofrom: tvar%arrptr)

will be mapped using three mapping nodes:

  GOMP_MAP_TO tvar%arrptr   (the descriptor)
  GOMP_MAP_TOFROM *tvar%arrptr%data (the actual array data)
  GOMP_MAP_ALWAYS_POINTER tvar%arrptr%data  (a pointer to the array data)

This follows OMP 5.0, 2.19.7.1 "map Clause":

  "If a list item in a map clause is an associated pointer and the
   pointer is not the base pointer of another list item in a map clause
   on the same construct, then it is treated as if its pointer target
   is implicitly mapped in the same clause. For the purposes of the map
   clause, the mapped pointer target is treated as if its base pointer
   is the associated pointer."

However, we can also write this:

  map(to: tvar%arrptr) map(tofrom: tvar%arrptr(3:8))

and then instead we should follow:

  "If the structure sibling list item is a pointer then it is treated
   as if its association status is undefined, unless it appears as
   the base pointer of another list item in a map clause on the same
   construct."

But, that's not implemented quite right at the moment (and completely
breaks once we introduce declare mappers), because we still map the "to:
tvar%arrptr" as the descriptor and the entire array, then we map the
"tvar%arrptr(3:8)" part using the descriptor (again!) and the array slice.

The solution is to detect when we're mapping a smaller part of the array
(or a subcomponent) on the same directive, and only map the descriptor
in that case. So we get mappings like this instead:

  map(to: tvar%arrptr)   -->
  GOMP_MAP_ALLOC  tvar%arrptr  (the descriptor)

  map(tofrom: tvar%arrptr(3:8))   -->
  GOMP_MAP_TOFROM tvar%arrptr%data(3) (size 8-3+1, etc.)
  GOMP_MAP_ALWAYS_POINTER tvar%arrptr%data (bias 3, etc.)
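
As a rough model of the rule described above (not the gfortran implementation): the sketch below decides, per clause, whether to emit the full three-node mapping or only the descriptor, depending on whether the same pointer is the base of another list item. MapClause, plan_nodes and the textual node format are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified view of one map clause on a directive.
struct MapClause {
  std::string kind;  // map type, e.g. "TO" or "TOFROM"
  std::string base;  // mapped pointer component, e.g. "tvar%arrptr"
  bool has_section;  // true for base(lo:hi), false for the bare pointer
  int lo, hi;        // inclusive bounds, only meaningful with has_section
};

// Number of elements in the inclusive section lo:hi, i.e. hi-lo+1.
int section_length(int lo, int hi) { return hi - lo + 1; }

// True if another clause in the list maps a section of the same base pointer.
bool is_base_of_another_item(const std::vector<MapClause> &clauses,
                             std::size_t self) {
  for (std::size_t i = 0; i < clauses.size(); ++i)
    if (i != self && clauses[i].has_section
        && clauses[i].base == clauses[self].base)
      return true;
  return false;
}

// Produce a textual description of the mapping nodes for each clause.
std::vector<std::string> plan_nodes(const std::vector<MapClause> &clauses) {
  std::vector<std::string> nodes;
  for (std::size_t i = 0; i < clauses.size(); ++i) {
    const MapClause &c = clauses[i];
    if (c.has_section) {
      // Array slice: map the data and attach the pointer.
      nodes.push_back("GOMP_MAP_" + c.kind + " " + c.base + "%data("
                      + std::to_string(c.lo) + ") size "
                      + std::to_string(section_length(c.lo, c.hi)));
      nodes.push_back("GOMP_MAP_ALWAYS_POINTER " + c.base + "%data");
    } else if (is_base_of_another_item(clauses, i)) {
      // Base pointer of another list item: descriptor only.
      nodes.push_back("GOMP_MAP_ALLOC " + c.base);
    } else {
      // Lone pointer: descriptor, whole target, and pointer.
      nodes.push_back("GOMP_MAP_TO " + c.base);
      nodes.push_back("GOMP_MAP_" + c.kind + " *" + c.base + "%data");
      nodes.push_back("GOMP_MAP_ALWAYS_POINTER " + c.base + "%data");
    }
  }
  return nodes;
}

// map(to: tvar%arrptr) map(tofrom: tvar%arrptr(3:8))
std::vector<MapClause> example_clauses() {
  return {{"TO", "tvar%arrptr", false, 0, 0},
          {"TOFROM", "tvar%arrptr", true, 3, 8}};
}
```

For the two-clause example this yields one GOMP_MAP_ALLOC node for the descriptor followed by the slice nodes, with a slice length of 8-3+1 = 6 elements.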

OK?

Thanks,

Julian

2022-06-01  Julian Brown  

gcc/fortran/
* trans-openmp.cc (dependency.h): Include.
(gfc_trans_omp_array_section): Do not map descriptors here for OpenMP.
(gfc_trans_omp_clauses): Check subcomponent and subarray/element
accesses elsewhere in the clause list for pointers to derived types or
array descriptors, and map just the pointer/descriptor if we have any.

libgomp/
* testsuite/libgomp.fortran/map-subarray.f90: New test.
* testsuite/libgomp.fortran/map-subcomponents.f90: New test.
* testsuite/libgomp.fortran/struct-elem-map-1.f90: Adjust for
descriptor-mapping changes.
---
 gcc/fortran/trans-openmp.cc   | 106 +++---
 .../libgomp.fortran/map-subarray.f90  |  33 ++
 .../libgomp.fortran/map-subcomponents.f90 |  32 ++
 .../libgomp.fortran/struct-elem-map-1.f90 |  10 +-
 4 files changed, 164 insertions(+), 17 deletions(-)
 create mode 100644 libgomp/testsuite/libgomp.fortran/map-subarray.f90
 create mode 100644 libgomp/testsuite/libgomp.fortran/map-subcomponents.f90

diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc
index 9ca019b9535..21f3336a898 100644
--- a/gcc/fortran/trans-openmp.cc
+++ b/gcc/fortran/trans-openmp.cc
@@ -40,6 +40,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "omp-general.h"
 #include "omp-low.h"
 #include "memmodel.h"  /* For MEMMODEL_ enums.  */
+#include "dependency.h"
 
 #undef GCC_DIAG_STYLE
 #define GCC_DIAG_STYLE __gcc_tdiag__
@@ -2416,22 +2417,18 @@ gfc_trans_omp_array_section (stmtblock_t *block, 
gfc_omp_namelist *n,
 }
   if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (decl)))
 {
-  tree desc_node;
   tree type = TREE_TYPE (decl);
   ptr2 = gfc_conv_descriptor_data_get (decl);
-  desc_node = build_omp_clause (input_location, OMP_CLAUSE_MAP);
-  OMP_CLAUSE_DECL (desc_node) = decl;
-  OMP_CLAUSE_SIZE (desc_node) = TYPE_SIZE_UNIT (type);
-  if (ptr_kind == GOMP_MAP_ALWAYS_POINTER)
+  if (ptr_kind != GOMP_MAP_ALWAYS_POINTER)
{
- OMP_CLAUSE_SET_MAP_KIND (desc_node, GOMP_MAP_TO);
- node2 = node;
- node = desc_node;  /* Needs to come first.  */
-   }
-  else
-   {
- OMP_CLAUSE_SET_MAP_KIND (desc_node, GOMP_MAP_TO_PSET);
- node2 = desc_node;
+ /* For OpenMP, the descriptor must be mapped with its own explicit
+map clause (e.g. both "map(foo%arr)" and "map(foo%arr(:))" must
+be present in the clause list if "foo%arr" is a pointer to an
+array).  So, we don't create a GOMP_MAP_TO_PSET node here.  */
+ node2 = build_omp_clause (input_location, OMP_CLAUSE_MAP);
+ OMP_CLAUSE_SET_MAP_KIND (node2, GOMP_MAP_TO_PSET);
+ OMP_CLAUSE_DECL (node2) = decl;
+ OMP_CLAUSE_SIZE (node2) = TYPE_SIZE_UNIT (type);
}
   no

[PATCH 6/6] OpenMP: Fortran "!$omp declare mapper" support

2022-06-01 Thread Julian Brown
This patch implements "omp declare mapper" functionality for Fortran,
following the equivalent support for C and C++.

Fortran differs quite substantially from C and C++ in that "map"
clauses are naturally represented in the gfortran front-end's own
representation rather than as trees. Those are turned into one -- or
several -- OMP_CLAUSE_MAP nodes in gfc_trans_omp_clauses.

The "several nodes" case is problematic for mappers, for a few different
reasons:

 - Firstly, if we're invoking a nested mapper, we need some way of
   keeping those nodes together so they can be replaced "as one" by the
   clauses listed in that mapper. (For C and C++, a single OMP_CLAUSE_MAP
   node is used to represent a map clause early in compilation, which
   is then expanded in c_finish_omp_clauses for C, and similar for C++.
   We process mappers before that function is called.)

 - Secondly, the process of translating FE representation of clauses
   into "tree" mapping nodes can generate preamble code, and we need to
   either defer that generation or else put the preamble code somewhere
   if we're defining a mapper.

 - Thirdly, gfc_trans_omp_clauses needs to examine both the FE
   representation and partially-translated tree codes.  In the case
   where we're instantiating mappers implicitly from the middle end,
   the FE representation is long gone.

The scheme used is as follows.

For the first problem, we introduce a GOMP_MAP_MAPPING_GROUP mapping kind.
This is used to keep several mapping nodes together in mapper definitions
until instantiation time.  If the group triggers a nested mapper,
the required information can be extracted from it and then it can be
deleted/replaced as a whole.

For the second and third problems, we emit preamble code into a function
wrapping the "omp declare mapper" node.  This extends the scheme currently
under review for C++, and performs inlining of a modified version of
the function whenever a mapper is invoked from the middle-end.  New copies
of variables (e.g. temporary array descriptors or other metadata) are
introduced to copy needed values out of the inlined function to where
they're needed in the mapper instantiation.

For Fortran, we also need to add special-case handling for mapping
derived-type variables that are (a) pointers and (b) trigger a mapper,
in both the explicit mapping and implicit mapping cases.  If we have a
type and a mapper like this:

  type T
  integer, dimension(10) :: iarr
  end type T

  type(T), pointer :: tptr

  !$omp declare mapper (T :: t) map(t%iarr)

  !$omp target map(tptr)
  [...]
  !$omp end target

Here "map(tptr)" maps the pointer itself, and implicitly maps the
pointed-to object as well.  So, when invoking the mapper, rather than
rewriting this as just:

  !$omp target map(tptr%iarr)

we must introduce a new node to map the pointer also, i.e.:

  !$omp target map(alloc:tptr) map(tptr%iarr)

...before the mapping nodes go off to gimplify for processing.
(This relates to the previous patch in the series.)
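
A small sketch of that rewrite, under the assumption that a mapper is just a list of member names and clauses are plain strings (neither is how the front end represents them; instantiate_mapper and example_expansion are made-up names):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Expand "map(VAR)" through a mapper that maps the listed members of VAR.
// When VAR is itself a pointer, an extra alloc clause keeps the pointer
// mapped alongside the members of its target.
std::vector<std::string> instantiate_mapper(
    const std::string &var, bool var_is_pointer,
    const std::vector<std::string> &members) {
  std::vector<std::string> clauses;
  if (var_is_pointer)
    clauses.push_back("map(alloc:" + var + ")");
  for (const std::string &m : members)
    clauses.push_back("map(" + var + "%" + m + ")");
  return clauses;
}

// The example from the text: type(T), pointer :: tptr with map(t%iarr).
std::vector<std::string> example_expansion(bool is_pointer) {
  return instantiate_mapper("tptr", is_pointer, {"iarr"});
}
```

With is_pointer set, the result is the pair "map(alloc:tptr)" and "map(tptr%iarr)"; without it, only the member clause is produced.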

We also need to handle module writing and reading for "declare mappers".
This requires an ABI bump that I noticed one of Tobias's patches also
does, so we'll probably need to synchronize on that somehow.

OK?

Thanks,

Julian

2022-06-01  Julian Brown  

gcc/fortran/
* dump-parse-tree.cc (show_attr): Show omp_udm_artificial_var flag.
(show_omp_namelist): Support OMP_MAP_UNSET.
* f95-lang.cc (LANG_HOOKS_OMP_FINISH_MAPPER_CLAUSES,
LANG_HOOKS_OMP_EXTRACT_MAPPER_DIRECTIVE,
LANG_HOOKS_OMP_MAP_ARRAY_SECTION): Define language hooks.
* gfortran.h (gfc_statement): Add ST_OMP_DECLARE_MAPPER.
(symbol_attribute): Add omp_udm_artificial_var attribute.
(gfc_omp_map_op): Add OMP_MAP_UNSET.
(gfc_omp_namelist): Add udm pointer to u2 union.
(gfc_omp_udm): New struct.
(gfc_omp_namelist_udm): New struct.
(gfc_symtree): Add omp_udm pointer.
(gfc_namespace): Add omp_udm_root symtree. Add omp_udm_ns flag.
(gfc_free_omp_namelist): Update prototype.
(gfc_free_omp_udm, gfc_omp_udm_find, gfc_find_omp_udm,
gfc_resolve_omp_udms): Add prototypes.
* match.cc (gfc_free_omp_namelist): Change FREE_NS parameter to LIST,
to handle freeing user-defined mapper namelists safely.
* match.h (gfc_match_omp_declare_mapper): Add prototype.
* module.cc (MOD_VERSION): Bump to 16.
(ab_attribute): Add AB_OMP_DECLARE_MAPPER_VAR.
(attr_bits): Add OMP_DECLARE_MAPPER_VAR.
(mio_symbol_attribute): Read/write AB_OMP_DECLARE_MAPPER_VAR attribute.
Set referenced attr on read.
(omp_map_clause_ops, omp_map_cardinality): New arrays.
(load_omp_udms, check_omp_declare_mappers): New functions.
(read_module): Load and check OMP declare mappers.
(write_omp_udm, write_omp_udms): New functions.
(write_module): Write OMP declare mappers.
* openmp.cc (gfc_free_omp_clauses, gfc_mat

[pushed] c++: auto function as function argument [PR105779]

2022-06-01 Thread Jason Merrill via Gcc-patches
This testcase demonstrates that the issue in PR105623 is not limited to
templates, so we should do the marking in a less template-specific place.

Tested x86_64-pc-linux-gnu, applying to trunk.

PR c++/105779

gcc/cp/ChangeLog:

* call.cc (resolve_args): Call mark_single_function here.
* pt.cc (unify_one_argument): Not here.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/auto-fn63.C: New test.
---
 gcc/cp/call.cc |  5 +
 gcc/cp/pt.cc   |  4 
 gcc/testsuite/g++.dg/cpp1y/auto-fn63.C | 12 
 3 files changed, 17 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp1y/auto-fn63.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index 85fe9b5ab85..4710c3777c5 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -4672,6 +4672,11 @@ resolve_args (vec *args, tsubst_flags_t 
complain)
}
   else if (invalid_nonstatic_memfn_p (EXPR_LOCATION (arg), arg, complain))
return NULL;
+
+  /* Force auto deduction now.  Omit tf_warning to avoid redundant
+deprecated warning on deprecated-14.C.  */
+  if (!mark_single_function (arg, tf_error))
+   return NULL;
 }
   return args;
 }
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 4f0ace2644b..6de8e496859 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -22624,10 +22624,6 @@ unify_one_argument (tree tparms, tree targs, tree 
parm, tree arg,
  return unify_success (explain_p);
}
 
- /* Force auto deduction now.  Use tf_none to avoid redundant
-deprecated warning on deprecated-14.C.  */
- mark_single_function (arg, tf_none);
-
  arg_expr = arg;
  arg = unlowered_expr_type (arg);
  if (arg == error_mark_node)
diff --git a/gcc/testsuite/g++.dg/cpp1y/auto-fn63.C 
b/gcc/testsuite/g++.dg/cpp1y/auto-fn63.C
new file mode 100644
index 000..ca3bc854065
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/auto-fn63.C
@@ -0,0 +1,12 @@
+// PR c++/105779
+// { dg-do compile { target c++14 } }
+
+template
+struct struct1
+{
+  static auto apply() { return 1; }
+};
+
+int method(int(*f)());
+
+int t = method(struct1<1>::apply);

base-commit: ae54c1b09963779c5c3914782324ff48af32e2f1
-- 
2.27.0



[pushed] c++: auto and dependent member name [PR105734]

2022-06-01 Thread Jason Merrill via Gcc-patches
In r12-3643 I improved our handling of type names after . or -> when
unqualified lookup doesn't find anything, but it needs to handle auto
specially.

Tested x86_64-pc-linux-gnu, applying to trunk.

PR c++/105734

gcc/cp/ChangeLog:

* parser.cc (cp_parser_postfix_dot_deref_expression): Use typeof
if the expression has auto type.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/auto57.C: New test.
---
 gcc/cp/parser.cc|  2 +-
 gcc/testsuite/g++.dg/cpp0x/auto57.C | 15 +++
 2 files changed, 16 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/auto57.C

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 0eefa740dc5..3fc73442da5 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -8262,7 +8262,7 @@ cp_parser_postfix_dot_deref_expression (cp_parser *parser,
   tree type = TREE_TYPE (postfix_expression);
   /* If we don't have a (type-dependent) object of class type, use
 typeof to figure out the type of the object.  */
-  if (type == NULL_TREE)
+  if (type == NULL_TREE || is_auto (type))
type = finish_typeof (postfix_expression);
   parser->context->object_type = type;
 }
diff --git a/gcc/testsuite/g++.dg/cpp0x/auto57.C 
b/gcc/testsuite/g++.dg/cpp0x/auto57.C
new file mode 100644
index 000..fedcfde2f0c
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/auto57.C
@@ -0,0 +1,15 @@
+// PR c++/105734
+// { dg-do compile { target c++11 } }
+
+namespace N {
+  struct A { };
+  A f(A);
+}
+
+template 
+void bar() {
+  auto m = f(T());
+  m.~A();
+}
+
+void foo() { bar(); }

base-commit: ae54c1b09963779c5c3914782324ff48af32e2f1
prerequisite-patch-id: f5fe205a2e35ff3b41ef11e9c72a2f2c12c0f3c5
-- 
2.27.0



Re: [PATCH] [PR105665] ivopts: check defs of names in base for undefs

2022-06-01 Thread Alexandre Oliva via Gcc-patches
On Jun  1, 2022, Richard Biener  wrote:

> On Tue, May 31, 2022 at 3:27 PM Alexandre Oliva  wrote:
>int i;
>if (flag)
>  i = init;
>i++;

> still would (the propagation stops at i++).

Uh, are you sure?  That doesn't sound right.  I meant for the
propagation to affect the incremented i as well, and any users thereof:
the incremented i is maybe-undefined, since its computation involves
another maybe-undefined value.

> do we understand the failure mode good enough to say that
> a must-def (even if from an SSA name without a definition) is good
> enough to avoid the issues we are seeing?

A must-def computed from a maybe-undef doesn't protect us from the
failure.  I assumed the failure mode was understood well enough to make
directly-visible undefined SSA_NAMEs problematic, and, given the finding
that indirectly-visible ones were also problematic, even with
multiple-steps removed, I figured other optimizations such as jump
threading could turn indirectly-visible undef names into
directly-visible ones, or that other passes could take advantage of
indirectly-visible undefinedness leading to potential undefined
behavior, to the point that we ought to avoid that.


> One would argue that my example above invokes undefined behavior
> if (!flag), but IIRC the cases in the PRs we talk about are not but
> IVOPTs with its IV choice exposes undefined behavior - originally
> by relying on undef - undef being zero.

*nod*.  Perhaps we could refrain from propagating through assignments,
like the i++ increment above, rather than PHIs after the conditional
increment in my modified testcase, on the grounds that, if such non-PHI
assignments are exercised, then we can assume all of their operands are
defined, because otherwise undefined behavior would have been invoked.

I.e., at least for ivopts purposes, we could propagate maybe-undefined
down PHI nodes only.
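
To make the two policies concrete, here is a toy model (not ivopts code, and not the patch): SSA names are indices into a vector, each definition is a default (undefined) definition, a real definition, a PHI or an ordinary assignment, and the names are assumed to be acyclic and listed so that operands precede their users. All type and function names are invented for the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class DefKind { Undefined, Defined, Phi, Assign };

// Hypothetical SSA name: how it is defined and which names it uses.
struct SsaName {
  DefKind kind;
  std::vector<std::size_t> operands;
};

// Single forward pass computing "maybe undefined" for each name.  With
// THROUGH_ASSIGNMENTS false, only PHIs inherit the property from their
// operands; ordinary assignments are assumed to have defined operands.
std::vector<bool> maybe_undefined(const std::vector<SsaName> &names,
                                  bool through_assignments) {
  std::vector<bool> mu(names.size(), false);
  for (std::size_t i = 0; i < names.size(); ++i) {
    const SsaName &n = names[i];
    if (n.kind == DefKind::Undefined) {
      mu[i] = true;
      continue;
    }
    if (n.kind == DefKind::Defined)
      continue;
    if (n.kind == DefKind::Assign && !through_assignments)
      continue;
    for (std::size_t op : n.operands)
      if (mu[op]) {
        mu[i] = true;
        break;
      }
  }
  return mu;
}

// int i; if (flag) i = init; i++;
std::vector<SsaName> example_names() {
  return {
      {DefKind::Undefined, {}},  // 0: i_1, default definition, no value
      {DefKind::Defined, {}},    // 1: i_2 = init
      {DefKind::Phi, {0, 1}},    // 2: i_3 = PHI <i_1, i_2>
      {DefKind::Assign, {2}},    // 3: i_4 = i_3 + 1
  };
}
```

Propagating through everything marks both i_3 and i_4; propagating down PHIs only marks i_3 and treats the incremented i_4 as defined.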

> That said, the contains-undef thing tries to avoid rewriting expressions
> with terms that possibly contain undefs which means if we want to
> strengthen it then we look for must-defined (currently it's must-undefined)?

I went for must-defined in the patch, but by computing its negated form,
maybe-undefined.  Now I'm thinking we can go for an even stricter
predicate to disable the optimization: if a non-PHI use of a
maybe-undefined dominates the loop, then we can still perform the
optimization:

>> >> +int g;
>> >> +if (f)
>> >> +  g++, b = 40;

   y = g+2;
   [loop]

The mere use of g before the loop requires it to be defined (even though
that's impossible above: either path carries the uninitialized value,
incremented or not), so we can proceed with the optimization under the
assumption that, after undefined behavior, anything goes.

This would be more useful for the case you showed, in which there's a
conditional initialization followed by an unconditional use.  The
requirement of no undefined behavior implies the conditional guarding
the initializer must hold, so the optimization can be performed.

WDYT?

-- 
Alexandre Oliva, happy hacker                https://FSFLA.org/blogs/lxo/
   Free Software Activist   GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about 


Re: [PATCH v3, rs6000] Implemented f[min/max]_optab by xs[min/max]dp [PR103605]

2022-06-01 Thread Segher Boessenkool
Hi!

Some more nitpicking...

On Wed, May 18, 2022 at 04:52:26PM +0800, HAO CHEN GUI wrote:
>const double __builtin_vsx_xsmaxdp (double, double);
> -XSMAXDP smaxdf3 {}
> +XSMAXDP fmaxdf3 {}
> 
>const double __builtin_vsx_xsmindp (double, double);
> -XSMINDP smindf3 {}
> +XSMINDP fmindf3 {}

Are s{min,max}df3 still used after this?

> +   UNSPEC_FMAX
> +   UNSPEC_FMIN

Pity we have to do this as an unspec still, this should be handled by
some generic code, with some new operator (fmin/fmax would be obvious
names :-) )

> +(define_insn "f3"
> +  [(set (match_operand:SFDF 0 "vsx_register_operand" "=wa")
> + (unspec:SFDF [(match_operand:SFDF 1 "vsx_register_operand" "wa")
> +   (match_operand:SFDF 2 "vsx_register_operand" "wa")]
> +   FMINMAX))]
> +"TARGET_VSX"
> +"xsdp %x0,%x1,%x2"
> +[(set_attr "type" "fp")]
> +)

Indentation is broken here, correct is

(define_insn "f3"
  [(set (match_operand:SFDF 0 "vsx_register_operand" "=wa")
(unspec:SFDF [(match_operand:SFDF 1 "vsx_register_operand" "wa")
  (match_operand:SFDF 2 "vsx_register_operand" "wa")]
 FMINMAX))]
  "TARGET_VSX"
  "xsdp %x0,%x1,%x2"
  [(set_attr "type" "fp")])

(FMINMAX has the same indent as the preceding [, its sibling;
"TARGET_VSX" and the next two lines are indented like the same thing
before it at the same level (the "[(set"); the closing ) never
starts a new line).


Segher


Re: [PATCH v4] DSE: Use the constant store source if possible

2022-06-01 Thread H.J. Lu via Gcc-patches
On Wed, Jun 1, 2022 at 12:20 AM Richard Sandiford
 wrote:
>
> "H.J. Lu"  writes:
> > On Mon, May 30, 2022 at 09:35:43AM +0100, Richard Sandiford wrote:
> >> "H.J. Lu"  writes:
> >> > ---
> >> > RTL DSE tracks redundant constant stores within a basic block.  When RTL
> >> > loop invariant motion hoists a constant initialization out of the loop
> >> > into a separate basic block, the constant store value becomes unknown
> >> > within the original basic block.  When recording store for RTL DSE, check
> >> > if the source register is set only once to a constant by a non-partial
> >> > unconditional load.  If yes, record the constant as the constant store
> >> > source.  It eliminates unrolled zero stores after memset 0 in a loop
> >> > where a vector register is used as the zero store source.
> >> >
> >> > Extract find_single_def_src from loop-iv.cc and move it to df-core.cc:
> >> >
> >> > 1. Rename to df_find_single_def_src.
> >> > 2. Change the argument to rtx and use rtx_equal_p.
> >> > 3. Return null for partial or conditional defs.
> >> >
> >> > gcc/
> >> >
> >> >PR rtl-optimization/105638
> >> >* df-core.cc (df_find_single_def_src): Moved and renamed from
> >> >find_single_def_src in loop-iv.cc.  Change the argument to rtx
> >> >and use rtx_equal_p.  Return null for partial or conditional
> >> >defs.
> >> >* df.h (df_find_single_def_src): New prototype.
> >> >* dse.cc (record_store): Use the constant source if the source
> >> >register is set only once.
> >> >* loop-iv.cc (find_single_def_src): Moved to df-core.cc.
> >> >(replace_single_def_regs): Replace find_single_def_src with
> >> >df_find_single_def_src.
> >> >
> >> > gcc/testsuite/
> >> >
> >> >PR rtl-optimization/105638
> >> >* g++.target/i386/pr105638.C: New test.
> >> > ---
> >> >  gcc/df-core.cc   | 44 +++
> >> >  gcc/df.h |  1 +
> >> >  gcc/dse.cc   | 14 
> >> >  gcc/loop-iv.cc   | 45 +---
> >> >  gcc/testsuite/g++.target/i386/pr105638.C | 44 +++
> >> >  5 files changed, 104 insertions(+), 44 deletions(-)
> >> >  create mode 100644 gcc/testsuite/g++.target/i386/pr105638.C
> >> >
> >> > diff --git a/gcc/df-core.cc b/gcc/df-core.cc
> >> > index a901b84878f..f9b4de8eb7a 100644
> >> > --- a/gcc/df-core.cc
> >> > +++ b/gcc/df-core.cc
> >> > @@ -2009,6 +2009,50 @@ df_reg_used (rtx_insn *insn, rtx reg)
> >> >return df_find_use (insn, reg) != NULL;
> >> >  }
> >> >
> >> > +/* If REG has a single definition, return its known value, otherwise 
> >> > return
> >> > +   null.  */
> >> > +
> >> > +rtx
> >> > +df_find_single_def_src (rtx reg)
> >> > +{
> >> > +  rtx src = NULL_RTX;
> >> > +
> >> > +  /* Don't look through unbounded number of single definition REG 
> >> > copies,
> >> > + there might be loops for sources with uninitialized variables.  */
> >> > +  for (int cnt = 0; cnt < 128; cnt++)
> >> > +{
> >> > +  df_ref adef = DF_REG_DEF_CHAIN (REGNO (reg));
> >> > +  if (adef == NULL || DF_REF_NEXT_REG (adef) != NULL
> >> > +|| DF_REF_IS_ARTIFICIAL (adef)
> >> > +|| (DF_REF_FLAGS (adef)
> >> > +& (DF_REF_PARTIAL | DF_REF_CONDITIONAL)))
> >> > +  return NULL_RTX;
> >> > +
> >> > +  rtx set = single_set (DF_REF_INSN (adef));
> >> > +  if (set == NULL || !rtx_equal_p (SET_DEST (set), reg))
> >> > +  return NULL_RTX;
> >> > +
> >> > +  rtx note = find_reg_equal_equiv_note (DF_REF_INSN (adef));
> >> > +  if (note && function_invariant_p (XEXP (note, 0)))
> >> > +  {
> >> > +src = XEXP (note, 0);
> >> > +break;
> >> > +  }
> >>
> >> Seems simpler to return this directly, rather than break and then
> >> check function_invariant_p again.
> >
> > Fixed.
> >
> >>
> >> > +  src = SET_SRC (set);
> >> > +
> >> > +  if (REG_P (src))
> >> > +  {
> >> > +reg = src;
> >> > +continue;
> >> > +  }
> >> > +  break;
> >> > +}
> >> > +  if (!function_invariant_p (src))
> >> > +return NULL_RTX;
> >> > +
> >> > +  return src;
> >> > +}
> >> > +
> >> >
> >> >  
> >> > /*
> >> > Debugging and printing functions.
> >> > diff --git a/gcc/df.h b/gcc/df.h
> >> > index bd329205d08..71e249ad20a 100644
> >> > --- a/gcc/df.h
> >> > +++ b/gcc/df.h
> >> > @@ -991,6 +991,7 @@ extern df_ref df_find_def (rtx_insn *, rtx);
> >> >  extern bool df_reg_defined (rtx_insn *, rtx);
> >> >  extern df_ref df_find_use (rtx_insn *, rtx);
> >> >  extern bool df_reg_used (rtx_insn *, rtx);
> >> > +extern rtx df_find_single_def_src (rtx);
> >> >  extern void df_worklist_dataflow (struct dataflow *,bitmap, int *, int);
> >> >  extern void df_print_regset (FILE *file, const_bitmap r);
> >> >  extern void df_print_word_regset (FILE *file, const_bitmap r);
> >> > diff --git a/gcc/dse.cc b/gcc/dse.cc
> >> > index 30c11ce

Re: [PATCH] libgcc: Align __EH_FRAME_BEGIN__ to pointer size

2022-06-01 Thread Jeff Law via Gcc-patches




On 1/18/2022 10:09 AM, H.J. Lu via Gcc-patches wrote:

Align __EH_FRAME_BEGIN__ to pointer size since gcc/unwind-dw2-fde.h has

/* The first few fields of a CIE.  The CIE_id field is 0 for a CIE,
to distinguish it from a valid FDE.  FDEs are aligned to an addressing
unit boundary, but the fields within are unaligned.  */
struct dwarf_cie
{
   uword length;
   sword CIE_id;
   ubyte version;
   unsigned char augmentation[];
} __attribute__ ((packed, aligned (__alignof__ (void *))));

/* The first few fields of an FDE.  */
struct dwarf_fde
{
   uword length;
   sword CIE_delta;
   unsigned char pc_begin[];
} __attribute__ ((packed, aligned (__alignof__ (void *))));

which indicates that CIE/FDE should be aligned at the pointer size.
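
As a rough illustration of the alignment argument above, here is a minimal sketch, assuming GNU C attributes and using a hypothetical stand-in type (fde_like, with fixed-width types in place of uword/sword) rather than the real libgcc declarations.  It only shows that a packed struct carrying this aligned attribute asks for pointer alignment:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct dwarf_fde; not the libgcc type.  */
struct fde_like
{
  uint32_t length;
  int32_t CIE_delta;
  unsigned char pc_begin[];
} __attribute__ ((packed, aligned (__alignof__ (void *))));

/* Alignment the type demands, in bytes.  */
size_t
fde_required_alignment (void)
{
  return __alignof__ (struct fde_like);
}

/* Nonzero if P is suitably aligned to be read as a struct fde_like.  */
int
fde_is_aligned (const void *p)
{
  assert (p != NULL);
  return ((uintptr_t) p % fde_required_alignment ()) == 0;
}
```

If the start of the section is not aligned this way, the first entry would be accessed through a type that promises more alignment than the address provides.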

PR libgcc/27576
* crtstuff.c (__EH_FRAME_BEGIN__): Aligned to pointer size.

OK.  Though it's unclear how important this is in practice.

jeff



Re: [PATCH] PR middle-end/95126: Expand small const structs as immediate constants.

2022-06-01 Thread Jeff Law via Gcc-patches




On 2/26/2022 2:35 PM, Roger Sayle wrote:

This patch resolves PR middle-end/95126 which is a code quality regression,
by teaching the RTL expander to emit small const structs/unions as integer
immediate constants.

The motivating example from the bugzilla PR is:

struct small{ short a,b; signed char c; };
extern int func(struct small X);
void call_func(void)
{
 static struct small const s = { 1, 2, 0 };
 func(s);
}

which on x86_64 is currently compiled to:

call_func:
 movzwl  s.0+2(%rip), %eax
 movzwl  s.0(%rip), %edx
 movzwl  s.0+4(%rip), %edi
 salq $16, %rax
 orq %rdx, %rax
 salq $32, %rdi
 orq %rax, %rdi
 jmp func

but with this patch is now optimized to:

call_func:
 movl $131073, %edi
 jmp func
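
To see where the immediate 131073 comes from, here is a small sketch that performs the same shift-and-or packing as the unoptimized sequence.  The helper name pack_small is hypothetical, and the layout (a in bits 0..15, b in bits 16..31, c from bit 32) is the little-endian x86_64 one assumed by the example:

```c
#include <assert.h>
#include <stdint.h>

struct small { short a, b; signed char c; };

/* The struct must fit in a single 64-bit register for this to apply.  */
static_assert (sizeof (struct small) <= sizeof (uint64_t),
	       "struct small fits in one word");

/* Pack the fields the way the movzwl/salq/orq sequence does.  */
uint64_t
pack_small (struct small s)
{
  uint64_t lo = (uint16_t) s.a;
  uint64_t mid = (uint16_t) s.b;
  uint64_t hi = (uint8_t) s.c;
  return lo | (mid << 16) | (hi << 32);
}
```

For { 1, 2, 0 } this gives 1 + (2 << 16) = 131073, which fits in a 32-bit move.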


This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check with no new failures.  Ok for mainline?


2022-02-26  Roger Sayle  

gcc/ChangeLog
PR middle-end/95126
* calls.cc (load_register_parameters): When loading a suitable
immediate_const_ctor_p VAR_DECL into a single word_mode register,
construct it directly in a pseudo rather than read it (by parts)
from memory.
* expr.cc (int_expr_size): Make tree argument a const_tree.
(immediate_const_ctor_p): Helper predicate.  Return true for
simple constructors that may be materialized in a register.
(expand_expr_real_1) [VAR_DECL]: When expanding a constant
VAR_DECL with a suitable immediate_const_ctor_p constructor
use store_constructor to materialize it directly in a pseudo.
* expr.h (immediate_const_ctor_p): Prototype here.
* varasm.cc (initializer_constant_valid_for_bitfield_p): Change
VALUE argument from tree to const_tree.
* varasm.h (initializer_constant_valid_for_bitfield_p): Update
prototype.

gcc/testsuite/ChangeLog
PR middle-end/95126
* gcc.target/i386/pr95126-m32-1.c: New test case.
* gcc.target/i386/pr95126-m32-2.c: New test case.
* gcc.target/i386/pr95126-m32-3.c: New test case.
* gcc.target/i386/pr95126-m32-4.c: New test case.
* gcc.target/i386/pr95126-m64-1.c: New test case.
* gcc.target/i386/pr95126-m64-2.c: New test case.
* gcc.target/i386/pr95126-m64-3.c: New test case.
* gcc.target/i386/pr95126-m64-4.c: New test case.

OK after a fresh bootstrap & regression test.  Sorry for the long wait.

jeff



Re: [PATCH] configure: use OBJDUMP determined by libtool [PR95648]

2022-06-01 Thread Jeff Law via Gcc-patches




On 3/15/2022 2:59 AM, David Seifert via Gcc-patches wrote:

$ac_cv_prog_OBJDUMP contains the --host OBJDUMP that
libtool has inferred. Current config/gcc-plugin.m4 does
not respect the user's choice for OBJDUMP.

config/

 * gcc-plugin.m4: Use libtool's $ac_cv_prog_OBJDUMP.

gcc/

 * configure: Regenerate.

libcc1/

 * configure: Regenerate.

Thanks.  I've pushed this to the trunk.
jeff



Re: [PATCH] Place jump tables in RELRO only when targets require local relocation to be placed in a read-write section

2022-06-01 Thread Jeff Law via Gcc-patches




On 1/12/2022 12:20 AM, HAO CHEN GUI via Gcc-patches wrote:

Hi,
This patch sets "relocatable" of jump table to true when targets require
local relocation to be placed in a read-write section, i.e. when bit 0 is
set in reloc_rw_mask.  Jump tables use local relocations, so they should
be placed in RELRO only when both global and local relocations need to be
placed in a read-write section.  Bit 1 is always set when bit 0 is set.
Bootstrapped and tested on powerpc64-linux BE/LE and x86 with no
regressions.  Is this okay for trunk?
Any recommendations?  Thanks a lot.

ChangeLog
2022-01-12 Haochen Gui 

gcc/
* final.c (jumptable_relocatable): Set relocatable to true when
targets require local relocation to be placed in a read-write section.

It seems unwise to me to rely on the fact that bit1 is already on when
bit0 is on.  I realize that's likely just preserving existing
behavior, but unless there's a compelling reason, I'd rather do:


relocatable = (targetm.asm_out.reloc_rw_mask () & 1) != 0;

Which avoids the assumption that if bit0 is on, then bit 1 will always 
be on.
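
A standalone sketch of that check, outside of GCC: the enum names and the helper jumptable_relocatable_p are hypothetical, and the mask is passed in as a plain int instead of coming from targetm.asm_out.reloc_rw_mask ().  The point is only that bit 0 is tested on its own:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the two mask bits discussed above.  */
enum { RELOC_RW_LOCAL = 1, RELOC_RW_GLOBAL = 2 };

static_assert ((RELOC_RW_LOCAL & RELOC_RW_GLOBAL) == 0,
	       "the two bits are distinct");

/* True when local relocations must live in a read-write section.
   Deliberately ignores bit 1.  */
bool
jumptable_relocatable_p (int reloc_rw_mask)
{
  return (reloc_rw_mask & RELOC_RW_LOCAL) != 0;
}
```

With this form a mask of 1 and a mask of 3 give the same answer, so nothing depends on bit 1 accompanying bit 0.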

jeff



Re: [PATCH] Support multilib-aware target lib flags self-specs overriding

2022-06-01 Thread Hans-Peter Nilsson
On Fri, 20 May 2022, Alexandre Oliva via Gcc-patches wrote:
>
> This patch introduces -multiflags, short for multilib TFLAGS, as an
> option that does nothing by default, but that can be added to TFLAGS
> and mapped to useful options by driver self-specs.
>
> I realize -m is reserved for machine-specific flags, which this option
> sort-of isn't, but its intended use is indeed to stand for
> machine-specific flags, so it's kind of ok.  But that's just my
> official excuse, the reason I couldn't help picking it up is that it
> is a portmanteau of multi[lib] and TFLAGS.

Ooh, a bikeshedding opportunity!
Not liking the "-m"-space infringement:

-frob-multiflags?
-forward-multiflags?
or why not just
-fmultiflags?

brgds, H-P


Re: [PATCH v2, rs6000] Fix ICE on expand bcd__ [PR100736]

2022-06-01 Thread Segher Boessenkool
Hi!

On Tue, May 31, 2022 at 06:56:00PM -0500, Segher Boessenkool wrote:
> It's not clear to me how this can ever happen without finite_math_only?
> The patch is safe, sure, but it may be that the real problem is elsewhere.

So, it is incorrect that the RTL for our bcd{add,sub} insns uses CCFP at all.

CCFP stands for the result of a 4-way comparison, regular float
comparison: lt gt eq un.  But bcdadd does not have an unordered at all.
Instead, it has the result of a 3-way comparison (lt gt eq), and bit 3
is set if an overflow happened -- but still exactly one of bits 0..2 is
set then!  (If one of the inputs is an invalid number it sets bits 0..3
to 0001 though.)
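
To make the bit layout concrete, a small decoding sketch follows.  It is only an illustration of the description above, not code from the backend: the type and function names are made up, and the field is assumed to be passed as a 4-bit value with bit 0 as the most significant bit (so "0001" means only bit 3 is set):

```c
#include <assert.h>

enum bcd_result { BCD_LT, BCD_GT, BCD_EQ, BCD_INVALID };

struct bcd_status
{
  enum bcd_result cmp;
  int overflow;
};

/* Decode CR bits 0..3 held in FIELD (bit 0 is the 0x8 bit).  */
struct bcd_status
decode_bcd_cr_field (unsigned field)
{
  assert (field <= 0xf);
  struct bcd_status st;
  st.overflow = (field & 0x1) != 0;
  switch (field & 0xe)
    {
    case 0x8: st.cmp = BCD_LT; break;
    case 0x4: st.cmp = BCD_GT; break;
    case 0x2: st.cmp = BCD_EQ; break;
    default:
      /* 0001 (invalid input); other patterns are not expected and are
	 treated the same way in this sketch.  */
      st.cmp = BCD_INVALID;
      st.overflow = 0;
      break;
    }
  return st;
}
```

Note that there is no "unordered" outcome here: bit 3 qualifies the 3-way result instead of replacing it.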

So it would be much more correct and sensible to use regular integer
comparison results here, so, CC.

Does that fix the problem?


Segher


[PATCH] c++: ICE with template NEW_EXPR [PR105803]

2022-06-01 Thread Marek Polacek via Gcc-patches
Here we ICE because value_dependent_expression_p gets a NEW_EXPR
whose operand is a type, and we go to the default case which just
calls v_d_e_p on each operand of the NEW_EXPR.  Since one of them
is a type, we crash on the new assert in t_d_e_p.

t_d_e_p has code to handle {,VEC_}NEW_EXPR, which at this point
was already performed, so I think we can handle these two codes
specifically and skip the second operand, which is always going
to be a type.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

PR c++/105803

gcc/cp/ChangeLog:

* pt.cc (value_dependent_expression_p): Handle {,VEC_}NEW_EXPR
in the switch.

gcc/testsuite/ChangeLog:

* g++.dg/template/new13.C: New test.
---
 gcc/cp/pt.cc  |  8 
 gcc/testsuite/g++.dg/template/new13.C | 11 +++
 2 files changed, 19 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/template/new13.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 6de8e496859..836861e1039 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -27643,6 +27643,14 @@ value_dependent_expression_p (tree expression)
 under instantiate_non_dependent_expr; it can't be constant.  */
   return true;
 
+case NEW_EXPR:
+case VEC_NEW_EXPR:
+  /* The second operand is a type, which type_dependent_expression_p
+(and therefore value_dependent_expression_p) doesn't want to see.  */
+  return (value_dependent_expression_p (TREE_OPERAND (expression, 0))
+ || value_dependent_expression_p (TREE_OPERAND (expression, 2))
+ || value_dependent_expression_p (TREE_OPERAND (expression, 3)));
+
 default:
   /* A constant expression is value-dependent if any subexpression is
 value-dependent.  */
diff --git a/gcc/testsuite/g++.dg/template/new13.C b/gcc/testsuite/g++.dg/template/new13.C
new file mode 100644
index 000..3168374b26d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/new13.C
@@ -0,0 +1,11 @@
+// PR c++/105803
+// { dg-do compile }
+// { dg-additional-options "-fchecking=2" }
+
+namespace std {
+template  class shared_ptr;
+}
+struct S {};
+template  void build_matrices() {
+  std::shared_ptr(new S);
+}

base-commit: 2d546ff69455f7deadab65309de89d19380a8864
-- 
2.36.1



[PATCH] Simplify (B * v + C) * D -> BD* v + CD when B, C, D are all INTEGER_CST.

2022-06-01 Thread liuhongt via Gcc-patches
Similar for (v + B) * C + D -> C * v + BCD.
Don't simplify it when there's overflow and overflow is UB for type v.

There's a new failure

gcc.dg/vect/slp-11a.c scan-tree-dump-times vect "vectorizing stmts using SLP" 0

It's because the patch simplifies different operations to mult + add and
enables SLP.  So I adjusted the testcase to prevent simplification by making
the multiplication result overflow.
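
For illustration only, here is a sketch in plain C of the constant folding step for (B * v + C) * D -> BD * v + CD with the overflow check.  The helper name fold_mul_add is hypothetical, it relies on the GCC/Clang builtin __builtin_mul_overflow, and it models only the signed, non-wrapping case (for wrapping types the fold would be done regardless):

```c
#include <assert.h>

/* Compute BD = B * D and CD = C * D.  Return nonzero if both products
   fit in int, i.e. if the fold would be allowed for a type whose
   overflow is undefined.  */
int
fold_mul_add (int b, int c, int d, int *bd, int *cd)
{
  assert (bd != 0 && cd != 0);
  int ovf1 = __builtin_mul_overflow (b, d, bd);
  int ovf2 = __builtin_mul_overflow (c, d, cd);
  return !(ovf1 || ovf2);
}
```

With B = 3, C = 5, D = 2 this yields BD = 6 and CD = 10, and (3 * v + 5) * 2 == 6 * v + 10.  With 32-bit int, 51072 * 51072 = 2608349184 does not fit, so a product of two such constants is rejected, which is the effect the adjusted slp-11a.c constants aim for.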

Also, with -fwrapv, the benchmark in the PR is now 100% faster than the
original (scalar version).

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}
Ok for trunk?

gcc/ChangeLog:

PR tree-optimization/53533
* match.pd: Simplify (B * v + C) * D -> BD * v + CD and
(v + B) * C + D -> C * v + BCD when B,C,D are all INTEGER_CST,
and there's no overflow or TYPE_OVERFLOW_WRAP.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr53533-1.c: New test.
* gcc.target/i386/pr53533-2.c: New test.
* gcc.target/i386/pr53533-3.c: New test.
* gcc.target/i386/pr53533-4.c: New test.
* gcc.dg/vect/slp-11a.c: Adjust testcase.
---
 gcc/match.pd  | 36 ++
 gcc/testsuite/gcc.dg/vect/slp-11a.c   | 10 ++---
 gcc/testsuite/gcc.target/i386/pr53533-1.c | 23 
 gcc/testsuite/gcc.target/i386/pr53533-2.c | 46 +++
 gcc/testsuite/gcc.target/i386/pr53533-3.c | 24 
 gcc/testsuite/gcc.target/i386/pr53533-4.c | 46 +++
 6 files changed, 180 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr53533-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr53533-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr53533-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr53533-4.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 88c6c414881..b753f7bda3c 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -489,6 +489,42 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (if (!overflow || TYPE_OVERFLOW_WRAPS (type))
(mult @0 { wide_int_to_tree (type, mul); }))))
 
+/* Similar to above, but there could be an extra add/sub between
+   successive multiplications.  */
+(simplify
+ (mult:c (plus:c@4 (mult:c@5 @0 INTEGER_CST@1) INTEGER_CST@2) INTEGER_CST@3)
+ (if (single_use (@4)
+  && single_use (@5))
+  (with {
+wi::overflow_type overflow;
+wi::overflow_type overflow2;
+wide_int mul = wi::mul (wi::to_wide (@1), wi::to_wide (@3),
+   TYPE_SIGN (type), &overflow);
+wide_int add = wi::mul (wi::to_wide (@2), wi::to_wide (@3),
+   TYPE_SIGN (type), &overflow2);
+  }
+   /* Skip folding on overflow.  */
+   (if (!(overflow || overflow2) || TYPE_OVERFLOW_WRAPS (type))
+(plus (mult @0 { wide_int_to_tree (type, mul); })
+ { wide_int_to_tree (type, add); })))))
+
+/* Similar to above, but a multiplication between successive additions.  */
+(simplify
+ (plus:c (mult:c@4 (plus:c@5 @0 INTEGER_CST@1) INTEGER_CST@2) INTEGER_CST@3)
+ (if (single_use (@4)
+  && single_use (@5))
+  (with {
+wi::overflow_type overflow;
+wi::overflow_type overflow2;
+wide_int mul = wi::mul (wi::to_wide (@1), wi::to_wide (@2),
+   TYPE_SIGN (type), &overflow);
+wide_int add = wi::add (mul, wi::to_wide (@3),
+   TYPE_SIGN (type), &overflow2);
+  }
+   /* Skip folding on overflow.  */
+   (if (!(overflow || overflow2) || TYPE_OVERFLOW_WRAPS (type))
+    (plus (mult @0 @2) { wide_int_to_tree (type, add); })))))
+
 /* Optimize A / A to 1.0 if we don't care about
NaNs or Infinities.  */
 (simplify
diff --git a/gcc/testsuite/gcc.dg/vect/slp-11a.c b/gcc/testsuite/gcc.dg/vect/slp-11a.c
index bcd3c861ca4..e6632fa77be 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-11a.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-11a.c
@@ -9,14 +9,14 @@ int
 main1 ()
 {
   int i;
-  unsigned int out[N*8], a0, a1, a2, a3, a4, a5, a6, a7, b1, b0, b2, b3, b4, b5, b6, b7;
-  unsigned int in[N*8] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63};
+  int out[N*8], a0, a1, a2, a3, a4, a5, a6, a7, b1, b0, b2, b3, b4, b5, b6, b7;
+  int in[N*8] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63};
 
   /* Different operations - not SLPable.  */
   for (i = 0; i < N; i++)
 {
   a0 = in[i*8] + 5;
-  a1 = in[i*8 + 1] * 6;
+  a1 = in[i*8 + 1] * 51072;
   a2 = in[i*8 + 2] + 7;
   a3 = in[i*8 + 3] + 8;
   a4 = in[i*8 + 4] + 9;
@@ -25,7 +25,7 @@ main1 ()
   a7 = in[i*8 + 7] + 12;
 
   b0 = a0 * 3;
-  b1 = a1 * 2;
+  b1 = a1 * 51072;
   b2 = a2 * 12;
   b3 = a3 * 5;
   b4 = a4 * 8;
@@ -47,7 +47,7 @@ main1 ()
   for (i = 0; i < N; i++)
 {
   if (out[i*8] !=  (in[i*8] + 5) * 3 - 2
-   

Re: [PATCH] [PR/target 105666] RISC-V: Inhibit FP <--> int register moves via tune param

2022-06-01 Thread Kito Cheng via Gcc-patches
I just hesitated for a few days about backporting this, but I think
it's OK to back port because
1. Simple enough
2. Good for general RISC-V core

Committed with your latest testsuite fix.

Thanks!



On Wed, May 25, 2022 at 3:38 AM Vineet Gupta  wrote:
>
>
>
> On 5/24/22 00:59, Kito Cheng wrote:
> > Committed, thanks!
>
> Thx for the quick action Kito,
> Can this be backported to gcc 12 as well ?
>
> Thx,
> -Vineet
>
> >
> > On Tue, May 24, 2022 at 3:40 AM Philipp Tomsich
> >  wrote:
> >> Good catch!
> >>
> >> On Mon, 23 May 2022 at 20:12, Vineet Gupta  wrote:
> >>
> >>> Under extreme register pressure, compiler can use FP <--> int
> >>> moves as a cheap alternate to spilling to memory.
> >>> This was seen with SPEC2017 FP benchmark 507.cactu:
> >>> ML_BSSN_Advect.cc:ML_BSSN_Advect_Body()
> >>>
> >>> |   fmv.d.x fa5,s9  # PDupwindNthSymm2Xt1, PDupwindNthSymm2Xt1
> >>> | .LVL325:
> >>> |   ld  s9,184(sp)  # _12469, %sfp
> >>> | ...
> >>> | .LVL339:
> >>> |   fmv.x.d s4,fa5  # PDupwindNthSymm2Xt1, PDupwindNthSymm2Xt1
> >>> |
> >>>
> >>> The FMV instructions could be costlier (than stack spill) on certain
> >>> micro-architectures, thus this needs to be a per-cpu tunable
> >>> (default being to inhibit on all existing RV cpus).
> >>>
> >>> Testsuite run with new test reports 10 failures without the fix
> >>> corresponding to the build variations of pr105666.c
> >>>
> >>> |   === gcc Summary ===
> >>> |
> >>> | # of expected passes  123318   (+10)
> >>> | # of unexpected failures  34   (-10)
> >>> | # of unexpected successes 4
> >>> | # of expected failures780
> >>> | # of unresolved testcases 4
> >>> | # of unsupported tests2796
> >>>
> >>> gcc/Changelog:
> >>>
> >>>  * config/riscv/riscv.cc: (struct riscv_tune_param): Add
> >>>fmv_cost.
> >>>  (rocket_tune_info): Add default fmv_cost 8.
> >>>  (sifive_7_tune_info): Ditto.
> >>>  (thead_c906_tune_info): Ditto.
> >>>  (optimize_size_tune_info): Ditto.
> >>>  (riscv_register_move_cost): Use fmv_cost for int<->fp moves.
> >>>
> >>> gcc/testsuite/Changelog:
> >>>
> >>>  * gcc.target/riscv/pr105666.c: New test.
> >>>
> >>> Signed-off-by: Vineet Gupta 
> >>> ---
> >>>   gcc/config/riscv/riscv.cc |  9 
> >>>   gcc/testsuite/gcc.target/riscv/pr105666.c | 55 +++
> >>>   2 files changed, 64 insertions(+)
> >>>   create mode 100644 gcc/testsuite/gcc.target/riscv/pr105666.c
> >>>
> >>> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> >>> index ee756aab6940..f3ac0d8865f0 100644
> >>> --- a/gcc/config/riscv/riscv.cc
> >>> +++ b/gcc/config/riscv/riscv.cc
> >>> @@ -220,6 +220,7 @@ struct riscv_tune_param
> >>> unsigned short issue_rate;
> >>> unsigned short branch_cost;
> >>> unsigned short memory_cost;
> >>> +  unsigned short fmv_cost;
> >>> bool slow_unaligned_access;
> >>>   };
> >>>
> >>> @@ -285,6 +286,7 @@ static const struct riscv_tune_param rocket_tune_info
> >>> = {
> >>> 1,   /* issue_rate */
> >>> 3,   /* branch_cost */
> >>> 5,   /* memory_cost */
> >>> +  8,   /* fmv_cost */
> >>> true,/*
> >>> slow_unaligned_access */
> >>>   };
> >>>
> >>> @@ -298,6 +300,7 @@ static const struct riscv_tune_param
> >>> sifive_7_tune_info = {
> >>> 2,   /* issue_rate */
> >>> 4,   /* branch_cost */
> >>> 3,   /* memory_cost */
> >>> +  8,   /* fmv_cost */
> >>> true,/*
> >>> slow_unaligned_access */
> >>>   };
> >>>
> >>> @@ -311,6 +314,7 @@ static const struct riscv_tune_param
> >>> thead_c906_tune_info = {
> >>> 1,/* issue_rate */
> >>> 3,/* branch_cost */
> >>> 5,/* memory_cost */
> >>> +  8,   /* fmv_cost */
> >>> false,/* slow_unaligned_access */
> >>>   };
> >>>
> >>> @@ -324,6 +328,7 @@ static const struct riscv_tune_param
> >>> optimize_size_tune_info = {
> >>> 1,   /* issue_rate */
> >>> 1,   /* branch_cost */
> >>> 2,   /* memory_cost */
> >>> +  8,   /* fmv_cost */
> >>> false,   /* slow_unaligned_access 
> >>> */
> >>>   };
> >>>
> >>> @@ -4737,6 +4742,10 @@ static int
> >>>   riscv_register_move_cost (machine_mode mode,
> >>>reg_class_t from, reg_class_t to)
> >>>   {
> >>> +  if ((from == FP_REGS &&

[PATCH] libgccjit: Fix bug where unary_op will return an integer type instead of the correct type

2022-06-01 Thread Antoni Boucher via Gcc-patches
Hi.
The attached patch fixes bug 105812:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105812

I'm having an issue where contrib/check_GNU_style.sh doesn't seem to
work, i.e. it doesn't seem to do any checking.
Is there a new way to do that or am I missing something?

Thanks for the review.
From ef20b0a18e4978aac9eb77b91898356c67f6a0e4 Mon Sep 17 00:00:00 2001
From: Antoni Boucher 
Date: Wed, 1 Jun 2022 22:07:07 -0400
Subject: [PATCH] libgccjit: Fix bug where unary_op will return an integer type
 instead of the correct type

2022-06-01  Antoni Boucher  

gcc/jit/
	PR target/105812
	* jit-playback.cc: Use the correct return when folding in
	as_truth_value.

gcc/testsuite/
	PR target/105812
	* jit.dg/test-pr105812-bool-operations.c: New test.
---
 gcc/jit/jit-playback.cc   |  3 +-
 .../jit.dg/test-pr105812-bool-operations.c| 89 +++
 2 files changed, 91 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/jit.dg/test-pr105812-bool-operations.c

diff --git a/gcc/jit/jit-playback.cc b/gcc/jit/jit-playback.cc
index 6be6bdf8dea..c08cba58743 100644
--- a/gcc/jit/jit-playback.cc
+++ b/gcc/jit/jit-playback.cc
@@ -1025,8 +1025,9 @@ as_truth_value (tree expr, location *loc)
   if (loc)
 set_tree_location (typed_zero, loc);
 
+  tree type = TREE_TYPE (expr);
   expr = fold_build2_loc (UNKNOWN_LOCATION,
-NE_EXPR, integer_type_node, expr, typed_zero);
+NE_EXPR, type, expr, typed_zero);
   if (loc)
 set_tree_location (expr, loc);
 
diff --git a/gcc/testsuite/jit.dg/test-pr105812-bool-operations.c b/gcc/testsuite/jit.dg/test-pr105812-bool-operations.c
new file mode 100644
index 000..1daa1c3c35a
--- /dev/null
+++ b/gcc/testsuite/jit.dg/test-pr105812-bool-operations.c
@@ -0,0 +1,89 @@
+#include "libgccjit.h"
+
+#include "harness.h"
+
+void
+create_code (gcc_jit_context *ctxt, void *user_data)
+{
+  gcc_jit_type* bool_type =
+gcc_jit_context_get_type (ctxt, GCC_JIT_TYPE_BOOL);
+  gcc_jit_type* bool_ptr_type =
+gcc_jit_type_get_pointer (gcc_jit_type_get_aligned (bool_type, 1));
+
+  /* Function 1 */
+
+  gcc_jit_param* param1 = gcc_jit_context_new_param (ctxt, NULL, bool_type,
+		 "param1");
+  gcc_jit_function* function1 =
+gcc_jit_context_new_function (ctxt, NULL,
+  GCC_JIT_FUNCTION_EXPORTED, bool_type,
+  "function1", 1, &param1, 0);
+  gcc_jit_block* block1 = gcc_jit_function_new_block (function1, "start1");
+
+  gcc_jit_lvalue* var1 =
+gcc_jit_function_new_local (function1, NULL, bool_type, "var1");
+  gcc_jit_rvalue* addr1 =
+gcc_jit_lvalue_get_address (var1, NULL);
+  gcc_jit_rvalue* ptr1 =
+gcc_jit_context_new_cast (ctxt, NULL, addr1, bool_ptr_type);
+  gcc_jit_lvalue* deref1 =
+gcc_jit_rvalue_dereference (ptr1, NULL);
+  gcc_jit_rvalue* param1_rvalue =
+gcc_jit_param_as_rvalue (param1);
+  gcc_jit_block_add_assignment (block1, NULL, deref1, param1_rvalue);
+
+  gcc_jit_rvalue* one = gcc_jit_context_one (ctxt, bool_type);
+  gcc_jit_block_end_with_return (block1, NULL, one);
+
+  /* Function 2 */
+
+  gcc_jit_param* param2 = gcc_jit_context_new_param (ctxt, NULL, bool_type,
+		 "param2");
+  gcc_jit_function* function2 =
+gcc_jit_context_new_function (ctxt, NULL,
+  GCC_JIT_FUNCTION_EXPORTED, bool_type,
+  "function2", 1, &param2, 0);
+  gcc_jit_block* block2 = gcc_jit_function_new_block (function2, "start2");
+
+  gcc_jit_lvalue* var2 =
+gcc_jit_function_new_local (function2, NULL, bool_type, "var2");
+  gcc_jit_rvalue* addr2 =
+gcc_jit_lvalue_get_address (var2, NULL);
+  gcc_jit_rvalue* ptr2 =
+gcc_jit_context_new_cast (ctxt, NULL, addr2, bool_ptr_type);
+  gcc_jit_lvalue* deref2 =
+gcc_jit_rvalue_dereference (ptr2, NULL);
+  gcc_jit_rvalue* param2_rvalue =
+gcc_jit_param_as_rvalue (param2);
+  gcc_jit_block_add_assignment (block2, NULL, deref2, param2_rvalue);
+
+  gcc_jit_lvalue* return_value =
+gcc_jit_function_new_local (function2, NULL, bool_type, "return_value");
+  gcc_jit_rvalue* call =
+gcc_jit_context_new_call (ctxt, NULL, function1, 1, &param2_rvalue);
+  gcc_jit_block_add_assignment (block2, NULL, return_value, call);
+
+  gcc_jit_block* block2_1 =
+gcc_jit_function_new_block (function2, "end2");
+  gcc_jit_block_end_with_jump (block2, NULL, block2_1);
+
+  gcc_jit_rvalue* value =
+gcc_jit_context_new_unary_op (ctxt, NULL,
+  GCC_JIT_UNARY_OP_LOGICAL_NEGATE, bool_type,
+  param2_rvalue);
+  gcc_jit_rvalue* return_rvalue =
+gcc_jit_lvalue_as_rvalue (return_value);
+  gcc_jit_rvalue* and =
+gcc_jit_context_new_binary_op (ctxt, NULL,
+   GCC_JIT_BINARY_OP_BITWISE_AND, bool_type,
+   return_rvalue, value);
+
+  gcc_jit_block_end_with_return (block2_1, NULL, and);
+}
+
+extern void
+verify_code (gcc_jit_context *ctxt, gcc_jit_result *result)
+{
+  /* Verify that no errors were emitted.  */
+  CHECK_NON_NULL (result);
+}
-- 
2.26.2.7.g19db9cfb68.dirty



Re: [PATCH] libgccjit: Fix bug where unary_op will return an integer type instead of the correct type

2022-06-01 Thread Antoni Boucher via Gcc-patches
Also, the test gcc/testsuite/jit.dg/test-asm.cc fails and would need
this line:

#include 

Is this okay if I add it in this patch?

On Wed, 2022-06-01 at 22:13 -0400, Antoni Boucher wrote:
> Hi.
> The attached patch fixes bug 105812:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105812
> 
> I'm having an issue where contrib/check_GNU_style.sh doesn't seem to
> work, i.e. it doesn't seem to do any checking.
> Is there a new way to do that or am I missing something?
> 
> Thanks for the review.



Re: [PATCH v2, rs6000] Fix ICE on expand bcd__ [PR100736]

2022-06-01 Thread HAO CHEN GUI via Gcc-patches
Segher,
  Does a BCD comparison return false when either operand has an invalid
encoding?  If yes, the result could be 3-way, and we can check the gt and
eq bits for ge.  We still can't use crnot to check only the lt bit, as
there could be an invalid encoding.
  Also, do you think finite-math-only excludes invalid encodings?  It seems
GCC doesn't clearly define it.

Thanks.
Gui Haochen


On 2/6/2022 6:05 AM, Segher Boessenkool wrote:
> Hi!
> 
> On Tue, May 31, 2022 at 06:56:00PM -0500, Segher Boessenkool wrote:
>> It's not clear to me how this can ever happen without finite_math_only?
>> The patch is safe, sure, but it may be that the real problem is elsewhere.
> 
> So, it is incorrect that the RTL for our bcd{add,sub} insns uses CCFP at all.
> 
> CCFP stands for the result of a 4-way comparison, regular float
> comparison: lt gt eq un.  But bcdadd does not have an unordered at all.
> Instead, it has the result of a 3-way comparison (lt gt eq), and bit 3
> is set if an overflow happened -- but still exactly one of bits 0..2 is
> set then!  (If one of the inputs is an invalid number it sets bits 0..3
> to 0001 though.)
> 
> So it would be much more correct and sensible to use regular integer
> comparison results here, so, CC.
> 
> Does that fix the problem?
> 
> 
> Segher


[x86_64 PATCH] PR target/105791: Add V1TI to V_128_256 for xop_pcmov_v1ti.

2022-06-01 Thread Roger Sayle

This patch resolves PR target/105791 which is a regression that was
accidentally introduced by my workaround to PR tree-optimization/10566
(a deeper problem in GCC's vectorizer creating VEC_COND_EXPR when it
shouldn't).  The latest issue is that by providing a vcond_mask_v1tiv1ti
pattern in sse.md, the backend now calls ix86_expand_sse_movcc with
V1TImode operands, which has a special case for TARGET_XOP to generate
a vpcmov instruction.  Unfortunately, there wasn't previously a V1TImode
variant, xop_pcmov_v1ti, so we'd ICE.

This is easily fixed by adding V1TImode (and V2TImode) to V_128_256
which is only used for defining XOP's vpcmov instruction.  This in turn
requires V1TI (and V2TI) to be supported by <avxsizesuffix> (though
the use of <avxsizesuffix> in the names xop_pcmov_<mode><avxsizesuffix>
seems unnecessary; the mode makes the name unique).
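
For readers unfamiliar with vpcmov: as I understand it, it is a bitwise select, taking each result bit from one of two sources according to a selector.  The following is a plain C model of that operation on a single 128-bit element, with made-up names (struct v1ti, bit_select_v1ti) and an operand order chosen for readability rather than to mirror the instruction encoding:

```c
#include <assert.h>
#include <stdint.h>

/* Model of one 128-bit element as two 64-bit halves.  */
struct v1ti
{
  uint64_t lo, hi;
};

static_assert (sizeof (struct v1ti) == 16, "one 128-bit element");

/* For each bit: take IF_SET where SEL is 1, IF_CLEAR where SEL is 0.  */
struct v1ti
bit_select_v1ti (struct v1ti if_set, struct v1ti if_clear, struct v1ti sel)
{
  struct v1ti r;
  r.lo = (if_set.lo & sel.lo) | (if_clear.lo & ~sel.lo);
  r.hi = (if_set.hi & sel.hi) | (if_clear.hi & ~sel.hi);
  return r;
}
```

Since the operation is purely bitwise, the element width does not change the result, which is why a V1TI (or V2TI) variant needs no new logic beyond the mode iterator entry.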

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32},
with no new failures.  Ok for mainline?


2022-06-02  Roger Sayle  

gcc/ChangeLog
PR target/105791
* config/i386/sse.md (V_128_256): Add V1TI and V2TI.
(define_mode_attr avxsizesuffix): Add support for V1TI and V2TI.

gcc/testsuite/ChangeLog
PR target/105791
* gcc.target/i386/pr105791.c: New test case.


Thanks in advance. Sorry for the inconvenience/breakage.
Roger
--

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index c2e046e8..8b3163f 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -301,7 +301,8 @@
 
 ;; All 128bit and 256bit vector modes
 (define_mode_iterator V_128_256
-  [V32QI V16QI V16HI V8HI V8SI V4SI V4DI V2DI V16HF V8HF V8SF V4SF V4DF V2DF])
+  [V32QI V16QI V16HI V8HI V8SI V4SI V4DI V2DI V2TI V1TI
+   V16HF V8HF V8SF V4SF V4DF V2DF])
 
 ;; All 512bit vector modes
 (define_mode_iterator V_512 [V64QI V32HI V16SI V8DI V16SF V8DF])
@@ -897,9 +898,9 @@
(V8HI "sse4_1") (V16HI "avx")])
 
 (define_mode_attr avxsizesuffix
-  [(V64QI "512") (V32HI "512") (V16SI "512") (V8DI "512")
-   (V32QI "256") (V16HI "256") (V8SI "256") (V4DI "256")
-   (V16QI "") (V8HI "") (V4SI "") (V2DI "")
+  [(V64QI "512") (V32HI "512") (V16SI "512") (V8DI "512") (V4TI "512")
+   (V32QI "256") (V16HI "256") (V8SI "256") (V4DI "256") (V2TI "256")
+   (V16QI "") (V8HI "") (V4SI "") (V2DI "") (V1TI "")
(V32HF "512") (V16SF "512") (V8DF "512")
(V16HF "256") (V8SF "256") (V4DF "256")
(V8HF "") (V4SF "") (V2DF "")])
diff --git a/gcc/testsuite/gcc.target/i386/pr105791.c b/gcc/testsuite/gcc.target/i386/pr105791.c
new file mode 100644
index 000..55e278b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr105791.c
@@ -0,0 +1,13 @@
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-O2 -mxop" } */
+typedef __int128 __attribute__((__vector_size__ (sizeof (__int128)))) U;
+typedef int __attribute__((__vector_size__ (sizeof (int)))) V;
+
+U u;
+V v;
+
+U
+foo (void)
+{
+  return (0 != __builtin_convertvector (v, U)) <= (0 != u);
+}


[PATCH] tree-optimization/105802 - another unswitching type issue

2022-06-01 Thread Richard Biener via Gcc-patches
This also fixes the type of the irange used for unswitching of
switch statements.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/105802
* tree-ssa-loop-unswitch.cc (find_unswitching_predicates_for_bb):
Make sure to also compute the range in the type of the switch index.

* g++.dg/opt/pr105802.C: New testcase.
---
 gcc/testsuite/g++.dg/opt/pr105802.C | 23 +++
 gcc/tree-ssa-loop-unswitch.cc   | 17 +++--
 2 files changed, 30 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/opt/pr105802.C

diff --git a/gcc/testsuite/g++.dg/opt/pr105802.C b/gcc/testsuite/g++.dg/opt/pr105802.C
new file mode 100644
index 000..2514245d00a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/opt/pr105802.C
@@ -0,0 +1,23 @@
+// { dg-do compile }
+// { dg-options "-O3" }
+
+enum E { E0, E1 };
+
+void bar ();
+void baz ();
+
+int c;
+
+void
+foo (int i)
+{
+  E e = (E) i;
+  while (c)
+switch (e)
+  {
+  case E0:
+bar ();
+  case E1:
+baz ();
+  }
+}
diff --git a/gcc/tree-ssa-loop-unswitch.cc b/gcc/tree-ssa-loop-unswitch.cc
index 2b6013e9d69..61c04ed9f2e 100644
--- a/gcc/tree-ssa-loop-unswitch.cc
+++ b/gcc/tree-ssa-loop-unswitch.cc
@@ -523,22 +523,19 @@ find_unswitching_predicates_for_bb (basic_block bb, class loop *loop,
  tree lab = gimple_switch_label (stmt, i);
  tree cmp;
  int_range<2> lab_range;
+ tree low = fold_convert (idx_type, CASE_LOW (lab));
  if (CASE_HIGH (lab) != NULL_TREE)
{
- tree cmp1 = fold_build2 (GE_EXPR, boolean_type_node, idx,
-  fold_convert (idx_type,
-CASE_LOW (lab)));
- tree cmp2 = fold_build2 (LE_EXPR, boolean_type_node, idx,
-  fold_convert (idx_type,
-CASE_HIGH (lab)));
+ tree high = fold_convert (idx_type, CASE_HIGH (lab));
+ tree cmp1 = fold_build2 (GE_EXPR, boolean_type_node, idx, low);
+ tree cmp2 = fold_build2 (LE_EXPR, boolean_type_node, idx, high);
  cmp = fold_build2 (BIT_AND_EXPR, boolean_type_node, cmp1, cmp2);
- lab_range.set (CASE_LOW (lab), CASE_HIGH (lab));
+ lab_range.set (low, high);
}
  else
{
- cmp = fold_build2 (EQ_EXPR, boolean_type_node, idx,
-fold_convert (idx_type, CASE_LOW (lab)));
- lab_range.set (CASE_LOW (lab));
+ cmp = fold_build2 (EQ_EXPR, boolean_type_node, idx, low);
+ lab_range.set (low);
}
 
  /* Combine the expression with the existing one.  */
-- 
2.35.3


Re: [x86_64 PATCH] PR target/105791: Add V1TI to V_128_256 for xop_pcmov_v1ti.

2022-06-01 Thread Hongtao Liu via Gcc-patches
On Thu, Jun 2, 2022 at 2:24 PM Roger Sayle  wrote:
>
>
> This patch resolves PR target/105791 which is a regression that was
> accidentally introduced by my workaround to PR tree-optimization/10566
> (a deeper problem in GCC's vectorizer creating VEC_COND_EXPR when it
> shouldn't).  The latest issue is that by providing a vcond_mask_v1tiv1ti
> pattern in sse.md, the backend now calls ix86_expand_sse_movcc with
> V1TImode operands, which has a special case for TARGET_XOP to generate
> a vpcmov instruction.  Unfortunately, there wasn't previously a V1TImode
> variant, xop_pcmov_v1ti, so we'd ICE.
>
> This is easily fixed by adding V1TImode (and V2TImode) to V_128_256
> which is only used for defining XOP's vpcmov instruction.  This in turn
> requires V1TI (and V2TI) to be supported by <avxsizesuffix> (though
> the use of <avxsizesuffix> in the names xop_pcmov_<mode><avxsizesuffix>
> seems unnecessary; the mode makes the name unique).
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32},
> with no new failures.  Ok for mainline?
LGTM.
>
>
> 2022-06-02  Roger Sayle  
>
> gcc/ChangeLog
> PR target/105791
> * config/i386/sse.md (V_128_256): Add V1TI and V2TI.
> (define_mode_attr avxsizesuffix): Add support for V1TI and V2TI.
>
> gcc/testsuite/ChangeLog
> PR target/105791
> * gcc.target/i386/pr105791.c: New test case.
>
>
> Thanks in advance. Sorry for the inconvenience/breakage.
> Roger
> --
>


-- 
BR,
Hongtao