[PATCH] lto/106334 - relax assert during WPA tree merging

2022-07-19 Thread Richard Biener via Gcc-patches
The dwarf2out map of tree to symbol + offset is populated too early
when streaming in trees so that when WPA tree merging decides to
recycle them the mapping prevails and if we are unlucky the same
address is used for another tree with a symbol + offset DIE to
record.  The following mitigates the resulting ICE by relaxing the
assert, allowing re-use of a slot during WPA.  Delaying the register
would be better but it's already somewhat hairy and uglifying this
further doesn't look too important right now.

Bootstrap / regtest running on x86_64-unknown-linux-gnu.

* dwarf2out.cc (dwarf2out_register_external_die): Allow
map entry re-use during WPA.
---
 gcc/dwarf2out.cc | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/gcc/dwarf2out.cc b/gcc/dwarf2out.cc
index e3920c898f5..3ac39c1a5b0 100644
--- a/gcc/dwarf2out.cc
+++ b/gcc/dwarf2out.cc
@@ -6069,7 +6069,11 @@ dwarf2out_register_external_die (tree decl, const char 
*sym,
 
   if (!external_die_map)
 external_die_map = hash_map::create_ggc (1000);
-  gcc_checking_assert (!external_die_map->get (decl));
+  /* When we do tree merging during WPA we can end up re-using GC memory
+ as there's currently no way to unregister external DIEs.  Ideally
+ we'd register them only after merging finished but allowing override
+ here is easiest.  See PR106334.  */
+  gcc_checking_assert (flag_wpa || !external_die_map->get (decl));
   sym_off_pair p = { IDENTIFIER_POINTER (get_identifier (sym)), off };
   external_die_map->put (decl, p);
 }
-- 
2.35.3


[PATCH] middle-end/106331 - fix mem attributes for string op arguments

2022-07-19 Thread Richard Biener via Gcc-patches
get_memory_rtx tries hard to come up with a MEM_EXPR to record
in the memory attributes but in the last fallback fails to properly
account for an unknown offset and thus, as visible in this testcase,
incorrect alignment computed from set_mem_attributes.  The following
rectifies both parts.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

PR middle-end/106331
* builtins.cc (get_memory_rtx): Compute alignment from
the original address and set MEM_OFFSET to unknown when
we create a MEM_EXPR from the base object of the address.

* gfortran.dg/pr106331.f90: New testcase.
---
 gcc/builtins.cc| 13 +
 gcc/testsuite/gfortran.dg/pr106331.f90 |  7 +++
 2 files changed, 16 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/pr106331.f90

diff --git a/gcc/builtins.cc b/gcc/builtins.cc
index 35b9197945f..bc05a40bfe7 100644
--- a/gcc/builtins.cc
+++ b/gcc/builtins.cc
@@ -1360,7 +1360,7 @@ expand_builtin_prefetch (tree exp)
 rtx
 get_memory_rtx (tree exp, tree len)
 {
-  tree orig_exp = exp;
+  tree orig_exp = exp, base;
   rtx addr, mem;
 
   /* When EXP is not resolved SAVE_EXPR, MEM_ATTRS can be still derived
@@ -1391,10 +1391,11 @@ get_memory_rtx (tree exp, tree len)
   if (is_gimple_mem_ref_addr (TREE_OPERAND (exp, 0)))
 set_mem_attributes (mem, exp, 0);
   else if (TREE_CODE (TREE_OPERAND (exp, 0)) == ADDR_EXPR
-  && (exp = get_base_address (TREE_OPERAND (TREE_OPERAND (exp, 0),
-0
+  && (base = get_base_address (TREE_OPERAND (TREE_OPERAND (exp, 0),
+ 0
 {
-  exp = build_fold_addr_expr (exp);
+  unsigned int align = get_pointer_alignment (TREE_OPERAND (exp, 0));
+  exp = build_fold_addr_expr (base);
   exp = fold_build2 (MEM_REF,
 build_array_type (char_type_node,
   build_range_type (sizetype,
@@ -1402,6 +1403,10 @@ get_memory_rtx (tree exp, tree len)
 NULL)),
 exp, build_int_cst (ptr_type_node, 0));
   set_mem_attributes (mem, exp, 0);
+  /* Since we stripped parts make sure the offset is unknown and the
+alignment is computed from the original address.  */
+  clear_mem_offset (mem);
+  set_mem_align (mem, align);
 }
   set_mem_alias_set (mem, 0);
   return mem;
diff --git a/gcc/testsuite/gfortran.dg/pr106331.f90 
b/gcc/testsuite/gfortran.dg/pr106331.f90
new file mode 100644
index 000..3873863be48
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr106331.f90
@@ -0,0 +1,7 @@
+! { dg-do run }
+! { dg-options "-Og" }
+
+PROGRAM main
+  CHARACTER(LEN=24) :: a(2)
+  a = ''
+END PROGRAM
-- 
2.35.3


Re: [PATCH 2/3] tree-cfg: do not duplicate returns_twice calls

2022-07-19 Thread Richard Biener via Gcc-patches
On Thu, Jul 14, 2022 at 10:12 PM Alexander Monakov  wrote:
>
>
> On Thu, 14 Jul 2022, Richard Biener wrote:
>
> > Indeed.  Guess that's what __builtin_setjmp[_receiver] for SJLJ_EH got 
> > "right".
> >
> > When copying a block we do not copy labels so any "jumps" remain to the 
> > original
> > block and thus we are indeed able to isolate normal control flow.  Given 
> > that
> > returns_twice functions _do_ seem to be special, and also special as to how
> > we handle other abnormal receivers in duplicate_block.
>
> We do? Sorry, I don't see what you mean here, can you point me to specific 
> lines?

I'm referring to

  stmt = gsi_stmt (gsi);
  if (gimple_code (stmt) == GIMPLE_LABEL)
continue;

but indeed, looking again we do _not_ skip a __builtin_setjmp_receiver
(but I don't
exactly remember the CFG setup with SJLJ EH and setjmp_{receiver,setup}.

> > So it might indeed make sense to special-case them in can_duplicate_block_p
> > ... (sorry for going back-and-forth ...)
> >
> > Note that I think this detail of duplicate_block (the function) and the hook
> > needs to be better documented (the semantics on incoming edges, not 
> > duplicating
> > labels used for incoming control flow).
> >
> > Can you see as to how to adjust the RTL side for this?  It looks like at 
> > least
> > some places set a REG_SETJMP note on call_insns (emit_call_1), I wonder
> > if in rtl_verify_flow_info[_1] (or its callees) we can check that such
> > calls come
> > first ... they might not since IIRC we do _not_ preserve abnormal edges when
> > expanding RTL (there's some existing bug about this and how this breaks some
> > setjmp tests) (but we try to recompute them?).
>
> No, we emit arguments/return value handling before/after a REG_SETJMP call,
> and yeah, we don't always properly recompute abnormal edges, so improving
> RTL in this respect seems hopeless.

:/  (but yes, nobody got to fixing PR57067 in the last 10 years)

> For example, it is easy enough to create
> a testcase where bb-reordering duplicates a returns_twice call, although it
> runs so late that perhaps later passes don't care:
>
> // gcc -O2 --param=max-grow-copy-bb-insns=100
> __attribute__((returns_twice))
> int rtwice(int);
> int g1(int), g2(int);
> void f(int i)
> {
>   do {
> i = i%2 ? g1(i) : g2(i);
>   } while (i = rtwice(i));
> }
>
> FWIW, I also investigated https://gcc.gnu.org/PR101347
>
> > Sorry about the back-and-forth again ... your original patch looks OK for 
> > the
> > GIMPLE side but can you amend the cfghooks.{cc,h} documentation to
> > summarize our findings and
> > the desired semantics of duplicate_block in this respect?
>
> Like below?

Yes.

Thanks and sorry for the back and forth - this _is_ a mightly
complicated area ...

Richard.

> ---8<---
>
> Subject: [PATCH v3] tree-cfg: do not duplicate returns_twice calls
>
> A returns_twice call may have associated abnormal edges that correspond
> to the "second return" from the call. If the call is duplicated, the
> copies of those edges also need to be abnormal, but e.g. tracer does not
> enforce that. Just prohibit the (unlikely to be useful) duplication.
>
> gcc/ChangeLog:
>
> * cfghooks.cc (duplicate_block): Expand comment.
> * tree-cfg.cc (gimple_can_duplicate_bb_p): Reject blocks with
> calls that may return twice.
> ---
>  gcc/cfghooks.cc | 13 ++---
>  gcc/tree-cfg.cc |  7 +--
>  2 files changed, 15 insertions(+), 5 deletions(-)
>
> diff --git a/gcc/cfghooks.cc b/gcc/cfghooks.cc
> index e435891fa..c6ac9532c 100644
> --- a/gcc/cfghooks.cc
> +++ b/gcc/cfghooks.cc
> @@ -1086,9 +1086,16 @@ can_duplicate_block_p (const_basic_block bb)
>return cfg_hooks->can_duplicate_block_p (bb);
>  }
>
> -/* Duplicates basic block BB and redirects edge E to it.  Returns the
> -   new basic block.  The new basic block is placed after the basic block
> -   AFTER.  */
> +/* Duplicate basic block BB, place it after AFTER (if non-null) and redirect
> +   edge E to it (if non-null).  Return the new basic block.
> +
> +   If BB contains a returns_twice call, the caller is responsible for 
> recreating
> +   incoming abnormal edges corresponding to the "second return" for the copy.
> +   gimple_can_duplicate_bb_p rejects such blocks, while RTL likes to live
> +   dangerously.
> +
> +   If BB has incoming abnormal edges for some other reason, their 
> destinations
> +   should be tied to label(s) of the original BB and not the copy.  */
>
>  basic_block
>  duplicate_block (basic_block bb, edge e, basic_block after, copy_bb_data *id)
> diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
> index f846dc2d8..5bcf78198 100644
> --- a/gcc/tree-cfg.cc
> +++ b/gcc/tree-cfg.cc
> @@ -6346,12 +6346,15 @@ gimple_can_duplicate_bb_p (const_basic_block bb)
>  {
>gimple *g = gsi_stmt (gsi);
>
> -  /* An IFN_GOMP_SIMT_ENTER_ALLOC/IFN_GOMP_SIMT_EXIT call must be
> +  /* Prohibit duplication of returns_twice calls, otherwise associated
> +abnormal 

Re: [PATCH V2] [RFC]Support vectorization for Complex type.

2022-07-19 Thread Richard Biener via Gcc-patches
On Mon, Jul 18, 2022 at 4:31 AM liuhongt  wrote:
>
> V2 update:
>Handle VMAT_ELEMENTWISE, VMAT_CONTIGUOUS_PERMUTE, VMAT_STRIDED_SLP,
>VMAT_CONTIGUOUS_REVERSE, VMAT_CONTIGUOUS_DOWN for complex type.
>
> I've run SPECspeed@2017 627.cam4_s, there's some vectorization cases,
> but no big performance impact(since this patch only handle load/store).
>
> Any comments?

My original comments still stand (it feels like this should be more generic).
Can we go the way lowering complex loads/stores first?  A large part
of the testcases
added by the patch should pass after that.

Thanks,
Richard.

> gcc/ChangeLog:
>
> PR tree-optimization/106010
> * tree-vect-data-refs.cc (vect_get_data_access_cost):
> Pass complex_p to vect_get_num_copies to avoid ICE.
> (vect_analyze_data_refs): Support vectorization for Complex
> type with vector scalar types.
> (vect_permute_load_chain): Handle Complex type.
> * tree-vect-loop.cc (vect_determine_vf_for_stmt_1): VF should
> be half of TYPE_VECTOR_SUBPARTS when complex_p.
> * tree-vect-slp.cc (vect_record_max_nunits): nunits should be
> half of TYPE_VECTOR_SUBPARTS when complex_p.
> (vect_optimize_slp): Support permutation for complex type.
> (vect_slp_analyze_node_operations_1): Double nunits in
> vect_get_num_vectors to get right SLP_TREE_NUMBER_OF_VEC_STMTS
> when complex_p.
> (vect_slp_analyze_node_operations): Ditto.
> (vect_create_constant_vectors): Support CTOR for complex type.
> (vect_transform_slp_perm_load): Support permutation for
> complex type.
> * tree-vect-stmts.cc (vect_init_vector): Support complex type.
> (vect_get_vec_defs_for_operand): Get vector type for
> complex type.
> (vectorizable_store): Get right ncopies/nunits and
> elem_type for complex type vector, also return false when
> complex_p and !TYPE_VECTOR_SUBPARTS.is_constant ().
> (vect_truncate_gather_scatter_offset): Return false for
> complex type.
> (vectorizable_load): Ditto.
> (vect_get_vector_types_for_stmt): Get vector type for
> complex type.
> (get_group_load_store_type): Hanlde complex type for
> nunits.
> (perm_mask_for_reverse): New overload.
> (get_negative_load_store_type): Handle complex type,
> p_offset should be N - 2 beofre addres of DR.
> (vect_check_scalar_mask): Return false for complex type.
> * tree-vectorizer.h (STMT_VINFO_COMPLEX_P): New macro.
> (vect_get_num_copies): New overload.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/pr106010-1a.c: New test.
> * gcc.target/i386/pr106010-1b.c: New test.
> * gcc.target/i386/pr106010-1c.c: New test.
> * gcc.target/i386/pr106010-2a.c: New test.
> * gcc.target/i386/pr106010-2b.c: New test.
> * gcc.target/i386/pr106010-2c.c: New test.
> * gcc.target/i386/pr106010-3a.c: New test.
> * gcc.target/i386/pr106010-3b.c: New test.
> * gcc.target/i386/pr106010-3c.c: New test.
> * gcc.target/i386/pr106010-4a.c: New test.
> * gcc.target/i386/pr106010-4b.c: New test.
> * gcc.target/i386/pr106010-4c.c: New test.
> * gcc.target/i386/pr106010-5a.c: New test.
> * gcc.target/i386/pr106010-5b.c: New test.
> * gcc.target/i386/pr106010-5c.c: New test.
> * gcc.target/i386/pr106010-6a.c: New test.
> * gcc.target/i386/pr106010-6b.c: New test.
> * gcc.target/i386/pr106010-6c.c: New test.
> * gcc.target/i386/pr106010-7a.c: New test.
> * gcc.target/i386/pr106010-7b.c: New test.
> * gcc.target/i386/pr106010-7c.c: New test.
> * gcc.target/i386/pr106010-8a.c: New test.
> * gcc.target/i386/pr106010-8b.c: New test.
> * gcc.target/i386/pr106010-8c.c: New test.
> * gcc.target/i386/pr106010-9a.c: New test.
> * gcc.target/i386/pr106010-9b.c: New test.
> * gcc.target/i386/pr106010-9c.c: New test.
> * gcc.target/i386/pr106010-9d.c: New test.
> ---
>  gcc/testsuite/gcc.target/i386/pr106010-1a.c |  58 +
>  gcc/testsuite/gcc.target/i386/pr106010-1b.c |  63 ++
>  gcc/testsuite/gcc.target/i386/pr106010-1c.c |  41 
>  gcc/testsuite/gcc.target/i386/pr106010-2a.c |  82 +++
>  gcc/testsuite/gcc.target/i386/pr106010-2b.c |  62 ++
>  gcc/testsuite/gcc.target/i386/pr106010-2c.c |  47 
>  gcc/testsuite/gcc.target/i386/pr106010-3a.c |  80 +++
>  gcc/testsuite/gcc.target/i386/pr106010-3b.c | 126 +++
>  gcc/testsuite/gcc.target/i386/pr106010-3c.c |  69 ++
>  gcc/testsuite/gcc.target/i386/pr106010-4a.c | 101 +
>  gcc/testsuite/gcc.target/i386/pr106010-4b.c |  67 ++
>  gcc/testsuite/gcc.target/i386/pr106010-4c.c |  54 +
>  gcc/testsuite/gcc.target/i386/pr106010-5a.c | 117 ++
>  gcc/testsuite/gcc.target/i386/pr106

Re: [PATCH] Fortran: error recovery on invalid array reference of non-array [PR103590]

2022-07-19 Thread Mikael Morin

Hello,

the principle looks good, but...

Le 18/07/2022 à 22:43, Harald Anlauf via Fortran a écrit :


diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc
index 2ebf076f730..dacd33561d0 100644
--- a/gcc/fortran/resolve.cc
+++ b/gcc/fortran/resolve.cc
@@ -5004,7 +5004,11 @@ find_array_spec (gfc_expr *e)
   {
   case REF_ARRAY:
if (as == NULL)
- gfc_internal_error ("find_array_spec(): Missing spec");
+ {
+   gfc_error ("Symbol %qs at %L has not been declared as an array",
+  e->symtree->n.sym->name, &e->where);


... the error here only makes sense if the array reference follows a 
variable reference.  If it follows a derived type component reference, a 
slightly different error message would be more appropriate.


Re: [PATCH V2] Extend 16/32-bit vector bit_op patterns with (m, 0, i) alternative.

2022-07-19 Thread Uros Bizjak via Gcc-patches
On Tue, Jul 19, 2022 at 8:56 AM Hongtao Liu  wrote:
>
> On Tue, Jul 19, 2022 at 2:35 PM Uros Bizjak via Gcc-patches
>  wrote:
> >
> > On Tue, Jul 19, 2022 at 8:07 AM liuhongt  wrote:
> > >
> > > And split it after reload.
> > >
> > > > You will need ix86_binary_operator_ok insn constraint here with
> > > > corresponding expander using ix86_fixup_binary_operands_no_copy to
> > > > prepare insn operands.
> > > Split define_expand with just register_operand, and allow
> > > memory/immediate in define_insn, assume combine/forwprop will do 
> > > optimization.
> >
> > But you will *ease* the job of the above passes if you use
> > ix86_fixup_binary_operands_no_copy in the expander.
> for -m32, it will hit ICE in
> Breakpoint 1, ix86_fixup_binary_operands_no_copy (code=XOR,
> mode=E_V4QImode, operands=0x7fffa970) a
> /gcc/config/i386/i386-expand.cc:1184
> 1184  rtx dst = ix86_fixup_binary_operands (code, mode, operands);
> (gdb) n
> 1185  gcc_assert (dst == operands[0]); -- here
> (gdb)
>
> the original operands[0], operands[1], operands[2] are below
> (gdb) p debug_rtx (operands[0])
> (mem/c:V4QI (plus:SI (reg/f:SI 77 virtual-stack-vars)
> (const_int -8220 [0xdfe4])) [0 MEM  unsigned char> [(unsigned char *)&tmp2 + 4B]+0 S4 A32])
> $1 = void
> (gdb) p debug_rtx (operands[1])
> (subreg:V4QI (reg:SI 129) 0)
> $2 = void
> (gdb) p debug_rtx (operands[2])
> (subreg:V4QI (reg:SI 98 [ _46 ]) 0)
> $3 = void
> (gdb)
>
> since operands[0] is mem and not equal to operands[1],
> ix86_fixup_binary_operands will create a pseudo register for dst. and
> then hit ICE.
> Is this a bug or assumed?

You will need ix86_expand_binary_operator here.

Uros.


Re: Re: [PATCH] libstdc++: Make __from_chars_alnum_to_val conversion explicit

2022-07-19 Thread Jonathan Wakely via Gcc-patches
On Mon, 18 Jul 2022 at 11:11, Marco Falke via Libstdc++
 wrote:
>
> (in reply to https://gcc.gnu.org/pipermail/gcc-patches/2022-July/598412.html,
> adding libstdc++ to CC, with the same patch attached again)
>
> To clarify, this is not a fix for a user-facing issue of gcc or a fix
> for UB. It is just a minor UX improvement for developers that use the
> clang integer sanitizer to detect implicit int conversions.

Thanks for the patch, I'm running tests now and will commit it once
everything passes.

As this is an obvious, non-copyrightable fix we don't need any
paperwork of DCO sign-off for this change. Is it OK if I commit it for
the name "Marco Falke" or do you want  to use the Author Name
"MacroFake" from the patch attachment?

Thanks again.



Re: [PATCH] c++: Add __reference_con{struc, ver}ts_from_temporary [PR104477]

2022-07-19 Thread Jonathan Wakely via Gcc-patches
On Sun, 17 Jul 2022 at 22:13, Stephan Bergmann via Libstdc++
 wrote:
>
> On 7/15/22 22:25, Marek Polacek via Gcc-patches wrote:
> > Yeah, I guess so.  But I've already pushed the patch.
>
> This commit obviously breaks using libstdc++ with Clang (in -std=c++2b
> mode), which doesn't implement those new builtins.  Something like the
> below would fix that,

Thanks, Stephan, I'll fix this.


>
> > diff --git a/libstdc++-v3/include/std/type_traits 
> > b/libstdc++-v3/include/std/type_traits
> > index b1a1deecf66..a6e028b42ec 100644
> > --- a/libstdc++-v3/include/std/type_traits
> > +++ b/libstdc++-v3/include/std/type_traits
> > @@ -3505,6 +3505,7 @@ template > _Args>
> >template
> >  inline constexpr bool is_scoped_enum_v = is_scoped_enum<_Tp>::value;
> >
> > +#if __has_builtin(__reference_constructs_from_temporary) && 
> > __has_builtin(__reference_converts_from_temporary)
> >  #define __cpp_lib_reference_from_temporary 202202L
> >
> >/// True if _Tp is a reference type, a _Up value can be bound to _Tp in
> > @@ -3544,6 +3545,7 @@ template > _Args>
> >template
> >  inline constexpr bool reference_converts_from_temporary_v
> >= reference_converts_from_temporary<_Tp, _Up>::value;
> > +#endif
> >  #endif // C++23
> >
> >  #if _GLIBCXX_HAVE_IS_CONSTANT_EVALUATED
> > diff --git a/libstdc++-v3/include/std/version 
> > b/libstdc++-v3/include/std/version
> > index 5edca2f3007..7c4b7f7cc6d 100644
> > --- a/libstdc++-v3/include/std/version
> > +++ b/libstdc++-v3/include/std/version
> > @@ -304,7 +304,9 @@
> >  #define __cpp_lib_byteswap 202110L
> >  #define __cpp_lib_constexpr_typeinfo 202106L
> >  #define __cpp_lib_is_scoped_enum 202011L
> > +#if __has_builtin(__reference_constructs_from_temporary) && 
> > __has_builtin(__reference_converts_from_temporary)
> >  #define __cpp_lib_reference_from_temporary 202202L
> > +#endif
> >
> >  #if _GLIBCXX_HOSTED
> >  #define __cpp_lib_adaptor_iterator_pair_constructor 202106L
>



Re: Re: [PATCH] libstdc++: Make __from_chars_alnum_to_val conversion explicit

2022-07-19 Thread Marco Falke via Gcc-patches
Thanks Jonathan. Either name is perfectly fine.

To clarify, this will be pushed to master and also backported to the 12-branch?

Best, Marco

On Tue, Jul 19, 2022 at 12:06 PM Jonathan Wakely  wrote:
>
> On Mon, 18 Jul 2022 at 11:11, Marco Falke via Libstdc++
>  wrote:
> >
> > (in reply to 
> > https://gcc.gnu.org/pipermail/gcc-patches/2022-July/598412.html,
> > adding libstdc++ to CC, with the same patch attached again)
> >
> > To clarify, this is not a fix for a user-facing issue of gcc or a fix
> > for UB. It is just a minor UX improvement for developers that use the
> > clang integer sanitizer to detect implicit int conversions.
>
> Thanks for the patch, I'm running tests now and will commit it once
> everything passes.
>
> As this is an obvious, non-copyrightable fix we don't need any
> paperwork of DCO sign-off for this change. Is it OK if I commit it for
> the name "Marco Falke" or do you want  to use the Author Name
> "MacroFake" from the patch attachment?
>
> Thanks again.
>


Re: Re: [PATCH] libstdc++: Make __from_chars_alnum_to_val conversion explicit

2022-07-19 Thread Jonathan Wakely via Gcc-patches
On Tue, 19 Jul 2022 at 11:13, Marco Falke  wrote:
>
> Thanks Jonathan. Either name is perfectly fine.

OK, thanks.

>
> To clarify, this will be pushed to master and also backported to the 
> 12-branch?

Yes, that's right. It will go to trunk (aka master) today and then
backported to the gcc-12 branch "soon" (whenever I do the next batch
of backports, which might be later this week).



Re: [PATCH] Complete __gnu_debug::basic_string Standard conformity

2022-07-19 Thread Jonathan Wakely via Gcc-patches
On Sun, 10 Jul 2022 at 14:57, François Dumont via Libstdc++
 wrote:
>
> Here is a first patch to complete __gnu_debug::basic_string Standard
> conformity.
>
> I prefer to submit this before checking for more missing stuff to check
> that my proposal of having a testsuite_string.h header is ok.

I think this change means some testcases will never test std::string
with _GLIBCXX_DEBUG defined, because if that is defined they test
__gnu_debug::string instead. That means assertions in
std::basic_string like this one will not get tested:

  template
_GLIBCXX_STRING_CONSTEXPR
typename basic_string<_CharT, _Traits, _Alloc>::size_type
basic_string<_CharT, _Traits, _Alloc>::
find(const _CharT* __s, size_type __pos, size_type __n) const
_GLIBCXX_NOEXCEPT
{
  __glibcxx_requires_string_len(__s, __n);

Are we OK with never testing those assertions?

Currently they don't get tested by default because of the extern
template declarations for std::string, so they would only be tested
with -std=c++20 -D_GLIBCXX_DEBUG -D_GLIBCXX_DEBUG_PEDANTIC, which I
don't test routinely. So I suppose this change doesn't make things any
worse in practice.

> I also noticed some problems with _GLIBCXX_DEBUG_PEDANTIC.
>
>  libstdc++: Complete __gnu_debug::string Standard conformity
>
>  Add testsuite/testsuite_string.h header to help testing
> __gnu_debug::basic_string like
>  std::basic_string depending on _GLIBCXX_DEBUG.
>
>  Add using of base type methods in __gnu_debug::basic_string to make
> use of the method
>  overloads when there is no debug version.
>
>  Fix _GLIBCXX_DEBUG_PEDANTIC assertions in . This
> header has to be used directly
>  like __gnu_debug::string, it is not included by _GLIBCXX_DEBUG. It
> means that
>  _GLIBCXX_DEBUG_PEDANTIC is not considered to define
> __glibcxx_check_string and
>  __glibcxx_check_string_len which are then empty macros. Now those
> macros are defined
>  directly in  and properly consider
> _GLIBCXX_DEBUG_PEDANTIC.

Nice catch.

OK for trunk, thanks.



Re: [PATCH] ifcvt: Improve noce_try_store_flag_mask [PR105314]

2022-07-19 Thread Maciej W. Rozycki
Hi Jakub,

On Tue, 26 Apr 2022, Jakub Jelinek via Gcc-patches wrote:

> The following testcase regressed on riscv due to the splitting of critical
> edges in the sink pass, similarly to x86_64 compared to GCC 11 we now swap
> the edges, whether true or false edge goes to an empty forwarded bb.
> >From GIMPLE POV, those 2 forms are equivalent, but as can be seen here, for
> some ifcvt opts it matters one way or another.
[...]
> --- gcc/testsuite/gcc.target/riscv/pr105314.c.jj  2022-04-25 
> 17:41:00.958736306 +0200
> +++ gcc/testsuite/gcc.target/riscv/pr105314.c 2022-04-25 17:40:46.237940642 
> +0200
> @@ -0,0 +1,12 @@
> +/* PR rtl-optimization/105314 */
> +/* { dg-do compile } *
> +/* { dg-options "-O2" } */
> +/* { dg-final { scan-assembler-not "\tbeq\t" } } */
> +
> +long
> +foo (long a, long b, long c)
> +{
> +  if (c)
> +a = 0;
> +  return a;
> +}

 It's not clear to me why we insist that a branch is to be unconditionally 
avoided here.  I think it has to be microarchitecture-specific and based 
on my observations a branch will or will not be used depending on what the 
specific microarchitecture sets the branch cost to (via BRANCH_COST), e.g. 
if the cost is set to 1 (which none of our current microarchitectures do), 
then a branch is preferred to a three-instruction fall-through sequence, 
i.e.:

foo:
beq a2,zero,.L1
li  a0,0
.L1:
ret

vs:

foo:
seqza2,a2
neg a2,a2
and a0,a0,a2
ret

 Also with `-mtune=sifive-7-series' code produced here is already as 
follows:

foo:
beq a2,zero,1f; mv a0,zero; 1: # movcc
ret

and it only passes the test case due to sloppy coding in the machine 
description (which I'll post a fix for separately and which will obviously 
uncover the failure expected here) combined with the test's strictness 
about using tabs for white space.

 Therefore shouldn't the case be somehow restricted to a subset of RISC-V 
microarchitectures, ones we know that do not want to see a branch here?

  Maciej


Re: [PATCH] c++: Add __reference_con{struc, ver}ts_from_temporary [PR104477]

2022-07-19 Thread Jonathan Wakely via Gcc-patches
On Tue, 19 Jul 2022 at 11:08, Jonathan Wakely wrote:
>
> On Sun, 17 Jul 2022 at 22:13, Stephan Bergmann via Libstdc++
>  wrote:
> >
> > On 7/15/22 22:25, Marek Polacek via Gcc-patches wrote:
> > > Yeah, I guess so.  But I've already pushed the patch.
> >
> > This commit obviously breaks using libstdc++ with Clang (in -std=c++2b
> > mode), which doesn't implement those new builtins.  Something like the
> > below would fix that,
>
> Thanks, Stephan, I'll fix this.

This patch doesn't work, because __has_builtin doesn't detect the new
built-ins. I have a patch that solves that, so we can make the change
to the library headers.

>
> >
> > > diff --git a/libstdc++-v3/include/std/type_traits 
> > > b/libstdc++-v3/include/std/type_traits
> > > index b1a1deecf66..a6e028b42ec 100644
> > > --- a/libstdc++-v3/include/std/type_traits
> > > +++ b/libstdc++-v3/include/std/type_traits
> > > @@ -3505,6 +3505,7 @@ template > > _Args>
> > >template
> > >  inline constexpr bool is_scoped_enum_v = is_scoped_enum<_Tp>::value;
> > >
> > > +#if __has_builtin(__reference_constructs_from_temporary) && 
> > > __has_builtin(__reference_converts_from_temporary)
> > >  #define __cpp_lib_reference_from_temporary 202202L
> > >
> > >/// True if _Tp is a reference type, a _Up value can be bound to _Tp in
> > > @@ -3544,6 +3545,7 @@ template > > _Args>
> > >template
> > >  inline constexpr bool reference_converts_from_temporary_v
> > >= reference_converts_from_temporary<_Tp, _Up>::value;
> > > +#endif
> > >  #endif // C++23
> > >
> > >  #if _GLIBCXX_HAVE_IS_CONSTANT_EVALUATED
> > > diff --git a/libstdc++-v3/include/std/version 
> > > b/libstdc++-v3/include/std/version
> > > index 5edca2f3007..7c4b7f7cc6d 100644
> > > --- a/libstdc++-v3/include/std/version
> > > +++ b/libstdc++-v3/include/std/version
> > > @@ -304,7 +304,9 @@
> > >  #define __cpp_lib_byteswap 202110L
> > >  #define __cpp_lib_constexpr_typeinfo 202106L
> > >  #define __cpp_lib_is_scoped_enum 202011L
> > > +#if __has_builtin(__reference_constructs_from_temporary) && 
> > > __has_builtin(__reference_converts_from_temporary)
> > >  #define __cpp_lib_reference_from_temporary 202202L
> > > +#endif
> > >
> > >  #if _GLIBCXX_HOSTED
> > >  #define __cpp_lib_adaptor_iterator_pair_constructor 202106L
> >



[PATCH] doc: Clarify FENV_ACCESS pragma semantics WRT `-ftrapping-math'

2022-07-19 Thread Maciej W. Rozycki
Our documentation indicates that it is the `-frounding-math' invocation 
option that controls whether we respect what the FENV_ACCESS pragma 
would imply, should we implement it, regarding the floating point 
environment.  It is only a part of the picture however, because the 
`-ftrapping-math' invocation option also affects how we handle said 
environment.  Clarify that in the description of both options then, as 
well as the FENV_ACCESS pragma itself.

gcc/
* doc/implement-c.texi (Floating point implementation): Mention
`-fno-trapping-math' in the context of FENV_ACCESS pragma.
* doc/invoke.texi (Optimize Options): Clarify FENV_ACCESS pragma
implication in the descriptions of `-fno-trapping-math' and 
`-frounding-math'.
---
Hi,

 Discovered in the course of investigating RISC-V unordered comparisons, 
c.f. .

  Maciej
---
 gcc/doc/implement-c.texi |3 ++-
 gcc/doc/invoke.texi  |8 +++-
 2 files changed, 9 insertions(+), 2 deletions(-)

gcc-doc-fenv-access-trapping.diff
Index: gcc/gcc/doc/implement-c.texi
===
--- gcc.orig/gcc/doc/implement-c.texi
+++ gcc/gcc/doc/implement-c.texi
@@ -339,7 +339,8 @@ This is subject to change.
 7.6.1).}
 
 This pragma is not implemented, but the default is to ``off'' unless
-@option{-frounding-math} is used in which case it is ``on''.
+@option{-frounding-math} is used and @option{-fno-trapping-math} is not
+in which case it is ``on''.
 
 @item
 @cite{Additional floating-point exceptions, rounding modes, environments,
Index: gcc/gcc/doc/invoke.texi
===
--- gcc.orig/gcc/doc/invoke.texi
+++ gcc/gcc/doc/invoke.texi
@@ -13513,6 +13513,11 @@ math functions.
 
 The default is @option{-ftrapping-math}.
 
+Future versions of GCC may provide finer control of this setting
+using C99's @code{FENV_ACCESS} pragma.  This command-line option
+will be used along with @option{-frounding-math} to specify the
+default state for @code{FENV_ACCESS}.
+
 @item -frounding-math
 @opindex frounding-math
 Disable transformations and optimizations that assume default floating-point
@@ -13531,7 +13536,8 @@ This option is experimental and does not
 disable all GCC optimizations that are affected by rounding mode.
 Future versions of GCC may provide finer control of this setting
 using C99's @code{FENV_ACCESS} pragma.  This command-line option
-will be used to specify the default state for @code{FENV_ACCESS}.
+will be used along with @option{-ftrapping-math} to specify the
+default state for @code{FENV_ACCESS}.
 
 @item -fsignaling-nans
 @opindex fsignaling-nans


[PATCH v1 0/2] LoongArch: Modify the method of obtaining symbolic addresses.

2022-07-19 Thread Lulu Cheng
1. The original LA macro instruction is split into two instructions to
   obtain the address of the symbol if enable '-mexplicit-relocs'.
2. Currently, '-mcmodel=' only supports 'normal' mode, because other mode
   behaviors are not yet determined. This function is gradually improved
   by the subsequent handling.
3. Modify the method that calls global functions. From 'la.global + jirl'
   to 'bl'.
4. Some R_LARCH_64 in section .eh_frame will to generate  R_LARCH_NONE, we
   change ASM_PREFERRED_EH_DATA_FORMAT from 'WD_EH_PE_absptr' to
   'WD_EH_PE_pcrel | DW_EH_PE_sdata4' then relocation to R_LARCH_32_PCREL
   from R_LARCH_64 in setction .eh_frame and not generate dynamic relocation
   for R_LARCH_32_PCREL.

This new symbol loading method not support by upstream binutils yet,
this GCC port requires the following patches applied to binutils to build.

https://sourceware.org/pipermail/binutils/2022-July/121849.html
https://sourceware.org/pipermail/binutils/2022-July/121850.html
https://sourceware.org/pipermail/binutils/2022-July/121851.html
https://sourceware.org/pipermail/binutils/2022-July/121852.html
https://sourceware.org/pipermail/binutils/2022-July/121853.html

Lulu Cheng (2):
  LoongArch: Modify the method of obtaining symbolic addresses.
  LoongArch: Modify the definition of the ASM_PREFERRED_EH_DATA_FORMAT
macro.

 .../config/loongarch/loongarch-common.cc  |   1 +
 gcc/config/loongarch/constraints.md   |  24 +-
 gcc/config/loongarch/genopts/loongarch.opt.in |   4 +
 gcc/config/loongarch/loongarch-protos.h   |   9 +-
 gcc/config/loongarch/loongarch.cc | 660 +-
 gcc/config/loongarch/loongarch.h  |   4 +-
 gcc/config/loongarch/loongarch.md | 401 +++
 gcc/config/loongarch/loongarch.opt|   4 +
 gcc/config/loongarch/predicates.md|  56 +-
 9 files changed, 629 insertions(+), 534 deletions(-)

-- 
2.31.1



[PATCH v1 1/2] LoongArch: Modify the method of obtaining symbolic addresses.

2022-07-19 Thread Lulu Cheng
1. The original LA macro instruction is split into two instructions to
   obtain the address of the symbol if enable '-mexplicit-relocs'.
2. Currently, '-mcmodel=' only supports 'normal' mode, because other mode
   behaviors are not yet determined. This function is gradually improved
   by the subsequent handling.
3. Modify the method that calls global functions. From 'la.global + jirl'
   to 'bl'.

gcc/ChangeLog:

* common/config/loongarch/loongarch-common.cc:
Enable '-fsection-anchors' when O1 and more advanced optimization.
* config/loongarch/constraints.md (a): Delete the constraint.
(b): A constant call not local address.
(h): Delete the constraint.
(t): Delete the constraint.
* config/loongarch/genopts/loongarch.opt.in: Add new options
'-mexplicit-relocs', and enable by default.
* config/loongarch/loongarch-protos.h (enum loongarch_symbol_type):
Add new symbol type 'SYMBOL_PCREL', 'SYMBOL_TLS_IE' and 'SYMBOL_TLS_LE'.
(loongarch_split_move_insn_p): Delete function declaration.
(loongarch_split_move_insn): Delete function declaration.
* config/loongarch/loongarch.cc (enum loongarch_address_type):
Add new address type 'ADDRESS_LO_SUM'.
(loongarch_symbol_binds_local_p):
(loongarch_classify_symbol): Return symbol type. If symbol is a label,
or symbol is a local symbol return SYMBOL_PCREL. If is a tls symbol,
return SYMBOL_TLS. If is a not local symbol return SYMBOL_GOT_DISP.
(loongarch_classify_symbolic_expression): New function definitions.
Classify the base of symbolic expression X, given that X appears in
context CONTEXT.
(loongarch_symbolic_constant_p): Add handling of 'SYMBOL_TLS_IE'
'SYMBOL_TLS_LE' and 'SYMBOL_PCREL'.
(loongarch_symbol_insns): Add handling of 'SYMBOL_TLS_IE' 
'SYMBOL_TLS_LE'
and 'SYMBOL_PCREL'.
(loongarch_split_symbol_type): New function definitions.
Determines whether the symbol load should be split into two 
instructions.
(loongarch_valid_lo_sum_p): New function definitions.
Return true if a LO_SUM can address a value of mode MODE when the LO_SUM
symbol has type SYMBOL_TYPE.
(loongarch_classify_address): Add handling of 'LO_SUM'.
(loongarch_address_insns): Add handling of 'ADDRESS_LO_SUM'.
(loongarch_12bit_offset_address_p): Return true if address type is 
ADDRESS_LO_SUM.
(loongarch_const_insns): Add handling of 'HIGH'.
(loongarch_split_move_insn_p): Add the static attribute to the function.
(loongarch_emit_set): New function definitions.
(loongarch_call_tls_get_addr): Add symbol handling when defining the
TARGET_EXPLICIT_RELOCS macro.
(loongarch_legitimize_tls_address): Add symbol handling when defining 
the
TARGET_EXPLICIT_RELOCS macro.
(loongarch_split_symbol): New function definitions. Split symbol.
(loongarch_legitimize_address): Add codes see if the address can split 
into a high part
and a LO_SUM.
(loongarch_legitimize_const_move): Add codes split moves of symbolic 
constants into
high and low.
(loongarch_split_move_insn): Delete function definitions.
(loongarch_output_move):
(loongarch_print_operand_reloc): New function definitions.
Print symbolic operand OP, which is part of a HIGH or LO_SUM in context 
CONTEXT.
(loongarch_print_operand): Rearrange alphabetical order and add H and L 
to support HIGH
and LOW output.
(loongarch_print_operand_address): Add handling of 'ADDRESS_LO_SUM'.
(TARGET_MIN_ANCHOR_OFFSET): Define macro to -IMM_REACH/2.
(TARGET_MAX_ANCHOR_OFFSET): Define macro to IMM_REACH/2-1.
* config/loongarch/loongarch.h (LARCH_U12BIT_OFFSET_P):
Rename to LARCH_12BIT_OFFSET_P.
(LARCH_12BIT_OFFSET_P): New macro.
* config/loongarch/loongarch.md (movti): Delete the template.
(*movti): Delete the template.
(movtf): Delete the template.
(*movtf): Delete the template.
(*low): New template of normal symbol low address.
(@tls_low): New template of tls symbol low address.
(@ld_from_got): New template load address from got table.
(@ori_l_lo12): New template.
* config/loongarch/loongarch.opt: Update.
* config/loongarch/predicates.md (is_const_call_weak_symbol):
Delete the predicate.
(is_const_call_plt_symbol): Delete the predicate.
(is_const_call_global_noplt_symbol): Delete the predicate.
(is_const_call_no_local_symbol): New predicate, Determines whether it 
is a local symbol
or label.
---
 .../config/loongarch/loongarch-common.cc  |   1 +
 gcc/config/loongarch/constraints.md   |  24 +-
 gcc/config/loongarch/genopts/loongarch.opt.in |   4 +
 gcc/config/loongarch/loongarch-protos.h   |   9 +-
 g

[PATCH v1 2/2] LoongArch: Modify the definition of the ASM_PREFERRED_EH_DATA_FORMAT macro.

2022-07-19 Thread Lulu Cheng
Some R_LARCH_64 in section .eh_frame will to generate  R_LARCH_NONE, we
change relocation to R_LARCH_32_PCREL from R_LARCH_64 in setction .eh_frame
and not generate dynamic relocation for R_LARCH_32_PCREL.

gcc/ChangeLog:

* config/loongarch/loongarch.h (ASM_PREFERRED_EH_DATA_FORMAT):
Modify the definition of the ASM_PREFERRED_EH_DATA_FORMAT macro.
---
 gcc/config/loongarch/loongarch.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/loongarch/loongarch.h b/gcc/config/loongarch/loongarch.h
index 89a5bd728fe..222b58b838d 100644
--- a/gcc/config/loongarch/loongarch.h
+++ b/gcc/config/loongarch/loongarch.h
@@ -1128,7 +1128,7 @@ struct GTY (()) machine_function
 #endif
 
 #define ASM_PREFERRED_EH_DATA_FORMAT(CODE, GLOBAL) \
-  (((GLOBAL) ? DW_EH_PE_indirect : 0) | DW_EH_PE_absptr)
+  (((GLOBAL) ? DW_EH_PE_indirect : 0) | DW_EH_PE_pcrel | DW_EH_PE_sdata4)
 
 /* Do emit .note.GNU-stack by default.  */
 #ifndef NEED_INDICATE_EXEC_STACK
-- 
2.31.1



[PATCH] c++: Enable __has_builtin for new reference binding built-ins

2022-07-19 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux, OK for trunk?

-- >8 --

The new built-ins need to be detectable using __has_builtin, and the
library should use that to check for them.

This fixes an error with Clang when C++23 is enabled.

gcc/cp/ChangeLog:

* cp-objcp-common.cc (names_builtin_p): Return true for
RID_REF_CONSTRUCTS_FROM_TEMPORARY and
RID_REF_CONVERTS_FROM_TEMPORARY.

libstdc++-v3/ChangeLog:

* include/std/type_traits (__cpp_lib_reference_from_temporary)
(reference_constructs_from_temporary)
(reference_converts_from_temporary): Only define when the
built-ins are available.
---
 gcc/cp/cp-objcp-common.cc| 2 ++
 libstdc++-v3/include/std/type_traits | 4 
 2 files changed, 6 insertions(+)

diff --git a/gcc/cp/cp-objcp-common.cc b/gcc/cp/cp-objcp-common.cc
index 0b70d5567e4..4079a4b4aec 100644
--- a/gcc/cp/cp-objcp-common.cc
+++ b/gcc/cp/cp-objcp-common.cc
@@ -461,6 +461,8 @@ names_builtin_p (const char *name)
 case RID_IS_ASSIGNABLE:
 case RID_IS_CONSTRUCTIBLE:
 case RID_UNDERLYING_TYPE:
+case RID_REF_CONSTRUCTS_FROM_TEMPORARY:
+case RID_REF_CONVERTS_FROM_TEMPORARY:
   return true;
 default:
   break;
diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index b1a1deecf66..14b029cec64 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -3505,6 +3505,9 @@ template
   template
 inline constexpr bool is_scoped_enum_v = is_scoped_enum<_Tp>::value;
 
+#if __has_builtin(__reference_constructs_from_temporary) \
+  && __has_builtin(__reference_converts_from_temporary)
+
 #define __cpp_lib_reference_from_temporary 202202L
 
   /// True if _Tp is a reference type, a _Up value can be bound to _Tp in
@@ -3544,6 +3547,7 @@ template
   template
 inline constexpr bool reference_converts_from_temporary_v
   = reference_converts_from_temporary<_Tp, _Up>::value;
+#endif // __has_builtin for reference_from_temporary
 #endif // C++23
 
 #if _GLIBCXX_HAVE_IS_CONSTANT_EVALUATED
-- 
2.34.3



Re: [PATCH] c++: Enable __has_builtin for new reference binding built-ins

2022-07-19 Thread Marek Polacek via Gcc-patches
On Tue, Jul 19, 2022 at 02:11:02PM +0100, Jonathan Wakely wrote:
> Tested x86_64-linux, OK for trunk?
> 
> -- >8 --
> 
> The new built-ins need to be detectable using __has_builtin, and the
> library should use that to check for them.
> 
> This fixes an error with Clang when C++23 is enabled.

LGTM but can't approve.

Thanks and sorry for missing this!
 
> gcc/cp/ChangeLog:
> 
>   * cp-objcp-common.cc (names_builtin_p): Return true for
>   RID_REF_CONSTRUCTS_FROM_TEMPORARY and
>   RID_REF_CONVERTS_FROM_TEMPORARY.
> 
> libstdc++-v3/ChangeLog:
> 
>   * include/std/type_traits (__cpp_lib_reference_from_temporary)
>   (reference_constructs_from_temporary)
>   (reference_converts_from_temporary): Only define when the
>   built-ins are available.
> ---
>  gcc/cp/cp-objcp-common.cc| 2 ++
>  libstdc++-v3/include/std/type_traits | 4 
>  2 files changed, 6 insertions(+)
> 
> diff --git a/gcc/cp/cp-objcp-common.cc b/gcc/cp/cp-objcp-common.cc
> index 0b70d5567e4..4079a4b4aec 100644
> --- a/gcc/cp/cp-objcp-common.cc
> +++ b/gcc/cp/cp-objcp-common.cc
> @@ -461,6 +461,8 @@ names_builtin_p (const char *name)
>  case RID_IS_ASSIGNABLE:
>  case RID_IS_CONSTRUCTIBLE:
>  case RID_UNDERLYING_TYPE:
> +case RID_REF_CONSTRUCTS_FROM_TEMPORARY:
> +case RID_REF_CONVERTS_FROM_TEMPORARY:
>return true;
>  default:
>break;
> diff --git a/libstdc++-v3/include/std/type_traits 
> b/libstdc++-v3/include/std/type_traits
> index b1a1deecf66..14b029cec64 100644
> --- a/libstdc++-v3/include/std/type_traits
> +++ b/libstdc++-v3/include/std/type_traits
> @@ -3505,6 +3505,9 @@ template
>template
>  inline constexpr bool is_scoped_enum_v = is_scoped_enum<_Tp>::value;
>  
> +#if __has_builtin(__reference_constructs_from_temporary) \
> +  && __has_builtin(__reference_converts_from_temporary)
> +
>  #define __cpp_lib_reference_from_temporary 202202L
>  
>/// True if _Tp is a reference type, a _Up value can be bound to _Tp in
> @@ -3544,6 +3547,7 @@ template
>template
>  inline constexpr bool reference_converts_from_temporary_v
>= reference_converts_from_temporary<_Tp, _Up>::value;
> +#endif // __has_builtin for reference_from_temporary
>  #endif // C++23
>  
>  #if _GLIBCXX_HAVE_IS_CONSTANT_EVALUATED
> -- 
> 2.34.3
> 

Marek



Re: [PATCH] match.pd: Add new abs pattern [PR94290]

2022-07-19 Thread Sam Feifer via Gcc-patches
Attached is a patch file with those changes.

Thanks
-Sam

On Tue, Jul 19, 2022 at 2:36 AM Richard Biener 
wrote:

> On Mon, Jul 18, 2022 at 7:31 PM Sam Feifer via Gcc-patches
>  wrote:
> >
> > Just realized I had mixed up the 9 and the 2 when labelling the patch.
> This
> > patch is referring to pr94920 not pr94290. Attached is a fixed patch
> file.
> > Sorry for any confusion.
>
> Can you put the patterns somewhere related?  The abs producing one
> seems to fit around line 323, the max producing before line 3415?
>
> +/* (x <= 0 ? -x : 0) -> max(-x, 0).  */
> +(simplify
> +  (cond (le @0 integer_zerop@1) (negate @0) integer_zerop@1)
> +  (max (negate @0) @1))
>
> you can re-use the existing (negate @0) in the result by doing
>
> +/* (x <= 0 ? -x : 0) -> max(-x, 0).  */
> +(simplify
> +  (cond (le @0 integer_zerop@1) (negate@2 @0) integer_zerop@1)
> +  (max @2 @1))
>
> OK with those changes.
>
> Thanks,
> Richard.
>
> > On Mon, Jul 18, 2022 at 9:07 AM Sam Feifer  wrote:
> >
> > > Here's an updated version of the patch.
> > >
> > > Thanks
> > > -Sam
> > >
> > > On Thu, Jul 14, 2022 at 3:54 PM Andrew Pinski 
> wrote:
> > >
> > >> On Thu, Jul 14, 2022 at 12:38 PM Sam Feifer 
> wrote:
> > >> >
> > >> >
> > >> >
> > >> > On Thu, Jul 14, 2022 at 1:24 PM Andrew Pinski 
> > >> wrote:
> > >> >>
> > >> >> On Thu, Jul 14, 2022 at 7:09 AM Sam Feifer 
> wrote:
> > >> >> >
> > >> >> >
> > >> >> > On Wed, Jul 13, 2022 at 3:36 PM Andrew Pinski  >
> > >> wrote:
> > >> >> >>
> > >> >> >> On Wed, Jul 13, 2022 at 12:26 PM Sam Feifer via Gcc-patches
> > >> >> >>  wrote:
> > >> >> >> >
> > >> >> >> > This patch is intended to fix a missed optimization in
> match.pd.
> > >> It optimizes (x >= 0 ? x : 0) + (x <= 0 ? -x : 0) to just abs(x). I
> had to
> > >> write a second simplification in match.pd to handle the commutative
> > >> property as the match was not ocurring otherwise. Additionally, the
> pattern
> > >> (x <= 0 ? -x : 0) now gets optimized to max(-x, 0), which helps with
> the
> > >> other simplification rule.
> > >> >> >>
> > >> >> >> You could use :c for the commutative property instead and that
> > >> should
> > >> >> >> simplify things.
> > >> >> >> That is:
> > >> >> >>
> > >> >> >> (simplify
> > >> >> >>   (plus:c (max @0 integer_zerop) (max (negate @0)
> integer_zerop))
> > >> >> >>   (abs @0))
> > >> >> >>
> > >> >> >> Also since integer_zerop works on vectors, it seems like you
> should
> > >> >> >> add a testcase or two for the vector case.
> > >> >> >> Also would be useful if you write a testcase that uses different
> > >> >> >> statements rather than one big one so it gets exercised in the
> > >> >> >> forwprop case.
> > >> >> >> Note also if either of the max are used more than just in this
> > >> >> >> simplification, it could increase the lifetime of @0, maybe you
> need
> > >> >> >> to add :s to the max expressions.
> > >> >> >>
> > >> >> >
> > >> >> > Thanks for the feedback. I'm not quite sure what a vector test
> case
> > >> would look like for this. Could I get some guidance on that?
> > >> >>
> > >> >> Yes this should produce the pattern at forwprop1 time (with the C++
> > >> >> front-end, the C front-end does not support vector selects):
> > >> >> typedef int __attribute__((vector_size(4*sizeof(int vint;
> > >> >>
> > >> >> vint foo(vint x) {
> > >> >> vint t = (x >= 0 ? x : 0) ;
> > >> >> vint xx = -x;
> > >> >> vint t1 =  (xx >= 0 ? xx : 0);
> > >> >> return t + t1;
> > >> >> }
> > >> >>
> > >> >> int foo(int x) {
> > >> >> int t = (x >= 0 ? x : 0) ;
> > >> >> int xx = -x;
> > >> >> int t1 =  (xx >= 0 ? xx : 0);
> > >> >> return t + t1;
> > >> >> }
> > >> >>
> > >> >> Thanks,
> > >> >> Andrew Pinski
> > >> >>
> > >> >
> > >> > Thanks for the help. I'm still having trouble with the vector test,
> > >> though. When I try to compile, I get an error saying "used vector type
> > >> where scalar is required", referring to the max expressions. How do I
> use
> > >> the max expression with two vectors as the inputs?
> > >>
> > >> As I mentioned it only works with the C++ front-end :). The C
> > >> front-end does not support ?: with vectors types.
> > >>
> > >> Thanks,
> > >> Andrew Pinski
> > >>
> > >> >
> > >> > Thanks
> > >> > -Sam
> > >> >>
> > >> >>
> > >> >> >
> > >> >> > Thanks
> > >> >> > -Sam
> > >> >> >
> > >> >> >>
> > >> >> >> Thanks,
> > >> >> >> Andrew
> > >> >> >>
> > >> >> >> >
> > >> >> >> > Tests are also included to be added to the testsuite.
> > >> >> >> >
> > >> >> >> > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> > >> >> >> >
> > >> >> >> > PR tree-optimization/94290
> > >> >> >> >
> > >> >> >> > gcc/ChangeLog:
> > >> >> >> >
> > >> >> >> > * match.pd (x >= 0 ? x : 0) + (x <= 0 ? -x : 0): New
> > >> simplification.
> > >> >> >> > * match.pd (x <= 0 ? -x : 0): New Simplification.
> > >> >> >> >
> > >> >> >> > gcc/testsuite/ChangeLog:
> > >> >> >> >
> > >> >> >> > * gcc.c-torture/execute/pr94290-1.c

[committed] analyzer: log out-edge description in exploded_graph::process_node

2022-07-19 Thread David Malcolm via Gcc-patches
I found this logging tweak very helpful when working on
PR analyzer/106284.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r13-1747-g434d521d118fc7.

gcc/analyzer/ChangeLog:
* engine.cc (exploded_graph::process_node): Show any description
of the out-edge when logging it for consideration.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/engine.cc | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/gcc/analyzer/engine.cc b/gcc/analyzer/engine.cc
index 9ffcc410839..4f7f9d03900 100644
--- a/gcc/analyzer/engine.cc
+++ b/gcc/analyzer/engine.cc
@@ -3974,8 +3974,12 @@ exploded_graph::process_node (exploded_node *node)
  {
found_a_superedge = true;
if (logger)
- logger->log ("considering SN: %i -> SN: %i",
-  succ->m_src->m_index, succ->m_dest->m_index);
+ {
+   label_text succ_desc (succ->get_description (false));
+   logger->log ("considering SN: %i -> SN: %i (%s)",
+succ->m_src->m_index, succ->m_dest->m_index,
+succ_desc.get ());
+ }
 
program_point next_point
  = program_point::before_supernode (succ->m_dest, succ,
-- 
2.26.3



[committed] analyzer: fix taint handling of switch statements [PR106321]

2022-07-19 Thread David Malcolm via Gcc-patches
PR analyzer/106321 reports false positives from
-Wanalyzer-tainted-array-index on switch statements, seen e.g.
in the Linux kernel in drivers/vfio/pci/vfio_pci_core.c, where
vfio_pci_core_ioctl has:

|  744 | switch (info.index) {
|  | ~~  ~~
|  | |   |
|  | |   (8) ...to here
|  | (9) following ‘case 0 ... 5:’ branch...
|..
|  751 | case VFIO_PCI_BAR0_REGION_INDEX ... 
VFIO_PCI_BAR5_REGION_INDEX:
|  | 
|  | |
|  | (10) ...to here

and then a false complaint about "use of attacker-controlled value
‘info.index’ in array lookup without upper-bounds checking", where
info.index has clearly had its bounds checked by the switch/case.

It turns out that when I rewrote switch handling for the analyzer in
r12-3101-g8ca7fa84a3af35, I removed notifications to state machines
about the constraints on cases.

This patch fixes that oversight by adding a new on_bounded_ranges vfunc
for region_model_context, called on switch statement edges, which calls
a new state_machine vfunc.  It implements it for the "taint" state
machine, so that it updates the "has bounds" flags at out-edges for
switch statements, based on whether the bounds from the edge appear to
actually constrain the switch index.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r13-1748-g2c044ff123ee57.

gcc/analyzer/ChangeLog:
PR analyzer/106321
* constraint-manager.h (bounded_ranges::get_count): New.
(bounded_ranges::get_range): New.
* engine.cc (impl_region_model_context::on_bounded_ranges): New.
* exploded-graph.h (impl_region_model_context::on_bounded_ranges):
New decl.
* region-model.cc (region_model::apply_constraints_for_gswitch):
Potentially call ctxt->on_bounded_ranges.
* region-model.h (region_model_context::on_bounded_ranges): New
vfunc.
(noop_region_model_context::on_bounded_ranges): New.
(region_model_context_decorator::on_bounded_ranges): New.
* sm-taint.cc: Include "analyzer/constraint-manager.h".
(taint_state_machine::on_bounded_ranges): New.
* sm.h (state_machine::on_bounded_ranges): New.

gcc/testsuite/ChangeLog:
PR analyzer/106321
* gcc.dg/analyzer/torture/taint-read-index-2.c: Add test coverage
for switch statements.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/constraint-manager.h |  3 +
 gcc/analyzer/engine.cc| 26 ++
 gcc/analyzer/exploded-graph.h |  3 +
 gcc/analyzer/region-model.cc  |  2 +
 gcc/analyzer/region-model.h   | 17 
 gcc/analyzer/sm-taint.cc  | 58 +
 gcc/analyzer/sm.h |  9 ++
 .../analyzer/torture/taint-read-index-2.c | 85 +++
 8 files changed, 203 insertions(+)

diff --git a/gcc/analyzer/constraint-manager.h 
b/gcc/analyzer/constraint-manager.h
index f67c7647497..1271f18f1d1 100644
--- a/gcc/analyzer/constraint-manager.h
+++ b/gcc/analyzer/constraint-manager.h
@@ -138,6 +138,9 @@ public:
 
   static int cmp (const bounded_ranges *a, const bounded_ranges *b);
 
+  unsigned get_count () const { return m_ranges.length (); }
+  const bounded_range &get_range (unsigned idx) const { return m_ranges[idx]; }
+
 private:
   void canonicalize ();
   void validate () const;
diff --git a/gcc/analyzer/engine.cc b/gcc/analyzer/engine.cc
index 4f7f9d03900..85b7c5e1227 100644
--- a/gcc/analyzer/engine.cc
+++ b/gcc/analyzer/engine.cc
@@ -916,6 +916,32 @@ impl_region_model_context::on_condition (const svalue *lhs,
 }
 }
 
+/* Implementation of region_model_context::on_bounded_ranges vfunc.
+   Notify all state machines about the ranges, which could lead to
+   state transitions.  */
+
+void
+impl_region_model_context::on_bounded_ranges (const svalue &sval,
+ const bounded_ranges &ranges)
+{
+  int sm_idx;
+  sm_state_map *smap;
+  FOR_EACH_VEC_ELT (m_new_state->m_checker_states, sm_idx, smap)
+{
+  const state_machine &sm = m_ext_state.get_sm (sm_idx);
+  impl_sm_context sm_ctxt (*m_eg, sm_idx, sm, m_enode_for_diag,
+  m_old_state, m_new_state,
+  m_old_state->m_checker_states[sm_idx],
+  m_new_state->m_checker_states[sm_idx],
+  m_path_ctxt);
+  sm.on_bounded_ranges (&sm_ctxt,
+   (m_enode_for_diag
+? m_enode_for_diag->get_supernode ()
+: NULL),
+   m_stmt, sval, ranges);
+}
+}
+
 /* Implementation of region_model_context::on_phi vfunc.
Notify all state machines about t

Re: [PATCH] c++: Enable __has_builtin for new reference binding built-ins

2022-07-19 Thread Jason Merrill via Gcc-patches

On 7/19/22 09:11, Jonathan Wakely wrote:

Tested x86_64-linux, OK for trunk?


OK.


-- >8 --

The new built-ins need to be detectable using __has_builtin, and the
library should use that to check for them.

This fixes an error with Clang when C++23 is enabled.

gcc/cp/ChangeLog:

* cp-objcp-common.cc (names_builtin_p): Return true for
RID_REF_CONSTRUCTS_FROM_TEMPORARY and
RID_REF_CONVERTS_FROM_TEMPORARY.

libstdc++-v3/ChangeLog:

* include/std/type_traits (__cpp_lib_reference_from_temporary)
(reference_constructs_from_temporary)
(reference_converts_from_temporary): Only define when the
built-ins are available.
---
  gcc/cp/cp-objcp-common.cc| 2 ++
  libstdc++-v3/include/std/type_traits | 4 
  2 files changed, 6 insertions(+)

diff --git a/gcc/cp/cp-objcp-common.cc b/gcc/cp/cp-objcp-common.cc
index 0b70d5567e4..4079a4b4aec 100644
--- a/gcc/cp/cp-objcp-common.cc
+++ b/gcc/cp/cp-objcp-common.cc
@@ -461,6 +461,8 @@ names_builtin_p (const char *name)
  case RID_IS_ASSIGNABLE:
  case RID_IS_CONSTRUCTIBLE:
  case RID_UNDERLYING_TYPE:
+case RID_REF_CONSTRUCTS_FROM_TEMPORARY:
+case RID_REF_CONVERTS_FROM_TEMPORARY:
return true;
  default:
break;
diff --git a/libstdc++-v3/include/std/type_traits 
b/libstdc++-v3/include/std/type_traits
index b1a1deecf66..14b029cec64 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -3505,6 +3505,9 @@ template
template
  inline constexpr bool is_scoped_enum_v = is_scoped_enum<_Tp>::value;
  
+#if __has_builtin(__reference_constructs_from_temporary) \

+  && __has_builtin(__reference_converts_from_temporary)
+
  #define __cpp_lib_reference_from_temporary 202202L
  
/// True if _Tp is a reference type, a _Up value can be bound to _Tp in

@@ -3544,6 +3547,7 @@ template
template
  inline constexpr bool reference_converts_from_temporary_v
= reference_converts_from_temporary<_Tp, _Up>::value;
+#endif // __has_builtin for reference_from_temporary
  #endif // C++23
  
  #if _GLIBCXX_HAVE_IS_CONSTANT_EVALUATED




[GCC13][Patch][V2][0/2]Add a new option -fstrict-flex-array[=n] and attribute strict_flex_array(n) and use it in PR101836

2022-07-19 Thread Qing Zhao via Gcc-patches
Hi, 

Based on the previous discussion on the Version 1 of the patch:

https://gcc.gnu.org/pipermail/gcc-patches/2022-June/597350.html
https://gcc.gnu.org/pipermail/gcc-patches/2022-July/598010.html

We decided:

*User interface:
.  command line option in C/C++:
-fstrict-flex-array[=N]   (N=0, 1, 2, 3)
.  decl attribute for FIELD_DECL:
strict_flex_array (N)  (N=0, 1, 2, 3)

*Implementation:

1.  Add a new IR flag “DECL_NOT_FLEXARRAY” to FIELD_DECL; 

2.  In C/C++ FE, set the new flag “DECL_NOT_FLEXARRAY” for a FIELD_DECL based 
on [0], [1],
   [] and the option -fstrict-flex-array, the attribute strict_flex_array and 
whether it’s the last field of
   the DECL_CONTEXT.

3.  In Middle end,   update any place that treats trailing array as flexible 
array member with the 
new flag  DECL_NOT_FLEXARRAY + array_at_struct_end_p to control the 
behavior with
Multiple level consistently. 

Then the above implementation will be divided into the following 3 parts:

Part A:  1 + 2
Part B:  In Middle end, use the new flag in tree-object-size.cc to resolve 
PR101836, then kernel can use __FORTIFY_SOURCE correctly after this;
Part C:  in Middle end, use the new flag in all other places that use 
“array_at_struct_end_p” or “component_ref_size” to make GCC consistently 
behave for trailing array. 

This set of patches (2 patches in total, the #1 is for Part A, and #2 is for 
Part B) are for the above Part A and Part B.   

NOTE: After this set of patches, the one major issue remained in GCC is, 
different phases that call either “array_at_struct_end_p” or 
“component_ref_size” behave
inconsistently on the default behavior of what’s a flexible array member, some 
phases treat all trailing array as flexible array member by default, and some
phases treat trailing [], [0], [1] as flexible array member by default. Such 
inconsistency need to be resolved in Part C. 

I have bootstrapped and regression tested on both aarch64 and x86, no issues.

Let me know if you have any comments on the patches.

Thanks.

Qing

[committed] .gitignore: do not ignore config.h

2022-07-19 Thread Alexander Monakov via Gcc-patches
GCC does not support in-tree builds at the moment, so .gitignore
concealing artifacts of accidental in-tree ./configure run may cause
confusion. Un-ignore config.h, which is known to break the build.

ChangeLog:

* .gitignore: Do not ignore config.h.
---
 .gitignore | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/.gitignore b/.gitignore
index 021a8c741..5cc4a0fdf 100644
--- a/.gitignore
+++ b/.gitignore
@@ -23,7 +23,8 @@
 
 autom4te.cache
 config.cache
-config.h
+# GCC does not support in-tree builds, do not conceal a stray config.h:
+# config.h
 config.intl
 config.log
 config.status
-- 
2.35.1


[GCC13][Patch][V2][1/2]Add a new option -fstrict-flex-array[=n] and attribute strict_flex_array(n) and use it in PR101836

2022-07-19 Thread Qing Zhao via Gcc-patches
>From 3854004802b8e2f132ebf218fc35a632f5e80c6a Mon Sep 17 00:00:00 2001
From: Qing Zhao 
Date: Mon, 18 Jul 2022 17:04:12 +
Subject: [PATCH 1/2] Add a new option -fstrict-flex-array[=n] and new
attribute strict_flex_array

Add the following new option -fstrict-flex-array[=n] and a corresponding
attribute strict_flex_array to GCC:

'-fstrict-flex-array'
Treat the trailing array of a structure as a flexible array member
in a stricter way.  The positive form is equivalent to
'-fstrict-flex-array=3', which is the strictest.  A trailing array
is treated as a flexible array member only when it is declared as a
flexible array member per C99 standard onwards.  The negative form
is equivalent to '-fstrict-flex-array=0', which is the least
strict.  All trailing arrays of structures are treated as flexible
array members.

'-fstrict-flex-array=LEVEL'
Treat the trailing array of a structure as a flexible array member
in a stricter way.  The value of LEVEL controls the level of
strictness.

The possible values of LEVEL are the same as for the
'strict_flex_array' attribute (*note Variable Attributes::).

You can control this behavior for a specific trailing array field
of a structure by using the variable attribute 'strict_flex_array'
attribute (*note Variable Attributes::).

'strict_flex_array (LEVEL)'
The 'strict_flex_array' attribute should be attached to the
trailing array field of a structure.  It specifies the level of
strictness of treating the trailing array field of a structure as a
flexible array member.  LEVEL must be an integer betwen 0 to 3.

LEVEL=0 is the least strict level, all trailing arrays of
structures are treated as flexible array members.  LEVEL=3 is the
strictest level, only when the trailing array is declared as a
flexible array member per C99 standard onwards ([]), it is treated
as a flexible array member.

There are two more levels in between 0 and 3, which are provided to
support older codes that use GCC zero-length array extension ([0])
or one-size array as flexible array member ([1]): When LEVEL is 1,
the trailing array is treated as a flexible array member when it is
declared as either [], [0], or [1]; When LEVEL is 2, the trailing
array is treated as a flexible array member when it is declared as
either [], or [0].

This attribute can be used with or without '-fstrict-flex-array'.
When both the attribute and the option present at the same time,
the level of the strictness for the specific trailing array field
is determined by the attribute.

gcc/c-family/ChangeLog:

* c-attribs.cc (handle_strict_flex_array_attribute): New function.
(c_common_attribute_table): New item for strict_flex_array.
* c.opt (fstrict-flex-array): New option.
(fstrict-flex-array=): New option.

gcc/c/ChangeLog:

* c-decl.cc (add_flexible_array_elts_to_size): Call new utility
routine flexible_array_member_p.
(is_flexible_array_member_p): New function.
(finish_struct): Set the new DECL_NOT_FLEXARRAY flag.

gcc/ChangeLog:

* doc/extend.texi: Document strict_flex_array attribute.
* doc/invoke.texi: Document -fstrict-flex-array[=n] option.
* tree-core.h (struct tree_decl_common): New bit field
decl_not_flexarray.
* tree.cc (component_ref_size): Reorg by using new utility functions.
(flexible_array_member_p): New function.
(zero_length_array_p): Likewise.
(one_element_array_p): Likewise.
(flexible_array_type_p): Likewise.
* tree.h (DECL_NOT_FLEXARRAY): New flag.
(zero_length_array_p): New function prototype.
(one_element_array_p): Likewise.
(flexible_array_member_p): Likewise.

gcc/testsuite/ChangeLog:

* gcc.dg/strict-flex-array-1.c: New test.
---
gcc/c-family/c-attribs.cc  |  47 
gcc/c-family/c.opt |   7 ++
gcc/c/c-decl.cc|  91 +--
gcc/doc/extend.texi|  25 
gcc/doc/invoke.texi|  27 -
gcc/testsuite/gcc.dg/strict-flex-array-1.c |  31 +
gcc/tree-core.h|   5 +-
gcc/tree.cc| 130 ++---
gcc/tree.h |  16 ++-
9 files changed, 322 insertions(+), 57 deletions(-)
create mode 100644 gcc/testsuite/gcc.dg/strict-flex-array-1.c

diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index c8d96723f4c..10d16532f0d 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -101,6 +101,8 @@ static tree handle_special_var_sec_attribute (tree *, tree, 
tree, int, bool *);
static tree handle_aligned_attribute (tree *, tree, tree, int, bool *);
static tree handle_warn_if_not_aligned_attribute (tree *, tree, tree,

[GCC13][Patch][V2][2/2]Add a new option -fstrict-flex-array[=n] and attribute strict_flex_array(n) and use it in PR101836

2022-07-19 Thread Qing Zhao via Gcc-patches
>From a09f39ded462611286a44d9e8273de8342673ba2 Mon Sep 17 00:00:00 2001
From: Qing Zhao 
Date: Mon, 18 Jul 2022 18:12:26 +
Subject: [PATCH 2/2] Use new flag DECL_NOT_FLEXARRAY in __builtin_object_size
[PR101836]

Use new flag DECL_NOT_FLEXARRAY to determine whether the trailing array
of a structure is flexible array member in __builtin_object_size.

gcc/ChangeLog:

PR tree-optimization/101836
* tree-object-size.cc (addr_object_size): Use array_at_struct_end_p
and DECL_NOT_FLEXARRAY to determine a flexible array member reference.

gcc/testsuite/ChangeLog:

PR tree-optimization/101836
* gcc.dg/pr101836.c: New test.
* gcc.dg/pr101836_1.c: New test.
* gcc.dg/pr101836_2.c: New test.
* gcc.dg/pr101836_3.c: New test.
* gcc.dg/pr101836_4.c: New test.
* gcc.dg/pr101836_5.c: New test.
* gcc.dg/strict-flex-array-2.c: New test.
* gcc.dg/strict-flex-array-3.c: New test.
---
gcc/testsuite/gcc.dg/pr101836.c| 60 ++
gcc/testsuite/gcc.dg/pr101836_1.c  | 60 ++
gcc/testsuite/gcc.dg/pr101836_2.c  | 60 ++
gcc/testsuite/gcc.dg/pr101836_3.c  | 60 ++
gcc/testsuite/gcc.dg/pr101836_4.c  | 60 ++
gcc/testsuite/gcc.dg/pr101836_5.c  | 60 ++
gcc/testsuite/gcc.dg/strict-flex-array-2.c | 60 ++
gcc/testsuite/gcc.dg/strict-flex-array-3.c | 60 ++
gcc/tree-object-size.cc| 18 +++
9 files changed, 489 insertions(+), 9 deletions(-)
create mode 100644 gcc/testsuite/gcc.dg/pr101836.c
create mode 100644 gcc/testsuite/gcc.dg/pr101836_1.c
create mode 100644 gcc/testsuite/gcc.dg/pr101836_2.c
create mode 100644 gcc/testsuite/gcc.dg/pr101836_3.c
create mode 100644 gcc/testsuite/gcc.dg/pr101836_4.c
create mode 100644 gcc/testsuite/gcc.dg/pr101836_5.c
create mode 100644 gcc/testsuite/gcc.dg/strict-flex-array-2.c
create mode 100644 gcc/testsuite/gcc.dg/strict-flex-array-3.c

diff --git a/gcc/testsuite/gcc.dg/pr101836.c b/gcc/testsuite/gcc.dg/pr101836.c
new file mode 100644
index 000..e5b4e5160a4
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr101836.c
@@ -0,0 +1,60 @@
+/* -fstrict-flex-array is aliased with -ftrict-flex-array=3, which is the
+   strictest, only [] is treated as flexible array.  */ 
+/* PR tree-optimization/101836 */
+/* { dg-do run } */
+/* { dg-options "-O2 -fstrict-flex-array" } */
+
+#include 
+
+#define expect(p, _v) do { \
+size_t v = _v; \
+if (p == v) \
+printf("ok:  %s == %zd\n", #p, p); \
+else \
+   {  \
+  printf("WAT: %s == %zd (expected %zd)\n", #p, p, v); \
+ __builtin_abort (); \
+   } \
+} while (0);
+
+struct trailing_array_1 {
+int a;
+int b;
+int c[4];
+};
+
+struct trailing_array_2 {
+int a;
+int b;
+int c[1];
+};
+
+struct trailing_array_3 {
+int a;
+int b;
+int c[0];
+};
+struct trailing_array_4 {
+int a;
+int b;
+int c[];
+};
+
+void __attribute__((__noinline__)) stuff(
+struct trailing_array_1 *normal,
+struct trailing_array_2 *trailing_1,
+struct trailing_array_3 *trailing_0,
+struct trailing_array_4 *trailing_flex)
+{
+expect(__builtin_object_size(normal->c, 1), 16);
+expect(__builtin_object_size(trailing_1->c, 1), 4);
+expect(__builtin_object_size(trailing_0->c, 1), 0);
+expect(__builtin_object_size(trailing_flex->c, 1), -1);
+}
+
+int main(int argc, char *argv[])
+{
+stuff((void *)argv[0], (void *)argv[0], (void *)argv[0], (void *)argv[0]);
+
+return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/pr101836_1.c 
b/gcc/testsuite/gcc.dg/pr101836_1.c
new file mode 100644
index 000..30ea20427a5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr101836_1.c
@@ -0,0 +1,60 @@
+/* -fstrict-flex-array=3 is the strictest, only [] is treated as
+   flexible array.  */ 
+/* PR tree-optimization/101836 */
+/* { dg-do run } */
+/* { dg-options "-O2 -fstrict-flex-array=3" } */
+
+#include 
+
+#define expect(p, _v) do { \
+size_t v = _v; \
+if (p == v) \
+printf("ok:  %s == %zd\n", #p, p); \
+else \
+   {  \
+  printf("WAT: %s == %zd (expected %zd)\n", #p, p, v); \
+ __builtin_abort (); \
+   } \
+} while (0);
+
+struct trailing_array_1 {
+int a;
+int b;
+int c[4];
+};
+
+struct trailing_array_2 {
+int a;
+int b;
+int c[1];
+};
+
+struct trailing_array_3 {
+int a;
+int b;
+int c[0];
+};
+struct trailing_array_4 {
+int a;
+int b;
+int c[];
+};
+
+void __attribute__((__noinline__)) stuff(
+struct trailing_array_1 *normal,
+struct trailing_array_2 *trailing_1,
+struct trailing_array_3 *trailing_0,
+struct trailing_array_4 *trailing_flex)
+{
+expect(__builtin_object_size(normal->c, 1), 16);
+expect(__builtin_object_size(trailing_1->c, 1), 4);
+expect(__builtin_ob

Re: [PATCH] c++: Add __reference_con{struc, ver}ts_from_temporary [PR104477]

2022-07-19 Thread Jonathan Wakely via Gcc-patches
On Tue, 19 Jul 2022 at 12:10, Jonathan Wakely wrote:
>
> On Tue, 19 Jul 2022 at 11:08, Jonathan Wakely wrote:
> >
> > On Sun, 17 Jul 2022 at 22:13, Stephan Bergmann via Libstdc++
> >  wrote:
> > >
> > > On 7/15/22 22:25, Marek Polacek via Gcc-patches wrote:
> > > > Yeah, I guess so.  But I've already pushed the patch.
> > >
> > > This commit obviously breaks using libstdc++ with Clang (in -std=c++2b
> > > mode), which doesn't implement those new builtins.  Something like the
> > > below would fix that,
> >
> > Thanks, Stephan, I'll fix this.
>
> This patch doesn't work, because __has_builtin doesn't detect the new
> built-ins. I have a patch that solves that, so we can make the change
> to the library headers.

Should be fixed now, thanks for reporting it.



Re: [PATCH v1 1/2] LoongArch: Modify the method of obtaining symbolic addresses.

2022-07-19 Thread Xi Ruoyao via Gcc-patches
The change seems too large.  It would be better to split it into
multiple commits (for example, just 3 commits for 1,2,3 below).

On Tue, 2022-07-19 at 21:08 +0800, Lulu Cheng wrote:

> 1. The original LA macro instruction is split into two instructions to
>    obtain the address of the symbol if enable '-mexplicit-relocs'.

It's better to add some test cases (with dg-final scan-assembler) for
this.  The test case will also show humans the intended behavior after
the change.

> 3. Modify the method that calls global functions. From 'la.global + jirl'
>    to 'bl'.

Why?  Does it means we'll rely on the assembler to emit the correct
sequence for -fno-plt?  Then it would be better to use a pseudo mnemonic
like "call" instead of "bl" (because it's not a single "bl"
instruction).


/* snip */

>  static bool
>  loongarch_valid_index_p (struct loongarch_address_info *info, rtx x,
>   machine_mode mode, bool strict_p)
> @@ -1881,6 +1978,26 @@ loongarch_classify_address (struct 
> loongarch_address_info *info, rtx x,
>    info->offset = XEXP (x, 1);
>    return (loongarch_valid_base_register_p (info->reg, mode, strict_p)
>   && loongarch_valid_offset_p (info->offset, mode));
> +
> +    case LO_SUM:
> +  info->type = ADDRESS_LO_SUM;
> +  info->reg = XEXP (x, 0);
> +  info->offset = XEXP (x, 1);
> +  /* We have to trust the creator of the LO_SUM to do something vaguely
> +    sane.  Target-independent code that creates a LO_SUM should also
> +    create and verify the matching HIGH.  Target-independent code that
> +    adds an offset to a LO_SUM must prove that the offset will not
> +    induce a carry.  Failure to do either of these things would be
> +    a bug, and we are not required to check for it here.  The MIPS

Don't copy MIPS code.

> +static bool loongarch_split_move_insn_p (rtx dest, rtx src);

Nit: "loongarch_split_move_insn_p" shall start a new line.


[PATCH V1] HIGH part of symbol ref is invalid for constant pool

2022-07-19 Thread Jiufu Guo via Gcc-patches
Hi,

In patch https://gcc.gnu.org/pipermail/gcc-patches/2022-July/597712.html,
test case was not added.  After more check, a testcase is added for it.

The high part of the symbol address is invalid for the constant pool.
In function rs6000_cannot_force_const_mem, we already return true for
"HIGH with UNSPEC" rtx.  Below are some examples also indicate the high
part of a symbol_ref:
(high:DI (const:DI (plus:DI (symbol_ref:DI ("xx") (const_int 12 [0xc])
(high:DI (symbol_ref:DI ("var_1")..)))

This patch updates rs6000_cannot_force_const_mem to return true for
rtx with HIGH code.

Bootstrapped and regtested on ppc64le and ppc64.
Is it ok for trunk?

BR,
Jeff(Jiufu)


gcc/ChangeLog:

* config/rs6000/rs6000.cc (rs6000_cannot_force_const_mem):
Return true for HIGH code rtx.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/constpoolcheck.c: New test.

---
 gcc/config/rs6000/rs6000.cc   |  7 +--
 gcc/testsuite/gcc.target/powerpc/constpoolcheck.c | 11 +++
 2 files changed, 16 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/constpoolcheck.c

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index 0af2085adc0..d56832ebbfc 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -9704,8 +9704,11 @@ rs6000_init_stack_protect_guard (void)
 static bool
 rs6000_cannot_force_const_mem (machine_mode mode ATTRIBUTE_UNUSED, rtx x)
 {
-  if (GET_CODE (x) == HIGH
-  && GET_CODE (XEXP (x, 0)) == UNSPEC)
+  /* High part of a symbol ref/address can not be put into constant pool. e.g.
+ (high:DI (symbol_ref:DI ("var")..)) or
+ (high:DI (unspec:DI [(symbol_ref/u:DI ("*.LC0")..)
+ (high:DI (const:DI (plus:DI (symbol_ref:DI ("xx")) (const_int 12.  */
+  if (GET_CODE (x) == HIGH)
 return true;
 
   /* A TLS symbol in the TOC cannot contain a sum.  */
diff --git a/gcc/testsuite/gcc.target/powerpc/constpoolcheck.c 
b/gcc/testsuite/gcc.target/powerpc/constpoolcheck.c
new file mode 100644
index 000..ed7a994827b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/constpoolcheck.c
@@ -0,0 +1,11 @@
+/* { dg-do compile { target powerpc*-*-* } } */
+/* { dg-options "-O1 -mdejagnu-cpu=power10" } */
+/* (high:DI (symbol_ref:DI ("var_48")..))) should not cause ICE. */
+extern short var_48;
+void
+foo (double *r)
+{
+  if (var_48)
+*r = 1234.5678;
+}
+
-- 
2.17.1



[PATCH 1/2] Remove unused remove_node_from_expr_list

2022-07-19 Thread Alexander Monakov via Gcc-patches
This function remains unused since remove_node_from_insn_list was cloned
from it.

gcc/ChangeLog:

* rtl.h (remove_node_from_expr_list): Remove declaration.
* rtlanal.cc (remove_node_from_expr_list): Remove (no uses).
---
 gcc/rtl.h  |  1 -
 gcc/rtlanal.cc | 29 -
 2 files changed, 30 deletions(-)

diff --git a/gcc/rtl.h b/gcc/rtl.h
index 488016bb4..645c009a3 100644
--- a/gcc/rtl.h
+++ b/gcc/rtl.h
@@ -3712,7 +3712,6 @@ extern unsigned hash_rtx_cb (const_rtx, machine_mode, int 
*, int *,
 extern rtx regno_use_in (unsigned int, rtx);
 extern int auto_inc_p (const_rtx);
 extern bool in_insn_list_p (const rtx_insn_list *, const rtx_insn *);
-extern void remove_node_from_expr_list (const_rtx, rtx_expr_list **);
 extern void remove_node_from_insn_list (const rtx_insn *, rtx_insn_list **);
 extern int loc_mentioned_in_p (rtx *, const_rtx);
 extern rtx_insn *find_first_parameter_load (rtx_insn *, rtx_insn *);
diff --git a/gcc/rtlanal.cc b/gcc/rtlanal.cc
index d78cc6024..ec95ecd6c 100644
--- a/gcc/rtlanal.cc
+++ b/gcc/rtlanal.cc
@@ -2878,35 +2878,6 @@ in_insn_list_p (const rtx_insn_list *listp, const 
rtx_insn *node)
   return false;
 }
 
-/* Search LISTP (an EXPR_LIST) for an entry whose first operand is NODE and
-   remove that entry from the list if it is found.
-
-   A simple equality test is used to determine if NODE matches.  */
-
-void
-remove_node_from_expr_list (const_rtx node, rtx_expr_list **listp)
-{
-  rtx_expr_list *temp = *listp;
-  rtx_expr_list *prev = NULL;
-
-  while (temp)
-{
-  if (node == temp->element ())
-   {
- /* Splice the node out of the list.  */
- if (prev)
-   XEXP (prev, 1) = temp->next ();
- else
-   *listp = temp->next ();
-
- return;
-   }
-
-  prev = temp;
-  temp = temp->next ();
-}
-}
-
 /* Search LISTP (an INSN_LIST) for an entry whose first operand is NODE and
remove that entry from the list if it is found.
 
-- 
2.35.1



[PATCH 2/2] Avoid registering __builtin_setjmp_receiver label twice [PR101347]

2022-07-19 Thread Alexander Monakov via Gcc-patches
The testcase in the PR demonstrates how it is possible for one
__builtin_setjmp_receiver label to appear in
nonlocal_goto_handler_labels list twice (after the block with
__builtin_setjmp_setup referring to it was duplicated).

remove_node_from_insn_list did not account for this possibility and
removed only the first copy from the list. Add an assert verifying that
duplicates are not present.

To avoid adding a label to the list twice, move registration of the
label from __builtin_setjmp_setup handling to __builtin_setjmp_receiver.

gcc/ChangeLog:

PR rtl-optimization/101347
* builtins.cc (expand_builtin) [BUILT_IN_SETJMP_SETUP]: Move
population of nonlocal_goto_handler_labels from here ...
(expand_builtin) [BUILT_IN_SETJMP_RECEIVER]: ... to here.
* rtlanal.cc (remove_node_from_insn_list): Verify that a
duplicate is not present in the remainder of the list.
---
 gcc/builtins.cc | 15 +++
 gcc/rtlanal.cc  |  1 +
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/gcc/builtins.cc b/gcc/builtins.cc
index e6816d5c8..12a688dd8 100644
--- a/gcc/builtins.cc
+++ b/gcc/builtins.cc
@@ -7467,15 +7467,7 @@ expand_builtin (tree exp, rtx target, rtx subtarget, 
machine_mode mode,
  tree label = TREE_OPERAND (CALL_EXPR_ARG (exp, 1), 0);
  rtx_insn *label_r = label_rtx (label);
 
- /* This is copied from the handling of non-local gotos.  */
  expand_builtin_setjmp_setup (buf_addr, label_r);
- nonlocal_goto_handler_labels
-   = gen_rtx_INSN_LIST (VOIDmode, label_r,
-nonlocal_goto_handler_labels);
- /* ??? Do not let expand_label treat us as such since we would
-not want to be both on the list of non-local labels and on
-the list of forced labels.  */
- FORCED_LABEL (label) = 0;
  return const0_rtx;
}
   break;
@@ -7488,6 +7480,13 @@ expand_builtin (tree exp, rtx target, rtx subtarget, 
machine_mode mode,
  rtx_insn *label_r = label_rtx (label);
 
  expand_builtin_setjmp_receiver (label_r);
+ nonlocal_goto_handler_labels
+   = gen_rtx_INSN_LIST (VOIDmode, label_r,
+nonlocal_goto_handler_labels);
+ /* ??? Do not let expand_label treat us as such since we would
+not want to be both on the list of non-local labels and on
+the list of forced labels.  */
+ FORCED_LABEL (label) = 0;
  return const0_rtx;
}
   break;
diff --git a/gcc/rtlanal.cc b/gcc/rtlanal.cc
index ec95ecd6c..56da7435a 100644
--- a/gcc/rtlanal.cc
+++ b/gcc/rtlanal.cc
@@ -2899,6 +2899,7 @@ remove_node_from_insn_list (const rtx_insn *node, 
rtx_insn_list **listp)
  else
*listp = temp->next ();
 
+ gcc_checking_assert (!in_insn_list_p (temp->next (), node));
  return;
}
 
-- 
2.35.1



[PATCH] Adding three new function attributes for static analysis of file descriptors

2022-07-19 Thread Immad Mir via Gcc-patches
This patch adds three new function attributes to GCC that
are used for static analysis of usage of file descriptors:

1) __attribute__ ((fd_arg(N))): The attributes may be applied to a function that
takes on open file descriptor at refrenced argument N.

It indicates that the passed filedescriptor must not have been closed.
Therefore, when the analyzer is enabled with -fanalyzer, the
analyzer may emit a -Wanalyzer-fd-use-after-close diagnostic
if it detects a code path in which a function with this attribute is
called with a closed file descriptor.

The attribute also indicates that the file descriptor must have been checked for
validity before usage. Therefore, analyzer may emit
-Wanalyzer-fd-use-without-check diagnostic if it detects a code path in
which a function with this attribute is called with a file descriptor that has
not been checked for validity.

2) __attribute__((fd_arg_read(N))): The attribute is identical to
fd_arg, but with the additional requirement that it might read from
the file descriptor, and thus, the file descriptor must not have been opened
as write-only.

The analyzer may emit a -Wanalyzer-access-mode-mismatch
diagnostic if it detects a code path in which a function with this
attribute is called on a file descriptor opened with O_WRONLY.

3) __attribute__((fd_arg_write(N))): The attribute is identical to fd_arg_read
except that the analyzer may emit a -Wanalyzer-access-mode-mismatch diagnostic 
if
it detects a code path in which a function with this attribute is called on a
file descriptor opened with O_RDONLY.

gcc/analyzer/ChangeLog:
* sm-fd.cc (fd_param_diagnostic): New diagnostic class.
(fd_access_mode_mismatch): Change inheritance from fd_diagnostic
to fd_param_diagnostic. Add new overloaded constructor.
(fd_use_after_close): Likewise.
(unchecked_use_of_fd): Likewise and also change name to 
fd_use_without_check.
(double_close): Change name to fd_double_close.
(enum access_directions): New.
(fd_state_machine::on_stmt): Handle calls to function with the
new three function attributes.
(fd_state_machine::check_for_fd_attrs): New.
(fd_state_machine::on_open): Use the new overloaded constructors
of diagnostic classes.

gcc/c-family/ChangeLog:
* c-attribs.cc: (c_common_attribute_table): add three new attributes
namely: fd_arg, fd_arg_read and fd_arg_write.
(handle_fd_arg_attribute): New.

gcc/ChangeLog:
* doc/extend.texi: Add fd_arg, fd_arg_read and fd_arg_write under
"Common Function Attributes" section.

gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/fd-5.c: New test.
* c-c++-common/attr-fd.c: New test.

Signed-off-by: Immad Mir 
---
 gcc/analyzer/sm-fd.cc| 323 +--
 gcc/c-family/c-attribs.cc|  36 ++-
 gcc/doc/extend.texi  |  34 +++
 gcc/testsuite/c-c++-common/attr-fd.c |  18 ++
 gcc/testsuite/gcc.dg/analyzer/fd-5.c |  53 +
 5 files changed, 398 insertions(+), 66 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/attr-fd.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/fd-5.c

diff --git a/gcc/analyzer/sm-fd.cc b/gcc/analyzer/sm-fd.cc
index 8e4300b06e2..c577a7db7ea 100644
--- a/gcc/analyzer/sm-fd.cc
+++ b/gcc/analyzer/sm-fd.cc
@@ -39,10 +39,13 @@ along with GCC; see the file COPYING3.  If not see
 #include "analyzer/analyzer-selftests.h"
 #include "tristate.h"
 #include "selftest.h"
+#include "stringpool.h"
+#include "attribs.h"
 #include "analyzer/call-string.h"
 #include "analyzer/program-point.h"
 #include "analyzer/store.h"
 #include "analyzer/region-model.h"
+#include "bitmap.h"
 
 #if ENABLE_ANALYZER
 
@@ -59,6 +62,13 @@ enum access_mode
   WRITE_ONLY
 };
 
+enum access_directions
+{
+  DIRS_READ_WRITE,
+  DIRS_READ,
+  DIRS_WRITE
+};
+
 class fd_state_machine : public state_machine
 {
 public:
@@ -146,7 +156,7 @@ private:
   void check_for_open_fd (sm_context *sm_ctxt, const supernode *node,
   const gimple *stmt, const gcall *call,
   const tree callee_fndecl,
-  enum access_direction access_fn) const;
+  enum access_directions access_fn) const;
 
   void make_valid_transitions_on_condition (sm_context *sm_ctxt,
 const supernode *node,
@@ -156,12 +166,17 @@ private:
   const supernode *node,
   const gimple *stmt,
   const svalue *lhs) const;
+  void check_for_fd_attrs (sm_context *sm_ctxt, const supernode *node,
+   const gimple *stmt, const tree callee_fndecl,
+   const char *attr_name,
+   access_directions fd_attr_access_dir) const;
+
 };
 
 /* Base diagnostic class relative to fd_state_machine. */
 cla

Re: [PATCH] Complete __gnu_debug::basic_string Standard conformity

2022-07-19 Thread François Dumont via Gcc-patches

On 19/07/22 12:42, Jonathan Wakely wrote:

On Sun, 10 Jul 2022 at 14:57, François Dumont via Libstdc++
 wrote:

Here is a first patch to complete __gnu_debug::basic_string Standard
conformity.

I prefer to submit this before checking for more missing stuff to check
that my proposal of having a testsuite_string.h header is ok.

I think this change means some testcases will never test std::string
with _GLIBCXX_DEBUG defined, because if that is defined they test
__gnu_debug::string instead. That means assertions in
std::basic_string like this one will not get tested:

   template
 _GLIBCXX_STRING_CONSTEXPR
 typename basic_string<_CharT, _Traits, _Alloc>::size_type
 basic_string<_CharT, _Traits, _Alloc>::
 find(const _CharT* __s, size_type __pos, size_type __n) const
 _GLIBCXX_NOEXCEPT
 {
   __glibcxx_requires_string_len(__s, __n);

Are we OK with never testing those assertions?



Currently they don't get tested by default because of the extern
template declarations for std::string, so they would only be tested
with -std=c++20 -D_GLIBCXX_DEBUG -D_GLIBCXX_DEBUG_PEDANTIC, which I
don't test routinely. So I suppose this change doesn't make things any
worse in practice.
Good point, I'll try to activate those asserts with just 
_GLIBCXX_ASSERTIONS so that we will still be able to test despite my change.

I also noticed some problems with _GLIBCXX_DEBUG_PEDANTIC.

  libstdc++: Complete __gnu_debug::string Standard conformity

  Add testsuite/testsuite_string.h header to help testing
__gnu_debug::basic_string like
  std::basic_string depending on _GLIBCXX_DEBUG.

  Add using of base type methods in __gnu_debug::basic_string to make
use of the method
  overloads when there is no debug version.

  Fix _GLIBCXX_DEBUG_PEDANTIC assertions in . This
header has to be used directly
  like __gnu_debug::string, it is not included by _GLIBCXX_DEBUG. It
means that
  _GLIBCXX_DEBUG_PEDANTIC is not considered to define
__glibcxx_check_string and
  __glibcxx_check_string_len which are then empty macros. Now those
macros are defined
  directly in  and properly consider
_GLIBCXX_DEBUG_PEDANTIC.

Nice catch.


Yes, I forgot to signal that before this patch some tests were XFAIL in 
segfault rather than with the proper _GLIBCXX_DEBUG_PEDANTIC assertion.





OK for trunk, thanks.


Ok, committed.




Re: [PATCH] Adding three new function attributes for static analysis of file descriptors

2022-07-19 Thread David Malcolm via Gcc-patches
On Tue, 2022-07-19 at 21:36 +0530, Immad Mir wrote:


[...snip...]

Thanks for the patch.

It's nearly ready for trunk; I have some review comments below, mostly
nits, but a couple of other issues...

> gcc/ChangeLog:
>   * doc/extend.texi: Add fd_arg, fd_arg_read and fd_arg_write under
>   "Common Function Attributes" section.

As well as these additions, please can you also update doc/invoke.texi.
Specifically, please update the description of the three warnings that
are affected by these attributes so that they refer to the new
attributes, rather than just to "read" and "write", so that as well as
the docs for the attributes referring to the warnings, the docs for the
warnings refer to the attributes.

> gcc/testsuite/ChangeLog:
>   * gcc.dg/analyzer/fd-5.c: New test.
>   * c-c++-common/attr-fd.c: New test.
> 
> Signed-off-by: Immad Mir 

[...snip...]

>  /* Base diagnostic class relative to fd_state_machine. */
>  class fd_diagnostic : public pending_diagnostic
>  {
> -public:
> +public:

There's what looks like an accidental change here, adding a couple of
stray spaces after the trailing colon (I think); please fix, to avoid
adding noise to the git history.

[...snip...]

> +  void
> +  inform_filedescriptor_attribute (access_directions fd_dir)
> +  {
> +
> +if (m_attr)
> +  switch (fd_dir)
> +{
> +case DIRS_READ_WRITE:
> +  inform (DECL_SOURCE_LOCATION (m_callee_fndecl),
> +  "argument %d of %qD must be an open file descriptor",
> +  m_arg_idx + 1, m_callee_fndecl);
> +  break;
> +case DIRS_WRITE:
> +  inform (DECL_SOURCE_LOCATION (m_callee_fndecl),
> +  "argument %d of %qD must be a read-only file descriptor",
> +  m_arg_idx + 1, m_callee_fndecl);
> +  break;
> +case DIRS_READ:
> +  inform (DECL_SOURCE_LOCATION (m_callee_fndecl),
> +  "argument %d of %qD must be a write-only file descriptor",
> +  m_arg_idx + 1, m_callee_fndecl);
> +  break;
> +}
> +  }

I don't like the wording of the direction-based messages; if I'm
following the code correctly, it's not so much that the argument must
be, say, a read-only file descriptor, as that the argument must be a
*readable* file descriptor.

For example in fd-5.c test_2 you have:

void g (int fd) __attribute__((fd_arg_read(1))); /* { dg-message "argument 1 of 
'g' must be a read-only file descriptor" } */
void
test_2 (const char *path)
{
  int fd = open (path, O_WRONLY);
  if (fd != -1)
  {
g (fd); /* { dg-warning "'g' on 'write-only' file descriptor 'fd'" } */
  }
  close (fd);
}

so presumably with the patch as posted it emits:

  warning: 'g' on 'write-only' file descriptor 'fd'
  note: argument 1 of 'g' must be a read-only file descriptor

whereas I think we really mean:

  warning: 'g' on write-only file descriptor 'fd'
  note: argument 1 of 'g' must be a readable file descriptor

Also, it will be easier for the user to understand why these warnings
are appearing if they refer to the attribute by name.

So please add something like:

  "due to %<__attribute__((fd_arg(%d)))%>",
  m_arg_idx + 1

to the format strings and arguments.

So the example above might read:

  warning: 'g' on write-only file descriptor 'fd'
  note: argument 1 of 'g' must be a readable file descriptor, due to 
'__attribute__((fd_arg_read(1)))'


[...snip...]

> @@ -317,29 +398,25 @@ public:
>bool
>emit (rich_location *rich_loc) final override
>{
> +bool warned;
>  switch (m_fd_dir)
>{
> -  case DIR_READ:
> -return warning_at (rich_loc, get_controlling_option (),
> +  case DIRS_READ:
> +warned =  warning_at (rich_loc, get_controlling_option (),
> "%qE on % file descriptor %qE",

I hadn't noticed this before, but read-only shouldn't be in quotes in
this message.

> m_callee_fndecl, m_arg);
> -  case DIR_WRITE:
> -return warning_at (rich_loc, get_controlling_option (),
> +break;
> +  case DIRS_WRITE:
> +warned = warning_at (rich_loc, get_controlling_option (),
> "%qE on % file descriptor %qE",

Likewise.

> m_callee_fndecl, m_arg);
> +break;
>default:
>  gcc_unreachable ();
>}
> -  }
> -
> -  bool
> -  subclass_equal_p (const pending_diagnostic &base_other) const override
> -  {
> -const fd_access_mode_mismatch &sub_other
> -= (const fd_access_mode_mismatch &)base_other;
> -return (same_tree_p (m_arg, sub_other.m_arg)
> -&& m_callee_fndecl == sub_other.m_callee_fndecl
> -&& m_fd_dir == sub_other.m_fd_dir);
> +  if (warned)
> +inform_filedescriptor_attribute (m_fd_dir);
> +  return warned;
>}
>  
>label_text
> @@ -347,10 +424,10 @@ public:
>{
>  switch (m_fd_dir)
>{
> -  

[PATCH v2.1 3/4] aarch64: Consolidate simd type lookup functions

2022-07-19 Thread Andrew Carlotti via Gcc-patches
On Wed, Jul 13, 2022 at 05:36:04PM +0100, Richard Sandiford wrote:
> I like the part about getting rid of:
> 
> static tree
> aarch64_simd_builtin_type (machine_mode mode,
>  bool unsigned_p, bool poly_p)
> 
> and the flow of the new function.  However, I think it's still
> slightly more readable if we keep the switch and lookup routines
> separate, partly to keep down the size of the main routine and
> partly to avoid the goto.

I agree.

> So how about:
> 
> - aarch64_simd_builtin_std_type becomes aarch64_int_or_fp_element_type
>   but otherwise stays as-is
>
> ...

I've called it aarch64_int_or_fp_type, because it's sometimes used for
an operand that doesn't represent an element of a vector.


Updated patch below.

---

There were several similarly-named functions, which each built or looked up an
operand type using a different subset of valid modes or qualifiers.

This change provides a single function to return operand types, which can
additionally handle const and pointer qualifiers. For clarity, the existing
functionality is kept in separate helper functions.

gcc/ChangeLog:

* config/aarch64/aarch64-builtins.cc
(aarch64_simd_builtin_std_type): Rename to...
(aarch64_int_or_fp_type): ...this, and allow irrelevant qualifiers.
(aarch64_lookup_simd_builtin_type): Rename to...
(aarch64_simd_builtin_type): ...this. Add const/pointer
support, and extract table lookup to...
(aarch64_lookup_simd_type_in_table): ...this function.
(aarch64_init_crc32_builtins): Update to use aarch64_simd_builtin_type.
(aarch64_init_fcmla_laneq_builtins): Ditto.
(aarch64_init_simd_builtin_functions): Ditto.

---


diff --git a/gcc/config/aarch64/aarch64-builtins.cc 
b/gcc/config/aarch64/aarch64-builtins.cc
index 
55ad2e8b6831d6cc2b039270c8656d429347092d..cd7c2a79d9b4d67adf1d9de1f9b56eb3a0d1ee2b
 100644
--- a/gcc/config/aarch64/aarch64-builtins.cc
+++ b/gcc/config/aarch64/aarch64-builtins.cc
@@ -788,12 +788,13 @@ aarch64_general_mangle_builtin_type (const_tree type)
   return NULL;
 }
 
+/* Helper function for aarch64_simd_builtin_type. */
 static tree
-aarch64_simd_builtin_std_type (machine_mode mode,
-  enum aarch64_type_qualifiers q)
+aarch64_int_or_fp_type (machine_mode mode,
+  enum aarch64_type_qualifiers qualifiers)
 {
-#define QUAL_TYPE(M)  \
-  ((q == qualifier_none) ? int##M##_type_node : unsigned_int##M##_type_node);
+#define QUAL_TYPE(M) ((qualifiers & qualifier_unsigned) \
+  ? unsigned_int##M##_type_node : int##M##_type_node);
   switch (mode)
 {
 case E_QImode:
@@ -826,16 +827,14 @@ aarch64_simd_builtin_std_type (machine_mode mode,
 #undef QUAL_TYPE
 }
 
+/* Helper function for aarch64_simd_builtin_type. */
 static tree
-aarch64_lookup_simd_builtin_type (machine_mode mode,
- enum aarch64_type_qualifiers q)
+aarch64_lookup_simd_type_in_table (machine_mode mode,
+  enum aarch64_type_qualifiers qualifiers)
 {
   int i;
   int nelts = sizeof (aarch64_simd_types) / sizeof (aarch64_simd_types[0]);
-
-  /* Non-poly scalar modes map to standard types not in the table.  */
-  if (q != qualifier_poly && !VECTOR_MODE_P (mode))
-return aarch64_simd_builtin_std_type (mode, q);
+  int q = qualifiers & (qualifier_poly | qualifier_unsigned);
 
   for (i = 0; i < nelts; i++)
 {
@@ -852,16 +851,32 @@ aarch64_lookup_simd_builtin_type (machine_mode mode,
   return NULL_TREE;
 }
 
+/* Return a type for an operand with specified mode and qualifiers. */
 static tree
 aarch64_simd_builtin_type (machine_mode mode,
-  bool unsigned_p, bool poly_p)
+  enum aarch64_type_qualifiers qualifiers)
 {
-  if (poly_p)
-return aarch64_lookup_simd_builtin_type (mode, qualifier_poly);
-  else if (unsigned_p)
-return aarch64_lookup_simd_builtin_type (mode, qualifier_unsigned);
+  tree type = NULL_TREE;
+
+  /* For pointers, we want a pointer to the basic type of the vector.  */
+  if ((qualifiers & qualifier_pointer) && VECTOR_MODE_P (mode))
+mode = GET_MODE_INNER (mode);
+
+  /* Non-poly scalar modes map to standard types not in the table.  */
+  if ((qualifiers & qualifier_poly) || VECTOR_MODE_P (mode))
+type = aarch64_lookup_simd_type_in_table (mode, qualifiers);
   else
-return aarch64_lookup_simd_builtin_type (mode, qualifier_none);
+type = aarch64_int_or_fp_type (mode, qualifiers);
+
+  gcc_assert (type != NULL_TREE);
+
+  /* Add qualifiers.  */
+  if (qualifiers & qualifier_const)
+type = build_qualified_type (type, TYPE_QUAL_CONST);
+  if (qualifiers & qualifier_pointer)
+type = build_pointer_type (type);
+
+  return type;
 }
  
 static void
@@ -1110,12 +1125,11 @@ aarch64_init_fcmla_laneq_builtins (void)
 {
   aarch64_fcmla_laneq_builtin_datum* d
= &aarch64_fcmla_lane_builtin_data[i];
-  tre

Re: [PATCH] c++: shortcut bad reference bindings [PR94894]

2022-07-19 Thread Patrick Palka via Gcc-patches
On Mon, 18 Jul 2022, Jason Merrill wrote:

> On 7/18/22 12:59, Patrick Palka wrote:
> > In case of l/rvalue or cv-qual mismatch during reference binding, we try
> > to give more helpful diagnostics by attempting a bad conversion that
> > ignores the mismatch.  But in doing so, we may end up instantiating an
> > ill-formed conversion function, something that would otherwise be
> > avoided if we didn't try for the better diagnostic in the first place.
> > We could just give up on producing a good diagnostic in this case, but
> > ideally we could keep the good diagnostics while avoiding unnecessary
> > template instantiations.
> > 
> > To that end, this patch adapts the bad conversion shortcutting mechanism
> > from r12-3346-g47543e5f9d1fc5 to handle this situation as well.  The main
> > observation from there applies here as well: when there's a strictly
> > viable candidate, we don't care about the distinction between an unviable
> > and non-strictly viable candidate.  And in turn it also means we don't
> > really care about the distinction between an invalid and bad conversion.
> > Of course, we don't know whether there's a strictly viable candidate until
> > after the fact, so we still need to keep track of when we "shortcutted"
> > distinguishing between an invalid and bad conversion.  This patch adds a
> > new kind of conversion, ck_shortcutted, to represent such conversions.
> > 
> > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> > trunk?
> > 
> > PR c++/94894
> > PR c++/105766
> > PR c++/106201
> > 
> > gcc/cp/ChangeLog:
> > 
> > * call.cc (enum conversion_kind): Add ck_shortcutted enumerator.
> > (has_next): Return false for it.
> > (reference_binding): Return a ck_shortcutted conversion instead
> > of an actual bad conversion when LOOKUP_SHORTCUT_BAD_CONVS is set.
> > Remove now obsolete early exit for the incomplete target type case.
> > (implicit_conversion_1): Don't mask out LOOKUP_SHORTCUT_BAD_CONVS.
> > (add_function_candidate): Set LOOKUP_SHORTCUT_BAD_CONVS iff
> > shortcut_bad_convs.
> > (missing_conversion_p): Return true for a ck_shortcutted
> > conversion.
> > * cp-tree.h (LOOKUP_SHORTCUT_BAD_CONVS): Define.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * g++.dg/conversion/ref8.C: New test.
> > * g++.dg/conversion/ref9.C: New test.
> > ---
> >   gcc/cp/call.cc | 87 --
> >   gcc/cp/cp-tree.h   |  5 ++
> >   gcc/testsuite/g++.dg/conversion/ref8.C | 22 +++
> >   gcc/testsuite/g++.dg/conversion/ref9.C | 21 +++
> >   4 files changed, 103 insertions(+), 32 deletions(-)
> >   create mode 100644 gcc/testsuite/g++.dg/conversion/ref8.C
> >   create mode 100644 gcc/testsuite/g++.dg/conversion/ref9.C
> > 
> > diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
> > index fc98552fda2..ca2f5fbca39 100644
> > --- a/gcc/cp/call.cc
> > +++ b/gcc/cp/call.cc
> > @@ -59,7 +59,8 @@ enum conversion_kind {
> > ck_ambig,
> > ck_list,
> > ck_aggr,
> > -  ck_rvalue
> > +  ck_rvalue,
> > +  ck_shortcutted,
> 
> Please give this a comment explaining how it's used.  OK with that change.

Done.

> 
> Maybe ck_deferred_bad?

Ah nice, I like this much better than ck_shortcutted.  Here's what I
ended up pushing, thanks a lot:

-- >8 --

Subject: [PATCH] c++: shortcut bad reference binding [PR94894]

In case of l/rvalue or cv-qual mismatch during reference binding, we
try to give more helpful diagnostics by computing a bad conversion that
allows the mismatch.  But in doing so, we may end up considering and
instantiating a conversion function that could induce a hard error and
in turn cause us to reject otherwise valid code.  We could just give up
on producing a better diagnostic here, but ideally we'd preserve the
better diagnostics for invalid code while avoiding unnecessary template
instantiations for valid code.

To that end, this patch adapts the bad conversion shortcutting mechanism
from r12-3346-g47543e5f9d1fc5 to additionally handle this situation.
The main observation from there is that during overload resolution, if we
know we have a strictly viable candidate then we don't need to distinguish
between an unviable and non-strictly viable candidate.  Thus we don't
need to distinguish between an invalid and bad conversion either, which
is what this patch exploits.  Of course, we don't know whether we have a
strictly viable candidate until after the fact, so we still need to
remember when we deferred distinguishing between an invalid and bad
conversion.  This patch adds a special conversion kind ck_deferred_bad
for this purpose.

PR c++/94894
PR c++/105766
PR c++/106201

gcc/cp/ChangeLog:

* call.cc (enum conversion_kind): Add ck_deferred_bad enumerator.
(has_next): Return false for it.
(reference_binding): Return a ck_deferred_bad conversion instead
of an actual bad conversion when LOOKUP_SHORTCUT

Re: [PATCH] Fortran: error recovery on invalid array reference of non-array [PR103590]

2022-07-19 Thread Harald Anlauf via Gcc-patches

Hi Mikael,

Am 19.07.22 um 11:03 schrieb Mikael Morin:

Hello,

the principle looks good, but...

Le 18/07/2022 à 22:43, Harald Anlauf via Fortran a écrit :


diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc
index 2ebf076f730..dacd33561d0 100644
--- a/gcc/fortran/resolve.cc
+++ b/gcc/fortran/resolve.cc
@@ -5004,7 +5004,11 @@ find_array_spec (gfc_expr *e)
   {
   case REF_ARRAY:
 if (as == NULL)
-  gfc_internal_error ("find_array_spec(): Missing spec");
+  {
+    gfc_error ("Symbol %qs at %L has not been declared as an array",
+   e->symtree->n.sym->name, &e->where);


... the error here only makes sense if the array reference follows a
variable reference.  If it follows a derived type component reference, a
slightly different error message would be more appropriate.


how detailed or tailored should the error message be, or can
we just have a more generic message, like "Name at %L ...",
or "Invalid subscript reference at %L"?  We seem to not hit
that internal error very often...

I have played only little with invalid code in the present context,
but often hit another code path that shows up in associate_54.f90
and gives

Error: Associate-name 'state' at (1) is used as array

For the testcase in the PR, Intel says:

associate_59.f90(7): error #6410: This name has not been declared as an
array or a function.   [A]
print *, [character(a(1)) :: '1'] ! { dg-error "Scalar INTEGER
expression" }
^

Crayftn 14.0 says:

  Improper ir tree in expr_semantics.

print *, [character(a(1)) :: '1'] ! { dg-error "Scalar INTEGER
expression" }
 ^

ftn-873 crayftn: ERROR P, File = associate_59.f90, Line = 7, Column = 26
  Invalid subscripted reference of a scalar ASSOCIATE name.


gfortran's behavior during error handling is difficult to understand.
While the proposed new error message is emitted for associate_54.f90,
it never makes it far enough for the testcase of the present PR
(associate_59.f90).

Thanks,
Harald


Re: [PATCH 2/3] tree-cfg: do not duplicate returns_twice calls

2022-07-19 Thread Alexander Monakov via Gcc-patches
On Tue, 19 Jul 2022, Richard Biener wrote:

> > Like below?
> 
> Yes.
> 
> Thanks and sorry for the back and forth - this _is_ a mightly
> complicated area ...

No problem!  This is the good, healthy kind of back-and-forth, and
I am grateful.

Pushed, including the tree-cfg validator enhancement in patch 3/3.

Alexander


Re: [PATCH] Adding three new function attributes for static analysis of file descriptors

2022-07-19 Thread David Malcolm via Gcc-patches
On Tue, 2022-07-19 at 14:18 -0400, David Malcolm wrote:
> On Tue, 2022-07-19 at 21:36 +0530, Immad Mir wrote:
> 

[...snip...]

> 
> 
> > +void
> > +fd_state_machine::check_for_fd_attrs (
> > +    sm_context *sm_ctxt, const supernode *node, const gimple
> > *stmt,
> > +    const tree callee_fndecl, const char *attr_name,
> > +    access_directions fd_attr_access_dir) const
> > +{


I had another idea about the patch: we have attr_name here, and thus I
think it's available when creating the various fd_param_diagnostic
instances, so why not convert the:

  bool m_attr;

in fd_param_diagnostic to:

  const char *m_attr_name; // can be NULL if no attribute involved

and then, when it's non-NULL we can use it directly in the inform
message, so that the attribute name we report is the one that the user
used.  I think that's clearer, both for us and for the end-user.

Dave



[PATCH, rs6000, v2] Additional cleanup of rs6000_builtin_mask

2022-07-19 Thread will schmidt via Gcc-patches
[PATCH, rs6000, v2] Additional cleanup of rs6000_builtin_mask

Hi,
  Post the rs6000 builtins rewrite, some of the leftover builtin
code is redundant and can be removed.
  This replaces the usage of bu_mask in rs6000_target_modify_macros
with checks against the rs6000_isa_flags equivalent directly.  Thusly
the bu_mask variable can be removed.  After this update there
are no other uses of rs6000_builtin_mask_calculate, so that function
can also be safely removed.

No functional change, though some output under debug has been removed.

[V2]
  Per patch review and subsequent investigations, the
rs6000_builtin_mask and x_rs6000_builtin_mask can also be removed, as
well as the entirety of the rs6000_builtin_mask_names table.

gcc/
* config/rs6000/rs6000-c.cc: Update comments.
(rs6000_target_modify_macros): Remove bu_mask references.
(rs6000_define_or_undefine_macro): Replace bu_mask reference
with a rs6000_cpu value check.
(rs6000_cpu_cpp_builtins): Remove rs6000_builtin_mask_calculate()
parameter from call to rs6000_target_modify_macros.
* config/rs6000/rs6000-protos.h (rs6000_target_modify_macros,
rs6000_target_modify_macros_ptr): Remove parameter from extern
for the prototype.
* config/rs6000/rs6000.cc (rs6000_target_modify_macros_ptr): Remove
parameter from prototype, update calls to this function.
(rs6000_print_builtin_options): Remove prototype, call and function.
(rs6000_builtin_mask_calculate): Remove function.
(rs6000_debug_reg_global): Remove call to rs6000_print_builtin_options.
(rs6000_option_override_internal): Remove rs6000_builtin_mask var
and builtin_mask debug output.
(rs6000_builtin_mask_names): Remove.
(rs6000_pragma_target_parse): Remove prev_bumask, cur_bumask,
diff_bumask references; Update calls to
rs6000_target_modify_ptr.
* config/rs6000/rs6000.opt (rs6000_builtin_mask): Remove.

diff --git a/gcc/config/rs6000/rs6000-c.cc b/gcc/config/rs6000/rs6000-c.cc
index 0d13645040ff..4d051b906582 100644
--- a/gcc/config/rs6000/rs6000-c.cc
+++ b/gcc/config/rs6000/rs6000-c.cc
@@ -333,24 +333,20 @@ rs6000_define_or_undefine_macro (bool define_p, const 
char *name)
   else
 cpp_undef (parse_in, name);
 }
 
 /* Define or undefine macros based on the current target.  If the user does
-   #pragma GCC target, we need to adjust the macros dynamically.  Note, some of
-   the options needed for builtins have been moved to separate variables, so
-   have both the target flags and the builtin flags as arguments.  */
+   #pragma GCC target, we need to adjust the macros dynamically.  */
 
 void
-rs6000_target_modify_macros (bool define_p, HOST_WIDE_INT flags,
-HOST_WIDE_INT bu_mask)
+rs6000_target_modify_macros (bool define_p, HOST_WIDE_INT flags)
 {
   if (TARGET_DEBUG_BUILTIN || TARGET_DEBUG_TARGET)
 fprintf (stderr,
-"rs6000_target_modify_macros (%s, " HOST_WIDE_INT_PRINT_HEX
-", " HOST_WIDE_INT_PRINT_HEX ")\n",
+"rs6000_target_modify_macros (%s, " HOST_WIDE_INT_PRINT_HEX ")\n",
 (define_p) ? "define" : "undef",
-flags, bu_mask);
+flags);
 
   /* Each of the flags mentioned below controls whether certain
  preprocessor macros will be automatically defined when
  preprocessing source files for compilation by this compiler.
  While most of these flags can be enabled or disabled
@@ -593,14 +589,12 @@ rs6000_target_modify_macros (bool define_p, HOST_WIDE_INT 
flags,
   /* OPTION_MASK_FLOAT128_HARDWARE can be turned on if -mcpu=power9 is used or
  via the target attribute/pragma.  */
   if ((flags & OPTION_MASK_FLOAT128_HW) != 0)
 rs6000_define_or_undefine_macro (define_p, "__FLOAT128_HARDWARE__");
 
-  /* options from the builtin masks.  */
-  /* Note that OPTION_MASK_FPRND is enabled only if
- (rs6000_cpu == PROCESSOR_CELL) (e.g. -mcpu=cell).  */
-  if ((bu_mask & OPTION_MASK_FPRND) != 0)
+  /* Tell the user if we are targeting CELL.  */
+  if (rs6000_cpu == PROCESSOR_CELL)
 rs6000_define_or_undefine_macro (define_p, "__PPU__");
 
   /* Tell the user if we support the MMA instructions.  */
   if ((flags & OPTION_MASK_MMA) != 0)
 rs6000_define_or_undefine_macro (define_p, "__MMA__");
@@ -614,12 +608,11 @@ rs6000_target_modify_macros (bool define_p, HOST_WIDE_INT 
flags,
 
 void
 rs6000_cpu_cpp_builtins (cpp_reader *pfile)
 {
   /* Define all of the common macros.  */
-  rs6000_target_modify_macros (true, rs6000_isa_flags,
-  rs6000_builtin_mask_calculate ());
+  rs6000_target_modify_macros (true, rs6000_isa_flags);
 
   if (TARGET_FRE)
 builtin_define ("__RECIP__");
   if (TARGET_FRES)
 builtin_define ("__RECIPF__");
diff --git a/gcc/config/rs6000/rs6000-protos.h 
b/gcc/config/rs6000/rs6000-protos.h
index 3ea010236090..b3c16e7448d8 100644
--- a/gcc/config/rs6000/rs6000-pr

[PATCH, rs6000, v2] Cleanup some vstrir define_expand naming inconsistencies

2022-07-19 Thread will schmidt via Gcc-patches
[PATCH, rs6000, v2] Cleanup some vstrir define_expand naming inconsistencies

Hi,
  This cleans up some of the naming around the vstrir and vstril
instruction definitions, with some cosmetic changes for consistency.
No functional changes.
Regtested just in case, no regressions.

[V2]
Used 'direct' instead of 'internal', and cosmetically reworked
the changelog.

OK for trunk?

Thanks,

gcc/
* config/rs6000/altivec.md:
(vstrir_code_): Rename to...
(vstrir_direct_): ... this.
(vstrir_p_code_): Rename to...
(vstrir_p_direct_): ... this.
(vstril_code_): Rename to...
(vstril_direct_): ... this.
(vstril_p_code_): Rename to...
(vstril_p_direct_): ... this.

diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index efc8ae35c2e7..2c4940f2e21c 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -884,44 +884,44 @@ (define_expand "vstrir_"
(unspec:VIshort [(match_operand:VIshort 1 "altivec_register_operand")]
UNSPEC_VSTRIR))]
   "TARGET_POWER10"
 {
   if (BYTES_BIG_ENDIAN)
-emit_insn (gen_vstrir_code_ (operands[0], operands[1]));
+emit_insn (gen_vstrir_direct_ (operands[0], operands[1]));
   else
-emit_insn (gen_vstril_code_ (operands[0], operands[1]));
+emit_insn (gen_vstril_direct_ (operands[0], operands[1]));
   DONE;
 })
 
-(define_insn "vstrir_code_"
+(define_insn "vstrir_direct_"
   [(set (match_operand:VIshort 0 "altivec_register_operand" "=v")
(unspec:VIshort
   [(match_operand:VIshort 1 "altivec_register_operand" "v")]
   UNSPEC_VSTRIR))]
   "TARGET_POWER10"
   "vstrir %0,%1"
   [(set_attr "type" "vecsimple")])
 
-;; This expands into same code as vstrir_ followed by condition logic
+;; This expands into same code as vstrir followed by condition logic
 ;; so that a single vstribr. or vstrihr. or vstribl. or vstrihl. instruction
 ;; can, for example, satisfy the needs of a vec_strir () function paired
 ;; with a vec_strir_p () function if both take the same incoming arguments.
 (define_expand "vstrir_p_"
   [(match_operand:SI 0 "gpc_reg_operand")
(match_operand:VIshort 1 "altivec_register_operand")]
   "TARGET_POWER10"
 {
   rtx scratch = gen_reg_rtx (mode);
   if (BYTES_BIG_ENDIAN)
-emit_insn (gen_vstrir_p_code_ (scratch, operands[1]));
+emit_insn (gen_vstrir_p_direct_ (scratch, operands[1]));
   else
-emit_insn (gen_vstril_p_code_ (scratch, operands[1]));
+emit_insn (gen_vstril_p_direct_ (scratch, operands[1]));
   emit_insn (gen_cr6_test_for_zero (operands[0]));
   DONE;
 })
 
-(define_insn "vstrir_p_code_"
+(define_insn "vstrir_p_direct_"
   [(set (match_operand:VIshort 0 "altivec_register_operand" "=v")
(unspec:VIshort
   [(match_operand:VIshort 1 "altivec_register_operand" "v")]
   UNSPEC_VSTRIR))
(set (reg:CC CR6_REGNO)
@@ -936,17 +936,17 @@ (define_expand "vstril_"
(unspec:VIshort [(match_operand:VIshort 1 "altivec_register_operand")]
UNSPEC_VSTRIR))]
   "TARGET_POWER10"
 {
   if (BYTES_BIG_ENDIAN)
-emit_insn (gen_vstril_code_ (operands[0], operands[1]));
+emit_insn (gen_vstril_direct_ (operands[0], operands[1]));
   else
-emit_insn (gen_vstrir_code_ (operands[0], operands[1]));
+emit_insn (gen_vstrir_direct_ (operands[0], operands[1]));
   DONE;
 })
 
-(define_insn "vstril_code_"
+(define_insn "vstril_direct_"
   [(set (match_operand:VIshort 0 "altivec_register_operand" "=v")
(unspec:VIshort
   [(match_operand:VIshort 1 "altivec_register_operand" "v")]
   UNSPEC_VSTRIL))]
   "TARGET_POWER10"
@@ -962,18 +962,18 @@ (define_expand "vstril_p_"
(match_operand:VIshort 1 "altivec_register_operand")]
   "TARGET_POWER10"
 {
   rtx scratch = gen_reg_rtx (mode);
   if (BYTES_BIG_ENDIAN)
-emit_insn (gen_vstril_p_code_ (scratch, operands[1]));
+emit_insn (gen_vstril_p_direct_ (scratch, operands[1]));
   else
-emit_insn (gen_vstrir_p_code_ (scratch, operands[1]));
+emit_insn (gen_vstrir_p_direct_ (scratch, operands[1]));
   emit_insn (gen_cr6_test_for_zero (operands[0]));
   DONE;
 })
 
-(define_insn "vstril_p_code_"
+(define_insn "vstril_p_direct_"
   [(set (match_operand:VIshort 0 "altivec_register_operand" "=v")
(unspec:VIshort
   [(match_operand:VIshort 1 "altivec_register_operand" "v")]
   UNSPEC_VSTRIL))
(set (reg:CC CR6_REGNO)



[PATCH] arm: add -static-pie support

2022-07-19 Thread Lance Fredrickson via Gcc-patches
This patch adds -static-pie support for the arm architecture. aarch64 
had the appropriate code for handling -static-pie, so this just mirrors 
the code found there.  Tested with uclibc-ng and musl c-standard 
libraries to produce static-pie binaries.From 56f0daba7bea5d64922c0f45a4fde360f39fb17e Mon Sep 17 00:00:00 2001
From: lancethepants 
Date: Tue, 19 Jul 2022 14:21:05 -0600
Subject: [PATCH] arm: add -static-pie support

The commit mirros code from aarch64 to handle -static-pie.
Tested with uclibc-ng and musl c-standard libraries.

Signed-off-by: Lance Fredrickson 
---
 gcc/config/arm/linux-elf.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/config/arm/linux-elf.h b/gcc/config/arm/linux-elf.h
index df3da67c4f0..70f71b051a3 100644
--- a/gcc/config/arm/linux-elf.h
+++ b/gcc/config/arm/linux-elf.h
@@ -66,9 +66,10 @@
%{static:-Bstatic} \
%{shared:-shared} \
%{symbolic:-Bsymbolic} \
-   %{!static: \
+   %{!static:%{!static-pie: \
  %{rdynamic:-export-dynamic} \
- %{!shared:-dynamic-linker " GNU_USER_DYNAMIC_LINKER "}} \
+ %{!shared:-dynamic-linker " GNU_USER_DYNAMIC_LINKER "}}} \
+   %{static-pie:-Bstatic -pie --no-dynamic-linker -z text} \
-X \
%{mbig-endian:-EB} %{mlittle-endian:-EL}" \
SUBTARGET_EXTRA_LINK_SPEC
-- 
2.20.1



Re: [PATCH] Fortran: error recovery on invalid array reference of non-array [PR103590]

2022-07-19 Thread Mikael Morin

Le 19/07/2022 à 21:09, Harald Anlauf via Fortran a écrit :

Hi Mikael,

Am 19.07.22 um 11:03 schrieb Mikael Morin:

Hello,

the principle looks good, but...

Le 18/07/2022 à 22:43, Harald Anlauf via Fortran a écrit :


diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc
index 2ebf076f730..dacd33561d0 100644
--- a/gcc/fortran/resolve.cc
+++ b/gcc/fortran/resolve.cc
@@ -5004,7 +5004,11 @@ find_array_spec (gfc_expr *e)
   {
   case REF_ARRAY:
 if (as == NULL)
-  gfc_internal_error ("find_array_spec(): Missing spec");
+  {
+    gfc_error ("Symbol %qs at %L has not been declared as an 
array",

+   e->symtree->n.sym->name, &e->where);


... the error here only makes sense if the array reference follows a 
variable reference.  If it follows a derived type component reference, 
a slightly different error message would be more appropriate.


how detailed or tailored should the error message be, or can
we just have a more generic message, like "Name at %L ...",
or "Invalid subscript reference at %L"?  We seem to not hit
that internal error very often...

It could be anything better than the (unhelpfull) internal error it is 
replacing.
I suggest for example "Invalid array reference of a non-array entity at 
%L".  Cray’s words, or Intel’s, or your other propositions work as well.




gfortran's behavior during error handling is difficult to understand.
While the proposed new error message is emitted for associate_54.f90,
it never makes it far enough for the testcase of the present PR
(associate_59.f90).

Indeed.  We try to match several types of statement one after the other, 
and each failed match can possibly register an error.  But it is emitted 
only if all have failed (see gfc_error_check).  There is no choice of 
the best error; the last one registered wins.


This buffering behavior is controlled by calls to gfc_buffer_error(...). 
 Error handling during resolution doesn’t need to delay error reporting 
as far as I know, and the calls to gfc_buffer_error (false) seem to 
correctly disable buffering at the end of every call to next_statement. 
 I don’t know why it remains active in this case.


Re: [PATCH] Fortran: error recovery on invalid array reference of non-array [PR103590]

2022-07-19 Thread Harald Anlauf via Gcc-patches

Hi Mikael,

Am 19.07.22 um 22:53 schrieb Mikael Morin:

It could be anything better than the (unhelpfull) internal error it is
replacing.
I suggest for example "Invalid array reference of a non-array entity at
%L".


yes, that's much better!  The attached updated patch uses this.

Committed: r13-1757-gf838d15641d256e21ffc126c3277b290ed743928



gfortran's behavior during error handling is difficult to understand.
While the proposed new error message is emitted for associate_54.f90,
it never makes it far enough for the testcase of the present PR
(associate_59.f90).


Indeed.  We try to match several types of statement one after the other,
and each failed match can possibly register an error.  But it is emitted
only if all have failed (see gfc_error_check).  There is no choice of
the best error; the last one registered wins.

This buffering behavior is controlled by calls to gfc_buffer_error(...).
  Error handling during resolution doesn’t need to delay error reporting
as far as I know, and the calls to gfc_buffer_error (false) seem to
correctly disable buffering at the end of every call to next_statement.
  I don’t know why it remains active in this case.



Alright, I should try to remember this and take a closer look next time
I get confused about not getting the error message I wanted...

Thanks,
Harald


[COMMITTED] [PATCH 1/2] Remove recursion from range_from_dom.

2022-07-19 Thread Andrew MacLeod via Gcc-patches

This patch removes the recursion from range_from_dom.

It was walking the dominators looking for a range, and if a block was 
encountered with outgoing ranges, a recursive call was made to resolve 
the incoming dominator range and then calculate incoming edges.


This meant the edges could all be resolved by simply calculating to the 
nearest dominator. This allowed range_from_dom to always get a decent 
answer.


Along the way, any "simple" blocks which had only a single incoming edge 
were pushed onto a stack to be quickly resolved by applying the 
dominators range instead of doing a full range-on-entry.


This patch extends that such that all interesting blocks are pushed onto 
the list, and after a range is found from a dominator, any such blocks 
are simply calculated at the end as they are opped off.  ITs just a 
better genralization of what we were doing and no longer requires a 
recursive call.


It also enables the enhancement from the follow on patch.

There should be no functional change, and its just a hair faster.  There 
will be less pressure on deep call stacks as a result as well.


Bootstrapped on x86_64-pc-linux-gnu with no regressions.  Pushed.

Andrew
From b0cc57cd76f511f29cab233654249817312ec2a6 Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Fri, 15 Jul 2022 09:35:29 -0400
Subject: [PATCH 1/2] Remove recursion from range_from_dom.

Avoid calling range_of_dom recursively by putting all nodes to be
calculated on the worklist, and figure out which kind they are
when removed from the list.

	* gimple-range-cache.cc (ranger_cache::resolve_dom): New.
	(ranger_cache::range_from_dom): Put all nodes to be calculated
	in the worklist and resolve after the dom walk.
	* gimple-range-cache.h (resolve_dom): New prototype.
---
 gcc/gimple-range-cache.cc | 84 ++-
 gcc/gimple-range-cache.h  |  1 +
 2 files changed, 48 insertions(+), 37 deletions(-)

diff --git a/gcc/gimple-range-cache.cc b/gcc/gimple-range-cache.cc
index da7b8055d42..20dd5ead3bc 100644
--- a/gcc/gimple-range-cache.cc
+++ b/gcc/gimple-range-cache.cc
@@ -1312,6 +1312,38 @@ ranger_cache::fill_block_cache (tree name, basic_block bb, basic_block def_bb)
 fprintf (dump_file, "  Propagation update done.\n");
 }
 
+// Resolve the range of BB if the dominators range is R by calculating incoming
+// edges to this block.  All lead back to the dominator so should be cheap.
+// The range for BB is set and returned in R.
+
+void
+ranger_cache::resolve_dom (vrange &r, tree name, basic_block bb)
+{
+  basic_block def_bb = gimple_bb (SSA_NAME_DEF_STMT (name));
+  basic_block dom_bb = get_immediate_dominator (CDI_DOMINATORS, bb);
+
+  // if it doesn't already have a value, store the incoming range.
+  if (!m_on_entry.bb_range_p (name, dom_bb) && def_bb != dom_bb)
+{
+  // If the range can't be store, don't try to accumulate
+  // the range in PREV_BB due to excessive recalculations.
+  if (!m_on_entry.set_bb_range (name, dom_bb, r))
+	return;
+}
+  // With the dominator set, we should be able to cheaply query
+  // each incoming edge now and accumulate the results.
+  r.set_undefined ();
+  edge e;
+  edge_iterator ei;
+  Value_Range er (TREE_TYPE (name));
+  FOR_EACH_EDGE (e, ei, bb->preds)
+{
+  edge_range (er, e, name, RFD_READ_ONLY);
+  r.union_ (er);
+}
+  // Set the cache in PREV_BB so it is not calculated again.
+  m_on_entry.set_bb_range (name, bb, r);
+}
 
 // Get the range of NAME from dominators of BB and return it in R.  Search the
 // dominator tree based on MODE.
@@ -1341,7 +1373,7 @@ ranger_cache::range_from_dom (vrange &r, tree name, basic_block start_bb,
   // Default value is global range.
   get_global_range (r, name);
 
-  // Search until a value is found, pushing outgoing edges encountered.
+  // Search until a value is found, pushing blocks which may need calculating.
   for (bb = get_immediate_dominator (CDI_DOMINATORS, start_bb);
bb;
prev_bb = bb, bb = get_immediate_dominator (CDI_DOMINATORS, bb))
@@ -1351,40 +1383,7 @@ ranger_cache::range_from_dom (vrange &r, tree name, basic_block start_bb,
 
   // This block has an outgoing range.
   if (m_gori.has_edge_range_p (name, bb))
-	{
-	  // Only outgoing ranges to single_pred blocks are dominated by
-	  // outgoing edge ranges, so those can be simply adjusted on the fly.
-	  edge e = find_edge (bb, prev_bb);
-	  if (e && single_pred_p (prev_bb))
-	m_workback.quick_push (prev_bb);
-	  else if (mode == RFD_FILL)
-	{
-	  // Multiple incoming edges, so recursively satisfy this block
-	  // if it doesn't already have a value, and store the range.
-	  if (!m_on_entry.bb_range_p (name, bb) && def_bb != bb)
-		{
-		  // If the dominator has not been set, look it up.
-		  range_from_dom (r, name, bb, RFD_FILL);
-		  // If the range can't be store, don't try to accumulate
-		  // the range in PREV_BB due to excessive recalculations.
-		  if (!m_on_entry.set_bb_range (n

[COMMITTED] [PATCH 2/2] Resolve complicated join nodes in range_from_dom.

2022-07-19 Thread Andrew MacLeod via Gcc-patches
The main time a dominator search in the cache fails to produce an 
accurate range is when we have a "complicated" join node.


ie

bb4:
   if (a > 200) goto bb5 ; else goto bb9
bb5
   if (b > 400) goto bb6 ; else goto bb7
bb6
   foo (b)
bb7
   onward()
   goto bb200;
bb9:
  if (b > 200) goto bb6; else goto bb200

in this case bb6  is a complicated join node by ranger standards. It has 
2 incoming edges which carry range information for b:

   9->6 has [201, +INF] and
   5->6 has[401, +INF]
The immediate dominator of bb6 is bb4.  When asking bb6 for a range from 
its dominator, there is no sign of any outgoing ranges for b, and 
range_from_dom would simply return VARYING.


This patch tweaks range_from_dom()  so that we calculate the range for 
any block in which either
  1) The dominator has an outgoing range    (this was the old behaviour 
only)

  2) The current block sees an outgoing range on an incoming edge.
The addition of 2) allows us to capture the above case with minimal cost 
which gets use very very close to the results from the old style full 
cache value propagation.


Bootstrapped on x86_64-pc-linux-gnu with no regressions.  pushed.

Andrew
From dbb093f4f15ea66f2ce5cd2dc1903a6894563356 Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Mon, 18 Jul 2022 15:04:23 -0400
Subject: [PATCH 2/2] Resolve complicated join nodes in range_from_dom.

Join nodes which carry outgoing ranges on incoming edges are uncommon,
but can still be resolved by setting the dominator range, and then
calculating incoming edges.  Avoid doing so if one of the incoing edges
is not dominated by the same dominator.

	* gimple-range-cache.cc (ranger_cache::range_from_dom): Check
	  for incoming ranges on join nodes and add to worklist.
---
 gcc/gimple-range-cache.cc | 26 ++
 1 file changed, 26 insertions(+)

diff --git a/gcc/gimple-range-cache.cc b/gcc/gimple-range-cache.cc
index 20dd5ead3bc..f3292fccaee 100644
--- a/gcc/gimple-range-cache.cc
+++ b/gcc/gimple-range-cache.cc
@@ -1384,6 +1384,32 @@ ranger_cache::range_from_dom (vrange &r, tree name, basic_block start_bb,
   // This block has an outgoing range.
   if (m_gori.has_edge_range_p (name, bb))
 	m_workback.quick_push (prev_bb);
+  else
+	{
+	  // Normally join blocks don't carry any new range information on
+	  // incoming edges.  If the first incoming edge to this block does
+	  // generate a range, calculate the ranges if all incoming edges
+	  // are also dominated by the dominator.  (Avoids backedges which
+	  // will break the rule of moving only upward in the domniator tree).
+	  // If the first pred does not generate a range, then we will be
+	  // using the dominator range anyway, so thats all the check needed.
+	  if (EDGE_COUNT (prev_bb->preds) > 1
+	  && m_gori.has_edge_range_p (name, EDGE_PRED (prev_bb, 0)->src))
+	{
+	  edge e;
+	  edge_iterator ei;
+	  bool all_dom = true;
+	  FOR_EACH_EDGE (e, ei, prev_bb->preds)
+		if (e->src != bb
+		&& !dominated_by_p (CDI_DOMINATORS, e->src, bb))
+		  {
+		all_dom = false;
+		break;
+		  }
+	  if (all_dom)
+		m_workback.quick_push (prev_bb);
+	}
+	}
 
   if (def_bb == bb)
 	break;
-- 
2.36.1



[committed] analyzer: don't track string literals in the store [PR106359]

2022-07-19 Thread David Malcolm via Gcc-patches
Doing so speeds up -fanalyzer from taking over 4 hours to under a
minute on the Linux kernel's sound/soc/codecs/cs47l90.c

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r13-1761-g68871a008e686d.

gcc/analyzer/ChangeLog:
PR analyzer/106359
* region.h (string_region::tracked_p): New.
* store.cc (binding_cluster::binding_cluster): Move here from
store.h.  Add assertion that base_region is tracked_p.
* store.h (binding_cluster::binding_cluster): Move to store.cc.

Signed-off-by: David Malcolm 
---
 gcc/analyzer/region.h | 4 
 gcc/analyzer/store.cc | 7 +++
 gcc/analyzer/store.h  | 4 +---
 3 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/gcc/analyzer/region.h b/gcc/analyzer/region.h
index 60d8149f513..fd0d4a05cc9 100644
--- a/gcc/analyzer/region.h
+++ b/gcc/analyzer/region.h
@@ -1151,6 +1151,10 @@ public:
 
   void dump_to_pp (pretty_printer *pp, bool simple) const final override;
 
+  /* We assume string literals are immutable, so we don't track them in
+ the store.  */
+  bool tracked_p () const final override { return false; }
+
   tree get_string_cst () const { return m_string_cst; }
 
 private:
diff --git a/gcc/analyzer/store.cc b/gcc/analyzer/store.cc
index 06151d8c041..e3dabf300df 100644
--- a/gcc/analyzer/store.cc
+++ b/gcc/analyzer/store.cc
@@ -1103,6 +1103,13 @@ binding_map::remove_overlapping_bindings (store_manager 
*mgr,
 
 /* class binding_cluster.  */
 
+binding_cluster::binding_cluster (const region *base_region)
+: m_base_region (base_region), m_map (),
+  m_escaped (false), m_touched (false)
+{
+  gcc_assert (base_region->tracked_p ());
+}
+
 /* binding_cluster's copy ctor.  */
 
 binding_cluster::binding_cluster (const binding_cluster &other)
diff --git a/gcc/analyzer/store.h b/gcc/analyzer/store.h
index 368b2990ae8..9b54c7b9156 100644
--- a/gcc/analyzer/store.h
+++ b/gcc/analyzer/store.h
@@ -544,9 +544,7 @@ public:
   typedef hash_map  map_t;
   typedef map_t::iterator iterator_t;
 
-  binding_cluster (const region *base_region)
-  : m_base_region (base_region), m_map (),
-m_escaped (false), m_touched (false) {}
+  binding_cluster (const region *base_region);
   binding_cluster (const binding_cluster &other);
   binding_cluster& operator=(const binding_cluster &other);
 
-- 
2.26.3



Re: [PATCH v1 1/2] LoongArch: Modify the method of obtaining symbolic addresses.

2022-07-19 Thread Lulu Cheng



在 2022/7/19 下午10:29, Xi Ruoyao 写道:

The change seems too large.  It would be better to split it into
multiple commits (for example, just 3 commits for 1,2,3 below).

On Tue, 2022-07-19 at 21:08 +0800, Lulu Cheng wrote:


1. The original LA macro instruction is split into two instructions to
    obtain the address of the symbol if enable '-mexplicit-relocs'.

It's better to add some test cases (with dg-final scan-assembler) for
this.  The test case will also show humans the intended behavior after
the change.

I'm sorry, I'll add the test cases as soon as possible.

3. Modify the method that calls global functions. From 'la.global + jirl'
    to 'bl'.

Why?  Does it means we'll rely on the assembler to emit the correct
sequence for -fno-plt?  Then it would be better to use a pseudo mnemonic
like "call" instead of "bl" (because it's not a single "bl"
instruction).
I have not described the behavior of noplt, but I will do so in the next 
submission.




Re: [PATCH, rs6000, v2] Additional cleanup of rs6000_builtin_mask

2022-07-19 Thread Kewen.Lin via Gcc-patches
Hi Will,

on 2022/7/20 04:15, will schmidt wrote:
> [PATCH, rs6000, v2] Additional cleanup of rs6000_builtin_mask
> 
> Hi,
>   Post the rs6000 builtins rewrite, some of the leftover builtin
> code is redundant and can be removed.
>   This replaces the usage of bu_mask in rs6000_target_modify_macros
> with checks against the rs6000_isa_flags equivalent directly.  Thusly
> the bu_mask variable can be removed.  After this update there
> are no other uses of rs6000_builtin_mask_calculate, so that function
> can also be safely removed.
> 
> No functional change, though some output under debug has been removed.
> 
> [V2]
>   Per patch review and subsequent investigations, the
> rs6000_builtin_mask and x_rs6000_builtin_mask can also be removed, as
> well as the entirety of the rs6000_builtin_mask_names table.
> 

I assumed this has been tested as before, so this patch is OK.  Thanks!

> gcc/
>   * config/rs6000/rs6000-c.cc: Update comments.
>   (rs6000_target_modify_macros): Remove bu_mask references.
>   (rs6000_define_or_undefine_macro): Replace bu_mask reference
>   with a rs6000_cpu value check.
>   (rs6000_cpu_cpp_builtins): Remove rs6000_builtin_mask_calculate()
>   parameter from call to rs6000_target_modify_macros.
>   * config/rs6000/rs6000-protos.h (rs6000_target_modify_macros,
>   rs6000_target_modify_macros_ptr): Remove parameter from extern
>   for the prototype.
>   * config/rs6000/rs6000.cc (rs6000_target_modify_macros_ptr): Remove
>   parameter from prototype, update calls to this function.
>   (rs6000_print_builtin_options): Remove prototype, call and function.
>   (rs6000_builtin_mask_calculate): Remove function.
>   (rs6000_debug_reg_global): Remove call to rs6000_print_builtin_options.
>   (rs6000_option_override_internal): Remove rs6000_builtin_mask var
>   and builtin_mask debug output.
>   (rs6000_builtin_mask_names): Remove.
>   (rs6000_pragma_target_parse): Remove prev_bumask, cur_bumask,
>   diff_bumask references; Update calls to
>   rs6000_target_modify_ptr.

Nit: this "rs6000_target_modify_ptr" seems not necessarily to start one new
line.  :)

BR,
Kewen

>   * config/rs6000/rs6000.opt (rs6000_builtin_mask): Remove.
>


[PATCH] Move pass_cse_sincos after vectorizer.

2022-07-19 Thread liuhongt via Gcc-patches
__builtin_cexpi can't be vectorized since there's gap between it and
vectorized sincos version(In libmvec, it passes a double and two
double pointer and returns nothing.) And it will lose some
vectorization opportunity if sin & cos are optimized to cexpi before
vectorizer.

I'm trying to add vect_recog_cexpi_pattern to split cexpi to sin and
cos, but it failed vectorizable_simd_clone_call since NULL is returned
by cgraph_node::get (fndecl).  So alternatively, the patch try to move
pass_cse_sincos after vectorizer, just before pas_cse_reciprocals.

Also original pass_cse_sincos additionaly expands pow&cabs, this patch
split that part into a separate pass named pass_expand_powcabs which
remains the old pass position.

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Observe more libmvec sin/cos vectorization in specfp, but no big performance.

Ok for trunk?

gcc/ChangeLog:

* passes.def: (Split pass_cse_sincos to pass_expand_powcabs
and pass_cse_sincos, and move pass_cse_sincos after vectorizer).
* timevar.def (TV_TREE_POWCABS): New timevar.
* tree-pass.h (make_pass_expand_powcabs): Split from pass_cse_sincos.
* tree-ssa-math-opts.cc (gimple_expand_builtin_cabs): Ditto.
(class pass_expand_powcabs): Ditto.
(pass_expand_powcabs::execute): Ditto.
(make_pass_expand_powcabs): Ditto.
(pass_cse_sincos::execute): Remove pow/cabs expand part.
(make_pass_cse_sincos): Ditto.

gcc/testsuite/ChangeLog:

* gcc.dg/pow-sqrt-synth-1.c: Adjust testcase.
---
 gcc/passes.def  |   3 +-
 gcc/testsuite/gcc.dg/pow-sqrt-synth-1.c |   4 +-
 gcc/timevar.def |   1 +
 gcc/tree-pass.h |   1 +
 gcc/tree-ssa-math-opts.cc   | 112 +++-
 5 files changed, 97 insertions(+), 24 deletions(-)

diff --git a/gcc/passes.def b/gcc/passes.def
index 375d3d62d51..6bb92efacd4 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -253,7 +253,7 @@ along with GCC; see the file COPYING3.  If not see
   NEXT_PASS (pass_ccp, true /* nonzero_p */);
   /* After CCP we rewrite no longer addressed locals into SSA
 form if possible.  */
-  NEXT_PASS (pass_cse_sincos);
+  NEXT_PASS (pass_expand_powcabs);
   NEXT_PASS (pass_optimize_bswap);
   NEXT_PASS (pass_laddress);
   NEXT_PASS (pass_lim);
@@ -328,6 +328,7 @@ along with GCC; see the file COPYING3.  If not see
   NEXT_PASS (pass_simduid_cleanup);
   NEXT_PASS (pass_lower_vector_ssa);
   NEXT_PASS (pass_lower_switch);
+  NEXT_PASS (pass_cse_sincos);
   NEXT_PASS (pass_cse_reciprocals);
   NEXT_PASS (pass_reassoc, false /* early_p */);
   NEXT_PASS (pass_strength_reduction);
diff --git a/gcc/testsuite/gcc.dg/pow-sqrt-synth-1.c 
b/gcc/testsuite/gcc.dg/pow-sqrt-synth-1.c
index 4a94325cdb3..484b29a8fc8 100644
--- a/gcc/testsuite/gcc.dg/pow-sqrt-synth-1.c
+++ b/gcc/testsuite/gcc.dg/pow-sqrt-synth-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile { target sqrt_insn } } */
-/* { dg-options "-fdump-tree-sincos -Ofast --param max-pow-sqrt-depth=8" } */
+/* { dg-options "-fdump-tree-powcabs -Ofast --param max-pow-sqrt-depth=8" } */
 /* { dg-additional-options "-mfloat-abi=softfp -mfpu=neon-vfpv4" { target 
arm*-*-* } } */
 
 double
@@ -34,4 +34,4 @@ vecfoo (double *a)
 a[i] = __builtin_pow (a[i], 1.25);
 }
 
-/* { dg-final { scan-tree-dump-times "synthesizing" 7 "sincos" } } */
+/* { dg-final { scan-tree-dump-times "synthesizing" 7 "powcabs" } } */
diff --git a/gcc/timevar.def b/gcc/timevar.def
index 2dae5e1c760..651af19876f 100644
--- a/gcc/timevar.def
+++ b/gcc/timevar.def
@@ -220,6 +220,7 @@ DEFTIMEVAR (TV_TREE_SWITCH_CONVERSION, "tree switch 
conversion")
 DEFTIMEVAR (TV_TREE_SWITCH_LOWERING,   "tree switch lowering")
 DEFTIMEVAR (TV_TREE_RECIP, "gimple CSE reciprocals")
 DEFTIMEVAR (TV_TREE_SINCOS   , "gimple CSE sin/cos")
+DEFTIMEVAR (TV_TREE_POWCABS   , "gimple expand pow/cabs")
 DEFTIMEVAR (TV_TREE_WIDEN_MUL, "gimple widening/fma detection")
 DEFTIMEVAR (TV_TRANS_MEM , "transactional memory")
 DEFTIMEVAR (TV_TREE_STRLEN   , "tree strlen optimization")
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 606d1d60b85..4dfe05ed8e0 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -444,6 +444,7 @@ extern gimple_opt_pass *make_pass_early_warn_uninitialized 
(gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_late_warn_uninitialized (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_cse_reciprocals (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_cse_sincos (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_expand_powcabs (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_optimize_bswap (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_store_merging (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_optimize_widening_mul (gcc::context *ctxt);
diff --git a/gcc/tree-ssa-

Re: [PATCH V2] Extend 16/32-bit vector bit_op patterns with (m, 0, i) alternative.

2022-07-19 Thread Hongtao Liu via Gcc-patches
On Tue, Jul 19, 2022 at 5:37 PM Uros Bizjak  wrote:
>
> On Tue, Jul 19, 2022 at 8:56 AM Hongtao Liu  wrote:
> >
> > On Tue, Jul 19, 2022 at 2:35 PM Uros Bizjak via Gcc-patches
> >  wrote:
> > >
> > > On Tue, Jul 19, 2022 at 8:07 AM liuhongt  wrote:
> > > >
> > > > And split it after reload.
> > > >
> > > > > You will need ix86_binary_operator_ok insn constraint here with
> > > > > corresponding expander using ix86_fixup_binary_operands_no_copy to
> > > > > prepare insn operands.
> > > > Split define_expand with just register_operand, and allow
> > > > memory/immediate in define_insn, assume combine/forwprop will do 
> > > > optimization.
> > >
> > > But you will *ease* the job of the above passes if you use
> > > ix86_fixup_binary_operands_no_copy in the expander.
> > for -m32, it will hit ICE in
> > Breakpoint 1, ix86_fixup_binary_operands_no_copy (code=XOR,
> > mode=E_V4QImode, operands=0x7fffa970) a
> > /gcc/config/i386/i386-expand.cc:1184
> > 1184  rtx dst = ix86_fixup_binary_operands (code, mode, operands);
> > (gdb) n
> > 1185  gcc_assert (dst == operands[0]); -- here
> > (gdb)
> >
> > the original operands[0], operands[1], operands[2] are below
> > (gdb) p debug_rtx (operands[0])
> > (mem/c:V4QI (plus:SI (reg/f:SI 77 virtual-stack-vars)
> > (const_int -8220 [0xdfe4])) [0 MEM  > unsigned char> [(unsigned char *)&tmp2 + 4B]+0 S4 A32])
> > $1 = void
> > (gdb) p debug_rtx (operands[1])
> > (subreg:V4QI (reg:SI 129) 0)
> > $2 = void
> > (gdb) p debug_rtx (operands[2])
> > (subreg:V4QI (reg:SI 98 [ _46 ]) 0)
> > $3 = void
> > (gdb)
> >
> > since operands[0] is mem and not equal to operands[1],
> > ix86_fixup_binary_operands will create a pseudo register for dst. and
> > then hit ICE.
> > Is this a bug or assumed?
>
> You will need ix86_expand_binary_operator here.
It will swap memory operand from op1 to op2 and hit ICE for unrecognized insn.

What about this?

-(define_insn "3"
-  [(set (match_operand:VI_16_32 0 "register_operand" "=?r,x,x,v")
+(define_expand "3"
+  [(set (match_operand:VI_16_32 0 "nonimmediate_operand")
 (any_logic:VI_16_32
- (match_operand:VI_16_32 1 "register_operand" "%0,0,x,v")
- (match_operand:VI_16_32 2 "register_operand" "r,x,x,v")))
-   (clobber (reg:CC FLAGS_REG))]
+ (match_operand:VI_16_32 1 "nonimmediate_operand")
+ (match_operand:VI_16_32 2
"register_or_x86_64_const_vector_operand")))]
   ""
+{
+  rtx dst = ix86_fixup_binary_operands (, mode, operands);
+  if (MEM_P (operands[2]))
+operands[2] = force_reg (mode, operands[2]);
+  rtx op = gen_rtx_SET (dst, gen_rtx_fmt_ee (, mode,
+operands[1], operands[2]));
+  rtx clob = gen_rtx_CLOBBER (VOIDmode, gen_rtx_REG (CCmode, FLAGS_REG));
+  emit_insn (gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, op, clob)));
+  if (dst != operands[0])
+emit_move_insn (operands[0], dst);
+   DONE;
+})
+

>
> Uros.



-- 
BR,
Hongtao


gcc-patches@gcc.gnu.org

2022-07-19 Thread liuhongt via Gcc-patches
> My original comments still stand (it feels like this should be more generic).
> Can we go the way lowering complex loads/stores first?  A large part
> of the testcases
> added by the patch should pass after that.

This is the patch as suggested, one additional change is handling COMPLEX_CST
for rhs. And it will enable vectorization for pr106010-8a.c.

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?

2022-07-20  Richard Biener  
Hongtao Liu  

gcc/ChangeLog:

PR tree-optimization/106010
* tree-complex.cc (init_dont_simulate_again): Lower complex
type move.
(expand_complex_move): Also expand COMPLEX_CST for rhs.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr106010-1a.c: New test.
* gcc.target/i386/pr106010-1b.c: New test.
* gcc.target/i386/pr106010-1c.c: New test.
* gcc.target/i386/pr106010-2a.c: New test.
* gcc.target/i386/pr106010-2b.c: New test.
* gcc.target/i386/pr106010-2c.c: New test.
* gcc.target/i386/pr106010-3a.c: New test.
* gcc.target/i386/pr106010-3b.c: New test.
* gcc.target/i386/pr106010-3c.c: New test.
* gcc.target/i386/pr106010-4a.c: New test.
* gcc.target/i386/pr106010-4b.c: New test.
* gcc.target/i386/pr106010-4c.c: New test.
* gcc.target/i386/pr106010-5a.c: New test.
* gcc.target/i386/pr106010-5b.c: New test.
* gcc.target/i386/pr106010-5c.c: New test.
* gcc.target/i386/pr106010-6a.c: New test.
* gcc.target/i386/pr106010-6b.c: New test.
* gcc.target/i386/pr106010-6c.c: New test.
* gcc.target/i386/pr106010-7a.c: New test.
* gcc.target/i386/pr106010-7b.c: New test.
* gcc.target/i386/pr106010-7c.c: New test.
* gcc.target/i386/pr106010-8a.c: New test.
* gcc.target/i386/pr106010-8b.c: New test.
* gcc.target/i386/pr106010-8c.c: New test.
* gcc.target/i386/pr106010-9a.c: New test.
* gcc.target/i386/pr106010-9b.c: New test.
* gcc.target/i386/pr106010-9c.c: New test.
* gcc.target/i386/pr106010-9d.c: New test.
---
 gcc/testsuite/gcc.target/i386/pr106010-1a.c |  58 
 gcc/testsuite/gcc.target/i386/pr106010-1b.c |  63 
 gcc/testsuite/gcc.target/i386/pr106010-1c.c |  41 +
 gcc/testsuite/gcc.target/i386/pr106010-2a.c |  82 ++
 gcc/testsuite/gcc.target/i386/pr106010-2b.c |  62 
 gcc/testsuite/gcc.target/i386/pr106010-2c.c |  47 ++
 gcc/testsuite/gcc.target/i386/pr106010-3a.c |  80 ++
 gcc/testsuite/gcc.target/i386/pr106010-3b.c | 126 
 gcc/testsuite/gcc.target/i386/pr106010-3c.c |  69 +
 gcc/testsuite/gcc.target/i386/pr106010-4a.c | 101 +
 gcc/testsuite/gcc.target/i386/pr106010-4b.c |  67 +
 gcc/testsuite/gcc.target/i386/pr106010-4c.c |  54 +++
 gcc/testsuite/gcc.target/i386/pr106010-5a.c | 117 +++
 gcc/testsuite/gcc.target/i386/pr106010-5b.c |  80 ++
 gcc/testsuite/gcc.target/i386/pr106010-5c.c |  62 
 gcc/testsuite/gcc.target/i386/pr106010-6a.c | 115 ++
 gcc/testsuite/gcc.target/i386/pr106010-6b.c | 157 
 gcc/testsuite/gcc.target/i386/pr106010-6c.c |  80 ++
 gcc/testsuite/gcc.target/i386/pr106010-7a.c |  58 
 gcc/testsuite/gcc.target/i386/pr106010-7b.c |  63 
 gcc/testsuite/gcc.target/i386/pr106010-7c.c |  41 +
 gcc/testsuite/gcc.target/i386/pr106010-8a.c |  58 
 gcc/testsuite/gcc.target/i386/pr106010-8b.c |  53 +++
 gcc/testsuite/gcc.target/i386/pr106010-8c.c |  38 +
 gcc/testsuite/gcc.target/i386/pr106010-9a.c |  89 +++
 gcc/testsuite/gcc.target/i386/pr106010-9b.c |  90 +++
 gcc/testsuite/gcc.target/i386/pr106010-9c.c |  90 +++
 gcc/testsuite/gcc.target/i386/pr106010-9d.c |  92 
 gcc/tree-complex.cc |   9 +-
 29 files changed, 2141 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr106010-1a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr106010-1b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr106010-1c.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr106010-2a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr106010-2b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr106010-2c.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr106010-3a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr106010-3b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr106010-3c.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr106010-4a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr106010-4b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr106010-4c.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr106010-5a.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr106010-5b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr106010-5c.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr10601

Re: [PATCH V2] Extend 16/32-bit vector bit_op patterns with (m, 0, i) alternative.

2022-07-19 Thread Uros Bizjak via Gcc-patches
On Wed, Jul 20, 2022 at 4:37 AM Hongtao Liu  wrote:
>
> On Tue, Jul 19, 2022 at 5:37 PM Uros Bizjak  wrote:
> >
> > On Tue, Jul 19, 2022 at 8:56 AM Hongtao Liu  wrote:
> > >
> > > On Tue, Jul 19, 2022 at 2:35 PM Uros Bizjak via Gcc-patches
> > >  wrote:
> > > >
> > > > On Tue, Jul 19, 2022 at 8:07 AM liuhongt  wrote:
> > > > >
> > > > > And split it after reload.
> > > > >
> > > > > > You will need ix86_binary_operator_ok insn constraint here with
> > > > > > corresponding expander using ix86_fixup_binary_operands_no_copy to
> > > > > > prepare insn operands.
> > > > > Split define_expand with just register_operand, and allow
> > > > > memory/immediate in define_insn, assume combine/forwprop will do 
> > > > > optimization.
> > > >
> > > > But you will *ease* the job of the above passes if you use
> > > > ix86_fixup_binary_operands_no_copy in the expander.
> > > for -m32, it will hit ICE in
> > > Breakpoint 1, ix86_fixup_binary_operands_no_copy (code=XOR,
> > > mode=E_V4QImode, operands=0x7fffa970) a
> > > /gcc/config/i386/i386-expand.cc:1184
> > > 1184  rtx dst = ix86_fixup_binary_operands (code, mode, operands);
> > > (gdb) n
> > > 1185  gcc_assert (dst == operands[0]); -- here
> > > (gdb)
> > >
> > > the original operands[0], operands[1], operands[2] are below
> > > (gdb) p debug_rtx (operands[0])
> > > (mem/c:V4QI (plus:SI (reg/f:SI 77 virtual-stack-vars)
> > > (const_int -8220 [0xdfe4])) [0 MEM  > > unsigned char> [(unsigned char *)&tmp2 + 4B]+0 S4 A32])
> > > $1 = void
> > > (gdb) p debug_rtx (operands[1])
> > > (subreg:V4QI (reg:SI 129) 0)
> > > $2 = void
> > > (gdb) p debug_rtx (operands[2])
> > > (subreg:V4QI (reg:SI 98 [ _46 ]) 0)
> > > $3 = void
> > > (gdb)
> > >
> > > since operands[0] is mem and not equal to operands[1],
> > > ix86_fixup_binary_operands will create a pseudo register for dst. and
> > > then hit ICE.
> > > Is this a bug or assumed?
> >
> > You will need ix86_expand_binary_operator here.
> It will swap memory operand from op1 to op2 and hit ICE for unrecognized insn.
>
> What about this?

Still no good... You are using commutative operands, so the predicate
of operand 2 should also allow memory. So, the predicate should be
nonimmediate_or_x86_64_const_vector_operand. The intermediate insn
pattern should look something like *_1, but with
added XMM and MMX reg alternatives instead of mask regs.

Uros.

>
> -(define_insn "3"
> -  [(set (match_operand:VI_16_32 0 "register_operand" "=?r,x,x,v")
> +(define_expand "3"
> +  [(set (match_operand:VI_16_32 0 "nonimmediate_operand")
>  (any_logic:VI_16_32
> - (match_operand:VI_16_32 1 "register_operand" "%0,0,x,v")
> - (match_operand:VI_16_32 2 "register_operand" "r,x,x,v")))
> -   (clobber (reg:CC FLAGS_REG))]
> + (match_operand:VI_16_32 1 "nonimmediate_operand")
> + (match_operand:VI_16_32 2
> "register_or_x86_64_const_vector_operand")))]
>""
> +{
> +  rtx dst = ix86_fixup_binary_operands (, mode, operands);
> +  if (MEM_P (operands[2]))
> +operands[2] = force_reg (mode, operands[2]);
> +  rtx op = gen_rtx_SET (dst, gen_rtx_fmt_ee (, mode,
> +operands[1], operands[2]));
> +  rtx clob = gen_rtx_CLOBBER (VOIDmode, gen_rtx_REG (CCmode, FLAGS_REG));
> +  emit_insn (gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, op, clob)));
> +  if (dst != operands[0])
> +emit_move_insn (operands[0], dst);
> +   DONE;
> +})
> +
>
> >
> > Uros.
>
>
>
> --
> BR,
> Hongtao


Re: [PATCH V2] Extend 16/32-bit vector bit_op patterns with (m, 0, i) alternative.

2022-07-19 Thread Uros Bizjak via Gcc-patches
On Wed, Jul 20, 2022 at 8:14 AM Uros Bizjak  wrote:
>
> On Wed, Jul 20, 2022 at 4:37 AM Hongtao Liu  wrote:
> >
> > On Tue, Jul 19, 2022 at 5:37 PM Uros Bizjak  wrote:
> > >
> > > On Tue, Jul 19, 2022 at 8:56 AM Hongtao Liu  wrote:
> > > >
> > > > On Tue, Jul 19, 2022 at 2:35 PM Uros Bizjak via Gcc-patches
> > > >  wrote:
> > > > >
> > > > > On Tue, Jul 19, 2022 at 8:07 AM liuhongt  
> > > > > wrote:
> > > > > >
> > > > > > And split it after reload.
> > > > > >
> > > > > > > You will need ix86_binary_operator_ok insn constraint here with
> > > > > > > corresponding expander using ix86_fixup_binary_operands_no_copy to
> > > > > > > prepare insn operands.
> > > > > > Split define_expand with just register_operand, and allow
> > > > > > memory/immediate in define_insn, assume combine/forwprop will do 
> > > > > > optimization.
> > > > >
> > > > > But you will *ease* the job of the above passes if you use
> > > > > ix86_fixup_binary_operands_no_copy in the expander.
> > > > for -m32, it will hit ICE in
> > > > Breakpoint 1, ix86_fixup_binary_operands_no_copy (code=XOR,
> > > > mode=E_V4QImode, operands=0x7fffa970) a
> > > > /gcc/config/i386/i386-expand.cc:1184
> > > > 1184  rtx dst = ix86_fixup_binary_operands (code, mode, operands);
> > > > (gdb) n
> > > > 1185  gcc_assert (dst == operands[0]); -- here
> > > > (gdb)
> > > >
> > > > the original operands[0], operands[1], operands[2] are below
> > > > (gdb) p debug_rtx (operands[0])
> > > > (mem/c:V4QI (plus:SI (reg/f:SI 77 virtual-stack-vars)
> > > > (const_int -8220 [0xdfe4])) [0 MEM  > > > unsigned char> [(unsigned char *)&tmp2 + 4B]+0 S4 A32])
> > > > $1 = void
> > > > (gdb) p debug_rtx (operands[1])
> > > > (subreg:V4QI (reg:SI 129) 0)
> > > > $2 = void
> > > > (gdb) p debug_rtx (operands[2])
> > > > (subreg:V4QI (reg:SI 98 [ _46 ]) 0)
> > > > $3 = void
> > > > (gdb)
> > > >
> > > > since operands[0] is mem and not equal to operands[1],
> > > > ix86_fixup_binary_operands will create a pseudo register for dst. and
> > > > then hit ICE.
> > > > Is this a bug or assumed?
> > >
> > > You will need ix86_expand_binary_operator here.
> > It will swap memory operand from op1 to op2 and hit ICE for unrecognized 
> > insn.
> >
> > What about this?
>
> Still no good... You are using commutative operands, so the predicate
> of operand 2 should also allow memory. So, the predicate should be
> nonimmediate_or_x86_64_const_vector_operand. The intermediate insn
> pattern should look something like *_1, but with
> added XMM and MMX reg alternatives instead of mask regs.

Alternatively, you can use UNKNOWN operator to prevent
canonicalization, but then you should not use commutative constraint
in the intermediate insn. I think this is the best solution.

Uros.

> >
> > -(define_insn "3"
> > -  [(set (match_operand:VI_16_32 0 "register_operand" "=?r,x,x,v")
> > +(define_expand "3"
> > +  [(set (match_operand:VI_16_32 0 "nonimmediate_operand")
> >  (any_logic:VI_16_32
> > - (match_operand:VI_16_32 1 "register_operand" "%0,0,x,v")
> > - (match_operand:VI_16_32 2 "register_operand" "r,x,x,v")))
> > -   (clobber (reg:CC FLAGS_REG))]
> > + (match_operand:VI_16_32 1 "nonimmediate_operand")
> > + (match_operand:VI_16_32 2
> > "register_or_x86_64_const_vector_operand")))]
> >""
> > +{
> > +  rtx dst = ix86_fixup_binary_operands (, mode, operands);
> > +  if (MEM_P (operands[2]))
> > +operands[2] = force_reg (mode, operands[2]);
> > +  rtx op = gen_rtx_SET (dst, gen_rtx_fmt_ee (, mode,
> > +operands[1], operands[2]));
> > +  rtx clob = gen_rtx_CLOBBER (VOIDmode, gen_rtx_REG (CCmode, FLAGS_REG));
> > +  emit_insn (gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, op, clob)));
> > +  if (dst != operands[0])
> > +emit_move_insn (operands[0], dst);
> > +   DONE;
> > +})
> > +
> >
> > >
> > > Uros.
> >
> >
> >
> > --
> > BR,
> > Hongtao


[PATCH] libgo: make match.sh POSIX-shell compatible

2022-07-19 Thread soeren--- via Gcc-patches
From: Sören Tempel 

The `(( expression ))` syntax is a Bash extension and not supported by
POSIX shell [1]. However, the arithmetic expressions used by the
gobuild() function can also be expressed using arithmetic POSIX
expansions with `$(( expression ))` [2].

Contrary to the Bash extension, arithmetic expansion doesn't set
the return value if the expression is non-zero but instead just prints
the expression result. Hence, the expression also needs to be negated.
Without this patch, match.sh does currently not work correctly if
/bin/sh is not a symlink to Bash.

[1]: https://www.gnu.org/software/bash/manual/bash.html#Conditional-Constructs
[2]: 
https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_04

Signed-off-by: Sören Tempel 
---
 libgo/match.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libgo/match.sh b/libgo/match.sh
index 7ed587ff794..9fbb498544c 100755
--- a/libgo/match.sh
+++ b/libgo/match.sh
@@ -111,7 +111,7 @@ gobuild() {
 if test "$goarch" != "386"; then
line=$(echo "$line" | sed -e "s/\\(${wrap}\\)386\\(${wrap}\\)/\10\2/g")
 fi
-(($line))
+return $((!(line)))
 }
 
 matched=


Re: [PATCH V2] Extend 16/32-bit vector bit_op patterns with (m, 0, i) alternative.

2022-07-19 Thread Hongtao Liu via Gcc-patches
On Wed, Jul 20, 2022 at 2:18 PM Uros Bizjak  wrote:
>
> On Wed, Jul 20, 2022 at 8:14 AM Uros Bizjak  wrote:
> >
> > On Wed, Jul 20, 2022 at 4:37 AM Hongtao Liu  wrote:
> > >
> > > On Tue, Jul 19, 2022 at 5:37 PM Uros Bizjak  wrote:
> > > >
> > > > On Tue, Jul 19, 2022 at 8:56 AM Hongtao Liu  wrote:
> > > > >
> > > > > On Tue, Jul 19, 2022 at 2:35 PM Uros Bizjak via Gcc-patches
> > > > >  wrote:
> > > > > >
> > > > > > On Tue, Jul 19, 2022 at 8:07 AM liuhongt  
> > > > > > wrote:
> > > > > > >
> > > > > > > And split it after reload.
> > > > > > >
> > > > > > > > You will need ix86_binary_operator_ok insn constraint here with
> > > > > > > > corresponding expander using ix86_fixup_binary_operands_no_copy 
> > > > > > > > to
> > > > > > > > prepare insn operands.
> > > > > > > Split define_expand with just register_operand, and allow
> > > > > > > memory/immediate in define_insn, assume combine/forwprop will do 
> > > > > > > optimization.
> > > > > >
> > > > > > But you will *ease* the job of the above passes if you use
> > > > > > ix86_fixup_binary_operands_no_copy in the expander.
> > > > > for -m32, it will hit ICE in
> > > > > Breakpoint 1, ix86_fixup_binary_operands_no_copy (code=XOR,
> > > > > mode=E_V4QImode, operands=0x7fffa970) a
> > > > > /gcc/config/i386/i386-expand.cc:1184
> > > > > 1184  rtx dst = ix86_fixup_binary_operands (code, mode, operands);
> > > > > (gdb) n
> > > > > 1185  gcc_assert (dst == operands[0]); -- here
> > > > > (gdb)
> > > > >
> > > > > the original operands[0], operands[1], operands[2] are below
> > > > > (gdb) p debug_rtx (operands[0])
> > > > > (mem/c:V4QI (plus:SI (reg/f:SI 77 virtual-stack-vars)
> > > > > (const_int -8220 [0xdfe4])) [0 MEM  > > > > unsigned char> [(unsigned char *)&tmp2 + 4B]+0 S4 A32])
> > > > > $1 = void
> > > > > (gdb) p debug_rtx (operands[1])
> > > > > (subreg:V4QI (reg:SI 129) 0)
> > > > > $2 = void
> > > > > (gdb) p debug_rtx (operands[2])
> > > > > (subreg:V4QI (reg:SI 98 [ _46 ]) 0)
> > > > > $3 = void
> > > > > (gdb)
> > > > >
> > > > > since operands[0] is mem and not equal to operands[1],
> > > > > ix86_fixup_binary_operands will create a pseudo register for dst. and
> > > > > then hit ICE.
> > > > > Is this a bug or assumed?
> > > >
> > > > You will need ix86_expand_binary_operator here.
> > > It will swap memory operand from op1 to op2 and hit ICE for unrecognized 
> > > insn.
> > >
> > > What about this?
> >
> > Still no good... You are using commutative operands, so the predicate
> > of operand 2 should also allow memory. So, the predicate should be
> > nonimmediate_or_x86_64_const_vector_operand. The intermediate insn
> > pattern should look something like *_1, but with
> > added XMM and MMX reg alternatives instead of mask regs.
>
> Alternatively, you can use UNKNOWN operator to prevent
> canonicalization, but then you should not use commutative constraint
> in the intermediate insn. I think this is the best solution.
Like this?

-(define_insn "3"
-  [(set (match_operand:VI_16_32 0 "register_operand" "=?r,x,x,v")
+(define_expand "3"
+  [(set (match_operand:VI_16_32 0 "nonimmediate_operand")
 (any_logic:VI_16_32
- (match_operand:VI_16_32 1 "register_operand" "%0,0,x,v")
- (match_operand:VI_16_32 2 "register_operand" "r,x,x,v")))
-   (clobber (reg:CC FLAGS_REG))]
+ (match_operand:VI_16_32 1 "nonimmediate_operand")
+ (match_operand:VI_16_32 2
"register_or_x86_64_const_vector_operand")))]
   ""
+{
+  rtx dst = ix86_fixup_binary_operands (, mode, operands);
+  if (MEM_P (operands[2]))
+operands[2] = force_reg (mode, operands[2]);
+  rtx op = gen_rtx_SET (dst, gen_rtx_fmt_ee (, mode,
+operands[1], operands[2]));
+  rtx clob = gen_rtx_CLOBBER (VOIDmode, gen_rtx_REG (CCmode, FLAGS_REG));
+  emit_insn (gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, op, clob)));
+  if (dst != operands[0])
+emit_move_insn (operands[0], dst);
+   DONE;
+})
+
+(define_insn "*3"
+  [(set (match_operand:VI_16_32 0 "nonimmediate_operand" "=?r,m,x,x,v")
+(any_logic:VI_16_32
+ (match_operand:VI_16_32 1 "nonimmediate_operand" "0,0,0,x,v")
+ (match_operand:VI_16_32 2
"register_or_x86_64_const_vector_operand" "r,i,x,x,v")))
+   (clobber (reg:CC FLAGS_REG))]
+  "ix86_binary_operator_ok (UNKNOWN, mode, operands)"
   "#"
-  [(set_attr "isa" "*,sse2_noavx,avx,avx512vl")
-   (set_attr "type" "alu,sselog,sselog,sselog")
-   (set_attr "mode" "SI,TI,TI,TI")])
+  [(set_attr "isa" "*,*,sse2_noavx,avx,avx512vl")
+   (set_attr "type" "alu,alu,sselog,sselog,sselog")
+   (set_attr "mode" "SI,SI,TI,TI,TI")])

>
> Uros.
>
> > >
> > > -(define_insn "3"
> > > -  [(set (match_operand:VI_16_32 0 "register_operand" "=?r,x,x,v")
> > > +(define_expand "3"
> > > +  [(set (match_operand:VI_16_32 0 "nonimmediate_operand")
> > >  (any_logic:VI_16_32
> > > - (match_operand:VI_16_32 1 "register_operand" "%0,0,x,v")
> > > - (match_operand