[RFC/RFT 3/3] [PR102768] aarch64: Add support for Control Flow Integrity

2022-12-19 Thread Dan Li via Gcc-patches
On the AArch64 platform, the typeid can be inserted directly in front
of the function header (at offset -4).

For all functions that will not be called indirectly, insert the
reserved RESERVED_CFI_TYPEID (0x0) as the typeid in front of them.
Otherwise, an attacker could use the instruction or data that happens
to precede the function as a typeid to bypass CFI.

All typeids ignore some bits (& AARCH64_UNALLOCATED_INSN_MASK) to
avoid conflicts with the AArch64 instruction set.
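
The masking rule can be sketched in plain C as follows.  This is only an
illustration: the three constants are copied from the patch below, while
sketch_calc_typeid and its raw_hash parameter are hypothetical stand-ins
for aarch64_calc_func_cfi_typeid and the result of unified_type_hash.

```c
#include <assert.h>
#include <stdint.h>

/* Constants as defined in the patch.  */
#define RESERVED_CFI_TYPEID 0x0U
#define DEFAULT_CFI_TYPEID 0x0ADAU
#define AARCH64_UNALLOCATED_INSN_MASK 0xE7FFU

/* Hypothetical helper: RAW_HASH stands in for the prototype hash that
   the real code obtains from unified_type_hash ().  */
static uint32_t
sketch_calc_typeid (uint32_t raw_hash)
{
  /* Ignore the bits that are not covered by the platform mask.  */
  uint32_t id = raw_hash & AARCH64_UNALLOCATED_INSN_MASK;

  /* 0x0 is reserved for functions that are never called indirectly,
     so an indirectly callable function falls back to the default.  */
  if (id == RESERVED_CFI_TYPEID)
    id = DEFAULT_CFI_TYPEID;

  assert (id != RESERVED_CFI_TYPEID);
  return id;
}
```

With this sketch, a raw hash whose only set bits lie outside the mask
(for example 0x1800) ends up as DEFAULT_CFI_TYPEID rather than 0x0.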

Signed-off-by: Dan Li 

gcc/ChangeLog:

PR c/102768
* config/aarch64/aarch64.cc (RESERVED_CFI_TYPEID): Macro definition.
(DEFAULT_CFI_TYPEID): Likewise.
(AARCH64_UNALLOCATED_INSN_MASK): Likewise.
(aarch64_gimple_get_func_cfi_typeid): Platform-dependent
CFI function.
(aarch64_calc_func_cfi_typeid): Likewise.
(cgraph_indirectly_callable): Determine whether a function may
be called indirectly.
(aarch64_output_func_cfi_typeid): Platform-dependent CFI function.
(TARGET_HAVE_CFI): New hook.
(TARGET_CALC_FUNC_CFI_TYPEID): Likewise.
(TARGET_ASM_OUTPUT_FUNC_CFI_TYPEID): Likewise.
(TARGET_GIMPLE_GET_FUNC_CFI_TYPEID): Likewise.
* doc/invoke.texi: Document -fsanitize=cfi.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/control_flow_integrity_1.c: New test.
* gcc.target/aarch64/control_flow_integrity_2.c: New test.
* gcc.target/aarch64/control_flow_integrity_3.c: New test.
---
 gcc/config/aarch64/aarch64.cc | 106 ++
 gcc/doc/invoke.texi   |  35 ++
 .../aarch64/control_flow_integrity_1.c|  14 +++
 .../aarch64/control_flow_integrity_2.c|  25 +
 .../aarch64/control_flow_integrity_3.c|  23 
 5 files changed, 203 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/control_flow_integrity_1.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/control_flow_integrity_2.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/control_flow_integrity_3.c

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 5c9e7791a12..2796df0cdf3 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -81,6 +81,7 @@
 #include "rtlanal.h"
 #include "tree-dfa.h"
 #include "asan.h"
+#include "ssa.h"
 
 /* This file should be included last.  */
 #include "target-def.h"
@@ -5450,6 +5451,99 @@ aarch64_output_sve_addvl_addpl (rtx offset)
   return buffer;
 }
 
+/* Reserved for all functions that cannot be called indirectly.  */
+#define RESERVED_CFI_TYPEID 0x0U
+
+/* If the typeid of a function that can be called indirectly is equal to
+   RESERVED_CFI_TYPEID, change it to DEFAULT_CFI_TYPEID.  */
+#define DEFAULT_CFI_TYPEID 0x0ADAU
+
+/* Mask of reserved and unallocated instructions in AArch64 platform.  */
+#define AARCH64_UNALLOCATED_INSN_MASK 0xE7FFU
+
+/* Generate gimple insns to return the callee's typeid to a tmp var,
+   for aarch64, like:
+   __cfi_tmp = *(fptr - 4);  */
+
+static tree
+aarch64_gimple_get_func_cfi_typeid (gimple_seq *stmts,
+   location_t loc, tree fptr)
+{
+  gimple *stmt;
+  tree result, rhs;
+
+  result = create_tmp_var (integer_type_node, "__cfi_tmp");
+  result = make_ssa_name (result, NULL);
+
+  rhs = build_pointer_type (integer_type_node);
+  rhs = build_int_cst_type (rhs, -4);
+  rhs = build2 (MEM_REF, integer_type_node, fptr, rhs);
+
+  stmt = gimple_build_assign (result, rhs);
+  gimple_set_location (stmt, loc);
+
+  SSA_NAME_DEF_STMT (result) = stmt;
+
+  gimple_seq_add_stmt (stmts, stmt);
+
+  return result;
+}
+
+static unsigned int
+aarch64_calc_func_cfi_typeid (const_tree fntype)
+{
+  unsigned int hash;
+
+  /* The value of typeid has a probability of being the same as the encoding
+ of an instruction.  If the attacker can find the same encoding as the
+ typeid in the assembly code, then he has found a usable jump location.
+ So here, a platform-related mask is used when generating a typeid to
+ avoid such conflicts as much as possible.  */
+  hash = unified_type_hash (fntype) & AARCH64_UNALLOCATED_INSN_MASK;
+
+  /* RESERVED_CFI_TYPEID is reserved for functions that cannot
+ be called indirectly.  */
+  if (hash == RESERVED_CFI_TYPEID)
+hash = DEFAULT_CFI_TYPEID;
+
+  return hash;
+}
+
+static bool
+cgraph_indirectly_callable (struct cgraph_node *node,
+   void *data ATTRIBUTE_UNUSED)
+{
+  if (node->externally_visible || node->address_taken)
+return true;
+
+  return false;
+}
+
+static void
+aarch64_output_func_cfi_typeid (FILE * stream, tree decl)
+{
+  struct cgraph_node *node;
+  unsigned int cur_func_typeid;
+
+  node = cgraph_node::get (decl);
+
+  if (!node->call_for_symbol_thunks_and_aliases (cgraph_indirectly_callable,
+  NULL, true))
+/* CFI's typeid check always considers that there is a typeid before the
+   target 

Re: [PATCH] rs6000: Fix some issues related to Power10 fusion [PR104024]

2022-12-19 Thread Kewen.Lin via Gcc-patches
Hi Segher,

Thanks for the review comments!

on 2022/12/15 06:29, Segher Boessenkool wrote:
> On Wed, Nov 30, 2022 at 04:30:13PM +0800, Kewen.Lin wrote:
>> As PR104024 shows, the option -mpower10-fusion isn't guarded by
>> -mcpu=power10.  This causes the compiler to fuse some patterns
>> even without power10 support, which then triggers an unexpected ICE.
>> This patch simply unmasks the option when power10 isn't supported and
>> doesn't emit any warning, since the option is undocumented.
> 
> Yes, it mostly exists for debugging purposes (and also for testcase).
> 
>> Besides, for some define_insns in fusion.md which use constraint
>> v, it requires the condition VECTOR_UNIT_ALTIVEC_OR_VSX_P
>> (mode), otherwise it can cause ICE in reload, see test
>> case pr104024-2.c.
> 
> Please don't do two separate things in one patch.  It makes bisecting
> harder than necessary, and perhaps more interesting to you: it makes
> writing good changelog entries and commit messages harder.

OK, will do.

> 
>> --- a/gcc/config/rs6000/genfusion.pl
>> +++ b/gcc/config/rs6000/genfusion.pl
>> @@ -167,7 +167,7 @@ sub gen_logical_addsubf
>>  $inner_comp, $inner_inv, $inner_rtl, $inner_op, $both_commute, $c4,
>>  $bc, $inner_arg0, $inner_arg1, $inner_exp, $outer_arg2, $outer_exp,
>>  $ftype, $insn, $is_subf, $is_rsubf, $outer_32, $outer_42,$outer_name,
>> -$fuse_type);
>> +$fuse_type, $constraint_cond);
>>KIND: foreach $kind ('scalar','vector') {
>>@outer_ops = @logicals;
>>if ( $kind eq 'vector' ) {
>> @@ -176,12 +176,14 @@ sub gen_logical_addsubf
>>$pred = "altivec_register_operand";
>>$constraint = "v";
>>$fuse_type = "fused_vector";
>> +  $constraint_cond = "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode) && ";
>>} else {
>>$vchr = "";
>>$mode = "GPR";
>>$pred = "gpc_reg_operand";
>>$constraint = "r";
>>$fuse_type = "fused_arith_logical";
>> +  $constraint_cond = "";
>>push (@outer_ops, @addsub);
>>push (@outer_ops, ( "rsubf" ));
>>}
> 
> I don't like this at all.  Please use the "isa" attribute where needed?
> Or do you need more in some cases?  But, again, separate patch.

This is to add one more condition for those define_insns, for example:

@@ -1875,7 +1875,7 @@ (define_insn "*fuse_vand_vand"
   (match_operand:VM 1 "altivec_register_operand" "%v,v,v,v"))
  (match_operand:VM 2 "altivec_register_operand" "v,v,v,v")))
(clobber (match_scratch:VM 4 "=X,X,X,&v"))]
-  "(TARGET_P10_FUSION)"
+  "(VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode) && TARGET_P10_FUSION)"
   "@
vand %3,%1,%0\;vand %3,%3,%2
vand %3,%1,%0\;vand %3,%3,%2

This is to keep a pseudo whose mode isn't available for register constraint v
from causing an ICE during reload.  I'm not sure how the "isa" attribute helps
here; could you elaborate?

> 
>> +  if (TARGET_POWER10
>> +  && (rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION) == 0)
>> +rs6000_isa_flags |= OPTION_MASK_P10_FUSION;
>> +  else if (!TARGET_POWER10 && TARGET_P10_FUSION)
>> +rs6000_isa_flags &= ~OPTION_MASK_P10_FUSION;
> 
> That's not right.  If you want something like this you should check for
> TARGET_POWER10 whenever you check for TARGET_P10_FUSION; but there
> really is no reason at all to disable P10 fusion on other CPUs (neither
> newer nor older!).

Good point.  I just noticed that perhaps we should check the tune setting
instead of TARGET_POWER10 here?  Something like:

if (!(rs6000_isa_flags_explicit & OPTION_MASK_P10_FUSION))
  {
if (processor_target_table[tune_index].processor == PROCESSOR_POWER10)
  rs6000_isa_flags |= OPTION_MASK_P10_FUSION;
else
  rs6000_isa_flags &= ~OPTION_MASK_P10_FUSION;
  }
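
The defaulting rule in the snippet above can be modelled outside of GCC as
a small C function.  This is a sketch under stated assumptions: the enum,
the mask value and the function name are hypothetical and only mirror the
roles of the tune processor, OPTION_MASK_P10_FUSION and the option
override code.

```c
#include <assert.h>

/* Hypothetical stand-ins for the tune processor and the option mask.  */
enum sketch_cpu { SKETCH_CPU_POWER9, SKETCH_CPU_POWER10 };
#define SKETCH_MASK_P10_FUSION 0x1u

/* Return the ISA flags after applying the default: an explicit user
   choice is always kept, otherwise fusion follows the tune setting.  */
static unsigned int
sketch_resolve_fusion (unsigned int isa_flags, unsigned int explicit_flags,
                       enum sketch_cpu tune)
{
  assert (tune == SKETCH_CPU_POWER9 || tune == SKETCH_CPU_POWER10);

  if (!(explicit_flags & SKETCH_MASK_P10_FUSION))
    {
      if (tune == SKETCH_CPU_POWER10)
        isa_flags |= SKETCH_MASK_P10_FUSION;
      else
        isa_flags &= ~SKETCH_MASK_P10_FUSION;
    }

  return isa_flags;
}
```

The point of the explicit-flags test is that an explicitly given
-mpower10-fusion or -mno-power10-fusion is never overridden; only the
implicit default depends on the tune setting.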

> 
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/pr104024-1.c
>> @@ -0,0 +1,16 @@
>> +/* { dg-require-effective-target int128 } */
>> +/* { dg-options "-O1 -mdejagnu-cpu=power6 -mpower10-fusion" } */
> 
> Does this need -O1?  If not, use -O2 please; if so, document it.
> 

No, it doesn't, will use -O2 instead.

BR,
Kewen


[RFC/RFT 1/3] [PR102768] flag-types.h (enum sanitize_code): Extend sanitize_code to 64 bits to support more features

2022-12-19 Thread Dan Li via Gcc-patches
The 32-bit sanitize_code can no longer accommodate new options, so
extend it to 64 bits.
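
As a rough illustration of why every variable that carries the flags has
to be widened, the following C sketch shows a flag above bit 31 being
lost when it passes through a 32-bit variable.  The flag names and values
are hypothetical; the real enumerators live in flag-types.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag values, for illustration only.  */
#define SKETCH_FLAG_LOW   (UINT64_C (1) << 0)
#define SKETCH_FLAG_BIT32 (UINT64_C (1) << 32)

/* Test whether FLAG is present in FLAGS.  */
static int
sketch_flag_set_p (uint64_t flags, uint64_t flag)
{
  assert (flag != 0);
  return (flags & flag) != 0;
}

/* Model a code path that still stores the flags in a 32-bit variable:
   everything above bit 31 is silently dropped.  */
static uint64_t
sketch_through_32bit (uint64_t flags)
{
  uint32_t narrowed = (uint32_t) flags;
  return narrowed;
}
```

This is why the patch touches declarations, parameters and saved copies
of flag_sanitize alike: a single remaining 32-bit intermediate would drop
any new option placed in the upper half.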

Signed-off-by: Dan Li 

gcc/ChangeLog:

PR c/102768
* asan.h (sanitize_flags_p): Promote to uint64_t.
* common.opt: Likewise.
* dwarf2asm.cc (dw2_output_indirect_constant_1): Likewise.
* flag-types.h (enum sanitize_code): Likewise.
* opt-suggestions.cc (option_proposer::build_option_suggestions):
Likewise.
* opts.cc (find_sanitizer_argument): Likewise.
(report_conflicting_sanitizer_options): Likewise.
(get_closest_sanitizer_option): Likewise.
(parse_sanitizer_options): Likewise.
(parse_no_sanitize_attribute): Likewise.
* opts.h (parse_sanitizer_options): Likewise.
(parse_no_sanitize_attribute): Likewise.
* tree-cfg.cc (print_no_sanitize_attr_value): Likewise.

gcc/c-family/ChangeLog:

* c-attribs.cc (add_no_sanitize_value): Likewise.
(handle_no_sanitize_attribute): Likewise.
* c-common.h (add_no_sanitize_value): Likewise.

gcc/c/ChangeLog:

* c-parser.cc (c_parser_declaration_or_fndef): Likewise.

gcc/cp/ChangeLog:

* typeck.cc (get_member_function_from_ptrfunc): Likewise.
---
 gcc/asan.h|  4 +--
 gcc/c-family/c-attribs.cc | 10 +++---
 gcc/c-family/c-common.h   |  2 +-
 gcc/c/c-parser.cc |  4 +--
 gcc/common.opt|  4 +--
 gcc/cp/typeck.cc  |  2 +-
 gcc/dwarf2asm.cc  |  2 +-
 gcc/flag-types.h  | 65 ---
 gcc/opt-suggestions.cc|  2 +-
 gcc/opts.cc   | 22 ++---
 gcc/opts.h|  8 ++---
 gcc/tree-cfg.cc   |  2 +-
 12 files changed, 64 insertions(+), 63 deletions(-)

diff --git a/gcc/asan.h b/gcc/asan.h
index d4ea49cb240..5b98172549b 100644
--- a/gcc/asan.h
+++ b/gcc/asan.h
@@ -233,9 +233,9 @@ asan_protect_stack_decl (tree decl)
remove all flags mentioned in "no_sanitize" of DECL_ATTRIBUTES.  */
 
 static inline bool
-sanitize_flags_p (unsigned int flag, const_tree fn = current_function_decl)
+sanitize_flags_p (uint64_t flag, const_tree fn = current_function_decl)
 {
-  unsigned int result_flags = flag_sanitize & flag;
+  uint64_t result_flags = flag_sanitize & flag;
   if (result_flags == 0)
 return false;
 
diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index 111a33f405a..a73e2364525 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -1118,23 +1118,23 @@ handle_cold_attribute (tree *node, tree name, tree ARG_UNUSED (args),
 /* Add FLAGS for a function NODE to no_sanitize_flags in DECL_ATTRIBUTES.  */
 
 void
-add_no_sanitize_value (tree node, unsigned int flags)
+add_no_sanitize_value (tree node, uint64_t flags)
 {
   tree attr = lookup_attribute ("no_sanitize", DECL_ATTRIBUTES (node));
   if (attr)
 {
-  unsigned int old_value = tree_to_uhwi (TREE_VALUE (attr));
+  uint64_t old_value = tree_to_uhwi (TREE_VALUE (attr));
   flags |= old_value;
 
   if (flags == old_value)
return;
 
-  TREE_VALUE (attr) = build_int_cst (unsigned_type_node, flags);
+  TREE_VALUE (attr) = build_int_cst (long_long_unsigned_type_node, flags);
 }
   else
 DECL_ATTRIBUTES (node)
   = tree_cons (get_identifier ("no_sanitize"),
-  build_int_cst (unsigned_type_node, flags),
+  build_int_cst (long_long_unsigned_type_node, flags),
   DECL_ATTRIBUTES (node));
 }
 
@@ -1145,7 +1145,7 @@ static tree
 handle_no_sanitize_attribute (tree *node, tree name, tree args, int,
  bool *no_add_attrs)
 {
-  unsigned int flags = 0;
+  uint64_t flags = 0;
   *no_add_attrs = true;
   if (TREE_CODE (*node) != FUNCTION_DECL)
 {
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 52a85bfb783..eb91b9703db 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1500,7 +1500,7 @@ extern enum flt_eval_method
 excess_precision_mode_join (enum flt_eval_method, enum flt_eval_method);
 
 extern int c_flt_eval_method (bool ts18661_p);
-extern void add_no_sanitize_value (tree node, unsigned int flags);
+extern void add_no_sanitize_value (tree node, uint64_t flags);
 
 extern void maybe_add_include_fixit (rich_location *, const char *, bool);
 extern void maybe_suggest_missing_token_insertion (rich_location *richloc,
diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index f679d53706a..9d55ea55fa6 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -2217,7 +2217,7 @@ c_parser_declaration_or_fndef (c_parser *parser, bool fndef_ok,
   start_init (NULL_TREE, asm_name, global_bindings_p (), &richloc);
  /* A parameter is initialized, which is invalid.  Don't
 attempt to instrument the initializer.  */
- int flag_sanitize_save = flag_sanitize;
+ uint64_t flag_sanitize_save = flag_sanitize;
   

[RFC/RFT 2/3] [PR102768] Support CFI: Add new pass for Control Flow Integrity

2022-12-19 Thread Dan Li via Gcc-patches
The CFI sanitizer enabled with -fsanitize=cfi implements a forward
edge control flow integrity scheme for indirect calls, roughly
similar to -fsanitize=kcfi [1] in llvm.

At compile time, it appends a uniform type identifier before the
first instruction of each function and inserts check code before
each indirect call in a function with protection enabled.

At runtime, according to the code order, the check code for each
indirect call will be executed first, and it will:
1. Dynamically obtain the typeid before the callee function.
2. Compare it to the expected typeid of the current call site (caller).
3. If the two match, continue to execute the indirect call; if not,
call the user-defined callback function cfi_check_failed.
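
The three steps can be modelled in portable C as below.  This is a sketch,
not what the pass emits: the real check reads the word in front of the
callee's first instruction, which plain C cannot express portably, so a
hypothetical struct keeps the typeid next to the entry point instead.  All
names are made up for illustration, and skipping the call after a failed
check is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a protected function: the typeid word that the
   compiler would place before the entry point, plus the entry itself.  */
struct sketch_func
{
  uint32_t typeid_word;
  int (*entry) (int);
};

/* Counts failed checks; stands in for whatever the callback does.  */
static int sketch_failures;

/* Stand-in for the user-defined callback cfi_check_failed.  */
static void
sketch_cfi_check_failed (void)
{
  sketch_failures++;
}

/* Perform a checked indirect call with the typeid EXPECTED at the
   call site.  Returns -1 without calling on a mismatch (assumption).  */
static int
sketch_checked_call (const struct sketch_func *callee, uint32_t expected,
                     int arg)
{
  assert (callee != 0 && callee->entry != 0);

  uint32_t found = callee->typeid_word;   /* Step 1: fetch the typeid.  */
  if (found != expected)                  /* Step 2: compare.  */
    {
      sketch_cfi_check_failed ();         /* Step 3: mismatch.  */
      return -1;
    }

  return callee->entry (arg);             /* Step 3: match, do the call.  */
}

/* Example callee used to exercise the check.  */
static int
sketch_double (int x)
{
  return 2 * x;
}
```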

A typeid (type identifier) is a 32-bit constant on all platforms,
whose value depends on the function's prototype, and is invariant
across compilation units. However, different platforms may ignore
some of the bits to avoid conflicts with instructions.

If a program contains indirect calls to assembly functions, they
must be manually annotated with the expected type identifiers to
prevent errors. To make this easier, gcc generates a weak SHN_ABS
__cfi_typeid_ symbol for each address-taken function
declaration, which can be used to annotate functions in assembly
as long as at least one translation unit linked into the program
takes the function address.

It should be noted that on different platforms, the location of
typeid insertion (the offset between it and the function header)
may be different, such as [1], and this patch only implements
the platform-independent part.

[1]: https://reviews.llvm.org/D119296

Signed-off-by: Dan Li 

gcc/ChangeLog:

PR c/102768
* Makefile.in: Add tree-cfi.o.
* cgraphunit.cc (output_decl_cfi_typeid_symbol): Output the
CFI typeid corresponding to each external declaration when necessary.
(output_decl_cfi_typeid_symbols): Likewise.
* doc/passes.texi: Document it.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in: New hooks.
* flag-types.h (enum sanitize_code):
Add SANITIZE_CONTROL_FLOW_INTEGRITY.
* opts.cc (parse_sanitizer_options): Add cfi and exclude
SANITIZE_CONTROL_FLOW_INTEGRITY.
* output.h (default_output_func_cfi_typeid): Declare.
(default_calc_func_cfi_typeid): Declare.
(default_gimple_get_func_cfi_typeid): Declare.
* passes.def: Add pass_cfi.
* target.def: Add new hooks.
* toplev.cc (process_options): Add CFI compile option check.
* tree-pass.h (make_pass_cfi): Declare.
* tree.cc (tree_node_sizes[): Add the unified tree type hash
calculation functions.
(append_unified_type_hash): Likewise.
(initialize_unified_tree_type_hash_table): Likewise.
(append_unified_type_name_hash): Likewise.
(append_unified_type_precision_hash): Likewise.
(append_unified_function_ret_and_args_hash): Likewise.
(unified_type_hash): Likewise.
(init_ttree): Likewise.
* tree.h (unified_type_hash): Declare.
* varasm.cc (assemble_start_function): Output the CFI typeid
of each function.
(default_output_func_cfi_typeid): New.
(default_gimple_get_func_cfi_typeid): New.
(default_calc_func_cfi_typeid): New.
* tree-cfi.cc: New file.
---
 gcc/Makefile.in |   1 +
 gcc/cgraphunit.cc   |  34 +++
 gcc/doc/passes.texi |  10 ++
 gcc/doc/tm.texi |  27 ++
 gcc/doc/tm.texi.in  |   8 ++
 gcc/flag-types.h|   2 +
 gcc/opts.cc |   4 +-
 gcc/output.h|   3 +
 gcc/passes.def  |   1 +
 gcc/target.def  |  39 
 gcc/toplev.cc   |   4 +
 gcc/tree-cfi.cc | 229 
 gcc/tree-pass.h |   1 +
 gcc/tree.cc | 144 
 gcc/tree.h  |   1 +
 gcc/varasm.cc   |  29 ++
 16 files changed, 536 insertions(+), 1 deletion(-)
 create mode 100644 gcc/tree-cfi.cc

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 31ff95500c9..0d23bad6b63 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1610,6 +1610,7 @@ OBJS = \
tree-call-cdce.o \
tree-cfg.o \
tree-cfgcleanup.o \
+   tree-cfi.o \
tree-chrec.o \
tree-complex.o \
tree-data-ref.o \
diff --git a/gcc/cgraphunit.cc b/gcc/cgraphunit.cc
index 76d541755b8..fb4999559ae 100644
--- a/gcc/cgraphunit.cc
+++ b/gcc/cgraphunit.cc
@@ -,6 +,37 @@ ipa_passes (void)
   bitmap_obstack_release (NULL);
 }
 
+/* Output a weak symbol value of a decl's typeid (hash) to the
+   assembly file, like:
+   .weak __cfi_typeid_A
+   .set __cfi_typeid_A, 0x0ADA
+   typeid is platform-dependent, because the bits in typeid that conflicts
+   with the instruction set of the current platform needs to be ignored.  */
+
+static void
+output_decl_cfi_typeid_symbol (FILE *stream, tree fndecl)
+{
+  unsigned int hash =

Re: [PATCH v2 04/11] riscv: riscv-cores.def: Add T-Head XuanTie C906

2022-12-19 Thread Kito Cheng
LGTM

On Mon, Dec 19, 2022 at 9:08 AM Christoph Muellner
 wrote:
>
> From: Christoph Müllner 
>
> This adds T-Head's XuanTie C906 to the list of known cores as "thead-c906".
> The C906 has been shipping for quite some time (it is the core of the Allwinner D1).
> Note that the tuning struct for the C906 is already part of GCC (it is
> also named "thead-c906").
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-cores.def (RISCV_CORE): Add "thead-c906".
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/mcpu-thead-c906.c: New test.
>
> Changes for v2:
> - Enable all supported vendor extensions
>
> Signed-off-by: Christoph Müllner 
> ---
>  gcc/config/riscv/riscv-cores.def  |  4 +++
>  .../gcc.target/riscv/mcpu-thead-c906.c| 28 +++
>  2 files changed, 32 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/mcpu-thead-c906.c
>
> diff --git a/gcc/config/riscv/riscv-cores.def b/gcc/config/riscv/riscv-cores.def
> index 31ad34682c5..307381802fa 100644
> --- a/gcc/config/riscv/riscv-cores.def
> +++ b/gcc/config/riscv/riscv-cores.def
> @@ -73,4 +73,8 @@ RISCV_CORE("sifive-s76",  "rv64imafdc", "sifive-7-series")
>  RISCV_CORE("sifive-u54",  "rv64imafdc", "sifive-5-series")
>  RISCV_CORE("sifive-u74",  "rv64imafdc", "sifive-7-series")
>
> +RISCV_CORE("thead-c906",  "rv64imafdc_xtheadba_xtheadbb_xtheadbs_xtheadcmo_"
> + "xtheadcondmov_xtheadfmemidx_xtheadmac_"
> + "xtheadmemidx_xtheadmempair_xtheadsync",
> + "thead-c906")
>  #undef RISCV_CORE
> diff --git a/gcc/testsuite/gcc.target/riscv/mcpu-thead-c906.c b/gcc/testsuite/gcc.target/riscv/mcpu-thead-c906.c
> new file mode 100644
> index 000..a71b43a6167
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/mcpu-thead-c906.c
> @@ -0,0 +1,28 @@
> +/* { dg-do compile } */
> +/* { dg-skip-if "-march given" { *-*-* } { "-march=*" } } */
> +/* { dg-options "-mcpu=thead-c906" { target { rv64 } } } */
> +/* T-Head XuanTie C906 => rv64imafdc */
> +
> +#if !((__riscv_xlen == 64) \
> +  && !defined(__riscv_32e) \
> +  && defined(__riscv_mul)  \
> +  && defined(__riscv_atomic)   \
> +  && (__riscv_flen == 64)  \
> +  && defined(__riscv_compressed)   \
> +  && defined(__riscv_xtheadba) \
> +  && defined(__riscv_xtheadbb) \
> +  && defined(__riscv_xtheadbs) \
> +  && defined(__riscv_xtheadcmo)\
> +  && defined(__riscv_xtheadcondmov)\
> +  && defined(__riscv_xtheadfmemidx)\
> +  && defined(__riscv_xtheadmac)\
> +  && defined(__riscv_xtheadmemidx) \
> +  && defined(__riscv_xtheadmempair)\
> +  && defined(__riscv_xtheadsync))
> +#error "unexpected arch"
> +#endif
> +
> +int main()
> +{
> +  return 0;
> +}
> --
> 2.38.1
>


[PATCH v6, rs6000] Change mode and insn condition for VSX scalar extract/insert instructions

2022-12-19 Thread HAO CHEN GUI via Gcc-patches
Hi,
This patch fixes several problems:
1. The exponent of a double-precision value fits into an SImode register,
so "xsxexpdp" doesn't require a 64-bit environment. Also "xsxsigdp",
"xsiexpdp" and "xsiexpdpf" can put the exponent into a GPR register.

2. "TARGET_64BIT" check in insn conditions should be replaced with
"TARGET_POWERPC64" check.

3. "lp64" check in test cases should be replaced with "has_arch_ppc64"
check. "ilp32" check should be replaced with "dg-skip-if has_arch_ppc64".
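
To make point 1 concrete, here is a small C sketch (assuming IEEE 754
binary64 doubles) that pulls the exponent and fraction fields apart by
hand.  The helper names are hypothetical, and the fraction helper returns
the raw 52-bit field, which is not claimed to match what the builtins
return; it only shows the field widths involved.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the 11-bit biased exponent field of X.  The result always
   fits into 32 bits.  */
static unsigned int
sketch_extract_exp (double x)
{
  uint64_t bits;

  assert (sizeof (double) == sizeof (uint64_t));
  memcpy (&bits, &x, sizeof bits);
  return (unsigned int) ((bits >> 52) & 0x7FF);
}

/* Return the raw 52-bit fraction field of X.  This needs more than
   32 bits, hence a 64-bit result type.  */
static uint64_t
sketch_fraction_bits (double x)
{
  uint64_t bits;

  assert (sizeof (double) == sizeof (uint64_t));
  memcpy (&bits, &x, sizeof bits);
  return bits & ((UINT64_C (1) << 52) - 1);
}
```

The exponent field is only 11 bits wide, so a 32-bit result register is
sufficient for it, while the fraction field does not fit into 32 bits.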

This patch keeps outer interfaces of these builtins unchanged.

Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
Is this okay for trunk? Any recommendations? Thanks a lot.

ChangeLog
2022-12-19  Haochen Gui  

gcc/
* config/rs6000/rs6000-builtins.def
(__builtin_vsx_scalar_extract_exp): Set return type to const unsigned
int and set its bif-pattern to xsxexpdp_si, move it from power9-64 to
power9 catalog.
(__builtin_vsx_scalar_extract_sig): Set return type to const unsigned
long long.
(__builtin_vsx_scalar_insert_exp): Set its bif-pattern to xsiexpdp_di
unsigned int.
(__builtin_vsx_scalar_insert_exp_dp): Set its bif-pattern to
xsiexpdpf_di.
* config/rs6000/vsx.md (xsxexpdp): Rename to ...
(xsxexpdp_): ..., set mode of operand 0 to GPR and remove
TARGET_64BIT check.
(xsxsigdp): Change insn condition from TARGET_64BIT to TARGET_POWERPC64.
(xsiexpdp): Rename to ...
(xsiexpdp_): ..., set mode of operand 2 to GPR and change insn
condition from TARGET_64BIT to TARGET_POWERPC64.
(xsiexpdpf): Rename to ...
(xsiexpdpf_): ..., set mode of operand 2 to GPR and change insn
condition from TARGET_64BIT to TARGET_POWERPC64.
* doc/extend.texi (scalar_extract_exp): Remove 64-bit environment
requirement when it has a 64-bit argument.

gcc/testsuite/
* gcc.target/powerpc/bfp/scalar-extract-exp-0.c: Remove lp64 check.
* gcc.target/powerpc/bfp/scalar-extract-exp-1.c: Likewise.
* gcc.target/powerpc/bfp/scalar-extract-exp-2.c: Deleted as the case is
invalid now.
* gcc.target/powerpc/bfp/scalar-extract-exp-6.c: Remove lp64 check.
* gcc.target/powerpc/bfp/scalar-extract-sig-0.c: Replace lp64 check
with has_arch_ppc64.
* gcc.target/powerpc/bfp/scalar-extract-sig-1.c: Likewise.
* gcc.target/powerpc/bfp/scalar-extract-sig-2.c: Replace ilp32 check
with dg skip has_arch_ppc64.
* gcc.target/powerpc/bfp/scalar-extract-sig-6.c: Replace lp64 check
with has_arch_ppc64.
* gcc.target/powerpc/bfp/scalar-insert-exp-0.c: Likewise.
* gcc.target/powerpc/bfp/scalar-insert-exp-1.c: Likewise.
* gcc.target/powerpc/bfp/scalar-insert-exp-12.c: Likewise.
* gcc.target/powerpc/bfp/scalar-insert-exp-13.c: Likewise.
* gcc.target/powerpc/bfp/scalar-insert-exp-2.c: Replace ilp32 check
with dg skip has_arch_ppc64.
* gcc.target/powerpc/bfp/scalar-insert-exp-3.c: Replace lp64 check
with has_arch_ppc64.
* gcc.target/powerpc/bfp/scalar-insert-exp-4.c: Likewise.
* gcc.target/powerpc/bfp/scalar-insert-exp-5.c: Replace ilp32 check
with dg-skip-if has_arch_ppc64.

patch.diff
diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def
index f76f54793d7..b1b5002d7d9 100644
--- a/gcc/config/rs6000/rs6000-builtins.def
+++ b/gcc/config/rs6000/rs6000-builtins.def
@@ -2833,6 +2833,8 @@
   const signed int __builtin_dtstsfi_ov_td (const int<6>, _Decimal128);
 TSTSFI_OV_TD dfptstsfi_unordered_td {}

+  const signed int  __builtin_vsx_scalar_extract_exp (double);
+VSEEDP xsxexpdp_si {}

 [power9-64]
   void __builtin_altivec_xst_len_r (vsc, void *, long);
@@ -2847,18 +2849,15 @@
   pure vsc __builtin_vsx_lxvl (const void *, signed long);
 LXVL lxvl {}

-  const signed long __builtin_vsx_scalar_extract_exp (double);
-VSEEDP xsxexpdp {}
-
-  const signed long __builtin_vsx_scalar_extract_sig (double);
+  const signed long long __builtin_vsx_scalar_extract_sig (double);
 VSESDP xsxsigdp {}

   const double __builtin_vsx_scalar_insert_exp (unsigned long long, \
 unsigned long long);
-VSIEDP xsiexpdp {}
+VSIEDP xsiexpdp_di {}

   const double __builtin_vsx_scalar_insert_exp_dp (double, unsigned long long);
-VSIEDPF xsiexpdpf {}
+VSIEDPF xsiexpdpf_di {}

   pure vsc __builtin_vsx_xl_len_r (void *, signed long);
 XL_LEN_R xl_len_r {}
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index fb5cf04147e..e1c905a3f91 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -5089,11 +5089,11 @@ (define_insn "xsxexpqp_"
   [(set_attr "type" "vecmove")])

 ;; VSX Scalar Extract Exponent Double-Precision
-(define_insn "xsxexpdp"
-  [(set (match_operand:DI 0 

Re: [PATCH v2 02/11] riscv: Restructure callee-saved register save/restore code

2022-12-19 Thread Kito Cheng
Just one more nit: use INVALID_REGNUM as the sentinel value for
riscv_next_saved_reg, otherwise LGTM, and feel free to commit that
separately :)
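
A sketch of what a sentinel-based iteration could look like, outside of
GCC and with the offset bookkeeping left out.  Everything here is
hypothetical: SKETCH_INVALID_REGNUM stands in for INVALID_REGNUM, and a
plain 32-bit mask stands in for cfun->machine->frame.mask.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for INVALID_REGNUM.  */
#define SKETCH_INVALID_REGNUM (~0u)

/* Return the next register at or after REGNO (after REGNO if INC is
   nonzero) and up to LIMIT whose bit is set in MASK, or
   SKETCH_INVALID_REGNUM if there is none.  */
static unsigned int
sketch_next_saved_reg (uint32_t mask, unsigned int regno,
                       unsigned int limit, int inc)
{
  assert (limit < 32);

  if (inc)
    regno++;

  for (; regno <= limit; regno++)
    if (mask & (UINT32_C (1) << regno))
      return regno;

  return SKETCH_INVALID_REGNUM;
}

/* Walk all saved registers and count them; the loop ends on the
   sentinel instead of on a "regno <= limit" comparison.  */
static unsigned int
sketch_count_saved (uint32_t mask, unsigned int limit)
{
  unsigned int n = 0;

  for (unsigned int r = sketch_next_saved_reg (mask, 0, limit, 0);
       r != SKETCH_INVALID_REGNUM;
       r = sketch_next_saved_reg (mask, r, limit, 1))
    n++;

  return n;
}
```

With a sentinel, callers no longer need to know the limit to detect the
end of the iteration, which keeps the loop condition independent of how
the helper scans the mask.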

On Mon, Dec 19, 2022 at 9:08 AM Christoph Muellner
 wrote:
>
> From: Christoph Müllner 
>
> This patch restructures the loop over the GP registers
> which saves/restores them as part of the prologue/epilogue.
> No functional change is intended by this patch, but it
> offers the possibility to use load-pair/store-pair instructions.
>
> gcc/ChangeLog:
>
> * config/riscv/riscv.cc (riscv_next_saved_reg): New function.
> (riscv_is_eh_return_data_register): New function.
> (riscv_for_each_saved_reg): Restructure loop.
>
> Signed-off-by: Christoph Müllner 
> ---
>  gcc/config/riscv/riscv.cc | 94 +++
>  1 file changed, 66 insertions(+), 28 deletions(-)
>
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index 6dd2ab2d11e..a8d5e1dac7f 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -4835,6 +4835,49 @@ riscv_save_restore_reg (machine_mode mode, int regno,
>fn (gen_rtx_REG (mode, regno), mem);
>  }
>
> +/* Return the next register up from REGNO up to LIMIT for the callee
> +   to save or restore.  OFFSET will be adjusted accordingly.
> +   If INC is set, then REGNO will be incremented first.  */
> +
> +static unsigned int
> +riscv_next_saved_reg (unsigned int regno, unsigned int limit,
> + HOST_WIDE_INT *offset, bool inc = true)
> +{
> +  if (inc)
> +regno++;
> +
> +  while (regno <= limit)
> +{
> +  if (BITSET_P (cfun->machine->frame.mask, regno - GP_REG_FIRST))
> +   {
> + *offset = *offset - UNITS_PER_WORD;
> + break;
> +   }
> +
> +  regno++;
> +}
> +  return regno;
> +}
> +
> +/* Return TRUE if provided REGNO is eh return data register.  */
> +
> +static bool
> +riscv_is_eh_return_data_register (unsigned int regno)
> +{
> +  unsigned int i, regnum;
> +
> +  if (!crtl->calls_eh_return)
> +return false;
> +
> +  for (i = 0; (regnum = EH_RETURN_DATA_REGNO (i)) != INVALID_REGNUM; i++)
> +if (regno == regnum)
> +  {
> +   return true;
> +  }
> +
> +  return false;
> +}
> +
>  /* Call FN for each register that is saved by the current function.
> SP_OFFSET is the offset of the current stack pointer from the start
> of the frame.  */
> @@ -4844,36 +4887,31 @@ riscv_for_each_saved_reg (poly_int64 sp_offset, riscv_save_restore_fn fn,
>   bool epilogue, bool maybe_eh_return)
>  {
>HOST_WIDE_INT offset;
> +  unsigned int regno;
> +  unsigned int start = GP_REG_FIRST;
> +  unsigned int limit = GP_REG_LAST;
>
>/* Save the link register and s-registers. */
> -  offset = (cfun->machine->frame.gp_sp_offset - sp_offset).to_constant ();
> -  for (unsigned int regno = GP_REG_FIRST; regno <= GP_REG_LAST; regno++)
> -if (BITSET_P (cfun->machine->frame.mask, regno - GP_REG_FIRST))
> -  {
> -   bool handle_reg = !cfun->machine->reg_is_wrapped_separately[regno];
> -
> -   /* If this is a normal return in a function that calls the eh_return
> -  builtin, then do not restore the eh return data registers as that
> -  would clobber the return value.  But we do still need to save them
> -  in the prologue, and restore them for an exception return, so we
> -  need special handling here.  */
> -   if (epilogue && !maybe_eh_return && crtl->calls_eh_return)
> - {
> -   unsigned int i, regnum;
> -
> -   for (i = 0; (regnum = EH_RETURN_DATA_REGNO (i)) != INVALID_REGNUM;
> -i++)
> - if (regno == regnum)
> -   {
> - handle_reg = FALSE;
> - break;
> -   }
> - }
> -
> -   if (handle_reg)
> - riscv_save_restore_reg (word_mode, regno, offset, fn);
> -   offset -= UNITS_PER_WORD;
> -  }
> +  offset = (cfun->machine->frame.gp_sp_offset - sp_offset).to_constant ()
> +  + UNITS_PER_WORD;
> +  for (regno = riscv_next_saved_reg (start, limit, &offset, false);
> +   regno <= limit;
> +   regno = riscv_next_saved_reg (regno, limit, &offset))
> +{
> +  if (cfun->machine->reg_is_wrapped_separately[regno])
> +   continue;
> +
> +  /* If this is a normal return in a function that calls the eh_return
> +builtin, then do not restore the eh return data registers as that
> +would clobber the return value.  But we do still need to save them
> +in the prologue, and restore them for an exception return, so we
> +need special handling here.  */
> +  if (epilogue && !maybe_eh_return
> + && riscv_is_eh_return_data_register (regno))
> +   continue;
> +
> +  riscv_save_restore_reg (word_mode, regno, offset, fn);
> +}
>
>/* This loop must iterate over the same space as its companion in
>   riscv_compute_frame_info.  */
>

Re: [PATCH] RISC-V: Fix RVV mask mode size

2022-12-19 Thread Richard Biener via Gcc-patches
On Sat, Dec 17, 2022 at 2:54 AM Jeff Law via Gcc-patches
 wrote:
>
>
>
> On 12/16/22 18:44, 钟居哲 wrote:
> > Yes, VNx4DF only has 4 bits in mask mode in case of load and store.
> > For example, with vlm or vsm we will load/store 8 bits (I am not sure
> > the hardware can load/store 4 bits, but I am sure it definitely does not
> > load/store the whole register size).
> More likely than not you end up loading a larger quantity with the high
> bits zero'd.  Interesting that we're using a packed model.  I'd been
> told it was fairly expensive to implement in hardware relative to the
> cost of implementing the sparse model.

Since the masks are extra inputs if you use a packed model you need
to wire less bits into the execution units for the masks which I guess
is actually cheaper.  Yes, producing the masks might be more complicated.

> > So ideally it should be modeled more accurately. However, since GCC assumes
> > that 1 BOOL is 1 byte, the only thing I can do is to make the mask mode as
> > small as possible.
> > Maybe in the future I can support a 1-bit BOOL? I am not sure, since
> > it would require changing the GCC framework.
> I'm a bit confused by this.  GCC can support single bit bools, though
> ports often extend them to 8 bits or more for computational efficiency
> purposes.  At least that's the case in general.  Is there something
> particularly special about masks & bools that's causing problems?

The only "issue" might be with 4, 2 and 1 bit masks, which would
have a size of 8 bits but a smaller precision, so endianness might
play a role.

Btw, this is all similar to AVX512, where we don't even use
vector BI modes but integer modes for the mask, which
then becomes QImode for 1, 2, 4 and 8 bit masks and
HImode for 16, SImode for 32 and DImode for 64 bit masks.
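
The mapping described above can be written down as a tiny C sketch.  The
function names are hypothetical and the sizes are expressed in bytes
rather than as machine modes (1 for QImode, 2 for HImode, 4 for SImode,
8 for DImode).

```c
#include <assert.h>

/* Return the size in bytes of the integer mode that holds a mask of
   NBITS elements: 1, 2, 4 and 8 bit masks all share one byte.  */
static unsigned int
sketch_mask_bytes (unsigned int nbits)
{
  assert (nbits >= 1 && nbits <= 64);

  if (nbits <= 8)
    return 1;   /* QImode */
  if (nbits <= 16)
    return 2;   /* HImode */
  if (nbits <= 32)
    return 4;   /* SImode */
  return 8;     /* DImode */
}

/* Return how many bits of the containing mode are not part of the
   mask, i.e. the gap between size and precision.  */
static unsigned int
sketch_unused_bits (unsigned int nbits)
{
  return sketch_mask_bytes (nbits) * 8 - nbits;
}
```

For 1, 2 and 4 bit masks the unused-bit count is nonzero, which is the
size-versus-precision gap mentioned earlier in this mail.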

Richard.

> Jeff


[PATCH] fold-const: Treat fp conversion to a type with same mode as copy

2022-12-19 Thread Kewen.Lin via Gcc-patches
Hi,

In function fold_convert_const_real_from_real, when the modes of
the two types involved in an fp conversion are the same, we can simply
treat it as a copy and rebuild the constant with exactly the same
TREE_REAL_CST and the target type.  This is more efficient and helps to
avoid the possible unexpected clearing of the signalling bit described in [1].

Bootstrapped and regtested on x86_64-redhat-linux, aarch64-linux-gnu
and powerpc64{,le}-linux-gnu.

Is it ok for trunk?

[1] https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608533.html

gcc/ChangeLog:

* fold-const.cc (fold_convert_const_real_from_real): Treat floating
point conversion to a type with same mode as copy instead of normal
convertFormat.
---
 gcc/fold-const.cc | 9 +
 1 file changed, 9 insertions(+)

diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index 114258fa182..eb4b6ca8820 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -2178,6 +2178,15 @@ fold_convert_const_real_from_real (tree type, const_tree arg1)
   REAL_VALUE_TYPE value;
   tree t;
 
+  /* If the underlying modes are the same, simply treat it as
+     copy and rebuild with TREE_REAL_CST information and the
+     given type.  */
+  if (TYPE_MODE (type) == TYPE_MODE (TREE_TYPE (arg1)))
+    {
+      t = build_real (type, TREE_REAL_CST (arg1));
+      return t;
+    }
+
   /* Don't perform the operation if flag_signaling_nans is on
      and the operand is a signaling NaN.  */
   if (HONOR_SNANS (arg1)
--
2.27.0


Re: [PATCH v2 03/11] riscv: Add basic XThead* vendor extension support

2022-12-19 Thread Kito Cheng
> +  {"xtheadba", ISA_SPEC_CLASS_NONE, 1, 0},
> +  {"xtheadbb", ISA_SPEC_CLASS_NONE, 1, 0},
> +  {"xtheadbs", ISA_SPEC_CLASS_NONE, 1, 0},
> +  {"xtheadcmo", ISA_SPEC_CLASS_NONE, 1, 0},
> +  {"xtheadcondmov", ISA_SPEC_CLASS_NONE, 1, 0},
> +  {"xtheadfmemidx", ISA_SPEC_CLASS_NONE, 1, 0},
> +  {"xtheadfmv", ISA_SPEC_CLASS_NONE, 1, 0},
> +  {"xtheadint", ISA_SPEC_CLASS_NONE, 1, 0},
> +  {"xtheadmac", ISA_SPEC_CLASS_NONE, 1, 0},
> +  {"xtheadmemidx", ISA_SPEC_CLASS_NONE, 1, 0},
> +  {"xtheadmempair", ISA_SPEC_CLASS_NONE, 1, 0},
> +  {"xtheadsync", ISA_SPEC_CLASS_NONE, 1, 0},

I don't have a strong opinion on the versions of vendor extensions, but I
would like this to be aligned with
https://github.com/riscv-non-isa/riscv-toolchain-conventions/pull/27


Re: [PATCH v2 02/11] riscv: Restructure callee-saved register save/restore code

2022-12-19 Thread Kito Cheng
Something like this:

static unsigned int
riscv_next_saved_reg (unsigned int regno, unsigned int limit,
		      HOST_WIDE_INT *offset, bool inc = true)
{
  if (inc)
    regno++;

  while (regno <= limit)
    {
      if (BITSET_P (cfun->machine->frame.mask, regno - GP_REG_FIRST))
	{
	  *offset = *offset - UNITS_PER_WORD;
	  break;
	}

      regno++;
    }
  /* The loop only runs past LIMIT when no saved register was found.  */
  if (regno > limit)
    return INVALID_REGNUM;
  else
    return regno;
}
...

  for (regno = riscv_next_saved_reg (start, limit, &offset, false);
   regno != INVALID_REGNUM;
   regno = riscv_next_saved_reg (regno, limit, &offset))
{
...

On Mon, Dec 19, 2022 at 5:21 PM Christoph Müllner
 wrote:
>
>
>
> On Mon, Dec 19, 2022 at 7:30 AM Kito Cheng  wrote:
>>
>> just one more nit: Use INVALID_REGNUM as sentinel value for
>> riscv_next_saved_reg, otherwise LGTM, and feel free to commit that
>> separately :)
>
>
> Would this change below be ok?
>
> @@ -5540,7 +5540,7 @@ riscv_next_saved_reg (unsigned int regno, unsigned int 
> limit,
>if (inc)
>  regno++;
>
> -  while (regno <= limit)
> +  while (regno <= limit && regno != INVALID_REGNUM)
>  {
>if (BITSET_P (cfun->machine->frame.mask, regno - GP_REG_FIRST))
> {
>
> Thanks,
> Christoph
>
>
>>
>>
>> On Mon, Dec 19, 2022 at 9:08 AM Christoph Muellner
>>  wrote:
>> >
>> > From: Christoph Müllner 
>> >
>> > This patch restructures the loop over the GP registers
>> > which saves/restores them as part of the prologue/epilogue.
>> > No functional change is intended by this patch, but it
>> > offers the possibility to use load-pair/store-pair instructions.
>> >
>> > gcc/ChangeLog:
>> >
>> > * config/riscv/riscv.cc (riscv_next_saved_reg): New function.
>> > (riscv_is_eh_return_data_register): New function.
>> > (riscv_for_each_saved_reg): Restructure loop.
>> >
>> > Signed-off-by: Christoph Müllner 
>> > ---
>> >  gcc/config/riscv/riscv.cc | 94 +++
>> >  1 file changed, 66 insertions(+), 28 deletions(-)
>> >
>> > diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
>> > index 6dd2ab2d11e..a8d5e1dac7f 100644
>> > --- a/gcc/config/riscv/riscv.cc
>> > +++ b/gcc/config/riscv/riscv.cc
>> > @@ -4835,6 +4835,49 @@ riscv_save_restore_reg (machine_mode mode, int 
>> > regno,
>> >fn (gen_rtx_REG (mode, regno), mem);
>> >  }
>> >
>> > +/* Return the next register up from REGNO up to LIMIT for the callee
>> > +   to save or restore.  OFFSET will be adjusted accordingly.
>> > +   If INC is set, then REGNO will be incremented first.  */
>> > +
>> > +static unsigned int
>> > +riscv_next_saved_reg (unsigned int regno, unsigned int limit,
>> > + HOST_WIDE_INT *offset, bool inc = true)
>> > +{
>> > +  if (inc)
>> > +regno++;
>> > +
>> > +  while (regno <= limit)
>> > +{
>> > +  if (BITSET_P (cfun->machine->frame.mask, regno - GP_REG_FIRST))
>> > +   {
>> > + *offset = *offset - UNITS_PER_WORD;
>> > + break;
>> > +   }
>> > +
>> > +  regno++;
>> > +}
>> > +  return regno;
>> > +}
>> > +
>> > +/* Return TRUE if provided REGNO is eh return data register.  */
>> > +
>> > +static bool
>> > +riscv_is_eh_return_data_register (unsigned int regno)
>> > +{
>> > +  unsigned int i, regnum;
>> > +
>> > +  if (!crtl->calls_eh_return)
>> > +return false;
>> > +
>> > +  for (i = 0; (regnum = EH_RETURN_DATA_REGNO (i)) != INVALID_REGNUM; i++)
>> > +if (regno == regnum)
>> > +  {
>> > +   return true;
>> > +  }
>> > +
>> > +  return false;
>> > +}
>> > +
>> >  /* Call FN for each register that is saved by the current function.
>> > SP_OFFSET is the offset of the current stack pointer from the start
>> > of the frame.  */
>> > @@ -4844,36 +4887,31 @@ riscv_for_each_saved_reg (poly_int64 sp_offset, 
>> > riscv_save_restore_fn fn,
>> >   bool epilogue, bool maybe_eh_return)
>> >  {
>> >HOST_WIDE_INT offset;
>> > +  unsigned int regno;
>> > +  unsigned int start = GP_REG_FIRST;
>> > +  unsigned int limit = GP_REG_LAST;
>> >
>> >/* Save the link register and s-registers. */
>> > -  offset = (cfun->machine->frame.gp_sp_offset - sp_offset).to_constant ();
>> > -  for (unsigned int regno = GP_REG_FIRST; regno <= GP_REG_LAST; regno++)
>> > -if (BITSET_P (cfun->machine->frame.mask, regno - GP_REG_FIRST))
>> > -  {
>> > -   bool handle_reg = !cfun->machine->reg_is_wrapped_separately[regno];
>> > -
>> > -   /* If this is a normal return in a function that calls the 
>> > eh_return
>> > -  builtin, then do not restore the eh return data registers as 
>> > that
>> > -  would clobber the return value.  But we do still need to save 
>> > them
>> > -  in the prologue, and restore them for an exception return, so we
>> > -  need special handling here.  */
>> > -   if (epilogue && !maybe_eh_return && crtl->calls_eh_return)
>> > - {
>> > -   

Re: [PATCH] fold-const: Treat fp conversion to a type with same mode as copy

2022-12-19 Thread Richard Biener via Gcc-patches
On Mon, Dec 19, 2022 at 9:12 AM Kewen.Lin  wrote:
>
> Hi,
>
> In function fold_convert_const_real_from_real, when the modes of
> two types involved in fp conversion are the same, we can simply
> take it as copy, rebuild with the exactly same TREE_REAL_CST and
> the target type.  It is more efficient and helps to avoid possible
> unexpected signalling bit clearing in [1].
>
> Bootstrapped and regtested on x86_64-redhat-linux, aarch64-linux-gnu
> and powerpc64{,le}-linux-gnu.
>
> Is it ok for trunk?

But shouldn't

double x = (double) __builtin_nans("sNAN");

result in a quiet NaN?

> [1] https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608533.html
>
> gcc/ChangeLog:
>
> * fold-const.cc (fold_convert_const_real_from_real): Treat floating
> point conversion to a type with same mode as copy instead of normal
> convertFormat.
> ---
>  gcc/fold-const.cc | 9 +
>  1 file changed, 9 insertions(+)
>
> diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
> index 114258fa182..eb4b6ca8820 100644
> --- a/gcc/fold-const.cc
> +++ b/gcc/fold-const.cc
> @@ -2178,6 +2178,15 @@ fold_convert_const_real_from_real (tree type, 
> const_tree arg1)
>REAL_VALUE_TYPE value;
>tree t;
>
> +  /* If the underlying modes are the same, simply treat it as
> + copy and rebuild with TREE_REAL_CST information and the
> + given type.  */
> +  if (TYPE_MODE (type) == TYPE_MODE (TREE_TYPE (arg1)))
> +{
> +  t = build_real (type, TREE_REAL_CST (arg1));
> +  return t;
> +}
> +
>/* Don't perform the operation if flag_signaling_nans is on
>   and the operand is a signaling NaN.  */
>if (HONOR_SNANS (arg1)
> --
> 2.27.0
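
To make the signaling-versus-quiet distinction concrete, here is a bit-level
sketch in standalone C. It assumes the IEEE 754 binary64 layout with the
common encoding in which a set most-significant fraction bit means "quiet"
(as on x86, AArch64 and PowerPC). The helper names are hypothetical, and this
is not how GCC represents constants internally.

```c
#include <assert.h>
#include <stdint.h>

/* Field masks for an IEEE 754 binary64 bit pattern (assumed layout).  */
#define EXP_MASK  UINT64_C (0x7ff0000000000000)
#define FRAC_MASK UINT64_C (0x000fffffffffffff)
#define QUIET_BIT UINT64_C (0x0008000000000000)

/* A NaN has an all-ones exponent and a nonzero fraction.  */
static int
is_nan_bits (uint64_t bits)
{
  return (bits & EXP_MASK) == EXP_MASK && (bits & FRAC_MASK) != 0;
}

/* A signaling NaN is a NaN whose quiet bit is clear.  */
static int
is_snan_bits (uint64_t bits)
{
  return is_nan_bits (bits) && (bits & QUIET_BIT) == 0;
}

/* What a quieting operation does to a NaN: set the quiet bit and
   keep the rest of the payload.  */
static uint64_t
quiet_nan_bits (uint64_t bits)
{
  assert (is_nan_bits (bits));
  return bits | QUIET_BIT;
}
```

A conversion that quiets the NaN corresponds to quiet_nan_bits, while a plain
copy leaves the bit pattern untouched. Which of the two a same-mode cast
should be is the open question in this thread.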


Re: [PATCH V4 1/2] rs6000: use li;x?oris to build constant

2022-12-19 Thread Jiufu Guo via Gcc-patches
Hi,

Segher Boessenkool  writes:

> Hi!
>
> On Mon, Dec 12, 2022 at 09:38:28AM +0800, Jiufu Guo wrote:
>>  PR target/106708
>> 
>> gcc/ChangeLog:
>> 
>>  * config/rs6000/rs6000.cc (rs6000_emit_set_long_const): Add using
>>  "li; x?oris" to build constant.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>>  * gcc.target/powerpc/pr106708.c: New test.
>
> Okay for trunk with the nits Ke Wen pointed out taken care of.

Thanks! Updated and committed via r13-4771-g97a8e88cd7d225.

BR,
Jeff (Jiufu)
>
> Thanks!
>
>
> Segher


Re: [Patch] gfortran.dg/read_dir.f90: Make PASS on Windows

2022-12-19 Thread Tobias Burnus

And here is a more lightweight variant, suggested by Nightstrike:

Using '.' instead of creating a new directory - and checking for
__WIN32__ instead of __MINGW32__.

The only downside of this variant is that it does not check whether
"close(10,status='delete')" will delete a directory without failing with
an error. – If the latter makes sense, I think a follow-up check should
be added to ensure the directory has indeed been removed by 'close'.

Thoughts about which variant is better? Other suggestions or comments?

Tobias

PS: On my x86-64 Linux, OPEN works but READ fails with EISDIR/errno == 21.

On 19.12.22 10:09, Tobias Burnus wrote:

As discussed in #gfortran IRC, on Windows opening a directory fails
with EACCES.
(It works under Cygwin - nightstrike was so kind as to test this.)

Additionally, '[ -d dir ] || mkdir dir' is also not very portable.

Hence, I use an auxiliary C file calling the POSIX functions and
expect a failure on non-Cygwin Windows.

Comments? Suggestions? - If there aren't any, I plan to commit it
as obvious tomorrow.

-
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634
Munich; limited liability company (GmbH); Managing Directors: Thomas
Heurung, Frank Thürauf; Registered office: Munich; Commercial register:
Munich, HRB 106955
gfortran.dg/read_dir.f90: Make PASS on Windows

Avoid calling the shell with POSIX syntax and use '.' instead.
Additionally, expect a failure on non-Cygwin Windows, as opening a directory
is documented to fail with EACCES.

gcc/testsuite/ChangeLog:

	* gfortran.dg/read_dir.f90: Open '.' instead of a freshly created
	directory; expect error on Windows when opening a directory.

 gcc/testsuite/gfortran.dg/read_dir.f90 | 21 ++---
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/gcc/testsuite/gfortran.dg/read_dir.f90 b/gcc/testsuite/gfortran.dg/read_dir.f90
index c7ddc51fb90..c91d0f78413 100644
--- a/gcc/testsuite/gfortran.dg/read_dir.f90
+++ b/gcc/testsuite/gfortran.dg/read_dir.f90
@@ -1,20 +1,27 @@
 ! { dg-do run }
+! { dg-additional-options "-cpp" }
+!
 ! PR67367
+
 program bug
implicit none
character(len=1) :: c
-   character(len=256) :: message
integer ios
-   call system('[ -d junko.dir ] || mkdir junko.dir')
-   open(unit=10, file='junko.dir',iostat=ios,action='read',access='stream')
+   open(unit=10, file='.',iostat=ios,action='read',access='stream')
+
+#if defined(__WIN32__) && !defined(__CYGWIN__)
+   ! Windows is documented to fail with EACCESS when trying to open a directory
+   if (ios == 0) &
+  stop 3  ! Expected EACCESS
+   stop 0  ! OK
+#endif   
+
if (ios.ne.0) then
-  call system('rmdir junko.dir')
   STOP 1
end if
read(10, iostat=ios) c
-   if (ios.ne.21.and.ios.ne.0) then 
-  close(10, status='delete')
+   close(10)
+   if (ios.ne.21.and.ios.ne.0) then  ! EISDIR has often the value 21
   STOP 2
end if
-   close(10, status='delete')
 end program bug
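
The behavior noted in the PS (OPEN works, READ fails with EISDIR) can be
reproduced directly with the POSIX calls. The following is a minimal sketch
in C, assuming a POSIX system; read_one_byte_errno is a hypothetical helper
and not part of the testsuite.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Open PATH read-only and try to read one byte from it.
   Returns -1 if the open itself fails, 0 if the read succeeds or
   hits end-of-file, and otherwise the errno value set by read().  */
static int
read_one_byte_errno (const char *path)
{
  assert (path != NULL);

  int fd = open (path, O_RDONLY);
  if (fd < 0)
    return -1;

  char c;
  ssize_t n = read (fd, &c, 1);
  int err = (n < 0) ? errno : 0;  /* Save errno before close can change it.  */
  close (fd);
  return err;
}
```

Calling read_one_byte_errno (".") mirrors the Fortran test: the test accepts
either 0 or EISDIR from the read, and treats a failing open as an error
except on non-Cygwin Windows.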


Re: [PATCH v2 02/11] riscv: Restructure callee-saved register save/restore code

2022-12-19 Thread Christoph Müllner
On Mon, Dec 19, 2022 at 10:26 AM Kito Cheng  wrote:

> Something like this:
>
> static unsigned int
> riscv_next_saved_reg (unsigned int regno, unsigned int limit,
>  HOST_WIDE_INT *offset, bool inc = true)
> {
>   if (inc)
> regno++;
>
>   while (regno <= limit)
> {
>   if (BITSET_P (cfun->machine->frame.mask, regno - GP_REG_FIRST))
>{
>  *offset = *offset - UNITS_PER_WORD;
>  break;
>}
>
>   regno++;
> }
>   if (regno > limit)
> return INVALID_REGNUM;
>   else
> return regno;
> }
> ...
>
>   for (regno = riscv_next_saved_reg (start, limit, &offset, false);
>regno != INVALID_REGNUM;
>regno = riscv_next_saved_reg (regno, limit, &offset))
> {
> ...
>
>
Ok, I see.
I changed it as follows (it will be retested before committing):

@@ -5531,7 +5531,8 @@ riscv_save_restore_reg (machine_mode mode, int regno,

 /* Return the next register up from REGNO up to LIMIT for the callee
to save or restore.  OFFSET will be adjusted accordingly.
-   If INC is set, then REGNO will be incremented first.  */
+   If INC is set, then REGNO will be incremented first.
+   Returns INVALID_REGNUM if there is no such next register.  */

 static unsigned int
 riscv_next_saved_reg (unsigned int regno, unsigned int limit,
@@ -5545,12 +5546,12 @@ riscv_next_saved_reg (unsigned int regno, unsigned
int limit,
   if (BITSET_P (cfun->machine->frame.mask, regno - GP_REG_FIRST))
{
  *offset = *offset - UNITS_PER_WORD;
- break;
+ return regno;
}

   regno++;
 }
-  return regno;
+  return INVALID_REGNUM;
 }

 /* Return TRUE if provided REGNO is eh return data register.  */
@@ -5589,7 +5590,7 @@ riscv_for_each_saved_reg (poly_int64 sp_offset,
riscv_save_restore_fn fn,
   offset = (cfun->machine->frame.gp_sp_offset - sp_offset).to_constant ()
   + UNITS_PER_WORD;
   for (regno = riscv_next_saved_reg (start, limit, &offset, false);
-   regno <= limit;
+   regno != INVALID_REGNUM;

Thanks!
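
The sentinel-based iteration agreed on above can be tried outside GCC with a
small standalone C sketch. All names here (next_saved_reg, count_saved_regs,
WORD_SIZE, the 32-bit mask parameter) are hypothetical stand-ins for
cfun->machine->frame.mask, UNITS_PER_WORD and friends; only the control flow
mirrors the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-ins for GCC internals, for illustration only.  */
#define INVALID_REGNUM UINT_MAX
#define WORD_SIZE 8

/* Return the next register in [REGNO, LIMIT] whose bit is set in MASK,
   or INVALID_REGNUM if there is none.  If INC is nonzero, REGNO is
   incremented first.  On success *OFFSET is decreased by one word.  */
static unsigned int
next_saved_reg (uint32_t mask, unsigned int regno, unsigned int limit,
                long *offset, int inc)
{
  assert (limit < 32);  /* Keeps the shift below well defined.  */

  if (inc)
    regno++;

  while (regno <= limit)
    {
      if (mask & (UINT32_C (1) << regno))
        {
          *offset -= WORD_SIZE;
          return regno;
        }
      regno++;
    }
  return INVALID_REGNUM;
}

/* Walk all saved registers with the sentinel-terminated loop and
   return how many were visited.  */
static unsigned int
count_saved_regs (uint32_t mask, unsigned int limit, long *offset)
{
  unsigned int n = 0;

  for (unsigned int regno = next_saved_reg (mask, 0, limit, offset, 0);
       regno != INVALID_REGNUM;
       regno = next_saved_reg (mask, regno, limit, offset, 1))
    n++;
  return n;
}
```

Returning from inside the loop means a register equal to LIMIT is still
reported, and the caller only needs to compare against the sentinel.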



> On Mon, Dec 19, 2022 at 5:21 PM Christoph Müllner
>  wrote:
> >
> >
> >
> > On Mon, Dec 19, 2022 at 7:30 AM Kito Cheng 
> wrote:
> >>
> >> just one more nit: Use INVALID_REGNUM as sentinel value for
> >> riscv_next_saved_reg, otherwise LGTM, and feel free to commit that
> >> separately :)
> >
> >
> > Would this change below be ok?
> >
> > @@ -5540,7 +5540,7 @@ riscv_next_saved_reg (unsigned int regno, unsigned
> int limit,
> >if (inc)
> >  regno++;
> >
> > -  while (regno <= limit)
> > +  while (regno <= limit && regno != INVALID_REGNUM)
> >  {
> >if (BITSET_P (cfun->machine->frame.mask, regno - GP_REG_FIRST))
> > {
> >
> > Thanks,
> > Christoph
> >
> >
> >>
> >>
> >> On Mon, Dec 19, 2022 at 9:08 AM Christoph Muellner
> >>  wrote:
> >> >
> >> > From: Christoph Müllner 
> >> >
> >> > This patch restructures the loop over the GP registers
> >> > which saves/restores them as part of the prologue/epilogue.
> >> > No functional change is intended by this patch, but it
> >> > offers the possibility to use load-pair/store-pair instructions.
> >> >
> >> > gcc/ChangeLog:
> >> >
> >> > * config/riscv/riscv.cc (riscv_next_saved_reg): New function.
> >> > (riscv_is_eh_return_data_register): New function.
> >> > (riscv_for_each_saved_reg): Restructure loop.
> >> >
> >> > Signed-off-by: Christoph Müllner 
> >> > ---
> >> >  gcc/config/riscv/riscv.cc | 94
> +++
> >> >  1 file changed, 66 insertions(+), 28 deletions(-)
> >> >
> >> > diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> >> > index 6dd2ab2d11e..a8d5e1dac7f 100644
> >> > --- a/gcc/config/riscv/riscv.cc
> >> > +++ b/gcc/config/riscv/riscv.cc
> >> > @@ -4835,6 +4835,49 @@ riscv_save_restore_reg (machine_mode mode, int
> regno,
> >> >fn (gen_rtx_REG (mode, regno), mem);
> >> >  }
> >> >
> >> > +/* Return the next register up from REGNO up to LIMIT for the callee
> >> > +   to save or restore.  OFFSET will be adjusted accordingly.
> >> > +   If INC is set, then REGNO will be incremented first.  */
> >> > +
> >> > +static unsigned int
> >> > +riscv_next_saved_reg (unsigned int regno, unsigned int limit,
> >> > + HOST_WIDE_INT *offset, bool inc = true)
> >> > +{
> >> > +  if (inc)
> >> > +regno++;
> >> > +
> >> > +  while (regno <= limit)
> >> > +{
> >> > +  if (BITSET_P (cfun->machine->frame.mask, regno - GP_REG_FIRST))
> >> > +   {
> >> > + *offset = *offset - UNITS_PER_WORD;
> >> > + break;
> >> > +   }
> >> > +
> >> > +  regno++;
> >> > +}
> >> > +  return regno;
> >> > +}
> >> > +
> >> > +/* Return TRUE if provided REGNO is eh return data register.  */
> >> > +
> >> > +static bool
> >> > +riscv_is_eh_return_data_register (unsigned int regno)
> >> > +{
> >> > +  unsigned int i, regnum;
> >> > +
> >> > +  if (!crtl->calls_eh_return)
> >> > +return false;
> >> > +
> >> > + 

[committed] testsuite: Fix up pr107397.f90 test [PR107397]

2022-12-19 Thread Jakub Jelinek via Gcc-patches
On Sat, Dec 17, 2022 at 09:12:43AM -0800, Jerry D via Gcc-patches wrote:
> The attached patch fixes a regression and is a patch from Steve.  I have
> regression tested it and provided a test case.  It is fairly simple and I
> will commit under the "simple" rule in a little while.
> 
> Thanks Steve for Patch. Thanks Harald for helping me get back up to speed on
> the git magic.

The pr107397.f90 test FAILs for me.  One problem was that the
added diagnostic has an indefinite article before BOZ, but
the test's dg-error didn't.  The other problem was that in the
other dg-error there was no space between the string and the closing
}, so it was completely ignored and the error was an excess
error.

2022-12-19  Jakub Jelinek  

PR fortran/107397
* gfortran.dg/pr107397.f90: Adjust expected diagnostic wording and
add space between dg-error string and closing }.

--- gcc/testsuite/gfortran.dg/pr107397.f90.jj   2022-12-19 11:09:13.793166473 +0100
+++ gcc/testsuite/gfortran.dg/pr107397.f90  2022-12-19 11:23:02.981322107 +0100
@@ -4,6 +4,6 @@ program p
   type t
 real :: a = 1.0
   end type
-  type(t), parameter :: x = z'1' ! { dg-error "incompatible with BOZ" }
-  x%a = x%a + 2 ! { dg-error "has no IMPLICIT type"}
+  type(t), parameter :: x = z'1' ! { dg-error "incompatible with a BOZ" }
+  x%a = x%a + 2 ! { dg-error "has no IMPLICIT type" }
 end


Jakub



Re: [PATCH v2 02/11] riscv: Restructure callee-saved register save/restore code

2022-12-19 Thread Christoph Müllner
On Mon, Dec 19, 2022 at 7:30 AM Kito Cheng  wrote:

> just one more nit: Use INVALID_REGNUM as sentinel value for
> riscv_next_saved_reg, otherwise LGTM, and feel free to commit that
> separately :)
>

Would this change below be ok?

@@ -5540,7 +5540,7 @@ riscv_next_saved_reg (unsigned int regno, unsigned
int limit,
   if (inc)
 regno++;

-  while (regno <= limit)
+  while (regno <= limit && regno != INVALID_REGNUM)
 {
   if (BITSET_P (cfun->machine->frame.mask, regno - GP_REG_FIRST))
{

Thanks,
Christoph



>
> On Mon, Dec 19, 2022 at 9:08 AM Christoph Muellner
>  wrote:
> >
> > From: Christoph Müllner 
> >
> > This patch restructures the loop over the GP registers
> > which saves/restores them as part of the prologue/epilogue.
> > No functional change is intended by this patch, but it
> > offers the possibility to use load-pair/store-pair instructions.
> >
> > gcc/ChangeLog:
> >
> > * config/riscv/riscv.cc (riscv_next_saved_reg): New function.
> > (riscv_is_eh_return_data_register): New function.
> > (riscv_for_each_saved_reg): Restructure loop.
> >
> > Signed-off-by: Christoph Müllner 
> > ---
> >  gcc/config/riscv/riscv.cc | 94 +++
> >  1 file changed, 66 insertions(+), 28 deletions(-)
> >
> > diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> > index 6dd2ab2d11e..a8d5e1dac7f 100644
> > --- a/gcc/config/riscv/riscv.cc
> > +++ b/gcc/config/riscv/riscv.cc
> > @@ -4835,6 +4835,49 @@ riscv_save_restore_reg (machine_mode mode, int
> regno,
> >fn (gen_rtx_REG (mode, regno), mem);
> >  }
> >
> > +/* Return the next register up from REGNO up to LIMIT for the callee
> > +   to save or restore.  OFFSET will be adjusted accordingly.
> > +   If INC is set, then REGNO will be incremented first.  */
> > +
> > +static unsigned int
> > +riscv_next_saved_reg (unsigned int regno, unsigned int limit,
> > + HOST_WIDE_INT *offset, bool inc = true)
> > +{
> > +  if (inc)
> > +regno++;
> > +
> > +  while (regno <= limit)
> > +{
> > +  if (BITSET_P (cfun->machine->frame.mask, regno - GP_REG_FIRST))
> > +   {
> > + *offset = *offset - UNITS_PER_WORD;
> > + break;
> > +   }
> > +
> > +  regno++;
> > +}
> > +  return regno;
> > +}
> > +
> > +/* Return TRUE if provided REGNO is eh return data register.  */
> > +
> > +static bool
> > +riscv_is_eh_return_data_register (unsigned int regno)
> > +{
> > +  unsigned int i, regnum;
> > +
> > +  if (!crtl->calls_eh_return)
> > +return false;
> > +
> > +  for (i = 0; (regnum = EH_RETURN_DATA_REGNO (i)) != INVALID_REGNUM;
> i++)
> > +if (regno == regnum)
> > +  {
> > +   return true;
> > +  }
> > +
> > +  return false;
> > +}
> > +
> >  /* Call FN for each register that is saved by the current function.
> > SP_OFFSET is the offset of the current stack pointer from the start
> > of the frame.  */
> > @@ -4844,36 +4887,31 @@ riscv_for_each_saved_reg (poly_int64 sp_offset,
> riscv_save_restore_fn fn,
> >   bool epilogue, bool maybe_eh_return)
> >  {
> >HOST_WIDE_INT offset;
> > +  unsigned int regno;
> > +  unsigned int start = GP_REG_FIRST;
> > +  unsigned int limit = GP_REG_LAST;
> >
> >/* Save the link register and s-registers. */
> > -  offset = (cfun->machine->frame.gp_sp_offset - sp_offset).to_constant
> ();
> > -  for (unsigned int regno = GP_REG_FIRST; regno <= GP_REG_LAST; regno++)
> > -if (BITSET_P (cfun->machine->frame.mask, regno - GP_REG_FIRST))
> > -  {
> > -   bool handle_reg =
> !cfun->machine->reg_is_wrapped_separately[regno];
> > -
> > -   /* If this is a normal return in a function that calls the
> eh_return
> > -  builtin, then do not restore the eh return data registers as
> that
> > -  would clobber the return value.  But we do still need to save
> them
> > -  in the prologue, and restore them for an exception return, so
> we
> > -  need special handling here.  */
> > -   if (epilogue && !maybe_eh_return && crtl->calls_eh_return)
> > - {
> > -   unsigned int i, regnum;
> > -
> > -   for (i = 0; (regnum = EH_RETURN_DATA_REGNO (i)) !=
> INVALID_REGNUM;
> > -i++)
> > - if (regno == regnum)
> > -   {
> > - handle_reg = FALSE;
> > - break;
> > -   }
> > - }
> > -
> > -   if (handle_reg)
> > - riscv_save_restore_reg (word_mode, regno, offset, fn);
> > -   offset -= UNITS_PER_WORD;
> > -  }
> > +  offset = (cfun->machine->frame.gp_sp_offset - sp_offset).to_constant
> ()
> > +  + UNITS_PER_WORD;
> > +  for (regno = riscv_next_saved_reg (start, limit, &offset, false);
> > +   regno <= limit;
> > +   regno = riscv_next_saved_reg (regno, limit, &offset))
> > +{
> > +  if (cfun->machine->reg_is_wrapped_separately[regno])
> > +   continue;
> > +
> > +

Ping^2: [PATCH] jit: Install jit headers in $(libsubincludedir) [PR 101491]

2022-12-19 Thread Lorenzo Salvadore via Gcc-patches
Hello,

Ping https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606450.html

Thanks,

Lorenzo Salvadore

> From f8e2c2ee89a7d8741bb65163d1f1c20edcd546ac Mon Sep 17 00:00:00 2001
> From: Lorenzo Salvadore develo...@lorenzosalvadore.it
> 
> Date: Wed, 16 Nov 2022 11:27:38 +0100
> Subject: [PATCH] jit: Install jit headers in $(libsubincludedir) [PR 101491]
> 
> Installing jit/libgccjit.h and jit/libgccjit++.h headers in
> $(includedir) can be a problem for machines where multiple versions of
> GCC are required simultaneously, see for example this bug report on
> FreeBSD:
> 
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=257060
> 
> Hence,
> 
> - define $(libsubincludedir) the same way it is defined in libgomp;
> - install jit/libgccjit.h and jit/libgccjit++.h in $(libsubincludedir).
> 
> The patch has already been applied successfully in the official FreeBSD
> ports tree for the ports lang/gcc11 and lang/gcc12. Please see the
> following commits:
> 
> https://cgit.freebsd.org/ports/commit/?id=0338e04504ee269b7a95e6707f1314bc1c4239fe
> https://cgit.freebsd.org/ports/commit/?id=f1957296ed2dce8a09bb9582e9a5a715bf8b3d4d
> 
> gcc/ChangeLog:
> 
> 2022-11-16 Lorenzo Salvadore develo...@lorenzosalvadore.it
> 
> 
> PR jit/101491
> * Makefile.in: Define and create $(libsubincludedir)
> 
> gcc/jit/ChangeLog:
> 
> 2022-11-16 Lorenzo Salvadore develo...@lorenzosalvadore.it
> 
> 
> PR jit/101491
> * Make-lang.in: Install headers in $(libsubincludedir)
> ---
> gcc/Makefile.in | 3 +++
> gcc/jit/Make-lang.in | 4 ++--
> 2 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index f672e6ea549..3bcf1c491ab 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -635,6 +635,8 @@ libexecdir = @libexecdir@
> 
> # Directory in which the compiler finds libraries etc.
> libsubdir = 
> $(libdir)/gcc/$(real_target_noncanonical)/$(version)$(accel_dir_suffix)
> +# Directory in which the compiler finds headers.
> +libsubincludedir = $(libdir)/gcc/$(target_alias)/$(version)/include
> # Directory in which the compiler finds executables
> libexecsubdir = 
> $(libexecdir)/gcc/$(real_target_noncanonical)/$(version)$(accel_dir_suffix)
> # Directory in which all plugin resources are installed
> @@ -3642,6 +3644,7 @@ install-cpp: installdirs cpp$(exeext)
> # $(libdir)/gcc/include isn't currently searched by cpp.
> installdirs:
> $(mkinstalldirs) $(DESTDIR)$(libsubdir)
> + $(mkinstalldirs) $(DESTDIR)$(libsubincludedir)
> $(mkinstalldirs) $(DESTDIR)$(libexecsubdir)
> $(mkinstalldirs) $(DESTDIR)$(bindir)
> $(mkinstalldirs) $(DESTDIR)$(includedir)
> diff --git a/gcc/jit/Make-lang.in b/gcc/jit/Make-lang.in
> index 248ec45b729..ba1b3e95da5 100644
> --- a/gcc/jit/Make-lang.in
> +++ b/gcc/jit/Make-lang.in
> @@ -360,9 +360,9 @@ selftest-jit:
> # Install hooks:
> jit.install-headers: installdirs
> $(INSTALL_DATA) $(srcdir)/jit/libgccjit.h \
> - $(DESTDIR)$(includedir)/libgccjit.h
> + $(DESTDIR)$(libsubincludedir)/libgccjit.h
> $(INSTALL_DATA) $(srcdir)/jit/libgccjit++.h \
> - $(DESTDIR)$(includedir)/libgccjit++.h
> + $(DESTDIR)$(libsubincludedir)/libgccjit++.h
> 
> ifneq (,$(findstring mingw,$(target)))
> jit.install-common: installdirs jit.install-headers
> --
> 2.38.0


[PATCH (pushed)] gcc-changelog: stop using --flake8

2022-12-19 Thread Martin Liška

The flake8 pytest plug-in is broken and we should not use it.

contrib/ChangeLog:

* gcc-changelog/setup.cfg: Do not use flake8 pytest plug-in.
---
 contrib/gcc-changelog/setup.cfg | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/gcc-changelog/setup.cfg b/contrib/gcc-changelog/setup.cfg
index efc313f6d52..a606029e25f 100644
--- a/contrib/gcc-changelog/setup.cfg
+++ b/contrib/gcc-changelog/setup.cfg
@@ -2,4 +2,4 @@
 max-line-length = 120
 
 [tool:pytest]

-addopts = -vv --flake8
+addopts = -vv
--
2.39.0



Re: [PATCH] fold-const: Treat fp conversion to a type with same mode as copy

2022-12-19 Thread Kewen.Lin via Gcc-patches
Hi Richi,

Thanks for the comments!

on 2022/12/19 16:49, Richard Biener wrote:
> On Mon, Dec 19, 2022 at 9:12 AM Kewen.Lin  wrote:
>>
>> Hi,
>>
>> In function fold_convert_const_real_from_real, when the modes of
>> two types involved in fp conversion are the same, we can simply
>> take it as copy, rebuild with the exactly same TREE_REAL_CST and
>> the target type.  It is more efficient and helps to avoid possible
>> unexpected signalling bit clearing in [1].
>>
>> Bootstrapped and regtested on x86_64-redhat-linux, aarch64-linux-gnu
>> and powerpc64{,le}-linux-gnu.
>>
>> Is it ok for trunk?
> 
> But shouldn't
> 
> double x = (double) __builtin_nans("sNAN");
> 
> result in a quiet NaN?

IMHO, this treatment is still a tiny enhancement even without
considering signalling NaNs.

I had the same doubt about whether the built-in should return a qNaN without
the explicit option -fsignaling-nans.  Checking the manual, it says

"Compile code assuming that IEEE signaling NaNs may generate user-visible
traps during floating-point operations."

For the default -fno-signaling-nans, it doesn't clearly exclude the existence
of sNaNs, and generating an sNaN doesn't seem to be a normal floating
point operation that can cause a trap.  So I was inclined to believe that
the current behavior is expected.  Besides, glibc uses this built-in
for its SNAN macro, so it does not seem a good idea to return a qNaN for it
in some cases.

BR,
Kewen

> 
>> [1] https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608533.html
>>
>> gcc/ChangeLog:
>>
>> * fold-const.cc (fold_convert_const_real_from_real): Treat floating
>> point conversion to a type with same mode as copy instead of normal
>> convertFormat.
>> ---
>>  gcc/fold-const.cc | 9 +
>>  1 file changed, 9 insertions(+)
>>
>> diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
>> index 114258fa182..eb4b6ca8820 100644
>> --- a/gcc/fold-const.cc
>> +++ b/gcc/fold-const.cc
>> @@ -2178,6 +2178,15 @@ fold_convert_const_real_from_real (tree type, 
>> const_tree arg1)
>>REAL_VALUE_TYPE value;
>>tree t;
>>
>> +  /* If the underlying modes are the same, simply treat it as
>> + copy and rebuild with TREE_REAL_CST information and the
>> + given type.  */
>> +  if (TYPE_MODE (type) == TYPE_MODE (TREE_TYPE (arg1)))
>> +{
>> +  t = build_real (type, TREE_REAL_CST (arg1));
>> +  return t;
>> +}
>> +
>>/* Don't perform the operation if flag_signaling_nans is on
>>   and the operand is a signaling NaN.  */
>>if (HONOR_SNANS (arg1)
>> --
>> 2.27.0


[Patch] gfortran.dg/read_dir.f90: Make PASS on Windows

2022-12-19 Thread Tobias Burnus

As discussed in #gfortran IRC, on Windows opening a directory fails with
EACCES.
(It works under Cygwin - nightstrike was so kind as to test this.)

Additionally, '[ -d dir ] || mkdir dir' is also not very portable.

Hence, I use an auxiliary C file calling the POSIX functions and
expect a failure on non-Cygwin Windows.

Comments? Suggestions? - If there aren't any, I plan to commit it
as obvious tomorrow.

Tobias
gfortran.dg/read_dir.f90: Make PASS on Windows

Call POSIX's stat/mkdir/rmdir instead of using the shell via 'call system'.
Additionally, expect EACCES on non-Cygwin Windows as documented for trying
to open a directory.

gcc/testsuite/ChangeLog:

	* gfortran.dg/read_dir-aux.c: New; provides my_mkdir and my_rmdir.
	* gfortran.dg/read_dir.f90: Call my_mkdir/my_rmdir; expect
	error on Windows when opening a directory.

 gcc/testsuite/gfortran.dg/read_dir-aux.c | 39 +
 gcc/testsuite/gfortran.dg/read_dir.f90   | 43 
 2 files changed, 77 insertions(+), 5 deletions(-)

diff --git a/gcc/testsuite/gfortran.dg/read_dir-aux.c b/gcc/testsuite/gfortran.dg/read_dir-aux.c
new file mode 100644
index 000..e8404478517
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/read_dir-aux.c
@@ -0,0 +1,39 @@
+#include <sys/stat.h>  /* For mkdir + permission bits.  */
+#include <unistd.h>  /* For rmdir.  */
+#include <errno.h>  /* For errno.  */
+#include <stdio.h>  /* For perror.  */
+#include <stdlib.h>  /* For abort.  */
+ 
+
+void
+my_mkdir (const char *dir)
+{
+  int err;
+  struct stat path_stat;
+
+  /* Check whether 'dir' exists and is a directory.  */
+  err = stat (dir, &path_stat);
+  if (err && errno != ENOENT)
+{
+  perror ("my_mkdir: failed to call stat for directory");
+  abort ();
+}
+  if (err == 0 && !S_ISDIR (path_stat.st_mode))
+{
+  printf ("my_mkdir: pathname %s is not a directory\n", dir);
+  abort ();
+}
+
+  err = mkdir (dir, S_IRWXU | S_IRGRP | S_IXGRP | S_IROTH | S_IXOTH);
+  if (err != 0)
+{
+  perror ("my_mkdir: failed to create directory");
+  abort ();
+}
+}
+
+void
+my_rmdir (const char *dir)
+{
+  rmdir (dir);
+}
diff --git a/gcc/testsuite/gfortran.dg/read_dir.f90 b/gcc/testsuite/gfortran.dg/read_dir.f90
index c7ddc51fb90..3a8ff6adbc7 100644
--- a/gcc/testsuite/gfortran.dg/read_dir.f90
+++ b/gcc/testsuite/gfortran.dg/read_dir.f90
@@ -1,18 +1,51 @@
 ! { dg-do run }
+! { dg-additional-options "-cpp" }
+! { dg-additional-sources read_dir-aux.c }
+!
 ! PR67367
+
 program bug
+   use iso_c_binding
implicit none
+
+   interface
+ subroutine my_mkdir(s) bind(C)
+   ! Call POSIX's mkdir - and ignore fails due to
+   ! existing directories but fail otherwise
+   import
+   character(len=1,kind=c_char) :: s(*)
+ end subroutine
+ subroutine my_rmdir(s) bind(C)
+   ! Call POSIX's rmdir - and ignore fails
+   import
+   character(len=1,kind=c_char) :: s(*)
+ end subroutine
+   end interface
+
+   character(len=*), parameter :: sdir = "junko.dir"
+   character(len=*,kind=c_char), parameter :: c_sdir = sdir // c_null_char
+
character(len=1) :: c
-   character(len=256) :: message
integer ios
-   call system('[ -d junko.dir ] || mkdir junko.dir')
-   open(unit=10, file='junko.dir',iostat=ios,action='read',access='stream')
+
+   call my_mkdir(c_sdir)
+   open(unit=10, file=sdir,iostat=ios,action='read',access='stream')
+
+#if defined(__MINGW32__)
+   ! Windows is documented to fail with EACCESS when trying to open a directory
+   ! Note: Testing showed that __CYGWIN__ does permit opening directories
+   call my_rmdir(c_sdir)
+   if (ios == 0) &
+  stop 3  ! Expected EACCESS
+   stop 0  ! OK
+#endif   
+
if (ios.ne.0) then
-  call system('rmdir junko.dir')
+  call my_rmdir(c_sdir)
   STOP 1
end if
read(10, iostat=ios) c
-   if (ios.ne.21.and.ios.ne.0) then 
+   if (ios.ne.21.and.ios.ne.0) then  ! EISDIR has often the value 21
   close(10, status='delete')
   STOP 2
end if


Re: [PATCH] fold-const: Treat fp conversion to a type with same mode as copy

2022-12-19 Thread Jakub Jelinek via Gcc-patches
On Mon, Dec 19, 2022 at 09:49:36AM +0100, Richard Biener wrote:
> On Mon, Dec 19, 2022 at 9:12 AM Kewen.Lin  wrote:
> > In function fold_convert_const_real_from_real, when the modes of
> > two types involved in fp conversion are the same, we can simply
> > take it as copy, rebuild with the exactly same TREE_REAL_CST and
> > the target type.  It is more efficient and helps to avoid possible
> > unexpected signalling bit clearing in [1].
> >
> > Bootstrapped and regtested on x86_64-redhat-linux, aarch64-linux-gnu
> > and powerpc64{,le}-linux-gnu.
> >
> > Is it ok for trunk?
> 
> But shouldn't
> 
> double x = (double) __builtin_nans("sNAN");
> 
> result in a quiet NaN?

I think it doesn't already, fold_convert_const starts with
  tree arg_type = TREE_TYPE (arg1);
  if (arg_type == type)
return arg1;
and so if it is a conversion to the same type, fold_convert_const_real_from_real
won't be called at all.
It is just cast to a different type with same mode (say typedef double doublet)
that triggers it right now.
Try:

double
foo (void)
{
  return __builtin_nans ("");
}

double
bar (void)
{
  return (double) __builtin_nans ("");
}

typedef double doublet;

doublet
baz (void)
{
  return __builtin_nans ("");
}

doublet
qux (void)
{
  return (doublet) __builtin_nans ("");
}

float
corge (void)
{
  return (float) __builtin_nans ("");
}

GCC right now returns a sNaN in foo and bar and qNaN in baz, qux and corge,
clang and ICC (tried 19.0.1 on godbolt) return sNaN in foo, bar, baz, qux
and qNaN in corge.
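
To check which kind of NaN a given compiler produced for the functions above, one option is to look at the stored bit pattern directly. The following is only a sketch: `classify_bits` and `classify_double` are made-up names, and it assumes binary64 doubles with the usual encoding in which a set top mantissa bit means "quiet" (legacy MIPS and PA-RISC use the opposite convention).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum nan_kind { NOT_NAN, QUIET_NAN, SIGNALING_NAN };

/* Classify a raw binary64 bit pattern.  Assumes the common encoding
   where a set top mantissa bit means "quiet".  */
enum nan_kind
classify_bits (uint64_t bits)
{
  const uint64_t exp_mask = UINT64_C (0x7ff0000000000000);
  const uint64_t frac_mask = UINT64_C (0x000fffffffffffff);
  const uint64_t quiet_bit = UINT64_C (0x0008000000000000);

  if ((bits & exp_mask) != exp_mask || (bits & frac_mask) == 0)
    return NOT_NAN;             /* Finite value or infinity.  */
  return (bits & quiet_bit) ? QUIET_NAN : SIGNALING_NAN;
}

/* Same, for a double object; memcpy reads the stored bytes without any
   floating-point move or conversion that might quiet the NaN.  */
enum nan_kind
classify_double (const double *p)
{
  uint64_t bits;

  assert (sizeof bits == sizeof *p);
  memcpy (&bits, p, sizeof bits);
  return classify_bits (bits);
}
```

Storing the result of foo, bar, baz or qux into a volatile double and passing its address to `classify_double` would show whether the signaling bit survived; the outcome depends on the compiler and on whether the patch is applied.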

The last case is required by C:
convertFormat - different formats   cast and implicit conversions   6.3.1.5, 6.5.4
convertFormat - same format         canonicalize                    7.12.11.7, F.10.8.7
As for the rest, C n3047.pdf has:
Whether C assignment (6.5.16) (and conversion as if by assignment) to the
same format is an IEC 60559 convertFormat or copy operation 439) is
implementation-defined, even if <fenv.h> defines the macro
FE_SNANS_ALWAYS_SIGNAL (F.2.1). If the return expression of a return
statement is evaluated to the floating-point format of the return type, it
is implementation-defined whether a convertFormat operation is applied to
the result of the return expression.

439) Where the source and destination formats are the same, convertFormat
operations differ from copy operations in that convertFormat operations
raise the "invalid" floating-point exception on signaling NaN inputs and do
not propagate non-canonical encodings.

I think the posted patch is good for consistency, treating conversion to the
same format sometimes as convertFormat and sometimes as copy is maybe valid
but confusing, especially when on:

double
foo (double x)
{
  return x;
}

double
bar (double x)
{
  return (double) x;
}

typedef double doublet;

doublet
baz (double x)
{
  return x;
}

doublet
qux (double x)
{
  return (doublet) x;
}

float
corge (double x)
{
  return (float) x;
}
we actually use copy operations in all of foo, bar, baz and qux.

Jakub



Re: [Patch] gfortran.dg/read_dir.f90: Make PASS on Windows

2022-12-19 Thread Tobias Burnus

On 19.12.22 10:26, Tobias Burnus wrote:

And here is a more lightweight variant, suggested by Nightstrike:

Using '.' instead of creating a new directory - and checking for
__WIN32__ instead of __MINGW32__.

The only downside of this variant is that it does not check whether
"close(10,status='delete')" will delete a directory without failing with
an error. – If the latter makes sense, I think a follow-up check should
be added to ensure the directory has indeed been removed by 'close'.


I have now updated the heavy version. The #if check moved to C as those
macros aren't set in Fortran. (That's now https://gcc.gnu.org/PR108175 -
I thought that there was a PR before, but I couldn't find any.)

Additionally, on Windows the '.' directory is now opened - avoiding
issues with POSIX functions (and the requirement to use '#include
' etc.). - As OPEN already fails, there is no point in
checking for the rest.

On the non-Windows side, there is now a check that 'CLOSE' with
status='delete' indeed has deleted the directory.


Thoughts about which variant is better? Other suggestions or comments?

^- comments?

PS: On my x86-64 Linux, OPEN works but READ fails with EISDIR/errno == 21.


And thanks to Nightstrike for testing, suggestions and reporting the
issue in the first place.



On 19.12.22 10:09, Tobias Burnus wrote:

As discussed on the #gfortran IRC channel, opening a directory on Windows
fails with EACCES.
(It works under Cygwin; nightstrike was kind enough to test this.)

Additionally, '[ -d dir ] || mkdir dir' is also not very portable.

Hence, I use an auxiliary C file that calls the POSIX functions and
expect a failure on non-Cygwin Windows.

Comments? Suggestions? - If there aren't any, I plan to commit it
as obvious tomorrow.


I don't have a strong preference between the one-file/'.'/smaller solution
and the two-file/mkdir/close-'delete' solution, but I am slightly
inclined towards the one that tests more.

Tobias

gfortran.dg/read_dir.f90: Make PASS on Windows

On non-Cygwin Windows, use '.' and expect the documented failure when
opening a directory (EACCES).  As gfortran does not set __WIN32__, this
check is done on the C side. (On __CYGWIN__, __WIN32__ is not set - but to
make it clear, !__CYGWIN__ is used in #if.)

On non-Windows, replace the 'call system' shell call by the POSIX functions
stat/mkdir/rmdir for better compatibility, especially on embedded systems;
additionally add some more checks. In particular, confirm that 'close' with
status='delete' indeed deleted the directory.

gcc/testsuite/ChangeLog:

	* gfortran.dg/read_dir-aux.c: New; provides my_mkdir, my_rmdir,
	my_verify_not_exists and expect_open_to_fail.
	* gfortran.dg/read_dir.f90: Call those; expect that opening a
	directory fails on Windows.

 gcc/testsuite/gfortran.dg/read_dir-aux.c | 68 
 gcc/testsuite/gfortran.dg/read_dir.f90   | 54 ++---
 2 files changed, 117 insertions(+), 5 deletions(-)

diff --git a/gcc/testsuite/gfortran.dg/read_dir-aux.c b/gcc/testsuite/gfortran.dg/read_dir-aux.c
new file mode 100644
index 000..307b44472af
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/read_dir-aux.c
@@ -0,0 +1,68 @@
+#if defined(__WIN32__) && !defined(__CYGWIN__)
+  /* Mostly skip on Windows, cf. main file why. */
+
+int expect_open_to_fail () { return 1; }
+
+void my_verify_not_exists (const char *dir) { }
+void my_mkdir (const char *dir) { }
+void my_rmdir (const char *dir) { }
+
+#else
+
+#include <sys/stat.h>  /* For mkdir + permission bits.  */
+#include <unistd.h>  /* For rmdir.  */
+#include <errno.h>  /* For errno.  */
+#include <stdio.h>  /* For perror.  */
+#include <stdlib.h>  /* For abort.  */
+ 
+
+int expect_open_to_fail () { return 0; }
+
+void
+my_verify_not_exists (const char *dir)
+{
+  struct stat path_stat;
+  int err = stat (dir, &path_stat);
+  if (err && errno == ENOENT)
+return;  /* OK */
+  if (err)
+perror ("my_verify_not_exists");
+  else
+printf ("my_verify_not_exists: pathname %s still exists\n", dir);
+  abort ();
+ }
+
+void
+my_mkdir (const char *dir)
+{
+  int err;
+  struct stat path_stat;
+
+  /* Check whether 'dir' exists and is a directory.  */
+  err = stat (dir, &path_stat);
+  if (err && errno != ENOENT)
+{
+  perror ("my_mkdir: failed to call stat for directory");
+  abort ();
+}
+  if (err == 0 && !S_ISDIR (path_stat.st_mode))
+{
+  printf ("my_mkdir: pathname %s is not a directory\n", dir);
+  abort ();
+}
+
+  err = mkdir (dir, S_IRWXU | S_IRGRP | S_IXGRP | S_IROTH | S_IXOTH);
+  if (err != 0)
+{
+  perror ("my_mkdir: failed to create directory");
+  abort ();
+}
+}
+
+void
+my_rmdir (const char *dir)
+{
+  rmdir (dir);
+}
+
+#endif  /* !defined(__WIN32__) || defined(__CYGWIN__) */
diff --g

[PATCH] c: Diagnose compound literals with function type [PR108043]

2022-12-19 Thread Jakub Jelinek via Gcc-patches
Hi!

Both C99 and latest C2X say that compound literal shall have an object type
(complete object type in the latter case) or array of unknown bound,
so complit with function type is invalid.  When the initializer had to be
non-empty for such case, we used to diagnose it as incorrect initializer,
but with (fntype){} now allowed we just ICE on it.

The following patch diagnoses that.
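
For contrast, a small sketch of compound literals that remain valid, since their types are object types or arrays of unknown bound. The names `F`, `forty_two`, `call_through_literal` and `literal_array_len` are invented for this example and do not come from the patch or its tests.

```c
#include <assert.h>
#include <stddef.h>

typedef int F (void);           /* A function type, like F in the new test.  */

static int
forty_two (void)
{
  return 42;
}

/* Valid: the compound literal has pointer-to-function type, which is an
   object type.  Writing (F) { forty_two } instead is what the patch
   diagnoses.  */
int
call_through_literal (void)
{
  F *fp = (F *) { forty_two };

  assert (fp != NULL);
  return fp ();
}

/* Valid: array of unknown bound, completed by the initializer list.  */
size_t
literal_array_len (void)
{
  return sizeof ((int []) { 1, 2, 3 }) / sizeof (int);
}
```

Dropping the `*` in `(F *)` turns the first literal into the function-type case, which should now be rejected with "compound literal has function type" instead of an ICE.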

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-12-19  Jakub Jelinek  

PR c/108043
* c-parser.cc (c_parser_postfix_expression_after_paren_type): Diagnose
compound literals with function type.

* gcc.dg/pr108043.c: New test.
* gcc.dg/c99-complit-2.c (foo): Adjust expected diagnostics for
complit with function type.

--- gcc/c/c-parser.cc.jj 2022-11-18 09:00:44.331323558 +0100
+++ gcc/c/c-parser.cc   2022-12-16 13:08:51.143083269 +0100
@@ -10924,6 +10924,11 @@ c_parser_postfix_expression_after_paren_
   error_at (type_loc, "compound literal has variable size");
   type = error_mark_node;
 }
+  else if (TREE_CODE (type) == FUNCTION_TYPE)
+{
+  error_at (type_loc, "compound literal has function type");
+  type = error_mark_node;
+}
   if (constexpr_p && type != error_mark_node)
 {
   tree type_no_array = strip_array_types (type);
--- gcc/testsuite/gcc.dg/pr108043.c.jj  2022-12-16 13:15:40.122083457 +0100
+++ gcc/testsuite/gcc.dg/pr108043.c 2022-12-16 13:15:20.840366320 +0100
@@ -0,0 +1,12 @@
+/* PR c/108043 */
+/* { dg-do compile } */
+/* { dg-options "" } */
+
+typedef void F (void);
+
+void
+foo (void)
+{
+  (F) {};  /* { dg-error "compound literal has function type" } */
+  (F) { foo }; /* { dg-error "compound literal has function type" } */
+}
--- gcc/testsuite/gcc.dg/c99-complit-2.c.jj 2020-01-12 11:54:37.393398623 +0100
+++ gcc/testsuite/gcc.dg/c99-complit-2.c 2022-12-19 11:51:45.098467295 +0100
@@ -23,7 +23,7 @@ foo (int a)
   /* { dg-error "init" "incomplete union type" { target *-*-* } .-1 } */
   /* { dg-error "invalid use of undefined type" "" { target *-*-* } .-2 } */
   (void (void)) { 0 }; /* { dg-bogus "warning" "warning in place of error" } */
-  /* { dg-error "init" "function type" { target *-*-* } .-1 } */
+  /* { dg-error "compound literal has function type" "function type" { target *-*-* } .-1 } */
   (int [a]) { 1 }; /* { dg-bogus "warning" "warning in place of error" } */
   /* { dg-error "init|variable" "VLA type" { target *-*-* } .-1 } */
   /* Initializers must not attempt to initialize outside the object

Jakub



[PATCH] modula2: Fix up bootstrap on powerpc64le-linux [PR108147]

2022-12-19 Thread Jakub Jelinek via Gcc-patches
Hi!

As mentioned in the PR, bootstrap with m2 enabled currently fails
on powerpc64le-linux: we get a weird ICE after printing some diagnostics.
The problem is that mc creates prototypes from *.def like
extern void m2linemap_WarningAtf (m2linemap_location_t location, void * message);
but the actual function definitions use
void m2linemap_WarningAtf (m2linemap_location_t location, void * message,
...) { code }
On powerpc64le-linux such lying about the prototype results in
wrong code: the caller assumes the function isn't varargs and so doesn't
reserve 64 bytes in the frame for it, while the callee relies on that
area being reserved and stores into it.

Fixed by adding non-stdarg wrappers around the stdarg functions (because
we want a va_list to pass to the diagnostics functions).
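
The wrapper pattern can be sketched in plain C as follows. This is an illustration under assumptions, not the m2linemap code: `report`, `report_1` and `last_message` are invented names, the message is stored in a buffer rather than handed to GCC's diagnostics, and the sketch forwards the text through "%s" so that it is safe for any input.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Stand-in for the diagnostics machinery: remembers the last message.  */
char last_message[128];

/* The variadic worker stays internal to the file.  */
static int
report_1 (const char *fmt, ...)
{
  va_list ap;
  int len;

  va_start (ap, fmt);
  len = vsnprintf (last_message, sizeof last_message, fmt, ap);
  va_end (ap);
  return len;
}

/* Exported entry point with a fixed prototype, matching what callers
   were compiled against.  Returns the length of the formatted text.  */
int
report (const char *message)
{
  assert (message != NULL);
  return report_1 ("%s", message);
}
```

Because only `report` is visible to other translation units, a caller that sees the non-variadic prototype and the callee agree on the calling convention, and the variadic call happens where its prototype is in scope.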

Bootstrapped/regtested on {x86_64,i686,powerpc64le,s390x,aarch64}-linux,
ok for trunk?

2022-12-19  Jakub Jelinek  

PR modula2/108147
* m2/gm2-gcc/m2linemap.def (ErrorAtf, WarningAtf, NoteAtf):
Comment out prototypes with varargs.
* m2/gm2-gcc/m2linemap.h (m2linemap_ErrorAtf, m2linemap_WarningAtf,
m2linemap_NoteAtf): No longer varargs.
* m2/gm2-gcc/m2linemap.cc (m2linemap_ErrorAtf): Turned into a
non-varargs wrapper around ...
(m2linemap_ErrorAtf_1): ... this.  New static function.
(m2linemap_WarningAtf): Turned into a non-varargs wrapper around ...
(m2linemap_WarningAtf_1): ... this.  New static function.
(m2linemap_NoteAtf): Turned into a non-varargs wrapper around ...
(m2linemap_NoteAtf_1): ... this.  New static function.

--- gcc/m2/gm2-gcc/m2linemap.def.jj 2022-12-14 20:30:41.110303552 +0100
+++ gcc/m2/gm2-gcc/m2linemap.def 2022-12-16 18:44:04.029951259 +0100
@@ -47,11 +47,6 @@ PROCEDURE GetLineNoFromLocation (locatio
 PROCEDURE GetColumnNoFromLocation (location: location_t) : INTEGER ;
 PROCEDURE GetFilenameFromLocation (location: location_t) : ADDRESS ;
 PROCEDURE ErrorAt (location: location_t; message: ADDRESS) ;
-(*
-PROCEDURE ErrorAtf (location: location_t; message: ADDRESS; ...) ;
-PROCEDURE WarningAtf (location: location_t; message: ADDRESS; ...) ;
-PROCEDURE NoteAtf (location: location_t; message: ADDRESS; ...) ;
-*)
 PROCEDURE ErrorAtf (location: location_t; message: ADDRESS) ;
 PROCEDURE WarningAtf (location: location_t; message: ADDRESS) ;
 PROCEDURE NoteAtf (location: location_t; message: ADDRESS) ;
--- gcc/m2/gm2-gcc/m2linemap.h.jj   2022-12-14 20:30:41.111303537 +0100
+++ gcc/m2/gm2-gcc/m2linemap.h  2022-12-16 18:44:19.443727288 +0100
@@ -55,9 +55,9 @@ EXTERN int m2linemap_GetLineNoFromLocati
 EXTERN int m2linemap_GetColumnNoFromLocation (location_t location);
 EXTERN const char *m2linemap_GetFilenameFromLocation (location_t location);
 EXTERN void m2linemap_ErrorAt (location_t location, char *message);
-EXTERN void m2linemap_ErrorAtf (location_t location, const char *message, ...);
-EXTERN void m2linemap_WarningAtf (location_t location, const char *message, ...);
-EXTERN void m2linemap_NoteAtf (location_t location, const char *message, ...);
+EXTERN void m2linemap_ErrorAtf (location_t location, const char *message);
+EXTERN void m2linemap_WarningAtf (location_t location, const char *message);
+EXTERN void m2linemap_NoteAtf (location_t location, const char *message);
 EXTERN void m2linemap_internal_error (const char *message);
 
 
--- gcc/m2/gm2-gcc/m2linemap.cc.jj  2022-12-14 20:30:41.110303552 +0100
+++ gcc/m2/gm2-gcc/m2linemap.cc 2022-12-16 18:50:00.763767826 +0100
@@ -182,8 +182,8 @@ m2linemap_ErrorAt (location_t location,
 
 /* m2linemap_ErrorAtf - wraps up an error message.  */
 
-void
-m2linemap_ErrorAtf (location_t location, const char *message, ...)
+static void
+m2linemap_ErrorAtf_1 (location_t location, const char *message, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
@@ -195,10 +195,16 @@ m2linemap_ErrorAtf (location_t location,
   va_end (ap);
 }
 
+void
+m2linemap_ErrorAtf (location_t location, const char *message)
+{
+  m2linemap_ErrorAtf_1 (location, message);
+}
+
 /* m2linemap_WarningAtf - wraps up a warning message.  */
 
-void
-m2linemap_WarningAtf (location_t location, const char *message, ...)
+static void
+m2linemap_WarningAtf_1 (location_t location, const char *message, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
@@ -210,10 +216,16 @@ m2linemap_WarningAtf (location_t locatio
   va_end (ap);
 }
 
+void
+m2linemap_WarningAtf (location_t location, const char *message)
+{
+  m2linemap_WarningAtf_1 (location, message);
+}
+
 /* m2linemap_NoteAtf - wraps up a note message.  */
 
-void
-m2linemap_NoteAtf (location_t location, const char *message, ...)
+static void
+m2linemap_NoteAtf_1 (location_t location, const char *message, ...)
 {
   diagnostic_info diagnostic;
   va_list ap;
@@ -225,6 +237,12 @@ m2linemap_NoteAtf (location_t location,
   va_end (ap);
 }
 
+void
+m2linemap_NoteAtf (location_t location, const char *message)
+{
+  m2linemap_NoteAtf_1 (loca

[PATCH] RISC-V: Simplify ASM checks.

2022-12-19 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/vsetvl/vlmax_phi-1.c: Simplify ASM checks.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-10.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-11.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-12.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-13.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-14.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-15.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-16.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-17.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-18.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-19.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-2.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-20.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-21.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-22.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-23.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-24.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-25.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-26.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-27.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-28.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-3.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-4.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-5.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-6.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-7.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-8.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_phi-9.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_block-1.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_block-10.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_block-11.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_block-12.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_block-13.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_block-14.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_block-15.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_block-16.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_block-17.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_block-18.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_block-19.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_block-2.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_block-3.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_block-4.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_block-5.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_block-6.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_block-7.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_block-8.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_block-9.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_vtype-1.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_vtype-2.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_vtype-3.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_vtype-4.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_vtype-5.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_vtype-6.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_vtype-7.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_single_vtype-8.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-1.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-10.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-12.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-13.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-14.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-15.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-16.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-2.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-4.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-5.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-6.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-7.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-8.c: Ditto.
* gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-9.c: Ditto.

---
 .../gcc.target/riscv/rvv/vsetvl/vlmax_phi-1.c |  4 +--
 .../riscv/rvv/vsetvl/vlmax_phi-10.c   |  4 +--
 .../riscv/rvv/vsetvl/vlmax_phi-11.c   |  4 +--
 .../riscv/rvv/vsetvl/vlmax_phi-12.c   |  4 +--
 .../riscv/rvv/vsetvl/vlmax_phi-13.c   |  4 +--
 .../riscv/rvv/vsetvl/vlmax_phi-14.c   | 16 +-
 .../riscv/rvv/vsetvl/vlmax_phi-15.c  

[PATCH][committed] aarch64: PR target/108140 Handle NULL target in data intrinsic expansion

2022-12-19 Thread Kyrylo Tkachov via Gcc-patches
Hi all,

In this PR we ICE when expanding the __rbit builtin with a NULL target rtx.
I *think* that only happens when the result is unused and hence maybe we
shouldn't be expanding any RTL at all, but the ICE here is easily fixed by
deriving the mode from the type of the expression rather than the target.

This patch does that.
Bootstrapped and tested on aarch64-none-linux-gnu.
Pushing to trunk and a GCC 12 version later after some baking time.
Thanks,
Kyrill

gcc/ChangeLog:

PR target/108140
* config/aarch64/aarch64-builtins.cc
(aarch64_expand_builtin_data_intrinsic): Handle NULL target.

gcc/testsuite/ChangeLog:

PR target/108140
* gcc.target/aarch64/acle/pr108140.c: New test.


rbit-mode.patch
Description: rbit-mode.patch


[PATCH] modula2: Don't treat % in Modula 2 messages specially

2022-12-19 Thread Jakub Jelinek via Gcc-patches
Hi!

On top of the just posted patch, this patch makes sure that
any % chars in message strings aren't treated as format chars.
None of these functions takes a variable number of arguments, so for
most format specifiers there is nowhere to take arguments from.
It is true that a few format specifiers don't take any
arguments (%%, %m, %<, %>, %'), so it is actually possible
to use them, but one needs to verify that no others are emitted and
that what should be printed as % is really emitted as %%.
If the FE does that, then please ignore this patch, otherwise I think
it is safer to do this.
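
The verification mentioned above could look roughly like the following check. It is a hypothetical helper, not something in GCC or the Modula-2 front end, and it deliberately ignores flags, widths and the rest of GCC's diagnostic format syntax; it only accepts the argument-less specifiers listed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical checker: true if every '%' in MSG starts one of the
   argument-less specifiers %%, %m, %<, %> or %'.  */
bool
uses_only_argless_specs (const char *msg)
{
  assert (msg != NULL);
  for (const char *p = msg; *p != '\0'; p++)
    {
      if (*p != '%')
        continue;
      p++;                      /* Look at the character after '%'.  */
      if (*p == '\0' || strchr ("%m<>'", *p) == NULL)
        return false;           /* Dangling '%' or needs an argument.  */
    }
  return true;
}
```

A message such as "value %d" or one ending in a lone '%' is rejected, which is the situation the "%s" change sidesteps without any scanning.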

Bootstrapped/regtested on x86_64-linux and i686-linux.

2022-12-19  Jakub Jelinek  

* gm2-gcc/m2linemap.cc (m2linemap_ErrorAt, m2linemap_ErrorAtf,
m2linemap_WarningAtf, m2linemap_NoteAtf, m2linemap_internal_error):
Call functions with "%s", message rather than just message, so that
% chars in message aren't treated as format specifiers.

--- gcc/m2/gm2-gcc/m2linemap.cc.jj  2022-12-18 22:23:08.825497256 +0100
+++ gcc/m2/gm2-gcc/m2linemap.cc 2022-12-18 22:24:13.104563212 +0100
@@ -177,7 +177,7 @@ EXTERN
 void
 m2linemap_ErrorAt (location_t location, char *message)
 {
-  error_at (location, message);
+  error_at (location, "%s", message);
 }
 
 /* m2linemap_ErrorAtf - wraps up an error message.  */
@@ -198,7 +198,7 @@ m2linemap_ErrorAtf_1 (location_t locatio
 void
 m2linemap_ErrorAtf (location_t location, const char *message)
 {
-  m2linemap_ErrorAtf_1 (location, message);
+  m2linemap_ErrorAtf_1 (location, "%s", message);
 }
 
 /* m2linemap_WarningAtf - wraps up a warning message.  */
@@ -219,7 +219,7 @@ m2linemap_WarningAtf_1 (location_t locat
 void
 m2linemap_WarningAtf (location_t location, const char *message)
 {
-  m2linemap_WarningAtf_1 (location, message);
+  m2linemap_WarningAtf_1 (location, "%s", message);
 }
 
 /* m2linemap_NoteAtf - wraps up a note message.  */
@@ -240,7 +240,7 @@ m2linemap_NoteAtf_1 (location_t location
 void
 m2linemap_NoteAtf (location_t location, const char *message)
 {
-  m2linemap_NoteAtf_1 (location, message);
+  m2linemap_NoteAtf_1 (location, "%s", message);
 }
 
 /* m2linemap_internal_error - allow Modula-2 to use the GCC internal error.  */
@@ -248,7 +248,7 @@ m2linemap_NoteAtf (location_t location,
 void
 m2linemap_internal_error (const char *message)
 {
-  internal_error (message);
+  internal_error ("%s", message);
 }
 
 /* UnknownLocation - return the predefined location representing an

Jakub



Re: [Patch] nvptx/mkoffload.cc: Add dummy proc for OpenMP rev-offload table [PR108098]

2022-12-19 Thread Thomas Schwinge
Hi!

On 2022-12-16T17:19:00+0100, Tobias Burnus  wrote:
> Seems to be a CUDA JIT issue

A Nvidia Driver JIT issue, more precisely.  ;-)

> which is fixed by adding a dummy procedure.

Gah...  :-|

> Lightly tested with 4 systems at hand, where 2 failed before.

I'm happy to confirm that indeed this does resolve the issue for all
configurations that I reported in 
"OpenMP/nvptx reverse offload execution test FAILs".


As I said on IRC, #gcc, 2022-12-16:

> [...] we're unlikely to reverse-engineer the exact version/conditions
> where this got fixed, so don't have a useful means for versioning the
> workaround.  Fortunately, it doesn't "cost" anything really.  (In
> contrast to some other GCC/nvptx back end workarounds, as I
> understand.)


Grüße
 Thomas


> One had 10.2 and
> the other had some ancient CUDA where 'nvptx-smi' did not print a CUDA version
> and requires -mptx=3.1.
> (I did check that offloading indeed happened and no hostfallback was done.)
>
> OK for mainline?
>
> Tobias


> nvptx/mkoffload.cc: Add dummy proc for OpenMP rev-offload table [PR108098]
>
> Seemingly, the ptx JIT of CUDA <= 10.2 replaces function pointers in global
> variables by NULL if a translation does not contain any executable code. It
> works with CUDA 11.1.  The code of this commit is about reverse offload;
> having NULL values disables the side of reverse offload during image load.
>
> Solution is the same as found by Thomas for a related issue: Adding a dummy
> procedure. Cf. the PR of this issue and Thomas' patch
> "nvptx: Support global constructors/destructors via 'collect2'"
> https://gcc.gnu.org/pipermail/gcc-patches/2022-December/607749.html
>
> As that approach also works here:
>
> Co-authored-by: Thomas Schwinge 
>
> gcc/
>   PR libgomp/108098
>
>   * config/nvptx/mkoffload.cc (process): Emit dummy procedure
>   alongside reverse-offload function table to prevent NULL values
>   of the function addresses.
>
> ---
>  gcc/config/nvptx/mkoffload.cc | 14 ++
>  1 file changed, 14 insertions(+)
>
> diff --git a/gcc/config/nvptx/mkoffload.cc b/gcc/config/nvptx/mkoffload.cc
> index 5d89ba8..8306aa0 100644
> --- a/gcc/config/nvptx/mkoffload.cc
> +++ b/gcc/config/nvptx/mkoffload.cc
> @@ -357,6 +357,20 @@ process (FILE *in, FILE *out, uint32_t omp_requires)
>   fputc (sm_ver2[i], out);
>fprintf (out, "\"\n\t\".file 1 \\\"\\\"\"\n");
>
> +  /* WORKAROUND - see PR 108098
> +  It seems as if older CUDA JIT compiler optimizes the function pointers
> +  in offload_func_table to NULL, which can be prevented by adding a
> +  dummy procedure. With CUDA 11.1, it seems to work fine without
> +  workaround while CUDA 10.2 as some ancient version have need the
> +  workaround. Assuming CUDA 11.0 fixes it, emitting it could be
> +  restricted to 'if (sm_ver2[0] < 8 && version2[0] < 7)' as sm_80 and
> +  PTX ISA 7.0 are new in CUDA 11.0; for 11.1 it would be sm_86 and
> +  PTX ISA 7.1.  */
> +  fprintf (out, "\n\t\".func __dummy$func ( );\"\n");
> +  fprintf (out, "\t\".func __dummy$func ( )\"\n");
> +  fprintf (out, "\t\"{\"\n");
> +  fprintf (out, "\t\"}\"\n");
> +
>size_t fidx = 0;
>for (id = func_ids; id; id = id->next)
>   {


Re: [PATCH, nvptx, 1/2] Reimplement libgomp barriers for nvptx

2022-12-19 Thread Thomas Schwinge
Hi!

On 2022-12-16T15:51:35+0100, Tom de Vries via Gcc-patches 
 wrote:
> On 9/21/22 09:45, Chung-Lin Tang wrote:
>> I had a patch submitted earlier, where I reported that the current way
>> of implementing
>> barriers in libgomp on nvptx created a quite significant performance
>> drop on some SPEChpc2021
>> benchmarks:
>> https://gcc.gnu.org/pipermail/gcc-patches/2022-September/600818.html

Also: my 2022-03-17 report in 
"[OpenMP/nvptx] Execution-time hang for simple nested OpenMP 
'target'/'parallel'/'task' constructs":
.

>> This patch has been tested on x86_64/powerpc64le with nvptx offloading,
>> using libgomp, ovo, omptests,
>> and sollve_vv testsuites, all without regressions. Also verified that
>> the SPEChpc 2021 521.miniswp_t
>> and 534.hpgmgfv_t performance regressions that occurred in the GCC12
>> cycle has been restored to
>> devel/omp/gcc-11 (OG11) branch levels.

I'm happy to confirm that this also does resolve the PR99555 issue
mentioned above, so please do reference PR99555 in the commit log.

>> Is this okay for trunk?

> Yes, LGTM, please apply (after the other one).
>
> Thanks for addressing this.

Thanks, Chung-Lin and Tom!

>> (also suggest backporting to GCC12 branch, if performance regression can
>> be considered a defect)
>
> That's ok, but wait a while after applying on trunk before doing that,
> say a month.


Grüße
 Thomas


[committed] testsuite: Fix up pr64536.c for LLP64 targets [PR108151]

2022-12-19 Thread Jakub Jelinek via Gcc-patches
Hi!

The test casts a pointer to long, which is ok for ilp32 and lp64
targets but not for llp64 targets.  Nothing reads the values later,
it is a link test, so all we care about is that it is the same
cast on s390x-linux where it used to fail before the PR64536 fix,
and that we don't warn about it.

Tested on x86_64-linux -m32/-m64 and with a cross to s390x-linux,
committed to trunk as obvious.
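
To illustrate the portability point above: on an LLP64 target long is 32 bits
wide while pointers are 64 bits wide, so casting a pointer to long truncates
it and draws a warning. The following is only a minimal sketch, assuming a
hosted C99 environment. The names ptr_to_int and int_to_ptr are hypothetical,
and intptr_t from <stdint.h> stands in for the predefined __INTPTR_TYPE__
macro that the test itself uses.

```c
#include <stdint.h>

/* Convert a pointer to an integer type wide enough to hold it on ILP32,
   LP64 and LLP64 targets alike.  A cast to long would truncate on LLP64.  */
static intptr_t
ptr_to_int (const void *p)
{
  return (intptr_t) p;
}

/* Convert the integer back; the round trip yields the original pointer.  */
static const void *
int_to_ptr (intptr_t v)
{
  return (const void *) v;
}
```

The test in the patch does not include any headers, which is presumably why
it spells the type as __INTPTR_TYPE__ rather than intptr_t.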

2022-12-19  Jakub Jelinek  

PR testsuite/108151
* gcc.dg/pr64536.c (bar): Use casts to __INTPTR_TYPE__ rather than
long when casting pointer to integral type.

--- gcc/testsuite/gcc.dg/pr64536.c.jj   2020-01-12 11:54:37.489397175 +0100
+++ gcc/testsuite/gcc.dg/pr64536.c  2022-12-19 13:44:42.608091278 +0100
@@ -41,7 +41,7 @@ bar (int x)
}
  else
i = (long *) (h->q = *f);
- *c++ = (long) f;
+ *c++ = (__INTPTR_TYPE__) f;
  e += 6;
}
   else
@@ -55,7 +55,7 @@ bar (int x)
}
  else
i = (long *) (h->q = *f);
- *c++ = (long) f;
+ *c++ = (__INTPTR_TYPE__) f;
  e += 6;
}
 }

Jakub



Re: [PATCH] modula2: Fix up bootstrap on powerpc64le-linux [PR108147]

2022-12-19 Thread Gaius Mulley via Gcc-patches
Jakub Jelinek writes:

> Hi!
>
> As mentioned in the PR, bootstrap with m2 enabled currently fails
> on powerpc64le-linux, we get weird ICE after printing some diagnostics.
> The problem is that mc creates from *.def prototypes like
> extern void m2linemap_WarningAtf (m2linemap_location_t location, void * 
> message);
> but the actual function definitions use
> void m2linemap_WarningAtf (m2linemap_location_t location, void * message,
> ...) { code }
> and on powerpc64le-linux such lying about the prototype results in
> wrong-code, on the caller side we assume the function isn't varargs
> and so don't reserve 64 bytes in the frame for it, while the callee
> relies on the area being reserved and stores into it.
>
> Fixed by adding non-stdarg wrappers around stdarg functions (because
> we want va_list and pass it to diagnostics functions).
>
> Bootstrapped/regtested on {x86_64,i686,powerpc64le,s390x,aarch64}-linux,
> ok for trunk?
>
> 2022-12-19  Jakub Jelinek  
>
>   PR modula2/108147
>   * m2/gm2-gcc/m2linemap.def (ErrorAtf, WarningAtf, NoteAtf):
>   Comment out prototypes with varargs.
>   * m2/gm2-gcc/m2linemap.h (m2linemap_ErrorAtf, m2linemap_WarningAtf,
>   m2linemap_NoteAtf): No longer varargs.
>   * m2/gm2-gcc/m2linemap.cc (m2linemap_ErrorAtf): Turned into a
>   non-varargs wrapper around ...
>   (m2linemap_ErrorAtf_1): ... this.  New static function.
>   (m2linemap_WarningAtf): Turned into a non-varargs wrapper around ...
>   (m2linemap_WarningAtf_1): ... this.  New static function.
>   (m2linemap_NoteAtf): Turned into a non-varargs wrapper around ...
>   (m2linemap_NoteAtf_1): ... this.  New static function.

thanks for the patch, both this and the subsequent followup patch LGTM,

regards,
Gaius
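
The wrapper approach described in the quoted patch can be sketched roughly as
follows. This is not the m2linemap code: warning_atf, warning_atf_1 and emit_v
are hypothetical names, and vsnprintf into a caller-supplied buffer stands in
for the real diagnostics routines. The point is only that the exported
function has a fixed argument list, so a prototype generated without "..."
describes it truthfully, while the varargs worker stays static.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for a diagnostics routine that takes a va_list.  */
static int
emit_v (char *buf, size_t size, const char *fmt, va_list ap)
{
  return vsnprintf (buf, size, fmt, ap);
}

/* The varargs worker.  It is static, so no externally generated prototype
   can disagree with its real signature.  */
static int
warning_atf_1 (char *buf, size_t size, const char *fmt, ...)
{
  va_list ap;
  va_start (ap, fmt);
  int n = emit_v (buf, size, fmt, ap);
  va_end (ap);
  return n;
}

/* The exported entry point: a fixed argument list, matching what a
   prototype without "..." declares.  MESSAGE is still used as the format
   here, so it must not contain conversions that expect arguments.  */
int
warning_atf (char *buf, size_t size, const char *message)
{
  return warning_atf_1 (buf, size, message);
}
```

Callers now use the ordinary non-varargs calling convention, and only the
call inside the wrapper, where the real prototype is visible, uses the
varargs one.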


Re: [PATCH v2 05/11] riscv: thead: Add support for the XTheadBa ISA extension

2022-12-19 Thread Kito Cheng via Gcc-patches
LGTM with a nit:

...
> +  "TARGET_XTHEADBA
> +   && (INTVAL (operands[2]) >= 0) && (INTVAL (operands[2]) <= 3)"

IN_RANGE(INTVAL(operands[2]), 0, 3)

and I am a little bit surprised it can be zero

> +  "th.addsl\t%0,%1,%3,%2"
> +  [(set_attr "type" "bitmanip")
> +   (set_attr "mode" "")])
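
The nit above replaces two comparisons with a single range check. A small
sketch of why the two forms agree, assuming IN_RANGE (value, lower, upper)
tests lower <= value <= upper, which is how the review uses it.
IN_RANGE_SKETCH is a local stand-in written for this example, not GCC's own
definition, and the two helper functions are hypothetical.

```c
/* Local stand-in for the range check.  One unsigned comparison covers both
   bounds, because values below LOWER wrap around to very large numbers.  */
#define IN_RANGE_SKETCH(VALUE, LOWER, UPPER) \
  ((unsigned long long) (VALUE) - (unsigned long long) (LOWER) \
   <= (unsigned long long) (UPPER) - (unsigned long long) (LOWER))

/* The condition as written in the patch: two signed comparisons.  */
static int
shift_ok_two_compares (long long v)
{
  return v >= 0 && v <= 3;
}

/* The condition as suggested in the review: one range check.  */
static int
shift_ok_in_range (long long v)
{
  return IN_RANGE_SKETCH (v, 0, 3);
}
```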


Re: [PATCH] modula2: Don't treat % in Modula 2 messages specially

2022-12-19 Thread Gaius Mulley via Gcc-patches
Jakub Jelinek writes:

> Hi!
>
> On top of the just posted patch, this patch makes sure that
> any % chars in message strings aren't treated as format chars.
> None of these functions take variable number of arguments, so for
> most of format specifiers there is nowhere to take arguments from,
> it is true that a couple of format specifiers don't take any
> arguments - %%, %m, %<, %>, %' - so it is actually possible
> to use them, but one needs to verify that no other are emitted and
> that what should be printed as % is really emitted as %%.
> If the FE does that, then please ignore this patch, otherwise I think
> it is safer to do this.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux.

Yes, it might be possible for % to slip through.  Thus it is much safer to
avoid the situation by using the patch.

LGTM

regards,
Gaius
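
The hazard discussed in this thread is the usual one of passing arbitrary
text as a format string. The following sketch uses plain snprintf rather than
GCC's diagnostic formatter, whose directives differ, and emit_message is a
hypothetical name; it only shows the general remedy of routing the message
through a fixed "%s" format.

```c
#include <stdio.h>

/* Treat MESSAGE purely as data: the format is the fixed string "%s", so
   any '%' characters inside MESSAGE are copied verbatim instead of being
   interpreted as conversion specifiers.  */
static int
emit_message (char *buf, size_t size, const char *message)
{
  return snprintf (buf, size, "%s", message);
}
```

With this shape the caller no longer has to double every % in the message
text.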


[PATCH (pushed)] gcc-changelog: allow digit in component name

2022-12-19 Thread Martin Liška

contrib/ChangeLog:

* gcc-changelog/git_commit.py: Allow digit in component name.

contrib/ChangeLog:

* gcc-changelog/test_email.py: Add new test.
* gcc-changelog/test_patches.txt: Add new patch.
---
 contrib/gcc-changelog/git_commit.py|  2 +-
 contrib/gcc-changelog/test_email.py|  4 
 contrib/gcc-changelog/test_patches.txt | 25 +
 3 files changed, 30 insertions(+), 1 deletion(-)

diff --git a/contrib/gcc-changelog/git_commit.py 
b/contrib/gcc-changelog/git_commit.py
index e82fbcacd3e..7fde02cba85 100755
--- a/contrib/gcc-changelog/git_commit.py
+++ b/contrib/gcc-changelog/git_commit.py
@@ -165,7 +165,7 @@ additional_author_regex = re.compile(r'^\t(?P\ 
*)?(?P.*  <.*>)')
 changelog_regex = re.compile(r'^(?:[fF]or +)?([a-z0-9+-/]*)ChangeLog:?')
 subject_pr_regex = 
re.compile(r'(^|\W)PR\s+(?P[a-zA-Z+-]+)/(?P\d{4,7})')
 subject_pr2_regex = re.compile(r'[(\[]PR\s*(?P\d{4,7})[)\]]')
-pr_regex = re.compile(r'\tPR (?P[a-z+-]+\/)?(?P[0-9]+)$')
+pr_regex = re.compile(r'\tPR (?P[a-z0-9+-]+\/)?(?P[0-9]+)$')
 dr_regex = re.compile(r'\tDR ([0-9]+)$')
 star_prefix_regex = re.compile(r'\t\*(?P\ *)(?P.*)')
 end_of_location_regex = re.compile(r'[\[<(:]')
diff --git a/contrib/gcc-changelog/test_email.py 
b/contrib/gcc-changelog/test_email.py
index 79f8e0b8604..3e311d8d0f1 100755
--- a/contrib/gcc-changelog/test_email.py
+++ b/contrib/gcc-changelog/test_email.py
@@ -475,3 +475,7 @@ class TestGccChangelog(unittest.TestCase):
 assert (len(email.warnings) == 2)
 assert (email.warnings[0] == "Auto-added new file 'gcc/doc/gm2.texi'")
 assert (email.warnings[1] == "Auto-added 2 new files in 'gcc/m2'")
+
+def test_digit_in_PR_component(self):
+email = self.from_patch_glob('modula-PR-component.patch')
+assert not email.errors
diff --git a/contrib/gcc-changelog/test_patches.txt 
b/contrib/gcc-changelog/test_patches.txt
index 6004608a8f9..8bbd341c399 100644
--- a/contrib/gcc-changelog/test_patches.txt
+++ b/contrib/gcc-changelog/test_patches.txt
@@ -3732,3 +3732,28 @@ index 000..649af5e573a
 +GCC RUNTIME LIBRARY EXCEPTION
 --
 2.25.1
+
+=== modula-PR-component.patch ===
+From 1052d89a0b9769453561e18da32b1558d059b320 Mon Sep 17 00:00:00 2001
+From: Martin Liska 
+Date: Mon, 19 Dec 2022 14:34:18 +0100
+Subject: [PATCH] gcc-changelog: allow digit in component name
+
+   PR modula2/123456
+
+contrib/ChangeLog:
+
+   * gcc-changelog/git_commit.py: Allow digit in component name.
+---
+ contrib/gcc-changelog/git_commit.py | 2 +-
+ 1 file changed, 1 insertion(+), 1 deletion(-)
+
+diff --git a/contrib/gcc-changelog/git_commit.py 
b/contrib/gcc-changelog/git_commit.py
+index e82fbcacd3e..7fde02cba85 100755
+--- a/contrib/gcc-changelog/git_commit.py
 b/contrib/gcc-changelog/git_commit.py
+@@ -0,0 +1,1 @@
++  GNU Free Documentation License
+--
+2.39.0
+
--
2.39.0
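
The effect of widening the character class can be checked with a small
experiment. The script itself is Python; the sketch below instead uses POSIX
extended regular expressions from <regex.h>, so it is only an approximation:
the named groups are dropped, the patterns are anchored with ^ explicitly, and
line_matches, pr_old and pr_new are hypothetical names for this example.

```c
#include <regex.h>
#include <stddef.h>

/* Return 1 if LINE matches the POSIX extended regex PATTERN, 0 if it does
   not, and -1 if PATTERN fails to compile.  */
static int
line_matches (const char *pattern, const char *line)
{
  regex_t re;
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  int matched = regexec (&re, line, 0, NULL, 0) == 0;
  regfree (&re);
  return matched;
}

/* Approximations of the ChangeLog "PR" line pattern before and after the
   change.  The only difference is 0-9 in the component character class.  */
static const char pr_old[] = "^\tPR ([a-z+-]+/)?[0-9]+$";
static const char pr_new[] = "^\tPR ([a-z0-9+-]+/)?[0-9]+$";
```

A line such as "<TAB>PR modula2/123456" is rejected by the old pattern,
because the component may not contain the digit 2, and accepted by the new
one.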



Re: [PATCH (pushed)] gcc-changelog: allow digit in component name

2022-12-19 Thread Jakub Jelinek via Gcc-patches
On Mon, Dec 19, 2022 at 02:40:29PM +0100, Martin Liška wrote:
> contrib/ChangeLog:
> 
>   * gcc-changelog/git_commit.py: Allow digit in component name.
> 
> contrib/ChangeLog:
> 
>   * gcc-changelog/test_email.py: Add new test.
>   * gcc-changelog/test_patches.txt: Add new patch.
> ---
>  contrib/gcc-changelog/git_commit.py|  2 +-
>  contrib/gcc-changelog/test_email.py|  4 
>  contrib/gcc-changelog/test_patches.txt | 25 +
>  3 files changed, 30 insertions(+), 1 deletion(-)
> 
> diff --git a/contrib/gcc-changelog/git_commit.py 
> b/contrib/gcc-changelog/git_commit.py
> index e82fbcacd3e..7fde02cba85 100755
> --- a/contrib/gcc-changelog/git_commit.py
> +++ b/contrib/gcc-changelog/git_commit.py
> @@ -165,7 +165,7 @@ additional_author_regex = re.compile(r'^\t(?P\ 
> *)?(?P.*  <.*>)')
>  changelog_regex = re.compile(r'^(?:[fF]or +)?([a-z0-9+-/]*)ChangeLog:?')
>  subject_pr_regex = 
> re.compile(r'(^|\W)PR\s+(?P[a-zA-Z+-]+)/(?P\d{4,7})')

What about the above regex, shouldn't that be adjusted too?

>  subject_pr2_regex = re.compile(r'[(\[]PR\s*(?P\d{4,7})[)\]]')
> -pr_regex = re.compile(r'\tPR (?P[a-z+-]+\/)?(?P[0-9]+)$')
> +pr_regex = re.compile(r'\tPR (?P[a-z0-9+-]+\/)?(?P[0-9]+)$')
>  dr_regex = re.compile(r'\tDR ([0-9]+)$')
>  star_prefix_regex = re.compile(r'\t\*(?P\ *)(?P.*)')
>  end_of_location_regex = re.compile(r'[\[<(:]')

Jakub



[PATCH (pushed)] gcc-changelog: support digits in PR's component in subject

2022-12-19 Thread Martin Liška

Yes, fixed in the following patch.

Martin

contrib/ChangeLog:

* gcc-changelog/git_commit.py: Support digits in PR's
component in subject.
---
 contrib/gcc-changelog/git_commit.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/contrib/gcc-changelog/git_commit.py 
b/contrib/gcc-changelog/git_commit.py
index 7fde02cba85..b73e587eb98 100755
--- a/contrib/gcc-changelog/git_commit.py
+++ b/contrib/gcc-changelog/git_commit.py
@@ -163,7 +163,7 @@ author_line_regex = \
 re.compile(r'^(?P\d{4}-\d{2}-\d{2})\ {2}(?P.*  <.*>)')
 additional_author_regex = re.compile(r'^\t(?P\ *)?(?P.*  <.*>)')
 changelog_regex = re.compile(r'^(?:[fF]or +)?([a-z0-9+-/]*)ChangeLog:?')
-subject_pr_regex = 
re.compile(r'(^|\W)PR\s+(?P[a-zA-Z+-]+)/(?P\d{4,7})')
+subject_pr_regex = 
re.compile(r'(^|\W)PR\s+(?P[a-zA-Z0-9+-]+)/(?P\d{4,7})')
 subject_pr2_regex = re.compile(r'[(\[]PR\s*(?P\d{4,7})[)\]]')
 pr_regex = re.compile(r'\tPR (?P[a-z0-9+-]+\/)?(?P[0-9]+)$')
 dr_regex = re.compile(r'\tDR ([0-9]+)$')
--
2.39.0



Re: [PATCH v2 06/11] riscv: thead: Add support for the XTheadBs ISA extension

2022-12-19 Thread Kito Cheng via Gcc-patches
LGTM

On Mon, Dec 19, 2022 at 9:12 AM Christoph Muellner wrote:
>
> From: Christoph Müllner 
>
> This patch adds support for the XTheadBs ISA extension.
> The new INSN pattern is defined in a new file to separate
> this vendor extension from the standard extensions.
> The cost model adjustment reuses the xbs:bext cost.
>
> gcc/ChangeLog:
>
> * config/riscv/riscv.cc (riscv_rtx_costs): Add xthead:tst cost.
> * config/riscv/thead.md (*th_tst): New INSN.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/xtheadbs-tst.c: New test.
>
> Signed-off-by: Christoph Müllner 
> ---
>  gcc/config/riscv/riscv.cc |  4 ++--
>  gcc/config/riscv/thead.md | 11 +++
>  gcc/testsuite/gcc.target/riscv/xtheadbs-tst.c | 13 +
>  3 files changed, 26 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadbs-tst.c
>
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index a8d5e1dac7f..537515771c6 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -2400,8 +2400,8 @@ riscv_rtx_costs (rtx x, machine_mode mode, int 
> outer_code, int opno ATTRIBUTE_UN
>   *total = COSTS_N_INSNS (SINGLE_SHIFT_COST);
>   return true;
> }
> -  /* bext pattern for zbs.  */
> -  if (TARGET_ZBS && outer_code == SET
> +  /* bit extraction pattern (zbs:bext, xtheadbs:tst).  */
> +  if ((TARGET_ZBS || TARGET_XTHEADBS) && outer_code == SET
>   && GET_CODE (XEXP (x, 1)) == CONST_INT
>   && INTVAL (XEXP (x, 1)) == 1)
> {
> diff --git a/gcc/config/riscv/thead.md b/gcc/config/riscv/thead.md
> index 0257cbfad3e..0e23644ef59 100644
> --- a/gcc/config/riscv/thead.md
> +++ b/gcc/config/riscv/thead.md
> @@ -29,3 +29,14 @@ (define_insn "*th_addsl"
>"th.addsl\t%0,%1,%3,%2"
>[(set_attr "type" "bitmanip")
> (set_attr "mode" "")])
> +
> +;; XTheadBs
> +
> +(define_insn "*th_tst"
> +  [(set (match_operand:X 0 "register_operand" "=r")
> +   (zero_extract:X (match_operand:X 1 "register_operand" "r")
> +   (const_int 1)
> +   (match_operand 2 "immediate_operand" "i")))]
> +  "TARGET_XTHEADBS"
> +  "th.tst\t%0,%1,%2"
> +  [(set_attr "type" "bitmanip")])
> diff --git a/gcc/testsuite/gcc.target/riscv/xtheadbs-tst.c 
> b/gcc/testsuite/gcc.target/riscv/xtheadbs-tst.c
> new file mode 100644
> index 000..674cec09128
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/xtheadbs-tst.c
> @@ -0,0 +1,13 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gc_xtheadbs" { target { rv32 } } } */
> +/* { dg-options "-march=rv64gc_xtheadbs" { target { rv64 } } } */
> +/* { dg-skip-if "" { *-*-* } { "-O0" } } */
> +
> +long
> +foo1 (long i)
> +{
> +  return 1L & (i >> 20);
> +}
> +
> +/* { dg-final { scan-assembler-times "th.tst\t" 1 } } */
> +/* { dg-final { scan-assembler-not "andi" } } */
> --
> 2.38.1
>


[committed] testsuite: Fix up pr64536.c for LLP64 targets [PR108151]

2022-12-19 Thread Jakub Jelinek via Gcc-patches
Hi!

Apparently llp64 had 2 further warnings, fixed thusly.

Committed as obvious after testing it with cross to mingw.

2022-12-19  Jakub Jelinek  

PR testsuite/108151
* gcc.dg/pr64536.c (bar): Cast long to __INTPTR_TYPE__
before casting to long *.

--- gcc/testsuite/gcc.dg/pr64536.c.jj
+++ gcc/testsuite/gcc.dg/pr64536.c
@@ -40,7 +40,7 @@ bar (int x)
  h->q = *f;
}
  else
-   i = (long *) (h->q = *f);
+   i = (long *) (__INTPTR_TYPE__) (h->q = *f);
  *c++ = (__INTPTR_TYPE__) f;
  e += 6;
}
@@ -54,7 +54,7 @@ bar (int x)
  h->q = *f;
}
  else
-   i = (long *) (h->q = *f);
+   i = (long *) (__INTPTR_TYPE__) (h->q = *f);
  *c++ = (__INTPTR_TYPE__) f;
  e += 6;
}

Jakub



[PATCH V7] rs6000: Optimize cmp on rotated 16bits constant

2022-12-19 Thread Jiufu Guo via Gcc-patches
Hi,

When checking eq/ne against a constant which has only 16 significant bits
once rotated, the comparison can be done on the rotated data instead.
This avoids building the full constant in a register.

As the example in PR103743:
For "in == 0x8000000000000000LL", this patch generates:
rotldi 3,3,1 ; cmpldi 0,3,1
instead of:
li 9,-1 ; rldicr 9,9,0,0 ; cmpd 0,3,9
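
The equivalence behind the two sequences can be sketched in C. This assumes
the constant is the 64-bit value with only the top bit set, which is what the
quoted "rotldi 3,3,1 ; cmpldi 0,3,1" sequence implies; rotl64, eq_direct and
eq_rotated are hypothetical names for this illustration only.

```c
#include <stdint.h>

/* Rotate X left by N bits; N is reduced modulo 64.  */
static uint64_t
rotl64 (uint64_t x, unsigned n)
{
  n &= 63;
  return n ? (x << n) | (x >> (64 - n)) : x;
}

/* Direct form: the full 64-bit constant has to be built in a register.  */
static int
eq_direct (uint64_t in)
{
  return in == UINT64_C (0x8000000000000000);
}

/* Rotated form: rotating both sides left by 1 turns the constant into 1,
   which fits the 16-bit immediate field of a compare.  */
static int
eq_rotated (uint64_t in)
{
  return rotl64 (in, 1) == 1;
}
```

Rotation is a bijection on 64-bit values, so comparing the rotated input with
the rotated constant gives the same eq/ne answer as the direct comparison.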

Compared with the previous version:
https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600475.html
this patch refactors the code according to the review comments,
e.g. updating function names, comments and code.

This patch passes bootstrap and regtest on ppc64le.
Is it ok for trunk?  Thanks for comments!

BR,
Jeff(Jiufu)


PR target/103743

gcc/ChangeLog:

* config/rs6000/rs6000-protos.h (can_be_rotated_to_lowbits): New.
(can_be_rotated_to_positive_16bits): New.
(can_be_rotated_to_negative_15bits): New.
* config/rs6000/rs6000.cc (can_be_rotated_to_lowbits): New definition.
(can_be_rotated_to_positive_16bits): New definition.
(can_be_rotated_to_negative_15bits): New definition.
* config/rs6000/rs6000.md (*rotate_on_cmpdi): New define_insn_and_split.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr103743.c: New test.
* gcc.target/powerpc/pr103743_1.c: New test.

---
 gcc/config/rs6000/rs6000-protos.h |  3 +
 gcc/config/rs6000/rs6000.cc   | 65 +
 gcc/config/rs6000/rs6000.md   | 67 -
 gcc/testsuite/gcc.target/powerpc/pr103743.c   | 52 ++
 gcc/testsuite/gcc.target/powerpc/pr103743_1.c | 95 +++
 5 files changed, 281 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr103743.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr103743_1.c

diff --git a/gcc/config/rs6000/rs6000-protos.h 
b/gcc/config/rs6000/rs6000-protos.h
index d0d89320ef6..8355621d4ed 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -35,6 +35,9 @@ extern bool xxspltib_constant_p (rtx, machine_mode, int *, 
int *);
 extern int vspltis_shifted (rtx);
 extern HOST_WIDE_INT const_vector_elt_as_int (rtx, unsigned int);
 extern bool macho_lo_sum_memory_operand (rtx, machine_mode);
+extern bool can_be_rotated_to_lowbits (unsigned HOST_WIDE_INT, int, int *);
+extern bool can_be_rotated_to_positive_16bits (HOST_WIDE_INT);
+extern bool can_be_rotated_to_negative_15bits (HOST_WIDE_INT);
 extern int num_insns_constant (rtx, machine_mode);
 extern int small_data_operand (rtx, machine_mode);
 extern bool mem_operand_gpr (rtx, machine_mode);
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index b3a609f3aa3..ece450254e5 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -14925,6 +14925,71 @@ rs6000_reverse_condition (machine_mode mode, enum 
rtx_code code)
 return reverse_condition (code);
 }
 
+/* Check if C (as 64bit integer) can be rotated to a constant which contains
+   nonzero bits at LOWBITS only.
+
+   Return true if C can be rotated to such constant.  And *ROT is written to
+   the number by which C is rotated.
+   Return false otherwise.  */
+
+bool
+can_be_rotated_to_lowbits (unsigned HOST_WIDE_INT c, int lowbits, int *rot)
+{
+  int clz = HOST_BITS_PER_WIDE_INT - lowbits;
+
+  /* case a. 0..0xxx: already at least clz zeros.  */
+  int lz = clz_hwi (c);
+  if (lz >= clz)
+{
+  *rot = 0;
+  return true;
+}
+
+  /* case b. 0..0xxx0..0: at least clz zeros.  */
+  int tz = ctz_hwi (c);
+  if (lz + tz >= clz)
+{
+  *rot = HOST_BITS_PER_WIDE_INT - tz;
+  return true;
+}
+
+  /* case c. xx10.0xx: rotate 'clz - 1' bits first, then check case b.
+  ^bit -> Vbit, , then zeros are at head or tail.
+00...00xxx100, 'clz - 1' >= 'bits of '.  */
+  const int rot_bits = lowbits + 1;
+  unsigned HOST_WIDE_INT rc = (c >> rot_bits) | (c << (clz - 1));
+  tz = ctz_hwi (rc);
+  if (clz_hwi (rc) + tz >= clz)
+{
+  *rot = HOST_BITS_PER_WIDE_INT - (tz + rot_bits);
+  return true;
+}
+
+  return false;
+}
+
+/* Check if C (as 64bit integer) can be rotated to a positive 16bits constant
+   which contains 48bits leading zeros and 16bits of any value.  */
+
+bool
+can_be_rotated_to_positive_16bits (HOST_WIDE_INT c)
+{
+  int rot = 0;
+  bool res = can_be_rotated_to_lowbits (c, 16, &rot);
+  return res && rot > 0;
+}
+
+/* Check if C (as 64bit integer) can be rotated to a negative 15bits constant
+   which contains 49bits leading ones and 15bits of any value.  */
+
+bool
+can_be_rotated_to_negative_15bits (HOST_WIDE_INT c)
+{
+  int rot = 0;
+  bool res = can_be_rotated_to_lowbits (~c, 15, &rot);
+  return res && rot > 0;
+}
+
 /* Generate a compare for CODE.  Return a brand-new rtx that
represents the result of the compare.  */
 
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 6011f5bf76a..f464dd07394 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/g

Re: [PATCH V6] rs6000: Optimize cmp on rotated 16bits constant

2022-12-19 Thread Jiufu Guo via Gcc-patches
Hi,

Segher Boessenkool  writes:

> On Wed, Dec 14, 2022 at 04:26:54PM +0800, Jiufu Guo wrote:
>> Segher Boessenkool  writes:
>> > On Mon, Aug 29, 2022 at 11:42:16AM +0800, Jiufu Guo wrote:
>> >> li %r9,-1
>> >> rldicr %r9,%r9,0,0
>> >> cmpd %cr0,%r3,%r9
>> >
>> > FWIW, I find the winnt assembler syntax very hard to read, and I doubt
>> > I am the only one.
>> Oh, sorry about that.  I will avoid to add '-mregnames' to dump asm. :)
>> BTW, what options are you used to dump asm code? 
>
> The same as GCC outputs, and as I write assembler code as: bare numbers.
> It is much easier to type, and very much easier to read.
>
> -mregnames is fine for output (and it is the default as well), but
> problematic for input.  Take for example
>   li r10,r10
> which translates to
>   li 10,10
> while what was probably wanted is to load the address of the global
> symbol r10, which can be written as
>   li r10,(r10)
>
> I do understand that liking the bare numbers syntax is an acquired taste
> of course.  But less clutter is very useful.  This goes hand in hand
> with writing multiple asm statements per line, which allows you to group
> things together nicely:
>   li 9,-1 ; rldicr 9,9,0,0 ; cmpd 3,9
>
Great!  Thanks for your helpful comments!
>> > Maybe it is better to not return magic values anyway?  So perhaps
>> >
>> > bool
>> > can_be_done_as_compare_of_rotate (unsigned HOST_WIDE_INT c, int clz, int 
>> > *rot)
>> >
>> > (with *rot written if the return value is true).
>> Thanks for your suggestion!
>> It is checking if a constant can be rotated from/to a value which has
>> only few tailing nonzero bits (all leading bits are zeros). 
>> 
>> So, I'm thinking to name the function as something like:
>> can_be_rotated_to_lowbits.
>
> That is a good name yeah.
>
>> >> +bool
>> >> +compare_rotate_immediate_p (unsigned HOST_WIDE_INT c)
>> >
>> > No _p please, this function is not a predicate (at least, the name does
>> > not say what it tests).  So a better name please.  This matters even
>> > more for extern functions (like this one) because the function
>> > implementation is always farther away so you do not easily have all
>> > interface details in mind.  Good names help :-)
>> Thanks! Name is always a matter. :)
>> 
>> Maybe we can name this function as "can_be_rotated_as_compare_operand",
>> or "is_constant_rotateable_for_compare", because this function checks
>> "if a constant can be rotated to/from an immediate operand of
>> cmpdi/cmpldi". 
>
> Maybe just "constant_can_be_rotated_to_lowbits"?  (If that is what the
> function does).  It doesn't clearly say that it allows negative numbers
> as well, but that is a problem of the function itself already; maybe it
> would be better to do signed and unsigned separately.
That makes sense. I have posted an updated version of the patch.
>
>> >> +(define_code_iterator eqne [eq ne])
>> >> +(define_code_attr EQNE [(eq "EQ") (ne "NE")])
>> >
>> > Just  or  should work?
>> Great! Thanks for point out this!  works.
>> >
>> > Please fix these things.  Almost there :-)
>> 
>> I updated the patch as below. Bootstraping and regtesting is ongoing.
>> Thanks again for your careful and insight review!
>
> Please send as new message (not as reply even), that is much easier to
> handle.  Thanks!

Sure, I have just submitted a new patch version.
https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608765.html

Thanks a lot for your review.


BR,
Jeff (Jiufu)

>
>
> Segher


[PATCH] PR tree-optimization/108139 - Don't use PHI equivalences in range-on-entry.

2022-12-19 Thread Andrew MacLeod via Gcc-patches
Our use of equivalences in range-on-entry calculations causes an issue
through a PHI node when a back edge is involved, i.e.

   a = VARYING
   <...>
bb5
   b = PHI 
bb6
   if (a != 0)
     goto bb5

Since the value of b is undefined on the edge 2->5, we ignore it. The
range of a on the edge 6->5 is ~[0,0], so we calculate the range of b to
be ~[0,0].  We also provide an equivalency between a and b.


Unfortunately the on-entry code looks at equivalencies, and says, "hey, 
a and b are equivalent, so we can use the range of b instead"


So it now thinks a is ~[0,0] and folds away the condition.

The problem is that b can be considered equivalent to a, but the
converse is not true, because there is a path (2->5) upon which a is not
equivalent to b.  We have no way to represent a one-way equivalence at
the moment. This patch avoids using that relation in range-on-entry
calculations.


Perhaps next release I'll add a specific kind of one way equivalence for 
this kind of situation.


Bootstraps on x86_64-pc-linux-gnu with no regressions. OK?

Andrew
commit ecf19b6eb6f8e17d8d148e3c6627bd2151766420
Author: Andrew MacLeod 
Date:   Fri Dec 16 16:53:31 2022 -0500

Don't use PHI equivalences in range-on-entry.

If there is only one argument to a PHI which is defined, an equivalency is
created between the def and the argument.  It is safe to consider the def
equal to the argument, but it is dangerous to assume the argument is also
equivalent to the def as there may be branches which change the range on the
path to the PHI on that argument

This patch avoid using that relation in range-on-entry calculations.

PR tree-optimization/108139
gcc/
* gimple-range-cache.cc (ranger_cache::fill_block_cache): Do not
use equivalences originating from PHIS.

gcc/testsuite/
* gcc.dg/pr108139.c: New.

diff --git a/gcc/gimple-range-cache.cc b/gcc/gimple-range-cache.cc
index ce5a0c8155e..9848d140242 100644
--- a/gcc/gimple-range-cache.cc
+++ b/gcc/gimple-range-cache.cc
@@ -1235,6 +1235,13 @@ ranger_cache::fill_block_cache (tree name, basic_block bb, basic_block def_bb)
 	  if (!m_gori.has_edge_range_p (equiv_name))
 		continue;
 
+	  // PR 108139. It is hazardous to assume an equivalence with
+	  // a PHI is the same value.  The PHI may be an equivalence
+	  // via UNDEFINED arguments which is really a one way equivalence.
+	  // PHIDEF == name, but name may not be == PHIDEF.
+	  if (is_a (SSA_NAME_DEF_STMT (equiv_name)))
+		continue;
+
 	  // Check if the equiv definition dominates this block
 	  if (equiv_bb == bb ||
 		  (equiv_bb && !dominated_by_p (CDI_DOMINATORS, bb, equiv_bb)))
diff --git a/gcc/testsuite/gcc.dg/pr108139.c b/gcc/testsuite/gcc.dg/pr108139.c
new file mode 100644
index 000..6f224e3ce62
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr108139.c
@@ -0,0 +1,18 @@
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-O1 -ftree-vrp -fdump-tree-vrp" } */
+
+int a, b;
+int main() {
+  int c;
+  if (a > 1)
+a++;
+  while (a)
+if (c == b)
+  c = a;
+  return 0;
+}
+
+
+/* { dg-final { scan-tree-dump-not "Folding predicate" "vrp2" } } */
+
+


Re: Re: [PATCH] RISC-V: Fix RVV machine mode attribute configuration

2022-12-19 Thread Kito Cheng via Gcc-patches
Committed to trunk.

钟居哲 wrote on Saturday, 2022-12-17 at 09:52:

> Actually, I haven't checked and tested HF carefully since I disabled them.
> Kito asked me to disable all HF modes since zvfhmin is not ratified and GCC
> doesn't allow any un-ratified ISA. You can see in vector-iterator.md that the
> supported RVV modes include QI HI SI DI SF DF and exclude HF and BF.
>
>
>
> juzhe.zh...@rivai.ai
>
> From: Jeff Law
> Date: 2022-12-17 09:48
> To: juzhe.zhong; gcc-patches
> CC: kito.cheng; palmer
> Subject: Re: [PATCH] RISC-V: Fix RVV machine mode attribute configuration
>
>
> On 12/14/22 00:01, juzhe.zh...@rivai.ai wrote:
> > From: Ju-Zhe Zhong 
> >
> > The attribute configurations of each machine mode were added in the
> > previous patch.
> > I noticed some of them are not correct during VSETVL PASS testing.
> > Correct them in a single patch now.
> >
> > gcc/ChangeLog:
> >
> >  * config/riscv/riscv-vector-switch.def (ENTRY): Correct
> attributes.
> >
>
>
>
> > @@ -121,7 +121,7 @@ ENTRY (VNx2HI, true, LMUL_1, 16, LMUL_F2, 32)
> >   ENTRY (VNx1HI, true, LMUL_F2, 32, LMUL_F4, 64)
> >
> >   /* TODO:Disable all FP16 vector, enable them when 'zvfh' is
> supported.  */
> > -ENTRY (VNx32HF, false, LMUL_8, 2, LMUL_RESERVED, 0)
> > +ENTRY (VNx32HF, false, LMUL_RESERVED, 0, LMUL_8, 2)
> Is there any value in making VNx32HF dependent on TARGET_MIN_VLEN > 32
> like we're doing for VNx32HI?   In the past I've found it useful to have
> HI, HF, BF behave identically as much as possible.
>
> Your call.  The patch is OK either way.
>
> jeff
>
>
>
>
>


[committed] arm: correctly define __ARM_FEATURE_CLZ

2022-12-19 Thread Richard Earnshaw via Gcc-patches

The ACLE requires that __ARM_FEATURE_CLZ be defined if the hardware
supports it; it's also clear that this doesn't mean the current ISA,
so we must define this even when compiling for Thumb1 if the target
supports CLZ in A32.

This brings GCC into alignment with Clang.
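
For readers unfamiliar with how such a feature macro is consumed, here is a
minimal sketch of user code keyed on __ARM_FEATURE_CLZ. The function name
count_leading_zeros32 is hypothetical, and the use of __builtin_clz assumes a
GCC-compatible compiler; on other targets or compilers the portable loop is
used instead.

```c
#include <stdint.h>

/* Count the leading zero bits of X; returns 32 for X == 0.  */
static unsigned
count_leading_zeros32 (uint32_t x)
{
  if (x == 0)
    return 32;
#if defined (__ARM_FEATURE_CLZ) && defined (__GNUC__)
  /* The target advertises a CLZ instruction, so let the compiler builtin
     make use of it.  On Arm targets unsigned int is 32 bits wide.  */
  return (unsigned) __builtin_clz (x);
#else
  /* Portable fallback: shift left until the top bit is set.  */
  unsigned n = 0;
  while (!(x & UINT32_C (0x80000000)))
    {
      x <<= 1;
      n++;
    }
  return n;
#endif
}
```

Code like this is why the macro should describe what the hardware supports
and not only the instruction set currently being compiled for.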

gcc/ChangeLog:

* config/arm/arm-c.cc (__ARM_FEATURE_CLZ): Fix definition of
preprocessor macro when target has CLZ in another ISA.
---
 gcc/config/arm/arm-c.cc | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/gcc/config/arm/arm-c.cc b/gcc/config/arm/arm-c.cc
index 86c56bf2680..202898fa041 100644
--- a/gcc/config/arm/arm-c.cc
+++ b/gcc/config/arm/arm-c.cc
@@ -238,8 +238,12 @@ arm_cpu_builtins (struct cpp_reader* pfile)
 builtin_define_with_int_value ("__ARM_FEATURE_LDREX",
    TARGET_ARM_FEATURE_LDREX);
 
+  /* ACLE says that __ARM_FEATURE_CLZ is defined if the hardware
+ supports it; it's also clear that this doesn't mean the current
+ ISA, so we define this even when compiling for Thumb1 if the
+ target supports CLZ in A32.  */
   def_or_undef_macro (pfile, "__ARM_FEATURE_CLZ",
-		  ((TARGET_ARM_ARCH >= 5 && !TARGET_THUMB)
+		  ((TARGET_ARM_ARCH >= 5 && arm_arch_notm)
 		   || TARGET_ARM_ARCH_ISA_THUMB >=2));
 
   def_or_undef_macro (pfile, "__ARM_FEATURE_NUMERIC_MAXMIN",


Re: Re: [PATCH] RISC-V: Remove unit-stride store from ta attribute

2022-12-19 Thread Kito Cheng via Gcc-patches
Committed to trunk, thanks :)


钟居哲 wrote on Saturday, 2022-12-17 at 09:22:

> Yes, vector stores don't care about policy, whether mask or tail.
> Removing it gives the VSETVL PASS more optimization chances,
> since the VSETVL PASS has backward demand fusion.
>
> For example:
> vadd tama
> vse.v
> VSETVL PASS will choose to set tama for vse.v
>
> vadd tumu
> vse.v
> VSETVL PASS will choose to set tumu for vse.v
>
>
>
> juzhe.zh...@rivai.ai
>
> From: Jeff Law
> Date: 2022-12-17 04:01
> To: juzhe.zhong; gcc-patches
> CC: kito.cheng; palmer
> Subject: Re: [PATCH] RISC-V: Remove unit-stride store from ta attribute
>
>
> On 12/14/22 04:36, juzhe.zh...@rivai.ai wrote:
> > From: Ju-Zhe Zhong 
> >
> > Since store instructions don't care about tail policy, we remove
> > vste from the "ta" attribute. Hence, we could have more fusion chances
> > and better optimization.
> >
> > gcc/ChangeLog:
> >
> >  * config/riscv/vector.md: Remove vste.
> Just to confirm that I understand the basic model.  Vector stores only
> update active elements, thus they don't care about tail policy, right?
>
> Assuming that's the case, then this is OK.
>
> jeff
>
>


Re: [PATCH] RISC-V: Remove unused redundant vector attributes

2022-12-19 Thread Kito Cheng via Gcc-patches
Committed

Jeff Law via Gcc-patches wrote on Saturday, 2022-12-17 at 03:55:

>
>
> On 12/14/22 01:51, juzhe.zh...@rivai.ai wrote:
> > From: Ju-Zhe Zhong 
> >
> > I found that I forgot to remove these redundant attributes.
> > Sorry about that.
> >
> > gcc/ChangeLog:
> >
> >  * config/riscv/vector.md (): Remove redundant attributes.
> OK.
> jeff
>
>


Re: [PATCH] RISC-V: Fix annotation

2022-12-19 Thread Kito Cheng via Gcc-patches
Merged into the previous patch.

Jeff Law via Gcc-patches wrote on Saturday, 2022-12-17 at 03:54:

>
>
> On 12/14/22 01:39, juzhe.zh...@rivai.ai wrote:
> > From: Ju-Zhe Zhong 
> >
> > gcc/ChangeLog:
> >
> >  * config/riscv/riscv-vsetvl.cc: Fix annotation.
> Just roll this into the patch that adds riscv-vsetvl.cc.
>
> jeff
>


Re: [PATCH] RISC-V: Change vlmul printing rule

2022-12-19 Thread Kito Cheng via Gcc-patches
Committed

Jeff Law via Gcc-patches wrote on Saturday, 2022-12-17 at 03:51:

>
>
> On 12/13/22 23:57, juzhe.zh...@rivai.ai wrote:
> > From: Ju-Zhe Zhong 
> >
> > This patch is a preparatory patch for the following patch (VSETVL PASS
> > support), since the current vlmul printing rule does not give appropriate
> > information for the VSETVL PASS. I split this fix into a single patch.
> >
> > gcc/ChangeLog:
> >
> >  * config/riscv/riscv-v.cc (emit_vlmax_vsetvl): Pass through
> VLMUL enum instead of machine mode.
> >  * config/riscv/riscv-vector-builtins-bases.cc: Ditto.
> >  * config/riscv/riscv.cc (riscv_print_operand): Print LMUL by
> enum vlmul instead of machine mode.
> OK.  Though I suspect your ChangeLog will need to be wrapped to get past
> the commit hooks.
>
> jeff
>


Re: [PATCH] RISC-V: Add VSETVL PASS VLMAX testcases.

2022-12-19 Thread Kito Cheng via Gcc-patches
Committed to trunk

On Wednesday, 2022-12-14 at 15:46, the patch author wrote:

> From: Ju-Zhe Zhong 
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/rvv.exp: Adjust to enable tests for VSETVL
> PASS.
> * gcc.target/riscv/rvv/vsetvl/dump-1.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-1.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-10.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-11.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-12.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-13.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-14.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-15.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-16.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-17.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-18.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-19.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-2.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-20.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-21.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-22.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-23.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-24.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-25.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-26.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-27.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-28.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-29.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-3.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-30.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-31.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-32.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-33.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-34.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-35.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-36.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-37.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-38.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-39.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-4.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-40.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-41.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-42.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-43.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-44.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-45.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-46.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-5.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-6.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-7.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-8.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-9.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-1.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-10.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-11.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-12.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-13.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-14.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-15.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-16.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-17.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-18.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-19.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-2.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-20.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-21.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-22.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-23.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-24.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-25.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-26.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-27.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop

Re: [PATCH] RISC-V: Add testcases for VSETVL PASS 2

2022-12-19 Thread Kito Cheng via Gcc-patches
Committed to trunk

wrote on Wed, Dec 14, 2022 at 16:13:

> From: Ju-Zhe Zhong 
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-1.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-10.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-11.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-12.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-13.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-14.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-15.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-16.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-17.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-18.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-19.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-2.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-20.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-21.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-22.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-23.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-24.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-25.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-26.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-27.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-28.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-3.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-4.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-5.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-6.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-7.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-8.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-9.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-1.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-10.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-11.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-12.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-13.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-14.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-15.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-16.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-2.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-3.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-4.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-5.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-6.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-7.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-8.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-9.c: New test.
>
> ---
>  .../gcc.target/riscv/rvv/vsetvl/vlmax_phi-1.c |  37 +++
>  .../riscv/rvv/vsetvl/vlmax_phi-10.c   |  37 +++
>  .../riscv/rvv/vsetvl/vlmax_phi-11.c   |  37 +++
>  .../riscv/rvv/vsetvl/vlmax_phi-12.c   |  37 +++
>  .../riscv/rvv/vsetvl/vlmax_phi-13.c   |  37 +++
>  .../riscv/rvv/vsetvl/vlmax_phi-14.c   | 217 
>  .../riscv/rvv/vsetvl/vlmax_phi-15.c   |  40 +++
>  .../riscv/rvv/vsetvl/vlmax_phi-16.c   |  40 +++
>  .../riscv/rvv/vsetvl/vlmax_phi-17.c   |  40 +++
>  .../riscv/rvv/vsetvl/vlmax_phi-18.c   |  40 +++
>  .../riscv/rvv/vsetvl/vlmax_phi-19.c   |  40 +++
>  .../gcc.target/riscv/rvv/vsetvl/vlmax_phi-2.c |  37 +++
>  .../riscv/rvv/vsetvl/vlmax_phi-20.c   |  40 +++
>  .../riscv/rvv/vsetvl/vlmax_phi-21.c   |  40 +++
>  .../riscv/rvv/vsetvl/vlmax_phi-22.c   |  40 +++
>  .../riscv/rvv/vsetvl/vlmax_phi-23.c   |  40 +++
>  .../riscv/rvv/vsetvl/vlmax_phi-24.c   |  40 +++
>  .../riscv/rvv/vsetvl/vlmax_phi-25.c   |  40 +++
>  .../riscv/rvv/vsetvl/vlmax_phi-26.c   |  40 +++
>  .../riscv/rvv/vsetvl/vlmax_phi-27.c   |  40 +++
>  .../riscv/rvv/vsetvl/vlmax_phi-28.c   | 237 ++
>  .../gcc.target/riscv/rvv/vsetvl/vlmax_phi-3.c |  37 +++
>  .../gcc.target/riscv/rvv/vsetvl/vlmax_phi-4.c |  37 +++
>  .../gcc.target/riscv/rvv/vsetvl/vlmax_phi-5.c |  37 +++
>  .../gcc.target/riscv/rvv/vsetvl/vlmax_phi-6.c |  37 +++
>  .../gcc.target/riscv/rvv/vsetvl/vlmax_phi-7.c |  37 +++
>  .../gcc.target/riscv/rvv/vsetvl/vlmax_phi-8.c |  37 +++
>  .../gcc.target/riscv/rvv/vsetvl/vlmax_phi-9.c |  37 +++
>  .../riscv/rvv/vsetvl/vlmax_switch_vtype-1.c   |  26 ++
>  .../riscv/rvv/vsetvl/vlmax_switch_vtype-10.c  |  47 
>  .../riscv/rvv/vsetvl/vlmax_switch_vtype-11.c  |  55 
> 

Re: [PATCH] RISC-V: Add testcases for VSETVL PASS 5

2022-12-19 Thread Kito Cheng via Gcc-patches
Committed to trunk

wrote on Wed, Dec 14, 2022 at 16:27:

> From: Ju-Zhe Zhong 
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-1.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-10.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-11.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-12.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-13.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-14.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-15.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-16.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-17.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-18.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-19.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-2.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-20.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-21.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-22.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-23.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-24.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-25.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-26.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-27.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-28.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-29.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-3.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-30.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-31.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-32.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-33.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-34.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-35.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-36.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-37.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-38.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-39.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-4.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-40.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-41.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-42.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-43.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-44.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-45.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-46.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-5.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-6.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-7.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-8.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-9.c: New test.
>
> ---
>  .../riscv/rvv/vsetvl/vlmax_back_prop-1.c  |  36 
>  .../riscv/rvv/vsetvl/vlmax_back_prop-10.c |  59 +++
>  .../riscv/rvv/vsetvl/vlmax_back_prop-11.c |  63 +++
>  .../riscv/rvv/vsetvl/vlmax_back_prop-12.c |  64 
>  .../riscv/rvv/vsetvl/vlmax_back_prop-13.c |  64 
>  .../riscv/rvv/vsetvl/vlmax_back_prop-14.c |  58 +++
>  .../riscv/rvv/vsetvl/vlmax_back_prop-15.c | 143 
>  .../riscv/rvv/vsetvl/vlmax_back_prop-16.c |  54 ++
>  .../riscv/rvv/vsetvl/vlmax_back_prop-17.c |  59 +++
>  .../riscv/rvv/vsetvl/vlmax_back_prop-18.c |  58 +++
>  .../riscv/rvv/vsetvl/vlmax_back_prop-19.c |  48 ++
>  .../riscv/rvv/vsetvl/vlmax_back_prop-2.c  |  50 ++
>  .../riscv/rvv/vsetvl/vlmax_back_prop-20.c |  59 +++
>  .../riscv/rvv/vsetvl/vlmax_back_prop-21.c |  50 ++
>  .../riscv/rvv/vsetvl/vlmax_back_prop-22.c |  58 +++
>  .../riscv/rvv/vsetvl/vlmax_back_prop-23.c |  41 +
>  .../riscv/rvv/vsetvl/vlmax_back_prop-24.c |  41 +
>  .../riscv/rvv/vsetvl/vlmax_back_prop-25.c |  96 +++
>  .../riscv/rvv/vsetvl/vlmax_back_prop-26.c |  89 ++
>  .../riscv/rvv/vsetvl/vlmax_back_prop-27.c |  51 ++
>  .../riscv/rvv/vsetvl/vlmax_back_prop-28.c |  54 ++
>  .../riscv/rvv/vsetvl/vlmax_back_prop-29.c |  54 ++
>  .../riscv/rvv/vsetvl/vlmax_back_prop-3.c  |  47 ++
>  .../riscv/rvv/vsetvl/vlmax_back_prop-30.c |  44 +
>  .../riscv/rvv/vsetvl/vlmax_back_prop-31.c |  46 ++
>  .../riscv

Re: [PATCH] RISC-V: Add testcases for VSETVL PASS 3

2022-12-19 Thread Kito Cheng via Gcc-patches
Committed to trunk

wrote on Wed, Dec 14, 2022 at 16:16:

> From: Ju-Zhe Zhong 
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-1.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-10.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-11.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-12.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-13.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-14.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-15.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-16.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-17.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-18.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-19.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-2.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-20.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-21.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-22.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-23.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-24.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-25.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-26.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-27.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-28.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-3.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-4.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-5.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-6.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-7.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-8.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-9.c: New test.
>
> ---
>  .../riscv/rvv/vsetvl/vlmax_miss_default-1.c   |  32 +++
>  .../riscv/rvv/vsetvl/vlmax_miss_default-10.c  |  32 +++
>  .../riscv/rvv/vsetvl/vlmax_miss_default-11.c  |  32 +++
>  .../riscv/rvv/vsetvl/vlmax_miss_default-12.c  |  32 +++
>  .../riscv/rvv/vsetvl/vlmax_miss_default-13.c  |  32 +++
>  .../riscv/rvv/vsetvl/vlmax_miss_default-14.c  | 189 ++
>  .../riscv/rvv/vsetvl/vlmax_miss_default-15.c  |  38 +++
>  .../riscv/rvv/vsetvl/vlmax_miss_default-16.c  |  38 +++
>  .../riscv/rvv/vsetvl/vlmax_miss_default-17.c  |  38 +++
>  .../riscv/rvv/vsetvl/vlmax_miss_default-18.c  |  38 +++
>  .../riscv/rvv/vsetvl/vlmax_miss_default-19.c  |  38 +++
>  .../riscv/rvv/vsetvl/vlmax_miss_default-2.c   |  32 +++
>  .../riscv/rvv/vsetvl/vlmax_miss_default-20.c  |  38 +++
>  .../riscv/rvv/vsetvl/vlmax_miss_default-21.c  |  38 +++
>  .../riscv/rvv/vsetvl/vlmax_miss_default-22.c  |  38 +++
>  .../riscv/rvv/vsetvl/vlmax_miss_default-23.c  |  38 +++
>  .../riscv/rvv/vsetvl/vlmax_miss_default-24.c  |  38 +++
>  .../riscv/rvv/vsetvl/vlmax_miss_default-25.c  |  38 +++
>  .../riscv/rvv/vsetvl/vlmax_miss_default-26.c  |  38 +++
>  .../riscv/rvv/vsetvl/vlmax_miss_default-27.c  |  38 +++
>  .../riscv/rvv/vsetvl/vlmax_miss_default-28.c  | 231 ++
>  .../riscv/rvv/vsetvl/vlmax_miss_default-3.c   |  32 +++
>  .../riscv/rvv/vsetvl/vlmax_miss_default-4.c   |  32 +++
>  .../riscv/rvv/vsetvl/vlmax_miss_default-5.c   |  32 +++
>  .../riscv/rvv/vsetvl/vlmax_miss_default-6.c   |  32 +++
>  .../riscv/rvv/vsetvl/vlmax_miss_default-7.c   |  32 +++
>  .../riscv/rvv/vsetvl/vlmax_miss_default-8.c   |  32 +++
>  .../riscv/rvv/vsetvl/vlmax_miss_default-9.c   |  32 +++
>  28 files changed, 1330 insertions(+)
>  create mode 100644
> gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-1.c
>  create mode 100644
> gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-10.c
>  create mode 100644
> gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-11.c
>  create mode 100644
> gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-12.c
>  create mode 100644
> gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-13.c
>  create mode 100644
> gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-14.c
>  create mode 100644
> gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-15.c
>  create mode 100644
> gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-16.c
>  create mode 100644
> gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-17.c
>  create mode 100644
> gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-18.c
>  create mode 100644
> gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-19.c
>  create mode 100644
> gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vlmax_miss_default-2.c
>  

Re: [PATCH] RISC-V: Add testcases for VSETVL PASS 4

2022-12-19 Thread Kito Cheng via Gcc-patches
Committed to trunk

wrote on Wed, Dec 14, 2022 at 16:20:

> From: Ju-Zhe Zhong 
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-1.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-10.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-11.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-12.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-13.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-14.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-15.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-16.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-17.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-18.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-19.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-2.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-20.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-21.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-22.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-23.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-24.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-25.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-26.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-27.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-28.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-3.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-4.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-5.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-6.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-7.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-8.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-9.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_call-1.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_call-2.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_call-3.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_call-4.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_complex_loop-1.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_complex_loop-2.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_conflict-1.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_conflict-10.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_conflict-11.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_conflict-12.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_conflict-2.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_conflict-3.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_conflict-4.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_conflict-5.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_conflict-6.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_conflict-7.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_conflict-8.c: New test.
> * gcc.target/riscv/rvv/vsetvl/vlmax_conflict-9.c: New test.
>
> ---
>  .../riscv/rvv/vsetvl/vlmax_bb_prop-1.c| 182 ++
>  .../riscv/rvv/vsetvl/vlmax_bb_prop-10.c   | 230 +++
>  .../riscv/rvv/vsetvl/vlmax_bb_prop-11.c   |  43 ++
>  .../riscv/rvv/vsetvl/vlmax_bb_prop-12.c   | 266 
>  .../riscv/rvv/vsetvl/vlmax_bb_prop-13.c   | 221 +++
>  .../riscv/rvv/vsetvl/vlmax_bb_prop-14.c   | 221 +++
>  .../riscv/rvv/vsetvl/vlmax_bb_prop-15.c   |  41 ++
>  .../riscv/rvv/vsetvl/vlmax_bb_prop-16.c   | 257 
>  .../riscv/rvv/vsetvl/vlmax_bb_prop-17.c   | 177 ++
>  .../riscv/rvv/vsetvl/vlmax_bb_prop-18.c   | 177 ++
>  .../riscv/rvv/vsetvl/vlmax_bb_prop-19.c   |  34 ++
>  .../riscv/rvv/vsetvl/vlmax_bb_prop-2.c| 182 ++
>  .../riscv/rvv/vsetvl/vlmax_bb_prop-20.c   | 203 +++
>  .../riscv/rvv/vsetvl/vlmax_bb_prop-21.c   | 155 +
>  .../riscv/rvv/vsetvl/vlmax_bb_prop-22.c   | 155 +
>  .../riscv/rvv/vsetvl/vlmax_bb_prop-23.c   |  30 +
>  .../riscv/rvv/vsetvl/vlmax_bb_prop-24.c   | 180 ++
>  .../riscv/rvv/vsetvl/vlmax_bb_prop-25.c   | 572 ++
>  .../riscv/rvv/vsetvl/vlmax_bb_prop-26.c   | 492 +++
>  .../riscv/rvv/vsetvl/vlmax_bb_prop-27.c   | 491 +++
>  .../riscv/rvv/vsetvl/vlmax_bb_prop-28.c   |  86 +++
>  .../riscv/rvv/vsetvl/vlmax_bb_prop-3.c|  35 ++
>  .../riscv/rvv/vsetvl/vlmax_bb_prop-4.c| 210 +++
>  .../riscv/rvv/vsetvl/vlmax_bb_prop-5.c| 167 +
>  .../riscv/rvv/vsetvl/vlmax_bb_prop-6.c| 167 +
>  .../riscv/rvv/vsetvl/vlmax_bb_prop-7.c|  32 +
>  .../riscv/rvv/vsetvl/vlmax_bb_prop-8.c| 194 ++
>  .

Re: [PATCH] RISC-V: Support VSETVL PASS for RVV support

2022-12-19 Thread Kito Cheng via Gcc-patches
LGTM, and thanks for this amazing work. I have actually been reviewing this
for more than a month, so I am going to commit it now.

But feel free to keep reviewing it, and send comments and bug reports to
Ju-Zhe and me :)



wrote on Wed, Dec 14, 2022 at 15:32:

> From: Ju-Zhe Zhong 
>
> This patch adds the VSETVL PASS for RVV support.
> 1. Optimization quality and performance are guaranteed by LCM (lazy
>    code motion).
> 2. It is based on the RTL_SSA framework to gain better optimization
>    opportunities.
> 3. It also propagates VL/VTYPE demand information backward across
>    blocks, following the RTL_SSA reverse order in the CFG.
> 4. It has been tested with 200+ testcases for the VLMAX AVL situation
>    (only VLMAX, since we don't yet have intrinsics to test non-VLMAX).
> 5. The AVL model will be supported in the next patch.
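
For readers new to the idea behind items 1 and 3 above, here is a minimal toy sketch of the simplest case such a pass has to handle: deciding, within a single basic block, which vector instructions need a vsetvl in front of them. This is not the code from riscv-vsetvl.cc and it does not implement LCM or cross-block backward propagation; it only shows local redundancy elimination. The names `Demand`, `same_demand` and `needed_vsetvls` are hypothetical and were made up for this illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical toy model, not a type from the patch: the vector
// configuration that one instruction demands.
struct Demand
{
  unsigned sew;   // element width in bits
  unsigned lmul;  // integer register-group multiplier (fractional LMUL omitted)
};

static bool
same_demand (const Demand &a, const Demand &b)
{
  return a.sew == b.sew && a.lmul == b.lmul;
}

// Return the indices of the instructions in one basic block that need a
// vsetvl in front of them: the first instruction (the state on block entry
// is treated as unknown) and every instruction whose demand differs from
// the configuration left by the previous one.
std::vector<std::size_t>
needed_vsetvls (const std::vector<Demand> &insns)
{
  std::vector<std::size_t> out;
  bool known = false;
  Demand state = {0, 0};
  for (std::size_t i = 0; i < insns.size (); ++i)
    if (!known || !same_demand (state, insns[i]))
      {
        out.push_back (i);
        state = insns[i];
        known = true;
      }
  return out;
}
```

For a block whose instructions demand (e32, m1), (e32, m1), (e64, m2), (e64, m2), (e32, m1), the sketch returns the indices 0, 2 and 4. A global pass would additionally use information from neighbouring blocks, for example to avoid the vsetvl at index 0 when every predecessor already leaves the same configuration.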
>
> gcc/ChangeLog:
>
> * config.gcc: Add riscv-vsetvl.o.
> * config/riscv/riscv-passes.def (INSERT_PASS_BEFORE): Add VSETVL
> PASS location.
> * config/riscv/riscv-protos.h (make_pass_vsetvl): New function.
> (enum avl_type): New enum.
> (get_ta): New function.
> (get_ma): Ditto.
> (get_avl_type): Ditto.
> (calculate_ratio): Ditto.
> (enum tail_policy): New enum.
> (enum mask_policy): Ditto.
> * config/riscv/riscv-v.cc (calculate_ratio): New function.
> (emit_pred_op): Change the VLMAX mov codegen.
> (get_ta): New function.
> (get_ma): Ditto.
> (enum tail_policy): Change enum.
> (get_prefer_tail_policy): New function.
> (enum mask_policy): Change enum.
> (get_prefer_mask_policy): New function.
> * config/riscv/t-riscv: Add riscv-vsetvl.o
> * config/riscv/vector.md (): Adjust attribute and pattern for
> VSETVL PASS.
> (@vlmax_avl): Ditto.
> (@vsetvl_no_side_effects): Delete.
> (vsetvl_vtype_change_only): New MD pattern.
> (@vsetvl_discard_result): Ditto.
> * config/riscv/riscv-vsetvl.cc: New file.
> * config/riscv/riscv-vsetvl.h: New file.
>
> ---
>  gcc/config.gcc|2 +-
>  gcc/config/riscv/riscv-passes.def |1 +
>  gcc/config/riscv/riscv-protos.h   |   15 +
>  gcc/config/riscv/riscv-v.cc   |  102 +-
>  gcc/config/riscv/riscv-vsetvl.cc  | 2509 +
>  gcc/config/riscv/riscv-vsetvl.h   |  344 
>  gcc/config/riscv/t-riscv  |8 +
>  gcc/config/riscv/vector.md|  131 +-
>  8 files changed, 3076 insertions(+), 36 deletions(-)
>  create mode 100644 gcc/config/riscv/riscv-vsetvl.cc
>  create mode 100644 gcc/config/riscv/riscv-vsetvl.h
>
> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index b5eda046033..1eb76c6c076 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -518,7 +518,7 @@ pru-*-*)
> ;;
>  riscv*)
> cpu_type=riscv
> -   extra_objs="riscv-builtins.o riscv-c.o riscv-sr.o
> riscv-shorten-memrefs.o riscv-selftests.o riscv-v.o"
> +   extra_objs="riscv-builtins.o riscv-c.o riscv-sr.o
> riscv-shorten-memrefs.o riscv-selftests.o riscv-v.o riscv-vsetvl.o"
> extra_objs="${extra_objs} riscv-vector-builtins.o
> riscv-vector-builtins-shapes.o riscv-vector-builtins-bases.o"
> d_target_objs="riscv-d.o"
> extra_headers="riscv_vector.h"
> diff --git a/gcc/config/riscv/riscv-passes.def
> b/gcc/config/riscv/riscv-passes.def
> index 23ef8ac6114..d2d48f231aa 100644
> --- a/gcc/config/riscv/riscv-passes.def
> +++ b/gcc/config/riscv/riscv-passes.def
> @@ -18,3 +18,4 @@
> .  */
>
>  INSERT_PASS_AFTER (pass_rtl_store_motion, 1, pass_shorten_memrefs);
> +INSERT_PASS_BEFORE (pass_sched2, 1, pass_vsetvl);
> diff --git a/gcc/config/riscv/riscv-protos.h
> b/gcc/config/riscv/riscv-protos.h
> index e17e003f8e2..cfd0f284f91 100644
> --- a/gcc/config/riscv/riscv-protos.h
> +++ b/gcc/config/riscv/riscv-protos.h
> @@ -96,6 +96,7 @@ extern void riscv_parse_arch_string (const char *,
> struct gcc_options *, locatio
>  extern bool riscv_hard_regno_rename_ok (unsigned, unsigned);
>
>  rtl_opt_pass * make_pass_shorten_memrefs (gcc::context *ctxt);
> +rtl_opt_pass * make_pass_vsetvl (gcc::context *ctxt);
>
>  /* Information about one CPU we know about.  */
>  struct riscv_cpu_info {
> @@ -131,6 +132,12 @@ enum vlmul_type
>LMUL_F4 = 6,
>LMUL_F2 = 7,
>  };
> +
> +enum avl_type
> +{
> +  NONVLMAX,
> +  VLMAX,
> +};
>  /* Routines implemented in riscv-vector-builtins.cc.  */
>  extern void init_builtins (void);
>  extern const char *mangle_builtin_type (const_tree);
> @@ -145,17 +152,25 @@ extern bool legitimize_move (rtx, rtx, machine_mode);
>  extern void emit_pred_op (unsigned, rtx, rtx, machine_mode);
>  extern enum vlmul_type get_vlmul (machine_mode);
>  extern unsigned int get_ratio (machine_mode);
> +extern int get_ta (rtx);
> +extern int get_ma (rtx);
> +extern int get_avl_type (rtx);
> +extern unsigned int calculate_ratio (unsigned int, enum vlmul_type);
>  enum tail_policy
>  {
>TAIL_UNDIST

Re: [PATCH] RISC-V: Fix ASM checks.

2022-12-19 Thread Kito Cheng via Gcc-patches
Merged into previous patch and committed

wrote on Mon, Dec 19, 2022 at 18:55:

> From: Ju-Zhe Zhong 
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-1.c: Fix asm check.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-10.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-11.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-12.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-13.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-14.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-15.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-17.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-18.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-19.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-2.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-20.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-21.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-22.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-23.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-24.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-25.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-26.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-27.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-28.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-29.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-3.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-30.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-31.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-32.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-33.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-34.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-35.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-36.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-37.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-38.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-39.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-4.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-40.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-41.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-42.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-45.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-46.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-6.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-7.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-8.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-9.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-1.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-10.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-11.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-12.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-13.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-14.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-15.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-16.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-17.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-18.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-19.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-2.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-20.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-21.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-22.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-23.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-24.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-25.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-26.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-27.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-28.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-3.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-4.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-5.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-6.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-7.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-8.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-9.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_call-1.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_cal

Re: [PATCH] RISC-V: Simplify ASM checks.

2022-12-19 Thread Kito Cheng via Gcc-patches
Merged into previous patch and committed

wrote on Mon, Dec 19, 2022 at 19:11:

> From: Ju-Zhe Zhong 
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-1.c: Simplify ASM checks.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-10.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-11.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-12.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-13.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-14.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-15.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-16.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-17.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-18.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-19.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-2.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-20.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-21.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-22.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-23.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-24.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-25.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-26.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-27.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-28.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-3.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-4.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-5.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-6.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-7.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-8.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_phi-9.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_single_block-1.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_single_block-10.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_single_block-11.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_single_block-12.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_single_block-13.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_single_block-14.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_single_block-15.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_single_block-16.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_single_block-17.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_single_block-18.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_single_block-19.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_single_block-2.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_single_block-3.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_single_block-4.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_single_block-5.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_single_block-6.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_single_block-7.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_single_block-8.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_single_block-9.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_single_vtype-1.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_single_vtype-2.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_single_vtype-3.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_single_vtype-4.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_single_vtype-5.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_single_vtype-6.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_single_vtype-7.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_single_vtype-8.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-1.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-10.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-12.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-13.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-14.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-15.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-16.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-2.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-4.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-5.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-6.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-7.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-8.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_switch_vtype-9.c: Ditto.
>
> ---
>  .../gcc.target/riscv/rvv/vsetvl/vlmax_phi-1.c |  4 +--
>  .../riscv/rvv/vsetvl/vlmax_phi-10.c   |  4 +--
>  .../riscv/rvv/vsetvl/vlmax_phi-11.c   |  4 +--

Re: [PATCH] RISC-V: Simplify ASM checks 2

2022-12-19 Thread Kito Cheng via Gcc-patches
Merged into the previous patch and committed.

 wrote on Mon, 19 Dec 2022 at 19:13:

> From: Ju-Zhe Zhong 
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-1.c: Simplify ASM
> checks.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-10.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-11.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-12.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-13.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-14.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-15.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-17.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-18.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-19.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-2.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-20.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-21.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-22.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-23.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-24.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-25.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-26.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-27.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-28.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-29.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-3.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-30.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-31.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-32.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-33.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-34.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-35.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-36.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-37.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-38.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-39.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-4.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-40.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-41.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-42.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-45.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-46.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-6.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-7.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-8.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-9.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-1.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-10.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-11.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-12.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-13.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-14.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-15.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-16.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-17.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-18.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-19.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-2.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-20.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-21.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-22.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-23.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-24.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-25.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-26.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-27.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-28.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-3.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-4.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-5.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-6.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-7.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-8.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_bb_prop-9.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/vlmax_call-1.c: Ditto.
> * gcc.target/riscv/rvv/vsetvl/v

[PATCH] tree-optimization/108164 - undefined overflow with IV vectorization

2022-12-19 Thread Richard Biener via Gcc-patches
vect_update_ivs_after_vectorizer can end up emitting a signed
IV update when the loop body performed an unsigned computation.
The following makes sure to perform that update in the type
of the step expression to avoid undefined behavior on overflow.
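
As a stand-alone illustration of the idea (a sketch only, not code from the vectorizer; the helper name final_iv_wrapping is made up for this example), the final IV value init + niters * step can be computed entirely in an unsigned type and converted to the signed type last, so that no signed operation can overflow:

```cpp
#include <cassert>
#include <climits>

// Hypothetical helper (not GCC code): the value an induction variable
// reaches after NITERS steps, i.e. init + niters * step.  Multiply and
// add are both done in the unsigned step type, where wrap-around is
// well defined; the conversion back to int happens last.  GCC defines
// that conversion as modulo 2^N, and C++20 requires it.
constexpr int
final_iv_wrapping (int init, unsigned niters, unsigned step)
{
  unsigned off = niters * step;
  unsigned ni = static_cast<unsigned> (init) + off;
  return static_cast<int> (ni);
}

// With no overflow the result matches plain signed arithmetic.
static_assert (final_iv_wrapping (10, 4, 3) == 22, "no-overflow case");
```

Doing the addition in int instead would be undefined as soon as init + off exceeded INT_MAX, which is the kind of update the patch avoids emitting.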

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/108164
* tree-vect-loop-manip.cc (vect_update_ivs_after_vectorizer):
Perform vect_step_op_add update in the appropriate type.

* gcc.dg/pr108164.c: New testcase.
---
 gcc/testsuite/gcc.dg/pr108164.c | 19 +++
 gcc/tree-vect-loop-manip.cc | 12 +++-
 2 files changed, 26 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr108164.c

diff --git a/gcc/testsuite/gcc.dg/pr108164.c b/gcc/testsuite/gcc.dg/pr108164.c
new file mode 100644
index 000..d76d557876e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr108164.c
@@ -0,0 +1,19 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -fno-tree-dce" } */
+
+int a, b, c;
+int main()
+{
+  int e = -1;
+  short f = -1;
+  for (; c < 1; c++)
+    while (f >= e)
+      f++;
+  for (; a < 2; a++) {
+    short g = ~(~b | ~f);
+    int h = -g;
+    int i = (3 / ~h) / ~b;
+    b = i;
+  }
+  return 0;
+}
diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
index 1d96130c985..5ec739ed218 100644
--- a/gcc/tree-vect-loop-manip.cc
+++ b/gcc/tree-vect-loop-manip.cc
@@ -1576,14 +1576,16 @@ vect_update_ivs_after_vectorizer (loop_vec_info loop_vinfo,
 
   if (induction_type == vect_step_op_add)
{
- off = fold_build2 (MULT_EXPR, TREE_TYPE (step_expr),
-fold_convert (TREE_TYPE (step_expr), niters),
-step_expr);
+ tree stype = TREE_TYPE (step_expr);
+ off = fold_build2 (MULT_EXPR, stype,
+fold_convert (stype, niters), step_expr);
  if (POINTER_TYPE_P (type))
ni = fold_build_pointer_plus (init_expr, off);
  else
-   ni = fold_build2 (PLUS_EXPR, type,
- init_expr, fold_convert (type, off));
+   ni = fold_convert (type,
+  fold_build2 (PLUS_EXPR, stype,
+   fold_convert (stype, init_expr),
+   off));
}
   /* Don't bother call vect_peel_nonlinear_iv_init.  */
   else if (induction_type == vect_step_op_neg)
-- 
2.35.3


Re: [PATCH] RISC-V: Support VSETVL PASS for RVV support

2022-12-19 Thread Jeff Law via Gcc-patches
I believe Kito already approved.  There's nothing here that is critical, 
just minor cleanups and I'm fine with them being cleaned up as a 
follow-up patch given Kito has already approved this patch.


On 12/14/22 00:13, juzhe.zh...@rivai.ai wrote:

From: Ju-Zhe Zhong 

This patch adds the VSETVL pass for RVV.
1. Optimization quality and performance are guaranteed by LCM (lazy code
   motion).
2. It is based on the RTL_SSA framework to gain better optimization
   opportunities.
3. It also propagates VL/VTYPE demand information backward across
   blocks, walking the CFG in RTL_SSA reverse order.
4. It has been tested with 200+ testcases for the VLMAX AVL case (only
   VLMAX, since there are no intrinsics yet to test non-VLMAX).
5. Support for the AVL model will follow in the next patch.

gcc/ChangeLog:

 * config.gcc: Add riscv-vsetvl.o.
 * config/riscv/riscv-passes.def (INSERT_PASS_BEFORE): Add VSETVL PASS 
location.
 * config/riscv/riscv-protos.h (make_pass_vsetvl): New function.
 (enum avl_type): New enum.
 (get_ta): New function.
 (get_ma): Ditto.
 (get_avl_type): Ditto.
 (calculate_ratio): Ditto.
 (enum tail_policy): New enum.
 (enum mask_policy): Ditto.
 * config/riscv/riscv-v.cc (calculate_ratio): New function.
 (emit_pred_op): Change the VLMAX mov codegen.
 (get_ta): New function.
 (get_ma): Ditto.
 (enum tail_policy): Change enum.
 (get_prefer_tail_policy): New function.
 (enum mask_policy): Change enum.
 (get_prefer_mask_policy): New function.
 * config/riscv/t-riscv: Add riscv-vsetvl.o
 * config/riscv/vector.md (): Adjust attribute and pattern for VSETVL 
PASS.
 (@vlmax_avl): Ditto.
 (@vsetvl_no_side_effects): Delete.
 (vsetvl_vtype_change_only): New MD pattern.
 (@vsetvl_discard_result): Ditto.
 * config/riscv/riscv-vsetvl.cc: New file.
 * config/riscv/riscv-vsetvl.h: New file.
So a high-level note.  Once you've inserted your vsetvl instructions, you 
can't have further code motion, correct?  So doesn't this potentially 
have a poor interaction with something like speculative code motion as 
performed by sched?   ISTM that if you want to run before sched2, then 
you'd need to introduce dependencies between the vsetvl instructions and 
the vector instructions that utilize those settings?


I can envision wanting to schedule the vsetvl instructions so that they 
bubble up slightly from their insertion points to avoid stalls or allow 
the vector units to start executing earlier.  Is that what's driving the 
current pass placement?  If not, would it make more sense to use the 
late prologue/epilogue hooks that Richard Sandiford posted recently (I'm 
not sure they're committed yet)?








+
+static bool
+loop_basic_block_p (const basic_block cfg_bb)
+{
+  return JUMP_P (BB_END (cfg_bb)) && any_condjump_p (BB_END (cfg_bb));
+}
The name seems poor here -- AFAICT this has nothing to do with loops. 
It's just a test that the end of a block is a conditional jump.  I'm 
pretty sure we could extract BB_END (cfg_bb) and use an existing routine 
instead of writing our own.  I'd suggest peeking at jump.cc to see if 
there's something already suitable.


 +

+/* Return true if it is vsetvldi or vsetvlsi.  */
+static bool
+vsetvl_insn_p (rtx_insn *rinsn)
+{
+  return INSN_CODE (rinsn) == CODE_FOR_vsetvldi
+|| INSN_CODE (rinsn) == CODE_FOR_vsetvlsi;
Formatting note.  For a multi-line conditional, go ahead and use an open 
paren and the usual indentation style.


  return (INSN_CODE (rinsn) == CODE_FOR_vsetvldi
  || INSN_CODE (rinsn) == CODE_FOR_vsetvlsi);

There are other examples in the new file.




+
+/* An "anticipatable occurrence" is one that is the first occurrence in the
+   basic block, the operands are not modified in the basic block prior
+   to the occurrence and the output is not used between the start of
+   the block and the occurrence.  */
+static bool
+anticipatable_occurrence_p (const insn_info *insn, const vector_insn_info dem)
+{
+  /* The only possible operand we care of VSETVL is AVL.  */
+  if (dem.has_avl_reg ())
+{
+  /* The operands shoule not be modified in the basic block prior

s/shoule/should/


+
+/* Return true if the branch probability is dominate.  */
+static bool
+dominate_probability_p (edge e)
The function comment needs some work.  "is dominate" doesn't really mean 
anything without more context.  It looks like you're really testing if 
the edge probability is greater than 50%.




+
+  /* There is a obvious case that is not worthwhile and meaningless
+to propagate the demand information:
+ local_dem
+__
+| |
+   || |
+   || |
+|_|
+ reach

Re: [PATCH v2] coroutines: Accept 'extern "C"' coroutines.

2022-12-19 Thread Jason Merrill via Gcc-patches

On 12/17/22 08:40, Iain Sandoe wrote:

Hi.

It seems that everyone agrees that extern C coroutines should be permitted,
although I have yet to see a useful testcase.

This patch has been revised to append the suffixes for such functions in
mangle.cc rather than as part of the outlined function decl production.

tested on x86_64-darwin21.
OK for trunk?
Iain


OK, thanks.



— 8< —

'extern "C"' coroutines are permitted by the standard and expected to work
(although constructing useful cases could be challenging). In order to
permit this we need to arrange for the outlined helper functions to be
named properly, even when no mangling is required.  To do this, we append
the actor and destroy suffixes in all cases.
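
A toy model of the naming scheme described above (everything here is hypothetical: symbol_name, helper_kind and the "." separator standing in for the target-dependent JOIN_STR are made up for illustration, and the fake mangling only handles a function taking no arguments):

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the target-dependent JOIN_STR separator.
static const std::string join_str = ".";

enum class helper_kind { ramp, actor, destroy };

// Toy model (not GCC code): extern "C" functions keep their plain name,
// anything else gets a simplified Itanium-style name for "name ()".
// The helper suffix is appended afterwards in both cases, so the actor
// and destroy functions never share a symbol with the ramp.
static std::string
symbol_name (const std::string &name, bool extern_c, helper_kind kind)
{
  std::string sym
    = extern_c ? name : "_Z" + std::to_string (name.size ()) + name + "v";
  switch (kind)
    {
    case helper_kind::actor:
      sym += join_str + "actor";
      break;
    case helper_kind::destroy:
      sym += join_str + "destroy";
      break;
    case helper_kind::ramp:
      break;
    }
  return sym;
}
```

In this model an 'extern "C"' coroutine f yields the three distinct names f, f.actor and f.destroy, which is the property the patch is after.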

Signed-off-by: Iain Sandoe 

gcc/cp/ChangeLog:

* mangle.cc (write_mangled_name): Append the helper function
suffixes here...
(write_encoding): ... rather than here.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/torture/extern-c-coroutine.C: New test.
---
  gcc/cp/mangle.cc  | 23 ++---
  .../coroutines/torture/extern-c-coroutine.C   | 89 +++
  2 files changed, 101 insertions(+), 11 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/coroutines/torture/extern-c-coroutine.C

diff --git a/gcc/cp/mangle.cc b/gcc/cp/mangle.cc
index 074cf27ec7a..5789adcf680 100644
--- a/gcc/cp/mangle.cc
+++ b/gcc/cp/mangle.cc
@@ -805,6 +805,18 @@ write_mangled_name (const tree decl, bool top_level)
  write_string (".pre");
else if (DECL_IS_POST_FN_P (decl))
  write_string (".post");
+
+  /* If this is a coroutine helper, then append an appropriate string to
+ identify which.  */
+  if (tree ramp = DECL_RAMP_FN (decl))
+{
+  if (DECL_ACTOR_FN (ramp) == decl)
+   write_string (JOIN_STR "actor");
+  else if (DECL_DESTROY_FN (ramp) == decl)
+   write_string (JOIN_STR "destroy");
+  else
+   gcc_unreachable ();
+}
  }
  
  /* Returns true if the return type of DECL is part of its signature, and

@@ -863,17 +875,6 @@ write_encoding (const tree decl)
mangle_return_type_p (decl),
d);
  
-  /* If this is a coroutine helper, then append an appropriate string to

-identify which.  */
-  if (tree ramp = DECL_RAMP_FN (decl))
-   {
- if (DECL_ACTOR_FN (ramp) == decl)
-   write_string (JOIN_STR "actor");
- else if (DECL_DESTROY_FN (ramp) == decl)
-   write_string (JOIN_STR "destroy");
- else
-   gcc_unreachable ();
-   }
  }
  }
  
diff --git a/gcc/testsuite/g++.dg/coroutines/torture/extern-c-coroutine.C b/gcc/testsuite/g++.dg/coroutines/torture/extern-c-coroutine.C

new file mode 100644
index 000..c178a80ee4b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/coroutines/torture/extern-c-coroutine.C
@@ -0,0 +1,89 @@
+#include 
+#include 
+
+#ifndef OUTPUT
+#  define PRINT(X)
+#  define PRINTF(X,...)
+#else
+#  define PRINT(X) puts(X)
+#  define PRINTF printf
+#endif
+
+struct future {
+  struct promise_type;
+  using handle_type = std::coroutine_handle<promise_type>;
+  handle_type handle;
+  future () : handle(0) {}
+  future (handle_type _handle)
+: handle(_handle) {
+PRINT("Created future object from handle");
+  }
+  future (const future &) = delete; // no copying
+  future (future &&s) : handle(s.handle) {
+   s.handle = nullptr;
+   PRINT("future mv ctor ");
+  }
+  future &operator = (future &&s) {
+   handle = s.handle;
+   s.handle = nullptr;
+   PRINT("future op=  ");
+   return *this;
+  }
+  ~future() {
+PRINT("Destroyed future");
+if ( handle )
+  handle.destroy();
+  }
+
+  struct promise_type {
+void return_value (int v) {
+  PRINTF ("return_value (%d)\n", v);
+  vv = v;
+}
+
+std::suspend_always initial_suspend() noexcept { return {}; }
+std::suspend_always final_suspend() noexcept { return {}; }
+void unhandled_exception() {}
+auto get_return_object() {return handle_type::from_promise (*this);}
+
+int get_value () { return vv; }
+  private:
+int vv;
+  };
+  bool await_ready() { return false; }
+  void await_suspend(std::coroutine_handle<>) {}
+  void await_resume() {}
+};
+
+extern "C" future
+test () {
+  co_return 22;
+}
+
+extern "C" future
+f () noexcept
+{
+  PRINT ("future: about to return");
+  co_return 42;
+}
+
+int main ()
+{
+  PRINT ("main: create future");
+  future x = f ();
+  PRINT ("main: got future - resuming");
+  if (x.handle.done())
+__builtin_abort ();
+  x.handle.resume();
+  PRINT ("main: after resume");
+  int y = x.handle.promise().get_value();
+  if ( y != 42 )
+__builtin_abort ();
+  if (!x.handle.done())
+{
+  PRINT ("main: apparently not done...");
+  __builtin_abort ();
+}
+  PRINT ("main: returning");
+  return 0;
+}




Re: [PATCH] c: Diagnose compound literals with function type [PR108043]

2022-12-19 Thread Marek Polacek via Gcc-patches
On Mon, Dec 19, 2022 at 12:05:48PM +0100, Jakub Jelinek wrote:
> Hi!
> 
> Both C99 and latest C2X say that compound literal shall have an object type
> (complete object type in the latter case) or array of unknown bound,
> so complit with function type is invalid.  When the initializer had to be
> non-empty for such case, we used to diagnose it as incorrect initializer,
> but with (fntype){} now allowed we just ICE on it.
> 
> The following patch diagnoses that.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

This looks OK to me, thanks.
 
> 2022-12-19  Jakub Jelinek  
> 
>   PR c/108043
>   * c-parser.cc (c_parser_postfix_expression_after_paren_type): Diagnose
>   compound literals with function type.
> 
>   * gcc.dg/pr108043.c: New test.
>   * gcc.dg/c99-complit-2.c (foo): Adjust expected diagnostics for
>   complit with function type.
> 
> --- gcc/c/c-parser.cc.jj  2022-11-18 09:00:44.331323558 +0100
> +++ gcc/c/c-parser.cc 2022-12-16 13:08:51.143083269 +0100
> @@ -10924,6 +10924,11 @@ c_parser_postfix_expression_after_paren_
>error_at (type_loc, "compound literal has variable size");
>type = error_mark_node;
>  }
> +  else if (TREE_CODE (type) == FUNCTION_TYPE)
> +{
> +  error_at (type_loc, "compound literal has function type");
> +  type = error_mark_node;
> +}
>if (constexpr_p && type != error_mark_node)
>  {
>tree type_no_array = strip_array_types (type);
> --- gcc/testsuite/gcc.dg/pr108043.c.jj  2022-12-16 13:15:40.122083457 +0100
> +++ gcc/testsuite/gcc.dg/pr108043.c   2022-12-16 13:15:20.840366320 +0100
> @@ -0,0 +1,12 @@
> +/* PR c/108043 */
> +/* { dg-do compile } */
> +/* { dg-options "" } */
> +
> +typedef void F (void);
> +
> +void
> +foo (void)
> +{
> +  (F) {};        /* { dg-error "compound literal has function type" } */
> +  (F) { foo };   /* { dg-error "compound literal has function type" } */
> +}
> --- gcc/testsuite/gcc.dg/c99-complit-2.c.jj   2020-01-12 11:54:37.393398623 +0100
> +++ gcc/testsuite/gcc.dg/c99-complit-2.c  2022-12-19 11:51:45.098467295 +0100
> @@ -23,7 +23,7 @@ foo (int a)
>/* { dg-error "init" "incomplete union type" { target *-*-* } .-1 } */
>/* { dg-error "invalid use of undefined type" "" { target *-*-* } .-2 } */
>    (void (void)) { 0 }; /* { dg-bogus "warning" "warning in place of error" } */
> -  /* { dg-error "init" "function type" { target *-*-* } .-1 } */
> +  /* { dg-error "compound literal has function type" "function type" { target *-*-* } .-1 } */
>(int [a]) { 1 }; /* { dg-bogus "warning" "warning in place of error" } */
>/* { dg-error "init|variable" "VLA type" { target *-*-* } .-1 } */
>/* Initializers must not attempt to initialize outside the object
> 
>   Jakub
> 

Marek



Re: [PATCH PING 2 (tree)] c++: source position of lambda captures [PR84471]

2022-12-19 Thread Jason Merrill via Gcc-patches

On 12/2/22 10:45, Jason Merrill wrote:

Tested x86_64-pc-linux-gnu, OK for trunk?

-- 8< --

If the DECL_VALUE_EXPR of a VAR_DECL has EXPR_LOCATION set, then any use of
that variable looks like it has that location, which leads to the debugger
jumping back and forth for both lambdas and structured bindings.

Rather than fix all the uses, it seems simplest to remove any EXPR_LOCATION
when setting DECL_VALUE_EXPR.  So the cp/ hunks aren't necessary, but it
seems cleaner not to work to add a location that will immediately get
stripped.
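
For illustration only (a made-up example, not one of the new tests; the function names are hypothetical), these are the two kinds of source the description refers to: a by-copy lambda capture and an array structured binding, each with several later uses of the variable on separate lines.

```cpp
#include <cassert>

// Hypothetical example: each use of the captured BASE inside the lambda
// body stands for an access to the closure object.  Per the description
// above, a location stored on that replacement expression made every
// such use appear to come from one line.
static int
sum_with_capture (int base)
{
  auto add = [base] (int x) {
    int r = base;   // first use of the capture
    r += x;
    return r;
  };
  return add (1) + add (2);
}

// Hypothetical example: A and B name elements of a hidden copy of ARR,
// so the same reasoning applies to the two uses below.
static int
sum_bindings ()
{
  int arr[2] = { 3, 4 };
  auto [a, b] = arr;
  int r = a;        // use of the first binding
  r += b;           // use of the second binding
  return r;
}
```

Stepping through either function at -O0 in a debugger is one way to observe whether the reported line follows the uses or jumps back.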

PR c++/84471
PR c++/107504

gcc/cp/ChangeLog:

* coroutines.cc (transform_local_var_uses): Don't
specify a location for DECL_VALUE_EXPR.
* decl.cc (cp_finish_decomp): Likewise.

gcc/ChangeLog:

* tree.cc (decl_value_expr_insert): Clear EXPR_LOCATION.

gcc/testsuite/ChangeLog:

* g++.dg/tree-ssa/value-expr1.C: New test.
* g++.dg/tree-ssa/value-expr2.C: New test.
* g++.dg/analyzer/pr93212.C: Move warning.
---
  gcc/cp/coroutines.cc|  4 ++--
  gcc/cp/decl.cc  | 12 +++---
  gcc/testsuite/g++.dg/analyzer/pr93212.C |  4 ++--
  gcc/testsuite/g++.dg/tree-ssa/value-expr1.C | 16 +
  gcc/testsuite/g++.dg/tree-ssa/value-expr2.C | 26 +
  gcc/tree.cc |  3 +++
  6 files changed, 52 insertions(+), 13 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/tree-ssa/value-expr1.C
  create mode 100644 gcc/testsuite/g++.dg/tree-ssa/value-expr2.C

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index 01a3e831ee5..a72bd6bbef0 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -2047,8 +2047,8 @@ transform_local_var_uses (tree *stmt, int *do_subtree, void *d)
= lookup_member (lvd->coro_frame_type, local_var.field_id,
 /*protect=*/1, /*want_type=*/0,
 tf_warning_or_error);
- tree fld_idx = build3_loc (lvd->loc, COMPONENT_REF, TREE_TYPE (lvar),
-lvd->actor_frame, fld_ref, NULL_TREE);
+ tree fld_idx = build3 (COMPONENT_REF, TREE_TYPE (lvar),
+lvd->actor_frame, fld_ref, NULL_TREE);
  local_var.field_idx = fld_idx;
  SET_DECL_VALUE_EXPR (lvar, fld_idx);
  DECL_HAS_VALUE_EXPR_P (lvar) = true;
diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index 7af0b05d5f8..59e21581503 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -9133,9 +9133,7 @@ cp_finish_decomp (tree decl, tree first, unsigned int count)
  if (processing_template_decl)
continue;
  tree t = unshare_expr (dexp);
- t = build4_loc (DECL_SOURCE_LOCATION (v[i]), ARRAY_REF,
- eltype, t, size_int (i), NULL_TREE,
- NULL_TREE);
+ t = build4 (ARRAY_REF, eltype, t, size_int (i), NULL_TREE, NULL_TREE);
  SET_DECL_VALUE_EXPR (v[i], t);
  DECL_HAS_VALUE_EXPR_P (v[i]) = 1;
}
@@ -9154,9 +9152,7 @@ cp_finish_decomp (tree decl, tree first, unsigned int count)
  if (processing_template_decl)
continue;
  tree t = unshare_expr (dexp);
- t = build1_loc (DECL_SOURCE_LOCATION (v[i]),
- i ? IMAGPART_EXPR : REALPART_EXPR, eltype,
- t);
+ t = build1 (i ? IMAGPART_EXPR : REALPART_EXPR, eltype, t);
  SET_DECL_VALUE_EXPR (v[i], t);
  DECL_HAS_VALUE_EXPR_P (v[i]) = 1;
}
@@ -9180,9 +9176,7 @@ cp_finish_decomp (tree decl, tree first, unsigned int count)
  tree t = unshare_expr (dexp);
  convert_vector_to_array_for_subscript (DECL_SOURCE_LOCATION (v[i]),
 &t, size_int (i));
- t = build4_loc (DECL_SOURCE_LOCATION (v[i]), ARRAY_REF,
- eltype, t, size_int (i), NULL_TREE,
- NULL_TREE);
+ t = build4 (ARRAY_REF, eltype, t, size_int (i), NULL_TREE, NULL_TREE);
  SET_DECL_VALUE_EXPR (v[i], t);
  DECL_HAS_VALUE_EXPR_P (v[i]) = 1;
}
diff --git a/gcc/testsuite/g++.dg/analyzer/pr93212.C b/gcc/testsuite/g++.dg/analyzer/pr93212.C
index 41507e2b837..1029e8d547b 100644
--- a/gcc/testsuite/g++.dg/analyzer/pr93212.C
+++ b/gcc/testsuite/g++.dg/analyzer/pr93212.C
@@ -4,8 +4,8 @@
  auto lol()
  {
  int aha = 3;
-return [&aha] { // { dg-warning "dereferencing pointer '.*' to within stale stack frame" }
-return aha;
+return [&aha] {
+return aha; // { dg-warning "dereferencing pointer '.*' to within stale stack frame" }
  };
  /* TODO: may be worth special-casing the reporting of dangling
 references from lambdas, to highlight the declaration, and maybe fix
diff --git a/gcc/testsuite/g++.dg/tree-ssa/value-expr1.C b/gcc/testsuite/g++.dg/tree-ssa/value-expr1.C
new file mode 100644
index 00

Re: [PATCH] c++: ICE with concepts TS multiple auto deduction [PR101886]

2022-12-19 Thread Patrick Palka via Gcc-patches
On Wed, 7 Dec 2022, Patrick Palka wrote:

> In extract_autos_r, we need to reset TYPE_CANONICAL for the template
> type parameter after adjusting its index, otherwise we end up with a
> comptypes ICE for the below testcase.  Note that such in-place type
> adjustment isn't generally safe to do since the type could be the
> TYPE_CANONICAL of another (unadjusted) type, but in this case the
> canonical auto (of some level and 0 index) is the first auto (of that
> level) that's created, and so any auto that we do end up adjusting can't
> be the canonical one.
> 
> Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> trunk?

Ping.  FWIW this blocks the modules + std::source_location patch[1]
because the make_auto() call it adds to cxx_init_decl_processing would
otherwise cause us to ICE on the existing test g++.dg/concepts/auto1.C.

[1]: https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608127.html

> 
>   PR c++/101886
> 
> gcc/cp/ChangeLog:
> 
>   * pt.cc (extract_autos_r): Reset TYPE_CANONICAL after
>   adjusting the template type parameter's index.  Simplify
>   by using TEMPLATE_TYPE_IDX.  Add some sanity checks.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/concepts/auto5.C: New test.
> ---
>  gcc/cp/pt.cc  | 12 +---
>  gcc/testsuite/g++.dg/concepts/auto5.C |  9 +
>  2 files changed, 18 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/concepts/auto5.C
> 
> diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
> index 24ed718ffbb..d05a49b1c11 100644
> --- a/gcc/cp/pt.cc
> +++ b/gcc/cp/pt.cc
> @@ -29164,18 +29164,24 @@ extract_autos_r (tree t, void *data)
>  {
>/* All the autos were built with index 0; fix that up now.  */
>tree *p = hash.find_slot (t, INSERT);
> -  unsigned idx;
> +  int idx;
>if (*p)
>   /* If this is a repeated constrained-type-specifier, use the index we
>  chose before.  */
> - idx = TEMPLATE_PARM_IDX (TEMPLATE_TYPE_PARM_INDEX (*p));
> + idx = TEMPLATE_TYPE_IDX (*p);
>else
>   {
> /* Otherwise this is new, so use the current count.  */
> *p = t;
> idx = hash.elements () - 1;
>   }
> -  TEMPLATE_PARM_IDX (TEMPLATE_TYPE_PARM_INDEX (t)) = idx;
> +  if (idx != TEMPLATE_TYPE_IDX (t))
> + {
> +   gcc_checking_assert (TEMPLATE_TYPE_IDX (t) == 0);
> +   gcc_checking_assert (TYPE_CANONICAL (t) != t);
> +   TEMPLATE_TYPE_IDX (t) = idx;
> +   TYPE_CANONICAL (t) = canonical_type_parameter (t);
> + }
>  }
>  
>/* Always keep walking.  */
> diff --git a/gcc/testsuite/g++.dg/concepts/auto5.C b/gcc/testsuite/g++.dg/concepts/auto5.C
> new file mode 100644
> index 000..f1d653efd87
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/concepts/auto5.C
> @@ -0,0 +1,9 @@
> +// PR c++/101886
> +// { dg-do compile { target c++17_only } }
> +// { dg-options "-fconcepts-ts" }
> +
> +template struct A { };
> +
> +A a;
> +A b1 = a;
> +A b2 = a;
> -- 
> 2.39.0.rc2
> 
> 



Re: [PATCH] c++: NTTP object wrapper substitution fixes [PR103346, ...]

2022-12-19 Thread Patrick Palka via Gcc-patches
On Tue, 6 Dec 2022, Patrick Palka wrote:

> On Tue, 6 Dec 2022, Patrick Palka wrote:
> 
> > This patch fixes some issues with substitution into a C++20 template
> > parameter object wrapper:
> > 
> > * The first testcase demonstrates a situation where the same_type_p
> >   assert in relevant case of tsubst_copy doesn't hold, because (partial)
> >   substitution of {int,} into the VIEW_CONVERT_EXPR wrapper yields
> >   A but substitution into the underlying TEMPLATE_PARM_INDEX is a
> >   nop and yields A due to tsubst's level == 1 early exit test.  So
> >   this patch just gets rid of the assert; the type mismatch doesn't
> >   seem to be a problem in practice since the coercion is from one
> >   dependent type to another.
> > 
> > * In the second testcase, dependent substitution into the underlying
> >   TEMPLATE_PARM_INDEX yields a CALL_EXPR with empty TREE_TYPE, which
> >   tsubst_copy doesn't expect.  This patch fixes this by handling empty
> >   TREE_TYPE the same way as a non-const TREE_TYPE.  Moreover, after
> >   this substitution we're left with a VIEW_CONVERT_EXPR wrapping a
> >   CALL_EXPR instead of a TEMPLATE_PARM_INDEX, which during the subsequent
> >   non-dependent substitution tsubst_copy doesn't expect either.  So
> >   this patch also relaxes the tsubst_copy case to accept such
> >   VIEW_CONVERT_EXPR too.
> > 
> > * In the third testcase, we end up never resolving the call to
> >   f.modify() since tsubst_copy doesn't do overload resolution.
> >   This patch fixes this by moving the handling of these
> >   VIEW_CONVERT_EXPR wrappers from tsubst_copy to tsubst_copy_and_build.
> >   And it turns out (at least according to our testsuite) that
> >   tsubst_copy doesn't directly need to handle the other kinds of
> >   NON_LVALUE_EXPR and VIEW_CONVERT_EXPR, so this patch also gets rid
> >   of the location_wrapper_p handling from tsubst_copy and moves the
> >   REF_PARENTHESIZED_P handling to tsubst_copy_and_build.
> 
> On second thought, getting rid of the location_wrapper_p and
> REF_PARENTHESIZED_P handling from tsubst_copy is perhaps too
> risky at this stage.  The following patch instead just moves
> the tparm object wrapper handling from tsubst_copy to
> tsubst_copy_and_build and leaves the rest of tsubst_copy alone.
> Smoke tested so far, full bootstrap+regtest in progress:
> 
> -- >8--
> 
> Subject: [PATCH] c++: NTTP object wrapper substitution fixes [PR103346, ...]
> 
> This patch fixes some issues with substitution into a C++20 template
> parameter object wrapper:
> 
> * The first testcase demonstrates a situation where the same_type_p
>   assert in relevant case of tsubst_copy doesn't hold, because (partial)
>   substitution of {int,} into the VIEW_CONVERT_EXPR wrapper yields
>   A<int> but substitution into the underlying TEMPLATE_PARM_INDEX is a
>   nop and yields A<T> due to tsubst's level == 1 early exit test.  So
>   this patch just gets rid of the assert; the type mismatch doesn't
>   seem to be a problem in practice, I suppose because the coercion is
>   from one dependent type to another.
> 
> * In the second testcase, dependent substitution into the underlying
>   TEMPLATE_PARM_INDEX yields a CALL_EXPR with empty TREE_TYPE, which
>   tsubst_copy doesn't expect.  This patch fixes this by handling empty
>   TREE_TYPE the same way as a non-const TREE_TYPE.  Moreover, after
>   this substitution we're left with a VIEW_CONVERT_EXPR wrapping a
>   CALL_EXPR instead of a TEMPLATE_PARM_INDEX, which during the subsequent
>   non-dependent substitution tsubst_copy doesn't expect either.  So
>   this patch also relaxes tsubst_copy to accept such VIEW_CONVERT_EXPR
>   too.
> 
> * In the third testcase, we end up never resolving the call to
>   f.modify() since tsubst_copy doesn't do overload resolution.
>   This patch fixes this by moving the handling of these
>   VIEW_CONVERT_EXPR wrappers from tsubst_copy to tsubst_copy_and_build.
>   For good measure tsubst_copy_and_build should also handle
>   REF_PARENTHESIZED_P wrappers instead of delegating to tsubst_copy.
> 
> After this patch, VIEW_CONVERT_EXPR substitution is ultimately just
> moved from tsubst_copy to tsubst_copy_and_build and made more
> permissive.
> 
> Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> trunk?

Ping.

> 
>   PR c++/103346
>   PR c++/104278
>   PR c++/102553
> 
> gcc/cp/ChangeLog:
> 
>   * pt.cc (tsubst_copy) : In the handling
>   of C++20 template parameter object wrappers: Remove same_type_p
>   assert.  Accept non-TEMPLATE_PARM_INDEX inner operand.  Handle
>   empty TREE_TYPE on substituted inner operand.  Move it to ...
>   (tsubst_copy_and_build): ... here.  Also handle REF_PARENTHESIZED_P
>   VIEW_CONVERT_EXPRs.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/cpp2a/nontype-class52a.C: New test.
>   * g++.dg/cpp2a/nontype-class53.C: New test.
>   * g++.dg/cpp2a/nontype-class54.C: New test.
>   * g++.dg/cpp2a/nontype-class55.C: New test.

[PATCH] libatomic: Provide gthr.h default implementation

2022-12-19 Thread Sebastian Huber
Build libatomic for all targets.  Use gthr.h to provide a default
implementation.  If the thread model is "single", this implementation will not
work when, for example, atomic operations are used for thread/interrupt
synchronization.
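
As an illustration of what a lock-based default looks like, here is a
minimal sketch.  It is not the lock.c added by this patch: std::mutex stands
in for the gthr.h mutex type, and every name in it (sketch::lock_for,
sketch::exchange_bytes, the table size and the 64-byte granule) is an
assumption made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <mutex>

namespace sketch {

// Hypothetical parameters: a small fixed table of locks, one per
// 64-byte granule of address space (modulo the table size).
constexpr std::size_t kNumLocks = 64;
constexpr std::size_t kGranule = 64;

static std::mutex g_locks[kNumLocks];

// Pick the lock that guards the object at address p.
inline std::mutex &lock_for(const void *p) {
  auto addr = reinterpret_cast<std::uintptr_t>(p);
  return g_locks[(addr / kGranule) % kNumLocks];
}

// Generic exchange for an n-byte object: store the old contents of
// *obj into *ret and the contents of *val into *obj, all while
// holding the lock selected for obj.
inline void exchange_bytes(void *obj, const void *val, void *ret,
                           std::size_t n) {
  std::lock_guard<std::mutex> guard(lock_for(obj));
  std::memcpy(ret, obj, n);
  std::memcpy(obj, val, n);
}

}  // namespace sketch
```

A scheme like this only excludes other threads that take the same mutex.
If the mutex operations do nothing, which is presumably the case for the
"single" thread model, an interrupt handler can still observe a
half-updated object.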

libatomic/ChangeLog:

* Makefile.am (BUILT_SOURCES): New.
(gthr.h): Link from libgcc source file if USE_CONFIG_GTHR is enabled.
(gthr-default.h): Likewise.
* Makefile.in: Regenerate.
* configure: Likewise.
* configure.ac: Map thread model to thread header.
(thread_header): New substitution.
(USE_CONFIG_GTHR): New automake conditional.
* configure.tgt (*-*-elf*): Delete.
(UNSUPPORTED): Likewise.
(USE_CONFIG_GTHR): Define as default.
* testsuite/Makefile.in: Regenerate.
* config/gthr/host-config.h: New file.
* config/gthr/lock.c: Likewise.
---
 libatomic/Makefile.am   |  12 +++
 libatomic/Makefile.in   |  40 +---
 libatomic/config/gthr/host-config.h |  55 +++
 libatomic/config/gthr/lock.c| 136 
 libatomic/configure |  40 +++-
 libatomic/configure.ac  |  10 +-
 libatomic/configure.tgt |  16 +---
 libatomic/testsuite/Makefile.in |   4 +-
 8 files changed, 281 insertions(+), 32 deletions(-)
 create mode 100644 libatomic/config/gthr/host-config.h
 create mode 100644 libatomic/config/gthr/lock.c

diff --git a/libatomic/Makefile.am b/libatomic/Makefile.am
index 41e5da28512..d56d553c9d5 100644
--- a/libatomic/Makefile.am
+++ b/libatomic/Makefile.am
@@ -34,6 +34,18 @@ search_path = $(addprefix $(top_srcdir)/config/, $(config_path)) \
 
 vpath % $(strip $(search_path))
 
+BUILT_SOURCES =
+
+if USE_CONFIG_GTHR
+gthr.h: $(top_srcdir)/../libgcc/gthr.h
+   -$(LN_S) $< $@
+
+gthr-default.h: $(top_srcdir)/../libgcc/$(thread_header)
+   -$(LN_S) $< $@
+
+BUILT_SOURCES += gthr.h gthr-default.h
+endif
+
 DEFAULT_INCLUDES = $(addprefix -I, $(search_path))
 AM_CFLAGS = $(XCFLAGS)
 AM_CCASFLAGS = $(XCFLAGS)
diff --git a/libatomic/Makefile.in b/libatomic/Makefile.in
index a0fa3dfc8cc..2e3ddd014f8 100644
--- a/libatomic/Makefile.in
+++ b/libatomic/Makefile.in
@@ -89,15 +89,16 @@ POST_UNINSTALL = :
 build_triplet = @build@
 host_triplet = @host@
 target_triplet = @target@
-@ARCH_AARCH64_LINUX_TRUE@@HAVE_IFUNC_TRUE@am__append_1 = $(foreach s,$(SIZES),$(addsuffix _$(s)_1_.lo,$(SIZEOBJS)))
-@ARCH_AARCH64_LINUX_TRUE@@HAVE_IFUNC_TRUE@am__append_2 = atomic_16.S
-@ARCH_ARM_LINUX_TRUE@@HAVE_IFUNC_TRUE@am__append_3 = $(foreach \
+@USE_CONFIG_GTHR_TRUE@am__append_1 = gthr.h gthr-default.h
+@ARCH_AARCH64_LINUX_TRUE@@HAVE_IFUNC_TRUE@am__append_2 = $(foreach s,$(SIZES),$(addsuffix _$(s)_1_.lo,$(SIZEOBJS)))
+@ARCH_AARCH64_LINUX_TRUE@@HAVE_IFUNC_TRUE@am__append_3 = atomic_16.S
+@ARCH_ARM_LINUX_TRUE@@HAVE_IFUNC_TRUE@am__append_4 = $(foreach \
 @ARCH_ARM_LINUX_TRUE@@HAVE_IFUNC_TRUE@ s,$(SIZES),$(addsuffix \
 @ARCH_ARM_LINUX_TRUE@@HAVE_IFUNC_TRUE@ _$(s)_1_.lo,$(SIZEOBJS))) \
 @ARCH_ARM_LINUX_TRUE@@HAVE_IFUNC_TRUE@ $(addsuffix \
 @ARCH_ARM_LINUX_TRUE@@HAVE_IFUNC_TRUE@ _8_2_.lo,$(SIZEOBJS))
-@ARCH_I386_TRUE@@HAVE_IFUNC_TRUE@am__append_4 = $(addsuffix _8_1_.lo,$(SIZEOBJS))
-@ARCH_X86_64_TRUE@@HAVE_IFUNC_TRUE@am__append_5 = $(addsuffix _16_1_.lo,$(SIZEOBJS)) \
+@ARCH_I386_TRUE@@HAVE_IFUNC_TRUE@am__append_5 = $(addsuffix _8_1_.lo,$(SIZEOBJS))
+@ARCH_X86_64_TRUE@@HAVE_IFUNC_TRUE@am__append_6 = $(addsuffix _16_1_.lo,$(SIZEOBJS)) \
 @ARCH_X86_64_TRUE@@HAVE_IFUNC_TRUE@   $(addsuffix _16_2_.lo,$(SIZEOBJS))
 
 subdir = .
@@ -115,7 +116,8 @@ am__aclocal_m4_deps = $(top_srcdir)/../config/acx.m4 \
$(top_srcdir)/../ltversion.m4 $(top_srcdir)/../lt~obsolete.m4 \
$(top_srcdir)/acinclude.m4 $(top_srcdir)/../libtool.m4 \
$(top_srcdir)/../config/enable.m4 \
-   $(top_srcdir)/../config/cet.m4 $(top_srcdir)/configure.ac
+   $(top_srcdir)/../config/cet.m4 $(top_srcdir)/../config/gthr.m4 \
+   $(top_srcdir)/configure.ac
 am__configure_deps = $(am__aclocal_m4_deps) $(CONFIGURE_DEPENDENCIES) \
$(ACLOCAL_M4)
 DIST_COMMON = $(srcdir)/Makefile.am $(top_srcdir)/configure \
@@ -392,6 +394,7 @@ target_alias = @target_alias@
 target_cpu = @target_cpu@
 target_os = @target_os@
 target_vendor = @target_vendor@
+thread_header = @thread_header@
 tmake_file = @tmake_file@
 toolexecdir = @toolexecdir@
 toolexeclibdir = @toolexeclibdir@
@@ -404,6 +407,7 @@ gcc_version := $(shell @get_gcc_base_ver@ $(top_srcdir)/../gcc/BASE-VER)
 search_path = $(addprefix $(top_srcdir)/config/, $(config_path)) \
$(top_srcdir) $(top_builddir)
 
+BUILT_SOURCES = $(am__append_1)
 DEFAULT_INCLUDES = $(addprefix -I, $(search_path))
 AM_CFLAGS = $(XCFLAGS)
 AM_CCASFLAGS = $(XCFLAGS)
@@ -419,7 +423,7 @@ noinst_LTLIBRARIES = libatomic_convenience.la
 libatomic_version_info = -version-info $(libtool_VERSION)
 libatomic_la_LDFLAGS = $(libatomic_version_info) $(libatomic_ver

Re: [PATCH] i386: correct division modeling in lujiazui.md

2022-12-19 Thread Alexander Monakov via Gcc-patches
Ping. If there are any questions or concerns about the patch, please let me
know: I'm interested in continuing this cleanup at least for older AMD models.

I noticed I had an extra line in my Changelog:

>   (lua_sseicvt_si): Ditto.

It got there accidentally and I will drop it.

Alexander

On Fri, 9 Dec 2022, Alexander Monakov wrote:

> Model the divider in Lujiazui processors as a separate automaton to
> significantly reduce the overall model size. This should also result
> in improved accuracy, as pipe 0 should be able to accept new
> instructions while the divider is occupied.
> 
> It is unclear why integer divisions are modeled as if pipes 0-3 are all
> occupied. I've opted to keep a single-cycle reservation of all four
> pipes together, so GCC should continue trying to pack instructions
> around a division accordingly.
> 
> Currently top three symbols in insn-automata.o are:
> 
> 106102 r lujiazui_core_check
> 106102 r lujiazui_core_transitions
> 196123 r lujiazui_core_min_issue_delay
> 
> This patch shrinks all lujiazui tables to:
> 
> 3 r lujiazui_decoder_min_issue_delay
> 20 r lujiazui_decoder_transitions
> 32 r lujiazui_agu_min_issue_delay
> 126 r lujiazui_agu_transitions
> 304 r lujiazui_div_base
> 352 r lujiazui_div_check
> 352 r lujiazui_div_transitions
> 1152 r lujiazui_core_min_issue_delay
> 1592 r lujiazui_agu_translate
> 1592 r lujiazui_core_translate
> 1592 r lujiazui_decoder_translate
> 1592 r lujiazui_div_translate
> 3952 r lujiazui_div_min_issue_delay
> 9216 r lujiazui_core_transitions
> 
> This continues the work on reducing i386 insn-automata.o size started
> with similar fixes for division and multiplication instructions in
> znver.md [1][2]. I plan to submit corresponding fixes for
> b[td]ver[123].md as well.
> 
> [1] https://inbox.sourceware.org/gcc-patches/23c795d6-403c-5927-e610-f0f1215f5...@ispras.ru/T/#m36e069d43d07d768d4842a779e26b4a0915cc543
> [2] https://inbox.sourceware.org/gcc-patches/20221101162637.14238-1-amona...@ispras.ru/
> 
> gcc/ChangeLog:
> 
>   PR target/87832
>   * config/i386/lujiazui.md (lujiazui_div): New automaton.
>   (lua_div): New unit.
>   (lua_idiv_qi): Correct unit in the reservation.
>   (lua_idiv_qi_load): Ditto.
>   (lua_idiv_hi): Ditto.
>   (lua_idiv_hi_load): Ditto.
>   (lua_idiv_si): Ditto.
>   (lua_idiv_si_load): Ditto.
>   (lua_idiv_di): Ditto.
>   (lua_idiv_di_load): Ditto.
>   (lua_fdiv_SF): Ditto.
>   (lua_fdiv_SF_load): Ditto.
>   (lua_fdiv_DF): Ditto.
>   (lua_fdiv_DF_load): Ditto.
>   (lua_fdiv_XF): Ditto.
>   (lua_fdiv_XF_load): Ditto.
>   (lua_ssediv_SF): Ditto.
>   (lua_ssediv_load_SF): Ditto.
>   (lua_ssediv_V4SF): Ditto.
>   (lua_ssediv_load_V4SF): Ditto.
>   (lua_ssediv_V8SF): Ditto.
>   (lua_ssediv_load_V8SF): Ditto.
>   (lua_ssediv_SD): Ditto.
>   (lua_ssediv_load_SD): Ditto.
>   (lua_ssediv_V2DF): Ditto.
>   (lua_ssediv_load_V2DF): Ditto.
>   (lua_ssediv_V4DF): Ditto.
>   (lua_ssediv_load_V4DF): Ditto.
>   (lua_sseicvt_si): Ditto.
> ---
>  gcc/config/i386/lujiazui.md | 58 +++--
>  1 file changed, 30 insertions(+), 28 deletions(-)
> 
> diff --git a/gcc/config/i386/lujiazui.md b/gcc/config/i386/lujiazui.md
> index 9046c09f2..58a230c70 100644
> --- a/gcc/config/i386/lujiazui.md
> +++ b/gcc/config/i386/lujiazui.md
> @@ -19,8 +19,8 @@
>  
>  ;; Scheduling for ZHAOXIN lujiazui processor.
>  
> -;; Modeling automatons for decoders, execution pipes and AGU pipes.
> -(define_automaton "lujiazui_decoder,lujiazui_core,lujiazui_agu")
> +;; Modeling automatons for decoders, execution pipes, AGU pipes, and divider.
> +(define_automaton "lujiazui_decoder,lujiazui_core,lujiazui_agu,lujiazui_div")
>  
>  ;; The rules for the decoder are simple:
>  ;;  - an instruction with 1 uop can be decoded by any of the three
> @@ -55,6 +55,8 @@ (define_reservation "lua_decoder01" "lua_decoder0|lua_decoder1")
>  (define_cpu_unit "lua_p0,lua_p1,lua_p2,lua_p3" "lujiazui_core")
>  (define_cpu_unit "lua_p4,lua_p5" "lujiazui_agu")
>  
> +(define_cpu_unit "lua_div" "lujiazui_div")
> +
>  (define_reservation "lua_p03" "lua_p0|lua_p3")
>  (define_reservation "lua_p12" "lua_p1|lua_p2")
>  (define_reservation "lua_p1p2" "lua_p1+lua_p2")
> @@ -229,56 +231,56 @@ (define_insn_reservation "lua_idiv_qi" 21
> (and (eq_attr "memory" "none")
>  (and (eq_attr "mode" "QI")
>   (eq_attr "type" "idiv"
> -  "lua_decoder0,lua_p0p1p2p3*21")
> +  "lua_decoder0,lua_p0p1p2p3,lua_div*21")
>  
>  (define_insn_reservation "lua_idiv_qi_load" 25
>(and (eq_attr "cpu" "lujiazui")
> (and (eq_attr "memory" "load")
>  (and (eq_attr "mode" "QI")
>   (eq_attr "type" "idiv"
> - 

Re: [PATCH V7] rs6000: Optimize cmp on rotated 16bits constant

2022-12-19 Thread Segher Boessenkool
Hi!

Mostly nitpicking left:

On Mon, Dec 19, 2022 at 10:06:45PM +0800, Jiufu Guo wrote:
> When checking eq/ne with a constant which has only 16bits, it can be
> optimized to check the rotated data.  By this, the constant building
> is optimized.
> 
> As the example in PR103743:
> For "in == 0x8000000000000000LL", this patch generates:
> rotldi 3,3,1 ; cmpldi 0,3,1
> instead of:
> li 9,-1 ; rldicr 9,9,0,0 ; cmpd 0,3,9

Excellent :-)

>   * config/rs6000/rs6000-protos.h (can_be_rotated_to_lowbits): New.
>   (can_be_rotated_to_positive_16bits): New.
>   (can_be_rotated_to_negative_15bits): New.
>   * config/rs6000/rs6000.cc (can_be_rotated_to_lowbits): New definition.
>   (can_be_rotated_to_positive_16bits): New definition.
>   (can_be_rotated_to_negative_15bits): New definition.
>   * config/rs6000/rs6000.md (*rotate_on_cmpdi): New define_insn_and_split.

Good names.  Great function comments as well.

> +/* Check if C (as 64bit integer) can be rotated to a constant which contains
> +   nonzero bits at LOWBITS only.

"at the LOWBITS low bits only".  Well it probably is clear what is
meant :-)
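
For readers following along, the check that comment describes can be
sketched as below.  This is an illustration only, not the code from the
patch: the name rotates_to_lowbits and the brute-force loop over all 64
rotations are assumptions made for the example, and the real function may
well compute the rotation directly instead.

```cpp
#include <cstdint>

// Rotate a 64-bit value left by n bits, for n in 0..63.
static inline uint64_t rotl64(uint64_t x, unsigned n) {
  return n == 0 ? x : (x << n) | (x >> (64 - n));
}

// Hypothetical helper: return true if some left rotation of c leaves
// nonzero bits only in the low `lowbits` bits.  On success the
// smallest such rotation count is stored in *rot.
bool rotates_to_lowbits(uint64_t c, unsigned lowbits, unsigned *rot) {
  const uint64_t mask =
      lowbits >= 64 ? ~0ULL : ((1ULL << lowbits) - 1);
  for (unsigned n = 0; n < 64; ++n) {
    if ((rotl64(c, n) & ~mask) == 0) {
      *rot = n;
      return true;
    }
  }
  return false;
}
```

For a constant with only the most significant bit set and lowbits == 16,
this reports a rotation of 1, which lines up with the "rotldi 3,3,1 ;
cmpldi 0,3,1" sequence quoted earlier.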

> +   Return true if C can be rotated to such constant.  And *ROT is written to
> +   the number by which C is rotated.
> +   Return false otherwise.  */

"If so, *ROT is written" etc.

> +(define_code_iterator eqne [eq ne])

You should say in the changelog that "eqne" was moved.
"(eqne): Move earlier." is plenty of course.

> +(define_insn_and_split "*rotate_on_cmpdi"

> +  rtx note = find_reg_note (curr_insn, REG_BR_PROB, 0);

Move this much later please, to just before it is used.

> +  /* keep the probability info for the prediction of the branch insn.  */

"Keep", sentences start with a capital.

> +}
> +)

These go on one line, as just
})

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr103743.c
> @@ -0,0 +1,52 @@
> +/* { dg-options "-O2" } */
> +/* { dg-do compile { target has_arch_ppc64 } } */
> +
> +/* { dg-final { scan-assembler-times {\mcmpldi\M} 10  } } */
> +/* { dg-final { scan-assembler-times {\mcmpdi\M} 4  } } */
> +/* { dg-final { scan-assembler-times {\mrotldi\M} 14  } } */
> +

With so much going on in just one function, I am a bit worried that this
testcase will easily fail in the future.  We will see.

Okay for trunk with those i's dotted.  Thank you!


Segher


[PATCH] libgo: check if -lucontext is required for {make, set, get}context

2022-12-19 Thread soeren--- via Gcc-patches
From: Sören Tempel 

This patch is similar to the existing check for librt. If libucontext
is installed and libucontext.a provides the aforementioned symbols, then
it is added to $LIBS. If not, no error is emitted. We could,
alternatively, also check libc.a for these symbols and thus prefer libc
over libucontext if both are installed and provide the symbols. If
deemed desirable, this could be achieved by changing the invocation
to AC_SEARCH_LIBS([makecontext], [c ucontext]).

This version of this patch has been tested on x86_64 Alpine Linux Edge
(libucontext 1.2 + musl 1.2.3) and Arch Linux (glibc 2.36). On the
latter, the check is a no-op and $LIBS is not modified.

Signed-off-by: Sören Tempel 
---
 libgo/configure| 178 +
 libgo/configure.ac |   5 ++
 2 files changed, 183 insertions(+)

diff --git a/libgo/configure b/libgo/configure
index 460fdad7..ac9202dc 100755
--- a/libgo/configure
+++ b/libgo/configure
@@ -14818,6 +14818,184 @@ fi
 
 
 
+{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: checking for library containing makecontext" >&5
+printf %s "checking for library containing makecontext... " >&6; }
+if test ${ac_cv_search_makecontext+y}
+then :
+  printf %s "(cached) " >&6
+else $as_nop
+  ac_func_search_save_LIBS=$LIBS
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+/* Override any GCC internal prototype to avoid an error.
+   Use char because int might match the return type of a GCC
+   builtin and then its argument prototype would still apply.  */
+char makecontext ();
+int
+main (void)
+{
+return makecontext ();
+  ;
+  return 0;
+}
+_ACEOF
+for ac_lib in '' ucontext
+do
+  if test -z "$ac_lib"; then
+ac_res="none required"
+  else
+ac_res=-l$ac_lib
+LIBS="-l$ac_lib  $ac_func_search_save_LIBS"
+  fi
+  if ac_fn_c_try_link "$LINENO"
+then :
+  ac_cv_search_makecontext=$ac_res
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.beam \
+conftest$ac_exeext
+  if test ${ac_cv_search_makecontext+y}
+then :
+  break
+fi
+done
+if test ${ac_cv_search_makecontext+y}
+then :
+
+else $as_nop
+  ac_cv_search_makecontext=no
+fi
+rm conftest.$ac_ext
+LIBS=$ac_func_search_save_LIBS
+fi
+{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: result: $ac_cv_search_makecontext" >&5
+printf "%s\n" "$ac_cv_search_makecontext" >&6; }
+ac_res=$ac_cv_search_makecontext
+if test "$ac_res" != no
+then :
+  test "$ac_res" = "none required" || LIBS="$ac_res $LIBS"
+
+fi
+
+{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: checking for library containing getcontext" >&5
+printf %s "checking for library containing getcontext... " >&6; }
+if test ${ac_cv_search_getcontext+y}
+then :
+  printf %s "(cached) " >&6
+else $as_nop
+  ac_func_search_save_LIBS=$LIBS
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+/* Override any GCC internal prototype to avoid an error.
+   Use char because int might match the return type of a GCC
+   builtin and then its argument prototype would still apply.  */
+char getcontext ();
+int
+main (void)
+{
+return getcontext ();
+  ;
+  return 0;
+}
+_ACEOF
+for ac_lib in '' ucontext
+do
+  if test -z "$ac_lib"; then
+ac_res="none required"
+  else
+ac_res=-l$ac_lib
+LIBS="-l$ac_lib  $ac_func_search_save_LIBS"
+  fi
+  if ac_fn_c_try_link "$LINENO"
+then :
+  ac_cv_search_getcontext=$ac_res
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.beam \
+conftest$ac_exeext
+  if test ${ac_cv_search_getcontext+y}
+then :
+  break
+fi
+done
+if test ${ac_cv_search_getcontext+y}
+then :
+
+else $as_nop
+  ac_cv_search_getcontext=no
+fi
+rm conftest.$ac_ext
+LIBS=$ac_func_search_save_LIBS
+fi
+{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: result: $ac_cv_search_getcontext" >&5
+printf "%s\n" "$ac_cv_search_getcontext" >&6; }
+ac_res=$ac_cv_search_getcontext
+if test "$ac_res" != no
+then :
+  test "$ac_res" = "none required" || LIBS="$ac_res $LIBS"
+
+fi
+
+{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: checking for library containing setcontext" >&5
+printf %s "checking for library containing setcontext... " >&6; }
+if test ${ac_cv_search_setcontext+y}
+then :
+  printf %s "(cached) " >&6
+else $as_nop
+  ac_func_search_save_LIBS=$LIBS
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+
+/* Override any GCC internal prototype to avoid an error.
+   Use char because int might match the return type of a GCC
+   builtin and then its argument prototype would still apply.  */
+char setcontext ();
+int
+main (void)
+{
+return setcontext ();
+  ;
+  return 0;
+}
+_ACEOF
+for ac_lib in '' ucontext
+do
+  if test -z "$ac_lib"; then
+ac_res="none required"
+  else
+ac_res=-l$ac_lib
+LIBS="-l$ac_lib  $ac_func_search_save_LIBS"
+  fi
+  if ac_fn_c_try_link "$LINENO"
+then :
+  ac_cv_search_setcontext=$ac_res
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.beam \
+conftest$ac_exeext
+  if test ${ac_cv_search_setcontext+y}
+then :
+  break
+fi
+done
+if

Re: [PATCH] c++: NTTP object wrapper substitution fixes [PR103346, ...]

2022-12-19 Thread Jason Merrill via Gcc-patches

On 12/6/22 13:35, Patrick Palka wrote:

This patch fixes some issues with substitution into a C++20 template
parameter object wrapper:

* The first testcase demonstrates a situation where the same_type_p
   assert in relevant case of tsubst_copy doesn't hold, because (partial)
   substitution of {int,} into the VIEW_CONVERT_EXPR wrapper yields
   A<int> but substitution into the underlying TEMPLATE_PARM_INDEX is a
   nop and yields A<T> due to tsubst's level == 1 early exit test.


We exit in that case because continuing would reduce the level to an 
impossible 0.  Why doesn't the preceding code find a binding for T?



   So
   this patch just gets rid of the assert; the type mismatch doesn't
   seem to be a problem in practice, I suppose because the coercion is
   from one dependent type to another.

* In the second testcase, dependent substitution into the underlying
   TEMPLATE_PARM_INDEX yields a CALL_EXPR with empty TREE_TYPE, which
   tsubst_copy doesn't expect.  This patch fixes this by handling empty
   TREE_TYPE the same way as a non-const TREE_TYPE.  Moreover, after
   this substitution we're left with a VIEW_CONVERT_EXPR wrapping a
   CALL_EXPR instead of a TEMPLATE_PARM_INDEX, which during the subsequent
   non-dependent substitution tsubst_copy doesn't expect either.  So
   this patch also relaxes tsubst_copy to accept such VIEW_CONVERT_EXPR
   too.

* In the third testcase, we end up never resolving the call to
   f.modify() since tsubst_copy doesn't do overload resolution.
   This patch fixes this by moving the handling of these
   VIEW_CONVERT_EXPR wrappers from tsubst_copy to tsubst_copy_and_build.
   For good measure tsubst_copy_and_build should also handle
   REF_PARENTHESIZED_P wrappers instead of delegating to tsubst_copy.

After this patch, VIEW_CONVERT_EXPR substitution is ultimately just
moved from tsubst_copy to tsubst_copy_and_build and made more
permissive.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

PR c++/103346
PR c++/104278
PR c++/102553

gcc/cp/ChangeLog:

* pt.cc (tsubst_copy) : In the handling
of C++20 template parameter object wrappers: Remove same_type_p
assert.  Accept non-TEMPLATE_PARM_INDEX inner operand.  Handle
empty TREE_TYPE on substituted inner operand.  Move it to ...
(tsubst_copy_and_build): ... here.  Also handle REF_PARENTHESIZED_P
VIEW_CONVERT_EXPRs.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/nontype-class52a.C: New test.
* g++.dg/cpp2a/nontype-class53.C: New test.
* g++.dg/cpp2a/nontype-class54.C: New test.
* g++.dg/cpp2a/nontype-class55.C: New test.
---
  gcc/cp/pt.cc  | 73 ++-
  gcc/testsuite/g++.dg/cpp2a/nontype-class52a.C | 15 
  gcc/testsuite/g++.dg/cpp2a/nontype-class53.C  | 25 +++
  gcc/testsuite/g++.dg/cpp2a/nontype-class54.C  | 23 ++
  gcc/testsuite/g++.dg/cpp2a/nontype-class55.C  | 15 
  5 files changed, 116 insertions(+), 35 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/nontype-class52a.C
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/nontype-class53.C
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/nontype-class54.C
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/nontype-class55.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 2d8e4fdd4b5..0a196f069ad 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -17271,42 +17271,16 @@ tsubst_copy (tree t, tree args, tsubst_flags_t complain, tree in_decl)
  return maybe_wrap_with_location (op0, EXPR_LOCATION (t));
}
  tree op = TREE_OPERAND (t, 0);
- if (code == VIEW_CONVERT_EXPR
- && TREE_CODE (op) == TEMPLATE_PARM_INDEX)
-   {
- /* Wrapper to make a C++20 template parameter object const.  */
- op = tsubst_copy (op, args, complain, in_decl);
- if (!CP_TYPE_CONST_P (TREE_TYPE (op)))
-   {
- /* The template argument is not const, presumably because
-it is still dependent, and so not the const template parm
-object.  */
- tree type = tsubst (TREE_TYPE (t), args, complain, in_decl);
- gcc_checking_assert (same_type_ignoring_top_level_qualifiers_p
-  (type, TREE_TYPE (op)));
- if (TREE_CODE (op) == CONSTRUCTOR
- || TREE_CODE (op) == IMPLICIT_CONV_EXPR)
-   {
- /* Don't add a wrapper to these.  */
- op = copy_node (op);
- TREE_TYPE (op) = type;
-   }
- else
-   /* Do add a wrapper otherwise (in particular, if op is
-  another TEMPLATE_PARM_INDEX).  */
-   op = build1 (code, type, op);
-   }
- return op;
-   }
  /* force_paren_expr can also cre

Re: [PATCH] c++: ICE with concepts TS multiple auto deduction [PR101886]

2022-12-19 Thread Jason Merrill via Gcc-patches

On 12/7/22 15:18, Patrick Palka wrote:

In extract_autos_r, we need to reset TYPE_CANONICAL for the template
type parameter after adjusting its index, otherwise we end up with a
comptypes ICE for the below testcase.  Note that such in-place type
adjustment isn't generally safe to do since the type could be the
TYPE_CANONICAL of another (unadjusted) type, but in this case the
canonical auto (of some level and 0 index) is the first auto (of that
level) that's created, and so any auto that we do end up adjusting can't
be the canonical one.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?


OK.


PR c++/101886

gcc/cp/ChangeLog:

* pt.cc (extract_autos_r): Reset TYPE_CANONICAL after
adjusting the template type parameter's index.  Simplify
by using TEMPLATE_TYPE_IDX.  Add some sanity checks.

gcc/testsuite/ChangeLog:

* g++.dg/concepts/auto5.C: New test.
---
  gcc/cp/pt.cc  | 12 +---
  gcc/testsuite/g++.dg/concepts/auto5.C |  9 +
  2 files changed, 18 insertions(+), 3 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/concepts/auto5.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 24ed718ffbb..d05a49b1c11 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -29164,18 +29164,24 @@ extract_autos_r (tree t, void *data)
  {
/* All the autos were built with index 0; fix that up now.  */
tree *p = hash.find_slot (t, INSERT);
-  unsigned idx;
+  int idx;
if (*p)
/* If this is a repeated constrained-type-specifier, use the index we
   chose before.  */
-   idx = TEMPLATE_PARM_IDX (TEMPLATE_TYPE_PARM_INDEX (*p));
+   idx = TEMPLATE_TYPE_IDX (*p);
else
{
  /* Otherwise this is new, so use the current count.  */
  *p = t;
  idx = hash.elements () - 1;
}
-  TEMPLATE_PARM_IDX (TEMPLATE_TYPE_PARM_INDEX (t)) = idx;
+  if (idx != TEMPLATE_TYPE_IDX (t))
+   {
+ gcc_checking_assert (TEMPLATE_TYPE_IDX (t) == 0);
+ gcc_checking_assert (TYPE_CANONICAL (t) != t);
+ TEMPLATE_TYPE_IDX (t) = idx;
+ TYPE_CANONICAL (t) = canonical_type_parameter (t);
+   }
  }
  
/* Always keep walking.  */

diff --git a/gcc/testsuite/g++.dg/concepts/auto5.C b/gcc/testsuite/g++.dg/concepts/auto5.C
new file mode 100644
index 000..f1d653efd87
--- /dev/null
+++ b/gcc/testsuite/g++.dg/concepts/auto5.C
@@ -0,0 +1,9 @@
+// PR c++/101886
+// { dg-do compile { target c++17_only } }
+// { dg-options "-fconcepts-ts" }
+
+template struct A { };
+
+A a;
+A b1 = a;
+A b2 = a;




Re: [PATCH] c++: NTTP object wrapper substitution fixes [PR103346, ...]

2022-12-19 Thread Patrick Palka via Gcc-patches
On Mon, 19 Dec 2022, Jason Merrill wrote:

> On 12/6/22 13:35, Patrick Palka wrote:
> > This patch fixes some issues with substitution into a C++20 template
> > parameter object wrapper:
> > 
> > * The first testcase demonstrates a situation where the same_type_p
> >assert in relevant case of tsubst_copy doesn't hold, because (partial)
> >substitution of {int,} into the VIEW_CONVERT_EXPR wrapper yields
> > >A<int> but substitution into the underlying TEMPLATE_PARM_INDEX is a
> > >nop and yields A<T> due to tsubst's level == 1 early exit test.
> 
> We exit in that case because continuing would reduce the level to an
> impossible 0.  Why doesn't the preceding code find a binding for T?

Whoops, I misspoke.  The problem is that there's no binding for V since
only T=int was explicitly specified, so when substituting into the
TEMPLATE_PARM_INDEX for V in 'void f(B)', we hit that early exit and
never get a chance to substitute T=int into the tpi's TREE_TYPE (which
would yield A<int> as desired).  So the TREE_TYPE of the wrapped tpi
remains A<T> whereas the substituted TREE_TYPE of the wrapper is A<int>,
a mismatch.

> 
> >So
> >this patch just gets rid of the assert; the type mismatch doesn't
> >seem to be a problem in practice, I suppose because the coercion is
> >from one dependent type to another.
> > 
> > * In the second testcase, dependent substitution into the underlying
> >TEMPLATE_PARM_INDEX yields a CALL_EXPR with empty TREE_TYPE, which
> >tsubst_copy doesn't expect.  This patch fixes this by handling empty
> >TREE_TYPE the same way as a non-const TREE_TYPE.  Moreover, after
> >this substitution we're left with a VIEW_CONVERT_EXPR wrapping a
> >CALL_EXPR instead of a TEMPLATE_PARM_INDEX, which during the subsequent
> >non-dependent substitution tsubst_copy doesn't expect either.  So
> >this patch also relaxes tsubst_copy to accept such VIEW_CONVERT_EXPR
> >too.
> > 
> > * In the third testcase, we end up never resolving the call to
> >f.modify() since tsubst_copy doesn't do overload resolution.
> >This patch fixes this by moving the handling of these
> >VIEW_CONVERT_EXPR wrappers from tsubst_copy to tsubst_copy_and_build.
> >For good measure tsubst_copy_and_build should also handle
> >REF_PARENTHESIZED_P wrappers instead of delegating to tsubst_copy.
> > 
> > After this patch, VIEW_CONVERT_EXPR substitution is ultimately just
> > moved from tsubst_copy to tsubst_copy_and_build and made more
> > permissive.
> > 
> > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> > trunk?
> > 
> > PR c++/103346
> > PR c++/104278
> > PR c++/102553
> > 
> > gcc/cp/ChangeLog:
> > 
> > * pt.cc (tsubst_copy) : In the handling
> > of C++20 template parameter object wrappers: Remove same_type_p
> > assert.  Accept non-TEMPLATE_PARM_INDEX inner operand.  Handle
> > empty TREE_TYPE on substituted inner operand.  Move it to ...
> > (tsubst_copy_and_build): ... here.  Also handle REF_PARENTHESIZED_P
> > VIEW_CONVERT_EXPRs.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * g++.dg/cpp2a/nontype-class52a.C: New test.
> > * g++.dg/cpp2a/nontype-class53.C: New test.
> > * g++.dg/cpp2a/nontype-class54.C: New test.
> > * g++.dg/cpp2a/nontype-class55.C: New test.
> > ---
> >   gcc/cp/pt.cc  | 73 ++-
> >   gcc/testsuite/g++.dg/cpp2a/nontype-class52a.C | 15 
> >   gcc/testsuite/g++.dg/cpp2a/nontype-class53.C  | 25 +++
> >   gcc/testsuite/g++.dg/cpp2a/nontype-class54.C  | 23 ++
> >   gcc/testsuite/g++.dg/cpp2a/nontype-class55.C  | 15 
> >   5 files changed, 116 insertions(+), 35 deletions(-)
> >   create mode 100644 gcc/testsuite/g++.dg/cpp2a/nontype-class52a.C
> >   create mode 100644 gcc/testsuite/g++.dg/cpp2a/nontype-class53.C
> >   create mode 100644 gcc/testsuite/g++.dg/cpp2a/nontype-class54.C
> >   create mode 100644 gcc/testsuite/g++.dg/cpp2a/nontype-class55.C
> > 
> > diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
> > index 2d8e4fdd4b5..0a196f069ad 100644
> > --- a/gcc/cp/pt.cc
> > +++ b/gcc/cp/pt.cc
> > @@ -17271,42 +17271,16 @@ tsubst_copy (tree t, tree args, tsubst_flags_t complain, tree in_decl)
> >   return maybe_wrap_with_location (op0, EXPR_LOCATION (t));
> > }
> >   tree op = TREE_OPERAND (t, 0);
> > - if (code == VIEW_CONVERT_EXPR
> > - && TREE_CODE (op) == TEMPLATE_PARM_INDEX)
> > -   {
> > - /* Wrapper to make a C++20 template parameter object const.  */
> > - op = tsubst_copy (op, args, complain, in_decl);
> > - if (!CP_TYPE_CONST_P (TREE_TYPE (op)))
> > -   {
> > - /* The template argument is not const, presumably because
> > -it is still dependent, and so not the const template parm
> > -object.  */
> > - tree type = tsubst (TREE_TYPE (t), args, complain, in_decl);
> > - gcc_checking_assert
> > (

Re: [PATCH v2 05/11] riscv: thead: Add support for the XTheadBa ISA extension

2022-12-19 Thread Philipp Tomsich
On Mon, 19 Dec 2022 at 05:20, Kito Cheng  wrote:
>
> LGTM with a nit:
>
> ...
> > +  "TARGET_XTHEADBA
> > +   && (INTVAL (operands[2]) >= 0) && (INTVAL (operands[2]) <= 3)"
>
> IN_RANGE(INTVAL(operands[2]), 0, 3)
>
> and I am a little bit surprised it can be zero

So was I, when reading the specification — and I reconfirmed that bit
by checking with the folks at T-Head.

We discussed this internally before submitting: while this case should
never occur (as other pieces in the compiler are smart enough to
simplify the RTX), we decided to include the 0 as it is an accurate
reflection of the instruction semantics.

Philipp.

>
> > +  "th.addsl\t%0,%1,%3,%2"
> > +  [(set_attr "type" "bitmanip")
> > +   (set_attr "mode" "")])
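
For readers following the range discussion, here is a minimal sketch of the condition being reviewed. It assumes that th.addsl computes rs1 + (rs2 << imm2) with a 2-bit immediate; that reading is an assumption and is not taken from the patch. `in_range` is a hypothetical stand-in for GCC's IN_RANGE macro, and `th_addsl` is only a model of the instruction.

```cpp
#include <cstdint>

// Hypothetical stand-in for GCC's IN_RANGE (value, lower, upper) macro.
constexpr bool
in_range (std::int64_t value, std::int64_t lower, std::int64_t upper)
{
  return value >= lower && value <= upper;
}

// Assumed semantics of "th.addsl rd, rs1, rs2, imm2":
//   rd = rs1 + (rs2 << imm2), with imm2 in 0..3.
constexpr std::uint64_t
th_addsl (std::uint64_t rs1, std::uint64_t rs2, unsigned imm2)
{
  return rs1 + (rs2 << imm2);
}
```

Under this model a shift amount of 0 is a plain add, which is why the case is legal for the instruction even if the compiler would normally simplify it away before the pattern matches.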


Re: [PATCH] c++: NTTP object wrapper substitution fixes [PR103346, ...]

2022-12-19 Thread Jason Merrill via Gcc-patches

On 12/19/22 13:13, Patrick Palka wrote:

On Mon, 19 Dec 2022, Jason Merrill wrote:


On 12/6/22 13:35, Patrick Palka wrote:

This patch fixes some issues with substitution into a C++20 template
parameter object wrapper:

* The first testcase demonstrates a situation where the same_type_p
assert in the relevant case of tsubst_copy doesn't hold, because (partial)
substitution of {int,} into the VIEW_CONVERT_EXPR wrapper yields
A<int> but substitution into the underlying TEMPLATE_PARM_INDEX is a
nop and yields A<T> due to tsubst's level == 1 early exit test.


We exit in that case because continuing would reduce the level to an
impossible 0.  Why doesn't the preceding code find a binding for T?


Whoops, I misspoke.  The problem is that there's no binding for V since
only T=int was explicitly specified, so when substituting into the
TEMPLATE_PARM_INDEX for V in 'void f(B)', we hit that early exit and
never get a chance to substitute T=int into the tpi's TREE_TYPE (which
would yield A<int> as desired).  So the TREE_TYPE of the wrapped tpi
remains A<T> whereas the substituted TREE_TYPE of the wrapper is A<int>,
a mismatch.


Ah, makes sense.  The patch is OK.




So
this patch just gets rid of the assert; the type mismatch doesn't
seem to be a problem in practice, I suppose because the coercion is
from one dependent type to another.

* In the second testcase, dependent substitution into the underlying
TEMPLATE_PARM_INDEX yields a CALL_EXPR with empty TREE_TYPE, which
tsubst_copy doesn't expect.  This patch fixes this by handling empty
TREE_TYPE the same way as a non-const TREE_TYPE.  Moreover, after
this substitution we're left with a VIEW_CONVERT_EXPR wrapping a
CALL_EXPR instead of a TEMPLATE_PARM_INDEX, which during the subsequent
non-dependent substitution tsubst_copy doesn't expect either.  So
this patch also relaxes tsubst_copy to accept such VIEW_CONVERT_EXPR
too.

* In the third testcase, we end up never resolving the call to
f.modify() since tsubst_copy doesn't do overload resolution.
This patch fixes this by moving the handling of these
VIEW_CONVERT_EXPR wrappers from tsubst_copy to tsubst_copy_and_build.
For good measure tsubst_copy_and_build should also handle
REF_PARENTHESIZED_P wrappers instead of delegating to tsubst_copy.

After this patch, VIEW_CONVERT_EXPR substitution is ultimately just
moved from tsubst_copy to tsubst_copy_and_build and made more
permissive.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

PR c++/103346
PR c++/104278
PR c++/102553

gcc/cp/ChangeLog:

* pt.cc (tsubst_copy) <case VIEW_CONVERT_EXPR>: In the handling
of C++20 template parameter object wrappers: Remove same_type_p
assert.  Accept non-TEMPLATE_PARM_INDEX inner operand.  Handle
empty TREE_TYPE on substituted inner operand.  Move it to ...
(tsubst_copy_and_build): ... here.  Also handle REF_PARENTHESIZED_P
VIEW_CONVERT_EXPRs.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/nontype-class52a.C: New test.
* g++.dg/cpp2a/nontype-class53.C: New test.
* g++.dg/cpp2a/nontype-class54.C: New test.
* g++.dg/cpp2a/nontype-class55.C: New test.
---
   gcc/cp/pt.cc  | 73 ++-
   gcc/testsuite/g++.dg/cpp2a/nontype-class52a.C | 15 
   gcc/testsuite/g++.dg/cpp2a/nontype-class53.C  | 25 +++
   gcc/testsuite/g++.dg/cpp2a/nontype-class54.C  | 23 ++
   gcc/testsuite/g++.dg/cpp2a/nontype-class55.C  | 15 
   5 files changed, 116 insertions(+), 35 deletions(-)
   create mode 100644 gcc/testsuite/g++.dg/cpp2a/nontype-class52a.C
   create mode 100644 gcc/testsuite/g++.dg/cpp2a/nontype-class53.C
   create mode 100644 gcc/testsuite/g++.dg/cpp2a/nontype-class54.C
   create mode 100644 gcc/testsuite/g++.dg/cpp2a/nontype-class55.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 2d8e4fdd4b5..0a196f069ad 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -17271,42 +17271,16 @@ tsubst_copy (tree t, tree args, tsubst_flags_t
complain, tree in_decl)
  return maybe_wrap_with_location (op0, EXPR_LOCATION (t));
}
  tree op = TREE_OPERAND (t, 0);
- if (code == VIEW_CONVERT_EXPR
- && TREE_CODE (op) == TEMPLATE_PARM_INDEX)
-   {
- /* Wrapper to make a C++20 template parameter object const.  */
- op = tsubst_copy (op, args, complain, in_decl);
- if (!CP_TYPE_CONST_P (TREE_TYPE (op)))
-   {
- /* The template argument is not const, presumably because
-it is still dependent, and so not the const template parm
-object.  */
- tree type = tsubst (TREE_TYPE (t), args, complain, in_decl);
- gcc_checking_assert
(same_type_ignoring_top_level_qualifiers_p
-  (type, TREE_TYPE (op)));
- if (TREE_CODE (op)

[PATCH 1/2] Add 'gcc.target/nvptx/softstack-decl-1.c', 'gcc.target/nvptx/uniform-simt-decl-1.c'

2022-12-19 Thread Thomas Schwinge
... to document the status quo re implicit (via 'need_softstack_decl',
'need_unisimt_decl') and explicit declarations of '__nvptx_stacks',
'__nvptx_uni'.

gcc/testsuite/
* gcc.target/nvptx/softstack-decl-1.c: New.
* gcc.target/nvptx/uniform-simt-decl-1.c: Likewise.
---
 .../gcc.target/nvptx/softstack-decl-1.c   | 20 +
 .../gcc.target/nvptx/uniform-simt-decl-1.c| 29 +++
 2 files changed, 49 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/nvptx/softstack-decl-1.c
 create mode 100644 gcc/testsuite/gcc.target/nvptx/uniform-simt-decl-1.c

diff --git a/gcc/testsuite/gcc.target/nvptx/softstack-decl-1.c b/gcc/testsuite/gcc.target/nvptx/softstack-decl-1.c
new file mode 100644
index 000..c502eacc1b3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nvptx/softstack-decl-1.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options {-save-temps -O0 -msoft-stack} } */
+
+extern void *__nvptx_stacks[32] __attribute__((shared,nocommon));
+
+void *f()
+{
+  /* Implicit '__nvptx_stacks' usage for frame; per 'init_softstack_frame':
+ { dg-final { scan-assembler-times {mov\.u64 %fstmp2, __nvptx_stacks;} 1 } }
+  */
+  void *stack_array[123];
+  /* Explicit '__nvptx_stacks' usage.  */
+  stack_array[5] = __nvptx_stacks[0];
+  return stack_array[5];
+}
+
+/* The implicit (via 'need_softstack_decl') and explicit declarations of
+   '__nvptx_stacks' are both emitted:
+   { dg-final { scan-assembler-times {(?n)\.extern .* __nvptx_stacks\[32\];} 2 } }
+*/
diff --git a/gcc/testsuite/gcc.target/nvptx/uniform-simt-decl-1.c b/gcc/testsuite/gcc.target/nvptx/uniform-simt-decl-1.c
new file mode 100644
index 000..486456ab243
--- /dev/null
+++ b/gcc/testsuite/gcc.target/nvptx/uniform-simt-decl-1.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options {-save-temps -O0 -muniform-simt} } */
+
+extern unsigned __nvptx_uni[32] __attribute__((shared,nocommon));
+
+enum memmodel
+{
+  MEMMODEL_RELAXED = 0,
+};
+
+int a = 0;
+
+int f (void)
+{
+  /* Explicit '__nvptx_uni' usage.  */
+  __builtin_printf("%u\n", __nvptx_uni[0]);
+
+  /* Implicit '__nvptx_uni' usage; per 'nvptx_init_unisimt_predicate':
+ { dg-final { scan-assembler-times {mov\.u64 %r[0-9]+, __nvptx_uni;} 1 } }
+  */
+  int expected = 1;
+  return __atomic_compare_exchange_n (&a, &expected, 0, 0, MEMMODEL_RELAXED,
+ MEMMODEL_RELAXED);
+}
+
+/* The implicit (via 'need_unisimt_decl') and explicit declarations of
+   '__nvptx_uni' are both emitted:
+   { dg-final { scan-assembler-times {(?n)\.extern .* __nvptx_uni\[32\];} 2 } }
+*/
--
2.25.1

-
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634
Munich; limited liability company (Gesellschaft mit beschränkter Haftung);
Managing Directors: Thomas Heurung, Frank Thürauf; Registered office: Munich;
Register court: Munich, HRB 106955


[PATCH 2/2] nvptx: Prevent emitting duplicate declarations for '__nvptx_stacks', '__nvptx_uni'

2022-12-19 Thread Thomas Schwinge
As I have reported to Nvidia in 2022-12-01 'NVIDIA Incident Report (3891704):
ptxas: Duplicate declaration error: "cannot be resolved by a '.static'"',
'ptxas' has an inscrutable error mode for duplicate declarations:

ptxas softstack-decl-1.o, line 11; error   : '.extern' variable '__nvptx_stacks' cannot be resolved by a '.static'
ptxas fatal   : Ptx assembly aborted due to errors
nvptx-as: ptxas returned 255 exit status

ptxas uniform-simt-decl-1.o, line 12; error   : '.extern' variable '__nvptx_uni' cannot be resolved by a '.static'
ptxas fatal   : Ptx assembly aborted due to errors
nvptx-as: ptxas returned 255 exit status

This is inscrutable, because (a) what is "cannot be resolved by a '.static'"
supposed to tell me (there is no '.static' in PTX?), and (b) why aren't
repeated declarations just verified to match the first, but otherwise a no-op
(like in other programming languages)?

gcc/
* config/nvptx/nvptx.cc (nvptx_assemble_undefined_decl): Notice
'__nvptx_stacks', '__nvptx_uni' declarations.
(nvptx_file_end): Don't emit duplicate declarations for those.
gcc/testsuite/
* gcc.target/nvptx/softstack-decl-1.c: Make 'dg-do assemble',
adjust.
* gcc.target/nvptx/uniform-simt-decl-1.c: Likewise.
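
The need_*/have_* flag pairing described in the ChangeLog can be modelled in a few lines. This is only a sketch of the decision logic, not the nvptx.cc code: `decl_state`, `emit_implicit_decl_p` and `extern_count` are hypothetical names introduced for illustration.

```cpp
// Hypothetical model of one need_*/have_* flag pair from the patch.
struct decl_state
{
  bool need;  // some function references the symbol implicitly
  bool have;  // an explicit declaration has already been assembled
};

// The implicit '.extern' is written at file end only if nothing has
// declared the symbol yet.
constexpr bool
emit_implicit_decl_p (decl_state s)
{
  return s.need && !s.have;
}

// Number of '.extern' lines that end up in the output for one symbol.
constexpr int
extern_count (decl_state s)
{
  return (s.have ? 1 : 0) + (emit_implicit_decl_p (s) ? 1 : 0);
}
```

With both flags set the count is 1, which matches the adjusted scan-assembler-times expectation in the testcases.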
---
 gcc/config/nvptx/nvptx.cc  | 14 --
 gcc/testsuite/gcc.target/nvptx/softstack-decl-1.c  |  8 
 .../gcc.target/nvptx/uniform-simt-decl-1.c |  8 
 3 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/gcc/config/nvptx/nvptx.cc b/gcc/config/nvptx/nvptx.cc
index 8e49dd9c647..b93a253ab31 100644
--- a/gcc/config/nvptx/nvptx.cc
+++ b/gcc/config/nvptx/nvptx.cc
@@ -180,9 +180,11 @@ static GTY(()) tree global_lock_var;

 /* True if any function references __nvptx_stacks.  */
 static bool need_softstack_decl;
+static bool have_softstack_decl;

 /* True if any function references __nvptx_uni.  */
 static bool need_unisimt_decl;
+static bool have_unisimt_decl;

 static int nvptx_mach_max_workers ();

@@ -2571,6 +2573,13 @@ nvptx_assemble_undefined_decl (FILE *file, const char *name, const_tree decl)
 TREE_TYPE (decl), size ? tree_to_shwi (size) : 0,
 DECL_ALIGN (decl), true);
   nvptx_assemble_decl_end ();
+
+  static tree softstack_id = get_identifier ("__nvptx_stacks");
+  static tree unisimt_id = get_identifier ("__nvptx_uni");
+  if (DECL_NAME (decl) == softstack_id)
+have_softstack_decl = true;
+  else if (DECL_NAME (decl) == unisimt_id)
+have_unisimt_decl = true;
 }

 /* Output a pattern for a move instruction.  */
@@ -6002,7 +6011,7 @@ nvptx_file_end (void)
 write_shared_buffer (asm_out_file, gang_private_shared_sym,
 gang_private_shared_align, gang_private_shared_size);

-  if (need_softstack_decl)
+  if (need_softstack_decl && !have_softstack_decl)
 {
   write_var_marker (asm_out_file, false, true, "__nvptx_stacks");
   /* 32 is the maximum number of warps in a block.  Even though it's an
@@ -6011,7 +6020,8 @@ nvptx_file_end (void)
   fprintf (asm_out_file, ".extern .shared .u%d __nvptx_stacks[32];\n",
   POINTER_SIZE);
 }
-  if (need_unisimt_decl)
+
+  if (need_unisimt_decl && !have_unisimt_decl)
 {
   write_var_marker (asm_out_file, false, true, "__nvptx_uni");
   fprintf (asm_out_file, ".extern .shared .u32 __nvptx_uni[32];\n");
diff --git a/gcc/testsuite/gcc.target/nvptx/softstack-decl-1.c b/gcc/testsuite/gcc.target/nvptx/softstack-decl-1.c
index c502eacc1b3..2415f6adb1f 100644
--- a/gcc/testsuite/gcc.target/nvptx/softstack-decl-1.c
+++ b/gcc/testsuite/gcc.target/nvptx/softstack-decl-1.c
@@ -1,4 +1,4 @@
-/* { dg-do compile } */
+/* { dg-do assemble } */
 /* { dg-options {-save-temps -O0 -msoft-stack} } */

 extern void *__nvptx_stacks[32] __attribute__((shared,nocommon));
@@ -14,7 +14,7 @@ void *f()
   return stack_array[5];
 }

-/* The implicit (via 'need_softstack_decl') and explicit declarations of
-   '__nvptx_stacks' are both emitted:
-   { dg-final { scan-assembler-times {(?n)\.extern .* __nvptx_stacks\[32\];} 2 } }
+/* Of the implicit (via 'need_softstack_decl') and explicit declarations of
+   '__nvptx_stacks', only one is emitted:
+   { dg-final { scan-assembler-times {(?n)\.extern .* __nvptx_stacks\[32\];} 1 } }
 */
diff --git a/gcc/testsuite/gcc.target/nvptx/uniform-simt-decl-1.c b/gcc/testsuite/gcc.target/nvptx/uniform-simt-decl-1.c
index 486456ab243..5a975bdb269 100644
--- a/gcc/testsuite/gcc.target/nvptx/uniform-simt-decl-1.c
+++ b/gcc/testsuite/gcc.target/nvptx/uniform-simt-decl-1.c
@@ -1,4 +1,4 @@
-/* { dg-do compile } */
+/* { dg-do assemble } */
 /* { dg-options {-save-temps -O0 -muniform-simt} } */

 extern unsigned __nvptx_uni[32] __attribute__((shared,nocommon));
@@ -23,7 +23,7 @@ int f (void)
  MEMMODEL_RELAXED);
 }

-/* The implicit (via 'need_unisimt_dec

Re: [PATCH RFA] build: add -Wconditionally-supported to strict_warn [PR64867]

2022-12-19 Thread Jason Merrill via Gcc-patches

On 12/16/22 19:52, Jeff Law wrote:



On 12/6/22 06:26, Jason Merrill via Gcc-patches wrote:

Tested x86_64-pc-linux-gnu, OK for trunk?

-- 8< --

The PR (which isn't resolved by this commit) pointed out to me that GCC
should build with -Wconditionally-supported to support bootstrapping with a
C++11 compiler that makes different choices.

PR c++/64867

gcc/ChangeLog:

* configure.ac (strict_warn): Add -Wconditionally-supported.
* configure: Regenerate.

OK.  I wonder if it'll trip anything, particularly in the target files.


Also applying this to fix a breakage reported on IRC:

From eef0873b6906d5404a04b817377c33f585cf3f21 Mon Sep 17 00:00:00 2001
From: Jason Merrill 
Date: Mon, 19 Dec 2022 15:41:36 -0500
Subject: [PATCH] build: avoid -Wconditionally-supported on qsort check
To: gcc-patches@gcc.gnu.org

It's OK to rely on conditionally-supported features in #if CHECKING_P, since
that isn't defined in stage 1.

gcc/ChangeLog:

	* sort.cc: Disable -Wconditionally-supported in
	CHECKING_P code.
---
 gcc/sort.cc | 5 +
 1 file changed, 5 insertions(+)

diff --git a/gcc/sort.cc b/gcc/sort.cc
index 87f826818bb..eeddfcf1fef 100644
--- a/gcc/sort.cc
+++ b/gcc/sort.cc
@@ -237,6 +237,10 @@ do {\
 }
 
 #if CHECKING_P
+  /* Don't complain about cast from void* to function pointer.  */
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wconditionally-supported"
+
 /* Adapter for using two-argument comparators in functions expecting the
three-argument sort_r_cmp_fn type.  */
 static int
@@ -266,6 +270,7 @@ gcc_qsort (void *vbase, size_t n, size_t size, cmp_fn *cmp)
 free (buf);
 #if CHECKING_P
   qsort_chk (vbase, n, size, cmp2to3, (void*)cmp);
+#pragma GCC diagnostic pop
 #endif
 }
 
-- 
2.31.1



Modula-2 / Rust: Many targets failing

2022-12-19 Thread Jan-Benedict Glaw
Hi!

With the recent merges for Modula-2 and Rust, I see a good number of
targets failing with --enable-languages=all, mostly due to issues with
the Modula-2 driver.


Modula-2 related issues
=======================


--target=x86_64-apple-darwin
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/bin/bash ../../gcc/gcc/m2/tools-src/makeSystem -fpim \
 ../../gcc/gcc/m2/gm2-libs/SYSTEM.def \
 ../../gcc/gcc/m2/gm2-libs/SYSTEM.mod \
 -I../../gcc/gcc/m2/gm2-libs \
 
"/var/lib/laminar/run/gcc-x86_64-apple-darwin/19/toolchain-build/./gcc/gm2 
-B/var/lib/laminar/run/gcc-x86_64-apple-darwin/19/toolchain-build/./gcc/ " 
/var/lib/laminar/run/gcc-x86_64-apple-darwin/19/toolchain-build/gcc/m2/gm2-libs/SYSTEM.def

/var/lib/laminar/run/gcc-x86_64-apple-darwin/19/toolchain-build/./gcc/as: 114: 
exec: -arch: not found
SYSTEM module creates type: LOC
SYSTEM module creates type: WORD
SYSTEM module creates type: BYTE
SYSTEM module creates type: ADDRESS
SYSTEM module creates type: INTEGER8
SYSTEM module creates type: INTEGER16
SYSTEM module creates type: INTEGER32
SYSTEM module creates type: INTEGER64
SYSTEM module creates type: CARDINAL8
SYSTEM module creates type: CARDINAL16
SYSTEM module creates type: CARDINAL32
SYSTEM module creates type: CARDINAL64
SYSTEM module creates type: WORD16
SYSTEM module creates type: WORD32
SYSTEM module creates type: WORD64
SYSTEM module creates type: BITSET8
SYSTEM module creates type: BITSET16
SYSTEM module creates type: BITSET32
SYSTEM module creates type: REAL32
SYSTEM module creates type: REAL64
SYSTEM module creates type: REAL128
SYSTEM module creates type: COMPLEX32
SYSTEM module creates type: COMPLEX64
SYSTEM module creates type: COMPLEX128
SYSTEM module creates type: CSIZE_T
SYSTEM module creates type: CSSIZE_T

/var/lib/laminar/run/gcc-x86_64-apple-darwin/19/toolchain-build/./gcc/as: 114: 
exec: -arch: not found
make[1]: *** [../../gcc/gcc/m2/Make-lang.in:1524: 
/var/lib/laminar/run/gcc-x86_64-apple-darwin/19/toolchain-build/gcc/m2/gm2-libs/SYSTEM.def]
 Error 1
rm m2/gm2-compiler-boot/P2Build.mod 
m2/gm2-compiler-boot/P0SyntaxCheck.mod m2/gm2-compiler-boot/PCBuild.mod 
m2/gm2-compiler-boot/PHBuild.mod m2/gm2-compiler-boot/P1Build.mod 
m2/gm2-compiler-boot/P3Build.mod
make[1]: Leaving directory 
'/var/lib/laminar/run/gcc-x86_64-apple-darwin/19/toolchain-build/gcc'
make: *** [Makefile:4623: all-gcc] Error 2


--target=sparc64-sun-solaris2.11 --with-gnu-ld --with-gnu-as --enable-threads=posix
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Similar to x86_64-apple-darwin, but:

/var/lib/laminar/run/gcc-sparc64-sun-solaris2.11OPT-with-gnu-ldOPT-with-gnu-asOPT-enable-threads=posix/19/toolchain-install/sparc64-sun-solaris2.11/bin/as:
 unrecognized option '-m64'

--target=sparc-sun-solaris2.11
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Similar to x86_64-apple-darwin, but:

/var/lib/laminar/run/gcc-sparc-sun-solaris2.11/16/toolchain-install/sparc-sun-solaris2.11/bin/as:
 unrecognized option '-m32'

--target=powerpc64-darwin
--target=powerpc-darwin8
--target=powerpc-darwin7
~~~~~~~~~~~~~~~~~~~~~~~~~
Similar to x86_64-apple-darwin

--target=powerpc-lynxos
~~~~~~~~~~~~~~~~~~~~~~~
Same place, but
/var/lib/laminar/run/gcc-powerpc-lynxos/20/toolchain-build/./gcc/as: 
114: exec: -I: not found

--target=mipsisa64sr71k-elf
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Similar to the others:
/bin/bash ../../gcc/gcc/m2/tools-src/makeSystem -fpim \
 ../../gcc/gcc/m2/gm2-libs/SYSTEM.def \
 ../../gcc/gcc/m2/gm2-libs/SYSTEM.mod \
 -I../../gcc/gcc/m2/gm2-libs \
 
"/var/lib/laminar/run/gcc-mipsisa64sr71k-elf/23/toolchain-build/./gcc/gm2 
-B/var/lib/laminar/run/gcc-mipsisa64sr71k-elf/23/toolchain-build/./gcc/ " 
/var/lib/laminar/run/gcc-mipsisa64sr71k-elf/23/toolchain-build/gcc/m2/gm2-libs/SYSTEM.def
Assembler messages:
Error: bad value (sr71k) for default CPU
Internal error in mips_after_parse_args at config/tc-mips.c:15290.
Please report this bug.

--target=m32rle-elf
~~~~~~~~~~~~~~~~~~~
/bin/bash ../../gcc/gcc/m2/tools-src/makeSystem -fpim \
 ../../gcc/gcc/m2/gm2-libs/SYSTEM.def \
 ../../gcc/gcc/m2/gm2-libs/SYSTEM.mod \
 -I../../gcc/gcc/m2/gm2-libs \
 
"/var/lib/laminar/run/gcc-m32rle-elf/21/toolchain-build/./gcc/gm2 
-B/var/lib/laminar/run/gcc-m32rle-elf/21/toolchain-build/./gcc/ " 
/var/lib/laminar/run/gcc-m32rle-elf/21/toolchain-build/gcc/m2/gm2-libs/SYSTEM.def
/var/lib/laminar/run/gcc-m32rle-elf/21/toolchain-build/./gcc

Re: [PATCH] fold-const: Treat fp conversion to a type with same mode as copy

2022-12-19 Thread Segher Boessenkool
Hi!

On Mon, Dec 19, 2022 at 11:02:30AM +0100, Jakub Jelinek wrote:
> On Mon, Dec 19, 2022 at 09:49:36AM +0100, Richard Biener wrote:
> > On Mon, Dec 19, 2022 at 9:12 AM Kewen.Lin  wrote:
> > > In function fold_convert_const_real_from_real, when the modes of
> > > two types involved in fp conversion are the same, we can simply
> > > take it as copy, rebuild with the exactly same TREE_REAL_CST and
> > > the target type.  It is more efficient and helps to avoid possible
> > > unexpected signalling bit clearing in [1].
> > >
> > > Bootstrapped and regtested on x86_64-redhat-linux, aarch64-linux-gnu
> > > and powerpc64{,le}-linux-gnu.
> > >
> > > Is it ok for trunk?
> > 
> > But shouldn't
> > 
> > double x = (double) __builtin_nans("sNAN");
> > 
> > result in a quiet NaN?

No, it is no conversion, it is a no-op?

With -fno-signaling-nans, anything goes.

With -fsignaling-nans, only corge should return a qNaN *and trap*, and
it does that for me, too (it does an frsp insn on the datum, and that
has that effect).

With signalling NaNs your program should always trap when it converts a
SNaN to a QNaN.  Not all operations on SNaNs can be optimised away; any
SNaN that is operated on should remain trapping.  With -fsignaling-nans
very many floating point operations can not be optimised very much.
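
To make the copy versus convertFormat distinction concrete, here is a bit-level sketch. It assumes IEEE 754 binary64 with the common convention that the most significant fraction bit is the "quiet" bit (as on x86, AArch64 and Power); all names are hypothetical, and the trap (the "invalid" exception) is deliberately not modelled.

```cpp
#include <cstdint>

// Assumption: IEEE 754 binary64, quiet bit = most significant fraction bit.
constexpr std::uint64_t exp_mask  = 0x7ff0000000000000ULL;
constexpr std::uint64_t frac_mask = 0x000fffffffffffffULL;
constexpr std::uint64_t quiet_bit = 0x0008000000000000ULL;

// NaN: exponent all ones and a non-zero fraction.
constexpr bool
is_nan_bits (std::uint64_t b)
{
  return (b & exp_mask) == exp_mask && (b & frac_mask) != 0;
}

// Signalling NaN: a NaN whose quiet bit is clear.
constexpr bool
is_snan_bits (std::uint64_t b)
{
  return is_nan_bits (b) && (b & quiet_bit) == 0;
}

// What a convertFormat-style operation does to the encoding of an sNaN:
// it sets the quiet bit.  Raising "invalid" is not modelled here.
constexpr std::uint64_t
convert_format_bits (std::uint64_t b)
{
  return is_snan_bits (b) ? (b | quiet_bit) : b;
}

// A pure copy leaves the encoding untouched, so an sNaN stays an sNaN.
constexpr std::uint64_t
copy_bits (std::uint64_t b)
{
  return b;
}
```

In this model the lfd-only functions behave like copy_bits, while the frsp in corge behaves like convert_format_bits plus the trap.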

> GCC right now returns a sNaN in foo and bar and qNaN in baz, qux and corge,
> clang and ICC (tried 19.0.1 on godbolt) return sNaN in foo, bar, baz, qux
> and qNaN in corge.

I just get an lfd everywhere (so returns SNaN, doesn't trap) except in
corge(), which also does an frsp (round to SP float, so it returns a
QNaN, but traps first).

> As for the rest, C n3047.pdf has:
> Whether C assignment (6.5.16) (and conversion as if by assignment) to the
> same format is an IEC 60559 convertFormat or copy operation is
> implementation-defined, even if <fenv.h> defines the macro
> FE_SNANS_ALWAYS_SIGNAL (F.2.1).

And GCC just does a copy, already?  At -O2 anyway.

> I think the posted patch is good for consistency, treating conversion to the
> same format sometimes as convertFormat and sometimes as copy is maybe valid
> but confusing, especially when on:

Agreed.


Segher


Re: Re: [PATCH] RISC-V: Support VSETVL PASS for RVV support

2022-12-19 Thread 钟居哲
>> ISTM that if you want to run before sched2, then
>> you'd need to introduce dependencies between the vsetvl instructions and
>> the vector instructions that utilize those settings?

Yes, I want to run before sched2 so that the vsetvl instructions still get a
chance to be scheduled by sched2. I have already introduced dependencies in
the vector instructions, so this should not cause any issues.

>> Formatting note.  For a multi-line conditional, go ahead and use an open
>> paren and the usual indentation style.

>>return (INSN_CODE (rinsn) == CODE_FOR_vsetvldi
>>|| INSN_CODE (rinsn) == CODE_FOR_vsetvlsi);
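
As a standalone illustration of the formatting being requested, here is a minimal sketch. The enum and function are hypothetical stand-ins (the real CODE_FOR_* values come from generated headers); only the layout of the return statement matters.

```cpp
// Hypothetical insn codes standing in for CODE_FOR_vsetvldi and
// CODE_FOR_vsetvlsi.
enum insn_code_sketch
{
  CODE_FOR_other_sketch,
  CODE_FOR_vsetvldi_sketch,
  CODE_FOR_vsetvlsi_sketch
};

// Multi-line condition in GNU style: the whole expression is wrapped in
// parentheses, and the continuation line starts with the operator,
// aligned just inside the opening parenthesis.
constexpr bool
vsetvl_code_p (insn_code_sketch code)
{
  return (code == CODE_FOR_vsetvldi_sketch
          || code == CODE_FOR_vsetvlsi_sketch);
}
```

The extra parentheses do not change the meaning; they exist so that editors indent the continuation line consistently.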

>>  There's other examples in the new file.

>> s/shoule/should/
>> s/propagete/propagate/
>> s/optimzal/optimal/
>> s/PASSes/passes/
>> s/intrinsiscs/intrinsics/
>> s/instrinsics/intrinsics/
>> s/acrocss/across/

Addressed the comments.

>> It'd probably be better to move this into rtl.cc with a prototype in
>> rtl.h rather than have duplicate definitions in gcse.c and the RISC-V
>> backend.  I'm not even entirely sure why we really need it here.
Maybe we can do that once GCC 14 development opens?

>> These need function comments.  What isn't clear to me is why we don't
>> just call validate_change?  Is it just so we get the dump info?
Yes, since it's called more than once and I want to write the details to the
dump file. That dump information is important for debugging.


juzhe.zh...@rivai.ai
 
From: Jeff Law
Date: 2022-12-19 23:44
To: juzhe.zhong; gcc-patches
CC: kito.cheng; palmer
Subject: Re: [PATCH] RISC-V: Support VSETVL PASS for RVV support
I believe Kito already approved.  There's nothing here that is critical, 
just minor cleanups and I'm fine with them being cleaned up as a 
follow-up patch given Kito has already approved this patch.
 
On 12/14/22 00:13, juzhe.zh...@rivai.ai wrote:
> From: Ju-Zhe Zhong 
> 
> This patch adds the VSETVL pass for RVV support.
> 1. Optimization and performance are guaranteed by LCM (lazy code motion).
> 2. It is based on the RTL_SSA framework to gain better optimization chances.
> 3. It also does backward propagation of VL/VTYPE demand information across
>    blocks, following the RTL_SSA reverse order in the CFG.
> 4. It has been tested with 200+ testcases for the VLMAX AVL situation
>    (only VLMAX, since there is no intrinsic yet to test non-VLMAX).
> 5. The AVL model will be supported in the next patch.
> 
> gcc/ChangeLog:
> 
>  * config.gcc: Add riscv-vsetvl.o.
>  * config/riscv/riscv-passes.def (INSERT_PASS_BEFORE): Add VSETVL 
> PASS location.
>  * config/riscv/riscv-protos.h (make_pass_vsetvl): New function.
>  (enum avl_type): New enum.
>  (get_ta): New function.
>  (get_ma): Ditto.
>  (get_avl_type): Ditto.
>  (calculate_ratio): Ditto.
>  (enum tail_policy): New enum.
>  (enum mask_policy): Ditto.
>  * config/riscv/riscv-v.cc (calculate_ratio): New function.
>  (emit_pred_op): Change the VLMAX mov codegen.
>  (get_ta): New function.
>  (get_ma): Ditto.
>  (enum tail_policy): Change enum.
>  (get_prefer_tail_policy): New function.
>  (enum mask_policy): Change enum.
>  (get_prefer_mask_policy): New function.
>  * config/riscv/t-riscv: Add riscv-vsetvl.o
>  * config/riscv/vector.md (): Adjust attribute and pattern for VSETVL 
> PASS.
>  (@vlmax_avl): Ditto.
>  (@vsetvl_no_side_effects): Delete.
>  (vsetvl_vtype_change_only): New MD pattern.
>  (@vsetvl_discard_result): Ditto.
>  * config/riscv/riscv-vsetvl.cc: New file.
>  * config/riscv/riscv-vsetvl.h: New file.
So a high level note.  Once you've inserted your vsetvl instructions, you 
can't have further code motion, correct?  So doesn't this potentially 
have a poor interaction with something like speculative code motion as 
performed by sched?   ISTM that if you want to run before sched2, then 
you'd need to introduce dependencies between the vsetvl instructions and 
the vector instructions that utilize those settings?
 
I can envision wanting to schedule the vsetvl instructions so that they 
bubble up slightly from their insertion points to avoid stalls or allow 
the vector units to start executing earlier.  Is that what's driving the 
current pass placement?  If not would it make more sense to use the 
late prologue/epilogue hooks that Richard Sandiford posted recently (I'm 
not sure they're committed yet).
 
 
 
 
 
 
> +
> +static bool
> +loop_basic_block_p (const basic_block cfg_bb)
> +{
> +  return JUMP_P (BB_END (cfg_bb)) && any_condjump_p (BB_END (cfg_bb));
> +}
The name seems poor here -- AFAICT this has nothing to do with loops. 
It's just a test that the end of a block is a conditional jump.  I'm 
pretty sure we could extract BB_END (cfg_bb) and use an existing routine 
instead of writing our own.  I'd suggest peeking at jump.cc to see if 
there's something already suitable.
 
  +
> +/* Return true if it is vsetvldi or vsetvlsi

[PATCH] RISC-V: Fix muti-line condition format

2022-12-19 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (vlmax_avl_insn_p): Fix multi-line 
conditional.
(vsetvl_insn_p): Ditto.
(same_bb_and_before_p): Ditto.
(same_bb_and_after_or_equal_p): Ditto.

---
 gcc/config/riscv/riscv-vsetvl.cc | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 3ca3fc15e5a..0c2ff630e96 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -112,8 +112,8 @@ vlmax_avl_p (rtx x)
 static bool
 vlmax_avl_insn_p (rtx_insn *rinsn)
 {
-  return INSN_CODE (rinsn) == CODE_FOR_vlmax_avlsi
-|| INSN_CODE (rinsn) == CODE_FOR_vlmax_avldi;
+  return (INSN_CODE (rinsn) == CODE_FOR_vlmax_avlsi
+ || INSN_CODE (rinsn) == CODE_FOR_vlmax_avldi);
 }
 
 static bool
@@ -156,24 +156,24 @@ vector_config_insn_p (rtx_insn *rinsn)
 static bool
 vsetvl_insn_p (rtx_insn *rinsn)
 {
-  return INSN_CODE (rinsn) == CODE_FOR_vsetvldi
-|| INSN_CODE (rinsn) == CODE_FOR_vsetvlsi;
+  return (INSN_CODE (rinsn) == CODE_FOR_vsetvldi
+|| INSN_CODE (rinsn) == CODE_FOR_vsetvlsi);
 }
 
 /* Return true if INSN1 comes befeore INSN2 in the same block.  */
 static bool
 same_bb_and_before_p (const insn_info *insn1, const insn_info *insn2)
 {
-  return (insn1->bb ()->index () == insn2->bb ()->index ())
-&& (*insn1 < *insn2);
+  return ((insn1->bb ()->index () == insn2->bb ()->index ())
+&& (*insn1 < *insn2));
 }
 
 /* Return true if INSN1 comes after or equal INSN2 in the same block.  */
 static bool
 same_bb_and_after_or_equal_p (const insn_info *insn1, const insn_info *insn2)
 {
-  return (insn1->bb ()->index () == insn2->bb ()->index ())
-&& (*insn1 >= *insn2);
+  return ((insn1->bb ()->index () == insn2->bb ()->index ())
+&& (*insn1 >= *insn2));
 }
 
 /* An "anticipatable occurrence" is one that is the first occurrence in the
-- 
2.36.1



[PATCH] RISC-V: Fix incorrect annotation

2022-12-19 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (anticipatable_occurrence_p): Fix 
incorrect annotations.
(available_occurrence_p): Ditto.
(backward_propagate_worthwhile_p): Ditto.
(can_backward_propagate_p): Ditto.

---
 gcc/config/riscv/riscv-vsetvl.cc | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 0c2ff630e96..72f1e4059ab 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -186,7 +186,7 @@ anticipatable_occurrence_p (const insn_info *insn, const vector_insn_info dem)
   /* The only possible operand we care of VSETVL is AVL.  */
   if (dem.has_avl_reg ())
 {
-  /* The operands shoule not be modified in the basic block prior
+  /* The operands should not be modified in the basic block prior
 to the occurrence.  */
   if (!vlmax_avl_p (dem.get_avl ()))
{
@@ -223,7 +223,7 @@ available_occurrence_p (const insn_info *insn, const vector_insn_info dem)
   /* The only possible operand we care of VSETVL is AVL.  */
   if (dem.has_avl_reg ())
 {
-  /* The operands shoule not be modified in the basic block prior
+  /* The operands should not be modified in the basic block prior
 to the occurrence.
 e.g.
bb:
@@ -284,7 +284,7 @@ backward_propagate_worthwhile_p (const basic_block cfg_bb,
 |_|
  reaching_out
  Header is incompatible with reaching_out and the block is loop itself,
- we don't backward propagete the local_dem since we can't avoid emit
+ we don't backward propagate the local_dem since we can't avoid emit
  vsetvl for the local_dem.  */
   edge e;
   edge_iterator ei;
@@ -334,10 +334,10 @@ can_backward_propagate_p (const function_info *ssa, const basic_block cfg_bb,
   insn_info *insn = prop.get_insn ();
 
   /* TODO: We don't backward propagate the explict VSETVL here
- since we will change vsetvl and vsetvlmax intrinsiscs into
- no side effects which can be optimized into optimzal location
- by GCC internal PASSes. We only need to support these backward
- propagation if vsetvl instrinsics have side effects.  */
+ since we will change vsetvl and vsetvlmax intrinsics into
+ no side effects which can be optimized into optimal location
+ by GCC internal passes. We only need to support these backward
+ propagation if vsetvl intrinsics have side effects.  */
   if (vsetvl_insn_p (insn->rtl ()))
 return false;
 
@@ -369,7 +369,7 @@ can_backward_propagate_p (const function_info *ssa, const basic_block cfg_bb,
   def_info *def = find_access (insn->uses (), REGNO (reg))->def ();
 
   /* If the definition is in the current block, we can't propagate it
- acrocss blocks.  */
+ across blocks.  */
   if (def->bb ()->cfg_bb ()->index == insn->bb ()->cfg_bb ()->index)
 {
   set_info *set = safe_dyn_cast (def);
@@ -406,7 +406,7 @@ can_backward_propagate_p (const function_info *ssa, const basic_block cfg_bb,
   if (def->bb ()->cfg_bb ()->index == cfg_bb->index)
 return true;
 
-  /* Make sure we don't backward propagete the VL/VTYPE info over the
+  /* Make sure we don't backward propagate the VL/VTYPE info over the
  definition blocks.  */
   bool visited_p = false;
   for (const bb_info *bb : ssa->reverse_bbs ())
-- 
2.36.1



Re: [PATCH] RISC-V: Fix muti-line condition format

2022-12-19 Thread Jeff Law via Gcc-patches




On 12/19/22 16:09, juzhe.zh...@rivai.ai wrote:

From: Ju-Zhe Zhong 

gcc/ChangeLog:

 * config/riscv/riscv-vsetvl.cc (vlmax_avl_insn_p): Fix multi-line 
conditional.
 (vsetvl_insn_p): Ditto.
 (same_bb_and_before_p): Ditto.
 (same_bb_and_after_or_equal_p): Ditto.

OK
jeff


Re: [PATCH] RISC-V: Fix incorrect annotation

2022-12-19 Thread Jeff Law via Gcc-patches




On 12/19/22 16:13, juzhe.zh...@rivai.ai wrote:

From: Ju-Zhe Zhong 

gcc/ChangeLog:

 * config/riscv/riscv-vsetvl.cc (anticipatable_occurrence_p): Fix 
incorrect annotations.
 (available_occurrence_p): Ditto.
 (backward_propagate_worthwhile_p): Ditto.
 (can_backward_propagate_p): Ditto.
OK.  And more generally, fixes like this don't need review.  Consider 
them pre-approved for the future.



jeff


Ping [PATCH v3] Add condition coverage profiling

2022-12-19 Thread Jørgen Kvalsvik via Gcc-patches
On 05/12/2022 10:40, Jørgen Kvalsvik wrote:
> This patch adds support in gcc+gcov for modified condition/decision
> coverage (MC/DC) with the -fprofile-conditions flag. MC/DC is a type of
> test/code coverage and it is particularly important in the aviation and
> automotive industries for safety-critical applications. MC/DC is
> required for or recommended by:
> 
> * DO-178C for the most critical software (Level A) in avionics
> * IEC 61508 for SIL 4
> * ISO 26262-6 for ASIL D
> 
>  From the SQLite webpage:
> 
> Two methods of measuring test coverage were described above:
> "statement" and "branch" coverage. There are many other test
> coverage metrics besides these two. Another popular metric is
> "Modified Condition/Decision Coverage" or MC/DC. Wikipedia defines
> MC/DC as follows:
> 
> * Each decision tries every possible outcome.
> * Each condition in a decision takes on every possible outcome.
> * Each entry and exit point is invoked.
> * Each condition in a decision is shown to independently affect
>   the outcome of the decision.
> 
> In the C programming language where && and || are "short-circuit"
> operators, MC/DC and branch coverage are very nearly the same thing.
> The primary difference is in boolean vector tests. One can test for
> any of several bits in bit-vector and still obtain 100% branch test
> coverage even though the second element of MC/DC - the requirement
> that each condition in a decision take on every possible outcome -
> might not be satisfied.
> 
> https://sqlite.org/testing.html#mcdc
> 
> Wahlen, Heimdahl, and De Silva "Efficient Test Coverage Measurement for
> MC/DC" describes an algorithm for adding instrumentation by carrying
> over information from the AST, but my algorithm analyses the control
> flow graph to instrument for coverage. This has the benefit of being
> programming language independent and faithful to compiler decisions
> and transformations, although I have only tested it on constructs in C
> and C++, see testsuite/gcc.misc-tests and testsuite/g++.dg.
> 
> Like Wahlen et al this implementation records coverage in fixed-size
> bitsets which gcov knows how to interpret. This is very fast, but
> introduces a limit on the number of terms in a single boolean
> expression, the number of bits in a gcov_unsigned_type (which is
> typedef'd to uint64_t), so for most practical purposes this would be
> acceptable. This limitation is in the implementation and not the
> algorithm, so support for more conditions can be added by also
> introducing arbitrary-sized bitsets.
> 
> For space overhead, the instrumentation needs two accumulators
> (gcov_unsigned_type) per condition in the program which will be written
> to the gcov file. In addition, every function gets a pair of local
> accumulators, but these accumulators are reused between conditions in the
> same function.
> 
> For time overhead, there is a zeroing of the local accumulators for
> every condition and one or two bitwise operations on every edge taken in
> an expression.
> 
> In action it looks pretty similar to the branch coverage. The -g short
> opt carries no significance, but was chosen because it was an available
> option with the upper-case free too.
> 
> gcov --conditions:
> 
> 3:   17:void fn (int a, int b, int c, int d) {
> 3:   18:if ((a && (b || c)) && d)
> condition outcomes covered 3/8
> condition  0 not covered (true false)
> condition  1 not covered (true)
> condition  2 not covered (true)
> condition  3 not covered (true)
> 1:   19:x = 1;
> -:   20:else
> 2:   21:x = 2;
> 3:   22:}
> 
> gcov --conditions --json-format:
> 
> "conditions": [
> {
> "not_covered_false": [
> 0
> ],
> "count": 8,
> "covered": 3,
> "not_covered_true": [
> 0,
> 1,
> 2,
> 3
> ]
> }
> ],
> 
> Some expressions, mostly those without else-blocks, are effectively
> "rewritten" in the CFG construction making the algorithm unable to
> distinguish them:
> 
> and.c:
> 
> if (a && b && c)
> x = 1;
> 
> ifs.c:
> 
> if (a)
> if (b)
> if (c)
> x = 1;
> 
> gcc will build the same graph for both these programs, and gcov will
> report both as 3-term expressions. It is vital that it is not
> interpreted the other way around (which is consistent with the shape of
> the graph) because otherwise the masking would be wrong for the and.c
> program which is a more severe error. While surprising, users would
> probably expect some minor rewriting of semantically-identical
> expressions.
> 
> and.c.gcov:
> #:2:if (a && b && c)
> condition outcomes covered 6/6
> #:3:x = 1;
> 
> ifs.c.gcov:
> #:2
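
The fixed-size bitset accounting described in the quoted message can be illustrated with a minimal sketch. It is not the code from the patch: the names `cond_counters`, `record`, `mcdc_reset` and `outcomes_seen` are hypothetical, and the masking step that MC/DC requires is deliberately left out, so the counts here can be higher than what gcov would report. It only shows the shape of the scheme: a pair of local accumulators zeroed on entry, one bitwise OR per evaluated term, and a flush into a persistent pair of counters.

```c
#include <stdint.h>

typedef uint64_t gcov_word;   /* stand-in for gcov_unsigned_type (assumption) */

/* Persistent pair of accumulators for one boolean expression. */
struct cond_counters {
    gcov_word seen_true;      /* bit i set: term i was observed true  */
    gcov_word seen_false;     /* bit i set: term i was observed false */
};

static struct cond_counters counters;

/* Record the outcome of term number TERM in the local accumulators
   and pass the value through so it can sit inside the expression.  */
static int record(int value, unsigned term, gcov_word *t, gcov_word *f)
{
    if (value)
        *t |= (gcov_word)1 << term;
    else
        *f |= (gcov_word)1 << term;
    return value;
}

/* Same shape as the fn() in the gcov listing; returns the value x would get. */
int fn(int a, int b, int c, int d)
{
    gcov_word t = 0, f = 0;   /* local accumulators, zeroed per evaluation */
    int r = (record(a, 0, &t, &f)
             && (record(b, 1, &t, &f) || record(c, 2, &t, &f)))
            && record(d, 3, &t, &f);
    counters.seen_true |= t;  /* flush locals into the persistent pair */
    counters.seen_false |= f;
    return r ? 1 : 2;
}

static unsigned popcount64(gcov_word w)
{
    unsigned n = 0;
    for (; w; w &= w - 1)     /* clear the lowest set bit each round */
        n++;
    return n;
}

/* Number of distinct (term, outcome) pairs observed so far, out of 8. */
unsigned outcomes_seen(void)
{
    return popcount64(counters.seen_true) + popcount64(counters.seen_false);
}

void mcdc_reset(void)
{
    counters.seen_true = 0;
    counters.seen_false = 0;
}
```

Because each term owns one bit in each word, a single 64-bit pair covers expressions of up to 64 terms, which is the limit the message mentions. Short-circuit evaluation means a term that is never reached sets no bit at all.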

Re: [committed] testsuite: Fix up pr107397.f90 test [PR107397]

2022-12-19 Thread Jerry D via Gcc-patches

On 12/19/22 2:29 AM, Jakub Jelinek wrote:

On Sat, Dec 17, 2022 at 09:12:43AM -0800, Jerry D via Gcc-patches wrote:

The attached patch fixes a regression and is a patch from Steve.  I have
regression tested it and provided a test case.  It is fairly simple and I
will commit under the "simple" rule in a little while.

Thanks Steve for the patch. Thanks Harald for helping me get back up to speed on
the git magic.


The pr107397.f90 test FAILs for me, one problem was that the
added diagnostic has an indefinite article before BOZ, but
the test dg-error didn't.  The other problem was that on the
other dg-error there was no space between the string and closing
}, so it was completely ignored and the error was an excess
error.

2022-12-19  Jakub Jelinek  

PR fortran/107397
* gfortran.dg/pr107397.f90: Adjust expected diagnostic wording and
add space between dg-error string and closing }.

=== snip ===

Thanks Jakub.



Re: [PATCH V7] rs6000: Optimize cmp on rotated 16bits constant

2022-12-19 Thread Jiufu Guo via Gcc-patches


Hi Segher,

Thanks for your review, and helpful comments!

Segher Boessenkool  writes:

> Hi!
>
> Mostly nitpicking left:
>
> On Mon, Dec 19, 2022 at 10:06:45PM +0800, Jiufu Guo wrote:
>> When checking eq/ne with a constant which has only 16bits, it can be
>> optimized to check the rotated data.  By this, the constant building
>> is optimized.
>> 
>> As the example in PR103743:
>> For "in == 0x8000LL", this patch generates:
>> rotldi 3,3,1 ; cmpldi 0,3,1
>> instead of:
>> li 9,-1 ; rldicr 9,9,0,0 ; cmpd 0,3,9
>
> Excellent :-)
>
>>  * config/rs6000/rs6000-protos.h (can_be_rotated_to_lowbits): New.
>>  (can_be_rotated_to_positive_16bits): New.
>>  (can_be_rotated_to_negative_15bits): New.
>>  * config/rs6000/rs6000.cc (can_be_rotated_to_lowbits): New definition.
>>  (can_be_rotated_to_positive_16bits): New definition.
>>  (can_be_rotated_to_negative_15bits): New definition.
>>  * config/rs6000/rs6000.md (*rotate_on_cmpdi): New define_insn_and_split.
>
> Good names.  Great function comments as well.
>
>> +/* Check if C (as 64bit integer) can be rotated to a constant which contains
>> +   nonzero bits at LOWBITS only.
>
> "at the LOWBITS low bits only".  Well it probably is clear what is
> meant :-)
Update.  Thanks :)
>
>> +   Return true if C can be rotated to such constant.  And *ROT is written to
>> +   the number by which C is rotated.
>> +   Return false otherwise.  */
>
> "If so, *ROT is written" etc.
Great.  Updated.
>
>> +(define_code_iterator eqne [eq ne])
>
> You should say in the changelog that "eqne" was moved.
> "(eqne): Move earlier." is plenty of course.
Sure, thanks! Update.
>
>> +(define_insn_and_split "*rotate_on_cmpdi"
>
>> +  rtx note = find_reg_note (curr_insn, REG_BR_PROB, 0);
>
> Move this much later please, to just before it is used.
Oh, thanks! Update.
>
>> +  /* keep the probability info for the prediction of the branch insn.  */
>
> "Keep", sentences start with a capital.
Update.
>
>> +}
>> +)
>
> These go on one line, as just
> })
Updated.
>
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/pr103743.c
>> @@ -0,0 +1,52 @@
>> +/* { dg-options "-O2" } */
>> +/* { dg-do compile { target has_arch_ppc64 } } */
>> +
>> +/* { dg-final { scan-assembler-times {\mcmpldi\M} 10  } } */
>> +/* { dg-final { scan-assembler-times {\mcmpdi\M} 4  } } */
>> +/* { dg-final { scan-assembler-times {\mrotldi\M} 14  } } */
>> +
>
> With so much going on in just one function, I am a bit worried that this
> testcase will easily fail in the future.  We will see.
Thanks for pointing this out. I understand your concern!
I will pay attention to this.

>
> Okay for trunk with those i's dotted.  Thank you!
Updated and committed via r13-4803-g1060cd2ad00b51.


BR,
Jeff (Jiufu)
>
>
> Segher
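
The check discussed in this review (can a 64-bit constant be rotated so that its nonzero bits fall only in the low LOWBITS bits?) can be sketched by brute force. This is only an illustration and not the rs6000 implementation: `rotl64` and `rotates_to_lowbits` are hypothetical names, and the sketch simply tries all 64 left rotations, returning the smallest one that works.

```c
#include <stdbool.h>
#include <stdint.h>

/* Rotate X left by N bits; N is reduced modulo 64. */
static uint64_t rotl64(uint64_t x, unsigned n)
{
    n &= 63;
    return n ? (x << n) | (x >> (64 - n)) : x;
}

/* Return true if some left rotation of C has nonzero bits only in the
   low LOWBITS bits.  If so, *ROT receives the smallest such rotation
   count; otherwise *ROT is left untouched.  */
bool rotates_to_lowbits(uint64_t c, unsigned lowbits, unsigned *rot)
{
    if (lowbits == 0 || lowbits > 64)
        return false;

    /* Mask of the permitted low bits; avoid shifting by 64. */
    uint64_t mask = lowbits == 64 ? ~(uint64_t)0
                                  : ((uint64_t)1 << lowbits) - 1;

    for (unsigned r = 0; r < 64; r++) {
        if ((rotl64(c, r) & ~mask) == 0) {
            *rot = r;
            return true;
        }
    }
    return false;
}
```

For a constant with only the top bit set, the sketch finds a rotation count of 1 and a rotated value of 1, which is consistent with the `rotldi 3,3,1 ; cmpldi 0,3,1` sequence quoted above. A production version would more likely derive the count from leading and trailing zero counts than loop over all rotations.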


Re: [PATCH V2 2/2] [x86] x86: Add a new option -mdaz-ftz to enable FTZ and DAZ flags in MXCSR.

2022-12-19 Thread Hongtao Liu via Gcc-patches
On Thu, Dec 15, 2022 at 3:45 PM Hongtao Liu  wrote:
>
> On Thu, Dec 15, 2022 at 3:39 PM Jakub Jelinek  wrote:
> >
> > On Thu, Dec 15, 2022 at 02:21:37PM +0800, liuhongt via Gcc-patches wrote:
> > > --- a/gcc/config/i386/i386.opt
> > > +++ b/gcc/config/i386/i386.opt
> > > @@ -420,6 +420,10 @@ mpc80
> > >  Target RejectNegative
> > >  Set 80387 floating-point precision to 80-bit.
> > >
> > > +mdaz-ftz
> > > +Target
> >
> > s/Target/Driver/
> Changed to Driver and got an error like: cc1: error: command-line option
> ‘-mdaz-ftz’ is valid for the driver but not for C.
Hi Jakub,
  I couldn't find a good way to handle this error after changing
*Target* to *Driver*. Could you give some hints on how to solve this
problem?
Or is it OK with you to mark this as *Target*? (There won't be any save
and restore in cfun since no variable is defined here.)
> >
> > > +Set the FTZ and DAZ Flags.
> >
> > Jakub
> >
>
>
> --
> BR,
> Hongtao



-- 
BR,
Hongtao


[PING^2] nvptx: stack size limits are relevant for execution only (was: [PATCH, testsuite] Add effective target stack_size)

2022-12-19 Thread Thomas Schwinge
Hi!

Ping.


Grüße
 Thomas


On 2022-11-25T12:09:36+0100, I wrote:
> Hi!
>
> Ping.
>
>
> Grüße
>  Thomas
>
>
> On 2022-11-08T21:29:49+0100, I wrote:
>> Hi!
>>
>> On 2017-06-09T16:24:30+0200, Tom de Vries  wrote:
>>> The patch defines an effective target stack_size, which is used in
>>> individual test-cases to add -DSTACK_SIZE= [...]
>>
>>> gccint.info (edited for long lines):
>>> ...
>>> 7.2.3.12 Other attributes
>>> .
>>>
>>> 'stack_size'
>>>   Target has limited stack size.  [...]
>>
>> On top of that, OK to push the attached
>> "nvptx: stack size limits are relevant for execution only"?
>>
>>
>> Grüße
>>  Thomas


-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
From 158a077129cb1579b93ddf440a5bb60b457e4b7c Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Tue, 8 Nov 2022 12:10:03 +0100
Subject: [PATCH] nvptx: stack size limits are relevant for execution only

For non-'dg-do run' test cases, that means: big 'dg-require-stack-size' need
not be UNSUPPORTED (and indeed now do all PASS), 'dg-add-options stack_size'
need not define (and thus limit) 'STACK_SIZE' (and still do all PASS).

Re "Find 'dg-do-what' in an outer frame", currently (sources not completely
clean, though), we've got:

$ git grep -F 'check_effective_target_stack_size: found dg-do-what at level ' -- build-gcc/\*.log | sort | uniq -c
  6 build-gcc/gcc/testsuite/gcc/gcc.log:check_effective_target_stack_size: found dg-do-what at level 2
267 build-gcc/gcc/testsuite/gcc/gcc.log:check_effective_target_stack_size: found dg-do-what at level 3
239 build-gcc/gcc/testsuite/gcc/gcc.log:check_effective_target_stack_size: found dg-do-what at level 4

	gcc/testsuite/
	* lib/target-supports.exp (check_effective_target_stack_size): For
	nvptx target, stack size limits are relevant for execution only.
	gcc/
	* doc/sourcebuild.texi (stack_size): Update.
---
 gcc/doc/sourcebuild.texi  |  4 
 gcc/testsuite/lib/target-supports.exp | 16 
 2 files changed, 20 insertions(+)

diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index 137f00aadc1f..5bbf6fc55909 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -2848,6 +2848,10 @@ Target has limited stack size.  The stack size limit can be obtained using the
 STACK_SIZE macro defined by @ref{stack_size_ao,,@code{dg-add-options} feature
 @code{stack_size}}.
 
+Note that for certain targets, stack size limits are relevant for
+execution only, and therefore considered only if @code{dg-do run} is
+in effect, otherwise unlimited.
+
 @item static
 Target supports @option{-static}.
 
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 750897d08548..39ed1723b03a 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -625,6 +625,22 @@ proc check_effective_target_trampolines { } {
 # Return 1 if target has limited stack size.
 
 proc check_effective_target_stack_size { } {
+# For nvptx target, stack size limits are relevant for execution only.
+if { [istarget nvptx-*-*] } {
+	# Find 'dg-do-what' in an outer frame.
+	set level 1
+	while true {
+	upvar $level dg-do-what dg-do-what
+	if [info exists dg-do-what] then break
+	incr level
+	}
+	verbose "check_effective_target_stack_size: found dg-do-what at level $level" 2
+
+	if { ![string equal [lindex ${dg-do-what} 0] run] } {
+	return 0
+	}
+}
+
 if [target_info exists gcc,stack_size] {
 	return 1
 }
-- 
2.35.1



[PING] nvptx: Re-enable a number of test cases

2022-12-19 Thread Thomas Schwinge
Hi!

Ping this whole series.


Grüße
 Thomas


On 2022-12-02T13:03:06+0100, Thomas Schwinge  wrote:
> Hi!
>
> I'm proposing to re-enable a number of test cases for nvptx.  OK to push?
>
>
> Grüße
>  Thomas