[PATCH] Fortran: make IEEE_CLASS recognize signaling NaNs

2022-01-02 Thread FX via Gcc-patches
Hi,

This is the first part of a three-patch series to fix PR 82207 
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82207), making gfortran handle 
signaling NaNs. This part fixes the library code implementing IEEE_CLASS, by 
using the issignaling macro (from TS 18661-1:2014) to check whether a NaN is 
signaling.

The patch comes with a testcase, conditional on issignaling support (which will 
therefore run on glibc targets), which uses C built-ins to generate signaling 
NaNs and checks in Fortran code that they are classified and behave as expected.

Once this is in, the next two parts are:

- Add support for generating signaling NaNs in IEEE_VALUE, which is a longer 
patch because it requires moving the IEEE_VALUE library code from Fortran to C 
(but will be much more efficient and correct than the current implementation).
- Provide a fallback implementation of issignaling on targets that don’t have 
it.
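
As an illustration of what such a check involves (this is not the code from the patch, and not necessarily how the planned fallback will be written), a signaling NaN can be recognized from its bit pattern. The sketch below assumes IEEE 754 binary64 with the common convention that a set top fraction bit marks a quiet NaN; hppa and legacy MIPS use the opposite convention. The names is_signaling_binary64 and set_binary64_bits are made up for this example.

```c
#include <stdint.h>
#include <string.h>

/* Illustrative sketch only.  Returns 1 if *p holds a signaling NaN,
   assuming IEEE 754 binary64 where a set top fraction bit means "quiet".
   The value is read through a pointer so that it is never loaded into a
   floating-point register, which could quiet it on some targets.  */
static int
is_signaling_binary64 (const double *p)
{
  uint64_t bits;
  memcpy (&bits, p, sizeof bits);

  const uint64_t exp_mask = UINT64_C (0x7ff0000000000000);
  const uint64_t frac_mask = UINT64_C (0x000fffffffffffff);
  const uint64_t quiet_bit = UINT64_C (0x0008000000000000);

  if ((bits & exp_mask) != exp_mask)
    return 0;                       /* finite value */
  if ((bits & frac_mask) == 0)
    return 0;                       /* infinity */
  return (bits & quiet_bit) == 0;   /* NaN with the quiet bit clear */
}

/* Hypothetical helper: store a raw bit pattern into a double.  */
static void
set_binary64_bits (double *p, uint64_t bits)
{
  memcpy (p, &bits, sizeof bits);
}
```

On glibc targets the issignaling macro does this job for all floating-point types, so a hand-written check like this only matters where the macro is missing.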


Bootstrapped and regtested on x86_64-pc-linux-gnu. OK to commit?

FX



0001-Fortran-Allow-IEEE_CLASS-to-identify-signaling-NaNs.patch
Description: Binary data


[PATCH] libgo: include asm/ptrace.h for pt_regs definition on PowerPC

2022-01-02 Thread soeren--- via Gcc-patches
From: Sören Tempel 

Both glibc and musl libc declare pt_regs as an incomplete type. This
type has to be completed by inclusion of another header. On Linux, the
asm/ptrace.h header file provides this type definition. Without
including this header file, it is not possible to access the regs member
of the mcontext_t struct as done in libgo/runtime/go-signal.c. On glibc,
other headers (e.g. sys/user.h) include asm/ptrace.h but on musl
asm/ptrace.h is not included by other headers and thus the
aforementioned files do not compile without an explicit include of
asm/ptrace.h:

libgo/runtime/go-signal.c: In function 'getSiginfo':
libgo/runtime/go-signal.c:227:63: error: invalid use of undefined type 'struct pt_regs'
  227 | ret.sigpc = ((ucontext_t*)(context))->uc_mcontext.regs->nip;
  |

See also:

* https://git.musl-libc.org/cgit/musl/commit/?id=c2518a8efb6507f1b41c3b12e03b06f8f2317a1f
* https://github.com/kaniini/libucontext/issues/36

Signed-off-by: Sören Tempel 

ChangeLog:

* libgo/runtime/go-signal.c: Include asm/ptrace.h for the
  definition of pt_regs (used by mcontext_t) on PowerPC.
---
 libgo/runtime/go-signal.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/libgo/runtime/go-signal.c b/libgo/runtime/go-signal.c
index d30d1603adc..fc01e04e4a1 100644
--- a/libgo/runtime/go-signal.c
+++ b/libgo/runtime/go-signal.c
@@ -10,6 +10,12 @@
 #include 
 #include 
 
+// On PowerPC, ucontext.h uses a pt_regs struct as an incomplete
+// type. This type must be completed by including asm/ptrace.h.
+#ifdef __PPC__
+#include <asm/ptrace.h>
+#endif
+
 #include "runtime.h"
 
 #ifndef SA_RESTART


[COMMITTED] hppa: Use optab_libfunc to access sync_lock_test_and_set libfunc

2022-01-02 Thread John David Anglin
This patch revises the atomic store code for hppa-linux to use optab_libfunc
to access the sync_lock_test_and_set libfunc. We now call
convert_memory_address() to convert the memory address to Pmode. This
should handle more memory addresses.

Tested on hppa-unknown-linux-gnu. Committed to trunk and gcc-11 branch.

Dave
---

Use optab_libfunc to access sync_lock_test_and_set libfunc on hppa-linux.

2022-01-02  John David Anglin  

gcc/ChangeLog:

* config/pa/pa.md (atomic_storeq): Use optab_libfunc to access
sync_lock_test_and_set libfunc. Call convert_memory_address to
convert memory address to Pmode.
(atomic_storehi, atomic_storesi, atomic_storedi): Likewise.

diff --git a/gcc/config/pa/pa.md b/gcc/config/pa/pa.md
index af5449a9ea3..31e3b1bff80 100644
--- a/gcc/config/pa/pa.md
+++ b/gcc/config/pa/pa.md
@@ -10366,10 +10366,10 @@ add,l %2,%3,%3\;bv,n %%r0(%3)"
 {
   if (TARGET_SYNC_LIBCALL)
 {
-  rtx libfunc = init_one_libfunc ("__sync_lock_test_and_set_1");
+  rtx libfunc = optab_libfunc (sync_lock_test_and_set_optab, QImode);
+  rtx addr = convert_memory_address (Pmode, XEXP (operands[0], 0));
 
-  emit_library_call (libfunc, LCT_NORMAL, VOIDmode,
-XEXP (operands[0], 0), Pmode,
+  emit_library_call (libfunc, LCT_NORMAL, VOIDmode, addr, Pmode,
 operands[1], QImode);
   DONE;
 }
@@ -10386,10 +10386,10 @@ add,l %2,%3,%3\;bv,n %%r0(%3)"
 {
   if (TARGET_SYNC_LIBCALL)
 {
-  rtx libfunc = init_one_libfunc ("__sync_lock_test_and_set_2");
+  rtx libfunc = optab_libfunc (sync_lock_test_and_set_optab, HImode);
+  rtx addr = convert_memory_address (Pmode, XEXP (operands[0], 0));
 
-  emit_library_call (libfunc, LCT_NORMAL, VOIDmode,
-XEXP (operands[0], 0), Pmode,
+  emit_library_call (libfunc, LCT_NORMAL, VOIDmode, addr, Pmode,
 operands[1], HImode);
   DONE;
 }
@@ -10406,10 +10406,10 @@ add,l %2,%3,%3\;bv,n %%r0(%3)"
 {
   if (TARGET_SYNC_LIBCALL)
 {
-  rtx libfunc = init_one_libfunc ("__sync_lock_test_and_set_4");
+  rtx libfunc = optab_libfunc (sync_lock_test_and_set_optab, SImode);
+  rtx addr = convert_memory_address (Pmode, XEXP (operands[0], 0));
 
-  emit_library_call (libfunc, LCT_NORMAL, VOIDmode,
-XEXP (operands[0], 0), Pmode,
+  emit_library_call (libfunc, LCT_NORMAL, VOIDmode, addr, Pmode,
 operands[1], SImode);
   DONE;
 }
@@ -10459,10 +10459,10 @@ add,l %2,%3,%3\;bv,n %%r0(%3)"
 
   if (TARGET_SYNC_LIBCALL)
 {
-  rtx libfunc = init_one_libfunc ("__sync_lock_test_and_set_8");
+  rtx libfunc = optab_libfunc (sync_lock_test_and_set_optab, DImode);
+  rtx addr = convert_memory_address (Pmode, XEXP (operands[0], 0));
 
-  emit_library_call (libfunc, LCT_NORMAL, VOIDmode,
-XEXP (operands[0], 0), Pmode,
+  emit_library_call (libfunc, LCT_NORMAL, VOIDmode, addr, Pmode,
 operands[1], DImode);
   DONE;
 }





[COMMITTED] hppa: Generate illegal instruction fault if LWS syscall returns -EFAULT

2022-01-02 Thread John David Anglin
The kernel compare and exchange calls will never succeed if they
return -EFAULT. This change generates an instruction fault if a
call returns -EFAULT. This prevents the code from spinning forever.
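
The reasoning can be sketched in plain C. This is not the libgcc code: cas_fn, cas_until_done and the two fake_cas_* stubs are hypothetical, and the sketch returns the error to the caller where the actual change traps with an illegal instruction. The point is only that -EFAULT is permanent, so a loop that retries on every failure would never terminate.

```c
#include <errno.h>

/* Hypothetical compare-and-swap primitive: returns 0 on success or a
   negative errno value on failure.  */
typedef int (*cas_fn) (int *mem, int oldval, int newval);

/* Retry transient failures, but stop at once on -EFAULT, because
   repeating the call can never succeed.  */
static int
cas_until_done (cas_fn cas, int *mem, int oldval, int newval,
                unsigned *attempts)
{
  for (;;)
    {
      int err = cas (mem, oldval, newval);
      ++*attempts;
      if (err == 0)
        return 0;
      if (err == -EFAULT)
        return err;             /* permanent failure: do not spin */
      /* Anything else (e.g. -EBUSY) is treated as transient here.  */
    }
}

/* Stub that always reports a bad address.  */
static int
fake_cas_efault (int *mem, int oldval, int newval)
{
  (void) mem; (void) oldval; (void) newval;
  return -EFAULT;
}

/* Stub that is busy twice and then performs the store.  */
static int
fake_cas_busy_twice (int *mem, int oldval, int newval)
{
  static int calls;
  (void) oldval;
  if (++calls <= 2)
    return -EBUSY;
  *mem = newval;
  return 0;
}
```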

Tested on hppa-unknown-linux-gnu. Committed to trunk and gcc-11 branch.

Dave
---

Generate illegal instruction fault if LWS syscall returns -EFAULT.

2022-01-02  John David Anglin  

libgcc/ChangeLog:

* config/pa/linux-atomic.c (_ASM_EFAULT): Define.
(__kernel_cmpxchg): Nullify illegal iitlbp instruction if error
return is not equal _ASM_EFAULT.
(__kernel_cmpxchg2): Likewise.

diff --git a/libgcc/config/pa/linux-atomic.c b/libgcc/config/pa/linux-atomic.c
index 500a3652499..e4d74b2d598 100644
--- a/libgcc/config/pa/linux-atomic.c
+++ b/libgcc/config/pa/linux-atomic.c
@@ -28,6 +28,8 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If 
not, see
 #define EBUSY   16
 #define ENOSYS 251 
 
+#define _ASM_EFAULT "-14"
+
 typedef unsigned char u8;
 typedef short unsigned int u16;
 #ifdef __LP64__
@@ -58,6 +60,8 @@ __kernel_cmpxchg (volatile void *mem, int oldval, int newval)
   register long lws_errno asm("r21");
   asm volatile (   "ble 0xb0(%%sr2, %%r0)   \n\t"
 "ldi %2, %%r20   \n\t"
+   "cmpiclr,<> " _ASM_EFAULT ", %%r21, %%r0\n\t"
+   "iitlbp %%r0,(%%sr0, %%r0)  \n\t"
: "=r" (lws_ret), "=r" (lws_errno)
: "i" (LWS_CAS), "r" (lws_mem), "r" (lws_old), "r" (lws_new)
: "r1", "r20", "r22", "r23", "r29", "r31", "memory"
@@ -84,6 +88,8 @@ __kernel_cmpxchg2 (volatile void *mem, const void *oldval, 
const void *newval,
   register long lws_errno asm("r21");
   asm volatile (   "ble 0xb0(%%sr2, %%r0)   \n\t"
 "ldi %6, %%r20   \n\t"
+   "cmpiclr,<> " _ASM_EFAULT ", %%r21, %%r0\n\t"
+   "iitlbp %%r0,(%%sr0, %%r0)  \n\t"
: "=r" (lws_ret), "=r" (lws_errno), "+r" (lws_mem),
  "+r" (lws_old), "+r" (lws_new), "+r" (lws_size)
: "i" (2)





[pushed] c++: don't wrap cleanups that can't throw

2022-01-02 Thread Jason Merrill via Gcc-patches
Since C++11, the vast majority of destructors are noexcept, so
wrap_temporary_cleanups adds a bunch of useless TRY_CATCH_EXPR to be removed
later in the optimizers.  It's simple to avoid adding them in the first
place.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog:

* decl.c (wrap_cleanups_r): Don't wrap if noexcept.

gcc/testsuite/ChangeLog:

* g++.dg/eh/cleanup6.C: New test.
---
 gcc/cp/decl.c  | 14 --
 gcc/testsuite/g++.dg/eh/cleanup6.C | 13 +
 2 files changed, 21 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/eh/cleanup6.C

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 982fca8983d..51c69957635 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -7428,12 +7428,14 @@ wrap_cleanups_r (tree *stmt_p, int *walk_subtrees, void 
*data)
   tree guard = (tree)data;
   tree tcleanup = TARGET_EXPR_CLEANUP (*stmt_p);
 
-  tcleanup = build2 (TRY_CATCH_EXPR, void_type_node, tcleanup, guard);
-  /* Tell honor_protect_cleanup_actions to handle this as a separate
-cleanup.  */
-  TRY_CATCH_IS_CLEANUP (tcleanup) = 1;
- 
-  TARGET_EXPR_CLEANUP (*stmt_p) = tcleanup;
+  if (tcleanup && !expr_noexcept_p (tcleanup, tf_none))
+   {
+ tcleanup = build2 (TRY_CATCH_EXPR, void_type_node, tcleanup, guard);
+ /* Tell honor_protect_cleanup_actions to handle this as a separate
+cleanup.  */
+ TRY_CATCH_IS_CLEANUP (tcleanup) = 1;
+ TARGET_EXPR_CLEANUP (*stmt_p) = tcleanup;
+   }
 }
 
   return NULL_TREE;
diff --git a/gcc/testsuite/g++.dg/eh/cleanup6.C 
b/gcc/testsuite/g++.dg/eh/cleanup6.C
new file mode 100644
index 000..e27563f4569
--- /dev/null
+++ b/gcc/testsuite/g++.dg/eh/cleanup6.C
@@ -0,0 +1,13 @@
+// Test that we don't wrap the non-throwing A cleanup with a B cleanup.
+
+// { dg-do compile { target c++11 } }
+// { dg-additional-options -fdump-tree-gimple }
+// { dg-final { scan-tree-dump-times "B::~B" 1 "gimple" } }
+
+struct A { A(); ~A(); };
+struct B { B(const A& = A()); ~B(); };
+
+int main()
+{
+  B b;
+}

base-commit: 62eb5308fe6c46f7eded3c9e06c53491515a6e63
-- 
2.27.0



[pushed] c++: fix array cleanup with throwing temp dtor

2022-01-02 Thread Jason Merrill via Gcc-patches
While working on PR66139 I noticed that if the destructor of a temporary
created during array initialization throws, we were failing to destroy the
last array element constructed.  Throwing destructors are rare since C++11,
but this should be fixed.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog:

* init.c (build_vec_init): Append the decrement to elt_init.

gcc/testsuite/ChangeLog:

* g++.dg/eh/array2.C: New test.
---
 gcc/cp/init.c| 17 -
 gcc/testsuite/g++.dg/eh/array2.C | 43 
 2 files changed, 54 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/eh/array2.C

diff --git a/gcc/cp/init.c b/gcc/cp/init.c
index 2a4512e462a..5a5c1257902 100644
--- a/gcc/cp/init.c
+++ b/gcc/cp/init.c
@@ -4665,11 +4665,13 @@ build_vec_init (tree base, tree maxindex, tree init,
   finish_for_cond (build2 (GT_EXPR, boolean_type_node, iterator,
   build_int_cst (TREE_TYPE (iterator), -1)),
   for_stmt, false, 0);
-  elt_init = cp_build_unary_op (PREDECREMENT_EXPR, iterator, false,
-   complain);
-  if (elt_init == error_mark_node)
-   errors = true;
-  finish_for_expr (elt_init, for_stmt);
+  /* We used to pass this decrement to finish_for_expr; now we add it to
+elt_init below so it's part of the same full-expression as the
+initialization, and thus happens before any potentially throwing
+temporary cleanups.  */
+  tree decr = cp_build_unary_op (PREDECREMENT_EXPR, iterator, false,
+complain);
+
 
   to = build1 (INDIRECT_REF, type, base);
 
@@ -4794,7 +4796,10 @@ build_vec_init (tree base, tree maxindex, tree init,
 
   current_stmt_tree ()->stmts_are_full_exprs_p = 1;
   if (elt_init && !errors)
-   finish_expr_stmt (elt_init);
+   elt_init = build2 (COMPOUND_EXPR, void_type_node, elt_init, decr);
+  else
+   elt_init = decr;
+  finish_expr_stmt (elt_init);
   current_stmt_tree ()->stmts_are_full_exprs_p = 0;
 
   finish_expr_stmt (cp_build_unary_op (PREINCREMENT_EXPR, base, false,
diff --git a/gcc/testsuite/g++.dg/eh/array2.C b/gcc/testsuite/g++.dg/eh/array2.C
new file mode 100644
index 000..d4d6c91cde7
--- /dev/null
+++ b/gcc/testsuite/g++.dg/eh/array2.C
@@ -0,0 +1,43 @@
+// Test that we clean up the right number of array elements when
+// a temporary destructor throws.
+// { dg-do run }
+
+#if __cplusplus > 201100L
+#define THROWING noexcept(false)
+#else
+#define THROWING
+#endif
+
+extern "C" void abort ();
+
+int b;
+int d = -1;
+struct A {
+  A() { }
+  A(const A&);
+  ~A() THROWING {
+if (b == d) throw b;
+  }
+};
+struct B {
+  B(const A& = A()) { ++b; }
+  B(const B&);
+  ~B() { --b; }
+};
+void f()
+{
+  b = 0;
+  try
+{
+  B bs[3];
+  if (b != 3) abort ();
+}
+  catch (int i) { }
+  if (b != 0) abort ();
+}
+
+int main()
+{
+  for (d = 0; d <= 3; ++d)
+f();
+}

base-commit: 62eb5308fe6c46f7eded3c9e06c53491515a6e63
prerequisite-patch-id: ecfffb22c1ee7e449270885df0aaa6d5fc9d291f
-- 
2.27.0



[PATCH RFA] tree-pretty-print: still indent unhandled codes

2022-01-02 Thread Jason Merrill via Gcc-patches
It would be nice to handle language-specific codes in the tree
pretty-printer, but until then we can at least indent them appropriately.

Tested x86_64-pc-linux-gnu, ok for trunk?

gcc/ChangeLog:

* tree-pretty-print.c (do_niy): Add spc parameter.
(NIY): Pass it.
(print_call_name): Add spc local variable.
---
 gcc/tree-pretty-print.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index a81ba401ef9..7a1c4e843d1 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -53,19 +53,19 @@ static const char *op_symbol (const_tree);
 static void newline_and_indent (pretty_printer *, int);
 static void maybe_init_pretty_print (FILE *);
 static void print_struct_decl (pretty_printer *, const_tree, int, 
dump_flags_t);
-static void do_niy (pretty_printer *, const_tree, dump_flags_t);
+static void do_niy (pretty_printer *, const_tree, int, dump_flags_t);
 
 #define INDENT(SPACE) do { \
   int i; for (i = 0; i

[PATCH] Fortran: fix PR103390, ICE in gimplification

2022-01-02 Thread Sandra Loosemore
This patch is for PR103390.  For background on this issue, the Fortran 
standard requires that, when passing a non-contiguous array from Fortran 
to a BIND(C) function with the CONTIGUOUS attribute on the corresponding 
dummy argument, the compiler has to arrange for it to be copied to/from 
a contiguous temporary.  The ICE was happening because the front end was 
attempting to copy out to an array-valued expression that isn't an 
lvalue, and producing invalid code.
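
For readers less familiar with copy-in/copy-out, here is the idea written as a C sketch. It is not what the Fortran front end generates; contig_fn, call_with_contiguous and scale_by_two are hypothetical names. A strided section is packed into a contiguous temporary, the callee runs on the temporary, and the results are copied back. The copy-back step is the one that only makes sense when the actual argument is an lvalue: an expression result has no storage to copy back into.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical callee that requires contiguous data.  */
typedef void (*contig_fn) (double *data, size_t n);

/* Call FN on N elements starting at BASE and spaced STRIDE apart.
   Returns 0 on success, -1 if the temporary cannot be allocated.  */
static int
call_with_contiguous (contig_fn fn, double *base, size_t n, size_t stride)
{
  if (stride == 1 || n == 0)
    {
      fn (base, n);             /* already contiguous: no copy needed */
      return 0;
    }

  double *tmp = malloc (n * sizeof *tmp);
  if (tmp == NULL)
    return -1;

  for (size_t i = 0; i < n; i++)
    tmp[i] = base[i * stride];  /* copy-in */

  fn (tmp, n);

  for (size_t i = 0; i < n; i++)
    base[i * stride] = tmp[i];  /* copy-out: requires an lvalue target */

  free (tmp);
  return 0;
}

/* Example callee: doubles every element in place.  */
static void
scale_by_two (double *data, size_t n)
{
  for (size_t i = 0; i < n; i++)
    data[i] *= 2.0;
}
```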


I poked around at several related examples (included as test cases in 
the patch) and realized that it should not be doing any copying at all 
here, since the expression result already was being put in a contiguous 
temporary.  And, besides the invalid code on copy-out, in some cases it 
was generating multiple copies of the code to compute the expression on 
copy-in.  :-S


Both parts of the patch seem to be necessary to fix all the test cases. 
Tobias pointed me in this direction when I discussed it with him a few 
weeks ago so I hope I got it right.


OK to check in?  It regression-tests fine on x86_64.

-Sandra
commit 3a5e4f3a14b4265ee6f92dd724cbae9103d38d4b
Author: Sandra Loosemore 
Date:   Wed Dec 29 16:44:14 2021 -0800

Fortran: Fix array copy-in/copy-out for BIND(C) functions [PR103390]

The Fortran front end was generating invalid code for the array
copy-out after a call to a BIND(C) function for a dummy with the
CONTIGUOUS attribute when the actual argument was a call to the SHAPE
intrinsic or other array expressions that are not lvalues.  It was
also generating code to evaluate the argument expression multiple
times on copy-in.  This patch teaches it to recognize that a copy is
not needed in these cases.

2022-01-02  Sandra Loosemore  

	PR fortran/103390

	gcc/fortran/
	* expr.c (gfc_is_simply_contiguous): Make it smarter about
	function calls.
	* trans-expr.c (gfc_conv_gfc_desc_to_cfi_desc): Do not generate
	copy loops for array expressions that are not "variables" (lvalues).

	gcc/testsuite/
	* gfortran.dg/c-interop/pr103390-1.f90: New.
	* gfortran.dg/c-interop/pr103390-2.f90: New.
	* gfortran.dg/c-interop/pr103390-3.f90: New.
	* gfortran.dg/c-interop/pr103390-4.f90: New.
	* gfortran.dg/c-interop/pr103390-6.f90: New.
	* gfortran.dg/c-interop/pr103390-7.f90: New.
	* gfortran.dg/c-interop/pr103390-8.f90: New.
	* gfortran.dg/c-interop/pr103390-9.f90: New.

diff --git a/gcc/fortran/expr.c b/gcc/fortran/expr.c
index c1258e0..a0129a3 100644
--- a/gcc/fortran/expr.c
+++ b/gcc/fortran/expr.c
@@ -5883,8 +5883,16 @@ gfc_is_simply_contiguous (gfc_expr *expr, bool strict, bool permit_element)
 
   if (expr->expr_type == EXPR_FUNCTION)
 {
-  if (expr->value.function.esym)
-	return expr->value.function.esym->result->attr.contiguous;
+  if (expr->value.function.isym)
+	/* TRANPOSE is the only intrinsic that may return a
+	   non-contiguous array.  It's treated as a special case in
+	   gfc_conv_expr_descriptor too.  */
+	return (expr->value.function.isym->id != GFC_ISYM_TRANSPOSE);
+  else if (expr->value.function.esym)
+	/* Only a pointer to an array without the contiguous attribute
+	   can be non-contiguous as a result value.  */
+	return (expr->value.function.esym->result->attr.contiguous
+		|| !expr->value.function.esym->result->attr.pointer);
   else
 	{
 	  /* Type-bound procedures.  */
diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
index 80c669f..10e1e37 100644
--- a/gcc/fortran/trans-expr.c
+++ b/gcc/fortran/trans-expr.c
@@ -5536,13 +5536,17 @@ gfc_conv_gfc_desc_to_cfi_desc (gfc_se *parmse, gfc_expr *e, gfc_symbol *fsym)
 {
   /* If the actual argument can be noncontiguous, copy-in/out is required,
 	 if the dummy has either the CONTIGUOUS attribute or is an assumed-
-	 length assumed-length/assumed-size CHARACTER array.  */
+	 length assumed-length/assumed-size CHARACTER array.  This only
+	 applies if the actual argument is a "variable"; if it's some
+	 non-lvalue expression, we are going to evaluate it to a
+	 temporary below anyway.  */
   se.force_no_tmp = 1;
   if ((fsym->attr.contiguous
 	   || (fsym->ts.type == BT_CHARACTER && !fsym->ts.u.cl->length
 	   && (fsym->as->type == AS_ASSUMED_SIZE
 		   || fsym->as->type == AS_EXPLICIT)))
-	  && !gfc_is_simply_contiguous (e, false, true))
+	  && !gfc_is_simply_contiguous (e, false, true)
+	  && gfc_expr_is_variable (e))
 	{
 	  bool optional = fsym->attr.optional;
 	  fsym->attr.optional = 0;
@@ -6841,6 +6845,8 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
 	 fsym->attr.pointer);
 		}
 	  else
+		/* This is where we introduce a temporary to store the
+		   result of a non-lvalue array expression.  */
 		gfc_conv_array_parameter (&parmse, e, nodesc_arg, fsym,
 	  sym->name, NULL);
 
diff --git a/gcc/testsuite/gfortran.dg/c-interop/pr103390-1.f90 b/gcc/testsuite/gfortran.dg/c-interop/p

[COMMITTED] hppa: Skip gcc.dg/guality/example.c on hppa-linux

2022-01-02 Thread John David Anglin
This test hangs on hppa-linux. I don't know exactly why, but it seems
to be a sequencing issue with gdb. The hang breaks automated builds
because the test doesn't time out.

Tested on hppa-unknown-linux-gnu. Committed to trunk.

Dave
---

Skip gcc.dg/guality/example.c on hppa-linux.

2022-01-02  John David Anglin  

gcc/testsuite/ChangeLog:

* gcc.dg/guality/example.c: Skip on hppa*-*-linux*.

diff --git a/gcc/testsuite/gcc.dg/guality/example.c 
b/gcc/testsuite/gcc.dg/guality/example.c
index 32014e2b4c0..0d5f48ba116 100644
--- a/gcc/testsuite/gcc.dg/guality/example.c
+++ b/gcc/testsuite/gcc.dg/guality/example.c
@@ -1,3 +1,4 @@
+/* { dg-skip-if "gdb hang" { hppa*-*-linux* } } */
 /* { dg-options "-g" } */
 /* { dg-do run { xfail { ! aarch64*-*-* } } } */
 /* { dg-xfail-run-if "" aarch64*-*-* "*" { "-O[01g]" } } */





[COMMITTED] hppa: Adjust shadd-2 and shadd-3 scan counts

2022-01-02 Thread John David Anglin
Fixes the shadd-2.c and shadd-3.c test failures on trunk.

Committed to trunk.

Dave
---

Adjust shadd-2 and shadd-3 scan counts.

2022-01-02  John David Anglin  

gcc/testsuite/ChangeLog:

* gcc.target/hppa/shadd-2.c: Adjust count to 3.
* gcc.target/hppa/shadd-3.c: Likewise.

diff --git a/gcc/testsuite/gcc.target/hppa/shadd-2.c 
b/gcc/testsuite/gcc.target/hppa/shadd-2.c
index b92f782cf0d..74d4fcd146e 100644
--- a/gcc/testsuite/gcc.target/hppa/shadd-2.c
+++ b/gcc/testsuite/gcc.target/hppa/shadd-2.c
@@ -1,6 +1,6 @@
 /* { dg-do compile }  */
 /* { dg-options "-O2" }  */
-/* { dg-final { scan-assembler-times "sh.add" 4 } }  */
+/* { dg-final { scan-assembler-times "sh.add" 3 } }  */
 
 typedef struct rtx_def *rtx;
 typedef const struct rtx_def *const_rtx;
diff --git a/gcc/testsuite/gcc.target/hppa/shadd-3.c 
b/gcc/testsuite/gcc.target/hppa/shadd-3.c
index 2d0b648f384..a0c1f663d56 100644
--- a/gcc/testsuite/gcc.target/hppa/shadd-3.c
+++ b/gcc/testsuite/gcc.target/hppa/shadd-3.c
@@ -10,7 +10,7 @@
over time we'll have to revisit the combine and/or postreload
dumps.  Note we have disabled delay slot filling to improve
test stability.  */
-/* { dg-final { scan-assembler-times "sh.add" 4 } }  */
+/* { dg-final { scan-assembler-times "sh.add" 3 } }  */
 
 extern void oof (void);
 typedef struct simple_bitmap_def *sbitmap;





Re: [PATCH] Fortran: fix PR103390, ICE in gimplification

2022-01-02 Thread Harald Anlauf via Gcc-patches

Hi Sandra,

On 02.01.22 at 19:32, Sandra Loosemore wrote:

This patch is for PR103390.  For background on this issue, the Fortran
standard requires that, when passing a non-contiguous array from Fortran
to a BIND(C) function with the CONTIGUOUS attribute on the corresponding
dummy argument, the compiler has to arrange for it to be copied to/from
a contiguous temporary.  The ICE was happening because the front end was
attempting to copy out to an array-valued expression that isn't an
lvalue, and producing invalid code.

I poked around at several related examples (included as test cases in
the patch) and realized that it should not be doing any copying at all
here, since the expression result already was being put in a contiguous
temporary.  And, besides the invalid code on copy-out, in some cases it
was generating multiple copies of the code to compute the expression on
copy-in.  :-S

Both parts of the patch seem to be necessary to fix all the test cases.
Tobias pointed me in this direction when I discussed it with him a few
weeks ago so I hope I got it right.

OK to check in?  It regression-tests fine on x86_64.


LGTM.

There are a few really minor things to improve:

+   /* TRANPOSE is the only intrinsic that may return a

s/TRANPOSE/TRANSPOSE/

+! We only expect one loop before the call, to fill in the contigous

s/contigous/contiguous/

+! { dg-final { scan-tree-dump-times "contiguous\\.\[0-9\]+" 0 "original" } }

There is a slightly shorter form for checking the absence of a pattern:

! { dg-final { scan-tree-dump-not "contiguous\\.\[0-9\]+" "original" } }


-Sandra


Thanks for the patch!


[PATCH] i386: Introduce V2QImode vectorized arithmetic [PR103861]

2022-01-02 Thread Uros Bizjak via Gcc-patches
On Thu, Dec 30, 2021 at 3:45 PM Uros Bizjak  wrote:
>
> This patch adds basic V2QImode infrastructure and V2QImode arithmetic
> operations (plus, minus and neg).  The patched compiler can emit SSE
> vectorized QImode operations (e.g. PADDB) with partial QImode vector,
> and also synthesized double HI/LO QImode operations with integer registers.
>
> The testcase:
>
> typedef char __v2qi __attribute__ ((__vector_size__ (2)));
> __v2qi plus  (__v2qi a, __v2qi b) { return a + b; };
>
> compiles with -O2 to:
>
> movl    %edi, %edx
> movl    %esi, %eax
> addb    %sil, %dl
> addb    %ah, %dh
> movl    %edx, %eax
> ret
>
> which is much better than what the unpatched compiler produces:
>
> movl    %edi, %eax
> movl    %esi, %edx
> xorl    %ecx, %ecx
> movb    %dil, %cl
> movsbl  %dh, %edx
> movsbl  %ah, %eax
> addl    %edx, %eax
> addb    %sil, %cl
> movb    %al, %ch
> movl    %ecx, %eax
> ret
>
> The V2QImode vectorization does not require vector registers, so it can
> be enabled by default also for 32-bit targets without SSE.
>
> The patch also enables vectorized V2QImode sign/zero extends.
>
> The reason for RFC are several warning failures in
> Wstringop-overflow-*.[Cc] as a result of an unwanted vectorization. I
> tried to sprinkle vect_slp_v2qi_store_align xfails around, but
> unfortunately without success, since I have no idea about the details
> of these tests.
>
> I didn't want to introduce testsuite FAILs, so help with these failing
> tests is greatly appreciated.

This is now fixed in a separate patch.

> Anyway, the above example shows the potential of V2QImode
> vectorization. There are additional similar optimizations possible
> (e.g. shifts with GPRs) in addition to SSE instructions on partial
> V2QI vectors.
>
> Patch is bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
>
> 2021-12-30  Uroš Bizjak  
>
> gcc/ChangeLog:
>
> PR target/103861
> * config/i386/i386.h (VALID_SSE2_REG_MODE): Add V2QImode.
> (VALID_INT_MODE_P): Ditto.
> * config/i386/i386.c (ix86_secondary_reload): Handle
> V2QImode reloads from SSE register to memory.
> (vector_mode_supported_p): Always return true for V2QImode.
> * config/i386/i386.md (*subqi_ext_2): New insn pattern.
> (*negqi_ext_2): Ditto.
> * config/i386/mmx.md (movv2qi): New expander.
> (movmisalignv2qi): Ditto.
> (*movv2qi_internal): New insn pattern.
> (*pushv2qi2): Ditto.
> (negv2qi2 and splitters): Ditto.
> (v2qi3 and splitters): Ditto.
>
> gcc/testsuite/ChangeLog:
>
> PR target/103861
> * gcc.dg/store_merging_18.c (dg-options): Add -fno-tree-vectorize.
> * gcc.dg/store_merging_29.c (dg-options): Ditto.
> * gcc.target/i386/pr103861.c: New test.
> * gcc.target/i386/pr92658-avx512vl.c (dg-final):
> Remove vpmovqb scan-assembler xfail.
> * gcc.target/i386/pr92658-sse4.c (dg-final):
> Remove pmovzxbq scan-assembler xfail.
> * gcc.target/i386/pr92658-sse4-2.c (dg-final):
> Remove pmovsxbq scan-assembler xfail.
> * gcc.target/i386/warn-vect-op-2.c (dg-warning): Adjust warnings.

Now pushed to master.

Uros.


[PATCH] Revamp documentation for _Complex types extension

2022-01-02 Thread apinski--- via Gcc-patches
From: Andrew Pinski 

While cleaning up the bug database, I noticed there was a request
to improve the documentation of the _Complex type extensions.
So I rewrote part of the documentation to make things clearer on
__real/__imag and even added documentation about casts between
the scalar and the complex type.
I moved the documentation of __builtin_complex under this section
too because it makes more sense than having it in the other
built-in section and reference it.

OK? Built make info and make html and checked out the results to
make sure the tables look decent.
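
To make the described behavior concrete, here is a small sketch of the extensions the rewritten text covers. It assumes GCC (or a compatible compiler) in C mode, since the text notes that G++ rejects the complex-to-scalar cast; the helper names are made up for this example and do not appear in the manual.

```c
/* Build a complex value by assigning to its parts.  __real__ and
   __imag__ yield lvalues when their operand is an lvalue.  */
static double _Complex
make_complex (double re, double im)
{
  double _Complex z = 0.0;   /* scalar to complex: imaginary part is zero */
  __real__ z = re;
  __imag__ z = im;
  return z;
}

static double
real_of (double _Complex z)
{
  return __real__ z;         /* same value as creal (z) */
}

static double
imag_of (double _Complex z)
{
  return __imag__ z;         /* same value as cimag (z) */
}

/* Complex to scalar (C only): the imaginary part is discarded.  */
static double
to_scalar (double _Complex z)
{
  return (double) z;
}

/* GNU extension: ~ performs complex conjugation.  */
static double _Complex
conjugate (double _Complex z)
{
  return ~z;
}
```

For floating types, portable code would normally use creal, cimag and conj from complex.h instead of the GNU operators.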

gcc/ChangeLog:

PR c/33193
* doc/extend.texi: Extend the documentation about Complex
types for casting and also rewrite the __real__/__imag__
expression portion to use tables.
Move __builtin_complex to the Complex type section.
---
 gcc/doc/extend.texi | 73 +
 1 file changed, 54 insertions(+), 19 deletions(-)

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 9676a17406e..c7a43a79e16 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -986,22 +986,57 @@ The ISO C++14 library also defines the @samp{i} suffix, 
so C++14 code
 that includes the @samp{<complex>} header cannot use @samp{i} for the
 GNU extension.  The @samp{j} suffix still has the GNU meaning.
 
+GCC can handle both implicit and explicit casts between the @code{_Complex}
+types and other @code{_Complex} types as casting both the real and imaginary
+parts to the scalar type.
+GCC can handle implicit and explicit casts from a scalar type to a @code{_Complex}
+type, where the imaginary part will be considered zero.
+The C front-end can handle implicit and explicit casts from a @code{_Complex} type
+to a scalar type where the imaginary part will be ignored. In C++ code, this cast
+is considered ill-formed and G++ will error out.
+
+GCC provides a built-in function @code{__builtin_complex} which can be used to
+construct a complex value.
+
 @cindex @code{__real__} keyword
 @cindex @code{__imag__} keyword
-To extract the real part of a complex-valued expression @var{exp}, write
-@code{__real__ @var{exp}}.  Likewise, use @code{__imag__} to
-extract the imaginary part.  This is a GNU extension; for values of
-floating type, you should use the ISO C99 functions @code{crealf},
-@code{creal}, @code{creall}, @code{cimagf}, @code{cimag} and
-@code{cimagl}, declared in @code{<complex.h>} and also provided as
+
+GCC has a few extensions which can be used to extract the real
+and the imaginary part of the complex-valued expression. Note
+these expressions are lvalues if the @var{exp} is an lvalue.
+The operands of these expressions have a complex type,
+which may have been promoted from a scalar type.
+E.g. @code{__real__ (int)@var{x}} is the same as casting to
+@code{_Complex int} before @code{__real__} is done.
+
+@multitable @columnfractions .4 .6
+@headitem Expression @tab Description
+@item @code{__real__ @var{exp}}
+@tab Extract the real part of @var{exp}.
+@item @code{__imag__ @var{exp}}
+@tab Extract the imaginary part of @var{exp}.
+@end multitable
+
+For values of floating point, you should use the ISO C99
+functions, declared in @code{<complex.h>} and also provided as
 built-in functions by GCC@.
 
+@multitable @columnfractions .4 .2 .2 .2
+@headitem Expression @tab float @tab double @tab long double
+@item @code{__real__ @var{exp}}
+@tab @code{crealf} @tab @code{creal} @tab @code{creall}
+@item @code{__imag__ @var{exp}}
+@tab @code{cimagf} @tab @code{cimag} @tab @code{cimagl}
+@end multitable
+
 @cindex complex conjugation
 The operator @samp{~} performs complex conjugation when used on a value
 with a complex type.  This is a GNU extension; for values of
 floating type, you should use the ISO C99 functions @code{conjf},
 @code{conj} and @code{conjl}, declared in @code{<complex.h>} and also
-provided as built-in functions by GCC@.
+provided as built-in functions by GCC@. Note unlike the @code{__real__}
+and @code{__imag__} operators, this operator will not do an implicit cast
+to the complex type because the @samp{~} is already a normal operator.
 
 GCC can allocate complex automatic variables in a noncontiguous
 fashion; it's even possible for the real part to be in a register while
@@ -1013,6 +1048,18 @@ If the variable's actual name is @code{foo}, the two 
fictitious
 variables are named @code{foo$real} and @code{foo$imag}.  You can
 examine and set these two fictitious variables with your debugger.
 
+@deftypefn {Built-in Function} @var{type} __builtin_complex (@var{real}, 
@var{imag})
+
+The built-in function @code{__builtin_complex} is provided for use in
+implementing the ISO C11 macros @code{CMPLXF}, @code{CMPLX} and
+@code{CMPLXL}.  @var{real} and @var{imag} must have the same type, a
+real binary floating-point type, and the result has the corresponding
+complex type with real and imaginary parts @var{real} and @var{imag}.
+Unlike @samp{@var{real} + I * @var{imag}}, this works even when
+infi