[PATCH] [PR106831] Avoid propagating long doubles that may have multiple representations.

2022-09-17 Thread Aldy Hernandez via Gcc-patches
Long doubles are tricky when it comes to considering singletons
because small numbers and +-INF can have multiple representations for
the same number.  So we need to be very careful not to treat those as
singletons, lest they be incorrectly propagated by VRP.  This is
similar to the -0.0 and +0.0 duality.

In long doubles +INF can be represented with +INF in the MSB and
either -0.0 or +0.0 in the LSB.  Similarly for numbers that are exactly
representable in DF.  For example, 1.0 can be represented as either
(1.0, +0.0) or (1.0, -0.0).

This patch avoids treating these numbers as singletons.
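
To make the duality concrete, here is a minimal sketch that models a double-double as a pair of doubles whose value is their sum. The `DoubleDouble` struct and both helpers are hypothetical illustrations, not GCC's REAL_VALUE_TYPE; they only show that two pairs can carry the same value while differing bit for bit.

```cpp
#include <cstring>

// Hypothetical model of an IBM double-double: value = hi + lo.
struct DoubleDouble {
    double hi;  // most significant double
    double lo;  // least significant double
};

// Numeric value of the pair (exact for the simple cases shown here).
inline double value_of(const DoubleDouble &d) { return d.hi + d.lo; }

// Bitwise comparison of the two representations.
inline bool same_bits(const DoubleDouble &a, const DoubleDouble &b) {
    return std::memcmp(&a, &b, sizeof(DoubleDouble)) == 0;
}
```

With (1.0, +0.0) and (1.0, -0.0), `value_of` agrees but `same_bits` does not, which is why a range holding one such value cannot safely be replaced by a single constant.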

Note that NANs in long double format have a don't-care LSB, but
this is irrelevant for singleton_p, because NANs are never considered
singletons.  Also, internally in the frange we store NANs as a pair of
boolean flags indicating whether they are +NAN or -NAN, so we don't need
any special treatment here for comparing range equality etc.  We never
see anything but the boolean flags.  (Errr, the boolean flags are not
yet in trunk, but should go in shortly).

I will push this patch after tests complete.

Thank you Jakub for providing this patch.

PR middle-end/106831

gcc/ChangeLog:

* value-range.cc (frange::singleton_p): Avoid propagating long
doubles that may have multiple representations.
---
 gcc/value-range.cc | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/gcc/value-range.cc b/gcc/value-range.cc
index 55a216efd8b..67d5d7fa90f 100644
--- a/gcc/value-range.cc
+++ b/gcc/value-range.cc
@@ -639,6 +639,21 @@ frange::singleton_p (tree *result) const
   if (HONOR_NANS (m_type) && maybe_isnan ())
    return false;
 
+  if (MODE_COMPOSITE_P (TYPE_MODE (m_type)))
+    {
+      // For IBM long doubles, if the value is +-Inf or is exactly
+      // representable in double, the other double could be +0.0
+      // or -0.0.  Since this means there is more than one way to
+      // represent a value, return false to avoid propagating it.
+      // See libgcc/config/rs6000/ibm-ldouble-format for details.
+      if (real_isinf (&m_min))
+        return false;
+      REAL_VALUE_TYPE r;
+      real_convert (&r, DFmode, &m_min);
+      if (real_identical (&r, &m_min))
+        return false;
+    }
+
  if (result)
    *result = build_real (m_type, m_min);
   return true;
-- 
2.37.1



Re: [PATCH] riscv: implement TARGET_MODE_REP_EXTENDED

2022-09-17 Thread Palmer Dabbelt

On Fri, 16 Sep 2022 16:48:24 PDT (-0700), gcc-patches@gcc.gnu.org wrote:


On 9/6/22 05:39, Alexander Monakov via Gcc-patches wrote:

On Mon, 5 Sep 2022, Philipp Tomsich wrote:


+riscv_mode_rep_extended (scalar_int_mode mode, scalar_int_mode mode_rep)
+{
+  /* On 64-bit targets, SImode register values are sign-extended to DImode.  */
+  if (TARGET_64BIT && mode == SImode && mode_rep == DImode)
+    return SIGN_EXTEND;

I think this leads to a counter-intuitive requirement that a hand-written
inline asm must sign-extend its output operands that are bound to either
signed or unsigned 32-bit lvalues. Will compiler users be aware of that?
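
A small sketch of the convention being discussed, assuming two's-complement conversion (guaranteed since C++20, and what GCC does in earlier modes). The function names are hypothetical; the point is only that the expected 64-bit register image of a 32-bit value has bits 63..32 equal to bit 31, whether the C type is signed or unsigned.

```cpp
#include <cstdint>

// Register image of a 32-bit value under the sign-extended convention:
// bits 63..32 are copies of bit 31, for signed and unsigned values alike.
inline std::uint64_t canonical_reg_image(std::uint32_t v) {
    return static_cast<std::uint64_t>(
        static_cast<std::int64_t>(static_cast<std::int32_t>(v)));
}

// True if a 64-bit register image already satisfies that convention.
inline bool is_canonical(std::uint64_t reg) {
    return reg == canonical_reg_image(static_cast<std::uint32_t>(reg));
}
```

An inline asm that leaves 0x0000000080000000 in a register bound to a 32-bit unsigned lvalue would fail `is_canonical`, which is the situation the question above is about.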


Is this significantly different than on MIPS?  Hand-written code there
also has to ensure that the results are properly sign extended and it's
been that way for 20+ years since the introduction of mips64 IIRC. 
Though I don't think we had MODE_REP_EXTENDED that long.


IMO the problem isn't so much that asm has this constraint, it's that 
it's a new constraint and thus risks breaking code that used to work.  
That said...



Haha, MIPS is the only target that currently defines
TARGET_MODE_REP_EXTENDED :-)





Moreover, without adjusting TARGET_TRULY_NOOP_TRUNCATION this should cause
miscompilation when a 64-bit variable is truncated to 32 bits: the pre-existing
hook says that nothing needs to be done to truncate, but the new hook says
that the result of the truncation is properly sign-extended.

The documentation for TARGET_MODE_REP_EXTENDED warns about that:

 In order to enforce the representation of mode, TARGET_TRULY_NOOP_TRUNCATION
 should return false when truncating to mode.
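
The conflict between the two hooks can be sketched outside the compiler. The helpers below are hypothetical and only model register contents: a "no-op" truncation leaves the upper bits alone, so the result may violate the sign-extended representation that the other hook promises.

```cpp
#include <cstdint>

// "No-op" truncation: the register is left untouched and later code is
// expected to look only at the low 32 bits.
inline std::uint64_t noop_truncate(std::uint64_t reg) { return reg; }

// Truncation that also enforces the sign-extended representation.
inline std::uint64_t extending_truncate(std::uint64_t reg) {
    auto low = static_cast<std::int32_t>(static_cast<std::uint32_t>(reg));
    return static_cast<std::uint64_t>(static_cast<std::int64_t>(low));
}

// Representation invariant: bits 63..31 are all zero or all one.
inline bool upper_bits_match_sign(std::uint64_t reg) {
    std::uint64_t top = reg >> 31;  // the 33 bits from 63 down to 31
    return top == 0 || top == 0x1FFFFFFFFull;
}
```

For the 64-bit value 0x0000000180000000, both truncations agree on the low 32 bits, but only `extending_truncate` produces a register image that passes `upper_bits_match_sign`.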


This may well need adjusting in Philipp's patch.   I'd be surprised if
the MIPS definition wasn't usable nearly verbatim here.


Yes, and we have a few MIPS-isms in the ISA but don't have the same
flavor of TRULY_NOOP_TRUNCATION.  It's been pointed out a handful of
times and I'm not sure what the right way to go is here: every time I
try to reason about which is going to produce better code I come up
with a different answer.  IIRC last time I looked at this I came to the
conclusion that we're doing the right thing for RISC-V because most of
our instructions implicitly truncate.  It's pretty easy to generate bad
code here and I'm pretty sure we could fix some of that by moving to a
more MIPS-like TRULY_NOOP_TRUNCATION, but I think we'd end up just
pushing the problems around.


Every time I look at this I also get worried that we've leaked some of 
these internal promotion rules into something visible to inline asm, but 
when I poke around it seems like things generally work.







jeff


[PATCH] frange: flush denormals to zero for -funsafe-math-optimizations.

2022-09-17 Thread Aldy Hernandez via Gcc-patches
Jakub has mentioned that for -funsafe-math-optimizations we may flush
denormals to zero, in which case we need to be careful to extend the
ranges to the appropriate zero.  This patch does exactly that.  For a
range of [x, -DENORMAL] we flush to [x, -0.0] and for [+DENORMAL, x]
we flush to [+0.0, x].
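
The endpoint rule can be sketched on plain doubles. `Range` and `flush_denormal_endpoints` are hypothetical stand-ins for frange and its new method, written only to illustrate the two cases described above.

```cpp
#include <cmath>
#include <limits>

// Hypothetical stand-in for an frange with double endpoints.
struct Range {
    double lo;
    double hi;
};

inline Range flush_denormal_endpoints(Range r, bool honor_signed_zeros = true) {
    // [x, -DENORMAL] becomes [x, -0.0] (or [x, 0.0] without signed zeros).
    if (std::fpclassify(r.hi) == FP_SUBNORMAL && std::signbit(r.hi))
        r.hi = honor_signed_zeros ? -0.0 : 0.0;
    // [+DENORMAL, x] becomes [+0.0, x].
    if (std::fpclassify(r.lo) == FP_SUBNORMAL && !std::signbit(r.lo))
        r.lo = 0.0;
    return r;
}
```

Both adjustments widen the range, so any value the original range admitted is still admitted after flushing.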

It is unclear whether we should do this for Alpha, since I believe
flushing to zero is the default, and the port requires -mieee for IEEE
sanity.  If so, perhaps we should add a target hook so backends are
free to request flushing to zero.

Thoughts?

gcc/ChangeLog:

* value-range.cc (frange::flush_denormals_to_zero): New.
(frange::set): Call flush_denormals_to_zero.
* value-range.h (class frange): Add flush_denormals_to_zero.
---
 gcc/value-range.cc | 24 
 gcc/value-range.h  |  1 +
 2 files changed, 25 insertions(+)

diff --git a/gcc/value-range.cc b/gcc/value-range.cc
index 67d5d7fa90f..f285734f0e0 100644
--- a/gcc/value-range.cc
+++ b/gcc/value-range.cc
@@ -267,6 +267,26 @@ tree_compare (tree_code code, tree op1, tree op2)
   return !integer_zerop (fold_build2 (code, integer_type_node, op1, op2));
 }
 
+// Flush denormal endpoints to the appropriate 0.0.
+
+void
+frange::flush_denormals_to_zero ()
+{
+  if (undefined_p () || known_isnan ())
+    return;
+
+  // Flush [x, -DENORMAL] to [x, -0.0].
+  if (real_isdenormal (&m_max) && real_isneg (&m_max))
+    {
+      m_max = dconst0;
+      if (HONOR_SIGNED_ZEROS (m_type))
+        m_max.sign = 1;
+    }
+  // Flush [+DENORMAL, x] to [+0.0, x].
+  if (real_isdenormal (&m_min) && !real_isneg (&m_min))
+    m_min = dconst0;
+}
+
 // Setter for franges.
 
 void
@@ -317,6 +337,10 @@ frange::set (tree min, tree max, value_range_kind kind)
   gcc_checking_assert (tree_compare (LE_EXPR, min, max));
 
   normalize_kind ();
+
+  if (flag_unsafe_math_optimizations)
+    flush_denormals_to_zero ();
+
  if (flag_checking)
    verify_range ();
 }
diff --git a/gcc/value-range.h b/gcc/value-range.h
index 3a401f3e4e2..795b1f00fdc 100644
--- a/gcc/value-range.h
+++ b/gcc/value-range.h
@@ -327,6 +327,7 @@ private:
   bool union_nans (const frange &);
   bool intersect_nans (const frange &);
   bool combine_zeros (const frange &, bool union_p);
+  void flush_denormals_to_zero ();
 
   tree m_type;
   REAL_VALUE_TYPE m_min;
-- 
2.37.1



Re: [PATCH] c++: Implement P1467R9 - Extended floating-point types and standard names compiler part except for bfloat16 [PR106652]

2022-09-17 Thread Jason Merrill via Gcc-patches

On 9/16/22 13:34, Jakub Jelinek wrote:

On Fri, Sep 16, 2022 at 01:48:54PM +0200, Jason Merrill wrote:

On 9/12/22 04:05, Jakub Jelinek wrote:

The following patch implements the compiler part of C++23
P1467R9 - Extended floating-point types and standard names compiler part
by introducing _Float{16,32,64,128} as keywords and builtin types
like they are implemented for C already since GCC 7.
It doesn't introduce _Float{32,64,128}x for C++, those remain C only
for now, mainly because 
https://itanium-cxx-abi.github.io/cxx-abi/abi.html#mangling
has mangling for:
::= DF <number> _ # ISO/IEC TS 18661 binary floating point type _FloatN (N bits)
but doesn't for _FloatNx.  And it doesn't add anything for bfloat16_t
support, see below.
Regarding mangling, I think mangling _FloatNx as DF <number> x _ would be
possible, but would need to be discussed and voted in.


As you've seen, I opened a pull request for these.  I think we can go ahead
and implement that and make sure it's resolved before the GCC 13 release.

Or we could temporarily mangle them as an extension, i.e. u9_Float32x.

I would expect _Float64x, at least, to be fairly popular.


If we get the mangling for _Float<N>x agreed on, whether it is DFx_ as
you proposed, or DFx or DFx_, sure, I agree we should just enable
those too and will tweak the patch.  It would then fix also PR85518.

Though, when we add support for _Float<N>x and declare they are extended
floating-point types, the question is about subrank comparisons, shall those
have lower conversion subrank than _Float<N> type with the same conversion
rank or the other way around?  Say on x86_64 where _Float32x and _Float64
have the same rank, shall (_Float32x) x + 0.0f64 have _Float32x type or
_Float64?
And, shall we support f32x etc. constant literal suffixes (with pedwarn
always even in C++23)?


The patch wants to keep backwards compatibility with how __float128 has
been handled in C++ before, both for mangling and behavior in binary
operations, overload resolution etc.  So, there are some backend changes
where for C __float128 and _Float128 are the same type (float128_type_node
and float128t_type_node are the same pointer), but for C++ they are distinct
types which mangle differently and _Float128 is treated as extended
floating-point type while __float128 is treated as non-standard floating
point type.


How important do you think this backwards compatibility is?

As I mentioned in the ABI proposal, I think it makes sense to make
__float128 an alias for std::float128_t, and continue using the current
mangling for __float128.


I thought it is fairly important because __float128 has been around in GCC
for 19 years already.  To be precise, I think e.g. for x86_64 GCC 3.4
introduced it, but mangling was implemented only in GCC 4.1 (2006); before that
we ICEd
on those.  Until glibc 2.26 (2017) one had to use libquadmath when
math library functions were needed, but since then one can just use libm.
__float128 is on some targets (e.g. PA) just another name for long double,
not a distinct type.


I think we certainly want to continue to support __float128, what I'm 
wondering is how much changing it to mean _Float128 will affect existing 
code.  I would guess that a lot of code that just works on __float128 
will continue to work without modification.  Does anyone know of 
significant existing uses of __float128?



Another thing is the PowerPC __ieee128 and __ibm128 types; I think for the
former we can't make it the same type as _Float128, because e.g. libstdc++
code relies on __ieee128 and __ibm128 being long double type of the other
ABI, so they should mangle as long double of the other ABI.  But in that
case they can't act as distinct types when long double should mangle the
same as they do.  And it would be weird if those types in one
-mabi=*longdouble mode worked as standard floating-point type and in another
as extended floating-point type, rather than just types which are neither
standard nor extended as before.


Absolutely we don't want to mess with __ieee128 and __ibm128.  And I 
guess that means that we need to preserve the non-standard type handling 
for the alternate long double.


I think we can still change __float128 to be _Float128 on PPC and other 
targets where it's currently an alias for long double.


It seems to me that it's a question of what provides the better 
transition path for users.  I imagine we'll want to encourage people to 
replace __float128 with std::float128_t everywhere.


In the existing model, it's not portable whether

void f(long double) { }
void f(__float128) { }

is an overload or an erroneous redefinition.  In the new model, you can 
portably write


void f(long double) { }
void f(std::float128_t) { }

and existing __float128 code will call the second one.  Old code that 
had conditional __float128 overloads when it's different from long 
double will need to change to have unconditional _Float128 overloads.


If we don't change __float128 to mean _Float128, we require fewer 
immediate 

[PATCH] LoongArch: Drop the stack first when performing stack checking.

2022-09-17 Thread Lulu Cheng
The old stack detection was performed before the stack was dropped,
which would cause the detection tool to report a memory leak.

The current stack detection scheme is as follows:

'-fstack-clash-protection':
1. When the frame->total_size is smaller than the guard page size,
   the stack is dropped according to the original scheme, and there
   is no need to perform stack detection in the prologue.
2. When frame->total_size is greater than or equal to guard page size,
   the first step drops only the space required to save the s
   registers. Storing the s registers into this space acts as an
   implicit stack check. The remaining space is then checked.
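
The two cases can be summarised in a small sketch. `FirstStep` and `plan_first_step` are hypothetical names, not the loongarch.cc implementation, and case 1 is modelled as a single drop, which simplifies whatever the original scheme does.

```cpp
#include <cstdint>

// Hypothetical summary of the first stack adjustment in the prologue.
struct FirstStep {
    std::int64_t first_drop;  // bytes dropped by the first adjustment
    std::int64_t remaining;   // bytes still to be dropped afterwards
    bool explicit_probe;      // whether the remainder must be probed
};

inline FirstStep plan_first_step(std::int64_t total_size,
                                 std::int64_t save_area_size,
                                 std::int64_t guard_size) {
    // Case 1: small frame, no stack detection needed in the prologue.
    if (total_size < guard_size)
        return FirstStep{total_size, 0, false};
    // Case 2: drop only the register save area first; the stores into it
    // act as the implicit check, then the remaining space is probed.
    const std::int64_t remaining = total_size - save_area_size;
    return FirstStep{save_area_size, remaining, remaining > 0};
}
```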

'-fstack-check':
The stack is not dropped in one step and then probed page by page, as
the documentation describes.  Instead it behaves like
'-fstack-clash-protection': each page is probed immediately after it is
dropped.

When frame->total_size is not 0, the first stack adjustment drops only
the space required to save the s registers.

gcc/ChangeLog:

* config/loongarch/linux.h (STACK_CHECK_MOVING_SP):
Define this macro to 1.
* config/loongarch/loongarch.cc (loongarch_first_stack_step):
Return the size of the first drop stack according to whether stack
checking is performed.
(loongarch_emit_probe_stack_range): Adjust the method of stack checking
in prologue.
(loongarch_output_probe_stack_range): Delete useless code.
(loongarch_expand_prologue): Adjust the method of stack checking in
prologue.
(loongarch_option_override_internal): Enforce that interval is the same
size as size so the mid-end does the right thing.
* config/loongarch/loongarch.h (STACK_CLASH_MAX_UNROLL_PAGES):
New macro that decides whether to loop stack detection.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp: Add loongarch support for 
stack_clash_protection.
* gcc.target/loongarch/stack-check-alloca-1.c: New test.
* gcc.target/loongarch/stack-check-alloca-2.c: New test.
* gcc.target/loongarch/stack-check-alloca-3.c: New test.
* gcc.target/loongarch/stack-check-alloca-4.c: New test.
* gcc.target/loongarch/stack-check-alloca-5.c: New test.
* gcc.target/loongarch/stack-check-alloca-6.c: New test.
* gcc.target/loongarch/stack-check-alloca.h: New test.
* gcc.target/loongarch/stack-check-cfa-1.c: New test.
* gcc.target/loongarch/stack-check-cfa-2.c: New test.
* gcc.target/loongarch/stack-check-prologue-1.c: New test.
* gcc.target/loongarch/stack-check-prologue-2.c: New test.
* gcc.target/loongarch/stack-check-prologue-3.c: New test.
* gcc.target/loongarch/stack-check-prologue-4.c: New test.
* gcc.target/loongarch/stack-check-prologue-5.c: New test.
* gcc.target/loongarch/stack-check-prologue-6.c: New test.
* gcc.target/loongarch/stack-check-prologue-7.c: New test.
* gcc.target/loongarch/stack-check-prologue.h: New test.
---
 gcc/config/loongarch/linux.h  |   3 +
 gcc/config/loongarch/loongarch.cc | 259 +++---
 gcc/config/loongarch/loongarch.h  |   4 +
 .../loongarch/stack-check-alloca-1.c  |  15 +
 .../loongarch/stack-check-alloca-2.c  |  12 +
 .../loongarch/stack-check-alloca-3.c  |  12 +
 .../loongarch/stack-check-alloca-4.c  |  12 +
 .../loongarch/stack-check-alloca-5.c  |  13 +
 .../loongarch/stack-check-alloca-6.c  |  13 +
 .../gcc.target/loongarch/stack-check-alloca.h |  15 +
 .../gcc.target/loongarch/stack-check-cfa-1.c  |  12 +
 .../gcc.target/loongarch/stack-check-cfa-2.c  |  12 +
 .../loongarch/stack-check-prologue-1.c|  11 +
 .../loongarch/stack-check-prologue-2.c|  11 +
 .../loongarch/stack-check-prologue-3.c|  11 +
 .../loongarch/stack-check-prologue-4.c|  11 +
 .../loongarch/stack-check-prologue-5.c|  12 +
 .../loongarch/stack-check-prologue-6.c|  11 +
 .../loongarch/stack-check-prologue-7.c|  12 +
 .../loongarch/stack-check-prologue.h  |   5 +
 gcc/testsuite/lib/target-supports.exp |   7 +-
 21 files changed, 372 insertions(+), 101 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/stack-check-alloca-1.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/stack-check-alloca-2.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/stack-check-alloca-3.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/stack-check-alloca-4.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/stack-check-alloca-5.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/stack-check-alloca-6.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/stack-check-alloca.h
 create mode 100644 gcc/testsuite/gcc.target/loongarch/stack-check-cfa-1.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/stack-check-cfa-2.c
 create mode 100644 gcc/t

[PATCH] testsuite: Only run test on target if VMA == LMA

2022-09-17 Thread Torbjörn SVENSSON via Gcc-patches
Checking that the triplet matches arm*-*-eabi (or msp430-*-*) is not
enough to know if the execution will enter an endless loop, or if it
will give a meaningful result. As the execution tests only work when
VMA and LMA are equal, make sure that this condition is met.
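
As an illustration of the condition being tested, the sketch below compares the VMA and LMA columns of one section line from `objdump -h`, whose columns are Idx, Name, Size, VMA, LMA, File off and Algn. The helper name is hypothetical, and this is not how the Tcl proc in the patch is written; it only shows the comparison itself.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Parses one section line of `objdump -h` output and reports whether the
// VMA and LMA columns are equal.  Returns false if the line cannot be parsed.
inline bool vma_equals_lma(const std::string &section_line) {
    std::istringstream in(section_line);
    std::string idx, name, size, vma, lma;
    if (!(in >> idx >> name >> size >> vma >> lma))
        return false;
    try {
        return std::stoull(vma, nullptr, 16) == std::stoull(lma, nullptr, 16);
    } catch (const std::exception &) {
        return false;  // non-hexadecimal field
    }
}
```

On a typical flash-based bare-metal layout, .data has a RAM address as VMA and a flash address as LMA, so the check fails and the run step is skipped.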

2022-09-16  Torbjörn SVENSSON  

gcc/testsuite/ChangeLog:

* lib/target-supports.exp (check_effective_target_vma_equals_lma): New.
* c-c++-common/torture/attr-noinit-1.c: Require VMA == LMA to run.
* c-c++-common/torture/attr-noinit-2.c: Likewise.
* c-c++-common/torture/attr-noinit-3.c: Likewise.
* c-c++-common/torture/attr-persistent-1.c: Likewise.
* c-c++-common/torture/attr-persistent-3.c: Likewise.

Co-Authored-By: Yvan ROUX  
Signed-off-by: Torbjörn SVENSSON  
---
 .../c-c++-common/torture/attr-noinit-1.c  |  3 +-
 .../c-c++-common/torture/attr-noinit-2.c  |  3 +-
 .../c-c++-common/torture/attr-noinit-3.c  |  3 +-
 .../c-c++-common/torture/attr-persistent-1.c  |  3 +-
 .../c-c++-common/torture/attr-persistent-3.c  |  3 +-
 gcc/testsuite/lib/target-supports.exp | 49 +++
 6 files changed, 59 insertions(+), 5 deletions(-)

diff --git a/gcc/testsuite/c-c++-common/torture/attr-noinit-1.c b/gcc/testsuite/c-c++-common/torture/attr-noinit-1.c
index 877e7647ac9..3c89011a7b6 100644
--- a/gcc/testsuite/c-c++-common/torture/attr-noinit-1.c
+++ b/gcc/testsuite/c-c++-common/torture/attr-noinit-1.c
@@ -1,4 +1,5 @@
-/* { dg-do run } */
+/* { dg-do compile } */
+/* { dg-do run { target { vma_equals_lma } } } */
 /* { dg-require-effective-target noinit } */
 /* { dg-skip-if "data LMA != VMA" { msp430-*-* } { "-mlarge" } } */
 /* { dg-options "-save-temps" } */
diff --git a/gcc/testsuite/c-c++-common/torture/attr-noinit-2.c b/gcc/testsuite/c-c++-common/torture/attr-noinit-2.c
index befa2a0bd52..24ff74c06d4 100644
--- a/gcc/testsuite/c-c++-common/torture/attr-noinit-2.c
+++ b/gcc/testsuite/c-c++-common/torture/attr-noinit-2.c
@@ -1,4 +1,5 @@
-/* { dg-do run } */
+/* { dg-do compile } */
+/* { dg-do run { target { vma_equals_lma } } } */
 /* { dg-require-effective-target noinit } */
 /* { dg-options "-fdata-sections -save-temps" } */
 /* { dg-skip-if "data LMA != VMA" { msp430-*-* } { "-mlarge" } } */
diff --git a/gcc/testsuite/c-c++-common/torture/attr-noinit-3.c b/gcc/testsuite/c-c++-common/torture/attr-noinit-3.c
index 519e88a59a6..a20809f2783 100644
--- a/gcc/testsuite/c-c++-common/torture/attr-noinit-3.c
+++ b/gcc/testsuite/c-c++-common/torture/attr-noinit-3.c
@@ -1,4 +1,5 @@
-/* { dg-do run } */
+/* { dg-do compile } */
+/* { dg-do run { target { vma_equals_lma } } } */
 /* { dg-require-effective-target noinit } */
 /* { dg-options "-flto -save-temps" } */
 /* { dg-skip-if "data LMA != VMA" { msp430-*-* } { "-mlarge" } } */
diff --git a/gcc/testsuite/c-c++-common/torture/attr-persistent-1.c b/gcc/testsuite/c-c++-common/torture/attr-persistent-1.c
index 72dc3c27192..bdd221788e8 100644
--- a/gcc/testsuite/c-c++-common/torture/attr-persistent-1.c
+++ b/gcc/testsuite/c-c++-common/torture/attr-persistent-1.c
@@ -1,4 +1,5 @@
-/* { dg-do run } */
+/* { dg-do compile } */
+/* { dg-do run { target { vma_equals_lma } } } */
 /* { dg-require-effective-target persistent } */
 /* { dg-skip-if "data LMA != VMA" { msp430-*-* } { "-mlarge" } } */
 /* { dg-options "-save-temps" } */
diff --git a/gcc/testsuite/c-c++-common/torture/attr-persistent-3.c b/gcc/testsuite/c-c++-common/torture/attr-persistent-3.c
index 3e4fd28618d..be03e386e14 100644
--- a/gcc/testsuite/c-c++-common/torture/attr-persistent-3.c
+++ b/gcc/testsuite/c-c++-common/torture/attr-persistent-3.c
@@ -1,4 +1,5 @@
-/* { dg-do run } */
+/* { dg-do compile } */
+/* { dg-do run { target { vma_equals_lma } } } */
 /* { dg-require-effective-target persistent } */
 /* { dg-options "-flto -save-temps" } */
 /* { dg-skip-if "data LMA != VMA" { msp430-*-* } { "-mlarge" } } */
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index 703aba412a6..df8141a15d8 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -370,6 +370,55 @@ proc check_weak_override_available { } {
 return [check_weak_available]
 }
 
+# Return 1 if VMA is equal to LMA for the .data section, 0
+# otherwise.  Cache the result.
+
+proc check_effective_target_vma_equals_lma { } {
+global tool
+
+return [check_cached_effective_target vma_equals_lma {
+   set src vma_equals_lma[pid].c
+   set exe vma_equals_lma[pid].exe
+   verbose "check_effective_target_vma_equals_lma  compiling testfile $src" 2
+   set f [open $src "w"]
+   puts $f "#ifdef __cplusplus\nextern \"C\"\n#endif\n"
+   puts $f "int foo = 42; void main() {}"
+   close $f
+   set lines [${tool}_target_compile $src $exe executable ""]
+   file delete $src
+
+   if [string match "" $lines] then {
+   # No error messages
+
+set objdump_name [find_binut

Re: [PATCH] c++: Implement C++23 P1169R4 - static operator() [PR106651]

2022-09-17 Thread Jason Merrill via Gcc-patches

On 9/17/22 02:42, Jakub Jelinek wrote:

On Sat, Sep 17, 2022 at 01:23:59AM +0200, Jason Merrill wrote:

On 9/13/22 12:42, Jakub Jelinek wrote:

The following patch attempts to implement C++23 P1169R4 - static operator()
paper's compiler side (there is some small library side too not implemented
yet).  This allows static members as user operator() declarations and
static specifier on lambdas without lambda capture.  As decl specifier
parsing doesn't track the presence and locations of all specifiers,
the patch unfortunately replaces the diagnostics about duplicate mutable
with diagnostics about conflicting specifiers because the information
whether it was mutable mutable, mutable static, static mutable or static
static is lost.


I wonder why we don't give an error when setting the
conflicting_specifiers_p flag in cp_parser_set_storage_class?  We should be
able to give a better diagnostic at that point.


I will try that.


Beyond this, the synthesized conversion operator changes
for static lambdas as it can just return the operator() static method
address, doesn't need to create a thunk for it.
The change I'm least sure about is the call.cc (joust) change, one thing
is that we ICEd because we assumed that len could be different only if
both candidates are direct calls but it can be one direct and one indirect
call,


How do you mean?


That is either on the
struct less {
    static constexpr auto operator()(int i, int j) -> bool {
        return i < j;
    }

    using P = bool(*)(int, int);
    operator P() const { return operator(); }
};

static_assert(less{}(1, 2));
testcase from the paper (S in static-operator-call1.C) or any static lambda
which also has static operator() and cast operator.
The paper then talks about
operator()(contrived-parameter, int, int);
call-function(bool(*)(int, int), int, int);
which is exactly what caused the ICE, one of the cand?->fn was the static
operator(), the other cand?->fn isn't a FUNCTION_DECL, but a function
pointer (what the cast operator returns).
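
The direct-versus-surrogate candidate pair can be reproduced in pre-C++23 code by using a non-static operator(). The sketch below is not the paper's testcase: `Less` and `via_pointer` are hypothetical, and the function reached through the pointer deliberately computes something different so the chosen candidate is observable.

```cpp
struct Less {
    // Direct candidate: an ordinary (non-static) call operator.
    constexpr bool operator()(int i, int j) const { return i < j; }

    using P = bool (*)(int, int);

    // Deliberately different result, so the selected candidate is visible.
    static constexpr bool via_pointer(int i, int j) { return i > j; }

    // Conversion to function pointer: gives rise to a surrogate call function.
    constexpr operator P() const { return &via_pointer; }
};
```

Here `Less{}(1, 2)` selects operator(), because the object argument needs no user-defined conversion for it, whereas the surrogate candidate requires the conversion to P. With a static operator() there is no real object parameter to compare, which is the case joust has to handle.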


Ah, makes sense.


+    {
+      /* C++23 [over.best.ics.general] says:
+         When the parameter is the implicit object parameter of a static
+         member function, the implicit conversion sequence is a standard
+         conversion sequence that is neither better nor worse than any
+         other standard conversion sequence.
+         Apply this for C++23 or when the static member function is
+         overloaded call operator (C++23 feature we accept as an
+         extension).  */
+      if ((cxx_dialect >= cxx23
+           || (DECL_OVERLOADED_OPERATOR_P (cand1->fn)
+               && DECL_OVERLOADED_OPERATOR_IS (cand1->fn, CALL_EXPR)))
+          && CONVERSION_RANK (cand2->convs[0]) >= cr_user)
+        winner = -1;


I don't know what you're trying to do here.  The above passage describes how
we've always compared static to non-static, which should work the same way
for operator().


It doesn't work for the static operator() case vs. pointer returned by
cast operator, with just the
-  int static_1 = DECL_STATIC_FUNCTION_P (cand1->fn);
-  int static_2 = DECL_STATIC_FUNCTION_P (cand2->fn);
+  int static_1 = (TREE_CODE (cand1->fn) == FUNCTION_DECL
+                  && DECL_STATIC_FUNCTION_P (cand1->fn));
+  int static_2 = (TREE_CODE (cand2->fn) == FUNCTION_DECL
+                  && DECL_STATIC_FUNCTION_P (cand2->fn));

-  if (DECL_CONSTRUCTOR_P (cand1->fn)
+  if (TREE_CODE (cand1->fn) == FUNCTION_DECL
+      && TREE_CODE (cand2->fn) == FUNCTION_DECL
+      && DECL_CONSTRUCTOR_P (cand1->fn)
changes in joust so that it doesn't ICE, the call is ambiguous.
Previously the standard had:
"If F is a static member function, ICS1(F) is defined such that ICS1(F) is
neither better nor worse than ICS1(G) for any function G, and, symmetrically,
ICS1(G) is neither better nor worse than ICS1(F); otherwise,"
but that was removed and instead:
"When the parameter is the implicit object parameter of a static member
function, the implicit conversion sequence is a standard conversion sequence
that is neither better nor worse than any other standard conversion sequence."
was added elsewhere.  The code implements the former.  We don't have any
conversion for the non-existing object parameter to static, so what I want
to do is keep doing what we were before if the other candidate's conversion
on the first parameter is standard (CONVERSION_RANK (cand2->convs[0]) < cr_user)
and otherwise indicate that the static operator() has a standard conversion
there, so it should be better than any user/ellipsis/bad conversion.
Now, it could be done just for cxx_dialect >= cxx23 as that is pedantically
what the standard says, but if we allow with a pedwarn static operator()
in earlier standards, I think we better treat it there the same, because
otherwise one can't call any static lambdas (all those calls would be
ambiguous).


Ah, OK.  I don't thi

Re: [PATCH] Fortran: add IEEE_QUIET_* and IEEE_SIGNALING_* comparisons

2022-09-17 Thread Mikael Morin

On 02/09/2022 at 13:37, FX via Fortran wrote:

Hi,

These operations were added to Fortran 2018, and correspond to well-defined 
IEEE comparison operations, with defined signaling semantics for NaNs. All are 
implemented in terms of GCC expressions and built-ins, with no library support 
needed.

Bootstrapped and regtested on x86_64-linux, both 32- and 64-bit. Depends on a 
patch currently under review for the middle-end 
(https://gcc.gnu.org/pipermail/gcc-patches/2022-September/600840.html).

OK to commit?
FX



Hello,

the implementation looks good, but the tests lack checks regarding 
exception status.  This is an important part, I think, and basically 
what makes a difference between the quiet and signaling variants.
As the functions are elemental, a few checks with array values would be 
nice too.
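
For comparison, this is roughly what an exception-status check looks like in C++ for the quiet side, where `std::isless` plays the role of a quiet comparison. It is an analogy rather than the Fortran test being asked for, and the helper name is hypothetical. The signaling side (the built-in `<` operator under C Annex F) is not asserted here because compilers differ in how faithfully they implement it.

```cpp
#include <cfenv>
#include <cmath>
#include <limits>

// Returns true if a quiet "less than" on (a, b) raised FE_INVALID.
inline bool quiet_less_sets_invalid(double a, double b) {
    volatile double va = a, vb = b;  // volatile: discourage constant folding
    std::feclearexcept(FE_ALL_EXCEPT);
    double x = va, y = vb;
    volatile bool result = std::isless(x, y);  // quiet comparison
    (void)result;
    return std::fetestexcept(FE_INVALID) != 0;
}
```

A quiet comparison with a quiet NaN operand is expected to leave FE_INVALID clear; the signaling variants differ precisely in raising it.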

OK with these additional checks.

Mikael


Re: [PATCH] Fortran: add IEEE_MODES_TYPE, IEEE_GET_MODES and IEEE_SET_MODES

2022-09-17 Thread Mikael Morin

On 04/09/2022 at 18:30, FX via Fortran wrote:

Hi,

The IEEE_MODES_TYPE type and the two functions that get and set it were added 
in Fortran 2018.  They can be implemented using the already existing 
target-specific functions.  A future optimization could, on some targets, 
set/get all modes through one or two instructions only, but that would need a 
new set of functions in all config/fpu-* files.
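
The get/set pattern resembles saving and restoring floating-point state with `<cfenv>`. The sketch below is only an analogy: `FpModes`, `get_modes` and `set_modes` are hypothetical, and it covers just the rounding mode, whereas the Fortran type bundles further modes.

```cpp
#include <cfenv>

// Hypothetical analogue of IEEE_MODES_TYPE that holds only the rounding mode.
struct FpModes {
    int rounding;
};

// Capture the current mode.
inline FpModes get_modes() { return FpModes{std::fegetround()}; }

// Restore a previously captured mode; returns true on success.
inline bool set_modes(const FpModes &m) {
    return std::fesetround(m.rounding) == 0;
}
```

A caller would capture the modes on entry, change them for a computation, and call `set_modes` with the saved value before returning.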

This was regtested on aarch64-darwin, which does not support underflow modes, 
so I will further test on x86_64-linux when I finish travelling in a couple of 
days.
OK to commit?



Looks good, thanks.


Re: [PATCH] c++: constraint matching, TEMPLATE_ID_EXPR, current inst

2022-09-17 Thread Patrick Palka via Gcc-patches
On Sat, 17 Sep 2022, Jason Merrill wrote:

> On 9/16/22 10:59, Patrick Palka wrote:
> > On Fri, 16 Sep 2022, Jason Merrill wrote:
> > 
> > > On 9/15/22 11:58, Patrick Palka wrote:
> > > > Here we're crashing during constraint matching for the instantiated
> > > > hidden friends due to two issues with dependent substitution into a
> > > > TEMPLATE_ID_EXPR naming a template from the current instantiation
> > > > (as performed from maybe_substitute_reqs_for for C<3> with T=T):
> > > > 
> > > > * tsubst_copy substitutes into such a TEMPLATE_DECL by looking it
> > > >   up from the substituted class scope.  But for this to not fail
> > > > when
> > > >   the args are dependent, we need to pass entering_scope=true for
> > > > the
> > > >   class scope substitution so that we obtain the primary template
> > > > type
> > > >   A (which has TYPE_BINFO) instead of the implicit instantiation
> > > >   A (which doesn't).
> > > > * lookup_and_finish_template_variable shouldn't instantiate a
> > > >   TEMPLATE_ID_EXPR that names a TEMPLATE_DECL which has more than
> > > >   one level of (unsubstituted) parameters (such as A::C).
> > > > 
> > > > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> > > > trunk?
> > > > 
> > > > gcc/cp/ChangeLog:
> > > > 
> > > > * pt.cc (lookup_and_finish_template_variable): Don't
> > > > instantiate if the template's scope is dependent.
> > > > (tsubst_copy) : Pass entering_scope=true
> > > > when substituting the class scope.
> > > > 
> > > > gcc/testsuite/ChangeLog:
> > > > 
> > > > * g++.dg/cpp2a/concepts-friend10.C: New test.
> > > > ---
> > > >gcc/cp/pt.cc  | 14 +++--
> > > >.../g++.dg/cpp2a/concepts-friend10.C  | 21
> > > > +++
> > > >2 files changed, 29 insertions(+), 6 deletions(-)
> > > >create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-friend10.C
> > > > 
> > > > diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
> > > > index db4e808adec..bfcbe0b8670 100644
> > > > --- a/gcc/cp/pt.cc
> > > > +++ b/gcc/cp/pt.cc
> > > > @@ -10475,14 +10475,15 @@ tree
> > > >lookup_and_finish_template_variable (tree templ, tree targs,
> > > >  tsubst_flags_t complain)
> > > >{
> > > > -  templ = lookup_template_variable (templ, targs);
> > > > -  if (!any_dependent_template_arguments_p (targs))
> > > > +  tree var = lookup_template_variable (templ, targs);
> > > > +  if (TMPL_PARMS_DEPTH (DECL_TEMPLATE_PARMS (templ)) == 1
> > > > +  && !any_dependent_template_arguments_p (targs))
> > > 
> > > I notice that finish_id_expression_1 uses the equivalent of
> > > type_dependent_expression_p (var).  Does that work here?
> > 
> > Hmm, it does, but kind of by accident: type_dependent_expression_p
> > returns true for all variable TEMPLATE_ID_EXPRs because of their empty
> > TREE_TYPE (as set by finish_template_variable).  So testing t_d_e_p here
> > is equivalent to testing processing_template_decl, it seems -- maximally
> > conservative.
> > 
> > We can improve type_dependent_expression_p for variable TEMPLATE_ID_EXPR
> > by ignoring its (always empty) TREE_TYPE and just considering dependence
> > of its template and args directly.
> > 
> > Doing so exposes that value_dependent_expression_p is wrong for
> > (non-type-dependent) variable template specializations -- since we don't
> > set/track DECL_DEPENDENT_INIT_P for them,
> 
> Hmm, why not?

AFAICT we set DECL_DEPENDENT_INIT_P only from cp_finish_decl, which we
call when parsing a variable template (or templated static data member
or templated local variable IIUC) and when instantiating its definition,
but not when specializing it.  So if we want to rely on it to determine
value dependence of a variable template specialization, it seems we need
to also set/propagate DECL_DEPENDENT_INIT_P at specialization time.

Note that the flag currently tracks ordinary value dependence of the
initializer, but for variable template specializations, it seems we'd
want the flag to just track dependence on _outer_ template parameters,
because if we're at that point in v_d_e_p, we already know that the
innermost arguments are non-dependent (it was ruled out by t_d_e_p),
so dependence on the innermost arguments in the initializer shouldn't
matter.  But this'd mean the flag would have a different meaning
depending on the kind of VAR_DECL.

To keep things simple, perhaps we should just ignore the flag entirely
for variable template specializations (and keep using it for templated
static data members and templated local variables)?

> 
> > the VAR_DECL branch ends up
> > returning false even if the initializer depends on outer args.  Instead,
> > I suppose we can give a reasonably conservative answer by considering
> > dependence of its enclosing scope as we do for FUNCTION_DECL.
> 
> I wonder why we do that for functions rather than rely on the later
> an

Re: [PATCH 09/10] fortran: Support clobbering of variable subreferences [PR88364]

2022-09-17 Thread Thomas Koenig via Gcc-patches



Hi Mikael,


This adds support for clobbering of partial variable references, when
they are passed as actual argument and the associated dummy has the
INTENT(OUT) attribute.
Support includes array elements, derived type component references,
and complex real or imaginary parts.

This is done by removing the check for lack of subreferences, which is
basically a revert of r9-4911-gbd810d637041dba49a5aca3d085504575374ac6f.
This removal allows more expressions than just array elements,
components and complex parts, but the other expressions are excluded by
other conditions: substrings are excluded by the check on expression
type (CHARACTER is excluded), KIND and LEN references are rejected by
the compiler as not valid in a variable definition context.

The check for scalarness is also updated as it was only valid when there
was no subreference.


First, thanks a lot for digging into this subject. I have looked through
the patch series, and it looks very good so far.

I have a concern about this part, though.  My understanding at the
time was that it is not possible to clobber an individual array
element, but that this clobbers anything off the pointer that this
is based on.

So,

  integer, dimension(3) :: a

  a(1) = 1
  a(3) = 3
  call foo(a(1))

would also invalidate the store to a(3).  Is my understanding correct?
If so, I think we cannot revert that patch (which was introduced
because of a regression).

Best regards

Thomas


Re: [PATCH 09/10] fortran: Support clobbering of variable subreferences [PR88364]

2022-09-17 Thread Mikael Morin

On 17/09/2022 at 19:03, Thomas Koenig via Fortran wrote:


Hi Mikael,


This adds support for clobbering of partial variable references, when
they are passed as actual argument and the associated dummy has the
INTENT(OUT) attribute.
Support includes array elements, derived type component references,
and complex real or imaginary parts.

This is done by removing the check for lack of subreferences, which is
basically a revert of r9-4911-gbd810d637041dba49a5aca3d085504575374ac6f.
This removal allows more expressions than just array elements,
components and complex parts, but the other expressions are excluded by
other conditions: substrings are excluded by the check on expression
type (CHARACTER is excluded), KIND and LEN references are rejected by
the compiler as not valid in a variable definition context.

The check for scalarness is also updated as it was only valid when there
was no subreference.


First, thanks a lot for digging into this subject. I have looked through
the patch series, and it looks very good so far.

I have a concern about this part, though.  My understanding at the
time was that it is not possible to clobber an individual array
element, but that this clobbers anything off the pointer that this
is based on.

Well, we need the middle-end guys to give a definitive answer on this 
topic, but I think it would be a very penalizing limitation if that was 
the case.  I have assumed that the clobber spanned the value it was 
applied on, neither more nor less, so just the array element in case of 
array elements.



So,

   integer, dimension(3) :: a

   a(1) = 1
   a(3) = 3
   call foo(a(1))

would also invalidate the store to a(3).  Is my understanding correct?


I think it was the case before patch 2 in the series, because the 
clobber was applied to the symbol decl, so in the case of the expression 
A(1), it was applied to A which is the full array.  After patch 2, the 
clobber is applied to the expression A(1), so the element alone.



If so, I think we cannot revert that patch (which was introduced
because of a regression).

The testcase from the patch was not specifically checking lack of 
side-effect clobbers, so I have double-checked with the following 
testcase, which should lift your concerns.
I propose to keep the patch with the testcase added to it.  What do you 
think?


Mikael

! { dg-do run }
! { dg-additional-options "-fno-inline -fno-ipa-modref -fdump-tree-optimized -fdump-tree-original" }
!
! PR fortran/41453
! Check that the INTENT(OUT) attribute causes one clobber to be emitted
! for the array element passed as argument in the *.original dump, and the
! associated initialization constant to be optimized away in the *.optimized
! dump, whereas the other initialization constants are not optimized away.

module x
implicit none
contains
  subroutine foo(a)
    integer, intent(out) :: a
    a = 42
  end subroutine foo
end module x

program main
  use x
  implicit none
  integer :: ac(3)

  ac(1) = 123
  ac(2) = 456
  ac(3) = 789
  call foo(ac(2))
  if (any(ac /= [123, 42, 789])) stop 1

end program main

! { dg-final { scan-tree-dump-times "CLOBBER" 1 "original" } }
! { dg-final { scan-tree-dump "ac\\\[1\\\] = {CLOBBER};" "original" } }
! { dg-final { scan-tree-dump "123" "original" } }
! { dg-final { scan-tree-dump "123" "optimized" } }
! { dg-final { scan-tree-dump "456" "original" } }
! { dg-final { scan-tree-dump-not "456" "optimized" { target __OPTIMIZE__ } } }
! { dg-final { scan-tree-dump "789" "original" } }
! { dg-final { scan-tree-dump "789" "optimized" } }
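
For readers who do not read Fortran, the run-time half of the testcase above can be sketched in C. This is only an analog under stated assumptions: the names `foo` and `neighbours_survive` are made up for illustration, a pointer parameter stands in for the INTENT(OUT) dummy, and nothing here models GCC's CLOBBER statements themselves.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the INTENT(OUT) dummy argument: the callee defines
   exactly one integer and never reads its previous value.  */
static void
foo (int *a)
{
  assert (a != NULL);
  *a = 42;
}

/* Mirrors the Fortran main program: store three values, pass the middle
   element to foo, then check that only that element changed.
   Returns 1 if the neighbouring stores survived, 0 otherwise.  */
int
neighbours_survive (void)
{
  int ac[3];

  ac[0] = 123;
  ac[1] = 456;   /* Dead store: foo overwrites it without reading it.  */
  ac[2] = 789;
  foo (&ac[1]);

  return ac[0] == 123 && ac[1] == 42 && ac[2] == 789;
}
```

If the clobber covered the whole array rather than the single element, the stores of 123 and 789 could be treated as dead as well, and the Fortran `stop 1` would be expected to trigger; the dump scans look for the same property at compile time.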


Re: [PATCH 09/10] fortran: Support clobbering of variable subreferences [PR88364]

2022-09-17 Thread Bernhard Reutner-Fischer via Gcc-patches
On 17 September 2022 21:33:22 CEST, Mikael Morin  wrote:
>On 17/09/2022 at 19:03, Thomas Koenig via Fortran wrote:
>> 
>> Hi Mikael,
>> 
>>> This adds support for clobbering of partial variable references, when
>>> they are passed as actual argument and the associated dummy has the
>>> INTENT(OUT) attribute.
>>> Support includes array elements, derived type component references,
>>> and complex real or imaginary parts.
>>> 
>>> This is done by removing the check for lack of subreferences, which is
>>> basically a revert of r9-4911-gbd810d637041dba49a5aca3d085504575374ac6f.
>>> This removal allows more expressions than just array elements,
>>> components and complex parts, but the other expressions are excluded by
>>> other conditions: substrings are excluded by the check on expression
>>> type (CHARACTER is excluded), KIND and LEN references are rejected by
>>> the compiler as not valid in a variable definition context.
>>> 
>>> The check for scalarness is also updated as it was only valid when there
>>> was no subreference.
>> 
>> First, thanks a lot for digging into this subject. I have looked through
>> the patch series, and it looks very good so far.

I second that!
The series looks plausible IMO.

>> 
>> I have a concern about this part, though.  My understanding at the
>> time was that it is not possible to clobber an individual array
>> element, but that this clobbers anything off the pointer that this
>> is based on.
>> 
>Well, we need the middle-end guys to give a definitive answer on this topic, 
>but I think it would be a very penalizing limitation if that was the case.  I 
>have assumed that the clobber spanned the value it was applied on, neither 
>more nor less, so just the array element in case of array elements.

I would assume the same, fwiw.
Let's blame the ME iff something goes amiss then, but I doubt it will.

>> So,
>> 
>>    integer, dimension(3) :: a
>> 
>>    a(1) = 1
>>    a(3) = 3
>>    call foo(a(1))
>> 
>> would also invalidate the store to a(3).  Is my understanding correct?
>
>I think it was the case before patch 2 in the series, because the clobber 
>was applied to the symbol decl, so in the case of the expression A(1), it was 
>applied to A which is the full array.  After patch 2, the clobber is applied 
>to the expression A(1), so the element alone.

Yep.

>> If so, I think we cannot revert that patch (which was introduced
>> because of a regression).
>> 
>The testcase from the patch was not specifically checking lack of side-effect 
>clobbers, so I have double-checked with the following testcase, which should 
>lift your concerns.
>I propose to keep the patch with the testcase added to it.  What do you think?

I cannot approve it but the series looks good to me.

Thanks!


Re: [PATCH 09/10] fortran: Support clobbering of variable subreferences [PR88364]

2022-09-17 Thread Mikael Morin

On 17/09/2022 at 21:33, Mikael Morin wrote:
The testcase from the patch was not specifically checking lack of 
side-effect clobbers, so I have double-checked with the following 
testcase, which should lift your concerns.



The dump matches didn’t fail as expected with patch 2/10 reversed.
This testcase should be better.

! { dg-do run }
! { dg-additional-options "-fno-inline -fno-ipa-modref -fdump-tree-optimized -fdump-tree-original" }
!
! PR fortran/41453
! Check that the INTENT(OUT) attribute causes one clobber to be emitted
! for the array element passed as argument in the *.original dump, and the
! associated initialization constant to be optimized away in the *.optimized
! dump, whereas the other initialization constants are not optimized away.

module x
implicit none
contains
  subroutine foo(a)
    integer, intent(out) :: a
    a = 42
  end subroutine foo
end module x

program main
  use x
  implicit none
  integer :: ac(3)

  ac(1) = 123
  ac(2) = 456
  ac(3) = 789
  call foo(ac(2))
  if (any(ac /= [123, 42, 789])) stop 1

end program main

! { dg-final { scan-tree-dump-times "CLOBBER" 1 "original" } }
! { dg-final { scan-tree-dump "ac\\\[1\\\] = {CLOBBER};" "original" } }
! { dg-final { scan-tree-dump-times "123" 2 "original" } }
! { dg-final { scan-tree-dump-times "123" 2 "optimized" } }
! { dg-final { scan-tree-dump-times "456" 1 "original" } }
! { dg-final { scan-tree-dump-times "456" 0 "optimized" { target __OPTIMIZE__ } } }
! { dg-final { scan-tree-dump-times "789" 2 "original" } }
! { dg-final { scan-tree-dump-times "789" 2 "optimized" } }


Re: [PATCH] RISC-V missing __builtin_lceil and __builtin_lfloor

2022-09-17 Thread Palmer Dabbelt

On Mon, 15 Aug 2022 17:44:35 PDT (-0700), kev...@rivosinc.com wrote:

Hello,
Currently, __builtin_lceil and __builtin_lfloor don't generate the
existing fcvt instruction, but rather call ceil and floor from the
library. This patch adds the missing iterator and attributes for lceil and
lfloor to produce the optimized code.
 The test cases check the correct generation of the fcvt instruction for
float/double to int/long/long long. Passed the test in riscv-linux.
Could this patch be committed?


Reviewed-by: Palmer Dabbelt 
Acked-by: Palmer Dabbelt 

Not sure if Kito had any comments for this one, but it looks good to me.


gcc/ChangeLog:
   Michael Collison  
* config/riscv/riscv.md (RINT): Add iterator for lceil and lfloor.
(rint_pattern): Add ceil and floor.
(rint_rm): Add rup and rdn.

gcc/testsuite/ChangeLog:
Kevin Lee  
* gcc.target/riscv/lfloor-lceil.c: New test.
---
 gcc/config/riscv/riscv.md | 13 ++-
 gcc/testsuite/gcc.target/riscv/lfloor-lceil.c | 79 +++
 2 files changed, 88 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/lfloor-lceil.c

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index c6399b1389e..070004fa7fe 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -43,6 +43,9 @@ (define_c_enum "unspec" [
   UNSPEC_LRINT
   UNSPEC_LROUND

+  UNSPEC_LCEIL
+  UNSPEC_LFLOOR
+
   ;; Stack tie
   UNSPEC_TIE
 ])
@@ -345,10 +348,12 @@ (define_mode_attr UNITMODE [(SF "SF") (DF "DF")])
 ;; the controlling mode.
 (define_mode_attr HALFMODE [(DF "SI") (DI "SI") (TF "DI")])

-;; Iterator and attributes for floating-point rounding instructions.
-(define_int_iterator RINT [UNSPEC_LRINT UNSPEC_LROUND])
-(define_int_attr rint_pattern [(UNSPEC_LRINT "rint") (UNSPEC_LROUND "round")])
-(define_int_attr rint_rm [(UNSPEC_LRINT "dyn") (UNSPEC_LROUND "rmm")])
+;; Iterator and attributes for floating-point rounding instructions.f
+(define_int_iterator RINT [UNSPEC_LRINT UNSPEC_LROUND UNSPEC_LCEIL UNSPEC_LFLOOR])
+(define_int_attr rint_pattern [(UNSPEC_LRINT "rint") (UNSPEC_LROUND "round")
+ (UNSPEC_LCEIL "ceil") (UNSPEC_LFLOOR "floor")])
+(define_int_attr rint_rm [(UNSPEC_LRINT "dyn") (UNSPEC_LROUND "rmm")
+(UNSPEC_LCEIL "rup") (UNSPEC_LFLOOR "rdn")])

 ;; Iterator and attributes for quiet comparisons.
 (define_int_iterator QUIET_COMPARISON [UNSPEC_FLT_QUIET UNSPEC_FLE_QUIET])
diff --git a/gcc/testsuite/gcc.target/riscv/lfloor-lceil.c b/gcc/testsuite/gcc.target/riscv/lfloor-lceil.c
new file mode 100644
index 000..4d81c12cefa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/lfloor-lceil.c
@@ -0,0 +1,79 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc -mabi=lp64d" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
+
+int
+ceil1(float i)
+{
+  return __builtin_lceil(i);
+}
+
+long
+ceil2(float i)
+{
+  return __builtin_lceil(i);
+}
+
+long long
+ceil3(float i)
+{
+  return __builtin_lceil(i);
+}
+
+int
+ceil4(double i)
+{
+  return __builtin_lceil(i);
+}
+
+long
+ceil5(double i)
+{
+  return __builtin_lceil(i);
+}
+
+long long
+ceil6(double i)
+{
+  return __builtin_lceil(i);
+}
+
+int
+floor1(float i)
+{
+  return __builtin_lfloor(i);
+}
+
+long
+floor2(float i)
+{
+  return __builtin_lfloor(i);
+}
+
+long long
+floor3(float i)
+{
+  return __builtin_lfloor(i);
+}
+
+int
+floor4(double i)
+{
+  return __builtin_lfloor(i);
+}
+
+long
+floor5(double i)
+{
+  return __builtin_lfloor(i);
+}
+
+long long
+floor6(double i)
+{
+  return __builtin_lfloor(i);
+}
+
+/* { dg-final { scan-assembler-times "fcvt.l.s" 6 } } */
+/* { dg-final { scan-assembler-times "fcvt.l.d" 6 } } */
+/* { dg-final { scan-assembler-not "call" } } */
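
As a reading aid for the rounding modes listed in `rint_rm`, here is a small C sketch of the results the test functions are expected to compute: `rup` rounds toward positive infinity and `rdn` toward negative infinity. The helper names `lceil_ref` and `lfloor_ref` are hypothetical, and the sketch assumes a non-NaN input whose rounded value fits in `long`; it says nothing about which instructions a given compiler emits.

```c
#include <assert.h>

/* Reference result for __builtin_lceil: smallest integer >= x.
   Assumes x is not a NaN and the result fits in long.  */
long
lceil_ref (double x)
{
  assert (x == x);              /* Reject NaN.  */
  long t = (long) x;            /* Conversion truncates toward zero.  */
  return (double) t < x ? t + 1 : t;
}

/* Reference result for __builtin_lfloor: largest integer <= x.
   Same assumptions as lceil_ref.  */
long
lfloor_ref (double x)
{
  assert (x == x);              /* Reject NaN.  */
  long t = (long) x;            /* Conversion truncates toward zero.  */
  return (double) t > x ? t - 1 : t;
}
```

For example, `lceil_ref (-2.1)` yields -2 while `lfloor_ref (-2.1)` yields -3, which is the difference between the two rounding modes for negative inputs.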


Re: [PATCH] RISC-V missing __builtin_lceil and __builtin_lfloor

2022-09-17 Thread Kito Cheng via Gcc-patches
LGTM, thanks, I guess I just missed this before

On Sat, Sep 17, 2022 at 23:07, Palmer Dabbelt wrote:

> On Mon, 15 Aug 2022 17:44:35 PDT (-0700), kev...@rivosinc.com wrote:
> > Hello,
> > Currently, __builtin_lceil and __builtin_lfloor don't generate the
> > existing fcvt instruction, but rather call ceil and floor from the
> > library. This patch adds the missing iterator and attributes for lceil and
> > lfloor to produce the optimized code.
> >  The test cases check the correct generation of the fcvt instruction for
> > float/double to int/long/long long. Passed the test in riscv-linux.
> > Could this patch be committed?
>
> Reviewed-by: Palmer Dabbelt 
> Acked-by: Palmer Dabbelt 
>
> Not sure if Kito had any comments for this one, but it looks good to me.
>
> > gcc/ChangeLog:
> >Michael Collison  
> > * config/riscv/riscv.md (RINT): Add iterator for lceil and lfloor.
> > (rint_pattern): Add ceil and floor.
> > (rint_rm): Add rup and rdn.
> >
> > gcc/testsuite/ChangeLog:
> > Kevin Lee  
> > * gcc.target/riscv/lfloor-lceil.c: New test.
> > ---
> >  gcc/config/riscv/riscv.md | 13 ++-
> >  gcc/testsuite/gcc.target/riscv/lfloor-lceil.c | 79 +++
> >  2 files changed, 88 insertions(+), 4 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/lfloor-lceil.c
> >
> > diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> > index c6399b1389e..070004fa7fe 100644
> > --- a/gcc/config/riscv/riscv.md
> > +++ b/gcc/config/riscv/riscv.md
> > @@ -43,6 +43,9 @@ (define_c_enum "unspec" [
> >UNSPEC_LRINT
> >UNSPEC_LROUND
> >
> > +  UNSPEC_LCEIL
> > +  UNSPEC_LFLOOR
> > +
> >;; Stack tie
> >UNSPEC_TIE
> >  ])
> > @@ -345,10 +348,12 @@ (define_mode_attr UNITMODE [(SF "SF") (DF "DF")])
> >  ;; the controlling mode.
> >  (define_mode_attr HALFMODE [(DF "SI") (DI "SI") (TF "DI")])
> >
> > -;; Iterator and attributes for floating-point rounding instructions.
> > -(define_int_iterator RINT [UNSPEC_LRINT UNSPEC_LROUND])
> > -(define_int_attr rint_pattern [(UNSPEC_LRINT "rint") (UNSPEC_LROUND "round")])
> > -(define_int_attr rint_rm [(UNSPEC_LRINT "dyn") (UNSPEC_LROUND "rmm")])
> > +;; Iterator and attributes for floating-point rounding instructions.f
> > +(define_int_iterator RINT [UNSPEC_LRINT UNSPEC_LROUND UNSPEC_LCEIL UNSPEC_LFLOOR])
> > +(define_int_attr rint_pattern [(UNSPEC_LRINT "rint") (UNSPEC_LROUND "round")
> > + (UNSPEC_LCEIL "ceil") (UNSPEC_LFLOOR "floor")])
> > +(define_int_attr rint_rm [(UNSPEC_LRINT "dyn") (UNSPEC_LROUND "rmm")
> > +(UNSPEC_LCEIL "rup") (UNSPEC_LFLOOR "rdn")])
> >
> >  ;; Iterator and attributes for quiet comparisons.
> >  (define_int_iterator QUIET_COMPARISON [UNSPEC_FLT_QUIET UNSPEC_FLE_QUIET])
> > diff --git a/gcc/testsuite/gcc.target/riscv/lfloor-lceil.c b/gcc/testsuite/gcc.target/riscv/lfloor-lceil.c
> > new file mode 100644
> > index 000..4d81c12cefa
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/riscv/lfloor-lceil.c
> > @@ -0,0 +1,79 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-march=rv64gc -mabi=lp64d" } */
> > +/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
> > +
> > +int
> > +ceil1(float i)
> > +{
> > +  return __builtin_lceil(i);
> > +}
> > +
> > +long
> > +ceil2(float i)
> > +{
> > +  return __builtin_lceil(i);
> > +}
> > +
> > +long long
> > +ceil3(float i)
> > +{
> > +  return __builtin_lceil(i);
> > +}
> > +
> > +int
> > +ceil4(double i)
> > +{
> > +  return __builtin_lceil(i);
> > +}
> > +
> > +long
> > +ceil5(double i)
> > +{
> > +  return __builtin_lceil(i);
> > +}
> > +
> > +long long
> > +ceil6(double i)
> > +{
> > +  return __builtin_lceil(i);
> > +}
> > +
> > +int
> > +floor1(float i)
> > +{
> > +  return __builtin_lfloor(i);
> > +}
> > +
> > +long
> > +floor2(float i)
> > +{
> > +  return __builtin_lfloor(i);
> > +}
> > +
> > +long long
> > +floor3(float i)
> > +{
> > +  return __builtin_lfloor(i);
> > +}
> > +
> > +int
> > +floor4(double i)
> > +{
> > +  return __builtin_lfloor(i);
> > +}
> > +
> > +long
> > +floor5(double i)
> > +{
> > +  return __builtin_lfloor(i);
> > +}
> > +
> > +long long
> > +floor6(double i)
> > +{
> > +  return __builtin_lfloor(i);
> > +}
> > +
> > +/* { dg-final { scan-assembler-times "fcvt.l.s" 6 } } */
> > +/* { dg-final { scan-assembler-times "fcvt.l.d" 6 } } */
> > +/* { dg-final { scan-assembler-not "call" } } */
>


Re: [PATCH 09/10] fortran: Support clobbering of variable subreferences [PR88364]

2022-09-17 Thread Bernhard Reutner-Fischer via Gcc-patches
On 17 September 2022 21:50:20 CEST, Mikael Morin  wrote:
>On 17/09/2022 at 21:33, Mikael Morin wrote:
>> The testcase from the patch was not specifically checking lack of 
>> side-effect clobbers, so I have double-checked with the following testcase, 
>> which should lift your concerns.
>> 
>The dump matches didn’t fail as expected with patch 2/10 reversed.
>This testcase should be better.

! { dg-final { scan-tree-dump-times "456" 0 "optimized" { target __OPTIMIZE__ } } }

I'd spell this as scan-tree-dump-not, fwiw.

That said, plain scan-tree-dump is usually only viable in arch influenced 
checks which in fortran we do not usually have. Here, we should for the most 
part use -not or a specific -times.

I think you had a check for integer(kind=4) in there, too, which might not work 
all that well for -fdefault-integer-8 or, for the corresponding real scan, 
-fdefault-real-8, eventually. Easily tweaked on top if anyone complains later on 
(someone certainly will), though.

fore, either way, I'd say :-)
thanks,


Re: [PATCH] libstdc++: Introduce GNU/Hurd-specific libstdc++ os-defines.h

2022-09-17 Thread Samuel Thibault via Gcc-patches
Ping?

Samuel Thibault, on Mon 29 Aug 2022 02:30:40 +0200, wrote:
> This is notably needed because in glibc 2.34, the move of pthread functions
> into libc.so happened for Linux only, not GNU/Hurd.
> 
> The pthread_self() function can also always be used fine as it is.
> 
> libstdc++-v3/ChangeLog:
> 
> * config/os/gnu/os_defines.h: New file.
> * config/os/gnu/ctype_base.h: New file.
> * config/os/gnu/ctype_configure_char.cc: New file.
> * config/os/gnu/ctype_inline.h: New file.
> * configure.host: On gnu* host, use os/gnu instead of os/gnu-linux.
> 
> diff --git a/libstdc++-v3/ChangeLog b/libstdc++-v3/ChangeLog
> index ba5939d9003..dd288cce2ca 100644
> --- a/libstdc++-v3/ChangeLog
> +++ b/libstdc++-v3/ChangeLog
> @@ -1,3 +1,11 @@
> +2022-08-28  Samuel Thibault  
> +
> + * config/os/gnu/os_defines.h: New file.
> + * config/os/gnu/ctype_base.h: New file.
> + * config/os/gnu/ctype_configure_char.cc: New file.
> + * config/os/gnu/ctype_inline.h: New file.
> + * configure.host: On gnu* host, use os/gnu instead of os/gnu-linux.
> +
>  2022-08-27  Patrick Palka  
>  
>   * testsuite/20_util/logical_traits/requirements/base_classes.cc: New 
> test.
> diff --git a/libstdc++-v3/config/os/gnu/ctype_base.h b/libstdc++-v3/config/os/gnu/ctype_base.h
> new file mode 100644
> index 000..955146543db
> --- /dev/null
> +++ b/libstdc++-v3/config/os/gnu/ctype_base.h
> @@ -0,0 +1,66 @@
> +// Locale support -*- C++ -*-
> +
> +// Copyright (C) 1997-2022 Free Software Foundation, Inc.
> +//
> +// This file is part of the GNU ISO C++ Library.  This library is free
> +// software; you can redistribute it and/or modify it under the
> +// terms of the GNU General Public License as published by the
> +// Free Software Foundation; either version 3, or (at your option)
> +// any later version.
> +
> +// This library is distributed in the hope that it will be useful,
> +// but WITHOUT ANY WARRANTY; without even the implied warranty of
> +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +// GNU General Public License for more details.
> +
> +// Under Section 7 of GPL version 3, you are granted additional
> +// permissions described in the GCC Runtime Library Exception, version
> +// 3.1, as published by the Free Software Foundation.
> +
> +// You should have received a copy of the GNU General Public License and
> +// a copy of the GCC Runtime Library Exception along with this program;
> +// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
> +// .
> +
> +/** @file bits/ctype_base.h
> + *  This is an internal header file, included by other library headers.
> + *  Do not attempt to use it directly. @headername{locale}
> + */
> +
> +//
> +// ISO C++ 14882: 22.1  Locales
> +//
> +
> +// Information as gleaned from /usr/include/ctype.h
> +
> +namespace std _GLIBCXX_VISIBILITY(default)
> +{
> +_GLIBCXX_BEGIN_NAMESPACE_VERSION
> +
> +  /// @brief  Base class for ctype.
> +  struct ctype_base
> +  {
> +// Non-standard typedefs.
> +typedef const int*   __to_type;
> +
> +// NB: Offsets into ctype::_M_table force a particular size
> +// on the mask type. Because of this, we don't use an enum.
> +typedef unsigned short   mask;
> +static const mask upper  = _ISupper;
> +static const mask lower  = _ISlower;
> +static const mask alpha  = _ISalpha;
> +static const mask digit  = _ISdigit;
> +static const mask xdigit = _ISxdigit;
> +static const mask space  = _ISspace;
> +static const mask print  = _ISprint;
> +static const mask graph  = _ISalpha | _ISdigit | _ISpunct;
> +static const mask cntrl  = _IScntrl;
> +static const mask punct  = _ISpunct;
> +static const mask alnum  = _ISalpha | _ISdigit;
> +#if __cplusplus >= 201103L
> +static const mask blank  = _ISblank;
> +#endif
> +  };
> +
> +_GLIBCXX_END_NAMESPACE_VERSION
> +} // namespace
> diff --git a/libstdc++-v3/config/os/gnu/ctype_configure_char.cc b/libstdc++-v3/config/os/gnu/ctype_configure_char.cc
> new file mode 100644
> index 000..5a88fc11ab3
> --- /dev/null
> +++ b/libstdc++-v3/config/os/gnu/ctype_configure_char.cc
> @@ -0,0 +1,196 @@
> +// Locale support -*- C++ -*-
> +
> +// Copyright (C) 2011-2022 Free Software Foundation, Inc.
> +//
> +// This file is part of the GNU ISO C++ Library.  This library is free
> +// software; you can redistribute it and/or modify it under the
> +// terms of the GNU General Public License as published by the
> +// Free Software Foundation; either version 3, or (at your option)
> +// any later version.
> +
> +// This library is distributed in the hope that it will be useful,
> +// but WITHOUT ANY WARRANTY; without even the implied warranty of
> +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +// GNU General Public License for more details.
> +
> +// Under Section 7 of GPL version 3, you are grante

Re: [PATCH 09/10] fortran: Support clobbering of variable subreferences [PR88364]

2022-09-17 Thread Richard Biener via Gcc-patches
On Sat, Sep 17, 2022 at 9:33 PM Mikael Morin  wrote:
>
> On 17/09/2022 at 19:03, Thomas Koenig via Fortran wrote:
> >
> > Hi Mikael,
> >
> >> This adds support for clobbering of partial variable references, when
> >> they are passed as actual argument and the associated dummy has the
> >> INTENT(OUT) attribute.
> >> Support includes array elements, derived type component references,
> >> and complex real or imaginary parts.
> >>
> >> This is done by removing the check for lack of subreferences, which is
> >> basically a revert of r9-4911-gbd810d637041dba49a5aca3d085504575374ac6f.
> >> This removal allows more expressions than just array elements,
> >> components and complex parts, but the other expressions are excluded by
> >> other conditions: substrings are excluded by the check on expression
> >> type (CHARACTER is excluded), KIND and LEN references are rejected by
> >> the compiler as not valid in a variable definition context.
> >>
> >> The check for scalarness is also updated as it was only valid when there
> >> was no subreference.
> >
> > First, thanks a lot for digging into this subject. I have looked through
> > the patch series, and it looks very good so far.
> >
> > I have a concern about this part, though.  My understanding at the
> > time was that it is not possible to clobber an individual array
> > element, but that this clobbers anything off the pointer that this
> > is based on.
> >
> Well, we need the middle-end guys to give a definitive answer on this
> topic, but I think it would be a very penalizing limitation if that was
> the case.  I have assumed that the clobber spanned the value it was
> applied on, neither more nor less, so just the array element in case of
> array elements.

There is IL verification that the LHS of a CLOBBER is either
a declaration or a pointer dereference, no array or component
selection is allowed there.  Now, nothing should go wrong here,
but we might eventually just drop those CLOBBERs or ICE if
the frontend hands us an "invalid" one.

Richard.
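
The rule stated above can be restated as a tiny predicate. The sketch below is a toy model only: the enum and the function name are hypothetical and are not GCC's tree codes or its actual verifier; it merely encodes "a declaration or a pointer dereference is accepted, an array or component selection is not".

```c
#include <assert.h>

/* Hypothetical classification of the left-hand side of a clobber.  */
enum clobber_lhs_kind
{
  LHS_DECL,            /* a whole variable, e.g. A             */
  LHS_POINTER_DEREF,   /* a dereferenced pointer, e.g. *p      */
  LHS_ARRAY_ELEMENT,   /* an array selection, e.g. A(1)        */
  LHS_COMPONENT        /* a component selection, e.g. x%field  */
};

/* Returns 1 if the rule described above would accept this kind of
   left-hand side, 0 if it would reject it.  */
int
clobber_lhs_accepted (enum clobber_lhs_kind kind)
{
  assert ((int) kind >= 0 && (int) kind <= (int) LHS_COMPONENT);
  return kind == LHS_DECL || kind == LHS_POINTER_DEREF;
}
```

Under this reading, a clobber whose left-hand side is the array selection A(1) lands in the rejected group, which is the case that may end up being dropped or triggering an ICE.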

> > So,
> >
> >integer, dimension(3) :: a
> >
> >a(1) = 1
> >a(3) = 3
> >call foo(a(1))
> >
> > would also invalidate the store to a(3).  Is my understanding correct?
>
> I think it was the case before patch 2 in the series, because the
> clobber was applied to the symbol decl, so in the case of the expression
> A(1), it was applied to A which is the full array.  After patch 2, the
> clobber is applied to the expression A(1), so the element alone.
>
> > If so, I think we cannot revert that patch (which was introduced
> > because of a regression).
> >
> The testcase from the patch was not specifically checking lack of
> side-effect clobbers, so I have double-checked with the following
> testcase, which should lift your concerns.
> I propose to keep the patch with the testcase added to it.  What do you
> think?
>
> Mikael
>