Re: [PATCH] [testsuite] require -Ofast for vect-ifcvt-18 even without avx

2025-01-30 Thread Richard Biener
On Thu, Jan 30, 2025 at 8:49 PM Alexandre Oliva  wrote:
>
>
> The test expects transformations that depend on -Ofast on x86*, but
> that option is only passed when the avx_runtime is available.
>
> Split -Ofast out of the avx conditional, so that it is passed on the
> same targets that expect the transformation.

It effectively would require -Ofast on all targets (it's vectorizing a
FP reduction),
so I'd say -Ofast should not depend on x86 only.

> Regstrapped on x86_64-linux-gnu, also tested with gcc-14 targeting
> x86_64-elf.  Ok to install?
>
>
> for  gcc/testsuite/ChangeLog
>
> * gcc.dg/vect/vect-ifcvt-18.c: Split -Ofast out of
> avx_runtime.
> ---
>  gcc/testsuite/gcc.dg/vect/vect-ifcvt-18.c |3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-ifcvt-18.c 
> b/gcc/testsuite/gcc.dg/vect/vect-ifcvt-18.c
> index c1d3c27d81933..228011ae07bce 100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-ifcvt-18.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-ifcvt-18.c
> @@ -1,6 +1,7 @@
>  /* { dg-require-effective-target vect_condition } */
>  /* { dg-require-effective-target vect_float } */
> -/* { dg-additional-options "-Ofast -mavx" { target avx_runtime } } */
> +/* { dg-additional-options "-Ofast" { target i?86-*-* x86_64-*-* } } */
> +/* { dg-additional-options "-mavx" { target avx_runtime } } */
>
>
>  int A0[4] = {36,39,42,45};
>
> --
> Alexandre Oliva, happy hackerhttps://FSFLA.org/blogs/lxo/
>Free Software Activist   GNU Toolchain Engineer
> More tolerance and less prejudice are key for inclusion and diversity
> Excluding neuro-others for not behaving ""normal"" is *not* inclusive


Re: [PATCH] Fix PR 118541, do not generate unordered fp cmoves for IEEE compares.

2025-01-30 Thread Richard Biener
On Fri, Jan 31, 2025 at 3:55 AM Michael Meissner  wrote:
>
> Fix PR 118541, do not generate unordered fp cmoves for IEEE compares.
>
> In bug PR target/118541 on power9, power10, and power11 systems, for the
> function:
>
> extern double __ieee754_acos (double);
>
> double
> __acospi (double x)
> {
>   double ret = __ieee754_acos (x) / 3.14;
>   return __builtin_isgreater (ret, 1.0) ? 1.0 : ret;
> }
>
> GCC currently generates the following code:
>
> Power9  Power10 and Power11
> ==  ===
> bl __ieee754_acos   bl __ieee754_acos@notoc
> nop plfd 0,.LC0@pcrel
> addis 9,2,.LC2@toc@ha   xxspltidp 12,1065353216
> addi 1,1,32 addi 1,1,32
> lfd 0,.LC2@toc@l(9) ld 0,16(1)
> addis 9,2,.LC0@toc@ha   fdiv 0,1,0
> ld 0,16(1)  mtlr 0
> lfd 12,.LC0@toc@l(9)xscmpgtdp 1,0,12
> fdiv 0,1,0  xxsel 1,0,12,1
> mtlr 0  blr
> xscmpgtdp 1,0,12
> xxsel 1,0,12,1
> blr
>
> This is because ifcvt.c optimizes the conditional floating point move to use 
> the
> XSCMPGTDP instruction.
>
> However, the XSCMPGTDP instruction will generate an interrupt if one of the
> arguments is a signalling NaN and signalling NaNs can generate an interrupt.
> The IEEE comparison functions (isgreater, etc.) require that the comparison 
> not
> raise an interrupt.
>
> The following patch changes the PowerPC back end so that ifcvt.c will not 
> change
> the if/then test and move into a conditional move if the comparison is one of
> the comparisons that do not raise an error with signalling NaNs and -Ofast is
> not used.  If a normal comparison is used or -Ofast is used, GCC will continue
> to generate XSCMPGTDP and XXSEL.
>
> For the following code:
>
> double
> ordered_compare (double a, double b, double c, double d)
> {
>   return __builtin_isgreater (a, b) ? c : d;
> }
>
> /* Verify normal > does generate xscmpgtdp.  */
>
> double
> normal_compare (double a, double b, double c, double d)
> {
>   return a > b ? c : d;
> }
>
> with the following patch, GCC generates the following for power9, power10, and
> power11:
>
> ordered_compare:
> fcmpu 0,1,2
> fmr 1,4
> bnglr 0
> fmr 1,3
> blr
>
> normal_compare:
> xscmpgtdp 1,1,2
> xxsel 1,4,3,1
> blr
>
> I have built bootstrap compilers on big endian power9 systems and little 
> endian
> power9/power10 systems and there were no regressions.  Can I check this patch
> into the GCC trunk, and after a waiting period, can I check this into the 
> active
> older branches?
>
> 2025-01-30  Michael Meissner  
>
> gcc/
>
> PR target/118541
> * config/rs6000/rs6000-protos.h (REVERSE_COND_ORDERED_OK): New macro.
> (REVERSE_COND_NO_ORDERED): Likewise.
> (rs6000_reverse_condition): Add argument.
> * config/rs6000/rs6000.cc (rs6000_reverse_condition): Do not allow
> ordered comparisons to be reversed for floating point cmoves.
> (rs6000_emit_sCOND): Adjust rs6000_reverse_condition call.
> * config/rs6000/rs6000.h (REVERSE_CONDITION): Likewise.
> * config/rs6000/rs6000.md (reverse_branch_comparison): Name insn.
> Adjust rs6000_reverse_condition call.
>
> gcc/testsuite/
>
> PR target/118541
> * gcc.target/powerpc/pr118541.c: New test.
> ---
>  gcc/config/rs6000/rs6000-protos.h   |  6 ++-
>  gcc/config/rs6000/rs6000.cc | 23 ---
>  gcc/config/rs6000/rs6000.h  | 10 -
>  gcc/config/rs6000/rs6000.md | 24 +++-
>  gcc/testsuite/gcc.target/powerpc/pr118541.c | 43 +
>  5 files changed, 89 insertions(+), 17 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr118541.c
>
> diff --git a/gcc/config/rs6000/rs6000-protos.h 
> b/gcc/config/rs6000/rs6000-protos.h
> index 4619142d197..112332660d3 100644
> --- a/gcc/config/rs6000/rs6000-protos.h
> +++ b/gcc/config/rs6000/rs6000-protos.h
> @@ -114,8 +114,12 @@ extern const char *rs6000_sibcall_template (rtx *, 
> unsigned int);
>  extern const char *rs6000_indirect_call_template (rtx *, unsigned int);
>  extern const char *rs6000_indirect_sibcall_template (rtx *, unsigned int);
>  extern const char *rs6000_pltseq_template (rtx *, int);
> +
> +#define REVERSE_COND_ORDERED_OKfalse
> +#define REVERSE_COND_NO_ORDEREDtrue
> +
>  extern enum rtx_code rs6000_reverse_condition (machine_mode,
> -  enum rtx_code);
> + 

[PATCH] [testsuite] require profiling support [PR113689]

2025-01-30 Thread Alexandre Oliva


pr113689 testcases use -fprofile without testing for profiling
support.  Fix them.

Regstrapped on x86_64-linux-gnu, also tested with gcc-14 targeting
x86_64-elf.  Ok to install?


for  gcc/testsuite/ChangeLog

PR target/113689
* gcc.target/i386/pr113689-1.c: Require profiling support.
* gcc.target/i386/pr113689-2.c: Likewise.
* gcc.target/i386/pr113689-3.c: Likewise.
---
 gcc/testsuite/gcc.target/i386/pr113689-1.c |1 +
 gcc/testsuite/gcc.target/i386/pr113689-2.c |1 +
 gcc/testsuite/gcc.target/i386/pr113689-3.c |1 +
 3 files changed, 3 insertions(+)

diff --git a/gcc/testsuite/gcc.target/i386/pr113689-1.c 
b/gcc/testsuite/gcc.target/i386/pr113689-1.c
index 0424db2dfdca0..0ed911393eeac 100644
--- a/gcc/testsuite/gcc.target/i386/pr113689-1.c
+++ b/gcc/testsuite/gcc.target/i386/pr113689-1.c
@@ -1,5 +1,6 @@
 /* { dg-do run { target { lp64 && fpic } } } */
 /* { dg-options "-O2 -fno-pic -no-pie -fprofile -mcmodel=large" } */
+/* { dg-require-profiling "-fprofile" } */
 /* { dg-skip-if "PR90698" { *-*-darwin* } } */
 /* { dg-skip-if "PR113909" { *-*-solaris2* } } */
 
diff --git a/gcc/testsuite/gcc.target/i386/pr113689-2.c 
b/gcc/testsuite/gcc.target/i386/pr113689-2.c
index 58688b9a387c8..decc44a44fb56 100644
--- a/gcc/testsuite/gcc.target/i386/pr113689-2.c
+++ b/gcc/testsuite/gcc.target/i386/pr113689-2.c
@@ -1,5 +1,6 @@
 /* { dg-do run { target { lp64 && fpic } } } */
 /* { dg-options "-O2 -fpic -fprofile -mcmodel=large" } */
+/* { dg-require-profiling "-fprofile" } */
 /* { dg-skip-if "PR90698" { *-*-darwin* } } */
 /* { dg-skip-if "PR113909" { *-*-solaris2* } } */
 
diff --git a/gcc/testsuite/gcc.target/i386/pr113689-3.c 
b/gcc/testsuite/gcc.target/i386/pr113689-3.c
index 14c906239f9db..a904feec13f04 100644
--- a/gcc/testsuite/gcc.target/i386/pr113689-3.c
+++ b/gcc/testsuite/gcc.target/i386/pr113689-3.c
@@ -1,5 +1,6 @@
 /* { dg-do run { target { lp64 && fpic } } } */
 /* { dg-options "-O2 -fpic -fprofile -mcmodel=large" } */
+/* { dg-require-profiling "-fprofile" } */
 /* { dg-skip-if "PR90698" { *-*-darwin* } } */
 /* { dg-skip-if "PR113909" { *-*-solaris2* } } */
 


-- 
Alexandre Oliva, happy hackerhttps://FSFLA.org/blogs/lxo/
   Free Software Activist   GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving ""normal"" is *not* inclusive


[PATCH] [testsuite] require -Ofast for vect-ifcvt-18 even without avx

2025-01-30 Thread Alexandre Oliva


The test expects transformations that depend on -Ofast on x86*, but
that option is only passed when the avx_runtime is available.

Split -Ofast out of the avx conditional, so that it is passed on the
same targets that expect the transformation.

Regstrapped on x86_64-linux-gnu, also tested with gcc-14 targeting
x86_64-elf.  Ok to install?


for  gcc/testsuite/ChangeLog

* gcc.dg/vect/vect-ifcvt-18.c: Split -Ofast out of
avx_runtime.
---
 gcc/testsuite/gcc.dg/vect/vect-ifcvt-18.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/vect/vect-ifcvt-18.c 
b/gcc/testsuite/gcc.dg/vect/vect-ifcvt-18.c
index c1d3c27d81933..228011ae07bce 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-ifcvt-18.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-ifcvt-18.c
@@ -1,6 +1,7 @@
 /* { dg-require-effective-target vect_condition } */
 /* { dg-require-effective-target vect_float } */
-/* { dg-additional-options "-Ofast -mavx" { target avx_runtime } } */
+/* { dg-additional-options "-Ofast" { target i?86-*-* x86_64-*-* } } */
+/* { dg-additional-options "-mavx" { target avx_runtime } } */
 
 
 int A0[4] = {36,39,42,45};

-- 
Alexandre Oliva, happy hackerhttps://FSFLA.org/blogs/lxo/
   Free Software Activist   GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving ""normal"" is *not* inclusive


Re: [PATCH] [testsuite] require -Ofast for vect-ifcvt-18 even without avx

2025-01-30 Thread Jakub Jelinek
On Thu, Jan 30, 2025 at 04:46:49PM -0300, Alexandre Oliva wrote:
> 
> The test expects transformations that depend on -Ofast on x86*, but
> that option is only passed when the avx_runtime is available.
> 
> Split -Ofast out of the avx conditional, so that it is passed on the
> same targets that expect the transformation.
> 
> Regstrapped on x86_64-linux-gnu, also tested with gcc-14 targeting
> x86_64-elf.  Ok to install?
> 
> 
> for  gcc/testsuite/ChangeLog
> 
>   * gcc.dg/vect/vect-ifcvt-18.c: Split -Ofast out of
>   avx_runtime.

LGTM.
> ---
>  gcc/testsuite/gcc.dg/vect/vect-ifcvt-18.c |3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-ifcvt-18.c 
> b/gcc/testsuite/gcc.dg/vect/vect-ifcvt-18.c
> index c1d3c27d81933..228011ae07bce 100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-ifcvt-18.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-ifcvt-18.c
> @@ -1,6 +1,7 @@
>  /* { dg-require-effective-target vect_condition } */
>  /* { dg-require-effective-target vect_float } */
> -/* { dg-additional-options "-Ofast -mavx" { target avx_runtime } } */
> +/* { dg-additional-options "-Ofast" { target i?86-*-* x86_64-*-* } } */
> +/* { dg-additional-options "-mavx" { target avx_runtime } } */

Jakub



Re: [patch, libfortran] PR114618 Version 2 Format produces incorrect output when contains 1x, ok when uses " "

2025-01-30 Thread Jerry D

On 1/29/25 10:30 AM, Jerry D wrote:

Follow-up:

On 1/28/25 1:33 PM, Harald Anlauf wrote:

Jerry,

while I haven't read your actual patch yet, I think the testcase
is slightly incorrect. In fact, Intel, NAG, Nvidia and AMD flang
disagree with it.

--- snip ---

The following adjustment to the patch puts this right.

 case FMT_X:
 case FMT_TR:
   consume_data_flag = 0;
   dtp->u.p.skips += f->u.n;
   tab_pos = bytes_used + dtp->u.p.skips - 1;
   dtp->u.p.pending_spaces = tab_pos - dtp->u.p.max_pos + 1;
   dtp->u.p.pending_spaces = dtp->u.p.pending_spaces < 0
     ? f->u.n : dtp->u.p.pending_spaces;

   //if (t == FMT_X && tab_pos < dtp->u.p.max_pos)
   //{
     //write_x (dtp, dtp->u.p.skips, dtp->u.p.pending_spaces);
     //dtp->u.p.skips = dtp->u.p.pending_spaces = 0;
   //}

Interestingly, it also fixes a floating point exception I ran into while 
setting up another test case for part 2 of this effort. I suspect this 
may be what was detected by the auto patch tester.


I will clean this up, adjust the test case for this part and re-submit.

Regards,

Jerry


Here is version 2 of the patch cleaned up and with the test case revised 
accordingly.


Thank you Herald for helping with my blindness.

Regressions tested on x86_64. I will wait a bit to see if the auto patch 
tester reports anything.


Otherwise, OK for trunk?

Regards,

Jerry

PS working on part 2 still.




diff --git a/gcc/testsuite/gfortran.dg/pr114618.f90 b/gcc/testsuite/gfortran.dg/pr114618.f90
new file mode 100644
index 000..835597b8513
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr114618.f90
@@ -0,0 +1,15 @@
+! { dg-do run }
+! PR114618 Format produces incorrect output when contains 1x, ok when uses " " 
+! aside: Before patch output1 is garbage.
+program pr114618
+   implicit none
+   integer, parameter :: wp = kind(0d0)
+   real(kind=wp) :: pi  = 3.14159265358979323846264338_wp
+   character(len=*), parameter:: fmt1 = '(19("."),t1,g0,1x,t21,g0)'
+   character(len=*), parameter:: fmt2 = '(19("."),t1,g0," ",t21,g0)'
+   character(21) :: output1, output2
+   write (output1, fmt1) 'RADIX', radix(pi)
+   write (output2, fmt2) 'RADIX', radix(pi)
+   if (output1 /= 'RADIX.. 2') stop 1
+   if (output2 /= 'RADIX . 2') stop 2
+end program pr114618
diff --git a/libgfortran/io/transfer.c b/libgfortran/io/transfer.c
index b3b72f39c5b..3fc53938b4a 100644
--- a/libgfortran/io/transfer.c
+++ b/libgfortran/io/transfer.c
@@ -2068,12 +2068,14 @@ static void
 formatted_transfer_scalar_write (st_parameter_dt *dtp, bt type, void *p, int kind,
  size_t size)
 {
-  gfc_offset pos, bytes_used;
+  gfc_offset tab_pos, bytes_used;
   const fnode *f;
   format_token t;
   int n;
   int consume_data_flag;
 
+  tab_pos = 0; bytes_used = 0;
+
   /* Change a complex data item into a pair of reals.  */
 
   n = (p == NULL) ? 0 : ((type != BT_COMPLEX) ? 1 : 2);
@@ -2398,10 +2400,12 @@ formatted_transfer_scalar_write (st_parameter_dt *dtp, bt type, void *p, int kin
 	case FMT_X:
 	case FMT_TR:
 	  consume_data_flag = 0;
-
 	  dtp->u.p.skips += f->u.n;
-	  pos = bytes_used + dtp->u.p.skips - 1;
-	  dtp->u.p.pending_spaces = pos - dtp->u.p.max_pos + 1;
+	  tab_pos = bytes_used + dtp->u.p.skips - 1;
+	  dtp->u.p.pending_spaces = tab_pos - dtp->u.p.max_pos + 1;
+	  dtp->u.p.pending_spaces = dtp->u.p.pending_spaces < 0
+? f->u.n : dtp->u.p.pending_spaces;
+
 	  /* Writes occur just before the switch on f->format, above, so
 	 that trailing blanks are suppressed, unless we are doing a
 	 non-advancing write in which case we want to output the blanks
@@ -2414,35 +2418,50 @@ formatted_transfer_scalar_write (st_parameter_dt *dtp, bt type, void *p, int kin
 	  break;
 
 	case FMT_TL:
-	case FMT_T:
 	  consume_data_flag = 0;
-
-	  if (f->format == FMT_TL)
+	  /* Handle the special case when no bytes have been used yet.
+	 Cannot go below zero. */
+	  if (bytes_used == 0)
 	{
-
-	  /* Handle the special case when no bytes have been used yet.
-	 Cannot go below zero. */
-	  if (bytes_used == 0)
-		{
-		  dtp->u.p.pending_spaces -= f->u.n;
-		  dtp->u.p.skips -= f->u.n;
-		  dtp->u.p.skips = dtp->u.p.skips < 0 ? 0 : dtp->u.p.skips;
-		}
-
-	  pos = bytes_used - f->u.n;
+	  dtp->u.p.pending_spaces -= f->u.n;
+	  dtp->u.p.skips -= f->u.n;
+	  dtp->u.p.skips = dtp->u.p.skips < 0 ? 0 : dtp->u.p.skips;
 	}
-	  else /* FMT_T */
-	pos = f->u.n - dtp->u.p.pending_spaces - 1;
+
+	  tab_pos = bytes_used - f->u.n;
 
 	  /* Standard 10.6.1.1: excessive left tabbing is reset to the
 	 left tab limit.  We do not check if the position has gone
 	 beyond the end of record because a subsequent tab could
 	 bring us back again.  */
-	  pos = pos < 0 ? 0 : pos;
+	  tab_pos = tab_pos < 0 ? 0 : tab_pos;
 
-	  dtp->u.p.skips = dtp->u.p.skips + pos - bytes_used;
+	  dtp->u.p.skips = dtp->u.p.skips + tab_pos - bytes_used;
 	  dtp->u.

Re: [GCC16 stage 1][RFC][PATCH 0/3]extend "counted_by" attribute to pointer fields of structures

2025-01-30 Thread Qing Zhao
Hi, Bill,

Sorry for the late reply (just be back from a short vacation).

> On Jan 23, 2025, at 17:44, Bill Wendling  wrote:
> 
> On Fri, Jan 17, 2025 at 8:20 AM Qing Zhao  wrote:
>> 
>> Hi, Bill,
>> 
>> Thanks a lot for your testing case.
>> 
>> I studied this testing case and realized that we need to decide
>> what’s the expected behavior for the following situation:
>> 
>> struct bar;
>> 
>> struct a {
>>   int dummy;
>>   struct bar *array __attribute__((counted_by(count)));
>>   char count;
>> };
>> 
>> when the size information of the element of the pointer array is not 
>> available
>> in the current compilation, i.e.,  there is no definition of the structure 
>> “bar” in the
>> current file, the size of “structure bar” is not known, as a result, 
>> compilation is not
>> able to compute the object size of the pointer array “array” even though the 
>> length
>> of the array is known.
>> 
>> So, my question is:
>> 
>> 1. When should the compiler issue warning for such situation?
>> A. During C frontend when checking the counted_by attributes.
>> B. During middle-end when __builtin_dynamic_object_size is computing the 
>> object size.
>> 
>> I prefer B.  The reason is: even though the counted_by attribute under such 
>> situation is not enough for object_size,
>> It should be enough for the bound sanitizer?
>> 
> My feelings on this is that we should allow this in the struct
> declaration, because when the user goes to allocate the objects for
> 'array', struct bar will be defined. So there shouldn't be an issue.
> There are two possible uses (maybe more) that I can think of:
> 
>  ptr->array = malloc (sizeof (struct bar));
>  ptr->count = 1;
> 
> or
> 
>  ptr->array = malloc (sizeof (struct bar) * count);
>  ptr->count = count;
> 
> And then you can access the 5th element like this:
> 
>  (((char *) ptr->array) + sizeof (struct bar) * 5).some_element;

 A CLANG patch that is currently under review is trying to resolve this similar 
issue:

https://github.com/llvm/llvm-project/pull/106321

This patch proposed the following rule to handle such situation:
"  • For incomplete pointee types that can never be completed (e.g. void) these 
are treated as error where the attribute is written (just like before this 
patch).
   • For incomplete pointee types that might be completable later on (struct, 
union, and enum forward declarations) in the translation unit, writing the 
attribute on the incomplete pointee type is allowed on the FieldDecl 
declaration but "uses" of the declared pointer are forbidden if at the point of 
"use" the pointee type is still incomplete.
For this patch a "use" of a FieldDecl covers:
   • Explicit and Implicit initialization (note see Tentative Definition 
Initialization for an exception to this)
   • Assignment
   • Conversion to an lvalue (e.g. for use in an expression)
“

I think that the above rule is reasonable to handle such situation.

Let me know if you have any comment or suggestion on this.

Qing

> 
> (Gotta love C.)
> 
> -bw



[PATCH] Fix PR 118541, do not generate unordered fp cmoves for IEEE compares.

2025-01-30 Thread Michael Meissner
Fix PR 118541, do not generate unordered fp cmoves for IEEE compares.

In bug PR target/118541 on power9, power10, and power11 systems, for the
function:

extern double __ieee754_acos (double);

double
__acospi (double x)
{
  double ret = __ieee754_acos (x) / 3.14;
  return __builtin_isgreater (ret, 1.0) ? 1.0 : ret;
}

GCC currently generates the following code:

Power9  Power10 and Power11
==  ===
bl __ieee754_acos   bl __ieee754_acos@notoc
nop plfd 0,.LC0@pcrel
addis 9,2,.LC2@toc@ha   xxspltidp 12,1065353216
addi 1,1,32 addi 1,1,32
lfd 0,.LC2@toc@l(9) ld 0,16(1)
addis 9,2,.LC0@toc@ha   fdiv 0,1,0
ld 0,16(1)  mtlr 0
lfd 12,.LC0@toc@l(9)xscmpgtdp 1,0,12
fdiv 0,1,0  xxsel 1,0,12,1
mtlr 0  blr
xscmpgtdp 1,0,12
xxsel 1,0,12,1
blr

This is because ifcvt.c optimizes the conditional floating point move to use the
XSCMPGTDP instruction.

However, the XSCMPGTDP instruction will generate an interrupt if one of the
arguments is a signalling NaN and signalling NaNs can generate an interrupt.
The IEEE comparison functions (isgreater, etc.) require that the comparison not
raise an interrupt.

The following patch changes the PowerPC back end so that ifcvt.c will not change
the if/then test and move into a conditional move if the comparison is one of
the comparisons that do not raise an error with signalling NaNs and -Ofast is
not used.  If a normal comparison is used or -Ofast is used, GCC will continue
to generate XSCMPGTDP and XXSEL.

For the following code:

double
ordered_compare (double a, double b, double c, double d)
{
  return __builtin_isgreater (a, b) ? c : d;
}

/* Verify normal > does generate xscmpgtdp.  */

double
normal_compare (double a, double b, double c, double d)
{
  return a > b ? c : d;
}

with the following patch, GCC generates the following for power9, power10, and
power11:

ordered_compare:
fcmpu 0,1,2
fmr 1,4
bnglr 0
fmr 1,3
blr

normal_compare:
xscmpgtdp 1,1,2
xxsel 1,4,3,1
blr

I have built bootstrap compilers on big endian power9 systems and little endian
power9/power10 systems and there were no regressions.  Can I check this patch
into the GCC trunk, and after a waiting period, can I check this into the active
older branches?

2025-01-30  Michael Meissner  

gcc/

PR target/118541
* config/rs6000/rs6000-protos.h (REVERSE_COND_ORDERED_OK): New macro.
(REVERSE_COND_NO_ORDERED): Likewise.
(rs6000_reverse_condition): Add argument.
* config/rs6000/rs6000.cc (rs6000_reverse_condition): Do not allow
ordered comparisons to be reversed for floating point cmoves.
(rs6000_emit_sCOND): Adjust rs6000_reverse_condition call.
* config/rs6000/rs6000.h (REVERSE_CONDITION): Likewise.
* config/rs6000/rs6000.md (reverse_branch_comparison): Name insn.
Adjust rs6000_reverse_condition call.

gcc/testsuite/

PR target/118541
* gcc.target/powerpc/pr118541.c: New test.
---
 gcc/config/rs6000/rs6000-protos.h   |  6 ++-
 gcc/config/rs6000/rs6000.cc | 23 ---
 gcc/config/rs6000/rs6000.h  | 10 -
 gcc/config/rs6000/rs6000.md | 24 +++-
 gcc/testsuite/gcc.target/powerpc/pr118541.c | 43 +
 5 files changed, 89 insertions(+), 17 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr118541.c

diff --git a/gcc/config/rs6000/rs6000-protos.h 
b/gcc/config/rs6000/rs6000-protos.h
index 4619142d197..112332660d3 100644
--- a/gcc/config/rs6000/rs6000-protos.h
+++ b/gcc/config/rs6000/rs6000-protos.h
@@ -114,8 +114,12 @@ extern const char *rs6000_sibcall_template (rtx *, 
unsigned int);
 extern const char *rs6000_indirect_call_template (rtx *, unsigned int);
 extern const char *rs6000_indirect_sibcall_template (rtx *, unsigned int);
 extern const char *rs6000_pltseq_template (rtx *, int);
+
+#define REVERSE_COND_ORDERED_OKfalse
+#define REVERSE_COND_NO_ORDEREDtrue
+
 extern enum rtx_code rs6000_reverse_condition (machine_mode,
-  enum rtx_code);
+  enum rtx_code, bool);
 extern rtx rs6000_emit_eqne (machine_mode, rtx, rtx, rtx);
 extern rtx rs6000_emit_fp_cror (rtx_code, machine_mode, rtx);
 extern void rs6000_emit_sCOND (machine_mode, rtx[]);
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/r

libbacktrace patch committed: Add casts to avoid undefined shifts

2025-01-30 Thread Ian Lance Taylor
This patch from pgerell at GitHub adds some casts to libbacktrace to
avoid undefined shifts. These shifts are OK on all real systems but
may as well get it right. Bootstrapped and ran libbacktrace tests on
x86_64-pc-linux-gnu. Committed to mainline.

Ian

libbacktrace: add casts to avoid undefined shifts

Patch from pgerell@github.

* elf.c (elf_fetch_bits): Add casts to avoid potentially shifting
a value farther than its type size.
(elf_fetch_bits_backward): Likewise.
(elf_uncompress_lzma_block): Likewise.
(elf_uncompress_lzma): Likewise.
db23cdf5d7b714d1ca7fe649cdd2a8aced383aac
diff --git a/libbacktrace/elf.c b/libbacktrace/elf.c
index a450f3535c6..d766fa41a61 100644
--- a/libbacktrace/elf.c
+++ b/libbacktrace/elf.c
@@ -1147,7 +1147,10 @@ elf_fetch_bits (const unsigned char **ppin, const 
unsigned char *pinend,
   next = __builtin_bswap32 (next);
 #endif
 #else
-  next = pin[0] | (pin[1] << 8) | (pin[2] << 16) | (pin[3] << 24);
+  next = ((uint32_t)pin[0]
+ | ((uint32_t)pin[1] << 8)
+ | ((uint32_t)pin[2] << 16)
+ | ((uint32_t)pin[3] << 24));
 #endif
 
   val |= (uint64_t)next << bits;
@@ -1198,7 +1201,10 @@ elf_fetch_bits_backward (const unsigned char **ppin,
   next = __builtin_bswap32 (next);
 #endif
 #else
-  next = pin[0] | (pin[1] << 8) | (pin[2] << 16) | (pin[3] << 24);
+  next = ((uint32_t)pin[0]
+ | ((uint32_t)pin[1] << 8)
+ | ((uint32_t)pin[2] << 16)
+ | ((uint32_t)pin[3] << 24));
 #endif
 
   val <<= 32;
@@ -5872,10 +5878,10 @@ elf_uncompress_lzma_block (const unsigned char 
*compressed,
  /* The byte at compressed[off] is ignored for some
 reason.  */
 
- code = ((compressed[off + 1] << 24)
- + (compressed[off + 2] << 16)
- + (compressed[off + 3] << 8)
- + compressed[off + 4]);
+ code = ((uint32_t)(compressed[off + 1] << 24)
+ + ((uint32_t)compressed[off + 2] << 16)
+ + ((uint32_t)compressed[off + 3] << 8)
+ + (uint32_t)compressed[off + 4]);
  off += 5;
 
  /* This is the main LZMA decode loop.  */
@@ -6198,10 +6204,10 @@ elf_uncompress_lzma_block (const unsigned char 
*compressed,
  return 0;
}
   computed_crc = elf_crc32 (0, uncompressed, uncompressed_offset);
-  stream_crc = (compressed[off]
-   | (compressed[off + 1] << 8)
-   | (compressed[off + 2] << 16)
-   | (compressed[off + 3] << 24));
+  stream_crc = ((uint32_t)compressed[off]
+   | ((uint32_t)compressed[off + 1] << 8)
+   | ((uint32_t)compressed[off + 2] << 16)
+   | ((uint32_t)compressed[off + 3] << 24));
   if (computed_crc != stream_crc)
{
  elf_uncompress_failed ();
@@ -6336,10 +6342,10 @@ elf_uncompress_lzma (struct backtrace_state *state,
 
   /* Before that is the size of the index field, which precedes the
  footer.  */
-  index_size = (compressed[offset - 4]
-   | (compressed[offset - 3] << 8)
-   | (compressed[offset - 2] << 16)
-   | (compressed[offset - 1] << 24));
+  index_size = ((size_t)compressed[offset - 4]
+   | ((size_t)compressed[offset - 3] << 8)
+   | ((size_t)compressed[offset - 2] << 16)
+   | ((size_t)compressed[offset - 1] << 24));
   index_size = (index_size + 1) * 4;
   offset -= 4;
 


Re: [PATCH] c++: check_flexarray fixes [PR117516]

2025-01-30 Thread Jason Merrill

On 12/7/24 5:34 AM, Jakub Jelinek wrote:

Hi!

On the pr117516.C testcase check_flexarrays and its helper functions
have exponential complexity, plus it reports the same bug over and over
again in some cases instead of reporting perhaps other bugs.
The functions want to diagnose flexible array member (and strangely [0]
arrays too) followed by some other non-empty or array members in the same
strcuture, or followed by other non-empty or array members in a containing
structure (any of them), or flexible array members/[0] arrays in structures
with no other non-empty members, or those nested in other structures.
Strangely it doesn't complain if flexible array member is in a structure
used in an array.

As can be seeen on e.g. the flexary41.C test, it keeps reporting the
same bug over and over:
flexary41.C:5:24: error: flexible array member ‘A::b’ not at end of ‘struct A’
flexary41.C:5:24: error: flexible array member ‘A::b’ not at end of ‘struct B’
flexary41.C:5:24: error: flexible array member ‘A::b’ not at end of ‘struct C’
flexary41.C:5:24: error: flexible array member ‘A::b’ not at end of ‘struct D’
flexary41.C:13:39: error: flexible array member ‘En’ not at 
end of ‘struct E’
flexary41.C:18:23: error: flexible array member ‘H::t’ not at end of ‘struct K’
flexary41.C:25:36: note: next member ‘int K::ab’ declared here
flexary41.C:25:8: note: in the definition of ‘struct K’
The bug that A::b is followed by A::c is one bug reported 4 times, while it
doesn't report the other bugs, that B::e flexarray is followed by B::f
and that C::h flexarray is followed by C::i.
That is because it always walks all the structures/unions of all the members
and just finds the first flexarray in there.

Now, this has horrible complexity plus it doesn't seem really useful to
users.  So, for cases where a flexible array member is followed by a
non-empty other member in the same structure, the following patch just
reports it once when finalizing that structure, and otherwise just recurses
in structures solely into the last member, so that it can report cases like
struct X { int a; int b[]; };
struct Y { X c; int d; };
or
struct Z { X c; };
i.e. correct use of flexarray in X but following it by another member in Y
or just nesting it (the former is error, the latter pedwarn as before).
By only looking at the last member for structures we get rid of the complexity.

Note, the patch doesn't do anything about unions, I think we still could
spend a lot of time compiling.
struct S { char s; };
union U0 { S a, b; };
union U1 { union U0 a, b; };
union U2 { union U1 a, b; };
...
union U32 { union U31 a, b; };
struct T { union U32 a; int b; };
Not really sure what we could do about that, all the elements are "last"
(but admittedly I haven't studied in detail how the original code worked
in union, there is fmem->after[pun] where pun is whether it is somewhere
inside of a union).  Perhaps in a hash table marking unions which don't have
any flexarrays at the end, nested or not, so that we don't walk them again?
Plus if we find some with flexarray at the end, maybe there is no point
to look other union members?  In any case, I think that is less severe,
because people usually don't nest unions deeply.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-12-07  Jakub Jelinek  

PR c++/117516
* class.cc (field_nonempty_p): Formatting fixes.  Use
integer_zerop instead of tree_int_cst_equal with size_zero_node.
(struct flexmems_t): Change type of first member from tree to bool.
(find_flexarrays): Add nested_p argument.  Change pun argument type
from tree to bool, adjust uses.  Formatting fixes.  If BASE_P or
NESTED_P and T is RECORD_TYPE, start looking only at the last
non-empty or array FIELD_DECL.  Adjust recursive call, set first
if it was a nested call and found an array.
(diagnose_invalid_flexarray, diagnose_flexarrays, check_flexarrays):
Formatting fixes.

* g++.dg/ext/flexary9.C: Expect different wording of one of the
warnings and at a different line.
* g++.dg/ext/flexary19.C: Likewise.
* g++.dg/ext/flexary41.C: New test.
* g++.dg/other/pr117516.C: New test.

--- gcc/cp/class.cc.jj  2024-12-06 11:00:21.418671398 +0100
+++ gcc/cp/class.cc 2024-12-06 13:55:39.074460055 +0100
@@ -141,8 +141,8 @@ static bool accessible_nvdtor_p (tree);
  /* Used by find_flexarrays and related functions.  */
  struct flexmems_t;
  static void diagnose_flexarrays (tree, const flexmems_t *);
-static void find_flexarrays (tree, flexmems_t *, bool = false,
-tree = NULL_TREE, tree = NULL_TREE);
+static void find_flexarrays (tree, flexmems_t *, bool, bool = false,
+bool = false, tree = NULL_TREE);
  static void check_flexarrays (tree, flexmems_t * = NULL, bool = false);
  static void check_bases (tree, int *, int *);
  static void check_bases_and_members (tre

[PATCH] Fortran: host association issue with symbol in COMMON block [PR108454]

2025-01-30 Thread Harald Anlauf

Dear all,

analyzing the the PR (by Gerhard) turned out to two slightly related
issues.  The first one, where a variable in a COMMON block is falsely
resolved to a derived type declared in the host, leads to a false
freeing of the symbol, resulting in memory corruption and ICE.
If we already know that the symbol is in a common block, we may
just skip that interface search.

The other issue is a resolution issue, where the derived type
declared in the host is used in a variable declaration in the
procedure (as type or class), and a variable of the same name
as the derived type is used in a common block but later resolves
to a basic type, without a proper detection of the conflict.
But as this issue is found to be independent of the presence of
a COMMON block, I have opened a separate issue (pr118709) for it.

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

Thanks,
Harald

From 3ab31890f81d24b0231da6d9a5dc29ea316e364d Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Thu, 30 Jan 2025 22:21:19 +0100
Subject: [PATCH] Fortran: host association issue with symbol in COMMON block
 [PR108454]

When resolving a flavorless symbol that is already registered with a COMMON
block, and which neither has the intrinsic, generic, or external attribute,
skip searching among interfaces to avoid false resolution to a derived type
of the same name.

	PR fortran/108454

gcc/fortran/ChangeLog:

	* resolve.cc (resolve_common_blocks): Initialize variable.
	(resolve_symbol): If a symbol is already registered with a COMMON
	block, do not search for an interface with the same name.

gcc/testsuite/ChangeLog:

	* gfortran.dg/common_29.f90: New test.
---
 gcc/fortran/resolve.cc  |  9 ++-
 gcc/testsuite/gfortran.dg/common_29.f90 | 34 +
 2 files changed, 42 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gfortran.dg/common_29.f90

diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc
index 12a623da851..f2eef12199c 100644
--- a/gcc/fortran/resolve.cc
+++ b/gcc/fortran/resolve.cc
@@ -1049,7 +1049,7 @@ resolve_common_vars (gfc_common_head *common_block, bool named_common)
 static void
 resolve_common_blocks (gfc_symtree *common_root)
 {
-  gfc_symbol *sym;
+  gfc_symbol *sym = NULL;
   gfc_gsymbol * gsym;
 
   if (common_root == NULL)
@@ -17693,6 +17693,12 @@ resolve_symbol (gfc_symbol *sym)
 	  && sym->attr.if_source == IFSRC_UNKNOWN
 	  && sym->ts.type == BT_UNKNOWN))
 {
+  /* A symbol in a common block might not have been resolved yet properly.
+	 Do not try to find an interface with the same name.  */
+  if (sym->attr.flavor == FL_UNKNOWN && !sym->attr.intrinsic
+	  && !sym->attr.generic && !sym->attr.external
+	  && sym->attr.in_common)
+	goto skip_interfaces;
 
 /* If we find that a flavorless symbol is an interface in one of the
parent namespaces, find its symtree in this namespace, free the
@@ -17716,6 +17722,7 @@ resolve_symbol (gfc_symbol *sym)
 	}
 	}
 
+skip_interfaces:
   /* Otherwise give it a flavor according to such attributes as
 	 it has.  */
   if (sym->attr.flavor == FL_UNKNOWN && sym->attr.external == 0
diff --git a/gcc/testsuite/gfortran.dg/common_29.f90 b/gcc/testsuite/gfortran.dg/common_29.f90
new file mode 100644
index 000..66f2a18a483
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/common_29.f90
@@ -0,0 +1,34 @@
+! { dg-do compile }
+! PR fortran/108454
+!
+! Contributed by G.Steinmetz
+
+module m
+  type t
+  end type
+contains
+  subroutine s
+common t
+  end
+end
+
+module m2
+  implicit none
+  type t
+  end type
+contains
+  subroutine s
+real :: t
+common /com/ t
+  end
+end
+
+module m3
+  type t
+  end type
+contains
+  subroutine s
+type(t) :: x  ! { dg-error "cannot be host associated at .1." }
+common t  ! { dg-error "incompatible object of the same name" }
+  end
+end
-- 
2.43.0



Re: [PATCH] c++: remove LAMBDA_EXPR_CAPTURES_THIS_P

2025-01-30 Thread Jason Merrill

On 1/29/25 3:49 PM, Patrick Palka wrote:

Built on x86_64-pc-linux-gnu, does this look OK for trunk?


OK.


-- >8 --

This unused accessor is just a simple alias of LAMBDA_EXPR_THIS_CAPTURE
and contrary to the documentation doesn't actually use TREE_LANG_FLAG_0.
Might as well remove it.

gcc/cp/ChangeLog:

* cp-tree.h (LAMBDA_EXPR_CAPTURES_THIS_P): Remove.
---
  gcc/cp/cp-tree.h | 5 -
  1 file changed, 5 deletions(-)

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 21011d0b003..ec976928f5f 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -429,7 +429,6 @@ extern GTY(()) tree cp_global_trees[CPTI_MAX];
LAMBDA_CAPTURE_EXPLICIT_P (in a TREE_LIST in LAMBDA_EXPR_CAPTURE_LIST)
PARENTHESIZED_LIST_P (in the TREE_LIST for a parameter-declaration-list)
CONSTRUCTOR_IS_DIRECT_INIT (in CONSTRUCTOR)
-  LAMBDA_EXPR_CAPTURES_THIS_P (in LAMBDA_EXPR)
DECLTYPE_FOR_LAMBDA_CAPTURE (in DECLTYPE_TYPE)
VEC_INIT_EXPR_IS_CONSTEXPR (in VEC_INIT_EXPR)
DECL_OVERRIDE_P (in FUNCTION_DECL)
@@ -1543,10 +1542,6 @@ enum cp_lambda_default_capture_mode_type {
  #define LAMBDA_EXPR_THIS_CAPTURE(NODE) \
(((struct tree_lambda_expr *)LAMBDA_EXPR_CHECK (NODE))->this_capture)
  
-/* Predicate tracking whether `this' is in the effective capture set.  */

-#define LAMBDA_EXPR_CAPTURES_THIS_P(NODE) \
-  LAMBDA_EXPR_THIS_CAPTURE(NODE)
-
  /* True iff uses of a const variable capture were optimized away.  */
  #define LAMBDA_EXPR_CAPTURE_OPTIMIZED(NODE) \
TREE_LANG_FLAG_2 (LAMBDA_EXPR_CHECK (NODE))




Re: [PATCH] c++: Update const_decl handling after r15-7259 [PR118673].

2025-01-30 Thread Jason Merrill

On 1/30/25 3:19 AM, Iain Sandoe wrote:

Fixes a regression in handling Objective-C++ strings.
Tested on x86_64-darwin21, OK for trunk?


OK.


thanks
Iain

--- 8< ---

Objective-C++ uses CONST_DECLs to hold constant string objects
these should also be treated as mergable lvalues.

PR c++/118673

gcc/cp/ChangeLog:

* tree.cc (lvalue_kind): Mark CONST_DECLs as mergable
when they are also TREE_STATIC.

Signed-off-by: Iain Sandoe 
---
  gcc/cp/tree.cc | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/cp/tree.cc b/gcc/cp/tree.cc
index fb6b2b18e94..79bc74fa2b7 100644
--- a/gcc/cp/tree.cc
+++ b/gcc/cp/tree.cc
@@ -213,7 +213,7 @@ lvalue_kind (const_tree ref)
  && DECL_IN_AGGR_P (ref))
return clk_none;
  
-  if (DECL_MERGEABLE (ref))

+  if (TREE_CODE (ref) == CONST_DECL || DECL_MERGEABLE (ref))
return clk_ordinary | clk_mergeable;
  
/* FALLTHRU */




Re: Honor dump options for C/C++ '-fdump-tree-original'

2025-01-30 Thread Jason Merrill

On 1/21/25 8:48 AM, Thomas Schwinge wrote:

Hi!

On 2025-01-16T15:57:52+0100, I wrote:

I have noticed that '-fdump-tree-original-lineno' for Fortran (for
example) does dump location information, but for C/C++ it does not.
The reason is that Fortran (and other front ends) use code like:

 /* Output the GENERIC tree.  */
 dump_function (TDI_original, fndecl);

..., but 'gcc/c-family/c-gimplify.cc:c_genericize' has some special code
to "Dump the C-specific tree IR", and that (unless 'TDF_RAW') calls
'gcc/c-family/c-pretty-print.cc:print_c_tree', and appears to completely
ignore the 'dump_flags_t'.  (Ignores it in 'c_pretty_printer::statement',
and passes 'TDF_NONE' into 'dump_generic_node'.)

See the attached "Honor dump options for C/C++ '-fdump-tree-original'"
for what I have quickly hacked up.  Does that make any sense to do like
this, and if yes, how much more polish does this need, or if no, how
should we approach this issue otherwise?

(I need this, no surprise, for use in test cases.)


In addition to upcoming use of '-fdump-tree-original-lineno', this patch
actually resolves XFAILs for 'c-c++-common/goacc/pr92793-1.c', which had
gotten added as part of commit fa410314ec94c9df2ad270c1917adc51f9147c2c
"[OpenACC] Elaborate testcases that verify column location information 
[PR92793]".

With 'c-c++-common/goacc/pr92793-1.c' un-XFAILed, is it OK, for now, to
push the attached "Honor dump options for C/C++ '-fdump-tree-original'"?
I've 'make check'ed a number of different GCC targets/configurations.


OK in a week if no other comments.

Jason



[PATCH] s390: Fix up *vec_cmpgt{,u}_nocc_emu splitters [PR118696]

2025-01-30 Thread Jakub Jelinek
Hi!

The following testcase is miscompiled on s390x-linux with e.g. -march=z13
(both -O0 and -O2) starting with r15-7053.
The problem is in the splitters which emulate TImode/V1TImode GT and GTU
comparisons.
For GT we want to do
(ior (gt (hi op1) (hi op2))
 (and (eq (hi op1) (hi op2)) (gtu (lo op1) (lo op2
and for GTU similarly except for gtu instead of gt in there.
Now, the splitter emulation is using V2DImode comparisons where on s390x
the hi part is in the first element of the vector, lo part in the second,
and for the gtu case it swaps the elements of the vector.
So, we get the right result in the first element of the result vector.
But vrepg was then broadcasting the second element of the result vector
rather than the first, and the value of the second element of the vector
is instead
(ior (gt (lo op1) (lo op2))
 (and (eq (lo op1) (lo op2)) (gtu (hi op1) (hi op2
so something not really usable for the emulated comparison.

The following patch fixes that.  The testcase tries to test behavior of
double-word smin/smax/umin/umax with various cases of the halves of both
operands (one that is sometimes EQ, sometimes GT, sometimes LT, sometimes
GTU, sometimes LTU).

Stefan has successfully bootstrapped/regtested this on s390x-linux (thanks
for that; I'm still in stage3 of LTO profiledbootstrap), ok for trunk?

2025-01-30  Jakub Jelinek  
Stefan Schulze Frielinghaus  

PR target/118696
* config/s390/vector.md (*vec_cmpgt_nocc_emu,
*vec_cmpgtu_nocc_emu): Duplicate the first rather than
second V2DImode element.

* gcc.dg/pr118696.c: New test.
* gcc.target/s390/vector/pr118696.c: New test.
* gcc.target/s390/vector/vec-abs-emu.c: Expect vrepg with 0 as last
operand rather than 1.
* gcc.target/s390/vector/vec-max-emu.c: Likewise.
* gcc.target/s390/vector/vec-min-emu.c: Likewise.

--- gcc/config/s390/vector.md.jj2025-01-24 17:37:48.987458141 +0100
+++ gcc/config/s390/vector.md   2025-01-30 09:10:53.413542300 +0100
@@ -2166,7 +2166,7 @@ (define_insn_and_split "*vec_cmpgt
(vec_duplicate:V2DI
 (vec_select:DI
  (match_dup 4)
- (parallel [(const_int 1)]
+ (parallel [(const_int 0)]
(set (match_dup 0)
(subreg: (match_dup 4) 0))]
 {
@@ -2198,7 +2198,7 @@ (define_insn_and_split "*vec_cmpgtu (match_dup 4) 0))]
 {
--- gcc/testsuite/gcc.dg/pr118696.c.jj  2025-01-30 09:52:52.064679434 +0100
+++ gcc/testsuite/gcc.dg/pr118696.c 2025-01-30 09:52:33.430936447 +0100
@@ -0,0 +1,131 @@
+/* PR target/118696 */
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+#if __CHAR_BIT__ == 8
+#if __SIZEOF_INT128__ == 16 && __SIZEOF_LONG_LONG__ == 8
+#define D __int128
+#define S long long
+#define M 0x8000ULL
+#define C 64
+#elif __SIZEOF_LONG_LONG__ == 8 && __SIZEOF_INT__ == 4
+#define D long long
+#define S int
+#define M 0x8000U
+#define C 32
+#endif
+#endif
+
+#ifdef D
+static inline D
+combine (unsigned S x, unsigned S y)
+{
+  return (unsigned D) x << C | y;
+}
+
+__attribute__((noipa)) D
+smin (D x, D y)
+{
+  return x < y ? x : y;
+}
+
+__attribute__((noipa)) D
+smax (D x, D y)
+{
+  return x > y ? x : y;
+}
+
+__attribute__((noipa)) unsigned D
+umin (unsigned D x, unsigned D y)
+{
+  return x < y ? x : y;
+}
+
+__attribute__((noipa)) unsigned D
+umax (unsigned D x, unsigned D y)
+{
+  return x > y ? x : y;
+}
+#endif
+
+int
+main ()
+{
+#ifdef D
+  unsigned S vals[] = {
+0, 12, 42, M, M | 12, M | 42
+  };
+  unsigned char expected[] = {
+4,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
+2,2,2,2,0,4,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,2,2,2,2,2,2,2,2,2,2,
+2,2,2,2,2,2,2,2,0,0,4,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,2,2,2,2,2,2,
+2,2,2,2,2,2,2,2,2,2,2,2,0,0,0,4,3,3,3,3,3,3,3,3,3,3,3,3,3,3,2,2,
+2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,0,0,0,0,4,3,3,3,3,3,3,3,3,3,3,3,
+3,3,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,0,0,0,0,0,4,3,3,3,3,3,3,
+3,3,3,3,3,3,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,0,0,0,0,0,0,4,3,
+3,3,3,3,3,3,3,3,3,3,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,0,0,0,0,
+0,0,0,4,3,3,3,3,3,3,3,3,3,3,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
+0,0,0,0,0,0,0,0,4,3,3,3,3,3,3,3,3,3,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
+2,2,2,2,0,0,0,0,0,0,0,0,0,4,3,3,3,3,3,3,3,3,2,2,2,2,2,2,2,2,2,2,
+2,2,2,2,2,2,2,2,0,0,0,0,0,0,0,0,0,0,4,3,3,3,3,3,3,3,2,2,2,2,2,2,
+2,2,2,2,2,2,2,2,2,2,2,2,0,0,0,0,0,0,0,0,0,0,0,4,3,3,3,3,3,3,2,2,
+2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,0,0,0,0,0,0,0,0,0,0,0,0,4,3,3,3,
+3,3,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,0,0,0,0,0,0,0,0,0,0,0,0,
+0,4,3,3,3,3,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,0,0,0,0,0,0,0,0,
+0,0,0,0,0,0,4,3,3,3,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,0,0,0,0,
+0,0,0,0,0,0,0,0,0,0,0,4,3,3,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
+0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4,3,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
+2,2,2,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,4,2,2,2,2,2,2,2,2,2,2,
+2,2,2,2,2,2,2,2,1,1,1,1,1,

[PATCH] middle-end/113762 - document bits of TYPE_ADDR_SPACE

2025-01-30 Thread Richard Biener
This documents some constraints on TYPE_ADDR_SPACE in tcc_reference
trees.

Documentation around types seems to be mostly non-existient ... but
since I opened this bug I might as well try to fix it (minimally).

OK?

PR middle-end/113762
* generic.texi (Types): Document TYPE_ADDR_SPACE.
(Storage References): Document where to get at the address-space
accessed.
---
 gcc/doc/generic.texi | 8 
 1 file changed, 8 insertions(+)

diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi
index d4ac580a7a8..f0ed2698b67 100644
--- a/gcc/doc/generic.texi
+++ b/gcc/doc/generic.texi
@@ -345,6 +345,9 @@ The following functions and macros deal with 
cv-qualification of types:
 This macro returns the unqualified version of a type.  It may be applied
 to an unqualified type, but it is not always the identity function in
 that case.
+
+@item TYPE_ADDR_SPACE
+This macro returns the address space objects of this type reside in.
 @end ftable
 
 A few other macros and functions are usable with all types:
@@ -1290,6 +1293,11 @@ The type of the node specifies the alignment of the 
access.
 
 @end table
 
+The referenced address-space is available as @code{TYPE_ADDR_SPACE}
+of the reference type and should be consistent along the types of a
+reference composed of nested storage reference trees.
+
+
 @node Unary and Binary Expressions
 @subsection Unary and Binary Expressions
 @tindex NEGATE_EXPR
-- 
2.43.0


Re: [PATCH v2] get source line for diagnostic from preprocessed file [PR preprocessor/79106]

2025-01-30 Thread David Malcolm
On Thu, 2025-01-30 at 12:01 +, Bader, Lucas wrote:
> Thanks a lot for your detailed feedback!
> I will rework my patch, especially to make the
> get_source_line_preprocessed function
> more readable and more efficient.
> 
> Some comments in the mean time:
> 
> > This may sound silly, but please use "num" rather than "no" as an
> > abbreviation for "number": to my under-caffeinated brain "no" reads
> > as
> > the opposite of "yes".
> 
> I fully agree, seems like 5 years ago my naming taste was a bit off.
> 
> > What about buffer[1]?  Does this have to be a ' '?
> ...
> > Don't we also need a trailing '"' and a newline?
> 
> Ack, doesn't hurt to check buffer[1]. The newline might not come
> directly after the trailing '"' however, as
> line markers can contain additional flags after the source file name.
> If we have:
> 
> # (\d+) "(.+)"
> 
> it should be safe to assume it's a line marker.
> 
> > Has this code been tested on large examples?  It looks to me like
> > this
> function does a linear scan starting at line 1 every time in the
> .i/.ii
> file upwards looking for markers of interest;
> 
> Good points, I am looking into tracking line markers in the
> file_cache_slot.
> I am currently testing a version with a simple vector for
> memoization.
> 
> FWIW we are using this patch in our internal gcc fork for building a
> pretty substantial C++ codebase and
> have not seen build time regressions directly linked to it.
> However, testing it against a fairly large .ii file that produces
> many warnings from that codebase (~400k LoC) shows 
> significant speedup with the tracking version:
> 
> ~/gcc-dev/bin/gcc TEST.cpp.ii  62.58s user 1.01s system 99% cpu
> 1:03.66 total # path as is
> ~/gcc-dev/bin/gcc TEST.cpp.ii  32.80s user 0.95s system 99% cpu
> 33.749 total # with line marker memoization

Interesting - thanks.  Do you have a profile of the hot spots?  I
wonder to what extent Andi's recent patches would help.

> 
> > This function is rather complicated
> 
> I agree. To be honest, looking at it after a couple of years was also
> not very pleasant.
> 
> > Something else that occurred to me: assuming the file in question
> > has
> gone through libcpp, we've already parsed the line markers, and
> line_table will have a series of line map instances representing the
> ranges in question.
> 
> According to some comments I found, it seems that for e.g the C
> front-end, it can happen that we start emitting diagnostics
> before the line map has seen the end of the file, so it could
> probably only act as a hint?

Yes - though I think for any location we're likely to warn about, the
line maps instance will have line map instances for all of the
markers;.  For reference the code in libcpp to parse the line markers
is in libcpp/directives.cc:::do_linemarker

But this will only work for cases where libcpp is used to lex the file,
which is the case for frontends that support .i/.ii files, but there
are use-cases of input.cc that don't do that, e.g. libgdiagnostics.

So though I thought this alternate approach was worth mentioning, I
think the overall way your patch does it is probably the one we should
follow, as it avoids tight coupling with libcpp.

Thanks
Dave



RE: [PATCH v2] get source line for diagnostic from preprocessed file [PR preprocessor/79106]

2025-01-30 Thread Bader, Lucas
Thanks a lot for your detailed feedback!
I will rework my patch, especially to make the get_source_line_preprocessed 
function
more readable and more efficient.

Some comments in the mean time:

> This may sound silly, but please use "num" rather than "no" as an
> abbreviation for "number": to my under-caffeinated brain "no" reads
> as
> the opposite of "yes".

I fully agree, seems like 5 years ago my naming taste was a bit off.

> What about buffer[1]?  Does this have to be a ' '?
...
> Don't we also need a trailing '"' and a newline?

Ack, doesn't hurt to check buffer[1]. The newline might not come directly after 
the trailing '"' however, as
line markers can contain additional flags after the source file name. If we 
have:

# (\d+) "(.+)"

it should be safe to assume it's a line marker.

> Has this code been tested on large examples?  It looks to me like this
function does a linear scan starting at line 1 every time in the .i/.ii
file upwards looking for markers of interest;

Good points, I am looking into tracking line markers in the file_cache_slot.
I am currently testing a version with a simple vector for memoization.

FWIW we are using this patch in our internal gcc fork for building a pretty 
substantial C++ codebase and
have not seen build time regressions directly linked to it.
However, testing it against a fairly large .ii file that produces many warnings 
from that codebase (~400k LoC) shows 
significant speedup with the tracking version:

~/gcc-dev/bin/gcc TEST.cpp.ii  62.58s user 1.01s system 99% cpu 1:03.66 total # 
path as is
~/gcc-dev/bin/gcc TEST.cpp.ii  32.80s user 0.95s system 99% cpu 33.749 total # 
with line marker memoization

> This function is rather complicated

I agree. To be honest, looking at it after a couple of years was also not very 
pleasant.

> Something else that occurred to me: assuming the file in question has
gone through libcpp, we've already parsed the line markers, and
line_table will have a series of line map instances representing the
ranges in question.

According to some comments I found, it seems that for e.g the C front-end, it 
can happen that we start emitting diagnostics
before the line map has seen the end of the file, so it could probably only act 
as a hint?

Best
Lucas

-Original Message-
From: David Malcolm  
Sent: Wednesday, 29 January 2025 16:17
To: Bader, Lucas ; gcc-patches@gcc.gnu.org
Subject: Re: [PATCH v2] get source line for diagnostic from preprocessed file 
[PR preprocessor/79106]

On Wed, 2025-01-29 at 09:16 -0500, David Malcolm wrote:
> On Wed, 2025-01-29 at 13:05 +, Bader, Lucas wrote:
> > Hi,
> > 
> > as discussed, I rebased and tested my patch against current master.
> > Additionally, it now has the Signed-off-by tag.
> > Looking forward to your comments.
> > 
> > Best
> > Lucas
> 
> Thanks for the updated patch.
> 
> Various notes inline throughout below...
> 
> > 
> > 
> > 
> > Within a compile cluster, only the preprocessed output of GCC is
> > transferred
> > to remote nodes for compilation. When GCC produces advanced
> > diagnostics
> > (with -fdiagnostics-show-caret), e.g. prints out the affected
> > source
> > line and fixit hints, it attempts to read the source file again,
> > even
> > when
> > compiling a preprocessed file (-fpreprocessed). This leads to wrong
> > diagnostics when building with a compile cluster, or, more
> > generally,
> > when changing or deleting the original source file.
> > 
> > This patch alters GCC to read from the preprocessed file instead by
> > calculating the corresponding source line. This behavior is
> > consistent
> > with clang.
> > 
> > gcc/c-family/ChangeLog:
> > 
> > * c-opts.cc (c_common_handle_option): pass -fpreprocessed
> > option value to global diagnostic configuration
> > 
> > gcc/ChangeLog:
> > 
> > * diagnostic-show-locus.cc
> > (layout::calculate_x_offset_display): read line from source or
> > preprocessed
> > file based on -fpreprocessed value
> > (source_line::source_line): read line from source or
> > preprocessed
> > file based on -fpreprocessed value
> > (layout_printer::print_line): read line from source or
> > preprocessed
> > file based on -fpreprocessed value
> > 
> > * diagnostic.cc (diagnostic_context::initialize):
> > initialize
> > new members
> > * diagnostic.h: new members for reading
> > source lines from preprocessed files
> > 
> > * input.cc (file_cache::get_source_line_preprocessed): new
> > function
> > to read source lines from preprocessed files
> > (test_reading_source_line_preprocessed): new test case
> > (input_cc_tests): execute new test case
> > * input.h (class file_cache): add new member function
> > 
> > * opts-global.cc (read_cmdline_options): pass input
> > filename
> > to global
> > diagnostic context
> 
> 
> Some minor ChangeLog nits:
> * you should add "PR preprocessor/79106" (without quotes) to the
> start
> of each ChangeLog entry; have a look at the existing C

Re: [PATCH] middle-end/118695 - missed misalign handling in MEM_REF expansion

2025-01-30 Thread Richard Biener
On Thu, 30 Jan 2025, Jakub Jelinek wrote:

> On Thu, Jan 30, 2025 at 11:34:16AM +0100, Richard Biener wrote:
> > When MEM_REF expansion of a non-MEM falls back to a stack temporary
> > we fail to handle the case where the offset adjusted reference to
> > the temporary is not aligned according to the requirement of the
> > mode.  We have to go through bitfield extraction or movmisalign
> > in this case.  Fortunately there's a helper for this.
> > 
> > This fixes an ICE observed on arm which has sanity checks in its
> > move patterns for this.
> > 
> > Bootstrap and regtest running on x86_64-unknown-linux-gnu.
> > 
> > OK?
> > 
> > PR middle-end/118695
> > * expr.cc (expand_expr_real_1): When expanding a MEM_REF
> > to a non-MEM by committing it to a stack temporary make
> > sure to handle misaligned accesses correctly.
> > 
> > * gcc.dg/pr118695.c: New testcase.
> > ---
> >  gcc/expr.cc | 11 +++
> >  gcc/testsuite/gcc.dg/pr118695.c |  9 +
> >  2 files changed, 20 insertions(+)
> >  create mode 100644 gcc/testsuite/gcc.dg/pr118695.c
> > 
> > diff --git a/gcc/expr.cc b/gcc/expr.cc
> > index 10467f82c0d..0136678a40b 100644
> > --- a/gcc/expr.cc
> > +++ b/gcc/expr.cc
> > @@ -11796,6 +11796,7 @@ expand_expr_real_1 (tree exp, rtx target, 
> > machine_mode tmode,
> > && known_eq (GET_MODE_BITSIZE (DECL_MODE (base)), type_size))
> >   return expand_expr (build1 (VIEW_CONVERT_EXPR, type, base),
> >   target, tmode, modifier);
> > +   unsigned align;
> > if (TYPE_MODE (type) == BLKmode || maybe_lt (offset, 0))
> >   {
> > temp = assign_stack_temp (DECL_MODE (base),
> > @@ -11804,6 +11805,16 @@ expand_expr_real_1 (tree exp, rtx target, 
> > machine_mode tmode,
> > temp = adjust_address (temp, TYPE_MODE (type), offset);
> > if (TYPE_MODE (type) == BLKmode)
> >   set_mem_size (temp, int_size_in_bytes (type));
> > +   /* When the original ref was misaligned so will be the
> > +  access to the stack temporary.  Not all targets handle
> > +  this correctly, some will ICE in sanity checking.
> > +  Handle this by doing bitfield extraction when necessary.  */
> > +   else if ((align = get_object_alignment (exp))
> > +< GET_MODE_ALIGNMENT (TYPE_MODE (type)))
> > + temp = expand_misaligned_mem_ref
> > +  (temp, TYPE_MODE (type), unsignedp, align,
> > +   modifier == EXPAND_STACK_PARM ? NULL_RTX : target,
> > +   NULL);
> 
> I don't like very much this kind of formatting (function name on one line,
> open paren on another).  Of course there are cases where there is no other 
> choice,
> but in this case
>temp
>  = expand_misaligned_mem_ref (temp, TYPE_MODE (type),
>   unsignedp, align,
>   modifier == EXPAND_STACK_PARM
>   ? NULL_RTX : target, NULL);
> doesn't look too bad.
> 
> Ok for trunk either way.

Pushed with formatting as you suggested.

Richard.


Re: [patch,avr] Add built-ins for strlen of a string in some address-space

2025-01-30 Thread Georg-Johann Lay

For easier review, I broke that patch into two parts:

One for the strlen built-ins, and one to handle built-ins
that are only available in C.

Delta is the same.

Johann


Am 30.01.25 um 11:42 schrieb Georg-Johann Lay:

AVR: Provide built-ins for strlen where the string lives in some AS.

This patch adds built-in functions __builtin_avr_strlen_flash,
__builtin_avr_strlen_flashx and __builtin_avr_strlen_memx.
Purpose is that higher-level functions can use __builtin_constant_p
on strlen without raising a diagnostic due to -Waddr-space-convert.

gcc/
 * config/avr/builtins.def (AVR_FIRST_C_ONLY_BUILTIN_ID): New macro.
 (STRLEN_FLASH, STRLEN_FLASHX, STRLEN_MEMX): New DEF_BUILTIN's.
 * config/avr/avr-protos.h (avr_builtin_C_only_p): New.
 * config/avr/avr.cc (avr_builtin_C_only_p): New function.
 (avr_ftype_strlen): New static function.
 (avr_init_builtins): Only provide a built-in when it is supported
 for the compiled language.
 : Provide
 new fntypes.
 (avr_fold_builtin) [AVR_BUILTIN_STRLEN_FLASH]
 [AVR_BUILTIN_STRLEN_FLASHX, AVR_BUILTIN_STRLEN_MEMX]: Fold if
 possible.
 * config/avr/avr-c.cc (avr_cpu_cpp_builtins): Only define the
 __BUILTIN_AVR_ build-in defines when a built-in function
 is available for the compiled language.
 * doc/extend.texi (AVR Built-in Functions): Document
 __builtin_avr_strlen_flash, __builtin_avr_strlen_flashx,
 __builtin_avr_strlen_memx.
libgcc/
 * config/avr/t-avr (LIB1ASMFUNCS): Add _strlen_memx.
 * config/avr/lib1funcs.S : Implement.AVR: Only provide a built-in when it is available for the language.

Some built-ins are not available for C++ since they are using
named address-spaces or fixed-point types.

gcc/
* config/avr/builtins.def (AVR_FIRST_C_ONLY_BUILTIN_ID): New macro.
* config/avr/avr-protos.h (avr_builtin_C_only_p): New.
* config/avr/avr.cc (avr_builtin_C_only_p): New function.
(avr_init_builtins): Only provide a built-in when it is supported
for the compiled language.
* config/avr/avr-c.cc (avr_cpu_cpp_builtins): Only define the
__BUILTIN_AVR_ build-in defines when a built-in function
is available for the compiled language.
* doc/extend.texi (AVR Built-in Functions): Add a note that
following built-ins are supported for only for GNU-C.

diff --git a/gcc/config/avr/avr-c.cc b/gcc/config/avr/avr-c.cc
index 53f15f2be7b..ced541a9ddc 100644
--- a/gcc/config/avr/avr-c.cc
+++ b/gcc/config/avr/avr-c.cc
@@ -500,7 +500,9 @@ avr_cpu_cpp_builtins (cpp_reader *pfile)
  not a specific builtin is available. */
 
 #define DEF_BUILTIN(NAME, N_ARGS, TYPE, CODE, LIBNAME, ATTRS) \
-  cpp_define (pfile, "__BUILTIN_AVR_" #NAME);
+  if (lang_GNU_C ()	  \
+  || ! avr_builtin_C_only_p (AVR_BUILTIN_ ## NAME))	  \
+cpp_define (pfile, "__BUILTIN_AVR_" #NAME);
 #include "builtins.def"
 #undef DEF_BUILTIN
 
diff --git a/gcc/config/avr/avr-protos.h b/gcc/config/avr/avr-protos.h
index 83137c7f6f6..72eb5163cce 100644
--- a/gcc/config/avr/avr-protos.h
+++ b/gcc/config/avr/avr-protos.h
@@ -21,6 +21,7 @@
 
 extern bool avr_function_arg_regno_p (int r);
 extern void avr_cpu_cpp_builtins (cpp_reader * pfile);
+extern bool avr_builtin_C_only_p (unsigned id);
 extern enum reg_class avr_regno_reg_class (int r);
 extern void asm_globalize_label (FILE *file, const char *name);
 extern void avr_adjust_reg_alloc_order (void);
diff --git a/gcc/config/avr/avr.cc b/gcc/config/avr/avr.cc
index 656d3e7389b..de9216e70ce 100644
--- a/gcc/config/avr/avr.cc
+++ b/gcc/config/avr/avr.cc
@@ -15689,6 +15689,17 @@ avr_bdesc[AVR_BUILTIN_COUNT] =
   };
 
 
+/* Some of our built-in function are available for GNU-C only:
+   - Built-ins that use named address-spaces.
+   - Built-ins that use fixed-point types.  */
+
+bool
+avr_builtin_C_only_p (unsigned id)
+{
+  return id >= AVR_FIRST_C_ONLY_BUILTIN_ID;
+}
+
+
 /* Implement `TARGET_BUILTIN_DECL'.  */
 
 static tree
@@ -15885,8 +15896,15 @@ avr_init_builtins (void)
   tree attr_const = tree_cons (get_identifier ("const"), NULL, NULL);
 
 #define DEF_BUILTIN(NAME, N_ARGS, TYPE, CODE, LIBNAME, ATTRS)		\
-  {	\
+  do {	\
 int id = AVR_BUILTIN_ ## NAME;	\
+if (! lang_GNU_C ()			\
+	&& avr_builtin_C_only_p (id))	\
+  {	\
+	avr_bdesc[id].fndecl = NULL_TREE;\
+	break;\
+  }	\
+	\
 const char *Name = "__builtin_avr_" #NAME;\
 char *name = (char *) alloca (1 + strlen (Name));			\
 	\
@@ -15894,7 +15912,7 @@ avr_init_builtins (void)
 avr_bdesc[id].fndecl		\
   = add_builtin_function (avr_tolower (name, Name), TYPE, id,	\
 			  BUILT_IN_MD, LIBNAME, ATTRS);		\
-  }
+  } while (0);
 #include "builtins.def"
 #undef DEF_BUILTIN
 
diff --git a/gcc/config/avr/builtins.def b/gcc/config/avr/builtins.def
index 61dbc3a6c1b..ad75fe9c267 100

[committed] libstdc++: Use safe integer comparisons in std::latch [PR98749]

2025-01-30 Thread Jonathan Wakely
The std::latch::max() function assumes that the returned value can be
represented by ptrdiff_t, which is true when __platform_wait_t is int
(e.g. on Linux) but not when it's unsigned long, which is the case for
most other 64-bit targets. We should use the smaller of PTRDIFF_MAX and
std::numeric_limits<__platform_wait_t>::max(). Use std::cmp_less to do a
safe comparison that works for all types. We can also use std::cmp_less
and std::cmp_equal in std::latch::count_down so that we don't need to
deal with comparisons between signed and unsigned.

Also add a missing precondition check to constructor and fix the
existing check in count_down which was duplicated by mistake.

libstdc++-v3/ChangeLog:

PR libstdc++/98749
* include/std/latch (latch::max()): Ensure the return value is
representable as the return type.
(latch::latch(ptrdiff_t)): Add assertion.
(latch::count_down): Fix copy & pasted duplicate assertion. Use
std::cmp_equal to compare __platform_wait_t and ptrdiff_t
values.
(latch::_M_a): Use defined constant for alignment.
* testsuite/30_threads/latch/1.cc: Check max(). Check constant
initialization works for values in the valid range. Check
alignment.
---

Tested x86_64-linux and sparcv9-solaris (where max() was returning -1 so
the new checks i 30_threads/latch/1.cc failed).

Pushed to trunk. This seems safe to backport to gcc-14 too.

It would be nice if the safe integer comparison functions (cmp_less etc)
were in their own  header so that  didn't need to
include all of , which pulls in .

 libstdc++-v3/include/std/latch   | 32 +---
 libstdc++-v3/testsuite/30_threads/latch/1.cc | 12 +++-
 2 files changed, 32 insertions(+), 12 deletions(-)

diff --git a/libstdc++-v3/include/std/latch b/libstdc++-v3/include/std/latch
index 9220580613d2..cf648545629d 100644
--- a/libstdc++-v3/include/std/latch
+++ b/libstdc++-v3/include/std/latch
@@ -41,6 +41,7 @@
 #ifdef __cpp_lib_latch // C++ >= 20 && atomic_wait
 #include 
 #include 
+#include  // cmp_equal, cmp_less_equal, etc.
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
@@ -51,24 +52,34 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   public:
 static constexpr ptrdiff_t
 max() noexcept
-{ return __gnu_cxx::__int_traits<__detail::__platform_wait_t>::__max; }
+{
+  using __gnu_cxx::__int_traits;
+  constexpr auto __max = __int_traits<__detail::__platform_wait_t>::__max;
+  if constexpr (std::cmp_less(__max, __PTRDIFF_MAX__))
+   return __max;
+  return __PTRDIFF_MAX__;
+}
 
-constexpr explicit latch(ptrdiff_t __expected) noexcept
-  : _M_a(__expected) { }
+constexpr explicit
+latch(ptrdiff_t __expected) noexcept
+: _M_a(__expected)
+{ __glibcxx_assert(__expected >= 0 && __expected <= max()); }
 
 ~latch() = default;
+
 latch(const latch&) = delete;
 latch& operator=(const latch&) = delete;
 
 _GLIBCXX_ALWAYS_INLINE void
 count_down(ptrdiff_t __update = 1)
 {
-  __glibcxx_assert(__update >= 0);
-  auto const __old = __atomic_impl::fetch_sub(&_M_a,
-   __update, memory_order::release);
-  __glibcxx_assert(__update >= 0);
-  if (__old == static_cast<__detail::__platform_wait_t>(__update))
+  __glibcxx_assert(__update >= 0 && __update <= max());
+  auto const __old = __atomic_impl::fetch_sub(&_M_a, __update,
+ memory_order::release);
+  if (std::cmp_equal(__old, __update))
__atomic_impl::notify_all(&_M_a);
+  else
+   __glibcxx_assert(std::cmp_less(__update, __old));
 }
 
 _GLIBCXX_ALWAYS_INLINE bool
@@ -90,9 +101,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 }
 
   private:
-// This alignas is not redundant, it increases the alignment for
-// long long on x86.
-alignas(__alignof__(__detail::__platform_wait_t)) 
__detail::__platform_wait_t _M_a;
+alignas(__detail::__platform_wait_alignment)
+  __detail::__platform_wait_t _M_a;
   };
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace
diff --git a/libstdc++-v3/testsuite/30_threads/latch/1.cc 
b/libstdc++-v3/testsuite/30_threads/latch/1.cc
index 20c40254c0e6..29984cbf354e 100644
--- a/libstdc++-v3/testsuite/30_threads/latch/1.cc
+++ b/libstdc++-v3/testsuite/30_threads/latch/1.cc
@@ -22,6 +22,16 @@
 
 #ifndef __cpp_lib_latch
 # error "Feature-test macro for latch missing in "
-#elif __cpp_lib_latch!= 201907L
+#elif __cpp_lib_latch != 201907L
 # error "Feature-test macro for latch has wrong value in "
 #endif
+
+static_assert(std::latch::max() > 0);
+
+constinit std::latch l0(0);
+constinit std::latch l1(1);
+constinit std::latch l2(std::latch::max());
+
+#ifdef _GLIBCXX_RELEASE
+static_assert(alignof(std::latch) == std::__detail::__platform_wait_alignment);
+#endif
-- 
2.48.1



Re: [PATCH] s390: Fix up *vec_cmpgt{,u}_nocc_emu splitters [PR118696]

2025-01-30 Thread Stefan Schulze Frielinghaus
On Thu, Jan 30, 2025 at 05:11:54PM +0100, Jakub Jelinek wrote:
> Hi!
> 
> The following testcase is miscompiled on s390x-linux with e.g. -march=z13
> (both -O0 and -O2) starting with r15-7053.
> The problem is in the splitters which emulate TImode/V1TImode GT and GTU
> comparisons.
> For GT we want to do
> (ior (gt (hi op1) (hi op2))
>  (and (eq (hi op1) (hi op2)) (gtu (lo op1) (lo op2
> and for GTU similarly except for gtu instead of gt in there.
> Now, the splitter emulation is using V2DImode comparisons where on s390x
> the hi part is in the first element of the vector, lo part in the second,
> and for the gtu case it swaps the elements of the vector.
> So, we get the right result in the first element of the result vector.
> But vrepg was then broadcasting the second element of the result vector
> rather than the first, and the value of the second element of the vector
> is instead
> (ior (gt (lo op1) (lo op2))
>  (and (eq (lo op1) (lo op2)) (gtu (hi op1) (hi op2
> so something not really usable for the emulated comparison.
> 
> The following patch fixes that.  The testcase tries to test behavior of
> double-word smin/smax/umin/umax with various cases of the halves of both
> operands (one that is sometimes EQ, sometimes GT, sometimes LT, sometimes
> GTU, sometimes LTU).
> 
> Stefan has successfully bootstrapped/regtested this on s390x-linux (thanks
> for that; I'm still in stage3 of LTO profiledbootstrap), ok for trunk?

Ok and thanks again for fixing this so quickly!

Cheers,
Stefan

> 
> 2025-01-30  Jakub Jelinek  
>   Stefan Schulze Frielinghaus  
> 
>   PR target/118696
>   * config/s390/vector.md (*vec_cmpgt_nocc_emu,
>   *vec_cmpgtu_nocc_emu): Duplicate the first rather than
>   second V2DImode element.
> 
>   * gcc.dg/pr118696.c: New test.
>   * gcc.target/s390/vector/pr118696.c: New test.
>   * gcc.target/s390/vector/vec-abs-emu.c: Expect vrepg with 0 as last
>   operand rather than 1.
>   * gcc.target/s390/vector/vec-max-emu.c: Likewise.
>   * gcc.target/s390/vector/vec-min-emu.c: Likewise.
> 
> --- gcc/config/s390/vector.md.jj  2025-01-24 17:37:48.987458141 +0100
> +++ gcc/config/s390/vector.md 2025-01-30 09:10:53.413542300 +0100
> @@ -2166,7 +2166,7 @@ (define_insn_and_split "*vec_cmpgt
>   (vec_duplicate:V2DI
>(vec_select:DI
> (match_dup 4)
> -   (parallel [(const_int 1)]
> +   (parallel [(const_int 0)]
> (set (match_dup 0)
>   (subreg: (match_dup 4) 0))]
>  {
> @@ -2198,7 +2198,7 @@ (define_insn_and_split "*vec_cmpgtu   (vec_duplicate:V2DI
>(vec_select:DI
> (match_dup 4)
> -   (parallel [(const_int 1)]
> +   (parallel [(const_int 0)]
> (set (match_dup 0)
>   (subreg: (match_dup 4) 0))]
>  {
> --- gcc/testsuite/gcc.dg/pr118696.c.jj2025-01-30 09:52:52.064679434 
> +0100
> +++ gcc/testsuite/gcc.dg/pr118696.c   2025-01-30 09:52:33.430936447 +0100
> @@ -0,0 +1,131 @@
> +/* PR target/118696 */
> +/* { dg-do run } */
> +/* { dg-options "-O2" } */
> +
> +#if __CHAR_BIT__ == 8
> +#if __SIZEOF_INT128__ == 16 && __SIZEOF_LONG_LONG__ == 8
> +#define D __int128
> +#define S long long
> +#define M 0x8000ULL
> +#define C 64
> +#elif __SIZEOF_LONG_LONG__ == 8 && __SIZEOF_INT__ == 4
> +#define D long long
> +#define S int
> +#define M 0x8000U
> +#define C 32
> +#endif
> +#endif
> +
> +#ifdef D
> +static inline D
> +combine (unsigned S x, unsigned S y)
> +{
> +  return (unsigned D) x << C | y;
> +}
> +
> +__attribute__((noipa)) D
> +smin (D x, D y)
> +{
> +  return x < y ? x : y;
> +}
> +
> +__attribute__((noipa)) D
> +smax (D x, D y)
> +{
> +  return x > y ? x : y;
> +}
> +
> +__attribute__((noipa)) unsigned D
> +umin (unsigned D x, unsigned D y)
> +{
> +  return x < y ? x : y;
> +}
> +
> +__attribute__((noipa)) unsigned D
> +umax (unsigned D x, unsigned D y)
> +{
> +  return x > y ? x : y;
> +}
> +#endif
> +
> +int
> +main ()
> +{
> +#ifdef D
> +  unsigned S vals[] = {
> +0, 12, 42, M, M | 12, M | 42
> +  };
> +  unsigned char expected[] = {
> +4,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
> +2,2,2,2,0,4,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,2,2,2,2,2,2,2,2,2,2,
> +2,2,2,2,2,2,2,2,0,0,4,3,3,3,3,3,3,3,3,3,3,3,3,3,3,3,2,2,2,2,2,2,
> +2,2,2,2,2,2,2,2,2,2,2,2,0,0,0,4,3,3,3,3,3,3,3,3,3,3,3,3,3,3,2,2,
> +2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,0,0,0,0,4,3,3,3,3,3,3,3,3,3,3,3,
> +3,3,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,0,0,0,0,0,4,3,3,3,3,3,3,
> +3,3,3,3,3,3,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,0,0,0,0,0,0,4,3,
> +3,3,3,3,3,3,3,3,3,3,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,0,0,0,0,
> +0,0,0,4,3,3,3,3,3,3,3,3,3,3,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
> +0,0,0,0,0,0,0,0,4,3,3,3,3,3,3,3,3,3,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
> +2,2,2,2,0,0,0,0,0,0,0,0,0,4,3,3,3,3,3,3,3,3,2,2,2,2,2,2,2,2,2,2,
> +2,2,2,2,2,2,2,2,0,0,0,0,0,0,0,0,0,0,4,3,3,3,3,3,3,3,2,2,2,2,2,2,
> +2,2,2,2,2,2,2,2,2,2,2,2,0,0,0,

Re: [PATCH v3 1/2] c++: improve location of parsed RETURN_EXPRs

2025-01-30 Thread David Malcolm
On Wed, 2024-08-21 at 16:05 -0400, Jason Merrill wrote:
> On 8/21/24 3:34 PM, Arsen Arsenović wrote:
> > Jason Merrill  writes:
> > 
> > > On 8/21/24 1:52 PM, Arsen Arsenović wrote:
> > > > For clarity, here's the entire split-up patch I intend to push,
> > > > if it
> > > > looks OK.  Tested on x86_64-pc-linux-gnu.
> > > > I've renamed the field we've discussed and also a few
> > > > parameters that
> > > > refer to 'kw' to be less specific.  The code is functionally
> > > > identical.
> > > > OK for trunk?
> > > > TIA, have a lovely day.
> > > > -- >8 --
> > > > This patch improves the EXPR_LOCATION associated with parsed
> > > > RETURN_EXPRs so that they can be used in diagnostics later. 
> > > > This change
> > > > also happened to un-suppress an analyzer false-negative that
> > > > was
> > > > happening because the location of RETURN_EXPR was entirely
> > > > within the
> > > > NULL macro, which was defined in a system header.  PR
> > > > analyzer/116304.
> > > 
> > > The PR number should be on its own line, and in the subject line.
> > 
> > It isn't a total fix for that PR, that's why I didn't name it as
> > part of
> > the changelog and subject line, but to be fair it would not be
> > wrong to
> > put it there anyway, as it is related.  Will reword.
> > 
> > > > gcc/cp/ChangeLog:
> > > > * cp-tree.h (finish_return_stmt): Add optional
> > > > location_t
> > > > parameter, defaulting to input_location.
> > > > * parser.cc (cp_parser_jump_statement): Improve return
> > > > and
> > > > co_return locations so that they span their entire
> > > > statements.
> > > > * semantics.cc (finish_return_stmt): Use the new
> > > > stmt_loc
> > > > parameter in place of input_location.
> > > > gcc/testsuite/ChangeLog:
> > > > * c-c++-common/analyzer/inlining-4-multiline.c: Adjust
> > > > locations
> > > > in diagnostics.
> > > 
> > > This doesn't look like a location adjustment, but removing
> > > testing of expected
> > > output.  If the output changed, please change the test to check
> > > the new output
> > > rather than not at all.
> > 
> > This is the new output - the diagnostics no longer expand that
> > macro,
> > since the location is not wholly contained within it.  The relevant
> > part
> > of the diagnostics after the change is:
> > 
> > --8<---cut here---start->8---
> >    |
> >  'const char* inner(int)': event 5 (depth 3)
> >    |
> >    | return NULL;
> >    |
> > --8<---cut here---end--->8---
> > 
> > ... as opposed to (excuse the quote difference - the former was
> > pulled
> > from g++.log and the latter from a manual invocation):
> > 
> > --8<---cut here---start->8---
> >    |
> >  ‘const char* inner(int)’: event 5 (depth 3)
> >    |
> >    |/home/arsen/gcc-mine/gcc/testsuite/c-c++-
> > common/analyzer/../../gcc.dg/analyzer/analyzer-decls.h:7:14:
> >    | #define NULL nullptr
> >    |  ^~~
> >    |  |
> >    |  (5) ...to here
> > /home/arsen/gcc-mine/gcc/testsuite/c-c++-common/analyzer/inlining-
> > 4-multiline.c:14:12: note: in expansion of macro ‘NULL’
> >    | return NULL;
> >    |    ^~~~
> >    |
> > --8<---cut here---end--->8---
> > 
> > ... presumably, the diagnostics chose to elide those bits of output
> > due
> > to the new location covering the entire line (and hence not being
> > too
> > informative) - but I haven't debugged that (as I assumed the
> > diagnostic
> > code is DTRT now).
> 
> It seems weird to lose the "...to here" marking on the return.  Any 
> thoughts, David?

Sorry for missing this earlier.

With the patch applied, I see:

$ ./xg++ -B. -S -fanalyzer -O2 
../../src/gcc/testsuite/c-c++-common/analyzer/inlining-4-multiline.c
../../src/gcc/testsuite/c-c++-common/analyzer/inlining-4-multiline.c: In 
function ‘char outer(int)’:
../../src/gcc/testsuite/c-c++-common/analyzer/inlining-4-multiline.c:27:23: 
warning: dereference of NULL ‘0’ [CWE-476] [-Wanalyzer-null-dereference]
   27 |   return *middle (flag); /* { dg-warning "dereference of NULL" 
"warning" } */
  |   ^
  ‘char outer(int)’: events 1-2
│
│   25 | outer (int flag)
│  | ^
│  | |
│  | (1) entry to ‘outer’
│   26 | {
│   27 |   return *middle (flag); /* { dg-warning "dereference of NULL" 
"warning" } */
│  |  ~
│  |  |
│  |  (2) inlined call to ‘middle’ from ‘outer’
│
└──> ‘const char* middle(int)’: event 3
   │
   │   21 |   ret

[PATCH v2] c++: wrong-code with consteval constructor [PR117501]

2025-01-30 Thread Marek Polacek
On Wed, Jan 29, 2025 at 08:03:37AM -0500, Jason Merrill wrote:
> On 1/27/25 6:19 PM, Marek Polacek wrote:
> > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/14?
> > 
> > -- >8 --
> > We've had a wrong-code problem since r14-4140, due to which we
> > forget to initialize a variable.
> > 
> > In consteval39.C, we evaluate
> > 
> >  struct QQQ q;
> >< >  QQQ::QQQ (&q, TARGET_EXPR  >5
> >__ct_comp
> >D.2687
> >(struct basic_string_view *) <<< Unknown tree: void_cst >>>
> >(const char *) "" ) >;
> > 
> > into
> > 
> >  struct QQQ q;
> >< >  {.data={._M_len=42, ._M_str=0}} >;
> > 
> > and then the useless expr_stmt is dropped on the floor, so q isn't
> > initialized.  As pre-r14-4140, we need to handle constructors specially.
> 
> Hmm, why didn't the following code in eval_outermost make this a
> rejects-valid bug rather than wrong-code?
> 
> >   /* If T is calling a constructor to initialize an object,
> > reframe
> > it as an AGGR_INIT_EXPR to avoid trying to modify an object
> > from outside the constant evaluation, which will fail even if
> > the value is actually constant (is_constant_evaluated3.C).  */

So yes, we go here, create an AGGR_INIT_EXPR, then evaluate it into
{.data={._M_len=42, ._M_str=0}}.  What should give an error?

> Your change should share code with this block doing the same thing in
> cp_fold_r:
> 
> > if (TREE_CODE (r) != CALL_EXPR)
> >   {
> > if (DECL_CONSTRUCTOR_P (callee))
> >   {
> > loc = EXPR_LOCATION (x);
> > tree a = CALL_EXPR_ARG (x, 0);
> > bool return_this = targetm.cxx.cdtor_returns_this ();
> > if (return_this)
> >   a = cp_save_expr (a);
> > tree s = build_fold_indirect_ref_loc (loc, a);
> > r = cp_build_init_expr (s, r);
> > if (return_this)
> >   r = build2_loc (loc, COMPOUND_EXPR, TREE_TYPE (x), r,
> >   fold_convert_loc (loc, TREE_TYPE (x), a));
> >   }
> > x = r;
> > break;
> >   }
> 
> Like there, we shouldn't need this for AGGR_INIT_EXPR, only CALL_EXPR.

Okay, but that then means that I can't call cxx_constant_value with an
object, otherwise I don't think I can unify the code above.  Which is
fine, we go down the eval_outermost/AGGR_INIT_EXPR path as mentioned
above.  At least I hope that is okay.

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/14?

-- >8 --
We've had a wrong-code problem since r14-4140, due to which we
forget to initialize a variable.

In consteval39.C, we evaluate

struct QQQ q;
  <>>
  (const char *) "" ) >;

into

struct QQQ q;
  <;

and then the useless expr_stmt is dropped on the floor, so q isn't
initialized.  As pre-r14-4140, we need to handle constructors specially.

With this patch, we generate:

struct QQQ q;
  <;

initializing q properly.

PR c++/117501

gcc/cp/ChangeLog:

* cp-gimplify.cc (cp_build_init_expr_for_ctor): New.
(cp_fold_immediate_r): Call it.
(cp_fold): Break out code into cp_build_init_expr_for_ctor.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/consteval39.C: New test.
* g++.dg/cpp2a/consteval40.C: New test.
---
 gcc/cp/cp-gimplify.cc| 41 
 gcc/testsuite/g++.dg/cpp2a/consteval39.C | 27 
 gcc/testsuite/g++.dg/cpp2a/consteval40.C | 25 +++
 3 files changed, 80 insertions(+), 13 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/consteval39.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/consteval40.C

diff --git a/gcc/cp/cp-gimplify.cc b/gcc/cp/cp-gimplify.cc
index 4ec3de13008..bdbdcafec54 100644
--- a/gcc/cp/cp-gimplify.cc
+++ b/gcc/cp/cp-gimplify.cc
@@ -1182,6 +1182,27 @@ taking_address_of_imm_fn_error (tree expr, tree decl)
   maybe_explain_promoted_consteval (loc, decl);
 }
 
+/* Build up an INIT_EXPR to initialize the object of a constructor.  CALL
+   is the CALL_EXPR for the constructor call; INIT is the initializer.  */
+
+static tree
+cp_build_init_expr_for_ctor (tree call, tree init)
+{
+  tree a = CALL_EXPR_ARG (call, 0);
+  if (is_dummy_object (a))
+return init;
+  const bool return_this = targetm.cxx.cdtor_returns_this ();
+  const location_t loc = EXPR_LOCATION (call);
+  if (return_this)
+a = cp_save_expr (a);
+  tree s = build_fold_indirect_ref_loc (loc, a);
+  init = cp_build_init_expr (s, init);
+  if (return_this)
+init = build2_loc (loc, COMPOUND_EXPR, TREE_TYPE (call), init,
+   fold_convert_loc (loc, TREE_TYPE (call), a));
+  return init;
+}
+
 /* A subroutine of cp_fold_r to handle immediate functions.  */
 
 static tree
@@ -1297,7 +1318,12 @@ cp_fold_immediate_r (tree *stmt_p, int *walk_subtrees, 
void *data_)
}
   /* We've evaluated the 

Re: [PATCH] [testsuite] require profiling support [PR113689]

2025-01-30 Thread Jeff Law




On 1/30/25 12:45 PM, Alexandre Oliva wrote:


pr113689 testcases use -fprofile without testing for profiling
support.  Fix them.

Regstrapped on x86_64-linux-gnu, also tested with gcc-14 targeting
x86_64-elf.  Ok to install?


for  gcc/testsuite/ChangeLog

PR target/113689
* gcc.target/i386/pr113689-1.c: Require profiling support.
* gcc.target/i386/pr113689-2.c: Likewise.
* gcc.target/i386/pr113689-3.c: Likewise.

OK
jeff



[PATCH] c++: auto in trailing-return-type in parameter [PR117778]

2025-01-30 Thread Marek Polacek
Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/14?

-- >8 --
This PR describes a few issues, both ICE and rejects-valid, but
ultimately the problem is that we don't properly synthesize the
second auto in:

  int
  g (auto fp() -> auto)
  {
return fp ();
  }

since r12-5860, which disabled auto_is_implicit_function_template_parm_p
in cp_parser_parameter_declaration after parsing the decl-specifier-seq.

If there is no trailing auto, there is no problem.

So we have to make sure auto_is_implicit_function_template_parm_p is
properly set when parsing the trailing auto.  A complication is that
one can write:

  auto f (auto fp(auto fp2() -> auto) -> auto) -> auto;
  ~~~

where only the underlined auto should be synthesized.  So when we
parse a parameter-declaration-clause inside another
parameter-declaration-clause, we should not enable the flag.  We
have no flags to keep track of such nesting, but I think I can walk
current_binding_level to see if we find ourselves in such an unlikely
scenario.

PR c++/117778

gcc/cp/ChangeLog:

* parser.cc (cp_parser_late_return_type_opt): Maybe override
auto_is_implicit_function_template_parm_p.
(cp_parser_parameter_declaration): Update commentary.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/lambda-generic-117778.C: New test.
* g++.dg/cpp2a/abbrev-fn2.C: New test.
* g++.dg/cpp2a/abbrev-fn3.C: New test.
---
 gcc/cp/parser.cc  | 24 -
 .../g++.dg/cpp1y/lambda-generic-117778.C  | 12 +
 gcc/testsuite/g++.dg/cpp2a/abbrev-fn2.C   | 49 +++
 gcc/testsuite/g++.dg/cpp2a/abbrev-fn3.C   |  7 +++
 4 files changed, 90 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp1y/lambda-generic-117778.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/abbrev-fn2.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/abbrev-fn3.C

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 44515bb9074..89c5c2721a7 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -25514,6 +25514,25 @@ cp_parser_late_return_type_opt (cp_parser *parser, 
cp_declarator *declarator,
   /* Consume the ->.  */
   cp_lexer_consume_token (parser->lexer);
 
+  /* We may be in the context of parsing a parameter declaration,
+namely, its declarator.  auto_is_implicit_function_template_parm_p
+will be disabled in that case.  But for code like
+
+  int g (auto fp() -> auto);
+
+we have to re-enable the flag for the trailing auto.  However, that
+only applies for the outermost trailing auto in a parameter clause; in
+
+  int f2 (auto fp(auto fp2() -> auto) -> auto);
+
+the inner -> auto should not be synthesized.  */
+  int i = 0;
+  for (cp_binding_level *b = current_binding_level;
+  b->kind == sk_function_parms; b = b->level_chain)
+   ++i;
+  auto cleanup = make_temp_override
+   (parser->auto_is_implicit_function_template_parm_p, i == 2);
+
   type = cp_parser_trailing_type_id (parser);
 }
 
@@ -26283,8 +26302,9 @@ cp_parser_parameter_declaration (cp_parser *parser,
  type-constraint opt auto can be used as a decl-specifier of the
  decl-specifier-seq of a parameter-declaration of a function declaration
  or lambda-expression..." but we must not synthesize an implicit template
- type parameter in its declarator.  That is, in "void f(auto[auto{10}]);"
- we want to synthesize only the first auto.  */
+ type parameter in its declarator (except the trailing-return-type).
+ That is, in "void f(auto[auto{10}]);" we want to synthesize only the
+ first auto.  */
   auto cleanup = make_temp_override
 (parser->auto_is_implicit_function_template_parm_p, false);
 
diff --git a/gcc/testsuite/g++.dg/cpp1y/lambda-generic-117778.C 
b/gcc/testsuite/g++.dg/cpp1y/lambda-generic-117778.C
new file mode 100644
index 000..f377e3acc91
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/lambda-generic-117778.C
@@ -0,0 +1,12 @@
+// PR c++/117778
+// { dg-do compile { target c++14 } }
+
+auto l1 = [](auto (*fp)() -> auto) { return fp; };
+auto l2 = [](auto fp() -> auto) { return fp; };
+auto l3 = [](auto fp()) { return fp; };
+auto l4 = [](auto (*fp)()) { return fp; };
+auto l5 = [](auto fp() -> auto) -> auto { return fp; };
+auto l6 = [](auto fp(auto fp2()) -> auto) -> auto { return fp; }; // { 
dg-error ".auto. parameter not permitted" }
+auto l7 = [](auto fp(auto fp2() -> auto) -> auto) -> auto { return fp; }; // { 
dg-error ".auto. parameter not permitted" }
+auto l8 = [](int fp(auto fp2())) { return fp; }; // { dg-error ".auto. 
parameter not permitted" }
+auto l9 = [](auto fp(auto fp2() -> auto) -> auto) { return fp; }; // { 
dg-error ".auto. parameter not permitted" }
diff --git a/gcc/testsuite/g++.dg/cpp2a/abbrev-fn2.C 
b/gcc/testsuite/g++.dg/cpp2a/abbrev-fn2.C
new file mode 100644
index 000..902382651b8

[PATCH] c++: Update const_decl handling after r15-7259 [PR118673].

2025-01-30 Thread Iain Sandoe
Fixes a regression in handling Objective-C++ strings.
Tested on x86_64-darwin21, OK for trunk?
thanks
Iain

--- 8< --- 

Objective-C++ uses CONST_DECLs to hold constant string objects
these should also be treated as mergable lvalues.

PR c++/118673

gcc/cp/ChangeLog:

* tree.cc (lvalue_kind): Mark CONST_DECLs as mergable
when they are also TREE_STATIC.

Signed-off-by: Iain Sandoe 
---
 gcc/cp/tree.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/cp/tree.cc b/gcc/cp/tree.cc
index fb6b2b18e94..79bc74fa2b7 100644
--- a/gcc/cp/tree.cc
+++ b/gcc/cp/tree.cc
@@ -213,7 +213,7 @@ lvalue_kind (const_tree ref)
  && DECL_IN_AGGR_P (ref))
return clk_none;
 
-  if (DECL_MERGEABLE (ref))
+  if (TREE_CODE (ref) == CONST_DECL || DECL_MERGEABLE (ref))
return clk_ordinary | clk_mergeable;
 
   /* FALLTHRU */
-- 
2.39.2 (Apple Git-143)



Re: [PATCH v6 4/6] OpenMP: Fortran support for metadirectives and dynamic selectors

2025-01-30 Thread Tobias Burnus

Hi Sandra,

Sandra Loosemore wrote:

* Unless it is quickly fixable, we agreed on deferring the bogus message
   "Error: ‘target’ construct with nested ‘teams’ construct contains 
directives

   outside of the ‘teams’ construct"
   to a new PR. That's for:
OpenMP_VV's 
tests/5.0/metadirective/test_metadirective_arch_is_nvidia.F90

...

I filed PR118694 for this.

This one is going to be hard to fix, or at least I don't have any good 
ideas on how to fix it.  Cases that require late resolution without 
dynamic selectors might be hacked around to recognize and skip over an 
intervening metadirective between the "target" and "teams", but it 
doesn't seem like the code for evaluating dynamic selectors can 
generally be hoisted outside of the outer "target".


I think the typical pattern is:

omp target

  omp metadirective when(device={kind(gpu)} : teams distribute parallel 
for/do) otherwise(parallel do/for)


as, for GPUs, using 'teams' is usually a good idea, but for host 
fallback it isn't.


In GCC, this means that it is statically resolved but only late. (In 
other compilers, it is early resolved.)


But for dynamic selectors, I think printing an error makes sense – and 
it depends on the user-code how to resolve it best.


* * *

I'll push the attached patch tomorrow, along with the remaining piece 
of the series to update the implementation status table in the libgomp 
manual, unless you want more time to do another round of review first. 
Also need to update the GCC 15 release notes!


I glanced it again and it LGTM. Thanks!

[Glancing at it, I saw a missing space before '=' in

+  gfc_omp_variant *variant= c->ext.omp_variants;
]

Tobias



[PATCH] middle-end/118695 - missed misalign handling in MEM_REF expansion

2025-01-30 Thread Richard Biener
When MEM_REF expansion of a non-MEM falls back to a stack temporary
we fail to handle the case where the offset adjusted reference to
the temporary is not aligned according to the requirement of the
mode.  We have to go through bitfield extraction or movmisalign
in this case.  Fortunately there's a helper for this.

This fixes an ICE observed on arm which has sanity checks in its
move patterns for this.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

OK?

PR middle-end/118695
* expr.cc (expand_expr_real_1): When expanding a MEM_REF
to a non-MEM by committing it to a stack temporary make
sure to handle misaligned accesses correctly.

* gcc.dg/pr118695.c: New testcase.
---
 gcc/expr.cc | 11 +++
 gcc/testsuite/gcc.dg/pr118695.c |  9 +
 2 files changed, 20 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/pr118695.c

diff --git a/gcc/expr.cc b/gcc/expr.cc
index 10467f82c0d..0136678a40b 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -11796,6 +11796,7 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode 
tmode,
&& known_eq (GET_MODE_BITSIZE (DECL_MODE (base)), type_size))
  return expand_expr (build1 (VIEW_CONVERT_EXPR, type, base),
  target, tmode, modifier);
+   unsigned align;
if (TYPE_MODE (type) == BLKmode || maybe_lt (offset, 0))
  {
temp = assign_stack_temp (DECL_MODE (base),
@@ -11804,6 +11805,16 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode 
tmode,
temp = adjust_address (temp, TYPE_MODE (type), offset);
if (TYPE_MODE (type) == BLKmode)
  set_mem_size (temp, int_size_in_bytes (type));
+   /* When the original ref was misaligned so will be the
+  access to the stack temporary.  Not all targets handle
+  this correctly, some will ICE in sanity checking.
+  Handle this by doing bitfield extraction when necessary.  */
+   else if ((align = get_object_alignment (exp))
+< GET_MODE_ALIGNMENT (TYPE_MODE (type)))
+ temp = expand_misaligned_mem_ref
+  (temp, TYPE_MODE (type), unsignedp, align,
+   modifier == EXPAND_STACK_PARM ? NULL_RTX : target,
+   NULL);
return temp;
  }
/* When the access is fully outside of the underlying object
diff --git a/gcc/testsuite/gcc.dg/pr118695.c b/gcc/testsuite/gcc.dg/pr118695.c
new file mode 100644
index 000..55e3b767a21
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr118695.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-O" } */
+
+void * g(int obj)
+{
+  char *t = (char*)&obj;
+  t -= 1;
+  return *(int**)t;
+}
-- 
2.43.0


[patch,avr] Add built-ins for strlen of a string in some address-space

2025-01-30 Thread Georg-Johann Lay

AVR: Provide built-ins for strlen where the string lives in some AS.

This patch adds built-in functions __builtin_avr_strlen_flash,
__builtin_avr_strlen_flashx and __builtin_avr_strlen_memx.
Purpose is that higher-level functions can use __builtin_constant_p
on strlen without raising a diagnostic due to -Waddr-space-convert.


Ok for trunk?

Johann

--

AVR: Provide built-ins for strlen where the string lives in some AS.

This patch adds built-in functions __builtin_avr_strlen_flash,
__builtin_avr_strlen_flashx and __builtin_avr_strlen_memx.
Purpose is that higher-level functions can use __builtin_constant_p
on strlen without raising a diagnostic due to -Waddr-space-convert.

gcc/
* config/avr/builtins.def (AVR_FIRST_C_ONLY_BUILTIN_ID): New macro.
(STRLEN_FLASH, STRLEN_FLASHX, STRLEN_MEMX): New DEF_BUILTIN's.
* config/avr/avr-protos.h (avr_builtin_C_only_p): New.
* config/avr/avr.cc (avr_builtin_C_only_p): New function.
(avr_ftype_strlen): New static function.
(avr_init_builtins): Only provide a built-in when it is supported
for the compiled language.
: Provide
new fntypes.
(avr_fold_builtin) [AVR_BUILTIN_STRLEN_FLASH]
[AVR_BUILTIN_STRLEN_FLASHX, AVR_BUILTIN_STRLEN_MEMX]: Fold if
possible.
* config/avr/avr-c.cc (avr_cpu_cpp_builtins): Only define the
__BUILTIN_AVR_ build-in defines when a built-in function
is available for the compiled language.
* doc/extend.texi (AVR Built-in Functions): Document
__builtin_avr_strlen_flash, __builtin_avr_strlen_flashx,
__builtin_avr_strlen_memx.
libgcc/
* config/avr/t-avr (LIB1ASMFUNCS): Add _strlen_memx.
* config/avr/lib1funcs.S : Implement.diff --git a/gcc/config/avr/avr-c.cc b/gcc/config/avr/avr-c.cc
index 53f15f2be7b..ced541a9ddc 100644
--- a/gcc/config/avr/avr-c.cc
+++ b/gcc/config/avr/avr-c.cc
@@ -500,7 +500,9 @@ avr_cpu_cpp_builtins (cpp_reader *pfile)
  not a specific builtin is available. */
 
 #define DEF_BUILTIN(NAME, N_ARGS, TYPE, CODE, LIBNAME, ATTRS) \
-  cpp_define (pfile, "__BUILTIN_AVR_" #NAME);
+  if (lang_GNU_C ()	  \
+  || ! avr_builtin_C_only_p (AVR_BUILTIN_ ## NAME))	  \
+cpp_define (pfile, "__BUILTIN_AVR_" #NAME);
 #include "builtins.def"
 #undef DEF_BUILTIN
 
diff --git a/gcc/config/avr/avr-protos.h b/gcc/config/avr/avr-protos.h
index 83137c7f6f6..72eb5163cce 100644
--- a/gcc/config/avr/avr-protos.h
+++ b/gcc/config/avr/avr-protos.h
@@ -21,6 +21,7 @@
 
 extern bool avr_function_arg_regno_p (int r);
 extern void avr_cpu_cpp_builtins (cpp_reader * pfile);
+extern bool avr_builtin_C_only_p (unsigned id);
 extern enum reg_class avr_regno_reg_class (int r);
 extern void asm_globalize_label (FILE *file, const char *name);
 extern void avr_adjust_reg_alloc_order (void);
diff --git a/gcc/config/avr/avr.cc b/gcc/config/avr/avr.cc
index 656d3e7389b..c33beb9b95d 100644
--- a/gcc/config/avr/avr.cc
+++ b/gcc/config/avr/avr.cc
@@ -15689,6 +15689,17 @@ avr_bdesc[AVR_BUILTIN_COUNT] =
   };
 
 
+/* Some of our built-in function are available for GNU-C only:
+   - Built-ins that use named address-spaces.
+   - Built-ins that use fixed-point types.  */
+
+bool
+avr_builtin_C_only_p (unsigned id)
+{
+  return id >= AVR_FIRST_C_ONLY_BUILTIN_ID;
+}
+
+
 /* Implement `TARGET_BUILTIN_DECL'.  */
 
 static tree
@@ -15714,6 +15725,25 @@ avr_init_builtin_int24 (void)
 }
 
 
+/* Return a function signature type similar to strlen, but where
+   the address is qualified by named address-space AS.  */
+
+static tree
+avr_ftype_strlen (addr_space_t as)
+{
+  tree const_AS_char_node
+= build_qualified_type (char_type_node,
+			TYPE_QUAL_CONST | ENCODE_QUAL_ADDR_SPACE (as));
+  tree const_AS_ptr_type_node
+= build_pointer_type_for_mode (const_AS_char_node,
+   avr_addr_space_pointer_mode (as), false);
+  tree size_ftype_const_AS_char_ptr
+= build_function_type_list (size_type_node, const_AS_ptr_type_node, NULL);
+
+  return size_ftype_const_AS_char_ptr;
+}
+
+
 /* Implement `TARGET_INIT_BUILTINS' */
 /* Set up all builtin functions for this target.  */
 
@@ -15771,6 +15801,10 @@ avr_init_builtins (void)
 const_memx_ptr_type_node,
 NULL);
 
+  tree strlen_flash_node = avr_ftype_strlen (ADDR_SPACE_FLASH);
+  tree strlen_flashx_node = avr_ftype_strlen (ADDR_SPACE_FLASHX);
+  tree strlen_memx_node = avr_ftype_strlen (ADDR_SPACE_MEMX);
+
 #define ITYP(T) \
   lang_hooks.types.type_for_size (TYPE_PRECISION (T), TYPE_UNSIGNED (T))
 
@@ -15885,8 +15919,15 @@ avr_init_builtins (void)
   tree attr_const = tree_cons (get_identifier ("const"), NULL, NULL);
 
 #define DEF_BUILTIN(NAME, N_ARGS, TYPE, CODE, LIBNAME, ATTRS)		\
-  {	\
+  do {	\
 int id = AVR_BUILTIN_ ## NAME;	\
+if (! lang_GNU_C ()			\
+	&& avr_builtin_C_only_p (id))	\
+  {	\
+	avr_bdesc[id].fndecl = NULL_TREE;\
+	br

Re: [PATCH] middle-end/118695 - missed misalign handling in MEM_REF expansion

2025-01-30 Thread Jakub Jelinek
On Thu, Jan 30, 2025 at 11:34:16AM +0100, Richard Biener wrote:
> When MEM_REF expansion of a non-MEM falls back to a stack temporary
> we fail to handle the case where the offset adjusted reference to
> the temporary is not aligned according to the requirement of the
> mode.  We have to go through bitfield extraction or movmisalign
> in this case.  Fortunately there's a helper for this.
> 
> This fixes an ICE observed on arm which has sanity checks in its
> move patterns for this.
> 
> Bootstrap and regtest running on x86_64-unknown-linux-gnu.
> 
> OK?
> 
>   PR middle-end/118695
>   * expr.cc (expand_expr_real_1): When expanding a MEM_REF
>   to a non-MEM by committing it to a stack temporary make
>   sure to handle misaligned accesses correctly.
> 
>   * gcc.dg/pr118695.c: New testcase.
> ---
>  gcc/expr.cc | 11 +++
>  gcc/testsuite/gcc.dg/pr118695.c |  9 +
>  2 files changed, 20 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/pr118695.c
> 
> diff --git a/gcc/expr.cc b/gcc/expr.cc
> index 10467f82c0d..0136678a40b 100644
> --- a/gcc/expr.cc
> +++ b/gcc/expr.cc
> @@ -11796,6 +11796,7 @@ expand_expr_real_1 (tree exp, rtx target, 
> machine_mode tmode,
>   && known_eq (GET_MODE_BITSIZE (DECL_MODE (base)), type_size))
> return expand_expr (build1 (VIEW_CONVERT_EXPR, type, base),
> target, tmode, modifier);
> + unsigned align;
>   if (TYPE_MODE (type) == BLKmode || maybe_lt (offset, 0))
> {
>   temp = assign_stack_temp (DECL_MODE (base),
> @@ -11804,6 +11805,16 @@ expand_expr_real_1 (tree exp, rtx target, 
> machine_mode tmode,
>   temp = adjust_address (temp, TYPE_MODE (type), offset);
>   if (TYPE_MODE (type) == BLKmode)
> set_mem_size (temp, int_size_in_bytes (type));
> + /* When the original ref was misaligned so will be the
> +access to the stack temporary.  Not all targets handle
> +this correctly, some will ICE in sanity checking.
> +Handle this by doing bitfield extraction when necessary.  */
> + else if ((align = get_object_alignment (exp))
> +  < GET_MODE_ALIGNMENT (TYPE_MODE (type)))
> +   temp = expand_misaligned_mem_ref
> +(temp, TYPE_MODE (type), unsignedp, align,
> + modifier == EXPAND_STACK_PARM ? NULL_RTX : target,
> + NULL);

I don't like very much this kind of formatting (function name on one line,
open paren on another).  Of course there are cases where there is no other 
choice,
but in this case
 temp
   = expand_misaligned_mem_ref (temp, TYPE_MODE (type),
unsignedp, align,
modifier == EXPAND_STACK_PARM
? NULL_RTX : target, NULL);
doesn't look too bad.

Ok for trunk either way.

Jakub