Re: [PATCH] libgfortran: add fallback for trigonometric pi-based functions

2025-07-06 Thread Tobias Burnus
On Sunday, July 6, 2025, Yuao Ma  wrote:
>>> Since I don't have root/sudo permissions on my devbox, I manually downloaded
>>> and compiled the autoconf 2.69 tarball. This means there might be some minor
>>> discrepancies compared to the version shipped with OS distributions.

In principle, having autoconf/automake installed locally is normal, as the
distro's version is too new.

>>> I suspect the issue could be related to platforms where `off_t` is 32-bit,
>>> causing a left shift of 62 to result in undefined behavior. The commit at
>>> https://cgit.git.savannah.gnu.org/cgit/autoconf.git/commit/?id=a1d8293f3bfa2516f9a0424e3a6e63c2f8e93c6e
>>> seems to support my theory.

As that commit is from 2020 and 2.69 is from 2012, it seems as if your
autoconf is too new. Can you re-check that the right version is at the
beginning of the PATH?

Note that there is a CI job that checks whether the generated files are
indeed up to date, i.e. the current trunk should be fine unless someone has
messed up.

The other question is whether the autoconf version shouldn't be updated,
given that 2.69 is quite old.

Tobias,
currently less active due to FTO.


Re: [PATCH v2] doc: Correct the return type of float comparison

2025-07-06 Thread Trevor Gross
On Tue May 27, 2025 at 9:35 PM CDT, Trevor Gross wrote:
> Documentation for `__cmpsf2` and similar functions currently indicates a
> return type of `int`. This is not correct however; the `libgcc`
> functions return `CMPtype`, the size of which is determined by the
> `libgcc_cmp_return` mode.
>
> Update documentation to use `CMPtype` and indicate that this is
> target-dependent, also mentioning the usual modes.
>
> Reported-by: beetrees 
> Fixes: https://github.com/rust-lang/compiler-builtins/issues/919#issuecomment-2905347318
> Signed-off-by: Trevor Gross 
> ---

Joseph (or anyone else), if the change looks correct would you be able
to apply this?

Thanks,
Trevor


[patch,avr,applied] Add AVRxxDAyyS devices

2025-07-06 Thread Georg-Johann Lay

Applied as obvious.

Johann

--

AVR: Add support for AVR32DAxxS, AVR64DAxxS, AVR128DAxxS devices.

gcc/
* config/avr/avr-mcus.def (avr32da28S, avr32da32S, avr32da48S)
(avr64da28S, avr64da32S, avr64da48S, avr64da64S)
(avr128da28S, avr128da32S, avr128da48S, avr128da64S): Add devices.
* doc/avr-mmcu.texi: Rebuild.

diff --git a/gcc/config/avr/avr-mcus.def b/gcc/config/avr/avr-mcus.def
index ad640501541..1c78855c8f1 100644
--- a/gcc/config/avr/avr-mcus.def
+++ b/gcc/config/avr/avr-mcus.def
@@ -313,6 +313,10 @@ AVR_MCU ("avr64da28",ARCH_AVRXMEGA2, AVR_CVT | AVR_ISA_FLMAP, "__AVR_AVR
 AVR_MCU ("avr64da32",ARCH_AVRXMEGA2, AVR_CVT | AVR_ISA_FLMAP, "__AVR_AVR64DA32__",   0x6000, 0x0, 0x1, 0)
 AVR_MCU ("avr64da48",ARCH_AVRXMEGA2, AVR_CVT | AVR_ISA_FLMAP, "__AVR_AVR64DA48__",   0x6000, 0x0, 0x1, 0)
 AVR_MCU ("avr64da64",ARCH_AVRXMEGA2, AVR_CVT | AVR_ISA_FLMAP, "__AVR_AVR64DA64__",   0x6000, 0x0, 0x1, 0)
+AVR_MCU ("avr64da28S",   ARCH_AVRXMEGA2, AVR_CVT | AVR_ISA_FLMAP, "__AVR_AVR64DA28S__",  0x6000, 0x0, 0x1, 0)
+AVR_MCU ("avr64da32S",   ARCH_AVRXMEGA2, AVR_CVT | AVR_ISA_FLMAP, "__AVR_AVR64DA32S__",  0x6000, 0x0, 0x1, 0)
+AVR_MCU ("avr64da48S",   ARCH_AVRXMEGA2, AVR_CVT | AVR_ISA_FLMAP, "__AVR_AVR64DA48S__",  0x6000, 0x0, 0x1, 0)
+AVR_MCU ("avr64da64S",   ARCH_AVRXMEGA2, AVR_CVT | AVR_ISA_FLMAP, "__AVR_AVR64DA64S__",  0x6000, 0x0, 0x1, 0)
 AVR_MCU ("avr64db28",ARCH_AVRXMEGA2, AVR_CVT | AVR_ISA_FLMAP, "__AVR_AVR64DB28__",   0x6000, 0x0, 0x1, 0)
 AVR_MCU ("avr64db32",ARCH_AVRXMEGA2, AVR_CVT | AVR_ISA_FLMAP, "__AVR_AVR64DB32__",   0x6000, 0x0, 0x1, 0)
 AVR_MCU ("avr64db48",ARCH_AVRXMEGA2, AVR_CVT | AVR_ISA_FLMAP, "__AVR_AVR64DB48__",   0x6000, 0x0, 0x1, 0)
@@ -389,6 +393,9 @@ AVR_MCU ("avr16du32",ARCH_AVRXMEGA3, AVR_CVT, "__AVR_AVR
 AVR_MCU ("avr32da28",ARCH_AVRXMEGA3, AVR_CVT, "__AVR_AVR32DA28__",   0x7000, 0x0, 0x8000, 0x8000)
 AVR_MCU ("avr32da32",ARCH_AVRXMEGA3, AVR_CVT, "__AVR_AVR32DA32__",   0x7000, 0x0, 0x8000, 0x8000)
 AVR_MCU ("avr32da48",ARCH_AVRXMEGA3, AVR_CVT, "__AVR_AVR32DA48__",   0x7000, 0x0, 0x8000, 0x8000)
+AVR_MCU ("avr32da28S",   ARCH_AVRXMEGA3, AVR_CVT, "__AVR_AVR32DA28S__",  0x7000, 0x0, 0x8000, 0x8000)
+AVR_MCU ("avr32da32S",   ARCH_AVRXMEGA3, AVR_CVT, "__AVR_AVR32DA32S__",  0x7000, 0x0, 0x8000, 0x8000)
+AVR_MCU ("avr32da48S",   ARCH_AVRXMEGA3, AVR_CVT, "__AVR_AVR32DA48S__",  0x7000, 0x0, 0x8000, 0x8000)
 AVR_MCU ("avr32db28",ARCH_AVRXMEGA3, AVR_CVT, "__AVR_AVR32DB28__",   0x7000, 0x0, 0x8000, 0x8000)
 AVR_MCU ("avr32db32",ARCH_AVRXMEGA3, AVR_CVT, "__AVR_AVR32DB32__",   0x7000, 0x0, 0x8000, 0x8000)
 AVR_MCU ("avr32db48",ARCH_AVRXMEGA3, AVR_CVT, "__AVR_AVR32DB48__",   0x7000, 0x0, 0x8000, 0x8000)
@@ -427,6 +434,10 @@ AVR_MCU ("avr128da28",   ARCH_AVRXMEGA4, AVR_CVT | AVR_ISA_FLMAP, "__AVR_AVR
 AVR_MCU ("avr128da32",   ARCH_AVRXMEGA4, AVR_CVT | AVR_ISA_FLMAP, "__AVR_AVR128DA32__",  0x4000, 0x0, 0x2, 0)
 AVR_MCU ("avr128da48",   ARCH_AVRXMEGA4, AVR_CVT | AVR_ISA_FLMAP, "__AVR_AVR128DA48__",  0x4000, 0x0, 0x2, 0)
 AVR_MCU ("avr128da64",   ARCH_AVRXMEGA4, AVR_CVT | AVR_ISA_FLMAP, "__AVR_AVR128DA64__",  0x4000, 0x0, 0x2, 0)
+AVR_MCU ("avr128da28S",  ARCH_AVRXMEGA4, AVR_CVT | AVR_ISA_FLMAP, "__AVR_AVR128DA28S__", 0x4000, 0x0, 0x2, 0)
+AVR_MCU ("avr128da32S",  ARCH_AVRXMEGA4, AVR_CVT | AVR_ISA_FLMAP, "__AVR_AVR128DA32S__", 0x4000, 0x0, 0x2, 0)
+AVR_MCU ("avr128da48S",  ARCH_AVRXMEGA4, AVR_CVT | AVR_ISA_FLMAP, "__AVR_AVR128DA48S__", 0x4000, 0x0, 0x2, 0)
+AVR_MCU ("avr128da64S",  ARCH_AVRXMEGA4, AVR_CVT | AVR_ISA_FLMAP, "__AVR_AVR128DA64S__", 0x4000, 0x0, 0x2, 0)
 AVR_MCU ("avr128db28",   ARCH_AVRXMEGA4, AVR_CVT | AVR_ISA_FLMAP, "__AVR_AVR128DB28__",  0x4000, 0x0, 0x2, 0)
 AVR_MCU ("avr128db32",   ARCH_AVRXMEGA4, AVR_CVT | AVR_ISA_FLMAP, "__AVR_AVR128DB32__",  0x4000, 0x0, 0x2, 0)
 AVR_MCU ("avr128db48",   ARCH_AVRXMEGA4, AVR_CVT | AVR_ISA_FLMAP, "__AVR_AVR128DB48__",  0x4000, 0x0, 0x2, 0)
diff --git a/gcc/doc/avr-mmcu.texi b/gcc/doc/avr-mmcu.texi
index feb772549a4..a5d12ebc56e 100644
--- a/gcc/doc/avr-mmcu.texi
+++ b/gcc/doc/avr-mmcu.texi
@@ -50,15 +50,15 @@
 
 @item @anchor{avrxmega2}avrxmega2
 ``XMEGA'' devices with more than 8@tie{}KiB and up to 64@tie{}KiB of program memory.
-@*@var{mcu}@tie{}= @code{atxmega8e5}, @code{atxmega16a4}, @code{atxmega16a4u

Add cutoff information to profile_info and use it when forcing non-zero value

2025-07-06 Thread Jan Hubicka
Hi,
the main difference between normal profile feedback and auto-fdo is that with
profile feedback every basic block with a non-zero profile has an incoming edge
with a non-zero profile.  With auto-profile it is possible that none of the
predecessors was sampled, and the tool also has a cutoff parameter which makes
it ignore small counts.

This becomes a problem when one tries to specialize code and scale the profile.
For example, if an inline function happens to have a hot loop with non-zero
counts but its entry count is zero, and we want to inline it into a call with a
non-zero count X, we want to scale the body by X/0, which we currently turn
into X/1.

This is a problem since I added logic to scale up the auto-profiles (to get
some extra bits of precision), so X is often a large value and multiplying by X
is not the right answer at all.  The multiplication factor should be <= 1.

Iterating this a few times will make the counts hit the cap, and we will lose
any useful info.  The original implementation avoided this by doing all
inlining before AFDO readback, but this is not possible with LTO (unless we
move AFDO readback to WPA or add support for context-sensitive profiles).  I
think I can get the scaling to work reasonably well, and then we can look into
the possible benefits of context-sensitive profiling, which can be implemented
atop AFDO as well as FDO.

This patch adds a cutoff value to profile_info, which is initialized by profile
feedback to 1 and by auto-profile to the scale factor (since we do not know the
cutoff create_gcov used; llvm's tool streams it and we probably should too).
Then force_nonzero forces every value smaller than cutoff/2 to cutoff/2, which
should keep scaling factors in reasonable ranges.
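
A minimal sketch of the clamping described above, in plain C rather than GCC's
`profile_count` class. The names `force_nonzero_sketch` and `scale_body_sketch`
are hypothetical, and the extra clamp to 1 (so that a cutoff of 1 never yields
a zero divisor) is an assumption of this sketch, not something stated in the
patch description.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for profile_count::force_nonzero: raise any count
   below cutoff/2 to cutoff/2.  The clamp to 1 is an assumption of this
   sketch so that the result is usable as a divisor even when cutoff is 1.  */
static uint64_t
force_nonzero_sketch (uint64_t count, uint64_t cutoff)
{
  uint64_t lowest = cutoff / 2;
  if (lowest == 0)
    lowest = 1;
  return count < lowest ? lowest : count;
}

/* Scale a count from the inlined body by call_count / entry_count,
   using the clamped entry count as the divisor.  */
static uint64_t
scale_body_sketch (uint64_t body_count, uint64_t call_count,
                   uint64_t entry_count, uint64_t cutoff)
{
  uint64_t den = force_nonzero_sketch (entry_count, cutoff);
  assert (den != 0);
  return body_count * call_count / den;
}
```

With a cutoff of 1000, a zero entry count is treated as 500, so
`scale_body_sketch (2000, 400, 0, 1000)` gives 1600. With a cutoff of 1 the
divisor degenerates to 1 and the same call gives 800000, which is the X/1
blow-up described above.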

Bootstrapped/regtested x86_64-linux.

gcc/ChangeLog:

* auto-profile.cc (autofdo_source_profile::read): Scale cutoff.
(read_autofdo_file): Initialize cutoff.
* coverage.cc (read_counts_file): Initialize cutoff to 1.
* gcov-io.h (struct gcov_summary): Add cutoff field.
* ipa-inline.cc (inline_small_functions): max_count can be non-zero
also with auto_profile.
* lto-cgraph.cc (output_profile_summary): Write cutoff
and sum_max.
(input_profile_summary): Read cutoff and sum max.
(merge_profile_summaries): Initialize and scale global cutoffs
and sum max.
* profile-count.cc: Include profile.h
(profile_count::force_nonzero): move here from ...; use cutoff.
* profile-count.h: (profile_count::force_nonzero): ... here.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-prof/clone-merge-1.c:

diff --git a/gcc/auto-profile.cc b/gcc/auto-profile.cc
index 64f4cda1b52..ea237bd484c 100644
--- a/gcc/auto-profile.cc
+++ b/gcc/auto-profile.cc
@@ -2522,6 +2590,7 @@ autofdo_source_profile::read ()
 afdo_count_scale
   = MAX (((gcov_type)1 << (profile_count::n_bits / 2))
 / afdo_profile_info->sum_max, 1);
+  afdo_profile_info->cutoff *= afdo_count_scale;
   afdo_hot_bb_threshod
 = hot_frac
   ? afdo_profile_info->sum_max * afdo_count_scale / hot_frac
@@ -2531,10 +2600,12 @@ autofdo_source_profile::read ()
 fprintf (dump_file, "Max count in profile %" PRIu64 "\n"
"Setting scale %" PRIu64 "\n"
"Scaled max count %" PRIu64 "\n"
+   "Cutoff %" PRIu64 "\n"
"Hot count threshold %" PRIu64 "\n\n",
 (int64_t)afdo_profile_info->sum_max,
 (int64_t)afdo_count_scale,
 (int64_t)(afdo_profile_info->sum_max * afdo_count_scale),
+(int64_t)afdo_profile_info->cutoff,
 (int64_t)afdo_hot_bb_threshod);
   afdo_profile_info->sum_max *= afdo_count_scale;
   return true;
@@ -3865,6 +3936,7 @@ read_autofdo_file (void)
   autofdo::afdo_profile_info = XNEW (gcov_summary);
   autofdo::afdo_profile_info->runs = 1;
   autofdo::afdo_profile_info->sum_max = 0;
+  autofdo::afdo_profile_info->cutoff = 1;
 
   /* Read the profile from the profile file.  */
   autofdo::read_profile ();
diff --git a/gcc/coverage.cc b/gcc/coverage.cc
index dd3ed2ed842..75a24c61448 100644
--- a/gcc/coverage.cc
+++ b/gcc/coverage.cc
@@ -238,6 +238,7 @@ read_counts_file (void)
  gcov_profile_info = profile_info = XCNEW (gcov_summary);
  profile_info->runs = gcov_read_unsigned ();
  profile_info->sum_max = gcov_read_unsigned ();
+ profile_info->cutoff = 1;
}
   else if (GCOV_TAG_IS_COUNTER (tag) && fn_ident)
{
diff --git a/gcc/gcov-io.h b/gcc/gcov-io.h
index d48291c1fe3..f3e3a1c08da 100644
--- a/gcc/gcov-io.h
+++ b/gcc/gcov-io.h
@@ -349,6 +349,11 @@ struct gcov_summary
 {
   gcov_unsigned_t runs;/* Number of program runs.  */
   gcov_type sum_max;   /* Sum of individual run max values.  */
+  gcov_type cutoff;/* Values smaller than this value are not
+  reliable (0 may

Re: [PATCH] libgfortran: add fallback for trigonometric pi-based functions

2025-07-06 Thread Yuao Ma

Hi Tobias,

On 7/6/2025 6:34 PM, Tobias Burnus wrote:
> As that commit is from 2020 and 2.69 is from 2012, it seems as if your
> autoconf is too new. Can you re-check that the right version is at the
> beginning of the PATH?
>
> Note that there is a CI job that checks whether the generated files are
> indeed up to date, i.e. the current trunk should be fine unless someone
> has messed up.

It is possible that autoconf 2.69 contains this commit, as we can see 
from 
https://github.com/autotools-mirror/autoconf/commit/a1d8293f3bfa2516f9a0424e3a6e63c2f8e93c6e 
that it has been backported to v2.69b - v2.69e.


The main reason is that my devbox has autoconf2.69 (2.69-3.1) from
Debian, released on Sat, 19 Nov 2022 21:40:11 +0200, which includes the
commit from 2020. This version took precedence over my compiled
version. Once I switch to my compiled version, the generated output
is as expected.


> The other question is whether the autoconf version shouldn't be updated,
> given that 2.69 is quite old.


Upgrading such a basic component seems like a major change; staying on the
same version with a backported patch appears to be the better choice.


I have used the old version to create a new patch. Hope this looks good 
to you.


Yuao

From 2d62a2f707e43f37b4d886b7ed3aa40f2ab62437 Mon Sep 17 00:00:00 2001
From: Yuao Ma 
Date: Sun, 6 Jul 2025 20:55:08 +0800
Subject: [PATCH] libgfortran: add fallback for trigonometric pi-based
 functions

This patch introduces a fallback implementation for pi-based trigonometric
functions, ensuring broader compatibility and robustness. The new
implementation supports float, double, and long double data types. Accordingly,
the test cases for r4 and r8 have been revised to reflect these changes.

libgfortran/ChangeLog:

* Makefile.am: Add c23_functions.c to Makefile.
* Makefile.in: Regenerate.
* config.h.in: Regenerate.
* configure: Regenerate.
* configure.ac: Check if trig-pi functions exist.
* gfortran.map: Add a section for c23 functions.
* libgfortran.h: Include c23 proto file.
* c23_protos.h: Add c23 proto file and trig-pi funcs.
* intrinsics/c23_functions.c: Add trig-pi fallback impls.

gcc/testsuite/ChangeLog:

* gfortran.dg/dec_math_5.f90: Revise test to use fallback.

Signed-off-by: Yuao Ma 
Co-authored-by: Steven G. Kargl 
---
 gcc/testsuite/gfortran.dg/dec_math_5.f90 |   36 +-
 libgfortran/Makefile.am                  |    4 +
 libgfortran/Makefile.in                  |    8 +
 libgfortran/c23_protos.h                 |  133 +++
 libgfortran/config.h.in                  |   63 ++
 libgfortran/configure                    | 1010 ++
 libgfortran/configure.ac                 |   23 +
 libgfortran/gfortran.map                 |   25 +
 libgfortran/intrinsics/c23_functions.c   |  308 +++
 libgfortran/libgfortran.h                |    1 +
 10 files changed, 1593 insertions(+), 18 deletions(-)
 create mode 100644 libgfortran/c23_protos.h
 create mode 100644 libgfortran/intrinsics/c23_functions.c

diff --git a/gcc/testsuite/gfortran.dg/dec_math_5.f90 b/gcc/testsuite/gfortran.dg/dec_math_5.f90
index a7ff3275236..7bcf07fce67 100644
--- a/gcc/testsuite/gfortran.dg/dec_math_5.f90
+++ b/gcc/testsuite/gfortran.dg/dec_math_5.f90
@@ -102,26 +102,26 @@ program p
   if (abs(c1 - 0.5) > e3) stop 39
   if (abs(d1 - 0.5) > e4) stop 40
 
-  a1 = acospi(0.5)
-  b1 = acospi(-0.5)
+  a1 = 0.5; a1 = acospi(a1)
+  b1 = -0.5; b1 = acospi(b1)
   c1 = acospi(0.5)
   d1 = acospi(-0.5)
-  if (abs(a1 - 1.0 / 3) > e1) stop 41
-  if (abs(b1 - 2.0 / 3) > e2) stop 42
+  if (abs(a1 - 1._4 / 3) > e1) stop 41
+  if (abs(b1 - 2._8 / 3) > e2) stop 42
   if (abs(c1 - 1.0 / 3) > e3) stop 43
   if (abs(d1 - 2.0 / 3) > e4) stop 44
 
-  a1 = asinpi(0.5)
-  b1 = asinpi(-0.5)
+  a1 = 0.5; a1 = asinpi(a1)
+  b1 = -0.5; b1 = asinpi(b1)
   c1 = asinpi(0.5)
   d1 = asinpi(-0.5)
-  if (abs(a1 - 1.0 / 6) > e1) stop 45
-  if (abs(b1 + 1.0 / 6) > e2) stop 46
+  if (abs(a1 - 1._4 / 6) > e1) stop 45
+  if (abs(b1 + 1._8 / 6) > e2) stop 46
   if (abs(c1 - 1.0 / 6) > e3) stop 47
   if (abs(d1 + 1.0 / 6) > e4) stop 48
 
-  a1 = atanpi(1.0)
-  b1 = atanpi(-1.0)
+  a1 = 1.0; a1 = atanpi(a1)
+  b1 = -1.0; b1 = atanpi(b1)
   c1 = atanpi(1.0)
   d1 = atanpi(-1.0)
   if (abs(a1 - 0.25) > e1) stop 49
@@ -129,8 +129,8 @@ program p
   if (abs(c1 - 0.25) > e3) stop 51
   if (abs(d1 + 0.25) > e4) stop 52
 
-  a1 = atan2pi(1.0, 1.0)
-  b1 = atan2pi(1.0, 1.0)
+  a1 = 1.0; a1 = atan2pi(a1, a1)
+  b1 = 1.0; b1 = atan2pi(b1, b1)
   c1 = atan2pi(1.0, 1.0)
   d1 = atan2pi(1.0, 1.0)
   if (abs(a1 - 0.25) > e1) stop 53
@@ -138,8 +138,8 @@ program p
   if (abs(c1 - 0.25) > e3) stop 55
   if (abs(d1 - 0.25) > e4) stop 56
 
-  a1 = cospi(1._4 / 3)
-  b1 = cospi(-1._8 / 3)
+  a1 = 1._4 / 3; a1 = cospi(a1)
+  b1 = -1._8 / 3; b1 = cospi(b1)
   c1 = cospi(4._ep / 3)
   d1 = cospi(-4._16 / 3)
   if (abs(a1 - 0.5) > e1) stop 57
@@ -147,8 +147,8 @@ pro

gcc-patches@gcc.gnu.org

2025-07-06 Thread Jan Hubicka
Hi,
this fixes a stupid mistake of mine in the overflow check for sreal
multiplication.  This was introduced in this stage1, so unless we want to
backport the ipa-cp heuristics bugfixes, this does not need to go to
release branches.

Regtested and bootstrapped x86_64-linux.

Honza

gcc/ChangeLog:

* profile-count.cc (profile_count::operator*): Fix overflow check.

diff --git a/gcc/profile-count.cc b/gcc/profile-count.cc
index 190bbebb5a7..21477008b70 100644
--- a/gcc/profile-count.cc
+++ b/gcc/profile-count.cc
@@ -557,7 +557,7 @@ profile_count::operator* (const sreal &num) const
   sreal scaled = num * m_val;
   gcc_checking_assert (scaled >= 0);
   profile_count ret;
-  if (m_val > max_count)
+  if (scaled > max_count)
 ret.m_val = max_count;
   else
 ret.m_val = scaled.to_nearest_int ();
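
The one-line fix above can be illustrated outside of GCC. This is a sketch
only: `MAX_COUNT_SKETCH` is a small made-up cap, `double` stands in for
`sreal`, adding 0.5 before truncation approximates rounding to nearest, and
`scale_saturating` is a hypothetical name. The point is that the saturation
test has to look at the scaled product, not at the unscaled input.

```c
#include <assert.h>
#include <stdint.h>

/* Made-up cap for illustration; not GCC's real max_count.  */
#define MAX_COUNT_SKETCH UINT64_C(1000)

/* Multiply VAL by a non-negative FACTOR and saturate at the cap.
   The comparison uses the scaled value, mirroring the fixed check.  */
static uint64_t
scale_saturating (uint64_t val, double factor)
{
  assert (factor >= 0.0);
  double scaled = (double) val * factor;
  if (scaled > (double) MAX_COUNT_SKETCH)
    return MAX_COUNT_SKETCH;
  return (uint64_t) (scaled + 0.5);
}
```

Here `scale_saturating (600, 2.0)` saturates at 1000 even though the input 600
is below the cap; testing `val` instead of `scaled` would miss that case.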


[PATCH v2] ipa, cgraph: Enable constant propagation to OpenMP kernels

2025-07-06 Thread Josef Melcr
> This patch enables constant propagation to outlined OpenMP kernels and
> improves support for optimizing callback functions in general.  It
> implements the attribute 'callback' as found in clang, though argument
> numbering is a bit different, as described below.  The title says OpenMP,
> but it can be used for any function which takes a callback argument, such
> as pthread functions, qsort and others.
> 
> The attribute 'callback' captures the notion of a function calling one
> of its arguments with some of its parameters as arguments.  An OpenMP
> example of such function is GOMP_parallel.
> We implement the attribute with new callgraph edges called 'callback'
> edges.  They are imaginary edges pointing from the caller of the function
> with the attribute (e.g. caller of GOMP_parallel) to the body function
> itself (e.g. the outlined OpenMP body).  They share their call statement
> with the edge from which they are derived (direct edge caller -> GOMP_parallel
> in this case).  These edges allow passes such as ipa-cp to see the
> hidden call site to the body function and optimize the function accordingly.
> 
> To illustrate on an example, the body of GOMP_parallel looks something
> like this:
> 
> void GOMP_parallel (void (*fn) (void *), void *data, /* ... */)
> {
>   /* ... */
>   fn (data);
>   /* ... */
> }
> 
> 
> If we extend it with the attribute 'callback(1, 2)', we express that the
> function calls its first argument and passes it its second argument.
> This is represented in the call graph in this manner:
> 
>              direct         indirect
> caller -> GOMP_parallel ---> fn
>   |
>   ----------------------> fn
>           callback
> 
> The direct edge is then the parent edge, with all callback edges being
> the child edges.
> While constant propagation is the main focus of this patch, callback
> edges can be useful for different passes (for example, it improves icf
> for OpenMP kernels), as they allow for address redirection.
> If the outlined body function gets optimized and cloned, from body_fn to
> body_fn.optimized, the callback edge allows us to replace the
> address in the arguments list:
> 
> GOMP_parallel (body_fn, &data_struct, /* ... */);
> 
> becomes
> 
> GOMP_parallel (body_fn.optimized, &data_struct, /* ... */);
> 
> This redirection is possible for any function with the attribute.
> 
> This callback attribute implementation is partially compatible with
> clang's implementation.  Its semantics, arguments and argument
> indexing style are the same, but we represent an unknown argument
> position with 0 (precedent set by attributes such as 'format'),
> while clang uses -1 or '?'.  We also allow for multiple callback
> attributes on the same function, while clang only allows one.
> 
> The attribute allows us to propagate constants into body functions of
> OpenMP constructs.  Currently, GCC won't propagate the value 'c' into the
> OpenMP body in the following example:
> 
> int a[100];
> void test(int c) {
> #pragma omp parallel for
>   for (int i = 0; i < c; i++) {
>     if (!__builtin_constant_p(c)) {
>       __builtin_abort();
>     }
>     a[i] = i;
>   }
> }
> int main() {
>   test(100);
>   return a[5] - 5;
> }
> 
> With this patch, the body function will get cloned and the constant 'c'
> will get propagated.
> 
> Bootstrapped and regtested on x86_64-linux.  OK for master?
> 
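
To make the idea in the quoted description concrete, here is a hand-written
analogue in plain C. It does not use the attribute or any GCC internals:
`run_body` is a hypothetical stand-in for a function like GOMP_parallel that
calls its first argument with its second, and `body_fn_n100` is a manual
imitation of the kind of clone that constant propagation could produce once
the hidden call site is visible. All names here are assumptions made for
illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical runtime entry point: calls FN and passes it DATA, which is
   the relationship that 'callback(1, 2)' would describe.  */
static void
run_body (void (*fn) (void *), void *data)
{
  assert (fn != NULL);
  fn (data);
}

struct loop_data
{
  int n;
  int *out;
};

/* Generic outlined body: the trip count is only known at run time.  */
static void
body_fn (void *p)
{
  struct loop_data *d = p;
  for (int i = 0; i < d->n; i++)
    d->out[i] = i;
}

/* Hand-written analogue of a specialised clone where n is known to be 100.  */
static void
body_fn_n100 (void *p)
{
  struct loop_data *d = p;
  for (int i = 0; i < 100; i++)
    d->out[i] = i;
}

static int buf_generic[100];
static int buf_special[100];

static int
run_generic (void)
{
  struct loop_data d = { 100, buf_generic };
  run_body (body_fn, &d);
  return buf_generic[5];
}

/* Same call, with the function address redirected to the clone.  */
static int
run_specialised (void)
{
  struct loop_data d = { 100, buf_special };
  run_body (body_fn_n100, &d);
  return buf_special[5];
}
```

Both variants fill the buffer identically; the only difference is which body
address is passed, which corresponds to the address redirection the
description mentions.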

---
This is a second version of this patch.  Changes made in this version:
- Attribute is now called " callback" and is thus outside of the public
  API.  I removed its docs and tests which no longer apply.
- GOMP_task no longer has the attribute and uses it on demand.  The
  attribute is freshly created when needed, which is about 3 times per
  child edge.  I think the extra allocations are worth it when
  considering code readability.
- Edge redirection no longer leaves dangling refs.
- Formatting issues should be resolved.

Bootstrapped and regtested on x86_64-pc-linux-gnu.

gcc/ChangeLog:

* builtin-attrs.def (ATTR_CALLBACK): Callback attr identifier.
(DEF_CALLBACK_ATTRIBUTE): Macro for callback attr creation.
(GOMP): Attrs for libgomp functions.
(OACC): Attrs for oacc functions.
(ATTR_CALLBACK_GOMP_LIST): ATTR_NOTHROW_LIST with GOMP callback
attr added.
(ATTR_CALLBACK_OACC_LIST): ATTR_NOTHROW_LIST with OACC callback
attr added.
* cgraph.cc (cgraph_add_edge_to_call_site_hash): Always hash the
parent edge.
(cgraph_node::get_edge): Always return the parent edge.
(cgraph_edge::set_call_stmt): Add cascade for callback child
edges.
(symbol_table::create_edge): Allow callback edges to share the
same call statement, initialize new flags.
(cgraph_edge::make_callback): New method, derives a new callback
edge.
(cgraph_edge::get_callback_parent_edge): New method.
(cgraph_edge::first_callback_target): Likewise.
  

Re: [PATCH v3] RISC-V: Mips P8700 Conditional Move Support.

2025-07-06 Thread Umesh Kalappa
Hi @Jeff Law and @ma...@orcam.me.uk,

Please have a look at the updated patch for conditional move support, and
please let us know if you have any comments or suggestions.

Thank you
~U

On Wed, Jul 2, 2025 at 12:46 PM Umesh Kalappa 
wrote:

> Indentation is updated accordingly and no regressions were found.
>
> gcc/ChangeLog:
>
> * config/riscv/riscv-cores.def (RISCV_CORE): Updated the supported
> march.
> * config/riscv/riscv-ext-mips.def (DEFINE_RISCV_EXT):
> New file added for mips conditional mov extension.
> * config/riscv/riscv-ext.def: Likewise.
> * config/riscv/t-riscv: Generates riscv-ext.opt
> * config/riscv/riscv-ext.opt: Generated file.
> * config/riscv/riscv.cc (riscv_expand_conditional_move): Updated for
> mips cmov and outlined some code that handles arch cond move.
> * config/riscv/riscv.md (movcc): Updated expand for MIPS CCMOV.
> * config/riscv/mips-insn.md: New file for mips-p8700 ccmov insn.
> * gcc/doc/riscv-ext.texi: Updated for mips cmov.
>
> gcc/testsuite/ChangeLog:
>
> * testsuite/gcc.target/riscv/mipscondmov.c: Test file for
> mips.ccmov insn.
> ---
>  gcc/config/riscv/mips-insn.md|  36 +++
>  gcc/config/riscv/riscv-cores.def |   3 +-
>  gcc/config/riscv/riscv-ext-mips.def  |  35 ++
>  gcc/config/riscv/riscv-ext.def   |   1 +
>  gcc/config/riscv/riscv-ext.opt   |   4 +
>  gcc/config/riscv/riscv.cc| 107 +--
>  gcc/config/riscv/riscv.md|   3 +-
>  gcc/config/riscv/t-riscv |   3 +-
>  gcc/doc/riscv-ext.texi   |   4 +
>  gcc/testsuite/gcc.target/riscv/mipscondmov.c |  30 ++
>  10 files changed, 189 insertions(+), 37 deletions(-)
>  create mode 100644 gcc/config/riscv/mips-insn.md
>  create mode 100644 gcc/config/riscv/riscv-ext-mips.def
>  create mode 100644 gcc/testsuite/gcc.target/riscv/mipscondmov.c
>
> diff --git a/gcc/config/riscv/mips-insn.md b/gcc/config/riscv/mips-insn.md
> new file mode 100644
> index 000..de53638d587
> --- /dev/null
> +++ b/gcc/config/riscv/mips-insn.md
> @@ -0,0 +1,36 @@
> +;; Machine description for MIPS custom instructions.
> +;; Copyright (C) 2025 Free Software Foundation, Inc.
> +
> +;; This file is part of GCC.
> +
> +;; GCC is free software; you can redistribute it and/or modify
> +;; it under the terms of the GNU General Public License as published by
> +;; the Free Software Foundation; either version 3, or (at your option)
> +;; any later version.
> +
> +;; GCC is distributed in the hope that it will be useful,
> +;; but WITHOUT ANY WARRANTY; without even the implied warranty of
> +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +;; GNU General Public License for more details.
> +
> +;; You should have received a copy of the GNU General Public License
> +;; along with GCC; see the file COPYING3.  If not see
> +;; <http://www.gnu.org/licenses/>.
> +
> +(define_insn "*movcc_bitmanip"
> +  [(set (match_operand:GPR 0 "register_operand" "=r")
> +   (if_then_else:GPR
> + (any_eq:X (match_operand:X 1 "register_operand" "r")
> +(match_operand:X 2 "const_0_operand" "J"))
> +(match_operand:GPR 3 "reg_or_0_operand" "rJ")
> +(match_operand:GPR 4 "reg_or_0_operand" "rJ")))]
> +  "TARGET_XMIPSCMOV"
> +{
> +  enum rtx_code code = <CODE>;
> +  if (code == NE)
> +return "mips.ccmov\t%0,%1,%z3,%z4";
> +  else
> +return "mips.ccmov\t%0,%1,%z4,%z3";
> +}
> +[(set_attr "type" "condmove")
> + (set_attr "mode" "")])
> diff --git a/gcc/config/riscv/riscv-cores.def b/gcc/config/riscv/riscv-cores.def
> index 2096c0095d4..98f347034fb 100644
> --- a/gcc/config/riscv/riscv-cores.def
> +++ b/gcc/config/riscv/riscv-cores.def
> @@ -169,7 +169,6 @@ RISCV_CORE("xiangshan-kunminghu",
>  "rv64imafdcbvh_sdtrig_sha_shcounterenw_"
>   "zvfhmin_zvkt_zvl128b_zvl32b_zvl64b",
>   "xiangshan-kunminghu")
>
> -RISCV_CORE("mips-p8700",   "rv64imafd_zicsr_zmmul_"
> - "zaamo_zalrsc_zba_zbb",
> +RISCV_CORE("mips-p8700",  "rv64imfd_zicsr_zifencei_zalrsc_zba_zbb",
>   "mips-p8700")
>  #undef RISCV_CORE
> diff --git a/gcc/config/riscv/riscv-ext-mips.def b/gcc/config/riscv/riscv-ext-mips.def
> new file mode 100644
> index 000..f24507139f6
> --- /dev/null
> +++ b/gcc/config/riscv/riscv-ext-mips.def
> @@ -0,0 +1,35 @@
> +/* MIPS extension definition file for RISC-V.
> +   Copyright (C) 2025 Free Software Foundation, Inc.
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify
> +it under the terms of the GNU General Public License as published by
> +the Free Software Foundation; either version 3, or (at your option)
> +any later version.
> +
> +GCC is distributed in the hope that it will be useful,
> +but WITHOUT ANY WARRANTY; with

[PATCH] crc: Error out on non-constant poly arguments for the crc builtins [PR120709]

2025-07-06 Thread Andrew Pinski
These builtins require a constant integer for the third argument, but currently
there is an assert rather than an error. This fixes that and updates the
documentation too.
It uses the same terms as are used for the __builtin_prefetch arguments.

Bootstrapped and tested on x86_64-linux-gnu.

PR middle-end/120709

gcc/ChangeLog:

* builtins.cc (expand_builtin_crc_table_based): Error out
instead of asserting the 3rd argument is an integer constant.
* internal-fn.cc (expand_crc_optab_fn): Likewise.
* doc/extend.texi (crc): Document requirement of the poly argument
being a constant.

gcc/testsuite/ChangeLog:

* gcc.dg/crc-non-cst-poly-1.c: New test.

Signed-off-by: Andrew Pinski 
---
 gcc/builtins.cc                           | 12 +---
 gcc/doc/extend.texi                       |  4 ++--
 gcc/internal-fn.cc                        | 11 ---
 gcc/testsuite/gcc.dg/crc-non-cst-poly-1.c | 11 +++
 4 files changed, 30 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/crc-non-cst-poly-1.c

diff --git a/gcc/builtins.cc b/gcc/builtins.cc
index a2ce3726810..7f580a3145f 100644
--- a/gcc/builtins.cc
+++ b/gcc/builtins.cc
@@ -7799,11 +7799,17 @@ expand_builtin_crc_table_based (internal_fn fn, scalar_mode crc_mode,
 
   rtx op1 = expand_normal (rhs1);
   rtx op2 = expand_normal (rhs2);
-  gcc_assert (TREE_CODE (rhs3) == INTEGER_CST);
-  rtx op3 = gen_int_mode (TREE_INT_CST_LOW (rhs3), crc_mode);
+  rtx op3;
+  if (TREE_CODE (rhs3) != INTEGER_CST)
+{
+  error ("third argument to %<crc%> builtins must be a constant");
+  op3 = const0_rtx;
+}
+  else
+op3 = convert_to_mode (crc_mode, expand_normal (rhs3), 0);
 
   if (CONST_INT_P (op2))
-op2 = gen_int_mode (INTVAL (op2), crc_mode);
+op2 = convert_to_mode (crc_mode, op2, 0);
 
   if (fn == IFN_CRC)
 expand_crc_table_based (target, op1, op2, op3, data_mode);
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 70adf2dab0a..a119ad31ea2 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -15553,7 +15553,7 @@ are 128-bit.  Only supported on targets when 128-bit types are supported.
 Returns the calculated 8-bit bit-reversed CRC using the initial CRC (8-bit),
 data (8-bit) and the polynomial (8-bit).
 @var{crc} is the initial CRC, @var{data} is the data and
-@var{poly} is the polynomial without leading 1.
+@var{poly} is the polynomial without leading 1. @var{poly} is required to be a compile-time constant.
 Table-based or clmul-based CRC may be used for the
 calculation, depending on the target architecture.
 @enddefbuiltin
@@ -15608,7 +15608,7 @@ is 32-bit.
 Returns the calculated 8-bit bit-forward CRC using the initial CRC (8-bit),
 data (8-bit) and the polynomial (8-bit).
 @var{crc} is the initial CRC, @var{data} is the data and
-@var{poly} is the polynomial without leading 1.
+@var{poly} is the polynomial without leading 1. @var{poly} is required to be a compile-time constant.
 Table-based or clmul-based CRC may be used for the
 calculation, depending on the target architecture.
 @enddefbuiltin
diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 3f4ac937367..39048d77d23 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -4031,9 +4031,14 @@ expand_crc_optab_fn (internal_fn fn, gcall *stmt, convert_optab optab)
   rtx dest = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
   rtx crc = expand_normal (rhs1);
   rtx data = expand_normal (rhs2);
-  gcc_assert (TREE_CODE (rhs3) == INTEGER_CST);
-  rtx polynomial = gen_rtx_CONST_INT (TYPE_MODE (result_type),
- TREE_INT_CST_LOW (rhs3));
+  rtx polynomial;
+  if (TREE_CODE (rhs3) != INTEGER_CST)
+{
+  error ("third argument to %<crc%> builtins must be a constant");
+  polynomial = const0_rtx;
+}
+  else
+polynomial = convert_to_mode (TYPE_MODE (result_type), expand_normal (rhs3), 0);
 
   /* Use target specific expansion if it exists.
  Otherwise, generate table-based CRC.  */
diff --git a/gcc/testsuite/gcc.dg/crc-non-cst-poly-1.c b/gcc/testsuite/gcc.dg/crc-non-cst-poly-1.c
new file mode 100644
index 000..0c3d9054017
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/crc-non-cst-poly-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "" } */
+
+/* PR middle-end/120709 */
+/* Make sure we don't ICE on a non-constant poly argument. */
+
+
+typedef unsigned char uint8_t;
+uint8_t crc8_data8(uint8_t crc, uint8_t data, uint8_t polynomial) {
+  return __builtin_rev_crc32_data8 (crc, data, polynomial); /* { dg-error "must be a constant" } */
+}
-- 
2.43.0
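
For readers unfamiliar with the phrase "polynomial without leading 1" in the
documentation hunks above, here is a plain-C bitwise sketch of a bit-forward
8-bit CRC step. It is not the expansion GCC generates and is not claimed to
match any `__builtin_crc*` function bit for bit; `crc8_step` and `crc8_buf`
are hypothetical names. The polynomial x^8 + x^2 + x + 1 is 0x107 in full, so
it is passed as 0x07.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One byte of a bit-forward (MSB-first) CRC-8.  POLY omits the implicit
   leading x^8 term, so x^8 + x^2 + x + 1 is passed as 0x07.  */
static uint8_t
crc8_step (uint8_t crc, uint8_t data, uint8_t poly)
{
  crc ^= data;
  for (int i = 0; i < 8; i++)
    crc = (crc & 0x80) ? (uint8_t) ((crc << 1) ^ poly) : (uint8_t) (crc << 1);
  return crc;
}

/* Run the step over a buffer, starting from an initial CRC of 0.  */
static uint8_t
crc8_buf (const uint8_t *buf, size_t len, uint8_t poly)
{
  assert (buf != NULL || len == 0);
  uint8_t crc = 0;
  for (size_t i = 0; i < len; i++)
    crc = crc8_step (crc, buf[i], poly);
  return crc;
}
```

When the polynomial is a literal as here, a lookup table for it can be built
ahead of time, which is presumably why a table-based expansion needs the
argument to be a compile-time constant.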



Re: Fwd: [PATCH v3] simplify-rtx.cc:Simplify XOR(AND(ROTATE(~1) A) ASHIFT(1 A)) to IOR.

2025-07-06 Thread chenxiaolong

Hi,

At present, this optimization has not been verified by the test case
redundant-bitmap-2.C on RV. Part of the reason is that
SHIFT_COUNT_TRUNCATED is defined as 0 there, so the condition is not met.

Could you provide some relevant test cases to verify the effect of this
additional simplification on architectures such as RV or x86?




This patch adds a new simplification rule to `simplify-rtx.cc` that
handles a common bit manipulation pattern involving a single-bit set
and clear followed by XOR.

The transformation targets RTL of the form:

(xor (and (rotate (~1) A) B) (ashift 1 A))

which is semantically equivalent to:

B | (1 << A)
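
The equivalence can be checked in plain C for 32-bit values. This is only a
sketch: `rotl32`, `xor_form`, `ior_form` and `forms_agree` are hypothetical
names, and the shift count is masked to 0..31 explicitly, which is an
assumption standing in for a target that truncates shift counts. Rotating ~1
left by A gives a mask with only bit A clear, so the AND clears bit A of B and
the XOR with 1 << A then sets it, which is the same as OR-ing the bit in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rotate left by N bits, N in 0..31.  */
static uint32_t
rotl32 (uint32_t x, unsigned n)
{
  assert (n < 32);
  return (x << n) | (x >> ((32 - n) & 31));
}

/* (xor (and (rotate (~1) A) B) (ashift 1 A)) */
static uint32_t
xor_form (uint32_t b, unsigned a)
{
  a &= 31;
  return (rotl32 (~UINT32_C(1), a) & b) ^ (UINT32_C(1) << a);
}

/* B | (1 << A) */
static uint32_t
ior_form (uint32_t b, unsigned a)
{
  a &= 31;
  return b | (UINT32_C(1) << a);
}

/* Compare both forms for every bit position and a few sample values.  */
static int
forms_agree (void)
{
  static const uint32_t samples[]
    = { 0u, 1u, 0x80000000u, 0xdeadbeefu, 0xffffffffu };
  for (unsigned a = 0; a < 32; a++)
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
      if (xor_form (samples[i], a) != ior_form (samples[i], a))
        return 0;
  return 1;
}
```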

- v3 log:
Update RTL format, remove commas.
Only apply on SHIFT_COUNT_TRUNCATED target.
check '!side_effects_p' on XEXP (op1, 1).

gcc/ChangeLog:

* simplify-rtx.cc (simplify_context::simplify_binary_operation_1): Handle
more logical simplifications.

---
gcc/simplify-rtx.cc | 14 ++
1 file changed, 14 insertions(+)

diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
index b34fd2f4b9e..cbe61b49bf6 100644
--- a/gcc/simplify-rtx.cc
+++ b/gcc/simplify-rtx.cc
@@ -4063,6 +4063,20 @@ simplify_context::simplify_binary_operation_1 (rtx_code code,

&& rtx_equal_p (XEXP (XEXP (op0, 0), 0), op1))
return simplify_gen_binary (IOR, mode, XEXP (op0, 1), op1);
+ /* Convert (xor (and (rotate (~1) A) B) (ashift 1 A))
+ into B | (1 << A). */
+ if (SHIFT_COUNT_TRUNCATED
+ && GET_CODE (op0) == AND
+ && GET_CODE (XEXP (op0, 0)) == ROTATE
+ && CONST_INT_P (XEXP (XEXP (op0, 0), 0))
+ && INTVAL (XEXP (XEXP (op0, 0), 0)) == -2
+ && GET_CODE (op1) == ASHIFT
+ && CONST_INT_P (XEXP (op1, 0))
+ && INTVAL (XEXP (op1, 0)) == 1
+ && rtx_equal_p (XEXP (XEXP (op0, 0), 1), XEXP (op1, 1))
+ && !side_effects_p (XEXP (op1, 1)))
+ return simplify_gen_binary (IOR, mode, XEXP (op0, 1), op1);
+
tem = simplify_with_subreg_not (code, mode, op0, op1);
if (tem)
return tem;
--
2.43.0





RE: [PATCH v3 3/4] RISC-V: Implement unsigned scalar SAT_MUL from uint128_t

2025-07-06 Thread Li, Pan2
I see, thanks a lot.

Pan

-Original Message-
From: Jeff Law  
Sent: Monday, July 7, 2025 10:38 AM
To: Li, Pan2 ; Robin Dapp ; 
gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; Chen, Ken ; 
Liu, Hongtao 
Subject: Re: [PATCH v3 3/4] RISC-V: Implement unsigned scalar SAT_MUL from 
uint128_t



On 7/4/25 10:54 PM, Li, Pan2 wrote:
> 
>> What you do want to watch out for is constants
> 
> Here I want the max value of unsigned scalar based on mode, it could be 
> UINT8_MAX,
> UINT16_MAX, UINT32_MAX and UINT64_MAX.
Understood, but within the compiler HOST_WIDE_INT is how we tend to want 
to work on objects.  And to get constants handled correctly we use the 
macros I mentioned in the prior message.  We don't tend to use things 
like int32_t, int64_t in the compiler itself.


jeff



RE: [PATCH] RISC-V: Add testcases for unsigned vector SAT_SUB form 11 and form 12

2025-07-06 Thread Li, Pan2
Thanks Ciyan.

+#include "vec_sat_arith.h"
+
+#define T  uint16_t
+#define N  16
+#define RUN_VEC_SAT_BINARY RUN_VEC_SAT_U_SUB_FMT_11
+
+DEF_VEC_SAT_U_SUB_FMT_11(T)
+
+T test_data[][3][N] = {
+  {
+{
+  0, 0, 0, 0,
+  0, 0, 0, 0,
+  0, 0, 0, 0,
+  0, 0, 0, 0,
+}, /* arg_0 */
+{
+  0, 0, 0, 0,
+  0, 0, 0, 0,
+  0, 0, 0, 0,
+  0, 0, 0, 0,
+}, /* arg_1 */
+{
+  0, 0, 0, 0,
+  0, 0, 0, 0,
+  0, 0, 0, 0,
+  0, 0, 0, 0,
+}, /* expect */
+  },
+  {

Can we just leverage the existing data in "vec_sat_data.h"? I suppose we have
test data for SAT_SUB already. Ok with that change.

Pan

-Original Message-
From: Ciyan Pan  
Sent: Monday, July 7, 2025 10:41 AM
To: gcc-patches@gcc.gnu.org
Cc: kito.ch...@gmail.com; richard.guent...@gmail.com; tamar.christ...@arm.com; 
juzhe.zh...@rivai.ai; Li, Pan2 ; jeffreya...@gmail.com; 
rdapp@gmail.com; panciyan 
Subject: [PATCH] RISC-V: Add testcases for unsigned vector SAT_SUB form 11 and 
form 12

From: panciyan 

This patch adds testcases for form 11 and form 12, as shown below:

void __attribute__((noinline))   \
vec_sat_u_sub_##T##_fmt_11 (T *out, T *op_1, T *op_2, unsigned limit) \
{\
  unsigned i;\
  for (i = 0; i < limit; i++)\
{\
  T x = op_1[i]; \
  T y = op_2[i]; \
  T ret; \
  T overflow = __builtin_sub_overflow (x, y, &ret);   \
  out[i] = overflow ? 0 : ret;   \
}\
}

void __attribute__((noinline))\
vec_sat_u_sub_##T##_fmt_12 (T *out, T *op_1, T *op_2, unsigned limit) \
{ \
  unsigned i; \
  for (i = 0; i < limit; i++) \
{ \
  T x = op_1[i];  \
  T y = op_2[i];  \
  T ret;  \
  T overflow = __builtin_sub_overflow (x, y, &ret);\
  out[i] = !overflow ? ret : 0;   \
} \
}

Passed the rv64gcv regression test.

Signed-off-by: Ciyan Pan 
gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/sat/vec_sat_arith.h: Add unsigned vector 
SAT_SUB form11 and form12.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-11-u16.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-11-u32.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-11-u64.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-11-u8.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-12-u16.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-12-u32.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-12-u64.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-12-u8.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-11-u16.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-11-u32.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-11-u64.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-11-u8.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-12-u16.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-12-u32.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-12-u64.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-12-u8.c: New test.

---
 .../riscv/rvv/autovec/sat/vec_sat_arith.h | 30 
 .../rvv/autovec/sat/vec_sat_u_sub-11-u16.c|  9 +++
 .../rvv/autovec/sat/vec_sat_u_sub-11-u32.c|  9 +++
 .../rvv/autovec/sat/vec_sat_u_sub-11-u64.c|  9 +++
 .../rvv/autovec/sat/vec_sat_u_sub-11-u8.c |  9 +++
 .../rvv/autovec/sat/vec_sat_u_sub-12-u16.c|  9 +++
 .../rvv/autovec/sat/vec_sat_u_sub-12-u32.c|  9 +++
 .../rvv/autovec/sat/vec_sat_u_sub-12-u64.c|  9 +++
 .../rvv/autovec/sat/vec_sat_u_sub-12-u8.c |  9 +++
 .../autovec/sat/vec_sat_u_sub-run-11-u16.c| 75 +++
 .../autovec/sat/vec_sat_u_sub-run-11-u32.c| 75 +++
 .../autovec/sat/vec_sat_u_sub-

Re: [PATCH v2] libstdc++: Use hidden friends for __normal_iterator operators

2025-07-06 Thread Jonathan Wakely
On Sun, 6 Jul 2025 at 19:53, Stephan Bergmann  wrote:
>
> On 6/12/25 10:46, Jonathan Wakely wrote:
> > It now says:
> >
> > I also had to reorder the __attribute__((always_inline)) and
> > [[nodiscard]] attributes on the pre-c++20 operators, because Clang won't
> > allow [[foo]] after __attribute__((bar)) on a friend function:
> >
> > :4:36: error: an attribute list cannot appear here
> > 4 | __attribute__((always_inline)) [[nodiscard]] friend bool
> >   |^
>
> Just noting that at least with recent Clang 21 trunk,

Yes, this is https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120949

>
> > $ cat test.cc
> > #include <iterator>
>
> > $ clang++ -std=c++17 -fsyntax-only test.cc
> > In file included from test.cc:1:
> > In file included from 
> > ~/gcc/inst/lib/gcc/x86_64-pc-linux-gnu/16.0.0/../../../../include/c++/16.0.0/iterator:65:
> > ~/gcc/inst/lib/gcc/x86_64-pc-linux-gnu/16.0.0/../../../../include/c++/16.0.0/bits/stl_iterator.h:1252:37:
> >  error:
> >   an attribute list cannot appear here
> >  1252 | __attribute__((__always_inline__)) _GLIBCXX_NODISCARD 
> > _GLIBCXX_CONSTEXPR
> >   |^~
> > ~/gcc/inst/lib/gcc/x86_64-pc-linux-gnu/16.0.0/../../../../include/c++/16.0.0/x86_64-pc-linux-gnu/bits/c++config.h:173:29:
> >  note:
> >   expanded from macro '_GLIBCXX_NODISCARD'
> >   173 | # define _GLIBCXX_NODISCARD [[__nodiscard__]]
> >   | ^
> > In file included from test.cc:1:
> > In file included from 
> > ~/gcc/inst/lib/gcc/x86_64-pc-linux-gnu/16.0.0/../../../../include/c++/16.0.0/iterator:65:
> > ~/gcc/inst/lib/gcc/x86_64-pc-linux-gnu/16.0.0/../../../../include/c++/16.0.0/bits/stl_iterator.h:1269:37:
> >  error:
> >   an attribute list cannot appear here
> >  1269 | __attribute__((__always_inline__)) _GLIBCXX_NODISCARD 
> > _GLIBCXX_CONSTEXPR
> >   |^~
> > ~/gcc/inst/lib/gcc/x86_64-pc-linux-gnu/16.0.0/../../../../include/c++/16.0.0/x86_64-pc-linux-gnu/bits/c++config.h:173:29:
> >  note:
> >   expanded from macro '_GLIBCXX_NODISCARD'
> >   173 | # define _GLIBCXX_NODISCARD [[__nodiscard__]]
> >   | ^
> > 2 errors generated.
>
> now fails (while with -std=c++20 it works), where something like
>
> > diff --git a/libstdc++-v3/include/bits/stl_iterator.h 
> > b/libstdc++-v3/include/bits/stl_iterator.h
> > index a7188f46f6d..26bb0206375 100644
> > --- a/libstdc++-v3/include/bits/stl_iterator.h
> > +++ b/libstdc++-v3/include/bits/stl_iterator.h
> > @@ -1248,8 +1248,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >
> >// Random access iterator requirements
> >template
> > +   __attribute__((__always_inline__)) _GLIBCXX_NODISCARD
> > friend
> > -   __attribute__((__always_inline__)) _GLIBCXX_NODISCARD 
> > _GLIBCXX_CONSTEXPR
> > +   _GLIBCXX_CONSTEXPR
> > inline bool
> > operator<(const __normal_iterator& __lhs,
> >   const __normal_iterator<_Iter, _Container>& __rhs)
> > @@ -1265,8 +1266,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >{ return __lhs.base() < __rhs.base(); }
> >
> >template
> > +   __attribute__((__always_inline__)) _GLIBCXX_NODISCARD
> > friend
> > -   __attribute__((__always_inline__)) _GLIBCXX_NODISCARD 
> > _GLIBCXX_CONSTEXPR
> > +   _GLIBCXX_CONSTEXPR
> > bool
> > operator>(const __normal_iterator& __lhs,
> >   const __normal_iterator<_Iter, _Container>& __rhs)
>
> would fix that.
>



Re: [PATCH v3 3/4] RISC-V: Implement unsigned scalar SAT_MUL from uint128_t

2025-07-06 Thread Jeff Law




On 7/4/25 10:54 PM, Li, Pan2 wrote:



>> What you do want to watch out for is constants
>
> Here I want the max value of unsigned scalar based on mode, it could be
> UINT8_MAX, UINT16_MAX, UINT32_MAX and UINT64_MAX.
Understood, but within the compiler HOST_WIDE_INT is how we tend to want 
to work on objects.  And to get constants handled correctly we use the 
macros I mentioned in the prior message.  We don't tend to use things 
like int32_t, int64_t in the compiler itself.
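
As a rough illustration of the point above, the maximum unsigned value for a given precision can be computed in one wide host integer instead of picking among UINT8_MAX .. UINT64_MAX. This is a sketch only: uint64_t stands in for the compiler's wide host integer type, and unsigned_max_for_bits is a hypothetical helper, not a GCC function or one of the macros referred to above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a GCC API): the all-ones value of an unsigned
// scalar that is `bits` wide, held in a 64-bit host integer.
static inline uint64_t unsigned_max_for_bits(unsigned bits) {
  assert(bits >= 1 && bits <= 64);
  // Shifting a 64-bit value by 64 is undefined, so the full width is special-cased.
  return bits == 64 ? ~UINT64_C(0) : (UINT64_C(1) << bits) - 1;
}
```

The constant 1 is given a 64-bit type before shifting; a plain int 1 would overflow for precisions of 32 and above.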



jeff



[PATCH] RISC-V: Add testcases for unsigned vector SAT_SUB form 11 and form 12

2025-07-06 Thread Ciyan Pan
From: panciyan 

This patch adds testcases for form 11 and form 12, as shown below:

void __attribute__((noinline))   \
vec_sat_u_sub_##T##_fmt_11 (T *out, T *op_1, T *op_2, unsigned limit) \
{\
  unsigned i;\
  for (i = 0; i < limit; i++)\
{\
  T x = op_1[i]; \
  T y = op_2[i]; \
  T ret; \
  T overflow = __builtin_sub_overflow (x, y, &ret);   \
  out[i] = overflow ? 0 : ret;   \
}\
}

void __attribute__((noinline))\
vec_sat_u_sub_##T##_fmt_12 (T *out, T *op_1, T *op_2, unsigned limit) \
{ \
  unsigned i; \
  for (i = 0; i < limit; i++) \
{ \
  T x = op_1[i];  \
  T y = op_2[i];  \
  T ret;  \
  T overflow = __builtin_sub_overflow (x, y, &ret);\
  out[i] = !overflow ? ret : 0;   \
} \
}
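
For readers who want to try the two forms outside the testsuite macros, here is a template-based sketch of the same loops. It assumes GCC or Clang (for __builtin_sub_overflow) and an unsigned element type T; the function names mirror the macros above but are otherwise illustrative. The two forms differ only in how the select is written.

```cpp
#include <cassert>
#include <cstdint>

// Form 11: pick 0 when the subtraction overflowed, else the difference.
template <typename T>
void vec_sat_u_sub_fmt_11(T *out, const T *op_1, const T *op_2, unsigned limit) {
  for (unsigned i = 0; i < limit; i++) {
    T ret;
    bool overflow = __builtin_sub_overflow(op_1[i], op_2[i], &ret);
    out[i] = overflow ? static_cast<T>(0) : ret;
  }
}

// Form 12: same computation with the condition negated and the arms swapped.
template <typename T>
void vec_sat_u_sub_fmt_12(T *out, const T *op_1, const T *op_2, unsigned limit) {
  for (unsigned i = 0; i < limit; i++) {
    T ret;
    bool overflow = __builtin_sub_overflow(op_1[i], op_2[i], &ret);
    out[i] = !overflow ? ret : static_cast<T>(0);
  }
}
```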

Passed the rv64gcv regression test.

Signed-off-by: Ciyan Pan 
gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/sat/vec_sat_arith.h: Add unsigned vector 
SAT_SUB form11 and form12.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-11-u16.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-11-u32.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-11-u64.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-11-u8.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-12-u16.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-12-u32.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-12-u64.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-12-u8.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-11-u16.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-11-u32.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-11-u64.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-11-u8.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-12-u16.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-12-u32.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-12-u64.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-run-12-u8.c: New test.

---
 .../riscv/rvv/autovec/sat/vec_sat_arith.h | 30 
 .../rvv/autovec/sat/vec_sat_u_sub-11-u16.c|  9 +++
 .../rvv/autovec/sat/vec_sat_u_sub-11-u32.c|  9 +++
 .../rvv/autovec/sat/vec_sat_u_sub-11-u64.c|  9 +++
 .../rvv/autovec/sat/vec_sat_u_sub-11-u8.c |  9 +++
 .../rvv/autovec/sat/vec_sat_u_sub-12-u16.c|  9 +++
 .../rvv/autovec/sat/vec_sat_u_sub-12-u32.c|  9 +++
 .../rvv/autovec/sat/vec_sat_u_sub-12-u64.c|  9 +++
 .../rvv/autovec/sat/vec_sat_u_sub-12-u8.c |  9 +++
 .../autovec/sat/vec_sat_u_sub-run-11-u16.c| 75 +++
 .../autovec/sat/vec_sat_u_sub-run-11-u32.c| 75 +++
 .../autovec/sat/vec_sat_u_sub-run-11-u64.c| 75 +++
 .../rvv/autovec/sat/vec_sat_u_sub-run-11-u8.c | 75 +++
 .../autovec/sat/vec_sat_u_sub-run-12-u16.c| 75 +++
 .../autovec/sat/vec_sat_u_sub-run-12-u32.c| 75 +++
 .../autovec/sat/vec_sat_u_sub-run-12-u64.c| 75 +++
 .../rvv/autovec/sat/vec_sat_u_sub-run-12-u8.c | 75 +++
 17 files changed, 702 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-11-u16.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-11-u32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-11-u64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-11-u8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-12-u16.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/sat/vec_sat_u_sub-12-u32.c
 create mode 100644 
gcc/testsuite/gcc.target/riscv/rvv/autovec/sat/vec

Re: [SNAPv3] libstdc++: Add NTTP bind_front, _back (P2714) [PR119744]

2025-07-06 Thread Tomasz Kaminski
On Fri, Jul 4, 2025 at 6:39 AM Nathan Myers  wrote:

> This is a snapshot of work on P2714 "Bind front and back to NTTP
> callables", posted for reference.
>
> Questions:
> 1. Jonathan asks if __type_forward_like_t does the same job as __like_t
> in bits/move.h.
> 2. Could the "if constexpr" statements be better expressed as requires
> clauses via the A=>B == !A||B identity?
>
I would create a wrapper for the no-bound-arguments case, as
bind_front/bind_back have exactly the same behavior there. We may also
want to consider a separate type for the single-argument case, which
corresponds to the function_ref nontype, ptr/ref constructor.

>
> libstdc++-v3/ChangeLog:
> PR libstdc++/119744
> * include/bits/version.def: Redefine __cpp_lib_bind_front etc.
> * include/bits/version.h: Ditto.
> * include/std/functional: Add new bind_front etc. overloads
> ---
>  libstdc++-v3/include/bits/version.def |  12 +++
>  libstdc++-v3/include/bits/version.h   |  21 -
>  libstdc++-v3/include/std/functional   | 124 +-
>  3 files changed, 153 insertions(+), 4 deletions(-)
>
> diff --git a/libstdc++-v3/include/bits/version.def
> b/libstdc++-v3/include/bits/version.def
> index 5d5758bf203..8ab9a7207e7 100644
> --- a/libstdc++-v3/include/bits/version.def
> +++ b/libstdc++-v3/include/bits/version.def
> @@ -463,6 +463,10 @@ ftms = {
>
>  ftms = {
>name = not_fn;
> +  values = {
> +v = 202306;
> +cxxmin = 26;
> +  };
>values = {
>  v = 201603;
>  cxxmin = 17;
> @@ -776,6 +780,10 @@ ftms = {
>
>  ftms = {
>name = bind_front;
> +  values = {
> +v = 202306;
> +cxxmin = 26;
> +  };
>values = {
>  v = 201907;
>  cxxmin = 20;
> @@ -784,6 +792,10 @@ ftms = {
>
>  ftms = {
>name = bind_back;
> +  values = {
> +v = 202306;
> +cxxmin = 26;
> +  };
>values = {
>  v = 202202;
>  cxxmin = 23;
> diff --git a/libstdc++-v3/include/bits/version.h
> b/libstdc++-v3/include/bits/version.h
> index 2b00e8419b3..c204ae3c48c 100644
> --- a/libstdc++-v3/include/bits/version.h
> +++ b/libstdc++-v3/include/bits/version.h
> @@ -511,7 +511,12 @@
>  #undef __glibcxx_want_make_from_tuple
>
>  #if !defined(__cpp_lib_not_fn)
> -# if (__cplusplus >= 201703L)
> +# if (__cplusplus >  202302L)
> +#  define __glibcxx_not_fn 202306L
> +#  if defined(__glibcxx_want_all) || defined(__glibcxx_want_not_fn)
> +#   define __cpp_lib_not_fn 202306L
> +#  endif
> +# elif (__cplusplus >= 201703L)
>  #  define __glibcxx_not_fn 201603L
>  #  if defined(__glibcxx_want_all) || defined(__glibcxx_want_not_fn)
>  #   define __cpp_lib_not_fn 201603L
> @@ -866,7 +871,12 @@
>  #undef __glibcxx_want_atomic_value_initialization
>
>  #if !defined(__cpp_lib_bind_front)
> -# if (__cplusplus >= 202002L)
> +# if (__cplusplus >  202302L)
> +#  define __glibcxx_bind_front 202306L
> +#  if defined(__glibcxx_want_all) || defined(__glibcxx_want_bind_front)
> +#   define __cpp_lib_bind_front 202306L
> +#  endif
> +# elif (__cplusplus >= 202002L)
>  #  define __glibcxx_bind_front 201907L
>  #  if defined(__glibcxx_want_all) || defined(__glibcxx_want_bind_front)
>  #   define __cpp_lib_bind_front 201907L
> @@ -876,7 +886,12 @@
>  #undef __glibcxx_want_bind_front
>
>  #if !defined(__cpp_lib_bind_back)
> -# if (__cplusplus >= 202100L) && (__cpp_explicit_this_parameter)
> +# if (__cplusplus >  202302L)
> +#  define __glibcxx_bind_back 202306L
> +#  if defined(__glibcxx_want_all) || defined(__glibcxx_want_bind_back)
> +#   define __cpp_lib_bind_back 202306L
> +#  endif
> +# elif (__cplusplus >= 202100L) && (__cpp_explicit_this_parameter)
>  #  define __glibcxx_bind_back 202202L
>  #  if defined(__glibcxx_want_all) || defined(__glibcxx_want_bind_back)
>  #   define __cpp_lib_bind_back 202202L
> diff --git a/libstdc++-v3/include/std/functional
> b/libstdc++-v3/include/std/functional
> index 307bcb95bcc..21f0b1cb2d5 100644
> --- a/libstdc++-v3/include/std/functional
> +++ b/libstdc++-v3/include/std/functional
> @@ -940,7 +940,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>   _M_bound_args(std::forward<_Args>(__args)...)
> { static_assert(sizeof...(_Args) == sizeof...(_BoundArgs)); }
>
> -#if __cpp_explicit_this_parameter
> +#ifdef __cpp_explicit_this_parameter
>template
> constexpr
> invoke_result_t<__like_t<_Self, _Fd>, __like_t<_Self,
> _BoundArgs>..., _CallArgs...>
> @@ -1218,8 +1218,130 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  {
>return _Not_fn>{std::forward<_Fn>(__fn), 0};
>  }
> +#if __cpp_lib_not_fn >= 202306
> +  /** Wrap a function type to create a function object that negates its
> result.
> +   *
> +   * The function template `std::not_fn` creates a "forwarding call
> wrapper",
> +   * which is a function object that when called forwards its arguments to
> +   * its invocable template argument.
> +   *
> +   * The result of invoking the wrapper is the negation (using `!`) of
> +   * the wrapped function object.
> +   *
> +   *  @ingroup funct

Re: [PATCH v2 4/5] libstdc++: Implement mdspan and tests.

2025-07-06 Thread Tomasz Kaminski
On Thu, Jul 3, 2025 at 4:14 PM Luc Grosheintz 
wrote:

> Thank you for the nice review! I've locally implemented everything and
> I'll send a v3 later today or tomorrow; after squashing the commits
> correctly; and retesting everything.
>
> Meanwhile a couple of comments below.
>
> On 7/1/25 16:42, Tomasz Kaminski wrote:
> > On Fri, Jun 27, 2025 at 11:37 AM Luc Grosheintz <
> luc.groshei...@gmail.com>
> > wrote:
> >
> >> Implements the class mdspan as described in N4950, i.e. without P3029.
> >> It also adds tests for mdspan.
> >>
> >> libstdc++-v3/ChangeLog:
> >>
> >>  * include/std/mdspan (mdspan): New class.
> >>  * src/c++23/std.cc.in: Add std::mdspan.
> >>  * testsuite/23_containers/mdspan/class_mandate_neg.cc: New
> test.
> >>  * testsuite/23_containers/mdspan/mdspan.cc: New test.
> >>  * testsuite/23_containers/mdspan/layout_like.h: Add class
> >>  LayoutLike which models a user-defined layout.
> >>
> >> Signed-off-by: Luc Grosheintz 
> >> ---
> >>
> > As usual, a really solid implementation; a few additional comments:
> > * use () to value-initialize in ctor initializer list
> > * redundant parentheses in requires clauses
> > * suggestion to add __mdspan::__size
> > * a few suggestions for tests
> >
> >   libstdc++-v3/include/std/mdspan   | 282 +
> >>   libstdc++-v3/src/c++23/std.cc.in  |   3 +-
> >>   .../23_containers/mdspan/class_mandate_neg.cc |  58 ++
> >>   .../23_containers/mdspan/layout_like.h|  63 ++
> >>   .../testsuite/23_containers/mdspan/mdspan.cc  | 540 ++
> >>   5 files changed, 945 insertions(+), 1 deletion(-)
> >>   create mode 100644
> >> libstdc++-v3/testsuite/23_containers/mdspan/class_mandate_neg.cc
> >>   create mode 100644
> >> libstdc++-v3/testsuite/23_containers/mdspan/layout_like.h
> >>   create mode 100644
> libstdc++-v3/testsuite/23_containers/mdspan/mdspan.cc
> >>
> >> diff --git a/libstdc++-v3/include/std/mdspan
> >> b/libstdc++-v3/include/std/mdspan
> >> index e198d65bba3..852f881971e 100644
> >> --- a/libstdc++-v3/include/std/mdspan
> >> +++ b/libstdc++-v3/include/std/mdspan
> >> @@ -1052,6 +1052,288 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >> { return __p + __i; }
> >>   };
> >>
> >> +  namespace __mdspan
> >> +  {
> >> +template
> >> +  constexpr bool
> >> +  __is_multi_index(const _Extents& __exts, span<_IndexType, _Nm>
> >> __indices)
> >> +  {
> >> +   static_assert(__exts.rank() == _Nm);
> >> +   for (size_t __i = 0; __i < __exts.rank(); ++__i)
> >> + if (__indices[__i] >= __exts.extent(__i))
> >> +   return false;
> >> +   return true;
> >> +  }
> >> +  }
> >> +
> >> +  template >> +  typename _LayoutPolicy = layout_right,
> >> +  typename _AccessorPolicy = default_accessor<_ElementType>>
> >> +class mdspan
> >> +{
> >> +  static_assert(!is_array_v<_ElementType>,
> >> +   "ElementType must not be an array type");
> >> +  static_assert(!is_abstract_v<_ElementType>,
> >> +   "ElementType must not be an abstract class type");
> >> +  static_assert(__mdspan::__is_extents<_Extents>,
> >> +   "Extents must be a specialization of std::extents");
> >> +  static_assert(is_same_v<_ElementType,
> >> + typename _AccessorPolicy::element_type>);
> >> +
> >> +public:
> >> +  using extents_type = _Extents;
> >> +  using layout_type = _LayoutPolicy;
> >> +  using accessor_type = _AccessorPolicy;
> >> +  using mapping_type = typename layout_type::template
> >> mapping;
> >> +  using element_type = _ElementType;
> >> +  using value_type = remove_cv_t;
> >> +  using index_type = typename extents_type::index_type;
> >> +  using size_type = typename extents_type::size_type;
> >> +  using rank_type = typename extents_type::rank_type;
> >> +  using data_handle_type = typename
> accessor_type::data_handle_type;
> >> +  using reference = typename accessor_type::reference;
> >> +
> >> +  static constexpr rank_type
> >> +  rank() noexcept { return extents_type::rank(); }
> >> +
> >> +  static constexpr rank_type
> >> +  rank_dynamic() noexcept { return extents_type::rank_dynamic(); }
> >> +
> >> +  static constexpr size_t
> >> +  static_extent(rank_type __r) noexcept
> >> +  { return extents_type::static_extent(__r); }
> >> +
> >> +  constexpr index_type
> >> +  extent(rank_type __r) const noexcept { return
> >> extents().extent(__r); }
> >> +
> >> +  constexpr
> >> +  mdspan()
> >> +  requires (rank_dynamic() > 0 &&
> >> + is_default_constructible_v &&
> >> + is_default_constructible_v &&
> >> + is_default_constructible_v)
> >> +  : _M_accessor{}, _M_mapping{}, _M_handle{}
> >>
> > Here and in every other constructor, please use () to value-initialize
> the
> > field,
> > for example:
> >   _M_accessor(), _M_mapping(), _M_handle()
>

Re: [PATCH v2] libstdc++: Use hidden friends for __normal_iterator operators

2025-07-06 Thread Stephan Bergmann

On 6/12/25 10:46, Jonathan Wakely wrote:

It now says:

I also had to reorder the __attribute__((always_inline)) and
[[nodiscard]] attributes on the pre-c++20 operators, because Clang won't
allow [[foo]] after __attribute__((bar)) on a friend function:

:4:36: error: an attribute list cannot appear here
4 | __attribute__((always_inline)) [[nodiscard]] friend bool
  |^


Just noting that at least with recent Clang 21 trunk,


$ cat test.cc
#include <iterator>



$ clang++ -std=c++17 -fsyntax-only test.cc
In file included from test.cc:1:
In file included from 
~/gcc/inst/lib/gcc/x86_64-pc-linux-gnu/16.0.0/../../../../include/c++/16.0.0/iterator:65:
~/gcc/inst/lib/gcc/x86_64-pc-linux-gnu/16.0.0/../../../../include/c++/16.0.0/bits/stl_iterator.h:1252:37: error: 
  an attribute list cannot appear here

 1252 | __attribute__((__always_inline__)) _GLIBCXX_NODISCARD 
_GLIBCXX_CONSTEXPR
  |^~
~/gcc/inst/lib/gcc/x86_64-pc-linux-gnu/16.0.0/../../../../include/c++/16.0.0/x86_64-pc-linux-gnu/bits/c++config.h:173:29: note: 
  expanded from macro '_GLIBCXX_NODISCARD'

  173 | # define _GLIBCXX_NODISCARD [[__nodiscard__]]
  | ^
In file included from test.cc:1:
In file included from 
~/gcc/inst/lib/gcc/x86_64-pc-linux-gnu/16.0.0/../../../../include/c++/16.0.0/iterator:65:
~/gcc/inst/lib/gcc/x86_64-pc-linux-gnu/16.0.0/../../../../include/c++/16.0.0/bits/stl_iterator.h:1269:37: error: 
  an attribute list cannot appear here

 1269 | __attribute__((__always_inline__)) _GLIBCXX_NODISCARD 
_GLIBCXX_CONSTEXPR
  |^~
~/gcc/inst/lib/gcc/x86_64-pc-linux-gnu/16.0.0/../../../../include/c++/16.0.0/x86_64-pc-linux-gnu/bits/c++config.h:173:29: note: 
  expanded from macro '_GLIBCXX_NODISCARD'

  173 | # define _GLIBCXX_NODISCARD [[__nodiscard__]]
  | ^
2 errors generated.


now fails (while with -std=c++20 it works), where something like


diff --git a/libstdc++-v3/include/bits/stl_iterator.h b/libstdc++-v3/include/bits/stl_iterator.h
index a7188f46f6d..26bb0206375 100644
--- a/libstdc++-v3/include/bits/stl_iterator.h
+++ b/libstdc++-v3/include/bits/stl_iterator.h
@@ -1248,8 +1248,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   // Random access iterator requirements

   template
+   __attribute__((__always_inline__)) _GLIBCXX_NODISCARD
friend
-   __attribute__((__always_inline__)) _GLIBCXX_NODISCARD _GLIBCXX_CONSTEXPR
+   _GLIBCXX_CONSTEXPR
inline bool
operator<(const __normal_iterator& __lhs,
  const __normal_iterator<_Iter, _Container>& __rhs)
@@ -1265,8 +1266,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   { return __lhs.base() < __rhs.base(); }
 
   template

+   __attribute__((__always_inline__)) _GLIBCXX_NODISCARD
friend
-   __attribute__((__always_inline__)) _GLIBCXX_NODISCARD _GLIBCXX_CONSTEXPR
+   _GLIBCXX_CONSTEXPR
bool
operator>(const __normal_iterator& __lhs,
  const __normal_iterator<_Iter, _Container>& __rhs)


would fix that.


Re: [PATCH] c++, v2: Pedwarn on invalid decl specifiers for for-range-declaration [PR84009]

2025-07-06 Thread Jason Merrill

On 7/5/25 1:05 PM, Jakub Jelinek wrote:

> On Sat, Jul 05, 2025 at 08:46:31AM -0400, Jason Merrill wrote:
>> I think we want these diagnostics enabled by default; I don't feel strongly
>> about unconditional pedwarn vs. permerror.
>
> So like this then?


OK.


2025-07-05  Jakub Jelinek  

PR c++/84009
* parser.cc (cp_parser_decomposition_declaration): Pedwarn
on thread_local, __thread or static in decl_specifiers for
for-range-declaration.
(cp_parser_init_declarator): Likewise, and also for extern
or register.

* g++.dg/cpp0x/range-for40.C: New test.
* g++.dg/cpp0x/range-for41.C: New test.
* g++.dg/cpp0x/range-for42.C: New test.
* g++.dg/cpp0x/range-for43.C: New test.

--- gcc/cp/parser.cc.jj 2025-07-04 19:49:14.702864248 +0200
+++ gcc/cp/parser.cc	2025-07-05 18:41:28.248664302 +0200
@@ -16919,6 +16919,15 @@ cp_parser_decomposition_declaration (cp_
/* Ensure DECL_VALUE_EXPR is created for all the decls but
 the underlying DECL.  */
cp_finish_decomp (decl, &decomp);
+  if (decl_spec_seq_has_spec_p (decl_specifiers, ds_thread))
+   pedwarn (decl_specifiers->locations[ds_thread],
+0, "for-range-declaration cannot be %qs",
+decl_specifiers->gnu_thread_keyword_p
+? "__thread" : "thread_local");
+  else if (decl_specifiers->storage_class == sc_static)
+   pedwarn (decl_specifiers->locations[ds_storage_class],
+0, "for-range-declaration cannot be %qs",
+"static");
  }
  
if (pushed_scope)

@@ -24162,7 +24171,26 @@ cp_parser_init_declarator (cp_parser* pa
  && token->type != CPP_SEMICOLON)
{
  if (maybe_range_for_decl && *maybe_range_for_decl != error_mark_node)
-   range_for_decl_p = true;
+   {
+ range_for_decl_p = true;
+ if (decl_spec_seq_has_spec_p (decl_specifiers, ds_thread))
+   pedwarn (decl_specifiers->locations[ds_thread],
+0, "for-range-declaration cannot be %qs",
+decl_specifiers->gnu_thread_keyword_p
+? "__thread" : "thread_local");
+ else if (decl_specifiers->storage_class == sc_static)
+   pedwarn (decl_specifiers->locations[ds_storage_class],
+0, "for-range-declaration cannot be %qs",
+"static");
+ else if (decl_specifiers->storage_class == sc_extern)
+   pedwarn (decl_specifiers->locations[ds_storage_class],
+0, "for-range-declaration cannot be %qs",
+"extern");
+ else if (decl_specifiers->storage_class == sc_register)
+   pedwarn (decl_specifiers->locations[ds_storage_class],
+0, "for-range-declaration cannot be %qs",
+"register");
+   }
  else
{
  if (!maybe_range_for_decl)
--- gcc/testsuite/g++.dg/cpp0x/range-for40.C.jj	2025-07-04 21:03:00.951729318 +0200
+++ gcc/testsuite/g++.dg/cpp0x/range-for40.C	2025-07-04 21:11:29.426240713 +0200
@@ -0,0 +1,41 @@
+// PR c++/84009
+// { dg-do compile { target c++11 } }
+
+int z[64];
+
+void
+foo ()
+{
+  for (static auto a : z)  // { dg-error "for-range-declaration cannot be 'static'" }
+;
+  for (thread_local auto a : z) // { dg-error "for-range-declaration cannot be 'thread_local'" }
+;
+  for (__thread auto a : z) // { dg-error "for-range-declaration cannot be '__thread'" }
+;  // { dg-error "function-scope 'a' implicitly auto and declared '__thread'" "" { target *-*-* } .-1 }
+  for (register auto a : z) // { dg-error "for-range-declaration cannot be 'register'" }
+;  // { dg-error "does not allow 'register' storage class specifier" "" { target c++17 } .-1 }
+  for (extern auto a : z)  // { dg-error "for-range-declaration cannot be 'extern'" }
+;  // { dg-error "'a' has both 'extern' and initializer" "" { target *-*-* } .-1 }
+  for (mutable auto a : z) // { dg-error "non-member 'a' cannot be declared 'mutable'" }
+;
+  for (virtual auto a : z) // { dg-error "'virtual' outside class declaration" }
+;
+  for (explicit auto a : z) // { dg-error "'explicit' outside class declaration" }
+;
+  for (friend auto a : z)  // { dg-error "'friend' used outside of class" }
+;
+  for (typedef auto a : z) // { dg-error "typedef declared 'auto'" }
+;  // { dg-error "typedef 'a' is initialized \\\(use 'decltype' instead\\\)" "" { target *-*-* } .-1 }
+#if __cplusplus >= 202002L
+  for (consteval auto a : z)   // { dg-error "a variable cannot be declared 
'constev

Re: [PATCH 1/2] Allow the target to request a masked vector epilogue

2025-07-06 Thread Richard Biener
On Fri, 4 Jul 2025, Richard Sandiford wrote:

> Richard Biener  writes:
> > @@ -1738,8 +1738,13 @@ protected:
> >unsigned int m_suggested_unroll_factor;
> >  
> >/* The suggested mode to be used for a vectorized epilogue or VOIDmode,
> > - determined at finish_cost.  */
> > + determined at finish_cost.  m_masked_epilogue is epilogue should use
> > + masked vectorization, regardless of the --param vect-partial-vector-usage
> 
> Does this mean "m_masked_epilogue is 1 if the epilogue should use..."?

m_masked_epilogue specifies whether the epilogue should use...

Sorry for the bad English; fixed as indicated.

> Is 0 valid, and if so, does it override the --param?  Or is the override
> only in one direction (to 1)?

Yes, 0 is valid and overrides the --param.
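
The precedence described here and in the quoted comment can be modelled roughly as below. This is a hypothetical sketch, not GCC code: use_masked_epilogue is an invented function, and the --param vect-partial-vector-usage setting is reduced to two booleans (whether the user set it explicitly, and whether its effective value asks for masking), which is a simplification.

```cpp
#include <cassert>

// Hypothetical model of the tri-state m_masked_epilogue:
//   target_pref == -1 : the target has no preference
//   target_pref ==  0 : the target asks for an unmasked epilogue
//   target_pref ==  1 : the target asks for a masked epilogue
static bool use_masked_epilogue(int target_pref, bool param_explicit,
                                bool param_masks) {
  if (param_explicit)
    return param_masks;   // an explicit user --param always wins
  if (target_pref == -1)
    return param_masks;   // no target preference: use the --param default
  return target_pref != 0; // 0 and 1 both override the --param default
}
```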

Richard.

> 
> Otherwise LGTM too FWIW.
> 
> Thanks,
> Richard
> 
> > + default.  If -1 then the --param setting takes precedence.  If the
> > + user explicitly specified --param vect-partial-vector-usage then that
> > + takes precedence.  */
> >machine_mode m_suggested_epilogue_mode;
> > +  int m_masked_epilogue;
> >  
> >/* True if finish_cost has been called.  */
> >bool m_finished;
> > @@ -1755,6 +1760,7 @@ vector_costs::vector_costs (vec_info *vinfo, bool costing_for_scalar)
> >  m_costs (),
> >  m_suggested_unroll_factor(1),
> >  m_suggested_epilogue_mode(VOIDmode),
> > +m_masked_epilogue (-1),
> >  m_finished (false)
> >  {
> >  }
> > @@ -1815,9 +1821,10 @@ vector_costs::suggested_unroll_factor () const
> >  /* Return the suggested epilogue mode.  */
> >  
> >  inline machine_mode
> > -vector_costs::suggested_epilogue_mode () const
> > +vector_costs::suggested_epilogue_mode (int &masked_p) const
> >  {
> >gcc_checking_assert (m_finished);
> > +  masked_p = m_masked_epilogue;
> >return m_suggested_epilogue_mode;
> >  }
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


Re: Add template keyword to <mdspan> for Clang

2025-07-06 Thread Tomasz Kaminski
Thanks.
I am not sure the template keyword is necessary here, as I believe this is a
type-only context, but I have never fully understood the rules around this.

On Sat, Jul 5, 2025 at 1:15 AM Jonathan Wakely  wrote:

> Clang wants this change:
>
> --- a/libstdc++-v3/include/std/mdspan
> +++ b/libstdc++-v3/include/std/mdspan
> @@ -509,7 +509,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>
> template
>   concept __mapping_of =
> -   is_same_v<typename _Layout::mapping<typename _Mapping::extents_type>,
> +   is_same_v<typename _Layout::template mapping<typename _Mapping::extents_type>,
>  _Mapping>;
>
> template
>
> to fix:
>
>
> /home/jwakely/gcc/latest/lib/gcc/x86_64-pc-linux-gnu/16.0.0/../../../../include/c++/16.0.0/mdspan:512:30:
> error: use 'template' keyword to treat 'mapping' as a dependent template name
>  512 | is_same_v<typename _Layout::mapping<typename _Mapping::extents_type>,
>  | ^
>
>
> I'll push that on Monday.
>
>
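For readers less familiar with the disambiguator, the following is a small self-contained sketch of the same pattern. The toy_layout, other_layout and toy_extents types are hypothetical stand-ins, and a plain variable template is used instead of a concept only to keep the sketch compilable with older language modes. Whether the keyword is strictly required after typename is the open question in this thread; spelling it out is accepted either way.

```cpp
#include <type_traits>

// Hypothetical stand-ins for an extents type and two layout policies.
struct toy_extents {};

struct toy_layout
{
  template<typename Extents>
    struct mapping { using extents_type = Extents; };
};

struct other_layout
{
  template<typename Extents>
    struct mapping { using extents_type = Extents; };
};

// Layout::mapping is a dependent member template, so the 'template'
// keyword tells the parser that the following '<' opens a template
// argument list rather than being a less-than operator.
template<typename Layout, typename Mapping>
  constexpr bool is_mapping_of
    = std::is_same<
        typename Layout::template mapping<typename Mapping::extents_type>,
        Mapping>::value;
```

The check holds for toy_layout with toy_layout::mapping<toy_extents> and fails when the same mapping is tested against other_layout.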


[r16-1971 Regression] FAIL: gcc.target/i386/pr120936-8.c check-function-bodies foo on Linux/x86_64

2025-07-06 Thread haochen.jiang
On Linux/x86_64,

349da53f13de274864d01b6ccc466961c472dbe1 is the first bad commit
commit 349da53f13de274864d01b6ccc466961c472dbe1
Author: H.J. Lu 
Date:   Thu Jul 3 10:13:48 2025 +0800

x86: Emit label only for __mcount_loc section

caused

FAIL: gcc.target/i386/pr120936-10.c check-function-bodies foo
FAIL: gcc.target/i386/pr120936-11.c check-function-bodies foo
FAIL: gcc.target/i386/pr120936-12.c check-function-bodies foo
FAIL: gcc.target/i386/pr120936-4.c check-function-bodies foo
FAIL: gcc.target/i386/pr120936-5.c check-function-bodies foo
FAIL: gcc.target/i386/pr120936-6.c check-function-bodies foo
FAIL: gcc.target/i386/pr120936-8.c check-function-bodies foo

with GCC configured with

../../gcc/configure 
--prefix=/export/users3/haochenj/src/gcc-bisect/master/master/r16-1971/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr120936-10.c 
--target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr120936-10.c --target_board='unix{-m64\ 
-march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr120936-11.c 
--target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr120936-11.c --target_board='unix{-m64\ 
-march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr120936-12.c 
--target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr120936-12.c --target_board='unix{-m64\ 
-march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr120936-4.c --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr120936-4.c --target_board='unix{-m32\ 
-march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr120936-4.c --target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr120936-4.c --target_board='unix{-m64\ 
-march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr120936-5.c --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr120936-5.c --target_board='unix{-m32\ 
-march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr120936-5.c --target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr120936-5.c --target_board='unix{-m64\ 
-march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr120936-6.c --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr120936-6.c --target_board='unix{-m32\ 
-march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr120936-6.c --target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr120936-6.c --target_board='unix{-m64\ 
-march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr120936-8.c --target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr120936-8.c --target_board='unix{-m64\ 
-march=cascadelake}'"

(Please do not reply to this email. For questions about this report, contact me
at haochen dot jiang at intel.com.)
(If you run into cascadelake-related problems, disabling AVX512F on the command
line might help.)
(However, please make sure that there are no potential problems with AVX512.)


Re: [PATCH] crc: Error out on non-constant poly arguments for the crc builtins [PR120709]

2025-07-06 Thread Richard Biener



> Am 06.07.2025 um 23:23 schrieb Andrew Pinski :
> 
> These builtins require a constant integer for the third argument, but
> currently there is an assert rather than an error. This fixes that and
> updates the documentation too.
> Uses the same terms as were being used for the __builtin_prefetch arguments.
> 
> Bootstrapped and tested on x86_64-linux-gnu.

Ok

Richard 

>PR middle-end/120709
> 
> gcc/ChangeLog:
> 
>* builtins.cc (expand_builtin_crc_table_based): Error out
>instead of asserting the 3rd argument is an integer constant.
>* internal-fn.cc (expand_crc_optab_fn): Likewise.
>* doc/extend.texi (crc): Document requirement of the poly argument
>being a constant.
> 
> gcc/testsuite/ChangeLog:
> 
>* gcc.dg/crc-non-cst-poly-1.c: New test.
> 
> Signed-off-by: Andrew Pinski 
> ---
> gcc/builtins.cc   | 12 +---
> gcc/doc/extend.texi   |  4 ++--
> gcc/internal-fn.cc| 11 ---
> gcc/testsuite/gcc.dg/crc-non-cst-poly-1.c | 11 +++
> 4 files changed, 30 insertions(+), 8 deletions(-)
> create mode 100644 gcc/testsuite/gcc.dg/crc-non-cst-poly-1.c
> 
> diff --git a/gcc/builtins.cc b/gcc/builtins.cc
> index a2ce3726810..7f580a3145f 100644
> --- a/gcc/builtins.cc
> +++ b/gcc/builtins.cc
> @@ -7799,11 +7799,17 @@ expand_builtin_crc_table_based (internal_fn fn, scalar_mode crc_mode,
> 
>   rtx op1 = expand_normal (rhs1);
>   rtx op2 = expand_normal (rhs2);
> -  gcc_assert (TREE_CODE (rhs3) == INTEGER_CST);
> -  rtx op3 = gen_int_mode (TREE_INT_CST_LOW (rhs3), crc_mode);
> +  rtx op3;
> +  if (TREE_CODE (rhs3) != INTEGER_CST)
> +{
> +  error ("third argument to %<crc%> builtins must be a constant");
> +  op3 = const0_rtx;
> +}
> +  else
> +op3 = convert_to_mode (crc_mode, expand_normal (rhs3), 0);
> 
>   if (CONST_INT_P (op2))
> -op2 = gen_int_mode (INTVAL (op2), crc_mode);
> +op2 = convert_to_mode (crc_mode, op2, 0);
> 
>   if (fn == IFN_CRC)
> expand_crc_table_based (target, op1, op2, op3, data_mode);
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index 70adf2dab0a..a119ad31ea2 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -15553,7 +15553,7 @@ are 128-bit.  Only supported on targets when 128-bit types are supported.
> Returns the calculated 8-bit bit-reversed CRC using the initial CRC (8-bit),
> data (8-bit) and the polynomial (8-bit).
> @var{crc} is the initial CRC, @var{data} is the data and
> -@var{poly} is the polynomial without leading 1.
> +@var{poly} is the polynomial without leading 1. @var{poly} is required to be a compile-time constant.
> Table-based or clmul-based CRC may be used for the
> calculation, depending on the target architecture.
> @enddefbuiltin
> @@ -15608,7 +15608,7 @@ is 32-bit.
> Returns the calculated 8-bit bit-forward CRC using the initial CRC (8-bit),
> data (8-bit) and the polynomial (8-bit).
> @var{crc} is the initial CRC, @var{data} is the data and
> -@var{poly} is the polynomial without leading 1.
> +@var{poly} is the polynomial without leading 1. @var{poly} is required to be a compile-time constant.
> Table-based or clmul-based CRC may be used for the
> calculation, depending on the target architecture.
> @enddefbuiltin
> diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
> index 3f4ac937367..39048d77d23 100644
> --- a/gcc/internal-fn.cc
> +++ b/gcc/internal-fn.cc
> @@ -4031,9 +4031,14 @@ expand_crc_optab_fn (internal_fn fn, gcall *stmt, convert_optab optab)
>   rtx dest = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
>   rtx crc = expand_normal (rhs1);
>   rtx data = expand_normal (rhs2);
> -  gcc_assert (TREE_CODE (rhs3) == INTEGER_CST);
> -  rtx polynomial = gen_rtx_CONST_INT (TYPE_MODE (result_type),
> -  TREE_INT_CST_LOW (rhs3));
> +  rtx polynomial;
> +  if (TREE_CODE (rhs3) != INTEGER_CST)
> +{
> +  error ("third argument to %<crc%> builtins must be a constant");
> +  polynomial = const0_rtx;
> +}
> +  else
> +polynomial = convert_to_mode (TYPE_MODE (result_type), expand_normal (rhs3), 0);
> 
>   /* Use target specific expansion if it exists.
>  Otherwise, generate table-based CRC.  */
> diff --git a/gcc/testsuite/gcc.dg/crc-non-cst-poly-1.c b/gcc/testsuite/gcc.dg/crc-non-cst-poly-1.c
> new file mode 100644
> index 000..0c3d9054017
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/crc-non-cst-poly-1.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "" } */
> +
> +/* PR middle-end/120709 */
> +/* Make sure we don't ICE on a non-constant poly argument. */
> +
> +
> +typedef unsigned char uint8_t;
> +uint8_t crc8_data8(uint8_t crc, uint8_t data, uint8_t polynomial) {
> +  return __builtin_rev_crc32_data8 (crc, data, polynomial); /* { dg-error "must be a constant" } */
> +}
> --
> 2.43.0
>
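As an illustration of what "the polynomial without leading 1" and "compile-time constant" mean in the documentation hunks above, here is a portable bitwise reference for a bit-forward CRC-8. It is only a sketch: crc8_step and crc8 are hypothetical names, the code does not call the GCC builtins, and it is not claimed to match GCC's table-based or clmul-based expansion. Passing the polynomial as a template argument is one way to force it to be a constant in user code.

```cpp
#include <cassert>
#include <cstdint>

// Reference bitwise step for a bit-forward (MSB-first) CRC-8.
// Poly is the generator polynomial without its leading 1, e.g. 0x07 for
// x^8 + x^2 + x + 1.  As a template parameter it must be a compile-time
// constant, loosely mirroring the requirement the patch documents.
template<std::uint8_t Poly>
constexpr std::uint8_t
crc8_step (std::uint8_t crc, std::uint8_t data)
{
  crc ^= data;
  for (int bit = 0; bit < 8; ++bit)
    crc = static_cast<std::uint8_t> ((crc & 0x80) ? ((crc << 1) ^ Poly)
                                                  : (crc << 1));
  return crc;
}

// Fold a NUL-terminated buffer through the step function.
template<std::uint8_t Poly>
constexpr std::uint8_t
crc8 (const char *s, std::uint8_t crc = 0)
{
  for (; *s; ++s)
    crc = crc8_step<Poly> (crc, static_cast<std::uint8_t> (*s));
  return crc;
}
```

With polynomial 0x07 and an initial value of 0, the string "123456789" yields 0xF4, the usual check value for this CRC-8 variant.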