date:20250515

[COMMITTED 2/4] PR tree-optimization/116546 - Improve constant bitmasks.

2025-05-15 Thread Andrew MacLeod

Bitmasks for constants are currently created only for trailing zeros. 
There is no reason not to also include leading 1's in the value that are 
also known.


Given a range such as [5,7], the possible values in binary are 0101, 
0110, and 0111.  We can conclude that not only are the leading 5 
bits 0, but also that the next bit is a 1.


The calculation to include the 1's is actually slightly faster than 
excluding them. This also helps with various bit comparisons.


  before :  [5, 7]  mask 0x7 value 0x0    //  first 5 bits are known to be 0
  after  :  [5, 7]  mask 0x3 value 0x4    //  first 6 bits are known to be 01
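
As a rough illustration of the calculation, here is a standalone sketch.
It is not the GCC code: it uses plain 8-bit integers instead of wide_int,
and the names bitmask, low_mask_covering, old_bitmask and new_bitmask are
made up for this example.  A 1 in the mask means "bit unknown".

```cpp
#include <cstdint>

// Hypothetical 8-bit stand-in for irange_bitmask: a mask bit of 1 means the
// bit is unknown, a mask bit of 0 means the bit equals the bit in value.
struct bitmask
{
  std::uint8_t value;
  std::uint8_t mask;
};

// Return a mask of low-order ones wide enough to cover every set bit of X.
static std::uint8_t
low_mask_covering (std::uint8_t x)
{
  std::uint8_t m = 0;
  while (x)
    {
      m = static_cast<std::uint8_t> ((m << 1) | 1);
      x >>= 1;
    }
  return m;
}

// Old scheme: only leading zeros are recorded, value is always zero.
bitmask
old_bitmask (std::uint8_t min, std::uint8_t max)
{
  if (min == max)
    return { min, 0 };
  std::uint8_t xm = low_mask_covering (static_cast<std::uint8_t> (min ^ max));
  return { 0, static_cast<std::uint8_t> (min | xm) };
}

// New scheme: every leading bit shared by MIN and MAX is known,
// whether it is a zero or a one.
bitmask
new_bitmask (std::uint8_t min, std::uint8_t max)
{
  if (min == max)
    return { min, 0 };
  std::uint8_t xm = low_mask_covering (static_cast<std::uint8_t> (min ^ max));
  return { static_cast<std::uint8_t> (~xm & min), xm };
}
```

For [5, 7] the XOR of the bounds is 0x2, so only the low two bits differ;
the sketch then yields mask 0x7 value 0x0 under the old scheme and
mask 0x3 value 0x4 under the new one.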


Bootstraps on  x86_64-pc-linux-gnu with no regressions.   Pushed.

Andrew
From 65cd212bd4c533351a09e6974f40ae5d7effca84 Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Wed, 14 May 2025 11:12:22 -0400
Subject: [PATCH 2/4] Improve constant bitmasks.

bitmasks for constants are created only for trailing zeros. It is no
additional work to also include leading 1's in the value that are also
known.
  before :  [5, 7]  mask 0x7 value 0x0
  after  :  [5, 7]  mask 0x3 value 0x4

	PR tree-optimization/116546
	* value-range.cc (irange_bitmask::irange_bitmask): Include
	leading ones in the bitmask.
---
 gcc/value-range.cc | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/gcc/value-range.cc b/gcc/value-range.cc
index 48a1521b81e..64d10f41168 100644
--- a/gcc/value-range.cc
+++ b/gcc/value-range.cc
@@ -47,9 +47,11 @@ irange_bitmask::irange_bitmask (tree type,
   else
 {
   wide_int xorv = min ^ max;
-  xorv = wi::mask (prec - wi::clz (xorv), false, prec);
-  m_value = wi::zero (prec);
-  m_mask = min | xorv;
+  // Mask will have leading zeros for all leading bits that are
+  // common, both zeros and ones.
+  m_mask = wi::mask (prec - wi::clz (xorv), false, prec);
+  // Now set value to those bits which are known, and zero the rest.
+  m_value = ~m_mask & min;
 }
 }
 
-- 
2.45.0

[COMMITTED 3/4] PR tree-optimization/116546 - Allow bitmask intersection to process unknown masks.

2025-05-15 Thread Andrew MacLeod

bitmask_intersection should not return immediately if the current mask 
is unknown.


Unknown often means it is the default for a range, and this may interact 
in interesting ways with the other bitmask, producing more precise results.


Bootstraps on  x86_64-pc-linux-gnu with no regressions.   Pushed.

Andrew
From b3327649bffd32af962662dce054b94be2d7330d Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Wed, 14 May 2025 11:13:15 -0400
Subject: [PATCH 3/4] Allow bitmask intersection to process unknown masks.

bitmask_intersection should not return immediately if the current mask is
unknown.  Unknown may mean it's the default for a range, and this may
interact in interesting ways with the other bitmask.

	PR tree-optimization/116546
	* value-range.cc (irange::intersect_bitmask): Allow unknown
	bitmasks to be processed.
---
 gcc/value-range.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/value-range.cc b/gcc/value-range.cc
index 64d10f41168..ed3760fa6ff 100644
--- a/gcc/value-range.cc
+++ b/gcc/value-range.cc
@@ -2434,7 +2434,7 @@ irange::intersect_bitmask (const irange &r)
 {
   gcc_checking_assert (!undefined_p () && !r.undefined_p ());
 
-  if (r.m_bitmask.unknown_p () || m_bitmask == r.m_bitmask)
+  if (m_bitmask == r.m_bitmask)
 return false;
 
   irange_bitmask bm = get_bitmask ();
-- 
2.45.0

Re: [PATCH 5/6] RISC-V: frm/mode-switch: Reduce FRM restores on DYN transition

2025-05-15 Thread Vineet Gupta

On 5/9/25 13:27, Vineet Gupta wrote:
> The FRM mode-switching state machine has DYN as its default state, which it
> also falls back to after transitioning to other states such as DYN_CALL.
> Currently TARGET_MODE_EMIT generates an FRM restore on any transition to
> DYN, leading to spurious/extraneous FRM restores.
>
> Only do this if an interim static Rounding Mode was observed in the state
> machine.
>
> This reduces the number of FRM writes in SPEC2017 -Ofast -mrv64gcv build
> significantly.
>
>                         Before                   After
>                  -------------------      -------------------
>                  frrm  fsrmi   fsrm      frrm  fsrmi   fsrm
>   perlbench_r     42      0      4        17      0      1
>      cpugcc_r    167      0     17        11      0      0
>      bwaves_r     16      0      1        16      0      1
>         mcf_r     11      0      0        11      0      0
>  cactusBSSN_r     76      0     27        19      0      1
>        namd_r    119      0     63        14      0      1
>      parest_r    168      0    114        24      0      1
>      povray_r    123      1     17        26      1      6
>         lbm_r      6      0      0         6      0      0
>     omnetpp_r     17      0      1        17      0      1
>         wrf_r   2287     13   1956      1268     13   1603
>    cpuxalan_r     17      0      1        17      0      1
>      ldecod_r     11      0      0        11      0      0
>        x264_r     14      0      1        11      0      0
>     blender_r    724     12    182        61     12     42
>        cam4_r    324     13    169        45     13     20
>   deepsjeng_r     11      0      0        11      0      0
>     imagick_r    265     16     34       132     16     25
>       leela_r     12      0      0        12      0      0
>         nab_r     13      0      1        13      0      1
>   exchange2_r     16      0      1        16      0      1
>   fotonik3d_r     20      0     11        19      0      1
>        roms_r     33      0     23        21      0      1
>          xz_r      6      0      0         6      0      0
>                  -------------------      -------------------
>                 4498     55   2623      1804     55   1707
>
> gcc/ChangeLog:
>
>   * config/riscv/riscv.cc (riscv_emit_frm_mode_set): Check
>   STATIC_FRM_P for transition to DYN.
>
> Signed-off-by: Vineet Gupta 
> ---
>  gcc/config/riscv/riscv.cc | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index f1b4b20499fc..37f3ace49a8b 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -12121,7 +12121,7 @@ riscv_emit_frm_mode_set (int mode, int prev_mode)
> && prev_mode != riscv_vector::FRM_DYN
> && prev_mode != riscv_vector::FRM_DYN_CALL)
>/* Restore frm value when switch to DYN mode.  */
> -  || (mode == riscv_vector::FRM_DYN
> +  || (STATIC_FRM_P (cfun) && mode == riscv_vector::FRM_DYN
> && prev_mode != riscv_vector::FRM_DYN_CALL);
>  
>if (restore_p)

FWIW this itself (and not 6/6) is sufficient to fix the extraneous FRMs around
call insns in PR/119164 and re-fix of PR/119832 (w/o confluence fix)

Thx,
-Vineet

[PATCH 1/4] Make end_sequence return the insn sequence

2025-05-15 Thread Richard Sandiford

The start_sequence/end_sequence interface was a big improvement over
the previous state, but one slightly awkward thing about it is that
you have to call get_insns before end_sequence in order to get the
insn sequence itself:

   To get the contents of the sequence just made, you must call
   `get_insns' *before* calling here.

We therefore have quite a lot of code like this:

  insns = get_insns ();
  end_sequence ();
  return insns;

It would seem simpler to write:

  return end_sequence ();

instead.
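
To make the shape of the change concrete, here is a toy model of the
interface.  It is only a sketch under simplifying assumptions: the insn and
sequence structs and the std::vector-based stack are invented for the
example and are much simpler than the real rtx_insn and sequence_stack
machinery in emit-rtl.cc.  Only the function names mirror the real ones.

```cpp
#include <vector>

// Toy instruction: a singly linked list node (hypothetical, not rtx_insn).
struct insn
{
  int id;
  insn *next = nullptr;
};

// Toy sequence: first and last instruction emitted so far.
struct sequence
{
  insn *first = nullptr;
  insn *last = nullptr;
};

// Toy sequence stack; the bottom entry stands for the main insn stream.
static std::vector<sequence> seq_stack { sequence {} };

void
start_sequence ()
{
  seq_stack.push_back (sequence {});
}

insn *
get_insns ()
{
  return seq_stack.back ().first;
}

void
emit_insn (insn *i)
{
  sequence &s = seq_stack.back ();
  if (s.last)
    s.last->next = i;
  else
    s.first = i;
  s.last = i;
}

// New-style end_sequence: capture the head of the sequence before the
// saved state is restored, then return it.  Assumes a matching
// start_sequence call was made earlier.
insn *
end_sequence ()
{
  insn *insns = get_insns ();
  seq_stack.pop_back ();
  return insns;
}
```

With this shape a caller can write "return end_sequence ();" and still
receive the first instruction of the sequence it just built.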

I can see three main potential objections to this:

(1) It isn't obvious whether ending the sequence would return the first
or the last instruction.  But although some code reads *both* the
first and the last instruction, I can't think of a specific case
where code would want *only* the last instruction.  All the emit
functions take the first instruction rather than the last.

(2) The "end" in end_sequence might imply the C++ meaning of an exclusive
endpoint iterator.  But for an insn sequence, the exclusive endpoint
is always the null pointer, so it would never need to be returned.
That said, we could rename the function to something like
"finish_sequence" or "complete_sequence" if this is an issue.

(3) There might have been an intention that start_sequence/end_sequence
could in future reclaim memory for unwanted sequences, and so an
explicit get_insns was used to indicate that the caller does want
the sequence.

But that sort of memory reclamation has never been added,
and now that the codebase is C++, it would be easier to handle
using RAII.  I think reclaiming memory would be difficult to do in
any case, since some code records the individual instructions that
they emit, rather than using get_insns.
---
 gcc/emit-rtl.cc | 14 --
 gcc/rtl.h   |  2 +-
 2 files changed, 9 insertions(+), 7 deletions(-)

diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc
index 3e2c4309dee..d86fb23b29a 100644
--- a/gcc/emit-rtl.cc
+++ b/gcc/emit-rtl.cc
@@ -5722,22 +5722,22 @@ pop_topmost_sequence (void)
   end_sequence ();
 }
 
-/* After emitting to a sequence, restore previous saved state.
-
-   To get the contents of the sequence just made, you must call
-   `get_insns' *before* calling here.
+/* After emitting to a sequence, restore the previous saved state and return
+   the start of the completed sequence.
 
If the compiler might have deferred popping arguments while
generating this sequence, and this sequence will not be immediately
inserted into the instruction stream, use do_pending_stack_adjust
-   before calling get_insns.  That will ensure that the deferred
+   before calling this function.  That will ensure that the deferred
pops are inserted into this sequence, and not into some random
location in the instruction stream.  See INHIBIT_DEFER_POP for more
information about deferred popping of arguments.  */
 
-void
+rtx_insn *
 end_sequence (void)
 {
+  rtx_insn *insns = get_insns ();
+
   struct sequence_stack *tem = get_current_sequence ()->next;
 
   set_first_insn (tem->first);
@@ -5747,6 +5747,8 @@ end_sequence (void)
   memset (tem, 0, sizeof (*tem));
   tem->next = free_sequence_stack;
   free_sequence_stack = tem;
+
+  return insns;
 }
 
 /* Return true if currently emitting into a sequence.  */
diff --git a/gcc/rtl.h b/gcc/rtl.h
index cc25aed1f49..5623a4b06b4 100644
--- a/gcc/rtl.h
+++ b/gcc/rtl.h
@@ -3306,7 +3306,7 @@ extern rtx_insn *get_last_nonnote_insn (void);
 extern void start_sequence (void);
 extern void push_to_sequence (rtx_insn *);
 extern void push_to_sequence2 (rtx_insn *, rtx_insn *);
-extern void end_sequence (void);
+extern rtx_insn *end_sequence (void);
 #if TARGET_SUPPORTS_WIDE_INT == 0
 extern double_int rtx_to_double_int (const_rtx);
 #endif
-- 
2.43.0

[PATCH 0/4] Make end_sequence return the insn sequence

2025-05-15 Thread Richard Sandiford

This series makes end_sequence return the insn sequence that
it ends, so that callers don't need to call get_insns separately.
It also updates many callers to take advantage of the new return value.

Although this kind of refactoring/API change can in general make
backports harder, I think in this case it would be enough to backport
the first patch along with the first backport that needs it.
We did something similar for force_lowpart_subreg.

See the covering note in patch 1 for more discussion about the pros
and cons of the interface change.

The first patch is a prerequisite for a genemit series that I'm
hoping to post tomorrow.

Bootstrapped & regression-tested on aarch64-linux-gnu & x86_64-linux-gnu.
Also tested against "all" config.gcc cases using config/config-list.mk.
OK to install?

Richard


Richard Sandiford (4):
  Make end_sequence return the insn sequence
  Automatic replacement of get_insns/end_sequence pairs
  Automatic replacement of end_sequence/return pairs
  Manual tweak of some end_sequence callers

 gcc/asan.cc   | 17 ++
 gcc/auto-inc-dec.cc   |  3 +-
 gcc/avoid-store-forwarding.cc |  3 +-
 gcc/bb-reorder.cc |  3 +-
 gcc/builtins.cc   | 27 --
 gcc/calls.cc  |  6 +--
 gcc/cfgexpand.cc  |  6 +--
 gcc/cfgloopanal.cc|  6 +--
 gcc/cfgrtl.cc |  9 ++--
 gcc/config/aarch64/aarch64-speculation.cc | 11 ++--
 gcc/config/aarch64/aarch64.cc | 39 +-
 gcc/config/alpha/alpha.cc | 12 ++---
 gcc/config/arc/arc.cc |  3 +-
 gcc/config/arm/aarch-common.cc|  3 +-
 gcc/config/arm/arm-builtins.cc|  3 +-
 gcc/config/arm/arm.cc | 34 +++-
 gcc/config/avr/avr-passes.cc  |  8 +--
 gcc/config/avr/avr.cc | 12 ++---
 gcc/config/bfin/bfin.cc   |  3 +-
 gcc/config/c6x/c6x.cc |  3 +-
 gcc/config/cris/cris.cc   |  3 +-
 gcc/config/cris/cris.md   |  3 +-
 gcc/config/csky/csky.cc   |  3 +-
 gcc/config/epiphany/resolve-sw-modes.cc   |  3 +-
 gcc/config/fr30/fr30.cc   |  3 +-
 gcc/config/frv/frv.cc | 12 ++---
 gcc/config/frv/frv.md | 15 ++
 gcc/config/gcn/gcn.cc | 18 +++
 gcc/config/i386/i386-expand.cc| 45 ++--
 gcc/config/i386/i386-features.cc  | 12 ++---
 gcc/config/i386/i386.cc   | 18 +++
 gcc/config/ia64/ia64.cc   | 15 ++
 gcc/config/loongarch/loongarch.cc |  4 +-
 gcc/config/m32r/m32r.cc   |  3 +-
 gcc/config/m32r/m32r.md   |  6 +--
 gcc/config/m68k/m68k.cc   |  9 ++--
 gcc/config/m68k/m68k.md   | 12 ++---
 gcc/config/microblaze/microblaze.cc   |  3 +-
 gcc/config/mips/mips.cc   | 19 +++
 gcc/config/nvptx/nvptx.cc | 39 +-
 gcc/config/or1k/or1k.cc   |  3 +-
 gcc/config/pa/pa.cc   |  3 +-
 gcc/config/pru/pru.cc |  6 +--
 gcc/config/riscv/riscv-shorten-memrefs.cc |  3 +-
 gcc/config/riscv/riscv-vsetvl.cc  |  6 +--
 gcc/config/riscv/riscv.cc | 10 ++--
 gcc/config/rl78/rl78.cc   |  3 +-
 gcc/config/rs6000/rs6000.cc   |  3 +-
 gcc/config/s390/s390.cc   | 21 +++-
 gcc/config/sh/sh_treg_combine.cc  |  5 +-
 gcc/config/sparc/sparc.cc | 15 ++
 gcc/config/stormy16/stormy16.cc   |  3 +-
 gcc/config/xtensa/xtensa.cc   | 15 ++
 gcc/dse.cc|  9 ++--
 gcc/emit-rtl.cc   | 23 -
 gcc/except.cc | 21 +++-
 gcc/expmed.cc |  6 +--
 gcc/expr.cc   | 57 +++-
 gcc/function.cc   | 57 +++-
 gcc/gcse.cc   |  3 +-
 gcc/gentarget-def.cc  |  4 +-
 gcc/ifcvt.cc  |  9 ++--
 gcc/init-regs.cc  |  3 +-
 gcc/internal-fn.cc| 12 ++---
 gcc/ira-emit.cc   |  6 +--
 gcc/ira.cc|  3 +-
 gcc/loop-doloop.cc|  3 +-
 gcc/loop-unroll.cc| 21 +++-
 gcc/lower-subreg.cc   |  7 +--
 gcc/lra-constraints.cc| 54 +++
 gcc/lra-remat.cc  |  3 +-
 gcc/mode-switching.cc |  6 +--
 gcc/optabs.cc | 63 ---
 gcc/ree.c

[PATCH 3/4] Automatic replacement of end_sequence/return pairs

2025-05-15 Thread Richard Sandiford

This is the result of using a regexp to replace:

  rtx( |_insn *)VAR = end_sequence ();
  return VAR;

with:

  return end_sequence ();
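
The exact regexp is not given in this message.  The following is a guess at
an equivalent rewrite using std::regex, where a backreference makes sure the
returned variable is the one that was just assigned.  The function name
fold_end_sequence_return is made up for the example.

```cpp
#include <regex>
#include <string>

// Collapse "TYPE VAR = end_sequence (); return VAR;" into
// "return end_sequence ();".  Group 1 captures VAR, and \1 requires the
// return statement to name the same variable.  \s* allows blank lines
// between the two statements.
std::string
fold_end_sequence_return (const std::string &src)
{
  static const std::regex pat (
    R"(rtx(?: |_insn \*)(\w+) = end_sequence \(\);\s*return \1;)");
  return std::regex_replace (src, pat, "return end_sequence ();");
}
```

Text in which the assigned and returned variables differ is left
untouched, since the backreference then fails to match.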

gcc/
* asan.cc (asan_emit_allocas_unpoison): Directly return the
result of end_sequence.
(hwasan_emit_untag_frame): Likewise.
* config/aarch64/aarch64-speculation.cc
(aarch64_speculation_clobber_sp): Likewise.
(aarch64_speculation_establish_tracker): Likewise.
* config/arm/arm.cc (arm_call_tls_get_addr): Likewise.
* config/avr/avr-passes.cc (avr_parallel_insn_from_insns): Likewise.
* config/sh/sh_treg_combine.cc
(sh_treg_combine::make_not_reg_insn): Likewise.
* tree-outof-ssa.cc (emit_partition_copy): Likewise.
---
 gcc/asan.cc   | 6 ++
 gcc/config/aarch64/aarch64-speculation.cc | 6 ++
 gcc/config/arm/arm.cc | 4 +---
 gcc/config/avr/avr-passes.cc  | 4 +---
 gcc/config/sh/sh_treg_combine.cc  | 4 +---
 gcc/tree-outof-ssa.cc | 4 +---
 6 files changed, 8 insertions(+), 20 deletions(-)

diff --git a/gcc/asan.cc b/gcc/asan.cc
index dfb044c08b7..748b289d6f9 100644
--- a/gcc/asan.cc
+++ b/gcc/asan.cc
@@ -2304,8 +2304,7 @@ asan_emit_allocas_unpoison (rtx top, rtx bot, rtx_insn 
*before)
 top, ptr_mode, bot, ptr_mode);
 
   do_pending_stack_adjust ();
-  rtx_insn *insns = end_sequence ();
-  return insns;
+  return end_sequence ();
 }
 
 /* Return true if DECL, a global var, might be overridden and needs
@@ -4737,8 +4736,7 @@ hwasan_emit_untag_frame (rtx dynamic, rtx vars)
 size_rtx, ptr_mode);
 
   do_pending_stack_adjust ();
-  rtx_insn *insns = end_sequence ();
-  return insns;
+  return end_sequence ();
 }
 
 /* Needs to be GTY(()), because cgraph_build_static_cdtor may
diff --git a/gcc/config/aarch64/aarch64-speculation.cc 
b/gcc/config/aarch64/aarch64-speculation.cc
index 5bcbfad2c13..618045afbc1 100644
--- a/gcc/config/aarch64/aarch64-speculation.cc
+++ b/gcc/config/aarch64/aarch64-speculation.cc
@@ -160,8 +160,7 @@ aarch64_speculation_clobber_sp ()
   emit_insn (gen_rtx_SET (scratch, sp));
   emit_insn (gen_anddi3 (scratch, scratch, tracker));
   emit_insn (gen_rtx_SET (sp, scratch));
-  rtx_insn *seq = end_sequence ();
-  return seq;
+  return end_sequence ();
 }
 
 /* Generate a code sequence to establish the tracker variable from the
@@ -175,8 +174,7 @@ aarch64_speculation_establish_tracker ()
   rtx cc = aarch64_gen_compare_reg (EQ, sp, const0_rtx);
   emit_insn (gen_cstoredi_neg (tracker,
   gen_rtx_NE (CCmode, cc, const0_rtx), cc));
-  rtx_insn *seq = end_sequence ();
-  return seq;
+  return end_sequence ();
 }
 
 /* Main speculation tracking pass.  */
diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
index 60c961ab272..94624cc87a4 100644
--- a/gcc/config/arm/arm.cc
+++ b/gcc/config/arm/arm.cc
@@ -9278,9 +9278,7 @@ arm_call_tls_get_addr (rtx x, rtx reg, rtx *valuep, int 
reloc)
 LCT_PURE, /* LCT_CONST?  */
 Pmode, reg, Pmode);
 
-  rtx_insn *insns = end_sequence ();
-
-  return insns;
+  return end_sequence ();
 }
 
 static rtx
diff --git a/gcc/config/avr/avr-passes.cc b/gcc/config/avr/avr-passes.cc
index 55785b8b700..284f49d1468 100644
--- a/gcc/config/avr/avr-passes.cc
+++ b/gcc/config/avr/avr-passes.cc
@@ -3942,9 +3942,7 @@ avr_parallel_insn_from_insns (rtx_insn *i[5])
 PATTERN (i[3]), PATTERN (i[4]));
   start_sequence ();
   emit (gen_rtx_PARALLEL (VOIDmode, vec));
-  rtx_insn *insn = end_sequence ();
-
-  return insn;
+  return end_sequence ();
 }
 
 
diff --git a/gcc/config/sh/sh_treg_combine.cc b/gcc/config/sh/sh_treg_combine.cc
index 33f528e0b76..696fe328a12 100644
--- a/gcc/config/sh/sh_treg_combine.cc
+++ b/gcc/config/sh/sh_treg_combine.cc
@@ -945,9 +945,7 @@ sh_treg_combine::make_not_reg_insn (rtx dst_reg, rtx 
src_reg) const
   else
 gcc_unreachable ();
 
-  rtx i = end_sequence ();
-
-  return i;
+  return end_sequence ();
 }
 
 rtx_insn *
diff --git a/gcc/tree-outof-ssa.cc b/gcc/tree-outof-ssa.cc
index d7e9ddbd082..bdf474dbd93 100644
--- a/gcc/tree-outof-ssa.cc
+++ b/gcc/tree-outof-ssa.cc
@@ -264,9 +264,7 @@ emit_partition_copy (rtx dest, rtx src, int unsignedsrcp, 
tree sizeexp)
 emit_move_insn (dest, src);
   do_pending_stack_adjust ();
 
-  rtx_insn *seq = end_sequence ();
-
-  return seq;
+  return end_sequence ();
 }
 
 /* Insert a copy instruction from partition SRC to DEST onto edge E.  */
-- 
2.43.0

[PATCH 4/4] Manual tweak of some end_sequence callers

2025-05-15 Thread Richard Sandiford

This patch mops up obvious redundancies that weren't caught by the
automatic regexp replacements in earlier patches.  It doesn't do
anything with genemit.cc, since that will be part of a later series.

gcc/
* config/arm/arm.cc (arm_gen_load_multiple_1): Simplify use of
end_sequence.
(arm_gen_store_multiple_1): Likewise.
* expr.cc (gen_move_insn): Likewise.
* gentarget-def.cc (main): Likewise.
---
 gcc/config/arm/arm.cc | 12 ++--
 gcc/expr.cc   |  5 +
 gcc/gentarget-def.cc  |  4 +---
 3 files changed, 4 insertions(+), 17 deletions(-)

diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
index 94624cc87a4..bde06f3fa86 100644
--- a/gcc/config/arm/arm.cc
+++ b/gcc/config/arm/arm.cc
@@ -14891,8 +14891,6 @@ arm_gen_load_multiple_1 (int count, int *regs, rtx 
*mems, rtx basereg,
 
   if (!multiple_operation_profitable_p (false, count, 0))
 {
-  rtx seq;
-
   start_sequence ();
 
   for (i = 0; i < count; i++)
@@ -14901,9 +14899,7 @@ arm_gen_load_multiple_1 (int count, int *regs, rtx 
*mems, rtx basereg,
   if (wback_offset != 0)
emit_move_insn (basereg, plus_constant (Pmode, basereg, wback_offset));
 
-  seq = end_sequence ();
-
-  return seq;
+  return end_sequence ();
 }
 
   result = gen_rtx_PARALLEL (VOIDmode,
@@ -14941,8 +14937,6 @@ arm_gen_store_multiple_1 (int count, int *regs, rtx 
*mems, rtx basereg,
 
   if (!multiple_operation_profitable_p (false, count, 0))
 {
-  rtx seq;
-
   start_sequence ();
 
   for (i = 0; i < count; i++)
@@ -14951,9 +14945,7 @@ arm_gen_store_multiple_1 (int count, int *regs, rtx 
*mems, rtx basereg,
   if (wback_offset != 0)
emit_move_insn (basereg, plus_constant (Pmode, basereg, wback_offset));
 
-  seq = end_sequence ();
-
-  return seq;
+  return end_sequence ();
 }
 
   result = gen_rtx_PARALLEL (VOIDmode,
diff --git a/gcc/expr.cc b/gcc/expr.cc
index 0bc2095dae3..1eeefa1cadc 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -4760,12 +4760,9 @@ emit_move_insn (rtx x, rtx y)
 rtx_insn *
 gen_move_insn (rtx x, rtx y)
 {
-  rtx_insn *seq;
-
   start_sequence ();
   emit_move_insn_1 (x, y);
-  seq = end_sequence ();
-  return seq;
+  return end_sequence ();
 }
 
 /* If Y is representable exactly in a narrower mode, and the target can
diff --git a/gcc/gentarget-def.cc b/gcc/gentarget-def.cc
index a846a7cb200..d0a557864ef 100644
--- a/gcc/gentarget-def.cc
+++ b/gcc/gentarget-def.cc
@@ -319,9 +319,7 @@ main (int argc, const char **argv)
   printf ("return insn;\n");
   printf ("  start_sequence ();\n");
   printf ("  emit (x, false);\n");
-  printf ("  rtx_insn *res = get_insns ();\n");
-  printf ("  end_sequence ();\n");
-  printf ("  return res;\n");
+  printf ("  return end_sequence ();\n");
   printf ("}\n");
 
 #define DEF_TARGET_INSN(INSN, ARGS) \
-- 
2.43.0

[PATCH 2/4] Automatic replacement of get_insns/end_sequence pairs

2025-05-15 Thread Richard Sandiford

This is the result of using a regexp to replace instances of:

  VAR = get_insns ();
  end_sequence ();

with:

  VAR = end_sequence ();

where the indentation is the same for both lines, and where there
might be blank lines in between.
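
The regexp itself is not shown in this message.  As an assumed
approximation, the "same indentation, optional blank lines" condition can
be expressed with a backreference on the captured indentation.  The
function name merge_get_insns_end_sequence is invented for this sketch.

```cpp
#include <regex>
#include <string>

// Rewrite
//     LHS = get_insns ();
//     end_sequence ();
// into
//     LHS = end_sequence ();
// Group 1 anchors at the start of the input or just after a newline,
// group 2 is the indentation and group 3 the left-hand side.  \2 forces
// the end_sequence line to use the same indentation, and the
// non-capturing group skips any blank lines in between.
std::string
merge_get_insns_end_sequence (const std::string &src)
{
  static const std::regex pat (
    R"((^|\n)([ \t]*)(\S[^\n]*) = get_insns \(\);\n(?:[ \t]*\n)*\2end_sequence \(\);)");
  return std::regex_replace (src, pat, "$1$2$3 = end_sequence ();");
}
```

Pairs whose two lines are indented differently are deliberately left
alone, which matches the condition described above.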

gcc/
* asan.cc (asan_clear_shadow): Use the return value of end_sequence,
rather than calling get_insns separately.
(asan_emit_stack_protection, asan_emit_allocas_unpoison): Likewise.
(hwasan_frame_base, hwasan_emit_untag_frame): Likewise.
* auto-inc-dec.cc (attempt_change): Likewise.
* avoid-store-forwarding.cc (process_store_forwarding): Likewise.
* bb-reorder.cc (fix_crossing_unconditional_branches): Likewise.
* builtins.cc (expand_builtin_apply_args): Likewise.
(expand_builtin_return, expand_builtin_mathfn_ternary): Likewise.
(expand_builtin_mathfn_3, expand_builtin_int_roundingfn): Likewise.
(expand_builtin_int_roundingfn_2, expand_builtin_saveregs): Likewise.
(inline_string_cmp): Likewise.
* calls.cc (expand_call): Likewise.
* cfgexpand.cc (expand_asm_stmt, pass_expand::execute): Likewise.
* cfgloopanal.cc (init_set_costs): Likewise.
* cfgrtl.cc (insert_insn_on_edge, prepend_insn_to_edge): Likewise.
(rtl_lv_add_condition_to_bb): Likewise.
* config/aarch64/aarch64-speculation.cc
(aarch64_speculation_clobber_sp): Likewise.
(aarch64_speculation_establish_tracker): Likewise.
(aarch64_do_track_speculation): Likewise.
* config/aarch64/aarch64.cc (aarch64_load_symref_appropriately)
(aarch64_expand_vector_init, aarch64_gen_ccmp_first): Likewise.
(aarch64_gen_ccmp_next, aarch64_mode_emit): Likewise.
(aarch64_md_asm_adjust): Likewise.
(aarch64_switch_pstate_sm_for_landing_pad): Likewise.
(aarch64_switch_pstate_sm_for_jump): Likewise.
(aarch64_switch_pstate_sm_for_call): Likewise.
* config/alpha/alpha.cc (alpha_legitimize_address_1): Likewise.
(alpha_emit_xfloating_libcall, alpha_gp_save_rtx): Likewise.
* config/arc/arc.cc (hwloop_optimize): Likewise.
* config/arm/aarch-common.cc (arm_md_asm_adjust): Likewise.
* config/arm/arm-builtins.cc: Likewise.
* config/arm/arm.cc (require_pic_register): Likewise.
(arm_call_tls_get_addr, arm_gen_load_multiple_1): Likewise.
(arm_gen_store_multiple_1, cmse_clear_registers): Likewise.
(cmse_nonsecure_call_inline_register_clear): Likewise.
(arm_attempt_dlstp_transform): Likewise.
* config/avr/avr-passes.cc (bbinfo_t::optimize_one_block): Likewise.
(avr_parallel_insn_from_insns): Likewise.
* config/avr/avr.cc (avr_prologue_setup_frame): Likewise.
(avr_expand_epilogue): Likewise.
* config/bfin/bfin.cc (hwloop_optimize): Likewise.
* config/c6x/c6x.cc (c6x_expand_compare): Likewise.
* config/cris/cris.cc (cris_split_movdx): Likewise.
* config/cris/cris.md: Likewise.
* config/csky/csky.cc (csky_call_tls_get_addr): Likewise.
* config/epiphany/resolve-sw-modes.cc
(pass_resolve_sw_modes::execute): Likewise.
* config/fr30/fr30.cc (fr30_move_double): Likewise.
* config/frv/frv.cc (frv_split_scc, frv_split_cond_move): Likewise.
(frv_split_minmax, frv_split_abs): Likewise.
* config/frv/frv.md: Likewise.
* config/gcn/gcn.cc (move_callee_saved_registers): Likewise.
(gcn_expand_prologue, gcn_restore_exec, gcn_md_reorg): Likewise.
* config/i386/i386-expand.cc
(ix86_expand_carry_flag_compare, ix86_expand_int_movcc): Likewise.
(ix86_vector_duplicate_value, expand_vec_perm_interleave2): Likewise.
(expand_vec_perm_vperm2f128_vblend): Likewise.
(expand_vec_perm_2perm_interleave): Likewise.
(expand_vec_perm_2perm_pblendv): Likewise.
(expand_vec_perm2_vperm2f128_vblend, ix86_gen_ccmp_first): Likewise.
(ix86_gen_ccmp_next): Likewise.
* config/i386/i386-features.cc
(scalar_chain::make_vector_copies): Likewise.
(scalar_chain::convert_reg, scalar_chain::convert_op): Likewise.
(timode_scalar_chain::convert_insn): Likewise.
* config/i386/i386.cc (ix86_init_pic_reg, ix86_va_start): Likewise.
(ix86_get_drap_rtx, legitimize_tls_address): Likewise.
(ix86_md_asm_adjust): Likewise.
* config/ia64/ia64.cc (ia64_expand_tls_address): Likewise.
(ia64_expand_compare, spill_restore_mem): Likewise.
(expand_vec_perm_interleave_2): Likewise.
* config/loongarch/loongarch.cc
(loongarch_call_tls_get_addr): Likewise.
* config/m32r/m32r.cc (gen_split_move_double): Likewise.
* config/m32r/m32r.md: Likewise.
* config/m68k/m68k.cc (m68k_call_tls_get_addr): Likewise.
(m68k_call_m68k_read_tp, m68k_sched_md_init_global): Likewise.
* config/m68k/m68k.md: Likewise.
* conf

New Chinese (simplified) PO file for 'cpplib' (version 15.1-b20250316)

2025-05-15 Thread Translation Project Robot

Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'cpplib' has been submitted
by the Chinese (simplified) team of translators.  The file is available at:

https://translationproject.org/latest/cpplib/zh_CN.po

(This file, 'cpplib-15.1-b20250316.zh_CN.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

https://translationproject.org/latest/cpplib/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

https://translationproject.org/domain/cpplib.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.

[COMMITTED 1/4] PR tree-optimization/116546 - Turn get_bitmask_from_range into an irange_bitmask constructor.

2025-05-15 Thread Andrew MacLeod

This patch is split into 4 parts so that, if one of them causes any 
issues, it will be easier to track down.


get_bitmask_from_range has uses outside value-range.cc.  In an attempt 
to make bitmasks on ranges a bit more consistent, this patch simply 
moves the static function into a constructor for irange_bitmask.


Bootstrapped on x86_64-pc-linux-gnu with no regressions.  pushed.
From e4c6a0214c3ea8aa73e50b1496eb7a8aa5eda635 Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Tue, 13 May 2025 13:23:16 -0400
Subject: [PATCH 1/4] Turn get_bitmask_from_range into an irange_bitmask
 constructor.

There are other places where this is interesting, so move the static
function into a constructor for class irange_bitmask.

	* value-range.cc (irange_bitmask::irange_bitmask): Rename from
	get_bitmask_from_range and tweak.
	(prange::set): Use new constructor.
	(prange::intersect): Use new constructor.
	(irange::get_bitmask): Likewise.
	* value-range.h (irange_bitmask): New constructor prototype.
---
 gcc/value-range.cc | 32 
 gcc/value-range.h  |  2 ++
 2 files changed, 18 insertions(+), 16 deletions(-)

diff --git a/gcc/value-range.cc b/gcc/value-range.cc
index d2c14e7900d..48a1521b81e 100644
--- a/gcc/value-range.cc
+++ b/gcc/value-range.cc
@@ -31,25 +31,26 @@ along with GCC; see the file COPYING3.  If not see
 #include "fold-const.h"
 #include "gimple-range.h"
 
-// Return the bitmask inherent in a range.
+// Return the bitmask inherent in a range :   TYPE [MIN, MAX].
+// This use to be get_bitmask_from_range ().
 
-static irange_bitmask
-get_bitmask_from_range (tree type,
-			const wide_int &min, const wide_int &max)
+irange_bitmask::irange_bitmask (tree type,
+const wide_int &min, const wide_int &max)
 {
   unsigned prec = TYPE_PRECISION (type);
-
   // All the bits of a singleton are known.
   if (min == max)
 {
-  wide_int mask = wi::zero (prec);
-  wide_int value = min;
-  return irange_bitmask (value, mask);
+  m_mask = wi::zero (prec);
+  m_value = min;
+}
+  else
+{
+  wide_int xorv = min ^ max;
+  xorv = wi::mask (prec - wi::clz (xorv), false, prec);
+  m_value = wi::zero (prec);
+  m_mask = min | xorv;
 }
-
-  wide_int xorv = min ^ max;
-  xorv = wi::mask (prec - wi::clz (xorv), false, prec);
-  return irange_bitmask (wi::zero (prec), min | xorv);
 }
 
 void
@@ -469,7 +470,7 @@ prange::set (tree type, const wide_int &min, const wide_int &max,
 }
 
   m_kind = VR_RANGE;
-  m_bitmask = get_bitmask_from_range (type, min, max);
+  m_bitmask = irange_bitmask (type, min, max);
   if (flag_checking)
 verify_range ();
 }
@@ -583,7 +584,7 @@ prange::intersect (const vrange &v)
 }
 
   // Intersect all bitmasks: the old one, the new one, and the other operand's.
-  irange_bitmask new_bitmask = get_bitmask_from_range (m_type, m_min, m_max);
+  irange_bitmask new_bitmask (m_type, m_min, m_max);
   m_bitmask.intersect (new_bitmask);
   m_bitmask.intersect (r.m_bitmask);
   if (varying_compatible_p ())
@@ -2396,8 +2397,7 @@ irange::get_bitmask () const
   // in the mask.
   //
   // See also the note in irange_bitmask::intersect.
-  irange_bitmask bm
-= get_bitmask_from_range (type (), lower_bound (), upper_bound ());
+  irange_bitmask bm (type (), lower_bound (), upper_bound ());
   if (!m_bitmask.unknown_p ())
 bm.intersect (m_bitmask);
   return bm;
diff --git a/gcc/value-range.h b/gcc/value-range.h
index f6942989a6f..74cdf29ddcb 100644
--- a/gcc/value-range.h
+++ b/gcc/value-range.h
@@ -136,6 +136,8 @@ public:
   irange_bitmask () { /* uninitialized */ }
   irange_bitmask (unsigned prec) { set_unknown (prec); }
   irange_bitmask (const wide_int &value, const wide_int &mask);
+  irange_bitmask (tree type, const wide_int &min, const wide_int &max);
+
   wide_int value () const { return m_value; }
   wide_int mask () const { return m_mask; }
   void set_unknown (unsigned prec);
-- 
2.45.0

[COMMITTED 4/4] PR tree-optimization/116546 - Enhance bitwise_and::op1_range

2025-05-15 Thread Andrew MacLeod

This patch improves op1_range for bitwise and operations. Previously we 
made some basic attempts to determine a range, but they were often not 
very precise.


This leverages the previous changes and recognizes that any known bit on 
the LHS of an AND that falls within the MASK must also be known in operand1.

   ie      [5,7] = op & 7

[5, 7] has binary patterns 101, 110, 111.

In order to produce that result, we know that bit 100 must always be set 
in 'op' as well.   The final result now produces

   op =  [-INF, -1][5, +INF] MASK 0xfffb VALUE 0x4

This patch resolves the PR by successfully tracking the bit through to 
the AND operation.
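
The bit reasoning can be sketched in isolation as follows.  This is a
simplified model, not the range-op.cc code: it works on 16-bit integers
chosen purely for illustration, ignores the range part entirely, and the
names bitmask and op1_bits_from_and are made up.  A mask bit of 1 means
"unknown".

```cpp
#include <cstdint>

// Hypothetical 16-bit stand-in for irange_bitmask.
struct bitmask
{
  std::uint16_t value;
  std::uint16_t mask;
};

// Given LHS = op1 & AND_MASK, derive what is known about the bits of op1.
bitmask
op1_bits_from_and (bitmask lhs, std::uint16_t and_mask)
{
  // Bits cleared by AND_MASK tell us nothing about op1: start with them
  // marked as unknown.
  std::uint16_t op1_mask = static_cast<std::uint16_t> (~and_mask);
  // Bits that are unknown in the LHS are also unknown in op1.
  op1_mask |= lhs.mask;
  // The remaining bits are known, and the LHS value says what they are.
  std::uint16_t op1_value
    = static_cast<std::uint16_t> (lhs.value & ~op1_mask);
  return { op1_value, op1_mask };
}
```

For [5,7] = op & 7, the LHS bitmask is mask 0x3 value 0x4, so at 16 bits
the sketch gives mask 0xfffb value 0x4 for op: only bit 2 is known, and it
is set.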


Bootstraps on  x86_64-pc-linux-gnu with no regressions.   Pushed.

Andrew
From ac55655ce45a237a6a01e0cce50211841603c2ec Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Wed, 14 May 2025 11:32:58 -0400
Subject: [PATCH 4/4] Enhance bitwise_and::op1_range

Any known bits from the LHS range can be used to specify known bits in
the non-mask operand.

	PR tree-optimization/116546
	gcc/
	* range-op.cc (operator_bitwise_and::op1_range): Utilize bitmask
	from the LHS to improve op1's bitmask.

	gcc/testsuite/
	* gcc.dg/pr116546.c: New.
---
 gcc/range-op.cc | 22 +++-
 gcc/testsuite/gcc.dg/pr116546.c | 46 +
 2 files changed, 67 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr116546.c

diff --git a/gcc/range-op.cc b/gcc/range-op.cc
index 06d357f5199..e2b9c82bc7b 100644
--- a/gcc/range-op.cc
+++ b/gcc/range-op.cc
@@ -3716,14 +3716,34 @@ operator_bitwise_and::op1_range (irange &r, tree type,
   return true;
 }
 
+  if (!op2.singleton_p (mask))
+return true;
+
   // For 0 = op1 & MASK, op1 is ~MASK.
-  if (lhs.zero_p () && op2.singleton_p ())
+  if (lhs.zero_p ())
 {
   wide_int nz = wi::bit_not (op2.get_nonzero_bits ());
   int_range<2> tmp (type);
   tmp.set_nonzero_bits (nz);
   r.intersect (tmp);
 }
+
+  irange_bitmask lhs_bm = lhs.get_bitmask ();
+  // given   [5,7]  mask 0x3 value 0x4 =  N &  [7, 7] mask 0x0 value 0x7
+  // Nothing is known about the bits not specified in the mask value (op2),
+  //  Start with the mask, 1's will occur where values were masked.
+  wide_int op1_mask = ~mask;
+  // Any bits that are unknown on the LHS are also unknown in op1,
+  // so union the current mask with the LHS mask.
+  op1_mask |= lhs_bm.mask ();
+  // The resulting zeros correspond to known bits in the LHS mask, and
+  // the LHS value should tell us what they are.  Mask off any
+  // extraneous values thats are not convered by the mask.
+  wide_int op1_value = lhs_bm.value () & ~op1_mask;
+  irange_bitmask op1_bm (op1_value, op1_mask);
+  // INtersect this mask with anything already known about the value.
+  op1_bm.intersect (r.get_bitmask ());
+  r.update_bitmask (op1_bm);
   return true;
 }
 
diff --git a/gcc/testsuite/gcc.dg/pr116546.c b/gcc/testsuite/gcc.dg/pr116546.c
new file mode 100644
index 000..b82dc27f452
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr116546.c
@@ -0,0 +1,46 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-evrp" } */
+
+extern long foo (void);
+extern long bar (void);
+
+long
+test1 (long n)
+{
+  n &= 7;
+  if (n == 4) {
+if (n & 4)
+  return foo ();
+else
+  return bar ();
+  }
+  return 0;
+}
+
+long
+test2 (long n)
+{
+  n &= 7;
+  if (n > 4) {
+if (n & 4)
+  return foo ();
+else
+  return bar ();
+  }
+  return 0;
+}
+
+long
+test3 (long n)
+{
+  n &= 7;
+  if (n >= 4) {
+if (n & 4)
+  return foo ();
+else
+  return bar ();
+  }
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-not "bar" "evrp" } } */
-- 
2.45.0

[COMMITTED] PR tree-optimization/120277 - Check for casts becoming UNDEFINED.

2025-05-15 Thread Andrew MacLeod

Recent changes to get_range_from_bitmask can sometimes turn a small 
range into an undefined one if the bitmask indicates the bits make all 
values impossible.


range_cast () was not expecting this and checks for UNDEFINED before
performing the cast.   It also needs to check for it after the cast now.


In this testcase, the pattern is

 y = x * 4   <- we know y will have the bottom 2 bits cleared
 z = y + 7   <- we know z will have the bottom 2 bits set.

Then a switch checks for z == 128 or z == 129 and performs a store into
*(int *)y.
Eventually the store is eliminated as unreachable, but range analysis
recognizes that the value is UNDEFINED when [121, 122] with the last 2
bits having to be 11 is calculated :-P


Do the default for casts, and if the result is UNDEFINED, turn it into 
VARYING.
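
To make the "bitmask makes all values impossible" case concrete, here is a small sketch (plain 64-bit integers and a brute-force scan, not how GCC computes it; the function names are invented for the example). Neither 121 (...01) nor 122 (...10) has its low two bits equal to 00 or 11, so whichever of those patterns is required, the range has no members.

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 if some v in [lo, hi] (lo <= hi) satisfies the known bits,
   i.e. (v & ~mask) == value, where a 1 in `mask` means "unknown".
   Returning 0 corresponds to the range collapsing to UNDEFINED.  */
int
range_has_member (uint64_t lo, uint64_t hi, uint64_t value, uint64_t mask)
{
  for (uint64_t v = lo; v <= hi; v++)
    {
      if ((v & ~mask) == value)
        return 1;
      if (v == hi)
        break;  /* Avoid wrapping when hi is UINT64_MAX.  */
    }
  return 0;
}

void
check_undefined_example (void)
{
  uint64_t low2_known = ~(uint64_t) 3;  /* Only the bottom 2 bits known.  */
  assert (!range_has_member (121, 122, 0x0, low2_known)); /* needs ...00 */
  assert (!range_has_member (121, 122, 0x3, low2_known)); /* needs ...11 */
  assert (range_has_member (121, 124, 0x0, low2_known));  /* 124 is ...00 */
}
```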


Bootstrapped on x86_64-pc-linux-gnu with no regressions. Pushed.

Andrew
From f332f23d6b6b45dce3ab19440eecffa26bf3fc15 Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Thu, 15 May 2025 11:06:05 -0400
Subject: [PATCH 1/5] Check for casts becoming UNDEFINED.

In various situations a cast that is ultimately unreachable may produce
an UNDEFINED result, and we can't check the bounds in this case.

	PR tree-optimization/120277
	gcc/
	* range-op-ptr.cc (operator_cast::fold_range): Check if the cast
	is UNDEFINED before setting bounds.

	gcc/testsuite/
	* gcc.dg/pr120277.c: New.
---
 gcc/range-op-ptr.cc | 10 --
 gcc/testsuite/gcc.dg/pr120277.c | 21 +
 2 files changed, 29 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr120277.c

diff --git a/gcc/range-op-ptr.cc b/gcc/range-op-ptr.cc
index 36e9dfc20ba..6aadc9cf2c9 100644
--- a/gcc/range-op-ptr.cc
+++ b/gcc/range-op-ptr.cc
@@ -602,8 +602,14 @@ operator_cast::fold_range (prange &r, tree type,
   int_range<2> tmp = inner;
   tree pointer_uint_type = make_unsigned_type (TYPE_PRECISION (type));
   range_cast (tmp, pointer_uint_type);
-  r.set (type, tmp.lower_bound (), tmp.upper_bound ());
-  r.update_bitmask (tmp.get_bitmask ());
+  // Casts may cause ranges to become UNDEFINED based on bitmasks.
+  if (tmp.undefined_p ())
+r.set_varying (type);
+  else
+{
+  r.set (type, tmp.lower_bound (), tmp.upper_bound ());
+  r.update_bitmask (tmp.get_bitmask ());
+}
   return true;
 }
 
diff --git a/gcc/testsuite/gcc.dg/pr120277.c b/gcc/testsuite/gcc.dg/pr120277.c
new file mode 100644
index 000..f291e920db1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr120277.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int a, b;
+int c(int d, long e) {
+  switch (d) {
+  case 129:
+a = 1;
+  case 128:
+break;
+  default:
+return 1;
+  }
+  *(int *)e = 0;
+}
+void f(int d, long e) { c(d, e); }
+void g() {
+  int h = b * sizeof(int);
+  f(h + 7, h);
+}
+void main() {}
-- 
2.45.0

Contents of PO file 'cpplib-15.1-b20250316.zh_CN.po'

2025-05-15 Thread Translation Project Robot



cpplib-15.1-b20250316.zh_CN.po.gz
Description: Binary data
The Translation Project robot, in the
name of your translation coordinator.

[PATCH] libstdc++: Fix proc check_v3_target_namedlocale for "" locale [PR65909]

2025-05-15 Thread Jonathan Wakely

When the last formal argument to a Tcl proc is named 'args' it has
special meaning and is a list that accepts any number of arguments[1].
This means when "" is passed to the proc and then we expand "$args" we
get an empty list formatted as "{}". My r16-537-g3e2b83faeb6b14 change
broke all uses of dg-require-namedlocale with empty locale names, "".

By changing the name of the formal argument to 'locale' we avoid the
special behaviour for 'args' and now it only accepts a single argument
(as was always intended). When expanded as "$locale" we get "" as I
expected.

[1] https://www.tcl-lang.org/man/tcl9.0/TclCmd/proc.html

libstdc++-v3/ChangeLog:

PR libstdc++/65909
* testsuite/lib/libstdc++.exp (check_v3_target_namedlocale):
Change name of formal argument to locale.
---

Tested x86_64-linux.

I also plan to audit the other procs in libstdc++.exp to see if they
should not use 'args', but that can be done later. The priority for now
is to fix what I broke recently.

 libstdc++-v3/testsuite/lib/libstdc++.exp | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/libstdc++-v3/testsuite/lib/libstdc++.exp 
b/libstdc++-v3/testsuite/lib/libstdc++.exp
index da1f4245e4b8..9f2dd8a17248 100644
--- a/libstdc++-v3/testsuite/lib/libstdc++.exp
+++ b/libstdc++-v3/testsuite/lib/libstdc++.exp
@@ -1019,8 +1019,8 @@ proc check_v3_target_time { } {
 }]
 }
 
-proc check_v3_target_namedlocale { args } {
-set key "et_namedlocale $args"
+proc check_v3_target_namedlocale { locale } {
+set key "et_namedlocale $locale"
 return [check_v3_target_prop_cached $key {
global tool
# Set up, compile, and execute a C++ test program that tries to use
@@ -1048,7 +1048,7 @@ proc check_v3_target_namedlocale { args } {
puts $f "}"
puts $f "int main ()"
puts $f "{"
-   puts $f "  const char *namedloc = transform_locale(\"$args\");"
+   puts $f "  const char *namedloc = transform_locale(\"$locale\");"
puts $f "  try"
puts $f "  {"
puts $f "locale((const char*)namedloc);"
@@ -1075,7 +1075,7 @@ proc check_v3_target_namedlocale { args } {
set result [${tool}_load "./$exe" "" ""]
set status [lindex $result 0]
 
-   verbose "check_v3_target_namedlocale <$args>: status is <$status>" 2
+   verbose "check_v3_target_namedlocale <$locale>: status is <$status>" 2
 
if { $status == "pass" } {
return 1
-- 
2.49.0

[committed] cobol: One additional edit to testsuite/cobol.dg/group1/check_88.cob [PR120251]

2025-05-15 Thread Robert Dubner

Subject: [PATCH] cobol: One additional edit to
 testsuite/cobol.dg/group1/check_88.cob [PR120251]
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Missed one edit.  This fixes that.

gcc/testsuite/ChangeLog:

PR cobol/120251
* cobol.dg/group1/check_88.cob: One final regex "." instead of "ß"
---
 gcc/testsuite/cobol.dg/group1/check_88.cob | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/cobol.dg/group1/check_88.cob
b/gcc/testsuite/cobol.dg/group1/check_88.cob
index 18a299fc282b..f1d0685e478a 100644
--- a/gcc/testsuite/cobol.dg/group1/check_88.cob
+++ b/gcc/testsuite/cobol.dg/group1/check_88.cob
@@ -17,7 +17,7 @@
 *> { dg-output {.* Bundesstra.e
(\n|\r\n|\r)} }
 *> { dg-output { (\n|\r\n|\r)} }
 *> { dg-output {There should be no spaces before the final
quote(\n|\r\n|\r)} }
-*> { dg-output {".* Bundesstraße"(\n|\r\n|\r)} }
+*> { dg-output {".* Bundesstra.e"(\n|\r\n|\r)} }
 *> { dg-output { (\n|\r\n|\r)} }
 *> { dg-output {   IsLow   ""(\n|\r\n|\r)} }
 *> { dg-output {   IsZero  "000"(\n|\r\n|\r)} }
--
2.34.1

Re: [PATCH v2] libstdc++: Preserve the argument type in basic_format_args [PR119246]

2025-05-15 Thread Rainer Orth

Hi Jonathan,

> On Thu, 15 May 2025 at 15:02, Rainer Orth  
> wrote:
>>
>> Hi Jonathan,
>>
>> >> > this patch broke Solaris bootstrap, both i386-pc-solaris2.11 and
>> >> > sparc-sun-solaris2.11:
>> >> >
>> >> > In file included from
>> >> > /vol/gcc/src/hg/master/local/libstdc++-v3/src/c++20/format.cc:29:
>> >> > /var/gcc/regression/master/11.4-gcc/build/i386-pc-solaris2.11/libstdc++-v3/include/format:
>> >> > In member function ‘typename std::basic_format_context<_Out,
>> >> > _CharT>::iterator std::formatter<__float128,
>> >> > _CharT>::format(__float128, std::basic_format_context<_Out, _CharT>&)
>> >> > const’:
>> >> > /var/gcc/regression/master/11.4-gcc/build/i386-pc-solaris2.11/libstdc++-v3/include/format:2994:41:
>> >> > error: ‘__flt128_t’ is not a member of ‘std::__format’; did you mean
>> >> > ‘__bflt16_t’? [-Wtemplate-body]
>> >> > 2994 | { return _M_f.format((__format::__flt128_t)__u, __fc); }
>> >> >  | ^~
>> >> >  | __bflt16_t
>> >> >
>> >> > and one more instance.
>> >>
>> >> And on x86_64-darwin too.
>> >
>> > Tomasz, should this be:
>> >
>> > --- a/libstdc++-v3/include/std/format
>> > +++ b/libstdc++-v3/include/std/format
>> > @@ -2973,7 +2973,7 @@ namespace __format
>> > };
>> > #endif
>> >
>> > -#if defined(__SIZEOF_FLOAT128__) && _GLIBCXX_FORMAT_F128 != 1
>> > +#if defined(__SIZEOF_FLOAT128__) && _GLIBCXX_FORMAT_F128 > 1
>> >   // Reuse __formatter_fp::format<__format::__flt128_t, Out> for 
>> > __float128.
>> >   // This formatter is not declared if
>> > _GLIBCXX_LONG_DOUBLE_ALT128_COMPAT is true,
>> >   // as __float128 when present is same type as __ieee128, which may be 
>> > same as
>> >
>>
>> with this patch applied, I could link libstdc++.so.  I'll run a full
>> bootstrap later today.
>
>
> Good to know, thanks. Tomasz already pushed that change as
> r16-647-gd010a39b9e788a
> so trunk should be OK now.

it is on Solaris/i386, but sparc is broken in a different way now:

/var/gcc/regression/master/11.4-gcc/build/sparc-sun-solaris2.11/libstdc++-v3/include/format:2999:23:
 error: static assertion failed: This specialization should not be used for 
long double
 2999 |   static_assert( !is_same_v<__float128, long double>,
  |   ^~
/var/gcc/regression/master/11.4-gcc/build/sparc-sun-solaris2.11/libstdc++-v3/include/format:2999:23:
 note: ‘!(bool)std::is_same_v’ evaluates to false
/var/gcc/regression/master/11.4-gcc/build/sparc-sun-solaris2.11/libstdc++-v3/include/format:
 In instantiation of ‘struct std::formatter’:
/var/gcc/regression/master/11.4-gcc/build/sparc-sun-solaris2.11/libstdc++-v3/include/type_traits:3559:54:
   required from ‘constexpr const bool 
std::is_default_constructible_v >’
 3559 |   inline constexpr bool is_default_constructible_v = 
__is_constructible(_Tp);
  |  
^~~
[...]

I've a local patch in tree to support __float128 on SPARC, so I'll try
with an unmodified tree first.  However, 2 days ago I could bootstrap
with that included just fine.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

New Chinese (simplified) PO file for 'gcc' (version 15.1.0)

2025-05-15 Thread Translation Project Robot

Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the Chinese (simplified) team of translators.  The file is available at:

https://translationproject.org/latest/gcc/zh_CN.po

(This file, 'gcc-15.1.0.zh_CN.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

https://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

https://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.

Re: [PATCH] libstdc++: Implement C++26 function_ref [PR119126]

2025-05-15 Thread Tomasz Kaminski

I noticed that I am missing deduction guides for function_ref.
Interestingly, only function_ref has them; move_only_function and
copyable_function do not.

On Thu, May 15, 2025 at 5:00 AM Patrick Palka  wrote:

>
>
> On Wed, 14 May 2025, Tomasz Kamiński wrote:
>
> > This patch implements C++26 function_ref as specified in P0792R14,
> > with correction for constraints for constructor accepting nontype_t
> > parameter from LWG 4256.
> >
> > As function_ref may store a pointer to the const object, __Ptrs::_M_obj
> is
> > changed to const void*, so again we do not cast away const from const
> > objects. To help with necessary cast, a __polyfunc::__cast_to helper is
> > added, that accepts a reference to that type.
> >
> > The _Invoker now defines additional call methods used by function_ref:
> > _S_ptrs() for invoking target passed by reference, and __S_nttp,
> _S_bind_ptr,
> > _S_bind_ref for handling constructors accepting nontype_t. The existing
> > _S_call_storage is changed to thin wrappers, that initialies _Ptrs,
> > and forwards to _S_call_ptrs.
> >
> > This reduced the most uses of _Storage::_M_ptr and _Storage::_M_ref,
> > so this functions was removed, and _Manager uses were adjusted.
> >
> > Finally we make function_ref available in freestanding mode, as
> > move_only_function and copyable_function iarecurrently only available in
> hosted,
>
> "are currently"
>
> > so we define _Manager and _Mo_base only if either
> __glibcxx_move_only_function
> > or __glibcxx_copyable_function is defined.
> >
> >   PR libstdc++/119126
> >
> > libstdc++-v3/ChangeLog:
> >
> >   * doc/doxygen/stdheader.cc: Added funcref_impl.h file.
> >   * include/Makefile.am: Added funcref_impl.h file.
> >   * include/Makefile.in: Added funcref_impl.h file.
> >   * include/bits/funcref_impl.h: New file.
> >   * include/bits/funcwrap.h: (_Ptrs::_M_obj): Const-qualify.
> >   (_Storage::_M_ptr, _Storage::_M_ref): Remove.
> >   (__polyfunc::__cast_to) Define.
> >   (_Base_invoker::_S_ptrs, _Base_invoker::_S_nttp)
> >   (_Base_invoker::_S_bind_ptrs, _Base_invoker::_S_bind_ref)
> >   (_Base_invoker::_S_call_ptrs): Define.
> >   (_Base_invoker::_S_call_storage): Foward to _S_call_ptrs.
> >   (_Manager::_S_local, _Manager::_S_ptr): Adjust for _M_obj being
> >   const qualified.
> >   (__polyfunc::_Manager, __polyfunc::_Mo_base): Guard with
> >   __glibcxx_move_only_function || __glibcxx_copyable_function.
> >   (std::function_ref, std::__is_function_ref_v)
> >   [__glibcxx_function_ref]: Define.
> >   * include/bits/utility.h (std::nontype_t, std::nontype)
> >   (__is_nontype_v) [__glibcxx_function_ref]: Define.
> >   * include/bits/version.def: Define function_ref.
> >   * include/bits/version.h: Regenerate.
> >   * src/c++23/std.cc.in (std::function_ref)
> [__cpp_lib_function_ref]:
> >Export.
> >   * testsuite/20_util/function_ref/assign.cc: New test.
> >   * testsuite/20_util/function_ref/call.cc: New test.
> >   * testsuite/20_util/function_ref/cons.cc: New test.
> >   * testsuite/20_util/function_ref/cons_neg.cc: New test.
> >   * testsuite/20_util/function_ref/conv.cc: New test.
>
> Should some of these tests run in freestanding mode too, given that
> function_ref is freestanding?

I have made all tests except conv.cc run in freestanding mode.
conv.cc uses move_only_function and copyable_function, so I left
the requirement for it to be hosted.

>
> > ---
> > Would appreciate check of the documentation comments in funcref_impl.h
> > file.
> >
> >  libstdc++-v3/doc/doxygen/stdheader.cc |   1 +
> >  libstdc++-v3/include/Makefile.am  |   1 +
> >  libstdc++-v3/include/Makefile.in  |   1 +
> >  libstdc++-v3/include/bits/funcref_impl.h  | 185 +++
> >  libstdc++-v3/include/bits/funcwrap.h  | 154 
> >  libstdc++-v3/include/bits/utility.h   |  17 ++
> >  libstdc++-v3/include/bits/version.def |   8 +
> >  libstdc++-v3/include/bits/version.h   |  10 +
> >  libstdc++-v3/include/std/functional   |   4 +-
> >  libstdc++-v3/src/c++23/std.cc.in  |   3 +
> >  .../testsuite/20_util/function_ref/assign.cc  | 110 +
> >  .../testsuite/20_util/function_ref/call.cc| 145 
> >  .../testsuite/20_util/function_ref/cons.cc| 219 ++
> >  .../20_util/function_ref/cons_neg.cc  |  30 +++
> >  .../testsuite/20_util/function_ref/conv.cc| 152 
> >  15 files changed, 993 insertions(+), 47 deletions(-)
> >  create mode 100644 libstdc++-v3/include/bits/funcref_impl.h
> >  create mode 100644 libstdc++-v3/testsuite/20_util/function_ref/assign.cc
> >  create mode 100644 libstdc++-v3/testsuite/20_util/function_ref/call.cc
> >  create mode 100644 libstdc++-v3/testsuite/20_util/function_ref/cons.cc
> >  create mode 100644
> libstdc++-v3/testsuite/20_util/function_ref/cons_neg.cc
> >  c

[PATCH] Forwprop: add a debug dump after propagate into comparison does something

2025-05-15 Thread Andrew Pinski

I noticed that forwprop does not dump anything when
forward_propagate_into_comparison changes the assign statement.
I am actually using the dump to help guide changing/improving/adding match
patterns instead of depending on doing a tree "combiner" here.

Bootstrapped and tested on x86_64-linux-gnu.

gcc/ChangeLog:

* tree-ssa-forwprop.cc (forward_propagate_into_comparison): Dump
when the statement is changed.

Signed-off-by: Andrew Pinski 
---
 gcc/tree-ssa-forwprop.cc | 8 
 1 file changed, 8 insertions(+)

diff --git a/gcc/tree-ssa-forwprop.cc b/gcc/tree-ssa-forwprop.cc
index 3187314390f..9986799da3b 100644
--- a/gcc/tree-ssa-forwprop.cc
+++ b/gcc/tree-ssa-forwprop.cc
@@ -523,6 +523,14 @@ forward_propagate_into_comparison (gimple_stmt_iterator 
*gsi)
 type, rhs1, rhs2);
   if (tmp && useless_type_conversion_p (type, TREE_TYPE (tmp)))
 {
+  if (dump_file)
+   {
+ fprintf (dump_file, "  Replaced '");
+ print_gimple_expr (dump_file, stmt, 0);
+ fprintf (dump_file, "' with '");
+ print_generic_expr (dump_file, tmp);
+ fprintf (dump_file, "'\n");
+   }
   gimple_assign_set_rhs_from_tree (gsi, tmp);
   fold_stmt (gsi);
   update_stmt (gsi_stmt (*gsi));
-- 
2.43.0

Re: [PATCH 1/6] emit-rtl: document next_nonnote_nondebug_insn_bb () can breach into next BB

2025-05-15 Thread Vineet Gupta

+CC @pinskia

On 5/10/25 06:55, Jeff Law wrote:
>
> On 5/9/25 2:27 PM, Vineet Gupta wrote:
>> gcc/ChangeLog:
>>
>>  * emit-rtl.cc (next_nonnote_nondebug_insn): Update comments.
>>
>> Signed-off-by: Vineet Gupta 
>> ---
>>   gcc/emit-rtl.cc | 6 +-
>>   1 file changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc
>> index 3e2c4309dee6..b78b29ecf989 100644
>> --- a/gcc/emit-rtl.cc
>> +++ b/gcc/emit-rtl.cc
>> @@ -3677,7 +3677,11 @@ next_nonnote_nondebug_insn (rtx_insn *insn)
>>   
>>   /* Return the next insn after INSN that is not a NOTE nor DEBUG_INSN,
>>  but stop the search before we enter another basic block.  This
>> -   routine does not look inside SEQUENCEs.  */
>> +   routine does not look inside SEQUENCEs.
>> +   NOTE: This can potentially bleed into next BB. If current insn is
>> + last insn of BB, followed by a code_label before the start of
>> + the next BB, code_label will be returned. But this is the
>> + behavior rest of gcc assumes/relies on e.g. get_last_bb_insn.  */
> I don't see how get_last_bb_insn itself inherently needs this behavior. 
> I could believe something that *calls* get_last_bb_insn might depend on 
> this behavior -- but I'd also consider that bogus.  The CODE_LABEL is 
> part of the next block, so returning that seems quite wrong given the 
> original comment and users like get_last_bb_insn.

Back when I stumbled into this, @pinskia mentioned on IRC that fixing
next_nonnote_nondebug_insn would affect get_last_bb_insn as that expects
BARRIERS (in next bb too)

Thx,
-Vineet

> I don't mind adding the comment, but I'd much rather chase down the 
> offenders and fix them.
>
> Jeff
>
>

[PATCH] Add pattern match in match.pd for .AVG_CEIL

2025-05-15 Thread liuhongt

1) Optimize (a >> 1) + (b >> 1) + ((a | b) & 1) to .AVG_CEIL (a, b)
2) Optimize (a | b) - ((a ^ b) >> 1) to .AVG_CEIL (a, b)

Proof is at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118994#c6
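
As a quick sanity check of the two identities (a scalar sketch only; the patch itself matches vector types), the following compares both forms against a widened (a + b + 1) >> 1 for every pair of 8-bit inputs. Treating that widened expression as the meaning of .AVG_CEIL is an assumption of this example, and the function names are made up here.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: rounding-up average computed in a wider type.  */
uint8_t
avg_ceil_ref (uint8_t a, uint8_t b)
{
  return (uint8_t) (((unsigned) a + (unsigned) b + 1u) >> 1);
}

/* Form 1: (a >> 1) + (b >> 1) + ((a | b) & 1).  */
uint8_t
avg_ceil_shift (uint8_t a, uint8_t b)
{
  return (uint8_t) ((a >> 1) + (b >> 1) + ((a | b) & 1));
}

/* Form 2: (a | b) - ((a ^ b) >> 1).  */
uint8_t
avg_ceil_or (uint8_t a, uint8_t b)
{
  return (uint8_t) ((a | b) - ((a ^ b) >> 1));
}

/* Return 1 if all three agree for every pair of 8-bit values.  */
int
avg_forms_agree (void)
{
  for (unsigned a = 0; a < 256; a++)
    for (unsigned b = 0; b < 256; b++)
      {
        uint8_t x = (uint8_t) a, y = (uint8_t) b;
        uint8_t ref = avg_ceil_ref (x, y);
        if (avg_ceil_shift (x, y) != ref || avg_ceil_or (x, y) != ref)
          return 0;
      }
  return 1;
}

void
check_avg_example (void)
{
  assert (avg_forms_agree ());
}
```

Both forms avoid the intermediate overflow that a naive (a + b + 1) >> 1 would have in the narrow type, which is why they show up in hand-written vector code.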

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?

gcc/ChangeLog:

PR middle-end/118994
* match.pd ((a >> 1) + (b >> 1) + ((a | b) & 1) to
.AVG_CEIL (a, b)): New pattern.
((a | b) - ((a ^ b) >> 1) to .AVG_CEIL (a, b)): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr118994-1.c: New test.
* gcc.target/i386/pr118994-2.c: New test.
---
 gcc/match.pd   | 23 ++
 gcc/testsuite/gcc.target/i386/pr118994-1.c | 37 ++
 gcc/testsuite/gcc.target/i386/pr118994-2.c | 37 ++
 3 files changed, 97 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr118994-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr118994-2.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 96136404f5e..d391ac86edc 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -11455,3 +11455,26 @@ and,
   }
   (if (full_perm_p)
(vec_perm (op@3 @0 @1) @3 @2))
+
+#if GIMPLE
+/* Simplify (a >> 1) + (b >> 1) + ((a | b) & 1) to .AVG_CEIL (a, b).
+   Similar for (a | b) - ((a ^ b) >> 1).  */
+
+(simplify
+  (plus:c
+(plus (rshift @0 integer_onep@1) (rshift @2 @1))
+(bit_and (bit_ior @0 @2) integer_onep@3))
+  (if (cfun && (cfun->curr_properties & PROP_last_full_fold) != 0
+  && VECTOR_TYPE_P (type)
+  && direct_internal_fn_supported_p (IFN_AVG_CEIL, type, 
OPTIMIZE_FOR_BOTH))
+  (IFN_AVG_CEIL @0 @2)))
+
+(simplify
+  (minus
+(bit_ior @0 @2)
+(rshift (bit_xor @0 @2) integer_onep@1))
+  (if (cfun && (cfun->curr_properties & PROP_last_full_fold) != 0
+  && VECTOR_TYPE_P (type)
+  && direct_internal_fn_supported_p (IFN_AVG_CEIL, type, 
OPTIMIZE_FOR_BOTH))
+  (IFN_AVG_CEIL @0 @2)))
+#endif
diff --git a/gcc/testsuite/gcc.target/i386/pr118994-1.c 
b/gcc/testsuite/gcc.target/i386/pr118994-1.c
new file mode 100644
index 000..5f40ababccc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr118994-1.c
@@ -0,0 +1,37 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512bw -mavx512vl -O2 -fdump-tree-optimized" } */
+/* { dg-final { scan-tree-dump-times "\.AVG_CEIL" 6 "optimized"} } */
+
+#define VecRoundingAvg(a, b) ((a >> 1) + (b >> 1) + ((a | b) & 1))
+
+typedef unsigned char GccU8x16Vec __attribute__((__vector_size__(16)));
+typedef unsigned short GccU16x8Vec __attribute__((__vector_size__(16)));
+typedef unsigned char GccU8x32Vec __attribute__((__vector_size__(32)));
+typedef unsigned short GccU16x16Vec __attribute__((__vector_size__(32)));
+typedef unsigned char GccU8x64Vec __attribute__((__vector_size__(64)));
+typedef unsigned short GccU16x32Vec __attribute__((__vector_size__(64)));
+
+GccU8x16Vec U8x16VecRoundingAvg(GccU8x16Vec a, GccU8x16Vec b) {
+  return VecRoundingAvg(a, b);
+}
+
+GccU16x8Vec U16x8VecRoundingAvg(GccU16x8Vec a, GccU16x8Vec b) {
+  return VecRoundingAvg(a, b);
+}
+
+GccU8x32Vec U8x32VecRoundingAvg(GccU8x32Vec a, GccU8x32Vec b) {
+  return VecRoundingAvg(a, b);
+}
+
+GccU16x16Vec U16x16VecRoundingAvg(GccU16x16Vec a, GccU16x16Vec b) {
+  return VecRoundingAvg(a, b);
+}
+
+GccU8x64Vec U8x64VecRoundingAvg(GccU8x64Vec a, GccU8x64Vec b) {
+  return VecRoundingAvg(a, b);
+}
+
+GccU16x32Vec U16x32VecRoundingAvg(GccU16x32Vec a, GccU16x32Vec b) {
+  return VecRoundingAvg(a, b);
+}
+
diff --git a/gcc/testsuite/gcc.target/i386/pr118994-2.c 
b/gcc/testsuite/gcc.target/i386/pr118994-2.c
new file mode 100644
index 000..ba90e0a2992
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr118994-2.c
@@ -0,0 +1,37 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512bw -mavx512vl -O2 -fdump-tree-optimized" } */
+/* { dg-final { scan-tree-dump-times "\.AVG_CEIL" 6 "optimized"} } */
+
+#define VecRoundingAvg(a, b) ((a | b) - ((a ^ b) >> 1))
+
+typedef unsigned char GccU8x16Vec __attribute__((__vector_size__(16)));
+typedef unsigned short GccU16x8Vec __attribute__((__vector_size__(16)));
+typedef unsigned char GccU8x32Vec __attribute__((__vector_size__(32)));
+typedef unsigned short GccU16x16Vec __attribute__((__vector_size__(32)));
+typedef unsigned char GccU8x64Vec __attribute__((__vector_size__(64)));
+typedef unsigned short GccU16x32Vec __attribute__((__vector_size__(64)));
+
+GccU8x16Vec U8x16VecRoundingAvg(GccU8x16Vec a, GccU8x16Vec b) {
+  return VecRoundingAvg(a, b);
+}
+
+GccU16x8Vec U16x8VecRoundingAvg(GccU16x8Vec a, GccU16x8Vec b) {
+  return VecRoundingAvg(a, b);
+}
+
+GccU8x32Vec U8x32VecRoundingAvg(GccU8x32Vec a, GccU8x32Vec b) {
+  return VecRoundingAvg(a, b);
+}
+
+GccU16x16Vec U16x16VecRoundingAvg(GccU16x16Vec a, GccU16x16Vec b) {
+  return VecRoundingAvg(a, b);
+}
+
+GccU8x64Vec U8x64VecRoundingAvg(GccU8x64Vec a, GccU8x64Vec b) {
+  return VecRoundingAvg(a, b);
+}
+
+GccU16x32Vec U16x32VecRoundingAvg(GccU16x32Vec a, GccU16x32Vec b

RE: [PATCH 1/2]middle-end: Apply loop->unroll directly in vectorizer

2025-05-15 Thread Richard Biener

On Wed, 14 May 2025, Tamar Christina wrote:

> > > > >
> > > > > -  /* Loops vectorized with a variable factor won't benefit from
> > > > > +  /* Loops vectorized would have already taken into account unrolling
> > specified
> > > > > + by the user as the suggested unroll factor, as such we need to 
> > > > > prevent the
> > > > > + RTL unroller from unrolling twice.  The only exception is 
> > > > > static known
> > > > > + iterations where we would have expected the loop to be fully 
> > > > > unrolled.
> > > > > + Loops vectorized with a variable factor won't benefit from
> > > > >   unrolling/peeling.  */
> > > > > -  if (!vf.is_constant ())
> > > > > +  if (LOOP_VINFO_USER_UNROLL (loop_vinfo)
> > > >
> > > > ... this is the transform phase - is LOOP_VINFO_USER_UNROLL copied
> > > > from the earlier attempt?
> > >
> > > Ah, I see I forgot to copy it when the loop_vinfo is copied..  Will fix.
> > >
> 
> I've been looking more into the behavior and I think it's correct not to copy 
> it from an earlier attempt.
> The flag would be re-set every time during vect_estimate_min_profitable_iters 
> as we have to recalculate
> the unroll based on the assumed_vf.
> 
> When vect_analyze_loop_2 initializes the costing structure, we just set it 
> again during vect_analyze_loop_costing
> as loop->unroll is not cleared until vectorization succeeds.
> 
> For the epilogue it would be false, which I think makes sense as the 
> epilogues should determine their VF solely
> based on that of the previous attempt? Because I think it makes sense that 
> the epilogues should be able to tell
> the vectorizer that it wants to re-use the same mode for the next attempt, 
> just without the unrolling.
> 
> > In the end whatever we do it's going to be a matter of documenting
> > the interaction between vectorization and #pragma GCC unroll.
> > 
> 
> Docs added
> 
> > The way you handle it is reasonable, the question is whether to
> > set loop->unroll to 1 in the end (disable any further unrolling)
> > or to 0 (only auto-unroll based on heuristics).  I'd argue 0
> > makes more sense - iff we chose to apply the extra unrolling
> > during vectorization.
> 
> 0 does make more sense to me as well.  I think where we got crossed earlier 
> was that I was mentioning that
> Having unroll > 1 after this wasn't a good idea, so was a miscommunication. 
> But 0 makes much sense.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu,
> arm-none-linux-gnueabihf, x86_64-pc-linux-gnu
> -m32, -m64 and no issues.
> 
> Ok for master?
> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   * tree-vectorizer.h (vector_costs::set_suggested_unroll_factor,
>   LOOP_VINFO_USER_UNROLL): New.
>   (class _loop_vec_info): Add user_unroll.
>   * tree-vect-loop.cc (vect_estimate_min_profitable_iters): Set
>   suggested_unroll_factor before calling backend costing.
>   (_loop_vec_info::_loop_vec_info): Initialize user_unroll.
>   (vect_transform_loop): Clear the loop->unroll value if the pragma was
>   used.
>   doc/extend.texi (pragma unroll): Document vectorizer interaction.
> 
> -- inline copy of patch --
> 
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index 
> e87a3c271f8420d8fd175823b5bb655f76c89afe..f8261d13903afc90d3341c09ab3fdbd0ab96ea49
>  100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -10398,6 +10398,11 @@ unrolled @var{n} times regardless of any commandline 
> arguments.
>  When the option is @var{preferred} then the user is allowed to override the
>  unroll amount through commandline options.
>  
> +If the loop was vectorized the unroll factor specified will be used to seed 
> the
> +vectorizer unroll factor.  Whether the loop is unrolled or not will be
> +determined by target costing.  The resulting vectorized loop may still be
> +unrolled more in later passes depending on the target costing.
> +
>  @end table
>  
>  @node Thread-Local
> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> index 
> fe6f3cf188e40396b299ff9e814cc402bc2d4e2d..1fbf92b5f4b176ada7379930b73ab503fb423e99
>  100644
> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@ -1073,6 +1073,7 @@ _loop_vec_info::_loop_vec_info (class loop *loop_in, 
> vec_info_shared *shared)
>  peeling_for_gaps (false),
>  peeling_for_niter (false),
>  early_breaks (false),
> +user_unroll (false),
>  no_data_dependencies (false),
>  has_mask_store (false),
>  scalar_loop_scaling (profile_probability::uninitialized ()),
> @@ -4983,6 +4984,26 @@ vect_estimate_min_profitable_iters (loop_vec_info 
> loop_vinfo,
>   }
>  }
>  
> +  /* Seed the target cost model with what the user requested if the unroll
> + factor is larger than 1 vector VF.  */
> +  auto user_unroll = LOOP_VINFO_LOOP (loop_vinfo)->unroll;
> +  if (user_unroll > 1)
> +{
> +  LOOP_VINFO_USER_UNROLL (loop_vinfo) = true;
> +  int unroll_fact = user_unroll / assumed_vf;
> +  unroll_fact =

[PATCH] Match: Handle commonly used unsigned modulo counters

2025-05-15 Thread MCC CS

Dear all,

Here's my patch for PR120265. Bootstrapped and tested on aarch64 that it
causes no regressions. I also added a testcase. I'd be grateful
if you could commit it.

Otherwise, feedback to improve it is welcome.

Many thanks
MCCCS
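
For readers skimming the thread, here is a small scalar sketch of the three rewrites listed in the ChangeLog of the patch, each checked against x % m over the range where it is meant to apply. This is only an illustration with made-up function names; the patch itself works on value ranges inside match.pd.

```c
#include <assert.h>

/* X % M -> X, valid for X in [0, M-1].  */
unsigned
mod_small (unsigned x, unsigned m)
{
  (void) m;
  return x;
}

/* X % M -> (X == M) ? 0 : X, valid for X in [0, M].  */
unsigned
mod_eq (unsigned x, unsigned m)
{
  return x == m ? 0 : x;
}

/* X % M -> (X >= M) ? (X - M) : X, valid for X in [0, 2*M-1].  */
unsigned
mod_sub (unsigned x, unsigned m)
{
  return x >= m ? x - m : x;
}

/* Return 1 if each rewrite matches x % m on its stated range.
   Assumes m > 0 and that 2 * m does not overflow.  */
int
rewrites_agree (unsigned m)
{
  for (unsigned x = 0; x < m; x++)
    if (mod_small (x, m) != x % m)
      return 0;
  for (unsigned x = 0; x <= m; x++)
    if (mod_eq (x, m) != x % m)
      return 0;
  for (unsigned x = 0; x < 2 * m; x++)
    if (mod_sub (x, m) != x % m)
      return 0;
  return 1;
}

void
check_mod_example (void)
{
  assert (rewrites_agree (600));
}
```

In the testcase, m++ followed by m %= 600 sees values in [1, 600] (the second form), while m += 7 sees values in [7, 606] (the third form).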


From 1e901c3fa5c8cc3e55d4f1715b4aae4ae3d66714 Mon Sep 17 00:00:00 2001
From: MCCCS 
Date: Thu, 15 May 2025 09:16:49 +0100
Subject: [PATCH] tree-optimization/120265 - Optimize modular counters

This PR is about replacing trunc_mod with a
simpler expression given the bounds of variables.

PR tree-optimization/120265
* match.pd:
X % M -> X for X in 0 to M-1
X % M -> (X == M) ? 0 : X for X in 0 to M
X % M -> (X >= M) ? (X - M) : X for X in 0 to 2*M-1.

* gcc.dg/pr120265.c: New testcase.
---
 gcc/match.pd| 27 
 gcc/testsuite/gcc.dg/pr120265.c | 44 +
 2 files changed, 71 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/pr120265.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 79485f9678a..bd8950b4e10 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -5602,6 +5602,33 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
optab_vector)))
(eq (trunc_mod @0 @1) { build_zero_cst (TREE_TYPE (@0)); })))
 
+#if GIMPLE
+/* X % M -> X for X in 0 to M-1.  */
+/* X % M -> (X == M) ? 0 : X for X in 0 to M.  */
+/* X % M -> (X >= M) ? (X - M) : X for X in 0 to 2*M-1.  */
+(simplify
+ (trunc_mod @0 @1)
+  (with { int_range_max vr0, vr1; }
+   (if (get_range_query (cfun)->range_of_expr (vr0, @0)
+   && get_range_query (cfun)->range_of_expr (vr1, @1)
+   && !vr0.undefined_p ()
+   && !vr1.undefined_p ()
+   && !integer_zerop (@1)
+   && (TYPE_UNSIGNED (type)
+   || (vr0.nonnegative_p () && vr1.nonnegative_p (
+(with
+ { wide_int twice = 2 * vr1.lower_bound (); }
+ (switch
+  (if (wi::gtu_p (vr1.lower_bound (), vr0.upper_bound ()))
+   @0)
+  (if (wi::geu_p (vr1.lower_bound (), vr0.upper_bound ()))
+   (cond (eq @0 @1)
+   { build_zero_cst (type); }
+   @0))
+  (if (wi::gtu_p (twice, vr0.upper_bound ()))
+   (cond (ge @0 @1) (minus @0 @1) @0)))
+#endif
+
 /* ((X /[ex] C1) +- C2) * (C1 * C3)  -->  (X * C3) +- (C1 * C2 * C3).  */
 (for op (plus minus)
  (simplify
diff --git a/gcc/testsuite/gcc.dg/pr120265.c b/gcc/testsuite/gcc.dg/pr120265.c
new file mode 100644
index 000..2634af36226
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr120265.c
@@ -0,0 +1,44 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-optimized" } */
+__attribute__((noipa)) void g(int r)
+{
+ (void) r;
+}
+
+int x;
+
+void a(void)
+{
+ unsigned m = 0;
+  for(int i = 0; i < 300; i++)
+  {
+   m++;
+   m %= 600;
+   g(m);
+  }
+}
+
+void b(void)
+{
+ unsigned m = 0;
+  for(int i = 0; i < x; i++)
+  {
+   m++;
+   m %= 600;
+   g(m);
+  }
+}
+
+void c(void)
+{
+ unsigned m = 0;
+ for(int i = 0; i < x; i++)
+ {
+  m += 7;
+  m %= 600;
+  g(m);
+ }
+}
+
+/* { dg-final { scan-tree-dump-not "% 600" "optimized" } } */
+
-- 
2.45.2
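
For readers who want to check the three rewrites from the ChangeLog by
hand, here is a minimal C++ sketch. It is not part of the patch: the
function names are made up for illustration, and it only compares each
rewrite against the plain % operator over the range in which that rewrite
is claimed to be valid.

```cpp
#include <cassert>

// Rewrite 1: X % M -> X, valid when 0 <= x <= m - 1.
constexpr unsigned mod_when_below(unsigned x, unsigned /*m*/)
{ return x; }

// Rewrite 2: X % M -> (X == M) ? 0 : X, valid when 0 <= x <= m.
constexpr unsigned mod_when_at_most(unsigned x, unsigned m)
{ return x == m ? 0u : x; }

// Rewrite 3: X % M -> (X >= M) ? (X - M) : X, valid when 0 <= x <= 2*m - 1.
constexpr unsigned mod_when_below_twice(unsigned x, unsigned m)
{ return x >= m ? x - m : x; }

// Compare every rewrite with x % m over its whole valid input range.
// M is assumed to be nonzero and small enough that 2*m does not wrap.
constexpr bool rewrites_agree(unsigned m)
{
  for (unsigned x = 0; x < m; ++x)
    if (mod_when_below(x, m) != x % m)
      return false;
  for (unsigned x = 0; x <= m; ++x)
    if (mod_when_at_most(x, m) != x % m)
      return false;
  for (unsigned x = 0; x < 2 * m; ++x)
    if (mod_when_below_twice(x, m) != x % m)
      return false;
  return true;
}
```

With m = 600 as in the testcase, the third form covers the "m += 7" loop,
where the value before the modulo is at most 606.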

Re: [PATCH v1] libstdc++: Fix class mandate for extents.

2025-05-15 Thread Tomasz Kaminski

On Wed, May 14, 2025 at 9:18 PM Luc Grosheintz 
wrote:

> The standard states that the IndexType must be a signed or unsigned
> integer. This mandate was implemented using `std::is_integral_v`, which
> also includes (among others) char and bool, which are neither signed nor
> unsigned integers.
>
> libstdc++-v3/ChangeLog:
>
> * include/std/mdspan: Implement the mandate for extents as
> signed or unsigned integer and not any integral type.
> * testsuite/23_containers/mdspan/extents/class_mandates_neg.cc:
> Check
> that extents and extents are invalid.
> * testsuite/23_containers/mdspan/extents/misc.cc: Update
> tests to avoid `char` and `bool` as IndexType.
> ---
>
LGTM, thanks for catching this.

>  libstdc++-v3/include/std/mdspan|  3 ++-
>  .../23_containers/mdspan/extents/class_mandates_neg.cc | 10 +++---
>  .../testsuite/23_containers/mdspan/extents/misc.cc |  8 
>  3 files changed, 13 insertions(+), 8 deletions(-)
>
> diff --git a/libstdc++-v3/include/std/mdspan
> b/libstdc++-v3/include/std/mdspan
> index aee96dda7cd..22509d9c8f4 100644
> --- a/libstdc++-v3/include/std/mdspan
> +++ b/libstdc++-v3/include/std/mdspan
> @@ -163,7 +163,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>template
>  class extents
>  {
> -  static_assert(is_integral_v<_IndexType>, "_IndexType must be
> integral.");
> +  static_assert(__is_standard_integer<_IndexType>::value,
> +   "_IndexType must be a signed or unsigned integer.");
>static_assert(
>   (__mdspan::__valid_static_extent<_Extents, _IndexType> && ...),
>   "Extents must either be dynamic or representable as _IndexType");
> diff --git
> a/libstdc++-v3/testsuite/23_containers/mdspan/extents/class_mandates_neg.cc
> b/libstdc++-v3/testsuite/23_containers/mdspan/extents/class_mandates_neg.cc
> index b654e3920a8..63a2db77c08 100644
> ---
> a/libstdc++-v3/testsuite/23_containers/mdspan/extents/class_mandates_neg.cc
> +++
> b/libstdc++-v3/testsuite/23_containers/mdspan/extents/class_mandates_neg.cc
> @@ -1,8 +1,12 @@
>  // { dg-do compile { target c++23 } }
>  #include
>
> -std::extents e1; // { dg-error "from here" }
> -std::extents e2;// { dg-error "from here" }
> +#include 
> +
> +std::extents e1; // { dg-error "from here" }
> +std::extents e2; // { dg-error "from here" }
> +std::extents e3; // { dg-error "from here" }
> +std::extents e4;   // { dg-error "from here" }
>  // { dg-prune-output "dynamic or representable as _IndexType" }
> -// { dg-prune-output "must be integral" }
> +// { dg-prune-output "signed or unsigned integer" }
>  // { dg-prune-output "invalid use of incomplete type" }
> diff --git a/libstdc++-v3/testsuite/23_containers/mdspan/extents/misc.cc
> b/libstdc++-v3/testsuite/23_containers/mdspan/extents/misc.cc
> index 1620475..e71fdc54230 100644
> --- a/libstdc++-v3/testsuite/23_containers/mdspan/extents/misc.cc
> +++ b/libstdc++-v3/testsuite/23_containers/mdspan/extents/misc.cc
> @@ -1,6 +1,7 @@
>  // { dg-do run { target c++23 } }
>  #include 
>
> +#include 
>  #include 
>
>  constexpr size_t dyn = std::dynamic_extent;
> @@ -20,7 +21,6 @@ static_assert(std::is_same_v 2>::rank_type, size_t>);
>  static_assert(std::is_unsigned_v::size_type>);
>  static_assert(std::is_unsigned_v 2>::size_type>);
>
> -static_assert(std::is_same_v::index_type, char>);
>  static_assert(std::is_same_v::index_type, int>);
>  static_assert(std::is_same_v::index_type,
>   unsigned int>);
> @@ -49,7 +49,7 @@ static_assert(check_rank_return_types());
>
>  // Check that the static extents don't take up space.
>  static_assert(sizeof(std::extents) == sizeof(int));
> -static_assert(sizeof(std::extents) == sizeof(char));
> +static_assert(sizeof(std::extents) == sizeof(short));
>
>  template
>  class Container
> @@ -58,7 +58,7 @@ class Container
>[[no_unique_address]] std::extents b0;
>  };
>
> -static_assert(sizeof(Container>) == sizeof(int));
> +static_assert(sizeof(Container>) ==
> sizeof(int));
>  static_assert(sizeof(Container>) ==
> sizeof(int));
>
>  // operator=
> @@ -103,7 +103,7 @@ test_deduction_all()
>test_deduction<0>();
>test_deduction<1>(1);
>test_deduction<2>(1.0, 2.0f);
> -  test_deduction<3>(int(1), char(2), size_t(3));
> +  test_deduction<3>(int(1), short(2), size_t(3));
>return true;
>  }
>
> --
> 2.49.0
>
>
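
To illustrate the distinction the patch relies on, here is a small C++17
sketch. The trait below is a hypothetical stand-in, not libstdc++'s
__is_standard_integer: it assumes that only the ten standard signed and
unsigned integer types qualify (ignoring extended integer types) and that
cv-qualifiers are stripped before the comparison.

```cpp
#include <cassert>
#include <type_traits>

// True if T, ignoring cv-qualifiers, is exactly one of Candidates.
template<typename T, typename... Candidates>
inline constexpr bool is_one_of_v
  = (std::is_same_v<std::remove_cv_t<T>, Candidates> || ...);

// Hypothetical trait: the standard signed and unsigned integer types only.
template<typename T>
inline constexpr bool is_signed_or_unsigned_integer_v
  = is_one_of_v<T, signed char, short, int, long, long long,
                unsigned char, unsigned short, unsigned int,
                unsigned long, unsigned long long>;

// char and bool are integral types, but they are not in the list above.
static_assert(std::is_integral_v<char>
              && !is_signed_or_unsigned_integer_v<char>);
static_assert(std::is_integral_v<bool>
              && !is_signed_or_unsigned_integer_v<bool>);
static_assert(is_signed_or_unsigned_integer_v<short>);
```

This is why the tests switch from char to short: short passes both checks,
while char only passes the old is_integral_v check.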

Re: [RFC PATCH] Implement -fdiag-prefix-map

2025-05-15 Thread Rasmus Villemoes

On Wed, May 14 2025, David Malcolm  wrote:

> On Thu, 2025-04-03 at 13:58 +0200, Rasmus Villemoes wrote:
>> In many setups, especially when CI and/or some meta-build system like
>> Yocto or buildroot, is involved, gcc ends up being invoked using
>> absolute path names, which are often long and uninteresting.
>> 
>> That amounts to a lot of noise both when trying to decipher the
>> warning or error, when the warning text is copy-pasted to a commit
>> fixing the issue, or posted to a mailing list.
>
> Various thoughts:
>
> The patch only touches "text" output, it doesn't affect "sarif" (or
> "json", but that's deprecated and I plan to remove it soon).  
>
> FWIW SARIF has some interesting support for redacting sensitive
> information; see e.g. 
> "3.5.2 Redactable strings"
> https://docs.oasis-open.org/sarif/sarif/v2.1.0/errata01/os/sarif-v2.1.0-errata01-os-complete.html#_Toc141790687
>
> and "3.14.28 redactionTokens property"
> https://docs.oasis-open.org/sarif/sarif/v2.1.0/errata01/os/sarif-v2.1.0-errata01-os-complete.html#_Toc141790762
>

Interesting. I didn't know about the SARIF concept until just now, nor of
course that gcc could produce such output.

I also now see that there are a lot of existing -fdiagnostics-* options,
so if this is actually going to happen, it should probably be
-fdiagnostics-prefix-map and not the short form as in $subject.

> I don't know if we want to implement redaction within GCC's SARIF
> output code, or to simply defer that to 3rd party post-processing
> tools.

I think it's better to do it in (all) gcc output, as gcc has a better
chance of knowing what is actually a build path; otherwise it seems that
any post-processor would need to do more or less naive string
matching. And in many CI setups, the leading part of the build path can
be somewhat random (some docker volume mapped to some path including the
CI job ID or whatnot), so I think it might be hard to even know which
strings to replace.

> If we're going to punt on the issue of redaction, then one approach
> here might be to say that the new option only affects text output.  I'm
> not sure here.  Is redaction the only use-case, or is this also about
> cleaning up long and irrelevant paths in CI output?

It's really both. For redaction purposes, one will have to be really
careful anyway (because the errors one quotes might as well come from
Make or some other part of the build system). But as a completely random
example found online, see
https://github.com/openembedded/meta-openembedded/commit/21315bd00b2a1dcdf532929992c90496b8425192
; everything up to and including /git/ in that path is irrelevant, and
it would be nice if it simply wasn't there to begin with so simple
copy-pasting would give a nicer commit message. And even when not used
for a commit, just reading the CI output is easier without that long
leading path.

>
> [...snip...]
>
>>  
>>    const char *line_col = maybe_line_and_column (line, col);
>> diff --git a/gcc/file-prefix-map.cc b/gcc/file-prefix-map.cc
>> index 3a77b195ae3..7ae3e7f95d5 100644
>> --- a/gcc/file-prefix-map.cc
>> +++ b/gcc/file-prefix-map.cc
>
> [...snip...]
>
>> +
>> +/* Remap using -fdiag-prefix-map.  Return the GC-allocated new name
>> +   corresponding to FILENAME or FILENAME if no remapping was
>> performed.  */
>> +const char *
>> +remap_diag_filename (const char *filename)
>> +{
>> +  return remap_filename (diag_prefix_maps, filename);
>> +}
>
> The returned string is GC-allocated, via ggc_internal_alloc.  For host
> binaries linked against ggc-page.o this buffer will eventually be
> garbage-collected, but for host binaries linked against ggc-none.o this
> is a memory leak.  Perhaps the remap_filename code could simply return
> a std::string?

Perhaps, but that is beyond my knowledge of gcc internals. I merely copy-pasted
as much as I could from the existing remapping, but then also had to do
a bit of Makefile juggling to make the result link.

But wouldn't such a change require lots of changes in the callers of the
one-liner functions defined in terms of remap_filename? Or would you
change those three functions to copy the .c_str to a GC'ed allocation as
they do now and only return the std::string from the new
remap_diag_filename?

Rasmus
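
As a rough illustration of the std::string idea discussed above, here is a
self-contained C++17 sketch of prefix remapping that returns by value, so
no GC or manual allocation is involved. It is not GCC's remap_filename:
the prefix_map struct, the remap_prefix name and the first-match-wins
policy are all assumptions made for the example.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical representation of one OLD=NEW mapping.
struct prefix_map
{
  std::string old_prefix;
  std::string new_prefix;
};

// Return FILENAME with the first matching old prefix replaced by its new
// prefix, or an unchanged copy of FILENAME if no mapping applies.
std::string remap_prefix(const std::vector<prefix_map>& maps,
                         std::string_view filename)
{
  for (const prefix_map& m : maps)
    {
      const std::size_t len = m.old_prefix.size();
      // substr clamps to the string length, so a short FILENAME is safe.
      if (filename.substr(0, len) == m.old_prefix)
        return m.new_prefix + std::string(filename.substr(len));
    }
  return std::string(filename);
}
```

Callers that still need a const char * with a longer lifetime would have to
copy the result somewhere, which is the question raised about the existing
one-liner wrappers.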

Re: [PATCH] libstdc++: Micro-optimization in std::arg overload for scalars

2025-05-15 Thread Tomasz Kaminski

On Wed, May 14, 2025 at 10:46 PM Jonathan Wakely  wrote:

> Use __builtin_signbit directly instead of std::signbit.
>
> libstdc++-v3/ChangeLog:
>
> * include/std/complex (arg(T)): Use __builtin_signbit instead of
> std::signbit.
> ---
>
> This would avoid overload resolution for std::signbit, and avoid a
> function call for -O0, but I'm not sure it's worth bothering.
>
You already did it, tested it and pushed a patch, so I see no reason to
not commit it. LGTM

>
> Tested x86_64-linux.
>
>  libstdc++-v3/include/std/complex | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/libstdc++-v3/include/std/complex
> b/libstdc++-v3/include/std/complex
> index 67f37d4ec2b7..d9d2d8afda89 100644
> --- a/libstdc++-v3/include/std/complex
> +++ b/libstdc++-v3/include/std/complex
> @@ -2532,8 +2532,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>  {
>typedef typename __gnu_cxx::__promote<_Tp>::__type __type;
>  #if (_GLIBCXX11_USE_C99_MATH && !_GLIBCXX_USE_C99_FP_MACROS_DYNAMIC)
> -  return std::signbit(__x) ?
> __type(3.1415926535897932384626433832795029L)
> -  : __type();
> +  return __builtin_signbit(__type(__x))
> +  ? __type(3.1415926535897932384626433832795029L) : __type();
>  #else
>return std::arg(std::complex<__type>(__x));
>  #endif
> --
> 2.49.0
>
>
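
For context, the scalar overload simply maps the sign bit of a real number
to an angle of pi or zero. A minimal sketch of that idea, using a made-up
function name and plain double instead of the promoted __type, and assuming
a compiler that provides __builtin_signbit (GCC and Clang do):

```cpp
#include <cassert>

// Argument (phase angle) of a real number treated as a complex value:
// pi if the sign bit is set, including for -0.0, and 0 otherwise.
inline double arg_of_real(double x)
{
  constexpr double pi = 3.14159265358979323846;
  return __builtin_signbit(x) ? pi : 0.0;
}
```

Testing the sign bit rather than comparing with zero matters for -0.0,
which compares equal to 0.0 but still yields pi here.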

[PATCH v2] libstdc++: Implement C++26 function_ref [PR119126]

2025-05-15 Thread Tomasz Kamiński

This patch implements C++26 function_ref as specified in P0792R14,
with correction for constraints for constructor accepting nontype_t
parameter from LWG 4256.

As function_ref may store a pointer to a const object, _Ptrs::_M_obj is
changed to const void*, so that we do not cast away const from const
objects. To help with the necessary casts, a __polyfunc::__cast_to helper
is added, which accepts a reference to that type.

The _Invoker now defines additional call methods used by function_ref:
_S_ptrs() for invoking a target passed by reference, and _S_nttp,
_S_bind_ptr, _S_bind_ref for handling constructors accepting nontype_t.
The existing _S_call_storage is changed to a thin wrapper that initializes
_Ptrs and forwards to _S_call_ptrs.

This removed most uses of _Storage::_M_ptr and _Storage::_M_ref,
so these functions were removed, and _Manager uses were adjusted.

Finally, we make function_ref available in freestanding mode. As
move_only_function and copyable_function are currently only available in
hosted mode, we define _Manager and _Mo_base only if either
__glibcxx_move_only_function or __glibcxx_copyable_function is defined.

PR libstdc++/119126

libstdc++-v3/ChangeLog:

* doc/doxygen/stdheader.cc: Added funcref_impl.h file.
* include/Makefile.am: Added funcref_impl.h file.
* include/Makefile.in: Added funcref_impl.h file.
* include/bits/funcref_impl.h: New file.
* include/bits/funcwrap.h (_Ptrs::_M_obj): Const-qualify.
(_Storage::_M_ptr, _Storage::_M_ref): Remove.
(__polyfunc::__cast_to): Define.
(_Base_invoker::_S_ptrs, _Base_invoker::_S_nttp)
(_Base_invoker::_S_bind_ptrs, _Base_invoker::_S_bind_ref)
(_Base_invoker::_S_call_ptrs): Define.
(_Base_invoker::_S_call_storage): Forward to _S_call_ptrs.
(_Manager::_S_local, _Manager::_S_ptr): Adjust for _M_obj being
const qualified.
(__polyfunc::_Manager, __polyfunc::_Mo_base): Guard with
__glibcxx_move_only_function || __glibcxx_copyable_function.
(__polyfunc::__skip_first_arg, __polyfunc::__deduce_funcref)
(std::function_ref) [__glibcxx_function_ref]: Define.
* include/bits/utility.h (std::nontype_t, std::nontype)
(__is_nontype_v) [__glibcxx_function_ref]: Define.
* include/bits/version.def: Define function_ref.
* include/bits/version.h: Regenerate.
* include/std/functional: Define __cpp_lib_function_ref.
* src/c++23/std.cc.in (std::nontype_t, std::nontype)
(std::function_ref) [__cpp_lib_function_ref]: Export.
* testsuite/20_util/function_ref/assign.cc: New test.
* testsuite/20_util/function_ref/call.cc: New test.
* testsuite/20_util/function_ref/cons.cc: New test.
* testsuite/20_util/function_ref/cons_neg.cc: New test.
* testsuite/20_util/function_ref/conv.cc: New test.
* testsuite/20_util/function_ref/deduction.cc: New test.
---
This updated patch addresses review comments, and adds a deduction
guide for function_ref that was previously missing.
The handling of _GLIBCXX_MOF_REF and related macros was removed
from the funcref_impl.h, as it is not used currently.
For the ifdef checks, I replaced defined(__cpp_lib_) with
__cpp_lib_ consistently.
The unused __is_function_ref variable template was removed.
All tests except conv.cc are now tested with freestanding.

Tested on x86_64-linux. Tested *function_ref* with -ffreestanding.
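
For readers unfamiliar with the technique, the following is a heavily
simplified C++17 sketch of the type erasure behind a function_ref-like
class: a const void* to the callable plus a function pointer that restores
the type. It is not the libstdc++ implementation; function_ref_sketch and
its members are hypothetical names, and function pointers, nontype_t,
noexcept signatures and deduction guides are all left out.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

template<typename Sig>
class function_ref_sketch; // hypothetical, not std::function_ref

template<typename R, typename... Args>
class function_ref_sketch<R(Args...)>
{
  const void* obj_;                   // erased pointer to the callable
  R (*thunk_)(const void*, Args...);  // restores the type and invokes it

public:
  // Accept any callable object (not a plain function) invocable as R(Args...).
  template<typename F,
           typename T = std::remove_reference_t<F>,
           typename = std::enable_if_t<
             !std::is_same_v<std::remove_cv_t<T>, function_ref_sketch>
             && !std::is_function_v<T>
             && std::is_invocable_r_v<R, T&, Args...>>>
  function_ref_sketch(F&& f) noexcept
  : obj_(std::addressof(f)),
    thunk_([](const void* p, Args... args) -> R {
      // T keeps the constness of the original object, so the const_cast
      // only undoes the const added by storing a const void*.
      T& target = *const_cast<T*>(static_cast<const T*>(p));
      if constexpr (std::is_void_v<R>)
        target(std::forward<Args>(args)...);
      else
        return target(std::forward<Args>(args)...);
    })
  { }

  R operator()(Args... args) const
  { return thunk_(obj_, std::forward<Args>(args)...); }
};
```

The wrapper never owns the callable, so the referenced object must outlive
every call through the wrapper.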


 libstdc++-v3/doc/doxygen/stdheader.cc |   1 +
 libstdc++-v3/include/Makefile.am  |   1 +
 libstdc++-v3/include/Makefile.in  |   1 +
 libstdc++-v3/include/bits/funcref_impl.h  | 188 +++
 libstdc++-v3/include/bits/funcwrap.h  | 183 +++
 libstdc++-v3/include/bits/utility.h   |  17 ++
 libstdc++-v3/include/bits/version.def |   8 +
 libstdc++-v3/include/bits/version.h   |  10 +
 libstdc++-v3/include/std/functional   |   3 +-
 libstdc++-v3/src/c++23/std.cc.in  |   7 +
 .../testsuite/20_util/function_ref/assign.cc  | 108 +
 .../testsuite/20_util/function_ref/call.cc| 144 
 .../testsuite/20_util/function_ref/cons.cc| 218 ++
 .../20_util/function_ref/cons_neg.cc  |  30 +++
 .../testsuite/20_util/function_ref/conv.cc| 152 
 .../20_util/function_ref/deduction.cc | 103 +
 16 files changed, 1127 insertions(+), 47 deletions(-)
 create mode 100644 libstdc++-v3/include/bits/funcref_impl.h
 create mode 100644 libstdc++-v3/testsuite/20_util/function_ref/assign.cc
 create mode 100644 libstdc++-v3/testsuite/20_util/function_ref/call.cc
 create mode 100644 libstdc++-v3/testsuite/20_util/function_ref/cons.cc
 create mode 100644 libstdc++-v3/testsuite/20_util/function_ref/cons_neg.cc
 create mode 100644 libstdc++-v3/testsuite/20_util/function_ref/conv.cc
 create mode 100644 libstdc++-v3/testsuite/20_util/function

Re: [PATCH v1] libstdc++: Fix class mandate for extents.

2025-05-15 Thread Jonathan Wakely

On Wed, 14 May 2025 at 20:18, Luc Grosheintz  wrote:
>
> The standard states that the IndexType must be a signed or unsigned
> integer. This mandate was implemented using `std::is_integral_v`, which
> also includes (among others) char and bool, which are neither signed nor
> unsigned integers.
>
> libstdc++-v3/ChangeLog:
>
> * include/std/mdspan: Implement the mandate for extents as
> signed or unsigned integer and not any integral type.
> * testsuite/23_containers/mdspan/extents/class_mandates_neg.cc: Check
> that extents and extents are invalid.
> * testsuite/23_containers/mdspan/extents/misc.cc: Update
> tests to avoid `char` and `bool` as IndexType.
> ---
>  libstdc++-v3/include/std/mdspan|  3 ++-
>  .../23_containers/mdspan/extents/class_mandates_neg.cc | 10 +++---
>  .../testsuite/23_containers/mdspan/extents/misc.cc |  8 
>  3 files changed, 13 insertions(+), 8 deletions(-)
>
> diff --git a/libstdc++-v3/include/std/mdspan b/libstdc++-v3/include/std/mdspan
> index aee96dda7cd..22509d9c8f4 100644
> --- a/libstdc++-v3/include/std/mdspan
> +++ b/libstdc++-v3/include/std/mdspan
> @@ -163,7 +163,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>template
>  class extents
>  {
> -  static_assert(is_integral_v<_IndexType>, "_IndexType must be 
> integral.");
> +  static_assert(__is_standard_integer<_IndexType>::value,
> +   "_IndexType must be a signed or unsigned integer.");

GCC's diagnostics never end with a full stop (aka period), and we
follow that convention for our static assertions.

So I'll remove the '.' at the end of the string literal, and then push
this to trunk.


>static_assert(
>   (__mdspan::__valid_static_extent<_Extents, _IndexType> && ...),
>   "Extents must either be dynamic or representable as _IndexType");
> diff --git 
> a/libstdc++-v3/testsuite/23_containers/mdspan/extents/class_mandates_neg.cc 
> b/libstdc++-v3/testsuite/23_containers/mdspan/extents/class_mandates_neg.cc
> index b654e3920a8..63a2db77c08 100644
> --- 
> a/libstdc++-v3/testsuite/23_containers/mdspan/extents/class_mandates_neg.cc
> +++ 
> b/libstdc++-v3/testsuite/23_containers/mdspan/extents/class_mandates_neg.cc
> @@ -1,8 +1,12 @@
>  // { dg-do compile { target c++23 } }
>  #include
>
> -std::extents e1; // { dg-error "from here" }
> -std::extents e2;// { dg-error "from here" }
> +#include 
> +
> +std::extents e1; // { dg-error "from here" }
> +std::extents e2; // { dg-error "from here" }
> +std::extents e3; // { dg-error "from here" }
> +std::extents e4;   // { dg-error "from here" }
>  // { dg-prune-output "dynamic or representable as _IndexType" }
> -// { dg-prune-output "must be integral" }
> +// { dg-prune-output "signed or unsigned integer" }
>  // { dg-prune-output "invalid use of incomplete type" }
> diff --git a/libstdc++-v3/testsuite/23_containers/mdspan/extents/misc.cc 
> b/libstdc++-v3/testsuite/23_containers/mdspan/extents/misc.cc
> index 1620475..e71fdc54230 100644
> --- a/libstdc++-v3/testsuite/23_containers/mdspan/extents/misc.cc
> +++ b/libstdc++-v3/testsuite/23_containers/mdspan/extents/misc.cc
> @@ -1,6 +1,7 @@
>  // { dg-do run { target c++23 } }
>  #include 
>
> +#include 
>  #include 
>
>  constexpr size_t dyn = std::dynamic_extent;
> @@ -20,7 +21,6 @@ static_assert(std::is_same_v 2>::rank_type, size_t>);
>  static_assert(std::is_unsigned_v::size_type>);
>  static_assert(std::is_unsigned_v::size_type>);
>
> -static_assert(std::is_same_v::index_type, char>);
>  static_assert(std::is_same_v::index_type, int>);
>  static_assert(std::is_same_v::index_type,
>   unsigned int>);
> @@ -49,7 +49,7 @@ static_assert(check_rank_return_types());
>
>  // Check that the static extents don't take up space.
>  static_assert(sizeof(std::extents) == sizeof(int));
> -static_assert(sizeof(std::extents) == sizeof(char));
> +static_assert(sizeof(std::extents) == sizeof(short));
>
>  template
>  class Container
> @@ -58,7 +58,7 @@ class Container
>[[no_unique_address]] std::extents b0;
>  };
>
> -static_assert(sizeof(Container>) == sizeof(int));
> +static_assert(sizeof(Container>) == sizeof(int));
>  static_assert(sizeof(Container>) == sizeof(int));
>
>  // operator=
> @@ -103,7 +103,7 @@ test_deduction_all()
>test_deduction<0>();
>test_deduction<1>(1);
>test_deduction<2>(1.0, 2.0f);
> -  test_deduction<3>(int(1), char(2), size_t(3));
> +  test_deduction<3>(int(1), short(2), size_t(3));
>return true;
>  }
>
> --
> 2.49.0
>

Re: [PATCH v1] libstdc++: Fix class mandate for extents.

2025-05-15 Thread Jonathan Wakely

On Thu, 15 May 2025 at 11:14, Jonathan Wakely  wrote:
>
> On Wed, 14 May 2025 at 20:18, Luc Grosheintz  wrote:
> >
> > The standard states that the IndexType must be a signed or unsigned
> > integer. This mandate was implemented using `std::is_integral_v`, which
> > also includes (among others) char and bool, which are neither signed nor
> > unsigned integers.
> >
> > libstdc++-v3/ChangeLog:
> >
> > * include/std/mdspan: Implement the mandate for extents as
> > signed or unsigned integer and not any integral type.
> > * testsuite/23_containers/mdspan/extents/class_mandates_neg.cc: 
> > Check
> > that extents and extents are invalid.
> > * testsuite/23_containers/mdspan/extents/misc.cc: Update
> > tests to avoid `char` and `bool` as IndexType.
> > ---
> >  libstdc++-v3/include/std/mdspan|  3 ++-
> >  .../23_containers/mdspan/extents/class_mandates_neg.cc | 10 +++---
> >  .../testsuite/23_containers/mdspan/extents/misc.cc |  8 
> >  3 files changed, 13 insertions(+), 8 deletions(-)
> >
> > diff --git a/libstdc++-v3/include/std/mdspan 
> > b/libstdc++-v3/include/std/mdspan
> > index aee96dda7cd..22509d9c8f4 100644
> > --- a/libstdc++-v3/include/std/mdspan
> > +++ b/libstdc++-v3/include/std/mdspan
> > @@ -163,7 +163,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >template
> >  class extents
> >  {
> > -  static_assert(is_integral_v<_IndexType>, "_IndexType must be 
> > integral.");
> > +  static_assert(__is_standard_integer<_IndexType>::value,
> > +   "_IndexType must be a signed or unsigned integer.");
>
> GCC's diagnostics never end with a full stop (aka period), and we
> follow that convention for our static assertions.
>
> So I'll remove the '.' at the end of the string literal, and then push
> this to trunk.
>
>
> >static_assert(
> >   (__mdspan::__valid_static_extent<_Extents, _IndexType> && ...),
> >   "Extents must either be dynamic or representable as _IndexType");

I've just noticed that this static_assert refers to "Extents" without
the leading underscore, but "_IndexType" with a leading underscore.

I think it's OK to omit the leading underscore, it might be a bit more
user-friendly and I don't think anybody will be confused by the fact
it's not identical to the real template parameter. But we should
either do it consistently for _Extents and _IndexType or for neither
of them.

Anybody want to argue for or against underscores?

Re: [PATCH v1] libstdc++: Fix class mandate for extents.

2025-05-15 Thread Ville Voutilainen

Mild preference against; use the names from the standard, not the
implementation, in such diagnostics.

On Thu, 15 May 2025 at 13:20, Jonathan Wakely wrote:

> On Thu, 15 May 2025 at 11:14, Jonathan Wakely  wrote:
> >
> > On Wed, 14 May 2025 at 20:18, Luc Grosheintz 
> wrote:
> > >
> > > The standard states that the IndexType must be a signed or unsigned
> > > integer. This mandate was implemented using `std::is_integral_v`, which
> > > also includes (among others) char and bool, which are neither signed nor
> > > unsigned integers.
> > >
> > > libstdc++-v3/ChangeLog:
> > >
> > > * include/std/mdspan: Implement the mandate for extents as
> > > signed or unsigned integer and not any integral type.
> > > *
> testsuite/23_containers/mdspan/extents/class_mandates_neg.cc: Check
> > > that extents and extents are invalid.
> > > * testsuite/23_containers/mdspan/extents/misc.cc: Update
> > > tests to avoid `char` and `bool` as IndexType.
> > > ---
> > >  libstdc++-v3/include/std/mdspan|  3 ++-
> > >  .../23_containers/mdspan/extents/class_mandates_neg.cc | 10 +++---
> > >  .../testsuite/23_containers/mdspan/extents/misc.cc |  8 
> > >  3 files changed, 13 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/libstdc++-v3/include/std/mdspan
> b/libstdc++-v3/include/std/mdspan
> > > index aee96dda7cd..22509d9c8f4 100644
> > > --- a/libstdc++-v3/include/std/mdspan
> > > +++ b/libstdc++-v3/include/std/mdspan
> > > @@ -163,7 +163,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> > >template
> > >  class extents
> > >  {
> > > -  static_assert(is_integral_v<_IndexType>, "_IndexType must be
> integral.");
> > > +  static_assert(__is_standard_integer<_IndexType>::value,
> > > +   "_IndexType must be a signed or unsigned
> integer.");
> >
> > GCC's diagnostics never end with a full stop (aka period), and we
> > follow that convention for our static assertions.
> >
> > So I'll remove the '.' at the end of the string literal, and then push
> > this to trunk.
> >
> >
> > >static_assert(
> > >   (__mdspan::__valid_static_extent<_Extents, _IndexType> &&
> ...),
> > >   "Extents must either be dynamic or representable as
> _IndexType");
>
> I've just noticed that this static_assert refers to "Extents" without
> the leading underscore, but "_IndexType" with a leading underscore.
>
> I think it's OK to omit the leading underscore, it might be a bit more
> user-friendly and I don't think anybody will be confused by the fact
> it's not identical to the real template parameter. But we should
> either do it consistently for _Extents and _IndexType or for neither
> of them.
>
> Anybody want to argue for or against underscores?
>
>

[PATCH] libstdc++: Fix std::format_kind primary template for Clang [PR120190]

2025-05-15 Thread Jonathan Wakely

Although Clang trunk has been adjusted to handle our std::format_kind
definition (because they need to be able to compile the GCC 15.1.0
release), it's probably better to not rely on something that they might
start diagnosing again in future.

Define the primary template in terms of an immediately invoked function
expression, so that we can put a static_assert(false) in the body.

The diagnostic that users will see looks like this:

include/c++/16.0.0/format:5470:21: error: static assertion failed
 5470 |   static_assert(false); // Cannot use primary template of 
std::format_kind
  | ^

libstdc++-v3/ChangeLog:

PR libstdc++/120190
* include/std/format (format_kind): Adjust primary template to
not depend on itself.
* testsuite/std/format/ranges/format_kind_neg.cc: Adjust
expected errors. Check more invalid specializations.
---

Tested x86_64-linux.

I haven't tested it fully with Clang, because the version of Clang in
Fedora is older than the change which started rejecting the previous
definition of std::format_kind, and the build in Compiler Explorer is
newer than the revert of that change.

 libstdc++-v3/include/std/format   | 19 ++-
 .../std/format/ranges/format_kind_neg.cc  | 15 ++-
 2 files changed, 24 insertions(+), 10 deletions(-)

diff --git a/libstdc++-v3/include/std/format b/libstdc++-v3/include/std/format
index bfda5895e0c0..887e891f2096 100644
--- a/libstdc++-v3/include/std/format
+++ b/libstdc++-v3/include/std/format
@@ -5464,13 +5464,22 @@ namespace __format
 debug_string
   };
 
-  /// @cond undocumented
+  /** @brief A constant determining how a range should be formatted.
+   *
+   * The primary template of `std::format_kind` cannot be instantiated.
+   * There is a partial specialization for input ranges and you can
+   * specialize the variable template for your own cv-unqualified types
+   * that satisfy the `ranges::input_range` concept.
+   *
+   * @since C++23
+   */
   template
-constexpr auto format_kind =
-__primary_template_not_defined(
-  format_kind<_Rg> // you can specialize this for non-const input ranges
-);
+constexpr auto format_kind = []{
+  static_assert(false); // Cannot use primary template of 
'std::format_kind'
+  return type_identity<_Rg>{};
+}();
 
+  /// @cond undocumented
   template
 consteval range_format
 __fmt_kind()
diff --git a/libstdc++-v3/testsuite/std/format/ranges/format_kind_neg.cc 
b/libstdc++-v3/testsuite/std/format/ranges/format_kind_neg.cc
index bf8619d3d276..b2f1b6f42201 100644
--- a/libstdc++-v3/testsuite/std/format/ranges/format_kind_neg.cc
+++ b/libstdc++-v3/testsuite/std/format/ranges/format_kind_neg.cc
@@ -5,9 +5,14 @@
 
 #include 
 
-template struct Tester { };
+void test()
+{
+  (void) std::format_kind; // { dg-error "here" }
+  (void) std::format_kind; // { dg-error "here" }
+  (void) std::format_kind; // { dg-error "here" }
+  (void) std::format_kind; // { dg-error "here" }
+  (void) std::format_kind; // { dg-error "here" }
+  (void) std::format_kind; // { dg-error "here" }
+}
 
-Tester> t; // { dg-error "here" }
-
-// { dg-error "use of 'std::format_kind" "" { target *-*-* } 0 }
-// { dg-error "primary_template_not_defined" "" { target *-*-* } 0 }
+// { dg-error "static assertion failed" "" { target *-*-* } 0 }
-- 
2.49.0

Re: [PATCH] libstdc++: Fix std::format_kind primary template for Clang [PR120190]

2025-05-15 Thread Daniel Krügler

On Thu, 15 May 2025 at 13:00, Jonathan Wakely <jwak...@redhat.com> wrote:

> Although Clang trunk has been adjusted to handle our std::format_kind
> definition (because they need to be able to compile the GCC 15.1.0
> release), it's probably better to not rely on something that they might
> start diagnosing again in future.
>
> Define the primary template in terms of an immediately invoked function
> expression, so that we can put a static_assert(false) in the body.
>
> The diagnostic that users will see looks like this:
>
> include/c++/16.0.0/format:5470:21: error: static assertion failed
>  5470 |   static_assert(false); // Cannot use primary template of
> std::format_kind
>   | ^
>
> libstdc++-v3/ChangeLog:
>
> PR libstdc++/120190
> * include/std/format (format_kind): Adjust primary template to
> not depend on itself.
> * testsuite/std/format/ranges/format_kind_neg.cc: Adjust
> expected errors. Check more invalid specializations.
> ---
>
> Tested x86_64-linux.
>
> I haven't tested it fully with Clang, because the version of Clang in
> Fedora is older than the change which started rejecting the previous
> definition of std::format_kind, and the build in Compiler Explorer is
> newer than the revert of that change.
>
>  libstdc++-v3/include/std/format   | 19 ++-
>  .../std/format/ranges/format_kind_neg.cc  | 15 ++-
>  2 files changed, 24 insertions(+), 10 deletions(-)
>
> diff --git a/libstdc++-v3/include/std/format
> b/libstdc++-v3/include/std/format
> index bfda5895e0c0..887e891f2096 100644
> --- a/libstdc++-v3/include/std/format
> +++ b/libstdc++-v3/include/std/format
> @@ -5464,13 +5464,22 @@ namespace __format
>  debug_string
>};
>
> -  /// @cond undocumented
> +  /** @brief A constant determining how a range should be formatted.
> +   *
> +   * The primary template of `std::format_kind` cannot be instantiated.
> +   * There is a partial specialization for input ranges and you can
> +   * specialize the variable template for your own cv-unqualified types
> +   * that satisfy the `ranges::input_range` concept.
> +   *
> +   * @since C++23
> +   */
>template
> -constexpr auto format_kind =
> -__primary_template_not_defined(
> -  format_kind<_Rg> // you can specialize this for non-const input
> ranges
> -);
> +constexpr auto format_kind = []{
> +  static_assert(false); // Cannot use primary template of
> 'std::format_kind'
>

Shouldn't the comment actually be used as the static_assert message

static_assert(false, "Cannot use primary template of 'std::format_kind'");

?

- Daniel

[PATCH] RISC-V: Fix the warning of temporary object dangling references.

2025-05-15 Thread Dongyan Chen

Building GCC produces warnings about possibly dangling references to
temporary objects.  They are reported in riscv-common.cc for the declarations
of const riscv_ext_info_t &implied_ext_info, const riscv_ext_info_t &ext_info =
get_riscv_ext_info (ext) and auto &ext_info = get_riscv_ext_info (search_ext).
In each case the argument passed to get_riscv_ext_info is not a std::string,
so a temporary appears to be created for the call, and the compiler warns that
the reference bound to the result might refer to that temporary.
To fix this, the patch first copies the argument into a named std::string
local variable and passes that instead, which eliminates the warnings.
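
The pattern can be illustrated outside of GCC.  The following is only a
minimal sketch: ext_info, lookup_ext_info and implied_count are hypothetical
stand-ins, and it assumes (this is not confirmed by the patch itself) that the
lookup function takes its key as const std::string & and returns a reference
into a table that outlives the call.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for riscv_ext_info_t.
struct ext_info { int implied_count; };

// Hypothetical stand-in for get_riscv_ext_info.  Assumption: the key is
// taken by const std::string & and the result refers to a static table.
inline const ext_info &
lookup_ext_info (const std::string &name)
{
  static const std::map<std::string, ext_info> table
    = { { "zba", { 0 } }, { "d", { 1 } } };
  return table.at (name);
}

inline int
implied_count (const char *ext)
{
  // Writing `const ext_info &info = lookup_ext_info (ext);` would build a
  // temporary std::string for the call.  That temporary is destroyed at the
  // end of the statement, and -Wdangling-reference may warn because the
  // compiler cannot tell that the returned reference does not refer to it.
  std::string ext_str = ext;                 // named object, outlives the call
  const ext_info &info = lookup_ext_info (ext_str);
  return info.implied_count;
}
```

In this sketch the returned reference never actually points at the temporary,
so the named local only makes the lifetime obvious to the compiler; the
behaviour of the lookup is unchanged.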

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc 
(riscv_ext_info_t::apply_implied_ext): Type conversion.
(riscv_subset_list::handle_implied_ext): Ditto.
(riscv_minimal_hwprobe_feature_bits): Ditto.

---
 gcc/common/config/riscv/riscv-common.cc | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/gcc/common/config/riscv/riscv-common.cc 
b/gcc/common/config/riscv/riscv-common.cc
index 53ca03910b38..c2e35dfe54d2 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -245,8 +245,9 @@ riscv_ext_info_t::apply_implied_ext (riscv_subset_list 
*subset_list) const
   subset_list->add (implied_info.implied_ext, true);

   /* Recursively add implied extension by implied_info->implied_ext.  */
+  std::string implied_ext_str = implied_info.implied_ext;
   const riscv_ext_info_t &implied_ext_info
-   = get_riscv_ext_info (implied_info.implied_ext);
+   = get_riscv_ext_info (implied_ext_str);
   implied_ext_info.apply_implied_ext (subset_list);
 }
   return any_change;
@@ -1089,7 +1090,8 @@ riscv_subset_list::parse_single_std_ext (const char *p, 
bool exact_single_p)
 void
 riscv_subset_list::handle_implied_ext (const char *ext)
 {
-  const riscv_ext_info_t &ext_info = get_riscv_ext_info (ext);
+  std::string ext_str = ext;
+  const riscv_ext_info_t &ext_info = get_riscv_ext_info (ext_str);
   ext_info.apply_implied_ext (this);

   /* For RISC-V ISA version 2.2 or earlier version, zicsr and zifence is
@@ -1642,7 +1644,8 @@ riscv_minimal_hwprobe_feature_bits (const char *isa,
  search_q.pop ();

  /* Iterate through the implied extension table.  */
- auto &ext_info = get_riscv_ext_info (search_ext);
+ std::string search_ext_str = search_ext;
+ auto &ext_info = get_riscv_ext_info (search_ext_str);
  for (const auto &implied_ext : ext_info.implied_exts ())
{
  /* When the search extension matches the implied extension and
--
2.43.0

Re: [RFC PATCH 0/2] Add target_clones profile option support

2025-05-15 Thread Alice Carlotti

On Thu, May 08, 2025 at 09:41:08PM +0800, Yangyu Chen wrote:
> 
> 
> > On 8 May 2025, at 18:36, Richard Sandiford  
> > wrote:
> > 
> > Yangyu Chen  writes:
> >>> On 6 May 2025, at 17:49, Alfie Richards  wrote:
> >>> 
> >>> On 06/05/2025 09:36, Yangyu Chen wrote:
> > On 6 May 2025, at 16:01, Alfie Richards  wrote:
> > 
> > Additionally, I think ideally the file can express functions 
> > disambiguated by file, signature, and namespace.
> > I imagine we could use similar syntax to gdb supports?
> > 
> > For example:
> > 
> > ```
> > foo  |arch=+v
> > bar(int, char)   |arch=+zba,+zbb
> > file.C:baz(char) |arch=+zba,+zbb#arch=+v
> > namespace::qux   |arch=+v
> > ```
>  Also a great idea. However, I think it's not easy to use to implement
>  it now in GCC. But I would like to accept any further feedback if
>  we have such a simple API in GCC to do so, or if it will be implemented
>  by the community.
>  And something behind this idea is that I'm researching auto-generating
>  target clones attributes for developers. Only accepting the ASM
>  name is enough to implement this.
> >>> 
> >>> Ah that makes sense, apologies I missed that.
> >>> 
> >>> I think accepting the assembler name is good, and solves the overloading 
> >>> ambiguity issue.
> >>> 
> >>> Maybe we can use the pipe '|' instead of ':' in the file format to leave 
> >>> room for both in future?
> >> 
> >> 
> >> I will consider using the pipe '|' in the next revision. Thanks for
> >> the advice.
> > 
> > How about instead using a json file?  There's already a parser built into
> > the compiler.
> > 
> > That has the advantage of being an established format that generators
> > can use.  It would also allow other ways of specifying the functions
> > to be added in future.
> > 
> 
> Thanks for this useful information. I also want to extend the current
> function multi-versioning to have the ability to set -mtune for
> different micro-architectures in the future. Using JSON can provide
> more extensibility in the future.
> 
> Thanks,
> Yangyu Chen

For AArch64, we already support setting different tuning options for different
versions (this is why I deliberately allowed the "target" attribute to be used
on a function that also has the "target_clones" or "target_version" attribute).


What we don't support is allowing different microarchitectures to affect the
version selection algorithm when the architecture features remain the same.
This sounds like a nice idea, but on further consideration I think there are
several reasons why we shouldn't try adding this.  I think these reasons apply
at least to the AArch64 and RISC-V ecosystems.

- Detecting specific microarchitecture properties (e.g. details like cache
  size, pipeline widths, instruction latencies) would be somewhere between
  difficult and impossible.  It certainly isn't something that can be packaged
  into a standard user-friendly attribute.

- Detecting specific microarchitectures by individual names would probably
  require writing long lists of known microarchitectures, and wouldn't be able
  to make a non-default choice for microarchitectures that weren't known about
  when the software was written.

- The only good way I can see to systematically group microarchitectures that
  works for unknown future microarchitectures is to look at what architecture
  version or features they support.  But then we're back to using architecture
  versions as the basis for version selection, which we already support.

I'm not saying that users should never use microarchitecture detection to
select between different implementations of a function.  I just think that it's
relatively uncommon to need to do this, and that doing it well is much more
complicated than checking whether architecture features are supported.  It
would be best to just let users write their own custom resolvers and use ifuncs
(or other platform specific mechanisms) directly.
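
As a rough illustration of the "custom resolver" approach, here is a portable
sketch that picks an implementation through a function pointer.  It is not an
ifunc and does not use any real detection API: resolve_sum, the two sum
functions and the name "example-wide-core" are all hypothetical, and a real
resolver would obtain the microarchitecture from a platform-specific source.

```cpp
#include <cassert>
#include <string_view>

using sum_fn = long (*) (const int *, int);

// Default implementation.
inline long
sum_simple (const int *p, int n)
{
  long s = 0;
  for (int i = 0; i < n; ++i)
    s += p[i];
  return s;
}

// Same result with two accumulators; stands in for a version tuned for
// one particular core.
inline long
sum_two_lanes (const int *p, int n)
{
  long a = 0, b = 0;
  int i = 0;
  for (; i + 1 < n; i += 2)
    {
      a += p[i];
      b += p[i + 1];
    }
  if (i < n)
    a += p[i];
  return a + b;
}

// Hypothetical resolver keyed on a microarchitecture name.  Names that were
// not known when this was written fall through to the default.
inline sum_fn
resolve_sum (std::string_view uarch)
{
  if (uarch == "example-wide-core")
    return sum_two_lanes;
  return sum_simple;
}
```

The fall-through in resolve_sum is the limitation described in the second
bullet above: a core released after the list was written can only ever get
the default version.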

Re: [PATCH] libstdc++: Fix std::format_kind primary template for Clang [PR120190]

2025-05-15 Thread Jonathan Wakely

On Thu, 15 May 2025 at 12:15, Daniel Krügler  wrote:
>
> On Thu, 15 May 2025 at 13:00, Jonathan Wakely wrote:
>>
>> Although Clang trunk has been adjusted to handle our std::format_kind
>> definition (because they need to be able to compile the GCC 15.1.0
>> release), it's probably better to not rely on something that they might
>> start diagnosing again in future.
>>
>> Define the primary template in terms of an immediately invoked function
>> expression, so that we can put a static_assert(false) in the body.
>>
>> The diagnostic that users will see looks like this:
>>
>> include/c++/16.0.0/format:5470:21: error: static assertion failed
>>  5470 |   static_assert(false); // Cannot use primary template of 
>> std::format_kind
>>   | ^
>>
>> libstdc++-v3/ChangeLog:
>>
>> PR libstdc++/120190
>> * include/std/format (format_kind): Adjust primary template to
>> not depend on itself.
>> * testsuite/std/format/ranges/format_kind_neg.cc: Adjust
>> expected errors. Check more invalid specializations.
>> ---
>>
>> Tested x86_64-linux.
>>
>> I haven't tested it fully with Clang, because the version of Clang in
>> Fedora is older than the change which started rejecting the previous
>> definition of std::format_kind, and the build in Compiler Explorer is
>> newer than the revert of that change.
>>
>>  libstdc++-v3/include/std/format   | 19 ++-
>>  .../std/format/ranges/format_kind_neg.cc  | 15 ++-
>>  2 files changed, 24 insertions(+), 10 deletions(-)
>>
>> diff --git a/libstdc++-v3/include/std/format 
>> b/libstdc++-v3/include/std/format
>> index bfda5895e0c0..887e891f2096 100644
>> --- a/libstdc++-v3/include/std/format
>> +++ b/libstdc++-v3/include/std/format
>> @@ -5464,13 +5464,22 @@ namespace __format
>>  debug_string
>>};
>>
>> -  /// @cond undocumented
>> +  /** @brief A constant determining how a range should be formatted.
>> +   *
>> +   * The primary template of `std::format_kind` cannot be instantiated.
>> +   * There is a partial specialization for input ranges and you can
>> +   * specialize the variable template for your own cv-unqualified types
>> +   * that satisfy the `ranges::input_range` concept.
>> +   *
>> +   * @since C++23
>> +   */
>>template
>> -constexpr auto format_kind =
>> -__primary_template_not_defined(
>> -  format_kind<_Rg> // you can specialize this for non-const input ranges
>> -);
>> +constexpr auto format_kind = []{
>> +  static_assert(false); // Cannot use primary template of 
>> 'std::format_kind'
>
>
> Shouldn't the comment actually be used as static_assert message
>
> static_assert(false, "Cannot use primary template of 'std::format_kind'");
>
> ?

Ha, yes, obviously that would be better! :-)
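
For readers following along, the shape of the primary template with the
message moved into the static_assert can be sketched in isolation.  This is
not the libstdc++ code: fmt_kind, kind_of, type_tag, always_false and my_range
are hypothetical names, and the sketch uses a dependent always-false condition
instead of a plain static_assert(false), as an assumption-free way to keep it
compiling with compilers that reject static_assert(false) in a template.

```cpp
enum class fmt_kind { disabled, sequence };

// Always false, but dependent on T, so the assertion below is only
// evaluated when the primary template is actually instantiated.
template<typename T>
constexpr bool always_false = false;

template<typename T> struct type_tag { };

// Primary template: an immediately invoked lambda whose body carries the
// static_assert, with the explanation as the assertion message.
template<typename T>
constexpr auto kind_of = []{
  static_assert(always_false<T>, "cannot use primary template of 'kind_of'");
  return type_tag<T>{};
}();

// Partial specialization: every pointer type gets a usable value.
template<typename T>
constexpr fmt_kind kind_of<T*> = fmt_kind::sequence;

// Explicit specialization for a program-defined type.
struct my_range { };
template<>
constexpr fmt_kind kind_of<my_range> = fmt_kind::disabled;

static_assert(kind_of<int*> == fmt_kind::sequence, "partial specialization");
static_assert(kind_of<my_range> == fmt_kind::disabled, "explicit specialization");
```

Naming kind_of<int> would select the primary template and fail with the
message text in the diagnostic, rather than a bare "static assertion failed".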

Re: [PATCH] libstdc++: Fix std::format_kind primary template for Clang [PR120190]

2025-05-15 Thread Tomasz Kaminski

Please also add the message to dg-error check in format_kind_neg.cc.
With that LGTM.

On Thu, May 15, 2025 at 2:19 PM Jonathan Wakely  wrote:

> On Thu, 15 May 2025 at 12:15, Daniel Krügler 
> wrote:
> >
> > On Thu, 15 May 2025 at 13:00, Jonathan Wakely <jwak...@redhat.com> wrote:
> >>
> >> Although Clang trunk has been adjusted to handle our std::format_kind
> >> definition (because they need to be able to compile the GCC 15.1.0
> >> release), it's probably better to not rely on something that they might
> >> start diagnosing again in future.
> >>
> >> Define the primary template in terms of an immediately invoked function
> >> expression, so that we can put a static_assert(false) in the body.
> >>
> >> The diagnostic that users will see looks like this:
> >>
> >> include/c++/16.0.0/format:5470:21: error: static assertion failed
> >>  5470 |   static_assert(false); // Cannot use primary template of
> std::format_kind
> >>   | ^
> >>
> >> libstdc++-v3/ChangeLog:
> >>
> >> PR libstdc++/120190
> >> * include/std/format (format_kind): Adjust primary template to
> >> not depend on itself.
> >> * testsuite/std/format/ranges/format_kind_neg.cc: Adjust
> >> expected errors. Check more invalid specializations.
> >> ---
> >>
> >> Tested x86_64-linux.
> >>
> >> I haven't tested it fully with Clang, because the version of Clang in
> >> Fedora is older than the change which started rejecting the previous
> >> definition of std::format_kind, and the build in Compiler Explorer is
> >> newer than the revert of that change.
> >>
> >>  libstdc++-v3/include/std/format   | 19 ++-
> >>  .../std/format/ranges/format_kind_neg.cc  | 15 ++-
> >>  2 files changed, 24 insertions(+), 10 deletions(-)
> >>
> >> diff --git a/libstdc++-v3/include/std/format
> b/libstdc++-v3/include/std/format
> >> index bfda5895e0c0..887e891f2096 100644
> >> --- a/libstdc++-v3/include/std/format
> >> +++ b/libstdc++-v3/include/std/format
> >> @@ -5464,13 +5464,22 @@ namespace __format
> >>  debug_string
> >>};
> >>
> >> -  /// @cond undocumented
> >> +  /** @brief A constant determining how a range should be formatted.
> >> +   *
> >> +   * The primary template of `std::format_kind` cannot be instantiated.
> >> +   * There is a partial specialization for input ranges and you can
> >> +   * specialize the variable template for your own cv-unqualified types
> >> +   * that satisfy the `ranges::input_range` concept.
> >> +   *
> >> +   * @since C++23
> >> +   */
> >>template
> >> -constexpr auto format_kind =
> >> -__primary_template_not_defined(
> >> -  format_kind<_Rg> // you can specialize this for non-const input
> ranges
> >> -);
> >> +constexpr auto format_kind = []{
> >> +  static_assert(false); // Cannot use primary template of
> 'std::format_kind'
> >
> >
> > Shouldn't the comment actually be used as static_assert message
> >
> > static_assert(false, "Cannot use primary template of
> 'std::format_kind'");
> >
> > ?
>
> Ha, yes, obviously that would be better! :-)
>
>

Re: [PATCH v1] libstdc++: Fix class mandate for extents.

2025-05-15 Thread Tomasz Kaminski

No strong preference, but Ville's argument sounds reasonable.

On Thu, May 15, 2025 at 12:25 PM Ville Voutilainen <
ville.voutilai...@gmail.com> wrote:

> Mild preference against; use the names from the standard, not the
> implementation, in such diagnostics.
>
> On Thu, 15 May 2025 at 13:20, Jonathan Wakely wrote:
>
>> On Thu, 15 May 2025 at 11:14, Jonathan Wakely  wrote:
>> >
>> > On Wed, 14 May 2025 at 20:18, Luc Grosheintz 
>> wrote:
>> > >
>> > > The standard states that the IndexType must be a signed or unsigned
>> > > integer. This mandate was implemented using `std::is_integral_v`, which
>> > > also includes (among others) char and bool, which are neither signed nor
>> > > unsigned integers.
>> > >
>> > > libstdc++-v3/ChangeLog:
>> > >
>> > > * include/std/mdspan: Implement the mandate for extents as
>> > > signed or unsigned integer and not any integral type.
>> > > * testsuite/23_containers/mdspan/extents/class_mandates_neg.cc: Check
>> > > that extents and extents are invalid.
>> > > * testsuite/23_containers/mdspan/extents/misc.cc: Update
>> > > tests to avoid `char` and `bool` as IndexType.
>> > > ---
>> > >  libstdc++-v3/include/std/mdspan|  3 ++-
>> > >  .../23_containers/mdspan/extents/class_mandates_neg.cc | 10
>> +++---
>> > >  .../testsuite/23_containers/mdspan/extents/misc.cc |  8 
>> > >  3 files changed, 13 insertions(+), 8 deletions(-)
>> > >
>> > > diff --git a/libstdc++-v3/include/std/mdspan
>> b/libstdc++-v3/include/std/mdspan
>> > > index aee96dda7cd..22509d9c8f4 100644
>> > > --- a/libstdc++-v3/include/std/mdspan
>> > > +++ b/libstdc++-v3/include/std/mdspan
>> > > @@ -163,7 +163,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>> > >template
>> > >  class extents
>> > >  {
>> > > -  static_assert(is_integral_v<_IndexType>, "_IndexType must be
>> integral.");
>> > > +  static_assert(__is_standard_integer<_IndexType>::value,
>> > > +   "_IndexType must be a signed or unsigned
>> integer.");
>> >
>> > GCC's diagnostics never end with a full stop (aka period), and we
>> > follow that convention for our static assertions.
>> >
>> > So I'll remove the '.' at the end of the string literal, and then push
>> > this to trunk.
>> >
>> >
>> > >static_assert(
>> > >   (__mdspan::__valid_static_extent<_Extents, _IndexType> &&
>> ...),
>> > >   "Extents must either be dynamic or representable as
>> _IndexType");
>>
>> I've just noticed that this static_assert refers to "Extents" without
>> the leading underscore, but "_IndexType" with a leading underscore.
>>
>> I think it's OK to omit the leading underscore, it might be a bit more
>> user-friendly and I don't think anybody will be confused by the fact
>> it's not identical to the real template parameter. But we should
>> either do it consistently for _Extents and _IndexType or for neither
>> of them.
>>
>> Anybody want to argue for or against underscores?
>>
>>

[PATCH v3] libstdc++: Implement C++26 function_ref [PR119126]

2025-05-15 Thread Tomasz Kamiński

This patch implements C++26 function_ref as specified in P0792R14,
with correction for constraints for constructor accepting nontype_t
parameter from LWG 4256.

As function_ref may store a pointer to a const object, _Ptrs::_M_obj is
changed to const void*, so again we do not cast away const from const
objects. To help with the necessary casts, a __polyfunc::__cast_to helper is
added, which accepts a reference to the target type or the target type directly.

The _Invoker now defines additional call methods used by function_ref:
_S_ptrs() for invoking a target passed by reference, and _S_nttp, _S_bind_ptr,
_S_bind_ref for handling constructors accepting nontype_t. The existing
_S_call_storage is changed to a thin wrapper that initializes _Ptrs and forwards
to _S_call_ptrs.

This removed most uses of _Storage::_M_ptr and _Storage::_M_ref,
so these functions were removed, and the _Manager uses were adjusted.

Finally we make function_ref available in freestanding mode, as
move_only_function and copyable_function are currently only available in hosted,
so we define _Manager and _Mo_base only if either __glibcxx_move_only_function
or __glibcxx_copyable_function is defined.
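
To make the description above easier to follow, here is a heavily simplified
sketch of the general technique: a non-owning callable reference that erases
the target to const void* and restores its type in a per-type call function.
It is not the code from this patch; fn_ref, call and apply_twice are
hypothetical names, and function pointers and nontype_t are left out entirely.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

template<typename Sig> class fn_ref;   // hypothetical, not std::function_ref

template<typename R, typename... Args>
class fn_ref<R(Args...)>
{
  // Erased target.  Keeping it as const void* means a pointer to a const
  // callable can be stored without casting away const on the way in.
  const void* obj_;
  // Per-type function that restores the target type and invokes it.
  R (*invoke_)(const void*, Args...);

  template<typename T>
  static R call(const void* p, Args... args)
  {
    // T carries the original const-qualification of the target, so this
    // cast only undoes the erasure performed by the constructor.
    T& target = *const_cast<T*>(static_cast<const T*>(p));
    if constexpr (std::is_void_v<R>)
      target(std::forward<Args>(args)...);
    else
      return target(std::forward<Args>(args)...);
  }

public:
  // Accepts callable objects only; function types are excluded here.
  template<typename F,
           typename T = std::remove_reference_t<F>,
           typename = std::enable_if_t<
             !std::is_same_v<std::remove_cv_t<T>, fn_ref>
             && std::is_object_v<T>
             && std::is_invocable_r_v<R, T&, Args...>>>
  fn_ref(F&& f) noexcept
  : obj_(std::addressof(f)), invoke_(&call<T>)
  { }

  R operator()(Args... args) const
  { return invoke_(obj_, std::forward<Args>(args)...); }
};

// Usage: the parameter refers to the caller's callable and never copies it.
inline int apply_twice(fn_ref<int(int)> f, int x)
{ return f(f(x)); }
```

Because nothing is owned, a fn_ref in this sketch must not outlive the
callable it was constructed from; passing it down the call stack, as in
apply_twice, is the intended use.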

PR libstdc++/119126

libstdc++-v3/ChangeLog:

* doc/doxygen/stdheader.cc: Added funcref_impl.h file.
* include/Makefile.am: Added funcref_impl.h file.
* include/Makefile.in: Added funcref_impl.h file.
* include/bits/funcref_impl.h: New file.
* include/bits/funcwrap.h (_Ptrs::_M_obj): Const-qualify.
(_Storage::_M_ptr, _Storage::_M_ref): Remove.
(__polyfunc::__cast_to): Define.
(_Base_invoker::_S_ptrs, _Base_invoker::_S_nttp)
(_Base_invoker::_S_bind_ptrs, _Base_invoker::_S_bind_ref)
(_Base_invoker::_S_call_ptrs): Define.
(_Base_invoker::_S_call_storage): Forward to _S_call_ptrs.
(_Manager::_S_local, _Manager::_S_ptr): Adjust for _M_obj being
const qualified.
(__polyfunc::_Manager, __polyfunc::_Mo_base): Guard with
__glibcxx_move_only_function || __glibcxx_copyable_function.
(__polyfunc::__skip_first_arg, __polyfunc::__deduce_funcref)
(std::function_ref) [__glibcxx_function_ref]: Define.
* include/bits/utility.h (std::nontype_t, std::nontype)
(__is_nontype_v) [__glibcxx_function_ref]: Define.
* include/bits/version.def: Define function_ref.
* include/bits/version.h: Regenerate.
* include/std/functional: Define __cpp_lib_function_ref.
* src/c++23/std.cc.in (std::nontype_t, std::nontype)
(std::function_ref) [__cpp_lib_function_ref]: Export.
* testsuite/20_util/function_ref/assign.cc: New test.
* testsuite/20_util/function_ref/call.cc: New test.
* testsuite/20_util/function_ref/cons.cc: New test.
* testsuite/20_util/function_ref/cons_neg.cc: New test.
* testsuite/20_util/function_ref/conv.cc: New test.
* testsuite/20_util/function_ref/deduction.cc: New test.
---
This patch handles using nontype with function pointer/reference.
In function_ref we now use the _M_init(ptr) function, which handles
both function and object pointers. The __polyfunc::__cast_to helper is
also expanded to handle function types.

 libstdc++-v3/doc/doxygen/stdheader.cc |   1 +
 libstdc++-v3/include/Makefile.am  |   1 +
 libstdc++-v3/include/Makefile.in  |   1 +
 libstdc++-v3/include/bits/funcref_impl.h  | 198 
 libstdc++-v3/include/bits/funcwrap.h  | 185 +++
 libstdc++-v3/include/bits/utility.h   |  17 ++
 libstdc++-v3/include/bits/version.def |   8 +
 libstdc++-v3/include/bits/version.h   |  10 +
 libstdc++-v3/include/std/functional   |   3 +-
 libstdc++-v3/src/c++23/std.cc.in  |   7 +
 .../testsuite/20_util/function_ref/assign.cc  | 108 +
 .../testsuite/20_util/function_ref/call.cc| 186 +++
 .../testsuite/20_util/function_ref/cons.cc| 218 ++
 .../20_util/function_ref/cons_neg.cc  |  30 +++
 .../testsuite/20_util/function_ref/conv.cc| 152 
 .../20_util/function_ref/deduction.cc | 103 +
 16 files changed, 1181 insertions(+), 47 deletions(-)
 create mode 100644 libstdc++-v3/include/bits/funcref_impl.h
 create mode 100644 libstdc++-v3/testsuite/20_util/function_ref/assign.cc
 create mode 100644 libstdc++-v3/testsuite/20_util/function_ref/call.cc
 create mode 100644 libstdc++-v3/testsuite/20_util/function_ref/cons.cc
 create mode 100644 libstdc++-v3/testsuite/20_util/function_ref/cons_neg.cc
 create mode 100644 libstdc++-v3/testsuite/20_util/function_ref/conv.cc
 create mode 100644 libstdc++-v3/testsuite/20_util/function_ref/deduction.cc

diff --git a/libstdc++-v3/doc/doxygen/stdheader.cc 
b/libstdc++-v3/doc/doxygen/stdheader.cc
index 839bfc81bc0..938b2b04a26 100644
--- a/libstdc++-v3/doc/doxygen/stdheader.cc
+++ b/libstdc++-v3/doc/doxygen/stdheader.cc
@@ -55,6 +55,7 @@ void init_

[PATCH v2] libstdc++: Fix std::format_kind primary template for Clang [PR120190]

2025-05-15 Thread Jonathan Wakely


On 15/05/25 14:35 +0200, Tomasz Kaminski wrote:

Please also add the message to dg-error check in format_kind_neg.cc.
With that LGTM.


Yes, already done locally. Here's what I'm testing now.


commit 3c154b2d95d30580c18aa0fedd9e67200867653f
Author: Jonathan Wakely 
AuthorDate: Thu May 15 11:01:05 2025
Commit: Jonathan Wakely 
CommitDate: Thu May 15 13:21:32 2025

libstdc++: Fix std::format_kind primary template for Clang [PR120190]

Although Clang trunk has been adjusted to handle our std::format_kind
definition (because they need to be able to compile the GCC 15.1.0
release), it's probably better to not rely on something that they might
start diagnosing again in future.

Define the primary template in terms of an immediately invoked function
expression, so that we can put a static_assert(false) in the body.

libstdc++-v3/ChangeLog:

PR libstdc++/120190
* include/std/format (format_kind): Adjust primary template to
not depend on itself.
* testsuite/std/format/ranges/format_kind_neg.cc: Adjust
expected errors. Check more invalid specializations.

diff --git a/libstdc++-v3/include/std/format b/libstdc++-v3/include/std/format
index bfda5895e0c0..b1823db83bc9 100644
--- a/libstdc++-v3/include/std/format
+++ b/libstdc++-v3/include/std/format
@@ -5464,13 +5464,22 @@ namespace __format
 debug_string
   };
 
-  /// @cond undocumented
+  /** @brief A constant determining how a range should be formatted.
+   *
+   * The primary template of `std::format_kind` cannot be instantiated.
+   * There is a partial specialization for input ranges and you can
+   * specialize the variable template for your own cv-unqualified types
+   * that satisfy the `ranges::input_range` concept.
+   *
+   * @since C++23
+   */
   template
-constexpr auto format_kind =
-__primary_template_not_defined(
-  format_kind<_Rg> // you can specialize this for non-const input ranges
-);
+constexpr auto format_kind = []{
+  static_assert(false, "cannot use primary template of 'std::format_kind'");
+  return type_identity<_Rg>{};
+}();
 
+  /// @cond undocumented
   template
 consteval range_format
 __fmt_kind()
diff --git a/libstdc++-v3/testsuite/std/format/ranges/format_kind_neg.cc b/libstdc++-v3/testsuite/std/format/ranges/format_kind_neg.cc
index bf8619d3d276..bd22541b36b8 100644
--- a/libstdc++-v3/testsuite/std/format/ranges/format_kind_neg.cc
+++ b/libstdc++-v3/testsuite/std/format/ranges/format_kind_neg.cc
@@ -5,9 +5,14 @@
 
 #include 
 
-template struct Tester { };
+void test()
+{
+  (void) std::format_kind; // { dg-error "here" }
+  (void) std::format_kind; // { dg-error "here" }
+  (void) std::format_kind; // { dg-error "here" }
+  (void) std::format_kind; // { dg-error "here" }
+  (void) std::format_kind; // { dg-error "here" }
+  (void) std::format_kind; // { dg-error "here" }
+}
 
-Tester> t; // { dg-error "here" }
-
-// { dg-error "use of 'std::format_kind" "" { target *-*-* } 0 }
-// { dg-error "primary_template_not_defined" "" { target *-*-* } 0 }
+// { dg-error "cannot use primary template" "" { target *-*-* } 0 }

Re: [PATCH v2] libstdc++: Fix std::format_kind primary template for Clang [PR120190]

2025-05-15 Thread Tomasz Kaminski

On Thu, May 15, 2025 at 2:41 PM Jonathan Wakely  wrote:

> On 15/05/25 14:35 +0200, Tomasz Kaminski wrote:
> >Please also add the message to dg-error check in format_kind_neg.cc.
> >With that LGTM.
>
> Yes, already done locally. Here's what I'm testing now.
>
Any reason to not put the whole message "cannot use primary template of
'std::format_kind'"?

Re: [PATCH] RISC-V: Add missing insn types for XiangShan Nanhu scheduler model

2025-05-15 Thread Yangyu Chen


We should also back-port this commit to GCC-14.

Thanks,
Yangyu Chen

On 6/9/2024 07:07, Zhao Dingyi wrote:

This patch adds the missing instruction types to the XiangShan-Nanhu
scheduler model.
The current XiangShan-Nanhu model lacks the trap, atomic, fcvt_i2f, and
fcvt_f2i instruction types.

The trap, atomic, and i2f instructions belong to xs_jmp_rs. [1]
The f2i instruction belongs to xs_fmisc_rs.[2]

[1] 
https://github.com/OpenXiangShan/XiangShan/blob/v2.0/src/main/scala/xiangshan/package.scala#L780
[2] 
https://github.com/OpenXiangShan/XiangShan/blob/v2.0/src/main/scala/xiangshan/backend/decode/DecodeUnit.scala#L290

gcc/ChangeLog:

* config/riscv/xiangshan.md: Add atomic, trap, fcvt_i2f, fcvt_f2i.

---
gcc/config/riscv/xiangshan.md | 11 ---
1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/gcc/config/riscv/xiangshan.md b/gcc/config/riscv/xiangshan.md
index 76539d332b8..eb83bbff1be 100644
--- a/gcc/config/riscv/xiangshan.md
+++ b/gcc/config/riscv/xiangshan.md
@@ -70,12 +70,17 @@
(define_insn_reservation "xiangshan_jump" 1
   (and (eq_attr "tune" "xiangshan")
-   (eq_attr "type" "jump,call,auipc,unknown,branch,jalr,ret,sfb_alu"))
+   (eq_attr "type" "jump,call,auipc,unknown,branch,jalr,ret,sfb_alu,trap"))
   "xs_jmp_rs")
(define_insn_reservation "xiangshan_i2f" 3
   (and (eq_attr "tune" "xiangshan")
-   (eq_attr "type" "mtc"))
+   (eq_attr "type" "mtc,fcvt_i2f"))
+  "xs_jmp_rs")
+
+(define_insn_reservation "xiangshan_atomic" 1
+  (and (eq_attr "tune" "xiangshan")
+   (eq_attr "type" "atomic"))
   "xs_jmp_rs")
(define_insn_reservation "xiangshan_mul" 3
@@ -115,7 +120,7 @@
(define_insn_reservation "xiangshan_f2f" 3
   (and (eq_attr "tune" "xiangshan")
-   (eq_attr "type" "fcvt,fmove"))
+   (eq_attr "type" "fcvt,fcvt_f2i,fmove"))
   "xs_fmisc_rs")
(define_insn_reservation "xiangshan_f2i" 3
--
2.43.0

Re: [PATCH v2] libstdc++: Fix std::format_kind primary template for Clang [PR120190]

2025-05-15 Thread Jonathan Wakely

On Thu, 15 May 2025 at 13:43, Tomasz Kaminski  wrote:
>
>
>
> On Thu, May 15, 2025 at 2:41 PM Jonathan Wakely  wrote:
>>
>> On 15/05/25 14:35 +0200, Tomasz Kaminski wrote:
>> >Please also add the message to dg-error check in format_kind_neg.cc.
>> >With that LGTM.
>>
>> Yes, already done locally. Here's what I'm testing now.
>
> Any reason to not put the whole message "cannot use primary template of 
> 'std::format_kind'"?

It doesn't fit in 80 columns, and doesn't seem to make the test any
more or less likely to PASS.

e.g. if we change the static_assert message to not use single quotes,
or to say 'std::format_kind' or some other change, would we want
the test to fail and require changes? Would that improve the value of
the test? IMHO it doesn't, because if it matches some substring of the
static_assert then the test is working correctly, but I'll change it
if you feel strongly.

Re: [PATCH v2] libstdc++: Fix std::format_kind primary template for Clang [PR120190]

2025-05-15 Thread Tomasz Kaminski

On Thu, May 15, 2025 at 2:53 PM Jonathan Wakely  wrote:

> On Thu, 15 May 2025 at 13:43, Tomasz Kaminski  wrote:
> >
> >
> >
> > On Thu, May 15, 2025 at 2:41 PM Jonathan Wakely 
> wrote:
> >>
> >> On 15/05/25 14:35 +0200, Tomasz Kaminski wrote:
> >> >Please also add the message to dg-error check in format_kind_neg.cc.
> >> >With that LGTM.
> >>
> >> Yes, already done locally. Here's what I'm testing now.
> >
> > Any reason to not put the whole message "cannot use primary template of
> 'std::format_kind'"?
>
> It doesn't fit in 80 columns, and doesn't seem to make the test any
> more or less likely to PASS.
>
> e.g. if we change the static_assert message to not use single quotes,
> or to say 'std::format_kind' or some other change, would we want
> the test to fail and require changes? Would that improve the value of
> the test? IMHO it doesn't, because if it matches some substring of the
> static_assert then the test is working correctly, but I'll change it
> if you feel strongly.
>
Not strongly, but "cannot use primary template" sounds like something that
may be part of an unrelated compiler error. And we do not check that this is
produced from static_assert either.
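
The trade-off being discussed can be shown with a small stand-alone sketch.
It assumes that the pattern in a dg-error directive is treated as a regular
expression searched for within the diagnostic text; directive_matches is a
hypothetical helper, and the diagnostic strings used with it are illustrative
rather than copied from compiler output.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: true if `pattern` matches anywhere in `diagnostic`,
// approximating how a test directive pattern is compared with a message.
inline bool
directive_matches(const std::string& pattern, const std::string& diagnostic)
{
  return std::regex_search(diagnostic, std::regex(pattern));
}
```

With a diagnostic such as "error: static assertion failed: cannot use primary
template of 'std::format_kind'", both the short pattern and the full message
match, which is the point about the test being no more or less likely to
PASS.  The short pattern would also match an unrelated message that merely
contains "cannot use primary template", while the longer one would not, which
is the concern raised in the reply.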

Re: [RFC PATCH 0/2] Add target_clones profile option support

2025-05-15 Thread Yangyu Chen




> On 15 May 2025, at 14:01, Alice Carlotti  wrote:
> 
> On Thu, May 08, 2025 at 09:41:08PM +0800, Yangyu Chen wrote:
>> 
>> 
>>> On 8 May 2025, at 18:36, Richard Sandiford  
>>> wrote:
>>> 
>>> Yangyu Chen  writes:
> On 6 May 2025, at 17:49, Alfie Richards  wrote:
> 
> On 06/05/2025 09:36, Yangyu Chen wrote:
>>> On 6 May 2025, at 16:01, Alfie Richards  wrote:
>>> 
>>> Additionally, I think ideally the file can express functions 
>>> disambiguated by file, signature, and namespace.
>>> I imagine we could use similar syntax to gdb supports?
>>> 
>>> For example:
>>> 
>>> ```
>>> foo  |arch=+v
>>> bar(int, char)   |arch=+zba,+zbb
>>> file.C:baz(char) |arch=+zba,+zbb#arch=+v
>>> namespace::qux   |arch=+v
>>> ```
>> Also a great idea. However, I think it's not easy to use to implement
>> it now in GCC. But I would like to accept any further feedback if
>> we have such a simple API in GCC to do so, or if it will be implemented
>> by the community.
>> And something behind this idea is that I'm researching auto-generating
>> target clones attributes for developers. Only accepting the ASM
>> name is enough to implement this.
> 
> Ah that makes sense, apologies I missed that.
> 
> I think accepting the assembler name is good, and solves the overloading 
> ambiguity issue.
> 
> Maybe we can use the pipe '|' instead of ':' in the file format to leave 
> room for both in future?
 
 
 I will consider using the pipe '|' in the next revision. Thanks for
 the advice.
>>> 
>>> How about instead using a json file?  There's already a parser built into
>>> the compiler.
>>> 
>>> That has the advantage of being an established format that generators
>>> can use.  It would also allow other ways of specifying the functions
>>> to be added in future.
>>> 
>> 
>> Thanks for this useful information. I also want to extend the current
>> function multi-versioning to have the ability to set -mtune for
>> different micro-architectures in the future. Using JSON can provide
>> more extensibility in the future.
>> 
>> Thanks,
>> Yangyu Chen
> 
> For AArch64, we already support setting different tuning options for different
> versions (this is why I deliberately allowed the "target" attribute to be used
> on a function that also has the "target_clones" or "target_version" 
> attribute).
> 
> 
> What we don't support is allowing different microarchitectures to affect the
> version selection algorithm when the architecture features remain the same.
> This sounds like a nice idea, but on further consideration I think there are
> several reasons why we shouldn't try adding this.  I think these reasons apply
> at least to the AArch64 and RISC-V ecosystems.
> 
> - Detecting specific microarchitecture properties (e.g. details like cache
>  size, pipeline widths, instruction latencies) would be somewhere between
>  difficult and impossible.  It certainly isn't something that can be packaged
>  into a standard user-friendly attribute.
> 
> - Detecting specific microarchitectures by individual names would probably
>  require writing long lists of known microarchitectures, and wouldn't be able
>  to make a non-default choice for microarchitectures that weren't known about
>  when the software was written.
> 
> - The only good way I can see to systematically group microarchitectures that
>  works for unknown future microarchitectures is to look at what architecture
>  version or features they support.  But then we're back to using architecture
>  versions as the basis for version selection, which we already support.
> 

Indeed, it's not common to have such a wide range of versioning
options. However, if it's possible to implement in the future and
we use the same type of target string to express the tune option
along with the ISA extension, it won't affect the current design.

I'm currently traveling for the RISC-V summit Europe. I'll revise
this patch upon my return to work.

> I'm not saying that users should never use microarchitecture detection to
> select between different implementations of a function.  I just think that
> it's relatively uncommon to need to do this, and that doing it well is much
> more complicated than checking whether architecture features are supported.
> It would be best to just let users write their own custom resolvers and use
> ifuncs (or other platform specific mechanisms) directly.
> 

Indeed.

Thanks,
Yangyu Chen
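
As a rough illustration of the "custom resolver" approach discussed in this
thread, the C++ sketch below picks an implementation once, based on a
microarchitecture class. Everything in it is hypothetical: the Microarch
enum, resolve_sum, demo_total and the two sum_* variants are invented for
illustration, and the hard part (actually detecting the microarchitecture)
is deliberately left out. A real program would more likely hand such a
resolver to GCC's ifunc attribute or a platform equivalent.

```cpp
#include <cstddef>

// Hypothetical microarchitecture classes.  A real resolver would have to
// derive these from platform-specific information, which is not shown here.
enum class Microarch { unknown, narrow_core, wide_core };

using SumFn = long (*)(const int*, std::size_t);

// Baseline implementation: one accumulator.
constexpr long sum_simple(const int* p, std::size_t n) {
  long total = 0;
  for (std::size_t i = 0; i < n; ++i)
    total += p[i];
  return total;
}

// Variant assumed (for illustration only) to suit a wider pipeline:
// two independent accumulators.
constexpr long sum_unrolled(const int* p, std::size_t n) {
  long a = 0, b = 0;
  std::size_t i = 0;
  for (; i + 1 < n; i += 2) {
    a += p[i];
    b += p[i + 1];
  }
  if (i < n)
    a += p[i];
  return a + b;
}

// The "resolver": maps a microarchitecture class to an implementation.
// Unknown microarchitectures fall back to the baseline.
constexpr SumFn resolve_sum(Microarch m) {
  return m == Microarch::wide_core ? sum_unrolled : sum_simple;
}

// Small driver: resolve an implementation and run it on fixed data.
constexpr long demo_total(Microarch m) {
  const int data[] = {1, 2, 3, 4, 5};
  return resolve_sum(m)(data, 5);
}
```

Both variants return the same result; only the choice of implementation
depends on the (assumed) microarchitecture class, which is the part a
standard attribute cannot easily express.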

Re: [PATCH v2] libstdc++: Preserve the argument type in basic_format_args [PR119246]

2025-05-15 Thread Rainer Orth

Hi Jonathan,

>> > this patch broke Solaris bootstrap, both i386-pc-solaris2.11 and
>> > sparc-sun-solaris2.11:
>> >
>> > In file included from
>> > /vol/gcc/src/hg/master/local/libstdc++-v3/src/c++20/format.cc:29:
>> > /var/gcc/regression/master/11.4-gcc/build/i386-pc-solaris2.11/libstdc++-v3/include/format:
>> > In member function ‘typename std::basic_format_context<_Out,
>> > _CharT>::iterator std::formatter<__float128,
>> > _CharT>::format(__float128, std::basic_format_context<_Out, _CharT>&)
>> > const’:
>> > /var/gcc/regression/master/11.4-gcc/build/i386-pc-solaris2.11/libstdc++-v3/include/format:2994:41:
>> > error: ‘__flt128_t’ is not a member of ‘std::__format’; did you mean
>> > ‘__bflt16_t’? [-Wtemplate-body]
>> > 2994 | { return _M_f.format((__format::__flt128_t)__u, __fc); }
>> >  | ^~
>> >  | __bflt16_t
>> >
>> > and one more instance.
>>
>> And on x86_64-darwin too.
>
> Tomasz, should this be:
>
> --- a/libstdc++-v3/include/std/format
> +++ b/libstdc++-v3/include/std/format
> @@ -2973,7 +2973,7 @@ namespace __format
> };
> #endif
>
> -#if defined(__SIZEOF_FLOAT128__) && _GLIBCXX_FORMAT_F128 != 1
> +#if defined(__SIZEOF_FLOAT128__) && _GLIBCXX_FORMAT_F128 > 1
>   // Reuse __formatter_fp::format<__format::__flt128_t, Out> for __float128.
>   // This formatter is not declared if _GLIBCXX_LONG_DOUBLE_ALT128_COMPAT is true,
>   // as __float128 when present is same type as __ieee128, which may be same as
>

with this patch applied, I could link libstdc++.so.  I'll run a full
bootstrap later today.

Thanks.
Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [PATCH v2] libstdc++: Fix std::format_kind primary template for Clang [PR120190]

2025-05-15 Thread Jonathan Wakely

On Thu, 15 May 2025 at 13:56, Tomasz Kaminski  wrote:
>
>
>
> On Thu, May 15, 2025 at 2:53 PM Jonathan Wakely  wrote:
>>
>> On Thu, 15 May 2025 at 13:43, Tomasz Kaminski  wrote:
>> >
>> >
>> >
>> > On Thu, May 15, 2025 at 2:41 PM Jonathan Wakely  wrote:
>> >>
>> >> On 15/05/25 14:35 +0200, Tomasz Kaminski wrote:
>> >> >Please also add the message to dg-error check in format_kind_neg.cc.
>> >> >With that LGTM.
>> >>
>> >> Yes, already done locally. Here's what I'm testing now.
>> >
>> > Any reason to not put the whole message "cannot use primary template of 'std::format_kind'"?
>>
>> It doesn't fit in 80 columns, and doesn't seem to make the test any
>> more or less likely to PASS.
>>
>> e.g. if we change the static_assert message to not use single quotes,
>> or to say 'std::format_kind' or some other change, would we want
>> the test to fail and require changes? Would that improve the value of
>> the test? IMHO it doesn't, because if it matches some substring of the
>> static_assert then the test is working correctly, but I'll change it
>> if you feel strongly.
>>
> Not strongly, but "cannot use primary template" sounds like something that
> may be part of an unrelated compiler error.

OK, I'll use this and push it:

+// { dg-error "cannot use primary template of 'std::format_kind'" "" { target *-*-* } 0 }

> And we do not check that this is produced from static_assert either.

I'm not worried about that. I don't think we need to be too paranoid
here, we control the code and diagnostics from GCC aren't going to
just randomly change without us realising. And it doesn't really
matter whether a "cannot use primary template" diagnostic for
std::format_kind comes from a static assert, or a function using
=delete("reason"), or a custom compiler diagnostic, or something else.
All that matters is that it's ill-formed and the message is fairly
user-friendly.

Re: [PATCH v2] libstdc++: Preserve the argument type in basic_format_args [PR119246]

2025-05-15 Thread Jonathan Wakely

On Thu, 15 May 2025 at 15:02, Rainer Orth  wrote:
>
> Hi Jonathan,
>
> >> > this patch broke Solaris bootstrap, both i386-pc-solaris2.11 and
> >> > sparc-sun-solaris2.11:
> >> >
> >> > In file included from
> >> > /vol/gcc/src/hg/master/local/libstdc++-v3/src/c++20/format.cc:29:
> >> > /var/gcc/regression/master/11.4-gcc/build/i386-pc-solaris2.11/libstdc++-v3/include/format:
> >> > In member function ‘typename std::basic_format_context<_Out,
> >> > _CharT>::iterator std::formatter<__float128,
> >> > _CharT>::format(__float128, std::basic_format_context<_Out, _CharT>&)
> >> > const’:
> >> > /var/gcc/regression/master/11.4-gcc/build/i386-pc-solaris2.11/libstdc++-v3/include/format:2994:41:
> >> > error: ‘__flt128_t’ is not a member of ‘std::__format’; did you mean
> >> > ‘__bflt16_t’? [-Wtemplate-body]
> >> > 2994 | { return _M_f.format((__format::__flt128_t)__u, __fc); }
> >> >  | ^~
> >> >  | __bflt16_t
> >> >
> >> > and one more instance.
> >>
> >> And on x86_64-darwin too.
> >
> > Tomasz, should this be:
> >
> > --- a/libstdc++-v3/include/std/format
> > +++ b/libstdc++-v3/include/std/format
> > @@ -2973,7 +2973,7 @@ namespace __format
> > };
> > #endif
> >
> > -#if defined(__SIZEOF_FLOAT128__) && _GLIBCXX_FORMAT_F128 != 1
> > +#if defined(__SIZEOF_FLOAT128__) && _GLIBCXX_FORMAT_F128 > 1
> >   // Reuse __formatter_fp::format<__format::__flt128_t, Out> for __float128.
> >   // This formatter is not declared if _GLIBCXX_LONG_DOUBLE_ALT128_COMPAT is true,
> >   // as __float128 when present is same type as __ieee128, which may be same as
> >
>
> with this patch applied, I could link libstdc++.so.  I'll run a full
> bootstrap later today.


Good to know, thanks. Tomasz already pushed that change as
r16-647-gd010a39b9e788a
so trunk should be OK now.

Re: [PATCH 3/5] ipa: Dump cgraph_node UID instead of order into ipa-clones dump file

2025-05-15 Thread Jan Hubicka

> Hi,
> 
> starting with GCC 15 the order is not unique for any symtab_nodes but
> m_uid is, I believe we ought to dump the latter in the ipa-clones dump,
> if only so that people can reliably match entries about new clones to
> those about removed nodes (if any).
> 
> Bootstrapped and tested on x86_64-linux. OK for master and gcc 15?
> 
> Thanks,
> 
> Martin
> 
> 
> gcc/ChangeLog:
> 
> 2025-04-23  Martin Jambor  
> 
>   * cgraph.h (symtab_node): Make member function get_uid const.
>   * cgraphclones.cc (dump_callgraph_transformation): Dump m_uid of the
>   call graph nodes instead of order.
>   * cgraph.cc (cgraph_node::remove): Likewise.
OK,
thanks!
Honza

Re: [AUTOFDO][AARCH64] Add support for profilebootstrap

2025-05-15 Thread Andi Kleen

On Wed, May 14, 2025 at 02:46:15AM +, Kugan Vivekanandarajah wrote:
> Adding Eugene and Andi to CC as Sam suggested.
> 
> > On 13 May 2025, at 12:57 am, Richard Sandiford wrote:
> >
> > Kugan Vivekanandarajah  writes:
> >> diff --git a/configure.ac b/configure.ac
> >> index 730db3c1402..701284e38f2 100644
> >> --- a/configure.ac
> >> +++ b/configure.ac
> >> @@ -621,6 +621,14 @@ case "${target}" in
> >> ;;
> >> esac
> >>
> >> +autofdo_target="i386"
> >> +case "${target}" in
> >> +  aarch64-*-*)
> >> +autofdo_target="aarch64"
> >> +;;
> >> +esac
> >> +AC_SUBST(autofdo_target)
> >> +
> >> # Disable libssp for some systems.
> >> case "${target}" in
> >>   avr-*-*)
> >
> > Couldn't we use the existing $cpu_type, rather than adding a new variable?
> > I don't think the two would ever need to diverge.
> 
> I tried doing this but looks to me that $cpu_type is available only in libgcc.
> Am I missing something  or do you want me to replicate that here?

I guess replicating is fine. btw while you are looking at this
profiledbootstrap is currently disabled for various languages like
fortran. It would be good to enable it everywhere.

Andi

[pushed] c++: use normal std list for module tests

2025-05-15 Thread Jason Merrill

Tested x86_64-pc-linux-gnu, applying to trunk.

-- 8< --

The modules tests have used their own version of the code to run tests under
multiple standard versions; they should use the same one as other tests.

I'm not sure about continuing to run modules tests in C++17 mode, but I
guess we might as well for now.

gcc/testsuite/ChangeLog:

* lib/g++-dg.exp (g++-std-flags): Factor out of g++-dg-runtest.
* g++.dg/modules/modules.exp: Use it instead of a copy.
---
 gcc/testsuite/g++.dg/modules/modules.exp |  39 +--
 gcc/testsuite/lib/g++-dg.exp | 129 +--
 2 files changed, 78 insertions(+), 90 deletions(-)

diff --git a/gcc/testsuite/g++.dg/modules/modules.exp b/gcc/testsuite/g++.dg/modules/modules.exp
index 81d0bebdfe5..73b5de1397f 100644
--- a/gcc/testsuite/g++.dg/modules/modules.exp
+++ b/gcc/testsuite/g++.dg/modules/modules.exp
@@ -36,7 +36,6 @@ if ![info exists DEFAULT_CXXFLAGS] then {
 set DEFAULT_CXXFLAGS " -pedantic-errors -Wno-long-long"
 }
 set DEFAULT_MODFLAGS $DEFAULT_CXXFLAGS
-set MOD_STD_LIST { 17 2a 2b }
 
 dg-init
 
@@ -261,44 +260,16 @@ proc srcdir {} {
 return $testdir
 }
 
-# Return set of std options to iterate over, taken from g++-dg.exp & compat.exp
+# Return set of std options to iterate over.
 proc module-init { src } {
-set tmp [dg-get-options $src]
-set option_list {}
-set have_std 0
-set std_prefix "-std=c++"
+set option_list [g++-std-flags $src]
 global extra_tool_flags
 set extra_tool_flags {}
-global MOD_STD_LIST
 
-foreach op $tmp {
-   switch [lindex $op 0] {
-   "dg-options" {
-   set std_prefix "-std=gnu++"
-   if { [string match "*-std=*" [lindex $op 2]] } {
-   set have_std 1
-   }
-   eval lappend extra_tool_flags [lindex $op 2]
-   }
-   "dg-additional-options" {
-   if { [string match "*-std=*" [lindex $op 2]] } {
-   set have_std 1
-   }
-   eval lappend extra_tool_flags [lindex $op 2]
-   }
-   }
-}
-
-if { $have_std } {
-   lappend option_list ""
-} elseif { [string match "*xtreme*" $src] } {
+if { [llength $option_list]
+&& [string match "*xtreme*" $src] } {
# Only run the xtreme tests once.
-   set x [lindex $MOD_STD_LIST end]
-   lappend option_list "${std_prefix}$x"
-} else {
-   foreach x $MOD_STD_LIST {
-   lappend option_list "${std_prefix}$x"
-   }
+   set option_list [lrange [lsort $option_list] end end]
 }
 
 return $option_list
diff --git a/gcc/testsuite/lib/g++-dg.exp b/gcc/testsuite/lib/g++-dg.exp
index 26bda651e01..eb7a9714bc4 100644
--- a/gcc/testsuite/lib/g++-dg.exp
+++ b/gcc/testsuite/lib/g++-dg.exp
@@ -27,6 +27,78 @@ proc g++-dg-prune { system text } {
 return [gcc-dg-prune $system $text]
 }
 
+# Return a list of -std flags to use for TEST.
+proc g++-std-flags { test } {
+# If the testcase specifies a standard, use that one.
+# If not, run it under several standards, allowing GNU extensions
+# if there's a dg-options line.
+if ![search_for $test "-std=*++"] {
+   if [search_for $test "dg-options"] {
+   set std_prefix "-std=gnu++"
+   } else {
+   set std_prefix "-std=c++"
+   }
+
+   set low 0
+   # Some directories expect certain minimums.
+   if { [string match "*/modules/*" $test] } { set low 17 }
+
+   # See g++.exp for the initial value of this list.
+   global gpp_std_list
+   if { [llength $gpp_std_list] > 0 } {
+   set std_list {}
+   foreach ver $gpp_std_list {
+   set cmpver $ver
+   if { $ver == 98 } { set cmpver 03 }
+   if { $ver ni $std_list
+&& $cmpver >= $low } {
+   lappend std_list $ver
+   }
+   }
+   } else {
+   # If the test mentions specific C++ versions, test those.
+   set lines [get_matching_lines $test {\{ dg* c++[0-9][0-9]}]
+   set std_list {}
+   foreach line $lines {
+   regexp {c\+\+([0-9][0-9])} $line -> ver
+   lappend std_list $ver
+
+   if { $ver == 98 } {
+   # Leave low alone.
+   } elseif { [regexp {dg-do|dg-require-effective-target} $line] } {
+   set low $ver
+   }
+   }
+   #verbose "low: $low" 1
+
+   set std_list [lsort -unique $std_list]
+
+   # If fewer than 3 specific versions are mentioned, add more.
+   # The order of this list is significant: first $cxx_default,
+   # then the oldest and newest, then others in rough order of
+   # importance based on test coverage and usage.
+   foreach ver { 17 98 26 11 20 14 23 } {
+   set cmpver $ver
+   if { $ver == 98 } { set cmpver 03 }
+

[PATCH] c++: Further simplify the stdlib inline folding

2025-05-15 Thread Ville Voutilainen

This is a follow-up to the earlier patch that adds std::to_underlying to the
set of stdlib functions that are folded. We don't seem to need to handle
the same-type case specially, the folding will just do the right thing.

Also fix up the mistake of not tweaking the cmdline switch doc in the earlier
patch.

Tested Linux-PPC64, ok for trunk?

Further simplify the stdlib inline folding

gcc/cp/ChangeLog:
* cp-gimplify.cc (cp_fold): Remove the nop handling, let the
folding just fold it all.

gcc/ChangeLog:
* doc/invoke.texi: Add to_underlying to -ffold-simple-inlines.
diff --git a/gcc/cp/cp-gimplify.cc b/gcc/cp/cp-gimplify.cc
index eab55504b05..906ab248d51 100644
--- a/gcc/cp/cp-gimplify.cc
+++ b/gcc/cp/cp-gimplify.cc
@@ -3347,8 +3347,6 @@ cp_fold (tree x, fold_flags_t flags)
 		|| id_equal (DECL_NAME (callee), "as_const")))
 	  {
 	r = CALL_EXPR_ARG (x, 0);
-	if (!same_type_p (TREE_TYPE (x), TREE_TYPE (r)))
-	  r = build_nop (TREE_TYPE (x), r);
 	x = cp_fold (r, flags);
 	break;
 	  }
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index ee7180110e1..83c63ce6ae5 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -3348,7 +3348,8 @@ aliases, the default is @option{-fno-extern-tls-init}.
 @item -ffold-simple-inlines
 @itemx -fno-fold-simple-inlines
 Permit the C++ frontend to fold calls to @code{std::move}, @code{std::forward},
-@code{std::addressof} and @code{std::as_const}.  In contrast to inlining, this
+@code{std::addressof}, @code{std::to_underlying}
+and @code{std::as_const}.  In contrast to inlining, this
 means no debug information will be generated for such calls.  Since these
 functions are rarely interesting to debug, this flag is enabled by default
 unless @option{-fno-inline} is active.
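
For readers unfamiliar with why these calls are safe to fold: each of them
amounts to a cast of its argument. The sketch below is a minimal,
hypothetical stand-in for std::to_underlying (which is C++23); the names
to_underlying_sketch, Color and green_value are invented here and are not
part of the patch.

```cpp
#include <type_traits>

// Hypothetical enum used only for this illustration.
enum class Color : unsigned char { red = 0, green = 1, blue = 2 };

// Stand-in for std::to_underlying: the whole body is a static_cast to the
// enum's underlying type, so a call carries no logic worth stepping through
// in a debugger.
template<class E>
constexpr typename std::underlying_type<E>::type
to_underlying_sketch(E e) noexcept
{
  return static_cast<typename std::underlying_type<E>::type>(e);
}

// A caller; after folding, this is equivalent to returning the constant 1.
constexpr unsigned green_value()
{
  return to_underlying_sketch(Color::green);
}
```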

GCC 14.2.1 Status Report (2025-05-15), branch frozen for release

2025-05-15 Thread Richard Biener

Status
======

The GCC 14 branch is now frozen for the GCC 14.3 release, a release
candidate is being prepared.

All changes to the branch require release manager approval.


Previous Report
===============

https://gcc.gnu.org/pipermail/gcc/2025-April/245990.html

[pushed] c++: -fimplicit-constexpr and modules

2025-05-15 Thread Jason Merrill

Tested x86_64-pc-linux-gnu, applying to trunk.

-- 8< --

Import didn't like differences in DECL_DECLARED_CONSTEXPR_P due to implicit
constexpr, breaking several g++.dg/modules tests; we should handle that
along with DECL_MAYBE_DELETED, which requires streaming the bit.

gcc/cp/ChangeLog:

* module.cc (trees_out::lang_decl_bools): Stream implicit_constexpr.
(trees_in::lang_decl_bools): Likewise.
(trees_in::is_matching_decl): Check it.
---
 gcc/cp/module.cc | 26 ++
 1 file changed, 18 insertions(+), 8 deletions(-)

diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index e7782627a49..4f9c3788380 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -6024,7 +6024,7 @@ trees_out::lang_decl_bools (tree t, bits_out& bits)
   WB (lang->u.fn.has_dependent_explicit_spec_p);
   WB (lang->u.fn.immediate_fn_p);
   WB (lang->u.fn.maybe_deleted);
-  /* We do not stream lang->u.fn.implicit_constexpr.  */
+  WB (lang->u.fn.implicit_constexpr);
   WB (lang->u.fn.escalated_p);
   WB (lang->u.fn.xobj_func);
   goto lds_min;
@@ -6095,7 +6095,7 @@ trees_in::lang_decl_bools (tree t, bits_in& bits)
   RB (lang->u.fn.has_dependent_explicit_spec_p);
   RB (lang->u.fn.immediate_fn_p);
   RB (lang->u.fn.maybe_deleted);
-  /* We do not stream lang->u.fn.implicit_constexpr.  */
+  RB (lang->u.fn.implicit_constexpr);
   RB (lang->u.fn.escalated_p);
   RB (lang->u.fn.xobj_func);
   goto lds_min;
@@ -12193,13 +12193,23 @@ trees_in::is_matching_decl (tree existing, tree decl, bool is_typedef)
 
   /* Similarly if EXISTING has undeduced constexpr, but DECL's
 is already deduced.  */
-  if (DECL_MAYBE_DELETED (e_inner) && !DECL_MAYBE_DELETED (d_inner)
- && DECL_DECLARED_CONSTEXPR_P (d_inner))
-   DECL_DECLARED_CONSTEXPR_P (e_inner) = true;
-  else if (!DECL_MAYBE_DELETED (e_inner) && DECL_MAYBE_DELETED (d_inner))
-   /* Nothing to do.  */;
+  if (DECL_DECLARED_CONSTEXPR_P (e_inner)
+ == DECL_DECLARED_CONSTEXPR_P (d_inner))
+   /* Already matches.  */;
+  else if (DECL_DECLARED_CONSTEXPR_P (d_inner)
+  && (DECL_MAYBE_DELETED (e_inner)
+  || decl_implicit_constexpr_p (d_inner)))
+   /* DECL was deduced, copy to EXISTING.  */
+   {
+ DECL_DECLARED_CONSTEXPR_P (e_inner) = true;
+ if (decl_implicit_constexpr_p (d_inner))
+   DECL_LANG_SPECIFIC (e_inner)->u.fn.implicit_constexpr = true;
+   }
   else if (DECL_DECLARED_CONSTEXPR_P (e_inner)
-  != DECL_DECLARED_CONSTEXPR_P (d_inner))
+  && (DECL_MAYBE_DELETED (d_inner)
+  || decl_implicit_constexpr_p (e_inner)))
+   /* EXISTING was deduced, leave it alone.  */;
+  else
{
   mismatch_msg = G_("conflicting %<constexpr%> for imported "
"declaration %#qD");

base-commit: 2ee1fce9fc35de21b28823ccae433c90a0ce270b
-- 
2.49.0
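
The new matching logic in is_matching_decl can be read as a small decision
table. The sketch below models it as a pure function over three flags per
declaration. It is only an illustration: DeclBits, ConstexprMerge and
merge_constexpr are hypothetical names, and the real code operates on tree
nodes via DECL_DECLARED_CONSTEXPR_P, DECL_MAYBE_DELETED and
decl_implicit_constexpr_p rather than on a struct like this.

```cpp
// Hypothetical flattened view of the three bits the patch looks at.
struct DeclBits {
  bool declared_constexpr;  // models DECL_DECLARED_CONSTEXPR_P
  bool maybe_deleted;       // models DECL_MAYBE_DELETED
  bool implicit_constexpr;  // models decl_implicit_constexpr_p
};

enum class ConstexprMerge {
  already_matches,   // nothing to do
  copy_to_existing,  // DECL was deduced, copy constexpr to EXISTING
  keep_existing,     // EXISTING was deduced, leave it alone
  mismatch           // genuine conflict, diagnose
};

// Mirrors the order of the if/else-if chain in the patch.
constexpr ConstexprMerge merge_constexpr(DeclBits existing, DeclBits decl)
{
  if (existing.declared_constexpr == decl.declared_constexpr)
    return ConstexprMerge::already_matches;
  if (decl.declared_constexpr
      && (existing.maybe_deleted || decl.implicit_constexpr))
    return ConstexprMerge::copy_to_existing;
  if (existing.declared_constexpr
      && (decl.maybe_deleted || existing.implicit_constexpr))
    return ConstexprMerge::keep_existing;
  return ConstexprMerge::mismatch;
}
```

In this model, a difference that comes purely from -fimplicit-constexpr on
either side is tolerated, while an explicit constexpr on one side only is
still reported as a conflict.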

[pushed] c++: one more PR99599 tweak

2025-05-15 Thread Jason Merrill

Tested x86_64-pc-linux-gnu, applying to trunk.

-- 8< --

Patrick pointed out that if the parm/arg types aren't complete yet at this
point, it would affect the type_has_converting_constructor and
TYPE_HAS_CONVERSION tests.  I don't have a testcase, but it makes sense for
safety.

PR c++/99599

gcc/cp/ChangeLog:

* pt.cc (conversion_may_instantiate_p): Make sure
classes are complete.
---
 gcc/cp/pt.cc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 0d64a1cfb12..e0857fc26d0 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -23529,13 +23529,13 @@ conversion_may_instantiate_p (tree to, tree from)
 
   /* Converting to a non-aggregate class type will consider its
  user-declared constructors, which might induce instantiation.  */
-  if (CLASS_TYPE_P (to)
+  if (CLASS_TYPE_P (complete_type (to))
   && type_has_converting_constructor (to))
 return true;
 
   /* Similarly, converting from a class type will consider its conversion
  functions.  */
-  if (CLASS_TYPE_P (from)
+  if (CLASS_TYPE_P (complete_type (from))
   && TYPE_HAS_CONVERSION (from))
 return true;
 

base-commit: f28ff1e4f1c91f46d80e26dd77917e47cdd41bbe
-- 
2.49.0

Re: [PATCH] c++: unifying specializations of non-primary tmpls [PR120161]

2025-05-15 Thread Jason Merrill


On 5/14/25 4:48 PM, Patrick Palka wrote:

On Wed, 14 May 2025, Patrick Palka wrote:


On Wed, 14 May 2025, Jason Merrill wrote:


On 5/14/25 2:44 PM, Patrick Palka wrote:

On Wed, 14 May 2025, Patrick Palka wrote:


On Wed, 14 May 2025, Jason Merrill wrote:


On 5/12/25 7:53 PM, Patrick Palka wrote:

Bootstrapped and regtested on x86-64-pc-linux-gnu, does this look OK
for trunk/15/14?

-- >8 --

Here unification of P=Wrap::type, A=Wrap::type wrongly
succeeds ever since r14-4112 which made the RECORD_TYPE case of unify
no longer recurse into template arguments for non-primary templates
(since they're a non-deduced context) and so the int/long mismatch that
makes the two types distinct goes unnoticed.

In the case of (comparing specializations of) a non-primary template,
unify should still go on to compare the types directly before returning
success.


Should the PRIMARY_TEMPLATE_P check instead move up to join the
CLASSTYPE_TEMPLATE_INFO check?  try_class_deduction also doesn't seem
applicable to non-primary templates.


I don't think that'd work, for either the CLASSTYPE_TEMPLATE_INFO (parm)
check or the earlier CLASSTYPE_TEMPLATE_INFO (arg) check.

While try_class_deduction directly doesn't apply to non-primary templates,
get_template_base still might, so if we move up the PRIMARY_TEMPLATE_P to
join the C_T_I (parm) check, then we wouldn't try get_template_base anymore
which would break e.g.
  template struct B { };

  template
  struct A {
struct C : B { };
  };

  template void f(B*);

  int main() {
A::C c;
f(&c);
  }
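
The template argument lists in the example above did not survive archiving.
One plausible reading is sketched below; every template argument in it
(B<int>, A<long>, B<T>) is an assumption made for illustration rather than
the author's text, and deduces_from_base is a hypothetical helper added so
the result can be checked. The point it shows is that deduction for f has
to walk from the nested, non-primary class C to its base B<int>.

```cpp
#include <type_traits>

template<class T> struct B { };

template<class T>
struct A {
  // C is a member class of a class template, i.e. a templated class whose
  // template is not a primary template.  Its base is assumed to be B<int>.
  struct C : B<int> { };
};

// T can only be deduced through the derived-to-base step from
// A<long>::C* to B<int>*.
template<class T>
constexpr bool f(B<T>*) { return std::is_same<T, int>::value; }

// Hypothetical helper: true when T was deduced as int via the base class.
inline bool deduces_from_base()
{
  A<long>::C c;
  return f(&c);
}
```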

If we move the PRIMARY_TEMPLATE_P check up to the C_T_I (arg) check, then
that'd mean we still don't check same_type_p on the two types in the
non-primary case, which seems wrong (although it'd fix the PR thanks to the
parm == arg early exit in unify).


FWIW it seems part of the weird/subtle logic here is due to the fact
that when unifying e.g. P=C with A=C, we do it twice, first via
try_class_deduction using a copy of 'targs', and if that succeeds we do
it again with the real 'targs'.  I think the logic could simultaneously
be simplified and made memory efficient if we made it so that if the
trial unification from try_class_deduction succeeds we just use its
'targs' instead of having to repeat the unification.


Hmm, good point, though I don't see what you mean by "a copy", it looks to me
like we do it twice with the real 'targs'.  Seems like we should move
try_class_unification out of the UNIFY_ALLOW_DERIVED block and remove the
unify that your previous patch conditionalized.


By a copy, I mean via the call to copy_template_args from
try_class_unification?  There's currently no way to get at the
arguments that were deduced by try_class_unification because of
that copy.


Ah, of course, I was overlooking that.


Ah, and the function has a long comment with an example about why it
uses an empty (innermost) targ vector rather than a straight copy.  If
that comment is still correct, I guess we won't be able to avoid the
trial unify after all :/ But I noticed that Clang accepts the example in
the comment, whereas GCC rejects.  I wonder who is correct?


This seems to be https://eel.is/c++draft/temp.deduct#type-2 "Type 
deduction is done independently for each P/A pair, and the deduced 
template argument values are then combined."


The comment is talking about a get_template_base case that doesn't apply 
to the initial call to try_class_unification, but I suppose the comment 
about cluttering targs with a partial unification does.


It might be nice to return the successful targs from 
try_class_unification and then merge them with the real targs, but I'm 
not sure that would be a significant speedup.



In any case, shall we go with the original patch for sake of backports?


Yes, thanks.

Jason

Re: [PATCH v2] libstdc++: Fix std::format_kind primary template for Clang [PR120190]

2025-05-15 Thread Tomasz Kaminski

On Thu, May 15, 2025 at 4:07 PM Jonathan Wakely  wrote:

> On Thu, 15 May 2025 at 13:56, Tomasz Kaminski  wrote:
> >
> >
> >
> > On Thu, May 15, 2025 at 2:53 PM Jonathan Wakely wrote:
> >>
> >> On Thu, 15 May 2025 at 13:43, Tomasz Kaminski wrote:
> >> >
> >> >
> >> >
> >> > On Thu, May 15, 2025 at 2:41 PM Jonathan Wakely wrote:
> >> >>
> >> >> On 15/05/25 14:35 +0200, Tomasz Kaminski wrote:
> >> >> >Please also add the message to dg-error check in format_kind_neg.cc.
> >> >> >With that LGTM.
> >> >>
> >> >> Yes, already done locally. Here's what I'm testing now.
> >> >
> >> > Any reason to not put the whole message "cannot use primary template of 'std::format_kind'"?
> >>
> >> It doesn't fit in 80 columns, and doesn't seem to make the test any
> >> more or less likely to PASS.
> >>
> >> e.g. if we change the static_assert message to not use single quotes,
> >> or to say 'std::format_kind' or some other change, would we want
> >> the test to fail and require changes? Would that improve the value of
> >> the test? IMHO it doesn't, because if it matches some substring of the
> >> static_assert then the test is working correctly, but I'll change it
> >> if you feel strongly.
> >>
> > Not strongly, but "cannot use primary template" sounds like something
> > that may be part of an unrelated compiler error.
>
> OK, I'll use this and push it:
>
> +// { dg-error "cannot use primary template of 'std::format_kind'" "" { target *-*-* } 0 }
>
Thank you.

>
> > And we do not check that this is produced from static_assert either.
>
> I'm not worried about that. I don't think we need to be too paranoid
> here, we control the code and diagnostics from GCC aren't going to
> just randomly change without us realising. And it doesn't really
> matter whether a "cannot use primary template" diagnostic for
> std::format_kind comes from a static assert, or a function using
> =delete("reason"), or a custom compiler diagnostic, or something else.
> All that matters is that it's ill-formed and the message is fairly
> user-friendly.
>
To clarify, I do not mean that we need to check that it is produced by a
static assert; a check for "cannot use primary template" combined with a
check that it comes from a static assert would also address my concern
about accidentally matching unrelated compiler error messages.
But checking for a longer substring, including format_kind, is IMHO a
better way to do that.

Re: [PATCH v1] libstdc++: Fix class mandate for extents.

2025-05-15 Thread Luc Grosheintz


Without would make sense to me, because whenever I wrote an
identifier with _ I felt like I was presenting the user with
a name that they shouldn't know about.

A pedantic question: Can I also fix the line below, or do
you prefer to be strict about one semantic change per commit?

On 5/15/25 2:40 PM, Tomasz Kaminski wrote:

No strong preference, but Ville's argument sounds reasonable.

On Thu, May 15, 2025 at 12:25 PM Ville Voutilainen <
ville.voutilai...@gmail.com> wrote:


Mild preference against; use the names from the standard, not the
implementation, in such diagnostics.

On Thu, 15 May 2025 at 13:20, Jonathan Wakely wrote:


On Thu, 15 May 2025 at 11:14, Jonathan Wakely  wrote:


On Wed, 14 May 2025 at 20:18, Luc Grosheintz wrote:


The standard states that the IndexType must be a signed or unsigned
integer. This mandate was implemented using `std::is_integral_v`, which
also includes (among others) char and bool, which are neither signed nor
unsigned integers.

libstdc++-v3/ChangeLog:

 * include/std/mdspan: Implement the mandate for extents as
 signed or unsigned integer and not any integral type.
 * testsuite/23_containers/mdspan/extents/class_mandates_neg.cc: Check
 that extents and extents are invalid.
 * testsuite/23_containers/mdspan/extents/misc.cc: Update
 tests to avoid `char` and `bool` as IndexType.
---
  libstdc++-v3/include/std/mdspan|  3 ++-
  .../23_containers/mdspan/extents/class_mandates_neg.cc | 10 +++---
  .../testsuite/23_containers/mdspan/extents/misc.cc |  8 
  3 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/libstdc++-v3/include/std/mdspan b/libstdc++-v3/include/std/mdspan
index aee96dda7cd..22509d9c8f4 100644
--- a/libstdc++-v3/include/std/mdspan
+++ b/libstdc++-v3/include/std/mdspan
@@ -163,7 +163,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
template
  class extents
  {
-  static_assert(is_integral_v<_IndexType>, "_IndexType must be integral.");
+  static_assert(__is_standard_integer<_IndexType>::value,
+   "_IndexType must be a signed or unsigned integer.");


GCC's diagnostics never end with a full stop (aka period), and we
follow that convention for our static assertions.

So I'll remove the '.' at the end of the string literal, and then push
this to trunk.



static_assert(
   (__mdspan::__valid_static_extent<_Extents, _IndexType> && ...),
   "Extents must either be dynamic or representable as _IndexType");

I've just noticed that this static_assert refers to "Extents" without
the leading underscore, but "_IndexType" with a leading underscore.

I think it's OK to omit the leading underscore, it might be a bit more
user-friendly and I don't think anybody will be confused by the fact
it's not identical to the real template parameter. But we should
either do it consistently for _Extents and _IndexType or for neither
of them.

Anybody want to argue for or against underscores?
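
To make the distinction in the commit message concrete, here is a minimal
sketch of a trait that accepts signed and unsigned integer types but rejects
bool and the character types, which std::is_integral accepts. The names
sketch::is_none_of and sketch::is_signed_or_unsigned_integer are
hypothetical; this is not libstdc++'s __is_standard_integer, and unlike a
strict reading of the standard it would also accept any extended integer
types for which std::is_integral is true.

```cpp
#include <type_traits>

namespace sketch {

// True when T is different from every type in the list.
template<class T, class... Excluded>
struct is_none_of : std::true_type { };

template<class T, class First, class... Rest>
struct is_none_of<T, First, Rest...>
  : std::integral_constant<bool, !std::is_same<T, First>::value
                                   && is_none_of<T, Rest...>::value> { };

// Integral, but not bool and not a character type.
// signed char and unsigned char are integer types and stay accepted.
template<class T>
struct is_signed_or_unsigned_integer
  : std::integral_constant<bool,
      std::is_integral<T>::value
      && is_none_of<typename std::remove_cv<T>::type,
                    bool, char, wchar_t, char16_t, char32_t
#ifdef __cpp_char8_t
                    , char8_t
#endif
                    >::value> { };

} // namespace sketch
```

With a trait of this shape, an index type such as int or unsigned long
passes the check, while char and bool are rejected even though
std::is_integral reports true for both.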

Re: [PATCH v1] libstdc++: Fix class mandate for extents.

2025-05-15 Thread Jonathan Wakely

On Thu, 15 May 2025 at 16:11, Luc Grosheintz  wrote:
>
> Without would make sense to me, because whenever I wrote an
> identifier with _ I felt like I was presenting the user with
> a name that they shouldn't know about.
>
> A pedantic question: Can I also fix the line below, or do
> you prefer to be strict about one semantic change per commit?

We don't need to be that strict. Don't worry, I've already changed it
locally and will push it for you.


>
> On 5/15/25 2:40 PM, Tomasz Kaminski wrote:
> > No strong preference, but Ville's argument sounds reasonable.
> >
> > On Thu, May 15, 2025 at 12:25 PM Ville Voutilainen <
> > ville.voutilai...@gmail.com> wrote:
> >
> >> Mild preference against; use the names from the standard, not the
> >> implementation, in such diagnostics.
> >>
> >> On Thu, 15 May 2025 at 13:20, Jonathan Wakely wrote:
> >>
> >>> On Thu, 15 May 2025 at 11:14, Jonathan Wakely  wrote:
> 
>  On Wed, 14 May 2025 at 20:18, Luc Grosheintz 
> >>> wrote:
> >
> > The standard states that the IndexType must be a signed or unsigned
> > integer. This mandate was implemented using `std::is_integral_v`, which
> > also includes (among others) char and bool, which are neither signed nor
> > unsigned integers.
> >
> > libstdc++-v3/ChangeLog:
> >
> >  * include/std/mdspan: Implement the mandate for extents as
> >  signed or unsigned integer and not any integral type.
> >  * testsuite/23_containers/mdspan/extents/class_mandates_neg.cc: Check
> >  that extents and extents are invalid.
> >  * testsuite/23_containers/mdspan/extents/misc.cc: Update
> >  tests to avoid `char` and `bool` as IndexType.
> > ---
> >   libstdc++-v3/include/std/mdspan|  3 ++-
> >   .../23_containers/mdspan/extents/class_mandates_neg.cc | 10
> >>> +++---
> >   .../testsuite/23_containers/mdspan/extents/misc.cc |  8 
> >   3 files changed, 13 insertions(+), 8 deletions(-)
> >
> > diff --git a/libstdc++-v3/include/std/mdspan
> >>> b/libstdc++-v3/include/std/mdspan
> > index aee96dda7cd..22509d9c8f4 100644
> > --- a/libstdc++-v3/include/std/mdspan
> > +++ b/libstdc++-v3/include/std/mdspan
> > @@ -163,7 +163,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> > template
> >   class extents
> >   {
> > -  static_assert(is_integral_v<_IndexType>, "_IndexType must be
> >>> integral.");
> > +  static_assert(__is_standard_integer<_IndexType>::value,
> > +   "_IndexType must be a signed or unsigned
> >>> integer.");
> 
>  GCC's diagnostics never end with a full stop (aka period), and we
>  follow that convention for our static assertions.
> 
>  So I'll remove the '.' at the end of the string literal, and then push
>  this to trunk.
> 
> 
> > static_assert(
> >(__mdspan::__valid_static_extent<_Extents, _IndexType> &&
> >>> ...),
> >"Extents must either be dynamic or representable as
> >>> _IndexType");
> >>>
> >>> I've just noticed that this static_assert refers to "Extents" without
> >>> the leading underscore, but "_IndexType" with a leading underscore.
> >>>
> >>> I think it's OK to omit the leading underscore, it might be a bit more
> >>> user-friendly and I don't think anybody will be confused by the fact
> >>> it's not identical to the real template parameter. But we should
> >>> either do it consistently for _Extents and _IndexType or for neither
> >>> of them.
> >>>
> >>> Anybody want to argue for or against underscores?
> >>>
> >>>
> >
>

Re: [PATCH v1] libstdc++: Fix class mandate for extents.

2025-05-15 Thread Jonathan Wakely

On Thu, 15 May 2025 at 16:12, Jonathan Wakely  wrote:
>
> On Thu, 15 May 2025 at 16:11, Luc Grosheintz  wrote:
> >
> > Without would make sense to me, because whenever I wrote an
> > identifier with _ I felt like I was presenting the user with
> > a name that they shouldn't know about.
> >
> > A pedantic question: Can I also fix the line below, or do
> > you prefer to be strict about one semantic change per commit?
>
> We don't need to be that strict. Don't worry, I've already changed it
> locally and will push it for you.

To be clear, we do prefer separate changes to be separate commits, but
I think "add a static assert and tweak the one on the line below" is
fine to consider as a single change.


>
>
> >
> > On 5/15/25 2:40 PM, Tomasz Kaminski wrote:
> > > No strong preference, but Ville's argument sounds reasonable.
> > >
> > > On Thu, May 15, 2025 at 12:25 PM Ville Voutilainen <
> > > ville.voutilai...@gmail.com> wrote:
> > >
> > >> Mild preference against; use the names from the standard, not the
> > >> implementation, in such diagnostics.
> > >>
> > >> On Thu, 15 May 2025 at 13:20, Jonathan Wakely wrote:
> > >>
> > >>> On Thu, 15 May 2025 at 11:14, Jonathan Wakely  
> > >>> wrote:
> > 
> >  On Wed, 14 May 2025 at 20:18, Luc Grosheintz 
> > >>> wrote:
> > >
> > > > The standard states that the IndexType must be a signed or unsigned
> > > > integer. This mandate was implemented using `std::is_integral_v`,
> > > > which also includes (among others) char and bool, which are neither
> > > > signed nor unsigned integers.
> > >
> > > libstdc++-v3/ChangeLog:
> > >
> > >  * include/std/mdspan: Implement the mandate for extents as
> > > >  signed or unsigned integer and not any integral type.
> > >  *
> > >>> testsuite/23_containers/mdspan/extents/class_mandates_neg.cc: Check
> > >  that extents and extents are invalid.
> > >  * testsuite/23_containers/mdspan/extents/misc.cc: Update
> > >  tests to avoid `char` and `bool` as IndexType.
> > > ---
> > >   libstdc++-v3/include/std/mdspan|  3 ++-
> > >   .../23_containers/mdspan/extents/class_mandates_neg.cc | 10
> > >>> +++---
> > >   .../testsuite/23_containers/mdspan/extents/misc.cc |  8 
> > >   3 files changed, 13 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/libstdc++-v3/include/std/mdspan
> > >>> b/libstdc++-v3/include/std/mdspan
> > > index aee96dda7cd..22509d9c8f4 100644
> > > --- a/libstdc++-v3/include/std/mdspan
> > > +++ b/libstdc++-v3/include/std/mdspan
> > > @@ -163,7 +163,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> > > template
> > >   class extents
> > >   {
> > > -  static_assert(is_integral_v<_IndexType>, "_IndexType must be
> > >>> integral.");
> > > +  static_assert(__is_standard_integer<_IndexType>::value,
> > > +   "_IndexType must be a signed or unsigned
> > >>> integer.");
> > 
> >  GCC's diagnostics never end with a full stop (aka period), and we
> >  follow that convention for our static assertions.
> > 
> >  So I'll remove the '.' at the end of the string literal, and then push
> >  this to trunk.
> > 
> > 
> > > static_assert(
> > >(__mdspan::__valid_static_extent<_Extents, _IndexType> &&
> > >>> ...),
> > >"Extents must either be dynamic or representable as
> > >>> _IndexType");
> > >>>
> > >>> I've just noticed that this static_assert refers to "Extents" without
> > >>> the leading underscore, but "_IndexType" with a leading underscore.
> > >>>
> > >>> I think it's OK to omit the leading underscore, it might be a bit more
> > >>> user-friendly and I don't think anybody will be confused by the fact
> > >>> it's not identical to the real template parameter. But we should
> > >>> either do it consistently for _Extents and _IndexType or for neither
> > >>> of them.
> > >>>
> > >>> Anybody want to argue for or against underscores?
> > >>>
> > >>>
> > >
> >
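
The distinction the quoted patch draws can be shown with a minimal sketch.
The helper names below are hypothetical and are not libstdc++'s internal
`__is_standard_integer`; the sketch only assumes that "signed or unsigned
integer" excludes bool and the character types, as the commit message says.

```cpp
#include <type_traits>

// Hypothetical helper: true for bool and the character types, which
// std::is_integral accepts but which are not signed or unsigned integers.
template<typename T>
constexpr bool is_bool_or_character()
{
  using U = std::remove_cv_t<T>;
  bool result = std::is_same_v<U, bool> || std::is_same_v<U, char>
                || std::is_same_v<U, wchar_t> || std::is_same_v<U, char16_t>
                || std::is_same_v<U, char32_t>;
#ifdef __cpp_char8_t
  result = result || std::is_same_v<U, char8_t>;
#endif
  return result;
}

// Hypothetical stand-in for the library-internal trait used by the patch.
template<typename T>
constexpr bool is_signed_or_unsigned_integer()
{ return std::is_integral_v<T> && !is_bool_or_character<T>(); }

// The old check accepted these two types ...
static_assert(std::is_integral_v<char> && std::is_integral_v<bool>,
              "is_integral accepts char and bool");
// ... while the stricter check rejects them and keeps real integers.
static_assert(!is_signed_or_unsigned_integer<char>(), "char is rejected");
static_assert(!is_signed_or_unsigned_integer<bool>(), "bool is rejected");
static_assert(is_signed_or_unsigned_integer<int>(), "int is accepted");
static_assert(is_signed_or_unsigned_integer<unsigned long>(),
              "unsigned long is accepted");
```

With a check of this shape, an index type of char or bool fails the
static assertion, which is what the new negative tests look for.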

Re: [PATCH] c++: Further simplify the stdlib inline folding

2025-05-15 Thread Ville Voutilainen

On Thu, 15 May 2025 at 18:19, Jason Merrill  wrote:

> > @@ -3347,8 +3347,6 @@ cp_fold (tree x, fold_flags_t flags)
> >   || id_equal (DECL_NAME (callee), "as_const")))
> > {
> >   r = CALL_EXPR_ARG (x, 0);
> > - if (!same_type_p (TREE_TYPE (x), TREE_TYPE (r)))
> > -   r = build_nop (TREE_TYPE (x), r);
>
> This is removing the conversion entirely; I'm rather surprised it didn't
> break anything.  I thought you were thinking to make the build_nop
> unconditional.

Oops. Yes, that makes more sense. I am not sure how that build_nop
actually works, but it should indeed convert r to the type of x rather
than be removed entirely. Re-doing...

[PATCH 0/9] AArch64: CMPBR support

2025-05-15 Thread Karl Meakin

This patch series adds support for the CMPBR extension. It includes the
new `+cmpbr` option and rules to generate the new instructions when
lowering conditional branches.

Testing done:
`make bootstrap; make check`

Karl Meakin (9):
  AArch64: place branch instruction rules together
  AArch64: reformat branch instruction rules
  AArch64: rename branch instruction rules
  AArch64: add constants for branch displacements
  AArch64: make `far_branch` attribute a boolean
  AArch64: recognize `+cmpbr` option
  AArch64: precommit test for CMPBR instructions
  AArch64: rules for CMPBR instructions
  AArch64: make rules for CBZ/TBZ higher priority

 .../aarch64/aarch64-option-extensions.def |2 +
 gcc/config/aarch64/aarch64-simd.md|2 +-
 gcc/config/aarch64/aarch64-sme.md |2 +-
 gcc/config/aarch64/aarch64.cc |4 +-
 gcc/config/aarch64/aarch64.h  |3 +
 gcc/config/aarch64/aarch64.md |  564 ---
 gcc/config/aarch64/iterators.md   |5 +
 gcc/config/aarch64/predicates.md  |   15 +
 gcc/doc/invoke.texi   |3 +
 gcc/testsuite/gcc.target/aarch64/cmpbr.c  | 1481 +
 gcc/testsuite/lib/target-supports.exp |   14 +-
 11 files changed, 1871 insertions(+), 224 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/cmpbr.c

-- 
2.45.2

[PATCH 5/9] AArch64: make `far_branch` attribute a boolean

2025-05-15 Thread Karl Meakin

The `far_branch` attribute only ever takes the values 0 or 1, so make it
a `no/yes` valued string attribute instead.
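
As a rough illustration, the attribute computation can be modelled in C++.
This is only a sketch: the enum and function names are hypothetical, and the
two limits are the 1MiB values that the attribute expressions in this file
use for B.cond and CBZ/CBNZ.

```cpp
// Hypothetical C++ analogue of the two-valued "far_branch" attribute.
enum class far_branch { no, yes };

// Limits as used in the attribute expressions for B.cond and CBZ/CBNZ.
constexpr long long branch_len_n_1mib = -1048576;
constexpr long long branch_len_p_1mib = 1048572;

// Mirrors (and (ge (minus label pc) N) (lt (minus label pc) P)):
// the branch is near when N <= offset < P, and far otherwise.
constexpr far_branch
classify (long long offset, long long n = branch_len_n_1mib,
          long long p = branch_len_p_1mib)
{
  return (offset >= n && offset < p) ? far_branch::no : far_branch::yes;
}

static_assert (classify (0) == far_branch::no, "short branch");
static_assert (classify (-1048576) == far_branch::no,
               "lower bound is inclusive");
static_assert (classify (1048572) == far_branch::yes,
               "upper bound is exclusive, as the expression is written");
static_assert (classify (-1048580) == far_branch::yes, "too far backwards");
```

A named two-valued result reads better at the use site than a bare 0 or 1,
which is the point of the rename.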

gcc/ChangeLog:

* config/aarch64/aarch64.md (far_branch): Replace 0/1 with
no/yes.
(aarch64_bcond): Handle rename.
(aarch64_cbz1): Likewise.
(*aarch64_tbz1): Likewise.
(@aarch64_tbz): Likewise.
---
 gcc/config/aarch64/aarch64.md | 22 ++
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index c31ad4fc16e..b61e3e5a72f 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -554,16 +554,14 @@ (define_attr "mode_enabled" "false,true"
 ;; Attribute that controls whether an alternative is enabled or not.
 (define_attr "enabled" "no,yes"
   (if_then_else (and (eq_attr "arch_enabled" "yes")
 (eq_attr "mode_enabled" "true"))
(const_string "yes")
(const_string "no")))
 
 ;; Attribute that specifies whether we are dealing with a branch to a
 ;; label that is far away, i.e. further away than the maximum/minimum
 ;; representable in a signed 21-bits number.
-;; 0 :=: no
-;; 1 :=: yes
-(define_attr "far_branch" "" (const_int 0))
+(define_attr "far_branch" "no,yes" (const_string "no"))
 
 ;; Attribute that specifies whether the alternative uses MOVPRFX.
 (define_attr "movprfx" "no,yes" (const_string "no"))
@@ -759,45 +757,45 @@ (define_expand "cbranchcc4"
 ;; Emit `B`, assuming that the condition is already in the CC register.
 (define_insn "aarch64_bcond"
   [(set (pc) (if_then_else (match_operator 0 "aarch64_comparison_operator"
[(match_operand 1 "cc_register")
 (const_int 0)])
   (label_ref (match_operand 2))
   (pc)))]
   ""
   {
 /* GCC's traditional style has been to use "beq" instead of "b.eq", etc.,
but the "." is required for SVE conditions.  */
 bool use_dot_p = GET_MODE (operands[1]) == CC_NZCmode;
 if (get_attr_length (insn) == 8)
   return aarch64_gen_far_branch (operands, 2, "Lbcond",
 use_dot_p ? "b.%M0\\t" : "b%M0\\t");
 else
   return use_dot_p ? "b.%m0\\t%l2" : "b%m0\\t%l2";
   }
   [(set_attr "type" "branch")
(set (attr "length")
(if_then_else (and (ge (minus (match_dup 2) (pc))
   (const_int BRANCH_LEN_N_1MiB))
   (lt (minus (match_dup 2) (pc))
   (const_int BRANCH_LEN_P_1MiB)))
  (const_int 4)
  (const_int 8)))
(set (attr "far_branch")
(if_then_else (and (ge (minus (match_dup 2) (pc))
   (const_int BRANCH_LEN_N_1MiB))
   (lt (minus (match_dup 2) (pc))
   (const_int BRANCH_LEN_P_1MiB)))
- (const_int 0)
- (const_int 1)))]
+ (const_string "no")
+ (const_string "yes")))]
 )
 
 ;; For a 24-bit immediate CST we can optimize the compare for equality
 ;; and branch sequence from:
 ;; mov x0, #imm1
 ;; movkx0, #imm2, lsl 16 /* x0 contains CST.  */
 ;; cmp x1, x0
 ;; b .Label
 ;; into the shorter:
 ;; sub x0, x1, #(CST & 0xfff000)
 ;; subsx0, x0, #(CST & 0x000fff)
 ;; b .Label
@@ -829,77 +827,77 @@ (define_insn_and_split "*aarch64_bcond_wide_imm"
 ;; For an EQ/NE comparison against zero, emit `CBZ`/`CBNZ`
 (define_insn "aarch64_cbz1"
   [(set (pc) (if_then_else (EQL (match_operand:GPI 0 "register_operand" "r")
(const_int 0))
   (label_ref (match_operand 1))
   (pc)))]
   "!aarch64_track_speculation"
   {
 if (get_attr_length (insn) == 8)
   return aarch64_gen_far_branch (operands, 1, "Lcb", "\\t%0, ");
 else
   return "\\t%0, %l1";
   }
   [(set_attr "type" "branch")
(set (attr "length")
(if_then_else (and (ge (minus (match_dup 1) (pc))
   (const_int BRANCH_LEN_N_1MiB))
   (lt (minus (match_dup 1) (pc))
   (const_int BRANCH_LEN_P_1MiB)))
  (const_int 4)
  (const_int 8)))
(set (attr "far_branch")
(if_then_else (and (ge (minus (match_dup 2) (pc))
   (const_int BRANCH_LEN_N_1MiB))
   (lt (minus (match_dup 2) (pc))
   (const_int BRANCH_LEN_P_1MiB)))
- (const_int 0)
- (const_int 1)))]
+ (const_string "no")
+ (const_string "yes")))]
 )
 
 ;; For an LT/GE comparison against zero, emit `TBZ`/`TBNZ`
 (define_insn "*aarch64_tbz1"
   [(set (pc) (if_then_else (LTGE (match_operand:A

[PATCH 1/9] AArch64: place branch instruction rules together

2025-05-15 Thread Karl Meakin

The rules for conditional branches were spread throughout `aarch64.md`.
Group them together so it is easier to understand how `cbranch4`
is lowered to RTL.

gcc/ChangeLog:

* config/aarch64/aarch64.md (condjump): Move.
(*compare_condjump): Likewise.
(aarch64_cb1): Likewise.
(*cb1): Likewise.
(tbranch_3): Likewise.
(@aarch64_tb): Likewise.
---
 gcc/config/aarch64/aarch64.md | 387 ++
 1 file changed, 201 insertions(+), 186 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 6dbc9faf713..874df262781 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -674,6 +674,10 @@ (define_insn "aarch64_write_sysregti"
  "msrr\t%x0, %x1, %H1"
 )
 
+;; ---
+;; Unconditional jumps
+;; ---
+
 (define_insn "indirect_jump"
   [(set (pc) (match_operand:DI 0 "register_operand" "r"))]
   ""
@@ -692,6 +696,12 @@ (define_insn "jump"
   [(set_attr "type" "branch")]
 )
 
+
+
+;; ---
+;; Conditional jumps
+;; ---
+
 (define_expand "cbranch4"
   [(set (pc) (if_then_else (match_operator 0 "aarch64_comparison_operator"
[(match_operand:GPI 1 "register_operand")
@@ -731,6 +741,197 @@ (define_expand "cbranchcc4"
   ""
   "")
 
+(define_insn "condjump"
+  [(set (pc) (if_then_else (match_operator 0 "aarch64_comparison_operator"
+   [(match_operand 1 "cc_register" "") (const_int 0)])
+  (label_ref (match_operand 2 "" ""))
+  (pc)))]
+  ""
+  {
+/* GCC's traditional style has been to use "beq" instead of "b.eq", etc.,
+   but the "." is required for SVE conditions.  */
+bool use_dot_p = GET_MODE (operands[1]) == CC_NZCmode;
+if (get_attr_length (insn) == 8)
+  return aarch64_gen_far_branch (operands, 2, "Lbcond",
+use_dot_p ? "b.%M0\\t" : "b%M0\\t");
+else
+  return use_dot_p ? "b.%m0\\t%l2" : "b%m0\\t%l2";
+  }
+  [(set_attr "type" "branch")
+   (set (attr "length")
+   (if_then_else (and (ge (minus (match_dup 2) (pc)) (const_int -1048576))
+  (lt (minus (match_dup 2) (pc)) (const_int 1048572)))
+ (const_int 4)
+ (const_int 8)))
+   (set (attr "far_branch")
+   (if_then_else (and (ge (minus (match_dup 2) (pc)) (const_int -1048576))
+  (lt (minus (match_dup 2) (pc)) (const_int 1048572)))
+ (const_int 0)
+ (const_int 1)))]
+)
+
+;; For a 24-bit immediate CST we can optimize the compare for equality
+;; and branch sequence from:
+;; mov x0, #imm1
+;; movkx0, #imm2, lsl 16 /* x0 contains CST.  */
+;; cmp x1, x0
+;; b .Label
+;; into the shorter:
+;; sub x0, x1, #(CST & 0xfff000)
+;; subsx0, x0, #(CST & 0x000fff)
+;; b .Label
+(define_insn_and_split "*compare_condjump"
+  [(set (pc) (if_then_else (EQL
+ (match_operand:GPI 0 "register_operand" "r")
+ (match_operand:GPI 1 "aarch64_imm24" "n"))
+  (label_ref:P (match_operand 2 "" ""))
+  (pc)))]
+  "!aarch64_move_imm (INTVAL (operands[1]), mode)
+   && !aarch64_plus_operand (operands[1], mode)
+   && !reload_completed"
+  "#"
+  "&& true"
+  [(const_int 0)]
+  {
+HOST_WIDE_INT lo_imm = UINTVAL (operands[1]) & 0xfff;
+HOST_WIDE_INT hi_imm = UINTVAL (operands[1]) & 0xfff000;
+rtx tmp = gen_reg_rtx (mode);
+emit_insn (gen_add3 (tmp, operands[0], GEN_INT (-hi_imm)));
+emit_insn (gen_add3_compare0 (tmp, tmp, GEN_INT (-lo_imm)));
+rtx cc_reg = gen_rtx_REG (CC_NZmode, CC_REGNUM);
+rtx cmp_rtx = gen_rtx_fmt_ee (, mode,
+ cc_reg, const0_rtx);
+emit_jump_insn (gen_condjump (cmp_rtx, cc_reg, operands[2]));
+DONE;
+  }
+)
+
+(define_insn "aarch64_cb1"
+  [(set (pc) (if_then_else (EQL (match_operand:GPI 0 "register_operand" "r")
+   (const_int 0))
+  (label_ref (match_operand 1 "" ""))
+  (pc)))]
+  "!aarch64_track_speculation"
+  {
+if (get_attr_length (insn) == 8)
+  return aarch64_gen_far_branch (operands, 1, "Lcb", "\\t%0, ");
+else
+  return "\\t%0, %l1";
+  }
+  [(set_attr "type" "branch")
+   (set (attr "length")
+   (if_then_else (and (ge (minus (match_dup 1) (pc)) (const_int -1048576))
+  (lt (minus (match_dup 1) (pc)) (const_int 1048572)))
+ (const_int 4)
+ (const_int 8)))
+   (set (attr "far_branch")
+   (if_then_else (and (ge (minu

Re: [PATCH] c++: Further simplify the stdlib inline folding

2025-05-15 Thread Jason Merrill


On 5/15/25 10:36 AM, Ville Voutilainen wrote:

This is a follow-up to the earlier patch that adds std::to_underlying to the
set of stdlib functions that are folded. We don't seem to need to handle
the same-type case specially, the folding will just do the right thing.

Also fix up the mistake of not tweaking the cmdline switch doc in the earlier
patch.

Tested Linux-PPC64, ok for trunk?

Further simplify the stdlib inline folding

gcc/cp/ChangeLog:
 * cp-gimplify.cc (cp_fold): Remove the nop handling, let the
folding just fold it all.

gcc/ChangeLog:
 * doc/invoke.texi: Add to_underlying to -ffold-simple-inlines.



@@ -3347,8 +3347,6 @@ cp_fold (tree x, fold_flags_t flags)
|| id_equal (DECL_NAME (callee), "as_const")))
  {
r = CALL_EXPR_ARG (x, 0);
-   if (!same_type_p (TREE_TYPE (x), TREE_TYPE (r)))
- r = build_nop (TREE_TYPE (x), r);


This is removing the conversion entirely; I'm rather surprised it didn't 
break anything.  I thought you were thinking to make the build_nop 
unconditional.
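
To illustrate why some conversion is needed, here is a minimal sketch. It
assumes nothing about the front end itself; `to_underlying_sketch` is a
hypothetical stand-in for std::to_underlying so that the example also
compiles as C++17.

```cpp
#include <type_traits>
#include <utility>

enum class colour : unsigned char { red, green };

// Hypothetical stand-in for std::to_underlying (C++23).
template<typename E>
constexpr std::underlying_type_t<E>
to_underlying_sketch (E e) noexcept
{ return static_cast<std::underlying_type_t<E>> (e); }

// The call's type differs from its argument's type, so replacing the
// call with the bare argument would change the type of the expression.
static_assert (!std::is_same_v<decltype (to_underlying_sketch (colour::red)),
                               colour>,
               "result is not the enum type");
static_assert (std::is_same_v<decltype (to_underlying_sketch (colour::red)),
                              unsigned char>,
               "result is the underlying type");

// std::as_const changes only the const-qualification of the referent.
static_assert (std::is_same_v<decltype (std::as_const (std::declval<int&> ())),
                              const int&>,
               "as_const adds const");
```

In both cases the folded expression has to keep the type of the original
call, which is what wrapping the argument in a conversion achieves.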


The doc fix is OK.

Jason

[PATCH 8/9] AArch64: rules for CMPBR instructions

2025-05-15 Thread Karl Meakin

Add rules for lowering `cbranch4` to CBB/CBH/CB when
CMPBR extension is enabled.
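
The displacement limits that this patch adds for CBB/CBH/CB follow the same
pattern as the existing ones. A minimal sketch of the arithmetic, assuming
each branch encodes its target as a signed offset counted in 4-byte
instructions; the field widths (26, 19, 14 and 9 bits) come from the
architecture rather than from this patch, and the names are hypothetical:

```cpp
// Byte range reachable by a PC-relative branch whose target is encoded
// as a signed offset of `bits` bits, in units of one 4-byte instruction.
struct branch_range { long long min; long long max; };

constexpr branch_range
range_for_offset_bits (int bits)
{
  return { -(1LL << (bits - 1)) * 4, ((1LL << (bits - 1)) - 1) * 4 };
}

// These reproduce the BRANCH_LEN_* constants in aarch64.md.
static_assert (range_for_offset_bits (26).max == 134217724, "B, BL");
static_assert (range_for_offset_bits (26).min == -134217728, "B, BL");
static_assert (range_for_offset_bits (19).max == 1048572, "B.cond, CBZ, CBNZ");
static_assert (range_for_offset_bits (19).min == -1048576, "B.cond, CBZ, CBNZ");
static_assert (range_for_offset_bits (14).max == 32764, "TBZ, TBNZ");
static_assert (range_for_offset_bits (14).min == -32768, "TBZ, TBNZ");
static_assert (range_for_offset_bits (9).max == 1020, "CBB, CBH, CB");
static_assert (range_for_offset_bits (9).min == -1024, "CBB, CBH, CB");
```

The positive limit is one instruction short of the power of two because the
largest positive signed value is one less than the magnitude of the most
negative one.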

gcc/ChangeLog:

* config/aarch64/aarch64.md (BRANCH_LEN_P_1Kib): New constant.
(BRANCH_LEN_N_1Kib): Likewise.
(cbranch4): Emit CMPBR instructions if possible.
(cbranch4): New expand rule.
(*aarch64_cb): Likewise.
(*aarch64_cb): Likewise.
* config/aarch64/iterators.md (cmpbr_suffix): New mode attr.
* config/aarch64/predicates.md (const_0_to_63_operand): New
predicate.
(aarch64_cb_immediate): Likewise.
(aarch64_cb_operand): Likewise.
(aarch64_cb_short_operand): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/cmpbr.c: Update tests.
---
 gcc/config/aarch64/aarch64.md|  87 +++-
 gcc/config/aarch64/iterators.md  |   5 +
 gcc/config/aarch64/predicates.md |  15 +
 gcc/testsuite/gcc.target/aarch64/cmpbr.c | 598 ---
 4 files changed, 311 insertions(+), 394 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index b61e3e5a72f..0b708f8b2f6 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -697,37 +697,60 @@ (define_insn "jump"
 ;; Maximum PC-relative positive/negative displacements for various branching
 ;; instructions.
 (define_constants
   [
 ;; +/- 128MiB.  Used by B, BL.
 (BRANCH_LEN_P_128MiB  134217724)
 (BRANCH_LEN_N_128MiB -134217728)
 
 ;; +/- 1MiB.  Used by B., CBZ, CBNZ.
 (BRANCH_LEN_P_1MiB  1048572)
 (BRANCH_LEN_N_1MiB -1048576)
 
 ;; +/- 32KiB.  Used by TBZ, TBNZ.
 (BRANCH_LEN_P_32KiB  32764)
 (BRANCH_LEN_N_32KiB -32768)
+
+;; +/- 1KiB.  Used by CBB, CBH, CB.
+(BRANCH_LEN_P_1Kib  1020)
+(BRANCH_LEN_N_1Kib -1024)
   ]
 )
 
 ;; ---
 ;; Conditional jumps
 ;; ---
 
-(define_expand "cbranch4"
+(define_expand "cbranch4"
   [(set (pc) (if_then_else (match_operator 0 "aarch64_comparison_operator"
[(match_operand:GPI 1 "register_operand")
 (match_operand:GPI 2 "aarch64_plus_operand")])
   (label_ref (match_operand 3))
   (pc)))]
   ""
-  "
-  operands[1] = aarch64_gen_compare_reg (GET_CODE (operands[0]), operands[1],
-operands[2]);
-  operands[2] = const0_rtx;
-  "
+  {
+  if (TARGET_CMPBR && aarch64_cb_operand (operands[2], mode))
+{
+  emit_jump_insn (gen_aarch64_cb (operands[0], operands[1],
+   operands[2], operands[3]));
+  DONE;
+}
+  else
+{
+  operands[1] = aarch64_gen_compare_reg (GET_CODE (operands[0]),
+operands[1], operands[2]);
+  operands[2] = const0_rtx;
+}
+  }
+)
+
+(define_expand "cbranch4"
+  [(set (pc) (if_then_else (match_operator 0 "aarch64_comparison_operator"
+   [(match_operand:SHORT 1 "register_operand")
+    (match_operand:SHORT 2 "aarch64_cb_short_operand")])
+  (label_ref (match_operand 3))
+  (pc)))]
+  "TARGET_CMPBR"
+  ""
 )
 
 (define_expand "cbranch4"
@@ -747,13 +770,65 @@ (define_expand "cbranch4"
 (define_expand "cbranchcc4"
   [(set (pc) (if_then_else (match_operator 0 "aarch64_comparison_operator"
[(match_operand 1 "cc_register")
 (match_operand 2 "const0_operand")])
   (label_ref (match_operand 3))
   (pc)))]
   ""
   ""
 )
 
+;; Emit a `CB (register)` or `CB (immediate)` instruction.
+(define_insn "aarch64_cb"
+  [(set (pc) (if_then_else (match_operator 0 "aarch64_comparison_operator"
+   [(match_operand:GPI 1 "register_operand" "r")
+(match_operand:GPI 2 "aarch64_cb_operand" "ri")])
+  (label_ref (match_operand 3))
+  (pc)))]
+  "TARGET_CMPBR"
+  "cb%m0\\t%1, %2, %l3";
+  [(set_attr "type" "branch")
+   (set (attr "length")
+   (if_then_else (and (ge (minus (match_dup 3) (pc))
+  (const_int BRANCH_LEN_N_1Kib))
+  (lt (minus (match_dup 3) (pc))
+  (const_int BRANCH_LEN_P_1Kib)))
+ (const_int 4)
+ (const_int 8)))
+   (set (attr "far_branch")
+   (if_then_else (and (ge (minus (match_dup 3) (pc))
+  (const_int BRANCH_LEN_N_1Kib))
+  (lt (minus (match_dup 3) (pc))
+  (const_int BRANCH_LEN_P_1Kib)))
+ (const_string "no")
+ (const_string "yes")))]
+)
+
+;; Emit a `CBB (register)` or `CBH (register)` instruct

[PATCH 6/9] AArch64: recognize `+cmpbr` option

2025-05-15 Thread Karl Meakin

Add the `+cmpbr` option to enable the FEAT_CMPBR architectural
extension.
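
For orientation, the feature-test macros in aarch64.h rely on token pasting
to turn a short name into a flag test. The sketch below imitates that shape
under stated assumptions: the flag values and the `SKETCH_` names are
hypothetical, and the flags are passed explicitly instead of being read from
global state as the real macro does.

```cpp
#include <cstdint>

// Hypothetical feature bits; the real ones are generated by GCC.
constexpr std::uint64_t SKETCH_FL_SIMD = 1u << 0;
constexpr std::uint64_t SKETCH_FL_CMPBR = 1u << 1;

// Same token-pasting shape as AARCH64_HAVE_ISA, but taking the flag
// word as an argument so that the sketch stays self-contained.
#define SKETCH_HAVE_ISA(FLAGS, X) (bool ((FLAGS) & SKETCH_FL_##X))

// A target macro in the style of its neighbours in aarch64.h.
#define SKETCH_TARGET_CMPBR(FLAGS) SKETCH_HAVE_ISA (FLAGS, CMPBR)

static_assert (SKETCH_TARGET_CMPBR (SKETCH_FL_SIMD | SKETCH_FL_CMPBR),
               "extension enabled");
static_assert (!SKETCH_TARGET_CMPBR (SKETCH_FL_SIMD),
               "extension not enabled");
```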

gcc/ChangeLog:

* config/aarch64/aarch64-option-extensions.def (cmpbr): New
option.
* config/aarch64/aarch64.h (TARGET_CMPBR): New macro.
* doc/invoke.texi (cmpbr): New option.
---
 gcc/config/aarch64/aarch64-option-extensions.def | 2 ++
 gcc/config/aarch64/aarch64.h | 3 +++
 gcc/doc/invoke.texi  | 3 +++
 3 files changed, 8 insertions(+)

diff --git a/gcc/config/aarch64/aarch64-option-extensions.def b/gcc/config/aarch64/aarch64-option-extensions.def
index dbbb021f05a..1c3e69799f5 100644
--- a/gcc/config/aarch64/aarch64-option-extensions.def
+++ b/gcc/config/aarch64/aarch64-option-extensions.def
@@ -249,6 +249,8 @@ AARCH64_OPT_EXTENSION("mops", MOPS, (), (), (), "mops")
 
 AARCH64_OPT_EXTENSION("cssc", CSSC, (), (), (), "cssc")
 
+AARCH64_OPT_EXTENSION("cmpbr", CMPBR, (), (), (), "cmpbr")
+
 AARCH64_OPT_EXTENSION("lse128", LSE128, (LSE), (), (), "lse128")
 
 AARCH64_OPT_EXTENSION("d128", D128, (LSE128), (), (), "d128")
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index e8bd8c73c12..d5c4a42e96d 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -202,326 +202,329 @@ constexpr auto AARCH64_DEFAULT_ISA_MODE ATTRIBUTE_UNUSED
   = AARCH64_ISA_MODE_SM_OFF;
 constexpr auto AARCH64_FL_DEFAULT_ISA_MODE ATTRIBUTE_UNUSED
   = aarch64_feature_flags (AARCH64_DEFAULT_ISA_MODE);
 
 #endif
 
 /* Macros to test ISA flags.
 
There is intentionally no macro for AARCH64_FL_CRYPTO, since this flag bit
is not always set when its constituent features are present.
Check (TARGET_AES && TARGET_SHA2) instead.  */
 
 #define AARCH64_HAVE_ISA(X) (bool (aarch64_isa_flags & AARCH64_FL_##X))
 
 #define AARCH64_ISA_MODE((aarch64_isa_flags & AARCH64_FL_ISA_MODES).val[0])
 
 /* The current function is a normal non-streaming function.  */
 #define TARGET_NON_STREAMING AARCH64_HAVE_ISA (SM_OFF)
 
 /* The current function has a streaming body.  */
 #define TARGET_STREAMING AARCH64_HAVE_ISA (SM_ON)
 
 /* The current function has a streaming-compatible body.  */
 #define TARGET_STREAMING_COMPATIBLE \
   ((aarch64_isa_flags & AARCH64_FL_SM_STATE) == 0)
 
 /* PSTATE.ZA is enabled in the current function body.  */
 #define TARGET_ZA AARCH64_HAVE_ISA (ZA_ON)
 
 /* AdvSIMD is supported in the default configuration, unless disabled by
-mgeneral-regs-only or by the +nosimd extension.  The set of available
instructions is then subdivided into:
 
- the "base" set, available both in SME streaming mode and in
  non-streaming mode
 
- the full set, available only in non-streaming mode.  */
 #define TARGET_BASE_SIMD AARCH64_HAVE_ISA (SIMD)
 #define TARGET_SIMD (TARGET_BASE_SIMD && TARGET_NON_STREAMING)
 #define TARGET_FLOAT AARCH64_HAVE_ISA (FP)
 
 /* AARCH64_FL options necessary for system register implementation.  */
 
 /* Define AARCH64_FL aliases for architectural features which are protected
by -march flags in binutils but which receive no special treatment by GCC.
 
Such flags are inherited from the Binutils definition of system registers
and are mapped to the architecture in which the feature is implemented.  */
 #define AARCH64_FL_RASAARCH64_FL_V8A
 #define AARCH64_FL_LORAARCH64_FL_V8_1A
 #define AARCH64_FL_PANAARCH64_FL_V8_1A
 #define AARCH64_FL_AMUAARCH64_FL_V8_4A
 #define AARCH64_FL_SCXTNUMAARCH64_FL_V8_5A
 #define AARCH64_FL_ID_PFR2AARCH64_FL_V8_5A
 
 /* Armv8.9-A extension feature bits defined in Binutils but absent from GCC,
aliased to their base architecture.  */
 #define AARCH64_FL_AIEAARCH64_FL_V8_9A
 #define AARCH64_FL_DEBUGv8p9  AARCH64_FL_V8_9A
 #define AARCH64_FL_FGT2   AARCH64_FL_V8_9A
 #define AARCH64_FL_ITEAARCH64_FL_V8_9A
 #define AARCH64_FL_PFAR   AARCH64_FL_V8_9A
 #define AARCH64_FL_PMUv3_ICNTRAARCH64_FL_V8_9A
 #define AARCH64_FL_PMUv3_SS   AARCH64_FL_V8_9A
 #define AARCH64_FL_PMUv3p9AARCH64_FL_V8_9A
 #define AARCH64_FL_RASv2  AARCH64_FL_V8_9A
 #define AARCH64_FL_S1PIE  AARCH64_FL_V8_9A
 #define AARCH64_FL_S1POE  AARCH64_FL_V8_9A
 #define AARCH64_FL_S2PIE  AARCH64_FL_V8_9A
 #define AARCH64_FL_S2POE  AARCH64_FL_V8_9A
 #define AARCH64_FL_SCTLR2 AARCH64_FL_V8_9A
 #define AARCH64_FL_SEBEP  AARCH64_FL_V8_9A
 #define AARCH64_FL_SPE_FDSAARCH64_FL_V8_9A
 #define AARCH64_FL_TCR2   AARCH64_FL_V8_9A
 
 #define TARGET_V8R AARCH64_HAVE_ISA (V8R)
 #define TARGET_V9A AARCH64_HAVE_ISA (V9A)
 
 
 /* SHA2 is an optional extension to AdvSIMD.  */
 #define TARGET_SHA2 AARCH64_HAVE_ISA (SHA2)
 
 /* SHA3 is an optional extension to AdvSIMD.  */
 #define TARGET_SHA3 AARCH64_HAVE_ISA (SHA3)
 
 /* AES is an optional extension to AdvSIMD.  */
 #define TARGET_AES AARCH64_HAVE_ISA (AES)
 
 /* SM is an optiona

[PATCH 2/9] AArch64: reformat branch instruction rules

2025-05-15 Thread Karl Meakin

Make the formatting of the RTL templates in the rules for branch
instructions more consistent with each other.

gcc/ChangeLog:

* config/aarch64/aarch64.md (cbranch4): Reformat.
(cbranchcc4): Likewise.
(condjump): Likewise.
(*compare_condjump): Likewise.
(aarch64_cb1): Likewise.
(*cb1): Likewise.
(tbranch_3): Likewise.
(@aarch64_tb): Likewise.
---
 gcc/config/aarch64/aarch64.md | 77 +--
 1 file changed, 38 insertions(+), 39 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 874df262781..05d86595bb1 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -705,229 +705,228 @@ (define_insn "jump"
 (define_expand "cbranch4"
   [(set (pc) (if_then_else (match_operator 0 "aarch64_comparison_operator"
[(match_operand:GPI 1 "register_operand")
 (match_operand:GPI 2 "aarch64_plus_operand")])
-  (label_ref (match_operand 3 "" ""))
+  (label_ref (match_operand 3))
   (pc)))]
   ""
   "
   operands[1] = aarch64_gen_compare_reg (GET_CODE (operands[0]), operands[1],
 operands[2]);
   operands[2] = const0_rtx;
   "
 )
 
 (define_expand "cbranch4"
-  [(set (pc) (if_then_else
-   (match_operator 0 "aarch64_comparison_operator"
-[(match_operand:GPF_F16 1 "register_operand")
- (match_operand:GPF_F16 2 "aarch64_fp_compare_operand")])
-   (label_ref (match_operand 3 "" ""))
-   (pc)))]
+  [(set (pc) (if_then_else (match_operator 0 "aarch64_comparison_operator"
+   [(match_operand:GPF_F16 1 "register_operand")
+    (match_operand:GPF_F16 2 "aarch64_fp_compare_operand")])
+  (label_ref (match_operand 3))
+  (pc)))]
   ""
-  "
+  {
   operands[1] = aarch64_gen_compare_reg (GET_CODE (operands[0]), operands[1],
 operands[2]);
   operands[2] = const0_rtx;
-  "
+  }
 )
 
 (define_expand "cbranchcc4"
-  [(set (pc) (if_then_else
- (match_operator 0 "aarch64_comparison_operator"
-  [(match_operand 1 "cc_register")
-   (match_operand 2 "const0_operand")])
- (label_ref (match_operand 3 "" ""))
- (pc)))]
+  [(set (pc) (if_then_else (match_operator 0 "aarch64_comparison_operator"
+   [(match_operand 1 "cc_register")
+(match_operand 2 "const0_operand")])
+  (label_ref (match_operand 3))
+  (pc)))]
   ""
-  "")
+  ""
+)
 
 (define_insn "condjump"
   [(set (pc) (if_then_else (match_operator 0 "aarch64_comparison_operator"
-   [(match_operand 1 "cc_register" "") (const_int 0)])
-  (label_ref (match_operand 2 "" ""))
+   [(match_operand 1 "cc_register")
+(const_int 0)])
+  (label_ref (match_operand 2))
   (pc)))]
   ""
   {
 /* GCC's traditional style has been to use "beq" instead of "b.eq", etc.,
but the "." is required for SVE conditions.  */
 bool use_dot_p = GET_MODE (operands[1]) == CC_NZCmode;
 if (get_attr_length (insn) == 8)
   return aarch64_gen_far_branch (operands, 2, "Lbcond",
 use_dot_p ? "b.%M0\\t" : "b%M0\\t");
 else
   return use_dot_p ? "b.%m0\\t%l2" : "b%m0\\t%l2";
   }
   [(set_attr "type" "branch")
(set (attr "length")
(if_then_else (and (ge (minus (match_dup 2) (pc)) (const_int -1048576))
   (lt (minus (match_dup 2) (pc)) (const_int 1048572)))
  (const_int 4)
  (const_int 8)))
(set (attr "far_branch")
(if_then_else (and (ge (minus (match_dup 2) (pc)) (const_int -1048576))
   (lt (minus (match_dup 2) (pc)) (const_int 1048572)))
  (const_int 0)
  (const_int 1)))]
 )
 
 ;; For a 24-bit immediate CST we can optimize the compare for equality
 ;; and branch sequence from:
 ;; mov x0, #imm1
 ;; movkx0, #imm2, lsl 16 /* x0 contains CST.  */
 ;; cmp x1, x0
 ;; b .Label
 ;; into the shorter:
 ;; sub x0, x1, #(CST & 0xfff000)
 ;; subsx0, x0, #(CST & 0x000fff)
 ;; b .Label
 (define_insn_and_split "*compare_condjump"
-  [(set (pc) (if_then_else (EQL
- (match_operand:GPI 0 "register_operand" "r")
- (match_operand:GPI 1 "aarch64_imm24" "n"))
-  (label_ref:P (match_operand 2 "" ""))
+  [(set (pc) (if_then_else (EQL (match_operand:GPI 0 "register_operand" "r")
+

[PATCH 9/9] AArch64: make rules for CBZ/TBZ higher priority

2025-05-15 Thread Karl Meakin

Move the rules for CBZ/TBZ to be above the rules for
CBB/CBH/CB. We want them to have higher priority
because they can express larger displacements.
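
The intended priority can be sketched as a small decision function. This is
an illustration only, under stated assumptions: the types and names are
hypothetical, `lt` and `ge` stand for signed comparisons, and speculation
tracking and far-branch fallbacks are ignored.

```cpp
enum class cmp { eq, ne, lt, ge, other };
enum class insn { cbz_cbnz, tbz_tbnz, cb_register, cb_immediate, cmp_bcond };

// Hypothetical description of the right-hand operand of the comparison;
// `value` is ignored when the operand is a register.
struct rhs { bool is_register; long long value; };

constexpr insn
select_branch (cmp code, rhs r, bool have_cmpbr)
{
  const bool is_zero = !r.is_register && r.value == 0;
  // 1) EQ/NE against zero: CBZ/CBNZ.
  if (is_zero && (code == cmp::eq || code == cmp::ne))
    return insn::cbz_cbnz;
  // 2) LT/GE against zero: TBZ/TBNZ on the sign bit.
  if (is_zero && (code == cmp::lt || code == cmp::ge))
    return insn::tbz_tbnz;
  // 3a) With CMPBR, register-register comparisons: CBB/CBH/CB.
  if (have_cmpbr && r.is_register)
    return insn::cb_register;
  // 3b) With CMPBR, immediates in the range 0-63: CB (immediate).
  if (have_cmpbr && r.value >= 0 && r.value <= 63)
    return insn::cb_immediate;
  // 4) Otherwise a CMP followed by a conditional branch.
  return insn::cmp_bcond;
}

static_assert (select_branch (cmp::eq, rhs{false, 0}, true)
               == insn::cbz_cbnz, "zero test wins even with CMPBR");
static_assert (select_branch (cmp::lt, rhs{false, 0}, true)
               == insn::tbz_tbnz, "sign test wins even with CMPBR");
static_assert (select_branch (cmp::other, rhs{true, 0}, true)
               == insn::cb_register, "register comparison");
static_assert (select_branch (cmp::eq, rhs{false, 42}, true)
               == insn::cb_immediate, "small immediate");
static_assert (select_branch (cmp::eq, rhs{false, 64}, true)
               == insn::cmp_bcond, "immediate out of range");
```

Checking the zero cases first is what keeps the longer-range CBZ/TBZ forms
in use when the CMPBR forms would also match.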

gcc/ChangeLog:

* config/aarch64/aarch64.md (aarch64_cbz1): Move
above rules for CBB/CBH/CB.
(*aarch64_tbz1): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/cmpbr.c: Update tests.
---
 gcc/config/aarch64/aarch64.md| 162 ---
 gcc/testsuite/gcc.target/aarch64/cmpbr.c |  32 ++---
 2 files changed, 104 insertions(+), 90 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 0b708f8b2f6..d3514ff1ef9 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -697,27 +697,38 @@ (define_insn "jump"
 ;; Maximum PC-relative positive/negative displacements for various branching
 ;; instructions.
 (define_constants
   [
 ;; +/- 128MiB.  Used by B, BL.
 (BRANCH_LEN_P_128MiB  134217724)
 (BRANCH_LEN_N_128MiB -134217728)
 
 ;; +/- 1MiB.  Used by B., CBZ, CBNZ.
 (BRANCH_LEN_P_1MiB  1048572)
 (BRANCH_LEN_N_1MiB -1048576)
 
 ;; +/- 32KiB.  Used by TBZ, TBNZ.
 (BRANCH_LEN_P_32KiB  32764)
 (BRANCH_LEN_N_32KiB -32768)
 
 ;; +/- 1KiB.  Used by CBB, CBH, CB.
 (BRANCH_LEN_P_1Kib  1020)
 (BRANCH_LEN_N_1Kib -1024)
   ]
 )
 
 ;; ---
 ;; Conditional jumps
+;; The order of the rules below is important.
+;; Higher priority rules are preferred because they can express larger
+;; displacements.
+;; 1) EQ/NE comparisons against zero are handled by CBZ/CBNZ.
+;; 2) LT/GE comparisons against zero are handled by TBZ/TBNZ.
+;; 3) When the CMPBR extension is enabled:
+;;   a) Comparisons between two registers are handled by
+;;  CBB/CBH/CB.
+;;   b) Comparisons between a GP register and an immediate in the range 0-63 are
+;;  handled by CB (immediate).
+;; 4) Otherwise, emit a CMP+B sequence.
 ;; ---
 
 (define_expand "cbranch4"
@@ -770,14 +781,91 @@ (define_expand "cbranch4"
 (define_expand "cbranchcc4"
   [(set (pc) (if_then_else (match_operator 0 "aarch64_comparison_operator"
[(match_operand 1 "cc_register")
 (match_operand 2 "const0_operand")])
   (label_ref (match_operand 3))
   (pc)))]
   ""
   ""
 )
 
+;; For an EQ/NE comparison against zero, emit `CBZ`/`CBNZ`
+(define_insn "aarch64_cbz1"
+  [(set (pc) (if_then_else (EQL (match_operand:GPI 0 "register_operand" "r")
+   (const_int 0))
+  (label_ref (match_operand 1))
+  (pc)))]
+  "!aarch64_track_speculation"
+  {
+if (get_attr_length (insn) == 8)
+  return aarch64_gen_far_branch (operands, 1, "Lcb", "\\t%0, ");
+else
+  return "\\t%0, %l1";
+  }
+  [(set_attr "type" "branch")
+   (set (attr "length")
+   (if_then_else (and (ge (minus (match_dup 1) (pc))
+  (const_int BRANCH_LEN_N_1MiB))
+  (lt (minus (match_dup 1) (pc))
+  (const_int BRANCH_LEN_P_1MiB)))
+ (const_int 4)
+ (const_int 8)))
+   (set (attr "far_branch")
+   (if_then_else (and (ge (minus (match_dup 2) (pc))
+  (const_int BRANCH_LEN_N_1MiB))
+  (lt (minus (match_dup 2) (pc))
+  (const_int BRANCH_LEN_P_1MiB)))
+ (const_string "no")
+ (const_string "yes")))]
+)
+
+;; For an LT/GE comparison against zero, emit `TBZ`/`TBNZ`
+(define_insn "*aarch64_tbz1"
+  [(set (pc) (if_then_else (LTGE (match_operand:ALLI 0 "register_operand" "r")
+(const_int 0))
+  (label_ref (match_operand 1))
+  (pc)))
+   (clobber (reg:CC CC_REGNUM))]
+  "!aarch64_track_speculation"
+  {
+if (get_attr_length (insn) == 8)
+  {
+   if (get_attr_far_branch (insn) == FAR_BRANCH_YES)
+ return aarch64_gen_far_branch (operands, 1, "Ltb",
+"\\t%0, , ");
+   else
+ {
+   char buf[64];
+   uint64_t val = ((uint64_t) 1)
+   << (GET_MODE_SIZE (mode) * BITS_PER_UNIT - 1);
+   sprintf (buf, "tst\t%%0, %" PRId64, val);
+   output_asm_insn (buf, operands);
+   return "\t%l1";
+ }
+  }
+else
+  return "\t%0, , %l1";
+  }
+  [(set_attr "type" "branch")
+   (set (attr "length")
+   (if_then_else (and (ge (minus (match_dup 1) (pc))
+  (const_int BRANCH_LEN_N_32KiB))
+  (lt (minus (match_dup 1) (pc))
+  (const_int BRANCH_LEN_P_32KiB)))
+ (const_int 4)
+

[PATCH 7/9] AArch64: precommit test for CMPBR instructions

2025-05-15 Thread Karl Meakin

Commit the test file `cmpbr.c` before rules for generating the new
instructions are added, so that the changes in codegen are more obvious
in the next commit.
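
For readers unfamiliar with the test's helper macros, this sketch shows what
one invocation expands to. The `COMPARE` macro is the one from the test
file; the two stub functions are an addition so that the sketch links on its
own, since the test only declares them.

```cpp
#include <cstdint>
#include <type_traits>

typedef std::uint8_t u8;
typedef std::uint32_t u32;

// Stubs added for the sketch; the test file only declares these.
int taken () { return 1; }
int not_taken () { return 0; }

#define COMPARE(ty, name, op, rhs)                        \
  int ty##_x0_##name##_##rhs(ty x0, ty x1) {              \
    return (x0 op rhs) ? taken() : not_taken();           \
  }

// Token pasting builds the function name from the type, the comparison
// name and the right-hand side.
COMPARE(u8, eq, ==, x1)   // defines u8_x0_eq_x1, comparing x0 with x1
COMPARE(u32, ult, <, 42)  // defines u32_x0_ult_42, comparing x0 with 42

static_assert (std::is_same_v<decltype (&u8_x0_eq_x1), int (*) (u8, u8)>,
               "register form");
static_assert (std::is_same_v<decltype (&u32_x0_ult_42), int (*) (u32, u32)>,
               "immediate form");
```

Each generated function has a predictable name, which is what lets
check-function-bodies match one expected assembly body per comparison.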

gcc/testsuite/ChangeLog:

* lib/target-supports.exp: Add `cmpbr` to the list of extensions.
* gcc.target/aarch64/cmpbr.c: New test.
---
 gcc/testsuite/gcc.target/aarch64/cmpbr.c | 1659 ++
 gcc/testsuite/lib/target-supports.exp|   14 +-
 2 files changed, 1667 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/cmpbr.c

diff --git a/gcc/testsuite/gcc.target/aarch64/cmpbr.c b/gcc/testsuite/gcc.target/aarch64/cmpbr.c
new file mode 100644
index 000..8534283bc26
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/cmpbr.c
@@ -0,0 +1,1659 @@
+// Test that the instructions added by FEAT_CMPBR are emitted
+// { dg-do compile }
+// { dg-do-if assemble { target aarch64_asm_cmpbr_ok } }
+// { dg-options "-march=armv9.5-a+cmpbr -O2" }
+// { dg-final { check-function-bodies "**" "" "" } }
+
+#include <stdint.h>
+
+typedef uint8_t u8;
+typedef int8_t i8;
+
+typedef uint16_t u16;
+typedef int16_t i16;
+
+typedef uint32_t u32;
+typedef int32_t i32;
+
+typedef uint64_t u64;
+typedef int64_t i64;
+
+int taken();
+int not_taken();
+
+#define COMPARE(ty, name, op, rhs)                                             \
+  int ty##_x0_##name##_##rhs(ty x0, ty x1) {                                   \
+    return (x0 op rhs) ? taken() : not_taken();                                \
+  }
+
+#define COMPARE_ALL(unsigned_ty, signed_ty, rhs)                               \
+  COMPARE(unsigned_ty, eq, ==, rhs);                                           \
+  COMPARE(unsigned_ty, ne, !=, rhs);                                           \
+                                                                               \
+  COMPARE(unsigned_ty, ult, <, rhs);                                           \
+  COMPARE(unsigned_ty, ule, <=, rhs);                                          \
+  COMPARE(unsigned_ty, ugt, >, rhs);                                           \
+  COMPARE(unsigned_ty, uge, >=, rhs);                                          \
+                                                                               \
+  COMPARE(signed_ty, slt, <, rhs);                                             \
+  COMPARE(signed_ty, sle, <=, rhs);                                            \
+  COMPARE(signed_ty, sgt, >, rhs);                                             \
+  COMPARE(signed_ty, sge, >=, rhs);
+
+//  CBB (register) 
+COMPARE_ALL(u8, i8, x1);
+
+//  CBH (register) 
+COMPARE_ALL(u16, i16, x1);
+
+//  CB (register) 
+COMPARE_ALL(u32, i32, x1);
+COMPARE_ALL(u64, i64, x1);
+
+//  CB (immediate) 
+COMPARE_ALL(u32, i32, 42);
+COMPARE_ALL(u64, i64, 42);
+
+//  Special cases 
+// Comparisons against the immediate 0 can be done for all types,
+// because we can use the wzr/xzr register as one of the operands.
+// However, we should prefer to use CBZ/CBNZ or TBZ/TBNZ when possible,
+// because they have larger range.
+COMPARE_ALL(u8, i8, 0);
+COMPARE_ALL(u16, i16, 0);
+COMPARE_ALL(u32, i32, 0);
+COMPARE_ALL(u64, i64, 0);
+
+// CBB and CBH cannot have immediate operands.
+// Instead we have to do a MOV+CB.
+COMPARE_ALL(u8, i8, 42);
+COMPARE_ALL(u16, i16, 42);
+
+// 64 is out of the range for immediate operands (0 to 63).
+// * For 8/16-bit types, use a MOV+CB as above.
+// * For 32/64-bit types, use a CMP+B instead,
+//   because B has a longer range than CB.
+COMPARE_ALL(u8, i8, 64);
+COMPARE_ALL(u16, i16, 64);
+COMPARE_ALL(u32, i32, 64);
+COMPARE_ALL(u64, i64, 64);
+
+// 4098 is out of the range for CMP (0 to 4095, optionally shifted left by 12
+// bits), but it can be materialized in a single MOV.
+COMPARE_ALL(u16, i16, 4098);
+COMPARE_ALL(u32, i32, 4098);
+COMPARE_ALL(u64, i64, 4098);
+
+/*
+** u8_x0_eq_x1:
+** and w1, w1, 255
+** cmp w1, w0, uxtb
+** beq .L4
+** b   not_taken
+** b   taken
+*/
+
+/*
+** u8_x0_ne_x1:
+** and w1, w1, 255
+** cmp w1, w0, uxtb
+** beq .L6
+** b   taken
+** b   not_taken
+*/
+
+/*
+** u8_x0_ult_x1:
+** and w1, w1, 255
+** cmp w1, w0, uxtb
+** bls .L8
+** b   taken
+** b   not_taken
+*/
+
+/*
+** u8_x0_ule_x1:
+** and w1, w1, 255
+** cmp w1, w0, uxtb
+** bcc .L10
+** b   taken
+** b   not_taken
+*/
+
+/*
+** u8_x0_ugt_x1:
+** and w1, w1, 255
+** cmp w1, w0, uxtb
+** bcs .L12
+** b   taken
+** b   not_taken
+*/
+
+/*
+** u8_x0_uge_x1:
+** and w1, w1, 255
+** cmp w1, w0, uxtb
+** bhi .L14
+** b   taken
+** b   not_taken
+*/
+
+/*
+** i8_x0_slt_x1:
+** sxtb w1, w1
+** cmp w1, w0, sxtb
+** ble .L16
+** b   taken
+** b   not_taken
+*/
+
+/*
+** i8_x0_sle_x1:
+**

[PATCH 4/9] AArch64: add constants for branch displacements

2025-05-15 Thread Karl Meakin

Extract the hardcoded values for the minimum PC-relative displacements
into named constants and document them.

gcc/ChangeLog:

* config/aarch64/aarch64.md (BRANCH_LEN_P_128MiB): New constant.
(BRANCH_LEN_N_128MiB): Likewise.
(BRANCH_LEN_P_1MiB): Likewise.
(BRANCH_LEN_N_1MiB): Likewise.
(BRANCH_LEN_P_32KiB): Likewise.
(BRANCH_LEN_N_32KiB): Likewise.
---
 gcc/config/aarch64/aarch64.md | 64 ++-
 1 file changed, 48 insertions(+), 16 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index ba0d1cccdd0..c31ad4fc16e 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -692,12 +692,28 @@ (define_insn "indirect_jump"
 (define_insn "jump"
   [(set (pc) (label_ref (match_operand 0 "" "")))]
   ""
   "b\\t%l0"
   [(set_attr "type" "branch")]
 )
 
+;; Maximum PC-relative positive/negative displacements for various branching
+;; instructions.
+(define_constants
+  [
+;; +/- 128MiB.  Used by B, BL.
+(BRANCH_LEN_P_128MiB  134217724)
+(BRANCH_LEN_N_128MiB -134217728)
+
+;; +/- 1MiB.  Used by B., CBZ, CBNZ.
+(BRANCH_LEN_P_1MiB  1048572)
+(BRANCH_LEN_N_1MiB -1048576)
 
+;; +/- 32KiB.  Used by TBZ, TBNZ.
+(BRANCH_LEN_P_32KiB  32764)
+(BRANCH_LEN_N_32KiB -32768)
+  ]
+)
 
 ;; ---
 ;; Conditional jumps
 ;; ---
@@ -743,41 +759,45 @@ (define_expand "cbranchcc4"
 ;; Emit `B`, assuming that the condition is already in the CC register.
 (define_insn "aarch64_bcond"
   [(set (pc) (if_then_else (match_operator 0 "aarch64_comparison_operator"
[(match_operand 1 "cc_register")
 (const_int 0)])
   (label_ref (match_operand 2))
   (pc)))]
   ""
   {
 /* GCC's traditional style has been to use "beq" instead of "b.eq", etc.,
but the "." is required for SVE conditions.  */
 bool use_dot_p = GET_MODE (operands[1]) == CC_NZCmode;
 if (get_attr_length (insn) == 8)
   return aarch64_gen_far_branch (operands, 2, "Lbcond",
 use_dot_p ? "b.%M0\\t" : "b%M0\\t");
 else
   return use_dot_p ? "b.%m0\\t%l2" : "b%m0\\t%l2";
   }
   [(set_attr "type" "branch")
(set (attr "length")
-   (if_then_else (and (ge (minus (match_dup 2) (pc)) (const_int -1048576))
-  (lt (minus (match_dup 2) (pc)) (const_int 1048572)))
+   (if_then_else (and (ge (minus (match_dup 2) (pc))
+  (const_int BRANCH_LEN_N_1MiB))
+  (lt (minus (match_dup 2) (pc))
+  (const_int BRANCH_LEN_P_1MiB)))
  (const_int 4)
  (const_int 8)))
(set (attr "far_branch")
-   (if_then_else (and (ge (minus (match_dup 2) (pc)) (const_int -1048576))
-  (lt (minus (match_dup 2) (pc)) (const_int 1048572)))
+   (if_then_else (and (ge (minus (match_dup 2) (pc))
+  (const_int BRANCH_LEN_N_1MiB))
+  (lt (minus (match_dup 2) (pc))
+  (const_int BRANCH_LEN_P_1MiB)))
  (const_int 0)
  (const_int 1)))]
 )
 
 ;; For a 24-bit immediate CST we can optimize the compare for equality
 ;; and branch sequence from:
 ;; mov x0, #imm1
 ;; movk x0, #imm2, lsl 16 /* x0 contains CST.  */
 ;; cmp x1, x0
 ;; b .Label
 ;; into the shorter:
 ;; sub x0, x1, #(CST & 0xfff000)
 ;; subs x0, x0, #(CST & 0x000fff)
 ;; b .Label
@@ -809,69 +829,77 @@ (define_insn_and_split "*aarch64_bcond_wide_imm"
 ;; For an EQ/NE comparison against zero, emit `CBZ`/`CBNZ`
 (define_insn "aarch64_cbz1"
   [(set (pc) (if_then_else (EQL (match_operand:GPI 0 "register_operand" "r")
(const_int 0))
   (label_ref (match_operand 1))
   (pc)))]
   "!aarch64_track_speculation"
   {
 if (get_attr_length (insn) == 8)
   return aarch64_gen_far_branch (operands, 1, "Lcb", "\\t%0, ");
 else
   return "\\t%0, %l1";
   }
   [(set_attr "type" "branch")
(set (attr "length")
-   (if_then_else (and (ge (minus (match_dup 1) (pc)) (const_int -1048576))
-  (lt (minus (match_dup 1) (pc)) (const_int 1048572)))
+   (if_then_else (and (ge (minus (match_dup 1) (pc))
+  (const_int BRANCH_LEN_N_1MiB))
+  (lt (minus (match_dup 1) (pc))
+  (const_int BRANCH_LEN_P_1MiB)))
  (const_int 4)
  (const_int 8)))
(set (attr "far_branch")
-   (if_then_else (and (ge (minus (match_dup 2) (pc)) (const_int -1048576))
-

[PATCH v22 0/3] c: Add _Countof and <stdcountof.h>

2025-05-15 Thread Alejandro Colomar

Hi,

Here's the patch set.  This time, feature complete, and fully tested
with no regressions.  I'll send a reply with the test results in a
moment.

v22 changes:

-  Move Link: tags to above the changelog, as Jason requested.
-  Update the tests for -pedantic-errors.  Some tests are now errors
   instead of warnings.  I've had to move some tests about GNU
   extensions to new test files, so now it's more granular.  Some tests
   were removed, since I realized they were redundant while moving to
   smaller files.
-  Add <assert.h> (and NDEBUG) to some test files that were missing it,
   and also the forward declaration of strcmp(3).
-  Fix typos in dejagnu diagnostic comments.

Is this ready to merge now, hopefully?  :-)


Have a lovely night!
Alex

Alejandro Colomar (3):
  c: Add _Countof operator
  c: Add <stdcountof.h>
  c: Add -Wpedantic diagnostic for _Countof

 gcc/Makefile.in   |   1 +
 gcc/c-family/c-common.cc  |  26 
 gcc/c-family/c-common.def |   3 +
 gcc/c-family/c-common.h   |   2 +
 gcc/c/c-decl.cc   |  22 +++-
 gcc/c/c-parser.cc |  63 +++--
 gcc/c/c-tree.h|   4 +
 gcc/c/c-typeck.cc | 115 +++-
 gcc/doc/extend.texi   |  30 +
 gcc/ginclude/stdcountof.h |  31 +
 gcc/testsuite/gcc.dg/countof-compat.c |   8 ++
 gcc/testsuite/gcc.dg/countof-compile.c| 124 ++
 gcc/testsuite/gcc.dg/countof-no-compat.c  |   5 +
 .../gcc.dg/countof-pedantic-errors.c  |   8 ++
 gcc/testsuite/gcc.dg/countof-pedantic.c   |   8 ++
 gcc/testsuite/gcc.dg/countof-stdcountof.c |  25 
 gcc/testsuite/gcc.dg/countof-vla.c|  35 +
 gcc/testsuite/gcc.dg/countof-vmt.c|  21 +++
 gcc/testsuite/gcc.dg/countof-zero-compile.c   |  38 ++
 gcc/testsuite/gcc.dg/countof-zero.c   |  32 +
 gcc/testsuite/gcc.dg/countof.c| 121 +
 21 files changed, 698 insertions(+), 24 deletions(-)
 create mode 100644 gcc/ginclude/stdcountof.h
 create mode 100644 gcc/testsuite/gcc.dg/countof-compat.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-compile.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-no-compat.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-pedantic-errors.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-pedantic.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-stdcountof.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-vla.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-vmt.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-zero-compile.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-zero.c
 create mode 100644 gcc/testsuite/gcc.dg/countof.c

Range-diff against v21:
1:  432081a4747 ! 1:  1c983c3baa7 c: Add _Countof operator
@@ Commit message
and somehow magically return the number of elements of the array,
regardless of it being really a pointer.
 
+Link: 
+Link: 
+Link: 
+Link: 

+Link: 
+Link: 
+Link: 
+Link: 
+Link: 
+Link: 
+Link: 
+Link: 
+
 gcc/ChangeLog:
 
 * doc/extend.texi: Document _Countof operator.
@@ Commit message
 
 * gcc.dg/countof-compile.c
 * gcc.dg/countof-vla.c
+* gcc.dg/countof-vmt.c
+* gcc.dg/countof-zero-compile.c
+* gcc.dg/countof-zero.c
 * gcc.dg/countof.c: Add tests for _Countof operator.
 
-Link: 
-Link: 
-Link: 
-Link: 

-Link: 
-Link: 
-Link: 
-Link:

[PATCH v22 2/3] c: Add <stdcountof.h>

2025-05-15 Thread Alejandro Colomar

gcc/ChangeLog:

* Makefile.in (USER_H): Add <stdcountof.h>.
* ginclude/stdcountof.h: Add countof macro.

gcc/testsuite/ChangeLog:

* gcc.dg/countof-stdcountof.c: Add tests for <stdcountof.h>.
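
The test added by this patch checks the `countof` macro through two-level stringification. As a sketch of why two levels are needed, the block below reuses the `str`/`xstr` macros and the `#define countof _Countof` line from the patch. The two function names are hypothetical and exist only to make the behaviour checkable. Nothing here needs a compiler that implements `_Countof`, because the token is only turned into a string.

```c
#include <string.h>

#define countof _Countof /* same definition as the header in this patch */

#define str(x) #x        /* stringifies its argument without expanding it */
#define xstr(x) str(x)   /* expands the argument first, then stringifies */

/* Returns nonzero: the operand of # is not macro-expanded, so the
   spelling "countof" survives.  */
int
one_level_keeps_macro_name (void)
{
  return strcmp (str (countof), "countof") == 0;
}

/* Returns nonzero: xstr's argument is expanded to _Countof before it
   reaches str, which is what the test in the patch relies on.  */
int
two_levels_expand_macro (void)
{
  return strcmp (xstr (countof), "_Countof") == 0;
}
```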

Signed-off-by: Alejandro Colomar 
---
 gcc/Makefile.in   |  1 +
 gcc/ginclude/stdcountof.h | 31 +++
 gcc/testsuite/gcc.dg/countof-stdcountof.c | 25 ++
 3 files changed, 57 insertions(+)
 create mode 100644 gcc/ginclude/stdcountof.h
 create mode 100644 gcc/testsuite/gcc.dg/countof-stdcountof.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 72d132207c0..fc8a7e532b9 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -481,6 +481,7 @@ USER_H = $(srcdir)/ginclude/float.h \
 $(srcdir)/ginclude/stdalign.h \
 $(srcdir)/ginclude/stdatomic.h \
 $(srcdir)/ginclude/stdckdint.h \
+$(srcdir)/ginclude/stdcountof.h \
 $(EXTRA_HEADERS)
 
 USER_H_INC_NEXT_PRE = @user_headers_inc_next_pre@
diff --git a/gcc/ginclude/stdcountof.h b/gcc/ginclude/stdcountof.h
new file mode 100644
index 000..1d914f40e5d
--- /dev/null
+++ b/gcc/ginclude/stdcountof.h
@@ -0,0 +1,31 @@
+/* Copyright (C) 2025 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+.  */
+
+/* ISO C2Y: 7.21 Array count <stdcountof.h>.  */
+
+#ifndef _STDCOUNTOF_H
+#define _STDCOUNTOF_H
+
+#define countof  _Countof
+
+#endif /* stdcountof.h */
diff --git a/gcc/testsuite/gcc.dg/countof-stdcountof.c b/gcc/testsuite/gcc.dg/countof-stdcountof.c
new file mode 100644
index 000..a7fe4079c69
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/countof-stdcountof.c
@@ -0,0 +1,25 @@
+/* { dg-do run } */
+/* { dg-options "-std=c2y -pedantic-errors" } */
+
+#include <stdcountof.h>
+
+#undef NDEBUG
+#include <assert.h>
+
+extern int strcmp (const char *, const char *);
+
+#ifndef countof
+#error "countof not defined"
+#endif
+
+int a[3];
+int b[countof a];
+
+#define str(x) #x
+#define xstr(x) str(x)
+
+int
+main (void)
+{
+  assert (strcmp (xstr(countof), "_Countof") == 0);
+}
-- 
2.49.0

[PATCH v22 3/3] c: Add -Wpedantic diagnostic for _Countof

2025-05-15 Thread Alejandro Colomar

It has been standardized in C2y.

gcc/c/ChangeLog:

* c-parser.cc (c_parser_sizeof_or_countof_expression):
Add -Wpedantic diagnostic for _Countof in <= C23 mode.

gcc/testsuite/ChangeLog:

* gcc.dg/countof-compat.c
* gcc.dg/countof-no-compat.c
* gcc.dg/countof-pedantic.c
* gcc.dg/countof-pedantic-errors.c:
Test pedantic diagnostics for _Countof.

Signed-off-by: Alejandro Colomar 
---
 gcc/c/c-parser.cc  | 4 
 gcc/testsuite/gcc.dg/countof-compat.c  | 8 
 gcc/testsuite/gcc.dg/countof-no-compat.c   | 5 +
 gcc/testsuite/gcc.dg/countof-pedantic-errors.c | 8 
 gcc/testsuite/gcc.dg/countof-pedantic.c| 8 
 5 files changed, 33 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/countof-compat.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-no-compat.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-pedantic-errors.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-pedantic.c

diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index 87700339394..d2193ad2f34 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -10637,6 +10637,10 @@ c_parser_sizeof_or_countof_expression (c_parser *parser, enum rid rid)
 
   start = c_parser_peek_token (parser)->location;
 
+  if (rid == RID_COUNTOF)
+pedwarn_c23 (start, OPT_Wpedantic,
+"ISO C does not support %qs before C2Y", op_name);
+
   c_parser_consume_token (parser);
   c_inhibit_evaluation_warnings++;
   if (rid == RID_COUNTOF)
diff --git a/gcc/testsuite/gcc.dg/countof-compat.c b/gcc/testsuite/gcc.dg/countof-compat.c
new file mode 100644
index 000..ab5b4ae6219
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/countof-compat.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-std=c2y -pedantic-errors -Wc23-c2y-compat" } */
+
+#include <stdcountof.h>
+
+int a[1];
+int b[countof(a)];
+int c[_Countof(a)];  /* { dg-warning "ISO C does not support" } */
diff --git a/gcc/testsuite/gcc.dg/countof-no-compat.c b/gcc/testsuite/gcc.dg/countof-no-compat.c
new file mode 100644
index 000..4a244cf222f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/countof-no-compat.c
@@ -0,0 +1,5 @@
+/* { dg-do compile } */
+/* { dg-options "-std=c23 -pedantic-errors -Wno-c23-c2y-compat" } */
+
+int a[1];
+int b[_Countof(a)];
diff --git a/gcc/testsuite/gcc.dg/countof-pedantic-errors.c b/gcc/testsuite/gcc.dg/countof-pedantic-errors.c
new file mode 100644
index 000..5d5bedbe1f7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/countof-pedantic-errors.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-std=c23 -pedantic-errors" } */
+
+#include <stdcountof.h>
+
+int a[1];
+int b[countof(a)];
+int c[_Countof(a)];  /* { dg-error "ISO C does not support" } */
diff --git a/gcc/testsuite/gcc.dg/countof-pedantic.c b/gcc/testsuite/gcc.dg/countof-pedantic.c
new file mode 100644
index 000..408dc6f9366
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/countof-pedantic.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-std=c23 -pedantic" } */
+
+#include <stdcountof.h>
+
+int a[1];
+int b[countof(a)];
+int c[_Countof(a)];  /* { dg-warning "ISO C does not support" } */
-- 
2.49.0

[PATCH v22 1/3] c: Add _Countof operator

2025-05-15 Thread Alejandro Colomar

This operator is similar to sizeof but can only be applied to an array,
and returns its number of elements.
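
For comparison, here is a sketch of the classic `sizeof` division idiom that yields the same element count on a true array. `NELEMS`, `table`, `outer_count` and `inner_count` are hypothetical names used only for illustration; this is not how the patch implements the operator. Unlike the operator described above, the macro also compiles when handed a pointer and then silently produces a meaningless value, which is the kind of misuse a dedicated array-only operator can reject.

```c
#include <stddef.h>

/* Hypothetical stand-in for an element-count operator, usable on
   compilers that do not implement _Countof.  Only meaningful when A is
   an array, not a pointer.  */
#define NELEMS(a) (sizeof (a) / sizeof ((a)[0]))

int table[3][5];

/* Number of elements of the outer array: 3 rows.  */
size_t
outer_count (void)
{
  return NELEMS (table);
}

/* Number of elements of one row: 5 ints.  */
size_t
inner_count (void)
{
  return NELEMS (table[0]);
}
```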

FUTURE DIRECTIONS:

-  We should make it work with array parameters to functions,
   and somehow magically return the number of elements of the array,
   regardless of it being really a pointer.

Link: 
Link: 
Link: 
Link: 

Link: 
Link: 
Link: 
Link: 
Link: 
Link: 
Link: 
Link: 

gcc/ChangeLog:

* doc/extend.texi: Document _Countof operator.

gcc/c-family/ChangeLog:

* c-common.h
* c-common.def
* c-common.cc (c_countof_type): Add _Countof operator.

gcc/c/ChangeLog:

* c-tree.h
(c_expr_countof_expr, c_expr_countof_type)
* c-decl.cc
(start_struct, finish_struct)
(start_enum, finish_enum)
* c-parser.cc
(c_parser_sizeof_expression)
(c_parser_countof_expression)
(c_parser_sizeof_or_countof_expression)
(c_parser_unary_expression)
* c-typeck.cc
(build_external_ref)
(record_maybe_used_decl)
(pop_maybe_used)
(is_top_array_vla)
(c_expr_countof_expr, c_expr_countof_type):
Add _Countof operator.

gcc/testsuite/ChangeLog:

* gcc.dg/countof-compile.c
* gcc.dg/countof-vla.c
* gcc.dg/countof-vmt.c
* gcc.dg/countof-zero-compile.c
* gcc.dg/countof-zero.c
* gcc.dg/countof.c: Add tests for _Countof operator.

Suggested-by: Xavier Del Campo Romero 
Co-authored-by: Martin Uecker 
Acked-by: "James K. Lowden" 
Signed-off-by: Alejandro Colomar 
---
 gcc/c-family/c-common.cc|  26 
 gcc/c-family/c-common.def   |   3 +
 gcc/c-family/c-common.h |   2 +
 gcc/c/c-decl.cc |  22 +++-
 gcc/c/c-parser.cc   |  59 +++---
 gcc/c/c-tree.h  |   4 +
 gcc/c/c-typeck.cc   | 115 +-
 gcc/doc/extend.texi |  30 +
 gcc/testsuite/gcc.dg/countof-compile.c  | 124 
 gcc/testsuite/gcc.dg/countof-vla.c  |  35 ++
 gcc/testsuite/gcc.dg/countof-vmt.c  |  21 
 gcc/testsuite/gcc.dg/countof-zero-compile.c |  38 ++
 gcc/testsuite/gcc.dg/countof-zero.c |  32 +
 gcc/testsuite/gcc.dg/countof.c  | 121 +++
 14 files changed, 608 insertions(+), 24 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/countof-compile.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-vla.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-vmt.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-zero-compile.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-zero.c
 create mode 100644 gcc/testsuite/gcc.dg/countof.c

diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
index 587d76461e9..f71cb2652d5 100644
--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -394,6 +394,7 @@ const struct c_common_resword c_common_reswords[] =
 {
   { "_Alignas",RID_ALIGNAS,   D_CONLY },
   { "_Alignof",RID_ALIGNOF,   D_CONLY },
+  { "_Countof",RID_COUNTOF,   D_CONLY },
   { "_Atomic", RID_ATOMIC,D_CONLY },
   { "_BitInt", RID_BITINT,D_CONLY },
   { "_Bool",   RID_BOOL,  D_CONLY },
@@ -4080,6 +4081,31 @@ c_alignof_expr (location_t loc, tree expr)
 
   return fold_convert_loc (loc, size_type_node, t);
 }
+
+/* Implement the _Countof keyword:
+   Return the number of elements of an array.  */
+
+tree
+c_countof_type (location_t loc, tree type)
+{
+  enum tree_code type_code;
+
+  type_code = TREE_CODE (type);
+  if (type_code != ARRAY_TYPE)
+{
+  error_at (loc, "invalid application of %<_Countof%> to type %qT", type);
+  return error_mark_node;
+}
+  if (!COMPLETE_TYPE_P (type))
+{
+  error_at (loc,
+   "invalid application of %<_Countof%> to incomplete type %qT",
+   type);
+  return error_mark_node;
+}
+
+  return array_type_nelts_top (type);
+}
 
 /* Handle C and C++ default attributes.  */
 
diff --git a/gcc/c-family/c-common.def b/gcc/c-family/c-common.def
index cf2228201fa..0bcc4998afe 100644
--- a/gcc/c-family/c-common.de

Re: [PATCH v22 0/3] c: Add _Countof and <stdcountof.h>

2025-05-15 Thread Alejandro Colomar

Here's the test run.  No regressions.

BTW, there are some differences between runs.  I _think_ this is due to
running them on separate days and having run 'make install' in between,
which seems to have made some tests that would normally fail now succeed.
That's unrelated to the feature: in other runs that I've started
immediately one after the other, those differences don't appear.  They
started appearing when I ran 'make install', which I suspect had some
effect on them, even if it shouldn't (because 'make check' is supposed
to not be affected by the system, IMO).  Anyway, here it goes.



alx@debian:/srv/alx/src/gnu/gcc/len$ cat /home/alx/tmp/rt 
alx@debian:/srv/alx/src/gnu/gcc/len$ git log --oneline gnu/master^..len22
c44ef4a2c75 (HEAD -> len, tag: len22, alx/len) c: Add -Wpedantic diagno>
418f81175e7 c: Add <stdcountof.h>
1c983c3baa7 c: Add _Countof operator
90c6ccebd76 (gnu/trunk, gnu/master, gnu/HEAD) RISC-V: Drop riscv_ext_fl>
alx@debian:/srv/alx/src/gnu/gcc/len$ git reset gnu/master --h
HEAD is now at 90c6ccebd76 RISC-V: Drop riscv_ext_flag_table in favor of riscv_ext_info_t data
alx@debian:/srv/alx/src/gnu/gcc/len$ mkdir ../len22
alx@debian:/srv/alx/src/gnu/gcc/len$ cd ../len22
alx@debian:/srv/alx/src/gnu/gcc/len22$ /bin/time ../len/configure --disable-multilib --prefix=/opt/local/gnu/gcc/countof22 |& ts -s | tail -n 3; echo $?
00:00:01 config.status: creating Makefile
00:00:01 1.59user 0.57system 0:01.78elapsed 122%CPU (0avgtext+0avgdata 26812maxresident)k
00:00:01 0inputs+8056outputs (0major+280314minor)pagefaults 0swaps
0
alx@debian:/srv/alx/src/gnu/gcc/len22$ /bin/time make -j12 bootstrap |& ts -s | tail -n 3; echo $?
00:32:17 make[1]: Leaving directory '/srv/alx/src/gnu/gcc/len22'
00:32:17 16769.95user 436.07system 32:16.93elapsed 888%CPU (0avgtext+0avgdata 1491560maxresident)k
00:32:17 0inputs+31383912outputs (856major+112120780minor)pagefaults 0swaps
0
alx@debian:/srv/alx/src/gnu/gcc/len22$ /bin/time make check |& ts -s | tail -n 3; echo $?
07:30:54 make[1]: Leaving directory '/srv/alx/src/gnu/gcc/len22'
07:30:54 23876.26user 3393.05system 7:30:54elapsed 100%CPU (0avgtext+0avgdata 1041556maxresident)k
07:30:54 700720inputs+21152256outputs (2963major+1060028942minor)pagefaults 0swaps
0
0
alx@debian:/srv/alx/src/gnu/gcc/len22$ cd ../len
alx@debian:/srv/alx/src/gnu/gcc/len$ git merge --ff-only len22
Updating 90c6ccebd76..c44ef4a2c75
Fast-forward
 gcc/Makefile.in   |   1 +
 gcc/c-family/c-common.cc  |  26 
 gcc/c-family/c-common.def |   3 +
 gcc/c-family/c-common.h   |   2 +
 gcc/c/c-decl.cc   |  22 +++-
 gcc/c/c-parser.cc |  63 +++--
 gcc/c/c-tree.h|   4 +
 gcc/c/c-typeck.cc | 115 +++-
 gcc/doc/extend.texi   |  30 +
 gcc/ginclude/stdcountof.h |  31 +
 gcc/testsuite/gcc.dg/countof-compat.c |   8 ++
 gcc/testsuite/gcc.dg/countof-compile.c| 124 ++
 gcc/testsuite/gcc.dg/countof-no-compat.c  |   5 +
 .../gcc.dg/countof-pedantic-errors.c  |   8 ++
 gcc/testsuite/gcc.dg/countof-pedantic.c   |   8 ++
 gcc/testsuite/gcc.dg/countof-stdcountof.c |  25 
 gcc/testsuite/gcc.dg/countof-vla.c|  35 +
 gcc/testsuite/gcc.dg/countof-vmt.c|  21 +++
 gcc/testsuite/gcc.dg/countof-zero-compile.c   |  38 ++
 gcc/testsuite/gcc.dg/countof-zero.c   |  32 +
 gcc/testsuite/gcc.dg/countof.c| 121 +
 21 files changed, 698 insertions(+), 24 deletions(-)
 create mode 100644 gcc/ginclude/stdcountof.h
 create mode 100644 gcc/testsuite/gcc.dg/countof-compat.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-compile.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-no-compat.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-pedantic-errors.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-pedantic.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-stdcountof.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-vla.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-vmt.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-zero-compile.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-zero.c
 create mode 100644 gcc/testsuite/gcc.dg/countof.c
alx@debian:/srv/alx/src/gnu/gcc/len$ cd ..
alx@debian:/srv/alx/src/gnu/gcc$ mv len22/ len22_b4
alx@debian:/srv/alx/src/gnu/gcc$ mkdir len22
alx@debian:/srv/alx/src/gnu/gcc$ cd len22
alx@debian:/srv/alx/src/gnu/gcc/len22$ /bin/time ../len/configure --disable-multilib --prefix=/opt/local/gnu/gcc/countof22 |& ts -s | tail -n 3; echo $?
00:00:01 config.status: creating Makefile
00:00:01 1.51user 0.56system 0:01.71elapsed 120%CPU (0avgtext+0avgdata 26792maxresident)k
00:00:01 0inputs+8056outputs (0major+278567minor)pagefaults 0swaps
0
alx@debian:/srv/alx/src/gnu/gcc/len22$ /bin/time make -j12 bootstrap |& ts -s | 
tail

Re: [PATCH] libstdc++: Fix proc check_v3_target_namedlocale for "" locale [PR65909]

2025-05-15 Thread Tomasz Kaminski

On Thu, May 15, 2025 at 7:30 PM Jonathan Wakely  wrote:

> When the last formal argument to a Tcl proc is named 'args' it has
> special meaning and is a list that accepts any number of arguments[1].
> This means when "" is passed to the proc and then we expand "$args" we
> get an empty list formatted as "{}". My r16-537-g3e2b83faeb6b14 change
> broke all uses of dg-require-namedlocale with empty locale names, "".
>
> By changing the name of the formal argument to 'locale' we avoid the
> special behaviour for 'args' and now it only accepts a single argument
> (as was always intended). When expanded as "$locale" we get "" as I
> expected.
>
> [1] https://www.tcl-lang.org/man/tcl9.0/TclCmd/proc.html
>
> libstdc++-v3/ChangeLog:
>
> PR libstdc++/65909
> * testsuite/lib/libstdc++.exp (check_v3_target_namedlocale):
> Change name of formal argument to locale.
> ---
>
> Tested x86_64-linux.
>
LGTM. Thanks for the link.

>
> I also plan to audit the other procs in libstdc++.exp to see if they
> should not use 'args', but that can be done later. The priority for now
> is to fix what I broke recently.
>
>  libstdc++-v3/testsuite/lib/libstdc++.exp | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/libstdc++-v3/testsuite/lib/libstdc++.exp b/libstdc++-v3/testsuite/lib/libstdc++.exp
> index da1f4245e4b8..9f2dd8a17248 100644
> --- a/libstdc++-v3/testsuite/lib/libstdc++.exp
> +++ b/libstdc++-v3/testsuite/lib/libstdc++.exp
> @@ -1019,8 +1019,8 @@ proc check_v3_target_time { } {
>  }]
>  }
>
> -proc check_v3_target_namedlocale { args } {
> -set key "et_namedlocale $args"
> +proc check_v3_target_namedlocale { locale } {
> +set key "et_namedlocale $locale"
>  return [check_v3_target_prop_cached $key {
> global tool
> # Set up, compile, and execute a C++ test program that tries to use
> @@ -1048,7 +1048,7 @@ proc check_v3_target_namedlocale { args } {
> puts $f "}"
> puts $f "int main ()"
> puts $f "{"
> -   puts $f "  const char *namedloc = transform_locale(\"$args\");"
> +   puts $f "  const char *namedloc = transform_locale(\"$locale\");"
> puts $f "  try"
> puts $f "  {"
> puts $f "locale((const char*)namedloc);"
> @@ -1075,7 +1075,7 @@ proc check_v3_target_namedlocale { args } {
> set result [${tool}_load "./$exe" "" ""]
> set status [lindex $result 0]
>
> -   verbose "check_v3_target_namedlocale <$args>: status is <$status>" 2
> +   verbose "check_v3_target_namedlocale <$locale>: status is <$status>" 2
>
> if { $status == "pass" } {
> return 1
> --
> 2.49.0
>
>

Re: [COMMITTED] PR tree-optimization/120277 - Check for casts becoming UNDEFINED.

2025-05-15 Thread Richard Biener

On Thu, May 15, 2025 at 7:02 PM Andrew MacLeod  wrote:
>
> Recent changes to get_range_from_bitmask can sometimes turn a small
> range into an undefined one if the bitmask indicates the bits make all
> values impossible.
>
> range_cast () was not expecting this and checks for UNDEFINED before
> performing the cast.  It also needs to check for it after the cast now.
>
> In this testcase, the pattern is
>
>   y = x * 4   <- we know y will have the bottom 2 bits cleared
>   z = y + 7   <- we know z will have the bottom 2 bits set.
>
> then a switch checks for z == 128 | z == 129 and performs a store into
> *(int *)y.
> Eventually the store is eliminated as unreachable, but range analysis
> recognizes that the value is UNDEFINED when [121, 122] with the last 2
> bits having to be 11 is calculated :-P
>
> Do the default for casts, and if the result is UNDEFINED, turn it into
> VARYING.
>
> Bootstrapped on x86_64-pc-linux-gnu with no regressions. Pushed.
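
To make the quoted example concrete, here is a brute-force sketch of why the range [121, 122] combined with "the bottom 2 bits are 11" leaves no possible value. `count_compatible` and its parameters are hypothetical names. Note that `known` here marks the bits whose value is known, which is a simplification chosen for the sketch and is not a description of how ranger represents or intersects bitmasks internally.

```c
#include <stdint.h>

/* Count the values V in [LO, HI] whose bits selected by KNOWN are equal
   to VALUE.  A result of zero means the range and the known bits
   contradict each other, i.e. the combined range is empty.  */
unsigned
count_compatible (uint64_t lo, uint64_t hi, uint64_t known, uint64_t value)
{
  unsigned n = 0;
  if (lo > hi)
    return 0;
  for (uint64_t v = lo;; v++)
    {
      if ((v & known) == value)
        n++;
      if (v == hi) /* stop before V can wrap around */
        break;
    }
  return n;
}
```

With these definitions, `count_compatible (121, 122, 3, 3)` is 0, since 121 ends in binary 01 and 122 ends in 10, whereas widening the range to [121, 123] leaves exactly one candidate.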

For unreachable code it probably does not matter much, but IMO dropping
from UNDEFINED to VARYING within the core ranger is pessimizing
and could end up papering over issues that would otherwise show up.

IMO such UNDEFINED -> VARYING should happen at the consumer
side or alternatively an uppermost API layer that's not used from within
ranger itself.

Richard.

>
> Andrew

Re: [PATCH] Forwprop: add a debug dump after propagate into comparison does something

2025-05-15 Thread Richard Biener

On Thu, May 15, 2025 at 8:22 PM Andrew Pinski  wrote:
>
> I noticed that forwprop does not dump when forward_propagate_into_comparison
> changes the assign statement.
> I am actually using it to help guide changing/improving/adding match patterns
> instead of depending on doing a tree "combiner" here.
>
> Bootstrapped and tested on x86_64-linux-gnu.

OK.

> gcc/ChangeLog:
>
> * tree-ssa-forwprop.cc (forward_propagate_into_comparison): Dump
> when the statement is changed.
>
> Signed-off-by: Andrew Pinski 
> ---
>  gcc/tree-ssa-forwprop.cc | 8 
>  1 file changed, 8 insertions(+)
>
> diff --git a/gcc/tree-ssa-forwprop.cc b/gcc/tree-ssa-forwprop.cc
> index 3187314390f..9986799da3b 100644
> --- a/gcc/tree-ssa-forwprop.cc
> +++ b/gcc/tree-ssa-forwprop.cc
> @@ -523,6 +523,14 @@ forward_propagate_into_comparison (gimple_stmt_iterator *gsi)
>  type, rhs1, rhs2);
>if (tmp && useless_type_conversion_p (type, TREE_TYPE (tmp)))
>  {
> +  if (dump_file)
> +   {
> + fprintf (dump_file, "  Replaced '");
> + print_gimple_expr (dump_file, stmt, 0);
> + fprintf (dump_file, "' with '");
> + print_generic_expr (dump_file, tmp);
> + fprintf (dump_file, "'\n");
> +   }
>gimple_assign_set_rhs_from_tree (gsi, tmp);
>fold_stmt (gsi);
>update_stmt (gsi_stmt (*gsi));
> --
> 2.43.0
>

Re: [PATCH 1/4] Make end_sequence return the insn sequence

2025-05-15 Thread Richard Biener

On Fri, May 16, 2025 at 1:05 AM Jeff Law  wrote:
>
>
>
> On 5/15/25 11:18 AM, Richard Sandiford wrote:
> > The start_sequence/end_sequence interface was a big improvement over
> > the previous state, but one slightly awkward thing about it is that
> > you have to call get_insns before end_sequence in order to get the
> > insn sequence itself:
> I can't even remember what the previous state was...  push_to_sequence?
>
> >
> > To get the contents of the sequence just made, you must call
> > `get_insns' *before* calling here.
> >
> > We therefore have quite a lot of code like this:
> >
> >insns = get_insns ();
> >end_sequence ();
> >return insns;
> >
> > It would seem simpler to write:
> >
> >return end_sequence ();
> >
> > instead.
> Totally agreed.  It's much more natural to just return the sequence.
>
> >
> > I can see three main potential objections to this:
> >
> > (1) It isn't obvious whether ending the sequence would return the first
> >  or the last instruction.  But although some code reads *both* the
> >  first and the last instruction, I can't think of a specific case
> >  where code would want *only* the last instruction.  All the emit
> >  functions take the first instruction rather than the last.
> Understood.  It's a bad name.  finish_sequence, close_sequence, or
> somesuch would probably be much clearer.  And no, I'm not immediately
> aware of cases where we'd only look at the last instruction.  I could
> probably come up with one in theory, but again, not aware of any case in
> practice.
>
> >
> > (2) The "end" in end_sequence might imply the C++ meaning of an exclusive
> >  endpoint iterator.  But for an insn sequence, the exclusive endpoint
> >  is always the null pointer, so it would never need to be returned.
> >  That said, we could rename the function to something like
> >  "finish_sequence" or "complete_sequence" if this is an issue.
> See above.  It's a large mechanical change, but it would remove some of
> the ambiguity in the current API.
> >
> > (3) There might have been an intention that start_sequence/end_sequence
> >  could in future reclaim memory for unwanted sequences, and so an
> >  explicit get_insns was used to indicate that the caller does want
> >  the sequence.
> >
> >  But that sort of memory reclamation has never been added,
> >  and now that the codebase is C++, it would be easier to handle
> >  using RAII.  I think reclaiming memory would be difficult to do in
> >  any case, since some code records the individual instructions that
> >  they emit, rather than using get_insns.
> IMHO this isn't a strong reason to keep the status quo.  And as you
> note, if we really wanted to wade into this space, RAII style
> ctors/dtors would likely be a much better way to go.
>
> I'm absolutely in favor of the change, even if we didn't do a name
> change in the process.  But let's give others a bit of time to chime in.

I agree with the change, thus OK.

Richard.

>
> jeff
>
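
For readers following the thread, here is a minimal sketch of the interface
change being discussed.  It is a toy model only: toy_insn, toy_sequence and
the toy_* functions are hypothetical stand-ins, not GCC's rtx_insn or its
real start_sequence/end_sequence implementation.  It just contrasts the
"get_insns, then end_sequence" pattern with an end function that returns
the first insn of the sequence.

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins, not GCC's real types: an "insn" is a node in a singly
// linked list, and a sequence remembers its first and last node.
struct toy_insn
{
  int id;
  toy_insn *next = nullptr;
};

struct toy_sequence
{
  toy_insn *first = nullptr;
  toy_insn *last = nullptr;
};

// Stack of open sequences, so that start/end pairs can nest.
static std::vector<toy_sequence> seq_stack;

void toy_start_sequence () { seq_stack.push_back (toy_sequence ()); }

// Append INSN to the innermost open sequence.
void toy_emit (toy_insn *insn)
{
  toy_sequence &seq = seq_stack.back ();
  if (seq.last)
    seq.last->next = insn;
  else
    seq.first = insn;
  seq.last = insn;
}

toy_insn *toy_get_insns () { return seq_stack.back ().first; }

// Old style: ending the sequence returns nothing, so the caller has to
// fetch the insns *before* calling this.
void toy_end_sequence_void () { seq_stack.pop_back (); }

// Proposed style: ending the sequence hands back its first insn.
toy_insn *toy_end_sequence ()
{
  toy_insn *first = seq_stack.back ().first;
  seq_stack.pop_back ();
  return first;
}

toy_insn *build_old_style (toy_insn *a, toy_insn *b)
{
  toy_start_sequence ();
  toy_emit (a);
  toy_emit (b);
  toy_insn *insns = toy_get_insns ();
  toy_end_sequence_void ();
  return insns;
}

toy_insn *build_new_style (toy_insn *a, toy_insn *b)
{
  toy_start_sequence ();
  toy_emit (a);
  toy_emit (b);
  return toy_end_sequence ();
}
```

Both builders return the first insn of the sequence; the second one simply
has no window in which the caller can forget the get_insns step.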

Re: [PATCH v2] libstdc++: Preserve the argument type in basic_format_args [PR119246]

2025-05-15 Thread Tomasz Kaminski

On Thu, May 15, 2025 at 7:46 PM Rainer Orth 
wrote:

> Hi Jonathan,
>
> > On Thu, 15 May 2025 at 15:02, Rainer Orth 
> wrote:
> >>
> >> Hi Jonathan,
> >>
> >> >> > this patch broke Solaris bootstrap, both i386-pc-solaris2.11 and
> >> >> > sparc-sun-solaris2.11:
> >> >> >
> >> >> > In file included from
> >> >> > /vol/gcc/src/hg/master/local/libstdc++-v3/src/c++20/format.cc:29:
> >> >> >
> /var/gcc/regression/master/11.4-gcc/build/i386-pc-solaris2.11/libstdc++-v3/include/format:
> >> >> > In member function ‘typename std::basic_format_context<_Out,
> >> >> > _CharT>::iterator std::formatter<__float128,
> >> >> > _CharT>::format(__float128, std::basic_format_context<_Out,
> _CharT>&)
> >> >> > const’:
> >> >> >
> /var/gcc/regression/master/11.4-gcc/build/i386-pc-solaris2.11/libstdc++-v3/include/format:2994:41:
> >> >> > error: ‘__flt128_t’ is not a member of ‘std::__format’; did you
> mean
> >> >> > ‘__bflt16_t’? [-Wtemplate-body]
> >> >> > 2994 | { return _M_f.format((__format::__flt128_t)__u,
> __fc); }
> >> >> >  | ^~
> >> >> >  | __bflt16_t
> >> >> >
> >> >> > and one more instance.
> >> >>
> >> >> And on x86_64-darwin too.
> >> >
> >> > Tomasz, should this be:
> >> >
> >> > --- a/libstdc++-v3/include/std/format
> >> > +++ b/libstdc++-v3/include/std/format
> >> > @@ -2973,7 +2973,7 @@ namespace __format
> >> > };
> >> > #endif
> >> >
> >> > -#if defined(__SIZEOF_FLOAT128__) && _GLIBCXX_FORMAT_F128 != 1
> >> > +#if defined(__SIZEOF_FLOAT128__) && _GLIBCXX_FORMAT_F128 > 1
> >> >   // Reuse __formatter_fp::format<__format::__flt128_t, Out> for
> __float128.
> >> >   // This formatter is not declared if
> >> > _GLIBCXX_LONG_DOUBLE_ALT128_COMPAT is true,
> >> >   // as __float128 when present is same type as __ieee128, which may
> be same as
> >> >
> >>
> >> with this patch applied, I could link libstdc++.so.  I'll run a full
> >> bootstrap later today.
> >
> >
> > Good to know, thanks. Tomasz already pushed that change as
> > r16-647-gd010a39b9e788a
> > so trunk should be OK now.
>
> it is on Solaris/i386, but sparc is broken in a different way now:
>
> /var/gcc/regression/master/11.4-gcc/build/sparc-sun-solaris2.11/libstdc++-v3/include/format:2999:23:
> error: static assertion failed: This specialization should not be used for
> long double
>  2999 |   static_assert( !is_same_v<__float128, long double>,
>   |   ^~
> /var/gcc/regression/master/11.4-gcc/build/sparc-sun-solaris2.11/libstdc++-v3/include/format:2999:23:
> note: ‘!(bool)std::is_same_v’ evaluates to false
> /var/gcc/regression/master/11.4-gcc/build/sparc-sun-solaris2.11/libstdc++-v3/include/format:
> In instantiation of ‘struct std::formatter’:
> /var/gcc/regression/master/11.4-gcc/build/sparc-sun-solaris2.11/libstdc++-v3/include/type_traits:3559:54:
>  required from ‘constexpr const bool
> std::is_default_constructible_v >’
>  3559 |   inline constexpr bool is_default_constructible_v =
> __is_constructible(_Tp);
>   |
> ^~~
> [...]
>
> I've a local patch in tree to support __float128 on SPARC, so I'll try
> with an unmodified tree first.  However, 2 days ago I could bootstrap
> with that included just fine.
>
Is __float128 a distinct type from long double when both are
128 bits wide?
I have assumed that this would be the case, and this specialization is
meant to avoid generating separate formatting code for both of them.

>
> Rainer
>
> --
>
> -
> Rainer Orth, Center for Biotechnology, Bielefeld University
>
>

Re: [PATCH v2] libstdc++: Preserve the argument type in basic_format_args [PR119246]

2025-05-15 Thread Rainer Orth

Hi Tomasz,

>> I've a local patch in tree to support __float128 on SPARC, so I'll try
>> with an unmodified tree first.  However, 2 days ago I could bootstrap
>> with that included just fine.
>>
> Is __float128 a distinct type from long double when both are
> 128 bits wide?
> I have assumed that this would be the case, and this specialization is
> meant to avoid generating separate formatting code for both of them.

I'm using

/* Create builtin type for __float128.  */

static void
sparc_float128_init_builtins (void)
{
  /* With 128-bit long double, the __float128 type is a synonym for
 "long double".  */
  lang_hooks.types.register_builtin_type (long_double_type_node, "__float128");
}

I guess I should file a PR for this, attaching the current patch, to
discuss the remaining issues (fetestexcept failures in the testsuite, on
both Solaris/sparc and Linux/sparc64).

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [PATCH] Fortran: default-initialization and functions returning derived type[PR85750]

2025-05-15 Thread Andre Vehreschild


LGTM!

Thanks for the Patch.

- Andre
Andre Vehreschild * ve...@gmx.de
On 15 May 2025 at 22:36:19, Harald Anlauf wrote:


Dear all,

the attached patch fixes missing default-initialization of function
results of derived type that happens under some conditions, see PR.
The logic when default initialization is to be applied is rather
contorted, and reversing the order of two cases fixed the issue.

Regtesting revealed a few bogus warnings in the testsuite,
and some counts of tree-dump scans needed adjustment.

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

The PR marks the issue as a regression since gcc-6, which I cannot
test.  I therefore feel that this fix should be backported to
15-branch.  Going back further would need backporting of other
patches (e.g. to pr98454), so if someone pushes me, I can try.
Let me know what you think.

Cheers,
Harald

Re: [PATCH v2] libstdc++: Preserve the argument type in basic_format_args [PR119246]

2025-05-15 Thread Tomasz Kaminski

An updated version of the patch, which takes a safer approach: it does not
declare a special formatter for __float128 if long double is possibly
IEEE 128. Please let me know if this version addresses the problem on sparc.
I think there are more sparc machines in the compiler farm, so I am going to
check that tomorrow.

diff --git a/libstdc++-v3/include/std/format
b/libstdc++-v3/include/std/format
index b1823db83bc..d2ac84a1709 100644
--- a/libstdc++-v3/include/std/format
+++ b/libstdc++-v3/include/std/format
@@ -2973,11 +2973,10 @@ namespace __format
 };
 #endif

-#if defined(__SIZEOF_FLOAT128__) && _GLIBCXX_FORMAT_F128 > 1
+#if defined(__SIZEOF_FLOAT128__) && _GLIBCXX_FORMAT_F128 == 2
   // Reuse __formatter_fp::format<__format::__flt128_t, Out> for
__float128.
-  // This formatter is not declared if _GLIBCXX_LONG_DOUBLE_ALT128_COMPAT
is true,
-  // as __float128 when present is same type as __ieee128, which may be
same as
-  // long double.
+  // This formatter is only declared when __flt128_t is _Float128, as in
other
+  // cases __float128 may be same type as long double (powerpc and sparc).
   template<__format::__char _CharT>
 struct formatter<__float128, _CharT>
 {
@@ -2995,9 +2994,6 @@ namespace __format

 private:
   __format::__formatter_fp<_CharT> _M_f;
-
-  static_assert( !is_same_v<__float128, long double>,
-"This specialization should not be used for long
double" );
 };
 #endif

-- 
2.49.0
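
The selection rule that both versions of the patch aim for can be sketched
as a small compile-time decision table.  This is only an illustration under
stated assumptions: toy_f128, toy_path and formatter_path are hypothetical
names, and the sketch uses if constexpr rather than the preprocessor check
or the requires-clause used in the actual patches.  The point it shows is
that a dedicated formatter is selected only when the extended type is
distinct from long double, so long double always takes a single path.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical extended floating-point type that is distinct from
// long double (standing in for __float128 on targets where it is).
struct toy_f128 {};

enum toy_path
{
  NO_FORMATTER = 0,   // type is not formattable in this toy model
  GENERIC_FP = 1,     // generic floating-point formatter
  DEDICATED_EXT = 2   // dedicated formatter for the extended type
};

// Which path a value of type T takes when the target's extended type
// is Ext.  Ext may be a distinct type or an alias of long double.
template<typename Ext, typename T>
constexpr toy_path formatter_path ()
{
  if constexpr (std::is_same_v<T, Ext> && !std::is_same_v<Ext, long double>)
    return DEDICATED_EXT;
  else if constexpr (std::is_floating_point_v<T>)
    return GENERIC_FP;
  else
    return NO_FORMATTER;
}
```

With Ext = long double the dedicated path is never taken, which models the
configurations (powerpc and sparc, per the patch comment) where the special
formatter should not be declared.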


On Thu, May 15, 2025 at 9:20 PM Tomasz Kaminski  wrote:

> Hi,
>
> I apologize for the churn that this patch created. I have added this
> static assert to detect environments where __float128 is the same as long
> double.
> From original commit message:
> >We also provide formatter<__float128, _CharT> that formats via __flt128_t.
> >As this type may be disabled (-mno-float128), extra care needs to be
> taken,
> >for situation when __float128 is same as long double. If the formatter
> would be
> >defined in such case, the formatter would be
> generated
> >from different specializations, and have different mangling:
> >  * formatter<__float128, _CharT> if __float128 is present,
> >  * formatter<__format::__formattable_float, _CharT> otherwise.
> >To best of my knowledge this happens only on ppc64 for __ieee128 and
> __float128,
> >so the formatter is not defined in this case. static_assert is added to
> detect
> >other configurations like that. In such case we should replace it with
> constra
>
> I do not have access to a machine with that setup to confirm,
> but could you please check if the following patch fixes this.
> ---
>  libstdc++-v3/include/std/format | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/libstdc++-v3/include/std/format
> b/libstdc++-v3/include/std/format
> index b1823db83bc..a870aa3f3ea 100644
> --- a/libstdc++-v3/include/std/format
> +++ b/libstdc++-v3/include/std/format
> @@ -2979,6 +2979,7 @@ namespace __format
>// as __float128 when present is same type as __ieee128, which may be
> same as
>// long double.
>template<__format::__char _CharT>
> +requires (!is_same_v<__float128, long double>)
>  struct formatter<__float128, _CharT>
>  {
>formatter() = default;
> @@ -2995,9 +2996,6 @@ namespace __format
>
>  private:
>__format::__formatter_fp<_CharT> _M_f;
> -
> -  static_assert( !is_same_v<__float128, long double>,
> -"This specialization should not be used for long
> double" );
>  };
>  #endif
>
> --
> 2.49.0
>

[PATCH v2 1/2] emit-rtl: Allow extra checks for paradoxical subregs [PR119966]

2025-05-15 Thread Dimitar Dimitrov

When a paradoxical subreg is detected, validate_subreg exits early, thus
skipping the important checks later in the function.

Fix by continuing with the checks instead of declaring early that the
paradoxical subreg is valid.

One of the newly allowed subsequent checks needed to be disabled for
paradoxical subregs.  It turned out that combine attempts to create
a paradoxical subreg of mem even for strict-alignment targets.
That is invalid and should eventually be rejected, but is
temporarily left allowed to prevent regressions for
armv8l-unknown-linux-gnueabihf.

Tests I did:
 - No regressions were found for C and C++ for the following targets:
   - native x86_64-pc-linux-gnu
   - cross riscv64-unknown-linux-gnu
   - cross riscv32-none-elf
 - Sanity checked armv8l-unknown-linux-gnueabihf by cross-building
   up to and including libgcc. I'll monitor Linaro CI bot for the
   full regression test results.
 - Sanity checked powerpc64-unknown-linux-gnu by building native
   toolchain, but could not set up qemu-user for DejaGnu testing.

PR target/119966

gcc/ChangeLog:

* emit-rtl.cc (validate_subreg): Do not exit immediately for
paradoxical subregs.  Filter subsequent tests which are
not valid for paradoxical subregs.

Co-authored-by: Richard Sandiford 
Signed-off-by: Dimitar Dimitrov 
---
 gcc/emit-rtl.cc | 25 ++---
 1 file changed, 18 insertions(+), 7 deletions(-)

diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc
index 3e2c4309dee..e46b0f9eac4 100644
--- a/gcc/emit-rtl.cc
+++ b/gcc/emit-rtl.cc
@@ -969,10 +969,10 @@ validate_subreg (machine_mode omode, machine_mode imode,
 }
 
   /* Paradoxical subregs must have offset zero.  */
-  if (maybe_gt (osize, isize))
-return known_eq (offset, 0U);
+  if (maybe_gt (osize, isize) && !known_eq (offset, 0U))
+return false;
 
-  /* This is a normal subreg.  Verify that the offset is representable.  */
+  /* Verify that the offset is representable.  */
 
   /* For hard registers, we already have most of these rules collected in
  subreg_offset_representable_p.  */
@@ -988,9 +988,13 @@ validate_subreg (machine_mode omode, machine_mode imode,
 
   return subreg_offset_representable_p (regno, imode, offset, omode);
 }
-  /* Do not allow SUBREG with stricter alignment than the inner MEM.  */
+  /* Do not allow normal SUBREG with stricter alignment than the inner MEM.
+
+ FIXME: Combine can create paradoxical mem subregs even for
+ strict-alignment targets.  Allow it until combine is fixed.  */
   else if (reg && MEM_P (reg) && STRICT_ALIGNMENT
-  && MEM_ALIGN (reg) < GET_MODE_ALIGNMENT (omode))
+  && MEM_ALIGN (reg) < GET_MODE_ALIGNMENT (omode)
+  && known_le (osize, isize))
 return false;
 
   /* The outer size must be ordered wrt the register size, otherwise
@@ -999,7 +1003,7 @@ validate_subreg (machine_mode omode, machine_mode imode,
   if (!ordered_p (osize, regsize))
 return false;
 
-  /* For pseudo registers, we want most of the same checks.  Namely:
+  /* For normal pseudo registers, we want most of the same checks.  Namely:
 
  Assume that the pseudo register will be allocated to hard registers
  that can hold REGSIZE bytes each.  If OSIZE is not a multiple of REGSIZE,
@@ -1008,8 +1012,15 @@ validate_subreg (machine_mode omode, machine_mode imode,
  otherwise it is at the lowest offset.
 
  Given that we've already checked the mode and offset alignment,
- we only have to check subblock subregs here.  */
+ we only have to check subblock subregs here.
+
+ For paradoxical little-endian registers, this check is redundant.  The
+ offset has already been validated to be zero.
+
+ For paradoxical big-endian registers, this check is not valid
+ because the offset is zero.  */
   if (maybe_lt (osize, regsize)
+  && known_le (osize, isize)
   && ! (lra_in_progress && (FLOAT_MODE_P (imode) || FLOAT_MODE_P (omode
 {
   /* It is invalid for the target to pick a register size for a mode
-- 
2.49.0
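
The control-flow change in the first hunk can be sketched with a toy model.
Everything here is an assumption made for illustration: toy_subreg uses
plain byte counts instead of poly_uint64, and later_checks_ok is a
hypothetical placeholder for the checks further down validate_subreg, not
a rule GCC actually applies.

```cpp
#include <cassert>

// Toy model with plain byte sizes; the real code uses poly_uint64 and
// helpers such as maybe_gt and known_eq, which are not reproduced here.
struct toy_subreg
{
  unsigned osize;   // size of the outer mode in bytes
  unsigned isize;   // size of the inner mode in bytes
  unsigned offset;  // byte offset of the subreg
};

// Hypothetical stand-in for the later checks in the function:
// accept only nonzero outer sizes of at most 16 bytes.
static bool later_checks_ok (const toy_subreg &s)
{
  return s.osize != 0 && s.osize <= 16;
}

// Before the patch: a paradoxical subreg (outer larger than inner)
// returns immediately, so the later checks never run for it.
bool validate_before (const toy_subreg &s)
{
  if (s.osize > s.isize)
    return s.offset == 0;
  return later_checks_ok (s);
}

// After the patch: a paradoxical subreg with a nonzero offset is still
// rejected, but otherwise control falls through to the later checks.
bool validate_after (const toy_subreg &s)
{
  if (s.osize > s.isize && s.offset != 0)
    return false;
  return later_checks_ok (s);
}
```

In this model a paradoxical subreg with offset zero that the later checks
would reject is accepted by validate_before and rejected by validate_after,
which is the class of case the patch is meant to catch.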

[PATCH v2 0/2] emit-rtl: Add more checks for paradoxical subregs [PR119966]

2025-05-15 Thread Dimitar Dimitrov

Commit r16-160-ge6f89d78c1a752 exposed that validate_subreg is
skipping important checks for paradoxical subregs, which manifested as a
regression for pru-unknown-elf.  This patch series enables some of
the skipped checks and adds one more for hardware subreg mode validity.

Changes since v1:
 - Alignment check for paradoxical mem subregs was left disabled
   to avoid regression for armv8l-unknown-linux-gnueabihf.
 - Disabled subblock subreg checks for paradoxical subregs to
   fix regression for big endian targets.
 - Split into two patches.
 - Added co-authored-by to acknowledge the help and code snippets
   received in PR119966.

Dimitar Dimitrov (2):
  emit-rtl: Allow extra checks for paradoxical subregs [PR119966]
  emit-rtl: Validate mode for paradoxical hardware subregs [PR119966]

 gcc/emit-rtl.cc | 28 +---
 1 file changed, 21 insertions(+), 7 deletions(-)

-- 
2.49.0

[PATCH v2 2/2] emit-rtl: Validate mode for paradoxical hardware subregs [PR119966]

2025-05-15 Thread Dimitar Dimitrov

After r16-160-ge6f89d78c1a752, late_combine2 started transforming the
following RTL for pru-unknown-elf:

  (insn 3949 3948 3951 255 (set (reg:QI 56 r14.b0 [orig:1856 _619 ] [1856])
  (and:QI (reg:QI 1 r0.b1 [orig:1855 _201 ] [1855])
  (const_int 3 [0x3])))
   (nil))
  ...
  (insn 3961 7067 3962 255 (set (reg:SI 56 r14.b0)
  (zero_extend:SI (reg:QI 56 r14.b0 [orig:1856 _619 ] [1856])))
   (nil))

into:

  (insn 3961 7067 3962 255 (set (reg:SI 56 r14.b0)
  (and:SI (subreg:SI (reg:QI 1 r0.b1 [orig:1855 _201 ] [1855]) 0)
  (const_int 3 [0x3])))
   (nil))

That caused libbacktrace build to break for pru-unknown-elf.  Register
r0.b1 (regno 1) is not valid for SImode, which validate_subreg failed to
reject.

Fix by calling the hard_regno_mode_ok target hook to ensure that both
inner and outer modes are valid for the hardware subreg.

This patch fixes the broken PRU toolchain build.  It leaves only two
test case regressions for PRU, caused by rnreg pass renaming a valid
paradoxical subreg into an invalid one.
  gcc.c-torture/execute/20040709-1.c
  gcc.c-torture/execute/20040709-2.c

PR target/119966

gcc/ChangeLog:

* emit-rtl.cc (validate_subreg): Validate inner
and outer mode for paradoxical hardware subregs.

Co-authored-by: Andrew Pinski 
Signed-off-by: Dimitar Dimitrov 
---
 gcc/emit-rtl.cc | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc
index e46b0f9eac4..6c5d9b55508 100644
--- a/gcc/emit-rtl.cc
+++ b/gcc/emit-rtl.cc
@@ -983,6 +983,9 @@ validate_subreg (machine_mode omode, machine_mode imode,
   if ((COMPLEX_MODE_P (imode) || VECTOR_MODE_P (imode))
  && GET_MODE_INNER (imode) == omode)
;
+  else if (!targetm.hard_regno_mode_ok (regno, imode)
+  || !targetm.hard_regno_mode_ok (regno, omode))
+   return false;
   else if (!REG_CAN_CHANGE_MODE_P (regno, imode, omode))
return false;
 
-- 
2.49.0
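
As a rough illustration of the added check, here is a toy model of a
register file numbered in bytes, matching the dump above where r0.b1 is
regno 1 and r14.b0 is regno 56.  The alignment rule in
toy_hard_regno_mode_ok is an assumption chosen for the example, not a copy
of the PRU backend's hook, and toy_can_change_mode is a hypothetical
placeholder for REG_CAN_CHANGE_MODE_P.

```cpp
#include <cassert>

// Toy modes; the enumerator value is the mode size in bytes.
enum toy_mode { TOY_QI = 1, TOY_SI = 4 };

// Assumed rule for this sketch: a hard register can hold a mode only if
// its byte-granular number is a multiple of the mode size.
bool toy_hard_regno_mode_ok (unsigned regno, toy_mode mode)
{
  return regno % static_cast<unsigned> (mode) == 0;
}

// Placeholder for REG_CAN_CHANGE_MODE_P; always permissive here.
bool toy_can_change_mode (unsigned, toy_mode, toy_mode)
{
  return true;
}

// Before the patch: only the mode-change check is consulted.
bool validate_hard_subreg_before (unsigned regno, toy_mode imode,
                                  toy_mode omode)
{
  return toy_can_change_mode (regno, imode, omode);
}

// After the patch: both modes must be valid for the register first.
bool validate_hard_subreg_after (unsigned regno, toy_mode imode,
                                 toy_mode omode)
{
  if (!toy_hard_regno_mode_ok (regno, imode)
      || !toy_hard_regno_mode_ok (regno, omode))
    return false;
  return toy_can_change_mode (regno, imode, omode);
}
```

Under these assumptions (subreg:SI (reg:QI 1) 0) passes the old check and
fails the new one, while the same subreg on regno 56 is still accepted.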

[PATCH] Fortran: default-initialization and functions returning derived type[PR85750]

2025-05-15 Thread Harald Anlauf


Dear all,

the attached patch fixes missing default-initialization of function
results of derived type that happens under some conditions, see PR.
The logic when default initialization is to be applied is rather
contorted, and reversing the order of two cases fixed the issue.

Regtesting revealed a few bogus warnings in the testsuite,
and some counts of tree-dump scans needed adjustment.

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

The PR marks the issue as a regression since gcc-6, which I cannot
test.  I therefore feel that this fix should be backported to
15-branch.  Going back further would need backporting of other
patches (e.g. to pr98454), so if someone pushes me, I can try.
Let me know what you think.

Cheers,
Harald

From 8a1f2ae8c0ea3a92d9b20f0e678b56583ca4a849 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Thu, 15 May 2025 21:07:07 +0200
Subject: [PATCH] Fortran: default-initialization and functions returning
 derived type[PR85750]

Functions with non-pointer, non-allocatable result and of derived type did
not always get initialized although the type had default-initialization,
and a derived type component had the allocatable or pointer attribute.
Rearrange the logic when to apply default-initialization.

	PR fortran/85750

gcc/fortran/ChangeLog:

	* resolve.cc (resolve_symbol): Reorder conditions when to apply
	default-initializers.

gcc/testsuite/ChangeLog:

	* gfortran.dg/alloc_comp_auto_array_3.f90: Adjust scan counts.
	* gfortran.dg/alloc_comp_class_3.f03: Remove bogus warnings.
	* gfortran.dg/alloc_comp_class_4.f03: Likewise.
	* gfortran.dg/allocate_with_source_14.f03: Adjust scan count.
	* gfortran.dg/derived_constructor_comps_6.f90: Likewise.
	* gfortran.dg/derived_result_5.f90: New test.
---
 gcc/fortran/resolve.cc|   8 +-
 .../gfortran.dg/alloc_comp_auto_array_3.f90   |   4 +-
 .../gfortran.dg/alloc_comp_class_3.f03|   3 +-
 .../gfortran.dg/alloc_comp_class_4.f03|   5 +-
 .../gfortran.dg/allocate_with_source_14.f03   |   2 +-
 .../derived_constructor_comps_6.f90   |   2 +-
 .../gfortran.dg/derived_result_5.f90  | 123 ++
 7 files changed, 134 insertions(+), 13 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/derived_result_5.f90

diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc
index bf1aa704888..d09aef0a899 100644
--- a/gcc/fortran/resolve.cc
+++ b/gcc/fortran/resolve.cc
@@ -18059,16 +18059,16 @@ skip_interfaces:
 	  || (a->dummy && !a->pointer && a->intent == INTENT_OUT
 	  && sym->ns->proc_name->attr.if_source != IFSRC_IFBODY))
 	apply_default_init (sym);
+  else if (a->function && !a->pointer && !a->allocatable && !a->use_assoc
+	   && sym->result)
+	/* Default initialization for function results.  */
+	apply_default_init (sym->result);
   else if (a->function && sym->result && a->access != ACCESS_PRIVATE
 	   && (sym->ts.u.derived->attr.alloc_comp
 		   || sym->ts.u.derived->attr.pointer_comp))
 	/* Mark the result symbol to be referenced, when it has allocatable
 	   components.  */
 	sym->result->attr.referenced = 1;
-  else if (a->function && !a->pointer && !a->allocatable && !a->use_assoc
-	   && sym->result)
-	/* Default initialization for function results.  */
-	apply_default_init (sym->result);
 }
 
   if (sym->ts.type == BT_CLASS && sym->ns == gfc_current_ns
diff --git a/gcc/testsuite/gfortran.dg/alloc_comp_auto_array_3.f90 b/gcc/testsuite/gfortran.dg/alloc_comp_auto_array_3.f90
index 2af089e84e8..d0751f3d3eb 100644
--- a/gcc/testsuite/gfortran.dg/alloc_comp_auto_array_3.f90
+++ b/gcc/testsuite/gfortran.dg/alloc_comp_auto_array_3.f90
@@ -25,6 +25,6 @@ contains
 allocate (array(1)%bigarr)
   end function
 end
-! { dg-final { scan-tree-dump-times "builtin_malloc" 3 "original" } }
+! { dg-final { scan-tree-dump-times "builtin_malloc" 4 "original" } }
 ! { dg-final { scan-tree-dump-times "builtin_free" 3 "original" } }
-! { dg-final { scan-tree-dump-times "while \\(1\\)" 4 "original" } }
+! { dg-final { scan-tree-dump-times "while \\(1\\)" 5 "original" } }
diff --git a/gcc/testsuite/gfortran.dg/alloc_comp_class_3.f03 b/gcc/testsuite/gfortran.dg/alloc_comp_class_3.f03
index 0753e33d535..8202d783621 100644
--- a/gcc/testsuite/gfortran.dg/alloc_comp_class_3.f03
+++ b/gcc/testsuite/gfortran.dg/alloc_comp_class_3.f03
@@ -45,11 +45,10 @@ contains
 type(c), value :: d
   end subroutine
 
-  type(c) function c_init()  ! { dg-warning "not set" }
+  type(c) function c_init()
   end function
 
   subroutine sub(d)
 type(u), value :: d
   end subroutine
 end program test_pr58586
-
diff --git a/gcc/testsuite/gfortran.dg/alloc_comp_class_4.f03 b/gcc/testsuite/gfortran.dg/alloc_comp_class_4.f03
index 4a55d73b245..9ff38e3fb7c 100644
--- a/gcc/testsuite/gfortran.dg/alloc_comp_class_4.f03
+++ b/gcc/testsuite/gfortran.dg/alloc_comp_class_4.f03
@@ -51,14 +51,14 @@ contains
 type(t), value :: d
   end subroutine
 
-  type(c) function c_init() !
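
The effect of reordering the two else-if branches in resolve.cc can be
sketched with a toy model.  This is an illustration only: toy_attr and
toy_action are hypothetical, the fields loosely mirror the conditions
visible in the diff, and the branches that precede these two in
resolve_symbol are left out.

```cpp
#include <cassert>

// Loose mirror of the attributes tested in the two branches.
struct toy_attr
{
  bool function;
  bool pointer;
  bool allocatable;
  bool use_assoc;
  bool has_result;
  bool is_private;
  bool has_alloc_or_pointer_comp;
};

enum toy_action { NOTHING, MARK_REFERENCED, DEFAULT_INIT };

// Order before the patch: the "mark referenced" branch comes first and
// shadows default initialization for results with such components.
toy_action resolve_before (const toy_attr &a)
{
  if (a.function && a.has_result && !a.is_private
      && a.has_alloc_or_pointer_comp)
    return MARK_REFERENCED;
  else if (a.function && !a.pointer && !a.allocatable && !a.use_assoc
           && a.has_result)
    return DEFAULT_INIT;
  return NOTHING;
}

// Order after the patch: default initialization is tried first.
toy_action resolve_after (const toy_attr &a)
{
  if (a.function && !a.pointer && !a.allocatable && !a.use_assoc
      && a.has_result)
    return DEFAULT_INIT;
  else if (a.function && a.has_result && !a.is_private
           && a.has_alloc_or_pointer_comp)
    return MARK_REFERENCED;
  return NOTHING;
}
```

A non-pointer, non-allocatable function result whose type has allocatable
or pointer components takes the MARK_REFERENCED branch in the old order and
the DEFAULT_INIT branch in the new one, which matches the behaviour change
described in the commit message.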

Re: [PATCH v2 2/2] emit-rtl: Validate mode for paradoxical hardware subregs [PR119966]

2025-05-15 Thread Andrew Pinski

On Thu, May 15, 2025 at 12:34 PM Dimitar Dimitrov  wrote:
>
> After r16-160-ge6f89d78c1a752, late_combine2 started transforming the
> following RTL for pru-unknown-elf:
>
>   (insn 3949 3948 3951 255 (set (reg:QI 56 r14.b0 [orig:1856 _619 ] [1856])
>   (and:QI (reg:QI 1 r0.b1 [orig:1855 _201 ] [1855])
>   (const_int 3 [0x3])))
>(nil))
>   ...
>   (insn 3961 7067 3962 255 (set (reg:SI 56 r14.b0)
>   (zero_extend:SI (reg:QI 56 r14.b0 [orig:1856 _619 ] [1856])))
>(nil))
>
> into:
>
>   (insn 3961 7067 3962 255 (set (reg:SI 56 r14.b0)
>   (and:SI (subreg:SI (reg:QI 1 r0.b1 [orig:1855 _201 ] [1855]) 0)
>   (const_int 3 [0x3])))
>(nil))
>
> That caused libbacktrace build to break for pru-unknown-elf.  Register
> r0.b1 (regno 1) is not valid for SImode, which validate_subreg failed to
> reject.
>
> Fix by calling HARD_REGNO_MODE_OK to ensure that both inner and outer
> modes are valid for the hardware subreg.
>
> This patch fixes the broken PRU toolchain build.  It leaves only two
> test case regressions for PRU, caused by rnreg pass renaming a valid
> paradoxical subreg into an invalid one.
>   gcc.c-torture/execute/20040709-1.c
>   gcc.c-torture/execute/20040709-2.c
>
> PR target/119966
>
> gcc/ChangeLog:
>
> * emit-rtl.cc (validate_subreg): Validate inner
> and outer mode for paradoxical hardware subregs.
>
> Co-authored-by: Andrew Pinski 

Note this should be `Andrew Pinski ` for
legal requirements.

Thanks,
Andrew

> Signed-off-by: Dimitar Dimitrov 
> ---
>  gcc/emit-rtl.cc | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc
> index e46b0f9eac4..6c5d9b55508 100644
> --- a/gcc/emit-rtl.cc
> +++ b/gcc/emit-rtl.cc
> @@ -983,6 +983,9 @@ validate_subreg (machine_mode omode, machine_mode imode,
>if ((COMPLEX_MODE_P (imode) || VECTOR_MODE_P (imode))
>   && GET_MODE_INNER (imode) == omode)
> ;
> +  else if (!targetm.hard_regno_mode_ok (regno, imode)
> +  || !targetm.hard_regno_mode_ok (regno, omode))
> +   return false;
>else if (!REG_CAN_CHANGE_MODE_P (regno, imode, omode))
> return false;
>
> --
> 2.49.0
>

Re: [PATCH] c++: Further simplify the stdlib inline folding

2025-05-15 Thread Ville Voutilainen

On Thu, 15 May 2025 at 18:32, Ville Voutilainen
 wrote:
>
> On Thu, 15 May 2025 at 18:19, Jason Merrill  wrote:
>
> > > @@ -3347,8 +3347,6 @@ cp_fold (tree x, fold_flags_t flags)
> > >   || id_equal (DECL_NAME (callee), "as_const")))
> > > {
> > >   r = CALL_EXPR_ARG (x, 0);
> > > - if (!same_type_p (TREE_TYPE (x), TREE_TYPE (r)))
> > > -   r = build_nop (TREE_TYPE (x), r);
> >
> > This is removing the conversion entirely; I'm rather surprised it didn't
> > break anything.  I thought you were thinking to make the build_nop
> > unconditional.
>
> Oops. Yes, that makes more sense. I am confused how that build_nop
> actually works, but it indeed should
> convert r to x, and not be completely nuked. Re-doing...

So, let's try this again. As discussed privately, it is hard, to the point
that we don't know how, to write a test that triggers a difference depending
on whether that build_nop is there, but this patch makes more sense.

Tested on Linux-PPC64 (gcc112). Ok for trunk?

  Further simplify the stdlib inline folding

   gcc/cp/ChangeLog:
   * cp-gimplify.cc (cp_fold): Do the conversion
unconditionally, even for same-type cases.

   gcc/ChangeLog:
   * doc/invoke.texi: Add to_underlying to -ffold-simple-inlines.
diff --git a/gcc/cp/cp-gimplify.cc b/gcc/cp/cp-gimplify.cc
index eab55504b05..f7bd453bc5e 100644
--- a/gcc/cp/cp-gimplify.cc
+++ b/gcc/cp/cp-gimplify.cc
@@ -3347,8 +3347,7 @@ cp_fold (tree x, fold_flags_t flags)
 		|| id_equal (DECL_NAME (callee), "as_const")))
 	  {
 	r = CALL_EXPR_ARG (x, 0);
-	if (!same_type_p (TREE_TYPE (x), TREE_TYPE (r)))
-	  r = build_nop (TREE_TYPE (x), r);
+	r = build_nop (TREE_TYPE (x), r);
 	x = cp_fold (r, flags);
 	break;
 	  }
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index ee7180110e1..83c63ce6ae5 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -3348,7 +3348,8 @@ aliases, the default is @option{-fno-extern-tls-init}.
 @item -ffold-simple-inlines
 @itemx -fno-fold-simple-inlines
 Permit the C++ frontend to fold calls to @code{std::move}, @code{std::forward},
-@code{std::addressof} and @code{std::as_const}.  In contrast to inlining, this
+@code{std::addressof}, @code{std::to_underlying}
+and @code{std::as_const}.  In contrast to inlining, this
 means no debug information will be generated for such calls.  Since these
 functions are rarely interesting to debug, this flag is enabled by default
 unless @option{-fno-inline} is active.
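
For context on why these particular functions are candidates for folding,
the sketch below checks that each of them behaves like a plain cast or
address-of on its argument.  It says nothing about how cp_fold is
implemented.  toy_colour and toy_to_underlying are hypothetical names; the
latter is a hand-written equivalent of std::to_underlying so that the
example also builds as C++17.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

enum class toy_colour : unsigned char { red = 1, green = 2 };

// Hand-written equivalent of std::to_underlying (which is C++23).
template<typename E>
constexpr std::underlying_type_t<E> toy_to_underlying (E e) noexcept
{
  return static_cast<std::underlying_type_t<E>> (e);
}

// The result type is exactly the enum's underlying type.
static_assert (std::is_same_v<decltype (toy_to_underlying (toy_colour::red)),
                              unsigned char>,
               "to_underlying yields the underlying type");

// std::move only changes the value category; it refers to the same object.
bool same_object_after_move (int &x)
{
  int &&r = std::move (x);
  return &r == &x;
}

// std::as_const only adds const; it refers to the same object.
bool same_object_after_as_const (int &x)
{
  const int &c = std::as_const (x);
  return &c == &x;
}

// std::addressof yields the same address as built-in & for an int.
bool addressof_matches (int &x)
{
  return std::addressof (x) == &x;
}
```

Since none of these calls does more than a conversion, replacing the call
with a conversion of the argument to the call's type keeps the value
unchanged, which is the role the build_nop in the patch plays.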

[WWWDOCS, COMMITTED] Update git repository docs after creation of devel/omp/gcc-15.

2025-05-15 Thread Sandra Loosemore

   * git.html: Note that devel/omp/gcc-15 exists, and that the
   corresponding gcc-14 branch is now stale.
---
 htdocs/git.html | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/htdocs/git.html b/htdocs/git.html
index 8edaa254..0b55a970 100644
--- a/htdocs/git.html
+++ b/htdocs/git.html
@@ -280,17 +280,19 @@ in Git.
   Makarov mailto:vmaka...@redhat.com";>vmaka...@redhat.com.
   
 
-  https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;a=shortlog;h=refs/heads/devel/omp/gcc-14";>devel/omp/gcc-14
+  https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;a=shortlog;h=refs/heads/devel/omp/gcc-15";>devel/omp/gcc-15
   This branch is for collaborative development of
   https://gcc.gnu.org/wiki/OpenACC";>OpenACC and
   https://gcc.gnu.org/wiki/openmp";>OpenMP support and related
   functionality, such
   as https://gcc.gnu.org/wiki/Offloading";>offloading support (OMP:
   offloading and multi processing).
-  The branch is based on releases/gcc-14.
-  Please send patch emails with a short-hand [og14] tag in the
-  subject line, and use ChangeLog.omp files. (Likewise but now
-  stale branches exists for the prior GCC releases 9 to 13.)
+  The branch is based on releases/gcc-15.
+  Please send patch emails with a short-hand [og15] tag in the
+  subject line.  This branch uses ChangeLog.omp files
+  that are occasionally updated automatically from the commit messages, so
+  please ensure that your commit messages include a valid ChangeLog section.
+  (Stale branches also exist for the prior GCC releases 9 to 14.)
 
   unified-autovect
   This branch is for work on improving effectiveness and generality of 
GCC's
@@ -904,13 +906,14 @@ merged.
   https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;a=shortlog;h=refs/heads/devel/omp/gcc-11";>devel/omp/gcc-11
   https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;a=shortlog;h=refs/heads/devel/omp/gcc-12";>devel/omp/gcc-12
   https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;a=shortlog;h=refs/heads/devel/omp/gcc-13";>devel/omp/gcc-13
+  https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;a=shortlog;h=refs/heads/devel/omp/gcc-14";>devel/omp/gcc-14
   These branches were used for collaborative development of
   https://gcc.gnu.org/wiki/OpenACC";>OpenACC and
   https://gcc.gnu.org/wiki/openmp";>OpenMP support and related
   functionality as the successors to openacc-gcc-9-branch after the move to
   Git.
   The branches were based on releases/gcc-9, releases/gcc-10, etc.
-  Development has now moved to the devel/omp/gcc-14 branch.
+  Development has now moved to the devel/omp/gcc-15 branch.
 
   hammer-3_3-branch
   The goal of this branch was to have a stable compiler based on GCC 3.3
-- 
2.34.1

Re: [PATCH] RISC-V: Fix the warning of temporary object dangling references.

2025-05-15 Thread Jeff Law

On 5/15/25 8:35 PM, Kito Cheng wrote:
> Hm, it really doesn't make too much sense to get that warning, but
> I can reproduce that when I compile with gcc 13 (and newer)...and
> seems like a known issue [1][2]...

I still hadn't managed to convince myself the code was wrong.  It looked
more like the change was papering over a bug in the warning itself.

The referenced bug is closed as fixed, but there was a link (from the
stack overflow discussion) to a testcase in godbolt that was still failing.

> However I don't really like that approach, could you change the
> argument type of get_riscv_ext_info to `const char *` to suppress that
> warning instead?

If that works, it'd be better IMHO.
jeff

[to-be-committed][RISC-V] Avoid setting output object more than once in IOR/XOR synthesis

2025-05-15 Thread Jeff Law

While evaluating Shreya's logical AND synthesis work on spec2017 I ran 
into a code quality regression where combine was failing to eliminate a 
redundant sign extension.


I had a hunch the problem would be with the multiple sets of the same 
pseudo register in the AND synthesis path.  I was right that the problem 
was multiple sets of the same pseudo, but it was actually some of the 
splitters in the RISC-V backend that were the culprit.  Those multiple 
sets caused the sign bit tracking code to need to make conservative 
assumptions thus resulting in failure to eliminate the unnecessary sign 
extension.


So before we start moving on the logical AND patch we're going to do 
some cleanups.


There are multiple moving parts in play.  For example, we have splitters 
which do multiple sets of the output register.  Fixing some of those 
independently would result in a code quality regression.  Instead they 
need some adjustments to or removal of mvconst_internal.  Of course 
getting rid of mvconst_internal will trigger all kinds of code quality 
regressions right now which ultimately lead back to the need to revamp 
the logical AND expander.  Point being we've got some circular 
dependencies and breaking them may result in short term code quality 
regressions.  I'll obviously try to avoid those as much as possible.


So to start the process this patch adjusts the recently added XOR/IOR 
synthesis to avoid re-using the destination register.  While the reuse 
was clearly safe from a semantic standpoint, various parts of the 
compiler can do a better job for pseudos that are only set once.


Given this synthesis path should only be active during initial RTL 
generation, we can create new pseudos at will, so we create a new one 
for each insn.  At the end of the sequence we copy from the last set 
into the final destination.
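
As a rough illustration of the "one fresh pseudo per insn" idea, here is a minimal sketch in plain C++ rather than RTL. It models the pre-flip path from the patch: flip the bits of the inverted constant one at a time, then invert the whole word. The function name `xor_via_preflip` and the use of 64-bit integers in place of pseudo registers are assumptions made for illustration only; nothing here is GCC code.

```cpp
#include <cstdint>

// Hypothetical helper, not part of GCC.  Computes x ^ c by flipping the
// bits of ~c one at a time and then inverting the whole word.  Every
// intermediate result gets its own temporary that is written exactly
// once, mirroring the "new pseudo per insn" approach.
constexpr std::uint64_t xor_via_preflip(std::uint64_t x, std::uint64_t c) {
  std::uint64_t input = x;   // plays the role of "input" in the patch
  std::uint64_t ival = ~c;   // bits that are flipped individually
  while (ival != 0) {
    const std::uint64_t bit = ival & (~ival + 1);  // lowest set bit
    const std::uint64_t output = input ^ bit;      // fresh temporary, set once
    input = output;                                // the next step reads it
    ival &= ~bit;
  }
  // The final NOT is the only write to the destination.
  const std::uint64_t dest = ~input;
  return dest;
}

// ~(x ^ ~c) == x ^ c, so the sequence is equivalent to a plain XOR.
static_assert(xor_via_preflip(0x1234u, ~std::uint64_t{0x5})
                  == (0x1234u ^ ~std::uint64_t{0x5}),
              "pre-flip sequence matches a direct XOR");
```

The sketch only demonstrates the arithmetic identity and the single-assignment shape of the sequence; whether a given constant is actually worth synthesizing this way is decided elsewhere in the backend.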


This has various trivial impacts on the code generation, but the 
resulting code looks no better or worse to me across spec2017.


This has been tested in my tester and is currently bootstrapping on my 
BPI.  Waiting on data from the pre-commit tester before moving forward...


Jeff

* config/riscv/riscv.cc (synthesize_ior_xor): Avoid writing
operands[0] more than once, use new pseudos instead.

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index d996965d095..b908c4684ac 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -14271,17 +14271,21 @@ synthesize_ior_xor (rtx_code code, rtx operands[3])
 {
   /* Pre-flipping bits we want to preserve.  */
   rtx input = operands[1];
+  rtx output = NULL_RTX;
   ival = ~INTVAL (operands[2]);
   while (ival)
{
  HOST_WIDE_INT tmpval = HOST_WIDE_INT_UC (1) << ctz_hwi (ival);
  rtx x = GEN_INT (tmpval);
  x = gen_rtx_XOR (word_mode, input, x);
- emit_insn (gen_rtx_SET (operands[0], x));
- input = operands[0];
+ output = gen_reg_rtx (word_mode);
+ emit_insn (gen_rtx_SET (output, x));
+ input = output;
  ival &= ~tmpval;
}
 
+  gcc_assert (output);
+
   /* Now flip all the bits, which restores the bits we were
 preserving.  */
   rtx x = gen_rtx_NOT (word_mode, input);
@@ -14304,23 +14308,29 @@ synthesize_ior_xor (rtx_code code, rtx operands[3])
   int msb = BITS_PER_WORD - 1 - clz_hwi (ival);
   if (msb - lsb + 1 <= 11)
{
+ rtx output = gen_reg_rtx (word_mode);
+ rtx input = operands[1];
+
  /* Rotate the source right by LSB bits.  */
  rtx x = GEN_INT (lsb);
- x = gen_rtx_ROTATERT (word_mode, operands[1], x);
- emit_insn (gen_rtx_SET (operands[0], x));
+ x = gen_rtx_ROTATERT (word_mode, input, x);
+ emit_insn (gen_rtx_SET (output, x));
+ input = output;
 
  /* Shift the constant right by LSB bits.  */
  x = GEN_INT (ival >> lsb);
 
  /* Perform the IOR/XOR operation.  */
- x = gen_rtx_fmt_ee (code, word_mode, operands[0], x);
- emit_insn (gen_rtx_SET (operands[0], x));
+ x = gen_rtx_fmt_ee (code, word_mode, input, x);
+ output = gen_reg_rtx (word_mode);
+ emit_insn (gen_rtx_SET (output, x));
+ input = output;
 
  /* And rotate left to put everything back in place, we don't
 have rotate left by a constant, so use rotate right by
 an adjusted constant.  */
  x = GEN_INT (BITS_PER_WORD - lsb);
- x = gen_rtx_ROTATERT (word_mode, operands[1], x);
+ x = gen_rtx_ROTATERT (word_mode, input, x);
  emit_insn (gen_rtx_SET (operands[0], x));
  return true;
}
@@ -14341,22 +14351,28 @@ synthesize_ior_xor (rtx_code code, rtx operands[3])
   if ((INTVAL (operands[2]) & HOST_WIDE_INT_UC (0x7ff)) != 0
  && msb - lsb + 1 <= 11)
{
+ rtx output = gen_reg_rtx (word_mode);
+ rtx input = operands[1];
+
  /* Rota

[PATCH v2] libstdc++: Fix preprocessor check for __float128 formatter [PR119246]

2025-05-15 Thread Tomasz Kamiński

The previous check `_GLIBCXX_FORMAT_F128 != 1` was passing if
_GLIBCXX_FORMAT_F128 was not defined, i.e. evaluated to zero.

This broke sparc-sun-solaris2.11 and x86_64-darwin.

PR libstdc++/119246

libstdc++-v3/ChangeLog:

* include/std/format: Updated check for _GLIBCXX_FORMAT_F128.
---
 libstdc++-v3/include/std/format | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libstdc++-v3/include/std/format b/libstdc++-v3/include/std/format
index f0b0252255d..bfda5895e0c 100644
--- a/libstdc++-v3/include/std/format
+++ b/libstdc++-v3/include/std/format
@@ -2973,7 +2973,7 @@ namespace __format
 };
 #endif
 
-#if defined(__SIZEOF_FLOAT128__) && _GLIBCXX_FORMAT_F128 != 1
+#if defined(__SIZEOF_FLOAT128__) && _GLIBCXX_FORMAT_F128 > 1
   // Reuse __formatter_fp::format<__format::__flt128_t, Out> for __float128.
   // This formatter is not declared if _GLIBCXX_LONG_DOUBLE_ALT128_COMPAT is 
true,
   // as __float128 when present is same type as __ieee128, which may be same as
-- 
2.49.0
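
The preprocessor behaviour behind this fix can be reproduced in isolation. In a `#if` expression, an identifier that is not defined as a macro is replaced by 0, so `!= 1` holds while `> 1` does not. The following minimal sketch uses `FORMAT_F128_DEMO`, a hypothetical macro that is deliberately left undefined to stand in for `_GLIBCXX_FORMAT_F128` on an affected target.

```cpp
// FORMAT_F128_DEMO is a hypothetical name and is intentionally never
// defined; inside #if it therefore evaluates to 0.

#if FORMAT_F128_DEMO != 1   // 0 != 1 holds, so the old check passes
constexpr bool old_check_passes = true;
#else
constexpr bool old_check_passes = false;
#endif

#if FORMAT_F128_DEMO > 1    // 0 > 1 does not hold, so the new check fails
constexpr bool new_check_passes = true;
#else
constexpr bool new_check_passes = false;
#endif

static_assert(old_check_passes, "undefined macro satisfies '!= 1'");
static_assert(!new_check_passes, "undefined macro does not satisfy '> 1'");
```

Building with `-Wundef` would flag both conditions, which is one way to catch this class of mistake early.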

Re: [PATCH v2] libstdc++: Fix preprocessor check for __float128 formatter [PR119246]

2025-05-15 Thread Tomasz Kaminski

On Thu, May 15, 2025 at 9:17 AM Tomasz Kamiński  wrote:

> The previous check `_GLIBCXX_FORMAT_F128 != 1` was passing if
> _GLIBCXX_FORMAT_F128 was not defined, i.e. evaluated to zero.
>
> This broke sparc-sun-solaris2.11 and x86_64-darwin.
>
> PR libstdc++/119246
>
> libstdc++-v3/ChangeLog:
>
> * include/std/format: Updated check for _GLIBCXX_FORMAT_F128.
> ---
>
This was committed to the trunk.

>  libstdc++-v3/include/std/format | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/libstdc++-v3/include/std/format
> b/libstdc++-v3/include/std/format
> index f0b0252255d..bfda5895e0c 100644
> --- a/libstdc++-v3/include/std/format
> +++ b/libstdc++-v3/include/std/format
> @@ -2973,7 +2973,7 @@ namespace __format
>  };
>  #endif
>
> -#if defined(__SIZEOF_FLOAT128__) && _GLIBCXX_FORMAT_F128 != 1
> +#if defined(__SIZEOF_FLOAT128__) && _GLIBCXX_FORMAT_F128 > 1
>// Reuse __formatter_fp::format<__format::__flt128_t, Out> for
> __float128.
>// This formatter is not declared if _GLIBCXX_LONG_DOUBLE_ALT128_COMPAT
> is true,
>// as __float128 when present is same type as __ieee128, which may be
> same as
> --
> 2.49.0
>
>

Re: [PATCH] c++/modules: Fix handling of -fdeclone-ctor-dtor with explicit instantiations [PR120125]

2025-05-15 Thread Jason Merrill


On 5/14/25 6:26 AM, Nathaniel Shead wrote:

On Tue, May 13, 2025 at 12:40:30PM -0400, Jason Merrill wrote:

On 5/9/25 11:48 AM, Nathaniel Shead wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk/15?

One slight concern I have is why we end up in 'maybe_thunk_body' to
start with: the imported constructor isn't DECL_ONE_ONLY (as it's
external) and so 'can_alias_cdtor' returns false.


That does seem like a problem; can_alias_cdtor shouldn't change due to
explicit instantiation.


The change in
write_function_def (which I believe is necessary regardless) hides this
because we never actually emit the function definitions, but I worry
that this might somehow affect LTO, though I haven't been able to
construct a testcase which fails.  To avoid the potential of that we
could do something like this to mark such functions as DECL_ONE_ONLY
just in case; thoughts?

diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index e7782627a49..c96e81aceef 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -16738,9 +16738,15 @@ module_state::read_cluster (unsigned snum)
 /* Make sure we emit explicit instantiations.
FIXME do we want to do this in expand_or_defer_fn instead?  */
-  if (DECL_EXPLICIT_INSTANTIATION (decl)
-  && !DECL_EXTERNAL (decl))
-setup_explicit_instantiation_definition_linkage (decl);
+  if (DECL_EXPLICIT_INSTANTIATION (decl))
+{
+  if (DECL_DECLARED_INLINE_P (decl)
+  && TREE_PUBLIC (decl)
+  && DECL_MAYBE_IN_CHARGE_CDTOR_P (decl))
+maybe_make_one_only (decl);


This seems fragile, we shouldn't need to reproduce this specific behavior
here.


Agreed.


How is this case different from an extern template in a single TU?


It looks like the difference is that in a single TU, an extern template
is never marked as DECL_WEAK, and so in can_alias_cdtor we check
DECL_INTERFACE_KNOWN && !DECL_ONE_ONLY instead.  In the modules case,
the explicit instantiation definition is marked as DECL_WEAK by
make_decl_one_only (in setup_explicit_instantiation_definition_linkage),
which we inherit.

So perhaps we should additionally clear DECL_WEAK on export for explicit
instantiations?  Or maybe DECL_WEAK should only be set when we import
the definition, as with DECL_NOT_REALLY_EXTERN.  I'm not sure if any
other of the visibility flags will also need special treatment (maybe
DECL_COMDAT?).


I think it makes sense for us to end up on import with the same result 
as calling mark_decl_instantiated (true) in the first place, whatever 
that is.  Perhaps with an undo_explicit_instantiation_definition_linkage 
function.


In particular, I think we want to avoid DECL_WEAK on external references 
so we aren't emitting weak references.


Jason

[committed] cobol: Don't display 0xFF HIGH-VALUE characters in testcases. [PR120251]

2025-05-15 Thread Robert Dubner

 


0001-cobol-Don-t-display-0xFF-HIGH-VALUE-characters-in-te.patch
Description: Binary data

Re: [PING][PATCH][GCC15] Alpha: Fix base block alignment calculation regression

2025-05-15 Thread Jeff Law





On 5/12/25 11:21 AM, Maciej W. Rozycki wrote:

On Tue, 25 Feb 2025, Maciej W. Rozycki wrote:


Address this issue by recursing into COMPONENT_REF tree nodes until the
outermost one has been reached, which is supposed to be a MEM_REF one,
accumulating the offset as we go, fixing a commit e0dae4da4c45 ("Alpha:
Also use tree information to get base block alignment") regression.


  GCC 15 backport ping for:
.

OK.
jeff

Re: [PATCH v4 1/2] RISC-V: Support RISC-V Profiles 20/22.

2025-05-15 Thread Palmer Dabbelt


On Sat, 10 May 2025 09:42:16 PDT (-0700), jeffreya...@gmail.com wrote:



On 5/10/25 6:30 AM, Jiawei wrote:

This patch introduces support for RISC-V Profiles RV20 and RV22 [1],
enabling developers to utilize these profiles through the -march option.

[1] https://github.com/riscv/riscv-profiles/releases/tag/v1.0

Version log:
Using lowercase letters to present Profiles.
Using '_' as the divisor between Profiles and other RISC-V extensions.
Add descriptions in invoke.texi.
Checking if there exist '_' between Profiles and additional extensions.
Using std::string to avoid memory problems.

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc (struct riscv_profiles): New 
struct.
(riscv_subset_list::parse_profiles): New parser.
(riscv_subset_list::parse_base_ext): Ditto.
* config/riscv/riscv-subset.h: New def.
* doc/invoke.texi: New option descriptions.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/arch-49.c: New test.
* gcc.target/riscv/arch-50.c: New test.
* gcc.target/riscv/arch-51.c: New test.
* gcc.target/riscv/arch-52.c: New test.

Both patches in this series are OK.  Please install.


Looks like they did.  I'm not trying to argue with that, but I was 
talking to some distro people and this came up.  So I figured I'd leave 
a bit of a summary here in case anyone else goes down the same rabbit 
hole:


Supporting these profile arguments really has nothing to do with whether 
or not distros can target a profile.  The profiles are just a short name 
for a bunch of ISA extensions, vendors still self-certify their 
implementations and can claim profile/extension compatibility without 
actually implementing the behaviors defined by the specification.


We do our best to work around these sorts of issues in software land 
(mostly handling the traps in the kernel), but it's really a losing 
battle.  At best we get something correct long after the systems ship, 
usually with some disastrous impact on performance.  I don't think 
there's really a way to change that on the SW side of things, it's just 
the natural end result of this weak stance on compatibility in RISC-V 
land.


If users want to generate code that runs correctly and/or performs well 
well on a set of RISC-V systems, they need to go find the common ground 
of what those systems actually implement (ie, not what the marketing 
material says) and then go target those extensions.  That's a bunch of 
work for users, but it's just where we are right now.


When this came up a few years ago we decided not to put some sort of 
"target all the hardware that shipped in 2022" type argument for GCC 
because we didn't have a good enough idea of what distros are going to 
want to target.  That still seems reasonable to me, but with binary 
distros in the pipeline it looks like we're going to be figuring that 
out sooner rather than later.



Thanks again,
Jeff

Re: [PATCH v2] libstdc++: Preserve the argument type in basic_format_args [PR119246]

2025-05-15 Thread Tomasz Kaminski

On Thu, May 15, 2025 at 9:45 PM Rainer Orth 
wrote:

> Hi Tomasz,
>
> >> I've a local patch in tree to support __float128 on SPARC, so I'll try
> >> with an unmodified tree first.  However, 2 days ago I could bootstrap
> >> with that included just fine.
> >>
> > Is __float128 a distinct type from long double, in case when both are
> > 128bit?
> > I have assumed that this would be the case, and this specialization is
> > meant to avoid generating separate formatting code for both of them.
>
> I'm using
>
> /* Create builtin type for __float128.  */
>
> static void
> sparc_float128_init_builtins (void)
> {
>   /* With 128-bit long double, the __float128 type is a synonym for
>  "long double".  */
>   lang_hooks.types.register_builtin_type (long_double_type_node,
> "__float128");
> }
>
> I guess I should file a PR for this, attaching the current patch, to
> discuss the remaining issues (fetestexcept failures in the testsuite, on
> both Solaris/sparc and Linux/sparc64).
>
I am not aware of architectures where the additional specialization for
__float128
would be useful, so if my patch changing _GLIBCXX_FORMAT_F128 > 1 to
_GLIBCXX_FORMAT_F128 == 2
fixes your build, I can submit it.

>
> Rainer
>
> --
>
> -
> Rainer Orth, Center for Biotechnology, Bielefeld University
>
>

Re: [PATCH v1] libstdc++: Fix class mandate for extents.

2025-05-15 Thread Luc Grosheintz





On 5/15/25 5:12 PM, Jonathan Wakely wrote:

On Thu, 15 May 2025 at 16:12, Jonathan Wakely  wrote:


On Thu, 15 May 2025 at 16:11, Luc Grosheintz  wrote:


Without would make sense to me, because whenever I wrote an
identifier with _ I felt like I was presenting the user with
a name that they shouldn't know about.

A pedantic question: Can I also fix the line below, or do
you prefer to be strict about one semantic change per commit?


We don't need to be that strict. Don't worry, I've already changed it
locally and will push it for you.


To be clear, we do prefer separate changes to be separate commits, but
I think "add a static assert and tweak the one on the line below" is
fine to consider as a single change.


Because I've left out other opportunities in the spirit of keeping
the number of emails down, I'll do so now:

Thank you for fixing this; thanks to Jonathan and Tomasz for the
very kind welcome. And finally the assurance that I aim to keep
the additional work I cause you as low as possible and if it's
easier for you to send something back for me to fix, please do so.









On 5/15/25 2:40 PM, Tomasz Kaminski wrote:

No strong preference, but Ville's argument sounds reasonable.

On Thu, May 15, 2025 at 12:25 PM Ville Voutilainen <
ville.voutilai...@gmail.com> wrote:


Mild preference against; use the names from the standard, not the
implementation, in such diagnostics.

to 15. toukok. 2025 klo 13.20 Jonathan Wakely 
kirjoitti:


On Thu, 15 May 2025 at 11:14, Jonathan Wakely  wrote:


On Wed, 14 May 2025 at 20:18, Luc Grosheintz 

wrote:


The standard states that the IndexType must be a signed or unsigned
integer. This mandate was implemented using `std::is_integral_v`, which
also includes (among others) char and bool, which are neither signed nor
unsigned integers.
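
To make the distinction concrete, here is a minimal sketch of a trait that accepts only the standard signed and unsigned integer types. The names `is_one_of` and `is_signed_or_unsigned_integer` are hypothetical and are not libstdc++'s internal `__is_standard_integer`; extended integer types, which the standard also permits, are left out for brevity.

```cpp
#include <type_traits>

// Hypothetical helper: true if T is the same type as one of Candidates.
template <typename T, typename... Candidates>
struct is_one_of : std::disjunction<std::is_same<T, Candidates>...> {};

// Hypothetical trait: accepts only the standard signed and unsigned
// integer types, so bool and the character types are rejected.
template <typename T>
struct is_signed_or_unsigned_integer
    : is_one_of<std::remove_cv_t<T>,
                signed char, short, int, long, long long,
                unsigned char, unsigned short, unsigned int,
                unsigned long, unsigned long long> {};

// std::is_integral_v is too permissive for this mandate.
static_assert(std::is_integral_v<char>, "char is an integral type");
static_assert(std::is_integral_v<bool>, "bool is an integral type");

// The stricter trait rejects both, but still accepts real integers.
static_assert(!is_signed_or_unsigned_integer<char>::value, "char rejected");
static_assert(!is_signed_or_unsigned_integer<bool>::value, "bool rejected");
static_assert(is_signed_or_unsigned_integer<int>::value, "int accepted");
static_assert(is_signed_or_unsigned_integer<unsigned long>::value,
              "unsigned long accepted");
```

Note that plain `char` is a distinct type from both `signed char` and `unsigned char`, which is why it falls outside the list even though those two are accepted.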

libstdc++-v3/ChangeLog:

  * include/std/mdspan: Implement the mandate for extents as
  signed or unsigned integer and not any integral type.
  *

testsuite/23_containers/mdspan/extents/class_mandates_neg.cc: Check

  that extents and extents are invalid.
  * testsuite/23_containers/mdspan/extents/misc.cc: Update
  tests to avoid `char` and `bool` as IndexType.
---
   libstdc++-v3/include/std/mdspan|  3 ++-
   .../23_containers/mdspan/extents/class_mandates_neg.cc | 10

+++---

   .../testsuite/23_containers/mdspan/extents/misc.cc |  8 
   3 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/libstdc++-v3/include/std/mdspan

b/libstdc++-v3/include/std/mdspan

index aee96dda7cd..22509d9c8f4 100644
--- a/libstdc++-v3/include/std/mdspan
+++ b/libstdc++-v3/include/std/mdspan
@@ -163,7 +163,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 template
   class extents
   {
-  static_assert(is_integral_v<_IndexType>, "_IndexType must be

integral.");

+  static_assert(__is_standard_integer<_IndexType>::value,
+   "_IndexType must be a signed or unsigned

integer.");


GCC's diagnostics never end with a full stop (aka period), and we
follow that convention for our static assertions.

So I'll remove the '.' at the end of the string literal, and then push
this to trunk.



 static_assert(
(__mdspan::__valid_static_extent<_Extents, _IndexType> &&

...),

"Extents must either be dynamic or representable as

_IndexType");

I've just noticed that this static_assert refers to "Extents" without
the leading underscore, but "_IndexType" with a leading underscore.

I think it's OK to omit the leading underscore, it might be a bit more
user-friendly and I don't think anybody will be confused by the fact
it's not identical to the real template parameter. But we should
either do it consistently for _Extents and _IndexType or for neither
of them.

Anybody want to argue for or against underscores?
