[Patch] omp-general.cc: Remove 'if' around call to always 'true' returning function [PR118627]

2025-03-25 Thread Tobias Burnus

I intend to commit this a bit later today as obvious,
unless there are comments.

This is about two functions that unconditionally return 'true'.
Both (also) appear in an 'if' condition, which is pointless.

(The second one also appears in two other calls without an
'if', so much for consistency.)

Besides making the code less readable and a tad slower, one
'if' clause also led to a Clang warning, as the variable is
only initialized in the 'true' case.

Thanks to David for reporting, to Xi for a first analysis and
to Kaaden for writing RFC patches in the PR.

Tobias
omp-general.cc: Remove 'if' around call to always 'true' returning function [PR118627]

Previously, omp_parse_access_method and omp_parse_access_methods unconditionally
returned true; now they are void functions.
Accordingly, calls had to be updated by removing the 'if' around the call;
this also fixes Clang's -Wsometimes-uninitialized warning when compiling
omp-general.cc, as one variable remained uninitialized for the never-occurring
false case.
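
For illustration only, here is a minimal sketch of the pattern the commit message describes. The names `parse_kind` and `classify` are hypothetical stand-ins, not the functions in omp-general.cc; the old shape appears only in a comment.

```cpp
#include <cassert>

// Hypothetical helper.  In the old shape it would have been declared
// 'bool' and ended with an unconditional 'return true;'.
void parse_kind(int input, int *kind)
{
  *kind = (input < 0) ? 0 : 1;
}

// Old shape (for comparison, as a comment only):
//
//   int result;
//   if (parse_kind_old (input, &kind))   // always true
//     result = kind * 10;
//   return result;                       // Clang: may be uninitialized
//
// New shape: call the void function, then assign unconditionally.
int classify(int input)
{
  int kind;
  int result;
  parse_kind(input, &kind);
  result = kind * 10;
  return result;
}
```

With the unconditional assignment, no path reaches the `return` with `result` unset, so the compiler has nothing to warn about.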

gcc/ChangeLog:

	PR middle-end/118627

	* omp-general.cc (omp_parse_access_method): Change to return void.
	(omp_parse_access_methods): Return void; remove 'if' around a
	function call.
	(omp_parse_expr): Remove 'if' around a function call.

 gcc/omp-general.cc | 13 +
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/gcc/omp-general.cc b/gcc/omp-general.cc
index 0a2dd6b5be7..0b7c3b9d318 100644
--- a/gcc/omp-general.cc
+++ b/gcc/omp-general.cc
@@ -4183,11 +4183,11 @@ omp_parse_pointer (tree *expr0, bool *has_offset)
 }
 
   return false;
 }
 
-static bool
+static void
 omp_parse_access_method (tree *expr0, enum access_method_kinds *kind)
 {
   tree expr = *expr0;
   bool has_offset;
 
@@ -4214,32 +4214,30 @@ omp_parse_access_method (tree *expr0, enum access_method_kinds *kind)
 *kind = ACCESS_DIRECT;
 
   STRIP_NOPS (expr);
 
   *expr0 = expr;
-  return true;
 }
 
-static bool
+static void
 omp_parse_access_methods (vec<omp_addr_token *> &addr_tokens, tree *expr0)
 {
   tree expr = *expr0;
   enum access_method_kinds kind;
   tree am_expr;
 
-  if (omp_parse_access_method (&expr, &kind))
-am_expr = expr;
+  omp_parse_access_method (&expr, &kind);
+  am_expr = expr;
 
   if (TREE_CODE (expr) == INDIRECT_REF
   || TREE_CODE (expr) == MEM_REF
   || TREE_CODE (expr) == ARRAY_REF)
 omp_parse_access_methods (addr_tokens, &expr);
 
   addr_tokens.safe_push (new omp_addr_token (kind, am_expr));
 
   *expr0 = expr;
-  return true;
 }
 
 static bool omp_parse_structured_expr (vec<omp_addr_token *> &, tree *);
 
 static bool
@@ -4353,12 +4351,11 @@ bool
 omp_parse_expr (vec<omp_addr_token *> &addr_tokens, tree expr)
 {
   using namespace omp_addr_tokenizer;
   auto_vec<omp_addr_token *> expr_access_tokens;
 
-  if (!omp_parse_access_methods (expr_access_tokens, &expr))
-return false;
+  omp_parse_access_methods (expr_access_tokens, &expr);
 
   if (omp_parse_structured_expr (addr_tokens, &expr))
 ;
   else if (omp_parse_array_expr (addr_tokens, &expr))
 ;


[PATCH] libstdc++: Adjust how __gnu_debug::vector detects invalidation

2025-03-25 Thread Jonathan Wakely
The new C++23 member functions assign_range, insert_range and
append_range were checking whether the begin() iterator changed after
calling the base class member. That works, but is technically undefined
when the original iterator has been invalidated by a change in capacity.

We can just check the capacity directly, because reallocation only
occurs if a change in capacity is required.

N.B. we can't use data() either because std::vector<bool> doesn't have
it.
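
As a rough sketch of the idea (not the debug-mode code itself), reallocation can be detected by comparing `capacity()` before and after the modifying call, without touching any old iterator. The helper name `append_reallocates` is made up for this example, and it uses `insert` rather than the C++23 range members so that it builds with older standards.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Appends 'extra' zeros to v and reports whether the vector reallocated.
// Only the capacity is compared; no old iterator is used after the
// call, so nothing invalidated is ever read.
bool append_reallocates(std::vector<int>& v, std::size_t extra)
{
  const std::size_t old_capacity = v.capacity();
  v.insert(v.end(), extra, 0);
  return v.capacity() != old_capacity;
}
```

Filling a vector exactly up to its capacity reports no reallocation; appending one more element after that does.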

libstdc++-v3/ChangeLog:

* include/debug/vector (vector::assign_range): Use change in
capacity to detect reallocation.
(vector::insert_range, vector::append_range): Likewise. Remove
unused variables.
---

Tested x86_64-linux.

 libstdc++-v3/include/debug/vector | 14 ++
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/libstdc++-v3/include/debug/vector b/libstdc++-v3/include/debug/vector
index 022ebe8c664..b49766c18a7 100644
--- a/libstdc++-v3/include/debug/vector
+++ b/libstdc++-v3/include/debug/vector
@@ -876,12 +876,12 @@ namespace __debug
constexpr void
assign_range(_Rg&& __rg)
{
- auto __old_begin = _Base::begin();
+ auto __old_capacity = _Base::capacity();
  auto __old_size = _Base::size();
  _Base::assign_range(__rg);
  if (!std::__is_constant_evaluated())
{
- if (_Base::begin() != __old_begin)
+ if (_Base::capacity() != __old_capacity)
this->_M_invalidate_all();
  else if (_Base::size() < __old_size)
this->_M_invalidate_after_nth(_Base::size());
@@ -893,12 +893,11 @@ namespace __debug
constexpr iterator
insert_range(const_iterator __pos, _Rg&& __rg)
{
- auto __old_begin = _Base::begin();
- auto __old_size = _Base::size();
+ auto __old_capacity = _Base::capacity();
  auto __res = _Base::insert_range(__pos.base(), __rg);
  if (!std::__is_constant_evaluated())
{
- if (_Base::begin() != __old_begin)
+ if (_Base::capacity() != __old_capacity)
this->_M_invalidate_all();
  this->_M_update_guaranteed_capacity();
}
@@ -909,12 +908,11 @@ namespace __debug
constexpr void
append_range(_Rg&& __rg)
{
- auto __old_begin = _Base::begin();
- auto __old_size = _Base::size();
+ auto __old_capacity = _Base::capacity();
  _Base::append_range(__rg);
  if (!std::__is_constant_evaluated())
{
- if (_Base::begin() != __old_begin)
+ if (_Base::capacity() != __old_capacity)
this->_M_invalidate_all();
  this->_M_update_guaranteed_capacity();
}
-- 
2.49.0



[PATCH] RISC-V: Remove the priority in FMV ASM name mangling

2025-03-25 Thread Yangyu Chen
We don't need to add the priority in ASM name mangling. Keeping it might
cause an issue if we call another MV clone directly but only one place
has the priority declared.
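
A standalone sketch of the suffix rule, assuming a version string of the form "arch=+v;priority=2" (an illustrative input). The function name `mangle_version_suffix` is hypothetical, and `std::isalnum` stands in for GCC's `ISALNUM` macro.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Builds the name suffix from a version string: everything from the
// first ';' on (the ";priority=..." part) is dropped, and every other
// non-alphanumeric character becomes '_'.
std::string mangle_version_suffix(const std::string& version)
{
  std::string suffix;
  for (char c : version)
    {
      if (c == ';')
        break;
      suffix += std::isalnum(static_cast<unsigned char>(c)) ? c : '_';
    }
  return suffix;
}
```

Two declarations of the same clone, one with and one without a priority, then map to the same suffix.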

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_mangle_decl_assembler_name): Remove
priority in fmv asm name mangling.

Signed-off-by: Yangyu Chen 
---
 gcc/config/riscv/riscv.cc | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 38f3ae7cd84..4a042878554 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -13238,7 +13238,11 @@ riscv_mangle_decl_assembler_name (tree decl, tree id)
 
   /* Replace non-alphanumeric characters with underscores as the suffix.  */
   for (const char *c = version_string; *c; c++)
-   name += ISALNUM (*c) == 0 ? '_' : *c;
+   {
+ /* Skip ';' for ";priority"  */
+ if (*c == ';') break;
+ name += ISALNUM (*c) == 0 ? '_' : *c;
+   }
 
   if (DECL_ASSEMBLER_NAME_SET_P (decl))
SET_DECL_RTL (decl, NULL);
-- 
2.49.0



[committed] libstdc++: Cast -1 to size_t in <format> [PR119429]

2025-03-25 Thread Jonathan Wakely
This avoids a runtime error from Clang's annoying -fsanitize=integer
(even though it's not undefined and behaves correctly).
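
To illustrate the "not undefined" remark with a small sketch: converting -1 to an unsigned type is well defined and gives that type's largest value, with or without the cast. The names below are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Implicit conversion: well defined, wraps modulo 2^N.
constexpr std::size_t implicit_max = -1;

// Explicit cast: same value, but the intent is spelled out, which is
// the form the patch switches to.
constexpr std::size_t explicit_max = static_cast<std::size_t>(-1);

static_assert(implicit_max == explicit_max, "both forms give the same value");

// Hypothetical accessor so the value can be checked at run time.
std::size_t default_nargs()
{
  return explicit_max;
}
```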

libstdc++-v3/ChangeLog:

PR libstdc++/119429
* include/std/format (__format::_Scanner::_Scanner): Cast
default argument to size_t.
---

Tested x86_64-linux.

Reluctantly pushed to trunk. I continue to dislike this sanitizer.

 libstdc++-v3/include/std/format | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libstdc++-v3/include/std/format b/libstdc++-v3/include/std/format
index 1b38913359d..c3327e1d384 100644
--- a/libstdc++-v3/include/std/format
+++ b/libstdc++-v3/include/std/format
@@ -4051,7 +4051,7 @@ namespace __format
   } _M_pc;
 
   constexpr explicit
-  _Scanner(basic_string_view<_CharT> __str, size_t __nargs = -1)
+  _Scanner(basic_string_view<_CharT> __str, size_t __nargs = (size_t)-1)
   : _M_pc(__str, __nargs)
   { }
 
-- 
2.49.0



[Patch, v2] libgomp.texi: Document supported OpenMP 'interop' types for nvptx and gcn

2025-03-25 Thread Tobias Burnus

On August 24, 2024 Tobias Burnus wrote:
[...] it documents the code added at "[patch][rfc] libgomp: Add OpenMP 
interop support to nvptx + gcn plugin", 
https://gcc.gnu.org/pipermail/gcc-patches/2024-August/661207.html


Quite some time has passed and those features are now on mainline.

The attached patch is an updated version of it, also documenting
the settings used to create the stream/queue object, which is
mainly relevant for HSA - as the CUDA and HIP versions are pretty
standard. - I think everything else is also pretty standard.

BTW: For HIP on AMD, I assume that when HSA is found via dlopen,
HIP will be found via dlopen as well, so I shy away from wording like
'if available/found at runtime' or similar.

Comments before I commit it? I bet someone has!

Tobias
libgomp.texi: Document supported OpenMP 'interop' types for nvptx and gcn

libgomp/ChangeLog:

	* libgomp.texi (OpenMP 5.1): Add @ref to offload-target specifics
	for 'interop'.
	(OpenMP 6.0): Mark dispatch's interop clause as implemented.
	(omp_get_interop_int, omp_get_interop_str,
	omp_get_interop_ptr, omp_get_interop_type_desc): Add @ref to
	Offload-Target Specifics.
	(Offload-Target Specifics): Document the supported OpenMP
	interop foreign runtimes on AMD and Nvidia GPUs.

 libgomp/libgomp.texi | 152 +--
 1 file changed, 146 insertions(+), 6 deletions(-)

diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index d1cf9be47ca..04e7ed2352c 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -314,7 +314,7 @@ The OpenMP 4.5 specification is fully supported.
   clauses @tab N @tab
 @item Indirect calls to the device version of a procedure or function in
   @code{target} regions @tab Y @tab
-@item @code{interop} directive @tab N @tab
+@item @code{interop} directive @tab Y @tab Cf. @ref{Offload-Target Specifics}
 @item @code{omp_interop_t} object support in runtime routines @tab Y @tab
 @item @code{nowait} clause in @code{taskwait} directive @tab Y @tab
 @item Extensions to the @code{atomic} directive @tab Y @tab
@@ -545,7 +545,7 @@ to address of matching mapped list item per 5.1, Sect. 2.21.7.2 @tab N @tab
   @tab N @tab
 @item Semicolon-separated list to @code{uses_allocators} @tab N @tab
 @item New @code{need_device_addr} modifier to @code{adjust_args} clause @tab N @tab
-@item @code{interop} clause to @code{dispatch} @tab N @tab
+@item @code{interop} clause to @code{dispatch} @tab Y @tab
 @item Scope requirement changes for @code{declare_target} @tab N @tab
 @item @code{message} and @code{severity} clauses to @code{parallel} directive
   @tab N @tab
@@ -3062,7 +3062,8 @@ the initial device is unspecified.
 @end multitable
 
 @item @emph{See also}:
-@ref{omp_get_interop_ptr}, @ref{omp_get_interop_str}, @ref{omp_get_interop_rc_desc}
+@ref{omp_get_interop_ptr}, @ref{omp_get_interop_str}, @ref{omp_get_interop_rc_desc},
+@ref{Offload-Target Specifics}
 
 @item @emph{Reference}:
 @uref{https://www.openmp.org, OpenMP specification v5.1}, Section 3.12.2,
@@ -3107,7 +3108,8 @@ the initial device is unspecified.
 @end multitable
 
 @item @emph{See also}:
-@ref{omp_get_interop_int}, @ref{omp_get_interop_str}, @ref{omp_get_interop_rc_desc}
+@ref{omp_get_interop_int}, @ref{omp_get_interop_str}, @ref{omp_get_interop_rc_desc},
+@ref{Offload-Target Specifics}
 
 @item @emph{Reference}:
 @uref{https://www.openmp.org, OpenMP specification v5.1}, Section 3.12.3,
@@ -3151,7 +3153,8 @@ the initial device is unspecified.
 @end multitable
 
 @item @emph{See also}:
-@ref{omp_get_interop_int}, @ref{omp_get_interop_ptr}, @ref{omp_get_interop_rc_desc}
+@ref{omp_get_interop_int}, @ref{omp_get_interop_ptr}, @ref{omp_get_interop_rc_desc},
+@ref{Offload-Target Specifics}
 
 @item @emph{Reference}:
 @uref{https://www.openmp.org, OpenMP specification v5.1}, Section 3.12.4,
@@ -3234,7 +3237,8 @@ a null pointer is returned. The effect of running this routine in a
 @end multitable
 
 @item @emph{See also}:
-@ref{omp_get_num_interop_properties}, @ref{omp_get_interop_name}
+@ref{omp_get_num_interop_properties}, @ref{omp_get_interop_name},
+@ref{Offload-Target Specifics}
 
 @item @emph{Reference}:
 @uref{https://www.openmp.org, OpenMP specification v5.1}, Section 3.12.6,
@@ -6837,6 +6841,10 @@ The following sections present notes on the offload-target specifics
 @node AMD Radeon
 @section AMD Radeon (GCN)
 
+@menu
+* Foreign-runtime support for AMD GPUs::
+@end menu
+
 On the hardware side, there is the hierarchy (fine to coarse):
 @itemize
 @item work item (thread)
@@ -6912,10 +6920,69 @@ The implementation remark:
 @end itemize
 
 
+@node Foreign-runtime support for AMD GPUs
+@subsection OpenMP @code{interop} -- Foreign-Runtime Support for AMD GPUs
+
+On AMD GPUs, the foreign runtimes are HIP (C++ Heterogeneous-Compute Interface
+for Portability) and HSA (Heterogeneous System Architecture),
+where HIP is the default.  The interop object is created using OpenMP's
+@code{interop} directive or, implicitl

Re: [PATCH] libstdc++: Adjust how __gnu_debug::vector detects invalidation

2025-03-25 Thread Tomasz Kaminski
On Tue, Mar 25, 2025 at 12:26 PM Jonathan Wakely  wrote:

> The new C++23 member functions assign_range, insert_range and
> append_range were checking whether the begin() iterator changed after
> calling the base class member. That works, but is technically undefined
> when the original iterator has been invalidated by a change in capacity.
>
> We can just check the capacity directly, because reallocation only
> occurs if a change in capacity is required.
>
> N.B. we can't use data() either because std::vector<bool> doesn't have
> it.
>
> libstdc++-v3/ChangeLog:
>
> * include/debug/vector (vector::assign_range): Use change in
> capacity to detect reallocation.
> (vector::insert_range, vector::append_range): Likewise. Remove
> unused variables.
> ---
>
> Tested x86_64-linux.
>
LGTM.

>
>  libstdc++-v3/include/debug/vector | 14 ++
>  1 file changed, 6 insertions(+), 8 deletions(-)
>
> diff --git a/libstdc++-v3/include/debug/vector b/libstdc++-v3/include/debug/vector
> index 022ebe8c664..b49766c18a7 100644
> --- a/libstdc++-v3/include/debug/vector
> +++ b/libstdc++-v3/include/debug/vector
> @@ -876,12 +876,12 @@ namespace __debug
> constexpr void
> assign_range(_Rg&& __rg)
> {
> - auto __old_begin = _Base::begin();
> + auto __old_capacity = _Base::capacity();
>   auto __old_size = _Base::size();
>   _Base::assign_range(__rg);
>   if (!std::__is_constant_evaluated())
> {
> - if (_Base::begin() != __old_begin)
> + if (_Base::capacity() != __old_capacity)
> this->_M_invalidate_all();
>   else if (_Base::size() < __old_size)
> this->_M_invalidate_after_nth(_Base::size());
> @@ -893,12 +893,11 @@ namespace __debug
> constexpr iterator
> insert_range(const_iterator __pos, _Rg&& __rg)
> {
> - auto __old_begin = _Base::begin();
> - auto __old_size = _Base::size();
> + auto __old_capacity = _Base::capacity();
>   auto __res = _Base::insert_range(__pos.base(), __rg);
>   if (!std::__is_constant_evaluated())
> {
> - if (_Base::begin() != __old_begin)
> + if (_Base::capacity() != __old_capacity)
> this->_M_invalidate_all();
>   this->_M_update_guaranteed_capacity();
> }
> @@ -909,12 +908,11 @@ namespace __debug
> constexpr void
> append_range(_Rg&& __rg)
> {
> - auto __old_begin = _Base::begin();
> - auto __old_size = _Base::size();
> + auto __old_capacity = _Base::capacity();
>   _Base::append_range(__rg);
>   if (!std::__is_constant_evaluated())
> {
> - if (_Base::begin() != __old_begin)
> + if (_Base::capacity() != __old_capacity)
> this->_M_invalidate_all();
>   this->_M_update_guaranteed_capacity();
> }
> --
> 2.49.0
>
>


[PATCH v3] libstdc++: Fix std::vector::append_range for overlapping ranges

2025-03-25 Thread Jonathan Wakely
Unlike insert_range and assign_range, the append_range function does not
have a precondition that the range doesn't overlap *this. That means we
need to avoid relocating the existing elements until after copying from
the range. This means I need to revert r15-8488-g3e1d760bf49d0e which
made the from_range_t constructor use append_range, because the
constructor can avoid the additional complexity needed by append_range.
When relocating the existing elements in append_range we can use
std::__relocate_a to do it more efficiently, if that's valid.

std::vector<bool>::append_range needs similar treatment, although it's a
bit simpler as we know that the elements are trivially copyable and so
we don't need to worry about them throwing. assign_range doesn't allow
overlapping ranges, so can be rewritten to be more efficient than
calling append_range for the forward or sized range case.
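
The ordering constraint can be sketched with a deliberately simplified helper. It is not the libstdc++ implementation: it always builds a fresh buffer, whereas the patch only defers relocation. The name `append_maybe_overlapping` is hypothetical. The point it shows is that the old storage stays alive and unmodified until the appended range has been read.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Appends [first, last) to v, where the range may point into v itself.
// The new buffer is filled completely before v's old storage is
// released, so reading from the range never touches freed memory.
void append_maybe_overlapping(std::vector<int>& v,
                              const int* first, const int* last)
{
  const std::size_t n = static_cast<std::size_t>(last - first);
  std::vector<int> tmp;
  tmp.reserve(v.size() + n);
  tmp.insert(tmp.end(), v.begin(), v.end());  // existing elements
  tmp.insert(tmp.end(), first, last);         // range, still valid here
  v.swap(tmp);                                // old storage freed with tmp
}
```

Appending the tail of `{1, 2, 3}` to itself this way gives `{1, 2, 3, 2, 3}`.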

libstdc++-v3/ChangeLog:

* include/bits/stl_bvector.h (vector::assign_range): More
efficient implementation for forward/sized ranges.
(vector::append_range): Handle potentially overlapping range.
* include/bits/stl_vector.h (vector(from_range_t, R&&, Alloc)):
Do not use append_range for non-sized input range case.
(vector::append_range): Handle potentially overlapping range.
* include/bits/vector.tcc (vector::insert_range): Forward range
instead of moving it.
* testsuite/23_containers/vector/bool/modifiers/insert/append_range.cc:
Test overlapping ranges.
* testsuite/23_containers/vector/modifiers/append_range.cc:
Likewise.
---

Patch v3 fixes the problem Tomasz noticed with calling reserve(n) on an
empty vector with non-zero capacity, which can invalidate the range
parameter.

Adds additional comments to the tests to try and clarify that the "XXX"
comments only apply to the calls to do_test, for Patrick's comment.

Also improved the doxygen comments on all the C++23 range members.

Tested x86_64-linux.

 libstdc++-v3/include/bits/stl_bvector.h   |  77 ++--
 libstdc++-v3/include/bits/stl_vector.h| 110 +++-
 libstdc++-v3/include/bits/vector.tcc  |   2 +-
 .../bool/modifiers/insert/append_range.cc |  51 ++
 .../vector/modifiers/append_range.cc  | 165 ++
 5 files changed, 385 insertions(+), 20 deletions(-)

diff --git a/libstdc++-v3/include/bits/stl_bvector.h b/libstdc++-v3/include/bits/stl_bvector.h
index 3ee15eaa938..03f6434604c 100644
--- a/libstdc++-v3/include/bits/stl_bvector.h
+++ b/libstdc++-v3/include/bits/stl_bvector.h
@@ -899,6 +899,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 #if __glibcxx_ranges_to_container // C++ >= 23
   /**
* @brief Construct a vector from a range.
+   * @param __rg A range of values that are convertible to `value_type`.
* @since C++23
*/
   template<__detail::__container_compatible_range _Rg>
@@ -1028,6 +1029,8 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 #if __glibcxx_ranges_to_container // C++ >= 23
   /**
* @brief Assign a range to the vector.
+   * @param __rg A range of values that are convertible to `value_type`.
+   * @pre `__rg` and `*this` do not overlap.
* @since C++23
*/
   template<__detail::__container_compatible_range _Rg>
@@ -1035,8 +1038,25 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
assign_range(_Rg&& __rg)
{
  static_assert(assignable_from>);
- clear();
- append_range(std::forward<_Rg>(__rg));
+ if constexpr (ranges::forward_range<_Rg> || ranges::sized_range<_Rg>)
+   {
+ if (auto __n = size_type(ranges::distance(__rg)))
+   {
+ reserve(__n);
+ this->_M_impl._M_finish
+ = ranges::copy(std::forward<_Rg>(__rg), begin()).out;
+   }
+ else
+   clear();
+   }
+ else
+   {
+ clear();
+ auto __first = ranges::begin(__rg);
+ const auto __last = ranges::end(__rg);
+ for (; __first != __last; ++__first)
+   emplace_back(*__first);
+   }
}
 #endif
 
@@ -1330,6 +1350,10 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 #if __glibcxx_ranges_to_container // C++ >= 23
   /**
* @brief Insert a range into the vector.
+   * @param __rg A range of values that are convertible to `bool`.
+   * @return An iterator that points to the first new element inserted,
+   * or to `__pos` if `__rg` is an empty range.
+   * @pre `__rg` and `*this` do not overlap.
* @since C++23
*/
   template<__detail::__container_compatible_range _Rg>
@@ -1385,24 +1409,53 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
constexpr void
append_range(_Rg&& __rg)
{
+ // N.B. __rg may overlap with *this, so we must copy from __rg before
+ // existing elements or iterators referring to *this 

Re: [PATCH] doc: Use more precise cross-references for -ftrivial-auto-var-init

2025-03-25 Thread Jonathan Wakely

On 11/03/25 17:39 +, Jonathan Wakely wrote:

Add anchors for the hardbool and uninitialized attributes and then link
directly to them.

gcc/ChangeLog:

* doc/extend.texi (Common Variable Attributes): Add @anchor to
hardbool attribute.
(Common Type Attributes): Add @anchor to uninitialized
attribute.
* doc/invoke.texi (Optimize Options): Use new anchors for
cross-references from -ftrivial-auto-var-init description.
---

OK for trunk?


Ping.1


Relevant branches too? (The hardbool attribute is only on gcc-14 so
obviously that part won't backport.)

gcc/doc/extend.texi | 2 ++
gcc/doc/invoke.texi | 8 +---
2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index bae3fba6b2b..5e7b7503e8c 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -8476,6 +8476,7 @@ will be placed in new, unique sections.

This additional functionality requires Binutils version 2.36 or later.

+@anchor{@code{uninitialized} variable attribute}
@cindex @code{uninitialized} variable attribute
@item uninitialized
This attribute, attached to a variable with automatic storage, means that
@@ -9315,6 +9316,7 @@ its enumerators are used in bitwise operations, so e.g. @option{-Wswitch}
should not warn about a @code{case} that corresponds to a bitwise
combination of enumerators.

+@anchor{@code{hardbool} type attribute}
@cindex @code{hardbool} type attribute
@item hardbool
@itemx hardbool (@var{false_value})
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 4fbb4cda101..c7bbb92363c 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -14354,8 +14354,9 @@ Note that the initializer values, whether @samp{zero} or @samp{pattern},
refer to data representation (in memory or machine registers), rather
than to their interpretation as numerical values.  This distinction may
be important in languages that support types with biases or implicit
-multipliers, and with such extensions as @samp{hardbool} (@pxref{Type
-Attributes}).  For example, a variable that uses 8 bits to represent
+multipliers, and with such extensions as @samp{hardbool}
+(@pxref{@code{hardbool} type attribute}).
+For example, a variable that uses 8 bits to represent
(biased) quantities in the @code{range 160..400} will be initialized
with the bit patterns @code{0x00} or @code{0xFE}, depending on
@var{choice}, whether or not these representations stand for values in
@@ -14372,7 +14373,8 @@ are initialized with @code{false} (zero), even when @samp{pattern} is
requested.

You can control this behavior for a specific variable by using the variable
-attribute @code{uninitialized} (@pxref{Variable Attributes}).
+attribute @code{uninitialized} (@pxref{@code{uninitialized} variable
+attribute}).

@opindex fvect-cost-model
@item -fvect-cost-model=@var{model}
--
2.48.1






[Patch] install.texi: gcn - suggest to use Newlib with simd math fix [PR119325]

2025-03-25 Thread Tobias Burnus

A GCC 15 regression turned out to be a bug in Newlib related to
undefined behavior that just started to trigger in some cases.

As it is now fixed, it makes IMHO sense to mention that Newlib
commit in GCC's install documentation for AMD GPUs.

Comments, suggestions to the attached patch?

Tobias

PS: Current HTML version is at
https://gcc.gnu.org/install/specific.html#amdgcn-x-amdhsa
install.texi: gcn - suggest to use Newlib with simd math fix [PR119325]

Suggest a Newlib with a fix for the SIMD math issue.  Newlib commit:
https://sourceware.org/git/?p=newlib-cygwin.git;a=commitdiff;h=2ef1a37e7

Additionally, for generic support in ROCm, it is expected that 6.4 will
add the support; the current version is 6.3.3 and does not support it.
Bump >6.3.2 to >6.3.3 in install.texi to avoid doubts.

gcc/ChangeLog:

	PR middle-end/119325

	* doc/install.texi (gcn): Change ROCm > 6.3.2 to >6.3.3 for generic
	support; mention Newlib commit that fixes a SIMD math issue.

diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 308529669b1..c1280fd9787 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -4056,11 +4056,12 @@ ISA targets @code{gfx9-generic}, @code{gfx10-3-generic}, and
 @code{gfx11-generic} reduce the number of required multilibs but note
 that @code{gfx9-generic} does not include @code{gfx908} or @code{gfx90a},
 that linking specific ISA code with generic code is currently not supported,
-and that only a future ROCm release (newer than 6.3.2) will be able to execute
+and that only a future ROCm release (newer than 6.3.3) will be able to execute
 generic code.
 
 Use Newlib (4.3.0 or newer; 4.4.0 contains some improvements and 4.5.0 fixes
-the device console output for GFX10 and GFX11 devices).
+the device console output for GFX10 and GFX11 devices; post-4.5.0
+commit 2ef1a37e7 [Mar 25, 2025] fixes a SIMD math issue).
 
 To run the binaries, install the HSA Runtime from the
 @uref{https://rocm.docs.amd.com/,,ROCm Platform}, and use


Re: [PATCH] tailc: Only diagnose musttail failures during tailc or musttail passes [PR119376]

2025-03-25 Thread Richard Biener
On Tue, 25 Mar 2025, Jakub Jelinek wrote:

> Hi!
> 
> The following testcases FAIL because musttail failures are diagnosed
> not just in the tailc or musttail passes, but also during the tailr1
> and tailr2.
> tailr1 pass is before IPA and in the testcases eh cleanup has not
> cleaned up the IL sufficiently yet to make the musttail calls pass,
> even tailr2 could be too early.
> 
> The following patch does that only during the tailc pass, and if that
> pass is not actually executed, during musttail pass.
> To do it only in the tailc pass, I chose to pass a new bool flag, because
> while we have the opt_tailcalls argument, it is actually passed by reference
> to find_tail_calls and sometimes cleared during that.
> musttail calls when the new DIAG_MUSTTAIL flag is not set are handled like
> any other calls, we simply silently punt on those if they can't be turned
> into tail calls.
> 
> Furthermore, I had to tweak the musttail pass gate.  Previously it was
> !flag_optimize_sibling_calls && f->has_musttail.  The problem is that
> gate of tailr and tailc passes is
> flag_optimize_sibling_calls != 0 && dbg_cnt (tail_call)
> and furthermore, tailc pass is only in the normal optimization queue,
> so only if not -O0 or -Og.  So when one would use tail_call dbg_cnt
> with some limit, or when e.g. using -foptimize-sibling-calls with -O0 or
> -Og, nothing would actually diagnose invalid musttail calls or set tail call
> flags on those if they are ok.  I could insert a new PROP_ flag on whether
> musttail has been handled by tailc pass, but given that we have the
> cfun->has_musttail flag already and nothing after tailc/musttail passes uses
> it, I think it is easier to just clear the flag when musttail failures are
> diagnosed and correct ones have the [[tail call]] flag added.  Expansion will
> then only look at the [[tail call]] flag; it could even look at the [[must tail
> call]] flag, but I don't see a point in checking cfun->has_musttail.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Thanks,
Richard.

> 2025-03-25  Jakub Jelinek  
> 
>   PR ipa/119376
>   * tree-tailcall.cc (suitable_for_tail_opt_p): Add DIAG_MUSTTAIL
>   argument, propagate it down to maybe_error_musttail.
>   (suitable_for_tail_call_opt_p): Likewise.
>   (maybe_error_musttail): Add DIAG_MUSTTAIL argument.  Don't emit error
>   for gimple_call_must_tail_p calls if it is false.
>   (find_tail_calls): Add DIAG_MUSTTAIL argument, propagate it down to
>   maybe_error_musttail, suitable_for_tail_opt_p,
>   suitable_for_tail_call_opt_p and find_tail_calls calls.
>   (tree_optimize_tail_calls_1): Add DIAG_MUSTTAIL argument, propagate
>   it down to find_tail_calls and if set, clear cfun->has_musttail flag
>   at the end.  Rename OPT_MUSTCALL argument to OPT_MUSTTAIL.
>   (execute_tail_calls): Pass true to DIAG_MUSTTAIL
>   tree_optimize_tail_calls_1 argument.
>   (pass_tail_recursion::execute): Pass false to DIAG_MUSTTAIL
>   tree_optimize_tail_calls_1 argument.
>   (pass_musttail::gate): Don't test flag_optimize_sibling_calls.
>   (pass_musttail::execute): Pass true to DIAG_MUSTTAIL
>   tree_optimize_tail_calls_1 argument.
> 
>   * g++.dg/torture/musttail1.C: New test.
>   * g++.dg/opt/musttail2.C: New test.
> 
> --- gcc/tree-tailcall.cc.jj   2025-01-16 09:27:53.645909094 +0100
> +++ gcc/tree-tailcall.cc  2025-03-24 12:51:56.271628242 +0100
> @@ -139,18 +139,18 @@ static tree m_acc, a_acc;
>  
>  static bitmap tailr_arg_needs_copy;
>  
> -static void maybe_error_musttail (gcall *call, const char *err);
> +static void maybe_error_musttail (gcall *call, const char *err, bool);
>  
>  /* Returns false when the function is not suitable for tail call optimization
> from some reason (e.g. if it takes variable number of arguments). CALL
> is call to report for.  */
>  
>  static bool
> -suitable_for_tail_opt_p (gcall *call)
> +suitable_for_tail_opt_p (gcall *call, bool diag_musttail)
>  {
>if (cfun->stdarg)
>  {
> -  maybe_error_musttail (call, _("caller uses stdargs"));
> +  maybe_error_musttail (call, _("caller uses stdargs"), diag_musttail);
>return false;
>  }
>  
> @@ -163,7 +163,7 @@ suitable_for_tail_opt_p (gcall *call)
> tail call discovery happen. CALL is call to report error for.  */
>  
>  static bool
> -suitable_for_tail_call_opt_p (gcall *call)
> +suitable_for_tail_call_opt_p (gcall *call, bool diag_musttail)
>  {
>tree param;
>  
> @@ -171,7 +171,7 @@ suitable_for_tail_call_opt_p (gcall *cal
>   sibling call optimizations, but not tail recursion.  */
>if (cfun->calls_alloca)
>  {
> -  maybe_error_musttail (call, _("caller uses alloca"));
> +  maybe_error_musttail (call, _("caller uses alloca"), diag_musttail);
>return false;
>  }
>  
> @@ -181,7 +181,8 @@ suitable_for_tail_call_opt_p (gcall *cal
>if (targetm_common.except_unwind_info (&global_options) == UI

[PATCH v2] i386: Add "s_" as Saturation for AVX10.2 Converting Intrinsics.

2025-03-25 Thread Hu, Lin1
Modify ChangeLog.

This patch adds "s_" after 'cvt' to represent saturation.

gcc/ChangeLog:

* config/i386/avx10_2-512convertintrin.h (_mm512_mask_cvtx2ps_ph):
Formatting fixes.
(_mm512_mask_cvtx_round2ps_ph): Ditto.
(_mm512_maskz_cvtx_round2ps_ph): Ditto.
(_mm512_cvtbiassph_bf8): Rename to _mm512_cvts_biasph_bf8.
(_mm512_mask_cvtbiassph_bf8): Rename to _mm512_mask_cvts_biasph_bf8.
(_mm512_maskz_cvtbiassph_bf8): Rename to _mm512_maskz_cvts_biasph_bf8.
(_mm512_cvtbiassph_hf8): Rename to _mm512_cvts_biasph_hf8.
(_mm512_mask_cvtbiassph_hf8): Rename to _mm512_mask_cvts_biasph_hf8.
(_mm512_maskz_cvtbiassph_hf8): Rename to _mm512_maskz_cvts_biasph_hf8.
(_mm512_cvts2ph_bf8): Rename to _mm512_cvts_2ph_bf8.
(_mm512_mask_cvts2ph_bf8): Rename to _mm512_mask_cvts_2ph_bf8.
(_mm512_maskz_cvts2ph_bf8): Rename to _mm512_maskz_cvts_2ph_bf8.
(_mm512_cvts2ph_hf8): Rename to _mm512_cvts_2ph_hf8.
(_mm512_mask_cvts2ph_hf8): Rename to _mm512_mask_cvts_2ph_hf8.
(_mm512_maskz_cvts2ph_hf8): Rename to _mm512_maskz_cvts_2ph_hf8.
(_mm512_cvtsph_bf8): Rename to _mm512_cvts_ph_bf8.
(_mm512_mask_cvtsph_bf8): Rename to _mm512_mask_cvts_ph_bf8.
(_mm512_maskz_cvtsph_bf8): Rename to _mm512_maskz_cvts_ph_bf8.
(_mm512_cvtsph_hf8): Rename to _mm512_cvts_ph_hf8.
(_mm512_mask_cvtsph_hf8): Rename to _mm512_mask_cvts_ph_hf8.
(_mm512_maskz_cvtsph_hf8): Rename to _mm512_maskz_cvts_ph_hf8.
* config/i386/avx10_2convertintrin.h
(_mm_cvtbiassph_bf8): Rename to _mm_cvts_biasph_bf8.
(_mm_mask_cvtbiassph_bf8): Rename to _mm_mask_cvts_biasph_bf8.
(_mm_maskz_cvtbiassph_bf8): Rename to _mm_maskz_cvts_biasph_bf8.
(_mm256_cvtbiassph_bf8): Rename to _mm256_cvts_biasph_bf8.
(_mm256_mask_cvtbiassph_bf8): Rename to _mm256_mask_cvts_biasph_bf8.
(_mm256_maskz_cvtbiassph_bf8): Rename to _mm256_maskz_cvts_biasph_bf8.
(_mm_cvtbiassph_hf8): Rename to _mm_cvts_biasph_hf8.
(_mm_mask_cvtbiassph_hf8): Rename to _mm_mask_cvts_biasph_hf8.
(_mm_maskz_cvtbiassph_hf8): Rename to _mm_maskz_cvts_biasph_hf8.
(_mm256_cvtbiassph_hf8): Rename to _mm256_cvts_biasph_hf8.
(_mm256_mask_cvtbiassph_hf8): Rename to _mm256_mask_cvts_biasph_hf8.
(_mm256_maskz_cvtbiassph_hf8): Rename to _mm256_maskz_cvts_biasph_hf8.
(_mm_cvts2ph_bf8): Rename to _mm_cvts_2ph_bf8.
(_mm_mask_cvts2ph_bf8): Rename to _mm_mask_cvts_2ph_bf8.
(_mm_maskz_cvts2ph_bf8): Rename to _mm_maskz_cvts_2ph_bf8.
(_mm256_cvts2ph_bf8): Rename to _mm256_cvts_2ph_bf8.
(_mm256_mask_cvts2ph_bf8): Rename to _mm256_mask_cvts_2ph_bf8.
(_mm256_maskz_cvts2ph_bf8): Rename to _mm256_maskz_cvts_2ph_bf8.
(_mm_cvts2ph_hf8): Rename to _mm_cvts_2ph_hf8.
(_mm_mask_cvts2ph_hf8): Rename to _mm_mask_cvts_2ph_hf8.
(_mm_maskz_cvts2ph_hf8): Rename to _mm_maskz_cvts_2ph_hf8.
(_mm256_cvts2ph_hf8): Rename to _mm256_cvts_2ph_hf8.
(_mm256_mask_cvts2ph_hf8): Rename to _mm256_mask_cvts_2ph_hf8.
(_mm256_maskz_cvts2ph_hf8): Rename to _mm256_maskz_cvts_2ph_hf8.
(_mm_cvtsph_bf8): Rename to _mm_cvts_ph_bf8.
(_mm_mask_cvtsph_bf8): Rename to _mm_mask_cvts_ph_bf8.
(_mm_maskz_cvtsph_bf8): Rename to _mm_maskz_cvts_ph_bf8.
(_mm256_cvtsph_bf8): Rename to _mm256_cvts_ph_bf8.
(_mm256_mask_cvtsph_bf8): Rename to _mm256_mask_cvts_ph_bf8.
(_mm256_maskz_cvtsph_bf8): Rename to _mm256_maskz_cvts_ph_bf8.
(_mm_cvtsph_hf8): Rename to _mm_cvts_ph_hf8.
(_mm_mask_cvtsph_hf8): Rename to _mm_mask_cvts_ph_hf8.
(_mm_maskz_cvtsph_hf8): Rename to _mm_maskz_cvts_ph_hf8.
(_mm256_cvtsph_hf8): Rename to _mm256_cvts_ph_hf8.
(_mm256_mask_cvtsph_hf8): Rename to _mm256_mask_cvts_ph_hf8.
(_mm256_maskz_cvtsph_hf8): Rename to _mm256_maskz_cvts_ph_hf8.

gcc/testsuite/ChangeLog:

* gcc.target/i386/avx10_2-512-convert-1.c: Modify function name
to follow the latest version.
* gcc.target/i386/avx10_2-512-vcvt2ph2bf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvt2ph2hf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtbiasph2bf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtbiasph2hf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtph2bf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtph2hf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-convert-1.c: Ditto.
---
 gcc/config/i386/avx10_2-512convertintrin.h| 50 +--
 gcc/config/i386/avx10_2convertintrin.h| 88 +--
 .../gcc.target/i386/avx10_2-512-convert-1.c   | 36 
 .../i386/avx10_2-512-vcvt2ph2bf8s-2.c |  6 +-
 .../i386/avx10_2-512-vcvt2ph2hf8s-2.c |  6 +-
 .../i386/avx10_2-512-vcvtbiasph2bf8s-2.c  |  6 +-
 .../i386/avx10_2-512-vcvtbiasph2hf8s-2.c  |  6 +-
 .../i386/avx10_2-512-

[PATCH][COBOL][RFC] Remove strtof128 based diagnostics

2025-03-25 Thread Richard Biener
The following removes the uses of strtof128, all of which in some way
verify that something parses as _Float128, which lexing should already
have guaranteed.
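
For context, the checks being removed all follow the usual end-pointer
idiom: convert, then confirm that the conversion consumed the whole string.
A minimal standalone sketch of that idiom, assuming strtold as a stand-in
for strtof128 (the helper name parses_fully_as_number is hypothetical and
is not part of the COBOL front end):

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical helper: returns true when INPUT is non-empty and strtold
// consumes every character of it.
static bool
parses_fully_as_number (const char *input)
{
  assert (input != nullptr);
  char *pend = nullptr;
  std::strtold (input, &pend);
  // pend == input means no conversion took place; *pend != '\0' means
  // trailing characters were left unconsumed.
  return pend != input && *pend == '\0';
}
```

If the lexer only ever hands over strings that satisfy this predicate, the
check is redundant, which is the premise of the patch.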

Tested on x86_64-unknown-linux-gnu.

Richard.

gcc/cobol/
* parse.y (intrinsic): Remove checking that $r1->field->data.initial
parses as _Float128.
(numstr2i): Remove checking that all of the string is consumed
by the number conversion.
* symbols.h (strtof128): Remove.
(cbl_field_data_t::valify): Remove checking that all of the
string is consumed by the number conversion.
---
 gcc/cobol/parse.y   | 21 +
 gcc/cobol/symbols.h | 20 
 2 files changed, 5 insertions(+), 36 deletions(-)

diff --git a/gcc/cobol/parse.y b/gcc/cobol/parse.y
index 390e115f37e..5fa472a3645 100644
--- a/gcc/cobol/parse.y
+++ b/gcc/cobol/parse.y
@@ -10326,16 +10326,7 @@ intrinsic:  function_udf
   }
   if( $1 == NUMVAL_F ) {
 if( is_literal($r1->field) ) {
-  _Float128 output __attribute__ ((__unused__));
-  auto input = $r1->field->data.initial;
-  auto local = xstrdup(input), pend = local;
-  std::replace(local, local + strlen(local), ',', '.');
-  std::remove_if(local, local + strlen(local), isspace);
-  output = strtof128(local, &pend);
-  // bad if strtof128 could not convert input
-  if( *pend != '\0' ) {
-error_msg(@r1, "'%s' is not a numeric string", input);
-  }
+ // we assume $r1->field->data.initial parses as float
 }
   }
   if( ! intrinsic_call_1($$, $1, $r1, @r1)) YYERROR;
@@ -12065,20 +12056,18 @@ static REAL_VALUE_TYPE
 numstr2i( const char input[], radix_t radix ) {
   REAL_VALUE_TYPE output;
   size_t integer = 0;
-  int erc=0, n=0;
+  int erc=0;
 
   switch( radix ) {
   case decimal_e: { // Use decimal point for comma, just in case.
-  auto local = xstrdup(input), pend = local;
+  auto local = xstrdup(input);
   if( !local ) { erc = -1; break; }
   std::replace(local, local + strlen(local), ',', '.');
   real_from_string3 (&output, local, TYPE_MODE (float128_type_node));
-  strtof128(local, &pend);
-  n = pend - local;
 }
 break;
   case hexadecimal_e:
-erc = sscanf(input, "%zx%n", &integer, &n);
+erc = sscanf(input, "%zx", &integer);
 real_from_integer (&output, VOIDmode, integer, UNSIGNED);
 break;
   case boolean_e:
@@ -12101,7 +12090,7 @@ numstr2i( const char input[], radix_t radix ) {
 real_from_integer (&output, VOIDmode, integer, UNSIGNED);
 return output;
   }
-  if( erc == -1 || n < int(strlen(input)) ) {
+  if( erc == -1 ) {
 yywarn("'%s' was accepted as %lld", input, output);
   }
   return output;
diff --git a/gcc/cobol/symbols.h b/gcc/cobol/symbols.h
index 72bb188ec5b..35e4d816233 100644
--- a/gcc/cobol/symbols.h
+++ b/gcc/cobol/symbols.h
@@ -48,17 +48,6 @@
 
 #define PICTURE_MAX 64
 
-#if ! (__HAVE_FLOAT128 && __GLIBC_USE (IEC_60559_TYPES_EXT))
-static_assert( sizeof(output) == sizeof(long double), "long doubles?" );
-
-// ???  This is still used for verificataion that __nptr parses as
-// float number via setting *__endptr.
-static inline _Float128
-strtof128 (const char *__restrict __nptr, char **__restrict __endptr) {
-  return strtold(nptr, endptr);
-}
-#endif
-
 extern const char *numed_message;
 
 enum cbl_dialect_t {
@@ -352,15 +341,6 @@ struct cbl_field_data_t {
   std::replace(input.begin(), input.end(), ',', '.');
 }
 
-char *pend = NULL;
-
-strtof128(input.c_str(), &pend);
-
-if( pend != input.c_str() + len ) {
-  dbgmsg("%s: error: could not interpret '%s' of '%s' as a number",
- __func__, pend, initial);
-}
-
 REAL_VALUE_TYPE r;
 real_from_string (&r, input.c_str());
 r = real_value_truncate (TYPE_MODE (float128_type_node), r);
-- 
2.43.0


Re: [PATCH 0/4] Update the configure of host 'basename'.

2025-03-25 Thread Richard Biener
On Mon, Mar 24, 2025 at 9:54 AM Iain Sandoe  wrote:
>
> Currently, we misconfigure GCC on POSIX platforms that require the
> inclusion of  to declare 'basename()'.
>
> The series here does the following:
>  - ensures that the libiberty configure caters for platforms that need
> (it does not alter the outcome on those that also have
>basename() in libc). [PR119218]
>  - ensures that the gcc/ configure matches the behaviour of
>libiberty [PR119250]
>  - switches the remaining two uses of host 'basename()' to use the
>libiberty 'lbasename()'.
>
> Despite the last change, the first two are still needed to allow the
> inclusion of  in GCC sources (otherwise the host definition
> clashes with the libiberty one).
>
> At some stage (not proposed in this patch series) perhaps we should
> just poison the host basename/dirname and require use of the libiberty
> replacements.
>
> All tested on x86_64-linux, darwin, aarch64-linux, darwin.
> OK for trunk? (when?)

This all looks reasonable, so OK from my side, even now.  Do you
agree, Jakub?

Thanks,
Richard.

> thanks
> Iain
>
> 
>
> Iain Sandoe (4):
>   libiberty: Append  to AC_CHECK_DECLS [PR119218].
>   gcc, configure: When checking for basename, use the same process as
> libiberty [PR119250].
>   gcc, gcov: Use 'lbasename' consistently.
>   rust: Use 'lbasename()' consistently.
>
>  gcc/config.in | 10 --
>  gcc/configure | 18 +++---
>  gcc/configure.ac  | 12 ++--
>  gcc/gcov.cc   |  2 +-
>  gcc/rust/metadata/rust-export-metadata.cc |  2 +-
>  libiberty/config.in   |  3 +++
>  libiberty/configure   | 12 +---
>  libiberty/configure.ac|  9 +++--
>  8 files changed, 50 insertions(+), 18 deletions(-)
>
> --
> 2.39.2 (Apple Git-143)
>


Re: [PATCH 0/4] Update the configure of host 'basename'.

2025-03-25 Thread Jakub Jelinek
On Tue, Mar 25, 2025 at 11:02:39AM +0100, Richard Biener wrote:
> On Mon, Mar 24, 2025 at 9:54 AM Iain Sandoe  wrote:
> >
> > Currently, we misconfigure GCC on POSIX platforms that require the
> > inclusion of  to declare 'basename()'.
> >
> > The series here does the following:
> >  - ensures that the libiberty configure caters for platforms that need
> > (it does not alter the outcome on those that also have
> >basename() in libc). [PR119218]
> >  - ensures that the gcc/ configure matches the behaviour of
> >libiberty [PR119250]
> >  - switches the remaining two uses of host 'basename()' to use the
> >libiberty 'lbasename()'.
> >
> > Despite the last change, the first two are still needed to allow the
> > inclusion of  in GCC sources (otherwise the host definition
> > clashes with the libiberty one).
> >
> > At some stage (not proposed in this patch series) perhaps we should
> > just poison the host basename/dirname and require use of the libiberty
> > replacements.
> >
> > All tested on x86_64-linux, darwin, aarch64-linux, darwin.
> > OK for trunk? (when?)
> 
> This all looks reasonable, so OK from my side, even now.  Do you
> agree, Jakub?

Yes.

Jakub
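
For readers following the basename thread above: POSIX basename() is
allowed to modify its argument or to return a pointer to static storage,
whereas libiberty's lbasename() returns a pointer that lies within the
string it was given.  A minimal sketch of that behaviour for Unix-style
paths only (sketch_lbasename is a hypothetical name; the real lbasename
also copes with DOS-style separators and drive letters on hosts that need
them):

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for lbasename on Unix-style paths: returns a
// pointer into PATH just past the last '/', never modifying PATH.
static const char *
sketch_lbasename (const char *path)
{
  assert (path != nullptr);
  const char *slash = std::strrchr (path, '/');
  return slash ? slash + 1 : path;
}
```

Because the result always points into the caller's string, no copy is
needed before the call and nothing has to be freed afterwards.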



[PATCH] i386: Add "s_" as Saturation for AVX10.2 Converting Intrinsics.

2025-03-25 Thread Hu, Lin1
Hi, all

This patch aims to add "s_" after 'cvt' to represent saturation.

Bootstrapped and regtested on x86_64-linux-gnu-{-m32,-m64}, OK for trunk?

BRs,
Lin

gcc/ChangeLog:

* config/i386/avx10_2-512convertintrin.h: Modify intrin name.
* config/i386/avx10_2convertintrin.h: Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/avx10_2-512-convert-1.c: Modify function name.
* gcc.target/i386/avx10_2-512-vcvt2ph2bf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvt2ph2hf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtbiasph2bf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtbiasph2hf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtph2bf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-512-vcvtph2hf8s-2.c: Ditto.
* gcc.target/i386/avx10_2-convert-1.c: Ditto.
---
 gcc/config/i386/avx10_2-512convertintrin.h| 50 +--
 gcc/config/i386/avx10_2convertintrin.h| 88 +--
 .../gcc.target/i386/avx10_2-512-convert-1.c   | 36 
 .../i386/avx10_2-512-vcvt2ph2bf8s-2.c |  6 +-
 .../i386/avx10_2-512-vcvt2ph2hf8s-2.c |  6 +-
 .../i386/avx10_2-512-vcvtbiasph2bf8s-2.c  |  6 +-
 .../i386/avx10_2-512-vcvtbiasph2hf8s-2.c  |  6 +-
 .../i386/avx10_2-512-vcvtph2bf8s-2.c  |  6 +-
 .../i386/avx10_2-512-vcvtph2hf8s-2.c  |  6 +-
 .../gcc.target/i386/avx10_2-convert-1.c   | 72 +++
 10 files changed, 141 insertions(+), 141 deletions(-)

diff --git a/gcc/config/i386/avx10_2-512convertintrin.h b/gcc/config/i386/avx10_2-512convertintrin.h
index 8007cf36d76..611a40d83e2 100644
--- a/gcc/config/i386/avx10_2-512convertintrin.h
+++ b/gcc/config/i386/avx10_2-512convertintrin.h
@@ -49,7 +49,7 @@ _mm512_cvtx2ps_ph (__m512 __A, __m512 __B)
 extern __inline __m512h
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm512_mask_cvtx2ps_ph (__m512h __W, __mmask32 __U, __m512 __A,
- __m512 __B)
+   __m512 __B)
 {
   return (__m512h) __builtin_ia32_vcvt2ps2phx512_mask_round ((__v16sf) __A,
 (__v16sf) __B,
@@ -86,7 +86,7 @@ _mm512_cvtx_round2ps_ph (__m512 __A, __m512 __B, const int __R)
 extern __inline __m512h
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm512_mask_cvtx_round2ps_ph (__m512h __W, __mmask32 __U, __m512 __A,
-__m512 __B, const int __R)
+ __m512 __B, const int __R)
 {
   return (__m512h) __builtin_ia32_vcvt2ps2phx512_mask_round ((__v16sf) __A,
(__v16sf) __B,
@@ -98,7 +98,7 @@ _mm512_mask_cvtx_round2ps_ph (__m512h __W, __mmask32 __U, __m512 __A,
 extern __inline __m512h
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm512_maskz_cvtx_round2ps_ph (__mmask32 __U, __m512 __A,
- __m512 __B, const int __R)
+  __m512 __B, const int __R)
 {
   return (__m512h) __builtin_ia32_vcvt2ps2phx512_mask_round ((__v16sf) __A,
(__v16sf) __B,
@@ -166,7 +166,7 @@ _mm512_maskz_cvtbiasph_bf8 (__mmask32 __U, __m512i __A, __m512h __B)
 
 extern __inline__ __m256i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm512_cvtbiassph_bf8 (__m512i __A, __m512h __B)
+_mm512_cvts_biasph_bf8 (__m512i __A, __m512h __B)
 {
   return (__m256i) __builtin_ia32_vcvtbiasph2bf8s512_mask ((__v64qi) __A,
   (__v32hf) __B,
@@ -177,8 +177,8 @@ _mm512_cvtbiassph_bf8 (__m512i __A, __m512h __B)
 
 extern __inline__ __m256i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm512_mask_cvtbiassph_bf8 (__m256i __W, __mmask32 __U,
-   __m512i __A, __m512h __B)
+_mm512_mask_cvts_biasph_bf8 (__m256i __W, __mmask32 __U,
+__m512i __A, __m512h __B)
 {
   return (__m256i) __builtin_ia32_vcvtbiasph2bf8s512_mask ((__v64qi) __A,
   (__v32hf) __B,
@@ -188,7 +188,7 @@ _mm512_mask_cvtbiassph_bf8 (__m256i __W, __mmask32 __U,
 
 extern __inline__ __m256i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm512_maskz_cvtbiassph_bf8 (__mmask32 __U, __m512i __A, __m512h __B)
+_mm512_maskz_cvts_biasph_bf8 (__mmask32 __U, __m512i __A, __m512h __B)
 {
   return (__m256i) __builtin_ia32_vcvtbiasph2bf8s512_mask ((__v64qi) __A,
   (__v32hf) __B,
@@ -232,7 +232,7 @@ _mm512_maskz_cvtbiasph_hf8 (__mmask32 __U, __m512i __A, __m512h __B)
 
 extern __inline__ __m256i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm512_cvtbiassph_hf8 (__m512i __A, __m512h __B)
+_mm512_cvts_biasph_hf8 (__m512i __A, __m512h __B)
 {
   return (__m256i) __builtin_ia32_vcvtbiasph2hf8s5

Re: [PATCH][RFC] [cobol] change cbl_field_data_t::etc_t::value from _Float128 to tree

2025-03-25 Thread Richard Biener
On Mon, 24 Mar 2025, Jakub Jelinek wrote:

> On Mon, Mar 24, 2025 at 10:55:44PM +0100, Jakub Jelinek wrote:
> > If it was HOST_WIDE_INT_MAX + (size_t) 1 to ~(size_t) 0, previously it would
> > be false and now is false.
> 
> Sorry, this case used to be false and now is true.

Just to add, when writing this I wondered whether a

bool real_is_integer (const REAL_VALUE_TYPE *, wide_int *, int);

would be useful, or adding an optional bool *exact arg to the
existing real_to_integer.

The semantics of -0.0 vs. 0.0 and of INF/NaN (which previously, when
cast to (size_t), was UB; likewise for out-of-bounds values?)
are of course details that would need to be documented.

I think the main issue with my transform is that it lost the
non-negative check.  The problem with the original code is that
it lacks documentation on the intent of the check.

Richard.
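
To make the open questions concrete, here is a rough sketch of the kind of
check being discussed, written against plain double rather than
REAL_VALUE_TYPE.  Everything here is an assumption for illustration: the
name sketch_real_is_size_t is hypothetical, and the choices made (accept
-0.0 as zero, reject INF/NaN, negative and out-of-range values) are just
one possible way to document the semantics, not what GCC does.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>

// Hypothetical analogue of the proposed check.  Returns true and stores
// the value in *OUT only when V is finite, non-negative, integral and
// representable as size_t.
static bool
sketch_real_is_size_t (double v, std::size_t *out)
{
  assert (out != nullptr);
  if (!std::isfinite (v))       // rejects INF and NaN
    return false;
  if (v < 0.0)                  // -0.0 compares equal to 0.0, so it passes
    return false;
  if (std::trunc (v) != v)      // rejects fractional values
    return false;
  // 2^digits is the first value that no longer fits in size_t.
  if (v >= std::ldexp (1.0, std::numeric_limits<std::size_t>::digits))
    return false;
  *out = static_cast<std::size_t> (v);
  return true;
}
```

Doing the range and sign tests before the cast is what avoids the
undefined behaviour mentioned above, and it keeps the non-negative
requirement explicit in the code.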


[PATCH] testsuite: add testcase for recent alias fix

2025-03-25 Thread Sam James
r15-7961-gdc47161c1f32c3 fixes a typo in ao_compare::compare_ao_refs
but there wasn't a testcase available at the time. Now there is.

Thanks to Andrew for the testcase.

gcc/testsuite/ChangeLog:
PR testsuite/119382

* gcc.dg/ipa/ipa-icf-40.c: New test.

Co-authored-by: Andrew Pinski 
---
OK? Fails with that commit reverted.

 gcc/testsuite/gcc.dg/ipa/ipa-icf-40.c | 16 
 1 file changed, 16 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/ipa/ipa-icf-40.c

diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-icf-40.c b/gcc/testsuite/gcc.dg/ipa/ipa-icf-40.c
new file mode 100644
index ..ab328ba33412
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/ipa-icf-40.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-ipa-icf-optimized" } */
+
+int c0 = 0;
+typedef int v4si __attribute__((vector_size(4*sizeof(int))));
+v4si a;
+int f()
+{
+return a[c0];
+}
+int g()
+{
+return a[c0];
+}
+
+/* { dg-final { scan-ipa-dump "optimized: Semantic equality hit:f/\[0-9+\]+->g/\[0-9+\]+" "icf" } } */

base-commit: 127a24ede2f82eafecb5eb142e21dbda38d06c18
-- 
2.49.0



[PATCH] tailc: Only diagnose musttail failures during tailc or musttail passes [PR119376]

2025-03-25 Thread Jakub Jelinek
Hi!

The following testcases FAIL because musttail failures are diagnosed
not just in the tailc or musttail passes, but also during the tailr1
and tailr2 passes.
The tailr1 pass runs before IPA, and in the testcases EH cleanup has not
yet cleaned up the IL sufficiently to make the musttail calls pass;
even tailr2 could be too early.

The following patch diagnoses them only during the tailc pass and, if that
pass is not actually executed, during the musttail pass.
To do it only in the tailc pass, I chose to pass a new bool flag, because
while we have the opt_tailcalls argument, it is actually passed by reference
to find_tail_calls and sometimes cleared during that.
When the new DIAG_MUSTTAIL flag is not set, musttail calls are handled like
any other calls: we simply silently punt on those if they can't be turned
into tail calls.

Furthermore, I had to tweak the musttail pass gate.  Previously it was
!flag_optimize_sibling_calls && f->has_musttail.  The problem is that
gate of tailr and tailc passes is
flag_optimize_sibling_calls != 0 && dbg_cnt (tail_call)
and furthermore, tailc pass is only in the normal optimization queue,
so only if not -O0 or -Og.  So when one would use tail_call dbg_cnt
with some limit, or when e.g. using -foptimize-sibling-calls with -O0 or
-Og, nothing would actually diagnose invalid musttail calls or set tail call
flags on those if they are ok.  I could insert a new PROP_ flag on whether
musttail has been handled by tailc pass, but given that we have the
cfun->has_musttail flag already and nothing after tailc/musttail passes uses
it, I think it is easier to just clear the flag when musttail failures are
diagnosed and correct ones have the [[tail call]] flag added.  Expansion will
then only look at the [[tail call]] flag; it could even look at the [[must tail
call]] flag, but I don't see a point in checking cfun->has_musttail.
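
The gap in the old gating described above can be illustrated with a toy
model.  This is only a sketch of the reasoning in this mail, not code from
tree-tailcall.cc; the struct, its fields and the function names are all
hypothetical stand-ins.

```cpp
#include <cassert>

// Toy model only; all names are hypothetical.
struct sketch_config
{
  bool optimize_sibling_calls;  // stands in for flag_optimize_sibling_calls
  bool optimizing;              // false for -O0/-Og: tailc is not queued
  bool dbg_cnt_ok;              // stands in for dbg_cnt (tail_call)
};

static bool
tailc_runs (const sketch_config &c)
{
  return c.optimizing && c.optimize_sibling_calls && c.dbg_cnt_ok;
}

// Old scheme: the musttail pass is gated on
// !flag_optimize_sibling_calls && has_musttail.
static bool
musttail_handled_old (const sketch_config &c, bool has_musttail)
{
  bool musttail_pass_runs = !c.optimize_sibling_calls && has_musttail;
  return !has_musttail || tailc_runs (c) || musttail_pass_runs;
}

// New scheme: tailc clears has_musttail once it has handled the calls,
// and the musttail pass is gated on has_musttail alone.
static bool
musttail_handled_new (const sketch_config &c, bool has_musttail)
{
  bool handled = !has_musttail;
  if (tailc_runs (c))
    {
      handled = true;
      has_musttail = false;
    }
  if (has_musttail)             // the musttail pass gate
    handled = true;
  return handled;
}
```

In this model, -foptimize-sibling-calls at -O0 or a tail_call dbg_cnt limit
leaves musttail calls unhandled under the old gate, while the new gate
always lets one of the two passes deal with them.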

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2025-03-25  Jakub Jelinek  

PR ipa/119376
* tree-tailcall.cc (suitable_for_tail_opt_p): Add DIAG_MUSTTAIL
argument, propagate it down to maybe_error_musttail.
(suitable_for_tail_call_opt_p): Likewise.
(maybe_error_musttail): Add DIAG_MUSTTAIL argument.  Don't emit error
for gimple_call_must_tail_p calls if it is false.
(find_tail_calls): Add DIAG_MUSTTAIL argument, propagate it down to
maybe_error_musttail, suitable_for_tail_opt_p,
suitable_for_tail_call_opt_p and find_tail_calls calls.
(tree_optimize_tail_calls_1): Add DIAG_MUSTTAIL argument, propagate
it down to find_tail_calls and if set, clear cfun->has_musttail flag
at the end.  Rename OPT_MUSTCALL argument to OPT_MUSTTAIL.
(execute_tail_calls): Pass true to DIAG_MUSTTAIL
tree_optimize_tail_calls_1 argument.
(pass_tail_recursion::execute): Pass false to DIAG_MUSTTAIL
tree_optimize_tail_calls_1 argument.
(pass_musttail::gate): Don't test flag_optimize_sibling_calls.
(pass_musttail::execute): Pass true to DIAG_MUSTTAIL
tree_optimize_tail_calls_1 argument.

* g++.dg/torture/musttail1.C: New test.
* g++.dg/opt/musttail2.C: New test.

--- gcc/tree-tailcall.cc.jj 2025-01-16 09:27:53.645909094 +0100
+++ gcc/tree-tailcall.cc 2025-03-24 12:51:56.271628242 +0100
@@ -139,18 +139,18 @@ static tree m_acc, a_acc;
 
 static bitmap tailr_arg_needs_copy;
 
-static void maybe_error_musttail (gcall *call, const char *err);
+static void maybe_error_musttail (gcall *call, const char *err, bool);
 
 /* Returns false when the function is not suitable for tail call optimization
from some reason (e.g. if it takes variable number of arguments). CALL
is call to report for.  */
 
 static bool
-suitable_for_tail_opt_p (gcall *call)
+suitable_for_tail_opt_p (gcall *call, bool diag_musttail)
 {
   if (cfun->stdarg)
 {
-  maybe_error_musttail (call, _("caller uses stdargs"));
+  maybe_error_musttail (call, _("caller uses stdargs"), diag_musttail);
   return false;
 }
 
@@ -163,7 +163,7 @@ suitable_for_tail_opt_p (gcall *call)
tail call discovery happen. CALL is call to report error for.  */
 
 static bool
-suitable_for_tail_call_opt_p (gcall *call)
+suitable_for_tail_call_opt_p (gcall *call, bool diag_musttail)
 {
   tree param;
 
@@ -171,7 +171,7 @@ suitable_for_tail_call_opt_p (gcall *cal
  sibling call optimizations, but not tail recursion.  */
   if (cfun->calls_alloca)
 {
-  maybe_error_musttail (call, _("caller uses alloca"));
+  maybe_error_musttail (call, _("caller uses alloca"), diag_musttail);
   return false;
 }
 
@@ -181,7 +181,8 @@ suitable_for_tail_call_opt_p (gcall *cal
   if (targetm_common.except_unwind_info (&global_options) == UI_SJLJ
   && current_function_has_exception_handlers ())
 {
-  maybe_error_musttail (call, _("caller uses sjlj exceptions"));
+  maybe_error_musttail (call, _("caller uses sjlj exceptions"),
+ 

[PATCH 2/3] arm: Add missing multilib default values

2025-03-25 Thread Keith Packard
The arm multilib configuration includes two more parameters which
affect multilib selection, marm/mthumb and mfloat-abi. Without those,
the default multilib selection is mis-specified and the only reason it
works is because '.' is the fall-back path.

Add "marm" and "mfloat-abi=soft" to MULTILIB_DEFAULTS to actually
match when the compiler is run without any target parameters.

This hasn't caused any problems in practice because there are no
non-default multilib options which can be applied to the default
-march target as it has neither an FPU nor any branch protection
support. Specifying another cpu or architecture always sets -marm and
-mfloat-abi and so those multilib configurations don't rely on the
defaults.

Signed-off-by: Keith Packard 
---
 gcc/config/arm/arm-mlib.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/arm/arm-mlib.h b/gcc/config/arm/arm-mlib.h
index 211f222f00d..a8b673bda31 100644
--- a/gcc/config/arm/arm-mlib.h
+++ b/gcc/config/arm/arm-mlib.h
@@ -19,4 +19,4 @@
along with GCC; see the file COPYING3.  If not see
.  */
 
-#define MULTILIB_DEFAULTS { "mbranch-protection=none" }
+#define MULTILIB_DEFAULTS { "mbranch-protection=none", "marm", "mfloat-abi=soft", }
-- 
2.49.0



[PATCH 3/3] gcc: Add --enable-multilib-space option

2025-03-25 Thread Keith Packard
This option adds a per-multilib variant that specifies -Os
instead of the default.

Signed-off-by: Keith Packard 
---
 config-ml.in |  2 +-
 gcc/Makefile.in  | 32 +++-
 gcc/configure| 13 +
 gcc/configure.ac |  7 +++
 gcc/doc/install.texi | 12 
 5 files changed, 60 insertions(+), 6 deletions(-)

diff --git a/config-ml.in b/config-ml.in
index 7934a1ddf4b..c01fe4741c2 100644
--- a/config-ml.in
+++ b/config-ml.in
@@ -175,7 +175,7 @@ eval scan_arguments "${ac_configure_args}"
 unset scan_arguments
 
 # Only do this if --enable-multilib.
-if [ "${enable_multilib}" = yes ]; then
+if [ "${enable_multilib}" = yes -o "${enable_multilib_space}" = yes ]; then
 
 # Compute whether this is the library's top level directory
 # (ie: not a multilib subdirectory, and not a subdirectory like newlib/src).
diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 9ca389a9c61..c2263fae1d5 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -2470,25 +2470,47 @@ libgcc.mvars: config.status Makefile specs xgcc$(exeext)
 
mv tmp-libgcc.mvars libgcc.mvars
 
+ifeq (@enable_multilib_space@,yes)
+MULTILIB_OPTIONS   += Os
+MULTILIB_DIRNAMES  += space
+MULTILIB_MATCHES   += Os=Oz
+
+MULTILIB_OSDIRNAMES_SPACE = $(MULTILIB_OSDIRNAMES)\
+   $(if $(findstring =,$(MULTILIB_OSDIRNAMES)),\
+ $(foreach OSD,$(MULTILIB_OSDIRNAMES),$(subst =,/Os=,$(OSD))/space),\
+ $(if $(MULTILIB_OSDIRNAMES),space,))
+MULTILIB_REQUIRED_SPACE = $(if $(MULTILIB_REQUIRED),Os $(foreach REQ, $(MULTILIB_REQUIRED), $(REQ) $(REQ)/Os),)
+MULTILIB_EXCEPTIONS_SPACE = $(foreach EXC, $(MULTILIB_EXCEPTIONS), $(EXC) $(EXC)/Os)
+MULTILIB_REUSE_SPACE = $(foreach REU, $(MULTILIB_REUSE), $(REU) $(subst =,/Os=,$(REU))/Os)
+MULTILIB_ENABLE = yes
+else
+MULTILIB_OSDIRNAMES_SPACE = $(MULTILIB_OSDIRNAMES)
+MULTILIB_REQUIRED_SPACE = $(MULTILIB_REQUIRED)
+MULTILIB_EXCEPTIONS_SPACE = $(MULTILIB_EXCEPTIONS)
+MULTILIB_REUSE_SPACE = $(MULTILIB_REUSE)
+MULTILIB_ENABLE = @enable_multilib@
+endif
+
 # Use the genmultilib shell script to generate the information the gcc
 # driver program needs to select the library directory based on the
 # switches.
 multilib.h: s-mlib; @true
 s-mlib: $(srcdir)/genmultilib Makefile
if test @enable_multilib@ = yes \
+   || test @enable_multilib_space@ = yes \
   || test -n "$(MULTILIB_OSDIRNAMES)"; then \
  $(SHELL) $(srcdir)/genmultilib \
"$(MULTILIB_OPTIONS)" \
"$(MULTILIB_DIRNAMES)" \
"$(MULTILIB_MATCHES)" \
-   "$(MULTILIB_EXCEPTIONS)" \
+   "$(MULTILIB_EXCEPTIONS_SPACE)" \
"$(MULTILIB_EXTRA_OPTS)" \
"$(MULTILIB_EXCLUSIONS)" \
-   "$(MULTILIB_OSDIRNAMES)" \
-   "$(MULTILIB_REQUIRED)" \
+   "$(MULTILIB_OSDIRNAMES_SPACE)" \
+   "$(MULTILIB_REQUIRED_SPACE)" \
"$(if $(MULTILIB_OSDIRNAMES),,$(MULTIARCH_DIRNAME))" \
-   "$(MULTILIB_REUSE)" \
-   "@enable_multilib@" \
+   "$(MULTILIB_REUSE_SPACE)" \
+   "$(MULTILIB_ENABLE)" \
> tmp-mlib.h; \
else \
  $(SHELL) $(srcdir)/genmultilib '' '' '' '' '' '' '' '' \
diff --git a/gcc/configure b/gcc/configure
index 063b9ce6701..08d23c0ccb6 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -859,6 +859,7 @@ DEFAULT_MATCHPD_PARTITIONS
 with_float
 with_cpu
 enable_multiarch
+enable_multilib_space
 enable_multilib
 coverage_flags
 valgrind_command
@@ -980,6 +981,7 @@ enable_coverage
 enable_gather_detailed_mem_stats
 enable_valgrind_annotations
 enable_multilib
+enable_multilib_space
 enable_multiarch
 with_stack_clash_protection_guard_size
 with_matchpd_partitions
@@ -1723,6 +1725,7 @@ Optional Features:
   --enable-valgrind-annotations
   enable valgrind runtime interaction
   --enable-multilib   enable library support for multiple ABIs
+  --enable-multilib-space enable extra -Os variant for every multilib ABI
   --enable-multiarch  enable support for multiarch paths
   --enable-__cxa_atexit   enable __cxa_atexit for C++
   --enable-decimal-float={no,yes,bid,dpd}
@@ -7860,6 +7863,16 @@ fi
 
 
 
+# Determine whether or not -Os multilibs are enabled.
+# Check whether --enable-multilib-space was given.
+if test "${enable_multilib_space+set}" = set; then :
+  enableval=$enable_multilib_space;
+else
+  enable_multilib_space=no
+fi
+
+
+
 # Determine whether or not multiarch is enabled.
 # Check whether --enable-multiarch was given.
 if test "${enable_multiarch+set}" = set; then :
diff --git a/gcc/configure.ac b/gcc/configure.ac
index 3243472680c..0ab6a6a6858 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -864,6 +864,13 @@ AC_ARG_ENABLE(multilib,
 [], [enable_multilib=yes])
 AC_SUBST(enable_multilib)
 
+# Determine whether or not -Os multilibs are enabled.
+AC_ARG_ENABLE(multilib-space,
+[AS_HELP_STRING([--enable-multilib-space],
+   [enable extra -Os varian

[PATCH 1/3] libgcc: Use -Os/-Oz from CC or CFLAGS

2025-03-25 Thread Keith Packard
Override other optimization settings with any -Os or -Oz found in CC
or CFLAGS.

Signed-off-by: Keith Packard 
---
 libgcc/Makefile.in | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/libgcc/Makefile.in b/libgcc/Makefile.in
index 0719fd0615d..a157e28cbb7 100644
--- a/libgcc/Makefile.in
+++ b/libgcc/Makefile.in
@@ -294,16 +294,20 @@ override CFLAGS := $(filter-out -fprofile-generate -fprofile-use,$(CFLAGS))
 # CFLAGS first is not perfect; normally setting CFLAGS should override any
 # options in LIBGCC2_CFLAGS.  But LIBGCC2_CFLAGS may contain -g0, and CFLAGS
 # will usually contain -g, so for the moment CFLAGS goes first.  We must
-# include CFLAGS - that's where multilib options live.
+# include CFLAGS - that's where multilib options live. If CC or CFLAGS
+# specify -Os or -Oz, we want that to override our local options as that
+# could be a multilib flag.
 INTERNAL_CFLAGS = $(CFLAGS) $(LIBGCC2_CFLAGS) $(HOST_LIBGCC2_CFLAGS) \
- $(INCLUDES) @set_have_cc_tls@ @set_use_emutls@
+ $(INCLUDES) @set_have_cc_tls@ @set_use_emutls@ \
+ $(filter -Os -Oz,$(CC) $(CFLAGS))
 
 # Options to use when compiling crtbegin/end.
 CRTSTUFF_CFLAGS = -O2 $(GCC_CFLAGS) $(INCLUDES) $(MULTILIB_CFLAGS) -g0 \
   $(NO_PIE_CFLAGS) -finhibit-size-directive -fno-inline -fno-exceptions \
   -fno-zero-initialized-in-bss -fno-toplevel-reorder -fno-tree-vectorize \
   -fbuilding-libgcc -fno-stack-protector $(FORCE_EXPLICIT_EH_REGISTRY) \
-  $(INHIBIT_LIBC_CFLAGS) $(USE_TM_CLONE_REGISTRY)
+  $(INHIBIT_LIBC_CFLAGS) $(USE_TM_CLONE_REGISTRY) \
+  $(filter -Os -Oz,$(CC) $(CFLAGS))
 
 # Extra flags to use when compiling crt{begin,end}.o.
 CRTSTUFF_T_CFLAGS =
-- 
2.49.0



Re: [PATCH] testsuite: Fix gcc.target/powerpc/vsx-builtin-7.c test [PR119382]

2025-03-25 Thread Peter Bergner
On 3/25/25 5:17 PM, Segher Boessenkool wrote:
> On Tue, Mar 25, 2025 at 03:33:59PM -0500, Peter Bergner wrote:
>> Segher, any reason you can give on why we shouldn't go the easy route to
>> "fix" (yes, these are air-quotes) this by using -fno-ipa-icf?
> 
> One reason is that that option should not make any difference whatsoever
> for a well-written testcase: a testcase that wants to test what insns
> are generated for particular code, damn well should be written in such a
> way that it is very unlikely the compiler will ever generate different
> code for it.  Another reason is I had to look up what that option with
> the cryptic name does, what that name stands for.  And finally, will
> we be doing more maintenance on this later?  Testcase maintenance is
> wasted work, work that does not scale even, so it is important to write
> testcases so that maintenance isn't needed, and if it becomes necessary
> anyway to improve it so that it will not be needed so much in the
> future.

I know there are reasons for wanting it split up, but do we really want
to spend the development time splitting this old power7 test case up rather
than just adding the -fno-ipa-icf option?  You also didn't explicitly say
which solution we should go with, so we're in a little limbo here.

Peter



[committed] cobol: Changes to eliminate _Float128 from the front end

2025-03-25 Thread Robert Dubner
I am putting up this e-mail for the record.  I asked myself if it was
"okay for trunk?", and myself answered "If it's not, I quit!"

When merged into the cobolworx test environment, all of our tests pass.

When merged into master, the results compile, and check-cobol, such as
it is, succeeds.

I just pushed it into master.

From a4e0d3376b02b2cae7880038e66f241a4942c488 Mon Sep 17 00:00:00 2001
From: Bob Dubner mailto:rdub...@symas.com
Date: Tue, 25 Mar 2025 15:38:38 -0400
Subject: [PATCH] cobol: Changes to eliminate _Float128 from the front end
 [PR119241]

These changes switch _Float128 types to REAL_VALUE_TYPE in the front end.
Some __int128 variables and function return values are changed to
FIXED_WIDE_INT(128)

gcc/cobol

PR cobol/119241
* cdf.y: (cdfval_base_t::operator()): Return const.
* cdfval.h: (struct cdfval_base_t): Add const cdfval_base_t&
operator().
(struct cdfval_t): Add cdfval_t constructor.  Change cdf_value
definitions.
* gcobolspec.cc (lang_specific_driver): Formatting fix.
* genapi.cc: Include fold-const.h and realmpfr.h.
(initialize_variable_internal): Use real_to_decimal instead of
strfromf128.
(get_binary_value_from_float): Use wide_int_to_tree instead of
build_int_cst_type.
(psa_FldLiteralN): Use fold_convert instead of strfromf128,
real_from_string and build_real.
(parser_display_internal): Rewritten to work on REAL_VALUE_TYPE
rather than _Float128.
(mh_source_is_literalN): Use FIXED_WIDE_INT(128) rather than
__int128, wide_int_to_tree rather than build_int_cst_type,
fold_convert rather than build_string_literal.
(real_powi10): New function.
(binary_initial_from_float128): Change type of last argument from
_Float128 to REAL_VALUE_TYPE, process it using real.cc and mpfr
APIs.
(digits_from_float128): Likewise.
(initial_from_float128): Make static.  Remove value argument, add
local REAL_VALUE_TYPE value variable instead, process it using
real.cc and native_encode_expr APIs.
(parser_symbol_add): Adjust initial_from_float128 caller.
* genapi.h (initial_from_float128): Remove declaration.
* genutil.cc (get_power_of_ten): Change return type from __int128
to FIXED_WIDE_INT(128), ditto for retval type, change type of pos
from __int128 to unsigned long long.
(scale_by_power_of_ten_N): Use wide_int_to_tree instead of
build_int_cst_type.  Use FIXED_WIDE_INT(128) instead of __int128
as power_of_ten variable type.
(copy_little_endian_into_place): Likewise.
* genutil.h (get_power_of_ten): Change return type from __int128
to FIXED_WIDE_INT(128).
* parse.y (%union): Change type of float128 from _Float128 to
REAL_VALUE_TYPE.
(string_of): Change argument type from _Float128 to
const REAL_VALUE_TYPE &, use real_to_decimal rather than
strfromf128.  Add another overload with tree argument type.
(field: cdf): Use real_zerop rather than comparison against 0.0.
(occurs_clause, const_value): Use real_to_integer.
(value78): Use build_real and real_to_integer.
(data_descr1): Use real_to_integer.
(count): Use real_to_integer, real_from_integer and real_identical
instead of direct comparison.
(value_clause): Use real_from_string3 instead of num_str2i.  Use
real_identical instead of direct comparison.  Use build_real.
(allocate): Use real_isneg and real_iszero instead of <= 0
comparison.
(move_tgt): Use real_to_integer, real_value_truncate,
real_from_integer and real_identical instead of comparison of
casts.
(cce_expr): Use real_arithmetic and real_convert or
real_value_negate instead of direct arithmetics on _Float128.
(cce_factor): Use real_from_string3 instead of numstr2i.
(literal_refmod_valid): Use real_to_integer.
* symbols.cc (symbol_table_t::registers_t::registers_t):
Formatting fix.
(ERROR_FIELD): Likewise.
(extend_66_capacity): Likewise.
(cbl_occurs_t::subscript_ok): Use real_to_integer,
real_from_integer and real_identical.
* symbols.h (cbl_field_data_t::etc_t::value): Change type from
_Float128 to tree.
(cbl_field_data_t::etc_t::etc_t): Adjust defaulted argument value.
(cbl_field_data_t::cbl_field_data_t): Formatting fix.  Use etc()
rather than etc(0).
(cbl_field_data_t::value_of): Change return type from _Float128 to
tree.
(cbl_field_data_t::operator=): Change return and argument type
from _Float128 to tree.
(cbl_field_data_t::valify): Use real_from_string,
real_value_truncate and build_real.
(cbl_field_t::same_as): Use build_zero_cst instead of
_Float128(0.0).

gcc/testsuite

* cobol.dg/liter

Re: [PATCH] c++: Fix ICE when template lambdas call with default parameters in unevaluated context

2025-03-25 Thread Jason Merrill

On 3/25/25 11:43 AM, yxj-github-437 wrote:

This patch avoids the ICE when a template lambda is called with
default parameters in an unevaluated context. The bug is the same as
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119385. For example:

  1   | template <typename T>
  2   | void foo(T x) {
  3   |   sizeof [](T=x) { return 0; }();
  4   | }
  5   |
  6   | void test() {
  7   |   foo(0);
  8   | }

When compiled with -fsyntax-only -std=c++20, it produces an ICE similar to the one below:

test.cc: In instantiation of 'void foo(T) [with T = int]':
test.cc:7:6:   required from here
  6 |   foo(0);
|   ~~~^~~
test.cc:3:38: internal compiler error: in tsubst_expr, at cp/pt.cc:21919
  2 |   sizeof [](T=x) { return 0; }();
|  ^~

And without the template code ``, the code compiles successfully, so the
ICE is wrong.

When parsing the lambda, the enclosing sizeof affects how the
unevaluated-operand state is handled inside the lambda.
So save and restore cp_unevaluated_operand.

gcc/cp/ChangeLog:

* parser.cc (cp_parser_lambda_expression): Save/restore
cp_unevaluated_operand when parsing a lambda.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/lambda-uneval25.C: New test.
---
   gcc/cp/parser.cc |  4 
   gcc/testsuite/g++.dg/cpp2a/lambda-uneval25.C | 11 +++
   2 files changed, 15 insertions(+)
   create mode 100644 gcc/testsuite/g++.dg/cpp2a/lambda-uneval25.C

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 57a461042bf..9cc51f57fa7 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -11754,6 +11754,8 @@ cp_parser_lambda_expression (cp_parser* parser)
   /* Inside the class, surrounding template-parameter-lists do not apply.  */
   unsigned int saved_num_template_parameter_lists
   = parser->num_template_parameter_lists;
+/* Inside the lambda, outside unevaluated context do not apply.   */
+int saved_cp_unevaluated_operand = cp_unevaluated_operand;



Instead of following the surrounding pattern, please use cp_evaluated.
That avoids the need for any change in the other two places.


OK, I've replaced it with cp_evaluated.


Pushed, thanks.  I removed an extra space in the comment and adjusted 
the commit message to fit in 75 columns.


Jason
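
For readers unfamiliar with the idiom, the review suggestion amounts to
replacing a hand-written save/restore pair with a scope guard.  The sketch
below is a simplified stand-in and not GCC's actual cp_evaluated;
unevaluated_depth, evaluated_scope, parse_lambda_body and
parse_sizeof_lambda are hypothetical names used only for illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for a parser-global "unevaluated operand" counter.
int unevaluated_depth = 0;

// Scope guard: saves the counter, resets it to zero for the guarded
// region, and restores the saved value on destruction.
struct evaluated_scope
{
  int saved;
  evaluated_scope () : saved (unevaluated_depth) { unevaluated_depth = 0; }
  ~evaluated_scope () { unevaluated_depth = saved; }
  evaluated_scope (const evaluated_scope &) = delete;
  evaluated_scope &operator= (const evaluated_scope &) = delete;
};

// Returns the counter value seen while "parsing" the lambda body.
int
parse_lambda_body ()
{
  evaluated_scope guard;  // the body is parsed as if evaluated
  return unevaluated_depth;
}

// Simulates `sizeof <lambda>()`: the operand is unevaluated, the body is not.
int
parse_sizeof_lambda ()
{
  ++unevaluated_depth;
  int seen_in_body = parse_lambda_body ();
  assert (unevaluated_depth == 1);  // restored by the guard
  --unevaluated_depth;
  return seen_in_body;
}
```

Because the restore happens in the destructor, every exit path from the
guarded function restores the state, which is presumably why no change is
needed at the other two places.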



Re: [PATCH] toplevel, libcobol: Add dependency on libquadmath build [PR119244].

2025-03-25 Thread Jakub Jelinek
On Tue, Mar 25, 2025 at 05:48:57PM +, Iain Sandoe wrote:
> Tested on x86_64, aarch64-linux and x86_64-darwin, verified that there
> is no change in the libquadmath build on the platforms that do not need
> it.  OK for trunk?
> thanks
> Iain
> 
> --- 8< ---
> 
> For the configuration of libgcobol to be correct for targets that need
> to use libquadmath for 128b FP support, we must be able to find the
> quadmath library (or not, for targets that have the support in libc).
> 
>   PR cobol/119244
> 
> ChangeLog:
> 
>   * Makefile.def: libgcobol configure depends on libquadmath build.
>   * Makefile.in: Regenerate.
> 
> Signed-off-by: Iain Sandoe 

LGTM.

Jakub



Re: [PATCH] c++: Properly fold .* [PR114525]

2025-03-25 Thread Simon Martin
Hi,

On Tue Mar 25, 2025 at 6:52 PM CET, Jason Merrill wrote:
> On 3/25/25 1:50 PM, Marek Polacek wrote:
>> On Tue, Mar 25, 2025 at 05:18:23PM +, Simon Martin wrote:
>>> We've been miscompiling the following since r0-51314-gd6b4ea8592e338 (I
>>> did not go compile something that old, and identified this change via
>>> git blame, so might be wrong)
>>>
>>> === cut here ===
>>> struct Foo { int x; };
>>> Foo& get (Foo &v) { return v; }
>>> void bar () {
>>>Foo v; v.x = 1;
>>>(true ? get (v) : get (v)).*(&Foo::x) = 2;
>>>// v.x still equals 1 here...
>>> }
>>> === cut here ===
>>>
>>> The problem lies in build_m_component_ref, that computes the address of
>>> the COND_EXPR using build_address to build the representation of
>>>(true ? get (v) : get (v)).*(&Foo::x);
>>> and gets something like
>>>&(true ? get (v) : get (v))  // #1
>>> instead of
>>>(true ? &get (v) : &get (v)) // #2
>>> and the write does not go where we want it to, hence the miscompile.
>>>
>>> This patch replaces the call to build_address by a call to
>>> cp_build_addr_expr, which gives #2, that is properly handled.
>>>
>>> Successfully tested on x86_64-pc-linux-gnu. OK for trunk? And for active
>>> branches after 2-3 weeks since it's a nasty one (albeit very old)?
>>>
>>> PR c++/114525
>>>
>>> gcc/cp/ChangeLog:
>>>
>>> * typeck2.cc (build_m_component_ref): Call cp_build_addr_expr
>>> instead of build_address.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> * g++.dg/parse/pr114525.C: New test.
>> 
>> g++.dg/expr/cond18.C seems like a more appropriate place, but the
>> patch itself LGTM.
Good call out, thanks Marek.

I've merged the patch with the suggested test rename as
r15-8911-g35ce9afc84a63f.

I'll reply to this thread in 2-3 weeks when I've backported to 13 and
14.

Simon
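
As an illustration of what form #2 from the quoted description means at the
source level, here is a hand-written sketch that selects the address first
and only then applies the pointer to member.  It reuses Foo and get from the
testcase; write_through_conditional is a hypothetical helper, and treating
this as equivalent to the original expression is an assumption made for
illustration.

```cpp
#include <cassert>

struct Foo { int x; };
Foo &get (Foo &v) { return v; }

// Form #2 spelled out by hand: the conditional yields a pointer to the
// selected object, and the pointer to member is applied to that object.
int
write_through_conditional (bool cond)
{
  Foo v;
  v.x = 1;
  Foo *p = cond ? &get (v) : &get (v);
  assert (p == &v);        // both arms refer to the same object
  (*p).*(&Foo::x) = 2;     // the write must land in v.x
  return v.x;
}
```

With either value of cond the function is expected to return 2, which is the
behaviour the original one-line expression should have as well.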


[PATCH] cobol: Get rid of __int128 uses in the COBOL FE [PR119242]

2025-03-25 Thread Jakub Jelinek
Hi!

The following patch changes some remaining __int128 uses in the FE
into FIXED_WIDE_INT(128), i.e. emulated 128-bit integral type.
The use of wide_int_to_tree directly from that rather than going through
build_int_cst_type means we don't throw away the upper 64 bits of the
values, so the emitting of constants needing full 128 bits can be greatly
simplified.
Plus all the #pragma GCC diagnostic ignored "-Wpedantic" spots aren't
needed, we don't use the _Float128/__int128 types directly in the FE
anymore.

Tested on x86_64-linux with make check-cobol, could you please test this
on UAT/NIST?

Note, PR119241/PR119242 bugs are still not fully fixed, I think the
remaining problem is that several FE sources include
../../libgcobol/libgcobol.h and that header declares various APIs with
__int128 and _Float128 types, so trying to build a cross-compiler on a host
without __int128 and _Float128 will still fail miserably.
I believe none of those APIs are actually used by the FE, so the question is
what the FE needs from libgcobol.h and whether the rest could be wrapped
with #ifndef IN_GCC or #ifndef IN_GCC_FRONTEND or something similar
(those 2 macros are predefined when compiling the FE files).

2025-03-26  Jakub Jelinek  

PR cobol/119242
* cobol/genutil.h (get_power_of_ten): Remove #pragma GCC diagnostic
around declaration.
* cobol/genapi.cc (psa_FldLiteralN): Change type of value from
__int128 to FIXED_WIDE_INT(128).  Remove #pragma GCC diagnostic
around the declaration.  Use wi::min_precision to determine
minimum unsigned precision of the value.  Use wi::neg_p instead
of value < 0 tests and wi::set_bit_in_zero
to build sign bit.  Handle field->data.capacity == 16 like
1, 2, 4 and 8, use wide_int_to_tree instead of build_int_cst.
(mh_source_is_literalN): Remove #pragma GCC diagnostic around
the definition.
(binary_initial_from_float128): Likewise.
* cobol/genutil.cc (get_power_of_ten): Remove #pragma GCC diagnostic
before the definition.

--- gcc/cobol/genutil.h.jj  2025-03-25 21:14:48.448384925 +0100
+++ gcc/cobol/genutil.h 2025-03-25 21:19:24.358620134 +0100
@@ -104,10 +104,7 @@ void  get_binary_value( tree value,
 tree  get_data_address( cbl_field_t *field,
 tree offset);
 
-#pragma GCC diagnostic push
-#pragma GCC diagnostic ignored "-Wpedantic"
 FIXED_WIDE_INT(128) get_power_of_ten(int n);
-#pragma GCC diagnostic pop
 void  scale_by_power_of_ten_N(tree value,
 int N,
 bool check_for_fractional = false);
--- gcc/cobol/genapi.cc.jj  2025-03-25 21:11:06.767409766 +0100
+++ gcc/cobol/genapi.cc 2025-03-25 21:22:28.038113833 +0100
@@ -3798,16 +3798,13 @@ psa_FldLiteralN(struct cbl_field_t *fiel
   // We are constructing a completely static constant structure, based on the
   // text string in .initial
 
-#pragma GCC diagnostic push
-#pragma GCC diagnostic ignored "-Wpedantic"
-  __int128 value = 0;
-#pragma GCC diagnostic pop
+  FIXED_WIDE_INT(128) value = 0;
 
   do
 {
 // This is a false do{}while, to isolate the variables:
 
-// We need to convert data.initial to an __int128 value
+// We need to convert data.initial to an FIXED_WIDE_INT(128) value
 char *p = const_cast<char *>(field->data.initial);
 int sign = 1;
 if( *p == '-' )
@@ -3903,24 +3900,24 @@ psa_FldLiteralN(struct cbl_field_t *fiel
 
 // We now need to calculate the capacity.
 
-unsigned char *pvalue = (unsigned char *)&value;
+unsigned int min_prec = wi::min_precision(value, UNSIGNED);
 int capacity;
-if( *(uint64_t*)(pvalue + 8) )
+if( min_prec > 64 )
   {
   // Bytes 15 through 8 are non-zero
   capacity = 16;
   }
-else if( *(uint32_t*)(pvalue + 4) )
+else if( min_prec > 32 )
   {
   // Bytes 7 through 4 are non-zero
   capacity = 8;
   }
-else if( *(uint16_t*)(pvalue + 2) )
+else if( min_prec > 16 )
   {
   // Bytes 3 and 2
   capacity = 4;
   }
-else if( pvalue[1] )
+else if( min_prec > 8 )
   {
   // Byte 1 is non-zero
   capacity = 2;
@@ -3940,11 +3937,15 @@ psa_FldLiteralN(struct cbl_field_t *fiel
 
 if( capacity < 16 && (field->attr & signable_e) )
   {
-  if( value < 0 && (((pvalue[capacity-1] & 0x80) == 0 )))
+  if( wi::neg_p (value)
+  && (value & wi::set_bit_in_zero(capacity * 8
+   - 1)) != 0 )
 {
 capacity *= 2;
 }
-  else if( value >= 0 && (((pvalue[capacity-1] & 0x80) == 0x80 )))
+  else if( !wi::neg_p (value)
+   && (value & wi::set_bit_in_zero(capacity * 8
+- 1)) == 0 )
 {
 capacity *= 2;
 }
@@ -3964,86 +3965,15 @@ psa_FldLiteralN(struct cbl_field_t *f
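
The capacity selection in the hunk above can be sketched outside of GCC as
follows.  This is only an illustration under stated assumptions: u128 and
min_unsigned_precision are hypothetical stand-ins for FIXED_WIDE_INT(128)
and wi::min_precision (value, UNSIGNED), and the final fallback to a
capacity of 1 is assumed, since that branch is not visible in the quoted
diff.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 128-bit unsigned value stored as two 64-bit halves.
struct u128 { std::uint64_t hi, lo; };

// Number of bits needed to represent V as an unsigned value.
// This helper returns 0 for a zero value.
unsigned
min_unsigned_precision (u128 v)
{
  std::uint64_t word = v.hi ? v.hi : v.lo;
  const unsigned base = v.hi ? 64 : 0;
  unsigned bits = 0;
  while (word)
    {
      ++bits;
      word >>= 1;
    }
  return bits ? base + bits : 0;
}

// Smallest power-of-two byte count (1..16) able to hold MIN_PREC bits,
// mirroring the if/else chain in the patch.
int
capacity_for_precision (unsigned min_prec)
{
  assert (min_prec <= 128);
  if (min_prec > 64)
    return 16;
  if (min_prec > 32)
    return 8;
  if (min_prec > 16)
    return 4;
  if (min_prec > 8)
    return 2;
  return 1;  // assumed fallback, not shown in the quoted hunk
}
```

Comparing a bit count against 64, 32, 16 and 8 does not depend on how the
value is laid out in host memory, unlike the removed code that peeked at
individual bytes through a pointer cast.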

[PATCH 0/3] Automate use of -Os/-Oz as multilib selector

2025-03-25 Thread Keith Packard
Embedded toolchains face conflicting requirements from different
customers. Some need the best possible speed while others are tightly
size constrained. To avoid having every toolchain user need to
re-compile all libraries, it's convenient to add -Os/-Oz as an
additional multilib selector so that provided libraries will be built
both ways.

This series provides a new configure-time option,
--enable-multilib-space, which takes each library configuration and
adds a duplicate with -Os appended.

When this option is not enabled, there is no effect on the resulting
toolchain.

This series includes a couple of preparatory changes:

 * To make this pass the right compiler flags while building libgcc,
   any -Os or -Oz option included in CFLAGS is appended to the arguments
   to override the -O2 option present in LIBGCC2_CFLAGS.

 * The arm multilib configuration was missing a couple of
   MULTILIB_DEFAULTS values that don't affect "normal" operation but
   break things when -Os is added.

Keith Packard (3):
  libgcc: Use -Os/-Oz from CC or CFLAGS
  arm: Add missing multilib default values
  gcc: Add --enable-multilib-space option

 config-ml.in  |  2 +-
 gcc/Makefile.in   | 32 +++-
 gcc/config/arm/arm-mlib.h |  2 +-
 gcc/configure | 13 +
 gcc/configure.ac  |  7 +++
 gcc/doc/install.texi  | 12 
 libgcc/Makefile.in| 10 +++---
 7 files changed, 68 insertions(+), 10 deletions(-)

-- 
2.49.0



[PATCH] i386: Set attr "addr" as "gpr16" for constraint "jm". [PR 119425]

2025-03-25 Thread Hu, Lin1
Hi, all

This patch ensures that each alternative with constraint "jm" sets
attr "addr" to "gpr16"; otherwise an ICE may be raised in the reload pass.

Bootstrapped and Regtested for x86_64-pc-linux-gnu{-m32,-m64}, ok for trunk?

BRs,
Lin

gcc/ChangeLog:

PR target/119425
* config/i386/sse.md:
(vec_set_0): Set the alternative with constraint "jm"'s
attribute "addr" to "gpr16".
(avx512dq_shuf_64x2_1):
Ditto.
(avx512vl_shuf_32x4_1): Ditto.
(avx2_pblendd): Ditto.
(aesenc): Ditto.
(aesenclast): Ditto.
(aesdec): Ditto.
(aesdeclast): Ditto.
(vaesdec_): Ditto.
(vaesdeclast_): Ditto.
(vaesenc_): Ditto.
(vaesenclast_): Ditto.
(aesu8): Ditto.
(*aesu8): Ditto.

gcc/testsuite/ChangeLog:

PR target/119425
* gcc.target/i386/pr119425.c: New test.

Co-authored-by: Hongyu Wang
---
 gcc/config/i386/sse.md   | 31 +---
 gcc/testsuite/gcc.target/i386/pr119425.c | 37 
 2 files changed, 57 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr119425.c

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 70c2cf3f60d..1a9214fdedc 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -11935,7 +11935,7 @@ (define_insn "vec_set_0"
   ]
   (const_string "ssemov")))
(set (attr "addr")
- (if_then_else (eq_attr "alternative" "8,9")
+ (if_then_else (eq_attr "alternative" "9,10")
   (const_string "gpr16")
   (const_string "*")))
(set (attr "prefix_extra")
@@ -20204,6 +20204,7 @@ (define_insn 
"avx512dq_shuf_64x2_1"
   return "vshuf64x2\t{%3, %2, %1, 
%0|%0, %1, %2, %3}";
 }
   [(set_attr "type" "sselog")
+   (set_attr "addr" "gpr16,*")
(set_attr "length_immediate" "1")
(set_attr "prefix" "evex")
(set_attr "mode" "XI")])
@@ -20365,6 +20366,7 @@ (define_insn 
"avx512vl_shuf_32x4_1"
   return "vshuf32x4\t{%3, %2, %1, 
%0|%0, %1, %2, %3}";
 }
   [(set_attr "type" "sselog")
+   (set_attr "addr" "gpr16,*")
(set_attr "length_immediate" "1")
(set_attr "prefix" "evex")
(set_attr "mode" "")])
@@ -24107,6 +24109,7 @@ (define_insn "avx2_pblendd"
   "TARGET_AVX2"
   "vpblendd\t{%3, %2, %1, %0|%0, %1, %2, %3}"
   [(set_attr "type" "ssemov")
+   (set_attr "addr" "gpr16")
(set_attr "prefix_extra" "1")
(set_attr "length_immediate" "1")
(set_attr "prefix" "vex")
@@ -27116,7 +27119,7 @@ (define_insn "aesenc"
vaesenc\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "isa" "noavx,avx,vaes_avx512vl")
(set_attr "type" "sselog1")
-   (set_attr "addr" "gpr16,*,*")
+   (set_attr "addr" "gpr16,gpr16,*")
(set_attr "prefix_extra" "1")
(set_attr "prefix" "orig,maybe_evex,evex")
(set_attr "btver2_decode" "double,double,double")
@@ -27134,7 +27137,7 @@ (define_insn "aesenclast"
vaesenclast\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "isa" "noavx,avx,vaes_avx512vl")
(set_attr "type" "sselog1")
-   (set_attr "addr" "gpr16,*,*")
+   (set_attr "addr" "gpr16,gpr16,*")
(set_attr "prefix_extra" "1")
(set_attr "prefix" "orig,maybe_evex,evex")
(set_attr "btver2_decode" "double,double,double")
@@ -27152,7 +27155,7 @@ (define_insn "aesdec"
vaesdec\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "isa" "noavx,avx,vaes_avx512vl")
(set_attr "type" "sselog1")
-   (set_attr "addr" "gpr16,*,*")
+   (set_attr "addr" "gpr16,gpr16,*")
(set_attr "prefix_extra" "1")
(set_attr "prefix" "orig,maybe_evex,evex")
(set_attr "btver2_decode" "double,double,double") 
@@ -27169,7 +27172,7 @@ (define_insn "aesdeclast"
* return TARGET_AES ? \"vaesdeclast\t{%2, %1, %0|%0, %1, %2}\" : \"%{evex%} 
vaesdeclast\t{%2, %1, %0|%0, %1, %2}\";
vaesdeclast\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "isa" "noavx,avx,vaes_avx512vl")
-   (set_attr "addr" "gpr16,*,*")
+   (set_attr "addr" "gpr16,gpr16,*")
(set_attr "type" "sselog1")
(set_attr "prefix_extra" "1")
(set_attr "prefix" "orig,maybe_evex,evex")
@@ -30873,7 +30876,8 @@ (define_insn "vaesdec_"
 return "%{evex%} vaesdec\t{%2, %1, %0|%0, %1, %2}";
   else
 return "vaesdec\t{%2, %1, %0|%0, %1, %2}";
-})
+}
+[(set_attr "addr" "gpr16,*")])
 
 (define_insn "vaesdeclast_"
   [(set (match_operand:VI1_AVX512VL_F 0 "register_operand" "=x,v")
@@ -30887,7 +30891,8 @@ (define_insn "vaesdeclast_"
 return "%{evex%} vaesdeclast\t{%2, %1, %0|%0, %1, %2}";
   else
 return "vaesdeclast\t{%2, %1, %0|%0, %1, %2}";
-})
+}
+[(set_attr "addr" "gpr16,*")])
 
 (define_insn "vaesenc_"
   [(set (match_operand:VI1_AVX512VL_F 0 "register_operand" "=x,v")
@@ -30901,7 +30906,8 @@ (define_insn "vaesenc_"
 return "%{evex%} vaesenc\t{%2, %1, %0|%0, %1, %2}";
   else
 return "vaesenc\t{%2, %1, %0|%0, %1, %2}";
-})
+}
+[(set_attr "addr" "gpr16,*")])
 
 (define_insn "vaesenclast_"
   [(set (match_operand:VI1_AVX512VL_F 0 "register_operand" "=x,v")
@@ -30915,7 +30

Re: [PATCH] tailc: Don't fail musttail calls if they use or could use local arguments, instead warn [PR119376]

2025-03-25 Thread Jason Merrill

On 3/25/25 3:34 AM, Jakub Jelinek wrote:

Hi!

As discussed here and in bugzilla, [[clang::musttail]] attribute in clang
not just strongly asks for tail call or error, but changes behavior.
To quote:
https://clang.llvm.org/docs/AttributeReference.html#musttail
"The lifetimes of all local variables and function parameters end immediately
before the call to the function.  This means that it is undefined behaviour
to pass a pointer or reference to a local variable to the called function,
which is not the case without the attribute.  Clang will emit a warning in
common cases where this happens."

The GCC behavior was just to error if we can't prove the musttail callee
could not have dereferenced escaped pointers to local vars or parameters
of the caller.  That is still the case for variables with non-trivial
destruction (even in clang), like vars with C++ non-trivial destructors or
variables with cleanup attribute.

The following patch changes the behavior to match that of clang, for all of
[[clang::musttail]], [[gnu::musttail]] and __attribute__((musttail)).

clang 20 actually added warning for some cases of it in
https://github.com/llvm/llvm-project/pull/109255
but it is under -Wreturn-stack-address warning.

Now, gcc doesn't have that warning, but -Wreturn-local-addr instead, and
IMHO it is better to have this under new warnings, because this isn't about
returning local address, but about passing it to a musttail call, or maybe
escaping to a musttail call.  And perhaps users will appreciate they can
control it separately as well.

The patch introduces 2 new warnings.
-Wmusttail-local-addr
which is turned on by default and warns for the always dumb cases of passing
an address of a local variable or parameter to musttail call's argument.


I don't think this is a significantly different case from 
-Wreturn-local-addr; in both cases we are passing a local address out at 
the same time as the stack frame goes away, the only difference is 
whether it's passed by return or argument.  I don't think that 
difference justifies using a different flag.


If others agree with you I don't mind going along, just wanted to make 
my case.



And then
-Wmaybe-musttail-local-addr
which is only diagnosed if -Wmusttail-local-addr was not diagnosed and
diagnoses at most one (so that we don't emit 100s of warnings for one call
if 100s of vars can escape) case where an address of a local var could have
escaped to the musttail call.  This is less severe, the code doesn't have
to be obviously wrong, so the warning is only enabled in -Wextra.


I agree that this should be a different flag.

Jason



RE: [PATCH v2] i386: Add "s_" as Saturation for AVX10.2 Converting Intrinsics.

2025-03-25 Thread Liu, Hongtao



> -Original Message-
> From: Hu, Lin1 
> Sent: Tuesday, March 25, 2025 4:23 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Liu, Hongtao ; ubiz...@gmail.com
> Subject: RE: [PATCH v2] i386: Add "s_" as Saturation for AVX10.2 Converting
> Intrinsics.
> 
> More details: Alignment with llvm (https://github.com/llvm/llvm-
> project/pull/131592)
> 
> BRs,
> Lin
> 
> > -Original Message-
> > From: Hu, Lin1 
> > Sent: Tuesday, March 25, 2025 4:10 PM
> > To: gcc-patches@gcc.gnu.org
> > Cc: Liu, Hongtao ; ubiz...@gmail.com
> > Subject: [PATCH v2] i386: Add "s_" as Saturation for AVX10.2
> > Converting Intrinsics.
> >
> > Modify ChangeLog.
> >
> > This patch adds "s_" after 'cvt' to represent saturation.
> >
> > gcc/ChangeLog:
> >
> > * config/i386/avx10_2-512convertintrin.h
> (_mm512_mask_cvtx2ps_ph):
> > Formatting fixes
> > (_mm512_mask_cvtx_round2ps_ph): Ditto
> > (_mm512_maskz_cvtx_round2ps_ph): Ditto
> > (_mm512_cvtbiassph_bf8): Rename to _mm512_cvts_biasph_bf8.
> > (_mm512_mask_cvtbiassph_bf8): Rename to
> _mm512_mask_cvts_biasph_bf8.
> > (_mm512_maskz_cvtbiassph_bf8): Rename to
> > _mm512_maskz_cvts_biasph_bf8.
> > (_mm512_cvtbiassph_hf8): Rename to _mm512_cvts_biasph_hf8.
> > (_mm512_mask_cvtbiassph_hf8): Rename to
> _mm512_mask_cvts_biasph_hf8.
> > (_mm512_maskz_cvtbiassph_hf8): Rename to
> > _mm512_maskz_cvts_biasph_hf8.
> > (_mm512_cvts2ph_bf8): Rename to _mm512_cvts_2ph_bf8.
> > (_mm512_mask_cvts2ph_bf8): Rename to
> > _mm512_mask_cvts_2ph_bf8.
> > (_mm512_maskz_cvts2ph_bf8): Rename to
> _mm512_maskz_cvts_2ph_bf8.
> > (_mm512_cvts2ph_hf8): Rename to _mm512_cvts_2ph_hf8.
> > (_mm512_mask_cvts2ph_hf8): Rename to
> > _mm512_mask_cvts_2ph_hf8.
> > (_mm512_maskz_cvts2ph_hf8): Rename to
> _mm512_maskz_cvts_2ph_hf8.
> > (_mm512_cvtsph_bf8): Rename to _mm512_cvts_ph_bf8.
> > (_mm512_mask_cvtsph_bf8): Rename to
> _mm512_mask_cvts_ph_bf8.
> > (_mm512_maskz_cvtsph_bf8): Rename to
> _mm512_maskz_cvts_ph_bf8.
> > (_mm512_cvtsph_hf8): Rename to _mm512_cvts_ph_hf8.
> > (_mm512_mask_cvtsph_hf8): Rename to
> _mm512_mask_cvts_ph_hf8.
> > (_mm512_maskz_cvtsph_hf8): Rename to
> _mm512_maskz_cvts_ph_hf8.
> > * config/i386/avx10_2convertintrin.h
> > (_mm_cvtbiassph_bf8): Rename to _mm_cvts_biasph_bf8.
> > (_mm_mask_cvtbiassph_bf8): Rename to
> _mm_mask_cvts_biasph_bf8.
> > (_mm_maskz_cvtbiassph_bf8): Rename to
> _mm_maskz_cvts_biasph_bf8.
> > (_mm256_cvtbiassph_bf8): Rename to _mm256_cvts_biasph_bf8.
> > (_mm256_mask_cvtbiassph_bf8): Rename to
> _mm256_mask_cvts_biasph_bf8.
> > (_mm256_maskz_cvtbiassph_bf8): Rename to
> > _mm256_maskz_cvts_biasph_bf8.
> > (_mm_cvtbiassph_hf8): Rename to _mm_cvts_biasph_hf8.
> > (_mm_mask_cvtbiassph_hf8): Rename to
> _mm_mask_cvts_biasph_hf8.
> > (_mm_maskz_cvtbiassph_hf8): Rename to
> _mm_maskz_cvts_biasph_hf8.
> > (_mm256_cvtbiassph_hf8): Rename to _mm256_cvts_biasph_hf8.
> > (_mm256_mask_cvtbiassph_hf8): Rename to
> _mm256_mask_cvts_biasph_hf8.
> > (_mm256_maskz_cvtbiassph_hf8): Rename to
> > _mm256_maskz_cvts_biasph_hf8.
> > (_mm_cvts2ph_bf8): Rename to _mm_cvts_2ph_bf8.
> > (_mm_mask_cvts2ph_bf8): Rename to _mm_mask_cvts_2ph_bf8.
> > (_mm_maskz_cvts2ph_bf8): Rename to _mm_maskz_cvts_2ph_bf8.
> > (_mm256_cvts2ph_bf8): Rename to _mm256_cvts_2ph_bf8.
> > (_mm256_mask_cvts2ph_bf8): Rename to
> > _mm256_mask_cvts_2ph_bf8.
> > (_mm256_maskz_cvts2ph_bf8): Rename to
> _mm256_maskz_cvts_2ph_bf8.
> > (_mm_cvts2ph_hf8): Rename to _mm_cvts_2ph_hf8.
> > (_mm_mask_cvts2ph_hf8): Rename to _mm_mask_cvts_2ph_hf8.
> > (_mm_maskz_cvts2ph_hf8): Rename to _mm_maskz_cvts_2ph_hf8.
> > (_mm256_cvts2ph_hf8): Rename to _mm256_cvts_2ph_hf8.
> > (_mm256_mask_cvts2ph_hf8): Rename to
> > _mm256_mask_cvts_2ph_hf8.
> > (_mm256_maskz_cvts2ph_hf8): Rename to
> _mm256_maskz_cvts_2ph_hf8.
> > (_mm_cvtsph_bf8): Rename to _mm_cvts_ph_bf8.
> > (_mm_mask_cvtsph_bf8): Rename to _mm_mask_cvts_ph_bf8.
> > (_mm_maskz_cvtsph_bf8): Rename to _mm_maskz_cvts_ph_bf8.
> > (_mm256_cvtsph_bf8): Rename to _mm256_cvts_ph_bf8.
> > (_mm256_mask_cvtsph_bf8): Rename to
> _mm256_mask_cvts_ph_bf8.
> > (_mm256_maskz_cvtsph_bf8): Rename to
> _mm256_maskz_cvts_ph_bf8.
> > (_mm_cvtsph_hf8): Rename to _mm_cvts_ph_hf8.
> > (_mm_mask_cvtsph_hf8): Rename to _mm_mask_cvts_ph_hf8.
> > (_mm_maskz_cvtsph_hf8): Rename to _mm_maskz_cvts_ph_hf8.
> > (_mm256_cvtsph_hf8): Rename to _mm256_cvts_ph_hf8.
> > (_mm256_mask_cvtsph_hf8): Rename to
> _mm256_mask_cvts_ph_hf8.
> > (_mm256_maskz_cvtsph_hf8): Rename to
> _mm256_maskz_cvts_ph_hf8.

Ok, thanks for Jakub's comments.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/i386/avx10_2-512-convert-1.c: Modify function name
> > to follow the latest version.
> > * gcc.target/i386/avx10_2-512-vcvt2ph2bf8s-2.c: Ditto.
> > * gcc.target/i386/avx10_2-512-vcvt2ph2hf8

Re: [PATCH] OpenMP: 'interop' construct - add ME support + target-independent libgomp

2025-03-25 Thread Sandra Loosemore

On 3/25/25 09:25, Paul-Antoine Arras wrote:

On 24/03/2025 21:17, Sandra Loosemore wrote:

[snip]

I think you also need to update BUILT_IN_GOMP_INTEROP in omp- 
builtins.def; at least, that is the source of the decl used for the 
implicit creation/destruction of interop objects in "declare variant" 
expansion.


I took some time to check how other functions defined in target.c are 
declared in omp-builtins.def. The convention seems to be to omit const 
qualifiers, except in very simple cases like omp_get_mapped_ptr.


I'm not sure what the convention is elsewhere.  But GOMP_interop isn't a 
user-visible function in libgomp; it's only called implicitly from the 
code GCC generates when expanding the interop directive and "declare 
variant".  And both of those places use the declaration provided by 
BUILT_IN_GOMP_INTEROP rather than anything in libgomp.h.


Besides, I am not sure how to encode complex types like (**const *). 
Does that require creating new definitions in gcc/builtin-types.def and 
gcc/fortran/types.def?


I don't understand what the Fortran front end has to do with this, 
BUILT_IN_GOMP_INTEROP is only referenced from gimplify.cc and omp-low.cc.


And what difference does it make to have an argument declared as BT_PTR 
instead of, say, BT_PTR_CONST_PTR_PTR? Is it just a matter of optimisation?

If someone could shed some light...


Well, the declaration used by GCC for code generation should match the 
definition in the library, right?


Mainly, I got to thinking about this because the "declare variant" 
interop creation/destruction code that Tobias had sketched out before 
handing it over to me included a bunch of explicit clobbers that 
confused the heck out of me.  AFAICT, the only thing that GOMP_interop 
needs to modify is the actual interop objects whose pointers are in the 
init/destroy arrays, so in my final version of the code those are the 
only things clobbered after the interop objects are destroyed.  If GCC 
knows the arrays themselves are const, that potentially also enables 
optimizations like hoisting the code to set up the argument arrays out of a 
loop containing a variant call.


-Sandra


[PATCH] libstdc++: Optimize std::vector construction from input iterators [PR108487]

2025-03-25 Thread Jonathan Wakely
LWG 3291 made std::ranges::iota_view's iterator have input_iterator_tag
as its iterator_category, even though it satisfies the C++20
std::forward_iterator concept. This means that the traditional
std::vector::vector(InputIterator, InputIterator) constructor treats
iota_view iterators as input iterators, because it only understands the
C++17 iterator requirements, not the C++20 iterator concepts. This
results in a loop that calls emplace_back for each individual element of
the iota_view, requiring the vector to reallocate repeatedly as the
values are inserted. This makes it unnecessarily slow to construct a
vector from an iota_view.

This change adds a new _M_range_initialize_n function for initializing a
vector from a range (which doesn't have to be common) and a size. This
new function can be used by vector(InputIterator, InputIterator) and
vector(from_range_t, R&&) when std::ranges::distance can be used to get
the size. It can also be used by the _M_range_initialize overload that
gets the size for a Cpp17ForwardIterator pair using std::distance, and
by the vector(initializer_list) constructor.

With this new function constructing a std::vector from iota_view does
a single allocation of the correct size and so doesn't need to
reallocate in a loop.

libstdc++-v3/ChangeLog:

PR libstdc++/108487
* include/bits/stl_vector.h (vector(initializer_list)): Call
_M_range_initialize_n instead of _M_range_initialize.
(vector(InputIterator, InputIterator)): Use _M_range_initialize_n
for C++20 sized sentinels and forward iterators.
(vector(from_range_t, R&&)): Use _M_range_initialize_n for sized
ranges and forward ranges.
(vector::_M_range_initialize(FwIt, FwIt, forward_iterator_tag)):
Likewise.
(vector::_M_range_initialize_n): New function.
* testsuite/23_containers/vector/cons/108487.cc: New test.
---

Tests running for x86_64-linux.

This gives a 10x speed up for the PR108487 testcase using iota_view.

I don't see why doing this wouldn't be allowed by the standard, so it
seems worth doing.
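
To illustrate the idea independently of the libstdc++ internals, here is a
small C++17 sketch contrasting element-by-element insertion with measuring
the range first and allocating once.  fill_one_by_one and
fill_with_known_size are hypothetical helpers written for this note, not
part of the patch; they only use the public std::vector interface.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Appends one element at a time and returns how often the capacity changed,
// which approximates the number of reallocations.
template <typename It>
std::size_t
fill_one_by_one (std::vector<int> &out, It first, It last)
{
  std::size_t reallocations = 0;
  for (; first != last; ++first)
    {
      const std::size_t before = out.capacity ();
      out.push_back (*first);
      if (out.capacity () != before)
	++reallocations;
    }
  return reallocations;
}

// Measures the range first (this needs a multi-pass iterator), allocates
// once, and then fills.  Returns the capacity changes seen while filling.
template <typename It>
std::size_t
fill_with_known_size (std::vector<int> &out, It first, It last)
{
  const auto n = static_cast<std::size_t> (std::distance (first, last));
  out.reserve (out.size () + n);
  assert (out.capacity () >= out.size () + n);
  return fill_one_by_one (out, first, last);
}
```

The first helper corresponds to what the input-iterator constructor path has
to do today; the second corresponds to knowing the size up front, as
_M_range_initialize_n does according to the description above.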

 libstdc++-v3/include/bits/stl_vector.h| 48 ---
 .../23_containers/vector/cons/108487.cc   | 24 ++
 2 files changed, 56 insertions(+), 16 deletions(-)
 create mode 100644 libstdc++-v3/testsuite/23_containers/vector/cons/108487.cc

diff --git a/libstdc++-v3/include/bits/stl_vector.h 
b/libstdc++-v3/include/bits/stl_vector.h
index 21f6cd04f49..458adc987da 100644
--- a/libstdc++-v3/include/bits/stl_vector.h
+++ b/libstdc++-v3/include/bits/stl_vector.h
@@ -65,6 +65,9 @@
 #if __cplusplus >= 202002L
 # include 
 #endif
+#if __glibcxx_concepts // C++ >= C++20
+# include   // ranges::distance
+#endif
 #if __glibcxx_ranges_to_container // C++ >= 23
 # include   // ranges::copy
 # include   // ranges::subrange
@@ -706,8 +709,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 const allocator_type& __a = allocator_type())
   : _Base(__a)
   {
-   _M_range_initialize(__l.begin(), __l.end(),
-   random_access_iterator_tag());
+   _M_range_initialize_n(__l.begin(), __l.end(), __l.size());
   }
 #endif
 
@@ -735,6 +737,17 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   const allocator_type& __a = allocator_type())
: _Base(__a)
{
+#if __glibcxx_concepts // C++ >= C++20
+ if constexpr (sized_sentinel_for<_InputIterator, _InputIterator>
+ || forward_iterator<_InputIterator>)
+   {
+ const auto __n
+   = static_cast<size_type>(ranges::distance(__first, __last));
+ _M_range_initialize_n(__first, __last, __n);
+ return;
+   }
+ else
+#endif
  _M_range_initialize(__first, __last,
  std::__iterator_category(__first));
}
@@ -763,13 +776,9 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
{
  if constexpr (ranges::forward_range<_Rg> || ranges::sized_range<_Rg>)
{
- const auto __n = size_type(ranges::distance(__rg));
- pointer __start =
-   this->_M_allocate(_S_check_init_len(__n,
-   _M_get_Tp_allocator()));
- this->_M_impl._M_finish = this->_M_impl._M_start = __start;
- this->_M_impl._M_end_of_storage = __start + __n;
- _Base::_M_append_range(__rg);
+ const auto __n = static_cast<size_type>(ranges::distance(__rg));
+ _M_range_initialize_n(ranges::begin(__rg), ranges::end(__rg),
+   __n);
}
  else
{
@@ -1962,15 +1971,22 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
_M_range_initialize(_ForwardIterator __first, _ForwardIterator __last,
std::forward_iterator_tag)
{
- const size_type __n = std::distance(__first, __last);
- pointer __start =
+ _M_ran

Re: [PATCH] tailc: Don't fail musttail calls if they use or could use local arguments, instead warn [PR119376]

2025-03-25 Thread Andi Kleen
> This can be rewritten as
> 
> void foo(int v)
> {
>   {
> int a;
> capture(&a);
> if (condition)
>   goto tail_position;
> // do something with a
>   }
> tail_position:
>   tailcall(v);
> }
> 
> or with 'do { ... if (...) break; ...} while (0)' when one prefers that to 
> goto.
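
For illustration, the 'do { ... } while (0)' variant mentioned in the quoted text might look like the sketch below. The definitions of capture and tailcall, the explicit condition parameter, and the int return value are stand-ins invented here so that the example is self-contained; they do not come from the thread.

```cpp
#include <cassert>

// Hypothetical stand-ins for the functions used in the quoted example.
static int *last_captured = nullptr;
static void capture(int *p) { last_captured = p; }
static int tailcall(int v) { return v + 1; }

int foo(int v, bool condition)
{
  do
    {
      int a = 0;
      capture(&a);
      if (condition)
        break;        // leave the block, ending the lifetime of 'a'
      a += v;         // "do something with a"
    }
  while (0);

  // 'a' is out of scope here, so no live local can be referenced
  // when the call in tail position is made.
  last_captured = nullptr;  // drop the now-dangling pointer
  return tailcall(v);
}
```

Both paths reach the final call only after the block holding the escaped local has ended, which is the property the goto version relies on as well.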

This could get really ugly in more complex functions though with large
scale transformation needed. Not sure that is something I would recommend
to anyone.

I don't know if the clang people considered such a case, but I can see
a point for their semantics.


-Andi


Re: [PATCH] testsuite: Fix gcc.target/powerpc/vsx-builtin-7.c test [PR119382]

2025-03-25 Thread Segher Boessenkool
On Tue, Mar 25, 2025 at 03:33:59PM -0500, Peter Bergner wrote:
> On 3/25/25 1:42 AM, jeevitha wrote:
> > gcc/testsuite/
> > PR testsuite/119382
> > * gcc.target/powerpc/vsx-builtin-7.c: Add '-fno-ipa-icf' to dg-options.
> > 
> > diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-builtin-7.c 
> > b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-7.c
> > index 5095d5030fd..78e4e23d102 100644
> > --- a/gcc/testsuite/gcc.target/powerpc/vsx-builtin-7.c
> > +++ b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-7.c
> > @@ -1,6 +1,6 @@
> >  /* { dg-do compile { target { powerpc*-*-* } } } */
> >  /* { dg-skip-if "" { powerpc*-*-darwin* } } */
> > -/* { dg-options "-O2 -mdejagnu-cpu=power7 -fno-inline-functions" } */
> > +/* { dg-options "-O2 -mdejagnu-cpu=power7 -fno-inline-functions 
> > -fno-ipa-icf" } */
> >  /* { dg-require-effective-target powerpc_vsx } */
> >  
> >  /* Test simple extract/insert/slat operations.  Make sure all types are
> 
> I REALLY dislike these BIG built-in test cases that test for all possible
> built-ins and then do multiple insn scans.

Yup.  Very generally, every testcase should say exactly what it is
testing for (in the header comments, for example).  The "what" it is
testing for and the "how" it is testing for it are related in an obvious
way, for very simple testcases, but not for more complex ones.

> Much better would be having
> many small test cases that test for one specific thing, rather than testing
> lots of things.

Yup.  Easier to debug, too, when they start misbehaving.  Easier to edit
etc. even!  We all know how to run trivial cp commands I hope :-)

> As shown here, they're pretty fragile to changes in the compiler.

Pretty much all scan-assembler-times tests are very suspect, not to say
plain wrong.

> That said, I'm not sure it's really worth splitting this older Power7
> test case up, so I guess adding -fno-ipa-icf is probably the best/easiest
> of all of the bad options.

It is probably less work the next time one of those tests starts failing
to *start* with splitting the test up :-)

> Segher, any reason you can give on why we shouldn't go the easy route to
> "fix" (yes, these are air-quotes) this by using -fno-ipa-icf?

One reason is that that option should not make any difference whatsoever
for a well-written testcase: a testcase that wants to test what insns
are generated for particular code, damn well should be written in such a
way that it is very unlikely the compiler will ever generate different
code for it.  Another reason is I had to look up what that option with
the cryptic name does, what that name stands for.  And finally, will
we be doing more maintenance on this later?  Testcase maintenance is
wasted work, work that does not scale even, so it is important to write
testcases so that maintenance isn't needed, and if it becomes necessary
anyway to improve it so that it will not be needed so much in the
future.


Segher


Re: [PATCH v3] libstdc++: Fix std::vector::append_range for overlapping ranges

2025-03-25 Thread Tomasz Kaminski
On Tue, Mar 25, 2025 at 12:40 PM Jonathan Wakely  wrote:

> Unlike insert_range and assign_range, the append_range function does not
> have a precondition that the range doesn't overlap *this. That means we
> need to avoid relocating the existing elements until after copying from
> the range. This means I need to revert r15-8488-g3e1d760bf49d0e which
> made the from_range_t constructor use append_range, because the
> constructor can avoid the additional complexity needed by append_range.
> When relocating the existing elements in append_range we can use
> std::__relocate_a to do it more efficiently, if that's valid.
>
> std::vector::append_range needs similar treatment, although it's a
> bit simpler as we know that the elements are trivially copyable and so
> we don't need to worry about them throwing. assign_range doesn't allow
> overlapping ranges, so can be rewritten to be more efficient than
> calling append_range for the forward or sized range case.
>
> libstdc++-v3/ChangeLog:
>
> * include/bits/stl_bvector.h (vector::assign_range): More
> efficient implementation for forward/sized ranges.
> (vector::append_range): Handle potentially overlapping range.
> * include/bits/stl_vector.h (vector(from_range_t, R&&, Alloc)):
> Do not use append_range for non-sized input range case.
> (vector::append_range): Handle potentially overlapping range.
> * include/bits/vector.tcc (vector::insert_range): Forward range
> instead of moving it.
> *
> testsuite/23_containers/vector/bool/modifiers/insert/append_range.cc:
> Test overlapping ranges.
> * testsuite/23_containers/vector/modifiers/append_range.cc:
> Likewise.
> ---
>
> Patch v3 fixes the problem Tomasz noticed with calling reserve(n) on an
> empty vector with non-zero capacity, which can invalidate the range
> parameter.
>
> Adds additional comments to the tests to try and clarify that the "XXX"
> comments only apply to the calls to do_test, for Patrick's comment.
>
> Also improved the doxygen comments on all the C++23 range members.
>
> Tested x86_64-linux.
>
LGTM.
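
The ordering constraint described in the quoted commit message (copy from the range before relocating the existing elements) can be sketched roughly as follows. This is not the libstdc++ implementation: append_maybe_overlapping is a hypothetical helper that works on std::vector<int> from the outside and always reallocates, purely to show why the order of the two steps matters when the range aliases the vector.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Append [first, last) to v when the range may alias v's own elements.
// Hypothetical helper; always reallocates to keep the sketch short.
void append_maybe_overlapping(std::vector<int>& v,
                              const int* first, const int* last)
{
  const std::size_t old_size = v.size();
  const std::size_t n = static_cast<std::size_t>(last - first);

  std::vector<int> grown(old_size + n);   // value-initialized new storage

  // 1. Copy from the range first, while v's old storage is still intact.
  std::copy(first, last,
            grown.begin() + static_cast<std::ptrdiff_t>(old_size));

  // 2. Only now move the existing elements across.
  std::move(v.begin(), v.end(), grown.begin());

  // 3. Install the new storage; the old buffer is released when
  //    'grown' goes out of scope.
  v.swap(grown);
}
```

If steps 1 and 2 were swapped and the old storage were released in between, a range pointing into the vector would be read after its elements had gone away.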

>
>  libstdc++-v3/include/bits/stl_bvector.h   |  77 ++--
>  libstdc++-v3/include/bits/stl_vector.h| 110 +++-
>  libstdc++-v3/include/bits/vector.tcc  |   2 +-
>  .../bool/modifiers/insert/append_range.cc |  51 ++
>  .../vector/modifiers/append_range.cc  | 165 ++
>  5 files changed, 385 insertions(+), 20 deletions(-)
>
> diff --git a/libstdc++-v3/include/bits/stl_bvector.h
> b/libstdc++-v3/include/bits/stl_bvector.h
> index 3ee15eaa938..03f6434604c 100644
> --- a/libstdc++-v3/include/bits/stl_bvector.h
> +++ b/libstdc++-v3/include/bits/stl_bvector.h
> @@ -899,6 +899,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
>  #if __glibcxx_ranges_to_container // C++ >= 23
>/**
> * @brief Construct a vector from a range.
> +   * @param __rg A range of values that are convertible to
> `value_type`.
> * @since C++23
> */
>template<__detail::__container_compatible_range _Rg>
> @@ -1028,6 +1029,8 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
>  #if __glibcxx_ranges_to_container // C++ >= 23
>/**
> * @brief Assign a range to the vector.
> +   * @param __rg A range of values that are convertible to
> `value_type`.
> +   * @pre `__rg` and `*this` do not overlap.
> * @since C++23
> */
>template<__detail::__container_compatible_range _Rg>
> @@ -1035,8 +1038,25 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
> assign_range(_Rg&& __rg)
> {
>   static_assert(assignable_from ranges::range_reference_t<_Rg>>);
> - clear();
> - append_range(std::forward<_Rg>(__rg));
> + if constexpr (ranges::forward_range<_Rg> ||
> ranges::sized_range<_Rg>)
> +   {
> + if (auto __n = size_type(ranges::distance(__rg)))
> +   {
> + reserve(__n);
> + this->_M_impl._M_finish
> + = ranges::copy(std::forward<_Rg>(__rg), begin()).out;
> +   }
> + else
> +   clear();
> +   }
> + else
> +   {
> + clear();
> + auto __first = ranges::begin(__rg);
> + const auto __last = ranges::end(__rg);
> + for (; __first != __last; ++__first)
> +   emplace_back(*__first);
> +   }
> }
>  #endif
>
> @@ -1330,6 +1350,10 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
>  #if __glibcxx_ranges_to_container // C++ >= 23
>/**
> * @brief Insert a range into the vector.
> +   * @param __rg A range of values that are convertible to `bool`.
> +   * @return An iterator that points to the first new element
> inserted,
> +   * or to `__pos` if `__rg` is an empty range.
> +   * @pre `__rg` and `*this` do not overlap.
> * @since C++23
> */
>templ

[pushed] c++: add fixed test [PR101881]

2025-03-25 Thread Marek Polacek
Tested x86_64-pc-linux-gnu, applying to trunk.

-- >8 --
Fixed recently by r15-7822.

PR c++/101881

gcc/testsuite/ChangeLog:

* g++.dg/ext/vector44.C: New test.
---
 gcc/testsuite/g++.dg/ext/vector44.C | 5 +
 1 file changed, 5 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ext/vector44.C

diff --git a/gcc/testsuite/g++.dg/ext/vector44.C 
b/gcc/testsuite/g++.dg/ext/vector44.C
new file mode 100644
index 000..cb24ef6e264
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/vector44.C
@@ -0,0 +1,5 @@
+// PR c++/101881
+// { dg-do compile { target c++11 } }
+
+template using A = int __attribute__((vector_size(N)))*;
+void foo(A<4>) {}

base-commit: 35ce9afc84a63fb647a90cbecb2adf3e748178be
-- 
2.49.0



Re: [PATCH] target/119010 - add missing integer store reservations for znver4 and znver5

2025-03-25 Thread Jan Hubicka
> The imov and imovx classified stores miss reservations in the znver4/5
> pipeline description.  The following adds them.
> 
> Bootstrap and regtest pending on x86_64-unknown-linux-gnu.
> 
> OK?
> 
>   PR target/119010
>   * config/i386/zn4zn5.md (znver4_imov_double_store,
>   znver5_imov_double_store, znver4_imov_store, znver5_imov_store):
>   New reservations for integer stores.

This is OK,
thanks!
Honza


Re: [PATCH] libstdc++: Optimize std::vector construction from input iterators [PR108487]

2025-03-25 Thread Jonathan Wakely
On Tue, 25 Mar 2025 at 17:31, Tomasz Kaminski wrote:
> I have checked  that _M_range_initialize is called only from constructors,
> or _M_initialize_dispatch, that is directly called in the constructor.
> So guards indeed seem to be redundant.
>
> Please add this to the comment message, and otherwise LGTM.

Done, thanks for the review.

I checked the history, and the RAII guard was added there by
r15-905-gd22eaeca7634b5 so a recent addition. That part of the change
was not needed though.



RE: [committed] cobol: Changes to eliminate _Float128 from the front end

2025-03-25 Thread Robert Dubner
And as an addendum:  Special thanks to Richard Biener and Jakub Jelinek
for all their work on this, and to the community in general for the
generous advice and support.

I can honestly say I have never worked in this kind of paradigm, and it's
been a remarkable experience, and really kind of fun.

(I note that I once jumped out of an airplane.  After all the training and
drill, the jumpmaster said, 

"Look: We train and train for something to go wrong.  Nothing ever goes
wrong.  And we train you to go into an arched falling position.  You won't
do that; nobody does on their first jump.  What's going to happen is you
are going to go out the door and into the slipstream, and there will be a
second of complete confusion and disorientation until the static line
starts to pull your 'chute out of the pack.

"That second is what you paid your money for.  Enjoy it."

"Fun" can have many meanings.)



> -----Original Message-----
> From: Robert Dubner 
> Sent: Tuesday, March 25, 2025 16:12
> To: gcc-patches@gcc.gnu.org
> Subject: [committed] cobol: Changes to eliminate _Float128 from the
> front end
> 
> I am putting up this e-mail for the record.  I asked myself if it was
> "okay for trunk?", and myself answered "If it's not, I quit!"
> 
> When merged into the cobolworx test environment, all of our tests pass.
> 
> When merged into master, the results compile, and check-cobol, such as
> it is, succeeds.
> 
> I just pushed it into master.


Re: [PATCH] OpenMP: 'interop' construct - add ME support + target-independent libgomp

2025-03-25 Thread Paul-Antoine Arras

On 24/03/2025 21:17, Sandra Loosemore wrote:

On 3/24/25 08:20, Paul-Antoine Arras wrote:

On 21/03/2025 20:17, Sandra Loosemore wrote:
Does the attached patch reflect what you have in mind?

diff --git libgomp/libgomp_g.h libgomp/libgomp_g.h
index 8993ec610fb..274f4937680 100644
--- libgomp/libgomp_g.h
+++ libgomp/libgomp_g.h
@@ -359,9 +359,10 @@ extern void GOMP_teams (unsigned int, unsigned int);
 extern bool GOMP_teams4 (unsigned int, unsigned int, unsigned int, 
bool);

 extern void *GOMP_target_map_indirect_ptr (void *);
 struct interop_obj_t;
-extern void GOMP_interop (int, int, struct interop_obj_t ***, const 
int *,

-  const char **, int, struct interop_obj_t **, int,
-  struct interop_obj_t ***, unsigned, void **);
+extern void GOMP_interop (int, int, struct interop_obj_t **const *, 
const int *,

+  const char *const *, int, struct interop_obj_t **,
+  int, struct interop_obj_t **const *, unsigned,
+  void **);

 /* teams.c */

diff --git libgomp/target.c libgomp/target.c
index 36ed797b0a9..54c244e0f13 100644
--- libgomp/target.c
+++ libgomp/target.c
@@ -5279,11 +5279,11 @@ ialias (omp_get_interop_rc_desc)
 struct interop_data_t
 {
   int device_num, n_init, n_use, n_destroy;
-  struct interop_obj_t ***init;
+  struct interop_obj_t **const *init;
   struct interop_obj_t **use;
-  struct interop_obj_t ***destroy;
+  struct interop_obj_t **const *destroy;
   const int *target_targetsync;
-  const char **prefer_type;
+  const char *const *prefer_type;
 };

 static void
@@ -5348,10 +5348,10 @@ gomp_interop_internal (void *data)
    'flags' is used for the 'nowait' clause.  */

 void
-GOMP_interop (int device_num, int n_init, struct interop_obj_t ***init,
-  const int *target_targetsync, const char **prefer_type, int 
n_use,

-  struct interop_obj_t **use, int n_destroy,
-  struct interop_obj_t ***destroy, unsigned int flags,
+GOMP_interop (int device_num, int n_init, struct interop_obj_t 
**const *init,

+  const int *target_targetsync, const char *const *prefer_type,
+  int n_use, struct interop_obj_t **use, int n_destroy,
+  struct interop_obj_t **const *destroy, unsigned int flags,
   void **depend)
 {
   struct interop_data_t args;


I think you also need to update BUILT_IN_GOMP_INTEROP in
omp-builtins.def; at least, that is the source of the decl used for the
implicit creation/destruction of interop objects in "declare variant"
expansion.


I took some time to check how other functions defined in target.c are
declared in omp-builtins.def. The convention appears to be
to omit const qualifiers, except in very simple cases like
omp_get_mapped_ptr.


Besides, I am not sure how to encode complex types like (**const *). 
Does that require creating new definitions in gcc/builtin-types.def and 
gcc/fortran/types.def?
And what difference does it make to have an argument declared as BT_PTR
instead of, say, BT_PTR_CONST_PTR_PTR? Is it just a matter of optimisation?

If someone could shed some light...
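
As a reading aid for the (**const *) declarator only (it does not answer the builtin-types question), here is a small sketch of what the new prototype permits and forbids. The struct body, the init_all function and the pool argument are hypothetical and exist only to make the example compile on its own; in libgomp the struct is opaque at this point.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical definition; in libgomp the struct is opaque at this point.
struct interop_obj_t { int id; };

// "interop_obj_t **const *" reads right to left: a pointer to const
// pointers, each pointing to a writable "interop_obj_t *" variable.
using init_list_t = interop_obj_t **const *;
static_assert(std::is_same<std::remove_pointer<init_list_t>::type,
                           interop_obj_t **const>::value,
              "the array slots are const");

// Hypothetical consumer: fills in each caller variable from 'pool'.
void init_all(int n, init_list_t init, interop_obj_t *pool)
{
  for (int i = 0; i < n; ++i)
    *init[i] = &pool[i];   // OK: the pointed-to variable is writable
  // init[0] = nullptr;    // would not compile: the slot itself is const
}
```

So the qualifier protects the compiler-generated array of addresses, while the interop variables those addresses refer to remain assignable.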

Thanks,
--
PA


[PATCH] c++: fix missing lifetime extension [PR119383]

2025-03-25 Thread Marek Polacek
Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/14?

-- >8 --
Since r15-8011 cp_build_indirect_ref_1 no longer folds *&TARGET_EXPR to
TARGET_EXPR, so as not to change its value category.  That fix is
correct but it made us stop extending the lifetime in this testcase,
causing a wrong-code issue -- extend_ref_init_temps_1 did not see
through the extra *& because it doesn't use a tree walk.  It is not
hard to fix that, but there may be other places that need this
adjustment.  :/

PR c++/119383

gcc/cp/ChangeLog:

* call.cc (extend_ref_init_temps_1): Handle *&TARGET_EXPR the same as
TARGET_EXPR.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/temp-extend3.C: New test.
---
 gcc/cp/call.cc|  6 +
 gcc/testsuite/g++.dg/cpp0x/temp-extend3.C | 32 +++
 2 files changed, 38 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/temp-extend3.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index c1c8987ec8b..ed2bdc85d87 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -14835,6 +14835,12 @@ extend_ref_init_temps_1 (tree decl, tree init, 
vec **cleanups,
   for (p = &TREE_OPERAND (sub, 0);
TREE_CODE (*p) == COMPONENT_REF || TREE_CODE (*p) == ARRAY_REF; )
 p = &TREE_OPERAND (*p, 0);
+  /* cp_build_indirect_ref_1 leaves *&TARGET_EXPR intact, handle it here.  */
+  if (INDIRECT_REF_P (*p)
+  && TREE_CODE (TREE_OPERAND (*p, 0)) == ADDR_EXPR
+  && same_type_p (TREE_TYPE (*p),
+ TREE_TYPE (TREE_TYPE (TREE_OPERAND (*p, 0)))))
+p = &TREE_OPERAND (TREE_OPERAND (*p, 0), 0);
   if (TREE_CODE (*p) == TARGET_EXPR)
 {
   tree subinit = NULL_TREE;
diff --git a/gcc/testsuite/g++.dg/cpp0x/temp-extend3.C 
b/gcc/testsuite/g++.dg/cpp0x/temp-extend3.C
new file mode 100644
index 000..3eab88d0076
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/temp-extend3.C
@@ -0,0 +1,32 @@
+// PR c++/119383
+// { dg-do run { target c++11 } }
+
+int g;
+
+struct base {
+  virtual base *clone() const = 0;
+  ~base() { }
+};
+
+struct impl : virtual base {
+  base *clone() const { return new impl; }  // #1
+  impl() { ++g; }
+  ~impl() { --g; }
+};
+
+const base *
+make_a_clone ()
+{
+  const base &base = impl{}; // #2
+  return base.clone();
+}
+
+int
+main ()
+{
+  make_a_clone ();
+  // impl::impl() is called twice (#1 and #2), impl::~impl() once,
+  // at the end of make_a_clone.
+  if (g != 1)
+__builtin_abort ();
+}

base-commit: 927cfea902c330092848bd7a228b714b07d08f6b
-- 
2.49.0



Re: [Patch, v3] libgomp.texi: Document supported OpenMP 'interop' types for nvptx and gcn

2025-03-25 Thread Sandra Loosemore

On 3/25/25 10:59, Tobias Burnus wrote:

Updated patch:



+Available properties for an HIP interop object:
+
+@multitable @columnfractions .20 .35 .20 .20
+@headitem Property  @tab C data type @tab API routine @tab value (if constant)
+@item @code{fr_id}  @tab @code{omp_interop_fr_t} @tab int @tab @code{omp_fr_hip}
+@item @code{fr_name} @tab @code{const char *} @tab str @tab ``hip''
+@item @code{vendor} @tab @code{int}  @tab int @tab 1
+@item @code{vendor_name} @tab @code{const char *} @tab str @tab ``amd''
+@item @code{device_num} @tab @code{int}  @tab int @tab
+@item @code{platform}   @tab N/A @tab @tab
+@item @code{device} @tab @code{hipDevice_t}  @tab int @tab
+@item @code{device_context} @tab @code{hipCtx_t} @tab ptr @tab
+@item @code{targetsync} @tab @code{hipStream_t}  @tab ptr @tab
+@end multitable


That's not what I was suggesting previously.  I was asking for 
@code{"hip"} etc in the rightmost column if the value is a string 
constant.  I think you might as well also use @code{1} for the integer 
value, too, for consistent formatting and to emphasize that it's a literal.


Likewise for the other similar tables in the patch.

-Sandra


[PATCH] middle-end/118795 - fix can_vec_perm_const_p query in match.pd

2025-03-25 Thread Richard Biener
When expanding to RTL we always use vec_perm_indices with two operands
which can make a difference with respect to supported vs. unsupported.
So the following adjusts a query in match.pd for target support which
got this "wrong" by using 1 for an equal-operand permute.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Possibly a reduced testcase will be ready in time.

Richard.

PR middle-end/118795
* match.pd (vec_perm > -> vec_perm ):
Use the appropriate check to see whether the original
outer permute was supported.
---
 gcc/match.pd | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/match.pd b/gcc/match.pd
index ad966766376..c0402e81c28 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -11128,7 +11128,7 @@ and,
 (with
  {
vec_perm_indices sel0 (builder0, 2, nelts);
-   vec_perm_indices sel1 (builder1, 1, nelts);
+   vec_perm_indices sel1 (builder1, 2, nelts);
 
for (int i = 0; i < nelts; i++)
 builder2.quick_push (sel0[sel1[i].to_constant ()]);
-- 
2.43.0


Re: [PATCH 06/10] testsuite: aarch64: arm: Add -mfpu=auto to arm_v8_2a_bf16_neon_ok

2025-03-25 Thread Christophe Lyon
On Mon, 24 Mar 2025 at 16:13, Richard Earnshaw (lists)
 wrote:
>
> On 24/03/2025 14:52, Christophe Lyon wrote:
> > On Mon, 24 Mar 2025 at 15:13, Richard Earnshaw (lists)
> >  wrote:
> >>
> >> On 21/03/2025 17:30, Christophe Lyon wrote:
> >>> On Fri, 21 Mar 2025 at 16:51, Richard Earnshaw (lists)
> >>>  wrote:
> 
>  On 21/03/2025 15:15, Christophe Lyon wrote:
> > On Fri, 21 Mar 2025 at 15:25, Richard Earnshaw (lists)
> >  wrote:
> >>
> >> On 21/03/2025 14:05, Christophe Lyon wrote:
> >>> On Fri, 21 Mar 2025 at 11:18, Richard Earnshaw (lists)
> >>>  wrote:
> 
>  On 20/03/2025 16:15, Christophe Lyon wrote:
> > Depending on if/how the testing flags are overridden, the first 
> > value
> > we try("") might not do what we want.
> >
> > For instance, if the whole testsuite is executed with
> > (A) -mthumb -march=armv7-m -mtune=cortex-m3 -mfloat-abi=softfp
> >
> > bf16_neon_ok is first compiled with
> > (A) (B)
> > where B = -mcpu=unset -march=armv8.2-a+bf16
> >
> > which is accepted, so a testcase like vld2q_lane_bf16_indices_1.c
> > is compiled with:
> > (A) (C) (B)
> > where C = -mfpu=neon -mfloat-abi=softfp -mcpu=unset -march=armv7-a 
> > -mfpu=neon-fp16 -mfp16-format=ieee
> >
> > because advsimd-intrinsics.exp has set additional_flags to (C)
> > via arm_neon_fp16_ok
> >
> > So the testcase is compiled with
> > [...] -mfpu=neon-fp16 -mcpu=unset -march=armv8.2-a+bf16
> > (thus -mfpu=neon-fp16) and bf16 support is disabled.
> >
> > The patch replaces "" with -mfpu=auto which matches the intended
> > effect of -march=armv8.2-a+bf16 as added by bf16_neon_ok, and the
> > testcase is now compiled with
> > (A) (C) -mfpu=auto (B)
> >
> > However, since this effective-target is also used on aarch64 (which
> > does not support -mfpu=auto), we do this only on arm.
> >
> > This patch improves coverage, and makes
> > v{ld,st}[234]q_lane_bf16_indices_1.c pass when testsuite flags are
> > overridden as described above (e.g. for M-profile).
> >
> >   gcc/testsuite/
> >   * lib/target-supports.exp
> >   (check_effective_target_arm_v8_2a_bf16_neon_ok_nocache):
> >   Conditionally use -mfpu=auto.
> > ---
> >  gcc/testsuite/lib/target-supports.exp | 9 -
> >  1 file changed, 8 insertions(+), 1 deletion(-)
> >
> > diff --git a/gcc/testsuite/lib/target-supports.exp 
> > b/gcc/testsuite/lib/target-supports.exp
> > index e2622a445c5..09b16a14024 100644
> > --- a/gcc/testsuite/lib/target-supports.exp
> > +++ b/gcc/testsuite/lib/target-supports.exp
> > @@ -6871,12 +6871,19 @@ proc add_options_for_arm_fp16fml_neon { 
> > flags } {
> >  proc check_effective_target_arm_v8_2a_bf16_neon_ok_nocache { } {
> >  global et_arm_v8_2a_bf16_neon_flags
> >  set et_arm_v8_2a_bf16_neon_flags ""
> > +set fpu_auto ""
> >
> >  if { ![istarget arm*-*-*] && ![istarget aarch64*-*-*] } {
> >   return 0;
> >  }
> >
> > -foreach flags {"" "-mfloat-abi=softfp -mfpu=neon-fp-armv8" 
> > "-mfloat-abi=hard -mfpu=neon-fp-armv8" } {
> > +if { [istarget arm*-*-*] } {
> > + set fpu_auto "-mfpu=auto"
> > +}
> > +
> > +foreach flags [list "$fpu_auto" \
> 
>  Shouldn't we try first with "", even on Arm?  Thus
> foreach flags [list "" "$fpu_auto" \
>  ...
> 
> >>> I don't think so, that's why I tried to explain above.
> >>> "" is acceptable / accepted in arm_v8_2a_bf16_neon_ok
> >>> (this is (A) (B) above, where the important parts are:
> >>> -march=armv7-m -mcpu=unset -march=armv8.2-a+bf16
> >>> (so -mfpu is set to the toolchain's default)
> >>
> >> That's never going to work reliably.  We need to check, somewhere, the 
> >> full set of options we intend to pass to the compilation.  We can't 
> >> assume that separately testing if A is ok and B is ok => A + B is ok.
> >>
> >
> > Hmmm I think I raised that problem years ago, because of the way the
> > test system is designed...
> >
> >>>
> >>> but then the actual testcase is compiled with additional flags (C)
> >>> defined by the test driver using arm_neon_fp16_ok
> >>> C = -mfpu=neon -mfloat-abi=softfp -mcpu=unset -march=armv7-a
> >>> -mfpu=neon-fp16 -mfp16-format=ieee
> >>>
> >>> so the relevant parts of (A) (C) (B) are:
> >>> -march=armv7-m  -mfpu=neon -mcpu=unset -march=armv7-a -mfpu=neon-fp16
> >>> -mcpu=unset -march=armv8.2-a+bf16
> >>> 

Re: [PATCH] RISC-V: Remove the priority in FMV ASM name mangling

2025-03-25 Thread Kito Cheng
Will it only cause issues with this patch
https://gcc.gnu.org/pipermail/gcc-patches/2025-March/678918.html
or will it cause problems with the current trunk as well?

If the latter one, could you provide a case for that?

Thanks :)

On Tue, Mar 25, 2025 at 7:15 PM Yangyu Chen  wrote:
>
> We don't need to add priority in ASM name mangling, keeping this might
> cause an issue if we call another MV clone directly but only one place
> has the priority declared.
>
> gcc/ChangeLog:
>
> * config/riscv/riscv.cc (riscv_mangle_decl_assembler_name): Remove
> priority in fmv asm name mangling.
>
> Signed-off-by: Yangyu Chen 
> ---
>  gcc/config/riscv/riscv.cc | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index 38f3ae7cd84..4a042878554 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -13238,7 +13238,11 @@ riscv_mangle_decl_assembler_name (tree decl, tree id)
>
>/* Replace non-alphanumeric characters with underscores as the suffix. 
>  */
>for (const char *c = version_string; *c; c++)
> -   name += ISALNUM (*c) == 0 ? '_' : *c;
> +   {
> + /* Skip ';' for ";priority"  */
> + if (*c == ';') break;
> + name += ISALNUM (*c) == 0 ? '_' : *c;
> +   }
>
>if (DECL_ASSEMBLER_NAME_SET_P (decl))
> SET_DECL_RTL (decl, NULL);
> --
> 2.49.0
>
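
To make the effect of the quoted hunk concrete, the loop can be sketched outside of GCC as below. This is only an approximation under stated assumptions: mangle_version_suffix is a hypothetical free function, std::isalnum stands in for GCC's ISALNUM macro, and the input string used in the usage note is an illustrative version string rather than one taken from the thread.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical stand-alone version of the suffix loop in the patch:
// non-alphanumeric characters become '_', and everything from the first
// ';' onwards (the ";priority=..." part) is dropped.
std::string mangle_version_suffix(const std::string& version_string)
{
  std::string name;
  for (char c : version_string)
    {
      if (c == ';')
        break;  // skip ";priority" and whatever follows it
      name += std::isalnum(static_cast<unsigned char>(c)) ? c : '_';
    }
  return name;
}
```

With this sketch, "arch=+v;priority=10" and "arch=+v" both yield the suffix "arch__v", so a declaration that omits the priority would refer to the same assembler name as one that states it.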


Re: [PATCH] c++: Fix ICE when template lambdas call with default parameters in unevaluated context

2025-03-25 Thread yxj-github-437
>> This patch would like to avoid the ICE when template lambdas call with
>> default parameters in unevaluated context. For example as below:
>>
>>  1   │ template 
>>  2   │ void foo(T x) {
>>  3   │   sizeof [](T=x) { return 0; }();
>>  4   │ }
>>  5   │
>>  6   │ void test {
>>  7   │   foo(0);
>>  8   │ }
>>
>> when compile with -fsyntax-only -std=c++20, it will have ICE similar as below
>>
>> test.cc: In instantiation of ‘void foo(T) [with T = int]’:
>> test.cc:7:6:   required from here
>>  6 |   foo(0);
>>|   ~~~^~~
>> test.cc:3:38: internal compiler error: in tsubst_expr, at cp/pt.cc:21919
>>  2 |   sizeof [](T=x) { return 0; }();
>>|  ^~
>>
>> For this example, the template lambda will build with an independent 
>> unevaluated
>> context. When convert default arguments, handling `int x` will be in a no 
>> unevaluated
>> operand context, the code `gcc_assert (cp_unevaluated_operand)` will make 
>> ICE.
>> So just remove this assert, and the code will get an effective error 
>> information:
>> "‘x’ is not captured".

> Without the sizeof we get the better error "parameter 'x' cannot appear
> in this context"; capturing or not isn't the reason it's ill-formed.

> It seems like this code:

>> /* Check to see if DECL is a local variable in a context
>>where that is forbidden.  */
>> if ((parser->local_variables_forbidden_p & LOCAL_VARS_FORBIDDEN)
>> && local_variable_p (decl)
>> /* DR 2082 permits local variables in unevaluated contexts
>>within a default argument.  */
>> && !cp_unevaluated_operand)

> is confused by the sizeof; I guess we want to cp_evaluated for default
> arguments like we do for template arguments.

> Jason

Thanks. This bug should indeed be fixed during lambda parsing.
I will make the following modifications.

-- >8 --

This patch avoids the ICE when a lambda with default arguments is called
in an unevaluated context inside a template. The bug is the same as
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119385. For example:

1   | template 
2   | void foo(T x) {
3   |   sizeof [](T=x) { return 0; }();
4   | }
5   |
6   | void test {
7   |   foo(0);
8   | }

When compiled with -fsyntax-only -std=c++20, it produces an ICE similar to the one below:

test.cc: In instantiation of 'void foo(T) [with T = int]':
test.cc:7:6:   required from here
6 |   foo(0);
  |   ~~~^~~
test.cc:3:38: internal compiler error: in tsubst_expr, at cp/pt.cc:21919
2 |   sizeof [](T=x) { return 0; }();
  |  ^~

And without the template code ``, the code compiles successfully, which is
wrong.

When parsing the lambda, the enclosing sizeof affects how unevaluated operands
inside the lambda are handled. So save and restore cp_unevaluated_operand
around the lambda.
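
The save/restore idea can be illustrated in isolation with a small scope guard. Everything in this sketch is hypothetical: unevaluated_operand is a stand-in for cp_unevaluated_operand, and the two parse functions only model the nesting of a sizeof operand around a lambda body; the actual patch saves and restores the variable by hand inside cp_parser_lambda_expression.

```cpp
#include <cassert>

// Hypothetical stand-in for the front end's cp_unevaluated_operand counter.
static int unevaluated_operand = 0;

// Resets the counter for the lifetime of the guard, then restores it.
struct evaluated_scope
{
  int saved;
  evaluated_scope() : saved(unevaluated_operand) { unevaluated_operand = 0; }
  ~evaluated_scope() { unevaluated_operand = saved; }
  evaluated_scope(const evaluated_scope&) = delete;
  evaluated_scope& operator=(const evaluated_scope&) = delete;
};

// Models parsing a lambda: its contents start in an evaluated context.
int parse_lambda_body()
{
  evaluated_scope guard;
  return unevaluated_operand;  // what the lambda's contents observe
}

// Models "sizeof <lambda>()": the operand of sizeof is unevaluated.
int parse_sizeof_of_lambda()
{
  ++unevaluated_operand;
  int seen_inside_lambda = parse_lambda_body();
  --unevaluated_operand;
  return seen_inside_lambda;
}
```

The lambda's contents see a counter of zero even though the surrounding sizeof had incremented it, and the outer value is back in place once the lambda has been parsed.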

gcc/cp/ChangeLog:

* parser.cc (cp_parser_lambda_expression): Save/restore
cp_unevaluated_operand when parsing a lambda.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/lambda-uneval25.C: New test.
---
 gcc/cp/parser.cc |  4 
 gcc/testsuite/g++.dg/cpp2a/lambda-uneval25.C | 11 +++
 2 files changed, 15 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/lambda-uneval25.C

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 57a461042bf..9cc51f57fa7 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -11754,6 +11754,8 @@ cp_parser_lambda_expression (cp_parser* parser)
 /* Inside the class, surrounding template-parameter-lists do not apply.  */
 unsigned int saved_num_template_parameter_lists
 = parser->num_template_parameter_lists;
+/* Inside the lambda, outside unevaluated context do not apply.   */
+int saved_cp_unevaluated_operand = cp_unevaluated_operand;
 unsigned char in_statement = parser->in_statement;
 bool in_switch_statement_p = parser->in_switch_statement_p;
 bool fully_implicit_function_template_p
@@ -11765,6 +11767,7 @@ cp_parser_lambda_expression (cp_parser* parser)
 bool saved_omp_array_section_p = parser->omp_array_section_p;
 
 parser->num_template_parameter_lists = 0;
+cp_unevaluated_operand = 0;
 parser->in_statement = 0;
 parser->in_switch_statement_p = false;
 parser->fully_implicit_function_template_p = false;
@@ -11814,6 +11817,7 @@ cp_parser_lambda_expression (cp_parser* parser)
 in_discarded_stmt = discarded;
 
 parser->num_template_parameter_lists = saved_num_template_parameter_lists;
+cp_unevaluated_operand = saved_cp_unevaluated_operand;
 parser->in_statement = in_statement;
 parser->in_switch_statement_p = in_switch_statement_p;
 parser->fully_implicit_function_template_p
diff --git a/gcc/testsuite/g++.dg/cpp2a/lambda-uneval25.C 
b/gcc/testsuite/g++.dg/cpp2a/lambda-uneval25.C
new file mode 100644
index 000..7fdd44d3dd

Re: [PATCH v2] RISC-V: Fix wrong LMUL when only implict zve32f.

2025-03-25 Thread Kito Cheng
Hi Robin


> Sorry Kito, that we're having so much back and forth here, it's not my
> intention to block anything (not that I could anyway).  I just want to
> make sure I properly understand the rationale (or the spec, rather).
>

No worries, it's a great chance to clarify the spec together :)
Sometimes I also misunderstand the spec... :P


>
> > Oh, ok, I got the point why you confused on this, the new condition is
> > little bit `indirect`,
> > it say TARGET_VECTOR_ELEN_64, it would be clear if we TARGET_VECTOR_ELEN
> > 32,
> > however we don't have that so we test TARGET_VECTOR_ELEN_64 instead of
> > TARGET_VECTOR_ELEN > 32, and that will implicitly mean not allow
> > zve32x or zve32f
> > with any VLEN since we didn't limit/test VLEN here.
> >
> > In theory that should also test VLEN >= 64 for that, but since we
> > already forbid zve32x
> > or zve32f which means it at least requires zve64 and it will imply VLEN
> >= 64,
> > so we don't need to test that.
>
> Ok, yeah, TARGET_VECTOR_ELEN_64 implies VLEN >= 64.  My take is that
> TARGET_MIN_VLEN >= 64 is already sufficient as a check and
> TARGET_VECTOR_ELEN_64 is too restrictive.  My reading of the spec is that
> vsetvl ...,e16,mf4 could be allowed for zve32x_zvl64b (or any higher
> VLEN/zvl).
> Note I'm not talking about mf8 here, that one is settled, just about
> e16,mf4
> and e32,mf2.
>

zve32x_zvl64b has the same requirement as zve32x_zvl32b.
I mean, e16,mf4 could be allowed on zve32x_zvl64b, but it is also
spec-conformant for an implementation to raise an illegal instruction
exception on e16,mf4, which means e16,mf4 is not safe to use on
zve32x/zve32f.
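
As a rough worked example of that reasoning, the sketch below encodes one reading of the vector spec: for a fractional LMUL, an implementation is only required to support SEW values from 8 up to LMUL * ELEN. That rule is stated here as an assumption rather than quoted from the thread, and frac_lmul_guaranteed is a hypothetical helper, not something in GCC.

```cpp
#include <cassert>

// Assumed rule: a fractional LMUL of 1/frac_denominator is only guaranteed
// to work for SEW <= ELEN / frac_denominator (and SEW >= 8).
// frac_denominator is 2 for mf2, 4 for mf4, 8 for mf8.
constexpr bool frac_lmul_guaranteed(int sew, int frac_denominator, int elen)
{
  return sew >= 8 && sew * frac_denominator <= elen;
}

// ELEN = 32 (zve32x/zve32f): the cases discussed above are not guaranteed.
static_assert(!frac_lmul_guaranteed(16, 4, 32), "e16,mf4 may trap");
static_assert(!frac_lmul_guaranteed(32, 2, 32), "e32,mf2 may trap");
static_assert(!frac_lmul_guaranteed(8, 8, 32), "e8,mf8 may trap");
static_assert(frac_lmul_guaranteed(8, 4, 32), "e8,mf4 is fine");

// ELEN = 64: the same settings are within the guaranteed range.
static_assert(frac_lmul_guaranteed(16, 4, 64), "e16,mf4 is fine");
static_assert(frac_lmul_guaranteed(32, 2, 64), "e32,mf2 is fine");
```

Under that reading VLEN does not enter the condition at all, which would be consistent with zve32x_zvl64b being treated the same as zve32x_zvl32b.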


> (Likewise we don't allow e.g. vsetvl e64,mf4 on v_zvl128b by default but
> only do so starting at v_zvl256b.)
>
> In the end those cases are very rare and it won't matter much anyway.  I'd
> just like to understand if either
>  (a) it's in the spec and I'm reading it wrong or
>  (b) we're disabling more than we need to because we don't really mind
>  (and performance implications are negligible to non-existent anyway)
>
> You have been hitting those on your uarchs.  Is this the attached test
> case with e32,mf2?  And exactly the vsetvl ... e32, mf2 faults on an
> embedded board with ELEN = 32 and VLEN = 128?  If so then option (a) above
> is likely :)
>

Yeah, our core has ELEN = 32 and VLEN >= 128...


>
> I was hoping we could just disable all mf8 modes by TARGET_VECTOR_ELEN_64
> and be done with it.
>
> > I guess I still haven't got the point? We didn't touch the alignment
> > in this patch, so it still requires element alignment for each vector
> > type?
> > I mean, using or losing MF8 doesn't give us the ability to do
> > misaligned accesses?
> >
> > We lose ELEN=8 MF8 (RVVMF8QI), but we still have ELEN=8 MF4 (RVVMF4QI)
> > to do those unaligned memory accesses, which should be functionally
> > equivalent.
> > (Both occupy one vector register, so using MF8 doesn't give lower
> > register pressure than MF4.)
>
> Sorry, I was just talking about the spec not about the patch itself.
> Please disregard, I think it's leading us down the wrong alley here.  Yes,
> the patch is fine with regards to alignment as it doesn't touch it.
>

Oh, I read your reply again, yeah...that's really unfortunate about the
uselessness of the Zicclsm extension...


>
> --
> Regards
>  Robin
>
>


Re: [PATCH] c++: Fix ICE when template lambdas call with default parameters in unevaluated context

2025-03-25 Thread Jason Merrill

On 3/25/25 10:17 AM, yxj-github-437 wrote:

This patch would like to avoid the ICE when template lambdas are called with
default parameters in an unevaluated context. For example, as below:

  1   │ template 
  2   │ void foo(T x) {
  3   │   sizeof [](T=x) { return 0; }();
  4   │ }
  5   │
  6   │ void test {
  7   │   foo(0);
  8   │ }

When compiled with -fsyntax-only -std=c++20, it produces an ICE similar to the one below:

test.cc: In instantiation of ‘void foo(T) [with T = int]’:
test.cc:7:6:   required from here
  6 |   foo(0);
|   ~~~^~~
test.cc:3:38: internal compiler error: in tsubst_expr, at cp/pt.cc:21919
  2 |   sizeof [](T=x) { return 0; }();
|  ^~

For this example, the template lambda is built with an independent unevaluated
context. When the default arguments are converted, `int x` is handled outside
an unevaluated operand context, so `gcc_assert (cp_unevaluated_operand)`
triggers the ICE. Simply removing this assert lets the code produce a useful
error message:
"‘x’ is not captured".



Without the sizeof we get the better error "parameter 'x' cannot appear
in this context"; capturing or not isn't the reason it's ill-formed.



It seems like this code:



 /* Check to see if DECL is a local variable in a context
where that is forbidden.  */
 if ((parser->local_variables_forbidden_p & LOCAL_VARS_FORBIDDEN)
 && local_variable_p (decl)
 /* DR 2082 permits local variables in unevaluated contexts
within a default argument.  */
 && !cp_unevaluated_operand)



is confused by the sizeof; I guess we want to cp_evaluated for default
arguments like we do for template arguments.
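
The idea behind a scoped guard such as cp_evaluated can be sketched as follows. This is a simplified stand-alone model, not the GCC definition: `unevaluated_depth`, `evaluated_guard` and `depth_seen_inside_lambda` are hypothetical names standing in for a global like cp_unevaluated_operand and the code that runs while it is cleared.

```cpp
// Stand-in for a global counter like cp_unevaluated_operand (hypothetical).
int unevaluated_depth = 0;

// Minimal model of a save/clear/restore guard: the constructor remembers
// the current value and clears it, the destructor puts it back.
struct evaluated_guard
{
  int saved;
  evaluated_guard () : saved (unevaluated_depth) { unevaluated_depth = 0; }
  ~evaluated_guard () { unevaluated_depth = saved; }
  evaluated_guard (const evaluated_guard &) = delete;
  evaluated_guard &operator= (const evaluated_guard &) = delete;
};

// Hypothetical routine: reports the depth observed while the guard is live.
int
depth_seen_inside_lambda ()
{
  evaluated_guard guard;      // clears the counter for this scope
  return unevaluated_depth;   // evaluated before the guard is destroyed
}                             // destructor restores on every exit path
```

Because the restore happens in the destructor, no explicit "restore" statement is needed at each exit of the function, which is the practical advantage of a guard object over a manually saved variable.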



Jason


Thanks, this bug should indeed be fixed when parsing the lambda.
I will make the following modifications.

-- >8 --

This patch would like to avoid the ICE when template lambdas are called with
default parameters in an unevaluated context. The bug is the same as
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119385. For example, as below:

 1   | template 
 2   | void foo(T x) {
 3   |   sizeof [](T=x) { return 0; }();
 4   | }
 5   |
 6   | void test {
 7   |   foo(0);
 8   | }

When compiled with -fsyntax-only -std=c++20, it produces an ICE similar to the one below:

test.cc: In instantiation of 'void foo(T) [with T = int]':
test.cc:7:6:   required from here
 6 |   foo(0);
   |   ~~~^~~
test.cc:3:38: internal compiler error: in tsubst_expr, at cp/pt.cc:21919
 2 |   sizeof [](T=x) { return 0; }();
   |  ^~

And without the template code ``, the code compiles, which is wrong.

When parsing a lambda, the enclosing sizeof affects how unevaluated operands
inside the lambda are handled. So consider saving/restoring
cp_unevaluated_operand.

gcc/cp/ChangeLog:

* parser.cc (cp_parser_lambda_expression): Save/restore
cp_unevaluated_operand when parser lambda.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/lambda-uneval25.C: New test.
---
  gcc/cp/parser.cc |  4 
  gcc/testsuite/g++.dg/cpp2a/lambda-uneval25.C | 11 +++
  2 files changed, 15 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/lambda-uneval25.C

diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 57a461042bf..9cc51f57fa7 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -11754,6 +11754,8 @@ cp_parser_lambda_expression (cp_parser* parser)
  /* Inside the class, surrounding template-parameter-lists do not apply.  */
  unsigned int saved_num_template_parameter_lists
  = parser->num_template_parameter_lists;
+/* Inside the lambda, outside unevaluated context do not apply.   */
+int saved_cp_unevaluated_operand = cp_unevaluated_operand;


Instead of following the surrounding pattern, please use cp_evaluated. 
That avoids the need for any change in the other two places.



  unsigned char in_statement = parser->in_statement;
  bool in_switch_statement_p = parser->in_switch_statement_p;
  bool fully_implicit_function_template_p
@@ -11765,6 +11767,7 @@ cp_parser_lambda_expression (cp_parser* parser)
  bool saved_omp_array_section_p = parser->omp_array_section_p;
  
  parser->num_template_parameter_lists = 0;

+cp_unevaluated_operand = 0;
  parser->in_statement = 0;
  parser->in_switch_statement_p = false;
  parser->fully_implicit_function_template_p = false;
@@ -11814,6 +11817,7 @@ cp_parser_lambda_expression (cp_parser* parser)
  in_discarded_stmt = discarded;
  
  parser->num_template_parameter_lists = saved_num_template_parameter_lists;

+cp_unevaluated_operand = saved_cp_unevaluated_operand;
  parser->in_statement = in_statement;
  parser->in_switch_statement_p = in_switch_statement_p;
  parser->fully_implicit_function_template_p
diff --git a/gcc/testsuite/g++.

[PATCH] target/119010 - add missing integer store reservations for znver4 and znver5

2025-03-25 Thread Richard Biener
The imov and imovx classified stores miss reservations in the znver4/5
pipeline description.  The following adds them.

Bootstrap and regtest pending on x86_64-unknown-linux-gnu.

OK?

PR target/119010
* config/i386/zn4zn5.md (znver4_imov_double_store,
znver5_imov_double_store, znver4_imov_store, znver5_imov_store):
New reservations for integer stores.
---
 gcc/config/i386/zn4zn5.md | 26 ++
 1 file changed, 26 insertions(+)

diff --git a/gcc/config/i386/zn4zn5.md b/gcc/config/i386/zn4zn5.md
index f8772fed620..954cdc528d6 100644
--- a/gcc/config/i386/zn4zn5.md
+++ b/gcc/config/i386/zn4zn5.md
@@ -142,6 +142,20 @@
   (eq_attr "memory" "load"
 "znver4-double,znver5-load,znver5-ieu")
 
+(define_insn_reservation "znver4_imov_double_store" 5
+   (and (eq_attr "cpu" "znver4")
+(and (eq_attr "znver1_decode" "double")
+ (and (eq_attr "type" "imov")
+  (eq_attr "memory" "store"
+"znver4-double,znver4-store,znver4-ieu")
+
+(define_insn_reservation "znver5_imov_double_store" 5
+   (and (eq_attr "cpu" "znver5")
+(and (eq_attr "znver1_decode" "double")
+ (and (eq_attr "type" "imov")
+  (eq_attr "memory" "store"
+"znver4-double,znver5-store,znver5-ieu")
+
 ;; imov, imovx
 (define_insn_reservation "znver4_imov" 1
 (and (eq_attr "cpu" "znver4")
@@ -167,6 +181,18 @@
  (eq_attr "memory" "load")))
 "znver4-direct,znver5-load,znver5-ieu")
 
+(define_insn_reservation "znver4_imov_store" 5
+   (and (eq_attr "cpu" "znver4")
+(and (eq_attr "type" "imov,imovx")
+ (eq_attr "memory" "store")))
+"znver4-direct,znver4-store,znver4-ieu")
+
+(define_insn_reservation "znver5_imov_store" 5
+   (and (eq_attr "cpu" "znver5")
+(and (eq_attr "type" "imov,imovx")
+ (eq_attr "memory" "store")))
+"znver4-direct,znver5-store,znver5-ieu")
+
 ;; Push Instruction
 (define_insn_reservation "znver4_push" 1
(and (eq_attr "cpu" "znver4")
-- 
2.43.0


Re: [PATCH v3] Don't instrument exit edges after musttail

2025-03-25 Thread Jakub Jelinek
On Tue, Mar 25, 2025 at 08:33:41AM -0700, Andi Kleen wrote:
> > 2025-03-25  Jakub Jelinek  
> > Andi Kleen  
> > 
> > PR gcov-profile/118442
> > * profile.cc (branch_prob): Ignore EDGE_FAKE edges from musttail calls
> > to EXIT.
> > 
> > * c-c++-common/pr118442.c: New test.
> > 
> > --- gcc/profile.cc.jj   2025-01-02 11:23:16.458517673 +0100
> > +++ gcc/profile.cc  2025-03-25 09:57:21.860398601 +0100
> > @@ -1340,6 +1340,20 @@ branch_prob (bool thunk)
> >   EDGE_INFO (e)->ignore = 1;
> >   ignored_edges++;
> > }
> > +  /* Ignore fake edges after musttail calls.  */
> > +  if ((e->flags & EDGE_FAKE)
> > + && e->dest == EXIT_BLOCK_PTR_FOR_FN (cfun))
> > +   {
> > + gimple_stmt_iterator gsi = gsi_last_bb (e->src);
> 
> At least the musttail pass allows some statements after the call, like labels
> and debug information. Not sure if it matters.

I think it shouldn't.
gimple_flow_call_edges_add splits basic blocks with calls after those calls
and adds the EDGE_FAKE edges to EXIT, so the last stmt at the end of
the bb from which the edge goes should be the call.

Jakub



Re: [PATCH] c++: Fix ICE when template lambdas call with default parameters in unevaluated context

2025-03-25 Thread yxj-github-437
>> This patch would like to avoid the ICE when template lambdas are called with
>> default parameters in an unevaluated context. The bug is the same as
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119385. For example, as below:
>> 
>>  1   | template 
>>  2   | void foo(T x) {
>>  3   |   sizeof [](T=x) { return 0; }();
>>  4   | }
>>  5   |
>>  6   | void test {
>>  7   |   foo(0);
>>  8   | }
>> 
>> When compiled with -fsyntax-only -std=c++20, it produces an ICE similar to the one below:
>> 
>> test.cc: In instantiation of 'void foo(T) [with T = int]':
>> test.cc:7:6:   required from here
>>  6 |   foo(0);
>>|   ~~~^~~
>> test.cc:3:38: internal compiler error: in tsubst_expr, at cp/pt.cc:21919
>>  2 |   sizeof [](T=x) { return 0; }();
>>|  ^~
>> 
>> And without the template code ``, the code compiles, which is wrong.
>> 
>> When parsing a lambda, the enclosing sizeof affects how unevaluated operands
>> inside the lambda are handled. So consider saving/restoring
>> cp_unevaluated_operand.
>> 
>> gcc/cp/ChangeLog:
>> 
>>  * parser.cc (cp_parser_lambda_expression): Save/restore
>>  cp_unevaluated_operand when parser lambda.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>>  * g++.dg/cpp2a/lambda-uneval25.C: New test.
>> ---
>>   gcc/cp/parser.cc |  4 
>>   gcc/testsuite/g++.dg/cpp2a/lambda-uneval25.C | 11 +++
>>   2 files changed, 15 insertions(+)
>>   create mode 100644 gcc/testsuite/g++.dg/cpp2a/lambda-uneval25.C
>> 
>> diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
>> index 57a461042bf..9cc51f57fa7 100644
>> --- a/gcc/cp/parser.cc
>> +++ b/gcc/cp/parser.cc
>> @@ -11754,6 +11754,8 @@ cp_parser_lambda_expression (cp_parser* parser)
>>   /* Inside the class, surrounding template-parameter-lists do not 
>> apply.  */
>>   unsigned int saved_num_template_parameter_lists
>>   = parser->num_template_parameter_lists;
>> +/* Inside the lambda, outside unevaluated context do not apply.   */
>> +int saved_cp_unevaluated_operand = cp_unevaluated_operand;

> Instead of following the surrounding pattern, please use cp_evaluated. 
> That avoids the need for any change in the other two places.

OK, I've replaced it with cp_evaluated.

>>   unsigned char in_statement = parser->in_statement;
>>   bool in_switch_statement_p = parser->in_switch_statement_p;
>>   bool fully_implicit_function_template_p
>> @@ -11765,6 +11767,7 @@ cp_parser_lambda_expression (cp_parser* parser)
>>   bool saved_omp_array_section_p = parser->omp_array_section_p;
>>   
>>   parser->num_template_parameter_lists = 0;
>> +cp_unevaluated_operand = 0;
>>   parser->in_statement = 0;
>>   parser->in_switch_statement_p = false;
>>   parser->fully_implicit_function_template_p = false;
>> @@ -11814,6 +11817,7 @@ cp_parser_lambda_expression (cp_parser* parser)
>>   in_discarded_stmt = discarded;
>>   
>>   parser->num_template_parameter_lists = 
>> saved_num_template_parameter_lists;
>> +cp_unevaluated_operand = saved_cp_unevaluated_operand;
>>   parser->in_statement = in_statement;
>>   parser->in_switch_statement_p = in_switch_statement_p;
>>   parser->fully_implicit_function_template_p
>> diff --git a/gcc/testsuite/g++.dg/cpp2a/lambda-uneval25.C 
>> b/gcc/testsuite/g++.dg/cpp2a/lambda-uneval25.C
>> new file mode 100644
>> index 000..7fdd44d3ddd
>> --- /dev/null
>> +++ b/gcc/testsuite/g++.dg/cpp2a/lambda-uneval25.C
>> @@ -0,0 +1,11 @@
>> +// { dg-do compile { target c++20 } }
>> +
>> +template 
>> +void foo(T x) {
>> +  sizeof [](T=x) { return 0; }(); // { dg-error "may not appear" }
>> +  sizeof [](T=x) { return 0; }(); // { dg-error "may not appear" }
>> +};
>> +
>> +void test() {
>> +  foo(0);
>> +}

-- >8 --

This patch would like to avoid the ICE when template lambdas are called with
default parameters in an unevaluated context. The bug is the same as
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119385. For example, as below:

1   | template 
2   | void foo(T x) {
3   |   sizeof [](T=x) { return 0; }();
4   | }
5   |
6   | void test {
7   |   foo(0);
8   | }

When compiled with -fsyntax-only -std=c++20, it produces an ICE similar to the one below:

test.cc: In instantiation of 'void foo(T) [with T = int]':
test.cc:7:6:   required from here
6 |   foo(0);
  |   ~~~^~~
test.cc:3:38: internal compiler error: in tsubst_expr, at cp/pt.cc:21919
2 |   sizeof [](T=x) { return 0; }();
  |  ^~

And without the template code ``, the code compiles, which is wrong.

When parsing a lambda, the enclosing sizeof affects how unevaluated operands
inside the lambda are handled. So consider saving/restoring
cp_unevaluated_operand.

gcc/cp/ChangeLog:

* parser.cc (cp_parser_lambda_expression): Use cp_evaluated.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/lambda-uneval25.C: New test.

---
 gcc/cp/pa

[PATCHSET] Update Rust frontend 21/03/2024 3/4

2025-03-25 Thread arthur . cohen
Hi everyone,

This is the third patchset in the series for updating upstream GCC
with the latest changes in our development repository.

Most notably this contains handling for if-let statements by Marc
Poulhiès, changes to our name-resolution pass rewrite, and massive
changes to our AST and HIR representations to allow Rust lang-item paths
to be represented. This is different from how the official Rust compiler
handles lang-items, but allows us to refer to essential Rust items
easily while we are still in the process of compiling said core crate.

These lang-item changes also enabled us to continue our work on built-in
derive macros, with Clone and Copy being fully implemented in this
patchset. The remaining built-in derive macros will be upstreamed in the
next patchset. We are still missing PartialOrd and PartialEq, which
will be upstreamed in time for 15.1.

There are also multiple type-system fixes, and testsuite fixes for
systems with different endianness.

Kindly,

Arthur



Re: [PATCH] testsuite: Fix gcc.target/powerpc/vsx-builtin-7.c test [PR119382]

2025-03-25 Thread Peter Bergner
On 3/25/25 1:42 AM, jeevitha wrote:
> gcc/testsuite/
>   PR testsuite/119382
>   * gcc.target/powerpc/vsx-builtin-7.c: Add '-fno-ipa-icf' to dg-options.
> 
> diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-builtin-7.c 
> b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-7.c
> index 5095d5030fd..78e4e23d102 100644
> --- a/gcc/testsuite/gcc.target/powerpc/vsx-builtin-7.c
> +++ b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-7.c
> @@ -1,6 +1,6 @@
>  /* { dg-do compile { target { powerpc*-*-* } } } */
>  /* { dg-skip-if "" { powerpc*-*-darwin* } } */
> -/* { dg-options "-O2 -mdejagnu-cpu=power7 -fno-inline-functions" } */
> +/* { dg-options "-O2 -mdejagnu-cpu=power7 -fno-inline-functions 
> -fno-ipa-icf" } */
>  /* { dg-require-effective-target powerpc_vsx } */
>  
>  /* Test simple extract/insert/slat operations.  Make sure all types are

I REALLY dislike these BIG built-in test cases that test for all possible
built-ins and then do multiple insn scans.  Much better would be having
many small test cases that test for one specific thing, rather than testing
lots of things.  As shown here, they're pretty fragile to changes in
the compiler.

That said, I'm not sure it's really worth splitting this older Power7
test case up, so I guess adding -fno-ipa-icf is probably the best/easiest
of all of the bad options.

Segher, any reason you can give on why we shouldn't go the easy route to
"fix" (yes, these are air-quotes) this by using -fno-ipa-icf?


Peter



Re: [PATCH] testsuite: Fix gcc.target/powerpc/vsx-builtin-7.c test [PR119382]

2025-03-25 Thread Segher Boessenkool
On Tue, Mar 25, 2025 at 07:00:34PM -0500, Peter Bergner wrote:
> On 3/25/25 5:17 PM, Segher Boessenkool wrote:
> > On Tue, Mar 25, 2025 at 03:33:59PM -0500, Peter Bergner wrote:
> >> Segher, any reason you can give on why we shouldn't go the easy route to
> >> "fix" (yes, these are air-quotes) this by using -fno-ipa-icf?
> > 
> > One reason is that that option should not make any difference whatsoever
> > for a well-written testcase: a testcase that wants to test what insns
> > are generated for particular code, damn well should be written in such a
> > way that it is very unlikely the compiler will ever generate different
> > code for it.  Another reason is I had to look up what that option with
> > the cryptic name does, what that name stands for.  And finally, will
> > we be doing more maintenance on this later?  Testcase maintenance is
> > wasted work, work that does not scale even, so it is important to write
> > testcases so that maintenance isn't needed, and if it becomes necessary
> > anyway to improve it so that it will not be needed so much in the
> > future.
> 
> I know there are reasons for wanting it split up, but do we really want
> to spend the development time splitting this old power7 test case up rather
> than just adding the -fno-ipa-icf option?

Like I said:

> It is probably less work the next time one of those tests starts failing
> to *start* with splitting the test up :-)

> You also didn't explicitly say
> which solution we should go with, so we're in a little limbo here.

I didn't finish my reply to Jeevitha's patch yet, so you didn't see
anything of that yet, correct.

Reason 4 (is it four?  Some bigger number anyway) to not want to use
command line options like this in tests: it will be copied unthinkingly
to other tests and maybe to production code even.  Cargo-cult
programming is a thing.  At the very least always add a comment saying
why some unusual option is used!


Segher


Re: [PATCH] gimple: Verify that lhs of an assign is NOT a function [PR118796]

2025-03-25 Thread Andrew Pinski
On Tue, Mar 25, 2025 at 10:59 PM Richard Biener
 wrote:
>
>
>
> > On 26.03.2025 at 04:47, Andrew Pinski wrote:
> >
> > This adds a simple verification so that the LHS of an assignment is
> > not a function decl. SRA and FRE will produce an ICE for this anyways
> > so let's catch it earlier.  This showed up because the fortran front-end
> > didn't translate the function name into the result decl in some cases.
> >
> > Bootstrapped and tested on x86_64_linux-gnu with no regressions.
>
> Why is the is_gimple_val test not catching this?

Because verify_gimple_assign_single does not check is_gimple_val. So
maybe this should be moved into verify_gimple_assign_single instead of
where I placed it.

Thanks,
Andrew

>
> >
> > gcc/ChangeLog:
> >
> >PR middle-end/118796
> >* tree-cfg.cc (verify_gimple_assign): Verify the lhs is not
> >a function decl.
> >
> > Signed-off-by: Andrew Pinski 
> > ---
> > gcc/tree-cfg.cc | 9 +
> > 1 file changed, 9 insertions(+)
> >
> > diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
> > index 2fa5678051a..f25480cf6fc 100644
> > --- a/gcc/tree-cfg.cc
> > +++ b/gcc/tree-cfg.cc
> > @@ -4840,6 +4840,15 @@ verify_gimple_assign_single (gassign *stmt)
> > static bool
> > verify_gimple_assign (gassign *stmt)
> > {
> > +  tree lhs = gimple_assign_lhs (stmt);
> > +
> > +  if (TREE_CODE (lhs) == FUNCTION_DECL)
> > +{
> > +  error ("lhs cannot be a function");
> > +  debug_generic_stmt (lhs);
> > +  return true;
> > +}
> > +
> >   if (gimple_assign_nontemporal_move_p (stmt))
> > {
> >   tree lhs = gimple_assign_lhs (stmt);
> > --
> > 2.43.0
> >


[PATCH] gimple: Verify that lhs of an assign is NOT a function [PR118796]

2025-03-25 Thread Andrew Pinski
This adds a simple verification so that the LHS of an assignment is
not a function decl. SRA and FRE will produce an ICE for this anyways
so let's catch it earlier.  This showed up because the fortran front-end
didn't translate the function name into the result decl in some cases.

Bootstrapped and tested on x86_64_linux-gnu with no regressions.

gcc/ChangeLog:

PR middle-end/118796
* tree-cfg.cc (verify_gimple_assign): Verify the lhs is not
a function decl.

Signed-off-by: Andrew Pinski 
---
 gcc/tree-cfg.cc | 9 +
 1 file changed, 9 insertions(+)

diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
index 2fa5678051a..f25480cf6fc 100644
--- a/gcc/tree-cfg.cc
+++ b/gcc/tree-cfg.cc
@@ -4840,6 +4840,15 @@ verify_gimple_assign_single (gassign *stmt)
 static bool
 verify_gimple_assign (gassign *stmt)
 {
+  tree lhs = gimple_assign_lhs (stmt);
+
+  if (TREE_CODE (lhs) == FUNCTION_DECL)
+{
+  error ("lhs cannot be a function");
+  debug_generic_stmt (lhs);
+  return true;
+}
+
   if (gimple_assign_nontemporal_move_p (stmt))
 {
   tree lhs = gimple_assign_lhs (stmt);
-- 
2.43.0



Re: [PATCH 2/2] [cobol] make sources coretypes.h and tree.h clean

2025-03-25 Thread Richard Biener
On Wed, 19 Mar 2025, James K. Lowden wrote:

> On Wed, 19 Mar 2025 15:24:19 +0100 (CET)
> Richard Biener  wrote:
> 
> > The following removes HOWEVER_GCC_DEFINES_TREE and the alternate
> > definition of tree from symbols.h and instead ensures that both
> > coretypes.h and tree.h are included where required.  This required
> > putting GCCs own 'NONE' in a scoped enum (see separate patch) and
> > renaming the cobol use of UNSIGNED, SIGNED and BLOCK which conflict
> > with enums from tree.h.
> 
> IIUC, your intention is to pave the way for computations on tree types
> in the front end, in order to do away with _Float128.  
> 
> I'm not convinced this effort is either good or necessary. I'm
> afraid of ending up with code no one including me understands, for the
> sake of portability to architectures no one will ever use.  I think
> you're assuming I understand things I don't, and possibly assuming
> something to be necessary that isn't, in this context.  

I hope that I can explain (and comment in code) everything needed here.

> More than one person has said something along the lines of "the host
> shouldn't use native computation" as though it's self-evident.  I'd
> really, really like to understand that better.  
> 
> If that's written down somewhere with a rationale, I'm happy to read it
> before making any fuss.  As of today, I don't understand why that
> should be true.  Even better would be that plus documentation for how
> to do it, since I already know how computation in C++ works (pretty
> well, anyway).  
> 
> Before we get too far into it, let me alert you to ramifications that
> come immediately to mind.  
> 
> In parse.y I'm sure you've already seen cce_expr, an arithmetic parser
> for "Constant Compile-time Expressions" as ISO has it, or "cce" among
> friends.  As of now, any expression consisting of only numeric
> constants is reduced to a _Float128.  That value may: 
> 
> 1.  participate in boolean evaluation, including to strings. 
> 2.  participate in later evaluation with another cce 
> 3.  participate in further runtime evaluation with runtime values
> 4.  define the size of a "data item", COBOL for "variable", usually in
> Working-Storage Section.  That includes the size of numeric type in
> digits, the size of an alphanumeric type in characters, and the size of
> an array (which COBOL calls a table.  COBOL is like French: it has a
> different word for everything.) 
> 5.  be an initial value for a numeric type (display, integer, fixed- or
> floating-point). 
> 
> While the computation is floating point, the parser in its actions
> restricts some uses to integral types.  Failure to do so is a bug.  
> 
> On one hand, in general I'm not sure _Float128 suffices to meet ISO
> COBOL's requirements.  On the other hand, we'll never have a single
> variable with a size of 2^64, never mind a *size* with 31 digits of
> precision!  So _Float128 has us covered for all practical purposes,
> especially regarding sizes.  
> 
> On the 3rd hand, it's very nice while debugging the parser to see
> these numbers as numbers, not as some abstract tree type.  
> 
> I just want to put all 3 hands on the table, and make sure we all
> understand why we're doing this, if we are, and what it will entail if
> we do.  I'm sure  you feel the same.  

Sure!  So let me explain the advantage I see in using 'tree' instead
of _Float128 as representation.  You have confirmed what I somehow
reverse engineered from the bits I touched - there are cases where
integral type constants are stored in the _Float128 and it's required
that they stay integral.  With 'tree' you can actually store those
as integer typed nodes.

Then there are host _Float128 computations; doing those with 'tree'
(or REAL_VALUE_TYPE) makes sure they behave exactly the same as if
they were performed on the CPU you generate code for.

The code also currently transfers the host representation of floats
and integers to target memory without caring for endianness or
float format differences.

I realize most of the issues will only exist when cross-compiling
but that's what GCC supports.
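
As an illustration of the endianness concern, here is a minimal C++ sketch that encodes an integer into an explicitly chosen byte order instead of copying the host's in-memory representation. It is not GCC code; `byte_order` and `encode_u32` are hypothetical names, and a real front end would use the compiler's own facilities for this.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Byte order of the target, chosen explicitly rather than inherited
// from whatever machine the compiler happens to run on.
enum class byte_order { little, big };

// Encode a 32-bit value into target byte order using shifts only, so
// the result does not depend on the host's endianness.
std::array<unsigned char, 4>
encode_u32 (std::uint32_t value, byte_order order)
{
  std::array<unsigned char, 4> out{};
  for (std::size_t i = 0; i < out.size (); ++i)
    {
      // Byte i here is the i-th least significant byte of the value.
      unsigned shift = 8 * static_cast<unsigned> (i);
      unsigned char byte = static_cast<unsigned char> ((value >> shift) & 0xff);
      std::size_t pos = order == byte_order::little ? i : out.size () - 1 - i;
      out[pos] = byte;
    }
  return out;
}
```

A plain memcpy of the host value would give the little-endian layout on an x86_64 host and the big-endian layout on a big-endian host, which is exactly the cross-compilation hazard described above.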

That said - the current patching experiment I'm doing is to simply
translate _Float128 usage to tree (or mostly REAL_VALUE_TYPE),
without addressing endianness or target float format encoding issues
or even trying to store integers as integers.  One of the main
points of the exercise for me is to see how far that _Float128
usage extends and what relies on it (there's quite a lot of string
processing done, and correctly converting that might be a
challenge).

Richard.

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


Re: [PATCH] cobol: Rename COB_{BLOCK,UNSIGNED,SIGNED} to {BLOCK,UNSIGNED,SIGNED}_kw for consistency

2025-03-25 Thread Richard Biener
On Fri, 21 Mar 2025, Jakub Jelinek wrote:

> On Wed, Mar 19, 2025 at 06:03:24PM -0400, James K. Lowden wrote:
> > Elsewhere in the parser where there was a conflict like that, I renamed
> > the token.  For example, the COBOL word TRUE uses a token named
> > TRUE_kw.  I don't mind either way; your solution has less impact on the
> > parser.  
> 
> I think consistency is good and when it is a suffix rather than prefix,
> it also sorts alphabetically together with the actual keywords.
> 
> Ok for trunk?

OK.

Richard.

> 2025-03-21  Jakub Jelinek  
> 
>   * parse.y: Rename COB_BLOCK to BLOCK_kw, COB_SIGNED to SIGNED_kw and
>   COB_UNSIGNED to UNSIGNED_kw.
>   * scan.l: Likewise.
>   * token_names.h: Regenerate.
> 
> --- gcc/cobol/scan.l.jj   2025-03-21 10:09:38.903966677 +0100
> +++ gcc/cobol/scan.l  2025-03-21 10:12:07.079914108 +0100
> @@ -374,7 +374,7 @@ ROUNDING  { return ROUNDING; }
>  SECONDS  { return SECONDS; }
>  SECURE   { return SECURE; }
>  SHORT{ return SHORT; }
> -SIGNED   { return COB_SIGNED; }
> +SIGNED   { return SIGNED_kw; }
>  STANDARD-BINARY  { return STANDARD_BINARY; }
>  STANDARD-DECIMAL { return STANDARD_DECIMAL; }
>  STATEMENT{ return STATEMENT; }
> @@ -394,7 +394,7 @@ TOWARD-LESSER { return TOWARD_LESSER;
>  TRUNCATION   { return TRUNCATION; }
>  UCS-4{ return UCS_4; }
>  UNDERLINE{ return UNDERLINE; }
> -UNSIGNED { return COB_UNSIGNED; }
> +UNSIGNED { return UNSIGNED_kw; }
>  UTF-16   { return UTF_16; }
>  UTF-8{ return UTF_8; }
>  
> @@ -837,7 +837,7 @@ CALL  { return CALL; }
>  BY   { return BY; }
>  BOTTOM   { return BOTTOM; }
>  BEFORE   { return BEFORE; }
> -BLOCK{ return COB_BLOCK; }
> +BLOCK{ return BLOCK_kw; }
>  BACKWARD { return BACKWARD; }
>  
>  AT   { return AT; }
> @@ -1042,7 +1042,7 @@ USE({SPC}FOR)?  { return USE; }
>AS { return AS; }
>ASCENDING  { return ASCENDING; }
>BLANK  { return BLANK; }
> -  BLOCK  { return COB_BLOCK; }
> +  BLOCK  { return BLOCK_kw; }
>BY { return BY; }
>BYTE-LENGTH{ return BYTE_LENGTH; }
>CHARACTER  { return CHARACTER; }
> @@ -2164,7 +2164,7 @@ BASIS   { yy_push_state(basis); return BA
>BINARY { return BINARY; }
>BIT{ return BIT; }
>BLANK  { return BLANK; }
> -  BLOCK  { return COB_BLOCK; }
> +  BLOCK  { return BLOCK_kw; }
>BOTTOM { return BOTTOM; }
>BY { return BY; }
>CALL   { return CALL; }
> --- gcc/cobol/parse.y.jj  2025-03-21 10:09:38.902966690 +0100
> +++ gcc/cobol/parse.y 2025-03-21 10:11:12.178674614 +0100
> @@ -408,7 +408,7 @@
>  
>   BASED BASECONVERT
>   BEFORE BINARY BIT BIT_OF "BIT-OF" BIT_TO_CHAR 
> "BIT-TO-CHAR"
> - BLANK COB_BLOCK
> + BLANK BLOCK_kw
>   BOOLEAN_OF_INTEGER "BOOLEAN-OF-INTEGER"
>   BOTTOM BY
>   BYTE BYTE_LENGTH "BYTE-LENGTH"
> @@ -613,7 +613,7 @@
>   NONE NORMAL NUMBERS
>   PREFIXED PREVIOUS PROHIBITED RELATION REQUIRED
>   REVERSE_VIDEO ROUNDING
> - SECONDS SECURE SHORT COB_SIGNED
> + SECONDS SECURE SHORT SIGNED_kw
>   STANDARD_BINARY "STANDARD-BINARY"
>   STANDARD_DECIMAL "STANDARD-DECIMAL"
>   STATEMENT STEP STRUCTURE
> @@ -621,7 +621,7 @@
>   TOWARD_LESSER "TOWARD-LESSER"
>   TRUNCATION
>   UCS_4 "UCS-4"
> - UNDERLINE COB_UNSIGNED
> + UNDERLINE UNSIGNED_kw
>   UTF_16 "UTF-16"
>   UTF_8 "UTF-8"
>  
> @@ -1014,7 +1014,7 @@
>  
>  BACKWARD BASED BASECONVERT
>   BEFORE BINARY BIT BIT_OF BIT_TO_CHAR
> -BLANK COB_BLOCK
> +BLANK BLOCK_kw
>   BOOLEAN_OF_INTEGER
>   BOTTOM BY
>   BYTE BYTE_LENGTH
> @@ -1228,7 +1228,7 @@
>  NONE NORMAL NUMBERS
>  PREFIXED PREVIOUS PROHIBITED RELATION REQUIRED
>  REVERSE_VIDEO ROUNDING
> -SECONDS SECURE SHORT COB_SIGNED
> +SECONDS SECURE SHORT SIGNED_kw
>  

RE: [PATCH] cobol: Get rid of __int128 uses in the COBOL FE [PR119242]

2025-03-25 Thread Robert Dubner



> -Original Message-
> From: Jakub Jelinek 
> Sent: Tuesday, March 25, 2025 19:49
> To: Robert Dubner ; James K. Lowden
> ; Richard Biener 
> Cc: gcc-patches@gcc.gnu.org
> Subject: [PATCH] cobol: Get rid of __int128 uses in the COBOL FE
> [PR119242]
> 
> Hi!
> 
> The following patch changes some remaining __int128 uses in the FE
> into FIXED_WIDE_INT(128), i.e. emulated 128-bit integral type.
> The use of wide_int_to_tree directly from that rather than going through
> build_int_cst_type means we don't throw away the upper 64 bits of the
> values, so the emitting of constants needing full 128 bits can be
> greatly simplified.
> Plus all the #pragma GCC diagnostic ignored "-Wpedantic" spots aren't
> needed, we don't use the _Float128/__int128 types directly in the FE
> anymore.
> 
> Tested on x86_64-linux with make check-cobol, could you please test this
> on UAT/NIST?
> 
> Note, PR119241/PR119242 bugs are still not fully fixed, I think the
> remaining problem is that several FE sources include
> ../../libgcobol/libgcobol.h and that header declares various APIs with
> __int128 and _Float128 types, so trying to build a cross-compiler on a
> host without __int128 and _Float128 will still fail miserably.
> I believe none of those APIs are actually used by the FE, so the
> question is what the FE needs from libgcobol.h and whether the rest could
> be wrapped with #ifndef IN_GCC or #ifndef IN_GCC_FRONTEND or something
> similar (those 2 macros are predefined when compiling the FE files).

I'll take a look at this tomorrow.  Conditional compilation is one
possibility.  Another is to bust that .h file into two; one that goes into
libgcobol and is accessed by gcc/cobol and libgcobol, and a second that is
accessed solely by libgcobol.  I'll want to look at the magnitude of the
problem before deciding which.
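
A minimal sketch of the conditional-compilation option, assuming a single header shared by the front end and the runtime.  The function names are hypothetical and are not taken from libgcobol.h; only the IN_GCC_FRONTEND macro comes from the quoted mail.

```cpp
// Sketch of a header shared by a compiler front end and its runtime library.
// count_decimal_digits and scale_by_ten are hypothetical names.

// Visible to every includer: uses no 128-bit host types.
inline int count_decimal_digits(long long v)
{
  int digits = 1;
  while (v / 10 != 0)
    {
      v /= 10;
      ++digits;
    }
  return digits;
}

// Visible only outside the front end, and only where the host compiler
// provides __int128 at all.
#if !defined(IN_GCC_FRONTEND) && defined(__SIZEOF_INT128__)
#define SKETCH_HAS_INT128_API 1
inline __int128 scale_by_ten(__int128 v)
{
  return v * 10;
}
#else
#define SKETCH_HAS_INT128_API 0
#endif
```

Splitting the header in two would give the same separation without the preprocessor test, at the cost of touching every includer.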

> 
> 2025-03-26  Jakub Jelinek  
> 
>   PR cobol/119242
>   * cobol/genutil.h (get_power_of_ten): Remove #pragma GCC diagnostic
>   around declaration.
>   * cobol/genapi.cc (psa_FldLiteralN): Change type of value from
>   __int128 to FIXED_WIDE_INT(128).  Remove #pragma GCC diagnostic
>   around the declaration.  Use wi::min_precision to determine
>   minimum unsigned precision of the value.  Use wi::neg_p instead
>   of value < 0 tests and wi::set_bit_in_zero
>   to build sign bit.  Handle field->data.capacity == 16 like
>   1, 2, 4 and 8, use wide_int_to_tree instead of build_int_cst.
>   (mh_source_is_literalN): Remove #pragma GCC diagnostic around
>   the definition.
>   (binary_initial_from_float128): Likewise.
>   * cobol/genutil.cc (get_power_of_ten): Remove #pragma GCC diagnostic
>   before the definition.
> 
> --- gcc/cobol/genutil.h.jj2025-03-25 21:14:48.448384925 +0100
> +++ gcc/cobol/genutil.h   2025-03-25 21:19:24.358620134 +0100
> @@ -104,10 +104,7 @@ void  get_binary_value( tree value,
>  tree  get_data_address( cbl_field_t *field,
>  tree offset);
> 
> -#pragma GCC diagnostic push
> -#pragma GCC diagnostic ignored "-Wpedantic"
>  FIXED_WIDE_INT(128) get_power_of_ten(int n);
> -#pragma GCC diagnostic pop
>  void  scale_by_power_of_ten_N(tree value,
>  int N,
>  bool check_for_fractional = false);
> --- gcc/cobol/genapi.cc.jj2025-03-25 21:11:06.767409766 +0100
> +++ gcc/cobol/genapi.cc   2025-03-25 21:22:28.038113833 +0100
> @@ -3798,16 +3798,13 @@ psa_FldLiteralN(struct cbl_field_t *fiel
>// We are constructing a completely static constant structure, based
on
> the
>// text string in .initial
> 
> -#pragma GCC diagnostic push
> -#pragma GCC diagnostic ignored "-Wpedantic"
> -  __int128 value = 0;
> -#pragma GCC diagnostic pop
> +  FIXED_WIDE_INT(128) value = 0;
> 
>do
>  {
>  // This is a false do{}while, to isolate the variables:
> 
> -// We need to convert data.initial to an __int128 value
> +// We need to convert data.initial to an FIXED_WIDE_INT(128) value
>  char *p = const_cast(field->data.initial);
>  int sign = 1;
>  if( *p == '-' )
> @@ -3903,24 +3900,24 @@ psa_FldLiteralN(struct cbl_field_t *fiel
> 
>  // We now need to calculate the capacity.
> 
> -unsigned char *pvalue = (unsigned char *)&value;
> +unsigned int min_prec = wi::min_precision(value, UNSIGNED);
>  int capacity;
> -if( *(uint64_t*)(pvalue + 8) )
> +if( min_prec > 64 )
>{
>// Bytes 15 through 8 are non-zero
>capacity = 16;
>}
> -else if( *(uint32_t*)(pvalue + 4) )
> +else if( min_prec > 32 )
>{
>// Bytes 7 through 4 are non-zero
>capacity = 8;
>}
> -else if( *(uint16_t*)(pvalue + 2) )
> +else if( min_prec > 16 )
>{
>// Bytes 3 and 2
>capacity = 4;
>}
> -else if( pvalue[1] )
> +else if( min_prec > 8 )
>{
>// Byte 1 is non-zero
>
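
The capacity selection in the hunk above can be modelled outside GCC.  This is only a sketch: ToyU128 is a stand-in for FIXED_WIDE_INT(128), min_unsigned_precision is a hand-written approximation of what wi::min_precision (value, UNSIGNED) is used for here, and the final two branches (capacity 2 and 1) are assumed, because the quoted diff is cut off before them.

```cpp
#include <cstdint>

// Toy 128-bit unsigned value held as two 64-bit halves (not GCC's wide_int).
struct ToyU128
{
  std::uint64_t hi;
  std::uint64_t lo;
};

// Number of bits needed to hold v as an unsigned value; 0 for v == 0.
inline unsigned min_unsigned_precision(ToyU128 v)
{
  unsigned bits = 0;
  for (std::uint64_t h = v.hi; h != 0; h >>= 1)
    ++bits;
  if (bits != 0)
    return bits + 64;
  for (std::uint64_t l = v.lo; l != 0; l >>= 1)
    ++bits;
  return bits;
}

// Smallest capacity in bytes (1, 2, 4, 8 or 16) that can hold v.
inline int capacity_for(ToyU128 v)
{
  unsigned min_prec = min_unsigned_precision(v);
  if (min_prec > 64)
    return 16;
  if (min_prec > 32)
    return 8;
  if (min_prec > 16)
    return 4;
  if (min_prec > 8)   // assumed continuation of the truncated hunk
    return 2;
  return 1;           // assumed continuation of the truncated hunk
}
```

Compared with the old pointer casts into the bytes of an __int128, a precision test does not depend on host endianness or on the host having a 128-bit type.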

Re: [PATCH] gimple: Verify that lhs of an assign is NOT a function [PR118796]

2025-03-25 Thread Richard Biener



> Am 26.03.2025 um 04:47 schrieb Andrew Pinski :
> 
> This adds a simple verification so that the LHS of an assignment is
> not a function decl. SRA and FRE will produce an ICE for this anyways
> so let's catch it earlier.  This showed up because the fortran front-end
> didn't translate the function name into the result decl in some cases.
> 
> Bootstrapped and tested on x86_64_linux-gnu with no regressions.

Why is the is_gimple_val test not catching this?

> 
> gcc/ChangeLog:
> 
>PR middle-end/118796
>* tree-cfg.cc (verify_gimple_assign): Verify the lhs is not
>a function decl.
> 
> Signed-off-by: Andrew Pinski 
> ---
> gcc/tree-cfg.cc | 9 +
> 1 file changed, 9 insertions(+)
> 
> diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
> index 2fa5678051a..f25480cf6fc 100644
> --- a/gcc/tree-cfg.cc
> +++ b/gcc/tree-cfg.cc
> @@ -4840,6 +4840,15 @@ verify_gimple_assign_single (gassign *stmt)
> static bool
> verify_gimple_assign (gassign *stmt)
> {
> +  tree lhs = gimple_assign_lhs (stmt);
> +
> +  if (TREE_CODE (lhs) == FUNCTION_DECL)
> +{
> +  error ("lhs cannot be a function");
> +  debug_generic_stmt (lhs);
> +  return true;
> +}
> +
>   if (gimple_assign_nontemporal_move_p (stmt))
> {
>   tree lhs = gimple_assign_lhs (stmt);
> --
> 2.43.0
> 
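
For readers unfamiliar with the verifier convention used in the hunk, here is a toy model.  It is a sketch only: ToyTreeCode, ToyAssign and toy_verify_assign_lhs are made-up stand-ins, not GCC's tree, gassign or verify_gimple_assign.

```cpp
// Toy model only: these types are stand-ins, not GCC's tree or gassign.
enum class ToyTreeCode { VarDecl, ResultDecl, SsaName, FunctionDecl };

struct ToyAssign
{
  ToyTreeCode lhs_code;
};

// Follows the convention visible in the quoted hunk: return true when the
// statement is malformed, false when it passes the check.
inline bool toy_verify_assign_lhs(const ToyAssign &stmt)
{
  return stmt.lhs_code == ToyTreeCode::FunctionDecl;
}
```

In this model an assignment to the function's result decl passes, while an assignment to the function decl itself is rejected, which is the front-end mistake the patch description mentions.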


Re: [PATCH] i386: Fix up combination of -2 r<<= (x & 7) into btr [PR119428]

2025-03-25 Thread Uros Bizjak
On Tue, Mar 25, 2025 at 7:55 AM Jakub Jelinek  wrote:
>
> Hi!
>
> The following testcase is miscompiled since r15-8478 but latently already
> since my r11-5756 and r11-6631 changes.
> The r11-5756 change was
> https://gcc.gnu.org/pipermail/gcc-patches/2020-December/561164.html
> which changed the splitters to immediately throw away the masking.
> And the r11-6631 change was an optimization to recognize
> (set (zero_extract:HI (...) (const_int 1) (...)) (const_int 1)
> as btr.
>
> The problem is their interaction.  x86 is not a SHIFT_COUNT_TRUNCATED
> target, so the masking needs to be explicit in the IL.
> And combine.cc (make_field_assignment) has since 1992 optimizations
> which try to optimize x &= (-2 r<< y) into zero_extract (x) = 0.
> Now, such an optimization is fine if y has not been masked or if the
> chosen zero_extract has the same mode as the rotate (or it recognizes
> something with a left shift too).  IMHO such optimization is invalid
> for SHIFT_COUNT_TRUNCATED targets because we explicitly say that
> the masking of the shift/rotate counts is redundant there and doesn't
> need to be part of the IL (I have a patch for that, but because it
> is just latent, I'm not sure it needs to be posted for gcc 15 (and
> also am not sure if it should punt or add operand masking just in case)).
> x86 is not SHIFT_COUNT_TRUNCATED though and so even fixing combine
> not to do that for SHIFT_COUNT_TRUNCATED targets doesn't help, and we don't
> have QImode insv, so it is optimized into HImode insertions.  Now,
> if the y in x &= (-2 r<< y) wasn't masked in any way, turning it into
> HImode btr is just fine, but if it was x &= (-2 r<< (y & 7)) and we just
> decided to throw away the masking, using btr changes the behavior on it
> and causes e2fsprogs and sqlite miscompilations.
>
> So IMHO on !SHIFT_COUNT_TRUNCATED targets, we need to keep the maskings
> explicit in the IL, either at least for the duration of the combine pass
> as does the following patch (where combine is the only known pass to have
> such transformation), or even keep it until final pass in case there are
> some later optimizations that would also need to know whether there was
> explicit masking or not and with what mask.  The latter change would be
> much larger.
>
> The following patch just reverts the r11-5756 change and adds a testcase.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2025-03-25  Jakub Jelinek  
>
> PR target/96226
> PR target/119428
> * config/i386/i386.md (splitter after *3_mask,
> splitter after *3_mask_1): Revert 2020-12-05
> changes.
>
> * gcc.c-torture/execute/pr119428.c: New test.

OK.

BTW: According to [1], the revert is at author's discretion, no
outside approval is needed.

[1] https://gcc.gnu.org/gitwrite.html#all

Thanks,
Uros.

>
> --- gcc/config/i386/i386.md.jj  2025-03-24 11:29:12.271793423 +0100
> +++ gcc/config/i386/i386.md 2025-03-24 11:32:14.139305881 +0100
> @@ -18168,7 +18168,8 @@ (define_split
>   [(set (match_dup 4) (match_dup 1))
>(set (match_dup 0)
> (any_rotate:SWI (match_dup 4)
> -  (subreg:QI (match_dup 2) 0)))]
> +  (subreg:QI
> +(and:SI (match_dup 2) (match_dup 3)) 0)))]
>   "operands[4] = gen_reg_rtx (mode);")
>
>  (define_insn_and_split "*3_mask_1"
> @@ -18202,7 +18203,8 @@ (define_split
>== GET_MODE_BITSIZE (mode) - 1"
>   [(set (match_dup 4) (match_dup 1))
>(set (match_dup 0)
> -   (any_rotate:SWI (match_dup 4) (match_dup 2)))]
> +   (any_rotate:SWI (match_dup 4)
> +  (and:QI (match_dup 2) (match_dup 3))))]
>   "operands[4] = gen_reg_rtx (mode);")
>
>  (define_insn_and_split "*3_add"
> --- gcc/testsuite/gcc.c-torture/execute/pr119428.c.jj   2025-03-24 
> 11:41:31.583658619 +0100
> +++ gcc/testsuite/gcc.c-torture/execute/pr119428.c  2025-03-24 
> 11:40:37.884395211 +0100
> @@ -0,0 +1,18 @@
> +/* PR target/119428 */
> +
> +__attribute__((noipa)) void
> +foo (unsigned int x, unsigned char *y)
> +{
> +  y += x >> 3;
> +  *y &= (unsigned char) ~(1 << (x & 0x07));
> +}
> +
> +int
> +main ()
> +{
> +  unsigned char buf[8];
> +  __builtin_memset (buf, 0xff, 8);
> +  foo (8, buf);
> +  if (buf[1] != 0xfe)
> +__builtin_abort ();
> +}
>
> Jakub
>
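
The interaction described above can be reproduced with a small model outside the compiler.  This is a sketch under stated assumptions: the function names are made up, clear_bit_masked models the QImode source semantics x &= (-2 r<< (y & 7)), and btr16_unmasked models a 16-bit register-form btr, which takes the bit index modulo the operand size, after the "& 7" has been thrown away.

```cpp
#include <cstdint>

// Rotate an 8-bit value left by n, with n reduced modulo 8.
constexpr std::uint8_t rotl8(std::uint8_t v, unsigned n)
{
  n &= 7;
  return static_cast<std::uint8_t>((v << n) | (v >> ((8 - n) & 7)));
}

// What the source asks for: clear bit (y & 7) of an 8-bit value.
constexpr std::uint8_t clear_bit_masked(std::uint8_t x, unsigned y)
{
  return static_cast<std::uint8_t>(x & rotl8(0xFE, y & 7));
}

// Model of a 16-bit btr fed the unmasked count: the index is only reduced
// modulo 16, so counts 8..15 hit the upper byte instead of the lower one.
constexpr std::uint16_t btr16_unmasked(std::uint16_t x, unsigned y)
{
  return static_cast<std::uint16_t>(x & ~(1u << (y & 15)));
}
```

For y = 3 both functions clear bit 3.  For y = 8, as in the testcase's foo (8, buf), the masked form clears bit 0 of the byte, while the unmasked 16-bit form clears bit 8 and leaves the low byte untouched.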


[COMMITTED 026/146] gccrs: hir: Mark AttrVec::get_outer_attrs as override

2025-03-25 Thread arthur . cohen
From: Arthur Cohen 

gcc/rust/ChangeLog:

* hir/tree/rust-hir.h: Add override qualifier to overridden method.
---
 gcc/rust/hir/tree/rust-hir.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
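
As background, a minimal sketch of what the override qualifier buys.  ToyNode and ToyLifetimeParam are hypothetical stand-ins, not the gccrs HIR classes.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical base class with a virtual accessor.
struct ToyNode
{
  virtual ~ToyNode() = default;
  virtual std::vector<std::string> &get_outer_attrs() = 0;
};

struct ToyLifetimeParam : ToyNode
{
  std::vector<std::string> outer_attrs;

  // With 'override', a signature that drifts from the base declaration
  // (for example by gaining 'const') becomes a compile error instead of
  // silently declaring a new, unrelated function.
  std::vector<std::string> &get_outer_attrs() override { return outer_attrs; }
};

// Calls through the base interface, so virtual dispatch is exercised.
inline std::size_t count_outer_attrs(ToyNode &node)
{
  return node.get_outer_attrs().size();
}
```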

diff --git a/gcc/rust/hir/tree/rust-hir.h b/gcc/rust/hir/tree/rust-hir.h
index 8ce5cf4d102..8544d0d5f09 100644
--- a/gcc/rust/hir/tree/rust-hir.h
+++ b/gcc/rust/hir/tree/rust-hir.h
@@ -673,7 +673,7 @@ public:
   // Returns whether the lifetime param has an outer attribute.
   bool has_outer_attribute () const override { return outer_attrs.size () > 1; }
 
-  AST::AttrVec &get_outer_attrs () { return outer_attrs; }
+  AST::AttrVec &get_outer_attrs () override { return outer_attrs; }
 
   // Returns whether the lifetime param is in an error state.
   bool is_error () const { return lifetime.is_error (); }
-- 
2.45.2



[COMMITTED 034/144] gccrs: Add typecheck for path patterns.

2025-03-25 Thread arthur . cohen
From: Raiki Tamura 

gcc/rust/ChangeLog:

* hir/tree/rust-hir.cc (Item::item_kind_string): New function.
* hir/tree/rust-hir.h: New function.
* typecheck/rust-hir-type-check-expr.cc (TypeCheckExpr::visit):
Modify to check all arms in match expressions even if some of them
have errors.
* typecheck/rust-hir-type-check-pattern.cc (TypeCheckPattern::visit):
Add and fix check for path patterns.

gcc/testsuite/ChangeLog:

* rust/compile/issue-2324-2.rs: Fix error message.
* rust/compile/match9.rs: New test.

Signed-off-by: Raiki Tamura 
---
 gcc/rust/hir/tree/rust-hir.cc |  38 ++
 gcc/rust/hir/tree/rust-hir.h  |   2 +
 .../typecheck/rust-hir-type-check-expr.cc |  13 +-
 .../typecheck/rust-hir-type-check-pattern.cc  | 118 ++
 gcc/testsuite/rust/compile/issue-2324-2.rs|   2 +-
 gcc/testsuite/rust/compile/match9.rs  |  30 +
 6 files changed, 175 insertions(+), 28 deletions(-)
 create mode 100644 gcc/testsuite/rust/compile/match9.rs
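
The control-flow change in TypeCheckExpr::visit (record the error, keep going, bail out after the loop) can be sketched in isolation.  Everything here is hypothetical: ToyArm and check_all_arms are illustration names and do not exist in gccrs.

```cpp
#include <string>
#include <vector>

// Hypothetical result of type-checking one match arm.
struct ToyArm
{
  bool ok;                // false if the arm's pattern failed to type-check
  std::string type_name;  // type of the arm's block when ok is true
};

// Visits every arm so that each failing arm is counted, instead of
// returning at the first failure.  Returns false if any arm failed.
inline bool check_all_arms(const std::vector<ToyArm> &arms,
                           std::vector<std::string> &block_types,
                           int &errors_reported)
{
  bool saw_error = false;
  for (const ToyArm &arm : arms)
    {
      if (!arm.ok)
        {
          ++errors_reported;
          saw_error = true;
          continue;
        }
      block_types.push_back(arm.type_name);
    }
  return !saw_error;
}
```

With an early return, only the first bad arm would be diagnosed; with the flag, all of them are reported in a single compilation.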

diff --git a/gcc/rust/hir/tree/rust-hir.cc b/gcc/rust/hir/tree/rust-hir.cc
index 6290e72669e..8e0d444ce15 100644
--- a/gcc/rust/hir/tree/rust-hir.cc
+++ b/gcc/rust/hir/tree/rust-hir.cc
@@ -212,6 +212,44 @@ Module::as_string () const
   return str + "\n";
 }
 
+std::string
+Item::item_kind_string (Item::ItemKind kind)
+{
+  switch (kind)
+{
+case Item::ItemKind::Static:
+  return "static";
+case Item::ItemKind::Constant:
+  return "constant";
+case Item::ItemKind::TypeAlias:
+  return "type alias";
+case Item::ItemKind::Function:
+  return "function";
+case Item::ItemKind::UseDeclaration:
+  return "use declaration";
+case Item::ItemKind::ExternBlock:
+  return "extern block";
+case Item::ItemKind::ExternCrate:
+  return "extern crate";
+case Item::ItemKind::Struct:
+  return "struct";
+case Item::ItemKind::Union:
+  return "union";
+case Item::ItemKind::Enum:
+  return "enum";
+case Item::ItemKind::EnumItem:
+  return "enum item";
+case Item::ItemKind::Trait:
+  return "trait";
+case Item::ItemKind::Impl:
+  return "impl";
+case Item::ItemKind::Module:
+  return "module";
+default:
+  rust_unreachable ();
+}
+}
+
 std::string
 StaticItem::as_string () const
 {
diff --git a/gcc/rust/hir/tree/rust-hir.h b/gcc/rust/hir/tree/rust-hir.h
index 8a27161434e..f8eb22db087 100644
--- a/gcc/rust/hir/tree/rust-hir.h
+++ b/gcc/rust/hir/tree/rust-hir.h
@@ -220,6 +220,8 @@ public:
 Module,
   };
 
+  static std::string item_kind_string (ItemKind kind);
+
   virtual ItemKind get_item_kind () const = 0;
 
   // Unique pointer custom clone function
diff --git a/gcc/rust/typecheck/rust-hir-type-check-expr.cc 
b/gcc/rust/typecheck/rust-hir-type-check-expr.cc
index 81d82952550..38734d58948 100644
--- a/gcc/rust/typecheck/rust-hir-type-check-expr.cc
+++ b/gcc/rust/typecheck/rust-hir-type-check-expr.cc
@@ -1465,6 +1465,7 @@ TypeCheckExpr::visit (HIR::MatchExpr &expr)
   TyTy::BaseType *scrutinee_tyty
 = TypeCheckExpr::Resolve (expr.get_scrutinee_expr ().get ());
 
+  bool saw_error = false;
   std::vector<TyTy::BaseType *> kase_block_tys;
   for (auto &kase : expr.get_match_cases ())
 {
@@ -1475,7 +1476,10 @@ TypeCheckExpr::visit (HIR::MatchExpr &expr)
  TyTy::BaseType *kase_arm_ty
= TypeCheckPattern::Resolve (pattern.get (), scrutinee_tyty);
  if (kase_arm_ty->get_kind () == TyTy ::TypeKind::ERROR)
-   return;
+   {
+ saw_error = true;
+ continue;
+   }
 
  TyTy::BaseType *checked_kase = unify_site (
expr.get_mappings ().get_hirid (),
@@ -1484,7 +1488,10 @@ TypeCheckExpr::visit (HIR::MatchExpr &expr)
TyTy::TyWithLocation (kase_arm_ty, pattern->get_locus ()),
expr.get_locus ());
  if (checked_kase->get_kind () == TyTy::TypeKind::ERROR)
-   return;
+   {
+ saw_error = true;
+ continue;
+   }
}
 
   // check the kase type
@@ -1492,6 +1499,8 @@ TypeCheckExpr::visit (HIR::MatchExpr &expr)
= TypeCheckExpr::Resolve (kase.get_expr ().get ());
   kase_block_tys.push_back (kase_block_ty);
 }
+  if (saw_error)
+return;
 
   if (kase_block_tys.size () == 0)
 {
diff --git a/gcc/rust/typecheck/rust-hir-type-check-pattern.cc 
b/gcc/rust/typecheck/rust-hir-type-check-pattern.cc
index 2b0b02ad5ef..a4f9a908feb 100644
--- a/gcc/rust/typecheck/rust-hir-type-check-pattern.cc
+++ b/gcc/rust/typecheck/rust-hir-type-check-pattern.cc
@@ -43,32 +43,100 @@ TypeCheckPattern::Resolve (HIR::Pattern *pattern, 
TyTy::BaseType *parent)
 void
 TypeCheckPattern::visit (HIR::PathInExpression &pattern)
 {
-  infered = TypeCheckExpr::Resolve (&pattern);
-
-  /*
-   * We are compiling a PathInExpression, which can't be a Struct or Tuple
-   * pattern. We 

RE: [PATCH] cobol: Get rid of __int128 uses in the COBOL FE [PR119242]

2025-03-25 Thread Robert Dubner
> -Original Message-
> From: Jakub Jelinek 
> Sent: Tuesday, March 25, 2025 19:49
> To: Robert Dubner ; James K. Lowden
> ; Richard Biener 
> Cc: gcc-patches@gcc.gnu.org
> Subject: [PATCH] cobol: Get rid of __int128 uses in the COBOL FE
> [PR119242]
> 
> Hi!
> 
> The following patch changes some remaining __int128 uses in the FE
> into FIXED_WIDE_INT(128), i.e. emulated 128-bit integral type.
> The use of wide_int_to_tree directly from that rather than going through
> build_int_cst_type means we don't throw away the upper 64 bits of the
> values, so the emitting of constants needing full 128 bits can be greatly
> simplified.
> Plus all the #pragma GCC diagnostic ignored "-Wpedantic" spots aren't
> needed, we don't use the _Float128/__int128 types directly in the FE
> anymore.
> 
> Tested on x86_64-linux with make check-cobol, could you please test this
> on UAT/NIST?

I took a minute to apply the patch and run the tests.  Ten of the UAT
tests fail; they are the ones that test the ROUNDED clause.

It's 00:30 local time here, so I am not going to look into it now.  But
here is a simple case so that you have something to chew on while I am
getting my beauty sleep:

   IDENTIFICATION DIVISION.
   PROGRAM-ID. prog.
   DATA DIVISION.
   WORKING-STORAGE SECTION.
   01  N PIC S9.
   PROCEDURE DIVISION.
   COMPUTE N ROUNDED MODE AWAY-FROM-ZERO = -2.51
   DISPLAY "N should be -3"
   DISPLAY "Nis " N
   GOBACK.
   END PROGRAM prog.

N should be -3
Nis +1

> 
> Note, PR119241/PR119242 bugs are still not fully fixed, I think the
> remaining problem is that several FE sources include
> ../../libgcobol/libgcobol.h and that header declares various APIs with
> __int128 and _Float128 types, so trying to build a cross-compiler on a host
> without __int128 and _Float128 will still fail miserably.
> I believe none of those APIs are actually used by the FE, so the question is
> what the FE needs from libgcobol.h and whether the rest could be wrapped
> with #ifndef IN_GCC or #ifndef IN_GCC_FRONTEND or something similar
> (those 2 macros are predefined when compiling the FE files).
> 
> 2025-03-26  Jakub Jelinek  
> 
>   PR cobol/119242
>   * cobol/genutil.h (get_power_of_ten): Remove #pragma GCC diagnostic
>   around declaration.
>   * cobol/genapi.cc (psa_FldLiteralN): Change type of value from
>   __int128 to FIXED_WIDE_INT(128).  Remove #pragma GCC diagnostic
>   around the declaration.  Use wi::min_precision to determine
>   minimum unsigned precision of the value.  Use wi::neg_p instead
>   of value < 0 tests and wi::set_bit_in_zero
>   to build sign bit.  Handle field->data.capacity == 16 like
>   1, 2, 4 and 8, use wide_int_to_tree instead of build_int_cst.
>   (mh_source_is_literalN): Remove #pragma GCC diagnostic around
>   the definition.
>   (binary_initial_from_float128): Likewise.
>   * cobol/genutil.cc (get_power_of_ten): Remove #pragma GCC diagnostic
>   before the definition.
> 
> --- gcc/cobol/genutil.h.jj2025-03-25 21:14:48.448384925 +0100
> +++ gcc/cobol/genutil.h   2025-03-25 21:19:24.358620134 +0100
> @@ -104,10 +104,7 @@ void  get_binary_value( tree value,
>  tree  get_data_address( cbl_field_t *field,
>  tree offset);
> 
> -#pragma GCC diagnostic push
> -#pragma GCC diagnostic ignored "-Wpedantic"
>  FIXED_WIDE_INT(128) get_power_of_ten(int n);
> -#pragma GCC diagnostic pop
>  void  scale_by_power_of_ten_N(tree value,
>  int N,
>  bool check_for_fractional = false);
> --- gcc/cobol/genapi.cc.jj2025-03-25 21:11:06.767409766 +0100
> +++ gcc/cobol/genapi.cc   2025-03-25 21:22:28.038113833 +0100
> @@ -3798,16 +3798,13 @@ psa_FldLiteralN(struct cbl_field_t *fiel
>// We are constructing a completely static constant structure, based
on
> the
>// text string in .initial
> 
> -#pragma GCC diagnostic push
> -#pragma GCC diagnostic ignored "-Wpedantic"
> -  __int128 value = 0;
> -#pragma GCC diagnostic pop
> +  FIXED_WIDE_INT(128) value = 0;
> 
>do
>  {
>  // This is a false do{}while, to isolate the variables:
> 
> -// We need to convert data.initial to an __int128 value
> +// We need to convert data.initial to an FIXED_WIDE_INT(128) value
>  char *p = const_cast(field->data.initial);
>  int sign = 1;
>  if( *p == '-' )
> @@ -3903,24 +3900,24 @@ psa_FldLiteralN(struct cbl_field_t *fiel
> 
>  // We now need to calculate the capacity.
> 
> -unsigned char *pvalue = (unsigned char *)&value;
> +unsigned int min_prec = wi::min_precision(value, UNSIGNED);
>  int capacity;
> -if( *(uint64_t*)(pvalue + 8) )
> +if( min_prec > 64 )
>{
>// Bytes 15 through 8 are non-zero
>capacity = 16;
>}
> -else if( *(uint32_t*)(pvalue + 4) 

Re: [PATCH] tailc: Don't fail musttail calls if they use or could use local arguments, instead warn [PR119376]

2025-03-25 Thread Alexander Monakov
Hello,

FWIW I think Clang made a mistake in bending semantics in a way that is clearly
misaligned with the general design of C and C++, where a language-native, so to
speak, solution was available: introduce a scope for the local variables to
indicate that they cannot escape to the intended tailcall:

void foo(int v)
{
  {
    int a;
    capture(&a);
  }
  tailcall(v); // this cannot refer to 'a', even though it escaped earlier
}

I think this would be easily teachable to users who need [[tailcall]].

I wonder if Clang folks would be open to a dialogue about undoing this
misdesign. I'd rather not see it propagated into GCC.

Thanks.
Alexander


Re: [PATCH] target/119010 - add missing DF load/store reservations for znver4 and znver5

2025-03-25 Thread Richard Biener



> Am 25.03.2025 um 16:22 schrieb Jan Hubicka :
> 
> 
>> 
>>> On Tue, 25 Mar 2025, Richard Biener wrote:
>>> 
>>> The following resolves missing reservations for DFmode *movdf_internal
>>> loads and stores, visible as 'nothing' in -fsched-verbose=2 dumps.
>>> 
>>> Bootstrap and regtest running on x86_64-unknown-linux-gnu.
>> 
>> The alternative for the larger scale problem of missing DFmode handling
>> is to s/V1DF/DF/ in the file - znver.md gets along without V1DF handling,
>> supposedly using V1DF (V1SF isn't a thing) was a mistake?
> 
> Hmm, so we use DF for movsd and V1DF for movlpd.
> movss is SF and movlps is V2SF so I guess we want to handle both?

Ok.  There are more missing DF cases, so I will follow up by adding DF to all
reservations covering V1DF.

> Also core2.md and others do not list V1 variants, which I suppose is also a
> missing case.  Perhaps, as Jeff suggests, I could try to see how many
> insns miss reservations by adding the assert...
> 
> 
>> 
>> Richard.
>> 
>>>PR target/119010
>>>* config/i386/zn4zn5.md (znver4_sse_mov_fp, znver4_sse_mov_fp_load,
>>>znver5_sse_mov_fp_load, znver4_sse_mov_fp_store,
>>>znver5_sse_mov_fp_store): Also match V1SF and DF.
>>> ---
>>> gcc/config/i386/zn4zn5.md | 10 +-
>>> 1 file changed, 5 insertions(+), 5 deletions(-)
>>> 
>>> diff --git a/gcc/config/i386/zn4zn5.md b/gcc/config/i386/zn4zn5.md
>>> index ae188a1201e..f8772fed620 100644
>>> --- a/gcc/config/i386/zn4zn5.md
>>> +++ b/gcc/config/i386/zn4zn5.md
>>> @@ -986,35 +986,35 @@
>>> (define_insn_reservation "znver4_sse_mov_fp" 1
>>> (and (eq_attr "cpu" "znver4,znver5")
>>>  (and (eq_attr "type" "ssemov")
>>> -   (and (eq_attr "mode" 
>>> "V16SF,V8DF,V8SF,V4DF,V4SF,V2DF,V2SF,V1DF,SF")
>>> +   (and (eq_attr "mode" 
>>> "V16SF,V8DF,V8SF,V4DF,V4SF,V2DF,V2SF,V1DF,DF,SF")
> 
> This is OK.  I believe movsd and movlpd are about the same from a
> scheduling POV
> 
> Honza


Re: [PATCH] libstdc++: Allow std::ranges::to to create unions

2025-03-25 Thread Tomasz Kaminski
On Tue, Mar 25, 2025 at 1:43 PM Jonathan Wakely  wrote:

> LWG 4229 points out that the std::ranges::to wording refers to class
> types, but I added an assertion using std::is_class_v which only allows
> non-union class types. LWG consensus is that unions should be allowed,
> so this additionally uses std::is_union_v.
>
> libstdc++-v3/ChangeLog:
>
> * include/std/ranges (ranges::to): Allow unions as well as
> non-union class types.
> * testsuite/std/ranges/conv/lwg4229.cc: New test.
> ---
>
> Tested x86_64-linux.
>
LGTM. I still do not have any real case for using ranges::to with unions,
but consensus seems to be to allow them.
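
For reference, a small sketch of why the old assertion rejected unions.  old_check and new_check are hypothetical names that mirror the two static_assert conditions in the patch; ToyUnion and ToyStruct are illustration types.

```cpp
#include <type_traits>

union ToyUnion
{
  int i;
  float f;
};

struct ToyStruct
{
  int i;
};

// Condition used before the patch: true only for non-union class types.
template<typename Cont>
inline constexpr bool old_check = std::is_class_v<Cont>;

// Condition used after the patch: also accepts unions.
template<typename Cont>
inline constexpr bool new_check
  = std::is_class_v<Cont> || std::is_union_v<Cont>;

static_assert(old_check<ToyStruct> && new_check<ToyStruct>);
static_assert(!old_check<ToyUnion> && new_check<ToyUnion>);
static_assert(!new_check<int>);
```

std::is_class_v is false for unions even though the core language treats a union as a class type, which is the mismatch LWG 4229 points out.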

>
>  libstdc++-v3/include/std/ranges|  4 ++--
>  .../testsuite/std/ranges/conv/lwg4229.cc   | 18 ++
>  2 files changed, 20 insertions(+), 2 deletions(-)
>  create mode 100644 libstdc++-v3/testsuite/std/ranges/conv/lwg4229.cc
>
> diff --git a/libstdc++-v3/include/std/ranges
> b/libstdc++-v3/include/std/ranges
> index 34c6f113e21..7a339c51368 100644
> --- a/libstdc++-v3/include/std/ranges
> +++ b/libstdc++-v3/include/std/ranges
> @@ -9421,7 +9421,7 @@ namespace __detail
>  to [[nodiscard]] (_Rg&& __r, _Args&&... __args)
>  {
>static_assert(!is_const_v<_Cont> && !is_volatile_v<_Cont>);
> -  static_assert(is_class_v<_Cont>);
> +  static_assert(is_class_v<_Cont> || is_union_v<_Cont>);
>
>if constexpr (__detail::__toable<_Cont, _Rg>)
> {
> @@ -9580,7 +9580,7 @@ namespace __detail
>  to [[nodiscard]] (_Args&&... __args)
>  {
>static_assert(!is_const_v<_Cont> && !is_volatile_v<_Cont>);
> -  static_assert(is_class_v<_Cont>);
> +  static_assert(is_class_v<_Cont> || is_union_v<_Cont>);
>
>using __detail::_To;
>using views::__adaptor::_Partial;
> diff --git a/libstdc++-v3/testsuite/std/ranges/conv/lwg4229.cc
> b/libstdc++-v3/testsuite/std/ranges/conv/lwg4229.cc
> new file mode 100644
> index 000..780ed1fd932
> --- /dev/null
> +++ b/libstdc++-v3/testsuite/std/ranges/conv/lwg4229.cc
> @@ -0,0 +1,18 @@
> +// { dg-do compile { target c++23 } }
> +
> +// LWG 4229 std::ranges::to with union return type
> +
> +#include <ranges>
> +
> +union U
> +{
> +  template<typename R> U(std::from_range_t, R&&) { }
> +
> +  int i;
> +};
> +
> +void
> +test_lwg4229(std::ranges::subrange r)
> +{
> +  U u = std::ranges::to<U>(r);
> +}
> --
> 2.49.0
>
>


[committed] arm: testsuite: skip mtp tests on thumb1

2025-03-25 Thread Richard Earnshaw
These tests need access to the MRC instruction, but that isn't part of
the Thumb1 ISA.  So skip the tests when it isn't available.

gcc/testsuite/ChangeLog:

* gcc.target/arm/mtp_1.c: Require arm32.
* gcc.target/arm/mtp_2.c: Likewise.
* gcc.target/arm/mtp_3.c: Likewise.
* gcc.target/arm/mtp_4.c: Likewise.
---
 gcc/testsuite/gcc.target/arm/mtp_1.c | 1 +
 gcc/testsuite/gcc.target/arm/mtp_2.c | 1 +
 gcc/testsuite/gcc.target/arm/mtp_3.c | 1 +
 gcc/testsuite/gcc.target/arm/mtp_4.c | 1 +
 4 files changed, 4 insertions(+)

diff --git a/gcc/testsuite/gcc.target/arm/mtp_1.c 
b/gcc/testsuite/gcc.target/arm/mtp_1.c
index 678d27d9234..f78ceb8574e 100644
--- a/gcc/testsuite/gcc.target/arm/mtp_1.c
+++ b/gcc/testsuite/gcc.target/arm/mtp_1.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target tls_native } */
+/* { dg-require-effective-target arm32 } */
 /* { dg-options "-O -mtp=cp15" } */
 
 #include "mtp.c"
diff --git a/gcc/testsuite/gcc.target/arm/mtp_2.c 
b/gcc/testsuite/gcc.target/arm/mtp_2.c
index bcb308f2637..1368fe4a3a3 100644
--- a/gcc/testsuite/gcc.target/arm/mtp_2.c
+++ b/gcc/testsuite/gcc.target/arm/mtp_2.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target tls_native } */
+/* { dg-require-effective-target arm32 } */
 /* { dg-options "-O -mtp=tpidrprw" } */
 
 #include "mtp.c"
diff --git a/gcc/testsuite/gcc.target/arm/mtp_3.c 
b/gcc/testsuite/gcc.target/arm/mtp_3.c
index 7d5cea3cab6..2ef2e95b62d 100644
--- a/gcc/testsuite/gcc.target/arm/mtp_3.c
+++ b/gcc/testsuite/gcc.target/arm/mtp_3.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target tls_native } */
+/* { dg-require-effective-target arm32 } */
 /* { dg-options "-O -mtp=tpidruro" } */
 
 #include "mtp.c"
diff --git a/gcc/testsuite/gcc.target/arm/mtp_4.c 
b/gcc/testsuite/gcc.target/arm/mtp_4.c
index 068078df84e..121fc836513 100644
--- a/gcc/testsuite/gcc.target/arm/mtp_4.c
+++ b/gcc/testsuite/gcc.target/arm/mtp_4.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target tls_native } */
+/* { dg-require-effective-target arm32 } */
 /* { dg-options "-O -mtp=tpidrurw" } */
 
 #include "mtp.c"
-- 
2.34.1



[PATCH] libgcobol: Provide fallbacks for C23 strfromf32/64 [PR119296].

2025-03-25 Thread Iain Sandoe
This is on top of the C++-ify configure and random_r patches.
Tested on x86_64,aarch64-Linux and x86_64-darwin, OK for trunk?
thanks
Iain

--- 8< --- 

strfrom{f,d,l,fN} are all C23 and might not be available in general.
This uses snprintf() to provide fall-backs where the libc does not
yet have support.

PR cobol/119296

libgcobol/ChangeLog:

* config.h.in: Regenerate.
* configure: Regenerate.
* configure.ac: Check for availability of strfromf32 and
strfromf64.
* libgcobol.cc (strfromf32, strfromf64): New.

Signed-off-by: Iain Sandoe 
---
 libgcobol/config.h.in  |  6 ++
 libgcobol/configure| 13 +++--
 libgcobol/configure.ac |  3 +++
 libgcobol/libgcobol.cc | 24 
 4 files changed, 44 insertions(+), 2 deletions(-)
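
A sketch of how an snprintf-based fallback of this shape behaves when called.  The names fallback_strfromf32 and float_to_string are hypothetical, chosen so that the example cannot clash with a libc that already provides strfromf32; the "%.9g" format is just an illustration.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Same shape as the fallback in the patch: forward to snprintf, promoting
// the float argument to double as a variadic call would do anyway.
inline int fallback_strfromf32(char *s, std::size_t n, const char *format,
                               float v)
{
  return std::snprintf(s, n, format, static_cast<double>(v));
}

// Convenience wrapper: returns an empty string on error or truncation.
inline std::string float_to_string(float v)
{
  char buf[64];
  int len = fallback_strfromf32(buf, sizeof buf, "%.9g", v);
  if (len <= 0 || static_cast<std::size_t>(len) >= sizeof buf)
    return std::string();
  return std::string(buf, static_cast<std::size_t>(len));
}
```

Note that the C23 strfrom functions accept only a single floating-point conversion with an optional precision, whereas snprintf accepts any format, so the fallback is more permissive than the real functions.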

diff --git a/libgcobol/config.h.in b/libgcobol/config.h.in
index e7e1492b579..d61ff7ad497 100644
--- a/libgcobol/config.h.in
+++ b/libgcobol/config.h.in
@@ -36,6 +36,12 @@
 /* Define to 1 if you have the <stdlib.h> header file. */
 #undef HAVE_STDLIB_H
 
+/* Define to 1 if you have the `strfromf32' function. */
+#undef HAVE_STRFROMF32
+
+/* Define to 1 if you have the `strfromf64' function. */
+#undef HAVE_STRFROMF64
+
 /* Define to 1 if you have the <strings.h> header file. */
 #undef HAVE_STRINGS_H
 
diff --git a/libgcobol/configure b/libgcobol/configure
index acf78646d5b..23474881f35 100755
--- a/libgcobol/configure
+++ b/libgcobol/configure
@@ -2696,6 +2696,8 @@ as_fn_append ac_func_list " random_r"
 as_fn_append ac_func_list " srandom_r"
 as_fn_append ac_func_list " initstate_r"
 as_fn_append ac_func_list " setstate_r"
+as_fn_append ac_func_list " strfromf32"
+as_fn_append ac_func_list " strfromf64"
 # Check that the precious variables saved in the cache have kept the same
 # value.
 ac_cache_corrupted=false
@@ -11621,7 +11623,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 11624 "configure"
+#line 11626 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -11727,7 +11729,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 11730 "configure"
+#line 11732 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -17699,6 +17701,13 @@ done
 
 
 
+# These are C23, and might not be available in libc.
+
+
+
+
+
+
 if test "${multilib}" = "yes"; then
   multilib_arg="--enable-multilib"
 else
diff --git a/libgcobol/configure.ac b/libgcobol/configure.ac
index 0356f9e9c67..6e1ea3700d0 100644
--- a/libgcobol/configure.ac
+++ b/libgcobol/configure.ac
@@ -198,6 +198,9 @@ AC_SUBST(extra_ldflags_libgcobol)
 # These are GLIBC
 AC_CHECK_FUNCS_ONCE(random_r srandom_r initstate_r setstate_r)
 
+# These are C23, and might not be available in libc.
+AC_CHECK_FUNCS_ONCE(strfromf32 strfromf64)
+
 if test "${multilib}" = "yes"; then
   multilib_arg="--enable-multilib"
 else
diff --git a/libgcobol/libgcobol.cc b/libgcobol/libgcobol.cc
index 6aeaaa2c142..85f016e9735 100644
--- a/libgcobol/libgcobol.cc
+++ b/libgcobol/libgcobol.cc
@@ -68,6 +68,30 @@
 
 #include "exceptl.h"
 
+#if !defined (HAVE_STRFROMF32)
+# if __FLT_MANT_DIG__ == 24 && __FLT_MAX_EXP__ == 128
+static int
+strfromf32 (char *s, size_t n, const char *f, float v)
+{
+  return snprintf (s, n, f, (double) v);
+}
+# else
+#  error "It looks like float on this platform is not IEEE754"
+# endif
+#endif
+
+#if !defined (HAVE_STRFROMF64)
+# if __DBL_MANT_DIG__ == 53 && __DBL_MAX_EXP__ == 1024
+static int
+strfromf64 (char *s, size_t n, const char *f, double v)
+{
+  return snprintf (s, n, f, v);
+}
+# else
+#  error "It looks like double on this platform is not IEEE754"
+# endif
+#endif
+
 // This couldn't be defined in symbols.h because it conflicts with a LEVEL66
 // in parse.h
 #define LEVEL66 (66)
-- 
2.39.2 (Apple Git-143)



[Patch, v3] libgomp.texi: Document supported OpenMP 'interop' types for nvptx and gcn

2025-03-25 Thread Tobias Burnus

Updated patch:

* I noticed that the API functions in omp.h.in (and since OpenMP 6.0) 
take omp_interop_rc_t* not int*.


Thus, I updated it to match omp.h.in. Unfortunately, the difference 
matters for C++; the enum itself is available already in 5.1 and C does 
not care.


* I now use -- as suggested below -- a normal, non-typewriter font for 
the value, except for the first one which is an identifier (enum/parameter).


* I now use an example illustrating how to obtain the value.

* * *

Tobias Burnus wrote:
For the string-valued constants in the table, please include the 
quotes, unless those are identifiers instead of string literal. 


Except for the first that is an identifier to a named constant 
(parameter/enum value), all others are the value returned by the API 
function, i.e. 11 is the integral value returned by omp_get_interop_int 
(...). And ‘amd’ is the value the string has.


Thus, except for the first one, they actually do not need to use the 
typewriter font but could also be a plain 11, nvidia or ``nvidia''.


Actually, the same is kind of true for the property names, but I guess 
with the underscores it might be nicer to keep using @code (code wise 
and display wise).


Thoughts on this version? I think it is now better.

Tobias
libgomp.texi: Document supported OpenMP 'interop' types for nvptx and gcn

Note that this commit also updates the API interface to OpenMP 6.0;
while 5.1 and 5.2 use 'int *' for the ret_code argument,
OpenMP 6.0 changed this to omp_interop_rc_t *; this enum also exists in
OpenMP 5.1. However, C++ does not like this change such that unless NULL
is passed (i.e. the argument is ignored), OpenMP 5.x and 6.x are not
compatible.

Note that GCC's omp.h already follows OpenMP 6.0 and is now in sync with
the documentation.

libgomp/ChangeLog:

	* libgomp.texi (OpenMP 5.1): Add @ref to offload-target specifics
	for 'interop'.
	(OpenMP 6.0): Mark dispatch's interop clause as implemented.
	(omp_get_interop_int, omp_get_interop_str,
	omp_get_interop_ptr, omp_get_interop_type_desc): Add @ref to
	Offload-Target Specifics; change ret_code argument type to
	'omp_interop_rc_t *'.
	(Offload-Target Specifics): Document the supported OpenMP
	interop foreign runtimes on AMD and Nvidia GPUs.

 libgomp/libgomp.texi | 170 ---
 1 file changed, 161 insertions(+), 9 deletions(-)

diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index d1cf9be47ca..ad3649f8536 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -314,7 +314,7 @@ The OpenMP 4.5 specification is fully supported.
   clauses @tab N @tab
 @item Indirect calls to the device version of a procedure or function in
   @code{target} regions @tab Y @tab
-@item @code{interop} directive @tab N @tab
+@item @code{interop} directive @tab Y @tab Cf. @ref{Offload-Target Specifics}
 @item @code{omp_interop_t} object support in runtime routines @tab Y @tab
 @item @code{nowait} clause in @code{taskwait} directive @tab Y @tab
 @item Extensions to the @code{atomic} directive @tab Y @tab
@@ -545,7 +545,7 @@ to address of matching mapped list item per 5.1, Sect. 2.21.7.2 @tab N @tab
   @tab N @tab
 @item Semicolon-separated list to @code{uses_allocators} @tab N @tab
 @item New @code{need_device_addr} modifier to @code{adjust_args} clause @tab N @tab
-@item @code{interop} clause to @code{dispatch} @tab N @tab
+@item @code{interop} clause to @code{dispatch} @tab Y @tab
 @item Scope requirement changes for @code{declare_target} @tab N @tab
 @item @code{message} and @code{severity} clauses to @code{parallel} directive
   @tab N @tab
@@ -3048,7 +3048,7 @@ the initial device is unspecified.
 @item @emph{C/C++}:
 @multitable @columnfractions .20 .80
 @item @emph{Prototype}: @tab @code{omp_intptr_t omp_get_interop_int(const omp_interop_t interop,
-   omp_interop_property_t property_id, int *ret_code)}
+   omp_interop_property_t property_id, omp_interop_rc_t *ret_code)}
 @end multitable
 
 @item @emph{Fortran}:
@@ -3062,7 +3062,8 @@ the initial device is unspecified.
 @end multitable
 
 @item @emph{See also}:
-@ref{omp_get_interop_ptr}, @ref{omp_get_interop_str}, @ref{omp_get_interop_rc_desc}
+@ref{omp_get_interop_ptr}, @ref{omp_get_interop_str}, @ref{omp_get_interop_rc_desc},
+@ref{Offload-Target Specifics}
 
 @item @emph{Reference}:
 @uref{https://www.openmp.org, OpenMP specification v5.1}, Section 3.12.2,
@@ -3093,7 +3094,7 @@ the initial device is unspecified.
 @item @emph{C/C++}:
 @multitable @columnfractions .20 .80
 @item @emph{Prototype}: @tab @code{void *omp_get_interop_ptr(const omp_interop_t interop,
-   omp_interop_property_t property_id, int *ret_code)}
+   omp_interop_property_t property_id, omp_interop_rc_t *ret_code)}
 @end multitable
 
 @item @emph{Fortran}:
@@ -3107,7 +3108,8 @@ the initial device is unspecified.
 @end multita

[COMMITTED] RISC-V: disable the abd expander for gcc-15 release [PR119224]

2025-03-25 Thread Vineet Gupta
It seems the new expander triggers a latent issue in sched1 causing
extraneous spills in a different sad variant.
Given how close we are to gcc-15 release, disable it for now.

Since we do want to retain and re-enable this capability, manually disable
it rather than reverting the original patch, which would take away the test
case too. Fix the original test case to expect the old codegen idiom
(although vneg is no longer emitted, in favor of vrsub).
Also add a new testcase which flags any future spills in the affected
routine.
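
For readers unfamiliar with the idiom, the following is a rough sketch of what
an unsigned absolute-difference (abd) operation computes and how it feeds a
sum of absolute differences. The function names are made up for illustration,
and the max-minus-min formulation is an assumption based on the vmaxu/vminu/vsub
instructions the original test scanned for.

```cpp
#include <algorithm>

// |a - b| for unsigned bytes, written as max(a, b) - min(a, b) so that no
// intermediate value goes negative.
inline unsigned abs_diff_u8(unsigned char a, unsigned char b)
{
  return static_cast<unsigned>(std::max(a, b)) - std::min(a, b);
}

// Sum of absolute differences over a w x h block with row strides st1/st2.
int sad_block(const unsigned char *p1, int st1,
              const unsigned char *p2, int st2, int w, int h)
{
  int sum = 0;
  for (int y = 0; y < h; y++)
    {
      for (int x = 0; x < w; x++)
        sum += static_cast<int>(abs_diff_u8(p1[x], p2[x]));
      p1 += st1;
      p2 += st2;
    }
  return sum;
}

// Tiny driver: a 2x2 block stored contiguously (stride 2).
int sad_demo()
{
  const unsigned char a[] = { 1, 2, 3, 4 };
  const unsigned char b[] = { 4, 2, 0, 9 };
  return sad_block(a, 2, b, 2, 2, 2);   // 3 + 0 + 3 + 5
}
```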

PR target/119224

gcc/ChangeLog:
* config/riscv/autovec.md: Disable abd splitter.

gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/pr117722.c: Adjust output insn.
* gcc.target/riscv/rvv/autovec/pr119224.c: Add new test.

Signed-off-by: Vineet Gupta 
---
 gcc/config/riscv/autovec.md   |  3 ++-
 .../gcc.target/riscv/rvv/autovec/pr117722.c   |  6 ++---
 .../gcc.target/riscv/rvv/autovec/pr119224.c   | 27 +++
 3 files changed, 32 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr119224.c

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index c7f12f9e36f5..f53ed3a5e3fd 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -2975,7 +2975,8 @@ (define_expand "uabd3"
   [(match_operand:V_VLSI 0 "register_operand")
(match_operand:V_VLSI 1 "register_operand")
(match_operand:V_VLSI 2 "register_operand")]
-  "TARGET_VECTOR"
+  ;; Disabled until PR119224 is resolved
+  "TARGET_VECTOR && 0"
   {
 rtx max = gen_reg_rtx (mode);
 insn_code icode = code_for_pred (UMAX, mode);
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr117722.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr117722.c
index f255ceb2cee6..493dab056212 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr117722.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr117722.c
@@ -18,6 +18,6 @@ int pixel_sad_n(unsigned char *pix1, unsigned char *pix2, int n)
   return sum;
 }
 
-/* { dg-final { scan-assembler {vminu\.v} } } */
-/* { dg-final { scan-assembler {vmaxu\.v} } } */
-/* { dg-final { scan-assembler {vsub\.v} } } */
+/* { dg-final { scan-assembler {vrsub\.v} } } */
+/* { dg-final { scan-assembler {vmax\.v} } } */
+/* { dg-final { scan-assembler {vwsubu\.v} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr119224.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr119224.c
new file mode 100644
index ..fa3386c345b8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr119224.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -ffast-math -march=rv64gcv_zvl256b -mabi=lp64d -mtune=generic-ooo -mrvv-vector-bits=zvl" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-O2" "-Og" "-Os" "-Oz" } } */
+
+/* A core routine of x264 which should not spill for OoO VLS build.  */
+
+inline int abs(int i)
+{
+  return (i < 0 ? -i : i);
+}
+
+int x264_sad_16x16(unsigned char *p1, int st1, unsigned char *p2, int st2)
+{
+int sum = 0;
+
+for(int y = 0; y < 16; y++)
+  {
+   for(int x = 0; x < 16; x++)
+   sum += abs (p1[x] - p2[x]);
+   p1 += st1; p2 += st2;
+  }
+
+  return sum;
+}
+
+/* { dg-final { scan-assembler-not {addi\t[a-x0-9]+,sp} } } */
+/* { dg-final { scan-assembler-not {addi\tsp,sp} } } */
-- 
2.43.0



Re: [PATCH] libstdc++: Optimize std::vector construction from input iterators [PR108487]

2025-03-25 Thread Jonathan Wakely
On Tue, 25 Mar 2025 at 16:49, Tomasz Kaminski  wrote:
>
>
>
> On Tue, Mar 25, 2025 at 5:30 PM Jonathan Wakely  wrote:
>>
>> LWG 3291 made std::ranges::iota_view's iterator have input_iterator_tag
>> as its iterator_category, even though it satisfies the C++20
>> std::forward_iterator concept. This means that the traditional
>> std::vector::vector(InputIterator, InputIterator) constructor treats
>> iota_view iterators as input iterators, because it only understands the
>> C++17 iterator requirements, not the C++20 iterator concepts. This
>> results in a loop that calls emplace_back for each individual element of
>> the iota_view, requiring the vector to reallocate repeatedly as the
>> values are inserted. This makes it unnecessarily slow to construct a
>> vector from an iota_view.
>>
>> This change adds a new _M_range_initialize_n function for initializing a
>> vector from a range (which doesn't have to be common) and a size. This
>> new function can be used by vector(InputIterator, InputIterator) and
>> vector(from_range_t, R&&) when std::ranges::distance can be used to get
>> the size. It can also be used by the _M_range_initialize overload that
>> gets the size for a Cpp17ForwardIterator pair using std::distance, and
>> by the vector(initializer_list) constructor.
>>
>> With this new function constructing a std::vector from iota_view does
>> a single allocation of the correct size and so doesn't need to
>> reallocate in a loop.
>>
>> libstdc++-v3/ChangeLog:
>>
>> PR libstdc++/108487
>> * include/bits/stl_vector.h (vector(initializer_list)): Call
>> _M_range_initialize_n instead of _M_range_initialize.
>> (vector(InputIterator, InputIterator)): Use _M_range_initialize_n
>> for C++20 sized sentinels and forward iterators.
>> (vector(from_range_t, R&&)): Use _M_range_initialize_n for sized
>> ranges and forward ranges.
>> (vector::_M_range_initialize(FwIt, FwIt, forward_iterator_tag)):
>> Likewise.
>> (vector::_M_range_initialize_n): New function.
>> * testsuite/23_containers/vector/cons/108487.cc: New test.
>> ---
>>
>> Tests running for x86_64-linux.
>>
>> This gives a 10x speed up for the PR108487 testcase using iota_view.
>>
>> I don't see why doing this wouldn't be allowed by the standard, so it
>> seems worth doing.
>>
>>  libstdc++-v3/include/bits/stl_vector.h| 48 ---
>>  .../23_containers/vector/cons/108487.cc   | 24 ++
>>  2 files changed, 56 insertions(+), 16 deletions(-)
>>  create mode 100644 
>> libstdc++-v3/testsuite/23_containers/vector/cons/108487.cc
>>
>> diff --git a/libstdc++-v3/include/bits/stl_vector.h 
>> b/libstdc++-v3/include/bits/stl_vector.h
>> index 21f6cd04f49..458adc987da 100644
>> --- a/libstdc++-v3/include/bits/stl_vector.h
>> +++ b/libstdc++-v3/include/bits/stl_vector.h
>> @@ -65,6 +65,9 @@
>>  #if __cplusplus >= 202002L
>>  # include 
>>  #endif
>> +#if __glibcxx_concepts // C++ >= C++20
>> +# include   // ranges::distance
>> +#endif
>>  #if __glibcxx_ranges_to_container // C++ >= 23
>>  # include   // ranges::copy
>>  # include   // ranges::subrange
>> @@ -706,8 +709,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
>>  const allocator_type& __a = allocator_type())
>>: _Base(__a)
>>{
>> -   _M_range_initialize(__l.begin(), __l.end(),
>> -   random_access_iterator_tag());
>> +   _M_range_initialize_n(__l.begin(), __l.end(), __l.size());
>>}
>>  #endif
>>
>> @@ -735,6 +737,17 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
>>const allocator_type& __a = allocator_type())
>> : _Base(__a)
>> {
>> +#if __glibcxx_concepts // C++ >= C++20
>> + if constexpr (sized_sentinel_for<_InputIterator, _InputIterator>
>> + || forward_iterator<_InputIterator>)
>> +   {
>> + const auto __n
>> +   = static_cast(ranges::distance(__first, __last));
>> + _M_range_initialize_n(__first, __last, __n);
>> + return;
>> +   }
>> + else
>> +#endif
>>   _M_range_initialize(__first, __last,
>>   std::__iterator_category(__first));
>> }
>> @@ -763,13 +776,9 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
>> {
>>   if constexpr (ranges::forward_range<_Rg> || 
>> ranges::sized_range<_Rg>)
>> {
>> - const auto __n = size_type(ranges::distance(__rg));
>> - pointer __start =
>> -   this->_M_allocate(_S_check_init_len(__n,
>> -   _M_get_Tp_allocator()));
>> - this->_M_impl._M_finish = this->_M_impl._M_start = __start;
>> - this->_M_impl._M_end_of_storage = __start + __n;
>> - _Base::_M_append_range(__rg);
>> + const auto __n = 
>> static_cast(ranges::distance(__rg));
>> + _M_range_initializ
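
The cost that the quoted patch description attributes to the input-iterator
path can be observed with a small experiment. This is only a sketch under
stated assumptions: CountingAllocator, Counter and allocations_for are names
invented here, and Counter merely imitates an iterator whose
iterator_category is chosen by the caller; it is not iota_view's iterator.

```cpp
#include <cstddef>
#include <iterator>
#include <memory>
#include <vector>

static std::size_t g_allocations = 0;   // number of allocate() calls so far

// Minimal allocator that counts allocations (hypothetical helper).
template <typename T>
struct CountingAllocator
{
  using value_type = T;
  CountingAllocator() = default;
  template <typename U>
  CountingAllocator(const CountingAllocator<U>&) noexcept {}
  T* allocate(std::size_t n)
  { ++g_allocations; return std::allocator<T>().allocate(n); }
  void deallocate(T* p, std::size_t n) noexcept
  { std::allocator<T>().deallocate(p, n); }
};
template <typename T, typename U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept
{ return true; }
template <typename T, typename U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept
{ return false; }

// Counting iterator whose C++17 category tag is a template parameter.
template <typename Tag>
struct Counter
{
  using iterator_category = Tag;
  using value_type = int;
  using difference_type = std::ptrdiff_t;
  using pointer = const int*;
  using reference = int;
  int value;
  int operator*() const { return value; }
  Counter& operator++() { ++value; return *this; }
  Counter operator++(int) { Counter tmp = *this; ++value; return tmp; }
  friend bool operator==(Counter a, Counter b) { return a.value == b.value; }
  friend bool operator!=(Counter a, Counter b) { return a.value != b.value; }
};

// Build a vector of n ints from a Counter pair; report allocations used.
template <typename Tag>
std::size_t allocations_for(int n)
{
  g_allocations = 0;
  std::vector<int, CountingAllocator<int>> v(Counter<Tag>{0}, Counter<Tag>{n});
  return g_allocations;
}
```

With the forward tag the constructor can measure the range first and allocate
once; with the input tag it has to grow the storage while reading, so it
allocates several times for the same 1000 elements.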

[COMMITTED 007/144] gccrs: Move mbe macro tests to their own directory

2025-03-25 Thread arthur . cohen
From: Pierre-Emmanuel Patry 

gcc/testsuite/ChangeLog:

* rust/compile/macro-delim.rs: Move to...
* rust/compile/macros/mbe/macro-delim.rs: ...here.
* rust/compile/macro-issue1053-2.rs: Move to...
* rust/compile/macros/mbe/macro-issue1053-2.rs: ...here.
* rust/compile/macro-issue1053.rs: Move to...
* rust/compile/macros/mbe/macro-issue1053.rs: ...here.
* rust/compile/macro-issue1224.rs: Move to...
* rust/compile/macros/mbe/macro-issue1224.rs: ...here.
* rust/compile/macro-issue1233.rs: Move to...
* rust/compile/macros/mbe/macro-issue1233.rs: ...here.
* rust/compile/macro-issue1395-2.rs: Move to...
* rust/compile/macros/mbe/macro-issue1395-2.rs: ...here.
* rust/compile/macro-issue1395.rs: Move to...
* rust/compile/macros/mbe/macro-issue1395.rs: ...here.
* rust/compile/macro-issue1400-2.rs: Move to...
* rust/compile/macros/mbe/macro-issue1400-2.rs: ...here.
* rust/compile/macro-issue1400.rs: Move to...
* rust/compile/macros/mbe/macro-issue1400.rs: ...here.
* rust/compile/macro-issue2092.rs: Move to...
* rust/compile/macros/mbe/macro-issue2092.rs: ...here.
* rust/compile/macro-issue2192.rs: Move to...
* rust/compile/macros/mbe/macro-issue2192.rs: ...here.
* rust/compile/macro-issue2194.rs: Move to...
* rust/compile/macros/mbe/macro-issue2194.rs: ...here.
* rust/compile/macro-issue2229.rs: Move to...
* rust/compile/macros/mbe/macro-issue2229.rs: ...here.
* rust/compile/macro-issue2264.rs: Move to...
* rust/compile/macros/mbe/macro-issue2264.rs: ...here.
* rust/compile/macro-issue2268.rs: Move to...
* rust/compile/macros/mbe/macro-issue2268.rs: ...here.
* rust/compile/macro-issue2273.rs: Move to...
* rust/compile/macros/mbe/macro-issue2273.rs: ...here.
* rust/compile/macro-issue2653.rs: Move to...
* rust/compile/macros/mbe/macro-issue2653.rs: ...here.
* rust/compile/macro-issue2983_2984.rs: Move to...
* rust/compile/macros/mbe/macro-issue2983_2984.rs: ...here.
* rust/compile/macro1.rs: Move to...
* rust/compile/macros/mbe/macro1.rs: ...here.
* rust/compile/macro10.rs: Move to...
* rust/compile/macros/mbe/macro10.rs: ...here.
* rust/compile/macro11.rs: Move to...
* rust/compile/macros/mbe/macro11.rs: ...here.
* rust/compile/macro12.rs: Move to...
* rust/compile/macros/mbe/macro12.rs: ...here.
* rust/compile/macro13.rs: Move to...
* rust/compile/macros/mbe/macro13.rs: ...here.
* rust/compile/macro14.rs: Move to...
* rust/compile/macros/mbe/macro14.rs: ...here.
* rust/compile/macro15.rs: Move to...
* rust/compile/macros/mbe/macro15.rs: ...here.
* rust/compile/macro16.rs: Move to...
* rust/compile/macros/mbe/macro16.rs: ...here.
* rust/compile/macro17.rs: Move to...
* rust/compile/macros/mbe/macro17.rs: ...here.
* rust/compile/macro18.rs: Move to...
* rust/compile/macros/mbe/macro18.rs: ...here.
* rust/compile/macro19.rs: Move to...
* rust/compile/macros/mbe/macro19.rs: ...here.
* rust/compile/macro2.rs: Move to...
* rust/compile/macros/mbe/macro2.rs: ...here.
* rust/compile/macro20.rs: Move to...
* rust/compile/macros/mbe/macro20.rs: ...here.
* rust/compile/macro21.rs: Move to...
* rust/compile/macros/mbe/macro21.rs: ...here.
* rust/compile/macro22.rs: Move to...
* rust/compile/macros/mbe/macro22.rs: ...here.
* rust/compile/macro23.rs: Move to...
* rust/compile/macros/mbe/macro23.rs: ...here.
* rust/compile/macro25.rs: Move to...
* rust/compile/macros/mbe/macro25.rs: ...here.
* rust/compile/macro26.rs: Move to...
* rust/compile/macros/mbe/macro26.rs: ...here.
* rust/compile/macro27.rs: Move to...
* rust/compile/macros/mbe/macro27.rs: ...here.
* rust/compile/macro28.rs: Move to...
* rust/compile/macros/mbe/macro28.rs: ...here.
* rust/compile/macro29.rs: Move to...
* rust/compile/macros/mbe/macro29.rs: ...here.
* rust/compile/macro3.rs: Move to...
* rust/compile/macros/mbe/macro3.rs: ...here.
* rust/compile/macro30.rs: Move to...
* rust/compile/macros/mbe/macro30.rs: ...here.
* rust/compile/macro31.rs: Move to...
* rust/compile/macros/mbe/macro31.rs: ...here.
* rust/compile/macro32.rs: Move to...
* rust/compile/macros/mbe/macro32.rs: ...here.
* rust/compile/macro33.rs: Move to...
* rust/compile/macros/mbe/macro33.rs: ...here.
* rust/compile/macro34.rs: Move to...
* rust/compile/macros/mbe/macro34.rs: ...here.
* rust/compile/macro35.rs: Move to...
* rust/compile/macros/mbe/macro35.rs: ...here.
 

Re: [PATCH] libstdc++: Optimize std::vector construction from input iterators [PR108487]

2025-03-25 Thread Tomasz Kaminski
On Tue, Mar 25, 2025 at 6:20 PM Jonathan Wakely  wrote:

> On Tue, 25 Mar 2025 at 16:49, Tomasz Kaminski  wrote:
> >
> >
> >
> > On Tue, Mar 25, 2025 at 5:30 PM Jonathan Wakely 
> wrote:
> >>
> >> LWG 3291 made std::ranges::iota_view's iterator have input_iterator_tag
> >> as its iterator_category, even though it satisfies the C++20
> >> std::forward_iterator concept. This means that the traditional
> >> std::vector::vector(InputIterator, InputIterator) constructor treats
> >> iota_view iterators as input iterators, because it only understands the
> >> C++17 iterator requirements, not the C++20 iterator concepts. This
> >> results in a loop that calls emplace_back for each individual element of
> >> the iota_view, requiring the vector to reallocate repeatedly as the
> >> values are inserted. This makes it unnecessarily slow to construct a
> >> vector from an iota_view.
> >>
> >> This change adds a new _M_range_initialize_n function for initializing a
> >> vector from a range (which doesn't have to be common) and a size. This
> >> new function can be used by vector(InputIterator, InputIterator) and
> >> vector(from_range_t, R&&) when std::ranges::distance can be used to get
> >> the size. It can also be used by the _M_range_initialize overload that
> >> gets the size for a Cpp17ForwardIterator pair using std::distance, and
> >> by the vector(initializer_list) constructor.
> >>
> >> With this new function constructing a std::vector from iota_view does
> >> a single allocation of the correct size and so doesn't need to
> >> reallocate in a loop.
> >>
> >> libstdc++-v3/ChangeLog:
> >>
> >> PR libstdc++/108487
> >> * include/bits/stl_vector.h (vector(initializer_list)): Call
> >> _M_range_initialize_n instead of _M_range_initialize.
> >> (vector(InputIterator, InputIterator)): Use
> _M_range_initialize_n
> >> for C++20 sized sentinels and forward iterators.
> >> (vector(from_range_t, R&&)): Use _M_range_initialize_n for sized
> >> ranges and forward ranges.
> >> (vector::_M_range_initialize(FwIt, FwIt, forward_iterator_tag)):
> >> Likewise.
> >> (vector::_M_range_initialize_n): New function.
> >> * testsuite/23_containers/vector/cons/108487.cc: New test.
> >> ---
> >>
> >> Tests running for x86_64-linux.
> >>
> >> This gives a 10x speed up for the PR108487 testcase using iota_view.
> >>
> >> I don't see why doing this wouldn't be allowed by the standard, so it
> >> seems worth doing.
> >>
> >>  libstdc++-v3/include/bits/stl_vector.h| 48 ---
> >>  .../23_containers/vector/cons/108487.cc   | 24 ++
> >>  2 files changed, 56 insertions(+), 16 deletions(-)
> >>  create mode 100644
> libstdc++-v3/testsuite/23_containers/vector/cons/108487.cc
> >>
> >> diff --git a/libstdc++-v3/include/bits/stl_vector.h
> b/libstdc++-v3/include/bits/stl_vector.h
> >> index 21f6cd04f49..458adc987da 100644
> >> --- a/libstdc++-v3/include/bits/stl_vector.h
> >> +++ b/libstdc++-v3/include/bits/stl_vector.h
> >> @@ -65,6 +65,9 @@
> >>  #if __cplusplus >= 202002L
> >>  # include 
> >>  #endif
> >> +#if __glibcxx_concepts // C++ >= C++20
> >> +# include   // ranges::distance
> >> +#endif
> >>  #if __glibcxx_ranges_to_container // C++ >= 23
> >>  # include   // ranges::copy
> >>  # include   // ranges::subrange
> >> @@ -706,8 +709,7 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
> >>  const allocator_type& __a = allocator_type())
> >>: _Base(__a)
> >>{
> >> -   _M_range_initialize(__l.begin(), __l.end(),
> >> -   random_access_iterator_tag());
> >> +   _M_range_initialize_n(__l.begin(), __l.end(), __l.size());
> >>}
> >>  #endif
> >>
> >> @@ -735,6 +737,17 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
> >>const allocator_type& __a = allocator_type())
> >> : _Base(__a)
> >> {
> >> +#if __glibcxx_concepts // C++ >= C++20
> >> + if constexpr (sized_sentinel_for<_InputIterator,
> _InputIterator>
> >> + || forward_iterator<_InputIterator>)
> >> +   {
> >> + const auto __n
> >> +   = static_cast(ranges::distance(__first,
> __last));
> >> + _M_range_initialize_n(__first, __last, __n);
> >> + return;
> >> +   }
> >> + else
> >> +#endif
> >>   _M_range_initialize(__first, __last,
> >>   std::__iterator_category(__first));
> >> }
> >> @@ -763,13 +776,9 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
> >> {
> >>   if constexpr (ranges::forward_range<_Rg> ||
> ranges::sized_range<_Rg>)
> >> {
> >> - const auto __n = size_type(ranges::distance(__rg));
> >> - pointer __start =
> >> -   this->_M_allocate(_S_check_init_len(__n,
> >> -
>  _M_get_Tp_allocator()));
> >> - this->_M_impl._M_finish = this->_M_impl._M_start =
> __st

Re: [PATCH] tailc: Don't fail musttail calls if they use or could use local arguments, instead warn [PR119376]

2025-03-25 Thread Jakub Jelinek
On Tue, Mar 25, 2025 at 07:43:28PM +0300, Alexander Monakov wrote:
> FWIW I think Clang made a mistake in bending semantics in a way that is clearly
> misaligned with the general design of C and C++, where a language-native, so to
> speak, solution was available: introduce a scope for the local variables to
> indicate that they cannot escape to the intended tailcall:
> 
> void foo(int v)
> {
>   {
>     int a;
>     capture(&a);
>   }
>   tailcall(v); // this cannot refer to 'a', even though it escaped earlier
> }
> 
> I think this would be easily teachable to users who need [[tailcall]].
> 
> I wonder if Clang folks would be open to a dialogue about undoing this
> misdesign. I'd rather not see it propagated into GCC.

I'm not excited about it either, but they had introduced the attribute with
this behavior already 3.5 years ago and it is used in the wild, we should
have complained before they did that :(.
And having significantly different behavior for an attribute with the same name
(especially [[clang::musttail]], but also the __attribute__((musttail)) form
which is supposed to be gnu::musttail but they handle it like
clang::musttail) would be confusing to users as well.

For the
void bar (int *);
void
foo (int *x)
{
  int a;
  [[clang::musttail]] return bar (&a);
}
case they now (newly) warn by default and the posted patch warns by default
too; for the other case I've put the warning into -Wextra but am open to
moving it to -Wall.
Adding a compound statement with the locals is definitely what users can then
do to silence that warning; unfortunately one can't do that with function
arguments if their addresses escape.
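
A sketch of the compound-statement approach follows, written in portable C++
so that it builds on compilers without the attribute. The helpers capture and
tailcall are the placeholder names from the quoted example, given trivial
bodies here purely for illustration; g_seen is likewise invented.

```cpp
static int g_seen = 0;

// Placeholder helper: receives the address of a caller's local.
void capture(int *p)
{
  *p = 42;
  g_seen = *p;
}

// Placeholder callee for the tail position.
int tailcall(int v)
{
  return v + g_seen;
}

int foo(int v)
{
  {
    int a;            // lifetime of 'a' ends at the closing brace below
    capture(&a);      // the address escapes, but only within this scope
  }
  // A musttail attribute (for example [[clang::musttail]]) would be placed
  // on this return statement; it is left out so the sketch compiles anywhere.
  return tailcall(v); // cannot legitimately refer to 'a' any more
}
```

As noted above, this only helps with locals: a function parameter cannot be
moved into an inner scope.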

Jakub



Re: [PATCH v3] Don't instrument exit edges after musttail

2025-03-25 Thread Andi Kleen
> 2025-03-25  Jakub Jelinek  
>   Andi Kleen  
> 
>   PR gcov-profile/118442
>   * profile.cc (branch_prob): Ignore EDGE_FAKE edges from musttail calls
>   to EXIT.
> 
>   * c-c++-common/pr118442.c: New test.
> 
> --- gcc/profile.cc.jj 2025-01-02 11:23:16.458517673 +0100
> +++ gcc/profile.cc2025-03-25 09:57:21.860398601 +0100
> @@ -1340,6 +1340,20 @@ branch_prob (bool thunk)
> EDGE_INFO (e)->ignore = 1;
> ignored_edges++;
>   }
> +  /* Ignore fake edges after musttail calls.  */
> +  if ((e->flags & EDGE_FAKE)
> +   && e->dest == EXIT_BLOCK_PTR_FOR_FN (cfun))
> + {
> +   gimple_stmt_iterator gsi = gsi_last_bb (e->src);

At least the musttail pass allows some statements after the call, like labels
and debug information. Not sure if it matters.

The rest looks good to me.

-Andi

> +   gimple *stmt = gsi_stmt (gsi);
> +   if (stmt
> +   && is_gimple_call (stmt)
> +   && gimple_call_must_tail_p (as_a  (stmt)))
> + {
> +   EDGE_INFO (e)->ignore = 1;
> +   ignored_edges++;
> + }
> + }
>  }


Re: [PATCH v3] libstdc++: Add P1206R7 range operations to std::deque [PR111055]

2025-03-25 Thread Jonathan Wakely

On 25/03/25 13:30 +0100, Tomasz Kamiński wrote:

This is another piece of P1206R7, adding from_range constructor, append_range,
prepend_range, insert_range, and assign_range members to std::deque.

For prepend_range of an input, non-sized range, we emplace each element at the
front and then reverse the inserted elements. This does not touch existing
elements, and properly handles aliasing ranges.
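
The emplace-then-reverse technique described above can be sketched outside the
library as follows. This is an illustration only, not the patch's code:
prepend_single_pass and prepend_demo are hypothetical names, and an
istream_iterator stands in for an arbitrary single-pass range.

```cpp
#include <algorithm>
#include <cstddef>
#include <deque>
#include <iterator>
#include <sstream>

// Prepend the elements of a single-pass range, keeping their source order.
template <typename T, typename InputIt>
void prepend_single_pass(std::deque<T>& d, InputIt first, InputIt last)
{
  std::ptrdiff_t n = 0;
  for (; first != last; ++first, ++n)
    d.emplace_front(*first);                 // new elements land in reverse order
  std::reverse(d.begin(), d.begin() + n);    // only the new elements are swapped
}

// Small driver: checks the result and that an old element did not move.
bool prepend_demo()
{
  std::deque<int> d{4, 5};
  const int *old_back = &d.back();
  std::istringstream in("1 2 3");
  prepend_single_pass(d, std::istream_iterator<int>(in),
                      std::istream_iterator<int>());
  const std::deque<int> expected{1, 2, 3, 4, 5};
  return d == expected && old_back == &d.back();
}
```

Because insertion at the front of a deque leaves references to existing
elements valid and the reverse only swaps the newly added prefix, the old
elements stay where they were.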

For insert_range, the handling of insertion in the middle of input-only ranges
that are sized could be optimized; we still insert nodes one-by-one in such a
case. For forward and stronger ranges, we reduce them to the common_range case
by computing the iterator when computing the distance. This is slightly
suboptimal, as it requires the range to be iterated for non-common forward
ranges that are sized, but it reduces the number of instantiations.

This patch extracts _M_range_prepend and _M_range_append helper functions that
accept an (iterator, sentinel) pair. These are used in all standard modes.

PR libstdc++/111055

libstdc++-v3/ChangeLog:

* include/bits/deque.tcc (deque::prepend_range, deque::append_range)
(deque::insert_range, _advance_dist): Define.


There should be two underscores at the start of _advance_dist.


(deque::_M_range_prepend, deque::_M_range_append):
Extract from _M_range_insert_aux for _ForwardIterator(s).
* include/bits/stl_deque.h (deque::assing_range): Define.


Spelling "assign"


(deque::prepend_range, deque::append_range, deque::insert_range):
Declare.
deque(from_range_t, _Rg&&, const allocator_type&): Define constructor


Missing parentheses around the function name:

(deque(from_range_t, _Rg&&, const allocator_type&)): Define
constructor and deduction guide.


and deduction guide.
* include/debug/deque (deque::prepend_range, deque::append_range),


No comma at the end of the line here.


(assing_range): Define.


Spelling "assign" again.


deque(from_range_t, _Rg&&, const allocator_type&): Define constructor


Missing parens again.


and deduction guide.
* testsuite/23_containers/deque/cons/from_range.cc: New test.
* testsuite/23_containers/deque/modifiers/append_range.cc: New test.
* testsuite/23_containers/deque/modifiers/assign/assign_range.cc: New test.
* testsuite/23_containers/deque/modifiers/prepend_range.cc: New test.
---
Expanded the documentation comments. Added tests for the stability of
pointers/references to existing elements when inserting at the front/end.

libstdc++-v3/include/bits/deque.tcc   | 229 +++---
libstdc++-v3/include/bits/stl_deque.h | 110 +
libstdc++-v3/include/debug/deque  |  51 
.../23_containers/deque/cons/from_range.cc| 117 +
.../deque/modifiers/append_range.cc   | 128 ++
.../deque/modifiers/assign/assign_range.cc| 109 +
.../deque/modifiers/insert/insert_range.cc| 142 +++
.../deque/modifiers/prepend_range.cc  | 140 +++
8 files changed, 995 insertions(+), 31 deletions(-)
create mode 100644 libstdc++-v3/testsuite/23_containers/deque/cons/from_range.cc
create mode 100644 libstdc++-v3/testsuite/23_containers/deque/modifiers/append_range.cc
create mode 100644 libstdc++-v3/testsuite/23_containers/deque/modifiers/assign/assign_range.cc
create mode 100644 libstdc++-v3/testsuite/23_containers/deque/modifiers/insert/insert_range.cc
create mode 100644 libstdc++-v3/testsuite/23_containers/deque/modifiers/prepend_range.cc

diff --git a/libstdc++-v3/include/bits/deque.tcc b/libstdc++-v3/include/bits/deque.tcc
index fcbecca55b4..6de76ebc116 100644
--- a/libstdc++-v3/include/bits/deque.tcc
+++ b/libstdc++-v3/include/bits/deque.tcc
@@ -583,6 +583,51 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
  this->_M_impl._M_start._M_cur = this->_M_impl._M_start._M_first;
}

+  template 
+template 
+  void
+  deque<_Tp, _Alloc>::
+  _M_range_prepend(_InputIterator __first, _Sentinel __last,
+ size_type __n)
+  {
+iterator __new_start = _M_reserve_elements_at_front(__n);
+__try
+  {
+std::__uninitialized_copy_a(_GLIBCXX_MOVE(__first), __last,
+__new_start, _M_get_Tp_allocator());
+this->_M_impl._M_start = __new_start;
+  }
+__catch(...)
+  {
+_M_destroy_nodes(__new_start._M_node,
+ this->_M_impl._M_start._M_node);
+__throw_exception_again;
+  }
+  }
+
+  template 
+template 
+  void
+  deque<_Tp, _Alloc>::
+  _M_range_append(_InputIterator __first, _Sentinel __last,
+ size_type __n)
+ {
+   iterator __new_finish = _M_reserve_elements_at_back(__n);
+   __try
+   {
+ std::__uninitialized_copy_a(_GLIBCXX_MOVE(__first), __last,
+ this->_

[committed] arm: testsuite: adjust ftest tests

2025-03-25 Thread Richard Earnshaw
The ftest-*.c tests for Arm check certain ACLE mandated macros to ensure
they are correctly defined based on the selected architecture.  ACLE
states that the macro should be defined if the operation exists in
the hardware, but it doesn't have to exist in the current ISA because
an interworking call to the library function will still result in using
the hardware operation (both GCC and Clang agree on this).  So adjust
the tests accordingly.
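
As a rough illustration of how user code typically consumes such a feature
macro, consider the sketch below. The function name leading_zeros32 is made up
here, and the fallback loop is just one portable way to count leading zeros;
the only things taken from the text above are the macro __ARM_FEATURE_CLZ and
the idea that its definition signals hardware support.

```cpp
#include <cstdint>

// Count leading zero bits of a 32-bit value.
unsigned leading_zeros32(std::uint32_t x)
{
  if (x == 0)
    return 32;                        // __builtin_clz is undefined for 0
#if defined(__ARM_FEATURE_CLZ) && defined(__GNUC__)
  // The hardware has a CLZ operation; let the compiler pick how to reach it.
  return static_cast<unsigned>(__builtin_clz(x));
#else
  // Portable fallback: walk down from the top bit until a set bit is found.
  unsigned n = 0;
  for (std::uint32_t bit = 0x80000000u; (x & bit) == 0; bit >>= 1)
    ++n;
  return n;
#endif
}
```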

Whilst cleaning this up, also remove the now redundant dg-skip-if operations
that were testing for incompatible command-line options.  That should now
be a thing of the past as the framework will clean this up more thoroughly
before running the test, or detect incompatible option combinations.

gcc/testsuite/ChangeLog:

* gcc.target/arm/ftest-armv4t-thumb.c:  Expect __ARM_FEATURE_CLZ to be
defined.  Remove redundant dg-skip-if rules.
* gcc.target/arm/ftest-armv5t-thumb.c: Likewise.
* gcc.target/arm/ftest-armv5te-thumb.c: Likewise.
* gcc.target/arm/ftest-armv6-thumb.c: Likewise.
* gcc.target/arm/ftest-armv6k-thumb.c: Likewise.
* gcc.target/arm/ftest-armv6z-thumb.c: Likewise.
* gcc.target/arm/ftest-armv7em-thumb.c: Remove redundant dg-skip-if
rules.  Add a require-effective-target for armv7em.
* gcc.target/arm/ftest-armv7a-arm.c: Likewise.
* gcc.target/arm/ftest-armv7a-thumb.c: Likewise.
* gcc.target/arm/ftest-armv7r-arm.c: Likewise.
* gcc.target/arm/ftest-armv7r-thumb.c: Likewise.
* gcc.target/arm/ftest-armv7ve-arm.c: Likewise.
* gcc.target/arm/ftest-armv7ve-thumb.c: Likewise.
* gcc.target/arm/ftest-armv8a-arm.c: Likewise.
* gcc.target/arm/ftest-armv8a-thumb.c: Likewise.
* gcc.target/arm/ftest-armv4-arm.c: Remove redundant dg-skip-if rules.
* gcc.target/arm/ftest-armv4t-arm.c: Likewise.
* gcc.target/arm/ftest-armv5t-arm.c: Likewise.
* gcc.target/arm/ftest-armv5te-arm.c: Likewise.
* gcc.target/arm/ftest-armv6-arm.c: Likewise.
* gcc.target/arm/ftest-armv6k-arm.c: Likewise.
* gcc.target/arm/ftest-armv6m-thumb.c: Likewise.
* gcc.target/arm/ftest-armv6t2-arm.c: Likewise.
* gcc.target/arm/ftest-armv6t2-thumb.c: Likewise.
* gcc.target/arm/ftest-armv6z-arm.c: Likewise.
---
 gcc/testsuite/gcc.target/arm/ftest-armv4-arm.c | 2 --
 gcc/testsuite/gcc.target/arm/ftest-armv4t-arm.c| 2 --
 gcc/testsuite/gcc.target/arm/ftest-armv4t-thumb.c  | 2 --
 gcc/testsuite/gcc.target/arm/ftest-armv5t-arm.c| 2 --
 gcc/testsuite/gcc.target/arm/ftest-armv5t-thumb.c  | 7 +--
 gcc/testsuite/gcc.target/arm/ftest-armv5te-arm.c   | 2 --
 gcc/testsuite/gcc.target/arm/ftest-armv5te-thumb.c | 7 +--
 gcc/testsuite/gcc.target/arm/ftest-armv6-arm.c | 2 --
 gcc/testsuite/gcc.target/arm/ftest-armv6-thumb.c   | 7 +--
 gcc/testsuite/gcc.target/arm/ftest-armv6k-arm.c| 2 --
 gcc/testsuite/gcc.target/arm/ftest-armv6k-thumb.c  | 7 +--
 gcc/testsuite/gcc.target/arm/ftest-armv6m-thumb.c  | 2 --
 gcc/testsuite/gcc.target/arm/ftest-armv6t2-arm.c   | 2 --
 gcc/testsuite/gcc.target/arm/ftest-armv6t2-thumb.c | 2 --
 gcc/testsuite/gcc.target/arm/ftest-armv6z-arm.c| 2 --
 gcc/testsuite/gcc.target/arm/ftest-armv6z-thumb.c  | 7 +--
 gcc/testsuite/gcc.target/arm/ftest-armv7a-arm.c| 4 +---
 gcc/testsuite/gcc.target/arm/ftest-armv7a-thumb.c  | 4 +---
 gcc/testsuite/gcc.target/arm/ftest-armv7em-thumb.c | 3 +--
 gcc/testsuite/gcc.target/arm/ftest-armv7r-arm.c| 4 +---
 gcc/testsuite/gcc.target/arm/ftest-armv7r-thumb.c  | 4 +---
 gcc/testsuite/gcc.target/arm/ftest-armv7ve-arm.c   | 4 +---
 gcc/testsuite/gcc.target/arm/ftest-armv7ve-thumb.c | 4 +---
 gcc/testsuite/gcc.target/arm/ftest-armv8a-arm.c| 4 +---
 gcc/testsuite/gcc.target/arm/ftest-armv8a-thumb.c  | 4 +---
 25 files changed, 34 insertions(+), 58 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/ftest-armv4-arm.c b/gcc/testsuite/gcc.target/arm/ftest-armv4-arm.c
index 447a8ec16ae..63d57d41d3f 100644
--- a/gcc/testsuite/gcc.target/arm/ftest-armv4-arm.c
+++ b/gcc/testsuite/gcc.target/arm/ftest-armv4-arm.c
@@ -1,6 +1,4 @@
 /* { dg-do compile } */
-/* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-march=*" } { "-march=armv4" } } */
-/* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-mthumb" } { "" } } */
 /* { dg-require-effective-target arm_arch_v4_ok } */
 /* { dg-options "-marm" } */
 /* { dg-add-options arm_arch_v4 } */
diff --git a/gcc/testsuite/gcc.target/arm/ftest-armv4t-arm.c b/gcc/testsuite/gcc.target/arm/ftest-armv4t-arm.c
index 28fd2f79ddb..d33beef1d73 100644
--- a/gcc/testsuite/gcc.target/arm/ftest-armv4t-arm.c
+++ b/gcc/testsuite/gcc.target/arm/ftest-armv4t-arm.c
@@ -1,6 +1,4 @@
 /* { dg-do compile } */
-/* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-march=*" } 
{ "-march=armv4t" } } */
-/* { dg-skip-if "avoid conflicting multilib options" { *

Re: [PATCH] OpenMP: Create additional interop objects with append_args.

2025-03-25 Thread Tobias Burnus

Sandra Loosemore wrote:

On 3/23/25 14:28, Tobias Burnus wrote:

Thus, please use GOMP_DEVICE_DEFAULT_OMP_61 for: […]

Done.


(B) prefer_type & Fortran […]
This turned out to be an accidental redefinition of a variable that 
shadowed one of the same name in an outer scope.  I added a simplified 
version of your Fortran testcase, too.


Thanks!

I've also fixed the documentation to reflect that append_args is now 
fully supported.  New version of the patch attached; is this one OK to 
commit?


LGTM.

Thanks for the patch :-)

Tobias



Re: [PATCH v4] libstdc++: Add P1206R7 range operations to std::deque [PR111055]

2025-03-25 Thread Jonathan Wakely
On Tue, 25 Mar 2025 at 15:53, Tomasz Kamiński  wrote:
>
> This is another piece of P1206R7, adding from_range constructor, append_range,
> prepend_range, insert_range, and assign_range members to std::deque.
>
> For prepend_range of an input non-sized range, we emplace the elements at the
> front and then reverse the inserted elements. This does not affect existing
> elements and properly handles aliasing ranges.
>
> For insert_range, insertion in the middle for input-only ranges that are
> sized could be optimized; we still insert nodes one by one in that case.
> For forward and stronger ranges, we reduce them to the common_range case by
> computing the iterator when computing the distance. This is slightly
> suboptimal, as it requires the range to be iterated for non-common forward
> ranges that are sized, but it reduces the number of instantiations.
>
> This patch extracts the _M_range_prepend and _M_range_append helper functions,
> which accept an (iterator, sentinel) pair. They are used in all standard modes.
>
> PR libstdc++/111055
>
> libstdc++-v3/ChangeLog:
>
> * include/bits/deque.tcc (deque::prepend_range, deque::append_range)
> (deque::insert_range, __advance_dist): Define.
> (deque::_M_range_prepend, deque::_M_range_append):
> Extract from _M_range_insert_aux for _ForwardIterator(s).
> * include/bits/stl_deque.h (deque::assign_range): Define.
> (deque::prepend_range, deque::append_range, deque::insert_range):
> Declare.
> (deque(from_range_t, _Rg&&, const allocator_type&)): Define 
> constructor
> and deduction guide.
> * include/debug/deque (deque::prepend_range, deque::append_range)
> (deque::assign_range):  Define.
> (deque(from_range_t, _Rg&&, const allocator_type&)): Define 
> constructor
> and deduction guide.
> * testsuite/23_containers/deque/cons/from_range.cc: New test.
> * testsuite/23_containers/deque/modifiers/append_range.cc: New test.
> * testsuite/23_containers/deque/modifiers/assign/assign_range.cc:
> New test.
> * testsuite/23_containers/deque/modifiers/prepend_range.cc: New test.
> ---
> Addressed all the comments.
> Testing on x86_64-linux, container tests passed.
> OK for trunk if tests pass?

OK, thanks.

>
>  libstdc++-v3/include/bits/deque.tcc   | 229 +++---
>  libstdc++-v3/include/bits/stl_deque.h | 111 +
>  libstdc++-v3/include/debug/deque  |  51 
>  .../23_containers/deque/cons/from_range.cc| 117 +
>  .../deque/modifiers/append_range.cc   | 128 ++
>  .../deque/modifiers/assign/assign_range.cc| 109 +
>  .../deque/modifiers/insert/insert_range.cc| 142 +++
>  .../deque/modifiers/prepend_range.cc  | 140 +++
>  8 files changed, 996 insertions(+), 31 deletions(-)
>  create mode 100644 
> libstdc++-v3/testsuite/23_containers/deque/cons/from_range.cc
>  create mode 100644 
> libstdc++-v3/testsuite/23_containers/deque/modifiers/append_range.cc
>  create mode 100644 
> libstdc++-v3/testsuite/23_containers/deque/modifiers/assign/assign_range.cc
>  create mode 100644 
> libstdc++-v3/testsuite/23_containers/deque/modifiers/insert/insert_range.cc
>  create mode 100644 
> libstdc++-v3/testsuite/23_containers/deque/modifiers/prepend_range.cc
>
> diff --git a/libstdc++-v3/include/bits/deque.tcc 
> b/libstdc++-v3/include/bits/deque.tcc
> index fcbecca55b4..87ea1cebdaa 100644
> --- a/libstdc++-v3/include/bits/deque.tcc
> +++ b/libstdc++-v3/include/bits/deque.tcc
> @@ -583,6 +583,51 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
>this->_M_impl._M_start._M_cur = this->_M_impl._M_start._M_first;
>  }
>
> +  template 
> +template 
> +  void
> +  deque<_Tp, _Alloc>::
> +  _M_range_prepend(_InputIterator __first, _Sentinel __last,
> + size_type __n)
> +  {
> +iterator __new_start = _M_reserve_elements_at_front(__n);
> +__try
> +  {
> +std::__uninitialized_copy_a(_GLIBCXX_MOVE(__first), __last,
> +__new_start, _M_get_Tp_allocator());
> +this->_M_impl._M_start = __new_start;
> +  }
> +__catch(...)
> +  {
> +_M_destroy_nodes(__new_start._M_node,
> + this->_M_impl._M_start._M_node);
> +__throw_exception_again;
> +  }
> +  }
> +
> +  template 
> +template 
> +  void
> +  deque<_Tp, _Alloc>::
> +  _M_range_append(_InputIterator __first, _Sentinel __last,
> + size_type __n)
> + {
> +   iterator __new_finish = _M_reserve_elements_at_back(__n);
> +   __try
> +   {
> + std::__uninitialized_copy_a(_GLIBCXX_MOVE(__first), __last,
> + this->_M_impl._M_finish,
> + _M_get_Tp_allocator());
> + this

Re: [PATCH] tailc: Don't fail musttail calls if they use or could use local arguments, instead warn [PR119376]

2025-03-25 Thread Andi Kleen
On Tue, Mar 25, 2025 at 07:43:28PM +0300, Alexander Monakov wrote:
> Hello,
> 
> FWIW I think Clang made a mistake in bending semantics in a way that is 
> clearly
> misaligned with the general design of C and C++, where a language-native, so 
> to
> speak, solution was available: introduce a scope for the local variables to
> indicate that they cannot escape to the intended tailcall:
> 
> void foo(int v)
> {
>   {
> int a;
> capture(&a);
>   }
>   tailcall(v); // this cannot refer to 'a', even though it escaped earlier
> }

This is not a universal fix? e.g. consider a case like

void foo(int v)
{
   int a;
   capture(&a); 
   if (condition)
 return tailcall(v);
   // do something with a
}

-Andi


Re: [PATCH] c++: Properly fold .* [PR114525]

2025-03-25 Thread Jason Merrill

On 3/25/25 1:18 PM, Simon Martin wrote:

We've been miscompiling the following since r0-51314-gd6b4ea8592e338 (I
did not go compile something that old, and identified this change via
git blame, so might be wrong)

=== cut here ===
struct Foo { int x; };
Foo& get (Foo &v) { return v; }
void bar () {
   Foo v; v.x = 1;
   (true ? get (v) : get (v)).*(&Foo::x) = 2;
   // v.x still equals 1 here...
}
=== cut here ===

The problem lies in build_m_component_ref, that computes the address of
the COND_EXPR using build_address to build the representation of
   (true ? get (v) : get (v)).*(&Foo::x);
and gets something like
   &(true ? get (v) : get (v))  // #1
instead of
   (true ? &get (v) : &get (v)) // #2
and the write does not go where we want it to, hence the miscompile.

This patch replaces the call to build_address by a call to
cp_build_addr_expr, which gives #2, that is properly handled.
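
As a source-level illustration of form #2 (not the compiler-internal tree that cp_build_addr_expr builds), the following sketch takes the address in each branch of the conditional and then writes through a pointer to member. The function name write_through_branch_addresses is made up for this example; Foo and get are taken from the testcase above.

```cpp
struct Foo { int x; };

Foo& get(Foo& v) { return v; }

// Hypothetical helper: spells out form #2 by hand, i.e. the address is
// taken inside each arm of the conditional, so the resulting pointer
// refers to the original object and the write lands in v.x.
int write_through_branch_addresses(bool cond)
{
  Foo v;
  v.x = 1;
  Foo* p = cond ? &get(v) : &get(v);   // (cond ? &get (v) : &get (v))
  p->*(&Foo::x) = 2;                   // write through the pointer to member
  return v.x;                          // the value actually stored in v
}
```

Calling write_through_branch_addresses with either true or false should yield 2, which is the behaviour the fixed `.*` expression is expected to have as well.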

Successfully tested on x86_64-pc-linux-gnu. OK for trunk? And for active
branches after 2-3 weeks since it's a nasty one (albeit very old)?


OK, and yes.


PR c++/114525

gcc/cp/ChangeLog:

* typeck2.cc (build_m_component_ref): Call cp_build_addr_expr
instead of build_address.

gcc/testsuite/ChangeLog:

* g++.dg/parse/pr114525.C: New test.

---
  gcc/cp/typeck2.cc |  2 +-
  gcc/testsuite/g++.dg/parse/pr114525.C | 36 +++
  2 files changed, 37 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/parse/pr114525.C

diff --git a/gcc/cp/typeck2.cc b/gcc/cp/typeck2.cc
index 1adc05aa86d..45edd180173 100644
--- a/gcc/cp/typeck2.cc
+++ b/gcc/cp/typeck2.cc
@@ -2387,7 +2387,7 @@ build_m_component_ref (tree datum, tree component, tsubst_flags_t complain)
   (cp_type_quals (type)
    | cp_type_quals (TREE_TYPE (datum))));
 
-  datum = build_address (datum);
+  datum = cp_build_addr_expr (datum, complain);
 
   /* Convert object to the correct base.  */
   if (binfo)
diff --git a/gcc/testsuite/g++.dg/parse/pr114525.C 
b/gcc/testsuite/g++.dg/parse/pr114525.C
new file mode 100644
index 000..326985eed50
--- /dev/null
+++ b/gcc/testsuite/g++.dg/parse/pr114525.C
@@ -0,0 +1,36 @@
+/* PR c++/114525 */
+/* { dg-do run } */
+
+struct Foo {
+  int x;
+};
+
+Foo& get (Foo& v) {
+  return v;
+}
+
+int main () {
+  bool cond = true;
+
+  /* Testcase from PR; v.x would wrongly remain equal to 1.  */
+  Foo v_ko;
+  v_ko.x = 1;
+  (cond ? get (v_ko) : get (v_ko)).*(&Foo::x) = 2;
+  if (v_ko.x != 2)
+__builtin_abort ();
+
+  /* Those would already work, i.e. x be changed to 2.  */
+  Foo v_ok_1;
+  v_ok_1.x = 1;
+  (cond ? get (v_ok_1) : get (v_ok_1)).x = 2;
+  if (v_ok_1.x != 2)
+__builtin_abort ();
+
+  Foo v_ok_2;
+  v_ok_2.x = 1;
+  get (v_ok_2).*(&Foo::x) = 2;
+  if (v_ok_2.x != 2)
+__builtin_abort ();
+
+  return 0;
+}




Re: [Patch, Fortran] C prototypes for functions returning C function pointers

2025-03-25 Thread Thomas Koenig

Hello Harald,


OK with the above addressed.


Both addressed and pushed in

https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=737a5760bb24a0a945cc2c916ba452e3f0060c58

Thanks for the review (and for catching the miscellaneous
problems on the way)!

Best regards

Thomas




Re: [PATCH] c++: Properly fold .* [PR114525]

2025-03-25 Thread Marek Polacek
On Tue, Mar 25, 2025 at 05:18:23PM +, Simon Martin wrote:
> We've been miscompiling the following since r0-51314-gd6b4ea8592e338 (I
> did not go compile something that old, and identified this change via
> git blame, so might be wrong)
> 
> === cut here ===
> struct Foo { int x; };
> Foo& get (Foo &v) { return v; }
> void bar () {
>   Foo v; v.x = 1;
>   (true ? get (v) : get (v)).*(&Foo::x) = 2;
>   // v.x still equals 1 here...
> }
> === cut here ===
> 
> The problem lies in build_m_component_ref, that computes the address of
> the COND_EXPR using build_address to build the representation of
>   (true ? get (v) : get (v)).*(&Foo::x);
> and gets something like
>   &(true ? get (v) : get (v))  // #1
> instead of
>   (true ? &get (v) : &get (v)) // #2
> and the write does not go where we want it to, hence the miscompile.
> 
> This patch replaces the call to build_address by a call to
> cp_build_addr_expr, which gives #2, that is properly handled.
> 
> Successfully tested on x86_64-pc-linux-gnu. OK for trunk? And for active
> branches after 2-3 weeks since it's a nasty one (albeit very old)?
> 
>   PR c++/114525
> 
> gcc/cp/ChangeLog:
> 
>   * typeck2.cc (build_m_component_ref): Call cp_build_addr_expr
>   instead of build_address.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/parse/pr114525.C: New test.

g++.dg/expr/cond18.C seems like a more appropriate place, but the
patch itself LGTM.

> 
> ---
>  gcc/cp/typeck2.cc |  2 +-
>  gcc/testsuite/g++.dg/parse/pr114525.C | 36 +++
>  2 files changed, 37 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/g++.dg/parse/pr114525.C
> 
> diff --git a/gcc/cp/typeck2.cc b/gcc/cp/typeck2.cc
> index 1adc05aa86d..45edd180173 100644
> --- a/gcc/cp/typeck2.cc
> +++ b/gcc/cp/typeck2.cc
> @@ -2387,7 +2387,7 @@ build_m_component_ref (tree datum, tree component, 
> tsubst_flags_t complain)
> (cp_type_quals (type)
>  | cp_type_quals (TREE_TYPE (datum))));
>  
> -  datum = build_address (datum);
> +  datum = cp_build_addr_expr (datum, complain);
>  
>/* Convert object to the correct base.  */
>if (binfo)
> diff --git a/gcc/testsuite/g++.dg/parse/pr114525.C 
> b/gcc/testsuite/g++.dg/parse/pr114525.C
> new file mode 100644
> index 000..326985eed50
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/parse/pr114525.C
> @@ -0,0 +1,36 @@
> +/* PR c++/114525 */
> +/* { dg-do run } */
> +
> +struct Foo {
> +  int x;
> +};
> +
> +Foo& get (Foo& v) {
> +  return v;
> +}
> +
> +int main () {
> +  bool cond = true;
> +
> +  /* Testcase from PR; v.x would wrongly remain equal to 1.  */
> +  Foo v_ko;
> +  v_ko.x = 1;
> +  (cond ? get (v_ko) : get (v_ko)).*(&Foo::x) = 2;
> +  if (v_ko.x != 2)
> +__builtin_abort ();
> +
> +  /* Those would already work, i.e. x be changed to 2.  */
> +  Foo v_ok_1;
> +  v_ok_1.x = 1;
> +  (cond ? get (v_ok_1) : get (v_ok_1)).x = 2;
> +  if (v_ok_1.x != 2)
> +__builtin_abort ();
> +
> +  Foo v_ok_2;
> +  v_ok_2.x = 1;
> +  get (v_ok_2).*(&Foo::x) = 2;
> +  if (v_ok_2.x != 2)
> +__builtin_abort ();
> +
> +  return 0;
> +}
> -- 
> 2.44.0
> 

Marek



Re: [PATCH] RISC-V: Remove the priority in FMV ASM name mangling

2025-03-25 Thread Yangyu Chen




On 25/3/2025 21:23, Kito Cheng wrote:

Will it only cause issues with this patch
https://gcc.gnu.org/pipermail/gcc-patches/2025-March/678918.html


Yes. But I think we can merge this first.

Thanks,
Yangyu Chen


or will it cause problems with the current trunk as well?

If the latter one, could you provide a case for that?

Thanks :)

On Tue, Mar 25, 2025 at 7:15 PM Yangyu Chen  wrote:


We don't need to add priority in ASM name mangling, keeping this might
cause an issue if we call another MV clone directly but only one place
has the priority declared.

gcc/ChangeLog:

 * config/riscv/riscv.cc (riscv_mangle_decl_assembler_name): Remove
 priority in fmv asm name mangling.

Signed-off-by: Yangyu Chen 
---
  gcc/config/riscv/riscv.cc | 6 +-
  1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 38f3ae7cd84..4a042878554 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -13238,7 +13238,11 @@ riscv_mangle_decl_assembler_name (tree decl, tree id)

/* Replace non-alphanumeric characters with underscores as the suffix.  
*/
for (const char *c = version_string; *c; c++)
-   name += ISALNUM (*c) == 0 ? '_' : *c;
+   {
+ /* Skip ';' for ";priority"  */
+ if (*c == ';') break;
+ name += ISALNUM (*c) == 0 ? '_' : *c;
+   }

if (DECL_ASSEMBLER_NAME_SET_P (decl))
 SET_DECL_RTL (decl, NULL);
--
2.49.0
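
The effect of the loop change can be sketched outside of GCC as a small standalone function. This is only an approximation under stated assumptions: mangle_version_suffix is a made-up name, std::isalnum stands in for GCC's ISALNUM macro, and the example version string in the note below is illustrative rather than taken from the patch.

```cpp
#include <cctype>
#include <string>

// Hypothetical stand-in for the suffix loop in the patch: every
// non-alphanumeric character becomes '_', and everything from the first
// ';' onwards (the ";priority=..." part) is dropped.
std::string mangle_version_suffix(const std::string& version)
{
  std::string suffix;
  for (char c : version)
    {
      if (c == ';')
        break;  // stop before ";priority"
      suffix += std::isalnum(static_cast<unsigned char>(c)) ? c : '_';
    }
  return suffix;
}
```

With this sketch, a version string such as "arch=+v;priority=2" and the same string without the priority part both map to the same suffix, which is the property the patch is after.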





Re: [PATCH] target/119010 - add missing DF load/store reservations for znver4 and znver5

2025-03-25 Thread Richard Biener
On Tue, 25 Mar 2025, Richard Biener wrote:

> The following resolves missing reservations for DFmode *movdf_internal
> loads and stores, visible as 'nothing' in -fsched-verbose=2 dumps.
> 
> Bootstrap and regtest running on x86_64-unknown-linux-gnu.

The alternative for the larger scale problem of missing DFmode handling
is to s/V1DF/DF/ in the file - znver.md gets along without V1DF handling,
supposedly using V1DF (V1SF isn't a thing) was a mistake?

Richard.
 
>   PR target/119010
>   * config/i386/zn4zn5.md (znver4_sse_mov_fp, znver4_sse_mov_fp_load,
>   znver5_sse_mov_fp_load, znver4_sse_mov_fp_store,
>   znver5_sse_mov_fp_store): Also match V1SF and DF.
> ---
>  gcc/config/i386/zn4zn5.md | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/gcc/config/i386/zn4zn5.md b/gcc/config/i386/zn4zn5.md
> index ae188a1201e..f8772fed620 100644
> --- a/gcc/config/i386/zn4zn5.md
> +++ b/gcc/config/i386/zn4zn5.md
> @@ -986,35 +986,35 @@
>  (define_insn_reservation "znver4_sse_mov_fp" 1
>(and (eq_attr "cpu" "znver4,znver5")
> (and (eq_attr "type" "ssemov")
> -(and (eq_attr "mode" 
> "V16SF,V8DF,V8SF,V4DF,V4SF,V2DF,V2SF,V1DF,SF")
> +(and (eq_attr "mode" 
> "V16SF,V8DF,V8SF,V4DF,V4SF,V2DF,V2SF,V1DF,DF,SF")
>   (eq_attr "memory" "none"))))
>"znver4-direct,znver4-fpu")
>  
>  (define_insn_reservation "znver4_sse_mov_fp_load" 6
>(and (eq_attr "cpu" "znver4")
> (and (eq_attr "type" "ssemov")
> -(and (eq_attr "mode" 
> "V16SF,V8DF,V8SF,V4DF,V4SF,V2DF,V2SF,V1DF,SF")
> +(and (eq_attr "mode" 
> "V16SF,V8DF,V8SF,V4DF,V4SF,V2DF,V2SF,V1DF,DF,SF")
>   (eq_attr "memory" "load"))))
>"znver4-direct,znver4-load,znver4-fpu")
>  
>  (define_insn_reservation "znver5_sse_mov_fp_load" 6
>(and (eq_attr "cpu" "znver5")
> (and (eq_attr "type" "ssemov")
> -(and (eq_attr "mode" 
> "V16SF,V8DF,V8SF,V4DF,V4SF,V2DF,V2SF,V1DF,SF")
> +(and (eq_attr "mode" 
> "V16SF,V8DF,V8SF,V4DF,V4SF,V2DF,V2SF,V1DF,DF,SF")
>   (eq_attr "memory" "load"))))
>"znver4-direct,znver5-load,znver4-fpu")
>  
>  (define_insn_reservation "znver4_sse_mov_fp_store" 1
>(and (eq_attr "cpu" "znver4")
> (and (eq_attr "type" "ssemov")
> -(and (eq_attr "mode" 
> "V16SF,V8DF,V8SF,V4DF,V4SF,V2DF,V2SF,V1DF,SF")
> +(and (eq_attr "mode" 
> "V16SF,V8DF,V8SF,V4DF,V4SF,V2DF,V2SF,V1DF,DF,SF")
>   (eq_attr "memory" "store"))))
>"znver4-direct,znver4-fp-store")
>  
>  (define_insn_reservation "znver5_sse_mov_fp_store" 1
>(and (eq_attr "cpu" "znver5")
> (and (eq_attr "type" "ssemov")
> -(and (eq_attr "mode" 
> "V8SF,V4DF,V4SF,V2DF,V2SF,V1DF,SF")
> +(and (eq_attr "mode" 
> "V8SF,V4DF,V4SF,V2DF,V2SF,V1DF,DF,SF")
>   (eq_attr "memory" "store"))))
>"znver4-direct,znver5-fp-store256")
>  
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


RE: [PATCH][COBOL][RFC] Remove strtof128 based diagnostics

2025-03-25 Thread Robert Dubner
I patched this in on top of all the other patches.  It passes what Jim has
called the "Bob CI/CD pipeline".

Jim is finalizing his changes.

> -Original Message-
> From: Richard Biener 
> Sent: Tuesday, March 25, 2025 05:51
> To: gcc-patches@gcc.gnu.org
> Cc: rdub...@symas.com; Jakub Jelinek 
> Subject: [PATCH][COBOL][RFC] Remove strtof128 based diagnostics
> 
> The following removes uses of strtof128 which are all in some way
> verifying something parses as _Float128 but which lexing should
> have guaranteed.
> 
> Tested on x86_64-unknown-linux-gnu.
> 
> Richard.
> 
> gcc/cobol/
>   * parse.y (intrinsic): Remove checking that
$r1->field->data.initial
>   parses as _Float128.
>   (numstr2i): Remove checking that all of the string is consumed
>   by the converted from number.
>   * symbols.h (strtof128): Remove.
>   (cbl_field_data_t::valify): Remove checking that all of the
>   string is consumed by the number conversion.
> ---
>  gcc/cobol/parse.y   | 21 +
>  gcc/cobol/symbols.h | 20 
>  2 files changed, 5 insertions(+), 36 deletions(-)
> 
> diff --git a/gcc/cobol/parse.y b/gcc/cobol/parse.y
> index 390e115f37e..5fa472a3645 100644
> --- a/gcc/cobol/parse.y
> +++ b/gcc/cobol/parse.y
> @@ -10326,16 +10326,7 @@ intrinsic:  function_udf
>}
>if( $1 == NUMVAL_F ) {
>  if( is_literal($r1->field) ) {
> -  _Float128 output __attribute__ ((__unused__));
> -  auto input = $r1->field->data.initial;
> -  auto local = xstrdup(input), pend = local;
> -  std::replace(local, local + strlen(local), ',',
> '.');
> -  std::remove_if(local, local + strlen(local),
> isspace);
> -  output = strtof128(local, &pend);
> -  // bad if strtof128 could not convert input
> -  if( *pend != '\0' ) {
> -error_msg(@r1, "'%s' is not a numeric string",
> input);
> -  }
> +   // we assume $r1->field->data.initial parses as
float
>  }
>}
>if( ! intrinsic_call_1($$, $1, $r1, @r1)) YYERROR;
> @@ -12065,20 +12056,18 @@ static REAL_VALUE_TYPE
>  numstr2i( const char input[], radix_t radix ) {
>REAL_VALUE_TYPE output;
>size_t integer = 0;
> -  int erc=0, n=0;
> +  int erc=0;
> 
>switch( radix ) {
>case decimal_e: { // Use decimal point for comma, just in case.
> -  auto local = xstrdup(input), pend = local;
> +  auto local = xstrdup(input);
>if( !local ) { erc = -1; break; }
>std::replace(local, local + strlen(local), ',', '.');
>real_from_string3 (&output, local, TYPE_MODE
(float128_type_node));
> -  strtof128(local, &pend);
> -  n = pend - local;
>  }
>  break;
>case hexadecimal_e:
> -erc = sscanf(input, "%zx%n", &integer, &n);
> +erc = sscanf(input, "%zx", &integer);
>  real_from_integer (&output, VOIDmode, integer, UNSIGNED);
>  break;
>case boolean_e:
> @@ -12101,7 +12090,7 @@ numstr2i( const char input[], radix_t radix ) {
>  real_from_integer (&output, VOIDmode, integer, UNSIGNED);
>  return output;
>}
> -  if( erc == -1 || n < int(strlen(input)) ) {
> +  if( erc == -1 ) {
>  yywarn("'%s' was accepted as %lld", input, output);
>}
>return output;
> diff --git a/gcc/cobol/symbols.h b/gcc/cobol/symbols.h
> index 72bb188ec5b..35e4d816233 100644
> --- a/gcc/cobol/symbols.h
> +++ b/gcc/cobol/symbols.h
> @@ -48,17 +48,6 @@
> 
>  #define PICTURE_MAX 64
> 
> -#if ! (__HAVE_FLOAT128 && __GLIBC_USE (IEC_60559_TYPES_EXT))
> -static_assert( sizeof(output) == sizeof(long double), "long doubles?"
);
> -
> -// ???  This is still used for verificataion that __nptr parses as
> -// float number via setting *__endptr.
> -static inline _Float128
> -strtof128 (const char *__restrict __nptr, char **__restrict __endptr) {
> -  return strtold(nptr, endptr);
> -}
> -#endif
> -
>  extern const char *numed_message;
> 
>  enum cbl_dialect_t {
> @@ -352,15 +341,6 @@ struct cbl_field_data_t {
>std::replace(input.begin(), input.end(), ',', '.');
>  }
> 
> -char *pend = NULL;
> -
> -strtof128(input.c_str(), &pend);
> -
> -if( pend != input.c_str() + len ) {
> -  dbgmsg("%s: error: could not interpret '%s' of '%s' as a number",
> - __func__, pend, initial);
> -}
> -
>  REAL_VALUE_TYPE r;
>  real_from_string (&r, input.c_str());
>  r = real_value_truncate (TYPE_MODE (float128_type_node), r);
> --
> 2.43.0
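
For readers unfamiliar with the pattern, the kind of check the quoted patch removes (convert, then see whether the end pointer reached the end of the string) can be sketched in isolation. This is a minimal sketch, not the COBOL front-end code: consumes_whole_string is a made-up name, and std::strtold is used here in place of strtof128.

```cpp
#include <cstdlib>

// Hypothetical helper: returns true only if the whole string was consumed
// by the floating-point conversion.  The converted value itself is
// discarded; only the end pointer is inspected.
bool consumes_whole_string(const char* text)
{
  char* end = nullptr;
  std::strtold(text, &end);
  // end == text means no conversion happened at all;
  // *end != '\0' means trailing characters were left over.
  return end != text && *end == '\0';
}
```

If lexing already guarantees that the literal is a well-formed number, a check of this shape can never fail, which is the argument the patch makes for dropping it.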


Re: [PATCH v2] RISC-V: Fix wrong LMUL when only implict zve32f.

2025-03-25 Thread Robin Dapp

zve32x_zvl64b will have the same requirement as zve32x_zvl32b.
I mean, e16,mf4 could be allowed on zve32x_zvl64b, but it is also
spec-conformant for an implementation to raise an illegal instruction
on e16,mf4, which means e16,mf4 is not safe to use on zve32x/zve32f.


OK I see, thanks.  Sometimes I really wish the spec would be a bit more
ambitious in its defaults ;)  So I suppose we should change all
TARGET_MIN_VLEN > 32 in vector-iterators.md as well then?

--
Regards
Robin



Re: [Patch, v2] libgomp.texi: Document supported OpenMP 'interop' types for nvptx and gcn

2025-03-25 Thread Tobias Burnus

Sandra Loosemore wrote:
I find the part about the "associated named constant" really 
confusing; I am not sure those are for property identifiers or 
property values. Can you give an example, or actually list the named 
constants individually? 


Well, the property is called 'dev_num'; the identifier (named constant) 
is 'omp_ipr_dev_num', which is an enum in C/C++ and a parameter in Fortran.


* * *

To obtain the name of a property (range: omp_ipr_first to 
omp_get_num_interop_properties()), you can call:


printf ("%s\n", omp_get_interop_name (obj, -5));

printf ("%s\n", omp_get_interop_name (obj, omp_ipr_dev_num));

will print "dev_num" (twice).

In the second call, the named constant is used.

* * *

Typical code to obtain the property value would be:

int dev_num = omp_get_interop_int (obj, omp_ipr_dev_num, NULL);

* * *

Regarding:


+To each listed property, an associated named constant exists with prefix
+@code{omp_ipr_}.  Note that @code{device_num} is the OpenMP device 
number
+while @code{device} is the HIP device number or HSA device handle. 


Any suggestion how to handle this best?

Listing both the property name and the named constant seems to be a bit 
pointless – and I fear linebreaks will appear in the table when viewed 
as 'info' file.


I could remove the first quoted sentence, but it might help the user 
when calling the API function.


For the string-valued constants in the table, please include the 
quotes, unless those are identifiers instead of string literal. 


Using nice quotes like in ‘…’ as with @samp (in info, HTML, PDF it 
appears as such)—or just, e.g.,  "amd" or 'nvidia', i.e. plain quotes?


Tobias



Re: [PATCH v2] c++: Don't replace INDIRECT_REFs by a const capture proxy too eagerly [PR117504]

2025-03-25 Thread Simon Martin
Hi,

On Thu Mar 6, 2025 at 10:15 AM CET, Simon Martin wrote:
> On Wed Mar 5, 2025 at 10:32 PM CET, Jason Merrill wrote:
>> On 3/5/25 6:58 AM, Simon Martin wrote:
>>> Hi Jason,
>>> 
>>> On Tue Mar 4, 2025 at 11:47 PM CET, Jason Merrill wrote:
>>>> On 2/14/25 12:08 PM, Simon Martin wrote:
>>>>> We have been miscompiling the following valid code since GCC8, and
>>>>> r8-3497-g281e6c1d8f1b4c
>>>>>
>>>>> === cut here ===
>>>>> struct span {
>>>>>   span (const int (&__first)[1]) : _M_ptr (__first) {}
>>>>>   int operator[] (long __i) { return _M_ptr[__i]; }
>>>>>   const int *_M_ptr;
>>>>> };
>>>>> void foo () {
>>>>>   constexpr int a_vec[]{1};
>>>>>   auto vec{[&a_vec]() -> span { return a_vec; }()};
>>>>> }
>>>>> === cut here ===
>>>>>
>>>>> The problem is that perform_implicit_conversion_flags (via
>>>>> mark_rvalue_use) replaces "a_vec" in the return statement by a
>>>>> CONSTRUCTOR representing a_vec's constant value, and then takes its
>>>>> address when invoking span's constructor. So we end up with an instance
>>>>> that points to garbage instead of a_vec's storage.
>>>>>
>>>>> I've tried many things to somehow recover from this replacement, but I
>>>>> actually think we should not do it when converting to a class type: we
>>>>> have no idea whether the conversion will involve a constructor taking an
>>>>> address or reference. So we should assume it's the case, and call
>>>>> mark_lvalue_use, not mark_rvalue_use (I might very well be overlooking
>>>>> things, and feedback is more than welcome).
>>>>
>>>> Yeah, those mark_*_use calls seem misplaced, they should be called
>>>> instead by the code that actually does the conversion.
>>>>
>>>> What if we replace all of them here with just mark_exp_read?  Or nothing?
>>> Thanks for the suggestions; simply removing those calls actually works. This
>>> is what the attached updated patch does.
>>> 
>>> Successfully tested on x86_64-pc-linux-gnu. OK for trunk? And if so, OK for
>>> branches after 2-3 weeks since it's a wrong code bug?
>>
>> OK, yes.
> Thanks, merged as r15-7849-gfdf846fdddcc04.
>
> I'll reply to this thread when I've backported this patch to active branches,
> in 2-3 weeks.
To close the loop here, I've backported this fix as
  r13-9450-gf10853a0087bc115c8ee1ddb5fc641bffdb7f1a4
and
  r14-11445-gf078a613bf85eff138c2567b599779dee6ae4b22.

Simon



Re: [PATCH] OpenMP: Create additional interop objects with append_args.

2025-03-25 Thread Sandra Loosemore

On 3/23/25 14:28, Tobias Burnus wrote:

[snip]

Thus, please use GOMP_DEVICE_DEFAULT_OMP_61 for:

+  if (dispatch_device_num == NULL_TREE)
+    /* Not remapping device number.  */
+    dispatch_device_num = build_int_cst (integer_type_node,
+ GOMP_DEVICE_ICV);


Done.


(B) prefer_type & Fortran

The other issue is exposed by the attached testcase. It has

append_args(interop(target),
     interop(prefer_type("cuda","hip"), targetsync),
     interop(target,targetsync,prefer_type({attr("ompx_foo")})))

but the prefer_type is not passed on.

I assume that's my fault by not handling it correctly in
gcc/fortran/trans-openmp.cc, but if you could have a look?


This turned out to be an accidental redefinition of a variable that 
shadowed one of the same name in an outer scope.  I added a simplified 
version of your Fortran testcase, too.


I've also fixed the documentation to reflect that append_args is now 
fully supported.  New version of the patch attached; is this one OK to 
commit?


-Sandra

From b1291a3088375538d187e1965b4eb8a8274fb9e8 Mon Sep 17 00:00:00 2001
From: Sandra Loosemore 
Date: Tue, 25 Mar 2025 15:55:45 +
Subject: [PATCH v2] OpenMP: Create additional interop objects with append_args.

This patch adds support for the case where #pragma omp declare variant
with append_args is used inside a #pragma omp dispatch interop that
specifies fewer interop args than required by the variant; new interop
objects are implicitly created and then destroyed around the call to the
variant, using the GOMP_interop builtin.

gcc/fortran/ChangeLog
	* trans-openmp.cc (gfc_trans_omp_declare_variant): Remove accidental
	redeclaration of pref.

gcc/ChangeLog
	* gimplify.cc (modify_call_for_omp_dispatch): Adjust arguments.
	Remove the "sorry" for the case where new interop objects must be
	constructed, and add code to make it work instead.
	(expand_variant_call_expr): Adjust arguments and call to
	modify_call_for_omp_dispatch.
	(gimplify_variant_call_expr): Simplify logic for calling
	expand_variant_call_expr.

gcc/testsuite/ChangeLog
	* c-c++-common/gomp/append-args-1.c: Adjust expected behavior.
	* c-c++-common/gomp/append-args-interop.c: New.
	* c-c++-common/gomp/dispatch-11.c: Adjust expected behavior.
	* g++.dg/gomp/append-args-1.C: Likewise.
	* gfortran.dg/gomp/append-args-interop.f90: New.
	* gfortran.dg/gomp/declare-variant-mod-2.f90: Adjust expected behavior.

libgomp/ChangeLog
	* libgomp.texi (OpenMP 5.1): Mark append_args as fully supported.

Co-Authored-By: Tobias Burnus 
---
 gcc/fortran/trans-openmp.cc   |   4 +-
 gcc/gimplify.cc   | 229 +++---
 .../c-c++-common/gomp/append-args-1.c |  14 +-
 .../c-c++-common/gomp/append-args-interop.c   |  44 
 gcc/testsuite/c-c++-common/gomp/dispatch-11.c |   3 -
 gcc/testsuite/g++.dg/gomp/append-args-1.C |  18 +-
 .../gfortran.dg/gomp/append-args-interop.f90  |  27 +++
 .../gomp/declare-variant-mod-2.f90|   6 -
 libgomp/libgomp.texi  |   3 +-
 9 files changed, 275 insertions(+), 73 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/gomp/append-args-interop.c
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/append-args-interop.f90

diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc
index bf8c34172c3..03d94326bc8 100644
--- a/gcc/fortran/trans-openmp.cc
+++ b/gcc/fortran/trans-openmp.cc
@@ -8980,8 +8980,8 @@ gfc_trans_omp_declare_variant (gfc_namespace *ns, gfc_namespace *parent_ns)
 			  tree pref = NULL_TREE;
 			  if (n->u.init.len)
 			{
-			  tree pref = build_string (n->u.init.len,
-			n->u2.init_interop);
+			  pref = build_string (n->u.init.len,
+		   n->u2.init_interop);
 			  TREE_TYPE (pref) = build_array_type_nelts (
 		   unsigned_char_type_node,
 		   n->u.init.len);
diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
index 354c3d663e7..2364de081ff 100644
--- a/gcc/gimplify.cc
+++ b/gcc/gimplify.cc
@@ -3874,8 +3874,8 @@ find_supercontext (void)
 
 /* OpenMP: Handle the append_args and adjust_args clauses of
declare_variant for EXPR, which is a CALL_EXPR whose CALL_EXPR_FN
-   is the variant, within a dispatch construct with clauses DISPATCH_CLAUSES
-   and location DISPATCH_LOC.
+   is the variant, within a dispatch construct with clauses DISPATCH_CLAUSES.
+   WANT_VALUE and POINTERIZE are as for expand_variant_call_expr.
 
'append_args' causes interop objects are added after the last regular
(nonhidden, nonvariadic) arguments of the variant function.
@@ -3885,7 +3885,7 @@ find_supercontext (void)
address.  */
 static tree
 modify_call_for_omp_dispatch (tree expr, tree dispatch_clauses,
-			  location_t dispatch_loc)
+			  bool want_value, bool pointerize)
 {
   tree fndecl = get_callee_fndecl (expr);
 
@@ -3893,9 +3893,11 @@ modify_call_for_omp_dispatch (tree expr, tree dispatch_clauses,
   if (!fndecl)
 return expr;
 
+  tree ini

[PATCH v3] libstdc++: Add P1206R7 range operations to std::deque [PR111055]

2025-03-25 Thread Tomasz Kamiński
This is another piece of P1206R7, adding from_range constructor, append_range,
prepend_range, insert_range, and assign_range members to std::deque.

For prepend_range of an input non-sized range, we emplace the elements at the
front and then reverse the inserted elements. This does not affect existing
elements and properly handles aliasing ranges.
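
The emplace-then-reverse idea can be sketched with plain standard-library calls. This is only an illustration of the technique, not the libstdc++ implementation: prepend_single_pass is a made-up name, and it works on an iterator pair rather than on a range.

```cpp
#include <algorithm>
#include <cstddef>
#include <deque>

// Hypothetical helper: prepends a single-pass sequence of unknown length.
// Each element is emplaced at the front, which leaves the new elements in
// reverse order; reversing just those n elements restores source order.
// Only the newly inserted elements are swapped, so the values of the
// pre-existing elements are never touched.
template<typename T, typename InputIt>
void prepend_single_pass(std::deque<T>& d, InputIt first, InputIt last)
{
  std::size_t n = 0;
  for (; first != last; ++first, ++n)
    d.emplace_front(*first);
  std::reverse(d.begin(), d.begin() + static_cast<std::ptrdiff_t>(n));
}
```

For example, prepending 1, 2, 3 to a deque holding 4, 5 first produces 3, 2, 1, 4, 5 and, after the reverse, 1, 2, 3, 4, 5.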

For insert_range, insertion in the middle for input-only ranges that are
sized could be optimized; we still insert nodes one by one in that case.
For forward and stronger ranges, we reduce them to the common_range case by
computing the iterator when computing the distance. This is slightly
suboptimal, as it requires the range to be iterated for non-common forward
ranges that are sized, but it reduces the number of instantiations.

This patch extracts the _M_range_prepend and _M_range_append helper functions,
which accept an (iterator, sentinel) pair. They are used in all standard modes.

PR libstdc++/111055

libstdc++-v3/ChangeLog:

* include/bits/deque.tcc (deque::prepend_range, deque::append_range)
(deque::insert_range, _advance_dist): Define.
(deque::_M_range_prepend, deque::_M_range_append):
Extract from _M_range_insert_aux for _ForwardIterator(s).
* include/bits/stl_deque.h (deque::assign_range): Define.
(deque::prepend_range, deque::append_range, deque::insert_range):
Declare.
deque(from_range_t, _Rg&&, const allocator_type&): Define constructor
and deduction guide.
* include/debug/deque (deque::prepend_range, deque::append_range),
(assign_range): Define.
deque(from_range_t, _Rg&&, const allocator_type&): Define constructor
and deduction guide.
* testsuite/23_containers/deque/cons/from_range.cc: New test.
* testsuite/23_containers/deque/modifiers/append_range.cc: New test.
* testsuite/23_containers/deque/modifiers/assign/assign_range.cc: New test.
* testsuite/23_containers/deque/modifiers/prepend_range.cc: New test.
---
Expanded the documentation comments. Added tests for the stability of
pointers/references to existing elements when inserting at the front/end.

 libstdc++-v3/include/bits/deque.tcc   | 229 +++---
 libstdc++-v3/include/bits/stl_deque.h | 110 +
 libstdc++-v3/include/debug/deque  |  51 
 .../23_containers/deque/cons/from_range.cc| 117 +
 .../deque/modifiers/append_range.cc   | 128 ++
 .../deque/modifiers/assign/assign_range.cc| 109 +
 .../deque/modifiers/insert/insert_range.cc| 142 +++
 .../deque/modifiers/prepend_range.cc  | 140 +++
 8 files changed, 995 insertions(+), 31 deletions(-)
 create mode 100644 libstdc++-v3/testsuite/23_containers/deque/cons/from_range.cc
 create mode 100644 libstdc++-v3/testsuite/23_containers/deque/modifiers/append_range.cc
 create mode 100644 libstdc++-v3/testsuite/23_containers/deque/modifiers/assign/assign_range.cc
 create mode 100644 libstdc++-v3/testsuite/23_containers/deque/modifiers/insert/insert_range.cc
 create mode 100644 libstdc++-v3/testsuite/23_containers/deque/modifiers/prepend_range.cc

diff --git a/libstdc++-v3/include/bits/deque.tcc b/libstdc++-v3/include/bits/deque.tcc
index fcbecca55b4..6de76ebc116 100644
--- a/libstdc++-v3/include/bits/deque.tcc
+++ b/libstdc++-v3/include/bits/deque.tcc
@@ -583,6 +583,51 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   this->_M_impl._M_start._M_cur = this->_M_impl._M_start._M_first;
 }
 
+  template 
+template 
+  void
+  deque<_Tp, _Alloc>::
+  _M_range_prepend(_InputIterator __first, _Sentinel __last,
+ size_type __n)
+  {
+iterator __new_start = _M_reserve_elements_at_front(__n);
+__try
+  {
+std::__uninitialized_copy_a(_GLIBCXX_MOVE(__first), __last,
+__new_start, _M_get_Tp_allocator());
+this->_M_impl._M_start = __new_start;
+  }
+__catch(...)
+  {
+_M_destroy_nodes(__new_start._M_node,
+ this->_M_impl._M_start._M_node);
+__throw_exception_again;
+  }
+  }
+
+  template 
+template 
+  void
+  deque<_Tp, _Alloc>::
+  _M_range_append(_InputIterator __first, _Sentinel __last,
+ size_type __n)
+ {
+   iterator __new_finish = _M_reserve_elements_at_back(__n);
+   __try
+   {
+ std::__uninitialized_copy_a(_GLIBCXX_MOVE(__first), __last,
+ this->_M_impl._M_finish,
+ _M_get_Tp_allocator());
+ this->_M_impl._M_finish = __new_finish;
+   }
+  __catch(...)
+   {
+ _M_destroy_nodes(this->_M_impl._M_finish._M_node + 1,
+  __new_finish._M_node + 1);
+ __throw_exception_again;
+   }
+ }
+
   template 
 template 
   

C++: Adjust implicit '__cxa_bad_typeid' prototype to reality

2025-03-25 Thread Thomas Schwinge
Hi!

On 2025-03-24T13:38:56-0400, Jason Merrill  wrote:
> On 3/24/25 7:02 AM, Thomas Schwinge wrote:
>> On 2025-03-21T15:46:01+0100, I wrote:
>>> On 2025-03-19T14:25:49+, Jonathan Wakely  wrote:
>>>> On Wed, 19 Mar 2025 at 14:21, Marek Polacek  wrote:
>>>>> On Wed, Mar 19, 2025 at 12:38:31PM +0100, Thomas Schwinge wrote:
>>>>>> --- a/gcc/cp/rtti.cc
>>>>>> +++ b/gcc/cp/rtti.cc
>>>>>> @@ -198,7 +198,7 @@ throw_bad_cast (void)
>>>>>> fn = get_global_binding (name);
>>>>>> if (!fn)
>>>>>>    fn = push_throw_library_fn
>>>>>> -   (name, build_function_type_list (ptr_type_node, NULL_TREE));
>>>>>> +   (name, build_function_type_list (void_type_node, NULL_TREE));
>>>>>>   }
>>>>>>
>>>>>> return build_cxx_call (fn, 0, NULL, tf_warning_or_error);
>>>>>
>>>>> LGTM, matches what I see in abi-eh.html from the itanium-cxx-abi.
>> 
>>> [...] I've now pushed to trunk branch:
>>> commit 618c42d23726be6e2086d452d6718abe5e0daca8
>>> "C++: Adjust implicit '__cxa_bad_cast' prototype to reality", [...]
>> 
>> So, a similar problem exists for '__cxa_bad_typeid'.  See the attached
>> "[WIP] C++: Adjust implicit '__cxa_bad_typeid' prototype to reality" for
>> what seemed to be the corresponding patch for that one.  However, that
>> isn't sufficient; we run into internal errors like:
>> 
>>  [...]/g++.dg/rtti/typeid1.C: In function 'int main()':
>>  [...]/g++.dg/rtti/typeid1.C:9:12: error: lvalue required as unary '&' operand
>> 
>> I got lost in the C++ front end code trying to understand how to resolve
>> this mismatch.  Anyone able to advise, please?
>
> This addition seems to resolve it: [...]

Yes, thanks!  OK to push the attached
"C++: Adjust implicit '__cxa_bad_typeid' prototype to reality"?


Grüße
 Thomas


From 6f8045a4976c52f49562e5e3dc7919100032 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Wed, 19 Mar 2025 12:18:26 +0100
Subject: [PATCH] C++: Adjust implicit '__cxa_bad_typeid' prototype to reality

In 2001 Subversion r40924 (Git commit 52a11cbfcf0cfb32628b6953588b6af4037ac0b6)
"IA-64 ABI Exception Handling", '__cxa_bad_typeid' changed from
'std::type_info const &' to 'void' return type:

--- libstdc++-v3/libsupc++/exception_support.cc
+++ /dev/null
@@ -1,388 +0,0 @@
-[...]
-// Helpers for rtti. Although these don't return, we give them return types so
-// that the type system is not broken.
-[...]
-extern "C" std::type_info const &
-__cxa_bad_typeid ()
-{
-  [...]
-}
-[...]

--- /dev/null
+++ libstdc++-v3/libsupc++/unwind-cxx.h
@@ -0,0 +1,163 @@
+[...]
+extern "C" void __cxa_bad_typeid ();
+[...]

--- /dev/null
+++ libstdc++-v3/libsupc++/eh_aux_runtime.cc
@@ -0,0 +1,56 @@
+[...]
+extern "C" void
+__cxa_bad_typeid ()
+{
+  [...]
+}

The implicit prototype in the C++ front end however wasn't likewise adjusted,
and so for nvptx we generate code for 'std::type_info const &' return type:

// BEGIN GLOBAL FUNCTION DECL: __cxa_bad_typeid
.extern .func (.param .u64 %value_out) __cxa_bad_typeid;

{
.param .u64 %value_in;
call (%value_in),__cxa_bad_typeid;
trap;
// (noreturn)
exit;
// (noreturn)
ld.param.u64 %r39,[%value_in];
}

..., which is in conflict with the library code with 'void' return type:

// BEGIN GLOBAL FUNCTION DECL: __cxa_bad_typeid
.visible .func __cxa_bad_typeid;

// BEGIN GLOBAL FUNCTION DEF: __cxa_bad_typeid
.visible .func __cxa_bad_typeid
{
[...]
}

..., and we thus get execution test FAILs for 'g++.dg/rtti/typeid11.C', for
example:

error   : Prototype doesn't match for '__cxa_bad_typeid' in 'input file 4 at offset 22204', first defined in 'input file 4 at offset 22204'
nvptx-run: cuLinkAddData failed: unknown error (CUDA_ERROR_UNKNOWN, 999)

With this patch applied, we get the expected:

 // BEGIN GLOBAL FUNCTION DECL: __cxa_bad_typeid
-.extern .func (.param .u64 %value_out) __cxa_bad_typeid;
+.extern .func __cxa_bad_typeid;

 {
-.param .u64 %value_in;
-call (%value_in),__cxa_bad_typeid;
+call __cxa_bad_typeid;
 trap;
 // (noreturn)
 exit;
 // (noreturn)
-ld.param.u64 %r39,[%value_in];
 }

..., and execution test PASSes for a few test cases.
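
For context, a minimal sketch of user code that reaches this path: applying
'typeid' to a dereferenced null pointer of polymorphic type throws
'std::bad_typeid', and the compiler-generated null check is what calls
'__cxa_bad_typeid'.  The names 'Base' and 'throws_bad_typeid' are made up for
this illustration and do not come from the test suite.

```cpp
#include <cassert>
#include <typeinfo>

// Hypothetical polymorphic type; typeid is only evaluated at run time
// (and only null-checked) for glvalues of polymorphic class type.
struct Base
{
  virtual ~Base() { }
};

// Returns true if evaluating typeid (*p) throws std::bad_typeid.
bool throws_bad_typeid(const Base* p)
{
  try
    {
      // For a null P, the front end emits a call that raises
      // std::bad_typeid instead of reading the vtable.
      return typeid(*p).name() == nullptr;
    }
  catch (const std::bad_typeid&)
    {
      return true;
    }
}
```

Since the callee never returns, its return type is irrelevant to the caller's
logic, but the declared prototype still has to match the library definition on
targets such as nvptx that check it at link time.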

	gcc/cp/
	* rtti.cc (throw_bad_typeid): Adjust implicit '__cxa_bad_typeid'
	prototype to reality.  Adjust all users.

Co-authored-by: Jason Merrill 
---
 gcc/cp/rtti.cc | 18 +++---
 1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/gcc/cp/rtti.cc b/gcc/cp/rtti.cc
index 158b5ba13979..353996206f5c 100644
--- a/gcc/cp/rtti.cc
+++ b/gcc/cp/rtti.cc
@@ -204,8 +204,7 @@ throw_bad_cast (void)
   return build_cxx_call (fn, 0, NULL, tf_warning_or_error);
 }
 
-/* Return an expression for "__cxa_bad_typeid()".  The expression
-   returned is an lvalue of type "const std::type_info".  */
+/* See 'libstdc++-v3/libsupc++/eh_aux_runtime.cc' for '__cxa_bad_type

[committed] arm: testsuite use -std=gnu17 for pr65647.c

2025-03-25 Thread Richard Earnshaw
This test has missing prototypes.  To avoid disturbing the test, use gnu17.

gcc/testsuite/ChangeLog:

* gcc.target/arm/pr65647.c (dg-options): Add -std=gnu17.
---
 gcc/testsuite/gcc.target/arm/pr65647.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/arm/pr65647.c b/gcc/testsuite/gcc.target/arm/pr65647.c
index e0c534bc813..663157c9c66 100644
--- a/gcc/testsuite/gcc.target/arm/pr65647.c
+++ b/gcc/testsuite/gcc.target/arm/pr65647.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_arch_v6m_ok } */
-/* { dg-options "-O3 -w -fpermissive" } */
+/* { dg-options "-O3 -w -fpermissive -std=gnu17" } */
 /* { dg-add-options arm_arch_v6m } */
 
 a, b, c, e, g = &e, h, i = 7, l = 1, m, n, o, q = &m, r, s = &r, u, w = 9, x,
-- 
2.34.1



Re: C++: Adjust implicit '__cxa_bad_typeid' prototype to reality

2025-03-25 Thread Jason Merrill

On 3/25/25 9:17 AM, Thomas Schwinge wrote:
> Hi!
>
> On 2025-03-24T13:38:56-0400, Jason Merrill  wrote:
>> On 3/24/25 7:02 AM, Thomas Schwinge wrote:
>>> On 2025-03-21T15:46:01+0100, I wrote:
>>>> On 2025-03-19T14:25:49+, Jonathan Wakely  wrote:
>>>>> On Wed, 19 Mar 2025 at 14:21, Marek Polacek  wrote:
>>>>>> On Wed, Mar 19, 2025 at 12:38:31PM +0100, Thomas Schwinge wrote:
>>>>>>> --- a/gcc/cp/rtti.cc
>>>>>>> +++ b/gcc/cp/rtti.cc
>>>>>>> @@ -198,7 +198,7 @@ throw_bad_cast (void)
>>>>>>>  fn = get_global_binding (name);
>>>>>>>  if (!fn)
>>>>>>> fn = push_throw_library_fn
>>>>>>> -   (name, build_function_type_list (ptr_type_node, NULL_TREE));
>>>>>>> +   (name, build_function_type_list (void_type_node, NULL_TREE));
>>>>>>>    }
>>>>>>>
>>>>>>>  return build_cxx_call (fn, 0, NULL, tf_warning_or_error);
>>>>>>
>>>>>> LGTM, matches what I see in abi-eh.html from the itanium-cxx-abi.
>>>
>>>> [...] I've now pushed to trunk branch:
>>>> commit 618c42d23726be6e2086d452d6718abe5e0daca8
>>>> "C++: Adjust implicit '__cxa_bad_cast' prototype to reality", [...]
>>>
>>> So, a similar problem exists for '__cxa_bad_typeid'.  See the attached
>>> "[WIP] C++: Adjust implicit '__cxa_bad_typeid' prototype to reality" for
>>> what seemed to be the corresponding patch for that one.  However, that
>>> isn't sufficient; we run into internal errors like:
>>>
>>>   [...]/g++.dg/rtti/typeid1.C: In function 'int main()':
>>>   [...]/g++.dg/rtti/typeid1.C:9:12: error: lvalue required as unary '&' operand
>>>
>>> I got lost in the C++ front end code trying to understand how to resolve
>>> this mismatch.  Anyone able to advise, please?
>>
>> This addition seems to resolve it: [...]
>
> Yes, thanks!  OK to push the attached
> "C++: Adjust implicit '__cxa_bad_typeid' prototype to reality"?

OK.



[committed] arm: testsuite: update expected output in vect-early-break-cbranch.c

2025-03-25 Thread Richard Earnshaw
Similar to r15-4930-gd56d2f3102ada3, update the branch operations when not
using CBN?Z for inverting the direction of the branch operations.
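
As a rough illustration of the relaxed pattern, the sketch below checks a
single instruction against an equivalent regular expression using std::regex.
The real test relies on the testsuite's function-body matching rather than
this code; the helper name and the '\s+' separator are assumptions made for
the example.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: accept either "bne" or "beq" followed by a local
// label such as ".L3", mirroring the b(ne|eq) alternation in the test.
bool matches_branch(const std::string& insn)
{
  static const std::regex pattern(R"(b(ne|eq)\s+\.L[0-9]+)");
  return std::regex_match(insn, pattern);
}
```

With this pattern both branch directions are accepted, while unrelated
conditional branches are still rejected.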

gcc/testsuite/ChangeLog:

* gcc.target/arm/vect-early-break-cbranch.c: Allow BEQ as well as BNE.
---
 .../gcc.target/arm/vect-early-break-cbranch.c| 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/vect-early-break-cbranch.c b/gcc/testsuite/gcc.target/arm/vect-early-break-cbranch.c
index 4dc0edd874b..045f143fb93 100644
--- a/gcc/testsuite/gcc.target/arm/vect-early-break-cbranch.c
+++ b/gcc/testsuite/gcc.target/arm/vect-early-break-cbranch.c
@@ -18,7 +18,7 @@ int b[N] = {0};
 ** vmovr[0-9]+, s[0-9]+@ int
 ** (
 ** cmp r[0-9]+, #0
-** bne \.L[0-9]+
+** b(ne|eq)\.L[0-9]+
 ** |
 ** cbn?z   r[0-9]+, \.L.+
 ** )
@@ -43,7 +43,7 @@ void f1 ()
 ** vmovr[0-9]+, s[0-9]+@ int
 ** (
 ** cmp r[0-9]+, #0
-** bne \.L[0-9]+
+** b(ne|eq)\.L[0-9]+
 ** |
 ** cbn?z   r[0-9]+, \.L.+
 ** )
@@ -68,7 +68,7 @@ void f2 ()
 ** vmovr[0-9]+, s[0-9]+@ int
 ** (
 ** cmp r[0-9]+, #0
-** bne \.L[0-9]+
+** b(ne|eq)\.L[0-9]+
 ** |
 ** cbn?z   r[0-9]+, \.L.+
 ** )
@@ -94,7 +94,7 @@ void f3 ()
 ** vmovr[0-9]+, s[0-9]+@ int
 ** (
 ** cmp r[0-9]+, #0
-** bne \.L[0-9]+
+** b(ne|eq)\.L[0-9]+
 ** |
 ** cbn?z   r[0-9]+, \.L.+
 ** )
@@ -119,7 +119,7 @@ void f4 ()
 ** vmovr[0-9]+, s[0-9]+@ int
 ** (
 ** cmp r[0-9]+, #0
-** bne \.L[0-9]+
+** b(ne|eq)\.L[0-9]+
 ** |
 ** cbn?z   r[0-9]+, \.L.+
 ** )
@@ -144,7 +144,7 @@ void f5 ()
 ** vmovr[0-9]+, s[0-9]+@ int
 ** (
 ** cmp r[0-9]+, #0
-** bne \.L[0-9]+
+** b(ne|eq)\.L[0-9]+
 ** |
 ** cbn?z   r[0-9]+, \.L.+
 ** )
-- 
2.34.1



[committed] arm: testsuite: avoid dg-options in primary LTO file

2025-03-25 Thread Richard Earnshaw
As this is the primary LTO file in the test, it cannot use dg-options.  Move
the flags from there to dg-lto-options.

gcc/testsuite/ChangeLog:

* gcc.target/arm/lto/pr96939_0.c (dg-options):  Delete.  Move the
options from here ...
(dg-lto-options): ... to here.
---
 gcc/testsuite/gcc.target/arm/lto/pr96939_0.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/lto/pr96939_0.c b/gcc/testsuite/gcc.target/arm/lto/pr96939_0.c
index 21d2c1d70a4..8dfbc061009 100644
--- a/gcc/testsuite/gcc.target/arm/lto/pr96939_0.c
+++ b/gcc/testsuite/gcc.target/arm/lto/pr96939_0.c
@@ -1,8 +1,7 @@
 /* PR target/96939 */
 /* { dg-lto-do link } */
 /* { dg-require-effective-target arm_arch_v8a_link } */
-/* { dg-options "-mcpu=unset -march=armv8-a+simd -mfpu=auto" } */
-/* { dg-lto-options { { -flto -O2 } } } */
+/* { dg-lto-options { { -flto -O2 -mcpu=unset -march=armv8-a+simd -mfpu=auto} } } */
 
 extern unsigned crc (unsigned, const void *);
 typedef unsigned (*fnptr) (unsigned, const void *);
-- 
2.34.1



Re: [PATCH] RISC-V: disable the abd expander for gcc-15 release [PR119224]

2025-03-25 Thread Vineet Gupta
On 3/25/25 00:45, Robin Dapp wrote:
>> -  "TARGET_VECTOR"
>> +  "TARGET_VECTOR && 0"
> Would you mind adding a comment here before committing, maybe even reference
> the PR?  Not that we want to keep this around for long anyway but just to make
> sure :)

Of course, I pondered the same but then laziness took over.

-Vineet


Re: [PATCH] RISC-V: Remove the priority in FMV ASM name mangling

2025-03-25 Thread Jeff Law




On 3/25/25 8:04 AM, Yangyu Chen wrote:
> On 25/3/2025 21:23, Kito Cheng wrote:
>> Will it only cause issues with this patch
>> https://gcc.gnu.org/pipermail/gcc-patches/2025-March/678918.html
>
> Yes. But I think we can merge this first.

But if this patch doesn't fix a regression or possibly a correctness
issue, then it really needs to defer to gcc-16.

Jeff


