Re: [PATCH] libstdc++: Implement stringstream from string_view [P2495R3]

2025-05-20 Thread Jonathan Wakely
On Tue, 20 May 2025 at 09:10, Tomasz Kaminski  wrote:
>
>
>
> On Mon, May 19, 2025 at 11:28 PM Nathan Myers  wrote:
>> +void
>> +test02()
>> +{
>> +  // Test C++26 constructors taking string views using different allocators
>> +
>> +  using alloc_type = __gnu_test::tracker_allocator;
>
> I would use __gnu_test::uneq_allocator<>, as it has state (int) that is 
> checked for equality in VERIFY.

You can even stack them, so uneq_allocator>, if you really want to verify the number of
bytes allocated *and* that allocator identity is correctly propagated.
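
For readers without the testsuite headers at hand, here is a minimal sketch of
what "allocator identity" means in this context.  The id_allocator type and the
id_string alias are hypothetical stand-ins written for this illustration; they
are not __gnu_test::uneq_allocator or tracker_allocator, and they only model
the "state (int) compared in operator==" part.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical stateful allocator: two instances compare equal only if
// they carry the same id, so a test can check which one was propagated.
template<typename T>
struct id_allocator
{
  using value_type = T;

  int id = 0;

  id_allocator() = default;
  explicit id_allocator(int i) noexcept : id(i) { }

  template<typename U>
    id_allocator(const id_allocator<U>& other) noexcept : id(other.id) { }

  T* allocate(std::size_t n)
  { return std::allocator<T>().allocate(n); }

  void deallocate(T* p, std::size_t n)
  { std::allocator<T>().deallocate(p, n); }

  template<typename U>
    bool operator==(const id_allocator<U>& other) const noexcept
    { return id == other.id; }

  template<typename U>
    bool operator!=(const id_allocator<U>& other) const noexcept
    { return id != other.id; }
};

// Hypothetical alias used only in this sketch.
using id_string
  = std::basic_string<char, std::char_traits<char>, id_allocator<char>>;
```

A test can then construct an object with id_allocator<char>(42) and check that
get_allocator() compares equal to that instance and unequal to one with a
different id.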


Re: [PATCH v1 5/6] libstdc++: Implement layout_stride from mdspan.

2025-05-20 Thread Luc Grosheintz




On 5/20/25 10:24 AM, Tomasz Kaminski wrote:

On Sun, May 18, 2025 at 10:16 PM Luc Grosheintz 
wrote:


Implements the remaining parts of layout_left and layout_right; and all
of layout_stride.
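
As a rough illustration of what a strided mapping computes, the sketch below
mirrors two pieces of the patch in plain C++: the default constructor's
row-major stride computation and the sum of index * stride used for the linear
index.  The function names and the use of std::array are assumptions made for
this example; this is not the libstdc++ implementation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Row-major strides: the last extent varies fastest, as in the loop of
// the default constructor in the patch.
template<std::size_t Rank>
std::array<std::size_t, Rank>
row_major_strides(const std::array<std::size_t, Rank>& extents)
{
  std::array<std::size_t, Rank> strides{};
  std::size_t stride = 1;
  for (std::size_t i = Rank; i > 0; --i)
    {
      strides[i - 1] = stride;
      stride *= extents[i - 1];
    }
  return strides;
}

// Linear index of a strided mapping: sum over all dimensions of
// index * stride.
template<std::size_t Rank>
std::size_t
linear_index(const std::array<std::size_t, Rank>& strides,
	     const std::array<std::size_t, Rank>& indices)
{
  std::size_t res = 0;
  for (std::size_t i = 0; i < Rank; ++i)
    res += indices[i] * strides[i];
  return res;
}
```

For extents 2 x 3 x 4 the strides are 12, 4, 1, so the index (1, 2, 3) maps to
12 + 8 + 3.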

libstdc++-v3/ChangeLog:

 * include/std/mdspan(layout_stride): New class.

Signed-off-by: Luc Grosheintz 
---
  libstdc++-v3/include/std/mdspan | 219 +++-
  1 file changed, 216 insertions(+), 3 deletions(-)

diff --git a/libstdc++-v3/include/std/mdspan
b/libstdc++-v3/include/std/mdspan
index b1984eb2a33..31a38c736c2 100644
--- a/libstdc++-v3/include/std/mdspan
+++ b/libstdc++-v3/include/std/mdspan
@@ -366,6 +366,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
class mapping;
};

+  struct layout_stride
+  {
+template
+  class mapping;
+  };
+
namespace __mdspan
{
  template
@@ -434,7 +440,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION

  template
concept __standardized_mapping = __mapping_of
-  || __mapping_of;
+  || __mapping_of
+  || __mapping_of;

  template
concept __mapping_like = requires
@@ -503,6 +510,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 : mapping(__other.extents(), __mdspan::__internal_ctor{})
 { }

+  template
+   requires (is_constructible_v)
+   constexpr explicit(extents_type::rank() > 0)
+   mapping(const layout_stride::mapping<_OExtents>& __other)
+   : mapping(__other.extents(), __mdspan::__internal_ctor{})
+   {
+ __glibcxx_assert(
+   layout_left::mapping<_OExtents>(__other.extents()) == __other);


Could this be *this == other?


+   }
+
constexpr mapping&
operator=(const mapping&) noexcept = default;

@@ -518,8 +535,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 constexpr index_type
 operator()(_Indices... __indices) const noexcept
 {
- return __mdspan::__linear_index_left(
-   this->extents(), static_cast(__indices)...);
+ return __mdspan::__linear_index_left(_M_extents,
+   static_cast(__indices)...);
 }

static constexpr bool
@@ -633,6 +650,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 : mapping(__other.extents(), __mdspan::__internal_ctor{})
 { }

+  template
+   requires (is_constructible_v)
+   constexpr explicit(extents_type::rank() > 0)
+   mapping(const layout_stride::mapping<_OExtents>& __other) noexcept
+   : mapping(__other.extents(), __mdspan::__internal_ctor{})
+   {
+ __glibcxx_assert(
+   layout_right::mapping<_OExtents>(__other.extents()) ==
__other);


Similarly here.


+   }
+
constexpr mapping&
operator=(const mapping&) noexcept = default;

@@ -695,6 +722,192 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 [[no_unique_address]] _Extents _M_extents;
  };

+  namespace __mdspan
+  {
+template
+  constexpr typename _Mapping::index_type
+  __offset_impl(const _Mapping& __m, index_sequence<_Counts...>)
noexcept
+  { return __m(((void) _Counts, 0)...); }
+
+template
+  constexpr typename _Mapping::index_type
+  __offset(const _Mapping& __m) noexcept
+  {


Again, I would define __impl as nested lambda here:
auto __impl = [&]<size_t... _Counts>(index_sequence<_Counts...>) noexcept
{ return __m(((void) _Counts, 0)...); };


+   return __offset_impl(__m,
+   make_index_sequence<_Mapping::extents_type::rank()>());
+  }
+
+template
+  constexpr typename _Mapping::index_type
+  __linear_index_strides(const _Mapping& __m,
+_Indices... __indices)
+  {
+   using _IndexType = typename _Mapping::index_type;
+   _IndexType __res = 0;
+   if constexpr (sizeof...(__indices) > 0)
+ {
+   auto __update = [&, __pos = 0u](_IndexType __idx) mutable
+ {
+   __res += __idx * __m.stride(__pos++);
+ };
+   (__update(__indices), ...);
+ }
+   return __res;
+  }
+  }
+
+  template
+class layout_stride::mapping
+{
+  static_assert(__mdspan::__layout_extent<_Extents>,
+   "The size of extents_type must be representable as index_type");
+
+public:
+  using extents_type = _Extents;
+  using index_type = typename extents_type::index_type;
+  using size_type = typename extents_type::size_type;
+  using rank_type = typename extents_type::rank_type;
+  using layout_type = layout_stride;
+
+  constexpr
+  mapping() noexcept
+  {
+   auto __stride = index_type(1);
+   for(size_t __i = extents_type::rank(); __i > 0; --__i)
+ {
+   _M_strides[__i - 1] = __stride;
+   __stride *= _M_extents.extent(__i - 1);
+ }
+  }
+
+  constexpr
+  mapping(const mapping&) noexcept = default;
+
+  template<__mdspan::__valid_index_type _OIndexType>
+   constexpr
+   mapping(co

Re: [PATCH v1 5/6] libstdc++: Implement layout_stride from mdspan.

2025-05-20 Thread Tomasz Kaminski
On Tue, May 20, 2025 at 10:45 AM Luc Grosheintz 
wrote:

>
>
> On 5/20/25 10:24 AM, Tomasz Kaminski wrote:
> > On Sun, May 18, 2025 at 10:16 PM Luc Grosheintz <
> luc.groshei...@gmail.com>
> > wrote:
> >
> >> Implements the remaining parts of layout_left and layout_right; and all
> >> of layout_stride.
> >>
> >> libstdc++-v3/ChangeLog:
> >>
> >>  * include/std/mdspan(layout_stride): New class.
> >>
> >> Signed-off-by: Luc Grosheintz 
> >> ---
> >>   libstdc++-v3/include/std/mdspan | 219 +++-
> >>   1 file changed, 216 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/libstdc++-v3/include/std/mdspan
> >> b/libstdc++-v3/include/std/mdspan
> >> index b1984eb2a33..31a38c736c2 100644
> >> --- a/libstdc++-v3/include/std/mdspan
> >> +++ b/libstdc++-v3/include/std/mdspan
> >> @@ -366,6 +366,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >> class mapping;
> >> };
> >>
> >> +  struct layout_stride
> >> +  {
> >> +template
> >> +  class mapping;
> >> +  };
> >> +
> >> namespace __mdspan
> >> {
> >>   template
> >> @@ -434,7 +440,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >>
> >>   template
> >> concept __standardized_mapping = __mapping_of _Mapping>
> >> -  || __mapping_of >> _Mapping>;
> >> +  || __mapping_of >> _Mapping>
> >> +  || __mapping_of >> _Mapping>;
> >>
> >>   template
> >> concept __mapping_like = requires
> >> @@ -503,6 +510,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >>  : mapping(__other.extents(), __mdspan::__internal_ctor{})
> >>  { }
> >>
> >> +  template
> >> +   requires (is_constructible_v)
> >> +   constexpr explicit(extents_type::rank() > 0)
> >> +   mapping(const layout_stride::mapping<_OExtents>& __other)
> >> +   : mapping(__other.extents(), __mdspan::__internal_ctor{})
> >> +   {
> >> + __glibcxx_assert(
> >> +   layout_left::mapping<_OExtents>(__other.extents()) ==
> __other);
> >>
> > Could this be *this == other?
> >
> >> +   }
> >> +
> >> constexpr mapping&
> >> operator=(const mapping&) noexcept = default;
> >>
> >> @@ -518,8 +535,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >>  constexpr index_type
> >>  operator()(_Indices... __indices) const noexcept
> >>  {
> >> - return __mdspan::__linear_index_left(
> >> -   this->extents(), static_cast(__indices)...);
> >> + return __mdspan::__linear_index_left(_M_extents,
> >> +   static_cast(__indices)...);
> >>  }
> >>
> >> static constexpr bool
> >> @@ -633,6 +650,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >>  : mapping(__other.extents(), __mdspan::__internal_ctor{})
> >>  { }
> >>
> >> +  template
> >> +   requires (is_constructible_v)
> >> +   constexpr explicit(extents_type::rank() > 0)
> >> +   mapping(const layout_stride::mapping<_OExtents>& __other)
> noexcept
> >> +   : mapping(__other.extents(), __mdspan::__internal_ctor{})
> >> +   {
> >> + __glibcxx_assert(
> >> +   layout_right::mapping<_OExtents>(__other.extents()) ==
> >> __other);
> >>
> > Similarly here.
> >
> >> +   }
> >> +
> >> constexpr mapping&
> >> operator=(const mapping&) noexcept = default;
> >>
> >> @@ -695,6 +722,192 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >>  [[no_unique_address]] _Extents _M_extents;
> >>   };
> >>
> >> +  namespace __mdspan
> >> +  {
> >> +template
> >> +  constexpr typename _Mapping::index_type
> >> +  __offset_impl(const _Mapping& __m, index_sequence<_Counts...>)
> >> noexcept
> >> +  { return __m(((void) _Counts, 0)...); }
> >> +
> >> +template
> >> +  constexpr typename _Mapping::index_type
> >> +  __offset(const _Mapping& __m) noexcept
> >> +  {
> >>
> > Again, I would define __impl as nested lambda here:
> > auto __impl = [&]<size_t... _Counts>(index_sequence<_Counts...>) noexcept
> > { return __m(((void) _Counts, 0)...); };
> >
> >> +   return __offset_impl(__m,
> >> +   make_index_sequence<_Mapping::extents_type::rank()>());
> >> +  }
> >> +
> >> +template
> >> +  constexpr typename _Mapping::index_type
> >> +  __linear_index_strides(const _Mapping& __m,
> >> +_Indices... __indices)
> >> +  {
> >> +   using _IndexType = typename _Mapping::index_type;
> >> +   _IndexType __res = 0;
> >> +   if constexpr (sizeof...(__indices) > 0)
> >> + {
> >> +   auto __update = [&, __pos = 0u](_IndexType __idx) mutable
> >> + {
> >> +   __res += __idx * __m.stride(__pos++);
> >> + };
> >> +   (__update(__indices), ...);
> >> + }
> >> +   return __res;
> >> +  }
> >> +  }
> >> +
> >> +  template
> >> +class layout_stride::mapping
> >> +{
> >> +  static_assert(__mdspan::__layout

Re: [PATCH] [gcc-14] testsuite: Improve check-function-bodies

2025-05-20 Thread Richard Earnshaw (lists)
On 20/05/2025 05:26, Alexandre Oliva wrote:
> The backport of commit 205515da82a2914d765e74ba73fd2765e1254112 to
> gcc-14 as 8b1146fe46e62f8b03bd9ddee48995794e192e82, rewriting
> gcc.target/arm/fp16-aapcs-[1234].c into check-function-bodies, requires
> the following patch for the one-character function names used in those
> tests.  Tested with gcc-14 on arm-vxworks7r2.  Ok to install?
> 
> From: Wilco Dijkstra 
> 
> Improve check-function-bodies by allowing single-character function names.
> 
> gcc/testsuite:
>   * lib/scanasm.exp (configure_check-function-bodies): Allow single-char
>   function names.
> 
> (cherry pick from commit acdc9df371fbe99e814a3f35a439531e08af79e7)
> ---
>  gcc/testsuite/lib/scanasm.exp |6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/gcc/testsuite/lib/scanasm.exp b/gcc/testsuite/lib/scanasm.exp
> index d1c8e3b50794a..737eefc655e90 100644
> --- a/gcc/testsuite/lib/scanasm.exp
> +++ b/gcc/testsuite/lib/scanasm.exp
> @@ -869,15 +869,15 @@ proc configure_check-function-bodies { config } {
>  # Regexp for the start of a function definition (name in \1).
>  if { [istarget nvptx*-*-*] } {
>   set up_config(start) {
> - {^// BEGIN(?: GLOBAL|) FUNCTION DEF: ([a-zA-Z_]\S+)$}
> + {^// BEGIN(?: GLOBAL|) FUNCTION DEF: ([a-zA-Z_]\S*)$}
>   }
>  } elseif { [istarget *-*-darwin*] } {
>   set up_config(start) {
> - {^_([a-zA-Z_]\S+):$}
> + {^_([a-zA-Z_]\S*):$}
>   {^LFB[0-9]+:}
>   }
>  } else {
> - set up_config(start) {{^([a-zA-Z_]\S+):$}}
> + set up_config(start) {{^([a-zA-Z_]\S*):$}}
>  }
>  
>  # Regexp for the end of a function definition.
> 

OK once gcc-14.3 has been released (we're in RC phase right now and this isn't 
critical).

R.
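
To make the effect of the quoted change concrete, here is a small sketch that
compares the old and new patterns on a one-character label.  It uses
std::regex (ECMAScript grammar) as a stand-in for the Tcl regexps in
scanasm.exp, and the helper names are made up for this illustration.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Old pattern: one leading identifier character followed by at least
// one more non-space character, so the name needs two characters.
bool
matches_old_start(const std::string& line)
{
  static const std::regex re(R"(^([a-zA-Z_]\S+):$)");
  return std::regex_match(line, re);
}

// New pattern: the trailing part may be empty, so "f:" is accepted.
bool
matches_new_start(const std::string& line)
{
  static const std::regex re(R"(^([a-zA-Z_]\S*):$)");
  return std::regex_match(line, re);
}
```

With these, "f:" is rejected by the old pattern and accepted by the new one,
while "foo:" is accepted by both.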


Re: [PATCH] libstdc++: Implement stringstream from string_view [P2495R3]

2025-05-20 Thread Jonathan Wakely
On Tue, 20 May 2025 at 09:10, Tomasz Kaminski  wrote:
>
>
>
> On Mon, May 19, 2025 at 11:28 PM Nathan Myers  wrote:
> In the title, we usually put a link to the bugzilla PR (PR119741 in your case), not the 
> paper.
> Then link the paper in the commit description.

Right. When there's no bugzilla I'll sometimes put the paper number in
parens, e.g. in https://gcc.gnu.org/g:91f4550e1700 which has this
summary:
libstdc++: Move std::monostate to <utility> for C++26 (P0472R2)

But the paper number should not be in square brackets, that should be
reserved for a bugzilla PR number if there is one (and for this
feature, we do have a bugzilla PR).


>
>> Add constructors to stringbuf, stringstream, istringstream,
>> and ostringstream, and a matching overload of str(sv) in each,
>> that take anything convertible to a string_view where the
>> existing functions take a string.
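
A minimal sketch of what the new constructors change for user code, assuming
the feature-test macro named in the ChangeLog below.  The make_stream helper
is hypothetical; where the macro is not defined, the sketch falls back to the
pre-C++26 spelling that first copies the view into a std::string.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <string_view>

// Hypothetical helper: builds an input stream from a string_view.
std::istringstream
make_stream(std::string_view sv)
{
#if defined(__cpp_lib_sstream_from_string_view)
  // With P2495R3 the stream can be initialized from the view directly.
  return std::istringstream(sv);
#else
  // Without it, an explicit temporary std::string is needed.
  return std::istringstream(std::string(sv));
#endif
}
```

Either branch yields a stream whose contents are a copy of the viewed
characters; only the spelling at the call site differs.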
>
>
> After you add the bugzilla number, git gcc-verify will suggest that you add the 
> following note:
> PR libstdc++/119741
>
>>
>>
>> libstdc++-v3/ChangeLog:
>>
>> P2495R3 stringstream to init from string_view-ish
>
> We usually put only the changed files here. Did git gcc-verify accept it?
>>
>> * include/std/sstream: full implementation, really just
>> decls, requires clause and plumbing.
>> * include/std/bits/version.def, .h: new preprocessor symbol
>> __cpp_lib_sstream_from_string_view.
>> * testsuite/27_io/basic_stringbuf/cons/char/3.cc: New tests.
>> * testsuite/27_io/basic_istringstream/cons/char/2.cc: New tests.
>> * testsuite/27_io/basic_ostringstream/cons/char/4.cc: New tests.
>> * testsuite/27_io/basic_stringstream/cons/char/2.cc: New tests.
>> ---
>>  libstdc++-v3/ChangeLog|  11 +
>>  libstdc++-v3/include/bits/version.def |  11 +-
>>  libstdc++-v3/include/bits/version.h   |  10 +
>>  libstdc++-v3/include/std/sstream  | 181 +--
>>  .../27_io/basic_istringstream/cons/char/2.cc  | 193 
>>  .../27_io/basic_ostringstream/cons/char/4.cc  | 193 
>>  .../27_io/basic_stringbuf/cons/char/3.cc  | 216 ++
>>  .../27_io/basic_stringstream/cons/char/2.cc   | 194 
>>  8 files changed, 990 insertions(+), 19 deletions(-)
>>  create mode 100644 
>> libstdc++-v3/testsuite/27_io/basic_istringstream/cons/char/2.cc
>>  create mode 100644 
>> libstdc++-v3/testsuite/27_io/basic_ostringstream/cons/char/4.cc
>>  create mode 100644 
>> libstdc++-v3/testsuite/27_io/basic_stringbuf/cons/char/3.cc
>>  create mode 100644 
>> libstdc++-v3/testsuite/27_io/basic_stringstream/cons/char/2.cc
>>
>> diff --git a/libstdc++-v3/ChangeLog b/libstdc++-v3/ChangeLog
>> index b45f8c2c7a5..ac0ff4a386f 100644
>> --- a/libstdc++-v3/ChangeLog
>> +++ b/libstdc++-v3/ChangeLog
>> @@ -41,6 +41,17 @@
>> PR libstdc++/119246
>> * include/std/format: Updated check for _GLIBCXX_FORMAT_F128.
>>
>> +2025-05-14  Nathan Myers  
>> +   P2495R3 stringstream to init from string_view-ish
>> +   * include/std/sstream: full implementation, really just
>> +   decls, requires clause and plumbing.
>> +   * include/std/bits/version.def, .h: new preprocessor symbol
>> +   __cpp_lib_sstream_from_string_view.
>> +   * testsuite/27_io/basic_stringbuf/cons/char/3.cc: New tests.
>> +   * testsuite/27_io/basic_istringstream/cons/char/2.cc: New tests.
>> +   * testsuite/27_io/basic_ostringstream/cons/char/4.cc: New tests.
>> +   * testsuite/27_io/basic_stringstream/cons/char/2.cc: New tests.
>> +
>
> Changelogs are now automatically generated from commit messages, so you do not
> need to edit this file.
>>
>>  2025-05-14  Tomasz Kamiński  
>>
>> PR libstdc++/119125
>> diff --git a/libstdc++-v3/include/bits/version.def 
>> b/libstdc++-v3/include/bits/version.def
>> index 6ca148f0488..567c56b4117 100644
>> --- a/libstdc++-v3/include/bits/version.def
>> +++ b/libstdc++-v3/include/bits/version.def
>> @@ -649,7 +649,7 @@ ftms = {
>>};
>>values = {
>>  v = 1;
>> -/* For when there's no gthread.  */
>> +// For when there is no gthread.
>>  cxxmin = 17;
>>  hosted = yes;
>>  gthread = no;
>> @@ -1961,6 +1961,15 @@ ftms = {
>>};
>>  };
>>
>> +ftms = {
>> +  name = sstream_from_string_view;
>> +  values = {
>> +v = 202302;
>> +cxxmin = 26;
>> +hosted = yes;
>> +  };
>> +};
>> +
>>  // Standard test specifications.
>>  stds[97] = ">= 199711L";
>>  stds[03] = ">= 199711L";
>> diff --git a/libstdc++-v3/include/bits/version.h 
>> b/libstdc++-v3/include/bits/version.h
>> index 48a090c14a3..5d1beb83a25 100644
>> --- a/libstdc++-v3/include/bits/version.h
>> +++ b/libstdc++-v3/include/bits/version.h
>> @@ -2193,4 +2193,14 @@
>>  #endif /* !defined(__cpp_lib_modules) && defined(__glibcxx_want_modules) */
>>  #undef __glibcxx_want_modules
>>
>> +#if !defined(__cpp_lib_sstream_from_string_view)
>> +# if (__cplusplus >  202302L) && _GLIBCXX_HOSTED
>> +#  def

Re: [PATCH] [testsuite] tolerate missing std::stold

2025-05-20 Thread Jonathan Wakely
On Tue, 20 May 2025, 05:00 Alexandre Oliva,  wrote:

>
> basic_string.h doesn't define the non-w string version of std::stold
> when certain conditions aren't met, and then a couple of tests fail to
> compile.
>
> Guard the portions of the tests that depend on std::stold with the
> conditions for it to be defined.
>
> Regstrapped on x86_64-linux-gnu.  Also tested with gcc-14 on aarch64-,
> arm-, x86-, and x86_64-vxworks7r2.  Ok to install?
>

OK

Maybe we should just define it in terms of std::stod, so it always exists
but might not support the full accuracy of long double.
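
A minimal sketch of that idea, purely as an illustration and not a proposed
patch.  The name stold_via_stod is made up for this example; the real change
would live in basic_string.h under the existing configuration macros.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical fallback: parse with double precision and widen the
// result.  Values that need more precision or range than double offers
// are not handled faithfully.
long double
stold_via_stod(const std::string& str, std::size_t* idx = nullptr)
{
  return std::stod(str, idx);
}
```

The trade-off is that the function always exists, but inputs outside the range
of double would throw std::out_of_range even when long double could represent
them.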


>
> for  libstdc++-v3/ChangeLog
>
> *
> testsuite/21_strings/basic_string/numeric_conversions/char/stold.cc:
> Guard non-wide stold calls with conditions for it to be
> defined.
> *
> testsuite/27_io/basic_ostream/inserters_arithmetic/char/hexfloat.cc:
> Likewise.
> ---
>  .../basic_string/numeric_conversions/char/stold.cc |6 ++
>  .../inserters_arithmetic/char/hexfloat.cc  |6 ++
>  2 files changed, 12 insertions(+)
>
> diff --git
> a/libstdc++-v3/testsuite/21_strings/basic_string/numeric_conversions/char/stold.cc
> b/libstdc++-v3/testsuite/21_strings/basic_string/numeric_conversions/char/stold.cc
> index b64ad0c868345..dd777c4529a08 100644
> ---
> a/libstdc++-v3/testsuite/21_strings/basic_string/numeric_conversions/char/stold.cc
> +++
> b/libstdc++-v3/testsuite/21_strings/basic_string/numeric_conversions/char/stold.cc
> @@ -31,6 +31,11 @@
>  void
>  test01()
>  {
> +  /* If these conditions are not met, basic_string.h doesn't define
> + std::stold(const string&, size_t* = 0), and then the test would
> + fail to compile.  */
> +#if (_GLIBCXX_HAVE_STRTOLD && ! _GLIBCXX_HAVE_BROKEN_STRTOLD) \
> +  || __DBL_MANT_DIG__ == __LDBL_MANT_DIG__
>bool test = false;
>using namespace std;
>
> @@ -106,6 +111,7 @@ test01()
>test = false;
>  }
>VERIFY( test );
> +#endif
>  }
>
>  int main()
> diff --git
> a/libstdc++-v3/testsuite/27_io/basic_ostream/inserters_arithmetic/char/hexfloat.cc
> b/libstdc++-v3/testsuite/27_io/basic_ostream/inserters_arithmetic/char/hexfloat.cc
> index b1bc7fbb9d4e1..f694730901edb 100644
> ---
> a/libstdc++-v3/testsuite/27_io/basic_ostream/inserters_arithmetic/char/hexfloat.cc
> +++
> b/libstdc++-v3/testsuite/27_io/basic_ostream/inserters_arithmetic/char/hexfloat.cc
> @@ -95,6 +95,11 @@ test01()
>  void
>  test02()
>  {
> +  /* If these conditions are not met, basic_string.h doesn't define
> + std::stold(const string&, size_t* = 0), and then the test would
> + fail to compile.  */
> +#if (_GLIBCXX_HAVE_STRTOLD && ! _GLIBCXX_HAVE_BROKEN_STRTOLD) \
> +  || __DBL_MANT_DIG__ == __LDBL_MANT_DIG__
>ostringstream os;
>long double d = 272.L; // 0x1.1p+8L;
>os << hexfloat << setprecision(1);
> @@ -140,6 +145,7 @@ test02()
>cout << "got: " << os.str() << endl;
>  #endif
>VERIFY( os && os.str() == "15" );
> +#endif
>  }
>
>  int
>
> --
> Alexandre Oliva, happy hacker            https://blog.lx.oliva.nom.br/
> Free Software Activist FSFLA co-founder GNU Toolchain Engineer
> More tolerance and less prejudice are key for inclusion and diversity.
> Excluding neuro-others for not behaving "normal" is *not* inclusive!
>


[PATCH v4 0/4] Hard Register Constraints

2025-05-20 Thread Stefan Schulze Frielinghaus
This is a follow-up to
https://gcc.gnu.org/pipermail/gcc-patches/2025-January/672941.html
with basically the only change being the addition of more tests, such as for aarch64 and
the y constraint, or for i386 and the constraints S, D, A.

I just realized that in the second patch "Error handling for hard
register constraints" locs may be different, now.  For example

gcc.target/i386/pr79804.c: In function ‘foo’:
gcc.target/i386/pr79804.c:9:3: error: register for operand 0 is an internal GCC 
implementation detail

Prior to my patch the line number was 7 and not 9 as it is now.  This is
due to how I iterate over an inline asm statement during gimplification
and error out over operands instead of erroring out for register asm
decls.  I didn't dare to fix this right now since I have already spent quite
some time on it over the course of more than a year and would rather
like to get some feedback on whether the implementation of hard register
constraints is sensible at all or whether I should drop this patch.

Cheers,
Stefan

Stefan Schulze Frielinghaus (4):
  Hard register constraints
  Error handling for hard register constraints
  genoutput: Verify hard register constraints
  Rewrite register asm into hard register constraints

 gcc/cfgexpand.cc  |  42 ---
 gcc/common.opt|   4 +
 gcc/config/cris/cris.cc   |   6 +-
 gcc/config/i386/i386.cc   |   6 +
 gcc/config/s390/s390.cc   |   6 +-
 gcc/doc/extend.texi   | 178 +++
 gcc/doc/md.texi   |   6 +
 gcc/function.cc   | 116 
 gcc/genoutput.cc  |  60 
 gcc/genpreds.cc   |   4 +-
 gcc/gimplify.cc   | 236 ++-
 gcc/gimplify_reg_info.h   | 169 +++
 gcc/ira.cc|  79 -
 gcc/lra-constraints.cc|  13 +
 gcc/output.h  |   2 +
 gcc/recog.cc  |  11 +-
 gcc/stmt.cc   | 278 +-
 gcc/stmt.h|   9 +-
 gcc/testsuite/gcc.dg/asm-hard-reg-1.c |  85 ++
 gcc/testsuite/gcc.dg/asm-hard-reg-2.c |  33 +++
 gcc/testsuite/gcc.dg/asm-hard-reg-3.c |  25 ++
 gcc/testsuite/gcc.dg/asm-hard-reg-4.c |  50 
 gcc/testsuite/gcc.dg/asm-hard-reg-5.c |  36 +++
 gcc/testsuite/gcc.dg/asm-hard-reg-6.c |  60 
 gcc/testsuite/gcc.dg/asm-hard-reg-7.c |  41 +++
 gcc/testsuite/gcc.dg/asm-hard-reg-8.c |  49 +++
 .../gcc.dg/asm-hard-reg-demotion-1.c  |  19 ++
 .../gcc.dg/asm-hard-reg-demotion-2.c  |  19 ++
 .../gcc.dg/asm-hard-reg-demotion-error-1.c|  29 ++
 gcc/testsuite/gcc.dg/asm-hard-reg-demotion.h  |  52 
 gcc/testsuite/gcc.dg/asm-hard-reg-error-1.c   |  83 ++
 gcc/testsuite/gcc.dg/asm-hard-reg-error-2.c   |  26 ++
 gcc/testsuite/gcc.dg/asm-hard-reg-error-3.c   |  27 ++
 gcc/testsuite/gcc.dg/asm-hard-reg-error-4.c   |  24 ++
 gcc/testsuite/gcc.dg/asm-hard-reg-error-5.c   |  13 +
 gcc/testsuite/gcc.dg/pr87600-2.c  |  30 +-
 gcc/testsuite/gcc.dg/pr87600-3.c  |  35 +++
 .../gcc.target/aarch64/asm-hard-reg-1.c   |  55 
 .../gcc.target/i386/asm-hard-reg-1.c  | 115 
 .../gcc.target/s390/asm-hard-reg-1.c  | 103 +++
 .../gcc.target/s390/asm-hard-reg-2.c  |  43 +++
 .../gcc.target/s390/asm-hard-reg-3.c  |  42 +++
 .../gcc.target/s390/asm-hard-reg-4.c  |   6 +
 .../gcc.target/s390/asm-hard-reg-5.c  |   6 +
 .../gcc.target/s390/asm-hard-reg-6.c  | 152 ++
 .../gcc.target/s390/asm-hard-reg-longdouble.h |  18 ++
 gcc/testsuite/lib/scanasm.exp |   4 +
 gcc/toplev.cc |   4 +
 48 files changed, 2409 insertions(+), 100 deletions(-)
 create mode 100644 gcc/gimplify_reg_info.h
 create mode 100644 gcc/testsuite/gcc.dg/asm-hard-reg-1.c
 create mode 100644 gcc/testsuite/gcc.dg/asm-hard-reg-2.c
 create mode 100644 gcc/testsuite/gcc.dg/asm-hard-reg-3.c
 create mode 100644 gcc/testsuite/gcc.dg/asm-hard-reg-4.c
 create mode 100644 gcc/testsuite/gcc.dg/asm-hard-reg-5.c
 create mode 100644 gcc/testsuite/gcc.dg/asm-hard-reg-6.c
 create mode 100644 gcc/testsuite/gcc.dg/asm-hard-reg-7.c
 create mode 100644 gcc/testsuite/gcc.dg/asm-hard-reg-8.c
 create mode 100644 gcc/testsuite/gcc.dg/asm-hard-reg-demotion-1.c
 create mode 100644 gcc/testsuite/gcc.dg/asm-hard-reg-demotion-2.c
 create mode 100644 gcc/testsuite/gcc.dg/asm-hard-reg-demotion-error-1.c
 create mode 100644 gcc/testsuite/gcc.dg/asm-hard-reg-demotion.h
 create mode 100644 gcc/testsuite/gcc.dg/asm-hard-reg-error-1.c
 create mode 100644 gcc/testsuite/gcc.dg/asm-hard-reg-error-2.c
 create mode 100644 gcc/testsuite/gcc.dg/asm

[PATCH v4 1/4] Hard register constraints

2025-05-20 Thread Stefan Schulze Frielinghaus
Implement hard register constraints of the form {regname} where regname
must be a valid register name for the target.  Such constraints may be
used in asm statements as a replacement for register asm and in machine
descriptions.
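
For context, the sketch below shows the existing "register asm" idiom that
these constraints are meant to replace.  It is guarded so that it only uses
the GNU extension with a GNU-compatible compiler on x86_64, and the function
name is made up for this illustration.  The constraint spelling in the comment
is extrapolated from the "={rsp}" example later in this mail and is an
assumption; it is left uncompiled because released compilers do not accept it.

```cpp
#include <cassert>

// Routes a value through a specific hard register using the existing
// local register variable extension.
long
through_rax(long value)
{
#if defined(__GNUC__) && defined(__x86_64__)
  // Existing idiom: a local register variable used as an asm operand.
  register long pinned __asm__("rax") = value;
  __asm__ volatile ("" : "+r" (pinned));
  // With the proposed constraints this would presumably be written
  // without the extra variable, along the lines of (assumption):
  //   __asm__ volatile ("" : "+{rax}" (value));
  return pinned;
#else
  // Other targets or compilers: no register pinning in this sketch.
  return value;
#endif
}
```

The empty asm template does not modify the register, so the function returns
its argument unchanged; the point is only where the operand lives.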

It is expected and desired that optimizations coalesce multiple pseudos
into one whenever possible.  However, in case of hard register
constraints we may have to undo this and introduce copies since
otherwise we could constrain a single pseudo to different hard
registers.  Therefore, we have to introduce copies of such a pseudo and
use these for conflicting inputs.  This is done prior to RA during asmcons
in match_asm_constraints_2().  While IRA tries to reduce live ranges, it
also replaces some register-register moves.  That in turn might undo
those copies of a pseudo which we just introduced during asmcons.  Thus,
check in decrease_live_ranges_number() via
valid_replacement_for_asm_input_p() whether it is valid to perform a
replacement.

The remainder of the patch mostly deals with parsing and decoding hard
register constraints.  The actual work is done by LRA in
process_alt_operands() where a register filter, according to the
constraint, is installed.

For the sake of "reviewability" and in order to show the beauty of LRA,
error handling (which gets pretty involved) is spread out into a
subsequent patch.

Limitation
--

Currently, a fixed register cannot be used as hard register constraint.
For example, loading the stack pointer on x86_64 via

void *
foo (void)
{
  void *y;
  __asm__ ("" : "={rsp}" (y));
  return y;
}

leads to an error.  This is unfortunate since register asm does not have
this limitation.  The culprit seems to be that during reload
ira_class_hard_regs_num[rclass] does not even include fixed registers
which is why lra_assign() ultimately fails.  Does anyone have an idea
how to lift this limitation?  Maybe there is even a shortcut in order to
force a pseudo into a hard reg?

Asm Adjust Hook
---

The following targets implement TARGET_MD_ASM_ADJUST:

- aarch64
- arm
- avr
- cris
- i386
- mn10300
- nds32
- pdp11
- rs6000
- s390
- vax

Most of them only add the CC register to the list of clobbered registers.
However, cris, i386, and s390 need some minor adjustments.
---
 gcc/config/cris/cris.cc   |   6 +-
 gcc/config/i386/i386.cc   |   6 +
 gcc/config/s390/s390.cc   |   6 +-
 gcc/doc/extend.texi   | 178 ++
 gcc/doc/md.texi   |   6 +
 gcc/function.cc   | 116 
 gcc/genoutput.cc  |  14 ++
 gcc/genpreds.cc   |   4 +-
 gcc/ira.cc|  79 +++-
 gcc/lra-constraints.cc|  13 ++
 gcc/recog.cc  |  11 +-
 gcc/stmt.cc   |  39 
 gcc/stmt.h|   1 +
 gcc/testsuite/gcc.dg/asm-hard-reg-1.c |  85 +
 gcc/testsuite/gcc.dg/asm-hard-reg-2.c |  33 
 gcc/testsuite/gcc.dg/asm-hard-reg-3.c |  25 +++
 gcc/testsuite/gcc.dg/asm-hard-reg-4.c |  50 +
 gcc/testsuite/gcc.dg/asm-hard-reg-5.c |  36 
 gcc/testsuite/gcc.dg/asm-hard-reg-6.c |  60 ++
 gcc/testsuite/gcc.dg/asm-hard-reg-7.c |  41 
 gcc/testsuite/gcc.dg/asm-hard-reg-8.c |  49 +
 .../gcc.target/aarch64/asm-hard-reg-1.c   |  55 ++
 .../gcc.target/i386/asm-hard-reg-1.c  | 115 +++
 .../gcc.target/s390/asm-hard-reg-1.c  | 103 ++
 .../gcc.target/s390/asm-hard-reg-2.c  |  43 +
 .../gcc.target/s390/asm-hard-reg-3.c  |  42 +
 .../gcc.target/s390/asm-hard-reg-4.c  |   6 +
 .../gcc.target/s390/asm-hard-reg-5.c  |   6 +
 .../gcc.target/s390/asm-hard-reg-6.c  | 152 +++
 .../gcc.target/s390/asm-hard-reg-longdouble.h |  18 ++
 30 files changed, 1391 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/asm-hard-reg-1.c
 create mode 100644 gcc/testsuite/gcc.dg/asm-hard-reg-2.c
 create mode 100644 gcc/testsuite/gcc.dg/asm-hard-reg-3.c
 create mode 100644 gcc/testsuite/gcc.dg/asm-hard-reg-4.c
 create mode 100644 gcc/testsuite/gcc.dg/asm-hard-reg-5.c
 create mode 100644 gcc/testsuite/gcc.dg/asm-hard-reg-6.c
 create mode 100644 gcc/testsuite/gcc.dg/asm-hard-reg-7.c
 create mode 100644 gcc/testsuite/gcc.dg/asm-hard-reg-8.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/asm-hard-reg-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/asm-hard-reg-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/asm-hard-reg-1.c
 create mode 100644 gcc/testsuite/gcc.target/s390/asm-hard-reg-2.c
 create mode 100644 gcc/testsuite/gcc.target/s390/asm-hard-reg-3.c
 create mode 100644 gcc/testsuite/gcc.target/s390/asm-hard-reg-4.c
 create mode 100644 gcc/testsuite/gc

Re: [PATCH] libgcc: Move bitint support exports to x86/aarch64 specific map files

2025-05-20 Thread Richard Biener



> On 20.05.2025 at 08:48, Jakub Jelinek wrote:
> 
> Hi!
> 
> When adding _BitInt support I was hoping all or most of arches would
> implement it already for GCC 14.  That didn't happen and with
> new hosts adding support for _BitInt for GCC 16 (s390x-linux and as was
> posted today loongarch-linux too), we need the _BitInt support functions
> exported on those arches at GCC_16.0.0 rather than GCC_14.0.0 which
> shouldn't be changed anymore.
> 
> The following patch does that.  Both arches were already exporting
> some of the _BitInt related symbols in their specific map files, this
> just moves the remaining ones there as well.
> 
> Tested on x86_64-linux (-m32/-m64, with all bitint related tests, both
> normally and --target_board=unix/-shared-libgcc, additionally compared
> abilists of libgcc before/after), ok for trunk?

Ok

Richard 

> 2025-05-20  Jakub Jelinek  
> 
>* libgcc-std.ver.in (GCC_14.0.0): Remove bitint related exports
>from here.
>* config/i386/libgcc-glibc.ver (GCC_14.0.0): Add them here.
>* config/i386/libgcc-darwin.ver (GCC_14.0.0): Likewise.
>* config/i386/libgcc-sol2.ver (GCC_14.0.0): Likewise.
>* config/aarch64/libgcc-softfp.ver (GCC_14.0.0): Likewise.
> 
> --- libgcc/libgcc-std.ver.in.jj2025-04-08 14:09:53.631413461 +0200
> +++ libgcc/libgcc-std.ver.in2025-05-20 08:23:24.323741294 +0200
> @@ -1947,12 +1947,6 @@ GCC_7.0.0 {
> 
> %inherit GCC_14.0.0 GCC_7.0.0
> GCC_14.0.0 {
> -  __PFX__mulbitint3
> -  __PFX__divmodbitint4
> -  __PFX__fixsfbitint
> -  __PFX__fixdfbitint
> -  __PFX__floatbitintsf
> -  __PFX__floatbitintdf
>   __PFX__hardcfr_check
>   __PFX__strub_enter
>   __PFX__strub_update
> --- libgcc/config/i386/libgcc-glibc.ver.jj2025-04-08 14:09:53.518415033 
> +0200
> +++ libgcc/config/i386/libgcc-glibc.ver2025-05-20 08:25:59.310613264 +0200
> @@ -229,10 +229,16 @@ GCC_13.0.0 {
> 
> %inherit GCC_14.0.0 GCC_13.0.0
> GCC_14.0.0 {
> +  __mulbitint3
> +  __divmodbitint4
> +  __fixsfbitint
> +  __fixdfbitint
>   __fixxfbitint
>   __fixtfbitint
>   __floatbitintbf
>   __floatbitinthf
> +  __floatbitintsf
> +  __floatbitintdf
>   __floatbitintxf
>   __floatbitinttf
> }
> --- libgcc/config/i386/libgcc-darwin.ver.jj2024-02-09 11:59:11.907051978 
> +0100
> +++ libgcc/config/i386/libgcc-darwin.ver2025-05-20 08:27:08.877659307 
> +0200
> @@ -37,10 +37,16 @@ GCC_14.0.0 {
>   __truncxfbf2
>   __trunchfbf2
>   # Added to GCC_14.0.0 in i386/libgcc-glibc.ver.
> +  __mulbitint3
> +  __divmodbitint4
> +  __fixsfbitint
> +  __fixdfbitint
>   __fixxfbitint
>   __fixtfbitint
>   __floatbitintbf
>   __floatbitinthf
> +  __floatbitintsf
> +  __floatbitintdf
>   __floatbitintxf
>   __floatbitinttf
> }
> --- libgcc/config/i386/libgcc-sol2.ver.jj2025-04-08 14:09:53.518415033 
> +0200
> +++ libgcc/config/i386/libgcc-sol2.ver2025-05-20 08:26:44.751990139 +0200
> @@ -144,10 +144,16 @@ GCC_14.0.0 {
>   __truncxfbf2
>   __trunchfbf2
>   # Added to GCC_14.0.0 in i386/libgcc-glibc.ver.
> +  __mulbitint3
> +  __divmodbitint4
> +  __fixsfbitint
> +  __fixdfbitint
>   __fixxfbitint
>   __fixtfbitint
>   __floatbitintbf
>   __floatbitinthf
> +  __floatbitintsf
> +  __floatbitintdf
>   __floatbitintxf
>   __floatbitinttf
> }
> --- libgcc/config/aarch64/libgcc-softfp.ver.jj2025-04-08 
> 14:09:53.174419821 +0200
> +++ libgcc/config/aarch64/libgcc-softfp.ver2025-05-20 08:28:03.638908388 
> +0200
> @@ -42,8 +42,14 @@ GCC_13.0.0 {
> 
> %inherit GCC_14.0.0 GCC_13.0.0
> GCC_14.0.0 {
> +  __mulbitint3
> +  __divmodbitint4
> +  __fixsfbitint
> +  __fixdfbitint
>   __fixtfbitint
>   __floatbitintbf
>   __floatbitinthf
> +  __floatbitintsf
> +  __floatbitintdf
>   __floatbitinttf
> }
> 
>Jakub
> 


Re: [PATCH 1/5] libstdc++: keep subtree sizes in pb_ds binary search trees (PR 81806)

2025-05-20 Thread Jonathan Wakely

On 13/07/20 16:40 +0800, Xi Ruoyao via Libstdc++ wrote:

The first patch removes two redundant statements which are confusing.  It should
be applied anyway, disregarding other patches.


The patch is attached, to prevent my mail client from destroying it :(.

Please ignore the previous duplicate of this mail with the wrong title :(.

libstdc++-v3/ChangeLog:

* include/ext/pb_ds/detail/bin_search_tree_/insert_fn_imps.hpp
  (insert_leaf_new, insert_imp_empty): remove redundant statements.
--
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University



From 4eea45261ebf974ddf02f6154166c5cb6aa180da Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?X=E2=84=B9=20Ruoyao?= 
Date: Fri, 10 Jul 2020 20:10:52 +0800
Subject: [PATCH 1/5] libstdc++: remove two redundant statements in pb_ds
binary tree

libstdc++-v3/ChangeLog:

* include/ext/pb_ds/detail/bin_search_tree_/insert_fn_imps.hpp
  (insert_leaf_new, insert_imp_empty): remove redundant statements.


OK for trunk.


---
.../ext/pb_ds/detail/bin_search_tree_/insert_fn_imps.hpp| 2 --
1 file changed, 2 deletions(-)

diff --git 
a/libstdc++-v3/include/ext/pb_ds/detail/bin_search_tree_/insert_fn_imps.hpp 
b/libstdc++-v3/include/ext/pb_ds/detail/bin_search_tree_/insert_fn_imps.hpp
index 3942da05600..bdc10379af6 100644
--- a/libstdc++-v3/include/ext/pb_ds/detail/bin_search_tree_/insert_fn_imps.hpp
+++ b/libstdc++-v3/include/ext/pb_ds/detail/bin_search_tree_/insert_fn_imps.hpp
@@ -122,7 +122,6 @@ insert_leaf_new(const_reference r_value, node_pointer p_nd, 
bool left_nd)
}

  p_new_nd->m_p_parent = p_nd;
-  p_new_nd->m_p_left = p_new_nd->m_p_right = 0;
  PB_DS_ASSERT_NODE_CONSISTENT(p_nd)

  update_to_top(p_new_nd, (node_update* )this);
@@ -142,7 +141,6 @@ insert_imp_empty(const_reference r_value)
m_p_head->m_p_parent = p_new_node;

  p_new_node->m_p_parent = m_p_head;
-  p_new_node->m_p_left = p_new_node->m_p_right = 0;
  _GLIBCXX_DEBUG_ONLY(debug_base::insert_new(PB_DS_V2F(r_value));)

  update_to_top(m_p_head->m_p_parent, (node_update*)this);
--
2.27.0





[PATCH 0/1] Add warnings of potentially-uninitialized padding bits

2025-05-20 Thread Christopher Bazley
Commit 0547dbb725b reduced the number of cases in which
union padding bits are zeroed when the relevant language
standard does not strictly require it, unless gcc was
invoked with -fzero-init-padding-bits=unions or
-fzero-init-padding-bits=all in order to explicitly
request zeroing of padding bits.

This commit adds a closely related warning,
-Wzero-init-padding-bits=, which is intended to help
programmers to find code that might now need to be
rewritten or recompiled with
-fzero-init-padding-bits=unions or
-fzero-init-padding-bits=all in order to replicate
the behaviour that it had when compiled by older
versions of GCC. It can also be used to find struct
padding that was never previously guaranteed to be
zero-initialized and still isn't unless GCC is
invoked with the -fzero-init-padding-bits=all option.

The new warning can be set to the same three states
as -fzero-init-padding-bits ('standard', 'unions'
or 'all') and has the same default value ('standard').

The two options interact as follows:

             f: standard  f: unions  f: all
w: standard  X            X          X
w: unions    U            X          X
w: all       A            S          X

X = No warnings about padding
U = Warnings about padding of unions.
S = Warnings about padding of structs.
A = Warnings about padding of structs and unions.
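
One way to read the table above: the warning level selects which kinds of padding are of interest, and the -f level removes the kinds that are already guaranteed to be zeroed. The sketch below encodes that reading. It is not code from the patch; `Level`, `Warn` and `padding_warnings` are hypothetical names used only for illustration.

```cpp
// Levels shared by -fzero-init-padding-bits= and -Wzero-init-padding-bits=.
enum class Level { standard, unions, all };

// Outcome letters, as in the legend: X = none, U = unions, S = structs,
// A = structs and unions.
enum class Warn { X, U, S, A };

// Hypothetical helper (not GCC code): f is the -fzero-init-padding-bits level,
// w is the -Wzero-init-padding-bits level.
constexpr Warn padding_warnings(Level f, Level w)
{
  // Kinds of padding the warning level asks about.
  const bool want_unions = (w != Level::standard);
  const bool want_structs = (w == Level::all);

  // Kinds of padding the -f level already guarantees to be zeroed.
  const bool zero_unions = (f != Level::standard);
  const bool zero_structs = (f == Level::all);

  const bool warn_unions = want_unions && !zero_unions;
  const bool warn_structs = want_structs && !zero_structs;

  if (warn_unions && warn_structs)
    return Warn::A;
  if (warn_structs)
    return Warn::S;
  if (warn_unions)
    return Warn::U;
  return Warn::X;
}
```

For example, `padding_warnings(Level::unions, Level::all)` gives `Warn::S`, matching the middle entry of the last row.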

The level of optimisation and whether or not the
entire initializer is dropped to memory can both
affect whether warnings are produced when compiling
a given program. This is intentional, since tying
the warnings more closely to the relevant language
standard would require a very different approach
that would still be target-dependent, might impose
an unacceptable burden on programmers, and would
risk not satisfying the intended use-case (which
is closely tied to a specific optimisation).

Bootstrapped the compiler and tested on AArch64
and x86-64 using some new tests for
-Wzero-init-padding-bits and the existing tests
for -fzero-init-padding-bits
(check-gcc RUNTESTFLAGS="dg.exp=*-empty-init-*.c").

Base commit is a470433732e77ae29a717cf79049ceeea3cbe979

Christopher Bazley (1):
  Add warnings of potentially-uninitialized padding bits

 gcc/common.opt|  4 +
 gcc/doc/invoke.texi   | 85 ++-
 gcc/expr.cc   | 41 -
 gcc/expr.h|  7 +-
 gcc/gimplify.cc   | 29 ++-
 gcc/testsuite/gcc.dg/c23-empty-init-warn-1.c  | 68 +++
 gcc/testsuite/gcc.dg/c23-empty-init-warn-10.c |  8 ++
 gcc/testsuite/gcc.dg/c23-empty-init-warn-11.c |  8 ++
 gcc/testsuite/gcc.dg/c23-empty-init-warn-12.c |  8 ++
 gcc/testsuite/gcc.dg/c23-empty-init-warn-13.c |  8 ++
 gcc/testsuite/gcc.dg/c23-empty-init-warn-14.c |  8 ++
 gcc/testsuite/gcc.dg/c23-empty-init-warn-15.c |  8 ++
 gcc/testsuite/gcc.dg/c23-empty-init-warn-16.c |  8 ++
 gcc/testsuite/gcc.dg/c23-empty-init-warn-17.c | 51 +++
 gcc/testsuite/gcc.dg/c23-empty-init-warn-2.c  | 69 +++
 gcc/testsuite/gcc.dg/c23-empty-init-warn-3.c  |  7 ++
 gcc/testsuite/gcc.dg/c23-empty-init-warn-4.c  | 69 +++
 gcc/testsuite/gcc.dg/c23-empty-init-warn-5.c  |  8 ++
 gcc/testsuite/gcc.dg/c23-empty-init-warn-6.c  |  8 ++
 gcc/testsuite/gcc.dg/c23-empty-init-warn-7.c  |  8 ++
 gcc/testsuite/gcc.dg/c23-empty-init-warn-8.c  |  8 ++
 gcc/testsuite/gcc.dg/c23-empty-init-warn-9.c  | 69 +++
 .../gcc.dg/gnu11-empty-init-warn-1.c  | 52 
 .../gcc.dg/gnu11-empty-init-warn-10.c |  8 ++
 .../gcc.dg/gnu11-empty-init-warn-11.c |  8 ++
 .../gcc.dg/gnu11-empty-init-warn-12.c |  8 ++
 .../gcc.dg/gnu11-empty-init-warn-13.c |  8 ++
 .../gcc.dg/gnu11-empty-init-warn-14.c |  8 ++
 .../gcc.dg/gnu11-empty-init-warn-15.c |  8 ++
 .../gcc.dg/gnu11-empty-init-warn-16.c |  8 ++
 .../gcc.dg/gnu11-empty-init-warn-17.c | 51 +++
 .../gcc.dg/gnu11-empty-init-warn-2.c  | 59 +
 .../gcc.dg/gnu11-empty-init-warn-3.c  |  7 ++
 .../gcc.dg/gnu11-empty-init-warn-4.c  | 63 ++
 .../gcc.dg/gnu11-empty-init-warn-5.c  |  8 ++
 .../gcc.dg/gnu11-empty-init-warn-6.c  |  8 ++
 .../gcc.dg/gnu11-empty-init-warn-7.c  |  8 ++
 .../gcc.dg/gnu11-empty-init-warn-8.c  |  8 ++
 .../gcc.dg/gnu11-empty-init-warn-9.c  | 55 
 39 files changed, 937 insertions(+), 25 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/c23-empty-init-warn-1.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-empty-init-warn-10.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-empty-init-warn-11.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-empty-init-warn-12.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-empty-init-warn-13.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-empty-init-warn-14.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-empty-init-warn-15.c
 create mode 100644 gcc/t

Re: [PATCH v3 1/3] sbitmap: Add bitmap_bit_in_range_p_1 helper function

2025-05-20 Thread Richard Sandiford
Konstantinos Eleftheriou  writes:
> This patch adds the `bitmap_bit_in_range_p_1` helper function,
> in order to be used by `bitmap_bit_in_range_p`. The helper function
> contains the previous implementation of `bitmap_bit_in_range_p` and
> `bitmap_bit_in_range_p` has been updated to call the helper function.
>
> gcc/ChangeLog:
>
>   * sbitmap.cc (bitmap_bit_in_range_p): Call `bitmap_bit_in_range_p_1`.
>   (bitmap_bit_in_range_p_1): New function.
>
> Signed-off-by: Konstantinos Eleftheriou 

From a staging perspective, I'm not sure this change is worth splitting out.
I think it'd be simpler to understand as part of patch 2.

I agree with Philipp's comment about making it static.
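
To illustrate the point about linkage, here is a minimal sketch of a file-local helper behind a public wrapper. It uses a plain vector of words rather than GCC's sbitmap, and a simple bit-by-bit loop rather than the word-at-a-time logic in sbitmap.cc; the names `any_bit_in_range_1` and `any_bit_in_range` are hypothetical.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <vector>

namespace {

constexpr std::size_t kWordBits = sizeof(unsigned long) * CHAR_BIT;

// Helper with internal linkage (the C++ equivalent of marking it static):
// returns true if any bit in [start, end], inclusive, is set.
bool any_bit_in_range_1(const std::vector<unsigned long>& words,
                        std::size_t start, std::size_t end)
{
  assert(start <= end && end / kWordBits < words.size());
  for (std::size_t i = start; i <= end; ++i)
    if ((words[i / kWordBits] >> (i % kWordBits)) & 1UL)
      return true;
  return false;
}

} // namespace

// Public entry point: only this symbol is visible to other translation units.
bool any_bit_in_range(const std::vector<unsigned long>& words,
                      std::size_t start, std::size_t end)
{
  return any_bit_in_range_1(words, start, end);
}
```

Keeping the helper local means later callers cannot start depending on it by accident.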

Thanks,
Richard

> ---
>
> (no changes since v1)
>
>  gcc/sbitmap.cc | 12 +++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/sbitmap.cc b/gcc/sbitmap.cc
> index df2e1aa49358..94f2bbd6c8fd 100644
> --- a/gcc/sbitmap.cc
> +++ b/gcc/sbitmap.cc
> @@ -330,7 +330,8 @@ bitmap_set_range (sbitmap bmap, unsigned int start, 
> unsigned int count)
> the simple bitmap BMAP.  Return FALSE otherwise.  */
>  
>  bool
> -bitmap_bit_in_range_p (const_sbitmap bmap, unsigned int start, unsigned int 
> end)
> +bitmap_bit_in_range_p_1 (const_sbitmap bmap, unsigned int start,
> +  unsigned int end)
>  {
>gcc_checking_assert (start <= end);
>bitmap_check_index (bmap, end);
> @@ -375,6 +376,15 @@ bitmap_bit_in_range_p (const_sbitmap bmap, unsigned int 
> start, unsigned int end)
>return (bmap->elms[start_word] & mask) != 0;
>  }
>  
> +/* Return TRUE if any bit between START and END inclusive is set within
> +   the simple bitmap BMAP.  Return FALSE otherwise.  */
> +
> +bool
> +bitmap_bit_in_range_p (const_sbitmap bmap, unsigned int start, unsigned int 
> end)
> +{
> +  return bitmap_bit_in_range_p_1 (bmap, start, end);
> +}
> +
>  #if GCC_VERSION < 3400
>  /* Table of number of set bits in a character, indexed by value of char.  */
>  static const unsigned char popcount_table[] =


Re: [PATCH 3/5] libstdc++: keep subtree sizes in pb_ds binary search trees (PR 81806)

2025-05-20 Thread Jonathan Wakely

On 13/07/20 16:45 +0800, Xi Ruoyao via Libstdc++ wrote:



The second and third patch together resolve PR 81806.


The attached patch modifies split_finish to use the subtree size we maintained
in the previous patch, resolving libstdc++/81806.
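
The idea can be sketched with a toy node type: if every node caches the size of its subtree, the size of a tree is a single read at the root, where counting nodes (as std::distance over the iterators effectively does) is linear. This is an illustration only, not the pb_ds code; `Node`, `update_size`, `count_nodes` and `tree_size` are hypothetical names.

```cpp
#include <cstddef>

// Hypothetical node type; only the fields needed for the illustration.
struct Node
{
  Node* left = nullptr;
  Node* right = nullptr;
  std::size_t subtree_size = 1; // number of nodes in the subtree rooted here
};

// Recompute a node's cached size from its children. A real tree would do
// this on the path to the root after every insert, erase or rotation.
inline void update_size(Node& n)
{
  n.subtree_size = 1 + (n.left ? n.left->subtree_size : 0)
                     + (n.right ? n.right->subtree_size : 0);
}

// O(n) reference: visit every node.
inline std::size_t count_nodes(const Node* n)
{
  return n ? 1 + count_nodes(n->left) + count_nodes(n->right) : 0;
}

// O(1): read the cached size. An empty tree has size 0, which mirrors the
// null check on the root in the patch.
inline std::size_t tree_size(const Node* root)
{
  return root ? root->subtree_size : 0;
}
```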

libstdc++-v3/ChangeLog:

PR libstdc++/81806
* include/ext/pb_ds/detail/bin_search_tree_/split_join_fn_imps.hpp
  (split_finish): Use maintained size, instead of calling
  std::distance.


OK for trunk


--
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University



From 4434da1b2b45797204f4fd978dcc4fbba4b17c6e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?X=E2=84=B9=20Ruoyao?= 
Date: Fri, 10 Jul 2020 21:38:09 +0800
Subject: [PATCH 3/5] libstdc++: use maintained size when split pb_ds binary
search trees

libstdc++-v3/ChangeLog:

PR libstdc++/81806
* include/ext/pb_ds/detail/bin_search_tree_/split_join_fn_imps.hpp
  (split_finish): Use maintained size, instead of calling
  std::distance.
---
.../ext/pb_ds/detail/bin_search_tree_/split_join_fn_imps.hpp  | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git 
a/libstdc++-v3/include/ext/pb_ds/detail/bin_search_tree_/split_join_fn_imps.hpp 
b/libstdc++-v3/include/ext/pb_ds/detail/bin_search_tree_/split_join_fn_imps.hpp
index d08288f186d..fb924b4434b 100644
--- 
a/libstdc++-v3/include/ext/pb_ds/detail/bin_search_tree_/split_join_fn_imps.hpp
+++ 
b/libstdc++-v3/include/ext/pb_ds/detail/bin_search_tree_/split_join_fn_imps.hpp
@@ -133,7 +133,9 @@ PB_DS_CLASS_C_DEC::
split_finish(PB_DS_CLASS_C_DEC& other)
{
  other.initialize_min_max();
-  other.m_size = std::distance(other.begin(), other.end());
+  other.m_size = 0;
+  if (other.m_p_head->m_p_parent != 0)
+other.m_size = other.m_p_head->m_p_parent->m_subtree_size;
  m_size -= other.m_size;
  initialize_min_max();
  PB_DS_ASSERT_VALID((*this))
--
2.27.0





Re: [PATCH] libgcc: Small bitint_reduce_prec big-endian fixes

2025-05-20 Thread Richard Biener

On Mon, 19 May 2025, Jakub Jelinek wrote:


Hi!

The big-endian _BitInt support in libgcc was written without any
testing, so I hadn't discovered that I'd made one mistake in it
(in multiple places).
The bitint_reduce_prec function attempts to optimize inputs
which have some larger precision but at runtime they are found
to need a smaller number of limbs.
For little-endian that is handled just by returning smaller
precision (or negative precision for signed), but for
big-endian we need to adjust the passed in limb pointer so that
when it returns smaller precision the argument still contains
the least significant limbs for the returned precision.
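
The difference between ++p and ++*p can be shown with a simplified, unsigned-only sketch. It assumes big-endian limb order (element 0 is the most significant limb) and ignores signed values and partial limbs, which the real bitint_reduce_prec handles; `limb` and `drop_leading_zero_limbs` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

using limb = unsigned long;

// Big-endian limb order: (*p)[0] is the most significant limb.
// Drops leading zero limbs (always keeping at least one), advances the
// caller's pointer past them, and returns the number of limbs that remain.
inline std::size_t drop_leading_zero_limbs(const limb** p, std::size_t n)
{
  assert(p != nullptr && *p != nullptr && n >= 1);
  while (n > 1 && **p == 0)
    {
      // ++p would only move the local parameter and leave the caller's
      // pointer at the dropped limb; ++*p moves the caller's pointer.
      ++*p;
      --n;
    }
  return n;
}
```

With little-endian order the least significant limbs already sit at the start of the array, so returning the smaller count is enough and no pointer adjustment is needed.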

Bootstrapped/regtested on x86_64-linux and i686-linux (where it
doesn't do anything) and tested with all the _BitInt related
tests on s390x-linux, ok for trunk?


OK.

Richard.


2025-05-19  Jakub Jelinek  

* libgcc2.c (bitint_reduce_prec): For big endian
__LIBGCC_BITINT_ORDER__ use ++*p and --*p instead of
++p and --p.
* soft-fp/bitint.h (bitint_reduce_prec): Likewise.

--- libgcc/libgcc2.c.jj 2025-04-08 14:09:53.632413447 +0200
+++ libgcc/libgcc2.c2025-05-14 17:16:48.642879943 +0200
@@ -1333,7 +1333,7 @@ bitint_reduce_prec (const UBILtype **p,
  if (prec >= -1)
return -2;
#if __LIBGCC_BITINT_ORDER__ == __ORDER_BIG_ENDIAN__
- ++p;
+ ++*p;
#else
  --i;
#endif
@@ -1347,7 +1347,7 @@ bitint_reduce_prec (const UBILtype **p,
  if (prec >= -1)
return -2;
#if __LIBGCC_BITINT_ORDER__ == __ORDER_BIG_ENDIAN__
- ++p;
+ ++*p;
#else
  --i;
#endif
@@ -1358,7 +1358,7 @@ bitint_reduce_prec (const UBILtype **p,
  if ((Wtype) mslimb >= 0)
{
#if __LIBGCC_BITINT_ORDER__ == __ORDER_BIG_ENDIAN__
- --p;
+ --*p;
#endif
  return prec - 1;
}
@@ -1387,7 +1387,7 @@ bitint_reduce_prec (const UBILtype **p,
  if (prec == 0)
return 1;
#if __LIBGCC_BITINT_ORDER__ == __ORDER_BIG_ENDIAN__
- ++p;
+ ++*p;
#else
  --i;
#endif
@@ -1400,7 +1400,7 @@ bitint_reduce_prec (const UBILtype **p,
  if (prec == 0)
return 1;
#if __LIBGCC_BITINT_ORDER__ == __ORDER_BIG_ENDIAN__
-  ++p;
+  ++*p;
#else
  --i;
#endif
--- libgcc/soft-fp/bitint.h.jj  2024-02-13 10:32:57.730666010 +0100
+++ libgcc/soft-fp/bitint.h 2025-05-14 17:17:00.418723808 +0200
@@ -76,7 +76,7 @@ bitint_reduce_prec (const UBILtype **p,
  if (prec >= -1)
return -2;
#if __LIBGCC_BITINT_ORDER__ == __ORDER_BIG_ENDIAN__
- ++p;
+ ++*p;
#else
  --i;
#endif
@@ -90,7 +90,7 @@ bitint_reduce_prec (const UBILtype **p,
  if (prec >= -1)
return -2;
#if __LIBGCC_BITINT_ORDER__ == __ORDER_BIG_ENDIAN__
- ++p;
+ ++*p;
#else
  --i;
#endif
@@ -101,7 +101,7 @@ bitint_reduce_prec (const UBILtype **p,
  if ((BILtype) mslimb >= 0)
{
#if __LIBGCC_BITINT_ORDER__ == __ORDER_BIG_ENDIAN__
- --p;
+ --*p;
#endif
  return prec - 1;
}
@@ -130,7 +130,7 @@ bitint_reduce_prec (const UBILtype **p,
  if (prec == 0)
return 1;
#if __LIBGCC_BITINT_ORDER__ == __ORDER_BIG_ENDIAN__
- ++p;
+ ++*p;
#else
  --i;
#endif
@@ -143,7 +143,7 @@ bitint_reduce_prec (const UBILtype **p,
  if (prec == 0)
return 1;
#if __LIBGCC_BITINT_ORDER__ == __ORDER_BIG_ENDIAN__
-  ++p;
+  ++*p;
#else
  --i;
#endif

Jakub




Re: [RFC PATCH 1/3] bitint: Support ABI-extended _BitInt(N)

2025-05-20 Thread Jakub Jelinek
On Tue, May 20, 2025 at 11:29:10AM +0800, Yang Yujie wrote:
> gcc/c-family/ChangeLog:
> 
>   * c-common.cc (resolve_overloaded_atomic_exchange): Truncate
>   _BitInt values before atomic store.

You aren't truncating _BitInt values before atomic store, you are extending
them.

>   (resolve_overloaded_atomic_compare_exchange): Same.
> 
> gcc/ChangeLog:
> 
>   * explow.cc (promote_function_mode): Allow _BitInt types
>   to be promoted.
>   (promote_mode): Same.
>   * expr.cc (expand_expr_real_1): Do not truncate _BitInts
>   from ABI boundaries if the target sets the "extended" flag.
>   (EXTEND_BITINT): Same.
>   * gimple-lower-bitint.cc (struct bitint_large_huge):
> Access the highest-order limb of a large/huge _BitInt using limb
> type rather than a new type with reduced precision if _BitInt(N)
> is extended by definition.

8 spaces instead of tabs on the 3 lines above.
Though, as I wrote, I'd really like to see that those changes in
gimple-lower-bitint.cc are really needed.  Plus I've committed large
changes to the file today for big endian support, so the patch won't apply
cleanly most likely (and needs to take into account endianity for the
extensions when really needed).

> --- a/gcc/c-family/c-common.cc
> +++ b/gcc/c-family/c-common.cc
> @@ -8033,11 +8033,36 @@ resolve_overloaded_atomic_exchange (location_t loc, 
> tree function,
>/* Convert new value to required type, and dereference it.
>   If *p1 type can have padding or may involve floating point which
>   could e.g. be promoted to wider precision and demoted afterwards,
> - state of padding bits might not be preserved.  */
> + state of padding bits might not be preserved.
> +
> + However, as a special case, we still want to preserve the padding
> + bits of _BitInt values if the ABI requires them to be extended in
> + memory.  */
> +
>build_indirect_ref (loc, p1, RO_UNARY_STAR);
> -  p1 = build2_loc (loc, MEM_REF, I_type,
> -build1 (VIEW_CONVERT_EXPR, I_type_ptr, p1),
> -build_zero_cst (TREE_TYPE (p1)));
> +
> +  tree p1type = TREE_TYPE (p1);
> +  bool bitint_extended_p = false;
> +  if (TREE_CODE (TREE_TYPE (p1type)) == BITINT_TYPE)
> +{
> +  struct bitint_info info;
> +  unsigned prec = TYPE_PRECISION (TREE_TYPE (p1type));
> +  targetm.c.bitint_type_info (prec, &info);
> +  bitint_extended_p = info.extended;
> +}
> +
> + if (bitint_extended_p)

The indentation of the above line looks wrong.

> +p1 = build1_loc (loc, CONVERT_EXPR, I_type,

Shouldn't this be fold_convert_loc instead?

> +  build2_loc (loc, MEM_REF, TREE_TYPE (p1type),
> +  p1, build_zero_cst (p1type)));
> +

Why the empty line in the middle of if/else?

> +  /* Otherwise, the padding bits might not be preserved, as stated above.  */
> +  else
> +p1 = build2_loc (loc, MEM_REF, I_type,
> +  build1 (VIEW_CONVERT_EXPR, I_type_ptr, p1),
> +  build_zero_cst (p1type));
> +
> +

Just one empty line, not two?

> +
> + if (bitint_extended_p)

Again.

> +p2 = build1_loc (loc, CONVERT_EXPR, I_type,

Again.

> +  build2_loc (loc, MEM_REF, TREE_TYPE (p2type),
> +  p2, build_zero_cst (p2type)));
> +

Again.

> +  /* Otherwise, the padding bits might not be preserved, as stated above.  */
> +  else
> +p2 = build2_loc (loc, MEM_REF, I_type,
> +  build1 (VIEW_CONVERT_EXPR, I_type_ptr, p2),
> +  build_zero_cst (p2type));
> +
>(*params)[2] = p2;
>  
>/* The rest of the parameters are fine. NULL means no special return value
> --- a/gcc/explow.cc
> +++ b/gcc/explow.cc
> @@ -852,11 +852,26 @@ promote_function_mode (const_tree type, machine_mode 
> mode, int *punsignedp,
>   return mode;
>  }
>  
> +  /* Handle _BitInt(N) that does not require promotion.  */
> +  if (TREE_CODE (type) == BITINT_TYPE)
> +{
> +  if (TYPE_MODE (type) == BLKmode)
> + return mode;
> +
> +  struct bitint_info info;
> +  bool ok = targetm.c.bitint_type_info (TYPE_PRECISION (type), &info);
> +  gcc_assert (ok);
> +
> +  if (!info.extended)
> + return mode;
> +}

There is a switch below, so IMNSHO the BITINT_TYPE handling should go into
the switch.

> +
> +
>switch (TREE_CODE (type))
>  {

So perhaps better add it here like:
    case BITINT_TYPE:
      if (TYPE_MODE (type) == BLKmode)
        return mode;
      else
        {
          struct bitint_info info;
          bool ok = targetm.c.bitint_type_info (TYPE_PRECISION (type), &info);
          gcc_assert (ok);
          if (!info.extended)
            return mode;
        }
      /* FALLTHRU */
>  case INTEGER_TYPE:   case ENUMERAL_TYPE:   case BOOLEAN_TYPE:
>  case REAL_TYPE:  case OFFSET_TYPE: case FIXED_POINT_TYPE:
> -case POINTER_TYPE:   case REFERENCE_TYPE:
> +case POINT

Re: [PATCH v1 2/6] libstdc++: Add tests for layout_left.

2025-05-20 Thread Tomasz Kaminski
On Tue, May 20, 2025 at 10:24 AM Luc Grosheintz 
wrote:

>
>
> On 5/19/25 2:56 PM, Tomasz Kaminski wrote:
> > On Sun, May 18, 2025 at 10:14 PM Luc Grosheintz <
> luc.groshei...@gmail.com>
> > wrote:
> >
> >> Implements a suite of tests for the currently implemented parts of
> >> layout_left. The individual tests are templated over the layout type, to
> >> allow reuse as more layouts are added.
> >>
> >> libstdc++-v3/ChangeLog:
> >>
> >>  * testsuite/23_containers/mdspan/layouts/class_mandate_neg.cc:
> New
> >> test.
> >>  * testsuite/23_containers/mdspan/layouts/ctors.cc: New test.
> >>  * testsuite/23_containers/mdspan/layouts/mapping.cc: New test.
> >>
> >> Signed-off-by: Luc Grosheintz 
> >> ---
> >>   .../mdspan/layouts/class_mandate_neg.cc   |  22 +
> >>   .../23_containers/mdspan/layouts/ctors.cc | 258 ++
> >>   .../23_containers/mdspan/layouts/mapping.cc   | 445 ++
> >>   3 files changed, 725 insertions(+)
> >>   create mode 100644
> >> libstdc++-v3/testsuite/23_containers/mdspan/layouts/class_mandate_neg.cc
> >>   create mode 100644
> >> libstdc++-v3/testsuite/23_containers/mdspan/layouts/ctors.cc
> >>   create mode 100644
> >> libstdc++-v3/testsuite/23_containers/mdspan/layouts/mapping.cc
> >>
> >> diff --git
> >>
> a/libstdc++-v3/testsuite/23_containers/mdspan/layouts/class_mandate_neg.cc
> >>
> b/libstdc++-v3/testsuite/23_containers/mdspan/layouts/class_mandate_neg.cc
> >> new file mode 100644
> >> index 000..f122541b3e8
> >> --- /dev/null
> >> +++
> >>
> b/libstdc++-v3/testsuite/23_containers/mdspan/layouts/class_mandate_neg.cc
> >> @@ -0,0 +1,22 @@
> >> +// { dg-do compile { target c++23 } }
> >> +#include
> >> +
> >> +#include 
> >> +
> >> +constexpr size_t dyn = std::dynamic_extent;
> >> +static constexpr size_t n = (size_t(1) << 7) - 1;
> >>
> > I would use numeric_limits_max here.
> >
> >> +
> >> +template
> >> +  struct A
> >> +  {
> >> +typename Layout::mapping> m0;
> >> +typename Layout::mapping> m1;
> >> +typename Layout::mapping> m2;
> >> +
> >> +using extents_type = std::extents;
> >> +typename Layout::mapping m3; // { dg-error "required
> >> from" }
> >> +  };
> >> +
> >> +A a_left; // { dg-error "required
> >> from" }
> >> +
> >> +// { dg-prune-output "must be representable as index_type" }
> >> diff --git
> a/libstdc++-v3/testsuite/23_containers/mdspan/layouts/ctors.cc
> >> b/libstdc++-v3/testsuite/23_containers/mdspan/layouts/ctors.cc
> >> new file mode 100644
> >> index 000..4592a05dec8
> >> --- /dev/null
> >> +++ b/libstdc++-v3/testsuite/23_containers/mdspan/layouts/ctors.cc
> >> @@ -0,0 +1,258 @@
> >> +// { dg-do run { target c++23 } }
> >> +#include 
> >> +
> >> +#include 
> >> +
> >> +constexpr size_t dyn = std::dynamic_extent;
> >> +
> >> +template
> >> +  constexpr void
> >> +  verify_from_exts(OExtents exts)
> >> +  {
> >> +auto m = Mapping(exts);
> >> +VERIFY(m.extents() == exts);
> >> +  }
> >> +
> >> +
> >> +template
> >> +  constexpr void
> >> +  verify_from_mapping(OMapping other)
> >> +  {
> >> +auto m = SMapping(other);
> >> +VERIFY(m.extents() == other.extents());
> >> +  }
> >> +
> >> +template
> >> +  requires (std::__mdspan::__is_extents)
> >> +  constexpr void
> >> +  verify(OExtents oexts)
> >> +  {
> >
> > In general, when possible we prefer to not use internal details in tests.
> > I would use if constexpr with requires { typename Other::layout_type; },
> i.e.
> > template
> > constexpr void
> > verify(Source const& src)
> > {
> >if constexpr (requires { typename Other::layout_type; })
> >   verify_from_mapping(src)
> >else
> >   verify_from_extents(src);
> > }
> >
> > +auto m = Mapping(oexts);
> >> +VERIFY(m.extents() == oexts);
> >> +  }
> >> +
> >> +template
>
>> +  requires (std::__mdspan::__standardized_mapping)
> >> +  constexpr void
> >> +  verify(OMapping other)
> >> +  {
> >> +constexpr auto rank = Mapping::extents_type::rank();
> >> +auto m = Mapping(other);
> >> +VERIFY(m.extents() == other.extents());
> >> +if constexpr (rank > 0)
> >> +  for(size_t i = 0; i < rank; ++i)
> >> +   VERIFY(std::cmp_equal(m.stride(i), other.stride(i)));
> >>
> > Why is this not checked in verify_from_mapping?
> >
> >> +  }
> >> +
> >> +
> >> +template
> >> +  constexpr void
> >> +  verify_nothrow_convertible(From from)
> >> +  {
> >> +static_assert(std::is_nothrow_constructible_v);
> >>
> > I would call  `verify_convertible` here, instead of these two lines.
> >
> >> +static_assert(std::is_convertible_v);
> >> +verify(from);
> >> +  }
> >> +
> >> +template
> >> +  constexpr void
> >> +  verify_convertible(From from)
> >> +  {
> >> +static_assert(std::is_convertible_v);
> >> +verify(from);
> >> +  }
> >> +
> >> +template
> >> +  constexpr void
> >> +  verify_constructible(From from)
> >> +  {
> >> +static_assert(!std::is_convertible_v);
> >> +static_assert(!std::is_not

[PATCH 1/2] vect: Remove non-SLP paths in strided slp and elementwise.

2025-05-20 Thread Robin Dapp
This replaces if (slp) with if (1) and if (!slp) with if (0).

gcc/ChangeLog:

* tree-vect-stmts.cc (vectorizable_load): Make non-slp paths
unreachable.
---
 gcc/tree-vect-stmts.cc | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 66958543bf8..c1998a22448 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -10689,7 +10689,7 @@ vectorizable_load (vec_info *vinfo,
  first_dr_info = dr_info;
}
 
-  if (slp && grouped_load
+  if (1 && grouped_load
  && memory_access_type == VMAT_STRIDED_SLP)
{
  group_size = DR_GROUP_SIZE (first_stmt_info);
@@ -10830,7 +10830,7 @@ vectorizable_load (vec_info *vinfo,
  ltype = build_aligned_type (ltype, align * BITS_PER_UNIT);
}
 
-  if (slp)
+  if (1)
{
  /* For SLP permutation support we need to load the whole group,
 not only the number of vector stmts the permutation result
@@ -10883,14 +10883,14 @@ vectorizable_load (vec_info *vinfo,
CONSTRUCTOR_APPEND_ELT (v, NULL_TREE, new_temp);
 
  group_el += lnel;
- if (! slp
+ if (0
  || group_el == group_size)
{
  n_groups++;
  /* When doing SLP make sure to not load elements from
 the next vector iteration, those will not be accessed
 so just use the last element again.  See PR107451.  */
- if (!slp || known_lt (n_groups, vf))
+ if (0 || known_lt (n_groups, vf))
{
  tree newoff = copy_ssa_name (running_off);
  gimple *incr
@@ -10938,7 +10938,7 @@ vectorizable_load (vec_info *vinfo,
 
  if (!costing_p)
{
- if (slp)
+ if (1)
{
  if (slp_perm)
dr_chain.quick_push (gimple_assign_lhs (new_stmt));
@@ -11032,7 +11032,7 @@ vectorizable_load (vec_info *vinfo,
   group_gap_adj = 0;
 
   /* VEC_NUM is the number of vect stmts to be created for this group.  */
-  if (slp)
+  if (1)
{
  grouped_load = false;
  /* If an SLP permutation is from N elements to N elements,
@@ -11077,7 +11077,7 @@ vectorizable_load (vec_info *vinfo,
   group_size = vec_num = 1;
   group_gap_adj = 0;
   ref_type = reference_alias_ptr_type (DR_REF (first_dr_info->dr));
-  if (slp)
+  if (1)
vec_num = SLP_TREE_NUMBER_OF_VEC_STMTS (slp_node);
 }
 
-- 
2.49.0



[PATCH 0/2] vect: Use strided loads for VMAT_STRIDED_SLP.

2025-05-20 Thread Robin Dapp
The second patch adds strided-load support for strided-slp memory
access.  The first patch makes the respective non-slp paths unreachable.

Robin Dapp (2):
  vect: Remove non-SLP paths in strided slp and elementwise.
  vect: Use strided loads for VMAT_STRIDED_SLP.

 gcc/internal-fn.cc|  21 ++
 gcc/internal-fn.h |   2 +
 .../gcc.target/riscv/rvv/autovec/pr118019-2.c |  51 +
 gcc/tree-vect-stmts.cc| 215 +++---
 4 files changed, 253 insertions(+), 36 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr118019-2.c

-- 
2.49.0



Re: [PATCH v1 5/6] libstdc++: Implement layout_stride from mdspan.

2025-05-20 Thread Luc Grosheintz




On 5/20/25 10:48 AM, Tomasz Kaminski wrote:

On Tue, May 20, 2025 at 10:45 AM Luc Grosheintz 
wrote:




On 5/20/25 10:24 AM, Tomasz Kaminski wrote:

On Sun, May 18, 2025 at 10:16 PM Luc Grosheintz <

luc.groshei...@gmail.com>

wrote:


Implements the remaining parts of layout_left and layout_right; and all
of layout_stride.

libstdc++-v3/ChangeLog:

  * include/std/mdspan(layout_stride): New class.

Signed-off-by: Luc Grosheintz 
---
   libstdc++-v3/include/std/mdspan | 219 +++-
   1 file changed, 216 insertions(+), 3 deletions(-)

diff --git a/libstdc++-v3/include/std/mdspan
b/libstdc++-v3/include/std/mdspan
index b1984eb2a33..31a38c736c2 100644
--- a/libstdc++-v3/include/std/mdspan
+++ b/libstdc++-v3/include/std/mdspan
@@ -366,6 +366,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 class mapping;
 };

+  struct layout_stride
+  {
+template
+  class mapping;
+  };
+
 namespace __mdspan
 {
   template
@@ -434,7 +440,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION

   template
 concept __standardized_mapping = __mapping_of
_Mapping>

-  || __mapping_of;
+  || __mapping_of
+  || __mapping_of;

   template
 concept __mapping_like = requires
@@ -503,6 +510,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  : mapping(__other.extents(), __mdspan::__internal_ctor{})
  { }

+  template
+   requires (is_constructible_v)
+   constexpr explicit(extents_type::rank() > 0)
+   mapping(const layout_stride::mapping<_OExtents>& __other)
+   : mapping(__other.extents(), __mdspan::__internal_ctor{})
+   {
+ __glibcxx_assert(
+   layout_left::mapping<_OExtents>(__other.extents()) ==

__other);



Could this be *this == other?


+   }
+
 constexpr mapping&
 operator=(const mapping&) noexcept = default;

@@ -518,8 +535,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  constexpr index_type
  operator()(_Indices... __indices) const noexcept
  {
- return __mdspan::__linear_index_left(
-   this->extents(), static_cast(__indices)...);
+ return __mdspan::__linear_index_left(_M_extents,
+   static_cast(__indices)...);
  }

 static constexpr bool
@@ -633,6 +650,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  : mapping(__other.extents(), __mdspan::__internal_ctor{})
  { }

+  template
+   requires (is_constructible_v)
+   constexpr explicit(extents_type::rank() > 0)
+   mapping(const layout_stride::mapping<_OExtents>& __other)

noexcept

+   : mapping(__other.extents(), __mdspan::__internal_ctor{})
+   {
+ __glibcxx_assert(
+   layout_right::mapping<_OExtents>(__other.extents()) ==
__other);


Similarly here.


+   }
+
 constexpr mapping&
 operator=(const mapping&) noexcept = default;

@@ -695,6 +722,192 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  [[no_unique_address]] _Extents _M_extents;
   };

+  namespace __mdspan
+  {
+template
+  constexpr typename _Mapping::index_type
+  __offset_impl(const _Mapping& __m, index_sequence<_Counts...>)
noexcept
+  { return __m(((void) _Counts, 0)...); }
+
+template
+  constexpr typename _Mapping::index_type
+  __offset(const _Mapping& __m) noexcept
+  {


Again, I would define __impl as nested lambda here:
auto __impl = [&](index_sequence<_Counts>) noexcept
{ return  __m(((void) _Counts, 0)...);  }
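
A self-contained sketch of the nested-lambda form, assuming C++20 lambda template parameter lists. It is written against a toy mapping rather than the libstdc++ internals; `ToyMapping` and `offset_of` are hypothetical names.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical stand-in for a rank-3 mapping whose offset is 7.
struct ToyMapping
{
  static constexpr std::size_t rank = 3;

  constexpr std::size_t
  operator()(std::size_t i, std::size_t j, std::size_t k) const noexcept
  { return 7 + 12 * i + 4 * j + k; }
};

// The offset of a mapping is its value at the all-zero multi-index.
template<typename Mapping>
  constexpr auto
  offset_of(const Mapping& m) noexcept
  {
    // The lambda deduces the index pack from the index_sequence argument and
    // expands it into one zero per dimension, so no separate helper function
    // is needed.
    auto impl = [&]<std::size_t... Counts>(std::index_sequence<Counts...>) noexcept
    { return m(((void) Counts, std::size_t(0))...); };
    return impl(std::make_index_sequence<Mapping::rank>());
  }
```

Here `offset_of(ToyMapping{})` evaluates the mapping at (0, 0, 0) and so returns 7.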


+   return __offset_impl(__m,
+   make_index_sequence<_Mapping::extents_type::rank()>());
+  }
+
+template
+  constexpr typename _Mapping::index_type
+  __linear_index_strides(const _Mapping& __m,
+_Indices... __indices)
+  {
+   using _IndexType = typename _Mapping::index_type;
+   _IndexType __res = 0;
+   if constexpr (sizeof...(__indices) > 0)
+ {
+   auto __update = [&, __pos = 0u](_IndexType __idx) mutable
+ {
+   __res += __idx * __m.stride(__pos++);
+ };
+   (__update(__indices), ...);
+ }
+   return __res;
+  }
+  }
+
+  template
+class layout_stride::mapping
+{
+  static_assert(__mdspan::__layout_extent<_Extents>,
+   "The size of extents_type must be representable as index_type");
+
+public:
+  using extents_type = _Extents;
+  using index_type = typename extents_type::index_type;
+  using size_type = typename extents_type::size_type;
+  using rank_type = typename extents_type::rank_type;
+  using layout_type = layout_stride;
+
+  constexpr
+  mapping() noexcept
+  {
+   auto __stride = index_type(1);
+   for(size_t __i = extents_type::rank(); __i > 0; --__i)
+ {
+   _M_strides[__i - 1] = __stride;
+   __stride *= _M_extents.extent(__i - 1);
+ }

Re: [PATCH v3] libstdc++: Implement C++23 P1659R3 starts_with and ends_with

2025-05-20 Thread Tomasz Kaminski
On Mon, May 19, 2025 at 6:06 PM Patrick Palka  wrote:

> On Mon, 19 May 2025, Patrick Palka wrote:
>
> > Changes in v3:
> >   * Use the forward_range code path for a (non-sized) bidirectional
> > haystack, since it's slightly fewer increments/decrements
> > overall.
> >   * Fix wrong iter_difference_t cast in starts_with.
> >
> > Changes in v2:
> >   Addressed Tomasz's review comments, namely:
> >   * Added explicit iter_difference_t casts
> >   * Made _S_impl member private
> >   * Optimized sized bidirectional case of ends_with
> >   * Rearranged control flow of starts_with::_S_impl
> >
> > Still left to do:
> >   * Add tests for integer-class types
> >   * Still working on a better commit description ;)
> >
> > -- >8 --
> >
> > libstdc++-v3/ChangeLog:
> >
> >   * include/bits/ranges_algo.h (__starts_with_fn, starts_with):
> >   Define.
> >   (__ends_with_fn, ends_with): Define.
> >   * include/bits/version.def (ranges_starts_ends_with): Define.
> >   * include/bits/version.h: Regenerate.
> >   * include/std/algorithm: Provide __cpp_lib_ranges_starts_ends_with.
> >   * src/c++23/std.cc.in (ranges::starts_with): Export.
> >   (ranges::ends_with): Export.
> >   * testsuite/25_algorithms/ends_with/1.cc: New test.
> >   * testsuite/25_algorithms/starts_with/1.cc: New test.
> > ---
> >  libstdc++-v3/include/bits/ranges_algo.h   | 236 ++
> >  libstdc++-v3/include/bits/version.def |   8 +
> >  libstdc++-v3/include/bits/version.h   |  10 +
> >  libstdc++-v3/include/std/algorithm|   1 +
> >  libstdc++-v3/src/c++23/std.cc.in  |   4 +
> >  .../testsuite/25_algorithms/ends_with/1.cc| 129 ++
> >  .../testsuite/25_algorithms/starts_with/1.cc  | 128 ++
> >  7 files changed, 516 insertions(+)
> >  create mode 100644 libstdc++-v3/testsuite/25_algorithms/ends_with/1.cc
> >  create mode 100644 libstdc++-v3/testsuite/25_algorithms/starts_with/1.cc
> >
> > diff --git a/libstdc++-v3/include/bits/ranges_algo.h
> b/libstdc++-v3/include/bits/ranges_algo.h
> > index f36e7dd59911..54646ae62f7e 100644
> > --- a/libstdc++-v3/include/bits/ranges_algo.h
> > +++ b/libstdc++-v3/include/bits/ranges_algo.h
> > @@ -438,6 +438,242 @@ namespace ranges
> >
> >inline constexpr __search_n_fn search_n{};
> >
> > +#if __glibcxx_ranges_starts_ends_with // C++ >= 23
> > +  struct __starts_with_fn
> > +  {
> > +template _Sent1,
> > +  input_iterator _Iter2, sentinel_for<_Iter2> _Sent2,
> > +  typename _Pred = ranges::equal_to,
> > +  typename _Proj1 = identity, typename _Proj2 = identity>
> > +  requires indirectly_comparable<_Iter1, _Iter2, _Pred, _Proj1,
> _Proj2>
> > +  constexpr bool
> > +  operator()(_Iter1 __first1, _Sent1 __last1,
> > +  _Iter2 __first2, _Sent2 __last2, _Pred __pred = {},
> > +  _Proj1 __proj1 = {}, _Proj2 __proj2 = {}) const
> > +  {
> > + iter_difference_t<_Iter1> __n1 = -1;
> > + iter_difference_t<_Iter2> __n2 = -1;
> > + if constexpr (sized_sentinel_for<_Sent1, _Iter1>)
> > +   __n1 = __last1 - __first1;
> > + if constexpr (sized_sentinel_for<_Sent2, _Iter2>)
> > +   __n2 = __last2 - __first2;
> > + return _S_impl(std::move(__first1), __last1, __n1,
> > +std::move(__first2), __last2, __n2,
> > +std::move(__pred),
> > +std::move(__proj1), std::move(__proj2));
> > +  }
> > +
> > +template > +  typename _Pred = ranges::equal_to,
> > +  typename _Proj1 = identity, typename _Proj2 = identity>
> > +  requires indirectly_comparable,
> iterator_t<_Range2>,
> > +  _Pred, _Proj1, _Proj2>
> > +  constexpr bool
> > +  operator()(_Range1&& __r1, _Range2&& __r2, _Pred __pred = {},
> > +  _Proj1 __proj1 = {}, _Proj2 __proj2 = {}) const
> > +  {
> > + range_difference_t<_Range1> __n1 = -1;
> > + range_difference_t<_Range1> __n2 = -1;
> > + if constexpr (sized_range<_Range1>)
> > +   __n1 = ranges::size(__r1);
> > + if constexpr (sized_range<_Range2>)
> > +   __n2 = ranges::size(__r2);
> > + return _S_impl(ranges::begin(__r1), ranges::end(__r1), __n1,
> > +ranges::begin(__r2), ranges::end(__r2), __n2,
> > +std::move(__pred),
> > +std::move(__proj1), std::move(__proj2));
> > +  }
> > +
> > +  private:
> > +template typename _Sent2,
> > +  typename _Pred,
> > +  typename _Proj1, typename _Proj2>
> > +  static constexpr bool
> > +  _S_impl(_Iter1 __first1, _Sent1 __last1,
> iter_difference_t<_Iter1> __n1,
> > +   _Iter2 __first2, _Sent2 __last2, iter_difference_t<_Iter2>
> __n2,
> > +   _Pred __pred,
> > +   _Proj1 __proj1, _Proj2 __proj2)
> > +  {
> > + if (__first2 == __last2) [[unlikely]]
> > +   return true;
> > + e

Re: [PATCH v1 5/6] libstdc++: Implement layout_stride from mdspan.

2025-05-20 Thread Luc Grosheintz




On 5/20/25 11:40 AM, Tomasz Kaminski wrote:

On Tue, May 20, 2025 at 11:20 AM Luc Grosheintz 
wrote:




On 5/20/25 10:48 AM, Tomasz Kaminski wrote:

On Tue, May 20, 2025 at 10:45 AM Luc Grosheintz <

luc.groshei...@gmail.com>

wrote:




On 5/20/25 10:24 AM, Tomasz Kaminski wrote:

On Sun, May 18, 2025 at 10:16 PM Luc Grosheintz <

luc.groshei...@gmail.com>

wrote:


Implements the remaining parts of layout_left and layout_right; and

all

of layout_stride.

libstdc++-v3/ChangeLog:

   * include/std/mdspan(layout_stride): New class.

Signed-off-by: Luc Grosheintz 
---
libstdc++-v3/include/std/mdspan | 219

+++-

1 file changed, 216 insertions(+), 3 deletions(-)

diff --git a/libstdc++-v3/include/std/mdspan
b/libstdc++-v3/include/std/mdspan
index b1984eb2a33..31a38c736c2 100644
--- a/libstdc++-v3/include/std/mdspan
+++ b/libstdc++-v3/include/std/mdspan
@@ -366,6 +366,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  class mapping;
  };

+  struct layout_stride
+  {
+template
+  class mapping;
+  };
+
  namespace __mdspan
  {
template
@@ -434,7 +440,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION

template
  concept __standardized_mapping = __mapping_of
_Mapping>

-  || __mapping_of;
+  || __mapping_of
+  || __mapping_of;

template
  concept __mapping_like = requires
@@ -503,6 +510,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   : mapping(__other.extents(), __mdspan::__internal_ctor{})
   { }

+  template
+   requires (is_constructible_v)
+   constexpr explicit(extents_type::rank() > 0)
+   mapping(const layout_stride::mapping<_OExtents>& __other)
+   : mapping(__other.extents(), __mdspan::__internal_ctor{})
+   {
+ __glibcxx_assert(
+   layout_left::mapping<_OExtents>(__other.extents()) ==

__other);



Could this be *this == other?


+   }
+
  constexpr mapping&
  operator=(const mapping&) noexcept = default;

@@ -518,8 +535,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   constexpr index_type
   operator()(_Indices... __indices) const noexcept
   {
- return __mdspan::__linear_index_left(
-   this->extents(), static_cast(__indices)...);
+ return __mdspan::__linear_index_left(_M_extents,
+   static_cast(__indices)...);
   }

  static constexpr bool
@@ -633,6 +650,16 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   : mapping(__other.extents(), __mdspan::__internal_ctor{})
   { }

+  template
+   requires (is_constructible_v)
+   constexpr explicit(extents_type::rank() > 0)
+   mapping(const layout_stride::mapping<_OExtents>& __other)

noexcept

+   : mapping(__other.extents(), __mdspan::__internal_ctor{})
+   {
+ __glibcxx_assert(
+   layout_right::mapping<_OExtents>(__other.extents()) ==
__other);


Similarly here.


+   }
+
  constexpr mapping&
  operator=(const mapping&) noexcept = default;

@@ -695,6 +722,192 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   [[no_unique_address]] _Extents _M_extents;
};

+  namespace __mdspan
+  {
+template
+  constexpr typename _Mapping::index_type
+  __offset_impl(const _Mapping& __m, index_sequence<_Counts...>)
noexcept
+  { return __m(((void) _Counts, 0)...); }
+
+template
+  constexpr typename _Mapping::index_type
+  __offset(const _Mapping& __m) noexcept
+  {


Again, I would define __impl as nested lambda here:
auto __impl = [&](index_sequence<_Counts>) noexcept
{ return  __m(((void) _Counts, 0)...);  }
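
A sketch of what the nested-lambda form could look like, written against a hypothetical toy_mapping rather than the real mapping types. toy_mapping, its base member, and offset_sketch are assumptions made for illustration; the lambda uses the C++20 template-parameter syntax.

```cpp
#include <cassert>  // for the check shown in the note below
#include <cstddef>
#include <utility>

// Hypothetical rank-3 mapping with a nonzero offset, for illustration only.
struct toy_mapping
{
  static constexpr std::size_t rank = 3;
  long base;
  long strides[3];

  constexpr long
  operator()(long i, long j, long k) const noexcept
  { return base + i * strides[0] + j * strides[1] + k * strides[2]; }
};

// Evaluates the mapping at the all-zero index. The pack expansion lives in a
// local lambda, so no separate *_impl function template is needed.
template<typename Mapping>
constexpr long
offset_sketch(const Mapping& m) noexcept
{
  auto impl = [&]<std::size_t... Counts>(std::index_sequence<Counts...>) noexcept
    { return m(((void) Counts, 0L)...); };
  return impl(std::make_index_sequence<Mapping::rank>());
}
```

For a toy_mapping m with base 7, assert(offset_sketch(m) == 7) holds whatever the strides are, since every index is zero.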


+   return __offset_impl(__m,
+   make_index_sequence<_Mapping::extents_type::rank()>());
+  }
+
+template
+  constexpr typename _Mapping::index_type
+  __linear_index_strides(const _Mapping& __m,
+_Indices... __indices)
+  {
+   using _IndexType = typename _Mapping::index_type;
+   _IndexType __res = 0;
+   if constexpr (sizeof...(__indices) > 0)
+ {
+   auto __update = [&, __pos = 0u](_IndexType __idx) mutable
+ {
+   __res += __idx * __m.stride(__pos++);
+ };
+   (__update(__indices), ...);
+ }
+   return __res;
+  }
+  }
+
+  template
+class layout_stride::mapping
+{
+  static_assert(__mdspan::__layout_extent<_Extents>,
+   "The size of extents_type must be representable as

index_type");

+
+public:
+  using extents_type = _Extents;
+  using index_type = typename extents_type::index_type;
+  using size_type = typename extents_type::size_type;
+  using rank_type = typename extents_type::rank_type;
+  using layout_type = layout_stride;
+
+  constexpr
+  mapping() noexcept
+  {
+   auto __stride = index_type(1);
+   for(siz

[committed] libstdc++: Cleanup and stabilize format _Spec<_CharT> and _Pres_type.

2025-05-20 Thread Tomasz Kamiński
This patch makes the following changes to _Pres_type values:
 * _Pres_esc is replaced with separate _M_debug flag.
 * _Pres_s, _Pres_p do not overlap with _Pres_none.
 * hexadecimal presentation uses the same values for pointer, integer
   and floating point types.

The members of _Spec<_CharT> are rearranged so the class contains 8 bits
reserved for future use (_M_reserved) and 8 bits of tail padding.
Derived classes (like _ChronoSpec<_CharT>) can reuse the storage for initial
members. We also add _SpecBase as the base class for _Spec<_CharT> to make
it non-C++98 POD, which allows tail padding to be reused on Itanium ABI.

Finally, the format enumerators are defined as enum class with unsigned
char as underlying type, followed by using enum to bring names in scope.
_Term_char names are adjusted for consistency, and enumerator values are
changed so they can fit in smaller bitfields.

The '?' is changed to separate _M_debug flag, to allow debug format to be
independent from the presentation type, and applied to multiple presentation
types. For example it could be used to trigger memberwise or reflection based
formatting.

The _M_format_character and _M_format_character_escaped functions are merged
into a single function that handles normal and debug presentation. In particular
this would allow future support for '?c' for printing integer types as escaped
characters. _S_character_width is also folded into the merged function.
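
To illustrate the idea of a debug flag that is independent of the presentation type, here is a minimal sketch of a single character-formatting function that serves both the plain and the escaped case. Everything in it (pres_sketch, spec_sketch, format_character_sketch, and the small set of escapes handled) is a hypothetical simplification and not the libstdc++ implementation.

```cpp
#include <cassert>  // for the checks shown in the note below
#include <string>

// Hypothetical, heavily reduced stand-ins for _Pres_type and _Spec.
enum class pres_sketch : unsigned char { none, c, d };

struct spec_sketch
{
  pres_sketch type = pres_sketch::none;
  bool debug = false;  // set by '?', independent of the presentation type
};

// One function for both presentations: the debug flag only decides whether
// the character is quoted and escaped.
inline std::string
format_character_sketch(char c, const spec_sketch& spec)
{
  if (!spec.debug)
    return std::string(1, c);

  std::string out = "'";
  switch (c)
    {
    case '\t': out += "\\t"; break;
    case '\n': out += "\\n"; break;
    case '\r': out += "\\r"; break;
    case '\'': out += "\\'"; break;
    case '\\': out += "\\\\"; break;
    default:   out += c;     break;
    }
  out += '\'';
  return out;
}
```

With the flag clear the function returns the character unchanged; with it set, 'a' becomes the three characters 'a' (quotes included) and a newline becomes the four characters '\n'.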

Decoupling _Pres_s value from _Pres_none allows it to be used for string
presentation for range formatting, and removes the need for separate _Pres_seq
and _Pres_str. This does not affect formatting of bool, as
__formatter_int::_M_parse overrides the default value of _M_type. And with
separation of the _M_debug flag, __formatter_str::format behavior is now
agnostic to the _M_type value.

The values for integer presentation types are arranged so textual presentations
(_Pres_s, _Pres_c) are grouped together. For consistency, floating point
hexadecimal presentation uses the same values as integer ones.

New _Pres_p and setting for _M_alt enables using the same spec to configure
formatting of uintptr_t with __formatter_int, and const void* with
__formatter_ptr. Differentiating it from _Pres_none would allow a future
formatter that requires an explicit presentation type to be specified. This
would allow std::vector to be formatted directly with '{::p}' format spec.

The constructors for __formatter_int and __formatter_ptr from _Spec<_CharT>
now also set default presentation modes, as format functions expect them.

libstdc++-v3/ChangeLog:

* include/bits/chrono_io.h (_ChronoSpec::_M_locale_specific):
Declare as bit field in tail-padding.
* include/bits/formatfwd.h (__format::_Align): Defined as enum
class and add using enum.
* include/std/format (__format::_Pres_type, __format::_Sign)
(__format::_WidthPrec,  __format::_Arg_t): Defined as enum class
and add using enum.
(_Pres_type::_Pres_esc): Replace with _Pres_max.
(_Pres_type::_Pres_seq, _Pres_type::_Pres_str): Remove.
(__format::_Pres_type): Updated values of enumerators as described
above.
(__format::_Spec): Rearranged members to have 8 bits of tail-padding.
(_Spec::_M_debug): Defined.
(_Spec::_M_reserved): Extended to 8 bits and moved at the end.
(_Spec::_M_reserved2): Removed.
(_Spec::_M_parse_fill_and_align, _Spec::_M_parse_sign)
(__format::__write_padded_as_spec): Adjusted default value checks.
(__format::_Term_char): Add using enum and adjust enumerators.
(__Escapes::_S_term): Adjusted for _Term_char values.
(__format::__should_escape_ascii): Adjusted _Term_char uses.
(__format::__write_escaped): Adjusted for _Term_char.
(__formatter_str::parse): Set _Pres_s if specified and _M_debug
instead of _Pres_esc.
(__formatter_str::set_debug_format): Set _M_debug instead of
_Pres_esc.
(__formatter_str::format, __formatter_str::_M_format_range):
Check _M_debug instead of _Pres_esc.
(__formatter_str::_M_format_escaped): Adjusted _Term_char uses.
(__formatter_int::__formatter_int(_Spec<_CharT>)): Set _Pres_d if
default presentation type is not set.
(__formatter_int::_M_parse): Adjusted default value checks.
(__formatter_int::_M_do_parse): Set _M_debug instead of _Pres_esc.
(__formatter_int::_M_format_character): Handle escaped presentation.
(__formatter_int::_M_format_character_escaped)
(__formatter_int::_S_character_width): Merged into
_M_format_character.
(__formatter_ptr::__formatter_ptr(_Spec<_CharT>)): Set _Pres_p if
default presentation type is not set.
(__formatter_ptr::parse): Add default __type parameter, store _Pres_p,
and handle _M_alt to be consistent with meaning for integers.
(__formatter_ptr<_CharT>::_M_set_default): Define.
  

Re: [PATCH] libstdc++: Implement stringstream from string_view [P2495R3]

2025-05-20 Thread Tomasz Kaminski
On Mon, May 19, 2025 at 11:28 PM Nathan Myers  wrote:
In the title, we usually put the link to bugzilla (*PR119741* in your case),
not the paper.
Then link the paper in the commit description.

> Add constructors to stringbuf, stringstream, istringstream,
> and ostringstream, and a matching overload of str(sv) in each,
> that take anything convertible to a string_view where the
> existing functions take a string.
>
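
As a rough illustration of how user code might take advantage of the new constructors while still building against older library versions, the sketch below tests the feature-test macro named in the patch and falls back to an explicit std::string copy otherwise. The function name parse_first_int_sketch is hypothetical.

```cpp
#include <cassert>  // for the check shown in the note below
#include <sstream>
#include <string>
#include <string_view>

// Hypothetical helper: reads the first integer from a string_view.
inline int
parse_first_int_sketch(std::string_view text)
{
#ifdef __cpp_lib_sstream_from_string_view
  // P2495R3: construct directly from something convertible to string_view.
  std::istringstream in(text);
#else
  // Fallback for libraries without P2495R3: make an explicit std::string copy.
  std::istringstream in{std::string(text)};
#endif
  int value = 0;
  in >> value;
  return value;
}
```

Either branch yields the same result, for example assert(parse_first_int_sketch("42 rest") == 42); only the way the stream buffer is initialized differs.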

After you put the bugzilla number, git gcc-verify will suggest that you add
the following note:
PR libstdc++/*119741*



>
> libstdc++-v3/ChangeLog:
>
> P2495R3 stringstream to init from string_view-ish
>
We usually put only the changed files here. Did git gcc-verify accept it?

> * include/std/sstream: full implementation, really just
> decls, requires clause and plumbing.
> * include/std/bits/version.def, .h: new preprocessor symbol
> __cpp_lib_sstream_from_string_view.
> * testsuite/27_io/basic_stringbuf/cons/char/3.cc: New tests.
> * testsuite/27_io/basic_istringstream/cons/char/2.cc: New tests.
> * testsuite/27_io/basic_ostringstream/cons/char/4.cc: New tests.
> * testsuite/27_io/basic_stringstream/cons/char/2.cc: New tests.
> ---
>  libstdc++-v3/ChangeLog|  11 +
>  libstdc++-v3/include/bits/version.def |  11 +-
>  libstdc++-v3/include/bits/version.h   |  10 +
>  libstdc++-v3/include/std/sstream  | 181 +--
>  .../27_io/basic_istringstream/cons/char/2.cc  | 193 
>  .../27_io/basic_ostringstream/cons/char/4.cc  | 193 
>  .../27_io/basic_stringbuf/cons/char/3.cc  | 216 ++
>  .../27_io/basic_stringstream/cons/char/2.cc   | 194 
>  8 files changed, 990 insertions(+), 19 deletions(-)
>  create mode 100644
> libstdc++-v3/testsuite/27_io/basic_istringstream/cons/char/2.cc
>  create mode 100644
> libstdc++-v3/testsuite/27_io/basic_ostringstream/cons/char/4.cc
>  create mode 100644
> libstdc++-v3/testsuite/27_io/basic_stringbuf/cons/char/3.cc
>  create mode 100644
> libstdc++-v3/testsuite/27_io/basic_stringstream/cons/char/2.cc
>
> diff --git a/libstdc++-v3/ChangeLog b/libstdc++-v3/ChangeLog
> index b45f8c2c7a5..ac0ff4a386f 100644
> --- a/libstdc++-v3/ChangeLog
> +++ b/libstdc++-v3/ChangeLog
> @@ -41,6 +41,17 @@
> PR libstdc++/119246
> * include/std/format: Updated check for _GLIBCXX_FORMAT_F128.
>
> +2025-05-14  Nathan Myers  
> +   P2495R3 stringstream to init from string_view-ish
> +   * include/std/sstream: full implementation, really just
> +   decls, requires clause and plumbing.
> +   * include/std/bits/version.def, .h: new preprocessor symbol
> +   __cpp_lib_sstream_from_string_view.
> +   * testsuite/27_io/basic_stringbuf/cons/char/3.cc: New tests.
> +   * testsuite/27_io/basic_istringstream/cons/char/2.cc: New tests.
> +   * testsuite/27_io/basic_ostringstream/cons/char/4.cc: New tests.
> +   * testsuite/27_io/basic_stringstream/cons/char/2.cc: New tests.
> +
>
Changelogs are now automatically generated from commit messages, so you do
not need to edit this file.

>  2025-05-14  Tomasz Kamiński  
>
> PR libstdc++/119125
> diff --git a/libstdc++-v3/include/bits/version.def
> b/libstdc++-v3/include/bits/version.def
> index 6ca148f0488..567c56b4117 100644
> --- a/libstdc++-v3/include/bits/version.def
> +++ b/libstdc++-v3/include/bits/version.def
> @@ -649,7 +649,7 @@ ftms = {
>};
>values = {
>  v = 1;
> -/* For when there's no gthread.  */
> +// For when there is no gthread.
>  cxxmin = 17;
>  hosted = yes;
>  gthread = no;
> @@ -1961,6 +1961,15 @@ ftms = {
>};
>  };
>
> +ftms = {
> +  name = sstream_from_string_view;
> +  values = {
> +v = 202302;
> +cxxmin = 26;
> +hosted = yes;
> +  };
> +};
> +
>  // Standard test specifications.
>  stds[97] = ">= 199711L";
>  stds[03] = ">= 199711L";
> diff --git a/libstdc++-v3/include/bits/version.h
> b/libstdc++-v3/include/bits/version.h
> index 48a090c14a3..5d1beb83a25 100644
> --- a/libstdc++-v3/include/bits/version.h
> +++ b/libstdc++-v3/include/bits/version.h
> @@ -2193,4 +2193,14 @@
>  #endif /* !defined(__cpp_lib_modules) && defined(__glibcxx_want_modules)
> */
>  #undef __glibcxx_want_modules
>
> +#if !defined(__cpp_lib_sstream_from_string_view)
> +# if (__cplusplus >  202302L) && _GLIBCXX_HOSTED
> +#  define __glibcxx_sstream_from_string_view 202302L
> +#  if defined(__glibcxx_want_all) ||
> defined(__glibcxx_want_sstream_from_string_view)
> +#   define __cpp_lib_sstream_from_string_view 202302L
> +#  endif
> +# endif
> +#endif /* !defined(__cpp_lib_sstream_from_string_view) &&
> defined(__glibcxx_want_sstream_from_string_view) */
> +#undef __glibcxx_want_sstream_from_string_view
> +
>  #undef __glibcxx_want_all
> diff --git a/libstdc++-v3/include/std/sst

Re: [PATCH v2 1/6] libstdc++: Implement layout_left from mdspan.

2025-05-20 Thread Luc Grosheintz




On 5/20/25 4:07 PM, Tomasz Kaminski wrote:

On Tue, May 20, 2025 at 3:16 PM Luc Grosheintz 
wrote:


Implements the parts of layout_left that don't depend on any of the
other layouts.

libstdc++-v3/ChangeLog:

 * include/std/mdspan (layout_left): New class.

Signed-off-by: Luc Grosheintz 
---


Sending feedback on this PR; I do not think I will have time to review the
remaining ones today.
Thanks for the update, this really looks good. Good job on catching the
rank_dynamic==0 case.

I have added more stylistic suggestions:
I would consider renaming the internal functions to `exts` instead of
`subextents`/`extents`.
We are inside the __mdspan namespace, so the risk of collision is minimal.
I also added a few suggestions for default arguments.


Many thanks for the review and no rush! One clarifying question: since
some of the loops have (or will have) short-circuiting, do you mean
std::for_each or range-based for-loops, `for(auto __factor : __exts)`? Do
you mind if I use the latter?
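
For context, a range-based loop keeps early exits straightforward, which std::for_each does not. A minimal sketch, using std::array and a hypothetical dyn_sketch sentinel in place of the patch's span and dynamic_extent (all names here are assumptions for illustration):

```cpp
#include <array>
#include <cassert>  // for the checks shown in the note below
#include <cstddef>

// Hypothetical stand-in for std::dynamic_extent.
inline constexpr std::size_t dyn_sketch = static_cast<std::size_t>(-1);

// Product of the static extents, skipping dynamic ones. The early return on
// a zero extent shows the kind of short-circuit a range-based loop allows.
template<std::size_t N>
constexpr std::size_t
static_extents_prod_sketch(const std::array<std::size_t, N>& exts)
{
  std::size_t ret = 1;
  for (auto factor : exts)
    {
      if (factor == 0)
        return 0;
      if (factor != dyn_sketch)
        ret *= factor;
    }
  return ret;
}
```

For extents {2, dyn_sketch, 5} the result is 10, an empty array gives 1, and any zero extent makes the function return 0 without visiting the remaining entries.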



More comments below.


  libstdc++-v3/include/std/mdspan | 309 +++-
  1 file changed, 308 insertions(+), 1 deletion(-)

diff --git a/libstdc++-v3/include/std/mdspan
b/libstdc++-v3/include/std/mdspan
index 47cfa405e44..d90fed57a19 100644
--- a/libstdc++-v3/include/std/mdspan
+++ b/libstdc++-v3/include/std/mdspan
@@ -144,6 +144,20 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   { return __exts[__i]; });
   }

+   static constexpr span
+   _S_static_subextents(size_t __begin, size_t __end) noexcept


+   {

+ return {_Extents.data() + __begin, _Extents.data() + __end};
+   }
+
+   constexpr span
+   _M_dynamic_subextents(size_t __begin, size_t __end) const noexcept
+   requires (_Extents.size() > 0)
+   {
+ return {_M_dynamic_extents + _S_dynamic_index[__begin],
+ _M_dynamic_extents + _S_dynamic_index[__end]};
+   }
+
private:
 using _S_storage = __array_traits<_IndexType,
_S_rank_dynamic>::_Type;
 [[no_unique_address]] _S_storage _M_dynamic_extents;
@@ -160,6 +174,21 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 || _Extent <= numeric_limits<_IndexType>::max();
}

+  namespace __mdspan
+  {
+template
+  constexpr span
+  __static_subextents(size_t __begin, size_t __end)


Consider adding default arguments: __begin = 0u, __end = _Extents::rank().
This would simplify calls.


+  { return _Extents::_S_storage::_S_static_subextents(__begin,
__end); }
+
+template
+  constexpr span
+  __dynamic_subextents(const _Extents& __exts, size_t __begin, size_t
__end)
+  {
+   return __exts._M_dynamic_extents._M_dynamic_subextents(__begin,
__end);
+  }
+  }
+
template
  class extents
  {
@@ -251,7 +280,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 : _M_dynamic_extents(span(__exts))
 { }

-
template<__mdspan::__valid_index_type _OIndexType,
size_t _Nm>
 requires (_Nm == rank() || _Nm == rank_dynamic())
 constexpr explicit(_Nm != rank_dynamic())
@@ -276,6 +304,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 }

  private:
+  friend span
+  __mdspan::__static_subextents(size_t, size_t);
+
+  friend span
+  __mdspan::__dynamic_subextents(const extents&, size_t,
size_t);
+
using _S_storage = __mdspan::_ExtentsStorage<
 _IndexType, array{_Extents...}>;
[[no_unique_address]] _S_storage _M_dynamic_extents;
@@ -286,6 +320,52 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION

namespace __mdspan
{
+template
+  constexpr size_t
+  __static_extents_prod(size_t __begin, size_t __end)
+  {


+   auto __sta_exts = __static_subextents<_Extents>(__begin, __end);

+   size_t __ret = 1;
+   for(size_t __i = 0; __i < __sta_exts.size(); ++__i)


Again we could use each here, once we have span.


+ if (__sta_exts[__i] != dynamic_extent)
+   __ret *= __sta_exts[__i];
+   return __ret;
+  }
+
+template
+  constexpr size_t
+  __dynamic_extents_prod(const _Extents& __exts, size_t __begin,
+size_t __end)
+  {
+   auto __dyn_exts = __dynamic_subextents<_Extents>(__exts, __begin,
+__end);
+   size_t __ret = 1;
+   for(size_t __i = 0; __i < __dyn_exts.size(); ++__i)


Again we could use each here, once we have span.  And we could inline the
function in __exts_prod.


+   __ret *= __dyn_exts[__i];
+   return __ret;
+  }
+
+template
+  constexpr typename _Extents::index_type
+  __exts_prod(const _Extents& __exts, size_t __begin, size_t __end)
noexcept
+  {
+   using _IndexType = typename _Extents::index_type;
+   auto __ret = __static_extents_prod<_Extents>(__begin, __end);
+   if constexpr (_Extents::rank_dynamic() > 0)
+ __ret *= __dynamic_extents_prod(__exts, __begin, __end);
+   return __ret;
+  }
+
+tem

[PATCH v5] libstdc++: Implement C++23 P1659R3 starts_with and ends_with

2025-05-20 Thread Patrick Palka
Changes in v5:
  * dispatch to starts_with for the both-bidi/common range case

Changes in v4:
  * optimize the both-bidi/common ranges case, as suggested by
Tomasz
  * add tests for that code path

Changes in v3:
  * Use the forward_range code path for a (non-sized) bidirectional
haystack, since it's slightly fewer increments/decrements
overall.
  * Fix wrong iter_difference_t cast in starts_with.

Changes in v2:
  Addressed Tomasz's review comments, namely:
  * Added explicit iter_difference_t casts
  * Made _S_impl member private
  * Optimized sized bidirectional case of ends_with
  * Rearranged control flow of starts_with::_S_impl

Still left to do:
  * Add tests for integer-class types
  * Still working on a better commit description ;)

-- >8 --

libstdc++-v3/ChangeLog:

* include/bits/ranges_algo.h (__starts_with_fn, starts_with):
Define.
(__ends_with_fn, ends_with): Define.
* include/bits/version.def (ranges_starts_ends_with): Define.
* include/bits/version.h: Regenerate.
* include/std/algorithm: Provide __cpp_lib_ranges_starts_ends_with.
* src/c++23/std.cc.in (ranges::starts_with): Export.
(ranges::ends_with): Export.
* testsuite/25_algorithms/ends_with/1.cc: New test.
* testsuite/25_algorithms/starts_with/1.cc: New test.
---
 libstdc++-v3/include/bits/ranges_algo.h   | 247 ++
 libstdc++-v3/include/bits/version.def |   8 +
 libstdc++-v3/include/bits/version.h   |  10 +
 libstdc++-v3/include/std/algorithm|   1 +
 libstdc++-v3/src/c++23/std.cc.in  |   4 +
 .../testsuite/25_algorithms/ends_with/1.cc| 135 ++
 .../testsuite/25_algorithms/starts_with/1.cc  | 128 +
 7 files changed, 533 insertions(+)
 create mode 100644 libstdc++-v3/testsuite/25_algorithms/ends_with/1.cc
 create mode 100644 libstdc++-v3/testsuite/25_algorithms/starts_with/1.cc

diff --git a/libstdc++-v3/include/bits/ranges_algo.h 
b/libstdc++-v3/include/bits/ranges_algo.h
index f36e7dd59911..60f7bf841f3f 100644
--- a/libstdc++-v3/include/bits/ranges_algo.h
+++ b/libstdc++-v3/include/bits/ranges_algo.h
@@ -438,6 +438,253 @@ namespace ranges
 
   inline constexpr __search_n_fn search_n{};
 
+#if __glibcxx_ranges_starts_ends_with // C++ >= 23
+  struct __starts_with_fn
+  {
+template _Sent1,
+input_iterator _Iter2, sentinel_for<_Iter2> _Sent2,
+typename _Pred = ranges::equal_to,
+typename _Proj1 = identity, typename _Proj2 = identity>
+  requires indirectly_comparable<_Iter1, _Iter2, _Pred, _Proj1, _Proj2>
+  constexpr bool
+  operator()(_Iter1 __first1, _Sent1 __last1,
+_Iter2 __first2, _Sent2 __last2, _Pred __pred = {},
+_Proj1 __proj1 = {}, _Proj2 __proj2 = {}) const
+  {
+   iter_difference_t<_Iter1> __n1 = -1;
+   iter_difference_t<_Iter2> __n2 = -1;
+   if constexpr (sized_sentinel_for<_Sent1, _Iter1>)
+ __n1 = __last1 - __first1;
+   if constexpr (sized_sentinel_for<_Sent2, _Iter2>)
+ __n2 = __last2 - __first2;
+   return _S_impl(std::move(__first1), __last1, __n1,
+  std::move(__first2), __last2, __n2,
+  std::move(__pred),
+  std::move(__proj1), std::move(__proj2));
+  }
+
+template
+  requires indirectly_comparable, iterator_t<_Range2>,
+_Pred, _Proj1, _Proj2>
+  constexpr bool
+  operator()(_Range1&& __r1, _Range2&& __r2, _Pred __pred = {},
+_Proj1 __proj1 = {}, _Proj2 __proj2 = {}) const
+  {
+   range_difference_t<_Range1> __n1 = -1;
+   range_difference_t<_Range1> __n2 = -1;
+   if constexpr (sized_range<_Range1>)
+ __n1 = ranges::size(__r1);
+   if constexpr (sized_range<_Range2>)
+ __n2 = ranges::size(__r2);
+   return _S_impl(ranges::begin(__r1), ranges::end(__r1), __n1,
+  ranges::begin(__r2), ranges::end(__r2), __n2,
+  std::move(__pred),
+  std::move(__proj1), std::move(__proj2));
+  }
+
+  private:
+template
+  static constexpr bool
+  _S_impl(_Iter1 __first1, _Sent1 __last1, iter_difference_t<_Iter1> __n1,
+ _Iter2 __first2, _Sent2 __last2, iter_difference_t<_Iter2> __n2,
+ _Pred __pred, _Proj1 __proj1, _Proj2 __proj2)
+  {
+   if (__first2 == __last2) [[unlikely]]
+ return true;
+   else if (__n1 == -1 || __n2 == -1)
+ return ranges::mismatch(std::move(__first1), __last1,
+ std::move(__first2), __last2,
+ std::move(__pred),
+ std::move(__proj1), std::move(__proj2)).in2 
== __last2;
+   else if (__n1 < __n2)
+ return false;
+   else if constexpr (random_access_iterator<_Iter1>)
+ return ranges::equal(__first1, __firs

Re: [PATCH v5] libstdc++: Implement C++23 P1659R3 starts_with and ends_with

2025-05-20 Thread Patrick Palka
On Tue, 20 May 2025, Patrick Palka wrote:

> Changes in v5:
>   * dispatch to starts_with for the both-bidi/common range case

Forgot to mention:

And for a bidi common haystack, prefer iterating forward instead of
backward if the needle is at least half the size of the haystack.
> 
> Changes in v4:
>   * optimize the both-bidi/common ranges case, as suggested by
> Tomasz
>   * add tests for that code path
> 
> Changes in v3:
>   * Use the forward_range code path for a (non-sized) bidirectional
> haystack, since it's slightly fewer increments/decrements
> overall.
>   * Fix wrong iter_difference_t cast in starts_with.
> 
> Changes in v2:
>   Addressed Tomasz's review comments, namely:
>   * Added explicit iter_difference_t casts
>   * Made _S_impl member private
>   * Optimized sized bidirectional case of ends_with
>   * Rearranged control flow of starts_with::_S_impl
> 
> Still left to do:
>   * Add tests for integer-class types
>   * Still working on a better commit description ;)
> 
> -- >8 --
> 
> libstdc++-v3/ChangeLog:
> 
>   * include/bits/ranges_algo.h (__starts_with_fn, starts_with):
>   Define.
>   (__ends_with_fn, ends_with): Define.
>   * include/bits/version.def (ranges_starts_ends_with): Define.
>   * include/bits/version.h: Regenerate.
>   * include/std/algorithm: Provide __cpp_lib_ranges_starts_ends_with.
>   * src/c++23/std.cc.in (ranges::starts_with): Export.
>   (ranges::ends_with): Export.
>   * testsuite/25_algorithms/ends_with/1.cc: New test.
>   * testsuite/25_algorithms/starts_with/1.cc: New test.
> ---
>  libstdc++-v3/include/bits/ranges_algo.h   | 247 ++
>  libstdc++-v3/include/bits/version.def |   8 +
>  libstdc++-v3/include/bits/version.h   |  10 +
>  libstdc++-v3/include/std/algorithm|   1 +
>  libstdc++-v3/src/c++23/std.cc.in  |   4 +
>  .../testsuite/25_algorithms/ends_with/1.cc| 135 ++
>  .../testsuite/25_algorithms/starts_with/1.cc  | 128 +
>  7 files changed, 533 insertions(+)
>  create mode 100644 libstdc++-v3/testsuite/25_algorithms/ends_with/1.cc
>  create mode 100644 libstdc++-v3/testsuite/25_algorithms/starts_with/1.cc
> 
> diff --git a/libstdc++-v3/include/bits/ranges_algo.h 
> b/libstdc++-v3/include/bits/ranges_algo.h
> index f36e7dd59911..60f7bf841f3f 100644
> --- a/libstdc++-v3/include/bits/ranges_algo.h
> +++ b/libstdc++-v3/include/bits/ranges_algo.h
> @@ -438,6 +438,253 @@ namespace ranges
>  
>inline constexpr __search_n_fn search_n{};
>  
> +#if __glibcxx_ranges_starts_ends_with // C++ >= 23
> +  struct __starts_with_fn
> +  {
> +template _Sent1,
> +  input_iterator _Iter2, sentinel_for<_Iter2> _Sent2,
> +  typename _Pred = ranges::equal_to,
> +  typename _Proj1 = identity, typename _Proj2 = identity>
> +  requires indirectly_comparable<_Iter1, _Iter2, _Pred, _Proj1, _Proj2>
> +  constexpr bool
> +  operator()(_Iter1 __first1, _Sent1 __last1,
> +  _Iter2 __first2, _Sent2 __last2, _Pred __pred = {},
> +  _Proj1 __proj1 = {}, _Proj2 __proj2 = {}) const
> +  {
> + iter_difference_t<_Iter1> __n1 = -1;
> + iter_difference_t<_Iter2> __n2 = -1;
> + if constexpr (sized_sentinel_for<_Sent1, _Iter1>)
> +   __n1 = __last1 - __first1;
> + if constexpr (sized_sentinel_for<_Sent2, _Iter2>)
> +   __n2 = __last2 - __first2;
> + return _S_impl(std::move(__first1), __last1, __n1,
> +std::move(__first2), __last2, __n2,
> +std::move(__pred),
> +std::move(__proj1), std::move(__proj2));
> +  }
> +
> +template +  typename _Pred = ranges::equal_to,
> +  typename _Proj1 = identity, typename _Proj2 = identity>
> +  requires indirectly_comparable, 
> iterator_t<_Range2>,
> +  _Pred, _Proj1, _Proj2>
> +  constexpr bool
> +  operator()(_Range1&& __r1, _Range2&& __r2, _Pred __pred = {},
> +  _Proj1 __proj1 = {}, _Proj2 __proj2 = {}) const
> +  {
> + range_difference_t<_Range1> __n1 = -1;
> + range_difference_t<_Range1> __n2 = -1;
> + if constexpr (sized_range<_Range1>)
> +   __n1 = ranges::size(__r1);
> + if constexpr (sized_range<_Range2>)
> +   __n2 = ranges::size(__r2);
> + return _S_impl(ranges::begin(__r1), ranges::end(__r1), __n1,
> +ranges::begin(__r2), ranges::end(__r2), __n2,
> +std::move(__pred),
> +std::move(__proj1), std::move(__proj2));
> +  }
> +
> +  private:
> +template _Sent2,
> +  typename _Pred,
> +  typename _Proj1, typename _Proj2>
> +  static constexpr bool
> +  _S_impl(_Iter1 __first1, _Sent1 __last1, iter_difference_t<_Iter1> 
> __n1,
> +   _Iter2 __first2, _Sent2 __last2, iter_difference_t<_Iter2> __n2,
> +   _Pred __pred, _Proj1 __proj1, _Proj2 __proj2)
> +

Re: [AUTOFDO][AARCH64] Add support for profilebootstrap

2025-05-20 Thread Kugan Vivekanandarajah
Thanks Richard for the review.

> On 20 May 2025, at 2:47 am, Richard Sandiford  
> wrote:
>
> Kugan Vivekanandarajah  writes:
>> diff --git a/Makefile.in b/Makefile.in
>> index b1ed67d3d4f..b5e3e520791 100644
>> --- a/Makefile.in
>> +++ b/Makefile.in
>> @@ -4271,7 +4271,7 @@ all-stageautoprofile-bfd: 
>> configure-stageautoprofile-bfd
>>  $(HOST_EXPORTS) \
>>  $(POSTSTAGE1_HOST_EXPORTS)  \
>>  cd $(HOST_SUBDIR)/bfd && \
>> - $$s/gcc/config/i386/$(AUTO_PROFILE) \
>> + $$s/gcc/config/@cpu_type@/$(AUTO_PROFILE) \
>>  $(MAKE) $(BASE_FLAGS_TO_PASS) \
>>  CFLAGS="$(STAGEautoprofile_CFLAGS)" \
>>  GENERATOR_CFLAGS="$(STAGEautoprofile_GENERATOR_CFLAGS)" \
>
> The usual style seems to be to assign @foo@ to a makefile variable
> called foo or FOO, rather than to use @foo@ directly in rules.  Otherwise
> the makefile stuff looks good.
>
> I don't feel qualified to review the script, but some general shell stuff:
>
>> diff --git a/gcc/config/aarch64/gcc-auto-profile 
>> b/gcc/config/aarch64/gcc-auto-profile
>> new file mode 100755
>> index 000..0ceec035e69
>> --- /dev/null
>> +++ b/gcc/config/aarch64/gcc-auto-profile
>> @@ -0,0 +1,51 @@
>> +#!/bin/sh
>> +# Profile workload for gcc profile feedback (autofdo) using Linux perf.
>> +# Copyright The GNU Toolchain Authors.
>> +#
>> +# This file is part of GCC.
>> +#
>> +# GCC is free software; you can redistribute it and/or modify it under
>> +# the terms of the GNU General Public License as published by the Free
>> +# Software Foundation; either version 3, or (at your option) any later
>> +# version.
>> +
>> +# GCC is distributed in the hope that it will be useful, but WITHOUT ANY
>> +# WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> +# FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
>> +# for more details.
>> +
>> +# You should have received a copy of the GNU General Public License
>> +# along with GCC; see the file COPYING3.  If not see
>> +# .  */
>> +
>> +# Run perf record with branch stack sampling and check for
>> +# specific error message to see if it is supported.
>> +use_brbe=true
>> +output=$(perf record -j any,u ls 2>&1)
>
> How about using /bin/true rather than ls for the test program?
>
>> +if [[ "$output" = *"Error::P: PMU Hardware or event type doesn't support 
>> branch stack sampling."* ]]; then
>
> [[ isn't POSIX, or at least dash doesn't accept it.  Since this script
> is effectively linux-specific, we can probably assume that /bin/bash
> exists and use that in the #! line.
>
> If we use bash, then the test could use =~ rather than an exact match.
> This could be useful if perf prints other diagnostics besides the
> one being tested for, or if future versions of perf alter the wording
> slightly.
>
>> +  use_brbe=false
>> +fi
>> +
>> +FLAGS=u
>> +if [ "$1" = "--kernel" ] ; then
>> +  FLAGS=k
>> +  shift
>> +fi
>> +if [ "$1" = "--all" ] ; then
>
> How about making this an elif, so that we don't accept --kernel --all?
>
>> +  FLAGS=u,k
>> +  shift
>> +fi
>> +
>> +if [ "$use_brbe" = true ] ; then
>> +  if grep -q hypervisor /proc/cpuinfo ; then
>> +echo >&2 "Warning: branch profiling may not be functional in VMs"
>> +  fi
>> +  set -x
>> +  perf record -j any,$FLAGS "$@"
>> +  set +x
>> +else
>> +  set -x
>> +  echo >&2 "Warning: branch profiling may not be functional without BRBE"
>> +  perf record "$@"
>> +  set +x
>
> Putting the set -x after the echo seems better, as for the "then" branch.

Here is the revised version that handles the above comments.

Thanks,
Kugan



>
> Thanks,
> Richard
>
>> +fi



0004-AUTOFDO_v3-AARCH64-Add-support-for-profilebootstrap.patch
Description: 0004-AUTOFDO_v3-AARCH64-Add-support-for-profilebootstrap.patch


Re: [PATCH 3/5 v3] c++, coroutines: Address CWG2563 return value init [PR119916].

2025-05-20 Thread Jason Merrill

On 5/20/25 9:42 AM, Iain Sandoe wrote:

Hi Jason


So I moved this to the position before the g_r_o is initialized
(since we only manage cleanups of the entities that come before that, although
that's a bit hard to see from the patch).



This will probably need reevaluation if you take my suggestion from the 
decltype patch for addressing 115908, but this is fine for now.


I am adding the suggestion to my TODO.


...
+  if (flag_exceptions)
+{
+  r = cp_build_init_expr (coro_before_return, boolean_false_node);



This should be MODIFY_EXPR, not INIT_EXPR; it got an initial value already in 
the DECL_EXPR.


Fixed, OK for trunk now?


OK.


--- 8< ---

This addresses the clarification that, when the get_return_object is of a
different type from the ramp return, any necessary conversions should be
performed on the return expression (so that they typically occur after the
function body has started execution).

PR c++/119916

gcc/cp/ChangeLog:

* coroutines.cc
(cp_coroutine_transform::wrap_original_function_body): Do not
initialise initial_await_resume_called here...
(cp_coroutine_transform::build_ramp_function): ... but here.
When the coroutine is not void, initialize a GRO object from
promise.get_return_object().  Use this as the argument to the
return expression.  Use a regular cleanup for the GRO, since
it is ramp-local.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/torture/special-termination-00-sync-completion.C:
Amend for CWG2563 expected behaviour.
* g++.dg/coroutines/torture/special-termination-01-self-destruct.C:
Likewise.
* g++.dg/coroutines/torture/pr119916.C: New test.

Signed-off-by: Iain Sandoe 
---
  gcc/cp/coroutines.cc  | 126 ++
  .../g++.dg/coroutines/torture/pr119916.C  |  66 +
  .../special-termination-00-sync-completion.C  |   2 +-
  .../special-termination-01-self-destruct.C|   2 +-
  4 files changed, 109 insertions(+), 87 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/coroutines/torture/pr119916.C

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index 743da068e35..bc5fb9381db 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -4451,7 +4451,7 @@ cp_coroutine_transform::wrap_original_function_body ()
tree i_a_r_c
= coro_build_artificial_var (loc, coro_frame_i_a_r_c_id,
 boolean_type_node, orig_fn_decl,
-boolean_false_node);
+NULL_TREE);
DECL_CHAIN (i_a_r_c) = var_list;
var_list = i_a_r_c;
add_decl_expr (i_a_r_c);
@@ -4867,7 +4867,6 @@ cp_coroutine_transform::build_ramp_function ()
add_decl_expr (coro_fp);
  
tree coro_promise_live = NULL_TREE;

-  tree coro_gro_live = NULL_TREE;
if (flag_exceptions)
  {
/* Signal that we need to clean up the promise object on exception.  */
@@ -4876,13 +4875,6 @@ cp_coroutine_transform::build_ramp_function ()
  boolean_type_node, orig_fn_decl,
  boolean_false_node);
  
-  /* When the get-return-object is in the RETURN slot, we need to arrange

-for cleanup on exception.  */
-  coro_gro_live
-   = coro_build_and_push_artificial_var (loc, "_Coro_gro_live",
- boolean_type_node, orig_fn_decl,
- boolean_false_node);
-
/* To signal that we need to cleanup copied function args.  */
if (DECL_ARGUMENTS (orig_fn_decl))
for (tree arg = DECL_ARGUMENTS (orig_fn_decl); arg != NULL;
@@ -4970,13 +4962,19 @@ cp_coroutine_transform::build_ramp_function ()
tree ramp_try_block = NULL_TREE;
tree ramp_try_stmts = NULL_TREE;
tree iarc_x = NULL_TREE;
+  tree coro_before_return = NULL_TREE;
if (flag_exceptions)
  {
+  coro_before_return
+   = coro_build_and_push_artificial_var (loc, "_Coro_before_return",
+ boolean_type_node, orig_fn_decl,
+ boolean_true_node);
iarc_x
= coro_build_and_push_artificial_var_with_dve (loc,
   coro_frame_i_a_r_c_id,
   boolean_type_node,
-  orig_fn_decl, NULL_TREE,
+  orig_fn_decl,
+  boolean_false_node,
   deref_fp);
ramp_try_block = begin_try_block ();
ramp_try_stmts = begin_compound_stmt (BCS_TRY_BLOCK);
@@ -5136,90 +5134,54 @@ cp_coroutine_transform::build_ramp_function ()
  (loc, coro_resume_index_id,

[PATCH] ipa: Fix whitespace when dumping VR in jump_functions

2025-05-20 Thread Martin Jambor
Hi,

lack of white space breaks the tree-visualisation structure and makes
the dump unnecessarily difficult to read.

I have bootstrapped and tested the patch on x86_64-linux.  I plan to
commit it soon as obvious.

Thanks,

Martin


gcc/ChangeLog:

2025-05-19  Martin Jambor  

* ipa-prop.cc (ipa_dump_jump_function): Fix whitespace when
dumping IPA VRs.
---
 gcc/ipa-prop.cc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc
index 0398d69962f..24a538034e3 100644
--- a/gcc/ipa-prop.cc
+++ b/gcc/ipa-prop.cc
@@ -542,6 +542,7 @@ ipa_dump_jump_function (FILE *f, ipa_jump_func *jump_func,
 
   if (jump_func->m_vr)
 {
+  fprintf (f, " ");
   jump_func->m_vr->dump (f);
   fprintf (f, "\n");
 }
-- 
2.49.0



Re: [PATCH] Add pattern match in match.pd for .AVG_CEIL

2025-05-20 Thread Richard Biener
On Thu, May 15, 2025 at 10:04 AM liuhongt  wrote:
>
> 1) Optimize (a >> 1) + (b >> 1) + ((a | b) & 1) to .AVG_CEIL (a, b)
> 2) Optimize (a | b) - ((a ^ b) >> 1) to .AVG_CEIL (a, b)
>
> Proof is at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118994#c6
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?

OK.

Thanks,
Richard.
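
For readers following along, the two identities in the quoted patch
description can be checked exhaustively on scalars.  The sketch below is
only an illustration of the arithmetic: the function names are made up
for this example, it works on 8-bit unsigned scalars rather than the
vector types the match.pd patterns are restricted to, and it does not
use the .AVG_CEIL internal function itself.

```cpp
#include <cassert>
#include <cstdint>

// Reference result: ceiling of the average, computed in a wider type so
// that the intermediate sum cannot overflow.
constexpr std::uint8_t avg_ceil_ref(std::uint8_t a, std::uint8_t b)
{
  return static_cast<std::uint8_t>(
      (static_cast<unsigned>(a) + static_cast<unsigned>(b) + 1u) >> 1);
}

// First form from the patch description: (a >> 1) + (b >> 1) + ((a | b) & 1).
constexpr std::uint8_t avg_ceil_shift(std::uint8_t a, std::uint8_t b)
{
  return static_cast<std::uint8_t>((a >> 1) + (b >> 1) + ((a | b) & 1));
}

// Second form from the patch description: (a | b) - ((a ^ b) >> 1).
constexpr std::uint8_t avg_ceil_or_xor(std::uint8_t a, std::uint8_t b)
{
  return static_cast<std::uint8_t>((a | b) - ((a ^ b) >> 1));
}

// Hypothetical helper: compare both forms against the reference for every
// pair of 8-bit inputs.
bool identities_hold_for_all_u8()
{
  for (unsigned a = 0; a < 256; ++a)
    for (unsigned b = 0; b < 256; ++b)
      {
        const auto x = static_cast<std::uint8_t>(a);
        const auto y = static_cast<std::uint8_t>(b);
        if (avg_ceil_shift(x, y) != avg_ceil_ref(x, y)
            || avg_ceil_or_xor(x, y) != avg_ceil_ref(x, y))
          return false;
      }
  return true;
}
```

Both forms avoid the a + b intermediate, which is why they stay within
the element width even when both inputs are at their maximum.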

> gcc/ChangeLog:
>
> PR middle-end/118994
> * match.pd ((a >> 1) + (b >> 1) + ((a | b) & 1) to
> .AVG_CEIL (a, b)): New pattern.
> ((a | b) - ((a ^ b) >> 1) to .AVG_CEIL (a, b)): Ditto.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/pr118994-1.c: New test.
> * gcc.target/i386/pr118994-2.c: New test.
> ---
>  gcc/match.pd   | 23 ++
>  gcc/testsuite/gcc.target/i386/pr118994-1.c | 37 ++
>  gcc/testsuite/gcc.target/i386/pr118994-2.c | 37 ++
>  3 files changed, 97 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr118994-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr118994-2.c
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 96136404f5e..d391ac86edc 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -11455,3 +11455,26 @@ and,
>}
>(if (full_perm_p)
> (vec_perm (op@3 @0 @1) @3 @2))
> +
> +#if GIMPLE
> +/* Simplify (a >> 1) + (b >> 1) + ((a | b) & 1) to .AVG_CEIL (a, b).
> +   Similar for (a | b) - ((a ^ b) >> 1).  */
> +
> +(simplify
> +  (plus:c
> +(plus (rshift @0 integer_onep@1) (rshift @2 @1))
> +(bit_and (bit_ior @0 @2) integer_onep@3))
> +  (if (cfun && (cfun->curr_properties & PROP_last_full_fold) != 0
> +  && VECTOR_TYPE_P (type)
> +  && direct_internal_fn_supported_p (IFN_AVG_CEIL, type, 
> OPTIMIZE_FOR_BOTH))
> +  (IFN_AVG_CEIL @0 @2)))
> +
> +(simplify
> +  (minus
> +(bit_ior @0 @2)
> +(rshift (bit_xor @0 @2) integer_onep@1))
> +  (if (cfun && (cfun->curr_properties & PROP_last_full_fold) != 0
> +  && VECTOR_TYPE_P (type)
> +  && direct_internal_fn_supported_p (IFN_AVG_CEIL, type, 
> OPTIMIZE_FOR_BOTH))
> +  (IFN_AVG_CEIL @0 @2)))
> +#endif
> diff --git a/gcc/testsuite/gcc.target/i386/pr118994-1.c 
> b/gcc/testsuite/gcc.target/i386/pr118994-1.c
> new file mode 100644
> index 000..5f40ababccc
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr118994-1.c
> @@ -0,0 +1,37 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mavx512bw -mavx512vl -O2 -fdump-tree-optimized" } */
> +/* { dg-final { scan-tree-dump-times "\.AVG_CEIL" 6 "optimized"} } */
> +
> +#define VecRoundingAvg(a, b) ((a >> 1) + (b >> 1) + ((a | b) & 1))
> +
> +typedef unsigned char GccU8x16Vec __attribute__((__vector_size__(16)));
> +typedef unsigned short GccU16x8Vec __attribute__((__vector_size__(16)));
> +typedef unsigned char GccU8x32Vec __attribute__((__vector_size__(32)));
> +typedef unsigned short GccU16x16Vec __attribute__((__vector_size__(32)));
> +typedef unsigned char GccU8x64Vec __attribute__((__vector_size__(64)));
> +typedef unsigned short GccU16x32Vec __attribute__((__vector_size__(64)));
> +
> +GccU8x16Vec U8x16VecRoundingAvg(GccU8x16Vec a, GccU8x16Vec b) {
> +  return VecRoundingAvg(a, b);
> +}
> +
> +GccU16x8Vec U16x8VecRoundingAvg(GccU16x8Vec a, GccU16x8Vec b) {
> +  return VecRoundingAvg(a, b);
> +}
> +
> +GccU8x32Vec U8x32VecRoundingAvg(GccU8x32Vec a, GccU8x32Vec b) {
> +  return VecRoundingAvg(a, b);
> +}
> +
> +GccU16x16Vec U16x16VecRoundingAvg(GccU16x16Vec a, GccU16x16Vec b) {
> +  return VecRoundingAvg(a, b);
> +}
> +
> +GccU8x64Vec U8x64VecRoundingAvg(GccU8x64Vec a, GccU8x64Vec b) {
> +  return VecRoundingAvg(a, b);
> +}
> +
> +GccU16x32Vec U16x32VecRoundingAvg(GccU16x32Vec a, GccU16x32Vec b) {
> +  return VecRoundingAvg(a, b);
> +}
> +
> diff --git a/gcc/testsuite/gcc.target/i386/pr118994-2.c 
> b/gcc/testsuite/gcc.target/i386/pr118994-2.c
> new file mode 100644
> index 000..ba90e0a2992
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr118994-2.c
> @@ -0,0 +1,37 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mavx512bw -mavx512vl -O2 -fdump-tree-optimized" } */
> +/* { dg-final { scan-tree-dump-times "\.AVG_CEIL" 6 "optimized"} } */
> +
> +#define VecRoundingAvg(a, b) ((a | b) - ((a ^ b) >> 1))
> +
> +typedef unsigned char GccU8x16Vec __attribute__((__vector_size__(16)));
> +typedef unsigned short GccU16x8Vec __attribute__((__vector_size__(16)));
> +typedef unsigned char GccU8x32Vec __attribute__((__vector_size__(32)));
> +typedef unsigned short GccU16x16Vec __attribute__((__vector_size__(32)));
> +typedef unsigned char GccU8x64Vec __attribute__((__vector_size__(64)));
> +typedef unsigned short GccU16x32Vec __attribute__((__vector_size__(64)));
> +
> +GccU8x16Vec U8x16VecRoundingAvg(GccU8x16Vec a, GccU8x16Vec b) {
> +  return VecRoundingAvg(a, b);
> +}
> +
> +GccU16x8Vec U16x8VecRoundingAvg(GccU16x8Vec a, GccU16x8Vec b) {
> +  return VecRoundingAvg(a, b);
> +}
> +
> +GccU8x32Vec U8x32VecRoundingAvg(GccU8x32V

[PATCH] ipa: When inlining, don't combine PT JFs changing signedness (PR120295)

2025-05-20 Thread Martin Jambor
Hi,

in GCC 15 we allowed jump-function generation code to skip over a
type-cast converting one integer to another as long as the latter can
hold all the values of the former or has at least the same precision.
This works well for IPA-CP where we do then evaluate each jump
function as we propagate values and value-ranges.  However, the
test-case in PR 120295 shows a problem with inlining, where we combine
pass-through jump-functions so that they are always relative to the
function which is the root of the inline tree.  Unfortunately, we are
also happy to combine those with type-casts to a different signedness,
which makes us use zero extension for the expected value ranges
where we should have used sign extension.  When the value-range which
then leads to wrong insertion of a call to builtin_unreachable is
being computed, the information about the existence of an intermediary
signed type has already been lost during previous inlining.

This patch simply blocks combining such jump-functions so that it is
back-portable to GCC 15.  Once we switch pass-through jump functions
to use a vector of operations rather than having room for just one, we
will be able to address this situation with adding an extra conversion
instead.
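
As a small illustration of why losing the intermediate signed type
matters, the sketch below widens the same 8-bit pattern once with sign
extension and once with zero extension.  It is not GCC code and the
function names are invented for this example; it only shows that the two
widenings disagree for values with the top bit set, which is the kind of
mismatch that produces a wrong value range.

```cpp
#include <cassert>
#include <cstdint>

// Widen an 8-bit value as a signed quantity: the sign bit is replicated
// (sign extension).
constexpr std::int32_t widen_as_signed(std::int8_t v)
{
  return v;
}

// Widen the same bit pattern as an unsigned quantity: the upper bits are
// filled with zeros (zero extension).
constexpr std::int32_t widen_as_unsigned(std::int8_t v)
{
  return static_cast<std::uint8_t>(v);
}
```

For non-negative inputs the two functions agree, so a range derived with
the wrong extension is only wrong for part of the input space, which
makes such bugs easy to miss in testing.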

Bootstrapped and LTO-bootstrapped on x86_64-linux.  OK for master and
gcc-15 branch?

Thanks,

Martin


gcc/ChangeLog:

2025-05-19  Martin Jambor  

PR ipa/120295
* ipa-prop.cc (update_jump_functions_after_inlining): Do not
combine pass-through jump functions with type-casts changing
signedness.

gcc/testsuite/ChangeLog:

2025-05-19  Martin Jambor  

PR ipa/120295
* gcc.dg/ipa/pr120295.c: New test.
---
 gcc/ipa-prop.cc | 28 
 gcc/testsuite/gcc.dg/ipa/pr120295.c | 66 +
 2 files changed, 94 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/ipa/pr120295.c

diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc
index 24a538034e3..84d4fb5db67 100644
--- a/gcc/ipa-prop.cc
+++ b/gcc/ipa-prop.cc
@@ -3330,6 +3330,10 @@ update_jump_functions_after_inlining (struct cgraph_edge 
*cs,
   ipa_edge_args *args = ipa_edge_args_sum->get (e);
   if (!args)
 return;
+  ipa_node_params *old_inline_root_info = ipa_node_params_sum->get 
(cs->callee);
+  ipa_node_params *new_inline_root_info
+= ipa_node_params_sum->get (cs->caller->inlined_to
+   ? cs->caller->inlined_to : cs->caller);
   int count = ipa_get_cs_argument_count (args);
   int i;
 
@@ -3541,6 +3545,30 @@ update_jump_functions_after_inlining (struct cgraph_edge 
*cs,
enum tree_code operation;
operation = ipa_get_jf_pass_through_operation (src);
 
+   tree old_ir_ptype = ipa_get_type (old_inline_root_info,
+ dst_fid);
+   tree new_ir_ptype = ipa_get_type (new_inline_root_info,
+ formal_id);
+   if (!useless_type_conversion_p (old_ir_ptype, new_ir_ptype))
+ {
+   /* Jump-function construction now permits type-casts
+  from an integer to another if the latter can hold
+  all values or has at least the same precision.
+  However, as we're combining multiple pass-through
+  functions together, we are losing information about
+  signedness and thus if conversions should sign or
+  zero extend.  Therefore we must prevent combining
+  such jump-function if signednesses do not match.  */
+   if (!INTEGRAL_TYPE_P (old_ir_ptype)
+   || !INTEGRAL_TYPE_P (new_ir_ptype)
+   || (TYPE_UNSIGNED (new_ir_ptype)
+   != TYPE_UNSIGNED (old_ir_ptype)))
+ {
+   ipa_set_jf_unknown (dst);
+   continue;
+ }
+ }
+
if (operation == NOP_EXPR)
  {
bool agg_p;
diff --git a/gcc/testsuite/gcc.dg/ipa/pr120295.c 
b/gcc/testsuite/gcc.dg/ipa/pr120295.c
new file mode 100644
index 000..2033ee9493d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/pr120295.c
@@ -0,0 +1,66 @@
+/* { dg-do run } */
+/* { dg-options "-O3" } */
+
+struct {
+  signed a;
+} b;
+int a, f, j, l;
+char c, k, g, e;
+short d[2] = {0};
+int *i = &j;
+
+volatile int glob;
+void __attribute__((noipa)) sth (const char *, int a)
+{
+  glob = a;
+  return;
+}
+
+void marker_37() {
+  a++;
+  sth ("%d\n", a);
+}
+unsigned long long m(unsigned, char, unsigned, short);
+int n(int, unsigned char, long long);
+int o(long long, unsigned, unsigned);
+unsigned short p(void) {
+  int *r = &l;
+  *r |= ({
+long long y = (m(c,

Re: [PATCH v22 0/3] c: Add _Countof and

2025-05-20 Thread Alejandro Colomar
Hi Joseph,

On Tue, May 20, 2025 at 02:43:55PM +, Joseph Myers wrote:
> > Could you please clarify if I need to do anything or if this is already
> > scheduled for review when you have some time?  Also please clarify if
> > you're okay with amending that or if you prefer that I send v23.
> 
> I have it on my list for review.

Thanks!

>  I'd prefer v23 (also with complete 
> ChangeLog entries, please, rather than the present placeholders, so they 
> don't need to be written at commit time).

The ChangeLog entries that I've sent are supposed to be complete.  Did I
miss anything?

I've based them on gnulib commits, which I believe follow the same
guidelines.  For example:

commit 6608062398ef4c983a58b90a1520c39f12fb7ac1
Author: Paul Eggert 
Date:   Fri Jan 10 10:34:58 2025 -0800

doc: document some file system portability issues

* doc/glibc-functions/flistxattr.texi:
* doc/glibc-functions/listxattr.texi:
* doc/glibc-functions/llistxattr.texi:
* doc/posix-functions/fchdir.texi, doc/posix-functions/fstat.texi:
* doc/posix-functions/fstatvfs.texi:
Document some portability gotchas that Gnulib does not work around.

Now I realize that maybe my changelog misses the trailing ':' for the
entries that have no text (because it's only once at the end)?  So for
example instead of

gcc/c-family/ChangeLog:

* c-common.h
* c-common.def
* c-common.cc (c_countof_type): Add __countof__ operator.

I should do this?

gcc/c-family/ChangeLog:

* c-common.h:
* c-common.def:
* c-common.cc (c_countof_type): Add __countof__ operator.

Or maybe this?

gcc/c-family/ChangeLog:

* c-common.h:
* c-common.def:
* c-common.cc (c_countof_type):
Add __countof__ operator.

Please let me know.


Cheers,
Alex

-- 





Re: [RFC PATCH 0/3] _BitInt(N) support for LoongArch

2025-05-20 Thread Yang Yujie
Hi Jakub,

Thanks for the quick review.

Aside from code formatting issues, can I conclude that you suggest
we should rebase this onto your new big-endian support patch?  Or
do you think it's necessary to add big-endian && extended support
together?

> Are you sure all those changes were really necessary (rather than doing them
> just in case)?  I believe most of gimple-lower-bitint.cc already should be
> sign or zero extending the partial limbs when storing stuff, there can be
> some corner cases (I think one of the shift directions at least).

The modifications to gimple-lower-bitint.cc are based on testing, 
since I found that simply setting the "info.extended" flag won't work unless
I make changes to promote_function_mode, which leads to a series of
changes to correct all the regtests.

Specifically, the tests told me to extend (thought "truncate"
was kind of an equivalent word) the output of left shift, plus/minus,
bitwise not, libgcc calls, stores with "separate extension", as well
as the input of atomic stores.  It would be great if you could help me
simplify the changes based on your original design.

The repeated calling of TARGET_C_BITINT_TYPE_INFO is just trying
to minimize the changes.  Now that we are calling it very often, having
the return value cached would be a great idea, and I think the
resulting limit on the ABI flavors to be either extended or
non-extended is not really an issue.

On Tue, May 20, 2025 at 09:04:23AM GMT, Jakub Jelinek wrote:
> 
> In the later 2 paragraphs you say they are sign or zero extended depending
> on if it is signed or unsigned type.  I hope it is the latter and not the
> former.
> 

Maybe the ABI text for the N <= 64 case is a bit vague. My intention is that
a _BitInt(N) where N <= 64 should be first sign or zero extended to the width
of a containing type ("the smallest fundamental integral type that can contain
it") according to its signedness, and then it would be laid out just like
the containing type, which may involve another sign or zero extension if passed
in a wider register.
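
To make the first step of that description concrete, here is a rough
sketch in plain C++ (not _BitInt, and not taken from the patch) of
extending an N-bit value to the width of its containing type.  The
helper names, the choice of 8/16/32/64 as the candidate container
widths, and the use of a 64-bit carrier are assumptions for illustration
only; the later widening to a register follows the containing type's own
rules and is deliberately left out.

```cpp
#include <cassert>
#include <cstdint>

// Assumed container widths: the smallest of 8, 16, 32, 64 bits that can
// hold an n-bit value (1 <= n <= 64).
constexpr unsigned container_bits(unsigned n)
{
  return n <= 8 ? 8 : n <= 16 ? 16 : n <= 32 ? 32 : 64;
}

// Mask with the low `bits` bits set; handles bits == 64 without shifting
// by the full word width.
constexpr std::uint64_t low_mask(unsigned bits)
{
  return bits >= 64 ? ~std::uint64_t{0} : (std::uint64_t{1} << bits) - 1;
}

// Sign or zero extend the low n bits of `raw` to the container width,
// according to the signedness of the (hypothetical) _BitInt(n) type.
constexpr std::uint64_t
extend_to_container(std::uint64_t raw, unsigned n, bool is_signed)
{
  std::uint64_t v = raw & low_mask(n);           // keep only the value bits
  const bool negative = is_signed && ((v >> (n - 1)) & 1) != 0;
  if (negative)
    v |= ~low_mask(n);                           // replicate the sign bit
  return v & low_mask(container_bits(n));        // clip to the container
}
```

With this model, a signed 5-bit value of all ones becomes 0xFF in its
8-bit container, while the unsigned counterpart stays 0x1F.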


> In any case, I think for targets which set info->extended_p = true; we
> actually need testsuite coverage to verify it works properly.
> I'd think something like
> #define CHECK(x) \
>   do {\
> if ((typeof (x)) -1 < 0)  \
>   {   \
>   _BitInt(sizeof (x) * __CHAR_BIT__) __x; \
>   __builtin_memcpy (&__x, &(x), sizeof (__x));\
>   if (__x != (x)) \
> __builtin_abort ();   \
>   }   \
> else  \
>   {   \
>   unsigned _BitInt(sizeof (x) * __CHAR_BIT__) __x;\
>   __builtin_memcpy (&__x, &(x), sizeof (__x));\
>   if (__x != (x)) \
> __builtin_abort ();   \
>   }   \
>   } while (0)
> and use the macro on various _BitInt variables, arguments, return values
> after various arithmetic operations (and enable those tests solely on
> the info->extended_p targets, which would be loongarch, most likely s390x
> and arm 32-bit).
> 
>   Jakub

More common tests would surely be helpful, especially for new ports.

However, the specific test you mentioned would not be compatible with
the proposed LoongArch ABI, where the top 64-bit limb within the top
128-bit ABI-limb may be undefined. e.g. _BitInt(192).

Perhaps it's better to leave it to target-specific tests?

Yujie



[PATCH] libstdc++: implement Philox Engine [PR119794]

2025-05-20 Thread 1nfocalypse
Implements Philox Engine (P2075R6) and associated tests.

Curiously, the declaration order of the template unpacking functions
turned out to matter and caused a bug. As such, counter to
the recommended style, they are placed under a private access specifier
prior to the public block. I was unable to find a way around this.

Additionally, this patch does not utilize SIMD instructions,
although it could technically be a possible point of interest. In its
current state, I don't believe their inclusion would be particularly
useful, although with some restructuring of how the internal state
is managed, they may be.

Also, since the word width can be specified (and thus atypical),
calculation of the max size does not use numeric_limits, but rather
sets the top bit of the specified width and turns on all the bits below it. This is
indistinguishable from the end result of numeric_limits, but provides
further flexibility in the case of a user-specified size outside the norm
where the maximum value of a register is not equal to the maximum
value of the specified word length.
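
A minimal sketch of that computation, assuming a 64-bit unsigned result
type and a word width between 1 and 64; the function name is made up for
this example and is not the one used in the patch:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Largest value representable in a w-bit word (1 <= w <= 64).
// Setting the top bit first and then OR-ing in everything below it avoids
// the undefined shift by 64 that a naive (1 << w) - 1 would hit for w == 64.
constexpr std::uint64_t max_for_width(std::size_t w)
{
  const std::uint64_t top = std::uint64_t{1} << (w - 1); // top bit of the word
  return top | (top - 1);                                 // plus all lower bits
}
```

For w equal to the width of the result type this matches
numeric_limits, and for narrower widths such as 24 it yields the
maximum of the specified word rather than of the underlying type.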

Built and tested on x86_64-linux-gnu.

1nfocalypse

From 69db1ff9e7b8ace24cd4da246f3481a4544a9aec Mon Sep 17 00:00:00 2001
From: 1nfocalypse <1nfocaly...@protonmail.com>
Date: Tue, 20 May 2025 04:28:42 +
Subject: [PATCH] [PATCH] libstdc++: implement Philox Engine (P2075R6)

This commit implements P2075R6 for C++26, adding the Philox Engine.

The template unpacking functions, while private, are placed prior
to the public access specifier due to issues where the template
pack could not be unpacked and used to populate the public member
arrays without being declared beforehand.

Additionally, the tests implemented attempt to mirror the tests
for other engines, when they apply. Lastly, changes to random
provided cause for changing 'pr60037-neg.cc' because it suppresses
an error by explicit line number. It should still be correctly
suppressed in this patch.

	PR libstdc++/119794

libstdc++-v3/ChangeLog:

	* include/bits/random.h: Add Philox Engine components.
	* include/bits/random.tcc: Implement Philox Engine components.
	* testsuite/26_numerics/random/pr60037-neg.cc: Alter line number
	* testsuite/26_numerics/random/inequal.cc: New test.
	* testsuite/26_numerics/random/philox4x32.cc: New test.
	* testsuite/26_numerics/random/philox4x64.cc: New test.
	* testsuite/26_numerics/random/philox_engine/cons/119794.cc: New
	test.
	* testsuite/26_numerics/random/philox_engine/cons/copy.cc: New
	test.
	* testsuite/26_numerics/random/philox_engine/cons/default.cc: New
	test.
	* testsuite/26_numerics/random/philox_engine/cons/seed.cc: New
	test.
	* testsuite/26_numerics/random/philox_engine/cons/seed_seq.cc:
	New test.
	* testsuite/26_numerics/random/philox_engine/operators/equal.cc:
	New test.
	* testsuite/26_numerics/random/philox_engine/operators/inequal.cc:
	New test.
	* testsuite/26_numerics/random/philox_engine/operators/
	serialize.cc: New test.
	* testsuite/26_numerics/random/philox_engine/requirements/
	constants.cc: New test.
	* testsuite/26_numerics/random/philox_engine/requirements/
	constexpr_data.cc: New test.
	* testsuite/26_numerics/random/philox_engine/requirements/
	constexpr_functions.cc: New test.
	* testsuite/26_numerics/random/philox_engine/requirements/
	typedefs.cc: New test.
---
 libstdc++-v3/include/bits/random.h| 340 ++
 libstdc++-v3/include/bits/random.tcc  | 201 +++
 .../testsuite/26_numerics/random/inequal.cc   |  49 +++
 .../26_numerics/random/philox4x32.cc  |  42 +++
 .../26_numerics/random/philox4x64.cc  |  44 +++
 .../random/philox_engine/cons/119794.cc   |  58 +++
 .../random/philox_engine/cons/copy.cc |  45 +++
 .../random/philox_engine/cons/default.cc  |  46 +++
 .../random/philox_engine/cons/seed.cc |  39 ++
 .../random/philox_engine/cons/seed_seq.cc |  42 +++
 .../random/philox_engine/operators/equal.cc   |  50 +++
 .../random/philox_engine/operators/inequal.cc |  49 +++
 .../philox_engine/operators/serialize.cc  |  69 
 .../philox_engine/requirements/constants.cc   |  45 +++
 .../requirements/constexpr_data.cc|  69 
 .../requirements/constexpr_functions.cc   |  61 
 .../philox_engine/requirements/typedefs.cc|  45 +++
 .../26_numerics/random/pr60037-neg.cc |   2 +-
 18 files changed, 1295 insertions(+), 1 deletion(-)
 create mode 100644 libstdc++-v3/testsuite/26_numerics/random/inequal.cc
 create mode 100644 libstdc++-v3/testsuite/26_numerics/random/philox4x32.cc
 create mode 100644 libstdc++-v3/testsuite/26_numerics/random/philox4x64.cc
 create mode 100644 libstdc++-v3/testsuite/26_numerics/random/philox_engine/cons/119794.cc
 create mode 100644 libstdc++-v3/testsuite/26_numerics/random/philox_engine/cons/copy.cc
 create mode 100644 libstdc++-v3/testsuite/26_numerics/random/philox_engine/cons/default.cc
 create mode 100644 libstdc++-v3/testsuite/26_numerics/random/philox_engine/cons/seed.cc
 create mode 100644 lib

Re: [PATCH v1 6/6] libstdc++: Add tests for layout_stride.

2025-05-20 Thread Tomasz Kaminski
On Sun, May 18, 2025 at 10:12 PM Luc Grosheintz 
wrote:

> Implements the tests for layout_stride and for the features of the other
> two layouts that depend on layout_stride.
>
> libstdc++-v3/ChangeLog:
>
> * testsuite/23_containers/mdspan/layouts/class_mandate_neg.cc: Add
> tests for layout_stride.
> * testsuite/23_containers/mdspan/layouts/ctors.cc: Add test for
> layout_stride and the interaction with other layouts.
> * testsuite/23_containers/mdspan/layouts/mapping.cc: Ditto.
> * testsuite/23_containers/mdspan/layouts/stride.cc: New test.
>
> Signed-off-by: Luc Grosheintz 
> ---
>
You do not seem to be testing the strides method.

>  .../mdspan/layouts/class_mandate_neg.cc   |  19 +
>  .../23_containers/mdspan/layouts/ctors.cc |  99 
>  .../23_containers/mdspan/layouts/mapping.cc   |  72 ++-
>  .../23_containers/mdspan/layouts/stride.cc| 494 ++
>  4 files changed, 683 insertions(+), 1 deletion(-)
>  create mode 100644
> libstdc++-v3/testsuite/23_containers/mdspan/layouts/stride.cc
>
> diff --git
> a/libstdc++-v3/testsuite/23_containers/mdspan/layouts/class_mandate_neg.cc
> b/libstdc++-v3/testsuite/23_containers/mdspan/layouts/class_mandate_neg.cc
> index 137cf8f06a9..d1998f4eae3 100644
> ---
> a/libstdc++-v3/testsuite/23_containers/mdspan/layouts/class_mandate_neg.cc
> +++
> b/libstdc++-v3/testsuite/23_containers/mdspan/layouts/class_mandate_neg.cc
> @@ -17,7 +17,26 @@ template
>  typename Layout::mapping m3; // { dg-error "required
> from" }
>};
>
> +template
> +  struct B // { dg-error "expansion of" }
> +  {
> +using Extents = std::extents;
> +using OExtents = std::extents;
> +
> +using Mapping = typename Layout::mapping;
> +using OMapping = typename Layout::mapping;
> +
> +Mapping m{OMapping{}};
> +  };
> +
>  A a_left; // { dg-error "required
> from" }
>  A a_right;   // { dg-error "required
> from" }
> +A a_stride; // { dg-error "required
> from" }
> +
> +B<1, std::layout_left, std::layout_right> blr; // { dg-error
> "required here" }
> +B<2, std::layout_left, std::layout_stride> bls;// { dg-error
> "required here" }
> +
> +B<3, std::layout_right, std::layout_left> brl; // { dg-error
> "required here" }
> +B<4, std::layout_right, std::layout_stride> brs;   // { dg-error
> "required here" }
>
>  // { dg-prune-output "must be representable as index_type" }
> diff --git a/libstdc++-v3/testsuite/23_containers/mdspan/layouts/ctors.cc
> b/libstdc++-v3/testsuite/23_containers/mdspan/layouts/ctors.cc
> index e3e25528f33..19a6c8853e9 100644
> --- a/libstdc++-v3/testsuite/23_containers/mdspan/layouts/ctors.cc
> +++ b/libstdc++-v3/testsuite/23_containers/mdspan/layouts/ctors.cc
> @@ -302,12 +302,111 @@ namespace from_left_or_right
>  }
>  }
>
> +// ctor: mapping(layout_stride::mapping)
> +namespace from_stride
> +{
> +  template
> +constexpr auto
> +strides(Mapping m)
> +{
> +  constexpr auto rank = Mapping::extents_type::rank();
> +  std::array s;
> +
> +  if constexpr (rank > 0)
> +   for(size_t i = 0; i < rank; ++i)
> + s[i] = m.stride(i);
> +  return s;
> +}
> +
> +  template
> +constexpr void
> +verify_convertible(OExtents oexts)
> +{
> +  using Mapping = typename Layout::mapping;
> +  using OMapping = std::layout_stride::mapping;
> +
> +  constexpr auto other = OMapping(oexts,
> strides(Mapping(Extents(oexts;
> +  if constexpr (std::is_same_v)
> +   ::verify_nothrow_convertible(other);
> +  else
> +   ::verify_convertible(other);
> +}
> +
> +  template
> +constexpr void
> +verify_constructible(OExtents oexts)
> +{
> +  using Mapping = typename Layout::mapping;
> +  using OMapping = std::layout_stride::mapping;
> +
> +  constexpr auto other = OMapping(oexts,
> strides(Mapping(Extents(oexts;
> +  if constexpr (std::is_same_v)
> +   ::verify_nothrow_constructible(other);
> +  else
> +   ::verify_constructible(other);
> +}
> +
> +  template
> +constexpr bool
> +test_ctor()
> +{
> +  assert_not_constructible<
> +   typename Layout::mapping>,
> +   std::layout_stride::mapping>>();
> +
> +  assert_not_constructible<
> +   typename Layout::mapping>,
> +   std::layout_stride::mapping>>();
> +
> +  assert_not_constructible<
> +   typename Layout::mapping>,
> +   std::layout_stride::mapping>>();
> +
> +  verify_convertible>(std::extents{});
> +
> +  verify_convertible>(
> +   std::extents{});
> +
> +  // Rank ==  0 doesn't check IndexType for convertibility.
> +  verify_convertible>(
> +   std::extents{});
> +
> +  verify_constructible>(
> +   std::extents{});
> +
> +  verify_constructible>(
> +   std::extents{});
> +
> +  verify_constructible>(
> +   std::extents{});
> +
> +  verify_con

Re: [PATCH v3] libstdc++: Implement C++23 P1659R3 starts_with and ends_with

2025-05-20 Thread Tomasz Kaminski
On Tue, May 20, 2025 at 3:17 PM Tomasz Kaminski  wrote:

>
>
> On Tue, May 20, 2025 at 3:07 PM Patrick Palka  wrote:
>
>> On Tue, 20 May 2025, Tomasz Kaminski wrote:
>>
>> >
>> >
>> > On Mon, May 19, 2025 at 6:06 PM Patrick Palka 
>> wrote:
>> >   On Mon, 19 May 2025, Patrick Palka wrote:
>> >
>> >   > Changes in v3:
>> >   >   * Use the forward_range code path for a (non-sized)
>> bidirectional
>> >   > haystack, since it's slightly fewer increments/decrements
>> >   > overall.
>> >   >   * Fix wrong iter_difference_t cast in starts_with.
>> >   >
>> >   > Changes in v2:
>> >   >   Addressed Tomasz's review comments, namely:
>> >   >   * Added explicit iter_difference_t casts
>> >   >   * Made _S_impl member private
>> >   >   * Optimized sized bidirectional case of ends_with
>> >   >   * Rearranged control flow of starts_with::_S_impl
>> >   >
>> >   > Still left to do:
>> >   >   * Add tests for integer-class types
>> >   >   * Still working on a better commit description ;)
>> >   >
>> >   > -- >8 --
>> >   >
>> >   > libstdc++-v3/ChangeLog:
>> >   >
>> >   >   * include/bits/ranges_algo.h (__starts_with_fn,
>> starts_with):
>> >   >   Define.
>> >   >   (__ends_with_fn, ends_with): Define.
>> >   >   * include/bits/version.def (ranges_starts_ends_with):
>> Define.
>> >   >   * include/bits/version.h: Regenerate.
>> >   >   * include/std/algorithm: Provide
>> __cpp_lib_ranges_starts_ends_with.
>> >   >   * src/c++23/std.cc.in (ranges::starts_with): Export.
>> >   >   (ranges::ends_with): Export.
>> >   >   * testsuite/25_algorithms/ends_with/1.cc: New test.
>> >   >   * testsuite/25_algorithms/starts_with/1.cc: New test.
>> >   > ---
>> >   >  libstdc++-v3/include/bits/ranges_algo.h   | 236
>> ++
>> >   >  libstdc++-v3/include/bits/version.def |   8 +
>> >   >  libstdc++-v3/include/bits/version.h   |  10 +
>> >   >  libstdc++-v3/include/std/algorithm|   1 +
>> >   >  libstdc++-v3/src/c++23/std.cc.in  |   4 +
>> >   >  .../testsuite/25_algorithms/ends_with/1.cc| 129 ++
>> >   >  .../testsuite/25_algorithms/starts_with/1.cc  | 128 ++
>> >   >  7 files changed, 516 insertions(+)
>> >   >  create mode 100644
>> libstdc++-v3/testsuite/25_algorithms/ends_with/1.cc
>> >   >  create mode 100644
>> libstdc++-v3/testsuite/25_algorithms/starts_with/1.cc
>> >   >
>> >   > diff --git a/libstdc++-v3/include/bits/ranges_algo.h
>> b/libstdc++-v3/include/bits/ranges_algo.h
>> >   > index f36e7dd59911..54646ae62f7e 100644
>> >   > --- a/libstdc++-v3/include/bits/ranges_algo.h
>> >   > +++ b/libstdc++-v3/include/bits/ranges_algo.h
>> >   > @@ -438,6 +438,242 @@ namespace ranges
>> >   >
>> >   >inline constexpr __search_n_fn search_n{};
>> >   >
>> >   > +#if __glibcxx_ranges_starts_ends_with // C++ >= 23
>> >   > +  struct __starts_with_fn
>> >   > +  {
>> >   > +template
>> _Sent1,
>> >   > +  input_iterator _Iter2, sentinel_for<_Iter2> _Sent2,
>> >   > +  typename _Pred = ranges::equal_to,
>> >   > +  typename _Proj1 = identity, typename _Proj2 =
>> identity>
>> >   > +  requires indirectly_comparable<_Iter1, _Iter2, _Pred,
>> _Proj1, _Proj2>
>> >   > +  constexpr bool
>> >   > +  operator()(_Iter1 __first1, _Sent1 __last1,
>> >   > +  _Iter2 __first2, _Sent2 __last2, _Pred __pred =
>> {},
>> >   > +  _Proj1 __proj1 = {}, _Proj2 __proj2 = {}) const
>> >   > +  {
>> >   > + iter_difference_t<_Iter1> __n1 = -1;
>> >   > + iter_difference_t<_Iter2> __n2 = -1;
>> >   > + if constexpr (sized_sentinel_for<_Sent1, _Iter1>)
>> >   > +   __n1 = __last1 - __first1;
>> >   > + if constexpr (sized_sentinel_for<_Sent2, _Iter2>)
>> >   > +   __n2 = __last2 - __first2;
>> >   > + return _S_impl(std::move(__first1), __last1, __n1,
>> >   > +std::move(__first2), __last2, __n2,
>> >   > +std::move(__pred),
>> >   > +std::move(__proj1), std::move(__proj2));
>> >   > +  }
>> >   > +
>> >   > +template> >   > +  typename _Pred = ranges::equal_to,
>> >   > +  typename _Proj1 = identity, typename _Proj2 =
>> identity>
>> >   > +  requires indirectly_comparable,
>> iterator_t<_Range2>,
>> >   > +  _Pred, _Proj1, _Proj2>
>> >   > +  constexpr bool
>> >   > +  operator()(_Range1&& __r1, _Range2&& __r2, _Pred __pred
>> = {},
>> >   > +  _Proj1 __proj1 = {}, _Proj2 __proj2 = {}) const
>> >   > +  {
>> >   > + range_difference_t<_Range1

Re: [RFC PATCH 0/3] _BitInt(N) support for LoongArch

2025-05-20 Thread Xi Ruoyao
On Tue, 2025-05-20 at 15:44 +0200, Jakub Jelinek wrote:
> > Specifically, the tests told me to extend (thought "truncate"
> > was kind of an equivalent word) the output of left shift, plus/minus,
> 
> Truncation is the exact opposite of extension.
> I can understand the need for handling of left shifts, for all the rest
> I'd really like to see testcases.

I guess the terminology thing is caused by the past experience of the
Loongson team with "a famous !TARGET_TRULY_NOOP_TRUNCATION target."  On
that target truncsidi2 is a sign-extension as required by the ISA spec.

I'm trying to fix an ext-dce bug regarding !TARGET_TRULY_NOOP_TRUNCATION
so I just decided to chime in and explain this :).
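
To make the terminology point concrete, here is a minimal sketch of what a "truncation" looks like on a target that keeps 32-bit values sign-extended in 64-bit registers. The function name is hypothetical and this is plain C++ arithmetic, not GCC's RTL; it only illustrates why truncating DImode to SImode can amount to a sign-extension of the low 32 bits on such a target.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: model a DImode -> SImode "truncation" on a target
// whose registers must always hold 32-bit values in sign-extended form.
inline std::int64_t truncate_to_si_sign_extended(std::int64_t wide)
{
  // Keep only the low 32 bits of the wide value.
  const std::uint64_t low = static_cast<std::uint64_t>(wide) & 0xFFFFFFFFull;
  // Sign-extend bit 31 into the upper half without relying on
  // implementation-defined narrowing conversions.
  return static_cast<std::int64_t>(low ^ 0x80000000ull)
	 - static_cast<std::int64_t>(0x80000000ull);
}

inline void demo_truncation()
{
  assert(truncate_to_si_sign_extended(5) == 5);
  // The upper bits are discarded...
  assert(truncate_to_si_sign_extended(0x100000005LL) == 5);
  // ...and bit 31 of the result is replicated into the upper half.
  assert(truncate_to_si_sign_extended(0x1FFFFFFFFLL) == -1);
}
```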

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


[PATCH 4/5 v3] c++, coroutines: Use decltype(auto) for the g_r_o.

2025-05-20 Thread Iain Sandoe
Hi Jason,

>>>Moving this initialization doesn't seem connected to the type of gro, or 
>>>mentioned above?

>>A fly-by tidy up.. removed.
>>I still see it in the new patch?
Apologies, I obviously fat-fingered something - done now.

>>...return object from an object that has already been destroyed.
>This message doesn't quote my comment
I added that now - with the folly part elided since we understand that is
not affected by this.

>>+= finish_decltype_type (get_ro, / 
>>*id_expression_or_member_access_p*/false,
>>+tf_warning_or_error); // TREE_TYPE (get_ro);
>Let's drop this comment, it looks vestigial.

Indeed, done.

>So please xfail the test than test for the defect.

I have done this and added a header comment explaining why we expect it
to fail.

OK for trunk now?
thanks
Iain

--- 8< ---

The revised wording for coroutines uses decltype(auto) for the
type of the get return object, which preserves references.

It is quite reasonable for a coroutine body implementation to
complete before control is returned to the ramp - and in that
case we would be creating the ramp return object from an already-
deleted promise object.

Jason observes that this is a terrible situation and we should
seek a resolution to it via core.

Since the test added here explicitly performs the unsafe action
described above, we expect it to fail (until a resolution is found).
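
The reference-preserving behaviour of decltype(auto) mentioned above can be sketched in isolation. This is an illustration only: Promise, g_promise and get_return_object_like are hypothetical stand-ins, not the names used by the coroutine transform, and no coroutine machinery is involved.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for a promise type.
struct Promise { int value = 0; };

Promise g_promise;

// Hypothetical stand-in for a get_return_object() that returns a reference.
inline Promise& get_return_object_like() { return g_promise; }

inline bool demo_decltype_auto()
{
  g_promise.value = 0;

  auto by_value = get_return_object_like();          // plain auto drops the reference: a copy
  decltype(auto) by_ref = get_return_object_like();  // decltype(auto) keeps Promise&

  static_assert(std::is_same<decltype(by_value), Promise>::value,
		"auto deduces a value type");
  static_assert(std::is_same<decltype(by_ref), Promise&>::value,
		"decltype(auto) preserves the reference");

  // Writing through the reference changes the original object, not the copy.
  by_ref.value = 42;
  return g_promise.value == 42 && by_value.value == 0;
}
```

Because the local now refers to the original object rather than a copy, using it after that object has been destroyed is exactly the hazard the test exercises.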

gcc/cp/ChangeLog:

* coroutines.cc
(cp_coroutine_transform::build_ramp_function): Use
decltype(auto) to determine the type of the temporary
get_return_object.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/pr115908.C: Count promise construction
and destruction. Run the test and XFAIL it.

Signed-off-by: Iain Sandoe 
---
 gcc/cp/coroutines.cc   | 12 ++-
 gcc/testsuite/g++.dg/coroutines/pr115908.C | 86 --
 2 files changed, 72 insertions(+), 26 deletions(-)

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index bc5fb9381db..5c4133a42b7 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -5120,8 +5120,11 @@ cp_coroutine_transform::build_ramp_function ()
   /* Check for a bad get return object type.
  [dcl.fct.def.coroutine] / 7 requires:
  The expression promise.get_return_object() is used to initialize the
- returned reference or prvalue result object ... */
-  tree gro_type = TREE_TYPE (get_ro);
+ returned reference or prvalue result object ...
+ When we use a local to hold this, it is decltype(auto).  */
+  tree gro_type
+= finish_decltype_type (get_ro, /*id_expression_or_member_access_p*/false,
+   tf_warning_or_error);
   if (VOID_TYPE_P (gro_type) && !void_ramp_p)
 {
   error_at (fn_start, "no viable conversion from % provided by"
@@ -5159,7 +5162,7 @@ cp_coroutine_transform::build_ramp_function ()
= coro_build_and_push_artificial_var (loc, "_Coro_gro", gro_type,
  orig_fn_decl, NULL_TREE);
 
-  r = cp_build_init_expr (coro_gro, get_ro);
+  r = cp_build_init_expr (coro_gro, STRIP_REFERENCE_REF (get_ro));
   finish_expr_stmt (r);
   tree coro_gro_cleanup
= cxx_maybe_build_cleanup (coro_gro, tf_warning_or_error);
@@ -5181,7 +5184,8 @@ cp_coroutine_transform::build_ramp_function ()
   /* The ramp is done, we just need the return statement, which we build from
  the return object we constructed before we called the function body.  */
 
-  finish_return_stmt (void_ramp_p ? NULL_TREE : coro_gro);
+  r = void_ramp_p ? NULL_TREE : convert_from_reference (coro_gro);
+  finish_return_stmt (r);
 
   if (flag_exceptions)
 {
diff --git a/gcc/testsuite/g++.dg/coroutines/pr115908.C 
b/gcc/testsuite/g++.dg/coroutines/pr115908.C
index ac27d916de2..a40cece1143 100644
--- a/gcc/testsuite/g++.dg/coroutines/pr115908.C
+++ b/gcc/testsuite/g++.dg/coroutines/pr115908.C
@@ -1,3 +1,16 @@
+// { dg-do run }
+
+// With the changes to deal with CWG2563 (and PR119916) we now use the
+// referenced promise in the return expression.  It is quite reasonable
+// for a body implementation to complete before control is returned to
+// the ramp - and in that case we would be creating the ramp return object
+// from an already-deleted promise object.
+// This is recognised to be a poor situation and resolution via a core
+// issue is planned.
+
+// In this test we explicitly trigger the circumstance mentioned above.
+// { dg-xfail-run-if "" { *-*-* } }
+
 #include 
 
 #ifdef OUTPUT
@@ -6,23 +19,25 @@
 
 struct Promise;
 
-bool promise_live = false;
+int promise_life = 0;
 
 struct Handle : std::coroutine_handle {
+
 Handle(Promise &p) : 
std::coroutine_handle(Handle::from_promise(p)) {
-if (!promise_live)
-  __builtin_abort ();
 #ifdef OUTPUT
-std::cout << "Handle(Promise &)\n";
+std::cout << "Handle(Promise &) " << promise_life << std::endl;
 #endif
-}
-Handle(Promise &&p) : 
std::coro

Re: [PATCH v2 1/6] libstdc++: Implement layout_left from mdspan.

2025-05-20 Thread Tomasz Kaminski
On Tue, May 20, 2025 at 3:16 PM Luc Grosheintz 
wrote:

> Implements the parts of layout_left that don't depend on any of the
> other layouts.
>
> libstdc++-v3/ChangeLog:
>
> * include/std/mdspan (layout_left): New class.
>
> Signed-off-by: Luc Grosheintz 
> ---

Sending feedback on this PR; I do not think I will have time to review the
remaining ones today.
Thanks for the update, this looks really good. Good job on catching the
rank_dynamic==0 case.

I have added a few more stylistic suggestions:
I would consider renaming the internal functions to `exts` instead of
`subextents`/`extents`.
We are inside the __mdspan namespace, so the risk of collision is minimal. I have
also added a few suggestions for default arguments.

More comments below.
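
As a rough illustration of the default-argument suggestion, the sketch below gives a range-taking helper defaults of 0 and rank(), so that callers wanting the full range can omit both arguments. The names toy_extents and exts_prod are hypothetical and only mimic the shape of the helpers in the patch; they are not the libstdc++ internals.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for an extents type with only static extents.
template<std::size_t... Es>
  struct toy_extents
  {
    static constexpr std::size_t rank() { return sizeof...(Es); }

    static std::size_t extent(std::size_t i)
    {
      // The trailing 1 keeps the array non-empty even for rank 0.
      const std::size_t vals[sizeof...(Es) + 1] = {Es..., 1};
      return vals[i];
    }
  };

// Product of the extents in [begin, end); both bounds are defaulted so
// that the common "whole range" call needs no arguments.
template<typename Exts>
  std::size_t
  exts_prod(std::size_t begin = 0, std::size_t end = Exts::rank())
  {
    std::size_t ret = 1;
    for (std::size_t i = begin; i < end; ++i)
      ret *= Exts::extent(i);
    return ret;
  }

inline void demo_default_args()
{
  using E = toy_extents<2, 3, 4>;
  assert(exts_prod<E>() == 24);      // whole range: 2 * 3 * 4
  assert(exts_prod<E>(1) == 12);     // from index 1 to the end: 3 * 4
  assert(exts_prod<E>(0, 2) == 6);   // explicit sub-range: 2 * 3
}
```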

>  libstdc++-v3/include/std/mdspan | 309 +++-
>  1 file changed, 308 insertions(+), 1 deletion(-)
>
> diff --git a/libstdc++-v3/include/std/mdspan
> b/libstdc++-v3/include/std/mdspan
> index 47cfa405e44..d90fed57a19 100644
> --- a/libstdc++-v3/include/std/mdspan
> +++ b/libstdc++-v3/include/std/mdspan
> @@ -144,6 +144,20 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>   { return __exts[__i]; });
>   }
>
> +   static constexpr span
> +   _S_static_subextents(size_t __begin, size_t __end) noexcept
>
+   {
> + return {_Extents.data() + __begin, _Extents.data() + __end};
> +   }
> +
> +   constexpr span
> +   _M_dynamic_subextents(size_t __begin, size_t __end) const noexcept
> +   requires (_Extents.size() > 0)
> +   {
> + return {_M_dynamic_extents + _S_dynamic_index[__begin],
> + _M_dynamic_extents + _S_dynamic_index[__end]};
> +   }
> +
>private:
> using _S_storage = __array_traits<_IndexType,
> _S_rank_dynamic>::_Type;
> [[no_unique_address]] _S_storage _M_dynamic_extents;
> @@ -160,6 +174,21 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> || _Extent <= numeric_limits<_IndexType>::max();
>}
>
> +  namespace __mdspan
> +  {
> +template
> +  constexpr span
> +  __static_subextents(size_t __begin, size_t __end)
>
Consider adding default arguments: __begin = 0u, __end = _Extents::rank().
This would simplify calls.

> +  { return _Extents::_S_storage::_S_static_subextents(__begin,
> __end); }
> +
> +template
> +  constexpr span
> +  __dynamic_subextents(const _Extents& __exts, size_t __begin, size_t
> __end)
> +  {
> +   return __exts._M_dynamic_extents._M_dynamic_subextents(__begin,
> __end);
> +  }
> +  }
> +
>template
>  class extents
>  {
> @@ -251,7 +280,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> : _M_dynamic_extents(span(__exts))
> { }
>
> -
>template<__mdspan::__valid_index_type _OIndexType,
> size_t _Nm>
> requires (_Nm == rank() || _Nm == rank_dynamic())
> constexpr explicit(_Nm != rank_dynamic())
> @@ -276,6 +304,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> }
>
>  private:
> +  friend span
> +  __mdspan::__static_subextents(size_t, size_t);
> +
> +  friend span
> +  __mdspan::__dynamic_subextents(const extents&, size_t,
> size_t);
> +
>using _S_storage = __mdspan::_ExtentsStorage<
> _IndexType, array{_Extents...}>;
>[[no_unique_address]] _S_storage _M_dynamic_extents;
> @@ -286,6 +320,52 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>
>namespace __mdspan
>{
> +template
> +  constexpr size_t
> +  __static_extents_prod(size_t __begin, size_t __end)
> +  {
>
+   auto __sta_exts = __static_subextents<_Extents>(__begin, __end);
> +   size_t __ret = 1;
> +   for(size_t __i = 0; __i < __sta_exts.size(); ++__i)
>
Again we could use each here, once we have span.

> + if (__sta_exts[__i] != dynamic_extent)
> +   __ret *= __sta_exts[__i];
> +   return __ret;
> +  }
> +
> +template
> +  constexpr size_t
> +  __dynamic_extents_prod(const _Extents& __exts, size_t __begin,
> +size_t __end)
> +  {
> +   auto __dyn_exts = __dynamic_subextents<_Extents>(__exts, __begin,
> +__end);
> +   size_t __ret = 1;
> +   for(size_t __i = 0; __i < __dyn_exts.size(); ++__i)
>
Again we could use each here, once we have span.  And we could inline the
function in __exts_prod.

> +   __ret *= __dyn_exts[__i];
> +   return __ret;
> +  }
> +
> +template
> +  constexpr typename _Extents::index_type
> +  __exts_prod(const _Extents& __exts, size_t __begin, size_t __end)
> noexcept
> +  {
> +   using _IndexType = typename _Extents::index_type;
> +   auto __ret = __static_extents_prod<_Extents>(__begin, __end);
> +   if constexpr (_Extents::rank_dynamic() > 0)
> + __ret *= __dynamic_extents_prod(__exts, __begin, __end);
> +   return __ret;
> +  }
> +
> +template
> +  constexpr typename _Extents::index_type
> +  __fwd_prod(const _

Re: [PATCH 2/2]AArch64: propose -mmax-vectorization as an option to override vector costing

2025-05-20 Thread Richard Sandiford
Tamar Christina  writes:
> Hi All,
>
> With the middle-end providing a way to make vectorization more profitable by
> scaling vect-scalar-cost-multiplier, this patch adds a more user-friendly
> option to make it easier to use.
>
> I propose making it an actual -m option that we document and retain vs using
> the parameter name.  In the future I would like to extend this option to
> modify additional costing in the AArch64 backend itself.
>
> This can be used together with --param aarch64-autovec-preference to get the
> vectorizer to say, always vectorize with SVE.  I did consider making this an
> additional enum to --param aarch64-autovec-preference but I also think this is
> a useful thing to be able to set with pragmas and attributes, but am open to
> suggestions.
>
> Note that as a follow up I plan on extending -fdump-tree-vect to support 
> -stats
> which is then intended to be usable with this flag.
>
> Bootstrapped Regtested on aarch64-none-linux-gnu,
> arm-none-linux-gnueabihf, x86_64-pc-linux-gnu
> -m32, -m64 and no issues.
>
> Ok for master?
>
> Thanks,
> Tamar
>
> gcc/ChangeLog:
>
>   * config/aarch64/aarch64.opt (max-vectorization): New.
>   * config/aarch64/aarch64.cc (aarch64_override_options_internal): Save
>   and restore option.
>   Implement it through vect-scalar-cost-multiplier.
>   (aarch64_attributes): Default to off.
>   * common/config/aarch64/aarch64-common.cc (aarch64_handle_option):
>   Initialize option.
>   * doc/extend.texi (max-vectorization): Document attribute.
>   * doc/invoke.texi (max-vectorization): Document flag.
>
> gcc/testsuite/ChangeLog:
>
>   * gcc.target/aarch64/sve/cost_model_17.c: New test.
>   * gcc.target/aarch64/sve/cost_model_18.c: New test.

Sorry for the slow reply.  This was obviously paired with the original
version of patch 1, but I was waiting to see how that developed before
reviewing this.  I've taken the updated version of patch 1 into account
below.

> ---
>
> diff --git a/gcc/common/config/aarch64/aarch64-common.cc 
> b/gcc/common/config/aarch64/aarch64-common.cc
> index 
> b9ed83642ade4462f1b030d68cf9744d31d70c23..1488697c6ce43108ae2938e5b8a00ac7ac262da6
>  100644
> --- a/gcc/common/config/aarch64/aarch64-common.cc
> +++ b/gcc/common/config/aarch64/aarch64-common.cc
> @@ -142,6 +142,10 @@ aarch64_handle_option (struct gcc_options *opts,
>opts->x_aarch64_flag_outline_atomics = val;
>return true;
>  
> +case OPT_mmax_vectorization:
> +  opts->x_flag_aarch64_max_vectorization = val;
> +  return true;
> +
>  default:
>return true;
>  }
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index 
> 9e3f2885bccb62550c5fcfdf93d72fbc2e63233e..46204264fea5af781be15374edc89587429518cb
>  100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -18973,6 +18973,12 @@ aarch64_override_options_internal (struct 
> gcc_options *opts)
>if (TARGET_SME && !TARGET_SVE2)
>  sorry ("no support for %qs without %qs", "sme", "sve2");
>  
> +  /* Set scalar costing to a high value such that we always pick
> + vectorization.  */
> +  if (opts->x_flag_aarch64_max_vectorization)
> +SET_OPTION_IF_UNSET (opts, &global_options_set,
> +  param_vect_scalar_cost_multiplier, 0x);

The new maximum in patch 1 is 1, which is less than this.
I suppose we should use 1 here too.

> +
>aarch64_override_options_after_change_1 (opts);
>  }
>  
> @@ -19723,6 +19729,8 @@ static const struct aarch64_attribute_info 
> aarch64_attributes[] =
>   OPT_msign_return_address_ },
>{ "outline-atomics", aarch64_attr_bool, true, NULL,
>   OPT_moutline_atomics},
> +  { "max-vectorization", aarch64_attr_bool, false, NULL,
> + OPT_mmax_vectorization},
>{ NULL, aarch64_attr_custom, false, NULL, OPT }
>  };
>  
> diff --git a/gcc/config/aarch64/aarch64.opt b/gcc/config/aarch64/aarch64.opt
> index 
> f32d56d4ffaef7862c1c45a11753be5d480220d0..2725c50da64a2c05489ea6202bdd5eedf1ba7e27
>  100644
> --- a/gcc/config/aarch64/aarch64.opt
> +++ b/gcc/config/aarch64/aarch64.opt
> @@ -290,6 +290,10 @@ msve-vector-bits=
>  Target RejectNegative Joined Enum(sve_vector_bits) 
> Var(aarch64_sve_vector_bits) Init(SVE_SCALABLE)
>  -msve-vector-bits=   Set the number of bits in an SVE vector 
> register.
>  
> +mmax-vectorization
> +Target Undocumented Var(flag_aarch64_max_vectorization) Save

It is (rightly) not undocumented :)

> +Override the scalar cost model such that vectorization is always profitable.
> +
>  mverbose-cost-dump
>  Target Undocumented Var(flag_aarch64_verbose_cost)
>  Enables verbose cost model dumping in the debug dump files.
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index 
> 40ccf22b29f4316928f905ec2c978fdaf30a55ec..759a04bc7c4c66155154d55045bb75d695b2d6c2
>  100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -3882,6 +3882,13 @@ Enable or disable calls to ou

Re: [PATCH v22 0/3] c: Add _Countof and

2025-05-20 Thread Joseph Myers
On Tue, 20 May 2025, Alejandro Colomar wrote:

> Could you please clarify if I need to do anything or if this is already
> scheduled for review when you have some time?  Also please clarify if
> you're okay with amending that or if you prefer that I send v23.

I have it on my list for review.  I'd prefer v23 (also with complete 
ChangeLog entries, please, rather than the present placeholders, so they 
don't need to be written at commit time).

-- 
Joseph S. Myers
josmy...@redhat.com



[PATCH v2 1/1] Add warnings of potentially-uninitialized padding bits

2025-05-20 Thread Christopher Bazley
Commit 0547dbb725b reduced the number of cases in which
union padding bits are zeroed when the relevant language
standard does not strictly require it, unless gcc was
invoked with -fzero-init-padding-bits=unions or
-fzero-init-padding-bits=all in order to explicitly
request zeroing of padding bits.

This commit adds a closely related warning,
-Wzero-init-padding-bits=, which is intended to help
programmers to find code that might now need to be
rewritten or recompiled with
-fzero-init-padding-bits=unions or
-fzero-init-padding-bits=all in order to replicate
the behaviour that it had when compiled by older
versions of GCC. It can also be used to find struct
padding that was never previously guaranteed to be
zero initialized and still isn't unless GCC is
invoked with -fzero-init-padding-bits=all.

The new warning can be set to the same three states
as -fzero-init-padding-bits ('standard', 'unions'
or 'all') and has the same default value ('standard').

The two options interact as follows:

             f: standard  f: unions  f: all
w: standard  X            X          X
w: unions    U            X          X
w: all       A            S          X

X = No warnings about padding
U = Warnings about padding of unions.
S = Warnings about padding of structs.
A = Warnings about padding of structs and unions.

The level of optimisation and whether or not the
entire initializer is dropped to memory can both
affect whether warnings are produced when compiling
a given program. This is intentional, since tying
the warnings more closely to the relevant language
standard would require a very different approach
that would still be target-dependent, might impose
an unacceptable burden on programmers, and would
risk not satisfying the intended use-case (which
is closely tied to a specific optimisation).
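
For readers unfamiliar with the terminology, the following sketch shows the kind of objects whose padding is at stake. The type names U and S are hypothetical, and the sizes are target-dependent; whether a diagnostic is actually emitted for initializers of this shape depends on the option values and optimisation level, as described above.

```cpp
#include <cassert>
#include <cstddef>

// Initialising only the small member of this union leaves the remaining
// bytes of the object as padding with respect to that member.
union U { char c; long long ll; };

// On most targets the compiler inserts padding between c and i.
struct S { char c; int i; };

// Bytes of U not covered by the member c.
inline std::size_t union_bytes_beyond_c()
{ return sizeof(U) - sizeof(char); }

// Bytes of S not covered by any member (may be 0 on some targets).
inline std::size_t struct_padding_bytes()
{ return sizeof(S) - sizeof(char) - sizeof(int); }

inline void demo_padding()
{
  U u = { 'a' };      // only u.c is named by the initializer
  S s = { 'a', 1 };   // every member is initialised, padding is not mentioned
  assert(u.c == 'a');
  assert(s.i == 1);
  assert(union_bytes_beyond_c() > 0);
}
```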

gcc/ChangeLog:

* common.opt: Add Wzero-init-padding-bits=.
* doc/invoke.texi: Document Wzero-init-padding-bits=.
* expr.cc (categorize_ctor_elements_1): Update new struct type
ctor_completeness instead of an integer to indicate presence of
padding or missing fields in a constructor. Instead of setting -1
upon discovery of padding bits in both structs and unions,
set separate flags to indicate the type of padding bits.
(categorize_ctor_elements): Update the type and documentation of
the p_complete parameter.
(mostly_zeros_p): Use new struct type ctor_completeness when
calling categorize_ctor_elements.
(all_zeros_p): Use new struct type ctor_completeness when
calling categorize_ctor_elements.
* expr.h (struct ctor_completeness): New struct type to replace
an integer that could take the value -1 ('all fields are
initialized, but there's padding'), 0 ('fields are missing') or
1 ('all fields are initialized, and there's no padding'). Named
bool members make the code easier to understand and make room to
disambiguate struct padding bits from union padding bits.
(categorize_ctor_elements): Update the function declaration to use
the new struct type in the last parameter declaration.
* gimplify.cc (gimplify_init_constructor): Replace use of
complete_p != 0 ('all fields are initialized') with !sparse,
replace use of complete == 0 ('fields are missing') with sparse, and
replace use of complete <= 0 ('fields are missing' or 'all fields are
initialized, but there's padding') with sparse || padded_union or
padded_non_union. Trigger new warnings if storage for the object
is not zeroed but padded_union or padded_non_union is set
(because this combination implies possible non-zero padding bits).

gcc/testsuite/ChangeLog:

* gcc.dg/c23-empty-init-warn-1.c: New test.
* gcc.dg/c23-empty-init-warn-10.c: New test.
* gcc.dg/c23-empty-init-warn-11.c: New test.
* gcc.dg/c23-empty-init-warn-12.c: New test.
* gcc.dg/c23-empty-init-warn-13.c: New test.
* gcc.dg/c23-empty-init-warn-14.c: New test.
* gcc.dg/c23-empty-init-warn-15.c: New test.
* gcc.dg/c23-empty-init-warn-16.c: New test.
* gcc.dg/c23-empty-init-warn-17.c: New test.
* gcc.dg/c23-empty-init-warn-2.c: New test.
* gcc.dg/c23-empty-init-warn-3.c: New test.
* gcc.dg/c23-empty-init-warn-4.c: New test.
* gcc.dg/c23-empty-init-warn-5.c: New test.
* gcc.dg/c23-empty-init-warn-6.c: New test.
* gcc.dg/c23-empty-init-warn-7.c: New test.
* gcc.dg/c23-empty-init-warn-8.c: New test.
* gcc.dg/c23-empty-init-warn-9.c: New test.
* gcc.dg/gnu11-empty-init-warn-1.c: New test.
* gcc.dg/gnu11-empty-init-warn-10.c: New test.
* gcc.dg/gnu11-empty-init-warn-11.c: New test.

[PATCH v2 6/6] libstdc++: Add tests for layout_stride.

2025-05-20 Thread Luc Grosheintz
Implements the tests for layout_stride and for the features of the other
two layouts that depend on layout_stride.

libstdc++-v3/ChangeLog:

* testsuite/23_containers/mdspan/layouts/class_mandate_neg.cc: Add
tests for layout_stride.
* testsuite/23_containers/mdspan/layouts/ctors.cc: Add test for
layout_stride and the interaction with other layouts.
* testsuite/23_containers/mdspan/layouts/mapping.cc: Ditto.
* testsuite/23_containers/mdspan/layouts/stride.cc: New test.

Signed-off-by: Luc Grosheintz 
---
 libstdc++-v3/include/std/mdspan   |   7 +-
 .../mdspan/layouts/class_mandate_neg.cc   |  19 +
 .../23_containers/mdspan/layouts/ctors.cc |  99 
 .../23_containers/mdspan/layouts/mapping.cc   |  75 ++-
 .../23_containers/mdspan/layouts/stride.cc| 494 ++
 5 files changed, 692 insertions(+), 2 deletions(-)
 create mode 100644 
libstdc++-v3/testsuite/23_containers/mdspan/layouts/stride.cc

diff --git a/libstdc++-v3/include/std/mdspan b/libstdc++-v3/include/std/mdspan
index 68fbf3ae81e..25f03b69bc1 100644
--- a/libstdc++-v3/include/std/mdspan
+++ b/libstdc++-v3/include/std/mdspan
@@ -884,7 +884,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   extents() const noexcept { return _M_extents; }
 
   constexpr array
-  strides() const noexcept { return _M_strides; }
+  strides() const noexcept {
+   array __ret;
+   for(size_t __i = 0; __i < extents_type::rank(); ++__i)
+ __ret[__i] = _M_strides[__i];
+   return __ret;
+  }
 
   constexpr index_type
   required_span_size() const noexcept
diff --git 
a/libstdc++-v3/testsuite/23_containers/mdspan/layouts/class_mandate_neg.cc 
b/libstdc++-v3/testsuite/23_containers/mdspan/layouts/class_mandate_neg.cc
index a41bad988d2..0e39bd3aab0 100644
--- a/libstdc++-v3/testsuite/23_containers/mdspan/layouts/class_mandate_neg.cc
+++ b/libstdc++-v3/testsuite/23_containers/mdspan/layouts/class_mandate_neg.cc
@@ -17,7 +17,26 @@ template
 typename Layout::mapping m3; // { dg-error "required from" }
   };
 
+template
+  struct B // { dg-error "expansion of" }
+  {
+using Extents = std::extents;
+using OExtents = std::extents;
+
+using Mapping = typename Layout::mapping;
+using OMapping = typename Layout::mapping;
+
+Mapping m{OMapping{}};
+  };
+
 A a_left; // { dg-error "required from" }
 A a_right;   // { dg-error "required from" }
+A a_stride; // { dg-error "required from" }
+
+B<1, std::layout_left, std::layout_right> blr; // { dg-error "required 
here" }
+B<2, std::layout_left, std::layout_stride> bls;// { dg-error "required 
here" }
+
+B<3, std::layout_right, std::layout_left> brl; // { dg-error "required 
here" }
+B<4, std::layout_right, std::layout_stride> brs;   // { dg-error "required 
here" }
 
 // { dg-prune-output "must be representable as index_type" }
diff --git a/libstdc++-v3/testsuite/23_containers/mdspan/layouts/ctors.cc 
b/libstdc++-v3/testsuite/23_containers/mdspan/layouts/ctors.cc
index 4a7d2bffeef..89d1b3a01a0 100644
--- a/libstdc++-v3/testsuite/23_containers/mdspan/layouts/ctors.cc
+++ b/libstdc++-v3/testsuite/23_containers/mdspan/layouts/ctors.cc
@@ -282,12 +282,111 @@ namespace from_left_or_right
 }
 }
 
+// ctor: mapping(layout_stride::mapping)
+namespace from_stride
+{
+  template
+constexpr auto
+strides(Mapping m)
+{
+  constexpr auto rank = Mapping::extents_type::rank();
+  std::array s;
+
+  if constexpr (rank > 0)
+   for(size_t i = 0; i < rank; ++i)
+ s[i] = m.stride(i);
+  return s;
+}
+
+  template
+constexpr void
+verify_convertible(OExtents oexts)
+{
+  using Mapping = typename Layout::mapping;
+  using OMapping = std::layout_stride::mapping;
+
+  constexpr auto other = OMapping(oexts, strides(Mapping(Extents(oexts;
+  if constexpr (std::is_same_v)
+   ::verify_nothrow_convertible(other);
+  else
+   ::verify_convertible(other);
+}
+
+  template
+constexpr void
+verify_constructible(OExtents oexts)
+{
+  using Mapping = typename Layout::mapping;
+  using OMapping = std::layout_stride::mapping;
+
+  constexpr auto other = OMapping(oexts, strides(Mapping(Extents(oexts;
+  if constexpr (std::is_same_v)
+   ::verify_nothrow_constructible(other);
+  else
+   ::verify_constructible(other);
+}
+
+  template
+constexpr bool
+test_ctor()
+{
+  assert_not_constructible<
+   typename Layout::mapping>,
+   std::layout_stride::mapping>>();
+
+  assert_not_constructible<
+   typename Layout::mapping>,
+   std::layout_stride::mapping>>();
+
+  assert_not_constructible<
+   typename Layout::mapping>,
+   std::layout_stride::mapping>>();
+
+  verify_convertible>(std::extents{});
+
+  verify_convertible>(
+   std::extents{});
+
+

[PATCH v2 3/6] libstdc++: Implement layout_right from mdspan.

2025-05-20 Thread Luc Grosheintz
Implement the parts of layout_left that depend on layout_right; and the
parts of layout_right that don't depend on layout_stride.
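
As background, layout_right is the row-major layout: the last index varies fastest. The sketch below computes the corresponding linear offset with a plain loop over std::array; the function names are hypothetical and this is a simplified illustration, not the fold-expression implementation in the patch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Row-major (layout_right style) offset: walk the indices from the last
// dimension to the first, accumulating the product of the extents seen so far.
template<std::size_t Rank>
  std::size_t
  linear_index_right(const std::array<std::size_t, Rank>& exts,
		     const std::array<std::size_t, Rank>& idx)
  {
    std::size_t res = 0;
    std::size_t mult = 1;
    for (std::size_t pos = Rank; pos-- > 0; )
      {
	res += idx[pos] * mult;
	mult *= exts[pos];
      }
    return res;
  }

// Example: extents 2 x 3 x 4, index (1, 2, 3).
inline std::size_t example_offset()
{
  const std::array<std::size_t, 3> exts = {{2, 3, 4}};
  const std::array<std::size_t, 3> idx = {{1, 2, 3}};
  return linear_index_right(exts, idx);   // 3 + 2*4 + 1*(3*4)
}
```

For the example above the offset is 3 + 8 + 12 = 23, the last element of the 24-element array.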

libstdc++-v3/ChangeLog:

* include/std/mdspan (layout_right): New class.

Signed-off-by: Luc Grosheintz 
---
 libstdc++-v3/include/std/mdspan | 153 +++-
 1 file changed, 152 insertions(+), 1 deletion(-)

diff --git a/libstdc++-v3/include/std/mdspan b/libstdc++-v3/include/std/mdspan
index d90fed57a19..13ed27bcaaf 100644
--- a/libstdc++-v3/include/std/mdspan
+++ b/libstdc++-v3/include/std/mdspan
@@ -390,6 +390,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   class mapping;
   };
 
+  struct layout_right
+  {
+template
+  class mapping;
+  };
+
   namespace __mdspan
   {
 template
@@ -494,7 +500,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  _Mapping>;
 
 template
-  concept __standardized_mapping = __mapping_of;
+  concept __standardized_mapping = __mapping_of
+  || __mapping_of;
 
 template
   concept __mapping_like = requires
@@ -544,6 +551,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
: mapping(__other.extents(), __mdspan::__internal_ctor{})
{ }
 
+  template
+   requires (_Extents::rank() <= 1
+ && is_constructible_v<_Extents, _OExtents>)
+   constexpr explicit(!is_convertible_v<_OExtents, _Extents>)
+   mapping(const layout_right::mapping<_OExtents>& __other) noexcept
+   : mapping(__other.extents(), __mdspan::__internal_ctor{})
+   { }
+
   constexpr mapping&
   operator=(const mapping&) noexcept = default;
 
@@ -611,6 +626,142 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
[[no_unique_address]] _Extents _M_extents;
 };
 
+  namespace __mdspan
+  {
+template
+  constexpr typename _Extents::index_type
+  __linear_index_right(const _Extents& __exts, _Indices... __indices)
+  {
+   using _IndexType = typename _Extents::index_type;
+   array<_IndexType, sizeof...(__indices)> __ind_arr{__indices...};
+   _IndexType __res = 0;
+   if constexpr (sizeof...(__indices) > 0)
+ {
+   _IndexType __mult = 1;
+   auto __update = [&, __pos = __exts.rank()](_IndexType) mutable
+ {
+   --__pos;
+   __res += __ind_arr[__pos] * __mult;
+   __mult *= __exts.extent(__pos);
+ };
+   (__update(__indices), ...);
+ }
+   return __res;
+  }
+  }
+
+  template
+class layout_right::mapping
+{
+public:
+  using extents_type = _Extents;
+  using index_type = typename extents_type::index_type;
+  using size_type = typename extents_type::size_type;
+  using rank_type = typename extents_type::rank_type;
+  using layout_type = layout_right;
+
+  static_assert(__mdspan::__representable_size<_Extents, index_type>,
+   "The size of extents_type must be representable as index_type");
+
+  constexpr
+  mapping() noexcept = default;
+
+  constexpr
+  mapping(const mapping&) noexcept = default;
+
+  constexpr
+  mapping(const _Extents& __extents) noexcept
+  : _M_extents(__extents)
+  { __glibcxx_assert(__mdspan::__is_representable_extents(_M_extents)); }
+
+  template
+   requires (is_constructible_v)
+   constexpr explicit(!is_convertible_v<_OExtents, extents_type>)
+   mapping(const mapping<_OExtents>& __other) noexcept
+   : mapping(__other.extents(), __mdspan::__internal_ctor{})
+   { }
+
+  template
+   requires (extents_type::rank() <= 1
+   && is_constructible_v)
+   constexpr explicit(!is_convertible_v<_OExtents, extents_type>)
+   mapping(const layout_left::mapping<_OExtents>& __other) noexcept
+   : mapping(__other.extents(), __mdspan::__internal_ctor{})
+   { }
+
+  constexpr mapping&
+  operator=(const mapping&) noexcept = default;
+
+  constexpr const _Extents&
+  extents() const noexcept { return _M_extents; }
+
+  constexpr index_type
+  required_span_size() const noexcept
+  { return __mdspan::__fwd_prod(_M_extents, extents_type::rank()); }
+
+  template<__mdspan::__valid_index_type... _Indices>
+   requires (sizeof...(_Indices) == extents_type::rank())
+   constexpr index_type
+   operator()(_Indices... __indices) const noexcept
+   {
+ return __mdspan::__linear_index_right(
+   _M_extents, static_cast(__indices)...);
+   }
+
+  static constexpr bool
+  is_always_unique() noexcept
+  { return true; }
+
+  static constexpr bool
+  is_always_exhaustive() noexcept
+  { return true; }
+
+  static constexpr bool
+  is_always_strided() noexcept
+  { return true; }
+
+  static constexpr bool
+  is_unique() noexcept
+  { return true; }
+
+  static constexpr bool
+  is_exhaustive() noexcept
+  { return true; }
+
+  static constexpr bool
+  is_st

[PATCH v2 0/6] Implement layouts from mdspan.

2025-05-20 Thread Luc Grosheintz
This follows up on:
https://gcc.gnu.org/pipermail/libstdc++/2025-May/061459.html

The changes are:
  * Fix layout_stride::strides; and add tests.
  * Add accessors for ranges of static and dynamic extents.
  * Use them to implement __fwd_prod and __rev_prod.
  * Remove public members with protected names from extents,
by using friends and the accessors instead.
  * Reduce runtime cost of __is_representable_extents by
separating static and dynamic extents.
  * Remove __layout_extents.
  * Move private ctors to bottom of class and directly
initialize _M_extents.
  * Implement precondition in layout_{left,right}(layout_stride)
using `*this == __other`.
  * Twice: Use lambda instead of *_impl function.
  * Various smaller improvements to the tests.
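
For readers who want to see what the `*this == __other` precondition
amounts to: a layout_stride mapping can only be converted to layout_right
when its strides are exactly the row-major strides of its extents. Below is
a minimal sketch of that check on plain std::array values. The names
row_major_strides and is_row_major are hypothetical and do not appear in
the patch, which works on the real mapping types instead.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper: strides of a row-major (layout_right-like) layout,
// where the last index varies fastest.
template<std::size_t Rank>
  std::array<std::size_t, Rank>
  row_major_strides(const std::array<std::size_t, Rank>& extents)
  {
    std::array<std::size_t, Rank> strides{};
    std::size_t stride = 1;
    for (std::size_t i = Rank; i > 0; --i)
      {
        strides[i - 1] = stride;
        stride *= extents[i - 1];
      }
    return strides;
  }

// Hypothetical helper: true if the given strides describe a row-major
// layout for the given extents.
template<std::size_t Rank>
  bool
  is_row_major(const std::array<std::size_t, Rank>& extents,
               const std::array<std::size_t, Rank>& strides)
  { return strides == row_major_strides(extents); }
```

For extents (2, 3, 4) the row-major strides are (12, 4, 1), so a strided
mapping with strides (1, 2, 6) would violate the precondition.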

Luc Grosheintz (6):
  libstdc++: Implement layout_left from mdspan.
  libstdc++: Add tests for layout_left.
  libstdc++: Implement layout_right from mdspan.
  libstdc++: Add tests for layout_right.
  libstdc++: Implement layout_stride from mdspan.
  libstdc++: Add tests for layout_stride.

 libstdc++-v3/include/std/mdspan   | 670 +-
 .../mdspan/layouts/class_mandate_neg.cc   |  42 ++
 .../23_containers/mdspan/layouts/ctors.cc | 401 +++
 .../23_containers/mdspan/layouts/mapping.cc   | 569 +++
 .../23_containers/mdspan/layouts/stride.cc| 494 +
 5 files changed, 2175 insertions(+), 1 deletion(-)
 create mode 100644 
libstdc++-v3/testsuite/23_containers/mdspan/layouts/class_mandate_neg.cc
 create mode 100644 libstdc++-v3/testsuite/23_containers/mdspan/layouts/ctors.cc
 create mode 100644 
libstdc++-v3/testsuite/23_containers/mdspan/layouts/mapping.cc
 create mode 100644 
libstdc++-v3/testsuite/23_containers/mdspan/layouts/stride.cc

-- 
2.49.0



[PATCH v2 5/6] libstdc++: Implement layout_stride from mdspan.

2025-05-20 Thread Luc Grosheintz
Implements the remaining parts of layout_left and layout_right; and all
of layout_stride.
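
As a rough illustration of what a strided layout computes: the linear
offset is the sum of each index multiplied by the stride of its dimension.
The sketch below uses std::array and a hypothetical function name,
strided_index. The patch itself takes the indices as a parameter pack and
reads the strides from the mapping.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper: offset = sum over r of indices[r] * strides[r].
template<std::size_t Rank>
  std::size_t
  strided_index(const std::array<std::size_t, Rank>& strides,
                const std::array<std::size_t, Rank>& indices)
  {
    std::size_t offset = 0;
    for (std::size_t r = 0; r < Rank; ++r)
      offset += indices[r] * strides[r];
    return offset;
  }
```

With 3x4 extents, strides (4, 1) give the row-major layout and strides
(1, 3) give the column-major one, so the same index pair maps to different
offsets depending on the strides alone.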

libstdc++-v3/ChangeLog:

* include/std/mdspan(layout_stride): New class.

Signed-off-by: Luc Grosheintz 
---
 libstdc++-v3/include/std/mdspan | 211 +++-
 1 file changed, 208 insertions(+), 3 deletions(-)

diff --git a/libstdc++-v3/include/std/mdspan b/libstdc++-v3/include/std/mdspan
index 13ed27bcaaf..68fbf3ae81e 100644
--- a/libstdc++-v3/include/std/mdspan
+++ b/libstdc++-v3/include/std/mdspan
@@ -396,6 +396,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   class mapping;
   };
 
+  struct layout_stride
+  {
+template
+  class mapping;
+  };
+
   namespace __mdspan
   {
 template
@@ -501,7 +507,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 template
   concept __standardized_mapping = __mapping_of
-  || __mapping_of;
+  || __mapping_of
+  || __mapping_of;
 
 template
   concept __mapping_like = requires
@@ -559,6 +566,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
: mapping(__other.extents(), __mdspan::__internal_ctor{})
{ }
 
+  template
+   requires (is_constructible_v)
+   constexpr explicit(extents_type::rank() > 0)
+   mapping(const layout_stride::mapping<_OExtents>& __other)
+   : mapping(__other.extents(), __mdspan::__internal_ctor{})
+   { __glibcxx_assert(*this == __other); }
+
   constexpr mapping&
   operator=(const mapping&) noexcept = default;
 
@@ -574,8 +588,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
constexpr index_type
operator()(_Indices... __indices) const noexcept
{
- return __mdspan::__linear_index_left(
-   this->extents(), static_cast(__indices)...);
+ return __mdspan::__linear_index_left(_M_extents,
+   static_cast(__indices)...);
}
 
   static constexpr bool
@@ -689,6 +703,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
: mapping(__other.extents(), __mdspan::__internal_ctor{})
{ }
 
+  template
+   requires (is_constructible_v)
+   constexpr explicit(extents_type::rank() > 0)
+   mapping(const layout_stride::mapping<_OExtents>& __other) noexcept
+   : mapping(__other.extents(), __mdspan::__internal_ctor{})
+   { __glibcxx_assert(*this == __other); }
+
   constexpr mapping&
   operator=(const mapping&) noexcept = default;
 
@@ -762,6 +783,190 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
[[no_unique_address]] _Extents _M_extents;
 };
 
+  namespace __mdspan
+  {
+template
+  constexpr typename _Mapping::index_type
+  __offset(const _Mapping& __m) noexcept
+  {
+   using _IndexType = typename _Mapping::index_type;
+
+   auto __impl = [&__m](index_sequence<_Counts...>)
+   { return __m(((void) _Counts, _IndexType(0))...); };
+   return __impl(make_index_sequence<_Mapping::extents_type::rank()>());
+  }
+
+template
+  constexpr typename _Mapping::index_type
+  __linear_index_strides(const _Mapping& __m,
+_Indices... __indices)
+  {
+   using _IndexType = typename _Mapping::index_type;
+   _IndexType __res = 0;
+   if constexpr (sizeof...(__indices) > 0)
+ {
+   auto __update = [&, __pos = 0u](_IndexType __idx) mutable
+ {
+   __res += __idx * __m.stride(__pos++);
+ };
+   (__update(__indices), ...);
+ }
+   return __res;
+  }
+  }
+
+  template
+class layout_stride::mapping
+{
+public:
+  using extents_type = _Extents;
+  using index_type = typename extents_type::index_type;
+  using size_type = typename extents_type::size_type;
+  using rank_type = typename extents_type::rank_type;
+  using layout_type = layout_stride;
+
+  static_assert(__mdspan::__representable_size<_Extents, index_type>,
+   "The size of extents_type must be representable as index_type");
+
+  constexpr
+  mapping() noexcept
+  {
+   auto __stride = index_type(1);
+   for(size_t __i = extents_type::rank(); __i > 0; --__i)
+ {
+   _M_strides[__i - 1] = __stride;
+   __stride *= _M_extents.extent(__i - 1);
+ }
+  }
+
+  constexpr
+  mapping(const mapping&) noexcept = default;
+
+  template<__mdspan::__valid_index_type _OIndexType>
+   constexpr
+   mapping(const extents_type& __exts,
+   span<_OIndexType, extents_type::rank()> __strides) noexcept
+   : _M_extents(__exts)
+   {
+ for(size_t __i = 0; __i < extents_type::rank(); ++__i)
+   _M_strides[__i] = index_type(as_const(__strides[__i]));
+   }
+
+  template<__mdspan::__valid_index_type _OIndexType>
+   constexpr
+   mapping(const extents_type& __exts,
+   const array<_OIndexType, extents_type::rank()>& __strides)
+   noexcept
+

[committed] hpux: Fix detection of atomic support when profiling

2025-05-20 Thread John David Anglin
Tested on hppa64-hp-hpux11.11.  Committed to trunk.

Dave
---

hpux: Fix detection of atomic support when profiling

The pa target lacks atomic sync compare and swap instructions.
These are implemented as libcalls and in libatomic.  As on linux,
we lie about their availability.

This fixes the gcov-30.c test on hppa64-hpux11.
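
For context, user code can observe these HAVE defines only indirectly,
through the predefined __GCC_HAVE_SYNC_COMPARE_AND_SWAP_* macros. The
sketch below is an illustration, not part of the patch: has_sync_cas_4 and
cas_increment are hypothetical names, and the loop uses std::atomic rather
than the __sync builtins directly.

```cpp
#include <atomic>

// True when the compiler advertises a 4-byte __sync compare-and-swap.
#ifdef __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4
constexpr bool has_sync_cas_4 = true;
#else
constexpr bool has_sync_cas_4 = false;
#endif

// Hypothetical helper: increment a counter with a compare-and-swap loop
// and return the new value.  On targets without native instructions the
// compiler may lower this to a library call.
inline int
cas_increment(std::atomic<int>& value)
{
  int expected = value.load();
  // On failure compare_exchange_weak reloads 'expected', so just retry.
  while (!value.compare_exchange_weak(expected, expected + 1))
    { }
  return expected + 1;
}
```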

2025-05-19  John David Anglin  

gcc/ChangeLog:

* config/pa/pa-hpux.h (TARGET_HAVE_LIBATOMIC): Define.
(HAVE_sync_compare_and_swapqi): Likewise.
(HAVE_sync_compare_and_swaphi): Likewise.
(HAVE_sync_compare_and_swapsi): Likewise.
(HAVE_sync_compare_and_swapdi): Likewise.

diff --git a/gcc/config/pa/pa-hpux.h b/gcc/config/pa/pa-hpux.h
index 74e30eda9b5..1439447fdbe 100644
--- a/gcc/config/pa/pa-hpux.h
+++ b/gcc/config/pa/pa-hpux.h
@@ -114,3 +114,17 @@ along with GCC; see the file COPYING3.  If not see
 
 #undef TARGET_LIBC_HAS_FUNCTION
 #define TARGET_LIBC_HAS_FUNCTION no_c99_libc_has_function
+
+/* Assume we have libatomic if sync libcalls are disabled.  */
+#undef TARGET_HAVE_LIBATOMIC
+#define TARGET_HAVE_LIBATOMIC (!flag_sync_libcalls)
+
+/* The SYNC operations are implemented as library functions, not
+   INSN patterns.  As a result, the HAVE defines for the patterns are
+   not defined.  We need to define them to generate the corresponding
+   __GCC_HAVE_SYNC_COMPARE_AND_SWAP_* and __GCC_ATOMIC_*_LOCK_FREE
+   defines.  */
+#define HAVE_sync_compare_and_swapqi (flag_sync_libcalls)
+#define HAVE_sync_compare_and_swaphi (flag_sync_libcalls)
+#define HAVE_sync_compare_and_swapsi (flag_sync_libcalls)
+#define HAVE_sync_compare_and_swapdi (flag_sync_libcalls)


[PATCH v2 0/1] Add warnings of potentially-uninitialized padding bits

2025-05-20 Thread Christopher Bazley
Commit 0547dbb725b reduced the number of cases in which
union padding bits are zeroed when the relevant language
standard does not strictly require it, unless gcc was
invoked with -fzero-init-padding-bits=unions or
-fzero-init-padding-bits=all in order to explicitly
request zeroing of padding bits.

This commit adds a closely related warning,
-Wzero-init-padding-bits=, which is intended to help
programmers to find code that might now need to be
rewritten or recompiled with
-fzero-init-padding-bits=unions or
-fzero-init-padding-bits=all in order to replicate
the behaviour that it had when compiled by older
versions of GCC. It can also be used to find struct
padding that was never previously guaranteed to be
zero initialized and still isn't unless GCC is
invoked with -fzero-init-padding-bits=all option.

The new warning can be set to the same three states
as -fzero-init-padding-bits ('standard', 'unions'
or 'all') and has the same default value ('standard').

The two options interact as follows:

              f: standard  f: unions  f: all
w: standard   X            X          X
w: unions     U            X          X
w: all        A            S          X

X = No warnings about padding
U = Warnings about padding of unions.
S = Warnings about padding of structs.
A = Warnings about padding of structs and unions.
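
The interaction table can also be read as a small decision function. The
sketch below only restates the table. It is not how the options are
implemented in GCC, and the names Level, Warned and padding_warnings are
made up for this illustration.

```cpp
// Setting shared by -fzero-init-padding-bits= and -Wzero-init-padding-bits=.
enum class Level { standard, unions, all };

// Which kinds of padding may be warned about (X, U, S, A in the table).
enum class Warned { none, unions, structs, structs_and_unions };

// Hypothetical helper: 'w' is the warning level, 'f' the zeroing level.
constexpr Warned
padding_warnings(Level w, Level f)
{
  // Nothing to report if warnings are at their default, or if all
  // padding is zeroed anyway.
  if (w == Level::standard || f == Level::all)
    return Warned::none;
  if (w == Level::unions)
    return f == Level::standard ? Warned::unions : Warned::none;
  // w == Level::all: union padding is only of interest when it is not
  // already being zeroed.
  return f == Level::standard ? Warned::structs_and_unions
                              : Warned::structs;
}
```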

The level of optimisation and whether or not the
entire initializer is dropped to memory can both
affect whether warnings are produced when compiling
a given program. This is intentional, since tying
the warnings more closely to the relevant language
standard would require a very different approach
that would still be target-dependent, might impose
an unacceptable burden on programmers, and would
risk not satisfying the intended use-case (which
is closely tied to a specific optimisation).

Bootstrapped the compiler and tested on AArch64
and x86-64 using some new tests for
-Wzero-init-padding-bits and the existing tests
for -fzero-init-padding-bits
(check-gcc RUNTESTFLAGS="dg.exp=*-empty-init-*.c").

Base commit is a470433732e77ae29a717cf79049ceeea3cbe979

Changes in v2:
 - Add missing changelog entry.

Link to v1:
https://inbox.sourceware.org/gcc-patches/20250520104940.3546-1-chris.baz...@arm.com/

Christopher Bazley (1):
  Add warnings of potentially-uninitialized padding bits

 gcc/common.opt|  4 +
 gcc/doc/invoke.texi   | 85 ++-
 gcc/expr.cc   | 41 -
 gcc/expr.h|  7 +-
 gcc/gimplify.cc   | 29 ++-
 gcc/testsuite/gcc.dg/c23-empty-init-warn-1.c  | 68 +++
 gcc/testsuite/gcc.dg/c23-empty-init-warn-10.c |  8 ++
 gcc/testsuite/gcc.dg/c23-empty-init-warn-11.c |  8 ++
 gcc/testsuite/gcc.dg/c23-empty-init-warn-12.c |  8 ++
 gcc/testsuite/gcc.dg/c23-empty-init-warn-13.c |  8 ++
 gcc/testsuite/gcc.dg/c23-empty-init-warn-14.c |  8 ++
 gcc/testsuite/gcc.dg/c23-empty-init-warn-15.c |  8 ++
 gcc/testsuite/gcc.dg/c23-empty-init-warn-16.c |  8 ++
 gcc/testsuite/gcc.dg/c23-empty-init-warn-17.c | 51 +++
 gcc/testsuite/gcc.dg/c23-empty-init-warn-2.c  | 69 +++
 gcc/testsuite/gcc.dg/c23-empty-init-warn-3.c  |  7 ++
 gcc/testsuite/gcc.dg/c23-empty-init-warn-4.c  | 69 +++
 gcc/testsuite/gcc.dg/c23-empty-init-warn-5.c  |  8 ++
 gcc/testsuite/gcc.dg/c23-empty-init-warn-6.c  |  8 ++
 gcc/testsuite/gcc.dg/c23-empty-init-warn-7.c  |  8 ++
 gcc/testsuite/gcc.dg/c23-empty-init-warn-8.c  |  8 ++
 gcc/testsuite/gcc.dg/c23-empty-init-warn-9.c  | 69 +++
 .../gcc.dg/gnu11-empty-init-warn-1.c  | 52 
 .../gcc.dg/gnu11-empty-init-warn-10.c |  8 ++
 .../gcc.dg/gnu11-empty-init-warn-11.c |  8 ++
 .../gcc.dg/gnu11-empty-init-warn-12.c |  8 ++
 .../gcc.dg/gnu11-empty-init-warn-13.c |  8 ++
 .../gcc.dg/gnu11-empty-init-warn-14.c |  8 ++
 .../gcc.dg/gnu11-empty-init-warn-15.c |  8 ++
 .../gcc.dg/gnu11-empty-init-warn-16.c |  8 ++
 .../gcc.dg/gnu11-empty-init-warn-17.c | 51 +++
 .../gcc.dg/gnu11-empty-init-warn-2.c  | 59 +
 .../gcc.dg/gnu11-empty-init-warn-3.c  |  7 ++
 .../gcc.dg/gnu11-empty-init-warn-4.c  | 63 ++
 .../gcc.dg/gnu11-empty-init-warn-5.c  |  8 ++
 .../gcc.dg/gnu11-empty-init-warn-6.c  |  8 ++
 .../gcc.dg/gnu11-empty-init-warn-7.c  |  8 ++
 .../gcc.dg/gnu11-empty-init-warn-8.c  |  8 ++
 .../gcc.dg/gnu11-empty-init-warn-9.c  | 55 
 39 files changed, 937 insertions(+), 25 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/c23-empty-init-warn-1.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-empty-init-warn-10.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-empty-init-warn-11.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-empty-init-warn-12.c
 create mode 100644 gcc/testsuite/gcc.dg/c23-empty-init-warn-13.c
 create mod

Re: [PATCH v5] libstdc++: Implement C++23 P1659R3 starts_with and ends_with

2025-05-20 Thread Tomasz Kaminski
I think I do not have any more suggestions for cases to check, so the impl
LGTM.

On Tue, May 20, 2025 at 4:33 PM Patrick Palka  wrote:

> Changes in v5:
>   * dispatch to starts_with for the both-bidi/common range case
>
> Changes in v4:
>   * optimize the both-bidi/common ranges case, as suggested by
> Tomasz
>   * add tests for that code path
>
> Changes in v3:
>   * Use the forward_range code path for a (non-sized) bidirectional
> haystack, since it's slightly fewer increments/decrements
> overall.
>   * Fix wrong iter_difference_t cast in starts_with.
>
> Changes in v2:
>   Addressed Tomasz's review comments, namely:
>   * Added explicit iter_difference_t casts
>   * Made _S_impl member private
>   * Optimized sized bidirectional case of ends_with
>   * Rearranged control flow of starts_with::_S_impl
>
> Still left to do:
>   * Add tests for integer-class types
>   * Still working on a better commit description ;)
>
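
The both-bidi/common range dispatch listed under v5 can be sketched as
follows: when both ranges are bidirectional and common, ends_with is a
starts_with over the reversed ranges. This is a simplified illustration
using classic iterator pairs and hypothetical names (starts_with_sketch,
ends_with_sketch), without the predicates, projections, sentinels and
size shortcuts of the actual patch.

```cpp
#include <algorithm>
#include <iterator>

// Hypothetical helper: true if [first2, last2) is a prefix of
// [first1, last1).
template<typename It1, typename It2>
  bool
  starts_with_sketch(It1 first1, It1 last1, It2 first2, It2 last2)
  {
    // The needle is a prefix iff mismatch consumes all of it.
    auto result = std::mismatch(first1, last1, first2, last2);
    return result.second == last2;
  }

// Hypothetical helper: true if [first2, last2) is a suffix of
// [first1, last1).  Requires bidirectional iterators.
template<typename It1, typename It2>
  bool
  ends_with_sketch(It1 first1, It1 last1, It2 first2, It2 last2)
  {
    return starts_with_sketch(std::make_reverse_iterator(last1),
                              std::make_reverse_iterator(first1),
                              std::make_reverse_iterator(last2),
                              std::make_reverse_iterator(first2));
  }
```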
> -- >8 --
>
> libstdc++-v3/ChangeLog:
>
> * include/bits/ranges_algo.h (__starts_with_fn, starts_with):
> Define.
> (__ends_with_fn, ends_with): Define.
> * include/bits/version.def (ranges_starts_ends_with): Define.
> * include/bits/version.h: Regenerate.
> * include/std/algorithm: Provide __cpp_lib_ranges_starts_ends_with.
> * src/c++23/std.cc.in (ranges::starts_with): Export.
> (ranges::ends_with): Export.
> * testsuite/25_algorithms/ends_with/1.cc: New test.
> * testsuite/25_algorithms/starts_with/1.cc: New test.
> ---
>  libstdc++-v3/include/bits/ranges_algo.h   | 247 ++
>  libstdc++-v3/include/bits/version.def |   8 +
>  libstdc++-v3/include/bits/version.h   |  10 +
>  libstdc++-v3/include/std/algorithm|   1 +
>  libstdc++-v3/src/c++23/std.cc.in  |   4 +
>  .../testsuite/25_algorithms/ends_with/1.cc| 135 ++
>  .../testsuite/25_algorithms/starts_with/1.cc  | 128 +
>  7 files changed, 533 insertions(+)
>  create mode 100644 libstdc++-v3/testsuite/25_algorithms/ends_with/1.cc
>  create mode 100644 libstdc++-v3/testsuite/25_algorithms/starts_with/1.cc
>
> diff --git a/libstdc++-v3/include/bits/ranges_algo.h
> b/libstdc++-v3/include/bits/ranges_algo.h
> index f36e7dd59911..60f7bf841f3f 100644
> --- a/libstdc++-v3/include/bits/ranges_algo.h
> +++ b/libstdc++-v3/include/bits/ranges_algo.h
> @@ -438,6 +438,253 @@ namespace ranges
>
>inline constexpr __search_n_fn search_n{};
>
> +#if __glibcxx_ranges_starts_ends_with // C++ >= 23
> +  struct __starts_with_fn
> +  {
> +template _Sent1,
> +input_iterator _Iter2, sentinel_for<_Iter2> _Sent2,
> +typename _Pred = ranges::equal_to,
> +typename _Proj1 = identity, typename _Proj2 = identity>
> +  requires indirectly_comparable<_Iter1, _Iter2, _Pred, _Proj1,
> _Proj2>
> +  constexpr bool
> +  operator()(_Iter1 __first1, _Sent1 __last1,
> +_Iter2 __first2, _Sent2 __last2, _Pred __pred = {},
> +_Proj1 __proj1 = {}, _Proj2 __proj2 = {}) const
> +  {
> +   iter_difference_t<_Iter1> __n1 = -1;
> +   iter_difference_t<_Iter2> __n2 = -1;
> +   if constexpr (sized_sentinel_for<_Sent1, _Iter1>)
> + __n1 = __last1 - __first1;
> +   if constexpr (sized_sentinel_for<_Sent2, _Iter2>)
> + __n2 = __last2 - __first2;
> +   return _S_impl(std::move(__first1), __last1, __n1,
> +  std::move(__first2), __last2, __n2,
> +  std::move(__pred),
> +  std::move(__proj1), std::move(__proj2));
> +  }
> +
> +template +typename _Pred = ranges::equal_to,
> +typename _Proj1 = identity, typename _Proj2 = identity>
> +  requires indirectly_comparable,
> iterator_t<_Range2>,
> +_Pred, _Proj1, _Proj2>
> +  constexpr bool
> +  operator()(_Range1&& __r1, _Range2&& __r2, _Pred __pred = {},
> +_Proj1 __proj1 = {}, _Proj2 __proj2 = {}) const
> +  {
> +   range_difference_t<_Range1> __n1 = -1;
> +   range_difference_t<_Range1> __n2 = -1;
> +   if constexpr (sized_range<_Range1>)
> + __n1 = ranges::size(__r1);
> +   if constexpr (sized_range<_Range2>)
> + __n2 = ranges::size(__r2);
> +   return _S_impl(ranges::begin(__r1), ranges::end(__r1), __n1,
> +  ranges::begin(__r2), ranges::end(__r2), __n2,
> +  std::move(__pred),
> +  std::move(__proj1), std::move(__proj2));
> +  }
> +
> +  private:
> +template _Sent2,
> +typename _Pred,
> +typename _Proj1, typename _Proj2>
> +  static constexpr bool
> +  _S_impl(_Iter1 __first1, _Sent1 __last1, iter_difference_t<_Iter1>
> __n1,
> + _Iter2 __first2, _Sent2 __last2, iter_difference_t<_Iter2>
> __n2,
> + _Pred __pred, _Proj1 __proj1, _Proj2 __p

[committed] libstdc++: Fix incorrect links to archived SGI STL docs

2025-05-20 Thread Jonathan Wakely
In r8--g25949ee33201f2 I updated some URLs to point to copies of the
SGI STL docs in the Wayback Machine, because the original pages were no
longer hosted on sgi.com. However, I incorrectly assumed that if one
archived page was at https://web.archive.org/web/20171225062613/... then
all the other pages would be too. Apparently that's not how the Wayback
Machine works, and each page is archived on a different date. That meant
that some of our links were redirecting to archived copies of the
announcement that the SGI STL docs have gone away.

This fixes each URL to refer to a correctly archived copy of the
original docs.

libstdc++-v3/ChangeLog:

* doc/xml/faq.xml: Update URL for archived SGI STL docs.
* doc/xml/manual/containers.xml: Likewise.
* doc/xml/manual/extensions.xml: Likewise.
* doc/xml/manual/using.xml: Likewise.
* doc/xml/manual/utilities.xml: Likewise.
* doc/html/*: Regenerate.
---

Pushed to trunk. Backports to follow.

Maybe we should just host a copy of these docs on gcc.gnu.org and then
we can rely on a stable URL for them (boost.org recently stopped hosting
them, so we can't use that copy).

 libstdc++-v3/doc/html/faq.html  |  2 +-
 libstdc++-v3/doc/html/manual/containers.html|  2 +-
 libstdc++-v3/doc/html/manual/ext_numerics.html  |  2 +-
 libstdc++-v3/doc/html/manual/ext_sgi.html   |  4 ++--
 libstdc++-v3/doc/html/manual/using_concurrency.html | 10 +-
 libstdc++-v3/doc/html/manual/utilities.html |  4 ++--
 libstdc++-v3/doc/xml/faq.xml|  2 +-
 libstdc++-v3/doc/xml/manual/containers.xml  |  2 +-
 libstdc++-v3/doc/xml/manual/extensions.xml  |  6 +++---
 libstdc++-v3/doc/xml/manual/using.xml   | 10 +-
 libstdc++-v3/doc/xml/manual/utilities.xml   |  4 ++--
 11 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/libstdc++-v3/doc/html/faq.html b/libstdc++-v3/doc/html/faq.html
index 507555839f2f..9bd477f1395d 100644
--- a/libstdc++-v3/doc/html/faq.html
+++ b/libstdc++-v3/doc/html/faq.html
@@ -796,7 +796,7 @@
 Libstdc++-v3 incorporates a lot of code from
 https://web.archive.org/web/20171225062613/http://www.sgi.com/tech/stl/"; 
target="_top">the SGI STL
 (the final merge was from
-https://web.archive.org/web/20171225062613/http://www.sgi.com/tech/stl/whats_new.html";
 target="_top">release 3.3).
+https://web.archive.org/web/20171206110416/http://www.sgi.com/tech/stl/whats_new.html";
 target="_top">release 3.3).
 The code in libstdc++ contains many fixes and changes compared to the
 original SGI code.
 
diff --git a/libstdc++-v3/doc/html/manual/containers.html 
b/libstdc++-v3/doc/html/manual/containers.html
index 7035a949074d..dcd609a6000d 100644
--- a/libstdc++-v3/doc/html/manual/containers.html
+++ b/libstdc++-v3/doc/html/manual/containers.html
@@ -11,7 +11,7 @@
  Yes it is, at least using the old
  ABI, and that's okay.  This is a decision that we preserved
  when we imported SGI's STL implementation.  The following is
- quoted from https://web.archive.org/web/20171225062613/http://www.sgi.com/tech/stl/FAQ.html";
 target="_top">their FAQ:
+ quoted from https://web.archive.org/web/20161222192301/http://www.sgi.com/tech/stl/FAQ.html";
 target="_top">their FAQ:

The size() member function, for list and slist, takes time
proportional to the number of elements in the list.  This was a
diff --git a/libstdc++-v3/doc/html/manual/ext_numerics.html 
b/libstdc++-v3/doc/html/manual/ext_numerics.html
index 9b864e1dcf4a..c3a5623d1752 100644
--- a/libstdc++-v3/doc/html/manual/ext_numerics.html
+++ b/libstdc++-v3/doc/html/manual/ext_numerics.html
@@ -14,7 +14,7 @@
The operation functor must be associative.
 The iota function wins the award for 
Extension With the
Coolest Name (the name comes from Ken Iverson's APL language.)  As
-   described in the https://web.archive.org/web/20171225062613/http://www.sgi.com/tech/stl/iota.html";
 target="_top">SGI
+   described in the https://web.archive.org/web/20170201044840/http://www.sgi.com/tech/stl/iota.html";
 target="_top">SGI
documentation, it "assigns sequentially increasing values to a range.
That is, it assigns value to *first,
value + 1 to *(first + 
1) and so on."
diff --git a/libstdc++-v3/doc/html/manual/ext_sgi.html 
b/libstdc++-v3/doc/html/manual/ext_sgi.html
index ae2062954f4f..2310857804b3 100644
--- a/libstdc++-v3/doc/html/manual/ext_sgi.html
+++ b/libstdc++-v3/doc/html/manual/ext_sgi.html
@@ -28,12 +28,12 @@
   and sets.
Each of the associative containers map, multimap, set, and multiset
   have a counterpart which uses a
-  https://web.archive.org/web/20171225062613/http://www.sgi.com/tech/stl/HashFunction.html";
 target="_top">hashing
+  https://web.archive.org/web/20171230172024/http://www.sgi.com/tech/stl/HashFunction.html";
 target="_top">hashing
   func

Re: [PATCH v1 2/6] libstdc++: Add tests for layout_left.

2025-05-20 Thread Luc Grosheintz




On 5/19/25 2:56 PM, Tomasz Kaminski wrote:

On Sun, May 18, 2025 at 10:14 PM Luc Grosheintz 
wrote:


Implements a suite of tests for the currently implemented parts of
layout_left. The individual tests are templated over the layout type, to
allow reuse as more layouts are added.

libstdc++-v3/ChangeLog:

 * testsuite/23_containers/mdspan/layouts/class_mandate_neg.cc: New
test.
 * testsuite/23_containers/mdspan/layouts/ctors.cc: New test.
 * testsuite/23_containers/mdspan/layouts/mapping.cc: New test.

Signed-off-by: Luc Grosheintz 
---
  .../mdspan/layouts/class_mandate_neg.cc   |  22 +
  .../23_containers/mdspan/layouts/ctors.cc | 258 ++
  .../23_containers/mdspan/layouts/mapping.cc   | 445 ++
  3 files changed, 725 insertions(+)
  create mode 100644
libstdc++-v3/testsuite/23_containers/mdspan/layouts/class_mandate_neg.cc
  create mode 100644
libstdc++-v3/testsuite/23_containers/mdspan/layouts/ctors.cc
  create mode 100644
libstdc++-v3/testsuite/23_containers/mdspan/layouts/mapping.cc

diff --git
a/libstdc++-v3/testsuite/23_containers/mdspan/layouts/class_mandate_neg.cc
b/libstdc++-v3/testsuite/23_containers/mdspan/layouts/class_mandate_neg.cc
new file mode 100644
index 000..f122541b3e8
--- /dev/null
+++
b/libstdc++-v3/testsuite/23_containers/mdspan/layouts/class_mandate_neg.cc
@@ -0,0 +1,22 @@
+// { dg-do compile { target c++23 } }
+#include
+
+#include 
+
+constexpr size_t dyn = std::dynamic_extent;
+static constexpr size_t n = (size_t(1) << 7) - 1;


I would use numeric_limits_max here.


+
+template
+  struct A
+  {
+typename Layout::mapping> m0;
+typename Layout::mapping> m1;
+typename Layout::mapping> m2;
+
+using extents_type = std::extents;
+typename Layout::mapping m3; // { dg-error "required
from" }
+  };
+
+A a_left; // { dg-error "required
from" }
+
+// { dg-prune-output "must be representable as index_type" }
diff --git a/libstdc++-v3/testsuite/23_containers/mdspan/layouts/ctors.cc
b/libstdc++-v3/testsuite/23_containers/mdspan/layouts/ctors.cc
new file mode 100644
index 000..4592a05dec8
--- /dev/null
+++ b/libstdc++-v3/testsuite/23_containers/mdspan/layouts/ctors.cc
@@ -0,0 +1,258 @@
+// { dg-do run { target c++23 } }
+#include 
+
+#include 
+
+constexpr size_t dyn = std::dynamic_extent;
+
+template
+  constexpr void
+  verify_from_exts(OExtents exts)
+  {
+auto m = Mapping(exts);
+VERIFY(m.extents() == exts);
+  }
+
+
+template
+  constexpr void
+  verify_from_mapping(OMapping other)
+  {
+auto m = SMapping(other);
+VERIFY(m.extents() == other.extents());
+  }
+
+template
+  requires (std::__mdspan::__is_extents)
+  constexpr void
+  verify(OExtents oexts)
+  {


In general, when possible we prefer not to use internal details in tests.
I would use if constexpr with requires { typename Other::layout_type; }, i.e.
template
constexpr void
verify(Source const& src)
{
   if constexpr (requires { typename Other::layout_type; })
      verify_from_mapping(src);
   else
      verify_from_extents(src);
}
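
A self-contained version of this kind of dispatch, for illustration only.
It deliberately swaps the C++20 requires-expression for the C++17
std::void_t detection idiom so that it builds with older -std settings.
The types FakeExtents and FakeMapping, the trait has_layout_type and the
function classify are all hypothetical stand-ins, not names from the
testsuite.

```cpp
#include <type_traits>

// Detects whether T has a nested layout_type, i.e. looks like a mapping.
template<typename T, typename = void>
  struct has_layout_type : std::false_type { };

template<typename T>
  struct has_layout_type<T, std::void_t<typename T::layout_type>>
  : std::true_type { };

// Hypothetical stand-ins for an extents type and a mapping type.
struct FakeExtents { };
struct FakeMapping { using layout_type = int; };

enum class Kind { extents, mapping };

// Chooses a branch at compile time without touching library internals.
template<typename T>
  constexpr Kind
  classify(const T&)
  {
    if constexpr (has_layout_type<T>::value)
      return Kind::mapping;
    else
      return Kind::extents;
  }
```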

+auto m = Mapping(oexts);

+VERIFY(m.extents() == oexts);
+  }
+
+template
+  requires (std::__mdspan::__standardized_mapping)
+  constexpr void
+  verify(OMapping other)
+  {
+constexpr auto rank = Mapping::extents_type::rank();
+auto m = Mapping(other);
+VERIFY(m.extents() == other.extents());
+if constexpr (rank > 0)
+  for(size_t i = 0; i < rank; ++i)
+   VERIFY(std::cmp_equal(m.stride(i), other.stride(i)));


Why is this not checked in verify_from_mapping?



Because both verify_from_extents and verify_from_mapping are
leftovers from an earlier variation. They'll be removed.


+  }
+
+
+template
+  constexpr void
+  verify_nothrow_convertible(From from)
+  {
+static_assert(std::is_nothrow_constructible_v);


I would call  `verify_convertible` here, instead of these two lines.


+static_assert(std::is_convertible_v);
+verify(from);
+  }
+
+template
+  constexpr void
+  verify_convertible(From from)
+  {
+static_assert(std::is_convertible_v);
+verify(from);
+  }
+
+template
+  constexpr void
+  verify_constructible(From from)
+  {
+static_assert(!std::is_convertible_v);
+static_assert(!std::is_nothrow_constructible_v);


Implementations are allowed to add noexcept to functions, so I would
not perform these checks.
See: https://eel.is/c++draft/res.on.exception.handling#5
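
To make the concern concrete, here is a small sketch with two hypothetical
types: one declared exactly as a specification without noexcept would read,
and one as a conforming implementation might ship it with noexcept added.
A test that asserts the absence of noexcept passes for the first and fails
for the second, although both are valid. The names AsSpecified,
Strengthened, portable_check and fragile_check are invented for this
example.

```cpp
#include <type_traits>

// Constructor as written in a specification that does not say noexcept.
struct AsSpecified
{
  AsSpecified(int);
};

// Same interface, with noexcept added by the implementation.
struct Strengthened
{
  Strengthened(int) noexcept;
};

// Only checks what the specification guarantees.
template<typename To, typename From>
  constexpr bool portable_check = std::is_constructible_v<To, From>;

// Additionally insists that the constructor may throw.
template<typename To, typename From>
  constexpr bool fragile_check
    = std::is_constructible_v<To, From>
      && !std::is_nothrow_constructible_v<To, From>;
```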


+static_assert(std::is_constructible_v);
+verify(from);
+  }
+
+template
+  constexpr void
+  verify_nothrow_constructible(From from)
+  {
+static_assert(!std::is_convertible_v);
+static_assert(std::is_nothrow_constructible_v);


With the change above, I would call verify_constructible here.


+verify(from);
+  }
+
+template
+  constexpr void
+  assert_not_constructible()
+  {
+static_assert(!std::is_constructible_v);
+  }
+
+// ctor: mapping(const extents&)
+namespace from_extents
+{
+  template
+constexpr vo

Re: [PATCH v3] libstdc++: Implement C++23 P1659R3 starts_with and ends_with

2025-05-20 Thread Patrick Palka
On Tue, 20 May 2025, Tomasz Kaminski wrote:

> 
> 
> On Mon, May 19, 2025 at 6:06 PM Patrick Palka  wrote:
>   On Mon, 19 May 2025, Patrick Palka wrote:
> 
>   > Changes in v3:
>   >   * Use the forward_range code path for a (non-sized) bidirectional
>   >     haystack, since it's slightly fewer increments/decrements
>   >     overall.
>   >   * Fix wrong iter_difference_t cast in starts_with.
>   >
>   > Changes in v2:
>   >   Addressed Tomasz's review comments, namely:
>   >   * Added explicit iter_difference_t casts
>   >   * Made _S_impl member private
>   >   * Optimized sized bidirectional case of ends_with
>   >   * Rearranged control flow of starts_with::_S_impl
>   >
>   > Still left to do:
>   >   * Add tests for integer-class types
>   >   * Still working on a better commit description ;)
>   >
>   > -- >8 --
>   >
>   > libstdc++-v3/ChangeLog:
>   >
>   >       * include/bits/ranges_algo.h (__starts_with_fn, starts_with):
>   >       Define.
>   >       (__ends_with_fn, ends_with): Define.
>   >       * include/bits/version.def (ranges_starts_ends_with): Define.
>   >       * include/bits/version.h: Regenerate.
>   >       * include/std/algorithm: Provide 
> __cpp_lib_ranges_starts_ends_with.
>   >       * src/c++23/std.cc.in (ranges::starts_with): Export.
>   >       (ranges::ends_with): Export.
>   >       * testsuite/25_algorithms/ends_with/1.cc: New test.
>   >       * testsuite/25_algorithms/starts_with/1.cc: New test.
>   > ---
>   >  libstdc++-v3/include/bits/ranges_algo.h       | 236 
> ++
>   >  libstdc++-v3/include/bits/version.def         |   8 +
>   >  libstdc++-v3/include/bits/version.h           |  10 +
>   >  libstdc++-v3/include/std/algorithm            |   1 +
>   >  libstdc++-v3/src/c++23/std.cc.in              |   4 +
>   >  .../testsuite/25_algorithms/ends_with/1.cc    | 129 ++
>   >  .../testsuite/25_algorithms/starts_with/1.cc  | 128 ++
>   >  7 files changed, 516 insertions(+)
>   >  create mode 100644 
> libstdc++-v3/testsuite/25_algorithms/ends_with/1.cc
>   >  create mode 100644 
> libstdc++-v3/testsuite/25_algorithms/starts_with/1.cc
>   >
>   > diff --git a/libstdc++-v3/include/bits/ranges_algo.h 
> b/libstdc++-v3/include/bits/ranges_algo.h
>   > index f36e7dd59911..54646ae62f7e 100644
>   > --- a/libstdc++-v3/include/bits/ranges_algo.h
>   > +++ b/libstdc++-v3/include/bits/ranges_algo.h
>   > @@ -438,6 +438,242 @@ namespace ranges
>   > 
>   >    inline constexpr __search_n_fn search_n{};
>   > 
>   > +#if __glibcxx_ranges_starts_ends_with // C++ >= 23
>   > +  struct __starts_with_fn
>   > +  {
>   > +    template _Sent1,
>   > +          input_iterator _Iter2, sentinel_for<_Iter2> _Sent2,
>   > +          typename _Pred = ranges::equal_to,
>   > +          typename _Proj1 = identity, typename _Proj2 = identity>
>   > +      requires indirectly_comparable<_Iter1, _Iter2, _Pred, _Proj1, 
> _Proj2>
>   > +      constexpr bool
>   > +      operator()(_Iter1 __first1, _Sent1 __last1,
>   > +              _Iter2 __first2, _Sent2 __last2, _Pred __pred = {},
>   > +              _Proj1 __proj1 = {}, _Proj2 __proj2 = {}) const
>   > +      {
>   > +     iter_difference_t<_Iter1> __n1 = -1;
>   > +     iter_difference_t<_Iter2> __n2 = -1;
>   > +     if constexpr (sized_sentinel_for<_Sent1, _Iter1>)
>   > +       __n1 = __last1 - __first1;
>   > +     if constexpr (sized_sentinel_for<_Sent2, _Iter2>)
>   > +       __n2 = __last2 - __first2;
>   > +     return _S_impl(std::move(__first1), __last1, __n1,
>   > +                    std::move(__first2), __last2, __n2,
>   > +                    std::move(__pred),
>   > +                    std::move(__proj1), std::move(__proj2));
>   > +      }
>   > +
>   > +    template   > +          typename _Pred = ranges::equal_to,
>   > +          typename _Proj1 = identity, typename _Proj2 = identity>
>   > +      requires indirectly_comparable, 
> iterator_t<_Range2>,
>   > +                                  _Pred, _Proj1, _Proj2>
>   > +      constexpr bool
>   > +      operator()(_Range1&& __r1, _Range2&& __r2, _Pred __pred = {},
>   > +              _Proj1 __proj1 = {}, _Proj2 __proj2 = {}) const
>   > +      {
>   > +     range_difference_t<_Range1> __n1 = -1;
>   > +     range_difference_t<_Range1> __n2 = -1;
>   > +     if constexpr (sized_range<_Range1>)
>   > +       __n1 = ranges::size(__r1);
>   > +     if constexpr (sized_range<_Range2>)
>   > +       __n2 = ranges::size(__r2);
>   > +     return _S_impl(ranges::begin(__r1), ranges::end(__r1), __n1,
>   > +                    ranges::begin(__r2), ranges::end(__r2), __n2,
>  

Re: [PATCH v3] libstdc++: Implement C++23 P1659R3 starts_with and ends_with

2025-05-20 Thread Tomasz Kaminski
On Tue, May 20, 2025 at 3:07 PM Patrick Palka  wrote:

> On Tue, 20 May 2025, Tomasz Kaminski wrote:
>
> >
> >
> > On Mon, May 19, 2025 at 6:06 PM Patrick Palka  wrote:
> >   On Mon, 19 May 2025, Patrick Palka wrote:
> >
> >   > Changes in v3:
> >   >   * Use the forward_range code path for a (non-sized)
> bidirectional
> >   > haystack, since it's slightly fewer increments/decrements
> >   > overall.
> >   >   * Fix wrong iter_difference_t cast in starts_with.
> >   >
> >   > Changes in v2:
> >   >   Addressed Tomasz's review comments, namely:
> >   >   * Added explicit iter_difference_t casts
> >   >   * Made _S_impl member private
> >   >   * Optimized sized bidirectional case of ends_with
> >   >   * Rearranged control flow of starts_with::_S_impl
> >   >
> >   > Still left to do:
> >   >   * Add tests for integer-class types
> >   >   * Still working on a better commit description ;)
> >   >
> >   > -- >8 --
> >   >
> >   > libstdc++-v3/ChangeLog:
> >   >
> >   >   * include/bits/ranges_algo.h (__starts_with_fn,
> starts_with):
> >   >   Define.
> >   >   (__ends_with_fn, ends_with): Define.
> >   >   * include/bits/version.def (ranges_starts_ends_with):
> Define.
> >   >   * include/bits/version.h: Regenerate.
> >   >   * include/std/algorithm: Provide
> __cpp_lib_ranges_starts_ends_with.
> >   >   * src/c++23/std.cc.in (ranges::starts_with): Export.
> >   >   (ranges::ends_with): Export.
> >   >   * testsuite/25_algorithms/ends_with/1.cc: New test.
> >   >   * testsuite/25_algorithms/starts_with/1.cc: New test.
> >   > ---
> >   >  libstdc++-v3/include/bits/ranges_algo.h   | 236
> ++
> >   >  libstdc++-v3/include/bits/version.def |   8 +
> >   >  libstdc++-v3/include/bits/version.h   |  10 +
> >   >  libstdc++-v3/include/std/algorithm|   1 +
> >   >  libstdc++-v3/src/c++23/std.cc.in  |   4 +
> >   >  .../testsuite/25_algorithms/ends_with/1.cc| 129 ++
> >   >  .../testsuite/25_algorithms/starts_with/1.cc  | 128 ++
> >   >  7 files changed, 516 insertions(+)
> >   >  create mode 100644
> libstdc++-v3/testsuite/25_algorithms/ends_with/1.cc
> >   >  create mode 100644
> libstdc++-v3/testsuite/25_algorithms/starts_with/1.cc
> >   >
> >   > diff --git a/libstdc++-v3/include/bits/ranges_algo.h
> b/libstdc++-v3/include/bits/ranges_algo.h
> >   > index f36e7dd59911..54646ae62f7e 100644
> >   > --- a/libstdc++-v3/include/bits/ranges_algo.h
> >   > +++ b/libstdc++-v3/include/bits/ranges_algo.h
> >   > @@ -438,6 +438,242 @@ namespace ranges
> >   >
> >   >inline constexpr __search_n_fn search_n{};
> >   >
> >   > +#if __glibcxx_ranges_starts_ends_with // C++ >= 23
> >   > +  struct __starts_with_fn
> >   > +  {
> >   > +template<input_iterator _Iter1, sentinel_for<_Iter1> _Sent1,
> >   > +  input_iterator _Iter2, sentinel_for<_Iter2> _Sent2,
> >   > +  typename _Pred = ranges::equal_to,
> >   > +  typename _Proj1 = identity, typename _Proj2 =
> identity>
> >   > +  requires indirectly_comparable<_Iter1, _Iter2, _Pred,
> _Proj1, _Proj2>
> >   > +  constexpr bool
> >   > +  operator()(_Iter1 __first1, _Sent1 __last1,
> >   > +  _Iter2 __first2, _Sent2 __last2, _Pred __pred =
> {},
> >   > +  _Proj1 __proj1 = {}, _Proj2 __proj2 = {}) const
> >   > +  {
> >   > + iter_difference_t<_Iter1> __n1 = -1;
> >   > + iter_difference_t<_Iter2> __n2 = -1;
> >   > + if constexpr (sized_sentinel_for<_Sent1, _Iter1>)
> >   > +   __n1 = __last1 - __first1;
> >   > + if constexpr (sized_sentinel_for<_Sent2, _Iter2>)
> >   > +   __n2 = __last2 - __first2;
> >   > + return _S_impl(std::move(__first1), __last1, __n1,
> >   > +std::move(__first2), __last2, __n2,
> >   > +std::move(__pred),
> >   > +std::move(__proj1), std::move(__proj2));
> >   > +  }
> >   > +
> >   > +template<input_range _Range1, input_range _Range2,
> >   > +  typename _Pred = ranges::equal_to,
> >   > +  typename _Proj1 = identity, typename _Proj2 =
> identity>
> >   > +  requires indirectly_comparable<iterator_t<_Range1>,
> iterator_t<_Range2>,
> >   > +  _Pred, _Proj1, _Proj2>
> >   > +  constexpr bool
> >   > +  operator()(_Range1&& __r1, _Range2&& __r2, _Pred __pred =
> {},
> >   > +  _Proj1 __proj1 = {}, _Proj2 __proj2 = {}) const
> >   > +  {
> >   > + range_difference_t<_Range1> __n1 = -1;
> >   > + range_difference_t<_Range1> __n2 = -1;
> >   > + if constexpr (sized_range<_Range1>)
> >   > +   __n1 = ranges::size(__r1);
> >   

Re: [RFC PATCH 0/3] _BitInt(N) support for LoongArch

2025-05-20 Thread Jakub Jelinek
On Tue, May 20, 2025 at 08:58:18PM +0800, Yang Yujie wrote:
> Thanks for the quick review.
> 
> Aside from code formatting issues, can I conclude that you suggest
> we should rebase this onto your new big-endian support patch?  Or
> do you think it's necessary to add big-endian && extended support
> together?

I'd suggest working on it incrementally rather than with a full patch set.
In one or multiple patches handle the promote_mode stuff, the atomic
extension and expr.cc changes with the feedback incorporated.
For gimple-lower-bitint.cc I'd really like to see what testing you've done
to decide on a case by case basis.

> > Are you sure all those changes were really necessary (rather than doing them
> > just in case)?  I believe most of gimple-lower-bitint.cc already should be
> > sign or zero extending the partial limbs when storing stuff, there can be
> > some corner cases (I think one of the shift directions at least).
> 
> The modifications to gimple-lower-bitint.cc are based on testing, 

The tests weren't included :(.

> since I found that simply setting the "info.extended" flag won't work unless
> I make changes to promote_function_mode, which leads to a series of
> changes to correct all the regtests.
> 
> Specifically, the tests told me to extend (thought "truncate"
> was kind of an equivalent word) the output of left shift, plus/minus,

Truncation is the exact opposite of extension.
I can understand the need for handling of left shifts, for all the rest
I'd really like to see testcases.

> More common tests would surely be helpful, especially for new ports.
> 
> However, the specific test you mentioned would not be compatible with
> the proposed LoongArch ABI, where the top 64-bit limb within the top
> 128-bit ABI-limb may be undefined. e.g. _BitInt(192).

Ugh, that feels very creative in the psABI :(.
In any case, even that could be handled in the macro, although it would
need to have defined(__loongarch__) specific helpers that would perhaps
for
sizeof (x) * __CHAR_BIT__ >= 128
&& (__builtin_popcountg (~(typeof (x)) 0) & 64) == 0
choose unsigned _BitInt smaller by 64 bits from the one mentioned
in the macro (for signed similarly using __builtin_clrsbg).

> Perhaps it's better to leave it to target-specific tests?

Please don't, we don't want to repeat that for all the info->extended
targets (which looks to be arm 32-bit, s390x and loongarch right now).
We want to test it on all, like the whole bitint testsuite helps to find
issues on all the arches, most of it isn't target specific.

Jakub



[PATCH v2 4/6] libstdc++: Add tests for layout_right.

2025-05-20 Thread Luc Grosheintz
Adds tests for layout_right and for the parts of layout_left that depend
on layout_right.
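
For readers less familiar with the two layouts, the following sketch shows how a multi-index maps to a linear offset in a row-major (layout_right style) and a column-major (layout_left style) mapping. It uses plain std::array instead of std::extents so it builds without mdspan support; linear_index_right and linear_index_left are hypothetical names and not the helpers used in the patch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper: row-major offset, rightmost index varies fastest.
// Evaluated with Horner's scheme from the leftmost dimension.
template<std::size_t Rank>
constexpr std::size_t
linear_index_right(const std::array<std::size_t, Rank>& exts,
                   const std::array<std::size_t, Rank>& idx)
{
  std::size_t res = 0;
  for (std::size_t r = 0; r < Rank; ++r)
    {
      assert(idx[r] < exts[r]);
      res = res * exts[r] + idx[r];
    }
  return res;
}

// Hypothetical helper: column-major offset, leftmost index varies fastest.
// Same scheme, but walking the dimensions from right to left.
template<std::size_t Rank>
constexpr std::size_t
linear_index_left(const std::array<std::size_t, Rank>& exts,
                  const std::array<std::size_t, Rank>& idx)
{
  std::size_t res = 0;
  for (std::size_t r = Rank; r-- > 0;)
    {
      assert(idx[r] < exts[r]);
      res = res * exts[r] + idx[r];
    }
  return res;
}
```

With extents (3, 5, 7) the row-major offset of index (2, 1, 4) is 2*35 + 1*7 + 4, which agrees with the strides 35, 7 and 1 that the new test_stride_3d specialization checks.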

libstdc++-v3/ChangeLog:

* testsuite/23_containers/mdspan/layouts/class_mandate_neg.cc: Add
tests for layout_right.
* testsuite/23_containers/mdspan/layouts/ctors.cc: Add tests for
layout_right and the interaction with layout_left.
* testsuite/23_containers/mdspan/layouts/mapping.cc: ditto.

Signed-off-by: Luc Grosheintz 
---
 .../mdspan/layouts/class_mandate_neg.cc   |  1 +
 .../23_containers/mdspan/layouts/ctors.cc | 64 +++
 .../23_containers/mdspan/layouts/mapping.cc   | 78 ---
 3 files changed, 133 insertions(+), 10 deletions(-)

diff --git 
a/libstdc++-v3/testsuite/23_containers/mdspan/layouts/class_mandate_neg.cc 
b/libstdc++-v3/testsuite/23_containers/mdspan/layouts/class_mandate_neg.cc
index b276fbd333e..a41bad988d2 100644
--- a/libstdc++-v3/testsuite/23_containers/mdspan/layouts/class_mandate_neg.cc
+++ b/libstdc++-v3/testsuite/23_containers/mdspan/layouts/class_mandate_neg.cc
@@ -18,5 +18,6 @@ template
   };
 
 A a_left; // { dg-error "required from" }
+A a_right;   // { dg-error "required from" }
 
 // { dg-prune-output "must be representable as index_type" }
diff --git a/libstdc++-v3/testsuite/23_containers/mdspan/layouts/ctors.cc 
b/libstdc++-v3/testsuite/23_containers/mdspan/layouts/ctors.cc
index c96f314818a..4a7d2bffeef 100644
--- a/libstdc++-v3/testsuite/23_containers/mdspan/layouts/ctors.cc
+++ b/libstdc++-v3/testsuite/23_containers/mdspan/layouts/ctors.cc
@@ -222,6 +222,66 @@ namespace from_same_layout
 }
 }
 
+// ctor: mapping(layout_{right,left}::mapping)
+namespace from_left_or_right
+{
+  template
+constexpr void
+verify_ctor(OExtents oexts)
+{
+  using SMapping = typename SLayout::mapping;
+  using OMapping = typename OLayout::mapping;
+
+  constexpr bool expected = std::is_convertible_v;
+  if constexpr (expected)
+   verify_nothrow_convertible(OMapping(oexts));
+  else
+   verify_nothrow_constructible(OMapping(oexts));
+}
+
+  template
+constexpr bool
+test_ctor()
+{
+  assert_not_constructible<
+   typename SLayout::mapping>,
+   typename OLayout::mapping>>();
+
+  verify_ctor>(
+   std::extents{});
+
+  verify_ctor>(
+   std::extents{});
+
+  assert_not_constructible<
+   typename SLayout::mapping>,
+   typename OLayout::mapping>>();
+
+  verify_ctor>(
+   std::extents{});
+
+  verify_ctor>(
+   std::extents{});
+
+  verify_ctor>(
+   std::extents{});
+
+  assert_not_constructible<
+   typename SLayout::mapping>,
+   typename OLayout::mapping>>();
+  return true;
+}
+
+  template
+constexpr void
+test_all()
+{
+  test_ctor();
+  static_assert(test_ctor());
+}
+}
+
 template
   constexpr void
   test_all()
@@ -234,5 +294,9 @@ int
 main()
 {
   test_all();
+  test_all();
+
+  from_left_or_right::test_all();
+  from_left_or_right::test_all();
   return 0;
 }
diff --git a/libstdc++-v3/testsuite/23_containers/mdspan/layouts/mapping.cc 
b/libstdc++-v3/testsuite/23_containers/mdspan/layouts/mapping.cc
index 60630dc37ca..c6bf04a5446 100644
--- a/libstdc++-v3/testsuite/23_containers/mdspan/layouts/mapping.cc
+++ b/libstdc++-v3/testsuite/23_containers/mdspan/layouts/mapping.cc
@@ -294,6 +294,15 @@ template<>
 VERIFY(m.stride(1) == 3);
   }
 
+template<>
+  constexpr void
+  test_stride_2d()
+  {
+std::layout_right::mapping> m;
+VERIFY(m.stride(0) == 5);
+VERIFY(m.stride(1) == 1);
+  }
+
 template
   constexpr void
   test_stride_3d();
@@ -308,6 +317,16 @@ template<>
 VERIFY(m.stride(2) == 3*5);
   }
 
+template<>
+  constexpr void
+  test_stride_3d()
+  {
+std::layout_right::mapping m(std::dextents(3, 5, 7));
+VERIFY(m.stride(0) == 35);
+VERIFY(m.stride(1) == 7);
+VERIFY(m.stride(2) == 1);
+  }
+
 template
   constexpr bool
   test_stride_all()
@@ -382,24 +401,59 @@ template
 { m2 != m1 } -> std::same_as;
   };
 
-template
-  constexpr bool
+template
+  constexpr void
   test_has_op_eq()
   {
+static_assert(has_op_eq<
+   typename SLayout::mapping>,
+   typename OLayout::mapping>> == Expected);
+
+static_assert(!has_op_eq<
+   typename SLayout::mapping>,
+   typename OLayout::mapping>>);
+
+static_assert(has_op_eq<
+   typename SLayout::mapping>,
+   typename OLayout::mapping>> == Expected);
+
+static_assert(has_op_eq<
+   typename SLayout::mapping>,
+   typename OLayout::mapping>> == Expected);
+
 static_assert(!has_op_eq<
-   typename Layout::mapping>,
-   typename Layout::mapping>>);
+   typename SLayout::mapping>,
+   typename OLayout::mapping>>);
 
 static_assert(has_op_eq<
-   typename Layout::mapping>,
-   typename Layout::mapping>>);
+   typename SLayout::mapping>,
+   typename OLayout::map

[PATCH v2 2/6] libstdc++: Add tests for layout_left.

2025-05-20 Thread Luc Grosheintz
Implements a suite of tests for the currently implemented parts of
layout_left. The individual tests are templated over the layout type, to
allow reuse as more layouts are added.

libstdc++-v3/ChangeLog:

* testsuite/23_containers/mdspan/layouts/class_mandate_neg.cc: New test.
* testsuite/23_containers/mdspan/layouts/ctors.cc: New test.
* testsuite/23_containers/mdspan/layouts/mapping.cc: New test.

Signed-off-by: Luc Grosheintz 
---
 .../mdspan/layouts/class_mandate_neg.cc   |  22 +
 .../23_containers/mdspan/layouts/ctors.cc | 238 ++
 .../23_containers/mdspan/layouts/mapping.cc   | 438 ++
 3 files changed, 698 insertions(+)
 create mode 100644 
libstdc++-v3/testsuite/23_containers/mdspan/layouts/class_mandate_neg.cc
 create mode 100644 libstdc++-v3/testsuite/23_containers/mdspan/layouts/ctors.cc
 create mode 100644 
libstdc++-v3/testsuite/23_containers/mdspan/layouts/mapping.cc

diff --git 
a/libstdc++-v3/testsuite/23_containers/mdspan/layouts/class_mandate_neg.cc 
b/libstdc++-v3/testsuite/23_containers/mdspan/layouts/class_mandate_neg.cc
new file mode 100644
index 000..b276fbd333e
--- /dev/null
+++ b/libstdc++-v3/testsuite/23_containers/mdspan/layouts/class_mandate_neg.cc
@@ -0,0 +1,22 @@
+// { dg-do compile { target c++23 } }
+#include
+
+#include 
+
+constexpr size_t dyn = std::dynamic_extent;
+static constexpr size_t n = std::numeric_limits::max() / 2;
+
+template
+  struct A
+  {
+typename Layout::mapping> m0;
+typename Layout::mapping> m1;
+typename Layout::mapping> m2;
+
+using extents_type = std::extents;
+typename Layout::mapping m3; // { dg-error "required from" }
+  };
+
+A a_left; // { dg-error "required from" }
+
+// { dg-prune-output "must be representable as index_type" }
diff --git a/libstdc++-v3/testsuite/23_containers/mdspan/layouts/ctors.cc 
b/libstdc++-v3/testsuite/23_containers/mdspan/layouts/ctors.cc
new file mode 100644
index 000..c96f314818a
--- /dev/null
+++ b/libstdc++-v3/testsuite/23_containers/mdspan/layouts/ctors.cc
@@ -0,0 +1,238 @@
+// { dg-do run { target c++23 } }
+#include <mdspan>
+
+#include <testsuite_hooks.h>
+
+constexpr size_t dyn = std::dynamic_extent;
+
+template
+  constexpr void
+  verify(std::extents oexts)
+  {
+auto m = Mapping(oexts);
+VERIFY(m.extents() == oexts);
+  }
+
+template
+  requires (requires { typename OMapping::layout_type; })
+  constexpr void
+  verify(OMapping other)
+  {
+constexpr auto rank = Mapping::extents_type::rank();
+auto m = Mapping(other);
+VERIFY(m.extents() == other.extents());
+if constexpr (rank > 0)
+  for(size_t i = 0; i < rank; ++i)
+   VERIFY(std::cmp_equal(m.stride(i), other.stride(i)));
+  }
+
+
+template
+  constexpr void
+  verify_convertible(From from)
+  {
+static_assert(std::is_convertible_v);
+verify(from);
+  }
+
+template
+  constexpr void
+  verify_nothrow_convertible(From from)
+  {
+static_assert(std::is_nothrow_constructible_v);
+verify_convertible(from);
+  }
+
+
+template
+  constexpr void
+  verify_constructible(From from)
+  {
+static_assert(!std::is_convertible_v);
+static_assert(std::is_constructible_v);
+verify(from);
+  }
+
+template
+  constexpr void
+  verify_nothrow_constructible(From from)
+  {
+static_assert(std::is_nothrow_constructible_v);
+verify_constructible(from);
+  }
+
+template
+  constexpr void
+  assert_not_constructible()
+  {
+static_assert(!std::is_constructible_v);
+  }
+
+// ctor: mapping(const extents&)
+namespace from_extents
+{
+  template
+constexpr void
+verify_nothrow_convertible(OExtents oexts)
+{
+  using Mapping = typename Layout::mapping;
+  ::verify_nothrow_convertible(oexts);
+}
+
+  template
+constexpr void
+verify_nothrow_constructible(OExtents oexts)
+{
+  using Mapping = typename Layout::mapping;
+  ::verify_nothrow_constructible(oexts);
+}
+
+  template
+constexpr void
+assert_not_constructible()
+{
+  using Mapping = typename Layout::mapping;
+  ::assert_not_constructible();
+}
+
+  template
+constexpr bool
+test_ctor()
+{
+  verify_nothrow_convertible>(
+   std::extents{});
+
+  verify_nothrow_convertible>(
+   std::extents{});
+
+  verify_nothrow_convertible>(
+   std::extents{2});
+
+  verify_nothrow_constructible>(
+   std::extents{});
+
+  verify_nothrow_constructible>(
+   std::extents{});
+
+  verify_nothrow_constructible>(
+   std::extents{});
+
+  assert_not_constructible,
+  std::extents>();
+  assert_not_constructible,
+  std::extents>();
+  assert_not_constructible,
+  std::extents>();
+  return true;
+}
+
+  template
+constexpr void
+assert_deducible(Extents exts)
+{
+  typename Layout::mapping m(exts);
+  static_assert(std::same_as>);
+}
+
+

[PATCH v2 1/6] libstdc++: Implement layout_left from mdspan.

2025-05-20 Thread Luc Grosheintz
Implements the parts of layout_left that don't depend on any of the
other layouts.
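
As a rough illustration of the stride computation behind the mapping, the sketch below mirrors the idea of the __exts_prod, __fwd_prod and __rev_prod helpers in the diff, but on a plain std::array of extents. The names exts_prod, fwd_prod and rev_prod are hypothetical stand-ins, and the static/dynamic extent split of the real code is deliberately left out.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Product of exts[begin, end); an empty range yields 1.
template<std::size_t Rank>
constexpr std::size_t
exts_prod(const std::array<std::size_t, Rank>& exts,
          std::size_t begin, std::size_t end)
{
  assert(begin <= end && end <= Rank);
  std::size_t ret = 1;
  for (std::size_t i = begin; i < end; ++i)
    ret *= exts[i];
  return ret;
}

// Product of the extents to the left of dimension r:
// the stride of dimension r in a column-major (layout_left) mapping.
template<std::size_t Rank>
constexpr std::size_t
fwd_prod(const std::array<std::size_t, Rank>& exts, std::size_t r)
{ return exts_prod(exts, 0, r); }

// Product of the extents to the right of dimension r:
// the stride of dimension r in a row-major (layout_right) mapping.
template<std::size_t Rank>
constexpr std::size_t
rev_prod(const std::array<std::size_t, Rank>& exts, std::size_t r)
{ return exts_prod(exts, r + 1, Rank); }
```

For extents (3, 5, 7) the forward products are 1, 3 and 15, and the reverse products are 35, 7 and 1.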

libstdc++-v3/ChangeLog:

* include/std/mdspan (layout_left): New class.

Signed-off-by: Luc Grosheintz 
---
 libstdc++-v3/include/std/mdspan | 309 +++-
 1 file changed, 308 insertions(+), 1 deletion(-)

diff --git a/libstdc++-v3/include/std/mdspan b/libstdc++-v3/include/std/mdspan
index 47cfa405e44..d90fed57a19 100644
--- a/libstdc++-v3/include/std/mdspan
+++ b/libstdc++-v3/include/std/mdspan
@@ -144,6 +144,20 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  { return __exts[__i]; });
  }
 
+   static constexpr span
+   _S_static_subextents(size_t __begin, size_t __end) noexcept
+   {
+ return {_Extents.data() + __begin, _Extents.data() + __end};
+   }
+
+   constexpr span
+   _M_dynamic_subextents(size_t __begin, size_t __end) const noexcept
+   requires (_Extents.size() > 0)
+   {
+ return {_M_dynamic_extents + _S_dynamic_index[__begin],
+ _M_dynamic_extents + _S_dynamic_index[__end]};
+   }
+
   private:
using _S_storage = __array_traits<_IndexType, _S_rank_dynamic>::_Type;
[[no_unique_address]] _S_storage _M_dynamic_extents;
@@ -160,6 +174,21 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
|| _Extent <= numeric_limits<_IndexType>::max();
   }
 
+  namespace __mdspan
+  {
+template
+  constexpr span
+  __static_subextents(size_t __begin, size_t __end)
+  { return _Extents::_S_storage::_S_static_subextents(__begin, __end); }
+
+template
+  constexpr span
+  __dynamic_subextents(const _Extents& __exts, size_t __begin, size_t 
__end)
+  {
+   return __exts._M_dynamic_extents._M_dynamic_subextents(__begin, __end);
+  }
+  }
+
   template
 class extents
 {
@@ -251,7 +280,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
: _M_dynamic_extents(span(__exts))
{ }
 
-
   template<__mdspan::__valid_index_type _OIndexType, size_t 
_Nm>
requires (_Nm == rank() || _Nm == rank_dynamic())
constexpr explicit(_Nm != rank_dynamic())
@@ -276,6 +304,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
}
 
 private:
+  friend span
+  __mdspan::__static_subextents(size_t, size_t);
+
+  friend span
+  __mdspan::__dynamic_subextents(const extents&, size_t, size_t);
+
   using _S_storage = __mdspan::_ExtentsStorage<
_IndexType, array{_Extents...}>;
   [[no_unique_address]] _S_storage _M_dynamic_extents;
@@ -286,6 +320,52 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   namespace __mdspan
   {
+template
+  constexpr size_t
+  __static_extents_prod(size_t __begin, size_t __end)
+  {
+   auto __sta_exts = __static_subextents<_Extents>(__begin, __end);
+   size_t __ret = 1;
+   for(size_t __i = 0; __i < __sta_exts.size(); ++__i)
+ if (__sta_exts[__i] != dynamic_extent)
+   __ret *= __sta_exts[__i];
+   return __ret;
+  }
+
+template
+  constexpr size_t
+  __dynamic_extents_prod(const _Extents& __exts, size_t __begin,
+size_t __end)
+  {
+   auto __dyn_exts = __dynamic_subextents<_Extents>(__exts, __begin,
+__end);
+   size_t __ret = 1;
+   for(size_t __i = 0; __i < __dyn_exts.size(); ++__i)
+   __ret *= __dyn_exts[__i];
+   return __ret;
+  }
+
+template
+  constexpr typename _Extents::index_type
+  __exts_prod(const _Extents& __exts, size_t __begin, size_t __end) 
noexcept
+  {
+   using _IndexType = typename _Extents::index_type;
+   auto __ret = __static_extents_prod<_Extents>(__begin, __end);
+   if constexpr (_Extents::rank_dynamic() > 0)
+ __ret *= __dynamic_extents_prod(__exts, __begin, __end);
+   return __ret;
+  }
+
+template
+  constexpr typename _Extents::index_type
+  __fwd_prod(const _Extents& __exts, size_t __r) noexcept
+  { return __exts_prod(__exts, 0, __r); }
+
+template
+  constexpr typename _Extents::index_type
+  __rev_prod(const _Extents& __exts, size_t __r) noexcept
+  { return __exts_prod(__exts, __r + 1, __exts.rank()); }
+
 template
   auto __build_dextents_type(integer_sequence)
-> extents<_IndexType, ((void) _Counts, dynamic_extent)...>;
@@ -304,6 +384,233 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 explicit extents(_Integrals...) ->
   extents()...>;
 
+  struct layout_left
+  {
+template
+  class mapping;
+  };
+
+  namespace __mdspan
+  {
+template
+  constexpr bool __is_extents = false;
+
+template
+  constexpr bool __is_extents> = true;
+
+template
+  constexpr typename _Extents::index_type
+  __linear_index_left(const _Extents& __exts, _Indices... __indices)
+  {
+   using _IndexType = typename _Extents::index_type;
+   _IndexType __res = 0;
+   if constexpr (sizeof...(__indices) 

[PATCH] libgcc: Add DPD support + fix big-endian support of _BitInt <-> dfp conversions

2025-05-20 Thread Jakub Jelinek
Hi!

The following patch fixes
FAIL: gcc.dg/dfp/bitint-1.c (test for excess errors)
FAIL: gcc.dg/dfp/bitint-2.c (test for excess errors)
FAIL: gcc.dg/dfp/bitint-3.c (test for excess errors)
FAIL: gcc.dg/dfp/bitint-4.c (test for excess errors)
FAIL: gcc.dg/dfp/bitint-5.c (test for excess errors)
FAIL: gcc.dg/dfp/bitint-6.c (test for excess errors)
FAIL: gcc.dg/dfp/bitint-8.c (test for excess errors)
FAIL: gcc.dg/dfp/int128-1.c (test for excess errors)
FAIL: gcc.dg/dfp/int128-2.c (test for excess errors)
FAIL: gcc.dg/dfp/int128-4.c (test for excess errors)
on s390x-linux (with the 3 not yet posted patches).

The patch does multiple things:
1) the routines were written for the DFP BID (binary integer decimal)
   format which is used on all arches but powerpc*/s390* (those use
   DPD - densely packed decimal format); as most of the code is actually
   the same for both BID and DPD formats, I haven't copied the sources
   + slightly modified them, but added the DPD support directly, + renaming
   of the exported symbols from __bid_* prefixed to __dpd_* prefixed that
   GCC expects on the DPD targets
2) while testing that I've found some big-endian issues in the existing
   support
3) testing also revealed that in some cases __builtin_clzll (~msb) was
   called with msb set to all ones, so invoking UB; apparently on aarch64
   and x86 we were lucky and got some value that happened to work well,
   but that wasn't the case on s390x
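
The undefined behaviour in 3) can be sketched as follows. __builtin_clzll has an undefined result for a zero argument, so when msb is all ones its complement must not be passed to the builtin. The helper name leading_ones is hypothetical and the guard shown is only one possible fix, not necessarily the one the patch uses.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: count the leading one bits of a 64-bit limb.
// ~msb is zero exactly when msb is all ones, and __builtin_clzll (0)
// is undefined, so that case is answered directly.
inline int
leading_ones(std::uint64_t msb)
{
  if (msb == ~std::uint64_t{0})
    return 64;
  const int n = __builtin_clzll(static_cast<unsigned long long>(~msb));
  assert(n >= 0 && n < 64);
  return n;
}
```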

For 1), the patch uses two ~ 2KB tables to speed up the decoding/encoding.
I haven't found such tables in what is added into libgcc.a, though they
are in libdecnumber/bid/bid2dpd_dpd2bid.h, but there they are just huge
and next to other huge tables - there is d2b which is like __dpd_d2bbitint
in the patch but it uses 64-bit entries rather than 16-bit, then there is
d2b2 with 64-bit entries like in d2b all multiplied by 1000, then d2b3
similarly multiplied by 100, then d2b4 similarly multiplied by
10, then d2b5 similarly multiplied by 1ULL and
d2b6 similarly multiplied by 1000ULL.  Arguably it can
save some of the multiplications, but on the other side accesses memory
which is unlikely in the caches, and the 2048 bytes in the patch vs.
24 times more for d2b is IMHO significant.
For b2d, libdecnumber/bid/bid2dpd_dpd2bid.h has again b2d table like
__dpd_b2dbitint in the patch, except that it has 64-bit entries rather
than 16-bit (this time 1000 entries), but then has b2d2 which has the
same entries shifted left by 10, then b2d3 shifted left by 20, b2d4 shifted
left by 30 and b2d5 shifted left by 40.  I can understand for d2b paying
memory cost to speed up multiplications, but don't understand paying
extra 4 * 8 * 1000 bytes (+ 6 * 1000 bytes for b2d not using ushort)
just to avoid shifts.
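
To show why a 16-bit table is enough on the decoding side, here is a sketch that decodes one 10-bit DPD declet into its value 0..999 and fills a 1024-entry table of 16-bit values (1024 * 2 = 2048 bytes). The names decode_declet and make_d2b_table are hypothetical and the code follows the published DPD declet layout; it is not taken from the patch, whose __dpd_d2bbitint table may be laid out differently.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper: decode a 10-bit densely packed decimal declet.
constexpr std::uint16_t
decode_declet(unsigned d)
{
  assert(d < 1024);
  const unsigned hi = (d >> 7) & 7;    // bits 9..7
  const unsigned mid = (d >> 4) & 7;   // bits 6..4
  const unsigned lo = d & 7;           // bits 2..0
  // Large digits (8 or 9) keep only their low bit in the encoding.
  const unsigned big2 = 8 | (hi & 1);
  const unsigned big1 = 8 | (mid & 1);
  const unsigned big0 = 8 | (d & 1);
  // Bit 3 clear: all three digits are small (0..7) and stored directly.
  unsigned d2 = hi, d1 = mid, d0 = lo;
  if (d & 8)
    switch ((d >> 1) & 3)              // bits 2..1 say which digit is large
      {
      case 0: d0 = big0; break;
      case 1: d1 = big1; d0 = (mid & 6) | (d & 1); break;
      case 2: d2 = big2; d0 = (hi & 6) | (d & 1); break;
      default:                         // two or three large digits
        switch ((d >> 5) & 3)          // bits 6..5 say which ones
          {
          case 0: d2 = big2; d1 = big1; d0 = (hi & 6) | (d & 1); break;
          case 1: d2 = big2; d1 = (hi & 6) | (mid & 1); d0 = big0; break;
          case 2: d1 = big1; d0 = big0; break;
          default: d2 = big2; d1 = big1; d0 = big0; break;
          }
      }
  return static_cast<std::uint16_t>(d2 * 100 + d1 * 10 + d0);
}

// Hypothetical table builder: one 16-bit entry per possible declet.
inline std::array<std::uint16_t, 1024>
make_d2b_table()
{
  std::array<std::uint16_t, 1024> table{};
  for (unsigned d = 0; d < 1024; ++d)
    table[d] = decode_declet(d);
  return table;
}
```

Every entry is at most 999, so it fits in 16 bits, and any scaling by powers of 1000 can be done with a multiplication at the use site instead of a separate table per power.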

Tested on x86_64-linux, i686-linux and s390x-linux with
make check-gcc dfp.exp
ok for trunk?

2025-05-20  Jakub Jelinek  

* config/t-softfp (softfp_bid_list): Don't guard with
$(enable_decimal_float) == bid.
* soft-fp/bitint.h (__bid_pow10bitint): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_pow10bitint.
(__dpd_d2bbitint, __dpd_b2dbitint): Declare.
* soft-fp/bitintpow10.c (__dpd_d2bbitint, __dpd_b2dbitint): New
variables.
* soft-fp/fixsdbitint.c (__bid_fixsdbitint): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_fixsdbitint.
Add DPD support.  Fix big-endian support.
* soft-fp/fixddbitint.c (__bid_fixddbitint): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_fixddbitint.
Add DPD support.  Fix big-endian support.
* soft-fp/fixtdbitint.c (__bid_fixtdbitint): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_fixtdbitint.
Add DPD support.  Fix big-endian support.
* soft-fp/fixsdti.c (__bid_fixsdbitint): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_fixsdbitint.
(__bid_fixsdti): For !defined(ENABLE_DECIMAL_BID_FORMAT) redefine to
__dpd_fixsdti.
* soft-fp/fixddti.c (__bid_fixddbitint): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_fixddbitint.
(__bid_fixddti): For !defined(ENABLE_DECIMAL_BID_FORMAT) redefine to
__dpd_fixddti.
* soft-fp/fixtdti.c (__bid_fixtdbitint): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_fixtdbitint.
(__bid_fixtdti): For !defined(ENABLE_DECIMAL_BID_FORMAT) redefine to
__dpd_fixtdti.
* soft-fp/fixunssdti.c (__bid_fixsdbitint): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_fixsdbitint.
(__bid_fixunssdti): For !defined(ENABLE_DECIMAL_BID_FORMAT) redefine
to __dpd_fixunssdti.
* soft-fp/fixunsddti.c (__bid_fixddbitint): For
!defined(ENABLE_DECIMAL_BID_FORMAT) redefine to __dpd_fixddbitint.
(__bid_fixunsddti): For !defined(ENABLE_DECIMAL_BID_FORMAT) redefine
to __dpd_fixunsddti.
  

[PATCH v4] libstdc++: Implement C++23 P1659R3 starts_with and ends_with

2025-05-20 Thread Patrick Palka
Changes in V4:
  * optimize the both-directional/common ranges case, as suggested by
Tomasz
  * add tests for that code path

Changes in v3:
  * Use the forward_range code path for a (non-sized) bidirectional
haystack, since it's slightly fewer increments/decrements
overall.
  * Fix wrong iter_difference_t cast in starts_with.

Changes in v2:
  Addressed Tomasz's review comments, namely:
  * Added explicit iter_difference_t casts
  * Made _S_impl member private
  * Optimized sized bidirectional case of ends_with
  * Rearranged control flow of starts_with::_S_impl

Still left to do:
  * Add tests for integer-class types
  * Still working on a better commit description ;)
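
For context, the semantics being implemented can be sketched with the classic iterator-pair std::mismatch, which is swapped in here for ranges::mismatch so the example also builds without C++20. The helper names starts_with_sketch and ends_with_sketch are hypothetical, projections and predicates are omitted, and the reverse-iterator suffix test is only one plausible way to handle two bidirectional common ranges; it is not a copy of the patch's _S_impl.

```cpp
#include <algorithm>
#include <iterator>

// Hypothetical helper: does [first2, last2) appear at the start of
// [first1, last1)?
template<typename It1, typename It2>
bool
starts_with_sketch(It1 first1, It1 last1, It2 first2, It2 last2)
{
  // An empty needle is a prefix of every range.
  if (first2 == last2)
    return true;
  // The needle is a prefix iff mismatch walks it to its end.
  return std::mismatch(first1, last1, first2, last2).second == last2;
}

// Hypothetical helper for bidirectional iterators: a suffix test is a
// prefix test on both ranges read back to front.
template<typename It1, typename It2>
bool
ends_with_sketch(It1 first1, It1 last1, It2 first2, It2 last2)
{
  return starts_with_sketch(std::make_reverse_iterator(last1),
                            std::make_reverse_iterator(first1),
                            std::make_reverse_iterator(last2),
                            std::make_reverse_iterator(first2));
}
```

When both sizes are known up front, a needle longer than the haystack can be rejected before any element is compared, which is what the sized code paths in the patch do.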

-- >8 --

libstdc++-v3/ChangeLog:

* include/bits/ranges_algo.h (__starts_with_fn, starts_with):
Define.
(__ends_with_fn, ends_with): Define.
* include/bits/version.def (ranges_starts_ends_with): Define.
* include/bits/version.h: Regenerate.
* include/std/algorithm: Provide __cpp_lib_ranges_starts_ends_with.
* src/c++23/std.cc.in (ranges::starts_with): Export.
(ranges::ends_with): Export.
* testsuite/25_algorithms/ends_with/1.cc: New test.
* testsuite/25_algorithms/starts_with/1.cc: New test.
---
 libstdc++-v3/include/bits/ranges_algo.h   | 245 ++
 libstdc++-v3/include/bits/version.def |   8 +
 libstdc++-v3/include/bits/version.h   |  10 +
 libstdc++-v3/include/std/algorithm|   1 +
 libstdc++-v3/src/c++23/std.cc.in  |   4 +
 .../testsuite/25_algorithms/ends_with/1.cc| 135 ++
 .../testsuite/25_algorithms/starts_with/1.cc  | 128 +
 7 files changed, 531 insertions(+)
 create mode 100644 libstdc++-v3/testsuite/25_algorithms/ends_with/1.cc
 create mode 100644 libstdc++-v3/testsuite/25_algorithms/starts_with/1.cc

diff --git a/libstdc++-v3/include/bits/ranges_algo.h 
b/libstdc++-v3/include/bits/ranges_algo.h
index f36e7dd59911..c4b24f6fea9f 100644
--- a/libstdc++-v3/include/bits/ranges_algo.h
+++ b/libstdc++-v3/include/bits/ranges_algo.h
@@ -438,6 +438,251 @@ namespace ranges
 
   inline constexpr __search_n_fn search_n{};
 
+#if __glibcxx_ranges_starts_ends_with // C++ >= 23
+  struct __starts_with_fn
+  {
+template<input_iterator _Iter1, sentinel_for<_Iter1> _Sent1,
+input_iterator _Iter2, sentinel_for<_Iter2> _Sent2,
+typename _Pred = ranges::equal_to,
+typename _Proj1 = identity, typename _Proj2 = identity>
+  requires indirectly_comparable<_Iter1, _Iter2, _Pred, _Proj1, _Proj2>
+  constexpr bool
+  operator()(_Iter1 __first1, _Sent1 __last1,
+_Iter2 __first2, _Sent2 __last2, _Pred __pred = {},
+_Proj1 __proj1 = {}, _Proj2 __proj2 = {}) const
+  {
+   iter_difference_t<_Iter1> __n1 = -1;
+   iter_difference_t<_Iter2> __n2 = -1;
+   if constexpr (sized_sentinel_for<_Sent1, _Iter1>)
+ __n1 = __last1 - __first1;
+   if constexpr (sized_sentinel_for<_Sent2, _Iter2>)
+ __n2 = __last2 - __first2;
+   return _S_impl(std::move(__first1), __last1, __n1,
+  std::move(__first2), __last2, __n2,
+  std::move(__pred),
+  std::move(__proj1), std::move(__proj2));
+  }
+
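+template<input_range _Range1, input_range _Range2,
+typename _Pred = ranges::equal_to,
+typename _Proj1 = identity, typename _Proj2 = identity>
+  requires indirectly_comparable<iterator_t<_Range1>, iterator_t<_Range2>,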
+_Pred, _Proj1, _Proj2>
+  constexpr bool
+  operator()(_Range1&& __r1, _Range2&& __r2, _Pred __pred = {},
+_Proj1 __proj1 = {}, _Proj2 __proj2 = {}) const
+  {
+   range_difference_t<_Range1> __n1 = -1;
+   range_difference_t<_Range2> __n2 = -1;
+   if constexpr (sized_range<_Range1>)
+ __n1 = ranges::size(__r1);
+   if constexpr (sized_range<_Range2>)
+ __n2 = ranges::size(__r2);
+   return _S_impl(ranges::begin(__r1), ranges::end(__r1), __n1,
+  ranges::begin(__r2), ranges::end(__r2), __n2,
+  std::move(__pred),
+  std::move(__proj1), std::move(__proj2));
+  }
+
+  private:
+template
+  static constexpr bool
+  _S_impl(_Iter1 __first1, _Sent1 __last1, iter_difference_t<_Iter1> __n1,
+ _Iter2 __first2, _Sent2 __last2, iter_difference_t<_Iter2> __n2,
+ _Pred __pred, _Proj1 __proj1, _Proj2 __proj2)
+  {
+   if (__first2 == __last2) [[unlikely]]
+ return true;
+   else if (__n1 == -1 || __n2 == -1)
+ return ranges::mismatch(std::move(__first1), __last1,
+ std::move(__first2), __last2,
+ std::move(__pred),
+ std::move(__proj1), std::move(__proj2)).in2 
== __last2;
+   else if (__n1 < __n2)
+ return false;
+   else if constexpr (random_access_iterator<_Iter1>)
+ return ranges::equal(__first1, __first1 + 
iter_difference_t<_Iter1>(__n2),
+  std

[PATCH 3/5 v3] c++, coroutines: Address CWG2563 return value init [PR119916].

2025-05-20 Thread Iain Sandoe
Hi Jason

>>>So I moved this to the position before the g_r_o is initialized
>>>(since we only manage cleanups of the entities that come before that, 
>>>although
>>> that's a bit hard to see from the patch).

>>This will probably need reevaluation if you take my suggestion from the 
>>decltype patch for addressing 115908, but this is fine for now.

I am adding the suggestion to my TODO.

>>...
>>+  if (flag_exceptions)
>>+{
>>+  r = cp_build_init_expr (coro_before_return, boolean_false_node);

>This should be MODIFY_EXPR, not INIT_EXPR; it got an initial value already in 
>the DECL_EXPR.

Fixed, OK for trunk now?
thanks
Iain

--- 8< ---

This addresses the clarification that, when the get_return_object is of a
different type from the ramp return, any necessary conversions should be
performed on the return expression (so that they typically occur after the
function body has started execution).

PR c++/119916

gcc/cp/ChangeLog:

* coroutines.cc
(cp_coroutine_transform::wrap_original_function_body): Do not
initialise initial_await_resume_called here...
(cp_coroutine_transform::build_ramp_function): ... but here.
When the coroutine is not void, initialize a GRO object from
promise.get_return_object().  Use this as the argument to the
return expression.  Use a regular cleanup for the GRO, since
it is ramp-local.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/torture/special-termination-00-sync-completion.C:
Amend for CWG2563 expected behaviour.
* g++.dg/coroutines/torture/special-termination-01-self-destruct.C:
Likewise.
* g++.dg/coroutines/torture/pr119916.C: New test.

Signed-off-by: Iain Sandoe 
---
 gcc/cp/coroutines.cc  | 126 ++
 .../g++.dg/coroutines/torture/pr119916.C  |  66 +
 .../special-termination-00-sync-completion.C  |   2 +-
 .../special-termination-01-self-destruct.C|   2 +-
 4 files changed, 109 insertions(+), 87 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/coroutines/torture/pr119916.C

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index 743da068e35..bc5fb9381db 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -4451,7 +4451,7 @@ cp_coroutine_transform::wrap_original_function_body ()
   tree i_a_r_c
= coro_build_artificial_var (loc, coro_frame_i_a_r_c_id,
 boolean_type_node, orig_fn_decl,
-boolean_false_node);
+NULL_TREE);
   DECL_CHAIN (i_a_r_c) = var_list;
   var_list = i_a_r_c;
   add_decl_expr (i_a_r_c);
@@ -4867,7 +4867,6 @@ cp_coroutine_transform::build_ramp_function ()
   add_decl_expr (coro_fp);
 
   tree coro_promise_live = NULL_TREE;
-  tree coro_gro_live = NULL_TREE;
   if (flag_exceptions)
 {
   /* Signal that we need to clean up the promise object on exception.  */
@@ -4876,13 +4875,6 @@ cp_coroutine_transform::build_ramp_function ()
  boolean_type_node, orig_fn_decl,
  boolean_false_node);
 
-  /* When the get-return-object is in the RETURN slot, we need to arrange
-for cleanup on exception.  */
-  coro_gro_live
-   = coro_build_and_push_artificial_var (loc, "_Coro_gro_live",
- boolean_type_node, orig_fn_decl,
- boolean_false_node);
-
   /* To signal that we need to cleanup copied function args.  */
   if (DECL_ARGUMENTS (orig_fn_decl))
for (tree arg = DECL_ARGUMENTS (orig_fn_decl); arg != NULL;
@@ -4970,13 +4962,19 @@ cp_coroutine_transform::build_ramp_function ()
   tree ramp_try_block = NULL_TREE;
   tree ramp_try_stmts = NULL_TREE;
   tree iarc_x = NULL_TREE;
+  tree coro_before_return = NULL_TREE;
   if (flag_exceptions)
 {
+  coro_before_return
+   = coro_build_and_push_artificial_var (loc, "_Coro_before_return",
+ boolean_type_node, orig_fn_decl,
+ boolean_true_node);
   iarc_x
= coro_build_and_push_artificial_var_with_dve (loc,
   coro_frame_i_a_r_c_id,
   boolean_type_node,
-  orig_fn_decl, NULL_TREE,
+  orig_fn_decl,
+  boolean_false_node,
   deref_fp);
   ramp_try_block = begin_try_block ();
   ramp_try_stmts = begin_compound_stmt (BCS_TRY_BLOCK);
@@ -5136,90 +5134,54 @@ cp_coroutine_transform::build_ramp_function ()
 (loc, coro_resume_index_id, short_unsigned_type_node,  orig_fn_decl,

Re: [PATCH] Match: Handle commonly used unsigned modulo counters

2025-05-20 Thread Richard Biener
On Thu, May 15, 2025 at 10:29 AM MCC CS  wrote:
>
> Dear all,
>
> Here's my patch for PR120265. Bootstrapped and tested on aarch64 that it
> causes no regressions. I also added a testcase. I'd be grateful
> if you could commit it.
>
> Otherwise, feedback to improve it is welcome.
>
> Many thanks
> MCCCS
>
>
> From 1e901c3fa5c8cc3e55d4f1715b4aae4ae3d66714 Mon Sep 17 00:00:00 2001
> From: MCCCS 
> Date: Thu, 15 May 2025 09:16:49 +0100
> Subject: [PATCH] tree-optimization/120265 - Optimize modular counters
>
> This PR is about replacing trunc_mod with a
> simpler expression given the bounds of variables.
>
> PR tree-optimization/120265
> * match.pd:
> X % M -> X for X in 0 to M-1
> X % M -> (X == M) ? 0 : X for X in 0 to M
> X % M -> (X >= M) ? (X - M) : X for X in 0 to 2*M-1.

While the first case looks profitable the 2nd and third might not be
when optimizing for size and they are definitely not canonicalizations
and thus might interfere with association with division?  So I wonder
whether those are better suited to be performed at RTL expansion time?
Like on AVR when type is bigger than word_mode the transform looks bad.
When the target does not have a conditional move instruction the generated
branch might be also difficult to predict?

I would have expected we do the first pattern in VRPs simplification
step, but I might misremember.

Richard.

> * gcc.dg/pr120265.c: New testcase.
> ---
>  gcc/match.pd| 27 
>  gcc/testsuite/gcc.dg/pr120265.c | 44 +
>  2 files changed, 71 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/pr120265.c
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 79485f9678a..bd8950b4e10 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -5602,6 +5602,33 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> optab_vector)))
> (eq (trunc_mod @0 @1) { build_zero_cst (TREE_TYPE (@0)); })))
>
> +#if GIMPLE
> +/* X % M -> X for X in 0 to M-1.  */
> +/* X % M -> (X == M) ? 0 : X for X in 0 to M.  */
> +/* X % M -> (X >= M) ? (X - M) : X for X in 0 to 2*M-1.  */
> +(simplify
> + (trunc_mod @0 @1)
> +  (with { int_range_max vr0, vr1; }
> +   (if (get_range_query (cfun)->range_of_expr (vr0, @0)
> +   && get_range_query (cfun)->range_of_expr (vr1, @1)
> +   && !vr0.undefined_p ()
> +   && !vr1.undefined_p ()
> +   && !integer_zerop (@1)
> +   && (TYPE_UNSIGNED (type)
> +   || (vr0.nonnegative_p () && vr1.nonnegative_p (
> +(with
> + { wide_int twice = 2 * vr1.lower_bound (); }
> + (switch
> +  (if (wi::gtu_p (vr1.lower_bound (), vr0.upper_bound ()))
> +   @0)
> +  (if (wi::geu_p (vr1.lower_bound (), vr0.upper_bound ()))
> +   (cond (eq @0 @1)
> +   { build_zero_cst (type); }
> +   @0))
> +  (if (wi::gtu_p (twice, vr0.upper_bound ()))
> +   (cond (ge @0 @1) (minus @0 @1) @0)))
> +#endif
> +
>  /* ((X /[ex] C1) +- C2) * (C1 * C3)  -->  (X * C3) +- (C1 * C2 * C3).  */
>  (for op (plus minus)
>   (simplify
> diff --git a/gcc/testsuite/gcc.dg/pr120265.c b/gcc/testsuite/gcc.dg/pr120265.c
> new file mode 100644
> index 000..2634af36226
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr120265.c
> @@ -0,0 +1,44 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -fdump-tree-optimized" } */
> +__attribute__((noipa)) void g(int r)
> +{
> + (void) r;
> +}
> +
> +int x;
> +
> +void a(void)
> +{
> + unsigned m = 0;
> +  for(int i = 0; i < 300; i++)
> +  {
> +   m++;
> +   m %= 600;
> +   g(m);
> +  }
> +}
> +
> +void b(void)
> +{
> + unsigned m = 0;
> +  for(int i = 0; i < x; i++)
> +  {
> +   m++;
> +   m %= 600;
> +   g(m);
> +  }
> +}
> +
> +void c(void)
> +{
> + unsigned m = 0;
> + for(int i = 0; i < x; i++)
> + {
> +  m += 7;
> +  m %= 600;
> +  g(m);
> + }
> +}
> +
> +/* { dg-final { scan-tree-dump-not "% 600" "optimized" } } */
> +
> --
> 2.45.2
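
For readers following the thread, the second and third rewrites listed in the
patch can be sketched in plain C++ as below. This is only an illustration of
the arithmetic identities, not the match.pd implementation; the function names
wrap_increment and mod_below_twice are made up for this sketch, and the range
facts that the patch obtains from the range query are stated here as asserted
preconditions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: advance a counter that is known to be in [0, m-1].
// After the increment the value is in [1, m], so "% m" reduces to a
// compare-and-select: (x == m) ? 0 : x.
inline std::uint32_t wrap_increment(std::uint32_t counter, std::uint32_t m)
{
  assert(m != 0 && counter < m);
  const std::uint32_t next = counter + 1;
  return next == m ? 0 : next;
}

// Hypothetical helper: x % m when x is known to be in [0, 2*m-1].
// A single conditional subtraction gives the same result as the modulo.
inline std::uint32_t mod_below_twice(std::uint32_t x, std::uint32_t m)
{
  // Precondition written so that 2*m is never computed (it could overflow).
  assert(m != 0 && (x < m || x - m < m));
  return x >= m ? x - m : x;
}
```

Whether the select form is actually cheaper than the modulo depends on the
target, which is the concern raised in the review above.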


Re: [PATCH v22 0/3] c: Add _Countof and

2025-05-20 Thread Jakub Jelinek
On Tue, May 20, 2025 at 05:15:33PM +0200, Alejandro Colomar wrote:
> I've based on gnulib commits, which I believe follow the same
> guidelines.  For example:
> 
>   commit 6608062398ef4c983a58b90a1520c39f12fb7ac1
>   Author: Paul Eggert 
>   Date:   Fri Jan 10 10:34:58 2025 -0800
> 
>   doc: document some file system portability issues
>   
>   * doc/glibc-functions/flistxattr.texi:
>   * doc/glibc-functions/listxattr.texi:
>   * doc/glibc-functions/llistxattr.texi:
>   * doc/posix-functions/fchdir.texi, doc/posix-functions/fstat.texi:
>   * doc/posix-functions/fstatvfs.texi:
>   Document some portability gotchas that Gnulib does not work around.
> 
> Now I realize that maybe my changelog misses the trailing ':' for the
> entries that have no text (because it's only once at the end)?  So for
> example instead of
> 
> gcc/c-family/ChangeLog:
> 
> * c-common.h
> * c-common.def
> * c-common.cc (c_countof_type): Add __countof__ operator.
> 
> I should do this?
> 
> gcc/c-family/ChangeLog:
> 
> * c-common.h:
> * c-common.def:
> * c-common.cc (c_countof_type): Add __countof__ operator.
> 
> Or maybe this?
> 
> gcc/c-family/ChangeLog:
> 
> * c-common.h:
> * c-common.def:
> * c-common.cc (c_countof_type):
> Add __countof__ operator.

We don't use (at least mostly) any of these, instead use
* c-common.h (whatever changed): Description.
* c-common.def (whatever else changed): Likewise.
* c-common.cc (again what changed): Likewise.
and similar (or Ditto instead of Likewise).
And just c-common.h or c-common.def without actually specifying what
you've changed there is generally bad, there are some rare exceptions
(e.g. if you add #include, that is mentioned on the whole file, or
if there are massive repetitive changes everywhere).

Jakub



Re: [PATCH] c++: substituting fn parm redeclared with dep alias tmpl [PR120224]

2025-05-20 Thread Patrick Palka
On Mon, 19 May 2025, Patrick Palka wrote:

> Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look
> OK for trunk/15/14?

Whoops, CI reports I missed a testsuite adjustment expecting an
additional error in other/default13.C, which seems reasonable.  Here's
an updated patch.

-- >8 --

Here we declare f twice, first ordinarily and then using a dependent
alias template.  Due to alias template transparency these logically
declare the same overload.  But now the function type of f, which was
produced from the first declaration, diverges from the type of its
formal parameter, which is produced from the subsequent redefinition,
in that substituting T=int succeeds for the function type but not for
the formal parameter type.  This eventually causes us to produce an
undiagnosed error_mark_node in the AST of the function call, leading to
a sanity check failure added in r14-6343-g0c018a74eb1aff.

Before r14-6343, we would later reject the testcase albeit from
regenerate_decl_from_template when instantiating the definition of f,
making this a regression.

To fix this, it seems we just need to check for errors when substituting
the type of a PARM_DECL, since that could still fail despite substitution
into the function type succeeding.

PR c++/120224

gcc/cp/ChangeLog:

* pt.cc (tsubst_function_decl): Return error_mark_node if any
of the substituted function parameters are erroneous.
(tsubst_decl) : Return error_mark_node if
the substituted function parameter type is erroneous.

gcc/testsuite/ChangeLog:

* g++.dg/other/default13.C: Expect additional overload
resolution failure diagnostic.
* g++.dg/cpp0x/alias-decl-80.C: New test.
---
 gcc/cp/pt.cc   |  9 -
 gcc/testsuite/g++.dg/cpp0x/alias-decl-80.C | 14 ++
 gcc/testsuite/g++.dg/other/default13.C |  2 +-
 3 files changed, 23 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/alias-decl-80.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 1973d25b61a0..df6d7bb136ea 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -14903,7 +14903,11 @@ tsubst_function_decl (tree t, tree args, 
tsubst_flags_t complain,
 parms = DECL_CHAIN (parms);
   parms = tsubst (parms, args, complain, t);
   for (tree parm = parms; parm; parm = DECL_CHAIN (parm))
-DECL_CONTEXT (parm) = r;
+{
+  if (parm == error_mark_node)
+   return error_mark_node;
+  DECL_CONTEXT (parm) = r;
+}
   if (closure && DECL_IOBJ_MEMBER_FUNCTION_P (t))
 {
   tree tparm = build_this_parm (r, closure, type_memfn_quals (type));
@@ -15474,6 +15478,9 @@ tsubst_decl (tree t, tree args, tsubst_flags_t complain,
   /* We're dealing with a normal parameter.  */
   type = tsubst (TREE_TYPE (t), args, complain, in_decl);
 
+   if (type == error_mark_node)
+ RETURN (error_mark_node);
+
 type = type_decays_to (type);
 TREE_TYPE (r) = type;
 cp_apply_type_quals_to_decl (cp_type_quals (type), r);
diff --git a/gcc/testsuite/g++.dg/cpp0x/alias-decl-80.C 
b/gcc/testsuite/g++.dg/cpp0x/alias-decl-80.C
new file mode 100644
index ..e2ff663843de
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/alias-decl-80.C
@@ -0,0 +1,14 @@
+// PR c++/120224
+// { dg-do compile { target c++11 } }
+
+template using void_t = void;
+
+template
+void f(void*); // #1
+
+template
+void f(void_t*) { } // { dg-error "not a class" } #2
+
+int main() {
+  f(0); // { dg-error "no match" }
+}
diff --git a/gcc/testsuite/g++.dg/other/default13.C 
b/gcc/testsuite/g++.dg/other/default13.C
index eae23ffdf2d1..381aee78ea2c 100644
--- a/gcc/testsuite/g++.dg/other/default13.C
+++ b/gcc/testsuite/g++.dg/other/default13.C
@@ -8,4 +8,4 @@ template < typename > struct B
   int f;
 };
 
-B < int > b (0);
+B < int > b (0); // { dg-error "no match" }
-- 
2.49.0.608.gcb96e1697a



Re: [PATCH v3 3/3] asf: Fix calling of emit_move_insn on registers of different modes [PR119884]

2025-05-20 Thread Richard Sandiford
Richard Sandiford  writes:
> Konstantinos Eleftheriou  writes:
>> This patch uses `lowpart_subreg` for the base register initialization,
>> instead of zero-extending it. We had tried this solution before, but
>> we were leaving undefined bytes in the upper part of the register.
>> This shouldn't be happening as we are supposed to write the whole
>> register when the load is eliminated. This was occurring when having
>> multiple stores with the same offset as the load, generating a
>> register move for all of them, overwriting the bit inserts that
>> were inserted before them.
>>
>> In order to overcome this, we are removing redundant stores from the 
>> sequence,
>> i.e. stores that write to addresses that will be overwritten by stores that
>> come after them in the sequence. We are using the same bitmap that is used
>> for the load elimination check, to keep track of the bytes that are written
>> by each store.
>>
>> Also, we are now allowing the load to be eliminated even when there are
>> overlaps between the stores, as there is no obvious reason why we shouldn't
>> do that, we just want the stores to cover all of the load's bytes.
>>
>> Bootstrapped/regtested on AArch64 and x86_64.
>>
>> PR rtl-optimization/119884
>>
>> gcc/ChangeLog:
>>
>> * avoid-store-forwarding.cc (process_store_forwarding):
>>  Use `lowpart_subreg` for the base register initialization,
>>  and remove redundant stores from the store/load sequence.
>>
>> gcc/testsuite/ChangeLog:
>>
>> * gcc.target/i386/pr119884.c: New test.
>>
>> Signed-off-by: Konstantinos Eleftheriou 
>> ---
>>
>> Changes in v3:
>> - Remove redundant stores, instead of generating a register move for
>> the first store that has the same offset as the load only.
>>
>> Changes in v2:
>> - Use `lowpart_subreg` for the base register initialization, but
>> only for the first store that has the same offset as the load.
>>
>> Changes in v1:
>> - Add a check for the register modes to match before calling `emit_move_insn`.
>>
>>  gcc/avoid-store-forwarding.cc| 45 ++--
>>  gcc/testsuite/gcc.target/i386/pr119884.c | 13 +++
>>  2 files changed, 48 insertions(+), 10 deletions(-)
>>  create mode 100644 gcc/testsuite/gcc.target/i386/pr119884.c
>>
>> diff --git a/gcc/avoid-store-forwarding.cc b/gcc/avoid-store-forwarding.cc
>> index 5d960adec359..f88a001e5717 100644
>> --- a/gcc/avoid-store-forwarding.cc
>> +++ b/gcc/avoid-store-forwarding.cc
>> @@ -176,20 +176,28 @@ process_store_forwarding (vec &stores, 
>> rtx_insn *load_insn,
>>/* Memory sizes should be constants at this stage.  */
>>HOST_WIDE_INT load_size = MEM_SIZE (load_mem).to_constant ();
>>  
>> -  /* If the stores cover all the bytes of the load without overlap then we 
>> can
>> - eliminate the load entirely and use the computed value instead.  */
>> +  /* If the stores cover all the bytes of the load, then we can eliminate
>> + the load entirely and use the computed value instead.
>> + We can also eliminate stores on addresses that are overwritten
>> + by later stores.  */
>>  
>>sbitmap forwarded_bytes = sbitmap_alloc (load_size);
>>bitmap_clear (forwarded_bytes);
>>  
>>unsigned int i;
>>store_fwd_info* it;
>> +  auto_vec redundant_stores;
>> +  auto_vec store_ind_to_remove;
>>FOR_EACH_VEC_ELT (stores, i, it)
>>  {
>>HOST_WIDE_INT store_size = MEM_SIZE (it->store_mem).to_constant ();
>> -  if (bitmap_bit_in_range_p (forwarded_bytes, it->offset,
>> +  if (bitmap_is_range_set_p (forwarded_bytes, it->offset,
>>   it->offset + store_size - 1))
>> -break;
>> +{
>> +  redundant_stores.safe_push (*it);
>> +  store_ind_to_remove.safe_push (i);
>> +  continue;
>> +}
>>bitmap_set_range (forwarded_bytes, it->offset, store_size);
>>  }
>>  
>> @@ -215,6 +223,11 @@ process_store_forwarding (vec &stores, 
>> rtx_insn *load_insn,
>>  fprintf (dump_file, "(Load elimination candidate)\n");
>>  }
>>  
>> +  /* Remove redundant stores from the vector.  */
>> +  store_ind_to_remove.reverse ();
>> +  for (int i : store_ind_to_remove)
>> +stores.ordered_remove (i);
>> +
>
> This is quadratic.  That probably doesn't matter in practice though,
> since the dependence checking is already quadratic, and the size is
> already limited by a --param.  But I think it's worth at least a comment.
> Maybe:
>
>   /* Remove redundant stores from the vector.  ??? Although this is
>  quadratic, there doesn't to be seem much point optimizing it.

Gah, sorry for the typo: ...doesn't seem to be much point...

>  The number of redundant stores is expected to be low and the length
>  of the list is limited by a --param.  The dependence checking that
>  we did earlier is also quadratic in the size of this list.  */
>
> From my POV, the patch is OK for trunk with that change (and with the
> obvious rename after the comments on patch 2).  Pl
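
As an aside for readers, the redundant-store filtering and the reverse-order
removal discussed in this thread can be sketched with standard containers as
below. This is not the avoid-store-forwarding.cc code: StoreSketch,
find_redundant and remove_indices are hypothetical names, a std::vector<bool>
stands in for the sbitmap, and the sketch assumes the stores are visited in an
order where a store whose bytes are already covered is the one that gets
overwritten.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a store record: byte offset into the load
// and number of bytes written.
struct StoreSketch
{
  std::size_t offset;
  std::size_t size;
};

// Return the indices (ascending) of stores whose bytes are all covered by
// stores visited before them.  Non-redundant stores mark their bytes.
inline std::vector<std::size_t>
find_redundant(const std::vector<StoreSketch>& stores, std::size_t load_size)
{
  std::vector<bool> covered(load_size, false);
  std::vector<std::size_t> redundant;
  for (std::size_t i = 0; i < stores.size(); ++i)
    {
      const StoreSketch& s = stores[i];
      assert(s.size > 0 && s.offset + s.size <= load_size);
      bool all_set = true;
      for (std::size_t b = s.offset; b < s.offset + s.size; ++b)
        all_set = all_set && covered[b];
      if (all_set)
        {
          redundant.push_back(i);
          continue;
        }
      for (std::size_t b = s.offset; b < s.offset + s.size; ++b)
        covered[b] = true;
    }
  return redundant;
}

// Erase the given indices from the back so the remaining indices stay
// valid.  Each erase shifts the tail, so this is quadratic in the worst
// case, which is the cost the review comment asks to document.
inline void
remove_indices(std::vector<StoreSketch>& stores,
               const std::vector<std::size_t>& ascending)
{
  for (auto it = ascending.rbegin(); it != ascending.rend(); ++it)
    stores.erase(stores.begin() + static_cast<std::ptrdiff_t>(*it));
}
```

With a short list bounded by a parameter, the simple erase loop is likely good
enough, which matches the conclusion in the review.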

Re: [RFC PATCH 0/3] _BitInt(N) support for LoongArch

2025-05-20 Thread Jakub Jelinek
On Tue, May 20, 2025 at 03:44:09PM +0200, Jakub Jelinek wrote:
> The tests weren't included :(.

I'd like to see something along the lines of the following as the test(s)
for the padding bits (if LoongArch will really have the weirdo psABI
then with some special version of that for it).
Though, this doesn't fail right now even on x86_64 if temporarily
enabled in the header, so it really should be extended with all the cases
which are known to fail, and extended again with each extension fix
(ideally in the same test file for things changed at once, and in different
files for separate fixes; that is why the macro is in a header file
that can be included by multiple tests).

BTW, for _BitInt test creation, I'm using the attached proglet to give me
pseudorandom unsigned _BitInt values of the desired precision.

2025-05-20  Jakub Jelinek  

* gcc.dg/bitintext.h: New file.
* gcc.dg/torture/bitint-82.c: New test.

--- gcc/testsuite/gcc.dg/bitintext.h.jj 2025-05-20 16:45:26.017419463 +0200
+++ gcc/testsuite/gcc.dg/bitintext.h2025-05-20 17:19:32.610605951 +0200
@@ -0,0 +1,23 @@
+/* Macro to test whether (on targets where psABI requires it) _BitInt
+   with padding bits have those filled with sign or zero extension.  */
+#if defined(__s390x__) || defined(__arm__) || defined(__loongarch__)
+#define BEXTC(x) \
+  do { \
+if ((typeof (x)) -1 < 0)   \
+  {\
+   _BitInt(sizeof (x) * __CHAR_BIT__) __x; \
+   __builtin_memcpy (&__x, &(x), sizeof (__x));\
+   if (__x != (x)) \
+ __builtin_abort ();   \
+  }\
+else   \
+  {\
+   unsigned _BitInt(sizeof (x) * __CHAR_BIT__) __x;\
+   __builtin_memcpy (&__x, &(x), sizeof (__x));\
+   if (__x != (x)) \
+ __builtin_abort ();   \
+  }\
+  } while (0)
+#else
+#define BEXTC(x) do { (void) (x); } while (0)
+#endif
--- gcc/testsuite/gcc.dg/torture/bitint-82.c.jj 2025-05-20 16:53:31.380827655 
+0200
+++ gcc/testsuite/gcc.dg/torture/bitint-82.c2025-05-20 17:16:38.715970734 
+0200
@@ -0,0 +1,85 @@
+/* { dg-do run { target bitint } } */
+/* { dg-options "-std=c23 -pedantic-errors" } */
+/* { dg-skip-if "" { ! run_expensive_tests }  { "*" } { "-O0" "-O2" } } */
+/* { dg-skip-if "" { ! run_expensive_tests } { "-flto" } { "" } } */
+
+#if __BITINT_MAXWIDTH__ >= 532
+_BitInt(5) a = 2, b = -2;
+_BitInt(38) c = 12345, d = -12345;
+_BitInt(129) e = 147090211948845388976606115811401318743wb, f = 
-147090211948845388976606115811401318743wb;
+_BitInt(532) g = 34476769918317100226195145251004381172591594205376273814wb, h 
= -102116935649428556311918486808926113041433456371844211259677321wb;
+unsigned _BitInt(1) i = 1;
+unsigned _BitInt(17) j = 49127uwb;
+unsigned _BitInt(60) k = 588141367522129848uwb;
+unsigned _BitInt(205) l = 
33991671979236490040668305838261113909013362173682935296620088uwb;
+#endif
+
+#include "../bitintext.h"
+
+#if __BITINT_MAXWIDTH__ >= 532
+[[gnu::noipa]] _BitInt(217)
+f1 (_BitInt(9) a, unsigned _BitInt(12) b, _BitInt(36) c, unsigned _BitInt(105) 
d,
+_BitInt(135) e, unsigned _BitInt(168) f, _BitInt(207) g, _BitInt(207) h,
+unsigned _BitInt(531) i, _BitInt(36) j)
+{
+  BEXTC (a); BEXTC (b); BEXTC (c); BEXTC (d);
+  BEXTC (e); BEXTC (f); BEXTC (g); BEXTC (h);
+  BEXTC (i); BEXTC (j);
+  _BitInt(9) k = a + 1;
+  unsigned _BitInt(12) l = b - a;
+  _BitInt(36) m = c * j;
+  unsigned _BitInt(105) n = d >> (-2 * j);
+  _BitInt(135) o = e | -j;
+  unsigned _BitInt(168) p = f & 101010101010101010101010uwb;
+  _BitInt(207) q = g * j;
+  _BitInt(207) r = g + h;
+  unsigned _BitInt(531) s = i / j;
+  BEXTC (k); BEXTC (l); BEXTC (m); BEXTC (n);
+  BEXTC (o); BEXTC (p); BEXTC (q); BEXTC (r);
+  BEXTC (s);
+  unsigned _BitInt(105) t = d << (38 - j);
+  BEXTC (t);
+  return a + 4;
+}
+#endif
+
+int
+main ()
+{
+#if __BITINT_MAXWIDTH__ >= 532
+  BEXTC (a); BEXTC (b);
+  BEXTC (c); BEXTC (d);
+  BEXTC (e); BEXTC (f);
+  BEXTC (g); BEXTC (h);
+  BEXTC (i);
+  BEXTC (j);
+  BEXTC (k);
+  BEXTC (l);
+  {
+_BitInt(5) a = 2, b = -2;
+_BitInt(38) c = 12345, d = -12345;
+_BitInt(129) e = 147090211948845388976606115811401318743wb, f = 
-147090211948845388976606115811401318743wb;
+_BitInt(532) g = 
34476769918317100226195145251004381172591594205376273814wb, h = 
-102116935649428556311918486808926113041433456371844211259677321wb;
+unsigned _BitInt(1) i = 1;
+unsigned _BitInt(17) j = 49127uwb;
+unsigned _BitInt

[PATCH] fortran: add constant input support for trig functions with half-revolutions

2025-05-20 Thread Yuao Ma
Sorry, the previous patch had some issues with the test case. Please refer to 
the updated version, which resolves the problem.


From: Yuao Ma 
Sent: Tuesday, May 20, 2025 23:54
To: gcc-patches@gcc.gnu.org ; GCC Fortran 
; tbur...@baylibre.com 
Subject: [PATCH] fortran: add constant input support for trig functions with 
half-revolutions

Hi all,

This patch introduces constant input support for trigonometric functions,
including those involving half-revolutions. Both valid and invalid inputs have
been thoroughly tested, as have mpfr versions greater than or equal to 4.2 and
less than 4.2.

Inspired by Steve's previous work, this patch also fixes subtle bugs revealed
by newly added test cases.

If this patch is merged, I plan to work on middle-end optimization support for
previously added GCC built-ins and libgfortran intrinsics.

Best regards,
Yuao



0001-fortran-add-constant-input-support-for-trig-function.patch
Description: 0001-fortran-add-constant-input-support-for-trig-function.patch


Re: [PATCH v5] libstdc++: Implement C++23 P1659R3 starts_with and ends_with

2025-05-20 Thread Patrick Palka
On Tue, 20 May 2025, Tomasz Kaminski wrote:

> I think I do not have any more suggestions for cases to check, so the impl 
> LGTM.

It's cool how many optimizations we came up with for this algorithm :)

> 
> On Tue, May 20, 2025 at 4:33 PM Patrick Palka  wrote:
>   Changes in v5:
>     * dispatch to starts_with for the both-bidi/common range case
> 
>   Changes in v4:
>     * optimize the both-bidi/common ranges case, as suggested by
>       Tomasz
>     * add tests for that code path
> 
>   Changes in v3:
>     * Use the forward_range code path for a (non-sized) bidirectional
>       haystack, since it's slightly fewer increments/decrements
>       overall.
>     * Fix wrong iter_difference_t cast in starts_with.
> 
>   Changes in v2:
>     Addressed Tomasz's review comments, namely:
>     * Added explicit iter_difference_t casts
>     * Made _S_impl member private
>     * Optimized sized bidirectional case of ends_with
>     * Rearranged control flow of starts_with::_S_impl
> 
>   Still left to do:
>     * Add tests for integer-class types
>     * Still working on a better commit description ;)
> 
>   -- >8 --
> 
>   libstdc++-v3/ChangeLog:
> 
>           * include/bits/ranges_algo.h (__starts_with_fn, starts_with):
>           Define.
>           (__ends_with_fn, ends_with): Define.
>           * include/bits/version.def (ranges_starts_ends_with): Define.
>           * include/bits/version.h: Regenerate.
>           * include/std/algorithm: Provide 
> __cpp_lib_ranges_starts_ends_with.
>           * src/c++23/std.cc.in (ranges::starts_with): Export.
>           (ranges::ends_with): Export.
>           * testsuite/25_algorithms/ends_with/1.cc: New test.
>           * testsuite/25_algorithms/starts_with/1.cc: New test.
>   ---
>    libstdc++-v3/include/bits/ranges_algo.h       | 247 ++
>    libstdc++-v3/include/bits/version.def         |   8 +
>    libstdc++-v3/include/bits/version.h           |  10 +
>    libstdc++-v3/include/std/algorithm            |   1 +
>    libstdc++-v3/src/c++23/std.cc.in              |   4 +
>    .../testsuite/25_algorithms/ends_with/1.cc    | 135 ++
>    .../testsuite/25_algorithms/starts_with/1.cc  | 128 +
>    7 files changed, 533 insertions(+)
>    create mode 100644 libstdc++-v3/testsuite/25_algorithms/ends_with/1.cc
>    create mode 100644 
> libstdc++-v3/testsuite/25_algorithms/starts_with/1.cc
> 
>   diff --git a/libstdc++-v3/include/bits/ranges_algo.h 
> b/libstdc++-v3/include/bits/ranges_algo.h
>   index f36e7dd59911..60f7bf841f3f 100644
>   --- a/libstdc++-v3/include/bits/ranges_algo.h
>   +++ b/libstdc++-v3/include/bits/ranges_algo.h
>   @@ -438,6 +438,253 @@ namespace ranges
> 
>      inline constexpr __search_n_fn search_n{};
> 
>   +#if __glibcxx_ranges_starts_ends_with // C++ >= 23
>   +  struct __starts_with_fn
>   +  {
>   +    template _Sent1,
>   +            input_iterator _Iter2, sentinel_for<_Iter2> _Sent2,
>   +            typename _Pred = ranges::equal_to,
>   +            typename _Proj1 = identity, typename _Proj2 = identity>
>   +      requires indirectly_comparable<_Iter1, _Iter2, _Pred, _Proj1, 
> _Proj2>
>   +      constexpr bool
>   +      operator()(_Iter1 __first1, _Sent1 __last1,
>   +                _Iter2 __first2, _Sent2 __last2, _Pred __pred = {},
>   +                _Proj1 __proj1 = {}, _Proj2 __proj2 = {}) const
>   +      {
>   +       iter_difference_t<_Iter1> __n1 = -1;
>   +       iter_difference_t<_Iter2> __n2 = -1;
>   +       if constexpr (sized_sentinel_for<_Sent1, _Iter1>)
>   +         __n1 = __last1 - __first1;
>   +       if constexpr (sized_sentinel_for<_Sent2, _Iter2>)
>   +         __n2 = __last2 - __first2;
>   +       return _S_impl(std::move(__first1), __last1, __n1,
>   +                      std::move(__first2), __last2, __n2,
>   +                      std::move(__pred),
>   +                      std::move(__proj1), std::move(__proj2));
>   +      }
>   +
>   +    template   +            typename _Pred = ranges::equal_to,
>   +            typename _Proj1 = identity, typename _Proj2 = identity>
>   +      requires indirectly_comparable, 
> iterator_t<_Range2>,
>   +                                    _Pred, _Proj1, _Proj2>
>   +      constexpr bool
>   +      operator()(_Range1&& __r1, _Range2&& __r2, _Pred __pred = {},
>   +                _Proj1 __proj1 = {}, _Proj2 __proj2 = {}) const
>   +      {
>   +       range_difference_t<_Range1> __n1 = -1;
>   +       range_difference_t<_Range1> __n2 = -1;
>   +       if constexpr (sized_range<_Range1>)
>   +         __n1 = ranges::size(__r1);
>   +       if constexpr (sized_range<_Range2>)
>

[PATCH v6] libstdc++: Implement C++23 P1659R3 starts_with and ends_with

2025-05-20 Thread Patrick Palka
Changes in v6:
  * dispatch to starts_with for the bidi haystack + random access
needle case too

Changes in v5:
  * dispatch to starts_with for the both-bidi/common range case

Changes in v4:
  * optimize the both-bidi/common ranges case, as suggested by
Tomasz
  * add tests for that code path

Changes in v3:
  * Use the forward_range code path for a (non-sized) bidirectional
haystack, since it's slightly fewer increments/decrements
overall.
  * Fix wrong iter_difference_t cast in starts_with.

Changes in v2:
  Addressed Tomasz's review comments, namely:
  * Added explicit iter_difference_t casts
  * Made _S_impl member private
  * Optimized sized bidirectional case of ends_with
  * Rearranged control flow of starts_with::_S_impl

Still left to do:
  * Add tests for integer-class types
  * Still working on a better commit description ;)
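
For readers skimming the change list, the two basic code paths (the mismatch
fallback when sizes are unknown, and the size check followed by an exact-length
comparison for sized random-access input) can be sketched with the classic
iterator algorithms as below. This is only an illustration, not the patch:
starts_with_unsized, starts_with_sized and demo_checks are hypothetical names,
and projections, predicates and sentinels are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Fallback path: walk both sequences together.  The needle is a prefix
// exactly when the walk stopped because the needle was exhausted.
template<typename It1, typename It2>
bool starts_with_unsized(It1 first1, It1 last1, It2 first2, It2 last2)
{
  return std::mismatch(first1, last1, first2, last2).second == last2;
}

// Sized random-access path: a needle longer than the haystack can never
// match, otherwise compare exactly n2 leading elements.
template<typename It1, typename It2>
bool starts_with_sized(It1 first1, It1 last1, It2 first2, It2 last2)
{
  const auto n1 = last1 - first1;
  const auto n2 = last2 - first2;
  if (n2 == 0)
    return true;
  if (n1 < n2)
    return false;
  return std::equal(first1, first1 + n2, first2, last2);
}

// Small self-check showing that both paths agree on simple inputs.
inline void demo_checks()
{
  const std::string hay = "ranges_algo.h";
  const std::string yes = "ranges";
  const std::string no = "range_";
  assert(starts_with_unsized(hay.begin(), hay.end(), yes.begin(), yes.end()));
  assert(starts_with_sized(hay.begin(), hay.end(), yes.begin(), yes.end()));
  assert(!starts_with_unsized(hay.begin(), hay.end(), no.begin(), no.end()));
  assert(!starts_with_sized(hay.begin(), hay.end(), no.begin(), no.end()));
}
```
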

-- >8 --

libstdc++-v3/ChangeLog:

* include/bits/ranges_algo.h (__starts_with_fn, starts_with):
Define.
(__ends_with_fn, ends_with): Define.
* include/bits/version.def (ranges_starts_ends_with): Define.
* include/bits/version.h: Regenerate.
* include/std/algorithm: Provide __cpp_lib_ranges_starts_ends_with.
* src/c++23/std.cc.in (ranges::starts_with): Export.
(ranges::ends_with): Export.
* testsuite/25_algorithms/ends_with/1.cc: New test.
* testsuite/25_algorithms/starts_with/1.cc: New test.
---
 libstdc++-v3/include/bits/ranges_algo.h   | 247 ++
 libstdc++-v3/include/bits/version.def |   8 +
 libstdc++-v3/include/bits/version.h   |  10 +
 libstdc++-v3/include/std/algorithm|   1 +
 libstdc++-v3/src/c++23/std.cc.in  |   4 +
 .../testsuite/25_algorithms/ends_with/1.cc| 135 ++
 .../testsuite/25_algorithms/starts_with/1.cc  | 128 +
 7 files changed, 533 insertions(+)
 create mode 100644 libstdc++-v3/testsuite/25_algorithms/ends_with/1.cc
 create mode 100644 libstdc++-v3/testsuite/25_algorithms/starts_with/1.cc

diff --git a/libstdc++-v3/include/bits/ranges_algo.h 
b/libstdc++-v3/include/bits/ranges_algo.h
index f36e7dd59911..d94df9d29547 100644
--- a/libstdc++-v3/include/bits/ranges_algo.h
+++ b/libstdc++-v3/include/bits/ranges_algo.h
@@ -438,6 +438,253 @@ namespace ranges
 
   inline constexpr __search_n_fn search_n{};
 
+#if __glibcxx_ranges_starts_ends_with // C++ >= 23
+  struct __starts_with_fn
+  {
+template _Sent1,
+input_iterator _Iter2, sentinel_for<_Iter2> _Sent2,
+typename _Pred = ranges::equal_to,
+typename _Proj1 = identity, typename _Proj2 = identity>
+  requires indirectly_comparable<_Iter1, _Iter2, _Pred, _Proj1, _Proj2>
+  constexpr bool
+  operator()(_Iter1 __first1, _Sent1 __last1,
+_Iter2 __first2, _Sent2 __last2, _Pred __pred = {},
+_Proj1 __proj1 = {}, _Proj2 __proj2 = {}) const
+  {
+   iter_difference_t<_Iter1> __n1 = -1;
+   iter_difference_t<_Iter2> __n2 = -1;
+   if constexpr (sized_sentinel_for<_Sent1, _Iter1>)
+ __n1 = __last1 - __first1;
+   if constexpr (sized_sentinel_for<_Sent2, _Iter2>)
+ __n2 = __last2 - __first2;
+   return _S_impl(std::move(__first1), __last1, __n1,
+  std::move(__first2), __last2, __n2,
+  std::move(__pred),
+  std::move(__proj1), std::move(__proj2));
+  }
+
+template
+  requires indirectly_comparable, iterator_t<_Range2>,
+_Pred, _Proj1, _Proj2>
+  constexpr bool
+  operator()(_Range1&& __r1, _Range2&& __r2, _Pred __pred = {},
+_Proj1 __proj1 = {}, _Proj2 __proj2 = {}) const
+  {
+   range_difference_t<_Range1> __n1 = -1;
+   range_difference_t<_Range1> __n2 = -1;
+   if constexpr (sized_range<_Range1>)
+ __n1 = ranges::size(__r1);
+   if constexpr (sized_range<_Range2>)
+ __n2 = ranges::size(__r2);
+   return _S_impl(ranges::begin(__r1), ranges::end(__r1), __n1,
+  ranges::begin(__r2), ranges::end(__r2), __n2,
+  std::move(__pred),
+  std::move(__proj1), std::move(__proj2));
+  }
+
+  private:
+template
+  static constexpr bool
+  _S_impl(_Iter1 __first1, _Sent1 __last1, iter_difference_t<_Iter1> __n1,
+ _Iter2 __first2, _Sent2 __last2, iter_difference_t<_Iter2> __n2,
+ _Pred __pred, _Proj1 __proj1, _Proj2 __proj2)
+  {
+   if (__first2 == __last2) [[unlikely]]
+ return true;
+   else if (__n1 == -1 || __n2 == -1)
+ return ranges::mismatch(std::move(__first1), __last1,
+ std::move(__first2), __last2,
+ std::move(__pred),
+ std::move(__proj1), std::move(__proj2)).in2 
== __last2;
+   else if (__n1 < __n2)
+ return false;
+   

Re: [PATCH v2 1/1] Add warnings of potentially-uninitialized padding bits

2025-05-20 Thread Joseph Myers
On Tue, 20 May 2025, Christopher Bazley wrote:

> + if (!cleared)
> +   {
> + if (complete_p.padded_non_union
> + && warn_zero_init_padding_bits >= ZERO_INIT_PADDING_BITS_ALL)
> +   {
> + warning (OPT_Wzero_init_padding_bits_,
> +  "Padding bits might not be initialized to zero; "
> +  "consider using %<-fzero-init-padding-bits=all%>");
> +   }
> + else if (complete_p.padded_union
> +  && warn_zero_init_padding_bits
> + >= ZERO_INIT_PADDING_BITS_UNIONS)
> +   {
> + warning (OPT_Wzero_init_padding_bits_,
> +  "Padding bits might not be initialized to zero; "
> +  "consider using %<-fzero-init-padding-bits=unions%> "
> +  "or %<-fzero-init-padding-bits=all%>");

Diagnostics should start with a lowercase letter.

If there's a meaningful location available for the initialization, then 
warning_at (passing an explicit location) is preferred to warning.

-- 
Joseph S. Myers
josmy...@redhat.com



Re: [PATCH 2/2] RISC-V: Add testcases for signed vector SAT_ADD IMM form 1

2025-05-20 Thread Jeff Law




On 5/19/25 2:42 AM, Li Xu wrote:

From: xuli 

This patch adds testcases for form 1, as shown below:

void __attribute__((noinline))   \
vec_sat_s_add_imm_##T##_fmt_1##_##INDEX (T *out, T *op_1, unsigned limit) \
{\
   unsigned i;\
   for (i = 0; i < limit; i++)\
 {\
   T x = op_1[i]; \
   T sum = (UT)x + (UT)IMM;   \
   out[i] = (x ^ IMM) < 0 \
 ? sum\
 : (sum ^ x) >= 0 \
   ? sum  \
   : x < 0 ? MIN : MAX;   \
 }\
}

Passed the rv64gcv regression test.

Signed-off-by: Li Xu 
gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/sat/vec_sat_arith.h: add signed vec 
SAT_ADD IMM form1.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_data.h: add sat_s_add_imm 
data.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add_imm-1-i16.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add_imm-1-i32.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add_imm-1-i64.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add_imm-1-i8.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add_imm-run-1-i16.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add_imm-run-1-i32.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add_imm-run-1-i64.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add_imm-run-1-i8.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add_imm_type_check-1-i16.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add_imm_type_check-1-i32.c: New test.
* gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add_imm_type_check-1-i8.c: New test.
Looks reasonably sensible.  But I'll defer to Pan here since he's done 
*far* more work than I in this space.
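
For anyone who wants to poke at the form 1 pattern outside the testsuite, here is a minimal sketch with the macro expanded by hand for T = int8_t and UT = uint8_t. The function names and the choice of IMM = 9 are assumptions made for illustration; the real tests generate their functions from the macros in vec_sat_arith.h. The narrowing conversion back to int8_t relies on GCC's modulo behaviour.

```c
#include <assert.h>
#include <stdint.h>

/* IMM = 9 is an arbitrary immediate picked for this sketch.  */
#define IMM 9

/* Hand-expanded form 1 loop for T = int8_t, UT = uint8_t.
   The name is hypothetical, it is not one of the testsuite functions.  */
void
vec_sat_s_add_imm_i8_sketch (int8_t *out, const int8_t *op_1, unsigned limit)
{
  unsigned i;
  for (i = 0; i < limit; i++)
    {
      int8_t x = op_1[i];
      /* Wrapping add done in the unsigned type, then narrowed back.  */
      int8_t sum = (int8_t) ((uint8_t) x + (uint8_t) IMM);
      out[i] = (x ^ IMM) < 0      /* signs differ: cannot overflow */
	? sum
	: (sum ^ x) >= 0          /* sign unchanged: no overflow */
	  ? sum
	  : x < 0 ? INT8_MIN : INT8_MAX;
    }
}

/* Small self-check: values close to INT8_MAX saturate, others add normally.  */
void
check_vec_sat_s_add_imm_i8_sketch (void)
{
  const int8_t in[4] = { 0, 100, 120, -128 };
  int8_t out[4];
  vec_sat_s_add_imm_i8_sketch (out, in, 4);
  assert (out[0] == 9);
  assert (out[1] == 109);
  assert (out[2] == INT8_MAX);
  assert (out[3] == -119);
}
```

Calling check_vec_sat_s_add_imm_i8_sketch () from a test driver exercises both the saturating and the non-saturating paths.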


jeff



Re: [PATCH v2 2/2] MIPS p8700 doesn't have vector extension and added the dummies reservation for the same.

2025-05-20 Thread Jeff Law




On 5/19/25 1:03 AM, Umesh Kalappa wrote:

---
  gcc/config/riscv/mips-p8700.md | 28 
  1 file changed, 28 insertions(+)

I've pushed this to the trunk as well.

Thanks,
jeff



Re: [PATCH v2 1/2] The following changes enable P8700 processor for RISCV and P8700 is a high-performance processor from MIPS by extending RISCV with custom instructions.

2025-05-20 Thread Jeff Law




On 5/19/25 1:02 AM, Umesh Kalappa wrote:

---
  gcc/config/riscv/mips-p8700.md   | 139 +++
  gcc/config/riscv/riscv-cores.def |   5 ++
  gcc/config/riscv/riscv-opts.h|   3 +-
  gcc/config/riscv/riscv.cc|  22 +
  gcc/config/riscv/riscv.md|   3 +-
  5 files changed, 170 insertions(+), 2 deletions(-)
  create mode 100644 gcc/config/riscv/mips-p8700.md
Thanks.  I added the new cpu/tune options to the documentation in 
doc/invoke.texi.


Going forward make sure to create a git commit message as well as a 
ChangeLog entry.  You can look in the git log to see examples of commit 
messages.  The ChangeLog entry should be part of the commit message as 
we use scripting to create the ChangeLog file from the git commit messages.


Jeff



Re: [PATCH 2/2] RISC-V:Add testcases for signed .SAT_ADD IMM form 1 with IMM = -1.

2025-05-20 Thread Jeff Law




On 5/19/25 2:41 AM, Li Xu wrote:

From: xuli 

This patch adds testcases for form 1, as shown below:

T __attribute__((noinline))  \
sat_s_add_imm_##T##_fmt_1##_##INDEX (T x) \
{\
   T sum = (UT)x + (UT)IMM; \
   return (x ^ IMM) < 0 \
 ? sum\
 : (sum ^ x) >= 0 \
   ? sum  \
   : x < 0 ? MIN : MAX;   \
}

Passed the rv64gcv regression test.

Signed-off-by: Li Xu 

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat/sat_s_add_imm-2.c: Move to...
* gcc.target/riscv/sat/sat_s_add_imm-1-i16.c: ...here.
* gcc.target/riscv/sat/sat_s_add_imm-3.c: Move to...
* gcc.target/riscv/sat/sat_s_add_imm-1-i32.c: ...here.
* gcc.target/riscv/sat/sat_s_add_imm-4.c: Move to...
* gcc.target/riscv/sat/sat_s_add_imm-1-i64.c: ...here.
* gcc.target/riscv/sat/sat_s_add_imm-1.c: Move to...
* gcc.target/riscv/sat/sat_s_add_imm-1-i8.c: ...here.
* gcc.target/riscv/sat/sat_s_add_imm-run-2.c: Move to...
* gcc.target/riscv/sat/sat_s_add_imm-run-1-i16.c: ...here.
* gcc.target/riscv/sat/sat_s_add_imm-run-3.c: Move to...
* gcc.target/riscv/sat/sat_s_add_imm-run-1-i32.c: ...here.
* gcc.target/riscv/sat/sat_s_add_imm-run-4.c: Move to...
* gcc.target/riscv/sat/sat_s_add_imm-run-1-i64.c: ...here.
* gcc.target/riscv/sat/sat_s_add_imm-run-1.c: Move to...
* gcc.target/riscv/sat/sat_s_add_imm-run-1-i8.c: ...here.
* gcc.target/riscv/sat/sat_s_add_imm-2-1.c: Move to...
* gcc.target/riscv/sat/sat_s_add_imm_type_check-1-i16.c: ...here.
* gcc.target/riscv/sat/sat_s_add_imm-3-1.c: Move to...
* gcc.target/riscv/sat/sat_s_add_imm_type_check-1-i32.c: ...here.
* gcc.target/riscv/sat/sat_s_add_imm-1-1.c: Move to...
* gcc.target/riscv/sat/sat_s_add_imm_type_check-1-i8.c: ...here.
So for the tests, why are we forcing matching of the assembly code for 
the entire function?  That makes for a fragile test as we may 
change various aspects of code generation over time.


If the point of the patch is to detect SAT_ADD in more cases, then the 
better and more stable test is to verify the existence of SAT_ADD the 
appropriate number of times in the .optimized dump.
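
A rough sketch of what such a dump-based test could look like, with form 1 expanded by hand for T = int8_t and IMM = -1. The function names, the option string and the expected count of 1 are assumptions for illustration, not taken from the patch; the check function at the end would belong in a companion run test rather than in the compile test itself.

```c
/* { dg-do compile } */
/* { dg-options "-march=rv64gc -mabi=lp64d -O2 -fdump-tree-optimized" } */

#include <assert.h>
#include <stdint.h>

/* Form 1 expanded by hand for T = int8_t, UT = uint8_t, IMM = -1.
   The name is hypothetical.  */
int8_t __attribute__((noinline))
sat_s_add_imm_i8_sketch (int8_t x)
{
  /* Wrapping add in the unsigned type; narrowing relies on GCC's
     modulo conversion.  */
  int8_t sum = (int8_t) ((uint8_t) x + (uint8_t) -1);
  return (x ^ -1) < 0
    ? sum
    : (sum ^ x) >= 0
      ? sum
      : x < 0 ? INT8_MIN : INT8_MAX;
}

/* Runtime sanity check of the expanded pattern.  */
void
check_sat_s_add_imm_i8_sketch (void)
{
  assert (sat_s_add_imm_i8_sketch (5) == 4);
  assert (sat_s_add_imm_i8_sketch (0) == -1);
  assert (sat_s_add_imm_i8_sketch (-5) == -6);
  assert (sat_s_add_imm_i8_sketch (INT8_MIN) == INT8_MIN);
}

/* Assumed expectation: exactly one .SAT_ADD in the optimized dump.  */
/* { dg-final { scan-tree-dump-times ".SAT_ADD " 1 "optimized" } } */
```

Checking the dump this way stays valid when register allocation or scheduling changes, which is the stability argument made above.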


IMHO we really don't want this kind of whole function assembly matching.

Pan, do you have any further comments here?  Do you have strong opinions 
on whether or not we want to be doing this kind of assembly output 
testing or not?



Jeff




Re: [PATCH 3/6] RISC-V: frm/mode-switch: remove dubious frm edge insertion before call_insn

2025-05-20 Thread Robin Dapp
Maybe I'm missing something there.  Particularly whether or not you can know 
anything about frm's value after a call has returned.  Normally the answer to 
this kind of question is a hard no.


AFAICT the main difference to standard mode switching is that we (ab)use it to 
set the rounding mode to the value it had initially, either at function entry 
or after a call.  That's different to regular mode switching which assumes 
"static" rounding modes for different instructions.


Standard could e.g. be:
- insn1 demands frm1
- call1 demands frm4
- call2 demands frm5

Whereas we have:
- insn1 demands frm1
- call1 demands "frm at the start of the function"
- call2 demands "frm after call1 that could have called fesetround"

And that's where the "backup" comes from.  We pretend to have a "static"
rounding mode that we can set before call1 or call2 but need to update
it after each call (or inline asm etc.).

Thus we read FRM after each call, or rather after the last call before
an FRM setter.
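
To make the "backup" idea concrete, here is a toy model in plain C. It is not GCC code: the global `frm` merely stands in for the rounding-mode CSR, and all function names are made up for this sketch.

```c
#include <assert.h>

/* Toy model, not GCC code: `frm` stands in for the rounding-mode CSR.  */
static int frm;

/* A callee that may change the rounding mode, like one calling fesetround.  */
static void
callee_sets_frm (int mode)
{
  frm = mode;
}

/* Models an insn that demands a static rounding mode, followed by a
   restore of the ambient mode that was backed up earlier.  */
static int
run_with_static_frm (int demanded, int backup)
{
  frm = demanded;	/* mode switch for the insn */
  int used = frm;	/* the insn executes under `demanded` */
  frm = backup;		/* restore before the next call or return */
  return used;
}

int
frm_backup_demo (void)
{
  frm = 0;			/* mode at function entry */
  int backup = frm;		/* backup taken at entry */

  callee_sets_frm (3);		/* call1 may have changed frm ... */
  backup = frm;			/* ... so the backup is re-read after it */

  int used = run_with_static_frm (1, backup);
  assert (used == 1);
  return frm;			/* ambient mode as left by call1 */
}
```

In this model the function ends with the mode call1 left behind, not the entry mode, which is why the backup has to be refreshed after each call.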

--
Regards
Robin



[PATCH 1/2] [APX CFCMOV] Support APX CFCMOV in if_convert pass

2025-05-20 Thread Hongyu Wang
From: Lingling Kong 

Hi,

The APX CFCMOV feature implements conditional faulting, which means
that all memory faults are suppressed when the condition code
evaluates to false while loading or storing a memory operand. A
conditional move can therefore now load or store a memory operand
that may trap or fault.

In the middle-end, we currently don't support a conditional move if
a load from A or B could trap or fault. To enable CFCMOV, we use
mask_load and mask_store as a proxy for the backend expander. The
predicate of mask_load/mask_store is recognized as a comparison rtx
in the initial implementation.

For a conditional memory store, the fault-suppressing conditional
move does not move any arithmetic calculations. For a conditional
memory load, we currently only support a conditional move where one
operand is a memory reference that may trap and the other operand
neither traps nor is a memory reference.
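
As a source-level illustration of the shapes in question, here is a minimal sketch of a conditional store and a conditional load. The function names are made up for this example, and whether a given compiler actually if-converts them into CFCMOV is not something this sketch asserts.

```c
#include <assert.h>
#include <stddef.h>

/* "if (test) *x = a; else skip;": candidate for a masked store.  */
void
cond_store (int test, int *x, int a)
{
  if (test)
    *x = a;
}

/* "if (test) x = *a; else x = b;": candidate for a masked load.  */
int
cond_load (int test, const int *a, int b)
{
  int x;
  if (test)
    x = *a;
  else
    x = b;
  return x;
}

/* The NULL cases show why the memory access must not be executed
   unconditionally unless faults are suppressed.  */
void
check_cond_forms (void)
{
  int v = 1;
  cond_store (0, NULL, 7);	/* store is skipped, must not fault */
  cond_store (1, &v, 7);
  assert (v == 7);
  assert (cond_load (0, NULL, 5) == 5);	/* load is skipped, must not fault */
  assert (cond_load (1, &v, 5) == 7);
}
```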

As Richard suggested, we postponed the patch to GCC 16, and we hope
someone who is more familiar with the RTL ifcvt pass can help review the
implementation.

Bootstrapped & regtested on x86_64-pc-linux-gnu and aarch64-linux-gnu.

gcc/ChangeLog:

* ifcvt.cc (can_use_mask_load_store): New function to check
whether conditional faulting load/store can be used.
(noce_try_cmove_arith): Relax the condition for operand
may_trap_or_fault check, expand with mask_load/mask_store optab
when one of the cmove operands may trap or fault.
(noce_process_if_block): Allow trap_or_fault dest for
"if (...) *x = a; else skip" scenario when mask_store optab is
available.
* optabs.h (emit_mask_load_store): New declaration.
* optabs.cc (emit_mask_load_store): New function to emit
conditional move with mask_load/mask_store optab.
---
 gcc/ifcvt.cc  | 110 ++
 gcc/optabs.cc | 103 ++
 gcc/optabs.h  |   3 ++
 3 files changed, 200 insertions(+), 16 deletions(-)

diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc
index a0c6575e4e4..1337495f7c4 100644
--- a/gcc/ifcvt.cc
+++ b/gcc/ifcvt.cc
@@ -785,6 +785,7 @@ static bool noce_try_store_flag_mask (struct noce_if_info *);
 static rtx noce_emit_cmove (struct noce_if_info *, rtx, enum rtx_code, rtx,
rtx, rtx, rtx, rtx = NULL, rtx = NULL);
 static bool noce_try_cmove (struct noce_if_info *);
+static bool can_use_mask_load_store (struct noce_if_info *);
 static bool noce_try_cmove_arith (struct noce_if_info *);
 static rtx noce_get_alt_condition (struct noce_if_info *, rtx, rtx_insn **);
 static bool noce_try_minmax (struct noce_if_info *);
@@ -2136,6 +2137,39 @@ noce_emit_bb (rtx last_insn, basic_block bb, bool simple)
   return true;
 }
 
+/* Return TRUE if backend supports scalar maskload_optab
+   or maskstore_optab, who suppresses memory faults when trying to
+   load or store a memory operand and the condition code evaluates
+   to false.
+   Currently the following forms
+   "if (test) *x = a; else skip;" --> mask_store
+   "if (test) x = *a; else x = b;" --> mask_load
+   "if (test) x = a; else x = *b;" --> mask_load
+   are supported.  */
+
+static bool
+can_use_mask_load_store (struct noce_if_info *if_info)
+{
+  rtx b = if_info->b;
+  rtx x = if_info->x;
+  rtx cond = if_info->cond;
+
+  if (MEM_P (x))
+{
+  if (convert_optab_handler (maskstore_optab, GET_MODE (x),
+GET_MODE (cond)) == CODE_FOR_nothing)
+   return false;
+
+  if (!rtx_equal_p (x, b) || !may_trap_or_fault_p (x))
+   return false;
+
+  return true;
+}
+  else
+return convert_optab_handler (maskload_optab, GET_MODE (x),
+ GET_MODE (cond)) != CODE_FOR_nothing;
+}
+
 /* Try more complex cases involving conditional_move.  */
 
 static bool
@@ -2155,6 +2189,9 @@ noce_try_cmove_arith (struct noce_if_info *if_info)
   enum rtx_code code;
   rtx cond = if_info->cond;
   rtx_insn *ifcvt_seq;
+  bool a_may_trap_or_fault = may_trap_or_fault_p (a);
+  bool b_may_trap_or_fault = may_trap_or_fault_p (b);
+  bool use_mask_load_store = false;
 
   /* A conditional move from two memory sources is equivalent to a
  conditional on their addresses followed by a load.  Don't do this
@@ -2171,11 +2208,22 @@ noce_try_cmove_arith (struct noce_if_info *if_info)
   x = gen_reg_rtx (address_mode);
   is_mem = true;
 }
-
-  /* ??? We could handle this if we knew that a load from A or B could
- not trap or fault.  This is also true if we've already loaded
- from the address along the path from ENTRY.  */
-  else if (may_trap_or_fault_p (a) || may_trap_or_fault_p (b))
+  /* We could not handle the case that a and b may both trap or
+ fault.  */
+  else if (a_may_trap_or_fault && b_may_trap_or_fault)
+return false;
+  /* Scalar maskload_optab/maskstore_optab implies conditionally
+ faulting, which means that if the condition mask evaluates to
+ false, all memory faults are suppressed when load or store a
+ memory operand. So if scalar_mask_load o

[PATCH 2/2] [APX CFCMOV] Support APX CFCMOV in backend

2025-05-20 Thread Hongyu Wang
From: Lingling Kong 

gcc/ChangeLog:

* config/i386/i386-expand.cc (ix86_expand_int_cfmovcc):  Expand
to cfcmov pattern.
* config/i386/i386-opts.h (enum apx_features): New.
* config/i386/i386-protos.h (ix86_expand_int_cfmovcc): Define.
* config/i386/i386.cc (ix86_rtx_costs): Add UNSPEC_APX_CFCMOV
cost.
* config/i386/i386.h (TARGET_APX_CFCMOV): Define.
* config/i386/i386.md (maskload): New define_expand.
(maskstore): Ditto.
(*cfmovcc): New define_insn.
(*cfmovcc_2): Ditto.
(*cfmovccz): Ditto.
(UNSPEC_APX_CFCMOV): New unspec for cfcmov.
* config/i386/i386.opt: Add enum value for cfcmov.

gcc/testsuite/ChangeLog:

* gcc.target/i386/apx-cfcmov-1.c: New test.
* gcc.target/i386/apx-cfcmov-2.c: Ditto.
---
 gcc/config/i386/i386-expand.cc   | 46 
 gcc/config/i386/i386-opts.h  |  4 +-
 gcc/config/i386/i386-protos.h|  1 +
 gcc/config/i386/i386.cc  | 16 +++-
 gcc/config/i386/i386.h   |  1 +
 gcc/config/i386/i386.md  | 77 +++-
 gcc/config/i386/i386.opt |  3 +
 gcc/testsuite/gcc.target/i386/apx-cfcmov-1.c | 73 +++
 gcc/testsuite/gcc.target/i386/apx-cfcmov-2.c | 40 ++
 9 files changed, 255 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/apx-cfcmov-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/apx-cfcmov-2.c

diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
index 7fd03c88630..7f423f5eb65 100644
--- a/gcc/config/i386/i386-expand.cc
+++ b/gcc/config/i386/i386-expand.cc
@@ -3535,6 +3535,52 @@ ix86_expand_int_addcc (rtx operands[])
   return true;
 }
 
+void
+ix86_expand_int_cfmovcc (rtx dest, rtx compare_op, rtx vtrue, rtx vfalse)
+{
+  machine_mode mode = GET_MODE(dest);
+  enum rtx_code code = GET_CODE (compare_op);
+  rtx_insn *compare_seq;
+  rtx op0 = XEXP (compare_op, 0);
+  rtx op1 = XEXP (compare_op, 1);
+  rtx op2 = vtrue;
+  rtx op3 = vfalse;
+
+  gcc_assert (may_trap_or_fault_p (op2) || may_trap_or_fault_p (op3));
+  /* For Conditional store only handle "if (test) *x = a; else skip;".  */
+  if (MEM_P (dest))
+gcc_assert (rtx_equal_p (dest, op3));
+
+  start_sequence ();
+  compare_op = ix86_expand_compare (code, op0, op1);
+  compare_seq = get_insns ();
+  end_sequence ();
+
+  if (may_trap_or_fault_p (op2))
+op2 = gen_rtx_UNSPEC (mode, gen_rtvec (1, op2),
+ UNSPEC_APX_CFCMOV);
+  if (may_trap_or_fault_p (op3))
+op3 = gen_rtx_UNSPEC (mode, gen_rtvec (1, op3),
+ UNSPEC_APX_CFCMOV);
+  emit_insn (compare_seq);
+  /* For "if (test) x = *a; else x = *b",generate 2 cfcmov.  */
+  if (may_trap_or_fault_p (op2) && may_trap_or_fault_p (op3))
+{
+  emit_insn (gen_rtx_SET (dest,
+ gen_rtx_IF_THEN_ELSE (mode, compare_op,
+   op2, dest)));
+  emit_insn (gen_rtx_SET (dest,
+ gen_rtx_IF_THEN_ELSE (mode, compare_op,
+   dest, op3)));
+}
+  /* For conditional load one mem, like "if (test) x = *a; else x = b/0."
+ and "if (test) x = b/0; else x = *b".  */
+  else
+emit_insn (gen_rtx_SET (dest,
+   gen_rtx_IF_THEN_ELSE (mode, compare_op,
+ op2, op3)));
+}
+
 bool
 ix86_expand_int_movcc (rtx operands[])
 {
diff --git a/gcc/config/i386/i386-opts.h b/gcc/config/i386/i386-opts.h
index d47184e2879..899873dfeca 100644
--- a/gcc/config/i386/i386-opts.h
+++ b/gcc/config/i386/i386-opts.h
@@ -144,8 +144,10 @@ enum apx_features {
   apx_nf = 1 << 4,
   apx_ccmp = 1 << 5,
   apx_zu = 1 << 6,
+  apx_cfcmov = 1 << 7,
   apx_all = apx_egpr | apx_push2pop2 | apx_ndd
-   | apx_ppx | apx_nf | apx_ccmp | apx_zu,
+   | apx_ppx | apx_nf | apx_ccmp | apx_zu
+   | apx_cfcmov,
 };
 
 #endif
diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index e85b925704b..02f2045d9d0 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -153,6 +153,7 @@ extern bool ix86_match_ccmode (rtx, machine_mode);
 extern bool ix86_match_ptest_ccmode (rtx);
 extern void ix86_expand_branch (enum rtx_code, rtx, rtx, rtx);
 extern void ix86_expand_setcc (rtx, enum rtx_code, rtx, rtx);
+extern void ix86_expand_int_cfmovcc (rtx, rtx, rtx, rtx);
 extern bool ix86_expand_int_movcc (rtx[]);
 extern bool ix86_expand_fp_movcc (rtx[]);
 extern bool ix86_expand_fp_vcond (rtx[]);
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 5cb66dadb43..867bb934121 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -23055,10 +23055,18 @@ ix86_rtx_costs (rtx x, machine_mode mode, int outer_code_i, int opno,
  *total

RE: [PUSHED] aarch64: Fix an oversight in aarch64_evpc_reencode

2025-05-20 Thread quic_pzheng
> Pengxuan Zheng  writes:
> > Some fields (e.g., zero_op0_p and zero_op1_p) of the struct "newd" may
> > be left uninitialized in aarch64_evpc_reencode. This can cause reading
> > of uninitialized data. I found this oversight when testing my patches
> > on and/fmov optimizations. This patch fixes the bug by zero initializing the
> > struct.
> >
> > Pushed as obvious after bootstrap/test on aarch64-linux-gnu.
> >
> > gcc/ChangeLog:
> >
> > * config/aarch64/aarch64.cc (aarch64_evpc_reencode): Zero initialize
> > newd.
> > ---
> >  gcc/config/aarch64/aarch64.cc | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> > index 2371541ef1b..c067e099d83 100644
> > --- a/gcc/config/aarch64/aarch64.cc
> > +++ b/gcc/config/aarch64/aarch64.cc
> > @@ -26246,7 +26246,7 @@ aarch64_evpc_trn (struct expand_vec_perm_d *d)
> >  static bool
> >  aarch64_evpc_reencode (struct expand_vec_perm_d *d)
> >  {
> > -  expand_vec_perm_d newd;
> > +  expand_vec_perm_d newd = {};
> 
> Wouldn't it be better to initialise the fields to useful values instead?
> Zeroness is carried over by reencoding, so I would expect:
> 
>   newd.zero_op0_p = d->zero_op0_p;
>   newd.zero_op1_p = d->zero_op1_p;
> 
> instead of the above.

Thanks for pointing this out, Richard! Here's the alternative fix you
suggested.
https://gcc.gnu.org/pipermail/gcc-patches/2025-May/684317.html

Please let me know if you have any other comments.

Thanks,
Pengxuan
> 
> Thanks,
> Richard
> 
> >
> >/* The subregs that we'd create are not supported for big-endian SVE;
> >   see aarch64_modes_compatible_p for details.  */



Re: [RFC PATCH 0/3] _BitInt(N) support for LoongArch

2025-05-20 Thread Yang Yujie
On Tue, May 20, 2025 at 09:55:04PM GMT, Xi Ruoyao wrote:
> On Tue, 2025-05-20 at 15:44 +0200, Jakub Jelinek wrote:
> > > Specifically, the tests told me to extend (thought "truncate"
> > > was kind of an equivalent word) the output of left shift, plus/minus,
> > 
> > Truncation is the exact opposite of extension.
> > I can understand the need for handling of left shifts, for all the rest
> > I'd really like to see testcases.
> 
> I guess the terminology thing is caused by the past experience of the
> Loongson team with "a famous !TARGET_TRULY_NOOP_TRUNCATION target."  On
> that target truncsidi2 is a sign-extension as required by the ISA spec.
> 
> I'm trying to fix an ext-dce bug regarding !TARGET_TRULY_NOOP_TRUNCATION
> so I just decided to chime in and explain this :).
> 
> -- 
> Xi Ruoyao 
> School of Aerospace Science and Technology, Xidian University

Hi Ruoyao,

Thank you for this nice comment. I thought this wording was not really a big deal.

I used "truncation" because in the code of gimple-lower-bitint.cc, truncation
or down-casting is the first actual operation that needs to be done when
"info.extended" targets "extend" the output of some operations, which are
already in m_limb_type (mostly 64-bit integers).

This truncation exists for the "non-extended" targets, and seems also
necessary for the "extended" targets in some operations.

Again, it would be great if there is a more elegant way to get this done.

Yujie



[PATCH] aarch64: Carry over zeroness in aarch64_evpc_reencode

2025-05-20 Thread Pengxuan Zheng
There was a bug in aarch64_evpc_reencode which could leave zero_op0_p and
zero_op1_p of the struct "newd" uninitialized.  r16-701-gd77c3bc1c35e303 fixed
the issue by zero initializing "newd."  This patch provides an alternative fix
as suggested by Richard Sandiford based on the fact that the zeroness is
preserved by aarch64_evpc_reencode.
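
The difference between the two fixes can be sketched in plain C with a cut-down stand-in for the struct. Everything here (struct perm_sketch and both function names) is hypothetical and only mirrors the fields mentioned in this thread; it is not the aarch64 backend code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for expand_vec_perm_d, reduced to the flags
   discussed in the thread.  */
struct perm_sketch
{
  bool testing_p;
  bool one_vector_p;
  bool zero_op0_p;
  bool zero_op1_p;
};

/* First fix: zero-initialize, so the zero_op*_p flags are always false.  */
struct perm_sketch
reencode_zero_init (const struct perm_sketch *d)
{
  struct perm_sketch newd = { 0 };
  newd.testing_p = d->testing_p;
  newd.one_vector_p = d->one_vector_p;
  return newd;
}

/* Alternative fix: carry the zeroness over from the source descriptor.  */
struct perm_sketch
reencode_copy_flags (const struct perm_sketch *d)
{
  struct perm_sketch newd;
  newd.testing_p = d->testing_p;
  newd.one_vector_p = d->one_vector_p;
  newd.zero_op0_p = d->zero_op0_p;
  newd.zero_op1_p = d->zero_op1_p;
  return newd;
}

/* Both variants avoid uninitialized reads, but only the second one
   keeps the information that op0 was known to be zero.  */
void
check_reencode_sketch (void)
{
  struct perm_sketch d = { false, true, true, false };
  assert (!reencode_zero_init (&d).zero_op0_p);
  assert (reencode_copy_flags (&d).zero_op0_p);
  assert (!reencode_copy_flags (&d).zero_op1_p);
}
```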

gcc/ChangeLog:

* config/aarch64/aarch64.cc (aarch64_evpc_reencode): Copy zero_op0_p and
zero_op1_p from d to newd.

Signed-off-by: Pengxuan Zheng 
---
 gcc/config/aarch64/aarch64.cc | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 1da615c8955..2b837ec8e67 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -26327,7 +26327,7 @@ aarch64_evpc_trn (struct expand_vec_perm_d *d)
 static bool
 aarch64_evpc_reencode (struct expand_vec_perm_d *d)
 {
-  expand_vec_perm_d newd = {};
+  expand_vec_perm_d newd;
 
   /* The subregs that we'd create are not supported for big-endian SVE;
  see aarch64_modes_compatible_p for details.  */
@@ -26353,6 +26353,8 @@ aarch64_evpc_reencode (struct expand_vec_perm_d *d)
   newd.op1 = d->op1 ? gen_lowpart (new_mode, d->op1) : NULL;
   newd.testing_p = d->testing_p;
   newd.one_vector_p = d->one_vector_p;
+  newd.zero_op0_p = d->zero_op0_p;
+  newd.zero_op1_p = d->zero_op1_p;
 
   newd.perm.new_vector (newpermindices.encoding (), newd.one_vector_p ? 1 : 2,
newpermindices.nelts_per_input ());
-- 
2.17.1



[PATCH] middle-end: Fix complex lowering of cabs with no LHS [PR120369]

2025-05-20 Thread Andrew Pinski
This was introduced by r15-1797-gd8fe4f05ef448e.  I had missed that
the LHS of the cabs call could be NULL.  This seems to only happen at -O0;
I tried to produce a testcase that fails at -O1, but it needed many different
options to prevent the removal of the call.
Anyway, the fix is just to keep the call around if the LHS is null.

Bootstrapped and tested on x86_64-linux-gnu.

PR middle-end/120369

gcc/ChangeLog:

* tree-complex.cc (gimple_expand_builtin_cabs): Return early
if the LHS of cabs is null.

gcc/testsuite/ChangeLog:

* gcc.dg/torture/pr120369-1.c: New test.

Signed-off-by: Andrew Pinski 
---
 gcc/testsuite/gcc.dg/torture/pr120369-1.c | 9 +
 gcc/tree-complex.cc   | 4 
 2 files changed, 13 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr120369-1.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr120369-1.c b/gcc/testsuite/gcc.dg/torture/pr120369-1.c
new file mode 100644
index 000..4c20fb0932f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr120369-1.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* PR middle-end/120369 */
+
+/* Make sure cabs without a lhs does not cause an ICE. */
+void f()
+{
+  double _Complex z = 1.0;
+  __builtin_cabs(z);
+}
diff --git a/gcc/tree-complex.cc b/gcc/tree-complex.cc
index 8a812d4bf9b..e339b3a5b37 100644
--- a/gcc/tree-complex.cc
+++ b/gcc/tree-complex.cc
@@ -1715,6 +1715,10 @@ gimple_expand_builtin_cabs (gimple_stmt_iterator *gsi, gimple *old_stmt)
 
   tree lhs = gimple_call_lhs (old_stmt);
 
+  /* If there is not a LHS, then just keep the statement around.  */
+  if (!lhs)
+return;
+
   real_part = extract_component (gsi, arg, false, true);
   imag_part = extract_component (gsi, arg, true, true);
   location_t loc = gimple_location (old_stmt);
-- 
2.43.0



Re: [RFC PATCH 0/3] _BitInt(N) support for LoongArch

2025-05-20 Thread Yang Yujie
On Tue, May 20, 2025 at 03:44:09PM GMT, Jakub Jelinek wrote:
> I'd suggest working on it incrementally rather than with a full patch set.
> In one or multiple patches handle the promote_mode stuff, the atomic
> extension and expr.cc changes with the feedback incorporated.

Ok.

> For gimple-lower-bitint.cc I'd really like to see what testing you've done
> to decide on a case by case basis.
> 
> > > Are you sure all those changes were really necessary (rather than doing them
> > > just in case)?  I believe most of gimple-lower-bitint.cc already should be
> > > sign or zero extending the partial limbs when storing stuff, there can be
> > > some corner cases (I think one of the shift directions at least).
> > 
> > The modifications to gimple-lower-bitint.cc are based on testing, 
> 
> The tests weren't included :(.

"The tests" refer to the existing regression tests in testsuite/gcc.dg and
testsuite/gcc.dg/torture, specifically bitint-*.c.

> > since I found that simply setting the "info.extended" flag won't work unless
> > I make changes to promote_function_mode, which leads to a series of
> > changes to correct all the regtests.
> > 
> > Specifically, the tests told me to extend (thought "truncate"
> > was kind of an equivalent word) the output of left shift, plus/minus,
> 
> Truncation is the exact opposite of extension.
> I can understand the need for handling of left shifts, for all the rest
> I'd really like to see testcases.

Again, the existing testcases would do.

> > More common tests would surely be helpful, especially for new ports.
> > 
> > However, the specific test you mentioned would not be compatible with
> > the proposed LoongArch ABI, where the top 64-bit limb within the top
> > 128-bit ABI-limb may be undefined. e.g. _BitInt(192).
> 
> Ugh, that feels very creative in the psABI :(.

This only concerns the large/huge _BitInt(N) which is fairly new
(i.e. rarely implemented or actually used), so I think it's nice
to have some discussion on the creativity (or "weirdo"-ness :D) here.

The idea is simple:
1. 8-byte limbs
2. 16-byte alignment for possible vector load/store optimizations
3. The top partial limb is usually handled separately, so the top-limb
   extension would mostly be useless if we still handle these with
   non-vector instructions.
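
To illustrate the arithmetic behind these three points, here is a small sketch that computes limb counts and padded storage for a large _BitInt(N). The helper names are made up, and the rounding to 16 bytes is my reading of points 1 and 2 for N > 64, not a quote from the psABI text.

```c
#include <assert.h>
#include <stddef.h>

/* Number of 8-byte limbs needed to hold NBITS value bits.  */
size_t
bitint_limbs (size_t nbits)
{
  return (nbits + 63) / 64;
}

/* Storage size in bytes, assuming large _BitInt objects are padded
   up to a multiple of 16 bytes.  */
size_t
bitint_storage_bytes (size_t nbits)
{
  size_t bytes = bitint_limbs (nbits) * 8;
  return (bytes + 15) / 16 * 16;
}

/* Nonzero if the top 8 bytes of the storage hold no value bits at all,
   i.e. the top 64-bit limb of the top 128-bit ABI-limb is pure padding.  */
int
bitint_top_limb_is_padding (size_t nbits)
{
  return bitint_storage_bytes (nbits) > bitint_limbs (nbits) * 8;
}

void
check_bitint_layout_sketch (void)
{
  /* _BitInt(192): 3 limbs, 24 bytes of value, padded to 32 bytes.  */
  assert (bitint_limbs (192) == 3);
  assert (bitint_storage_bytes (192) == 32);
  assert (bitint_top_limb_is_padding (192));
  /* _BitInt(256): 4 limbs, no padding limb.  */
  assert (!bitint_top_limb_is_padding (256));
}
```

The _BitInt(192) case is the one mentioned earlier in the thread, where the whole top 64-bit limb may be left undefined.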

> In any case, even that could be handled in the macro, although it would
> need to have defined(__loongarch__) specific helpers that would perhaps
> for
> sizeof (x) * __CHAR_BIT__ >= 128
> && (__builtin_popcountg (~(typeof (x)) 0) & 64) == 0
> choose unsigned _BitInt smaller by 64 bits from the one mentioned
> in the macro (for signed similarly using __builtin_clrsbg).
> 
> > Perhaps it's better to leave it to target-specific tests?
> 
> Please don't, we don't want to repeat that for all the info->extended
> targets (which looks to be arm 32-bit, s390x and loongarch right now).
> We want to test it on all, like the whole bitint testsuite helps to find
> issues on all the arches, most of it isn't target specific.

Ok.



[PATCH v23 1/3] c: Add _Countof operator

2025-05-20 Thread Alejandro Colomar
This operator is similar to sizeof but can only be applied to an array,
and returns its number of elements.
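
For comparison, the traditional idiom that computes the same number can be sketched as below. COUNTOF_SKETCH and the demo functions are names invented for this example; unlike the operator, the macro silently accepts pointers, which is the kind of misuse an operator restricted to arrays can reject.

```c
#include <assert.h>
#include <stddef.h>

/* Classic idiom: total size divided by element size.  It also
   "works" on pointers, giving a meaningless result.  */
#define COUNTOF_SKETCH(a) (sizeof (a) / sizeof ((a)[0]))

size_t
countof_demo_1d (void)
{
  int a[7];
  return COUNTOF_SKETCH (a);
}

size_t
countof_demo_2d (void)
{
  int m[3][5];
  return COUNTOF_SKETCH (m);	/* outermost dimension only */
}

void
check_countof_sketch (void)
{
  assert (countof_demo_1d () == 7);
  assert (countof_demo_2d () == 3);
}
```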

FUTURE DIRECTIONS:

-  We should make it work with array parameters to functions,
   and somehow magically return the number of elements of the array,
   regardless of it being really a pointer.


gcc/ChangeLog:

* doc/extend.texi: Document _Countof operator.

gcc/c-family/ChangeLog:

* c-common.h: Add RID_COUNTOF.
(c_countof_type): New function prototype.
* c-common.def (COUNTOF_EXPR): New tree.
* c-common.cc
(c_common_reswords): Add RID_COUNTOF entry.
(c_countof_type): New function.

gcc/c/ChangeLog:

* c-tree.h
(in_countof): Add global variable declaration.
(c_expr_countof_expr): Add function prototype.
(c_expr_countof_type): Add function prototype.
* c-decl.cc
(start_struct, finish_struct): Add support for _Countof.
(start_enum, finish_enum): Add support for _Countof.
* c-parser.cc
(c_parser_sizeof_expression): New macro.
(c_parser_countof_expression): New macro.
(c_parser_sizeof_or_countof_expression):
Rename function and add support for _Countof.
(c_parser_unary_expression): Add RID_COUNTOF entry.
* c-typeck.cc
(in_countof): Add global variable.
(build_external_ref): Add support for _Countof.
(record_maybe_used_decl): Add support for _Countof.
(pop_maybe_used): Add support for _Countof.
(is_top_array_vla): New function.
(c_expr_countof_expr, c_expr_countof_type): New functions.
Add _Countof operator.

gcc/testsuite/ChangeLog:

* gcc.dg/countof-compile.c: Compile-time tests for _Countof.
* gcc.dg/countof-vla.c: Tests for _Countof with VLAs.
* gcc.dg/countof-vmt.c: Tests for _Countof with other VMTs.
* gcc.dg/countof-zero-compile.c:
Compile-time tests for _Countof with zero-sized arrays.
* gcc.dg/countof-zero.c:
Tests for _Countof with zero-sized arrays.
* gcc.dg/countof.c: Tests for _Countof.

Suggested-by: Xavier Del Campo Romero 
Co-authored-by: Martin Uecker 
Acked-by: "James K. Lowden" 
Signed-off-by: Alejandro Colomar 
---
 gcc/c-family/c-common.cc|  26 
 gcc/c-family/c-common.def   |   3 +
 gcc/c-family/c-common.h |   2 +
 gcc/c/c-decl.cc |  22 +++-
 gcc/c/c-parser.cc   |  59 +++---
 gcc/c/c-tree.h  |   4 +
 gcc/c/c-typeck.cc   | 115 +-
 gcc/doc/extend.texi |  30 +
 gcc/testsuite/gcc.dg/countof-compile.c  | 124 
 gcc/testsuite/gcc.dg/countof-vla.c  |  35 ++
 gcc/testsuite/gcc.dg/countof-vmt.c  |  20 
 gcc/testsuite/gcc.dg/countof-zero-compile.c |  38 ++
 gcc/testsuite/gcc.dg/countof-zero.c |  31 +
 gcc/testsuite/gcc.dg/countof.c  | 120 +++
 14 files changed, 605 insertions(+), 24 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/countof-compile.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-vla.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-vmt.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-zero-compile.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-zero.c
 create mode 100644 gcc/testsuite/gcc.dg/countof.c

diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
index 587d76461e9e..f71cb2652d5a 100644
--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -394,6 +394,7 @@ const struct c_common_resword c_common_reswords[] =
 {
   { "_Alignas",RID_ALIGNAS,   D_CONLY },
   { "_Alignof",RID_ALIGNOF,   D_CONLY },
+  { "_Countof",RID_COUNTOF,   D_CONLY },
   { "_Atomic", RID_ATOMIC,D_CONLY },
   { "_BitInt", RID_BITINT,D_CONLY },
   { "_Bool",   RID_BOOL,  D_CONLY },
@@ -4080,6 +4081,31 @@ c_alignof_expr (location_t loc, tree expr)
 
   return fold_convert_loc (loc, size_type_node, t);
 }
+
+/* Implement 

[PATCH v1 3/3] RISC-V: Add test for vec_duplicate + vand.vv combine case 1 with GR2VR cost 0, 1 and 2

2025-05-20 Thread pan2 . li
From: Pan Li 

Add asm dump check tests for combining vec_duplicate + vand.vv into vand.vx,
with GR2VR cost 0, 1 and 2.

The test suites below pass with this patch:
* The rv64gcv full regression test.
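
For context, the kind of loop these tests exercise can be sketched as follows. This is a hand-written approximation with made-up names, not the expansion of DEF_VX_BINARY_CASE_1_WRAP: a scalar is ANDed with every element, so the vectorizer sees a broadcast of x (vec_duplicate) feeding a vector AND, which is what the combine to vand.vx targets.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar-times-vector style loop: out[i] = in[i] & x.
   The function name is hypothetical.  */
void
vx_and_i16_sketch (int16_t *out, const int16_t *in, int16_t x, unsigned n)
{
  for (unsigned i = 0; i < n; i++)
    out[i] = in[i] & x;
}

/* Plain functional check, independent of which instruction is chosen.  */
void
check_vx_and_i16_sketch (void)
{
  const int16_t in[4] = { 0x00ff, 0x0f0f, -1, 0 };
  int16_t out[4];
  vx_and_i16_sketch (out, in, 0x00f0, 4);
  assert (out[0] == 0x00f0);
  assert (out[1] == 0x0000);
  assert (out[2] == 0x00f0);
  assert (out[3] == 0);
}
```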

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add asm check
for vand.vx combine.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i8.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u16.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u32.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u64.c: Ditto.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u8.c: Ditto.

Signed-off-by: Pan Li 
---
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c | 4 +++-
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i32.c | 4 +++-
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i64.c | 4 +++-
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i8.c  | 4 +++-
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u16.c | 4 +++-
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u32.c | 4 +++-
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u64.c | 4 +++-
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-u8.c  | 4 +++-
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i16.c | 4 +++-
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i32.c | 4 +++-
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i64.c | 4 +++-
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-5-i8.c  | 4 +++-
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u16.c | 4 +++-
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u32.c | 4 +++-
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u64.c | 4 +++-
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-5-u8.c  | 4 +++-
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i16.c | 2 ++
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i32.c | 2 ++
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i64.c | 2 ++
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-6-i8.c  | 6 --
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u16.c | 2 ++
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u32.c | 2 ++
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u64.c | 2 ++
 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-6-u8.c  | 2 ++
 24 files changed, 66 insertions(+), 18 deletions(-)

diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c
index 6f59b07d236..62fd4e39c01 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c
@@ -7,8 +7,10 @@
 
 DEF_VX_BINARY_CASE_1_WRAP(T, +, add, VX_BINARY_BODY_X16)
 DEF_VX_BINARY_CASE_1_WRAP(T, -, sub, VX_BINARY_BODY_X16)
-DEF_VX_BINARY_REVERSE_CASE_1_WRAP(T, -, rsub, VX_BINARY_REVERSE_BODY_X16);
+DEF_VX_BINARY_REVERSE_CASE_1_WRAP(T, -, rsub, VX_BINARY_REVERSE_BODY_X16)
+DEF_VX_BINARY_CASE_1_WRAP(T, &, and, VX_BINARY_BODY_X16)
 
 /* { dg-final { scan-assembler {vadd.vx} } } */
 /* { dg-final { scan-assembler {vsub.vx} } } */
 /* { dg-final { scan-assembler {vrsub.vx} } } */
+/* { dg-final { scan-assembler {vand.vx} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i32.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i32.c
index 69b2227d889..d047458b81d 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i32.c
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i32.c
@@ -7,8 +7,10 @@
 
 DEF_VX_BINARY_CASE_1_WRAP(T, +, add, VX_BINARY_BODY_X4)
 DEF_VX_BINARY_CASE_1_WRAP(T, -, sub, VX_BINARY_BODY_X4)
-DEF_VX_BINARY_REVERSE_CASE_1_WRAP(T, -, rsub, VX_BINARY_REVERSE_BODY_X4);
+DEF_VX_BINARY_REVERSE_CASE_1_WRAP(T, -, rsub, VX_BINARY_REVERSE_BODY_X4)
+DEF_VX_BI

[PATCH v1 1/3] RISC-V: Combine vec_duplicate + vand.vv to vand.vx on GR2VR cost

2025-05-20 Thread pan2 . li
From: Pan Li 

This patch combines vec_duplicate + vand.vv into vand.vx, as in the
example code below.  The related pattern depends on the cost of
vec_duplicate from GR2VR: late-combine performs the combination if the
GR2VR cost is zero, and rejects it if the GR2VR cost is greater than
zero.

Assume the example code below, with a GR2VR cost of 0.

  #define DEF_VX_BINARY(T, OP)                                        \
  void                                                                \
  test_vx_binary (T * restrict out, T * restrict in, T x, unsigned n) \
  {                                                                   \
    for (unsigned i = 0; i < n; i++)                                  \
      out[i] = in[i] OP x;                                            \
  }

  DEF_VX_BINARY(int32_t, &)

Before this patch:
  test_vx_binary_and_int32_t_case_0:
    beq a3,zero,.L8
    vsetvli a5,zero,e32,m1,ta,ma
    vmv.v.x v2,a2
    slli a3,a3,32
    srli a3,a3,32
  .L3:
    vsetvli a5,a3,e32,m1,ta,ma
    vle32.v v1,0(a1)
    slli a4,a5,2
    sub a3,a3,a5
    add a1,a1,a4
    vand.vv v1,v1,v2
    vse32.v v1,0(a0)
    add a0,a0,a4
    bne a3,zero,.L3

After this patch:
  test_vx_binary_and_int32_t_case_0:
    beq a3,zero,.L8
    slli a3,a3,32
    srli a3,a3,32
  .L3:
    vsetvli a5,a3,e32,m1,ta,ma
    vle32.v v1,0(a1)
    slli a4,a5,2
    sub a3,a3,a5
    add a1,a1,a4
    vand.vx v1,v1,a2
    vse32.v v1,0(a0)
    add a0,a0,a4
    bne a3,zero,.L3

The following test suite passes with this patch.
* The rv64gcv full regression test.
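
For readers who want to exercise the loop outside the testsuite, here is
a minimal sketch of the same pattern as a standalone C file.  The NAME
macro parameter, the resulting function name and check_vx_binary_and are
assumptions made for illustration only; they are not the testsuite's
DEF_VX_BINARY_CASE_* macros.

```c
#include <assert.h>
#include <stdint.h>

/* Same loop shape as DEF_VX_BINARY in the description.  The NAME
   parameter is an assumption added here so that the generated function
   gets a distinct name per operator and type.  */
#define DEF_VX_BINARY(T, OP, NAME)                                    \
  void                                                                \
  test_vx_binary_##NAME##_##T (T *restrict out, T *restrict in, T x,  \
                               unsigned n)                            \
  {                                                                   \
    for (unsigned i = 0; i < n; i++)                                  \
      out[i] = in[i] OP x;                                            \
  }

DEF_VX_BINARY (int32_t, &, and)

/* Hypothetical self-check: compare the generated function against a
   plain scalar AND on a few values.  */
void
check_vx_binary_and (void)
{
  int32_t in[8] = { 0, 1, -1, 0x7fffffff, 0x55aa55aa, 0x1234, -256, 255 };
  int32_t out[8];
  const int32_t x = 0x0f0f0f0f;

  test_vx_binary_and_int32_t (out, in, x, 8);

  for (unsigned i = 0; i < 8; i++)
    assert (out[i] == (in[i] & x));
}
```

Whether the compiler emits vmv.v.x + vand.vv or vand.vx for this loop
depends on the target options and the GR2VR cost in effect; the sketch
only pins down the scalar semantics that both sequences must preserve.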

gcc/ChangeLog:

* config/riscv/riscv-v.cc (expand_vx_binary_vec_dup_vec): Add new
case for rtx code AND.
(expand_vx_binary_vec_vec_dup): Ditto.
* config/riscv/riscv.cc (riscv_rtx_costs): Ditto.
* config/riscv/vector-iterators.md: Add new op and to no_shift_vx_ops.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv-v.cc  | 2 ++
 gcc/config/riscv/riscv.cc| 1 +
 gcc/config/riscv/vector-iterators.md | 2 +-
 3 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 1b5ef51886e..e406e7a7f59 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -5511,6 +5511,7 @@ expand_vx_binary_vec_dup_vec (rtx op_0, rtx op_1, rtx op_2,
   switch (code)
 {
 case PLUS:
+case AND:
   icode = code_for_pred_scalar (code, mode);
   break;
 case MINUS:
@@ -5537,6 +5538,7 @@ expand_vx_binary_vec_vec_dup (rtx op_0, rtx op_1, rtx op_2,
   switch (code)
 {
 case MINUS:
+case AND:
   icode = code_for_pred_scalar (code, mode);
   break;
 default:
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 0b10842d176..cfffbc18f57 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -3894,6 +3894,7 @@ riscv_rtx_costs (rtx x, machine_mode mode, int outer_code, int opno ATTRIBUTE_UN
break;
  case PLUS:
  case MINUS:
+ case AND:
{
  rtx op_0 = XEXP (x, 0);
  rtx op_1 = XEXP (x, 1);
diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
index 23cb940310f..026be6f65d3 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -4042,7 +4042,7 @@ (define_code_iterator any_int_binop [plus minus and ior xor ashift ashiftrt lshi
 ])
 
 (define_code_iterator any_int_binop_no_shift_vx [
-  plus minus
+  plus minus and
 ])
 
 (define_code_iterator any_int_unop [neg not])
-- 
2.43.0



[PATCH v1 0/3] RISC-V: Combine vec_duplicate + vand.vv to vand.vx on GR2VR cost

2025-05-20 Thread pan2 . li
From: Pan Li 

This patch series combines vec_dup + vand.vv into vand.vx based on the
GR2VR cost.  Late-combine performs the combination if the GR2VR cost is
zero, and rejects it if the cost is non-zero (for example 1 or 15 in the
tests).  There are two cases for the combine:

Case 0:
 |   ...
 |   vmv.v.x
 | L1:
 |   vand.vv
 |   J L1
 |   ...

Case 1:
 |   ...
 | L1:
 |   vmv.v.x
 |   vand.vv
 |   J L1
 |   ...

Both are combined into the form below if the GR2VR cost is zero.
 |   ...
 | L1:
 |   vand.vx
 |   J L1
 |   ...
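
To make the two cases concrete, here is a small scalar model in C.  It
is only a sketch: vec_t, VL, the model_* helpers and combine_allowed are
hypothetical names invented for this illustration, not GCC internals or
RVV intrinsics.  It shows that hoisting the broadcast (case 0), keeping
it in the loop (case 1) and using the combined form all compute the same
values, so the choice is purely a cost decision.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { VL = 4 };  /* Hypothetical fixed vector length for this model.  */

typedef struct { int32_t e[VL]; } vec_t;

/* Model of vmv.v.x: broadcast a scalar into every element.  */
static vec_t
model_vmv_v_x (int32_t x)
{
  vec_t v;
  for (int i = 0; i < VL; i++)
    v.e[i] = x;
  return v;
}

/* Model of vand.vv: element-wise AND of two vectors.  */
static vec_t
model_vand_vv (vec_t a, vec_t b)
{
  vec_t r;
  for (int i = 0; i < VL; i++)
    r.e[i] = a.e[i] & b.e[i];
  return r;
}

/* Model of vand.vx: AND every element with a scalar.  */
static vec_t
model_vand_vx (vec_t a, int32_t x)
{
  vec_t r;
  for (int i = 0; i < VL; i++)
    r.e[i] = a.e[i] & x;
  return r;
}

/* The gate described in the cover letter: combine only at cost zero.  */
static bool
combine_allowed (int gr2vr_cost)
{
  return gr2vr_cost == 0;
}

/* Case 0: the broadcast sits before the loop.  */
static void
run_case_0 (vec_t *out, const vec_t *in, int32_t x, unsigned n)
{
  vec_t dup = model_vmv_v_x (x);
  for (unsigned i = 0; i < n; i++)
    out[i] = model_vand_vv (in[i], dup);
}

/* Case 1: the broadcast is repeated inside the loop.  */
static void
run_case_1 (vec_t *out, const vec_t *in, int32_t x, unsigned n)
{
  for (unsigned i = 0; i < n; i++)
    out[i] = model_vand_vv (in[i], model_vmv_v_x (x));
}

/* Combined form: no broadcast at all.  */
static void
run_combined (vec_t *out, const vec_t *in, int32_t x, unsigned n)
{
  for (unsigned i = 0; i < n; i++)
    out[i] = model_vand_vx (in[i], x);
}

/* Self-check: all three shapes agree element by element.  */
static void
check_cases_agree (void)
{
  vec_t in[2] = { { { 1, 2, 3, 4 } }, { { -1, 0x7f00, 0x55aa, 0 } } };
  vec_t a[2], b[2], c[2];

  run_case_0 (a, in, 0x0ff0, 2);
  run_case_1 (b, in, 0x0ff0, 2);
  run_combined (c, in, 0x0ff0, 2);

  for (int j = 0; j < 2; j++)
    for (int i = 0; i < VL; i++)
      assert (a[j].e[i] == c[j].e[i] && b[j].e[i] == c[j].e[i]);
}
```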

The following test suite passes with this patch series.
* The rv64gcv full regression test.

Pan Li (3):
  RISC-V: Combine vec_duplicate + vand.vv to vand.vx on GR2VR cost
  RISC-V: Add test for vec_duplicate + vand.vv combine case 0 with GR2VR cost 0, 2 and 15
  RISC-V: Add test for vec_duplicate + vand.vv combine case 1 with GR2VR cost 0, 1 and 2

 gcc/config/riscv/riscv-v.cc   |   2 +
 gcc/config/riscv/riscv.cc |   1 +
 gcc/config/riscv/vector-iterators.md  |   2 +-
 .../riscv/rvv/autovec/vx_vf/vx-1-i16.c|   2 +
 .../riscv/rvv/autovec/vx_vf/vx-1-i32.c|   2 +
 .../riscv/rvv/autovec/vx_vf/vx-1-i64.c|   2 +
 .../riscv/rvv/autovec/vx_vf/vx-1-i8.c |   2 +
 .../riscv/rvv/autovec/vx_vf/vx-1-u16.c|   2 +
 .../riscv/rvv/autovec/vx_vf/vx-1-u32.c|   2 +
 .../riscv/rvv/autovec/vx_vf/vx-1-u64.c|   2 +
 .../riscv/rvv/autovec/vx_vf/vx-1-u8.c |   2 +
 .../riscv/rvv/autovec/vx_vf/vx-2-i16.c|   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-2-i32.c|   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-2-i64.c|   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-2-i8.c |   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-2-u16.c|   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-2-u32.c|   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-2-u64.c|   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-2-u8.c |   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-3-i16.c|   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-3-i32.c|   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-3-i64.c|   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-3-i8.c |   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-3-u16.c|   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-3-u32.c|   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-3-u64.c|   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-3-u8.c |   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-4-i16.c|   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-4-i32.c|   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-4-i64.c|   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-4-i8.c |   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-4-u16.c|   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-4-u32.c|   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-4-u64.c|   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-4-u8.c |   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-5-i16.c|   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-5-i32.c|   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-5-i64.c|   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-5-i8.c |   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-5-u16.c|   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-5-u32.c|   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-5-u64.c|   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-5-u8.c |   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-6-i16.c|   2 +
 .../riscv/rvv/autovec/vx_vf/vx-6-i32.c|   2 +
 .../riscv/rvv/autovec/vx_vf/vx-6-i64.c|   2 +
 .../riscv/rvv/autovec/vx_vf/vx-6-i8.c |   6 +-
 .../riscv/rvv/autovec/vx_vf/vx-6-u16.c|   2 +
 .../riscv/rvv/autovec/vx_vf/vx-6-u32.c|   2 +
 .../riscv/rvv/autovec/vx_vf/vx-6-u64.c|   2 +
 .../riscv/rvv/autovec/vx_vf/vx-6-u8.c |   2 +
 .../riscv/rvv/autovec/vx_vf/vx_binary_data.h  | 392 ++
 .../rvv/autovec/vx_vf/vx_vand-run-1-i16.c |  15 +
 .../rvv/autovec/vx_vf/vx_vand-run-1-i32.c |  15 +
 .../rvv/autovec/vx_vf/vx_vand-run-1-i64.c |  15 +
 .../rvv/autovec/vx_vf/vx_vand-run-1-i8.c  |  15 +
 .../rvv/autovec/vx_vf/vx_vand-run-1-u16.c |  15 +
 .../rvv/autovec/vx_vf/vx_vand-run-1-u32.c |  15 +
 .../rvv/autovec/vx_vf/vx_vand-run-1-u64.c |  15 +
 .../rvv/autovec/vx_vf/vx_vand-run-1-u8.c  |  15 +
 60 files changed, 646 insertions(+), 35 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vand-run-1-i16.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vand-run-1-i32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vand-run-1-i64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vand-run-1-i8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vand-run-1-u16.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vand-run-1-u32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vand-run-1-u64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_

[PATCH v1 2/3] RISC-V: Add test for vec_duplicate + vand.vv combine case 0 with GR2VR cost 0, 2 and 15

2025-05-20 Thread pan2 . li
From: Pan Li 

Add asm dump check tests for combining vec_duplicate + vand.vv into
vand.vx, with GR2VR costs of 0, 2 and 15.

The following test suite passes with this patch.
* The rv64gcv full regression test.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add test cases
for vand vx combine case 0 on GR2VR cost.
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i32.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i64.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i8.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u16.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u32.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u64.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-u8.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i16.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i32.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i64.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-i8.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u16.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u32.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u64.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-2-u8.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i16.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i32.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i64.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-i8.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u16.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u32.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u64.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx-3-u8.c: Ditto
* gcc.target/riscv/rvv/autovec/vx_vf/vx_binary_data.h: Add test
data for vand.vx run test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vand-run-1-i16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vand-run-1-i32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vand-run-1-i64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vand-run-1-i8.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vand-run-1-u16.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vand-run-1-u32.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vand-run-1-u64.c: New test.
* gcc.target/riscv/rvv/autovec/vx_vf/vx_vand-run-1-u8.c: New test.

Signed-off-by: Pan Li 
---
 .../riscv/rvv/autovec/vx_vf/vx-1-i16.c|   2 +
 .../riscv/rvv/autovec/vx_vf/vx-1-i32.c|   2 +
 .../riscv/rvv/autovec/vx_vf/vx-1-i64.c|   2 +
 .../riscv/rvv/autovec/vx_vf/vx-1-i8.c |   2 +
 .../riscv/rvv/autovec/vx_vf/vx-1-u16.c|   2 +
 .../riscv/rvv/autovec/vx_vf/vx-1-u32.c|   2 +
 .../riscv/rvv/autovec/vx_vf/vx-1-u64.c|   2 +
 .../riscv/rvv/autovec/vx_vf/vx-1-u8.c |   2 +
 .../riscv/rvv/autovec/vx_vf/vx-2-i16.c|   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-2-i32.c|   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-2-i64.c|   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-2-i8.c |   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-2-u16.c|   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-2-u32.c|   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-2-u64.c|   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-2-u8.c |   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-3-i16.c|   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-3-i32.c|   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-3-i64.c|   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-3-i8.c |   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-3-u16.c|   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-3-u32.c|   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-3-u64.c|   4 +-
 .../riscv/rvv/autovec/vx_vf/vx-3-u8.c |   4 +-
 .../riscv/rvv/autovec/vx_vf/vx_binary_data.h  | 392 ++
 .../rvv/autovec/vx_vf/vx_vand-run-1-i16.c |  15 +
 .../rvv/autovec/vx_vf/vx_vand-run-1-i32.c |  15 +
 .../rvv/autovec/vx_vf/vx_vand-run-1-i64.c |  15 +
 .../rvv/autovec/vx_vf/vx_vand-run-1-i8.c  |  15 +
 .../rvv/autovec/vx_vf/vx_vand-run-1-u16.c |  15 +
 .../rvv/autovec/vx_vf/vx_vand-run-1-u32.c |  15 +
 .../rvv/autovec/vx_vf/vx_vand-run-1-u64.c |  15 +
 .../rvv/autovec/vx_vf/vx_vand-run-1-u8.c  |  15 +
 33 files changed, 576 insertions(+), 16 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vand-run-1-i16.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vand-run-1-i32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vand-run-1-i64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vand-run-1-i8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vand-run-1-u16.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vx_vf/vx_vand-run-1-u32.c
 create mode 10

[PATCH V4] RISC-V: Prevent speculative vsetvl insn scheduling

2025-05-20 Thread Edwin Lu
The instruction scheduler appears to be speculatively hoisting vsetvl
insns outside of their basic block without checking for data
dependencies.  This resulted in the following sequence:

vsetvli a5,a1,e32,m1,tu,ma
vle32.v v2,0(a0)
sub a1,a1,a5 <-- a1 potentially set to 0
sh2add  a0,a5,a0
vfmacc.vv   v1,v2,v2
vsetvli a5,a1,e32,m1,tu,ma <-- incompatible vinfo. update vl to 0
beq a1,zero,.L12 <-- check if avl is 0

This patch would essentially delay the vsetvl update to after the branch
to prevent unnecessarily updating the vinfo at the end of a basic block.
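
As a rough illustration of the cost, the two loop shapes can be modelled
in plain C.  This is a toy sketch under stated assumptions: model_vsetvl
simply returns min(avl, vlmax), which is a simplification of the real
vsetvli rules, and struct vsetvl_stats, run_hoisted and run_delayed are
hypothetical names that do not appear in the patch.

```c
#include <assert.h>

/* Counters for the toy model; not part of the patch.  */
struct vsetvl_stats
{
  unsigned calls;           /* Total vsetvl executions.  */
  unsigned zero_avl_calls;  /* Executions that saw avl == 0.  */
};

/* Toy vsetvli: returns min (avl, vlmax) and records the call.  */
static unsigned
model_vsetvl (unsigned avl, unsigned vlmax, struct vsetvl_stats *st)
{
  st->calls++;
  if (avl == 0)
    st->zero_avl_calls++;
  return avl < vlmax ? avl : vlmax;
}

/* Shape from the listing: the next vsetvl is issued before the exit
   test, so it also runs on the final iteration with avl == 0.  */
static struct vsetvl_stats
run_hoisted (unsigned n, unsigned vlmax)
{
  struct vsetvl_stats st = { 0, 0 };
  assert (vlmax > 0);
  if (n == 0)
    return st;

  unsigned avl = n;
  unsigned vl = model_vsetvl (avl, vlmax, &st);
  for (;;)
    {
      avl -= vl;                            /* sub a1,a1,a5 */
      vl = model_vsetvl (avl, vlmax, &st);  /* hoisted above the branch */
      if (avl == 0)                         /* beq a1,zero,.L12 */
        break;
    }
  return st;
}

/* Shape with the vsetvl kept after the branch: it only runs when
   there are elements left to process.  */
static struct vsetvl_stats
run_delayed (unsigned n, unsigned vlmax)
{
  struct vsetvl_stats st = { 0, 0 };
  assert (vlmax > 0);
  if (n == 0)
    return st;

  unsigned avl = n;
  for (;;)
    {
      unsigned vl = model_vsetvl (avl, vlmax, &st);
      avl -= vl;
      if (avl == 0)
        break;
    }
  return st;
}
```

In this model the hoisted shape always executes one extra vsetvl, and
that extra one is the call that sees an AVL of zero.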

PR 117974

gcc/ChangeLog:

* config/riscv/riscv.cc (struct riscv_tune_param): Add tune
  param.
(riscv_sched_can_speculate_insn): Implement.
(TARGET_SCHED_CAN_SPECULATE_INSN): Implement.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/vsetvl/pr117974.c: New test.

Signed-off-by: Edwin Lu 
---
V2: add testcase
V3: add opt flag to test performance
V4: change opt flag to tune param
---
 gcc/config/riscv/riscv.cc | 34 +++
 .../gcc.target/riscv/rvv/vsetvl/pr117974.c| 16 +
 2 files changed, 50 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr117974.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 4c5bb02754d..2e5cd0dbcd2 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -297,6 +297,7 @@ struct riscv_tune_param
   bool vector_unaligned_access;
   bool use_divmod_expansion;
   bool overlap_op_by_pieces;
+  bool speculative_sched_vsetvl;
   unsigned int fusible_ops;
   const struct cpu_vector_cost *vec_costs;
   const char *function_align;
@@ -459,6 +460,7 @@ static const struct riscv_tune_param rocket_tune_info = {
   false,   /* vector_unaligned_access */
   false,   /* use_divmod_expansion */
   false,   /* overlap_op_by_pieces */
+  false,   /* speculative_sched_vsetvl */
   RISCV_FUSE_NOTHING,   /* fusible_ops */
   NULL,/* vector cost */
   NULL,/* function_align */
@@ -481,6 +483,7 @@ static const struct riscv_tune_param sifive_7_tune_info = {
   false,   /* vector_unaligned_access */
   false,   /* use_divmod_expansion */
   false,   /* overlap_op_by_pieces */
+  false,   /* speculative_sched_vsetvl */
   RISCV_FUSE_NOTHING,   /* fusible_ops */
   NULL,/* vector cost */
   NULL,/* function_align */
@@ -503,6 +506,7 @@ static const struct riscv_tune_param sifive_p400_tune_info = {
   false,   /* vector_unaligned_access */
   false,   /* use_divmod_expansion */
   false,   /* overlap_op_by_pieces */
+  false,   /* speculative_sched_vsetvl */
   RISCV_FUSE_LUI_ADDI | RISCV_FUSE_AUIPC_ADDI,  /* fusible_ops */
   &generic_vector_cost,/* vector cost */
   NULL,/* function_align */
@@ -525,6 +529,7 @@ static const struct riscv_tune_param sifive_p600_tune_info = {
   false,   /* vector_unaligned_access */
   false,   /* use_divmod_expansion */
   false,   /* overlap_op_by_pieces */
+  false,   /* speculative_sched_vsetvl */
   RISCV_FUSE_LUI_ADDI | RISCV_FUSE_AUIPC_ADDI,  /* fusible_ops */
   &generic_vector_cost,/* vector cost */
   NULL,/* function_align */
@@ -547,6 +552,7 @@ static const struct riscv_tune_param thead_c906_tune_info = {
   false,   /* vector_unaligned_access */
   false,   /* use_divmod_expansion */
   false,   /* overlap_op_by_pieces */
+  false,   /* speculative_sched_vsetvl */
   RISCV_FUSE_NOTHING,   /* fusible_ops */
   NULL,/* vector cost */
   NULL,/* function_align */
@@ -569,6 +575,7 @@ static const struct riscv_tune_param xiangshan_nanhu_tune_info = {
   false,   /* vector_unaligned_access */
   false,   /* use_divmod_

[to-be-committed][RISC-V] Infrastructure of synthesizing logical AND with constant

2025-05-20 Thread Jeff Law
So this is the next step on the path to mvconst_internal removal and is 
work from Shreya and myself.


This puts in the infrastructure to allow us to synthesize logical AND 
much like we're doing with logical IOR/XOR.


Unlike IOR/XOR, AND has many more special cases that can be profitable. 
For example, you can use shifts to clear many bits.  You can use zero 
extension to clear bits, you can use rotate+andi+rotate, shift pairs, etc.
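
A few of these special cases can be written out as plain C identities.
This is only a sketch of the arithmetic, not the code that will land in
synthesize_and; the helper names are hypothetical, and the 11-bit limit
in and_rotate_small is an assumption chosen so that the inverted value
fits a signed 12-bit andi immediate.

```c
#include <assert.h>
#include <stdint.h>

/* Clear the low K bits with a shift pair (srli then slli).  */
static uint64_t
and_clear_low (uint64_t x, unsigned k)
{
  assert (k < 64);
  return (x >> k) << k;
}

/* Clear the high K bits with a shift pair (slli then srli).  */
static uint64_t
and_clear_high (uint64_t x, unsigned k)
{
  assert (k < 64);
  return (x << k) >> k;
}

/* AND with a mode mask is a zero extension.  */
static uint64_t
and_zext16 (uint64_t x)
{
  return (uint64_t) (uint16_t) x;
}

static uint64_t
and_zext32 (uint64_t x)
{
  return (uint64_t) (uint32_t) x;
}

static uint64_t
rotr64 (uint64_t x, unsigned r)
{
  r &= 63;
  return r ? (x >> r) | (x << (64 - r)) : x;
}

static uint64_t
rotl64 (uint64_t x, unsigned r)
{
  r &= 63;
  return r ? (x << r) | (x >> (64 - r)) : x;
}

/* rotate + andi + rotate: clear the bits of SMALL placed at bit POS.
   After rotating right by POS the bits to clear sit at the bottom, and
   ~SMALL is a small negative number, so it fits an andi immediate.  */
static uint64_t
and_rotate_small (uint64_t x, uint64_t small, unsigned pos)
{
  assert (small < (1u << 11) && pos < 64);
  uint64_t t = rotr64 (x, pos);
  t &= ~small;
  return rotl64 (t, pos);
}
```

Each helper is equivalent to a single AND with a constant that does not
fit a 12-bit immediate, which is why these shapes can be profitable
compared with loading the constant into a register first.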


So to make potential bisecting easy the plan is to drop in the work on 
logical AND in several steps, essentially one new case at a time.


This step just puts the basics of an operation synthesis in place.  It 
still uses the same code generation strategies as we are currently using.


I'd like to say this is NFC, but unfortunately that's not true.  While 
the code generation strategy is the same, this does indirectly introduce 
new REG_EQUAL notes.  Those additional notes in turn can impact how 
various optimizers behave in very minor ways.


As usual, this has survived my tester on riscv32-elf and riscv64-elf.

Waiting on pre-commit to do its thing.  And I'll start queuing up the 
additional cases we want to handle while waiting ;-)


Jeff

gcc/
* config/riscv/riscv-protos.h (synthesize_and): Prototype.
* config/riscv/riscv.cc (synthesize_and): New function.
* config/riscv/riscv.md (and<mode>3): Use it.

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index b39b858acac..d8c8f6b5079 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -141,6 +141,7 @@ extern void riscv_expand_ustrunc (rtx, rtx);
 extern void riscv_expand_sstrunc (rtx, rtx);
 extern int riscv_register_move_cost (machine_mode, reg_class_t, reg_class_t);
 extern bool synthesize_ior_xor (rtx_code, rtx [3]);
+extern bool synthesize_and (rtx [3]);
 
 #ifdef RTX_CODE
 extern void riscv_expand_int_scc (rtx, enum rtx_code, rtx, rtx, bool *invert_ptr = 0);
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 0b10842d176..c7b010d6220 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -14464,6 +14464,60 @@ synthesize_ior_xor (rtx_code code, rtx operands[3])
   return true;
 }
 
+/* Synthesize OPERANDS[0] = OPERANDS[1] & OPERANDS[2].
+
+OPERANDS[0] and OPERANDS[1] will be a REG and may be the same
+REG.
+
+OPERANDS[2] is a CONST_INT.
+
+Return TRUE if the operation was fully synthesized and the caller
+need not generate additional code.  Return FALSE if the operation
+was not synthesized and the caller is responsible for emitting the
+proper sequence.  */
+
+bool
+synthesize_and (rtx operands[3])
+{
+  /* Trivial cases that don't need synthesis.  */
+  if (SMALL_OPERAND (INTVAL (operands[2]))
+ || (TARGET_ZBS && not_single_bit_mask_operand (operands[2], word_mode)))
+return false; 
+  
+  /* If the second operand is a mode mask, emit an extension
+ insn instead.  */
+  if (CONST_INT_P (operands[2]))
+{
+  enum machine_mode tmode = VOIDmode;
+  if (UINTVAL (operands[2]) == GET_MODE_MASK (HImode))
+   tmode = HImode;
+  else if (UINTVAL (operands[2]) == GET_MODE_MASK (SImode))
+   tmode = SImode;
+  
+  if (tmode != VOIDmode)
+   {
+ rtx tmp = gen_lowpart (tmode, operands[1]);
+ emit_insn (gen_extend_insn (operands[0], tmp, word_mode, tmode, 1));
+ return true;
+   }
+} 
+
+  /* If the remaining budget has gone to less than zero, it 
+ forces the value into a register and performs the AND 
+ operation.  It returns TRUE to the caller so the caller 
+ knows code generation is complete. 
+ FIXME: This is hacked to always be enabled until the last
+ patch in the series is enabled.  */
+  if (1)
+{
+  rtx x = force_reg (word_mode, operands[2]);
+  x = gen_rtx_AND (word_mode, operands[1], x);
+  emit_insn (gen_rtx_SET (operands[0], x));
+  return true;
+}
+}
+
+
 /* Initialize the GCC target structure.  */
 #undef TARGET_ASM_ALIGNED_HI_OP
 #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t"
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 154b49d55c5..209d9be96a8 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -1724,26 +1724,11 @@ (define_insn "smax3"
 (define_expand "and<mode>3"
   [(set (match_operand:X 0 "register_operand")
 (and:X (match_operand:X 1 "register_operand")
-  (match_operand:X 2 "arith_or_mode_mask_or_zbs_operand")))]
+  (match_operand:X 2 "reg_or_const_int_operand")))]
   ""
 {
-  /* If the second operand is a mode mask, emit an extension
- insn instead.  */
-  if (CONST_INT_P (operands[2]))
-{
-  enum machine_mode tmode = VOIDmode;
-  if (UINTVAL (operands[2]) == GET_MODE_MASK (HImode))
-   tmode = HImode;
-  else if (UINTVAL (operands[2]) == GET_MODE_MASK (SImode))
-   tmode = SImode;
-
-  if (tmode != VOIDmode)
-   {

Re: [PATCH v22 0/3] c: Add _Countof and <stdcountof.h>

2025-05-20 Thread Jakub Jelinek
On Tue, May 20, 2025 at 11:44:43PM +0200, Alejandro Colomar wrote:
> Hi Jakub,
> 
> On Tue, May 20, 2025 at 11:20:27PM +0200, Jakub Jelinek wrote:
> > On Tue, May 20, 2025 at 11:12:38PM +0200, Alejandro Colomar wrote:
> > > Okay; how about this?
> > > 
> > > gcc/c-family/ChangeLog:
> > > 
> > > * c-common.h: Add _Countof operator.
> > > * c-common.def: Likewise.
> > > * c-common.cc (c_countof_type): Likewise.
> > 
> > No, that doesn't describe what you've changed and how.
> 
> Well, it does.  All the changes I've applied to those files are all
> to implement the new _Countof operator, and only for that.  That is,
> they're sufficient and necessary.  So, saying I've added the _Countof
> operator is correct.  I could go and talk about the specific changes to
> each file, but then I don't see the value in that change log over the
> actual diff.
> 
> > 
> > So probably something like:
> > 
> > * c-common.h (enum rid): Add RID_COUNTOF.
> > * c-common.def (COUNTOF_EXPR): New tree.
> > * c-common.cc (c_common_reswords): Add RID_COUNTOF entry.
> > (c_countof_type): New function.
> 
> I'm honestly unsure about the usefulness of going too low level in the
> changelog as to listing newly added functions as added functions,

No, that is exactly the level all others fill in and people grep that etc.

> instead of talking high-level about what they're for.  But if that's
> what you want, then okay.
> 
> I think
> 
>   (c_countof_type): New function.
> 
> is an example of what I think is useless bureaucracy.  Could you please
> confirm that's what you want?

Yes, we want exactly that.

Jakub



Re: [PATCH v22 0/3] c: Add _Countof and <stdcountof.h>

2025-05-20 Thread Alejandro Colomar
Hi Jakub,

On Tue, May 20, 2025 at 11:20:27PM +0200, Jakub Jelinek wrote:
> On Tue, May 20, 2025 at 11:12:38PM +0200, Alejandro Colomar wrote:
> > Okay; how about this?
> > 
> > gcc/c-family/ChangeLog:
> > 
> > * c-common.h: Add _Countof operator.
> > * c-common.def: Likewise.
> > * c-common.cc (c_countof_type): Likewise.
> 
> No, that doesn't describe what you've changed and how.

Well, it does.  All the changes I've applied to those files are all
to implement the new _Countof operator, and only for that.  That is,
they're sufficient and necessary.  So, saying I've added the _Countof
operator is correct.  I could go and talk about the specific changes to
each file, but then I don't see the value in that change log over the
actual diff.

> 
> So probably something like:
> 
>   * c-common.h (enum rid): Add RID_COUNTOF.
>   * c-common.def (COUNTOF_EXPR): New tree.
>   * c-common.cc (c_common_reswords): Add RID_COUNTOF entry.
>   (c_countof_type): New function.

I'm honestly unsure about the usefulness of going too low level in the
changelog as to listing newly added functions as added functions,
instead of talking high-level about what they're for.  But if that's
what you want, then okay.

I think

(c_countof_type): New function.

is an example of what I think is useless bureaucracy.  Could you please
confirm that's what you want?


Cheers,
Alex

> You can use contrib/mklog, that at least pre-fills some of it for you.
> 
>   Jakub
> 



[PATCH v23 0/3] c: Add _Countof and <stdcountof.h>

2025-05-20 Thread Alejandro Colomar
Hi!

Here's another revision of this patch set.

v23 changes:

-  More specific change logs.
-  #define assert() instead of #include'ing <assert.h>.

`make check` says all's good.  I haven't diffed against master this
time, because that's slow as hell, and the changes are minimal.  I've
only run `make check -j24` once at the tip, and all the countof tests
pass.


Have a lovely night!
Alex


Alejandro Colomar (3):
  c: Add _Countof operator
  c: Add <stdcountof.h>
  c: Add -Wpedantic diagnostic for _Countof

 gcc/Makefile.in   |   1 +
 gcc/c-family/c-common.cc  |  26 
 gcc/c-family/c-common.def |   3 +
 gcc/c-family/c-common.h   |   2 +
 gcc/c/c-decl.cc   |  22 +++-
 gcc/c/c-parser.cc |  63 +++--
 gcc/c/c-tree.h|   4 +
 gcc/c/c-typeck.cc | 115 +++-
 gcc/doc/extend.texi   |  30 +
 gcc/ginclude/stdcountof.h |  31 +
 gcc/testsuite/gcc.dg/countof-compat.c |   8 ++
 gcc/testsuite/gcc.dg/countof-compile.c| 124 ++
 gcc/testsuite/gcc.dg/countof-no-compat.c  |   5 +
 .../gcc.dg/countof-pedantic-errors.c  |   8 ++
 gcc/testsuite/gcc.dg/countof-pedantic.c   |   8 ++
 gcc/testsuite/gcc.dg/countof-stdcountof.c |  24 
 gcc/testsuite/gcc.dg/countof-vla.c|  35 +
 gcc/testsuite/gcc.dg/countof-vmt.c|  20 +++
 gcc/testsuite/gcc.dg/countof-zero-compile.c   |  38 ++
 gcc/testsuite/gcc.dg/countof-zero.c   |  31 +
 gcc/testsuite/gcc.dg/countof.c| 120 +
 21 files changed, 694 insertions(+), 24 deletions(-)
 create mode 100644 gcc/ginclude/stdcountof.h
 create mode 100644 gcc/testsuite/gcc.dg/countof-compat.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-compile.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-no-compat.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-pedantic-errors.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-pedantic.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-stdcountof.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-vla.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-vmt.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-zero-compile.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-zero.c
 create mode 100644 gcc/testsuite/gcc.dg/countof.c

Range-diff against v22:
1:  1c983c3baa7f ! 1:  5040a7d25a96 c: Add _Countof operator
@@ Commit message
 
 gcc/c-family/ChangeLog:
 
-* c-common.h
-* c-common.def
-* c-common.cc (c_countof_type): Add _Countof operator.
+* c-common.h: Add RID_COUNTOF.
+(c_countof_type): New function prototype.
+* c-common.def (COUNTOF_EXPR): New tree.
+* c-common.cc
+(c_common_reswords): Add RID_COUNTOF entry.
+(c_countof_type): New function.
 
 gcc/c/ChangeLog:
 
 * c-tree.h
-(c_expr_countof_expr, c_expr_countof_type)
+(in_countof): Add global variable declaration.
+(c_expr_countof_expr): Add function prototype.
+(c_expr_countof_type): Add function prototype.
 * c-decl.cc
-(start_struct, finish_struct)
-(start_enum, finish_enum)
+(start_struct, finish_struct): Add support for _Countof.
+(start_enum, finish_enum): Add support for _Countof.
 * c-parser.cc
-(c_parser_sizeof_expression)
-(c_parser_countof_expression)
-(c_parser_sizeof_or_countof_expression)
-(c_parser_unary_expression)
+(c_parser_sizeof_expression): New macro.
+(c_parser_countof_expression): New macro.
+(c_parser_sizeof_or_countof_expression):
+Rename function and add support for _Countof.
+(c_parser_unary_expression): Add RID_COUNTOF entry.
 * c-typeck.cc
-(build_external_ref)
-(record_maybe_used_decl)
-(pop_maybe_used)
-(is_top_array_vla)
-(c_expr_countof_expr, c_expr_countof_type):
+(in_countof): Add global variable.
+(build_external_ref): Add support for _Countof.
+(record_maybe_used_decl): Add support for _Countof.
+(pop_maybe_used): Add support for _Countof.
+(is_top_array_vla): New function.
+(c_expr_countof_expr, c_expr_countof_type): New functions.
 Add _Countof operator.
 
 gcc/testsuite/ChangeLog:
 
-* gcc.dg/countof-compile.c
-* gcc.dg/countof-vla.c
-* gcc.

[PATCH v23 3/3] c: Add -Wpedantic diagnostic for _Countof

2025-05-20 Thread Alejandro Colomar
It has been standardized in C2y.

gcc/c/ChangeLog:

* c-parser.cc (c_parser_sizeof_or_countof_expression):
Add -Wpedantic diagnostic for _Countof in <= C23 mode.

gcc/testsuite/ChangeLog:

* gcc.dg/countof-compat.c:
Test _Countof diagnostics with -Wc23-c2y-compat on C2y.
* gcc.dg/countof-no-compat.c:
Test _Countof diagnostics with -Wno-c23-c2y-compat on C23.
* gcc.dg/countof-pedantic.c:
Test _Countof diagnostics with -pedantic on C23.
* gcc.dg/countof-pedantic-errors.c:
Test _Countof diagnostics with -pedantic-errors on C23.

Signed-off-by: Alejandro Colomar 
---
 gcc/c/c-parser.cc  | 4 
 gcc/testsuite/gcc.dg/countof-compat.c  | 8 
 gcc/testsuite/gcc.dg/countof-no-compat.c   | 5 +
 gcc/testsuite/gcc.dg/countof-pedantic-errors.c | 8 
 gcc/testsuite/gcc.dg/countof-pedantic.c| 8 
 5 files changed, 33 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/countof-compat.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-no-compat.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-pedantic-errors.c
 create mode 100644 gcc/testsuite/gcc.dg/countof-pedantic.c

diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index 87700339394b..d2193ad2f34f 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -10637,6 +10637,10 @@ c_parser_sizeof_or_countof_expression (c_parser 
*parser, enum rid rid)
 
   start = c_parser_peek_token (parser)->location;
 
+  if (rid == RID_COUNTOF)
+pedwarn_c23 (start, OPT_Wpedantic,
+"ISO C does not support %qs before C2Y", op_name);
+
   c_parser_consume_token (parser);
   c_inhibit_evaluation_warnings++;
   if (rid == RID_COUNTOF)
diff --git a/gcc/testsuite/gcc.dg/countof-compat.c 
b/gcc/testsuite/gcc.dg/countof-compat.c
new file mode 100644
index ..ab5b4ae6219c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/countof-compat.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-std=c2y -pedantic-errors -Wc23-c2y-compat" } */
+
+#include <stdcountof.h>
+
+int a[1];
+int b[countof(a)];
+int c[_Countof(a)];  /* { dg-warning "ISO C does not support" } */
diff --git a/gcc/testsuite/gcc.dg/countof-no-compat.c 
b/gcc/testsuite/gcc.dg/countof-no-compat.c
new file mode 100644
index ..4a244cf222f6
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/countof-no-compat.c
@@ -0,0 +1,5 @@
+/* { dg-do compile } */
+/* { dg-options "-std=c23 -pedantic-errors -Wno-c23-c2y-compat" } */
+
+int a[1];
+int b[_Countof(a)];
diff --git a/gcc/testsuite/gcc.dg/countof-pedantic-errors.c 
b/gcc/testsuite/gcc.dg/countof-pedantic-errors.c
new file mode 100644
index ..5d5bedbe1f7e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/countof-pedantic-errors.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-std=c23 -pedantic-errors" } */
+
+#include <stdcountof.h>
+
+int a[1];
+int b[countof(a)];
+int c[_Countof(a)];  /* { dg-error "ISO C does not support" } */
diff --git a/gcc/testsuite/gcc.dg/countof-pedantic.c 
b/gcc/testsuite/gcc.dg/countof-pedantic.c
new file mode 100644
index ..408dc6f93667
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/countof-pedantic.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-std=c23 -pedantic" } */
+
+#include <stdcountof.h>
+
+int a[1];
+int b[countof(a)];
+int c[_Countof(a)];  /* { dg-warning "ISO C does not support" } */
-- 
2.49.0



[PATCH v23 2/3] c: Add <stdcountof.h>

2025-05-20 Thread Alejandro Colomar
gcc/ChangeLog:

* Makefile.in (USER_H): Add <stdcountof.h>.
* ginclude/stdcountof.h: Add countof macro.

gcc/testsuite/ChangeLog:

* gcc.dg/countof-stdcountof.c: Add tests for <stdcountof.h>.

Signed-off-by: Alejandro Colomar 
---
 gcc/Makefile.in   |  1 +
 gcc/ginclude/stdcountof.h | 31 +++
 gcc/testsuite/gcc.dg/countof-stdcountof.c | 24 ++
 3 files changed, 56 insertions(+)
 create mode 100644 gcc/ginclude/stdcountof.h
 create mode 100644 gcc/testsuite/gcc.dg/countof-stdcountof.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 72d132207c0d..fc8a7e532b97 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -481,6 +481,7 @@ USER_H = $(srcdir)/ginclude/float.h \
 $(srcdir)/ginclude/stdalign.h \
 $(srcdir)/ginclude/stdatomic.h \
 $(srcdir)/ginclude/stdckdint.h \
+$(srcdir)/ginclude/stdcountof.h \
 $(EXTRA_HEADERS)
 
 USER_H_INC_NEXT_PRE = @user_headers_inc_next_pre@
diff --git a/gcc/ginclude/stdcountof.h b/gcc/ginclude/stdcountof.h
new file mode 100644
index ..1d914f40e5db
--- /dev/null
+++ b/gcc/ginclude/stdcountof.h
@@ -0,0 +1,31 @@
+/* Copyright (C) 2025 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+<http://www.gnu.org/licenses/>.  */
+
+/* ISO C2Y: 7.21 Array count <stdcountof.h>.  */
+
+#ifndef _STDCOUNTOF_H
+#define _STDCOUNTOF_H
+
+#define countof  _Countof
+
+#endif /* stdcountof.h */
diff --git a/gcc/testsuite/gcc.dg/countof-stdcountof.c 
b/gcc/testsuite/gcc.dg/countof-stdcountof.c
new file mode 100644
index ..2fb0c6306ef0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/countof-stdcountof.c
@@ -0,0 +1,24 @@
+/* { dg-do run } */
+/* { dg-options "-std=c2y -pedantic-errors" } */
+
+#include <stdcountof.h>
+
+#define assert(e)  ((e) ? (void) 0 : __builtin_abort ())
+
+extern int strcmp (const char *, const char *);
+
+#ifndef countof
+#error "countof not defined"
+#endif
+
+int a[3];
+int b[countof a];
+
+#define str(x) #x
+#define xstr(x) str(x)
+
+int
+main (void)
+{
+  assert (strcmp (xstr(countof), "_Countof") == 0);
+}
-- 
2.49.0
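
For readers who want to try the new header in portable code, here is a minimal sketch, not part of the patch. It assumes the compiler supports __has_include; ARRAY_COUNT, table, mirror and the two helper functions are hypothetical names chosen for illustration. When <stdcountof.h> is unavailable it falls back to the classic sizeof division.

```c
#include <stddef.h>

/* Prefer countof from <stdcountof.h> when the toolchain ships the header;
   otherwise fall back to sizeof division.  ARRAY_COUNT is a hypothetical
   wrapper name, not something the patch defines.  */
#if defined(__has_include)
# if __has_include(<stdcountof.h>)
#  include <stdcountof.h>
#  define ARRAY_COUNT(a) countof (a)
# endif
#endif
#ifndef ARRAY_COUNT
# define ARRAY_COUNT(a) (sizeof (a) / sizeof ((a)[0]))
#endif

static int table[3];
/* The count is an integer constant expression here, so it can size
   another file-scope array, as in the testcases above.  */
static int mirror[ARRAY_COUNT (table)];

static size_t
table_count (void)
{
  return ARRAY_COUNT (table);
}

static size_t
mirror_count (void)
{
  return ARRAY_COUNT (mirror);
}
```

Unlike the sizeof fallback, _Countof is meant to reject non-array operands such as pointers, which is the main reason to prefer it where available.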



Re: Fix PR 118541, do not generate unordered fp cmoves for IEEE compares

2025-05-20 Thread Segher Boessenkool
On Mon, May 12, 2025 at 06:35:15PM -0400, Michael Meissner wrote:
> On Mon, May 12, 2025 at 01:24:04PM +0530, Surya Kumari Jangala wrote:
> > Hi Mike,
> > Irrespective of whether -Ofast is used or not, shouldn't we generate 
> > XSCMPUDP instruction for ‘isgreater()’ operation? This is because XSCMPGTDP 
> > insn will generate a trap if either operand is an SNaN or a QNaN. Whereas, 
> > XSCMPUDP insn will generate a trap only if either operand is an SNaN. The 
> > issue with the failing glibc tests is that an “Invalid operation” exception 
> > is being thrown due to qNaNs.
> 
> But -Ofast says not to worry about NaNs (signalling or otherwise).  But if
> Segher desires, I can remove the test for -Ofast.

That is not what -Ofast means at all.  It means "-O3, but also
-ffast-math, and some other not recommendable things".  Its name is a
total misnomer: it often is not faster than even -O2 (the baseline
here), but it also is very non-standard-compliant and similar things.

"-Ofast-and-loose" might be a name that does make sense.  As the
dictionary says:
   "If you say that someone is playing fast and loose, you are
expressing disapproval of them for behaving in a deceitful, immoral,
or irresponsible way."

And yeah, xscmpgtdp is plain wrong no matter what flags are used, unless
we adopt a -fuck-up flag :-(


Segher
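
To make the distinction in this thread concrete, here is a small C sketch (an illustration, not code from the patch). It contrasts the quiet comparison macro isgreater from <math.h> with the ordinary > operator; the function names are hypothetical. Both return the same value for every input. The difference is only in floating-point exceptions: isgreater must not raise "invalid" for quiet NaN operands, which per the discussion above is why a compare that traps on QNaNs as well is the wrong instruction for it.

```c
#include <math.h>

/* Quiet comparison: yields 0 when either operand is a NaN and must not
   raise the "invalid" exception for quiet NaNs.  */
static int
quiet_greater (double x, double y)
{
  return isgreater (x, y) != 0;
}

/* Ordinary relational operator: same truth value, but IEEE 754 lets it
   signal "invalid" when either operand is any kind of NaN.  */
static int
signaling_greater (double x, double y)
{
  return x > y;
}
```

Observing the exception difference itself requires <fenv.h> and a compiler that honours floating-point environment access, so it is left out of this sketch.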


RE: [PATCH 2/2] RISC-V: Add testcases for signed vector SAT_ADD IMM form 1

2025-05-20 Thread Li, Pan2
> Looks reasonably sensible.  But I'll defer to Pan here since he's done 
> *far* more work than I in this space.

Thanks Jeff.

LGTM but please wait for the ack from Richard for the middle-end change.

Pan

-Original Message-
From: Jeff Law  
Sent: Wednesday, May 21, 2025 2:06 AM
To: Li Xu ; gcc-patches@gcc.gnu.org
Cc: kito.ch...@gmail.com; richard.guent...@gmail.com; tamar.christ...@arm.com; 
juzhe.zh...@rivai.ai; Li, Pan2 
Subject: Re: [PATCH 2/2] RISC-V: Add testcases for signed vector SAT_ADD IMM 
form 1



On 5/19/25 2:42 AM, Li Xu wrote:
> From: xuli 
> 
> This patch adds testcase for form1, as shown below:
> 
> void __attribute__((noinline))                                            \
> vec_sat_s_add_imm_##T##_fmt_1##_##INDEX (T *out, T *op_1, unsigned limit) \
> {                                                                         \
>   unsigned i;                                                             \
>   for (i = 0; i < limit; i++)                                             \
>     {                                                                     \
>       T x = op_1[i];                                                      \
>       T sum = (UT)x + (UT)IMM;                                            \
>       out[i] = (x ^ IMM) < 0                                              \
>         ? sum                                                             \
>         : (sum ^ x) >= 0                                                  \
>           ? sum                                                           \
>           : x < 0 ? MIN : MAX;                                            \
>     }                                                                     \
> }
> 
> Passed the rv64gcv regression test.
> 
> Signed-off-by: Li Xu 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/riscv/rvv/autovec/sat/vec_sat_arith.h: add signed vec 
> SAT_ADD IMM form1.
>   * gcc.target/riscv/rvv/autovec/sat/vec_sat_data.h: add sat_s_add_imm 
> data.
>   * gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add_imm-1-i16.c: New test.
>   * gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add_imm-1-i32.c: New test.
>   * gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add_imm-1-i64.c: New test.
>   * gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add_imm-1-i8.c: New test.
>   * gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add_imm-run-1-i16.c: New 
> test.
>   * gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add_imm-run-1-i32.c: New 
> test.
>   * gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add_imm-run-1-i64.c: New 
> test.
>   * gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add_imm-run-1-i8.c: New 
> test.
>   * 
> gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add_imm_type_check-1-i16.c: New 
> test.
>   * 
> gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add_imm_type_check-1-i32.c: New 
> test.
>   * gcc.target/riscv/rvv/autovec/sat/vec_sat_s_add_imm_type_check-1-i8.c: 
> New test.
Looks reasonably sensible.  But I'll defer to Pan here since he's done 
*far* more work than I in this space.

jeff
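
As an illustration of what the quoted macro computes, here is one hand-expanded instance for T = int8_t, not taken from the patch. The immediate value 9 and the names IMM_I8 and vec_sat_s_add_imm_i8 are assumptions made for the example. The narrowing conversion back to int8_t is implementation-defined in ISO C; the sketch relies on the wrapping behaviour that GCC documents.

```c
#include <stdint.h>

#define IMM_I8 ((int8_t) 9)   /* hypothetical immediate for this sketch */

/* Signed saturating add of an immediate, form 1, expanded for int8_t.  */
static void
vec_sat_s_add_imm_i8 (int8_t *out, const int8_t *op_1, unsigned limit)
{
  for (unsigned i = 0; i < limit; i++)
    {
      int8_t x = op_1[i];
      /* Wrapping add done in the unsigned type, then narrowed back.  */
      int8_t sum = (int8_t) (uint8_t) ((uint8_t) x + (uint8_t) IMM_I8);
      out[i] = (x ^ IMM_I8) < 0     /* Signs differ: cannot overflow.  */
        ? sum
        : (sum ^ x) >= 0            /* Sign unchanged: no overflow.  */
          ? sum
          : x < 0 ? INT8_MIN : INT8_MAX;
    }
}
```

With these inputs, 120 + 9 saturates to INT8_MAX, while -128 + 9 has operands of opposite sign and simply yields -119.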



[PATCH 2/2] aarch64: Improve rtx_cost for constants in COMPARE [PR120372]

2025-05-20 Thread Andrew Pinski
The middle-end calls rtx_cost on constants with the outer code being COMPARE
to find the cost of forming a constant for a comparison instruction.
Currently the aarch64 backend just returns the general cost of forming the
constant. We can improve this: if the outer code is COMPARE and the constant
fits the constraints of the cmp instruction, set the cost to that of a
single instruction.

Built and tested for aarch64-linux-gnu.

PR target/120372

gcc/ChangeLog:

* config/aarch64/aarch64.cc (aarch64_rtx_costs ): Handle the case
where outer is COMPARE and the constant can be handled by the cmp
instruction.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/imm_choice_comparison-2.c: New test.

Signed-off-by: Andrew Pinski 
---
 gcc/config/aarch64/aarch64.cc |  7 ++
 .../aarch64/imm_choice_comparison-2.c | 90 +++
 2 files changed, 97 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/imm_choice_comparison-2.c

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 1da615c8955..c747ad42ac4 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -14578,6 +14578,13 @@ aarch64_rtx_costs (rtx x, machine_mode mode, int outer 
ATTRIBUTE_UNUSED,
 we don't need to consider that here.  */
   if (x == const0_rtx)
*cost = 0;
+  /* If the outer is a COMPARE which is used by the middle-end
+and the constant fits how the cmp instruction allows, say the cost
+is the same as 1 insn.  */
+  else if (outer == COMPARE
+  && (aarch64_uimm12_shift (INTVAL (x))
+  || aarch64_uimm12_shift (- (unsigned HOST_WIDE_INT) INTVAL (x))))
+   *cost = COSTS_N_INSNS (1);
   else
{
  /* To an approximation, building any other constant is
diff --git a/gcc/testsuite/gcc.target/aarch64/imm_choice_comparison-2.c 
b/gcc/testsuite/gcc.target/aarch64/imm_choice_comparison-2.c
new file mode 100644
index 000..379fc50563c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/imm_choice_comparison-2.c
@@ -0,0 +1,90 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+/* PR target/120372 */
+
+/* Go from 2 moves to none.  */
+
+/*
+** GT:
+** ...
+** cmp w0, 11182080
+** ...
+*/
+
+int
+GT (unsigned int x)
+{
+  return x > 0xaa9fff;
+}
+
+/*
+** LE:
+** ...
+** cmp w0, 11182080
+** ...
+*/
+
+int
+LE (unsigned int x)
+{
+  return x <= 0xaa9fff;
+}
+
+/*
+** GE:
+** ...
+** cmp x0, 11182080
+** ...
+*/
+
+int
+GE (long long x)
+{
+  return x >= 0xaaa000;
+}
+
+/*
+** LT:
+** ...
+** cmp w0, 11182080
+** ...
+*/
+
+int
+LT (int x)
+{
+  return x < 0xaaa000;
+}
+
+/* Optimize the immediate in conditionals.  */
+
+/*
+** check:
+** ...
+** cmp w0, 11182080
+** ...
+*/
+
+int
+check (int x, int y)
+{
+  if (x > y && GT (x))
+return 100;
+
+  return x;
+}
+
+/*
+** tern:
+** ...
+** cmp w0, 11182080
+** ...
+*/
+
+int
+tern (int x)
+{
+  return x >= 0xaaa000 ? 5 : -3;
+}
-- 
2.43.0
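
The condition added to aarch64_rtx_costs can be pictured with the following sketch. It is a model written for illustration, not a copy of the backend code: fits_uimm12_shift and fits_cmp_immediate are hypothetical names, and the predicate assumes the usual AArch64 arithmetic immediate, a 12-bit unsigned value optionally shifted left by 12. The negated check reflects that a compare against a negative immediate can be emitted as cmn.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if VAL is a 12-bit unsigned value, optionally shifted left by 12.
   Hypothetical stand-in for the backend predicate.  */
static bool
fits_uimm12_shift (uint64_t val)
{
  return (val & ~UINT64_C (0xfff)) == 0
         || (val & ~UINT64_C (0xfff000)) == 0;
}

/* True if a compare against IMM needs no separate constant formation,
   either directly (cmp) or through the negated value (cmn).  */
static bool
fits_cmp_immediate (int64_t imm)
{
  uint64_t u = (uint64_t) imm;
  return fits_uimm12_shift (u) || fits_uimm12_shift (-u);
}
```

This matches the testcase: 0xaaa000 (11182080) fits, while 0xaa9fff does not, so rewriting x > 0xaa9fff as x >= 0xaaa000 removes the constant-forming moves.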



[PATCH 1/2] expand: Use rtx_cost directly instead of gen_move_insn for canonicalize_comparison.

2025-05-20 Thread Andrew Pinski
This is the first part in fixing PR target/120372.
The current code for canonicalize_comparison uses gen_move_insn and rtx_cost
to find out the cost of generating a constant. This is OK in most cases,
except that sometimes the comparison instruction can handle different
constants than a simple set instruction can. This changes it to use rtx_cost
directly with the outer code being COMPARE, just like prepare_cmp_insn does.

Note this is also a small speedup and a small memory improvement, because we
are no longer creating a move for the constant. Since we are no longer
creating a pseudo-register, this also removes the check for that.

Also add a dump so we can see why one choice was chosen over the other.

Built and tested for aarch64-linux-gnu.

gcc/ChangeLog:

* expmed.cc (canonicalize_comparison): Use rtx_cost directly
instead of gen_move_insn. Print out the choice if dump is enabled.

Signed-off-by: Andrew Pinski 
---
 gcc/expmed.cc | 23 +++
 1 file changed, 15 insertions(+), 8 deletions(-)

diff --git a/gcc/expmed.cc b/gcc/expmed.cc
index 72dbafe5d9f..d5da199d033 100644
--- a/gcc/expmed.cc
+++ b/gcc/expmed.cc
@@ -6408,18 +6408,25 @@ canonicalize_comparison (machine_mode mode, enum 
rtx_code *code, rtx *imm)
   if (overflow)
 return;
 
-  /* The following creates a pseudo; if we cannot do that, bail out.  */
-  if (!can_create_pseudo_p ())
-return;
-
-  rtx reg = gen_rtx_REG (mode, LAST_VIRTUAL_REGISTER + 1);
   rtx new_imm = immed_wide_int_const (imm_modif, mode);
 
-  rtx_insn *old_rtx = gen_move_insn (reg, *imm);
-  rtx_insn *new_rtx = gen_move_insn (reg, new_imm);
+  int old_cost = rtx_cost (*imm, mode, COMPARE, 0, true);
+  int new_cost = rtx_cost (new_imm, mode, COMPARE, 0, true);
+
+  if (dump_file && (dump_flags & TDF_DETAILS))
+{
+  fprintf (dump_file, ";; cmp: %s, old cst: ",
+  GET_RTX_NAME (*code));
+  print_rtl (dump_file, *imm);
+  fprintf (dump_file, " new cst: ");
+  print_rtl (dump_file, new_imm);
+  fprintf (dump_file, "\n");
+  fprintf (dump_file, ";; old cst cost: %d, new cst cost: %d\n",
+  old_cost, new_cost);
+}
 
   /* Update the immediate and the code.  */
-  if (insn_cost (old_rtx, true) > insn_cost (new_rtx, true))
+  if (old_cost > new_cost)
 {
   *code = equivalent_cmp_code (*code);
   *imm = new_imm;
-- 
2.43.0
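
The idea behind canonicalize_comparison can be sketched outside GCC as follows. This is a simplified model under stated assumptions, not the expmed.cc code: it handles only unsigned 32-bit comparisons, the names cmp_code, struct cmp, toy_cost and canonicalize are hypothetical, and toy_cost stands in for rtx_cost with an outer code of COMPARE by charging one instruction for a 12-bit immediate (optionally shifted by 12) and two otherwise.

```c
#include <stdbool.h>
#include <stdint.h>

enum cmp_code { CMP_GTU, CMP_GEU, CMP_LTU, CMP_LEU };

struct cmp
{
  enum cmp_code code;
  uint32_t imm;
};

/* Toy stand-in for rtx_cost (imm, mode, COMPARE, ...).  */
static int
toy_cost (uint32_t imm)
{
  bool one_insn = (imm & ~UINT32_C (0xfff)) == 0
                  || (imm & ~UINT32_C (0xfff000)) == 0;
  return one_insn ? 1 : 2;
}

/* Try the equivalent comparison with the adjacent immediate and keep it
   only if its constant is strictly cheaper.  */
static struct cmp
canonicalize (struct cmp c)
{
  struct cmp alt = c;
  switch (c.code)
    {
    case CMP_GTU:               /* x > C   <=>  x >= C + 1  */
    case CMP_LEU:               /* x <= C  <=>  x < C + 1   */
      if (c.imm == UINT32_MAX)
        return c;               /* C + 1 would overflow; bail out.  */
      alt.imm = c.imm + 1;
      alt.code = c.code == CMP_GTU ? CMP_GEU : CMP_LTU;
      break;
    case CMP_GEU:               /* x >= C  <=>  x > C - 1   */
    case CMP_LTU:               /* x < C   <=>  x <= C - 1  */
      if (c.imm == 0)
        return c;               /* C - 1 would overflow; bail out.  */
      alt.imm = c.imm - 1;
      alt.code = c.code == CMP_GEU ? CMP_GTU : CMP_LEU;
      break;
    }
  return toy_cost (c.imm) > toy_cost (alt.imm) ? alt : c;
}
```

Costing the constant in the context of the comparison, rather than as a standalone move, is what lets the cheaper form win in cases like x > 0xaa9fff.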



RE: [PATCH 2/2] RISC-V:Add testcases for signed .SAT_ADD IMM form 1 with IMM = -1.

2025-05-20 Thread Li, Pan2
Thanks Jeff.

> So for the tests, why are we forcing matching of the assembly code for 
> the entire function?  That just makes for a fragile test as we may 
> change various aspects of code generation over time.

> If the point of the patch is to detect SAT_ADD in more cases, then the 
> better and more stable test is to verify the existence of SAT_ADD the 
> appropriate number of times in the .optimized dump.

> IMHO we really don't want this kind of whole function assembly matching.

> Pan, do you have any further comments here?  Do you have strong opinions 
> on whether or not we want to be doing this kind of assembly output 
> testing or not?

Unlike the vector case, where we have vsadd for the asm check, the scalar
SAT_* expands to various kinds of branchless code. So I added the function
body check with no-schedule-insns. However, we also have run tests for the
same scenarios, which should also indicate that the code-gen for SAT_ADD is
correct.

Given that, it is totally OK to drop the whole-function check. How about
keeping this series as is, and I can help to drop all the similar scalar
checks in another thread?

Pan

-Original Message-
From: Jeff Law  
Sent: Wednesday, May 21, 2025 2:05 AM
To: Li Xu ; gcc-patches@gcc.gnu.org
Cc: kito.ch...@gmail.com; richard.guent...@gmail.com; tamar.christ...@arm.com; 
juzhe.zh...@rivai.ai; Li, Pan2 
Subject: Re: [PATCH 2/2] RISC-V:Add testcases for signed .SAT_ADD IMM form 1 
with IMM = -1.



On 5/19/25 2:41 AM, Li Xu wrote:
> From: xuli 
> 
> This patch adds testcase for form1, as shown below:
> 
> T __attribute__((noinline))               \
> sat_s_add_imm_##T##_fmt_1##_##INDEX (T x) \
> {                                         \
>   T sum = (UT)x + (UT)IMM;                \
>   return (x ^ IMM) < 0                    \
>     ? sum                                 \
>     : (sum ^ x) >= 0                      \
>       ? sum                               \
>       : x < 0 ? MIN : MAX;                \
> }
> 
> Passed the rv64gcv regression test.
> 
> Signed-off-by: Li Xu 
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/riscv/sat/sat_s_add_imm-2.c: Move to...
>   * gcc.target/riscv/sat/sat_s_add_imm-1-i16.c: ...here.
>   * gcc.target/riscv/sat/sat_s_add_imm-3.c: Move to...
>   * gcc.target/riscv/sat/sat_s_add_imm-1-i32.c: ...
>   * gcc.target/riscv/sat/sat_s_add_imm-4.c: Move to...
>   * gcc.target/riscv/sat/sat_s_add_imm-1-i64.c: ...here.
>   * gcc.target/riscv/sat/sat_s_add_imm-1.c: Move to...
>   * gcc.target/riscv/sat/sat_s_add_imm-1-i8.c: ...here.
>   * gcc.target/riscv/sat/sat_s_add_imm-run-2.c: Move to...
>   * gcc.target/riscv/sat/sat_s_add_imm-run-1-i16.c: ...here.
>   * gcc.target/riscv/sat/sat_s_add_imm-run-3.c: Move to...
>   * gcc.target/riscv/sat/sat_s_add_imm-run-1-i32.c: ...here.
>   * gcc.target/riscv/sat/sat_s_add_imm-run-4.c: Move to...
>   * gcc.target/riscv/sat/sat_s_add_imm-run-1-i64.c: ...here.
>   * gcc.target/riscv/sat/sat_s_add_imm-run-1.c: Move to...
>   * gcc.target/riscv/sat/sat_s_add_imm-run-1-i8.c: ...here.
>   * gcc.target/riscv/sat/sat_s_add_imm-2-1.c: Move to...
>   * gcc.target/riscv/sat/sat_s_add_imm_type_check-1-i16.c: ...here.
>   * gcc.target/riscv/sat/sat_s_add_imm-3-1.c: Move to...
>   * gcc.target/riscv/sat/sat_s_add_imm_type_check-1-i32.c: ...here.
>   * gcc.target/riscv/sat/sat_s_add_imm-1-1.c: Move to...
>   * gcc.target/riscv/sat/sat_s_add_imm_type_check-1-i8.c: ...here.
So for the tests, why are we forcing matching of the assembly code for 
the entire function?  That just makes for a fragile test as we may 
change various aspects of code generation over time.

If the point of the patch is to detect SAT_ADD in more cases, then the 
better and more stable test is to verify the existence of SAT_ADD the 
appropriate number of times in the .optimized dump.

IMHO we really don't want this kind of whole function assembly matching.

Pan, do you have any further comments here?  Do you have strong opinions 
on whether or not we want to be doing this kind of assembly output 
testing or not?


Jeff
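
A test in the style suggested here might look like the sketch below. It is only an illustration under assumptions: the dg-options string, the expected count of 1, and the exact text matched in the dump are guesses that would need checking against the real .optimized output, and the function name merely follows the macro's naming pattern with T = int32_t and IMM = -1.

```c
/* { dg-do compile } */
/* { dg-options "-O2 -fdump-tree-optimized" } */

#include <stdint.h>

/* Form 1 of the signed saturating add, expanded by hand for IMM = -1.  */
int32_t __attribute__((noinline))
sat_s_add_imm_int32_t_fmt_1_0 (int32_t x)
{
  int32_t sum = (uint32_t) x + (uint32_t) -1;
  return (x ^ -1) < 0
    ? sum
    : (sum ^ x) >= 0
      ? sum
      : x < 0 ? INT32_MIN : INT32_MAX;
}

/* Check that the pattern was recognized, without matching the assembly.  */
/* { dg-final { scan-tree-dump-times ".SAT_ADD " 1 "optimized" } } */
```

Checking the dump ties the test to the optimization being exercised and leaves instruction selection and scheduling free to change.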




Re: [PATCH v2 2/2] MIPS p8700 doesn't have vector extension and added the dummies reservation for the same.

2025-05-20 Thread Umesh Kalappa
>> I've pushed this to the trunk as well.
Thank you, Jeff.

~U

On Tue, May 20, 2025 at 11:29 PM Jeff Law  wrote:

>
>
> On 5/19/25 1:03 AM, Umesh Kalappa wrote:
> > ---
> >   gcc/config/riscv/mips-p8700.md | 28 
> >   1 file changed, 28 insertions(+)
> I've pushed this to the trunk as well.
>
> Thanks,
> jeff
>
>


Re: [PATCH v2 1/2] The following changes enable P8700 processor for RISCV and P8700 is a high-performance processor from MIPS by extending RISCV with custom instructions.

2025-05-20 Thread Umesh Kalappa
>>Thanks.  I added the new cpu/tune options to the documentation in
doc/invoke.texi.
Thank you for the same

>>Going forward make sure to create a git commit message as well as a
ChangeLog entry.
Sure, we will make sure the ChangeLog details are in the commit log.

Thank you again
~U

On Tue, May 20, 2025 at 11:26 PM Jeff Law  wrote:

>
>
> On 5/19/25 1:02 AM, Umesh Kalappa wrote:
> > ---
> >   gcc/config/riscv/mips-p8700.md   | 139 +++
> >   gcc/config/riscv/riscv-cores.def |   5 ++
> >   gcc/config/riscv/riscv-opts.h|   3 +-
> >   gcc/config/riscv/riscv.cc|  22 +
> >   gcc/config/riscv/riscv.md|   3 +-
> >   5 files changed, 170 insertions(+), 2 deletions(-)
> >   create mode 100644 gcc/config/riscv/mips-p8700.md
> Thanks.  I added the new cpu/tune options to the documentation in
> doc/invoke.texi.
>
> Going forward make sure to create a git commit message as well as a
> ChangeLog entry.  You can look in the git log to see examples of commit
> messages.  The ChangeLog entry should be part of the commit message as
> we use scripting to create the ChangeLog file from the git commit messages.
>
> Jeff
>
>

